Network Working Group                                         M. Bagnulo
Internet-Draft                                                      UC3M
Obsoletes: 5562 (if approved)                                 B. Briscoe
Intended status: Experimental                                Independent
Expires: May 6, 2020                                    November 3, 2019

  ECN++: Adding Explicit Congestion Notification (ECN) to TCP Control
                                Packets
                   draft-ietf-tcpm-generalized-ecn-05

Abstract

   This document describes an experimental modification to ECN when used
   with TCP. It allows the use of ECN on the following TCP packets:
   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 6, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
     1.1. Motivation
     1.2. Experiment Goals
     1.3. Document Structure
   2. Terminology
   3. Specification
     3.1. Network (e.g. Firewall) Behaviour
     3.2. Sender Behaviour
       3.2.1. SYN (Send)
       3.2.2. SYN-ACK (Send)
       3.2.3. Pure ACK (Send)
       3.2.4. Window Probe (Send)
       3.2.5. FIN (Send)
       3.2.6. RST (Send)
       3.2.7. Retransmissions (Send)
       3.2.8. General Fall-back for any Control Packet or
              Retransmission
     3.3. Receiver Behaviour
       3.3.1. Receiver Behaviour for Any TCP Control Packet or
              Retransmission
       3.3.2. SYN (Receive)
       3.3.3. Pure ACK (Receive)
       3.3.4. FIN (Receive)
       3.3.5. RST (Receive)
       3.3.6. Retransmissions (Receive)
   4. Rationale
     4.1. The Reliability Argument
     4.2. SYNs
       4.2.1. Argument 1a: Unrecognized CE on the SYN
       4.2.2. Argument 1b: ECT Considered Invalid on the SYN
       4.2.3. Caching Strategies for ECT on SYNs
       4.2.4. Argument 2: DoS Attacks
     4.3. SYN-ACKs
       4.3.1. Possibility of Unrecognized CE on the SYN-ACK
       4.3.2. Response to Congestion on a SYN-ACK
       4.3.3. Fall-Back if ECT SYN-ACK Fails
     4.4. Pure ACKs
       4.4.1. Mechanisms to Respond to CE-Marked Pure ACKs
       4.4.2. Summary: Enabling ECN on Pure ACKs
     4.5. Window Probes
     4.6. FINs
     4.7. RSTs
     4.8. Retransmitted Packets
     4.9. General Fall-back for any Control Packet
   5. Interaction with popular variants or derivatives of TCP
     5.1. IW10
     5.2. TFO
     5.3. L4S
     5.4. Other transport protocols
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgments
   9. References
     9.1. Normative References
     9.2. Informative References
   Authors' Addresses

1. Introduction

   RFC 3168 [RFC3168] specifies support of Explicit Congestion
   Notification (ECN) in IP (v4 and v6). By using the ECN capability,
   network elements (e.g. routers, switches) performing Active Queue
   Management (AQM) can use ECN marks instead of packet drops to signal
   congestion to the endpoints of a communication. This results in
   lower packet loss and increased performance.
   RFC 3168 also specifies
   support for ECN in TCP, but solely on data packets. For various
   reasons it precludes the use of ECN on TCP control packets (TCP SYN,
   TCP SYN-ACK, pure ACKs, Window probes) and on retransmitted packets.
   RFC 3168 is silent about the use of ECN on RST and FIN packets. RFC
   5562 [RFC5562] is an experimental modification to ECN that enables
   ECN support for TCP SYN-ACK packets.

   This document defines an experimental modification to ECN [RFC3168]
   that shall be called ECN++. It enables ECN support on all the
   aforementioned types of TCP packet. The mechanisms proposed in this
   document have been defined conservatively and with safety in mind,
   possibly in some cases at the expense of performance.

   ECN++ uses a sender-only deployment model. It works whether the two
   ends of the TCP connection use classic ECN feedback [RFC3168] or
   experimental Accurate ECN feedback (AccECN
   [I-D.ietf-tcpm-accurate-ecn]), the two ECN feedback mechanisms for
   TCP being standardized at the time of writing.

   Using ECN on initial SYN packets provides significant benefits, as we
   describe in the next subsection. However, only AccECN provides a way
   to feed back whether the SYN was CE marked, and RFC 3168 does not.
   Therefore, implementers of ECN++ are RECOMMENDED to also implement
   AccECN. Conversely, if AccECN (or an equivalent safety mechanism) is
   not implemented with ECN++, this specification rules out ECN on the
   SYN.

   ECN++ is designed for compatibility with a number of latency
   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
   window of 10 SMSS (IW10 [RFC6928]) and Low latency Low Loss Scalable
   Transport (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all be
   implemented and deployed independently.
   [RFC8311] is a standards
   track procedural device that relaxes requirements in RFC 3168 and
   other standards track RFCs that would otherwise preclude the
   experimental modifications needed for ECN++ and other ECN
   experiments.

1.1. Motivation

   The absence of ECN support on TCP control packets and retransmissions
   has a potentially harmful effect. In any ECN deployment, non-ECN-
   capable packets suffer a penalty when they traverse a congested
   bottleneck. For instance, with a drop probability of 1%, 1% of
   connection attempts suffer a timeout of about 1 second before the SYN
   is retransmitted, which is highly detrimental to the performance of
   short flows. TCP control packets, particularly TCP SYNs and SYN-
   ACKs, are important for performance, so dropping them is best
   avoided.

   Not using ECN on control packets can be particularly detrimental to
   performance in environments where the ECN marking level is high. For
   example, [judd-nsdi] shows that in a controlled private data centre
   (DC) environment where ECN is used (in conjunction with DCTCP
   [RFC8257]), the probability of being able to establish a new
   connection using a non-ECN SYN packet drops to close to zero even
   when there are only 16 ongoing TCP flows transmitting at full speed.
   The issue is that DCTCP exhibits a much more aggressive response to
   packet marking (which is why it is only applicable in controlled
   environments). This leads to a high marking probability for ECN-
   capable packets, and in turn a high drop probability for non-ECN
   packets. Therefore non-ECN SYNs are dropped aggressively, rendering
   it nearly impossible to establish a new connection in the presence of
   even mild traffic load.
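
   The 1% example earlier in this subsection can be made concrete with
   a short calculation. The sketch below is an illustration under
   stated assumptions, not a measurement: it assumes that each SYN
   transmission is dropped independently with probability p_drop, that
   the initial retransmission timeout is 1 second and doubles after each
   loss, and that at most 6 retransmissions are attempted (a
   hypothetical cap). The function name and parameters are invented
   for this example.

```python
def expected_syn_delay(p_drop: float, initial_rto: float = 1.0,
                       max_retries: int = 6) -> float:
    """Mean extra delay (seconds) before a SYN gets through.

    Assumes independent drops and an RTO that doubles per retry.
    Attempts that exhaust max_retries are not counted.
    """
    expected = 0.0
    waited = 0.0  # total time spent waiting on timers so far
    rto = initial_rto
    for k in range(max_retries + 1):
        # Probability that the first k transmissions were dropped
        # and the next transmission succeeded.
        p_success_here = (p_drop ** k) * (1.0 - p_drop)
        expected += p_success_here * waited
        waited += rto
        rto *= 2.0
    return expected
```

   Under these assumptions, a 1% drop probability gives a mean penalty
   of only about 10 ms, but that penalty is concentrated in the roughly
   1% of connection attempts that each wait a full second or more,
   which is what hurts short flows.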

   Finally, there are ongoing experimental efforts to promote the
   adoption of a slightly modified variant of DCTCP (and similar
   congestion controls) over the Internet to achieve low latency, low
   loss and scalable throughput (L4S) for all communications
   [I-D.ietf-tsvwg-l4s-arch]. In such an approach, L4S packets identify
   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id]. With
   L4S, preventing TCP control packets from obtaining the benefits of
   ECN would not only expose them to the prevailing level of congestion
   loss, but it would also classify them into a different queue. Then
   only L4S data packets would be classified into the L4S queue that is
   expected to have lower latency, while the packets controlling and
   retransmitting these data packets would still get stuck behind the
   queue induced by non-L4S-enabled TCP traffic.

1.2. Experiment Goals

   The goal of the experimental modifications defined in this document
   is to allow the use of ECN on all TCP packets. Experiments are
   expected in the public Internet as well as in controlled environments
   to understand the following issues:

   o How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
     that carry the ECT(0), ECT(1) or CE codepoints are processed by
     the TCP endpoints and the network (including routers, firewalls
     and other middleboxes). In particular we would like to learn if
     these packets are frequently blocked or if these packets are
     usually forwarded and processed.

   o The scale of deployment of the different flavours of ECN,
     including [RFC3168], [RFC5562], [RFC3540] and
     [I-D.ietf-tcpm-accurate-ecn].

   o How much the performance of TCP communications is improved by
     allowing ECN marking of each packet type.

   o To identify any issues (including security issues) raised by
     enabling ECN marking of these packets.

   o To conduct the specific experiments identified in the text by the
     strings "EXPERIMENTATION NEEDED" or "MEASUREMENTS NEEDED".

   The data gathered through the experiments described in this document,
   particularly under the first 2 bullets above, will help in the
   redesign of the final mechanism (if needed) for adding ECN support to
   the different packet types considered in this document.

   Success criteria: The experiment will be a success if we obtain
   enough data to have a clearer view of the deployability and benefits
   of enabling ECN on all TCP packets, as well as any issues. If the
   results of the experiment show that it is feasible to deploy such
   changes; that there are gains to be achieved through the changes
   described in this specification; and that no other major issues may
   interfere with the deployment of the proposed changes; then it would
   be reasonable to adopt the proposed changes in a standards track
   specification that would update RFC 3168.

1.3. Document Structure

   The remainder of this document is structured as follows. In
   Section 2, we present the terminology used in the rest of the
   document. In Section 3, we specify the modifications to provide ECN
   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
   retransmissions. We describe both the network behaviour and the
   endpoint behaviour. Section 5 discusses variations of the
   specification that will be necessary to interwork with a number of
   popular variants or derivatives of TCP. RFC 3168 provides a number
   of specific reasons why ECN support is not appropriate for each
   packet type. In Section 4, we revisit each of these arguments for
   each packet type to justify why it is reasonable to conduct this
   experiment.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Pure ACK: A TCP segment with the ACK flag set and no data payload.

   SYN: A TCP segment with the SYN (synchronize) flag set.

   Window probe: Defined in [RFC0793], a window probe is a TCP segment
   with only one byte of data sent to learn if the receive window is
   still zero.

   FIN: A TCP segment with the FIN (finish) flag set.

   RST: A TCP segment with the RST (reset) flag set.

   Retransmission: A TCP segment that has been retransmitted by the TCP
   sender.

   TCP client: The initiating end of a TCP connection. Also called the
   initiator.

   TCP server: The responding end of a TCP connection. Also called the
   responder.

   ECT: ECN-Capable Transport. One of the two codepoints ECT(0) or
   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6). An
   ECN-capable sender sets one of these to indicate that both transport
   end-points support ECN. When this specification says the sender sets
   an ECT codepoint, by default it means ECT(0). Optionally, it could
   mean ECT(1), which is in the process of being redefined for use by
   L4S experiments [RFC8311] [I-D.ietf-tsvwg-ecn-l4s-id].

   Not-ECT: The ECN codepoint set by senders that indicates that the
   transport is not ECN-capable.

   CE: Congestion Experienced. The ECN codepoint that an intermediate
   node sets to indicate congestion [RFC3168]. A node sets an
   increasing proportion of ECT packets to CE as the level of congestion
   increases.

3. Specification

   The experimental ECN++ changes to the specification of TCP over ECN
   [RFC3168] defined here primarily alter the behaviour of the sending
   host for each half-connection.
   However, there are subsections for
   forwarding elements and receivers below, which recommend that they
   accept the new packets; they should already do so, but might not.
   This will allow implementers to check the receive side code while
   they are altering the send-side code. All changes can be deployed at
   each end-point independently of others and independently of any
   network behaviour.

   The feedback behaviour at the receiver depends on whether classic ECN
   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
   [I-D.ietf-tcpm-accurate-ecn] has been negotiated. Nonetheless,
   neither receiver feedback behaviour is altered by the present
   specification.

3.1. Network (e.g. Firewall) Behaviour

   Previously the specification of ECN for TCP [RFC3168] required the
   sender to set not-ECT on TCP control packets and retransmissions.
   Some readers of RFC 3168 might have erroneously interpreted this as a
   requirement for firewalls, intrusion detection systems, etc. to check
   and enforce this behaviour. Section 4.3 of [RFC8311] updates RFC
   3168 to remove this ambiguity. It requires firewalls or any
   intermediate nodes not to treat certain types of ECN-capable TCP
   segment differently (except potentially in one attack scenario).
   This is likely to only involve a firewall rule change in a fraction
   of cases (at most 0.4% of paths according to the tests reported in
   Section 4.2.2).

   In case a TCP sender encounters a middlebox blocking ECT on certain
   TCP segments, the specification below includes behaviour to fall back
   to non-ECN. However, this loses the benefit of ECN on control
   packets. So operators are RECOMMENDED to alter their firewall rules
   to comply with the requirement referred to above (section 4.3 of
   [RFC8311]).

3.2. Sender Behaviour

   For each type of control packet or retransmission, the following
   sections detail changes to the sender's behaviour in two respects: i)
   whether it sets ECT; and ii) its response to congestion feedback.
   Table 1 summarises these two behaviours for each type of packet, but
   the relevant subsection below should be referred to for the detailed
   behaviour. The subsection on the SYN is more complex than the
   others, because it has to include fall-back behaviour if the ECT
   packet appears not to have got through, and caching of the outcome to
   detect persistent failures.

   +---------+----------------+-----------------+----------------------+
   | TCP     | ECN field if   | ECN field if    | Congestion Response  |
   | packet  | AccECN f/b     | RFC3168 f/b     |                      |
   | type    | negotiated*    | negotiated*     |                      |
   +---------+----------------+-----------------+----------------------+
   | SYN     | ECT            | not-ECT         | If AccECN, reduce IW |
   |         |                |                 |                      |
   | SYN-ACK | ECT            | ECT             | Reduce IW            |
   |         |                |                 |                      |
   | Pure    | ECT            | not-ECT         | If AccECN, usual     |
   | ACK     |                |                 | cwnd response and    |
   |         |                |                 | optionally [RFC5690] |
   |         |                |                 |                      |
   | W Probe | ECT            | ECT             | Usual cwnd response  |
   |         |                |                 |                      |
   | FIN     | ECT            | ECT             | None or optionally   |
   |         |                |                 | [RFC5690]            |
   |         |                |                 |                      |
   | RST     | ECT            | ECT             | N/A                  |
   |         |                |                 |                      |
   | Re-XMT  | ECT            | ECT             | Usual cwnd response  |
   +---------+----------------+-----------------+----------------------+

    Window probe and retransmission are abbreviated to W Probe and Re-XMT.
    * For a SYN, "negotiated" means "requested".

    Table 1: Summary of sender behaviour. In each case the relevant
       section below should be referred to for the detailed behaviour

   It can be seen that we recommend against the sender setting ECT on
   the SYN if it is not requesting AccECN feedback.
   Therefore it is
   RECOMMENDED that the experimental AccECN specification
   [I-D.ietf-tcpm-accurate-ecn] is implemented, along with the ECN++
   experiment, because it is expected that ECT on the SYN will give the
   most significant performance gain, particularly for short flows.

   Nonetheless, this specification also caters for the case where an
   ECN++ TCP sender is not using AccECN. This could be because it does
   not support AccECN or because the other end of the TCP connection
   does not (AccECN can only be used for a connection if both ends
   support it).

3.2.1. SYN (Send)

3.2.1.1. Setting ECT on the SYN

   With classic [RFC3168] ECN feedback, the SYN was not expected to be
   ECN-capable, so the flag provided to feed back congestion was put to
   another use (it is used in combination with other flags to indicate
   that the responder supports ECN). In contrast, Accurate ECN (AccECN)
   feedback [I-D.ietf-tcpm-accurate-ecn] provides a codepoint in the
   SYN-ACK for the responder to feed back whether the SYN arrived marked
   CE. Therefore the setting of the IP/ECN field on the SYN is
   specified separately for each case in the following two subsections.

3.2.1.1.1. ECN++ TCP Client also Supports AccECN

   For the ECN++ experiment, if the SYN is requesting AccECN feedback,
   the TCP sender will also set ECT on the SYN. It can ignore the
   prohibition in section 6.1.1 of RFC 3168 against setting ECT on such
   a SYN, as per Section 4.3 of [RFC8311].

3.2.1.1.2. ECN++ TCP Client does not Support AccECN

   If the SYN sent by a TCP initiator does not attempt to negotiate
   Accurate ECN feedback, and does not use an equivalent safety
   mechanism, it MUST still comply with RFC 3168, which says that a TCP
   initiator "MUST NOT set ECT on a SYN".
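
   As an illustration only, the ECN-field choices summarised in Table 1,
   including the two SYN cases above, could be expressed as a simple
   lookup. This is a minimal sketch, not part of the specification: the
   names Feedback, ECN_FIELD and ecn_field are hypothetical, and the
   "equivalent safety mechanism" case is deliberately not modelled.

```python
from enum import Enum


class Feedback(Enum):
    """ECN feedback mode negotiated (or, for a SYN, requested)."""
    ACCECN = "AccECN"
    RFC3168 = "RFC3168"


# ECN field per packet type, transcribed from Table 1.
# The dictionary keys are illustrative labels, not protocol constants.
ECN_FIELD = {
    "SYN":      {Feedback.ACCECN: "ECT", Feedback.RFC3168: "not-ECT"},
    "SYN-ACK":  {Feedback.ACCECN: "ECT", Feedback.RFC3168: "ECT"},
    "Pure ACK": {Feedback.ACCECN: "ECT", Feedback.RFC3168: "not-ECT"},
    "W Probe":  {Feedback.ACCECN: "ECT", Feedback.RFC3168: "ECT"},
    "FIN":      {Feedback.ACCECN: "ECT", Feedback.RFC3168: "ECT"},
    "RST":      {Feedback.ACCECN: "ECT", Feedback.RFC3168: "ECT"},
    "Re-XMT":   {Feedback.ACCECN: "ECT", Feedback.RFC3168: "ECT"},
}


def ecn_field(packet_type: str, feedback: Feedback) -> str:
    """Return the ECN field an ECN++ sender would set on this packet."""
    return ECN_FIELD[packet_type][feedback]
```

   For example, ecn_field("SYN", Feedback.RFC3168) yields "not-ECT",
   matching the rule in Section 3.2.1.1.2, while the same call with
   Feedback.ACCECN yields "ECT".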

   The only envisaged examples of "equivalent safety mechanisms" are: a)
   some future TCP ECN feedback protocol, perhaps evolved from AccECN,
   that feeds back CE marking on a SYN; b) setting the initial window to
   1 SMSS. IW=1 is NOT RECOMMENDED because it could degrade
   performance, but might be appropriate for certain lightweight TCP
   implementations.

   See Section 4.2 for discussion and rationale.

   If the TCP initiator does not set ECT on the SYN, the rest of
   Section 3.2.1 does not apply.

3.2.1.2. Caching where to use ECT on SYNs

   This subsection only applies if the ECN++ TCP client sets ECT on the
   SYN and supports AccECN.

   Until AccECN servers become widely deployed, a TCP initiator that
   sets ECT on a SYN (which typically implies the same SYN also requests
   AccECN, as above) SHOULD also maintain a cache entry per server to
   record servers that it is not worth sending an ECT SYN to, e.g.
   because they do not support AccECN and therefore have no logic for
   congestion markings on the SYN. Mobile hosts MAY maintain a cache
   entry per access network to record 'non-ECT SYN' entries against
   proxies (see Section 4.2.3). This cache can be implemented as part
   of the shared state across multiple TCP connections, following
   [RFC2140].

   Subsequently the initiator will not set ECT on a SYN to such a server
   or proxy, but it can still always request AccECN support (because the
   response will state any earlier stage of ECN evolution that the
   server supports with no performance penalty). If a server
   subsequently upgrades to support AccECN, the initiator will discover
   this as soon as it next connects, then it can remove the server from
   its cache and subsequently always set ECT for that server.

   The client can limit the size of its cache of 'non-ECT SYN' servers.
   Then, while AccECN is not widely deployed, it will only cache the
   'non-ECT SYN' servers that are most used and most recently used by
   the client. As the client accesses servers that have been expelled
   from its cache, it will simply use ECT on the SYN by default.

   Servers that do not support ECN as a whole do not need to be recorded
   separately from non-support of AccECN because the response to a
   request for AccECN immediately states which stage in the evolution of
   ECN the server supports (AccECN [I-D.ietf-tcpm-accurate-ecn], classic
   ECN [RFC3168] or no ECN).

   The above strategy is named "optimistic ECT and cache failures". It
   is believed to be sufficient based on three measurement studies and
   assumptions detailed in Section 4.2.3. However, Section 4.2.3 gives
   two other strategies and the choice between them depends on the
   implementer's goals and the deployment prevalence of ECN variants in
   the network and on servers, not to mention the prevalence of some
   significant bugs.

   If the initiator times out without seeing a SYN-ACK, it will
   separately cache this fact (see fall-back in Section 3.2.1.4 for
   details).

3.2.1.3. SYN Congestion Response

   As explained above, this subsection only applies if the ECN++ TCP
   client sets ECT on the initial SYN.

   If the SYN-ACK returned to the TCP initiator confirms that the server
   supports AccECN, it will also be able to indicate whether or not the
   SYN was CE-marked. If the SYN was CE-marked, and if the initial
   window is greater than 1 MSS, then the initiator MUST reduce its
   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
   segment size). The rationale is the same as that for the response to
   CE on a SYN-ACK (Section 4.3.2).

   If the initiator has set ECT on the SYN and if the SYN-ACK shows that
   the server does not support feedback of a CE on the SYN (e.g.
   it does
   not support AccECN) and if the initial congestion window of the
   initiator is greater than 1 MSS, then the TCP initiator MUST
   conservatively reduce its Initial Window and SHOULD reduce it to 1
   SMSS. A reduction to greater than 1 SMSS MAY be appropriate (see
   Section 4.2.1). Conservatism is necessary because the SYN-ACK cannot
   show whether the SYN was CE-marked.

   If the TCP initiator (host A) receives a SYN from the remote end
   (host B) after it has sent a SYN to B, it indicates the (unusual)
   case of a simultaneous open. Host A will respond with a SYN-ACK.
   Host A will probably then receive a SYN-ACK in response to its own
   SYN, after which it can follow the appropriate one of the two
   paragraphs above.

   In all the above cases, the initiator does not have to back off its
   retransmission timer as it would in response to a timeout following
   no response to its SYN [RFC6298], because both the SYN and the SYN-
   ACK have been successfully delivered through the network. Also, the
   initiator does not need to exit slow start or reduce ssthresh, which
   is not even required when a SYN is lost [RFC5681].

   If an initial window of more than 3 segments is implemented (e.g.
   IW10 [RFC6928]), Section 5 gives additional recommendations.

3.2.1.4. Fall-Back Following No Response to an ECT SYN

   As explained above, this subsection only applies if the ECN++ TCP
   client also sets ECT on the initial SYN.

   An ECT SYN might be lost due to an over-zealous path element (or
   server) blocking ECT packets that do not conform to RFC 3168. Some
   evidence of this was found in a 2014 study [ecn-pam], but in a more
   recent study using 2017 data [Mandalari18] extensive measurements
   found no case where ECT on TCP control packets was treated any
   differently from ECT on TCP data packets. Loss is commonplace for
   numerous other reasons, e.g.
congestion loss at a non-ECN queue on the forward or reverse path, transmission errors, etc. Alternatively, the cause of the loss might be the associated attempt to negotiate AccECN, or possibly other unrelated options on the SYN.

Therefore, if the timer expires after the TCP initiator has sent the first ECT SYN, it SHOULD make one more attempt to retransmit the SYN with ECT set (backing off the timer as usual). If the retransmission timer expires again, it SHOULD retransmit the SYN with the not-ECT codepoint in the IP header, to expedite connection set-up. If other experimental fields or options were on the SYN, it will also be necessary to follow their specifications for fall-back. It would make sense to coordinate all the strategies for fall-back in order to isolate the specific cause of the problem.

If the TCP initiator is caching failed connection attempts, it SHOULD NOT give up using ECT on the first SYN of subsequent connection attempts until it is clear that a blockage persistently and specifically affects ECT on SYNs. This is because loss is so commonplace for other reasons. Even if it does eventually decide to give up setting ECT on the SYN, it will probably not need to give up on AccECN on the SYN. In any case, if a cache is used, it SHOULD be arranged to expire so that the initiator will infrequently attempt to check whether the problem has been resolved.

Other fall-back strategies MAY be adopted where applicable (see Section 4.2.2 for suggestions, and the conditions under which they would apply).

3.2.2.  SYN-ACK (Send)

3.2.2.1.  Setting ECT on the SYN-ACK

For the ECN++ experiment, the TCP implementation will set ECT on SYN-ACKs. It can ignore the requirement in section 6.1.1 of RFC 3168 to set not-ECT on a SYN-ACK, as per Section 4.3 of [RFC8311].

3.2.2.2.  SYN-ACK Congestion Response

A host that sets ECT on SYN-ACKs MUST reduce its initial window in response to any congestion feedback, whether using classic ECN or AccECN (see Section 4.3.1). It SHOULD reduce it to 1 SMSS. This is different to the behaviour specified in an earlier experiment that set ECT on the SYN-ACK [RFC5562]. This difference is justified in Section 4.3.2.

The responder does not have to back off its retransmission timer because the ECN feedback proves that the network is delivering packets successfully and is not severely overloaded. Also the responder does not have to leave slow start or reduce ssthresh, which is not even required when a SYN-ACK has been lost.

The congestion response to CE-marking on a SYN-ACK for a server that implements either the TCP Fast Open experiment (TFO [RFC7413]) or experimentation with an initial window of more than 3 segments (e.g. IW10 [RFC6928]) is discussed in Section 5.

3.2.2.3.  Fall-Back Following No Response to an ECT SYN-ACK

After the responder sends a SYN-ACK with ECT set, if its retransmission timer expires it SHOULD retransmit one more SYN-ACK with ECT set (and back off its timer as usual). If the timer expires again, it SHOULD retransmit the SYN-ACK with not-ECT in the IP header. If other experimental fields or options were on the initial SYN-ACK, it will also be necessary to follow their specifications for fall-back. It would make sense to coordinate all the strategies for fall-back in order to isolate the specific cause of the problem.

This fall-back strategy attempts to use ECT one more time than the strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being superseded by the present specification). Other fall-back strategies MAY be adopted if found to be more effective, e.g. fall-back to not-ECT on the first retransmission attempt.

The server MAY cache failed connection attempts, e.g.
per client access network. A client-based alternative to caching at the server is given in Section 4.3.3. If the TCP server is caching failed connection attempts, it SHOULD NOT give up using ECT on the first SYN-ACK of subsequent connection attempts until it is clear that the blockage persistently and specifically affects ECT on SYN-ACKs. This is because loss is so commonplace for other reasons (see Section 3.2.1.4). If a cache is used, it SHOULD be arranged to expire so that the server will infrequently attempt to check whether the problem has been resolved.

3.2.3.  Pure ACK (Send)

A Pure ACK is an ACK packet that does not carry data, which includes the Pure ACK at the end of TCP's 3-way handshake.

For the ECN++ experiment, whether a TCP implementation sets ECT on a Pure ACK depends on whether or not Accurate ECN TCP feedback [I-D.ietf-tcpm-accurate-ecn] has been successfully negotiated for a particular TCP connection, as specified in the following two subsections.

3.2.3.1.  Pure ACK without AccECN Feedback

If AccECN has not been successfully negotiated for a connection, ECT MUST NOT be set on Pure ACKs by either end.

3.2.3.2.  Pure ACK with AccECN Feedback

For the ECN++ experiment, if AccECN has been successfully negotiated, either end of the connection will set ECT on Pure ACKs. They can ignore the requirement in section 6.1.4 of RFC 3168 to set not-ECT on a pure ACK, as per Section 4.3 of [RFC8311].

   MEASUREMENTS NEEDED: Measurements are needed to learn how the deployed base of network elements and RFC 3168 servers react to pure ACKs marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are dropped, codepoint cleared or processed and the congestion indication fed back on a subsequent packet.

See Section 3.3.3 for the implications if a host receives a CE-marked Pure ACK.

3.2.3.2.1.  Pure ACK Congestion Response

As explained above, this subsection only applies if AccECN has been successfully negotiated for the TCP connection.

A host that sets ECT on pure ACKs SHOULD respond to the congestion signal resulting from pure ACKs being marked with the CE codepoint. The specific response will need to be defined as an update to each congestion control specification. Possible responses to congestion feedback include reducing the congestion window (CWND) and/or regulating the pure ACK rate (see Section 4.4.1.1).

Note that, in comparison, TCP Congestion Control [RFC5681] does not require a TCP to detect or respond to loss of pure ACKs at all; it requires no reduction in congestion window or ACK rate.

3.2.4.  Window Probe (Send)

For the ECN++ experiment, the TCP sender will set ECT on window probes. It can ignore the prohibition in section 6.1.6 of RFC 3168 against setting ECT on a window probe, as per Section 4.3 of [RFC8311].

A window probe contains a single octet, so it is no different from a regular TCP data segment. Therefore a TCP receiver will feed back any CE marking on a window probe as normal (either using classic ECN feedback or AccECN feedback). The sender of the probe will then reduce its congestion window as normal.

A receive window of zero indicates that the application is not consuming data fast enough and does not imply anything about network congestion. Once the receive window opens, the congestion window might become the limiting factor, so it is correct that CE-marked probes reduce the congestion window. This complements cwnd validation [RFC7661], which reduces cwnd as more time elapses without having used available capacity. However, CE-marking on window probes does not reduce the rate of the probes themselves.
This is unlikely to present a problem, given that the duration between window probes doubles [RFC1122] as long as the receiver is advertising a zero window (currently minimum 1 second, maximum at least 1 minute [RFC6298]).

   MEASUREMENTS NEEDED: Measurements are needed to learn how the deployed base of network elements and servers react to window probes marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are dropped, codepoint cleared or processed.

3.2.5.  FIN (Send)

A TCP implementation can set ECT on a FIN.

See Section 3.3.4 for the implications if a host receives a CE-marked FIN.

A congestion response to a CE-marking on a FIN is not required.

After sending a FIN, the endpoint will not send any more data in the connection. Therefore, even if the FIN-ACK indicates that the FIN was CE-marked (whether using classic or AccECN feedback), reducing the congestion window will not affect anything.

After sending a FIN, a host might send one or more pure ACKs. If it is using one of the techniques in Section 3.2.3 to regulate the delayed ACK ratio for pure ACKs, that technique could equally be applied after a FIN. But this is not required.

   MEASUREMENTS NEEDED: Measurements are needed to learn how the deployed base of network elements and servers react to FIN packets marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are dropped, codepoint cleared or processed.

3.2.6.  RST (Send)

A TCP implementation can set ECT on a RST.

See Section 3.3.5 for the implications if a host receives a CE-marked RST.

A congestion response to a CE-marking on a RST is not required (and actually not possible).

   MEASUREMENTS NEEDED: Measurements are needed to learn how the deployed base of network elements and servers react to RST packets marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are dropped, codepoint cleared or processed.

3.2.7.  Retransmissions (Send)

For the ECN++ experiment, the TCP sender will set ECT on retransmitted segments. It can ignore the prohibition in section 6.1.5 of RFC 3168 against setting ECT on retransmissions, as per Section 4.3 of [RFC8311].

See Section 3.3.6 for the implications if a host receives a CE-marked retransmission.

If the TCP sender receives feedback that a retransmitted packet was CE-marked, it will react as it would to any feedback of CE-marking on a data packet.

   MEASUREMENTS NEEDED: Measurements are needed to learn how the deployed base of network elements and servers react to retransmissions marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are dropped, codepoint cleared or processed.

3.2.8.  General Fall-back for any Control Packet or Retransmission

Extensive measurements in fixed and mobile networks [Mandalari18] have found no evidence of blockages due to ECT being set on any type of TCP control packet.

In case traversal problems arise in future, fall-back measures have been specified above, but only for the cases where ECT on the initial packet of a half-connection (SYN or SYN-ACK) is persistently failing to get through.

Fall-back measures for blockage of ECT on other TCP control packets MAY be implemented. However, they are not specified here given the lack of any evidence they will be needed. Section 4.9 justifies this advice in more detail.

3.3.  Receiver Behaviour

The present ECN++ specification primarily concerns the behaviour for sending TCP control packets or retransmissions. Below are a few changes to the receive side of an implementation that are recommended while updating its send side. Nonetheless, where deployment is concerned, ECN++ is still a sender-only deployment, because it does not depend on receivers complying with any of these recommendations.

3.3.1.  Receiver Behaviour for Any TCP Control Packet or Retransmission

RFC 8311 is a standards track update to RFC 3168 in order to (amongst other things) "...allow the use of ECT codepoints on SYN packets, pure acknowledgement packets, window probe packets, and retransmissions of packets..., provided that the changes from RFC 3168 are documented in an Experimental RFC in the IETF document stream."

Section 4.3 of RFC 8311 amends every statement in RFC 3168 that precludes the use of ECT on control packets and retransmissions to add "unless otherwise specified by an Experimental RFC in the IETF document stream". The present specification is such an Experimental RFC. Therefore, in order for this experiment to be useful, the following requirements follow from RFC 8311:

o  Any TCP implementation SHOULD accept receipt of any valid TCP control packet or retransmission irrespective of its IP/ECN field. If any existing implementation does not, it SHOULD be updated to do so.

o  A TCP implementation taking part in the experiments proposed here MUST accept receipt of any valid TCP control packet or retransmission irrespective of its IP/ECN field.

These measures are derived from the robustness principle of "... be liberal in what you accept from others", in order to ensure compatibility with any future protocol changes that allow ECT on any TCP packet.

3.3.2.  SYN (Receive)

RFC 3168 negotiates the use of ECN for the connection end-to-end using the ECN flags in the TCP header. When RFC 3168 says that "A host MUST NOT set ECT on SYN ... packets." it is silent as to what a TCP server ought to do if it receives a SYN packet with a non-zero IP/ECN field.
At the time of writing, some implementations of TCP servers (see Section 4.2.2.2) assume that, if a host receives a SYN with a non-zero IP/ECN field, it must be due to network mangling, and they disable ECN for the rest of the connection. Section 4.2.2.2 also finds that this type of network mangling seems to be virtually non-existent, so it would be preferable to report any such mangling so that it can be fixed.

For the avoidance of doubt, the normative statements for all TCP control packets in Section 3.3.1 are interpreted for the case when a SYN is received as follows:

o  Any TCP server implementation SHOULD accept receipt of a valid SYN that requests ECN support for the connection, irrespective of the IP/ECN field of the SYN. If any existing implementation does not, it SHOULD be updated to do so.

o  A TCP implementation taking part in the ECN++ experiment MUST accept receipt of a valid SYN, irrespective of its IP/ECN field.

o  If the SYN is CE-marked and the server has no logic to feed back a CE mark on a SYN-ACK (e.g. it does not support AccECN), it has to ignore the CE-mark (the client detects this case and behaves conservatively in mitigation - see Section 3.2.1.3).

3.3.3.  Pure ACK (Receive)

For the avoidance of doubt, the normative statements for all TCP control packets in Section 3.3.1 are interpreted for the case when a Pure ACK is received as follows:

o  Any TCP implementation SHOULD accept receipt of a pure ACK with a non-zero ECN field, despite current RFCs precluding the sending of such packets.

o  A TCP implementation taking part in the ECN++ experiment MUST accept receipt of a pure ACK with a non-zero ECN field.
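The receive-side rules for SYNs and pure ACKs listed above can be illustrated with a minimal sketch. It is not taken from any implementation: the function names, the boolean `is_valid` input (standing in for whatever validity checks the stack already performs) and the `supports_accecn` flag are all hypothetical. Only the 2-bit IP/ECN codepoint values come from RFC 3168.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """The four IP/ECN codepoints defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT1 = 0b01
    ECT0 = 0b10
    CE = 0b11


def on_syn_received(ip_ecn, is_valid, supports_accecn):
    """Hypothetical receive-side handling of a SYN.

    Returns (accept, feed_back_ce). The SYN is accepted whenever it is
    otherwise valid, irrespective of its IP/ECN field. A CE mark is fed
    back only if the server has logic to do so (assumed here to mean
    AccECN support); otherwise the mark is ignored and the client is
    expected to compensate by behaving conservatively.
    """
    if not is_valid:
        return (False, False)
    ce_seen = ip_ecn == Ecn.CE
    return (True, ce_seen and supports_accecn)


def accept_pure_ack(ip_ecn, is_valid):
    """Hypothetical check: a valid pure ACK is accepted for any IP/ECN value."""
    return is_valid
```

Note that neither function ever rejects a packet because of its IP/ECN field, which is the opposite of the "over-strict" test described in Section 4.2.2.2.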
The question of whether and how the receiver of pure ACKs is required to feed back any CE marks on them is outside the scope of the present specification because it is a matter for the relevant feedback specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]). AccECN feedback is required to count CE marking of any control packet including pure ACKs. In contrast, RFC 3168 is silent on this point, so feedback of CE-markings might be implementation specific (see Section 4.4.1.1).

3.3.4.  FIN (Receive)

The TCP data receiver MUST ignore the CE codepoint on incoming FINs that fail any validity check. The validity check in section 5.2 of [RFC5961] is RECOMMENDED.

3.3.5.  RST (Receive)

The "challenge ACK" approach to checking the validity of RSTs (section 3.2 of [RFC5961]) is RECOMMENDED at the data receiver.

3.3.6.  Retransmissions (Receive)

The TCP data receiver MUST ignore the CE codepoint on incoming segments that fail any validity check. The validity check in section 5.2 of [RFC5961] is RECOMMENDED. This will effectively mitigate an attack that uses spoofed data packets to fool the receiver into feeding back spoofed congestion indications to the sender, which in turn would be fooled into continually reducing its congestion window.

4.  Rationale

This section is informative, not normative. It presents counter-arguments against the justifications in the RFC series for disabling ECN on TCP control segments and retransmissions. It also gives rationale for why ECT is safe on control segments that have not, so far, been mentioned in the RFC series. First it addresses over-arching arguments used for most packet types, then it addresses the specific arguments for each packet type in turn.

4.1.  The Reliability Argument

Section 5.2 of RFC 3168 states:

   "To ensure the reliable delivery of the congestion indication of the CE codepoint, an ECT codepoint MUST NOT be set in a packet unless the loss of that packet [at a subsequent node] in the network would be detected by the end nodes and interpreted as an indication of congestion."

We believe this argument is misplaced. TCP does not deliver most control packets reliably. So it is more important to allow control packets to be ECN-capable, which greatly improves reliable delivery of the control packets themselves (see motivation in Section 1.1). ECN also improves the reliability and latency of delivery of any congestion notification on control packets, particularly because TCP does not detect the loss of most types of control packet anyway. Both these points outweigh by far the concern that a CE marking applied to a control packet by one node might subsequently be dropped by another node.

The principle to determine whether a packet can be ECN-capable ought to be "do no extra harm", meaning that the reliability of a congestion signal's delivery ought to be no worse with ECN than without. In particular, setting the CE codepoint on the very same packet that would otherwise have been dropped fulfills this criterion, since either the packet is delivered and the CE signal is delivered to the endpoint, or the packet is dropped and the original congestion signal (packet loss) is delivered to the endpoint.

The concern about a CE marking being dropped at a subsequent node might be motivated by the idea that ECN-marking a packet at the first node does not remove the packet, so it could go on to worsen congestion at a subsequent node. However, it is not useful to reason about congestion by considering single packets.
The departure rate from the first node will generally be the same (fully utilized) with or without ECN, so this argument does not apply.

4.2.  SYNs

RFC 5562 presents two arguments against ECT marking of SYN packets (quoted verbatim):

   "First, when the TCP SYN packet is sent, there are no guarantees that the other TCP endpoint (node B in Figure 2) is ECN-Capable, or that it would be able to understand and react if the ECN CE codepoint was set by a congested router.

   Second, the ECN-Capable codepoint in TCP SYN packets could be misused by malicious clients to "improve" the well-known TCP SYN attack. By setting an ECN-Capable codepoint in TCP SYN packets, a malicious host might be able to inject a large number of TCP SYN packets through a potentially congested ECN-enabled router, congesting it even further."

The first point actually describes two subtly different issues. So three arguments are countered in turn below.

4.2.1.  Argument 1a: Unrecognized CE on the SYN

This argument certainly applied at the time RFC 5562 was written, when no ECN responder mechanism had any logic to recognize a CE marking on a SYN and, even if logic were added, there was no field in the SYN-ACK to feed it back. The problem was that, during the 3WHS, the flag in the TCP header for ECN feedback (called Echo Congestion Experienced) had been overloaded to negotiate the use of ECN itself.

The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has since been designed to solve this problem. Two features are important here:

1.  An AccECN server uses the 3 'ECN' bits in the TCP header of the SYN-ACK to respond to the client. 4 of the possible 8 codepoints provide enough space for the server to feed back which of the 4 IP/ECN codepoints was on the incoming SYN (including CE of course).

2.  If any of these 4 codepoints are in the SYN-ACK, it confirms that the server supports AccECN and, if another codepoint is returned, it confirms that the server doesn't support AccECN.

This still does not seem to allow a client to set ECT on a SYN; the client only finds out afterwards whether the server would have supported it. The trick the client uses for ECN++ is to set ECT on the SYN optimistically then, if the SYN-ACK reveals that the server wouldn't have understood CE on the SYN, the client responds conservatively as if the SYN was marked with CE.

The recommended conservative congestion response is to reduce the initial window, which does not affect the performance of very popular protocols such as HTTP, since it is extremely rare for an HTTP client to send more than one packet as its initial request anyway (for data on HTTP/1 & HTTP/2 request sizes see Fig 3 in [Manzoor17]). Any clients that do frequently use a larger initial window for their first message to the server can cache which servers will not understand ECT on a SYN (see Section 4.2.3 below). If caching is not practical, such clients could reduce the initial window to, say, IW2 or IW3.

   EXPERIMENTATION NEEDED: Experiments will be needed to determine any better strategy for reducing IW in response to congestion on a SYN, when the server does not support congestion feedback on the SYN-ACK (whether cached or discovered explicitly).

4.2.2.  Argument 1b: ECT Considered Invalid on the SYN

Given that ECT-marked SYN packets have been prohibited until now, it cannot be assumed that TCP middleboxes or servers will accept them.

4.2.2.1.  ECT on SYN Considered Invalid by Middleboxes

According to a study using 2014 data [ecn-pam] from a limited range of fixed vantage points, for the top 1M Alexa web sites, adding the ECN capability to SYNs increased connection establishment failures by about 0.4%.

From a wider range of fixed and mobile vantage points, a more recent study in Jan-May 2017 [Mandalari18] found no occurrences of blocking of ECT on SYNs. However, in more than half the mobile networks tested it found wiping of the ECN codepoint at the first hop.

   MEASUREMENTS NEEDED: As wiping at the first hop is remedied, measurements will be needed to check whether SYNs with ECT are sometimes blocked deeper into the path.

Silent failures introduce a retransmission timeout delay (default 1 second) at the initiator before it attempts any fall back strategy (whereas explicit RSTs can be dealt with immediately). Ironically, making SYNs ECN-capable is intended to avoid the timeout when a SYN is lost due to congestion. Fortunately, if there is any discard of ECN-capable SYNs due to policy, it will occur predictably, not randomly like congestion. So the initiator should be able to avoid it by caching those sites that do not support ECN-capable SYNs (see the last paragraph of Section 3.2.1.2).

4.2.2.2.  ECT on SYN Considered Invalid by Servers

A study conducted in Nov 2017 [Kuehlewind18] found that, of the 82% of the Alexa top 50k web servers that supported ECN, 84% disabled ECN if the IP/ECN field on the SYN was ECT0, CE or either. Given most web servers use Linux, this behaviour can most likely be traced to a patch contributed in May 2012 that was first distributed in v3.5 of the Linux kernel [strict-ecn]. The comment says "RFC3168 : 6.1.1 SYN packets must not have ECT/ECN bits set.
If we receive a SYN packet with these bits set, it means a network is playing bad games with TOS bits. In order to avoid possible false congestion notifications, we disable TCP ECN negociation." Of course, some of the 84% might be due to similar code in other OSs.

For brevity we shall call this the "over-strict" ECN test, because it is over-conservative with what it accepts, contrary to Postel's robustness principle. A robust protocol will not usually assume network mangling without comparing with the value originally sent, and one packet is not sufficient to make an assumption with such irreversible consequences anyway.

Ironically, networks rarely seem to alter the IP/ECN field on a SYN from zero to non-zero anyway. In a study conducted in Jan-May 2017 over millions of paths from vantage points in a few dozen mobile and fixed networks [Mandalari18], no such transition was observed. With such a small or non-existent incidence of this sort of network mangling, it would be preferable to report any residual problem paths so that they can be fixed.

In any case, the widespread presence of this 'over-strict' test proves that RFC 5562 was correct to expect that ECT would be considered invalid on SYNs. Nonetheless, it is not an insurmountable problem - the over-strict test in Linux was patched in Apr 2019 [relax-strict-ecn] and caching can work round it where previous versions of Linux are running. The prevalence of these "over-strict" ECN servers makes it challenging to cache them all. However, Section 4.2.3 below explains how a cache of limited size can alleviate this problem for a client's most popular sites.

For the future, [RFC8311] updates RFC 3168 to clarify that the IP/ECN field does not have to be zero on a SYN if documented in an experimental RFC such as the present ECN++ specification.

4.2.3.  Caching Strategies for ECT on SYNs

Given the server handling of ECN on SYNs outlined in Section 4.2.2.2 above, an initiator might combine AccECN with three candidate caching strategies for setting ECT on a SYN:

(S1):  Pessimistic ECT and cache successes: The initiator always requests AccECN, but by default without ECT on the SYN. Then it caches those servers that confirm that they support AccECN as 'ECT SYN OK'. On a subsequent connection to any server that supports AccECN, the initiator can then set ECT on the SYN. When connecting to other servers (non-ECN or classic ECN) it will not set ECT on the SYN, so it will not fail the 'over-strict' ECN test.

   Longer term, as servers upgrade to AccECN, the initiator is still requesting AccECN, so it will add them to the cache and use ECT on subsequent SYNs to those servers. However, assuming it has to cap the size of the cache, the client will not have the benefit of ECT SYNs to those less frequently used AccECN servers expelled from its cache.

(S2):  Optimistic ECT: The initiator always requests AccECN and by default sets ECT on the SYN. Then, if the server response shows it has no AccECN logic (so it cannot feed back a CE mark), the initiator conservatively behaves as if the SYN was CE-marked, by reducing its initial window.

   A.  No cache.

   B.  Cache failures: The optimistic ECT strategy can be improved by caching solely those servers that do not support AccECN as 'ECT SYN NOK'. This would include non-ECN servers and all Classic ECN servers whether 'over-strict' or not. On subsequent connections to these non-AccECN servers, the initiator will still request AccECN but not set ECT on the SYN.
Then, the connection can still fall back to Classic ECN, if the server supports it, and the initiator can use its full initial window (if it has enough request data to need it).

      Longer term, as servers upgrade to AccECN, the initiator will remove them from the cache and use ECT on subsequent SYNs to that server.

      Where an access network operator mediates Internet access via a proxy that does not support AccECN, the optimistic ECT strategy will always fail. This scenario is more likely in mobile networks. Therefore, a mobile host could cache lack of AccECN support per attached access network operator. Whenever it attached to a new operator, it could check a well-known AccECN test server and, if it found no AccECN support, it would add a cache entry for the attached operator. It would only use ECT when neither network nor server were cached. It would only populate its per server cache when not attached to a non-AccECN proxy.

(S3):  ECT by configuration: In a controlled environment, the administrator can make sure that servers support ECN-capable SYN packets. Examples of controlled environments are single-tenant DCs, and possibly multi-tenant DCs if it is assumed that each tenant mostly communicates with its own VMs.

For unmanaged environments like the public Internet, pragmatically the choice is between strategies (S1), (S2A) and (S2B). The normative specification for ECT on a SYN in Section 3.2.1 recommends the "optimistic ECT and cache failures" strategy (S2B) but the choice depends on the implementer's motivation for using ECN++, and the deployment prevalence of different technologies and bug-fixes.
o  The "pessimistic ECT and cache successes" strategy (S1) suffers from exposing the initial SYN to the prevailing loss level, even if the server supports ECT on SYNs, but only on the first connection to each AccECN server. If AccECN becomes widely deployed on servers, SYNs to those AccECN servers that are less frequently used by the client and therefore don't fit in the cache will not benefit from ECN protection at all.

o  The "optimistic ECT without a cache" strategy (S2A) is the simplest. It would satisfy the goal of an implementer who is solely interested in low latency using AccECN and ECN++ and is not concerned about fall-back to Classic ECN.

o  The "optimistic ECT and cache failures" strategy (S2B) exploits ECT on SYNs from the very first attempt. If the server turns out to be 'over-strict' it will disable ECN for the connection, but only for the first connection if it's one of the client's more popular servers that fits in the cache. If the server turns out not to support AccECN, the initiator has to conservatively limit its initial window, but again only for the first connection if it's one of the client's more popular servers (and anyway this rarely makes any difference when most client requests fit in a single packet).

Note that, if AccECN deployment grows, caching successes (S1) starts off small then grows, while caching failures (S2B) becomes large at first, then shrinks. At half-way, the size of the cache has to be capped with either approach, so the default behaviour for all the servers that do not fit in the cache is as important as the behaviour for the popular servers that do fit.
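The difference between caching successes (S1) and caching failures (S2B) can be made concrete with a small sketch. It is illustrative only: the class names, the least-recently-used eviction policy and the default cache size are assumptions, not part of this specification. Cache expiry, which Section 3.2.1.4 says SHOULD be arranged, is omitted for brevity.

```python
from collections import OrderedDict


class BoundedServerSet:
    """Hypothetical size-capped set of server identifiers with LRU eviction."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def __contains__(self, server):
        if server in self._entries:
            self._entries.move_to_end(server)  # refresh recency on lookup
            return True
        return False

    def add(self, server):
        self._entries[server] = True
        self._entries.move_to_end(server)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def discard(self, server):
        self._entries.pop(server, None)


class PessimisticS1:
    """(S1): no ECT on the SYN unless the server is cached as 'ECT SYN OK'."""

    def __init__(self, max_entries=1024):
        self.ok = BoundedServerSet(max_entries)

    def set_ect_on_syn(self, server):
        return server in self.ok

    def on_handshake(self, server, server_supports_accecn):
        if server_supports_accecn:
            self.ok.add(server)
        else:
            self.ok.discard(server)


class OptimisticS2B:
    """(S2B): ECT on the SYN unless the server is cached as 'ECT SYN NOK'."""

    def __init__(self, max_entries=1024):
        self.nok = BoundedServerSet(max_entries)

    def set_ect_on_syn(self, server):
        return server not in self.nok

    def on_handshake(self, server, server_supports_accecn):
        if server_supports_accecn:
            self.nok.discard(server)  # server upgraded: use ECT again
        else:
            self.nok.add(server)
```

In both classes the answer for a server that is not in the cache is the strategy's default, which is why the default matters as much as the cached behaviour once the cache is capped: an evicted AccECN server loses ECT under S1, while an evicted non-AccECN server gets an ECT SYN again under S2B.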
MEASUREMENTS NEEDED: Measurements are needed to determine which strategy would be sufficient for any particular client, whether a particular client would need different strategies in different circumstances and how many occurrences of problems would be masked by how few cache entries.

Another strategy would be to send a not-ECT SYN a short delay (below the typical lowest RTT) after an ECT SYN and only accept the non-ECT connection if it returned first. This would reduce the performance penalty for those deploying ECT SYN support. However, this 'happy eyeballs' approach becomes complex when multiple optional features are all tried on the first SYN (or on multiple SYNs), so it is not recommended.

4.2.4.  Argument 2: DoS Attacks

[RFC5562] says that ECT SYN packets could be misused by malicious clients to augment "the well-known TCP SYN attack". It goes on to say "a malicious host might be able to inject a large number of TCP SYN packets through a potentially congested ECN-enabled router, congesting it even further."

We assume this is a reference to the TCP SYN flood attack (see https://en.wikipedia.org/wiki/SYN_flood), which is an attack against a responder end point. We assume the idea of this attack is to use ECT to get more packets through an ECN-enabled router in preference to other non-ECN traffic so that they can go on to use the SYN flooding attack to inflict more damage on the responder end point. This argument could apply to flooding with any type of packet, but we assume SYNs are singled out because their source address is easier to spoof, whereas floods of other types of packets are easier to block.

Mandating Not-ECT in an RFC does not stop attackers using ECT for flooding. Nonetheless, if a standard says SYNs are not meant to be ECT it would make it legitimate for firewalls to discard them.
However this would negate the considerable benefit of ECT SYNs for compliant transports and seems unnecessary because RFC 3168 already provides the means to address this concern. In section 7, RFC 3168 says "During periods where ... the potential packet marking rate would be high, our recommendation is that routers drop packets rather then set the CE codepoint..." and this advice is repeated in [RFC7567] (section 4.2.1). This makes it harder for flooding packets to gain from ECT.

[ecn-overload] showed that ECT can only slightly augment flooding attacks relative to a non-ECT attack. It was hard to overload the link without causing the queue to grow, which in turn caused the AQM to disable ECN and switch to drop, thus negating any advantage of using ECT. This was true even with the switch-over point set to 25% drop probability (i.e. the arrival rate was 133% of the link rate).

4.3.  SYN-ACKs

The proposed approach in Section 3.2.2 for experimenting with ECN-capable SYN-ACKs is effectively identical to the scheme called ECN+ [ECN-PLUS]. In 2005, the ECN+ paper demonstrated that it could reduce the average Web response time by an order of magnitude. It also argued that adding ECT to SYN-ACKs did not raise any new security vulnerabilities.

4.3.1.  Possibility of Unrecognized CE on the SYN-ACK

The feedback behaviour by the initiator in response to a CE-marked SYN-ACK from the responder depends on whether classic ECN feedback [RFC3168] or AccECN feedback [I-D.ietf-tcpm-accurate-ecn] has been negotiated. In either case no change is required to RFC 3168 or the AccECN specification.

Some classic ECN client implementations might ignore a CE-mark on a SYN-ACK, or even ignore a SYN-ACK packet entirely if it is set to ECT or CE.
This is a possibility because an RFC 3168 implementation would not necessarily expect a SYN-ACK to be ECN-capable. This issue already came up when the IETF first decided to experiment with ECN on SYN-ACKs [RFC5562] and it was decided to go ahead without any extra precautionary measures. This was because the probability of encountering the problem was believed to be low and the harm if the problem arose was also low (see Appendix B of RFC 5562).

4.3.2.  Response to Congestion on a SYN-ACK

The IETF has already specified an experiment with ECN-capable SYN-ACK packets [RFC5562]. It was inspired by the ECN+ paper, but it specified a much more conservative congestion response to a CE-marked SYN-ACK, called ECN+/TryOnce. This required the server to reduce its initial window to 1 segment (like ECN+), but then the server had to send a second SYN-ACK and wait for its ACK before it could continue with its initial window of 1 SMSS. The second SYN-ACK of this 5-way handshake had to carry no data, and had to disable ECN, but no justification was given for these last two aspects.

The present ECN++ experimental specification obsoletes RFC 5562 because it uses the ECN+ congestion response, not ECN+/TryOnce. First we argue against the rationale for ECN+/TryOnce given in sections 4.4 and 6.2 of [RFC5562]. It starts with a rather too literal interpretation of the requirement in RFC 3168 that says TCP's response to a single CE mark has to be "essentially the same as the congestion control response to a *single* dropped packet." TCP's response to a dropped initial (SYN or SYN-ACK) packet is to wait for the retransmission timer to expire (currently 1s). However, this long delay assumes the worst case between two possible causes of the loss: a) heavy overload; or b) the normal capacity-seeking behaviour of other TCP flows.
When the network is still delivering CE-marked packets, it implies that there is an AQM at the bottleneck and that it is not overloaded. This is because an AQM under overload will disable ECN (as recommended in section 7 of RFC 3168 and repeated in section 4.2.1 of RFC 7567). So scenario (a) can be ruled out. Therefore, TCP's response to a CE-marked SYN-ACK can be similar to its response to the loss of _any_ packet, rather than backing off as if the special _initial_ packet of a flow has been lost.

How TCP responds to the loss of any single packet depends on what it has just been doing. But there is not really a precedent for TCP's response when it experiences a CE mark having sent only one (small) packet. If TCP had been adding one segment per RTT, it would have halved its congestion window, but it hasn't established a congestion window yet. If it had been exponentially increasing it would have exited slow start, but it hasn't started exponentially increasing yet so it hasn't established a slow-start threshold.

Therefore, we have to work out a reasoned argument for what to do. If an AQM is CE-marking packets, it implies there is already a queue and it is probably already somewhere around the AQM's operating point - it is unlikely to be well below and it might be well above. So, the more data packets that the client sends in its IW, the more likely at least one will be CE marked, leading it to exit slow-start early. On the other hand, it is highly unlikely that the SYN-ACK itself pushed the AQM into congestion, so it will be safe to introduce another single segment immediately (1 RTT after the SYN-ACK). Therefore, starting to probe for capacity with a slow start from an initial window of 1 segment seems appropriate to the circumstances. This is the approach adopted in Section 3.2.2.
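The reasoning above can be illustrated with a minimal sketch. It is not taken from any implementation: the function names, the default of 10 segments and the 1460-byte SMSS are assumptions chosen only for illustration. It shows a responder falling back to an initial window of 1 segment when it learns that its SYN-ACK was CE-marked, and how slow start would then grow the window if no further marks arrived.

```python
from typing import List


def initial_cwnd_bytes(synack_ce_marked: bool,
                       default_iw_segments: int,
                       smss: int) -> int:
    """Initial congestion window in bytes (hypothetical helper).

    If feedback says the SYN-ACK was CE-marked, start from 1 segment,
    as argued in the text; otherwise use the configured default IW.
    """
    iw_segments = 1 if synack_ce_marked else default_iw_segments
    return iw_segments * smss


def slow_start_schedule(iw_segments: int, rounds: int) -> List[int]:
    """cwnd (in segments) at the start of each round trip.

    Idealised slow start: the window doubles every RTT and no further
    CE marks or losses occur (an assumption of this sketch).
    """
    return [iw_segments * 2 ** r for r in range(rounds)]


if __name__ == "__main__":
    print(initial_cwnd_bytes(True, 10, 1460))
    print(slow_start_schedule(1, 5))
```

Under these idealised assumptions, a flow that starts from 1 segment exceeds 10 segments in flight by its fifth round trip, which gives a feel for the cost of the conservative response.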
   EXPERIMENTATION NEEDED: Experiments will be needed to check the above reasoning and determine any better strategy for reducing IW in response to congestion on a SYN-ACK (or a SYN).

4.3.3.  Fall-Back if ECT SYN-ACK Fails

An alternative to the server caching failed connection attempts would be for the server to rely on the client caching failed attempts (on the basis that the client would cache a failure whether ECT was blocked on the SYN or the SYN-ACK). This strategy cannot be used if the SYN does not request AccECN support. It works as follows: if the server receives a SYN that requests AccECN support but is set to not-ECT, it replies with a SYN-ACK also set to not-ECT. If a middlebox only blocks ECT on SYNs, not SYN-ACKs, this strategy might disable ECN on a SYN-ACK when it did not need to, but at least it saves the server from maintaining a cache.

4.4.  Pure ACKs

Section 5.2 of RFC 3168 gives the following arguments for not allowing the ECT marking of pure ACKs (ACKs not piggy-backed on data):

   "To ensure the reliable delivery of the congestion indication of the CE codepoint, an ECT codepoint MUST NOT be set in a packet unless the loss of that packet in the network would be detected by the end nodes and interpreted as an indication of congestion.

   Transport protocols such as TCP do not necessarily detect all packet drops, such as the drop of a "pure" ACK packet; for example, TCP does not reduce the arrival rate of subsequent ACK packets in response to an earlier dropped ACK packet. Any proposal for extending ECN-Capability to such packets would have to address issues such as the case of an ACK packet that was marked with the CE codepoint but was later dropped in the network.
   We believe that this aspect is still the subject of research, so this document specifies that at this time, "pure" ACK packets MUST NOT indicate ECN-Capability."

Later on, in section 6.1.4 it reads:

   "For the current generation of TCP congestion control algorithms, pure acknowledgement packets (e.g., packets that do not contain any accompanying data) MUST be sent with the not-ECT codepoint. Current TCP receivers have no mechanisms for reducing traffic on the ACK-path in response to congestion notification. Mechanisms for responding to congestion on the ACK-path are areas for current and future research. (One simple possibility would be for the sender to reduce its congestion window when it receives a pure ACK packet with the CE codepoint set). For current TCP implementations, a single dropped ACK generally has only a very small effect on the TCP's sending rate."

We next address each of the arguments presented above.

The first argument is a specific instance of the reliability argument for the case of pure ACKs. This has already been addressed by countering the general reliability argument in Section 4.1.

The second argument says that ECN ought not to be enabled unless there is a mechanism to respond to it. This argument actually comprises three sub-arguments:

   Mechanism feasibility: If ECN is enabled on Pure ACKs, are there, or could there be, suitable mechanisms to detect, feed back and respond to ECN-marked Pure ACKs?

   Do no extra harm: There has never been a mechanism to respond to loss of non-ECN Pure ACKs. So it seems that adding ECN without a response mechanism will do no extra harm to others, while improving a connection's own performance (because loss of an ACK holds back new data).
   However, if the end systems have no response mechanism, ECN Pure ACKs do slightly more harm than non-ECN, because the AQM doesn't immediately clear ECT packets from the queue until it reaches overload and disables ECN.

   Standards policy: Even if there were no harm to others, does it set an undesirable precedent to allow a flow to use ECN to protect its Pure ACKs from loss, when there is no mechanism to respond to ECN-marking?

The last two arguments involve value judgements, but they both depend on the concrete technical question of mechanism feasibility, which will therefore be addressed first in Section 4.4.1 below. Then Section 4.4.2 draws conclusions by addressing the value judgements in the other two questions.

4.4.1.  Mechanisms to Respond to CE-Marked Pure ACKs

The question of whether the receiver of pure ACKs is required to detect and feed back any CE-marking is outside the scope of the present specification - it is a matter for the relevant feedback specification (classic ECN [RFC3168] and AccECN [I-D.ietf-tcpm-accurate-ecn]). The response to congestion feedback is also out of scope, because it would be defined in the base TCP congestion control specification [RFC5681] or its variants.

Nonetheless, in order to decide whether the present ECN++ experimental specification should require a host to set ECT on pure ACKs, we only need to know whether a response mechanism would be feasible - we do not have to standardize it. So the bullets below assess, for each type of feedback, whether the three stages of the congestion response mechanism could all work.

   Detection: Can the receiver of a pure ACK detect a CE marking on it?:

      *  Classic feedback: RFC 3168 is silent on this point.
         The implementer of the receiver would not expect CE marks on pure ACKs, but the implementation might happen to check for CE marks before it looks for the data. So detection will be implementation-dependent.

      *  AccECN feedback: the AccECN specification requires the receiver of any TCP packets to count any CE marks on them (whether or not it sends ECN-capable control packets itself).

   Feedback: TCP never ACKs a pure ACK, but the receiver of a CE-mark on a pure ACK could feed it back when it sends a subsequent data segment (if it ever does):

      *  Classic feedback: RFC 3168 is silent on this point, so feedback of CE-markings might be implementation specific. If the receiver (of the pure ACKs) did generate feedback, it would set the echo congestion experienced (ECE) flag in the TCP header of subsequent packets in the round, as it would to feed back CE on data packets.

      *  AccECN feedback: the receiver continually feeds back a count of the number of CE-marked packets that it has received and, optionally, a count of CE-marked bytes. For either metric, AccECN includes pure ACKs and indeed all types of packets.

   Congestion response: In either case (classic or AccECN feedback), if the TCP sender does receive feedback about CE-markings on pure ACKs, it will be able to reduce the congestion window (cwnd) and/or the ACK rate.

Therefore a congestion response mechanism is clearly feasible if AccECN has been negotiated, but the position is unknown for the installed base of classic ECN feedback.

4.4.1.1.  Congestion Window Response to CE-Marked Pure ACKs

This subsection explores issues that congestion control designers will need to consider when defining a cwnd response to CE-marked Pure ACKs.

A CE-mark on a Pure ACK does not mean that only Pure ACKs are causing congestion.
It only means that the marked Pure ACK is part of an aggregate that is collectively causing a bottleneck queue to randomly CE-mark a fraction of the packets. A CE-mark on a Pure ACK might be due to data packets in other flows through the same bottleneck, due to data packets interspersed between Pure ACKs in the same half-connection, or just due to the rate of Pure ACKs alone. (RFC 3168 only considered the last possibility, which led to the argument that ECN-enabled Pure ACKs had to be deferred, because ACK congestion control was a research issue.)

If a host has been sending a mix of Pure ACKs and data, it doesn't need to work out whether a particular CE mark was on a Pure ACK or not; it just needs to respond to congestion feedback as a whole by reducing its congestion window (cwnd), which limits the data it can launch into flight through the congested bottleneck. If it is purely receiving data and sending only Pure ACKs, reducing cwnd will have caused it no harm, having no effect on its ACK rate (the next subsection addresses that).

However, when a host is sending data as well as Pure ACKs, it would not be right for CE-marks on Pure ACKs and on data packets to induce the same reduction in cwnd. A possible way to address this issue would be to weight the response by the size of the marked packets (assuming the congestion control supports a weighted response, e.g. [RFC8257]). For instance, one could calculate the fraction of CE-marked bytes (headers and data) over each round trip (say) as follows:

   (CE-marked header bytes + CE-marked data bytes) / (all header bytes + all data bytes)

Header bytes can be calculated by multiplying a packet count by a nominal header size, which is possible with AccECN feedback, because it gives a count of CE-marked packets (as well as CE-marked bytes).
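As a worked illustration of the fraction just described, the following minimal sketch computes it from per-round-trip counters. The function name, the parameter names and the nominal header size of 40 bytes are assumptions made for this example only; they are not defined by this document or by the AccECN specification. The sketch also assumes the byte counters cover payload (data) bytes only.

```python
# Assumption for illustration: 20-byte IPv4 header plus 20-byte TCP
# header, with no options. Any nominal value could be substituted.
NOMINAL_HEADER_BYTES = 40


def ce_byte_fraction(ce_packets: int,
                     ce_data_bytes: int,
                     all_packets: int,
                     all_data_bytes: int,
                     header_bytes: int = NOMINAL_HEADER_BYTES) -> float:
    """Fraction of CE-marked bytes (headers and data) in one round trip.

    Implements:
      (CE-marked header bytes + CE-marked data bytes)
        / (all header bytes + all data bytes)
    where header bytes are estimated as packet count * nominal size.
    """
    total = all_packets * header_bytes + all_data_bytes
    if total == 0:
        return 0.0  # nothing was sent this round, so nothing was marked
    marked = ce_packets * header_bytes + ce_data_bytes
    return marked / total


if __name__ == "__main__":
    # Only Pure ACKs in the round: 2 of 10 marked.
    print(ce_byte_fraction(2, 0, 10, 0))
    # 1 marked Pure ACK among 9 unmarked 1460-byte data packets.
    print(ce_byte_fraction(1, 0, 10, 9 * 1460))
```

With only Pure ACKs in flight the result reduces to the fraction of marked packets, whereas a single marked Pure ACK among full-sized data packets contributes only a very small fraction.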
The above simple aggregate calculation caters for the full range of scenarios; from all Pure ACKs to just a few interspersed with data packets.

Note that any mechanism that reduces cwnd due to CE-marked Pure ACKs would need to be integrated with the congestion window validation mechanism [RFC7661], which already conservatively reduces cwnd over time because cwnd becomes stale if it is not used to fill the pipe.

4.4.1.2.  ACK Rate Response to CE-Marked Pure ACKs

Reducing the congestion window will have no effect on the rate of pure ACKs. The worst case here is if the bottleneck is congested solely with pure ACKs, but it could also be problematic if a large fraction of the load was from unresponsive ACKs, leaving little or no capacity for the load from responsive data.

Since RFC 3168 was published, experimental Acknowledgement Congestion Control (AckCC) techniques have been documented in [RFC5690] (informational). So any pair of TCP end-points can choose to agree to regulate the delayed ACK ratio in response to lost or CE-marked pure ACKs. However, the protocol has a number of open issues concerning deployment (e.g. it requires support from both ends, it relies on two new TCP options, one of which is required on the SYN where option space is at a premium and, if either option is blocked by a middlebox, no fall-back behaviour is specified).

The new TCP options address two problems, namely that TCP had: i) no mechanism to allow ECT to be set on pure ACKs; and ii) no mechanism to feed back loss or CE-marking of pure ACKs. A combination of the present specification and AccECN addresses both these problems, at least for CE-marking. So it might now be possible to design an ECN-specific ACK congestion control scheme without the extra TCP options proposed in RFC 5690.
However, such a mechanism is out of scope of the present document.

Setting aside the practicality of RFC 5690, the need for AckCC has not been conclusively demonstrated. It has been argued that the Internet has survived so far with no mechanism to even detect loss of pure ACKs. However, it has also been argued that ECN is not the same as loss. Packet discard can naturally thin the ACK load to whatever the bottleneck can support, whereas ECN marking does not (it queues the ACKs instead). Nonetheless, RFC 3168 (section 7) recommends that an AQM switches over from ECN marking to discard when the marking probability becomes high. Therefore discard can still be relied on to thin out ECN-enabled pure ACKs as a last resort.

4.4.2.  Summary: Enabling ECN on Pure ACKs

In the case when AccECN has been negotiated, it provides a feasible congestion response mechanism, so the arguments for ECT on pure ACKs heavily outweigh those against. ECN is always more and never less reliable for delivery of congestion notification. A cwnd reduction needs to be considered by congestion control designers as a response to congestion on pure ACKs. Separately, AckCC (or an improved variant exploiting AccECN) could optionally be used to regulate the spacing between pure ACKs. However, it is not clear whether AckCC is justified. If it is not, packet discard will still act as the "congestion response of last resort" by thinning out the traffic. In contrast, not setting ECT on pure ACKs is certainly detrimental to performance, because when a pure ACK is lost it can prevent the release of new data.

In the case when Classic ECN has been negotiated, the argument for ECT on pure ACKs is less clear-cut. Some of the installed base of RFC 3168 implementations might happen to (unintentionally) provide a feedback mechanism to support a cwnd response.
For those that did not, setting ECT on pure ACKs would be better for the flow's own performance than not setting it. However, where there was no feedback mechanism, setting ECT could do slightly more harm than not setting it. AckCC could provide a complementary response mechanism, because it is designed to work with RFC 3168 ECN, but it has deployment challenges. In summary, a congestion response mechanism is unlikely to be feasible with the installed base of classic ECN.

This specification uses a safe approach. Allowing hosts to set ECT on Pure ACKs without a feasible response mechanism could result in risk. It would certainly improve the flow's own performance, but it would slightly increase potential harm to others. Moreover, it would set an undesirable precedent for setting ECT on packets with no mechanism to respond to any resulting congestion signals. Therefore, Section 3.2.3 allows ECT on Pure ACKs if AccECN feedback has been negotiated, but not with classic RFC 3168 ECN feedback.

4.5.  Window Probes

Section 6.1.6 of RFC 3168 presents only the reliability argument for prohibiting ECT on Window probes:

   "If a window probe packet is dropped in the network, this loss is not detected by the receiver. Therefore, the TCP data sender MUST NOT set either an ECT codepoint or the CWR bit on window probe packets.

   However, because window probes use exact sequence numbers, they cannot be easily spoofed in denial-of-service attacks. Therefore, if a window probe arrives with the CE codepoint set, then the receiver SHOULD respond to the ECN indications."

The reliability argument has already been addressed in Section 4.1.
Allowing ECT on window probes could considerably improve performance because, once the receive window has reopened, if a window probe is lost the sender will stall until the next window probe reaches the receiver, which might be after the maximum retransmission timeout (at least 1 minute [RFC6928]).

On the bright side, RFC 3168 at least specifies the receiver behaviour if a CE-marked window probe arrives, so changing the behaviour ought to be less painful than for other packet types.

4.6.  FINs

RFC 3168 is silent on whether a TCP sender can set ECT on a FIN. A FIN is considered as part of the sequence of data, and the rate of pure ACKs sent after a FIN could be controlled by a CE marking on the FIN. Therefore there is no reason not to set ECT on a FIN.

4.7.  RSTs

RFC 3168 is silent on whether a TCP sender can set ECT on a RST. The host generating the RST message does not have an open connection after sending it (either because there was no such connection when the packet that triggered the RST message was received or because the packet that triggered the RST message also triggered the closure of the connection).

Moreover, the receiver of a CE-marked RST message can either: i) accept the RST message and close the connection; ii) emit a so-called challenge ACK in response (with suitable throttling) [RFC5961] and otherwise ignore the RST (e.g. because the sequence number is in-window but not the precise number expected next); or iii) discard the RST message (e.g. because the sequence number is out-of-window). In the first two cases there is no point in echoing any CE mark received because the sender closed its connection when it sent the RST. In the third case it makes sense to discard the CE signal as well as the RST.
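The three cases above can be sketched as follows. This is only an illustration of the sequence-number checks mentioned in the text, not an implementation of [RFC5961]: the names `RstAction`, `classify_rst` and `handle_rst` are hypothetical, challenge-ACK throttling is omitted, and the check is reduced to a comparison against the next expected sequence number and the receive window.

```python
from enum import Enum

SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap modulo 2^32


class RstAction(Enum):
    ACCEPT = "accept the RST and close the connection"
    CHALLENGE_ACK = "send a challenge ACK and otherwise ignore the RST"
    DISCARD = "discard the RST"


def classify_rst(seq: int, rcv_nxt: int, rcv_wnd: int) -> RstAction:
    """Classify an incoming RST by its sequence number (sketch only)."""
    offset = (seq - rcv_nxt) % SEQ_SPACE  # distance ahead of rcv_nxt
    if offset == 0:
        return RstAction.ACCEPT          # case i): exact match
    if offset < rcv_wnd:
        return RstAction.CHALLENGE_ACK   # case ii): in-window, not exact
    return RstAction.DISCARD             # case iii): out-of-window


def handle_rst(seq: int, rcv_nxt: int, rcv_wnd: int, ce_marked: bool):
    """Return (action, echo_ce) for a possibly CE-marked RST.

    In all three cases the CE mark is not fed back: in cases i) and
    ii) the sender of the RST has already closed its connection, and
    in case iii) the CE signal is discarded along with the RST.
    """
    action = classify_rst(seq, rcv_nxt, rcv_wnd)
    echo_ce = False  # deliberately independent of ce_marked
    return action, echo_ce


if __name__ == "__main__":
    print(handle_rst(1000, 1000, 65535, ce_marked=True))
    print(handle_rst(1500, 1000, 65535, ce_marked=True))
    print(handle_rst(999, 1000, 65535, ce_marked=True))
```

The `ce_marked` argument is accepted but never acted on, which is the point of the sketch: whichever branch is taken, the receiver has nothing useful to do with a CE mark on a RST.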
Although a congestion response following a CE-marking on a RST does not appear to make sense, the following factors have been considered before deciding whether the sender ought to set ECT on a RST message:

   o  As explained above, a congestion response by the sender of a CE-marked RST message is not possible;

   o  So the only reason for the sender setting ECT on a RST would be to improve the reliability of the message's delivery;

   o  RST messages are used to both mount and mitigate attacks:

      *  Spoofed RST messages are used by attackers to terminate ongoing connections, although the mitigations in RFC 5961 have considerably raised the bar against off-path RST attacks;

      *  Legitimate RST messages allow endpoints to inform their peers to eliminate existing state that corresponds to non-existent connections, liberating resources, e.g. in DoS attack scenarios;

   o  AQMs are advised to disable ECN marking during persistent overload, so:

      *  it is harder for an attacker to exploit ECN to intensify an attack;

      *  it is harder for a legitimate user to exploit ECN to more reliably mitigate an attack;

   o  Prohibiting ECT on a RST would deny the benefit of ECN to legitimate RST messages, but not to attackers who can disregard RFCs;

   o  If ECT were prohibited on RSTs:

      *  it would be easy for security middleboxes to discard all ECN-capable RSTs;

      *  However, unlike a SYN flood, it is already easy for a security middlebox (or host) to distinguish a RST flood from legitimate traffic [RFC5961], and even if some legitimate RSTs are accidentally removed as well, legitimate connections still function.

So, on balance, it has been decided that it is worth experimenting with ECT on RSTs.
During experiments, if the ECN capability on RSTs is found to open a vulnerability that is hard to close, this decision can be reversed, before it is specified for the standards track.

4.8.  Retransmitted Packets

RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets. The rationale for this consumes nearly 2 pages of RFC 3168, so the reader is referred to section 6.1.5 of RFC 3168, rather than quoting it all here. There are essentially three arguments, namely: reliability; DoS attacks; and over-reaction to congestion. We address them in order below.

The reliability argument has already been addressed in Section 4.1.

Protection against DoS attacks is not afforded by prohibiting ECT on retransmitted packets. An attacker can set CE on spoofed retransmissions whether or not it is prohibited by an RFC. Protection against the DoS attack described in section 6.1.5 of RFC 3168 is solely afforded by the requirement that "the TCP data receiver SHOULD ignore the CE codepoint on out-of-window packets". Therefore in Section 3.2.7 the sender is allowed to set ECT on retransmitted packets, in order to reduce the chance of them being dropped. We also strengthen the receiver's requirement from "SHOULD ignore" to "MUST ignore". And we generalize the receiver's requirement to include failure of any validity check, not just out-of-window checks, in order to include the more stringent validity checks in RFC 5961 that have been developed since RFC 3168.

A consequence is that, for those retransmitted packets that arrive at the receiver after the original packet has been properly received (so-called spurious retransmissions), any CE marking will be ignored.
There is no problem with that because the fact that the original packet has been delivered implies that the sender's original congestion response (when it deemed the packet lost and retransmitted it) was unnecessary.

Finally, the third argument is about over-reacting to congestion. The argument goes that, if a retransmitted packet is dropped, the sender will not detect it, so it will not react again to congestion (it would have reduced its congestion window already when it retransmitted the packet). Whereas, if retransmitted packets can be CE tagged instead of dropped, senders could potentially react more than once to congestion. However, we argue that it is legitimate to respond again to congestion if it still persists in subsequent round trip(s).

Therefore, in all three cases, it is not incorrect to set ECT on retransmissions.

4.9.  General Fall-back for any Control Packet

Extensive experiments have found no evidence of any traversal problems with ECT on any TCP control packet [Mandalari18]. Nonetheless, Sections 3.2.1.4 and 3.2.2.3 specify fall-back measures if ECT on the first packet of each half-connection (SYN or SYN-ACK) appears to be blocking progress. Here, the question of fall-back measures for ECT on other control packets is explored. It supports the advice given in Section 3.2.8; until there's evidence that something's broken, don't fix it.

If an implementation has had to disable ECT to ensure the first packet of a flow (SYN or SYN-ACK) gets through, the question arises whether it ought to disable ECT on all subsequent control packets within the same TCP connection. Without evidence of any such problems, this seems unnecessarily cautious. Particularly given it would be hard to detect loss of most other types of TCP control packets that are not ACK'd.
And particularly given that unnecessarily removing ECT from other control packets could lead to performance problems, e.g. by directing them into another queue [I-D.ietf-tsvwg-ecn-l4s-id] or over a different path, because some broken multipath equipment (erroneously) routes based on all 8 bits of the Diffserv field.

In the case where a connection starts without ECT on the SYN (perhaps because problems with previous connections had been cached), there will have been no test for ECT traversal in the client-server direction until the pure ACK that completes the handshake. It is possible that some middlebox might block ECT on this pure ACK or on later retransmissions of lost packets. Similarly, after a route change, the new path might include some middlebox that blocks ECT on some or all TCP control packets. However, without evidence of such problems, the complexity of a fix does not seem worthwhile.

   MORE MEASUREMENTS NEEDED (?): If further two-ended measurements do find evidence for these traversal problems, measurements would be needed to check for correlation of ECT traversal problems between different control packets. It might then be necessary to introduce a catch-all fall-back rule that disables ECT on certain subsequent TCP control packets based on some criteria developed from these measurements.

5.  Interaction with popular variants or derivatives of TCP

The following subsections discuss any interactions between setting ECT on all packets and using the following popular variants of TCP: IW10 and TFO. It also briefly notes the possibility that the principles applied here should translate to protocols derived from TCP. This section is informative not normative, because no interactions have been identified that require any change to specifications.
The subsection on IW10 discusses potential changes to specifications but recommends that no changes are needed.

The designs of the following TCP variants have also been assessed and found not to interact adversely with ECT on TCP control packets: SYN cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]), TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

5.1.  IW10

IW10 is an experiment to determine whether it is safe for TCP to use an initial window of 10 SMSS [RFC6928].

This subsection does not recommend any additions to the present specification in order to interwork with IW10. The specifications as they stand are safe, and there is only a corner-case with ECT on the SYN where performance could be occasionally improved, as explained below.

As specified in Section 3.2.1.1, a TCP initiator will typically only set ECT on the SYN if it requests AccECN support. If, however, the SYN-ACK tells the initiator that the responder does not support AccECN, Section 3.2.1.1 advises the initiator to conservatively reduce its initial window, preferably to 1 SMSS because, if the SYN was CE-marked, the SYN-ACK has no way to feed that back.

If the initiator implements IW10, it seems rather over-conservative to reduce IW from 10 to 1 just in case a congestion marking was missed. Nonetheless, a reduction to 1 SMSS will rarely harm performance, because:

   o  as long as the initiator is caching failures to negotiate AccECN, subsequent attempts to access the same server will not use ECT on the SYN anyway, so there will no longer be any need to conservatively reduce IW;

   o  currently, at least for web sessions, it is extremely rare for a TCP initiator (client) to have more than one data segment to send at the start of a TCP connection (see Fig 3 in [Manzoor17]) - IW10 is primarily exploited by TCP servers.
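The initiator's corner case described above can be sketched as follows. This models only the single situation discussed in this subsection (an ECT SYN answered by a responder that does not support AccECN); the function and parameter names are hypothetical and the default of 10 segments stands in for an IW10 configuration.

```python
def initiator_initial_window(sent_ect_syn: bool,
                             responder_supports_accecn: bool,
                             default_iw_segments: int = 10) -> int:
    """Initial window, in segments, for the TCP initiator (sketch).

    If the SYN was sent as ECT but the SYN-ACK shows that the
    responder does not support AccECN, a CE mark on the SYN could not
    have been fed back, so the initiator conservatively falls back to
    1 segment. Any response to explicit CE feedback is outside the
    scope of this sketch.
    """
    if sent_ect_syn and not responder_supports_accecn:
        return 1
    return default_iw_segments


if __name__ == "__main__":
    # ECT SYN sent, responder lacks AccECN: conservative IW.
    print(initiator_initial_window(True, False))
    # Not-ECT SYN (e.g. after a cached failure): the default IW applies.
    print(initiator_initial_window(False, False))
```

The second call corresponds to the first bullet above: once the failure to negotiate AccECN has been cached, the SYN is no longer sent as ECT and the reduction is no longer needed.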
If a responder receives feedback that the SYN-ACK was CE-marked, Section 3.2.2.2 recommends that it reduces its initial window, preferably to 1 SMSS. When the responder also implements IW10, it might again seem rather over-conservative to reduce IW from 10 to 1. But in this case the rationale is somewhat different:

   o  Feedback that the SYN-ACK was CE-marked is an explicit indication that the queue has been building, not just uncertainty due to absence of feedback;

   o  Given it is now likely that a queue already exists, the more data packets that the server sends in its IW, the more likely at least one will be CE marked, leading it to exit slow-start early.

Experimentation will be needed to determine the best strategy. It should be noted that experience from recent congestion avoidance experiments where the window is reduced by less than half is not necessarily applicable to a flow start scenario. Reducing cwnd by less is one thing. Reducing an increase in cwnd by less is another.

5.2.  TFO

TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round trip delay of TCP's 3-way hand-shake (3WHS). A TFO initiator caches a cookie from a previous connection with a TFO-enabled server. Then, for subsequent connections to the same server, any data included on the SYN can be passed directly to the server application, which can then return up to an initial window of response data on the SYN-ACK and on data segments straight after it, without waiting for the ACK that completes the 3WHS.

The TFO experiment and the present experiment to add ECN-support for TCP control packets can be combined without altering either specification, which is justified as follows:

   o  The handling of ECN marking on a SYN is no different whether or not it carries data.

   o  In response to any CE-marking on the SYN-ACK, the responder adopts
      the normal response to congestion, as discussed in Section 7.2 of
      [RFC7413].

5.3.  L4S

   A Low Latency Low Loss Scalable throughput (L4S) variant of TCP such
   as TCP Prague [PragueLinux] is mandated to negotiate AccECN feedback,
   and strongly recommended to use ECN++ [I-D.ietf-tsvwg-ecn-l4s-id].

   The L4S experiment and the present ECN++ experiment can be combined
   without altering any of the specifications.  The only difference
   would be in the recommendation of the best SYN cache strategy.

   The normative specification for ECT on a SYN in Section 3.2.1
   recommends the "optimistic ECT and cache failures" strategy (S2B
   defined in Section 4.2.3) for the general Internet.  However, if a
   user's Internet access bottleneck supported L4S ECN but not Classic
   ECN, the "optimistic ECT without a cache" strategy (S2A) would make
   most sense, because there would be little point trying to avoid the
   'over-strict' test and negotiate Classic ECN, if L4S ECN but not
   Classic ECN was available on that user's access link (as is the case
   with Low Latency DOCSIS [DOCSIS3.1]).

   Strategy (S2A) is the simplest, because it requires no cache.  It
   would satisfy the goal of an implementer who is solely interested in
   ultra-low latency using AccECN and ECN++ (e.g. accessing L4S servers)
   and is not concerned about fall-back to Classic ECN (e.g. when
   accessing other servers).

5.4.  Other transport protocols

   Experience from experiments on adding ECN support to all TCP packets
   ought to be directly transferable between TCP and other transport
   protocols, like SCTP or QUIC.

   Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
   track transport protocol derived from TCP.  SCTP currently does not
   include ECN support, but Appendix A of RFC 4960 broadly describes how
   it would be supported and a (long-expired) draft on the addition of
   ECN to SCTP has been produced [I-D.stewart-tsvwg-sctpecn].  This
   draft avoided setting ECT on control packets and retransmissions,
   closely following the arguments in RFC 3168.

   QUIC [I-D.ietf-quic-transport] is another standards track transport
   protocol offering similar services to TCP but intended to exploit
   some of the benefits of running over UDP.  Building on the arguments
   in the current draft, a QUIC sender sets ECT(0) on all packets.

6.  Security Considerations

   Section 3.2.6 considers the question of whether ECT on RSTs will
   allow RST attacks to be intensified.  There are several security
   arguments presented in RFC 3168 for preventing the ECN marking of TCP
   control packets and retransmitted segments.  We believe all of them
   have been properly addressed in Section 4, particularly Section 4.2.4
   and Section 4.8 on DoS attacks using spoofed ECT-marked SYNs and
   spoofed CE-marked retransmissions.

7.  IANA Considerations

   There are no IANA considerations in this memo.

8.  Acknowledgments

   Thanks to Mirja Kuehlewind, David Black, Padma Bhooma, Gorry
   Fairhurst, Michael Scharf, Yuchung Cheng and Christophe Paasch for
   their useful reviews.

   The work of Marcelo Bagnulo has been performed in the framework of
   the H2020-ICT-2014-2 project 5G NORMA.  His contribution reflects the
   consortium's view, but the consortium is not liable for any use that
   may be made of any of the information contained therein.

   Bob Briscoe's contribution was partly funded by the Research Council
   of Norway through the TimeIn project, partly by CableLabs and partly
   by the Comcast Innovation Fund.  The views expressed here are solely
   those of the authors.

9.  References

9.1.  Normative References

   [I-D.ietf-tcpm-accurate-ecn]
              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
              Accurate ECN Feedback in TCP",
              draft-ietf-tcpm-accurate-ecn-09 (work in progress),
              July 2019.

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
              Robustness to Blind In-Window Attacks", RFC 5961,
              DOI 10.17487/RFC5961, August 2010.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
              Notification (ECN) Experimentation", RFC 8311,
              DOI 10.17487/RFC8311, January 2018.

9.2.  Informative References

   [DOCSIS3.1]
              CableLabs, "MAC and Upper Layer Protocols Interface
              (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable
              Service Interface Specifications DOCSIS(R) 3.1 Version i17
              or later, January 2019.

   [ecn-overload]
              Steen, H., "Destruction Testing: Ultra-Low Delay using
              Dual Queue Coupled Active Queue Management", Masters
              Thesis, Uni Oslo, May 2017.

   [ecn-pam]  Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
              Fairhurst, G., and R. Scheffenegger, "Enabling
              Internet-Wide Deployment of Explicit Congestion
              Notification", Int'l Conf. on Passive and Active Network
              Measurement (PAM'15) pp.193-205, 2015.

   [ECN-PLUS]
              Kuzmanovic, A., "The Power of Explicit Congestion
              Notification", ACM SIGCOMM 35(4):61-72, 2005.

   [I-D.ietf-quic-transport]
              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
              and Secure Transport", draft-ietf-quic-transport-23 (work
              in progress), September 2019.

   [I-D.ietf-tsvwg-ecn-l4s-id]
              Schepper, K. and B. Briscoe, "Identifying Modified
              Explicit Congestion Notification (ECN) Semantics for
              Ultra-Low Queuing Delay (L4S)",
              draft-ietf-tsvwg-ecn-l4s-id-07 (work in progress),
              July 2019.

   [I-D.ietf-tsvwg-l4s-arch]
              Briscoe, B., Schepper, K., Bagnulo, M., and G. White, "Low
              Latency, Low Loss, Scalable Throughput (L4S) Internet
              Service: Architecture", draft-ietf-tsvwg-l4s-arch-04 (work
              in progress), July 2019.

   [I-D.stewart-tsvwg-sctpecn]
              Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
              Control Transmission Protocol (SCTP)",
              draft-stewart-tsvwg-sctpecn-05 (work in progress),
              January 2014.

   [judd-nsdi]
              Judd, G., "Attaining the promise and avoiding the pitfalls
              of TCP in the Datacenter", USENIX Symposium on Networked
              Systems Design and Implementation (NSDI'15) pp.145-157,
              May 2015.

   [Kuehlewind18]
              Kuehlewind, M., Walter, M., Learmonth, I., and B.
              Trammell, "Tracing Internet Path Transparency", In Proc:
              Network Traffic Measurement and Analysis Conference (TMA)
              2018, June 2018.

   [Mandalari18]
              Mandalari, A., Lutu, A., Briscoe, B., Bagnulo, M., and Oe.
              Alay, "Measuring ECN++: Good News for ++, Bad News for ECN
              over Mobile", IEEE Communications Magazine, March 2018.

   [Manzoor17]
              Manzoor, J., Drago, I., and R. Sadre, "How HTTP/2 is
              changing Web traffic and how to detect it", In Proc:
              Network Traffic Measurement and Analysis Conference (TMA)
              2017 pp.1-9, June 2017.

   [PragueLinux]
              Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,
              Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing
              the `TCP Prague' Requirements for Low Latency Low Loss
              Scalable Throughput (L4S)", Proc. Linux Netdev 0x13,
              March 2019.

   [relax-strict-ecn]
              Tilmans, O., "tcp: Accept ECT on SYN in the presence of
              RFC8311", Linux netdev patch list, April 2019.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC2140]  Touch, J., "TCP Control Block Interdependence", RFC 2140,
              DOI 10.17487/RFC2140, April 1997.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, DOI 10.17487/RFC3540, June 2003.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
              Ramakrishnan, "Adding Explicit Congestion Notification
              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
              DOI 10.17487/RFC5562, June 2009.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
              Acknowledgement Congestion Control to TCP", RFC 5690,
              DOI 10.17487/RFC5690, February 2010.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
              "Increasing TCP's Initial Window", RFC 6928,
              DOI 10.17487/RFC6928, April 2013.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
              TCP to Support Rate-Limited Traffic", RFC 7661,
              DOI 10.17487/RFC7661, October 2015.

   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
              and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
              Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
              October 2017.

   [strict-ecn]
              Dumazet, E., "tcp: be more strict before accepting ECN
              negociation", Linux netdev patch list, May 2012.

Authors' Addresses

   Marcelo Bagnulo
   Universidad Carlos III de Madrid
   Av. Universidad 30
   Leganes, Madrid  28911
   SPAIN

   Phone: 34 91 6249500
   Email: marcelo@it.uc3m.es
   URI:   http://www.it.uc3m.es


   Bob Briscoe
   Independent
   UK

   Email: ietf@bobbriscoe.net
   URI:   http://bobbriscoe.net/