idnits 2.17.1 

draft-ietf-tcpm-generalized-ecn-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC5562, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 3, 2019) is 1629 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tcpm-accurate-ecn-09

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-23

  == Outdated reference: A later version (-29) exists of
     draft-ietf-tsvwg-ecn-l4s-id-07

  == Outdated reference: A later version (-20) exists of
     draft-ietf-tsvwg-l4s-arch-04

  == Outdated reference: A later version (-06) exists of
     draft-stewart-tsvwg-sctpecn-05

  -- Obsolete informational reference (is this intentional?): RFC 2140
     (Obsoleted by RFC 9040)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         M. Bagnulo
3	Internet-Draft                                                      UC3M
4	Obsoletes: 5562 (if approved)                                 B. Briscoe
5	Intended status: Experimental                                Independent
6	Expires: May 6, 2020                                    November 3, 2019

8	  ECN++: Adding Explicit Congestion Notification (ECN) to TCP Control
9	                                Packets
10	                   draft-ietf-tcpm-generalized-ecn-05

12	Abstract

14	   This document describes an experimental modification to ECN when used
15	   with TCP.  It allows the use of ECN on the following TCP packets:
16	   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

18	Status of This Memo

20	   This Internet-Draft is submitted in full conformance with the
21	   provisions of BCP 78 and BCP 79.

23	   Internet-Drafts are working documents of the Internet Engineering
24	   Task Force (IETF).  Note that other groups may also distribute
25	   working documents as Internet-Drafts.  The list of current Internet-
26	   Drafts is at https://datatracker.ietf.org/drafts/current/.

28	   Internet-Drafts are draft documents valid for a maximum of six months
29	   and may be updated, replaced, or obsoleted by other documents at any
30	   time.  It is inappropriate to use Internet-Drafts as reference
31	   material or to cite them other than as "work in progress."

33	   This Internet-Draft will expire on May 6, 2020.

35	Copyright Notice

37	   Copyright (c) 2019 IETF Trust and the persons identified as the
38	   document authors.  All rights reserved.

40	   This document is subject to BCP 78 and the IETF Trust's Legal
41	   Provisions Relating to IETF Documents
42	   (https://trustee.ietf.org/license-info) in effect on the date of
43	   publication of this document.  Please review these documents
44	   carefully, as they describe your rights and restrictions with respect
45	   to this document.  Code Components extracted from this document must
46	   include Simplified BSD License text as described in Section 4.e of
47	   the Trust Legal Provisions and are provided without warranty as
48	   described in the Simplified BSD License.

50	   This document may contain material from IETF Documents or IETF
51	   Contributions published or made publicly available before November
52	   10, 2008.  The person(s) controlling the copyright in some of this
53	   material may not have granted the IETF Trust the right to allow
54	   modifications of such material outside the IETF Standards Process.
55	   Without obtaining an adequate license from the person(s) controlling
56	   the copyright in such materials, this document may not be modified
57	   outside the IETF Standards Process, and derivative works of it may
58	   not be created outside the IETF Standards Process, except to format
59	   it for publication as an RFC or to translate it into languages other
60	   than English.

62	Table of Contents

64	   1.  Introduction
65	     1.1.  Motivation
66	     1.2.  Experiment Goals
67	     1.3.  Document Structure
68	   2.  Terminology
69	   3.  Specification
70	     3.1.  Network (e.g. Firewall) Behaviour
71	     3.2.  Sender Behaviour
72	       3.2.1.  SYN (Send)
73	       3.2.2.  SYN-ACK (Send)
74	       3.2.3.  Pure ACK (Send)
75	       3.2.4.  Window Probe (Send)
76	       3.2.5.  FIN (Send)
77	       3.2.6.  RST (Send)
78	       3.2.7.  Retransmissions (Send)
79	       3.2.8.  General Fall-back for any Control Packet or
80	               Retransmission
81	     3.3.  Receiver Behaviour
82	       3.3.1.  Receiver Behaviour for Any TCP Control Packet or
83	               Retransmission
84	       3.3.2.  SYN (Receive)
85	       3.3.3.  Pure ACK (Receive)
86	       3.3.4.  FIN (Receive)
87	       3.3.5.  RST (Receive)
88	       3.3.6.  Retransmissions (Receive)
89	   4.  Rationale
90	     4.1.  The Reliability Argument
91	     4.2.  SYNs
92	       4.2.1.  Argument 1a: Unrecognized CE on the SYN
93	       4.2.2.  Argument 1b: ECT Considered Invalid on the SYN
94	       4.2.3.  Caching Strategies for ECT on SYNs
95	       4.2.4.  Argument 2: DoS Attacks
96	     4.3.  SYN-ACKs
97	       4.3.1.  Possibility of Unrecognized CE on the SYN-ACK
98	       4.3.2.  Response to Congestion on a SYN-ACK
99	       4.3.3.  Fall-Back if ECT SYN-ACK Fails
100	     4.4.  Pure ACKs
101	       4.4.1.  Mechanisms to Respond to CE-Marked Pure ACKs
102	       4.4.2.  Summary: Enabling ECN on Pure ACKs
103	     4.5.  Window Probes
104	     4.6.  FINs
105	     4.7.  RSTs
106	     4.8.  Retransmitted Packets.
107	     4.9.  General Fall-back for any Control Packet
108	   5.  Interaction with popular variants or derivatives of TCP
109	     5.1.  IW10
110	     5.2.  TFO
111	     5.3.  L4S
112	     5.4.  Other transport protocols
113	   6.  Security Considerations
114	   7.  IANA Considerations
115	   8.  Acknowledgments
116	   9.  References
117	     9.1.  Normative References
118	     9.2.  Informative References
119	   Authors' Addresses

121	1.  Introduction

123	   RFC 3168 [RFC3168] specifies support of Explicit Congestion
124	   Notification (ECN) in IP (v4 and v6).  By using the ECN capability,
125	   network elements (e.g. routers, switches) performing Active Queue
126	   Management (AQM) can use ECN marks instead of packet drops to signal
127	   congestion to the endpoints of a communication.  This results in
128	   lower packet loss and increased performance.  RFC 3168 also specifies
129	   support for ECN in TCP, but solely on data packets.  For various
130	   reasons it precludes the use of ECN on TCP control packets (TCP SYN,
131	   TCP SYN-ACK, pure ACKs, Window probes) and on retransmitted packets.
132	   RFC 3168 is silent about the use of ECN on RST and FIN packets.  RFC
133	   5562 [RFC5562] is an experimental modification to ECN that enables
134	   ECN support for TCP SYN-ACK packets.

136	   This document defines an experimental modification to ECN [RFC3168]
137	   that shall be called ECN++.  It enables ECN support on all the
138	   aforementioned types of TCP packet.  The mechanisms proposed in this
139	   document have been defined conservatively and with safety in mind,
140	   possibly in some cases at the expense of performance.

142	   ECN++ uses a sender-only deployment model.  It works whether the two
143	   ends of the TCP connection use classic ECN feedback [RFC3168] or
144	   experimental Accurate ECN feedback (AccECN
146	   [I-D.ietf-tcpm-accurate-ecn]), the two ECN feedback mechanisms for
147	   TCP being standardized at the time of writing.

149	   Using ECN on initial SYN packets provides significant benefits, as we
150	   describe in the next subsection.  However, only AccECN provides a way
151	   to feed back whether the SYN was CE marked, and RFC 3168 does not.
152	   Therefore, implementers of ECN++ are RECOMMENDED to also implement
153	   AccECN.  Conversely, if AccECN (or an equivalent safety mechanism) is
154	   not implemented with ECN++, this specification rules out ECN on the
155	   SYN.

157	   ECN++ is designed for compatibility with a number of latency
158	   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
159	   window of 10 SMSS (IW10 [RFC6928]) and Low Latency, Low Loss,
160	   Scalable Throughput (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all
161	   be implemented and deployed independently.  [RFC8311] is a standards
162	   track procedural device that relaxes requirements in RFC 3168 and
163	   other standards track RFCs that would otherwise preclude the
164	   experimental modifications needed for ECN++ and other ECN
165	   experiments.

167	1.1.  Motivation

169	   The absence of ECN support on TCP control packets and retransmissions
170	   has a potentially harmful effect.  In any ECN deployment, non-ECN-
171	   capable packets suffer a penalty when they traverse a congested
172	   bottleneck.  For instance, with a drop probability of 1%, 1% of
173	   connection attempts suffer a timeout of about 1 second before the SYN
174	   is retransmitted, which is highly detrimental to the performance of
175	   short flows.  TCP control packets, particularly TCP SYNs and SYN-
176	   ACKs, are important for performance, so dropping them is best
177	   avoided.

179	   Not using ECN on control packets can be particularly detrimental to
180	   performance in environments where the ECN marking level is high.  For
181	   example, [judd-nsdi] shows that in a controlled private data centre
182	   (DC) environment where ECN is used (in conjunction with DCTCP
183	   [RFC8257]), the probability of being able to establish a new
184	   connection using a non-ECN SYN packet drops to close to zero even
185	   when there are only 16 ongoing TCP flows transmitting at full speed.
186	   The issue is that DCTCP exhibits a much more aggressive response to
187	   packet marking (which is why it is only applicable in controlled
188	   environments).  This leads to a high marking probability for ECN-
189	   capable packets, and in turn a high drop probability for non-ECN
190	   packets.  Therefore non-ECN SYNs are dropped aggressively, rendering
191	   it nearly impossible to establish a new connection in the presence of
192	   even mild traffic load.

194	   Finally, there are ongoing experimental efforts to promote the
195	   adoption of a slightly modified variant of DCTCP (and similar
196	   congestion controls) over the Internet to achieve low latency, low
197	   loss and scalable throughput (L4S) for all communications
198	   [I-D.ietf-tsvwg-l4s-arch].  In such an approach, L4S packets identify
199	   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id].  With
200	   L4S, preventing TCP control packets from obtaining the benefits of
201	   ECN would not only expose them to the prevailing level of congestion
202	   loss, but it would also classify them into a different queue.  Then
203	   only L4S data packets would be classified into the L4S queue that is
204	   expected to have lower latency, while the packets controlling and
205	   retransmitting these data packets would still get stuck behind the
206	   queue induced by non-L4S-enabled TCP traffic.

208	1.2.  Experiment Goals

210	   The goal of the experimental modifications defined in this document
211	   is to allow the use of ECN on all TCP packets.  Experiments are
212	   expected in the public Internet as well as in controlled environments
213	   to understand the following issues:

215	   o  How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
216	      that carry the ECT(0), ECT(1) or CE codepoints are processed by
217	      the TCP endpoints and the network (including routers, firewalls
218	      and other middleboxes).  In particular we would like to learn if
219	      these packets are frequently blocked or if these packets are
220	      usually forwarded and processed.

222	   o  The scale of deployment of the different flavours of ECN,
223	      including [RFC3168], [RFC5562], [RFC3540] and
224	      [I-D.ietf-tcpm-accurate-ecn].

226	   o  How much the performance of TCP communications is improved by
227	      allowing ECN marking of each packet type.

229	   o  To identify any issues (including security issues) raised by
230	      enabling ECN marking of these packets.

232	   o  To conduct the specific experiments identified in the text by the
233	      strings "EXPERIMENTATION NEEDED" or "MEASUREMENTS NEEDED".

235	   The data gathered through the experiments described in this document,
236	   particularly under the first 2 bullets above, will help in the
237	   redesign of the final mechanism (if needed) for adding ECN support to
238	   the different packet types considered in this document.

240	   Success criteria: The experiment will be a success if we obtain
241	   enough data to have a clearer view of the deployability and benefits
242	   of enabling ECN on all TCP packets, as well as any issues.  If the
243	   results of the experiment show that it is feasible to deploy such
244	   changes; that there are gains to be achieved through the changes
245	   described in this specification; and that no other major issues may
246	   interfere with the deployment of the proposed changes; then it would
247	   be reasonable to adopt the proposed changes in a standards track
248	   specification that would update RFC 3168.

250	1.3.  Document Structure

252	   The remainder of this document is structured as follows.  In
253	   Section 2, we present the terminology used in the rest of the
254	   document.  In Section 3, we specify the modifications to provide ECN
255	   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
256	   retransmissions.  We describe both the network behaviour and the
257	   endpoint behaviour.  Section 5 discusses variations of the
258	   specification that will be necessary to interwork with a number of
259	   popular variants or derivatives of TCP.  RFC 3168 provides a number
260	   of specific reasons why ECN support is not appropriate for each
261	   packet type.  In Section 4, we revisit each of these arguments for
262	   each packet type to justify why it is reasonable to conduct this
263	   experiment.

265	2.  Terminology

267	   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
268	   SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this
269	   document, are to be interpreted as described in BCP 14 [RFC2119] when
270	   and only when they appear in all capitals [RFC8174].

272	   Pure ACK: A TCP segment with the ACK flag set and no data payload.

274	   SYN: A TCP segment with the SYN (synchronize) flag set.

276	   Window probe: Defined in [RFC0793], a window probe is a TCP segment
277	   with only one byte of data sent to learn if the receive window is
278	   still zero.

280	   FIN: A TCP segment with the FIN (finish) flag set.

282	   RST: A TCP segment with the RST (reset) flag set.

284	   Retransmission: A TCP segment that has been retransmitted by the TCP
285	   sender.

287	   TCP client: The initiating end of a TCP connection.  Also called the
288	   initiator.

290	   TCP server: The responding end of a TCP connection.  Also called the
291	   responder.

293	   ECT: ECN-Capable Transport.  One of the two codepoints ECT(0) or
294	   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6).  An
295	   ECN-capable sender sets one of these to indicate that both transport
296	   end-points support ECN.  When this specification says the sender sets
297	   an ECT codepoint, by default it means ECT(0).  Optionally, it could
298	   mean ECT(1), which is in the process of being redefined for use by
299	   L4S experiments [RFC8311] [I-D.ietf-tsvwg-ecn-l4s-id].

301	   Not-ECT: The ECN codepoint set by senders that indicates that the
302	   transport is not ECN-capable.

304	   CE: Congestion Experienced.  The ECN codepoint that an intermediate
305	   node sets to indicate congestion [RFC3168].  A node sets an
306	   increasing proportion of ECT packets to CE as the level of congestion
307	   increases.
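
   The codepoints defined above can be illustrated with a short sketch.
   It is not part of the specification.  The two-bit values are those
   assigned by RFC 3168; the names ECN, ecn_from_tos, set_ecn and is_ect
   are hypothetical and chosen only for this example.

```python
from enum import IntEnum


class ECN(IntEnum):
    """Two-bit ECN field values as assigned in RFC 3168."""
    NOT_ECT = 0b00  # transport is not ECN-capable
    ECT_1 = 0b01    # ECT(1)
    ECT_0 = 0b10    # ECT(0), the default ECT codepoint in this document
    CE = 0b11       # Congestion Experienced, set by a congested node


def ecn_from_tos(tos: int) -> ECN:
    """Return the codepoint held in the two low-order bits of the
    IPv4 TOS octet or IPv6 Traffic Class octet."""
    return ECN(tos & 0b11)


def set_ecn(tos: int, codepoint: ECN) -> int:
    """Return the octet with its ECN bits replaced; the six DSCP bits
    are left untouched."""
    return (tos & 0xFC) | int(codepoint)


def is_ect(codepoint: ECN) -> bool:
    """True for either of the two ECN-Capable Transport codepoints."""
    return codepoint in (ECN.ECT_0, ECN.ECT_1)
```

   For instance, set_ecn(0xB8, ECN.ECT_0) leaves the DSCP bits of 0xB8
   in place and marks the packet as ECT(0).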

309	3.  Specification

311	   The experimental ECN++ changes to the specification of TCP over ECN
312	   [RFC3168] defined here primarily alter the behaviour of the sending
313	   host for each half-connection.  However, there are subsections for
314	   forwarding elements and receivers below, which recommend that they
315	   accept the new packets - they should do already, but might not.  This
316	   will allow implementers to check the receive side code while they are
317	   altering the send-side code.  All changes can be deployed at each
318	   end-point independently of the others and independently of any
319	   network behaviour.

321	   The feedback behaviour at the receiver depends on whether classic ECN
322	   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
323	   [I-D.ietf-tcpm-accurate-ecn] has been negotiated.  Nonetheless,
324	   neither receiver feedback behaviour is altered by the present
325	   specification.

327	3.1.  Network (e.g. Firewall) Behaviour

329	   Previously the specification of ECN for TCP [RFC3168] required the
330	   sender to set not-ECT on TCP control packets and retransmissions.
331	   Some readers of RFC 3168 might have erroneously interpreted this as a
332	   requirement for firewalls, intrusion detection systems, etc. to check
333	   and enforce this behaviour.  Section 4.3 of [RFC8311] updates RFC
334	   3168 to remove this ambiguity.  It requires firewalls or any
335	   intermediate nodes not to treat certain types of ECN-capable TCP
336	   segment differently (except potentially in one attack scenario).
337	   This is likely to only involve a firewall rule change in a fraction
338	   of cases (at most 0.4% of paths according to the tests reported in
339	   Section 4.2.2).

341	   In case a TCP sender encounters a middlebox blocking ECT on certain
342	   TCP segments, the specification below includes behaviour to fall back
343	   to non-ECN.  However, this loses the benefit of ECN on control
344	   packets.  So operators are RECOMMENDED to alter their firewall rules
345	   to comply with the requirement referred to above (section 4.3 of
346	   [RFC8311]).

348	3.2.  Sender Behaviour

350	   For each type of control packet or retransmission, the following
351	   sections detail changes to the sender's behaviour in two respects: i)
352	   whether it sets ECT; and ii) its response to congestion feedback.
353	   Table 1 summarises these two behaviours for each type of packet, but
354	   the relevant subsection below should be referred to for the detailed
355	   behaviour.  The subsection on the SYN is more complex than the
356	   others, because it has to include fall-back behaviour if the ECT
357	   packet appears not to have got through, and caching of the outcome to
358	   detect persistent failures.

360	   +---------+----------------+-----------------+----------------------+
361	   | TCP     | ECN field if   | ECN field if    | Congestion Response  |
362	   | packet  | AccECN f/b     | RFC3168 f/b     |                      |
363	   | type    | negotiated*    | negotiated*     |                      |
364	   +---------+----------------+-----------------+----------------------+
365	   | SYN     | ECT            | not-ECT         | If AccECN, reduce IW |
366	   |         |                |                 |                      |
367	   | SYN-ACK | ECT            | ECT             | Reduce IW            |
368	   |         |                |                 |                      |
369	   | Pure    | ECT            | not-ECT         | If AccECN, usual     |
370	   | ACK     |                |                 | cwnd response and    |
371	   |         |                |                 | optionally [RFC5690] |
372	   |         |                |                 |                      |
373	   | W Probe | ECT            | ECT             | Usual cwnd response  |
374	   |         |                |                 |                      |
375	   | FIN     | ECT            | ECT             | None or optionally   |
376	   |         |                |                 | [RFC5690]            |
377	   |         |                |                 |                      |
378	   | RST     | ECT            | ECT             | N/A                  |
379	   |         |                |                 |                      |
380	   | Re-XMT  | ECT            | ECT             | Usual cwnd response  |
381	   +---------+----------------+-----------------+----------------------+

383	   Window probe and retransmission are abbreviated to W Probe and Re-XMT.
384	               * For a SYN, "negotiated" means "requested".

386	     Table 1: Summary of sender behaviour.  In each case the relevant
387	      section below should be referred to for the detailed behaviour
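
   As an illustration only, the two "ECN field" columns of Table 1 can
   be written as a lookup.  The names SENDER_ECN_FIELD and ecn_field_for
   are hypothetical.  The sketch covers only which codepoint class the
   sender sets, not the congestion response or the fall-back behaviour
   detailed in the subsections below.

```python
# Table 1 as data: packet type -> (ECN field if AccECN feedback negotiated,
#                                  ECN field if RFC 3168 feedback negotiated).
# For a SYN, "negotiated" means "requested".
SENDER_ECN_FIELD = {
    "SYN":      ("ECT", "not-ECT"),
    "SYN-ACK":  ("ECT", "ECT"),
    "Pure ACK": ("ECT", "not-ECT"),
    "W Probe":  ("ECT", "ECT"),
    "FIN":      ("ECT", "ECT"),
    "RST":      ("ECT", "ECT"),
    "Re-XMT":   ("ECT", "ECT"),
}


def ecn_field_for(packet_type: str, accecn: bool) -> str:
    """Return "ECT" or "not-ECT" for the given packet type and
    feedback mode, exactly as listed in Table 1."""
    with_accecn, with_rfc3168 = SENDER_ECN_FIELD[packet_type]
    return with_accecn if accecn else with_rfc3168
```

   Only the SYN and the pure ACK depend on the feedback mode; every
   other row yields ECT in both columns.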

389	   It can be seen that we recommend against the sender setting ECT on
390	   the SYN if it is not requesting AccECN feedback.  Therefore it is
391	   RECOMMENDED that the experimental AccECN specification
392	   [I-D.ietf-tcpm-accurate-ecn] is implemented, along with the ECN++
393	   experiment, because it is expected that ECT on the SYN will give the
394	   most significant performance gain, particularly for short flows.

396	   Nonetheless, this specification also caters for the case where an
397	   ECN++ TCP sender is not using AccECN.  This could be because it does
398	   not support AccECN or because the other end of the TCP connection
399	   does not (AccECN can only be used for a connection if both ends
400	   support it).

402	3.2.1.  SYN (Send)

403	3.2.1.1.  Setting ECT on the SYN

405	   With classic [RFC3168] ECN feedback, the SYN was not expected to be
406	   ECN-capable, so the flag provided to feed back congestion was put to
407	   another use (it is used in combination with other flags to indicate
408	   that the responder supports ECN).  In contrast, Accurate ECN (AccECN)
409	   feedback [I-D.ietf-tcpm-accurate-ecn] provides a codepoint in the
410	   SYN-ACK for the responder to feed back whether the SYN arrived marked
411	   CE.  Therefore the setting of the IP/ECN field on the SYN is
412	   specified separately for each case in the following two subsections.

414	3.2.1.1.1.  ECN++ TCP Client also Supports AccECN

416	   For the ECN++ experiment, if the SYN is requesting AccECN feedback,
417	   the TCP sender will also set ECT on the SYN.  It can ignore the
418	   prohibition in section 6.1.1 of RFC 3168 against setting ECT on such
419	   a SYN, as per Section 4.3 of [RFC8311].

421	3.2.1.1.2.  ECN++ TCP Client does not Support AccECN

423	   If the SYN sent by a TCP initiator does not attempt to negotiate
424	   Accurate ECN feedback, or does not use an equivalent safety
425	   mechanism, it MUST still comply with RFC 3168, which says that a TCP
426	   initiator "MUST NOT set ECT on a SYN".

428	   The only envisaged examples of "equivalent safety mechanisms" are: a)
429	   some future TCP ECN feedback protocol, perhaps evolved from AccECN,
430	   that feeds back CE marking on a SYN; b) setting the initial window to
431	   1 SMSS.  IW=1 is NOT RECOMMENDED because it could degrade
432	   performance, but might be appropriate for certain lightweight TCP
433	   implementations.

435	   See Section 4.2 for discussion and rationale.

437	   If the TCP initiator does not set ECT on the SYN, the rest of
438	   Section 3.2.1 does not apply.

440	3.2.1.2.  Caching where to use ECT on SYNs

442	   This subsection only applies if the ECN++ TCP client sets ECT on the
443	   SYN and supports AccECN.

445	   Until AccECN servers become widely deployed, a TCP initiator that
446	   sets ECT on a SYN (which typically implies the same SYN also requests
447	   AccECN, as above) SHOULD also maintain a cache entry per server to
448	   record servers that it is not worth sending an ECT SYN to, e.g.
449	   because they do not support AccECN and therefore have no logic for
450	   congestion markings on the SYN.  Mobile hosts MAY maintain a cache
451	   entry per access network to record 'non-ECT SYN' entries against
452	   proxies (see Section 4.2.3).  This cache can be implemented as part
453	   of the shared state across multiple TCP connections, following
454	   [RFC2140].

456	   Subsequently the initiator will not set ECT on a SYN to such a server
457	   or proxy, but it can still always request AccECN support (because the
458	   response will state any earlier stage of ECN evolution that the
459	   server supports with no performance penalty).  If a server
460	   subsequently upgrades to support AccECN, the initiator will discover
461	   this as soon as it next connects, then it can remove the server from
462	   its cache and subsequently always set ECT for that server.

464	   The client can limit the size of its cache of 'non-ECT SYN' servers.
465	   Then, while AccECN is not widely deployed, it will only cache the
466	   'non-ECT SYN' servers that are most used and most recently used by
467	   the client.  As the client accesses servers that have been expelled
468	   from its cache, it will simply use ECT on the SYN by default.

470	   Servers that do not support ECN as a whole do not need to be recorded
471	   separately from non-support of AccECN because the response to a
472	   request for AccECN immediately states which stage in the evolution of
473	   ECN the server supports (AccECN [I-D.ietf-tcpm-accurate-ecn], classic
474	   ECN [RFC3168] or no ECN).

476	   The above strategy is named "optimistic ECT and cache failures".  It
477	   is believed to be sufficient based on three measurement studies and
478	   assumptions detailed in Section 4.2.3.  However, Section 4.2.3 gives
479	   two other strategies and the choice between them depends on the
480	   implementer's goals and the deployment prevalence of ECN variants in
481	   the network and on servers, not to mention the prevalence of some
482	   significant bugs.

484	   If the initiator times out without seeing a SYN-ACK, it will
485	   separately cache this fact (see fall-back in Section 3.2.1.4 for
486	   details).
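
   A minimal sketch of the "optimistic ECT and cache failures" strategy
   follows, under stated assumptions.  The class name NonEctSynCache,
   its method names and the default limit of 256 entries are all
   hypothetical.  Eviction here is purely least-recently-used, which
   only approximates "most used and most recently used".  The separate
   cache for SYN timeouts and the per-access-network entries for mobile
   hosts are not modelled.

```python
from collections import OrderedDict


class NonEctSynCache:
    """Bounded cache of servers not worth sending an ECT SYN to."""

    def __init__(self, max_entries: int = 256) -> None:
        self._max_entries = max_entries  # illustrative limit, not from the draft
        self._servers = OrderedDict()    # keys kept in least-recent-first order

    def use_ect_on_syn(self, server: str) -> bool:
        """Optimistic default: set ECT unless the server is cached."""
        if server in self._servers:
            self._servers.move_to_end(server)  # mark as recently used
            return False
        return True

    def record_syn_ack(self, server: str, supports_accecn: bool) -> None:
        """Update the cache from the feedback mode shown in a SYN-ACK."""
        if supports_accecn:
            # Server supports (or has upgraded to) AccECN: forget it.
            self._servers.pop(server, None)
            return
        self._servers[server] = None
        self._servers.move_to_end(server)
        while len(self._servers) > self._max_entries:
            self._servers.popitem(last=False)  # expel least recently used
```

   A server expelled from the cache is treated like an unknown server,
   so the next SYN to it carries ECT by default, as described above.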

488	3.2.1.3.  SYN Congestion Response

490	   As explained above, this subsection only applies if the ECN++ TCP
491	   client sets ECT on the initial SYN.

493	   If the SYN-ACK returned to the TCP initiator confirms that the server
494	   supports AccECN, it will also be able to indicate whether or not the
495	   SYN was CE-marked.  If the SYN was CE-marked, and if the initial
496	   window is greater than 1 SMSS, then the initiator MUST reduce its
497	   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
498	   segment size).  The rationale is the same as that for the response to
499	   CE on a SYN-ACK (Section 4.3.2).

501	   If the initiator has set ECT on the SYN and if the SYN-ACK shows that
502	   the server does not support feedback of a CE on the SYN (e.g. it does
503	   not support AccECN) and if the initial congestion window of the
504	   initiator is greater than 1 SMSS, then the TCP initiator MUST
505	   conservatively reduce its Initial Window and SHOULD reduce it to 1
506	   SMSS.  A reduction to greater than 1 SMSS MAY be appropriate (see
507	   Section 4.2.1).  Conservatism is necessary because the SYN-ACK cannot
508	   show whether the SYN was CE-marked.

510	   If the TCP initiator (host A) receives a SYN from the remote end
511	   (host B) after it has sent a SYN to B, it indicates the (unusual)
512	   case of a simultaneous open.  Host A will respond with a SYN-ACK.
513	   Host A will probably then receive a SYN-ACK in response to its own
514	   SYN, after which it can follow the appropriate one of the two
515	   paragraphs above.

517	   In all the above cases, the initiator does not have to back off its
518	   retransmission timer as it would in response to a timeout following
519	   no response to its SYN [RFC6298], because both the SYN and the SYN-
520	   ACK have been successfully delivered through the network.  Also, the
521	   initiator does not need to exit slow start or reduce ssthresh, which
522	   is not even required when a SYN is lost [RFC5681].

524	   If an initial window of more than 3 segments is implemented (e.g.
525	   IW10 [RFC6928]), Section 5 gives additional recommendations.
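
   The initial-window rules in this subsection can be sketched as a
   small decision function.  This is only an illustration under stated
   assumptions, not part of the specification: the names SynFeedback,
   initial_window_after_syn_ack and conservative_iw_segments are
   hypothetical, and windows are counted in whole segments (SMSS).

```python
from enum import Enum


class SynFeedback(Enum):
    """What the SYN-ACK tells the initiator (hypothetical names)."""
    ACCECN_NO_CE = "server supports AccECN, SYN was not CE-marked"
    ACCECN_CE = "server supports AccECN, SYN was CE-marked"
    NO_SYN_FEEDBACK = "server cannot feed back CE on the SYN"


def initial_window_after_syn_ack(configured_iw_segments: int,
                                 syn_was_ect: bool,
                                 feedback: SynFeedback,
                                 conservative_iw_segments: int = 1) -> int:
    """Return the initial window (in segments) to use after the SYN-ACK.

    The retransmission timer and ssthresh are deliberately left alone:
    the text above says neither needs to change in these cases.
    """
    # The subsection only applies when ECT was set on the initial SYN,
    # and there is nothing to reduce if the window is already 1 segment.
    if not syn_was_ect or configured_iw_segments <= 1:
        return configured_iw_segments
    if feedback is SynFeedback.ACCECN_CE:
        # MUST reduce; SHOULD reduce to 1 SMSS.
        return 1
    if feedback is SynFeedback.NO_SYN_FEEDBACK:
        # MUST reduce conservatively; 1 SMSS by default, but a larger
        # value MAY be chosen as long as it is still a reduction.
        return max(1, min(conservative_iw_segments,
                          configured_iw_segments - 1))
    return configured_iw_segments
```

   For example, with a configured IW of 10 segments, a SYN-ACK from a
   server without AccECN gives an initial window of 1 segment by
   default, or 3 segments if the implementer chooses
   conservative_iw_segments=3.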

527	3.2.1.4.  Fall-Back Following No Response to an ECT SYN

529	   As explained above, this subsection only applies if the ECN++ TCP
530	   client also sets ECT on the initial SYN.

532	   An ECT SYN might be lost due to an over-zealous path element (or
533	   server) blocking ECT packets that do not conform to RFC 3168.  Some
534	   evidence of this was found in a 2014 study [ecn-pam], but in a more
535	   recent study using 2017 data [Mandalari18] extensive measurements
536	   found no case where ECT on TCP control packets was treated any
537	   differently from ECT on TCP data packets.  Loss is commonplace for
538	   numerous other reasons, e.g. congestion loss at a non-ECN queue on
539	   the forward or reverse path, transmission errors, etc.
540	   Alternatively, the cause of the loss might be the associated attempt
541	   to negotiate AccECN, or possibly other unrelated options on the SYN.

543	   Therefore, if the timer expires after the TCP initiator has sent the
544	   first ECT SYN, it SHOULD make one more attempt to retransmit the SYN
545	   with ECT set (backing off the timer as usual).  If the retransmission
546	   timer expires again, it SHOULD retransmit the SYN with the not-ECT
547	   codepoint in the IP header, to expedite connection set-up.  If other
548	   experimental fields or options were on the SYN, it will also be
549	   necessary to follow their specifications for fall-back.  It would
550	   make sense to coordinate all the strategies for fall-back in order to
551	   isolate the specific cause of the problem.
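
   The retransmission sequence described above can be illustrated with
   a short sketch.  It assumes an initial RTO of 1 second that doubles
   on each expiry, as in [RFC6298], and it assumes that all
   retransmissions after the second timeout stay not-ECT (the text
   above only specifies the first not-ECT retransmission).  The
   function name syn_attempt_schedule is hypothetical.

```python
def syn_attempt_schedule(initial_rto: float = 1.0, attempts: int = 4):
    """Return a list of (send_time_seconds, ect_set) for each SYN sent.

    Index 0 is the original SYN; later entries are retransmissions,
    each sent when the (backed-off) timer of the previous one expires.
    """
    schedule = []
    send_time = 0.0
    rto = initial_rto
    for attempt in range(attempts):
        # The original SYN and one further attempt carry ECT;
        # after a second timeout the SYN falls back to not-ECT.
        ect_set = attempt < 2
        schedule.append((send_time, ect_set))
        send_time += rto
        rto *= 2  # back off the timer as usual
    return schedule
```

   With the defaults, SYNs are sent at 0 s and 1 s with ECT, then at
   3 s and 7 s with not-ECT, so a path that blocks ECT SYNs costs about
   3 seconds before the fall-back SYN is sent.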

553	   If the TCP initiator is caching failed connection attempts, it SHOULD
554	   NOT give up using ECT on the first SYN of subsequent connection
555	   attempts until it is clear that a blockage persistently and
556	   specifically affects ECT on SYNs.  This is because loss is so
557	   commonplace for other reasons.  Even if it does eventually decide to
558	   give up setting ECT on the SYN, it will probably not need to give up
559	   on AccECN on the SYN.  In any case, if a cache is used, it SHOULD be
560	   arranged to expire so that the initiator will infrequently attempt to
561	   check whether the problem has been resolved.

563	   Other fall-back strategies MAY be adopted where applicable (see
564	   Section 4.2.2 for suggestions, and the conditions under which they
565	   would apply).
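
   One possible shape for a cache of failed ECT SYN attempts is
   sketched below.  Everything specific in it is an assumption made
   for illustration: the class name EctSynFailureCache, the threshold
   of 3 suspect events, the one-hour expiry, and the heuristic that a
   failure counts as specific to ECT only when the ECT SYNs timed out
   but the not-ECT fall-back SYN was answered.

```python
import time


class EctSynFailureCache:
    """Per-destination record of suspected blocking of ECT SYNs."""

    def __init__(self, threshold=3, ttl_seconds=3600.0,
                 clock=time.monotonic):
        self.threshold = threshold   # suspect events before giving up ECT
        self.ttl = ttl_seconds       # entries expire so ECT is retried
        self.clock = clock
        self._entries = {}           # dest -> (suspect_count, last_update)

    def _live_count(self, dest):
        entry = self._entries.get(dest)
        if entry is None:
            return 0
        count, stamp = entry
        if self.clock() - stamp > self.ttl:
            del self._entries[dest]  # expired: forget and try ECT again
            return 0
        return count

    def record(self, dest, ect_syns_timed_out, not_ect_syn_succeeded):
        if ect_syns_timed_out and not_ect_syn_succeeded:
            # Loss that disappears without ECT: evidence specific to ECT.
            self._entries[dest] = (self._live_count(dest) + 1, self.clock())
        elif not ect_syns_timed_out:
            # An ECT SYN got through, so clear any suspicion.
            self._entries.pop(dest, None)
        # If the not-ECT SYN also failed, the loss is not specific to
        # ECT, so nothing is recorded.

    def use_ect_on_syn(self, dest):
        return self._live_count(dest) < self.threshold
```

   Because loss is commonplace for other reasons, a single timeout
   never disables ECT in this sketch; only repeated ECT-specific
   failures within the expiry period do.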

567	3.2.2.  SYN-ACK (Send)

569	3.2.2.1.  Setting ECT on the SYN-ACK

571	   For the ECN++ experiment, the TCP implementation will set ECT on SYN-
572	   ACKs.  It can ignore the requirement in section 6.1.1 of RFC 3168 to
573	   set not-ECT on a SYN-ACK, as per Section 4.3 of [RFC8311].

575	3.2.2.2.  SYN-ACK Congestion Response

577	   A host that sets ECT on SYN-ACKs MUST reduce its initial window in
578	   response to any congestion feedback, whether using classic ECN or
579	   AccECN (see Section 4.3.1).  It SHOULD reduce it to 1 SMSS.  This is
580	   different to the behaviour specified in an earlier experiment that
581	   set ECT on the SYN-ACK [RFC5562].  This is justified in
582	   Section 4.3.2.

584	   The responder does not have to back off its retransmission timer
585	   because the ECN feedback proves that the network is delivering
586	   packets successfully and is not severely overloaded.  Also the
587	   responder does not have to leave slow start or reduce ssthresh, which
588	   is not even required when a SYN-ACK has been lost.

590	   The congestion response to CE-marking on a SYN-ACK for a server that
591	   implements either the TCP Fast Open experiment (TFO [RFC7413]) or
592	   experimentation with an initial window of more than 3 segments (e.g.
593	   IW10 [RFC6928]) is discussed in Section 5.

595	3.2.2.3.  Fall-Back Following No Response to an ECT SYN-ACK

597	   After the responder sends a SYN-ACK with ECT set, if its
598	   retransmission timer expires it SHOULD retransmit one more SYN-ACK
599	   with ECT set (and back off its timer as usual).  If the timer expires
600	   again, it SHOULD retransmit the SYN-ACK with not-ECT in the IP
601	   header.  If other experimental fields or options were on the initial
602	   SYN-ACK, it will also be necessary to follow their specifications for
603	   fall-back.  It would make sense to coordinate all the strategies for
604	   fall-back in order to isolate the specific cause of the problem.

606	   This fall-back strategy attempts to use ECT one more time than the
607	   strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being
608	   superseded by the present specification).  Other fall-back strategies
609	   MAY be adopted if found to be more effective, e.g. fall-back to not-
610	   ECT on the first retransmission attempt.

612	   The server MAY cache failed connection attempts, e.g. per client
613	   access network.  A client-based alternative to caching at the server
614	   is given in Section 4.3.3.  If the TCP server is caching failed
615	   connection attempts, it SHOULD NOT give up using ECT on the first
616	   SYN-ACK of subsequent connection attempts until it is clear that the
617	   blockage persistently and specifically affects ECT on SYN-ACKs.  This
618	   is because loss is so commonplace for other reasons (see
619	   Section 3.2.1.4).  If a cache is used, it SHOULD be arranged to
620	   expire so that the server will infrequently attempt to check whether
621	   the problem has been resolved.

623	3.2.3.  Pure ACK (Send)

625	   A Pure ACK is an ACK packet that does not carry data, which includes
626	   the Pure ACK at the end of TCP's 3-way handshake.

628	   For the ECN++ experiment, whether a TCP implementation sets ECT on a
629	   Pure ACK depends on whether or not Accurate ECN TCP feedback
630	   [I-D.ietf-tcpm-accurate-ecn] has been successfully negotiated for a
631	   particular TCP connection, as specified in the following two
632	   subsections.

634	3.2.3.1.  Pure ACK without AccECN Feedback

636	   If AccECN has not been successfully negotiated for a connection, ECT
637	   MUST NOT be set on Pure ACKs by either end.

639	3.2.3.2.  Pure ACK with AccECN Feedback

641	   For the ECN++ experiment, if AccECN has been successfully negotiated,
642	   either end of the connection will set ECT on Pure ACKs.  They can
643	   ignore the requirement in section 6.1.4 of RFC 3168 to set not-ECT on
644	   a pure ACK, as per Section 4.3 of [RFC8311].

646	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
647	      deployed base of network elements and RFC 3168 servers react to
648	      pure ACKs marked with the ECT(0)/ECT(1)/CE codepoints, i.e.
649	      whether they are dropped, codepoint cleared or processed and the
650	      congestion indication fed back on a subsequent packet.

652	   See Section 3.3.3 for the implications if a host receives a CE-marked
653	   Pure ACK.

655	3.2.3.2.1.  Pure ACK Congestion Response

657	   As explained above, this subsection only applies if AccECN has been
658	   successfully negotiated for the TCP connection.

660	   A host that sets ECT on pure ACKs SHOULD respond to the congestion
661	   signal resulting from pure ACKs being marked with the CE codepoint.
662	   The specific response will need to be defined as an update to each
663	   congestion control specification.  Possible responses to congestion
664	   feedback include reducing the congestion window (CWND) and/or
665	   regulating the pure ACK rate (see Section 4.4.1.1).

667	   Note that, in comparison, TCP Congestion Control [RFC5681] does not
668	   require a TCP to detect or respond to loss of pure ACKs at all; it
669	   requires no reduction in congestion window or ACK rate.

671	3.2.4.  Window Probe (Send)

673	   For the ECN++ experiment, the TCP sender will set ECT on window
674	   probes.  It can ignore the prohibition in section 6.1.6 of RFC 3168
675	   against setting ECT on a window probe, as per Section 4.3 of
676	   [RFC8311].

678	   A window probe contains a single octet, so it is no different from a
679	   regular TCP data segment.  Therefore a TCP receiver will feed back
680	   any CE marking on a window probe as normal (either using classic ECN
681	   feedback or AccECN feedback).  The sender of the probe will then
682	   reduce its congestion window as normal.

684	   A receive window of zero indicates that the application is not
685	   consuming data fast enough and does not imply anything about network
686	   congestion.  Once the receive window opens, the congestion window
687	   might become the limiting factor, so it is correct that CE-marked
688	   probes reduce the congestion window.  This complements cwnd
689	   validation [RFC7661], which reduces cwnd as more time elapses without
690	   having used available capacity.  However, CE-marking on window probes
691	   does not reduce the rate of the probes themselves.  This is unlikely
692	   to present a problem, given that the duration between window probes
693	   doubles [RFC1122] as long as the receiver is advertising a zero
694	   window (currently minimum 1 second, maximum at least 1 minute
695	   [RFC6298]).
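
   To give a feel for how low the probe rate already is, the following
   sketch lists successive probe intervals under the doubling rule.
   The 1 second starting value comes from the text above; the cap of
   60 seconds is an assumption, since the text only says the maximum
   is at least 1 minute.  The function name is hypothetical.

```python
def window_probe_intervals(count, first=1.0, cap=60.0):
    """Return the gaps (in seconds) before each of `count` probes."""
    intervals = []
    interval = first
    for _ in range(count):
        intervals.append(interval)
        # The gap doubles while the zero window persists, up to the cap.
        interval = min(interval * 2, cap)
    return intervals
```

   Under these assumptions, eight probes are spread over roughly three
   minutes, whether or not any of them is CE-marked.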

697	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
698	      deployed base of network elements and servers react to Window
699	      probes marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether
700	      they are dropped, codepoint cleared or processed.

702	3.2.5.  FIN (Send)

704	   A TCP implementation can set ECT on a FIN.

706	   See Section 3.3.4 for the implications if a host receives a CE-marked
707	   FIN.

709	   A congestion response to a CE-marking on a FIN is not required.

711	   After sending a FIN, the endpoint will not send any more data in the
712	   connection.  Therefore, even if the FIN-ACK indicates that the FIN
713	   was CE-marked (whether using classic or AccECN feedback), reducing
714	   the congestion window will not affect anything.

716	   After sending a FIN, a host might send one or more pure ACKs.  If it
717	   is using one of the techniques in Section 3.2.3 to regulate the
718	   delayed ACK ratio for pure ACKs, it could equally be applied after a
719	   FIN.  But this is not required.

721	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
722	      deployed base of network elements and servers react to FIN packets
723	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they
724	      are dropped, codepoint cleared or processed.

726	3.2.6.  RST (Send)

728	   A TCP implementation can set ECT on a RST.

730	   See Section 3.3.5 for the implications if a host receives a CE-marked
731	   RST.

733	   A congestion response to a CE-marking on a RST is not required (and
734	   actually not possible).

736	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
737	      deployed base of network elements and servers react to RST packets
738	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they
739	      are dropped, codepoint cleared or processed.

741	3.2.7.  Retransmissions (Send)

743	   For the ECN++ experiment, the TCP sender will set ECT on
744	   retransmitted segments.  It can ignore the prohibition in section
745	   6.1.5 of RFC 3168 against setting ECT on retransmissions, as per
746	   Section 4.3 of [RFC8311].

748	   See Section 3.3.6 for the implications if a host receives a CE-marked
749	   retransmission.

751	   If the TCP sender receives feedback that a retransmitted packet was
752	   CE-marked, it will react as it would to any feedback of CE-marking on
753	   a data packet.

755	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
756	      deployed base of network elements and servers react to
757	      retransmissions marked with the ECT(0)/ECT(1)/CE codepoints, i.e.
758	      whether they are dropped, codepoint cleared or processed.

760	3.2.8.  General Fall-back for any Control Packet or Retransmission

762	   Extensive measurements in fixed and mobile networks [Mandalari18]
763	   have found no evidence of blockages due to ECT being set on any type
764	   of TCP control packet.

766	   In case traversal problems arise in future, fall-back measures have
767	   been specified above, but only for the cases where ECT on the initial
768	   packet of a half-connection (SYN or SYN-ACK) is persistently failing
769	   to get through.

771	   Fall-back measures for blockage of ECT on other TCP control packets
772	   MAY be implemented.  However they are not specified here given the
773	   lack of any evidence they will be needed.  Section 4.9 justifies this
774	   advice in more detail.
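
   The send-side rules of Section 3.2 can be condensed into a single
   predicate, sketched below.  It is a simplification under stated
   assumptions: it covers only first transmissions on a connection
   where ECN is being (or has been) negotiated, it ignores the
   fall-back to not-ECT on repeated SYN or SYN-ACK timeouts, and the
   flag ect_on_syn_enabled stands in for whichever SYN strategy the
   initiator has chosen.  All names are hypothetical.

```python
from enum import Enum, auto


class PacketKind(Enum):
    SYN = auto()
    SYN_ACK = auto()
    PURE_ACK = auto()
    WINDOW_PROBE = auto()
    FIN = auto()
    RST = auto()
    RETRANSMISSION = auto()


def may_set_ect(kind: PacketKind,
                accecn_negotiated: bool,
                ect_on_syn_enabled: bool = True) -> bool:
    """Return True if an ECN++ sender may set ECT on this packet."""
    if kind is PacketKind.SYN:
        # Depends on the initiator's strategy for ECT on SYNs.
        return ect_on_syn_enabled
    if kind is PacketKind.PURE_ACK:
        # ECT MUST NOT be set on Pure ACKs without AccECN feedback.
        return accecn_negotiated
    # SYN-ACK, window probe, FIN, RST and retransmissions.
    return True
```

   The Pure ACK is the only packet type here whose treatment depends on
   the feedback mode negotiated for the connection.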

776	3.3.  Receiver Behaviour

778	   The present ECN++ specification primarily concerns the behaviour for
779	   sending TCP control packets or retransmissions.  Below are a few
780	   changes to the receive side of an implementation that are recommended
781	   while updating its send side.  Nonetheless, where deployment is
782	   concerned, ECN++ is still a sender-only deployment, because it does
783	   not depend on receivers complying with any of these recommendations.

785	3.3.1.  Receiver Behaviour for Any TCP Control Packet or Retransmission

787	   RFC 8311 is a standards track update to RFC 3168 in order to (amongst
788	   other things) "...allow the use of ECT codepoints on SYN packets,
789	   pure acknowledgement packets, window probe packets, and
790	   retransmissions of packets..., provided that the changes from RFC
791	   3168 are documented in an Experimental RFC in the IETF document
792	   stream."

794	   Section 4.3 of RFC 8311 amends every statement in RFC 3168 that
795	   precludes the use of ECT on control packets and retransmissions to
796	   add "unless otherwise specified by an Experimental RFC in the IETF
797	   document stream".  The present specification is such an Experimental
798	   RFC.  Therefore, in order for this experiment to be useful, the
799	   following requirements follow from RFC 8311:

801	   o  Any TCP implementation SHOULD accept receipt of any valid TCP
802	      control packet or retransmission irrespective of its IP/ECN field.
803	      If any existing implementation does not, it SHOULD be updated to
804	      do so.

806	   o  A TCP implementation taking part in the experiments proposed here
807	      MUST accept receipt of any valid TCP control packet or
808	      retransmission irrespective of its IP/ECN field.

810	   These measures are derived from the robustness principle of "... be
811	   liberal in what you accept from others", in order to ensure
812	   compatibility with any future protocol changes that allow ECT on any
813	   TCP packet.

815	3.3.2.  SYN (Receive)

817	   RFC 3168 negotiates the use of ECN for the connection end-to-end
818	   using the ECN flags in the TCP header.  When RFC 3168 says that "A
819	   host MUST NOT set ECT on SYN ... packets." it is silent as to what a
820	   TCP server ought to do if it receives a SYN packet with a non-zero
821	   IP/ECN field.

823	   At the time of writing, some implementations of TCP servers (see
824	   Section 4.2.2.2) assume that, if a host receives a SYN with a non-
825	   zero IP/ECN field, it must be due to network mangling, and they
826	   disable ECN for the rest of the connection.  Section 4.2.2.2 also
827	   finds that this type of network mangling seems to be virtually non-
828	   existent so it would be preferable to report any such mangling so it
829	   can be fixed.

831	   For the avoidance of doubt, the normative statements for all TCP
832	   control packets in Section 3.3.1 are interpreted for the case when a
833	   SYN is received as follows:

835	   o  Any TCP server implementation SHOULD accept receipt of a valid SYN
836	      that requests ECN support for the connection, irrespective of the
837	      IP/ECN field of the SYN.  If any existing implementation does not,
838	      it SHOULD be updated to do so.

840	   o  A TCP implementation taking part in the ECN++ experiment MUST
841	      accept receipt of a valid SYN, irrespective of its IP/ECN field.

843	   o  If the SYN is CE-marked and the server has no logic to feed back a
844	      CE mark on a SYN-ACK (e.g. it does not support AccECN), it has to
845	      ignore the CE-mark (the client detects this case and behaves
846	      conservatively in mitigation - see Section 3.2.1.3).

848	3.3.3.  Pure ACK (Receive)

850	   For the avoidance of doubt, the normative statements for all TCP
851	   control packets in Section 3.3.1 are interpreted for the case when a
852	   Pure ACK is received as follows:

854	   o  Any TCP implementation SHOULD accept receipt of a pure ACK with a
855	      non-zero ECN field, despite current RFCs precluding the sending of
856	      such packets.

858	   o  A TCP implementation taking part in the ECN++ experiment MUST
859	      accept receipt of a pure ACK with a non-zero ECN field.

861	   The question of whether and how the receiver of pure ACKs is required
862	   to feed back any CE marks on them is outside the scope of the present
863	   specification because it is a matter for the relevant feedback
864	   specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]).  AccECN
865	   feedback is required to count CE marking of any control packet
866	   including pure ACKs.  In contrast, RFC 3168 is silent on this
867	   point, so feedback of CE-markings might be implementation specific
868	   (see Section 4.4.1.1).

870	3.3.4.  FIN (Receive)

872	   The TCP data receiver MUST ignore the CE codepoint on incoming FINs
873	   that fail any validity check.  The validity check in section 5.2 of
874	   [RFC5961] is RECOMMENDED.

876	3.3.5.  RST (Receive)

878	   The "challenge ACK" approach to checking the validity of RSTs
879	   (section 3.2 of [RFC5961]) is RECOMMENDED at the data receiver.

881	3.3.6.  Retransmissions (Receive)

883	   The TCP data receiver MUST ignore the CE codepoint on incoming
884	   segments that fail any validity check.  The validity check in section
885	   5.2 of [RFC5961] is RECOMMENDED.  This will effectively mitigate an
886	   attack that uses spoofed data packets to fool the receiver into
887	   feeding back spoofed congestion indications to the sender, which in
888	   turn would be fooled into continually reducing its congestion window.
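
   As an illustration of gating CE on a validity check, the sketch
   below applies the acceptable-ACK range from section 5.2 of
   [RFC5961], (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT, using
   32-bit sequence-number arithmetic.  The function names are
   hypothetical, and a real implementation would apply its other
   checks (e.g. on the sequence number) as well.

```python
SEQ_SPACE = 1 << 32  # TCP sequence numbers are 32 bits wide


def seq_leq(a: int, b: int) -> bool:
    """True if a <= b in modulo-2^32 sequence space."""
    return ((b - a) % SEQ_SPACE) < (1 << 31)


def ack_is_acceptable(seg_ack: int, snd_una: int, snd_nxt: int,
                      max_snd_wnd: int) -> bool:
    """RFC 5961 section 5.2 range check on the ACK field."""
    lower = (snd_una - max_snd_wnd) % SEQ_SPACE
    return seq_leq(lower, seg_ack) and seq_leq(seg_ack, snd_nxt)


def should_count_ce(segment_is_ce: bool, seg_ack: int, snd_una: int,
                    snd_nxt: int, max_snd_wnd: int) -> bool:
    """Only count a CE mark if the segment passes the validity check."""
    return segment_is_ce and ack_is_acceptable(
        seg_ack, snd_una, snd_nxt, max_snd_wnd)
```

   A spoofed segment that guesses the ACK field wrongly is then ignored
   for congestion feedback, even if it arrives with the CE codepoint.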

890	4.  Rationale

892	   This section is informative, not normative.  It presents counter-
893	   arguments against the justifications in the RFC series for disabling
894	   ECN on TCP control segments and retransmissions.  It also gives
895	   rationale for why ECT is safe on control segments that have not, so
896	   far, been mentioned in the RFC series.  First it addresses over-
897	   arching arguments used for most packet types, then it addresses the
898	   specific arguments for each packet type in turn.

900	4.1.  The Reliability Argument

902	   Section 5.2 of RFC 3168 states:

904	      "To ensure the reliable delivery of the congestion indication of
905	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
906	      unless the loss of that packet [at a subsequent node] in the
907	      network would be detected by the end nodes and interpreted as an
908	      indication of congestion."

910	   We believe this argument is misplaced.  TCP does not deliver most
911	   control packets reliably.  So it is more important to allow control
912	   packets to be ECN-capable, which greatly improves reliable delivery
913	   of the control packets themselves (see motivation in Section 1.1).
914	   ECN also improves the reliability and latency of delivery of any
915	   congestion notification on control packets, particularly because TCP
916	   does not detect the loss of most types of control packet anyway.
917	   Both these points outweigh by far the concern that a CE marking
918	   applied to a control packet by one node might subsequently be dropped
919	   by another node.

921	   The principle to determine whether a packet can be ECN-capable ought
922	   to be "do no extra harm", meaning that the reliability of a
923	   congestion signal's delivery ought to be no worse with ECN than
924	   without.  In particular, setting the CE codepoint on the very same
925	   packet that would otherwise have been dropped fulfills this
926	   criterion, since either the packet is delivered and the CE signal is
927	   delivered to the endpoint, or the packet is dropped and the original
928	   congestion signal (packet loss) is delivered to the endpoint.

930	   The concern about a CE marking being dropped at a subsequent node
931	   might be motivated by the idea that ECN-marking a packet at the first
932	   node does not remove the packet, so it could go on to worsen
933	   congestion at a subsequent node.  However, it is not useful to reason
934	   about congestion by considering single packets.  The departure rate
935	   from the first node will generally be the same (fully utilized) with
936	   or without ECN, so this argument does not apply.

938	4.2.  SYNs

940	   RFC 5562 presents two arguments against ECT marking of SYN packets
941	   (quoted verbatim):

943	      "First, when the TCP SYN packet is sent, there are no guarantees
944	      that the other TCP endpoint (node B in Figure 2) is ECN-Capable,
945	      or that it would be able to understand and react if the ECN CE
946	      codepoint was set by a congested router.

948	      Second, the ECN-Capable codepoint in TCP SYN packets could be
949	      misused by malicious clients to "improve" the well-known TCP SYN
950	      attack.  By setting an ECN-Capable codepoint in TCP SYN packets, a
951	      malicious host might be able to inject a large number of TCP SYN
952	      packets through a potentially congested ECN-enabled router,
953	      congesting it even further."

955	   The first point actually describes two subtly different issues.  So
956	   below three arguments are countered in turn.

958	4.2.1.  Argument 1a: Unrecognized CE on the SYN

960	   This argument certainly applied at the time RFC 5562 was written,
961	   when no ECN responder mechanism had any logic to recognize a CE
962	   marking on a SYN and, even if logic were added, there was no field in
963	   the SYN-ACK to feed it back.  The problem was that, during the 3WHS,
964	   the flag in the TCP header for ECN feedback (called Echo Congestion
965	   Experienced) had been overloaded to negotiate the use of ECN itself.

967	   The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has
968	   since been designed to solve this problem.  Two features are
969	   important here:

971	   1.  An AccECN server uses the 3 'ECN' bits in the TCP header of the
972	       SYN-ACK to respond to the client.  Four of the 8 possible
973	       codepoints provide enough space for the server to feed back
974	       which of the 4 IP/ECN codepoints was on the incoming SYN
975	       (including CE of course).

977	   2.  If any of these 4 codepoints are in the SYN-ACK, it confirms that
978	       the server supports AccECN and, if another codepoint is returned,
979	       it confirms that the server doesn't support AccECN.
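
   The two features above can be sketched as a lookup on the three TCP
   header flags of the SYN-ACK, taken here in the order (AE, CWR,
   ECE).  Note that the specific bit patterns in the table are an
   assumption, written from recollection of the handshake table in
   [I-D.ietf-tcpm-accurate-ecn]; they need to be checked against that
   specification before any use.  The function name is hypothetical.

```python
# ASSUMED encoding of (AE, CWR, ECE) on the SYN-ACK -> IP/ECN field that
# the server saw on the SYN.  Verify against the AccECN specification.
ACCECN_SYN_ACK_FEEDBACK = {
    (0, 1, 0): "Not-ECT",
    (0, 1, 1): "ECT(1)",
    (1, 0, 0): "ECT(0)",
    (1, 1, 0): "CE",
}


def interpret_syn_ack(ae: int, cwr: int, ece: int):
    """Return (server_supports_accecn, ip_ecn_seen_on_syn).

    The second element is None when the flags are not one of the four
    AccECN feedback codepoints, in which case the client cannot learn
    whether its SYN was CE-marked.
    """
    seen_on_syn = ACCECN_SYN_ACK_FEEDBACK.get((ae, cwr, ece))
    return (seen_on_syn is not None, seen_on_syn)
```

   A client would treat a result of (True, "CE") as congestion feedback
   on its SYN, and a result of (False, None) as a server that cannot
   provide such feedback.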

981	   This still does not seem to allow a client to set ECT on a SYN,
982	   because it only finds out afterwards whether the server would have
983	   supported it.  The trick the client uses for ECN++ is to set ECT on
984	   the SYN optimistically.  Then, if the SYN-ACK reveals that the
985	   server wouldn't have understood CE on the SYN, the client responds
986	   conservatively, as if the SYN was marked with CE.

988	   The recommended conservative congestion response is to reduce the
989	   initial window, which does not affect the performance of very popular
990	   protocols such as HTTP, since it is extremely rare for an HTTP client
991	   to send more than one packet as its initial request anyway (for data
992	   on HTTP/1 & HTTP/2 request sizes see Fig 3 in [Manzoor17]).  Any
993	   clients that do frequently use a larger initial window for their
994	   first message to the server can cache which servers will not
995	   understand ECT on a SYN (see Section 4.2.3 below).  If caching is not
996	   practical, such clients could reduce the initial window to say IW2 or
997	   IW3.

999	      EXPERIMENTATION NEEDED: Experiments will be needed to determine
1000	      any better strategy for reducing IW in response to congestion on a
1001	      SYN, when the server does not support congestion feedback on the
1002	      SYN-ACK (whether cached or discovered explicitly).

1004	4.2.2.  Argument 1b: ECT Considered Invalid on the SYN

1006	   Given that ECT-marked SYN packets have been prohibited until now, it
1007	   cannot be assumed that TCP middleboxes or servers will accept
1008	   them.

1010	4.2.2.1.  ECT on SYN Considered Invalid by Middleboxes

1012	   According to a study using 2014 data [ecn-pam] from a limited range
1013	   of fixed vantage points, for the top 1M Alexa web sites, adding the
1014	   ECN capability to SYNs was increasing connection establishment
1015	   failures by about 0.4%.

1017	   From a wider range of fixed and mobile vantage points, a more recent
1018	   study in Jan-May 2017 [Mandalari18] found no occurrences of blocking
1019	   of ECT on SYNs.  However, in more than half the mobile networks
1020	   tested it found wiping of the ECN codepoint at the first hop.

1022	      MEASUREMENTS NEEDED: As wiping at the first hop is remedied,
1023	      measurements will be needed to check whether SYNs with ECT are
1024	      sometimes blocked deeper into the path.

1026	   Silent failures introduce a retransmission timeout delay (default 1
1027	   second) at the initiator before it attempts any fall-back strategy
1028	   (whereas explicit RSTs can be dealt with immediately).  Ironically,
1029	   making SYNs ECN-capable is intended to avoid the timeout when a SYN
1030	   is lost due to congestion.  Fortunately, if there is any discard of
1031	   ECN-capable SYNs due to policy, it will occur predictably, not
1032	   randomly like congestion.  So the initiator should be able to avoid
1033	   it by caching those sites that do not support ECN-capable SYNs (see
1034	   the last paragraph of Section 3.2.1.2).

1036	4.2.2.2.  ECT on SYN Considered Invalid by Servers

1038	   A study conducted in Nov 2017 [Kuehlewind18] found that, of the 82%
1039	   of the Alexa top 50k web servers that supported ECN, 84% disabled ECN
1040	   if the IP/ECN field on the SYN was ECT0, CE or either.  Given most
1041	   web servers use Linux, this behaviour can most likely be traced to a
1042	   patch contributed in May 2012 that was first distributed in v3.5 of
1043	   the Linux kernel [strict-ecn].  The comment says "RFC3168 : 6.1.1 SYN
1044	   packets must not have ECT/ECN bits set.  If we receive a SYN packet
1045	   with these bits set, it means a network is playing bad games with TOS
1046	   bits.  In order to avoid possible false congestion notifications, we
1047	   disable TCP ECN negociation."  Of course, some of the 84% might be
1048	   due to similar code in other OSs.

1050	   For brevity we shall call this the "over-strict" ECN test, because it
1051	   is over-conservative with what it accepts, contrary to Postel's
1052	   robustness principle.  A robust protocol will not usually assume
1053	   network mangling without comparing with the value originally sent,
1054	   and one packet is not sufficient to make an assumption with such
1055	   irreversible consequences anyway.

1057	   Ironically, networks rarely seem to alter the IP/ECN field on a SYN
1058	   from zero to non-zero anyway.  In a study conducted in Jan-May 2017
1059	   over millions of paths from vantage points in a few dozen mobile and
1060	   fixed networks [Mandalari18], no such transition was observed.  With
1061	   such a small or non-existent incidence of this sort of network
1062	   mangling, it would be preferable to report any residual problem paths
1063	   so that they can be fixed.

1065	   Regardless, the widespread presence of this 'over-strict' test proves
1066	   that RFC 5562 was correct to expect that ECT would be considered
1067	   invalid on SYNs.  Nonetheless, it is not an insurmountable problem -
1068	   the over-strict test in Linux was patched in Apr 2019
1069	   [relax-strict-ecn] and caching can work round it where previous
1070	   versions of Linux are running.  The prevalence of these "over-strict"
1071	   ECN servers makes it challenging to cache them all.  However,
1072	   Section 4.2.3 below explains how a cache of limited size can
1073	   alleviate this problem for a client's most popular sites.

1075	   For the future, [RFC8311] updates RFC 3168 to clarify that the IP/ECN
1076	   field does not have to be zero on a SYN if documented in an
1077	   experimental RFC such as the present ECN++ specification.

1079	4.2.3.  Caching Strategies for ECT on SYNs

1081	   Given the server handling of ECN on SYNs outlined in Section 4.2.2.2
1082	   above, an initiator might combine AccECN with three candidate caching
1083	   strategies for setting ECT on a SYN:

1085	   (S1):  Pessimistic ECT and cache successes: The initiator always
1086	          requests AccECN, but by default without ECT on the SYN.  Then
1087	          it caches those servers that confirm that they support AccECN
1088	          as 'ECT SYN OK'.  On a subsequent connection to any server
1089	          that supports AccECN, the initiator can then set ECT on the
1090	          SYN.  When connecting to other servers (non-ECN or classic
1091	          ECN) it will not set ECT on the SYN, so it will not fail the
1092	          'over-strict' ECN test.

1094	          Longer term, as servers upgrade to AccECN, the initiator is
1095	          still requesting AccECN, so it will add them to the cache and
1096	          use ECT on subsequent SYNs to those servers.  However,
1097	          assuming it has to cap the size of the cache, the client will
1098	          not have the benefit of ECT SYNs to those less frequently used
1099	          AccECN servers expelled from its cache.

1101	   (S2):  Optimistic ECT: The initiator always requests AccECN and by
1102	          default sets ECT on the SYN.  Then, if the server response
1103	          shows it has no AccECN logic (so it cannot feed back a CE
1104	          mark), the initiator conservatively behaves as if the SYN was
1105	          CE-marked, by reducing its initial window.

1107	          A.  No cache.

1109	          B.  Cache failures: The optimistic ECT strategy can be
1110	              improved by caching solely those servers that do not
1111	              support AccECN as 'ECT SYN NOK'.  This would include non-
1112	              ECN servers and all Classic ECN servers whether 'over-
1113	              strict' or not.  On subsequent connections to these non-
1114	              AccECN servers, the initiator will still request AccECN
1115	              but not set ECT on the SYN.  Then, the connection can
1116	              still fall back to Classic ECN, if the server supports it,
1117	              and the initiator can use its full initial window (if it
1118	              has enough request data to need it).

1120	              Longer term, as servers upgrade to AccECN, the initiator
1121	              will remove them from the cache and use ECT on subsequent
1122	              SYNs to that server.

1124	              Where an access network operator mediates Internet access
1125	              via a proxy that does not support AccECN, the optimistic
1126	              ECT strategy will always fail.  This scenario is more
1127	              likely in mobile networks.  Therefore, a mobile host could
1128	              cache lack of AccECN support per attached access network
1129	              operator.  Whenever it attached to a new operator, it
1130	              could check a well-known AccECN test server and, if it
1131	              found no AccECN support, it would add a cache entry for
1132	              the attached operator.  It would only use ECT when neither
1133	              network nor server were cached.  It would only populate
1134	              its per server cache when not attached to a non-AccECN
1135	              proxy.

1137	   (S3):  ECT by configuration: In a controlled environment, the
1138	          administrator can make sure that servers support ECN-capable
1139	          SYN packets.  Examples of controlled environments are single-
1140	          tenant DCs, and possibly multi-tenant DCs if it is assumed
1141	          that each tenant mostly communicates with its own VMs.
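
   As an illustration only, the "optimistic ECT and cache failures"
   strategy (S2B) described above could be sketched as follows.  The
   class name, method names, the cache size and the least-recently-used
   eviction policy are hypothetical and are not part of this
   specification; the per-operator set models the mobile proxy
   refinement.

```python
from collections import OrderedDict


class EctSynPolicy:
    """Hypothetical sketch of strategy (S2B): optimistic ECT, cache failures."""

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries      # assumed cap on the per-server cache
        self.nok_servers = OrderedDict()    # servers cached as 'ECT SYN NOK'
        self.nok_operators = set()          # operators with a non-AccECN proxy

    def use_ect_on_syn(self, server, operator=None):
        """Set ECT on the SYN only when neither network nor server is cached."""
        if operator in self.nok_operators:
            return False
        if server in self.nok_servers:
            self.nok_servers.move_to_end(server)    # refresh LRU position
            return False
        return True

    def on_handshake(self, server, sent_ect, peer_accecn, operator=None):
        """Update the cache.  Return True if the initial window has to be
        reduced as if the SYN had been CE-marked, because the server has
        no AccECN logic to feed back a CE mark."""
        if peer_accecn:
            # Server supports (or has upgraded to) AccECN: forget any failure.
            self.nok_servers.pop(server, None)
            return False
        if operator not in self.nok_operators:
            # Only populate the per-server cache when not attached to a
            # non-AccECN proxy.
            self.nok_servers[server] = True
            self.nok_servers.move_to_end(server)
            while len(self.nok_servers) > self.max_entries:
                self.nok_servers.popitem(last=False)    # evict least recent
        # A conservative response is only needed if the SYN carried ECT.
        return sent_ect
```

   On a first connection to a non-AccECN server, use_ect_on_syn()
   returns True and on_handshake() then returns True (reduce the initial
   window).  On later connections to the same server the SYN is sent
   not-ECT and the full initial window can be used.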

1143	   For unmanaged environments like the public Internet, pragmatically
1144	   the choice is between strategies (S1), (S2A) and (S2B).  The
1145	   normative specification for ECT on a SYN in Section 3.2.1 recommends
1146	   the "optimistic ECT and cache failures" strategy (S2B) but the choice
1147	   depends on the implementer's motivation for using ECN++, and the
1148	   deployment prevalence of different technologies and bug-fixes.

1150	   o  The "pessimistic ECT and cache successes" strategy (S1) suffers
1151	      from exposing the initial SYN to the prevailing loss level, even
1152	      if the server supports ECT on SYNs, but only on the first
1153	      connection to each AccECN server.  If AccECN becomes widely
1154	      deployed on servers, SYNs to those AccECN servers that are less
1155	      frequently used by the client and therefore don't fit in the cache
1156	      will not benefit from ECN protection at all.

1158	   o  The "optimistic ECT without a cache" strategy (S2A) is the
1159	      simplest.  It would satisfy the goal of an implementer who is
1160	      solely interested in low latency using AccECN and ECN++ and is not
1161	      concerned about fall-back to Classic ECN.

1163	   o  The "optimistic ECT and cache failures" strategy (S2B) exploits
1164	      ECT on SYNs from the very first attempt.  If the server turns out
1165	      to be 'over-strict' it will disable ECN for the connection, but
1166	      only for the first connection if it's one of the client's more
1167	      popular servers that fits in the cache.  If the server turns out
1168	      not to support AccECN, the initiator has to conservatively limit
1169	      its initial window, but again only for the first connection if
1170	      it's one of the client's more popular servers (and anyway this
1171	      rarely makes any difference when most client requests fit in a
1172	      single packet).

1174	   Note that, if AccECN deployment grows, caching successes (S1) starts
1175	   off small then grows, while caching failures (S2B) becomes large at
1176	   first, then shrinks.  At half-way, the size of the cache has to be
1177	   capped with either approach, so the default behaviour for all the
1178	   servers that do not fit in the cache is as important as the behaviour
1179	   for the popular servers that do fit.

1181	      MEASUREMENTS NEEDED: Measurements are needed to determine which
1182	      strategy would be sufficient for any particular client, whether a
1183	      particular client would need different strategies in different
1184	      circumstances and how many occurrences of problems would be masked
1185	      by how few cache entries.

1187	   Another strategy would be to send a not-ECT SYN a short delay (below
1188	   the typical lowest RTT) after an ECT SYN and only accept the non-ECT
1189	   connection if it returned first.  This would reduce the performance
1190	   penalty for those deploying ECT SYN support.  However, this 'happy
1191	   eyeballs' approach becomes complex when multiple optional features
1192	   are all tried on the first SYN (or on multiple SYNs), so it is not
1193	   recommended.

1195	4.2.4.  Argument 2: DoS Attacks

1197	   [RFC5562] says that ECT SYN packets could be misused by malicious
1198	   clients to augment "the well-known TCP SYN attack".  It goes on to
1199	   say "a malicious host might be able to inject a large number of TCP
1200	   SYN packets through a potentially congested ECN-enabled router,
1201	   congesting it even further."

1203	   We assume this is a reference to the TCP SYN flood attack (see
1204	   https://en.wikipedia.org/wiki/SYN_flood), which is an attack against
1205	   a responder end point.  We assume the idea of this attack is to use
1206	   ECT to get more packets through an ECN-enabled router in preference
1207	   to other non-ECN traffic so that they can go on to use the SYN
1208	   flooding attack to inflict more damage on the responder end point.
1209	   This argument could apply to flooding with any type of packet, but we
1210	   assume SYNs are singled out because their source address is easier to
1211	   spoof, whereas floods of other types of packets are easier to block.

1213	   Mandating Not-ECT in an RFC does not stop attackers using ECT for
1214	   flooding.  Nonetheless, if a standard says SYNs are not meant to be
1215	   ECT it would make it legitimate for firewalls to discard them.
1216	   However this would negate the considerable benefit of ECT SYNs for
1217	   compliant transports and seems unnecessary because RFC 3168 already
1218	   provides the means to address this concern.  In section 7, RFC 3168
1219	   says "During periods where ... the potential packet marking rate
1220	   would be high, our recommendation is that routers drop packets rather
1221	   then set the CE codepoint..." and this advice is repeated in
1222	   [RFC7567] (section 4.2.1).  This makes it harder for flooding packets
1223	   to gain from ECT.

1225	   [ecn-overload] showed that ECT can only slightly augment flooding
1226	   attacks relative to a non-ECT attack.  It was hard to overload the
1227	   link without causing the queue to grow, which in turn caused the AQM
1228	   to disable ECN and switch to drop, thus negating any advantage of
1229	   using ECT.  This was true even with the switch-over point set to 25%
1230	   drop probability (i.e. the arrival rate was 133% of the link rate).
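
   The relationship between the two figures in the last sentence can be
   checked with a short calculation.  It assumes unresponsive traffic,
   so that the fraction of packets not dropped exactly fills the link;
   the function name is hypothetical.

```python
def arrival_to_link_ratio(drop_probability):
    """Arrival rate relative to link rate, assuming the traffic that
    survives dropping exactly fills the link:
    arrival * (1 - p) = link, so arrival / link = 1 / (1 - p)."""
    if not 0 <= drop_probability < 1:
        raise ValueError("drop probability must be in [0, 1)")
    return 1.0 / (1.0 - drop_probability)


# A 25% drop probability corresponds to an arrival rate of 133% of the link rate.
print(f"{arrival_to_link_ratio(0.25):.0%}")  # prints 133%
```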

1232	4.3.  SYN-ACKs

1234	   The proposed approach in Section 3.2.2 for experimenting with ECN-
1235	   capable SYN-ACKs is effectively identical to the scheme called ECN+
1236	   [ECN-PLUS].  In 2005, the ECN+ paper demonstrated that it could
1237	   reduce the average Web response time by an order of magnitude.  It
1238	   also argued that adding ECT to SYN-ACKs did not raise any new
1239	   security vulnerabilities.

1241	4.3.1.  Possibility of Unrecognized CE on the SYN-ACK

1243	   The feedback behaviour by the initiator in response to a CE-marked
1244	   SYN-ACK from the responder depends on whether classic ECN feedback
1245	   [RFC3168] or AccECN feedback [I-D.ietf-tcpm-accurate-ecn] has been
1246	   negotiated.  In either case no change is required to RFC 3168 or the
1247	   AccECN specification.

1249	   Some classic ECN client implementations might ignore a CE-mark on a
1250	   SYN-ACK, or even ignore a SYN-ACK packet entirely if it is set to ECT
1251	   or CE.  This is a possibility because an RFC 3168 implementation
1252	   would not necessarily expect a SYN-ACK to be ECN-capable.  This issue
1253	   already came up when the IETF first decided to experiment with ECN on
1254	   SYN-ACKs [RFC5562] and it was decided to go ahead without any extra
1255	   precautionary measures.  This was because the probability of
1256	   encountering the problem was believed to be low and the harm if the
1257	   problem arose was also low (see Appendix B of RFC 5562).

1259	4.3.2.  Response to Congestion on a SYN-ACK

1261	   The IETF has already specified an experiment with ECN-capable SYN-ACK
1262	   packets [RFC5562].  It was inspired by the ECN+ paper, but it
1263	   specified a much more conservative congestion response to a CE-marked
1264	   SYN-ACK, called ECN+/TryOnce.  This required the server to reduce its
1265	   initial window to 1 segment (like ECN+), but then the server had to
1266	   send a second SYN-ACK and wait for its ACK before it could continue
1267	   with its initial window of 1 SMSS.  The second SYN-ACK of this 5-way
1268	   handshake had to carry no data, and had to disable ECN, but no
1269	   justification was given for these last two aspects.

1271	   The present ECN++ experimental specification obsoletes RFC 5562
1272	   because it uses the ECN+ congestion response, not ECN+/TryOnce.
1273	   First we argue against the rationale for ECN+/TryOnce given in
1274	   sections 4.4 and 6.2 of [RFC5562].  It starts with a rather too
1275	   literal interpretation of the requirement in RFC 3168 that says TCP's
1276	   response to a single CE mark has to be "essentially the same as the
1277	   congestion control response to a *single* dropped packet."  TCP's
1278	   response to a dropped initial (SYN or SYN-ACK) packet is to wait for
1279	   the retransmission timer to expire (currently 1s).  However, this
1280	   long delay assumes the worst case between two possible causes of the
1281	   loss: a) heavy overload; or b) the normal capacity-seeking behaviour
1282	   of other TCP flows.  When the network is still delivering CE-marked
1283	   packets, it implies that there is an AQM at the bottleneck and that
1284	   it is not overloaded.  This is because an AQM under overload will
1285	   disable ECN (as recommended in section 7 of RFC 3168 and repeated in
1286	   section 4.2.1 of RFC 7567).  So scenario (a) can be ruled out.
1287	   Therefore, TCP's response to a CE-marked SYN-ACK can be similar to
1288	   its response to the loss of _any_ packet, rather than backing off as
1289	   if the special _initial_ packet of a flow has been lost.

1291	   How TCP responds to the loss of any single packet depends on what it
1292	   has just been doing.  But there is not really a precedent for TCP's
1293	   response when it experiences a CE mark having sent only one (small)
1294	   packet.  If TCP had been adding one segment per RTT, it would have
1295	   halved its congestion window, but it hasn't established a congestion
1296	   window yet.  If it had been exponentially increasing it would have
1297	   exited slow start, but it hasn't started exponentially increasing yet
1298	   so it hasn't established a slow-start threshold.

1300	   Therefore, we have to work out a reasoned argument for what to do.
1301	   If an AQM is CE-marking packets, it implies there is already a queue
1302	   and it is probably already somewhere around the AQM's operating point
1303	   - it is unlikely to be well below and it might be well above.  So,
1304	   the more data packets that the client sends in its IW, the more
1305	   likely at least one will be CE marked, leading it to exit slow-start
1306	   early.  On the other hand, it is highly unlikely that the SYN-ACK
1307	   itself pushed the AQM into congestion, so it will be safe to
1308	   introduce another single segment immediately (1 RTT after the SYN-
1309	   ACK).  Therefore, starting to probe for capacity with a slow start
1310	   from an initial window of 1 segment seems appropriate to the
1311	   circumstances.  This is the approach adopted in Section 3.2.2.
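
   A minimal sketch of this response is given below.  The SMSS of 1460
   bytes and the default initial window of 10 segments are assumptions
   chosen for illustration, and the function names are hypothetical.

```python
def server_initial_window(synack_ce_marked, smss=1460, default_iw_segments=10):
    """Initial window in bytes: 1 segment if the client fed back a CE
    mark on the SYN-ACK, otherwise the (assumed) default."""
    segments = 1 if synack_ce_marked else default_iw_segments
    return segments * smss


def slow_start_segments(iw_segments, rounds):
    """Window in segments at the start of each round trip, doubling per
    RTT for as long as no further congestion is signalled."""
    return [iw_segments * 2 ** r for r in range(rounds)]
```

   For example, after a CE-marked SYN-ACK the server would probe with
   windows of 1, 2, 4 and 8 segments over the first four round trips,
   unless further congestion feedback arrives.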

1313	      EXPERIMENTATION NEEDED: Experiments will be needed to check the
1314	      above reasoning and determine any better strategy for reducing IW
1315	      in response to congestion on a SYN-ACK (or a SYN).

1317	4.3.3.  Fall-Back if ECT SYN-ACK Fails

1319	   An alternative to the server caching failed connection attempts would
1320	   be for the server to rely on the client caching failed attempts (on
1321	   the basis that the client would cache a failure whether ECT was
1322	   blocked on the SYN or the SYN-ACK).  This strategy cannot be used if
1323	   the SYN does not request AccECN support.  It works as follows: if the
1324	   server receives a SYN that requests AccECN support but is set to not-
1325	   ECT, it replies with a SYN-ACK also set to not-ECT.  If a middlebox
1326	   only blocks ECT on SYNs, not SYN-ACKs, this strategy might disable
1327	   ECN on a SYN-ACK when it did not need to, but at least it saves the
1328	   server from maintaining a cache.
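
   The server-side rule above might be expressed as in the sketch below.
   The function name is hypothetical.  The case of an ECT SYN is an
   inference from the text (the client has not cached a failure, so the
   server also uses ECT), not a statement from this section.

```python
def synack_ect_from_syn(syn_requests_accecn, syn_is_ect):
    """Decide whether to set ECT on the SYN-ACK by relying on the
    client's cache.  Returns None when the strategy cannot be used,
    i.e. when the SYN does not request AccECN support."""
    if not syn_requests_accecn:
        return None
    # Mirror the SYN: a not-ECT SYN that requests AccECN is answered
    # with a not-ECT SYN-ACK.
    return syn_is_ect
```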

1330	4.4.  Pure ACKs

1332	   Section 5.2 of RFC 3168 gives the following arguments for not
1333	   allowing the ECT marking of pure ACKs (ACKs not piggy-backed on
1334	   data):

1336	      "To ensure the reliable delivery of the congestion indication of
1337	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
1338	      unless the loss of that packet in the network would be detected by
1339	      the end nodes and interpreted as an indication of congestion.

1341	      Transport protocols such as TCP do not necessarily detect all
1342	      packet drops, such as the drop of a "pure" ACK packet; for
1343	      example, TCP does not reduce the arrival rate of subsequent ACK
1344	      packets in response to an earlier dropped ACK packet.  Any
1345	      proposal for extending ECN-Capability to such packets would have
1346	      to address issues such as the case of an ACK packet that was
1347	      marked with the CE codepoint but was later dropped in the network.
1348	      We believe that this aspect is still the subject of research, so
1349	      this document specifies that at this time, "pure" ACK packets MUST
1350	      NOT indicate ECN-Capability."

1352	   Later on, in section 6.1.4 it reads:

1354	      "For the current generation of TCP congestion control algorithms,
1355	      pure acknowledgement packets (e.g., packets that do not contain
1356	      any accompanying data) MUST be sent with the not-ECT codepoint.
1357	      Current TCP receivers have no mechanisms for reducing traffic on
1358	      the ACK-path in response to congestion notification.  Mechanisms
1359	      for responding to congestion on the ACK-path are areas for current
1360	      and future research.  (One simple possibility would be for the
1361	      sender to reduce its congestion window when it receives a pure ACK
1362	      packet with the CE codepoint set).  For current TCP
1363	      implementations, a single dropped ACK generally has only a very
1364	      small effect on the TCP's sending rate."

1366	   We next address each of the arguments presented above.

1368	   The first argument is a specific instance of the reliability argument
1369	   for the case of pure ACKs.  This has already been addressed by
1370	   countering the general reliability argument in Section 4.1.

1372	   The second argument says that ECN ought not to be enabled unless
1373	   there is a mechanism to respond to it.  This argument actually
1374	   comprises three sub-arguments:

1376	   Mechanism feasibility:  If ECN is enabled on Pure ACKs, are there, or
1377	      could there be, suitable mechanisms to detect, feed back and
1378	      respond to ECN-marked Pure ACKs?

1380	   Do no extra harm:  There has never been a mechanism to respond to
1381	      loss of non-ECN Pure ACKs.  So it seems that adding ECN without a
1382	      response mechanism will do no extra harm to others, while
1383	      improving a connection's own performance (because loss of an ACK
1384	      holds back new data).  However, if the end systems have no
1385	      response mechanism, ECN Pure ACKs do slightly more harm than non-
1386	      ECN, because the AQM marks ECT packets rather than removing them
1387	      from the queue, until it reaches overload and disables ECN.

1389	   Standards policy:  Even if there were no harm to others, does it set
1390	      an undesirable precedent to allow a flow to use ECN to protect its
1391	      Pure ACKs from loss, when there is no mechanism to respond to ECN-
1392	      marking?

1394	   The last two arguments involve value judgements, but they both depend
1395	   on the concrete technical question of mechanism feasibility, which
1396	   will therefore be addressed first in Section 4.4.1 below.  Then
1397	   Section 4.4.2 draws conclusions by addressing the value judgements in
1398	   the other two questions.

1400	4.4.1.  Mechanisms to Respond to CE-Marked Pure ACKs

1402	   The question of whether the receiver of pure ACKs is required to
1403	   detect and feed back any CE-marking is outside the scope of the
1404	   present specification - it is a matter for the relevant feedback
1405	   specification (classic ECN [RFC3168] and AccECN
1406	   [I-D.ietf-tcpm-accurate-ecn]).  The response to congestion feedback
1407	   is also out of scope, because it would be defined in the base TCP
1408	   congestion control specification [RFC5681] or its variants.

1410	   Nonetheless, in order to decide whether the present ECN++
1411	   experimental specification should require a host to set ECT on pure
1412	   ACKs, we only need to know whether a response mechanism would be
1413	   feasible - we do not have to standardize it.  So the bullets below
1414	   assess, for each type of feedback, whether the three stages of the
1415	   congestion response mechanism could all work.

1417	   Detection:  Can the receiver of a pure ACK detect a CE marking on
1418	      it?

1420	      *  Classic feedback: RFC 3168 is silent on this point.  The
1421	         implementer of the receiver would not expect CE marks on pure
1422	         ACKs, but the implementation might happen to check for CE marks
1423	         before it looks for the data.  So detection will be
1424	         implementation-dependent.

1426	      *  AccECN feedback: the AccECN specification requires the receiver
1427	         of any TCP packets to count any CE marks on them (whether or
1428	         not it sends ECN-capable control packets itself).

1430	   Feedback:  TCP never ACKs a pure ACK, but the receiver of a CE-mark
1431	      on a pure ACK could feed it back when it sends a subsequent data
1432	      segment (if it ever does):

1434	      *  Classic feedback: RFC 3168 is silent on this point, so feedback
1435	         of CE-markings might be implementation specific.  If the
1436	         receiver (of the pure ACKs) did generate feedback, it would set
1437	         the echo congestion experienced (ECE) flag in the TCP header of
1438	         subsequent packets in the round, as it would to feed back CE on
1439	         data packets.

1441	      *  AccECN feedback: the receiver continually feeds back a count of
1442	         the number of CE-marked packets that it has received and,
1443	         optionally, a count of CE-marked bytes.  For either metric,
1444	         AccECN includes pure ACKs and indeed all types of packets.

1446	   Congestion response:  In either case (classic or AccECN feedback), if
1447	      the TCP sender does receive feedback about CE-markings on pure
1448	      ACKs, it will be able to reduce the congestion window (cwnd) and/
1449	      or the ACK rate.

1451	   Therefore a congestion response mechanism is clearly feasible if
1452	   AccECN has been negotiated, but the position is unknown for the
1453	   installed base of classic ECN feedback.
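
   The detection stage with AccECN feedback can be pictured with the
   simplified receiver-side counters below.  This is only a sketch: the
   class and attribute names are hypothetical, the byte counter is
   assumed to cover payload bytes only, and the wire encoding of the
   counters defined in the AccECN specification is not modelled.

```python
class CeCounters:
    """Simplified counters of CE marks seen on arriving TCP packets."""

    def __init__(self):
        self.ce_packets = 0     # CE-marked packets of any type
        self.ce_bytes = 0       # payload bytes in CE-marked packets

    def on_packet(self, ce_marked, payload_len):
        """Count a CE mark on any packet, including a pure ACK
        (payload_len == 0)."""
        if ce_marked:
            self.ce_packets += 1
            self.ce_bytes += payload_len
```

   A CE-marked pure ACK increments the packet counter but, under the
   payload-only assumption, leaves the byte counter unchanged.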

1455	4.4.1.1.  Congestion Window Response to CE-Marked Pure ACKs

1457	   This subsection explores issues that congestion control designers
1458	   will need to consider when defining a cwnd response to CE-marked Pure
1459	   ACKs.

1461	   A CE-mark on a Pure ACK does not mean that only Pure ACKs are causing
1462	   congestion.  It only means that the marked Pure ACK is part of an
1463	   aggregate that is collectively causing a bottleneck queue to randomly
1464	   CE-mark a fraction of the packets.  A CE-mark on a Pure ACK might be
1465	   due to data packets in other flows through the same bottleneck, due
1466	   to data packets interspersed between Pure ACKs in the same half-
1467	   connection, or just due to the rate of Pure ACKs alone.  (RFC 3168
1468	   only considered the last possibility, which led to the argument that
1469	   ECN-enabled Pure ACKs had to be deferred, because ACK congestion
1470	   control was a research issue.)

1472	   If a host has been sending a mix of Pure ACKs and data, it doesn't
1473	   need to work out whether a particular CE mark was on a Pure ACK or
1474	   not; it just needs to respond to congestion feedback as a whole by
1475	   reducing its congestion window (cwnd), which limits the data it can
1476	   launch into flight through the congested bottleneck.  If it is purely
1477	   receiving data and sending only Pure ACKs, reducing cwnd will have
1478	   caused it no harm, having no effect on its ACK rate (the next
1479	   subsection addresses that).

1481	   However, when a host is sending data as well as Pure ACKs, it would
1482	   not be right for CE-marks on Pure ACKs and on data packets to induce
1483	   the same reduction in cwnd.  A possible way to address this issue
1484	   would be to weight the response by the size of the marked packets
1485	   (assuming the congestion control supports a weighted response, e.g.
1486	   [RFC8257]).  For instance, one could calculate the fraction of CE-
1487	   marked bytes (headers and data) over each round trip (say) as
1488	   follows:

1490	      (CE-marked header bytes + CE-marked data bytes) / (all header
1491	      bytes + all data bytes)

1493	   Header bytes can be calculated by multiplying a packet count by a
1494	   nominal header size, which is possible with AccECN feedback, because
1495	   it gives a count of CE-marked packets (as well as CE-marked bytes).

1497	   The above simple aggregate calculation caters for the full range of
1498	   scenarios; from all Pure ACKs to just a few interspersed with data
1499	   packets.
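
   The aggregate calculation above could be sketched as follows.  The
   function name and the nominal header size of 40 bytes are
   assumptions made for illustration.

```python
def ce_byte_fraction(ce_packets, ce_data_bytes, all_packets, all_data_bytes,
                     nominal_header=40):
    """Fraction of CE-marked bytes (headers and data) over an interval
    such as one round trip.  Header bytes are estimated as a packet
    count multiplied by a nominal header size."""
    total = all_packets * nominal_header + all_data_bytes
    if total == 0:
        return 0.0
    return (ce_packets * nominal_header + ce_data_bytes) / total
```

   With 10 Pure ACKs of which 2 are CE-marked, the fraction is 80/400 =
   0.2.  With 8 data packets of 1460 bytes plus 2 Pure ACKs, a single
   CE-mark on a Pure ACK gives 40/12080 (about 0.003), whereas a single
   CE-mark on a data packet gives 1500/12080 (about 0.124).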

1501	   Note that any mechanism that reduces cwnd due to CE-marked Pure ACKs
1502	   would need to be integrated with the congestion window validation
1503	   mechanism [RFC7661], which already conservatively reduces cwnd over
1504	   time because cwnd becomes stale if it is not used to fill the pipe.

1506	4.4.1.2.  ACK Rate Response to CE-Marked Pure ACKs

1508	   Reducing the congestion window will have no effect on the rate of
1509	   pure ACKs.  The worst case here is if the bottleneck is congested
1510	   solely with pure ACKs, but it could also be problematic if a large
1511	   fraction of the load was from unresponsive ACKs, leaving little or no
1512	   capacity for the load from responsive data.

1514	   Since RFC 3168 was published, experimental Acknowledgement Congestion
1515	   Control (AckCC) techniques have been documented in [RFC5690]
1516	   (informational).  So any pair of TCP end-points can choose to agree
1517	   to regulate the delayed ACK ratio in response to lost or CE-marked
1518	   pure ACKs.  However, the protocol has a number of open issues
1519	   concerning deployment (e.g. it requires support from both ends, it
1520	   relies on two new TCP options, one of which is required on the SYN
1521	   where option space is at a premium and, if either option is blocked
1522	   by a middlebox, no fall-back behaviour is specified).

1524	   The new TCP options address two problems, namely that TCP had: i) no
1525	   mechanism to allow ECT to be set on pure ACKs; and ii) no mechanism
1526	   to feed back loss or CE-marking of pure ACKs.  A combination of the
1527	   present specification and AccECN addresses both these problems, at
1528	   least for CE-marking.  So it might now be possible to design an ECN-
1529	   specific ACK congestion control scheme without the extra TCP options
1530	   proposed in RFC 5690.  However, such a mechanism is out of scope of
1531	   the present document.

1533	   Setting aside the practicality of RFC 5690, the need for AckCC has
1534	   not been conclusively demonstrated.  It has been argued that the
1535	   Internet has survived so far with no mechanism to even detect loss of
1536	   pure ACKs.  However, it has also been argued that ECN is not the same
1537	   as loss.  Packet discard can naturally thin the ACK load to whatever
1538	   the bottleneck can support, whereas ECN marking does not (it queues
1539	   the ACKs instead).  Nonetheless, RFC 3168 (section 7) recommends that
1540	   an AQM switches over from ECN marking to discard when the marking
1541	   probability becomes high.  Therefore discard can still be relied on
1542	   to thin out ECN-enabled pure ACKs as a last resort.

1544	4.4.2.  Summary: Enabling ECN on Pure ACKs

1546	   In the case when AccECN has been negotiated, it provides a feasible
1547	   congestion response mechanism, so the arguments for ECT on pure ACKs
1548	   heavily outweigh those against.  ECN is always more and never less
1549	   reliable for delivery of congestion notification.  A cwnd reduction
1550	   needs to be considered by congestion control designers as a response
1551	   to congestion on pure ACKs.  Separately, AckCC (or an improved
1552	   variant exploiting AccECN) could optionally be used to regulate the
1553	   spacing between pure ACKs.  However, it is not clear whether AckCC is
1554	   justified.  If it is not, packet discard will still act as the
1555	   "congestion response of last resort" by thinning out the traffic.  In
1556	   contrast, not setting ECT on pure ACKs is certainly detrimental to
1557	   performance, because when a pure ACK is lost it can prevent the
1558	   release of new data.

1560	   In the case when Classic ECN has been negotiated, the argument for
1561	   ECT on pure ACKs is less clear-cut.  Some of the installed base of
1562	   RFC 3168 implementations might happen to (unintentionally) provide a
1563	   feedback mechanism to support a cwnd response.  For those that did
1564	   not, setting ECT on pure ACKs would be better for the flow's own
1565	   performance than not setting it.  However, where there was no
1566	   feedback mechanism, setting ECT could do slightly more harm than not
1567	   setting it.  AckCC could provide a complementary response mechanism,
1568	   because it is designed to work with RFC 3168 ECN, but it has
1569	   deployment challenges.  In summary, a congestion response mechanism
1570	   is unlikely to be feasible with the installed base of classic ECN.

1572	   This specification uses a safe approach.  Allowing hosts to set ECT
1573	   on Pure ACKs without a feasible response mechanism could result in
1574	   risk.  It would certainly improve the flow's own performance, but it
1575	   would slightly increase potential harm to others.  Moreover, it would
1576	   set an undesirable precedent for setting ECT on packets with no
1577	   mechanism to respond to any resulting congestion signals.  Therefore,
1578	   Section 3.2.3 allows ECT on Pure ACKs if AccECN feedback has been
1579	   negotiated, but not with classic RFC 3168 ECN feedback.

1581	4.5.  Window Probes

1583	   Section 6.1.6 of RFC 3168 presents only the reliability argument for
1584	   prohibiting ECT on Window probes:

1586	      "If a window probe packet is dropped in the network, this loss is
1587	      not detected by the receiver.  Therefore, the TCP data sender MUST
1588	      NOT set either an ECT codepoint or the CWR bit on window probe
1589	      packets.

1591	      However, because window probes use exact sequence numbers, they
1592	      cannot be easily spoofed in denial-of-service attacks.  Therefore,
1593	      if a window probe arrives with the CE codepoint set, then the
1594	      receiver SHOULD respond to the ECN indications."

1596	   The reliability argument has already been addressed in Section 4.1.

1598	   Allowing ECT on window probes could considerably improve performance
1599	   because, once the receive window has reopened, if a window probe is
1600	   lost the sender will stall until the next window probe reaches the
1601	   receiver, which might be after the maximum retransmission timeout (at
1602	   least 1 minute [RFC6298]).

1604	   On the bright side, RFC 3168 at least specifies the receiver
1605	   behaviour if a CE-marked window probe arrives, so changing the
1606	   behaviour ought to be less painful than for other packet types.

1608	4.6.  FINs

1610	   RFC 3168 is silent on whether a TCP sender can set ECT on a FIN.  A
1611	   FIN is considered as part of the sequence of data, and the rate of
1612	   pure ACKs sent after a FIN could be controlled by a CE marking on the
1613	   FIN.  Therefore there is no reason not to set ECT on a FIN.

1615	4.7.  RSTs

1617	   RFC 3168 is silent on whether a TCP sender can set ECT on a RST.  The
1618	   host generating the RST message does not have an open connection
1619	   after sending it (either because there was no such connection when
1620	   the packet that triggered the RST message was received or because the
1621	   packet that triggered the RST message also triggered the closure of
1622	   the connection).

1624	   Moreover, the receiver of a CE-marked RST message can either: i)
1625	   accept the RST message and close the connection; ii) emit a so-called
1626	   challenge ACK in response (with suitable throttling) [RFC5961] and
1627	   otherwise ignore the RST (e.g. because the sequence number is in-
1628	   window but not the precise number expected next); or iii) discard the
1629	   RST message (e.g. because the sequence number is out-of-window).  In
1630	   the first two cases there is no point in echoing any CE mark received
1631	   because the sender closed its connection when it sent the RST.  In
1632	   the third case it makes sense to discard the CE signal as well as the
1633	   RST.
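
   The three cases can be summarised in the sketch below, which
   classifies an arriving RST by its sequence number in the manner of
   RFC 5961.  It is a simplification under stated assumptions: the
   function name is hypothetical, throttling of challenge ACKs is
   omitted, and the second element of the result only records that a
   CE mark on the RST is never echoed.

```python
SEQ_SPACE = 2 ** 32     # TCP sequence numbers wrap modulo 2^32


def handle_rst(seq, ce_marked, rcv_nxt, rcv_wnd):
    """Return (action, echo_ce) for an arriving RST segment."""
    if seq == rcv_nxt:
        action = "accept"           # i) close the connection
    elif (seq - rcv_nxt) % SEQ_SPACE < rcv_wnd:
        action = "challenge_ack"    # ii) in-window but not the next expected
    else:
        action = "discard"          # iii) out-of-window
    # No case echoes a CE mark, whatever the value of ce_marked.
    return action, False
```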

1635	   Although a congestion response following a CE-marking on a RST does
1636	   not appear to make sense, the following factors have been considered
1637	   before deciding whether the sender ought to set ECT on a RST message:

1639	   o  As explained above, a congestion response by the sender of a CE-
1640	      marked RST message is not possible;

1642	   o  So the only reason for the sender setting ECT on a RST would be to
1643	      improve the reliability of the message's delivery;

1645	   o  RST messages are used to both mount and mitigate attacks:

1647	      *  Spoofed RST messages are used by attackers to terminate ongoing
1648	         connections, although the mitigations in RFC 5961 have
1649	         considerably raised the bar against off-path RST attacks;

1651	      *  Legitimate RST messages allow endpoints to inform their peers
1652	         to eliminate existing state that corresponds to non-existent
1653	         connections, liberating resources, e.g. in DoS attack
1654	         scenarios;

1656	   o  AQMs are advised to disable ECN marking during persistent
1657	      overload, so:

1659	      *  it is harder for an attacker to exploit ECN to intensify an
1660	         attack;

1662	      *  it is harder for a legitimate user to exploit ECN to more
1663	         reliably mitigate an attack;

1665	   o  Prohibiting ECT on a RST would deny the benefit of ECN to
1666	      legitimate RST messages, but not to attackers who can disregard
1667	      RFCs;

1669	   o  If ECT were prohibited on RSTs:

1671	      *  it would be easy for security middleboxes to discard all ECN-
1672	         capable RSTs;

1674	      *  However, unlike a SYN flood, it is already easy for a security
1675	         middlebox (or host) to distinguish a RST flood from legitimate
1676	         traffic [RFC5961], and even if some legitimate RSTs are
1677	         accidentally removed as well, legitimate connections still
1678	         function.

1680	   So, on balance, it has been decided that it is worth experimenting
1681	   with ECT on RSTs.  During experiments, if the ECN capability on RSTs
1682	   is found to open a vulnerability that is hard to close, this decision
1683	   can be reversed, before it is specified for the standards track.

1685	4.8.  Retransmitted Packets

1687	   RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets.
1688	   The rationale for this consumes nearly 2 pages of RFC 3168, so the
1689	   reader is referred to section 6.1.5 of RFC 3168, rather than quoting
1690	   it all here.  There are essentially three arguments, namely:
1691	   reliability; DoS attacks; and over-reaction to congestion.  We
1692	   address them in order below.

1694	   The reliability argument has already been addressed in Section 4.1.

1696	   Protection against DoS attacks is not afforded by prohibiting ECT on
1697	   retransmitted packets.  An attacker can set CE on spoofed
1698	   retransmissions whether or not it is prohibited by an RFC.
1699	   Protection against the DoS attack described in section 6.1.5 of RFC
1700	   3168 is solely afforded by the requirement that "the TCP data
1701	   receiver SHOULD ignore the CE codepoint on out-of-window packets".
1702	   Therefore in Section 3.2.7 the sender is allowed to set ECT on
1703	   retransmitted packets, in order to reduce the chance of them being
1704	   dropped.  We also strengthen the receiver's requirement from "SHOULD
1705	   ignore" to "MUST ignore".  And we generalize the receiver's
1706	   requirement to include failure of any validity check, not just out-
1707	   of-window checks, in order to include the more stringent validity
1708	   checks in RFC 5961 that have been developed since RFC 3168.

1710	   A consequence is that, for those retransmitted packets that arrive at
1711	   the receiver after the original packet has been properly received
1712	   (so-called spurious retransmissions), any CE marking will be ignored.
1713	   There is no problem with that because the fact that the original
1714	   packet has been delivered implies that the sender's original
1715	   congestion response (when it deemed the packet lost and retransmitted
1716	   it) was unnecessary.
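
   As a purely illustrative sketch (not part of this specification),
   the receiver-side rule above could be expressed as follows in
   Python.  The Segment type and the names in_window() and ce_counts()
   are hypothetical.  The window test is a simplified form of the
   RFC 793 segment acceptability check that ignores sequence number
   wrap-around, and the additional RFC 5961 checks are omitted.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int         # sequence number of the first payload byte
    length: int      # payload length in bytes
    ce_marked: bool  # True if the IP-ECN field carried CE


def in_window(seg, rcv_nxt, rcv_wnd):
    """Simplified acceptability test (assumption: no wrap-around)."""
    if rcv_wnd == 0:
        # With a zero window only an empty segment at rcv_nxt is acceptable.
        return seg.length == 0 and seg.seq == rcv_nxt
    first = seg.seq
    last = seg.seq + max(seg.length, 1) - 1
    upper = rcv_nxt + rcv_wnd
    # Acceptable if either end of the segment falls inside the window.
    return rcv_nxt <= first < upper or rcv_nxt <= last < upper


def ce_counts(seg, rcv_nxt, rcv_wnd):
    """CE is only fed back if the segment passes the validity check."""
    return seg.ce_marked and in_window(seg, rcv_nxt, rcv_wnd)
```

   In this sketch, with rcv_nxt=2000 and rcv_wnd=5000, the CE mark on a
   spurious retransmission covering bytes 1000 to 1099 is ignored,
   whereas the CE mark on a segment starting at 2000 is counted.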

1718	   Finally, the third argument is about over-reacting to congestion.
1719	   The argument goes that, if a retransmitted packet is dropped, the
1720	   sender will not detect it, so it will not react again to congestion
1721	   (it would have reduced its congestion window already when it
1722	   retransmitted the packet).  Whereas, if retransmitted packets can be
1723	   CE-marked instead of dropped, senders could potentially react more
1724	   than once to congestion.  However, we argue that it is legitimate to
1725	   respond again to congestion if it still persists in subsequent round
1726	   trip(s).

1728	   Therefore, in all three cases, it is not incorrect to set ECT on
1729	   retransmissions.

1731	4.9.  General Fall-back for any Control Packet

1733	   Extensive experiments have found no evidence of any traversal
1734	   problems with ECT on any TCP control packet [Mandalari18].
1735	   Nonetheless, Sections 3.2.1.4 and 3.2.2.3 specify fall-back measures
1736	   if ECT on the first packet of each half-connection (SYN or SYN-ACK)
1737	   appears to be blocking progress.  Here, the question of fall-back
1738	   measures for ECT on other control packets is explored.  It supports
1739	   the advice given in Section 3.2.8; until there's evidence that
1740	   something's broken, don't fix it.

1742	   If an implementation has had to disable ECT to ensure the first
1743	   packet of a flow (SYN or SYN-ACK) gets through, the question arises
1744	   whether it ought to disable ECT on all subsequent control packets
1745	   within the same TCP connection.  Without evidence of any such
1746	   problems, this seems unnecessarily cautious, particularly given
1747	   that it would be hard to detect loss of most other types of TCP
1748	   control packets that are not ACK'd.  Moreover, unnecessarily
1749	   removing ECT from other control packets could lead to performance
1750	   problems, e.g. by directing them into another queue
1751	   [I-D.ietf-tsvwg-ecn-l4s-id] or over a different path, because some
1752	   broken multipath equipment (erroneously) routes based on all 8 bits
1753	   of the Diffserv field.

1755	   In the case where a connection starts without ECT on the SYN (perhaps
1756	   because problems with previous connections had been cached), there
1757	   will have been no test for ECT traversal in the client-server
1758	   direction until the pure ACK that completes the handshake.  It is
1759	   possible that some middlebox might block ECT on this pure ACK or on
1760	   later retransmissions of lost packets.  Similarly, after a route
1761	   change, the new path might include some middlebox that blocks ECT on
1762	   some or all TCP control packets.  However, without evidence of such
1763	   problems, the complexity of a fix does not seem worthwhile.

1765	      MORE MEASUREMENTS NEEDED (?): If further two-ended measurements do
1766	      find evidence for these traversal problems, measurements would be
1767	      needed to check for correlation of ECT traversal problems between
1768	      different control packets.  It might then be necessary to
1769	      introduce a catch-all fall-back rule that disables ECT on certain
1770	      subsequent TCP control packets based on some criteria developed
1771	      from these measurements.

1773	5.  Interaction with popular variants or derivatives of TCP

1775	   The following subsections discuss any interactions between setting
1776	   ECT on all packets and using the following popular variants of TCP:
1777	   IW10, TFO and L4S.  They also briefly note the possibility that the
1778	   principles applied here should translate to protocols derived from
1779	   TCP.  This section is informative not normative, because no
1780	   interactions have been identified that require any change to
1781	   specifications.  The subsection on IW10 discusses potential changes
1782	   to specifications but recommends that no changes are needed.

1784	   The designs of the following TCP variants have also been assessed and
1785	   found not to interact adversely with ECT on TCP control packets: SYN
1786	   cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]),
1787	   TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

1789	5.1.  IW10

1791	   IW10 is an experiment to determine whether it is safe for TCP to use
1792	   an initial window of 10 SMSS [RFC6928].

1794	   This subsection does not recommend any additions to the present
1795	   specification in order to interwork with IW10.  The specifications as
1796	   they stand are safe, and there is only a corner-case with ECT on the
1797	   SYN where performance could be occasionally improved, as explained
1798	   below.

1800	   As specified in Section 3.2.1.1, a TCP initiator will typically only
1801	   set ECT on the SYN if it requests AccECN support.  If, however, the
1802	   SYN-ACK tells the initiator that the responder does not support
1803	   AccECN, Section 3.2.1.1 advises the initiator to conservatively
1804	   reduce its initial window, preferably to 1 SMSS because, if the SYN
1805	   was CE-marked, the SYN-ACK has no way to feed that back.

1807	   If the initiator implements IW10, it seems rather over-conservative
1808	   to reduce IW from 10 to 1 just in case a congestion marking was
1809	   missed.  Nonetheless, a reduction to 1 SMSS will rarely harm
1810	   performance, because:

1812	   o  as long as the initiator is caching failures to negotiate AccECN,
1813	      subsequent attempts to access the same server will not use ECT on
1814	      the SYN anyway, so there will no longer be any need to
1815	      conservatively reduce IW;

1817	   o  currently, at least for web sessions, it is extremely rare for a
1818	      TCP initiator (client) to have more than one data segment to send
1819	      at the start of a TCP connection (see Fig 3 in [Manzoor17]); IW10
1820	      is primarily exploited by TCP servers.

1822	   If a responder receives feedback that the SYN-ACK was CE-marked,
1823	   Section 3.2.2.2 recommends that it reduces its initial window,
1824	   preferably to 1 SMSS.  When the responder also implements IW10, it
1825	   might again seem rather over-conservative to reduce IW from 10 to 1.
1826	   But in this case the rationale is somewhat different:

1828	   o  Feedback that the SYN-ACK was CE-marked is an explicit indication
1829	      that the queue has been building, not just uncertainty due to
1830	      absence of feedback;

1832	   o  Given it is now likely that a queue already exists, the more data
1833	      packets that the server sends in its IW, the more likely at least
1834	      one will be CE-marked, leading it to exit slow-start early.

1836	   Experimentation will be needed to determine the best strategy.  It
1837	   should be noted that experience from recent congestion avoidance
1838	   experiments where the window is reduced by less than half is not
1839	   necessarily applicable to a flow start scenario.  Reducing cwnd by
1840	   less is one thing.  Reducing an increase in cwnd by less is another.
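
   The initial window choices discussed in this subsection can be
   summarised in a minimal Python sketch.  It is illustrative only: the
   function names and parameters are hypothetical, the upper bound is
   the formula from Section 2 of [RFC6928], and the reduction to 1 SMSS
   is the "preferable" value mentioned above rather than a requirement.

```python
def rfc6928_iw(smss):
    """Initial window upper bound in bytes, per RFC 6928 Section 2."""
    return min(10 * smss, max(2 * smss, 14600))


def initiator_iw(smss, ect_on_syn, responder_supports_accecn):
    """Initiator's IW once the SYN-ACK has arrived.

    If the SYN carried ECT but the responder cannot feed back a
    possible CE mark (no AccECN), conservatively fall back to 1 SMSS.
    """
    if ect_on_syn and not responder_supports_accecn:
        return smss
    return rfc6928_iw(smss)


def responder_iw(smss, synack_ce_feedback):
    """Responder's IW once feedback about the SYN-ACK has arrived.

    Explicit feedback that the SYN-ACK was CE-marked indicates that a
    queue is already building, so reduce to 1 SMSS.
    """
    if synack_ce_feedback:
        return smss
    return rfc6928_iw(smss)
```

   For an SMSS of 1460 bytes, the unreduced window in this sketch is
   14600 bytes and the reduced window is 1460 bytes.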

1842	5.2.  TFO

1844	   TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round
1845	   trip delay of TCP's 3-way handshake (3WHS).  A TFO initiator caches
1846	   a cookie from a previous connection with a TFO-enabled server.  Then,
1847	   for subsequent connections to the same server, any data included on
1848	   the SYN can be passed directly to the server application, which can
1849	   then return up to an initial window of response data on the SYN-ACK
1850	   and on data segments straight after it, without waiting for the ACK
1851	   that completes the 3WHS.

1853	   The TFO experiment and the present experiment to add ECN support for
1854	   TCP control packets can be combined without altering either
1855	   specification, which is justified as follows:

1857	   o  The handling of ECN marking on a SYN is no different whether or
1858	      not it carries data.

1860	   o  In response to any CE-marking on the SYN-ACK, the responder adopts
1861	      the normal response to congestion, as discussed in Section 7.2 of
1862	      [RFC7413].

1864	5.3.  L4S

1866	   A Low Latency Low Loss Scalable throughput (L4S) variant of TCP such
1867	   as TCP Prague [PragueLinux] is mandated to negotiate AccECN feedback,
1868	   and strongly recommended to use ECN++ [I-D.ietf-tsvwg-ecn-l4s-id].

1870	   The L4S experiment and the present ECN++ experiment can be combined
1871	   without altering any of the specifications.  The only difference
1872	   would be in the recommendation of the best SYN cache strategy.

1874	   The normative specification for ECT on a SYN in Section 3.2.1
1875	   recommends the "optimistic ECT and cache failures" strategy (S2B
1876	   defined in Section 4.2.3) for the general Internet.  However, if a
1877	   user's Internet access bottleneck supported L4S ECN but not Classic
1878	   ECN, the "optimistic ECT without a cache" strategy (S2A) would make
1879	   most sense, because there would be little point trying to avoid the
1880	   'over-strict' test and negotiate Classic ECN, if L4S ECN but not
1881	   Classic ECN was available on that user's access link (as is the case
1882	   with Low Latency DOCSIS [DOCSIS3.1]).

1884	   Strategy (S2A) is the simplest, because it requires no cache.  It
1885	   would satisfy the goal of an implementer who is solely interested in
1886	   ultra-low latency using AccECN and ECN++ (e.g. accessing L4S servers)
1887	   and is not concerned about fall-back to Classic ECN (e.g. when
1888	   accessing other servers).
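
   The difference between the two strategies can be sketched in Python
   as follows.  This is a simplified illustration under stated
   assumptions, not the normative behaviour of Section 3.2.1: the class
   and method names are hypothetical, a failure is taken to mean that
   AccECN was not negotiated, and cache expiry is not modelled.

```python
class SynEctPolicy:
    """Decide whether to set ECT on a SYN sent to a given server."""

    def __init__(self, use_cache):
        # use_cache=True approximates S2B (optimistic ECT, cache failures).
        # use_cache=False approximates S2A (optimistic ECT, no cache).
        self.use_cache = use_cache
        self.failed_servers = set()

    def ect_on_syn(self, server):
        """Be optimistic unless a failure is cached for this server."""
        return not (self.use_cache and server in self.failed_servers)

    def record_result(self, server, accecn_negotiated):
        """Remember servers that failed to negotiate AccECN (S2B only)."""
        if self.use_cache and not accecn_negotiated:
            self.failed_servers.add(server)
```

   In this sketch, an S2B policy stops setting ECT on SYNs to a server
   after one recorded failure, while an S2A policy keeps setting ECT on
   every SYN and holds no per-server state.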

1890	5.4.  Other transport protocols

1892	   Experience from experiments on adding ECN support to all TCP packets
1893	   ought to be directly transferable between TCP and other transport
1894	   protocols, like SCTP or QUIC.

1896	   Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
1897	   track transport protocol derived from TCP.  SCTP currently does not
1898	   include ECN support, but Appendix A of RFC 4960 broadly describes how
1899	   it would be supported and a (long-expired) draft on the addition of
1900	   ECN to SCTP has been produced [I-D.stewart-tsvwg-sctpecn].  This
1901	   draft avoided setting ECT on control packets and retransmissions,
1902	   closely following the arguments in RFC 3168.

1904	   QUIC [I-D.ietf-quic-transport] is another standards track transport
1905	   protocol offering similar services to TCP but intended to exploit
1906	   some of the benefits of running over UDP.  Building on the arguments
1907	   in the current draft, a QUIC sender sets ECT(0) on all packets.

1909	6.  Security Considerations

1911	   Section 3.2.6 considers the question of whether ECT on RSTs will
1912	   allow RST attacks to be intensified.  There are several security
1913	   arguments presented in RFC 3168 for preventing the ECN marking of TCP
1914	   control packets and retransmitted segments.  We believe all of them
1915	   have been properly addressed in Section 4, particularly Section 4.2.4
1916	   and Section 4.8 on DoS attacks using spoofed ECT-marked SYNs and
1917	   spoofed CE-marked retransmissions.

1919	7.  IANA Considerations

1921	   There are no IANA considerations in this memo.

1923	8.  Acknowledgments

1925	   Thanks to Mirja Kuehlewind, David Black, Padma Bhooma, Gorry
1926	   Fairhurst, Michael Scharf, Yuchung Cheng and Christoph Paasch for
1927	   their useful reviews.

1929	   The work of Marcelo Bagnulo has been performed in the framework of
1930	   the H2020-ICT-2014-2 project 5G NORMA.  His contribution reflects the
1931	   consortium's view, but the consortium is not liable for any use that
1932	   may be made of any of the information contained therein.

1934	   Bob Briscoe's contribution was partly funded by the Research Council
1935	   of Norway through the TimeIn project, partly by CableLabs and partly
1936	   by the Comcast Innovation Fund.  The views expressed here are solely
1937	   those of the authors.

1939	9.  References

1941	9.1.  Normative References

1943	   [I-D.ietf-tcpm-accurate-ecn]
1944	              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
1945	              Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate-
1946	              ecn-09 (work in progress), July 2019.

1948	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1949	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1950	              <https://www.rfc-editor.org/info/rfc793>.

1952	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1953	              Requirement Levels", BCP 14, RFC 2119,
1954	              DOI 10.17487/RFC2119, March 1997,
1955	              <https://www.rfc-editor.org/info/rfc2119>.

1957	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1958	              of Explicit Congestion Notification (ECN) to IP",
1959	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1960	              <https://www.rfc-editor.org/info/rfc3168>.

1962	   [RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
1963	              Robustness to Blind In-Window Attacks", RFC 5961,
1964	              DOI 10.17487/RFC5961, August 2010,
1965	              <https://www.rfc-editor.org/info/rfc5961>.

1967	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1968	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1969	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

1971	   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
1972	              Notification (ECN) Experimentation", RFC 8311,
1973	              DOI 10.17487/RFC8311, January 2018,
1974	              <https://www.rfc-editor.org/info/rfc8311>.

1976	9.2.  Informative References

1978	   [DOCSIS3.1]
1979	              CableLabs, "MAC and Upper Layer Protocols Interface
1980	              (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable
1981	              Service Interface Specifications DOCSIS(R) 3.1 Version i17
1982	              or later, January 2019, <https://specification-
1983	              search.cablelabs.com/CM-SP-MULPIv3.1>.

1985	   [ecn-overload]
1986	              Steen, H., "Destruction Testing: Ultra-Low Delay using
1987	              Dual Queue Coupled Active Queue Management", Masters
1988	              Thesis, Uni Oslo, May 2017,
1989	              <https://www.duo.uio.no/bitstream/handle/10852/57424/
1990	              thesis-henrste.pdf?sequence=1>.

1992	   [ecn-pam]  Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
1993	              Fairhurst, G., and R. Scheffenegger, "Enabling Internet-
1994	              Wide Deployment of Explicit Congestion Notification",
1995	              Int'l Conf. on Passive and Active Network Measurement
1996	              (PAM'15) pp193-205, 2015, <https://link.springer.com/
1997	              chapter/10.1007/978-3-319-15509-8_15>.

1999	   [ECN-PLUS]
2000	              Kuzmanovic, A., "The Power of Explicit Congestion
2001	              Notification", ACM SIGCOMM 35(4):61--72, 2005,
2002	              <http://dl.acm.org/citation.cfm?id=1080100>.

2004	   [I-D.ietf-quic-transport]
2005	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
2006	              and Secure Transport", draft-ietf-quic-transport-23 (work
2007	              in progress), September 2019.

2009	   [I-D.ietf-tsvwg-ecn-l4s-id]
2010	              De Schepper, K. and B. Briscoe, "Identifying Modified
2011	              Explicit Congestion Notification (ECN) Semantics for
2012	              Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-
2013	              id-07 (work in progress), July 2019.

2015	   [I-D.ietf-tsvwg-l4s-arch]
2016	              Briscoe, B., De Schepper, K., Bagnulo, M., and G. White,
2017	              "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
2018	              Service: Architecture", draft-ietf-tsvwg-l4s-arch-04 (work
2019	              in progress), July 2019.

2021	   [I-D.stewart-tsvwg-sctpecn]
2022	              Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
2023	              Control Transmission Protocol (SCTP)", draft-stewart-
2024	              tsvwg-sctpecn-05 (work in progress), January 2014.

2026	   [judd-nsdi]
2027	              Judd, G., "Attaining the promise and avoiding the pitfalls
2028	              of TCP in the Datacenter", USENIX Symposium on Networked
2029	              Systems Design and Implementation (NSDI'15) pp.145-157,
2030	              May 2015, <https://www.usenix.org/node/188966>.

2032	   [Kuehlewind18]
2033	              Kuehlewind, M., Walter, M., Learmonth, I., and B.
2034	              Trammell, "Tracing Internet Path Transparency", In Proc:
2035	              Network Traffic Measurement and Analysis Conference (TMA)
2036	              2018, June 2018, <http://tma.ifip.org/2018/wp-
2037	              content/uploads/sites/3/2018/06/tma2018_paper12.pdf>.

2039	   [Mandalari18]
2040	              Mandalari, A., Lutu, A., Briscoe, B., Bagnulo, M., and Oe.
2041	              Alay, "Measuring ECN++: Good News for ++, Bad News for ECN
2042	              over Mobile", IEEE Communications Magazine, March 2018,
2043	              <https://ieeexplore.ieee.org/document/8316790>.

2045	   [Manzoor17]
2046	              Manzoor, J., Drago, I., and R. Sadre, "How HTTP/2 is
2047	              changing Web traffic and how to detect it", In Proc:
2048	              Network Traffic Measurement and Analysis Conference (TMA)
2049	              2017 pp.1-9, June 2017,
2050	              <https://ieeexplore.ieee.org/document/8002899>.

2052	   [PragueLinux]
2053	              Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,
2054	              Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing
2055	              the `TCP Prague' Requirements for Low Latency Low Loss
2056	              Scalable Throughput (L4S)", Proc. Linux Netdev 0x13,
2057	              March 2019, <https://www.netdevconf.org/0x13/
2058	              session.html?talk-tcp-prague-l4s>.

2060	   [relax-strict-ecn]
2061	              Tilmans, O., "tcp: Accept ECT on SYN in the presence of
2062	              RFC8311", Linux netdev patch list, April 2019,
2063	              <https://lore.kernel.org/patchwork/patch/1057812/>.

2065	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
2066	              Communication Layers", STD 3, RFC 1122,
2067	              DOI 10.17487/RFC1122, October 1989,
2068	              <https://www.rfc-editor.org/info/rfc1122>.

2070	   [RFC2140]  Touch, J., "TCP Control Block Interdependence", RFC 2140,
2071	              DOI 10.17487/RFC2140, April 1997,
2072	              <https://www.rfc-editor.org/info/rfc2140>.

2074	   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
2075	              Congestion Notification (ECN) Signaling with Nonces",
2076	              RFC 3540, DOI 10.17487/RFC3540, June 2003,
2077	              <https://www.rfc-editor.org/info/rfc3540>.

2079	   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
2080	              RFC 4960, DOI 10.17487/RFC4960, September 2007,
2081	              <https://www.rfc-editor.org/info/rfc4960>.

2083	   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
2084	              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007,
2085	              <https://www.rfc-editor.org/info/rfc4987>.

2087	   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
2088	              Ramakrishnan, "Adding Explicit Congestion Notification
2089	              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
2090	              DOI 10.17487/RFC5562, June 2009,
2091	              <https://www.rfc-editor.org/info/rfc5562>.

2093	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
2094	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
2095	              <https://www.rfc-editor.org/info/rfc5681>.

2097	   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
2098	              Acknowledgement Congestion Control to TCP", RFC 5690,
2099	              DOI 10.17487/RFC5690, February 2010,
2100	              <https://www.rfc-editor.org/info/rfc5690>.

2102	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
2103	              "Computing TCP's Retransmission Timer", RFC 6298,
2104	              DOI 10.17487/RFC6298, June 2011,
2105	              <https://www.rfc-editor.org/info/rfc6298>.

2107	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
2108	              "Increasing TCP's Initial Window", RFC 6928,
2109	              DOI 10.17487/RFC6928, April 2013,
2110	              <https://www.rfc-editor.org/info/rfc6928>.

2112	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
2113	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
2114	              <https://www.rfc-editor.org/info/rfc7413>.

2116	   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
2117	              Recommendations Regarding Active Queue Management",
2118	              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
2119	              <https://www.rfc-editor.org/info/rfc7567>.

2121	   [RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
2122	              TCP to Support Rate-Limited Traffic", RFC 7661,
2123	              DOI 10.17487/RFC7661, October 2015,
2124	              <https://www.rfc-editor.org/info/rfc7661>.

2126	   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
2127	              and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
2128	              Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
2129	              October 2017, <https://www.rfc-editor.org/info/rfc8257>.

2131	   [strict-ecn]
2132	              Dumazet, E., "tcp: be more strict before accepting ECN
2133	              negociation", Linux netdev patch list, May 2012,
2134	              <https://patchwork.ozlabs.org/patch/156953/>.

2136	Authors' Addresses

2138	   Marcelo Bagnulo
2139	   Universidad Carlos III de Madrid
2140	   Av. Universidad 30
2141	   Leganes, Madrid  28911
2142	   SPAIN

2144	   Phone: 34 91 6249500
2145	   Email: marcelo@it.uc3m.es
2146	   URI:   http://www.it.uc3m.es

2148	   Bob Briscoe
2149	   Independent
2150	   UK

2152	   Email: ietf@bobbriscoe.net
2153	   URI:   http://bobbriscoe.net/