idnits 2.17.1 

draft-ietf-tcpm-generalized-ecn-09.txt:
-(2150): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 6 instances of lines with non-ascii characters in the document.


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC5562, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (31 January 2022) is 816 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tcpm-accurate-ecn-15

  ** Obsolete normative reference: RFC  793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)

  -- Obsolete informational reference (is this intentional?): RFC 2140
     (Obsoleted by RFC 9040)

  == Outdated reference: A later version (-29) exists of
     draft-ietf-tsvwg-ecn-l4s-id-23

  == Outdated reference: A later version (-20) exists of
     draft-ietf-tsvwg-l4s-arch-15

  == Outdated reference: A later version (-07) exists of
     draft-stewart-tsvwg-sctpecn-05


     Summary: 1 error (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         M. Bagnulo
3	Internet-Draft                                                      UC3M
4	Obsoletes: 5562 (if approved)                                 B. Briscoe
5	Intended status: Experimental                                Independent
6	Expires: 4 August 2022                                   31 January 2022

8	  ECN++: Adding Explicit Congestion Notification (ECN) to TCP Control
9	                                Packets
10	                   draft-ietf-tcpm-generalized-ecn-09

12	Abstract

14	   This document describes an experimental modification to ECN when used
15	   with TCP.  It allows the use of ECN on the following TCP packets:
16	   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

18	Status of This Memo

20	   This Internet-Draft is submitted in full conformance with the
21	   provisions of BCP 78 and BCP 79.

23	   Internet-Drafts are working documents of the Internet Engineering
24	   Task Force (IETF).  Note that other groups may also distribute
25	   working documents as Internet-Drafts.  The list of current Internet-
26	   Drafts is at https://datatracker.ietf.org/drafts/current/.

28	   Internet-Drafts are draft documents valid for a maximum of six months
29	   and may be updated, replaced, or obsoleted by other documents at any
30	   time.  It is inappropriate to use Internet-Drafts as reference
31	   material or to cite them other than as "work in progress."

33	   This Internet-Draft will expire on 4 August 2022.

35	Copyright Notice

37	   Copyright (c) 2022 IETF Trust and the persons identified as the
38	   document authors.  All rights reserved.

40	   This document is subject to BCP 78 and the IETF Trust's Legal
41	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
42	   license-info) in effect on the date of publication of this document.
43	   Please review these documents carefully, as they describe your rights
44	   and restrictions with respect to this document.  Code Components
45	   extracted from this document must include Revised BSD License text as
46	   described in Section 4.e of the Trust Legal Provisions and are
47	   provided without warranty as described in the Revised BSD License.

49	   This document may contain material from IETF Documents or IETF
50	   Contributions published or made publicly available before November
51	   10, 2008.  The person(s) controlling the copyright in some of this
52	   material may not have granted the IETF Trust the right to allow
53	   modifications of such material outside the IETF Standards Process.
54	   Without obtaining an adequate license from the person(s) controlling
55	   the copyright in such materials, this document may not be modified
56	   outside the IETF Standards Process, and derivative works of it may
57	   not be created outside the IETF Standards Process, except to format
58	   it for publication as an RFC or to translate it into languages other
59	   than English.

61	Table of Contents

63	   1.  Introduction
64	     1.1.  Motivation
65	     1.2.  Experiment Goals
66	     1.3.  Document Structure
67	   2.  Terminology
68	   3.  Specification
69	     3.1.  Network (e.g. Firewall) Behaviour
70	     3.2.  Sender Behaviour
71	       3.2.1.  SYN (Send)
72	       3.2.2.  SYN-ACK (Send)
73	       3.2.3.  Pure ACK (Send)
74	       3.2.4.  Window Probe (Send)
75	       3.2.5.  FIN (Send)
76	       3.2.6.  RST (Send)
77	       3.2.7.  Retransmissions (Send)
78	       3.2.8.  General Fall-back for any Control Packet or
79	               Retransmission
80	     3.3.  Receiver Behaviour
81	       3.3.1.  Receiver Behaviour for Any TCP Control Packet or
82	               Retransmission
83	       3.3.2.  SYN (Receive)
84	       3.3.3.  Pure ACK (Receive)
85	       3.3.4.  FIN (Receive)
86	       3.3.5.  RST (Receive)
87	       3.3.6.  Retransmissions (Receive)
88	   4.  Rationale
89	     4.1.  The Reliability Argument
90	     4.2.  SYNs
91	       4.2.1.  Argument 1a: Unrecognized CE on the SYN
92	       4.2.2.  Argument 1b: ECT Considered Invalid on the SYN
93	       4.2.3.  Caching Strategies for ECT on SYNs
94	       4.2.4.  Argument 2: DoS Attacks
95	     4.3.  SYN-ACKs
96	       4.3.1.  Possibility of Unrecognized CE on the SYN-ACK
97	       4.3.2.  Response to Congestion on a SYN-ACK
98	       4.3.3.  Fall-Back if ECT SYN-ACK Fails
99	     4.4.  Pure ACKs
100	       4.4.1.  Mechanisms to Respond to CE-Marked Pure ACKs
101	       4.4.2.  Summary: Enabling ECN on Pure ACKs
102	     4.5.  Window Probes
103	     4.6.  FINs
104	     4.7.  RSTs
105	     4.8.  Retransmitted Packets.
106	     4.9.  General Fall-back for any Control Packet
107	   5.  Interaction with popular variants or derivatives of TCP
108	     5.1.  IW10
109	     5.2.  TFO
110	     5.3.  L4S
111	     5.4.  Other transport protocols
112	   6.  Security Considerations
113	   7.  IANA Considerations
114	   8.  Acknowledgments
115	   9.  References
116	     9.1.  Normative References
117	     9.2.  Informative References
118	   Authors' Addresses

120	1.  Introduction

122	   RFC 3168 [RFC3168] specifies support of Explicit Congestion
123	   Notification (ECN) in IP (v4 and v6).  By using the ECN capability,
124	   network elements (e.g. routers, switches) performing Active Queue
125	   Management (AQM) can use ECN marks instead of packet drops to signal
126	   congestion to the endpoints of a communication.  This results in
127	   lower packet loss and increased performance.  RFC 3168 also specifies
128	   support for ECN in TCP, but solely on data packets.  For various
129	   reasons it precludes the use of ECN on TCP control packets (TCP SYN,
130	   TCP SYN-ACK, pure ACKs, Window probes) and on retransmitted packets.
131	   RFC 3168 is silent about the use of ECN on RST and FIN packets.  RFC
132	   5562 [RFC5562] is an experimental modification to ECN that enables
133	   ECN support for TCP SYN-ACK packets.

135	   This document defines an experimental modification to ECN [RFC3168]
136	   that shall be called ECN++.  It enables ECN support on all the
137	   aforementioned types of TCP packet.  RFC 5562 (which was called ECN+)
138	   is obsoleted by the present specification, because it has the same
139	   goal of enabling ECT, but on only one type of control packet.  The
140	   mechanisms proposed in this document have been defined conservatively
141	   and with safety in mind, possibly in some cases at the expense of
142	   performance.

144	   ECN++ uses a sender-only deployment model.  It works whether the two
145	   ends of the TCP connection use classic ECN feedback [RFC3168] or
146	   Accurate ECN feedback (AccECN [I-D.ietf-tcpm-accurate-ecn]), the two
147	   ECN feedback mechanisms for TCP being standardized at the time of
148	   writing.

150	   Using ECN on initial SYN packets provides significant benefits, as we
151	   describe in the next subsection.  However, only AccECN provides a way
152	   to feed back whether the SYN was CE marked, and RFC 3168 does not.
153	   Therefore, implementers of ECN++ are RECOMMENDED to also implement
154	   AccECN.  Conversely, if AccECN (or an equivalent safety mechanism) is
155	   not implemented with ECN++, this specification rules out ECN on the
156	   SYN.

158	   ECN++ is designed for compatibility with a number of latency
159	   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
160	   window of 10 SMSS (IW10 [RFC6928]) and Low latency Low Loss Scalable
161	   Transport (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all be
162	   implemented and deployed independently.  [RFC8311] is a standards
163	   track procedural device that relaxes requirements in RFC 3168 and
164	   other standards track RFCs that would otherwise preclude the
165	   experimental modifications needed for ECN++ and other ECN
166	   experiments.

168	1.1.  Motivation

170	   The absence of ECN support on TCP control packets and retransmissions
171	   has a potentially harmful effect.  In any ECN deployment, non-
172	   ECN-capable packets suffer a penalty when they traverse a congested
173	   bottleneck.  For instance, with a drop probability of 1%, 1% of
174	   connection attempts suffer a timeout of about 1 second before the SYN
175	   is retransmitted, which is highly detrimental to the performance of
176	   short flows.  TCP control packets, particularly TCP SYNs and SYN-
177	   ACKs, are important for performance, so dropping them is best
178	   avoided.

180	   Not using ECN on control packets can be particularly detrimental to
181	   performance in environments where the ECN marking level is high.  For
182	   example, [judd-nsdi] shows that in a controlled private data centre
183	   (DC) environment where ECN is used (in conjunction with DCTCP
184	   [RFC8257]), the probability of being able to establish a new
185	   connection using a non-ECN SYN packet drops to close to zero even
186	   when there are only 16 ongoing TCP flows transmitting at full speed.
187	   The issue is that DCTCP exhibits a much more aggressive response to
188	   packet marking (which is why it is only applicable in controlled
189	   environments).  This leads to a high marking probability for ECN-
190	   capable packets, and in turn a high drop probability for non-ECN
191	   packets.  Therefore non-ECN SYNs are dropped aggressively, rendering
192	   it nearly impossible to establish a new connection in the presence of
193	   even mild traffic load.

195	   Finally, there are ongoing experimental efforts to promote the
196	   adoption of a slightly modified variant of DCTCP (and similar
197	   congestion controls) over the Internet to achieve low latency, low
198	   loss and scalable throughput (L4S) for all communications
199	   [I-D.ietf-tsvwg-l4s-arch].  In such an approach, L4S packets identify
200	   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id].  With
201	   L4S, preventing TCP control packets from obtaining the benefits of
202	   ECN would not only expose them to the prevailing level of congestion
203	   loss, but it would also classify them into a different queue.  Then
204	   only L4S data packets would be classified into the L4S queue that is
205	   expected to have lower latency, while the packets controlling and
206	   retransmitting these data packets would still get stuck behind the
207	   queue induced by non-L4S-enabled TCP traffic.

209	1.2.  Experiment Goals

211	   The goal of the experimental modifications defined in this document
212	   is to allow the use of ECN on all TCP packets.  Experiments are
213	   expected in the public Internet as well as in controlled environments
214	   to understand the following issues:

216	   *  How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
217	      that carry the ECT(0), ECT(1) or CE codepoints are processed by
218	      the TCP endpoints and the network (including routers, firewalls
219	      and other middleboxes).  In particular we would like to learn if
220	      these packets are frequently blocked or if these packets are
221	      usually forwarded and processed.

223	   *  The scale of deployment of the different flavours of ECN,
224	      including [RFC3168], [RFC5562], [RFC3540] and
225	      [I-D.ietf-tcpm-accurate-ecn].

227	   *  How much the performance of TCP communications is improved by
228	      allowing ECN marking of each packet type.

230	   *  To identify any issues (including security issues) raised by
231	      enabling ECN marking of these packets.

233	   *  To conduct the specific experiments identified in the text by the
234	      strings "EXPERIMENTATION NEEDED" or "MEASUREMENTS NEEDED".

236	   The data gathered through the experiments described in this document,
237	   particularly under the first 2 bullets above, will help in the
238	   redesign of the final mechanism (if needed) for adding ECN support to
239	   the different packet types considered in this document.

241	   Success criteria: The experiment will be a success if we obtain
242	   enough data to have a clearer view of the deployability and benefits
243	   of enabling ECN on all TCP packets, as well as any issues.  If the
244	   results of the experiment show that it is feasible to deploy such
245	   changes; that there are gains to be achieved through the changes
246	   described in this specification; and that no other major issues may
247	   interfere with the deployment of the proposed changes; then it would
248	   be reasonable to adopt the proposed changes in a standards track
249	   specification that would update RFC 3168.

251	1.3.  Document Structure

253	   The remainder of this document is structured as follows.  In
254	   Section 2, we present the terminology used in the rest of the
255	   document.  In Section 3, we specify the modifications to provide ECN
256	   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
257	   retransmissions.  We describe both the network behaviour and the
258	   endpoint behaviour.  RFC 3168 provides a number of specific reasons
259	   why ECN support is not appropriate for each packet type.  In
260	   Section 4, we revisit each of these arguments for each packet type
261	   to justify why it is reasonable to conduct this experiment.
262	   Section 5 discusses variations of the specification that will be
263	   necessary to interwork with a number of popular variants or
264	   derivatives of TCP.

266	2.  Terminology

268	   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
269	   SHOULD NOT, RECOMMENDED, NOT RECOMMENDED, MAY, and OPTIONAL in this
270	   document are to be interpreted as described in BCP 14 [RFC2119] when
271	   and only when they appear in all capitals [RFC8174].

273	   Pure ACK: A TCP segment with the ACK flag set and no data payload.

275	   SYN: A TCP segment with the SYN (synchronize) flag set.

277	   Window probe: Defined in [RFC0793], a window probe is a TCP segment
278	   with only one byte of data sent to learn if the receive window is
279	   still zero.

281	   FIN: A TCP segment with the FIN (finish) flag set.

283	   RST: A TCP segment with the RST (reset) flag set.

285	   Retransmission: A TCP segment that has been retransmitted by the TCP
286	   sender.

288	   TCP client: The initiating end of a TCP connection.  Also called the
289	   initiator.

291	   TCP server: The responding end of a TCP connection.  Also called the
292	   responder.

294	   ECT: ECN-Capable Transport.  One of the two codepoints ECT(0) or
295	   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6).  An
296	   ECN-capable sender sets one of these to indicate that both transport
297	   end-points support ECN.  When this specification says the sender sets
298	   an ECT codepoint, by default it means ECT(0).  Optionally, it could
299	   mean ECT(1), which is in the process of being redefined for use by
300	   L4S experiments [RFC8311] [I-D.ietf-tsvwg-ecn-l4s-id].

302	   Not-ECT: The ECN codepoint set by senders that indicates that the
303	   transport is not ECN-capable.

305	   CE: Congestion Experienced.  The ECN codepoint that an intermediate
306	   node sets to indicate congestion [RFC3168].  A node sets an
307	   increasing proportion of ECT packets to CE as the level of congestion
308	   increases.
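
   The ECN codepoints defined above can be illustrated with a minimal
   sketch.  The two-bit values are those assigned by RFC 3168; the
   names EcnCodepoint, ecn_from_tos, is_ect and mark_ce are
   illustrative assumptions and are not part of this specification.

```python
from enum import IntEnum


class EcnCodepoint(IntEnum):
    """Two-bit ECN field values assigned by RFC 3168."""
    NOT_ECT = 0b00  # transport is not ECN-capable
    ECT_1 = 0b01    # ECN-Capable Transport, ECT(1)
    ECT_0 = 0b10    # ECN-Capable Transport, ECT(0), the default here
    CE = 0b11       # Congestion Experienced, set by a congested node


def ecn_from_tos(tos_byte: int) -> EcnCodepoint:
    """Extract the ECN field, i.e. the two least significant bits of
    the IPv4 TOS octet or the IPv6 Traffic Class octet."""
    return EcnCodepoint(tos_byte & 0b11)


def is_ect(codepoint: EcnCodepoint) -> bool:
    """True for either of the two ECT codepoints."""
    return codepoint in (EcnCodepoint.ECT_0, EcnCodepoint.ECT_1)


def mark_ce(tos_byte: int) -> int:
    """Rewrite an ECT packet's ECN field to CE, as a congested node may.

    Packets that are Not-ECT (or already CE) are returned unchanged;
    a real node would drop a Not-ECT packet instead of marking it.
    """
    if is_ect(ecn_from_tos(tos_byte)):
        return tos_byte | int(EcnCodepoint.CE)
    return tos_byte
```

   In a real node only a proportion of ECT packets is marked, and that
   proportion grows with the level of congestion; mark_ce shows only
   the per-packet rewrite.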

310	3.  Specification

312	   The experimental ECN++ changes to the specification of TCP over ECN
313	   [RFC3168] defined here primarily alter the behaviour of the sending
314	   host for each half-connection.  However, there are subsections for
315	   forwarding elements and receivers below, which recommend that they
316	   accept the new packets - they should do already, but might not.  This
317	   will allow implementers to check the receive side code while they are
318	   altering the send-side code.  All changes can be deployed at each
319	   end-point independently of others and independent of any network
320	   behaviour.

322	   The feedback behaviour at the receiver depends on whether classic ECN
323	   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
324	   [I-D.ietf-tcpm-accurate-ecn] has been negotiated.  Nonetheless,
325	   neither receiver feedback behaviour is altered by the present
326	   specification.

328	3.1.  Network (e.g. Firewall) Behaviour

330	   Previously the specification of ECN for TCP [RFC3168] required the
331	   sender to set not-ECT on TCP control packets and retransmissions.
332	   Some readers of RFC 3168 might have erroneously interpreted this as a
333	   requirement for firewalls, intrusion detection systems, etc. to check
334	   and enforce this behaviour.  Section 4.3 of [RFC8311] updates RFC
335	   3168 to remove this ambiguity.  It requires firewalls or any
336	   intermediate nodes not to treat certain types of ECN-capable TCP
337	   segment differently (except potentially in one attack scenario).
338	   This is likely to require a firewall rule change in only a small
339	   fraction of cases (at most 0.4% of paths according to the tests
340	   reported in Section 4.2.2).

342	   In case a TCP sender encounters a middlebox blocking ECT on certain
343	   TCP segments, the specification below includes behaviour to fall back
344	   to non-ECN.  However, this loses the benefit of ECN on control
345	   packets.  So operators are RECOMMENDED to alter their firewall rules
346	   to comply with the requirement referred to above (section 4.3 of
347	   [RFC8311]).

349	3.2.  Sender Behaviour

351	   For each type of control packet or retransmission, the following
352	   sections detail changes to the sender's behaviour in two respects: i)
353	   whether it sets ECT; and ii) its response to congestion feedback.
354	   Table 1 summarises these two behaviours for each type of packet, but
355	   the relevant subsection below should be referred to for the detailed
356	   behaviour.  The subsection on the SYN is more complex than the
357	   others, because it has to include fall-back behaviour if the ECT
358	   packet appears not to have got through, and caching of the outcome to
359	   detect persistent failures.

361	    +============+==============+==============+======================+
362	    | TCP packet | ECN field if | ECN field if | Congestion Response  |
363	    | type       | AccECN f/b   | RFC3168 f/b  |                      |
364	    |            | negotiated*  | negotiated*  |                      |
365	    +============+==============+==============+======================+
366	    | SYN        | ECT          | not-ECT      | If AccECN, reduce IW |
367	    +------------+--------------+--------------+----------------------+
368	    | SYN-ACK    | ECT          | ECT          | Reduce IW            |
369	    +------------+--------------+--------------+----------------------+
370	    | Pure ACK   | ECT          | not-ECT      | If AccECN, usual     |
371	    |            |              |              | cwnd response and    |
372	    |            |              |              | optionally [RFC5690] |
373	    +------------+--------------+--------------+----------------------+
374	    | W Probe    | ECT          | ECT          | Usual cwnd response  |
375	    +------------+--------------+--------------+----------------------+
376	    | FIN        | ECT          | ECT          | None or optionally   |
377	    |            |              |              | [RFC5690]            |
378	    +------------+--------------+--------------+----------------------+
379	    | RST        | ECT          | ECT          | N/A                  |
380	    +------------+--------------+--------------+----------------------+
381	    | Re-XMT     | ECT          | ECT          | Usual cwnd response  |
382	    +------------+--------------+--------------+----------------------+

384	          Table 1: Summary of sender behaviour.  In each case the
385	       relevant section below should be referred to for the detailed
386	                                 behaviour

388	   Window probe and retransmission are abbreviated to W Probe and Re-XMT.
389	   * For a SYN, "negotiated" means "requested".

391	   It can be seen that we recommend against the sender setting ECT on
392	   the SYN if it is not requesting AccECN feedback.  Therefore it is
393	   RECOMMENDED that the AccECN specification
394	   [I-D.ietf-tcpm-accurate-ecn] is implemented, along with the ECN++
395	   experiment, because it is expected that ECT on the SYN will give the
396	   most significant performance gain, particularly for short flows.

398	   Nonetheless, this specification also caters for the case where an
399	   ECN++ TCP sender is not using AccECN.  This could be because it does
400	   not support AccECN or because the other end of the TCP connection
401	   does not (AccECN can only be used for a connection if both ends
402	   support it).

404	   Note that Table 1 does not imply any obligation to set any packet to
405	   ECT.  ECN++ removes the restrictions that RFC 3168 places against
406	   setting ECT on these types of packets, and an implementation would
407	   normally be expected to take advantage of this, but it does not have
408	   to.  Therefore, an implementation of the ECN++ experiment would be
409	   compliant if, for instance, it set ECT on some types of control
410	   packets but not others.
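
   As an illustration only, Table 1 can be expressed as a lookup
   structure.  The names SenderRule, TABLE_1 and may_set_ect are
   assumptions made for this sketch; the detailed behaviour remains
   that of the relevant subsections, and a True entry is a permission,
   not an obligation.

```python
from typing import NamedTuple


class SenderRule(NamedTuple):
    ect_with_accecn: bool      # ECN field if AccECN feedback negotiated
    ect_with_rfc3168: bool     # ECN field if RFC 3168 feedback negotiated
    congestion_response: str   # summary text copied from Table 1


# One row per packet type, using the abbreviations of Table 1.
# For a SYN, "negotiated" means "requested".
TABLE_1 = {
    "SYN": SenderRule(True, False, "If AccECN, reduce IW"),
    "SYN-ACK": SenderRule(True, True, "Reduce IW"),
    "Pure ACK": SenderRule(
        True, False,
        "If AccECN, usual cwnd response and optionally [RFC5690]"),
    "W Probe": SenderRule(True, True, "Usual cwnd response"),
    "FIN": SenderRule(True, True, "None or optionally [RFC5690]"),
    "RST": SenderRule(True, True, "N/A"),
    "Re-XMT": SenderRule(True, True, "Usual cwnd response"),
}


def may_set_ect(packet_type: str, accecn_negotiated: bool) -> bool:
    """Return True if Table 1 allows ECT on this packet type."""
    rule = TABLE_1[packet_type]
    if accecn_negotiated:
        return rule.ect_with_accecn
    return rule.ect_with_rfc3168
```

   For example, may_set_ect("SYN", accecn_negotiated=False) returns
   False, which matches the recommendation against setting ECT on a SYN
   that does not request AccECN feedback.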

412	3.2.1.  SYN (Send)

414	3.2.1.1.  Setting ECT on the SYN

416	   With classic [RFC3168] ECN feedback, the SYN was not expected to be
417	   ECN-capable, so the flag provided to feed back congestion was put to
418	   another use (it is used in combination with other flags to indicate
419	   that the responder supports ECN).  In contrast, Accurate ECN (AccECN)
420	   feedback [I-D.ietf-tcpm-accurate-ecn] provides a codepoint in the
421	   SYN-ACK for the responder to feed back whether the SYN arrived marked
422	   CE.  Therefore the setting of the IP/ECN field on the SYN is
423	   specified separately for each case in the following two subsections.

425	3.2.1.1.1.  ECN++ TCP Client also Supports AccECN

427	   For the ECN++ experiment, if the SYN is requesting AccECN feedback,
428	   the TCP sender will also set ECT on the SYN.  It can ignore the
429	   prohibition in section 6.1.1 of RFC 3168 against setting ECT on such
430	   a SYN, as per Section 4.3 of [RFC8311].

432	3.2.1.1.2.  ECN++ TCP Client does not Support AccECN

434	   If the SYN sent by a TCP initiator neither attempts to negotiate
435	   Accurate ECN feedback nor uses an equivalent safety
436	   mechanism, it MUST still comply with RFC 3168, which says that a TCP
437	   initiator "MUST NOT set ECT on a SYN".

439	   The only envisaged examples of "equivalent safety mechanisms" are: a)
440	   some future TCP ECN feedback protocol, perhaps evolved from AccECN,
441	   that feeds back CE marking on a SYN; b) setting the initial window to
442	   1 SMSS.  IW=1 is NOT RECOMMENDED because it could degrade
443	   performance, but might be appropriate for certain lightweight TCP
444	   implementations.

446	   See Section 4.2 for discussion and rationale.

448	   If the TCP initiator does not set ECT on the SYN, the rest of
449	   Section 3.2.1 does not apply.
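
   The choice described in the two cases of Section 3.2.1.1 can be
   sketched as a single decision.  This is a minimal illustration, not
   a normative algorithm: the function name ecn_field_for_syn and the
   flag equivalent_safety_mechanism are hypothetical, the latter
   standing for the alternatives listed above (for example an initial
   window of 1 SMSS).

```python
def ecn_field_for_syn(requests_accecn: bool,
                      equivalent_safety_mechanism: bool = False) -> str:
    """Return the ECN field an ECN++ client could put on its initial SYN.

    requests_accecn: the SYN attempts to negotiate AccECN feedback.
    equivalent_safety_mechanism: hypothetical flag for an alternative
        safeguard, e.g. a future feedback protocol or IW = 1 SMSS.
    """
    if requests_accecn or equivalent_safety_mechanism:
        # ECN++ permits ECT here; it is allowed, not mandatory.
        return "ECT"
    # Otherwise RFC 3168 still applies to the SYN.
    return "not-ECT"
```

   When the function returns "not-ECT", the rest of Section 3.2.1 does
   not apply to that connection attempt.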

451	3.2.1.2.  Caching where to use ECT on SYNs

453	   This subsection only applies if the ECN++ TCP client sets ECT on the
454	   SYN and supports AccECN.

456	   Until AccECN servers become widely deployed, a TCP initiator that
457	   sets ECT on a SYN (which typically implies the same SYN also requests
458	   AccECN, as above) SHOULD also maintain a cache entry per server to
459	   record servers that it is not worth sending an ECT SYN to,
460	   e.g. because they do not support AccECN and therefore have no logic
461	   for congestion markings on the SYN.  Mobile hosts MAY maintain a
462	   cache entry per access network to record 'non-ECT SYN' entries
463	   against proxies (see Section 4.2.3).  This cache can be implemented
464	   as part of the shared state across multiple TCP connections,
465	   following [RFC2140].

467	   Subsequently the initiator will not set ECT on a SYN to such a server
468	   or proxy, but it can still always request AccECN support (because the
469	   response will state any earlier stage of ECN evolution that the
470	   server supports with no performance penalty).  If a server
471	   subsequently upgrades to support AccECN, the initiator will discover
472	   this as soon as it next connects, then it can remove the server from
473	   its cache and subsequently always set ECT for that server.

475	   The client can limit the size of its cache of 'non-ECT SYN' servers.
476	   Then, while AccECN is not widely deployed, it will only cache the
477	   'non-ECT SYN' servers that are most used and most recently used by
478	   the client.  As the client accesses servers that have been expelled
479	   from its cache, it will simply use ECT on the SYN by default.

481	   Servers that do not support ECN as a whole do not need to be recorded
482	   separately from non-support of AccECN because the response to a
483	   request for AccECN immediately states which stage in the evolution of
484	   ECN the server supports (AccECN [I-D.ietf-tcpm-accurate-ecn], classic
485	   ECN [RFC3168] or no ECN).
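
   One possible shape for the bounded cache of 'non-ECT SYN' servers is
   sketched below.  It is an assumption-laden illustration: the class
   and method names are hypothetical, servers are keyed by an opaque
   string, and eviction is purely least-recently-used, which only
   approximates keeping the servers that are "most used and most
   recently used".

```python
from collections import OrderedDict


class NonEctSynCache:
    """Bounded set of servers to which an ECT SYN is not worth sending."""

    def __init__(self, max_entries: int = 1024) -> None:
        self._max_entries = max_entries
        self._servers: "OrderedDict[str, None]" = OrderedDict()

    def use_ect_on_syn(self, server: str) -> bool:
        """Optimistic default: use ECT unless the server is cached."""
        if server in self._servers:
            self._servers.move_to_end(server)  # refresh recency
            return False
        return True

    def record_syn_ack(self, server: str, supports_accecn: bool) -> None:
        """Update the cache from the feedback mode seen in the SYN-ACK."""
        if supports_accecn:
            # The server has upgraded (or was never cached): forget it.
            self._servers.pop(server, None)
            return
        self._servers[server] = None
        self._servers.move_to_end(server)
        while len(self._servers) > self._max_entries:
            self._servers.popitem(last=False)  # evict least recently used
```

   A server evicted from the cache is simply sent an ECT SYN again on
   the next connection, consistent with the default described above.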

487	   The above strategy is named "optimistic ECT and cache failures".  It
488	   is believed to be sufficient based on three measurement studies and
489	   assumptions detailed in Section 4.2.3.  However, Section 4.2.3 gives
490	   two other strategies and the choice between them depends on the
491	   implementer's goals and the deployment prevalence of ECN variants in
492	   the network and on servers, not to mention the prevalence of some
493	   significant bugs.

495	   If the initiator times out without seeing a SYN-ACK, it will
496	   separately cache this fact (see fall-back in Section 3.2.1.4 for
497	   details).

499	3.2.1.3.  SYN Congestion Response

501	   As explained above, this subsection only applies if the ECN++ TCP
502	   client sets ECT on the initial SYN.

504	   If the SYN-ACK returned to the TCP initiator confirms that the server
505	   supports AccECN, it will also be able to indicate whether or not the
506	   SYN was CE-marked.  If the SYN was CE-marked, and if the initial
507	   window is greater than 1 SMSS, then the initiator MUST reduce its
508	   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
509	   segment size).  The rationale is the same as that for the response to
510	   CE on a SYN-ACK (Section 4.3.2).

512	   If the initiator has set ECT on the SYN and if the SYN-ACK shows that
513	   the server does not support feedback of a CE on the SYN (e.g. it does
514	   not support AccECN) and if the initial congestion window of the
515	   initiator is greater than 1 MSS, then the TCP initiator MUST
516	   conservatively reduce its Initial Window and SHOULD reduce it to 1
517	   SMSS.  A reduction to greater than 1 SMSS MAY be appropriate (see
518	   Section 4.2.1).  Conservatism is necessary because the SYN-ACK cannot
519	   show whether the SYN was CE-marked.

521	   If the TCP initiator (host A) receives a SYN from the remote end
522	   (host B) after it has sent a SYN to B, it indicates the (unusual)
523	   case of a simultaneous open.  Host A will respond with a SYN-ACK.
524	   Host A will probably then receive a SYN-ACK in response to its own
525	   SYN, after which it can follow the appropriate one of the two
526	   paragraphs above.

528	   In all the above cases, the initiator does not have to back off its
529	   retransmission timer as it would in response to a timeout following
530	   no response to its SYN [RFC6298], because both the SYN and the SYN-
531	   ACK have been successfully delivered through the network.  Also, the
532	   initiator does not need to exit slow start or reduce ssthresh, which
533	   is not even required when a SYN is lost [RFC5681].

535	   If an initial window of more than 3 segments is implemented
536	   (e.g. IW10 [RFC6928]), Section 5 gives additional recommendations.
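
   The two SYN-ACK cases above can be summarised in a small decision
   function.  This is a minimal sketch, not a definitive
   implementation: the function name and parameters are assumptions,
   windows are counted in whole segments (SMSS), and reduced_iw models
   the option of reducing to more than 1 SMSS.

```python
def initial_window_after_syn_ack(iw_segments, sent_ect_syn,
                                 server_feeds_back_ce, syn_ce_marked,
                                 reduced_iw=1):
    """Return the initiator's initial window (in segments) to use.

    All names are hypothetical. server_feeds_back_ce is True when the
    SYN-ACK confirms AccECN support, so that syn_ce_marked is meaningful.
    """
    # Nothing to do for a not-ECT SYN or a window that is already 1.
    if not sent_ect_syn or iw_segments <= 1:
        return iw_segments
    # Server can report CE and reported none: keep the full window.
    if server_feeds_back_ce and not syn_ce_marked:
        return iw_segments
    # CE was reported, or the server cannot report it: the window MUST
    # shrink; 1 SMSS is the SHOULD, a larger value is a MAY.
    return max(1, min(reduced_iw, iw_segments - 1))
```

   The final clamp ensures that any caller-supplied reduced_iw still
   results in an actual reduction.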

538	3.2.1.4.  Fall-Back Following No Response to an ECT SYN

540	   As explained above, this subsection only applies if the ECN++ TCP
541	   client also sets ECT on the initial SYN.

543	   An ECT SYN might be lost due to an over-zealous path element (or
544	   server) blocking ECT packets that do not conform to RFC 3168.  Some
545	   evidence of this was found in a 2014 study [ecn-pam], but in a more
546	   recent study using 2017 data [Mandalari18] extensive measurements
547	   found no case where ECT on TCP control packets was treated any
548	   differently from ECT on TCP data packets.  Loss is commonplace for
549	   numerous other reasons, e.g. congestion loss at a non-ECN queue on
550	   the forward or reverse path, transmission errors, etc.
551	   Alternatively, the cause of the loss might be the associated attempt
552	   to negotiate AccECN, or possibly other unrelated options on the SYN.

554	   Therefore, if the timer expires after the TCP initiator has sent the
555	   first ECT SYN, it SHOULD make one more attempt to retransmit the SYN
556	   with ECT set (backing off the timer as usual).  If the retransmission
557	   timer expires again, it SHOULD retransmit the SYN with the not-ECT
558	   codepoint in the IP header, to expedite connection set-up.  If other
559	   experimental fields or options were on the SYN, it will also be
560	   necessary to follow their specifications for fall-back.  It would
561	   make sense to coordinate all the strategies for fall-back in order to
562	   isolate the specific cause of the problem.

564	   If the TCP initiator is caching failed connection attempts, it SHOULD
565	   NOT give up using ECT on the first SYN of subsequent connection
566	   attempts until it is clear that a blockage persistently and
567	   specifically affects ECT on SYNs.  This is because loss is so
568	   commonplace for other reasons.  Even if it does eventually decide to
569	   give up setting ECT on the SYN, it will probably not need to give up
570	   on AccECN on the SYN.  In any case, if a cache is used, it SHOULD be
571	   arranged to expire so that the initiator will infrequently attempt to
572	   check whether the problem has been resolved.

574	   Other fall-back strategies MAY be adopted where applicable (see
575	   Section 4.2.2 for suggestions, and the conditions under which they
576	   would apply).
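
   As an illustration of the default fall-back schedule described
   above, the following sketch lists, for each SYN transmission,
   whether ECT is set and how long the initiator waits before the next
   attempt.  The function name, the max_attempts limit and the 1 second
   initial timeout are assumptions made for the example; the doubling
   models the usual timer back-off.

```python
def syn_attempt_plan(initial_rto=1.0, max_attempts=5, ect_attempts=2):
    """Return a list of (attempt_number, set_ect, timeout_seconds).

    Hypothetical helper: the first ect_attempts transmissions carry ECT
    (the original SYN plus one retransmission); later ones are not-ECT.
    """
    plan = []
    rto = initial_rto
    for attempt in range(max_attempts):
        set_ect = attempt < ect_attempts
        plan.append((attempt + 1, set_ect, rto))
        rto *= 2  # back off the timer as usual
    return plan
```

   With the default arguments, the third and later attempts are sent
   with not-ECT in the IP header.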

578	3.2.2.  SYN-ACK (Send)

580	3.2.2.1.  Setting ECT on the SYN-ACK

582	   For the ECN++ experiment, the TCP implementation will set ECT on SYN-
583	   ACKs.  It can ignore the requirement in section 6.1.1 of RFC 3168 to
584	   set not-ECT on a SYN-ACK, as per Section 4.3 of [RFC8311].

586	3.2.2.2.  SYN-ACK Congestion Response

588	   A host that sets ECT on SYN-ACKs MUST reduce its initial window in
589	   response to any congestion feedback, whether using classic ECN or
590	   AccECN (see Section 4.3.1).  It SHOULD reduce it to 1 SMSS.  This is
591	   different to the behaviour specified in an earlier experiment that
592	   set ECT on the SYN-ACK [RFC5562].  This is justified in
593	   Section 4.3.2.

595	   The responder does not have to back off its retransmission timer
596	   because the ECN feedback proves that the network is delivering
597	   packets successfully and is not severely overloaded.  Also the
598	   responder does not have to leave slow start or reduce ssthresh, which
599	   is not even required when a SYN-ACK has been lost.

601	   The congestion response to CE-marking on a SYN-ACK for a server that
602	   implements either the TCP Fast Open experiment (TFO [RFC7413]) or
603	   experimentation with an initial window of more than 3 segments
604	   (e.g. IW10 [RFC6928]) is discussed in Section 5.

606	3.2.2.3.  Fall-Back Following No Response to an ECT SYN-ACK

608	   After the responder sends a SYN-ACK with ECT set, if its
609	   retransmission timer expires it SHOULD retransmit one more SYN-ACK
610	   with ECT set (and back off its timer as usual).  If the timer expires
611	   again, it SHOULD retransmit the SYN-ACK with not-ECT in the IP
612	   header.  If other experimental fields or options were on the initial
613	   SYN-ACK, it will also be necessary to follow their specifications for
614	   fall-back.  It would make sense to co-ordinate all the strategies for
615	   fall-back in order to isolate the specific cause of the problem.

617	   This fall-back strategy attempts to use ECT one more time than the
618	   strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being
619	   superseded by the present specification).  Other fall-back strategies
620	   MAY be adopted if found to be more effective, e.g. fall-back to not-
621	   ECT on the first retransmission attempt.

623	   The server MAY cache failed connection attempts, e.g. per client
624	   access network.  A client-based alternative to caching at the server
625	   is given in Section 4.3.3.  If the TCP server is caching failed
626	   connection attempts, it SHOULD NOT give up using ECT on the first
627	   SYN-ACK of subsequent connection attempts until it is clear that the
628	   blockage persistently and specifically affects ECT on SYN-ACKs.  This
629	   is because loss is so commonplace for other reasons (see
630	   Section 3.2.1.4).  If a cache is used, it SHOULD be arranged to
631	   expire so that the server will infrequently attempt to check whether
632	   the problem has been resolved.
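
   One possible shape for such a cache is sketched below.  It only
   stops using ECT after several failures have been recorded for the
   same key, and it expires entries so that ECT is retried later.  The
   class, the threshold of 3, the one-hour lifetime and the definition
   of a "failure" (ECT attempts timed out but a not-ECT retransmission
   succeeded) are all assumptions for illustration.

```python
import time


class EctFailureCache:
    """Hypothetical cache of keys (e.g. client access networks) for
    which ECT on the SYN-ACK appears to be persistently blocked."""

    def __init__(self, threshold=3, ttl=3600.0, clock=time.monotonic):
        self.threshold = threshold  # failures needed before giving up ECT
        self.ttl = ttl              # seconds before an entry expires
        self.clock = clock
        self._entries = {}          # key -> (failure_count, last_failure)

    def record_failure(self, key):
        """ECT attempts timed out but the not-ECT fall-back got through."""
        count, _ = self._entries.get(key, (0, 0.0))
        self._entries[key] = (count + 1, self.clock())

    def record_success(self, key):
        """An ECT SYN-ACK got through, so forget earlier failures."""
        self._entries.pop(key, None)

    def use_ect(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return True
        count, last_failure = entry
        if self.clock() - last_failure >= self.ttl:
            del self._entries[key]  # expired: try ECT again
            return True
        return count < self.threshold
```

   Injecting the clock keeps the expiry logic easy to test without
   waiting for real time to pass.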

634	3.2.3.  Pure ACK (Send)

636	   A Pure ACK is an ACK packet that does not carry data, which includes
637	   the Pure ACK at the end of TCP's 3-way handshake.

639	   For the ECN++ experiment, whether a TCP implementation sets ECT on a
640	   Pure ACK depends on whether or not Accurate ECN TCP feedback
641	   [I-D.ietf-tcpm-accurate-ecn] has been successfully negotiated for a
642	   particular TCP connection, as specified in the following two
643	   subsections.

645	3.2.3.1.  Pure ACK without AccECN Feedback

647	   If AccECN has not been successfully negotiated for a connection, ECT
648	   MUST NOT be set on Pure ACKs by either end.

650	3.2.3.2.  Pure ACK with AccECN Feedback

652	   For the ECN++ experiment, if AccECN has been successfully negotiated,
653	   either end of the connection will set ECT on Pure ACKs.  They can
654	   ignore the requirement in section 6.1.4 of RFC 3168 to set not-ECT on
655	   a pure ACK, as per Section 4.3 of [RFC8311].

657	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
658	      deployed base of network elements and RFC 3168 servers react to
659	      pure ACKs marked with the ECT(0)/ECT(1)/CE codepoints,
660	      i.e. whether they are dropped, codepoint cleared or processed and
661	      the congestion indication fed back on a subsequent packet.

663	   See Section 3.3.3 for the implications if a host receives a CE-marked
664	   Pure ACK.

666	3.2.3.2.1.  Pure ACK Congestion Response

668	   As explained above, this subsection only applies if AccECN has been
669	   successfully negotiated for the TCP connection.

671	   A host that sets ECT on pure ACKs SHOULD respond to the congestion
672	   signal resulting from pure ACKs being marked with the CE codepoint.
673	   The specific response will need to be defined as an update to each
674	   congestion control specification.  Possible responses to congestion
675	   feedback include reducing the congestion window (CWND) and/or
676	   regulating the pure ACK rate (see Section 4.4.1.1).

678	   Note that, in comparison, TCP Congestion Control [RFC5681] does not
679	   require a TCP to detect or respond to loss of pure ACKs at all; it
680	   requires no reduction in congestion window or ACK rate.

682	3.2.4.  Window Probe (Send)

684	   For the ECN++ experiment, the TCP sender will set ECT on window
685	   probes.  It can ignore the prohibition in section 6.1.6 of RFC 3168
686	   against setting ECT on a window probe, as per Section 4.3 of
687	   [RFC8311].

689	   A window probe contains a single octet, so it is no different from a
690	   regular TCP data segment.  Therefore a TCP receiver will feed back
691	   any CE marking on a window probe as normal (either using classic ECN
692	   feedback or AccECN feedback).  The sender of the probe will then
693	   reduce its congestion window as normal.

695	   A receive window of zero indicates that the application is not
696	   consuming data fast enough and does not imply anything about network
697	   congestion.  Once the receive window opens, the congestion window
698	   might become the limiting factor, so it is correct that CE-marked
699	   probes reduce the congestion window.  This complements cwnd
700	   validation [RFC7661], which reduces cwnd as more time elapses without
701	   having used available capacity.  However, CE-marking on window probes
702	   does not reduce the rate of the probes themselves.  This is unlikely
703	   to present a problem, given the duration between window probes
704	   doubles [RFC1122] as long as the receiver is advertising a zero
705	   window (currently minimum 1 second, maximum at least 1 minute
706	   [RFC6298]).

708	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
709	      deployed base of network elements and servers react to Window
710	      probes marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether
711	      they are dropped, codepoint cleared or processed.

713	3.2.5.  FIN (Send)

715	   A TCP implementation can set ECT on a FIN.

717	   See Section 3.3.4 for the implications if a host receives a CE-marked
718	   FIN.

720	   A congestion response to a CE-marking on a FIN is not required.

722	   After sending a FIN, the endpoint will not send any more data in the
723	   connection.  Therefore, even if the FIN-ACK indicates that the FIN
724	   was CE-marked (whether using classic or AccECN feedback), reducing
725	   the congestion window will not affect anything.

727	   After sending a FIN, a host might send one or more pure ACKs.  If it
728	   is using one of the techniques in Section 3.2.3 to regulate the
729	   delayed ACK ratio for pure ACKs, it could equally be applied after a
730	   FIN.  But this is not required.

732	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
733	      deployed base of network elements and servers react to FIN packets
734	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are
735	      dropped, codepoint cleared or processed.

737	3.2.6.  RST (Send)

739	   A TCP implementation can set ECT on a RST.

741	   See Section 3.3.5 for the implications if a host receives a CE-marked
742	   RST.

744	   A congestion response to a CE-marking on a RST is not required (and
745	   actually not possible).

747	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
748	      deployed base of network elements and servers react to RST packets
749	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they are
750	      dropped, codepoint cleared or processed.

752	   Implementers SHOULD ensure that RST packets (and control packets
753	   generally) are always sent out with the same ECN field regardless of
754	   the TCP state machine.  Otherwise the ECN field could reveal internal
755	   TCP state.  For instance, the ECN field on a RST ought not to reveal
756	   any distinction between a non-listening port, a recently in-use port,
757	   and a closed session port.

759	3.2.7.  Retransmissions (Send)

761	   For the ECN++ experiment, the TCP sender will set ECT on
762	   retransmitted segments.  It can ignore the prohibition in section
763	   6.1.5 of RFC 3168 against setting ECT on retransmissions, as per
764	   Section 4.3 of [RFC8311].

766	   See Section 3.3.6 for the implications if a host receives a CE-marked
767	   retransmission.

769	   If the TCP sender receives feedback that a retransmitted packet was
770	   CE-marked, it will react as it would to any feedback of CE-marking on
771	   a data packet.

773	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
774	      deployed base of network elements and servers react to
775	      retransmissions marked with the ECT(0)/ECT(1)/CE codepoints,
776	      i.e. whether they are dropped, codepoint cleared or processed.
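
   The send-side rules in Sections 3.2.2 to 3.2.7 can be condensed into
   a single predicate, sketched below.  It assumes the connection is
   taking part in the ECN++ experiment and uses ECN, and it leaves out
   the SYN, whose rule depends on the strategy in Section 3.2.1.  The
   function name and the packet type labels are illustrative
   assumptions.

```python
def sender_sets_ect(packet_type, accecn_negotiated):
    """Return True if an ECN++ sender sets ECT on this packet type.

    Hypothetical helper; packet_type is one of the labels below.
    """
    # Set ECT whichever feedback scheme (classic ECN or AccECN) is in use.
    will_set = {"syn-ack", "window-probe", "retransmission"}
    # ECT is permitted ("can set") on these; this sketch always sets it,
    # which also keeps the ECN field on a RST independent of TCP state.
    may_set = {"fin", "rst"}
    if packet_type in will_set or packet_type in may_set:
        return True
    if packet_type == "pure-ack":
        # ECT MUST NOT be set on Pure ACKs without AccECN feedback.
        return accecn_negotiated
    raise ValueError("unknown packet type: %r" % (packet_type,))
```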

778	3.2.8.  General Fall-back for any Control Packet or Retransmission

780	   Extensive measurements in fixed and mobile networks [Mandalari18]
781	   have found no evidence of blockages due to ECT being set on any type
782	   of TCP control packet.

784	   In case traversal problems arise in future, fall-back measures have
785	   been specified above, but only for the cases where ECT on the initial
786	   packet of a half-connection (SYN or SYN-ACK) is persistently failing
787	   to get through.

789	   Fall-back measures for blockage of ECT on other TCP control packets
790	   MAY be implemented.  However they are not specified here given the
791	   lack of any evidence they will be needed.  Section 4.9 justifies this
792	   advice in more detail.

794	3.3.  Receiver Behaviour

796	   The present ECN++ specification primarily concerns the behaviour for
797	   sending TCP control packets or retransmissions.  Below are a few
798	   changes to the receive side of an implementation that are recommended
799	   while updating its send side.  Nonetheless, where deployment is
800	   concerned, ECN++ is still a sender-only deployment, because it does
801	   not depend on receivers complying with any of these recommendations.

803	3.3.1.  Receiver Behaviour for Any TCP Control Packet or Retransmission

805	   RFC 8311 is a standards track update to RFC 3168 in order to (amongst
806	   other things) "...allow the use of ECT codepoints on SYN packets,
807	   pure acknowledgement packets, window probe packets, and
808	   retransmissions of packets..., provided that the changes from RFC
809	   3168 are documented in an Experimental RFC in the IETF document
810	   stream."

812	   Section 4.3 of RFC 8311 amends every statement in RFC 3168 that
813	   precludes the use of ECT on control packets and retransmissions to
814	   add "unless otherwise specified by an Experimental RFC in the IETF
815	   document stream".  The present specification is such an Experimental
816	   RFC.  Therefore, in order for the present RFC 8311 experiment to be
817	   useful, TCP receivers will need to satisfy the following
818	   requirements:

820	   *  Any TCP implementation SHOULD accept receipt of any valid TCP
821	      control packet or retransmission irrespective of its IP/ECN field.
822	      If any existing implementation does not, it SHOULD be updated to
823	      do so.

825	   *  A TCP implementation taking part in the experiments proposed here
826	      MUST accept receipt of any valid TCP control packet or
827	      retransmission irrespective of its IP/ECN field.

829	   The following sections give further requirements specific to each
830	   type of control packet.

832	   These measures are derived from the robustness principle of "...  be
833	   liberal in what you accept from others", not only to ensure
834	   compatibility with the present experimental specification, but also
835	   any future protocol changes that allow ECT on any TCP packet.

837	3.3.2.  SYN (Receive)

839	   RFC 3168 negotiates the use of ECN for the connection end-to-end
840	   using the ECN flags in the TCP header.  RFC 3168 originally said that
841	   "A host MUST NOT set ECT on SYN ... packets." but it was silent as to
842	   what a TCP server ought to do if it receives a SYN packet with a non-
843	   zero IP/ECN field anyway.

845	   For the avoidance of doubt, the normative statements for all TCP
846	   control packets in Section 3.3.1 are interpreted for the specific
847	   case when a SYN is received as follows:

849	   *  Any TCP server implementation SHOULD accept receipt of a valid SYN
850	      that requests ECN support for the connection, irrespective of the
851	      IP/ECN field of the SYN.  If any existing implementation does not,
852	      it SHOULD be updated to do so.

854	   *  A TCP implementation taking part in the ECN++ experiment MUST
855	      accept receipt of a valid SYN, irrespective of its IP/ECN field.

857	   *  If the SYN is CE-marked and the server has no logic to feed back a
858	      CE mark on a SYN-ACK (e.g. it does not support AccECN), it has to
859	      ignore the CE-mark (the client detects this case and behaves
860	      conservatively in mitigation - see Section 3.2.1.3).

862	   Rationale: At the time of writing, some implementations of TCP
863	   servers (see Section 4.2.2.2) assume that, if a host receives a SYN
864	   with a non-zero IP/ECN field, it must be due to network mangling, and
865	   they disable ECN for the rest of the connection.  Section 4.2.2.2
866	   cites a measurement study run in 2017 that found no occurrence of
867	   this type of network mangling.  However, a year earlier, when ECN was
868	   enabled on connections from Apple clients, there was a case of a
869	   whole network that re-marked the ECN field of every packet to CE (it
870	   was rapidly fixed).

872	   When ECN was not allowed on SYNs, it made sense to look for a non-
873	   zero ECN field on the SYN to detect this type of network mangling.
874	   But now that ECN is being allowed on a SYN, detection needs to be
875	   more nuanced.  A server needs to disable the test on the SYN alone
876	   for AccECN SYNs (which was done for Linux RFC 3168 servers in 2019
877	   [relax-strict-ecn]) and for RFC 3168 SYNs it needs to watch for three
878	   or four packets all set to CE at the start of a flow.  If such
879	   mangling is indeed now so rare, it would also be preferable to log
880	   each case detected and manually report it to the responsible network,
881	   so that the problem will eventually be eliminated.
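
   The more nuanced test for RFC 3168 SYNs might look like the
   following sketch, which flags suspected mangling only when the
   first few packets of a flow all arrive CE-marked.  The function
   name and the default threshold of 3 are assumptions; the text above
   suggests three or four packets.

```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11  # IP-ECN codepoints


def looks_like_ce_mangling(ecn_fields, threshold=3):
    """Return True if the first `threshold` packets were all CE.

    ecn_fields is the sequence of IP-ECN values seen so far on the
    flow, in arrival order, starting with the SYN (hypothetical input).
    """
    first = list(ecn_fields[:threshold])
    if len(first) < threshold:
        return False  # not enough evidence yet
    return all(field == CE for field in first)
```

   A single CE-marked SYN is therefore never enough, on its own, to
   disable ECN for the connection.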

883	3.3.3.  Pure ACK (Receive)

885	   For the avoidance of doubt, the normative statements for all TCP
886	   control packets in Section 3.3.1 are interpreted for the specific
887	   case when a Pure ACK is received as follows:

889	   *  Any TCP implementation SHOULD accept receipt of a pure ACK with a
890	      non-zero ECN field, despite current RFCs precluding the sending of
891	      such packets.

893	   *  A TCP implementation taking part in the ECN++ experiment MUST
894	      accept receipt of a pure ACK with a non-zero ECN field.

896	   The question of whether and how the receiver of pure ACKs is required
897	   to feed back any CE marks on them is outside the scope of the present
898	   specification because it is a matter for the relevant feedback
899	   specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]).  AccECN
900	   feedback is required to count CE marking of any control packet
901	   including pure ACKs.  Whereas RFC 3168 is silent on this point, so
902	   feedback of CE-markings might be implementation specific (see
903	   Section 4.4.1.1).

905	3.3.4.  FIN (Receive)

907	   The TCP data receiver MUST ignore the CE codepoint on incoming FINs
908	   that fail any validity check.  The validity check in section 5.2 of
909	   [RFC5961] is RECOMMENDED.

911	3.3.5.  RST (Receive)

913	   The "challenge ACK" approach to checking the validity of RSTs
914	   (section 3.2 of [RFC5961]) is RECOMMENDED at the data receiver.

916	3.3.6.  Retransmissions (Receive)

918	   The TCP data receiver MUST ignore the CE codepoint on incoming
919	   segments that fail any validity check.  The validity check in section
920	   5.2 of [RFC5961] is RECOMMENDED.  This will effectively mitigate an
921	   attack that uses spoofed data packets to fool the receiver into
922	   feeding back spoofed congestion indications to the sender, which in
923	   turn would be fooled into continually reducing its congestion window.
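
   The following sketch shows how a receiver might gate CE feedback on
   the ACK range check of section 5.2 of [RFC5961], which accepts a
   segment only if (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT.  The
   function names are assumptions, the comparison is done modulo 2^32
   to cope with sequence number wrap, and a real implementation would
   apply its other validity checks (e.g. on the sequence number) too.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers wrap at 2^32
CE = 0b11          # IP-ECN codepoint for Congestion Experienced


def ack_is_acceptable(seg_ack, snd_una, snd_nxt, max_snd_wnd):
    """ACK range check, evaluated relative to the lower bound so that
    it still works when the range straddles the wrap point."""
    lower = (snd_una - max_snd_wnd) % SEQ_MOD
    span = (snd_nxt - lower) % SEQ_MOD
    return (seg_ack - lower) % SEQ_MOD <= span


def ce_counts(ip_ecn, seg_ack, snd_una, snd_nxt, max_snd_wnd):
    """Return True only if the segment is CE-marked and passes the
    check; CE on a segment that fails the check is ignored."""
    return ip_ecn == CE and ack_is_acceptable(
        seg_ack, snd_una, snd_nxt, max_snd_wnd)
```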

925	4.  Rationale

927	   This section is informative, not normative.  It presents counter-
928	   arguments against the justifications in the RFC series for disabling
929	   ECN on TCP control segments and retransmissions.  It also gives
930	   rationale for why ECT is safe on control segments that have not, so
931	   far, been mentioned in the RFC series.  First it addresses over-
932	   arching arguments used for most packet types, then it addresses the
933	   specific arguments for each packet type in turn.

935	4.1.  The Reliability Argument

937	   Section 5.2 of RFC 3168 states:

939	      "To ensure the reliable delivery of the congestion indication of
940	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
941	      unless the loss of that packet [at a subsequent node] in the
942	      network would be detected by the end nodes and interpreted as an
943	      indication of congestion."

945	   We believe this argument is misplaced.  TCP does not deliver most
946	   control packets reliably.  So it is more important to allow control
947	   packets to be ECN-capable, which greatly improves reliable delivery
948	   of the control packets themselves (see motivation in Section 1.1).
949	   ECN also improves the reliability and latency of delivery of any
950	   congestion notification on control packets, particularly because TCP
951	   does not detect the loss of most types of control packet anyway.
952	   Both these points outweigh by far the concern that a CE marking
953	   applied to a control packet by one node might subsequently be dropped
954	   by another node.

956	   The principle to determine whether a packet can be ECN-capable ought
957	   to be "do no extra harm", meaning that the reliability of a
958	   congestion signal's delivery ought to be no worse with ECN than
959	   without.  In particular, setting the CE codepoint on the very same
960	   packet that would otherwise have been dropped fulfills this
961	   criterion, since either the packet is delivered and the CE signal is
962	   delivered to the endpoint, or the packet is dropped and the original
963	   congestion signal (packet loss) is delivered to the endpoint.

965	   The concern about a CE marking being dropped at a subsequent node
966	   might be motivated by the idea that ECN-marking a packet at the first
967	   node does not remove the packet, so it could go on to worsen
968	   congestion at a subsequent node.  However, it is not useful to reason
969	   about congestion by considering single packets.  The departure rate
970	   from the first node will generally be the same (fully utilized) with
971	   or without ECN, so this argument does not apply.

973	4.2.  SYNs

975	   RFC 5562 presents two arguments against ECT marking of SYN packets
976	   (quoted verbatim):

978	      "First, when the TCP SYN packet is sent, there are no guarantees
979	      that the other TCP endpoint (node B in Figure 2) is ECN-Capable,
980	      or that it would be able to understand and react if the ECN CE
981	      codepoint was set by a congested router.

983	      Second, the ECN-Capable codepoint in TCP SYN packets could be
984	      misused by malicious clients to "improve" the well-known TCP SYN
985	      attack.  By setting an ECN-Capable codepoint in TCP SYN packets, a
986	      malicious host might be able to inject a large number of TCP SYN
987	      packets through a potentially congested ECN-enabled router,
988	      congesting it even further."

990	   The first point actually describes two subtly different issues.  So
991	   below three arguments are countered in turn.

993	4.2.1.  Argument 1a: Unrecognized CE on the SYN

995	   This argument certainly applied at the time RFC 5562 was written,
996	   when no ECN responder mechanism had any logic to recognize a CE
997	   marking on a SYN and, even if logic were added, there was no field in
998	   the SYN-ACK to feed it back.  The problem was that, during the 3WHS,
999	   the flag in the TCP header for ECN feedback (called Echo Congestion
1000	   Experienced) had been overloaded to negotiate the use of ECN itself.

1002	   The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has
1003	   since been designed to solve this problem.  Two features are
1004	   important here:

1006	   1.  An AccECN server uses the 3 'ECN' bits in the TCP header of the
1007	       SYN-ACK to respond to the client. 4 of the possible 8 codepoints
1008	       provide enough space for the server to feed back which of the 4
1009	       IP/ECN codepoints was on the incoming SYN (including CE of
1010	       course).

1012	   2.  If any of these 4 codepoints are in the SYN-ACK, it confirms that
1013	       the server supports AccECN and, if another codepoint is returned,
1014	       it confirms that the server doesn't support AccECN.

1016	   This still does not seem to allow a client to set ECT on a SYN; it
1017	   only finds out afterwards whether the server would have supported it.
1018	   The trick the client uses for ECN++ is to set ECT on the SYN
1019	   optimistically then, if the SYN-ACK reveals that the server wouldn't
1020	   have understood CE on the SYN, the client responds conservatively as
1021	   if the SYN was marked with CE.
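
   The two AccECN features and the client's optimistic trick can be
   sketched as follows.  The four codepoint values in the table are
   taken from the AccECN handshake encoding as the editor understands
   it and ought to be checked against [I-D.ietf-tcpm-accurate-ecn];
   the function names are illustrative assumptions.

```python
# (AE, CWR, ECE) on the SYN-ACK -> IP-ECN codepoint seen on the SYN.
# Values assumed from the AccECN specification; verify before use.
ACCECN_SYN_ACK = {
    0b010: "not-ECT",
    0b011: "ECT(1)",
    0b100: "ECT(0)",
    0b110: "CE",
}


def interpret_syn_ack(ae, cwr, ece):
    """Return (server_supports_accecn, ecn_codepoint_seen_on_syn)."""
    flags = (ae << 2) | (cwr << 1) | ece
    if flags in ACCECN_SYN_ACK:
        return True, ACCECN_SYN_ACK[flags]
    return False, None  # any other codepoint: no AccECN support


def treat_syn_as_ce_marked(ae, cwr, ece, sent_ect_syn):
    """Optimistic ECT: respond as if the SYN was CE-marked whenever the
    server could not have fed back a CE mark."""
    if not sent_ect_syn:
        return False
    accecn, syn_ecn = interpret_syn_ack(ae, cwr, ece)
    if accecn:
        return syn_ecn == "CE"
    return True
```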

1023	   The recommended conservative congestion response is to reduce the
1024	   initial window, which does not affect the performance of very popular
1025	   protocols such as HTTP, since it is extremely rare for an HTTP client
1026	   to send more than one packet as its initial request anyway (for data
1027	   on HTTP/1 & HTTP/2 request sizes see Fig 3 in [Manzoor17]).  Any
1028	   clients that do frequently use a larger initial window for their
1029	   first message to the server can cache which servers will not
1030	   understand ECT on a SYN (see Section 4.2.3 below).  If caching is not
1031	   practical, such clients could reduce the initial window to say IW2 or
1032	   IW3.

1034	      EXPERIMENTATION NEEDED: Experiments will be needed to determine
1035	      any better strategy for reducing IW in response to congestion on a
1036	      SYN, when the server does not support congestion feedback on the
1037	      SYN-ACK (whether cached or discovered explicitly).

1039	4.2.2.  Argument 1b: ECT Considered Invalid on the SYN

1041	   Given that, until now, ECT-marked SYN packets have been prohibited,
1042	   it cannot be assumed that they will be accepted by TCP middleboxes or
1043	   servers.

1045	4.2.2.1.  ECT on SYN Considered Invalid by Middleboxes

1047	   According to a study using 2014 data [ecn-pam] from a limited range
1048	   of fixed vantage points, for the top 1M Alexa web sites, adding the
1049	   ECN capability to SYNs was increasing connection establishment
1050	   failures by about 0.4%.

1052	   From a wider range of fixed and mobile vantage points, a more recent
1053	   study in Jan-May 2017 [Mandalari18] found no occurrences of blocking
1054	   of ECT on SYNs.  However, in more than half the mobile networks
1055	   tested it found wiping of the ECN codepoint at the first hop.

1057	      MEASUREMENTS NEEDED: As wiping at the first hop is remedied,
1058	      measurements will be needed to check whether SYNs with ECT are
1059	      sometimes blocked deeper into the path.

1061	   Silent failures introduce a retransmission timeout delay (default 1
1062	   second) at the initiator before it attempts any fall-back strategy
1063	   (whereas explicit RSTs can be dealt with immediately).  Ironically,
1064	   making SYNs ECN-capable is intended to avoid the timeout when a SYN
1065	   is lost due to congestion.  Fortunately, if there is any discard of
1066	   ECN-capable SYNs due to policy, it will occur predictably, not
1067	   randomly like congestion.  So the initiator should be able to avoid
1068	   it by caching those sites that do not support ECN-capable SYNs (see
1069	   the last paragraph of Section 3.2.1.2).

1071	4.2.2.2.  ECT on SYN Considered Invalid by Servers

1073	   A study conducted in Nov 2017 [Kuehlewind18] found that, of the 82%
1074	   of the Alexa top 50k web servers that supported ECN, 84% disabled ECN
1075	   if the IP/ECN field on the SYN was ECT0, CE or either.  Given most
1076	   web servers use Linux, this behaviour can most likely be traced to a
1077	   patch contributed in May 2012 that was first distributed in v3.5 of
1078	   the Linux kernel [strict-ecn].  The comment says "RFC3168 : 6.1.1 SYN
1079	   packets must not have ECT/ECN bits set.  If we receive a SYN packet
1080	   with these bits set, it means a network is playing bad games with TOS
1081	   bits.  In order to avoid possible false congestion notifications, we
1082	   disable TCP ECN negociation."  Of course, some of the 84% might be
1083	   due to similar code in other OSs.

1085	   For brevity we shall call this the "over-strict" ECN test, because it
1086	   is over-conservative with what it accepts, contrary to Postel's
1087	   robustness principle.  A robust protocol will not usually assume
1088	   network mangling without comparing with the value originally sent,
1089	   and one packet is not sufficient to make an assumption with such
1090	   irreversible consequences anyway.

1092	   Ironically, networks rarely seem to alter the IP/ECN field on a SYN
1093	   from zero to non-zero anyway.  In a study conducted in Jan-May 2017
1094	   over millions of paths from vantage points in a few dozen mobile and
1095	   fixed networks [Mandalari18], no such transition was observed.  With
1096	   such a small or non-existent incidence of this sort of network
1097	   mangling, it would be preferable to report any residual problem paths
1098	   so that they can be fixed.

1100	   Regardless, the widespread presence of this 'over-strict' test proves
1101	   that RFC 5562 was correct to expect that ECT would be considered
1102	   invalid on SYNs.  Nonetheless, it is not an insurmountable problem -
1103	   the over-strict test in Linux was patched in Apr 2019
1104	   [relax-strict-ecn] and caching can work round it where previous
1105	   versions of Linux are running.  The prevalence of these "over-strict"
1106	   ECN servers makes it challenging to cache them all.  However,
1107	   Section 4.2.3 below explains how a cache of limited size can
1108	   alleviate this problem for a client's most popular sites.

1110	   For the future, [RFC8311] updates RFC 3168 to clarify that the IP/ECN
1111	   field does not have to be zero on a SYN if documented in an
1112	   experimental RFC such as the present ECN++ specification.

1114	4.2.3.  Caching Strategies for ECT on SYNs

1116	   Given the server handling of ECN on SYNs outlined in Section 4.2.2.2
1117	   above, an initiator might combine AccECN with three candidate caching
1118	   strategies for setting ECT on a SYN:

1120	   (S1):  Pessimistic ECT and cache successes: The initiator always
1121	          requests AccECN, but by default without ECT on the SYN.  Then
1122	          it caches those servers that confirm that they support AccECN
1123	          as 'ECT SYN OK'.  On a subsequent connection to any server
1124	          that supports AccECN, the initiator can then set ECT on the
1125	          SYN.  When connecting to other servers (non-ECN or classic
1126	          ECN) it will not set ECT on the SYN, so it will not fail the
1127	          'over-strict' ECN test.

1129	          Longer term, as servers upgrade to AccECN, the initiator is
1130	          still requesting AccECN, so it will add them to the cache and
1131	          use ECT on subsequent SYNs to those servers.  However,
1132	          assuming it has to cap the size of the cache, the client will
1133	          not have the benefit of ECT SYNs to those less frequently used
1134	          AccECN servers expelled from its cache.

1136	   (S2):  Optimistic ECT: The initiator always requests AccECN and by
1137	          default sets ECT on the SYN.  Then, if the server response
1138	          shows it has no AccECN logic (so it cannot feed back a CE
1139	          mark), the initiator conservatively behaves as if the SYN was
1140	          CE-marked, by reducing its initial window.

1142	          a.  No cache.

1144	          b.  Cache failures: The optimistic ECT strategy can be
1145	              improved by caching solely those servers that do not
1146	              support AccECN as 'ECT SYN NOK'.  This would include non-
1147	              ECN servers and all Classic ECN servers whether 'over-
1148	              strict' or not.  On subsequent connections to these non-
1149	              AccECN servers, the initiator will still request AccECN
1150	              but not set ECT on the SYN.  Then, the connection can
1151	              still fall back to Classic ECN, if the server supports it,
1152	              and the initiator can use its full initial window (if it
1153	              has enough request data to need it).

1155	              Longer term, as servers upgrade to AccECN, the initiator
1156	              will remove them from the cache and use ECT on subsequent
1157	              SYNs to that server.

1159	              Where an access network operator mediates Internet access
1160	              via a proxy that does not support AccECN, the optimistic
1161	              ECT strategy will always fail.  This scenario is more
1162	              likely in mobile networks.  Therefore, a mobile host could
1163	              cache lack of AccECN support per attached access network
1164	              operator.  Whenever it attached to a new operator, it
1165	              could check a well-known AccECN test server and, if it
1166	              found no AccECN support, it would add a cache entry for
1167	              the attached operator.  It would only use ECT when neither
1168	              network nor server were cached.  It would only populate
1169	              its per server cache when not attached to a non-AccECN
1170	              proxy.

1172	   (S3):  ECT by configuration: In a controlled environment, the
1173	          administrator can make sure that servers support ECN-capable
1174	          SYN packets.  Examples of controlled environments are single-
1175	          tenant DCs, and possibly multi-tenant DCs if it is assumed
1176	          that each tenant mostly communicates with its own VMs.
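
   The caching strategies (S1), (S2A) and (S2B) above can be sketched as
   follows.  This is a minimal illustration, not part of the
   specification: the class and method names, the LRU eviction policy
   and the default cache size are all assumptions made for the example.
   Per-operator caching for non-AccECN proxies is left out.

```python
from collections import OrderedDict
from enum import Enum


class Strategy(Enum):
    S1 = "pessimistic ECT, cache successes"
    S2A = "optimistic ECT, no cache"
    S2B = "optimistic ECT, cache failures"


class EctSynPolicy:
    """Decides whether to set ECT on a SYN (AccECN is always requested)."""

    def __init__(self, strategy, max_entries=1024):
        self.strategy = strategy
        self.max_entries = max_entries      # hypothetical cache size cap
        self.cache = OrderedDict()          # server -> True, in LRU order

    def ect_on_syn(self, server):
        cached = server in self.cache
        if cached:
            self.cache.move_to_end(server)  # refresh LRU position
        if self.strategy is Strategy.S1:
            return cached                   # cache lists 'ECT SYN OK'
        if self.strategy is Strategy.S2B:
            return not cached               # cache lists 'ECT SYN NOK'
        return True                         # S2A: always optimistic

    def on_handshake(self, server, server_supports_accecn):
        # S1 remembers AccECN servers; S2B remembers non-AccECN servers.
        if self.strategy is Strategy.S2A:
            return
        remember = server_supports_accecn == (self.strategy is Strategy.S1)
        if remember:
            self.cache[server] = True
            self.cache.move_to_end(server)
            while len(self.cache) > self.max_entries:
                self.cache.popitem(last=False)  # expel least recently used
        else:
            self.cache.pop(server, None)    # e.g. server upgraded to AccECN


def must_reduce_initial_window(ect_was_set, server_supports_accecn):
    # Without AccECN the server cannot feed back a CE mark on the SYN,
    # so an ECT SYN is conservatively treated as if it had been CE-marked.
    return ect_was_set and not server_supports_accecn
```

   With a capped cache, a server that has been expelled falls back to
   the strategy's default: no ECT on the SYN under (S1), ECT on the SYN
   under (S2B).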

1178	   For unmanaged environments like the public Internet, pragmatically
1179	   the choice is between strategies (S1), (S2A) and (S2B).  The
1180	   normative specification for ECT on a SYN in Section 3.2.1 recommends
1181	   the "optimistic ECT and cache failures" strategy (S2B) but the choice
1182	   depends on the implementer's motivation for using ECN++, and the
1183	   deployment prevalence of different technologies and bug-fixes.

1185	   *  The "pessimistic ECT and cache successes" strategy (S1) suffers
1186	      from exposing the initial SYN to the prevailing loss level, even
1187	      if the server supports ECT on SYNs, but only on the first
1188	      connection to each AccECN server.  If AccECN becomes widely
1189	      deployed on servers, SYNs to those AccECN servers that are less
1190	      frequently used by the client and therefore don't fit in the cache
1191	      will not benefit from ECN protection at all.

1193	   *  The "optimistic ECT without a cache" strategy (S2A) is the
1194	      simplest.  It would satisfy the goal of an implementer who is
1195	      solely interested in low latency using AccECN and ECN++ and is not
1196	      concerned about fall-back to Classic ECN.

1198	   *  The "optimistic ECT and cache failures" strategy (S2B) exploits
1199	      ECT on SYNs from the very first attempt.  But if the server turns
1200	      out to be 'over-strict' it will disable ECN for the connection,
1201	      but only for the first connection if it's one of the client's more
1202	      popular servers that fits in the cache.  If the server turns out
1203	      not to support AccECN, the initiator has to conservatively limit
1204	      its initial window, but again only for the first connection if
1205	      it's one of the client's more popular servers (and anyway this
1206	      rarely makes any difference when most client requests fit in a
1207	      single packet).

1209	   Note that, if AccECN deployment grows, caching successes (S1) starts
1210	   off small then grows, while caching failures (S2B) becomes large at
1211	   first, then shrinks.  At half-way, the size of the cache has to be
1212	   capped with either approach, so the default behaviour for all the
1213	   servers that do not fit in the cache is as important as the behaviour
1214	   for the popular servers that do fit.

1216	      MEASUREMENTS NEEDED: Measurements are needed to determine which
1217	      strategy would be sufficient for any particular client, whether a
1218	      particular client would need different strategies in different
1219	      circumstances and how many occurrences of problems would be masked
1220	      by how few cache entries.

1222	   Another strategy would be to send a not-ECT SYN a short delay (below
1223	   the typical lowest RTT) after an ECT SYN and only accept the not-ECT
1224	   connection if it returned first.  This would reduce the performance
1225	   penalty for those deploying ECT SYN support.  However, this 'happy
1226	   eyeballs' approach becomes complex when multiple optional features
1227	   are all tried on the first SYN (or on multiple SYNs), so it is not
1228	   recommended.

1230	4.2.4.  Argument 2: DoS Attacks

1232	   [RFC5562] says that ECT SYN packets could be misused by malicious
1233	   clients to augment "the well-known TCP SYN attack".  It goes on to
1234	   say "a malicious host might be able to inject a large number of TCP
1235	   SYN packets through a potentially congested ECN-enabled router,
1236	   congesting it even further."

1237	   We assume this is a reference to the TCP SYN flood attack (see
1238	   https://en.wikipedia.org/wiki/SYN_flood), which is an attack against
1239	   a responder end point.  We assume the idea of this attack is to use
1240	   ECT to get more packets through an ECN-enabled router in preference
1241	   to other non-ECN traffic so that they can go on to use the SYN
1242	   flooding attack to inflict more damage on the responder end point.
1243	   This argument could apply to flooding with any type of packet, but we
1244	   assume SYNs are singled out because their source address is easier to
1245	   spoof, whereas floods of other types of packets are easier to block.

1247	   Mandating Not-ECT in an RFC does not stop attackers using ECT for
1248	   flooding.  Nonetheless, if a standard says SYNs are not meant to be
1249	   ECT it would make it legitimate for firewalls to discard them.
1250	   However this would negate the considerable benefit of ECT SYNs for
1251	   compliant transports and seems unnecessary because RFC 3168 already
1252	   provides the means to address this concern.  In section 7, RFC 3168
1253	   says "During periods where ... the potential packet marking rate
1254	   would be high, our recommendation is that routers drop packets rather
1255	   then set the CE codepoint..." and this advice is repeated in
1256	   [RFC7567] (section 4.2.1).  This makes it harder for flooding packets
1257	   to gain from ECT.

1259	   [ecn-overload] showed that ECT can only slightly augment flooding
1260	   attacks relative to a non-ECT attack.  It was hard to overload the
1261	   link without causing the queue to grow, which in turn caused the AQM
1262	   to disable ECN and switch to drop, thus negating any advantage of
1263	   using ECT.  This was true even with the switch-over point set to 25%
1264	   drop probability (i.e. the arrival rate was 133% of the link rate).
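
   The relationship between the 25% drop probability and the 133%
   arrival rate quoted above can be checked with a short calculation.
   The sketch assumes unresponsive traffic and a fully utilized link,
   so that the fraction of arrivals that survives the drops exactly
   fills the link; the function name is illustrative.

```python
def offered_load_ratio(drop_probability):
    """Arrival rate, as a multiple of the link rate, at which dropping
    the given fraction of arrivals leaves exactly the link rate."""
    if not 0.0 <= drop_probability < 1.0:
        raise ValueError("drop probability must be in [0, 1)")
    # arrivals * (1 - p) = link rate  =>  arrivals / link rate = 1 / (1 - p)
    return 1.0 / (1.0 - drop_probability)


ratio = offered_load_ratio(0.25)   # 1 / 0.75, i.e. about 133% of link rate
```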

1266	4.3.  SYN-ACKs

1268	   The proposed approach in Section 3.2.2 for experimenting with ECN-
1269	   capable SYN-ACKs is effectively identical to the scheme called ECN+
1270	   [ECN-PLUS].  In 2005, the ECN+ paper demonstrated that it could
1271	   reduce the average Web response time by an order of magnitude.  It
1272	   also argued that adding ECT to SYN-ACKs did not raise any new
1273	   security vulnerabilities.

1275	4.3.1.  Possibility of Unrecognized CE on the SYN-ACK

1277	   The feedback behaviour by the initiator in response to a CE-marked
1278	   SYN-ACK from the responder depends on whether classic ECN feedback
1279	   [RFC3168] or AccECN feedback [I-D.ietf-tcpm-accurate-ecn] has been
1280	   negotiated.  In either case no change is required to RFC 3168 or the
1281	   AccECN specification.

1283	   Some classic ECN client implementations might ignore a CE-mark on a
1284	   SYN-ACK, or even ignore a SYN-ACK packet entirely if it is set to ECT
1285	   or CE.  This is a possibility because an RFC 3168 implementation
1286	   would not necessarily expect a SYN-ACK to be ECN-capable.  This issue
1287	   already came up when the IETF first decided to experiment with ECN on
1288	   SYN-ACKs [RFC5562] and it was decided to go ahead without any extra
1289	   precautionary measures.  This was because the probability of
1290	   encountering the problem was believed to be low and the harm if the
1291	   problem arose was also low (see Appendix B of RFC 5562).

1293	4.3.2.  Response to Congestion on a SYN-ACK

1295	   The IETF has already specified an experiment with ECN-capable SYN-ACK
1296	   packets [RFC5562].  It was inspired by the ECN+ paper, but it
1297	   specified a much more conservative congestion response to a CE-marked
1298	   SYN-ACK, called ECN+/TryOnce.  This required the server to reduce its
1299	   initial window to 1 segment (like ECN+), but then the server had to
1300	   send a second SYN-ACK and wait for its ACK before it could continue
1301	   with its initial window of 1 SMSS.  The second SYN-ACK of this 5-way
1302	   handshake had to carry no data, and had to disable ECN, but no
1303	   justification was given for these last two aspects.

1305	   The present ECN++ experimental specification obsoletes RFC 5562
1306	   because it uses the ECN+ congestion response, not ECN+/TryOnce.
1307	   First we argue against the rationale for ECN+/TryOnce given in
1308	   sections 4.4 and 6.2 of [RFC5562].  It starts with a rather too
1309	   literal interpretation of the requirement in RFC 3168 that says TCP's
1310	   response to a single CE mark has to be "essentially the same as the
1311	   congestion control response to a *single* dropped packet."  TCP's
1312	   response to a dropped initial (SYN or SYN-ACK) packet is to wait for
1313	   the retransmission timer to expire (currently 1s).  However, this
1314	   long delay assumes the worst case between two possible causes of the
1315	   loss: a) heavy overload; or b) the normal capacity-seeking behaviour
1316	   of other TCP flows.  When the network is still delivering CE-marked
1317	   packets, it implies that there is an AQM at the bottleneck and that
1318	   it is not overloaded.  This is because an AQM under overload will
1319	   disable ECN (as recommended in section 7 of RFC 3168 and repeated in
1320	   section 4.2.1 of RFC 7567).  So scenario (a) can be ruled out.
1321	   Therefore, TCP's response to a CE-marked SYN-ACK can be similar to
1322	   its response to the loss of _any_ packet, rather than backing off as
1323	   if the special _initial_ packet of a flow has been lost.

1325	   How TCP responds to the loss of any single packet depends on what it
1326	   has just been doing.  But there is not really a precedent for TCP's
1327	   response when it experiences a CE mark having sent only one (small)
1328	   packet.  If TCP had been adding one segment per RTT, it would have
1329	   halved its congestion window, but it hasn't established a congestion
1330	   window yet.  If it had been exponentially increasing it would have
1331	   exited slow start, but it hasn't started exponentially increasing yet
1332	   so it hasn't established a slow-start threshold.

1334	   Therefore, we have to work out a reasoned argument for what to do.
1335	   If an AQM is CE-marking packets, it implies there is already a queue
1336	   and it is probably already somewhere around the AQM's operating point
1337	   - it is unlikely to be well below and it might be well above.  So,
1338	   the more data packets that the client sends in its IW, the more
1339	   likely at least one will be CE marked, leading it to exit slow-start
1340	   early.  On the other hand, it is highly unlikely that the SYN-ACK
1341	   itself pushed the AQM into congestion, so it will be safe to
1342	   introduce another single segment immediately (1 RTT after the SYN-
1343	   ACK).  Therefore, starting to probe for capacity with a slow start
1344	   from an initial window of 1 segment seems appropriate to the
1345	   circumstances.  This is the approach adopted in Section 3.2.2.
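
   The reasoning above can be illustrated with a small sketch of the
   server's choice of initial window and the slow start that follows.
   It is only an idealized model: the default initial window of 10
   segments and the exact doubling per round trip (no delayed ACKs, no
   further marks or losses) are assumptions made for the example.

```python
def server_initial_window(syn_ack_ce_fed_back, default_iw_segments=10):
    """Initial window in segments.  default_iw_segments is a
    hypothetical configuration value, not taken from the text."""
    return 1 if syn_ack_ce_fed_back else default_iw_segments


def slow_start_windows(iw_segments, rounds):
    """Window at the start of each round trip if slow start doubles it
    every round and no further congestion signals arrive."""
    return [iw_segments * 2 ** r for r in range(rounds)]


# After a CE-marked SYN-ACK the server probes from 1 segment:
windows = slow_start_windows(server_initial_window(True), 4)
```

   In this model the server sends 1, 2, 4 and 8 segments in successive
   round trips, rather than waiting for a retransmission timeout.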

1347	      EXPERIMENTATION NEEDED: Experiments will be needed to check the
1348	      above reasoning and determine any better strategy for reducing IW
1349	      in response to congestion on a SYN-ACK (or a SYN).

1351	4.3.3.  Fall-Back if ECT SYN-ACK Fails

1353	   An alternative to the server caching failed connection attempts would
1354	   be for the server to rely on the client caching failed attempts (on
1355	   the basis that the client would cache a failure whether ECT was
1356	   blocked on the SYN or the SYN-ACK).  This strategy cannot be used if
1357	   the SYN does not request AccECN support.  It works as follows: if the
1358	   server receives a SYN that requests AccECN support but is set to not-
1359	   ECT, it replies with a SYN-ACK also set to not-ECT.  If a middlebox
1360	   only blocks ECT on SYNs, not SYN-ACKs, this strategy might disable
1361	   ECN on a SYN-ACK when it did not need to, but at least it saves the
1362	   server from maintaining a cache.
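
   The server-side rule described above might be sketched as follows.
   The function name and the boolean inputs are illustrative.  Replying
   with ECT when the SYN itself arrived ECN-capable is an assumption
   that the text implies but does not state, and the None result simply
   flags that this strategy gives no answer.

```python
def syn_ack_ect(syn_requests_accecn, syn_is_ect):
    """Return True or False for setting ECT on the SYN-ACK, or None
    when this cache-free strategy cannot be used.

    syn_is_ect: True if the SYN arrived carrying ECT or CE."""
    if not syn_requests_accecn:
        return None          # strategy needs an AccECN request on the SYN
    # Mirror the client's choice: a not-ECT SYN suggests the client has
    # cached a failure for this path, so reply with a not-ECT SYN-ACK.
    return syn_is_ect
```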

1364	4.4.  Pure ACKs

1366	   Section 5.2 of RFC 3168 gives the following arguments for not
1367	   allowing the ECT marking of pure ACKs (ACKs not piggy-backed on
1368	   data):

1370	      "To ensure the reliable delivery of the congestion indication of
1371	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
1372	      unless the loss of that packet in the network would be detected by
1373	      the end nodes and interpreted as an indication of congestion.

1375	      Transport protocols such as TCP do not necessarily detect all
1376	      packet drops, such as the drop of a "pure" ACK packet; for
1377	      example, TCP does not reduce the arrival rate of subsequent ACK
1378	      packets in response to an earlier dropped ACK packet.  Any
1379	      proposal for extending ECN-Capability to such packets would have
1380	      to address issues such as the case of an ACK packet that was
1381	      marked with the CE codepoint but was later dropped in the network.
1382	      We believe that this aspect is still the subject of research, so
1383	      this document specifies that at this time, "pure" ACK packets MUST
1384	      NOT indicate ECN-Capability."

1386	   Later on, Section 6.1.4 of RFC 3168 reads:

1388	      "For the current generation of TCP congestion control algorithms,
1389	      pure acknowledgement packets (e.g., packets that do not contain
1390	      any accompanying data) MUST be sent with the not-ECT codepoint.
1391	      Current TCP receivers have no mechanisms for reducing traffic on
1392	      the ACK-path in response to congestion notification.  Mechanisms
1393	      for responding to congestion on the ACK-path are areas for current
1394	      and future research.  (One simple possibility would be for the
1395	      sender to reduce its congestion window when it receives a pure ACK
1396	      packet with the CE codepoint set).  For current TCP
1397	      implementations, a single dropped ACK generally has only a very
1398	      small effect on the TCP's sending rate."

1400	   We next address each of the arguments presented above.

1402	   The first argument is a specific instance of the reliability argument
1403	   for the case of pure ACKs.  This has already been addressed by
1404	   countering the general reliability argument in Section 4.1.

1406	   The second argument says that ECN ought not to be enabled unless
1407	   there is a mechanism to respond to it.  This argument actually
1408	   comprises three sub-arguments:

1410	   Mechanism feasibility:  If ECN is enabled on Pure ACKs, are there, or
1411	      could there be, suitable mechanisms to detect, feed back and
1412	      respond to ECN-marked Pure ACKs?

1414	   Do no extra harm:  There has never been a mechanism to respond to
1415	      loss of non-ECN Pure ACKs.  So it seems that adding ECN without a
1416	      response mechanism will do no extra harm to others, while
1417	      improving a connection's own performance (because loss of an ACK
1418	      holds back new data).  However, if the end systems have no
1419	      response mechanism, ECN Pure ACKs do slightly more harm than non-
1420	      ECN, because the AQM marks ECT packets rather than clearing them
1421	      from the queue, until it reaches overload and disables ECN.

1423	   Standards policy:  Even if there were no harm to others, does it set
1424	      an undesirable precedent to allow a flow to use ECN to protect its
1425	      Pure ACKs from loss, when there is no mechanism to respond to ECN-
1426	      marking?

1428	   The last two arguments involve value judgements, but they both depend
1429	   on the concrete technical question of mechanism feasibility, which
1430	   will therefore be addressed first in Section 4.4.1 below.  Then
1431	   Section 4.4.2 draws conclusions by addressing the value judgements in
1432	   the other two questions.

1434	4.4.1.  Mechanisms to Respond to CE-Marked Pure ACKs

1436	   The question of whether the receiver of pure ACKs is required to
1437	   detect and feed back any CE-marking is outside the scope of the
1438	   present specification - it is a matter for the relevant feedback
1439	   specification (classic ECN [RFC3168] and AccECN
1440	   [I-D.ietf-tcpm-accurate-ecn]).  The response to congestion feedback
1441	   is also out of scope, because it would be defined in the base TCP
1442	   congestion control specification [RFC5681] or its variants.

1444	   Nonetheless, in order to decide whether the present ECN++
1445	   experimental specification should require a host to set ECT on pure
1446	   ACKs, we only need to know whether a response mechanism would be
1447	   feasible - we do not have to standardize it.  So the bullets below
1448	   assess, for each type of feedback, whether the three stages of the
1449	   congestion response mechanism could all work.

1451	   Detection:  Can the receiver of a pure ACK detect a CE marking on
1452	      it?:

1454	      *  Classic feedback: RFC 3168 is silent on this point.  The
1455	         implementer of the receiver would not expect CE marks on pure
1456	         ACKs, but the implementation might happen to check for CE marks
1457	         before it looks for the data.  So detection will be
1458	         implementation-dependent.

1460	      *  AccECN feedback: the AccECN specification requires the receiver
1461	         of any TCP packets to count any CE marks on them (whether or
1462	         not it sends ECN-capable control packets itself).

1464	   Feedback:  As a general rule, TCP does not ACK a pure ACK.  However,
1465	      even if the receiver of a CE-mark on a pure ACK does not feed it
1466	      back immediately, it could still include it within subsequent
1467	      feedback, for instance when it later sends a data segment (if it
1468	      ever does):

1470	      *  Classic feedback: RFC 3168 is silent on this point, so feedback
1471	         of CE-markings might be implementation specific.  If the
1472	         receiver (of the pure ACKs) did generate feedback, it would set
1473	         the echo congestion experienced (ECE) flag in the TCP header of
1474	         subsequent packets in the round, as it would to feed back CE on
1475	         data packets.

1477	      *  AccECN feedback: the receiver continually feeds back a count of
1478	         the number of CE-marked packets that it has received and,
1479	         optionally, a count of CE-marked bytes.  For either metric,
1480	         AccECN takes into account all types of packets, including pure
1481	         ACKs.  CE-marked pure ACKs will solely increment the packet
1482	         counter; not any byte counter, because by definition they
1483	         contain no bytes of data.

1485	   Congestion response:  In either case (classic or AccECN feedback), if
1486	      the TCP sender does receive feedback about CE-markings on pure
1487	      ACKs, it will be able to reduce the congestion window (cwnd) and/
1488	      or the ACK rate.

1490	   Therefore a congestion response mechanism is clearly feasible if
1491	   AccECN has been negotiated, but the position is unknown for the
1492	   installed base of classic ECN feedback.
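
   The AccECN counting behaviour described in the bullets above can be
   sketched as a pair of counters.  This is a simplification for
   illustration only: the class and attribute names are hypothetical,
   and the initial values and wrapping of the counters in the actual
   AccECN wire encoding are ignored.

```python
class CeCounters:
    """Receiver-side counts of CE-marked packets and CE-marked bytes."""

    def __init__(self):
        self.ce_packets = 0
        self.ce_bytes = 0

    def on_packet(self, ce_marked, payload_bytes):
        # Every arriving TCP packet is considered, including pure ACKs.
        if ce_marked:
            self.ce_packets += 1
            # A pure ACK has no payload, so it adds nothing here.
            self.ce_bytes += payload_bytes


counters = CeCounters()
counters.on_packet(ce_marked=True, payload_bytes=0)      # CE on a pure ACK
counters.on_packet(ce_marked=True, payload_bytes=1000)   # CE on a data segment
counters.on_packet(ce_marked=False, payload_bytes=1000)  # unmarked data
```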

1494	4.4.1.1.  Congestion Window Response to CE-Marked Pure ACKs

1496	   This subsection explores issues that congestion control designers
1497	   will need to consider when defining a cwnd response to CE-marked Pure
1498	   ACKs.

1500	   A CE-mark on a Pure ACK does not mean that only Pure ACKs are causing
1501	   congestion.  It only means that the marked Pure ACK is part of an
1502	   aggregate that is collectively causing a bottleneck queue to randomly
1503	   CE-mark a fraction of the packets.  A CE-mark on a Pure ACK might be
1504	   due to data packets in other flows through the same bottleneck, due
1505	   to data packets interspersed between Pure ACKs in the same half-
1506	   connection, or just due to the rate of Pure ACKs alone.  (RFC 3168
1507	   only considered the last possibility, which led to the argument that
1508	   ECN-enabled Pure ACKs had to be deferred, because ACK congestion
1509	   control was a research issue.)

1510	   If a host has been sending a mix of Pure ACKs and data, it doesn't
1511	   need to work out whether a particular CE mark was on a Pure ACK or
1512	   not; it just needs to respond to congestion feedback as a whole by
1513	   reducing its congestion window (cwnd), which limits the data it can
1514	   launch into flight through the congested bottleneck.  If it is purely
1515	   receiving data and sending only Pure ACKs, reducing cwnd will have
1516	   caused it no harm, having no effect on its ACK rate (the next
1517	   subsection addresses that).

1519	   However, when a host is sending data as well as Pure ACKs, it would
1520	   not be right for CE-marks on Pure ACKs and on data packets to induce
1521	   the same reduction in cwnd.  A possible way to address this issue
1522	   would be to weight the response by the size of the marked packets
1523	   (assuming the congestion control supports a weighted response,
1524	   e.g. [RFC8257]).  For instance, one could calculate the fraction of
1525	   CE-marked bytes (headers and data) over each round trip (say) as
1526	   follows:

1528	      (CE-marked header bytes + CE-marked data bytes) / (all header
1529	      bytes + all data bytes)

1531	   Header bytes can be calculated by multiplying a packet count by a
1532	   nominal header size, which is possible with AccECN feedback, because
1533	   it gives a count of CE-marked packets (as well as CE-marked bytes).
1534	   The above simple aggregate calculation caters for the full range of
1535	   scenarios; from all Pure ACKs to just a few interspersed with data
1536	   packets.
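
   The calculation above might be sketched as follows.  The function
   and parameter names are illustrative, and the nominal header size of
   40 bytes (an IPv4 header plus a TCP header without options) is an
   assumption chosen for the example.

```python
def ce_byte_fraction(ce_packets, ce_data_bytes, all_packets,
                     all_data_bytes, nominal_header_bytes=40):
    """Fraction of CE-marked bytes (headers and data) over one round.

    Header bytes are estimated as packet count times a nominal size."""
    total = all_packets * nominal_header_bytes + all_data_bytes
    if total == 0:
        return 0.0
    marked = ce_packets * nominal_header_bytes + ce_data_bytes
    return marked / total


# Ten pure ACKs, two of them CE-marked.
only_acks = ce_byte_fraction(2, 0, 10, 0)
# Five pure ACKs and five 1460-byte segments; one pure ACK marked.
ack_marked = ce_byte_fraction(1, 0, 10, 5 * 1460)
# Same mix, but the single mark is on a data segment instead.
data_marked = ce_byte_fraction(1, 1460, 10, 5 * 1460)
```

   In the mixed case a mark on a pure ACK yields a far smaller fraction
   than a mark on a full-sized data segment, which is the weighting the
   text argues for.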

1538	   Note that any mechanism that reduces cwnd due to CE-marked Pure ACKs
1539	   would need to be integrated with the congestion window validation
1540	   mechanism [RFC7661], which already conservatively reduces cwnd over
1541	   time because cwnd becomes stale if it is not used to fill the pipe.

1543	4.4.1.2.  ACK Rate Response to CE-Marked Pure ACKs

1545	   Reducing the congestion window will have no effect on the rate of
1546	   pure ACKs.  The worst case here is if the bottleneck is congested
1547	   solely with pure ACKs, but it could also be problematic if a large
1548	   fraction of the load was from unresponsive ACKs, leaving little or no
1549	   capacity for the load from responsive data.

1551	   Since RFC 3168 was published, experimental Acknowledgement Congestion
1552	   Control (AckCC) techniques have been documented in [RFC5690]
1553	   (informational).  So any pair of TCP end-points can choose to agree
1554	   to regulate the delayed ACK ratio in response to lost or CE-marked
1555	   pure ACKs.  However, the protocol has a number of open issues
1556	   concerning deployment (e.g. it requires support from both ends, it
1557	   relies on two new TCP options, one of which is required on the SYN
1558	   where option space is at a premium and, if either option is blocked
1559	   by a middlebox, no fall-back behaviour is specified).

1561	   The new TCP options address two problems, namely that TCP had: i) no
1562	   mechanism to allow ECT to be set on pure ACKs; and ii) no mechanism
1563	   to feed back loss or CE-marking of pure ACKs.  A combination of the
1564	   present specification and AccECN addresses both these problems, at
1565	   least for CE-marking.  So it might now be possible to design an ECN-
1566	   specific ACK congestion control scheme without the extra TCP options
1567	   proposed in RFC 5690.  However, such a mechanism is out of scope of
1568	   the present document.

1570	   Setting aside the practicality of RFC 5690, the need for AckCC has
1571	   not been conclusively demonstrated.  It has been argued that the
1572	   Internet has survived so far with no mechanism to even detect loss of
1573	   pure ACKs.  However, it has also been argued that ECN is not the same
1574	   as loss.  Packet discard can naturally thin the ACK load to whatever
1575	   the bottleneck can support, whereas ECN marking does not (it queues
1576	   the ACKs instead).  Nonetheless, RFC 3168 (section 7) recommends that
1577	   an AQM switches over from ECN marking to discard when the marking
1578	   probability becomes high.  Therefore discard can still be relied on
1579	   to thin out ECN-enabled pure ACKs as a last resort.
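
   The switch-over from marking to discard can be sketched as a
   per-packet decision.  This is not taken from any particular AQM: the
   function name is hypothetical, the random sample is passed in to
   keep the example deterministic, and the 25% switch-over point is
   only an example value.

```python
def aqm_signal(packet_is_ect, signal_probability, uniform_sample,
               switch_over=0.25):
    """Return 'forward', 'mark' or 'drop' for one packet.

    uniform_sample: a value drawn uniformly from [0, 1)."""
    if uniform_sample >= signal_probability:
        return "forward"             # no congestion signal this time
    if packet_is_ect and signal_probability < switch_over:
        return "mark"                # ECN-capable and AQM not overloaded
    return "drop"                    # not-ECT, or ECN disabled by overload
```

   Once the signalling probability reaches the switch-over point, ECT
   pure ACKs are discarded like any other packet, so the ACK load is
   thinned as it would be without ECN.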

1581	4.4.2.  Summary: Enabling ECN on Pure ACKs

1583	   In the case when AccECN has been negotiated, it provides a feasible
1584	   congestion response mechanism, so the arguments for ECT on pure ACKs
1585	   heavily outweigh those against.  ECN is always more and never less
1586	   reliable for delivery of congestion notification.  A cwnd reduction
1587	   needs to be considered by congestion control designers as a response
1588	   to congestion on pure ACKs.  Separately, AckCC (or an improved
1589	   variant exploiting AccECN) could optionally be used to regulate the
1590	   spacing between pure ACKs.  However, it is not clear whether AckCC is
1591	   justified.  If it is not, packet discard will still act as the
1592	   "congestion response of last resort" by thinning out the traffic.  In
1593	   contrast, not setting ECT on pure ACKs is certainly detrimental to
1594	   performance, because when a pure ACK is lost it can prevent the
1595	   release of new data.

1597	   In the case when Classic ECN has been negotiated, the argument for
1598	   ECT on pure ACKs is less clear-cut.  Some of the installed base of
1599	   RFC 3168 implementations might happen to (unintentionally) provide a
1600	   feedback mechanism to support a cwnd response.  For those that did
1601	   not, setting ECT on pure ACKs would be better for the flow's own
1602	   performance than not setting it.  However, where there was no
1603	   feedback mechanism, setting ECT could do slightly more harm than not
1604	   setting it.  AckCC could provide a complementary response mechanism,
1605	   because it is designed to work with RFC 3168 ECN, but it has
1606	   deployment challenges.  In summary, a congestion response mechanism
1607	   is unlikely to be feasible with the installed base of classic ECN.

1609	   This specification uses a safe approach.  Allowing hosts to set ECT
1610	   on Pure ACKs without a feasible response mechanism could result in
1611	   risk.  It would certainly improve the flow's own performance, but it
1612	   would slightly increase potential harm to others.  Moreover, it would
1613	   set an undesirable precedent for setting ECT on packets with no
1614	   mechanism to respond to any resulting congestion signals.  Therefore,
1615	   Section 3.2.3 allows ECT on Pure ACKs if AccECN feedback has been
1616	   negotiated, but not with classic RFC 3168 ECN feedback.

1618	4.5.  Window Probes

1620	   Section 6.1.6 of RFC 3168 presents only the reliability argument for
1621	   prohibiting ECT on Window probes:

1623	      "If a window probe packet is dropped in the network, this loss is
1624	      not detected by the receiver.  Therefore, the TCP data sender MUST
1625	      NOT set either an ECT codepoint or the CWR bit on window probe
1626	      packets.

1628	      However, because window probes use exact sequence numbers, they
1629	      cannot be easily spoofed in denial-of-service attacks.  Therefore,
1630	      if a window probe arrives with the CE codepoint set, then the
1631	      receiver SHOULD respond to the ECN indications."

1633	   The reliability argument has already been addressed in Section 4.1.

1635	   Allowing ECT on window probes could considerably improve performance
1636	   because, once the receive window has reopened, if a window probe is
1637	   lost the sender will stall until the next window probe reaches the
1638	   receiver, which might be after the maximum retransmission timeout (at
1639	   least 1 minute [RFC6298]).

1641	   On the bright side, RFC 3168 at least specifies the receiver
1642	   behaviour if a CE-marked window probe arrives, so changing the
1643	   behaviour ought to be less painful than for other packet types.

1645	4.6.  FINs

1647	   RFC 3168 is silent on whether a TCP sender can set ECT on a FIN.  A
1648	   FIN is considered as part of the sequence of data, and the rate of
1649	   pure ACKs sent after a FIN could be controlled by a CE marking on the
1650	   FIN.  Therefore there is no reason not to set ECT on a FIN.

1652	4.7.  RSTs

1654	   RFC 3168 is silent on whether a TCP sender can set ECT on a RST.  The
1655	   host generating the RST message does not have an open connection
1656	   after sending it (either because there was no such connection when
1657	   the packet that triggered the RST message was received or because the
1658	   packet that triggered the RST message also triggered the closure of
1659	   the connection).

1661	   Moreover, the receiver of a CE-marked RST message can either: i)
1662	   accept the RST message and close the connection; ii) emit a so-called
1663	   challenge ACK in response (with suitable throttling) [RFC5961] and
1664	   otherwise ignore the RST (e.g. because the sequence number is in-
1665	   window but not the precise number expected next); or iii) discard the
1666	   RST message (e.g. because the sequence number is out-of-window).  In
1667	   the first two cases there is no point in echoing any CE mark received
1668	   because the sender closed its connection when it sent the RST.  In
1669	   the third case it makes sense to discard the CE signal as well as the
1670	   RST.
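
   The three receiver cases above can be sketched as follows.  This is
   a minimal, non-normative illustration in Python.  The names
   RstAction, classify_rst and handle_rst are hypothetical, the
   sequence-number test is a simplified rendering of the checks in
   [RFC5961], and throttling of challenge ACKs is not modelled.

```python
from enum import Enum

SEQ_MOD = 2**32  # TCP sequence numbers wrap modulo 2^32


class RstAction(Enum):
    ACCEPT = "accept the RST and close the connection"
    CHALLENGE_ACK = "emit a challenge ACK and otherwise ignore the RST"
    DISCARD = "discard the RST"


def classify_rst(seq, rcv_nxt, rcv_wnd):
    """Classify an incoming RST by its sequence number (simplified)."""
    if seq == rcv_nxt:
        # Case i): exactly the sequence number expected next.
        return RstAction.ACCEPT
    if (seq - rcv_nxt) % SEQ_MOD < rcv_wnd:
        # Case ii): in-window, but not the precise number expected.
        return RstAction.CHALLENGE_ACK
    # Case iii): out-of-window.
    return RstAction.DISCARD


def handle_rst(seq, rcv_nxt, rcv_wnd, ce_marked):
    """Return (action, echo_ce) for an incoming RST.

    echo_ce is False in every case: in cases i) and ii) the sender of
    the RST has no connection left to respond with, and in case iii)
    the CE signal is discarded along with the RST.
    """
    return classify_rst(seq, rcv_nxt, rcv_wnd), False
```

   For instance, handle_rst(1200, 1000, 500, True) yields the
   challenge-ACK action with no echo of the CE mark.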

1672	   Although a congestion response following a CE-marking on a RST does
1673	   not appear to make sense, the following factors have been considered
1674	   before deciding whether the sender ought to set ECT on a RST message:

1676	   *  As explained above, a congestion response by the sender of a CE-
1677	      marked RST message is not possible;

1679	   *  So the only reason for the sender setting ECT on a RST would be to
1680	      improve the reliability of the message's delivery;

1682	   *  RST messages are used to both mount and mitigate attacks:

1684	      -  Spoofed RST messages are used by attackers to terminate ongoing
1685	         connections, although the mitigations in RFC 5961 have
1686	         considerably raised the bar against off-path RST attacks;

1688	      -  Legitimate RST messages allow endpoints to inform their peers
1689	         to eliminate existing state that corresponds to non-existent
1690	         connections, liberating resources, e.g. in DoS attack
1691	         scenarios;

1693	   *  AQMs are advised to disable ECN marking during persistent
1694	      overload, so:

1696	      -  it is harder for an attacker to exploit ECN to intensify an
1697	         attack;

1699	      -  it is harder for a legitimate user to exploit ECN to more
1700	         reliably mitigate an attack;

1702	   *  Prohibiting ECT on a RST would deny the benefit of ECN to
1703	      legitimate RST messages, but not to attackers who can disregard
1704	      RFCs;

1706	   *  If ECT were prohibited on RSTs:

1708	      -  it would be easy for security middleboxes to discard all ECN-
1709	         capable RSTs;

1711	      -  However, unlike a SYN flood, it is already easy for a security
1712	         middlebox (or host) to distinguish a RST flood from legitimate
1713	         traffic [RFC5961], and even if some legitimate RSTs are
1714	         accidentally removed as well, legitimate connections still
1715	         function.

1717	   So, on balance, it has been decided that it is worth experimenting
1718	   with ECT on RSTs.  During experiments, if the ECN capability on RSTs
1719	   is found to open a vulnerability that is hard to close, this decision
1720	   can be reversed, before it is specified for the standards track.

1722	4.8.  Retransmitted Packets

1724	   RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets.
1725	   The rationale for this consumes nearly 2 pages of RFC 3168, so the
1726	   reader is referred to section 6.1.5 of RFC 3168, rather than quoting
1727	   it all here.  There are essentially three arguments, namely:
1728	   reliability; DoS attacks; and over-reaction to congestion.  We
1729	   address them in order below.

1731	   The reliability argument has already been addressed in Section 4.1.

1733	   Protection against DoS attacks is not afforded by prohibiting ECT on
1734	   retransmitted packets.  An attacker can set CE on spoofed
1735	   retransmissions whether or not it is prohibited by an RFC.
1736	   Protection against the DoS attack described in section 6.1.5 of RFC
1737	   3168 is solely afforded by the requirement that "the TCP data
1738	   receiver SHOULD ignore the CE codepoint on out-of-window packets".
1739	   Therefore in Section 3.2.7 the sender is allowed to set ECT on
1740	   retransmitted packets, in order to reduce the chance of them being
1741	   dropped.  We also strengthen the receiver's requirement from "SHOULD
1742	   ignore" to "MUST ignore".  And we generalize the receiver's
1743	   requirement to include failure of any validity check, not just out-
1744	   of-window checks, in order to include the more stringent validity
1745	   checks in RFC 5961 that have been developed since RFC 3168.

1747	   A consequence is that, for those retransmitted packets that arrive at
1748	   the receiver after the original packet has been properly received
1749	   (so-called spurious retransmissions), any CE marking will be ignored.
1750	   There is no problem with that because the fact that the original
1751	   packet has been delivered implies that the sender's original
1752	   congestion response (when it deemed the packet lost and retransmitted
1753	   it) was unnecessary.
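
   The receiver-side rule discussed above might be sketched as follows.
   This is an illustrative Python fragment, not a definitive
   implementation.  The function names are hypothetical, and only the
   classic in-window acceptability test is modelled; the additional
   validity checks of [RFC5961] are assumed to be applied elsewhere.

```python
SEQ_MOD = 2**32  # TCP sequence numbers wrap modulo 2^32


def _in_window(seq, rcv_nxt, rcv_wnd):
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2^32."""
    return (seq - rcv_nxt) % SEQ_MOD < rcv_wnd


def segment_acceptable(seq, seg_len, rcv_nxt, rcv_wnd):
    """Simplified in-window acceptability test for an arriving segment."""
    if seg_len == 0:
        if rcv_wnd == 0:
            return seq == rcv_nxt
        return _in_window(seq, rcv_nxt, rcv_wnd)
    if rcv_wnd == 0:
        return False
    last = (seq + seg_len - 1) % SEQ_MOD
    # Acceptable if the first or the last byte falls within the window.
    return (_in_window(seq, rcv_nxt, rcv_wnd)
            or _in_window(last, rcv_nxt, rcv_wnd))


def ce_counts(ce_marked, seq, seg_len, rcv_nxt, rcv_wnd):
    """True only if a CE mark arrived on a segment that passes the check.

    A CE mark on a segment that fails the check is ignored.  This
    includes a spurious retransmission of data already received,
    because all its bytes lie below rcv_nxt.
    """
    return bool(ce_marked) and segment_acceptable(
        seq, seg_len, rcv_nxt, rcv_wnd)
```

   With rcv_nxt = 1000 and rcv_wnd = 500, a CE-marked retransmission
   covering bytes 800 to 899 fails the check, so its CE mark is not
   counted.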

1755	   Finally, the third argument is about over-reacting to congestion.
1756	   The argument goes that, if a retransmitted packet is dropped, the
1757	   sender will not detect it, so it will not react again to congestion
1758	   (it would have reduced its congestion window already when it
1759	   retransmitted the packet).  Whereas, if retransmitted packets can be
1760	   CE tagged instead of dropped, senders could potentially react more
1761	   than once to congestion.  However, we argue that it is legitimate to
1762	   respond again to congestion if it still persists in subsequent round
1763	   trip(s).

1765	   Therefore, in all three cases, it is not incorrect to set ECT on
1766	   retransmissions.

1768	4.9.  General Fall-back for any Control Packet

1770	   Extensive experiments have found no evidence of any traversal
1771	   problems with ECT on any TCP control packet [Mandalari18].
1772	   Nonetheless, Sections 3.2.1.4 and 3.2.2.3 specify fall-back measures
1773	   if ECT on the first packet of each half-connection (SYN or SYN-ACK)
1774	   appears to be blocking progress.  Here, the question of fall-back
1775	   measures for ECT on other control packets is explored.  It supports
1776	   the advice given in Section 3.2.8; until there's evidence that
1777	   something's broken, don't fix it.

1779	   If an implementation has had to disable ECT to ensure the first
1780	   packet of a flow (SYN or SYN-ACK) gets through, the question arises
1781	   whether it ought to disable ECT on all subsequent control packets
1782	   within the same TCP connection.  Without evidence of any such
1783	   problems, this seems unnecessarily cautious, particularly given it
1784	   would be hard to detect loss of most other types of TCP control
1785	   packets that are not ACK'd, and particularly given that
1786	   unnecessarily removing ECT from other control packets could lead to
1787	   performance problems, e.g. by directing them into another queue
1788	   [I-D.ietf-tsvwg-ecn-l4s-id] or over a different path, because some
1789	   broken multipath equipment (erroneously) routes based on all 8 bits
1790	   of the Diffserv field.

1792	   In the case where a connection starts without ECT on the SYN (perhaps
1793	   because problems with previous connections had been cached), there
1794	   will have been no test for ECT traversal in the client-server
1795	   direction until the pure ACK that completes the handshake.  It is
1796	   possible that some middlebox might block ECT on this pure ACK or on
1797	   later retransmissions of lost packets.  Similarly, after a route
1798	   change, the new path might include some middlebox that blocks ECT on
1799	   some or all TCP control packets.  However, without evidence of such
1800	   problems, the complexity of a fix does not seem worthwhile.

1802	      MORE MEASUREMENTS NEEDED (?): If further two-ended measurements do
1803	      find evidence for these traversal problems, measurements would be
1804	      needed to check for correlation of ECT traversal problems between
1805	      different control packets.  It might then be necessary to
1806	      introduce a catch-all fall-back rule that disables ECT on certain
1807	      subsequent TCP control packets based on some criteria developed
1808	      from these measurements.

1810	5.  Interaction with popular variants or derivatives of TCP

1812	   The following subsections discuss any interactions between setting
1813	   ECT on all packets and using the following popular variants of TCP:
1814	   IW10, TFO and L4S.  This section also briefly notes the possibility
1815	   that the principles applied here should translate to protocols
1816	   derived from TCP.  This section is informative, not normative,
1817	   because no interactions have been identified that require any change
1818	   to specifications.  The subsection on IW10 discusses potential
1819	   changes to specifications but recommends that no changes are needed.

1821	   The designs of the following TCP variants have also been assessed and
1822	   found not to interact adversely with ECT on TCP control packets: SYN
1823	   cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]),
1824	   TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

1826	5.1.  IW10

1828	   IW10 is an experiment to determine whether it is safe for TCP to use
1829	   an initial window of 10 SMSS [RFC6928].

1831	   This subsection does not recommend any additions to the present
1832	   specification in order to interwork with IW10.  The specifications as
1833	   they stand are safe, and there is only a corner-case with ECT on the
1834	   SYN where performance could be occasionally improved, as explained
1835	   below.

1837	   As specified in Section 3.2.1.1, a TCP initiator will typically only
1838	   set ECT on the SYN if it requests AccECN support.  If, however, the
1839	   SYN-ACK tells the initiator that the responder does not support
1840	   AccECN, Section 3.2.1.1 advises the initiator to conservatively
1841	   reduce its initial window, preferably to 1 SMSS because, if the SYN
1842	   was CE-marked, the SYN-ACK has no way to feed that back.

1844	   If the initiator implements IW10, it seems rather over-conservative
1845	   to reduce IW from 10 to 1 just in case a congestion marking was
1846	   missed.  Nonetheless, a reduction to 1 SMSS will rarely harm
1847	   performance, because:

1849	   *  as long as the initiator is caching failures to negotiate AccECN,
1850	      subsequent attempts to access the same server will not use ECT on
1851	      the SYN anyway, so there will no longer be any need to
1852	      conservatively reduce IW;

1854	   *  currently, at least for web sessions, it is extremely rare for a
1855	      TCP initiator (client) to have more than one data segment to send
1856	      at the start of a TCP connection (see Fig 3 in [Manzoor17]) - IW10
1857	      is primarily exploited by TCP servers.

1859	   If a responder receives feedback that the SYN-ACK was CE-marked,
1860	   Section 3.2.2.2 recommends that it reduces its initial window,
1861	   preferably to 1 SMSS.  When the responder also implements IW10, it
1862	   might again seem rather over-conservative to reduce IW from 10 to 1.
1863	   But in this case the rationale is somewhat different:

1865	   *  Feedback that the SYN-ACK was CE-marked is an explicit indication
1866	      that the queue has been building, not just uncertainty due to
1867	      absence of feedback;

1869	   *  Given it is now likely that a queue already exists, the more data
1870	      packets that the server sends in its IW, the more likely at least
1871	      one will be CE marked, leading it to exit slow-start early.
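
   The two initial window reductions discussed in this subsection can
   be summarised in the following Python sketch.  It is illustrative
   only.  The function and parameter names are hypothetical, windows
   are expressed in units of SMSS, and the preferred value of 1 SMSS is
   used for the reduced window.  The case where AccECN feedback reports
   a CE mark on the SYN itself is outside the scope of this sketch.

```python
REDUCED_IW = 1  # preferred conservative initial window, in SMSS


def initiator_iw(default_iw, ect_on_syn, responder_supports_accecn):
    """Initial window for the initiator once the SYN-ACK has arrived.

    If the SYN carried ECT but the responder does not support AccECN,
    a CE mark on the SYN could not have been fed back, so the
    initiator conservatively reduces its initial window.
    """
    if ect_on_syn and not responder_supports_accecn:
        return REDUCED_IW
    return default_iw


def responder_iw(default_iw, synack_ce_feedback):
    """Initial window for the responder once the handshake completes.

    Feedback that the SYN-ACK was CE-marked is an explicit indication
    of a queue, so the responder reduces its initial window.
    """
    return REDUCED_IW if synack_ce_feedback else default_iw
```

   For an IW10 host, default_iw would be 10, so initiator_iw(10, True,
   False) and responder_iw(10, True) both give 1.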

1873	   Experimentation will be needed to determine the best strategy.  It
1874	   should be noted that experience from recent congestion avoidance
1875	   experiments where the window is reduced by less than half is not
1876	   necessarily applicable to a flow start scenario.  Reducing cwnd by
1877	   less is one thing.  Reducing an increase in cwnd by less is another.

1879	5.2.  TFO

1881	   TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round
1882	   trip delay of TCP's 3-way hand-shake (3WHS).  A TFO initiator caches
1883	   a cookie from a previous connection with a TFO-enabled server.  Then,
1884	   for subsequent connections to the same server, any data included on
1885	   the SYN can be passed directly to the server application, which can
1886	   then return up to an initial window of response data on the SYN-ACK
1887	   and on data segments straight after it, without waiting for the ACK
1888	   that completes the 3WHS.

1890	   The TFO experiment and the present experiment to add ECN-support for
1891	   TCP control packets can be combined without altering either
1892	   specification, which is justified as follows:

1894	   *  The handling of ECN marking on a SYN is no different whether or
1895	      not it carries data.

1897	   *  In response to any CE-marking on the SYN-ACK, the responder adopts
1898	      the normal response to congestion, as discussed in Section 7.2 of
1899	      [RFC7413].

1901	5.3.  L4S

1903	   A Low Latency Low Loss Scalable throughput (L4S) variant of TCP such
1904	   as TCP Prague [PragueLinux] is mandated to negotiate AccECN feedback,
1905	   and strongly recommended to use ECN++ [I-D.ietf-tsvwg-ecn-l4s-id].

1907	   The L4S experiment and the present ECN++ experiment can be combined
1908	   without altering any of the specifications.  The only difference
1909	   would be in the recommendation of the best SYN cache strategy.

1911	   The normative specification for ECT on a SYN in Section 3.2.1
1912	   recommends the "optimistic ECT and cache failures" strategy (S2B
1913	   defined in Section 4.2.3) for the general Internet.  However, if a
1914	   user's Internet access bottleneck supported L4S ECN but not Classic
1915	   ECN, the "optimistic ECT without a cache" strategy (S2A) would make
1916	   most sense, because there would be little point trying to avoid the
1917	   'over-strict' test and negotiate Classic ECN, if L4S ECN but not
1918	   Classic ECN was available on that user's access link (as is the case
1919	   with Low Latency DOCSIS [DOCSIS3.1]).

1921	   Strategy (S2A) is the simplest, because it requires no cache.  It
1922	   would satisfy the goal of an implementer who is solely interested in
1923	   ultra-low latency using AccECN and ECN++ (e.g. accessing L4S servers)
1924	   and is not concerned about fall-back to Classic ECN (e.g. when
1925	   accessing other servers).
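
   The difference between the two strategies can be illustrated with
   the following Python sketch.  It is a simplification under stated
   assumptions: the function names are hypothetical, the cache is
   modelled as a plain set keyed by server, and cache expiry is not
   modelled.

```python
def ect_on_syn(strategy, server, failure_cache):
    """Decide whether the initiator sets ECT on the SYN to this server."""
    if strategy == "S2A":
        # Optimistic ECT without a cache: always set ECT on the SYN.
        return True
    if strategy == "S2B":
        # Optimistic ECT and cache failures: set ECT unless an earlier
        # attempt to negotiate AccECN with this server has failed.
        return server not in failure_cache
    raise ValueError("unknown strategy: " + repr(strategy))


def note_handshake_result(strategy, server, accecn_negotiated,
                          failure_cache):
    """Record a failure to negotiate AccECN (strategy S2B only)."""
    if strategy == "S2B" and not accecn_negotiated:
        failure_cache.add(server)
```

   Under S2A the failure_cache argument is never consulted or updated,
   which is what makes that strategy the simplest to implement.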

1927	5.4.  Other transport protocols

1929	   Experience from experiments on adding ECN support to all TCP packets
1930	   ought to be directly transferable between TCP and other transport
1931	   protocols, like SCTP or QUIC.

1933	   Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
1934	   track transport protocol derived from TCP.  SCTP currently does not
1935	   include ECN support, but Appendix A of RFC 4960 broadly describes how
1936	   it would be supported and a (long-expired) draft on the addition of
1937	   ECN to SCTP has been produced [I-D.stewart-tsvwg-sctpecn].  This
1938	   draft avoided setting ECT on control packets and retransmissions,
1939	   closely following the arguments in RFC 3168.

1941	   QUIC [RFC9000] is another standards track transport protocol offering
1942	   similar services to TCP but intended to exploit some of the benefits
1943	   of running over UDP.  Building on the arguments in the current draft,
1944	   a QUIC sender sets ECT(0) on all packets.

1946	6.  Security Considerations

1948	   Section 3.2.6 considers the question of whether ECT on RSTs will
1949	   allow RST attacks to be intensified.  There are several security
1950	   arguments presented in RFC 3168 for preventing the ECN marking of TCP
1951	   control packets and retransmitted segments.  We believe all of them
1952	   have been properly addressed in Section 4, particularly Section 4.2.4
1953	   and Section 4.8 on DoS attacks using spoofed ECT-marked SYNs and
1954	   spoofed CE-marked retransmissions.

1956	   Section 3.2.6 on sending TCP RSTs points out that implementers need
1957	   to take care to ensure that the ECN field on a RST does not depend on
1958	   TCP's state machine.  Otherwise the internal information revealed
1959	   could be of use to potential attackers.  This point applies more
1960	   generally to all control packets, not just RSTs.

1962	7.  IANA Considerations

1964	   There are no IANA considerations in this memo.

1966	8.  Acknowledgments

1968	   Thanks to Mirja Kuehlewind, David Black, Padma Bhooma, Gorry
1969	   Fairhurst, Michael Scharf, Yuchung Cheng and Christophe Paasch for
1970	   their useful reviews.  Richard Scheffenegger provided useful advice
1971	   gained from implementing ECN++ for FreeBSD.

1973	   The work of Marcelo Bagnulo has been performed in the framework of
1974	   the H2020-ICT-2014-2 project 5G NORMA.  His contribution reflects the
1975	   consortium's view, but the consortium is not liable for any use that
1976	   may be made of any of the information contained therein.

1978	   Bob Briscoe's contribution was partly funded by the Research Council
1979	   of Norway through the TimeIn project, partly by CableLabs and partly
1980	   by the Comcast Innovation Fund.  The views expressed here are solely
1981	   those of the authors.

1983	9.  References

1985	9.1.  Normative References

1987	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1988	              Requirement Levels", BCP 14, RFC 2119,
1989	              DOI 10.17487/RFC2119, March 1997,
1990	              <https://www.rfc-editor.org/info/rfc2119>.

1992	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
1993	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
1994	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

1996	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1997	              of Explicit Congestion Notification (ECN) to IP",
1998	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1999	              <https://www.rfc-editor.org/info/rfc3168>.

2001	   [RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
2002	              Robustness to Blind In-Window Attacks", RFC 5961,
2003	              DOI 10.17487/RFC5961, August 2010,
2004	              <https://www.rfc-editor.org/info/rfc5961>.

2006	   [I-D.ietf-tcpm-accurate-ecn]
2007	              Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More
2008	              Accurate ECN Feedback in TCP", Work in Progress, Internet-
2009	              Draft, draft-ietf-tcpm-accurate-ecn-15, 12 July 2021,
2010	              <https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-
2011	              accurate-ecn-15>.

2013	   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
2014	              Notification (ECN) Experimentation", RFC 8311,
2015	              DOI 10.17487/RFC8311, January 2018,
2016	              <https://www.rfc-editor.org/info/rfc8311>.

2018	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
2019	              RFC 793, DOI 10.17487/RFC0793, September 1981,
2020	              <https://www.rfc-editor.org/info/rfc793>.

2022	9.2.  Informative References

2024	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
2025	              Communication Layers", STD 3, RFC 1122,
2026	              DOI 10.17487/RFC1122, October 1989,
2027	              <https://www.rfc-editor.org/info/rfc1122>.

2029	   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
2030	              Congestion Notification (ECN) Signaling with Nonces",
2031	              RFC 3540, DOI 10.17487/RFC3540, June 2003,
2032	              <https://www.rfc-editor.org/info/rfc3540>.

2034	   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
2035	              RFC 4960, DOI 10.17487/RFC4960, September 2007,
2036	              <https://www.rfc-editor.org/info/rfc4960>.

2038	   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
2039	              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007,
2040	              <https://www.rfc-editor.org/info/rfc4987>.

2042	   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
2043	              Ramakrishnan, "Adding Explicit Congestion Notification
2044	              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
2045	              DOI 10.17487/RFC5562, June 2009,
2046	              <https://www.rfc-editor.org/info/rfc5562>.

2048	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
2049	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
2050	              <https://www.rfc-editor.org/info/rfc5681>.

2052	   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
2053	              Acknowledgement Congestion Control to TCP", RFC 5690,
2054	              DOI 10.17487/RFC5690, February 2010,
2055	              <https://www.rfc-editor.org/info/rfc5690>.

2057	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
2058	              "Computing TCP's Retransmission Timer", RFC 6298,
2059	              DOI 10.17487/RFC6298, June 2011,
2060	              <https://www.rfc-editor.org/info/rfc6298>.

2062	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
2063	              "Increasing TCP's Initial Window", RFC 6928,
2064	              DOI 10.17487/RFC6928, April 2013,
2065	              <https://www.rfc-editor.org/info/rfc6928>.

2067	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
2068	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
2069	              <https://www.rfc-editor.org/info/rfc7413>.

2071	   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
2072	              Recommendations Regarding Active Queue Management",
2073	              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
2074	              <https://www.rfc-editor.org/info/rfc7567>.

2076	   [RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
2077	              TCP to Support Rate-Limited Traffic", RFC 7661,
2078	              DOI 10.17487/RFC7661, October 2015,
2079	              <https://www.rfc-editor.org/info/rfc7661>.

2081	   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
2082	              and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
2083	              Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
2084	              October 2017, <https://www.rfc-editor.org/info/rfc8257>.

2086	   [RFC2140]  Touch, J., "TCP Control Block Interdependence", RFC 2140,
2087	              DOI 10.17487/RFC2140, April 1997,
2088	              <https://www.rfc-editor.org/info/rfc2140>.

2090	   [I-D.ietf-tsvwg-ecn-l4s-id]
2091	              Schepper, K. D. and B. Briscoe, "Explicit Congestion
2092	              Notification (ECN) Protocol for Very Low Queuing Delay
2093	              (L4S)", Work in Progress, Internet-Draft, draft-ietf-
2094	              tsvwg-ecn-l4s-id-23, 24 December 2021,
2095	              <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-
2096	              ecn-l4s-id-23>.

2098	   [I-D.ietf-tsvwg-l4s-arch]
2099	              Briscoe, B., Schepper, K. D., Bagnulo, M., and G. White,
2100	              "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
2101	              Service: Architecture", Work in Progress, Internet-Draft,
2102	              draft-ietf-tsvwg-l4s-arch-15, 24 December 2021,
2103	              <https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-
2104	              l4s-arch-15>.

2106	   [I-D.stewart-tsvwg-sctpecn]
2107	              Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream
2108	              Control Transmission Protocol (SCTP)", Work in Progress,
2109	              Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January
2110	              2014, <https://datatracker.ietf.org/doc/html/draft-
2111	              stewart-tsvwg-sctpecn-05>.

2113	   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
2114	              Multiplexed and Secure Transport", RFC 9000,
2115	              DOI 10.17487/RFC9000, May 2021,
2116	              <https://www.rfc-editor.org/info/rfc9000>.

2118	   [judd-nsdi]
2119	              Judd, G.J., "Attaining the promise and avoiding the
2120	              pitfalls of TCP in the Datacenter", USENIX Symposium on
2121	              Networked Systems Design and Implementation
2122	              (NSDI'15) pp.145-157, May 2015,
2123	              <https://www.usenix.org/node/188966>.

2125	   [ecn-pam]  Trammell, B., Kühlewind, M., Boppart, D., Learmonth, I.,
2126	              Fairhurst, G., and R. Scheffenegger, "Enabling Internet-
2127	              Wide Deployment of Explicit Congestion Notification",
2128	              Int'l Conf. on Passive and Active Network Measurement
2129	              (PAM'15) pp.193-205, 2015, <https://link.springer.com/
2130	              chapter/10.1007/978-3-319-15509-8_15>.

2132	   [ECN-PLUS] Kuzmanovic, A., "The Power of Explicit Congestion
2133	              Notification", ACM SIGCOMM 35(4):61--72, 2005,
2134	              <http://dl.acm.org/citation.cfm?id=1080100>.

2136	   [Mandalari18]
2137	              Mandalari, A., Lutu, A., Briscoe, B., Bagnulo, M., and Ö.
2138	              Alay, "Measuring ECN++: Good News for ++, Bad News for ECN
2139	              over Mobile", IEEE Communications Magazine, March 2018,
2140	              <https://ieeexplore.ieee.org/document/8316790>.

2142	   [Manzoor17]
2143	              Manzoor, J., Drago, I., and R. Sadre, "How HTTP/2 is
2144	              changing Web traffic and how to detect it", In Proc:
2145	              Network Traffic Measurement and Analysis Conference (TMA)
2146	              2017 pp.1-9, June 2017,
2147	              <https://ieeexplore.ieee.org/document/8002899>.

2149	   [Kuehlewind18]
2150	              Kühlewind, M., Walter, M., Learmonth, I., and B. Trammell,
2151	              "Tracing Internet Path Transparency", In Proc: Network
2152	              Traffic Measurement and Analysis Conference (TMA) 2018,
2153	              June 2018, <http://tma.ifip.org/2018/wp-
2154	              content/uploads/sites/3/2018/06/tma2018_paper12.pdf>.

2156	   [strict-ecn]
2157	              Dumazet, E., "tcp: be more strict before accepting ECN
2158	              negociation", Linux netdev patch list, 4 May 2012,
2159	              <https://patchwork.ozlabs.org/patch/156953/>.

2161	   [relax-strict-ecn]
2162	              Tilmans, O., "tcp: Accept ECT on SYN in the presence of
2163	              RFC8311", Linux netdev patch list, 3 April 2019,
2164	              <https://lore.kernel.org/patchwork/patch/1057812/>.

2166	   [ecn-overload]
2167	              Steen, H., "Destruction Testing: Ultra-Low Delay using
2168	              Dual Queue Coupled Active Queue Management", Masters
2169	              Thesis, Uni Oslo, May 2017,
2170	              <https://www.duo.uio.no/bitstream/handle/10852/57424/
2171	              thesis-henrste.pdf?sequence=1>.

2173	   [PragueLinux]
2174	              Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,
2175	              Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing
2176	              the `TCP Prague' Requirements for Low Latency Low Loss
2177	              Scalable Throughput (L4S)", Proc. Linux Netdev 0x13,
2178	              March 2019, <https://www.netdevconf.org/0x13/
2179	              session.html?talk-tcp-prague-l4s>.

2181	   [DOCSIS3.1]
2182	              CableLabs, "MAC and Upper Layer Protocols Interface
2183	              (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable
2184	              Service Interface Specifications DOCSIS® 3.1 Version i17
2185	              or later, 21 January 2019, <https://specification-
2186	              search.cablelabs.com/CM-SP-MULPIv3.1>.

2188	Authors' Addresses

2190	   Marcelo Bagnulo
2191	   Universidad Carlos III de Madrid
2192	   Av. Universidad 30
2193	   28911 Leganes Madrid
2194	   Spain

2196	   Phone: 34 91 6249500
2197	   Email: marcelo@it.uc3m.es
2198	   URI:   http://www.it.uc3m.es

2200	   Bob Briscoe
2201	   Independent
2202	   United Kingdom

2204	   Email: ietf@bobbriscoe.net
2205	   URI:   http://bobbriscoe.net/