idnits 2.17.1 

draft-ietf-tcpm-generalized-ecn-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC5562, but the
     abstract doesn't seem to mention this, which it should.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document date (September 28, 2017) is 2392 days in the past.  Is
     this intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tcpm-accurate-ecn-03

  == Outdated reference: A later version (-08) exists of
     draft-ietf-tsvwg-ecn-experimentation-06

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-06

  == Outdated reference: A later version (-29) exists of
     draft-ietf-tsvwg-ecn-l4s-id-00

  == Outdated reference: A later version (-20) exists of
     draft-ietf-tsvwg-l4s-arch-00

  == Outdated reference: A later version (-06) exists of
     draft-stewart-tsvwg-sctpecn-05

  -- Obsolete informational reference (is this intentional?): RFC  793
     (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)


     Summary: 0 errors (**), 0 flaws (~~), 8 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                         M. Bagnulo
3	Internet-Draft                                                      UC3M
4	Obsoletes: 5562 (if approved)                                 B. Briscoe
5	Intended status: Experimental                                  CableLabs
6	Expires: April 1, 2018                                September 28, 2017

8	  ECN++: Adding Explicit Congestion Notification (ECN) to TCP Control
9	                                Packets
10	                   draft-ietf-tcpm-generalized-ecn-01

12	Abstract

14	   This document describes an experimental modification to ECN when used
15	   with TCP.  It allows the use of ECN on the following TCP packets:
16	   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

18	Status of This Memo

20	   This Internet-Draft is submitted in full conformance with the
21	   provisions of BCP 78 and BCP 79.

23	   Internet-Drafts are working documents of the Internet Engineering
24	   Task Force (IETF).  Note that other groups may also distribute
25	   working documents as Internet-Drafts.  The list of current Internet-
26	   Drafts is at https://datatracker.ietf.org/drafts/current/.

28	   Internet-Drafts are draft documents valid for a maximum of six months
29	   and may be updated, replaced, or obsoleted by other documents at any
30	   time.  It is inappropriate to use Internet-Drafts as reference
31	   material or to cite them other than as "work in progress."

33	   This Internet-Draft will expire on April 1, 2018.

35	Copyright Notice

37	   Copyright (c) 2017 IETF Trust and the persons identified as the
38	   document authors.  All rights reserved.

40	   This document is subject to BCP 78 and the IETF Trust's Legal
41	   Provisions Relating to IETF Documents
42	   (https://trustee.ietf.org/license-info) in effect on the date of
43	   publication of this document.  Please review these documents
44	   carefully, as they describe your rights and restrictions with respect
45	   to this document.  Code Components extracted from this document must
46	   include Simplified BSD License text as described in Section 4.e of
47	   the Trust Legal Provisions and are provided without warranty as
48	   described in the Simplified BSD License.

50	Table of Contents

52	   1.  Introduction
53	     1.1.  Motivation
54	     1.2.  Experiment Goals
55	     1.3.  Document Structure
56	   2.  Terminology
57	   3.  Specification
58	     3.1.  Network (e.g. Firewall) Behaviour
59	     3.2.  Endpoint Behaviour
60	       3.2.1.  SYN
61	       3.2.2.  SYN-ACK
62	       3.2.3.  Pure ACK
63	       3.2.4.  Window Probe
64	       3.2.5.  FIN
65	       3.2.6.  RST
66	       3.2.7.  Retransmissions
67	       3.2.8.  General Fall-back for any Control Packet or
68	               Retransmission
69	   4.  Rationale
70	     4.1.  The Reliability Argument
71	     4.2.  SYNs
72	       4.2.1.  Argument 1a: Unrecognized CE on the SYN
73	       4.2.2.  Argument 1b: Unrecognized ECT on the SYN
74	       4.2.3.  Argument 2: DoS Attacks
75	     4.3.  SYN-ACKs
76	       4.3.1.  Response to Congestion on a SYN-ACK
77	       4.3.2.  Fall-Back if ECT SYN-ACK Fails
78	     4.4.  Pure ACKs
79	       4.4.1.  Cwnd Response to CE-Marked Pure ACKs
80	       4.4.2.  ACK Rate Response to CE-Marked Pure ACKs
81	       4.4.3.  Summary: Enabling ECN on Pure ACKs
82	     4.5.  Window Probes
83	     4.6.  FINs
84	     4.7.  RSTs
85	     4.8.  Retransmitted Packets
86	     4.9.  General Fall-back for any Control Packet
87	   5.  Interaction with popular variants or derivatives of TCP
88	     5.1.  IW10
89	     5.2.  TFO
90	     5.3.  TCP Derivatives
91	   6.  Security Considerations
92	   7.  IANA Considerations
93	   8.  Acknowledgments
94	   9.  References
95	     9.1.  Normative References
96	     9.2.  Informative References
97	   Authors' Addresses

99	1.  Introduction

101	   RFC 3168 [RFC3168] specifies support of Explicit Congestion
102	   Notification (ECN) in IP (v4 and v6).  By using the ECN capability,
103	   network elements (e.g. routers, switches) performing Active Queue
104	   Management (AQM) can use ECN marks instead of packet drops to signal
105	   congestion to the endpoints of a communication.  This results in
106	   lower packet loss and increased performance.  RFC 3168 also specifies
107	   support for ECN in TCP, but solely on data packets.  For various
108	   reasons it precludes the use of ECN on TCP control packets (TCP SYN,
109	   TCP SYN-ACK, pure ACKs, Window probes) and on retransmitted packets.
110	   RFC 3168 is silent about the use of ECN on RST and FIN packets.  RFC
111	   5562 [RFC5562] is an experimental modification to ECN that enables
112	   ECN support for TCP SYN-ACK packets.

114	   This document defines an experimental modification to ECN [RFC3168]
115	   that shall be called ECN++. It enables ECN support on all the
116	   aforementioned types of TCP packet.

118	   ECN++ is a sender-side change.  It works whether the two ends of the
119	   TCP connection use classic ECN feedback [RFC3168] or experimental
120	   Accurate ECN feedback (AccECN [I-D.ietf-tcpm-accurate-ecn]).
121	   Nonetheless, if the client does not implement AccECN, it cannot use
122	   ECN++ on the one packet that offers most benefit from it - the
123	   initial SYN.  Therefore, implementers of ECN++ are RECOMMENDED to
124	   also implement AccECN.

126	   ECN++ is designed for compatibility with a number of latency
127	   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
128	   window of 10 SMSS (IW10 [RFC6928]) and Low latency Low Loss Scalable
129	   Transport (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all be
130	   implemented and deployed independently.
131	   [I-D.ietf-tsvwg-ecn-experimentation] is a standards track procedural
132	   device that relaxes requirements in RFC 3168 and other standards
133	   track RFCs that would otherwise preclude the experimental
134	   modifications needed for ECN++ and other ECN experiments.

136	1.1.  Motivation

138	   The absence of ECN support on TCP control packets and retransmissions
139	   has a potentially harmful effect.  In any ECN deployment, non-ECN-
140	   capable packets suffer a penalty when they traverse a congested
141	   bottleneck.  For instance, with a drop probability of 1%, 1% of
142	   connection attempts suffer a timeout of about 1 second before the SYN
143	   is retransmitted, which is highly detrimental to the performance of
144	   short flows.  TCP control packets, particularly TCP SYNs and SYN-
145	   ACKs, are important for performance, so dropping them is best
146	   avoided.

148	   Non-ECN control packets particularly harm performance in environments
149	   where the ECN marking level is high.  For example, [judd-nsdi] shows
150	   that in a controlled private data centre (DC) environment where ECN
151	   is used (in conjunction with DCTCP [I-D.ietf-tcpm-dctcp]), the
152	   probability of being able to establish a new connection using a non-
153	   ECN SYN packet drops to close to zero even when there are only 16
154	   ongoing TCP flows transmitting at full speed.  The issue is that
155	   DCTCP exhibits a much more aggressive response to packet marking
156	   (which is why it is only applicable in controlled environments).
157	   This leads to a high marking probability for ECN-capable packets, and
158	   in turn a high drop probability for non-ECN packets.  Therefore non-
159	   ECN SYNs are dropped aggressively, rendering it nearly impossible to
160	   establish a new connection in the presence of even mild traffic load.

162	   Finally, there are ongoing experimental efforts to promote the
163	   adoption of a slightly modified variant of DCTCP (and similar
164	   congestion controls) over the Internet to achieve low latency, low
165	   loss and scalable throughput (L4S) for all communications
166	   [I-D.ietf-tsvwg-l4s-arch].  In such an approach, L4S packets identify
167	   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id].  With
168	   L4S and potentially other similar cases, preventing TCP control
169	   packets from obtaining the benefits of ECN would not only expose them
170	   to the prevailing level of congestion loss, but it would also
171	   classify control packets into a different queue with different
172	   network treatment, which may also lead to reordering, further
173	   degrading TCP performance.

175	1.2.  Experiment Goals

177	   The goal of the experimental modifications defined in this document
178	   is to allow the use of ECN on all TCP packets.  Experiments are
179	   expected in the public Internet as well as in controlled environments
180	   to understand the following issues:

182	   o  How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
183	      that carry the ECT(0), ECT(1) or CE codepoints are processed by
184	      the TCP endpoints and the network (including routers, firewalls
185	      and other middleboxes).  In particular we would like to learn if
186	      these packets are frequently blocked or if these packets are
187	      usually forwarded and processed.

189	   o  The scale of deployment of the different flavours of ECN,
190	      including [RFC3168], [RFC5562], [RFC3540] and
191	      [I-D.ietf-tcpm-accurate-ecn].

193	   o  How much the performance of TCP communications is improved by
194	      allowing ECN marking of each packet type.

196	   o  Whether enabling ECN marking of these packets raises any issues
197	      (including security issues).

199	   The data gathered through the experiments described in this document,
200	   particularly under the first 2 bullets above, will help in the design
201	   of the final mechanism (if any) for adding ECN support to the
202	   different packet types considered in this document.  Whenever data
203	   input is needed to assist in a design choice, it is spelled out
204	   throughout the document.

206	   Success criteria: The experiment will be a success if we obtain
207	   enough data to have a clearer view of the deployability and benefits
208	   of enabling ECN on all TCP packets, as well as any issues.  If the
209	   results of the experiment show that it is feasible to deploy such
210	   changes; that there are gains to be achieved through the changes
211	   described in this specification; and that no other major issues may
212	   interfere with the deployment of the proposed changes; then it would
213	   be reasonable to adopt the proposed changes in a standards track
214	   specification that would update RFC 3168.

216	1.3.  Document Structure

218	   The remainder of this document is structured as follows.  In
219	   Section 2, we present the terminology used in the rest of the
220	   document.  In Section 3, we specify the modifications to provide ECN
221	   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
222	   retransmissions.  We describe both the network behaviour and the
223	   endpoint behaviour.  Section 5 discusses variations of the
224	   specification that will be necessary to interwork with a number of
225	   popular variants or derivatives of TCP.  RFC 3168 provides a number
226	   of specific reasons why ECN support is not appropriate for each
227	   packet type.  In Section 4, we revisit each of these arguments for
228	   each packet type to justify why it is reasonable to conduct this
229	   experiment.

231	2.  Terminology

233	   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
234	   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
235	   document, are to be interpreted as described in [RFC2119].

237	   Pure ACK: A TCP segment with the ACK flag set and no data payload.

239	   SYN: A TCP segment with the SYN (synchronize) flag set.

241	   Window probe: Defined in [RFC0793], a window probe is a TCP segment
242	   with only one byte of data sent to learn if the receive window is
243	   still zero.

245	   FIN: A TCP segment with the FIN (finish) flag set.

247	   RST: A TCP segment with the RST (reset) flag set.

249	   Retransmission: A TCP segment that has been retransmitted by the TCP
250	   sender.

252	   ECT: ECN-Capable Transport.  One of the two codepoints ECT(0) or
253	   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6).  An
254	   ECN-capable sender sets one of these to indicate that both transport
255	   end-points support ECN.  When this specification says the sender sets
256	   an ECT codepoint, by default it means ECT(0).  Optionally, it could
257	   mean ECT(1), which is in the process of being redefined for use by
258	   L4S experiments [I-D.ietf-tsvwg-ecn-experimentation]
259	   [I-D.ietf-tsvwg-ecn-l4s-id].

261	   Not-ECT: The ECN codepoint set by senders that indicates that the
262	   transport is not ECN-capable.

264	   CE: Congestion Experienced.  The ECN codepoint that an intermediate
265	   node sets to indicate congestion [RFC3168].  A node sets an
266	   increasing proportion of ECT packets to CE as the level of congestion
267	   increases.
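
   As an illustration of the codepoints defined above, the following
   minimal sketch extracts the ECN field from an IPv4 TOS octet or IPv6
   Traffic Class octet.  The bit values are those assigned by RFC 3168;
   the names Ecn, ecn_field and is_ect are hypothetical and are not part
   of this specification.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """ECN field values (two least-significant bits of TOS / Traffic Class)."""
    NOT_ECT = 0b00  # transport is not ECN-capable
    ECT_1 = 0b01    # ECT(1)
    ECT_0 = 0b10    # ECT(0), the default ECT codepoint in this document
    CE = 0b11       # Congestion Experienced, set by a congested node


def ecn_field(tos_or_traffic_class: int) -> Ecn:
    """Return the ECN codepoint carried in a TOS / Traffic Class octet."""
    return Ecn(tos_or_traffic_class & 0b11)


def is_ect(codepoint: Ecn) -> bool:
    """True if the codepoint is one of the two ECN-capable codepoints."""
    return codepoint in (Ecn.ECT_0, Ecn.ECT_1)
```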

269	3.  Specification

271	3.1.  Network (e.g. Firewall) Behaviour

273	   Previously the specification of ECN for TCP [RFC3168] required the
274	   sender to set not-ECT on TCP control packets and retransmissions.
275	   Some readers of RFC 3168 might have erroneously interpreted this as a
276	   requirement for firewalls, intrusion detection systems, etc. to check
277	   and enforce this behaviour.  Section 4.3 of
278	   [I-D.ietf-tsvwg-ecn-experimentation] updates RFC 3168 to remove this
279	   ambiguity.  It requires firewalls or any intermediate nodes not to
280	   treat certain types of ECN-capable TCP segment differently (except
281	   potentially in one attack scenario).  This is likely to only involve
282	   a firewall rule change in a fraction of cases (at most 0.4% of paths
283	   according to the tests reported in Section 4.2.2).

285	   In case a TCP sender encounters a middlebox blocking ECT on certain
286	   TCP segments, the specification below includes behaviour to fall back
287	   to non-ECN.  However, this loses the benefit of ECN on control
288	   packets.  So operators are RECOMMENDED to alter their firewall rules
289	   to comply with the requirement referred to above (section 4.3 of
290	   [I-D.ietf-tsvwg-ecn-experimentation]).

292	3.2.  Endpoint Behaviour

294	   The changes to the specification of TCP over ECN [RFC3168] defined
295	   here solely alter the behaviour of the sending host for each half-
296	   connection.  All changes can be deployed at each end-point
297	   independently of the other and independently of any network behaviour.

299	   The feedback behaviour at the receiver depends on whether classic ECN
300	   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
301	   [I-D.ietf-tcpm-accurate-ecn] has been negotiated.  Nonetheless,
302	   neither receiver feedback behaviour is altered by the present
303	   specification.

305	   For each type of control packet or retransmission, the following
306	   sections detail changes to the sender's behaviour in two respects: i)
307	   whether it sets ECT; and ii) its response to congestion feedback.
308	   Table 1 summarises these two behaviours for each type of packet, but
309	   the relevant subsection below should be referred to for the detailed
310	   behaviour.  The subsection on the SYN is more complex than the
311	   others, because it has to include fall-back behaviour if the ECT
312	   packet appears not to have got through, and caching of the outcome to
313	   detect persistent failures.

315	   +---------+-----------------+------------------+--------------------+
316	   | TCP     | ECN field if    | ECN field if     | Congestion         |
317	   | packet  | AccECN f/b      | RFC3168 f/b      | Response           |
318	   | type    | negotiated*     | negotiated*      |                    |
319	   +---------+-----------------+------------------+--------------------+
320	   | SYN     | ECT             | not-ECT          | Reduce IW          |
321	   |         |                 |                  |                    |
322	   | SYN-ACK | ECT             | ECT              | Reduce IW          |
323	   |         |                 |                  |                    |
324	   | Pure    | ECT             | ECT              | Usual cwnd         |
325	   | ACK     |                 |                  | response and       |
326	   |         |                 |                  | optionally         |
327	   |         |                 |                  | [RFC5690]          |
328	   |         |                 |                  |                    |
329	   | W Probe | ECT             | ECT              | Usual cwnd         |
330	   |         |                 |                  | response           |
331	   |         |                 |                  |                    |
332	   | FIN     | ECT             | ECT              | None or optionally |
333	   |         |                 |                  | [RFC5690]          |
334	   |         |                 |                  |                    |
335	   | RST     | ECT             | ECT              | N/A                |
336	   |         |                 |                  |                    |
337	   | Re-XMT  | ECT             | ECT              | Usual cwnd         |
338	   |         |                 |                  | response           |
339	   +---------+-----------------+------------------+--------------------+

341	   Window probe and retransmission are abbreviated to W Probe and Re-XMT.
342	               * For a SYN, "negotiated" means "requested".

344	     Table 1: Summary of sender behaviour.  In each case the relevant
345	      section below should be referred to for the detailed behaviour

347	   It can be seen that the sender can set ECT in all cases, except if it
348	   is not requesting AccECN feedback on the SYN.  Therefore it is
349	   RECOMMENDED that the experimental AccECN specification
350	   [I-D.ietf-tcpm-accurate-ecn] is implemented (as well as the present
351	   specification), because it is expected that ECT on the SYN will give
352	   the most significant performance gain, particularly for short flows.
353	   Nonetheless, this specification also caters for the case where AccECN
354	   feedback is not implemented.
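
   Table 1 can be read as a simple lookup keyed by packet type.  The
   sketch below transcribes the table into a dictionary; the names
   TABLE_1 and sender_ecn_field are hypothetical, and the detailed
   per-packet rules in the following subsections still take precedence
   over this summary.

```python
# packet type -> (ECN field with AccECN f/b,
#                 ECN field with RFC 3168 f/b,
#                 congestion response), transcribed from Table 1.
TABLE_1 = {
    "SYN":      ("ECT", "not-ECT", "Reduce IW"),
    "SYN-ACK":  ("ECT", "ECT", "Reduce IW"),
    "Pure ACK": ("ECT", "ECT", "Usual cwnd response and optionally [RFC5690]"),
    "W Probe":  ("ECT", "ECT", "Usual cwnd response"),
    "FIN":      ("ECT", "ECT", "None or optionally [RFC5690]"),
    "RST":      ("ECT", "ECT", "N/A"),
    "Re-XMT":   ("ECT", "ECT", "Usual cwnd response"),
}


def sender_ecn_field(packet_type: str, accecn_feedback: bool) -> str:
    """ECN field the sender sets, given the negotiated feedback mode.

    For a SYN, "negotiated" means "requested" (see the note under Table 1).
    """
    with_accecn, with_rfc3168, _response = TABLE_1[packet_type]
    return with_accecn if accecn_feedback else with_rfc3168
```

   The only entry that is not ECT is a SYN that does not request AccECN
   feedback, which is why implementing AccECN is recommended.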

356	3.2.1.  SYN

358	3.2.1.1.  Setting ECT on the SYN

360	   With classic [RFC3168] ECN feedback, the SYN was never expected to be
361	   ECN-capable, so the flag provided to feed back congestion was put to
362	   another use (it is used in combination with other flags to indicate
363	   that the responder supports ECN).  In contrast, Accurate ECN (AccECN)
364	   feedback [I-D.ietf-tcpm-accurate-ecn] provides two codepoints in the
365	   SYN-ACK for the responder to feed back whether or not the SYN arrived
366	   marked CE.

368	   Therefore, a TCP initiator MUST NOT set ECT on a SYN unless it also
369	   attempts to negotiate Accurate ECN feedback in the same SYN.

371	   For the experiments proposed here, if the SYN is requesting AccECN
372	   feedback, the TCP sender will also set ECT on the SYN.  It can ignore
373	   the prohibition in section 6.1.1 of RFC 3168 against setting ECT on
374	   such a SYN.

376	   The following subsections about the SYN solely apply to this case
377	   where the initiator sent an ECT SYN.

379	3.2.1.2.  Caching Lack of AccECN Support for ECT on SYNs

381	   Until AccECN servers become widely deployed, a TCP initiator that
382	   sets ECT on a SYN (which implies the same SYN also requests AccECN,
383	   as required above) SHOULD also maintain a cache entry per server to
384	   record that the server does not support AccECN and therefore has no
385	   logic for congestion markings on the SYN.  Mobile hosts MAY maintain
386	   a cache entry per access network to record lack of AccECN support by
387	   proxies (see Section 4.2.1).

389	   The initiator will record any server's SYN-ACK response that does not
390	   support AccECN.  Subsequently the initiator will not set ECT on a SYN
391	   to such a server, but it can still always request AccECN support
392	   (because the response will state any earlier stage of ECN evolution
393	   that the server supports with no performance penalty).  The initiator
394	   will discover a server that has upgraded to support AccECN as soon as
395	   it next connects, then it can remove the server from its cache and
396	   subsequently always set ECT for that server.

398	   If the initiator times out without seeing a SYN-ACK, it will also
399	   cache this fact (see fall-back in Section 3.2.1.4 for details).

401	   There is no need to cache successful attempts, because the default
402	   ECT SYN behaviour performs optimally on success anyway.  Servers that
403	   do not support ECN as a whole probably do not need to be recorded
404	   separately from non-support of AccECN because the response to a
405	   request for AccECN immediately states which stage in the evolution of
406	   ECN the server supports (AccECN [I-D.ietf-tcpm-accurate-ecn], classic
407	   ECN [RFC3168] or no ECN).

409	   The above strategy is named "optimistic ECT and cache failures".  It
410	   is believed to be sufficient based on initial measurements and
411	   assumptions detailed in Section 4.2.1, which also gives alternative
412	   strategies in case larger scale measurements uncover different
413	   scenarios.
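
   The "optimistic ECT and cache failures" strategy might be sketched as
   follows.  This is an illustration under stated assumptions, not a
   definitive implementation: the class name, the method names and the
   one-hour default lifetime are hypothetical, and entry expiry is
   included only because Section 3.2.1.4 says a cache SHOULD be arranged
   to expire.

```python
import time


class AccEcnFailureCache:
    """Remembers servers whose SYN-ACK did not support AccECN."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds   # hypothetical lifetime of a cache entry
        self._clock = clock       # injectable so the logic can be tested
        self._no_accecn = {}      # server -> time at which the entry expires

    def record_syn_ack(self, server, supports_accecn):
        """Update the cache from the server's SYN-ACK."""
        if supports_accecn:
            # Successes are not cached; just forget any earlier failure.
            self._no_accecn.pop(server, None)
        else:
            self._no_accecn[server] = self._clock() + self._ttl

    def set_ect_on_syn(self, server):
        """Optimistic default: set ECT unless a live failure entry exists.

        The SYN still requests AccECN in either case.
        """
        expiry = self._no_accecn.get(server)
        if expiry is None:
            return True
        if self._clock() >= expiry:
            del self._no_accecn[server]   # expired: try ECT again
            return True
        return False
```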

415	3.2.1.3.  SYN Congestion Response

417	   If the SYN-ACK returned to the TCP initiator confirms that the server
418	   supports AccECN, it will also indicate whether or not the SYN was CE-
419	   marked.  If the SYN was CE-marked, the initiator MUST reduce its
420	   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
421	   segment size).

423	   If ECT has been set on the SYN and if the SYN-ACK shows that the
424	   server does not support AccECN, the TCP initiator MUST conservatively
425	   reduce its Initial Window and SHOULD reduce it to 1 SMSS.  A
426	   reduction to greater than 1 SMSS MAY be appropriate (see
427	   Section 4.2.1).  Conservatism is necessary because a non-AccECN SYN-
428	   ACK cannot show whether the SYN was CE-marked.

430	   If the TCP initiator (host A) receives a SYN from the remote end
431	   (host B) after it has sent a SYN to B, it indicates the (unusual)
432	   case of a simultaneous open.  Host A will respond with a SYN-ACK.
433	   Host A will probably then receive a SYN-ACK in response to its own
434	   SYN, after which it can follow the appropriate one of the two
435	   paragraphs above.

437	   In all the above cases, the initiator does not have to back off its
438	   retransmission timer as it would in response to a timeout following
439	   no response to its SYN [RFC6298], because both the SYN and the SYN-
440	   ACK have been successfully delivered through the network.  Also, the
441	   initiator does not need to exit slow start or reduce ssthresh, which
442	   is not even required when a SYN is lost [RFC5681].

444	   If an initial window of 10 (IW10 [RFC6928]) is implemented, Section 5
445	   gives additional recommendations.
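
   The initiator's choice of Initial Window described in this section
   can be sketched as below.  The function name and parameters are
   hypothetical; windows are counted in segments of 1 SMSS, and the
   sketch always takes the SHOULD option of reducing to 1 SMSS rather
   than some larger reduced value.

```python
from typing import Optional


def initial_window_segments(default_iw: int,
                            ect_on_syn: bool,
                            server_supports_accecn: bool,
                            syn_ce_marked: Optional[bool] = None) -> int:
    """Initial Window (in SMSS-sized segments) after the SYN-ACK arrives."""
    if not ect_on_syn:
        # This section only applies when the initiator sent an ECT SYN.
        return default_iw
    if not server_supports_accecn:
        # A non-AccECN SYN-ACK cannot show whether the SYN was CE-marked,
        # so be conservative.
        return 1
    # AccECN SYN-ACK: it states whether the SYN was CE-marked.
    return 1 if syn_ce_marked else default_iw
```

   Neither branch backs off the retransmission timer or changes
   ssthresh, in line with the text above.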

447	3.2.1.4.  Fall-Back Following No Response to an ECT SYN

449	   An ECT SYN might be lost due to an over-zealous path element (or
450	   server) blocking ECT packets that do not conform to RFC 3168.  Some
451	   evidence of this was found in a 2014 study [ecn-pam], but in a more
452	   recent 2017 study {ToDo: Add reference (under submission)} extensive
453	   measurements found no case where ECT on TCP control packets was
454	   treated any differently from ECT on TCP data packets.  Loss is
455	   commonplace for numerous other reasons, e.g. congestion loss at a
456	   non-ECN queue on the forward or reverse path, transmission errors,
457	   etc.  Alternatively, the cause of the loss might be the attempt to
458	   negotiate AccECN, or possibly other unrelated options on the SYN.

460	   Therefore, if the timer expires after the TCP initiator has sent the
461	   first ECT SYN, it SHOULD make one more attempt to retransmit the SYN
462	   with ECT set (backing off the timer as usual).  If the retransmission
463	   timer expires again, it SHOULD retransmit the SYN with the not-ECT
464	   codepoint in the IP header, to expedite connection set-up.  If other
465	   experimental fields or options were on the SYN, it will also be
466	   necessary to follow their specifications for fall-back.  It would
467	   make sense to coordinate all the strategies for fall-back in order to
468	   isolate the specific cause of the problem.
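
   The recommended sequence of SYN attempts can be sketched as a simple
   schedule.  The function name and its parameters are hypothetical; the
   1 second initial timeout and the doubling on each expiry are
   assumptions based on the usual retransmission timer behaviour
   [RFC6298], not requirements introduced by this document.

```python
def syn_attempt_schedule(max_attempts=4, initial_rto=1.0):
    """Return [(attempt, codepoint, timeout_seconds), ...] for a SYN.

    Attempt 0 is the original SYN.  The first two attempts carry ECT;
    later retransmissions fall back to not-ECT.
    """
    schedule = []
    rto = initial_rto
    for attempt in range(max_attempts):
        codepoint = "ECT" if attempt < 2 else "not-ECT"
        schedule.append((attempt, codepoint, rto))
        rto *= 2  # back off the timer as usual after each expiry
    return schedule
```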

470	   If the TCP initiator is caching failed connection attempts, it SHOULD
471	   NOT give up using ECT on the first SYN of subsequent connection
472	   attempts until it is clear that a blockage persistently and
473	   specifically affects ECT on SYNs.  This is because loss is so
474	   commonplace for other reasons.  Even if it does eventually decide to
475	   give up setting ECT on the SYN, it will probably not need to give up
476	   on AccECN on the SYN.  In any case, if a cache is used, it SHOULD be
477	   arranged to expire so that the initiator will infrequently attempt to
478	   check whether the problem has been resolved.
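
   One possible way to decide that a blockage "persistently and
   specifically" affects ECT on SYNs is sketched below.  It is only an
   illustration: the class, its methods and the threshold of 3 are
   hypothetical, and the heuristic (count only connections where the
   ECT SYNs went unanswered but the not-ECT retransmission was
   answered) is one interpretation of the text, not a rule defined by
   this document.  Expiry of entries is omitted for brevity.

```python
class EctSynBlockageTracker:
    """Counts evidence that ECT on SYNs is specifically blocked."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # hypothetical number of consecutive cases
        self._suspect = {}          # destination -> consecutive suspect cases

    def record_connection(self, dest, ect_syn_answered, not_ect_syn_answered):
        if ect_syn_answered:
            # An ECT SYN got through, so there is no persistent blockage.
            self._suspect.pop(dest, None)
        elif not_ect_syn_answered:
            # Only the not-ECT fall-back worked: evidence specific to ECT.
            self._suspect[dest] = self._suspect.get(dest, 0) + 1
        # If nothing was answered, the loss is not specific to ECT,
        # so no evidence is recorded either way.

    def use_ect_on_syn(self, dest):
        """Keep using ECT on the first SYN until the evidence is persistent."""
        return self._suspect.get(dest, 0) < self.threshold
```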

480	   Other fall-back strategies MAY be adopted where applicable (see
481	   Section 4.2.2 for suggestions, and the conditions under which they
482	   would apply).

484	3.2.2.  SYN-ACK

486	3.2.2.1.  Setting ECT on the SYN-ACK

488	   For the experiments proposed here, the TCP implementation will set
489	   ECT on SYN-ACKs.  It can ignore the requirement in section 6.1.1 of
490	   RFC 3168 to set not-ECT on a SYN-ACK.

492	   The feedback behaviour by the initiator in response to a CE-marked
493	   SYN-ACK from the responder depends on whether classic ECN feedback
494	   [RFC3168] or AccECN feedback [I-D.ietf-tcpm-accurate-ecn] has been
495	   negotiated.  In either case no change is required to RFC 3168 or the
496	   AccECN specification.

498	   Some classic ECN implementations might ignore a CE-mark on a SYN-ACK,
499	   or even ignore a SYN-ACK packet entirely if it is set to ECT or CE.
500	   This is a possibility because an RFC 3168 implementation would not
501	   necessarily expect a SYN-ACK to be ECN-capable.

503	      FOR DISCUSSION: To eliminate this problem, the WG could decide to
504	      prohibit setting ECT on SYN-ACKs unless AccECN has been
505	      negotiated.  However, this issue already came up when the IETF
506	      first decided to experiment with ECN on SYN-ACKs [RFC5562] and it
507	      was decided to go ahead without any extra precautionary measures
508	      because the risk was low.  This was because the probability of
509	      encountering the problem was believed to be low and the harm if
510	      the problem arose was also low (see Appendix B of RFC 5562).

512	      MEASUREMENTS NEEDED: Server-side experiments could determine
513	      whether this specific problem is indeed rare across the current
514	      installed base of clients that support ECN.

516	3.2.2.2.  SYN-ACK Congestion Response

518	   A host that sets ECT on SYN-ACKs MUST reduce its initial window in
519	   response to any congestion feedback, whether using classic ECN or
520	   AccECN.  It SHOULD reduce it to 1 SMSS.  This is different to the
521	   behaviour specified in an earlier experiment that set ECT on the SYN-
522	   ACK [RFC5562].  This is justified in Section 4.3.

524	   The responder does not have to back off its retransmission timer
525	   because the ECN feedback proves that the network is delivering
526	   packets successfully and is not severely overloaded.  Also the
527	   responder does not have to leave slow start or reduce ssthresh, which
528	   is not even required when a SYN-ACK has been lost.
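
   The response above can be illustrated with a minimal sketch.  It is
   not part of the specification: the ResponderState fields, the SMSS
   value and the default initial window are assumptions chosen purely
   for illustration.

```python
from dataclasses import dataclass

SMSS = 1460  # assumed sender maximum segment size, in bytes


@dataclass
class ResponderState:
    """Hypothetical per-connection state held by the SYN-ACK sender."""
    initial_window: int = 3 * SMSS  # assumed configured initial window
    rto_seconds: float = 1.0
    in_slow_start: bool = True


def on_synack_feedback(state: ResponderState, ce_fed_back: bool) -> ResponderState:
    """Apply the congestion response to feedback on an ECT SYN-ACK."""
    if ce_fed_back:
        # MUST reduce the initial window; SHOULD reduce it to 1 SMSS.
        state.initial_window = min(state.initial_window, SMSS)
    # The retransmission timer is not backed off and slow start is not
    # exited, because the feedback shows that packets are being delivered.
    return state
```

   The same function applies whether the feedback arrived as classic
   ECN or AccECN feedback; only the way ce_fed_back is derived differs.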

530	   The congestion response to CE-marking on a SYN-ACK for a server that
531	   implements either the TCP Fast Open experiment (TFO [RFC7413]) or the
532	   initial window of 10 experiment (IW10 [RFC6928]) is discussed in
533	   Section 5.

535	3.2.2.3.  Fall-Back Following No Response to an ECT SYN-ACK

537	   After the responder sends a SYN-ACK with ECT set, if its
538	   retransmission timer expires it SHOULD retransmit one more SYN-ACK
539	   with ECT set (and back off its timer as usual).  If the timer expires
540	   again, it SHOULD retransmit the SYN-ACK with not-ECT in the IP
541	   header.  If other experimental fields or options were on the initial
542	   SYN-ACK, it will also be necessary to follow their specifications for
543	   fall-back.  It would make sense to co-ordinate all the strategies for
544	   fall-back in order to isolate the specific cause of the problem.
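
   As an illustration only, the default fall-back sequence described
   above could be sketched as follows.  The function name and the
   convention of counting transmissions from 1 are assumptions; the
   codepoint values are those of the ECN field in RFC 3168.

```python
NOT_ECT = 0b00  # ECN field codepoints as defined in RFC 3168
ECT_1 = 0b01
ECT_0 = 0b10


def synack_ecn_field(transmission: int, ect: int = ECT_0) -> int:
    """Return the IP-ECN field for the n-th transmission of a SYN-ACK.

    transmission = 1 is the original SYN-ACK, 2 is the first
    retransmission, 3 is the second retransmission, and so on.
    """
    if transmission < 1:
        raise ValueError("transmission counts from 1")
    # The original and the first retransmission carry ECT;
    # the second and later retransmissions fall back to not-ECT.
    return ect if transmission <= 2 else NOT_ECT
```

   Timer back-off and fall-back for any other experimental options on
   the SYN-ACK are outside the scope of this sketch.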

546	   This fall-back strategy attempts to use ECT one more time than the
547	   strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being
548	   superseded by the present specification).  Other fall-back strategies
549	   MAY be adopted if found to be more effective, e.g. fall-back to not-
550	   ECT on the first retransmission attempt.

552	   The server MAY cache failed connection attempts, e.g. per client
553	   access network.  A client-based alternative to caching at the server
554	   is given in Section 4.3.2.  If the TCP server is caching failed
555	   connection attempts, it SHOULD NOT give up using ECT on the first
556	   SYN-ACK of subsequent connection attempts until it is clear that the
557	   blockage persistently and specifically affects ECT on SYN-ACKs.  This
558	   is because loss is so commonplace for other reasons (see
559	   Section 3.2.1.4).  If a cache is used, it SHOULD be arranged to
560	   expire so that the server will infrequently attempt to check whether
561	   the problem has been resolved.

563	3.2.3.  Pure ACK

565	   For the experiments proposed here, the TCP implementation will set
566	   ECT on pure ACKs.  It can ignore the requirement in section 6.1.4 of
567	   RFC 3168 to set not-ECT on a pure ACK.

569	   A host that sets ECT on pure ACKs MUST reduce its congestion window
570	   in response to any congestion feedback, in order to regulate any data
571	   segments it might be sending amongst the pure ACKs. {ToDo: Reconsider
572	   this requirement in the light of WG comments.} It MAY also implement
573	   AckCC [RFC5690] to regulate the pure ACK rate, but this is not
574	   required.  Note that, in comparison, TCP Congestion Control [RFC5681]
575	   does not require a TCP to detect or respond to loss of pure ACKs at
576	   all; it requires no reduction in congestion window or ACK rate.

578	   The question of whether the receiver of pure ACKs is required to feed
579	   back any CE marks on them is a matter for the relevant feedback
580	   specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]).  It is
581	   outside the scope of the present specification.  Currently AccECN
582	   feedback is required to count CE marking of any control packet
583	   including pure ACKs.  In contrast, RFC 3168 is silent on this
584	   point, so feedback of CE-markings might be implementation specific
585	   (see Section 4.4.1).

587	      DISCUSSION: An AccECN deployment or an implementation of RFC 3168
588	      that feeds back CE on pure ACKs will be at a disadvantage compared
589	      to an RFC 3168 implementation that does not.  To solve this, the
590	      WG could decide to prohibit setting ECT on pure ACKs unless AccECN
591	      has been negotiated.  If it does, the penultimate sentence of the
592	      Introduction will need to be modified.

594	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
595	      deployed base of network elements and RFC 3168 servers react to
596	      pure ACKs marked with the ECT(0)/ECT(1)/CE codepoints, i.e.
597	      whether they are dropped, codepoint cleared or processed and the
598	      congestion indication fed back on a subsequent packet.

600	3.2.4.  Window Probe

602	   For the experiments proposed here, the TCP sender will set ECT on
603	   window probes.  It can ignore the prohibition in section 6.1.6 of RFC
604	   3168 against setting ECT on a window probe.

606	   A window probe contains a single octet, so it is no different from a
607	   regular TCP data segment.  Therefore a TCP receiver will feed back
608	   any CE marking on a window probe as normal (either using classic ECN
609	   feedback or AccECN feedback).  The sender of the probe will then
610	   reduce its congestion window as normal.

612	   A receive window of zero indicates that the application is not
613	   consuming data fast enough and does not imply anything about network
614	   congestion.  Once the receive window opens, the congestion window
615	   might become the limiting factor, so it is correct that CE-marked
616	   probes reduce the congestion window.  This complements cwnd
617	   validation [RFC7661], which reduces cwnd as more time elapses without
618	   having used available capacity.  However, CE-marking on window probes
619	   does not reduce the rate of the probes themselves.  This is unlikely
620	   to present a problem, given that the duration between window
621	   probes doubles [RFC1122] as long as the receiver is advertising a
622	   zero window (currently minimum 1 second, maximum at least 1 minute
623	   [RFC6298]).

625	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
626	      deployed base of network elements and servers react to Window
627	      probes marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether
628	      they are dropped, codepoint cleared or processed.

630	3.2.5.  FIN

632	   A TCP implementation can set ECT on a FIN.

634	   The TCP data receiver MUST ignore the CE codepoint on incoming FINs
635	   that fail any validity check.  The validity check in section 5.2 of
636	   [RFC5961] is RECOMMENDED.

638	   A congestion response to a CE-marking on a FIN is not required.

640	   After sending a FIN, the endpoint will not send any more data in the
641	   connection.  Therefore, even if the FIN-ACK indicates that the FIN
642	   was CE-marked (whether using classic or AccECN feedback), reducing
643	   the congestion window will not affect anything.

645	   After sending a FIN, a host might send one or more pure ACKs.  If it
646	   is using one of the techniques in Section 3.2.3 to regulate the
647	   delayed ACK ratio for pure ACKs, it could equally be applied after a
648	   FIN.  But this is not required.

650	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
651	      deployed base of network elements and servers react to FIN packets
652	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e.  whether they
653	      are dropped, codepoint cleared or processed.

655	3.2.6.  RST

657	   A TCP implementation can set ECT on a RST.

659	   The "challenge ACK" approach to checking the validity of RSTs
660	   (section 3.2 of [RFC5961]) is RECOMMENDED at the data receiver.

662	   A congestion response to a CE-marking on a RST is not required (and
663	   actually not possible).

665	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
666	      deployed base of network elements and servers react to RST packets
667	      marked with the ECT(0)/ECT(1)/CE codepoints, i.e.  whether they
668	      are dropped, codepoint cleared or processed.

670	3.2.7.  Retransmissions

672	   For the experiments proposed here, the TCP sender will set ECT on
673	   retransmitted segments.  It can ignore the prohibition in section
674	   6.1.5 of RFC 3168 against setting ECT on retransmissions.

676	   Nonetheless, the TCP data receiver MUST ignore the CE codepoint on
677	   incoming segments that fail any validity check.  The validity check
678	   in section 5.2 of [RFC5961] is RECOMMENDED.  This will effectively
679	   mitigate an attack that uses spoofed data packets to fool the
680	   receiver into feeding back spoofed congestion indications to the
681	   sender, which in turn would be fooled into continually halving its
682	   congestion window.
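
   The following sketch shows one way a receiver might gate CE feedback
   on the validity check.  It assumes that the check referred to is the
   acceptable-ACK range test of section 5.2 of RFC 5961, namely
   (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT.  The function names
   are hypothetical, and a real implementation might apply further
   checks (e.g. on the sequence number) before counting a CE mark.

```python
MOD = 1 << 32  # TCP sequence numbers wrap modulo 2**32


def seq_leq(a: int, b: int) -> bool:
    """True if a <= b in 32-bit wrapping sequence-number arithmetic."""
    return ((b - a) % MOD) < (1 << 31)


def ack_is_acceptable(seg_ack: int, snd_una: int, snd_nxt: int,
                      max_snd_wnd: int) -> bool:
    """Range test: (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT."""
    lower = (snd_una - max_snd_wnd) % MOD
    return seq_leq(lower, seg_ack) and seq_leq(seg_ack, snd_nxt)


def count_ce(is_ce: bool, seg_ack: int, snd_una: int, snd_nxt: int,
             max_snd_wnd: int) -> bool:
    """Feed back a CE mark only if the segment passes the check."""
    return is_ce and ack_is_acceptable(seg_ack, snd_una, snd_nxt,
                                       max_snd_wnd)
```

   A spoofed segment would have to guess an acknowledgement number
   within this range before its CE mark could influence the sender.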

684	   If the TCP sender receives feedback that a retransmitted packet was
685	   CE-marked, it will react as it would to any feedback of CE-marking on
686	   a data packet.

688	      MEASUREMENTS NEEDED: Measurements are needed to learn how the
689	      deployed base of network elements and servers react to
690	      retransmissions marked with the ECT(0)/ECT(1)/CE codepoints, i.e.
691	      whether they are dropped, codepoint cleared or processed.

693	3.2.8.  General Fall-back for any Control Packet or Retransmission

695	   Extensive measurements in fixed and mobile networks {ToDo: reference
696	   (under submission)} have found no evidence of blockages due to ECT
697	   being set on any type of TCP control packet.

699	   In case traversal problems arise in future, fall-back measures have
700	   been specified above, but only for the cases where ECT on the initial
701	   packet of a half-connection (SYN or SYN-ACK) is persistently failing
702	   to get through.

704	   Fall-back measures for blockage of ECT on other TCP control packets
705	   MAY be implemented.  However they are not specified here given the
706	   lack of any evidence they will be needed.  Section 4.9 justifies this
707	   advice in more detail.

709	4.  Rationale

711	   This section is informative, not normative.  It presents counter-
712	   arguments against the justifications in the RFC series for disabling
713	   ECN on TCP control segments and retransmissions.  It also gives
714	   rationale for why ECT is safe on control segments that have not, so
715	   far, been mentioned in the RFC series.  First it addresses over-
716	   arching arguments used for most packet types, then it addresses the
717	   specific arguments for each packet type in turn.

719	4.1.  The Reliability Argument

721	   Section 5.2 of RFC 3168 states:

723	      "To ensure the reliable delivery of the congestion indication of
724	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
725	      unless the loss of that packet [at a subsequent node] in the
726	      network would be detected by the end nodes and interpreted as an
727	      indication of congestion."

729	   We believe this argument is misplaced.  TCP does not deliver most
730	   control packets reliably.  So it is more important to allow control
731	   packets to be ECN-capable, which greatly improves reliable delivery
732	   of the control packets themselves (see motivation in Section 1.1).
733	   ECN also improves the reliability and latency of delivery of any
734	   congestion notification on control packets, particularly because TCP
735	   does not detect the loss of most types of control packet anyway.
736	   Both these points outweigh by far the concern that a CE marking
737	   applied to a control packet by one node might subsequently be dropped
738	   by another node.

740	   The principle to determine whether a packet can be ECN-capable ought
741	   to be "do no extra harm", meaning that the reliability of a
742	   congestion signal's delivery ought to be no worse with ECN than
743	   without.  In particular, setting the CE codepoint on the very same
744	   packet that would otherwise have been dropped fulfills this
745	   criterion, since either the packet is delivered and the CE signal is
746	   delivered to the endpoint, or the packet is dropped and the original
747	   congestion signal (packet loss) is delivered to the endpoint.

749	   The concern about a CE marking being dropped at a subsequent node
750	   might be motivated by the idea that ECN-marking a packet at the first
751	   node does not remove the packet, so it could go on to worsen
752	   congestion at a subsequent node.  However, it is not useful to reason
753	   about congestion by considering single packets.  The departure rate
754	   from the first node will generally be the same (fully utilized) with
755	   or without ECN, so this argument does not apply.

757	4.2.  SYNs

759	   RFC 5562 presents two arguments against ECT marking of SYN packets
760	   (quoted verbatim):

762	      "First, when the TCP SYN packet is sent, there are no guarantees
763	      that the other TCP endpoint (node B in Figure 2) is ECN-Capable,
764	      or that it would be able to understand and react if the ECN CE
765	      codepoint was set by a congested router.

767	      Second, the ECN-Capable codepoint in TCP SYN packets could be
768	      misused by malicious clients to "improve" the well-known TCP SYN
769	      attack.  By setting an ECN-Capable codepoint in TCP SYN packets, a
770	      malicious host might be able to inject a large number of TCP SYN
771	      packets through a potentially congested ECN-enabled router,
772	      congesting it even further."

774	   The first point actually describes two subtly different issues.  So
775	   below three arguments are countered in turn.

777	4.2.1.  Argument 1a: Unrecognized CE on the SYN

779	   This argument certainly applied at the time RFC 5562 was written,
780	   when no ECN responder mechanism had any logic to recognize or feed
781	   back a CE marking on a SYN.  The problem was that, during the 3WHS,
782	   the flag in the TCP header for ECN feedback (called Echo Congestion
783	   Experienced) had been overloaded to negotiate the use of ECN itself.
784	   So there was no space for feedback in a SYN-ACK.

786	   The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has
787	   since been designed to solve this problem, using a two-pronged
788	   approach.  First AccECN uses the 3 ECN bits in the TCP header as 8
789	   codepoints, so there is space for the responder to feed back whether
790	   there was CE on the SYN.  Second a TCP initiator can always request
791	   AccECN support on every SYN, and any responder reveals its level of
792	   ECN support: AccECN, classic ECN, or no ECN.  Therefore, if a
793	   responder does indicate that it supports AccECN, the initiator can be
794	   sure that, if there is no CE feedback on the SYN-ACK, then there
795	   really was no CE on the SYN.

797	   An initiator can combine AccECN with three possible strategies for
798	   setting ECT on a SYN:

800	   (S1):  Pessimistic ECT and cache successes: The initiator always
801	          requests AccECN in the SYN, but without setting ECT.  Then it
802	          records those servers that confirm that they support AccECN in
803	          a cache.  On a subsequent connection to any server that
804	          supports AccECN, the initiator can then set ECT on the SYN.

806	   (S2):  Optimistic ECT: The initiator always sets ECT optimistically
807	          on the initial SYN and it always requests AccECN support.
808	          Then, if the server response shows it has no AccECN logic (so
809	          it cannot feed back a CE mark), the initiator conservatively
810	          behaves as if the SYN was CE-marked, by reducing its initial
811	          window.

813	          A.  No cache: The optimistic ECT strategy ought to work fairly
814	              well without caching any responses.

816	          B.  Cache failures: The optimistic ECT strategy can be
817	              improved by recording solely those servers that do not
818	              support AccECN.  On subsequent connections to these non-
819	              AccECN servers, the initiator will still request AccECN
820	              but not set ECT on the SYN.  Then, the initiator can use
821	              its full initial window (if it has enough request data to
822	              need it).  Longer term, as servers upgrade to AccECN, the
823	              initiator will remove them from the cache and use ECT on
824	              subsequent SYNs to that server.

826	              Where an access network operator mediates Internet access
827	              via a proxy that does not support AccECN, the optimistic
828	              ECT strategy will always fail.  This scenario is more
829	              likely in mobile networks.  Therefore, a mobile host could
830	              cache lack of AccECN support per attached access network
831	              operator.  Whenever it attached to a new operator, it
832	              could check a well-known AccECN test server and, if it
833	              found no AccECN support, it would add a cache entry for
834	              the attached operator.  It would only use ECT when neither
835	              network nor server were cached.  It would only populate
836	              its per server cache when not attached to a non-AccECN
837	              proxy.

839	   (S3):  ECT by configuration: In a controlled environment, the
840	          administrator can make sure that servers support ECN-capable
841	          SYN packets.  Examples of controlled environments are single-
842	          tenant DCs, and possibly multi-tenant DCs if it is assumed
843	          that each tenant mostly communicates with its own VMs.

845	   For unmanaged environments like the public Internet, pragmatically
846	   the choice is between strategies (S1) and (S2B):

848	   o  The "pessimistic ECT and cache successes" strategy (S1) suffers
849	      from exposing the initial SYN to the prevailing loss level, even
850	      if the server supports ECT on SYNs, but only on the first
851	      connection to each AccECN server.

853	   o  The "optimistic ECT and cache failures" strategy (S2B) exploits a
854	      server's support for ECT on SYNs from the very first attempt.  But
855	      if the server turns out not to support AccECN, the initiator has
856	      to conservatively limit its initial window - usually
857	      unnecessarily.  Nonetheless, initiator request data (as opposed to
858	      server response data) is rarely larger than 1 SMSS anyway {ToDo:
859	      reference? (this information was given informally by Yuchung
860	      Cheng)}.

862	   The normative specification for ECT on a SYN in Section 3.2.1 uses
863	   the "optimistic ECT and cache failures" strategy (S2B) on the
864	   assumption that an initial window of 1 SMSS is usually sufficient for
865	   client requests anyway.  Clients that often initially send more than
866	   1 SMSS of data could use strategy (S1) during initial deployment, and
867	   strategy (S2B) later (when the probability of servers supporting
868	   AccECN and the likelihood of seeing some CE marking is higher).
869	   Also, as deployment proceeds, the cache of successes (S1) starts off
870	   small then grows, while the cache of failures (S2B) becomes large at
871	   first, then shrinks.
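
   To make the "optimistic ECT and cache failures" strategy (S2B) more
   concrete, a minimal sketch of the initiator's logic follows.  The
   class and method names are hypothetical, the cache is keyed per
   server only (the per-operator cache for non-AccECN proxies is
   omitted), and the initiator is assumed to request AccECN on every
   SYN.

```python
class SynEctPolicy:
    """Hypothetical initiator-side state for strategy (S2B)."""

    def __init__(self) -> None:
        # Servers seen to lack AccECN support (the "failure" cache).
        self._no_accecn: set[str] = set()

    def set_ect_on_syn(self, server: str) -> bool:
        """Optimistically set ECT unless the server is in the cache."""
        return server not in self._no_accecn

    def on_syn_ack(self, server: str, sent_ect: bool,
                   accecn_supported: bool, ce_fed_back: bool) -> bool:
        """Record the outcome of the handshake.

        Returns True if the initial window has to be reduced.
        """
        if accecn_supported:
            # The server has (or has upgraded to) AccECN: forget any
            # earlier failure and trust its feedback about the SYN.
            self._no_accecn.discard(server)
            return sent_ect and ce_fed_back
        # No AccECN logic, so a CE mark could not have been fed back.
        self._no_accecn.add(server)
        # If ECT was set, conservatively behave as if the SYN was CE-marked.
        return sent_ect
```

   Because AccECN is still requested on SYNs to cached servers, an
   upgraded server is detected on the next connection and removed from
   the cache without any timer.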

873	      MEASUREMENTS NEEDED: Measurements are needed to determine whether
874	      one or the other strategy would be sufficient for any particular
875	      client, or whether a particular client would need both strategies
876	      in different circumstances.

878	4.2.2.  Argument 1b: Unrecognized ECT on the SYN

880	   Given that ECT-marked SYN packets have been prohibited until now, it
881	   cannot be assumed that they will be accepted.

883	   According to a study using 2014 data [ecn-pam] from a limited range
884	   of vantage points, out of the top 1M Alexa web sites, 4791 (0.82%)
885	   IPv4 sites and 104 (0.61%) IPv6 sites failed to establish a
886	   connection when they received a TCP SYN with any ECN codepoint set in
887	   the IP header and the appropriate ECN flags in the TCP header.  Of
888	   these, about 41% failed to establish a connection due to the ECN
889	   flags in the TCP header even with a Not-ECT ECN field in the IP
890	   header (i.e. despite full compliance with RFC 3168).  Therefore
891	   adding the ECN capability to SYNs was increasing connection
892	   establishment failures by about 0.4%.

894	   In a study using 2017 data from a wider range of fixed and mobile
895	   vantage points to the top 500k Alexa servers, no case was found where
896	   adding the ECN capability to a SYN increased the likelihood of
897	   connection establishment failure {ToDo: reference (under
898	   submission)}.

900	      MEASUREMENTS NEEDED: More investigation is needed to understand
901	      the different outcomes of the 2014 and 2017 studies.

903	   RFC 3168 says "a host MUST NOT set ECT on SYN [...] packets", but it
904	   does not say what the responder should do if an ECN-capable SYN
905	   arrives.  So, in the 2014 study, perhaps some responder
906	   implementations were checking that the SYN complied with RFC 3168,
907	   then silently ignoring non-compliant SYNs (or perhaps returning a
908	   RST).  Also some middleboxes (e.g. firewalls) might have been
909	   discarding non-compliant SYNs.  For the future,
910	   [I-D.ietf-tsvwg-ecn-experimentation] updates RFC 3168 to clarify that
911	   middleboxes "SHOULD NOT" do this, but that does not alter the past.

913	   Whereas RSTs can be dealt with immediately, silent failures introduce
914	   a retransmission timeout delay (default 1 second) at the initiator
915	   before it attempts any fall-back strategy.  Ironically, making SYNs
916	   ECN-capable is intended to avoid the timeout when a SYN is lost due
917	   to congestion.  Fortunately, if there is any discard of ECN-capable
918	   SYNs due to policy, it will occur predictably, not randomly like
919	   congestion.  So the initiator can avoid it by caching those sites
920	   that do not support ECN-capable SYNs.  This further justifies the use
921	   of the "optimistic ECT and cache failures" strategy in Section 3.2.1.

923	      MEASUREMENTS NEEDED: Experiments are needed to determine whether
924	      blocking of ECT on SYNs is widespread, and how many occurrences of
925	      problems would be masked by how few cache entries.

927	   If blocking is too widespread for the "optimistic ECT and cache
928	   failures" strategy (S2B), the "pessimistic ECT and cache successes"
929	   strategy (S1 in Section 4.2.1) would be better.

931	      MEASUREMENTS NEEDED: Then measurements would be needed on whether
932	      failures were still widespread on the third connection attempt
933	      after the more careful ("pessimistic") first and second attempts.

935	   If so, it might be necessary to send a not-ECT SYN a short delay
936	   after an ECT SYN and only accept the not-ECT connection if it
937	   returned first.  This would reduce the performance penalty for those
938	   deploying ECT SYN support.

940	      FOR DISCUSSION: If this becomes necessary, how much delay ought to
941	      be required before the second SYN?  Certainly less than the
942	      standard RTO (1 second).  But more or less than the maximum RTT
943	      expected over the surface of the earth (roughly 250ms)?  Or even
944	      back-to-back?

946	   However, based on the data above from [ecn-pam], even a cache of a
947	   dozen or so sites ought to avoid all ECN-related performance problems
948	   with roughly the Alexa top thousand.  So it is questionable whether
949	   sending two SYNs will be necessary, particularly given that failures
950	   at well-maintained sites could reduce further once ECT SYNs are
951	   standardized.

953	4.2.3.  Argument 2: DoS Attacks

955	   [RFC5562] says that ECT SYN packets could be misused by malicious
956	   clients to augment "the well-known TCP SYN attack".  It goes on to
957	   say "a malicious host might be able to inject a large number of TCP
958	   SYN packets through a potentially congested ECN-enabled router,
959	   congesting it even further."

961	   We assume this is a reference to the TCP SYN flood attack (see
962	   https://en.wikipedia.org/wiki/SYN_flood), which is an attack against
963	   a responder end point.  We assume the idea of this attack is to use
964	   ECT to get more packets through an ECN-enabled router in preference
965	   to other non-ECN traffic so that they can go on to use the SYN
966	   flooding attack to inflict more damage on the responder end point.
967	   This argument could apply to flooding with any type of packet, but we
968	   assume SYNs are singled out because their source address is easier to
969	   spoof, whereas floods of other types of packets are easier to block.

971	   Mandating Not-ECT in an RFC does not stop attackers using ECT for
972	   flooding.  Nonetheless, if a standard says SYNs are not meant to be
973	   ECT it would make it legitimate for firewalls to discard them.
974	   However this would negate the considerable benefit of ECT SYNs for
975	   compliant transports and seems unnecessary because RFC 3168 already
976	   provides the means to address this concern.  In section 7, RFC 3168
977	   says "During periods where ... the potential packet marking rate
978	   would be high, our recommendation is that routers drop packets rather
979	   then set the CE codepoint..." and this advice is repeated in
980	   [RFC7567] (section 4.2.1).  This makes it harder for flooding packets
981	   to gain from ECT.

983	   Further experiments are needed to test how much malicious hosts can
984	   use ECT to augment flooding attacks without triggering AQMs to turn
985	   off ECN support (flying "just under the radar").  If it is found that
986	   ECT can only slightly augment flooding attacks, the risk of such
987	   attacks will need to be weighed against the performance benefits of
988	   ECT SYNs.

990	4.3.  SYN-ACKs

992	   The proposed approach in Section 3.2.2 for experimenting with ECN-
993	   capable SYN-ACKs is effectively identical to the scheme called ECN+
994	   [ECN-PLUS].  In 2005, the ECN+ paper demonstrated that it could
995	   reduce the average Web response time by an order of magnitude.  It
996	   also argued that adding ECT to SYN-ACKs did not raise any new
997	   security vulnerabilities.

999	4.3.1.  Response to Congestion on a SYN-ACK

1001	   The IETF has already specified an experiment with ECN-capable SYN-ACK
1002	   packets [RFC5562].  It was inspired by the ECN+ paper, but it
1003	   specified a much more conservative congestion response to a CE-marked
1004	   SYN-ACK, called ECN+/TryOnce.  This required the server to reduce its
1005	   initial window to 1 segment (like ECN+), but then the server had to
1006	   send a second SYN-ACK and wait for its ACK before it could continue
1007	   with its initial window of 1 SMSS.  The second SYN-ACK of this 5-way
1008	   handshake had to carry no data, and had to disable ECN, but no
1009	   justification was given for these last two aspects.

1011	   The present ECN experiment obsoletes RFC 5562 because it uses the
1012	   ECN+ congestion response, not ECN+/TryOnce.  First we argue against
1013	   the rationale for ECN+/TryOnce given in sections 4.4 and 6.2 of
1014	   [RFC5562].  It starts with a rather too literal interpretation of the
1015	   requirement in RFC 3168 that says TCP's response to a single CE mark
1016	   has to be "essentially the same as the congestion control response to
1017	   a *single* dropped packet."  TCP's response to a dropped initial (SYN
1018	   or SYN-ACK) packet is to wait for the retransmission timer to expire
1019	   (currently 1s).  However, this long delay assumes the worst case
1020	   between two possible causes of the loss: a) heavy overload; or b) the
1021	   normal capacity-seeking behaviour of other TCP flows.  When the
1022	   network is still delivering CE-marked packets, it implies that there
1023	   is an AQM at the bottleneck and that it is not overloaded.  This is
1024	   because an AQM under overload will disable ECN (as recommended in
1025	   section 7 of RFC 3168 and repeated in section 4.2.1 of RFC 7567).  So
1026	   scenario (a) can be ruled out.  Therefore, TCP's response to a CE-
1027	   marked SYN-ACK can be similar to its response to the loss of _any_
1028	   packet, rather than backing off as if the special _initial_ packet of
1029	   a flow has been lost.

1031	   How TCP responds to the loss of any single packet depends on what it
1032	   has just been doing.  But there is not really a precedent for TCP's
1033	   response when it experiences a CE mark having sent only one (small)
1034	   packet.  If TCP had been adding one segment per RTT, it would have
1035	   halved its congestion window, but it hasn't established a congestion
1036	   window yet.  If it had been exponentially increasing it would have
1037	   exited slow start, but it hasn't started exponentially increasing yet
1038	   so it hasn't established a slow-start threshold.

1040	   Therefore, we have to work out a reasoned argument for what to do.
1041	   If an AQM is CE-marking packets, it implies there is already a queue
1042	   and it is probably already somewhere around the AQM's operating point
1043	   - it is unlikely to be well below and it might be well above.  So, it
1044	   does not seem sensible to add a number of packets at once.  On the
1045	   other hand, it is highly unlikely that the SYN-ACK itself pushed the
1046	   AQM into congestion, so it will be safe to introduce another single
1047	   segment immediately (1 RTT after the SYN-ACK).  Therefore, starting
1048	   to probe for capacity with a slow start from an initial window of 1
1049	   segment seems appropriate to the circumstances.  This is the approach
1050	   adopted in Section 3.2.2.

1052	4.3.2.  Fall-Back if ECT SYN-ACK Fails

1054	   An alternative to the server caching failed connection attempts would
1055	   be for the server to rely on the client caching failed attempts (on
1056	   the basis that the client would cache a failure whether ECT was
1057	   blocked on the SYN or the SYN-ACK).  This strategy cannot be used if
1058	   the SYN does not request AccECN support.  It works as follows: if the
1059	   server receives a SYN that requests AccECN support but is set to not-
1060	   ECT, it replies with a SYN-ACK also set to not-ECT.  If a middlebox
1061	   only blocks ECT on SYNs, not SYN-ACKs, this strategy might disable
1062	   ECN on a SYN-ACK when it did not need to, but at least it saves the
1063	   server from maintaining a cache.
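
   The server side of this alternative might be sketched as below.
   This is illustrative only: the function name is hypothetical, and a
   return value of None is used here to signal that the strategy does
   not apply, so some other fall-back (such as a server-side cache)
   would be needed.

```python
from typing import Optional

NOT_ECT = 0b00  # ECN field codepoints as defined in RFC 3168
ECT_0 = 0b10


def synack_ecn_for(syn_requests_accecn: bool, syn_ip_ecn: int) -> Optional[int]:
    """Choose the IP-ECN field of a SYN-ACK by relying on the client's cache."""
    if not syn_requests_accecn:
        # The strategy cannot be used without an AccECN request on the SYN.
        return None
    if syn_ip_ecn == NOT_ECT:
        # The client chose not to set ECT, perhaps because it has cached
        # a failure, so the SYN-ACK is also sent as not-ECT.
        return NOT_ECT
    # The SYN arrived as ECT or CE, so ECT is set on the SYN-ACK.
    return ECT_0
```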

1065	4.4.  Pure ACKs

1067	   Section 5.2 of RFC 3168 gives the following arguments for not
1068	   allowing the ECT marking of pure ACKs (ACKs not piggy-backed on
1069	   data):

1071	      "To ensure the reliable delivery of the congestion indication of
1072	      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
1073	      unless the loss of that packet in the network would be detected by
1074	      the end nodes and interpreted as an indication of congestion.

1076	      Transport protocols such as TCP do not necessarily detect all
1077	      packet drops, such as the drop of a "pure" ACK packet; for
1078	      example, TCP does not reduce the arrival rate of subsequent ACK
1079	      packets in response to an earlier dropped ACK packet.  Any
1080	      proposal for extending ECN-Capability to such packets would have
1081	      to address issues such as the case of an ACK packet that was
1082	      marked with the CE codepoint but was later dropped in the network.
1083	      We believe that this aspect is still the subject of research, so
1084	      this document specifies that at this time, "pure" ACK packets MUST
1085	      NOT indicate ECN-Capability."

1087	   Later on, in section 6.1.4 it reads:

1089	      "For the current generation of TCP congestion control algorithms,
1090	      pure acknowledgement packets (e.g., packets that do not contain
1091	      any accompanying data) MUST be sent with the not-ECT codepoint.
1092	      Current TCP receivers have no mechanisms for reducing traffic on
1093	      the ACK-path in response to congestion notification.  Mechanisms
1094	      for responding to congestion on the ACK-path are areas for current
1095	      and future research.  (One simple possibility would be for the
1096	      sender to reduce its congestion window when it receives a pure ACK
1097	      packet with the CE codepoint set).  For current TCP
1098	      implementations, a single dropped ACK generally has only a very
1099	      small effect on the TCP's sending rate."

1101	   We next address each of the arguments presented above.

1103	   The first argument is a specific instance of the reliability argument
1104	   for the case of pure ACKs.  This has already been addressed by
1105	   countering the general reliability argument in Section 4.1.

1107	   The second argument says that ECN ought not to be enabled unless
1108	   there is a mechanism to respond to it.  However, actually there _is_
1109	   a mechanism to respond to congestion on a pure ACK that RFC 3168 has
1110	   overlooked - the congestion window mechanism.  When data segments and
1111	   pure ACKs are interspersed, congestion notifications ought to
1112	   regulate the congestion window, whether they are on data segments or
1113	   on pure ACKs.  Otherwise, if ECN is disabled on pure ACKs, and if
1114	   (say) 70% of the segments in one direction are pure ACKs, about 70%
1115	   of the congestion notifications will be missed and the data segments
1116	   will not be correctly regulated.
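
   The arithmetic behind the 70% example can be sketched as follows.
   This is only an illustration: it assumes (hypothetically) that the
   bottleneck signals congestion on every packet with equal likelihood,
   and the function name and parameters are invented for this sketch.

```python
def visible_ce_fraction(pure_ack_fraction, ect_on_pure_acks):
    """Fraction of congestion signals that can arrive as CE marks.

    Assumption for illustration only: congestion is signalled on each
    packet with equal likelihood, whatever the packet type.
    """
    if not 0.0 <= pure_ack_fraction <= 1.0:
        raise ValueError("pure_ack_fraction must be between 0 and 1")
    if ect_on_pure_acks:
        # Every segment is ECN-capable, so no signal is missed.
        return 1.0
    # Only data segments are ECN-capable; signals that fall on pure
    # ACKs are missed by the cwnd mechanism.
    return 1.0 - pure_ack_fraction
```

   With pure_ack_fraction set to 0.7 and ECT disabled on pure ACKs,
   only about 30% of the signals remain visible, which matches the
   figure given above.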

1118	   So RFC 3168 ought to have considered two congestion response
1119	   mechanisms - reducing the congestion window (cwnd) and reducing the
1120	   ACK rate - and only the latter was missing.  Further, RFC 3168 was
1121	   incorrect to assume that, if one ACK was a pure ACK, all segments in
1122	   the same direction would be pure ACKs.  Admittedly a continual stream
1123	   of pure ACKs in one direction is quite a common case (e.g. a file
1124	   download).  However, it is also common for the pure ACKs to be
1125	   interspersed with data segments (e.g.  HTTP/2 browser requests
1126	   controlling a web application).  Indeed, it is more likely that any
1127	   congestion experienced by pure ACKs will be due to mixing with data
1128	   segments, either within the same flow, or within competing flows.

1130	   This insight swings the argument towards enabling ECN on pure ACKs so
1131	   that CE marks can drive the cwnd response to congestion (whenever
1132	   data segments are interspersed with the pure ACKs).  It then remains
1133	   to decide separately whether an ACK rate response is also required
1134	   (when pure ACKs are ECN-enabled).  The two types of response are
1135	   addressed in the following two subsections, then a final subsection
1136	   draws conclusions.

1138	4.4.1.  Cwnd Response to CE-Marked Pure ACKs

1140	   If the sender of pure ACKs sets them to ECT, the bullets below assess
1141	   whether the three stages of the congestion response mechanism will
1142	   all work for each type of congestion feedback (classic ECN [RFC3168]
1143	   and AccECN [I-D.ietf-tcpm-accurate-ecn]):

1145	   Detection:  The receiver of a pure ACK can detect a CE marking on it:

1147	      *  Classic feedback: the receiver will not expect CE marks on pure
1148	         ACKs, so it will be implementation-dependent whether it happens
1149	         to check for CE marks on all packets.

1151	      *  AccECN feedback: the AccECN specification requires the receiver
1152	         of any TCP packets to count any CE marks on them (whether or
1153	         not control packets are ECN-capable).

1155	   Feedback:  TCP never ACKs a pure ACK, but the receiver of a CE-mark
1156	      on a pure ACK can feed it back when it sends a subsequent data
1157	      segment (if it ever does):

1159	      *  Classic feedback: the receiver (of the pure ACKs) would set the
1160	         echo congestion experienced (ECE) flag in the TCP header as
1161	         normal.

1163	      *  AccECN feedback: the receiver continually feeds back a count of
1164	         the number of CE-marked packets that it has received (and, if
1165	         possible, a count of CE-marked bytes).

1167	   Congestion response:  In either case (classic or AccECN feedback), if
1168	      the TCP sender does receive feedback about CE-markings on pure
1169	      ACKs, it will react in the usual way by reducing its congestion
1170	      window accordingly.  This will regulate the rate of any data
1171	      packets it is sending amongst the pure ACKs.  Note that, while a
1172	      host has no application data to send, any congestion window it has
1173	      attained might also be reduced by the congestion window validation
1174	      mechanism [RFC7661].
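
   The detection and feedback stages above can be sketched as a small
   model of the host that receives ECT pure ACKs.  This is a
   simplification under stated assumptions, not an implementation of
   either feedback scheme: the class, its fields and the dictionary
   keys are hypothetical, and the AccECN counter is shown as a plain
   integer rather than its on-the-wire encoding.  The congestion
   response stage is not modelled.

```python
from dataclasses import dataclass

CE = 0b11  # ECN field value for Congestion Experienced (RFC 3168)


@dataclass
class PureAckReceiver:
    """Hypothetical model of a host receiving ECN-capable pure ACKs."""
    acc_ecn: bool                             # True if AccECN was negotiated
    classic_checks_all_packets: bool = False  # implementation-dependent
    ce_packets: int = 0                       # simplified AccECN CE counter
    ece_pending: bool = False                 # classic ECE awaiting echo

    def on_pure_ack(self, ecn_field):
        """Detection stage: note a CE mark on an arriving pure ACK."""
        if ecn_field != CE:
            return
        if self.acc_ecn:
            # AccECN counts CE marks on any TCP packet.
            self.ce_packets += 1
        elif self.classic_checks_all_packets:
            # A classic receiver only notices if it happens to check.
            self.ece_pending = True

    def feedback_on_next_data_segment(self):
        """Feedback stage: what a later data segment would carry."""
        if self.acc_ecn:
            return {"ce_packet_count": self.ce_packets}
        return {"ECE": self.ece_pending}
```

   A classic receiver created with classic_checks_all_packets left at
   False never reports the mark, which is the implementation-dependent
   case noted under "Detection".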

1176	4.4.2.  ACK Rate Response to CE-Marked Pure ACKs

1178	   Reducing the congestion window will have no effect on the rate of
1179	   pure ACKs.  The worst case here is if the bottleneck is congested
1180	   solely with pure ACKs, but it could also be problematic if a large
1181	   fraction of the load was from unresponsive ACKs, leaving little or no
1182	   capacity for the load from responsive data.

1184	   Since RFC 3168 was published, Acknowledgement Congestion Control
1185	   (AckCC) techniques have been documented in [RFC5690] (informational).
1186	   So any pair of TCP end-points can choose to agree to regulate the
1187	   delayed ACK ratio in response to lost or CE-marked pure ACKs.
1188	   However, the protocol has a number of open deployment issues (e.g. it
1189	   relies on two new TCP options, one of which is required on the SYN
1190	   where option space is at a premium and, if either option is blocked
1191	   by a middlebox, no fall-back behaviour is specified).  The new TCP
1192	   options addressed two problems, namely that TCP had: i) no mechanism
1193	   to allow ECT to be set on pure ACKs; and ii) no mechanism to feed
1194	   back loss or CE-marking of pure ACKs.  A combination of the present
1195	   specification and AccECN addresses both these problems, at least for
1196	   ECN marking.  So it might now be possible to design an ECN-specific
1197	   ACK congestion control scheme without the extra TCP options proposed
1198	   in RFC 5690.  However, such a mechanism is out of scope of the
1199	   present document.

1201	   Setting aside the practicality of RFC 5690, the need for AckCC has
1202	   not been conclusively demonstrated.  It has been argued that the
1203	   Internet has survived so far with no mechanism to even detect loss of
1204	   pure ACKs.  However, it has also been argued that ECN is not the same
1205	   as loss.  Packet discard can naturally thin the ACK load to whatever
1206	   the bottleneck can support, whereas ECN marking does not (it queues
1207	   the ACKs instead).  Nonetheless, RFC 3168 (section 7) recommends that
1208	   an AQM switches over from ECN marking to discard when the marking
1209	   probability becomes high.  Therefore discard can still be relied on
1210	   to thin out ECN-enabled pure ACKs as a last resort.
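
   The switch-over from marking to discard can be sketched as below.
   This is a minimal illustration, not the behaviour of any particular
   AQM: the function name, the drop_threshold default and the
   injectable rng parameter are all assumptions made for this sketch.

```python
import random

NOT_ECT = 0b00  # ECN field value for a packet that is not ECN-capable


def aqm_action(ecn_field, congestion_prob, drop_threshold=0.5,
               rng=random.random):
    """Return 'forward', 'mark' or 'drop' for one arriving packet.

    drop_threshold is a hypothetical configuration value; no RFC
    specifies this number.
    """
    if rng() >= congestion_prob:
        return "forward"  # no congestion signal for this packet
    if ecn_field == NOT_ECT or congestion_prob >= drop_threshold:
        # Not ECN-capable, or the signalling probability is high:
        # discard, which thins out ECN-enabled pure ACKs as well.
        return "drop"
    return "mark"  # set CE rather than discarding the packet
```

   Passing rng=lambda: 0.0 forces a congestion signal on every packet,
   which makes the three outcomes easy to exercise in a test.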

1212	4.4.3.  Summary: Enabling ECN on Pure ACKs

1214	   In the case when AccECN has been negotiated, the arguments for ECT
1215	   (and CE) on pure ACKs heavily outweigh those against.  ECN is always
1216	   more and never less reliable for delivery of congestion notification.
1217	   The cwnd response has been overlooked as a mechanism for responding
1218	   to congestion on pure ACKs, so it is incorrect not to set ECT on pure
1219	   ACKs when they are interspersed with data segments.  And when they
1220	   are not, packet discard still acts as the "congestion response of
1221	   last resort".  In contrast, not setting ECT on pure ACKs is certainly
1222	   detrimental to performance, because when a pure ACK is lost it can
1223	   prevent the release of new data.  Separately, AckCC (or perhaps an
1224	   improved variant exploiting AccECN) could optionally be used to
1225	   regulate the spacing between pure ACKs.  However, it is not clear
1226	   whether AckCC is justified.

1228	   In the case when Classic ECN has been negotiated, there is still an
1229	   argument for ECT (and CE) on pure ACKs, but it is less clear-cut.
1230	   Some existing RFC 3168 implementations might happen to
1231	   (unintentionally) provide the correct feedback to support a cwnd
1232	   response.  Even for those that did not, setting ECT on pure ACKs
1233	   would still be better for performance than not setting it and do no
1234	   extra harm.  If AckCC was required, it is designed to work with RFC
1235	   3168 ECN.

1237	4.5.  Window Probes

1239	   Section 6.1.6 of RFC 3168 presents only the reliability argument for
1240	   prohibiting ECT on window probes:

1242	      "If a window probe packet is dropped in the network, this loss is
1243	      not detected by the receiver.  Therefore, the TCP data sender MUST
1244	      NOT set either an ECT codepoint or the CWR bit on window probe
1245	      packets.

1247	      However, because window probes use exact sequence numbers, they
1248	      cannot be easily spoofed in denial-of-service attacks.  Therefore,
1249	      if a window probe arrives with the CE codepoint set, then the
1250	      receiver SHOULD respond to the ECN indications."

1252	   The reliability argument has already been addressed in Section 4.1.

1254	   Allowing ECT on window probes could considerably improve performance
1255	   because, once the receive window has reopened, if a window probe is
1256	   lost the sender will stall until the next window probe reaches the
1257	   receiver, which might be after the maximum retransmission timeout (at
1258	   least 1 minute [RFC6298]).

1260	   On the bright side, RFC 3168 at least specifies the receiver
1261	   behaviour if a CE-marked window probe arrives, so changing the
1262	   behaviour ought to be less painful than for other packet types.

1264	4.6.  FINs

1266	   RFC 3168 is silent on whether a TCP sender can set ECT on a FIN.  A
1267	   FIN is considered as part of the sequence of data, and the rate of
1268	   pure ACKs sent after a FIN could be controlled by a CE marking on the
1269	   FIN.  Therefore there is no reason not to set ECT on a FIN.

1271	4.7.  RSTs

1273	   RFC 3168 is silent on whether a TCP sender can set ECT on a RST.  The
1274	   host generating the RST message does not have an open connection
1275	   after sending it (either because there was no such connection when
1276	   the packet that triggered the RST message was received or because the
1277	   packet that triggered the RST message also triggered the closure of
1278	   the connection).

1280	   Moreover, the receiver of a CE-marked RST message can either: i)
1281	   accept the RST message and close the connection; ii) emit a so-called
1282	   challenge ACK in response (with suitable throttling) [RFC5961] and
1283	   otherwise ignore the RST (e.g. because the sequence number is in-
1284	   window but not the precise number expected next); or iii) discard the
1285	   RST message (e.g. because the sequence number is out-of-window).  In
1286	   the first two cases there is no point in echoing any CE mark received
1287	   because the sender closed its connection when it sent the RST.  In
1288	   the third case it makes sense to discard the CE signal as well as the
1289	   RST.
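
   The three cases above can be sketched as a classification of an
   arriving RST by its sequence number, in the style of the checks in
   [RFC5961].  The function name and return values are hypothetical,
   throttling of challenge ACKs is omitted, and the second element of
   the result records the conclusion reached above that a CE mark on a
   RST is never echoed.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap modulo 2^32


def classify_rst(seg_seq, rcv_nxt, rcv_wnd):
    """Return (action, echo_ce) for an arriving RST segment."""
    # Distance of the RST's sequence number ahead of RCV.NXT.
    offset = (seg_seq - rcv_nxt) % SEQ_MOD
    if offset == 0:
        action = "accept_and_close"   # exactly the next expected number
    elif offset < rcv_wnd:
        action = "challenge_ack"      # in-window but not exact
    else:
        action = "discard"            # out-of-window
    # In no case is there a sender left that could respond to the CE.
    return action, False
```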

1291	   Although a congestion response following a CE-marking on a RST does
1292	   not appear to make sense, the following factors have been considered
1293	   before deciding whether the sender ought to set ECT on a RST message:

1295	   o  As explained above, a congestion response by the sender of a CE-
1296	      marked RST message is not possible;

1298	   o  So the only reason for the sender setting ECT on a RST would be to
1299	      improve the reliability of the message's delivery;

1301	   o  RST messages are used to both mount and mitigate attacks:

1303	      *  Spoofed RST messages are used by attackers to terminate ongoing
1304	         connections, although the mitigations in RFC 5961 have
1305	         considerably raised the bar against off-path RST attacks;

1307	      *  Legitimate RST messages allow endpoints to inform their peers
1308	         to eliminate existing state that corresponds to non-existent
1309	         connections, liberating resources, e.g. in DoS attack
1310	         scenarios;

1312	   o  AQMs are advised to disable ECN marking during persistent
1313	      overload, so:

1315	      *  it is harder for an attacker to exploit ECN to intensify an
1316	         attack;

1318	      *  it is harder for a legitimate user to exploit ECN to more
1319	         reliably mitigate an attack;

1321	   o  Prohibiting ECT on a RST would deny the benefit of ECN to
1322	      legitimate RST messages, but not to attackers who can disregard
1323	      RFCs;

1325	   o  If ECT were prohibited on RSTs:

1327	      *  it would be easy for security middleboxes to discard all ECN-
1328	         capable RSTs;

1330	      *  However, unlike a SYN flood, it is already easy for a security
1331	         middlebox (or host) to distinguish a RST flood from legitimate
1332	         traffic [RFC5961], and even if some legitimate RSTs are
1333	         accidentally removed as well, legitimate connections still
1334	         function.

1336	   So, on balance, it has been decided that it is worth experimenting
1337	   with ECT on RSTs.  During experiments, if the ECN capability on RSTs
1338	   is found to open a vulnerability that is hard to close, this decision
1339	   can be reversed, before it is specified for the standards track.

1341	4.8.  Retransmitted Packets

1343	   RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets.
1344	   The rationale for this consumes nearly 2 pages of RFC 3168, so the
1345	   reader is referred to section 6.1.5 of RFC 3168, rather than quoting
1346	   it all here.  There are essentially three arguments, namely:
1347	   reliability; DoS attacks; and over-reaction to congestion.  We
1348	   address them in order below.

1350	   The reliability argument has already been addressed in Section 4.1.

1352	   Protection against DoS attacks is not afforded by prohibiting ECT on
1353	   retransmitted packets.  An attacker can set CE on spoofed
1354	   retransmissions whether or not it is prohibited by an RFC.
1355	   Protection against the DoS attack described in section 6.1.5 of RFC
1356	   3168 is solely afforded by the requirement that "the TCP data
1357	   receiver SHOULD ignore the CE codepoint on out-of-window packets".
1358	   Therefore in Section 3.2.7 the sender is allowed to set ECT on
1359	   retransmitted packets, in order to reduce the chance of them being
1360	   dropped.  We also strengthen the receiver's requirement from "SHOULD
1361	   ignore" to "MUST ignore".  And we generalize the receiver's
1362	   requirement to include failure of any validity check, not just out-
1363	   of-window checks, in order to include the more stringent validity
1364	   checks in RFC 5961 that have been developed since RFC 3168.
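
   The receiver rule can be sketched as below.  This is a simplified
   illustration covering only the out-of-window check for segments
   that carry data; the additional validity checks of RFC 5961 are not
   modelled, and the function names are hypothetical.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap modulo 2^32
CE = 0b11          # ECN field value for Congestion Experienced


def in_receive_window(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Simplified acceptability test for a segment carrying data."""
    if seg_len <= 0:
        raise ValueError("this sketch only covers segments with data")
    if rcv_wnd == 0:
        return False  # no data is acceptable while the window is closed
    first = (seg_seq - rcv_nxt) % SEQ_MOD
    last = (seg_seq + seg_len - 1 - rcv_nxt) % SEQ_MOD
    # Acceptable if the first or the last octet lies in the window.
    return first < rcv_wnd or last < rcv_wnd


def ce_counts(ecn_field, seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """True if a CE mark on this segment should be fed back."""
    return ecn_field == CE and in_receive_window(
        seg_seq, seg_len, rcv_nxt, rcv_wnd)
```

   A retransmission whose data has already been received in full lies
   entirely below RCV.NXT, so its CE mark is ignored, whereas a
   retransmission that fills the hole at RCV.NXT is counted.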

1366	   A consequence is that, for those retransmitted packets that arrive at
1367	   the receiver after the original packet has been properly received
1368	   (so-called spurious retransmissions), any CE marking will be ignored.
1369	   There is no problem with that because the fact that the original
1370	   packet has been delivered implies that the sender's original
1371	   congestion response (when it deemed the packet lost and retransmitted
1372	   it) was unnecessary.

1374	   Finally, the third argument is about over-reacting to congestion.
1375	   The argument goes that, if a retransmitted packet is dropped, the
1376	   sender will not detect it, so it will not react again to congestion
1377	   (it would have reduced its congestion window already when it
1378	   retransmitted the packet).  Whereas, if retransmitted packets can be
1379	   CE tagged instead of dropped, senders could potentially react more
1380	   than once to congestion.  However, we argue that it is legitimate to
1381	   respond again to congestion if it still persists in subsequent round
1382	   trip(s).

1384	   Therefore, in all three cases, it is not incorrect to set ECT on
1385	   retransmissions.

1387	4.9.  General Fall-back for any Control Packet

1389	   Extensive experiments have found no evidence of any traversal
1390	   problems with ECT on any TCP control packet {ToDo: reference (under
1391	   submission)}. Nonetheless, Sections 3.2.1.4 and 3.2.2.3 specify fall-
1392	   back measures if ECT on the first packet of each half-connection (SYN
1393	   or SYN-ACK) appears to be blocking progress.  Here, the question of
1394	   fall-back measures for ECT on other control packets is explored.  It
1395	   supports the advice given in Section 3.2.8; until there's evidence
1396	   that something's broken, don't fix it.

1398	   If an implementation has had to disable ECT to ensure the first
1399	   packet of a flow (SYN or SYN-ACK) gets through, the question arises
1400	   whether it ought to disable ECT on all subsequent control packets
1401	   within the same TCP connection.  Without evidence of any such
1402	   problems, this seems unnecessarily cautious, particularly given that
1403	   it would be hard to detect loss of most other types of TCP control
1404	   packets that are not ACK'd.  Moreover, unnecessarily removing ECT
1405	   from other control packets could lead to performance problems, e.g.
1406	   by directing them into an inferior queue
1407	   [I-D.ietf-tsvwg-ecn-l4s-id] or over a different path, because some
1408	   broken multipath equipment (erroneously) routes based on all 8 bits
1409	   of the Diffserv field.

1411	   In the case where a connection starts without ECT on the SYN (perhaps
1412	   because problems with previous connections had been cached), there
1413	   will have been no test for ECT traversal in the client-server
1414	   direction until the pure ACK that completes the handshake.  It is
1415	   possible that some middlebox might block ECT on this pure ACK or on
1416	   later retransmissions of lost packets.  Similarly, after a route
1417	   change, the new path might include some middlebox that blocks ECT on
1418	   some or all TCP control packets.  However, without evidence of such
1419	   problems, the complexity of a fix does not seem worthwhile.

1421	      MORE MEASUREMENTS NEEDED (?): If further two-ended measurements do
1422	      find evidence for these traversal problems, measurements would be
1423	      needed to check for correlation of ECT traversal problems between
1424	      different control packets.  It might then be necessary to
1425	      introduce a catch-all fall-back rule that disables ECT on certain
1426	      subsequent TCP control packets based on some criteria developed
1427	      from these measurements.

1429	5.  Interaction with popular variants or derivatives of TCP

1431	   The following subsections discuss any interactions between setting
1432	   ECT on all packets and using the following popular variants of TCP:
1433	   IW10 and TFO.  They also briefly note the possibility that the
1434	   principles applied here should translate to protocols derived from
1435	   TCP.  This section is informative not normative, because no
1436	   interactions have been identified that require any change to
1437	   specifications.  The subsection on IW10 discusses potential changes
1438	   to specifications but recommends that no changes are needed.

1440	   The designs of the following TCP variants have also been assessed and
1441	   found not to interact adversely with ECT on TCP control packets: SYN
1442	   cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]),
1443	   TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

1445	5.1.  IW10

1447	   IW10 is an experiment to determine whether it is safe for TCP to use
1448	   an initial window of 10 SMSS [RFC6928].

1450	   This subsection does not recommend any additions to the present
1451	   specification in order to interwork with IW10.  The specifications as
1452	   they stand are safe, and there is only a corner-case with ECT on the
1453	   SYN where performance could be occasionally improved, as explained
1454	   below.

1456	   As specified in Section 3.2.1.1, a TCP initiator can only set ECT on
1457	   the SYN if it requests AccECN support.  If, however, the SYN-ACK
1458	   tells the initiator that the responder does not support AccECN,
1459	   Section 3.2.1.1 advises the initiator to conservatively reduce its
1460	   initial window to 1 SMSS because, if the SYN was CE-marked, the SYN-
1461	   ACK has no way to feed that back.

1463	   If the initiator implements IW10, it seems rather over-conservative
1464	   to reduce IW from 10 to 1 just in case a congestion marking was
1465	   missed.  Nonetheless, the reduction to 1 SMSS will rarely harm
1466	   performance, because:

1468	   o  as long as the initiator is caching failures to negotiate AccECN,
1469	      subsequent attempts to access the same server will not use ECT on
1470	      the SYN anyway, so there will no longer be any need to
1471	      conservatively reduce IW;

1473	   o  currently it is not common for a TCP initiator (client) to have
1474	      more than one data segment to send {ToDo: evidence/reference?} -
1475	      IW10 is primarily exploited by TCP servers.

1477	   If a responder receives feedback that the SYN-ACK was CE-marked,
1478	   Section 3.2.2.2 mandates that it reduces its initial window to 1
1479	   SMSS.  When the responder also implements IW10, it is particularly
1480	   important to adhere to this requirement in order to avoid overflowing
1481	   a queue that is clearly already congested.
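
   The initial window choices discussed in this subsection can be
   sketched as below.  The function names and parameters are
   hypothetical, windows are expressed in units of SMSS, and IW10 is
   simplified to a flat 10 segments.  The case where AccECN feedback
   reports a CE mark on the SYN is not covered by this sketch.

```python
def initiator_initial_window(ect_on_syn, responder_supports_accecn,
                             default_iw=10):
    """Initial window (in SMSS) for the TCP initiator."""
    if ect_on_syn and not responder_supports_accecn:
        # A CE mark on the SYN could not have been fed back, so the
        # initiator conservatively assumes it might have been marked.
        return 1
    return default_iw


def responder_initial_window(syn_ack_ce_fed_back, default_iw=10):
    """Initial window (in SMSS) for the TCP responder."""
    if syn_ack_ce_fed_back:
        # The queue is clearly already congested.
        return 1
    return default_iw
```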

1483	5.2.  TFO

1485	   TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round
1486	   trip delay of TCP's 3-way hand-shake (3WHS).  A TFO initiator caches
1487	   a cookie from a previous connection with a TFO-enabled server.  Then,
1488	   for subsequent connections to the same server, any data included on
1489	   the SYN can be passed directly to the server application, which can
1490	   then return up to an initial window of response data on the SYN-ACK
1491	   and on data segments straight after it, without waiting for the ACK
1492	   that completes the 3WHS.

1494	   The TFO experiment and the present experiment to add ECN-support for
1495	   TCP control packets can be combined without altering either
1496	   specification, which is justified as follows:

1498	   o  The handling of ECN marking on a SYN is no different whether or
1499	      not it carries data.

1501	   o  In response to any CE-marking on the SYN-ACK, the responder adopts
1502	      the normal response to congestion, as discussed in Section 7.2 of
1503	      [RFC7413].

1505	5.3.  TCP Derivatives

1507	   Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
1508	   track transport protocol derived from TCP.  SCTP currently does not
1509	   include ECN support, but Appendix A of RFC 4960 broadly describes how
1510	   it would be supported and a (long-expired) draft on the addition of
1511	   ECN to SCTP has been produced [I-D.stewart-tsvwg-sctpecn].  This
1512	   draft avoided setting ECT on control packets and retransmissions,
1513	   closely following the arguments in RFC 3168.

1515	   QUIC [I-D.ietf-quic-transport] is another standards track transport
1516	   protocol offering similar services to TCP but intended to exploit
1517	   some of the benefits of running over UDP.  A way to add ECN support
1518	   to QUIC has been proposed [I-D.johansson-quic-ecn].

1520	   Experience from experiments on adding ECN support to all TCP packets
1521	   ought to be directly transferable to derivatives of TCP, like SCTP or
1522	   QUIC.

1524	6.  Security Considerations

1526	   Section 3.2.6 considers the question of whether ECT on RSTs will
1527	   allow RST attacks to be intensified.  There are several security
1528	   arguments presented in RFC 3168 for preventing the ECN marking of TCP
1529	   control packets and retransmitted segments.  We believe all of them
1530	   have been properly addressed in Section 4, particularly Section 4.2.3
1531	   and Section 4.8 on DoS attacks using spoofed ECT-marked SYNs and
1532	   spoofed CE-marked retransmissions.

1534	7.  IANA Considerations

1536	   There are no IANA considerations in this memo.

1538	8.  Acknowledgments

1540	   Thanks to Mirja Kuehlewind, David Black, Padma Bhooma and Gorry
1541	   Fairhurst for their useful reviews.

1543	   The work of Marcelo Bagnulo has been performed in the framework of
1544	   the H2020-ICT-2014-2 project 5G NORMA.  His contribution reflects the
1545	   consortium's view, but the consortium is not liable for any use that
1546	   may be made of any of the information contained therein.

1548	9.  References

1549	9.1.  Normative References

1551	   [I-D.ietf-tcpm-accurate-ecn]
1552	              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
1553	              Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate-
1554	              ecn-03 (work in progress), May 2017.

1556	   [I-D.ietf-tsvwg-ecn-experimentation]
1557	              Black, D., "Explicit Congestion Notification (ECN)
1558	              Experimentation", draft-ietf-tsvwg-ecn-experimentation-06
1559	              (work in progress), September 2017.

1561	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1562	              Requirement Levels", BCP 14, RFC 2119,
1563	              DOI 10.17487/RFC2119, March 1997,
1564	              <https://www.rfc-editor.org/info/rfc2119>.

1566	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1567	              of Explicit Congestion Notification (ECN) to IP",
1568	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
1569	              <https://www.rfc-editor.org/info/rfc3168>.

1571	   [RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
1572	              Robustness to Blind In-Window Attacks", RFC 5961,
1573	              DOI 10.17487/RFC5961, August 2010,
1574	              <https://www.rfc-editor.org/info/rfc5961>.

1576	9.2.  Informative References

1578	   [ecn-pam]  Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
1579	              Fairhurst, G., and R. Scheffenegger, "Enabling Internet-
1580	              Wide Deployment of Explicit Congestion Notification",
1581	              Int'l Conf. on Passive and Active Network Measurement
1582	              (PAM'15) pp193-205, 2015.

1584	   [ECN-PLUS]
1585	              Kuzmanovic, A., "The Power of Explicit Congestion
1586	              Notification", ACM SIGCOMM 35(4):61--72, 2005.

1588	   [I-D.ietf-quic-transport]
1589	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1590	              and Secure Transport", draft-ietf-quic-transport-06 (work
1591	              in progress), September 2017.

1593	   [I-D.ietf-tcpm-dctcp]
1594	              Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
1595	              and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion
1596	              Control for Datacenters", draft-ietf-tcpm-dctcp-10 (work
1597	              in progress), August 2017.

1599	   [I-D.ietf-tsvwg-ecn-l4s-id]
1600	              Schepper, K. and B. Briscoe, "Identifying Modified
1601	              Explicit Congestion Notification (ECN) Semantics for
1602	              Ultra-Low Queuing Delay", draft-ietf-tsvwg-ecn-l4s-id-00
1603	              (work in progress), May 2017.

1605	   [I-D.ietf-tsvwg-l4s-arch]
1606	              Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency,
1607	              Low Loss, Scalable Throughput (L4S) Internet Service:
1608	              Architecture", draft-ietf-tsvwg-l4s-arch-00 (work in
1609	              progress), May 2017.

1611	   [I-D.johansson-quic-ecn]
1612	              Johansson, I., "ECN support in QUIC", draft-johansson-
1613	              quic-ecn-03 (work in progress), May 2017.

1615	   [I-D.stewart-tsvwg-sctpecn]
1616	              Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
1617	              Control Transmission Protocol (SCTP)", draft-stewart-
1618	              tsvwg-sctpecn-05 (work in progress), January 2014.

1620	   [judd-nsdi]
1621	              Judd, G., "Attaining the promise and avoiding the pitfalls
1622	              of TCP in the Datacenter", USENIX Symposium on Networked
1623	              Systems Design and Implementation (NSDI'15) pp.145-157,
1624	              May 2015.

1626	   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
1627	              RFC 793, DOI 10.17487/RFC0793, September 1981,
1628	              <https://www.rfc-editor.org/info/rfc793>.

1630	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1631	              Communication Layers", STD 3, RFC 1122,
1632	              DOI 10.17487/RFC1122, October 1989,
1633	              <https://www.rfc-editor.org/info/rfc1122>.

1635	   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
1636	              Congestion Notification (ECN) Signaling with Nonces",
1637	              RFC 3540, DOI 10.17487/RFC3540, June 2003,
1638	              <https://www.rfc-editor.org/info/rfc3540>.

1640	   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
1641	              RFC 4960, DOI 10.17487/RFC4960, September 2007,
1642	              <https://www.rfc-editor.org/info/rfc4960>.

1644	   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
1645	              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007,
1646	              <https://www.rfc-editor.org/info/rfc4987>.

1648	   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
1649	              Ramakrishnan, "Adding Explicit Congestion Notification
1650	              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
1651	              DOI 10.17487/RFC5562, June 2009,
1652	              <https://www.rfc-editor.org/info/rfc5562>.

1654	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1655	              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
1656	              <https://www.rfc-editor.org/info/rfc5681>.

1658	   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
1659	              Acknowledgement Congestion Control to TCP", RFC 5690,
1660	              DOI 10.17487/RFC5690, February 2010,
1661	              <https://www.rfc-editor.org/info/rfc5690>.

1663	   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
1664	              "Computing TCP's Retransmission Timer", RFC 6298,
1665	              DOI 10.17487/RFC6298, June 2011,
1666	              <https://www.rfc-editor.org/info/rfc6298>.

1668	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
1669	              "Increasing TCP's Initial Window", RFC 6928,
1670	              DOI 10.17487/RFC6928, April 2013,
1671	              <https://www.rfc-editor.org/info/rfc6928>.

1673	   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
1674	              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014,
1675	              <https://www.rfc-editor.org/info/rfc7413>.

1677	   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
1678	              Recommendations Regarding Active Queue Management",
1679	              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
1680	              <https://www.rfc-editor.org/info/rfc7567>.

1682	   [RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
1683	              TCP to Support Rate-Limited Traffic", RFC 7661,
1684	              DOI 10.17487/RFC7661, October 2015,
1685	              <https://www.rfc-editor.org/info/rfc7661>.

1687	Authors' Addresses

1689	   Marcelo Bagnulo
1690	   Universidad Carlos III de Madrid
1691	   Av. Universidad 30
1692	   Leganes, Madrid  28911
1693	   SPAIN

1695	   Phone: 34 91 6249500
1696	   Email: marcelo@it.uc3m.es
1697	   URI:   http://www.it.uc3m.es

1699	   Bob Briscoe
1700	   CableLabs
1701	   UK

1703	   Email: ietf@bobbriscoe.net
1704	   URI:   http://bobbriscoe.net/