idnits 2.17.1 

draft-briscoe-aqm-dualq-coupled-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (August 07, 2015) is 3184 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '0' on line 788

  -- Looks like a reference, but probably isn't: '1' on line 788

  == Unused Reference: 'RFC4774' is defined on line 620, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-06) exists of
     draft-ietf-aqm-fq-codel-01

  == Outdated reference: A later version (-10) exists of draft-ietf-aqm-pie-01

  -- Obsolete informational reference (is this intentional?): RFC 2309
     (Obsoleted by RFC 7567)


     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Active Queue Management (aqm)                             K. De Schepper
3	Internet-Draft                                                 Bell Labs
4	Intended status: Standards Track                         B. Briscoe, Ed.
5	Expires: February 8, 2016                                    Independent
6	                                                           O. Bondarenko
7	                                                     Simula Research Lab
8	                                                                I. Tsang
9	                                                               Bell Labs
10	                                                         August 07, 2015

12	  DualQ Coupled AQM for Low Latency, Low Loss and Scalable Throughput
13	                   draft-briscoe-aqm-dualq-coupled-00

15	Abstract

17	   Data Centre TCP (DCTCP) was designed to provide predictably low
18	   queuing latency, near-zero loss, and throughput scalability using
19	   explicit congestion notification (ECN) and an extremely simple
20	   marking behaviour on switches.  However, DCTCP does not co-exist with
21	   existing TCP traffic---throughput starves.  So, until now, DCTCP
22	   could only be deployed where a clean-slate environment could be
23	   arranged, such as in private data centres.  This specification
24	   defines `DualQ Coupled Active Queue Management (AQM)' to allow
25	   scalable congestion controls like DCTCP to safely co-exist with
26	   classic Internet traffic.  The Coupled AQM ensures that a flow runs
27	   at about the same rate whether it uses DCTCP or TCP Reno/Cubic, but
28	   without inspecting transport layer flow identifiers.  When tested in
29	   a residential broadband setting, DCTCP achieved sub-millisecond
30	   average queuing delay and zero congestion loss under a wide range of
31	   mixes of DCTCP and `Classic' broadband Internet traffic, without
32	   compromising the performance of the Classic traffic.  The solution
33	   also reduces network complexity and eliminates network configuration.

35	Status of This Memo

37	   This Internet-Draft is submitted in full conformance with the
38	   provisions of BCP 78 and BCP 79.

40	   Internet-Drafts are working documents of the Internet Engineering
41	   Task Force (IETF).  Note that other groups may also distribute
42	   working documents as Internet-Drafts.  The list of current Internet-
43	   Drafts is at http://datatracker.ietf.org/drafts/current/.

45	   Internet-Drafts are draft documents valid for a maximum of six months
46	   and may be updated, replaced, or obsoleted by other documents at any
47	   time.  It is inappropriate to use Internet-Drafts as reference
48	   material or to cite them other than as "work in progress."
49	   This Internet-Draft will expire on February 8, 2016.

51	Copyright Notice

53	   Copyright (c) 2015 IETF Trust and the persons identified as the
54	   document authors.  All rights reserved.

56	   This document is subject to BCP 78 and the IETF Trust's Legal
57	   Provisions Relating to IETF Documents
58	   (http://trustee.ietf.org/license-info) in effect on the date of
59	   publication of this document.  Please review these documents
60	   carefully, as they describe your rights and restrictions with respect
61	   to this document.  Code Components extracted from this document must
62	   include Simplified BSD License text as described in Section 4.e of
63	   the Trust Legal Provisions and are provided without warranty as
64	   described in the Simplified BSD License.

66	Table of Contents

68	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
69	     1.1.  Problem and Scope . . . . . . . . . . . . . . . . . . . .   2
70	     1.2.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   4
71	     1.3.  Features  . . . . . . . . . . . . . . . . . . . . . . . .   5
72	   2.  DualQ Coupled AQM Algorithm . . . . . . . . . . . . . . . . .   6
73	     2.1.  Coupled AQM . . . . . . . . . . . . . . . . . . . . . . .   6
74	     2.2.  Dual Queue  . . . . . . . . . . . . . . . . . . . . . . .   7
75	     2.3.  Traffic Classification  . . . . . . . . . . . . . . . . .   7
76	     2.4.  Normative Requirements  . . . . . . . . . . . . . . . . .   9
77	   3.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
78	   4.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
79	     4.1.  Overload Handling . . . . . . . . . . . . . . . . . . . .  10
80	   5.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  11
81	   6.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  11
82	     6.1.  Normative References  . . . . . . . . . . . . . . . . . .  11
83	     6.2.  Informative References  . . . . . . . . . . . . . . . . .  11
84	   Appendix A.  Example DualQ Coupled Algorithm  . . . . . . . . . .  14
85	   Appendix B.  Guidance on Controlling Throughput Equivalence . . .  20
86	   Appendix C.  DCTCP Safety Enhancements  . . . . . . . . . . . . .  21
87	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  22

89	1.  Introduction

91	1.1.  Problem and Scope

93	   Latency is becoming the critical performance factor for many (most?)
94	   applications on the public Internet, e.g.  Web, voice, conversational
95	   video, gaming and finance apps.  In the developed world, further
96	   increases in access network bit-rate offer diminishing returns,
97	   whereas latency is still a multi-faceted problem.  In the last decade
98	   or so, much has been done to reduce propagation time by placing
99	   caches or servers closer to users.  However, queuing remains a major
100	   component of latency.

102	   The Diffserv architecture provides Expedited Forwarding [RFC3246], so
103	   that low latency traffic can jump the queue of other traffic.
104	   However, on access links dedicated to individual sites (homes, small
105	   enterprises or mobile devices), often all traffic at any one time
106	   will be latency-sensitive.  Then Diffserv is of little use.  Instead,
107	   we need to remove the causes of any unnecessary delay.

109	   The bufferbloat project has shown that excessively-large buffering
110	   (`bufferbloat') has been introducing significantly more delay than
111	   the underlying propagation time.  These delays appear only
112	   intermittently--only when a capacity-seeking (e.g.  TCP) flow is long
113	   enough for the queue to fill the buffer, making every packet in other
114	   flows sharing the buffer sit through the queue.

116	   Active queue management (AQM) was originally developed to solve this
117	   problem (and others).  Unlike Diffserv, AQM controls latency for
118	   _all_ traffic in a class.  In general, AQMs introduce an increasing
119	   level of discard from the buffer the longer the queue persists above
120	   a shallow threshold.  This gives sufficient signals to capacity-
121	   seeking (aka. greedy) flows to keep the buffer empty for its intended
122	   purpose: absorbing bursts.  However, RED [RFC2309] and other
123	   algorithms from the 1990s were sensitive to their configuration and
124	   hard to set correctly.  So, AQM was not widely deployed.  More recent
125	   state-of-the-art AQMs, e.g.  fq_CoDel [I-D.ietf-aqm-fq-codel],
126	   PIE [I-D.ietf-aqm-pie], Adaptive RED [ARED01], define the threshold
127	   in time not bytes, so it is invariant for different link rates.

129	   It seems that further changes to the network alone will now yield
130	   diminishing returns.  Data Centre TCP
131	   (DCTCP [I-D.bensley-tcpm-dctcp]) teaches us that a small but radical
132	   change to TCP is needed to cut two major outstanding causes of
133	   queuing delay variability:

135	   1.  the `sawtooth' varying rate of TCP itself;

137	   2.  the smoothing delay deliberately introduced into AQMs to permit
138	       bursts without triggering losses.

140	   The former causes a flow's round trip time (RTT) to vary from about 1
141	   to 2 times the base RTT between the machines in question.  The latter
142	   delays the system's response to change by a worst-case
143	   (transcontinental) RTT, which could be hundreds of times the actual
144	   RTT of typical traffic from localized CDNs.

146	   Latency is not our only concern:

148	   3.  It was known when TCP was first developed that it would not scale
149	       to high bandwidth-delay products.

151	   Given regular broadband bit-rates over WAN distances are
152	   already [RFC3649] beyond the scaling range of `classic' TCP Reno,
153	   `less unscalable' Cubic [I-D.zimmermann-tcpm-cubic] and
154	   Compound [I-D.sridharan-tcpm-ctcp] variants of TCP have been
155	   successfully deployed.  However, these are now approaching their
156	   scaling limits.  Unfortunately, fully scalable TCPs such as DCTCP
157	   cause `classic' TCP to starve itself, which is why they have been
158	   confined to private data centres or research testbeds (until now).

160	   This document specifies a `DualQ Coupled AQM' that solves the problem
161	   of coexistence between DCTCP and classic flows, without having to
162	   inspect flow identifiers.  The AQM is not like flow-queuing
163	   approaches [I-D.ietf-aqm-fq-codel] that classify packets by flow
164	   identifier into numerous separate queues in order to isolate sparse
165	   flows from the higher latency in the queues assigned to heavier flow.
166	   In contrast, the AQM exploits the behaviour of scalable congestion
167	   controls like DCTCP so that every packet in every flow sharing the
168	   queue for DCTCP-like traffic can be served with very low latency.

170	   The AQM needs fewer operations per packet than RED uses.  Also, no
171	   network configuration is needed for a wide range of scenarios where
172	   the range of RTTs is typical for the public Internet.  Therefore it
173	   is believed the Coupled AQM would be applicable and easy to deploy in
174	   all types of buffers; buffers in cost-reduced mass-market residential
175	   equipment; buffers in end-system stacks; buffers in carrier-scale
176	   equipment including remote access servers, routers, firewalls and
177	   Ethernet switches; buffers in network interface cards, buffers in
178	   virtualized network appliances, hypervisors, and so on.

180	   The supporting paper [DCttH15] gives the full rationale for the AQM's
181	   design, both discursively and in more precise mathematical form.

183	1.2.  Terminology

185	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
186	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
187	   document are to be interpreted as described in [RFC2119].  In this
188	   document, these words will appear with that interpretation only when
189	   in ALL CAPS.  Lower case uses of these words are not to be
190	   interpreted as carrying RFC-2119 significance.

192	   The DualQ Coupled AQM uses two queues for two services.  Each of the
193	   following terms identifies both the service and the queue that
194	   provides the service:

196	   Classic (denoted by subscript C):  The `Classic' service is intended
197	      for all the behaviours that currently co-exist with TCP Reno (TCP
198	      Cubic, Compound, SCTP, etc).

200	   Low-Latency, Low-Loss and Scalable (L4S, denoted by subscript L):
201	      The `L4S' service is intended for DCTCP traffic but it is also
202	      more general--it will allow a set of congestion controls with
203	      similar scaling properties to DCTCP (e.g.  Relentless [Mathis09])
204	      to evolve.

206	   Either service can cope with a proportion of unresponsive or less-
207	   responsive traffic as well (e.g.  DNS, VoIP, etc).

209	1.3.  Features

211	   The AQM couples marking and/or dropping across the two queues such
212	   that a flow will get roughly the same throughput whichever it uses.
213	   Therefore both queues can feed into the full capacity of a link and
214	   no rates need to be configured for the queues.  The L4S queue enables
215	   scalable congestion controls like DCTCP to give stunningly low and
216	   predictably low latency, without compromising the performance of
217	   competing 'Classic' Internet traffic.  Thousands of tests have been
218	   conducted in a typical fixed residential broadband setting.  Typical
219	   experiments used a base round trip delay of 7ms between the data
220	   centre and home network, and large amounts of background traffic in
221	   both queues.  For every L4S packet, the AQM kept the 99th percentile
222	   of queuing delay to about 1ms, and no losses at all were introduced
223	   by the AQM.  Details of the extensive experiments will be made
224	   available [DCttH15].

226	   Subjective testing was also conducted using a demanding panoramic
227	   interactive video application run over a stack with DCTCP enabled and
228	   deployed on the testbed.  Each user could pan or zoom their own high
229	   definition (HD) sub-window of a larger video scene from a football
230	   match.  Even though the user was also downloading large amounts of
231	   L4S and Classic data, latency was so low that the picture appeared to
232	   stick to their finger on the touchpad (all the L4S data achieved the
233	   same ultra-low latency).  With an alternative AQM, the video
234	   noticeably lagged behind the finger gestures.

236	   Unlike Diffserv Expedited Forwarding, the L4S queue does not have to
237	   be limited to a small proportion of the link capacity in order to
238	   achieve low delay.  The L4S queue can be filled with a heavy load of
239	   capacity-seeking flows like DCTCP and still achieve low delay.  The
240	   L4S queue does not rely on the presence of other traffic in the
241	   Classic queue that can be 'overtaken'.  It gives low latency to L4S
242	   traffic whether or not there is Classic traffic, and the latency of
243	   Classic traffic does not suffer when a proportion of the traffic is
244	   L4S.  The two queues are only necessary because DCTCP-like flows
245	   cannot keep latency predictably low and keep utilization high if they
246	   are mixed with legacy TCP flows,

248	   The experiments used the Linux implementation of DCTCP that is
249	   deployed in private data centres, without any modification despite
250	   its known deficiencies.  Nonetheless, certain modifications will be
251	   necessary before DCTCP is safe to use on the Internet, which are
252	   recorded for now in Appendix C.  However, the focus of this
253	   specification is to get the network service in place.  Then, without
254	   any management intervention, applications can exploit it by migrating
255	   to scalable controls like DCTCP, which can then evolve _while_ their
256	   benefits are being enjoyed by everyone on the Internet.

258	2.  DualQ Coupled AQM Algorithm

260	   There are two main aspects to the algorithm:

262	   o  the Coupled AQM that addresses throughput equivalence between
263	      Classic (e.g.  Reno, Cubic) flows and L4S (e.g.  DCTCP) flows

265	   o  the Dual Queue structure that provides latency separation for L4S
266	      flows to isolate them from the typically large Classic queue.

268	2.1.  Coupled AQM

270	   In the 1990s, the `TCP formula' was derived for the relationship
271	   between TCP's congestion window, cwnd, and its drop probability, p.
272	   To a first order approximation, cwnd of TCP Reno is inversely
273	   proportional to the square root of p.  TCP Cubic implements a Reno-
274	   compatibility mode, which is the only relevant mode for typical RTTs
275	   under 20ms, while the throughput of a single flow is less than about
276	   500Mb/s.  Therefore we can assume that Cubic traffic behaves similar
277	   to Reno (but with a slightly different constant of proportionality),
278	   and we shall use the term 'Classic' for the collection of Reno and
279	   Cubic in Reno mode.

281	   In our supporting paper [DCttH15], we derive the equivalent rate
282	   equation for DCTCP, for which cwnd is inversely proportional to p
283	   (not the square root), where in this case p is the ECN marking
284	   probability.  DCTCP is not the only congestion control that behaves
285	   like this, so we use the term 'L4S' traffic for all similar
286	   behaviour.

288	   In order to make a DCTCP flow run at roughly the same rate as a Reno
289	   TCP flow (all other factors being equal), we make the drop
290	   probability for Classic traffic, p_C distinct from the marking
291	   probability for L4S traffic, p_L (in contrast to RFC3168 which
292	   requires them to be the same).  We make the Classic drop probability
293	   p_C proportional to the square of the L4S marking probability p_L.
294	   This is because we need to make the Reno flow rate equal the DCTCP
295	   flow rate, so we have to square the square root of p_C in the Reno
296	   rate equation to make it the same as the straight p_L in the DCTCP
297	   rate equation.

299	   There is a really simple way to implement the square of a probability
300	   - by testing the queue against two random numbers not one.  This is
301	   the approach adopted in Appendix A.

303	   Stating this as a formula, the relation between Classic drop
304	   probability, p_C, and L4S marking probability, p_L needs to take the
305	   form:

307	       p_C = ( p_L / 2^k )^2                  (1)

309	   where 2^k is the constant of proportionality, which is expressed as a
310	   power of 2 so that implementations can avoid costly division by
311	   shifting p_L by k bits to the right.

313	2.2.  Dual Queue

315	   Classic traffic builds a large queue, so a separate queue is provided
316	   for L4S traffic, and it is scheduled with strict priority.
317	   Nonetheless, coupled marking ensures that giving priority to L4S
318	   traffic still leaves the right amount of spare scheduling time for
319	   Classic flows to each get equivalent throughput to DCTCP flows (all
320	   other factors such as RTT being equal).  The algorithm achieves this
321	   without having to inspect flow identifiers.

323	2.3.  Traffic Classification

325	   Both the Coupled AQM and DualQ mechanisms need an identifier to
326	   distinguish L4S and C packets, which will need to be standardized.
327	   In our tests we used a cleared ECN field to indicate C packets and
328	   L4S otherwise.  The ECN specification [RFC3168] currently defines a
329	   mark as equivalent to a drop.  However, it says

331	      "An environment where all end nodes were ECN-Capable could allow
332	      new criteria to be developed for setting the CE codepoint, and new
333	      congestion control mechanisms for end-node reaction to CE packets.
334	      However, this is a research issue, and as such is not addressed in
335	      this document."

337	   and [RFC4774]} gives valid ways to alter ECN's semantics without
338	   harming interoperability.

340	   Since publication in 2001,deployment of RFC3168 ECN has been dogged
341	   by bugs and misunderstandings.  In recent years RFC3168 ECN has been
342	   deployed quite successfully on servers [ECN_Deploy], and until
343	   recently it was deployed but not enabled on a fair proportion of user
344	   machines.  Recently one major developer of client devices has
345	   configured ECN on-by-default in its beta releases.  However although
346	   some network equipment vendors and developers have implemented ECN,
347	   there is little evidence that any public network operator is
348	   considering or has deployed ECN-capable AQMs on network equipment
349	   yet.

351	   A number of private data centre operators have deployed ECN, but not
352	   RFC3168 ECN.  Instead, they are using DCTCP to get predictable ultra-
353	   low latency, and they are either ensuring that there is no non-DCTCP
354	   traffic [I-D.bensley-tcpm-dctcp], or they are segregating such
355	   traffic from DCTCP using Diffserv [DCTCP_Pitfalls].  The RFC3168
356	   approach merely prevents drop, whereas the DCTCP approach provides
357	   scalable throughput and ultra-low latency as well as avoiding drop.
358	   Consequently it has been questioned whether the RFC3168 approach
359	   offers enough performance improvement for an operator to countenance
360	   the cost and risk of deployment.  There has been some discussions at
361	   the IETF on changing the meaning of an ECN mark to move towards the
362	   DCTCP approach.The performance results from our experiments with
363	   DCTCP for broadband residential users are certainly significant
364	   enough to warrant interest from operators.

366	   For those who have managed to get classic ECN widely deployed on end-
367	   systems, moving the goalposts at this stage would be harsh.  If the
368	   meaning of ECN cannot be changed from "equivalent to drop", it would
369	   be possible to identify the L4S service in another way, e.g. a
370	   combination of ECN and Diffserv, or using the ECT(1) codepoint.  The
371	   Diffserv codepoint is not ideal, because L4S is an end-to-end service
372	   and a DSCP is not preserved end-to-end.  However, combining ECN and
373	   Diffserv may be sufficient for initial deployment, while confined to
374	   controlled sets of networks, during which time any users of classic
375	   ECN can upgrade to L4S.  The ECT(1) codepoint is perhaps less ideal,
376	   because two separate uses of ECN really need two codepoints each, and
377	   anyway it could be argued that the last ECN codepoint should not be
378	   burned when the current one is not being used.

380	   This draft does not currently recommend an approach for identifying
381	   for the L4S service, which is initially left open for discussion
382	   within the IETF.

384	2.4.  Normative Requirements

386	   In the Dual Queue, L4S packets MUST be given priority over Classic,
387	   although strict priority MAY not be appropriate.

389	   All L4S traffic MUST be ECN-capable, although some Classic traffic
390	   MAY also be ECN-capable.

392	   Whatever identifier is used for L4S traffic, it will still be
393	   necessary to agree on the meaning of an ECN marking on L4S traffic,
394	   relative to a drop of Classic traffic.  In order to prevent
395	   starvation of Classic traffic by scalable L4S traffic (e.g.  DCTCP)
396	   the drop probability of Classic traffic MUST be proportional to the
397	   square of the marking probability of L4S traffic, In other words, the
398	   power to which p_L is raised in Eqn. (1) MUST be 2.

400	   The constant of proportionality, k, in Eqn (1) determines the
401	   relative flow rates of Classic and L4S flows when the AQM concerned
402	   is the bottleneck (all other factors being equal). k does not have to
403	   be standardized because differences do not prevent interoperability.
404	   However, k has to take some value, and each operator can make that
405	   choice.

407	   A value of k=0 is RECOMMENDED as the default for public Internet
408	   access networks, assuming the DCTCP algorithm remains similar to that
409	   in [I-D.bensley-tcpm-dctcp].  Nonetheless choice of k is a matter of
410	   operator policy, and operators MAY choose a different value using
411	   Table 1 and the guidelines in Appendix B.

413	   Typically, access network operators isolate customers from each other
414	   with some form of layer-2 multiplexing (TDM in DOCSIS, CDMA in 3G) or
415	   L3 scheduling (WRR in broadband), rather than relying on TCP to share
416	   capacity between customers [RFC0970].  In such cases, the choice of k
417	   will solely affect relative flow rates within the customer's access
418	   capacity, not between customers.  Also, k would not affect rates of
419	   small flows, nor long flows at any times when they are all Classic or
420	   all L4S.

422	   An example DualQ Coupled AQM algorithm is given in Appendix A.
423	   Marking and dropping in each queue is based on an AQM called Curvy
424	   RED, which is intended to improve on RED, PIE and CoDel.  We have
425	   found that Curvy RED offers good performance, requires less
426	   operations per packet than RED and is insensitive to configuration.
427	   Nonetheless, it would be possible to control each queue with an
428	   alternative AQM, as long as the above normative requirements (those
429	   expressed in capitals) are observed, which are intended to be
430	   independent of the specific AQM.

432	   {ToDo: Add management and monitoring requirements}

434	3.  IANA Considerations

436	   This specification contains no IANA considerations.

438	4.  Security Considerations

440	4.1.  Overload Handling

442	   Where the interests of users or flows might conflict, it could be
443	   necessary to police traffic to isolate any harm to performance.  This
444	   is a policy issue that needs to be separable from a basic AQM, but
445	   the scheme does need to handle overload.  A trade-off needs to be
446	   made between complexity and the risk of either class harming the
447	   other.  It is an operator policy to define what must happen if the
448	   service time of the classic queue becomes too great.  In the
449	   following subsections three optional non-exclusive overload
450	   protections are defined.  Their objective is for the overload
451	   behaviour of the DualQ AQM to be similar to a single queue AQM.
452	   Other overload protections can be envisaged:

454	   Minimum throughput service:   By replacing the priority scheduler
455	      with a weighted round robin scheduler, a minimum throughput
456	      service can be guaranteed for Classic traffic.  Typically the
457	      scheduling weight of the Classic queue will be small (e.g. 5%) to
458	      avoid interference with the coupling but big enough to avoid
459	      complete starvation of Classic traffic.

461	   Drop on overload:  On severe overload, e.g. due to non responsive
462	      traffic, queues will typically overflow and packet drop will be
463	      unavoidable when the queues reach their limits.  The drop-limit of
464	      each queue should be configured by specifying the maximum
465	      supported load and determining the expected maximum size of each
466	      queue when that load is separately applied to each queue.  The
467	      Classic queue limit will typically be larger than the L4S queue
468	      limit.  Overflow of one traffic type will automatically result in
469	      drop in its respective queue.  Both traffic types will get a high
470	      congestion signal, due to the coupled marking, which will result
471	      in similar starvation of responsive traffic in both queues.  Thus,
472	      the behaviour will be like a single queue AQM.  To further improve
473	      the arrival fairness of a single queue an extra overall AQM limit
474	      can be applied, which is a limit to the sum of both queues.  To be
475	      effective, it should be configured to be less than the sum of the
476	      limits of both queues, but greater than the maximum individual
477	      queue limit.  It ensures that the drop probability of unresponsive
478	      traffic will be independent of its traffic type.

480	   Delay on overload:  To control milder overload of responsive traffic,
481	      particularly when close to the maximum congestion signal, delay
482	      can be used as an alternative congestion control mechanism.  The
483	      Dual Queue Coupled AQM can be made to behave like a single FIFO
484	      queue with differentiated service times by replacing the priority
485	      scheduler with a very simple "biased longest sojourn time first
486	      scheduler".  The bias is defined as a maximum sojourn time
487	      difference (T_m) between the Classic and L4S packets.  The
488	      scheduler adds T_m to the sojourn time of the next L4S packet,
489	      before comparing it with the timestamp of the next Classic packet,
490	      then it selects the packet with the greater adjusted sojourn time.
491	      This time shifted FIFO queue behaves just like a single FIFO queue
492	      under moderate and high overload.

494	5.  Acknowledgements

496	   Thanks to Anil Agarwal for detailed review comments and suggestions
497	   on how to make our explanation clearer.

499	   The authors' contributions are part-funded by the European Community
500	   under its Seventh Framework Programme through the Reducing Internet
501	   Transport Latency (RITE) project (ICT-317700).  The views expressed
502	   here are solely those of the authors.

504	6.  References

506	6.1.  Normative References

508	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
509	              Requirement Levels", BCP 14, RFC 2119,
510	              DOI 10.17487/RFC2119, March 1997,
511	              <http://www.rfc-editor.org/info/rfc2119>.

513	6.2.  Informative References

515	   [ARED01]   Floyd, S., Gummadi, R., and S. Shenker, "Adaptive RED: An
516	              Algorithm for Increasing the Robustness of RED's Active
517	              Queue Management", ACIRI Technical Report , August 2001,
518	              <http://www.icir.org/floyd/red.html>.

520	   [CoDel]    Nichols, K. and V. Jacobson, "Controlling Queue Delay",
521	              ACM Queue 10(5), May 2012,
522	              <http://queue.acm.org/issuedetail.cfm?issue=2208917>.

524	   [CRED_Insights]
525	              Briscoe, B., "Insights from Curvy RED (Random Early
526	              Detection)", BT Technical Report TR-TUB8-2015-003, July
527	              2015,
528	              <http://www.bobbriscoe.net/projects/latency/credi_tr.pdf>.

530	   [DCTCP_Pitfalls]
531	              Judd, G., "Attaining the Promise and Avoiding the Pitfalls
532	              of TCP in the Datacenter", 12th USENIX Symposium on
533	              Networked Systems Design and Implementation (NSDI 15)
534	              145--157, May 2015,
535	              <http://blogs.usenix.org/conference/nsdi15/technical-
536	              sessions/presentation/judd>.

538	   [DCttH15]  De Schepper, K., Bondarenko, O., Briscoe, B., and I.
539	              Tsang, "`Data Centre to the Home': Ultra-Low Latency for
540	              All", 2015, <http://www.bobbriscoe.net/projects/latency/
541	              dctth_preprint.pdf>.

543	              (Under submission)

545	   [ECN_Deploy]
546	              Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
547	              Fairhurst, G., and R. Scheffenegger, "Enabling Internet-
548	              Wide Deployment of Explicit Congestion Notification", Proc
549	              Passive & Active Measurement (PAM'15) Conference , 2015,
550	              <http://ecn.ethz.ch/ecn-pam15.pdf>.

552	   [I-D.bensley-tcpm-dctcp]
553	              Bensley, S., Eggert, L., Thaler, D., Balasubramanian, P.,
554	              and G. Judd, "Microsoft's Datacenter TCP (DCTCP): TCP
555	              Congestion Control for Datacenters", draft-bensley-tcpm-
556	              dctcp-05 (work in progress), July 2015.

558	   [I-D.ietf-aqm-fq-codel]
559	              Hoeiland-Joergensen, T., McKenney, P.,
560	              dave.taht@gmail.com, d., Gettys, J., and E. Dumazet,
561	              "FlowQueue-Codel", draft-ietf-aqm-fq-codel-01 (work in
562	              progress), July 2015.

564	   [I-D.ietf-aqm-pie]
565	              Pan, R., Natarajan, P., Baker, F., and G. White, "PIE: A
566	              Lightweight Control Scheme To Address the Bufferbloat
567	              Problem", draft-ietf-aqm-pie-01 (work in progress), March
568	              2015.

570	   [I-D.ietf-tcpm-accecn-reqs]
571	              Kuehlewind, M., Scheffenegger, R., and B. Briscoe,
572	              "Problem Statement and Requirements for a More Accurate
573	              ECN Feedback", draft-ietf-tcpm-accecn-reqs-08 (work in
574	              progress), March 2015.

576	   [I-D.sridharan-tcpm-ctcp]
577	              Sridharan, M., Tan, K., Bansal, D., and D. Thaler,
578	              "Compound TCP: A New TCP Congestion Control for High-Speed
579	              and Long Distance Networks", draft-sridharan-tcpm-ctcp-02
580	              (work in progress), November 2008.

582	   [I-D.zimmermann-tcpm-cubic]
583	              Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
584	              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
585	              draft-zimmermann-tcpm-cubic-01 (work in progress), April
586	              2015.

588	   [Mathis09]
589	              Mathis, M., "Relentless Congestion Control", PFLDNeT'09 ,
590	              May 2009, <http://www.hpcc.jp/pfldnet2009/
591	              Program_files/1569198525.pdf>.

593	   [RFC0970]  Nagle, J., "On Packet Switches With Infinite Storage",
594	              RFC 970, DOI 10.17487/RFC0970, December 1985,
595	              <http://www.rfc-editor.org/info/rfc970>.

597	   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
598	              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
599	              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
600	              S., Wroclawski, J., and L. Zhang, "Recommendations on
601	              Queue Management and Congestion Avoidance in the
602	              Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,
603	              <http://www.rfc-editor.org/info/rfc2309>.

605	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
606	              of Explicit Congestion Notification (ECN) to IP",
607	              RFC 3168, DOI 10.17487/RFC3168, September 2001,
608	              <http://www.rfc-editor.org/info/rfc3168>.

610	   [RFC3246]  Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec,
611	              J., Courtney, W., Davari, S., Firoiu, V., and D.
612	              Stiliadis, "An Expedited Forwarding PHB (Per-Hop
613	              Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002,
614	              <http://www.rfc-editor.org/info/rfc3246>.

616	   [RFC3649]  Floyd, S., "HighSpeed TCP for Large Congestion Windows",
617	              RFC 3649, DOI 10.17487/RFC3649, December 2003,
618	              <http://www.rfc-editor.org/info/rfc3649>.

620	   [RFC4774]  Floyd, S., "Specifying Alternate Semantics for the
621	              Explicit Congestion Notification (ECN) Field", BCP 124,
622	              RFC 4774, DOI 10.17487/RFC4774, November 2006,
623	              <http://www.rfc-editor.org/info/rfc4774>.

625	   [TCP-sub-mss-w]
626	              Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion
627	              Window for Small Round Trip Times", BT Technical Report
628	              TR-TUB8-2015-002, May 2015,
629	              <http://www.bobbriscoe.net/projects/latency/
630	              sub-mss-w.pdf>.

632	Appendix A.  Example DualQ Coupled Algorithm

634	   As a concrete example, the pseudocode below gives the DualQ Coupled
635	   AQM algorithm we used in testing.  Although we designed the AQM to be
636	   efficient in integer arithmetic, to aid understanding it is first
637	   given using real-number arithmetic.  Then, one possible optimization
638	   for integer arithmetic is given, also in pseudocode.  To aid
639	   comparison, the line numbers are kept in step between the two by
640	   using letter suffixes where the longer code needs extra lines.

642	   1:  dualq_dequeue(lq, cq) {  % Couples L4S & Classic queues, lq & cq
643	   2:     if ( lq.dequeue(pkt) ) {
644	   3a:       p_L = cq.sec() / 2^S_L
645	   3b:       if ( lq.byt() > T )
646	   3c:           mark(pkt)
647	   3d:        elif ( p_L > maxrand(U) )
648	   4:           mark(pkt)
649	   5:        return(pkt)              % return the packet and stop here
650	   6:     }
651	   7:     while ( cq.dequeue(pkt) ) {
652	   8a:       alpha = 2^(-f_C)
653	   8b:       Q_C = alpha * pkt.sec() + (1-alpha)* Q_C % Classic Q EWMA
654	   9a:       sqrt_p_C = Q_C / 2^S_C
655	   9b:       if ( sqrt_p_C > maxrand(2*U) )
656	   10:          drop(pkt)                     % Squared drop, redo loop
657	   11:       else
658	   12:          return(pkt)           % return the packet and stop here
659	   13:    }
660	   14:    return(NULL)                           % no packet to dequeue
661	   15: }

663	   16: maxrand(u) {                % return the max of u random numbers
664	   17:     maxr=0
665	   18:     while (u-- > 0)
666	   19:         maxr = max(maxr, rand())               % 0 <= rand() < 1
667	   20:     return(maxr)
668	   21: }

670	        Figure 1: Example Dequeue Pseudocode for Coupled DualQ AQM

672	   Packet classification code is not shown, as it is no different from
673	   regular packet classification.  Potential classification schemes are
674	   discussed in Section 2.  Overload protection code will be included in
675	   a future draft {ToDo}.

677	   At the outer level, the structure of dualq_dequeue() implements
678	   strict priority scheduling.  The code is written assuming the AQM is
679	   applied on dequeue (Note 1) . Every time dualq_dequeue() is called,
680	   the if-block in lines 2-6 determines whether there is an L4S packet
681	   to dequeue by calling lq.dequeue(pkt), and otherwise the while-block
682	   in lines 7-13 determines whether there is a Classic packet to
683	   dequeue, by calling cq.dequeue(pkt).  (Note 2)

685	   In the lower priority Classic queue, a while loop is used so that, if
686	   the AQM determines that a classic packet should be dropped, it
687	   continues to test for classic packets deciding whether to drop each
688	   until it actually forwards one.  Thus, every call to dualq_dequeue()
689	   returns one packet if at least one is present in either queue,
690	   otherwise it returns NULL at line 14.  (Note 3)

692	   Within each queue, the decision whether to drop or mark is taken as
693	   follows (to simplify the explanation, it is assumed that U=1):

695	   L4S:  If the test at line 2 determines there is an L4S packet to
696	      dequeue, the tests at lines 3a and 3c determine whether to mark
697	      it.  The first is a simple test of whether the L4S queue (lq.byt()
698	      in bytes) is greater than a step threshold T in bytes (Note 4).
699	      The second test is similar to the random ECN marking in RED, but
700	      with the following differences: i) the marking function does not
701	      start with a plateau of zero marking until a minimum threshold,
702	      rather the marking probability starts to increase as soon as the
703	      queue is positive; ii) marking depends on queuing time, not bytes,
704	      in order to scale for any link rate without being reconfigured;
705	      iii) marking of the L4S queue does not depend on itself, it
706	      depends on the queuing time of the _other_ (Classic) queue, where
707	      cq.sec() is the queuing time of the packet at the head of the
708	      Classic queue (zero if empty); iv) marking depends on the
709	      instantaneous queuing time (of the other queue), not a smoothed
710	      average; v) the queue is compared with the maximum of U random
711	      numbers (but if U=1, this is the same as the single random number
712	      used in RED).

714	      Specifically, in line 3a the marking probability p_L is set to the
715	      Classic queueing time qc.sec() in seconds divided by the L4S
716	      scaling parameter 2^S_L, which represents the queuing time (in
717	      seconds) at which marking probability would hit 100%. Then in line
718	      3d (if U=1) the result is compared with a uniformly distributed
719	      random number between 0 and 1, which ensures that marking
720	      probability will linearly increase with queueing time.  The
721	      scaling parameter is expressed as a power of 2 so that division
722	      can be implemented as a right bit-shift (>>) in line 3 of the
723	      integer variant of the pseudocode (Figure 2).

725	   Classic:  If the test at line 7 determines that there is at least one
726	      Classic packet to dequeue, the test at line 9b determines whether
727	      to drop it.  But before that, line 8b updates Q_C, which is an
728	      exponentially weighted moving average (Note 5) of the queuing time
729	      in the Classic queue, where pkt.sec() is the instantaneous
730	      queueing time of the current Classic packet and alpha is the EWMA
731	      constant for the classic queue.  In line 8a, alpha is represented
732	      as an integer power of 2, so that in line 8 of the integer code
733	      the division needed to weight the moving average can be
734	      implemented by a right bit-shift (>> f_C).

736	      Lines 9a and 9b implement the drop function.  In line 9a the
737	      averaged queuing time Q_C is divided by the Classic scaling
738	      parameter 2^S_C, in the same way that queuing time was scaled for
739	      L4S marking.  This scaled queuing time is given the variable name
740	      sqrt_p_C because it will be squared to compute Classic drop
741	      probability, so before it is squared it is effectively the square
742	      root of the drop probability.  The squaring is done by comparing
743	      it with the maximum out of two random numbers (assuming U=1).
744	      Comparing it with the maximum out of two is the same as the
745	      logical `AND' of two tests, which ensures drop probability rises
746	      with the square of queuing time (Note 6).  Again, the scaling
747	      parameter is expressed as a power of 2 so that division can be
748	      implemented as a right bit-shift in line 9 of the integer
749	      pseudocode.

751	   The marking/dropping functions in each queue (lines 3 & 9) are two
752	   cases of a new generalization of RED called Curvy RED, motivated as
753	   follows.  When we compared the performance of our AQM with fq_CoDel
754	   and PIE, we came to the conclusion that their goal of holding queuing
755	   delay to a fixed target is misguided [CRED_Insights].  As the number
756	   of flows increases, if the AQM does not allow TCP to increase queuing
757	   delay, it has to introduce abnormally high levels of loss.  Then loss
758	   rather than queuing becomes the dominant cause of delay for short
759	   flows, due to timeouts and tail losses.

761	   Curvy RED constrains delay with a softened target that allows some
762	   increase in delay as load increases.  This is achieved by increasing
763	   drop probability on a convex curve relative to queue growth (the
764	   square curve in the Classic queue, if U=1).  Like RED, the curve hugs
765	   the zero axis while the queue is shallow.  Then, as load increases,
766	   it introduces a growing barrier to higher delay.  But, unlike RED, it
767	   requires only one parameter, the scaling, not three.

769	   There follows a summary listing of the two parameters used for each
770	   of the two queues:

772	   Classic:

774	      S_C :   The scaling factor of the dropping function scales Classic
775	         queuing times in the range [0, 2^(S_C)] seconds into a dropping
776	         probability in the range [0,1].  To make division efficient, it
777	         is constrained to be an integer power of two;

779	      f_C :  To smooth the queuing time of the Classic queue and make
780	         multiplication efficient, we use a negative integer power of
781	         two for the dimensionless EWMA constant, which we define as
782	         2^(-f_C).

784	   L4S :

786	      S_L (and k):   As for the Classic queue, the scaling factor of the
787	         L4S marking function scales Classic queueing times in the range
788	         [0, 2^(S_L)] seconds into a probability in the range [0,1].
789	         Note that S_L = S_C + k, where k is the coupling between the
790	         queues (Section 2.1).  So S_L and k count as only one
791	         parameter;

793	      T :  The queue size in bytes at which step threshold marking
794	         starts in the L4S queue.

796	   {ToDo: These are the raw parameters used within the algorithm.  A
797	   configuration front-end could accept more meaningful parameters and
798	   convert them into these raw parameters.}

800	   From our experiments so far, recommended values for these parameters
801	   are: S_C = -1; f_C = 5; T = 5 * MTU for the range of base RTTs
802	   typical on the public Internet.  [CRED_Insights] explains why these
803	   parameters are applicable whatever rate link this AQM implementation
804	   is deployed on and how the parameters would need to be adjusted for a
805	   scenario with a different range of RTTs (e.g. a data centre) {ToDo
806	   incorporate a summary of that report into this draft}. The setting of
807	   k depends on policy (see Section 2.4 and Appendix B respectively for
808	   its recommended setting and guidance on alternatives).

810	   There is also a cUrviness parameter, U, which is a small positive
811	   integer.  It is likely to take the same hard-coded value for all
812	   implementations, once experiments have determined a good value.  We
813	   have solely used U=1 in our experiments so far, but results might be
814	   even better with U=2 or higher.

816	   Note that the dropping function at line 9 calls maxrand(2*U), which
817	   gives twice as much curviness as the call to maxrand(U) in the
818	   marking function at line 3.  This is the trick that implements the
819	   square rule in equation (1) (Section 2.1).  This is based on the fact
820	   that, given a number X from 1 to 6, the probability that two dice
821	   throws will both be less than X is the square of the probability that
822	   one throw will be less than X.  So, when U=1, the L4S marking
823	   function is linear and the Classic dropping function is squared.  If
824	   U=2, L4S would be a square function and Classic would be quartic.
825	   And so on.

827	   The maxrand(u) function in lines 16-21 simply generates u random
828	   numbers and returns the maximum (Note 7).  Typically, maxrand(u)
829	   could be run in parallel out of band.  For instance, if U=1, the
830	   Classic queue would require the maximum of two random numbers.  So,
831	   instead of calling maxrand(2*U) in-band, the maximum of every pair of
832	   values from a pseudorandom number generator could be generated out-
833	   of-band, and held in a buffer ready for the Classic queue to consume.

835	   1:  dualq_dequeue(lq, cq) {  % Couples L4S & Classic queues, lq & cq
836	   2:     if ( lq.dequeue(pkt) ) {
837	   3:        if ((lq.byt() > T) || ((cq.ns() >> (S_L-2)) > maxrand(U)))
838	   4:           mark(pkt)
839	   5:        return(pkt)              % return the packet and stop here
840	   6:     }
841	   7:     while ( cq.dequeue(pkt) ) {
842	   8:         Q_C += (pkt.ns() - Q_C) >> f_C           % Classic Q EWMA
843	   9:        if ( (Q_C >> (S_C-2) ) > maxrand(2*U) )
844	   10:          drop(pkt)                     % Squared drop, redo loop
845	   11:       else
846	   12:          return(pkt)           % return the packet and stop here
847	   13:    }
848	   14:    return(NULL)                           % no packet to dequeue
849	   15: }

851	   Figure 2: Optimised Example Dequeue Pseudocode for Coupled DualQ AQM
852	                         using Integer Arithmetic

854	   Notes:

856	   1.  The drain rate of the queue can vary if it is scheduled relative
857	       to other queues, or to cater for fluctuations in a wireless
858	       medium.  To auto-adjust to changes in drain rate, the queue must
859	       be measured in time, not bytes or packets [CoDel].  In our Linux
860	       implementation, it was easiest to measure queuing time at
861	       dequeue.  Queuing time can be estimated when a packet is enqueued
862	       by measuring the queue length in bytes and dividing by the recent
863	       drain rate.

865	   2.  An implementation has to use priority queueing, but it need not
866	       implement strict priority.

868	   3.  If packets can be enqueued while processing dequeue code, an
869	       implementer might prefer to place the while loop around both
870	       queues so that it goes back to test again whether any L4S packets
871	       arrived while it was dropping a Classic packet.

873	   4.  In order not to change too many factors at once, for now, we keep
874	       the marking function for DCTCP-only traffic as similar as
875	       possible to DCTCP.  However, unlike DCTCP, all processing is at
876	       dequeue, so we determine whether to mark a packet at the head of
877	       the queue by the byte-length of the queue _behind_ it.  We plan
878	       to test whether using queuing time will work in all
879	       circumstances, and if we find that the step can cause
880	       oscillations, we will investigate replacing it with a steep
881	       random marking curve.

883	   5.  An EWMA is only one possible way to filter bursts; other more
884	       adaptive smoothing methods could be valid and it might be
885	       appropriate to decrease the EWMA faster than it increases.

887	   6.  In practice at line 10 the Classic queue would probably test for
888	       ECN capability on the packet to determine whether to drop or mark
889	       the packet.  However, for brevity such detail is omitted.  All
890	       packets classified into the L4S queue have to be ECN-capable, so
891	       no dropping logic is necessary at line 3.  Nonetheless, L4S
892	       packets could be dropped by overload code (see Section 4.1).

894	   7.  In the integer variant of the pseudocode (Figure 2) real numbers
895	       are all represented as integers scaled up by 2^32.  In lines 3 &
896	       9 the function maxrand() is arranged to return an integer in the
897	       range 0 <= maxrand() < 2^32.  Queuing times are also scaled up by
898	       2^32, but in two stages: i) In lines 3 and 8 queuing times
899	       cq.ns() and pkt.ns() are returned in integer nanoseconds, making
900	       the values about 2^30 times larger than when the units were
901	       seconds, ii) then in lines 3 and 9 an adjustment of -2 to the
902	       right bit-shift multiplies the result by 2^2, to complete the
903	       scaling by 2^32.

905	Appendix B.  Guidance on Controlling Throughput Equivalence

907	                     +---------------+------+-------+
908	                     | RTT_C / RTT_L | Reno | Cubic |
909	                     +---------------+------+-------+
910	                     |             1 | k=1  | k=0   |
911	                     |             2 | k=2  | k=1   |
912	                     |             3 | k=2  | k=2   |
913	                     |             4 | k=3  | k=2   |
914	                     |             5 | k=3  | k=3   |
915	                     +---------------+------+-------+

917	   Table 1: Value of k for which DCTCP throughput is roughly the same as
918	                Reno or Cubic, for some example RTT ratios

920	   To determine the appropriate policy, the operator first has to judge
921	   whether it wants DCTCP flows to have roughly equal throughput with
922	   Reno or with Cubic (because, even in its Reno-compatibility mode,
923	   Cubic is about 1.4 times more aggressive than Reno).  Then the
924	   operator needs to decide at what ratio of RTTs it wants DCTCP and
925	   Classic flows to have roughly equal throughput.  For example choosing
926	   the recommended value of k=0 will make DCTCP throughput roughly the
927	   same as Cubic, _if their RTTs are the same_.

929	   However, even if the base RTTs are the same, the actual RTTs are
930	   unlikely to be the same, because Classic (Cubic or Reno) traffic
931	   needs a large queue to avoid under-utilization and excess drop,
932	   whereas L4S (DCTCP) does not.  The operator might still choose this
933	   policy if it judges that DCTCP throughput should be rewarded for
934	   keeping its own queue short.

936	   On the other hand, the operator will choose one of the higher values
937	   for k, if it wants to slow DCTCP down to roughly the same throughput
938	   as Classic flows, to compensate for Classic flows slowing themselves
939	   down by causing themselves extra queuing delay.

941	   The values for k in the table are derived from the formulae, which
942	   was developed in [DCttH15]:

944	       2^k = 1.64 (RTT_reno / RTT_dc)                  (2)
945	       2^k = 1.19 (RTT_cubic / RTT_dc )                (3)

947	   For localized traffic from a particular ISP's data centre, we used
948	   the measured RTTs to calculate that a value of k=3 would achieve
949	   throughput equivalence, and our experiments verified the formula very
950	   closely.

952	Appendix C.  DCTCP Safety Enhancements

954	   This Appendix is informational not normative.  It records changes
955	   needed to DCTCP implementations so they can co-exist safely alongside
956	   other traffic sources.  They are recorded here until a more
957	   appropriate draft is available to hold them.

959	   Proposed changes are listed in rough order of criticality.  Therefore
960	   those later in the list may not be necessary:

962	   o  Negotiate its altered feedback semantics, which conveys the extent
963	      of ECN marking, not just its existence, and this feedback needs to
964	      be robust to loss [I-D.ietf-tcpm-accecn-reqs];

966	   o  fall back to Reno or Cubic behaviour on loss;

968	   o  use a packet identifier associated with the L4S service;

970	   o  average ECN feedback over its own RTT, not the hard-coded RTT
971	      suitable only for data-centres, perhaps like Relentless
972	      TCP [Mathis09];

974	   o  handle a window of less than 2 when the RTT is low, rather than
975	      increase the queue [TCP-sub-mss-w].

977	   o  test heuristically whether ECN marking is emanating from an
978	      RFC3168 AQM.

980	   Other, non-essential enhancements to DCTCP can be envisaged.

982	Authors' Addresses

984	   Koen De Schepper
985	   Bell Labs
986	   Antwerp
987	   Belgium

989	   Email: koen.de_schepper@alcatel-lucent.com
990	   URI:   https://www.bell-labs.com/usr/koen.de_schepper

992	   Bob Briscoe (editor)
993	   Independent

995	   Email: ietf@bobbriscoe.net
996	   URI:   http://bobbriscoe.net/

998	   Olga Bondarenko
999	   Simula Research Lab
1000	   Lysaker
1001	   Norway

1003	   Email: olgabnd@gmail.com
1004	   URI:   https://www.simula.no/people/olgabo

1006	   Ing-jyh Tsang
1007	   Bell Labs
1008	   Antwerp
1009	   Belgium

1011	   Email: ing-jyh.tsang@alcatel-lucent.com