Transport Area Working Group                             B. Briscoe, Ed.
Internet-Draft                                               Independent
Intended status: Informational                            K. De Schepper
Expires: August 23, 2020                                 Nokia Bell Labs
                                                        M. Bagnulo Braun
                                       Universidad Carlos III de Madrid
                                                                G. White
                                                               CableLabs
                                                       February 20, 2020

   Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
                              Architecture
                       draft-ietf-tsvwg-l4s-arch-05

Abstract

This document describes the L4S architecture, which enables Internet applications to achieve Low Latency, Low Loss, and Scalable throughput (L4S), while coexisting on shared network bottlenecks with existing Internet applications that are not built to take advantage of this new technology.

In traditional bottleneck links that utilize a single, shared egress queue, a variety of application traffic flows can share the bottleneck queue simultaneously. As a result, each sender's behaviour, and its response to the congestion signals (delay, packet drop, ECN marking) provided by the queue, can affect the performance of all other applications that share the link. Furthermore, it is considered important that new protocols coexist in a reasonably fair manner with existing protocols (most notably TCP and QUIC). As a result, senders tend to converge on behaviours that are not significantly different from those in use by the majority of existing senders. For many years, the majority of traffic on the Internet has used either the Reno AIMD congestion controller or the Cubic algorithm, so any newly proposed congestion controller needs to demonstrate that it provides reasonable fairness when sharing a bottleneck with flows that use Reno or Cubic. This has led to an ossification of congestion control, where improved congestion controllers cannot easily be deployed on the Internet.

It is well known that the common existing congestion controllers (e.g. Reno and Cubic) increase their congestion window (the amount of data in flight) until they induce congestion, and then respond to the resulting congestion signals of packet loss (or, equivalently, ECN marks) by significantly reducing their congestion window. This leads to a large sawtooth of the congestion window that manifests itself as a combination of queue delay and/or link underutilization.

Meanwhile, in closed network environments such as data centres, new congestion controllers (e.g. Data Center TCP, DCTCP) have been deployed that significantly outperform Reno and Cubic in terms of queue delay and link utilization across a much wider range of network conditions.

The L4S architecture provides an approach that allows the deployment of next-generation congestion controllers while preserving reasonably fair coexistence with Reno and Cubic.

The L4S architecture consists of three components: network support to isolate L4S traffic from other traffic and to provide appropriate congestion signaling to both types; protocol features that allow network elements to identify L4S traffic and allow for communication of congestion signaling; and host support for immediate congestion signaling and an appropriate congestion response that enables scalable performance.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 23, 2020.

Copyright Notice

Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. L4S Architecture Overview
   3. Terminology
   4. L4S Architecture Components
   5. Rationale
      5.1. Why These Primary Components?
      5.2. Why Not Alternative Approaches?
   6. Applicability
      6.1. Applications
      6.2. Use Cases
      6.3. Deployment Considerations
         6.3.1. Deployment Topology
         6.3.2. Deployment Sequences
         6.3.3. L4S Flow but Non-L4S Bottleneck
         6.3.4. Other Potential Deployment Issues
   7. IANA Considerations
   8. Security Considerations
      8.1. Traffic (Non-)Policing
      8.2. 'Latency Friendliness'
      8.3. Interaction between Rate Policing and L4S
      8.4. ECN Integrity
   9. Acknowledgements
   10. References
      10.1. Normative References
      10.2. Informative References
   Appendix A. Standardization items
   Authors' Addresses

1. Introduction

It is increasingly common for _all_ of a user's applications at any one time to require low delay: interactive Web, Web services, voice, conversational video, interactive video, interactive remote presence, instant messaging, online gaming, remote desktop, cloud-based applications and video-assisted remote control of machinery and industrial processes. In the last decade or so, much has been done to reduce propagation delay by placing caches or servers closer to users. However, queuing remains a major, albeit intermittent, component of latency; spikes of hundreds of milliseconds are common. During a long-running flow, even with state-of-the-art active queue management (AQM), the base speed-of-light path delay roughly doubles. Low loss is also important because, for interactive applications, losses translate into even longer retransmission delays.

It has been demonstrated that, once access network bit rates reach levels now common in the developed world, increasing capacity offers diminishing returns if latency (delay) is not addressed. Differentiated services (Diffserv) offers Expedited Forwarding (EF [RFC3246]) for some packets at the expense of others, but this is not sufficient when all (or most) of a user's applications require low latency.

Therefore, the goal is an Internet service with ultra-Low queueing Latency, ultra-Low Loss and Scalable throughput (L4S).
Ultra-low queuing latency means less than 1 millisecond (ms) on average and less than about 2 ms at the 99th percentile. L4S is potentially for _all_ traffic - a service for all traffic needs none of the configuration or management baggage (traffic policing, traffic contracts) associated with favouring some traffic over others. This document describes the L4S architecture for achieving these goals.

It must be said that queuing delay only degrades performance infrequently [Hohlfeld14]. It only occurs when a large enough capacity-seeking (e.g. TCP) flow is running alongside the user's traffic in the bottleneck link, which is typically in the access network, or when the low latency application is itself a large capacity-seeking flow (e.g. interactive video). At these times, the performance improvement from L4S must be sufficient that network operators will be motivated to deploy it.

Active Queue Management (AQM) is part of the solution to queuing under load. AQM improves performance for all traffic, but there is a limit to how much queuing delay can be reduced by changing the network alone, without addressing the root of the problem.

The root of the problem is the presence of standard TCP congestion control (Reno [RFC5681]) or compatible variants (e.g. TCP Cubic [RFC8312]). We shall use the term 'Classic' for these Reno-friendly congestion controls. It has been demonstrated that if the sending host replaces a Classic congestion control with a 'Scalable' alternative, then, when a suitable AQM is deployed in the network, the performance under load of all the above interactive applications can be significantly improved. For instance, queuing delay under heavy load with the example DCTCP/DualQ solution cited below is roughly 1 millisecond (1 to 2 ms) at the 99th percentile, without losing link utilization. This compares with 5 to 20 ms on _average_ with a Classic congestion control and current state-of-the-art AQMs such as fq_CoDel [RFC8290] or PIE [RFC8033], and about 20-30 ms at the 99th percentile. Also, with a Classic congestion control, reducing queueing to even 5 ms is typically only possible by losing some utilization.

It has been demonstrated [DCttH15] that it is possible to deploy such an L4S service alongside the existing best efforts service so that all of a user's applications can shift to it when their stack is updated. Access networks are typically designed with one link as the bottleneck for each site (which might be a home, small enterprise or mobile device), so deployment at a single network node should give nearly all the benefit. The L4S approach also requires component mechanisms at the endpoints to fulfil its goal. This document presents the L4S architecture by describing the different components and how they interact to provide the scalable, low-latency, low-loss Internet service.

2. L4S Architecture Overview

There are three main components to the L4S architecture (illustrated in Figure 1):

   1) Network: L4S traffic needs to be isolated from the queuing latency of Classic traffic. One queue per application flow (FQ) is one way to achieve this, e.g. [RFC8290]. However, just two queues is sufficient and does not require inspection of transport layer headers in the network, which is not always possible (see Section 5.2). With just two queues, it might seem impossible to know how much capacity to schedule for each queue without inspecting how many flows are using each at any one time. And capacity in access networks is too costly to partition arbitrarily into two. The Dual Queue Coupled AQM was developed as a minimal complexity solution to this problem.
      It acts like a 'semi-permeable' membrane that partitions latency but not bandwidth. Note that there is no bandwidth priority between the two queues, because they are for transition from Classic to L4S behaviour, not for prioritization. Section 4 gives a high level explanation of how FQ and DualQ solutions work, and [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of the Coupled DualQ.

   2) Protocol: A host needs to distinguish L4S and Classic packets with an identifier so that the network can classify them into their separate treatments. [I-D.ietf-tsvwg-ecn-l4s-id] considers various alternative identifiers and concludes that all the alternatives involve compromises, but that the ECT(1) and CE codepoints of the ECN field represent a workable solution.

   3) Host: Scalable congestion controls already exist. They solve the scaling problem with Reno congestion control that was explained in [RFC3649]. The one used most widely (in controlled environments) is Data Center TCP (DCTCP [RFC8257]), which has been implemented and deployed in Windows Server editions (since 2012), in Linux and in FreeBSD. Although DCTCP as-is 'works' well over the public Internet, most implementations lack certain safety features that will be necessary once it is used outside controlled environments like data centres (see Section 6.3.3 and Appendix A). A similar scalable congestion control will also need to be transplanted into protocols other than TCP (QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed, between the present document being drafted and published, the following scalable congestion controls were implemented: TCP Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT SCReAM controller [RFC8298] and the L4S ECN part of BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control], intended for TCP and QUIC.

   [ASCII diagram garbled in this copy. Figure 1 shows: a Scalable
   sender (3) and a Classic sender feeding an IP-ECN Classifier (2);
   the classifier steers L4S traffic to a queue with immediate ECN
   marking and Classic traffic to a queue with a mark/drop AQM (1);
   the Classic AQM's signal is coupled across to the L4S queue's
   marking, and a conditional priority scheduler serves both queues.]

      Figure 1: Components of an L4S Solution: 1) Isolation in
    separate network queues; 2) Packet Identification Protocol; and
                       3) Scalable Sending Host

3. Terminology

Classic Congestion Control: A congestion control behaviour that can co-exist with standard TCP Reno [RFC5681] without causing flow rate starvation. With Classic congestion controls, as flow rate scales, the number of round trips between congestion signals (losses or ECN marks) rises with the flow rate, so it takes longer and longer to recover after each congestion event. Therefore control of queuing and utilization becomes very slack, and the slightest disturbance prevents a high rate from being attained [RFC3649].

   For instance, with 1500 byte packets and an end-to-end round trip time (RTT) of 36 ms, as Reno flow rate has scaled over the years from 2 to 100 Mb/s, the number of round trips taken to recover from a congestion event has risen proportionately, from 4 to 200.
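   The Reno example above can be reproduced with a back-of-envelope calculation. The sketch below assumes the textbook AIMD sawtooth model (average window of 3/4 of the peak window W; after a loss the window halves and regrows by one segment per round trip, so recovery takes W/2 round trips); the function name is illustrative, not taken from any L4S code.

```python
MSS_BITS = 1500 * 8  # 1500 byte packets, as in the example above

def reno_recovery_rounds(rate_bps, rtt_s):
    """Round trips Reno needs to recover after one congestion event.

    The average AIMD window is 3/4 of the peak window W, and after a
    loss the window halves then climbs back by one segment per round
    trip, so recovery takes W/2 round trips.
    """
    avg_window = rate_bps * rtt_s / MSS_BITS  # average packets in flight
    peak_window = avg_window / 0.75           # peak of the sawtooth
    return peak_window / 2

# 36 ms RTT, as in the example: 2 Mb/s -> ~4 rounds; 100 Mb/s -> ~200
print(round(reno_recovery_rounds(2e6, 0.036)))    # 4
print(round(reno_recovery_rounds(100e6, 0.036)))  # 200
```

   Because recovery time grows linearly with the window, scaling the rate by 50x scales the recovery time by the same factor, which is the unscalability the definition describes.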
   Cubic was developed to be less unscalable, but it is approaching its scaling limit; with the same RTT of 36 ms, at 100 Mb/s it takes over 300 round trips to recover, and at 800 Mb/s its recovery time doubles to over 600 round trips, or more than 20 seconds.

Scalable Congestion Control: A congestion control where the average time from one congestion signal to the next (the recovery time) remains invariant as the flow rate scales, all other factors being equal. This maintains the same degree of control over queueing and utilization whatever the flow rate, as well as ensuring that high throughput is robust to disturbances. For instance, DCTCP averages 2 congestion signals per round trip whatever the flow rate. See Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation.

Classic service: The Classic service is intended for all the congestion control behaviours that co-exist with Reno [RFC5681] (e.g. Reno itself, Cubic [RFC8312], Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]).

Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' service is intended for traffic from scalable congestion control algorithms, such as Data Center TCP [RFC8257]. The L4S service is for more general traffic than just DCTCP - it allows the set of congestion controls with similar scaling properties to DCTCP to evolve (e.g. Relentless TCP [Mathis09], TCP Prague [PragueLinux] and the L4S variant of SCReAM for real-time media [RFC8298]).

   Both the Classic and L4S services can cope with a proportion of unresponsive or less-responsive traffic as well, as long as it does not build a queue (e.g. DNS, VoIP, game sync datagrams, etc.).

   The terms Classic and L4S can also qualify other nouns, such as 'queue', 'codepoint', 'identifier', 'classification', 'packet' and 'flow'. For example, a 'Classic queue' means a queue providing the Classic service, and an 'L4S packet' means a packet with an L4S identifier sent from an L4S congestion control.

Classic ECN: The original Explicit Congestion Notification (ECN) protocol [RFC3168], which requires ECN signals to be treated the same as drops, both when generated in the network and when responded to by the sender. The names used for the four codepoints of the 2-bit IP-ECN field are as defined in [RFC3168]: Not-ECT, ECT(0), ECT(1) and CE, where ECT stands for ECN-Capable Transport and CE stands for Congestion Experienced.

Site: A home, mobile device, small enterprise or campus, where the network bottleneck is typically the access link to the site. Not all network arrangements fit this model, but it is a useful, widely applicable generalisation.

4. L4S Architecture Components

The L4S architecture is composed of the following elements.

Protocols: The L4S architecture encompasses the two identifier changes (an unassignment and an assignment) and optional further identifiers:

   a. An essential aspect of a scalable congestion control is the use of explicit congestion signals rather than losses, because the signals need to be sent immediately and frequently. 'Classic' ECN [RFC3168] requires an ECN signal to be treated the same as a drop, both when it is generated in the network and when it is responded to by hosts. L4S needs networks and hosts to support a different meaning for ECN:

      *  much more frequent signals - too often to use drops;

      *  immediate tracking of every fluctuation of the queue - too soon to commit to dropping packets.

      So the standards track [RFC3168] has had to be updated to allow L4S packets to depart from the 'same as drop' constraint.
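      To make the identifiers concrete, the following sketch shows how a network node might classify packets on the 2-bit IP-ECN field, using the codepoint values defined in RFC 3168 and treating ECT(1) as the L4S identifier (the function name is illustrative, not from any implementation):

```python
# 2-bit IP-ECN field codepoints (RFC 3168)
NOT_ECT = 0b00  # Not ECN-Capable Transport
ECT1    = 0b01  # ECN-Capable Transport (1) -- the L4S identifier
ECT0    = 0b10  # ECN-Capable Transport (0) -- Classic ECN
CE      = 0b11  # Congestion Experienced

def classify(ecn_bits):
    """Steer a packet to the L4S or Classic treatment (sketch).

    CE is classified as L4S, accepting the small residual risk that a
    Classic AQM upstream already marked an ECT(0) packet as CE.
    """
    return "L4S" if ecn_bits in (ECT1, CE) else "Classic"

print(classify(ECT1))  # L4S
print(classify(ECT0))  # Classic
```

      Classifying on two bits of the IP header is what lets a network node apply the two treatments without inspecting transport layer headers.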
      [RFC8311] is a standards track update that relaxes specific requirements in RFC 3168 (and certain other standards track RFCs), clearing the way for the experimental changes proposed for L4S. [RFC8311] also reclassifies the original experimental assignment of the ECT(1) codepoint as an ECN nonce [RFC3540] as historic.

   b. [I-D.ietf-tsvwg-ecn-l4s-id] recommends that ECT(1) be used as the identifier to classify L4S packets into a separate treatment from Classic packets. This satisfies the requirements for identifying an alternative ECN treatment in [RFC4774].

      The CE codepoint is used to indicate Congestion Experienced by both the L4S and Classic treatments. This raises the concern that a Classic AQM earlier on the path might have marked some ECT(0) packets as CE, in which case those packets will be erroneously classified into the L4S queue. [I-D.ietf-tsvwg-ecn-l4s-id] explains why five unlikely eventualities all have to coincide for this to have any detrimental effect, which even then would only involve a vanishingly small likelihood of a spurious retransmission.

   c. A network operator might wish to include certain unresponsive, non-L4S traffic in the L4S queue if it is deemed to be paced smoothly enough and of low enough rate not to build a queue - for instance, VoIP, low rate datagrams that sync online games, relatively low rate application-limited traffic, DNS, LDAP, etc. This traffic would need to be tagged with specific identifiers, e.g. a low latency Diffserv codepoint such as Expedited Forwarding (EF [RFC3246]), Non-Queue-Building (NQB [I-D.white-tsvwg-nqb]), or operator-specific identifiers.

Network components: The L4S architecture encompasses either dual-queue or per-flow queue solutions:

   a. The Coupled Dual Queue AQM achieves the 'semi-permeable' membrane property mentioned earlier as follows. The obvious part is that using two separate queues isolates the queuing delay of one from the other. The less obvious part is how the two queues act as if they are a single pool of bandwidth, without the scheduler needing to decide between them. This is achieved by making the Classic traffic appear as if it were an equivalent amount of traffic in the L4S queue: the drop probability of the Classic AQM is coupled across to drive the ECN marking level applied to L4S traffic. This makes the L4S flows slow down to leave just enough capacity for the Classic traffic (as they would if they were the same type of traffic sharing the same queue). Then the scheduler can serve the L4S queue with priority, because the L4S traffic isn't offering up enough traffic to use all the priority that it is given. Therefore, on short time-scales (sub-round-trip), the prioritization of the L4S queue protects its low latency by allowing bursts to dissipate quickly; but on longer time-scales (round-trip and longer), the Classic queue creates an equal and opposite pressure against the L4S traffic to ensure that neither has priority when it comes to bandwidth. The tension between prioritizing L4S and coupling the marking from Classic results in per-flow fairness. To protect against unresponsive traffic in the L4S queue taking advantage of the prioritization and starving the Classic queue, it is advisable not to use strict priority, but instead to use a weighted scheduler.

      When there is no Classic traffic, the L4S queue's own AQM comes into play and sets an appropriate marking rate to maintain ultra-low queuing delay.

      The Coupled Dual Queue AQM has been specified as generically as possible [I-D.ietf-tsvwg-aqm-dualq-coupled], without specifying the particular AQMs to use in the two queues, so that designers are free to implement diverse ideas.
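      The coupling described above can be sketched in a few lines, following the structure of [I-D.ietf-tsvwg-aqm-dualq-coupled]: the base AQM's internal probability p' is squared to give the Classic drop/mark probability and multiplied by a coupling factor k to give the L4S marking probability. The names, the default k = 2 and the max() with the L4S queue's native marking are taken from that draft; the function itself is an illustrative sketch, not the specified pseudocode.

```python
def coupled_aqm(p_prime, k=2.0, p_l4s_native=0.0):
    """Coupling law of the DualQ Coupled AQM (sketch).

    p_prime      : internal probability output by the base (Classic) AQM
    k            : coupling factor (the cited draft recommends k = 2)
    p_l4s_native : the L4S queue's own shallow-threshold marking
                   probability, which dominates when there is no
                   Classic traffic
    """
    p_classic = p_prime ** 2               # squared: Reno-friendly response
    p_coupled = min(k * p_prime, 1.0)      # coupled across to the L4S queue
    p_l4s = max(p_l4s_native, p_coupled)   # L4S ECN-marking probability
    return p_classic, p_l4s

# With p' = 0.1: Classic drops/marks ~1% of packets, while the L4S
# queue ECN-marks 20% -- much more frequent signals, but no loss.
p_c, p_l = coupled_aqm(0.1)
print(round(p_c, 6), p_l)  # 0.01 0.2
```

      Squaring on the Classic side is what equalizes the two responses: a Reno-friendly flow's rate varies as 1/sqrt(p_classic) = 1/p', while a scalable flow's rate varies as 1/p_l4s, which is proportional to 1/p', so flows in the two queues converge on similar rates.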
      Informational appendices then give pseudocode examples of different specific AQM approaches. Initially, a zero-config variant of RED called Curvy RED was implemented, tested and documented. Then a variant of PIE called DualPI2 (pronounced 'Dual PI Squared') [DualPI2Linux] was implemented and found to perform better than Curvy RED over a wide range of conditions, so it was documented in another appendix of [I-D.ietf-tsvwg-aqm-dualq-coupled]. A Coupled DualQ variant based on PIE has also been specified and implemented for Low Latency DOCSIS [DOCSIS3.1].

   b. A scheduler with per-flow queues can also be used for L4S. It is simple to modify an existing design such as FQ-CoDel or FQ-PIE. For instance, within each queue of an FQ-CoDel system, as well as the CoDel AQM, immediate (unsmoothed) shallow-threshold ECN marking has been added. The Classic AQM, such as CoDel or PIE, is then applied to Not-ECT and ECT(0) packets, while the shallow threshold is applied to ECT(1) packets, to give sub-millisecond average queue delay.

Host mechanisms: The L4S architecture includes a number of mechanisms in the end host, enumerated below:

   a. Data Center TCP is the most widely used example of a scalable congestion control. It has been documented as an informational record of the protocol currently in use [RFC8257]. It will be necessary to define a number of safety features for a variant usable on the public Internet. A draft list of these, known as the Prague L4S requirements, has been drawn up (see Appendix A of [I-D.ietf-tsvwg-ecn-l4s-id]). The list also includes some optional performance improvements.

   b. Transport protocols other than TCP use various congestion controls that are designed to be friendly with Reno. Before they can use the L4S service, it will be necessary to implement scalable variants of each of these congestion control behaviours. The following standards track RFCs currently define ECN for these protocols: ECN in TCP [RFC3168], in SCTP [RFC4960], in RTP [RFC6679] and in DCCP [RFC4340]. Not all of these protocols are in widespread use, but those that are will eventually need to be updated to allow a different congestion response, which they will have to indicate by using the ECT(1) codepoint. Scalable variants are under consideration for some new transport protocols that are themselves under development, e.g. QUIC [I-D.ietf-quic-transport] and certain real-time media congestion avoidance techniques (RMCAT) protocols.

   c. ECN feedback is sufficient for L4S in some transport protocols (RTCP, DCCP) but not in others:

      *  For the case of TCP, the feedback protocol for ECN embeds the assumption from Classic ECN that an ECN mark is the same as a drop, making it unusable for a scalable TCP. Therefore, the implementation of TCP receivers will have to be upgraded [RFC7560]. Work to standardize and implement more accurate ECN feedback for TCP (AccECN) is in progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux].

      *  ECN feedback is only roughly sketched in an appendix of the SCTP specification. A fuller specification has been proposed [I-D.stewart-tsvwg-sctpecn], which would need to be implemented and deployed before SCTP could support L4S.

5. Rationale

5.1. Why These Primary Components?

Explicit congestion signalling (protocol): Explicit congestion signalling is a key part of the L4S approach. In contrast, use of drop as a congestion signal creates a tension, because drop is both a useful signal (more would reduce delay) and an impairment (less would reduce delay):

   *  Explicit congestion signals can be used many times per round trip to keep tight control, without any impairment. Under heavy load, even more explicit signals can be applied, so the queue can be kept short whatever the load.
Whereas state-of- 516 the-art AQMs have to introduce very high packet drop at high 517 load to keep the queue short. Further, when using ECN, the 518 congestion control's sawtooth reduction can be smaller and 519 therefore return to the operating point more often, without 520 worrying that this causes more signals (one at the top of each 521 smaller sawtooth). The consequent smaller amplitude sawteeth 522 fit between a very shallow marking threshold and an empty 523 queue, so delay variation can be very low, without risk of 524 under-utilization. 526 * Explicit congestion signals can be sent immediately to track 527 fluctuations of the queue. L4S shifts smoothing from the 528 network (which doesn't know the round trip times of all the 529 flows) to the host (which knows its own round trip time). 530 Previously, the network had to smooth to keep a worst-case 531 round trip stable, delaying congestion signals by 100-200ms. 533 All the above makes it clear that explicit congestion signalling 534 is only advantageous for latency if it does not have to be 535 considered 'the same as' drop (as was required with Classic ECN 536 [RFC3168]). Therefore, in a DualQ AQM, the L4S queue uses a new 537 L4S variant of ECN that is not equivalent to drop 538 [I-D.ietf-tsvwg-ecn-l4s-id], while the Classic queue uses either 539 classic ECN [RFC3168] or drop, which are equivalent. 541 Before Classic ECN was standardized, there were various proposals 542 to give an ECN mark a different meaning from drop. However, there 543 was no particular reason to agree on any one of the alternative 544 meanings, so 'the same as drop' was the only compromise that could 545 be reached. RFC 3168 contains a statement that: 547 "An environment where all end nodes were ECN-Capable could 548 allow new criteria to be developed for setting the CE 549 codepoint, and new congestion control mechanisms for end-node 550 reaction to CE packets. 
However, this is a research issue, and 551 as such is not addressed in this document." 553 Latency isolation with coupled congestion notification (network): 554 Using just two queues is not essential to L4S (more would be 555 possible), but it is the simplest way to isolate all the L4S 556 traffic that keeps latency low from all the legacy Classic traffic 557 that does not. 559 Similarly, coupling the congestion notification between the queues 560 is not necessarily essential, but it is a clever and simple way to 561 allow senders to determine their rate, packet-by-packet, rather 562 than be overridden by a network scheduler. Otherwise, a 563 network scheduler would have to inspect at least transport-layer 564 headers, and it would have to continually assign a rate to each 565 flow without any easy way to understand application intent. 567 L4S packet identifier (protocol): Once there are at least two 568 separate treatments in the network, hosts need an identifier at 569 the IP layer to distinguish which treatment they intend to use. 571 Scalable congestion notification (host): A scalable congestion 572 control keeps the signalling frequency high so that rate 573 variations can be small when signalling is stable, and rate can 574 track variations in available capacity as rapidly as possible 575 otherwise. 577 Low loss: Latency is not the only concern of L4S. The "Low Loss" 578 part of the name denotes that L4S generally achieves zero 579 congestion loss due to its use of ECN. Otherwise, loss would 580 itself cause delay, particularly for short flows, due to 581 retransmission delay [RFC2884]. 583 Scalable throughput: The "Scalable throughput" part of the name 584 denotes that the per-flow throughput of scalable congestion 585 controls should scale indefinitely, avoiding the imminent scaling 586 problems with Reno-Friendly congestion control algorithms 587 [RFC3649].
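This scaling limit can be made concrete with a short back-of-envelope calculation. The sketch below reproduces the Cubic figures quoted in this section; the constants C and beta are Cubic's, taken from RFC 8312, while the function name and the example link figures (800 Mb/s, 20 ms round trip, 1500 B packets) are merely illustrative:

```python
# Sketch: how often Cubic induces a congestion signal, using the
# recovery-time formula K = cbrt(W_max * (1 - beta) / C) from RFC 8312.
# Function name and example figures are illustrative only.
C_CUBIC = 0.4      # Cubic scaling constant (RFC 8312)
BETA_CUBIC = 0.7   # Cubic multiplicative decrease factor (RFC 8312)

def cubic_signal_interval(rate_bps, rtt_s, mss_bytes=1500):
    """Return (seconds, round trips) between congestion signals."""
    w_max = rate_bps * rtt_s / (8 * mss_bytes)           # window in segments
    k = (w_max * (1 - BETA_CUBIC) / C_CUBIC) ** (1 / 3)  # seconds to regrow
    return k, k / rtt_s

secs, rtts = cubic_signal_interval(800e6, 0.020)
print(f"one signal every {secs:.0f} s, i.e. every {rtts:.0f} round trips")
```

At 800 Mb/s with a 20 ms round trip this yields one signal roughly every 10 seconds (500 round trips), whereas a scalable control such as DCTCP maintains about 2 signals per round trip at any flow rate.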
It was known when TCP congestion avoidance was first 588 developed that it would not scale to high bandwidth-delay products 589 (see footnote 6 in [TCP-CA]). Today, regular broadband bit-rates 590 over WAN distances are already beyond the scaling range of Classic 591 Reno congestion control. So 'less unscalable' Cubic [RFC8312] and 592 Compound [I-D.sridharan-tcpm-ctcp] variants of TCP have been 593 successfully deployed. However, these are now approaching their 594 scaling limits. For instance, at 800 Mb/s with a 20 ms round trip, 595 Cubic induces a congestion signal only every 500 round trips or 10 596 seconds, which makes its dynamic control very sloppy. In contrast, 597 a scalable congestion control like DCTCP or TCP Prague induces on 598 average 2 congestion signals per round trip, a frequency that remains 599 invariant for any flow rate, keeping dynamic control very tight. 601 5.2. Why Not Alternative Approaches? 603 All the following approaches address some part of the same problem 604 space as L4S. In each case, it is shown that L4S complements them or 605 improves on them, rather than being a mutually exclusive alternative: 607 Diffserv: Diffserv addresses the problem of bandwidth apportionment 608 for important traffic as well as queuing latency for delay-sensitive 609 traffic. L4S solely addresses the problem of queuing 610 latency (as well as loss and throughput scaling). Diffserv will 611 still be necessary where important traffic requires priority (e.g. 612 for commercial reasons, or for protection of critical 613 infrastructure traffic) - see [I-D.briscoe-tsvwg-l4s-diffserv]. 614 Nonetheless, if there are Diffserv classes for important traffic, 615 the L4S approach can provide low latency for _all_ traffic within 616 each Diffserv class (including the case where there is only one 617 Diffserv class). 619 Also, as already explained, Diffserv only works for a small subset 620 of the traffic on a link.
It is not applicable when all the 621 applications in use at one time at a single site (home, small 622 business or mobile device) require low latency. Also, because L4S 623 is for all traffic, it needs none of the management baggage 624 (traffic policing, traffic contracts) associated with favouring 625 some packets over others. This baggage has held Diffserv back 626 from widespread end-to-end deployment. 628 State-of-the-art AQMs: AQMs such as PIE and fq_CoDel give a 629 significant reduction in queuing delay relative to no AQM at all. 630 L4S is intended to complement these AQMs, and should not distract 631 from the need to deploy them as widely as possible. Nonetheless, 632 without addressing the large saw-toothing rate variations of 633 Classic congestion controls, AQMs alone cannot reduce queuing 634 delay too far without significantly reducing link utilization. 635 The L4S approach resolves this tension by ensuring hosts can 636 minimize the size of their sawteeth without appearing so 637 aggressive to legacy flows that they starve them. 639 Per-flow queuing: Similarly, per-flow queuing is not incompatible 640 with the L4S approach. However, one queue for every flow can be 641 thought of as overkill compared to the minimum of two queues for 642 all traffic needed for the L4S approach. The overkill of per-flow 643 queuing has side-effects: 645 A. fq makes high performance networking equipment costly 646 (processing and memory) - in contrast dual queue code can be 647 very simple; 649 B. fq requires packet inspection into the end-to-end transport 650 layer, which doesn't sit well alongside encryption for privacy 651 - in contrast the use of ECN as the classifier for L4S 652 requires no deeper inspection than the IP layer; 654 C. fq isolates the queuing of each flow from the others but not 655 from itself so existing FQ implementations still need to have 656 support for scalable congestion control added. 
658 It might seem that self-inflicted queuing delay should not 659 count, because if the delay wasn't in the network it would 660 just shift to the sender. However, modern adaptive 661 applications, e.g. HTTP/2 [RFC7540] or the interactive media 662 applications described in Section 6, can keep low latency 663 objects at the front of their local send queue by shuffling 664 priorities of other objects dependent on the progress of other 665 transfers. They cannot shuffle packets once they have 666 released them into the network. 668 D. fq prevents any one flow from consuming more than 1/N of the 669 capacity at any instant, where N is the number of flows. This 670 is fine if all flows are elastic, but it does not sit well 671 with a variable bit rate real-time multimedia flow, which 672 requires wriggle room to sometimes take more and other times 673 less than a 1/N share. 675 It might seem that an fq scheduler offers the benefit that it 676 prevents individual flows from hogging all the bandwidth. 677 However, L4S has been deliberately designed so that policing 678 of individual flows can be added as a policy choice, rather 679 than requiring one specific policy choice as the mechanism 680 itself. A scheduler (like fq) has to decide packet-by-packet 681 which flow to schedule without knowing application intent. 682 In contrast, a separate policing function can be configured less 683 strictly, so that senders can still control the instantaneous 684 rate of each flow dependent on the needs of each application 685 (e.g. variable rate video), giving more wriggle-room before a 686 flow is deemed non-compliant. Also, policing of queuing and of 687 flow rates can be applied independently. 689 Alternative Back-off ECN (ABE): Here again, L4S is not an 690 alternative to ABE but a complement that introduces much lower 691 queuing delay. ABE [RFC8511] alters the host behaviour in 692 response to ECN marking to utilize a link better and give ECN 693 flows faster throughput.
It uses ECT(0) and assumes the network 694 still treats ECN and drop the same. Therefore, ABE exploits any 695 lower queuing delay that AQMs can provide. But as explained 696 above, AQMs still cannot reduce queuing delay too far without 697 losing link utilization (to allow for other, non-ABE, flows). 699 BBRv1: v1 of Bottleneck Bandwidth and Round-trip propagation time 700 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing 701 delay end-to-end without needing any special logic in the network, 702 such as an AQM - so it works pretty much on any path. Setting 703 aside some problems with capacity sharing, queuing delay is good 704 with BBRv1, but perhaps not quite as low as with state-of-the-art 705 AQMs such as PIE or fq_CoDel, and certainly nowhere near as low as 706 with L4S. Queuing delay is also not consistently low, due to its 707 regular bandwidth probes and the aggressive flow start-up phase. 709 L4S is a complement to BBRv1. Indeed, BBRv2 uses L4S ECN and a 710 scalable L4S congestion control behaviour in response to any ECN 711 signalling from the path. 713 6. Applicability 715 6.1. Applications 717 A transport layer that solves the current latency issues will provide 718 new service, product and application opportunities. 720 With the L4S approach, the following existing applications will 721 experience significantly better quality of experience under load: 723 o Gaming, including cloud-based gaming; 725 o VoIP; 727 o Video conferencing; 729 o Web browsing; 731 o (Adaptive) video streaming; 733 o Instant messaging. 735 The significantly lower queuing latency also enables some interactive 736 application functions to be offloaded to the cloud that would hardly 737 even be usable today: 739 o Cloud-based interactive video; 741 o Cloud-based virtual and augmented reality.
743 The above two applications have been successfully demonstrated with 744 L4S, both running together over a 40 Mb/s broadband access link 745 loaded up with the numerous other latency-sensitive applications in 746 the previous list as well as numerous downloads - all sharing the 747 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a 748 panoramic video of a football stadium could be swiped and pinched so 749 that, on the fly, a proxy in the cloud could generate a sub-window of 750 the match video under the finger-gesture control of each user. For 751 the latter, a virtual reality headset displayed a viewport taken from 752 a 360 degree camera in a racing car. The user's head movements 753 controlled the viewport extracted by a cloud-based proxy. In both 754 cases, with 7 ms end-to-end base delay, the additional queuing delay 755 of roughly 1 ms was so low that it seemed the video was generated 756 locally. 758 Using a swiping finger gesture or head movement to pan a video is an 759 extremely latency-demanding action - far more demanding than VoIP - 760 because human vision can detect delays of the order of 761 single milliseconds when delay is translated into a visual lag 762 between a video and a reference point (the finger, or the orientation 763 of the head sensed by the balance system in the inner ear - the 764 vestibular system). 766 Without the low queuing delay of L4S, cloud-based applications like 767 these would not be credible without significantly more access 768 bandwidth (to deliver all possible video that might be viewed) and 769 more local processing, which would increase the weight and power 770 consumption of head-mounted displays. When all interactive 771 processing can be done in the cloud, only the data to be rendered for 772 the end user needs to be sent.
774 Other low latency high bandwidth applications such as: 776 o Interactive remote presence; 778 o Video-assisted remote control of machinery or industrial 779 processes. 781 are not credible at all without very low queuing delay. No amount of 782 extra access bandwidth or local processing can make up for lost time. 784 6.2. Use Cases 786 The following use-cases for L4S are being considered by various 787 interested parties: 789 o Where the bottleneck is one of various types of access network: 790 DSL, cable, mobile, satellite 792 * Radio links (cellular, WiFi, satellite) that are distant from 793 the source are particularly challenging. The radio link 794 capacity can vary rapidly by orders of magnitude, so it is 795 often desirable to hold a buffer to utilise sudden increases of 796 capacity; 798 * cellular networks are further complicated by a perceived need 799 to buffer in order to make hand-overs imperceptible; 801 * Satellite networks generally have a very large base RTT, so 802 even with minimal queuing, overall delay can never be extremely 803 low; 805 * Nonetheless, it is certainly desirable not to hold a buffer 806 purely because of the sawteeth of Classic congestion controls, 807 when it is more than is needed for all the above reasons. 809 o Private networks of heterogeneous data centres, where there is no 810 single administrator that can arrange for all the simultaneous 811 changes to senders, receivers and network needed to deploy DCTCP: 813 * a set of private data centres interconnected over a wide area 814 with separate administrations, but within the same company 816 * a set of data centres operated by separate companies 817 interconnected by a community of interest network (e.g. 
for the 818 finance sector) 820 * multi-tenant (cloud) data centres where tenants choose their 821 operating system stack (Infrastructure as a Service - IaaS) 823 o Different types of transport (or application) congestion control: 825 * elastic (TCP/SCTP); 827 * real-time (RTP, RMCAT); 829 * query (DNS/LDAP). 831 o Where low delay quality of service is required, but without 832 inspecting or intervening above the IP layer 833 [I-D.smith-encrypted-traffic-management]: 835 * mobile and other networks have tended to inspect higher layers 836 in order to guess application QoS requirements. However, with 837 growing demand for support of privacy and encryption, L4S 838 offers an alternative. There is no need to select which 839 traffic to favour for queuing, when L4S gives favourable 840 queuing to all traffic. 842 o If queuing delay is minimized, applications with a fixed delay 843 budget can communicate over longer distances, or via a longer 844 chain of service functions [RFC7665] or onion routers. 846 6.3. Deployment Considerations 848 The DualQ is, in itself, an incremental deployment framework for L4S 849 AQMs so that L4S traffic can coexist with existing Classic (Reno- 850 friendly) traffic. Section 6.3.1 explains why only deploying a DualQ 851 AQM [I-D.ietf-tsvwg-aqm-dualq-coupled] in one node at each end of the 852 access link will realize nearly all the benefit of L4S. 854 L4S involves both end systems and the network, so Section 6.3.2 855 suggests some typical sequences to deploy each part, and why there 856 will be an immediate and significant benefit after deploying just one 857 part. 859 If an ECN-enabled DualQ AQM has not been deployed at a bottleneck, an 860 L4S flow is required to include a fall-back strategy to Classic 861 behaviour. Section 6.3.3 describes how an L4S flow detects this, and 862 how to minimize the effect of false negative detection. 864 6.3.1. 
Deployment Topology 866 DualQ AQMs will not have to be deployed throughout the Internet 867 before L4S will work for anyone. Operators of public Internet access 868 networks typically design their networks so that the bottleneck will 869 nearly always occur at one known (logical) link. This confines the 870 cost of queue management technology to one place. 872 The case of mesh networks is different and will be discussed later in 873 this section. But the known bottleneck case is generally true for 874 Internet access to all sorts of different 'sites', where the word 875 'site' includes home networks, small-to-medium sized campus or 876 enterprise networks and even cellular devices (Figure 2). Also, this 877 known-bottleneck case tends to be applicable whatever the access link 878 technology; whether xDSL, cable, cellular, line-of-sight wireless or 879 satellite. 881 Therefore, the full benefit of the L4S service should be available in 882 the downstream direction when the DualQ AQM is deployed at the 883 ingress to this bottleneck link (or links for multihomed sites). And 884 similarly, the full upstream service will be available once the DualQ 885 is deployed at the upstream ingress. 887 ______ 888 ( ) 889 __ __ ( ) 890 |DQ\________/DQ|( enterprise ) 891 ___ |__/ \__| ( /campus ) 892 ( ) (______) 893 ( ) ___||_ 894 +----+ ( ) __ __ / \ 895 | DC |-----( Core )|DQ\_______________/DQ|| home | 896 +----+ ( ) |__/ \__||______| 897 (_____) __ 898 |DQ\__/\ __ ,===. 899 |__/ \ ____/DQ||| ||mobile 900 \/ \__|||_||device 901 | o | 902 `---' 904 Figure 2: Likely location of DualQ (DQ) Deployments in common access 905 topologies 907 Deployment in mesh topologies depends on how over-booked the core is. 908 If the core is non-blocking, or at least generously provisioned so 909 that the edges are nearly always the bottlenecks, it would only be 910 necessary to deploy the DualQ AQM at the edge bottlenecks. 
For 911 example, some data-centre networks are designed with the bottleneck 912 in the hypervisor or host NICs, while others bottleneck at the 913 top-of-rack switch (both the output ports facing hosts and those facing 914 the core). 916 The DualQ would eventually also need to be deployed at any other 917 persistent bottlenecks such as network interconnections, e.g. some 918 public Internet exchange points and the ingress and egress to WAN 919 links interconnecting data-centres. 921 6.3.2. Deployment Sequences 923 For any one L4S flow to work, three parts have to have been 924 deployed. This is the same deployment problem that ECN faced 925 [RFC8170], so lessons have been learned from that experience. 927 Firstly, L4S deployment exploits the fact that DCTCP already exists 928 on many Internet hosts (Windows, FreeBSD and Linux), both servers and 929 clients. Therefore, just deploying a DualQ AQM at a network bottleneck 930 immediately gives a working deployment of all the L4S parts. DCTCP 931 has some safety concerns that need to be fixed for general use over the 932 public Internet (see Section 2.3 of [I-D.ietf-tsvwg-ecn-l4s-id]), but 933 DCTCP is not on by default, so these issues can be managed within 934 controlled deployments or controlled trials. 936 Secondly, the performance improvement with L4S is so significant that 937 it enables new interactive services and products that were not 938 previously possible. It is much easier for companies to initiate new 939 work on deployment if there is budget for a new product trial. If, 940 in contrast, there were only an incremental performance improvement 941 (as with Classic ECN), spending on deployment tends to be much harder 942 to justify. 944 Thirdly, the L4S identifier is defined so that initially network 945 operators can enable L4S exclusively for certain customers or certain 946 applications. But this is carefully defined so that it does not 947 compromise future evolution towards L4S as an Internet-wide service.
948 This is because the L4S identifier is defined not only as the end-to- 949 end ECN field, but it can also optionally be combined with any other 950 packet header or some status of a customer or their access link 951 [I-D.ietf-tsvwg-ecn-l4s-id]. Operators could do this anyway, even if 952 it were not blessed by the IETF. However, it is best for the IETF to 953 specify that they must use their own local identifier in combination 954 with the IETF's identifier. Then, if an operator enables the 955 optional local-use approach, they only have to remove this extra rule 956 to make the service work Internet-wide - it will already traverse 957 middleboxes, peerings, etc. 959 +-+--------------------+----------------------+---------------------+ 960 | | Servers or proxies | Access link | Clients | 961 +-+--------------------+----------------------+---------------------+ 962 |1| DCTCP (existing) | | DCTCP (existing) | 963 | | | DualQ AQM downstream | | 964 | | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS | 965 +-+--------------------+----------------------+---------------------+ 966 |2| TCP Prague | | AccECN (already in | 967 | | | | progress:DCTCP/BBR) | 968 | | FULLY WORKS DOWNSTREAM | 969 +-+--------------------+----------------------+---------------------+ 970 |3| | DualQ AQM upstream | TCP Prague | 971 | | | | | 972 | | FULLY WORKS UPSTREAM AND DOWNSTREAM | 973 +-+--------------------+----------------------+---------------------+ 975 Figure 3: Example L4S Deployment Sequences 977 Figure 3 illustrates some example sequences in which the parts of L4S 978 might be deployed. It consists of the following stages: 980 1. Here, the immediate benefit of a single AQM deployment can be 981 seen, but limited to a controlled trial or controlled deployment. 982 In this example downstream deployment is first, but in other 983 scenarios the upstream might be deployed first. 
If no AQM at all 984 was previously deployed for the downstream access, the DualQ AQM 985 greatly improves the Classic service (as well as adding the L4S 986 service). If an AQM was already deployed, the Classic service 987 will be unchanged (and L4S will add an improvement on top). 989 2. In this stage, the name 'TCP Prague' is used to represent a 990 variant of DCTCP that is safe to use in a production environment. 991 If the application is primarily unidirectional, 'TCP Prague' at 992 one end will provide all the benefit needed. Accurate ECN 993 feedback (AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the 994 other end, but it is a generic ECN feedback facility that is 995 already planned to be deployed for other purposes, e.g. DCTCP, 996 BBR [I-D.cardwell-iccrg-bbr-congestion-control]. The two ends 997 can be deployed in either order, because, in TCP, an L4S 998 congestion control only enables itself if it has negotiated the 999 use of AccECN feedback with the other end during the connection 1000 handshake. Thus, deployment of TCP Prague on a server enables 1001 L4S trials to move to a production service in one direction, 1002 wherever AccECN is deployed at the other end. This stage might 1003 be further motivated by the performance improvements of TCP 1004 Prague relative to DCTCP (see Appendix A.2 of 1005 [I-D.ietf-tsvwg-ecn-l4s-id]). 1007 3. This is a two-move stage to enable L4S upstream. The DualQ or 1008 TCP Prague can be deployed in either order as already explained. 1009 To motivate the first of two independent moves, the deferred 1010 benefit of enabling new services after the second move has to be 1011 worth it to cover the first mover's investment risk. As 1012 explained already, the potential for new interactive services 1013 provides this motivation. The DualQ AQM also greatly improves 1014 the upstream Classic service, assuming no other AQM has already 1015 been deployed. 1017 Note that other deployment sequences might occur. 
For instance: the 1018 upstream might be deployed first; a non-TCP protocol might be used 1019 end-to-end, e.g. QUIC, RMCAT; a body such as the 3GPP might require 1020 L4S to be implemented in 5G user equipment; or other random acts of 1021 kindness might occur. 1023 6.3.3. L4S Flow but Non-L4S Bottleneck 1025 If L4S is enabled between two hosts but there is no L4S AQM at the 1026 bottleneck, any drop from the bottleneck will trigger the L4S sender 1027 to fall back to a classic ('Reno-Friendly') behaviour (see 1028 Appendix A.1.3 of [I-D.ietf-tsvwg-ecn-l4s-id]). 1030 Unfortunately, as well as protecting legacy traffic, this rule 1031 degrades the L4S service whenever there is a loss, even if the loss 1032 was not from a non-DualQ bottleneck (a false negative). Moreover, 1033 prevalent drop can be due to other causes, e.g.: 1035 o congestion loss at other transient bottlenecks, e.g. due to bursts 1036 in shallower queues; 1038 o transmission errors, e.g. due to electrical interference; 1040 o rate policing. 1042 Three complementary approaches are in progress to address this issue, 1043 but they are all currently research: 1045 o In Prague congestion control, ignore certain losses deemed 1046 unlikely to be due to congestion (using some ideas from BBR 1047 [I-D.cardwell-iccrg-bbr-congestion-control] but with no need to 1048 ignore nearly all losses). This could mask any of the above types 1049 of loss (requires consensus on how to safely interoperate with 1050 drop-based congestion controls). 1052 o A combination of RACK, reconfigured link retransmission and L4S 1053 could address transmission errors [UnorderedLTE], 1054 [I-D.ietf-tsvwg-ecn-l4s-id]; 1056 o Hybrid ECN/drop policers (see Section 8.3). 1058 L4S deployment scenarios that minimize these issues (e.g. over 1059 wireline networks) can proceed in parallel to this research, in the 1060 expectation that research success could continually widen L4S 1061 applicability.
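The loss-triggered fall-back described in this subsection might be sketched roughly as follows. This is purely illustrative pseudocode-style Python, not the behaviour specified in Appendix A.1.3 of [I-D.ietf-tsvwg-ecn-l4s-id]; all names and constants here are assumptions:

```python
# Illustrative sketch (not from any specification): a scalable sender
# that reduces its window in proportion to the fraction of CE marks,
# but falls back to a Classic (Reno-friendly) halving on any loss.
def on_feedback(cwnd, ce_fraction, loss_detected, alpha, g=1.0 / 16):
    """One congestion response per round trip (all names hypothetical)."""
    # DCTCP-style moving average of the CE-marked fraction
    alpha = (1 - g) * alpha + g * ce_fraction
    if loss_detected:
        cwnd = cwnd / 2                 # fall back: Classic response
    elif ce_fraction > 0:
        cwnd = cwnd * (1 - alpha / 2)   # scalable, proportionate response
    return max(cwnd, 1.0), alpha
```

The point of the sketch is the asymmetry: frequent shallow CE marks cause small, proportionate reductions, while any loss triggers the much deeper Classic reduction - which is what degrades the L4S service when losses occur for non-congestion reasons.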
1063 Classic ECN support is starting to materialize on the Internet as an 1064 increased level of CE marking. Given that some of this Classic ECN 1065 marking might be due to single-queue ECN deployment, an L4S sender will have to 1066 fall back to a classic ('Reno-Friendly') behaviour if it detects that 1067 ECN marking is accompanied by greater queuing delay or greater delay 1068 variation than would be expected with L4S (see Appendix A.1.4 of 1069 [I-D.ietf-tsvwg-ecn-l4s-id]). It is hard to detect whether such 1070 marking is all due to the addition of ECN support in the Linux 1071 implementation of FQ-CoDel; that case would not require fall-back to 1072 Classic behaviour, because FQ inherently forces the throughput of 1073 each flow to be equal, irrespective of its aggressiveness. 1075 6.3.4. Other Potential Deployment Issues 1077 An L4S AQM uses the ECN field to signal congestion. So, in common 1078 with Classic ECN, if the AQM is within a tunnel or at a lower layer, 1079 correct functioning of ECN signalling requires correct propagation of 1080 the ECN field up the layers [RFC6040], 1081 [I-D.ietf-tsvwg-ecn-encap-guidelines]. 1083 7. IANA Considerations 1085 This specification contains no IANA considerations. 1087 8. Security Considerations 1089 8.1. Traffic (Non-)Policing 1091 Because the L4S service can serve all traffic that is using the 1092 capacity of a link, it should not be necessary to police access to 1093 the L4S service. In contrast, Diffserv only works if some packets 1094 get less favourable treatment than others. So Diffserv has to use 1095 traffic policers to limit how much traffic can be favoured. In turn, 1096 traffic policers require traffic contracts between users and networks 1097 as well as pairwise between networks. Because L4S will lack all this 1098 management complexity, it is more likely to work end-to-end. 1100 During early deployment (and perhaps always), some networks will not 1101 offer the L4S service.
These networks do not need to police or 1102 re-mark L4S traffic - they just forward it unchanged as best-efforts 1103 traffic, as they already forward traffic with ECT(1) today. At a 1104 bottleneck, such networks will introduce some queuing and dropping. 1105 When a scalable congestion control detects a drop, it will have to 1106 respond as if it were a Classic congestion control (as required in 1107 Section 2.3 of [I-D.ietf-tsvwg-ecn-l4s-id]). This will ensure safe 1108 interworking with other traffic at the 'legacy' bottleneck, but it 1109 will degrade the L4S service to no better (but never worse) than 1110 classic best efforts, whenever a legacy (non-L4S) bottleneck is 1111 encountered on a path. 1113 Certain network operators might choose to restrict access to the L4S 1114 class, perhaps only to selected premium customers as a value-added 1115 service. Their packet classifier (item 2 in Figure 1) could identify 1116 such customers against some other field (e.g. source address range) 1117 as well as ECN. If only the ECN L4S identifier matched, but not the 1118 source address (say), the classifier could direct these packets (from 1119 non-premium customers) into the Classic queue. Clearly explaining 1120 how operators can use an additional local classifier (see 1121 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any tendency to 1122 bleach the L4S identifier. Then at least the L4S ECN identifier will 1123 be more likely to survive end-to-end even though the service may not 1124 be supported at every hop. Such arrangements would only require 1125 simple registered/not-registered packet classification, rather than 1126 the managed, application-specific traffic policing against 1127 customer-specific traffic contracts that Diffserv uses. 1129 8.2. 'Latency Friendliness' 1131 The L4S service does rely on self-constraint - not in terms of 1132 limiting rate, but in terms of limiting latency (burstiness).
It is 1133 hoped that self-interest and standardisation of dynamic behaviour 1134 (especially flow start-up) will be sufficient to prevent transports 1135 from sending excessive bursts of L4S traffic, given the application's 1136 own latency will suffer most from such behaviour. 1138 Whether burst policing becomes necessary remains to be seen. Without 1139 it, there will be potential for attacks on the low latency of the L4S 1140 service. However it may only be necessary to apply such policing 1141 reactively, e.g. punitively targeted at any deployments of new bursty 1142 malware. 1144 A per-flow (5-tuple) queue protection function 1145 [I-D.briscoe-docsis-q-protection] has been developed for the low 1146 latency queue in DOCSIS, which has adopted the DualQ L4S 1147 architecture. It protects the low latency service from any queue- 1148 building flows that accidentally or maliciously classify themselves 1149 into the low latency queue. It is designed to score flows based 1150 solely on their contribution to queuing (not flow rate in itself). 1151 Then, if the shared low latency queue is at risk of exceeding a 1152 threshold, the function redirects enough packets of the highest 1153 scoring flow(s) into the Classic queue to preserve low latency. 1155 Such a queue protection function is not considered a necessary part 1156 of the L4S architecture, which works without it (in a similar way to 1157 how the Internet works without per-flow rate policing). Indeed, 1158 under normal circumstances, DOCSIS queue protection does not 1159 intervene, and if operators find it is not necessary they can disable 1160 it. Part of the L4S experiment will be to see whether such a 1161 function is necessary. 1163 8.3. Interaction between Rate Policing and L4S 1165 As mentioned in Section 5.2, L4S should remove the need for low 1166 latency Diffserv classes. 
However, those Diffserv classes that give 1167 certain applications or users priority over capacity would still be 1168 applicable in certain scenarios (e.g. corporate networks). Then, 1169 within such Diffserv classes, L4S would often be applicable to give 1170 traffic low latency and low loss as well. Within such a Diffserv 1171 class, the bandwidth available to a user or application is often 1172 limited by a rate policer. Similarly, in the default Diffserv class, 1173 rate policers are used to partition shared capacity. 1175 A classic rate policer drops any packets exceeding a set rate, 1176 usually also giving a burst allowance (variants exist where the 1177 policer re-marks non-compliant traffic to a discard-eligible Diffserv 1178 codepoint, so that it may be dropped elsewhere during contention). 1179 Whenever L4S traffic encounters one of these rate policers, it will 1180 experience drops, and the source will have to fall back to a Classic 1181 congestion control, thus losing the benefits of L4S. So, in networks 1182 that already use rate policers and plan to deploy L4S, it will be 1183 preferable to redesign these rate policers to be more friendly to the 1184 L4S service. 1186 This is currently a research area. It might be achieved by setting a 1187 threshold where ECN marking is introduced, such that it is just under 1188 the policed rate or just under the burst allowance where drop is 1189 introduced. This could be applied to various types of policer, e.g. 1190 [RFC2697], [RFC2698] or the 'local' (non-ConEx) variant of the ConEx 1191 congestion policer [I-D.briscoe-conex-policing]. It might also be 1192 possible to design scalable congestion controls to respond less 1193 catastrophically to loss that has not been preceded by a period of 1194 increasing delay. 1196 The design of L4S-friendly rate policers will require a separate 1197 dedicated document. For further discussion of the interaction 1198 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv].
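The threshold idea discussed in Section 8.3 might be sketched as a token bucket that starts ECN-marking before it starts dropping. This is purely illustrative (no L4S-friendly policer has been standardized); the class name, thresholds and constants are all assumptions:

```python
# Illustrative sketch of an 'L4S-friendly' single-rate token-bucket
# policer: packets are ECN-marked once the bucket falls below a marking
# threshold, and only dropped when the bucket is exhausted.  All names
# and thresholds here are assumptions for illustration.
class L4SFriendlyPolicer:
    def __init__(self, rate_Bps, burst_B, mark_fraction=0.25):
        self.rate = rate_Bps            # contracted rate, bytes/s
        self.depth = burst_B            # burst allowance, bytes
        self.tokens = burst_B
        self.mark_below = burst_B * mark_fraction  # start marking early
        self.last = 0.0

    def police(self, pkt_len, now):
        # Refill the bucket for the time elapsed since the last packet
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < pkt_len:
            return "drop"               # bucket exhausted: Classic-style drop
        self.tokens -= pkt_len
        return "mark" if self.tokens < self.mark_below else "forward"
```

An L4S flow policed this way would see a rising level of ECN marking as it approaches the contracted rate, so it could slow down smoothly instead of suffering drops and falling back to Classic behaviour.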
1200 8.4. ECN Integrity 1202 Receiving hosts can fool a sender into downloading faster by 1203 suppressing feedback of ECN marks (or of losses if retransmissions 1204 are not necessary or available otherwise). Various ways to protect 1205 transport feedback integrity have been developed. For instance: 1207 o The sender can test the integrity of the receiver's feedback by 1208 occasionally setting the IP-ECN field to the congestion 1209 experienced (CE) codepoint, which is normally only set by a 1210 congested link. Then the sender can test whether the receiver's 1211 feedback faithfully reports what it expects (see 2nd para of 1212 Section 20.2 of [RFC3168]). 1214 o A network can enforce a congestion response to its ECN markings 1215 (or packet losses) by auditing congestion exposure (ConEx) 1216 [RFC7713]. 1218 o The TCP authentication option (TCP-AO [RFC5925]) can be used to 1219 detect tampering with TCP congestion feedback. 1221 o The ECN Nonce [RFC3540] was proposed to detect tampering with 1222 congestion feedback, but it has been reclassified as historic 1223 [RFC8311]. 1225 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of 1226 these techniques including their applicability and pros and cons. 1228 9. Acknowledgements 1230 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David Black 1231 and Jake Holland for their useful review comments. 1233 Bob Briscoe and Koen De Schepper were part-funded by the European 1234 Community under its Seventh Framework Programme through the Reducing 1235 Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe 1236 was also part-funded by the Research Council of Norway through the 1237 TimeIn project, partly by CableLabs and partly by the Comcast 1238 Innovation Fund. The views expressed here are solely those of the 1239 authors. 1241 10. References 1243 10.1. 
Normative References 1245 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1246 Requirement Levels", BCP 14, RFC 2119, 1247 DOI 10.17487/RFC2119, March 1997, 1248 . 1250 10.2. Informative References 1252 [DCttH15] De Schepper, K., Bondarenko, O., Briscoe, B., and I. 1253 Tsang, "`Data Centre to the Home': Ultra-Low Latency for 1254 All", RITE project Technical Report, 2015, 1255 . 1257 [DOCSIS3.1] 1258 CableLabs, "MAC and Upper Layer Protocols Interface 1259 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable 1260 Service Interface Specifications DOCSIS(R) 3.1 Version i17 1261 or later, January 2019, . 1264 [DualPI2Linux] 1265 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., 1266 and H. Steen, "DUALPI2 - Low Latency, Low Loss and 1267 Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019, 1268 . 1271 [Hohlfeld14] 1272 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. 1273 Barford, "A QoE Perspective on Sizing Network Buffers", 1274 Proc. ACM Internet Measurement Conf (IMC'14), November 1275 2014. 1277 [I-D.briscoe-conex-policing] 1278 Briscoe, B., "Network Performance Isolation using 1279 Congestion Policing", draft-briscoe-conex-policing-01 1280 (work in progress), February 2014. 1282 [I-D.briscoe-docsis-q-protection] 1283 Briscoe, B. and G. White, "Queue Protection to Preserve 1284 Low Latency", draft-briscoe-docsis-q-protection-00 (work 1285 in progress), July 2019. 1287 [I-D.briscoe-tsvwg-l4s-diffserv] 1288 Briscoe, B., "Interactions between Low Latency, Low Loss, 1289 Scalable Throughput (L4S) and Differentiated Services", 1290 draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), 1291 November 2018. 1293 [I-D.cardwell-iccrg-bbr-congestion-control] 1294 Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, 1295 "BBR Congestion Control", draft-cardwell-iccrg-bbr- 1296 congestion-control-00 (work in progress), July 2017. 1298 [I-D.ietf-quic-transport] 1299 Iyengar, J. and M.
Thomson, "QUIC: A UDP-Based Multiplexed 1300 and Secure Transport", draft-ietf-quic-transport-25 (work 1301 in progress), January 2020. 1303 [I-D.ietf-tcpm-accurate-ecn] 1304 Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More 1305 Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- 1306 ecn-09 (work in progress), July 2019. 1308 [I-D.ietf-tcpm-generalized-ecn] 1309 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1310 Congestion Notification (ECN) to TCP Control Packets", 1311 draft-ietf-tcpm-generalized-ecn-05 (work in progress), 1312 November 2019. 1314 [I-D.ietf-tsvwg-aqm-dualq-coupled] 1315 Schepper, K., Briscoe, B., and G. White, "DualQ Coupled 1316 AQMs for Low Latency, Low Loss and Scalable Throughput 1317 (L4S)", draft-ietf-tsvwg-aqm-dualq-coupled-10 (work in 1318 progress), July 2019. 1320 [I-D.ietf-tsvwg-ecn-encap-guidelines] 1321 Briscoe, B., Kaippallimalil, J., and P. Thaler, 1322 "Guidelines for Adding Congestion Notification to 1323 Protocols that Encapsulate IP", draft-ietf-tsvwg-ecn- 1324 encap-guidelines-13 (work in progress), May 2019. 1326 [I-D.ietf-tsvwg-ecn-l4s-id] 1327 Schepper, K. and B. Briscoe, "Identifying Modified 1328 Explicit Congestion Notification (ECN) Semantics for 1329 Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- 1330 id-08 (work in progress), November 2019. 1332 [I-D.smith-encrypted-traffic-management] 1333 Smith, K., "Network management of encrypted traffic", 1334 draft-smith-encrypted-traffic-management-05 (work in 1335 progress), May 2016. 1337 [I-D.sridharan-tcpm-ctcp] 1338 Sridharan, M., Tan, K., Bansal, D., and D. Thaler, 1339 "Compound TCP: A New TCP Congestion Control for High-Speed 1340 and Long Distance Networks", draft-sridharan-tcpm-ctcp-02 1341 (work in progress), November 2008. 1343 [I-D.stewart-tsvwg-sctpecn] 1344 Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream 1345 Control Transmission Protocol (SCTP)", draft-stewart- 1346 tsvwg-sctpecn-05 (work in progress), January 2014. 
1348 [I-D.white-tsvwg-nqb] 1349 White, G. and T. Fossati, "Identifying and Handling Non 1350 Queue Building Flows in a Bottleneck Link", draft-white- 1351 tsvwg-nqb-02 (work in progress), June 2019. 1353 [L4Sdemo16] 1354 Bondarenko, O., De Schepper, K., Tsang, I., and B. 1355 Briscoe, "Ultra-Low Delay for All: Live Experience, 1356 Live Analysis", Proc. MMSYS'16 pp. 33:1--33:4, May 2016, 1357 . 1369 [Mathis09] 1370 Mathis, M., "Relentless Congestion Control", PFLDNeT'09, 1371 May 2009, . 1376 [NewCC_Proc] 1377 Eggert, L., "Experimental Specification of New Congestion 1378 Control Algorithms", IETF Operational Note ion-tsv-alt-cc, 1379 July 2007. 1381 [PragueLinux] 1382 Briscoe, B., De Schepper, K., Albisser, O., Misund, J., 1383 Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing 1384 the `TCP Prague' Requirements for Low Latency Low Loss 1385 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13, 1386 March 2019, . 1389 [RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color 1390 Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999, 1391 . 1393 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color 1394 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999, 1395 . 1397 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1398 Explicit Congestion Notification (ECN) in IP Networks", 1399 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1400 . 1402 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1403 of Explicit Congestion Notification (ECN) to IP", 1404 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1405 . 1407 [RFC3246] Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec, 1408 J., Courtney, W., Davari, S., Firoiu, V., and D.
1409 Stiliadis, "An Expedited Forwarding PHB (Per-Hop 1410 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, 1411 . 1413 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 1414 Congestion Notification (ECN) Signaling with Nonces", 1415 RFC 3540, DOI 10.17487/RFC3540, June 2003, 1416 . 1418 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", 1419 RFC 3649, DOI 10.17487/RFC3649, December 2003, 1420 . 1422 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1423 Congestion Control Protocol (DCCP)", RFC 4340, 1424 DOI 10.17487/RFC4340, March 2006, 1425 . 1427 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1428 Explicit Congestion Notification (ECN) Field", BCP 124, 1429 RFC 4774, DOI 10.17487/RFC4774, November 2006, 1430 . 1432 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 1433 RFC 4960, DOI 10.17487/RFC4960, September 2007, 1434 . 1436 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1437 Friendly Rate Control (TFRC): Protocol Specification", 1438 RFC 5348, DOI 10.17487/RFC5348, September 2008, 1439 . 1441 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1442 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1443 . 1445 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1446 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1447 June 2010, . 1449 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1450 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1451 2010, . 1453 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1454 and K. Carlberg, "Explicit Congestion Notification (ECN) 1455 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August 1456 2012, . 1458 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 1459 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 1460 DOI 10.17487/RFC7540, May 2015, 1461 . 1463 [RFC7560] Kuehlewind, M., Ed., Scheffenegger, R., and B. 
Briscoe, 1464 "Problem Statement and Requirements for Increased Accuracy 1465 in Explicit Congestion Notification (ECN) Feedback", 1466 RFC 7560, DOI 10.17487/RFC7560, August 2015, 1467 . 1469 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1470 Chaining (SFC) Architecture", RFC 7665, 1471 DOI 10.17487/RFC7665, October 2015, 1472 . 1474 [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1475 Concepts, Abstract Mechanism, and Requirements", RFC 7713, 1476 DOI 10.17487/RFC7713, December 2015, 1477 . 1479 [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, 1480 "Proportional Integral Controller Enhanced (PIE): A 1481 Lightweight Control Scheme to Address the Bufferbloat 1482 Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, 1483 . 1485 [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and 1486 Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, 1487 May 2017, . 1489 [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., 1490 and G. Judd, "Data Center TCP (DCTCP): TCP Congestion 1491 Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, 1492 October 2017, . 1494 [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, 1495 J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler 1496 and Active Queue Management Algorithm", RFC 8290, 1497 DOI 10.17487/RFC8290, January 2018, 1498 . 1500 [RFC8298] Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation 1501 for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December 1502 2017, . 1504 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1505 Notification (ECN) Experimentation", RFC 8311, 1506 DOI 10.17487/RFC8311, January 2018, 1507 . 1509 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1510 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1511 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1512 . 1514 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. 
Fairhurst, 1515 "TCP Alternative Backoff with ECN (ABE)", RFC 8511, 1516 DOI 10.17487/RFC8511, December 2018, 1517 . 1519 [TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and 1520 Control", Lawrence Berkeley Labs Technical Report, 1521 November 1988, . 1523 [TCP-sub-mss-w] 1524 Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion 1525 Window for Small Round Trip Times", BT Technical Report 1526 TR-TUB8-2015-002, May 2015, 1527 . 1530 [UnorderedLTE] 1531 Austrheim, M., "Implementing immediate forwarding for 4G 1532 in a network simulator", Masters Thesis, Uni Oslo, June 1533 2019. 1535 Appendix A. Standardization items 1537 The following table includes all the items that will need to be 1538 standardized to provide a full L4S architecture. 1540 The table is too wide for the ASCII draft format, so it has been 1541 split into two, with a common column of row index numbers on the 1542 left. 1544 The columns in the second part of the table have the following 1545 meanings: 1547 WG: The IETF WG most relevant to this requirement. The "tcpm/iccrg" 1548 combination refers to the procedure typically used for congestion 1549 control changes, where tcpm owns the approval decision, but uses 1550 the iccrg for expert review [NewCC_Proc]; 1552 TCP: Applicable to all forms of TCP congestion control; 1554 DCTCP: Applicable to Data Center TCP as currently used (in 1555 controlled environments); 1557 DCTCP-bis: Applicable to a future Data Center TCP congestion 1558 control intended for controlled environments; 1560 XXX Prague: Applicable to a Scalable variant of XXX (TCP/SCTP/RMCAT) 1561 congestion control.
1563 +-----+------------------------+------------------------------------+ 1564 | Req | Requirement | Reference | 1565 | # | | | 1566 +-----+------------------------+------------------------------------+ 1567 | 0 | ARCHITECTURE | | 1568 | 1 | L4S IDENTIFIER | [I-D.ietf-tsvwg-ecn-l4s-id] | 1569 | 2 | DUAL QUEUE AQM | [I-D.ietf-tsvwg-aqm-dualq-coupled] | 1570 | 3 | Suitable ECN Feedback | [I-D.ietf-tcpm-accurate-ecn], | 1571 | | | [I-D.stewart-tsvwg-sctpecn]. | 1572 | | | | 1573 | | SCALABLE TRANSPORT - | | 1574 | | SAFETY ADDITIONS | | 1575 | 4-1 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1576 | | Reno/Cubic on loss | [RFC8257] | 1577 | 4-2 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1578 | | Reno/Cubic if classic | | 1579 | | ECN bottleneck | | 1580 | | detected | | 1581 | | | | 1582 | 4-3 | Reduce RTT-dependence | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1583 | | | | 1584 | 4-4 | Scaling TCP's | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1585 | | Congestion Window for | [TCP-sub-mss-w] | 1586 | | Small Round Trip Times | | 1587 | | SCALABLE TRANSPORT - | | 1588 | | PERFORMANCE | | 1589 | | ENHANCEMENTS | | 1590 | 5-1 | Setting ECT in TCP | [I-D.ietf-tcpm-generalized-ecn] | 1591 | | Control Packets and | | 1592 | | Retransmissions | | 1593 | 5-2 | Faster-than-additive | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1594 | | increase | A.2.2) | 1595 | 5-3 | Faster Convergence at | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1596 | | Flow Start | A.2.2) | 1597 +-----+------------------------+------------------------------------+ 1598 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1599 | # | WG | TCP | DCTCP | DCTCP-bis | TCP | SCTP | RMCAT | 1600 | | | | | | Prague | Prague | Prague | 1601 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1602 | 0 | tsvwg | Y | Y | Y | Y | Y | Y | 1603 | 1 | tsvwg | | | Y | Y | Y | Y | 1604 | 2 | tsvwg | n/a | n/a | n/a | n/a | n/a | n/a | 1605 | | | | | | | | | 1606 | | | | | | | | | 1607 | | | | | 
| | | | 1608 | 3 | tcpm | Y | Y | Y | Y | n/a | n/a | 1609 | | | | | | | | | 1610 | 4-1 | tcpm | | Y | Y | Y | Y | Y | 1611 | | | | | | | | | 1612 | 4-2 | tcpm/ | | | | Y | Y | ? | 1613 | | iccrg? | | | | | | | 1614 | | | | | | | | | 1615 | | | | | | | | | 1616 | | | | | | | | | 1617 | | | | | | | | | 1618 | 4-3 | tcpm/ | | | Y | Y | Y | ? | 1619 | | iccrg? | | | | | | | 1620 | 4-4 | tcpm | Y | Y | Y | Y | Y | ? | 1621 | | | | | | | | | 1622 | | | | | | | | | 1623 | 5-1 | tcpm | Y | Y | Y | Y | n/a | n/a | 1624 | | | | | | | | | 1625 | 5-2 | tcpm/ | | | Y | Y | Y | ? | 1626 | | iccrg? | | | | | | | 1627 | 5-3 | tcpm/ | | | Y | Y | Y | ? | 1628 | | iccrg? | | | | | | | 1629 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1631 Authors' Addresses 1633 Bob Briscoe (editor) 1634 Independent 1635 UK 1637 Email: ietf@bobbriscoe.net 1638 URI: http://bobbriscoe.net/ 1639 Koen De Schepper 1640 Nokia Bell Labs 1641 Antwerp 1642 Belgium 1644 Email: koen.de_schepper@nokia.com 1645 URI: https://www.bell-labs.com/usr/koen.de_schepper 1647 Marcelo Bagnulo 1648 Universidad Carlos III de Madrid 1649 Av. Universidad 30 1650 Leganes, Madrid 28911 1651 Spain 1653 Phone: 34 91 6249500 1654 Email: marcelo@it.uc3m.es 1655 URI: http://www.it.uc3m.es 1657 Greg White 1658 CableLabs 1659 US 1661 Email: G.White@CableLabs.com