Transport Area Working Group                             B. Briscoe, Ed.
Internet-Draft                                               Independent
Intended status: Informational                            K. De Schepper
Expires: 5 August 2022                                   Nokia Bell Labs
                                                        M. Bagnulo Braun
                                        Universidad Carlos III de Madrid
                                                                G. White
                                                               CableLabs
                                                         1 February 2022

   Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
                              Architecture
                       draft-ietf-tsvwg-l4s-arch-16

Abstract

   This document describes the L4S architecture, which enables Internet
   applications to achieve Low queuing Latency, Low Loss, and Scalable
   throughput (L4S).  The insight on which L4S is based is that the root
   cause of queuing delay is in the congestion controllers of senders,
   not in the queue itself.  With the L4S architecture all Internet
   applications could (but do not have to) transition away from
   congestion control algorithms that cause substantial queuing delay,
   to a new class of congestion controls that induce very little
   queuing, aided by explicit congestion signalling from the network.
   This new class of congestion controls can provide low latency for
   capacity-seeking flows, so applications can achieve both high
   bandwidth and low latency.

   The architecture primarily concerns incremental deployment.  It
   defines mechanisms that allow the new class of L4S congestion
   controls to coexist with 'Classic' congestion controls in a shared
   network.  These mechanisms aim to ensure that the latency and
   throughput performance using an L4S-compliant congestion controller
   is usually much better (and rarely worse) than performance would have
   been using a 'Classic' congestion controller, and that competing
   flows continuing to use 'Classic' controllers are typically not
   impacted by the presence of L4S.  These characteristics are important
   to encourage adoption of L4S congestion control algorithms and
   L4S-compliant network elements.

   The L4S architecture consists of three components: network support to
   isolate L4S traffic from Classic traffic; protocol features that
   allow network elements to identify L4S traffic; and host support for
   L4S congestion controls.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 August 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
     1.1.  Document Roadmap
   2.  L4S Architecture Overview
   3.  Terminology
   4.  L4S Architecture Components
     4.1.  Protocol Mechanisms
     4.2.  Network Components
     4.3.  Host Mechanisms
   5.  Rationale
     5.1.  Why These Primary Components?
     5.2.  What L4S adds to Existing Approaches
   6.  Applicability
     6.1.  Applications
     6.2.  Use Cases
     6.3.  Applicability with Specific Link Technologies
     6.4.  Deployment Considerations
       6.4.1.  Deployment Topology
       6.4.2.  Deployment Sequences
       6.4.3.  L4S Flow but Non-ECN Bottleneck
       6.4.4.  L4S Flow but Classic ECN Bottleneck
       6.4.5.  L4S AQM Deployment within Tunnels
   7.  IANA Considerations (to be removed by RFC Editor)
   8.  Security Considerations
     8.1.  Traffic Rate (Non-)Policing
     8.2.  'Latency Friendliness'
     8.3.  Interaction between Rate Policing and L4S
     8.4.  ECN Integrity
     8.5.  Privacy Considerations
   9.  Acknowledgements
   10. Informative References
   Appendix A.  Standardization items
   Authors' Addresses

1.  Introduction

   At any one time, it is increasingly common for all of the traffic in
   a bottleneck link (e.g.
   a household's Internet access) to come from applications that prefer
   low delay: interactive Web, Web services, voice, conversational
   video, interactive video, interactive remote presence, instant
   messaging, online gaming, remote desktop, cloud-based applications
   and video-assisted remote control of machinery and industrial
   processes.  In the last decade or so, much has been done to reduce
   propagation delay by placing caches or servers closer to users.
   However, queuing remains a major, albeit intermittent, component of
   latency.  For instance, spikes of hundreds of milliseconds are not
   uncommon, even with state-of-the-art active queue management
   (AQM) [COBALT], [DOCSIS3AQM].  Queuing in access network bottlenecks
   is typically configured to cause overall network delay to roughly
   double during a long-running flow, relative to the expected base
   (unloaded) path delay [BufferSize].  Low loss is also important
   because, for interactive applications, losses translate into even
   longer retransmission delays.

   It has been demonstrated that, once access network bit rates reach
   levels now common in the developed world, increasing capacity offers
   diminishing returns if latency (delay) is not addressed
   [Dukkipati06], [Rajiullah15].  Therefore, the goal is an Internet
   service with very Low queuing Latency, very Low Loss and Scalable
   throughput (L4S).  Very low queuing latency means less than
   1 millisecond (ms) on average and less than about 2 ms at the 99th
   percentile.  This document describes the L4S architecture for
   achieving these goals.

   Differentiated services (Diffserv) offers Expedited Forwarding
   (EF [RFC3246]) for some packets at the expense of others, but this
   makes no difference when all (or most) of the traffic at a bottleneck
   at any one time requires low latency.
   In contrast, L4S still works well when all traffic is L4S - a
   service that gives without taking needs none of the configuration or
   management baggage (traffic policing, traffic contracts) associated
   with favouring some traffic flows over others.

   Queuing delay degrades performance intermittently [Hohlfeld14].  It
   occurs when a large enough capacity-seeking (e.g. TCP) flow is
   running alongside the user's traffic in the bottleneck link, which is
   typically in the access network, or when the low latency application
   is itself a large capacity-seeking or adaptive rate (e.g. interactive
   video) flow.  At these times, the performance improvement from L4S
   must be sufficient that network operators will be motivated to deploy
   it.

   Active Queue Management (AQM) is part of the solution to queuing
   under load.  AQM improves performance for all traffic, but there is a
   limit to how much queuing delay can be reduced solely by changing the
   network, without addressing the root of the problem.

   The root of the problem is the presence of standard TCP congestion
   control (Reno [RFC5681]) or compatible variants (e.g. TCP
   Cubic [RFC8312]).  We shall use the term 'Classic' for these
   Reno-friendly congestion controls.  Classic congestion controls
   induce relatively large saw-tooth-shaped excursions up the queue and
   down again, which have been growing as flow rate scales [RFC3649].
   So if a network operator naively attempts to reduce queuing delay by
   configuring an AQM to operate at a shallower queue, a Classic
   congestion control will significantly underutilize the link at the
   bottom of every saw-tooth.

   It has been demonstrated that if the sending host replaces a Classic
   congestion control with a 'Scalable' alternative, when a suitable AQM
   is deployed in the network the performance under load of all the
   above interactive applications can be significantly improved.
   For instance, queuing delay under heavy load with the example
   DCTCP/DualQ solution cited below on a DSL or Ethernet link is roughly
   1 to 2 milliseconds at the 99th percentile without losing link
   utilization [DualPI2Linux], [DCttH19] (for other link types, see
   Section 6.3).  This compares with 5-20 ms on _average_ with a Classic
   congestion control and current state-of-the-art AQMs such as
   FQ-CoDel [RFC8290], PIE [RFC8033] or DOCSIS PIE [RFC8034], and about
   20-30 ms at the 99th percentile [DualPI2Linux].

   L4S is designed for incremental deployment.  It is possible to deploy
   the L4S service at a bottleneck link alongside the existing best
   efforts service [DualPI2Linux] so that unmodified applications can
   start using it as soon as the sender's stack is updated.  Access
   networks are typically designed with one link as the bottleneck for
   each site (which might be a home, small enterprise or mobile device),
   so deployment at either or both ends of this link should give nearly
   all the benefit in the respective direction.  With some transport
   protocols, namely TCP and SCTP, the sender has to check for suitably
   updated receiver feedback, whereas with more recent transport
   protocols such as QUIC and DCCP, all receivers have always been
   suitable.

   This document presents the L4S architecture, by describing and
   justifying the component parts and how they interact to provide the
   scalable, low latency, low loss Internet service.  It also details
   the approach to incremental deployment, as briefly summarized above.

1.1.  Document Roadmap

   This document describes the L4S architecture in three passes.  First,
   this brief overview gives the very high level idea and states the
   main components with minimal rationale.  This is only intended to
   give some context for the terminology definitions that follow in
   Section 3, and to explain the structure of the rest of the document.
   Then Section 4 goes into more detail on each component with some
   rationale, but still mostly stating what the architecture is, rather
   than why.  Finally, Section 5 justifies why each element of the
   solution was chosen (Section 5.1) and why these choices were
   different from other solutions (Section 5.2).

   Having described the architecture, Section 6 clarifies its
   applicability; that is, the applications and use-cases that motivated
   the design, the challenges applying the architecture to various link
   technologies, and various incremental deployment models, including
   the two main deployment topologies, different sequences for
   incremental deployment and various interactions with pre-existing
   approaches.  The document ends with the usual tail pieces, including
   extensive discussion of traffic policing and other security
   considerations in Section 8.

2.  L4S Architecture Overview

   Below we outline the three main components of the L4S architecture:
   1) the scalable congestion control on the sending host; 2) the AQM at
   the network bottleneck; and 3) the protocol between them.

   But first, the main point to grasp is that low latency is not
   provided by the network - low latency results from the careful
   behaviour of the scalable congestion controllers used by L4S senders.
   The network does have a role - primarily to isolate the low latency
   of the carefully behaving L4S traffic from the higher queuing delay
   needed by traffic with pre-existing Classic behaviour.  The network
   also alters the way it signals queue growth to the transport: it
   uses the Explicit Congestion Notification (ECN) protocol, but it
   signals the very start of queue growth - immediately, without the
   smoothing delay typical of Classic AQMs.  Because ECN support is
   essential for L4S, senders use the ECN field as the protocol to
   identify to the network which packets are L4S and which are Classic.
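   The two roles that the ECN field plays here - identification and
   immediate congestion signalling - can be sketched as follows.  This
   is a simplified illustration only, not a normative algorithm: the
   function names, queue labels and the 1 ms threshold are our own
   assumptions for the sketch, not values mandated by the architecture.

```python
# Illustrative sketch (not normative): how a network node could use
# the IP-ECN field both to classify packets into the L4S or Classic
# treatment and to CE-mark L4S packets immediately, per packet,
# without Classic-style smoothing.  The 1 ms threshold is an
# assumption for illustration.

ECT0, ECT1, CE, NOT_ECT = "ECT(0)", "ECT(1)", "CE", "Not-ECT"

def classify(ecn_field):
    """L4S packets carry ECT(1); CE may already have been set by an
    upstream L4S AQM, so CE is classified as L4S too.  Everything
    else gets the Classic treatment."""
    return "L4S" if ecn_field in (ECT1, CE) else "Classic"

def l4s_marking(queue_delay_ms, threshold_ms=1.0):
    """Immediate (unsmoothed) CE-marking decision at the very start
    of queue growth, taken per packet against a shallow threshold."""
    return queue_delay_ms > threshold_ms

assert classify(ECT1) == "L4S"
assert classify(CE) == "L4S"
assert classify(ECT0) == "Classic"
assert l4s_marking(1.5) and not l4s_marking(0.2)
```

   The key contrast with a Classic AQM is visible in `l4s_marking`:
   the decision depends only on the instantaneous queue, with no
   smoothing delay before the signal is emitted.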
   1)  Host: Scalable congestion controls already exist.  They solve the
       scaling problem with Classic congestion controls, such as Reno or
       Cubic.  Because flow rate has scaled since TCP congestion control
       was first designed in 1988, assuming the flow lasts long enough,
       it now takes hundreds of round trips (and growing) to recover
       after a congestion signal (whether a loss or an ECN mark), as
       shown in the examples in Section 5.1 and [RFC3649].  Therefore
       control of queuing and utilization becomes very slack, and the
       slightest disturbances (e.g. from new flows starting) prevent a
       high rate from being attained.

       With a scalable congestion control, the average time from one
       congestion signal to the next (the recovery time) remains
       invariant as the flow rate scales, all other factors being equal.
       This maintains the same degree of control over queuing and
       utilization whatever the flow rate, as well as ensuring that high
       throughput is more robust to disturbances.  The scalable control
       used most widely (in controlled environments) is Data Center TCP
       (DCTCP [RFC8257]), which has been implemented and deployed in
       Windows Server Editions (since 2012), in Linux and in FreeBSD.
       Although DCTCP as-is functions well over wide-area round-trip
       times, most implementations lack certain safety features that
       would be necessary for use outside controlled environments like
       data centres (see Section 6.4.3 and Appendix A).  So scalable
       congestion control needs to be implemented in TCP and other
       transport protocols (QUIC, SCTP, RTP/RTCP, RMCAT, etc.).  Indeed,
       between the present document being drafted and published, the
       following scalable congestion controls were implemented: TCP
       Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT
       SCReAM controller [SCReAM] and the L4S ECN part of BBRv2 [BBRv2],
       intended for TCP and QUIC transports.
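       The scaling numbers above can be made concrete with a rough
       back-of-envelope model.  This is an illustration only: it
       assumes Reno's idealized halve-then-add-one-segment-per-RTT
       saw-tooth and the roughly 2 congestion signals per RTT that
       DCTCP averages; the function names are ours.

```python
# Rough model (illustration, not a precise analysis): time for a
# Reno flow to recover its window after one congestion signal,
# versus a scalable control that sees ~2 signals per RTT at any
# rate (as DCTCP does), so its recovery time stays ~0.5 RTT.

def reno_recovery_rtts(rate_bps, rtt_s, mss_bytes=1500):
    """Reno halves its window, then adds 1 MSS per RTT, so it needs
    about W/2 round trips to recover, where W is the window in
    packets.  Recovery time therefore grows with flow rate."""
    window_pkts = rate_bps * rtt_s / (8 * mss_bytes)
    return window_pkts / 2

def scalable_recovery_rtts(rate_bps, rtt_s):
    """With ~2 marks per RTT whatever the rate, the average gap
    between signals is ~0.5 RTT, invariant as the rate scales.
    (The arguments are unused: that is the point.)"""
    return 0.5

# At 20 ms RTT: ~83 round trips at 100 Mb/s, ~833 at 1 Gb/s.
r100 = reno_recovery_rtts(100e6, 0.02)
r1g = reno_recovery_rtts(1e9, 0.02)
assert 80 < r100 < 90
assert 830 < r1g < 840
assert scalable_recovery_rtts(1e9, 0.02) == 0.5
```

       At 1 Gb/s and 20 ms RTT, ~833 round trips is over 16 seconds
       per saw-tooth, which is why control of queuing and utilization
       becomes so slack at high rates.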
   2)  Network: L4S traffic needs to be isolated from the queuing
       latency of Classic traffic.  One queue per application flow (FQ)
       is one way to achieve this, e.g. FQ-CoDel [RFC8290].  However,
       just two queues is sufficient and does not require inspection of
       transport layer headers in the network, which is not always
       possible (see Section 5.2).  With just two queues, it might seem
       impossible to know how much capacity to schedule for each queue
       without inspecting how many flows at any one time are using each.
       And it would be undesirable to arbitrarily divide access network
       capacity into two partitions.  The Dual Queue Coupled AQM was
       developed as a minimal complexity solution to this problem.  It
       acts like a 'semi-permeable' membrane that partitions latency but
       not bandwidth.  As such, the two queues are for transition from
       Classic to L4S behaviour, not bandwidth prioritization.

       Section 4 gives a high level explanation of how the per-flow-queue
       (FQ) and DualQ variants of L4S work, and
       [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of
       the DualQ Coupled AQM framework.  A specific marking algorithm is
       not mandated for L4S AQMs.  Appendices of
       [I-D.ietf-tsvwg-aqm-dualq-coupled] give non-normative examples
       that have been implemented and evaluated, and give recommended
       default parameter settings.  It is expected that L4S experiments
       will improve knowledge of parameter settings and whether the set
       of marking algorithms needs to be limited.

   3)  Protocol: A host needs to distinguish L4S and Classic packets
       with an identifier so that the network can classify them into
       their separate treatments.  The L4S identifier
       spec. [I-D.ietf-tsvwg-ecn-l4s-id] concludes that all alternatives
       involve compromises, but the ECT(1) and CE codepoints of the ECN
       field represent a workable solution.
       As already explained, the network also uses ECN to immediately
       signal the very start of queue growth to the transport.

3.  Terminology

   Note: The following definitions are copied from
   [I-D.ietf-tsvwg-ecn-l4s-id] for convenience.  If there are accidental
   differences, those in [I-D.ietf-tsvwg-ecn-l4s-id] take precedence.

   Classic Congestion Control:  A congestion control behaviour that can
      co-exist with standard Reno [RFC5681] without causing
      significantly negative impact on its flow rate [RFC5033].  The
      scaling problem with Classic congestion control is explained, with
      examples, in Section 5.1 and in [RFC3649].

   Scalable Congestion Control:  A congestion control where the average
      time from one congestion signal to the next (the recovery time)
      remains invariant as the flow rate scales, all other factors being
      equal.  For instance, DCTCP averages 2 congestion signals per
      round-trip whatever the flow rate, as do other recently developed
      scalable congestion controls, e.g. Relentless TCP [Mathis09], TCP
      Prague [I-D.briscoe-iccrg-prague-congestion-control],
      [PragueLinux], BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control]
      and the L4S variant of SCReAM for real-time
      media [SCReAM], [RFC8298].  See Section 4.3 of
      [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation.

   Classic service:  The Classic service is intended for all the
      congestion control behaviours that co-exist with Reno [RFC5681]
      (e.g. Reno itself, Cubic [RFC8312],
      Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]).  The term
      'Classic queue' means a queue providing the Classic service.

   Low-Latency, Low-Loss Scalable throughput (L4S) service:  The 'L4S'
      service is intended for traffic from scalable congestion control
      algorithms, such as the Prague congestion
      control [I-D.briscoe-iccrg-prague-congestion-control], which was
      derived from DCTCP [RFC8257].
      The L4S service is for more general traffic than just TCP
      Prague - it allows the set of congestion controls with similar
      scaling properties to Prague to evolve, such as the examples
      listed above (Relentless, SCReAM).  The term 'L4S queue' means a
      queue providing the L4S service.

      The terms Classic or L4S can also qualify other nouns, such as
      'queue', 'codepoint', 'identifier', 'classification', 'packet' or
      'flow'.  For example, an L4S packet means a packet with an L4S
      identifier sent from an L4S congestion control.

      Both Classic and L4S services can cope with a proportion of
      unresponsive or less-responsive traffic as well, but in the L4S
      case its rate has to be smooth enough or low enough not to build
      a queue (e.g. DNS, VoIP, game sync datagrams, etc.).

   Reno-friendly:  The subset of Classic traffic that is friendly to the
      standard Reno congestion control defined for TCP in [RFC5681].
      The TFRC spec. [RFC5348] indirectly implies that 'friendly' is
      defined as "generally within a factor of two of the sending rate
      of a TCP flow under the same conditions".  Reno-friendly is used
      here in place of 'TCP-friendly', given the latter has become
      imprecise, because the TCP protocol is now used with so many
      different congestion control behaviours, and Reno is used in
      non-TCP transports such as QUIC [RFC9000].

   Classic ECN:  The original Explicit Congestion Notification (ECN)
      protocol [RFC3168], which requires ECN signals to be treated as
      equivalent to drops, both when generated in the network and when
      responded to by the sender.

      L4S uses the ECN field as an identifier
      [I-D.ietf-tsvwg-ecn-l4s-id] with the names for the four codepoints
      of the 2-bit IP-ECN field unchanged from those defined in
      [RFC3168]: Not-ECT, ECT(0), ECT(1) and CE, where ECT stands for
      ECN-Capable Transport and CE stands for Congestion Experienced.
      A packet marked with the CE codepoint is termed 'ECN-marked' or
      sometimes just 'marked' where the context makes ECN obvious.

   Site:  A home, mobile device, small enterprise or campus, where the
      network bottleneck is typically the access link to the site.  Not
      all network arrangements fit this model but it is a useful, widely
      applicable generalization.

4.  L4S Architecture Components

   The L4S architecture is composed of the elements in the following
   three subsections.

4.1.  Protocol Mechanisms

   The L4S architecture involves: a) unassignment of an identifier; b)
   reassignment of the same identifier; and c) optional further
   identifiers:

   a.  An essential aspect of a scalable congestion control is the use
       of explicit congestion signals.  'Classic' ECN [RFC3168] requires
       an ECN signal to be treated as equivalent to drop, both when it
       is generated in the network and when it is responded to by hosts.
       L4S needs networks and hosts to support a more fine-grained
       meaning for each ECN signal that is less severe than a drop, so
       that the L4S signals:

       *  can be much more frequent;

       *  can be signalled immediately, without the significant delay
          required to smooth out fluctuations in the queue.

       To enable L4S, the standards track Classic ECN spec. [RFC3168]
       has had to be updated to allow L4S packets to depart from the
       'equivalent to drop' constraint.  [RFC8311] is a standards track
       update to relax specific requirements in RFC 3168 (and certain
       other standards track RFCs), which clears the way for the
       experimental changes proposed for L4S.  [RFC8311] also
       reclassifies the original experimental assignment of the ECT(1)
       codepoint as an ECN nonce [RFC3540] as historic.

   b.  [I-D.ietf-tsvwg-ecn-l4s-id] specifies that ECT(1) is used as the
       identifier to classify L4S packets into a separate treatment from
       Classic packets.
       This satisfies the requirement for identifying an alternative ECN
       treatment in [RFC4774].

       The CE codepoint is used to indicate Congestion Experienced by
       both L4S and Classic treatments.  This raises the concern that a
       Classic AQM earlier on the path might have marked some ECT(0)
       packets as CE.  Then these packets will be erroneously classified
       into the L4S queue.  Appendix B of [I-D.ietf-tsvwg-ecn-l4s-id]
       explains why five unlikely eventualities all have to coincide for
       this to have any detrimental effect, which even then would only
       involve a vanishingly small likelihood of a spurious
       retransmission.

   c.  A network operator might wish to include certain unresponsive,
       non-L4S traffic in the L4S queue if it is deemed to be smoothly
       enough paced and low enough rate not to build a queue.  For
       instance, VoIP, low rate datagrams to sync online games,
       relatively low rate application-limited traffic, DNS, LDAP, etc.
       This traffic would need to be tagged with specific identifiers,
       e.g. a low latency Diffserv Codepoint such as Expedited
       Forwarding (EF [RFC3246]), Non-Queue-Building
       (NQB [I-D.ietf-tsvwg-nqb]), or operator-specific identifiers.

4.2.  Network Components

   The L4S architecture aims to provide low latency without the _need_
   for per-flow operations in network components.  Nonetheless, the
   architecture does not preclude per-flow solutions.  The following
   bullets describe the known arrangements: a) the DualQ Coupled AQM
   with an L4S AQM in one queue coupled from a Classic AQM in the other;
   b) Per-Flow Queues with an instance of a Classic and an L4S AQM in
   each queue; c) Dual queues with per-flow AQMs, but no per-flow
   queues:

   a.
       The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the
       'semi-permeable' membrane property mentioned earlier as follows:

       *  Latency isolation: Two separate queues are used to isolate L4S
          queuing delay from the larger queue that Classic traffic needs
          to maintain full utilization.

       *  Bandwidth pooling: The two queues act as if they are a single
          pool of bandwidth in which flows of either type get roughly
          equal throughput without the scheduler needing to identify any
          flows.  This is achieved by having an AQM in each queue, but
          the Classic AQM provides a congestion signal to both queues in
          a manner that ensures a consistent response from the two
          classes of congestion control.  Specifically, the Classic AQM
          generates a drop/mark probability based on congestion in its
          own queue, which it uses both to drop/mark packets in its own
          queue and to affect the marking probability in the L4S queue.
          The strength of the coupling of the congestion signalling
          between the two queues is enough to make the L4S flows slow
          down to leave the right amount of capacity for the Classic
          flows (as they would if they were the same type of traffic
          sharing the same queue).

       Then the scheduler can serve the L4S queue with priority (denoted
       by the '1' on the higher priority input), because the L4S traffic
       isn't offering up enough traffic to use all the priority that it
       is given.
       Therefore:

       *  for latency isolation on short time-scales (sub-round-trip),
          the prioritization of the L4S queue protects its low latency
          by allowing bursts to dissipate quickly;

       *  but for bandwidth pooling on longer time-scales (round-trip
          and longer), the Classic queue creates an equal and opposite
          pressure against the L4S traffic to ensure that neither has
          priority when it comes to bandwidth - the tension between
          prioritizing L4S and coupling the marking from the Classic
          AQM results in approximate per-flow fairness.

       To protect against unresponsive traffic taking advantage of the
       prioritization of the L4S queue and starving the Classic queue,
       it is advisable for the priority to be conditional, not strict
       (see Appendix A of [I-D.ietf-tsvwg-aqm-dualq-coupled]).

       When there is no Classic traffic, the L4S queue's own AQM comes
       into play.  It starts congestion marking with a very shallow
       queue, so L4S traffic maintains very low queuing delay.

       If either queue becomes persistently overloaded, drop of ECN-
       capable packets is introduced, as recommended in Section 7 of
       [RFC3168] and Section 4.2.1 of [RFC7567].  Then both queues
       introduce the same level of drop (not shown in the figure).

       The Dual Queue Coupled AQM has been specified as generically as
       possible [I-D.ietf-tsvwg-aqm-dualq-coupled], without specifying
       the particular AQMs to use in the two queues, so that designers
       are free to implement diverse ideas.  Informational appendices in
       that draft give pseudocode examples of two different specific AQM
       approaches: one called DualPI2 (pronounced Dual PI
       Squared) [DualPI2Linux] that uses the PI2 variant of PIE, and a
       zero-config variant of RED called Curvy RED.  A DualQ Coupled AQM
       based on PIE has also been specified and implemented for Low
       Latency DOCSIS [DOCSIS3.1].

                        (3)                  (2)
               .-------^------..------------^------------------.
   ,-(1)-----.                    _____
  ; ________  :        L4S  -------. |     |
  :|Scalable| :            _\      ||__\_|mark |
  :| sender | :  __________ /    / || /  |_____|\   _________
  :|________|\; |          |/ -------'      ^     \1|condit'nl|
   `---------'\_|  IP-ECN  | Coupling :      \|priority |_\
    ________  / |Classifier|          :      /|scheduler| /
   |Classic |/  |__________|\ -------.  __:__ / |_________|
   | sender |    \_\        || |     ||__\_|mark/|/
   |________|     /         || |     || /  |drop |
              Classic -------'         |_____|

        Figure 1: Components of an L4S DualQ Coupled AQM Solution:
          1) Scalable Sending Host; 2) Isolation in separate network
             queues; and 3) Packet Identification Protocol

   b.  Per-Flow Queues and AQMs: A scheduler with per-flow queues such
       as FQ-CoDel or FQ-PIE can be used for L4S.  For instance, within
       each queue of an FQ-CoDel system, as well as a CoDel AQM, there
       is typically also the option of ECN marking at an immediate
       (unsmoothed) shallow threshold to support use in data centres
       (see Sec.5.2.7 of [RFC8290]).  In Linux, this has been modified
       so that the shallow threshold can be solely applied to ECT(1)
       packets [FQ_CoDel_Thresh].  Then if there is a flow of non-ECN or
       ECT(0) packets in the per-flow-queue, the Classic AQM (e.g.
       CoDel) is applied; while if there is a flow of ECT(1) packets in
       the queue, the shallower (typically sub-millisecond) threshold is
       applied.  In addition, ECT(0) and Not-ECT packets could
       potentially be classified into a separate flow-queue from ECT(1)
       and CE packets to avoid them mixing if they share a common
       flow-identifier (e.g. in a VPN).

   c.  Dual-queues, but per-flow AQMs: It should also be possible to use
       dual queues for isolation, but with per-flow marking to control
       flow-rates (instead of the coupled per-queue marking of the Dual
       Queue Coupled AQM).  One of the two queues would be for isolating
       L4S packets, which would be classified by the ECN codepoint.
       Flow rates could be controlled by flow-specific marking.
The 569 policy goal of the marking could be to differentiate flow rates 570 (e.g. [Nadas20], which requires additional signalling of a per- 571 flow 'value'), or to equalize flow-rates (perhaps in a similar 572 way to Approx Fair CoDel [AFCD], 573 [I-D.morton-tsvwg-codel-approx-fair], but with two queues not 574 one). 576 Note that whenever the term 'DualQ' is used loosely without 577 saying whether marking is per-queue or per-flow, it means a dual 578 queue AQM with per-queue marking. 580 4.3. Host Mechanisms 582 The L4S architecture includes two main mechanisms in the end host 583 that we enumerate next: 585 a. Scalable Congestion Control at the sender: Section 2 defines a 586 scalable congestion control as one where the average time from 587 one congestion signal to the next (the recovery time) remains 588 invariant as the flow rate scales, all other factors being equal. 589 Data Center TCP is the most widely used example. It has been 590 documented as an informational record of the protocol currently 591 in use in controlled environments [RFC8257]. A draft list of 592 safety and performance improvements for a scalable congestion 593 control to be usable on the public Internet has been drawn up 594 (the so-called 'Prague L4S requirements' in Appendix A of 596 [I-D.ietf-tsvwg-ecn-l4s-id]). The subset that involve risk of 597 harm to others have been captured as normative requirements in 598 Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id]. TCP 599 Prague [I-D.briscoe-iccrg-prague-congestion-control] has been 600 implemented in Linux as a reference implementation to address 601 these requirements [PragueLinux]. 603 Transport protocols other than TCP use various congestion 604 controls that are designed to be friendly with Reno. Before they 605 can use the L4S service, they will need to be updated to 606 implement a scalable congestion response, which they will have to 607 indicate by using the ECT(1) codepoint. 
Scalable variants are 608 under consideration for more recent transport protocols, 609 e.g. QUIC, and the L4S ECN part of 610 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] is a scalable 611 congestion control intended for the TCP and QUIC transports, 612 amongst others. Also an L4S variant of the RMCAT SCReAM 613 controller [RFC8298] has been implemented [SCReAM] for media 614 transported over RTP. 616 Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] defines scalable 617 congestion control in more detail, and specifies the 618 requirements that an L4S scalable congestion control has to 619 comply with. 621 b. The ECN feedback in some transport protocols is already 622 sufficiently fine-grained for L4S (specifically DCCP [RFC4340] 623 and QUIC [RFC9000]). But others either require update or are in 624 the process of being updated: 626 * For the case of TCP, the feedback protocol for ECN embeds the 627 assumption from Classic ECN [RFC3168] that an ECN mark is 628 equivalent to a drop, making it unusable for a scalable TCP. 629 Therefore, the implementation of TCP receivers will have to be 630 upgraded [RFC7560]. Work to standardize and implement more 631 accurate ECN feedback for TCP (AccECN) is in 632 progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux]. 634 * ECN feedback is only roughly sketched in an appendix of the 635 SCTP specification [RFC4960]. A fuller specification has been 636 proposed in a long-expired draft [I-D.stewart-tsvwg-sctpecn], 637 which would need to be implemented and deployed before SCTP 638 could support L4S. 640 * For RTP, sufficient ECN feedback was defined in [RFC6679], but 641 [RFC8888] defines the latest standards track improvements. 643 5. Rationale 645 5.1. Why These Primary Components? 647 Explicit congestion signalling (protocol): Explicit congestion 648 signalling is a key part of the L4S approach.
In contrast, use of 649 drop as a congestion signal creates a tension because drop is both 650 an impairment (less would be better) and a useful signal (more 651 would be better): 653 * Explicit congestion signals can be used many times per round 654 trip, to keep tight control, without any impairment. Under 655 heavy load, even more explicit signals can be applied so the 656 queue can be kept short whatever the load. In contrast, 657 Classic AQMs have to introduce very high packet drop at high 658 load to keep the queue short. By using ECN, an L4S congestion 659 control's sawtooth reduction can be smaller and therefore 660 return to the operating point more often, without worrying that 661 more sawteeth will cause more signals. The consequent smaller 662 amplitude sawteeth fit between an empty queue and a very 663 shallow marking threshold (~1 ms in the public Internet), so 664 queue delay variation can be very low, without risk of under- 665 utilization. 667 * Explicit congestion signals can be emitted immediately to track 668 fluctuations of the queue. L4S shifts smoothing from the 669 network to the host. The network doesn't know the round trip 670 times of any of the flows. So if the network is responsible 671 for smoothing (as in the Classic approach), it has to assume a 672 worst case RTT, otherwise long RTT flows would become unstable. 673 This delays Classic congestion signals by 100-200 ms. In 674 contrast, each host knows its own round trip time. So, in the 675 L4S approach, the host can smooth each flow over its own RTT, 676 introducing no more smoothing delay than strictly necessary 677 (usually only a few milliseconds). A host can also choose not 678 to introduce any smoothing delay if appropriate, e.g. during 679 flow start-up. 681 Neither of the above is feasible if explicit congestion 682 signalling has to be considered 'equivalent to drop' (as was 683 required with Classic ECN [RFC3168]), because drop is an 684 impairment as well as a signal.
So drop cannot be excessively 685 frequent, and drop cannot be immediate, otherwise too many drops 686 would turn out to have been due to only a transient fluctuation in 687 the queue that would not have warranted dropping a packet in 688 hindsight. Therefore, in an L4S AQM, the L4S queue uses a new L4S 689 variant of ECN that is not equivalent to drop (see section 5.2 of 690 [I-D.ietf-tsvwg-ecn-l4s-id]), while the Classic queue uses either 691 Classic ECN [RFC3168] or drop, which are equivalent to each other. 693 Before Classic ECN was standardized, there were various proposals 694 to give an ECN mark a different meaning from drop. However, there 695 was no particular reason to agree on any one of the alternative 696 meanings, so 'equivalent to drop' was the only compromise that 697 could be reached. RFC 3168 contains a statement that: 699 "An environment where all end nodes were ECN-Capable could 700 allow new criteria to be developed for setting the CE 701 codepoint, and new congestion control mechanisms for end-node 702 reaction to CE packets. However, this is a research issue, and 703 as such is not addressed in this document." 705 Latency isolation (network): L4S congestion controls keep queue 706 delay low whereas Classic congestion controls need a queue of the 707 order of the RTT to avoid under-utilization. One queue cannot 708 have two lengths, therefore L4S traffic needs to be isolated in a 709 separate queue (e.g. DualQ) or queues (e.g. FQ). 711 Coupled congestion notification: Coupling the congestion 712 notification between two queues as in the DualQ Coupled AQM is not 713 necessarily essential, but it is a simple way to allow senders to 714 determine their rate, packet by packet, rather than be overridden 715 by a network scheduler. An alternative is for a network scheduler 716 to control the rate of each application flow (see discussion in 717 Section 5.2). 
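The coupled notification described above can be illustrated with a short sketch, following the DualPI2 example in the appendices of [I-D.ietf-tsvwg-aqm-dualq-coupled]: a base probability p' comes from the Classic queue's controller (e.g. a PI controller), the Classic queue applies p' squared (the 'PI2' squaring) and the L4S queue applies k * p', where k = 2 is that draft's default coupling factor. The function and variable names here are illustrative, not taken from the draft's pseudocode:

```python
def coupled_probabilities(p_base: float, k: float = 2.0):
    """Illustrative DualQ coupling: given the base probability p'
    from the Classic queue's controller (e.g. a PI controller),
    return (Classic drop/mark probability, L4S marking probability).

    Classic gets p'^2; L4S gets k * p'.  Because a Reno/Cubic flow's
    rate varies as 1/sqrt(p_C) while a scalable flow's rate varies
    as 1/p_L, setting p_L = k * sqrt(p_C) makes their steady-state
    rates approximately equal - the per-flow fairness noted above."""
    p_classic = min(p_base ** 2, 1.0)
    p_l4s = min(k * p_base, 1.0)
    return p_classic, p_l4s

# e.g. p' = 0.1 with the default k = 2 gives 1% Classic drop/mark
# versus 20% L4S marking: many small signals rather than a few
# large ones, which is what keeps the L4S sawteeth shallow.
```

Note how the squaring means the Classic queue sees a much lower signal rate than the L4S queue at the same base probability, matching the two congestion controls' very different responses per signal.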
719 L4S packet identifier (protocol): Once there are at least two 720 treatments in the network, hosts need an identifier at the IP 721 layer to distinguish which treatment they intend to use. 723 Scalable congestion notification: A scalable congestion control in 724 the host keeps the signalling frequency from the network high 725 whatever the flow rate, so that queue delay variations can be 726 small when conditions are stable, and rate can track variations in 727 available capacity as rapidly as possible otherwise. 729 Low loss: Latency is not the only concern of L4S. The 'Low Loss' 730 part of the name denotes that L4S generally achieves zero 731 congestion loss due to its use of ECN. Otherwise, loss would 732 itself cause delay, particularly for short flows, due to 733 retransmission delay [RFC2884]. 735 Scalable throughput: The "Scalable throughput" part of the name 736 denotes that the per-flow throughput of scalable congestion 737 controls should scale indefinitely, avoiding the imminent scaling 738 problems with Reno-friendly congestion control 739 algorithms [RFC3649]. It was known when TCP congestion avoidance 740 was first developed in 1988 that it would not scale to high 741 bandwidth-delay products (see footnote 6 in [TCP-CA]). Today, 742 regular broadband flow rates over WAN distances are already beyond 743 the scaling range of Classic Reno congestion control. So `less 744 unscalable' Cubic [RFC8312] and Compound [I-D.sridharan-tcpm-ctcp] 745 variants of TCP have been successfully deployed. However, these 746 are now approaching their scaling limits. 748 For instance, we will consider a scenario with a maximum RTT of 749 30 ms at the peak of each sawtooth. As Reno packet rate scales 8x 750 from 1,250 to 10,000 packet/s (from 15 to 120 Mb/s with 1500 B 751 packets), the time to recover from a congestion event rises 752 proportionately by 8x as well, from 422 ms to 3.38 s. 
It is 753 clearly problematic for a congestion control to take multiple 754 seconds to recover from each congestion event. Cubic [RFC8312] 755 was developed to be less unscalable, but it is approaching its 756 scaling limit; with the same max RTT of 30 ms, at 120 Mb/s Cubic 757 is still fully in its Reno-friendly mode, so it takes about 4.3 s 758 to recover. However, once the flow rate scales by 8x again to 759 960 Mb/s it enters true Cubic mode, with a recovery time of 760 12.2 s. From then on, each further scaling by 8x doubles Cubic's 761 recovery time (because the cube root of 8 is 2), e.g. at 7.68 Gb/s 762 the recovery time is 24.3 s. In contrast, a scalable congestion 763 control like DCTCP or TCP Prague induces 2 congestion signals per 764 round trip on average, which remains invariant for any flow rate, 765 keeping dynamic control very tight. 767 For a feel of where the global average lone-flow download sits on 768 this scale at the time of writing (2021), according to [BDPdata] 769 globally averaged fixed access capacity was 103 Mb/s in 2020 and 770 averaged base RTT to a CDN was 25-34 ms in 2019. Averaging of per- 771 country data was weighted by Internet user population (data 772 collected globally is necessarily of variable quality, but the 773 paper does double-check that the outcome compares well against a 774 second source). So a lone CUBIC flow would at best take about 200 775 round trips (5 s) to recover from each of its sawtooth reductions, 776 if the flow even lasted that long. This is described as 'at best' 777 because it assumes everyone uses an AQM, whereas in reality most 778 users still have a (probably bloated) tail-drop buffer. In the 779 tail-drop case, likely average recovery time would be at least 4x 780 5 s, if not more, because RTT under load would be at least double 781 that of an AQM, and recovery time depends on the square of RTT.
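The true-Cubic recovery times quoted above can be checked from Cubic's published constants. The sketch below (names are illustrative) assumes RFC 8312's C = 0.4 and beta_cubic = 0.7, i.e. the window drops to 0.7 * W_max on a congestion event, so regrowing the lost 0.3 * W_max takes K = cbrt(0.3 * W_max / C):

```python
def cubic_recovery_time(rate_bps: float, rtt_s: float,
                        pkt_bytes: int = 1500) -> float:
    """Recovery time of a Cubic flow in true Cubic mode, from the
    window growth function W(t) = C*(t - K)^3 + W_max of RFC 8312,
    with C = 0.4 and beta_cubic = 0.7 (cwnd drops to 0.7 * W_max).
    The time to regrow to W_max is K = cbrt(0.3 * W_max / C)."""
    w_max = rate_bps * rtt_s / (pkt_bytes * 8)  # window in packets
    return (0.3 * w_max / 0.4) ** (1.0 / 3.0)

# Reproducing the figures above (30 ms max RTT, 1500 B packets):
#   960 Mb/s  -> ~12.2 s
#   7.68 Gb/s -> ~24.3 s
# i.e. each 8x scaling of the rate doubles the recovery time,
# because the cube root of 8 is 2.
```

This also makes clear why the scaling problem is inherent: recovery time grows with the cube root of the window, so no fixed rate is 'safe', only ever-longer sawteeth.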
783 Although work on scaling congestion controls tends to start with 784 TCP as the transport, the above is not intended to exclude other 785 transports (e.g. SCTP, QUIC) or less elastic algorithms 786 (e.g. RMCAT), which all tend to adopt the same or similar 787 developments. 789 5.2. What L4S adds to Existing Approaches 791 All the following approaches address some part of the same problem 792 space as L4S. In each case, it is shown that L4S complements them or 793 improves on them, rather than being a mutually exclusive alternative: 795 Diffserv: Diffserv addresses the problem of bandwidth apportionment 796 for important traffic as well as queuing latency for delay- 797 sensitive traffic. Of these, L4S solely addresses the problem of 798 queuing latency. Diffserv will still be necessary where important 799 traffic requires priority (e.g. for commercial reasons, or for 800 protection of critical infrastructure traffic) - see 801 [I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach 802 can provide low latency for all traffic within each Diffserv class 803 (including the case where there is only the one default Diffserv 804 class). 806 Also, Diffserv can only provide a latency benefit if a small 807 subset of the traffic on a bottleneck link requests low latency. 808 As already explained, it has no effect when all the applications 809 in use at one time at a single site (home, small business or 810 mobile device) require low latency. In contrast, because L4S 811 works for all traffic, it needs none of the management baggage 812 (traffic policing, traffic contracts) associated with favouring 813 some packets over others. This lack of management baggage ought 814 to give L4S a better chance of end-to-end deployment. 
816 In particular, because networks tend not to trust end systems to 817 identify which packets should be favoured over others, where 818 networks assign packets to Diffserv classes they tend to use 819 packet inspection of application flow identifiers or deeper 820 inspection of application signatures. Thus, nowadays, Diffserv 821 doesn't always sit well with encryption of the layers above IP 822 [RFC8404]. So users have to choose between privacy and QoS. 824 As with Diffserv, the L4S identifier is in the IP header. But, in 825 contrast to Diffserv, the L4S identifier does not convey a want or 826 a need for a certain level of quality. Rather, it promises a 827 certain behaviour (scalable congestion response), which networks 828 can objectively verify if they need to. This is because low delay 829 depends on collective host behaviour, whereas bandwidth priority 830 depends on network behaviour. 832 State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a 833 significant reduction in queuing delay relative to no AQM at all. 834 L4S is intended to complement these AQMs, and should not distract 835 from the need to deploy them as widely as possible. Nonetheless, 836 AQMs alone cannot reduce queuing delay too far without 837 significantly reducing link utilization, because the root cause of 838 the problem is on the host - where Classic congestion controls use 839 large saw-toothing rate variations. The L4S approach resolves 840 this tension between delay and utilization by enabling hosts to 841 minimize the amplitude of their sawteeth. 
A single-queue Classic 842 AQM is not sufficient to allow hosts to use small sawteeth for two 843 reasons: i) smaller sawteeth would not get lower delay in an AQM 844 designed for larger amplitude Classic sawteeth, because a queue 845 can only have one length at a time; and ii) much smaller sawteeth 846 implies much more frequent sawteeth, so L4S flows would drive a 847 Classic AQM into a high level of ECN-marking, which would appear 848 as heavy congestion to Classic flows, which in turn would greatly 849 reduce their rate as a result (see Section 6.4.4). 851 Per-flow queuing or marking: Similarly, per-flow approaches such as 852 FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the 853 L4S approach. However, per-flow queuing alone is not enough - it 854 only isolates the queuing of one flow from others; not from 855 itself. Per-flow implementations need to have support for 856 scalable congestion control added, which has already been done for 857 FQ-CoDel in Linux (see Sec.5.2.7 of [RFC8290] and 858 [FQ_CoDel_Thresh]). Without this simple modification, per-flow 859 AQMs like FQ-CoDel would still not be able to support applications 860 that need both very low delay and high bandwidth, e.g. video-based 861 control of remote procedures, or interactive cloud-based video 862 (see Note 1 below). 864 Although per-flow techniques are not incompatible with L4S, it is 865 important to have the DualQ alternative. This is because handling 866 end-to-end (layer 4) flows in the network (layer 3 or 2) precludes 867 some important end-to-end functions. For instance: 869 a. Per-flow forms of L4S like FQ-CoDel are incompatible with full 870 end-to-end encryption of transport layer identifiers for 871 privacy and confidentiality (e.g. IPSec or encrypted VPN 872 tunnels, as opposed to TLS over UDP), because they require 873 packet inspection to access the end-to-end transport flow 874 identifiers. 
876 In contrast, the DualQ form of L4S requires no deeper 877 inspection than the IP layer. So, as long as operators take 878 the DualQ approach, their users can have both very low queuing 879 delay and full end-to-end encryption [RFC8404]. 881 b. With per-flow forms of L4S, the network takes over control of 882 the relative rates of each application flow. Some see it as 883 an advantage that the network will prevent some flows running 884 faster than others. Others consider it an inherent part of 885 the Internet's appeal that applications can control their rate 886 while taking account of the needs of others via congestion 887 signals. They maintain that this has allowed applications 888 with interesting rate behaviours to evolve, for instance, 889 variable bit-rate video that varies around an equal share 890 rather than being forced to remain equal at every instant, or 891 e2e scavenger behaviours [RFC6817] that use less than an equal 892 share of capacity [LEDBAT_AQM]. 894 The L4S architecture does not require the IETF to commit to 895 one approach over the other, because it supports both, so that 896 the 'market' can decide. Nonetheless, in the spirit of 'Do 897 one thing and do it well' [McIlroy78], the DualQ option 898 provides low delay without prejudging the issue of flow-rate 899 control. Then, flow rate policing can be added separately if 900 desired. This allows application control up to a point, but 901 the network can still choose to set the point at which it 902 intervenes to prevent one flow completely starving another. 904 Note: 906 1. It might seem that self-inflicted queuing delay within a per- 907 flow queue should not be counted, because if the delay wasn't 908 in the network it would just shift to the sender. However, 909 modern adaptive applications, e.g. 
HTTP/2 [RFC7540] or some 910 interactive media applications (see Section 6.1), can keep low 911 latency objects at the front of their local send queue by 912 shuffling priorities of other objects dependent on the 913 progress of other transfers (for example see [lowat]). They 914 cannot shuffle objects once they have released them into the 915 network. 917 Alternative Back-off ECN (ABE): Here again, L4S is not an 918 alternative to ABE but a complement that introduces much lower 919 queuing delay. ABE [RFC8511] alters the host behaviour in 920 response to ECN marking to utilize a link better and give ECN 921 flows faster throughput. It uses ECT(0) and assumes the network 922 still treats ECN and drop the same. Therefore ABE exploits any 923 lower queuing delay that AQMs can provide. But as explained 924 above, AQMs still cannot reduce queuing delay too far without 925 losing link utilization (to allow for other, non-ABE, flows). 927 BBR: Bottleneck Bandwidth and Round-trip propagation time 928 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing 929 delay end-to-end without needing any special logic in the network, 930 such as an AQM. So it works pretty-much on any path. BBR keeps 931 queuing delay reasonably low, but perhaps not quite as low as with 932 state-of-the-art AQMs such as PIE or FQ-CoDel, and certainly 933 nowhere near as low as with L4S. Queuing delay is also not 934 consistently low, due to BBR's regular bandwidth probing spikes 935 and its aggressive flow start-up phase. 937 L4S complements BBR. Indeed 938 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] can use L4S ECN 939 where available and a scalable L4S congestion control behaviour in 940 response to any ECN signalling from the path. The L4S ECN signal 941 complements the delay based congestion control aspects of BBR with 942 an explicit indication that hosts can use, both to converge on a 943 fair rate and to keep below a shallow queue target set by the 944 network. 
Without L4S ECN, both these aspects need to be assumed 945 or estimated. 947 6. Applicability 949 6.1. Applications 951 A transport layer that solves the current latency issues will provide 952 new service, product and application opportunities. 954 With the L4S approach, the following existing applications also 955 experience significantly better quality of experience under load: 957 * Gaming, including cloud based gaming; 959 * VoIP; 961 * Video conferencing; 963 * Web browsing; 965 * (Adaptive) video streaming; 966 * Instant messaging. 968 The significantly lower queuing latency also enables some interactive 969 application functions to be offloaded to the cloud that would hardly 970 even be usable today: 972 * Cloud based interactive video; 974 * Cloud based virtual and augmented reality. 976 The above two applications have been successfully demonstrated with 977 L4S, both running together over a 40 Mb/s broadband access link 978 loaded up with the numerous other latency sensitive applications in 979 the previous list as well as numerous downloads - all sharing the 980 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a 981 panoramic video of a football stadium could be swiped and pinched so 982 that, on the fly, a proxy in the cloud could generate a sub-window of 983 the match video under the finger-gesture control of each user. For 984 the latter, a virtual reality headset displayed a viewport taken from 985 a 360 degree camera in a racing car. The user's head movements 986 controlled the viewport extracted by a cloud-based proxy. In both 987 cases, with 7 ms end-to-end base delay, the additional queuing delay 988 of roughly 1 ms was so low that it seemed the video was generated 989 locally. 991 Using a swiping finger gesture or head movement to pan a video are 992 extremely latency-demanding actions--far more demanding than VoIP. 
993 This is because human vision can detect extremely low delays of the 994 order of single milliseconds when delay is translated into a visual lag 995 between a video and a reference point (the finger or the orientation 996 of the head sensed by the balance system in the inner ear -- the 997 vestibular system). 999 Without the low queuing delay of L4S, cloud-based applications like 1000 these would not be credible without significantly more access 1001 bandwidth (to deliver all possible video that might be viewed) and 1002 more local processing, which would increase the weight and power 1003 consumption of head-mounted displays. When all interactive 1004 processing can be done in the cloud, only the data to be rendered for 1005 the end user needs to be sent. 1007 Other low latency high bandwidth applications such as: 1009 * Interactive remote presence; 1011 * Video-assisted remote control of machinery or industrial 1012 processes. 1014 are not credible at all without very low queuing delay. No amount of 1015 extra access bandwidth or local processing can make up for lost time. 1017 6.2. Use Cases 1019 The following use-cases for L4S are being considered by various 1020 interested parties: 1022 * Where the bottleneck is one of various types of access network: 1023 e.g. DSL, Passive Optical Networks (PON), DOCSIS cable, mobile, 1024 satellite (see Section 6.3 for some technology-specific details) 1026 * Private networks of heterogeneous data centres, where there is no 1027 single administrator that can arrange for all the simultaneous 1028 changes to senders, receivers and network needed to deploy DCTCP: 1030 - a set of private data centres interconnected over a wide area 1031 with separate administrations, but within the same company 1033 - a set of data centres operated by separate companies 1034 interconnected by a community of interest network (e.g.
for the 1035 finance sector) 1037 - multi-tenant (cloud) data centres where tenants choose their 1038 operating system stack (Infrastructure as a Service - IaaS) 1040 * Different types of transport (or application) congestion control: 1042 - elastic (TCP/SCTP); 1044 - real-time (RTP, RMCAT); 1046 - query (DNS/LDAP). 1048 * Where low delay quality of service is required, but without 1049 inspecting or intervening above the IP layer [RFC8404]: 1051 - mobile and other networks have tended to inspect higher layers 1052 in order to guess application QoS requirements. However, with 1053 growing demand for support of privacy and encryption, L4S 1054 offers an alternative. There is no need to select which 1055 traffic to favour for queuing, when L4S can give favourable 1056 queuing to all traffic. 1058 * If queuing delay is minimized, applications with a fixed delay 1059 budget can communicate over longer distances, or via a longer 1060 chain of service functions [RFC7665] or onion routers. 1062 * If delay jitter is minimized, it is possible to reduce the 1063 dejitter buffers on the receive end of video streaming, which 1064 should improve the interactive experience. 1066 6.3. Applicability with Specific Link Technologies 1068 Certain link technologies aggregate data from multiple packets into 1069 bursts, and buffer incoming packets while building each burst. WiFi, 1070 PON and cable all involve such packet aggregation, whereas fixed 1071 Ethernet and DSL do not. No sender, whether L4S or not, can do 1072 anything to reduce the buffering needed for packet aggregation. So 1073 an AQM should not count this buffering as part of the queue that it 1074 controls, given that no amount of congestion signalling will reduce it. 1076 Certain link technologies also add buffering for other reasons, 1077 specifically: 1079 * Radio links (cellular, WiFi, satellite) that are distant from the 1080 source are particularly challenging.
The radio link capacity can 1081 vary rapidly by orders of magnitude, so it is considered desirable 1082 to hold a standing queue that can utilize sudden increases of 1083 capacity; 1085 * Cellular networks are further complicated by a perceived need to 1086 buffer in order to make hand-overs imperceptible; 1088 L4S cannot remove the need for all these different forms of 1089 buffering. However, by removing 'the longest pole in the tent' 1090 (buffering for the large sawteeth of Classic congestion controls), 1091 L4S exposes all these 'shorter poles' to greater scrutiny. 1093 Until now, the buffering needed for these additional reasons tended 1094 to be over-specified - with the excuse that none were 'the longest 1095 pole in the tent'. But having removed the 'longest pole', it becomes 1096 worthwhile to minimize them, for instance reducing packet aggregation 1097 burst sizes and MAC scheduling intervals. 1099 6.4. Deployment Considerations 1101 L4S AQMs, whether DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] or FQ, 1102 e.g. [RFC8290] are, in themselves, an incremental deployment 1103 mechanism for L4S - so that L4S traffic can coexist with existing 1104 Classic (Reno-friendly) traffic. Section 6.4.1 explains why only 1105 deploying an L4S AQM in one node at each end of the access link will 1106 realize nearly all the benefit of L4S. 1108 L4S involves both end systems and the network, so Section 6.4.2 1109 suggests some typical sequences to deploy each part, and why there 1110 will be an immediate and significant benefit after deploying just one 1111 part. 1113 Section 6.4.3 and Section 6.4.4 describe the converse incremental 1114 deployment case where there is no L4S AQM at the network bottleneck, 1115 so any L4S flow traversing this bottleneck has to take care in case 1116 it is competing with Classic traffic. 1118 6.4.1. Deployment Topology 1120 L4S AQMs will not have to be deployed throughout the Internet before 1121 L4S can benefit anyone. 
Operators of public Internet access networks 1122 typically design their networks so that the bottleneck will nearly 1123 always occur at one known (logical) link. This confines the cost of 1124 queue management technology to one place. 1126 The case of mesh networks is different and will be discussed later in 1127 this section. But the known bottleneck case is generally true for 1128 Internet access to all sorts of different 'sites', where the word 1129 'site' includes home networks, small- to medium-sized campus or 1130 enterprise networks and even cellular devices (Figure 2). Also, this 1131 known-bottleneck case tends to be applicable whatever the access link 1132 technology; whether xDSL, cable, PON, cellular, line of sight 1133 wireless or satellite. 1135 Therefore, the full benefit of the L4S service should be available in 1136 the downstream direction when an L4S AQM is deployed at the ingress 1137 to this bottleneck link. And similarly, the full upstream service 1138 will be available once an L4S AQM is deployed at the ingress into the 1139 upstream link. (Of course, multi-homed sites would only see the full 1140 benefit once all their access links were covered.) 1141 ______ 1142 ( ) 1143 __ __ ( ) 1144 |DQ\________/DQ|( enterprise ) 1145 ___ |__/ \__| ( /campus ) 1146 ( ) (______) 1147 ( ) ___||_ 1148 +----+ ( ) __ __ / \ 1149 | DC |-----( Core )|DQ\_______________/DQ|| home | 1150 +----+ ( ) |__/ \__||______| 1151 (_____) __ 1152 |DQ\__/\ __ ,===. 1153 |__/ \ ____/DQ||| ||mobile 1154 \/ \__|||_||device 1155 | o | 1156 `---' 1158 Figure 2: Likely location of DualQ (DQ) Deployments in common 1159 access topologies 1161 Deployment in mesh topologies depends on how overbooked the core is. 1162 If the core is non-blocking, or at least generously provisioned so 1163 that the edges are nearly always the bottlenecks, it would only be 1164 necessary to deploy an L4S AQM at the edge bottlenecks. 
For example, 1165 some data-centre networks are designed with the bottleneck in the 1166 hypervisor or host NICs, while others bottleneck at the top-of-rack 1167 switch (both the output ports facing hosts and those facing the 1168 core). 1170 An L4S AQM would often next be needed where the WiFi links in a home 1171 sometimes become the bottleneck. And an L4S AQM would eventually 1172 also need to be deployed at any other persistent bottlenecks such as 1173 network interconnections, e.g. some public Internet exchange points 1174 and the ingress and egress to WAN links interconnecting data-centres. 1176 6.4.2. Deployment Sequences 1178 For any one L4S flow to provide benefit, it requires three (or 1179 sometimes two) parts to have been deployed: i) the congestion control 1180 at the sender; ii) the AQM at the bottleneck; and iii) older 1181 transports (namely TCP) need upgraded receiver feedback too. This 1182 was the same deployment problem that ECN faced [RFC8170] so we have 1183 learned from that experience. 1185 Firstly, L4S deployment exploits the fact that DCTCP already exists 1186 on many Internet hosts (Windows, FreeBSD and Linux); both servers and 1187 clients. Therefore, an L4S AQM can be deployed at a network 1188 bottleneck to immediately give a working deployment of all the L4S 1189 parts for testing, as long as the ECT(0) codepoint is switched to 1190 ECT(1). DCTCP needs some safety concerns to be fixed for general use 1191 over the public Internet (see Section 4.3 of 1192 [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so 1193 these issues can be managed within controlled deployments or 1194 controlled trials. 1196 Secondly, the performance improvement with L4S is so significant that 1197 it enables new interactive services and products that were not 1198 previously possible. It is much easier for companies to initiate new 1199 work on deployment if there is budget for a new product trial. 
If, 1200 in contrast, there were only an incremental performance improvement 1201 (as with Classic ECN), spending on deployment tends to be much harder 1202 to justify. 1204 Thirdly, the L4S identifier is defined so that initially network 1205 operators can enable L4S exclusively for certain customers or certain 1206 applications. But this is carefully defined so that it does not 1207 compromise future evolution towards L4S as an Internet-wide service. 1208 This is because the L4S identifier is defined not only as the end-to- 1209 end ECN field, but it can also optionally be combined with any other 1210 packet header or some status of a customer or their access link (see 1211 section 5.4 of [I-D.ietf-tsvwg-ecn-l4s-id]). Operators could do this 1212 anyway, even if it were not blessed by the IETF. However, it is best 1213 for the IETF to specify that, if they use their own local identifier, 1214 it must be in combination with the IETF's identifier. Then, if an 1215 operator has opted for an exclusive local-use approach, later they 1216 only have to remove this extra rule to make the service work 1217 Internet-wide - it will already traverse middleboxes, peerings, etc. 
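The combination of the IETF's end-to-end identifier with an optional local-use classifier, as described above, can be sketched informally as follows (a minimal illustration only, not a specified algorithm; the function name and the premium address prefix are hypothetical):

```python
# Sketch of combined L4S classification: the IETF identifier is the
# IP-ECN field (ECT(1) or CE), optionally ANDed with a local-use rule
# such as a source-address match (hypothetical prefix shown).
ECT1, CE = 0b01, 0b11  # IP-ECN codepoints that act as the L4S identifier

def classify(ecn_field: int, src_addr: str,
             premium_prefixes=("198.51.100.",)) -> str:
    """Return which DualQ queue a packet would be directed into."""
    is_l4s_id = ecn_field in (ECT1, CE)  # end-to-end IETF identifier
    # Optional, purely local rule (e.g. registered customers only):
    is_local_match = any(src_addr.startswith(p) for p in premium_prefixes)
    if is_l4s_id and is_local_match:
        return "L4S"
    return "Classic"  # everything else still gets the Classic service
```

Because compliant traffic already carries the end-to-end identifier, an operator that later drops the local rule (the second condition) immediately widens the service to Internet-wide use, as the text above explains.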
1219 +-+--------------------+----------------------+---------------------+ 1220 | | Servers or proxies | Access link | Clients | 1221 +-+--------------------+----------------------+---------------------+ 1222 |0| DCTCP (existing) | | DCTCP (existing) | 1223 +-+--------------------+----------------------+---------------------+ 1224 |1| |Add L4S AQM downstream| | 1225 | | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS | 1226 +-+--------------------+----------------------+---------------------+ 1227 |2| Upgrade DCTCP to | |Replace DCTCP feedb'k| 1228 | | TCP Prague | | with AccECN | 1229 | | FULLY WORKS DOWNSTREAM | 1230 +-+--------------------+----------------------+---------------------+ 1231 | | | | Upgrade DCTCP to | 1232 |3| | Add L4S AQM upstream | TCP Prague | 1233 | | | | | 1234 | | FULLY WORKS UPSTREAM AND DOWNSTREAM | 1235 +-+--------------------+----------------------+---------------------+ 1236 Figure 3: Example L4S Deployment Sequence 1238 Figure 3 illustrates some example sequences in which the parts of L4S 1239 might be deployed. It consists of the following stages: 1241 1. Here, the immediate benefit of a single AQM deployment can be 1242 seen, but limited to a controlled trial or controlled deployment. 1243 In this example downstream deployment is first, but in other 1244 scenarios the upstream might be deployed first. If no AQM at all 1245 was previously deployed for the downstream access, an L4S AQM 1246 greatly improves the Classic service (as well as adding the L4S 1247 service). If an AQM was already deployed, the Classic service 1248 will be unchanged (and L4S will add an improvement on top). 1250 2. In this stage, the name 'TCP 1251 Prague' [I-D.briscoe-iccrg-prague-congestion-control] is used to 1252 represent a variant of DCTCP that is designed to be used in a 1253 production Internet environment (assuming it complies with the 1254 requirements in Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id]). 
If 1255 the application is primarily unidirectional, 'TCP Prague' at one 1256 end will provide all the benefit needed. For TCP transports, 1257 Accurate ECN feedback (AccECN) [I-D.ietf-tcpm-accurate-ecn] is 1258 needed at the other end, but it is a generic ECN feedback 1259 facility that is already planned to be deployed for other 1260 purposes, e.g. DCTCP, BBR. The two ends can be deployed in 1261 either order, because, in TCP, an L4S congestion control only 1262 enables itself if it has negotiated the use of AccECN feedback 1263 with the other end during the connection handshake. Thus, 1264 deployment of TCP Prague on a server enables L4S trials to move 1265 to a production service in one direction, wherever AccECN is 1266 deployed at the other end. This stage might be further motivated 1267 by the performance improvements of TCP Prague relative to DCTCP 1268 (see Appendix A.2 of [I-D.ietf-tsvwg-ecn-l4s-id]). 1270 Unlike TCP, from the outset, QUIC ECN feedback [RFC9000] has 1271 supported L4S. Therefore, if the transport is QUIC, one-ended 1272 deployment of a Prague congestion control at this stage is simple 1273 and sufficient. 1275 3. This is a two-move stage to enable L4S upstream. An L4S AQM or 1276 TCP Prague can be deployed in either order as already explained. 1277 To motivate the first of two independent moves, the deferred 1278 benefit of enabling new services after the second move has to be 1279 worth it to cover the first mover's investment risk. As 1280 explained already, the potential for new interactive services 1281 provides this motivation. An L4S AQM also improves the upstream 1282 Classic service - significantly if no other AQM has already been 1283 deployed. 1285 Note that other deployment sequences might occur. For instance: the 1286 upstream might be deployed first; a non-TCP protocol might be used 1287 end-to-end, e.g. 
QUIC, RTP; a body such as the 3GPP might require L4S 1288 to be implemented in 5G user equipment, or other random acts of 1289 kindness. 1291 6.4.3. L4S Flow but Non-ECN Bottleneck 1293 If L4S is enabled between two hosts, the L4S sender is required to 1294 coexist safely with Reno in response to any drop (see Section 4.3 of 1295 [I-D.ietf-tsvwg-ecn-l4s-id]). 1297 Unfortunately, as well as protecting Classic traffic, this rule 1298 degrades the L4S service whenever there is any loss, even if the 1299 cause is not persistent congestion at a bottleneck, e.g.: 1301 * congestion loss at other transient bottlenecks, e.g. due to bursts 1302 in shallower queues; 1304 * transmission errors, e.g. due to electrical interference; 1306 * rate policing. 1308 Three complementary approaches are in progress to address this issue, 1309 but they are all currently research: 1311 * In Prague congestion control, ignore certain losses deemed 1312 unlikely to be due to congestion (using some ideas from 1313 BBR [I-D.cardwell-iccrg-bbr-congestion-control] regarding isolated 1314 losses). This could mask any of the above types of loss while 1315 still coexisting with drop-based congestion controls. 1317 * A combination of RACK, L4S and link retransmission without 1318 resequencing could repair transmission errors without the head of 1319 line blocking delay usually associated with link-layer 1320 retransmission [UnorderedLTE], [I-D.ietf-tsvwg-ecn-l4s-id]; 1322 * Hybrid ECN/drop rate policers (see Section 8.3). 1324 L4S deployment scenarios that minimize these issues (e.g. over 1325 wireline networks) can proceed in parallel to this research, in the 1326 expectation that research success could continually widen L4S 1327 applicability. 1329 6.4.4. L4S Flow but Classic ECN Bottleneck 1331 Classic ECN support is starting to materialize on the Internet as an 1332 increased level of CE marking. 
It is hard to detect whether this is 1333 all due to the addition of support for ECN in implementations of FQ- 1334 CoDel and/or FQ-COBALT, which is not generally problematic, because 1335 flow-queue (FQ) scheduling inherently prevents a flow from exceeding 1336 the 'fair' rate irrespective of its aggressiveness. However, some of 1337 this Classic ECN marking might be due to single-queue ECN deployment. 1338 This case is discussed in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id]. 1340 6.4.5. L4S AQM Deployment within Tunnels 1342 An L4S AQM uses the ECN field to signal congestion. So, in common 1343 with Classic ECN, if the AQM is within a tunnel or at a lower layer, 1344 correct functioning of ECN signalling requires correct propagation of 1345 the ECN field up the layers [RFC6040], 1346 [I-D.ietf-tsvwg-rfc6040update-shim], 1347 [I-D.ietf-tsvwg-ecn-encap-guidelines]. 1349 7. IANA Considerations (to be removed by RFC Editor) 1351 This specification contains no IANA considerations. 1353 8. Security Considerations 1355 8.1. Traffic Rate (Non-)Policing 1357 In the current Internet, scheduling usually enforces separation 1358 between 'sites' (e.g. households, businesses or mobile users 1359 [RFC0970]) and various techniques like redirection to traffic 1360 scrubbing facilities deal with flooding attacks. However, there has 1361 never been a universal need to police the rate of individual 1362 application flows - the Internet has generally always relied on self- 1363 restraint of congestion controls at senders for sharing intra-'site' 1364 capacity. 1366 As explained in Section 5.2, the DualQ variant of L4S provides low 1367 delay without prejudging the issue of flow-rate control. Then, if 1368 flow-rate control is needed, per-flow-queuing (FQ) can be used 1369 instead, or flow rate policing can be added as a modular addition to 1370 a DualQ. 
1372 Because the L4S service reduces delay without increasing the delay of 1373 Classic traffic, it should not be necessary to rate-police access to 1374 the L4S service. In contrast, Section 5.2 explains how Diffserv only 1375 makes a difference if some packets get less favourable treatment than 1376 others, which typically requires traffic rate policing, which can, in 1377 turn, lead to further complexity such as traffic contracts at trust 1378 boundaries. Because L4S avoids this management complexity, it is 1379 more likely to work end-to-end. 1381 During early deployment (and perhaps always), some networks will not 1382 offer the L4S service. In general, these networks should not need to 1383 police L4S traffic. They are required (by both [RFC3168] and 1384 [I-D.ietf-tsvwg-ecn-l4s-id]) not to change the L4S identifier, which 1385 would interfere with end-to-end congestion control. Instead they can 1386 merely treat L4S traffic as Not-ECT, as they might already treat all 1387 ECN traffic today. At a bottleneck, such networks will introduce 1388 some queuing and dropping. When a scalable congestion control 1389 detects a drop it will have to respond safely with respect to Classic 1390 congestion controls (as required in Section 4.3 of 1391 [I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the L4S service to 1392 be no better (but never worse) than Classic best efforts, whenever a 1393 non-ECN bottleneck is encountered on a path (see Section 6.4.3). 1395 In cases that are expected to be rare, networks that solely support 1396 Classic ECN [RFC3168] in a single queue bottleneck might opt to 1397 police L4S traffic so as to protect competing Classic ECN traffic 1398 (for instance, see Section 6.1.3 of [I-D.ietf-tsvwg-l4sops]). 1399 However, Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] recommends that 1400 the sender adapts its congestion response to properly coexist with 1401 Classic ECN flows, i.e. reverting to the self-restraint approach. 
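The safe response required of a scalable sender at a non-ECN bottleneck can be caricatured in a few lines (a toy model for illustration only, not the Prague congestion control; the window arithmetic and the EWMA gain are simplified assumptions): a shallow, DCTCP-style reduction in proportion to ECN marking, but a Reno-like halving whenever loss is detected.

```python
# Toy sketch (NOT a specification) of the dual response of a scalable
# congestion control: gentle reduction for L4S ECN marks, Classic
# (Reno-like) multiplicative decrease on loss, so that the flow remains
# safe if it meets a drop-based (non-ECN) bottleneck.
class ScalableCC:
    def __init__(self, cwnd=100.0, g=1 / 16):
        self.cwnd = cwnd   # congestion window in segments (illustrative)
        self.alpha = 0.0   # EWMA of the fraction of CE-marked packets
        self.g = g         # EWMA gain (assumed value)

    def on_round(self, marked_frac: float, loss: bool):
        if loss:
            # Fall back to a Classic response: halve the window, so the
            # flow coexists with Reno/CUBIC at a drop-based bottleneck.
            self.cwnd = max(2.0, self.cwnd / 2)
            return
        self.alpha += self.g * (marked_frac - self.alpha)
        if marked_frac > 0:
            # Scalable response: reduction proportional to marking level.
            self.cwnd = max(2.0, self.cwnd * (1 - self.alpha / 2))
        else:
            self.cwnd += 1.0  # additive increase per round
```

The point of the sketch is only the asymmetry of the two branches; real implementations add many refinements (pacing, RTT independence, burst limits).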
1403 Certain network operators might choose to restrict access to the L4S 1404 class, perhaps only to selected premium customers as a value-added 1405 service. Their packet classifier (item 2 in Figure 1) could identify 1406 such customers against some other field (e.g. source address range) 1407 as well as classifying on the ECN field. If only the ECN L4S 1408 identifier matched, but not the source address (say), the classifier 1409 could direct these packets (from non-premium customers) into the 1410 Classic queue. Explaining clearly how operators can use 1411 additional local classifiers (see section 5.4 of 1412 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any motivation to 1413 clear the L4S identifier. Then at least the L4S ECN identifier will 1414 be more likely to survive end-to-end even though the service may not 1415 be supported at every hop. Such local arrangements would only 1416 require simple registered/not-registered packet classification, 1417 rather than the managed, application-specific traffic policing 1418 against customer-specific traffic contracts that Diffserv uses. 1420 8.2. 'Latency Friendliness' 1422 Like the Classic service, the L4S service relies on self-restraint - 1423 limiting rate in response to congestion. In addition, the L4S 1424 service requires self-restraint in terms of limiting latency 1425 (burstiness). It is hoped that self-interest and guidance on dynamic 1426 behaviour (especially flow start-up, which might need to be 1427 standardized) will be sufficient to prevent transports from sending 1428 excessive bursts of L4S traffic, given the application's own latency 1429 will suffer most from such behaviour. 1431 Whether burst policing becomes necessary remains to be seen. Without 1432 it, there will be potential for attacks on the low latency of the L4S 1433 service.
1435 If needed, various arrangements could be used to address this 1436 concern: 1438 Local bottleneck queue protection: A per-flow (5-tuple) queue 1439 protection function [I-D.briscoe-docsis-q-protection] has been 1440 developed for the low latency queue in DOCSIS, which has adopted 1441 the DualQ L4S architecture. It protects the low latency service 1442 from any queue-building flows that accidentally or maliciously 1443 classify themselves into the low latency queue. It is designed to 1444 score flows based solely on their contribution to queuing (not 1445 flow rate in itself). Then, if the shared low latency queue is at 1446 risk of exceeding a threshold, the function redirects enough 1447 packets of the highest scoring flow(s) into the Classic queue to 1448 preserve low latency. 1450 Distributed traffic scrubbing: Rather than policing locally at each 1451 bottleneck, it may only be necessary to address problems 1452 reactively, e.g. punitively target any deployments of new bursty 1453 malware, in a similar way to how traffic from flooding attack 1454 sources is rerouted via scrubbing facilities. 1456 Local bottleneck per-flow scheduling: Per-flow scheduling should 1457 inherently isolate non-bursty flows from bursty (see Section 5.2 1458 for discussion of the merits of per-flow scheduling relative to 1459 per-flow policing). 1461 Distributed access subnet queue protection: Per-flow queue 1462 protection could be arranged for a queue structure distributed 1463 across a subnet inter-communicating using lower layer control 1464 messages (see Section 2.1.4 of [QDyn]). For instance, in a radio 1465 access network, user equipment already sends regular buffer status 1466 reports to a radio network controller, which could use this 1467 information to remotely police individual flows. 
1469 Distributed Congestion Exposure to Ingress Policers: The Congestion 1470 Exposure (ConEx) architecture [RFC7713] uses egress audit to 1471 motivate senders to truthfully signal path congestion in-band, 1472 where it can be used by ingress policers. An edge-to-edge variant 1473 of this architecture is also possible. 1475 Distributed Domain-edge traffic conditioning: An architecture 1476 similar to Diffserv [RFC2475] may be preferred, where traffic is 1477 proactively conditioned on entry to a domain, rather than 1478 reactively policed only if it leads to queuing once combined with 1479 other traffic at a bottleneck. 1481 Distributed core network queue protection: The policing function 1482 could be divided between per-flow mechanisms at the network 1483 ingress that characterize the burstiness of each flow into a 1484 signal carried with the traffic, and per-class mechanisms at 1485 bottlenecks that act on these signals if queuing actually occurs 1486 once the traffic converges. This would be somewhat similar to 1487 [Nadas20], which is in turn similar to the idea behind core 1488 stateless fair queuing. 1490 None of these possible queue protection capabilities are considered a 1491 necessary part of the L4S architecture, which works without them (in 1492 a similar way to how the Internet works without per-flow rate 1493 policing). Indeed, even where latency policers are deployed, under 1494 normal circumstances they would not intervene, and if operators found 1495 they were not necessary they could disable them. Part of the L4S 1496 experiment will be to see whether such a function is necessary, and 1497 which arrangements are most appropriate to the size of the problem. 1499 8.3. Interaction between Rate Policing and L4S 1501 As mentioned in Section 5.2, L4S should remove the need for low 1502 latency Diffserv classes.
However, those Diffserv classes that give 1503 certain applications or users priority over capacity would still be 1504 applicable in certain scenarios (e.g. corporate networks). Then, 1505 within such Diffserv classes, L4S would often be applicable to give 1506 traffic low latency and low loss as well. Within such a Diffserv 1507 class, the bandwidth available to a user or application is often 1508 limited by a rate policer. Similarly, in the default Diffserv class, 1509 rate policers are used to partition shared capacity. 1511 A classic rate policer drops any packets exceeding a set rate, 1512 usually also giving a burst allowance (variants exist where the 1513 policer re-marks non-compliant traffic to a discard-eligible Diffserv 1514 codepoint, so that it can be dropped elsewhere during contention). 1515 Whenever L4S traffic encounters one of these rate policers, it will 1516 experience drops and the source will have to fall back to a Classic 1517 congestion control, thus losing the benefits of L4S (Section 6.4.3). 1518 So, in networks that already use rate policers and plan to deploy 1519 L4S, it will be preferable to redesign these rate policers to be more 1520 friendly to the L4S service. 1522 L4S-friendly rate policing is currently a research area (note that 1523 this is not the same as latency policing). It might be achieved by 1524 setting a threshold where ECN marking is introduced, such that it is 1525 just under the policed rate or just under the burst allowance where 1526 drop is introduced. For instance, the two-rate three-colour marker 1527 [RFC2698] or a PCN threshold and excess-rate marker [RFC5670] could 1528 mark ECN at the lower rate and drop at the higher. Or an existing 1529 rate policer could have congestion-rate policing added, e.g. using 1530 the 'local' (non-ConEx) variant of the ConEx aggregate congestion 1531 policer [I-D.briscoe-conex-policing].
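The threshold idea described above might be sketched roughly as a single token bucket with two levels (hypothetical parameters throughout; a deployable design would follow the metering rules of [RFC2698] or [RFC5670], and, as stated, this remains a research area):

```python
# Rough sketch of an L4S-friendly policer: CE-marking begins while some
# burst allowance remains, and drop only starts once the allowance is
# exhausted. Rates, depths and the marking fraction are illustrative.
class L4SFriendlyPolicer:
    def __init__(self, rate_bps=10e6, burst_bytes=30000, mark_frac=0.25):
        self.rate = rate_bps / 8        # token fill rate in bytes/s
        self.depth = burst_bytes        # bucket depth = burst allowance
        self.mark_level = mark_frac * burst_bytes  # CE-mark below this
        self.tokens = float(burst_bytes)
        self.last = 0.0                 # time of last packet, seconds

    def police(self, now: float, pkt_len: int) -> str:
        # Refill tokens for the elapsed interval, capped at the depth.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < pkt_len:
            return "drop"               # burst allowance exhausted
        self.tokens -= pkt_len
        if self.tokens < self.mark_level:
            return "mark"               # signal ECN before drop is needed
        return "forward"
```

A scalable sender then sees a rising marking level as it approaches the policed rate and can slow down before any packet is dropped, rather than discovering the policer through loss.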
It might also be possible to 1532 design scalable congestion controls to respond less catastrophically 1533 to loss that has not been preceded by a period of increasing delay. 1535 The design of L4S-friendly rate policers will require a separate 1536 dedicated document. For further discussion of the interaction 1537 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv]. 1539 8.4. ECN Integrity 1541 Receiving hosts can fool a sender into downloading faster by 1542 suppressing feedback of ECN marks (or of losses if retransmissions 1543 are not necessary or available otherwise). Various ways to protect 1544 transport feedback integrity have been developed. For instance: 1546 * The sender can test the integrity of the receiver's feedback by 1547 occasionally setting the IP-ECN field to the congestion 1548 experienced (CE) codepoint, which is normally only set by a 1549 congested link. Then the sender can test whether the receiver's 1550 feedback faithfully reports what it expects (see 2nd para of 1551 Section 20.2 of [RFC3168]). 1553 * A network can enforce a congestion response to its ECN markings 1554 (or packet losses) by auditing congestion exposure 1555 (ConEx) [RFC7713]. 1557 * Transport layer authentication such as the TCP authentication 1558 option (TCP-AO [RFC5925]) or QUIC's use of TLS [RFC9001] can 1559 detect any tampering with congestion feedback. 1561 * The ECN Nonce [RFC3540] was proposed to detect tampering with 1562 congestion feedback, but it has been reclassified as 1563 historic [RFC8311]. 1565 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of 1566 these techniques including their applicability and pros and cons. 1568 8.5. Privacy Considerations 1570 As discussed in Section 5.2, the L4S architecture does not preclude 1571 approaches that inspect end-to-end transport layer identifiers. For 1572 instance, L4S support has been added to FQ-CoDel, which classifies by 1573 application flow ID in the network. 
However, the main innovation of 1574 L4S is the DualQ AQM framework that does not need to inspect any 1575 deeper than the outermost IP header, because the L4S identifier is in 1576 the IP-ECN field. 1578 Thus, the L4S architecture enables very low queuing delay without 1579 _requiring_ inspection of information above the IP layer. This means 1580 that users who want to encrypt application flow identifiers, e.g. in 1581 IPSec or other encrypted VPN tunnels, don't have to sacrifice low 1582 delay [RFC8404]. 1584 Because L4S can provide low delay for a broad set of applications 1585 that choose to use it, there is no need for individual applications 1586 or classes within that broad set to be distinguishable in any way 1587 while traversing networks. This removes much of the ability to 1588 correlate between the delay requirements of traffic and other 1589 identifying features [RFC6973]. There may be some types of traffic 1590 that prefer not to use L4S, but the coarse binary categorization of 1591 traffic reveals very little that could be exploited to compromise 1592 privacy. 1594 9. Acknowledgements 1596 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David 1597 Black, Jake Holland, Vidhi Goel, Ermin Sakic, Praveen 1598 Balasubramanian, Gorry Fairhurst, Mirja Kuehlewind, Philip Eardley, 1599 Neal Cardwell and Pete Heist for their useful review comments. 1601 Bob Briscoe and Koen De Schepper were part-funded by the European 1602 Community under its Seventh Framework Programme through the Reducing 1603 Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe 1604 was also part-funded by the Research Council of Norway through the 1605 TimeIn project, partly by CableLabs and partly by the Comcast 1606 Innovation Fund. The views expressed here are solely those of the 1607 authors. 1609 10. Informative References 1611 [AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H., 1612 and S-J. 
Park, "Towards fair and low latency next 1613 generation high speed networks: AFCD queuing", Journal of 1614 Network and Computer Applications 70:183--193, July 2016, 1615 . 1617 [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", github 1618 repository; Linux congestion control module, 1619 . 1621 [BDPdata] Briscoe, B., "PI2 Parameters", Technical Report TR-BB- 1622 2021-001 arXiv:2107.01003 [cs.NI], July 2021, 1623 . 1625 [BufferSize] 1626 Appenzeller, G., Keslassy, I., and N. McKeown, "Sizing 1627 Router Buffers", In Proc. SIGCOMM'04 34(4):281--292, 1628 September 2004, . 1630 [COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., 1631 Tahiliani, M. P., Avallone, S., and D. Täht, "Design and 1632 Evaluation of COBALT Queue Discipline", In Proc. IEEE 1633 Int'l Symp. Local and Metropolitan Area Networks 1634 (LANMAN'19) 2019:1-6, July 2019, 1635 . 1637 [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. 1638 Briscoe, "`Data Centre to the Home': Ultra-Low Latency for 1639 All", Updated RITE project Technical Report , July 2019, 1640 . 1642 [DOCSIS3.1] 1643 CableLabs, "MAC and Upper Layer Protocols Interface 1644 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable 1645 Service Interface Specifications DOCSIS® 3.1 Version i17 1646 or later, 21 January 2019, . 1649 [DOCSIS3AQM] 1650 White, G., "Active Queue Management Algorithms for DOCSIS 1651 3.0; A Simulation Study of CoDel, SFQ-CoDel and PIE in 1652 DOCSIS 3.0 Networks", CableLabs Technical Report , April 1653 2013, <http://www.cablelabs.com/wp- 1654 content/uploads/2013/11/ 1655 Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf>. 1657 [DualPI2Linux] 1658 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., 1659 and H. Steen, "DUALPI2 - Low Latency, Low Loss and 1660 Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019, 1661 . 1664 [Dukkipati06] 1665 Dukkipati, N. and N.
McKeown, "Why Flow-Completion Time is 1666 the Right Metric for Congestion Control", ACM CCR 1667 36(1):59--62, January 2006, 1668 . 1670 [FQ_CoDel_Thresh] 1671 Høiland-Jørgensen, T., "fq_codel: generalise ce_threshold 1672 marking for subset of traffic", Linux Patch Commit ID: 1673 dfcb63ce1de6b10b, 20 October 2021, 1674 . 1677 [Hohlfeld14] 1678 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. 1679 Barford, "A QoE Perspective on Sizing Network Buffers", 1680 Proc. ACM Internet Measurement Conf (IMC'14), November 1681 2014, . 1683 [I-D.briscoe-conex-policing] 1684 Briscoe, B., "Network Performance Isolation using 1685 Congestion Policing", Work in Progress, Internet-Draft, 1686 draft-briscoe-conex-policing-01, 14 February 2014, 1687 . 1690 [I-D.briscoe-docsis-q-protection] 1691 Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection 1692 Algorithm to Preserve Low Latency", Work in Progress, 1693 Internet-Draft, draft-briscoe-docsis-q-protection-02, 31 1694 January 2022, . 1697 [I-D.briscoe-iccrg-prague-congestion-control] 1698 Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague 1699 Congestion Control", Work in Progress, Internet-Draft, 1700 draft-briscoe-iccrg-prague-congestion-control-00, 9 March 1701 2021, . 1704 [I-D.briscoe-tsvwg-l4s-diffserv] 1705 Briscoe, B., "Interactions between Low Latency, Low Loss, 1706 Scalable Throughput (L4S) and Differentiated Services", 1707 Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- 1708 diffserv-02, 4 November 2018, 1709 . 1712 [I-D.cardwell-iccrg-bbr-congestion-control] 1713 Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. 1714 Jacobson, "BBR Congestion Control", Work in Progress, 1715 Internet-Draft, draft-cardwell-iccrg-bbr-congestion- 1716 control-01, 7 November 2021, 1717 . 1720 [I-D.ietf-tcpm-accurate-ecn] 1721 Briscoe, B., Kühlewind, M., and R.
Scheffenegger, "More 1722 Accurate ECN Feedback in TCP", Work in Progress, Internet- 1723 Draft, draft-ietf-tcpm-accurate-ecn-15, 12 July 2021, 1724 . 1727 [I-D.ietf-tcpm-generalized-ecn] 1728 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1729 Congestion Notification (ECN) to TCP Control Packets", 1730 Work in Progress, Internet-Draft, draft-ietf-tcpm- 1731 generalized-ecn-09, 31 January 2022, 1732 . 1735 [I-D.ietf-tsvwg-aqm-dualq-coupled] 1736 Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled 1737 AQMs for Low Latency, Low Loss and Scalable Throughput 1738 (L4S)", Work in Progress, Internet-Draft, draft-ietf- 1739 tsvwg-aqm-dualq-coupled-20, 24 December 2021, 1740 . 1743 [I-D.ietf-tsvwg-ecn-encap-guidelines] 1744 Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding 1745 Congestion Notification to Protocols that Encapsulate IP", 1746 Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- 1747 encap-guidelines-16, 25 May 2021, 1748 . 1751 [I-D.ietf-tsvwg-ecn-l4s-id] 1752 Schepper, K. D. and B. Briscoe, "Explicit Congestion 1753 Notification (ECN) Protocol for Very Low Queuing Delay 1754 (L4S)", Work in Progress, Internet-Draft, draft-ietf- 1755 tsvwg-ecn-l4s-id-23, 24 December 2021, 1756 . 1759 [I-D.ietf-tsvwg-l4sops] 1760 White, G., "Operational Guidance for Deployment of L4S in 1761 the Internet", Work in Progress, Internet-Draft, draft- 1762 ietf-tsvwg-l4sops-02, 25 October 2021, 1763 . 1766 [I-D.ietf-tsvwg-nqb] 1767 White, G. and T. Fossati, "A Non-Queue-Building Per-Hop 1768 Behavior (NQB PHB) for Differentiated Services", Work in 1769 Progress, Internet-Draft, draft-ietf-tsvwg-nqb-08, 25 1770 October 2021, . 1773 [I-D.ietf-tsvwg-rfc6040update-shim] 1774 Briscoe, B., "Propagating Explicit Congestion Notification 1775 Across IP Tunnel Headers Separated by a Shim", Work in 1776 Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- 1777 shim-14, 25 May 2021, 1778 . 1781 [I-D.morton-tsvwg-codel-approx-fair] 1782 Morton, J. and P. G. 
Heist, "Controlled Delay Approximate 1783 Fairness AQM", Work in Progress, Internet-Draft, draft- 1784 morton-tsvwg-codel-approx-fair-01, 9 March 2020, 1785 . 1788 [I-D.sridharan-tcpm-ctcp] 1789 Sridharan, M., Tan, K., Bansal, D., and D. Thaler, 1790 "Compound TCP: A New TCP Congestion Control for High-Speed 1791 and Long Distance Networks", Work in Progress, Internet- 1792 Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008, 1793 . 1796 [I-D.stewart-tsvwg-sctpecn] 1797 Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream 1798 Control Transmission Protocol (SCTP)", Work in Progress, 1799 Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January 1800 2014, . 1803 [L4Sdemo16] 1804 Bondarenko, O., De Schepper, K., Tsang, I., and B. 1805 Briscoe, "Ultra-Low Delay for All: Live Experience, Live 1806 Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, 1807 . 1811 [LEDBAT_AQM] 1812 Al-Saadi, R., Armitage, G., and J. But, "Characterising 1813 LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel 1814 and FQ-PIE Active Queue Management", Proc. IEEE 42nd 1815 Conference on Local Computer Networks (LCN) 278--285, 1816 2017, . 1818 [lowat] Meenan, P., "Optimizing HTTP/2 prioritization with BBR and 1819 tcp_notsent_lowat", Cloudflare Blog , 12 October 2018, 1820 . 1823 [Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , 1824 May 2009, . 1829 [McIlroy78] 1830 McIlroy, M.D., Pinson, E. N., and B. A. Tague, "UNIX Time- 1831 Sharing System: Foreword", The Bell System Technical 1832 Journal 57:6(1902--1903), July 1978, 1833 . 1835 [Nadas20] Nádas, S., Gombos, G., Fejes, F., and S. Laki, "A 1836 Congestion Control Independent L4S Scheduler", Proc. 1837 Applied Networking Research Workshop (ANRW '20) 45--51, 1838 July 2020, . 1840 [NewCC_Proc] 1841 Eggert, L., "Experimental Specification of New Congestion 1842 Control Algorithms", IETF Operational Note ion-tsv-alt-cc, 1843 July 2007, . 
1846 [PragueLinux] 1847 Briscoe, B., De Schepper, K., Albisser, O., Misund, J., 1848 Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing 1849 the `TCP Prague' Requirements for Low Latency Low Loss 1850 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , 1851 March 2019, . 1854 [QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", 1855 bobbriscoe.net Technical Report TR-BB-2017-001; 1856 arXiv:1904.07044 [cs.NI], September 2017, 1857 . 1859 [Rajiullah15] 1860 Rajiullah, M., "Towards a Low Latency Internet: 1861 Understanding and Solutions", Masters Thesis; Karlstad 1862 Uni, Dept of Maths & CS 2015:41, 2015, . 1865 [RFC0970] Nagle, J., "On Packet Switches With Infinite Storage", 1866 RFC 970, DOI 10.17487/RFC0970, December 1985, 1867 . 1869 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1870 and W. Weiss, "An Architecture for Differentiated 1871 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 1872 . 1874 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color 1875 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999, 1876 . 1878 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1879 Explicit Congestion Notification (ECN) in IP Networks", 1880 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1881 . 1883 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1884 of Explicit Congestion Notification (ECN) to IP", 1885 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1886 . 1888 [RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le 1889 Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D. 1890 Stiliadis, "An Expedited Forwarding PHB (Per-Hop 1891 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, 1892 . 1894 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 1895 Congestion Notification (ECN) Signaling with Nonces", 1896 RFC 3540, DOI 10.17487/RFC3540, June 2003, 1897 . 
1899 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", 1900 RFC 3649, DOI 10.17487/RFC3649, December 2003, 1901 . 1903 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1904 Congestion Control Protocol (DCCP)", RFC 4340, 1905 DOI 10.17487/RFC4340, March 2006, 1906 . 1908 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1909 Explicit Congestion Notification (ECN) Field", BCP 124, 1910 RFC 4774, DOI 10.17487/RFC4774, November 2006, 1911 . 1913 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 1914 RFC 4960, DOI 10.17487/RFC4960, September 2007, 1915 . 1917 [RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion 1918 Control Algorithms", BCP 133, RFC 5033, 1919 DOI 10.17487/RFC5033, August 2007, 1920 . 1922 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1923 Friendly Rate Control (TFRC): Protocol Specification", 1924 RFC 5348, DOI 10.17487/RFC5348, September 2008, 1925 . 1927 [RFC5670] Eardley, P., Ed., "Metering and Marking Behaviour of PCN- 1928 Nodes", RFC 5670, DOI 10.17487/RFC5670, November 2009, 1929 . 1931 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1932 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1933 . 1935 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1936 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1937 June 2010, . 1939 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1940 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1941 2010, . 1943 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1944 and K. Carlberg, "Explicit Congestion Notification (ECN) 1945 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August 1946 2012, . 1948 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, 1949 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, 1950 DOI 10.17487/RFC6817, December 2012, 1951 . 
1953 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., 1954 Morris, J., Hansen, M., and R. Smith, "Privacy 1955 Considerations for Internet Protocols", RFC 6973, 1956 DOI 10.17487/RFC6973, July 2013, 1957 . 1959 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 1960 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 1961 DOI 10.17487/RFC7540, May 2015, 1962 . 1964 [RFC7560] Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe, 1965 "Problem Statement and Requirements for Increased Accuracy 1966 in Explicit Congestion Notification (ECN) Feedback", 1967 RFC 7560, DOI 10.17487/RFC7560, August 2015, 1968 . 1970 [RFC7567] Baker, F., Ed. and G. Fairhurst, Ed., "IETF 1971 Recommendations Regarding Active Queue Management", 1972 BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015, 1973 . 1975 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1976 Chaining (SFC) Architecture", RFC 7665, 1977 DOI 10.17487/RFC7665, October 2015, 1978 . 1980 [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1981 Concepts, Abstract Mechanism, and Requirements", RFC 7713, 1982 DOI 10.17487/RFC7713, December 2015, 1983 . 1985 [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, 1986 "Proportional Integral Controller Enhanced (PIE): A 1987 Lightweight Control Scheme to Address the Bufferbloat 1988 Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, 1989 . 1991 [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based 1992 on Proportional Integral Controller Enhanced (PIE) for 1993 Data-Over-Cable Service Interface Specifications (DOCSIS) 1994 Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February 1995 2017, . 1997 [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and 1998 Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, 1999 May 2017, . 2001 [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., 2002 and G.
Judd, "Data Center TCP (DCTCP): TCP Congestion 2003 Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, 2004 October 2017, . 2006 [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, 2007 J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler 2008 and Active Queue Management Algorithm", RFC 8290, 2009 DOI 10.17487/RFC8290, January 2018, 2010 . 2012 [RFC8298] Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation 2013 for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December 2014 2017, . 2016 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 2017 Notification (ECN) Experimentation", RFC 8311, 2018 DOI 10.17487/RFC8311, January 2018, 2019 . 2021 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 2022 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 2023 RFC 8312, DOI 10.17487/RFC8312, February 2018, 2024 . 2026 [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of 2027 Pervasive Encryption on Operators", RFC 8404, 2028 DOI 10.17487/RFC8404, July 2018, 2029 . 2031 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, 2032 "TCP Alternative Backoff with ECN (ABE)", RFC 8511, 2033 DOI 10.17487/RFC8511, December 2018, 2034 . 2036 [RFC8888] Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP 2037 Control Protocol (RTCP) Feedback for Congestion Control", 2038 RFC 8888, DOI 10.17487/RFC8888, January 2021, 2039 . 2041 [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based 2042 Multiplexed and Secure Transport", RFC 9000, 2043 DOI 10.17487/RFC9000, May 2021, 2044 . 2046 [RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure 2047 QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021, 2048 . 2050 [SCReAM] Johansson, I., "SCReAM", github repository, 2051 . 2054 [TCP-CA] Jacobson, V. and M.J. Karels, "Congestion Avoidance and 2055 Control", Lawrence Berkeley Labs Technical Report, 2056 November 1988, . 2058 [TCP-sub-mss-w] 2059 Briscoe, B. and K.
De Schepper, "Scaling TCP's Congestion 2060 Window for Small Round Trip Times", BT Technical Report 2061 TR-TUB8-2015-002, May 2015, 2062 . 2065 [UnorderedLTE] 2066 Austrheim, M.V., "Implementing immediate forwarding for 4G 2067 in a network simulator", Masters Thesis, Uni Oslo , June 2068 2019. 2070 Appendix A. Standardization items 2072 The following table includes all the items that will need to be 2073 standardized to provide a full L4S architecture. 2075 The table is too wide for the ASCII draft format, so it has been 2076 split into two, with a common column of row index numbers on the 2077 left. 2079 The columns in the second part of the table have the following 2080 meanings: 2082 WG: The IETF WG most relevant to this requirement. The "tcpm/iccrg" 2083 combination refers to the procedure typically used for congestion 2084 control changes, where tcpm owns the approval decision, but uses 2085 the iccrg for expert review [NewCC_Proc]; 2087 TCP: Applicable to all forms of TCP congestion control; 2089 DCTCP: Applicable to Data Center TCP as currently used (in 2090 controlled environments); 2092 DCTCP bis: Applicable to any future Data Center TCP congestion 2093 control intended for controlled environments; 2095 XXX Prague: Applicable to a Scalable variant of XXX (TCP/SCTP/RMCAT) 2096 congestion control. 
2098 +=====+========================+====================================+ 2099 | Req | Requirement | Reference | 2100 | # | | | 2101 +=====+========================+====================================+ 2102 | 0 | ARCHITECTURE | | 2103 +-----+------------------------+------------------------------------+ 2104 | 1 | L4S IDENTIFIER | [I-D.ietf-tsvwg-ecn-l4s-id] S.3 | 2105 +-----+------------------------+------------------------------------+ 2106 | 2 | DUAL QUEUE AQM | [I-D.ietf-tsvwg-aqm-dualq-coupled] | 2107 +-----+------------------------+------------------------------------+ 2108 | 3 | Suitable ECN | [I-D.ietf-tcpm-accurate-ecn] | 2109 | | Feedback | S.4.2, | 2110 | | | [I-D.stewart-tsvwg-sctpecn]. | 2111 +-----+------------------------+------------------------------------+ 2112 +-----+------------------------+------------------------------------+ 2113 | | SCALABLE TRANSPORT - | | 2114 | | SAFETY ADDITIONS | | 2115 +-----+------------------------+------------------------------------+ 2116 | 4-1 | Fall back to Reno/ | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3, | 2117 | | Cubic on loss | [RFC8257] | 2118 +-----+------------------------+------------------------------------+ 2119 | 4-2 | Fall back to Reno/ | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3 | 2120 | | Cubic if classic ECN | | 2121 | | bottleneck detected | | 2122 +-----+------------------------+------------------------------------+ 2123 +-----+------------------------+------------------------------------+ 2124 | 4-3 | Reduce RTT- | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3 | 2125 | | dependence | | 2126 +-----+------------------------+------------------------------------+ 2127 +-----+------------------------+------------------------------------+ 2128 | 4-4 | Scaling TCP's | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3, | 2129 | | Congestion Window | [TCP-sub-mss-w] | 2130 | | for Small Round Trip | | 2131 | | Times | | 2132 +-----+------------------------+------------------------------------+ 2133 | | SCALABLE TRANSPORT - | | 2134 | | 
PERFORMANCE | | 2135 | | ENHANCEMENTS | | 2136 +-----+------------------------+------------------------------------+ 2137 | 5-1 | Setting ECT in TCP | [I-D.ietf-tcpm-generalized-ecn] | 2138 | | Control Packets and | | 2139 | | Retransmissions | | 2140 +-----+------------------------+------------------------------------+ 2141 | 5-2 | Faster-than-additive | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 2142 | | increase | A.2.2) | 2143 +-----+------------------------+------------------------------------+ 2144 | 5-3 | Faster Convergence | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 2145 | | at Flow Start | A.2.2) | 2146 +-----+------------------------+------------------------------------+ 2148 Table 1 2150 +=====+========+=====+=======+===========+========+========+========+ 2151 | # | WG | TCP | DCTCP | DCTCP-bis | TCP | SCTP | RMCAT | 2152 | | | | | | Prague | Prague | Prague | 2153 +=====+========+=====+=======+===========+========+========+========+ 2154 | 0 | tsvwg | Y | Y | Y | Y | Y | Y | 2155 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2156 | 1 | tsvwg | | | Y | Y | Y | Y | 2157 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2158 | 2 | tsvwg | n/a | n/a | n/a | n/a | n/a | n/a | 2159 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2160 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2161 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2162 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2163 | 3 | tcpm | Y | Y | Y | Y | n/a | n/a | 2164 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2165 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2166 | 4-1 | tcpm | | Y | Y | Y | Y | Y | 2167 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2168 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2169 | 4-2 | tcpm/ | | | | Y | Y | ? | 2170 | | iccrg? 
| | | | | | | 2171 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2172 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2173 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2174 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2175 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2176 | 4-3 | tcpm/ | | | Y | Y | Y | ? | 2177 | | iccrg? | | | | | | | 2178 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2179 | 4-4 | tcpm | Y | Y | Y | Y | Y | ? | 2180 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2181 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2182 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2183 | 5-1 | tcpm | Y | Y | Y | Y | n/a | n/a | 2184 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2185 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2186 | 5-2 | tcpm/ | | | Y | Y | Y | ? | 2187 | | iccrg? | | | | | | | 2188 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2189 | 5-3 | tcpm/ | | | Y | Y | Y | ? | 2190 | | iccrg? | | | | | | | 2191 +-----+--------+-----+-------+-----------+--------+--------+--------+ 2193 Table 2 2195 Authors' Addresses 2196 Bob Briscoe (editor) 2197 Independent 2198 United Kingdom 2200 Email: ietf@bobbriscoe.net 2201 URI: http://bobbriscoe.net/ 2203 Koen De Schepper 2204 Nokia Bell Labs 2205 Antwerp 2206 Belgium 2208 Email: koen.de_schepper@nokia.com 2209 URI: https://www.bell-labs.com/usr/koen.de_schepper 2211 Marcelo Bagnulo 2212 Universidad Carlos III de Madrid 2213 Av. Universidad 30 2214 Leganes, Madrid 28911 2215 Spain 2217 Phone: 34 91 6249500 2218 Email: marcelo@it.uc3m.es 2219 URI: http://www.it.uc3m.es 2221 Greg White 2222 CableLabs 2223 United States of America 2225 Email: G.White@CableLabs.com