2 Transport Area Working Group B. Briscoe, Ed. 3 Internet-Draft Independent 4 Intended status: Informational K. De Schepper 5 Expires: 27 June 2022 Nokia Bell Labs 6 M. Bagnulo Braun 7 Universidad Carlos III de Madrid 8 G.
White 9 CableLabs 10 24 December 2021 12 Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service: 13 Architecture 14 draft-ietf-tsvwg-l4s-arch-15 16 Abstract 18 This document describes the L4S architecture, which enables Internet 19 applications to achieve Low queuing Latency, Low Loss, and Scalable 20 throughput (L4S). The insight on which L4S is based is that the root 21 cause of queuing delay is in the congestion controllers of senders, 22 not in the queue itself. With the L4S architecture all Internet 23 applications could (but do not have to) transition away from 24 congestion control algorithms that cause substantial queuing delay, 25 to a new class of congestion controls that induce very little 26 queuing, aided by explicit congestion signalling from the network. 27 This new class of congestion controls can provide low latency for 28 capacity-seeking flows, so applications can achieve both high 29 bandwidth and low latency. 31 The architecture primarily concerns incremental deployment. It 32 defines mechanisms that allow the new class of L4S congestion 33 controls to coexist with 'Classic' congestion controls in a shared 34 network. These mechanisms aim to ensure that the latency and 35 throughput performance using an L4S-compliant congestion controller 36 is usually much better (and rarely worse) than performance would have 37 been using a 'Classic' congestion controller, and that competing 38 flows continuing to use 'Classic' controllers are typically not 39 impacted by the presence of L4S. These characteristics are important 40 to encourage adoption of L4S congestion control algorithms and L4S 41 compliant network elements. 43 The L4S architecture consists of three components: network support to 44 isolate L4S traffic from classic traffic; protocol features that 45 allow network elements to identify L4S traffic; and host support for 46 L4S congestion controls. 
48 Status of This Memo 50 This Internet-Draft is submitted in full conformance with the 51 provisions of BCP 78 and BCP 79. 53 Internet-Drafts are working documents of the Internet Engineering 54 Task Force (IETF). Note that other groups may also distribute 55 working documents as Internet-Drafts. The list of current Internet- 56 Drafts is at https://datatracker.ietf.org/drafts/current/. 58 Internet-Drafts are draft documents valid for a maximum of six months 59 and may be updated, replaced, or obsoleted by other documents at any 60 time. It is inappropriate to use Internet-Drafts as reference 61 material or to cite them other than as "work in progress." 63 This Internet-Draft will expire on 27 June 2022. 65 Copyright Notice 67 Copyright (c) 2021 IETF Trust and the persons identified as the 68 document authors. All rights reserved. 70 This document is subject to BCP 78 and the IETF Trust's Legal 71 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 72 license-info) in effect on the date of publication of this document. 73 Please review these documents carefully, as they describe your rights 74 and restrictions with respect to this document. Code Components 75 extracted from this document must include Revised BSD License text as 76 described in Section 4.e of the Trust Legal Provisions and are 77 provided without warranty as described in the Revised BSD License. 79 Table of Contents 81 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 82 1.1. Document Roadmap . . . . . . . . . . . . . . . . . . . . 5 83 2. L4S Architecture Overview . . . . . . . . . . . . . . . . . . 5 84 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 7 85 4. L4S Architecture Components . . . . . . . . . . . . . . . . . 9 86 4.1. Protocol Mechanisms . . . . . . . . . . . . . . . . . . . 9 87 4.2. Network Components . . . . . . . . . . . . . . . . . . . 10 88 4.3. Host Mechanisms . . . . . . . . . . . . . . . . . . . . . 13 89 5. Rationale . . . . . . . 
. . . . . . . . . . . . . . . . . . . 15 90 5.1. Why These Primary Components? . . . . . . . . . . . . . . 15 91 5.2. What L4S adds to Existing Approaches . . . . . . . . . . 18 92 6. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 21 93 6.1. Applications . . . . . . . . . . . . . . . . . . . . . . 21 94 6.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 23 95 6.3. Applicability with Specific Link Technologies . . . . . . 24 96 6.4. Deployment Considerations . . . . . . . . . . . . . . . . 24 97 6.4.1. Deployment Topology . . . . . . . . . . . . . . . . . 25 98 6.4.2. Deployment Sequences . . . . . . . . . . . . . . . . 26 99 6.4.3. L4S Flow but Non-ECN Bottleneck . . . . . . . . . . . 29 100 6.4.4. L4S Flow but Classic ECN Bottleneck . . . . . . . . . 30 101 6.4.5. L4S AQM Deployment within Tunnels . . . . . . . . . . 30 102 7. IANA Considerations (to be removed by RFC Editor) . . . . . . 30 103 8. Security Considerations . . . . . . . . . . . . . . . . . . . 30 104 8.1. Traffic Rate (Non-)Policing . . . . . . . . . . . . . . . 30 105 8.2. 'Latency Friendliness' . . . . . . . . . . . . . . . . . 32 106 8.3. Interaction between Rate Policing and L4S . . . . . . . . 33 107 8.4. ECN Integrity . . . . . . . . . . . . . . . . . . . . . . 34 108 8.5. Privacy Considerations . . . . . . . . . . . . . . . . . 35 109 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 35 110 10. Informative References . . . . . . . . . . . . . . . . . . . 35 111 Appendix A. Standardization items . . . . . . . . . . . . . . . 45 112 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 48 114 1. Introduction 116 At any one time, it is increasingly common for all of the traffic in 117 a bottleneck link (e.g. 
a household's Internet access) to come from 118 applications that prefer low delay: interactive Web, Web services, 119 voice, conversational video, interactive video, interactive remote 120 presence, instant messaging, online gaming, remote desktop, cloud- 121 based applications and video-assisted remote control of machinery and 122 industrial processes. In the last decade or so, much has been done 123 to reduce propagation delay by placing caches or servers closer to 124 users. However, queuing remains a major, albeit intermittent, 125 component of latency. For instance, spikes of hundreds of 126 milliseconds are not uncommon, even with state-of-the-art active 127 queue management (AQM) [COBALT], [DOCSIS3AQM]. Queuing in access 128 network bottlenecks is typically configured to cause overall network 129 delay to roughly double during a long-running flow, relative to 130 expected base (unloaded) path delay [BufferSize]. Low loss is also 131 important because, for interactive applications, losses translate 132 into even longer retransmission delays. 134 It has been demonstrated that, once access network bit rates reach 135 levels now common in the developed world, increasing capacity offers 136 diminishing returns if latency (delay) is not addressed 137 [Dukkipati06], [Rajiullah15]. Therefore, the goal is an Internet 138 service with very Low queuing Latency, very Low Loss and Scalable 139 throughput (L4S). Very low queuing latency means less than 140 1 millisecond (ms) on average and less than about 2 ms at the 99th 141 percentile. This document describes the L4S architecture for 142 achieving these goals. 144 Differentiated services (Diffserv) offers Expedited Forwarding 145 (EF [RFC3246]) for some packets at the expense of others, but this 146 makes no difference when all (or most) of the traffic at a bottleneck 147 at any one time requires low latency.
In contrast, L4S still works 148 well when all traffic is L4S - a service that gives without taking 149 needs none of the configuration or management baggage (traffic 150 policing, traffic contracts) associated with favouring some traffic 151 flows over others. 153 Queuing delay degrades performance intermittently [Hohlfeld14]. It 154 occurs when a large enough capacity-seeking (e.g. TCP) flow is 155 running alongside the user's traffic in the bottleneck link, which is 156 typically in the access network, or when the low latency application 157 is itself a large capacity-seeking or adaptive rate (e.g. interactive 158 video) flow. At these times, the performance improvement from L4S 159 must be sufficient that network operators will be motivated to deploy 160 it. 162 Active Queue Management (AQM) is part of the solution to queuing 163 under load. AQM improves performance for all traffic, but there is a 164 limit to how much queuing delay can be reduced by solely changing the 165 network, without addressing the root of the problem. 167 The root of the problem is the presence of standard TCP congestion 168 control (Reno [RFC5681]) or compatible variants (e.g. TCP 169 Cubic [RFC8312]). We shall use the term 'Classic' for these Reno- 170 friendly congestion controls. Classic congestion controls induce 171 relatively large saw-tooth-shaped excursions up the queue and down 172 again, which have been growing as flow rate scales [RFC3649]. So if 173 a network operator naively attempts to reduce queuing delay by 174 configuring an AQM to operate at a shallower queue, a Classic 175 congestion control will significantly underutilize the link at the 176 bottom of every saw-tooth. 178 It has been demonstrated that if the sending host replaces a Classic 179 congestion control with a 'Scalable' alternative, when a suitable AQM 180 is deployed in the network the performance under load of all the
For 182 instance, queuing delay under heavy load with the example DCTCP/DualQ 183 solution cited below on a DSL or Ethernet link is roughly 1 to 2 184 milliseconds at the 99th percentile without losing link 185 utilization [DualPI2Linux], [DCttH19] (for other link types, see 186 Section 6.3). This compares with 5-20 ms on _average_ with a Classic 187 congestion control and current state-of-the-art AQMs such as FQ- 188 CoDel [RFC8290], PIE [RFC8033] or DOCSIS PIE [RFC8034] and about 189 20-30 ms at the 99th percentile [DualPI2Linux]. 191 L4S is designed for incremental deployment. It is possible to deploy 192 the L4S service at a bottleneck link alongside the existing best 193 efforts service [DualPI2Linux] so that unmodified applications can 194 start using it as soon as the sender's stack is updated. Access 195 networks are typically designed with one link as the bottleneck for 196 each site (which might be a home, small enterprise or mobile device), 197 so deployment at either or both ends of this link should give nearly 198 all the benefit in the respective direction. With some transport 199 protocols, namely TCP and SCTP, the sender has to check for suitably 200 updated receiver feedback, whereas with more recent transport 201 protocols such as QUIC and DCCP, all receivers have always been 202 suitable. 204 This document presents the L4S architecture, by describing and 205 justifying the component parts and how they interact to provide the 206 scalable, low latency, low loss Internet service. It also details 207 the approach to incremental deployment, as briefly summarized above. 209 1.1. Document Roadmap 211 This document describes the L4S architecture in three passes. First 212 this brief overview gives the very high level idea and states the 213 main components with minimal rationale. This is only intended to 214 give some context for the terminology definitions that follow in 215 Section 3, and to explain the structure of the rest of the document. 
216 Then Section 4 goes into more detail on each component with some 217 rationale, but still mostly stating what the architecture is, rather 218 than why. Finally Section 5 justifies why each element of the 219 solution was chosen (Section 5.1) and why these choices were 220 different from other solutions (Section 5.2). 222 Having described the architecture, Section 6 clarifies its 223 applicability; that is, the applications and use-cases that motivated 224 the design, the challenges applying the architecture to various link 225 technologies, and various incremental deployment models, including 226 the two main deployment topologies, different sequences for 227 incremental deployment and various interactions with pre-existing 228 approaches. The document ends with the usual tail pieces, including 229 extensive discussion of traffic policing and other security 230 considerations (Section 8). 232 2. L4S Architecture Overview 234 Below we outline the three main components of the L4S architecture: 235 1) the scalable congestion control on the sending host; 2) the AQM at 236 the network bottleneck; and 3) the protocol between them. 238 But first, the main point to grasp is that low latency is not 239 provided by the network - low latency results from the careful 240 behaviour of the scalable congestion controllers used by L4S senders. 241 The network does have a role - primarily to isolate the low latency 242 of the carefully behaving L4S traffic from the higher queuing delay 243 needed by traffic with pre-existing Classic behaviour. The network 244 also alters the way it signals queue growth to the transport: it 245 uses the Explicit Congestion Notification (ECN) protocol, but it 246 signals the very start of queue growth - immediately without the 247 smoothing delay typical of Classic AQMs. Because ECN support is 248 essential for L4S, senders use the ECN field as the protocol to 249 identify to the network which packets are L4S and which are Classic.
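The ECN-based identification just described can be sketched in a few lines. This is an illustrative sketch only, not part of the architecture; the codepoint values are those defined in RFC 3168.

```python
# Illustrative sketch (not from this draft): how a network node could
# classify packets into the L4S or Classic treatment using only the
# 2-bit IP-ECN field.  Codepoint values are as defined in RFC 3168.
NOT_ECT = 0b00  # Not ECN-Capable Transport
ECT_1   = 0b01  # ECN-Capable Transport (1) - the L4S identifier
ECT_0   = 0b10  # ECN-Capable Transport (0) - Classic ECN
CE      = 0b11  # Congestion Experienced

def classify(ecn: int) -> str:
    """ECT(1) identifies L4S; CE is also classified to the L4S queue,
    because both treatments use CE to signal congestion."""
    return "L4S" if ecn in (ECT_1, CE) else "Classic"
```

Note that classifying CE into the L4S queue is a deliberate compromise: a Classic AQM earlier on the path might have marked an ECT(0) packet CE, a corner case examined in Section 4.1.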
251 1) Host: Scalable congestion controls already exist. They solve the 252 scaling problem with Classic congestion controls, such as Reno or 253 Cubic. Because flow rate has scaled since TCP congestion control 254 was first designed in 1988, assuming the flow lasts long enough, 255 it now takes hundreds of round trips (and growing) to recover 256 after a congestion signal (whether a loss or an ECN mark) as shown 257 in the examples in Section 5.1 and [RFC3649]. Therefore control 258 of queuing and utilization becomes very slack, and the slightest 259 disturbances (e.g. from new flows starting) prevent a high rate 260 from being attained. 262 With a scalable congestion control, the average time from one 263 congestion signal to the next (the recovery time) remains 264 invariant as the flow rate scales, all other factors being equal. 265 This maintains the same degree of control over queueing and 266 utilization whatever the flow rate, as well as ensuring that high 267 throughput is more robust to disturbances. The scalable control 268 used most widely (in controlled environments) is Data Center TCP 269 (DCTCP [RFC8257]), which has been implemented and deployed in 270 Windows Server Editions (since 2012), in Linux and in FreeBSD. 271 Although DCTCP as-is functions well over wide-area round trip 272 times, most implementations lack certain safety features that 273 would be necessary for use outside controlled environments like 274 data centres (see Section 6.4.3 and Appendix A). So scalable 275 congestion control needs to be implemented in TCP and other 276 transport protocols (QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed, 277 between the present document being drafted and published, the 278 following scalable congestion controls were implemented: TCP 279 Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT 280 SCReAM controller [SCReAM] and the L4S ECN part of BBRv2 [BBRv2] 281 intended for TCP and QUIC transports. 
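The scaling contrast in point 1 can be illustrated numerically. This is a back-of-envelope sketch under simplifying assumptions (Reno-style AIMD adding one segment per round trip and halving the window on each signal; 1500 byte segments); DCTCP's average of ~2 signals per round trip is taken from the definition in Section 3.

```python
# Illustrative calculation: the time between congestion signals grows
# with flow rate for a Classic (Reno-friendly) control, but stays
# constant for a scalable control such as DCTCP.
SEG_BITS = 1500 * 8  # assumed segment size, in bits

def reno_recovery_time(rate_bps: float, rtt_s: float) -> float:
    """Seconds for Reno-style AIMD to regain its rate after halving:
    the window W = rate*RTT/segment, recovered at 1 segment per RTT,
    so recovery takes (W/2) round trips."""
    w = rate_bps * rtt_s / SEG_BITS
    return (w / 2) * rtt_s

def dctcp_signal_interval(rate_bps: float, rtt_s: float) -> float:
    """A scalable control like DCTCP averages ~2 congestion signals
    per round trip at any flow rate, so the interval is ~RTT/2."""
    return rtt_s / 2

# At 100 Mb/s and 20 ms RTT, Reno needs ~83 round trips (~1.7 s) to
# recover; at 1 Gb/s, ~833 round trips (~17 s).  DCTCP's signal
# interval stays at ~10 ms regardless of rate.
```

The factor-of-ten growth in recovery time for each factor-of-ten growth in rate is the "hundreds of round trips (and growing)" problem described above.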
283 2) Network: L4S traffic needs to be isolated from the queuing 284 latency of Classic traffic. One queue per application flow (FQ) 285 is one way to achieve this, e.g. FQ-CoDel [RFC8290]. However, 286 just two queues is sufficient and does not require inspection of 287 transport layer headers in the network, which is not always 288 possible (see Section 5.2). With just two queues, it might seem 289 impossible to know how much capacity to schedule for each queue 290 without inspecting how many flows at any one time are using each. 291 And it would be undesirable to arbitrarily divide access network 292 capacity into two partitions. The Dual Queue Coupled AQM was 293 developed as a minimal complexity solution to this problem. It 294 acts like a 'semi-permeable' membrane that partitions latency but 295 not bandwidth. As such, the two queues are for transition from 296 Classic to L4S behaviour, not bandwidth prioritization. 298 Section 4 gives a high level explanation of how the per-flow-queue 299 (FQ) and DualQ variants of L4S work, and 300 [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of the 301 DualQ Coupled AQM framework. A specific marking algorithm is not 302 mandated for L4S AQMs. Appendices of 303 [I-D.ietf-tsvwg-aqm-dualq-coupled] give non-normative examples 304 that have been implemented and evaluated, and give recommended 305 default parameter settings. It is expected that L4S experiments 306 will improve knowledge of parameter settings and whether the set 307 of marking algorithms needs to be limited. 309 3) Protocol: A host needs to distinguish L4S and Classic packets 310 with an identifier so that the network can classify them into 311 their separate treatments. The L4S identifier 312 spec. [I-D.ietf-tsvwg-ecn-l4s-id] concludes that all alternatives 313 involve compromises, but the ECT(1) and CE codepoints of the ECN 314 field represent a workable solution. 
As already explained, the 315 network also uses ECN to immediately signal the very start of 316 queue growth to the transport. 318 3. Terminology 320 Note: The following definitions are copied from 321 [I-D.ietf-tsvwg-ecn-l4s-id] for convenience. If there are accidental 322 differences, those in [I-D.ietf-tsvwg-ecn-l4s-id] take precedence. 324 Classic Congestion Control: A congestion control behaviour that can 325 co-exist with standard Reno [RFC5681] without causing 326 significantly negative impact on its flow rate [RFC5033]. The 327 scaling problem with Classic congestion control is explained, with 328 examples, in Section 5.1 and in [RFC3649]. 330 Scalable Congestion Control: A congestion control where the average 331 time from one congestion signal to the next (the recovery time) 332 remains invariant as the flow rate scales, all other factors being 333 equal. For instance, DCTCP averages 2 congestion signals per 334 round-trip whatever the flow rate, as do other recently developed 335 scalable congestion controls (e.g. Relentless TCP [Mathis09], TCP 336 Prague [I-D.briscoe-iccrg-prague-congestion-control], 337 [PragueLinux], BBRv2 [BBRv2] and the L4S variant of SCReAM for 338 real-time media [SCReAM], [RFC8298]). See Section 4.3 of 339 [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation. 341 Classic service: The Classic service is intended for all the 342 congestion control behaviours that co-exist with Reno [RFC5681] 343 (e.g. Reno itself, Cubic [RFC8312], 344 Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term 345 'Classic queue' means a queue providing the Classic service. 347 Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' 348 service is intended for traffic from scalable congestion control 349 algorithms, such as the Prague congestion 350 control [I-D.briscoe-iccrg-prague-congestion-control], which was 351 derived from DCTCP [RFC8257].
The L4S service is for more 352 general traffic than just TCP Prague--it allows the set of 353 congestion controls with similar scaling properties to Prague to 354 evolve, such as the examples listed above (Relentless, SCReAM). 355 The term 'L4S queue' means a queue providing the L4S service. 357 The terms Classic or L4S can also qualify other nouns, such as 358 'queue', 'codepoint', 'identifier', 'classification', 'packet', 359 'flow'. For example: an L4S packet means a packet with an L4S 360 identifier sent from an L4S congestion control. 362 Both Classic and L4S services can cope with a proportion of 363 unresponsive or less-responsive traffic as well, but in the L4S 364 case its rate has to be smooth enough or low enough not to build a 365 queue (e.g. DNS, VoIP, game sync datagrams, etc.). 367 Reno-friendly: The subset of Classic traffic that is friendly to the 368 standard Reno congestion control defined for TCP in [RFC5681]. 369 The TFRC spec. [RFC5348] indirectly implies that 'friendly' is 370 defined as "generally within a factor of two of the sending rate 371 of a TCP flow under the same conditions". Reno-friendly is used 372 here in place of 'TCP-friendly', given the latter has become 373 imprecise, because the TCP protocol is now used with so many 374 different congestion control behaviours, and Reno is used in non- 375 TCP transports such as QUIC [RFC9000]. 377 Classic ECN: The original Explicit Congestion Notification (ECN) 378 protocol [RFC3168], which requires ECN signals to be treated as 379 equivalent to drops, both when generated in the network and when 380 responded to by the sender. 382 L4S uses the ECN field as an identifier 383 [I-D.ietf-tsvwg-ecn-l4s-id] with the names for the four codepoints 384 of the 2-bit IP-ECN field unchanged from those defined in 385 [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where ECT stands for 386 ECN-Capable Transport and CE stands for Congestion Experienced.
A 387 packet marked with the CE codepoint is termed 'ECN-marked' or 388 sometimes just 'marked' where the context makes ECN obvious. 390 Site: A home, mobile device, small enterprise or campus, where the 391 network bottleneck is typically the access link to the site. Not 392 all network arrangements fit this model but it is a useful, widely 393 applicable generalization. 395 4. L4S Architecture Components 397 The L4S architecture is composed of the elements in the following 398 three subsections. 400 4.1. Protocol Mechanisms 402 The L4S architecture involves: a) unassignment of an identifier; b) 403 reassignment of the same identifier; and c) optional further 404 identifiers: 406 a. An essential aspect of a scalable congestion control is the use 407 of explicit congestion signals. 'Classic' ECN [RFC3168] requires 408 an ECN signal to be treated as equivalent to drop, both when it 409 is generated in the network and when it is responded to by hosts. 410 L4S needs networks and hosts to support a more fine-grained 411 meaning for each ECN signal that is less severe than a drop, so 412 that the L4S signals: 414 * can be much more frequent; 416 * can be signalled immediately, without the significant delay 417 required to smooth out fluctuations in the queue. 419 To enable L4S, the standards track Classic ECN spec. [RFC3168] 420 has had to be updated to allow L4S packets to depart from the 421 'equivalent to drop' constraint. [RFC8311] is a standards track 422 update to relax specific requirements in RFC 3168 (and certain 423 other standards track RFCs), which clears the way for the 424 experimental changes proposed for L4S. [RFC8311] also 425 reclassifies the original experimental assignment of the ECT(1) 426 codepoint as an ECN nonce [RFC3540] as historic. 428 b. [I-D.ietf-tsvwg-ecn-l4s-id] specifies that ECT(1) is used as the 429 identifier to classify L4S packets into a separate treatment from 430 Classic packets. 
This satisfies the requirement for identifying 431 an alternative ECN treatment in [RFC4774]. 433 The CE codepoint is used to indicate Congestion Experienced by 434 both L4S and Classic treatments. This raises the concern that a 435 Classic AQM earlier on the path might have marked some ECT(0) 436 packets as CE. Then these packets will be erroneously classified 437 into the L4S queue. Appendix B of [I-D.ietf-tsvwg-ecn-l4s-id] 438 explains why five unlikely eventualities all have to coincide for 439 this to have any detrimental effect, which even then would only 440 involve a vanishingly small likelihood of a spurious 441 retransmission. 443 c. A network operator might wish to include certain unresponsive, 444 non-L4S traffic in the L4S queue if it is deemed to be smoothly 445 enough paced and low enough rate not to build a queue. For 446 instance, VoIP, low rate datagrams to sync online games, 447 relatively low rate application-limited traffic, DNS, LDAP, etc. 448 This traffic would need to be tagged with specific identifiers, 449 e.g. a low latency Diffserv Codepoint such as Expedited 450 Forwarding (EF [RFC3246]), Non-Queue-Building 451 (NQB [I-D.ietf-tsvwg-nqb]), or operator-specific identifiers. 453 4.2. Network Components 455 The L4S architecture aims to provide low latency without the _need_ 456 for per-flow operations in network components. Nonetheless, the 457 architecture does not preclude per-flow solutions. The following 458 bullets describe the known arrangements: a) the DualQ Coupled AQM 459 with an L4S AQM in one queue coupled from a Classic AQM in the other; 460 b) Per-Flow Queues with an instance of a Classic and an L4S AQM in 461 each queue; c) Dual queues with per-flow AQMs, but no per-flow 462 queues: 464 a. 
The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the 465 'semi-permeable' membrane property mentioned earlier as follows: 467 * Latency isolation: Two separate queues are used to isolate L4S 468 queuing delay from the larger queue that Classic traffic needs 469 to maintain full utilization. 471 * Bandwidth pooling: The two queues act as if they are a single 472 pool of bandwidth in which flows of either type get roughly 473 equal throughput without the scheduler needing to identify any 474 flows. This is achieved by having an AQM in each queue, but 475 the Classic AQM provides a congestion signal to both queues in 476 a manner that ensures a consistent response from the two 477 classes of congestion control. Specifically, the Classic AQM 478 generates a drop/mark probability based on congestion in its 479 own queue, which it uses both to drop/mark packets in its own 480 queue and to affect the marking probability in the L4S queue. 481 The strength of the coupling of the congestion signalling 482 between the two queues is enough to make the L4S flows slow 483 down to leave the right amount of capacity for the Classic 484 flows (as they would if they were the same type of traffic 485 sharing the same queue). 487 Then the scheduler can serve the L4S queue with priority (denoted 488 by the '1' on the higher priority input), because the L4S traffic 489 isn't offering up enough traffic to use all the priority that it 490 is given. 
Therefore: 492 * for latency isolation on short time-scales (sub-round-trip) 493 the prioritization of the L4S queue protects its low latency 494 by allowing bursts to dissipate quickly; 496 * but for bandwidth pooling on longer time-scales (round-trip 497 and longer) the Classic queue creates an equal and opposite 498 pressure against the L4S traffic to ensure that neither has 499 priority when it comes to bandwidth - the tension between 500 prioritizing L4S and coupling the marking from the Classic AQM 501 results in approximate per-flow fairness. 503 To protect against unresponsive traffic taking advantage of the 504 prioritization of the L4S queue and starving the Classic queue, 505 it is advisable for the priority to be conditional, not strict 506 (see Appendix A of [I-D.ietf-tsvwg-aqm-dualq-coupled]). 508 When there is no Classic traffic, the L4S queue's own AQM comes 509 into play. It starts congestion marking with a very shallow 510 queue, so L4S traffic maintains very low queuing delay. 512 If either queue becomes persistently overloaded, drop of ECN- 513 capable packets is introduced, as recommended in Section 7 of 514 [RFC3168] and Section 4.2.1 of [RFC7567]. Then both queues 515 introduce the same level of drop (not shown in the figure). 517 The Dual Queue Coupled AQM has been specified as generically as 518 possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying 519 the particular AQMs to use in the two queues so that designers 520 are free to implement diverse ideas. Informational appendices in 521 that draft give pseudocode examples of two different specific AQM 522 approaches: one called DualPI2 (pronounced Dual PI 523 Squared) [DualPI2Linux] that uses the PI2 variant of PIE, and a 524 zero-config variant of RED called Curvy RED. A DualQ Coupled AQM 525 based on PIE has also been specified and implemented for Low 526 Latency DOCSIS [DOCSIS3.1].

528                     (3)                  (2)
529              .-------^------..------------^------------------.
530    ,-(1)-----.                           _____
531    ; ________  :          L4S  -------. |     |
532    :|Scalable| :                  _\   ||__\_|mark |
533    :| sender | :  __________      /  / || /  |_____|\     _________
534    :|________|\; |          |/ -------'        ^      \1|condit'nl|
535    `---------'\_|  IP-ECN   |         Coupling :       \|priority |_\
536     ________ /  |Classifier |                  :       /|scheduler| /
537    |Classic |/  |__________|\    -------.    __:__   / |_________|
538    | sender |                \_\ ||     |   ||__\_|mark/|/
539    |________|                  / ||     |   || /  |drop |
540                  Classic -------'           |_____|

542       Figure 1: Components of an L4S DualQ Coupled AQM Solution: 1)
543          Scalable Sending Host; 2) Isolation in separate network
544              queues; and 3) Packet Identification Protocol

546 b. Per-Flow Queues and AQMs: A scheduler with per-flow queues such 547 as FQ-CoDel or FQ-PIE can be used for L4S. For instance within 548 each queue of an FQ-CoDel system, as well as a CoDel AQM, there 549 is typically also the option of ECN marking at an immediate 550 (unsmoothed) shallow threshold to support use in data centres 551 (see Sec.5.2.7 of [RFC8290]). In Linux, this has been modified 552 so that the shallow threshold can be solely applied to ECT(1) 553 packets [FQ_CoDel_Thresh]. Then if there is a flow of non-ECN or 554 ECT(0) packets in the per-flow-queue, the Classic AQM (e.g. 555 CoDel) is applied; while if there is a flow of ECT(1) packets in 556 the queue, the shallower (typically sub-millisecond) threshold is 557 applied. In addition, ECT(0) and not-ECT packets could 558 potentially be classified into a separate flow-queue from ECT(1) 559 and CE packets to avoid them mixing if they share a common flow- 560 identifier (e.g. in a VPN). 562 c. Dual-queues, but per-flow AQMs: It should also be possible to use 563 dual queues for isolation, but with per-flow marking to control 564 flow-rates (instead of the coupled per-queue marking of the Dual 565 Queue Coupled AQM). One of the two queues would be for isolating 566 L4S packets, which would be classified by the ECN codepoint. 567 Flow rates could be controlled by flow-specific marking.
The 568 policy goal of the marking could be to differentiate flow rates 569 (e.g. [Nadas20], which requires additional signalling of a per- 570 flow 'value'), or to equalize flow-rates (perhaps in a similar 571 way to Approx Fair CoDel [AFCD], 572 [I-D.morton-tsvwg-codel-approx-fair], but with two queues not 573 one). 575 Note that whenever the term 'DualQ' is used loosely without 576 saying whether marking is per-queue or per-flow, it means a dual 577 queue AQM with per-queue marking. 579 4.3. Host Mechanisms 581 The L4S architecture includes two main mechanisms in the end host 582 that we enumerate next: 584 a. Scalable Congestion Control at the sender: Section 2 defines a 585 scalable congestion control as one where the average time from 586 one congestion signal to the next (the recovery time) remains 587 invariant as the flow rate scales, all other factors being equal. 588 Data Center TCP is the most widely used example. It has been 589 documented as an informational record of the protocol currently 590 in use in controlled environments [RFC8257]. A draft list of 591 safety and performance improvements for a scalable congestion 592 control to be usable on the public Internet has been drawn up 593 (the so-called 'Prague L4S requirements' in Appendix A of 595 [I-D.ietf-tsvwg-ecn-l4s-id]). The subset that involves risk of 596 harm to others has been captured as normative requirements in 597 Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id]. TCP 598 Prague [I-D.briscoe-iccrg-prague-congestion-control] has been 599 implemented in Linux as a reference implementation to address 600 these requirements [PragueLinux]. 602 Transport protocols other than TCP use various congestion 603 controls that are designed to be friendly with Reno. Before they 604 can use the L4S service, they will need to be updated to 605 implement a scalable congestion response, which they will have to 606 indicate by using the ECT(1) codepoint.
Scalable variants are 607 under consideration for more recent transport protocols, 608 e.g. QUIC, and the L4S ECN part of BBRv2 [BBRv2] is a scalable 609 congestion control intended for the TCP and QUIC transports, 610 amongst others. Also, an L4S variant of the RMCAT SCReAM 611 controller [RFC8298] has been implemented [SCReAM] for media 612 transported over RTP. 614 Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] defines scalable 615 congestion control in more detail, and specifies the 616 requirements that an L4S scalable congestion control has to 617 comply with. 619 b. The ECN feedback in some transport protocols is already 620 sufficiently fine-grained for L4S (specifically DCCP [RFC4340] 621 and QUIC [RFC9000]). But others either require update or are in 622 the process of being updated: 624 * For the case of TCP, the feedback protocol for ECN embeds the 625 assumption from Classic ECN [RFC3168] that an ECN mark is 626 equivalent to a drop, making it unusable for a scalable TCP. 627 Therefore, the implementation of TCP receivers will have to be 628 upgraded [RFC7560]. Work to standardize and implement more 629 accurate ECN feedback for TCP (AccECN) is in 630 progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux]. 632 * ECN feedback is only roughly sketched in an appendix of the 633 SCTP specification [RFC4960]. A fuller specification has been 634 proposed in a long-expired draft [I-D.stewart-tsvwg-sctpecn], 635 which would need to be implemented and deployed before SCTP 636 could support L4S. 638 * For RTP, sufficient ECN feedback was defined in [RFC6679], but 639 [RFC8888] defines the latest standards track improvements. 641 5. Rationale 643 5.1. Why These Primary Components? 645 Explicit congestion signalling (protocol): Explicit congestion 646 signalling is a key part of the L4S approach.
In contrast, use of 647 drop as a congestion signal creates a tension because drop is both 648 an impairment (less would be better) and a useful signal (more 649 would be better): 651 * Explicit congestion signals can be used many times per round 652 trip, to keep tight control, without any impairment. Under 653 heavy load, even more explicit signals can be applied so the 654 queue can be kept short whatever the load. In contrast, 655 Classic AQMs have to introduce very high packet drop at high 656 load to keep the queue short. By using ECN, an L4S congestion 657 control's sawtooth reduction can be smaller and therefore 658 return to the operating point more often, without worrying that 659 more sawteeth will cause more signals. The consequent smaller 660 amplitude sawteeth fit between an empty queue and a very 661 shallow marking threshold (~1 ms in the public Internet), so 662 queue delay variation can be very low, without risk of under- 663 utilization. 665 * Explicit congestion signals can be emitted immediately to track 666 fluctuations of the queue. L4S shifts smoothing from the 667 network to the host. The network doesn't know the round trip 668 times of any of the flows. So if the network is responsible 669 for smoothing (as in the Classic approach), it has to assume a 670 worst case RTT, otherwise long RTT flows would become unstable. 671 This delays Classic congestion signals by 100-200 ms. In 672 contrast, each host knows its own round trip time. So, in the 673 L4S approach, the host can smooth each flow over its own RTT, 674 introducing no more smoothing delay than strictly necessary 675 (usually only a few milliseconds). A host can also choose not 676 to introduce any smoothing delay if appropriate, e.g. during 677 flow start-up. 679 Neither of the above is feasible if explicit congestion 680 signalling has to be considered 'equivalent to drop' (as was 681 required with Classic ECN [RFC3168]), because drop is an 682 impairment as well as a signal.
So drop cannot be excessively 683 frequent, and drop cannot be immediate, otherwise too many drops 684 would turn out to have been due to only a transient fluctuation in 685 the queue that would not have warranted dropping a packet in 686 hindsight. Therefore, in an L4S AQM, the L4S queue uses a new L4S 687 variant of ECN that is not equivalent to drop (see section 5.2 of 688 [I-D.ietf-tsvwg-ecn-l4s-id]), while the Classic queue uses either 689 Classic ECN [RFC3168] or drop, which are equivalent to each other. 691 Before Classic ECN was standardized, there were various proposals 692 to give an ECN mark a different meaning from drop. However, there 693 was no particular reason to agree on any one of the alternative 694 meanings, so 'equivalent to drop' was the only compromise that 695 could be reached. RFC 3168 contains a statement that: 697 "An environment where all end nodes were ECN-Capable could 698 allow new criteria to be developed for setting the CE 699 codepoint, and new congestion control mechanisms for end-node 700 reaction to CE packets. However, this is a research issue, and 701 as such is not addressed in this document." 703 Latency isolation (network): L4S congestion controls keep queue 704 delay low whereas Classic congestion controls need a queue of the 705 order of the RTT to avoid under-utilization. One queue cannot 706 have two lengths; therefore, L4S traffic needs to be isolated in a 707 separate queue (e.g. DualQ) or queues (e.g. FQ). 709 Coupled congestion notification: Coupling the congestion 710 notification between two queues as in the DualQ Coupled AQM is not 711 necessarily essential, but it is a simple way to allow senders to 712 determine their rate, packet by packet, rather than be overridden 713 by a network scheduler. An alternative is for a network scheduler 714 to control the rate of each application flow (see discussion in 715 Section 5.2).
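The coupling between the two queues can be sketched in a few lines. The following is an illustrative model only, loosely following the pseudocode in the appendices of [I-D.ietf-tsvwg-aqm-dualq-coupled]; the function and variable names are ours, and k = 2 is that draft's recommended default, not a requirement of this architecture:

```python
# Sketch (ours, non-normative) of the DualQ coupling law from
# [I-D.ietf-tsvwg-aqm-dualq-coupled]: a base probability p' is driven
# by the Classic queue's AQM; Classic traffic sees p' squared, while
# L4S traffic is marked at the coupled probability k * p'.

K = 2.0  # coupling factor; k = 2 is the default recommended in the draft

def classic_prob(p_base: float) -> float:
    """Classic queue drops/marks with the square of the base probability."""
    return min(p_base ** 2, 1.0)

def l4s_prob(p_base: float, p_l4s_native: float = 0.0) -> float:
    """L4S queue marks at the coupled probability k * p', or its own
    shallow-threshold native marking probability, whichever is higher."""
    return min(max(K * p_base, p_l4s_native), 1.0)

# Example: with p' = 0.1, the Classic queue sees 1% drop/marking while
# the L4S queue sees 20% ECN marking.
print(classic_prob(0.1), l4s_prob(0.1))
```

Squaring the base probability on the Classic side counterbalances the square root in the Reno rate equation, which is why coupled marking at k.p' leaves flows in the two queues with approximately equal rates, as described above.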
717 L4S packet identifier (protocol): Once there are at least two 718 treatments in the network, hosts need an identifier at the IP 719 layer to distinguish which treatment they intend to use. 721 Scalable congestion notification: A scalable congestion control in 722 the host keeps the signalling frequency from the network high 723 whatever the flow rate, so that queue delay variations can be 724 small when conditions are stable, and rate can track variations in 725 available capacity as rapidly as possible otherwise. 727 Low loss: Latency is not the only concern of L4S. The 'Low Loss' 728 part of the name denotes that L4S generally achieves zero 729 congestion loss due to its use of ECN. Otherwise, loss would 730 itself cause delay, particularly for short flows, due to 731 retransmission delay [RFC2884]. 733 Scalable throughput: The "Scalable throughput" part of the name 734 denotes that the per-flow throughput of scalable congestion 735 controls should scale indefinitely, avoiding the imminent scaling 736 problems with Reno-friendly congestion control 737 algorithms [RFC3649]. It was known when TCP congestion avoidance 738 was first developed in 1988 that it would not scale to high 739 bandwidth-delay products (see footnote 6 in [TCP-CA]). Today, 740 regular broadband flow rates over WAN distances are already beyond 741 the scaling range of Classic Reno congestion control. So `less 742 unscalable' Cubic [RFC8312] and Compound [I-D.sridharan-tcpm-ctcp] 743 variants of TCP have been successfully deployed. However, these 744 are now approaching their scaling limits. 746 For instance, we will consider a scenario with a maximum RTT of 747 30 ms at the peak of each sawtooth. As Reno packet rate scales 8x 748 from 1,250 to 10,000 packet/s (from 15 to 120 Mb/s with 1500 B 749 packets), the time to recover from a congestion event rises 750 proportionately by 8x as well, from 422 ms to 3.38 s. 
It is 751 clearly problematic for a congestion control to take multiple 752 seconds to recover from each congestion event. Cubic [RFC8312] 753 was developed to be less unscalable, but it is approaching its 754 scaling limit; with the same max RTT of 30 ms, at 120 Mb/s Cubic 755 is still fully in its Reno-friendly mode, so it takes about 4.3 s 756 to recover. However, once the flow rate scales by 8x again to 757 960 Mb/s it enters true Cubic mode, with a recovery time of 758 12.2 s. From then on, each further scaling by 8x doubles Cubic's 759 recovery time (because the cube root of 8 is 2), e.g. at 7.68 Gb/s 760 the recovery time is 24.3 s. In contrast, a scalable congestion 761 control like DCTCP or TCP Prague induces 2 congestion signals per 762 round trip on average, which remains invariant for any flow rate, 763 keeping dynamic control very tight. 765 For a feel of where the global average lone-flow download sits on 766 this scale at the time of writing (2021), according to [BDPdata] 767 globally averaged fixed access capacity was 103 Mb/s in 2020 and 768 averaged base RTT to a CDN was 25-34 ms in 2019. Averaging of per- 769 country data was weighted by Internet user population (data 770 collected globally is necessarily of variable quality, but the 771 paper does double-check that the outcome compares well against a 772 second source). So a lone Cubic flow would at best take about 200 773 round trips (5 s) to recover from each of its sawtooth reductions, 774 if the flow even lasted that long. This is described as 'at best' 775 because it assumes everyone uses an AQM, whereas in reality most 776 users still have a (probably bloated) tail-drop buffer. In the 777 tail-drop case, likely average recovery time would be at least 4x 778 5 s, if not more, because RTT under load would be at least double 779 that of an AQM, and recovery time depends on the square of RTT.
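The true-Cubic recovery times quoted above can be approximated with a simple model. The sketch below is ours, not the draft's own calculation: it uses Cubic's recovery time K = cbrt(W_max*(1-beta)/C) from [RFC8312], with that RFC's constants C = 0.4 and beta = 0.7, and reproduces the 12.2 s and 24.3 s figures for the 30 ms max-RTT scenario:

```python
# Rough model (ours) of the Cubic recovery times quoted above.
# Per RFC 8312, after a reduction to beta * W_max, Cubic regains
# W_max after K = cbrt(W_max * (1 - beta) / C) seconds.

def window_packets(rate_bps: float, rtt_s: float, mtu_bytes: int = 1500) -> float:
    """Congestion window in packets for a given flow rate and RTT."""
    return rate_bps * rtt_s / (8 * mtu_bytes)

def cubic_recovery_s(rate_bps: float, rtt_s: float,
                     c: float = 0.4, beta: float = 0.7) -> float:
    """Time for true-mode Cubic to regain W_max after one reduction."""
    w_max = window_packets(rate_bps, rtt_s)
    return (w_max * (1 - beta) / c) ** (1 / 3)

# Scaling the rate 8x doubles true-Cubic recovery time (cube root of
# 8 is 2), whereas Reno's recovery time (~W/2 round trips) scales
# linearly with the rate.
print(round(cubic_recovery_s(960e6, 0.03), 1))   # ~12.2 s
print(round(cubic_recovery_s(7.68e9, 0.03), 1))  # ~24.3 s
```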
781 Although work on scaling congestion controls tends to start with 782 TCP as the transport, the above is not intended to exclude other 783 transports (e.g. SCTP, QUIC) or less elastic algorithms 784 (e.g. RMCAT), which all tend to adopt the same or similar 785 developments. 787 5.2. What L4S adds to Existing Approaches 789 All the following approaches address some part of the same problem 790 space as L4S. In each case, it is shown that L4S complements them or 791 improves on them, rather than being a mutually exclusive alternative: 793 Diffserv: Diffserv addresses the problem of bandwidth apportionment 794 for important traffic as well as queuing latency for delay- 795 sensitive traffic. Of these, L4S solely addresses the problem of 796 queuing latency. Diffserv will still be necessary where important 797 traffic requires priority (e.g. for commercial reasons, or for 798 protection of critical infrastructure traffic) - see 799 [I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach 800 can provide low latency for all traffic within each Diffserv class 801 (including the case where there is only the one default Diffserv 802 class). 804 Also, Diffserv can only provide a latency benefit if a small 805 subset of the traffic on a bottleneck link requests low latency. 806 As already explained, it has no effect when all the applications 807 in use at one time at a single site (home, small business or 808 mobile device) require low latency. In contrast, because L4S 809 works for all traffic, it needs none of the management baggage 810 (traffic policing, traffic contracts) associated with favouring 811 some packets over others. This lack of management baggage ought 812 to give L4S a better chance of end-to-end deployment. 
814 In particular, because networks tend not to trust end systems to 815 identify which packets should be favoured over others, where 816 networks assign packets to Diffserv classes they tend to use 817 packet inspection of application flow identifiers or deeper 818 inspection of application signatures. Thus, nowadays, Diffserv 819 doesn't always sit well with encryption of the layers above IP 820 [RFC8404]. So users have to choose between privacy and QoS. 822 As with Diffserv, the L4S identifier is in the IP header. But, in 823 contrast to Diffserv, the L4S identifier does not convey a want or 824 a need for a certain level of quality. Rather, it promises a 825 certain behaviour (scalable congestion response), which networks 826 can objectively verify if they need to. This is because low delay 827 depends on collective host behaviour, whereas bandwidth priority 828 depends on network behaviour. 830 State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a 831 significant reduction in queuing delay relative to no AQM at all. 832 L4S is intended to complement these AQMs, and should not distract 833 from the need to deploy them as widely as possible. Nonetheless, 834 AQMs alone cannot reduce queuing delay too far without 835 significantly reducing link utilization, because the root cause of 836 the problem is on the host - where Classic congestion controls use 837 large saw-toothing rate variations. The L4S approach resolves 838 this tension between delay and utilization by enabling hosts to 839 minimize the amplitude of their sawteeth. 
A single-queue Classic 840 AQM is not sufficient to allow hosts to use small sawteeth for two 841 reasons: i) smaller sawteeth would not get lower delay in an AQM 842 designed for larger amplitude Classic sawteeth, because a queue 843 can only have one length at a time; and ii) much smaller sawteeth 844 imply much more frequent sawteeth, so L4S flows would drive a 845 Classic AQM into a high level of ECN-marking, which would appear 846 as heavy congestion to Classic flows, which in turn would greatly 847 reduce their rate as a result (see Section 6.4.4). 849 Per-flow queuing or marking: Similarly, per-flow approaches such as 850 FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the 851 L4S approach. However, per-flow queuing alone is not enough - it 852 only isolates the queuing of one flow from others; not from 853 itself. Per-flow implementations need to have support for 854 scalable congestion control added, which has already been done for 855 FQ-CoDel in Linux (see Sec.5.2.7 of [RFC8290] and 856 [FQ_CoDel_Thresh]). Without this simple modification, per-flow 857 AQMs like FQ-CoDel would still not be able to support applications 858 that need both very low delay and high bandwidth, e.g. video-based 859 control of remote procedures, or interactive cloud-based video 860 (see Note 1 below). 862 Although per-flow techniques are not incompatible with L4S, it is 863 important to have the DualQ alternative. This is because handling 864 end-to-end (layer 4) flows in the network (layer 3 or 2) precludes 865 some important end-to-end functions. For instance: 867 a. Per-flow forms of L4S like FQ-CoDel are incompatible with full 868 end-to-end encryption of transport layer identifiers for 869 privacy and confidentiality (e.g. IPSec or encrypted VPN 870 tunnels, as opposed to TLS over UDP), because they require 871 packet inspection to access the end-to-end transport flow 872 identifiers.
874 In contrast, the DualQ form of L4S requires no deeper 875 inspection than the IP layer. So, as long as operators take 876 the DualQ approach, their users can have both very low queuing 877 delay and full end-to-end encryption [RFC8404]. 879 b. With per-flow forms of L4S, the network takes over control of 880 the relative rates of each application flow. Some see it as 881 an advantage that the network will prevent some flows running 882 faster than others. Others consider it an inherent part of 883 the Internet's appeal that applications can control their rate 884 while taking account of the needs of others via congestion 885 signals. They maintain that this has allowed applications 886 with interesting rate behaviours to evolve, for instance, 887 variable bit-rate video that varies around an equal share 888 rather than being forced to remain equal at every instant, or 889 e2e scavenger behaviours [RFC6817] that use less than an equal 890 share of capacity [LEDBAT_AQM]. 892 The L4S architecture does not require the IETF to commit to 893 one approach over the other, because it supports both, so that 894 the 'market' can decide. Nonetheless, in the spirit of 'Do 895 one thing and do it well' [McIlroy78], the DualQ option 896 provides low delay without prejudging the issue of flow-rate 897 control. Then, flow rate policing can be added separately if 898 desired. This allows application control up to a point, but 899 the network can still choose to set the point at which it 900 intervenes to prevent one flow completely starving another. 902 Note: 904 1. It might seem that self-inflicted queuing delay within a per- 905 flow queue should not be counted, because if the delay wasn't 906 in the network it would just shift to the sender. However, 907 modern adaptive applications, e.g. 
HTTP/2 [RFC7540] or some 908 interactive media applications (see Section 6.1), can keep low 909 latency objects at the front of their local send queue by 910 shuffling priorities of other objects dependent on the 911 progress of other transfers (for example see [lowat]). They 912 cannot shuffle objects once they have released them into the 913 network. 915 Alternative Back-off ECN (ABE): Here again, L4S is not an 916 alternative to ABE but a complement that introduces much lower 917 queuing delay. ABE [RFC8511] alters the host behaviour in 918 response to ECN marking to utilize a link better and give ECN 919 flows faster throughput. It uses ECT(0) and assumes the network 920 still treats ECN and drop the same. Therefore ABE exploits any 921 lower queuing delay that AQMs can provide. But as explained 922 above, AQMs still cannot reduce queuing delay too far without 923 losing link utilization (to allow for other, non-ABE, flows). 925 BBR: Bottleneck Bandwidth and Round-trip propagation time 926 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing 927 delay end-to-end without needing any special logic in the network, 928 such as an AQM. So it works pretty-much on any path (although it 929 has not been without problems, particularly capacity sharing in 930 BBRv1). BBR keeps queuing delay reasonably low, but perhaps not 931 quite as low as with state-of-the-art AQMs such as PIE or FQ- 932 CoDel, and certainly nowhere near as low as with L4S. Queuing 933 delay is also not consistently low, due to BBR's regular bandwidth 934 probing spikes and its aggressive flow start-up phase. 936 L4S complements BBR. Indeed BBRv2 [BBRv2] can use L4S ECN where 937 available and a scalable L4S congestion control behaviour in 938 response to any ECN signalling from the path. 
The L4S ECN signal 939 complements the delay based congestion control aspects of BBR with 940 an explicit indication that hosts can use, both to converge on a 941 fair rate and to keep below a shallow queue target set by the 942 network. Without L4S ECN, both these aspects need to be assumed 943 or estimated. 945 6. Applicability 947 6.1. Applications 949 A transport layer that solves the current latency issues will provide 950 new service, product and application opportunities. 952 With the L4S approach, the following existing applications also 953 experience significantly better quality of experience under load: 955 * Gaming, including cloud based gaming; 957 * VoIP; 959 * Video conferencing; 961 * Web browsing; 963 * (Adaptive) video streaming; 964 * Instant messaging. 966 The significantly lower queuing latency also enables some interactive 967 application functions to be offloaded to the cloud that would hardly 968 even be usable today: 970 * Cloud based interactive video; 972 * Cloud based virtual and augmented reality. 974 The above two applications have been successfully demonstrated with 975 L4S, both running together over a 40 Mb/s broadband access link 976 loaded up with the numerous other latency sensitive applications in 977 the previous list as well as numerous downloads - all sharing the 978 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a 979 panoramic video of a football stadium could be swiped and pinched so 980 that, on the fly, a proxy in the cloud could generate a sub-window of 981 the match video under the finger-gesture control of each user. For 982 the latter, a virtual reality headset displayed a viewport taken from 983 a 360 degree camera in a racing car. The user's head movements 984 controlled the viewport extracted by a cloud-based proxy. In both 985 cases, with 7 ms end-to-end base delay, the additional queuing delay 986 of roughly 1 ms was so low that it seemed the video was generated 987 locally. 
989 Using a swiping finger gesture or head movement to pan a video is an 990 extremely latency-demanding action, far more demanding than VoIP, 991 because human vision can detect extremely low delays of the order of 992 single milliseconds when delay is translated into a visual lag 993 between a video and a reference point (the finger, or the orientation 994 of the head sensed by the balance system in the inner ear, the 995 vestibular system). 997 Without the low queuing delay of L4S, cloud-based applications like 998 these would not be credible without significantly more access 999 bandwidth (to deliver all possible video that might be viewed) and 1000 more local processing, which would increase the weight and power 1001 consumption of head-mounted displays. When all interactive 1002 processing can be done in the cloud, only the data to be rendered for 1003 the end user needs to be sent. 1005 Other low latency high bandwidth applications such as: 1007 * Interactive remote presence; 1009 * Video-assisted remote control of machinery or industrial 1010 processes. 1012 are not credible at all without very low queuing delay. No amount of 1013 extra access bandwidth or local processing can make up for lost time. 1015 6.2. Use Cases 1017 The following use-cases for L4S are being considered by various 1018 interested parties: 1020 * Where the bottleneck is one of various types of access network: 1021 e.g.
DSL, Passive Optical Networks (PON), DOCSIS cable, mobile, 1022 satellite (see Section 6.3 for some technology-specific details) 1024 * Private networks of heterogeneous data centres, where there is no 1025 single administrator that can arrange for all the simultaneous 1026 changes to senders, receivers and network needed to deploy DCTCP: 1028 - a set of private data centres interconnected over a wide area 1029 with separate administrations, but within the same company 1031 - a set of data centres operated by separate companies 1032 interconnected by a community of interest network (e.g. for the 1033 finance sector) 1035 - multi-tenant (cloud) data centres where tenants choose their 1036 operating system stack (Infrastructure as a Service - IaaS) 1038 * Different types of transport (or application) congestion control: 1040 - elastic (TCP/SCTP); 1042 - real-time (RTP, RMCAT); 1044 - query (DNS/LDAP). 1046 * Where low delay quality of service is required, but without 1047 inspecting or intervening above the IP layer [RFC8404]: 1049 - mobile and other networks have tended to inspect higher layers 1050 in order to guess application QoS requirements. However, with 1051 growing demand for support of privacy and encryption, L4S 1052 offers an alternative. There is no need to select which 1053 traffic to favour for queuing, when L4S can give favourable 1054 queuing to all traffic. 1056 * If queuing delay is minimized, applications with a fixed delay 1057 budget can communicate over longer distances, or via a longer 1058 chain of service functions [RFC7665] or onion routers. 1060 * If delay jitter is minimized, it is possible to reduce the 1061 dejitter buffers on the receive end of video streaming, which 1062 should improve the interactive experience. 1064 6.3. Applicability with Specific Link Technologies 1066 Certain link technologies aggregate data from multiple packets into 1067 bursts, and buffer incoming packets while building each burst.
WiFi, 1068 PON and cable all involve such packet aggregation, whereas fixed 1069 Ethernet and DSL do not. No sender, whether L4S or not, can do 1070 anything to reduce the buffering needed for packet aggregation. So 1071 an AQM should not count this buffering as part of the queue that it 1072 controls, given no amount of congestion signals will reduce it. 1074 Certain link technologies also add buffering for other reasons, 1075 specifically: 1077 * Radio links (cellular, WiFi, satellite) that are distant from the 1078 source are particularly challenging. The radio link capacity can 1079 vary rapidly by orders of magnitude, so it is considered desirable 1080 to hold a standing queue that can utilize sudden increases of 1081 capacity; 1083 * Cellular networks are further complicated by a perceived need to 1084 buffer in order to make hand-overs imperceptible; 1086 L4S cannot remove the need for all these different forms of 1087 buffering. However, by removing 'the longest pole in the tent' 1088 (buffering for the large sawteeth of Classic congestion controls), 1089 L4S exposes all these 'shorter poles' to greater scrutiny. 1091 Until now, the buffering needed for these additional reasons tended 1092 to be over-specified - with the excuse that none were 'the longest 1093 pole in the tent'. But having removed the 'longest pole', it becomes 1094 worthwhile to minimize them, for instance reducing packet aggregation 1095 burst sizes and MAC scheduling intervals. 1097 6.4. Deployment Considerations 1099 L4S AQMs, whether DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] or FQ, 1100 e.g. [RFC8290] are, in themselves, an incremental deployment 1101 mechanism for L4S - so that L4S traffic can coexist with existing 1102 Classic (Reno-friendly) traffic. Section 6.4.1 explains why only 1103 deploying an L4S AQM in one node at each end of the access link will 1104 realize nearly all the benefit of L4S. 
1106 L4S involves both end systems and the network, so Section 6.4.2 1107 suggests some typical sequences to deploy each part, and why there 1108 will be an immediate and significant benefit after deploying just one 1109 part. 1111 Section 6.4.3 and Section 6.4.4 describe the converse incremental 1112 deployment case where there is no L4S AQM at the network bottleneck, 1113 so any L4S flow traversing this bottleneck has to take care in case 1114 it is competing with Classic traffic. 1116 6.4.1. Deployment Topology 1118 L4S AQMs will not have to be deployed throughout the Internet before 1119 L4S can benefit anyone. Operators of public Internet access networks 1120 typically design their networks so that the bottleneck will nearly 1121 always occur at one known (logical) link. This confines the cost of 1122 queue management technology to one place. 1124 The case of mesh networks is different and will be discussed later in 1125 this section. But the known bottleneck case is generally true for 1126 Internet access to all sorts of different 'sites', where the word 1127 'site' includes home networks, small- to medium-sized campus or 1128 enterprise networks and even cellular devices (Figure 2). Also, this 1129 known-bottleneck case tends to be applicable whatever the access link 1130 technology; whether xDSL, cable, PON, cellular, line of sight 1131 wireless or satellite. 1133 Therefore, the full benefit of the L4S service should be available in 1134 the downstream direction when an L4S AQM is deployed at the ingress 1135 to this bottleneck link. And similarly, the full upstream service 1136 will be available once an L4S AQM is deployed at the ingress into the 1137 upstream link. (Of course, multi-homed sites would only see the full 1138 benefit once all their access links were covered.) 
1139 ______ 1140 ( ) 1141 __ __ ( ) 1142 |DQ\________/DQ|( enterprise ) 1143 ___ |__/ \__| ( /campus ) 1144 ( ) (______) 1145 ( ) ___||_ 1146 +----+ ( ) __ __ / \ 1147 | DC |-----( Core )|DQ\_______________/DQ|| home | 1148 +----+ ( ) |__/ \__||______| 1149 (_____) __ 1150 |DQ\__/\ __ ,===. 1151 |__/ \ ____/DQ||| ||mobile 1152 \/ \__|||_||device 1153 | o | 1154 `---' 1156 Figure 2: Likely location of DualQ (DQ) Deployments in common 1157 access topologies 1159 Deployment in mesh topologies depends on how overbooked the core is. 1160 If the core is non-blocking, or at least generously provisioned so 1161 that the edges are nearly always the bottlenecks, it would only be 1162 necessary to deploy an L4S AQM at the edge bottlenecks. For example, 1163 some data-centre networks are designed with the bottleneck in the 1164 hypervisor or host NICs, while others bottleneck at the top-of-rack 1165 switch (both the output ports facing hosts and those facing the 1166 core). 1168 An L4S AQM would often next be needed where the WiFi links in a home 1169 sometimes become the bottleneck. And an L4S AQM would eventually 1170 also need to be deployed at any other persistent bottlenecks such as 1171 network interconnections, e.g. some public Internet exchange points 1172 and the ingress and egress to WAN links interconnecting data-centres. 1174 6.4.2. Deployment Sequences 1176 For any one L4S flow to provide benefit, it requires three (or 1177 sometimes two) parts to have been deployed: i) the congestion control 1178 at the sender; ii) the AQM at the bottleneck; and iii) older 1179 transports (namely TCP) need upgraded receiver feedback too. This 1180 was the same deployment problem that ECN faced [RFC8170] so we have 1181 learned from that experience. 1183 Firstly, L4S deployment exploits the fact that DCTCP already exists 1184 on many Internet hosts (Windows, FreeBSD and Linux); both servers and 1185 clients. 
Therefore, an L4S AQM can be deployed at a network 1186 bottleneck to immediately give a working deployment of all the L4S 1187 parts for testing, as long as the ECT(0) codepoint is switched to 1188 ECT(1). DCTCP needs some safety concerns to be fixed for general use 1189 over the public Internet (see Section 4.3 of 1190 [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so 1191 these issues can be managed within controlled deployments or 1192 controlled trials. 1194 Secondly, the performance improvement with L4S is so significant that 1195 it enables new interactive services and products that were not 1196 previously possible. It is much easier for companies to initiate new 1197 work on deployment if there is budget for a new product trial. If, 1198 in contrast, there were only an incremental performance improvement 1199 (as with Classic ECN), spending on deployment tends to be much harder 1200 to justify. 1202 Thirdly, the L4S identifier is defined so that initially network 1203 operators can enable L4S exclusively for certain customers or certain 1204 applications. But this is carefully defined so that it does not 1205 compromise future evolution towards L4S as an Internet-wide service. 1206 This is because the L4S identifier is defined not only as the end-to- 1207 end ECN field, but it can also optionally be combined with any other 1208 packet header or some status of a customer or their access link (see 1209 section 5.4 of [I-D.ietf-tsvwg-ecn-l4s-id]). Operators could do this 1210 anyway, even if it were not blessed by the IETF. However, it is best 1211 for the IETF to specify that, if they use their own local identifier, 1212 it must be in combination with the IETF's identifier. Then, if an 1213 operator has opted for an exclusive local-use approach, later they 1214 only have to remove this extra rule to make the service work 1215 Internet-wide - it will already traverse middleboxes, peerings, etc. 
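As an illustration of the two-part identifier just described, the classification logic might look like the following sketch. It is ours and non-normative: the ECN-field test follows [I-D.ietf-tsvwg-ecn-l4s-id] (ECT(1), and by default CE, selects L4S treatment), while `local_policy_ok` stands in for any optional operator-specific condition, such as whether a customer is enrolled in a trial:

```python
# Illustrative sketch (ours) of L4S classification: the IETF-defined
# identifier is the IP-ECN field, optionally combined with a
# local-use condition that can only narrow, never widen, it.

NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11  # RFC 3168 codepoints

def is_l4s(ecn: int, local_policy_ok: bool = True) -> bool:
    """True if a packet should be classified into the L4S queue."""
    # ECT(1) identifies L4S; CE packets are classified as L4S too,
    # since an upstream mark may have overwritten an original ECT(1).
    ietf_identifier = ecn in (ECT_1, CE)
    # Optional local-use rule, e.g. L4S enabled only for trial
    # customers; removing it later makes the service Internet-wide.
    return ietf_identifier and local_policy_ok

print([is_l4s(e) for e in (NOT_ECT, ECT_0, ECT_1, CE)])
# prints [False, False, True, True]
```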
   +-+--------------------+----------------------+---------------------+
   | | Servers or proxies |      Access link     |       Clients       |
   +-+--------------------+----------------------+---------------------+
   |0| DCTCP (existing)   |                      |  DCTCP (existing)   |
   +-+--------------------+----------------------+---------------------+
   |1|                    |Add L4S AQM downstream|                     |
   | |      WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS         |
   +-+--------------------+----------------------+---------------------+
   |2| Upgrade DCTCP to   |                      |Replace DCTCP feedb'k|
   | |    TCP Prague      |                      |     with AccECN     |
   | |                    FULLY WORKS DOWNSTREAM                       |
   +-+--------------------+----------------------+---------------------+
   | |                    |                      |  Upgrade DCTCP to   |
   |3|                    | Add L4S AQM upstream |     TCP Prague      |
   | |                    |                      |                     |
   | |              FULLY WORKS UPSTREAM AND DOWNSTREAM                |
   +-+--------------------+----------------------+---------------------+

              Figure 3: Example L4S Deployment Sequence

1236 Figure 3 illustrates some example sequences in which the parts of L4S 1237 might be deployed. It consists of the following stages: 1239 1. Here, the immediate benefit of a single AQM deployment can be 1240 seen, but limited to a controlled trial or controlled deployment. 1241 In this example, downstream deployment is first, but in other 1242 scenarios the upstream might be deployed first. If no AQM at all 1243 was previously deployed for the downstream access, an L4S AQM 1244 greatly improves the Classic service (as well as adding the L4S 1245 service). If an AQM was already deployed, the Classic service 1246 will be unchanged (and L4S will add an improvement on top). 1248 2. In this stage, the name 'TCP 1249 Prague' [I-D.briscoe-iccrg-prague-congestion-control] is used to 1250 represent a variant of DCTCP that is designed to be used in a 1251 production Internet environment (assuming it complies with the 1252 requirements in Section 4 of [I-D.ietf-tsvwg-ecn-l4s-id]).
If 1253 the application is primarily unidirectional, 'TCP Prague' at one 1254 end will provide all the benefit needed. For TCP transports, 1255 Accurate ECN feedback (AccECN) [I-D.ietf-tcpm-accurate-ecn] is 1256 needed at the other end, but it is a generic ECN feedback 1257 facility that is already planned to be deployed for other 1258 purposes, e.g. DCTCP, BBR. The two ends can be deployed in 1259 either order, because, in TCP, an L4S congestion control only 1260 enables itself if it has negotiated the use of AccECN feedback 1261 with the other end during the connection handshake. Thus, 1262 deployment of TCP Prague on a server enables L4S trials to move 1263 to a production service in one direction, wherever AccECN is 1264 deployed at the other end. This stage might be further motivated 1265 by the performance improvements of TCP Prague relative to DCTCP 1266 (see Appendix A.2 of [I-D.ietf-tsvwg-ecn-l4s-id]). 1268 Unlike TCP, QUIC ECN feedback [RFC9000] has 1269 supported L4S from the outset. Therefore, if the transport is QUIC, one-ended 1270 deployment of a Prague congestion control at this stage is simple 1271 and sufficient. 1273 3. This is a two-move stage to enable L4S upstream. An L4S AQM or 1274 TCP Prague can be deployed in either order, as already explained. 1275 To motivate the first of two independent moves, the deferred 1276 benefit of enabling new services after the second move has to be 1277 sufficient to cover the first mover's investment risk. As 1278 explained already, the potential for new interactive services 1279 provides this motivation. An L4S AQM also improves the upstream 1280 Classic service - significantly so if no other AQM has already been 1281 deployed. 1283 Note that other deployment sequences might occur. For instance: the 1284 upstream might be deployed first; a non-TCP protocol might be used 1285 end-to-end, e.g.
QUIC, RTP; a body such as the 3GPP might require L4S 1286 to be implemented in 5G user equipment, or other random acts of 1287 kindness. 1289 6.4.3. L4S Flow but Non-ECN Bottleneck 1291 If L4S is enabled between two hosts, the L4S sender is required to 1292 coexist safely with Reno in response to any drop (see Section 4.3 of 1293 [I-D.ietf-tsvwg-ecn-l4s-id]). 1295 Unfortunately, as well as protecting Classic traffic, this rule 1296 degrades the L4S service whenever there is any loss, even if the 1297 cause is not persistent congestion at a bottleneck, e.g.: 1299 * congestion loss at other transient bottlenecks, e.g. due to bursts 1300 in shallower queues; 1302 * transmission errors, e.g. due to electrical interference; 1304 * rate policing. 1306 Three complementary approaches are in progress to address this issue, 1307 but they are all currently research: 1309 * In Prague congestion control, ignore certain losses deemed 1310 unlikely to be due to congestion (using some ideas from 1311 BBR [I-D.cardwell-iccrg-bbr-congestion-control] regarding isolated 1312 losses). This could mask any of the above types of loss while 1313 still coexisting with drop-based congestion controls. 1315 * A combination of RACK, L4S and link retransmission without 1316 resequencing could repair transmission errors without the head of 1317 line blocking delay usually associated with link-layer 1318 retransmission [UnorderedLTE], [I-D.ietf-tsvwg-ecn-l4s-id]; 1320 * Hybrid ECN/drop rate policers (see Section 8.3). 1322 L4S deployment scenarios that minimize these issues (e.g. over 1323 wireline networks) can proceed in parallel to this research, in the 1324 expectation that research success could continually widen L4S 1325 applicability. 1327 6.4.4. L4S Flow but Classic ECN Bottleneck 1329 Classic ECN support is starting to materialize on the Internet as an 1330 increased level of CE marking. 
It is hard to detect whether this is 1331 all due to the addition of support for ECN in implementations of FQ- 1332 CoDel and/or FQ-COBALT, which is not generally problematic, because 1333 flow-queue (FQ) scheduling inherently prevents a flow from exceeding 1334 the 'fair' rate irrespective of its aggressiveness. However, some of 1335 this Classic ECN marking might be due to single-queue ECN deployment. 1336 This case is discussed in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id]. 1338 6.4.5. L4S AQM Deployment within Tunnels 1340 An L4S AQM uses the ECN field to signal congestion. So, in common 1341 with Classic ECN, if the AQM is within a tunnel or at a lower layer, 1342 correct functioning of ECN signalling requires correct propagation of 1343 the ECN field up the layers [RFC6040], 1344 [I-D.ietf-tsvwg-rfc6040update-shim], 1345 [I-D.ietf-tsvwg-ecn-encap-guidelines]. 1347 7. IANA Considerations (to be removed by RFC Editor) 1349 This specification contains no IANA considerations. 1351 8. Security Considerations 1353 8.1. Traffic Rate (Non-)Policing 1355 In the current Internet, scheduling usually enforces separation 1356 between 'sites' (e.g. households, businesses or mobile users 1357 [RFC0970]) and various techniques like redirection to traffic 1358 scrubbing facilities deal with flooding attacks. However, there has 1359 never been a universal need to police the rate of individual 1360 application flows - the Internet has generally always relied on self- 1361 restraint of congestion controls at senders for sharing intra-'site' 1362 capacity. 1364 As explained in Section 5.2, the DualQ variant of L4S provides low 1365 delay without prejudging the issue of flow-rate control. Then, if 1366 flow-rate control is needed, per-flow-queuing (FQ) can be used 1367 instead, or flow rate policing can be added as a modular addition to 1368 a DualQ. 
1370 Because the L4S service reduces delay without increasing the delay of 1371 Classic traffic, it should not be necessary to rate-police access to 1372 the L4S service. In contrast, Section 5.2 explains how Diffserv only 1373 makes a difference if some packets get less favourable treatment than 1374 others, which typically requires traffic rate policing, which can, in 1375 turn, lead to further complexity such as traffic contracts at trust 1376 boundaries. Because L4S avoids this management complexity, it is 1377 more likely to work end-to-end. 1379 During early deployment (and perhaps always), some networks will not 1380 offer the L4S service. In general, these networks should not need to 1381 police L4S traffic. They are required (by both [RFC3168] and 1382 [I-D.ietf-tsvwg-ecn-l4s-id]) not to change the L4S identifier, which 1383 would interfere with end-to-end congestion control. Instead they can 1384 merely treat L4S traffic as Not-ECT, as they might already treat all 1385 ECN traffic today. At a bottleneck, such networks will introduce 1386 some queuing and dropping. When a scalable congestion control 1387 detects a drop it will have to respond safely with respect to Classic 1388 congestion controls (as required in Section 4.3 of 1389 [I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the L4S service to 1390 be no better (but never worse) than Classic best efforts, whenever a 1391 non-ECN bottleneck is encountered on a path (see Section 6.4.3). 1393 In cases that are expected to be rare, networks that solely support 1394 Classic ECN [RFC3168] in a single queue bottleneck might opt to 1395 police L4S traffic so as to protect competing Classic ECN traffic 1396 (for instance, see Section 6.1.3 of [I-D.ietf-tsvwg-l4sops]). 1397 However, Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] recommends that 1398 the sender adapts its congestion response to properly coexist with 1399 Classic ECN flows, i.e. reverting to the self-restraint approach. 
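The dual congestion response described above (a scalable reduction in response to ECN marks, but a Classic multiplicative decrease in response to drop) can be sketched as follows. This is an illustrative simplification, not the normative Prague algorithm: a real implementation smooths the marking fraction over time (as DCTCP does with an EWMA) rather than reacting to each ACK's instantaneous fraction.

```python
def on_ack(cwnd: float, acked: int, marked: int, loss: bool) -> float:
    """Return an updated congestion window (in segments).

    cwnd   -- current congestion window
    acked  -- segments newly acknowledged
    marked -- how many of those carried CE marks
    loss   -- whether a loss was detected
    """
    if loss:
        # Classic response: halve, as Reno would, so the flow
        # coexists safely at a non-ECN (drop-based) bottleneck.
        return max(cwnd / 2, 1.0)
    if marked:
        # Scalable response: reduce in proportion to the fraction
        # of CE-marked segments (DCTCP-style).
        frac = marked / acked
        return max(cwnd * (1 - frac / 2), 1.0)
    # No congestion signal: additive increase.
    return cwnd + 1.0 / cwnd
```

With a shallow ECN marking threshold, the marking fraction stays small and the window reduction is correspondingly fine-grained, whereas any drop still triggers the coarse Classic backoff required for safe coexistence.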
1401 Certain network operators might choose to restrict access to the L4S 1402 class, perhaps only to selected premium customers as a value-added 1403 service. Their packet classifier (item 2 in Figure 1) could identify 1404 such customers against some other field (e.g. source address range) 1405 as well as classifying on the ECN field. If only the ECN L4S 1406 identifier matched, but not the source address (say), the classifier 1407 could direct these packets (from non-premium customers) into the 1408 Classic queue. Explaining clearly how operators can use 1409 additional local classifiers (see section 5.4 of 1410 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any motivation to 1411 clear the L4S identifier. Then at least the L4S ECN identifier will 1412 be more likely to survive end-to-end even though the service may not 1413 be supported at every hop. Such local arrangements would only 1414 require simple registered/not-registered packet classification, 1415 rather than the managed, application-specific traffic policing 1416 against customer-specific traffic contracts that Diffserv uses. 1418 8.2. 'Latency Friendliness' 1420 Like the Classic service, the L4S service relies on self-restraint - 1421 limiting rate in response to congestion. In addition, the L4S 1422 service requires self-restraint in terms of limiting latency 1423 (burstiness). It is hoped that self-interest and guidance on dynamic 1424 behaviour (especially flow start-up, which might need to be 1425 standardized) will be sufficient to prevent transports from sending 1426 excessive bursts of L4S traffic, given that the application's own latency 1427 will suffer most from such behaviour. 1429 Whether burst policing becomes necessary remains to be seen. Without 1430 it, there will be potential for attacks on the low latency of the L4S 1431 service.
1433 If needed, various arrangements could be used to address this 1434 concern: 1436 Local bottleneck queue protection: A per-flow (5-tuple) queue 1437 protection function [I-D.briscoe-docsis-q-protection] has been 1438 developed for the low latency queue in DOCSIS, which has adopted 1439 the DualQ L4S architecture. It protects the low latency service 1440 from any queue-building flows that accidentally or maliciously 1441 classify themselves into the low latency queue. It is designed to 1442 score flows based solely on their contribution to queuing (not 1443 flow rate in itself). Then, if the shared low latency queue is at 1444 risk of exceeding a threshold, the function redirects enough 1445 packets of the highest scoring flow(s) into the Classic queue to 1446 preserve low latency. 1448 Distributed traffic scrubbing: Rather than policing locally at each 1449 bottleneck, it may only be necessary to address problems 1450 reactively, e.g. by punitively targeting any deployments of new bursty 1451 malware, in a similar way to how traffic from flooding attack 1452 sources is rerouted via scrubbing facilities. 1454 Local bottleneck per-flow scheduling: Per-flow scheduling should 1455 inherently isolate non-bursty flows from bursty ones (see Section 5.2 1456 for discussion of the merits of per-flow scheduling relative to 1457 per-flow policing). 1459 Distributed access subnet queue protection: Per-flow queue 1460 protection could be arranged for a queue structure distributed 1461 across a subnet inter-communicating using lower layer control 1462 messages (see Section 2.1.4 of [QDyn]). For instance, in a radio 1463 access network, user equipment already sends regular buffer status 1464 reports to a radio network controller, which could use this 1465 information to remotely police individual flows.
1467 Distributed Congestion Exposure to Ingress Policers: The Congestion 1468 Exposure (ConEx) architecture [RFC7713] uses egress audit to 1469 motivate senders to truthfully signal path congestion in-band, 1470 where it can be used by ingress policers. An edge-to-edge variant 1471 of this architecture is also possible. 1473 Distributed Domain-edge traffic conditioning: An architecture 1474 similar to Diffserv [RFC2475] may be preferred, where traffic is 1475 proactively conditioned on entry to a domain, rather than 1476 reactively policed only if it leads to queuing once combined with 1477 other traffic at a bottleneck. 1479 Distributed core network queue protection: The policing function 1480 could be divided between per-flow mechanisms at the network 1481 ingress that characterize the burstiness of each flow into a 1482 signal carried with the traffic, and per-class mechanisms at 1483 bottlenecks that act on these signals if queuing actually occurs 1484 once the traffic converges. This would be somewhat similar to 1485 [Nadas20], which is in turn similar to the idea behind core 1486 stateless fair queuing. 1488 None of these possible queue protection capabilities are considered a 1489 necessary part of the L4S architecture, which works without them (in 1490 a similar way to how the Internet works without per-flow rate 1491 policing). Indeed, even where latency policers are deployed, under 1492 normal circumstances they would not intervene, and if operators found 1493 they were not necessary they could disable them. Part of the L4S 1494 experiment will be to see whether such a function is necessary, and 1495 which arrangements are most appropriate to the size of the problem. 1497 8.3. Interaction between Rate Policing and L4S 1499 As mentioned in Section 5.2, L4S should remove the need for low 1500 latency Diffserv classes.
However, those Diffserv classes that give 1501 certain applications or users priority over capacity would still be 1502 applicable in certain scenarios (e.g. corporate networks). Then, 1503 within such Diffserv classes, L4S would often be applicable to give 1504 traffic low latency and low loss as well. Within such a Diffserv 1505 class, the bandwidth available to a user or application is often 1506 limited by a rate policer. Similarly, in the default Diffserv class, 1507 rate policers are used to partition shared capacity. 1509 A classic rate policer drops any packets exceeding a set rate, 1510 usually also giving a burst allowance (variants exist where the 1511 policer re-marks non-compliant traffic to a discard-eligible Diffserv 1512 codepoint, so that it can be dropped elsewhere during contention). 1513 Whenever L4S traffic encounters one of these rate policers, it will 1514 experience drops and the source will have to fall back to a Classic 1515 congestion control, thus losing the benefits of L4S (Section 6.4.3). 1516 So, in networks that already use rate policers and plan to deploy 1517 L4S, it will be preferable to redesign these rate policers to be more 1518 friendly to the L4S service. 1520 L4S-friendly rate policing is currently a research area (note that 1521 this is not the same as latency policing). It might be achieved by 1522 setting a threshold where ECN marking is introduced, such that it is 1523 just under the policed rate or just under the burst allowance where 1524 drop is introduced. For instance, the two-rate three-colour marker 1525 [RFC2698] or a PCN threshold and excess-rate marker [RFC5670] could 1526 mark ECN at the lower rate and drop at the higher. Or an existing 1527 rate policer could have congestion-rate policing added, e.g. using 1528 the 'local' (non-ConEx) variant of the ConEx aggregate congestion 1529 policer [I-D.briscoe-conex-policing].
It might also be possible to 1530 design scalable congestion controls to respond less catastrophically 1531 to loss that has not been preceded by a period of increasing delay. 1533 The design of L4S-friendly rate policers will require a separate 1534 dedicated document. For further discussion of the interaction 1535 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv]. 1537 8.4. ECN Integrity 1539 Receiving hosts can fool a sender into sending faster by 1540 suppressing feedback of ECN marks (or of losses if retransmissions 1541 are not necessary or available otherwise). Various ways to protect 1542 transport feedback integrity have been developed. For instance: 1544 * The sender can test the integrity of the receiver's feedback by 1545 occasionally setting the IP-ECN field to the congestion 1546 experienced (CE) codepoint, which is normally only set by a 1547 congested link. Then the sender can test whether the receiver's 1548 feedback faithfully reports what it expects (see 2nd para of 1549 Section 20.2 of [RFC3168]). 1551 * A network can enforce a congestion response to its ECN markings 1552 (or packet losses) by auditing congestion exposure 1553 (ConEx) [RFC7713]. 1555 * Transport layer authentication such as the TCP authentication 1556 option (TCP-AO [RFC5925]) or QUIC's use of TLS [RFC9001] can 1557 detect any tampering with congestion feedback. 1559 * The ECN Nonce [RFC3540] was proposed to detect tampering with 1560 congestion feedback, but it has been reclassified as 1561 historic [RFC8311]. 1563 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of 1564 these techniques, including their applicability and pros and cons. 1566 8.5. Privacy Considerations 1568 As discussed in Section 5.2, the L4S architecture does not preclude 1569 approaches that inspect end-to-end transport layer identifiers. For 1570 instance, L4S support has been added to FQ-CoDel, which classifies by 1571 application flow ID in the network.
However, the main innovation of 1572 L4S is the DualQ AQM framework that does not need to inspect any 1573 deeper than the outermost IP header, because the L4S identifier is in 1574 the IP-ECN field. 1576 Thus, the L4S architecture enables very low queuing delay without 1577 _requiring_ inspection of information above the IP layer. This means 1578 that users who want to encrypt application flow identifiers, e.g. in 1579 IPSec or other encrypted VPN tunnels, don't have to sacrifice low 1580 delay [RFC8404]. 1582 Because L4S can provide low delay for a broad set of applications 1583 that choose to use it, there is no need for individual applications 1584 or classes within that broad set to be distinguishable in any way 1585 while traversing networks. This removes much of the ability to 1586 correlate between the delay requirements of traffic and other 1587 identifying features [RFC6973]. There may be some types of traffic 1588 that prefer not to use L4S, but the coarse binary categorization of 1589 traffic reveals very little that could be exploited to compromise 1590 privacy. 1592 9. Acknowledgements 1594 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David 1595 Black, Jake Holland, Vidhi Goel, Ermin Sakic, Praveen 1596 Balasubramanian, Gorry Fairhurst, Mirja Kuehlewind, Philip Eardley, 1597 Neal Cardwell and Pete Heist for their useful review comments. 1599 Bob Briscoe and Koen De Schepper were part-funded by the European 1600 Community under its Seventh Framework Programme through the Reducing 1601 Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe 1602 was also part-funded by the Research Council of Norway through the 1603 TimeIn project, partly by CableLabs and partly by the Comcast 1604 Innovation Fund. The views expressed here are solely those of the 1605 authors. 1607 10. Informative References 1609 [AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H., 1610 and S-J. 
Park, "Towards fair and low latency next 1611 generation high speed networks: AFCD queuing", Journal of 1612 Network and Computer Applications 70:183--193, July 2016, 1613 . 1615 [BBRv2] Cardwell, N., "TCP BBR v2 Alpha/Preview Release", github 1616 repository; Linux congestion control module, 1617 . 1619 [BDPdata] Briscoe, B., "PI2 Parameters", Technical Report TR-BB- 1620 2021-001 arXiv:2107.01003 [cs.NI], July 2021, 1621 . 1623 [BufferSize] 1624 Appenzeller, G., Keslassy, I., and N. McKeown, "Sizing 1625 Router Buffers", In Proc. SIGCOMM'04 34(4):281--292, 1626 September 2004, . 1628 [COBALT] Palmei, J., Gupta, S., Imputato, P., Morton, J., 1629 Tahiliani, M. P., Avallone, S., and D. Täht, "Design and 1630 Evaluation of COBALT Queue Discipline", In Proc. IEEE 1631 Int'l Symp. Local and Metropolitan Area Networks 1632 (LANMAN'19) 2019:1-6, July 2019, 1633 . 1635 [DCttH19] De Schepper, K., Bondarenko, O., Tilmans, O., and B. 1636 Briscoe, "`Data Centre to the Home': Ultra-Low Latency for 1637 All", Updated RITE project Technical Report , July 2019, 1638 . 1640 [DOCSIS3.1] 1641 CableLabs, "MAC and Upper Layer Protocols Interface 1642 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable 1643 Service Interface Specifications DOCSIS® 3.1 Version i17 1644 or later, 21 January 2019, . 1647 [DOCSIS3AQM] 1648 White, G., "Active Queue Management Algorithms for DOCSIS 1649 3.0; A Simulation Study of CoDel, SFQ-CoDel and PIE in 1650 DOCSIS 3.0 Networks", CableLabs Technical Report , April 1651 2013, <{http://www.cablelabs.com/wp- 1652 content/uploads/2013/11/ 1653 Active_Queue_Management_Algorithms_DOCSIS_3_0.pdf>. 1655 [DualPI2Linux] 1656 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., 1657 and H. Steen, "DUALPI2 - Low Latency, Low Loss and 1658 Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019, 1659 . 1662 [Dukkipati06] 1663 Dukkipati, N. and N. 
McKeown, "Why Flow-Completion Time is 1664 the Right Metric for Congestion Control", ACM CCR 1665 36(1):59--62, January 2006, 1666 . 1668 [FQ_CoDel_Thresh] 1669 Høiland-Jørgensen, T., "fq_codel: generalise ce_threshold 1670 marking for subset of traffic", Linux Patch Commit ID: 1671 dfcb63ce1de6b10b, 20 October 2021, 1672 . 1675 [Hohlfeld14] 1676 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. 1677 Barford, "A QoE Perspective on Sizing Network Buffers", 1678 Proc. ACM Internet Measurement Conf (IMC'14), November 1679 2014, . 1681 [I-D.briscoe-conex-policing] 1682 Briscoe, B., "Network Performance Isolation using 1683 Congestion Policing", Work in Progress, Internet-Draft, 1684 draft-briscoe-conex-policing-01, 14 February 2014, 1685 . 1688 [I-D.briscoe-docsis-q-protection] 1689 Briscoe, B. and G. White, "The DOCSIS(r) Queue Protection 1690 Algorithm to Preserve Low Latency", Work in Progress, 1691 Internet-Draft, draft-briscoe-docsis-q-protection-01, 17 1692 December 2021, . 1695 [I-D.briscoe-iccrg-prague-congestion-control] 1696 Schepper, K. D., Tilmans, O., and B. Briscoe, "Prague 1697 Congestion Control", Work in Progress, Internet-Draft, 1698 draft-briscoe-iccrg-prague-congestion-control-00, 9 March 1699 2021, . 1702 [I-D.briscoe-tsvwg-l4s-diffserv] 1703 Briscoe, B., "Interactions between Low Latency, Low Loss, 1704 Scalable Throughput (L4S) and Differentiated Services", 1705 Work in Progress, Internet-Draft, draft-briscoe-tsvwg-l4s- 1706 diffserv-02, 4 November 2018, 1707 . 1710 [I-D.cardwell-iccrg-bbr-congestion-control] 1711 Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V. 1712 Jacobson, "BBR Congestion Control", Work in Progress, 1713 Internet-Draft, draft-cardwell-iccrg-bbr-congestion- 1714 control-01, 7 November 2021, 1715 . 1718 [I-D.ietf-tcpm-accurate-ecn] 1719 Briscoe, B., Kühlewind, M., and R.
Scheffenegger, "More 1720 Accurate ECN Feedback in TCP", Work in Progress, Internet- 1721 Draft, draft-ietf-tcpm-accurate-ecn-15, 12 July 2021, 1722 . 1725 [I-D.ietf-tcpm-generalized-ecn] 1726 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1727 Congestion Notification (ECN) to TCP Control Packets", 1728 Work in Progress, Internet-Draft, draft-ietf-tcpm- 1729 generalized-ecn-08, 2 August 2021, 1730 . 1733 [I-D.ietf-tsvwg-aqm-dualq-coupled] 1734 Schepper, K. D., Briscoe, B., and G. White, "DualQ Coupled 1735 AQMs for Low Latency, Low Loss and Scalable Throughput 1736 (L4S)", Work in Progress, Internet-Draft, draft-ietf- 1737 tsvwg-aqm-dualq-coupled-19, 3 November 2021, 1738 . 1741 [I-D.ietf-tsvwg-ecn-encap-guidelines] 1742 Briscoe, B. and J. Kaippallimalil, "Guidelines for Adding 1743 Congestion Notification to Protocols that Encapsulate IP", 1744 Work in Progress, Internet-Draft, draft-ietf-tsvwg-ecn- 1745 encap-guidelines-16, 25 May 2021, 1746 . 1749 [I-D.ietf-tsvwg-ecn-l4s-id] 1750 Schepper, K. D. and B. Briscoe, "Explicit Congestion 1751 Notification (ECN) Protocol for Very Low Queuing Delay 1752 (L4S)", Work in Progress, Internet-Draft, draft-ietf- 1753 tsvwg-ecn-l4s-id-22, 8 November 2021, 1754 . 1757 [I-D.ietf-tsvwg-l4sops] 1758 White, G., "Operational Guidance for Deployment of L4S in 1759 the Internet", Work in Progress, Internet-Draft, draft- 1760 ietf-tsvwg-l4sops-02, 25 October 2021, 1761 . 1764 [I-D.ietf-tsvwg-nqb] 1765 White, G. and T. Fossati, "A Non-Queue-Building Per-Hop 1766 Behavior (NQB PHB) for Differentiated Services", Work in 1767 Progress, Internet-Draft, draft-ietf-tsvwg-nqb-08, 25 1768 October 2021, . 1771 [I-D.ietf-tsvwg-rfc6040update-shim] 1772 Briscoe, B., "Propagating Explicit Congestion Notification 1773 Across IP Tunnel Headers Separated by a Shim", Work in 1774 Progress, Internet-Draft, draft-ietf-tsvwg-rfc6040update- 1775 shim-14, 25 May 2021, 1776 . 1779 [I-D.morton-tsvwg-codel-approx-fair] 1780 Morton, J. and P. G. 
Heist, "Controlled Delay Approximate 1781 Fairness AQM", Work in Progress, Internet-Draft, draft- 1782 morton-tsvwg-codel-approx-fair-01, 9 March 2020, 1783 . 1786 [I-D.sridharan-tcpm-ctcp] 1787 Sridharan, M., Tan, K., Bansal, D., and D. Thaler, 1788 "Compound TCP: A New TCP Congestion Control for High-Speed 1789 and Long Distance Networks", Work in Progress, Internet- 1790 Draft, draft-sridharan-tcpm-ctcp-02, 11 November 2008, 1791 . 1794 [I-D.stewart-tsvwg-sctpecn] 1795 Stewart, R. R., Tuexen, M., and X. Dong, "ECN for Stream 1796 Control Transmission Protocol (SCTP)", Work in Progress, 1797 Internet-Draft, draft-stewart-tsvwg-sctpecn-05, 15 January 1798 2014, . 1801 [L4Sdemo16] 1802 Bondarenko, O., De Schepper, K., Tsang, I., and B. 1803 Briscoe, "Ultra-Low Delay for All: Live Experience, Live 1804 Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, 1805 . 1809 [LEDBAT_AQM] 1810 Al-Saadi, R., Armitage, G., and J. But, "Characterising 1811 LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel 1812 and FQ-PIE Active Queue Management", Proc. IEEE 42nd 1813 Conference on Local Computer Networks (LCN) 278--285, 1814 2017, . 1816 [lowat] Meenan, P., "Optimizing HTTP/2 prioritization with BBR and 1817 tcp_notsent_lowat", Cloudflare Blog , 12 October 2018, 1818 . 1821 [Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , 1822 May 2009, . 1827 [McIlroy78] 1828 McIlroy, M.D., Pinson, E. N., and B. A. Tague, "UNIX Time- 1829 Sharing System: Foreword", The Bell System Technical 1830 Journal 57:6(1902--1903), July 1978, 1831 . 1833 [Nadas20] Nádas, S., Gombos, G., Fejes, F., and S. Laki, "A 1834 Congestion Control Independent L4S Scheduler", Proc. 1835 Applied Networking Research Workshop (ANRW '20) 45--51, 1836 July 2020, . 1838 [NewCC_Proc] 1839 Eggert, L., "Experimental Specification of New Congestion 1840 Control Algorithms", IETF Operational Note ion-tsv-alt-cc, 1841 July 2007, . 
1844 [PragueLinux] 1845 Briscoe, B., De Schepper, K., Albisser, O., Misund, J., 1846 Tilmans, O., Kühlewind, M., and A.S. Ahmed, "Implementing 1847 the `TCP Prague' Requirements for Low Latency Low Loss 1848 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , 1849 March 2019, . 1852 [QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", 1853 bobbriscoe.net Technical Report TR-BB-2017-001; 1854 arXiv:1904.07044 [cs.NI], September 2017, 1855 . 1857 [Rajiullah15] 1858 Rajiullah, M., "Towards a Low Latency Internet: 1859 Understanding and Solutions", Masters Thesis; Karlstad 1860 Uni, Dept of Maths & CS 2015:41, 2015, . 1863 [RFC0970] Nagle, J., "On Packet Switches With Infinite Storage", 1864 RFC 970, DOI 10.17487/RFC0970, December 1985, 1865 . 1867 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1868 and W. Weiss, "An Architecture for Differentiated 1869 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 1870 . 1872 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color 1873 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999, 1874 . 1876 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1877 Explicit Congestion Notification (ECN) in IP Networks", 1878 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1879 . 1881 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1882 of Explicit Congestion Notification (ECN) to IP", 1883 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1884 . 1886 [RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le 1887 Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D. 1888 Stiliadis, "An Expedited Forwarding PHB (Per-Hop 1889 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, 1890 . 1892 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 1893 Congestion Notification (ECN) Signaling with Nonces", 1894 RFC 3540, DOI 10.17487/RFC3540, June 2003, 1895 . 
1897 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", 1898 RFC 3649, DOI 10.17487/RFC3649, December 2003, 1899 . 1901 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1902 Congestion Control Protocol (DCCP)", RFC 4340, 1903 DOI 10.17487/RFC4340, March 2006, 1904 . 1906 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1907 Explicit Congestion Notification (ECN) Field", BCP 124, 1908 RFC 4774, DOI 10.17487/RFC4774, November 2006, 1909 . 1911 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 1912 RFC 4960, DOI 10.17487/RFC4960, September 2007, 1913 . 1915 [RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion 1916 Control Algorithms", BCP 133, RFC 5033, 1917 DOI 10.17487/RFC5033, August 2007, 1918 . 1920 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1921 Friendly Rate Control (TFRC): Protocol Specification", 1922 RFC 5348, DOI 10.17487/RFC5348, September 2008, 1923 . 1925 [RFC5670] Eardley, P., Ed., "Metering and Marking Behaviour of PCN- 1926 Nodes", RFC 5670, DOI 10.17487/RFC5670, November 2009, 1927 . 1929 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1930 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1931 . 1933 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1934 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1935 June 2010, . 1937 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1938 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1939 2010, . 1941 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1942 and K. Carlberg, "Explicit Congestion Notification (ECN) 1943 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August 1944 2012, . 1946 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, 1947 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, 1948 DOI 10.17487/RFC6817, December 2012, 1949 . 
   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013.

   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [RFC7560]  Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe,
              "Problem Statement and Requirements for Increased
              Accuracy in Explicit Congestion Notification (ECN)
              Feedback", RFC 7560, DOI 10.17487/RFC7560, August 2015.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service
              Function Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC7713]  Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts, Abstract Mechanism, and Requirements",
              RFC 7713, DOI 10.17487/RFC7713, December 2015.

   [RFC8033]  Pan, R., Natarajan, P., Baker, F., and G. White,
              "Proportional Integral Controller Enhanced (PIE): A
              Lightweight Control Scheme to Address the Bufferbloat
              Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017.

   [RFC8034]  White, G. and R. Pan, "Active Queue Management (AQM)
              Based on Proportional Integral Controller Enhanced (PIE)
              for Data-Over-Cable Service Interface Specifications
              (DOCSIS) Cable Modems", RFC 8034, DOI 10.17487/RFC8034,
              February 2017.

   [RFC8170]  Thaler, D., Ed., "Planning for Protocol Adoption and
              Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170,
              May 2017.

   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert,
              L., and G. Judd, "Data Center TCP (DCTCP): TCP
              Congestion Control for Data Centers", RFC 8257,
              DOI 10.17487/RFC8257, October 2017.

   [RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys,
              J., and E. Dumazet, "The Flow Queue CoDel Packet
              Scheduler and Active Queue Management Algorithm",
              RFC 8290, DOI 10.17487/RFC8290, January 2018.

   [RFC8298]  Johansson, I. and Z. Sarker, "Self-Clocked Rate
              Adaptation for Multimedia", RFC 8298,
              DOI 10.17487/RFC8298, December 2017.

   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
              Notification (ECN) Experimentation", RFC 8311,
              DOI 10.17487/RFC8311, January 2018.

   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L.,
              and R. Scheffenegger, "CUBIC for Fast Long-Distance
              Networks", RFC 8312, DOI 10.17487/RFC8312, February
              2018.

   [RFC8404]  Moriarty, K., Ed. and A. Morton, Ed., "Effects of
              Pervasive Encryption on Operators", RFC 8404,
              DOI 10.17487/RFC8404, July 2018.

   [RFC8511]  Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
              "TCP Alternative Backoff with ECN (ABE)", RFC 8511,
              DOI 10.17487/RFC8511, December 2018.

   [RFC8888]  Sarker, Z., Perkins, C., Singh, V., and M. Ramalho,
              "RTP Control Protocol (RTCP) Feedback for Congestion
              Control", RFC 8888, DOI 10.17487/RFC8888, January 2021.

   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
              Multiplexed and Secure Transport", RFC 9000,
              DOI 10.17487/RFC9000, May 2021.

   [RFC9001]  Thomson, M., Ed. and S. Turner, Ed., "Using TLS to
              Secure QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021.

   [SCReAM]   Johansson, I., "SCReAM", github repository.

   [TCP-CA]   Jacobson, V. and M.J. Karels, "Congestion Avoidance and
              Control", Lawrence Berkeley Labs Technical Report,
              November 1988.

   [TCP-sub-mss-w]
              Briscoe, B. and K. De Schepper, "Scaling TCP's
              Congestion Window for Small Round Trip Times", BT
              Technical Report TR-TUB8-2015-002, May 2015.

   [UnorderedLTE]
              Austrheim, M.V., "Implementing immediate forwarding for
              4G in a network simulator", Master's Thesis, Uni Oslo,
              June 2019.

Appendix A.  Standardization items

   The following table includes all the items that will need to be
   standardized to provide a full L4S architecture.

   The table is too wide for the ASCII draft format, so it has been
   split into two, with a common column of row index numbers on the
   left.

   The columns in the second part of the table have the following
   meanings:

   WG:  The IETF WG most relevant to this requirement.  The "tcpm/
      iccrg" combination refers to the procedure typically used for
      congestion control changes, where tcpm owns the approval
      decision, but uses the iccrg for expert review [NewCC_Proc];

   TCP:  Applicable to all forms of TCP congestion control;

   DCTCP:  Applicable to Data Center TCP as currently used (in
      controlled environments);

   DCTCP bis:  Applicable to any future Data Center TCP congestion
      control intended for controlled environments;

   XXX Prague:  Applicable to a Scalable variant of XXX (TCP/SCTP/
      RMCAT) congestion control.
   +=====+========================+====================================+
   | Req | Requirement            | Reference                          |
   | #   |                        |                                    |
   +=====+========================+====================================+
   | 0   | ARCHITECTURE           |                                    |
   +-----+------------------------+------------------------------------+
   | 1   | L4S IDENTIFIER         | [I-D.ietf-tsvwg-ecn-l4s-id] S.3    |
   +-----+------------------------+------------------------------------+
   | 2   | DUAL QUEUE AQM         | [I-D.ietf-tsvwg-aqm-dualq-coupled] |
   +-----+------------------------+------------------------------------+
   | 3   | Suitable ECN           | [I-D.ietf-tcpm-accurate-ecn]       |
   |     | Feedback               | S.4.2,                             |
   |     |                        | [I-D.stewart-tsvwg-sctpecn]        |
   +-----+------------------------+------------------------------------+
   |     | SCALABLE TRANSPORT -   |                                    |
   |     | SAFETY ADDITIONS       |                                    |
   +-----+------------------------+------------------------------------+
   | 4-1 | Fall back to Reno/     | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3, |
   |     | Cubic on loss          | [RFC8257]                          |
   +-----+------------------------+------------------------------------+
   | 4-2 | Fall back to Reno/     | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3  |
   |     | Cubic if classic ECN   |                                    |
   |     | bottleneck detected    |                                    |
   +-----+------------------------+------------------------------------+
   | 4-3 | Reduce RTT-            | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3  |
   |     | dependence             |                                    |
   +-----+------------------------+------------------------------------+
   | 4-4 | Scaling TCP's          | [I-D.ietf-tsvwg-ecn-l4s-id] S.4.3, |
   |     | Congestion Window      | [TCP-sub-mss-w]                    |
   |     | for Small Round Trip   |                                    |
   |     | Times                  |                                    |
   +-----+------------------------+------------------------------------+
   |     | SCALABLE TRANSPORT -   |                                    |
   |     | PERFORMANCE            |                                    |
   |     | ENHANCEMENTS           |                                    |
   +-----+------------------------+------------------------------------+
   | 5-1 | Setting ECT in TCP     | [I-D.ietf-tcpm-generalized-ecn]    |
   |     | Control Packets and    |                                    |
   |     | Retransmissions        |                                    |
   +-----+------------------------+------------------------------------+
   | 5-2 | Faster-than-additive   | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx  |
   |     | increase               | A.2.2)                             |
   +-----+------------------------+------------------------------------+
   | 5-3 | Faster Convergence     | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx  |
   |     | at Flow Start          | A.2.2)                             |
   +-----+------------------------+------------------------------------+

                                 Table 1

   +=====+========+=====+=======+===========+========+========+========+
   | #   | WG     | TCP | DCTCP | DCTCP-bis | TCP    | SCTP   | RMCAT  |
   |     |        |     |       |           | Prague | Prague | Prague |
   +=====+========+=====+=======+===========+========+========+========+
   | 0   | tsvwg  | Y   | Y     | Y         | Y      | Y      | Y      |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 1   | tsvwg  |     |       | Y         | Y      | Y      | Y      |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 2   | tsvwg  | n/a | n/a   | n/a       | n/a    | n/a    | n/a    |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 3   | tcpm   | Y   | Y     | Y         | Y      | n/a    | n/a    |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 4-1 | tcpm   |     | Y     | Y         | Y      | Y      | Y      |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 4-2 | tcpm/  |     |       |           | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 4-3 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 4-4 | tcpm   | Y   | Y     | Y         | Y      | Y      | ?      |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 5-1 | tcpm   | Y   | Y     | Y         | Y      | n/a    | n/a    |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 5-2 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 5-3 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   +-----+--------+-----+-------+-----------+--------+--------+--------+

                                 Table 2

Authors' Addresses

   Bob Briscoe (editor)
   Independent
   United Kingdom

   Email: ietf@bobbriscoe.net
   URI:   http://bobbriscoe.net/

   Koen De Schepper
   Nokia Bell Labs
   Antwerp
   Belgium

   Email: koen.de_schepper@nokia.com
   URI:   https://www.bell-labs.com/usr/koen.de_schepper

   Marcelo Bagnulo
   Universidad Carlos III de Madrid
   Av. Universidad 30
   Leganes, Madrid 28911
   Spain

   Phone: 34 91 6249500
   Email: marcelo@it.uc3m.es
   URI:   http://www.it.uc3m.es

   Greg White
   CableLabs
   United States of America

   Email: G.White@CableLabs.com