Transport Area Working Group                             B. Briscoe, Ed.
Internet-Draft                                                Independent
Intended status: Informational                            K. De Schepper
Expires: September 10, 2020                              Nokia Bell Labs
                                                         M. Bagnulo Braun
                                        Universidad Carlos III de Madrid
                                                                 G. White
                                                                CableLabs
                                                            March 9, 2020

   Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
                              Architecture
                      draft-ietf-tsvwg-l4s-arch-06

Abstract

This document describes the L4S architecture, which enables Internet applications to achieve Low Latency, Low Loss, and Scalable throughput (L4S). The insight on which L4S is based is that the root cause of queuing delay is in the congestion controllers of senders, not in the queue itself. The L4S architecture is intended to enable *all* Internet applications to transition away from congestion control algorithms that cause queuing delay, to a new class of congestion controls that utilize explicit congestion signaling provided by the network.
This new class of congestion control can 27 provide low latency for capacity-seeking flows, so applications can 28 achieve both high bandwidth and low latency. 30 The architecture primarily concerns incremental deployment. It 31 defines mechanisms that allow both classes of congestion control to 32 coexist in a shared network. These mechanisms aim to ensure that the 33 latency and throughput performance using an L4S-compliant congestion 34 controller is usually much better (and never worse) than the 35 performance would have been using a 'Classic' congestion controller, 36 and that competing flows continuing to use 'Classic' controllers are 37 typically not impacted by the presence of L4S. These characteristics 38 are important to encourage adoption of L4S congestion control 39 algorithms and L4S compliant network elements. 41 The L4S architecture consists of three components: network support to 42 isolate L4S traffic from classic traffic and to provide appropriate 43 congestion signaling to both types; protocol features that allow 44 network elements to identify L4S traffic and allow for communication 45 of congestion signaling; and host support for immediate congestion 46 signaling with an appropriate congestion response that enables 47 scalable performance. 49 Status of This Memo 51 This Internet-Draft is submitted in full conformance with the 52 provisions of BCP 78 and BCP 79. 54 Internet-Drafts are working documents of the Internet Engineering 55 Task Force (IETF). Note that other groups may also distribute 56 working documents as Internet-Drafts. The list of current Internet- 57 Drafts is at https://datatracker.ietf.org/drafts/current/. 59 Internet-Drafts are draft documents valid for a maximum of six months 60 and may be updated, replaced, or obsoleted by other documents at any 61 time. It is inappropriate to use Internet-Drafts as reference 62 material or to cite them other than as "work in progress." 64 This Internet-Draft will expire on September 10, 2020. 66 Copyright Notice 68 Copyright (c) 2020 IETF Trust and the persons identified as the 69 document authors. All rights reserved. 71 This document is subject to BCP 78 and the IETF Trust's Legal 72 Provisions Relating to IETF Documents 73 (https://trustee.ietf.org/license-info) in effect on the date of 74 publication of this document. Please review these documents 75 carefully, as they describe your rights and restrictions with respect 76 to this document. Code Components extracted from this document must 77 include Simplified BSD License text as described in Section 4.e of 78 the Trust Legal Provisions and are provided without warranty as 79 described in the Simplified BSD License. 81 Table of Contents 83 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 84 2. L4S Architecture Overview . . . . . . . . . . . . . . . . . . 4 85 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 86 4. L4S Architecture Components . . . . . . . . . . . . . . . . . 8 87 5. Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . 11 88 5.1. Why These Primary Components? . . . . . . . . . . . . . . 11 89 5.2. Why Not Alternative Approaches? . . . . . . . . . . . . . 13 90 6. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 15 91 6.1. Applications . . . . . . . . . . . . . . . . . . . . . . 15 92 6.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 17 93 6.3. Deployment Considerations . . . . . . . . . . . . . . . . 18 94 6.3.1. Deployment Topology . . . . . . . . . . . . . . . . . 
19 95 6.3.2. Deployment Sequences . . . . . . . . . . . . . . . . 20 96 6.3.3. L4S Flow but Non-L4S Bottleneck . . . . . . . . . . . 22 97 6.3.4. Other Potential Deployment Issues . . . . . . . . . . 23 98 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 99 8. Security Considerations . . . . . . . . . . . . . . . . . . . 23 100 8.1. Traffic (Non-)Policing . . . . . . . . . . . . . . . . . 23 101 8.2. 'Latency Friendliness' . . . . . . . . . . . . . . . . . 24 102 8.3. Interaction between Rate Policing and L4S . . . . . . . . 25 103 8.4. ECN Integrity . . . . . . . . . . . . . . . . . . . . . . 26 104 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 26 105 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 27 106 10.1. Normative References . . . . . . . . . . . . . . . . . . 27 107 10.2. Informative References . . . . . . . . . . . . . . . . . 27 108 Appendix A. Standardization items . . . . . . . . . . . . . . . 33 109 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 35 111 1. Introduction 113 It is increasingly common for _all_ of a user's applications at any 114 one time to require low delay: interactive Web, Web services, voice, 115 conversational video, interactive video, interactive remote presence, 116 instant messaging, online gaming, remote desktop, cloud-based 117 applications and video-assisted remote control of machinery and 118 industrial processes. In the last decade or so, much has been done 119 to reduce propagation delay by placing caches or servers closer to 120 users. However, queuing remains a major, albeit intermittent, 121 component of latency. For instance spikes of hundreds of 122 milliseconds are common. During a long-running flow, even with 123 state-of-the-art active queue management (AQM), the base speed-of- 124 light path delay roughly doubles. Low loss is also important 125 because, for interactive applications, losses translate into even 126 longer retransmission delays. 128 It has been demonstrated that, once access network bit rates reach 129 levels now common in the developed world, increasing capacity offers 130 diminishing returns if latency (delay) is not addressed. 131 Differentiated services (Diffserv) offers Expedited Forwarding (EF 132 [RFC3246]) for some packets at the expense of others, but this is not 133 sufficient when all (or most) of a user's applications require low 134 latency. 136 Therefore, the goal is an Internet service with ultra-Low queueing 137 Latency, ultra-Low Loss and Scalable throughput (L4S). Ultra-low 138 queuing latency means less than 1 millisecond (ms) on average and 139 less than about 2 ms at the 99th percentile. L4S is potentially for 140 _all_ traffic - a service for all traffic needs none of the 141 configuration or management baggage (traffic policing, traffic 142 contracts) associated with favouring some traffic over others. This 143 document describes the L4S architecture for achieving these goals. 145 It must be said that queuing delay only degrades performance 146 infrequently [Hohlfeld14]. It only occurs when a large enough 147 capacity-seeking (e.g. TCP) flow is running alongside the user's 148 traffic in the bottleneck link, which is typically in the access 149 network. Or when the low latency application is itself a large 150 capacity-seeking flow (e.g. interactive video). At these times, the 151 performance improvement from L4S must be sufficient that network 152 operators will be motivated to deploy it. 
154 Active Queue Management (AQM) is part of the solution to queuing 155 under load. AQM improves performance for all traffic, but there is a 156 limit to how much queuing delay can be reduced by solely changing the 157 network; without addressing the root of the problem. 159 The root of the problem is the presence of standard TCP congestion 160 control (Reno [RFC5681]) or compatible variants (e.g. TCP Cubic 161 [RFC8312]). We shall use the term 'Classic' for these Reno-friendly 162 congestion controls. It has been demonstrated that if the sending 163 host replaces a Classic congestion control with a 'Scalable' 164 alternative, when a suitable AQM is deployed in the network the 165 performance under load of all the above interactive applications can 166 be significantly improved. For instance, queuing delay under heavy 167 load with the example DCTCP/DualQ solution cited below is roughly 1 168 millisecond (1 to 2 ms) at the 99th percentile without losing link 169 utilization. This compares with 5 to 20 ms on _average_ with a 170 Classic congestion control and current state-of-the-art AQMs such as 171 fq_CoDel [RFC8290] or PIE [RFC8033] and about 20-30 ms at the 99th 172 percentile. Also, with a Classic congestion control, reducing 173 queueing to even 5 ms is typically only possible by losing some 174 utilization. 176 It has been demonstrated [DCttH15] that it is possible to deploy such 177 an L4S service alongside the existing best efforts service so that 178 all of a user's applications can shift to it when their stack is 179 updated. Access networks are typically designed with one link as the 180 bottleneck for each site (which might be a home, small enterprise or 181 mobile device), so deployment at a single network node should give 182 nearly all the benefit. The L4S approach also requires component 183 mechanisms at the endpoints to fulfill its goal. This document 184 presents the L4S architecture, by describing the different components 185 and how they interact to provide the scalable low-latency, low-loss, 186 Internet service. 188 2. L4S Architecture Overview 190 There are three main components to the L4S architecture (illustrated 191 in Figure 1): 193 1) Network: L4S traffic needs to be isolated from the queuing 194 latency of Classic traffic. One queue per application flow (FQ) 195 is one way to achieve this, e.g. [RFC8290]. However, just two 196 queues is sufficient and does not require inspection of transport 197 layer headers in the network, which is not always possible (see 198 Section 5.2). With just two queues, it might seem impossible to 199 know how much capacity to schedule for each queue without 200 inspecting how many flows at any one time are using each. And 201 capacity in access networks is too costly to arbitrarily partition 202 into two. The Dual Queue Coupled AQM was developed as a minimal 203 complexity solution to this problem. It acts like a 'semi- 204 permeable' membrane that partitions latency but not bandwidth. 205 Note that there is no bandwidth priority between the two queues 206 because they are for transition from Classic to L4S behaviour, not 207 prioritization. Section 4 gives a high level explanation of how 208 FQ and DualQ solutions work, and 209 [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of the 210 DualQ Coupled AQM framework. 212 2) Protocol: A host needs to distinguish L4S and Classic packets 213 with an identifier so that the network can classify them into 214 their separate treatments. 
[I-D.ietf-tsvwg-ecn-l4s-id] considers various alternative identifiers, and concludes that all alternatives involve compromises, but the ECT(1) and CE codepoints of the ECN field represent a workable solution.

3) Host: Scalable congestion controls already exist. They solve the scaling problem with Reno congestion control that was explained in [RFC3649]. The one used most widely (in controlled environments) is Data Center TCP (DCTCP [RFC8257]), which has been implemented and deployed in Windows Server Editions (since 2012), in Linux and in FreeBSD. Although DCTCP as-is 'works' well over the public Internet, most implementations lack certain safety features that will be necessary once it is used outside controlled environments like data centres (see Section 6.3.3 and Appendix A). A similar scalable congestion control will also need to be transplanted into protocols other than TCP (QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed, while the present document was being drafted, the following scalable congestion controls were implemented: TCP Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT SCReAM controller [RFC8298] and the L4S ECN part of BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] intended for TCP and QUIC.

   [Diagram not reproduced in this rendering: a Scalable sender and a
   Classic sender feed an IP-ECN classifier (component 2); the
   classifier steers L4S packets to a queue with immediate ECN marking
   and Classic packets to a queue with a mark/drop AQM; the two AQMs
   are coupled and a conditional priority scheduler serves the two
   queues (component 1).]

   Figure 1: Components of an L4S Solution: 1) Isolation in separate
   network queues; 2) Packet Identification Protocol; and 3) Scalable
   Sending Host

3. Terminology

Classic Congestion Control: A congestion control behaviour that can co-exist with standard TCP Reno [RFC5681] without causing significantly negative impact on its flow rate [RFC5033]. With Classic congestion controls, as flow rate scales, the number of round trips between congestion signals (losses or ECN marks) rises with the flow rate. So it takes longer and longer to recover after each congestion event. Therefore, control of queuing and utilization becomes very slack, and the slightest disturbance prevents a high rate from being attained [RFC3649].

For instance, with 1500 byte packets and an end-to-end round trip time (RTT) of 36 ms, over the years, as Reno flow rate scales from 2 to 100 Mb/s the number of round trips taken to recover from a congestion event rises proportionately, from 4 to 200.
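As a rough illustration of the arithmetic behind these figures, the following sketch computes the Reno recovery time; it assumes the quoted flow rate is the average of the sawtooth (so the peak window is 4/3 of the average) and that recovery means climbing back from half the peak window at one segment per round trip:

   # Sketch: round trips for Reno to recover after one congestion event.
   # Assumption: the quoted rate is the sawtooth average (peak = 4/3 avg)
   # and the window grows by one segment per RTT during recovery.

   def reno_recovery_rounds(avg_rate_bps, rtt_s, mss_bytes=1500):
       avg_window = avg_rate_bps * rtt_s / (mss_bytes * 8)  # segments/RTT
       peak_window = avg_window * 4 / 3       # top of the sawtooth
       return peak_window / 2                 # climb from W/2 back to W

   for rate_mbps in (2, 100):
       rounds = reno_recovery_rounds(rate_mbps * 1e6, 0.036)
       print(f"{rate_mbps:>3} Mb/s -> about {rounds:.0f} round trips")
   # prints about 4 round trips at 2 Mb/s and about 200 at 100 Mb/s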
Cubic 279 [RFC8312] was developed to be less unscalable, but it is 280 approaching its scaling limit; with the same RTT of 36ms, at 281 100Mb/s it takes about 106 round trips to recover, and at 800 Mb/s 282 its recovery time triples to over 340 round trips, or still more 283 than 12 seconds (Reno would take 57 seconds). 285 Scalable Congestion Control: A congestion control where the average 286 time from one congestion signal to the next (the recovery time) 287 remains invariant as the flow rate scales, all other factors being 288 equal. This maintains the same degree of control over queueing 289 and utilization whatever the flow rate, as well as ensuring that 290 high throughput is robust to disturbances. For instance, DCTCP 291 averages 2 congestion signals per round-trip whatever the flow 292 rate. See Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id] for more 293 explanation. 295 Classic service: The Classic service is intended for all the 296 congestion control behaviours that co-exist with Reno [RFC5681] 297 (e.g. Reno itself, Cubic [RFC8312], Compound 298 [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term 'Classic 299 queue' means a queue providing the Classic service. 301 Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' 302 service is intended for traffic from scalable congestion control 303 algorithms, such as Data Center TCP [RFC8257]. The L4S service is 304 for more general traffic than just DCTCP--it allows the set of 305 congestion controls with similar scaling properties to DCTCP to 306 evolve (e.g. Relentless TCP [Mathis09], TCP Prague [PragueLinux] 307 and the L4S variant of SCREAM for real-time media [RFC8298]). The 308 term 'L4S queue' means a queue providing the L4S service. 310 The terms Classic or L4S can also qualify other nouns, such as 311 'queue', 'codepoint', 'identifier', 'classification', 'packet', 312 'flow'. For example: an L4S packet means a packet with an L4S 313 identifier sent from an L4S congestion control. 315 Both Classic and L4S services can cope with a proportion of 316 unresponsive or less-responsive traffic as well, as long as it 317 does not build a queue (e.g. DNS, VoIP, game sync datagrams, 318 etc). 320 Reno-friendly: The subset of Classic traffic that excludes 321 unresponsive traffic and excludes experimental congestion controls 322 intended to coexist with Reno but without always being strictly 323 friendly to it (as allowed by [RFC5033]). Reno-friendly is used 324 in place of 'TCP-friendly', given that the TCP protocol is used 325 with many different congestion control behaviours. 327 Classic ECN: The original Explicit Congestion Notification (ECN) 328 protocol [RFC3168], which requires ECN signals to be treated the 329 same as drops, both when generated in the network and when 330 responded to by the sender. 332 The names used for the four codepoints of the 2-bit IP-ECN field 333 are as defined in [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where 334 ECT stands for ECN-Capable Transport and CE stands for Congestion 335 Experienced. 337 Site: A home, mobile device, small enterprise or campus, where the 338 network bottleneck is typically the access link to the site. Not 339 all network arrangements fit this model but it is a useful, widely 340 applicable generalization. 342 4. L4S Architecture Components 344 The L4S architecture is composed of the following elements. 346 Protocols:The L4S architecture encompasses the two identifier changes 347 (an unassignment and an assignment) and optional further identifiers: 349 a. 
An essential aspect of a scalable congestion control is the use 350 of explicit congestion signals rather than losses, because the 351 signals need to be sent immediately and frequently. 'Classic' 352 ECN [RFC3168] requires an ECN signal to be treated the same as a 353 drop, both when it is generated in the network and when it is 354 responded to by hosts. L4S needs networks and hosts to support a 355 different meaning for ECN: 357 * much more frequent signals--too often to use drops; 359 * immediately tracking every fluctuation of the queue--too soon 360 to commit to dropping packets. 362 So the standards track [RFC3168] has had to be updated to allow 363 L4S packets to depart from the 'same as drop' constraint. 364 [RFC8311] is a standards track update to relax specific 365 requirements in RFC 3168 (and certain other standards track 366 RFCs), which clears the way for the experimental changes proposed 367 for L4S. [RFC8311] also reclassifies the original experimental 368 assignment of the ECT(1) codepoint as an ECN nonce [RFC3540] as 369 historic. 371 b. [I-D.ietf-tsvwg-ecn-l4s-id] recommends ECT(1) is used as the 372 identifier to classify L4S packets into a separate treatment from 373 Classic packets. This satisfies the requirements for identifying 374 an alternative ECN treatment in [RFC4774]. 376 The CE codepoint is used to indicate Congestion Experienced by 377 both L4S and Classic treatments. This raises the concern that a 378 Classic AQM earlier on the path might have marked some ECT(0) 379 packets as CE. Then these packets will be erroneously classified 380 into the L4S queue. [I-D.ietf-tsvwg-ecn-l4s-id] explains why 5 381 unlikely eventualities all have to coincide for this to have any 382 detrimental effect, which even then would only involve a 383 vanishingly small likelihood of a spurious retransmission. 385 c. A network operator might wish to include certain unresponsive, 386 non-L4S traffic in the L4S queue if it is deemed to be smoothly 387 enough paced and low enough rate not to build a queue. For 388 instance, VoIP, low rate datagrams to sync online games, 389 relatively low rate application-limited traffic, DNS, LDAP, etc. 390 This traffic would need to be tagged with specific identifiers, 391 e.g. a low latency Diffserv Codepoint such as Expedited 392 Forwarding (EF [RFC3246]), Non-Queue-Building (NQB 393 [I-D.white-tsvwg-nqb]), or operator-specific identifiers. 395 Network components: The L4S architecture encompasses either dual- 396 queue or per-flow queue solutions: 398 a. The Coupled Dual Queue AQM achieves the 'semi-permeable' membrane 399 property mentioned earlier as follows. The obvious part is that 400 using two separate queues isolates the queuing delay of one from 401 the other. The less obvious part is how the two queues act as if 402 they are a single pool of bandwidth without the scheduler needing 403 to decide between them. This is achieved by having the Classic 404 AQM provide a congestion signal to both queues in a manner that 405 ensures a consistent response from the two types of congestion 406 control. In other words, the Classic AQM generates a drop/mark 407 probability based on congestion in the Classic queue, uses this 408 probability to drop/mark packets in that queue, and also uses 409 this probability to affect the marking probability in the L4S 410 queue. 
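The following minimal sketch illustrates this coupling, loosely following the framework of [I-D.ietf-tsvwg-aqm-dualq-coupled]: the base AQM acting on the Classic queue yields an internal probability p', Classic packets are dropped or marked with probability p' squared, and L4S packets are ECN-marked at k times p' (or at the L4S queue's own probability, if that is higher). The coupling factor k = 2 and the use of Python are illustrative only:

   # Sketch of the coupled per-packet decisions at dequeue.
   # p_prime is the output of the base AQM on the Classic queue;
   # K is the coupling factor (2 is illustrative, not normative).

   import random

   K = 2.0

   def classic_decision(p_prime):
       # Classic traffic sees the squared probability, matching the
       # infrequent signals that Reno-friendly flows expect.
       return random.random() < p_prime ** 2           # True => drop/mark

   def l4s_decision(p_prime, p_l4s_native):
       # L4S traffic is ECN-marked at the coupled probability, or at the
       # L4S queue's own shallow-target probability if that is higher.
       p_coupled = min(K * p_prime, 1.0)
       return random.random() < max(p_coupled, p_l4s_native)  # True => CE

Squaring on the Classic side compensates for the fact that a Reno-friendly flow's rate varies roughly with 1/sqrt(p) whereas a scalable flow's rate varies roughly with 1/p, so the steady-state rates of the two classes come out comparable.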
This coupling of the congestion signaling between the two 411 queues makes the L4S flows slow down to leave the right amount of 412 capacity for the Classic traffic (as they would if they were the 413 same type of traffic sharing the same queue). Then the scheduler 414 can serve the L4S queue with priority, because the L4S traffic 415 isn't offering up enough traffic to use all the priority that it 416 is given. Therefore, on short time-scales (sub-round-trip) the 417 prioritization of the L4S queue protects its low latency by 418 allowing bursts to dissipate quickly; but on longer time-scales 419 (round-trip and longer) the Classic queue creates an equal and 420 opposite pressure against the L4S traffic to ensure that neither 421 has priority when it comes to bandwidth. The tension between 422 prioritizing L4S and coupling marking from Classic results in 423 per-flow fairness. To protect against unresponsive traffic in 424 the L4S queue taking advantage of the prioritization and starving 425 the Classic queue, it is advisable not to use strict priority, 426 but instead to use a weighted scheduler. 428 When there is no Classic traffic, the L4S queue's AQM comes into 429 play, and it sets an appropriate marking rate to maintain ultra- 430 low queuing delay. 432 The Coupled Dual Queue AQM has been specified as generically as 433 possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying 434 the particular AQMs to use in the two queues so that designers 435 are free to implement diverse ideas. Informational appendices in 436 that draft give pseudocode examples of two different specific AQM 437 approaches: a variant of PIE called DualPI2 (pronounced Dual PI 438 Squared) [DualPI2Linux], and a zero-config variant of RED called 439 Curvy RED. A DualQ Coupled AQM variant based on PIE has also 440 been specified and implemented for Low Latency DOCSIS 441 [DOCSIS3.1]. 443 b. A scheduler with per-flow queues can be used for L4S. It is 444 simple to modify an existing design such as FQ-CoDel or FQ-PIE. 445 For instance within each queue of an FQ_CoDel system, as well as 446 a CoDel AQM, immediate (unsmoothed) shallow threshold ECN marking 447 has been added. Then the Classic AQM such as CoDel or PIE is 448 applied to non-ECN or ECT(0) packets, while the shallow threshold 449 is applied to ECT(1) packets, to give sub-millisecond average 450 queue delay. 452 Host mechanisms: The L4S architecture includes a number of mechanisms 453 in the end host that we enumerate next: 455 a. Data Center TCP is the most widely used example of a scalable 456 congestion control. It has been documented as an informational 457 record of the protocol currently in use [RFC8257]. It will be 458 necessary to define a number of safety features for a variant 459 usable on the public Internet. A draft list of these, known as 460 the Prague L4S requirements, has been drawn up (see Appendix A of 461 [I-D.ietf-tsvwg-ecn-l4s-id]). The list also includes some 462 optional performance improvements. 464 b. Transport protocols other than TCP use various congestion 465 controls designed to be friendly with Reno. Before they can use 466 the L4S service, it will be necessary to implement scalable 467 variants of each of these congestion control behaviours. The 468 following standards track RFCs currently define these protocols: 469 ECN in TCP [RFC3168], in SCTP [RFC4960], in RTP [RFC6679], and in 470 DCCP [RFC4340]. 
Not all are in widespread use, but those that are will eventually need to be updated to allow a different congestion response, which they will have to indicate by using the ECT(1) codepoint. Scalable variants are under consideration for some new transport protocols that are themselves under development, e.g. QUIC [I-D.ietf-quic-transport] and certain real-time media congestion avoidance techniques (RMCAT) protocols.

c. ECN feedback is sufficient for L4S in some transport protocols (RTCP, DCCP) but not others:

   * For the case of TCP, the feedback protocol for ECN embeds the assumption from Classic ECN that an ECN mark is the same as a drop, making it unusable for a scalable TCP. Therefore, the implementation of TCP receivers will have to be upgraded [RFC7560]. Work to standardize and implement more accurate ECN feedback for TCP (AccECN) is in progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux].

   * ECN feedback is only roughly sketched in an appendix of the SCTP specification. A fuller specification has been proposed [I-D.stewart-tsvwg-sctpecn], which would need to be implemented and deployed before SCTP could support L4S.

5. Rationale

5.1. Why These Primary Components?

Explicit congestion signalling (protocol): Explicit congestion signalling is a key part of the L4S approach. In contrast, use of drop as a congestion signal creates a tension because drop is both a useful signal (more would reduce delay) and an impairment (less would reduce delay):

   * Explicit congestion signals can be used many times per round trip, to keep tight control, without any impairment. Under heavy load, even more explicit signals can be applied so the queue can be kept short whatever the load, whereas state-of-the-art AQMs have to introduce very high packet drop at high load to keep the queue short. Further, when using ECN, the congestion control's sawtooth reduction can be smaller and therefore return to the operating point more often, without worrying that this causes more signals (one at the top of each smaller sawtooth). The consequent smaller amplitude sawteeth fit between a very shallow marking threshold and an empty queue, so delay variation can be very low, without risk of under-utilization.

   * Explicit congestion signals can be sent immediately to track fluctuations of the queue. L4S shifts smoothing from the network (which doesn't know the round trip times of all the flows) to the host (which knows its own round trip time). Previously, the network had to smooth to keep a worst-case round trip stable, delaying congestion signals by 100-200 ms.

All the above makes it clear that explicit congestion signalling is only advantageous for latency if it does not have to be considered 'the same as' drop (as was required with Classic ECN [RFC3168]). Therefore, in a DualQ AQM, the L4S queue uses a new L4S variant of ECN that is not equivalent to drop [I-D.ietf-tsvwg-ecn-l4s-id], while the Classic queue uses either classic ECN [RFC3168] or drop, which are equivalent.

Before Classic ECN was standardized, there were various proposals to give an ECN mark a different meaning from drop. However, there was no particular reason to agree on any one of the alternative meanings, so 'the same as drop' was the only compromise that could be reached.
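To make concrete how a sender's response to such signals differs from the Classic 'same as drop' response, a simplified sketch follows (the smoothing gain g and the once-per-RTT bookkeeping loosely follow the DCTCP description in [RFC8257]; the details are illustrative rather than normative):

   # Classic response: treat any signal in an RTT like a loss and halve.
   def classic_cwnd_update(cwnd, any_signal_this_rtt):
       return cwnd * 0.5 if any_signal_this_rtt else cwnd + 1

   # Scalable, DCTCP-style response: reduce in proportion to the moving
   # average of the fraction of CE-marked packets, so reductions stay
   # small and frequent and the sawtooth stays shallow.
   def scalable_cwnd_update(cwnd, alpha, marked_fraction, g=1.0 / 16):
       alpha = (1 - g) * alpha + g * marked_fraction   # EWMA of marking
       if marked_fraction > 0:
           cwnd = cwnd * (1 - alpha / 2)               # small reduction
       else:
           cwnd = cwnd + 1                             # additive increase
       return cwnd, alpha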
RFC 3168 contains a statement that: 541 "An environment where all end nodes were ECN-Capable could 542 allow new criteria to be developed for setting the CE 543 codepoint, and new congestion control mechanisms for end-node 544 reaction to CE packets. However, this is a research issue, and 545 as such is not addressed in this document." 547 Latency isolation with coupled congestion notification (network): 548 Using just two queues is not essential to L4S (more would be 549 possible), but it is the simplest way to isolate all the L4S 550 traffic that keeps latency low from all the legacy Classic traffic 551 that does not. 553 Similarly, coupling the congestion notification between the queues 554 is not necessarily essential, but it is a clever and simple way to 555 allow senders to determine their rate, packet-by-packet, rather 556 than be overridden by a network scheduler. Because otherwise a 557 network scheduler would have to inspect at least transport layer 558 headers, and it would have to continually assign a rate to each 559 flow without any easy way to understand application intent. 561 L4S packet identifier (protocol): Once there are at least two 562 separate treatments in the network, hosts need an identifier at 563 the IP layer to distinguish which treatment they intend to use. 565 Scalable congestion notification (host): A scalable congestion 566 control keeps the signalling frequency high so that rate 567 variations can be small when signalling is stable, and rate can 568 track variations in available capacity as rapidly as possible 569 otherwise. 571 Low loss: Latency is not the only concern of L4S. The 'Low Loss" 572 part of the name denotes that L4S generally achieves zero 573 congestion loss due to its use of ECN. Otherwise, loss would 574 itself cause delay, particularly for short flows, due to 575 retransmission delay [RFC2884]. 577 Scalable throughput: The "Scalable throughput" part of the name 578 denotes that the per-flow throughput of scalable congestion 579 controls should scale indefinitely, avoiding the imminent scaling 580 problems with Reno-friendly congestion control algorithms 581 [RFC3649]. It was known when TCP congestion avoidance was first 582 developed that it would not scale to high bandwidth-delay products 583 (see footnote 6 in [TCP-CA]). Today, regular broadband bit-rates 584 over WAN distances are already beyond the scaling range of Classic 585 Reno congestion control. So `less unscalable' Cubic [RFC8312] and 586 Compound [I-D.sridharan-tcpm-ctcp] variants of TCP have been 587 successfully deployed. However, these are now approaching their 588 scaling limits. For instance, at 800Mb/s with a 20ms round trip, 589 Cubic induces a congestion signal only every 500 round trips or 10 590 seconds, which makes its dynamic control very sloppy. In contrast 591 on average a scalable congestion control like DCTCP or TCP Prague 592 induces 2 congestion signals per round trip, which remains 593 invariant for any flow rate, keeping dynamic control very tight. 595 5.2. Why Not Alternative Approaches? 597 All the following approaches address some part of the same problem 598 space as L4S. In each case, it is shown that L4S complements them or 599 improves on them, rather than being a mutually exclusive alternative: 601 Diffserv: Diffserv addresses the problem of bandwidth apportionment 602 for important traffic as well as queuing latency for delay- 603 sensitive traffic. L4S solely addresses the problem of queuing 604 latency (as well as loss and throughput scaling). 
Diffserv will 605 still be necessary where important traffic requires priority (e.g. 606 for commercial reasons, or for protection of critical 607 infrastructure traffic) - see [I-D.briscoe-tsvwg-l4s-diffserv]. 608 Nonetheless, if there are Diffserv classes for important traffic, 609 the L4S approach can provide low latency for _all_ traffic within 610 each Diffserv class (including the case where there is only one 611 Diffserv class). 613 Also, as already explained, Diffserv only works for a small subset 614 of the traffic on a link. It is not applicable when all the 615 applications in use at one time at a single site (home, small 616 business or mobile device) require low latency. Also, because L4S 617 is for all traffic, it needs none of the management baggage 618 (traffic policing, traffic contracts) associated with favouring 619 some packets over others. This baggage has held Diffserv back 620 from widespread end-to-end deployment. 622 State-of-the-art AQMs: AQMs such as PIE and fq_CoDel give a 623 significant reduction in queuing delay relative to no AQM at all. 624 L4S is intended to complement these AQMs, and should not distract 625 from the need to deploy them as widely as possible. Nonetheless, 626 without addressing the large saw-toothing rate variations of 627 Classic congestion controls, AQMs alone cannot reduce queuing 628 delay too far without significantly reducing link utilization. 629 The L4S approach resolves this tension by ensuring hosts can 630 minimize the size of their sawteeth without appearing so 631 aggressive to legacy flows that they starve them. 633 Per-flow queuing: Similarly, per-flow queuing is not incompatible 634 with the L4S approach. However, one queue for every flow can be 635 thought of as overkill compared to the minimum of two queues for 636 all traffic needed for the L4S approach. The overkill of per-flow 637 queuing has side-effects: 639 A. fq makes high performance networking equipment costly 640 (processing and memory) - in contrast dual queue code can be 641 very simple; 643 B. fq requires packet inspection into the end-to-end transport 644 layer, which doesn't sit well alongside encryption for privacy 645 - in contrast the use of ECN as the classifier for L4S 646 requires no deeper inspection than the IP layer; 648 C. fq isolates the queuing of each flow from the others but not 649 from itself so existing FQ implementations still need to have 650 support for scalable congestion control added. 652 It might seem that self-inflicted queuing delay should not 653 count, because if the delay wasn't in the network it would 654 just shift to the sender. However, modern adaptive 655 applications, e.g. HTTP/2 [RFC7540] or the interactive media 656 applications described in Section 6, can keep low latency 657 objects at the front of their local send queue by shuffling 658 priorities of other objects dependent on the progress of other 659 transfers. They cannot shuffle packets once they have 660 released them into the network. 662 D. fq prevents any one flow from consuming more than 1/N of the 663 capacity at any instant, where N is the number of flows. This 664 is fine if all flows are elastic, but it does not sit well 665 with a variable bit rate real-time multimedia flow, which 666 requires wriggle room to sometimes take more and other times 667 less than a 1/N share. 669 It might seem that an fq scheduler offers the benefit that it 670 prevents individual flows from hogging all the bandwidth. 
671 However, L4S has been deliberately designed so that policing 672 of individual flows can be added as a policy choice, rather 673 than requiring one specific policy choice as the mechanism 674 itself. A scheduler (like fq) has to decide packet-by-packet 675 which flow to schedule without knowing application intent. 676 Whereas a separate policing function can be configured less 677 strictly, so that senders can still control the instantaneous 678 rate of each flow dependent on the needs of each application 679 (e.g. variable rate video), giving more wriggle-room before a 680 flow is deemed non-compliant. Also policing of queuing and of 681 flow-rates can be applied independently. 683 Alternative Back-off ECN (ABE): Here again, L4S is not an 684 alternative to ABE but a complement that introduces much lower 685 queuing delay. ABE [RFC8511] alters the host behaviour in 686 response to ECN marking to utilize a link better and give ECN 687 flows faster throughput. It uses ECT(0) and assumes the network 688 still treats ECN and drop the same. Therefore ABE exploits any 689 lower queuing delay that AQMs can provide. But as explained 690 above, AQMs still cannot reduce queuing delay too far without 691 losing link utilization (to allow for other, non-ABE, flows). 693 BBRv1: v1 of Bottleneck Bandwidth and Round-trip propagation time 694 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing 695 delay end-to-end without needing any special logic in the network, 696 such as an AQM - so it works pretty-much on any path. Setting 697 some problems with capacity sharing aside, queuing delay is good 698 with BBRv1, but perhaps not quite as low as with state-of-the-art 699 AQMs such as PIE or fq_CoDel, and certainly nowhere near as low as 700 with L4S. Queuing delay is also not consistently low, due to its 701 regular bandwidth probes and the aggressive flow start-up phase. 703 L4S is a complement to BBRv1. Indeed BBRv2 uses L4S ECN and a 704 scalable L4S congestion control behaviour in response to any ECN 705 signalling from the path. 707 6. Applicability 709 6.1. Applications 711 A transport layer that solves the current latency issues will provide 712 new service, product and application opportunities. 714 With the L4S approach, the following existing applications will 715 experience significantly better quality of experience under load: 717 o Gaming, including cloud based gaming; 719 o VoIP; 721 o Video conferencing; 723 o Web browsing; 725 o (Adaptive) video streaming; 727 o Instant messaging. 729 The significantly lower queuing latency also enables some interactive 730 application functions to be offloaded to the cloud that would hardly 731 even be usable today: 733 o Cloud based interactive video; 735 o Cloud based virtual and augmented reality. 737 The above two applications have been successfully demonstrated with 738 L4S, both running together over a 40 Mb/s broadband access link 739 loaded up with the numerous other latency sensitive applications in 740 the previous list as well as numerous downloads - all sharing the 741 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a 742 panoramic video of a football stadium could be swiped and pinched so 743 that, on the fly, a proxy in the cloud could generate a sub-window of 744 the match video under the finger-gesture control of each user. For 745 the latter, a virtual reality headset displayed a viewport taken from 746 a 360 degree camera in a racing car. 
The user's head movements 747 controlled the viewport extracted by a cloud-based proxy. In both 748 cases, with 7 ms end-to-end base delay, the additional queuing delay 749 of roughly 1 ms was so low that it seemed the video was generated 750 locally. 752 Using a swiping finger gesture or head movement to pan a video are 753 extremely latency-demanding actions--far more demanding than VoIP. 754 Because human vision can detect extremely low delays of the order of 755 single milliseconds when delay is translated into a visual lag 756 between a video and a reference point (the finger or the orientation 757 of the head sensed by the balance system in the inner ear --- the 758 vestibular system). 760 Without the low queuing delay of L4S, cloud-based applications like 761 these would not be credible without significantly more access 762 bandwidth (to deliver all possible video that might be viewed) and 763 more local processing, which would increase the weight and power 764 consumption of head-mounted displays. When all interactive 765 processing can be done in the cloud, only the data to be rendered for 766 the end user needs to be sent. 768 Other low latency high bandwidth applications such as: 770 o Interactive remote presence; 772 o Video-assisted remote control of machinery or industrial 773 processes. 775 are not credible at all without very low queuing delay. No amount of 776 extra access bandwidth or local processing can make up for lost time. 778 6.2. Use Cases 780 The following use-cases for L4S are being considered by various 781 interested parties: 783 o Where the bottleneck is one of various types of access network: 784 DSL, cable, mobile, satellite 786 * Radio links (cellular, WiFi, satellite) that are distant from 787 the source are particularly challenging. The radio link 788 capacity can vary rapidly by orders of magnitude, so it is 789 often desirable to hold a buffer to utilize sudden increases of 790 capacity; 792 * cellular networks are further complicated by a perceived need 793 to buffer in order to make hand-overs imperceptible; 795 * Satellite networks generally have a very large base RTT, so 796 even with minimal queuing, overall delay can never be extremely 797 low; 799 * Nonetheless, it is certainly desirable not to hold a buffer 800 purely because of the sawteeth of Classic congestion controls, 801 when it is more than is needed for all the above reasons. 803 o Private networks of heterogeneous data centres, where there is no 804 single administrator that can arrange for all the simultaneous 805 changes to senders, receivers and network needed to deploy DCTCP: 807 * a set of private data centres interconnected over a wide area 808 with separate administrations, but within the same company 810 * a set of data centres operated by separate companies 811 interconnected by a community of interest network (e.g. for the 812 finance sector) 814 * multi-tenant (cloud) data centres where tenants choose their 815 operating system stack (Infrastructure as a Service - IaaS) 817 o Different types of transport (or application) congestion control: 819 * elastic (TCP/SCTP); 821 * real-time (RTP, RMCAT); 823 * query (DNS/LDAP). 825 o Where low delay quality of service is required, but without 826 inspecting or intervening above the IP layer 827 [I-D.smith-encrypted-traffic-management]: 829 * mobile and other networks have tended to inspect higher layers 830 in order to guess application QoS requirements. 
However, with growing demand for support of privacy and encryption, L4S offers an alternative. There is no need to select which traffic to favour for queuing, when L4S gives favourable queuing to all traffic.

o If queuing delay is minimized, applications with a fixed delay budget can communicate over longer distances, or via a longer chain of service functions [RFC7665] or onion routers.

6.3. Deployment Considerations

The DualQ is, in itself, an incremental deployment framework for L4S AQMs so that L4S traffic can coexist with existing Classic (Reno-friendly) traffic. Section 6.3.1 explains why only deploying a DualQ AQM [I-D.ietf-tsvwg-aqm-dualq-coupled] in one node at each end of the access link will realize nearly all the benefit of L4S.

L4S involves both end systems and the network, so Section 6.3.2 suggests some typical sequences to deploy each part, and why there will be an immediate and significant benefit after deploying just one part.

If an ECN-enabled DualQ AQM has not been deployed at a bottleneck, an L4S flow is required to include a fall-back strategy to Classic behaviour. Section 6.3.3 describes how an L4S flow detects this, and how to minimize the effect of false negative detection.

6.3.1. Deployment Topology

DualQ AQMs will not have to be deployed throughout the Internet before L4S will work for anyone. Operators of public Internet access networks typically design their networks so that the bottleneck will nearly always occur at one known (logical) link. This confines the cost of queue management technology to one place.

The case of mesh networks is different and will be discussed later in this section. But the known bottleneck case is generally true for Internet access to all sorts of different 'sites', where the word 'site' includes home networks, small-to-medium sized campus or enterprise networks and even cellular devices (Figure 2). Also, this known-bottleneck case tends to be applicable whatever the access link technology, whether xDSL, cable, cellular, line-of-sight wireless or satellite.

Therefore, the full benefit of the L4S service should be available in the downstream direction when the DualQ AQM is deployed at the ingress to this bottleneck link (or links for multihomed sites). And similarly, the full upstream service will be available once the DualQ is deployed at the upstream ingress.

   [Diagram not reproduced in this rendering: DualQ AQMs ('DQ') sit at
   each end of the access link between the core network and each type
   of site (an enterprise/campus network, a home and a mobile device),
   with a data centre (DC) attached to the core.]

   Figure 2: Likely location of DualQ (DQ) Deployments in common access
   topologies

Deployment in mesh topologies depends on how over-booked the core is. If the core is non-blocking, or at least generously provisioned so that the edges are nearly always the bottlenecks, it would only be necessary to deploy the DualQ AQM at the edge bottlenecks. For example, some data-centre networks are designed with the bottleneck in the hypervisor or host NICs, while others bottleneck at the top-of-rack switch (both the output ports facing hosts and those facing the core).
The DualQ would eventually also need to be deployed at any other persistent bottlenecks such as network interconnections, e.g. some public Internet exchange points and the ingress and egress to WAN links interconnecting data-centres.

6.3.2. Deployment Sequences

For any one L4S flow to work, it requires 3 parts to have been deployed. This is the same deployment problem that ECN faced [RFC8170], so lessons have been learned from that experience.

Firstly, L4S deployment exploits the fact that DCTCP already exists on many Internet hosts (Windows, FreeBSD and Linux); both servers and clients. Therefore, just deploying DualQ AQM at a network bottleneck immediately gives a working deployment of all the L4S parts. DCTCP needs some safety concerns to be fixed for general use over the public Internet (see Section 2.3 of [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so these issues can be managed within controlled deployments or controlled trials.

Secondly, the performance improvement with L4S is so significant that it enables new interactive services and products that were not previously possible. It is much easier for companies to initiate new work on deployment if there is budget for a new product trial. If, in contrast, there were only an incremental performance improvement (as with Classic ECN), spending on deployment would be much harder to justify.

Thirdly, the L4S identifier is defined so that initially network operators can enable L4S exclusively for certain customers or certain applications. But this is carefully defined so that it does not compromise future evolution towards L4S as an Internet-wide service. This is because the L4S identifier is defined not only as the end-to-end ECN field, but it can also optionally be combined with any other packet header or some status of a customer or their access link [I-D.ietf-tsvwg-ecn-l4s-id]. Operators could do this anyway, even if it were not blessed by the IETF. However, it is best for the IETF to specify that they must use their own local identifier in combination with the IETF's identifier. Then, if an operator enables the optional local-use approach, they only have to remove this extra rule to make the service work Internet-wide - it will already traverse middleboxes, peerings, etc.

   +-+--------------------+----------------------+---------------------+
   | | Servers or proxies |     Access link      |       Clients       |
   +-+--------------------+----------------------+---------------------+
   |1| DCTCP (existing)   |                      | DCTCP (existing)    |
   | |                    | DualQ AQM downstream |                     |
   | |       WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS        |
   +-+--------------------+----------------------+---------------------+
   |2| TCP Prague         |                      | AccECN (already in  |
   | |                    |                      | progress:DCTCP/BBR) |
   | |                     FULLY WORKS DOWNSTREAM                      |
   +-+--------------------+----------------------+---------------------+
   |3|                    | DualQ AQM upstream   | TCP Prague          |
   | |                    |                      |                     |
   | |               FULLY WORKS UPSTREAM AND DOWNSTREAM               |
   +-+--------------------+----------------------+---------------------+

   Figure 3: Example L4S Deployment Sequences

Figure 3 illustrates some example sequences in which the parts of L4S might be deployed. It consists of the following stages:

1. Here, the immediate benefit of a single AQM deployment can be seen, but limited to a controlled trial or controlled deployment.
976 In this example downstream deployment is first, but in other 977 scenarios the upstream might be deployed first. If no AQM at all 978 was previously deployed for the downstream access, the DualQ AQM 979 greatly improves the Classic service (as well as adding the L4S 980 service). If an AQM was already deployed, the Classic service 981 will be unchanged (and L4S will add an improvement on top). 983 2. In this stage, the name 'TCP Prague' is used to represent a 984 variant of DCTCP that is safe to use in a production environment. 985 If the application is primarily unidirectional, 'TCP Prague' at 986 one end will provide all the benefit needed. Accurate ECN 987 feedback (AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the 988 other end, but it is a generic ECN feedback facility that is 989 already planned to be deployed for other purposes, e.g. DCTCP, 990 BBR [I-D.cardwell-iccrg-bbr-congestion-control]. The two ends 991 can be deployed in either order, because, in TCP, an L4S 992 congestion control only enables itself if it has negotiated the 993 use of AccECN feedback with the other end during the connection 994 handshake. Thus, deployment of TCP Prague on a server enables 995 L4S trials to move to a production service in one direction, 996 wherever AccECN is deployed at the other end. This stage might 997 be further motivated by the performance improvements of TCP 998 Prague relative to DCTCP (see Appendix A.2 of 999 [I-D.ietf-tsvwg-ecn-l4s-id]). 1001 3. This is a two-move stage to enable L4S upstream. The DualQ or 1002 TCP Prague can be deployed in either order as already explained. 1003 To motivate the first of two independent moves, the deferred 1004 benefit of enabling new services after the second move has to be 1005 worth it to cover the first mover's investment risk. As 1006 explained already, the potential for new interactive services 1007 provides this motivation. The DualQ AQM also greatly improves 1008 the upstream Classic service, assuming no other AQM has already 1009 been deployed. 1011 Note that other deployment sequences might occur. For instance: the 1012 upstream might be deployed first; a non-TCP protocol might be used 1013 end-to-end, e.g. QUIC, RMCAT; a body such as the 3GPP might require 1014 L4S to be implemented in 5G user equipment, or other random acts of 1015 kindness. 1017 6.3.3. L4S Flow but Non-L4S Bottleneck 1019 If L4S is enabled between two hosts but there is no L4S AQM at the 1020 bottleneck, any drop from the bottleneck will trigger the L4S sender 1021 to fall back to a classic ('Reno-friendly') behaviour (see 1022 Appendix A.1.3 of [I-D.ietf-tsvwg-ecn-l4s-id]). 1024 Unfortunately, as well as protecting legacy traffic, this rule 1025 degrades the L4S service whenever there is a loss, even if the loss 1026 was not from a non-DualQ bottleneck (false negative). And 1027 unfortunately, prevalent drop can be due to other causes, e.g.: 1029 o congestion loss at other transient bottlenecks, e.g. due to bursts 1030 in shallower queues; 1032 o transmission errors, e.g. due to electrical interference; 1034 o rate policing. 1036 Three complementary approaches are in progress to address this issue, 1037 but they are all currently research: 1039 o In Prague congestion control, ignore certain losses deemed 1040 unlikely to be due to congestion (using some ideas from BBR 1041 [I-D.cardwell-iccrg-bbr-congestion-control] but with no need to 1042 ignore nearly all losses). 
This could mask any of the above types 1043 of loss (requires consensus on how to safely interoperate with 1044 drop-based congestion controls). 1046 o A combination of RACK, reconfigured link retransmission and L4S 1047 could address transmission errors [UnorderedLTE], 1048 [I-D.ietf-tsvwg-ecn-l4s-id]; 1050 o Hybrid ECN/drop rate policers (see Section 8.3). 1052 L4S deployment scenarios that minimize these issues (e.g. over 1053 wireline networks) can proceed in parallel to this research, in the 1054 expectation that research success could continually widen L4S 1055 applicability. 1057 Classic ECN support is starting to materialize on the Internet as an 1058 increased level of CE marking. Given some of this Classic ECN might 1059 be due to single-queue ECN deployment, an L4S sender will have to 1060 fall back to a classic ('Reno-friendly') behaviour if it detects that 1061 ECN marking is accompanied by greater queuing delay or greater delay 1062 variation than would be expected with L4S (see Appendix A.1.4 of 1063 [I-D.ietf-tsvwg-ecn-l4s-id]). It is hard to detect whether this is 1064 all due to the addition of support for ECN in the Linux 1065 implementation of FQ-CoDel, which would not require fall-back to 1066 Classic behaviour, because FQ inherently forces the throughput of 1067 each flow to be equal irrespective of its aggressiveness. 1069 6.3.4. Other Potential Deployment Issues 1071 An L4S AQM uses the ECN field to signal congestion. So, in common 1072 with Classic ECN, if the AQM is within a tunnel or at a lower layer, 1073 correct functioning of ECN signalling requires correct propagation of 1074 the ECN field up the layers [RFC6040], 1075 [I-D.ietf-tsvwg-ecn-encap-guidelines]. 1077 7. IANA Considerations 1079 This specification contains no IANA considerations. 1081 8. Security Considerations 1083 8.1. Traffic (Non-)Policing 1085 Because the L4S service can serve all traffic that is using the 1086 capacity of a link, it should not be necessary to police access to 1087 the L4S service. In contrast, Diffserv only works if some packets 1088 get less favourable treatment than others. So Diffserv has to use 1089 traffic rate policers to limit how much traffic can be favoured. In 1090 turn, traffic policers require traffic contracts between users and 1091 networks as well as pairwise between networks. Because L4S will lack 1092 all this management complexity, it is more likely to work end-to-end. 1094 During early deployment (and perhaps always), some networks will not 1095 offer the L4S service. These networks do not need to police or re- 1096 mark L4S traffic - they just forward it unchanged as best efforts 1097 traffic, as they already forward traffic with ECT(1) today. At a 1098 bottleneck, such networks will introduce some queuing and dropping. 1099 When a scalable congestion control detects a drop it will have to 1100 respond as if it is a Classic congestion control (as required in 1101 Section 2.3 of [I-D.ietf-tsvwg-ecn-l4s-id]). This will ensure safe 1102 interworking with other traffic at the 'legacy' bottleneck, but it 1103 will degrade the L4S service to no better (but never worse) than 1104 classic best efforts, whenever a legacy (non-L4S) bottleneck is 1105 encountered on a path. 1107 Certain network operators might choose to restrict access to the L4S 1108 class, perhaps only to selected premium customers as a value-added 1109 service. Their packet classifier (item 2 in Figure 1) could identify 1110 such customers against some other field (e.g. 
source address range) 1111 as well as ECN. If only the ECN L4S identifier matched, but not the 1112 source address (say), the classifier could direct these packets (from 1113 non-premium customers) into the Classic queue. Clearly explaining 1114 how operators can use additional local classifiers (see 1115 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any tendency to 1116 bleach the L4S identifier. Then at least the L4S ECN identifier will 1117 be more likely to survive end-to-end even though the service may not 1118 be supported at every hop. Such arrangements would only require 1119 simple registered/not-registered packet classification, rather than 1120 the managed, application-specific traffic policing against customer- 1121 specific traffic contracts that Diffserv uses.
1123 8.2. 'Latency Friendliness'
1125 Like the Classic service, the L4S service relies on self-constraint - 1126 limiting rate in response to congestion. In addition, the L4S 1127 service requires self-constraint in terms of limiting latency 1128 (burstiness). It is hoped that self-interest and standardization of 1129 dynamic behaviour (especially flow start-up) will be sufficient to 1130 prevent transports from sending excessive bursts of L4S traffic, 1131 given the application's own latency will suffer most from such 1132 behaviour.
1134 Whether burst policing becomes necessary remains to be seen. Without 1135 it, there will be potential for attacks on the low latency of the L4S 1136 service. However, it may only be necessary to apply such policing 1137 reactively, e.g. punitively targeted at any deployments of new bursty 1138 malware.
1140 A per-flow (5-tuple) queue protection function 1141 [I-D.briscoe-docsis-q-protection] has been developed for the low 1142 latency queue in DOCSIS, which has adopted the DualQ L4S 1143 architecture. It protects the low latency service from any queue- 1144 building flows that accidentally or maliciously classify themselves 1145 into the low latency queue. It is designed to score flows based 1146 solely on their contribution to queuing (not flow rate in itself). 1147 Then, if the shared low latency queue is at risk of exceeding a 1148 threshold, the function redirects enough packets of the highest 1149 scoring flow(s) into the Classic queue to preserve low latency.
1151 Such a queue protection function is not considered a necessary part 1152 of the L4S architecture, which works without it (in a similar way to 1153 how the Internet works without per-flow rate policing). Indeed, 1154 under normal circumstances, DOCSIS queue protection does not 1155 intervene, and if operators find it is not necessary they can disable 1156 it. Part of the L4S experiment will be to see whether such a 1157 function is necessary.
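The following fragment (Python, purely illustrative) sketches the scoring-and-redirect idea described above. It is not the queue protection algorithm specified for DOCSIS or in [I-D.briscoe-docsis-q-protection]; the class name, the ageing factor and the delay threshold are hypothetical values invented for the example.

   # Purely illustrative sketch; NOT the DOCSIS queue protection
   # algorithm.  Each 5-tuple flow is scored by its contribution to
   # queuing and, when the shared low latency queue is at risk,
   # packets of the highest-scoring flow are redirected to the
   # Classic queue.

   from collections import defaultdict

   class QueueProtectionSketch:
       def __init__(self, delay_threshold_us=1000, ageing=0.9):
           self.delay_threshold_us = delay_threshold_us  # hypothetical
           self.ageing = ageing            # ages out stale per-flow scores
           self.score = defaultdict(float) # per-flow queuing score

       def classify(self, flow_id, pkt_bytes, queue_delay_us):
           """Return 'L' (low latency queue) or 'C' (Classic queue)."""
           # The score grows with the queuing each packet experiences or
           # creates, not with the flow's rate in itself.
           self.score[flow_id] = (self.ageing * self.score[flow_id]
                                  + pkt_bytes * queue_delay_us)
           if queue_delay_us < self.delay_threshold_us:
               return 'L'   # queue healthy: nothing is redirected
           # Queue at risk: redirect the highest-scoring flow's packets.
           worst = max(self.score, key=self.score.get)
           return 'C' if flow_id == worst else 'L'

The only property the sketch aims to illustrate is that redirection is driven by each flow's contribution to queuing and only happens when the shared low latency queue is at risk, as described above.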
1159 8.3. Interaction between Rate Policing and L4S
1161 As mentioned in Section 5.2, L4S should remove the need for low 1162 latency Diffserv classes. However, those Diffserv classes that give 1163 certain applications or users priority over capacity would still be 1164 applicable in certain scenarios (e.g. corporate networks). Then, 1165 within such Diffserv classes, L4S would often be applicable to give 1166 traffic low latency and low loss as well. Within such a Diffserv 1167 class, the bandwidth available to a user or application is often 1168 limited by a rate policer. Similarly, in the default Diffserv class, 1169 rate policers are used to partition shared capacity.
1171 A classic rate policer drops any packets exceeding a set rate, 1172 usually also giving a burst allowance (variants exist where the 1173 policer re-marks non-compliant traffic to a discard-eligible Diffserv 1174 codepoint, so that it may be dropped elsewhere during contention). 1175 Whenever L4S traffic encounters one of these rate policers, it will 1176 experience drops and the source has to fall back to a Classic 1177 congestion control, thus losing the benefits of L4S. So, in networks 1178 that already use rate policers and plan to deploy L4S, it will be 1179 preferable to redesign these rate policers to be more friendly to the 1180 L4S service.
1182 L4S-friendly rate policing is currently a research area (note that 1183 this is not the same as latency policing). It might be achieved by 1184 setting a threshold where ECN marking is introduced, such that it is 1185 just under the policed rate or just under the burst allowance where 1186 drop is introduced. This could be applied to various types of rate 1187 policer, e.g. [RFC2697], [RFC2698] or the 'local' (non-ConEx) 1188 variant of the ConEx congestion policer [I-D.briscoe-conex-policing]. 1189 It might also be possible to design scalable congestion controls to 1190 respond less catastrophically to loss that has not been preceded by a 1191 period of increasing delay.
1193 The design of L4S-friendly rate policers will require a separate 1194 dedicated document. For further discussion of the interaction 1195 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv].
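As a purely illustrative example of the marking-threshold idea discussed above (and not a policer design, which as noted would need a separate dedicated document), the following Python fragment sketches a single token-bucket policer that starts ECN-marking ECN-capable L4S traffic at a fill level just under the burst allowance at which it would start dropping. The class name, parameters and fixed marking fraction are all hypothetical.

   # Purely illustrative sketch of an 'L4S-friendly' token-bucket
   # policer: ECN marking starts at a (hypothetical) fill level just
   # under the burst allowance at which a classic policer would start
   # dropping.

   import time

   class L4SFriendlyPolicerSketch:
       def __init__(self, rate_Bps, burst_B, mark_fraction=0.9):
           self.rate = rate_Bps              # policed rate (bytes/s)
           self.burst = burst_B              # burst allowance (bytes)
           self.mark_level = mark_fraction * burst_B
           self.fill = 0.0                   # current bucket fill (bytes)
           self.last = time.monotonic()

       def police(self, pkt_bytes, ecn_capable):
           """Return 'forward', 'mark' (set CE) or 'drop' for a packet."""
           now = time.monotonic()
           self.fill = max(0.0, self.fill - self.rate * (now - self.last))
           self.last = now
           if self.fill + pkt_bytes > self.burst:
               return 'drop'                 # beyond the burst allowance
           self.fill += pkt_bytes
           if ecn_capable and self.fill > self.mark_level:
               return 'mark'                 # early ECN signal before drop
           return 'forward'

A real design would also have to cover traffic that is not ECN-capable, two-rate or multi-colour variants such as [RFC2697] and [RFC2698], and interaction with the sender's fall-back behaviour; the sketch only illustrates where an ECN-marking threshold could sit relative to the drop threshold.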
1197 8.4. ECN Integrity
1199 Receiving hosts can fool a sender into downloading faster by 1200 suppressing feedback of ECN marks (or of losses if retransmissions 1201 are not necessary or available otherwise). Various ways to protect 1202 transport feedback integrity have been developed. For instance:
1204 o The sender can test the integrity of the receiver's feedback by 1205 occasionally setting the IP-ECN field to the congestion 1206 experienced (CE) codepoint, which is normally only set by a 1207 congested link. Then the sender can test whether the receiver's 1208 feedback faithfully reports what it expects (see 2nd para of 1209 Section 20.2 of [RFC3168]; an illustrative sketch of this technique follows at the end of this section).
1211 o A network can enforce a congestion response to its ECN markings 1212 (or packet losses) by auditing congestion exposure (ConEx) 1213 [RFC7713].
1215 o The TCP authentication option (TCP-AO [RFC5925]) can be used to 1216 detect tampering with TCP congestion feedback.
1218 o The ECN Nonce [RFC3540] was proposed to detect tampering with 1219 congestion feedback, but it has been reclassified as historic 1220 [RFC8311].
1222 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of 1223 these techniques including their applicability and pros and cons.
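The following Python fragment is a purely illustrative sketch of the first technique listed above: the sender occasionally sends a packet on which it has itself set CE, then checks that the receiver's ECN feedback reports it. The class name, probe probability and return values are hypothetical and are not taken from [RFC3168] or any implementation.

   # Purely illustrative sketch of a sender-side feedback integrity
   # probe: occasionally send a packet with CE already set, then check
   # that the receiver's ECN feedback reports it.

   import random

   class FeedbackIntegrityProbe:
       def __init__(self, probe_probability=0.001):
           self.probe_probability = probe_probability  # hypothetical
           self.outstanding = set()     # packet numbers sent as CE probes

       def ecn_for_send(self, pkt_num, ecn_codepoint):
           """Occasionally substitute CE for the usual ECT codepoint."""
           if random.random() < self.probe_probability:
               self.outstanding.add(pkt_num)
               return 'CE'
           return ecn_codepoint

       def check_feedback(self, pkt_num, receiver_reported_ce):
           """Return 'suspect' if a CE probe was not fed back as CE."""
           if pkt_num in self.outstanding:
               self.outstanding.discard(pkt_num)
               if not receiver_reported_ce:
                   return 'suspect'     # feedback may be being suppressed
           return 'ok'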
1225 9. Acknowledgements
1227 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David Black 1228 and Jake Holland for their useful review comments.
1230 Bob Briscoe and Koen De Schepper were part-funded by the European 1231 Community under its Seventh Framework Programme through the Reducing 1232 Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe 1233 was also part-funded by the Research Council of Norway through the 1234 TimeIn project, partly by CableLabs and partly by the Comcast 1235 Innovation Fund. The views expressed here are solely those of the 1236 authors.
1238 10. References
1240 10.1. Normative References
1242 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1243 Requirement Levels", BCP 14, RFC 2119, 1244 DOI 10.17487/RFC2119, March 1997, 1245 .
1247 10.2. Informative References
1249 [DCttH15] De Schepper, K., Bondarenko, O., Briscoe, B., and I. 1250 Tsang, "`Data Centre to the Home': Ultra-Low Latency for 1251 All", RITE project Technical Report , 2015, 1252 .
1254 [DOCSIS3.1] 1255 CableLabs, "MAC and Upper Layer Protocols Interface 1256 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable 1257 Service Interface Specifications DOCSIS(R) 3.1 Version i17 1258 or later, January 2019, .
1261 [DualPI2Linux] 1262 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., 1263 and H. Steen, "DUALPI2 - Low Latency, Low Loss and 1264 Scalable (L4S) AQM", Proc. Linux Netdev 0x13 , March 2019, 1265 .
1268 [Hohlfeld14] 1269 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. 1270 Barford, "A QoE Perspective on Sizing Network Buffers", 1271 Proc. ACM Internet Measurement Conf (IMC'14), November 1272 2014.
1274 [I-D.briscoe-conex-policing] 1275 Briscoe, B., "Network Performance Isolation using 1276 Congestion Policing", draft-briscoe-conex-policing-01 1277 (work in progress), February 2014.
1279 [I-D.briscoe-docsis-q-protection] 1280 Briscoe, B. and G. White, "Queue Protection to Preserve 1281 Low Latency", draft-briscoe-docsis-q-protection-00 (work 1282 in progress), July 2019.
1284 [I-D.briscoe-tsvwg-l4s-diffserv] 1285 Briscoe, B., "Interactions between Low Latency, Low Loss, 1286 Scalable Throughput (L4S) and Differentiated Services", 1287 draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), 1288 November 2018.
1290 [I-D.cardwell-iccrg-bbr-congestion-control] 1291 Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, 1292 "BBR Congestion Control", draft-cardwell-iccrg-bbr- 1293 congestion-control-00 (work in progress), July 2017.
1295 [I-D.ietf-quic-transport] 1296 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1297 and Secure Transport", draft-ietf-quic-transport-27 (work 1298 in progress), February 2020.
1300 [I-D.ietf-tcpm-accurate-ecn] 1301 Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More 1302 Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- 1303 ecn-11 (work in progress), March 2020.
1305 [I-D.ietf-tcpm-generalized-ecn] 1306 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1307 Congestion Notification (ECN) to TCP Control Packets", 1308 draft-ietf-tcpm-generalized-ecn-05 (work in progress), 1309 November 2019.
1311 [I-D.ietf-tsvwg-aqm-dualq-coupled] 1312 Schepper, K., Briscoe, B., and G. White, "DualQ Coupled 1313 AQMs for Low Latency, Low Loss and Scalable Throughput 1314 (L4S)", draft-ietf-tsvwg-aqm-dualq-coupled-10 (work in 1315 progress), July 2019.
1317 [I-D.ietf-tsvwg-ecn-encap-guidelines] 1318 Briscoe, B., Kaippallimalil, J., and P. Thaler, 1319 "Guidelines for Adding Congestion Notification to 1320 Protocols that Encapsulate IP", draft-ietf-tsvwg-ecn- 1321 encap-guidelines-13 (work in progress), May 2019.
1323 [I-D.ietf-tsvwg-ecn-l4s-id] 1324 Schepper, K. and B. Briscoe, "Identifying Modified 1325 Explicit Congestion Notification (ECN) Semantics for 1326 Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- 1327 id-09 (work in progress), February 2020.
1329 [I-D.smith-encrypted-traffic-management] 1330 Smith, K., "Network management of encrypted traffic", 1331 draft-smith-encrypted-traffic-management-05 (work in 1332 progress), May 2016.
1334 [I-D.sridharan-tcpm-ctcp] 1335 Sridharan, M., Tan, K., Bansal, D., and D. Thaler, 1336 "Compound TCP: A New TCP Congestion Control for High-Speed 1337 and Long Distance Networks", draft-sridharan-tcpm-ctcp-02 1338 (work in progress), November 2008.
1340 [I-D.stewart-tsvwg-sctpecn] 1341 Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream 1342 Control Transmission Protocol (SCTP)", draft-stewart- 1343 tsvwg-sctpecn-05 (work in progress), January 2014.
1345 [I-D.white-tsvwg-nqb] 1346 White, G. and T. Fossati, "Identifying and Handling Non 1347 Queue Building Flows in a Bottleneck Link", draft-white- 1348 tsvwg-nqb-02 (work in progress), June 2019.
1350 [L4Sdemo16] 1351 Bondarenko, O., De Schepper, K., Tsang, I., and B. 1352 Briscoe, "Ultra-Low Delay for All: Live Experience, 1353 Live Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, 1354 .
1358 [Mathis09] 1359 Mathis, M., "Relentless Congestion Control", PFLDNeT'09 , 1360 May 2009, .
1365 [NewCC_Proc] 1366 Eggert, L., "Experimental Specification of New Congestion 1367 Control Algorithms", IETF Operational Note ion-tsv-alt-cc, 1368 July 2007.
1370 [PragueLinux] 1371 Briscoe, B., De Schepper, K., Albisser, O., Misund, J., 1372 Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing 1373 the `TCP Prague' Requirements for Low Latency Low Loss 1374 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , 1375 March 2019, .
1378 [RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color 1379 Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999, 1380 .
1382 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color 1383 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999, 1384 .
1386 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1387 Explicit Congestion Notification (ECN) in IP Networks", 1388 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1389 .
1391 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1392 of Explicit Congestion Notification (ECN) to IP", 1393 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1394 .
1396 [RFC3246] Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec, 1397 J., Courtney, W., Davari, S., Firoiu, V., and D. 1398 Stiliadis, "An Expedited Forwarding PHB (Per-Hop 1399 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, 1400 .
1402 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 1403 Congestion Notification (ECN) Signaling with Nonces", 1404 RFC 3540, DOI 10.17487/RFC3540, June 2003, 1405 .
1407 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", 1408 RFC 3649, DOI 10.17487/RFC3649, December 2003, 1409 .
1411 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1412 Congestion Control Protocol (DCCP)", RFC 4340, 1413 DOI 10.17487/RFC4340, March 2006, 1414 .
1416 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1417 Explicit Congestion Notification (ECN) Field", BCP 124, 1418 RFC 4774, DOI 10.17487/RFC4774, November 2006, 1419 .
1421 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 1422 RFC 4960, DOI 10.17487/RFC4960, September 2007, 1423 .
1425 [RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion 1426 Control Algorithms", BCP 133, RFC 5033, 1427 DOI 10.17487/RFC5033, August 2007, 1428 .
1430 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1431 Friendly Rate Control (TFRC): Protocol Specification", 1432 RFC 5348, DOI 10.17487/RFC5348, September 2008, 1433 .
1435 [RFC5681] Allman, M., Paxson, V., and E.
Blanton, "TCP Congestion 1436 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1437 . 1439 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1440 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1441 June 2010, . 1443 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1444 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1445 2010, . 1447 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1448 and K. Carlberg, "Explicit Congestion Notification (ECN) 1449 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August 1450 2012, . 1452 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 1453 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 1454 DOI 10.17487/RFC7540, May 2015, 1455 . 1457 [RFC7560] Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe, 1458 "Problem Statement and Requirements for Increased Accuracy 1459 in Explicit Congestion Notification (ECN) Feedback", 1460 RFC 7560, DOI 10.17487/RFC7560, August 2015, 1461 . 1463 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1464 Chaining (SFC) Architecture", RFC 7665, 1465 DOI 10.17487/RFC7665, October 2015, 1466 . 1468 [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1469 Concepts, Abstract Mechanism, and Requirements", RFC 7713, 1470 DOI 10.17487/RFC7713, December 2015, 1471 . 1473 [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, 1474 "Proportional Integral Controller Enhanced (PIE): A 1475 Lightweight Control Scheme to Address the Bufferbloat 1476 Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, 1477 . 1479 [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and 1480 Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, 1481 May 2017, . 1483 [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., 1484 and G. Judd, "Data Center TCP (DCTCP): TCP Congestion 1485 Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, 1486 October 2017, . 1488 [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, 1489 J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler 1490 and Active Queue Management Algorithm", RFC 8290, 1491 DOI 10.17487/RFC8290, January 2018, 1492 . 1494 [RFC8298] Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation 1495 for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December 1496 2017, . 1498 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1499 Notification (ECN) Experimentation", RFC 8311, 1500 DOI 10.17487/RFC8311, January 2018, 1501 . 1503 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1504 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1505 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1506 . 1508 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, 1509 "TCP Alternative Backoff with ECN (ABE)", RFC 8511, 1510 DOI 10.17487/RFC8511, December 2018, 1511 . 1513 [TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and 1514 Control", Laurence Berkeley Labs Technical Report , 1515 November 1988, . 1517 [TCP-sub-mss-w] 1518 Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion 1519 Window for Small Round Trip Times", BT Technical Report 1520 TR-TUB8-2015-002, May 2015, 1521 . 1524 [UnorderedLTE] 1525 Austrheim, M., "Implementing immediate forwarding for 4G 1526 in a network simulator", Masters Thesis, Uni Oslo , June 1527 2019. 1529 Appendix A. Standardization items 1531 The following table includes all the items that will need to be 1532 standardized to provide a full L4S architecture. 
1534 The table is too wide for the ASCII draft format, so it has been 1535 split into two, with a common column of row index numbers on the 1536 left.
1538 The columns in the second part of the table have the following 1539 meanings:
1541 WG: The IETF WG most relevant to this requirement. The "tcpm/iccrg" 1542 combination refers to the procedure typically used for congestion 1543 control changes, where tcpm owns the approval decision, but uses 1544 the iccrg for expert review [NewCC_Proc];
1546 TCP: Applicable to all forms of TCP congestion control;
1548 DCTCP: Applicable to Data Center TCP as currently used (in 1549 controlled environments);
1551 DCTCP-bis: Applicable to a future Data Center TCP congestion 1552 control intended for controlled environments;
1554 XXX Prague: Applicable to a Scalable variant of XXX (TCP/SCTP/RMCAT) 1555 congestion control.
1557 +-----+------------------------+------------------------------------+ 1558 | Req | Requirement | Reference | 1559 | # | | | 1560 +-----+------------------------+------------------------------------+ 1561 | 0 | ARCHITECTURE | | 1562 | 1 | L4S IDENTIFIER | [I-D.ietf-tsvwg-ecn-l4s-id] | 1563 | 2 | DUAL QUEUE AQM | [I-D.ietf-tsvwg-aqm-dualq-coupled] | 1564 | 3 | Suitable ECN Feedback | [I-D.ietf-tcpm-accurate-ecn], | 1565 | | | [I-D.stewart-tsvwg-sctpecn]. | 1566 | | | | 1567 | | SCALABLE TRANSPORT - | | 1568 | | SAFETY ADDITIONS | | 1569 | 4-1 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1570 | | Reno/Cubic on loss | [RFC8257] | 1571 | 4-2 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1572 | | Reno/Cubic if classic | | 1573 | | ECN bottleneck | | 1574 | | detected | | 1575 | | | | 1576 | 4-3 | Reduce RTT-dependence | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1577 | | | | 1578 | 4-4 | Scaling TCP's | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1579 | | Congestion Window for | [TCP-sub-mss-w] | 1580 | | Small Round Trip Times | | 1581 | | SCALABLE TRANSPORT - | | 1582 | | PERFORMANCE | | 1583 | | ENHANCEMENTS | | 1584 | 5-1 | Setting ECT in TCP | [I-D.ietf-tcpm-generalized-ecn] | 1585 | | Control Packets and | | 1586 | | Retransmissions | | 1587 | 5-2 | Faster-than-additive | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1588 | | increase | A.2.2) | 1589 | 5-3 | Faster Convergence at | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1590 | | Flow Start | A.2.2) | 1591 +-----+------------------------+------------------------------------+
1592 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1593 | # | WG | TCP | DCTCP | DCTCP-bis | TCP | SCTP | RMCAT | 1594 | | | | | | Prague | Prague | Prague | 1595 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1596 | 0 | tsvwg | Y | Y | Y | Y | Y | Y | 1597 | 1 | tsvwg | | | Y | Y | Y | Y | 1598 | 2 | tsvwg | n/a | n/a | n/a | n/a | n/a | n/a | 1599 | | | | | | | | | 1600 | | | | | | | | | 1601 | | | | | | | | | 1602 | 3 | tcpm | Y | Y | Y | Y | n/a | n/a | 1603 | | | | | | | | | 1604 | 4-1 | tcpm | | Y | Y | Y | Y | Y | 1605 | | | | | | | | | 1606 | 4-2 | tcpm/ | | | | Y | Y | ? | 1607 | | iccrg? | | | | | | | 1608 | | | | | | | | | 1609 | | | | | | | | | 1610 | | | | | | | | | 1611 | | | | | | | | | 1612 | 4-3 | tcpm/ | | | Y | Y | Y | ? | 1613 | | iccrg? | | | | | | | 1614 | 4-4 | tcpm | Y | Y | Y | Y | Y | ? | 1615 | | | | | | | | | 1616 | | | | | | | | | 1617 | 5-1 | tcpm | Y | Y | Y | Y | n/a | n/a | 1618 | | | | | | | | | 1619 | 5-2 | tcpm/ | | | Y | Y | Y | ? | 1620 | | iccrg? | | | | | | | 1621 | 5-3 | tcpm/ | | | Y | Y | Y | ? | 1622 | | iccrg?
| | | | | | | 1623 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1625 Authors' Addresses 1627 Bob Briscoe (editor) 1628 Independent 1629 UK 1631 Email: ietf@bobbriscoe.net 1632 URI: http://bobbriscoe.net/ 1633 Koen De Schepper 1634 Nokia Bell Labs 1635 Antwerp 1636 Belgium 1638 Email: koen.de_schepper@nokia.com 1639 URI: https://www.bell-labs.com/usr/koen.de_schepper 1641 Marcelo Bagnulo 1642 Universidad Carlos III de Madrid 1643 Av. Universidad 30 1644 Leganes, Madrid 28911 1645 Spain 1647 Phone: 34 91 6249500 1648 Email: marcelo@it.uc3m.es 1649 URI: http://www.it.uc3m.es 1651 Greg White 1652 CableLabs 1653 US 1655 Email: G.White@CableLabs.com