idnits 2.17.1
draft-ietf-tsvwg-l4s-arch-08.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (November 15, 2020) is 1258 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
== Outdated reference: A later version (-07) exists of
draft-briscoe-docsis-q-protection-00
== Outdated reference: A later version (-02) exists of
draft-cardwell-iccrg-bbr-congestion-control-00
== Outdated reference: A later version (-34) exists of
draft-ietf-quic-transport-32
== Outdated reference: A later version (-28) exists of
draft-ietf-tcpm-accurate-ecn-13
== Outdated reference: A later version (-15) exists of
draft-ietf-tcpm-generalized-ecn-06
== Outdated reference: A later version (-25) exists of
draft-ietf-tsvwg-aqm-dualq-coupled-12
== Outdated reference: A later version (-22) exists of
draft-ietf-tsvwg-ecn-encap-guidelines-13
== Outdated reference: A later version (-29) exists of
draft-ietf-tsvwg-ecn-l4s-id-11
== Outdated reference: A later version (-23) exists of
draft-ietf-tsvwg-rfc6040update-shim-10
== Outdated reference: A later version (-07) exists of
draft-stewart-tsvwg-sctpecn-05
-- Obsolete informational reference (is this intentional?): RFC 4960
(Obsoleted by RFC 9260)
-- Obsolete informational reference (is this intentional?): RFC 7540
(Obsoleted by RFC 9113)
-- Obsolete informational reference (is this intentional?): RFC 8312
(Obsoleted by RFC 9438)
Summary: 0 errors (**), 0 flaws (~~), 11 warnings (==), 4 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Transport Area Working Group B. Briscoe, Ed.
3 Internet-Draft Independent
4 Intended status: Informational K. De Schepper
5 Expires: May 19, 2021 Nokia Bell Labs
6 M. Bagnulo Braun
7 Universidad Carlos III de Madrid
8 G. White
9 CableLabs
10 November 15, 2020
12 Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
13 Architecture
14 draft-ietf-tsvwg-l4s-arch-08
16 Abstract
18 This document describes the L4S architecture, which enables Internet
19 applications to achieve Low queuing Latency, Low Loss, and Scalable
20 throughput (L4S). The insight on which L4S is based is that the root
21 cause of queuing delay is in the congestion controllers of senders,
22 not in the queue itself. The L4S architecture is intended to enable
23 _all_ Internet applications to transition away from congestion
24 control algorithms that cause queuing delay, to a new class of
25 congestion controls that induce very little queuing, aided by
26 explicit congestion signaling from the network. This new class of
27 congestion control can provide low latency for capacity-seeking
28 flows, so applications can achieve both high bandwidth and low
29 latency.
31 The architecture primarily concerns incremental deployment. It
32 defines mechanisms that allow the new class of L4S congestion
33 controls to coexist with 'Classic' congestion controls in a shared
34 network. These mechanisms aim to ensure that the latency and
35 throughput performance using an L4S-compliant congestion controller
36 is usually much better (and never worse) than the performance would
37 have been using a 'Classic' congestion controller, and that competing
38 flows continuing to use 'Classic' controllers are typically not
39 impacted by the presence of L4S. These characteristics are important
40 to encourage adoption of L4S congestion control algorithms and L4S
41 compliant network elements.
43 The L4S architecture consists of three components: network support to
44 isolate L4S traffic from classic traffic; protocol features that
45 allow network elements to identify L4S traffic; and host support for
46 L4S congestion controls.
48 Status of This Memo
50 This Internet-Draft is submitted in full conformance with the
51 provisions of BCP 78 and BCP 79.
53 Internet-Drafts are working documents of the Internet Engineering
54 Task Force (IETF). Note that other groups may also distribute
55 working documents as Internet-Drafts. The list of current Internet-
56 Drafts is at https://datatracker.ietf.org/drafts/current/.
58 Internet-Drafts are draft documents valid for a maximum of six months
59 and may be updated, replaced, or obsoleted by other documents at any
60 time. It is inappropriate to use Internet-Drafts as reference
61 material or to cite them other than as "work in progress."
63 This Internet-Draft will expire on May 19, 2021.
65 Copyright Notice
67 Copyright (c) 2020 IETF Trust and the persons identified as the
68 document authors. All rights reserved.
70 This document is subject to BCP 78 and the IETF Trust's Legal
71 Provisions Relating to IETF Documents
72 (https://trustee.ietf.org/license-info) in effect on the date of
73 publication of this document. Please review these documents
74 carefully, as they describe your rights and restrictions with respect
75 to this document. Code Components extracted from this document must
76 include Simplified BSD License text as described in Section 4.e of
77 the Trust Legal Provisions and are provided without warranty as
78 described in the Simplified BSD License.
80 Table of Contents
82 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
83 2. L4S Architecture Overview . . . . . . . . . . . . . . . . . . 5
84 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6
85 4. L4S Architecture Components . . . . . . . . . . . . . . . . . 7
86 5. Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . 12
87 5.1. Why These Primary Components? . . . . . . . . . . . . . . 12
88 5.2. What L4S adds to Existing Approaches . . . . . . . . . . 14
89 6. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 17
90 6.1. Applications . . . . . . . . . . . . . . . . . . . . . . 17
91 6.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 19
92 6.3. Applicability with Specific Link Technologies . . . . . . 20
93 6.4. Deployment Considerations . . . . . . . . . . . . . . . . 20
94 6.4.1. Deployment Topology . . . . . . . . . . . . . . . . . 21
95 6.4.2. Deployment Sequences . . . . . . . . . . . . . . . . 22
96 6.4.3. L4S Flow but Non-ECN Bottleneck . . . . . . . . . . . 25
97 6.4.4. L4S Flow but Classic ECN Bottleneck . . . . . . . . . 25
98 6.4.5. L4S AQM Deployment within Tunnels . . . . . . . . . . 26
99 7. IANA Considerations (to be removed by RFC Editor) . . . . . . 26
100 8. Security Considerations . . . . . . . . . . . . . . . . . . . 26
101 8.1. Traffic Rate (Non-)Policing . . . . . . . . . . . . . . . 26
102 8.2. 'Latency Friendliness' . . . . . . . . . . . . . . . . . 27
103 8.3. Interaction between Rate Policing and L4S . . . . . . . . 29
104 8.4. ECN Integrity . . . . . . . . . . . . . . . . . . . . . . 29
105 8.5. Privacy Considerations . . . . . . . . . . . . . . . . . 30
106 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31
107 10. Informative References . . . . . . . . . . . . . . . . . . . 31
108 Appendix A. Standardization items . . . . . . . . . . . . . . . 38
109 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40
111 1. Introduction
113 It is increasingly common for _all_ of a user's applications at any
114 one time to require low delay: interactive Web, Web services, voice,
115 conversational video, interactive video, interactive remote presence,
116 instant messaging, online gaming, remote desktop, cloud-based
117 applications and video-assisted remote control of machinery and
118 industrial processes. In the last decade or so, much has been done
119 to reduce propagation delay by placing caches or servers closer to
120 users. However, queuing remains a major, albeit intermittent,
121 component of latency. For instance spikes of hundreds of
122 milliseconds are common, even with state-of-the-art active queue
123 management (AQM). During a long-running flow, queuing is typically
124 configured to cause overall network delay to roughly double relative
125 to expected base (unloaded) path delay. Low loss is also important
126 because, for interactive applications, losses translate into even
127 longer retransmission delays.
129 It has been demonstrated that, once access network bit rates reach
130 levels now common in the developed world, increasing capacity offers
131 diminishing returns if latency (delay) is not addressed.
132 Differentiated services (Diffserv) offers Expedited Forwarding
133 (EF [RFC3246]) for some packets at the expense of others, but this is
134 not sufficient when all (or most) of a user's applications require
135 low latency.
137 Therefore, the goal is an Internet service with ultra-Low queueing
138 Latency, ultra-Low Loss and Scalable throughput (L4S). Ultra-low
139 queuing latency means less than 1 millisecond (ms) on average and
140 less than about 2 ms at the 99th percentile. L4S is potentially for
141 _all_ traffic - a service for all traffic needs none of the
142 configuration or management baggage (traffic policing, traffic
143 contracts) associated with favouring some traffic over others. This
144 document describes the L4S architecture for achieving these goals.
146 It must be said that queuing delay only degrades performance
147 infrequently [Hohlfeld14]. It only occurs when a large enough
148 capacity-seeking (e.g. TCP) flow is running alongside the user's
149 traffic in the bottleneck link, which is typically in the access
150 network, or when the low latency application is itself a large
151 capacity-seeking or adaptive rate (e.g. interactive video) flow. At
152 these times, the performance improvement from L4S must be sufficient
153 that network operators will be motivated to deploy it.
155 Active Queue Management (AQM) is part of the solution to queuing
156 under load. AQM improves performance for all traffic, but there is a
157 limit to how much queuing delay can be reduced solely by changing the
158 network, without addressing the root of the problem.
160 The root of the problem is the presence of standard TCP congestion
161 control (Reno [RFC5681]) or compatible variants (e.g. TCP
162 Cubic [RFC8312]). We shall use the term 'Classic' for these Reno-
163 friendly congestion controls. Classic congestion controls induce
164 relatively large saw-tooth-shaped excursions up the queue and down
165 again, which have been growing as flow rate scales [RFC3649]. So if
166 a network operator naively attempts to reduce queuing delay by
167 configuring an AQM to operate at a shallower queue, a Classic
168 congestion control will significantly underutilize the link at the
169 bottom of every saw-tooth.
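The underutilization at the bottom of each saw-tooth can be made concrete with a small illustrative model (an assumption-laden sketch, not part of this architecture: it idealizes Reno as a pure AIMD saw-tooth against a fixed AQM target):

```python
# Illustrative sketch: why a Classic (Reno-like) saw-tooth under-utilizes
# the link when an AQM holds the queue shallow.  The window peaks at
# BDP + queue target (where the AQM signals), then halves; while the
# window is below the BDP, the link is partly idle.
def reno_utilization(bdp_pkts, queue_target_pkts):
    """Average link utilization over one saw-tooth cycle."""
    w_max = bdp_pkts + queue_target_pkts
    w = w_max / 2                      # window just after halving
    sent = 0.0
    capacity = 0.0
    while w < w_max:
        sent += min(w, bdp_pkts)       # cannot exceed the link rate
        capacity += bdp_pkts
        w += 1                         # Reno: +1 packet per round trip
    return sent / capacity

# Deep (BDP-sized) buffer: the pipe never empties.
print(round(reno_utilization(100, 100), 2))  # 1.0
# Shallow queue target: the link idles at the bottom of every saw-tooth.
print(round(reno_utilization(100, 5), 2))    # 0.78
```

The shallower the operating point, the larger the idle fraction, which is the dilemma the text describes.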
171 It has been demonstrated that if the sending host replaces a Classic
172 congestion control with a 'Scalable' alternative, when a suitable AQM
173 is deployed in the network the performance under load of all the
174 above interactive applications can be significantly improved. For
175 instance, queuing delay under heavy load with the example DCTCP/DualQ
176 solution cited below on a DSL or Ethernet link is roughly 1 to 2
177 milliseconds at the 99th percentile without losing link
178 utilization [DualPI2Linux], [DCttH15] (for other link types, see
179 Section 6.3). This compares with 5 to 20 ms on _average_ with a
180 Classic congestion control and current state-of-the-art AQMs such as
181 FQ-CoDel [RFC8290], PIE [RFC8033] or DOCSIS PIE [RFC8034] and about
182 20-30 ms at the 99th percentile [DualPI2Linux].
184 It has also been demonstrated [DCttH15], [DualPI2Linux] that it is
185 possible to deploy such an L4S service alongside the existing best
186 efforts service so that all of a user's applications can shift to it
187 when their stack is updated. Access networks are typically designed
188 with one link as the bottleneck for each site (which might be a home,
189 small enterprise or mobile device), so deployment at each end of this
190 link should give nearly all the benefit in each direction. The L4S
191 approach also requires component mechanisms at the endpoints to
192 fulfill its goal. This document presents the L4S architecture, by
193 describing the different components and how they interact to provide
194 the scalable, low latency, low loss Internet service.
196 2. L4S Architecture Overview
198 There are three main components to the L4S architecture:
200 1) Network: L4S traffic needs to be isolated from the queuing
201 latency of Classic traffic. One queue per application flow (FQ)
202 is one way to achieve this, e.g. FQ-CoDel [RFC8290]. However,
203 just two queues is sufficient and does not require inspection of
204 transport layer headers in the network, which is not always
205 possible (see Section 5.2). With just two queues, it might seem
206 impossible to know how much capacity to schedule for each queue
207 without inspecting how many flows at any one time are using each.
208 And it would be undesirable to arbitrarily divide access network
209 capacity into two partitions. The Dual Queue Coupled AQM was
210 developed as a minimal complexity solution to this problem. It
211 acts like a 'semi-permeable' membrane that partitions latency but
212 not bandwidth. As such, the two queues are for transition from
213 Classic to L4S behaviour, not bandwidth prioritization. Section 4
214 gives a high level explanation of how FQ and DualQ solutions work,
215 and [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of
216 the DualQ Coupled AQM framework.
218 2) Protocol: A host needs to distinguish L4S and Classic packets
219 with an identifier so that the network can classify them into
220 their separate treatments. [I-D.ietf-tsvwg-ecn-l4s-id] considers
221 various alternative identifiers for L4S, and concludes that all
222 alternatives involve compromises, but the ECT(1) and CE codepoints
223 of the ECN field represent a workable solution.
225 3) Host: Scalable congestion controls already exist. They solve the
226 scaling problem with Reno congestion control that was explained in
227 [RFC3649]. The one used most widely (in controlled environments)
228 is Data Center TCP (DCTCP [RFC8257]), which has been implemented
229 and deployed in Windows Server Editions (since 2012), in Linux and
230 in FreeBSD. Although DCTCP as-is 'works' well over the public
231 Internet, most implementations lack certain safety features that
232 will be necessary once it is used outside controlled environments
233 like data centres (see Section 6.4.3 and Appendix A). Scalable
234 congestion control will also need to be implemented in protocols
235 other than TCP (QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed,
236 between the present document being drafted and published, the
237 following scalable congestion controls were implemented: TCP
238 Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT
239 SCReAM controller [RFC8298] and the L4S ECN part of
240 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] intended for TCP
241 and QUIC transports.
243 3. Terminology
245 Classic Congestion Control: A congestion control behaviour that can
246 co-exist with standard TCP Reno [RFC5681] without causing
247 significantly negative impact on its flow rate [RFC5033]. With
248 Classic congestion controls, as flow rate scales, the number of
249 round trips between congestion signals (losses or ECN marks) rises
250 proportionately. So it takes longer and longer to recover
251 after each congestion event. Therefore control of queuing and
252 utilization becomes very slack, and the slightest disturbance
253 prevents a high rate from being attained [RFC3649].
255 For instance, with 1500 byte packets and an end-to-end round trip
256 time (RTT) of 36 ms, over the years, as Reno flow rate scales from
257 2 to 100 Mb/s the number of round trips taken to recover from a
258 congestion event rises proportionately, from 4 to 200.
259 Cubic [RFC8312] was developed to be less unscalable, but it is
260 approaching its scaling limit; with the same RTT of 36 ms, at
262 100 Mb/s it takes about 106 round trips to recover, and at 800 Mb/s
262 its recovery time triples to over 340 round trips, or still more
263 than 12 seconds (Reno would take 57 seconds).
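A back-of-envelope version of this arithmetic (a simplified model, not the draft's exact accounting, which yields slightly different absolute figures) shows the linear growth of Reno's recovery time with flow rate:

```python
# Simplified model: after a Reno flow halves its window, it adds one
# packet per round trip, so recovery takes roughly W/2 round trips,
# where W is the window in packets.  Recovery time therefore grows
# linearly with flow rate.  (The draft's figures of 4 to 200 round
# trips use slightly different accounting but scale the same way.)
PKT_BITS = 1500 * 8
RTT = 0.036  # seconds, as in the example above

def reno_recovery_rounds(rate_bps):
    w = rate_bps * RTT / PKT_BITS      # window in packets at this rate
    return w / 2                       # rounds to climb back after halving

for mbps in (2, 100, 800):
    rounds = reno_recovery_rounds(mbps * 1e6)
    print(f"{mbps:4} Mb/s: ~{rounds:.0f} rounds, ~{rounds * RTT:.1f} s")
```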
265 Scalable Congestion Control: A congestion control where the average
266 time from one congestion signal to the next (the recovery time)
267 remains invariant as the flow rate scales, all other factors being
268 equal. This maintains the same degree of control over queueing
269 and utilization whatever the flow rate, as well as ensuring that
270 high throughput is more robust to disturbances (e.g. from new
271 flows starting). For instance, DCTCP averages 2 congestion
272 signals per round-trip whatever the flow rate, as do other
273 recently developed scalable congestion controls, e.g. Relentless
274 TCP [Mathis09], TCP Prague [PragueLinux] and the L4S variant of
275 SCReAM for real-time media [RFC8298]. See Section 4.3 of
276 [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation.
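The invariance can be illustrated with a toy comparison (an illustrative sketch, not from the draft) of the round trips between congestion signals under the two classes of control:

```python
# Toy comparison of signal spacing as the window W (and hence the flow
# rate, for a given RTT) scales.
# Classic (Reno-like): one signal per saw-tooth, i.e. roughly every W/2
# round trips, so signals get rarer and rarer as rate rises.
# Scalable (DCTCP-like): the control holds marks-per-round-trip constant
# (about 2 for DCTCP, as stated above), so signal spacing is invariant.
def classic_signal_spacing_rtts(w):
    return w / 2

def scalable_signal_spacing_rtts(w, marks_per_rtt=2.0):
    return 1.0 / marks_per_rtt

for w in (30, 300, 3000):  # 100x scaling of flow rate
    print(f"W={w:5}: classic every {classic_signal_spacing_rtts(w):6.1f} RTTs,"
          f" scalable every {scalable_signal_spacing_rtts(w):.1f} RTTs")
```

Constant signal spacing is what keeps the degree of control over queuing and utilization the same at any flow rate.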
278 Classic service: The Classic service is intended for all the
279 congestion control behaviours that co-exist with Reno [RFC5681]
280 (e.g. Reno itself, Cubic [RFC8312],
281 Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term
282 'Classic queue' means a queue providing the Classic service.
284 Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S'
285 service is intended for traffic from scalable congestion control
286 algorithms, such as Data Center TCP [RFC8257]. The L4S service is
287 for more general traffic than just DCTCP--it allows the set of
288 congestion controls with similar scaling properties to DCTCP to
289 evolve, such as the examples listed above (Relentless, Prague,
290 SCReAM). The term 'L4S queue' means a queue providing the L4S
291 service.
293 The terms Classic or L4S can also qualify other nouns, such as
294 'queue', 'codepoint', 'identifier', 'classification', 'packet',
295 'flow'. For example: an L4S packet means a packet with an L4S
296 identifier sent from an L4S congestion control.
298 Both Classic and L4S services can cope with a proportion of
299 unresponsive or less-responsive traffic as well, as long as it
300 does not build a queue (e.g. DNS, VoIP, game sync datagrams, etc).
302 Reno-friendly: The subset of Classic traffic that excludes
303 unresponsive traffic and excludes experimental congestion controls
304 intended to coexist with Reno but without always being strictly
305 friendly to it (as allowed by [RFC5033]). Reno-friendly is used
306 in place of 'TCP-friendly', given that friendliness is a property
307 of the congestion controller (Reno), not the wire protocol (TCP),
308 which is used with many different congestion control behaviours.
310 Classic ECN: The original Explicit Congestion Notification (ECN)
311 protocol [RFC3168], which requires ECN signals to be treated as
312 equivalent to drops, both when generated in the network and when
313 responded to by the sender.
315 The names used for the four codepoints of the 2-bit IP-ECN field
316 are as defined in [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where
317 ECT stands for ECN-Capable Transport and CE stands for Congestion
318 Experienced.
320 Site: A home, mobile device, small enterprise or campus, where the
321 network bottleneck is typically the access link to the site. Not
322 all network arrangements fit this model but it is a useful, widely
323 applicable generalization.
325 4. L4S Architecture Components
327 The L4S architecture is composed of the following elements.
329 Protocols: The L4S architecture encompasses two identifier changes
330 (an unassignment and an assignment) and optional further identifiers:
332 a. An essential aspect of a scalable congestion control is the use
333 of explicit congestion signals rather than losses, because the
334 signals need to be sent frequently and immediately. In contrast,
335 'Classic' ECN [RFC3168] requires an ECN signal to be treated as
336 equivalent to drop, both when it is generated in the network and
337 when it is responded to by hosts. L4S needs networks and hosts
338 to support a different meaning for ECN:
340 * much more frequent signals--too often to require an equivalent
341 excessive degree of drop from non-ECN flows;
343 * immediately tracking every fluctuation of the queue--too soon
344 to warrant dropping packets from non-ECN flows.
346 So the standards track [RFC3168] has had to be updated to allow
347 L4S packets to depart from the 'same as drop' constraint.
348 [RFC8311] is a standards track update to relax specific
349 requirements in RFC 3168 (and certain other standards track
350 RFCs), which clears the way for the experimental changes proposed
351 for L4S. [RFC8311] also reclassifies the original experimental
352 assignment of the ECT(1) codepoint as an ECN nonce [RFC3540] as
353 historic.
355 b. [I-D.ietf-tsvwg-ecn-l4s-id] recommends ECT(1) is used as the
356 identifier to classify L4S packets into a separate treatment from
357 Classic packets. This satisfies the requirements for identifying
358 an alternative ECN treatment in [RFC4774].
360 The CE codepoint is used to indicate Congestion Experienced by
361 both L4S and Classic treatments. This raises the concern that a
362 Classic AQM earlier on the path might have marked some ECT(0)
363 packets as CE. Then these packets will be erroneously classified
364 into the L4S queue. [I-D.ietf-tsvwg-ecn-l4s-id] explains why 5
365 unlikely eventualities all have to coincide for this to have any
366 detrimental effect, which even then would only involve a
367 vanishingly small likelihood of a spurious retransmission.
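The identifier described in (b) admits a very small classifier. This sketch (an illustration, using the codepoint values from RFC 3168) sends ECT(1) and CE to the L4S treatment and everything else to the Classic treatment:

```python
# ECN field values per RFC 3168:
#   Not-ECT = 0b00, ECT(1) = 0b01, ECT(0) = 0b10, CE = 0b11
# L4S classification per [I-D.ietf-tsvwg-ecn-l4s-id]: ECT(1) and CE
# select the L4S queue; Not-ECT and ECT(0) select the Classic queue.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

def classify(ecn_bits):
    return "L4S" if ecn_bits in (ECT_1, CE) else "Classic"
```

Classifying CE into the L4S queue is what gives rise to the concern discussed above about CE set by an upstream Classic AQM on an ECT(0) packet.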
369 c. A network operator might wish to include certain unresponsive,
370 non-L4S traffic in the L4S queue if it is deemed to be paced
371 smoothly enough and of low enough rate not to build a queue. For
372 instance, VoIP, low rate datagrams to sync online games,
373 relatively low rate application-limited traffic, DNS, LDAP, etc.
374 This traffic would need to be tagged with specific identifiers,
375 e.g. a low latency Diffserv Codepoint such as Expedited
376 Forwarding (EF [RFC3246]), Non-Queue-Building
377 (NQB [I-D.white-tsvwg-nqb]), or operator-specific identifiers.
379 Network components: The L4S architecture aims to provide low latency
380 without the _need_ for per-flow operations in network components.
381 Nonetheless, the architecture does not preclude per-flow solutions -
382 it encompasses the following combinations:
384 a. The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the
385 'semi-permeable' membrane property mentioned earlier as follows.
386 The obvious part is that using two separate queues isolates the
387 queuing delay of one from the other. The less obvious part is
388 how the two queues act as if they are a single pool of bandwidth
389 without the scheduler needing to decide between them. This is
390 achieved by having the Classic AQM provide a congestion signal to
391 both queues in a manner that ensures a consistent response from
392 the two types of congestion control. In other words, the Classic
393 AQM generates a drop/mark probability based on congestion in the
394 Classic queue, uses this probability to drop/mark packets in that
395 queue, and also uses this probability to affect the marking
396 probability in the L4S queue. This coupling of the congestion
397 signaling between the two queues makes the L4S flows slow down to
398 leave the right amount of capacity for the Classic traffic (as
399 they would if they were the same type of traffic sharing the same
400 queue). Then the scheduler can serve the L4S queue with
401 priority, because the L4S traffic isn't offering up enough
402 traffic to use all the priority that it is given. Therefore, on
403 short time-scales (sub-round-trip) the prioritization of the L4S
404 queue protects its low latency by allowing bursts to dissipate
405 quickly; but on longer time-scales (round-trip and longer) the
406 Classic queue creates an equal and opposite pressure against the
407 L4S traffic to ensure that neither has priority when it comes to
408 bandwidth. The tension between prioritizing L4S and coupling
409 marking from Classic results in per-flow fairness. To protect
410 against unresponsive traffic in the L4S queue taking advantage of
411 the prioritization and starving the Classic queue, it is
412 advisable not to use strict priority, but instead to use a
413 weighted scheduler (see Appendix A of
414 [I-D.ietf-tsvwg-aqm-dualq-coupled]).
416 When there is no Classic traffic, the L4S queue's AQM comes into
417 play, and it sets an appropriate marking rate to maintain ultra-
418 low queuing delay.
420 The Dual Queue Coupled AQM has been specified as generically as
421 possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying
422 the particular AQMs to use in the two queues so that designers
423 are free to implement diverse ideas. Informational appendices in
424 that draft give pseudocode examples of two different specific AQM
425 approaches: one called DualPI2 (pronounced Dual PI
426 Squared) [DualPI2Linux] that uses the PI2 variant of PIE, and a
427 zero-config variant of RED called Curvy RED. A DualQ Coupled AQM
428 based on PIE has also been specified and implemented for Low
429 Latency DOCSIS [DOCSIS3.1].
431 (2) (1)
432 .-------^------. .--------------^-------------------.
433 ,-(3)-----. ______
434 ; ________ : L4S --------. | |
435 :|Scalable| : _\ ||___\_| mark |
436 :| sender | : __________ / / || / |______|\ _________
437 :|________|\; | |/ --------' ^ \1|condit'nl|
438 `---------'\_| IP-ECN | Coupling : \|priority |_\
439 ________ / |Classifier| : /|scheduler| /
440 |Classic |/ |__________|\ --------. ___:__ / |_________|
441 | sender | \_\ || | |||___\_| mark/|/
442 |________| / || | ||| / | drop |
443 Classic --------' |______|
445 Figure 1: Components of an L4S Solution: 1) Isolation in separate
446 network queues; 2) Packet Identification Protocol; and 3) Scalable
447 Sending Host
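The coupling described in (a) can be sketched as follows (a simplification, not an implementation: the squared relationship and the coupling factor k come from [I-D.ietf-tsvwg-aqm-dualq-coupled], while the steady-state window formulas are the usual textbook approximations):

```python
import math

# The base AQM measures congestion in the Classic queue and derives an
# internal probability p'.  The Classic queue drops/marks with
# p_C = p'**2 (matching Reno's W ~ 1/sqrt(p) law), while the L4S queue
# is marked with the coupled probability p_CL = k * p'.
K = 2.0  # coupling factor; 2 is the value recommended in the DualQ draft

def coupled_probabilities(p_base):
    p_classic = p_base ** 2
    p_l4s = min(K * p_base, 1.0)
    return p_classic, p_l4s

# Rough steady-state windows (packets) under each control law:
#   Reno-like:  W_C ~ sqrt(2 / p_C) = sqrt(2) / p'
#   DCTCP-like: W_L ~ 2 / p_CL     = 1 / p'   (with k = 2)
# Their ratio is a constant whatever p', which is why the coupling
# yields approximate per-flow rate balance between the two queues.
for p_base in (0.01, 0.05, 0.2):
    p_c, p_l = coupled_probabilities(p_base)
    w_c = math.sqrt(2 / p_c)
    w_l = 2 / p_l
    print(f"p'={p_base}: W_C~{w_c:.0f}, W_L~{w_l:.0f}, ratio={w_c / w_l:.2f}")
```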
449 b. A scheduler with per-flow queues can be used for L4S. It is
450 simple to modify an existing design such as FQ-CoDel or FQ-PIE.
451 For instance within each queue of an FQ-CoDel system, as well as
452 a CoDel AQM, immediate (unsmoothed) shallow threshold ECN marking
453 has been added (see Sec.5.2.7 of [RFC8290]). Then the Classic
454 AQM such as CoDel or PIE is applied to non-ECN or ECT(0) packets,
455 while the shallow threshold is applied to ECT(1) packets, to give
456 sub-millisecond average queue delay.
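The per-packet dequeue decision in (b) can be sketched as follows (an illustration only, not the FQ-CoDel source; the 1 ms threshold and the helper names are assumptions):

```python
from dataclasses import dataclass

CE_THRESHOLD = 0.001  # shallow sojourn-time threshold (1 ms, illustrative)

@dataclass
class Packet:
    ecn: str  # "Not-ECT", "ECT(0)", "ECT(1)" or "CE"

def on_dequeue(pkt, sojourn_time, codel_wants_signal):
    """Decide what to do with one departing packet.

    codel_wants_signal: the verdict of the Classic (CoDel/PIE-style)
    AQM for this packet -- True if it would drop or mark it.
    """
    if pkt.ecn in ("ECT(1)", "CE"):
        # L4S path: immediate, unsmoothed marking at the shallow threshold
        if sojourn_time > CE_THRESHOLD:
            pkt.ecn = "CE"
        return "forward"
    # Classic path: smoothed AQM; mark if ECN-capable, otherwise drop
    if codel_wants_signal:
        if pkt.ecn == "ECT(0)":
            pkt.ecn = "CE"
            return "forward"
        return "drop"
    return "forward"
```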
458 c. It would also be possible to use dual queues for isolation, but
459 with per-flow marking to control flow-rates (instead of the
460 coupled per-queue marking of the Dual Queue Coupled AQM). One of
461 the two queues would be for isolating L4S packets, which would be
462 classified by the ECN codepoint. Flow rates could be controlled
463 by flow-specific marking. The policy goal of the marking could
464 be to differentiate flow rates (e.g. [Nadas20], which requires
465 additional signalling of a per-flow 'value'), or to equalize
466 flow-rates (perhaps in a similar way to Approx Fair CoDel [AFCD],
467 [I-D.morton-tsvwg-codel-approx-fair], but with two queues not
468 one).
470 Note that whenever the term 'DualQ' is used loosely without
471 saying whether marking is per-queue or per-flow, it means a dual
472 queue AQM with per-queue marking.
474 Host mechanisms: The L4S architecture includes two main mechanisms in
475 the end host that we enumerate next:
477 a. Scalable Congestion Control: Data Center TCP is the most widely
478 used example. It has been documented as an informational record
479 of the protocol currently in use in controlled
480 environments [RFC8257]. A draft list of safety and performance
481 improvements for a scalable congestion control to be usable on
482 the public Internet has been drawn up (the so-called 'Prague L4S
483 requirements' in Appendix A of [I-D.ietf-tsvwg-ecn-l4s-id]). The
484 subset that involves a risk of harm to others has been captured as
485 normative requirements in Section 4 of
486 [I-D.ietf-tsvwg-ecn-l4s-id]. TCP Prague has been implemented in
487 Linux as a reference implementation to address these requirements
488 [PragueLinux].
490 Transport protocols other than TCP use various congestion
491 controls that are designed to be friendly with Reno. Before they
492 can use the L4S service, they will eventually need to be updated
493 to implement a scalable congestion response, which they will have
494 to indicate by using the ECT(1) codepoint. Scalable variants are
495 under consideration
497 for some new transport protocols that are themselves under
498 development, e.g. QUIC. Also the L4S ECN part of
499 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] is a scalable
500 congestion control intended for the TCP and QUIC transports,
501 amongst others. In addition, an L4S variant of the RMCAT SCReAM
502 controller [RFC8298] has been implemented for media transported
503 over RTP.
505 b. ECN feedback is sufficient for L4S in some transport protocols
506 (specifically DCCP [RFC4340] and QUIC [I-D.ietf-quic-transport]).
507 But others either require update or are in the process of being
508 updated:
510 * For the case of TCP, the feedback protocol for ECN embeds the
511 assumption from Classic ECN [RFC3168] that an ECN mark is
512 equivalent to a drop, making it unusable for a scalable TCP.
513 Therefore, the implementation of TCP receivers will have to be
514 upgraded [RFC7560]. Work to standardize and implement more
515 accurate ECN feedback for TCP (AccECN) is in
516 progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux].
518 * ECN feedback is only roughly sketched in an appendix of the
519 SCTP specification [RFC4960]. A fuller specification has been
520 proposed in a long-expired draft [I-D.stewart-tsvwg-sctpecn],
521 which would need to be implemented and deployed before SCTP
522 could support L4S.
524 * For RTP, sufficient ECN feedback was defined in [RFC6679], but
525 [I-D.ietf-avtcore-cc-feedback-message] defines the latest
526 standards track improvements.
528 5. Rationale
530 5.1. Why These Primary Components?
532 Explicit congestion signalling (protocol): Explicit congestion
533 signalling is a key part of the L4S approach. In contrast, use of
534 drop as a congestion signal creates a tension because drop is both
535 an impairment (less would be better) and a useful signal (more
536 would be better):
538 * Explicit congestion signals can be used many times per round
539 trip, to keep tight control, without any impairment. Under
540 heavy load, even more explicit signals can be applied so the
541 queue can be kept short whatever the load. Whereas state-of-
542 the-art AQMs have to introduce very high packet drop at high
543 load to keep the queue short. Further, when using ECN, the
544 congestion control's sawtooth reduction can be smaller and
545 therefore return to the operating point more often, without
546 worrying that this causes more signals (one at the top of each
547 smaller sawtooth). The consequent smaller amplitude sawteeth
548 fit between a very shallow marking threshold and an empty
549 queue, so queue delay variation can be very low, without risk
550 of under-utilization.
552 * Explicit congestion signals can be sent immediately to track
553 fluctuations of the queue. L4S shifts smoothing from the
554 network (which doesn't know the round trip times of all the
555 flows) to the host (which knows its own round trip time).
556 Previously, the network had to smooth to keep a worst-case
557 round trip stable, which delayed congestion signals by 100-200
558 ms.
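The host-based smoothing and proportional response described above is what DCTCP-style scalable controls already do. The sketch below is a simplified illustration (not a definitive implementation; class and parameter names are illustrative, with the EWMA gain taken from DCTCP):

```python
# Simplified sketch of a DCTCP-style scalable congestion response:
# the host smooths ECN feedback over its own round trip (an EWMA of
# the marked fraction) and reduces cwnd in proportion to it, instead
# of the network smoothing signals for a worst-case RTT.
# Names and values are illustrative.

G = 1 / 16  # EWMA gain, as used by DCTCP

class ScalableSender:
    def __init__(self, cwnd=100.0):
        self.cwnd = cwnd
        self.alpha = 0.0        # smoothed fraction of marked packets

    def on_round_trip(self, acked, marked):
        """Called once per round trip with acked/marked packet counts."""
        frac = marked / acked if acked else 0.0
        self.alpha = (1 - G) * self.alpha + G * frac
        if marked:
            # Proportional reduction: only halves if everything is marked
            self.cwnd *= 1 - self.alpha / 2
        else:
            self.cwnd += 1      # additive increase, 1 packet per RTT

s = ScalableSender()
s.on_round_trip(acked=100, marked=10)   # 10% of packets marked
```

Because the reduction is proportional to the smoothed marking fraction, the sawteeth stay small and frequent, which is exactly what lets the queue sit just above an empty state.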
560 All the above makes it clear that explicit congestion signalling
561 is only advantageous for latency if it does not have to be
562 considered 'equivalent to' drop (as was required with Classic
563 ECN [RFC3168]). Therefore, in an L4S AQM, the L4S queue uses a
564 new L4S variant of ECN that is not equivalent to
565 drop [I-D.ietf-tsvwg-ecn-l4s-id], while the Classic queue uses
566 either classic ECN [RFC3168] or drop, which are equivalent to each
567 other.
569 Before Classic ECN was standardized, there were various proposals
570 to give an ECN mark a different meaning from drop. However, there
571 was no particular reason to agree on any one of the alternative
572 meanings, so 'equivalent to drop' was the only compromise that
573 could be reached. RFC 3168 contains a statement that:
575 "An environment where all end nodes were ECN-Capable could
576 allow new criteria to be developed for setting the CE
577 codepoint, and new congestion control mechanisms for end-node
578 reaction to CE packets. However, this is a research issue, and
579 as such is not addressed in this document."
581 Latency isolation (network): L4S congestion controls keep queue
582 delay low whereas Classic congestion controls need a queue of the
583 order of the RTT to avoid under-utilization. One queue cannot
584 have two lengths, so L4S traffic needs to be isolated in a
585 separate queue (e.g. DualQ) or queues (e.g. FQ).
587 Coupled congestion notification: Coupling the congestion
588 notification between two queues as in the DualQ Coupled AQM is not
589 necessarily essential, but it is a simple way to allow senders to
590 determine their rate, packet by packet, rather than be overridden
591 by a network scheduler. An alternative is for a network scheduler
592 to control the rate of each application flow (see discussion in
593 Section 5.2).
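The coupling relationship itself is defined in [I-D.ietf-tsvwg-aqm-dualq-coupled]: the base AQM outputs a probability p', Classic traffic is dropped/marked with p' squared (matching the 1/sqrt(p) rate response of Reno/Cubic), and L4S traffic is marked with k times p'. A minimal sketch (the function name is illustrative; k = 2 is an example value discussed in that draft):

```python
# Sketch of the DualQ coupling between the two queues, following the
# relationship in the referenced DualQ Coupled AQM draft: Classic
# traffic sees the square of the base probability p', while L4S
# traffic sees a linearly coupled marking probability k * p'.
# This roughly balances flow rates across the two queues.

K = 2.0  # example coupling factor; see the draft for guidance

def coupled_probabilities(p_base):
    """Return (classic_drop_prob, l4s_coupled_mark_prob) for base p'."""
    p_c = p_base ** 2              # Classic: squared base probability
    p_cl = min(K * p_base, 1.0)    # L4S: linear in base probability
    return p_c, p_cl

p_c, p_cl = coupled_probabilities(0.1)
# With p' = 0.1: Classic sees 1% drop/mark, L4S sees 20% marking
```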
595 L4S packet identifier (protocol): Once there are at least two
596 treatments in the network, hosts need an identifier at the IP
597 layer to distinguish which treatment they intend to use.
599 Scalable congestion notification: A scalable congestion control in
600 the host keeps the signalling frequency from the network high so
601 that rate variations can be small when signalling is stable, and
602 rate can track variations in available capacity as rapidly as
603 possible otherwise.
605 Low loss: Latency is not the only concern of L4S. The 'Low Loss'
606 part of the name denotes that L4S generally achieves zero
607 congestion loss due to its use of ECN. Otherwise, loss would
608 itself cause delay, particularly for short flows, due to
609 retransmission delay [RFC2884].
611 Scalable throughput: The "Scalable throughput" part of the name
612 denotes that the per-flow throughput of scalable congestion
613 controls should scale indefinitely, avoiding the imminent scaling
614 problems with Reno-friendly congestion control
615 algorithms [RFC3649]. It was known when TCP congestion avoidance
616 was first developed that it would not scale to high bandwidth-
617 delay products (see footnote 6 in [TCP-CA]). Today, regular
618 broadband bit-rates over WAN distances are already beyond the
619 scaling range of Classic Reno congestion control. So 'less
620 unscalable' Cubic [RFC8312] and Compound [I-D.sridharan-tcpm-ctcp]
621 variants of TCP have been successfully deployed. However, these
622 are now approaching their scaling limits. As the examples in
623 Section 3 demonstrate, as flow rate scales, Classic congestion
624 controls like Reno or Cubic induce a congestion signal less and
625 less frequently (once every few hundred round trips at today's
626 flow rates, and growing), which makes dynamic control very
627 sloppy. In contrast, a scalable congestion control like DCTCP
628 or TCP Prague induces on average 2 congestion signals per round
629 trip, a frequency that remains invariant for any flow rate,
630 keeping dynamic control very tight.
632 Although work on scaling congestion controls tends to start with
633 TCP as the transport, the above is not intended to exclude other
634 transports (e.g. SCTP, QUIC) or less elastic algorithms
635 (e.g. RMCAT), which all tend to adopt the same or similar
636 developments.
638 5.2. What L4S adds to Existing Approaches
640 All the following approaches address some part of the same problem
641 space as L4S. In each case, it is shown that L4S complements them or
642 improves on them, rather than being a mutually exclusive alternative:
644 Diffserv: Diffserv addresses the problem of bandwidth apportionment
645 for important traffic as well as queuing latency for delay-
646 sensitive traffic. Of these, L4S solely addresses the problem of
647 queuing latency. Diffserv will still be necessary where important
648 traffic requires priority (e.g. for commercial reasons, or for
649 protection of critical infrastructure traffic) - see
650 [I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach
651 can provide low latency for _all_ traffic within each Diffserv
652 class (including the case where there is only the one default
653 Diffserv class).
655 Also, Diffserv only works for a small subset of the traffic on a
656 link. As already explained, it is not applicable when all the
657 applications in use at one time at a single site (home, small
658 business or mobile device) require low latency. In contrast,
659 because L4S is for all traffic, it needs none of the management
660 baggage (traffic policing, traffic contracts) associated with
661 favouring some packets over others. This baggage has probably
662 held Diffserv back from widespread end-to-end deployment.
664 In particular, because networks tend not to trust end systems to
665 identify which packets should be favoured over others, where
666 networks assign packets to Diffserv classes they often use packet
667 inspection of application flow identifiers or deeper inspection of
668 application signatures. Thus, nowadays, Diffserv doesn't always
669 sit well with encryption of the layers above IP. So users have to
670 choose between privacy and QoS.
672 As with Diffserv, the L4S identifier is in the IP header. But, in
673 contrast to Diffserv, the L4S identifier does not convey a want or
674 a need for a certain level of quality. Rather, it promises a
675 certain behaviour (scalable congestion response), which networks
676 can objectively verify if they need to. This is because low delay
677 depends on collective host behaviour, whereas bandwidth priority
678 depends on network behaviour.
680 State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a
681 significant reduction in queuing delay relative to no AQM at all.
682 L4S is intended to complement these AQMs, and should not distract
683 from the need to deploy them as widely as possible. Nonetheless,
684 AQMs alone cannot reduce queuing delay too far without
685 significantly reducing link utilization, because the root cause of
686 the problem is on the host - where Classic congestion controls use
687 large saw-toothing rate variations. The L4S approach resolves
688 this tension by ensuring hosts can minimize the size of their
689 sawteeth without appearing so aggressive to Classic flows that
690 they starve them.
692 Per-flow queuing or marking: Similarly, per-flow approaches such as
693 FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the
694 L4S approach. However, per-flow queuing alone is not enough - it
695 only isolates the queuing of one flow from others, not from
696 itself. Per-flow implementations still need to have support for
697 scalable congestion control added, which has already been done in
698 FQ-CoDel (see Sec.5.2.7 of [RFC8290]). Without this simple
699 modification, per-flow AQMs like FQ-CoDel would still not be able
700 to support applications that need both ultra-low delay and high
701 bandwidth, e.g. video-based control of remote procedures, or
702 interactive cloud-based video (see Note 1 below).
704 Although per-flow techniques are not incompatible with L4S, it is
705 important to have the DualQ alternative. This is because handling
706 end-to-end (layer 4) flows in the network (layer 3 or 2) precludes
707 some important end-to-end functions. For instance:
709 A. Per-flow forms of L4S like FQ-CoDel are incompatible with full
710 end-to-end encryption of transport layer identifiers for
711 privacy and confidentiality (e.g. IPsec or encrypted VPN
712 tunnels), because they require packet inspection to access the
713 end-to-end transport flow identifiers.
715 In contrast, the DualQ form of L4S requires no deeper
716 inspection than the IP layer. So, as long as operators take
717 the DualQ approach, their users can have both ultra-low
718 queuing delay and full end-to-end encryption [RFC8404].
720 B. With per-flow forms of L4S, the network takes over control of
721 the relative rates of each application flow. Some see it as
722 an advantage that the network will prevent some flows running
723 faster than others. Others consider it an inherent part of
724 the Internet's appeal that applications can control their rate
725 while taking account of the needs of others via congestion
726 signals. They maintain that this has allowed applications
727 with interesting rate behaviours to evolve, for instance,
728 variable bit-rate video that varies around an equal share
729 rather than being forced to remain equal at every instant, or
730 scavenger services that use less than an equal share of
731 capacity [LEDBAT_AQM].
733 The L4S architecture does not require the IETF to commit to
734 one approach over the other, because it supports both, so that
735 the market can decide. Nonetheless, in the spirit of 'Do one
736 thing and do it well' [McIlroy78], the DualQ option provides
737 low delay without prejudging the issue of flow-rate control.
738 Then, flow rate policing can be added separately if desired.
739 This allows application control up to a point, but the network
740 can still choose to set the point at which it intervenes to
741 prevent one flow completely starving another.
743 Note:
745 1. It might seem that self-inflicted queuing delay within a per-
746 flow queue should not be counted, because if the delay wasn't
747 in the network it would just shift to the sender. However,
748 modern adaptive applications, e.g. HTTP/2 [RFC7540] or some
749 interactive media applications (see Section 6.1), can keep low
750 latency objects at the front of their local send queue by
751 shuffling priorities of other objects dependent on the
752 progress of other transfers. They cannot shuffle objects once
753 they have released them into the network.
755 Alternative Back-off ECN (ABE): Here again, L4S is not an
756 alternative to ABE but a complement that introduces much lower
757 queuing delay. ABE [RFC8511] alters the host behaviour in
758 response to ECN marking to utilize a link better and give ECN
759 flows faster throughput. It uses ECT(0) and assumes the network
760 still treats ECN and drop the same. Therefore ABE exploits any
761 lower queuing delay that AQMs can provide. But as explained
762 above, AQMs still cannot reduce queuing delay too far without
763 losing link utilization (to allow for other, non-ABE, flows).
765 BBR: Bottleneck Bandwidth and Round-trip propagation time
766 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing
767 delay end-to-end without needing any special logic in the network,
768 such as an AQM. So it works pretty much on any path (although it
769 has not been without problems, particularly capacity sharing in
770 BBRv1). BBR keeps queuing delay reasonably low, but perhaps not
771 quite as low as with state-of-the-art AQMs such as PIE or FQ-
772 CoDel, and certainly nowhere near as low as with L4S. Queuing
773 delay is also not consistently low, due to BBR's regular bandwidth
774 probing spikes and its aggressive flow start-up phase.
776 L4S complements BBR. Indeed BBRv2 uses L4S ECN and a scalable L4S
777 congestion control behaviour in response to any ECN signalling
778 from the path. The L4S ECN signal complements the delay based
779 congestion control aspects of BBR with an explicit indication that
780 hosts can use, both to converge on a fair rate and to keep below a
781 shallow queue target set by the network. Without L4S ECN, both
782 these aspects need to be assumed or estimated.
784 6. Applicability
786 6.1. Applications
788 A transport layer that solves the current latency issues will provide
789 new service, product and application opportunities.
791 With the L4S approach, the following existing applications also
792 experience significantly better quality of experience under load:
794 o Gaming, including cloud based gaming;
796 o VoIP;
798 o Video conferencing;
800 o Web browsing;
802 o (Adaptive) video streaming;
804 o Instant messaging.
806 The significantly lower queuing latency also enables some interactive
807 application functions that would hardly even be usable today to be
808 offloaded to the cloud:
810 o Cloud based interactive video;
812 o Cloud based virtual and augmented reality.
814 The above two applications have been successfully demonstrated with
815 L4S, both running together over a 40 Mb/s broadband access link
816 loaded up with the numerous other latency sensitive applications in
817 the previous list as well as numerous downloads - all sharing the
818 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a
819 panoramic video of a football stadium could be swiped and pinched so
820 that, on the fly, a proxy in the cloud could generate a sub-window of
821 the match video under the finger-gesture control of each user. For
822 the latter, a virtual reality headset displayed a viewport taken from
823 a 360 degree camera in a racing car. The user's head movements
824 controlled the viewport extracted by a cloud-based proxy. In both
825 cases, with 7 ms end-to-end base delay, the additional queuing delay
826 of roughly 1 ms was so low that it seemed the video was generated
827 locally.
829 Using a swiping finger gesture or head movement to pan a video is
830 an extremely latency-demanding action - far more demanding than
831 VoIP - because human vision can detect extremely low delays, of
832 the order of single milliseconds, when delay is translated into a
833 visual lag between a video and a reference point (the finger, or
834 the orientation of the head sensed by the balance system in the
835 inner ear - the vestibular system).
837 Without the low queuing delay of L4S, cloud-based applications like
838 these would not be credible without significantly more access
839 bandwidth (to deliver all possible video that might be viewed) and
840 more local processing, which would increase the weight and power
841 consumption of head-mounted displays. When all interactive
842 processing can be done in the cloud, only the data to be rendered for
843 the end user needs to be sent.
845 Other low latency high bandwidth applications such as:
847 o Interactive remote presence;
849 o Video-assisted remote control of machinery or industrial
850 processes.
852 are not credible at all without very low queuing delay. No amount of
853 extra access bandwidth or local processing can make up for lost time.
855 6.2. Use Cases
857 The following use-cases for L4S are being considered by various
858 interested parties:
860 o Where the bottleneck is one of various types of access network:
861 e.g. DSL, Passive Optical Networks (PON), DOCSIS cable, mobile,
862 satellite (see Section 6.3 for some technology-specific details)
864 o Private networks of heterogeneous data centres, where there is no
865 single administrator that can arrange for all the simultaneous
866 changes to senders, receivers and network needed to deploy DCTCP:
868 * a set of private data centres interconnected over a wide area
869 with separate administrations, but within the same company
871 * a set of data centres operated by separate companies
872 interconnected by a community of interest network (e.g. for the
873 finance sector)
875 * multi-tenant (cloud) data centres where tenants choose their
876 operating system stack (Infrastructure as a Service - IaaS)
878 o Different types of transport (or application) congestion control:
880 * elastic (TCP/SCTP);
882 * real-time (RTP, RMCAT);
884 * query (DNS/LDAP).
886 o Where low delay quality of service is required, but without
887 inspecting or intervening above the IP layer [RFC8404]:
889 * mobile and other networks have tended to inspect higher layers
890 in order to guess application QoS requirements. However, with
891 growing demand for support of privacy and encryption, L4S
892 offers an alternative. There is no need to select which
893 traffic to favour for queuing, when L4S gives favourable
894 queuing to all traffic.
896 o If queuing delay is minimized, applications with a fixed delay
897 budget can communicate over longer distances, or via a longer
898 chain of service functions [RFC7665] or onion routers.
900 o If delay jitter is minimized, it is possible to reduce the
901 dejitter buffers on the receive end of video streaming, which
902 should improve the interactive experience.
904 6.3. Applicability with Specific Link Technologies
906 Certain link technologies aggregate data from multiple packets into
907 bursts, and buffer incoming packets while building each burst. WiFi,
908 PON and cable all involve such packet aggregation, whereas fixed
909 Ethernet and DSL do not. No sender, whether L4S or not, can do
910 anything to reduce the buffering needed for packet aggregation. So
911 an AQM should not count this buffering as part of the queue that it
912 controls, since no amount of congestion signalling will reduce it.
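One way an AQM on an aggregating link could honour this is to subtract the unavoidable aggregation delay from the sojourn time before deciding whether to signal congestion. This is a hypothetical sketch, not a mechanism specified by this document; the names and the 4 ms figure are illustrative:

```python
# Hypothetical sketch: an AQM on an aggregating link (e.g. WiFi or
# PON) excludes the unavoidable burst-building delay from the queue
# delay it controls, so congestion signals only target the part of
# the queue that senders can actually reduce. Values illustrative.

AGGREGATION_DELAY = 0.004   # e.g. ~4 ms to build each burst

def controllable_delay(sojourn_time):
    """Portion of a packet's sojourn time the AQM should act on."""
    return max(0.0, sojourn_time - AGGREGATION_DELAY)
```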
914 Certain link technologies also add buffering for other reasons,
915 specifically:
917 o Radio links (cellular, WiFi, satellite) that are distant from the
918 source are particularly challenging. The radio link capacity can
919 vary rapidly by orders of magnitude, so it is considered desirable
920 to hold a standing queue that can utilize sudden increases of
921 capacity;
923 o Cellular networks are further complicated by a perceived need to
924 buffer in order to make hand-overs imperceptible.
926 L4S cannot remove the need for all these different forms of
927 buffering. However, by removing 'the longest pole in the tent'
928 (buffering for the large sawteeth of Classic congestion controls),
929 L4S exposes all these 'shorter poles' to greater scrutiny.
931 Until now, the buffering needed for these additional reasons tended
932 to be over-specified - with the excuse that none were 'the longest
933 pole in the tent'. But having removed the 'longest pole', it becomes
934 worthwhile to minimize them, for instance reducing packet aggregation
935 burst sizes and MAC scheduling intervals.
937 6.4. Deployment Considerations
939 L4S AQMs, whether DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] or FQ,
940 e.g. [RFC8290], are, in themselves, an incremental deployment
941 mechanism for L4S - so that L4S traffic can coexist with existing
942 Classic (Reno-friendly) traffic. Section 6.4.1 explains why only
943 deploying an L4S AQM in one node at each end of the access link will
944 realize nearly all the benefit of L4S.
946 L4S involves both end systems and the network, so Section 6.4.2
947 suggests some typical sequences to deploy each part, and why there
948 will be an immediate and significant benefit after deploying just one
949 part.
951 Section 6.4.3 and Section 6.4.4 describe the converse incremental
952 deployment case where there is no L4S AQM at the network bottleneck,
953 so any L4S flow traversing this bottleneck has to take care in case
954 it is competing with Classic traffic.
956 6.4.1. Deployment Topology
958 L4S AQMs will not have to be deployed throughout the Internet before
959 L4S will work for anyone. Operators of public Internet access
960 networks typically design their networks so that the bottleneck will
961 nearly always occur at one known (logical) link. This confines the
962 cost of queue management technology to one place.
964 The case of mesh networks is different and will be discussed later in
965 this section. But the known bottleneck case is generally true for
966 Internet access to all sorts of different 'sites', where the word
967 'site' includes home networks, small- to medium-sized campus or
968 enterprise networks and even cellular devices (Figure 2). Also, this
969 known-bottleneck case tends to be applicable whatever the access link
970 technology; whether xDSL, cable, PON, cellular, line of sight
971 wireless or satellite.
973 Therefore, the full benefit of the L4S service should be available in
974 the downstream direction when an L4S AQM is deployed at the ingress
975 to this bottleneck link. And similarly, the full upstream service
976 will be available once an L4S AQM is deployed at the ingress into the
977 upstream link. (Of course, multi-homed sites would only see the full
978 benefit once all their access links were covered.)
979 ______
980 ( )
981 __ __ ( )
982 |DQ\________/DQ|( enterprise )
983 ___ |__/ \__| ( /campus )
984 ( ) (______)
985 ( ) ___||_
986 +----+ ( ) __ __ / \
987 | DC |-----( Core )|DQ\_______________/DQ|| home |
988 +----+ ( ) |__/ \__||______|
989 (_____) __
990 |DQ\__/\ __ ,===.
991 |__/ \ ____/DQ||| ||mobile
992 \/ \__|||_||device
993 | o |
994 `---'
996 Figure 2: Likely location of DualQ (DQ) Deployments in common access
997 topologies
999 Deployment in mesh topologies depends on how over-booked the core is.
1000 If the core is non-blocking, or at least generously provisioned so
1001 that the edges are nearly always the bottlenecks, it would only be
1002 necessary to deploy an L4S AQM at the edge bottlenecks. For example,
1003 some data-centre networks are designed with the bottleneck in the
1004 hypervisor or host NICs, while others bottleneck at the top-of-rack
1005 switch (both the output ports facing hosts and those facing the
1006 core).
1008 An L4S AQM would eventually also need to be deployed at any other
1009 persistent bottlenecks such as network interconnections, e.g. some
1010 public Internet exchange points and the ingress and egress to WAN
1011 links interconnecting data-centres.
1013 6.4.2. Deployment Sequences
1015 For any one L4S flow to work, it requires 3 parts to have been
1016 deployed. This was the same deployment problem that ECN
1017 faced [RFC8170], so we have learned from that experience.
1019 Firstly, L4S deployment exploits the fact that DCTCP already exists
1020 on many Internet hosts (Windows, FreeBSD and Linux); both servers and
1021 clients. Therefore, just deploying an L4S AQM at a network
1022 bottleneck immediately gives a working deployment of all the L4S
1023 parts. DCTCP needs some safety concerns to be fixed for general use
1024 over the public Internet (see Section 2.3 of
1025 [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so
1026 these issues can be managed within controlled deployments or
1027 controlled trials.
1029 Secondly, the performance improvement with L4S is so significant that
1030 it enables new interactive services and products that were not
1031 previously possible. It is much easier for companies to initiate new
1032 work on deployment if there is budget for a new product trial. If,
1033 in contrast, there were only an incremental performance improvement
1034 (as with Classic ECN), spending on deployment tends to be much harder
1035 to justify.
1037 Thirdly, the L4S identifier is defined so that initially network
1038 operators can enable L4S exclusively for certain customers or certain
1039 applications. But this is carefully defined so that it does not
1040 compromise future evolution towards L4S as an Internet-wide service.
1041 This is because the L4S identifier is defined not only as the end-to-
1042 end ECN field, but it can also optionally be combined with any other
1043 packet header or some status of a customer or their access
1044 link [I-D.ietf-tsvwg-ecn-l4s-id]. Operators could do this anyway,
1045 even if it were not blessed by the IETF. However, it is best for the
1046 IETF to specify that, if they use their own local identifier, it must
1047 be in combination with the IETF's identifier. Then, if an operator
1048 has opted for an exclusive local-use approach, later they only have
1049 to remove this extra rule to make the service work Internet-wide - it
1050 will already traverse middleboxes, peerings, etc.
1052 +-+--------------------+----------------------+---------------------+
1053 | | Servers or proxies | Access link | Clients |
1054 +-+--------------------+----------------------+---------------------+
1055 |0| DCTCP (existing) | | DCTCP (existing) |
1056 +-+--------------------+----------------------+---------------------+
1057 |1| |Add L4S AQM downstream| |
1058 | | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS |
1059 +-+--------------------+----------------------+---------------------+
1060 |2| Upgrade DCTCP to | |Replace DCTCP feedb'k|
1061 | | TCP Prague | | with AccECN |
1062 | | FULLY WORKS DOWNSTREAM |
1063 +-+--------------------+----------------------+---------------------+
1064 | | | | Upgrade DCTCP to |
1065 |3| | Add L4S AQM upstream | TCP Prague |
1066 | | | | |
1067 | | FULLY WORKS UPSTREAM AND DOWNSTREAM |
1068 +-+--------------------+----------------------+---------------------+
1070 Figure 3: Example L4S Deployment Sequence
1072 Figure 3 illustrates some example sequences in which the parts of L4S
1073 might be deployed. It consists of the following stages:
1075 1. Here, the immediate benefit of a single AQM deployment can be
1076 seen, but limited to a controlled trial or controlled deployment.
1077 In this example downstream deployment is first, but in other
1078 scenarios the upstream might be deployed first. If no AQM at all
1079 was previously deployed for the downstream access, an L4S AQM
1080 greatly improves the Classic service (as well as adding the L4S
1081 service). If an AQM was already deployed, the Classic service
1082 will be unchanged (and L4S will add an improvement on top).
1084 2. In this stage, the name 'TCP Prague' [PragueLinux] is used to
1085 represent a variant of DCTCP that is safe to use in a production
1086 Internet environment. If the application is primarily
1087 unidirectional, 'TCP Prague' at one end will provide all the
1088 benefit needed. For TCP transports, Accurate ECN feedback
1089 (AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the other end,
1090 but it is a generic ECN feedback facility that is already planned
1091 to be deployed for other purposes, e.g. DCTCP, BBR. The two ends
1092 can be deployed in either order, because, in TCP, an L4S
1093 congestion control only enables itself if it has negotiated the
1094 use of AccECN feedback with the other end during the connection
1095 handshake. Thus, deployment of TCP Prague on a server enables
1096 L4S trials to move to a production service in one direction,
1097 wherever AccECN is deployed at the other end. This stage might
1098 be further motivated by the performance improvements of TCP
1099 Prague relative to DCTCP (see Appendix A.2 of
1100 [I-D.ietf-tsvwg-ecn-l4s-id]).
1102 Unlike TCP, from the outset, QUIC ECN
1103 feedback [I-D.ietf-quic-transport] has supported L4S. Therefore,
1104 if the transport is QUIC, one-ended deployment of a Prague
1105 congestion control at this stage is simple and sufficient.
1107 3. This is a two-move stage to enable L4S upstream. An L4S AQM or
1108 TCP Prague can be deployed in either order as already explained.
1109 To motivate the first of two independent moves, the deferred
1110 benefit of enabling new services after the second move has to be
1111 worth it to cover the first mover's investment risk. As
1112 explained already, the potential for new interactive services
1113 provides this motivation. An L4S AQM also improves the upstream
1114 Classic service - significantly if no other AQM has already been
1115 deployed.
1117 Note that other deployment sequences might occur. For instance: the
1118 upstream might be deployed first; a non-TCP protocol might be used
1119 end-to-end, e.g. QUIC, RTP; a body such as the 3GPP might require L4S
1120 to be implemented in 5G user equipment, or other random acts of
1121 kindness.
1123 6.4.3. L4S Flow but Non-ECN Bottleneck
1125 If L4S is enabled between two hosts, the L4S sender is required to
1126 coexist safely with Reno in response to any drop (see Section 4.3 of
1127 [I-D.ietf-tsvwg-ecn-l4s-id]).
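The required behaviour can be sketched as a simple branch in the sender's per-round-trip congestion response. This is only an illustrative sketch (the function and variable names are hypothetical; the actual requirement is stated in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id]):

```python
# Sketch of how an L4S sender coexists safely with Reno: a scalable
# (proportional) response to ECN marks, but a Classic (Reno-like,
# multiplicative) response to any loss, since loss may come from a
# Classic (non-L4S) bottleneck. Names are hypothetical.

def on_feedback(cwnd, alpha, loss_detected, marked_fraction):
    """Return the new congestion window after one round trip."""
    if loss_detected:
        return cwnd / 2                 # Reno-friendly: halve on loss
    if marked_fraction > 0:
        return cwnd * (1 - alpha / 2)   # scalable: proportional cut
    return cwnd + 1                     # additive increase otherwise
```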
1129 Unfortunately, as well as protecting Classic traffic, this rule
1130 degrades the L4S service whenever there is any loss, even if the
1131 cause is not persistent congestion at a bottleneck, e.g.:
1133 o congestion loss at other transient bottlenecks, e.g. due to bursts
1134 in shallower queues;
1136 o transmission errors, e.g. due to electrical interference;
1138 o rate policing.
1140 Three complementary approaches are in progress to address this issue,
1141 but they are all still at the research stage:
1143 o In Prague congestion control, ignore certain losses deemed
1144 unlikely to be due to congestion (using some ideas from
1145 BBR [I-D.cardwell-iccrg-bbr-congestion-control] regarding isolated
1146 losses). This could mask any of the above types of loss while
1147 still coexisting with drop-based congestion controls.
1149 o A combination of RACK, L4S and link retransmission without
1150 resequencing could repair transmission errors without the head of
1151 line blocking delay usually associated with link-layer
1152 retransmission [UnorderedLTE], [I-D.ietf-tsvwg-ecn-l4s-id];
1154 o Hybrid ECN/drop rate policers (see Section 8.3).
1156 L4S deployment scenarios that minimize these issues (e.g. over
1157 wireline networks) can proceed in parallel to this research, in the
1158 expectation that research success could continually widen L4S
1159 applicability.
1161 6.4.4. L4S Flow but Classic ECN Bottleneck
1163 Classic ECN support is starting to materialize on the Internet as an
1164 increased level of CE marking. It is hard to detect whether this is
1165 all due to the addition of support for ECN in the Linux
1166 implementation of FQ-CoDel, which is not problematic, because FQ
1167 inherently forces the throughput of each flow to be equal
1168 irrespective of its aggressiveness. However, some of this Classic
1169 ECN marking might be due to single-queue ECN deployment. This case
1170 is discussed in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id].
1172 6.4.5. L4S AQM Deployment within Tunnels
1174 An L4S AQM uses the ECN field to signal congestion. So, in common
1175 with Classic ECN, if the AQM is within a tunnel or at a lower layer,
1176 correct functioning of ECN signalling requires correct propagation of
1177 the ECN field up the layers [RFC6040],
1178 [I-D.ietf-tsvwg-rfc6040update-shim],
1179 [I-D.ietf-tsvwg-ecn-encap-guidelines].
1181 7. IANA Considerations (to be removed by RFC Editor)
1183 This specification contains no IANA considerations.
1185 8. Security Considerations
1187 8.1. Traffic Rate (Non-)Policing
1189 Because the L4S service can serve all traffic that is using the
1190 capacity of a link, it should not be necessary to rate-police access
1191 to the L4S service. In contrast, Diffserv only works if some packets
1192 get less favourable treatment than others. So Diffserv has to use
1193 traffic rate policers to limit how much traffic can be favoured. In
1194 turn, traffic policers require traffic contracts between users and
1195 networks as well as pairwise between networks. Because L4S will lack
1196 all this management complexity, it is more likely to work end-to-end.
1198 During early deployment (and perhaps always), some networks will not
1199 offer the L4S service. In general, these networks should not need to
1200 police L4S traffic - they are required not to change the L4S
1201 identifier, merely treating the traffic as best-effort traffic, as
1202 they already treat traffic with ECT(1) today. At a bottleneck, such
1203 networks will introduce some queuing and dropping. When a scalable
1204 congestion control detects a drop it will have to respond safely with
1205 respect to Classic congestion controls (as required in Section 4.3 of
1206 [I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the L4S service to
1207 be no better (but never worse) than Classic best effort, whenever a
1208 non-ECN bottleneck is encountered on a path (see Section 6.4.3).
1210 In some cases, networks that solely support Classic ECN [RFC3168] in
1211 a single queue bottleneck might opt to police L4S traffic in order to
1212 protect competing Classic ECN traffic.
1214 Certain network operators might choose to restrict access to the L4S
1215 class, perhaps only to selected premium customers as a value-added
1216 service. Their packet classifier (item 2 in Figure 1) could identify
1217 such customers against some other field (e.g. source address range)
1218 as well as ECN. If only the ECN L4S identifier matched, but not the
1219 source address (say), the classifier could direct these packets (from
1220 non-premium customers) into the Classic queue. Explaining clearly
1221 how operators can use additional local classifiers (see
1222 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any motivation to
1223 bleach the L4S identifier. Then at least the L4S ECN identifier will
1224 be more likely to survive end-to-end even though the service may not
1225 be supported at every hop. Such local arrangements would only
1226 require simple registered/not-registered packet classification,
1227 rather than the managed, application-specific traffic policing
1228 against customer-specific traffic contracts that Diffserv uses.
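The registered/not-registered classification described above can be sketched in a few lines. All names, the example prefix and the queue labels are illustrative assumptions, not part of any specification; the point is that a non-matching source address demotes the packet to the Classic queue without bleaching its ECN field.

```python
import ipaddress

# Hypothetical premium-customer classifier: a packet is admitted to the
# low latency queue only if it both carries the L4S identifier and comes
# from a registered source prefix. The prefix below is an example range.
PREMIUM_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]

def classify(src_ip: str, ecn: str) -> str:
    l4s_marked = ecn in ("ECT(1)", "CE")
    registered = any(ipaddress.ip_address(src_ip) in p
                     for p in PREMIUM_PREFIXES)
    if l4s_marked and registered:
        return "L4S queue"
    # The ECN field is left intact; only the local treatment differs.
    return "Classic queue"
```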
1230 8.2. 'Latency Friendliness'
1232 Like the Classic service, the L4S service relies on self-constraint -
1233 limiting rate in response to congestion. In addition, the L4S
1234 service requires self-constraint in terms of limiting latency
1235 (burstiness). It is hoped that self-interest and guidance on dynamic
1236 behaviour (especially flow start-up, which might need to be
1237 standardized) will be sufficient to prevent transports from sending
1238 excessive bursts of L4S traffic, given the application's own latency
1239 will suffer most from such behaviour.
1241 Whether burst policing becomes necessary remains to be seen. Without
1242 it, there will be potential for attacks on the low latency of the L4S
1243 service.
1245 If needed, various arrangements could be used to address this
1246 concern:
1248 Local bottleneck queue protection: A per-flow (5-tuple) queue
1249 protection function [I-D.briscoe-docsis-q-protection] has been
1250 developed for the low latency queue in DOCSIS, which has adopted
1251 the DualQ L4S architecture. It protects the low latency service
1252 from any queue-building flows that accidentally or maliciously
1253 classify themselves into the low latency queue. It is designed to
1254 score flows based solely on their contribution to queuing (not
1255 flow rate in itself). Then, if the shared low latency queue is at
1256 risk of exceeding a threshold, the function redirects enough
1257 packets of the highest scoring flow(s) into the Classic queue to
1258 preserve low latency.
1260 Distributed traffic scrubbing: Rather than policing locally at each
1261 bottleneck, it may only be necessary to address problems
1262 reactively, e.g. punitively target any deployments of new bursty
1263 malware, in a similar way to how traffic from flooding attack
1264 sources is rerouted via scrubbing facilities.
1266 Local bottleneck per-flow scheduling: Per-flow scheduling should
1267 inherently isolate non-bursty flows from bursty ones (see Section 5.2
1268 for discussion of the merits of per-flow scheduling relative to
1269 per-flow policing).
1271 Distributed access subnet queue protection: Per-flow queue
1272 protection could be arranged for a queue structure distributed
1273 across a subnet inter-communicating using lower layer control
1274 messages (see Section 2.1.4 of [QDyn]). For instance, in a radio
1275 access network user equipment already sends regular buffer status
1276 reports to a radio network controller, which could use this
1277 information to remotely police individual flows.
1279 Distributed Congestion Exposure to Ingress Policers: The Congestion
1280 Exposure (ConEx) architecture [RFC7713] uses egress auditing to
1281 motivate senders to truthfully signal path congestion in-band,
1282 where ingress policers can act on it. An edge-to-edge variant
1283 of this architecture is also possible.
1285 Distributed Domain-edge traffic conditioning: An architecture
1286 similar to Diffserv [RFC2475] may be preferred, where traffic is
1287 proactively conditioned on entry to a domain, rather than
1288 reactively policed only if it leads to queuing once combined
1289 with other traffic at a bottleneck.
1291 Distributed core network queue protection: The policing function
1292 could be divided between per-flow mechanisms at the network
1293 ingress that characterize the burstiness of each flow into a
1294 signal carried with the traffic, and per-class mechanisms at
1295 bottlenecks that act on these signals if queuing actually occurs
1296 once the traffic converges. This would be somewhat similar to the
1297 idea behind core stateless fair queuing, which is in turn similar
1298 to [Nadas20].
1300 None of these possible queue protection capabilities are considered a
1301 necessary part of the L4S architecture, which works without them (in
1302 a similar way to how the Internet works without per-flow rate
1303 policing). Indeed, under normal circumstances, latency policers
1304 would not intervene, and if operators found they were not necessary
1305 they could disable them. Part of the L4S experiment will be to see
1306 whether such a function is necessary, and which arrangements are most
1307 appropriate to the size of the problem.
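The first of the arrangements above, per-flow queue protection, can be illustrated with a toy model. This is not the DOCSIS algorithm of [I-D.briscoe-docsis-q-protection]; the scoring rule, the threshold and all names are illustrative assumptions. It only shows the principle: flows are scored by their contribution to queuing, and the highest scorer is redirected when the shared queue is at risk.

```python
from collections import defaultdict

QUEUE_THRESHOLD = 10  # illustrative queue depth (packets)

class QueueProtection:
    def __init__(self):
        self.scores = defaultdict(float)

    def enqueue(self, flow_id: str, queue_depth: int) -> str:
        # A flow's score grows with the queue its packets find on
        # arrival, so smooth flows sharing an empty queue accumulate
        # almost no score, regardless of their rate.
        self.scores[flow_id] += queue_depth
        if queue_depth >= QUEUE_THRESHOLD:
            worst = max(self.scores, key=self.scores.get)
            if worst == flow_id:
                return "Classic queue"  # redirect the worst offender
        return "L4S queue"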
1309 8.3. Interaction between Rate Policing and L4S
1311 As mentioned in Section 5.2, L4S should remove the need for low
1312 latency Diffserv classes. However, those Diffserv classes that give
1313 certain applications or users priority over capacity would still be
1314 applicable in certain scenarios (e.g. corporate networks). Then,
1315 within such Diffserv classes, L4S would often be applicable to give
1316 traffic low latency and low loss as well. Within such a Diffserv
1317 class, the bandwidth available to a user or application is often
1318 limited by a rate policer. Similarly, in the default Diffserv class,
1319 rate policers are used to partition shared capacity.
1321 A classic rate policer drops any packets exceeding a set rate,
1322 usually also giving a burst allowance (variants exist where the
1323 policer re-marks non-compliant traffic to a discard-eligible Diffserv
1324 codepoint, so that it may be dropped elsewhere during contention).
1325 Whenever L4S traffic encounters one of these rate policers, it will
1326 experience drops and the source will have to fall back to a Classic
1327 congestion control, thus losing the benefits of L4S (Section 6.4.3).
1328 So, in networks that already use rate policers and plan to deploy
1329 L4S, it will be preferable to redesign these rate policers to be more
1330 friendly to the L4S service.
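The classic policer described above can be sketched as a simple token bucket; the rates, burst size and return values are illustrative assumptions. Tokens refill at the contracted rate up to the burst allowance, and a packet that finds insufficient tokens is dropped (or, in the re-marking variant, marked discard-eligible instead).

```python
# A classic single-rate policer: a token bucket refilled at the policed
# rate, capped by the burst allowance; non-compliant packets are dropped.
class TokenBucketPolicer:
    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0      # tokens (bytes) per second
        self.burst = burst_bytes
        self.tokens = burst_bytes       # start with a full burst
        self.last = 0.0

    def police(self, now: float, pkt_bytes: int) -> str:
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return "forward"
        return "drop"   # an L4S source seeing this must fall back
```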
1332 L4S-friendly rate policing is currently a research area (note that
1333 this is not the same as latency policing). It might be achieved by
1334 setting a threshold where ECN marking is introduced, such that it is
1335 just under the policed rate or just under the burst allowance where
1336 drop is introduced. This could be applied to various types of rate
1337 policer, e.g. [RFC2697], [RFC2698] or the 'local' (non-ConEx) variant
1338 of the ConEx congestion policer [I-D.briscoe-conex-policing]. It
1339 might also be possible to design scalable congestion controls to
1340 respond less catastrophically to loss that has not been preceded by a
1341 period of increasing delay.
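As a research sketch (not taken from any specification), the marking-threshold idea above could look like the following: a token-bucket policer that starts CE-marking ECN-capable packets when the bucket runs low, i.e. just under the policed rate or burst allowance, and reserves drop for when the allowance is fully exhausted. The threshold fraction and all names are illustrative assumptions.

```python
# Sketch of an L4S-friendly rate policer: CE-mark ECN-capable packets
# once the token bucket falls below a threshold, and only drop when the
# allowance is exhausted. Thresholds are arbitrary illustrations.
class L4SFriendlyPolicer:
    def __init__(self, rate_bps: float, burst_bytes: float,
                 mark_fraction: float = 0.3):
        self.rate = rate_bps / 8.0              # tokens (bytes) per second
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.mark_below = mark_fraction * burst_bytes  # start marking here
        self.last = 0.0

    def police(self, now: float, pkt_bytes: int, ecn_capable: bool) -> str:
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens < pkt_bytes:
            return "drop"                       # allowance exhausted
        self.tokens -= pkt_bytes
        if ecn_capable and self.tokens < self.mark_below:
            return "forward+CE"                 # early congestion signal
        return "forward"
```

An ECN-capable source thus receives CE marks before any drop occurs, so a scalable congestion control can converge on the policed rate without losing the benefits of L4S.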
1343 The design of L4S-friendly rate policers will require a separate
1344 dedicated document. For further discussion of the interaction
1345 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv].
1347 8.4. ECN Integrity
1349 Receiving hosts can fool a sender into downloading faster by
1350 suppressing feedback of ECN marks (or of losses if retransmissions
1351 are not necessary or available otherwise). Various ways to protect
1352 transport feedback integrity have been developed. For instance:
1354 o The sender can test the integrity of the receiver's feedback by
1355 occasionally setting the IP-ECN field to the congestion
1356 experienced (CE) codepoint, which is normally only set by a
1357 congested link. Then the sender can test whether the receiver's
1358 feedback faithfully reports what it expects (see the second
1359 paragraph of Section 20.2 of [RFC3168]).
1361 o A network can enforce a congestion response to its ECN markings
1362 (or packet losses) by auditing congestion exposure
1363 (ConEx) [RFC7713].
1365 o The TCP authentication option (TCP-AO [RFC5925]) can be used to
1366 detect tampering with TCP congestion feedback.
1368 o The ECN Nonce [RFC3540] was proposed to detect tampering with
1369 congestion feedback, but it has been reclassified as
1370 historic [RFC8311].
1372 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of
1373 these techniques including their applicability and pros and cons.
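The first technique in the list, the sender-side integrity test from Section 20.2 of [RFC3168], can be sketched as follows. The probe probability, class names and counters are illustrative assumptions: the sender itself sets CE on an occasional packet and remembers how many it injected; since the network can only add CE marks, honest feedback must report at least that many, so a shortfall exposes a suppressing receiver.

```python
import random

# Sender-side sketch of the RFC 3168 feedback integrity test: inject an
# occasional CE mark and check that feedback reports at least as many
# CE marks as were injected.
class FeedbackIntegrityTester:
    def __init__(self, probe_prob: float = 0.01, rng=random.random):
        self.probe_prob = probe_prob
        self.rng = rng
        self.injected_ce = 0

    def on_send(self, ecn_field: str) -> str:
        if ecn_field == "ECT(1)" and self.rng() < self.probe_prob:
            self.injected_ce += 1
            return "CE"          # sender-injected probe mark
        return ecn_field

    def receiver_honest(self, ce_marks_fed_back: int) -> bool:
        # The network only adds CE marks, so feedback below the
        # injected count proves suppression.
        return ce_marks_fed_back >= self.injected_ce
```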
1375 8.5. Privacy Considerations
1377 As discussed in Section 5.2, the L4S architecture does not preclude
1378 approaches that inspect end-to-end transport layer identifiers. For
1379 instance, it is simple to add L4S support to FQ-CoDel, which
1380 classifies by application flow ID in the network. However, the main
1381 innovation of L4S is the DualQ AQM framework that does not need to
1382 inspect any deeper than the outermost IP header, because the L4S
1383 identifier is in the IP-ECN field.
1385 Thus, the L4S architecture enables ultra-low queuing delay without
1386 _requiring_ inspection of information above the IP layer. This means
1387 that users who want to encrypt application flow identifiers, e.g. in
1388 IPsec or other encrypted VPN tunnels, don't have to sacrifice low
1389 delay [RFC8404].
1391 Because L4S can provide low delay for a broad set of applications
1392 that choose to use it, there is no need for individual applications
1393 or classes within that broad set to be distinguishable in any way
1394 while traversing networks. This removes much of the ability to
1395 correlate between the delay requirements of traffic and other
1396 identifying features [RFC6973]. There may be some types of traffic
1397 that prefer not to use L4S, but the coarse binary categorization of
1398 traffic reveals very little that could be exploited to compromise
1399 privacy.
1401 9. Acknowledgements
1403 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David Black
1404 and Jake Holland for their useful review comments.
1406 Bob Briscoe and Koen De Schepper were part-funded by the European
1407 Community under its Seventh Framework Programme through the Reducing
1408 Internet Transport Latency (RITE) project (ICT-317700). Bob Briscoe
1409 was also part-funded by the Research Council of Norway through the
1410 TimeIn project, partly by CableLabs and partly by the Comcast
1411 Innovation Fund. The views expressed here are solely those of the
1412 authors.
1414 10. Informative References
1416 [AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H.,
1417 and S-J. Park, "Towards fair and low latency next
1418 generation high speed networks: AFCD queuing", Journal of
1419 Network and Computer Applications 70:183--193, July 2016.
1421 [DCttH15] De Schepper, K., Bondarenko, O., Briscoe, B., and I.
1422 Tsang, "`Data Centre to the Home': Ultra-Low Latency for
1423 All", RITE project Technical Report, 2015.
1426 [DOCSIS3.1]
1427 CableLabs, "MAC and Upper Layer Protocols Interface
1428 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable
1429 Service Interface Specifications DOCSIS(R) 3.1 Version i17
1430 or later, January 2019.
1433 [DualPI2Linux]
1434 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O.,
1435 and H. Steen, "DUALPI2 - Low Latency, Low Loss and
1436 Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019.
1440 [Hohlfeld14]
1441 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P.
1442 Barford, "A QoE Perspective on Sizing Network Buffers",
1443 Proc. ACM Internet Measurement Conf (IMC'14), November
1444 2014.
1446 [I-D.briscoe-conex-policing]
1447 Briscoe, B., "Network Performance Isolation using
1448 Congestion Policing", draft-briscoe-conex-policing-01
1449 (work in progress), February 2014.
1451 [I-D.briscoe-docsis-q-protection]
1452 Briscoe, B. and G. White, "Queue Protection to Preserve
1453 Low Latency", draft-briscoe-docsis-q-protection-00 (work
1454 in progress), July 2019.
1456 [I-D.briscoe-tsvwg-l4s-diffserv]
1457 Briscoe, B., "Interactions between Low Latency, Low Loss,
1458 Scalable Throughput (L4S) and Differentiated Services",
1459 draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress),
1460 November 2018.
1462 [I-D.cardwell-iccrg-bbr-congestion-control]
1463 Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson,
1464 "BBR Congestion Control", draft-cardwell-iccrg-bbr-
1465 congestion-control-00 (work in progress), July 2017.
1467 [I-D.ietf-avtcore-cc-feedback-message]
1468 Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP
1469 Control Protocol (RTCP) Feedback for Congestion Control",
1470 draft-ietf-avtcore-cc-feedback-message-09 (work in
1471 progress), November 2020.
1473 [I-D.ietf-quic-transport]
1474 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1475 and Secure Transport", draft-ietf-quic-transport-32 (work
1476 in progress), October 2020.
1478 [I-D.ietf-tcpm-accurate-ecn]
1479 Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
1480 Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate-
1481 ecn-13 (work in progress), November 2020.
1483 [I-D.ietf-tcpm-generalized-ecn]
1484 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit
1485 Congestion Notification (ECN) to TCP Control Packets",
1486 draft-ietf-tcpm-generalized-ecn-06 (work in progress),
1487 October 2020.
1489 [I-D.ietf-tsvwg-aqm-dualq-coupled]
1490 De Schepper, K., Briscoe, B., and G. White, "DualQ Coupled
1491 AQMs for Low Latency, Low Loss and Scalable Throughput
1492 (L4S)", draft-ietf-tsvwg-aqm-dualq-coupled-12 (work in
1493 progress), July 2020.
1495 [I-D.ietf-tsvwg-ecn-encap-guidelines]
1496 Briscoe, B., Kaippallimalil, J., and P. Thaler,
1497 "Guidelines for Adding Congestion Notification to
1498 Protocols that Encapsulate IP", draft-ietf-tsvwg-ecn-
1499 encap-guidelines-13 (work in progress), May 2019.
1501 [I-D.ietf-tsvwg-ecn-l4s-id]
1502 De Schepper, K. and B. Briscoe, "Identifying Modified
1503 Explicit Congestion Notification (ECN) Semantics for
1504 Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s-
1505 id-11 (work in progress), November 2020.
1507 [I-D.ietf-tsvwg-rfc6040update-shim]
1508 Briscoe, B., "Propagating Explicit Congestion Notification
1509 Across IP Tunnel Headers Separated by a Shim", draft-ietf-
1510 tsvwg-rfc6040update-shim-10 (work in progress), March
1511 2020.
1513 [I-D.morton-tsvwg-codel-approx-fair]
1514 Morton, J. and P. Heist, "Controlled Delay Approximate
1515 Fairness AQM", draft-morton-tsvwg-codel-approx-fair-01
1516 (work in progress), March 2020.
1518 [I-D.sridharan-tcpm-ctcp]
1519 Sridharan, M., Tan, K., Bansal, D., and D. Thaler,
1520 "Compound TCP: A New TCP Congestion Control for High-Speed
1521 and Long Distance Networks", draft-sridharan-tcpm-ctcp-02
1522 (work in progress), November 2008.
1524 [I-D.stewart-tsvwg-sctpecn]
1525 Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
1526 Control Transmission Protocol (SCTP)", draft-stewart-
1527 tsvwg-sctpecn-05 (work in progress), January 2014.
1529 [I-D.white-tsvwg-nqb]
1530 White, G. and T. Fossati, "Identifying and Handling Non
1531 Queue Building Flows in a Bottleneck Link", draft-white-
1532 tsvwg-nqb-02 (work in progress), June 2019.
1534 [L4Sdemo16]
1535 Bondarenko, O., De Schepper, K., Tsang, I., and B.
1536 Briscoe, "Ultra-Low Delay for All: Live Experience,
1537 Live Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016.
1542 [LEDBAT_AQM]
1543 Al-Saadi, R., Armitage, G., and J. But, "Characterising
1544 LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel
1545 and FQ-PIE Active Queue Management", Proc. IEEE 42nd
1546 Conference on Local Computer Networks (LCN) 278--285,
1547 2017.
1549 [Mathis09]
1550 Mathis, M., "Relentless Congestion Control", PFLDNeT'09,
1551 May 2009.
1556 [McIlroy78]
1557 McIlroy, M., Pinson, E., and B. Tague, "UNIX Time-Sharing
1558 System: Foreword", The Bell System Technical Journal
1559 57:6(1902--1903), July 1978.
1562 [Nadas20] Nadas, S., Gombos, G., Fejes, F., and S. Laki, "A
1563 Congestion Control Independent L4S Scheduler", Proc.
1564 Applied Networking Research Workshop (ANRW '20) 45--51,
1565 July 2020.
1567 [NewCC_Proc]
1568 Eggert, L., "Experimental Specification of New Congestion
1569 Control Algorithms", IETF Operational Note ion-tsv-alt-cc,
1570 July 2007.
1572 [PragueLinux]
1573 Briscoe, B., De Schepper, K., Albisser, O., Misund, J.,
1574 Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing
1575 the `TCP Prague' Requirements for Low Latency Low Loss
1576 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13,
1577 March 2019.
1580 [QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics",
1581 bobbriscoe.net Technical Report TR-BB-2017-001;
1582 arXiv:1904.07044 [cs.NI], September 2017.
1585 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
1586 and W. Weiss, "An Architecture for Differentiated
1587 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998.
1590 [RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color
1591 Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999.
1594 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color
1595 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999.
1598 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of
1599 Explicit Congestion Notification (ECN) in IP Networks",
1600 RFC 2884, DOI 10.17487/RFC2884, July 2000.
1603 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
1604 of Explicit Congestion Notification (ECN) to IP",
1605 RFC 3168, DOI 10.17487/RFC3168, September 2001.
1608 [RFC3246] Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec,
1609 J., Courtney, W., Davari, S., Firoiu, V., and D.
1610 Stiliadis, "An Expedited Forwarding PHB (Per-Hop
1611 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002.
1614 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
1615 Congestion Notification (ECN) Signaling with Nonces",
1616 RFC 3540, DOI 10.17487/RFC3540, June 2003.
1619 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows",
1620 RFC 3649, DOI 10.17487/RFC3649, December 2003.
1623 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
1624 Congestion Control Protocol (DCCP)", RFC 4340,
1625 DOI 10.17487/RFC4340, March 2006.
1628 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the
1629 Explicit Congestion Notification (ECN) Field", BCP 124,
1630 RFC 4774, DOI 10.17487/RFC4774, November 2006.
1633 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol",
1634 RFC 4960, DOI 10.17487/RFC4960, September 2007.
1637 [RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion
1638 Control Algorithms", BCP 133, RFC 5033,
1639 DOI 10.17487/RFC5033, August 2007.
1642 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
1643 Friendly Rate Control (TFRC): Protocol Specification",
1644 RFC 5348, DOI 10.17487/RFC5348, September 2008.
1647 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1648 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.
1651 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
1652 Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
1653 June 2010.
1655 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
1656 Notification", RFC 6040, DOI 10.17487/RFC6040, November
1657 2010.
1659 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
1660 and K. Carlberg, "Explicit Congestion Notification (ECN)
1661 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
1662 2012.
1664 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
1665 Morris, J., Hansen, M., and R. Smith, "Privacy
1666 Considerations for Internet Protocols", RFC 6973,
1667 DOI 10.17487/RFC6973, July 2013.
1670 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
1671 Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
1672 DOI 10.17487/RFC7540, May 2015.
1675 [RFC7560] Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe,
1676 "Problem Statement and Requirements for Increased Accuracy
1677 in Explicit Congestion Notification (ECN) Feedback",
1678 RFC 7560, DOI 10.17487/RFC7560, August 2015.
1681 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
1682 Chaining (SFC) Architecture", RFC 7665,
1683 DOI 10.17487/RFC7665, October 2015.
1686 [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
1687 Concepts, Abstract Mechanism, and Requirements", RFC 7713,
1688 DOI 10.17487/RFC7713, December 2015.
1691 [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White,
1692 "Proportional Integral Controller Enhanced (PIE): A
1693 Lightweight Control Scheme to Address the Bufferbloat
1694 Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017.
1697 [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based
1698 on Proportional Integral Controller Enhanced (PIE) for
1699 Data-Over-Cable Service Interface Specifications (DOCSIS)
1700 Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February
1701 2017.
1703 [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and
1704 Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170,
1705 May 2017.
1707 [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
1708 and G. Judd, "Data Center TCP (DCTCP): TCP Congestion
1709 Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
1710 October 2017.
1712 [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys,
1713 J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler
1714 and Active Queue Management Algorithm", RFC 8290,
1715 DOI 10.17487/RFC8290, January 2018.
1718 [RFC8298] Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation
1719 for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December
1720 2017.
1722 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion
1723 Notification (ECN) Experimentation", RFC 8311,
1724 DOI 10.17487/RFC8311, January 2018.
1727 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
1728 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
1729 RFC 8312, DOI 10.17487/RFC8312, February 2018.
1732 [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of
1733 Pervasive Encryption on Operators", RFC 8404,
1734 DOI 10.17487/RFC8404, July 2018.
1737 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
1738 "TCP Alternative Backoff with ECN (ABE)", RFC 8511,
1739 DOI 10.17487/RFC8511, December 2018.
1742 [TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and
1743 Control", Lawrence Berkeley Labs Technical Report,
1744 November 1988.
1746 [TCP-sub-mss-w]
1747 Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion
1748 Window for Small Round Trip Times", BT Technical Report
1749 TR-TUB8-2015-002, May 2015.
1753 [UnorderedLTE]
1754 Austrheim, M., "Implementing immediate forwarding for 4G
1755 in a network simulator", Masters Thesis, Uni Oslo , June
1756 2019.
1758 Appendix A. Standardization items
1760 The following table includes all the items that will need to be
1761 standardized to provide a full L4S architecture.
1763 The table is too wide for the ASCII draft format, so it has been
1764 split into two, with a common column of row index numbers on the
1765 left.
1767 The columns in the second part of the table have the following
1768 meanings:
1770 WG: The IETF WG most relevant to this requirement. The "tcpm/iccrg"
1771 combination refers to the procedure typically used for congestion
1772 control changes, where tcpm owns the approval decision, but uses
1773 the iccrg for expert review [NewCC_Proc];
1775 TCP: Applicable to all forms of TCP congestion control;
1777 DCTCP: Applicable to Data Center TCP as currently used (in
1778 controlled environments);
1780 DCTCP bis: Applicable to any future Data Center TCP congestion
1781 control intended for controlled environments;
1783 XXX Prague: Applicable to a Scalable variant of XXX (TCP/SCTP/RMCAT)
1784 congestion control.
1786 +-----+------------------------+------------------------------------+
1787 | Req | Requirement | Reference |
1788 | # | | |
1789 +-----+------------------------+------------------------------------+
1790 | 0 | ARCHITECTURE | |
1791 | 1 | L4S IDENTIFIER | [I-D.ietf-tsvwg-ecn-l4s-id] |
1792 | 2 | DUAL QUEUE AQM | [I-D.ietf-tsvwg-aqm-dualq-coupled] |
1793 | 3 | Suitable ECN Feedback | [I-D.ietf-tcpm-accurate-ecn], |
1794 | | | [I-D.stewart-tsvwg-sctpecn]. |
1795 | | | |
1796 | | SCALABLE TRANSPORT - | |
1797 | | SAFETY ADDITIONS | |
1798 | 4-1 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, |
1799 | | Reno/Cubic on loss | [RFC8257] |
1800 | 4-2 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 |
1801 | | Reno/Cubic if classic | |
1802 | | ECN bottleneck | |
1803 | | detected | |
1804 | | | |
1805 | 4-3 | Reduce RTT-dependence | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 |
1806 | | | |
1807 | 4-4 | Scaling TCP's | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, |
1808 | | Congestion Window for | [TCP-sub-mss-w] |
1809 | | Small Round Trip Times | |
1810 | | SCALABLE TRANSPORT - | |
1811 | | PERFORMANCE | |
1812 | | ENHANCEMENTS | |
1813 | 5-1 | Setting ECT in TCP | [I-D.ietf-tcpm-generalized-ecn] |
1814 | | Control Packets and | |
1815 | | Retransmissions | |
1816 | 5-2 | Faster-than-additive | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx |
1817 | | increase | A.2.2) |
1818 | 5-3 | Faster Convergence at | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx |
1819 | | Flow Start | A.2.2) |
1820 +-----+------------------------+------------------------------------+
1821 +-----+--------+-----+-------+-----------+--------+--------+--------+
1822 | # | WG | TCP | DCTCP | DCTCP-bis | TCP | SCTP | RMCAT |
1823 | | | | | | Prague | Prague | Prague |
1824 +-----+--------+-----+-------+-----------+--------+--------+--------+
1825 | 0 | tsvwg | Y | Y | Y | Y | Y | Y |
1826 | 1 | tsvwg | | | Y | Y | Y | Y |
1827 | 2 | tsvwg | n/a | n/a | n/a | n/a | n/a | n/a |
1828 | | | | | | | | |
1829 | | | | | | | | |
1830 | | | | | | | | |
1831 | 3 | tcpm | Y | Y | Y | Y | n/a | n/a |
1832 | | | | | | | | |
1833 | 4-1 | tcpm | | Y | Y | Y | Y | Y |
1834 | | | | | | | | |
1835 | 4-2 | tcpm/ | | | | Y | Y | ? |
1836 | | iccrg? | | | | | | |
1837 | | | | | | | | |
1838 | | | | | | | | |
1839 | | | | | | | | |
1840 | | | | | | | | |
1841 | 4-3 | tcpm/ | | | Y | Y | Y | ? |
1842 | | iccrg? | | | | | | |
1843 | 4-4 | tcpm | Y | Y | Y | Y | Y | ? |
1844 | | | | | | | | |
1845 | | | | | | | | |
1846 | 5-1 | tcpm | Y | Y | Y | Y | n/a | n/a |
1847 | | | | | | | | |
1848 | 5-2 | tcpm/ | | | Y | Y | Y | ? |
1849 | | iccrg? | | | | | | |
1850 | 5-3 | tcpm/ | | | Y | Y | Y | ? |
1851 | | iccrg? | | | | | | |
1852 +-----+--------+-----+-------+-----------+--------+--------+--------+
1854 Authors' Addresses
1856 Bob Briscoe (editor)
1857 Independent
1858 UK
1860 Email: ietf@bobbriscoe.net
1861 URI: http://bobbriscoe.net/
1862 Koen De Schepper
1863 Nokia Bell Labs
1864 Antwerp
1865 Belgium
1867 Email: koen.de_schepper@nokia.com
1868 URI: https://www.bell-labs.com/usr/koen.de_schepper
1870 Marcelo Bagnulo
1871 Universidad Carlos III de Madrid
1872 Av. Universidad 30
1873 Leganes, Madrid 28911
1874 Spain
1876 Phone: +34 91 6249500
1877 Email: marcelo@it.uc3m.es
1878 URI: http://www.it.uc3m.es
1880 Greg White
1881 CableLabs
1882 US
1884 Email: G.White@CableLabs.com