2 Transport Area Working Group B. Briscoe, Ed. 3 Internet-Draft Independent 4 Intended status: Informational K. De Schepper 5 Expires: May 19, 2021 Nokia Bell Labs 6 M. Bagnulo Braun 7 Universidad Carlos III de Madrid 8 G. White 9 CableLabs 10 November 15, 2020 12 Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service: 13 Architecture 14 draft-ietf-tsvwg-l4s-arch-08 16 Abstract 18 This document describes the L4S architecture, which enables Internet 19 applications to achieve Low queuing Latency, Low Loss, and Scalable 20 throughput (L4S). The insight on which L4S is based is that the root 21 cause of queuing delay is in the congestion controllers of senders, 22 not in the queue itself. The L4S architecture is intended to enable 23 _all_ Internet applications to transition away from congestion 24 control algorithms that cause queuing delay, to a new class of 25 congestion controls that induce very little queuing, aided by 26 explicit congestion signaling from the network. This new class of 27 congestion control can provide low latency for capacity-seeking 28 flows, so applications can achieve both high bandwidth and low 29 latency. 31 The architecture primarily concerns incremental deployment. It 32 defines mechanisms that allow the new class of L4S congestion 33 controls to coexist with 'Classic' congestion controls in a shared 34 network.
These mechanisms aim to ensure that the latency and 35 throughput performance using an L4S-compliant congestion controller 36 is usually much better (and never worse) than the performance would 37 have been using a 'Classic' congestion controller, and that competing 38 flows continuing to use 'Classic' controllers are typically not 39 impacted by the presence of L4S. These characteristics are important 40 to encourage adoption of L4S congestion control algorithms and L4S 41 compliant network elements. 43 The L4S architecture consists of three components: network support to 44 isolate L4S traffic from classic traffic; protocol features that 45 allow network elements to identify L4S traffic; and host support for 46 L4S congestion controls. 48 Status of This Memo 50 This Internet-Draft is submitted in full conformance with the 51 provisions of BCP 78 and BCP 79. 53 Internet-Drafts are working documents of the Internet Engineering 54 Task Force (IETF). Note that other groups may also distribute 55 working documents as Internet-Drafts. The list of current Internet- 56 Drafts is at https://datatracker.ietf.org/drafts/current/. 58 Internet-Drafts are draft documents valid for a maximum of six months 59 and may be updated, replaced, or obsoleted by other documents at any 60 time. It is inappropriate to use Internet-Drafts as reference 61 material or to cite them other than as "work in progress." 63 This Internet-Draft will expire on May 19, 2021. 65 Copyright Notice 67 Copyright (c) 2020 IETF Trust and the persons identified as the 68 document authors. All rights reserved. 70 This document is subject to BCP 78 and the IETF Trust's Legal 71 Provisions Relating to IETF Documents 72 (https://trustee.ietf.org/license-info) in effect on the date of 73 publication of this document. Please review these documents 74 carefully, as they describe your rights and restrictions with respect 75 to this document. 
Code Components extracted from this document must 76 include Simplified BSD License text as described in Section 4.e of 77 the Trust Legal Provisions and are provided without warranty as 78 described in the Simplified BSD License. 80 Table of Contents 82 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 83 2. L4S Architecture Overview . . . . . . . . . . . . . . . . . . 5 84 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 85 4. L4S Architecture Components . . . . . . . . . . . . . . . . . 7 86 5. Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . 12 87 5.1. Why These Primary Components? . . . . . . . . . . . . . . 12 88 5.2. What L4S adds to Existing Approaches . . . . . . . . . . 14 89 6. Applicability . . . . . . . . . . . . . . . . . . . . . . . . 17 90 6.1. Applications . . . . . . . . . . . . . . . . . . . . . . 17 91 6.2. Use Cases . . . . . . . . . . . . . . . . . . . . . . . . 19 92 6.3. Applicability with Specific Link Technologies . . . . . . 20 93 6.4. Deployment Considerations . . . . . . . . . . . . . . . . 20 94 6.4.1. Deployment Topology . . . . . . . . . . . . . . . . . 21 95 6.4.2. Deployment Sequences . . . . . . . . . . . . . . . . 22 96 6.4.3. L4S Flow but Non-ECN Bottleneck . . . . . . . . . . . 25 97 6.4.4. L4S Flow but Classic ECN Bottleneck . . . . . . . . . 25 98 6.4.5. L4S AQM Deployment within Tunnels . . . . . . . . . . 26 99 7. IANA Considerations (to be removed by RFC Editor) . . . . . . 26 100 8. Security Considerations . . . . . . . . . . . . . . . . . . . 26 101 8.1. Traffic Rate (Non-)Policing . . . . . . . . . . . . . . . 26 102 8.2. 'Latency Friendliness' . . . . . . . . . . . . . . . . . 27 103 8.3. Interaction between Rate Policing and L4S . . . . . . . . 29 104 8.4. ECN Integrity . . . . . . . . . . . . . . . . . . . . . . 29 105 8.5. Privacy Considerations . . . . . . . . . . . . . . . . . 30 106 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 31 107 10. 
Informative References . . . . . . . . . . . . . . . . . . . 31 108 Appendix A. Standardization items . . . . . . . . . . . . . . . 38 109 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 40 111 1. Introduction 113 It is increasingly common for _all_ of a user's applications at any 114 one time to require low delay: interactive Web, Web services, voice, 115 conversational video, interactive video, interactive remote presence, 116 instant messaging, online gaming, remote desktop, cloud-based 117 applications and video-assisted remote control of machinery and 118 industrial processes. In the last decade or so, much has been done 119 to reduce propagation delay by placing caches or servers closer to 120 users. However, queuing remains a major, albeit intermittent, 121 component of latency. For instance, spikes of hundreds of 122 milliseconds are common, even with state-of-the-art active queue 123 management (AQM). During a long-running flow, queuing is typically 124 configured to cause overall network delay to roughly double relative 125 to expected base (unloaded) path delay. Low loss is also important 126 because, for interactive applications, losses translate into even 127 longer retransmission delays. 129 It has been demonstrated that, once access network bit rates reach 130 levels now common in the developed world, increasing capacity offers 131 diminishing returns if latency (delay) is not addressed. 132 Differentiated services (Diffserv) offers Expedited Forwarding 133 (EF [RFC3246]) for some packets at the expense of others, but this is 134 not sufficient when all (or most) of a user's applications require 135 low latency. 137 Therefore, the goal is an Internet service with ultra-Low queueing 138 Latency, ultra-Low Loss and Scalable throughput (L4S). Ultra-low 139 queuing latency means less than 1 millisecond (ms) on average and 140 less than about 2 ms at the 99th percentile.
L4S is potentially for 141 _all_ traffic - a service for all traffic needs none of the 142 configuration or management baggage (traffic policing, traffic 143 contracts) associated with favouring some traffic over others. This 144 document describes the L4S architecture for achieving these goals. 146 It must be said that queuing delay only degrades performance 147 infrequently [Hohlfeld14]. It only occurs when a large enough 148 capacity-seeking (e.g. TCP) flow is running alongside the user's 149 traffic in the bottleneck link, which is typically in the access 150 network, or when the low latency application is itself a large 151 capacity-seeking or adaptive rate (e.g. interactive video) flow. At 152 these times, the performance improvement from L4S must be sufficient 153 that network operators will be motivated to deploy it. 155 Active Queue Management (AQM) is part of the solution to queuing 156 under load. AQM improves performance for all traffic, but there is a 157 limit to how much queuing delay can be reduced by solely changing the 158 network without addressing the root of the problem. 160 The root of the problem is the presence of standard TCP congestion 161 control (Reno [RFC5681]) or compatible variants (e.g. TCP 162 Cubic [RFC8312]). We shall use the term 'Classic' for these Reno- 163 friendly congestion controls. Classic congestion controls induce 164 relatively large saw-tooth-shaped excursions up the queue and down 165 again, which have been growing as flow rate scales [RFC3649]. So if 166 a network operator naively attempts to reduce queuing delay by 167 configuring an AQM to operate at a shallower queue, a Classic 168 congestion control will significantly underutilize the link at the 169 bottom of every saw-tooth.
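This under-utilization can be illustrated with a toy model (all parameters below are illustrative assumptions, not taken from this document): an AIMD flow halves its congestion window whenever the queue exceeds the AQM's operating point, and the link goes idle whenever the window falls below the bandwidth-delay product (BDP).

```python
# Toy sketch of one Reno-like AIMD flow at a bottleneck. Units are
# illustrative: cwnd and queue in packets, time in round trips.
def aimd_utilization(bdp, aqm_target, rounds=10000):
    """Average link utilization with a drop/mark signal whenever the
    standing queue (cwnd - bdp) exceeds aqm_target packets."""
    cwnd = bdp / 2.0
    used = 0.0
    for _ in range(rounds):
        used += min(cwnd, bdp) / bdp  # link idles when cwnd < BDP
        if cwnd - bdp > aqm_target:   # queue above the AQM target
            cwnd /= 2.0               # Classic multiplicative decrease
        else:
            cwnd += 1.0               # additive increase per round trip
    return used / rounds

bdp = 300  # e.g. ~100 Mb/s x 36 ms RTT with 1500 B packets
deep = aimd_utilization(bdp, aqm_target=bdp)   # queue of order one RTT
shallow = aimd_utilization(bdp, aqm_target=5)  # near-empty queue target
print(f"RTT-deep AQM target: {deep:.0%}; shallow AQM target: {shallow:.0%}")
```

With the RTT-deep target, the bottom of each sawtooth just touches the BDP, so utilization stays near 100%; with the shallow target, the same halving leaves the link idle for much of every cycle. That trade-off is what L4S removes by changing the sender rather than just the AQM.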
171 It has been demonstrated that if the sending host replaces a Classic 172 congestion control with a 'Scalable' alternative, and a suitable AQM 173 is deployed in the network, the performance under load of all the 174 above interactive applications can be significantly improved. For 175 instance, queuing delay under heavy load with the example DCTCP/DualQ 176 solution cited below on a DSL or Ethernet link is roughly 1 to 2 177 milliseconds at the 99th percentile without losing link 178 utilization [DualPI2Linux], [DCttH15] (for other link types, see 179 Section 6.3). This compares with 5 to 20 ms on _average_ with a 180 Classic congestion control and current state-of-the-art AQMs such as 181 FQ-CoDel [RFC8290], PIE [RFC8033] or DOCSIS PIE [RFC8034] and about 182 20-30 ms at the 99th percentile [DualPI2Linux]. 184 It has also been demonstrated [DCttH15], [DualPI2Linux] that it is 185 possible to deploy such an L4S service alongside the existing best 186 efforts service so that all of a user's applications can shift to it 187 when their stack is updated. Access networks are typically designed 188 with one link as the bottleneck for each site (which might be a home, 189 small enterprise or mobile device), so deployment at each end of this 190 link should give nearly all the benefit in each direction. The L4S 191 approach also requires component mechanisms at the endpoints to 192 fulfill its goal. This document presents the L4S architecture by 193 describing the different components and how they interact to provide 194 the scalable, low latency, low loss Internet service. 196 2. L4S Architecture Overview 198 There are three main components to the L4S architecture: 200 1) Network: L4S traffic needs to be isolated from the queuing 201 latency of Classic traffic. One queue per application flow (FQ) 202 is one way to achieve this, e.g. FQ-CoDel [RFC8290].
However, 203 just two queues is sufficient and does not require inspection of 204 transport layer headers in the network, which is not always 205 possible (see Section 5.2). With just two queues, it might seem 206 impossible to know how much capacity to schedule for each queue 207 without inspecting how many flows at any one time are using each, 208 and it would be undesirable to arbitrarily divide access network 209 capacity into two partitions. The Dual Queue Coupled AQM was 210 developed as a minimal complexity solution to this problem. It 211 acts like a 'semi-permeable' membrane that partitions latency but 212 not bandwidth. As such, the two queues are for transition from 213 Classic to L4S behaviour, not bandwidth prioritization. Section 4 214 gives a high level explanation of how FQ and DualQ solutions work, 215 and [I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of 216 the DualQ Coupled AQM framework. 218 2) Protocol: A host needs to distinguish L4S and Classic packets 219 with an identifier so that the network can classify them into 220 their separate treatments. [I-D.ietf-tsvwg-ecn-l4s-id] considers 221 various alternative identifiers for L4S, and concludes that all 222 alternatives involve compromises, but the ECT(1) and CE codepoints 223 of the ECN field represent a workable solution. 225 3) Host: Scalable congestion controls already exist. They solve the 226 scaling problem with Reno congestion control that was explained in 227 [RFC3649]. The one used most widely (in controlled environments) 228 is Data Center TCP (DCTCP [RFC8257]), which has been implemented 229 and deployed in Windows Server Editions (since 2012), in Linux and 230 in FreeBSD. Although DCTCP as-is functions well over wide-area 231 round trip times, most implementations lack certain safety features 232 that will be necessary once it is used outside controlled environments 233 like data centres (see Section 6.4.3 and Appendix A).
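The scalable response that distinguishes DCTCP can be sketched as follows. This is a simplified per-round-trip model based on the description in [RFC8257]; the class name is illustrative, and real implementations track marks per ACK rather than per round trip:

```python
# Sketch of a DCTCP-style scalable congestion response [RFC8257]:
# the window is reduced in proportion to the fraction of packets that
# were ECN-marked, smoothed by the sender over its own round trips.
G = 1 / 16  # EWMA gain (DCTCP's recommended value)

class ScalableSender:
    def __init__(self, cwnd=10.0):
        self.cwnd = cwnd
        self.alpha = 1.0  # smoothed estimate of the marked fraction

    def on_round_trip(self, marked, acked):
        frac = marked / acked if acked else 0.0
        self.alpha = (1 - G) * self.alpha + G * frac
        if marked:
            # Proportionate decrease: a 10% mark rate cuts the window
            # by ~5%, giving small, frequent sawteeth instead of
            # Reno's halving on each (much rarer) signal.
            self.cwnd *= 1 - self.alpha / 2
        else:
            self.cwnd += 1.0  # additive increase, as in Reno

s = ScalableSender(cwnd=100.0)
s.on_round_trip(marked=10, acked=100)  # 10% of packets carried CE marks
```

Because the reduction shrinks as the mark rate falls, such a sender can tolerate the very frequent ECN signals of an L4S AQM without starving itself, which is what keeps the queue short without losing utilization.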
Scalable 234 congestion control will also need to be implemented in protocols 235 other than TCP (QUIC, SCTP, RTP/RTCP, RMCAT, etc.). Indeed, 236 between the drafting and publication of the present document, the 237 following scalable congestion controls were implemented: TCP 238 Prague [PragueLinux], QUIC Prague, an L4S variant of the RMCAT 239 SCReAM controller [RFC8298] and the L4S ECN part of 240 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] intended for TCP 241 and QUIC transports. 243 3. Terminology 245 Classic Congestion Control: A congestion control behaviour that can 246 co-exist with standard TCP Reno [RFC5681] without causing 247 significantly negative impact on its flow rate [RFC5033]. With 248 Classic congestion controls, as flow rate scales, the number of 249 round trips between congestion signals (losses or ECN marks) rises 250 with the flow rate, so it takes longer and longer to recover 251 after each congestion event. Therefore, control of queuing and 252 utilization becomes very slack, and the slightest disturbance 253 prevents a high rate from being attained [RFC3649]. 255 For instance, with 1500 byte packets and an end-to-end round trip 256 time (RTT) of 36 ms, over the years, as Reno flow rate scales from 257 2 to 100 Mb/s the number of round trips taken to recover from a 258 congestion event rises proportionately, from 4 to 200. 259 Cubic [RFC8312] was developed to be less unscalable, but it is 260 approaching its scaling limit; with the same RTT of 36 ms, at 261 100 Mb/s it takes about 106 round trips to recover, and at 800 Mb/s 262 its recovery time triples to over 340 round trips, or still more 263 than 12 seconds (Reno would take 57 seconds). 265 Scalable Congestion Control: A congestion control where the average 266 time from one congestion signal to the next (the recovery time) 267 remains invariant as the flow rate scales, all other factors being 268 equal.
This maintains the same degree of control over queueing 269 and utilization whatever the flow rate, as well as ensuring that 270 high throughput is more robust to disturbances (e.g. from new 271 flows starting). For instance, DCTCP averages 2 congestion 272 signals per round-trip whatever the flow rate, as do other 273 recently developed scalable congestion controls, e.g. Relentless 274 TCP [Mathis09], TCP Prague [PragueLinux] and the L4S variant of 275 SCReAM for real-time media [RFC8298]. See Section 4.3 of 276 [I-D.ietf-tsvwg-ecn-l4s-id] for more explanation. 278 Classic service: The Classic service is intended for all the 279 congestion control behaviours that co-exist with Reno [RFC5681] 280 (e.g. Reno itself, Cubic [RFC8312], 281 Compound [I-D.sridharan-tcpm-ctcp], TFRC [RFC5348]). The term 282 'Classic queue' means a queue providing the Classic service. 284 Low-Latency, Low-Loss Scalable throughput (L4S) service: The 'L4S' 285 service is intended for traffic from scalable congestion control 286 algorithms, such as Data Center TCP [RFC8257]. The L4S service is 287 for more general traffic than just DCTCP--it allows the set of 288 congestion controls with similar scaling properties to DCTCP to 289 evolve, such as the examples listed above (Relentless, Prague, 290 SCReAM). The term 'L4S queue' means a queue providing the L4S 291 service. 293 The terms Classic or L4S can also qualify other nouns, such as 294 'queue', 'codepoint', 'identifier', 'classification', 'packet', 295 'flow'. For example: an L4S packet means a packet with an L4S 296 identifier sent from an L4S congestion control. 298 Both Classic and L4S services can cope with a proportion of 299 unresponsive or less-responsive traffic as well, as long as it 300 does not build a queue (e.g. DNS, VoIP, game sync datagrams, etc.).
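As a back-of-envelope check of the Reno figures in the 'Classic Congestion Control' entry above (an illustrative sketch, assuming the average Reno window is 3/4 of the sawtooth peak and that recovery takes half the peak window in round trips; not text from this document):

```python
# Reproduce the Reno recovery-time arithmetic for 1500 B packets and a
# 36 ms RTT; all modelling assumptions are noted inline.
def reno_recovery_rounds(rate_bps, rtt_s, mtu_bytes=1500):
    avg_window = rate_bps * rtt_s / (8 * mtu_bytes)  # packets in flight
    peak_window = avg_window / 0.75  # sawtooth peak (avg = 3/4 of peak)
    return peak_window / 2           # rounds of +1/round to regrow

rtt = 0.036
print(round(reno_recovery_rounds(2e6, rtt)))    # -> 4 rounds at 2 Mb/s
print(round(reno_recovery_rounds(100e6, rtt)))  # -> 200 rounds at 100 Mb/s
print(round(reno_recovery_rounds(800e6, rtt) * rtt, 1))  # -> 57.6 s at 800 Mb/s
```

Under these assumptions the recovery time rises in direct proportion to the flow rate, matching the 4-to-200-round and roughly-57-second figures quoted in the entry above.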
302 Reno-friendly: The subset of Classic traffic that excludes 303 unresponsive traffic and excludes experimental congestion controls 304 intended to coexist with Reno but without always being strictly 305 friendly to it (as allowed by [RFC5033]). Reno-friendly is used 306 in place of 'TCP-friendly', given that friendliness is a property 307 of the congestion controller (Reno), not the wire protocol (TCP), 308 which is used with many different congestion control behaviours. 310 Classic ECN: The original Explicit Congestion Notification (ECN) 311 protocol [RFC3168], which requires ECN signals to be treated as 312 equivalent to drops, both when generated in the network and when 313 responded to by the sender. 315 The names used for the four codepoints of the 2-bit IP-ECN field 316 are as defined in [RFC3168]: Not ECT, ECT(0), ECT(1) and CE, where 317 ECT stands for ECN-Capable Transport and CE stands for Congestion 318 Experienced. 320 Site: A home, mobile device, small enterprise or campus, where the 321 network bottleneck is typically the access link to the site. Not 322 all network arrangements fit this model but it is a useful, widely 323 applicable generalization. 325 4. L4S Architecture Components 327 The L4S architecture is composed of the following elements. 329 Protocols: The L4S architecture encompasses two identifier changes 330 (an unassignment and an assignment) and optional further identifiers: 332 a. An essential aspect of a scalable congestion control is the use 333 of explicit congestion signals rather than losses, because the 334 signals need to be sent frequently and immediately. In contrast, 335 'Classic' ECN [RFC3168] requires an ECN signal to be treated as 336 equivalent to drop, both when it is generated in the network and 337 when it is responded to by hosts. 
L4S needs networks and hosts 338 to support a different meaning for ECN: 340 * much more frequent signals--too often to require an equivalent 341 excessive degree of drop from non-ECN flows; 343 * immediately tracking every fluctuation of the queue--too soon 344 to warrant dropping packets from non-ECN flows. 346 So the standards track [RFC3168] has had to be updated to allow 347 L4S packets to depart from the 'same as drop' constraint. 348 [RFC8311] is a standards track update to relax specific 349 requirements in RFC 3168 (and certain other standards track 350 RFCs), which clears the way for the experimental changes proposed 351 for L4S. [RFC8311] also reclassifies the original experimental 352 assignment of the ECT(1) codepoint as an ECN nonce [RFC3540] as 353 historic. 355 b. [I-D.ietf-tsvwg-ecn-l4s-id] recommends ECT(1) is used as the 356 identifier to classify L4S packets into a separate treatment from 357 Classic packets. This satisfies the requirements for identifying 358 an alternative ECN treatment in [RFC4774]. 360 The CE codepoint is used to indicate Congestion Experienced by 361 both L4S and Classic treatments. This raises the concern that a 362 Classic AQM earlier on the path might have marked some ECT(0) 363 packets as CE. Then these packets will be erroneously classified 364 into the L4S queue. [I-D.ietf-tsvwg-ecn-l4s-id] explains why 5 365 unlikely eventualities all have to coincide for this to have any 366 detrimental effect, which even then would only involve a 367 vanishingly small likelihood of a spurious retransmission. 369 c. A network operator might wish to include certain unresponsive, 370 non-L4S traffic in the L4S queue if it is deemed to be smoothly 371 enough paced and low enough rate not to build a queue. For 372 instance, VoIP, low rate datagrams to sync online games, 373 relatively low rate application-limited traffic, DNS, LDAP, etc. 374 This traffic would need to be tagged with specific identifiers, 375 e.g. 
a low latency Diffserv Codepoint such as Expedited 376 Forwarding (EF [RFC3246]), Non-Queue-Building 377 (NQB [I-D.white-tsvwg-nqb]), or operator-specific identifiers. 379 Network components: The L4S architecture aims to provide low latency 380 without the _need_ for per-flow operations in network components. 381 Nonetheless, the architecture does not preclude per-flow solutions - 382 it encompasses the following combinations: 384 a. The Dual Queue Coupled AQM (illustrated in Figure 1) achieves the 385 'semi-permeable' membrane property mentioned earlier as follows. 386 The obvious part is that using two separate queues isolates the 387 queuing delay of one from the other. The less obvious part is 388 how the two queues act as if they are a single pool of bandwidth 389 without the scheduler needing to decide between them. This is 390 achieved by having the Classic AQM provide a congestion signal to 391 both queues in a manner that ensures a consistent response from 392 the two types of congestion control. In other words, the Classic 393 AQM generates a drop/mark probability based on congestion in the 394 Classic queue, uses this probability to drop/mark packets in that 395 queue, and also uses this probability to affect the marking 396 probability in the L4S queue. This coupling of the congestion 397 signaling between the two queues makes the L4S flows slow down to 398 leave the right amount of capacity for the Classic traffic (as 399 they would if they were the same type of traffic sharing the same 400 queue). Then the scheduler can serve the L4S queue with 401 priority, because the L4S traffic isn't offering up enough 402 traffic to use all the priority that it is given. 
Therefore, on 403 short time-scales (sub-round-trip) the prioritization of the L4S 404 queue protects its low latency by allowing bursts to dissipate 405 quickly; but on longer time-scales (round-trip and longer) the 406 Classic queue creates an equal and opposite pressure against the 407 L4S traffic to ensure that neither has priority when it comes to 408 bandwidth. The tension between prioritizing L4S and coupling 409 marking from Classic results in per-flow fairness. To protect 410 against unresponsive traffic in the L4S queue taking advantage of 411 the prioritization and starving the Classic queue, it is 412 advisable not to use strict priority, but instead to use a 413 weighted scheduler (see Appendix A of 414 [I-D.ietf-tsvwg-aqm-dualq-coupled]). 416 When there is no Classic traffic, the L4S queue's AQM comes into 417 play, and it sets an appropriate marking rate to maintain ultra- 418 low queuing delay. 420 The Dual Queue Coupled AQM has been specified as generically as 421 possible [I-D.ietf-tsvwg-aqm-dualq-coupled] without specifying 422 the particular AQMs to use in the two queues so that designers 423 are free to implement diverse ideas. Informational appendices in 424 that draft give pseudocode examples of two different specific AQM 425 approaches: one called DualPI2 (pronounced Dual PI 426 Squared) [DualPI2Linux] that uses the PI2 variant of PIE, and a 427 zero-config variant of RED called Curvy RED. A DualQ Coupled AQM 428 based on PIE has also been specified and implemented for Low 429 Latency DOCSIS [DOCSIS3.1].

431                 (2)                            (1)
432        .-------^------.       .--------------^-------------------.
433       ,-(3)-----.                  ______
434      ; ________  :        L4S --------.      |      |
435      :|Scalable| :          _\        ||___\_| mark |
436      :| sender | :  __________  / /   ||  /  |______|\   _________
437      :|________|\;  |          |/ --------'      ^    \1|condit'nl|
438      `---------'\_|  IP-ECN   |       Coupling :       \|priority |_\
439       ________  / |Classifier|                 :       /|scheduler| /
440      |Classic |/  |__________|\  --------.  ___:__    / |_________|
441      | sender |      \_\        ||  |      |||___\_| mark/|/
442      |________|       /         ||  |      ||| /   | drop |
443                Classic --------'            |______|

445 Figure 1: Components of an L4S Solution: 1) Isolation in separate 446 network queues; 2) Packet Identification Protocol; and 3) Scalable 447 Sending Host 449 b. A scheduler with per-flow queues can be used for L4S. It is 450 simple to modify an existing design such as FQ-CoDel or FQ-PIE. 451 For instance, within each queue of an FQ-CoDel system, as well as 452 a CoDel AQM, immediate (unsmoothed) shallow threshold ECN marking 453 has been added (see Sec. 5.2.7 of [RFC8290]). Then the Classic 454 AQM such as CoDel or PIE is applied to non-ECN or ECT(0) packets, 455 while the shallow threshold is applied to ECT(1) packets, to give 456 sub-millisecond average queue delay. 458 c. It would also be possible to use dual queues for isolation, but 459 with per-flow marking to control flow-rates (instead of the 460 coupled per-queue marking of the Dual Queue Coupled AQM). One of 461 the two queues would be for isolating L4S packets, which would be 462 classified by the ECN codepoint. Flow rates could be controlled 463 by flow-specific marking. The policy goal of the marking could 464 be to differentiate flow rates (e.g. [Nadas20], which requires 465 additional signalling of a per-flow 'value'), or to equalize 466 flow-rates (perhaps in a similar way to Approx Fair CoDel [AFCD], 467 [I-D.morton-tsvwg-codel-approx-fair], but with two queues not 468 one). 470 Note that whenever the term 'DualQ' is used loosely without 471 saying whether marking is per-queue or per-flow, it means a dual 472 queue AQM with per-queue marking. 474 Host mechanisms: The L4S architecture includes two main mechanisms in 475 the end host that we enumerate next: 477 a. Scalable Congestion Control: Data Center TCP is the most widely 478 used example.
It has been documented as an informational record 479 of the protocol currently in use in controlled 480 environments [RFC8257]. A draft list of safety and performance 481 improvements for a scalable congestion control to be usable on 482 the public Internet has been drawn up (the so-called 'Prague L4S 483 requirements' in Appendix A of [I-D.ietf-tsvwg-ecn-l4s-id]). The 484 subset that involves risk of harm to others has been captured as 485 normative requirements in Section 4 of 486 [I-D.ietf-tsvwg-ecn-l4s-id]. TCP Prague has been implemented in 487 Linux as a reference implementation to address these requirements 488 [PragueLinux]. 490 Transport protocols other than TCP use various congestion 491 controls that are designed to be friendly with Reno. Before they 492 can use the L4S service, it will be necessary to implement 493 scalable variants of each of these congestion control behaviours, 494 which will have to indicate their scalable 495 congestion response by using 496 the ECT(1) codepoint. Scalable variants are under consideration 497 for some new transport protocols that are themselves under 498 development, e.g. QUIC. The L4S ECN part of 499 BBRv2 [I-D.cardwell-iccrg-bbr-congestion-control] is a scalable 500 congestion control intended for the TCP and QUIC transports, 501 amongst others, and an L4S variant of the RMCAT SCReAM 502 controller [RFC8298] has been implemented for media transported 503 over RTP. 505 b. ECN feedback is sufficient for L4S in some transport protocols 506 (specifically DCCP [RFC4340] and QUIC [I-D.ietf-quic-transport]), 507 but others either require update or are in the process of being 508 updated: 510 * For the case of TCP, the feedback protocol for ECN embeds the 511 assumption from Classic ECN [RFC3168] that an ECN mark is 512 equivalent to a drop, making it unusable for a scalable TCP.
513 Therefore, the implementation of TCP receivers will have to be 514 upgraded [RFC7560]. Work to standardize and implement more 515 accurate ECN feedback for TCP (AccECN) is in 516 progress [I-D.ietf-tcpm-accurate-ecn], [PragueLinux]. 518 * ECN feedback is only roughly sketched in an appendix of the 519 SCTP specification [RFC4960]. A fuller specification has been 520 proposed in a long-expired draft [I-D.stewart-tsvwg-sctpecn], 521 which would need to be implemented and deployed before SCTP 522 could support L4S. 524 * For RTP, sufficient ECN feedback was defined in [RFC6679], but 525 [I-D.ietf-avtcore-cc-feedback-message] defines the latest 526 standards track improvements. 528 5. Rationale 530 5.1. Why These Primary Components? 532 Explicit congestion signalling (protocol): Explicit congestion 533 signalling is a key part of the L4S approach. In contrast, use of 534 drop as a congestion signal creates a tension because drop is both 535 an impairment (less would be better) and a useful signal (more 536 would be better): 538 * Explicit congestion signals can be used many times per round 539 trip, to keep tight control, without any impairment. Under 540 heavy load, even more explicit signals can be applied so the 541 queue can be kept short whatever the load, whereas state-of- 542 the-art AQMs have to introduce very high packet drop at high 543 load to keep the queue short. Further, when using ECN, the 544 congestion control's sawtooth reduction can be smaller and 545 therefore return to the operating point more often, without 546 worrying that this causes more signals (one at the top of each 547 smaller sawtooth). The consequent smaller amplitude sawteeth 548 fit between a very shallow marking threshold and an empty 549 queue, so queue delay variation can be very low, without risk 550 of under-utilization. 552 * Explicit congestion signals can be sent immediately to track 553 fluctuations of the queue.
L4S shifts smoothing from the 554 network (which doesn't know the round trip times of all the 555 flows) to the host (which knows its own round trip time). 556 Previously, the network had to smooth to keep a worst-case 557 round trip stable, which delayed congestion signals by 100-200 558 ms. 560 All the above makes it clear that explicit congestion signalling 561 is only advantageous for latency if it does not have to be 562 considered 'equivalent to' drop (as was required with Classic 563 ECN [RFC3168]). Therefore, in an L4S AQM, the L4S queue uses a 564 new L4S variant of ECN that is not equivalent to 565 drop [I-D.ietf-tsvwg-ecn-l4s-id], while the Classic queue uses 566 either Classic ECN [RFC3168] or drop, which are equivalent to each 567 other. 569 Before Classic ECN was standardized, there were various proposals 570 to give an ECN mark a different meaning from drop. However, there 571 was no particular reason to agree on any one of the alternative 572 meanings, so 'equivalent to drop' was the only compromise that 573 could be reached. RFC 3168 contains a statement that: 575 "An environment where all end nodes were ECN-Capable could 576 allow new criteria to be developed for setting the CE 577 codepoint, and new congestion control mechanisms for end-node 578 reaction to CE packets. However, this is a research issue, and 579 as such is not addressed in this document." 581 Latency isolation (network): L4S congestion controls keep queue 582 delay low whereas Classic congestion controls need a queue of the 583 order of the RTT to avoid under-utilization. One queue cannot 584 have two lengths; therefore, L4S traffic needs to be isolated in a 585 separate queue (e.g. DualQ) or queues (e.g. FQ).
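The host-side smoothing and small proportional sawteeth described under 'Explicit congestion signalling' above can be sketched in a few lines. This is an illustrative rendering of the DCTCP-style response of [RFC8257], not a reference implementation: the class name is invented here, the EWMA gain of 1/16 follows RFC 8257's suggestion, and real implementations apply the reduction at most once per window of data.

```python
# Minimal, illustrative sketch of a DCTCP-style scalable response
# (after RFC 8257).  The sender smooths the fraction of CE-marked
# packets over its *own* round trip, then reduces cwnd in proportion
# to that fraction: many small sawteeth instead of rare halvings.
# Names and constants are illustrative, not from any implementation.

class ScalableSender:
    def __init__(self, cwnd=100.0, gain=1.0 / 16):
        self.cwnd = cwnd        # congestion window, in packets
        self.alpha = 0.0        # smoothed fraction of marked packets
        self.gain = gain        # EWMA gain (RFC 8257 suggests 1/16)

    def on_round_trip(self, acked, marked):
        """Update once per RTT with counts of acked and CE-marked packets."""
        frac = marked / acked if acked else 0.0
        # Smoothing happens here, in the host, over one of its own RTTs,
        # not in the network over an assumed worst-case RTT.
        self.alpha += self.gain * (frac - self.alpha)
        if marked:
            # Proportional reduction, not a Classic multiplicative halving.
            self.cwnd *= 1.0 - self.alpha / 2.0
        else:
            self.cwnd += 1.0    # ordinary additive increase

s = ScalableSender()
s.on_round_trip(acked=100, marked=5)   # one RTT with 5% CE marks
```

With a steady marking probability of 5%, alpha converges towards 0.05, giving reductions of only about 2.5% per round trip, which is why the sawteeth fit between a shallow marking threshold and an empty queue.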
587 Coupled congestion notification: Coupling the congestion 588 notification between two queues as in the DualQ Coupled AQM is not 589 necessarily essential, but it is a simple way to allow senders to 590 determine their rate, packet by packet, rather than be overridden 591 by a network scheduler. An alternative is for a network scheduler 592 to control the rate of each application flow (see discussion in 593 Section 5.2). 595 L4S packet identifier (protocol): Once there are at least two 596 treatments in the network, hosts need an identifier at the IP 597 layer to distinguish which treatment they intend to use. 599 Scalable congestion notification: A scalable congestion control in 600 the host keeps the signalling frequency from the network high so 601 that rate variations can be small when signalling is stable, and 602 rate can track variations in available capacity as rapidly as 603 possible otherwise. 605 Low loss: Latency is not the only concern of L4S. The 'Low Loss' 606 part of the name denotes that L4S generally achieves zero 607 congestion loss due to its use of ECN. Otherwise, loss would 608 itself cause delay, particularly for short flows, due to 609 retransmission delay [RFC2884]. 611 Scalable throughput: The 'Scalable throughput' part of the name 612 denotes that the per-flow throughput of scalable congestion 613 controls should scale indefinitely, avoiding the imminent scaling 614 problems with Reno-friendly congestion control 615 algorithms [RFC3649]. It was known when TCP congestion avoidance 616 was first developed that it would not scale to high bandwidth- 617 delay products (see footnote 6 in [TCP-CA]). Today, regular 618 broadband bit-rates over WAN distances are already beyond the 619 scaling range of Classic Reno congestion control. So 'less 620 unscalable' Cubic [RFC8312] and Compound [I-D.sridharan-tcpm-ctcp] 621 variants of TCP have been successfully deployed. However, these 622 are now approaching their scaling limits.
As the examples in 623 Section 3 demonstrate, as flow rate scales, Classic congestion 624 controls like Reno or Cubic induce a congestion signal more and 625 more infrequently (hundreds of round trips at today's flow rates 626 and growing), which makes dynamic control very sloppy. In 627 contrast, on average, a scalable congestion control like DCTCP or 628 TCP Prague induces 2 congestion signals per round trip, which 629 remains invariant for any flow rate, keeping dynamic control very 630 tight. 632 Although work on scaling congestion controls tends to start with 633 TCP as the transport, the above is not intended to exclude other 634 transports (e.g. SCTP, QUIC) or less elastic algorithms 635 (e.g. RMCAT), which all tend to adopt the same or similar 636 developments. 638 5.2. What L4S adds to Existing Approaches 640 All the following approaches address some part of the same problem 641 space as L4S. In each case, it is shown that L4S complements them or 642 improves on them, rather than being a mutually exclusive alternative: 644 Diffserv: Diffserv addresses the problem of bandwidth apportionment 645 for important traffic as well as queuing latency for delay- 646 sensitive traffic. Of these, L4S solely addresses the problem of 647 queuing latency. Diffserv will still be necessary where important 648 traffic requires priority (e.g. for commercial reasons, or for 649 protection of critical infrastructure traffic) - see 650 [I-D.briscoe-tsvwg-l4s-diffserv]. Nonetheless, the L4S approach 651 can provide low latency for _all_ traffic within each Diffserv 652 class (including the case where there is only the one default 653 Diffserv class). 655 Also, Diffserv only works for a small subset of the traffic on a 656 link. As already explained, it is not applicable when all the 657 applications in use at one time at a single site (home, small 658 business or mobile device) require low latency.
In contrast, 659 because L4S is for all traffic, it needs none of the management 660 baggage (traffic policing, traffic contracts) associated with 661 favouring some packets over others. This baggage has probably 662 held Diffserv back from widespread end-to-end deployment. 664 In particular, because networks tend not to trust end systems to 665 identify which packets should be favoured over others, where 666 networks assign packets to Diffserv classes they often use packet 667 inspection of application flow identifiers or deeper inspection of 668 application signatures. Thus, nowadays, Diffserv doesn't always 669 sit well with encryption of the layers above IP. So users have to 670 choose between privacy and QoS. 672 As with Diffserv, the L4S identifier is in the IP header. But, in 673 contrast to Diffserv, the L4S identifier does not convey a want or 674 a need for a certain level of quality. Rather, it promises a 675 certain behaviour (scalable congestion response), which networks 676 can objectively verify if they need to. This is because low delay 677 depends on collective host behaviour, whereas bandwidth priority 678 depends on network behaviour. 680 State-of-the-art AQMs: AQMs such as PIE and FQ-CoDel give a 681 significant reduction in queuing delay relative to no AQM at all. 682 L4S is intended to complement these AQMs, and should not distract 683 from the need to deploy them as widely as possible. Nonetheless, 684 AQMs alone cannot reduce queuing delay too far without 685 significantly reducing link utilization, because the root cause of 686 the problem is on the host - where Classic congestion controls use 687 large saw-toothing rate variations. The L4S approach resolves 688 this tension by ensuring hosts can minimize the size of their 689 sawteeth without appearing so aggressive to Classic flows that 690 they starve them. 
692 Per-flow queuing or marking: Similarly, per-flow approaches such as 693 FQ-CoDel or Approx Fair CoDel [AFCD] are not incompatible with the 694 L4S approach. However, per-flow queuing alone is not enough - it 695 only isolates the queuing of one flow from others, not from 696 itself. Per-flow implementations still need to have support for 697 scalable congestion control added, which has already been done in 698 FQ-CoDel (see Sec.5.2.7 of [RFC8290]). Without this simple 699 modification, per-flow AQMs like FQ-CoDel would still not be able 700 to support applications that need both ultra-low delay and high 701 bandwidth, e.g. video-based control of remote procedures, or 702 interactive cloud-based video (see Note 1 below). 704 Although per-flow techniques are not incompatible with L4S, it is 705 important to have the DualQ alternative. This is because handling 706 end-to-end (layer 4) flows in the network (layer 3 or 2) precludes 707 some important end-to-end functions. For instance: 709 A. Per-flow forms of L4S like FQ-CoDel are incompatible with full 710 end-to-end encryption of transport layer identifiers for 711 privacy and confidentiality (e.g. IPsec or encrypted VPN 712 tunnels), because they require packet inspection to access the 713 end-to-end transport flow identifiers. 715 In contrast, the DualQ form of L4S requires no deeper 716 inspection than the IP layer. So, as long as operators take 717 the DualQ approach, their users can have both ultra-low 718 queuing delay and full end-to-end encryption [RFC8404]. 720 B. With per-flow forms of L4S, the network takes over control of 721 the relative rates of each application flow. Some see it as 722 an advantage that the network will prevent some flows running 723 faster than others. Others consider it an inherent part of 724 the Internet's appeal that applications can control their rate 725 while taking account of the needs of others via congestion 726 signals.
They maintain that this has allowed applications 727 with interesting rate behaviours to evolve, for instance, 728 variable bit-rate video that varies around an equal share 729 rather than being forced to remain equal at every instant, or 730 scavenger services that use less than an equal share of 731 capacity [LEDBAT_AQM]. 733 The L4S architecture does not require the IETF to commit to 734 one approach over the other, because it supports both, so that 735 the market can decide. Nonetheless, in the spirit of 'Do one 736 thing and do it well' [McIlroy78], the DualQ option provides 737 low delay without prejudging the issue of flow-rate control. 738 Then, flow rate policing can be added separately if desired. 739 This allows application control up to a point, but the network 740 can still choose to set the point at which it intervenes to 741 prevent one flow completely starving another. 743 Note: 745 1. It might seem that self-inflicted queuing delay within a per- 746 flow queue should not be counted, because if the delay wasn't 747 in the network it would just shift to the sender. However, 748 modern adaptive applications, e.g. HTTP/2 [RFC7540] or some 749 interactive media applications (see Section 6.1), can keep low 750 latency objects at the front of their local send queue by 751 shuffling priorities of other objects dependent on the 752 progress of other transfers. They cannot shuffle objects once 753 they have released them into the network. 755 Alternative Back-off ECN (ABE): Here again, L4S is not an 756 alternative to ABE but a complement that introduces much lower 757 queuing delay. ABE [RFC8511] alters the host behaviour in 758 response to ECN marking to utilize a link better and give ECN 759 flows faster throughput. It uses ECT(0) and assumes the network 760 still treats ECN and drop the same. Therefore ABE exploits any 761 lower queuing delay that AQMs can provide. 
But as explained 762 above, AQMs still cannot reduce queuing delay too far without 763 losing link utilization (to allow for other, non-ABE, flows). 765 BBR: Bottleneck Bandwidth and Round-trip propagation time 766 (BBR [I-D.cardwell-iccrg-bbr-congestion-control]) controls queuing 767 delay end-to-end without needing any special logic in the network, 768 such as an AQM. So it works pretty much on any path (although it 769 has not been without problems, particularly capacity sharing in 770 BBRv1). BBR keeps queuing delay reasonably low, but perhaps not 771 quite as low as with state-of-the-art AQMs such as PIE or FQ- 772 CoDel, and certainly nowhere near as low as with L4S. Queuing 773 delay is also not consistently low, due to BBR's regular bandwidth 774 probing spikes and its aggressive flow start-up phase. 776 L4S complements BBR. Indeed, BBRv2 uses L4S ECN and a scalable L4S 777 congestion control behaviour in response to any ECN signalling 778 from the path. The L4S ECN signal complements the delay based 779 congestion control aspects of BBR with an explicit indication that 780 hosts can use, both to converge on a fair rate and to keep below a 781 shallow queue target set by the network. Without L4S ECN, both 782 these aspects need to be assumed or estimated. 784 6. Applicability 786 6.1. Applications 788 A transport layer that solves the current latency issues will provide 789 new service, product and application opportunities. 791 With the L4S approach, the following existing applications also 792 experience significantly better quality of experience under load: 794 o Gaming, including cloud based gaming; 796 o VoIP; 798 o Video conferencing; 800 o Web browsing; 802 o (Adaptive) video streaming; 804 o Instant messaging.
806 The significantly lower queuing latency also enables some interactive 807 application functions to be offloaded to the cloud that would hardly 808 even be usable today: 810 o Cloud based interactive video; 812 o Cloud based virtual and augmented reality. 814 The above two applications have been successfully demonstrated with 815 L4S, both running together over a 40 Mb/s broadband access link 816 loaded up with the numerous other latency sensitive applications in 817 the previous list as well as numerous downloads - all sharing the 818 same bottleneck queue simultaneously [L4Sdemo16]. For the former, a 819 panoramic video of a football stadium could be swiped and pinched so 820 that, on the fly, a proxy in the cloud could generate a sub-window of 821 the match video under the finger-gesture control of each user. For 822 the latter, a virtual reality headset displayed a viewport taken from 823 a 360 degree camera in a racing car. The user's head movements 824 controlled the viewport extracted by a cloud-based proxy. In both 825 cases, with 7 ms end-to-end base delay, the additional queuing delay 826 of roughly 1 ms was so low that it seemed the video was generated 827 locally. 829 Using a swiping finger gesture or head movement to pan a video is an 830 extremely latency-demanding action--far more demanding than VoIP. 831 This is because human vision can detect extremely low delays of the 832 order of single milliseconds when delay is translated into a visual 833 lag between a video and a reference point (the finger, or the 834 orientation of the head sensed by the balance system in the inner ear 835 -- the vestibular system). 837 Without the low queuing delay of L4S, cloud-based applications like 838 these would not be credible without significantly more access 839 bandwidth (to deliver all possible video that might be viewed) and 840 more local processing, which would increase the weight and power 841 consumption of head-mounted displays.
When all interactive 842 processing can be done in the cloud, only the data to be rendered for 843 the end user needs to be sent. 845 Other low latency high bandwidth applications such as: 847 o Interactive remote presence; 849 o Video-assisted remote control of machinery or industrial 850 processes. 852 are not credible at all without very low queuing delay. No amount of 853 extra access bandwidth or local processing can make up for lost time. 855 6.2. Use Cases 857 The following use-cases for L4S are being considered by various 858 interested parties: 860 o Where the bottleneck is one of various types of access network: 861 e.g. DSL, Passive Optical Networks (PON), DOCSIS cable, mobile, 862 satellite (see Section 6.3 for some technology-specific details) 864 o Private networks of heterogeneous data centres, where there is no 865 single administrator that can arrange for all the simultaneous 866 changes to senders, receivers and network needed to deploy DCTCP: 868 * a set of private data centres interconnected over a wide area 869 with separate administrations, but within the same company 871 * a set of data centres operated by separate companies 872 interconnected by a community of interest network (e.g. for the 873 finance sector) 875 * multi-tenant (cloud) data centres where tenants choose their 876 operating system stack (Infrastructure as a Service - IaaS) 878 o Different types of transport (or application) congestion control: 880 * elastic (TCP/SCTP); 882 * real-time (RTP, RMCAT); 884 * query (DNS/LDAP). 886 o Where low delay quality of service is required, but without 887 inspecting or intervening above the IP layer [RFC8404]: 889 * mobile and other networks have tended to inspect higher layers 890 in order to guess application QoS requirements. However, with 891 growing demand for support of privacy and encryption, L4S 892 offers an alternative. 
There is no need to select which 893 traffic to favour for queuing, when L4S gives favourable 894 queuing to all traffic. 896 o If queuing delay is minimized, applications with a fixed delay 897 budget can communicate over longer distances, or via a longer 898 chain of service functions [RFC7665] or onion routers. 900 o If delay jitter is minimized, it is possible to reduce the 901 dejitter buffers on the receive end of video streaming, which 902 should improve the interactive experience. 904 6.3. Applicability with Specific Link Technologies 906 Certain link technologies aggregate data from multiple packets into 907 bursts, and buffer incoming packets while building each burst. WiFi, 908 PON and cable all involve such packet aggregation, whereas fixed 909 Ethernet and DSL do not. No sender, whether L4S or not, can do 910 anything to reduce the buffering needed for packet aggregation. So 911 an AQM should not count this buffering as part of the queue that it 912 controls, given that no amount of congestion signalling will reduce it. 914 Certain link technologies also add buffering for other reasons, 915 specifically: 917 o Radio links (cellular, WiFi, satellite) that are distant from the 918 source are particularly challenging. The radio link capacity can 919 vary rapidly by orders of magnitude, so it is considered desirable 920 to hold a standing queue that can utilize sudden increases of 921 capacity; 923 o Cellular networks are further complicated by a perceived need to 924 buffer in order to make hand-overs imperceptible. 926 L4S cannot remove the need for all these different forms of 927 buffering. However, by removing 'the longest pole in the tent' 928 (buffering for the large sawteeth of Classic congestion controls), 929 L4S exposes all these 'shorter poles' to greater scrutiny. 931 Until now, the buffering needed for these additional reasons tended 932 to be over-specified - with the excuse that none were 'the longest 933 pole in the tent'.
But having removed the 'longest pole', it becomes 934 worthwhile to minimize them, for instance reducing packet aggregation 935 burst sizes and MAC scheduling intervals. 937 6.4. Deployment Considerations 939 L4S AQMs, whether DualQ [I-D.ietf-tsvwg-aqm-dualq-coupled] or FQ, 940 e.g. [RFC8290], are, in themselves, an incremental deployment 941 mechanism for L4S - so that L4S traffic can coexist with existing 942 Classic (Reno-friendly) traffic. Section 6.4.1 explains why only 943 deploying an L4S AQM in one node at each end of the access link will 944 realize nearly all the benefit of L4S. 946 L4S involves both end systems and the network, so Section 6.4.2 947 suggests some typical sequences to deploy each part, and why there 948 will be an immediate and significant benefit after deploying just one 949 part. 951 Section 6.4.3 and Section 6.4.4 describe the converse incremental 952 deployment case where there is no L4S AQM at the network bottleneck, 953 so any L4S flow traversing this bottleneck has to take care in case 954 it is competing with Classic traffic. 956 6.4.1. Deployment Topology 958 L4S AQMs will not have to be deployed throughout the Internet before 959 L4S will work for anyone. Operators of public Internet access 960 networks typically design their networks so that the bottleneck will 961 nearly always occur at one known (logical) link. This confines the 962 cost of queue management technology to one place. 964 The case of mesh networks is different and will be discussed later in 965 this section. But the known bottleneck case is generally true for 966 Internet access to all sorts of different 'sites', where the word 967 'site' includes home networks, small- to medium-sized campus or 968 enterprise networks and even cellular devices (Figure 2). Also, this 969 known-bottleneck case tends to be applicable whatever the access link 970 technology; whether xDSL, cable, PON, cellular, line of sight 971 wireless or satellite.
973 Therefore, the full benefit of the L4S service should be available in 974 the downstream direction when an L4S AQM is deployed at the ingress 975 to this bottleneck link. And similarly, the full upstream service 976 will be available once an L4S AQM is deployed at the ingress into the 977 upstream link. (Of course, multi-homed sites would only see the full 978 benefit once all their access links were covered.) 979 ______ 980 ( ) 981 __ __ ( ) 982 |DQ\________/DQ|( enterprise ) 983 ___ |__/ \__| ( /campus ) 984 ( ) (______) 985 ( ) ___||_ 986 +----+ ( ) __ __ / \ 987 | DC |-----( Core )|DQ\_______________/DQ|| home | 988 +----+ ( ) |__/ \__||______| 989 (_____) __ 990 |DQ\__/\ __ ,===. 991 |__/ \ ____/DQ||| ||mobile 992 \/ \__|||_||device 993 | o | 994 `---' 996 Figure 2: Likely location of DualQ (DQ) Deployments in common access 997 topologies 999 Deployment in mesh topologies depends on how over-booked the core is. 1000 If the core is non-blocking, or at least generously provisioned so 1001 that the edges are nearly always the bottlenecks, it would only be 1002 necessary to deploy an L4S AQM at the edge bottlenecks. For example, 1003 some data-centre networks are designed with the bottleneck in the 1004 hypervisor or host NICs, while others bottleneck at the top-of-rack 1005 switch (both the output ports facing hosts and those facing the 1006 core). 1008 An L4S AQM would eventually also need to be deployed at any other 1009 persistent bottlenecks such as network interconnections, e.g. some 1010 public Internet exchange points and the ingress and egress to WAN 1011 links interconnecting data-centres. 1013 6.4.2. Deployment Sequences 1015 For any one L4S flow to work, it requires 3 parts to have been 1016 deployed. This was the same deployment problem that ECN 1017 faced [RFC8170] so we have learned from that experience. 
1019 Firstly, L4S deployment exploits the fact that DCTCP already exists 1020 on many Internet hosts (Windows, FreeBSD and Linux), both servers and 1021 clients. Therefore, just deploying an L4S AQM at a network 1022 bottleneck immediately gives a working deployment of all the L4S 1023 parts. DCTCP has some safety concerns that need to be fixed before 1024 general use over the public Internet (see Section 2.3 of 1025 [I-D.ietf-tsvwg-ecn-l4s-id]), but DCTCP is not on by default, so 1026 these issues can be managed within controlled deployments or 1027 controlled trials. 1029 Secondly, the performance improvement with L4S is so significant that 1030 it enables new interactive services and products that were not 1031 previously possible. It is much easier for companies to initiate new 1032 work on deployment if there is budget for a new product trial. If, 1033 in contrast, there were only an incremental performance improvement 1034 (as with Classic ECN), spending on deployment would be much harder 1035 to justify. 1037 Thirdly, the L4S identifier is defined so that initially network 1038 operators can enable L4S exclusively for certain customers or certain 1039 applications. But this is carefully defined so that it does not 1040 compromise future evolution towards L4S as an Internet-wide service. 1041 This is because the L4S identifier is defined not only as the end-to- 1042 end ECN field, but it can also optionally be combined with any other 1043 packet header or some status of a customer or their access 1044 link [I-D.ietf-tsvwg-ecn-l4s-id]. Operators could do this anyway, 1045 even if it were not blessed by the IETF. However, it is best for the 1046 IETF to specify that, if they use their own local identifier, it must 1047 be in combination with the IETF's identifier.
Then, if an operator 1048 has opted for an exclusive local-use approach, later they only have 1049 to remove this extra rule to make the service work Internet-wide - it 1050 will already traverse middleboxes, peerings, etc. 1052 +-+--------------------+----------------------+---------------------+ 1053 | | Servers or proxies | Access link | Clients | 1054 +-+--------------------+----------------------+---------------------+ 1055 |0| DCTCP (existing) | | DCTCP (existing) | 1056 +-+--------------------+----------------------+---------------------+ 1057 |1| |Add L4S AQM downstream| | 1058 | | WORKS DOWNSTREAM FOR CONTROLLED DEPLOYMENTS/TRIALS | 1059 +-+--------------------+----------------------+---------------------+ 1060 |2| Upgrade DCTCP to | |Replace DCTCP feedb'k| 1061 | | TCP Prague | | with AccECN | 1062 | | FULLY WORKS DOWNSTREAM | 1063 +-+--------------------+----------------------+---------------------+ 1064 | | | | Upgrade DCTCP to | 1065 |3| | Add L4S AQM upstream | TCP Prague | 1066 | | | | | 1067 | | FULLY WORKS UPSTREAM AND DOWNSTREAM | 1068 +-+--------------------+----------------------+---------------------+ 1070 Figure 3: Example L4S Deployment Sequence 1072 Figure 3 illustrates some example sequences in which the parts of L4S 1073 might be deployed. It consists of the following stages: 1075 1. Here, the immediate benefit of a single AQM deployment can be 1076 seen, but limited to a controlled trial or controlled deployment. 1077 In this example downstream deployment is first, but in other 1078 scenarios the upstream might be deployed first. If no AQM at all 1079 was previously deployed for the downstream access, an L4S AQM 1080 greatly improves the Classic service (as well as adding the L4S 1081 service). If an AQM was already deployed, the Classic service 1082 will be unchanged (and L4S will add an improvement on top). 1084 2. 
In this stage, the name 'TCP Prague' [PragueLinux] is used to 1085 represent a variant of DCTCP that is safe to use in a production 1086 Internet environment. If the application is primarily 1087 unidirectional, 'TCP Prague' at one end will provide all the 1088 benefit needed. For TCP transports, Accurate ECN feedback 1089 (AccECN) [I-D.ietf-tcpm-accurate-ecn] is needed at the other end, 1090 but it is a generic ECN feedback facility that is already planned 1091 to be deployed for other purposes, e.g. DCTCP, BBR. The two ends 1092 can be deployed in either order, because, in TCP, an L4S 1093 congestion control only enables itself if it has negotiated the 1094 use of AccECN feedback with the other end during the connection 1095 handshake. Thus, deployment of TCP Prague on a server enables 1096 L4S trials to move to a production service in one direction, 1097 wherever AccECN is deployed at the other end. This stage might 1098 be further motivated by the performance improvements of TCP 1099 Prague relative to DCTCP (see Appendix A.2 of 1100 [I-D.ietf-tsvwg-ecn-l4s-id]). 1102 Unlike TCP, from the outset, QUIC ECN 1103 feedback [I-D.ietf-quic-transport] has supported L4S. Therefore, 1104 if the transport is QUIC, one-ended deployment of a Prague 1105 congestion control at this stage is simple and sufficient. 1107 3. This is a two-move stage to enable L4S upstream. An L4S AQM or 1108 TCP Prague can be deployed in either order as already explained. 1109 To motivate the first of two independent moves, the deferred 1110 benefit of enabling new services after the second move has to be 1111 worth it to cover the first mover's investment risk. As 1112 explained already, the potential for new interactive services 1113 provides this motivation. An L4S AQM also improves the upstream 1114 Classic service - significantly if no other AQM has already been 1115 deployed. 1117 Note that other deployment sequences might occur. 
For instance: the 1118 upstream might be deployed first; a non-TCP protocol might be used 1119 end-to-end, e.g. QUIC, RTP; a body such as the 3GPP might require L4S 1120 to be implemented in 5G user equipment, or other random acts of 1121 kindness. 1123 6.4.3. L4S Flow but Non-ECN Bottleneck 1125 If L4S is enabled between two hosts, the L4S sender is required to 1126 coexist safely with Reno in response to any drop (see Section 4.3 of 1127 [I-D.ietf-tsvwg-ecn-l4s-id]). 1129 Unfortunately, as well as protecting Classic traffic, this rule 1130 degrades the L4S service whenever there is any loss, even if the 1131 cause is not persistent congestion at a bottleneck, e.g.: 1133 o congestion loss at other transient bottlenecks, e.g. due to bursts 1134 in shallower queues; 1136 o transmission errors, e.g. due to electrical interference; 1138 o rate policing. 1140 Three complementary approaches are in progress to address this issue, 1141 but they are all currently research: 1143 o In Prague congestion control, ignore certain losses deemed 1144 unlikely to be due to congestion (using some ideas from 1145 BBR [I-D.cardwell-iccrg-bbr-congestion-control] regarding isolated 1146 losses). This could mask any of the above types of loss while 1147 still coexisting with drop-based congestion controls. 1149 o A combination of RACK, L4S and link retransmission without 1150 resequencing could repair transmission errors without the head of 1151 line blocking delay usually associated with link-layer 1152 retransmission [UnorderedLTE], [I-D.ietf-tsvwg-ecn-l4s-id]; 1154 o Hybrid ECN/drop rate policers (see Section 8.3). 1156 L4S deployment scenarios that minimize these issues (e.g. over 1157 wireline networks) can proceed in parallel to this research, in the 1158 expectation that research success could continually widen L4S 1159 applicability. 1161 6.4.4. 
L4S Flow but Classic ECN Bottleneck 1163 Classic ECN support is starting to materialize on the Internet as an 1164 increased level of CE marking. It is hard to detect whether this is 1165 all due to the addition of support for ECN in the Linux 1166 implementation of FQ-CoDel, which is not problematic, because FQ 1167 inherently forces the throughput of each flow to be equal 1168 irrespective of its aggressiveness. However, some of this Classic 1169 ECN marking might be due to single-queue ECN deployment. This case 1170 is discussed in Section 4.3 of [I-D.ietf-tsvwg-ecn-l4s-id]. 1172 6.4.5. L4S AQM Deployment within Tunnels 1174 An L4S AQM uses the ECN field to signal congestion. So, in common 1175 with Classic ECN, if the AQM is within a tunnel or at a lower layer, 1176 correct functioning of ECN signalling requires correct propagation of 1177 the ECN field up the layers [RFC6040], 1178 [I-D.ietf-tsvwg-rfc6040update-shim], 1179 [I-D.ietf-tsvwg-ecn-encap-guidelines]. 1181 7. IANA Considerations (to be removed by RFC Editor) 1183 This specification contains no IANA considerations. 1185 8. Security Considerations 1187 8.1. Traffic Rate (Non-)Policing 1189 Because the L4S service can serve all traffic that is using the 1190 capacity of a link, it should not be necessary to rate-police access 1191 to the L4S service. In contrast, Diffserv only works if some packets 1192 get less favourable treatment than others. So Diffserv has to use 1193 traffic rate policers to limit how much traffic can be favoured. In 1194 turn, traffic policers require traffic contracts between users and 1195 networks as well as pairwise between networks. Because L4S will lack 1196 all this management complexity, it is more likely to work end-to-end. 1198 During early deployment (and perhaps always), some networks will not 1199 offer the L4S service.
In general, these networks should not need to 1200 police L4S traffic - they are required not to change the L4S 1201 identifier, merely treating the traffic as best efforts traffic, as 1202 they already treat traffic with ECT(1) today. At a bottleneck, such 1203 networks will introduce some queuing and dropping. When a scalable 1204 congestion control detects a drop, it will have to respond safely with 1205 respect to Classic congestion controls (as required in Section 4.3 of 1206 [I-D.ietf-tsvwg-ecn-l4s-id]). This will degrade the L4S service to 1207 be no better (but never worse) than Classic best efforts, whenever a 1208 non-ECN bottleneck is encountered on a path (see Section 6.4.3). 1210 In some cases, networks that solely support Classic ECN [RFC3168] in 1211 a single queue bottleneck might opt to police L4S traffic in order to 1212 protect competing Classic ECN traffic. 1214 Certain network operators might choose to restrict access to the L4S 1215 class, perhaps only to selected premium customers as a value-added 1216 service. Their packet classifier (item 2 in Figure 1) could identify 1217 such customers against some other field (e.g. source address range) 1218 as well as ECN. If only the ECN L4S identifier matched, but not the 1219 source address (say), the classifier could direct these packets (from 1220 non-premium customers) into the Classic queue. Explaining clearly 1221 how operators can use an additional local classifier (see 1222 [I-D.ietf-tsvwg-ecn-l4s-id]) is intended to remove any motivation to 1223 bleach the L4S identifier. Then at least the L4S ECN identifier will 1224 be more likely to survive end-to-end even though the service may not 1225 be supported at every hop. Such local arrangements would only 1226 require simple registered/not-registered packet classification, 1227 rather than the managed, application-specific traffic policing 1228 against customer-specific traffic contracts that Diffserv uses. 1230 8.2.
'Latency Friendliness' 1232 Like the Classic service, the L4S service relies on self-constraint - 1233 limiting rate in response to congestion. In addition, the L4S 1234 service requires self-constraint in terms of limiting latency 1235 (burstiness). It is hoped that self-interest and guidance on dynamic 1236 behaviour (especially flow start-up, which might need to be 1237 standardized) will be sufficient to prevent transports from sending 1238 excessive bursts of L4S traffic, given the application's own latency 1239 will suffer most from such behaviour. 1241 Whether burst policing becomes necessary remains to be seen. Without 1242 it, there will be potential for attacks on the low latency of the L4S 1243 service. 1245 If needed, various arrangements could be used to address this 1246 concern: 1248 Local bottleneck queue protection: A per-flow (5-tuple) queue 1249 protection function [I-D.briscoe-docsis-q-protection] has been 1250 developed for the low latency queue in DOCSIS, which has adopted 1251 the DualQ L4S architecture. It protects the low latency service 1252 from any queue-building flows that accidentally or maliciously 1253 classify themselves into the low latency queue. It is designed to 1254 score flows based solely on their contribution to queuing (not 1255 flow rate in itself). Then, if the shared low latency queue is at 1256 risk of exceeding a threshold, the function redirects enough 1257 packets of the highest scoring flow(s) into the Classic queue to 1258 preserve low latency. 1260 Distributed traffic scrubbing: Rather than policing locally at each 1261 bottleneck, it may only be necessary to address problems 1262 reactively, e.g. punitively target any deployments of new bursty 1263 malware, in a similar way to how traffic from flooding attack 1264 sources is rerouted via scrubbing facilities. 
1266 Local bottleneck per-flow scheduling: Per-flow scheduling should 1267 inherently isolate non-bursty flows from bursty ones (see Section 5.2 1268 for discussion of the merits of per-flow scheduling relative to 1269 per-flow policing). 1271 Distributed access subnet queue protection: Per-flow queue 1272 protection could be arranged for a queue structure distributed 1273 across a subnet inter-communicating using lower layer control 1274 messages (see Section 2.1.4 of [QDyn]). For instance, in a radio 1275 access network user equipment already sends regular buffer status 1276 reports to a radio network controller, which could use this 1277 information to remotely police individual flows. 1279 Distributed Congestion Exposure to Ingress Policers: The Congestion 1280 Exposure (ConEx) architecture [RFC7713] uses egress audit to 1281 motivate senders to truthfully signal path congestion in-band, 1282 where it can be used by ingress policers. An edge-to-edge variant 1283 of this architecture is also possible. 1285 Distributed Domain-edge traffic conditioning: An architecture 1286 similar to Diffserv [RFC2475] may be preferred, where traffic is 1287 proactively conditioned on entry to a domain, rather than 1288 reactively policed only if it leads to queuing once combined 1289 with other traffic at a bottleneck. 1291 Distributed core network queue protection: The policing function 1292 could be divided between per-flow mechanisms at the network 1293 ingress that characterize the burstiness of each flow into a 1294 signal carried with the traffic, and per-class mechanisms at 1295 bottlenecks that act on these signals if queuing actually occurs 1296 once the traffic converges. This would be somewhat similar to the 1297 idea behind core stateless fair queuing, which is in turn similar 1298 to [Nadas20].
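The first arrangement above (local bottleneck queue protection) can be illustrated with a toy model. This is a minimal sketch, not the actual DOCSIS algorithm of [I-D.briscoe-docsis-q-protection]; the class name, scoring formula and parameters are all illustrative assumptions. It preserves the two properties described above: flows are scored by their contribution to queuing, not by their rate, and redirection to the Classic queue only occurs when the shared low latency queue risks exceeding its threshold.

```python
from collections import defaultdict

class QueueProtection:
    """Toy per-flow queue protection for a shared low latency queue.
    Illustrative sketch only (the real DOCSIS function differs)."""

    def __init__(self, delay_threshold_us):
        self.delay_threshold_us = delay_threshold_us
        self.score = defaultdict(float)  # flow 5-tuple -> queuing contribution

    def on_enqueue(self, flow_id, pkt_bytes, queue_delay_us):
        # A packet adds to its flow's score in proportion to the queuing
        # delay it finds, so a non-queue-building flow scores near zero.
        self.score[flow_id] += pkt_bytes * queue_delay_us

        # Only if the shared queue risks exceeding its threshold are
        # packets of the highest-scoring flow redirected into Classic.
        if queue_delay_us > self.delay_threshold_us:
            worst = max(self.score, key=self.score.get)
            if worst == flow_id:
                return "classic"
        return "low-latency"
```

For instance, a smooth flow enqueuing into a short queue stays in the low latency queue, while the flow that has built the queue past the threshold has its packets redirected.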
1300 None of these possible queue protection capabilities are considered a 1301 necessary part of the L4S architecture, which works without them (in 1302 a similar way to how the Internet works without per-flow rate 1303 policing). Indeed, under normal circumstances, latency policers 1304 would not intervene, and if operators found they were not necessary 1305 they could disable them. Part of the L4S experiment will be to see 1306 whether such a function is necessary, and which arrangements are most 1307 appropriate to the size of the problem. 1309 8.3. Interaction between Rate Policing and L4S 1311 As mentioned in Section 5.2, L4S should remove the need for low 1312 latency Diffserv classes. However, those Diffserv classes that give 1313 certain applications or users priority over capacity would still be 1314 applicable in certain scenarios (e.g. corporate networks). Then, 1315 within such Diffserv classes, L4S would often be applicable to give 1316 traffic low latency and low loss as well. Within such a Diffserv 1317 class, the bandwidth available to a user or application is often 1318 limited by a rate policer. Similarly, in the default Diffserv class, 1319 rate policers are used to partition shared capacity. 1321 A classic rate policer drops any packets exceeding a set rate, 1322 usually also giving a burst allowance (variants exist where the 1323 policer re-marks non-compliant traffic to a discard-eligible Diffserv 1324 codepoint, so that it may be dropped elsewhere during contention). 1325 Whenever L4S traffic encounters one of these rate policers, it will 1326 experience drops and the source will have to fall back to a Classic 1327 congestion control, thus losing the benefits of L4S (Section 6.4.3). 1328 So, in networks that already use rate policers and plan to deploy 1329 L4S, it will be preferable to redesign these rate policers to be more 1330 friendly to the L4S service.
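One possible direction for such a redesign is for a token-bucket policer to start ECN-marking ECT-capable packets while some burst allowance remains, so that an L4S sender sees CE marks and slows down before the bucket empties and drop begins. The sketch below illustrates this idea only; the class name, parameters and marking threshold are assumptions, not a specified mechanism.

```python
class EcnTokenBucketPolicer:
    """Toy single-rate token-bucket policer, made friendlier to L4S by
    ECN-marking ECT-capable packets before the bucket is exhausted,
    instead of only dropping once it is.  Illustrative sketch only."""

    def __init__(self, rate_bps, burst_bytes, mark_fraction=0.25):
        self.rate_Bps = rate_bps / 8.0       # token refill rate, bytes/s
        self.burst = burst_bytes             # bucket depth, bytes
        self.tokens = float(burst_bytes)
        self.mark_below = burst_bytes * mark_fraction  # start marking here
        self.last_time = 0.0

    def police(self, pkt_bytes, ect_capable, now):
        # Refill tokens for the time elapsed since the last packet.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last_time) * self.rate_Bps)
        self.last_time = now
        if self.tokens < pkt_bytes:
            return "drop"                    # classic policer behaviour
        self.tokens -= pkt_bytes
        if ect_capable and self.tokens < self.mark_below:
            return "mark"                    # CE-mark before drop would occur
        return "forward"
```

A back-to-back burst of ECT packets is first forwarded, then CE-marked as the burst allowance runs low, and only dropped once it is exhausted; non-ECT packets see the classic drop-only behaviour throughout.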
1332 L4S-friendly rate policing is currently a research area (note that 1333 this is not the same as latency policing). It might be achieved by 1334 setting a threshold where ECN marking is introduced, such that it is 1335 just under the policed rate or just under the burst allowance where 1336 drop is introduced. This could be applied to various types of rate 1337 policer, e.g. [RFC2697], [RFC2698] or the 'local' (non-ConEx) variant 1338 of the ConEx congestion policer [I-D.briscoe-conex-policing]. It 1339 might also be possible to design scalable congestion controls to 1340 respond less catastrophically to loss that has not been preceded by a 1341 period of increasing delay. 1343 The design of L4S-friendly rate policers will require a separate 1344 dedicated document. For further discussion of the interaction 1345 between L4S and Diffserv, see [I-D.briscoe-tsvwg-l4s-diffserv]. 1347 8.4. ECN Integrity 1349 Receiving hosts can fool a sender into downloading faster by 1350 suppressing feedback of ECN marks (or of losses if retransmissions 1351 are not necessary or available otherwise). Various ways to protect 1352 transport feedback integrity have been developed. For instance: 1354 o The sender can test the integrity of the receiver's feedback by 1355 occasionally setting the IP-ECN field to the congestion 1356 experienced (CE) codepoint, which is normally only set by a 1357 congested link. Then the sender can test whether the receiver's 1358 feedback faithfully reports what it expects (see 2nd para of 1359 Section 20.2 of [RFC3168]). 1361 o A network can enforce a congestion response to its ECN markings 1362 (or packet losses) by auditing congestion exposure 1363 (ConEx) [RFC7713]. 1365 o The TCP authentication option (TCP-AO [RFC5925]) can be used to 1366 detect tampering with TCP congestion feedback. 1368 o The ECN Nonce [RFC3540] was proposed to detect tampering with 1369 congestion feedback, but it has been reclassified as 1370 historic [RFC8311]. 
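The first of these protections (the sender's occasional CE self-marking test from Section 20.2 of [RFC3168]) can be sketched as follows. The callback names are hypothetical and this is not a real TCP implementation; it only illustrates the audit logic.

```python
import random

def audit_feedback(send_packet, feedback_reported_ce, test_prob=0.01):
    """Sketch of a sender-side feedback integrity test: occasionally set
    CE on an outgoing packet; an honest receiver's feedback must then
    report congestion, so silence for a deliberately CE-marked packet
    reveals a receiver suppressing ECN feedback."""
    injected_ce = random.random() < test_prob
    send_packet(ce=injected_ce)
    if injected_ce and not feedback_reported_ce():
        return "feedback-suppressed"  # e.g. respond as if heavily congested
    return "ok"
```

With the test probability forced to 1, a receiver that faithfully echoes the mark passes the audit, while one that suppresses it is caught.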
1372 Appendix C.1 of [I-D.ietf-tsvwg-ecn-l4s-id] gives more details of 1373 these techniques, including their applicability and pros and cons. 1375 8.5. Privacy Considerations 1377 As discussed in Section 5.2, the L4S architecture does not preclude 1378 approaches that inspect end-to-end transport layer identifiers. For 1379 instance, it is simple to add L4S support to FQ-CoDel, which 1380 classifies by application flow ID in the network. However, the main 1381 innovation of L4S is the DualQ AQM framework that does not need to 1382 inspect any deeper than the outermost IP header, because the L4S 1383 identifier is in the IP-ECN field. 1385 Thus, the L4S architecture enables ultra-low queuing delay without 1386 _requiring_ inspection of information above the IP layer. This means 1387 that users who want to encrypt application flow identifiers, e.g. in 1388 IPsec or other encrypted VPN tunnels, don't have to sacrifice low 1389 delay [RFC8404]. 1391 Because L4S can provide low delay for a broad set of applications 1392 that choose to use it, there is no need for individual applications 1393 or classes within that broad set to be distinguishable in any way 1394 while traversing networks. This removes much of the ability to 1395 correlate between the delay requirements of traffic and other 1396 identifying features [RFC6973]. There may be some types of traffic 1397 that prefer not to use L4S, but the coarse binary categorization of 1398 traffic reveals very little that could be exploited to compromise 1399 privacy. 1401 9. Acknowledgements 1403 Thanks to Richard Scheffenegger, Wes Eddy, Karen Nielsen, David Black 1404 and Jake Holland for their useful review comments. 1406 Bob Briscoe and Koen De Schepper were part-funded by the European 1407 Community under its Seventh Framework Programme through the Reducing 1408 Internet Transport Latency (RITE) project (ICT-317700).
Bob Briscoe 1409 was also part-funded by the Research Council of Norway through the 1410 TimeIn project, partly by CableLabs and partly by the Comcast 1411 Innovation Fund. The views expressed here are solely those of the 1412 authors. 1414 10. Informative References 1416 [AFCD] Xue, L., Kumar, S., Cui, C., Kondikoppa, P., Chiu, C-H., 1417 and S-J. Park, "Towards fair and low latency next 1418 generation high speed networks: AFCD queuing", Journal of 1419 Network and Computer Applications 70:183--193, July 2016. 1421 [DCttH15] De Schepper, K., Bondarenko, O., Briscoe, B., and I. 1422 Tsang, "`Data Centre to the Home': Ultra-Low Latency for 1423 All", RITE project Technical Report, 2015, 1424 . 1426 [DOCSIS3.1] 1427 CableLabs, "MAC and Upper Layer Protocols Interface 1428 (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable 1429 Service Interface Specifications DOCSIS(R) 3.1 Version i17 1430 or later, January 2019, . 1433 [DualPI2Linux] 1434 Albisser, O., De Schepper, K., Briscoe, B., Tilmans, O., 1435 and H. Steen, "DUALPI2 - Low Latency, Low Loss and 1436 Scalable (L4S) AQM", Proc. Linux Netdev 0x13, March 2019, 1437 . 1440 [Hohlfeld14] 1441 Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P. 1442 Barford, "A QoE Perspective on Sizing Network Buffers", 1443 Proc. ACM Internet Measurement Conf (IMC'14), November 1444 2014. 1446 [I-D.briscoe-conex-policing] 1447 Briscoe, B., "Network Performance Isolation using 1448 Congestion Policing", draft-briscoe-conex-policing-01 1449 (work in progress), February 2014. 1451 [I-D.briscoe-docsis-q-protection] 1452 Briscoe, B. and G. White, "Queue Protection to Preserve 1453 Low Latency", draft-briscoe-docsis-q-protection-00 (work 1454 in progress), July 2019. 1456 [I-D.briscoe-tsvwg-l4s-diffserv] 1457 Briscoe, B., "Interactions between Low Latency, Low Loss, 1458 Scalable Throughput (L4S) and Differentiated Services", 1459 draft-briscoe-tsvwg-l4s-diffserv-02 (work in progress), 1460 November 2018.
1462 [I-D.cardwell-iccrg-bbr-congestion-control] 1463 Cardwell, N., Cheng, Y., Yeganeh, S., and V. Jacobson, 1464 "BBR Congestion Control", draft-cardwell-iccrg-bbr- 1465 congestion-control-00 (work in progress), July 2017. 1467 [I-D.ietf-avtcore-cc-feedback-message] 1468 Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP 1469 Control Protocol (RTCP) Feedback for Congestion Control", 1470 draft-ietf-avtcore-cc-feedback-message-09 (work in 1471 progress), November 2020. 1473 [I-D.ietf-quic-transport] 1474 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1475 and Secure Transport", draft-ietf-quic-transport-32 (work 1476 in progress), October 2020. 1478 [I-D.ietf-tcpm-accurate-ecn] 1479 Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More 1480 Accurate ECN Feedback in TCP", draft-ietf-tcpm-accurate- 1481 ecn-13 (work in progress), November 2020. 1483 [I-D.ietf-tcpm-generalized-ecn] 1484 Bagnulo, M. and B. Briscoe, "ECN++: Adding Explicit 1485 Congestion Notification (ECN) to TCP Control Packets", 1486 draft-ietf-tcpm-generalized-ecn-06 (work in progress), 1487 October 2020. 1489 [I-D.ietf-tsvwg-aqm-dualq-coupled] 1490 Schepper, K., Briscoe, B., and G. White, "DualQ Coupled 1491 AQMs for Low Latency, Low Loss and Scalable Throughput 1492 (L4S)", draft-ietf-tsvwg-aqm-dualq-coupled-12 (work in 1493 progress), July 2020. 1495 [I-D.ietf-tsvwg-ecn-encap-guidelines] 1496 Briscoe, B., Kaippallimalil, J., and P. Thaler, 1497 "Guidelines for Adding Congestion Notification to 1498 Protocols that Encapsulate IP", draft-ietf-tsvwg-ecn- 1499 encap-guidelines-13 (work in progress), May 2019. 1501 [I-D.ietf-tsvwg-ecn-l4s-id] 1502 Schepper, K. and B. Briscoe, "Identifying Modified 1503 Explicit Congestion Notification (ECN) Semantics for 1504 Ultra-Low Queuing Delay (L4S)", draft-ietf-tsvwg-ecn-l4s- 1505 id-11 (work in progress), November 2020. 
1507 [I-D.ietf-tsvwg-rfc6040update-shim] 1508 Briscoe, B., "Propagating Explicit Congestion Notification 1509 Across IP Tunnel Headers Separated by a Shim", draft-ietf- 1510 tsvwg-rfc6040update-shim-10 (work in progress), March 1511 2020. 1513 [I-D.morton-tsvwg-codel-approx-fair] 1514 Morton, J. and P. Heist, "Controlled Delay Approximate 1515 Fairness AQM", draft-morton-tsvwg-codel-approx-fair-01 1516 (work in progress), March 2020. 1518 [I-D.sridharan-tcpm-ctcp] 1519 Sridharan, M., Tan, K., Bansal, D., and D. Thaler, 1520 "Compound TCP: A New TCP Congestion Control for High-Speed 1521 and Long Distance Networks", draft-sridharan-tcpm-ctcp-02 1522 (work in progress), November 2008. 1524 [I-D.stewart-tsvwg-sctpecn] 1525 Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream 1526 Control Transmission Protocol (SCTP)", draft-stewart- 1527 tsvwg-sctpecn-05 (work in progress), January 2014. 1529 [I-D.white-tsvwg-nqb] 1530 White, G. and T. Fossati, "Identifying and Handling Non 1531 Queue Building Flows in a Bottleneck Link", draft-white- 1532 tsvwg-nqb-02 (work in progress), June 2019. 1534 [L4Sdemo16] 1535 Bondarenko, O., De Schepper, K., Tsang, I., and B. 1536 Briscoe, "Ultra-Low Delay for All: Live Experience, 1537 Live Analysis", Proc. MMSYS'16 pp33:1--33:4, May 2016, 1538 . 1542 [LEDBAT_AQM] 1543 Al-Saadi, R., Armitage, G., and J. But, "Characterising 1544 LEDBAT Performance Through Bottlenecks Using PIE, FQ-CoDel 1545 and FQ-PIE Active Queue Management", Proc. IEEE 42nd 1546 Conference on Local Computer Networks (LCN) 278--285, 1547 2017, . 1549 [Mathis09] 1550 Mathis, M., "Relentless Congestion Control", PFLDNeT'09, 1551 May 2009, . 1556 [McIlroy78] 1557 McIlroy, M., Pinson, E., and B. Tague, "UNIX Time-Sharing 1558 System: Foreword", The Bell System Technical Journal 1559 57:6(1902--1903), July 1978, 1560 . 1562 [Nadas20] Nadas, S., Gombos, G., Fejes, F., and S. Laki, "A 1563 Congestion Control Independent L4S Scheduler", Proc.
1564 Applied Networking Research Workshop (ANRW '20) 45--51, 1565 July 2020. 1567 [NewCC_Proc] 1568 Eggert, L., "Experimental Specification of New Congestion 1569 Control Algorithms", IETF Operational Note ion-tsv-alt-cc, 1570 July 2007. 1572 [PragueLinux] 1573 Briscoe, B., De Schepper, K., Albisser, O., Misund, J., 1574 Tilmans, O., Kuehlewind, M., and A. Ahmed, "Implementing 1575 the `TCP Prague' Requirements for Low Latency Low Loss 1576 Scalable Throughput (L4S)", Proc. Linux Netdev 0x13 , 1577 March 2019, . 1580 [QDyn] Briscoe, B., "Rapid Signalling of Queue Dynamics", 1581 bobbriscoe.net Technical Report TR-BB-2017-001; 1582 arXiv:1904.07044 [cs.NI], September 2017, 1583 . 1585 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1586 and W. Weiss, "An Architecture for Differentiated 1587 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 1588 . 1590 [RFC2697] Heinanen, J. and R. Guerin, "A Single Rate Three Color 1591 Marker", RFC 2697, DOI 10.17487/RFC2697, September 1999, 1592 . 1594 [RFC2698] Heinanen, J. and R. Guerin, "A Two Rate Three Color 1595 Marker", RFC 2698, DOI 10.17487/RFC2698, September 1999, 1596 . 1598 [RFC2884] Hadi Salim, J. and U. Ahmed, "Performance Evaluation of 1599 Explicit Congestion Notification (ECN) in IP Networks", 1600 RFC 2884, DOI 10.17487/RFC2884, July 2000, 1601 . 1603 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 1604 of Explicit Congestion Notification (ECN) to IP", 1605 RFC 3168, DOI 10.17487/RFC3168, September 2001, 1606 . 1608 [RFC3246] Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec, 1609 J., Courtney, W., Davari, S., Firoiu, V., and D. 1610 Stiliadis, "An Expedited Forwarding PHB (Per-Hop 1611 Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002, 1612 . 1614 [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust Explicit 1615 Congestion Notification (ECN) Signaling with Nonces", 1616 RFC 3540, DOI 10.17487/RFC3540, June 2003, 1617 . 
1619 [RFC3649] Floyd, S., "HighSpeed TCP for Large Congestion Windows", 1620 RFC 3649, DOI 10.17487/RFC3649, December 2003, 1621 . 1623 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1624 Congestion Control Protocol (DCCP)", RFC 4340, 1625 DOI 10.17487/RFC4340, March 2006, 1626 . 1628 [RFC4774] Floyd, S., "Specifying Alternate Semantics for the 1629 Explicit Congestion Notification (ECN) Field", BCP 124, 1630 RFC 4774, DOI 10.17487/RFC4774, November 2006, 1631 . 1633 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 1634 RFC 4960, DOI 10.17487/RFC4960, September 2007, 1635 . 1637 [RFC5033] Floyd, S. and M. Allman, "Specifying New Congestion 1638 Control Algorithms", BCP 133, RFC 5033, 1639 DOI 10.17487/RFC5033, August 2007, 1640 . 1642 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 1643 Friendly Rate Control (TFRC): Protocol Specification", 1644 RFC 5348, DOI 10.17487/RFC5348, September 2008, 1645 . 1647 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1648 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1649 . 1651 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP 1652 Authentication Option", RFC 5925, DOI 10.17487/RFC5925, 1653 June 2010, . 1655 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1656 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1657 2010, . 1659 [RFC6679] Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P., 1660 and K. Carlberg, "Explicit Congestion Notification (ECN) 1661 for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August 1662 2012, . 1664 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., 1665 Morris, J., Hansen, M., and R. Smith, "Privacy 1666 Considerations for Internet Protocols", RFC 6973, 1667 DOI 10.17487/RFC6973, July 2013, 1668 . 1670 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 1671 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 1672 DOI 10.17487/RFC7540, May 2015, 1673 . 
1675 [RFC7560] Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe, 1676 "Problem Statement and Requirements for Increased Accuracy 1677 in Explicit Congestion Notification (ECN) Feedback", 1678 RFC 7560, DOI 10.17487/RFC7560, August 2015, 1679 . 1681 [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function 1682 Chaining (SFC) Architecture", RFC 7665, 1683 DOI 10.17487/RFC7665, October 2015, 1684 . 1686 [RFC7713] Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx) 1687 Concepts, Abstract Mechanism, and Requirements", RFC 7713, 1688 DOI 10.17487/RFC7713, December 2015, 1689 . 1691 [RFC8033] Pan, R., Natarajan, P., Baker, F., and G. White, 1692 "Proportional Integral Controller Enhanced (PIE): A 1693 Lightweight Control Scheme to Address the Bufferbloat 1694 Problem", RFC 8033, DOI 10.17487/RFC8033, February 2017, 1695 . 1697 [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based 1698 on Proportional Integral Controller Enhanced (PIE) for 1699 Data-Over-Cable Service Interface Specifications (DOCSIS) 1700 Cable Modems", RFC 8034, DOI 10.17487/RFC8034, February 1701 2017, . 1703 [RFC8170] Thaler, D., Ed., "Planning for Protocol Adoption and 1704 Subsequent Transitions", RFC 8170, DOI 10.17487/RFC8170, 1705 May 2017, . 1707 [RFC8257] Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L., 1708 and G. Judd, "Data Center TCP (DCTCP): TCP Congestion 1709 Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257, 1710 October 2017, . 1712 [RFC8290] Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys, 1713 J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler 1714 and Active Queue Management Algorithm", RFC 8290, 1715 DOI 10.17487/RFC8290, January 2018, 1716 . 1718 [RFC8298] Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation 1719 for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December 1720 2017, .
1722 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1723 Notification (ECN) Experimentation", RFC 8311, 1724 DOI 10.17487/RFC8311, January 2018, 1725 . 1727 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1728 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1729 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1730 . 1732 [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of 1733 Pervasive Encryption on Operators", RFC 8404, 1734 DOI 10.17487/RFC8404, July 2018, 1735 . 1737 [RFC8511] Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst, 1738 "TCP Alternative Backoff with ECN (ABE)", RFC 8511, 1739 DOI 10.17487/RFC8511, December 2018, 1740 . 1742 [TCP-CA] Jacobson, V. and M. Karels, "Congestion Avoidance and 1743 Control", Lawrence Berkeley Labs Technical Report, 1744 November 1988, . 1746 [TCP-sub-mss-w] 1747 Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion 1748 Window for Small Round Trip Times", BT Technical Report 1749 TR-TUB8-2015-002, May 2015, 1750 . 1753 [UnorderedLTE] 1754 Austrheim, M., "Implementing immediate forwarding for 4G 1755 in a network simulator", Masters Thesis, Uni Oslo, June 1756 2019. 1758 Appendix A. Standardization items 1760 The following table includes all the items that will need to be 1761 standardized to provide a full L4S architecture. 1763 The table is too wide for the ASCII draft format, so it has been 1764 split into two, with a common column of row index numbers on the 1765 left. 1767 The columns in the second part of the table have the following 1768 meanings: 1770 WG: The IETF WG most relevant to this requirement.
The "tcpm/iccrg" 1771 combination refers to the procedure typically used for congestion 1772 control changes, where tcpm owns the approval decision, but uses 1773 the iccrg for expert review [NewCC_Proc]; 1775 TCP: Applicable to all forms of TCP congestion control; 1777 DCTCP: Applicable to Data Center TCP as currently used (in 1778 controlled environments); 1780 DCTCP bis: Applicable to any future Data Center TCP congestion 1781 control intended for controlled environments; 1783 XXX Prague: Applicable to a Scalable variant of XXX (TCP/SCTP/RMCAT) 1784 congestion control. 1786 +-----+------------------------+------------------------------------+ 1787 | Req | Requirement | Reference | 1788 | # | | | 1789 +-----+------------------------+------------------------------------+ 1790 | 0 | ARCHITECTURE | | 1791 | 1 | L4S IDENTIFIER | [I-D.ietf-tsvwg-ecn-l4s-id] | 1792 | 2 | DUAL QUEUE AQM | [I-D.ietf-tsvwg-aqm-dualq-coupled] | 1793 | 3 | Suitable ECN Feedback | [I-D.ietf-tcpm-accurate-ecn], | 1794 | | | [I-D.stewart-tsvwg-sctpecn]. 
| 1795 | | | | 1796 | | SCALABLE TRANSPORT - | | 1797 | | SAFETY ADDITIONS | | 1798 | 4-1 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1799 | | Reno/Cubic on loss | [RFC8257] | 1800 | 4-2 | Fall back to | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1801 | | Reno/Cubic if classic | | 1802 | | ECN bottleneck | | 1803 | | detected | | 1804 | | | | 1805 | 4-3 | Reduce RTT-dependence | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3 | 1806 | | | | 1807 | 4-4 | Scaling TCP's | [I-D.ietf-tsvwg-ecn-l4s-id] S.2.3, | 1808 | | Congestion Window for | [TCP-sub-mss-w] | 1809 | | Small Round Trip Times | | 1810 | | SCALABLE TRANSPORT - | | 1811 | | PERFORMANCE | | 1812 | | ENHANCEMENTS | | 1813 | 5-1 | Setting ECT in TCP | [I-D.ietf-tcpm-generalized-ecn] | 1814 | | Control Packets and | | 1815 | | Retransmissions | | 1816 | 5-2 | Faster-than-additive | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1817 | | increase | A.2.2) | 1818 | 5-3 | Faster Convergence at | [I-D.ietf-tsvwg-ecn-l4s-id] (Appx | 1819 | | Flow Start | A.2.2) | 1820 +-----+------------------------+------------------------------------+ 1821 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1822 | # | WG | TCP | DCTCP | DCTCP-bis | TCP | SCTP | RMCAT | 1823 | | | | | | Prague | Prague | Prague | 1824 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1825 | 0 | tsvwg | Y | Y | Y | Y | Y | Y | 1826 | 1 | tsvwg | | | Y | Y | Y | Y | 1827 | 2 | tsvwg | n/a | n/a | n/a | n/a | n/a | n/a | 1828 | | | | | | | | | 1829 | | | | | | | | | 1830 | | | | | | | | | 1831 | 3 | tcpm | Y | Y | Y | Y | n/a | n/a | 1832 | | | | | | | | | 1833 | 4-1 | tcpm | | Y | Y | Y | Y | Y | 1834 | | | | | | | | | 1835 | 4-2 | tcpm/ | | | | Y | Y | ? | 1836 | | iccrg? | | | | | | | 1837 | | | | | | | | | 1838 | | | | | | | | | 1839 | | | | | | | | | 1840 | | | | | | | | | 1841 | 4-3 | tcpm/ | | | Y | Y | Y | ? | 1842 | | iccrg? | | | | | | | 1843 | 4-4 | tcpm | Y | Y | Y | Y | Y | ? 
| 1844 | | | | | | | | | 1845 | | | | | | | | | 1846 | 5-1 | tcpm | Y | Y | Y | Y | n/a | n/a | 1847 | | | | | | | | | 1848 | 5-2 | tcpm/ | | | Y | Y | Y | ? | 1849 | | iccrg? | | | | | | | 1850 | 5-3 | tcpm/ | | | Y | Y | Y | ? | 1851 | | iccrg? | | | | | | | 1852 +-----+--------+-----+-------+-----------+--------+--------+--------+ 1854 Authors' Addresses 1856 Bob Briscoe (editor) 1857 Independent 1858 UK 1860 Email: ietf@bobbriscoe.net 1861 URI: http://bobbriscoe.net/ 1862 Koen De Schepper 1863 Nokia Bell Labs 1864 Antwerp 1865 Belgium 1867 Email: koen.de_schepper@nokia.com 1868 URI: https://www.bell-labs.com/usr/koen.de_schepper 1870 Marcelo Bagnulo 1871 Universidad Carlos III de Madrid 1872 Av. Universidad 30 1873 Leganes, Madrid 28911 1874 Spain 1876 Phone: 34 91 6249500 1877 Email: marcelo@it.uc3m.es 1878 URI: http://www.it.uc3m.es 1880 Greg White 1881 CableLabs 1882 US 1884 Email: G.White@CableLabs.com