idnits 2.17.1

draft-ietf-tsvwg-l4sops-00.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

No issues found here.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the IETF Trust and authors Copyright Line does not match the current year

-- The document date (5 May 2021) is 1084 days in the past.  Is this intentional?

Checking references for intended status: Informational
----------------------------------------------------------------------------

== Missing Reference: 'Reno' is mentioned on line 197, but not defined

== Outdated reference: A later version (-25) exists of draft-ietf-tsvwg-aqm-dualq-coupled-13

== Outdated reference: A later version (-29) exists of draft-ietf-tsvwg-ecn-l4s-id-12

== Outdated reference: A later version (-20) exists of draft-ietf-tsvwg-l4s-arch-08

Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

Transport Area Working Group                                 G. White, Ed.
Internet-Draft                                                   CableLabs
Intended status: Informational                                  5 May 2021
Expires: 6 November 2021

        Operational Guidance for Deployment of L4S in the Internet
                        draft-ietf-tsvwg-l4sops-00

Abstract

This document is intended to provide guidance in order to ensure successful deployment of Low Latency Low Loss Scalable throughput (L4S) in the Internet.
Other L4S documents provide guidance for running an L4S experiment, but this document is focused solely on potential interactions between L4S flows and flows using the original ('Classic') ECN over a Classic ECN bottleneck link. The document discusses the potential outcomes of these interactions, describes mechanisms to detect the presence of Classic ECN bottlenecks, and identifies opportunities to prevent and/or detect and resolve fairness problems in such networks. This guidance is aimed at operators of end-systems, operators of networks, and researchers.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 6 November 2021.

Copyright Notice

Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents

1.  Introduction
2.  Per-Flow Fairness
3.  Detection of Classic ECN Bottlenecks
    3.1.  Recent Studies
    3.2.  Future Experiments
4.  Operator of an L4S host
    4.1.  Edge Servers
    4.2.  Other hosts
5.  Operator of a Network Employing RFC3168 FIFO Bottlenecks
    5.1.  Configure AQM to treat ECT(1) as NotECT
    5.2.  ECT(1) Tunnel Bypass
    5.3.  Configure Non-Coupled Dual Queue
    5.4.  WRED with ECT(1) Differentiation
    5.5.  Disable RFC3168 Support
    5.6.  Re-mark ECT(1) to NotECT Prior to AQM
6.  Operator of a Network Employing RFC3168 FQ Bottlenecks
7.  Conclusion of the L4S experiment
    7.1.  Successful termination of the L4S experiment
    7.2.  Unsuccessful termination of the L4S experiment
8.  Contributors
9.  IANA Considerations
10. Security Considerations
11. Informative References
Author's Address

1.  Introduction

Low-latency, low-loss, scalable throughput (L4S) [I-D.ietf-tsvwg-l4s-arch] traffic is designed to provide lower queuing delay than conventional traffic via a new network service based on a modified Explicit Congestion Notification (ECN) response from the network.
L4S traffic is identified by the ECT(1) codepoint, and network bottlenecks that support L4S should congestion-mark ECT(1) packets to enable L4S congestion feedback. However, L4S traffic is also expected to coexist well with classic congestion controlled traffic even if the bottleneck queue does not support L4S. This includes paths where the bottleneck link utilizes packet drops in response to congestion (either due to buffer overrun or active queue management), as well as paths that implement a 'flow-queuing' scheduler such as fq_codel [RFC8290]. A potential area of poor interoperability lies in network bottlenecks employing a shared queue that implements an Active Queue Management (AQM) algorithm that provides Explicit Congestion Notification signaling according to [RFC3168]. RFC3168 has been updated (via [RFC8311]) to reserve ECT(1) for experimental use only (also see [IANA-ECN]), and its use for L4S has been specified in [I-D.ietf-tsvwg-ecn-l4s-id]. However, any deployed RFC3168 AQMs might not be updated, and RFC8311 still prefers that routers not involved in L4S experimentation treat ECT(1) and ECT(0) as equivalent. It has been demonstrated ([Detection]) that when a set of long-running flows comprising both classic congestion controlled flows and L4S-compliant congestion controlled flows compete for bandwidth in such a legacy shared RFC3168 queue, the classic congestion controlled flows may achieve lower throughput than they would have if all of the flows had been classic congestion controlled flows. This 'unfairness' between the two classes is more pronounced on longer RTT paths (e.g. 50ms and above) and/or at higher link rates (e.g. 50 Mbps and above). The lower the capacity per flow, the less pronounced the problem becomes. Thus the imbalance is most significant when the slowest flow rate is still high in absolute terms.
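To illustrate why the imbalance grows with link rate and RTT, a rough sawtooth model of a classic sender can be sketched (an illustrative sketch under simplifying assumptions; the model and its constants are not taken from the draft's measurements):

```python
# Rough sawtooth model of a classic (Reno-style) sender: one CE mark per
# cycle, window halving on each mark, growth of 1 packet per RTT.  This
# is an illustrative sketch with assumed constants, not measurement data.
MSS_BYTES = 1500

def classic_rtts_per_mark(rate_mbps, rtt_s):
    """Approximate number of round trips between CE marks for one flow."""
    avg_window = rate_mbps * 1e6 * rtt_s / (8 * MSS_BYTES)  # packets
    peak_window = avg_window / 0.75   # sawtooth average is 0.75 * peak
    return peak_window / 2            # one cycle spans peak/2 round trips

# At 50 Mbps x 50 ms this model yields a CE mark only every ~140 round
# trips for a classic flow, versus roughly 2 marks per round trip expected
# by an L4S flow; at 5 Mbps the classic interval shrinks to ~14 round
# trips, so the mismatch fades as capacity per flow drops.
```

The larger the bandwidth-delay product, the rarer the marks a classic sender expects, and hence the larger the gap between the two congestion responses when both receive the same marking rate.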
The root cause of the unfairness is that the L4S architecture redefines the congestion signal (CE mark) and congestion response in the case of packets marked ECT(1) (used by L4S senders), whereas an RFC3168 queue does not differentiate between packets marked ECT(0) (used by classic senders) and those marked ECT(1), and provides identical CE marks to both types. The result is that the two classes respond differently to the CE congestion signal. The classic senders expect that CE marks are sent very rarely (e.g. approximately 1 CE mark every 200 round trips on a 50 Mbps x 50ms path) while the L4S senders expect very frequent CE marking (e.g. approximately 2 CE marks per round trip). As a consequence, the classic senders respond to the CE marks provided by the bottleneck by yielding capacity to the L4S flows. The resulting rate imbalance can be demonstrated, and could be a cause of concern in some cases.

This concern primarily relates to single-queue (FIFO) bottleneck links that implement RFC3168 ECN, but the situation can also potentially occur with per-flow queuing, e.g. fq_codel [RFC8290], when flow isolation is imperfect due to hash collisions or VPN tunnels.

While the above-mentioned unfairness has been demonstrated in laboratory testing, it has not been observed in operational networks, in part because members of the Transport Area Working Group are not aware of any deployments of single-queue Classic ECN bottlenecks in the Internet.

This issue was considered in November 2015 (and reaffirmed in April 2020) when the WG decided on the identifier to use for L4S, as recorded in Appendix B.1 of [I-D.ietf-tsvwg-ecn-l4s-id]. It was recognized that compromises would have to be made because IP header space is extremely limited.
A number of alternative codepoint schemes were compared for their ability to traverse most Internet paths, to work over tunnels, to work at lower layers, to work with TCP, etc. It was decided to progress on the basis that robust performance in the presence of these single-queue RFC3168 bottlenecks is not the most critical issue, since it was believed that they are rare. Nonetheless, there is the possibility that such deployments exist, and that more could be deployed or enabled in the future; hence there is an interest in providing guidance to ensure that measures can be taken to address the potential issues, should they arise in practice.

TODO: further discussion on severity and who might be impacted?

2.  Per-Flow Fairness

There are a number of factors that influence the relative rates achieved by a set of users or a set of applications sharing a queue in a bottleneck link. Notably, the response that each application has to congestion signals (whether loss or explicit signaling) can play a large role in determining whether the applications share the bandwidth in an equitable manner. In the Internet, ISPs typically control capacity sharing between their customers using a scheduler at the access bottleneck rather than relying on the congestion responses of end-systems, so in that context this question primarily concerns capacity sharing between the applications used by one customer site. Nonetheless, there are many networks on the Internet where capacity sharing relies, at least to some extent, on congestion control in the end-systems. The traditional norm for congestion response has been that it is handled on a per-connection basis, and that (all else being equal) it results in each connection in the bottleneck achieving a data rate inversely proportional to the average RTT of the connection.
The end result (in the case of steady-state behavior of a set of like connections) is that each user or application achieves a data rate proportional to N/RTT, where N is the number of simultaneous connections that the user or application creates, and RTT is the harmonic mean of the average round-trip times for those connections. Thus, users or applications that create a larger number of connections and/or that have a lower RTT achieve a larger share of the bottleneck link rate than others.

While this may not be considered fair by many, it nonetheless has been the typical starting point for discussions around fairness. In fact, it has been common when evaluating new congestion responses to set aside N and RTT as variables in the equation, and simply compare per-flow rates between flows with the same RTT. For example, [RFC5348] defines the congestion response for a flow to be '"reasonably fair" if its sending rate is generally within a factor of two of the sending rate of a [Reno] TCP flow under the same conditions.' Given that RTTs can vary by roughly two orders of magnitude and flow counts can vary by at least an order of magnitude between applications, it seems that the accepted definition of reasonable fairness leaves quite a bit of room for different levels of performance between users or applications, and so perhaps isn't the gold standard, but is rather a metric that is used because of its convenience.

In practice, the effect of this RTT dependence has historically been muted by the fact that many networks were deployed with very large ("bloated") drop-tail buffers that would introduce queuing delays well in excess of the base RTT of the flows utilizing the link, thus equalizing (to some degree) the effective RTTs of those flows.
Recently, as network equipment suppliers and operators have worked to improve the latency performance of the network by the use of smaller buffers and/or AQM algorithms, this has had the side-effect of uncovering the inherent RTT bias in classic congestion control algorithms.

The L4S architecture aims to significantly improve this situation by requiring senders to adopt a congestion response that eliminates RTT bias as much as possible (see [I-D.ietf-tsvwg-ecn-l4s-id]). As a result, L4S promotes a level of per-flow fairness beyond what is ordinarily considered for classic senders, the RFC3168 issue notwithstanding.

It is also worth noting that the congestion control algorithms currently deployed on the Internet tend toward (RTT-weighted) fairness only over long timescales. For example, the CUBIC algorithm can take minutes to converge to fairness when a new flow joins an existing flow on a link [Cubic]. Since the vast majority of TCP connections don't last for minutes, it is unclear to what degree per-flow, same-RTT fairness, even when demonstrated in the lab, translates to the real world.

So, in real networks, where per-application, per-end-host or per-customer fairness may be more important than long-term, same-RTT, per-flow fairness, it may not be that instructive to focus on the latter as a necessary end goal.

Nonetheless, situations in which the presence of an L4S flow has the potential to cause harm [Harm] to classic flows need to be understood. Most importantly, if there are situations in which the introduction of L4S traffic would significantly degrade both the absolute and relative performance of classic traffic, i.e. to the point that classic traffic would be considered starved while L4S traffic was not, these situations need to be understood and either remedied or avoided.
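For concreteness, the traditional N/RTT sharing model discussed earlier in this section can be sketched numerically (an illustrative model with arbitrary example values, not taken from the draft):

```python
# Illustrative sketch (assumed model, not from the draft): under the
# traditional per-connection norm, each connection's rate is inversely
# proportional to its RTT, so a user's aggregate share is proportional
# to N / harmonic_mean(RTTs), as described in Section 2.
from statistics import harmonic_mean

def user_rates(link_mbps, users):
    """users maps a user name to a list of per-connection RTTs (seconds)."""
    # Weight per user: N / harmonic-mean RTT (== sum of 1/RTT per connection).
    weights = {u: len(rtts) / harmonic_mean(rtts) for u, rtts in users.items()}
    total = sum(weights.values())
    return {u: link_mbps * w / total for u, w in weights.items()}

# User A: 4 connections at 50 ms; user B: 1 connection at 100 ms.
rates = user_rates(100.0, {"A": [0.050] * 4, "B": [0.100]})
# The weights are 80 vs 10, so A receives 8x B's rate on the shared link.
```

More connections and lower RTTs compound, which is why the "reasonably fair, same-RTT" comparison discussed above leaves so much room for divergence between users.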
Aligned with this context, the guidance provided in this document is aimed not at monitoring the relative performance of L4S senders compared against classic senders on a per-flow basis, but rather at identifying instances where RFC3168 bottlenecks are deployed, so that operators of L4S senders can have the opportunity to assess whether any actions need to be taken. Additionally, this document provides guidance for network operators around configuring any RFC3168 bottlenecks to minimize the potential for negative interactions between L4S and classic senders.

3.  Detection of Classic ECN Bottlenecks

The IETF encourages researchers, end system deployers and network operators to conduct experiments to identify to what degree RFC3168 bottlenecks exist in networks. These types of measurement campaigns, even if each is conducted over a limited set of paths, could be useful to further understand the scope of any potential issues, to guide end system deployers on where to examine performance more closely (or possibly delay L4S deployment), and to help network operators identify nodes where remediation may be necessary to provide the best performance.

3.1.  Recent Studies

A small number of recent studies have attempted to gauge the level of RFC3168 deployment in the Internet.

In 2020, Akamai conducted a study (https://mailarchive.ietf.org/arch/msg/tsvwg/2tbRHphJ8K_CE6is9n7iQy-VAZM/) of "downstream" (server to client) CE marking broken out by ASN on two separate days, one in late March, the other in mid July [Akamai]. They concluded that the prevalence of CE-marking was low across the ~800 ASNs observed, but that it was growing, and that they could not determine whether the CE marking was due to a single queue or FQ.
There were a small handful (5-7) of ASNs showing evidence of CE-marking across more than 10% of their client IPs, and the global baseline was CE-marking across 0.3% of IPs.

In 2017, Apple reported [TCPECN] on their observations of ECN marking by networks, broken out by country. They reported four countries that exceeded the global baseline seen by Akamai, but one of these (Argentine Republic) was later discovered to be due to a bug, leaving three countries: China (1% of paths), Mexico (3.2% of paths) and France (6% of paths). The percentage in France appears consistent with reports (https://mailarchive.ietf.org/arch/msg/tsvwg/UyvpwUiNw0obd_EylBBV7kDRIHs/) that fq_codel has been implemented in DSL home routers deployed by Free.fr.

In December 2020 - January 2021, Pete Heist worked with a small cooperative WISP in the Czech Republic to collect data on CE-marking [I-D.heist-tsvwg-ecn-deployment-observations]. This ISP had deployed RFC3168 fq_codel equipment in some of their subnets, but in other subnets there were 33 IPs where CE-marking was possibly observed, corresponding to approximately 10% of paths, significantly greater than the baseline reported by Akamai. It was agreed (https://mailarchive.ietf.org/arch/msg/tsvwg/Rj7GylByZuFa3_LTCMvEfb-CYpw/) that these were likely to be due to fq_codel implementations in home routers deployed by members of the cooperative.

The interpretation of these studies seems to be that all of the known RFC3168 deployments are fq_codel, that the majority of the currently unknown deployments are likely to be fq_codel, and that there may be a small number of networks where CE-marking is prevalent (and thus likely ISP-managed) where it is currently unknown whether the source is a FIFO or an FQ system.

Other studies (e.g.
[EnablingECN], [ECNreadiness], [MeasuringECN]) have examined ECN traversal, but have not reported data on the prevalence of CE-marking by networks.

3.2.  Future Experiments

The design of future experiments should consider not only the detection of RFC3168 ECN marking, but also the determination of whether the bottleneck AQM is a single queue (FIFO) or a flow-queuing (FQ) system. It is believed that the vast majority, if not all, of the RFC3168 AQMs in use at bottleneck links are flow-queuing systems (e.g. fq_codel [RFC8290] or [COBALT]). When flow isolation is successful, the FQ scheduling of such queues isolates classic congestion control traffic from L4S traffic, and thus eliminates the potential for unfairness. But these systems are known to sometimes result in imperfect isolation, either due to hash collisions (see Section 5.3 (https://datatracker.ietf.org/doc/html/rfc8290#section-5.3) of [RFC8290]) or because of VPN tunneling (see Section 6.2 (https://datatracker.ietf.org/doc/html/rfc8290#section-6.2) of [RFC8290]). It is believed that the majority of FQ deployments in bottleneck links today (e.g. [Cake]) employ hashing algorithms that virtually eliminate the possibility of collisions, making this a non-issue for those deployments. But VPN tunnels remain an issue for FQ deployments, and the introduction of L4S traffic raises the possibility that tunnels containing a mix of classic and L4S traffic would exist, in which case FQ implementations that have not been updated to be L4S-aware could exhibit unfairness properties similar to those of single-queue AQMs. Until such queues are upgraded to support L4S (see Section 6) or to treat ECT(1) as not-ECT traffic, end-host mitigations such as separating L4S and Classic traffic into distinct VPN tunnels could be employed.
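As an aside on the hash-collision point above, the likelihood that some pair of concurrent flows shares a queue can be estimated with a simple birthday-problem sketch (illustrative only; it assumes ideal uniform hashing and fq_codel's default of 1024 queues per [RFC8290], and is not an analysis from this draft):

```python
# Birthday-problem estimate: probability that at least two of
# `num_flows` concurrent flows hash into the same queue, assuming a
# uniform hash over `num_queues` queues (fq_codel's default is 1024).
def collision_probability(num_flows, num_queues=1024):
    p_no_collision = 1.0
    for i in range(num_flows):
        p_no_collision *= (num_queues - i) / num_queues
    return 1.0 - p_no_collision

# Two flows collide with probability 1/1024, but with ~38 concurrent
# flows the chance of at least one collision already approaches 50%.
```

This is why set-associative hashing schemes such as the one used by Cake, mentioned above, matter in practice: plain uniform hashing makes occasional collisions likely even at modest flow counts.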
[Detection] contains recommendations on some of the mechanisms that can be used to detect RFC3168 bottlenecks. In particular, Section 4 of [Detection] outlines an approach for out-of-band detection of RFC3168 bottlenecks.

4.  Operator of an L4S host

From a host's perspective, support for L4S only involves the sender via ECT(1) marking and L4S-compatible congestion control. The receiver is involved in ECN feedback but can generally be agnostic to whether ECN is being used for L4S [I-D.ietf-tsvwg-l4s-arch]. Between these two entities, it is primarily incumbent upon the sender to evaluate the potential for the presence of RFC3168 FIFO bottlenecks and make decisions about whether or not to use L4S congestion control. While it is possible for a receiver to disable L4S functionality by not negotiating ECN, a general purpose receiver is not expected to perform any testing or monitoring for RFC3168, and is also not expected to invoke any active response in the case that such a bottleneck exists.

Prior to deployment of any new technology, it is commonplace for the parties involved in the deployment to validate the performance of the new technology via lab testing, limited field testing, large-scale field testing, etc. The same is expected for deployers of L4S technology. As part of that validation, it is recommended that deployers consider the issue of RFC3168 FIFO bottlenecks and conduct experiments as described in the previous section, or otherwise assess the impact that the L4S technology will have in the networks in which it is to be deployed, and take action as described further in this section.

If pre-deployment testing raises concerns about issues with RFC3168 bottlenecks, the actions taken may depend on the server type:

*  General purpose servers (e.g. web servers)

   -  Out-of-band active testing could be performed by the server.
      For example, a JavaScript application could run simultaneous downloads (i.e. with and without L4S) during page reading time in order to survey for the presence of RFC3168 FIFO bottlenecks on paths to users (e.g. as described in Section 4 of [Detection]).

   -  In-band testing could be built into the transport protocol implementation at the sender in order to perform detection (see Section 5 of [Detection], though note that this mechanism does not differentiate between FIFO and FQ).

   -  Discontinuing use of L4S based on the detection of RFC3168 FIFO bottlenecks is likely not needed for short transactional transfers (e.g. sub 10 seconds), since these are unlikely to reach the steady-state conditions where unfairness has been observed.

   -  For longer file transfers, it may be possible to fall back to Classic behavior in real time (i.e. when doing in-band testing), or to cache those destinations where RFC3168 has been detected and disable L4S for subsequent long file transfers to those destinations.

*  Specialized servers handling long-running sessions (e.g. cloud gaming)

   -  Out-of-band active testing could be performed at each session startup.

   -  Out-of-band active testing could be integrated into a "pre-validation" of the service, done when the user signs up, and periodically thereafter.

   -  In-band detection as described in [Detection] could be performed during the session.

TODO: discussion of risk of incorrectly classifying a path

In addition, the responsibilities of and actions taken by a sender may depend on the environment in which it is deployed. The following sub-sections discuss two scenarios: senders serving a limited known target audience and those that serve an unknown target audience.

4.1.
Edge Servers

Some hosts (such as CDN leaf nodes and servers internal to an ISP) are deployed in environments in which they serve content to a constrained set of networks or clients. The operator of such hosts may be able to determine whether there is the possibility of [RFC3168] FIFO bottlenecks being present, and utilize this information to make decisions on selectively deploying L4S and/or disabling it (e.g. bleaching ECN). Furthermore, such an operator may be able to determine the likelihood of an L4S bottleneck being present, and use this information as well.

For example, if a particular network is known to have deployed legacy [RFC3168] FIFO bottlenecks, usage of L4S for long capacity-seeking file transfers on that network could be delayed until those bottlenecks can be upgraded to mitigate any potential issues, as discussed in the next section.

Prior to deploying L4S on edge servers, a server operator should:

*  Consult with network operators on the presence of legacy [RFC3168] FIFO bottlenecks

*  Consult with network operators on the presence of L4S bottlenecks

*  Perform pre-deployment testing per network

If a particular network offers connectivity to other networks (e.g. in the case of an ISP offering service to their customers' networks), the lack of RFC3168 FIFO bottleneck deployment in the ISP network can't be taken as evidence that RFC3168 FIFO bottlenecks don't exist end-to-end (because one may have been deployed by the end-user network). In these cases, deployment of L4S will need to take appropriate steps to detect the presence of such bottlenecks. At present, it is believed that the vast majority of RFC3168 bottlenecks in end-user networks are implementations that utilize fq_codel or Cake, where the unfairness problem is less likely to be a concern.
While this doesn't completely eliminate the possibility that a legacy [RFC3168] FIFO bottleneck could exist, it nonetheless provides useful information that can be utilized in decision making around the potential risk of any unfairness being experienced by end users.

4.2.  Other hosts

Hosts that are deployed in locations that serve a wide variety of networks face a more difficult prospect in terms of handling the potential presence of RFC3168 FIFO bottlenecks. Nonetheless, the steps listed in the earlier section (based on server type) can be taken to minimize the risk of unfairness.

The interpretation of studies on ECN usage and their deployment context (see Section 3.1) has so far concluded that RFC3168 FIFO bottlenecks are likely to be rare, and so detections using these techniques may also prove to be rare. Therefore, it may be possible for a host to cache a list of end host IP addresses where an RFC3168 bottleneck has been detected. Entries in such a cache would need to age out after a period of time to account for IP address changes, path changes, equipment upgrades, etc. [TODO: more info on ways to cache/maintain such a list]

It has been suggested that a public block-list of domains that implement RFC3168 FIFO bottlenecks could be maintained. There are a number of significant issues that would seem to make this idea infeasible, not the least of which is the fact that the presence of RFC3168 FIFO bottlenecks or L4S bottlenecks is not a property of a domain; it is a property of a link, and therefore of the particular current path between two endpoints.

It has also been suggested that a public allow-list of domains that are participating in the L4S experiment could be maintained.
This approach would not be useful, given that the presence of an L4S domain on the path does not imply the absence of RFC3168 AQMs upstream or downstream of that domain. Also, the approach cannot cater for domains with a mix of L4S and RFC3168 AQMs.

5.  Operator of a Network Employing RFC3168 FIFO Bottlenecks

While it is, of course, preferred for networks to deploy L4S-capable high fidelity congestion signaling, and while it would be preferable for L4S senders to detect problems themselves, a network operator who has deployed equipment in a likely bottleneck link location (i.e. a link that is expected to be fully saturated) that is configured with a legacy [RFC3168] FIFO AQM can take certain steps in order to improve rate fairness between classic traffic and L4S traffic, and thus enable L4S to be deployed in a greater number of paths.

Some of the options listed in this section may not be feasible in all networking equipment.

5.1.  Configure AQM to treat ECT(1) as NotECT

If equipment is configurable in such a way as to only supply CE marks to ECT(0) packets, and to treat ECT(1) packets identically to NotECT, or is upgradable to support this capability, doing so will eliminate the risk of unfairness.

5.2.  ECT(1) Tunnel Bypass

Tunnel ECT(1) traffic through the RFC3168 bottleneck with the outer header indicating Not-ECT, by using either an ECN tunnel ingress in Compatibility Mode [RFC6040] or a Limited Functionality ECN tunnel [RFC3168].

Two variants exist for this approach:

1.  per-domain: tunnel ECT(1) packets to the domain edge towards the destination

2.  per-destination: tunnel ECT(1) packets to the destination

5.3.  Configure Non-Coupled Dual Queue

Equipment supporting [RFC3168] may be configurable to enable two parallel queues for the same traffic class, with classification done based on the ECN field.
Option 1:

*  Configure 2 queues, both with ECN; 50:50 WRR scheduler

   -  Queue #1: ECT(1) & CE packets - shallow immediate AQM target

   -  Queue #2: ECT(0) & NotECT packets - Classic AQM target

*  Outcome in the case of n L4S flows and m long-running Classic flows

   -  if m and n are both non-zero, each L4S flow gets 1/(2n) of the capacity and each Classic flow gets 1/(2m); otherwise the flows get 1/n or 1/m

   -  each flow's rate is never less than 1/2 of the rate it would have achieved if all flows had been Classic

This option would allow L4S flows to achieve low latency, low loss and scalable throughput, but would sacrifice the more precise flow balance offered by [I-D.ietf-tsvwg-aqm-dualq-coupled]. This option would be expected to result in some reordering of previously CE-marked packets sent by Classic ECN senders, a trait shared with [I-D.ietf-tsvwg-aqm-dualq-coupled]. As discussed in [I-D.ietf-tsvwg-ecn-l4s-id], this reordering would be either zero risk or very low risk.

Option 2:

*  Configure 2 queues, both with AQM; 50:50 WRR scheduler

   -  Queue #1: ECT(1) & NotECT packets - ECN disabled

   -  Queue #2: ECT(0) & CE packets - ECN enabled

*  Outcome

   -  ECT(1) is treated as NotECT

   -  Flow balance for the 2 queues is the same as in Option 1

This option would not allow L4S flows to achieve low latency, low loss and scalable throughput in this bottleneck link. As a result, it is the less preferred option.

5.4.  WRED with ECT(1) Differentiation

This configuration is similar to Option 2 in the previous section, but uses a single queue with WRED functionality.

*  Configure the queue with two WRED classes

*  Class #1: ECT(1) & NotECT packets - ECN disabled

*  Class #2: ECT(0) & CE packets - ECN enabled

5.5.  Disable RFC3168 Support

Disabling CE-marking of both ECT(0) traffic and ECT(1) traffic in an [RFC3168] AQM eliminates the unfairness issue.
A downside to this approach is that Classic senders will no longer
get the benefits of Explicit Congestion Notification at this
bottleneck link.  This alternative is only mentioned in case there is
no other way to reconfigure an RFC3168 AQM.

5.6.  Re-mark ECT(1) to NotECT Prior to AQM

Re-marking ECT(1) packets as NotECT (i.e. bleaching ECT(1)) ensures
that they are treated identically to Classic NotECT senders.
However, this action is not recommended because a) it would also
prevent downstream L4S bottlenecks from providing high fidelity
congestion signals; and b) it could lead to problems with future
experiments that use ECT(1) in alternative ways to L4S.  This
alternative is only mentioned in case there is no other way to
reconfigure an RFC3168 AQM.

Note that the CE codepoint must never be bleached, otherwise it would
black-hole congestion indications.

6.  Operator of a Network Employing RFC3168 FQ Bottlenecks

A network operator who has deployed flow-queuing systems that
implement RFC3168 (e.g. fq_codel [RFC8290] or CAKE [Cake]) at
network bottlenecks will likely see fewer potential issues when L4S
traffic is present on their network as compared to operators of
RFC3168 FIFOs.  As discussed in previous sections, the flow queuing
mechanism will typically isolate L4S flows and Classic flows into
separate queues, and the scheduler will then enforce per-flow
fairness.  As a result, the potential fairness issues between Classic
and L4S traffic that can occur in FIFOs will typically not occur in
FQ systems.  That said, FQ systems commonly treat a tunneled traffic
aggregate as a single flow, so a tunneled aggregate that contains a
mix of Classic and L4S traffic will utilize a single queue, and could
experience the same fairness issue as has been described for RFC3168
FIFOs.
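The single-queue effect for tunnels can be illustrated with a toy
model of an FQ classifier that hashes the outer IP/transport 5-tuple
to select a queue.  The tuple layout, the addresses, and the use of
Python's built-in hash() are simplifying assumptions, not the actual
fq_codel or CAKE internals:

```python
# Toy flow-queuing classifier: pick a queue by hashing the *outer*
# 5-tuple (illustrative; real FQ systems use their own hash).
NUM_QUEUES = 1024

def queue_for(five_tuple):
    return hash(five_tuple) % NUM_QUEUES

# One tunnel carrying several distinct inner flows (hypothetical
# addresses).  Every inner packet presents the same outer header.
outer = ("198.51.100.1", "203.0.113.9", 4500, 4500, "UDP")
inner_flows = [("10.0.0.1", "10.0.0.2", 49152 + i, 443, "TCP")
               for i in range(4)]

# The classifier only sees the outer header, so the whole aggregate
# collapses into a single queue:
queues_used = {queue_for(outer) for _ in inner_flows}
```

A Classic flow and an L4S flow inside such a tunnel therefore share
one queue under one AQM, recreating the FIFO situation described
earlier.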
This unfairness is compounded by the fact that the FQ scheduler will
already be causing unfairness to flows within the tunnel relative to
flows that are not tunneled.  Additionally, many of the deployed
RFC3168 FQ systems currently implement an AQM algorithm (either CoDel
[RFC8290] or COBALT [COBALT]) that is designed for Classic traffic
and reacts sluggishly to L4S (or unresponsive) traffic, with the
result that L4S senders could in some cases see worse latency
performance than Classic senders.

While the potential unfairness is arguably less impactful in the case
of RFC3168 FQ bottlenecks, it is believed that RFC3168 FQ bottlenecks
are currently more common than RFC3168 FIFO bottlenecks.  The most
common deployments of RFC3168 FQ bottlenecks are in home routers
running OpenWRT firmware where the user has turned the feature on.

As is the case with RFC3168 FIFOs, the preferred remedy for a network
operator that wishes to enable the best possible performance with
regard to L4S is to update RFC3168 FQ bottlenecks to be L4S-aware.
In cases where that is infeasible, several of the remedies described
in the previous section can be used to reduce or eliminate these
issues:

*  Configure AQM to treat ECT(1) as NotECT

*  ECT(1) Tunnel Bypass

*  Disable RFC3168 Support

*  Re-mark ECT(1) to NotECT Prior to AQM

7.  Conclusion of the L4S experiment

This section gives guidance on how L4S-deploying networks and
endpoints should respond to either of the two possible outcomes of
the IETF-supported L4S experiment.

7.1.  Successful termination of the L4S experiment

If the L4S experiment is deemed successful, the IETF would be
expected to move the L4S specifications to the standards track.
Networks would then be encouraged to continue/begin deploying
L4S-aware nodes and to replace all non-L4S-aware RFC3168 AQMs already
deployed as far as feasible, or at least to restrict RFC3168 AQMs to
interpreting ECT(1) as equal to NotECT.  Networks that participated
in the experiment would be expected to track the evolution of the L4S
standards and adapt their implementations accordingly (e.g. if, as
part of switching from experimental to standards track, changes in
the L4S RFCs become necessary).

7.2.  Unsuccessful termination of the L4S experiment

If the L4S experiment is deemed unsuccessful due to lack of
deployment of compliant end-systems or AQMs, it might need to be
terminated: any L4S network nodes should then be un-deployed and the
ECT(1) codepoint usage should be released/recycled as quickly as
possible, recognizing that this process may take some time.  To
facilitate this potential outcome, [I-D.ietf-tsvwg-ecn-l4s-id]
requires L4S hosts to be configurable to revert to non-L4S congestion
control, and networks to be configurable to treat ECT(1) the same as
ECT(0).

8.  Contributors

Thanks to Bob Briscoe, Jake Holland, Koen De Schepper, Olivier
Tilmans, Tom Henderson, Asad Ahmed, Gorry Fairhurst, Sebastian
Moeller, and members of the TSVWG mailing list for their
contributions to this document.

9.  IANA Considerations

None.

10.  Security Considerations

For further study.

11.  Informative References

[Akamai]   Holland, J., "Latency & AQM Observations on the Internet",
           IETF MAPRG interim-2020-maprg-01, August 2020.

[Cake]     Hoiland-Jorgensen, T., Taht, D., and J. Morton, "Piece of
           CAKE: A Comprehensive Queue Management Solution for Home
           Gateways", 2018.

[COBALT]   Palmei, J., et al., "Design and Evaluation of COBALT Queue
           Discipline", IEEE International Symposium on Local and
           Metropolitan Area Networks 2019, 2019.
[Cubic]    Ha, S., Rhee, I., and L. Xu, "CUBIC: A New TCP-Friendly
           High-Speed TCP Variant", ACM SIGOPS Operating Systems
           Review, 2008.

[Detection]
           Briscoe, B. and A.S. Ahmed, "TCP Prague Fall-back on
           Detection of a Classic ECN AQM", ArXiv, February 2021.

[ECNreadiness]
           Bauer, S., Beverly, R., and A. Berger, "Measuring the
           State of ECN Readiness in Servers, Clients, and Routers",
           Proc ACM SIGCOMM Internet Measurement Conference IMC'11,
           2011.

[EnablingECN]
           Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
           Fairhurst, G., and R. Scheffenegger, "Enabling Internet-
           Wide Deployment of Explicit Congestion Notification", Proc
           Passive & Active Measurement Conference PAM15, 2015.

[Harm]     Ware, R., Mukerjee, M., Seshan, S., and J. Sherry, "Beyond
           Jain's Fairness Index: Setting the Bar For The Deployment
           of Congestion Control Algorithms", Hotnets'19, 2019.

[I-D.heist-tsvwg-ecn-deployment-observations]
           Heist, P. and J. Morton, "Explicit Congestion Notification
           (ECN) Deployment Observations", Work in Progress,
           Internet-Draft, draft-heist-tsvwg-ecn-deployment-
           observations-02, 8 March 2021.

[I-D.ietf-tsvwg-aqm-dualq-coupled]
           Schepper, K., Briscoe, B., and G. White, "DualQ Coupled
           AQMs for Low Latency, Low Loss and Scalable Throughput
           (L4S)", Work in Progress, Internet-Draft, draft-ietf-
           tsvwg-aqm-dualq-coupled-13, 15 November 2020.

[I-D.ietf-tsvwg-ecn-l4s-id]
           Schepper, K. and B. Briscoe, "Identifying Modified
           Explicit Congestion Notification (ECN) Semantics for
           Ultra-Low Queuing Delay (L4S)", Work in Progress,
           Internet-Draft, draft-ietf-tsvwg-ecn-l4s-id-12, 15
           November 2020.

[I-D.ietf-tsvwg-l4s-arch]
           Briscoe, B., Schepper, K., Bagnulo, M., and G.
White, "Low
           Latency, Low Loss, Scalable Throughput (L4S) Internet
           Service: Architecture", Work in Progress, Internet-Draft,
           draft-ietf-tsvwg-l4s-arch-08, 15 November 2020.

[IANA-ECN] Internet Assigned Numbers Authority, "IANA ECN Field
           Assignments", 2018.

[MeasuringECN]
           Mandalari, AM., Lutu, A., Briscoe, B., Bagnulo, M., and O.
           Alay, "Measuring ECN++: Good News for ++, Bad News for ECN
           over Mobile", DOI 10.1109/MCOM.2018.1700739, IEEE
           Communications Magazine vol. 56, no. 3, March 2018.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
           Friendly Rate Control (TFRC): Protocol Specification",
           RFC 5348, DOI 10.17487/RFC5348, September 2008.

[RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
           Notification", RFC 6040, DOI 10.17487/RFC6040, November
           2010.

[RFC8290]  Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys,
           J., and E. Dumazet, "The Flow Queue CoDel Packet Scheduler
           and Active Queue Management Algorithm", RFC 8290,
           DOI 10.17487/RFC8290, January 2018.

[RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
           Notification (ECN) Experimentation", RFC 8311,
           DOI 10.17487/RFC8311, January 2018.

[TCPECN]   Bhooma, P., "TCP ECN: Experience with enabling ECN on the
           Internet", 98th IETF MAPRG Presentation, 2017.

Author's Address

   Greg White (editor)
   CableLabs

   Email: g.white@cablelabs.com