Transport Services (tsv)                                 B. Briscoe, Ed.
Internet-Draft                                       Simula Research Lab
Intended status: Informational                            K. De Schepper
Expires: January 9, 2017                                  Nokia Bell Labs
                                                        M. Bagnulo Braun
                                      Universidad Carlos III de Madrid
                                                            July 8, 2016

    Low Latency, Low Loss, Scalable Throughput (L4S) Internet Service:
                           Problem Statement
             draft-briscoe-tsvwg-aqm-tcpm-rmcat-l4s-problem-02

Abstract

   This document motivates a new service that the Internet could
   provide to eventually replace best efforts for all traffic: Low
   Latency, Low Loss, Scalable throughput (L4S).  It is becoming common
   for _all_ (or most) applications being run by a user at any one time
   to require low latency.  However, the only solution the IETF can
   offer for ultra-low queuing delay is Diffserv, which only favours a
   minority of packets at the expense of others.  In extensive testing,
   the new L4S service keeps average queuing delay under a millisecond
   for _all_ applications even under very heavy load, without
   sacrificing utilization; and it keeps congestion loss to zero.  It
   is becoming widely recognized that adding more access capacity gives
   diminishing returns, because latency is becoming the critical
   problem.  Even with a high-capacity broadband access, the reduced
   latency of L4S remarkably and consistently improves performance
   under load for applications such as interactive video,
   conversational video, voice, Web, gaming, instant messaging, remote
   desktop and cloud-based apps (even when all are being used at once
   over the same access link).
   The insight is that the root cause of queuing delay is in TCP, not
   in the queue.  By fixing the sending TCP (and other transports),
   queuing latency becomes so much better than today that operators
   will want to deploy the network part of L4S to enable new products
   and services.  Further, the network part is simple to deploy -
   incrementally, with zero-config.  Both parts, sender and network,
   ensure coexistence with other legacy traffic.  At the same time, L4S
   solves the long-recognized problem with the future scalability of
   TCP throughput.

   This document explains the underlying problems that have been
   preventing the Internet from enjoying such performance improvements.
   It then outlines the parts necessary for a solution and the steps
   that will be needed to standardize them.  It points out
   opportunities that will open up, and sets out some likely use-cases,
   including ultra-low latency interaction with cloud processing over
   the public Internet.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  The Application Performance Problem
     1.2.  The Technology Problem
     1.3.  Terminology
     1.4.  The Standardization Problem
   2.  Rationale
     2.1.  Why These Primary Components?
     2.2.  Why Not Alternative Approaches?
   3.  Opportunities
     3.1.  Use Cases
   4.  IANA Considerations
   5.  Security Considerations
     5.1.  Traffic (Non-)Policing
     5.2.  'Latency Friendliness'
     5.3.  ECN Integrity
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Required features for scalable transport protocols to
                be safely deployable in the Internet (a.k.a. TCP
                Prague requirements)
   Appendix B.  Standardization items
   Authors' Addresses
1.  Introduction

1.1.  The Application Performance Problem

   It is increasingly common for _all_ of a user's applications at any
   one time to require low delay: interactive Web, Web services, voice,
   conversational video, interactive video, instant messaging, online
   gaming, remote desktop and cloud-based applications.  In the last
   decade or so, much has been done to reduce propagation delay by
   placing caches or servers closer to users.  However, queuing remains
   a major, albeit intermittent, component of latency.  When present,
   it typically roughly doubles the path delay relative to the base
   speed-of-light delay.  Low loss is also important because, for
   interactive applications, losses translate into even longer
   retransmission delays.

   It has been demonstrated that, once access network bit rates reach
   levels now common in the developed world, increasing capacity offers
   diminishing returns if latency (delay) is not addressed.
   Differentiated services (Diffserv) offers Expedited Forwarding
   [RFC3246] for some packets at the expense of others, but this is not
   applicable when all (or most) of a user's applications require low
   latency.

   Therefore, the goal is an Internet service with ultra-Low queueing
   Latency, ultra-Low Loss and Scalable throughput (L4S) - for _all_
   traffic.  Having motivated the goal of 'L4S for all', this document
   enumerates the problems that have to be overcome to reach it.

   It must be said that queuing delay only degrades performance
   infrequently [Hohlfeld14].  It only occurs when a large enough
   capacity-seeking (e.g. TCP) flow is running alongside the user's
   traffic in the bottleneck link, which is typically in the access
   network, or when the low-latency application is itself a large
   capacity-seeking flow (e.g. interactive video).  At these times, the
   performance improvement must be so remarkable that network operators
   will be motivated to deploy it.

1.2.  The Technology Problem

   Active Queue Management (AQM) is part of the solution to queuing
   under load.  AQM improves performance for all traffic, but there is
   a limit to how much queuing delay can be reduced by solely changing
   the network, without addressing the root of the problem.

   The root of the problem is the presence of standard TCP congestion
   control (Reno [RFC5681]) or compatible variants (e.g. TCP Cubic
   [I-D.ietf-tcpm-cubic]).  We shall call this family of congestion
   controls 'Classic' TCP.  It has been demonstrated that if the
   sending host replaces Classic TCP with a 'Scalable' alternative,
   when a suitable AQM is deployed in the network the performance under
   load of all the above interactive applications can be stunningly
   improved.  For instance, queuing delay under heavy load with the
   example DCTCP/DualQ solution cited below is roughly 1 millisecond
   (1 ms) at the 99th percentile without losing link utilization.  This
   compares with 5 to 20 ms on _average_ with a Classic TCP and current
   state-of-the-art AQMs such as fq_CoDel [I-D.ietf-aqm-fq-codel] or
   PIE [I-D.ietf-aqm-pie].  Also, with a Classic TCP, 5 ms of queuing
   is usually only possible by losing some utilization.
   It has been convincingly demonstrated [DCttH15] that it is possible
   to deploy such an L4S service alongside the existing best efforts
   service so that all of a user's applications can shift to it when
   their stack is updated.  Access networks are typically designed with
   one link as the bottleneck for each site (which might be a home,
   small enterprise or mobile device), so deployment at a single node
   should give nearly all the benefit.  Although the main incremental
   deployment problem has been solved, and the remaining work seems
   straightforward, there may need to be changes in approach during the
   process of engineering a complete solution.

   There are three main parts to the L4S approach (illustrated in
   Figure 1):

   1) Protocol: A host needs to distinguish L4S and Classic packets
      with an identifier so that the network can classify them into
      their separate treatments.  [I-D.briscoe-tsvwg-ecn-l4s-id]
      considers various alternative identifiers, and concludes that all
      alternatives involve compromises, but the ECT(1) codepoint of the
      ECN field is a workable solution.

   2) Network: The L4S service needs to be isolated from the queuing
      latency of the Classic service.  However, the two should be able
      to freely share a common pool of capacity.  This is because there
      is no way to predict how many flows at any one time might use
      each service, and capacity in access networks is too scarce to
      partition into two.  So a 'semi-permeable' membrane is needed
      that partitions latency but not bandwidth.  The Dual Queue
      Coupled AQM [I-D.briscoe-aqm-dualq-coupled] is an example of such
      a semi-permeable membrane.

      Per-flow queuing such as in [I-D.ietf-aqm-fq-codel] could be
      used, but it partitions both latency and bandwidth between every
      e2e flow.  So it is rather overkill, which brings disadvantages
      (see Section 2.2), not least that thousands of queues are needed
      when two are sufficient.

   3) Host: Scalable congestion controls already exist.  They solve
      the scaling problem with TCP first pointed out in [RFC3649].  The
      one used most widely (in controlled environments) is Data Centre
      TCP (DCTCP [I-D.ietf-tcpm-dctcp]), which has been implemented and
      deployed in Windows Server Editions (since 2012), in Linux and in
      FreeBSD.  Although DCTCP as-is 'works' well over the public
      Internet, most implementations lack certain safety features that
      will be necessary once it is used outside controlled environments
      like data centres (see later).  A similar scalable congestion
      control will also need to be transplanted into protocols other
      than TCP (SCTP, RTP/RTCP, RMCAT, etc.).  A minimal sketch of the
      scalable response follows this list, before Figure 1.
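   For concreteness, the following minimal sketch (Python-style; the
   class and method names are illustrative, not taken from
   [I-D.ietf-tcpm-dctcp]) shows the essence of the DCTCP response that
   makes a congestion control 'scalable': the sender keeps a moving
   average of the fraction of ECN-marked packets and reduces its window
   in proportion to it, rather than halving:

      # Sketch of a DCTCP-style scalable response, assuming marking
      # feedback is processed once per round trip.  Illustrative only;
      # real implementations live in the kernel TCP stack.

      G = 1.0 / 16             # gain of the moving average (DCTCP default)

      class ScalableSender:
          def __init__(self, cwnd=10.0):
              self.cwnd = cwnd     # congestion window in segments
              self.alpha = 0.0     # estimated fraction of marked packets

          def per_round_trip(self, acked, marked):
              frac = (marked / acked) if acked else 0.0
              self.alpha = (1 - G) * self.alpha + G * frac
              if marked:
                  # Proportional reduction: at high rates the AQM can
                  # mark a small fraction of packets every RTT (about 2
                  # signals/RTT), keeping control tight without Reno's
                  # deep sawteeth.
                  self.cwnd *= 1 - self.alpha / 2
              else:
                  self.cwnd += 1   # conventional additive increase
              self.cwnd = max(self.cwnd, 2.0)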
                    (1)                      (2)
             .-------^------.  .--------------^-------------------.
      ,-(3)-----.                                      ______
     ; ________  :           L4S   --------.          |      |
     :|Scalable| :              _\  ||___\_           | mark |
     :| sender | :  __________  / / || /              |______|\    _________
     :|________|\; |          |/   --------'       ^            \1|         |
     `---------'\__|  IP-ECN  |           Coupling :             \|priority |_\
     ________  /   |Classifier|                    :             /|scheduler| /
    |Classic |/    |__________|\   --------.    ___:__          / |_________|
    | sender |               \_\||  |  |||___\_| mark/|/
    |________|                 /||  |  ||| /   | drop |
                       Classic     --------'   |______|

     Figure 1: Components of an L4S Solution: 1) Packet Identification
        Protocol; 2) Isolation in separate network queues; and 3)
        Scalable Sending Host

1.3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].  In this
   document, these words will appear with that interpretation only when
   in ALL CAPS.  Lower case uses of these words are not to be
   interpreted as carrying RFC-2119 significance.

   Classic service: The 'Classic' service is intended for all the
      congestion control behaviours that currently co-exist with TCP
      Reno (e.g. TCP Cubic, Compound, SCTP, etc).

   Low-Latency, Low-Loss and Scalable (L4S) service: The 'L4S' service
      is intended for traffic from scalable TCP algorithms such as Data
      Centre TCP.  But it is also more general--it will allow a set of
      congestion controls with similar scaling properties to DCTCP
      (e.g. Relentless [Mathis09]) to evolve.

      Both Classic and L4S services can cope with a proportion of
      unresponsive or less-responsive traffic as well (e.g. DNS, VoIP,
      etc).

   Scalable Congestion Control: A congestion control where the flow
      rate is inversely proportional to the level of congestion
      signals.  Then, as flow rate scales, the number of congestion
      signals per round trip remains invariant, maintaining the same
      degree of control.  For instance, DCTCP averages 2 congestion
      signals per round trip whatever the flow rate.

   Classic Congestion Control: A congestion control with a flow rate
      compatible with standard TCP Reno [RFC5681].  With Classic
      congestion controls, as capacity increases, enabling higher flow
      rates, the number of round trips between congestion signals
      (losses or ECN marks) rises in proportion to the flow rate.  So
      control of queuing and/or utilization becomes very slack.  For
      instance, with 1500 B packets and an RTT of 18 ms, as the TCP
      Reno flow rate increases from 2 to 100 Mb/s, the number of round
      trips between congestion signals rises proportionately, from 2 to
      100.  (This scaling is illustrated numerically after this list.)

      The default congestion control in Linux (TCP Cubic) is Reno-
      compatible for most scenarios expected for some years.  For
      instance, with a typical domestic round-trip time (RTT) of 18 ms,
      TCP Cubic only switches out of its Reno-compatibility mode once
      the flow rate approaches 1 Gb/s.  For a typical data centre RTT
      of 1 ms, the switch-over point is theoretically 1.3 Tb/s.
      However, with a less common transcontinental RTT of 100 ms, it
      only remains Reno-compatible up to 13 Mb/s.  All examples assume
      1,500 B packets.

   Classic ECN: The original proposed standard Explicit Congestion
      Notification (ECN) protocol [RFC3168], which requires ECN signals
      to be treated the same as drops, both when generated in the
      network and when responded to by the sender.

   Site: A home, mobile device, small enterprise or campus, where the
      network bottleneck is typically the access link to the site.  Not
      all network arrangements fit this model, but it is a useful,
      widely applicable generalisation.
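   The Reno numbers above can be reproduced from the standard sawtooth
   model (a sketch under that model's assumptions, not text from any
   cited document): the window halves on each congestion signal and
   grows by one segment per RTT, so successive signals are about Wmax/2
   round trips apart and the mean window is about 3*Wmax/4.

      # Illustrative check of the 'Classic Congestion Control' example.

      MSS_BITS = 1500 * 8     # 1500 B packets
      RTT = 0.018             # 18 ms

      def rounds_between_signals(rate_bps):
          w_avg = rate_bps * RTT / MSS_BITS   # mean window in segments
          w_max = w_avg * 4 / 3               # sawtooth peak
          return w_max / 2                    # RTTs from one signal to next

      for rate in (2e6, 100e6):
          print(f"{rate / 1e6:>5.0f} Mb/s -> "
                f"{rounds_between_signals(rate):.0f} round trips "
                "between congestion signals")
      # Prints 2 and 100, matching the text: control becomes ever
      # slacker as rate scales, whereas DCTCP keeps ~2 signals per RTT.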
1.4.  The Standardization Problem

   0) Architecture: The first step will be to articulate the structure
      and interworking requirements of the set of parts that would
      satisfy the overall application performance requirements.

   Then specific interworking aspects of the following three component
   parts will need to be defined:

   1) Protocol:

      A.  [I-D.briscoe-tsvwg-ecn-l4s-id] recommends that ECT(1) be used
          as the identifier to classify L4S and Classic packets into
          their separate treatments, as required by [RFC4774].  The
          draft also points out that the original experimental
          assignment of this codepoint as an ECN nonce [RFC3540] needs
          to be made obsolete (it was never deployed, and it offers no
          security benefit now that deployment is optional).

      B.  An essential aspect of a scalable congestion control is the
          use of explicit congestion signals rather than losses,
          because the signals need to be sent immediately and
          frequently--too often to use drops.  'Classic' ECN [RFC3168]
          requires an ECN signal to be treated the same as a drop, both
          when it is generated in the network and when it is responded
          to by hosts.  L4S allows networks and hosts to support two
          separate meanings for ECN.  So the standards track [RFC3168]
          will need to be updated to allow ECT(1) packets to depart
          from the 'same as drop' constraint.

   2) Network: The Dual Queue Coupled AQM has been specified as
      generically as possible [I-D.briscoe-aqm-dualq-coupled] as a
      'semi-permeable' membrane, without specifying the particular AQMs
      to use in the two queues.  An informational appendix of the draft
      is provided for pseudocode examples of different possible AQM
      approaches.  Initially a zero-config variant of RED called Curvy
      RED was implemented, tested and documented.  A variant of PIE has
      been implemented and tested and is about to be documented.  The
      aim is for designers to be free to implement diverse ideas.  So
      the brief normative body of the draft only specifies the minimum
      constraints an AQM needs to comply with to ensure that the L4S
      and Classic services will coexist.
   3) Host:

      A.  Data Centre TCP is the most widely used example of a scalable
          congestion control.  It is being documented in the TCPM WG as
          an informational record of the protocol currently in use
          [I-D.ietf-tcpm-dctcp].  It will be necessary to define a
          number of safety features for a variant usable on the public
          Internet.  A draft list of these, known as the TCP Prague
          requirements, has been drawn up (see Appendix A).

      B.  Transport protocols other than TCP use various congestion
          controls designed to be friendly with Classic TCP.  It will
          be necessary to implement scalable variants of each of these
          transport behaviours before they can use the L4S service.
          The following standards track RFCs currently define these
          protocols, and they will need to be updated to allow a
          different congestion response, which they will have to
          indicate by using the ECT(1) codepoint: ECN in TCP [RFC3168],
          in SCTP [RFC4960], in RTP [RFC6679], and in DCCP [RFC4340].

      C.  ECN feedback is sufficient for L4S in some transport
          protocols (RTCP, DCCP) but not others:

          +  For the case of TCP, the feedback protocol for ECN embeds
             the assumption from Classic ECN that it is the same as
             drop, making it unusable for a scalable TCP.  Therefore,
             the implementation of TCP receivers will have to be
             upgraded [RFC7560].  Work to standardize more accurate ECN
             feedback for TCP (AccECN [I-D.ietf-tcpm-accurate-ecn]) is
             already in progress.

          +  ECN feedback is only roughly sketched in an appendix of
             the SCTP specification.  A fuller specification has been
             proposed [I-D.stewart-tsvwg-sctpecn], which would need to
             be implemented and deployed.

   Currently, the new specification of the ECN protocol
   [I-D.briscoe-tsvwg-ecn-l4s-id] has been written for the experimental
   track.  Perhaps a better approach would be to make this a standards
   track protocol draft that updates the definition of ECT(1) in all
   the above standards track RFCs and obsoletes its experimental use
   for the ECN nonce.  Then experimental specifications of example
   network (AQM) and host (congestion control) algorithms can be
   written.

2.  Rationale

2.1.  Why These Primary Components?

   Explicit congestion signalling (protocol): Explicit congestion
      signalling is a key part of the L4S approach.  In contrast, use
      of drop as a congestion signal creates a tension because drop is
      both a useful signal (more would reduce delay) and an impairment
      (less would reduce delay).  Explicit congestion signals can be
      used many times per round trip, to keep tight control, without
      any impairment.  Under heavy load, even more explicit signals can
      be applied so the queue can be kept short whatever the load,
      whereas state-of-the-art AQMs have to introduce very high packet
      drop at high load to keep the queue short.  Further, TCP's
      sawtooth reduction can be smaller, and therefore return to the
      operating point more often, without worrying that this causes
      more signals (one at the top of each smaller sawtooth).  The
      consequent smaller amplitude sawteeth fit between a very shallow
      marking threshold and an empty queue, so delay variation can be
      very low, without risk of under-utilization.

      All the above makes it clear that explicit congestion signalling
      is only advantageous for latency if it does not have to be
      considered 'the same as' drop (as required with Classic ECN
      [RFC3168]).  Before Classic ECN was standardized, there were
      various proposals to give an ECN mark a different meaning from
      drop.  However, there was no particular reason to agree on any
      one of the alternative meanings, so 'the same as drop' was the
      only compromise that could be reached.  RFC 3168 contains a
      statement that:

         "An environment where all end nodes were ECN-Capable could
         allow new criteria to be developed for setting the CE
         codepoint, and new congestion control mechanisms for end-node
         reaction to CE packets.  However, this is a research issue,
         and as such is not addressed in this document."

   Latency isolation with coupled congestion notification (network):
      Using just two queues is not essential to L4S (more would be
      possible), but it is the simplest way to isolate all the L4S
      traffic that keeps latency low from all the legacy Classic
      traffic that does not.

      Similarly, coupling the congestion notification between the
      queues is not necessarily essential, but it is a clever and
      simple way to allow senders to determine their rate, packet-by-
      packet, rather than be overridden by a network scheduler;
      otherwise a network scheduler would have to inspect at least
      transport layer headers, and it would have to continually assign
      a rate to each flow without any easy way to understand
      application intent.  (A minimal sketch of this classify-and-
      couple structure is given at the end of this list.)
   L4S packet identifier (protocol): Once there are at least two
      separate treatments in the network, hosts need an identifier at
      the IP layer to distinguish which treatment they intend to use.

   Scalable congestion notification (host): A scalable congestion
      control keeps the signalling frequency high, so that rate
      variations can be small when signalling is stable, and rate can
      track variations in available capacity as rapidly as possible
      otherwise.
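   To make the classify-and-couple structure concrete, here is a
   minimal sketch.  This is NOT the pseudocode from
   [I-D.briscoe-aqm-dualq-coupled]; the names, and the PI-style base
   signal p_prime, are assumptions for illustration only.

      import random

      K = 2.0   # coupling factor between Classic and L4S signals

      def classify(packet, l4s_q, classic_q):
          # Classification uses only the IP-ECN field (no transport
          # headers): ECT(1) and CE identify L4S; Not-ECT and ECT(0)
          # are Classic.
          if packet.ecn in ('ECT(1)', 'CE'):
              l4s_q.append(packet)
          else:
              classic_q.append(packet)

      def dequeue(l4s_q, classic_q, p_prime):
          # p_prime is the base signal from the Classic AQM (e.g. a PI
          # controller acting on the Classic queue's delay).
          if l4s_q:                                 # conditional priority
              pkt = l4s_q.pop(0)
              if random.random() < min(K * p_prime, 1.0):
                  pkt.ecn = 'CE'                    # immediate mark, no drop
              return pkt
          if classic_q:
              pkt = classic_q.pop(0)
              if random.random() < p_prime ** 2:    # squared coupling
                  return None                       # drop (or mark if ECT(0))
              return pkt
          return None

   The two exponents are what makes the membrane semi-permeable to
   bandwidth but not latency: a Reno-family flow's rate varies roughly
   as 1/sqrt(p_C) while a DCTCP-style flow's varies roughly as 1/p_L,
   so deriving both signals from a common p' (p_C = p'^2, p_L = k*p')
   leaves the two families with roughly equal rates, without the
   network ever assigning a rate to any flow.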
2.2.  Why Not Alternative Approaches?

   All the following approaches address some part of the same problem
   space as L4S.  In each case, it is shown that L4S complements them
   or improves on them, rather than being a mutually exclusive
   alternative:

   Diffserv: Diffserv addresses the problem of bandwidth apportionment
      for important traffic as well as queuing latency for delay-
      sensitive traffic.  L4S solely addresses the problem of queuing
      latency.  Diffserv will still be necessary where important
      traffic requires priority (e.g. for commercial reasons, or for
      protection of critical infrastructure traffic).  Nonetheless, if
      there are Diffserv classes for important traffic, the L4S
      approach can provide low latency for _all_ traffic within each
      Diffserv class (including the case where there is only one
      Diffserv class).

      Also, as already explained, Diffserv only works for a small
      subset of the traffic on a link.  It is not applicable when all
      the applications in use at one time at a single site (home, small
      business or mobile device) require low latency.  Also, because
      L4S is for all traffic, it needs none of the management baggage
      (traffic policing, traffic contracts) associated with favouring
      some packets over others.  This baggage has held Diffserv back
      from widespread end-to-end deployment.

   State-of-the-art AQMs: AQMs such as PIE and fq_CoDel give a
      significant reduction in queuing delay relative to no AQM at all.
      The L4S work is intended to complement these AQMs, and we
      definitely do not want to distract from the need to deploy them
      as widely as possible.  Nonetheless, without addressing the large
      saw-toothing rate variations of Classic congestion controls, AQMs
      alone cannot reduce queuing delay too far without significantly
      reducing link utilization.  The L4S approach resolves this
      tension by ensuring hosts can minimize the size of their sawteeth
      without appearing so aggressive to legacy flows that they starve.

   Per-flow queuing: Similarly, per-flow queuing is not incompatible
      with the L4S approach.  However, one queue for every flow can be
      thought of as overkill compared to the minimum of two queues for
      all traffic needed for the L4S approach.  The overkill of per-
      flow queuing has side-effects:

      A.  fq makes high-performance networking equipment costly
          (processing and memory) - in contrast, dual queue code can be
          very simple;

      B.  fq requires packet inspection into the end-to-end transport
          layer, which doesn't sit well alongside encryption for
          privacy - in contrast, a dual queue only operates at the IP
          layer;

      C.  fq decides packet-by-packet which flow to schedule without
          knowing application intent.  In contrast, in the L4S approach
          the sender still controls the relative rate of each flow
          dependent on the needs of each application.

   Alternative Back-off ECN (ABE): Yet again, L4S is not an alternative
      to ABE but a complement that introduces much lower queuing delay.
      ABE [I-D.khademi-tcpm-alternativebackoff-ecn] alters the host
      behaviour in response to ECN marking to utilize a link better and
      give ECN flows faster throughput, but it assumes the network
      still treats ECN and drop the same.  Therefore ABE exploits any
      lower queuing delay that AQMs can provide.  But as explained
      above, AQMs still cannot reduce queuing delay too far without
      losing link utilization (for other non-ABE flows).

3.  Opportunities

   A transport layer that solves the current latency issues will
   provide new service, product and application opportunities.

   With the L4S approach, the following existing applications will
   immediately experience significantly better quality of experience
   under load in the best effort class:

   o  Gaming

   o  VoIP

   o  Video conferencing

   o  Web browsing

   o  (Adaptive) video streaming

   o  Instant messaging

   The significantly lower queuing latency also enables some
   interactive application functions to be offloaded to the cloud that
   would hardly even be usable today:

   o  Cloud-based interactive video

   o  Cloud-based virtual and augmented reality

   The above two applications have been successfully demonstrated with
   L4S, both running together over a 40 Mb/s broadband access link
   loaded up with the numerous other latency-sensitive applications in
   the previous list, as well as numerous downloads.  A panoramic video
   of a football stadium can be swiped and pinched so that, on the fly,
   a proxy in the cloud generates a sub-window of the match video under
   the finger-gesture control of each user.  At the same time, a
   virtual reality headset fed from a 360 degree camera in a racing car
   has been demonstrated, where the user's head movements control the
   scene generated in the cloud.  In both cases, with 7 ms end-to-end
   base delay, the additional queuing delay of roughly 1 ms is so low
   that the video seems to be generated locally.  See
   https://riteproject.eu/dctth/ for videos of these demonstrations.

   Using a swiping finger gesture or a head movement to pan a video is
   extremely demanding--far more demanding than VoIP--because human
   vision can detect extremely low delays, of the order of single
   milliseconds, when delay is translated into a visual lag between a
   video and a reference point (the finger or the orientation of the
   head).

   If low network delay is not available, all fine interaction has to
   be done locally, and therefore much more redundant data has to be
   downloaded.  When all interactive processing can be done in the
   cloud, only the data to be rendered for the end user needs to be
   sent.  And once applications can rely on minimal queues in the
   network, they can focus on reducing their own latency by minimizing
   only the application send queue.

3.1.  Use Cases

   The following use-cases for L4S are being considered by various
   interested parties:

   o  Where the bottleneck is one of various types of access network:
      DSL, cable, mobile, satellite

      *  Radio links (cellular, WiFi) that are distant from the source
         are particularly challenging.  The radio link capacity can
         vary rapidly by orders of magnitude, so it is often desirable
         to hold a buffer to utilise sudden increases of capacity;
      *  cellular networks are further complicated by a perceived need
         to buffer in order to make hand-overs imperceptible;

      *  Satellite networks generally have a very large base RTT, so
         even with minimal queuing, overall delay can never be
         extremely low;

      *  Nonetheless, it is certainly desirable not to hold a buffer
         purely because of the sawteeth of Classic TCP, when it is more
         than is needed for all the above reasons.

   o  Private networks of heterogeneous data centres, where there is no
      single administrator that can arrange for all the simultaneous
      changes to senders, receivers and network needed to deploy DCTCP:

      *  a set of private data centres interconnected over a wide area
         with separate administrations, but within the same company

      *  a set of data centres operated by separate companies
         interconnected by a community of interest network (e.g. for
         the finance sector)

      *  multi-tenant (cloud) data centres where tenants choose their
         operating system stack (Infrastructure as a Service - IaaS)

   o  Different types of transport (or application) congestion control:

      *  elastic (TCP/SCTP);

      *  real-time (RTP, RMCAT);

      *  query (DNS/LDAP).

   o  Where low delay quality of service is required, but without
      inspecting or intervening above the IP layer
      [I-D.you-encrypted-traffic-management]:

      *  mobile and other networks have tended to inspect higher layers
         in order to guess application QoS requirements.  However, with
         growing demand for support of privacy and encryption, L4S
         offers an alternative.  There is no need to select which
         traffic to favour for queuing, when L4S gives favourable
         queuing to all traffic.

4.  IANA Considerations

   This specification contains no IANA considerations.

5.  Security Considerations

5.1.  Traffic (Non-)Policing

   Because the L4S service can serve all traffic that is using the
   capacity of a link, it should not be necessary to police access to
   the L4S service.  In contrast, Diffserv only works if some packets
   get less favourable treatment than others.  So it has to use traffic
   policers to limit how much traffic can be favoured.  In turn,
   traffic policers require traffic contracts between users and
   networks, as well as pairwise between networks.  Because L4S will
   lack all this management complexity, it is more likely to work end-
   to-end.

   During early deployment (and perhaps always), some networks will not
   offer the L4S service.  These networks do not need to police or re-
   mark L4S traffic - they just forward it unchanged as best efforts
   traffic, as they would already forward traffic with ECT(1) today.
   At a bottleneck, such networks will introduce some queuing and
   dropping.  When a scalable congestion control detects a drop, it
   will have to respond as if it were a Classic congestion control (see
   item 4-1 in Appendix A).  This will ensure safe interworking with
   other traffic at the 'legacy' bottleneck.

   Certain network operators might choose to restrict access to the L4S
   class, perhaps only to customers who have paid a premium.  In the
   packet classifier (item 1 in Figure 1), they could identify such
   customers using a field other than ECN (e.g. source address range),
   and just ignore the L4S identifier for non-paying customers.  This
   would ensure that the L4S identifier survives end-to-end, even
   though the service does not have to be supported at every hop.  Such
   arrangements would only require simple registered/not-registered
   packet classification (as simple as the sketch below), rather than
   the managed application-specific traffic policing against customer-
   specific traffic contracts that Diffserv requires.
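   For illustration only (the prefix and all names are hypothetical),
   such a registered/not-registered classifier could amount to no more
   than the following; note that the ECN field itself is never
   re-marked, so the identifier still survives end-to-end:

      REGISTERED_PREFIXES = ('192.0.2.',)   # example customer range

      def select_queue(packet):
          l4s_marked = packet.ecn in ('ECT(1)', 'CE')
          registered = packet.src.startswith(REGISTERED_PREFIXES)
          # Non-registered traffic gets the Classic treatment
          # regardless of its ECN field.
          return 'L4S' if (l4s_marked and registered) else 'Classic'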
5.2.  'Latency Friendliness'

   The L4S service does rely on self-constraint - not in terms of
   limiting capacity usage, but in terms of limiting burstiness.  It is
   believed that standardisation of dynamic behaviour (cf. TCP slow-
   start) and self-interest will be sufficient to prevent transports
   from sending excessive bursts of L4S traffic, given that the
   application's own latency will suffer most from such behaviour.

   Whether burst policing becomes necessary remains to be seen.
   Without it, there will be potential for attacks on the low latency
   of the L4S service.  However, it may only be necessary to apply such
   policing reactively, e.g. punitively targeted at any deployments of
   new bursty malware.

5.3.  ECN Integrity

   Receiving hosts can fool a sender into downloading faster by
   suppressing feedback of ECN marks (or of losses, if retransmissions
   are not necessary or available otherwise).  [RFC3540] proposes that
   a TCP sender could pseudorandomly set either ECT(0) or ECT(1) in
   each packet of a flow and remember the sequence it had set, termed
   the ECN nonce.  If the receiver supports the nonce, it can prove
   that it is not suppressing feedback by reflecting its knowledge of
   the sequence back to the sender.  The nonce was proposed on the
   assumption that receivers might be more likely to cheat congestion
   control than senders (although senders also have a motive to cheat).

   If L4S uses the ECT(1) codepoint of ECN for packet classification,
   it will have to obsolete the experimental nonce.  As far as is
   known, the ECN nonce has never been deployed, and it was only
   implemented for a couple of testbed evaluations.  It would be nearly
   impossible to deploy now, because any misbehaving receiver can
   simply opt out, which would be unremarkable given all receivers
   currently opt out.

   Other ways to protect TCP feedback integrity have since been
   developed.  For instance:

   o  The sender can test the integrity of the receiver's feedback by
      occasionally setting the IP-ECN field to a value normally only
      set by the network.  Then it can test whether the receiver's
      feedback faithfully reports what it expects
      [I-D.moncaster-tcpm-rcv-cheat].  This method consumes no extra
      codepoints.  It works for loss, and it will work for ECN feedback
      in any transport protocol suitable for L4S.  However, it shares
      the same assumption as the nonce: that the sender is not cheating
      and is motivated to prevent the receiver cheating;

   o  A network can enforce a congestion response to its ECN markings
      (or packet losses) by auditing congestion exposure (ConEx)
      [RFC7713].  Whether the receiver or a downstream network is
      suppressing congestion feedback, or the sender is unresponsive to
      the feedback, or both, ConEx audit can neutralise any advantage
      that any of these three parties would otherwise gain.  ConEx is
      currently only defined for IPv6 and consumes a destination option
      header.  It has been implemented, but not deployed as far as is
      known.

6.  Acknowledgements

7.  References
7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

7.2.  Informative References

   [Alizadeh-stability]
              Alizadeh, M., Javanmard, A., and B. Prabhakar, "Analysis
              of DCTCP: Stability, Convergence, and Fairness", ACM
              SIGMETRICS 2011, June 2011.

   [DCttH15]  De Schepper, K., Bondarenko, O., Briscoe, B., and I.
              Tsang, "'Data Centre to the Home': Ultra-Low Latency for
              All", 2015.  (Under submission)

   [Hohlfeld14]
              Hohlfeld, O., Pujol, E., Ciucu, F., Feldmann, A., and P.
              Barford, "A QoE Perspective on Sizing Network Buffers",
              Proc. ACM Internet Measurement Conf (IMC'14), November
              2014.

   [I-D.briscoe-aqm-dualq-coupled]
              De Schepper, K., Briscoe, B., Bondarenko, O., and I.
              Tsang, "DualQ Coupled AQM for Low Latency, Low Loss and
              Scalable Throughput", draft-briscoe-aqm-dualq-coupled-01
              (work in progress), March 2016.

   [I-D.briscoe-tsvwg-ecn-l4s-id]
              De Schepper, K., Briscoe, B., and I. Tsang, "Identifying
              Modified Explicit Congestion Notification (ECN) Semantics
              for Ultra-Low Queuing Delay",
              draft-briscoe-tsvwg-ecn-l4s-id-01 (work in progress),
              March 2016.

   [I-D.ietf-aqm-fq-codel]
              Hoeiland-Joergensen, T., McKenney, P., Taht, D., Gettys,
              J., and E. Dumazet, "The FlowQueue-CoDel Packet Scheduler
              and Active Queue Management Algorithm",
              draft-ietf-aqm-fq-codel-06 (work in progress), March
              2016.

   [I-D.ietf-aqm-pie]
              Pan, R., Natarajan, P., Baker, F., and G. White, "PIE: A
              Lightweight Control Scheme To Address the Bufferbloat
              Problem", draft-ietf-aqm-pie-08 (work in progress), June
              2016.

   [I-D.ietf-tcpm-accurate-ecn]
              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
              Accurate ECN Feedback in TCP",
              draft-ietf-tcpm-accurate-ecn-01 (work in progress), June
              2016.

   [I-D.ietf-tcpm-cubic]
              Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance
              Networks", draft-ietf-tcpm-cubic-01 (work in progress),
              January 2016.

   [I-D.ietf-tcpm-dctcp]
              Bensley, S., Eggert, L., Thaler, D., Balasubramanian, P.,
              and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion
              Control for Datacenters", draft-ietf-tcpm-dctcp-01 (work
              in progress), November 2015.

   [I-D.khademi-tcpm-alternativebackoff-ecn]
              Khademi, N., Welzl, M., Armitage, G., and G. Fairhurst,
              "TCP Alternative Backoff with ECN (ABE)",
              draft-khademi-tcpm-alternativebackoff-ecn-00 (work in
              progress), May 2016.

   [I-D.moncaster-tcpm-rcv-cheat]
              Moncaster, T., Briscoe, B., and A. Jacquet, "A TCP Test
              to Allow Senders to Identify Receiver Non-Compliance",
              draft-moncaster-tcpm-rcv-cheat-03 (work in progress),
              July 2014.

   [I-D.stewart-tsvwg-sctpecn]
              Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
              Control Transmission Protocol (SCTP)",
              draft-stewart-tsvwg-sctpecn-05 (work in progress),
              January 2014.

   [I-D.you-encrypted-traffic-management]
              You, J. and C. Xiong, "The Effect of Encrypted Traffic on
              the QoS Mechanisms in Cellular Networks",
              draft-you-encrypted-traffic-management-00 (work in
              progress), October 2015.

   [Mathis09] Mathis, M., "Relentless Congestion Control", PFLDNeT'09,
              May 2009.
   [NewCC_Proc]
              Eggert, L., "Experimental Specification of New Congestion
              Control Algorithms", IETF Operational Note
              ion-tsv-alt-cc, July 2007.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC3246]  Davie, B., Charny, A., Bennet, J., Benson, K., Le Boudec,
              J., Courtney, W., Davari, S., Firoiu, V., and D.
              Stiliadis, "An Expedited Forwarding PHB (Per-Hop
              Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, DOI 10.17487/RFC3540, June 2003.

   [RFC3649]  Floyd, S., "HighSpeed TCP for Large Congestion Windows",
              RFC 3649, DOI 10.17487/RFC3649, December 2003.

   [RFC4340]  Kohler, E., Handley, M., and S. Floyd, "Datagram
              Congestion Control Protocol (DCCP)", RFC 4340,
              DOI 10.17487/RFC4340, March 2006.

   [RFC4774]  Floyd, S., "Specifying Alternate Semantics for the
              Explicit Congestion Notification (ECN) Field", BCP 124,
              RFC 4774, DOI 10.17487/RFC4774, November 2006.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
              and K. Carlberg, "Explicit Congestion Notification (ECN)
              for RTP over UDP", RFC 6679, DOI 10.17487/RFC6679, August
              2012.

   [RFC7560]  Kuehlewind, M., Ed., Scheffenegger, R., and B. Briscoe,
              "Problem Statement and Requirements for Increased
              Accuracy in Explicit Congestion Notification (ECN)
              Feedback", RFC 7560, DOI 10.17487/RFC7560, August 2015.

   [RFC7713]  Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts, Abstract Mechanism, and Requirements",
              RFC 7713, DOI 10.17487/RFC7713, December 2015.

   [TCP-sub-mss-w]
              Briscoe, B. and K. De Schepper, "Scaling TCP's Congestion
              Window for Small Round Trip Times", BT Technical Report
              TR-TUB8-2015-002, May 2015.

   [TCPPrague]
              Briscoe, B., "Notes: DCTCP evolution 'bar BoF': Tue 21
              Jul 2015, 17:40, Prague", tcpprague mailing list archive,
              July 2015.

Appendix A.  Required features for scalable transport protocols to be
             safely deployable in the Internet (a.k.a. TCP Prague
             requirements)

   This appendix lists the features, mechanisms and modifications to
   currently defined behaviour that scalable transport protocols need
   so that they can be safely deployed over the public Internet.  This
   list of requirements was produced at an ad hoc meeting during
   IETF-94 in Prague [TCPPrague].

   One such scalable transport protocol is DCTCP, currently specified
   in [I-D.ietf-tcpm-dctcp].  In its current form, DCTCP is specified
   to be deployable only in controlled environments, and deploying it
   in the public Internet would lead to a number of issues, from both
   the safety and the performance perspectives.  In this appendix, we
   describe the modifications and additional mechanisms that are
   required for its deployment over the global Internet.  We use DCTCP
   as a base, but it is likely that most of these requirements equally
   apply to other scalable transport protocols.
   We next provide a brief description of each required feature.

   Requirement #4.1: Fall back to Reno/Cubic congestion control on
   packet loss.

   Description: In case of packet loss, the scalable transport MUST
   react as Classic TCP does (whichever Classic version of TCP is
   running in the host, e.g. Reno or Cubic).

   Motivation: One of the safety conditions for deploying a scalable
   transport over the public Internet is to make sure that it behaves
   properly when some or all of the network devices connecting the two
   endpoints that implement the scalable transport have not been
   upgraded.  In particular, it may be the case that some of the
   switches along the path between the two endpoints only react to
   congestion by dropping packets (i.e. no ECN marking).  It is
   important that in these cases the scalable transport reacts to the
   congestion signal in the form of a packet drop similarly to Classic
   TCP.

   In the particular case of DCTCP, the current DCTCP specification
   states that "It is RECOMMENDED that an implementation deal with loss
   episodes in the same way as conventional TCP."  For safe deployment
   of a scalable transport in the public Internet, the above
   requirement needs to be defined as a MUST.

   Packet loss, while rare, may also occur in the case that the
   bottleneck is L4S capable.  In this case, the sender may receive a
   high number of packets marked with the CE bit set and also
   experience a loss.  Current DCTCP implementations react differently
   to this situation.  At least one implementation reacts only to the
   drop signal (e.g. by halving the CWND) and at least one other DCTCP
   implementation reacts to both signals (e.g. by halving the CWND due
   to the drop and also further reducing the CWND based on the
   proportion of marked packets).  We believe that further
   experimentation is needed to understand which behaviour is best for
   the public Internet, which may or may not be one of the existing
   implementations.  A sketch of the first behaviour is given below.
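   The following fragment sketches the first of the two behaviours
   described above (illustrative names only; how to combine the mark
   and loss responses in one round trip is exactly the open question):

      # Sketch of Requirement #4.1: scalable response to ECN marks,
      # but a Classic (Reno-like) halving whenever loss is detected.
      # 'alpha' is the DCTCP-style moving average of the marked
      # fraction, as in the sketch in Section 1.2.

      def react(sender, marked, lost):
          if lost:
              # MUST fall back to the Classic response, so that a
              # non-ECN ('legacy') bottleneck still sees a flow that
              # behaves like Reno.
              sender.cwnd /= 2
          elif marked:
              sender.cwnd *= 1 - sender.alpha / 2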
   Requirement #4.2: Fall back to Reno/Cubic congestion control on
   classic ECN bottlenecks.

   Description: The scalable transport protocol SHOULD/MAY? behave as
   Classic TCP with Classic ECN if the path contains a legacy
   bottleneck that marks both ECT(0) and ECT(1) in the same way as drop
   (a non-L4S, but ECN-capable, bottleneck).

   Motivation: Similarly to Requirement #4.1, this requirement is a
   safety condition in case L4S-capable endpoints are communicating
   over a path that contains one or more non-L4S but ECN-capable
   switches and one of them happens to be the bottleneck.  In this
   case, the scalable transport will attempt to fill the buffer of the
   bottleneck switch up to the marking threshold and produce a small
   sawtooth around that operating point.  The result is that the switch
   will settle at an operating point with the buffer full, and all
   other non-scalable transports will be starved (as they will react by
   reducing their CWND more aggressively than the scalable transport).

   Scalable transports then MUST be able to detect the presence of a
   classic ECN bottleneck and fall back to classic TCP/classic ECN
   behaviour in this case.

   Discussion: It is not clear at this point whether it is possible to
   design a mechanism that always detects the aforementioned cases.
   One possibility is to base the detection on an increase on top of a
   minimum RTT, but it is not yet clear which value should trigger
   this.  A delay-based fallback response in L4S might also be
   beneficial for preserving low latency even where no legacy network
   nodes are present.  Even if it is possible to design such a
   mechanism, it may well be that it would involve additional
   complexity that implementers consider unnecessary.  The need for
   this mechanism depends on the extent of classic ECN deployment.

   Requirement #4.3: Reduce RTT dependence.

   Description: Scalable transport congestion control algorithms MUST
   reduce or eliminate the RTT bias within the range of RTTs available.

   Motivation: Classic TCP's throughput is known to be inversely
   proportional to RTT, so one would expect flows over very low RTT
   paths to nearly starve flows over larger RTTs.  However, because
   Classic TCP induces a large queue, it has never allowed a very low
   RTT path to exist so far.  For instance, consider two paths with
   base RTTs of 1 ms and 100 ms.  If Classic TCP induces a 20 ms queue,
   it turns these RTTs into 21 ms and 120 ms, leading to a throughput
   ratio of about 1:6.  Whereas if a Scalable TCP induces only a 1 ms
   queue, the ratio is 2:101.  Therefore, with small queues, long-RTT
   flows will essentially starve.

   Scalable transport protocols MUST therefore accommodate flows across
   the range of RTTs enabled by the deployment of the L4S service over
   the public Internet.

   Requirement #4.4: Scaling down the congestion window.

   Description: Scalable transports MUST be responsive to congestion
   when RTTs are significantly smaller than in the current public
   Internet.

   Motivation: As currently specified, the minimum CWND of TCP (and of
   scalable extensions such as DCTCP) is set to 2 MSS.  Once this
   minimum CWND is reached, the transport protocol ceases to react to
   congestion signals (the CWND is not reduced further beyond this
   minimum size).

   L4S mechanisms significantly reduce the queueing delay, achieving
   smaller RTTs over the Internet.  For the same CWND, smaller RTTs
   imply higher transmission rates.  The result is that, when scalable
   transports are used and small RTTs are achieved, the minimum CWND
   currently defined as 2 MSS may still result in a high transmission
   rate in a large number of common scenarios.  For example, as
   described in [TCP-sub-mss-w], consider a residential setting with a
   40 Mb/s broadband Internet access.  Suppose a number of equal TCP
   flows run in parallel, with the Internet access link being the
   bottleneck, and suppose that for these flows the RTT is 6 ms and the
   MSS is 1500 B.  The minimum transmission rate supported by TCP in
   this scenario is reached when the CWND is set to 2 MSS, which
   results in 4 Mb/s for each flow.  This means that in this scenario,
   if the number of flows is higher than 10, the congestion control
   ceases to be responsive and starts to build up a queue in the
   network.  (A quick check of these numbers is sketched below.)

   In order to address this issue, the congestion control mechanism for
   scalable transports MUST be responsive for the new range of RTTs
   resulting from the decrease in queueing delay.

   There are several ways in which this could be achieved.  One
   possible sub-MSS window mechanism is described in [TCP-sub-mss-w].
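   The numbers in the example above follow directly from the rate floor
   implied by the minimum CWND (a quick illustrative check, not from
   [TCP-sub-mss-w]):

      MSS = 1500 * 8          # bits
      RTT = 0.006             # 6 ms, as in the example
      LINK = 40e6             # 40 Mb/s access link

      min_rate = 2 * MSS / RTT                 # rate floor: 4 Mb/s/flow
      max_responsive_flows = LINK / min_rate   # 10 flows

      print(f"rate floor per flow: {min_rate / 1e6:.0f} Mb/s; "
            f"unresponsive beyond {max_responsive_flows:.0f} flows")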
   In addition to the safety requirements described above, there are
   some optimizations that, while not required for the safe deployment
   of scalable transports over the public Internet, would result in
   improved performance.  We describe them next.

   Optimization #5.1: Setting ECT in SYN, SYN/ACK and pure ACK packets.

   Description: Scalable transports SHOULD set ECT in SYN, SYN/ACK and
   pure ACK packets.

   Motivation: Failing to set ECT in SYN, SYN/ACK or ACK packets
   results in these packets being more likely to be dropped during
   congestion events.  Dropping SYN and SYN/ACK packets is particularly
   bad for performance, as the retransmission timers for these packets
   are large.  [RFC3168] prevents these packets from being marked
   ECN-capable, for security reasons.  The arguments provided there
   should be revisited in the context of L4S, to evaluate whether
   avoiding ECT on these packets is still the best approach.

   Optimization #5.2: Faster than additive increase.

   Description: Scalable transports MAY support faster-than-additive
   increase in the congestion avoidance phase.

   Motivation: As currently defined, DCTCP uses additive increase in
   the congestion avoidance phase.  It would be beneficial for
   performance to update the congestion control algorithm to increase
   the CWND by more than 1 MSS per RTT during the congestion avoidance
   phase.  In the context of L4S, such a mechanism must also provide
   fairness with other classes of traffic, including Classic TCP and
   possibly scalable TCPs that use additive increase.

   Optimization #5.3: Faster convergence to fairness.

   Description: Scalable transports SHOULD converge to a fair share
   allocation of the available capacity as fast as Classic TCP, or
   faster.

   Motivation: The time required for a new flow to obtain its fair
   share of the capacity of the bottleneck, when there are already
   ongoing flows using up all the bottleneck capacity, is higher in the
   case of DCTCP than in the case of Classic TCP (about a factor of 1.5
   to 2 larger, according to [Alizadeh-stability]).  This is
   detrimental in general, but it is very harmful for short flows,
   whose performance can be worse than that obtained with Classic TCP.
   For this reason, it is desirable that scalable transports provide
   convergence times no larger than those of Classic TCP.

Appendix B.  Standardization items

   The following table includes all the items that should be
   standardized to provide a full L4S architecture.

   The table is too wide for the ASCII draft format, so it has been
   split into two, with a common column of row index numbers on the
   left.

   The columns in the second part of the table have the following
   meanings:

   WG:  The IETF WG most relevant to this requirement.  The
      "tcpm/iccrg" combination refers to the procedure typically used
      for congestion control changes, where tcpm owns the approval
      decision, but uses the iccrg for expert review [NewCC_Proc];

   TCP:  Applicable to all forms of TCP congestion control;

   DCTCP:  Applicable to Data Centre TCP as currently used (in
      controlled environments);

   DCTCP-bis:  Applicable to a future Data Centre TCP congestion
      control intended for controlled environments;

   XXX Prague:  Applicable to a Scalable variant of XXX (TCP/SCTP/
      RMCAT) congestion control.
   +-----+-----------------------+-------------------------------------+
   | Req | Requirement           | Reference                           |
   | #   |                       |                                     |
   +-----+-----------------------+-------------------------------------+
   | 0   | ARCHITECTURE          |                                     |
   | 1   | L4S IDENTIFIER        | [I-D.briscoe-tsvwg-ecn-l4s-id]      |
   | 2   | DUAL QUEUE AQM        | [I-D.briscoe-aqm-dualq-coupled]     |
   | 3   | Suitable ECN Feedback | [I-D.ietf-tcpm-accurate-ecn],       |
   |     |                       | [I-D.stewart-tsvwg-sctpecn].        |
   |     |                       |                                     |
   |     | SCALABLE TRANSPORT -  |                                     |
   |     | SAFETY ADDITIONS      |                                     |
   | 4-1 | Fall back to          | [I-D.ietf-tcpm-dctcp]               |
   |     | Reno/Cubic on loss    |                                     |
   | 4-2 | Fall back to          |                                     |
   |     | Reno/Cubic if classic |                                     |
   |     | ECN bottleneck        |                                     |
   |     | detected              |                                     |
   |     |                       |                                     |
   | 4-3 | Reduce RTT-dependence |                                     |
   |     |                       |                                     |
   | 4-4 | Scaling TCP's         | [TCP-sub-mss-w]                     |
   |     | Congestion Window for |                                     |
   |     | Small Round Trip      |                                     |
   |     | Times                 |                                     |
   |     | SCALABLE TRANSPORT -  |                                     |
   |     | PERFORMANCE           |                                     |
   |     | ENHANCEMENTS          |                                     |
   | 5-1 | Setting ECT in SYN,   | draft-bagnulo-tsvwg-generalized-ECN |
   |     | SYN/ACK and pure ACK  |                                     |
   |     | packets               |                                     |
   | 5-2 | Faster-than-additive  |                                     |
   |     | increase              |                                     |
   | 5-3 | Faster convergence to |                                     |
   |     | fairness              |                                     |
   +-----+-----------------------+-------------------------------------+

   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | #   | WG     | TCP | DCTCP | DCTCP-bis | TCP    | SCTP   | RMCAT  |
   |     |        |     |       |           | Prague | Prague | Prague |
   +-----+--------+-----+-------+-----------+--------+--------+--------+
   | 0   | tsvwg? | Y   | Y     | Y         | Y      | Y      | Y      |
   | 1   | tsvwg? |     |       | Y         | Y      | Y      | Y      |
   | 2   | aqm?   | n/a | n/a   | n/a       | n/a    | n/a    | n/a    |
   |     |        |     |       |           |        |        |        |
   | 3   | tcpm   | Y   | Y     | Y         | Y      | n/a    | n/a    |
   |     |        |     |       |           |        |        |        |
   | 4-1 | tcpm   |     | Y     | Y         | Y      | Y      | Y      |
   |     |        |     |       |           |        |        |        |
   | 4-2 | tcpm/  |     |       |           | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   |     |        |     |       |           |        |        |        |
   | 4-3 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   | 4-4 | tcpm   | Y   | Y     | Y         | Y      | Y      | ?      |
   |     |        |     |       |           |        |        |        |
   | 5-1 | tsvwg  | Y   | Y     | Y         | Y      | n/a    | n/a    |
   |     |        |     |       |           |        |        |        |
   | 5-2 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   | 5-3 | tcpm/  |     |       | Y         | Y      | Y      | ?      |
   |     | iccrg? |     |       |           |        |        |        |
   +-----+--------+-----+-------+-----------+--------+--------+--------+

Authors' Addresses

   Bob Briscoe (editor)
   Simula Research Lab

   Email: ietf@bobbriscoe.net
   URI:   http://bobbriscoe.net/

   Koen De Schepper
   Nokia Bell Labs
   Antwerp
   Belgium

   Email: koen.de_schepper@nokia.com
   URI:   https://www.bell-labs.com/usr/koen.de_schepper

   Marcelo Bagnulo
   Universidad Carlos III de Madrid
   Av. Universidad 30
   Leganes, Madrid  28911
   Spain

   Phone: 34 91 6249500
   Email: marcelo@it.uc3m.es
   URI:   http://www.it.uc3m.es