Network Working Group                                    B. Briscoe, Ed.
Internet-Draft                                               Independent
Intended status: Informational                                  G. White
Expires: 4 August 2022                                         CableLabs
                                                         31 January 2022

     The DOCSIS® Queue Protection Algorithm to Preserve Low Latency
                  draft-briscoe-docsis-q-protection-02

Abstract

This informational document explains the specification of the queue protection algorithm used in DOCSIS technology since version 3.1. A shared low latency queue relies on the non-queue-building behaviour of every traffic flow using it. However, some flows might not take such care, either accidentally or maliciously. If a queue is about to exceed a threshold level of delay, the queue protection algorithm can rapidly detect the flows most likely to be responsible. It can then prevent harm to other traffic in the low latency queue by ejecting selected packets (or all packets) of these flows. The document is designed for three types of audience: a) congestion control designers who need to understand how to keep on the 'good' side of the algorithm; b) implementers of the algorithm who want to understand it in more depth; and c) designers of algorithms with similar goals, perhaps for non-DOCSIS scenarios.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 August 2022.

Copyright Notice

Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
  1.1.  Document Roadmap
  1.2.  Terminology
  1.3.  Copyright Material
2.  Approach - In Brief
  2.1.  Mechanism
  2.2.  Policy
3.  Necessary Flow Behaviour
4.  Pseudocode Walk-Through
  4.1.  Input Parameters, Constants and Variables
  4.2.  Queue Protection Data Path
    4.2.1.  The qprotect() function
    4.2.2.  The pick_bucket() function
    4.2.3.  The fill_bucket() function
    4.2.4.  The calcProbNative() function
5.  Rationale
  5.1.  Rationale: Blame for Queuing, not for Rate in Itself
  5.2.  Rationale for Aging the Queuing Score
  5.3.  Rationale for Normalized Queuing Score
  5.4.  Rationale for Policy Conditions
  5.5.  Rationale for Reclassification as the Policy Action
6.  Limitations
7.  IANA Considerations (to be removed by RFC Editor)
8.  Security Considerations
9.  Acknowledgements
10. References
  10.1.  Normative References
  10.2.  Informative References
Authors' Addresses

1. Introduction

This informational document explains the specification of the queue protection (QProt) algorithm used in DOCSIS technology since version 3.1 [DOCSIS].

Although the algorithm is defined in annex P of [DOCSIS], it relies on cross-references to other parts of the set of specs. This document pulls all the strands together into one self-contained document. The core of the document is a similar pseudocode walk-through to that in the DOCSIS spec, but it also includes additional material: i) a brief overview; ii) a definition of how a data sender needs to behave to avoid triggering queue protection; and iii) a section giving the rationale for the design choices.

Low queuing delay depends on hosts sending their data smoothly, either at low rate or responding to explicit congestion notifications (ECN). So low queuing latency is something hosts create themselves, not something the network gives them. Therefore, self-interest gives no incentive for flows to mis-mark their packets for the low latency queue. However, traffic from an application that does not behave in a non-queue-building way might erroneously be classified into a low latency queue, whether accidentally or maliciously.
QProt protects other traffic in the low latency queue from the harm due to excess queuing that would otherwise be caused by such anomalous behaviour.

In normal scenarios without misclassified traffic, QProt is not expected to intervene at all in the classification or forwarding of packets.

An overview of how low latency support has been added to DOCSIS technology is given in [I-D.white-tsvwg-lld]. In each direction of a DOCSIS link (upstream and downstream), there are two queues: one for Low Latency and one for Classic traffic, in an arrangement similar to the IETF's Coupled DualQ AQM [I-D.ietf-tsvwg-aqm-dualq-coupled]. The two queues enable a transition from 'Classic' to 'Scalable' congestion control so that low latency can become the norm for any application, including ones seeking both full throughput and low latency, not just low-rate applications that have been more traditionally associated with a low latency service. The Classic queue is only necessary for traffic such as traditional (Reno/Cubic) TCP that needs about a round trip of buffering to fully utilize the link, and therefore has no incentive to mismark itself as low latency. The QProt function is located at the ingress to the Low Latency queue. Therefore, in the upstream QProt is located on the cable modem (CM), and in the downstream it is located on the cable CMTS (CM Termination System). If an arriving packet triggers queue protection, the QProt algorithm reclassifies the packet from the Low Latency queue into the Classic queue.

If QProt is used in settings other than DOCSIS links, it would be a simple matter to detect queue-building flows by using slightly different conditions, and/or to trigger a different action as a consequence, as appropriate for the scenario, e.g., dropping instead of reclassifying packets or perhaps accumulating a second per-flow score to decide whether to redirect a whole flow rather than just certain packets. Such work is for future study and out of scope of the present document.

The algorithm is based on a rigorous approach to quantifying how much each flow contributes to congestion, which is used in economics to allocate responsibility for the cost of one party's behaviour on others (the economic externality). Another important feature of the approach is that the metric used for the queuing score is based on the same variable that determines the level of ECN signalling seen by the sender [RFC8311], [I-D.ietf-tsvwg-ecn-l4s-id]. This makes the internal queuing score visible externally as ECN markings. This transparency is necessary to be able to objectively state (in Section 3) how a flow can keep on the 'good' side of the algorithm.

1.1. Document Roadmap

The core of the document is the walk-through of the DOCSIS QProt algorithm's pseudocode in Section 4.

Prior to that, two brief sections provide a "bluffer's guide to QProt" which should suffice for those who do not need the details or the insights. Section 2 summarizes the approach used in the algorithm. Then Section 3 considers QProt from the perspective of the end-system, by defining the behaviour that a flow needs to comply with to avoid the QProt algorithm ejecting its packets from the low latency queue.

Section 5 gives deeper insight into the principles and rationale behind the algorithm. Then Section 6 explains the limitations of the approach, followed by the usual closing sections.

1.2. Terminology

The normative language for the DOCSIS QProt algorithm is in the DOCSIS specs [DOCSIS], [DOCSIS-CM-OSS], [DOCSIS-CCAP-OSS], not in this informational guide. If there is any inconsistency, the DOCSIS specs take precedence.

The following terms and abbreviations are used:

CM:  Cable Modem

CMTS:  CM Termination System

Congestion-rate:  The rate at which a flow induces ECN-marked (or dropped) bytes, where an ECN-mark on a packet is defined as marking all the packet's bytes. Congestion-bit-rate and congestion-volume were introduced in [RFC7713] and [RFC6789].

DOCSIS:  Data Over Cable System Interface Specification. "DOCSIS" is a registered trademark of Cable Television Laboratories, Inc. ("CableLabs").

Non-queue-building:  A flow that tends not to build a queue.

Queue-building:  A flow that builds a queue. If it is classified into the Low Latency queue, it is therefore a candidate for the queue protection algorithm to detect and sanction.

ECN:  Explicit Congestion Notification

QProt:  The Queue Protection function

1.3. Copyright Material

Parts of this document are reproduced from [DOCSIS] with kind permission of the copyright holder, Cable Television Laboratories, Inc. ("CableLabs").

2. Approach - In Brief

The algorithm is divided into mechanism and policy. There is only a tiny amount of policy code, but policy might need to be changed in the future.
So, where hardware implementation is being considered, it would be advisable to implement the policy aspects in firmware or software:

*  The mechanism aspects identify flows, maintain flow-state and accumulate per-flow queuing scores;

*  The policy aspects can be divided into conditions and actions:

   -  The conditions are the logic that determines when action should be taken to avert the risk of queuing delay becoming excessive;

   -  The actions determine how this risk is averted, e.g., by redirecting packets from a flow into another queue, or by reclassifying a whole flow that seems to be misclassified.

2.1. Mechanism

The algorithm maintains per-flow-state, where 'flow' usually means an end-to-end (layer-4) 5-tuple. The flow-state consists of a queuing score normalized to also represent the flow-state's own expiry time (explained in Section 5.3). A higher queuing score pushes out the expiry time further.

Non-queue-building flows tend to release their flow-state rapidly: it usually expires reasonably early in the gap between the packets of a normal flow. Then the memory can be recycled for packets from other flows that arrive in between. So only queue-building flows hold flow-state persistently.

The simplicity and effectiveness of the algorithm are due to the definition of the queuing score. It uses the internal variable from the native AQM, probNative, that determines the ECN marking probability of the low latency queue alone (Classic queuing is not relevant). The actual DOCSIS QProt algorithm is defined using integer arithmetic, but in the floating point arithmetic used in this document, (0 <= probNative <= 1). The queuing score algorithm accumulates the size of each arriving packet of a flow scaled by the value of probNative at the time.

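As a rough illustration of this accumulation, before any aging is applied, the following Python sketch sums packet sizes weighted by the value of probNative at each arrival. The function name, the two packet lists and the probability values are hypothetical and chosen only for illustration; the DOCSIS algorithm itself uses integer arithmetic and a score normalized into time units.

```python
def accumulate_score(packets):
    """Return the un-aged queuing score in 'congested bytes'.

    packets: iterable of (size_bytes, prob_native) pairs, where
    prob_native is the marking probability at the packet's arrival.
    """
    score = 0.0
    for size_bytes, prob_native in packets:
        if not 0.0 <= prob_native <= 1.0:
            raise ValueError("prob_native must lie in [0, 1]")
        # Each packet adds its size scaled by the congestion level
        score += size_bytes * prob_native
    return score


# Two hypothetical flows, each sending four 1500 B packets.
# 'smooth' mostly arrives while the queue is short (probNative = 0);
# 'bursty' arrives while the queue is building.
smooth = [(1500, 0.0), (1500, 0.0), (1500, 0.125), (1500, 0.0)]
bursty = [(1500, 0.5), (1500, 0.75), (1500, 1.0), (1500, 1.0)]

smooth_score = accumulate_score(smooth)
bursty_score = accumulate_score(bursty)
```

With these made-up inputs both flows send the same number of bytes, but the smooth flow accumulates 187.5 congested bytes whereas the bursty flow accumulates 4875, because the score weights each packet by the congestion level it arrives into.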
The algorithm as described so far would accumulate a number that would rise at the so-called congestion-rate of the flow, i.e., the rate at which the flow is contributing to congestion, or the rate at which the AQM is forwarding bytes of the flow that are ECN marked. However, rather than growing continually, the queuing score is also reduced (or 'aged') at a constant rate. This gives each flow an allowed congestion rate, which is justified in Section 5.2.

In practice, the queuing score is normalized into time units (so that it represents the expiry time of the flow state, as already discussed above). Then it does not need to be explicitly aged, because the natural passage of time implicitly 'ages' an expiry time (explained in Section 5.3).

2.2. Policy

The algorithm uses the queuing score to determine whether to eject each packet only at the time it first arrives. This limits the policies available. For instance, when queuing delay exceeds a threshold, it is not possible to eject a packet from the flow with the highest queuing score, because that would involve searching the queue for such a packet (if indeed one were still in the queue). Nonetheless, it is still possible to develop a policy that protects the low latency of the queue by making the queuing score threshold stricter the greater the excess of queuing delay relative to the threshold (explained in Section 5.4).

In the DOCSIS QProt spec at the time of writing, when the policy conditions are met the action taken to protect the low latency queue is to reclassify a packet into the Classic queue (justified in Section 5.5).

3. Necessary Flow Behaviour

The QProt algorithm described here can be used for responsive and/or unresponsive flows.

*  It is possible to objectively describe the least responsive way that a flow will need to respond to congestion signals in order to avoid triggering queue protection, no matter the link capacity and no matter how much other traffic there is.

*  It is not possible to describe how fast or smooth an unresponsive flow should be to avoid queue protection, because this depends on how much other traffic there is and the capacity of the link, which an application is unable to know. However, the more smoothly an unresponsive flow paces its packets and the lower its rate relative to typical broadband link capacities, the less likely it is to cause enough queuing to trigger queue protection.

Responsive low latency flows can use an L4S ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id] to get classified into the low latency queue.

A sender can arrange for flows that are smooth but do not respond to ECN marking to be classified into the low latency queue by using the Non-Queue-Building (NQB) Diffserv codepoint [I-D.ietf-tsvwg-nqb], which the DOCSIS specs support, or an operator can use various other local classifiers.

The QProt algorithm is driven from the same variable that drives the ECN marking probability in the low latency queue (see the Immediate Active Queue Management Annex in [DOCSIS]). The algorithm that calculates this internal variable is run on the arrival of every packet, whether it is ECN-capable or not, so that it can be used by the QProt algorithm. But the variable is only used to ECN-mark packets that are ECN-capable.

Not only does this dual use of the variable improve processing efficiency, but it also makes the basis of the QProt algorithm visible and transparent, at least for responsive ECN-capable flows.

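The dual use of the variable can be sketched as follows. This is a simplified Python illustration, not the DOCSIS IAQM: it assumes a fixed probNative, marks each ECN-capable packet independently at random, and ignores any coupling with the Classic queue. All names and values are hypothetical.

```python
import random


def simulate(n_packets, size_bytes, prob_native, ecn_capable, rng):
    """Return (un-aged queuing score, bytes actually ECN-marked)."""
    score = 0.0        # queuing score in 'congested bytes'
    marked_bytes = 0   # bytes carried in packets that were ECN-marked
    for _ in range(n_packets):
        # probNative is evaluated for every packet and always feeds
        # the queuing score, whether or not the packet is ECN-capable
        score += prob_native * size_bytes
        # ...but only ECN-capable packets can actually be marked
        if ecn_capable and rng.random() < prob_native:
            marked_bytes += size_bytes
    return score, marked_bytes


rng = random.Random(1)  # seeded so that repeated runs are identical
score, marked = simulate(10_000, 1000, 0.125, True, rng)
score_nqb, marked_nqb = simulate(10_000, 1000, 0.125, False, rng)
```

For the ECN-capable flow, the bytes that end up marked track the accumulated score on average, which is what makes the basis of the score visible to the sender. For the flow that is not ECN-capable, the score accumulates in the same way although no packet is marked.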
Consequently, it is possible to state objectively that a flow can avoid triggering queue protection by keeping the bit rate of ECN-marked packets (the congestion-rate) below AGING, which is a configured constant of the algorithm (default 2^19 B/s ~= 4.2 Mb/s). Note that it is in a congestion controller's own interest to keep its average congestion-rate well below this level (e.g., ~1 Mb/s), to ensure that it does not trigger queue protection during transient dynamics.

If the QProt algorithm is used in other settings, it would still need to be based on the visible level of congestion signalling, in a similar way to the DOCSIS approach. Without transparency of the basis of the algorithm's decisions, end-systems would not be able to avoid triggering queue protection on an objective basis.

4. Pseudocode Walk-Through

4.1. Input Parameters, Constants and Variables

The operator input parameters that set the parameters in the first two blocks of pseudocode below are defined for cable modems (CMs) in [DOCSIS-CM-OSS] and for CMTSs in [DOCSIS-CCAP-OSS]. Then, further constants are either derived from the input parameters or hard-coded.

Defaults and units are shown in square brackets. Defaults (or indeed any aspect of the algorithm) are subject to change, so the latest DOCSIS specs are the definitive references. Also, any operator might set certain parameters to non-default values.

    // Input Parameters
    QPROTECT_ON        // Queue Protection is enabled [Default: TRUE]
    CRITICALqL_us      // L queue threshold delay [us] Default: MAXTH_us
    CRITICALqLSCORE_us // The threshold queuing score [Default: 4000us]
    LG_AGING           // The aging rate of the q'ing score [Default: 19]
                       //  as log base 2 of the congestion-rate [lg(B/s)]

    // Input Parameters for the calcProbNative() algorithm:
    MAXTH_us           // Max IAQM marking threshold [Default: 1000us]
    LG_RANGE           // Log base 2 of the range of ramp [lg(ns)]
                       //  Default: 19, i.e., 2^19 = 524288 ns
                       //  (roughly 525 us)

    // Constants, either derived from input parameters or hard-coded
    T_RES                                  // Resolution of t_exp [ns]
    // Convert units
    // (dividing by 2^30 approximates the 10^9 ns in 1 s)
    AGING = pow(2, (LG_AGING-30) ) * T_RES   // lg([B/s]) to [B/T_RES]
    CRITICALqL = CRITICALqL_us * 1000        // [us] to [ns]
    CRITICALqLSCORE = CRITICALqLSCORE_us * 1000 / T_RES
                                             // [us] to [T_RES]
    // Threshold for the q'ing score condition
    CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE
    qLSCORE_MAX = 5E9 / T_RES                // Max queuing score = 5 s

    ATTEMPTS = 2;  // Max attempts to pick a bucket (vendor-specific)
    BI_SIZE = 5;   // Bit-width of index number for non-default buckets
    NBUCKETS = pow(2, BI_SIZE);  // No. of non-default buckets
    MASK = NBUCKETS-1;           // convenient constant, filled with ones

    // Queue Protection exit states
    EXIT_SUCCESS  = 0;  // Forward the packet
    EXIT_SANCTION = 1;  // Redirect the packet

    MAX_PROB = 1;  // For integer arithmetic, would use a large int
                   //  e.g., 2^31, to allow space for overflow
    MAXTH = MAXTH_us * 1000;     // Max marking threshold [ns]
    // Minimum marking threshold of 2 MTU for slow links [ns]
    // (the factors 8 and 10^9 imply a frame size in B, a rate in b/s)
    FLOOR = 2 * 8 * MAX_FRAME_SIZE * 10^9 / MAX_RATE;
    RANGE = (1 << LG_RANGE);     // Range of ramp [ns]
    MINTH = max ( MAXTH - RANGE, FLOOR);
    MAXTH = MINTH + RANGE;       // Max marking threshold [ns]

The resolution for expressing time, T_RES, needs to be chosen to ensure that expiry times for buckets can represent times that are a fraction (e.g., 1/10) of the expected packet interarrival time for the system.

The following definitions explain the purpose of important variables and functions.

    // Public variables:
    qdelay        // The current queuing delay of the LL queue [ns]
    probNative    // Native marking probability of LL queue within [0,1]

    // External variables
    packet        // The structure holding packet header fields
    packet.size   // The size of the current packet [B]
    packet.uflow  // The flow identifier of the current packet
                  //  (e.g., 5-tuple or 4-tuple if IPsec)

    // Irrelevant details of DOCSIS function to return qdelay are removed
    qdelayL(...)  // Returns current delay of the low latency Q [ns]

Pseudocode for how the algorithm categorizes packets by flow ID to populate the variable packet.uflow is not given in detail here. The application's flow ID is usually defined by a common 5-tuple (or 4-tuple) of:

*  source and destination IP addresses of the innermost IP header found;

*  the protocol (IPv4) or next header (IPv6) field in this IP header;

*  either of:

   -  source and destination port numbers, for TCP, UDP, UDP-Lite, SCTP, DCCP, etc.

   -  Security Parameters Index (SPI) for IPsec Encapsulating Security Payload (ESP) [RFC4303].

The Microflow Classification section of the Queue Protection Annex of the DOCSIS spec [DOCSIS] defines various strategies to find these headers by skipping extension headers or encapsulations. If they cannot be found, the spec defines various less-specific 3-tuples that would be used. The DOCSIS spec should be referred to for all these strategies, which will not be repeated here.

The array of bucket structures defined below is used by all the Queue Protection functions:

    struct bucket { // The leaky bucket structure to hold per-flow state
       id;          // identifier (e.g., 5-tuple) of flow using bucket
       t_exp;       // expiry time in units of T_RES
                    // (t_exp - now) = flow's normalized q'ing score
    };
    // One extra bucket at index NBUCKETS is the default (the dregs)
    struct bucket buckets[NBUCKETS+1];

4.2. Queue Protection Data Path

All the functions of Queue Protection operate on the data path, driven by packet arrivals.

The following functions that maintain per-flow queuing scores and manage per-flow state are considered primarily as mechanism:

    pick_bucket(uflow_id);  // Returns bucket identifier

    fill_bucket(bucket_id, pkt_size, probNative);  // Returns queuing score

    calcProbNative(qdelay)  // Returns probability of ECN-marking

The following function is primarily concerned with policy:

    qprotect(packet, ...);  // Returns exit status to either forward or
                            // redirect the packet

('...' suppresses distracting detail.)

Future modifications to policy aspects are more likely than to mechanisms.
Therefore, policy aspects would be less appropriate candidates for any hardware acceleration.

The entry point to these functions is qprotect(), which is called from packet classification before each packet is enqueued into the appropriate queue, queue_id, as follows:

    classifier(packet) {
       // Determine which queue using ECN, DSCP and any local-use fields
       queue_id = classify(packet);
       // LQ & CQ are macros for valid queue IDs returned by classify()
       if (queue_id == LQ) {
          // if packet classified to Low Latency Service Flow
          if (QPROTECT_ON) {
             if (qprotect(packet, ...) == EXIT_SANCTION) {
                // redirect packet to Classic Service Flow
                queue_id = CQ;
             }
          }
       }
       return queue_id;
    }

4.2.1. The qprotect() function

On each packet arrival, qprotect() measures the current queue delay and derives the native marking probability from it. Then it uses pick_bucket() to find the bucket already holding the flow's state, or to allocate a new bucket if the flow is new or its state has expired (the most likely case). Then the queuing score is updated by the fill_bucket() function. That completes the mechanism aspects.

The comments against the subsequent policy conditions and actions should be self-explanatory at a superficial level. The deeper rationale for these conditions is given in Section 5.4.

    // Per packet queue protection
    qprotect(packet, ...) {

       bckt_id;  // bucket index
       qLscore;  // queuing score of pkt's flow in units of T_RES

       qdelay = qdelayL(...);
       probNative = calcProbNative(qdelay);

       bckt_id = pick_bucket(packet.uflow);
       // if (buckets[bckt_id].t_exp risks overflow) // Details not shown
       //    return EXIT_SANCTION;
       qLscore = fill_bucket(bckt_id, packet.size, probNative);

       // Determine whether to sanction packet
       if ( ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold...
              // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
              // ...exceeds CRITICALqLSCORE
              && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
            // or qLSCORE_MAX reached
            || ( qLscore >= qLSCORE_MAX ) )

          return EXIT_SANCTION;

       else
          return EXIT_SUCCESS;
    }

4.2.2. The pick_bucket() function

The pick_bucket() function is optimized for flow-state that will normally have expired from packet to packet of the same flow. It is just one way of finding the bucket associated with the flow ID of each packet; it might be possible to develop more efficient alternatives.

The algorithm is arranged so that the bucket holding any live (non-expired) flow-state associated with a packet will always be found before a new bucket is allocated. The constant ATTEMPTS, defined earlier, determines how many hashes are used to find a bucket for each flow (actually, only one hash is generated; then, by default, 5 bits of it at a time are used as the hash value, because by default there are 2^5 = 32 buckets).

The algorithm stores the flow's own ID in its flow-state. So, when a packet of a flow arrives, the algorithm tries up to ATTEMPTS times to hash to a bucket, looking for the flow's own ID. If found, it uses that bucket, first resetting the expiry time to 'now' if it has expired.

If it does not find the flow's ID, and the expiry time is still current, the algorithm can tell that another flow is using that bucket, and it continues to look for a bucket for the flow. Even if it finds another flow's bucket where the expiry time has passed, it doesn't immediately use it. It merely remembers it as the potential bucket to use. But first it runs through all the ATTEMPTS hashes to look for a bucket assigned to the flow ID.
Then, if a live bucket is not already associated with the packet's flow, the algorithm should have already set aside an existing bucket with a score that has aged out. Given this bucket is no longer necessary to hold state for its previous flow, it can be recycled for use by the present packet's flow.

If all else fails, there is one additional bucket (called the dregs) that can be used. If the dregs is still in live use by another flow, subsequent flows that cannot find a bucket of their own all share it, adding their score to the one in the dregs. A flow might get away with using the dregs on its own, but when there are many mis-marked flows, multiple flows are more likely to collide in the dregs, including innocent flows. The choice of number of buckets and number of hash attempts determines how likely it is that this undesirable scenario will occur.

    // Pick the bucket associated with flow uflw
    pick_bucket(uflw) {

       now;   // current time
       j;     // loop counter
       h32;   // holds hash of the packet's flow IDs
       h;     // bucket index being checked
       hsav;  // interim chosen bucket index

       h32 = hash32(uflw);    // 32-bit hash of flow ID
       hsav = NBUCKETS;       // Default bucket
       now = get_time_now();  // in units of T_RES

       // The for loop checks ATTEMPTS buckets for ownership by flow-ID
       // It also records the 1st bucket, if any, that could be recycled
       // because it's expired.
       // Must not recycle a bucket until all ownership checks completed
       for (j=0; j < ATTEMPTS; j++) {
          h = h32 & MASK;     // Use least signif. BI_SIZE bits of hash
          if (buckets[h].id == uflw) {       // Bucket owned by flow-ID
             if (buckets[h].t_exp <= now) {  // If bucket has expired...
                buckets[h].t_exp = now;      // ...reset it
             }
             return h;                       // Use it
          } else if ( (hsav == NBUCKETS)     // If no expired bucket yet
                      && (buckets[h].t_exp <= now) ) { // & this expired
             hsav = h;                       // set as interim bucket
          }
          h32 >>= BI_SIZE;    // Bit-shift hash for next attempt
       }
       // If reached here, no tested bucket was owned by the flow-ID
       if (hsav != NBUCKETS) {
          // If here, found an expired bucket within the above for loop
          buckets[hsav].t_exp = now;         // Reset expired bucket
       } else {
          // If here, we're having to use the default bucket (the dregs)
          if (buckets[hsav].t_exp <= now) {  // If dregs has expired...
             buckets[hsav].t_exp = now;      // ...reset it
          }
       }
       buckets[hsav].id = uflw;  // In either case, claim for recycling
       return hsav;
    }

4.2.3. The fill_bucket() function

The fill_bucket() function both accumulates and ages the queuing score over time, as outlined in Section 2.1. To make aging the score efficient, the increment of the queuing score is normalized to units of time by dividing by AGING, so that the result represents the new expiry time of the flow.

Given that probNative is already used to select which packets to ECN-mark, it might be thought that the queuing score could just be incremented by the full size of each selected packet, instead of incrementing it by the product of every packet's size (pkt_sz) and probNative. However, the unpublished experience of one of the authors with other congestion policers is that the score then increments far too jumpily, particularly when probNative is low.

A deeper explanation of the queuing score is given in Section 5.

    fill_bucket(bckt_id, pkt_sz, probNative) {
       now;      // current time
       qLscore;  // queuing score in units of T_RES
       now = get_time_now();  // in units of T_RES
       // Add packet's queuing score
       // For integer arithmetic, a bit-shift can replace the division
       qLscore = min(buckets[bckt_id].t_exp - now
                     + probNative * pkt_sz / AGING, qLSCORE_MAX);
       buckets[bckt_id].t_exp = now + qLscore;
       return qLscore;
    }

4.2.4. The calcProbNative() function

To derive this queuing score, the QProt algorithm uses the linear ramp function calcProbNative() to normalize instantaneous queuing delay into a probability in the range [0,1], which it assigns to probNative.


   calcProbNative(qdelay){
      if ( qdelay >= MAXTH ) {
         probNative = MAX_PROB;
      } else if ( qdelay > MINTH ) {
         probNative = MAX_PROB * (qdelay - MINTH)/RANGE;
         // In practice, the * and the / would use a bit-shift
      } else {
         probNative = 0;
      }
      return probNative;
   }

5. Rationale

5.1. Rationale: Blame for Queuing, not for Rate in Itself

   Figure 1 shows the bit rates of two flows as stacked areas. It poses
   the question of which flow is more to blame for queuing delay: the
   unresponsive constant bit rate flow (c) that is consuming about 80%
   of the capacity, or the flow sending regular short unresponsive
   bursts (b)? The smoothness of c seems better for avoiding queuing,
   but its high rate does not. However, if flow c were not there, or ran
   slightly more slowly, b would not cause any queuing.

   ^ bit rate (stacked areas)
   |  ,-.          ,-.          ,-.          ,-.          ,-.
   |--|b|----------|b|----------|b|----------|b|----------|b|---Capacity
   |__|_|__________|_|__________|_|__________|_|__________|_|_____
   |
   |                              c
   |
   |
   |
   +---------------------------------------------------------------->
                                                                 time

         Figure 1: Which is More to Blame for Queuing Delay?

   To explain queuing scores, in the following it will initially be
   assumed that the QProt algorithm is accumulating queuing scores, but
   not taking any action as a result.

   To quantify the responsibility that each flow bears for queuing
   delay, the QProt algorithm accumulates the product of the rate of
   each flow and the level of congestion, both measured at the instant
   each packet arrives. The instantaneous flow rate is represented at
   each discrete event when a packet arrives by the packet's size, which
   accumulates faster the more packets arrive within each unit of time.
   The level of congestion is normalized to a dimensionless number
   between 0 and 1 (probNative).
   This fractional congestion level is used in preference to a direct
   dependence on queuing delay for two reasons:

   *  to be able to ignore very low levels of queuing that contribute
      insignificantly to delay

   *  to be able to erect a steep barrier against excessive queuing
      delay

   The unit of the resulting queuing score is "congested-bytes" per
   second, which distinguishes it from just bytes per second.

   Then, during the periods between bursts (b), neither flow accumulates
   any queuing score - the high rate of c is benign. But, during each
   burst, if the rates of c and b are, say, 80% and 45% of capacity,
   thus causing 25% overload, they bear 80/125 and 45/125 of the
   responsibility for the queuing delay respectively (64% and 36%). The
   algorithm does not explicitly calculate these percentages. They are
   just the outcome of the number of packets arriving from each flow
   during the burst.

   To summarize, the queuing score never sanctions rate solely on its
   own account. It only sanctions rate inasmuch as it causes queuing.

   [Figure 2 artwork not recoverable: bit rate (stacked areas) against
   time for flows b, v and r, relative to Capacity]

     Figure 2: Responsibility for Queuing: More Complex Scenario

   Figure 2 gives a more complex illustration of the way the queuing
   score assigns responsibility for queuing (limited to the precision
   that ASCII art can illustrate). The figure shows the bit rates of
   three flows represented as stacked areas labelled b, v and r. The
   unresponsive bursts (b) are the same as in the previous example, but
   a variable rate video (v) replaces flow c. Its rate varies as the
   complexity of the video scene varies.
   Also, on a slower timescale, the video adapts its quality in response
   to the level of congestion. However, on a short timescale it appears
   to be unresponsive to small amounts of queuing. Also, part-way
   through, a low latency responsive flow (r) joins in, aiming to fill
   the balance of capacity left by the other two.

   The combination of the first burst and the low application-limited
   rate of the video causes neither flow to accumulate queuing score.
   In contrast, the second burst causes similar excessive overload
   (125%) to the example in Figure 1. Then, the video happens to reduce
   its rate (probably due to a less complex scene) so the third burst
   causes only a little congestion. If the resulting queue causes
   probNative to rise to just 1%, the queuing score will only accumulate
   1% of the size of each packet of flows v and b during this burst.

   The fourth burst happens to arrive just as the new responsive flow
   (r) has filled the available capacity, so it leads to very rapid
   growth of the queue. After a round trip the responsive flow rapidly
   backs off, and the adaptive video also backs off more rapidly than it
   would normally, because of the very high congestion level. The rapid
   response to congestion of flow r reduces the queuing score that all
   three flows accumulate, but they each still bear the cost in
   proportion to the product of the rates at which their packets arrive
   at the queue and the value of probNative when they do so. Thus,
   during the fifth burst, they all accumulate less score than during
   the fourth, because the queuing delay is not as excessive.

5.2. Rationale for Aging the Queuing Score

   Even well-behaved flows will not always be able to respond fast
   enough to dynamic events.
   Also, well-behaved flows, e.g., DCTCP [RFC8257], TCP Prague
   [I-D.briscoe-iccrg-prague-congestion-control], BBRv2 [BBRv2] or the
   L4S variant of SCReAM [SCReAM] for real-time media [RFC8298], can
   maintain a very shallow queue by continual careful probing for more
   capacity while also continually subtracting a little from their rate
   (or congestion window) in response to low levels of ECN signalling.
   Therefore, the QProt algorithm needs to continually offer a degree of
   forgiveness to age out the queuing score as it accumulates.

   Scalable congestion controllers such as those above maintain their
   congestion window in inverse proportion to the congestion level,
   probNative. That leads to the important property that on average a
   scalable flow holds the product of its congestion window and the
   congestion level constant, no matter the capacity of the link or how
   many other flows it competes with. For instance, if the link
   capacity doubles, a scalable flow induces half the congestion
   probability. Or if three scalable flows compete for the capacity,
   each flow will reduce to one third of the capacity it would use on
   its own and increase the congestion level by 3x.

   This suggests that the QProt algorithm will not sanction a
   well-behaved scalable flow if it ages out the queuing score at a
   sufficient constant rate. The constant will need to be somewhat
   above the average of a well-behaved scalable flow to allow for normal
   dynamics.

   Relating QProt's aging constant to a scalable flow does not mean that
   a flow has to behave like a scalable flow. It can be less
   aggressive, but not more. For instance, a longer RTT flow can run at
   a lower congestion-rate than the aging rate, but it can also increase
   its aggressiveness to equal the rate of short RTT scalable flows
   [ScalingCC].
   The constant aging of QProt also means that a long-running
   unresponsive flow will be prone to trigger QProt if it runs faster
   than a competing responsive scalable flow would. And, of course, if
   a flow causes excessive queuing in the short-term, its queuing score
   will still rise faster than the constant aging process will decrease
   it. Then QProt will still eject the flow's packets before they harm
   the low latency of the shared queue.

5.3. Rationale for Normalized Queuing Score

   The QProt algorithm holds a flow's queuing score state in a structure
   called a bucket, because of its similarity to a classic leaky bucket
   (except the contents of the bucket do not represent bytes).

   probNative * pkt_sz              probNative * pkt_sz / AGING
             |                                   |
          |  V  |                             |  V  |
          |  :  |            ___              |  :  |
          |_____|            ___              |_____|
          |     |            ___              |     |
          |__ __|                             |__ __|
             |                                   |
             V                                   V
        AGING * Dt                               Dt

              Figure 3: Normalization of Queuing Score

   The accumulation and aging of the queuing score is shown on the left
   of Figure 3 in token bucket form. Dt is the difference between the
   times when the scores of the current and previous packets were
   processed.

   A normalized equivalent of this token bucket is shown on the right of
   Figure 3, dividing both the input and output by the constant AGING
   rate. The result is a bucket-depth that represents time and it
   drains at the rate that time passes.

   As a further optimization, the time the bucket was last updated is
   not stored with the flow-state. Instead, when the bucket is
   initialized the queuing score is added to the system time 'now' and
   the resulting expiry time is written into the bucket. Subsequently,
   if the bucket has not expired, the incremental queuing score is added
   to the time already held in the bucket. Then the queuing score
   always represents the expiry time of the flow-state itself.
   This means that the queuing score does not need to be aged explicitly
   because it ages itself implicitly.

5.4. Rationale for Policy Conditions

   Pseudocode for the QProt policy conditions is given in Section 4.2.1
   within the second half of the qprotect() function. When each packet
   arrives, after finding its flow state and updating the queuing score
   of the packet's flow, the algorithm checks whether the shared queue
   delay exceeds a constant threshold CRITICALqL (e.g., 2 ms), as
   repeated below for convenience:

   if ( ( qdelay > CRITICALqL ) // Test if qdelay over threshold...
      // ...and if flow's q'ing score scaled by qdelay/CRITICALqL
      // ...exceeds CRITICALqLSCORE
      && ( qdelay * qLscore > CRITICALqLPRODUCT ) )
      // Recall that CRITICALqLPRODUCT = CRITICALqL * CRITICALqLSCORE

   If the queue delay threshold is exceeded, the flow's queuing score is
   temporarily scaled up by the ratio of the current queue delay to the
   threshold queuing delay, CRITICALqL (the reason for the scaling is
   given next). If this scaled up score exceeds another constant
   threshold CRITICALqLSCORE, the packet is ejected. The actual last
   line of code above multiplies both sides of the second condition by
   CRITICALqL to avoid a costly division.

   This approach allows each packet to be assessed once, as it arrives.
   Once queue delay exceeds the threshold, this has two implications:

   *  The current packet might be ejected even though there are packets
      already in the queue from flows with higher queuing scores.
      However, any flow that continues to contribute to the queue will
      have to send further packets, giving an opportunity to eject them
      as well, as they subsequently arrive.

   *  The next packets to arrive might not be ejected, because they
      might belong to flows with low queuing scores. In this case,
      queue delay could continue to rise with no opportunity to eject a
      packet.
      This is why the queuing score is scaled up by the current
      queue delay. Then, the more the queue has grown without ejecting
      a packet, the more the algorithm 'raises the bar' to further
      packets.

   The above approach is preferred over the extra per-packet processing
   cost of searching the buckets for the flow with the highest queuing
   score and searching the queue for one of its packets to eject (if one
   is still in the queue).

   Note that by default CRITICALqL_us is set to the maximum threshold of
   the ramp marking algorithm, MAXTH_us. However, there is some debate
   as to whether setting it to the minimum threshold instead would
   improve QProt performance. This would roughly double the ratio of
   qdelay to CRITICALqL, which is compared against the CRITICALqLSCORE
   threshold. So the threshold would have to be roughly doubled
   accordingly.

   Figure 4 explains this approach graphically. On the horizontal axis
   it shows actual harm, meaning the queuing delay in the shared queue.
   On the vertical axis it shows the behaviour record of the flow
   associated with the currently arriving packet, represented in the
   algorithm by the flow's queuing score. The shaded region represents
   the combination of actual harm and behaviour record that will lead to
   the packet being ejected.
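   The two policy conditions, and the way the score needed for ejection
   falls as queue delay grows, can also be sketched numerically. The
   Python below is only an illustration under stated assumptions:
   CRITICAL_QL uses the 2 ms example given earlier in this section, the
   value of CRITICAL_QLSCORE is assumed for the example, and the helper
   min_ejectable_score() is hypothetical (the algorithm itself never
   computes it, precisely to avoid the division).

```python
# Times are in microseconds. The values are assumptions for this
# sketch, not defaults taken from the specification.
CRITICAL_QL = 2000          # queue delay threshold (2 ms example)
CRITICAL_QLSCORE = 4000     # queuing score threshold (assumed value)
CRITICAL_QLPRODUCT = CRITICAL_QL * CRITICAL_QLSCORE


def should_eject(qdelay, qlscore):
    """Both policy conditions: actual harm AND a bad enough record."""
    return qdelay > CRITICAL_QL and qdelay * qlscore > CRITICAL_QLPRODUCT


def min_ejectable_score(qdelay):
    """Hypothetical helper: the score a flow must exceed to be ejected.

    Returns None while there is no actual harm. Above the delay
    threshold the bar falls in inverse proportion to queue delay.
    """
    if qdelay <= CRITICAL_QL:
        return None
    return CRITICAL_QLPRODUCT / qdelay
```

   For instance, with these assumed values a flow with a queuing score
   of 3000 us is passed over when the queue delay is 2500 us, because
   2500 * 3000 is below the product threshold, but the same score leads
   to ejection once the queue delay has reached 4000 us, where the score
   needed for ejection has halved to 2000 us.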

   Behaviour Record:
   Queueing Score of
   Arriving Packet's Flow
   ^
   |    +         |/ / / / / / / / / / / / / / / / / / /
   |     +  N     | / / / / / / / / / / / / / / / / / / /
   |      +       |/ / / / /                   / / / / /
   |       +      | / / / /  E (Eject packet)  / / / / /
   |        +     |/ / / / /                   / / / / /
   |         +    | / / / / / / / / / / / / / / / / / / /
   |           +  |/ / / / / / / / / / / / / / / / / / /
   |             +| / / / / / / / / / / / / / / / / / / /
   |              |+ / / / / / / / / / / / / / / / / / /
   |      N       |    + / / / / / / / / / / / / / / / / /
   |  (No actual  |         +/ / / / / / / / / / / / / / /
   |     harm)    |                + / / / / / / / / / / / /
   |              | P (Pass over)          +  ,/ / / / / / / /
   |              |       ^                         + /./ /_/
   +--------------+------------------------------------------>
              CRITICALqL        Actual Harm: Shared Queue Delay

        Figure 4: Graphical Explanation of the Policy Conditions

   The regions labelled 'N' represent cases where the first condition is
   not met - no actual harm - queue delay is below the critical
   threshold, CRITICALqL.

   The region labelled 'E' represents cases where there is actual harm
   (queue delay exceeds CRITICALqL) and the queuing score associated
   with the arriving packet is high enough to be able to eject it with
   certainty.

   The region labelled 'P' represents cases where there is actual harm,
   but the queuing score of the arriving packet is insufficient to eject
   it, so it has to be Passed over. This adds to queuing delay, but the
   alternative would be to sanction an innocent flow. It can be seen
   that, as actual harm increases, the judgement of innocence becomes
   increasingly stringent; the behaviour record of the next packet's
   flow does not have to be as bad to eject it.

   Conditioning ejection on actual harm helps prevent VPN packets being
   ejected unnecessarily. VPNs consisting of multiple flows can tend to
   accumulate queuing score faster than it is aged out, because the
   aging rate is intended for a single flow.
   However, whether or not some traffic is in a VPN, the queue delay
   threshold (CRITICALqL) will be no more likely to be exceeded. So
   conditioning ejection on actual harm helps reduce the chance that VPN
   traffic will be ejected by the QProt function.

5.5. Rationale for Reclassification as the Policy Action

   When the DOCSIS QProt algorithm deems that it is necessary to eject a
   packet to protect the Low Latency queue, it redirects the packet to
   the Classic queue. In the Low Latency DOCSIS architecture (as in
   Coupled DualQ AQMs generally), a scheduler is arranged to give the
   Low Latency queue conditional priority over the Classic queue, for
   instance using weighted round robin (WRR) with a high weight in
   favour of the Low Latency queue.

   Therefore, typically, an ejected packet will experience higher
   queuing delay than it would otherwise, and it could be re-ordered
   within its flow (assuming QProt does not eject all packets of an
   anomalous flow). The mild harm caused to the performance of the
   ejected packet's flow is deliberate. It gives senders a slight
   incentive to identify their packets correctly.

   If there were no such harm, there would be nothing to prevent all
   flows from identifying themselves as suitable for classification into
   the Low Latency queue, and just letting QProt sort the resulting
   aggregate into queue-building and non-queue-building flows. This
   might seem like a useful alternative to requiring senders to
   correctly identify their flows. However, handling of mis-classified
   flows is not without a cost. The more packets that have to be
   reclassified, the more often the delay of the Low Latency queue would
   exceed the threshold. Also, more memory would be required to hold the
   extra flow state.

   When a packet is redirected into the Classic queue, an operator might
   want to alter the identifier(s) that originally caused it to be
   classified into the Low Latency queue, so that the packet will not be
   classified into another low latency queue further downstream.
   However, redirection of occasional packets can be due to unusually
   high transient load just at the specific bottleneck, not necessarily
   at any other bottleneck, and not necessarily due to bad flow
   behaviour. Therefore, Section 5.4.1.2 of [I-D.ietf-tsvwg-ecn-l4s-id]
   precludes a network node from altering the end-to-end ECN field to
   exclude traffic from L4S treatment. Instead a local-use identifier
   ought to be used (e.g., Diffserv Codepoint or VLAN tag), so that each
   operator can apply its own policy, without prejudging what other
   operators ought to do.

   Although not supported in the DOCSIS specs, QProt could be extended
   to recognize that large numbers of redirected packets belong to the
   same flow. This might be detected when the bucket expiry time t_exp
   exceeds a threshold, which could also conveniently prevent overflow
   of the t_exp variable (see the qprotect() function in Section 4.2.1).
   Depending on policy and implementation capabilities, QProt could then
   install a classifier to redirect a whole flow into the Classic queue,
   with an idle timeout to remove stale classifiers. In these
   'persistent offender' cases, QProt might also overwrite each
   redirected packet's DSCP or clear its ECN field to Not-ECT, in order
   to protect other potential L4S queues downstream. The DOCSIS specs
   do not discuss sanctioning whole flows, so further discussion is
   beyond the scope of the present document.

6. Limitations

   The QProt algorithm groups packets with common layer-4 flow
   identifiers. It then uses this grouping to accumulate queuing scores
   and to sanction packets.

   This choice of identifier for grouping is pragmatic with no
   scientific basis. All the packets of a flow certainly pass between
   the same two endpoints. But some applications might initiate
   multiple flows between the same endpoints, e.g., for media, control,
   data, etc. Others might use common flow identifiers for all these
   streams. Also, a user might group multiple application flows within
   the same encrypted VPN between the same layer-4 tunnel endpoints.
   And even if there were a one-to-one mapping between flows and
   applications, there is no reason to believe that the rate at which
   congestion can be caused ought to be allocated on a per application
   flow basis.

   The use of a queuing score that excludes those aspects of flow rate
   that do not contribute to queuing (Section 5.1) goes some way to
   mitigating this limitation, because the algorithm does not judge
   responsibility for queuing delay primarily on the combined rate of a
   set of flows grouped under one flow ID.

7. IANA Considerations (to be removed by RFC Editor)

   This specification contains no IANA considerations.

8. Security Considerations

   The whole of this document concerns traffic security. It considers
   the security question of how to identify and eject traffic that does
   not comply with the non-queue-building behaviour required to use a
   shared low latency queue, whether accidentally or maliciously.

   The algorithm has been designed to fail gracefully in the face of
   traffic crafted to overrun the resources used for the algorithm's own
   processing and flow state. This means that non-queue-building flows
   will always be less likely to be sanctioned than queue-building
   flows. But an attack could be contrived to deplete resources in such
   a way that the proportion of innocent (non-queue-building) flows that
   are incorrectly sanctioned could increase.

   Section 8.2 of the L4S architecture [I-D.ietf-tsvwg-l4s-arch]
   introduces the problem of maintaining low latency by either
   self-restraint or enforcement, and places DOCSIS queue protection in
   context within a wider set of approaches to the problem.

9. Acknowledgements

   Thanks to Tom Henderson and to Adrian Farrel for their reviews of
   this document. The design of the QProt algorithm and the settings of
   the parameters benefited from discussion and critique from the
   participants of the cable industry working group on Low Latency
   DOCSIS. CableLabs funded Bob Briscoe's initial work on this
   document.

10. References

10.1. Normative References

   [DOCSIS]   CableLabs, "MAC and Upper Layer Protocols Interface
              (MULPI) Specification, CM-SP-MULPIv3.1", Data-Over-Cable
              Service Interface Specifications DOCSIS® 3.1 Version I17
              or later, 21 January 2019.

   [DOCSIS-CCAP-OSS]
              CableLabs, "CCAP Operations Support System Interface
              Spec", Data-Over-Cable Service Interface Specifications
              DOCSIS® 3.1 Version I14 or later, 21 January 2019.

   [DOCSIS-CM-OSS]
              CableLabs, "Cable Modem Operations Support System
              Interface Spec", Data-Over-Cable Service Interface
              Specifications DOCSIS® 3.1 Version I14 or later, 21
              January 2019.

   [I-D.ietf-tsvwg-ecn-l4s-id]
              De Schepper, K. and B. Briscoe, "Explicit Congestion
              Notification (ECN) Protocol for Very Low Queuing Delay
              (L4S)", Work in Progress, Internet-Draft,
              draft-ietf-tsvwg-ecn-l4s-id-23, 24 December 2021.

   [I-D.ietf-tsvwg-nqb]
              White, G. and T. Fossati, "A Non-Queue-Building Per-Hop
              Behavior (NQB PHB) for Differentiated Services", Work in
              Progress, Internet-Draft, draft-ietf-tsvwg-nqb-08, 25
              October 2021.

   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
              Notification (ECN) Experimentation", RFC 8311,
              DOI 10.17487/RFC8311, January 2018.

10.2. Informative References

   [BBRv2]    Cardwell, N., "TCP BBR v2 Alpha/Preview Release", github
              repository; Linux congestion control module.

   [I-D.briscoe-iccrg-prague-congestion-control]
              De Schepper, K., Tilmans, O., and B. Briscoe, "Prague
              Congestion Control", Work in Progress, Internet-Draft,
              draft-briscoe-iccrg-prague-congestion-control-00, 9 March
              2021.

   [I-D.ietf-tsvwg-aqm-dualq-coupled]
              De Schepper, K., Briscoe, B., and G. White, "DualQ Coupled
              AQMs for Low Latency, Low Loss and Scalable Throughput
              (L4S)", Work in Progress, Internet-Draft,
              draft-ietf-tsvwg-aqm-dualq-coupled-20, 24 December 2021.

   [I-D.ietf-tsvwg-l4s-arch]
              Briscoe, B., De Schepper, K., Bagnulo, M., and G. White,
              "Low Latency, Low Loss, Scalable Throughput (L4S) Internet
              Service: Architecture", Work in Progress, Internet-Draft,
              draft-ietf-tsvwg-l4s-arch-15, 24 December 2021.

   [I-D.white-tsvwg-lld]
              White, G., Sundaresan, K., and B. Briscoe, "Low Latency
              DOCSIS - Technology Overview", Work in Progress,
              Internet-Draft, draft-white-tsvwg-lld-00, 11 March 2019.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, DOI 10.17487/RFC4303, December 2005.

   [RFC6789]  Briscoe, B., Ed., Woundy, R., Ed., and A. Cooper, Ed.,
              "Congestion Exposure (ConEx) Concepts and Use Cases",
              RFC 6789, DOI 10.17487/RFC6789, December 2012.

   [RFC7713]  Mathis, M. and B. Briscoe, "Congestion Exposure (ConEx)
              Concepts, Abstract Mechanism, and Requirements", RFC 7713,
              DOI 10.17487/RFC7713, December 2015.

   [RFC8257]  Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
              and G.
              Judd, "Data Center TCP (DCTCP): TCP Congestion
              Control for Data Centers", RFC 8257, DOI 10.17487/RFC8257,
              October 2017.

   [RFC8298]  Johansson, I. and Z. Sarker, "Self-Clocked Rate Adaptation
              for Multimedia", RFC 8298, DOI 10.17487/RFC8298, December
              2017.

   [ScalingCC]
              Briscoe, B. and K. De Schepper, "Resolving Tensions
              between Congestion Control Scaling Requirements", Simula
              Technical Report TR-CS-2016-001 arXiv:1904.07605, July
              2017.

   [SCReAM]   Johansson, I., "SCReAM", github repository.

Authors' Addresses

   Bob Briscoe (editor)
   Independent
   United Kingdom

   Email: ietf@bobbriscoe.net
   URI:   http://bobbriscoe.net/

   Greg White
   CableLabs
   United States of America

   Email: G.White@CableLabs.com