idnits 2.17.1 draft-cheng-iccrg-delivery-rate-estimation-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC793], [RFC9000]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 141: '...f data delivered MAY be tracked in uni...' RFC 2119 keyword, line 330: '...owing parameters MUST be tracked to im...' RFC 2119 keyword, line 535: '... it is RECOMMENDED that the per-pack...' RFC 2119 keyword, line 549: '... is RECOMMENDED that applications or...' RFC 2119 keyword, line 574: '..., then the following approaches MAY be...' (2 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (8 November 2021) is 872 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Experimental ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Congestion Control Research Group Y. Cheng 3 Internet-Draft N. Cardwell 4 Intended status: Experimental S. Hassas Yeganeh 5 Expires: 12 May 2022 V. Jacobson 6 Google 7 8 November 2021 9 Delivery Rate Estimation 10 draft-cheng-iccrg-delivery-rate-estimation-01 12 Abstract 14 This document describes a generic algorithm for a transport protocol 15 sender to estimate the current delivery rate of its data. At a high 16 level, the algorithm estimates the rate at which the network 17 delivered the most recent flight of outbound data packets for a 18 single flow. In addition, it tracks whether the rate sample was 19 application-limited, meaning the transmission rate was limited by the 20 sending application rather than the congestion control algorithm. 21 This algorithm can be implemented in any transport protocol that 22 supports packet-delivery acknowledgment (thus far, open source 23 implementations are available for TCP [RFC793] and QUIC [RFC9000]). 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at https://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on 12 May 2022. 42 Copyright Notice 44 Copyright (c) 2021 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 49 license-info) in effect on the date of publication of this document. 50 Please review these documents carefully, as they describe your rights 51 and restrictions with respect to this document. Code Components 52 extracted from this document must include Simplified BSD License text 53 as described in Section 4.e of the Trust Legal Provisions and are 54 provided without warranty as described in the Simplified BSD License. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 59 2. Algorithm Overview . . . . . . . . . . . . . . . . . . . . . 3 60 2.1. Requirements . . . . . . . . . . . . . . . . . . . . . . 3 61 2.2. Estimating Delivery Rate . . . . . . . . . . . . . . . . 3 62 2.2.1. ACK Rate . . . . . . . . . . . . . . . . . . . . . . 4 63 2.2.2. Compression and Aggregation . . . . . . . . . . . . . 5 64 2.2.3. Send Rate . . . . . . . . . . . . . . . . . . . . . . 5 65 2.2.4. Delivery Rate . . . . . . . . . . . . . . . . . . . . 6 66 2.3. Tracking application-limited phases . . . . . . . . . . . 6 67 3. Detailed Algorithm . . . . . . . . . . . . . . . . . . . . . 7 68 3.1. Variables . . . . . . . . . . . . . . . . . . . . . . . . 7 69 3.1.1. Per-connection (C) state . . . . . . . . . . . . . . 7 70 3.1.2. Per-packet (P) state . . . . . . . . . . . . . . . . 8 71 3.1.3. Rate Sample (rs) Output . . . . . . . . . . . . . . . 8 72 3.2. Transmitting or retransmitting a data packet . . . . . . 9 73 3.3. Upon receiving an ACK . . . . . . . . . . . . . . . . . . 10 74 3.4. Detecting application-limited phases . . . . . . . . . . 11 75 4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 12 76 4.1. Offload Mechanisms . . . . . . . . . . . . . . . . . . . 12 77 4.2. Impact of ACK losses . . . . . . . . . . . . . . . . . . 12 78 4.3. Impact of packet reordering . . . . . . . . . . . . . . . 13 79 4.4. Impact of packet loss and retransmissions . . . . . . . . 13 80 4.5. Connections without SACK support . . . . . . . . . . . . 13 81 5. Implementation Status . . . . . . . . . . . . . . . . . . . . 14 82 6. Security Considerations . . . . . . . . . . . . . . . . . . . 15 83 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 84 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 15 85 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 86 9.1. Normative References . . . . . . . . . . . . . . . . . . 15 87 9.2. Informative References . . . . . . . . . . . . . . . . . 16 88 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 90 1. Introduction 92 This document describes a generic algorithm for a transport protocol 93 sender to estimate the current delivery rate of its data on the fly. 94 This technique has been used for a congestion control algorithm that 95 relies on fresh, reliable, and inexpensive delivery rate information 96 [draft-cardwell-iccrg-bbr-congestion-control] [CCGHJ17]. 98 At a high level, the algorithm estimates the rate at which the 99 network delivered the most recent flight of outbound data packets for 100 a single flow. In addition, it tracks whether the rate sample was 101 application-limited, meaning the transmission rate was limited by the 102 sending application rather than the congestion control algorithm. 104 Each acknowledgment that cumulatively or selectively acknowledges 105 that the network has delivered new data produces a rate sample which 106 records the amount of data delivered over the time interval between 107 the transmission of a data packet and the acknowledgment of that 108 packet. The samples reflect the recent goodput through some 109 bottleneck, which may reside either in the network or on the end 110 hosts (sender or receiver). 112 2. Algorithm Overview 114 2.1. Requirements 116 This algorithm can be implemented in any transport protocol that 117 supports packet-delivery acknowledgment (so far, implementations are 118 available for TCP [RFC793] and QUIC [RFC9000]). This algorithm 119 requires a small amount of added logic on the sender, and requires 120 that the sender maintain a small amount of additional per-packet 121 state for packets sent but not yet delivered. In the most general 122 case it requires high-precision (microsecond-granularity or better) 123 timestamps on the sender (though millisecond-granularity may suffice 124 for lower bandwidths). It does not require any receiver or network 125 changes. While selective acknowledgments for out-of-order data 126 (e.g., [RFC2018]) are not required, such a mechanism is highly 127 recommended for accurate estimation during reordering and loss 128 recovery phases. 130 2.2. Estimating Delivery Rate 132 A delivery rate sample records the estimated rate at which the 133 network delivered packets for a single flow, calculated over the time 134 interval between the transmission of a data packet and the 135 acknowledgment of that packet. Since the rate samples only include 136 packets actually cumulatively and/or selectively acknowledged, the 137 sender knows the exact octets that were delivered to the receiver 138 (not lost), and the sender can can compute an estimate of a 139 bottleneck delivery rate over that time interval. 141 The amount of data delivered MAY be tracked in units of either octets 142 or packets. Tracking data in units of octets is more accurate, since 143 packet sizes can vary. But for some purposes, including congestion 144 control, tracking data in units of packets may suffice. 146 2.2.1. ACK Rate 148 First, consider the rate at which data is acknowledged by the 149 receiver. In this algorithm, the computation of the ACK rate models 150 the average slope of a hypothetical "delivered" curve that tracks the 151 cumulative quantity of data delivered so far on the Y axis, and time 152 elapsed on the X axis. Since ACKs arrive in discrete events, this 153 "delivered" curve forms a step function, where each ACK causes a 154 discrete increase in the "delivered" count that causes a vertical 155 upward step up in the curve. This "ack_rate" computation is the 156 average slope of the "delivered" step function, as measured from the 157 "knee" of the step (ACK) preceding the transmit to the "knee" of the 158 step (ACK) for packet P. 160 Given this model, the ack rate sample "slope" is computed as the 161 ratio between the amount of data marked as delivered over this time 162 interval, and the time over which it is marked as delivered: 164 ack_rate = data_acked / ack_elapsed 166 To calculate the amount of data ACKed over the interval, the sender 167 records in per-packet state "P.delivered", the amount of data that 168 had been marked delivered before transmitting packet P, and then 169 records how much data had been marked delivered by the time the ACK 170 for the packet arrives (in "C.delivered"), and computes the 171 difference: 173 data_acked = C.delivered - P.delivered 175 To compute the time interval, "ack_elapsed", one might imagine that 176 it would be feasible to use the round-trip time (RTT) of the packet. 177 But it is not safe to simply calculate a bandwidth estimate by using 178 the time between the transmit of a packet and the acknowledgment of 179 that packet. Transmits and ACKs can happen out of phase with each 180 other, clocked in separate processes. In general transmits often 181 happen at some point later than the most recent ACK, due to 182 processing or pacing delays. Because of this effect, drastic over- 183 estimates can happen if a sender were to attempt to estimate 184 bandwidth by using the round-trip time. 186 This document specifies the following approach for computing 187 "ack_elapsed". The starting time is "P.delivered_time", the time of 188 the delivery curve "knee" from the ACK preceding the transmit. The 189 ending time is "C.delivered_time", the time of the delivery curve 190 "knee" from the ACK for P. Then we compute "ack_elapsed" as: 192 ack_elapsed = C.delivered_time - P.delivered_time 194 This yields our equation for computing the ACK rate, as the "slope" 195 from the "knee" preceding the transmit to the "knee" at ACK: 197 ack_rate = data_acked / ack_elapsed 198 ack_rate = (C.delivered - P.delivered) / 199 (C.delivered_time - P.delivered_time) 201 2.2.2. Compression and Aggregation 203 For computing the delivery_rate, the sender prefers ack_rate, the 204 rate at which packets were acknowledged, since this usually the most 205 reliable metric. However, this approach of directly using "ack_rate" 206 faces a challenge when used with paths featuring aggregation, 207 compression, or ACK decimation, which are prevalent [A15]. In such 208 cases, ACK arrivals can temporarily make it appear as if data packets 209 were delivered much faster than the bottleneck rate. To filter out 210 such implausible ack_rate samples, we consider the send rate for each 211 flight of data, as follows. 213 2.2.3. Send Rate 215 The sender calculates the send rate, "send_rate", for a flight of 216 data as follows. Define "P.first_sent_time" as the time of the first 217 send in a flight of data, and "P.sent_time" as the time the final 218 send in that flight of data (the send that transmits packet "P"). 219 The elapsed time for sending the flight is: 221 send_elapsed = (P.sent_time - P.first_sent_time) 223 Then we calculate the send_rate as: 225 send_rate = data_acked / send_elapsed 227 Using our "delivery" curve model above, the send_rate can be viewed 228 as the average slope of a "send" curve that traces the amount of data 229 sent on the Y axis, and the time elapsed on the X axis: the average 230 slope of the transmission of this flight of data. 232 2.2.4. Delivery Rate 234 Since it is physically impossible to have data delivered faster than 235 it is sent in a sustained fashion, when the estimator notices that 236 the ack_rate for a flight is faster than the send rate for the 237 flight, it filters out the implausible ack_rate by capping the 238 delivery rate sample to be no higher than the send rate. 240 More precisely, over the interval between each transmission and 241 corresponding ACK, the sender calculates a delivery rate sample, 242 "delivery_rate", using the minimum of the rate at which packets were 243 acknowledged or the rate at which they were sent: 245 delivery_rate = min(send_rate, ack_rate) 247 Since ack_rate and send_rate both have data_acked as a numerator, 248 this can be computed more efficiently with a single division (instead 249 of two), as follows: 251 delivery_elapsed = max(ack_elapsed, send_elapsed) 252 delivery_rate = data_acked / delivery_elapsed 254 2.3. Tracking application-limited phases 256 In application-limited phases the transmission rate is limited by the 257 application rather than the congestion control algorithm. Modern 258 transport protocol connections are often application-limited, either 259 due to request/response workloads (e.g. Web traffic, RPC traffic) or 260 because the sender transmits data in chunks (e.g. adaptive streaming 261 video). 263 Knowing whether a delivery rate sample was application-limited is 264 crucial for congestion control algorithms and applications to use the 265 estimated delivery rate samples properly. For example, congestion 266 control algorithms may not want to react to a delivery rate that is 267 lower simply because the sender is application-limited; for 268 congestion control the key metric is the rate at which the network 269 path can deliver data, and not simply the rate at which the 270 application happens to be transmitting data at any moment. 272 To track this, the estimator marks a bandwidth sample as application- 273 limited if there was some moment during the sampled flight of data 274 packets when there was no data ready to send. 276 An application-limited phase starts when the sending application 277 requests to send new data, or the connection's retransmission 278 mechanisms decide to retransmit data, and the connection meets all of 279 the following conditions 280 1. The transport send buffer has less than one SMSS of unsent data 281 available to send. 283 2. The sending flow is not currently in the process of transmitting 284 a packet. 286 3. The amount of data considered in flight is less than the 287 congestion window (cwnd). 289 4. All the packets considered lost have been retransmitted. 291 If these conditions are all met then the sender has run out of data 292 to feed the network. This would effectively create a "bubble" of 293 idle time in the data pipeline. This idle time means that any 294 delivery rate sample obtained from this data packet, and any rate 295 sample from a packet that follows it in the next round trip, is going 296 to be an application-limited sample that potentially underestimates 297 the true available bandwidth. Thus, when the algorithm marks a 298 transport flow as application-limited, it marks all bandwidth samples 299 for the next round trip as application-limited (at which point, the 300 "bubble" can be said to have exited the data pipeline). 302 3. Detailed Algorithm 304 3.1. Variables 306 3.1.1. Per-connection (C) state 308 This algorithm requires the following new state variables for each 309 transport connection: 311 C.delivered: The total amount of data (measured in octets or in 312 packets) delivered so far over the lifetime of the transport 313 connection. This does not include pure ACK packets. 315 C.delivered_time: The wall clock time when C.delivered was last 316 updated. 318 C.first_sent_time: If packets are in flight, then this holds the send 319 time of the packet that was most recently marked as delivered. Else, 320 if the connection was recently idle, then this holds the send time of 321 most recently sent packet. 323 C.app_limited: The index of the last transmitted packet marked as 324 application-limited, or 0 if the connection is not currently 325 application-limited. 327 We also assume that the transport protocol sender implementation 328 tracks the following state per connection. If the following state 329 variables are not tracked by an existing implementation, all the 330 following parameters MUST be tracked to implement this algorithm: 332 C.write_seq: The data sequence number one higher than that of the 333 last octet queued for transmission in the transport layer write 334 buffer. 336 C.pending_transmissions: The number of bytes queued for transmission 337 on the sending host at layers lower than the transport layer (i.e. 338 network layer, traffic shaping layer, network device layer). 340 C.lost_out: The number of packets in the current outstanding window 341 that are marked as lost. 343 C.retrans_out: The number of packets in the current outstanding 344 window that are being retransmitted. 346 C.pipe: The sender's estimate of the amount of data outstanding in 347 the network (measured in octets or packets). This includes data 348 packets in the current outstanding window that are being transmitted 349 or retransmitted and have not been SACKed or marked lost (e.g. "pipe" 350 from [RFC6675]). This does not include pure ACK packets. 352 3.1.2. Per-packet (P) state 354 This algorithm requires the following new state variables for each 355 packet that has been transmitted but not yet ACKed or SACKed: 357 P.delivered: C.delivered when the packet was sent from transport 358 connection C. 360 P.delivered_time: C.delivered_time when the packet was sent. 362 P.first_sent_time: C.first_sent_time when the packet was sent. 364 P.is_app_limited: true if C.app_limited was non-zero when the packet 365 was sent, else false. 367 P.sent_time: The time when the packet was sent. 369 3.1.3. Rate Sample (rs) Output 371 This algorithm provides its output in a RateSample structure rs, 372 containing the following fields: 374 rs.delivery_rate: The delivery rate sample (in most cases 375 rs.delivered / rs.interval). 377 rs.is_app_limited: The P.is_app_limited from the most recent packet 378 delivered; indicates whether the rate sample is application-limited. 380 rs.interval: The length of the sampling interval. 382 rs.delivered: The amount of data marked as delivered over the 383 sampling interval. 385 rs.prior_delivered: The P.delivered count from the most recent packet 386 delivered. 388 rs.prior_time: The P.delivered_time from the most recent packet 389 delivered. 391 rs.send_elapsed: Send time interval calculated from the most recent 392 packet delivered (see the "Send Rate" section above). 394 rs.ack_elapsed: ACK time interval calculated from the most recent 395 packet delivered (see the "ACK Rate" section above). 397 3.2. Transmitting or retransmitting a data packet 399 Upon transmitting or retransmitting a data packet, the sender 400 snapshots the current delivery information in per-packet state. This 401 will allow the sender to generate a rate sample later, in the 402 UpdateRateSample() step, when the packet is (S)ACKed. 404 If there are packets already in flight, then we need to start 405 delivery rate samples from the time we received the most recent ACK, 406 to try to ensure that we include the full time the network needs to 407 deliver all in-flight packets. If there are no packets in flight 408 yet, then we can start the delivery rate interval at the current 409 time, since we know that any ACKs after now indicate that the network 410 was able to deliver those packets completely in the sampling interval 411 between now and the next ACK. 413 Upon each packet transmission, the sender executes the following 414 steps: 416 SendPacket(Packet P): 417 if (SND.NXT == SND.UNA) /* no packets in flight yet? */ 418 C.first_sent_time = C.delivered_time = Now() 419 P.first_sent_time = C.first_sent_time 420 P.delivered_time = C.delivered_time 421 P.delivered = C.delivered 422 P.is_app_limited = (C.app_limited != 0) 424 3.3. Upon receiving an ACK 426 When an ACK arrives, the sender invokes GenerateRateSample() to fill 427 in a rate sample. For each packet that was newly SACKed or ACKed, 428 UpdateRateSample() updates the rate sample based on a snapshot of 429 connection delivery information from the time at which the packet was 430 last transmitted. UpdateRateSample() is invoked multiple times when 431 a stretched ACK acknowledges multiple data packets. In this case we 432 use the information from the most recently sent packet, i.e., the 433 packet with the highest "P.delivered" value. 435 /* Upon receiving ACK, fill in delivery rate sample rs. */ 436 GenerateRateSample(RateSample rs): 437 for each newly SACKed or ACKed packet P 438 UpdateRateSample(P, rs) 440 /* Clear app-limited field if bubble is ACKed and gone. */ 441 if (C.app_limited and C.delivered > C.app_limited) 442 C.app_limited = 0 444 if (rs.prior_time == 0) 445 return false /* nothing delivered on this ACK */ 447 /* Use the longer of the send_elapsed and ack_elapsed */ 448 rs.interval = max(rs.send_elapsed, rs.ack_elapsed) 450 rs.delivered = C.delivered - rs.prior_delivered 452 /* Normally we expect interval >= MinRTT. 453 * Note that rate may still be over-estimated when a spuriously 454 * retransmitted skb was first (s)acked because "interval" 455 * is under-estimated (up to an RTT). However, continuously 456 * measuring the delivery rate during loss recovery is crucial 457 * for connections that suffer heavy or prolonged losses. 458 */ 459 if (rs.interval < MinRTT(tp)) 460 rs.interval = -1 461 return false /* no reliable sample */ 463 if (rs.interval != 0) 464 rs.delivery_rate = rs.delivered / rs.interval 466 return true; /* we filled in rs with a rate sample */ 468 /* Update rs when packet is SACKed or ACKed. */ 469 UpdateRateSample(Packet P, RateSample rs): 470 if P.delivered_time == 0 471 return /* P already SACKed */ 473 C.delivered += P.data_length 474 C.delivered_time = Now() 476 /* Update info using the newest packet: */ 477 if (P.delivered > rs.prior_delivered) 478 rs.prior_delivered = P.delivered 479 rs.prior_time = P.delivered_time 480 rs.is_app_limited = P.is_app_limited 481 rs.send_elapsed = P.sent_time - P.first_sent_time 482 rs.ack_elapsed = C.delivered_time - P.delivered_time 483 C.first_sent_time = P.sent_time 485 /* Mark the packet as delivered once it's SACKed to 486 * avoid being used again when it's cumulatively acked. 487 */ 488 P.delivered_time = 0 490 3.4. Detecting application-limited phases 492 An application-limited phase starts when the connection decides to 493 send more data, at a point in time when the connection has run out of 494 data. Some decisions to send more data are triggered by the 495 application writing more data to the connection, and some are 496 triggered by loss detection (during ACK processing or upon the 497 triggering of a timer) estimating that some sequence ranges need to 498 be retransmitted. To detect all such cases, the algorithm calls 499 CheckIfApplicationLimited() to check for application-limited behavior 500 in the following situations: 502 * The sending application asks the transport layer to send more 503 data; i.e., upon each write from the application, before new 504 application data is enqueued in the transport send buffer or 505 transmitted. 507 * At the beginning of ACK processing, before updating the estimated 508 number of packets in flight, and before congestion control 509 modifies the cwnd or pacing rate. 511 * At the beginning of connection timer processing, for all timers 512 that might result in the transmission of one or more data 513 segments. For example: RTO timers, TLP timers, RACK reordering 514 timers, or Zero Window Probe timers. 516 When checking for application-limited behavior, the connection checks 517 all the conditions previously described in the "Tracking application- 518 limited phases" section, and if all are met then it marks the 519 connection as application-limited: 521 CheckIfApplicationLimited(): 522 if (C.write_seq - SND.NXT < SND.MSS and 523 C.pending_transmissions == 0 and 524 C.pipe < cwnd and 525 C.lost_out <= C.retrans_out) 526 C.app_limited = (C.delivered + C.pipe) ? : 1 528 4. Discussion 530 4.1. Offload Mechanisms 532 If a transport sender implementation uses an offload mechanism (such 533 as TSO, GSO, etc.) to combine multiple SMSS of data into a single 534 packet "aggregate" for the purposes of scheduling transmissions, then 535 it is RECOMMENDED that the per-packet state be tracked for each 536 packet "aggregate" rather than each SMSS. For simplicity this 537 document refers to such state as "per-packet", whether it is per 538 "aggregate" or per SMSS. 540 4.2. Impact of ACK losses 542 Delivery rate samples are generated upon receiving each ACK; ACKs may 543 contain both cumulative and selective acknowledgment information. 544 Losing an ACK results in losing the delivery rate sample 545 corresponding to that ACK, and generating a delivery rate sample at 546 later a time (upon the arrival of the next ACK). This can 547 underestimate the delivery rate due the artificially inflated 548 "rs.interval". As with any effect that can cause underestimation, it 549 is RECOMMENDED that applications or congestion control algorithms 550 using the output of this algorithm apply appropriate filtering to 551 mitigate the impact of this effect. 553 4.3. Impact of packet reordering 555 This algorithm is robust to packet reordering; it makes no 556 assumptions about the order in which packets are delivered or ACKed. 557 In particular, for a particular packet P, it does not matter which 558 packets are delivered between the transmission of P and the ACK of 559 packet P, since C.delivered will be incremented appropriately in any 560 case. 562 4.4. Impact of packet loss and retransmissions 564 There are several possible approaches for handling cases where a 565 delivery rate sample is based on an ACK or SACK for a retransmitted 566 packet. 568 If the transport protocol supports unambiguous ACKs for retransmitted 569 data sequence ranges (as in QUIC [RFC9000]) then the algorithm is 570 perfectly robust to retransmissions, because the starting packet, P, 571 for the sample can be unambiguously retrieved. 573 If the transport protocol, like TCP [RFC793], has ambiguous ACKs for 574 retransmitted sequence ranges, then the following approaches MAY be 575 used: 577 1. The sender MAY choose to filter out implausible delivery rate 578 samples, as described in the GenerateRateSample() step in the 579 "Upon receiving an ACK" section, by discarding samples whose 580 rs.interval is lower than the minimum RTT seen on the connection. 582 2. The sender MAY choose to skip the generation of a delivery rate 583 sample for a retransmitted sequence range. 585 4.5. Connections without SACK support 587 If the transport connection does not use SACK (i.e., either or both 588 ends of the connections do not accept SACK), then this algorithm can 589 be extended to estimate approximate delivery rates using duplicate 590 ACKs (much like Reno and [RFC5681] estimates that each duplicate ACK 591 indicates that a data packet has been delivered). The details of 592 this extension will be described in a future version of this draft. 594 5. Implementation Status 596 This section records the status of known implementations of the 597 algorithm defined by this specification at the time of posting of 598 this Internet-Draft, and is based on a proposal described in 599 [RFC7942]. The description of implementations in this section is 600 intended to assist the IETF in its decision processes in progressing 601 drafts to RFCs. Please note that the listing of any individual 602 implementation here does not imply endorsement by the IETF. 603 Furthermore, no effort has been spent to verify the information 604 presented here that was supplied by IETF contributors. This is not 605 intended as, and must not be construed to be, a catalog of available 606 implementations or their features. Readers are advised to note that 607 other implementations may exist. 609 According to [RFC7942], "this will allow reviewers and working groups 610 to assign due consideration to documents that have the benefit of 611 running code, which may serve as evidence of valuable experimentation 612 and feedback that have made the implemented protocols more mature. 613 It is up to the individual working groups to use this information as 614 they see fit". 616 As of the time of writing, the following implementations of this 617 algorithm have been publicly released: 619 * Linux TCP 621 - Source code URL: 623 o GPLv2 license: https://git.kernel.org/pub/scm/linux/kernel/g 624 it/torvalds/linux.git/tree/net/ipv4/tcp_rate.c 626 o BSD-style license: https://groups.google.com/d/msg/bbr- 627 dev/X0LbDptlOzo/EVgkRjVHBQAJ 629 - Source: Google 631 - Maturity: production 633 - License: dual-licensed: GPLv2 / BSD-style 635 - Contact: https://groups.google.com/d/forum/bbr-dev 637 - Last updated: September 24, 2021 639 * QUIC 641 - Source code URLs: 643 o https://source.chromium.org/chromium/chromium/src/+/master:n 644 et/third_party/quiche/src/quic/core/congestion_control/ 645 bandwidth_sampler.cc 647 o https://source.chromium.org/chromium/chromium/src/+/master:n 648 et/third_party/quiche/src/quic/core/congestion_control/ 649 bandwidth_sampler.h 651 - Source: Google 653 - Maturity: production 655 - License: BSD-style 657 - Contact: https://groups.google.com/d/forum/bbr-dev 659 - Last updated: October 5, 2021 661 6. Security Considerations 663 This proposal makes no changes to the underlying security of 664 transport protocols or congestion control algorithms. This algorithm 665 adds no security considerations beyond those involved in the existing 666 standard congestion control algorithm [RFC5681]. 668 7. IANA Considerations 670 This document makes no request of IANA. 672 Note to RFC Editor: this section may be removed on publication as an 673 RFC. 675 8. Acknowledgments 677 The authors would like to thank C. Stephen Gunn, Eric Dumazet, Ian 678 Swett, Jana Iyengar, Victor Vasiliev, Nandita Dukkipati, Pawel 679 Jurczyk, Biren Roy, David Wetherall, Amin Vahdat, Leonidas 680 Kontothanassis, and the YouTube, google.com, Bandwidth Enforcer, and 681 Google SRE teams for their invaluable help and support. We would 682 like to thank Hiren Panchasara, Bao Zheng, and Jonathan Morton for 683 feedback and suggestions on earlier versions of this document. 685 9. References 687 9.1. Normative References 689 [RFC793] Postel, J., "Transmission Control Protocol", September 690 1981. 692 [RFC2018] Mathis, M. and J. Mahdavi, "TCP Selective Acknowledgment 693 Options", RFC 2018, October 1996, 694 . 696 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 697 Control", RFC 5681, September 2009, 698 . 700 [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running 701 Code: The Implementation Status Section", July 2016. 703 [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., 704 and Y. Nishida, "A Conservative Loss Recovery Algorithm 705 Based on Selective Acknowledgment (SACK) for TCP", 706 RFC 6675, August 2012, 707 . 709 [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based 710 Multiplexed and Secure Transport", RFC 9000, 711 DOI 10.17487/RFC9000, May 2021, 712 . 714 9.2. Informative References 716 [draft-cardwell-iccrg-bbr-congestion-control] 717 Cardwell, N., Cheng, Y., Hassas Yeganeh, S., and V. 718 Jacobson, "BBR Congestion Control", Work in Progress, 719 Internet-Draft, draft-cardwell-iccrg-bbr-congestion- 720 control, November 2021, . 723 [CCGHJ17] Cardwell, N., Cheng, Y., Gunn, C., Hassas Yeganeh, S., and 724 V. Jacobson, "BBR: Congestion-Based Congestion Control", 725 Communications of the ACM Feb 2017, February 2017, 726 . 729 [A15] Abrahamsson, M., "TCP ACK suppression", IETF AQM mailing 730 list , November 2015, . 733 Authors' Addresses 735 Yuchung Cheng 736 Google 738 Email: ycheng@google.com 739 Neal Cardwell 740 Google 742 Email: ncardwell@google.com 744 Soheil Hassas Yeganeh 745 Google 747 Email: soheil@google.com 749 Van Jacobson 750 Google 752 Email: vanj@google.com