idnits 2.17.1 draft-cheng-iccrg-delivery-rate-estimation-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The abstract seems to contain references ([RFC793], [RFC9000]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 141: '...f data delivered MAY be tracked in uni...' RFC 2119 keyword, line 331: '...owing parameters MUST be tracked to im...' RFC 2119 keyword, line 536: '... it is RECOMMENDED that the per-pack...' RFC 2119 keyword, line 550: '... is RECOMMENDED that applications or...' RFC 2119 keyword, line 575: '..., then the following approaches MAY be...' (2 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (7 March 2022) is 780 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Experimental ---------------------------------------------------------------------------- ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Congestion Control Research Group Y. Cheng 3 Internet-Draft N. Cardwell 4 Intended status: Experimental S. Hassas Yeganeh 5 Expires: 8 September 2022 V. Jacobson 6 Google 7 7 March 2022 9 Delivery Rate Estimation 10 draft-cheng-iccrg-delivery-rate-estimation-02 12 Abstract 14 This document describes a generic algorithm for a transport protocol 15 sender to estimate the current delivery rate of its data. At a high 16 level, the algorithm estimates the rate at which the network 17 delivered the most recent flight of outbound data packets for a 18 single flow. In addition, it tracks whether the rate sample was 19 application-limited, meaning the transmission rate was limited by the 20 sending application rather than the congestion control algorithm. 21 This algorithm can be implemented in any transport protocol that 22 supports packet-delivery acknowledgment (thus far, open source 23 implementations are available for TCP [RFC793] and QUIC [RFC9000]). 25 Status of This Memo 27 This Internet-Draft is submitted in full conformance with the 28 provisions of BCP 78 and BCP 79. 30 Internet-Drafts are working documents of the Internet Engineering 31 Task Force (IETF). Note that other groups may also distribute 32 working documents as Internet-Drafts. The list of current Internet- 33 Drafts is at https://datatracker.ietf.org/drafts/current/. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 This Internet-Draft will expire on 8 September 2022. 42 Copyright Notice 44 Copyright (c) 2022 IETF Trust and the persons identified as the 45 document authors. All rights reserved. 47 This document is subject to BCP 78 and the IETF Trust's Legal 48 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 49 license-info) in effect on the date of publication of this document. 50 Please review these documents carefully, as they describe your rights 51 and restrictions with respect to this document. Code Components 52 extracted from this document must include Revised BSD License text as 53 described in Section 4.e of the Trust Legal Provisions and are 54 provided without warranty as described in the Revised BSD License. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 59 2. Algorithm Overview . . . . . . . . . . . . . . . . . . . . . 3 60 2.1. Requirements . . . . . . . . . . . . . . . . . . . . . . 3 61 2.2. Estimating Delivery Rate . . . . . . . . . . . . . . . . 3 62 2.2.1. ACK Rate . . . . . . . . . . . . . . . . . . . . . . 4 63 2.2.2. Compression and Aggregation . . . . . . . . . . . . . 5 64 2.2.3. Send Rate . . . . . . . . . . . . . . . . . . . . . . 5 65 2.2.4. Delivery Rate . . . . . . . . . . . . . . . . . . . . 6 66 2.3. Tracking application-limited phases . . . . . . . . . . . 6 67 3. Detailed Algorithm . . . . . . . . . . . . . . . . . . . . . 7 68 3.1. Variables . . . . . . . . . . . . . . . . . . . . . . . . 7 69 3.1.1. Per-connection (C) state . . . . . . . . . . . . . . 7 70 3.1.2. Per-packet (P) state . . . . . . . . . . . . . . . . 8 71 3.1.3. Rate Sample (rs) Output . . . . . . . . . . . . . . . 8 72 3.2. Transmitting or retransmitting a data packet . . . . . . 9 73 3.3. Upon receiving an ACK . . . . . . . . . . . . . . . . . . 10 74 3.4. Detecting application-limited phases . . . . . . . . . . 11 75 4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 12 76 4.1. Offload Mechanisms . . . . . . . . . . . . . . . . . . . 12 77 4.2. Impact of ACK losses . . . . . . . . . . . . . . . . . . 12 78 4.3. Impact of packet reordering . . . . . . . . . . . . . . . 13 79 4.4. Impact of packet loss and retransmissions . . . . . . . . 13 80 4.5. Connections without SACK support . . . . . . . . . . . . 13 81 5. Implementation Status . . . . . . . . . . . . . . . . . . . . 14 82 6. Security Considerations . . . . . . . . . . . . . . . . . . . 15 83 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 84 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 15 85 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 86 9.1. Normative References . . . . . . . . . . . . . . . . . . 15 87 9.2. Informative References . . . . . . . . . . . . . . . . . 16 88 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 90 1. Introduction 92 This document describes a generic algorithm for a transport protocol 93 sender to estimate the current delivery rate of its data on the fly. 94 This technique has been used for a congestion control algorithm that 95 relies on fresh, reliable, and inexpensive delivery rate information 96 [draft-cardwell-iccrg-bbr-congestion-control] [CCGHJ17]. 98 At a high level, the algorithm estimates the rate at which the 99 network delivered the most recent flight of outbound data packets for 100 a single flow. In addition, it tracks whether the rate sample was 101 application-limited, meaning the transmission rate was limited by the 102 sending application rather than the congestion control algorithm. 104 Each acknowledgment that cumulatively or selectively acknowledges 105 that the network has delivered new data produces a rate sample which 106 records the amount of data delivered over the time interval between 107 the transmission of a data packet and the acknowledgment of that 108 packet. The samples reflect the recent goodput through some 109 bottleneck, which may reside either in the network or on the end 110 hosts (sender or receiver). 112 2. Algorithm Overview 114 2.1. Requirements 116 This algorithm can be implemented in any transport protocol that 117 supports packet-delivery acknowledgment (so far, implementations are 118 available for TCP [RFC793] and QUIC [RFC9000]). This algorithm 119 requires a small amount of added logic on the sender, and requires 120 that the sender maintain a small amount of additional per-packet 121 state for packets sent but not yet delivered. In the most general 122 case it requires high-precision (microsecond-granularity or better) 123 timestamps on the sender (though millisecond-granularity may suffice 124 for lower bandwidths). It does not require any receiver or network 125 changes. While selective acknowledgments for out-of-order data 126 (e.g., [RFC2018]) are not required, such a mechanism is highly 127 recommended for accurate estimation during reordering and loss 128 recovery phases. 130 2.2. Estimating Delivery Rate 132 A delivery rate sample records the estimated rate at which the 133 network delivered packets for a single flow, calculated over the time 134 interval between the transmission of a data packet and the 135 acknowledgment of that packet. Since the rate samples only include 136 packets actually cumulatively and/or selectively acknowledged, the 137 sender knows the exact octets that were delivered to the receiver 138 (not lost), and the sender can compute an estimate of a bottleneck 139 delivery rate over that time interval. 141 The amount of data delivered MAY be tracked in units of either octets 142 or packets. Tracking data in units of octets is more accurate, since 143 packet sizes can vary. But for some purposes, including congestion 144 control, tracking data in units of packets may suffice. 146 2.2.1. ACK Rate 148 First, consider the rate at which data is acknowledged by the 149 receiver. In this algorithm, the computation of the ACK rate models 150 the average slope of a hypothetical "delivered" curve that tracks the 151 cumulative quantity of data delivered so far on the Y axis, and time 152 elapsed on the X axis. Since ACKs arrive in discrete events, this 153 "delivered" curve forms a step function, where each ACK causes a 154 discrete increase in the "delivered" count that causes a vertical 155 upward step up in the curve. This "ack_rate" computation is the 156 average slope of the "delivered" step function, as measured from the 157 "knee" of the step (ACK) preceding the transmit to the "knee" of the 158 step (ACK) for packet P. 160 Given this model, the ack rate sample "slope" is computed as the 161 ratio between the amount of data marked as delivered over this time 162 interval, and the time over which it is marked as delivered: 164 ack_rate = data_acked / ack_elapsed 166 To calculate the amount of data ACKed over the interval, the sender 167 records in per-packet state "P.delivered", the amount of data that 168 had been marked delivered before transmitting packet P, and then 169 records how much data had been marked delivered by the time the ACK 170 for the packet arrives (in "C.delivered"), and computes the 171 difference: 173 data_acked = C.delivered - P.delivered 175 To compute the time interval, "ack_elapsed", one might imagine that 176 it would be feasible to use the round-trip time (RTT) of the packet. 177 But it is not safe to simply calculate a bandwidth estimate by using 178 the time between the transmit of a packet and the acknowledgment of 179 that packet. Transmits and ACKs can happen out of phase with each 180 other, clocked in separate processes. In general, transmissions 181 often happen at some point later than the most recent ACK, due to 182 processing or pacing delays. Because of this effect, drastic over- 183 estimates can happen if a sender were to attempt to estimate 184 bandwidth by using the round-trip time. 186 This document specifies the following approach for computing 187 "ack_elapsed". The starting time is "P.delivered_time", the time of 188 the delivery curve "knee" from the ACK preceding the transmit. The 189 ending time is "C.delivered_time", the time of the delivery curve 190 "knee" from the ACK for P. Then we compute "ack_elapsed" as: 192 ack_elapsed = C.delivered_time - P.delivered_time 194 This yields our equation for computing the ACK rate, as the "slope" 195 from the "knee" preceding the transmit to the "knee" at ACK: 197 ack_rate = data_acked / ack_elapsed 198 ack_rate = (C.delivered - P.delivered) / 199 (C.delivered_time - P.delivered_time) 201 2.2.2. Compression and Aggregation 203 For computing the delivery_rate, the sender prefers ack_rate, the 204 rate at which packets were acknowledged, since this usually the most 205 reliable metric. However, this approach of directly using "ack_rate" 206 faces a challenge when used with paths featuring aggregation, 207 compression, or ACK decimation, which are prevalent [A15]. In such 208 cases, ACK arrivals can temporarily make it appear as if data packets 209 were delivered much faster than the bottleneck rate. To filter out 210 such implausible ack_rate samples, we consider the send rate for each 211 flight of data, as follows. 213 2.2.3. Send Rate 215 The sender calculates the send rate, "send_rate", for a flight of 216 data as follows. Define "P.first_sent_time" as the time of the first 217 send in a flight of data, and "P.sent_time" as the time the final 218 send in that flight of data (the send that transmits packet "P"). 219 The elapsed time for sending the flight is: 221 send_elapsed = (P.sent_time - P.first_sent_time) 223 Then we calculate the send_rate as: 225 send_rate = data_acked / send_elapsed 227 Using our "delivery" curve model above, the send_rate can be viewed 228 as the average slope of a "send" curve that traces the amount of data 229 sent on the Y axis, and the time elapsed on the X axis: the average 230 slope of the transmission of this flight of data. 232 2.2.4. Delivery Rate 234 Since it is physically impossible to have data delivered faster than 235 it is sent in a sustained fashion, when the estimator notices that 236 the ack_rate for a flight is faster than the send rate for the 237 flight, it filters out the implausible ack_rate by capping the 238 delivery rate sample to be no higher than the send rate. 240 More precisely, over the interval between each transmission and 241 corresponding ACK, the sender calculates a delivery rate sample, 242 "delivery_rate", using the minimum of the rate at which packets were 243 acknowledged or the rate at which they were sent: 245 delivery_rate = min(send_rate, ack_rate) 247 Since ack_rate and send_rate both have data_acked as a numerator, 248 this can be computed more efficiently with a single division (instead 249 of two), as follows: 251 delivery_elapsed = max(ack_elapsed, send_elapsed) 252 delivery_rate = data_acked / delivery_elapsed 254 2.3. Tracking application-limited phases 256 In application-limited phases the transmission rate is limited by the 257 application rather than the congestion control algorithm. Modern 258 transport protocol connections are often application-limited, either 259 due to request/response workloads (e.g. Web traffic, RPC traffic) or 260 because the sender transmits data in chunks (e.g. adaptive streaming 261 video). 263 Knowing whether a delivery rate sample was application-limited is 264 crucial for congestion control algorithms and applications to use the 265 estimated delivery rate samples properly. For example, congestion 266 control algorithms may not want to react to a delivery rate that is 267 lower simply because the sender is application-limited; for 268 congestion control the key metric is the rate at which the network 269 path can deliver data, and not simply the rate at which the 270 application happens to be transmitting data at any moment. 272 To track this, the estimator marks a bandwidth sample as application- 273 limited if there was some moment during the sampled flight of data 274 packets when there was no data ready to send. 276 An application-limited phase starts when the sending application 277 requests to send new data, or the connection's retransmission 278 mechanisms decide to retransmit data, and the connection meets all of 279 the following conditions: 281 1. The transport send buffer has less than one SMSS of unsent data 282 available to send. 284 2. The sending flow is not currently in the process of transmitting 285 a packet. 287 3. The amount of data considered in flight is less than the 288 congestion window (cwnd). 290 4. All the packets considered lost have been retransmitted. 292 If these conditions are all met then the sender has run out of data 293 to feed the network. This would effectively create a "bubble" of 294 idle time in the data pipeline. This idle time means that any 295 delivery rate sample obtained from this data packet, and any rate 296 sample from a packet that follows it in the next round trip, is going 297 to be an application-limited sample that potentially underestimates 298 the true available bandwidth. Thus, when the algorithm marks a 299 transport flow as application-limited, it marks all bandwidth samples 300 for the next round trip as application-limited (at which point, the 301 "bubble" can be said to have exited the data pipeline). 303 3. Detailed Algorithm 305 3.1. Variables 307 3.1.1. Per-connection (C) state 309 This algorithm requires the following new state variables for each 310 transport connection: 312 C.delivered: The total amount of data (measured in octets or in 313 packets) delivered so far over the lifetime of the transport 314 connection. This does not include pure ACK packets. 316 C.delivered_time: The wall clock time when C.delivered was last 317 updated. 319 C.first_sent_time: If packets are in flight, then this holds the send 320 time of the packet that was most recently marked as delivered. Else, 321 if the connection was recently idle, then this holds the send time of 322 most recently sent packet. 324 C.app_limited: The index of the last transmitted packet marked as 325 application-limited, or 0 if the connection is not currently 326 application-limited. 328 We also assume that the transport protocol sender implementation 329 tracks the following state per connection. If the following state 330 variables are not tracked by an existing implementation, all the 331 following parameters MUST be tracked to implement this algorithm: 333 C.write_seq: The data sequence number one higher than that of the 334 last octet queued for transmission in the transport layer write 335 buffer. 337 C.pending_transmissions: The number of bytes queued for transmission 338 on the sending host at layers lower than the transport layer (i.e. 339 network layer, traffic shaping layer, network device layer). 341 C.lost_out: The number of packets in the current outstanding window 342 that are marked as lost. 344 C.retrans_out: The number of packets in the current outstanding 345 window that are being retransmitted. 347 C.pipe: The sender's estimate of the amount of data outstanding in 348 the network (measured in octets or packets). This includes data 349 packets in the current outstanding window that are being transmitted 350 or retransmitted and have not been SACKed or marked lost (e.g. "pipe" 351 from [RFC6675]). This does not include pure ACK packets. 353 3.1.2. Per-packet (P) state 355 This algorithm requires the following new state variables for each 356 packet that has been transmitted but not yet ACKed or SACKed: 358 P.delivered: C.delivered when the packet was sent from transport 359 connection C. 361 P.delivered_time: C.delivered_time when the packet was sent. 363 P.first_sent_time: C.first_sent_time when the packet was sent. 365 P.is_app_limited: true if C.app_limited was non-zero when the packet 366 was sent, else false. 368 P.sent_time: The time when the packet was sent. 370 3.1.3. Rate Sample (rs) Output 372 This algorithm provides its output in a RateSample structure rs, 373 containing the following fields: 375 rs.delivery_rate: The delivery rate sample (in most cases 376 rs.delivered / rs.interval). 378 rs.is_app_limited: The P.is_app_limited from the most recent packet 379 delivered; indicates whether the rate sample is application-limited. 381 rs.interval: The length of the sampling interval. 383 rs.delivered: The amount of data marked as delivered over the 384 sampling interval. 386 rs.prior_delivered: The P.delivered count from the most recent packet 387 delivered. 389 rs.prior_time: The P.delivered_time from the most recent packet 390 delivered. 392 rs.send_elapsed: Send time interval calculated from the most recent 393 packet delivered (see the "Send Rate" section above). 395 rs.ack_elapsed: ACK time interval calculated from the most recent 396 packet delivered (see the "ACK Rate" section above). 398 3.2. Transmitting or retransmitting a data packet 400 Upon transmitting or retransmitting a data packet, the sender 401 snapshots the current delivery information in per-packet state. This 402 will allow the sender to generate a rate sample later, in the 403 UpdateRateSample() step, when the packet is (S)ACKed. 405 If there are packets already in flight, then we need to start 406 delivery rate samples from the time we received the most recent ACK, 407 to try to ensure that we include the full time the network needs to 408 deliver all in-flight packets. If there are no packets in flight 409 yet, then we can start the delivery rate interval at the current 410 time, since we know that any ACKs after now indicate that the network 411 was able to deliver those packets completely in the sampling interval 412 between now and the next ACK. 414 Upon each packet transmission, the sender executes the following 415 steps: 417 SendPacket(Packet P): 418 if (SND.NXT == SND.UNA) /* no packets in flight yet? */ 419 C.first_sent_time = C.delivered_time = Now() 420 P.first_sent_time = C.first_sent_time 421 P.delivered_time = C.delivered_time 422 P.delivered = C.delivered 423 P.is_app_limited = (C.app_limited != 0) 425 3.3. Upon receiving an ACK 427 When an ACK arrives, the sender invokes GenerateRateSample() to fill 428 in a rate sample. For each packet that was newly SACKed or ACKed, 429 UpdateRateSample() updates the rate sample based on a snapshot of 430 connection delivery information from the time at which the packet was 431 last transmitted. UpdateRateSample() is invoked multiple times when 432 a stretched ACK acknowledges multiple data packets. In this case we 433 use the information from the most recently sent packet, i.e., the 434 packet with the highest "P.delivered" value. 436 /* Upon receiving ACK, fill in delivery rate sample rs. */ 437 GenerateRateSample(RateSample rs): 438 for each newly SACKed or ACKed packet P 439 UpdateRateSample(P, rs) 441 /* Clear app-limited field if bubble is ACKed and gone. */ 442 if (C.app_limited and C.delivered > C.app_limited) 443 C.app_limited = 0 445 if (rs.prior_time == 0) 446 return false /* nothing delivered on this ACK */ 448 /* Use the longer of the send_elapsed and ack_elapsed */ 449 rs.interval = max(rs.send_elapsed, rs.ack_elapsed) 451 rs.delivered = C.delivered - rs.prior_delivered 453 /* Normally we expect interval >= MinRTT. 454 * Note that rate may still be overestimated when a spuriously 455 * retransmitted skb was first (s)acked because "interval" 456 * is under-estimated (up to an RTT). However, continuously 457 * measuring the delivery rate during loss recovery is crucial 458 * for connections that suffer heavy or prolonged losses. 459 */ 460 if (rs.interval < MinRTT(tp)) 461 rs.interval = -1 462 return false /* no reliable sample */ 464 if (rs.interval != 0) 465 rs.delivery_rate = rs.delivered / rs.interval 467 return true; /* we filled in rs with a rate sample */ 469 /* Update rs when a packet is SACKed or ACKed. */ 470 UpdateRateSample(Packet P, RateSample rs): 471 if P.delivered_time == 0 472 return /* P already SACKed */ 474 C.delivered += P.data_length 475 C.delivered_time = Now() 477 /* Update info using the newest packet: */ 478 if (P.delivered > rs.prior_delivered) 479 rs.prior_delivered = P.delivered 480 rs.prior_time = P.delivered_time 481 rs.is_app_limited = P.is_app_limited 482 rs.send_elapsed = P.sent_time - P.first_sent_time 483 rs.ack_elapsed = C.delivered_time - P.delivered_time 484 C.first_sent_time = P.sent_time 486 /* Mark the packet as delivered once it's SACKed to 487 * avoid being used again when it's cumulatively acked. 488 */ 489 P.delivered_time = 0 491 3.4. Detecting application-limited phases 493 An application-limited phase starts when the connection decides to 494 send more data, at a point in time when the connection has run out of 495 data. Some decisions to send more data are triggered by the 496 application writing more data to the connection, and some are 497 triggered by loss detection (during ACK processing or upon the 498 triggering of a timer) estimating that some sequence ranges need to 499 be retransmitted. To detect all such cases, the algorithm calls 500 CheckIfApplicationLimited() to check for application-limited behavior 501 in the following situations: 503 * The sending application asks the transport layer to send more 504 data; i.e., upon each write from the application, before new 505 application data is enqueued in the transport send buffer or 506 transmitted. 508 * At the beginning of ACK processing, before updating the estimated 509 number of packets in flight, and before congestion control 510 modifies the cwnd or pacing rate. 512 * At the beginning of connection timer processing, for all timers 513 that might result in the transmission of one or more data 514 segments. For example: RTO timers, TLP timers, RACK reordering 515 timers, or Zero Window Probe timers. 517 When checking for application-limited behavior, the connection checks 518 all the conditions previously described in the "Tracking application- 519 limited phases" section, and if all are met then it marks the 520 connection as application-limited: 522 CheckIfApplicationLimited(): 523 if (C.write_seq - SND.NXT < SND.MSS and 524 C.pending_transmissions == 0 and 525 C.pipe < cwnd and 526 C.lost_out <= C.retrans_out) 527 C.app_limited = (C.delivered + C.pipe) ? : 1 529 4. Discussion 531 4.1. Offload Mechanisms 533 If a transport sender implementation uses an offload mechanism (such 534 as TSO, GSO, etc.) to combine multiple SMSS of data into a single 535 packet "aggregate" for the purposes of scheduling transmissions, then 536 it is RECOMMENDED that the per-packet state be tracked for each 537 packet "aggregate" rather than each SMSS. For simplicity this 538 document refers to such state as "per-packet", whether it is per 539 "aggregate" or per SMSS. 541 4.2. Impact of ACK losses 543 Delivery rate samples are generated upon receiving each ACK; ACKs may 544 contain both cumulative and selective acknowledgment information. 545 Losing an ACK results in losing the delivery rate sample 546 corresponding to that ACK, and generating a delivery rate sample at 547 later a time (upon the arrival of the next ACK). This can 548 underestimate the delivery rate due the artificially inflated 549 "rs.interval". As with any effect that can cause underestimation, it 550 is RECOMMENDED that applications or congestion control algorithms 551 using the output of this algorithm apply appropriate filtering to 552 mitigate the impact of this effect. 554 4.3. Impact of packet reordering 556 This algorithm is robust to packet reordering; it makes no 557 assumptions about the order in which packets are delivered or ACKed. 558 In particular, for a particular packet P, it does not matter which 559 packets are delivered between the transmission of P and the ACK of 560 packet P, since C.delivered will be incremented appropriately in any 561 case. 563 4.4. Impact of packet loss and retransmissions 565 There are several possible approaches for handling cases where a 566 delivery rate sample is based on an ACK or SACK for a retransmitted 567 packet. 569 If the transport protocol supports unambiguous ACKs for retransmitted 570 data sequence ranges (as in QUIC [RFC9000]) then the algorithm is 571 perfectly robust to retransmissions, because the starting packet, P, 572 for the sample can be unambiguously retrieved. 574 If the transport protocol, like TCP [RFC793], has ambiguous ACKs for 575 retransmitted sequence ranges, then the following approaches MAY be 576 used: 578 1. The sender MAY choose to filter out implausible delivery rate 579 samples, as described in the GenerateRateSample() step in the 580 "Upon receiving an ACK" section, by discarding samples whose 581 rs.interval is lower than the minimum RTT seen on the connection. 583 2. The sender MAY choose to skip the generation of a delivery rate 584 sample for a retransmitted sequence range. 586 4.5. Connections without SACK support 588 If the transport connection does not use SACK (i.e., either or both 589 ends of the connections do not accept SACK), then this algorithm can 590 be extended to estimate approximate delivery rates using duplicate 591 ACKs (much like Reno and [RFC5681] estimates that each duplicate ACK 592 indicates that a data packet has been delivered). The details of 593 this extension will be described in a future version of this draft. 595 5. Implementation Status 597 This section records the status of known implementations of the 598 algorithm defined by this specification at the time of posting of 599 this Internet-Draft, and is based on a proposal described in 600 [RFC7942]. The description of implementations in this section is 601 intended to assist the IETF in its decision processes in progressing 602 drafts to RFCs. Please note that the listing of any individual 603 implementation here does not imply endorsement by the IETF. 604 Furthermore, no effort has been spent to verify the information 605 presented here that was supplied by IETF contributors. This is not 606 intended as, and must not be construed to be, a catalog of available 607 implementations or their features. Readers are advised to note that 608 other implementations may exist. 610 According to [RFC7942], "this will allow reviewers and working groups 611 to assign due consideration to documents that have the benefit of 612 running code, which may serve as evidence of valuable experimentation 613 and feedback that have made the implemented protocols more mature. 614 It is up to the individual working groups to use this information as 615 they see fit". 617 As of the time of writing, the following implementations of this 618 algorithm have been publicly released: 620 * Linux TCP 622 - Source code URL: 624 o GPLv2 license: https://git.kernel.org/pub/scm/linux/kernel/g 625 it/torvalds/linux.git/tree/net/ipv4/tcp_rate.c 627 o BSD-style license: https://groups.google.com/d/msg/bbr- 628 dev/X0LbDptlOzo/EVgkRjVHBQAJ 630 - Source: Google 632 - Maturity: production 634 - License: dual-licensed: GPLv2 / BSD-style 636 - Contact: https://groups.google.com/d/forum/bbr-dev 638 - Last updated: September 24, 2021 640 * QUIC 642 - Source code URLs: 644 o https://source.chromium.org/chromium/chromium/src/+/master:n 645 et/third_party/quiche/src/quic/core/congestion_control/ 646 bandwidth_sampler.cc 648 o https://source.chromium.org/chromium/chromium/src/+/master:n 649 et/third_party/quiche/src/quic/core/congestion_control/ 650 bandwidth_sampler.h 652 - Source: Google 654 - Maturity: production 656 - License: BSD-style 658 - Contact: https://groups.google.com/d/forum/bbr-dev 660 - Last updated: October 5, 2021 662 6. Security Considerations 664 This proposal makes no changes to the underlying security of 665 transport protocols or congestion control algorithms. This algorithm 666 adds no security considerations beyond those involved in the existing 667 standard congestion control algorithm [RFC5681]. 669 7. IANA Considerations 671 This document makes no request of IANA. 673 Note to RFC Editor: this section may be removed on publication as an 674 RFC. 676 8. Acknowledgments 678 The authors would like to thank C. Stephen Gunn, Eric Dumazet, Ian 679 Swett, Jana Iyengar, Victor Vasiliev, Nandita Dukkipati, Pawel 680 Jurczyk, Biren Roy, David Wetherall, Amin Vahdat, Leonidas 681 Kontothanassis, and the YouTube, google.com, Bandwidth Enforcer, and 682 Google SRE teams for their invaluable help and support. We would 683 like to thank Hiren Panchasara, Bao Zheng, and Jonathan Morton for 684 feedback and suggestions on earlier versions of this document. 686 9. References 688 9.1. Normative References 690 [RFC793] Postel, J., "Transmission Control Protocol", September 691 1981. 693 [RFC2018] Mathis, M. and J. Mahdavi, "TCP Selective Acknowledgment 694 Options", RFC 2018, October 1996, 695 . 697 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 698 Control", RFC 5681, September 2009, 699 . 701 [RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running 702 Code: The Implementation Status Section", July 2016. 704 [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., 705 and Y. Nishida, "A Conservative Loss Recovery Algorithm 706 Based on Selective Acknowledgment (SACK) for TCP", 707 RFC 6675, August 2012, 708 . 710 [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based 711 Multiplexed and Secure Transport", RFC 9000, 712 DOI 10.17487/RFC9000, May 2021, 713 . 715 9.2. Informative References 717 [draft-cardwell-iccrg-bbr-congestion-control] 718 Cardwell, N., Cheng, Y., Hassas Yeganeh, S., and V. 719 Jacobson, "BBR Congestion Control", Work in Progress, 720 Internet-Draft, draft-cardwell-iccrg-bbr-congestion- 721 control, November 2021, . 724 [CCGHJ17] Cardwell, N., Cheng, Y., Gunn, C., Hassas Yeganeh, S., and 725 V. Jacobson, "BBR: Congestion-Based Congestion Control", 726 Communications of the ACM Feb 2017, February 2017, 727 . 730 [A15] Abrahamsson, M., "TCP ACK suppression", IETF AQM mailing 731 list , November 2015, . 734 Authors' Addresses 736 Yuchung Cheng 737 Google 738 Email: ycheng@google.com 739 Neal Cardwell 740 Google 741 Email: ncardwell@google.com 743 Soheil Hassas Yeganeh 744 Google 745 Email: soheil@google.com 747 Van Jacobson 748 Google 749 Email: vanj@google.com