idnits 2.17.1 draft-bagnulo-iccrg-rledbat-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 179: '...nders. In particular, the sender MUST...' RFC 2119 keyword, line 180: '... implement [I-D.ietf-tcpm-rfc793bis] and it also MUST implement the...' RFC 2119 keyword, line 181: '... Time Stamp Option as defined in [RFC7323]. Also, the sender SHOULD...' RFC 2119 keyword, line 187: '... The rLEDBAT receiver MUST use an LBE...' RFC 2119 keyword, line 197: '...rLEDBAT receiver SHOULD use the LEDBAT...' (12 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 29, 2019) is 1641 days in the past. Is this intentional? Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Outdated reference: A later version (-01) exists of draft-balasubramanian-iccrg-ledbatplusplus-00 == Outdated reference: A later version (-28) exists of draft-ietf-tcpm-rfc793bis-14 -- Obsolete informational reference (is this intentional?): RFC 8312 (Obsoleted by RFC 9438) Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 2 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group M. Bagnulo 3 Internet-Draft A. Garcia-Martinez 4 Intended status: Experimental UC3M 5 Expires: May 1, 2020 G. Montenegro 6 P. Balasubramanian 7 Microsoft 8 October 29, 2019 10 rLEDBAT: receiver-driven Low Extra Delay Background Transport for TCP 11 draft-bagnulo-iccrg-rledbat-01.txt 13 Abstract 15 This document specifies the rLEDBAT, a set of mechanisms that enable 16 the execution of a less-than-best-effort congestion control algorithm 17 for TCP at the receiver end. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at https://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on May 1, 2020. 36 Copyright Notice 38 Copyright (c) 2019 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (https://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. 
Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. Introduction 54 2. Motivations for rLEDBAT 55 3. rLEDBAT mechanisms 56 3.1. Controlling the receive window 57 3.1.1. Avoiding window shrinking 58 3.1.2. Window Scale Option 59 3.2. Measuring the Round Trip Time 60 3.2.1. Measuring the RTT to estimate the queueing delay 61 3.3. Detecting retransmissions and packet losses 62 4. Security Considerations 63 5. IANA Considerations 64 6. Acknowledgements 65 7. Informative References 66 Appendix A. Difficulties while measuring one way delay at the 67 TCP receiver 68 Authors' Addresses 70 1. Introduction 72 LEDBAT (Low Extra Delay Background Transport) [RFC6817] is a 73 congestion-control algorithm that implements a less-than-best-effort 74 (LBE) traffic class. 76 When LEDBAT traffic shares a bottleneck with one or more TCP 77 connections using standard congestion control algorithms such as 78 Cubic [RFC8312] (hereafter standard-TCP for short), it reduces its 79 sending rate earlier and more aggressively than standard-TCP 80 congestion control, allowing standard-TCP traffic to use more of the 81 available capacity.
In the absence of competing standard-TCP 82 traffic, LEDBAT aims to make efficient use of the available 83 capacity, while keeping the queuing delay within predefined bounds. 85 LEDBAT reacts both to packet loss and to variations in delay. 86 Regarding packet loss, LEDBAT reacts with a multiplicative 87 decrease, similar to most TCP congestion controllers. Regarding 88 delay, LEDBAT aims for a target queueing delay. When the measured 89 current queueing delay is below the target, LEDBAT increases the 90 sending rate and when the delay is above the target, it reduces the 91 sending rate. LEDBAT estimates the queuing delay by subtracting the 92 estimated base one-way delay (i.e. the one-way delay in the absence 93 of queues) from the measured current one-way delay. 95 The LEDBAT specification [RFC6817] defines the LEDBAT congestion- 96 control algorithm, implemented in the sender to control its sending 97 rate. LEDBAT is specified in a protocol and layer agnostic manner. 99 LEDBAT++ [I-D.balasubramanian-iccrg-ledbatplusplus] is also an LBE 100 congestion control algorithm, which is inspired by LEDBAT while 101 addressing several problems identified with the original LEDBAT 102 specification. In particular, the differences between LEDBAT and 103 LEDBAT++ include: i) LEDBAT++ uses the round-trip-time (RTT) (as 104 opposed to the one way delay used in LEDBAT) to estimate the queuing 105 delay; ii) LEDBAT++ uses an Additive Increase/Multiplicative Decrease 106 algorithm to achieve inter-LEDBAT++ fairness and avoid the late-comer 107 advantage observed in LEDBAT; iii) LEDBAT++ performs periodic 108 slowdowns to improve the measurement of the base delay; iv) LEDBAT++ 109 is defined for TCP. 111 In this note, we describe rLEDBAT, a set of mechanisms that enable 112 the execution of an LBE delay-based congestion control algorithm such 113 as LEDBAT++ in the receiver end of a TCP connection. 115 2.
Motivations for rLEDBAT 117 rLEDBAT enables new use cases and new deployment models, fostering 118 the use of LBE traffic and benefitting the global Internet by 119 improving overall allocation of resources. The following scenarios 120 are enabled by rLEDBAT: 122 Content Delivery Networks and more sophisticated file distribution 123 scenarios: Consider the case where the source of a file to be 124 distributed (e.g., a software developer that wishes to distribute 125 a software update) would prefer to use LBE and enables LEDBAT/ 126 LEDBAT++ in the servers containing the source file. However, 127 because the file is being distributed through a CDN whose 128 surrogates do not support LBE congestion control, the result is 129 that the file transfers originating from CDN surrogates will not 130 be using LBE. Interestingly enough, in the case of the software 131 update, the developer may also control the software performing the 132 download in the client, the receiver of the file, but because 133 current LEDBAT/LEDBAT++ are sender-based algorithms, controlling 134 the client is not enough to enable LBE congestion control in the 135 communication. rLEDBAT would enable the use of the LBE traffic class 136 for file distribution in this setup. 138 Interference from proxies and other middleboxes: Proxies and other 139 middleboxes are commonplace in the Internet. For instance, in 140 the case of mobile networks, proxies are frequently used. In the 141 case of enterprise networks, it is common to deploy corporate 142 proxies for filtering and firewalling. In the case of satellite 143 links, Performance Enhancement Proxies (PEPs) are deployed to 144 mitigate the effect of the long delay on TCP connections. These 145 proxies terminate the TCP connection on both ends and prevent the 146 use of LBE congestion control in the segment between the proxy and 147 the sink of the content, the client.
By enabling rLEDBAT, clients 148 would be able to enable LBE traffic between them and the proxy. 150 Receiver-defined preferences. It is frequent that the bottleneck 151 of the communication is the access link. This is particularly 152 true in the case of mobile devices. It is then especially 153 relevant for mobile devices to properly manage the capacity of the 154 access link. With current technologies, it is possible for the 155 mobile device to use different congestion control algorithms 156 expressing different preferences for the traffic. For instance, a 157 device can choose to use standard-TCP for some traffic and to use 158 LEDBAT/LEDBAT++ for other traffic. However, this would only 159 affect the outgoing traffic since both standard-TCP and LEDBAT/ 160 LEDBAT++ are sender-driven. The mobile device has no means to 161 manage the traffic in the down-link, which is in most cases, the 162 communication bottleneck for a typical eye-ball end-user. rLEDBAT 163 enables the mobile device to selectively use LBE traffic class for 164 some of the incoming traffic. For instance, by using rLEDBAT, a 165 user can use regular standard-TCP/UDP for video stream (e.g., 166 Youtube) and use rLEDBAT for other background file download. 168 3. rLEDBAT mechanisms 170 rLEDBAT provides the mechanisms to implement an LBE congestion 171 control algorithm at the receiver-end of a TCP connection. The 172 rLEDBAT receiver controls the sender's rate through the Receive 173 Window announced to the sender in the TCP header. 175 rLEDBAT assumes that the sender is a standard TCP sender. rLEDBAT 176 does not require any rLEDBAT-specific modifications to the TCP 177 sender. The envisioned deployment model for rLEDBAT is that the 178 clients implement rLEDBAT and this enables rLEDBAT in communications 179 with existent standard TCP senders. In particular, the sender MUST 180 implement [I-D.ietf-tcpm-rfc793bis] and it also MUST implement the 181 Time Stamp Option as defined in [RFC7323]. 
Also, the sender SHOULD 182 implement some of the standard congestion control mechanisms, such as 183 Cubic [RFC8312] or New Reno [RFC5681]. 185 rLEDBAT does not define a new congestion control algorithm. The LBE 186 congestion control algorithm executed in the rLEDBAT receiver is 187 defined in other documents. The rLEDBAT receiver MUST use an LBE 188 congestion control algorithm. Because rLEDBAT assumes a standard TCP 189 sender, the sender will be using a "best effort" congestion control 190 algorithm (such as Cubic or New Reno). Since rLEDBAT uses the 191 Receive Window to control the sender's rate and the sender calculates 192 the sender's window as the minimum of the Receive window and the 193 congestion window, rLEDBAT will only be effective as long as the 194 congestion control algorithm executed in the receiver yields a 195 smaller window than the one calculated by the sender. This is 196 normally the case when the receiver is using an LBE congestion 197 control algorithm. The rLEDBAT receiver SHOULD use the LEDBAT++ 198 congestion control algorithm 199 [I-D.balasubramanian-iccrg-ledbatplusplus]. The rLEDBAT receiver MAY use 200 other LBE congestion control algorithms defined elsewhere as long as 201 they use the round-trip-time as input to estimate the queueing delay, 202 as rLEDBAT as currently defined does not provide the means for the 203 receiver to estimate the queueing delay using one way delay 204 measurements. 206 EDITOR'S NOTE: should we recommend the use of LEDBAT for rLEDBAT as 207 long as the RTT is used to estimate the queueing delay? 209 Irrespective of which congestion control algorithm is executed in 210 the receiver, an rLEDBAT connection will never be more aggressive 211 than standard TCP since it is always bounded by the congestion 212 control algorithm executed at the sender.
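The bound described above can be illustrated with a minimal sketch. The function name and the byte values are hypothetical; the only property taken from the text is that a standard TCP sender limits itself to the minimum of its congestion window and the advertised Receive Window.

```python
def sender_window(cwnd: int, rcv_wnd: int) -> int:
    """Window a standard TCP sender may use, in bytes.

    cwnd is the sender's congestion window and rcv_wnd is the Receive
    Window advertised by the receiver; the sender honors the smaller.
    """
    return min(cwnd, rcv_wnd)


# The receiver advertises less than cwnd: the receiver-side (LBE)
# controller is the one limiting the rate.
receiver_limited = sender_window(cwnd=64_000, rcv_wnd=20_000)

# The receiver advertises more than cwnd: the sender's own congestion
# control still bounds the flow, so it is never more aggressive than
# standard TCP.
sender_limited = sender_window(cwnd=64_000, rcv_wnd=200_000)
```

In the first case the receiver is in control; in the second, the receiver-side algorithm has no effect and the flow behaves as a standard TCP flow.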
214 rLEDBAT is essentially composed of three types of mechanisms, namely, 215 those that provide the means to measure the round trip time, 216 mechanisms to detect packet loss and the means to manipulate the 217 Receive Window to control the sender's rate. We describe them next. 219 3.1. Controlling the receive window 221 rLEDBAT uses the Receive Window (RCV.WND) of TCP to enable the 222 receiver to control the sender's rate. [I-D.ietf-tcpm-rfc793bis] 223 defines that the RCV.WND is used to announce the available receive 224 buffer to the sender for flow control purposes. In order to avoid 225 confusion, we will call fc.WND the value that a standard RFC793bis 226 TCP receiver calculates to set in the receive window for flow control 227 purposes. We call rl.WND the window value calculated by rLEDBAT 228 algorithm and we call RCV.WND the value actually included in the 229 Receive Window field of the TCP header. For a RFC793bis receiver, 230 RCV.WND == fc.WND. 232 In the case of rLEDBAT receiver, the rLEDBAT receiver MUST NOT set 233 the RCV.WND to a value larger than fc.WND and it SHOULD set the 234 RCV.WND to the minimum of rl.WND and fc.WND, honoring both. 236 When using rLEDBAT, two congestion controllers are in action in the 237 flow of data from the sender to the receiver, namely, the congestion 238 control algorithm of TCP in the sender side and the LBE congestion 239 control algorithm executed in the receiver and conveyed to the sender 240 through the RCV.WND. In the normal TCP operation, the sender uses 241 the minimum of the congestion window cwnd and the receiver window 242 RCV.WND to calculate the sender's window SND.WND. This is also true 243 for rLEDBAT, as the sender is a regular TCP sender. This guarantees 244 that the rLEDBAT flow will never transmit more aggressively than a 245 TCP flow, as the sender's congestion window limits the sending rate. 
246 Moreover, because an LBE congestion control algorithm such as LEDBAT/ 247 LEDBAT++ is designed to react earlier and more aggressively to 248 congestion than regular TCP congestion control, the rl.WND contained 249 in the RCV.WND field of TCP will be in general smaller than the 250 congestion window calculated by the TCP sender, implying that the 251 rLEDBAT congestion control algorithm will be effectively controlling 252 the sender's window. 254 In summary, the sender's window is: SND.WND = min(cwnd, rl.WND, 255 fc.WND) 257 3.1.1. Avoiding window shrinking 259 The LEDBAT/LEDBAT++ algorithm executed in an rLEDBAT receiver 260 increases or decreases the rl.WND according to congestion signals 261 (variations in the estimated queueing delay and packet 262 loss). If the new value for rl.WND is smaller than the current one 263 then directly announcing it in the RCV.WND may result in shrinking 264 the window, i.e., moving the right window edge to the left. 265 Shrinking the window is discouraged as per [I-D.ietf-tcpm-rfc793bis], 266 as it may cause unnecessary packet loss and performance penalty. To 267 be consistent with [I-D.ietf-tcpm-rfc793bis], the rLEDBAT receiver 268 SHOULD NOT shrink the receive window. 270 In order to avoid window shrinking, upon the reception of a data 271 packet, the announced window can be reduced by at most the number of 272 bytes contained in the packet. This may fall short of honoring the 273 newly calculated value of the rl.WND. So, in order to reduce the 274 window as dictated by the rLEDBAT algorithm, the receiver SHOULD 275 progressively reduce the advertised RCV.WND, always ensuring that the 276 reduction is less than or equal to the received bytes, until the target 277 window determined by the rLEDBAT algorithm is reached. This implies 278 that it may take up to one RTT for the rLEDBAT receiver to drain 279 enough in-flight bytes to completely close its receive window without 280 shrinking it.
This is more than sufficient to honor the window 281 output from the LEDBAT/LEDBAT++ algorithms since they only allow 282 at most one multiplicative decrease per RTT. 284 3.1.2. Window Scale Option 286 The Window Scale (WS) option [RFC7323] is a means to increase the 287 maximum window size permitted by the Receive Window. The use of the 288 WS option implies that the changes in the window are expressed in the 289 units resulting from the WS option used in the TCP connection. This 290 means that the rLEDBAT client will have to accumulate the increases 291 resulting from the different received packets, and only convey a 292 change in the window when the accumulated sum of increases is equal 293 to or greater than one unit used to express the receive window according 294 to the WS option in place for the TCP connection. 296 Changes in the receive window that are smaller than 1 MSS are 297 unlikely to have any immediate impact on the sender's rate, as usual 298 TCP segmentation practice results in sending full segments (i.e., 299 segments of size equal to the MSS). So, accumulating changes in the 300 receive window until completing a full MSS in the sender or in the 301 receiver makes little difference. 303 The current WS option specification [RFC7323] defines that the allowed values 304 for the WS option are between 0 and 14. Assuming an MSS of around 1500 305 bytes, WS option values between 0 and 11 result in the receive window 306 being expressed in units that are about 1 MSS or smaller. So, WS 307 option values between 0 and 11 have no significant impact on rLEDBAT. 309 WS option values higher than 11 can affect the dynamics of rLEDBAT, 310 since control may become too coarse (e.g., with WS of 14, a change of 311 one unit of the receive window implies a change of about 11 MSS, i.e., 16384 bytes, in the 312 effective window). 314 For the above reasons, the rLEDBAT client SHOULD set WS option values 315 lower than 12.
Additional experimentation is required to explore the 316 impact of larger WS values on rLEDBAT dynamics. 318 Note that the recommendation for rLEDBAT to set the WS option value 319 to lower values does not preclude communication with servers 320 that set the WS option values to larger values, since the WS option 321 value used is set independently for each direction of the TCP 322 connection. 324 3.2. Measuring the Round Trip Time 326 LEDBAT++ measures base and current RTT to estimate the queueing 327 delay. In the next sections we describe how rLEDBAT mechanisms 328 enable the receiver to measure the RTT. 330 The original LEDBAT algorithm uses the one-way delay to estimate the 331 queuing delay. We have encountered a number of issues when 332 attempting to measure the one-way delay in TCP, which resulted in 333 deferring the recommendation of the use of one-way delay to estimate 334 the queuing delay in rLEDBAT for the future, when additional research 335 is done in this space. We describe the difficulties encountered in 336 Appendix A. 338 3.2.1. Measuring the RTT to estimate the queueing delay 340 LEDBAT++ uses the round trip time (RTT) to estimate the queueing 341 delay. In order to estimate the queueing delay using the RTT, the 342 rLEDBAT receiver estimates the base RTT (i.e., the constant 343 components of the RTT) and also measures the current RTT. By 344 subtracting the base RTT from the current RTT, we obtain the queuing delay to be used 345 by the rLEDBAT controller. 347 LEDBAT++ discovers the base RTT (RTTb) by taking the minimum value of 348 the measured RTTs over a period of time. The current RTT (RTTc) is 349 estimated using a number of recent samples and applying a filter, 350 such as the minimum (or the mean) of the last k samples. Using the 351 RTT to estimate the queueing delay has a number of shortcomings and 352 difficulties that we discuss next.
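Before turning to those difficulties, the estimation itself can be sketched as follows. This is only an illustration under stated assumptions: the class name, the sample-count-based history length (base_len) and the default value of k are hypothetical, a min filter is chosen for RTTc although the text also allows a mean, and RTT samples are plain integers in milliseconds.

```python
from collections import deque


class RttQueueingDelay:
    """Estimate the queueing delay as RTTc - RTTb from RTT samples (ms)."""

    def __init__(self, k=4, base_len=100):
        # Last k samples, filtered to obtain the current RTT (RTTc).
        self.recent = deque(maxlen=k)
        # Longer history whose minimum is taken as the base RTT (RTTb).
        # Assumption: bounded by sample count, not by elapsed time.
        self.history = deque(maxlen=base_len)

    def add_sample(self, rtt_ms):
        self.recent.append(rtt_ms)
        self.history.append(rtt_ms)

    def base_rtt(self):
        return min(self.history)

    def current_rtt(self):
        # Min filter over the last k samples; discards bloated samples.
        return min(self.recent)

    def queueing_delay(self):
        return self.current_rtt() - self.base_rtt()


est = RttQueueingDelay(k=2)
for sample in (50, 52, 80, 75):
    est.add_sample(sample)
delay = est.queueing_delay()  # RTTc = min(80, 75), RTTb = 50
```

A min filter is used here because it naturally drops the largest samples, which is the behavior the following sections ask for when samples are artificially inflated.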
354 The queuing delay measured using the RTT also includes the queueing 355 delay experienced by the return packets in the direction from the 356 rLEDBAT receiver to the sender. This is a fundamental limitation of 357 this approach. The impact of this error is that the rLEDBAT 358 controller will also react to congestion in the reverse path 359 direction, which results in an even more conservative mechanism. 361 In order to measure the RTT, the rLEDBAT client MUST enable the Time 362 Stamp (TS) option [RFC7323]. By matching the TSVal value carried in 363 outgoing packets with the TSecr value observed in incoming packets, 364 it is possible to measure the RTT. This allows the rLEDBAT receiver 365 to measure the RTT even if it is acting as a pure receiver. In a 366 pure receiver there is no data flowing from the rLEDBAT receiver to 367 the sender, making it impossible to match data packets with 368 acknowledgement packets to measure the RTT, as is usually done in 369 TCP for other purposes. 371 Depending on the frequency of the local clock used to generate the 372 values included in the TS option, several packets may carry the same 373 TSVal value. If that happens, the rLEDBAT receiver will be unable to 374 match the different outgoing packets carrying the same TSVal value 375 with the different incoming packets carrying the same TSecr 376 value. However, it is not necessary for rLEDBAT to use all packets 377 to estimate the RTT, and sampling a subset of in-flight packets per 378 RTT is enough to properly assess the queueing delay. The RTT MUST 379 then be calculated as the time elapsed between the sending of the first packet with a given 380 TSVal and the reception of the first packet with the same 381 value contained in the TSecr. Other packets with repeated TS values 382 SHOULD NOT be used for the RTT calculation. 384 Several issues must be addressed in order to avoid an artificial 385 increase of the observed RTT.
Different issues emerge depending 386 on whether the rLEDBAT capable host is sending data packets or pure ACKs 387 to measure the RTT. We next consider the issues separately. 389 3.2.1.1. Measuring RTT sending pure ACKs 391 In this scenario, the rLEDBAT node (host A) sends a pure ACK to the 392 other endpoint of the TCP connection (host B), including the TS 393 option. Upon the reception of the TS Option, host B will copy the 394 value of the TSVal into the TSecr field of the TS option and include 395 that option into the next data packet towards host A. However, there 396 are two reasons why B may not send a packet immediately back to A, 397 artificially increasing the measured RTT. The first reason is when B 398 has no data to send. The second is when B has no available window to 399 put more packets in-flight. We describe next how each of these cases 400 is addressed. 402 The case where host B has no data to send when it receives the 403 pure Acknowledgement is expected to be rare in the rLEDBAT use cases. 404 rLEDBAT will be used mostly for background file transfers so the 405 expected common case is that the sender will have data to send 406 throughout the lifetime of the communication. However, if, for 407 example, the file is structured in blocks of data, it may occasionally be the case 408 that the sender has to wait until the next block is 409 available to proceed with the data transfer and momentarily lacks 410 data to send. To address this situation, the filter used by the 411 congestion control algorithm executed in the receiver SHOULD discard 412 the larger samples (e.g. a min filter would achieve this) when 413 measuring the RTT using pure ACK packets. 415 The limitation of the sender's available window to send more packets can 416 come either from the TCP congestion window in host B or from the 417 receive window announced by rLEDBAT in host A.
Normally, the 418 receive window will be the one to limit the sender's transmission 419 rate, since the LBE congestion control algorithm used by the rLEDBAT 420 node is designed to be more restrictive on the sender's rate than 421 standard-TCP. If the limiting factor is the congestion window in the 422 sender, it is less relevant if rLEDBAT further reduces the receive 423 window due to a bloated RTT measurement, since the rLEDBAT is not 424 actively controlling the sender's rate. Nevertheless, the proposed 425 approach to discard larger samples would also address this issue. 427 To address the case in which the limiting factor is the receive 428 window announced by rLEDBAT, the congestion control algorithm at the 429 receiver SHOULD discard the RTT measurements done using pure ACK 430 packets while reducing the window and avoid including bloated samples 431 in the queueing delay estimation. The rLEDBAT receiver is aware 432 whether a given TSVal value was sent in a pure ACK packet where the 433 window was reduced, and if so, it can discard the corresponding RTT 434 measurement. 436 3.2.1.2. Measuring the RTT sending data packets 438 In the case that the rLEDBAT node is sending data packets and 439 matching them with pure ACKs to measure the RTT, a factor that can 440 artificially increase the RTT measured is the presence of delayed 441 Acknowledgements. According to the TS option generation rules 442 [RFC7323], the value included in the TSecr for a delayed ACK is the 443 one in the TSVal field of the earliest unacknowledged segment. This 444 may artificially increase the measured RTT. 446 If both endpoints of the connection are sending data packets, 447 Acknowledgments are piggybacked into the data packets and they are 448 not delayed. Delayed ACKs only increase the RTT measurement in the 449 case that the sender has no data to send. 
Since the expected use 450 case for rLEDBAT is that the sender will be sending background 451 traffic to the rLEDBAT receiver, the cases where delayed ACKs 452 increase the measured RTT are expected to be rare. 454 Nevertheless, measurements done using data packets sent by 455 the rLEDBAT node and matched with pure ACKs sent from the other endpoint of 456 the connection will result in an increased RTT. The additional 457 increase in the measured RTT will range between the transmission 458 delay of one packet and 500 ms. The reason for this is that delayed 459 ACKs are generated every second data packet received and not delayed 460 more than 500 ms according to [I-D.ietf-tcpm-rfc793bis]. The rLEDBAT 461 receiver MAY discard the RTT measurements done using data packets 462 from the rLEDBAT receiver and matching pure ACKs, especially if it 463 has recent measurements done using other packet combinations. 464 Applying a filter that discards larger samples would also address this 465 issue (e.g. a min filter). 467 3.3. Detecting retransmissions and packet losses 469 The rLEDBAT receiver is capable of detecting retransmitted packets in 470 the following way. We call RCV.HGH the highest sequence number 471 corresponding to a received byte of data (not assuming that all bytes 472 with smaller sequence numbers have been received already, there may 473 be holes) and we call TSV.HGH the TSVal value corresponding to the 474 segment in which that byte was carried. SEG.SEQ stands for the 475 sequence number of a newly received segment and we call TSV.SEQ the 476 TSVal value of the newly received segment. 478 If SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH then the newly received 479 segment is a retransmission. This is so because the newly received 480 segment was generated later than another already received segment 481 which contained data with a larger sequence number. This means that 482 the original transmission of this segment was lost and the segment was retransmitted.
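The rule above can be sketched as a small per-connection state machine. This is a simplified illustration, not a complete implementation: the class and method names are hypothetical, and sequence-number and timestamp wraparound are deliberately ignored (an assumption that a real TCP stack could not make).

```python
class RetransmissionDetector:
    """Flag retransmissions using RCV.HGH / TSV.HGH as defined above.

    Assumption: sequence numbers and TSVal values do not wrap around.
    """

    def __init__(self):
        self.rcv_hgh = None  # highest sequence number of a received byte
        self.tsv_hgh = None  # TSVal of the segment that carried that byte

    def on_segment(self, seg_seq, seg_len, tsv_seq):
        """Process a received segment; return True if it is a retransmission."""
        if seg_len == 0:
            return False  # pure ACKs carry no data and are not considered

        # SEG.SEQ < RCV.HGH and TSV.SEQ > TSV.HGH: older data, newer timestamp.
        is_retransmission = (
            self.rcv_hgh is not None
            and seg_seq < self.rcv_hgh
            and tsv_seq > self.tsv_hgh
        )

        # Track the highest received byte and the TSVal that came with it.
        last_byte = seg_seq + seg_len - 1
        if self.rcv_hgh is None or last_byte > self.rcv_hgh:
            self.rcv_hgh = last_byte
            self.tsv_hgh = tsv_seq
        return is_retransmission


det = RetransmissionDetector()
first = det.on_segment(seg_seq=1000, seg_len=100, tsv_seq=10)  # in order
ahead = det.on_segment(seg_seq=1200, seg_len=100, tsv_seq=11)  # leaves a hole
rtx = det.on_segment(seg_seq=1100, seg_len=100, tsv_seq=15)    # fills the hole later
```

Note that a merely reordered segment (older sequence number but also an older or equal TSVal) is not flagged, which is the reason the rule compares timestamps and not only sequence numbers.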
484 The proposed mechanism to detect retransmissions at the receiver 485 fails when there are window tail drops. If all packets in the tail 486 of the window are lost, the receiver will not be able to detect a 487 mismatch between the sequence numbers of the packets and the order of 488 the timestamps. In this case, rLEDBAT will not react to losses but 489 the TCP congestion controller at the sender will, most likely, reduce 490 its window to 1MSS and take over the control of the sending rate, 491 until slow start ramps up and catches the current value of the 492 rLEDBAT window. 494 4. Security Considerations 496 5. IANA Considerations 498 6. Acknowledgements 500 This work was supported by the EU through the H2020 5G-RANGE project 501 and by the Spanish Ministry of Economy and Competitiveness through 502 the 5G-City project (TEC2016-76795-C6-3-R). 504 7. Informative References 506 [I-D.balasubramanian-iccrg-ledbatplusplus] 507 Balasubramanian, P., Ertugay, O., and D. Havey, "LEDBAT++: 508 Congestion Control for Background Traffic", draft- 509 balasubramanian-iccrg-ledbatplusplus-00 (work in 510 progress), July 2019. 512 [I-D.ietf-tcpm-rfc793bis] 513 Eddy, W., "Transmission Control Protocol Specification", 514 draft-ietf-tcpm-rfc793bis-14 (work in progress), July 515 2019. 517 [khono] Kohno, T., Broido, A., and K. Claffy, "Remote physical 518 device fingerprinting", IEEE Transactions on Dependable 519 and Secure Computing Vol 2, Number 2, 2005. 521 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 522 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 523 . 525 [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, 526 "Low Extra Delay Background Transport (LEDBAT)", RFC 6817, 527 DOI 10.17487/RFC6817, December 2012, 528 . 530 [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. 531 Scheffenegger, Ed., "TCP Extensions for High Performance", 532 RFC 7323, DOI 10.17487/RFC7323, September 2014, 533 . 
535 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 536 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 537 RFC 8312, DOI 10.17487/RFC8312, February 2018, 538 . 540 Appendix A. Difficulties while measuring one way delay at the TCP 541 receiver 543 The LEDBAT algorithm uses the one-way delay of packets as input. A 544 TCP receiver can measure the delay of incoming packets directly (as 545 opposed to the sender-based LEDBAT, where the receiver measures the 546 one-way delay and needs to convey it to the sender). 548 In the case of TCP, the receiver can use the Time Stamp option to 549 measure the one way delay by subtracting the time stamp contained in 550 the incoming packet from the local time at which the packet has 551 arrived. 553 In order to measure the one way delay using TCP timestamps, the 554 rLEDBAT receiver needs to discover the units in which the values of 555 the TS option are expressed and second, to account for the skew 556 between the two clocks of the endpoints of the TCP connection. Note 557 that a mismatch of 100 ppm (parts per million) in the estimation at 558 the receiver of the clock rate of the sender accounts for 6 ms of 559 variation per minute in the measured delay for a communication, just 560 one order of magnitude below the target set for controlling the rate 561 by rLEDBAT. Typical skew for untrained clocks is reported to be 562 around 100-200 ppm [RFC6817]. 564 In order to learn both the TS units and the clock skew, the rLEDBAT 565 receiver compares how much local time has elapsed between the sender 566 has issued two packets with different TS values. By comparing the 567 local time difference and the TS value difference, the receiver can 568 assess the TS units and relative clock skews. 
In order for this to 569 be accurate, the packets carrying the different TS values should 570 experience equal (or at least similar) delay when traveling from the 571 sender to the receiver, as any difference in the experienced delays 572 would introduce error in the unit/skew estimation. The receiver 573 should then choose two packets that have experienced a similar delay 574 (for example, the minimum delay of n packets). The problem is that 575 the delay measured is contaminated by the error in the clock unit/ 576 skew estimation, so it is not possible to tell which is the packet 577 that experienced the minimum delay. It would be possible to select 578 the packets that experienced a minimum RTT. The problem with this 579 approach is that the error is essentially the RTT measured, which is 580 not an acceptable bound. 582 Moreover, as this measure to estimate the Time Stamp clock units and 583 drift is affected by the propagation time of the packets themselves, 584 it is not possible to estimate this value by just using two packets 585 if the time between them is short [khono]. Although better 586 estimations can be achieved by using multiple points to estimate 587 the value (i.e., through linear regression), a non-negligible time is 588 required until enough data is gathered to reach the required 589 precision. 591 An additional difficulty regarding the estimation of the TS units and 592 clock skew in the context of (r)LEDBAT is that the LEDBAT congestion 593 controller actions directly affect the (queueing) delay experienced 594 by packets. In particular, if there is an error in the estimation of 595 the TS units/skew, the LEDBAT controller will attempt to compensate 596 for it by reducing/increasing the load. The result is that the LEDBAT 597 operation interferes with the TS units/clock skew measurements.
598 Because of this, measurements are more accurate when there is no 599 traffic in the connection (other than the packets used for the 600 measurements). The problem is that the receiver does not know whether the 601 sender is injecting traffic at any given point in time, so it cannot 602 opportunistically seize quiet intervals to perform measurements. The 603 receiver can, however, force periodic slowdowns, reducing the 604 announced receive window to a few packets, and perform the 605 measurements then. 607 Authors' Addresses 609 Marcelo Bagnulo 610 UC3M 612 Email: marcelo@it.uc3m.es 613 Alberto Garcia-Martinez 614 UC3M 616 Email: alberto@it.uc3m.es 618 Gabriel Montenegro 619 Microsoft 621 Email: Gabriel.Montenegro@microsoft.com 623 Praveen Balasubramanian 624 Microsoft 626 Email: pravb@microsoft.com