idnits 2.17.1 draft-ietf-ledbat-survey-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 24, 2011) is 4750 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Obsolete informational reference (is this intentional?): RFC 1323 (Obsoleted by RFC 7323) -- Obsolete informational reference (is this intentional?): RFC 2309 (Obsoleted by RFC 7567) -- Obsolete informational reference (is this intentional?): RFC 3662 (Obsoleted by RFC 8622) == Outdated reference: A later version (-10) exists of draft-ietf-ledbat-congestion-04 Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force M. Welzl 3 Internet-Draft University of Oslo 4 Intended status: Informational D. 
Ros 5 Expires: October 26, 2011 Institut Telecom / Telecom 6 Bretagne 7 April 24, 2011 9 A Survey of Lower-than-Best-Effort Transport Protocols 10 draft-ietf-ledbat-survey-07.txt 12 Abstract 14 This document provides a survey of transport protocols which are 15 designed to have a smaller bandwidth and/or delay impact on standard 16 TCP than standard TCP itself when they share a bottleneck with it. 17 Such protocols could be used for delay-insensitive "background" 18 traffic, as they provide what is sometimes called a "less than" (or 19 "lower than") best-effort service. 21 Status of this Memo 23 This Internet-Draft is submitted in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/. 31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on October 26, 2011. 38 Copyright Notice 40 Copyright (c) 2011 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 
53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 56 2. Delay-based transport protocols . . . . . . . . . . . . . . . 4 57 2.1. Accuracy of delay-based congestion predictors . . . . . . 6 58 2.2. Potential issues with delay-based congestion control 59 for LBE transport . . . . . . . . . . . . . . . . . . . . 7 60 3. Non-delay-based transport protocols . . . . . . . . . . . . . 8 61 4. Upper-layer approaches . . . . . . . . . . . . . . . . . . . . 9 62 4.1. Receiver-oriented, flow-control based approaches . . . . . 10 63 5. Network-assisted approaches . . . . . . . . . . . . . . . . . 11 64 6. LEDBAT Considerations . . . . . . . . . . . . . . . . . . . . 12 65 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12 66 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 67 9. Security Considerations . . . . . . . . . . . . . . . . . . . 13 68 10. Changes from the previous version (TO BE REMOVED BY THE 69 RFC EDITOR UPON COMPLETION) . . . . . . . . . . . . . . . . . 13 70 11. Informative References . . . . . . . . . . . . . . . . . . . . 13 71 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 18 73 1. Introduction 75 This document presents a brief survey of proposals to attain a Less 76 than Best Effort (LBE) service by means of end-host mechanisms. We 77 loosely define a LBE service as a service which results in smaller 78 bandwidth and/or delay impact on standard TCP than standard TCP 79 itself, when sharing a bottleneck with it. We refer to systems that 80 are designed to provide this service as LBE systems. With the 81 exception of TCP Vegas, which we present for historical reasons, we 82 exclude systems that have been noted to exhibit LBE behavior under 83 some circumstances but were not designed for this purpose (e.g. 84 RAPID [Kon09], [Aru10]). 
86 Generally, LBE behavior can be achieved by reacting to queue growth 87 earlier than standard TCP would, or by changing the congestion 88 avoidance behavior of TCP without utilizing any additional implicit 89 feedback. It is therefore assumed that readers are familiar with TCP 90 congestion control [RFC5681]. Some mechanisms achieve an LBE 91 behavior without modifying transport protocol standards (e.g., by 92 changing the receiver window of standard TCP), whereas others 93 leverage network-level mechanisms at the transport layer for LBE 94 purposes. According to this classification, solutions have been 95 categorized in this document as delay-based transport protocols, non- 96 delay-based transport protocols, upper-layer approaches and network- 97 assisted approaches. Some of the schemes in the first two categories 98 could be implemented using TCP without changing its header format; 99 this would facilitate their deployment in the Internet. The schemes 100 in the third category are, by design, supposed to be especially easy 101 to deploy, because they only describe a way in which existing 102 transport protocols are used. Finally, mechanisms in the last 103 category require changes to equipment along the path, which can 104 greatly complicate their deployment. 106 This document is a product of the Low Extra Delay Background 107 Transport (LEDBAT) Working Group. It aims at putting the congestion 108 control algorithm that the working group has specified [Sha11] in the 109 context of the state of the art in LBE transport. This survey is not 110 exhaustive, as this would not be possible or useful; the authors/ 111 editors have selected key, well-known, or otherwise interesting 112 techniques for inclusion at their discretion. 
There is also a 113 substantial amount of work that is related to the LBE concept but 114 that does not present a solution that can be installed in end hosts or expected 115 to work over the Internet (e.g., there is a DiffServ-based, Lower- 116 Effort service [RFC3662], and the IETF Congestion Exposure (CONEX) 117 Working Group is developing a mechanism which can incentivize LEDBAT- 118 like applications). Such work is outside the scope of this document. 120 2. Delay-based transport protocols 122 It is wrong to generally equate "little impact on standard TCP" with 123 "small sending rate". Without Explicit Congestion Notification (ECN) 124 support, standard TCP will normally increase its congestion window 125 (and effective sending rate) until a queue overflows, causing one or 126 more packets to be dropped and the effective rate to be reduced. A 127 protocol which stops increasing the rate before this event happens 128 can, in principle, achieve better performance than standard TCP. 130 TCP Vegas [Bra94] is one of the first protocols known to 131 have a smaller sending rate than standard TCP when both protocols 132 share a bottleneck [Kur00] -- yet it was designed to achieve more, 133 not less, throughput than standard TCP. Indeed, when TCP Vegas is the 134 only congestion control algorithm used by flows going through the 135 bottleneck, its throughput is greater than the throughput of standard 136 TCP. Depending on the bottleneck queue length, TCP Vegas itself can 137 be starved by standard TCP flows. This can be remedied to some 138 degree by the RED Active Queue Management mechanism [RFC2309]. Vegas 139 linearly increases or decreases the sending rate, based on the 140 difference between the expected throughput and the actual throughput. 141 The estimation is based on RTT measurements.
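The per-RTT comparison driving this linear adjustment can be sketched in a few lines. The following Python fragment is purely illustrative and not taken from any surveyed specification; alpha and beta are the usual Vegas thresholds, here expressed in packets, with example values only:

```python
def vegas_adjust(cwnd, base_rtt, current_rtt, alpha=2, beta=4):
    """Illustrative per-RTT congestion-avoidance step of TCP Vegas.

    cwnd is the congestion window in packets, base_rtt the minimum
    RTT observed so far, current_rtt the latest RTT measurement.
    alpha and beta (in packets, alpha < beta) are example values.
    """
    expected = cwnd / base_rtt             # throughput with no queuing
    actual = cwnd / current_rtt            # throughput actually seen
    diff = (expected - actual) * base_rtt  # estimated packets queued
    if diff > beta:       # sign of congestion: linear decrease
        return cwnd - 1
    if diff < alpha:      # network underutilized: linear increase
        return cwnd + 1
    return cwnd           # between the thresholds: hold
```

With these example values, a flow estimated to be queuing more than four packets at the bottleneck backs off by only one packet per RTT; this moderate linear decrease is part of what limits Vegas as an LBE mechanism.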
143 The congestion avoidance behavior is the protocol's most important 144 feature in terms of historical relevance as well as relevance in the 145 context of this document (it has been shown that other elements of 146 the protocol can sometimes play a greater role in its overall 147 behavior [Hen00]). In congestion avoidance, once per RTT, TCP Vegas 148 calculates the expected throughput as WindowSize / BaseRTT, where 149 WindowSize is the current congestion window and BaseRTT is the 150 minimum of all measured RTTs. The expected throughput is then 151 compared with the actual throughput measured by recent 152 acknowledgements. If the actual throughput is smaller than the 153 expected throughput minus a threshold called "beta", this is taken as 154 a sign of congestion, causing the protocol to linearly decrease its 155 rate. If the actual throughput is greater than the expected 156 throughput minus a threshold called "alpha" (with alpha < beta), this 157 is taken as a sign that the network is underutilized, causing the 158 protocol to linearly increase its rate. 160 TCP Vegas has been analyzed extensively. One of the most prominent 161 properties of TCP Vegas is its fairness towards competing flows of the 162 same kind: unlike standard TCP, it does not penalize flows with large 163 propagation delays. While it was not the first 164 protocol that used delay as a congestion indication, its predecessors 165 (like CARD [Jai89], Tri-S [Wan91] or DUAL [Wan92]) are not discussed 166 here because of the historical "landmark" role that TCP Vegas has 167 taken in the literature. 169 Delay-based transport protocols which were designed to be non- 170 intrusive include TCP Nice [Ven02] and TCP Low Priority (TCP-LP) 171 [Kuz06]. TCP Nice [Ven02] follows the same basic approach as TCP 172 Vegas but improves upon it in some aspects.
Because of its moderate 173 linear-decrease congestion response, TCP Vegas can affect standard 174 TCP despite its ability to detect congestion early. TCP Nice removes 175 this issue by halving the congestion window (at most once per RTT, 176 like standard TCP) instead of linearly reducing it. To avoid being 177 too conservative, this is only done if a fixed predefined fraction of 178 delay-based incipient congestion signals appears within one RTT. 179 Otherwise, TCP Nice falls back to the congestion avoidance rules of 180 TCP Vegas if no packet was lost or standard TCP if a packet was lost. 181 One more feature of TCP Nice is its ability to support a congestion 182 window of less than one packet, by clocking out single packets over 183 more than one RTT. With ns-2 simulations and real-life experiments 184 using a Linux implementation, the authors of [Ven02] show that TCP 185 Nice achieves its goal of efficiently utilizing spare capacity while 186 being non-intrusive to standard TCP. 188 Unlike TCP Vegas and TCP Nice, TCP-LP [Kuz06] uses only the one- 189 way delay (OWD), rather than the RTT, as an indicator of incipient 190 congestion. This is done to avoid reacting to delay fluctuations 191 that are caused by reverse cross-traffic. Using the TCP Timestamps 192 option [RFC1323], the OWD is determined as the difference between the 193 receiver's Timestamp value in the ACK and the original Timestamp 194 value that the receiver copied into the ACK. While the result of 195 this subtraction can only precisely represent the OWD if clocks are 196 synchronized, its absolute value is of no concern to TCP-LP and hence 197 clock synchronization is unnecessary.
Using a constant smoothing 198 parameter, TCP-LP calculates an Exponentially Weighted Moving Average 199 (EWMA) of the measured OWD and checks whether the result exceeds a 200 threshold within the range of the minimum and maximum OWD that was 201 seen during the connection's lifetime; if it does, this condition is 202 interpreted as an "early congestion indication". The minimum and 203 maximum OWD values are initialized during the slow-start phase. 205 Regarding its reaction to an early congestion indication, TCP-LP 206 tries to strike a middle ground between the overly conservative 207 choice of _immediately_ setting the congestion window to one packet, 208 and the presumably too aggressive choice of simply halving the 209 congestion window like standard TCP; TCP-LP tries to delay the former 210 action by an additional RTT, to see if there is persistent congestion 211 or not. It does so by halving the window at first in response to an 212 early congestion indication, then initializing an "inference time-out 213 timer", and maintaining the current congestion window until this 214 timer fires. If another early congestion indication appears during 215 this "inference phase", the window is then set to 1; otherwise, the 216 window is maintained and TCP-LP continues to increase it in the 217 standard Additive-Increase fashion. This method ensures that it 218 takes at least two RTTs for a TCP-LP flow to decrease its window to 219 1, and, like standard TCP, TCP-LP reacts to congestion at most once 220 per RTT. 222 Using a simple analytical model, the authors of TCP-LP [Kuz06] 223 illustrate the feasibility of a delay-based LBE transport by showing 224 that, due to the non-linear relationship between throughput and RTT, 225 it is possible to avoid interfering with standard TCP traffic even 226 when the flows under consideration have a larger RTT than standard 227 TCP flows.
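The detection and two-phase reaction described above can be condensed into a sketch. This is illustrative Python, not the reference TCP-LP code; the smoothing weight gamma and threshold fraction delta stand in for the constants chosen in [Kuz06], and the inference time-out timer is approximated by a one-shot flag checked once per RTT:

```python
class TcpLpSketch:
    """Illustrative sketch of TCP-LP's early-congestion logic.

    gamma: EWMA smoothing weight; delta: threshold position as a
    fraction of the [min, max] OWD range. Both values are
    placeholders, not the constants from [Kuz06].
    """
    def __init__(self, gamma=0.125, delta=0.15):
        self.gamma, self.delta = gamma, delta
        self.owd_min = self.owd_max = self.smoothed = None
        self.inference_phase = False

    def early_congestion(self, owd_sample):
        """Update the OWD EWMA; True if it crosses the threshold."""
        if self.smoothed is None:  # initialized during slow start
            self.owd_min = self.owd_max = self.smoothed = owd_sample
        self.owd_min = min(self.owd_min, owd_sample)
        self.owd_max = max(self.owd_max, owd_sample)
        self.smoothed = ((1 - self.gamma) * self.smoothed
                         + self.gamma * owd_sample)
        threshold = self.owd_min + self.delta * (self.owd_max - self.owd_min)
        return self.smoothed > threshold

    def react(self, cwnd, early_congestion):
        """Two-phase response: halve first; drop to 1 packet only if
        the indication persists into the inference phase."""
        if not early_congestion:
            self.inference_phase = False
            return cwnd + 1            # standard additive increase
        if self.inference_phase:       # second indication in a row
            self.inference_phase = False
            return 1
        self.inference_phase = True    # first indication: halve, wait
        return max(1, cwnd // 2)
```

A real implementation would arm an actual timer and hold the window constant while it runs; the sketch collapses that wait into the next per-RTT call.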
With ns-2 simulations and real-life experiments using a 228 Linux implementation, the authors of [Kuz06] show that TCP-LP is 229 largely non-intrusive to TCP traffic while at the same time enabling 230 it to utilize a large portion of the excess network bandwidth, which 231 is fairly shared among competing TCP-LP flows. They also show that 232 using their protocol for bulk data transfers greatly reduces file 233 transfer times of competing best-effort web traffic. 235 Sync-TCP [Wei05] follows a similar approach to TCP-LP, by adapting 236 its reaction to congestion according to changes in the OWD. By 237 comparing the estimated (average) forward queuing delay to the 238 maximum observed delay, Sync-TCP adapts the AIMD parameters depending 239 on the trend followed by the average delay over an observation 240 window. Even though the authors of [Wei05] did not explicitly 241 consider its use as an LBE protocol, Sync-TCP was designed to react 242 early to incipient congestion, while grabbing available bandwidth 243 more aggressively than a standard TCP in congestion-avoidance mode. 245 Delay-based congestion control also underlies proposals 246 aiming to adapt TCP's congestion avoidance to very high-speed 247 networks. Some of these proposals, like Compound TCP [Tan06][Sri08] 248 and TCP Illinois [Liu08], are hybrid loss- and delay-based 249 mechanisms, whereas others (e.g., NewVegas [Dev03], FAST TCP [Wei06] 250 or CODE TCP [Cha10]) are variants of Vegas based primarily on delays. 252 2.1. Accuracy of delay-based congestion predictors 254 The accuracy of delay-based congestion predictors has been the 255 subject of a good deal of research; see, e.g., [Bia03], [Mar03], 256 [Pra04], [Rew06], [McC08]. The main result of most of these studies 257 is that delays (or, more precisely, round-trip times) are, in 258 general, weakly correlated with congestion.
There are several 259 factors that may induce such a poor correlation: 261 o Bottleneck buffer size: in principle, a delay-based mechanism 262 could be made "more than TCP friendly" _if_ buffers are "large 263 enough", so that RTT fluctuations and/or deviations from the 264 minimum RTT can be detected by the end-host with reasonable 265 accuracy. Otherwise, it may be hard to distinguish real delay 266 variations from measurement noise. 268 o RTT measurement issues: in principle, RTT samples may suffer from 269 poor resolution, due to timers which are too coarse-grained with 270 respect to the scale of delay fluctuations. Also, a flow may 271 obtain a very noisy estimate of RTTs due to undersampling, under 272 some circumstances (e.g., the flow rate is much lower than the 273 link bandwidth). For TCP, other potential sources of measurement 274 noise include TCP segmentation offloading (TSO) and the use of 275 delayed ACKs [Hay10]. A congested reverse path may also result in 276 an erroneous assessment of the congestion state of the forward 277 path. Finally, in the case of fast or short-distance links, the 278 majority of the measured delay can in fact be due to processing in 279 the involved hosts; typically, this processing delay is not of 280 interest, and it can undergo fluctuations that are not related to 281 the network at all. 283 o Level of statistical multiplexing and RTT sampling: it may be easy 284 for an individual flow to "miss" loss/queue overflow events, 285 especially if the number of flows sharing a bottleneck buffer is 286 significant. This is nicely illustrated, e.g., in Fig. 1 of 287 [McC08]. 289 o Impact of wireless links: several mechanisms that are typical of 290 wireless links, like link-layer scheduling and error recovery, may 291 induce strong delay fluctuations over short time scales [Gur04]. 293 Interestingly, the results of Bhandarkar et al.
[Bha07] seem to 294 paint a slightly different picture regarding the accuracy of delay- 295 based congestion prediction. Bhandarkar et al. claim that it is 296 possible to significantly improve prediction accuracy by adopting 297 some simple techniques (smoothing of RTT samples, increasing the RTT 298 sampling frequency). Nonetheless, they acknowledge that even with 299 such techniques, it is not possible to eradicate detection errors. 300 Their proposed delay-based congestion avoidance method, PERT 301 (Probabilistic Early Response TCP), mitigates the impact of residual 302 detection errors by means of a probabilistic response mechanism to 303 congestion detection events. 305 2.2. Potential issues with delay-based congestion control for LBE 306 transport 308 Whether a delay-based protocol behaves in its intended manner (e.g., 309 it is "more than TCP friendly", or it grabs available bandwidth in a 310 very aggressive manner) may depend on the accuracy issues listed in 311 Section 2.1. Moreover, protocols like Vegas need to keep an estimate 312 of the minimum ("base") delay; this makes such protocols highly 313 sensitive to possible changes in the end-to-end route during the 314 lifetime of the flow [Mo99]. 316 Regarding the issue of false positives/false negatives with a delay- 317 based congestion detector, most studies focus on the loss of 318 throughput coming from the erroneous detection of queue build-up and 319 of alleviation of congestion. Arguably, for an LBE transport protocol 320 it is better to err on the "more-than-TCP-friendly side", that is, to 321 always yield to _perceived_ congestion whether it is "real" or not; 322 however, failure to detect congestion (due to one of the above 323 accuracy problems) would result in behavior that is not LBE. For 324 instance, consider the case in which the bottleneck buffer is small, 325 so that the contribution of queueing delay at the bottleneck to the 326 global end-to-end delay is small.
In such a case, a flow using a 327 delay-based mechanism might end up consuming a good deal of bandwidth 328 with respect to a competing standard TCP flow, unless it also 329 incorporates a suitable reaction to loss. 331 A delay-based mechanism may also suffer from the so-called "latecomer 332 advantage" (or latecomer unfairness) problem. Consider the case in 333 which the bottleneck link is already (very) congested. In such a 334 scenario, delay variations may be quite small; hence, it may be very 335 difficult to tell an empty queue from a heavily-loaded queue, in 336 terms of delay fluctuation. Therefore, a newly-arriving delay-based 337 flow may start sending faster when there is already heavy congestion, 338 eventually driving away loss-based flows [Sha05][Car10]. 340 3. Non-delay-based transport protocols 342 There exist a few transport-layer proposals that achieve an LBE 343 service without relying on delay as an indicator of congestion. In 344 the algorithms discussed below, the loss rate of the flow determines, 345 either implicitly or explicitly, the sending rate (which is adapted 346 so as to obtain a lower share of the available bandwidth than 347 standard TCP); such mechanisms likely cause more queuing delay and 348 react to congestion more slowly than delay-based ones. 350 4CP [Liu07], which stands for "Competitive and Considerate Congestion 351 Control", is a protocol which provides an LBE service by changing the 352 window control rules of standard TCP. A "virtual window" is 353 maintained which, during a so-called "bad congestion phase", is 354 reduced to less than a predefined minimum value of the actual 355 congestion window. The congestion window is only increased again 356 once the virtual window exceeds this minimum, and in this way the 357 virtual window controls the duration during which the sender 358 transmits with a fixed minimum rate.
Whether the congestion state is 359 "bad" or "good" depends on whether the loss event rate is above or 360 below a threshold (or target) value. The 4CP congestion avoidance 361 algorithm allows for setting a target average window and avoids 362 starvation of "background" flows while bounding the impact on 363 "foreground" flows. Its performance was evaluated in ns-2 364 simulations and in real-life experiments with a kernel-level 365 implementation in Microsoft Windows Vista. 367 The MulTFRC [Dam09] protocol is an extension of TCP-Friendly Rate 368 Control (TFRC) [RFC5348] for multiple flows. MulTFRC takes the main 369 idea of MulTCP [Cro98] and similar proposals (e.g., [Hac04], [Hac08], 370 [Kuo08]) a step further. A single MulTCP flow tries to emulate (and 371 be as friendly as) a number N > 1 of parallel TCP flows. By 372 supporting values of N between 0 and 1, MulTFRC can be used as a 373 mechanism for an LBE service. Since it does not react to delay like 374 the protocols described in Section 2 but adjusts its rate like TFRC, 375 MulTFRC can probably be expected to be more aggressive than 376 mechanisms such as TCP Nice or TCP-LP. This also means that MulTFRC 377 is less likely to be prone to starvation, as its aggressiveness is 378 tunable at a fine granularity, even when N is between 0 and 1. 380 4. Upper-layer approaches 382 The proposals described in this section do not require modifying 383 transport protocol standards. Most of them can be regarded as 384 running "on top" of an existing transport, even though they may be 385 implemented either at the application layer (i.e., in user-level 386 processes), or in the kernel of the end hosts' operating system. 387 Such "upper-layer" mechanisms may arguably be easier to deploy than 388 transport-layer approaches, since they do not require any changes to 389 the transport itself.
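As a minimal illustration of the upper-layer idea, the following sketch paces application writes over an unmodified transport so that a bulk transfer never exceeds a configured background rate. This is illustrative Python; the function and parameter names are invented for this example, and a static rate cap is much cruder than the adaptive schemes described in this section:

```python
import time

def paced_send(send_chunk, data, rate_bytes_per_s, chunk=16384,
               sleep=time.sleep, clock=time.monotonic):
    """Illustrative application-layer pacing: push data through an
    existing transport (send_chunk, e.g. a TCP socket's sendall) at
    no more than rate_bytes_per_s on average. Unlike a true LBE
    mechanism, this does not adapt to congestion, so the rate must
    be chosen conservatively."""
    start, sent = clock(), 0
    for off in range(0, len(data), chunk):
        piece = data[off:off + chunk]
        send_chunk(piece)
        sent += len(piece)
        # Sleep just long enough that sent/elapsed stays at or
        # below the target rate.
        delay = (start + sent / rate_bytes_per_s) - clock()
        if delay > 0:
            sleep(delay)
    return sent
```

For instance, paced_send(sock.sendall, payload, 50_000) would trickle payload out at roughly 50 kB/s over an ordinary TCP socket, leaving the transport protocol itself untouched.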
391 A simplistic, application-level approach to a background transport 392 service may consist in scheduling automated transfers at times when 393 the network is lightly loaded, as described, e.g., in [Dyk02] for 394 cooperative proxy caching. An issue with such a technique is that it 395 may not necessarily be applicable to applications like peer-to-peer 396 file transfer, since the notion of an "off-peak hour" is not 397 meaningful when end-hosts may be located anywhere in the world. 399 The so-called Background Intelligent Transfer Service (BITS) [BITS] 400 is implemented in several versions of Microsoft Windows. BITS uses a 401 system of application-layer priority levels for file-transfer jobs, 402 together with monitoring of bandwidth usage of the network interface 403 (or, in more recent versions, of the network gateway connected to the 404 end-host), so that low-priority transfers at a given end-host give 405 way to both high-priority (foreground) transfers and traffic from 406 interactive applications at the same host. 408 A different approach is taken in [Egg05] -- here, the priority of a 409 flow is reduced via a generic idletime scheduling strategy in a 410 host's operating system. While results presented in this paper show 411 that the new scheduler can effectively shield regular tasks from low- 412 priority ones (e.g., TCP from greedy UDP) with only a minor 413 performance impact, it is an underlying assumption that all involved 414 end hosts would use the idletime scheduler. In other words, it is 415 not the focus of this work to protect a standard TCP flow which 416 originates from any host where the presented scheduling scheme may 417 not be implemented. 419 4.1. Receiver-oriented, flow-control based approaches 421 Some proposals for achieving an LBE behavior work by exploiting 422 existing transport-layer features -- typically, at the "receiving" 423 side.
In particular, TCP's built-in flow control can be used as a 424 means to achieve a low-priority transport service. 426 The mechanism described in [Spr00] is an example of the above 427 technique. This mechanism controls the bandwidth by letting the 428 receiver intelligently manipulate the receiver window of standard 429 TCP. This is possible because the authors assume a client-server 430 setting where the receiver's access link is typically the bottleneck. 431 The scheme incorporates a delay-based calculation of the expected 432 queue length at the bottleneck, which is quite similar to the 433 calculation in the above delay-based protocols, e.g., TCP Vegas. 434 Using a Linux implementation, where TCP flows are classified 435 according to their application's needs, Spring et al. show in [Spr00] 436 that a significant improvement in packet latency can be attained over 437 an unmodified system, while maintaining good link utilization. 439 A similar method is employed by Mehra et al. [Meh03], where both the 440 advertised receiver window and the delay in sending ACK messages are 441 dynamically adapted to attain a given rate. As in [Spr00], Mehra et 442 al. assume that the bottleneck is located at the receiver's access 443 link. However, the latter also propose a bandwidth-sharing system 444 that allows controlling the bandwidth allocated to different flows, as 445 well as allotting a minimum rate to some flows. 447 Receiver window tuning is also done in [Key04], where choosing the 448 right value for the window is phrased as an optimization problem. On 449 this basis, two algorithms are presented, binary search -- which is 450 faster than the other one at achieving a good operation point but 451 fluctuates -- and stochastic optimization, which does not fluctuate 452 but converges more slowly than binary search. These algorithms merely use 453 the previous receiver window and the amount of data received during 454 the previous control interval as input.
According to [Key04], the 455 encouraging simulation results suggest that such an application-level 456 mechanism can work almost as well as a transport-layer scheme like 457 TCP-LP. 459 Another way of dealing with non-interactive flows, such as web 460 prefetching, is to rate-limit the transfer of such bursty traffic 461 [Cro98b]. Note that one of the techniques used in [Cro98b] is, 462 precisely, to have the downloading application adapt the TCP receiver 463 window, so as to reduce the data rate to the minimum needed (thus, 464 disturbing other flows as little as possible while respecting a 465 deadline for the transfer of the data). 467 5. Network-assisted approaches 469 Network-layer mechanisms, like active queue management (AQM) and 470 packet scheduling in routers, can be exploited by a transport 471 protocol for achieving an LBE service. Such approaches may result in 472 improved protection of non-LBE flows (e.g., when scheduling is used); 473 besides, approaches using an explicit, AQM-based congestion signaling 474 may arguably be more robust than, say, delay-based transports for 475 detecting impending congestion. However, an obvious drawback of any 476 network-assisted approach is that, in principle, it needs 477 modifications in both end-hosts and intermediate network nodes. 479 Harp [Kok04] realizes an LBE service by dissipating background traffic 480 to less-utilized paths of the network, based on multipath routing and 481 multipath congestion control. This is achieved without changing all 482 routers, by using edge nodes as relays. According to the authors, 483 these edge nodes should be gateways of organizations in order to 484 align their scheme with usage incentives, but the technical solution 485 would also work if Harp was only deployed in end hosts.
It detects 486 impending congestion by looking at delay, similar to TCP Nice 487 [Ven02], and manages to improve the utilization and fairness of TCP 488 over pure single-path solutions without requiring any changes to 489 TCP itself. 491 Another technique is that used by protocols like Network-Friendly TCP 492 (NF-TCP) [Aru10], where a bandwidth-estimation module integrated into 493 the transport protocol allows it to rapidly take advantage of free 494 capacity. NF-TCP combines this with an early congestion detection 495 based on Explicit Congestion Notification (ECN) [RFC3168] and RED 496 [RFC2309]; when congestion starts building up, appropriate tuning of 497 a RED queue allows marking low-priority (i.e., NF-TCP) packets with a 498 much higher probability than high-priority (i.e., standard TCP) 499 packets, so low-priority flows yield up bandwidth before standard TCP 500 flows. NF-TCP could be implemented by adapting the congestion 501 control behavior of TCP without requiring changes to the protocol on 502 the wire -- with the only exception that NF-TCP-capable routers must 503 be able to somehow distinguish NF-TCP traffic from other TCP traffic. 505 In [Ven08], Venkataraman et al. propose a transport-layer approach to 506 leverage an existing, network-layer LBE service based on priority 507 queueing. Their transport protocol, which they call PLT (Priority- 508 Layer Transport), splits a layer-4 connection into two flows, a high- 509 priority one and a low-priority one. The high-priority flow is sent 510 over the higher-priority queueing class (in principle, offering a 511 best-effort service) using an AIMD, TCP-like congestion control 512 mechanism. The low-priority flow, which is mapped to the LBE class, 513 uses a non-TCP-friendly congestion control algorithm. The goal of 514 PLT is thus to maximize its aggregate throughput by exploiting unused 515 capacity in an aggressive way, while protecting standard TCP flows 516 carried by the best-effort class.
Similar in spirit, [Ott03] 517 proposes simple changes to only the AIMD parameters of TCP for use 518 over a network-layer LBE service, so that such "filler" traffic may 519 aggressively consume unused bandwidth. Note that [Ven08] also 520 considers a mechanism for detecting the lack of priority queueing in 521 the network, so that the non-TCP-friendly flow may be inhibited. The 522 PLT receiver monitors the loss rate of both flows; if the high- 523 priority flow starts seeing losses while the low-priority one does 524 not experience 100% loss, this is taken as an indication of the 525 absence of strict priority queueing. 527 6. LEDBAT Considerations 529 The previous sections have shown that there is a large amount of work 530 on attaining an LBE service, and that it is quite heterogeneous in 531 nature. The algorithm developed by the LEDBAT Working Group [Sha11] 532 can be classified as a delay-based mechanism, and as such is similar 533 in spirit to the protocols presented in Section 2. It is, however, 534 not a protocol -- how it is actually applied to the Internet, i.e., 535 how to use existing or even new transport protocols together with the 536 LEDBAT algorithm, is not defined by the LEDBAT Working Group. As it 537 heavily relies on delay, the discussion in Section 2.1 and 538 Section 2.2 applies to it. The performance of LEDBAT has been 539 analyzed in comparison with some of the other work presented here in 540 several articles, e.g., [Aru10], [Car10], [Sch10], but these analyses 541 have to be examined with care: at the time of writing, LEDBAT was 542 still a moving target. 544 7. Acknowledgements 546 The authors would like to thank Melissa Chavez, Dragana Damjanovic 547 and Yinxia Zhao for reference pointers, as well as Jari Arkko, 548 Mayutan Arumaithurai, Elwyn Davies, Wesley Eddy, Stephen Farrell, 549 Mirja Kuehlewind, Tina Tsou and Rolf Winter for their detailed 550 reviews and suggestions. 552 8.
IANA Considerations

This memo includes no request to IANA.

9. Security Considerations

This document introduces no new security considerations.

10. Changes from the previous version (TO BE REMOVED BY THE RFC EDITOR UPON COMPLETION)

o Fixed a broken sentence at the end of the introduction.

11. Informative References

[Aru10] Arumaithurai, M., Fu, X., and K. Ramakrishnan, "NF-TCP: A Network Friendly TCP Variant for Background Delay-Insensitive Applications", Technical Report No. IFI-TB-2010-05, Institute of Computer Science, University of Goettingen, Germany, September 2010.

[BITS] Microsoft, "Windows Background Intelligent Transfer Service".

[Bha07] Bhandarkar, S., Reddy, A., Zhang, Y., and D. Loguinov, "Emulating AQM from end hosts", Proceedings of ACM SIGCOMM 2007, 2007.

[Bia03] Biaz, S. and N. Vaidya, "Is the round-trip time correlated with the number of packets in flight?", Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement (IMC '03), pages 273-278, 2003.

[Bra94] Brakmo, L., O'Malley, S., and L. Peterson, "TCP Vegas: New techniques for congestion detection and avoidance", Proceedings of SIGCOMM '94, pages 24-35, August 1994.

[Car10] Carofiglio, G., Muscariello, L., Rossi, D., and S. Valenti, "The quest for LEDBAT fairness", Proceedings of IEEE GLOBECOM 2010, December 2010.

[Cha10] Chan, Y., Lin, C., Chan, C., and C. Ho, "CODE TCP: A competitive delay-based TCP", Computer Communications, 33(9):1013-1029, June 2010.

[Cro98] Crowcroft, J. and P. Oechslin, "Differentiated end-to-end Internet services using a weighted proportional fair sharing TCP", ACM SIGCOMM Computer Communication Review, vol. 28, no. 3 (July 1998), pp. 53-69, 1998.

[Cro98b] Crovella, M. and P. Barford, "The network effects of prefetching", Proceedings of IEEE INFOCOM 1998, April 1998.

[Dam09] Damjanovic, D. and M.
Welzl, "MulTFRC: Providing Weighted Fairness for Multimedia Applications (and others too!)", ACM Computer Communication Review, vol. 39, no. 3 (July 2009), 2009.

[Dev03] De Vendictis, A., Baiocchi, A., and M. Bonacci, "Analysis and enhancement of TCP Vegas congestion control in a mixed TCP Vegas and TCP Reno network scenario", Performance Evaluation, 53(3-4):225-253, 2003.

[Dyk02] Dykes, S. and K. Robbins, "Limitations and benefits of cooperative proxy caching", IEEE Journal on Selected Areas in Communications, 20(7):1290-1304, September 2002.

[Egg05] Eggert, L. and J. Touch, "Idletime Scheduling with Preemption Intervals", Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP 2005), Brighton, United Kingdom, pp. 249-262, October 2005.

[Gur04] Gurtov, A. and S. Floyd, "Modeling wireless links for transport protocols", ACM SIGCOMM Computer Communication Review, 34(2):85-96, April 2004.

[Hac04] Hacker, T., Noble, B., and B. Athey, "Improving Throughput and Maintaining Fairness using Parallel TCP", Proceedings of IEEE INFOCOM 2004, March 2004.

[Hac08] Hacker, T. and P. Smith, "Stochastic TCP: A Statistical Approach to Congestion Avoidance", Proceedings of PFLDnet 2008, March 2008.

[Hay10] Hayes, D., "Timing enhancements to the FreeBSD kernel to support delay and rate based TCP mechanisms", Technical Report 100219A, Centre for Advanced Internet Architectures, Swinburne University of Technology, February 2010.

[Hen00] Hengartner, U., Bolliger, J., and T. Gross, "TCP Vegas revisited", Proceedings of IEEE INFOCOM 2000, March 2000.

[Jai89] Jain, R., "A delay-based approach for congestion avoidance in interconnected heterogeneous computer networks", ACM Computer Communication Review, 19(5):56-71, October 1989.

[Key04] Key, P., Massoulie, L., and B.
Wang, "Emulating Low-Priority Transport at the Application Layer: a Background Transfer Service", Proceedings of ACM SIGMETRICS 2004, January 2004.

[Kok04] Kokku, R., Bohra, A., Ganguly, S., and A. Venkataramani, "A Multipath Background Network Architecture", Proceedings of IEEE INFOCOM 2007, May 2007.

[Kon09] Konda, V. and J. Kaur, "RAPID: Shrinking the Congestion-control Timescale", Proceedings of IEEE INFOCOM 2009, April 2009.

[Kuo08] Kuo, F. and X. Fu, "Probe-Aided MulTCP: an aggregate congestion control mechanism", ACM SIGCOMM Computer Communication Review, vol. 38, no. 1 (January 2008), pp. 17-28, 2008.

[Kur00] Kurata, K., Hasegawa, G., and M. Murata, "Fairness Comparisons Between TCP Reno and TCP Vegas for Future Deployment of TCP Vegas", Proceedings of INET 2000, July 2000.

[Kuz06] Kuzmanovic, A. and E. Knightly, "TCP-LP: low-priority service via end-point congestion control", IEEE/ACM Transactions on Networking (ToN), Volume 14, Issue 4, pp. 739-752, August 2006.

[Liu07] Liu, S., Vojnovic, M., and D. Gunawardena, "Competitive and Considerate Congestion Control for Bulk Data Transfers", Proceedings of IWQoS 2007, June 2007.

[Liu08] Liu, S., Basar, T., and R. Srikant, "TCP-Illinois: A loss- and delay-based congestion control algorithm for high-speed networks", Performance Evaluation, 65(6-7):417-440, 2008.

[Mar03] Martin, J., Nilsson, A., and I. Rhee, "Delay-based congestion avoidance for TCP", IEEE/ACM Transactions on Networking, 11(3):356-369, June 2003.

[McC08] McCullagh, G. and D. Leith, "Delay-based congestion control: Sampling and correlation issues revisited", Technical report, Hamilton Institute, 2008.

[Meh03] Mehra, P., Zakhor, A., and C. De Vleeschouwer, "Receiver-Driven Bandwidth Sharing for TCP", Proceedings of IEEE INFOCOM 2003, April 2003.

[Mo99] Mo, J., La, R., Anantharam, V., and J.
Walrand, "Analysis and Comparison of TCP Reno and TCP Vegas", Proceedings of IEEE INFOCOM 1999, March 1999.

[Ott03] Ott, B., Warnky, T., and V. Liberatore, "Congestion control for low-priority filler traffic", SPIE QoS 2003 (Quality of Service over Next-Generation Internet), in Proc. SPIE, Vol. 5245, 154, Monterey (CA), USA, July 2003.

[Pra04] Prasad, R., Jain, M., and C. Dovrolis, "On the effectiveness of delay-based congestion avoidance", Proceedings of PFLDnet, 2004.

[RFC1323] Jacobson, V., Braden, B., and D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[RFC3662] Bless, R., Nichols, K., and K. Wehrle, "A Lower Effort Per-Domain Behavior (PDB) for Differentiated Services", RFC 3662, December 2003.

[RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, September 2008.

[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.

[Rew06] Rewaskar, S., Kaur, J., and D. Smith, "Why don't delay-based congestion estimators work in the real-world?", Technical report TR06-001, University of North Carolina at Chapel Hill, Dept. of Computer Science, January 2006.

[Sch10] Schneider, J., Wagner, J., Winter, R., and H.
Kolbe, "Out of my Way -- Evaluating Low Extra Delay Background Transport in an ADSL Access Network", Proceedings of the 22nd International Teletraffic Congress (ITC 22), 2010.

[Sha05] Shalunov, S., Dunn, L., Gu, Y., Low, S., Rhee, I., Senger, S., Wydrowski, B., and L. Xu, "Design Space for a Bulk Transport Tool", Technical Report, Internet2 Transport Group, May 2005.

[Sha11] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind, "Low Extra Delay Background Transport (LEDBAT)", draft-ietf-ledbat-congestion-04.txt (work in progress), March 2011.

[Spr00] Spring, N., Chesire, M., Berryman, M., Sahasranaman, V., Anderson, T., and B. Bershad, "Receiver based management of low bandwidth access links", Proceedings of IEEE INFOCOM 2000, pp. 245-254, vol. 1, 2000.

[Sri08] Sridharan, M., Tan, K., Bansala, D., and D. Thaler, "Compound TCP: A new TCP congestion control for high-speed and long distance networks", Internet Draft draft-sridharan-tcpm-ctcp (work in progress), November 2008.

[Tan06] Tan, K., Song, J., Zhang, Q., and M. Sridharan, "A Compound TCP approach for high-speed and long distance networks", Proceedings of IEEE INFOCOM 2006, Barcelona, Spain, April 2006.

[Ven02] Venkataramani, A., Kokku, R., and M. Dahlin, "TCP Nice: a mechanism for background transfers", Proceedings of OSDI '02, 2002.

[Ven08] Venkataraman, V., Francis, P., Kodialam, M., and T. Lakshman, "A priority-layered approach to transport for high bandwidth-delay product networks", Proceedings of ACM CoNEXT, Madrid, December 2008.

[Wan91] Wang, Z. and J. Crowcroft, "A new congestion control scheme: slow start and search (Tri-S)", ACM Computer Communication Review, 21(1):56-71, January 1991.

[Wan92] Wang, Z. and J.
Crowcroft, "Eliminating periodic packet losses in the 4.3-Tahoe BSD TCP congestion control algorithm", ACM Computer Communication Review, 22(2):9-16, January 1992.

[Wei05] Weigle, M., Jeffay, K., and F. Smith, "Delay-based early congestion detection and adaptation in TCP: impact on web performance", Computer Communications, 28(8):837-850, May 2005.

[Wei06] Wei, D., Jin, C., Low, S., and S. Hegde, "FAST TCP: Motivation, architecture, algorithms, performance", IEEE/ACM Transactions on Networking, 14(6):1246-1259, December 2006.

Authors' Addresses

Michael Welzl
University of Oslo
Department of Informatics, PO Box 1080 Blindern
N-0316 Oslo
Norway

Phone: +47 22 85 24 20
Email: michawe@ifi.uio.no

David Ros
Institut Telecom / Telecom Bretagne
Rue de la Chataigneraie, CS 17607
35576 Cesson Sevigne cedex
France

Phone: +33 2 99 12 70 46
Email: david.ros@telecom-bretagne.eu