Internet Engineering Task Force                                 M. Welzl
Internet-Draft                                        University of Oslo
Intended status: Informational                                        D.
Ros
Expires: July 18, 2011                     Institut Telecom / Telecom Bretagne
                                                        January 14, 2011

        A Survey of Lower-than-Best-Effort Transport Protocols
                    draft-ietf-ledbat-survey-04.txt

Abstract

   This document provides a survey of transport protocols which are designed to have a smaller bandwidth and/or delay impact on standard TCP than standard TCP itself when they share a bottleneck with it.  Such protocols could be used for delay-insensitive "background" traffic, as they provide what is sometimes called a "less than" (or "lower than") best-effort service.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).  Note that other groups may also distribute working documents as Internet-Drafts.  The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 18, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document.  Please review these documents carefully, as they describe your rights and restrictions with respect to this document.  Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents

   1.  Introduction
   2.  Delay-based transport protocols
     2.1.  Accuracy of delay-based congestion predictors
     2.2.  Potential issues with delay-based congestion control for LBE transport
   3.  Non-delay-based transport protocols
   4.  Upper-layer approaches
     4.1.  Receiver-oriented, flow-control based approaches
   5.  Network-assisted approaches
   6.  Conclusion and LEDBAT Considerations
   7.  Acknowledgements
   8.  IANA Considerations
   9.  Security Considerations
   10. Changes from the previous version (section to be removed later)
   11. Informative References
   Authors' Addresses

1.  Introduction

   This document presents a brief survey of proposals to attain a Less than Best Effort (LBE) service by means of end-host mechanisms.  We loosely define an LBE service as a service which results in a smaller bandwidth and/or delay impact on standard TCP than standard TCP itself, when sharing a bottleneck with it.  We refer to systems that are designed to provide this service as LBE systems.  With the exception of TCP Vegas, which we present for historical reasons, we exclude systems that have been noted to exhibit LBE behavior under some circumstances but were not designed for this purpose (e.g., RAPID [Kon09], [Aru10]).
   Generally, LBE behavior can be achieved by reacting to queue growth earlier than standard TCP would, or by changing the congestion avoidance behavior of TCP without utilizing any additional implicit feedback.  It is therefore assumed that readers are familiar with TCP congestion control [RFC5681].  Some mechanisms achieve an LBE behavior without modifying transport protocol standards (e.g., by changing the receiver window of standard TCP), whereas others leverage network-level mechanisms at the transport layer for LBE purposes.  According to this classification, solutions have been categorized in this document as delay-based transport protocols, non-delay-based transport protocols, upper-layer approaches and network-assisted approaches.

   This document is a product of the Low Extra Delay Background Transport (LEDBAT) Working Group, for comparison with the chosen approach.  Most techniques discussed here were tested in limited simulations or experimental testbeds, but LEDBAT's algorithm is already under widespread deployment.  This survey is not exhaustive, as this would not be possible or useful; the authors/editors have selected key, well-known, or otherwise interesting techniques for inclusion at their discretion.  There is also a substantial amount of work that is related to the LBE concept but does not present a solution that can be installed in end hosts or expected to work over the Internet (e.g., a DiffServ-based, Lower-Effort service [RFC3662]); such mechanisms are outside the scope of this document.

2.  Delay-based transport protocols

   It is wrong, in general, to equate "little impact on standard TCP" with "small sending rate".  Without ECN support, standard TCP will normally increase its congestion window (and effective sending rate) until a queue overflows, causing one or more packets to be dropped and the effective rate to be reduced.
   A protocol which stops increasing its rate before this event happens can, in principle, achieve better performance than standard TCP.  In the absence of any other traffic, this is even true for TCP itself when its maximum send window is limited to the bandwidth*round-trip time (RTT) product.

   TCP Vegas [Bra94] is one of the first protocols known to have a smaller sending rate than standard TCP when both protocols share a bottleneck [Kur00] -- yet it was designed to achieve more, not less, throughput than standard TCP.  Indeed, when it is the only protocol on the bottleneck, the throughput of TCP Vegas is greater than the throughput of standard TCP.  Depending on the bottleneck queue length, TCP Vegas itself can be starved by standard TCP flows; this can be remedied to some degree by the RED Active Queue Management mechanism [RFC2309].  Vegas linearly increases or decreases the sending rate, based on the difference between the expected throughput and the actual throughput; the estimation is based on RTT measurements.

   The congestion avoidance behavior is the protocol's most important feature in terms of historical relevance as well as relevance in the context of this document (it has been shown that other elements of the protocol can sometimes play a greater role in its overall behavior [Hen00]).  In congestion avoidance, once per RTT, TCP Vegas calculates the expected throughput as WindowSize / BaseRTT, where WindowSize is the current congestion window and BaseRTT is the minimum of all measured RTTs.  The expected throughput is then compared with the actual throughput measured from recent acknowledgements.  If the actual throughput is smaller than the expected throughput minus a threshold called "beta", this is taken as a sign of congestion, causing the protocol to linearly decrease its rate.
   If the actual throughput is greater than the expected throughput minus a threshold called "alpha" (with alpha < beta), this is taken as a sign that the network is underutilized, causing the protocol to linearly increase its rate.

   TCP Vegas has been analyzed extensively.  One of its most prominent properties is its fairness between multiple flows of the same kind: it does not penalize flows with large propagation delays in the way that standard TCP does.  While it was not the first protocol to use delay as a congestion indication, its predecessors (like CARD [Jai89], Tri-S [Wan91] or DUAL [Wan92]) are not discussed here because of the historical "landmark" role that TCP Vegas has taken in the literature.

   Delay-based transport protocols which were designed to be non-intrusive include TCP Nice [Ven02] and TCP Low Priority (TCP-LP) [Kuz06].  TCP Nice [Ven02] follows the same basic approach as TCP Vegas but improves upon it in some aspects.  Because of its moderate linear-decrease congestion response, TCP Vegas can affect standard TCP despite its ability to detect congestion early.  TCP Nice removes this issue by halving the congestion window (at most once per RTT, like standard TCP) instead of linearly reducing it.  To avoid being too conservative, this is only done if a fixed, predefined fraction of delay-based incipient congestion signals appears within one RTT.  Otherwise, TCP Nice falls back to the congestion avoidance rules of TCP Vegas if no packet was lost, or of standard TCP if a packet was lost.  One more feature of TCP Nice is its ability to support a congestion window of less than one packet, by clocking out single packets over more than one RTT.
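   As an illustration, the per-RTT congestion-avoidance comparison of TCP Vegas described above (which TCP Nice also falls back to when no loss occurs) can be sketched as follows.  This is a simplification by the editors of this survey: function and variable names are invented, and in the actual protocol the alpha/beta thresholds are commonly expressed in packets rather than as raw throughput values.

```python
def vegas_update(cwnd, base_rtt, actual_throughput, alpha, beta):
    """One congestion-avoidance step in the style of TCP Vegas [Bra94].

    cwnd:              current congestion window (segments)
    base_rtt:          minimum of all RTTs measured so far (seconds)
    actual_throughput: throughput measured from recent ACKs (segments/s)
    alpha, beta:       thresholds with alpha < beta (here in segments/s)
    """
    expected = cwnd / base_rtt          # throughput if no queue built up
    diff = expected - actual_throughput
    if diff > beta:                     # actual < expected - beta:
        return cwnd - 1                 # sign of congestion, linear decrease
    if diff < alpha:                    # actual > expected - alpha:
        return cwnd + 1                 # underutilized, linear increase
    return cwnd                         # in between: leave the window alone
```

   For example, with cwnd = 10 segments and base_rtt = 0.1 s, the expected throughput is 100 segments/s; a measured throughput of 90 segments/s (diff = 10 segments/s) would then trigger a linear decrease.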
   With ns-2 simulations and real-life experiments using a Linux implementation, the authors of [Ven02] show that TCP Nice achieves its goal of efficiently utilizing spare capacity while being non-intrusive to standard TCP.

   Unlike TCP Vegas and TCP Nice, TCP-LP [Kuz06] uses the one-way delay (OWD) rather than the RTT as an indicator of incipient congestion.  This is done to avoid reacting to delay fluctuations that are caused by reverse cross-traffic.  Using the TCP Timestamps option [RFC1323], the OWD is determined as the difference between the receiver's Timestamp value in the ACK and the original Timestamp value that the receiver copied into the ACK.  While the result of this subtraction can only precisely represent the OWD if clocks are synchronized, its absolute value is of no concern to TCP-LP, and hence clock synchronization is unnecessary.  Using a constant smoothing parameter, TCP-LP calculates an Exponentially Weighted Moving Average (EWMA) of the measured OWD and checks whether the result exceeds a threshold within the range of the minimum and maximum OWD seen during the connection's lifetime; if it does, this condition is interpreted as an "early congestion indication".  The minimum and maximum OWD values are initialized during the slow-start phase.

   Regarding its reaction to an early congestion indication, TCP-LP tries to strike a middle ground between the overly conservative choice of _immediately_ setting the congestion window to one packet, and the presumably too aggressive choice of simply halving the congestion window like standard TCP: TCP-LP tries to delay the former action by an additional RTT, to see whether congestion is persistent.
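   The OWD-based early-congestion check just described can be sketched as follows.  This is an illustrative reconstruction by the editors: class and variable names are invented, and the gamma/delta values are assumptions, not constants quoted from [Kuz06].

```python
class TcpLpDetector:
    """Early-congestion check in the style of TCP-LP [Kuz06] (sketch)."""

    def __init__(self, gamma=0.12, delta=0.15):
        self.gamma = gamma           # EWMA smoothing parameter (assumed)
        self.delta = delta           # threshold position in [d_min, d_max]
        self.smoothed = None         # EWMA of the measured OWD
        self.d_min = float("inf")    # minimum OWD seen so far
        self.d_max = float("-inf")   # maximum OWD seen so far

    def on_ack(self, ts_receiver, ts_echoed):
        # OWD from the TCP Timestamps option.  The unknown clock offset
        # shifts owd, d_min and d_max by the same constant, so only
        # relative changes matter and synchronization is unnecessary.
        owd = ts_receiver - ts_echoed
        self.d_min = min(self.d_min, owd)
        self.d_max = max(self.d_max, owd)
        if self.smoothed is None:
            self.smoothed = owd
        else:
            self.smoothed = (1 - self.gamma) * self.smoothed \
                            + self.gamma * owd
        # "Early congestion indication": the smoothed OWD crosses a
        # threshold placed between the minimum and maximum OWD seen.
        threshold = self.d_min + self.delta * (self.d_max - self.d_min)
        return self.smoothed > threshold
```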
   It does so by halving the window at first in response to an early congestion indication, then initializing an "inference time-out timer", and maintaining the current congestion window until this timer fires.  If another early congestion indication appears during this "inference phase", the window is then set to 1; otherwise, the window is maintained and TCP-LP continues to increase it in the standard Additive-Increase fashion.  This method ensures that it takes at least two RTTs for a TCP-LP flow to decrease its window to 1, and, like standard TCP, TCP-LP reacts to congestion at most once per RTT.

   Using a simple analytical model, the authors of TCP-LP [Kuz06] illustrate the feasibility of a delay-based LBE transport by showing that, due to the non-linear relationship between throughput and RTT, it is possible to avoid interfering with standard TCP traffic even when the flows under consideration have a larger RTT than standard TCP flows.  With ns-2 simulations and real-life experiments using a Linux implementation, the authors of [Kuz06] show that TCP-LP is largely non-intrusive to TCP traffic while at the same time enabling it to utilize a large portion of the excess network bandwidth, which is fairly shared among competing TCP-LP flows.  They also show that using their protocol for bulk data transfers greatly reduces the file transfer times of competing best-effort web traffic.

   Sync-TCP [Wei05] follows a similar approach to TCP-LP, by adapting its reaction to congestion according to changes in the OWD.  By comparing the estimated (average) forward queuing delay to the maximum observed delay, Sync-TCP adapts the AIMD parameters depending on the trend followed by the average delay over an observation window.
   Even though the authors of [Wei05] did not explicitly consider its use as an LBE protocol, Sync-TCP was designed to react early to incipient congestion, while grabbing available bandwidth more aggressively than a standard TCP in congestion-avoidance mode.

   Delay-based congestion control is also the basis of proposals that aim to adapt TCP's congestion avoidance to very high-speed networks.  Some of these proposals, like Compound TCP [Tan06][Sri08] and TCP Illinois [Liu08], are hybrid loss- and delay-based mechanisms, whereas others (e.g., NewVegas [Dev03], FAST TCP [Wei06] or CODE TCP [Cha10]) are variants of Vegas based primarily on delays.

2.1.  Accuracy of delay-based congestion predictors

   The accuracy of delay-based congestion predictors has been the subject of a good deal of research; see, e.g., [Bia03], [Mar03], [Pra04], [Rew06], [McC08].  The main result of most of these studies is that delays (or, more precisely, round-trip times) are, in general, weakly correlated with congestion.  There are several factors that may induce such a poor correlation:

   o  Bottleneck buffer size: in principle, a delay-based mechanism could be made "more than TCP friendly" _if_ buffers are "large enough", so that RTT fluctuations and/or deviations from the minimum RTT can be detected by the end-host with reasonable accuracy.  Otherwise, it may be hard to distinguish real delay variations from measurement noise.

   o  RTT measurement issues: in principle, RTT samples may suffer from poor resolution, due to timers which are too coarse-grained with respect to the scale of delay fluctuations.  Also, a flow may obtain a very noisy estimate of RTTs due to undersampling, under some circumstances (e.g., the flow rate is much lower than the link bandwidth).
      For TCP, other potential sources of measurement noise include TCP segmentation offloading (TSO) and the use of delayed ACKs [Hay10].  A congested reverse path may also result in an erroneous assessment of the congestion state of the forward path.  Finally, in the case of fast or short-distance links, the majority of the measured delay can in fact be due to processing in the involved hosts; typically, this processing delay is not of interest, and it can exhibit fluctuations that are not related to the network at all.

   o  Level of statistical multiplexing and RTT sampling: it may be easy for an individual flow to "miss" loss/queue overflow events, especially if the number of flows sharing a bottleneck buffer is significant.  This is nicely illustrated, e.g., in Fig. 1 of [McC08].

   o  Impact of wireless links: several mechanisms that are typical of wireless links, like link-layer scheduling and error recovery, may induce strong delay fluctuations over short time scales [Gur04].

   Interestingly, the results of Bhandarkar et al. [Bha07] seem to paint a slightly different picture regarding the accuracy of delay-based congestion prediction.  Bhandarkar et al. claim that it is possible to significantly improve prediction accuracy by adopting some simple techniques (smoothing of RTT samples, increasing the RTT sampling frequency).  Nonetheless, they acknowledge that even with such techniques, it is not possible to eradicate detection errors.  Their proposed delay-based congestion avoidance method, PERT (Probabilistic Early Response TCP), mitigates the impact of residual detection errors by means of a probabilistic response mechanism to congestion detection events.

2.2.
Potential issues with delay-based congestion control for LBE transport

   Whether a delay-based protocol behaves in its intended manner (e.g., it is "more than TCP friendly", or it grabs available bandwidth in a very aggressive manner) may therefore depend on the accuracy issues listed in Section 2.1.  Moreover, protocols like Vegas need to keep an estimate of the minimum ("base") delay; this makes such protocols highly sensitive to changes in the end-to-end route during the lifetime of the flow [Mo99].

   Regarding the issue of false positives/false negatives with a delay-based congestion detector, most studies focus on the loss of throughput coming from the erroneous detection of queue build-up and of alleviation of congestion.  Arguably, for an LBE transport protocol it is better to err on the "more-than-TCP-friendly side", that is, to always yield to _perceived_ congestion, whether it is "real" or not; however, failure to detect congestion (due to one of the above accuracy problems) would result in behavior that is not LBE.  For instance, consider the case in which the bottleneck buffer is small, so that the contribution of queueing delay at the bottleneck to the global end-to-end delay is small.  In such a case, a flow using a delay-based mechanism might end up consuming a good deal of bandwidth with respect to a competing standard TCP flow, unless it also incorporates a suitable reaction to loss.

   A delay-based mechanism may also suffer from the so-called "latecomer advantage" (or latecomer unfairness) problem.  Consider the case in which the bottleneck link is already (very) congested.  In such a scenario, delay variations may be quite small; hence, it may be very difficult to tell an empty queue from a heavily-loaded queue in terms of delay fluctuation.
   Therefore, a newly-arriving delay-based flow may start sending faster when there is already heavy congestion, eventually driving away loss-based flows [Sha05][Car10].

3.  Non-delay-based transport protocols

   There exist a few transport-layer proposals that achieve an LBE service without relying on delay as an indicator of congestion.  In the algorithms discussed below, the loss rate of the flow determines, either implicitly or explicitly, the sending rate (which is adapted so as to obtain a lower share of the available bandwidth than standard TCP); such mechanisms likely cause more queuing delay and react to congestion more slowly than delay-based ones.

   4CP [Liu07], which stands for "Competitive and Considerate Congestion Control", is a protocol which provides an LBE service by changing the window control rules of standard TCP.  A "virtual window" is maintained which, during a so-called "bad congestion phase", is reduced to less than a predefined minimum value of the actual congestion window.  The congestion window is only increased again once the virtual window exceeds this minimum; in this way, the virtual window controls the duration during which the sender transmits at a fixed minimum rate.  Whether the congestion state is "bad" or "good" depends on whether the loss event rate is above or below a threshold (or target) value.  The 4CP congestion avoidance algorithm allows for setting a target average window, and avoids starvation of "background" flows while bounding the impact on "foreground" flows.  Its performance was evaluated in ns-2 simulations and in real-life experiments with a kernel-level implementation in Microsoft Windows Vista.

   The MulTFRC [Dam09] protocol is an extension of TCP-Friendly Rate Control (TFRC) [RFC5348] for multiple flows.
   MulTFRC takes the main idea of MulTCP [Cro98] and similar proposals (e.g., [Hac04], [Hac08], [Kuo08]) a step further.  A single MulTCP flow tries to emulate (and be as friendly as) a number N > 1 of parallel TCP flows.  By supporting values of N between 0 and 1, MulTFRC can be used as a mechanism for an LBE service.  Since it does not react to delay like the protocols described in Section 2, but adjusts its rate like TFRC, MulTFRC can probably be expected to be more aggressive than mechanisms such as TCP Nice or TCP-LP.  This also means that MulTFRC is less likely to be prone to starvation, as its aggressiveness is tunable at a fine granularity, even when N is between 0 and 1.

4.  Upper-layer approaches

   The proposals described in this section do not require modifying transport protocol standards.  Most of them can be regarded as running "on top" of an existing transport, even though they may be implemented either at the application layer (i.e., in user-level processes) or in the kernel of the end hosts' operating system.  Such "upper-layer" mechanisms may arguably be easier to deploy than transport-layer approaches, since they do not require any changes to the transport itself.

   A simplistic, application-level approach to a background transport service may consist in scheduling automated transfers at times when the network is lightly loaded, as described, e.g., in [Dyk02] for cooperative proxy caching.  An issue with such a technique is that it may not be appropriate for applications like peer-to-peer file transfer, since the notion of an "off-peak hour" is not meaningful when end-hosts may be located anywhere in the world.

   The so-called Background Intelligent Transfer Service (BITS) [BITS] is implemented in several versions of Microsoft Windows.
   BITS uses a system of application-layer priority levels for file-transfer jobs, together with monitoring of the bandwidth usage of the network interface (or, in more recent versions, of the network gateway connected to the end-host), so that low-priority transfers at a given end-host give way to both high-priority (foreground) transfers and traffic from interactive applications at the same host.

   A different approach is taken in [Egg05] -- here, the priority of a flow is reduced via a generic idletime scheduling strategy in a host's operating system.  While results presented in this paper show that the new scheduler can effectively shield regular tasks from low-priority ones (e.g., TCP from greedy UDP) with only a minor performance impact, it is an underlying assumption that all involved end hosts would use the idletime scheduler.  In other words, it is not the focus of this work to protect a standard TCP flow which originates from a host where the presented scheduling scheme may not be implemented.

4.1.  Receiver-oriented, flow-control based approaches

   Some proposals for achieving an LBE behavior work by exploiting existing transport-layer features -- typically, at the "receiving" side.  In particular, TCP's built-in flow control can be used as a means to achieve a low-priority transport service.

   The mechanism described in [Spr00] is an example of the above technique.  This mechanism controls the bandwidth by letting the receiver intelligently manipulate the receiver window of standard TCP.  This is possible because the authors assume a client-server setting where the receiver's access link is typically the bottleneck.  The scheme incorporates a delay-based calculation of the expected queue length at the bottleneck, which is quite similar to the calculation in the above delay-based protocols, e.g. TCP Vegas.
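   The principle behind such receiver-side rate control can be illustrated with a toy calculation: since a sender can transmit at most one window of data per RTT, advertising a receiver window of roughly target_rate * RTT caps the sender's rate without any sender-side changes.  The sketch below only illustrates that clamping idea; it is not the algorithm of [Spr00], and all names are the editors' own.

```python
def advertised_window(target_rate, rtt, free_buffer, mss):
    """Clamp the advertised TCP receiver window so that the sender's
    rate stays near target_rate, without modifying the sender.

    target_rate: desired ceiling on the sender's rate (bytes/s)
    rtt:         current round-trip time estimate (seconds)
    free_buffer: actually available receive-buffer space (bytes)
    mss:         maximum segment size (bytes)
    """
    cap = int(target_rate * rtt)          # rate <= window / RTT
    window = min(cap, free_buffer)        # never advertise space we lack
    return max(window - window % mss, 0)  # round down to whole segments
```

   For instance, to cap a sender at about 100 kB/s over a 0.5 s RTT path, the receiver would advertise at most 50000 bytes, rounded down to a whole number of segments.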
   Using a Linux implementation, where TCP flows are classified according to their application's needs, Spring et al. show in [Spr00] that a significant improvement in packet latency can be attained over an unmodified system, while maintaining good link utilization.

   A similar method is employed by Mehra et al. [Meh03], where both the advertised receiver window and the delay in sending ACK messages are dynamically adapted to attain a given rate.  As in [Spr00], Mehra et al. assume that the bottleneck is located at the receiver's access link.  However, they also propose a bandwidth-sharing system, which makes it possible to control the bandwidth allocated to different flows, as well as to allot a minimum rate to some flows.

   Receiver window tuning is also done in [Key04], where choosing the right value for the window is phrased as an optimization problem.  On this basis, two algorithms are presented: binary search -- which is faster than the other one at reaching a good operation point, but fluctuates -- and stochastic optimization, which does not fluctuate but converges more slowly than binary search.  These algorithms merely use the previous receiver window and the amount of data received during the previous control interval as input.  According to [Key04], the encouraging simulation results suggest that such an application-level mechanism can work almost as well as a transport-layer scheme like TCP-LP.

   Another way of dealing with non-interactive flows, such as web prefetching, is to rate-limit the transfer of such bursty traffic [Cro98b].  Note that one of the techniques used in [Cro98b] is, precisely, to have the downloading application adapt the TCP receiver window, so as to reduce the data rate to the minimum needed (thus disturbing other flows as little as possible while respecting a deadline for the transfer of the data).

5.
Network-assisted approaches

   Network-layer mechanisms, like active queue management (AQM) and packet scheduling in routers, can be exploited by a transport protocol for achieving an LBE service.  Such approaches may result in improved protection of non-LBE flows (e.g., when scheduling is used); besides, approaches using explicit, AQM-based congestion signaling may arguably be more robust at detecting impending congestion than, say, delay-based transports.  However, an obvious drawback of any network-assisted approach is that, in principle, it needs modifications in both end-hosts and intermediate network nodes.

   Harp [Kok04] realizes an LBE service by dissipating background traffic to less-utilized paths of the network, based on multipath routing and multipath congestion control.  This is achieved without changing all routers, by using edge nodes as relays.  According to the authors, these edge nodes should be gateways of organizations in order to align their scheme with usage incentives, but the technical solution would also work if Harp was only deployed in end hosts.  It detects impending congestion by looking at delay, similarly to TCP Nice [Ven02], and manages to improve the utilization and fairness of TCP over pure single-path solutions without requiring any changes to TCP itself.

   Another technique is that used by protocols like Network-Friendly TCP (NF-TCP) [Aru10], where a bandwidth-estimation module integrated into the transport protocol allows it to rapidly take advantage of free capacity.
   NF-TCP combines this with early congestion detection based on Explicit Congestion Notification (ECN) [RFC3168] and RED [RFC2309]; when congestion starts building up, appropriate tuning of a RED queue makes it possible to mark low-priority (i.e., NF-TCP) packets with a much higher probability than high-priority (i.e., standard TCP) packets, so that low-priority flows yield up bandwidth before standard TCP flows.  NF-TCP could be implemented by adapting the congestion control behavior of TCP without requiring changes to the protocol on the wire -- with the only exception that NF-TCP-capable routers must be able to somehow distinguish NF-TCP traffic from other TCP traffic.

   In [Ven08], Venkataraman et al. propose a transport-layer approach to leverage an existing, network-layer LBE service based on priority queueing.  Their transport protocol, which they call PLT (Priority-Layer Transport), splits a layer-4 connection into two flows, a high-priority one and a low-priority one.  The high-priority flow is sent over the higher-priority queueing class (in principle, offering a best-effort service) using an AIMD, TCP-like congestion control mechanism.  The low-priority flow, which is mapped to the LBE class, uses a non-TCP-friendly congestion control algorithm.  The goal of PLT is thus to maximize its aggregate throughput by exploiting unused capacity in an aggressive way, while protecting standard TCP flows carried by the best-effort class.  Similar in spirit, [Ott03] proposes simple changes to only the AIMD parameters of TCP for use over a network-layer LBE service, so that such "filler" traffic may aggressively consume unused bandwidth.  Note that [Ven08] also considers a mechanism for detecting the lack of priority queueing in the network, so that the non-TCP-friendly flow may be inhibited.
   The PLT receiver monitors the loss rate of both flows; if the high-priority flow starts seeing losses while the low-priority one does not experience 100% loss, this is taken as an indication of the absence of strict priority queueing.

6.  Conclusion and LEDBAT Considerations

   The previous sections have shown that there is a large amount of work on attaining an LBE service, and that it is quite heterogeneous in nature.  What most of the discussed mechanisms have in common, however, is the fact that they have only been tested in limited simulations or experimental testbeds.  As stated in the introduction, the algorithm produced by the LEDBAT Working Group (which is itself also called LEDBAT) is already under widespread deployment.

   LEDBAT can be classified as a delay-based mechanism, and as such it is similar in spirit to the protocols presented in Section 2.  It is, however, not a protocol -- how it is actually applied in the Internet, i.e., how existing or even new transport protocols are used to realize the LEDBAT algorithm, is not defined by the LEDBAT Working Group.  As it heavily relies on delay, the discussion in Section 2.1 and Section 2.2 applies to it.  The performance of LEDBAT has been analyzed in comparison with some of the other work presented here in several articles, e.g. [Aru10], [Car10], [Sch10], but these analyses have to be examined with care: at the time of writing, LEDBAT is still a moving target.

7.  Acknowledgements

   The authors would like to thank Dragana Damjanovic, Melissa Chavez and Yinxia Zhao for reference pointers, as well as Mayutan Arumaithurai, Mirja Kuehlewind and Wesley Eddy for their detailed reviews and suggestions.

8.  IANA Considerations

   This memo includes no request to IANA.

9.  Security Considerations

   This document introduces no new security considerations.

10.
Changes from the previous version (section to be removed later)

o  Updated the introduction to clarify that only mechanisms which are designed for LBE are covered (adding a reference to RAPID and a reference where it is shown that it is not always LBE-ish).

o  Expanded the NF-TCP acronym.

o  Added a "conclusion and LEDBAT considerations" section.

11.  Informative References

[Aru10]  Arumaithurai, M., Fu, X., and K. Ramakrishnan, "NF-TCP: A Network Friendly TCP Variant for Background Delay-Insensitive Applications", Technical Report No. IFI-TB-2010-05, Institute of Computer Science, University of Goettingen, Germany, September 2010.

[BITS]  Microsoft, "Windows Background Intelligent Transfer Service".

[Bha07]  Bhandarkar, S., Reddy, A., Zhang, Y., and D. Loguinov, "Emulating AQM from end hosts", Proceedings of ACM SIGCOMM 2007, 2007.

[Bia03]  Biaz, S. and N. Vaidya, "Is the round-trip time correlated with the number of packets in flight?", Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement (IMC '03), pages 273-278, 2003.

[Bra94]  Brakmo, L., O'Malley, S., and L. Peterson, "TCP Vegas: New techniques for congestion detection and avoidance", Proceedings of SIGCOMM '94, pages 24-35, August 1994.

[Car10]  Carofiglio, G., Muscariello, L., Rossi, D., and S. Valenti, "The quest for LEDBAT fairness", Proceedings of IEEE GLOBECOM 2010, December 2010.

[Cha10]  Chan, Y., Lin, C., Chan, C., and C. Ho, "CODE TCP: A competitive delay-based TCP", Computer Communications, 33(9):1013-1029, June 2010.

[Cro98]  Crowcroft, J. and P. Oechslin, "Differentiated end-to-end Internet services using a weighted proportional fair sharing TCP", ACM SIGCOMM Computer Communication Review, vol. 28, no. 3, pp. 53-69, July 1998.

[Cro98b]  Crovella, M. and P.
Barford, "The network effects of prefetching", Proceedings of IEEE INFOCOM 1998, April 1998.

[Dam09]  Damjanovic, D. and M. Welzl, "MulTFRC: Providing Weighted Fairness for Multimedia Applications (and others too!)", ACM Computer Communication Review, vol. 39, no. 3, July 2009.

[Dev03]  De Vendictis, A., Baiocchi, A., and M. Bonacci, "Analysis and enhancement of TCP Vegas congestion control in a mixed TCP Vegas and TCP Reno network scenario", Performance Evaluation, 53(3-4):225-253, 2003.

[Dyk02]  Dykes, S. and K. Robbins, "Limitations and benefits of cooperative proxy caching", IEEE Journal on Selected Areas in Communications, 20(7):1290-1304, September 2002.

[Egg05]  Eggert, L. and J. Touch, "Idletime Scheduling with Preemption Intervals", Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP 2005), Brighton, United Kingdom, pp. 249-262, October 2005.

[Gur04]  Gurtov, A. and S. Floyd, "Modeling wireless links for transport protocols", ACM SIGCOMM Computer Communication Review, 34(2):85-96, April 2004.

[Hac04]  Hacker, T., Noble, B., and B. Athey, "Improving Throughput and Maintaining Fairness using Parallel TCP", Proceedings of IEEE INFOCOM 2004, March 2004.

[Hac08]  Hacker, T. and P. Smith, "Stochastic TCP: A Statistical Approach to Congestion Avoidance", Proceedings of PFLDnet 2008, March 2008.

[Hay10]  Hayes, D., "Timing enhancements to the FreeBSD kernel to support delay and rate based TCP mechanisms", Technical Report 100219A, Centre for Advanced Internet Architectures, Swinburne University of Technology, February 2010.

[Hen00]  Hengartner, U., Bolliger, J., and T. Gross, "TCP Vegas revisited", Proceedings of IEEE INFOCOM 2000, March 2000.
[Jai89]  Jain, R., "A delay-based approach for congestion avoidance in interconnected heterogeneous computer networks", ACM Computer Communication Review, 19(5):56-71, October 1989.

[Key04]  Key, P., Massoulie, L., and B. Wang, "Emulating Low-Priority Transport at the Application Layer: a Background Transfer Service", Proceedings of ACM SIGMETRICS 2004, January 2004.

[Kok04]  Kokku, R., Bohra, A., Ganguly, S., and A. Venkataramani, "A Multipath Background Network Architecture", Proceedings of IEEE INFOCOM 2007, May 2007.

[Kon09]  Konda, V. and J. Kaur, "RAPID: Shrinking the Congestion-control Timescale", Proceedings of IEEE INFOCOM 2009, April 2009.

[Kuo08]  Kuo, F. and X. Fu, "Probe-Aided MulTCP: an aggregate congestion control mechanism", ACM SIGCOMM Computer Communication Review, vol. 38, no. 1, pp. 17-28, January 2008.

[Kur00]  Kurata, K., Hasegawa, G., and M. Murata, "Fairness Comparisons Between TCP Reno and TCP Vegas for Future Deployment of TCP Vegas", Proceedings of INET 2000, July 2000.

[Kuz06]  Kuzmanovic, A. and E. Knightly, "TCP-LP: low-priority service via end-point congestion control", IEEE/ACM Transactions on Networking (ToN), Volume 14, Issue 4, pp. 739-752, August 2006.

[Liu07]  Liu, S., Vojnovic, M., and D. Gunawardena, "Competitive and Considerate Congestion Control for Bulk Data Transfers", Proceedings of IWQoS 2007, June 2007.

[Liu08]  Liu, S., Basar, T., and R. Srikant, "TCP-Illinois: A loss- and delay-based congestion control algorithm for high-speed networks", Performance Evaluation, 65(6-7):417-440, 2008.

[Mar03]  Martin, J., Nilsson, A., and I. Rhee, "Delay-based congestion avoidance for TCP", IEEE/ACM Transactions on Networking, 11(3):356-369, June 2003.

[McC08]  McCullagh, G. and D.
Leith, "Delay-based congestion control: Sampling and correlation issues revisited", Technical report, Hamilton Institute, 2008.

[Meh03]  Mehra, P., Zakhor, A., and C. De Vleeschouwer, "Receiver-Driven Bandwidth Sharing for TCP", Proceedings of IEEE INFOCOM 2003, April 2003.

[Mo99]  Mo, J., La, R., Anantharam, V., and J. Walrand, "Analysis and Comparison of TCP Reno and TCP Vegas", Proceedings of IEEE INFOCOM 1999, March 1999.

[Ott03]  Ott, B., Warnky, T., and V. Liberatore, "Congestion control for low-priority filler traffic", SPIE QoS 2003 (Quality of Service over Next-Generation Internet), Proc. SPIE, Vol. 5245, 154, Monterey (CA), USA, July 2003.

[Pra04]  Prasad, R., Jain, M., and C. Dovrolis, "On the effectiveness of delay-based congestion avoidance", Proceedings of PFLDnet 2004, 2004.

[RFC1323]  Jacobson, V., Braden, B., and D. Borman, "TCP Extensions for High Performance", RFC 1323, May 1992.

[RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, September 2001.

[RFC3662]  Bless, R., Nichols, K., and K. Wehrle, "A Lower Effort Per-Domain Behavior (PDB) for Differentiated Services", RFC 3662, December 2003.

[RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP Friendly Rate Control (TFRC): Protocol Specification", RFC 5348, September 2008.

[RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.

[Rew06]  Rewaskar, S., Kaur, J., and D.
Smith, "Why don't delay-based congestion estimators work in the real-world?", Technical report TR06-001, University of North Carolina at Chapel Hill, Dept. of Computer Science, January 2006.

[Sch10]  Schneider, J., Wagner, J., Winter, R., and H. Kolbe, "Out of my Way -- Evaluating Low Extra Delay Background Transport in an ADSL Access Network", Proceedings of the 22nd International Teletraffic Congress (ITC 22), 2010.

[Sha05]  Shalunov, S., Dunn, L., Gu, Y., Low, S., Rhee, I., Senger, S., Wydrowski, B., and L. Xu, "Design Space for a Bulk Transport Tool", Technical Report, Internet2 Transport Group, May 2005.

[Spr00]  Spring, N., Chesire, M., Berryman, M., Sahasranaman, V., Anderson, T., and B. Bershad, "Receiver based management of low bandwidth access links", Proceedings of IEEE INFOCOM 2000, vol. 1, pp. 245-254, 2000.

[Sri08]  Sridharan, M., Tan, K., Bansal, D., and D. Thaler, "Compound TCP: A new TCP congestion control for high-speed and long distance networks", Internet Draft draft-sridharan-tcpm-ctcp, work in progress, November 2008.

[Tan06]  Tan, K., Song, J., Zhang, Q., and M. Sridharan, "A Compound TCP approach for high-speed and long distance networks", Proceedings of IEEE INFOCOM 2006, Barcelona, Spain, April 2006.

[Ven02]  Venkataramani, A., Kokku, R., and M. Dahlin, "TCP Nice: a mechanism for background transfers", Proceedings of OSDI '02, 2002.

[Ven08]  Venkataraman, V., Francis, P., Kodialam, M., and T. Lakshman, "A priority-layered approach to transport for high bandwidth-delay product networks", Proceedings of ACM CoNEXT, Madrid, December 2008.

[Wan91]  Wang, Z. and J. Crowcroft, "A new congestion control scheme: slow start and search (Tri-S)", ACM Computer Communication Review, 21(1):56-71, January 1991.

[Wan92]  Wang, Z. and J.
Crowcroft, "Eliminating periodic packet losses in the 4.3-Tahoe BSD TCP congestion control algorithm", ACM Computer Communication Review, 22(2):9-16, January 1992.

[Wei05]  Weigle, M., Jeffay, K., and F. Smith, "Delay-based early congestion detection and adaptation in TCP: impact on web performance", Computer Communications, 28(8):837-850, May 2005.

[Wei06]  Wei, D., Jin, C., Low, S., and S. Hegde, "FAST TCP: Motivation, architecture, algorithms, performance", IEEE/ACM Transactions on Networking, 14(6):1246-1259, December 2006.

Authors' Addresses

Michael Welzl
University of Oslo
Department of Informatics, PO Box 1080 Blindern
N-0316 Oslo
Norway

Phone: +47 22 85 24 20
Email: michawe@ifi.uio.no

David Ros
Institut Telecom / Telecom Bretagne
Rue de la Chataigneraie, CS 17607
35576 Cesson Sevigne cedex
France

Phone: +33 2 99 12 70 46
Email: david.ros@telecom-bretagne.eu