Internet Engineering Task Force                                 M. Welzl
Internet-Draft                                        University of Oslo
Intended status: Informational                                    D. Ros
Expires: June 20, 2011                                Institut Telecom /
                                                        Telecom Bretagne
                                                       December 17, 2010

         A Survey of Lower-than-Best-Effort Transport Protocols
                     draft-ietf-ledbat-survey-03.txt

Abstract

   This document provides a survey of transport protocols which are
   designed to have a smaller bandwidth and/or delay impact on standard
   TCP than standard TCP itself when they share a bottleneck with it.
   Such protocols could be used for delay-insensitive "background"
   traffic, as they provide what is sometimes called a "less than" (or
   "lower than") best-effort service.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 20, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Delay-based transport protocols
     2.1.  Accuracy of delay-based congestion predictors
     2.2.  Potential issues with delay-based congestion control for
           LBE transport
   3.  Non-delay-based transport protocols
   4.  Upper-layer approaches
     4.1.  Receiver-oriented, flow-control based approaches
   5.  Network-assisted approaches
   6.  Acknowledgements
   7.  IANA Considerations
   8.  Security Considerations
   9.  Changes from the previous version (section to be removed later)
   10. Informative References
   Authors' Addresses

1.  Introduction

   This document presents a brief survey of proposals to attain a Less
   than Best Effort (LBE) service by means of end-host mechanisms.  We
   loosely define an LBE service as a service which results in a
   smaller bandwidth and/or delay impact on standard TCP than standard
   TCP itself, when sharing a bottleneck with it.  We refer to systems
   that provide this service as LBE systems.

   Generally, LBE behavior can be achieved by reacting to queue growth
   earlier than standard TCP would, or by changing the congestion
   avoidance behavior of TCP without utilizing any additional implicit
   feedback.  It is therefore assumed that readers are familiar with
   TCP congestion control [RFC5681].  Some mechanisms achieve an LBE
   behavior without modifying transport protocol standards (e.g., by
   changing the receiver window of standard TCP), whereas others
   leverage network-level mechanisms at the transport layer for LBE
   purposes.  Following this classification, the solutions surveyed in
   this document are categorized as delay-based transport protocols,
   non-delay-based transport protocols, upper-layer approaches, and
   network-assisted approaches.

   This document is a product of the Low Extra Delay Background
   Transport (LEDBAT) Working Group; it is intended to provide a point
   of comparison for the approach chosen by the Working Group.  Most
   techniques discussed here were tested only in limited simulations or
   experimental testbeds, whereas LEDBAT's algorithm is already widely
   deployed.  This survey is not exhaustive, as that would be neither
   possible nor useful; the authors/editors have selected key, well-
   known, or otherwise interesting techniques for inclusion at their
   discretion.  There is also a substantial amount of work that is
   related to the LBE concept but does not present a solution that can
   be installed in end hosts or be expected to work over the Internet
   (e.g., a DiffServ-based Lower-Effort service [RFC3662]); such
   mechanisms are outside the scope of this document.

2.  Delay-based transport protocols

   It is wrong, in general, to equate "little impact on standard TCP"
   with "small sending rate".  Without ECN support, standard TCP will
   normally increase its congestion window (and effective sending
   rate) until a queue overflows, causing one or more packets to be
   dropped and the effective rate to be reduced.  A protocol which
   stops increasing its rate before this event happens can, in
   principle, achieve better performance than standard TCP.  In the
   absence of any other traffic, this is even true for TCP itself when
   its maximum send window is limited to the bandwidth * round-trip
   time (RTT) product of the path.
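   To make the last point concrete, the window limit in question
   follows directly from the bandwidth * RTT product.  A small
   computation in Python illustrates this (a sketch only; the path
   parameters are arbitrary example values, not taken from any cited
   work):

      # Window limit from the bandwidth * RTT product; all path
      # parameters below are made-up example values.
      link_rate = 10e6 / 8   # 10 Mbit/s bottleneck, in bytes/s
      rtt = 0.050            # 50 ms round-trip time, in seconds
      mss = 1460             # segment size, in bytes

      bdp = link_rate * rtt  # at most this many bytes "in flight"
      print(f"{bdp:.0f} bytes = {bdp / mss:.1f} segments")
      # A sender whose window never exceeds this value can fill the
      # link without building a persistent queue at the bottleneck.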
   TCP Vegas [Bra94] is one of the first protocols known to have a
   smaller sending rate than standard TCP when the two protocols share
   a bottleneck [Kur00] -- yet it was designed to achieve more, not
   less, throughput than standard TCP.  Indeed, when it is the only
   protocol at the bottleneck, the throughput of TCP Vegas is greater
   than the throughput of standard TCP.  Depending on the bottleneck
   queue length, TCP Vegas itself can be starved by standard TCP
   flows; this can be remedied to some degree by the RED Active Queue
   Management mechanism [RFC2309].  Vegas linearly increases or
   decreases the sending rate, based on the difference between the
   expected throughput and the actual throughput; the estimation is
   based on RTT measurements.

   The congestion avoidance behavior is the protocol's most important
   feature, in terms of historical relevance as well as relevance in
   the context of this document (it has been shown that other elements
   of the protocol can sometimes play a greater role in its overall
   behavior [Hen00]).  In congestion avoidance, once per RTT, TCP
   Vegas calculates the expected throughput as WindowSize / BaseRTT,
   where WindowSize is the current congestion window and BaseRTT is
   the minimum of all measured RTTs.  The expected throughput is then
   compared with the actual throughput, measured from recent
   acknowledgements.  If the actual throughput is smaller than the
   expected throughput minus a threshold called "beta", this is taken
   as a sign of congestion, causing the protocol to linearly decrease
   its rate.  If the actual throughput is greater than the expected
   throughput minus a threshold called "alpha" (with alpha < beta),
   this is taken as a sign that the network is underutilized, causing
   the protocol to linearly increase its rate.
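   These congestion avoidance rules can be summarized in a short
   sketch (illustrative Python pseudocode; the once-per-RTT driver and
   all names are ours, and alpha/beta are expressed in throughput
   units here, whereas implementations commonly express them in
   packets of queued data):

      def vegas_update(cwnd, base_rtt, actual_tput, alpha, beta):
          """Called once per RTT; returns the new congestion window."""
          expected_tput = cwnd / base_rtt
          if actual_tput < expected_tput - beta:
              return cwnd - 1   # congestion building: linear decrease
          elif actual_tput > expected_tput - alpha:
              return cwnd + 1   # underutilized: linear increase
          return cwnd           # between the thresholds: hold the rate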
   TCP Vegas has been analyzed extensively.  One of the most prominent
   properties of TCP Vegas is its fairness between multiple flows of
   the same kind: it does not penalize flows with large propagation
   delays in the way that standard TCP does.  While Vegas was not the
   first protocol to use delay as a congestion indication, its
   predecessors (like CARD [Jai89], Tri-S [Wan91] or DUAL [Wan92]) are
   not discussed here because of the historical "landmark" role that
   TCP Vegas has taken in the literature.

   Delay-based transport protocols which were designed to be non-
   intrusive include TCP Nice [Ven02] and TCP Low Priority (TCP-LP)
   [Kuz06].  TCP Nice [Ven02] follows the same basic approach as TCP
   Vegas but improves upon it in some aspects.  Because of its
   moderate linear-decrease congestion response, TCP Vegas can affect
   standard TCP despite its ability to detect congestion early.  TCP
   Nice removes this issue by halving the congestion window (at most
   once per RTT, like standard TCP) instead of decreasing it linearly.
   To avoid being too conservative, it does so only if a fixed,
   predefined fraction of delay-based incipient-congestion signals
   appears within one RTT; otherwise, TCP Nice falls back to the
   congestion avoidance rules of TCP Vegas if no packet was lost, or
   of standard TCP if a packet was lost.  One more feature of TCP Nice
   is its ability to support a congestion window of less than one
   packet, by clocking out single packets over more than one RTT.
   With ns-2 simulations and real-life experiments using a Linux
   implementation, the authors of [Ven02] show that TCP Nice achieves
   its goal of efficiently utilizing spare capacity while being non-
   intrusive to standard TCP.

   Unlike TCP Vegas and TCP Nice, TCP-LP [Kuz06] uses the one-way
   delay (OWD) instead of the RTT as an indicator of incipient
   congestion.  This is done to avoid reacting to delay fluctuations
   that are caused by reverse cross-traffic.  Using the TCP Timestamps
   option [RFC1323], the OWD is determined as the difference between
   the receiver's Timestamp value in the ACK and the original
   Timestamp value that the receiver copied into the ACK.  While the
   result of this subtraction can only precisely represent the OWD if
   clocks are synchronized, its absolute value is of no concern to
   TCP-LP, and hence clock synchronization is unnecessary.  Using a
   constant smoothing parameter, TCP-LP calculates an Exponentially
   Weighted Moving Average (EWMA) of the measured OWD and checks
   whether the result exceeds a threshold within the range of the
   minimum and maximum OWD seen during the connection's lifetime; if
   it does, this condition is interpreted as an "early congestion
   indication".  The minimum and maximum OWD values are initialized
   during the slow-start phase.

   In its reaction to an early congestion indication, TCP-LP tries to
   strike a middle ground between the overly conservative choice of
   _immediately_ setting the congestion window to one packet, and the
   presumably too aggressive choice of simply halving the congestion
   window like standard TCP: it delays the former action by an
   additional RTT, to see whether congestion is persistent.  It does
   so by halving the window at first in response to an early
   congestion indication, then initializing an "inference time-out
   timer" and maintaining the current congestion window until this
   timer fires.  If another early congestion indication appears during
   this "inference phase", the window is set to one packet; otherwise,
   the window is maintained and TCP-LP continues to increase it in the
   standard additive-increase fashion.  This method ensures that it
   takes at least two RTTs for a TCP-LP flow to decrease its window to
   one packet, and that, like standard TCP, TCP-LP reacts to
   congestion at most once per RTT.
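   The detection and reaction logic just described can be sketched as
   follows (Python pseudocode; the smoothing parameter GAMMA, the
   threshold factor DELTA, and all names are illustrative assumptions
   rather than the constants defined in [Kuz06]):

      GAMMA = 0.125   # assumed EWMA smoothing parameter
      DELTA = 0.15    # assumed position of the threshold in [min, max]

      class TcpLpSketch:
          def __init__(self):
              self.owd_ewma = None
              self.owd_min = float("inf")   # set during slow start
              self.owd_max = 0.0
              self.cwnd = 1.0
              self.inference_phase = False

          def on_owd_sample(self, owd):
              # owd = receiver timestamp - sender timestamp; a fixed
              # clock offset shifts min, max and EWMA alike, so the
              # comparison below needs no clock synchronization.
              self.owd_min = min(self.owd_min, owd)
              self.owd_max = max(self.owd_max, owd)
              self.owd_ewma = (owd if self.owd_ewma is None else
                               (1 - GAMMA) * self.owd_ewma + GAMMA * owd)
              threshold = (self.owd_min +
                           DELTA * (self.owd_max - self.owd_min))
              if self.owd_ewma > threshold:
                  self.early_congestion_indication()

          def early_congestion_indication(self):
              if self.inference_phase:
                  self.cwnd = 1.0     # persistent congestion
              else:
                  self.cwnd /= 2      # first indication: halve window
                  self.inference_phase = True
                  # ...arm the inference time-out timer here; if it
                  # fires with no further indication, clear the flag
                  # and resume additive increase.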
   Using a simple analytical model, the authors of TCP-LP [Kuz06]
   illustrate the feasibility of a delay-based LBE transport by
   showing that, due to the non-linear relationship between throughput
   and RTT, it is possible to avoid interfering with standard TCP
   traffic even when the flows under consideration have a larger RTT
   than the standard TCP flows.  With ns-2 simulations and real-life
   experiments using a Linux implementation, the authors of [Kuz06]
   show that TCP-LP is largely non-intrusive to TCP traffic while at
   the same time enabling it to utilize a large portion of the excess
   network bandwidth, which is fairly shared among competing TCP-LP
   flows.  They also show that using their protocol for bulk data
   transfers greatly reduces the file transfer times of competing
   best-effort web traffic.

   Sync-TCP [Wei05] follows an approach similar to that of TCP-LP, by
   adapting its reaction to congestion according to changes in the
   OWD.  By comparing the estimated (average) forward queuing delay to
   the maximum observed delay, Sync-TCP adapts its AIMD parameters
   depending on the trend followed by the average delay over an
   observation window.  Even though the authors of [Wei05] did not
   explicitly consider its use as an LBE protocol, Sync-TCP was
   designed to react early to incipient congestion, while grabbing
   available bandwidth more aggressively than a standard TCP in
   congestion-avoidance mode.

   Delay-based congestion control is also the basis of proposals that
   aim to adapt TCP's congestion avoidance to very high-speed
   networks.  Some of these proposals, like Compound TCP [Tan06]
   [Sri08] and TCP Illinois [Liu08], are hybrid loss- and delay-based
   mechanisms, whereas others (e.g., NewVegas [Dev03], FAST TCP
   [Wei06] or CODE TCP [Cha10]) are variants of Vegas based primarily
   on delays.

2.1.  Accuracy of delay-based congestion predictors

   The accuracy of delay-based congestion predictors has been the
   subject of a good deal of research; see, e.g., [Bia03], [Mar03],
   [Pra04], [Rew06], [McC08].  The main result of most of these
   studies is that delays (or, more precisely, round-trip times) are,
   in general, weakly correlated with congestion.  Several factors may
   induce such a poor correlation:

   o  Bottleneck buffer size: in principle, a delay-based mechanism
      could be made "more than TCP friendly" _if_ buffers are "large
      enough", so that RTT fluctuations and/or deviations from the
      minimum RTT can be detected by the end-host with reasonable
      accuracy.  Otherwise, it may be hard to distinguish real delay
      variations from measurement noise.

   o  RTT measurement issues: RTT samples may suffer from poor
      resolution, due to timers which are too coarse-grained with
      respect to the scale of delay fluctuations.  Also, a flow may
      obtain a very noisy estimate of the RTT due to undersampling
      under some circumstances (e.g., when the flow rate is much lower
      than the link bandwidth).  For TCP, other potential sources of
      measurement noise include TCP segmentation offloading (TSO) and
      the use of delayed ACKs [Hay10]; a simple noise-smoothing sketch
      is given after this list.  A congested reverse path may also
      result in an erroneous assessment of the congestion state of the
      forward path.  Finally, in the case of fast or short-distance
      links, the majority of the measured delay can in fact be due to
      processing in the involved hosts; typically, this processing
      delay is not of interest, and it can undergo fluctuations that
      are not related to the network at all.

   o  Level of statistical multiplexing and RTT sampling: it may be
      easy for an individual flow to "miss" loss/queue overflow
      events, especially if the number of flows sharing a bottleneck
      buffer is significant.  This is nicely illustrated, e.g., in
      Fig. 1 of [McC08].

   o  Impact of wireless links: several mechanisms that are typical of
      wireless links, like link-layer scheduling and error recovery,
      may induce strong delay fluctuations over short time scales
      [Gur04].
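   One simple way to mitigate some of this measurement noise, and one
   of the techniques advocated by Bhandarkar et al. as discussed
   below, is to smooth raw RTT samples with an EWMA before using them
   for congestion inference.  A minimal sketch in Python, with an
   assumed smoothing weight:

      SMOOTHING_WEIGHT = 0.125   # assumed value; higher values react
                                 # faster but smooth less

      def smooth_rtt(samples):
          """Yield a smoothed estimate for each raw RTT sample."""
          estimate = None
          for sample in samples:
              if estimate is None:
                  estimate = sample
              else:
                  estimate += SMOOTHING_WEIGHT * (sample - estimate)
              yield estimate

      # A single spiky sample barely moves the smoothed estimate:
      print(list(smooth_rtt([0.050, 0.051, 0.120, 0.050])))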
   Interestingly, the results of Bhandarkar et al. [Bha07] seem to
   paint a slightly different picture regarding the accuracy of delay-
   based congestion prediction.  Bhandarkar et al. claim that it is
   possible to significantly improve prediction accuracy by adopting
   some simple techniques (smoothing of RTT samples, increasing the
   RTT sampling frequency).  Nonetheless, they acknowledge that even
   with such techniques, detection errors cannot be eradicated.  Their
   proposed delay-based congestion avoidance method, PERT
   (Probabilistic Early Response TCP), mitigates the impact of
   residual detection errors by means of a probabilistic response to
   congestion detection events.
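   The idea of a probabilistic response can be illustrated as follows
   (a sketch in the spirit of PERT, not the published algorithm; the
   thresholds and the linear probability ramp are our assumptions):

      import random

      MIN_TH = 0.005   # assumed: smoothed queuing delay (s) at which
      MAX_TH = 0.025   # the response probability ramps from 0 to 1

      def should_respond(queuing_delay):
          """Treat a delay signal as congestion only probabilistically,
          so that occasional false detections do little damage."""
          if queuing_delay < MIN_TH:
              return False
          if queuing_delay >= MAX_TH:
              return True
          ramp = (queuing_delay - MIN_TH) / (MAX_TH - MIN_TH)
          return random.random() < ramp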
2.2.  Potential issues with delay-based congestion control for LBE
      transport

   Whether a delay-based protocol behaves in its intended manner
   (e.g., it is "more than TCP friendly", or it grabs available
   bandwidth in a very aggressive manner) may therefore depend on the
   accuracy issues listed in Section 2.1.  Moreover, protocols like
   Vegas need to keep an estimate of the minimum ("base") delay; this
   makes such protocols highly sensitive to changes in the end-to-end
   route that may occur during the lifetime of the flow [Mo99].

   Regarding the issue of false positives/false negatives with a
   delay-based congestion detector, most studies focus on the loss of
   throughput coming from the erroneous detection of queue build-up
   and of alleviation of congestion.  Arguably, for an LBE transport
   protocol it is better to err on the "more-than-TCP-friendly" side,
   that is, to always yield to _perceived_ congestion whether it is
   "real" or not; however, failure to detect congestion (due to one of
   the above accuracy problems) would result in behavior that is not
   LBE.  For instance, consider the case in which the bottleneck
   buffer is small, so that the contribution of queueing delay at the
   bottleneck to the global end-to-end delay is small.  In such a
   case, a flow using a delay-based mechanism might end up consuming a
   good deal of bandwidth with respect to a competing standard TCP
   flow, unless it also incorporates a suitable reaction to loss.

   A delay-based mechanism may also suffer from the so-called
   "latecomer advantage" (or latecomer unfairness) problem.  Consider
   the case in which the bottleneck link is already (very) congested.
   In such a scenario, delay variations may be quite small; hence, it
   may be very difficult to tell an empty queue from a heavily-loaded
   queue in terms of delay fluctuation.  Therefore, a newly-arriving
   delay-based flow may start sending faster when there is already
   heavy congestion, eventually driving away loss-based flows
   [Sha05][Car10].

3.  Non-delay-based transport protocols

   A few transport-layer proposals achieve an LBE service without
   relying on delay as an indicator of congestion.  In the algorithms
   discussed below, the loss rate of the flow determines, either
   implicitly or explicitly, the sending rate (which is adapted so as
   to obtain a lower share of the available bandwidth than standard
   TCP); such mechanisms likely react to congestion more slowly than
   delay-based ones.

   4CP [Liu07], which stands for "Competitive and Considerate
   Congestion Control", is a protocol which provides an LBE service by
   changing the window control rules of standard TCP.  A "virtual
   window" is maintained which, during a so-called "bad congestion
   phase", is reduced below a predefined minimum value of the actual
   congestion window.  The congestion window is only increased again
   once the virtual window exceeds this minimum; in this way, the
   virtual window controls the duration during which the sender
   transmits at a fixed minimum rate.  Whether the congestion state is
   "bad" or "good" depends on whether the loss event rate is above or
   below a threshold (or target) value.  The 4CP congestion avoidance
   algorithm allows for setting a target average window, and it avoids
   starvation of "background" flows while bounding the impact on
   "foreground" flows.  Its performance was evaluated in ns-2
   simulations and in real-life experiments with a kernel-level
   implementation in Microsoft Windows Vista.
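   The virtual-window mechanism can be sketched as follows (Python
   pseudocode built from the description above; the constants, the
   update step, and all names are our illustration rather than the
   exact rules of [Liu07]):

      CWND_MIN = 2.0       # assumed minimum congestion window (pkts)
      TARGET_LOSS = 0.01   # assumed loss-event rate separating "good"
                           # from "bad" congestion states

      class FourCpSketch:
          def __init__(self):
              self.vwnd = CWND_MIN   # virtual window; may go below min
              self.cwnd = CWND_MIN   # actual congestion window

          def control_step(self, loss_rate, delta=1.0):
              if loss_rate > TARGET_LOSS:   # "bad" congestion phase
                  self.vwnd -= delta        # may sink below CWND_MIN
              else:                         # "good" congestion phase
                  self.vwnd += delta
              if self.vwnd >= CWND_MIN:
                  self.cwnd = self.vwnd     # normal window control
              else:
                  self.cwnd = CWND_MIN      # pinned at the minimum
                  # rate; the deficit accumulated in vwnd determines
                  # how long the sender stays pinned before growing.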
   The MulTFRC [Dam09] protocol is an extension of TCP-Friendly Rate
   Control (TFRC) [RFC5348] for multiple flows.  MulTFRC takes the
   main idea of MulTCP [Cro98] and similar proposals (e.g., [Hac04],
   [Hac08], [Kuo08]) a step further.  A single MulTCP flow tries to
   emulate (and be as friendly as) a number N > 1 of parallel TCP
   flows.  By supporting values of N between 0 and 1, MulTFRC can be
   used as a mechanism for an LBE service.  Since it does not react to
   delay like the protocols described in Section 2, but adjusts its
   rate like TFRC, MulTFRC can probably be expected to be more
   aggressive than mechanisms such as TCP Nice or TCP-LP.  This also
   means that MulTFRC is less likely to be prone to starvation, as its
   aggressiveness is tunable at a fine granularity, even when N is
   between 0 and 1.

4.  Upper-layer approaches

   The proposals described in this section do not require modifying
   transport protocol standards.  Most of them can be regarded as
   running "on top" of an existing transport, even though they may be
   implemented either at the application layer (i.e., in user-level
   processes) or in the kernel of the end hosts' operating system.
   Such "upper-layer" mechanisms may arguably be easier to deploy than
   transport-layer approaches, since they do not require any changes
   to the transport itself.

   A simplistic, application-level approach to a background transport
   service may consist of scheduling automated transfers at times when
   the network is lightly loaded, as described, e.g., in [Dyk02] for
   cooperative proxy caching.  An issue with such a technique is that
   it may not be appropriate for applications like peer-to-peer file
   transfer, since the notion of an "off-peak hour" is not meaningful
   when end-hosts may be located anywhere in the world.

   The so-called Background Intelligent Transfer Service (BITS) [BITS]
   is implemented in several versions of Microsoft Windows.  BITS uses
   a system of application-layer priority levels for file-transfer
   jobs, together with monitoring of the bandwidth usage of the
   network interface (or, in more recent versions, of the network
   gateway connected to the end-host), so that low-priority transfers
   at a given end-host give way to both high-priority (foreground)
   transfers and traffic from interactive applications at the same
   host.

   A different approach is taken in [Egg05]: here, the priority of a
   flow is reduced via a generic idletime scheduling strategy in a
   host's operating system.  While the results presented in this paper
   show that the new scheduler can effectively shield regular tasks
   from low-priority ones (e.g., TCP from greedy UDP) with only a
   minor performance impact, it is an underlying assumption that all
   involved end hosts use the idletime scheduler.  In other words, it
   is not the focus of this work to protect a standard TCP flow which
   originates from a host where the presented scheduling scheme may
   not be implemented.

4.1.  Receiver-oriented, flow-control based approaches

   Some proposals achieve an LBE behavior by exploiting existing
   transport-layer features -- typically, at the "receiving" side.  In
   particular, TCP's built-in flow control can be used as a means to
   achieve a low-priority transport service, as illustrated by the
   sketch at the end of this section.

   The mechanism described in [Spr00] is an example of this technique:
   it controls bandwidth by letting the receiver intelligently
   manipulate the receiver window of standard TCP.  This is possible
   because the authors assume a client-server setting where the
   receiver's access link is typically the bottleneck.  The scheme
   incorporates a delay-based calculation of the expected queue length
   at the bottleneck, which is quite similar to the calculation in the
   delay-based protocols above, e.g., TCP Vegas.  Using a Linux
   implementation, where TCP flows are classified according to their
   application's needs, Spring et al. show in [Spr00] that a
   significant improvement in packet latency can be attained over an
   unmodified system, while maintaining good link utilization.

   A similar method is employed by Mehra et al. [Meh03], where both
   the advertised receiver window and the delay in sending ACK
   messages are dynamically adapted to attain a given rate.  Like the
   authors of [Spr00], Mehra et al. assume that the bottleneck is
   located at the receiver's access link.  However, they also propose
   a bandwidth-sharing system, which makes it possible to control the
   bandwidth allocated to different flows, as well as to allot a
   minimum rate to some flows.

   Receiver window tuning is also done in [Key04], where choosing the
   right value for the window is phrased as an optimization problem.
   On this basis, two algorithms are presented: binary search, which
   is faster than the other at reaching a good operating point but
   fluctuates, and stochastic optimization, which does not fluctuate
   but converges more slowly than binary search.  These algorithms
   merely use the previous receiver window and the amount of data
   received during the previous control interval as input.  According
   to [Key04], the encouraging simulation results suggest that such an
   application-level mechanism can work almost as well as a transport-
   layer scheme like TCP-LP.

   Another way of dealing with non-interactive flows, like, e.g., web
   prefetching, is to rate-limit the transfer of such bursty traffic
   [Cro98b].  Note that one of the techniques used in [Cro98b] is,
   precisely, to have the downloading application adapt the TCP
   receiver window, so as to reduce the data rate to the minimum
   needed (thus disturbing other flows as little as possible while
   respecting a deadline for the transfer of the data).
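   The receiver-window techniques above all rest on the same basic
   relation: a receiver that never advertises more than rate * RTT
   bytes caps the sender's average throughput at roughly that rate.  A
   minimal sketch (names and values are illustrative; real schemes
   such as [Spr00] adapt the target dynamically):

      def advertised_window(target_rate_bps, rtt_s, mss=1460,
                            free_buffer=65535):
          """Window (in bytes) to advertise so that the sender averages
          at most target_rate_bps; never below one MSS and never above
          the actually available buffer space."""
          cap = int(target_rate_bps / 8 * rtt_s)   # bytes per RTT
          return max(mss, min(cap, free_buffer))

      # Example: cap a background flow at 1 Mbit/s over a 100 ms path.
      print(advertised_window(1_000_000, 0.100))   # -> 12500 bytes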
5.  Network-assisted approaches

   Network-layer mechanisms, like active queue management (AQM) and
   packet scheduling in routers, can be exploited by a transport
   protocol to achieve an LBE service.  Such approaches may result in
   improved protection of non-LBE flows (e.g., when scheduling is
   used); besides, approaches using explicit, AQM-based congestion
   signaling may arguably be more robust than, say, delay-based
   transports at detecting impending congestion.  However, an obvious
   drawback of any network-assisted approach is that, in principle, it
   needs modifications in both end-hosts and intermediate network
   nodes.

   Harp [Kok04] realizes an LBE service by dissipating background
   traffic to less-utilized paths of the network, based on multipath
   routing and multipath congestion control.  This is achieved without
   changing all routers, by using edge nodes as relays.  According to
   the authors, these edge nodes should be gateways of organizations,
   in order to align their scheme with usage incentives; however, the
   technical solution would also work if Harp was only deployed in end
   hosts.  Harp detects impending congestion by looking at delay,
   similar to TCP Nice [Ven02], and manages to improve the utilization
   and fairness of TCP over pure single-path solutions without
   requiring any changes to TCP itself.

   Another technique is that used by protocols like NF-TCP [Aru10b],
   where a bandwidth-estimation module integrated into the transport
   protocol allows it to rapidly take advantage of free capacity.
   NF-TCP combines this with early congestion detection based on
   Explicit Congestion Notification (ECN) [RFC3168] and RED [RFC2309]:
   when congestion starts building up, appropriate tuning of a RED
   queue allows low-priority (i.e., NF-TCP) packets to be marked with
   a much higher probability than high-priority (i.e., standard TCP)
   packets, so that low-priority flows yield up bandwidth before
   standard TCP flows do.  NF-TCP could be implemented by adapting the
   congestion control behavior of TCP, without changing the protocol
   on the wire -- with the only exception that NF-TCP-capable routers
   must be able to somehow distinguish NF-TCP traffic from other TCP
   traffic.
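   Such priority-dependent marking can be sketched as follows (a RED-
   flavored illustration of the idea, not the queue configuration from
   [Aru10b]; the thresholds and the aggressiveness factor are assumed
   values):

      import random

      MIN_TH, MAX_TH = 5, 50   # assumed RED thresholds (packets)
      MAX_P = 0.02             # assumed max marking prob. (standard)
      LBE_FACTOR = 20          # low-priority marked this much more

      def ecn_mark(avg_queue, low_priority):
          """Decide whether to ECN-mark a packet, marking low-priority
          (LBE) packets far more aggressively than standard ones."""
          if avg_queue <= MIN_TH:
              return False
          if avg_queue >= MAX_TH:
              return True
          p = MAX_P * (avg_queue - MIN_TH) / (MAX_TH - MIN_TH)
          if low_priority:
              p = min(1.0, p * LBE_FACTOR)
          return random.random() < p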
   In [Ven08], Venkataraman et al. propose a transport-layer approach
   to leverage an existing, network-layer LBE service based on
   priority queueing.  Their transport protocol, which they call PLT
   (Priority-Layer Transport), splits a layer-4 connection into two
   flows, a high-priority one and a low-priority one.  The high-
   priority flow is sent over the higher-priority queueing class (in
   principle, offering a best-effort service) using an AIMD, TCP-like
   congestion control mechanism.  The low-priority flow, which is
   mapped to the LBE class, uses a non-TCP-friendly congestion control
   algorithm.  The goal of PLT is thus to maximize its aggregate
   throughput by exploiting unused capacity in an aggressive way,
   while protecting standard TCP flows carried by the best-effort
   class.  Similar in spirit, [Ott03] proposes simple changes to only
   the AIMD parameters of TCP for use over a network-layer LBE
   service, so that such "filler" traffic may aggressively consume
   unused bandwidth.  Note that [Ven08] also considers a mechanism for
   detecting the lack of priority queueing in the network, so that the
   non-TCP-friendly flow may be inhibited: the PLT receiver monitors
   the loss rate of both flows; if the high-priority flow starts
   seeing losses while the low-priority one does not experience 100%
   loss, this is taken as an indication of the absence of strict
   priority queueing.

6.  Acknowledgements

   The authors would like to thank Dragana Damjanovic, Melissa Chavez
   and Yinxia Zhao for reference pointers, as well as Mayutan
   Arumaithurai, Mirja Kuehlewind and Wesley Eddy for their detailed
   reviews and suggestions.

7.  IANA Considerations

   This memo includes no request to IANA.

8.  Security Considerations

   This document introduces no new security considerations.

9.  Changes from the previous version (section to be removed later)

   o  Updated the introduction to cover the intended audience and to
      say that this document is coming from the LEDBAT WG.

   o  Removed the "+" in reference anchors.

   o  Small editorial changes and various fixes based on Wes' comments
      throughout the document.

10.  Informative References

   [Aru10b]   Arumaithurai, M., Fu, X., and K. Ramakrishnan, "NF-TCP:
              A Network Friendly TCP Variant for Background Delay-
              Insensitive Applications", Technical Report No. IFI-TB-
              2010-05, Institute of Computer Science, University of
              Goettingen, Germany, September 2010.

   [BITS]     Microsoft, "Windows Background Intelligent Transfer
              Service".

   [Bha07]    Bhandarkar, S., Reddy, A., Zhang, Y., and D. Loguinov,
              "Emulating AQM from end hosts", Proceedings of ACM
              SIGCOMM 2007, 2007.

   [Bia03]    Biaz, S. and N. Vaidya, "Is the round-trip time
              correlated with the number of packets in flight?",
              Proceedings of the 3rd ACM SIGCOMM Conference on
              Internet Measurement (IMC '03), pp. 273-278, 2003.

   [Bra94]    Brakmo, L., O'Malley, S., and L. Peterson, "TCP Vegas:
              New techniques for congestion detection and avoidance",
              Proceedings of SIGCOMM '94, pp. 24-35, August 1994.

   [Car10]    Carofiglio, G., Muscariello, L., Rossi, D., and S.
              Valenti, "The quest for LEDBAT fairness", Proceedings of
              IEEE GLOBECOM 2010, December 2010.

   [Cha10]    Chan, Y., Lin, C., Chan, C., and C. Ho, "CODE TCP: A
              competitive delay-based TCP", Computer Communications,
              33(9):1013-1029, June 2010.

   [Cro98]    Crowcroft, J. and P. Oechslin, "Differentiated end-to-
              end Internet services using a weighted proportional fair
              sharing TCP", ACM SIGCOMM Computer Communication Review,
              vol. 28, no. 3, pp. 53-69, July 1998.

   [Cro98b]   Crovella, M. and P. Barford, "The network effects of
              prefetching", Proceedings of IEEE INFOCOM 1998,
              April 1998.

   [Dam09]    Damjanovic, D. and M. Welzl, "MulTFRC: Providing
              Weighted Fairness for Multimedia Applications (and
              others too!)", ACM SIGCOMM Computer Communication
              Review, vol. 39, no. 3, July 2009.

   [Dev03]    De Vendictis, A., Baiocchi, A., and M. Bonacci,
              "Analysis and enhancement of TCP Vegas congestion
              control in a mixed TCP Vegas and TCP Reno network
              scenario", Performance Evaluation, 53(3-4):225-253,
              2003.

   [Dyk02]    Dykes, S. and K. Robbins, "Limitations and benefits of
              cooperative proxy caching", IEEE Journal on Selected
              Areas in Communications, 20(7):1290-1304,
              September 2002.

   [Egg05]    Eggert, L. and J. Touch, "Idletime Scheduling with
              Preemption Intervals", Proceedings of the 20th ACM
              Symposium on Operating Systems Principles (SOSP 2005),
              Brighton, United Kingdom, pp. 249-262, October 2005.
Floyd, "Modeling wireless links for 606 transport protocols", ACM SIGCOMM Computer Communications 607 Review 34(2):85-96, April 2004. 609 [Hac04] Hacker, T., Noble, B., and B. Athey, "Improving Throughput 610 and Maintaining Fairness using Parallel TCP", Proceedings 611 of IEEE INFOCOM 2004, March 2004. 613 [Hac08] Hacker, T. and P. Smith, "Stochastic TCP: A Statistical 614 Approach to Congestion Avoidance", Proceedings of 615 PFLDnet 2008, March 2008. 617 [Hay10] Hayes, D., "Timing enhancements to the FreeBSD kernel to 618 support delay and rate based TCP mechanisms", Technical 619 Report 100219A , Centre for Advanced Internet 620 Architectures, Swinburne University of Technology, 621 February 2010. 623 [Hen00] Hengartner, U., Bolliger, J., and T. Gross, "TCP Vegas 624 revisited", Proceedings of IEEE INFOCOM 2000, March 2000. 626 [Jai89] Jain, R., "A delay-based approach for congestion avoidance 627 in interconnected heterogeneous computer networks", ACM 628 Computer Communication Review , 19(5):56-71, October 1989. 630 [Key04] Key, P., Massoulie, L., and B. Wang, "Emulating Low- 631 Priority Transport at the Application Layer: a Background 632 Transfer Service", Proceedings of ACM SIGMETRICS 2004, 633 January 2004. 635 [Kok04] Kokku, R., Bohra, A., Ganguly, S., and A. Venkataramani, 636 "A Multipath Background Network Architecture", Proceedings 637 of IEEE INFOCOM 2007, May 2007. 639 [Kuo08] Kuo, F. and X. Fu, "Probe-Aided MulTCP: an aggregate 640 congestion control mechanism", ACM SIGCOMM Computer 641 Communication Review vol. 38, no. 1 (January 2008), pp. 642 17-28, 2008. 644 [Kur00] Kurata, K., Hasegawa, G., and M. Murata, "Fairness 645 Comparisons Between TCP Reno and TCP Vegas for Future 646 Deployment of TCP Vegas", Proceedings of INET 2000, 647 July 2000. 649 [Kuz06] Kuzmanovic, A. and E. Knightly, "TCP-LP: low-priority 650 service via end-point congestion control", IEEE/ACM 651 Transactions on Networking (ToN) Volume 14, Issue 4, pp. 652 739-752., August 2006, 653 . 655 [Liu07] Liu, S., Vojnovic, M., and D. Gunawardena, "Competitive 656 and Considerate Congestion Control for Bulk Data 657 Transfers", Proceedings of IWQoS 2007, June 2007. 659 [Liu08] Liu, S., Basar, T., and R. Srikant, "TCP-Illinois: A loss- 660 and delay-based congestion control algorithm for high- 661 speed networks", Performance Evaluation , 65(6-7):417-440, 662 2008. 664 [Mar03] Martin, J., Nilsson, A., and I. Rhee, "Delay-based 665 congestion avoidance for TCP", IEEE/ACM Transactions on 666 Networking , 11(3):356-369, June 2003. 668 [McC08] McCullagh, G. and D. Leith, "Delay-based congestion 669 control: Sampling and correlation issues revisited", 670 Technical report , Hamilton Institute, 2008. 672 [Meh03] Mehra, P., Zakhor, A., and C. De Vleeschouwer, "Receiver- 673 Driven Bandwidth Sharing for TCP", Proceedings of IEEE 674 INFOCOM 2003, April 2003. 676 [Mo99] Mo, J., La, R., Anantharam, V., and J. Walrand, "Analysis 677 and Comparison of TCP Reno and TCP Vegas", Proceedings of 678 IEEE INFOCOM 1999, March 1999. 680 [Ott03] Ott, B., Warnky, T., and V. Liberatore, "Congestion 681 control for low-priority filler traffic", SPIE QoS 2003 682 (Quality of Service over Next-Generation Internet), In 683 Proc. SPIE, Vol. 5245, 154, Monterey (CA), USA, July 2003. 685 [Pra04] Prasad, R., Jain, M., and C. Dovrolis, "On the 686 effectiveness of delay-based congestion avoidance", 687 Proceedings of PFLDnet , 2004. 689 [RFC1323] Jacobson, V., Braden, B., and D. 
Borman, "TCP Extensions 690 for High Performance", RFC 1323, May 1992. 692 [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, 693 S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., 694 Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, 695 S., Wroclawski, J., and L. Zhang, "Recommendations on 696 Queue Management and Congestion Avoidance in the 697 Internet", RFC 2309, April 1998. 699 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 700 of Explicit Congestion Notification (ECN) to IP", 701 RFC 3168, September 2001. 703 [RFC3662] Bless, R., Nichols, K., and K. Wehrle, "A Lower Effort 704 Per-Domain Behavior (PDB) for Differentiated Services", 705 RFC 3662, December 2003. 707 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 708 Friendly Rate Control (TFRC): Protocol Specification", 709 RFC 5348, September 2008. 711 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 712 Control", RFC 5681, September 2009. 714 [Rew06] Rewaskar, S., Kaur, J., and D. Smith, "Why don't delay- 715 based congestion estimators work in the real-world?", 716 Technical report TR06-001 , University of North Carolina 717 at Chapel Hill, Dept. of Computer Science, January 2006. 719 [Sha05] Shalunov, S., Dunn, L., Gu, Y., Low, S., Rhee, I., Senger, 720 S., Wydrowski, B., and L. Xu, "Design Space for a Bulk 721 Transport Tool", Technical Report , Internet2 Transport 722 Group, May 2005. 724 [Spr00] Spring, N., Chesire, M., Berryman, M., Sahasranaman, V., 725 Anderson, T., and B. Bershad, "Receiver based management 726 of low bandwidth access links", Proceedings of IEEE 727 INFOCOM 2000, pp. 245-254, vol.1, 2000. 729 [Sri08] Sridharan, M., Tan, K., Bansala, D., and D. Thaler, 730 "Compound TCP: A new TCP congestion control for high-speed 731 and long distance networks", Internet Draft 732 draft-sridharan-tcpm-ctcp , work in progress, 733 November 2008. 735 [Tan06] Tan, K., Song, J., Zhang, Q., and M. Sridharan, "A 736 Compound TCP approach for high-speed and long distance 737 networks", Proceedings of IEEE INFOCOM 2006, Barcelona, 738 Spain, April 2008. 740 [Ven02] Venkataramani, A., Kokku, R., and M. Dahlin, "TCP Nice: a 741 mechanism for background transfers", Proceedings of 742 OSDI '02, 2002. 744 [Ven08] Venkataraman, V., Francis, P., Kodialam, M., and T. 745 Lakshman, "A priority-layered approach to transport for 746 high bandwidth-delay product networks", Proceedings of ACM 747 CoNEXT, Madrid, December 2008. 749 [Wan91] Wang, Z. and J. Crowcroft, "A new congestion control 750 scheme: slow start and search (Tri-S)", ACM Computer 751 Communication Review , 21(1):56-71, January 1991. 753 [Wan92] Wang, Z. and J. Crowcroft, "Eliminating periodic packet 754 losses in the 4.3-Tahoe BSD TCP congestion control 755 algorithm", ACM Computer Communication Review , 22(2): 756 9-16, January 1992. 758 [Wei05] Weigle, M., Jeffay, K., and F. Smith, "Delay-based early 759 congestion detection and adaptation in TCP: impact on web 760 performance", Computer Communications 28(8):837-850, 761 May 2005. 763 [Wei06] Wei, D., Jin, C., Low, S., and S. Hegde, "FAST TCP: 764 Motivation, architecture, algorithms, performance", IEEE/ 765 ACM Transactions on Networking , 14(6):1246-1259, 766 December 2006. 
Authors' Addresses

   Michael Welzl
   University of Oslo
   Department of Informatics, PO Box 1080 Blindern
   N-0316 Oslo
   Norway

   Phone: +43 512 507 6110
   Email: michawe@ifi.uio.no

   David Ros
   Institut Telecom / Telecom Bretagne
   Rue de la Chataigneraie, CS 17607
   35576 Cesson Sevigne cedex
   France

   Phone: +33 2 99 12 70 46
   Email: david.ros@telecom-bretagne.eu