2 RTP Media Congestion Avoidance S. Islam 3 Techniques (rmcat) M. Welzl 4 Internet-Draft S. Gjessing 5 Intended status: Experimental University of Oslo 6 Expires: March 17, 2016 September 14, 2015 8 Coupled congestion control for RTP media 9 draft-ietf-rmcat-coupled-cc-00 11 Abstract 13 When multiple congestion controlled RTP sessions traverse the same 14 network bottleneck, it can be beneficial to combine their controls 15 such that the total on-the-wire behavior is improved. This document 16 describes such a method for flows that have the same sender, in a way 17 that is as flexible and simple as possible while minimizing the 18 amount of changes needed to existing RTP applications. It specifies 19 how to apply the method for the NADA congestion control algorithm. 21 Status of this Memo 23 This Internet-Draft is submitted in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/. 31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on March 17, 2016. 38 Copyright Notice 40 Copyright (c) 2015 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . .
3 56 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 3 57 3. Limitations . . . . . . . . . . . . . . . . . . . . . . . . . 4 58 4. Architectural overview . . . . . . . . . . . . . . . . . . . . 5 59 5. Roles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 60 5.1. SBD . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 61 5.2. FSE . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 62 5.3. Flows . . . . . . . . . . . . . . . . . . . . . . . . . . 7 63 5.3.1. Example algorithm 1 - Active FSE . . . . . . . . . . . 7 64 5.3.2. Example algorithm 2 - Conservative Active FSE . . . . 8 65 6. Application . . . . . . . . . . . . . . . . . . . . . . . . . 10 66 6.1. NADA . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 67 6.2. General recommendations . . . . . . . . . . . . . . . . . 10 68 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 11 69 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 70 9. Security Considerations . . . . . . . . . . . . . . . . . . . 11 71 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11 72 10.1. Normative References . . . . . . . . . . . . . . . . . . . 11 73 10.2. Informative References . . . . . . . . . . . . . . . . . . 12 74 Appendix A. Scheduling . . . . . . . . . . . . . . . . . . . . . 13 75 Appendix B. Example algorithm - Passive FSE . . . . . . . . . . . 13 76 B.1. Example operation (passive) . . . . . . . . . . . . . . . 15 77 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 79 1. Introduction 81 When there is enough data to send, a congestion controller must 82 increase its sending rate until the path's capacity has been reached; 83 depending on the controller, sometimes the rate is increased further, 84 until packets are ECN-marked or dropped. This process inevitably 85 creates undesirable queuing delay -- an effect that is amplified when 86 multiple congestion controlled connections traverse the same network 87 bottleneck. 89 The Congestion Manager (CM) [RFC3124] couples flows by providing a 90 single congestion controller. It is hard to implement because it 91 requires an additional congestion controller and removes all per- 92 connection congestion control functionality, which is quite a 93 significant change to existing RTP based applications. This document 94 presents a method to combine the behavior of congestion control 95 mechanisms that is easier to implement than the Congestion Manager 96 [RFC3124] and also requires less significant changes to existing RTP 97 based applications. It attempts to roughly approximate the CM 98 behavior by sharing information between existing congestion 99 controllers. It is able to honor user-specified priorities, which is 100 required by rtcweb [rtcweb-usecases]. 102 2. Definitions 104 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 105 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 106 document are to be interpreted as described in RFC 2119 [RFC2119]. 108 Available Bandwidth: 109 The available bandwidth is the nominal link capacity minus the 110 amount of traffic that traversed the link during a certain time 111 interval, divided by that time interval. 113 Bottleneck: 114 The first link with the smallest available bandwidth along the 115 path between a sender and receiver. 117 Flow: 118 A flow is the entity that congestion control is operating on. 
119 It could, for example, be a transport layer connection, an RTP 120 session, or a subsession that is multiplexed onto a single RTP 121 session together with other subsessions. 123 Flow Group Identifier (FGI): 124 A unique identifier for each subset of flows that is limited by 125 a common bottleneck. 127 Flow State Exchange (FSE): 128 The entity that maintains information that is exchanged between 129 flows. 131 Flow Group (FG): 132 A group of flows having the same FGI. 134 Shared Bottleneck Detection (SBD): 135 The entity that determines which flows traverse the same 136 bottleneck in the network, or the process of doing so. 138 3. Limitations 140 Sender-side only: 141 Coupled congestion control as described here only operates 142 inside a single host on the sender side. This is because, 143 irrespective of where the major decisions for congestion 144 control are taken, the sender of a flow needs to eventually 145 decide the transmission rate. Additionally, the necessary 146 information about how much data an application can currently 147 send on a flow is often only available at the sender side, 148 making the sender an obvious choice for placement of the 149 elements and mechanisms described here. 151 Shared bottlenecks do not change quickly: 152 As per the definition above, a bottleneck depends on cross 153 traffic, and since such traffic can heavily fluctuate, 154 bottlenecks can change at a high frequency (e.g., there can be 155 oscillation between two or more links). This means that, when 156 flows are partially routed along different paths, they may 157 quickly change between sharing and not sharing a bottleneck. 158 For simplicity, here it is assumed that a shared bottleneck is 159 valid for a time interval that is significantly longer than the 160 interval at which congestion controllers operate. Note that, 161 for the only SBD mechanism defined in this document 162 (multiplexing on the same five-tuple), the notion of a shared 163 bottleneck stays correct even in the presence of fast traffic 164 fluctuations: since all flows that are assumed to share a 165 bottleneck are routed in the same way, if the bottleneck 166 changes, it will still be shared. 168 4. Architectural overview 170 Figure 1 shows the elements of the architecture for coupled 171 congestion control: the Flow State Exchange (FSE), Shared Bottleneck 172 Detection (SBD) and Flows. The FSE is a storage element that can be 173 implemented in two ways: active and passive. In the active version, 174 it initiates communication with flows and SBD. However, in the 175 passive version, it does not actively initiate communication with 176 flows and SBD; its only active role is internal state maintenance 177 (e.g., an implementation could use soft state to remove a flow's data 178 after long periods of inactivity). Every time a flow's congestion 179 control mechanism would normally update its sending rate, the flow 180 instead updates information in the FSE and performs a query on the 181 FSE, leading to a sending rate that can be different from what the 182 congestion controller originally determined. Using information 183 about/from the currently active flows, SBD updates the FSE with the 184 correct Flow State Identifiers (FSIs). 186 ------- <--- Flow 1 187 | FSE | <--- Flow 2 .. 188 ------- <--- .. 
Flow N 189 ^ 190 | | 191 ------- | 192 | SBD | <-------| 193 ------- 195 Figure 1: Coupled congestion control architecture 197 Since everything shown in Figure 1 is assumed to operate on a single 198 host (the sender) only, this document only describes aspects that 199 have an influence on the resulting on-the-wire behavior. It does, 200 for instance, not define how many bits must be used to represent 201 FSIs, or in which way the entities communicate. Implementations can 202 take various forms: for instance, all the elements in the figure 203 could be implemented within a single application, thereby operating 204 on flows generated by that application only. Another alternative 205 could be to implement both the FSE and SBD together in a separate 206 process which different applications communicate with via some form 207 of Inter-Process Communication (IPC). Such an implementation would 208 extend the scope to flows generated by multiple applications. The 209 FSE and SBD could also be included in the Operating System kernel. 211 5. Roles 213 This section gives an overview of the roles of the elements of 214 coupled congestion control, and provides an example of how coupled 215 congestion control can operate. 217 5.1. SBD 219 SBD uses knowledge about the flows to determine which flows belong in 220 the same Flow Group (FG), and assigns FGIs accordingly. This 221 knowledge can be derived in three basic ways: 223 1. From multiplexing: it can be based on the simple assumption that 224 packets sharing the same five-tuple (IP source and destination 225 address, protocol, and transport layer port number pair) and 226 having the same Differentiated Services Code Point (DSCP) in the 227 IP header are typically treated in the same way along the path. 228 The latter method is the only one specified in this document: SBD 229 MAY consider all flows that use the same five-tuple and DSCP to 230 belong to the same FG. This classification applies to certain 231 tunnels, or RTP flows that are multiplexed over one transport 232 (cf. [transport-multiplex]). In one way or another, such 233 multiplexing will probably be recommended for use with rtcweb 234 [rtcweb-rtp-usage]. 236 2. Via configuration: e.g. by assuming that a common wireless uplink 237 is also a shared bottleneck. 239 3. From measurements: e.g. by considering correlations among 240 measured delay and loss as an indication of a shared bottleneck. 242 The methods above have some essential trade-offs: e.g., multiplexing 243 is a completely reliable measure, however it is limited in scope to 244 two end points (i.e., it cannot be applied to couple congestion 245 controllers of one sender talking to multiple receivers). A 246 measurement-based SBD mechanism is described in [sbd]. Measurements 247 can never be 100% reliable, in particular because they are based on 248 the past but applying coupled congestion control means to make an 249 assumption about the future; it is therefore recommended to implement 250 cautionary measures, e.g. by disabling coupled congestion control if 251 enabling it causes a significant increase in delay and/or packet 252 loss. Measurements also take time, which entails a certain delay for 253 turning on coupling (refer to [sbd] for details). 255 5.2. FSE 257 The FSE contains a list of all flows that have registered with it. 
258 For each flow, it stores the following: 260 o a unique flow number to identify the flow 262 o the FGI of the FG that it belongs to (based on the definitions in 263 this document, a flow has only one bottleneck, and can therefore 264 be in only one FG) 266 o a priority P, which here is assumed to be represented as a 267 floating point number in the range from 0.1 (unimportant) to 1 268 (very important). A negative value is used to indicate that a 269 flow has terminated 271 o The rate used by the flow in bits per second, FSE_R. 273 The FSE can operate on window-based as well as rate-based congestion 274 controllers (TEMPORARY NOTE: and probably -- not yet tested -- 275 combinations thereof, with calculations to convert from one to the 276 other). In case of a window-based controller, FSE_R is a window, and 277 all the text below should be considered to refer to windows, not 278 rates. 280 In the FSE, each FG contains one static variable S_CR which is meant 281 to be the sum of the calculated rates of all flows in the same FG 282 (including the flow itself). This value is used to calculate the 283 sending rate. 285 The information listed here is enough to implement the sample flow 286 algorithm given below. FSE implementations could easily be extended 287 to store, e.g., a flow's current sending rate for statistics 288 gathering or future potential optimizations. 290 5.3. Flows 292 Flows register themselves with SBD and FSE when they start, 293 deregister from the FSE when they stop, and carry out an UPDATE 294 function call every time their congestion controller calculates a new 295 sending rate. Via UPDATE, they provide the newly calculated rate and 296 optionally (if the algorithm supports it) the desired rate. The 297 desired rate is less than the calculated rate in case of application- 298 limited flows; otherwise, it is the same as the calculated rate. 300 Below, two example algorithms are described. While other algorithms 301 could be used instead, the same algorithm must be applied to all 302 flows. 304 5.3.1. Example algorithm 1 - Active FSE 306 This algorithm was designed to be the simplest possible method to 307 assign rates according to the priorities of flows. Simulation 308 results in [fse] indicate that it does not, however, significantly 309 reduce queuing delay and packet loss. 311 (1) When a flow f starts, it registers itself with SBD and the FSE. 312 FSE_R is initialized with the congestion controller's initial 313 rate. SBD will assign the correct FGI. When a flow is assigned 314 an FGI, it adds its FSE_R to S_CR. 316 (2) When a flow f stops, its entry is removed from the list. 318 (3) Every time the congestion controller of the flow f determines a 319 new sending rate CC_R, the flow calls UPDATE, which carries out 320 the tasks listed below to derive the new sending rates for all 321 the flows in the FG. A flow's UPDATE function uses a local 322 (i.e. per-flow) temporary variable S_P, which is the sum of all 323 the priorities. 325 (a) It updates S_CR. 327 S_CR = S_CR + CC_R - FSE_R(f) 329 (b) It calculates the sum of all the priorities, S_P. 331 S_P = 0 332 for all flows i in FG do 333 S_P = S_P + P(i) 334 end for 336 (c) It calculates the sending rates for all the flows in an FG 337 and distributes them. 339 for all flows i in FG do 340 FSE_R(i) = (P(i)*S_CR)/S_P 341 send FSE_R(i) to the flow i 342 end for
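The following non-normative sketch, written in Python purely for illustration, combines the per-flow state listed in Section 5.2 with the three steps of the Active FSE above. The class and attribute names (Flow, ActiveFSE, send_cb and so on) are invented for this sketch, and a simple callback stands in for whatever mechanism an implementation uses to "send FSE_R(i) to the flow i"; it is a sketch of the algorithm, not a definitive implementation.

   class Flow:
       def __init__(self, flow_id, priority, initial_rate, send_cb):
           self.flow_id = flow_id
           self.priority = priority    # P: 0.1 (unimportant) .. 1 (very important)
           self.fse_r = initial_rate   # FSE_R, in bits per second
           self.send_cb = send_cb      # used to "send FSE_R(i) to the flow i"

   class ActiveFSE:
       def __init__(self):
           self.groups = {}            # FGI -> {"s_cr": float, "flows": {id: Flow}}

       def register(self, fgi, flow):
           # Step (1): when a flow is assigned an FGI, its FSE_R is added to S_CR.
           fg = self.groups.setdefault(fgi, {"s_cr": 0.0, "flows": {}})
           fg["s_cr"] += flow.fse_r
           fg["flows"][flow.flow_id] = flow

       def deregister(self, fgi, flow_id):
           # Step (2): when a flow stops, its entry is removed from the list.
           self.groups[fgi]["flows"].pop(flow_id, None)

       def update(self, fgi, flow_id, cc_r):
           # Step (3): the flow's congestion controller determined a new rate CC_R.
           fg = self.groups[fgi]
           f = fg["flows"][flow_id]
           fg["s_cr"] += cc_r - f.fse_r                         # (a)
           s_p = sum(i.priority for i in fg["flows"].values())  # (b)
           for i in fg["flows"].values():                       # (c)
               i.fse_r = (i.priority * fg["s_cr"]) / s_p
               i.send_cb(i.fse_r)

   # Example: two flows with priorities 1 and 0.5 in the same FG:
   #   fse = ActiveFSE()
   #   fse.register(1, Flow(1, 1.0, 1e6, lambda r: None))
   #   fse.register(1, Flow(2, 0.5, 1e6, lambda r: None))
   #   fse.update(1, 1, 3e6)  # S_CR becomes 4 Mbit/s, split 2.67/1.33 Mbit/s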
344 5.3.2. Example algorithm 2 - Conservative Active FSE 346 This algorithm extends algorithm 1 to conservatively emulate the 347 behavior of a single flow by proportionally reducing the aggregate 348 rate on congestion. Simulation results in [fse] indicate that it 349 can significantly reduce queuing delay and packet loss. 351 (1) When a flow f starts, it registers itself with SBD and the FSE. 352 FSE_R is initialized with the congestion controller's initial 353 rate. SBD will assign the correct FGI. When a flow is assigned 354 an FGI, it adds its FSE_R to S_CR. 356 (2) When a flow f stops, its entry is removed from the list. 358 (3) Every time the congestion controller of the flow f determines a 359 new sending rate CC_R, the flow calls UPDATE, which carries out 360 the tasks listed below to derive the new sending rates for all 361 the flows in the FG. A flow's UPDATE function uses a local 362 (i.e. per-flow) temporary variable S_P, which is the sum of all 363 the priorities, and a local variable DELTA, which is used to 364 calculate the difference between CC_R and the previously stored 365 FSE_R. To prevent flows from either ignoring congestion or 366 overreacting, a timer keeps them from changing their rates 367 immediately after the common rate reduction that follows a 368 congestion event. This timer is set to 2 RTTs of the flow that 369 experienced congestion because it is assumed that a congestion 370 event can persist for up to one RTT of that flow, with another 371 RTT added to compensate for fluctuations in the measured RTT 372 value. 374 (a) It updates S_CR based on DELTA. 376 if Timer has expired or not set then 377 DELTA = CC_R - FSE_R(f) 378 if DELTA < 0 then // Reduce S_CR proportionally 379 S_CR = S_CR * CC_R / FSE_R(f) 380 Set Timer for 2 RTTs 381 else 382 S_CR = S_CR + DELTA 383 end if 384 end if 386 (b) It calculates the sum of all the priorities, S_P. 388 S_P = 0 389 for all flows i in FG do 390 S_P = S_P + P(i) 391 end for 393 (c) It calculates the sending rates for all the flows in an FG 394 and distributes them. 396 for all flows i in FG do 397 FSE_R(i) = (P(i)*S_CR)/S_P 398 send FSE_R(i) to the flow i 399 end for 401 6. Application 403 This section specifies how the FSE can be applied to specific 404 congestion control mechanisms and makes general recommendations that 405 facilitate applying the FSE to future congestion controls. 407 6.1. NADA 409 Network-Assisted Dynamic Adaptation (NADA) [nada] is a congestion 410 control scheme for rtcweb. It calculates a reference rate R_n upon 411 receiving an acknowledgment, and then, based on the reference rate, 412 it calculates a video target rate R_v and a sending rate for the 413 flows, R_s. 415 When applying the FSE to NADA, the UPDATE function call described in 416 Section 5.3 gives the FSE NADA's reference rate R_n. The recommended 417 algorithm for NADA is the Active FSE in Section 5.3.1. In step 3 418 (c), when the FSE_R(i) is "sent" to the flow i, this means updating 419 R_v and R_s of flow i with the value of FSE_R(i). 421 NADA simulation results are available from 422 http://heim.ifi.uio.no/safiquli/coupled-cc/. The next version of 423 this document will refer to a technical report that will be made 424 available at the same URL. 426 6.2. General recommendations 428 This section provides general advice for applying the FSE to 429 congestion control mechanisms. TEMPORARY NOTE: Future versions of 430 this document will contain a longer list.
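As a concrete illustration of the mapping in Section 6.1, the following non-normative sketch extends the Python sketch given after Section 5.3.1 (it assumes the Flow and ActiveFSE classes defined there). NadaFlow, apply_fse_rate and on_acknowledgment are hypothetical names; the only behavior taken from [nada] is that the controller calculates a reference rate R_n, from which the video target rate R_v and the sending rate R_s are then derived.

   class NadaFlow:
       def __init__(self, flow_id, priority, fse, fgi, initial_rate):
           self.fse, self.fgi, self.flow_id = fse, fgi, flow_id
           self.r_v = initial_rate     # video target rate R_v
           self.r_s = initial_rate     # sending rate R_s
           f = Flow(flow_id, priority, initial_rate, self.apply_fse_rate)
           fse.register(fgi, f)        # step (1) of the Active FSE

       def apply_fse_rate(self, fse_r):
           # Step 3 (c): "sending" FSE_R(i) to flow i means updating its
           # R_v and R_s with the value of FSE_R(i).
           self.r_v = fse_r
           self.r_s = fse_r

       def on_acknowledgment(self, r_n):
           # NADA calculates a new reference rate R_n upon receiving an
           # acknowledgment; the flow hands R_n to the FSE via UPDATE
           # instead of applying it directly.
           self.fse.update(self.fgi, self.flow_id, r_n)

   # Example: two coupled NADA-like flows in FG 1 with priorities 1 and 0.5:
   #   fse = ActiveFSE()
   #   f1 = NadaFlow(1, 1.0, fse, fgi=1, initial_rate=1e6)
   #   f2 = NadaFlow(2, 0.5, fse, fgi=1, initial_rate=1e6)
   #   f1.on_acknowledgment(3e6)  # f1.r_s ~ 2.67 Mbit/s, f2.r_s ~ 1.33 Mbit/s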
432 Receiver-side calculations: 433 When receiver-side calculations make assumptions about the rate 434 of the sender, the calculations need to be synchronized or the 435 receiver needs to be updated accordingly. This applies to TFRC 436 [RFC5348], for example, where simulations showed somewhat less 437 favorable results when using the FSE without a receiver-side 438 change [fse]. 440 7. Acknowledgements 442 This document has benefitted from discussions with and feedback from 443 David Hayes, Mirja Kuehlewind, Andreas Petlund, David Ros (who also 444 gave the FSE its name), Zaheduzzaman Sarker and Varun Singh. The 445 authors would like to thank Xiaoqing Zhu for helping with NADA. 447 This work was partially funded by the European Community under its 448 Seventh Framework Programme through the Reducing Internet Transport 449 Latency (RITE) project (ICT-317700). 451 8. IANA Considerations 453 This memo includes no request to IANA. 455 9. Security Considerations 457 In scenarios where the architecture described in this document is 458 applied across applications, various cheating possibilities arise: 459 e.g., supporting wrong values for the calculated rate, the desired 460 rate, or the priority of a flow. In the worst case, such cheating 461 could either prevent other flows from sending or make them send at a 462 rate that is unreasonably large. The end result would be unfair 463 behavior at the network bottleneck, akin to what could be achieved 464 with any UDP based application. Hence, since this is no worse than 465 UDP in general, there seems to be no significant harm in using this 466 in the absence of UDP rate limiters. 468 In the case of a single-user system, it should also be in the 469 interest of any application programmer to give the user the best 470 possible experience by using reasonable flow priorities or even 471 letting the user choose them. In a multi-user system, this interest 472 may not be given, and one could imagine the worst case of an "arms 473 race" situation, where applications end up setting their priorities 474 to the maximum value. If all applications do this, the end result is 475 a fair allocation in which the priority mechanism is implicitly 476 eliminated, and no major harm is done. 478 10. References 480 10.1. Normative References 482 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 483 Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/ 484 RFC2119, March 1997, 485 . 487 [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", 488 RFC 3124, DOI 10.17487/RFC3124, June 2001, 489 . 491 [RFC5348] Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP 492 Friendly Rate Control (TFRC): Protocol Specification", 493 RFC 5348, DOI 10.17487/RFC5348, September 2008, 494 . 496 10.2. Informative References 498 [fse] Islam, S., Welzl, M., Gjessing, S., and N. Khademi, 499 "Coupled Congestion Control for RTP Media", ACM SIGCOMM 500 Capacity Sharing Workshop (CSWS 2014); extended version 501 available as a technical report from 502 http://safiquli.at.ifi.uio.no/paper/fse-tech-report.pdf , 503 2014. 505 [nada] Zhu, X., Pan, R., Ramalho, M., Mena, S., Ganzhorn, C., 506 Jones, P., and S. De Aronco, "NADA: A Unified Congestion 507 Control Scheme for Real-Time Media", 508 draft-ietf-rmcat-nada-00 (work in progress), April 2015. 510 [rtcweb-rtp-usage] 511 Perkins, C., Westerlund, M., and J. Ott, "Web Real-Time 512 Communication (WebRTC): Media Transport and Use of RTP", 513 draft-ietf-rtcweb-rtp-usage-18.txt (work in progress), 514 October 2014. 
516 [rtcweb-usecases] 517 Holmberg, C., Hakansson, S., and G. Eriksson, "Web Real- 518 Time Communication Use-cases and Requirements", 519 draft-ietf-rtcweb-use-cases-and-requirements-14.txt (work 520 in progress), February 2014. 522 [sbd] Hayes, D., Ferlin, S., and M. Welzl, "Shared Bottleneck 523 Detection for Coupled Congestion Control for RTP Media", 524 draft-ietf-rmcat-sbd-00.txt (work in progress), May 2015. 526 [transport-multiplex] 527 Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a 528 Single Lower-Layer Transport", 529 draft-westerlund-avtcore-transport-multiplexing-07.txt 530 (work in progress), October 2013. 532 Appendix A. Scheduling 534 When connections originate from the same host, it would be possible 535 to use only one single sender-side congestion controller which 536 determines the overall allowed sending rate, and then use a local 537 scheduler to assign a proportion of this rate to each RTP session. 538 This way, priorities could also be implemented as a function of the 539 scheduler. The Congestion Manager (CM) [RFC3124] also uses such a 540 scheduling function. 542 Appendix B. Example algorithm - Passive FSE 544 Active algorithms calculate the rates for all the flows in the FG and 545 actively distribute them. In a passive algorithm, UPDATE returns a 546 rate that should be used instead of the rate that the congestion 547 controller has determined. This can make a passive algorithm easier 548 to implement; however, when round-trip times of flows are unequal, 549 shorter-RTT flows will update and react to the overall FSE state more 550 often than longer-RTT flows, which can produce unwanted side effects. 551 This problem is more significant when the congestion control 552 convergence depends on the RTT. While the passive algorithm works 553 better for congestion controls with RTT-independent convergence, it 554 can still produce oscillations on short time scales. The algorithm 555 described below is therefore considered as highly experimental. 557 This passive version of the FSE stores the following information in 558 addition to the variables described in Section 5.2: 560 o The desired rate DR. This can be smaller than the calculated rate 561 if the application feeding into the flow has less data to send 562 than the congestion controller would allow. In case of a bulk 563 transfer, DR must be set to CC_R received from the flow's 564 congestion module. 566 The passive version of the FSE contains one static variable per FG 567 called TLO (Total Leftover Rate -- used to let a flow 'take' 568 bandwidth from application-limited or terminated flows) which is 569 initialized to 0. For the passive version, S_CR is limited to 570 increase or decrease as conservatively as a flow's congestion 571 controller decides in order to prohibit sudden rate jumps. 573 (1) When a flow f starts, it registers itself with SBD and the FSE. 574 FSE_R and DR are initialized with the congestion controller's 575 initial rate. SBD will assign the correct FGI. When a flow is 576 assigned an FGI, it adds its FSE_R to S_CR. 578 (2) When a flow f stops, it sets its DR to 0 and sets P to -1. 580 (3) Every time the congestion controller of the flow f determines a 581 new sending rate CC_R, assuming the flow's new desired rate 582 new_DR to be "infinity" in case of a bulk data transfer with an 583 unknown maximum rate, the flow calls UPDATE, which carries out 584 the tasks listed below to derive the flow's new sending rate, 585 Rate. A flow's UPDATE function uses a few local (i.e. 
per-flow) 586 temporary variables, which are all initialized to 0: DELTA, 587 new_S_CR and S_P. 589 (a) For all the flows in its FG (including itself), it 590 calculates the sum of all the calculated rates, new_S_CR. 591 Then it calculates the difference between CC_R and 592 FSE_R(f), DELTA. 594 for all flows i in FG do 595 new_S_CR = new_S_CR + FSE_R(i) 596 end for 597 DELTA = CC_R - FSE_R(f) 599 (b) It updates S_CR, FSE_R(f) and DR(f). 601 FSE_R(f) = CC_R 602 if DELTA > 0 then // the flow's rate has increased 603 S_CR = S_CR + DELTA 604 else if DELTA < 0 then 605 S_CR = new_S_CR + DELTA 606 end if 607 DR(f) = min(new_DR,FSE_R(f)) 609 (c) It calculates the leftover rate TLO, removes the terminated 610 flows from the FSE and calculates the sum of all the 611 priorities, S_P. 613 for all flows i in FG do 614 if P(i)<0 then 615 delete flow 616 else 617 S_P = S_P + P(i) 618 end if 619 end for 620 if DR(f) < FSE_R(f) then 621 TLO = TLO + (P(f)/S_P) * S_CR - DR(f) 622 end if 624 (d) It calculates the sending rate, Rate. 626 Rate = min(new_DR, (P(f)*S_CR)/S_P + TLO) 628 if Rate != new_DR and TLO > 0 then 629 TLO = 0 // f has 'taken' TLO 630 end if 632 (e) It updates DR(f) and FSE_R(f) with Rate. 634 if Rate > DR(f) then 635 DR(f) = Rate 636 end if 637 FSE_R(f) = Rate 639 The goals of the flow algorithm are to achieve prioritization, 640 improve network utilization in the face of application-limited flows, 641 and impose limits on the increase behavior such that the negative 642 impact of multiple flows trying to increase their rate together is 643 minimized. It does that by assigning a flow a sending rate that may 644 not be what the flow's congestion controller expected. It therefore 645 builds on the assumption that no significant inefficiencies arise 646 from temporary application-limited behavior or from quickly jumping 647 to a rate that is higher than the congestion controller intended. 648 How problematic these issues really are depends on the controllers in 649 use and requires careful per-controller experimentation. The coupled 650 congestion control mechanism described here also does not require all 651 controllers to be equal; effects of heterogeneous controllers, or 652 homogeneous controllers being in different states, are also subject 653 to experimentation. 655 This algorithm gives all the leftover rate of application-limited 656 flows to the first flow that updates its sending rate, provided that 657 this flow needs it all (otherwise, its own leftover rate can be taken 658 by the next flow that updates its rate). Other policies could be 659 applied, e.g. to divide the leftover rate of a flow equally among all 660 other flows in the FG. 662 B.1. Example operation (passive) 664 In order to illustrate the operation of the passive coupled 665 congestion control algorithm, this section presents a toy example of 666 two flows that use it. Let us assume that both flows traverse a 667 common 10 Mbit/s bottleneck and use a simplistic congestion 668 controller that starts out with 1 Mbit/s, increases its rate by 1 669 Mbit/s in the absence of congestion and decreases it by 2 Mbit/s in 670 the presence of congestion. For simplicity, flows are assumed to 671 always operate in a round-robin fashion. Rate numbers below without 672 units are assumed to be in Mbit/s. For illustration purposes, the 673 actual sending rate is also shown for every flow in FSE diagrams even 674 though it is not really stored in the FSE.
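The walkthrough below can also be reproduced with the following non-normative Python sketch of steps (1)-(3) above. It assumes a single FG; the PassiveFSE class, its dictionaries and the register/stop/update names are invented purely for illustration and are not part of this specification.

   import math

   class PassiveFSE:
       def __init__(self):
           self.flows = {}    # flow number -> {"P": .., "FSE_R": .., "DR": ..}
           self.S_CR = 0.0
           self.TLO = 0.0     # Total Leftover Rate

       def register(self, num, priority, initial_rate):
           # Step (1): FSE_R and DR start at the controller's initial rate.
           self.flows[num] = {"P": priority, "FSE_R": initial_rate,
                              "DR": initial_rate}
           self.S_CR += initial_rate

       def stop(self, num):
           # Step (2): a stopping flow sets DR to 0 and P to -1.
           self.flows[num]["DR"] = 0.0
           self.flows[num]["P"] = -1.0

       def update(self, num, cc_r, new_dr=math.inf):
           # Step (3): returns the rate the flow should use instead of CC_R.
           f = self.flows[num]
           new_s_cr = sum(i["FSE_R"] for i in self.flows.values())    # (a)
           delta = cc_r - f["FSE_R"]
           f["FSE_R"] = cc_r                                          # (b)
           if delta > 0:
               self.S_CR += delta
           elif delta < 0:
               self.S_CR = new_s_cr + delta
           f["DR"] = min(new_dr, f["FSE_R"])
           s_p = 0.0                                                  # (c)
           for n in list(self.flows):
               if self.flows[n]["P"] < 0:
                   del self.flows[n]
               else:
                   s_p += self.flows[n]["P"]
           if f["DR"] < f["FSE_R"]:
               self.TLO += (f["P"] / s_p) * self.S_CR - f["DR"]
           rate = min(new_dr, (f["P"] * self.S_CR) / s_p + self.TLO)  # (d)
           if rate != new_dr and self.TLO > 0:
               self.TLO = 0.0   # f has 'taken' the leftover rate
           f["DR"] = max(f["DR"], rate)                               # (e)
           f["FSE_R"] = rate
           return rate

   # Reproducing the first UPDATE of the walkthrough (flow #1 registered at
   # 10 Mbit/s for brevity, flow #2 at 1 Mbit/s):
   #   fse = PassiveFSE()
   #   fse.register(1, 1.0, 10.0)
   #   fse.register(2, 0.5, 1.0)
   #   fse.update(1, 8.0)  # returns 6, leaving FSE_R(1) = 6, DR(1) = 8, S_CR = 9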
676 Flow #1 begins. It is a bulk data transfer and considers itself to 677 have top priority. This is the FSE after the flow algorithm's step 678 1: 680 ------------------------------------------- 681 | # | FGI | P | FSE_R | DR | Rate | 682 | | | | | | | 683 | 1 | 1 | 1 | 1 | 1 | 1 | 684 ------------------------------------------- 685 S_CR = 1, TLO = 0 687 Its congestion controller gradually increases its rate. Eventually, 688 at some point, the FSE should look like this: 690 ------------------------------------------- 691 | # | FGI | P | FSE_R | DR | Rate | 692 | | | | | | | 693 | 1 | 1 | 1 | 10 | 10 | 10 | 694 ------------------------------------------- 695 S_CR = 10, TLO = 0 697 Now another flow joins. It is also a bulk data transfer, and has a 698 lower priority (0.5): 700 ------------------------------------------- 701 | # | FGI | P | FSE_R | DR | Rate | 702 | | | | | | | 703 | 1 | 1 | 1 | 10 | 10 | 10 | 704 | 2 | 1 | 0.5 | 1 | 1 | 1 | 705 ------------------------------------------- 706 S_CR = 11, TLO = 0 708 Now assume that the first flow updates its rate to 8, because the 709 total sending rate of 11 exceeds the total capacity. Let us take a 710 closer look at what happens in step 3 of the flow algorithm. 712 CC_R = 8. new_DR = infinity. 713 3 a) new_S_CR = 11; DELTA = 8 - 10 = -2. 714 3 b) FSE_R(f) = 8. DELTA is negative, hence S_CR = 9; 715 DR(f) = 8. 716 3 c) S_P = 1.5. 717 3 d) new sending rate = min(infinity, 1/1.5 * 9 + 0) = 6. 718 3 e) FSE_R(f) = 6. 720 The resulting FSE looks as follows: 721 ------------------------------------------- 722 | # | FGI | P | FSE_R | DR | Rate | 723 | | | | | | | 724 | 1 | 1 | 1 | 6 | 8 | 6 | 725 | 2 | 1 | 0.5 | 1 | 1 | 1 | 726 ------------------------------------------- 727 S_CR = 9, TLO = 0 729 The effect is that flow #1 is sending with 6 Mbit/s instead of the 8 730 Mbit/s that the congestion controller derived. Let us now assume 731 that flow #2 updates its rate. Its congestion controller detects 732 that the network is not fully saturated (the actual total sending 733 rate is 6+1=7) and increases its rate. 735 CC_R=2. new_DR = infinity. 736 3 a) new_S_CR = 7; DELTA = 2 - 1 = 1. 737 3 b) FSE_R(f) = 2. DELTA is positive, hence S_CR = 9 + 1 = 10; 738 DR(f) = 2. 739 3 c) S_P = 1.5. 740 3 d) new sending rate = min(infinity, 0.5/1.5 * 10 + 0) = 3.33. 741 3 e) DR(f) = FSE_R(f) = 3.33. 743 The resulting FSE looks as follows: 744 ------------------------------------------- 745 | # | FGI | P | FSE_R | DR | Rate | 746 | | | | | | | 747 | 1 | 1 | 1 | 6 | 8 | 6 | 748 | 2 | 1 | 0.5 | 3.33 | 3.33 | 3.33 | 749 ------------------------------------------- 750 S_CR = 10, TLO = 0 751 The effect is that flow #2 is now sending with 3.33 Mbit/s, which is 752 close to half of the rate of flow #1 and leads to a total utilization 753 of 6(#1) + 3.33(#2) = 9.33 Mbit/s. Flow #2's congestion controller 754 has increased its rate faster than the controller actually expected. 755 Now, flow #1 updates its rate. Its congestion controller detects 756 that the network is not fully saturated and increases its rate. 757 Additionally, the application feeding into flow #1 limits the flow's 758 sending rate to at most 2 Mbit/s. 760 CC_R=7. new_DR=2. 761 3 a) new_S_CR = 9.33; DELTA = 1. 762 3 b) FSE_R(f) = 7, DELTA is positive, hence S_CR = 10 + 1 = 11; 763 DR(f) = min(2, 7) = 2. 764 3 c) S_P = 1.5; DR(f) < FSE_R(f), hence TLO = 1/1.5 * 11 - 2 = 5.33. 765 3 d) new sending rate = min(2, 1/1.5 * 11 + 5.33) = 2. 766 3 e) FSE_R(f) = 2.
768 The resulting FSE looks as follows: 769 ------------------------------------------- 770 | # | FGI | P | FSE_R | DR | Rate | 771 | | | | | | | 772 | 1 | 1 | 1 | 2 | 2 | 2 | 773 | 2 | 1 | 0.5 | 3.33 | 3.33 | 3.33 | 774 ------------------------------------------- 775 S_CR = 11, TLO = 5.33 777 Now, the total rate of the two flows is 2 + 3.33 = 5.33 Mbit/s, i.e. 778 the network is significantly underutilized due to the limitation of 779 flow #1. Flow #2 updates its rate. Its congestion controller 780 detects that the network is not fully saturated and increases its 781 rate. 783 CC_R=4.33. new_DR = infinity. 784 3 a) new_S_CR = 5.33; DELTA = 1. 785 3 b) FSE_R(f) = 4.33. DELTA is positive, hence S_CR = 12; 786 DR(f) = 4.33. 787 3 c) S_P = 1.5. 788 3 d) new sending rate: min(infinity, 0.5/1.5 * 12 + 5.33 ) = 9.33. 789 3 e) FSE_R(f) = 9.33, DR(f) = 9.33. 791 The resulting FSE looks as follows: 792 ------------------------------------------- 793 | # | FGI | P | FSE_R | DR | Rate | 794 | | | | | | | 795 | 1 | 1 | 1 | 2 | 2 | 2 | 796 | 2 | 1 | 0.5 | 9.33 | 9.33 | 9.33 | 797 ------------------------------------------- 798 S_CR = 12, TLO = 0 800 Now, the total rate of the two flows is 2 + 9.33 = 11.33 Mbit/s. 801 Finally, flow #1 terminates. It sets P to -1 and DR to 0. Let us 802 assume that it terminated late enough for flow #2 to still experience 803 the network in a congested state, i.e. flow #2 decreases its rate in 804 the next iteration. 806 CC_R = 7.33. new_DR = infinity. 807 3 a) new_S_CR = 11.33; DELTA = -2. 808 3 b) FSE_R(f) = 7.33. DELTA is negative, hence S_CR = 9.33; 809 DR(f) = 7.33. 810 3 c) Flow 1 has P = -1, hence it is deleted from the FSE. 811 S_P = 0.5. 812 3 d) new sending rate: min(infinity, 0.5/0.5*9.33 + 0) = 9.33. 813 3 e) FSE_R(f) = DR(f) = 9.33. 815 The resulting FSE looks as follows: 816 ------------------------------------------- 817 | # | FGI | P | FSE_R | DR | Rate | 818 | | | | | | | 819 | 2 | 1 | 0.5 | 9.33 | 9.33 | 9.33 | 820 ------------------------------------------- 821 S_CR = 9.33, TLO = 0 823 Authors' Addresses 825 Safiqul Islam 826 University of Oslo 827 PO Box 1080 Blindern 828 Oslo, N-0316 829 Norway 831 Phone: +47 22 84 08 37 832 Email: safiquli@ifi.uio.no 834 Michael Welzl 835 University of Oslo 836 PO Box 1080 Blindern 837 Oslo, N-0316 838 Norway 840 Phone: +47 22 85 24 20 841 Email: michawe@ifi.uio.no 843 Stein Gjessing 844 University of Oslo 845 PO Box 1080 Blindern 846 Oslo, N-0316 847 Norway 849 Phone: +47 22 85 24 44 850 Email: steing@ifi.uio.no