Strategies for Streaming Media Using TFRC

Internet Draft                                                 T. Phelan
Document: draft-ietf-dccp-tfrc-media-02.txt               Sonus Networks
Expires: January 2008                                          July 2007
Intended status: Informational

              Strategies for Streaming Media Applications
                   Using TCP-Friendly Rate Control

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 30, 2007.

Abstract

   This document discusses strategies for using streaming media
   applications with unreliable congestion-controlled transport
   protocols such as the Datagram Congestion Control Protocol (DCCP) or
   the RTP Profile for TCP Friendly Rate Control.  Of particular
   interest is how media streams, which have their own transmit rate
   requirements, can be adapted to the varying and sometimes
   conflicting transmit rate requirements of congestion control
   protocols such as TCP-Friendly Rate Control (TFRC).

Table of Contents

   1. Introduction
   2. TFRC Basics
   3. Streaming Media Applications
      3.1 Stream Switching
      3.2 Media Buffers
      3.3 Variable Rate Media Streams
   4. Strategies for Streaming Media Applications
      4.1 First Strategy -- One-way Pre-recorded Media
          4.1.1 Strategy 1
          4.1.2 Issues With Strategy 1
      4.2 Second Try -- One-way Live Media
          4.2.1 Strategy 2
          4.2.2 Issues with Strategy 2
      4.3 One More Time -- Two-way Interactive Media
          4.3.1 Strategy 3
          4.3.2 Issues with Strategy 3
   5. Security Considerations
   6. IANA Considerations
   7. Thanks
   8. Informative References
   9. Author's Address

1. Introduction

   The canonical streaming media application emits fixed-sized (often
   small) packets at a regular interval.  It relies on the network to
   deliver the packets to the receiver in roughly the same regular
   interval.  Often, the transmitter operates in a fire-and-forget
   mode, receiving no indications of packet delivery and never changing
   its mode of operation.  This often holds true even if the packets
   are encapsulated in the Real-time Transport Protocol (RTP) and the
   RTP Control Protocol (RTCP) [RFC 3550] is used to get receiver
   information; it's rare that the RTCP reports trigger changes in the
   transmitted stream.
   The IAB has expressed concerns over the stability of the Internet if
   these applications become too popular relative to TCP-based
   applications [RFC 3714].  They suggest that media applications
   should monitor their packet loss rate, and abort if it exceeds
   certain thresholds.  Unfortunately, up until this threshold is
   reached, the network, the media applications, and the other
   applications are all experiencing considerable duress.

   TCP-Friendly Rate Control (TFRC, [RFC 3448]) offers an alternative
   to the [RFC 3714] method.  The key differentiator of TFRC, relative
   to the Additive Increase Multiplicative Decrease (AIMD) method used
   in TCP and SCTP, is its smooth response to packet loss.  TFRC has
   been implemented as one of the "pluggable" congestion control
   algorithms for the Datagram Congestion Control Protocol (DCCP,
   [DCCP] and [CCID3]) and as a profile for RTP [RTP-TFRC].

   This document explores issues to consider and strategies to employ
   when adapting or creating streaming media applications to use
   transport protocols that rely on TFRC for congestion control.  The
   approach here is one of successive refinement.  Strategies are
   described and their strengths and weaknesses are explored.  New
   strategies are then presented that improve on the previous ones, and
   the process iterates.  The intent is to illuminate the issues,
   rather than to jump to solutions, in order to provide guidance to
   application designers.

2. TFRC Basics

   AIMD congestion control algorithms, such as DCCP's CCID2 [CCID2] or
   TCP's SACK-based control [RFC 3517], use a congestion window (the
   maximum number of packets or segments in flight) to limit the
   transmitter.  The congestion window is increased by one for each
   acknowledged packet, or for each window of acknowledged packets,
   depending on the phase of operation.
   If any packet is dropped (or ECN-marked [ECN]; for simplicity in the
   rest of the document assume that "dropped" equals "dropped or
   ECN-marked"), the congestion window is halved.  This produces a
   characteristic saw-tooth wave variation in throughput, where the
   throughput increases linearly up to the network capacity and then
   drops abruptly (roughly shown in Figure 1).

             |
             |    /|    /|    /|    /|    /
             |   / |   / |   / |   / |   /
   Throughput|  /  |  /  |  /  |  /  |  /
             | /   | /   | /   | /   | /
             |/    |/    |/    |/    |/
             |
             ----------------------------------
                            Time

     Figure 1: Characteristic throughput for AIMD congestion control.

   On the other hand, with TCP-Friendly Rate Control (TFRC), the
   immediate response to packet drops is less dramatic.  To compensate
   for this, TFRC is less aggressive in probing for new capacity after
   a loss.  TFRC uses a version of the TCP throughput equation to
   compute a maximum transmit rate, taking a weighted history of loss
   events as input (more weight is given to more recent losses).  The
   characteristic throughput graph for a TFRC connection looks like a
   flattened sine wave (extremely roughly shown in Figure 2).

             |
             |     --        --        --
             |    /  \      /  \      /  \
   Throughput|   /    \    /    \    /    \
             |  /      \  /      \  /      \
             |--        --        --        -
             |
             ----------------------------------
                            Time

     Figure 2: Characteristic throughput for TFRC congestion control.

   In addition to this high-level behavior, there are several details
   of TFRC operation that, at first blush at least, seem at odds with
   common media stream transmission practices.  Some particular
   considerations are:

   o  Slow Start -- A connection starts out with a transmission rate of
      up to four packets per round trip time (RTT).  After the first
      RTT, the rate is doubled each RTT until a packet is lost.  At
      this point the transmission rate is halved and we enter the
      equation-based phase of operation.
      It's likely that in many situations the initial transmit rate is
      slower than the lowest bit rate encoding of the media.  This will
      require the application to deal with a ramp-up period.

   o  Capacity Probing and Lost Packets -- If the application transmits
      for some time at the maximum rate that TFRC will allow without
      packet loss, TFRC will continuously raise the allowed rate until
      a packet is lost.  This means that, in many circumstances, if an
      application wants to transmit at the maximum possible rate,
      packet loss will not be an exceptional event, but will happen
      routinely in the course of probing for more capacity.

   o  Idleness Penalty -- TFRC follows a "use it or lose it" policy.
      If the transmitter goes idle for a few RTTs, as it would if, for
      instance, silence suppression were being used, the transmit rate
      returns to two packets per RTT, and then doubles every RTT until
      the previous rate is achieved.  This can make restarting after a
      silence suppression interval problematic.

   o  Contentment Penalty -- TFRC likes to satisfy greed.  If you are
      transmitting at the maximum allowed rate, TFRC will try to raise
      that rate.  However, if your application is transmitting below
      the maximum allowed rate, the maximum allowed rate will not be
      increased beyond twice the current transmit rate, no matter how
      long it has been since the last increase.  This can create
      problems when attempting to shift to a higher rate encoding, or
      with video codecs that vary the transmission rate with the amount
      of movement in the image.

   o  Packet Rate, not Bit Rate -- TFRC controls the rate at which
      packets may enter the network, not bytes.  To respond to a
      lowered transmit rate you must reduce the packet transmission
      rate.  Making the packets smaller while keeping the same packet
      rate will not be effective.
   o  Smooth Variance of Transmit Rate -- The strength and purpose of
      TFRC (over AIMD congestion control) is that it smoothly decreases
      the transmission rate in response to recent packet loss events,
      and smoothly increases the rate in the absence of loss events.
      This smoothness is somewhat at odds with most media stream
      encodings, where the transition from one rate to another is often
      a step function.

3. Streaming Media Applications

   While all streaming media applications have some characteristics in
   common (e.g. data must arrive at the receiver at some minimum rate
   for reasonable operation), other characteristics (e.g. tolerance of
   end-to-end delay) vary considerably from application to application.
   For the purposes of this document, it's useful to divide streaming
   media applications into three subtypes:

   o  One-way pre-recorded media
   o  One-way live media
   o  Two-way interactive media

   The relevant difference, as far as this discussion goes, between
   recorded and live media is that recorded media can be transmitted as
   fast as the network allows (assuming adequate buffering at the
   receiver) -- it could be viewed as a special file transfer
   operation.  Live media can't be transmitted faster than the rate at
   which it's encoded.

   The difference between one-way and two-way media is the sensitivity
   to delay.  For one-way applications, delays from encoding at the
   sender to playout at the receiver of several or even tens of seconds
   are acceptable.  For two-way applications, delays from encoding to
   playout of as little as 150 ms are often problematic [XTIME].

   While delay sensitivity is most problematic when dealing with
   two-way conversational applications such as telephony, it is also
   apparent in nominally one-way applications when certain user
   interactions are allowed, such as program switching ("channel
   surfing") or fast forward/skip.
   Arguably, these user interactions have turned the one-way
   application into a two-way application -- there just isn't the same
   sort of data flowing in both directions.

3.1 Stream Switching

   The discussion here assumes that media transmitters are able to
   provide their data in a number of encodings with various bit rate
   requirements and are able to dynamically change between these
   encodings with low overhead.  It also assumes that switching back
   and forth between coding rates does not cause excessive user
   annoyance.

   Given the current state of the codec art, these are big assumptions.
   As a practical matter, continuous shifts between higher and lower
   quality levels can greatly annoy users, much more so than a single
   shift to a lower quality level that then persists.  The algorithms
   given below indicate methods for returning to higher bandwidth
   encodings, but, because of the poor user perception of shifting
   quality, many media applications may choose to never invoke these
   methods.

   Also, the algorithms and results described here hold even if the
   media source can only supply media at one rate.  Obviously the
   statements about switching encoding rates don't apply, and an
   application with only one encoding rate behaves as if it is
   simultaneously at its minimum and maximum rate.

   For convenience in the discussion below, assume that all media
   streams have two encodings, a high bit rate and a low bit rate,
   unless otherwise indicated.

3.2 Media Buffers

   The strategies below make use of the concept of a media buffer.  A
   media buffer is a first-in-first-out queue of media data.  The
   buffer is filled by some source of data (the encoder or the network)
   and drained by some sink (the network or the playout device).  It
   provides rate and jitter compensation between the source and the
   sink.
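   As an illustration, a media buffer can be sketched as a small FIFO
   whose depth is tracked in seconds of media play time; the class and
   method names here are illustrative, not taken from any real
   implementation:

```python
from collections import deque

class MediaBuffer:
    """FIFO of media chunks; depth is measured in seconds of play time."""

    def __init__(self):
        self._chunks = deque()   # each entry: (payload, duration_seconds)
        self._depth = 0.0        # total seconds of media currently queued

    def fill(self, payload, duration):
        """Source side (encoder or network) adds media."""
        self._chunks.append((payload, duration))
        self._depth += duration

    def drain(self):
        """Sink side (network or playout device) removes the oldest media."""
        payload, duration = self._chunks.popleft()
        self._depth -= duration
        return payload, duration

    @property
    def depth(self):
        """Seconds of media play time buffered (compared to thresholds)."""
        return self._depth

# Example: five 20 ms voice frames accumulate 0.1 s of play time.
buf = MediaBuffer()
for i in range(5):
    buf.fill(b"frame%d" % i, 0.020)
assert abs(buf.depth - 0.100) < 1e-9
```

   The fill/drain symmetry (filled by the encoder or the network,
   drained by the network or the playout device) is what lets the same
   structure serve as either a transmit buffer or a playout buffer in
   the strategies below.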
   Media buffer contents are measured in seconds of media play time,
   not bytes or packets.  Media buffers are completely
   application-level constructs and are separate from transport-layer
   transmit and receive queues.

3.3 Variable Rate Media Streams

   The canonical media codec encodes its media as a constant rate bit
   stream.  As the technology has progressed from its time-division
   multiplexing roots, this constant rate stream has become not so
   constant.  Voice codecs often employ silence suppression (also
   called Voice Activity Detection, or VAD), where the stream (in at
   least one direction) goes totally idle, sometimes for several
   seconds, while one side listens to what the other side has to say.
   When that side wants to start talking again, the codec resumes
   sending immediately at its "constant" rate.

   Video codecs similarly employ what could be called "stillness"
   suppression, but it is instead called motion compensation.  Often
   these codecs effectively transmit the changes from one video frame
   to another.  When there is little change from frame to frame (as
   when the background is constant and a talking head is just moving
   its lips) the amount of information to send is small.  When there is
   a major motion, or a change of scene, much more information must be
   sent.  For some codecs, the variation from the minimum rate to the
   maximum rate can be a factor of ten [MPEG4] or more.  Unlike voice
   codecs, though, video codecs typically never go completely idle.

   These abrupt jumps in transmission rate are problematic for any
   congestion control algorithm.  A basic tenet of all existing
   algorithms is that increases in transmission rate must be gradual
   and smooth to avoid damaging other connections in the network.  In
   TFRC, the transmission rate in a Round Trip Time (RTT) can never be
   more than twice the rate actually delivered to the receiver in the
   previous RTT.
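   To see why this limit matters, consider a sender resuming after an
   idle period: as noted in section 2, the rate restarts at two packets
   per RTT and doubles each RTT, so recovering the previous rate takes
   a logarithmic number of RTTs.  A rough back-of-the-envelope sketch
   (not an implementation of TFRC):

```python
import math

def rtts_to_recover(previous_rate, restart_rate=2):
    """RTTs of doubling needed to climb from the post-idle restart rate
    back to the previous transmit rate (both in packets per RTT)."""
    if previous_rate <= restart_rate:
        return 0
    return math.ceil(math.log2(previous_rate / restart_rate))

# A stream that was sending 64 packets per RTT goes briefly idle;
# doubling from 2 packets/RTT (2 -> 4 -> 8 -> 16 -> 32 -> 64)
# takes 5 RTTs to recover.
assert rtts_to_recover(64) == 5
```

   With a 100 ms RTT, that recovery costs on the order of half a second
   of reduced rate -- which is why restarting after a silence
   suppression interval can be problematic.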
   TFRC uses a maximum rate of two packets per RTT after an idle
   period.  This rate might support immediate restart of voice data
   after a silence period, at least when the RTT is in the range
   suitable for two-way media.  More problematic are the factor of ten
   variations in some video codecs.  In some circumstances, TFRC allows
   an application to double its transmit rate over one RTT (assuming no
   recent packet loss events), but an immediate ten-times increase is
   not possible.

4. Strategies for Streaming Media Applications

   This section covers a number of strategies that can be used by
   streaming media applications.  Each strategy is applicable to one or
   more subtypes of streaming media.

4.1 First Strategy -- One-way Pre-recorded Media

   The first strategy is suitable for use with pre-recorded media, and
   takes advantage of the fact that the data for pre-recorded media can
   be transferred to the receiver as fast as the network will allow,
   assuming that the receiver has sufficient buffer space.

4.1.1 Strategy 1

   Assume a recorded program resides on a media server, and the server
   and its clients are capable of stream switching between two encoding
   rates, as described in section 3.1.

   The client (receiver) implements a media buffer as a playout buffer.
   This buffer is potentially big enough to hold the entire recording.
   The playout buffer has three thresholds: a low threshold, a playback
   start threshold, and a high threshold, in order of increasing size.
   These values will typically be in the several to tens of seconds
   range.  The buffer is filled by data arriving from the network and
   drained at the decoding rate necessary to display the data to the
   user.  Figure 3 shows this schematically.
                                 high threshold
                                 |    playback start threshold
                                 |    |    low threshold
   +-------+                     |    |    |
   | Media |  transmit at    +---v----v----v--+
   | File  |---------------->| Playout buffer |-------> display
   |       | TFRC max rate   +----------------+ drain at
   +-------+                  fill at network   decode rate
                              arrival rate

      Figure 3: Transfer and playout of one-way pre-recorded media.

   During the connection, the server needs to be able to determine the
   depth of data in the playout buffer.  This could be provided by
   direct feedback from the client to the server, or the server could
   estimate the depth itself (e.g. the server knows how much data has
   been sent and how much time has passed).

   To start the connection, the server begins transmitting data in the
   high bit rate encoding as fast as TFRC allows.  Since TFRC is in
   slow start, this is probably too slow initially, but eventually the
   rate should increase sufficiently (assuming sufficient capacity in
   the network path).  As the client receives data from the network it
   adds the data to the playout buffer.  Once the buffer depth reaches
   the playback start threshold, the receiver begins draining the
   buffer and playing the contents to the user.

   If the network has sufficient capacity, TFRC will eventually raise
   the transmit rate beyond what is necessary to keep up with the
   decoding rate, the playout buffer will back up as necessary, and the
   entire program will eventually be transferred.

   If the TFRC transmit rate never gets fast enough, or loss events
   make TFRC drop the rate, the receiver will drain the playout buffer
   faster than it is filled.  When the playout buffer drops below the
   low threshold, the server switches to the low bit rate encoding.
   Assuming that the network has a bit more capacity than the low bit
   rate requires, the playout buffer will begin filling again.
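   The server's switching rule amounts to hysteresis on the reported
   playout buffer depth: switch down at the low threshold, and only
   switch back up once the buffer has refilled past the high threshold.
   A minimal sketch, with illustrative threshold values:

```python
LOW_THRESHOLD = 5.0    # seconds of buffered play time (illustrative)
HIGH_THRESHOLD = 20.0  # seconds of buffered play time (illustrative)

def choose_encoding(buffer_depth, current_encoding):
    """Pick the encoding rate from the client's playout buffer depth.

    Switch to the low rate when the buffer drains below the low
    threshold; only switch back to the high rate once the buffer has
    refilled past the high threshold.  Between the two thresholds,
    keep the current encoding (the dead band prevents rapid flapping).
    """
    if buffer_depth < LOW_THRESHOLD:
        return "low"
    if buffer_depth > HIGH_THRESHOLD:
        return "high"
    return current_encoding

# The buffer drains, the server drops to the low rate, and it stays
# low until the depth has climbed back past the high threshold.
enc = "high"
for depth in (25.0, 12.0, 4.0, 8.0, 19.0, 21.0):
    enc = choose_encoding(depth, enc)
print(enc)  # back to "high" only after depth exceeded 20.0
```

   The oscillation between the two encodings described in the text is
   exactly this loop running against a buffer that alternately drains
   and refills.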
   When the buffer crosses the high threshold, the server may switch
   back to the high encoding rate.  Assuming that the network still
   doesn't have enough capacity for the high bit rate, the playout
   buffer will start draining again.  When it reaches the low threshold
   the server switches again to the low bit rate encoding.  The server
   will oscillate back and forth like this until the connection is
   concluded.

   If the network has insufficient capacity to support even the low bit
   rate encoding, the playout buffer will eventually drain completely,
   and playback will need to be paused until the buffer refills to the
   playback start level.

   Note that, in this scheme, the server doesn't need to explicitly
   know the rate that TFRC has determined; it simply always sends as
   fast as TFRC allows (perhaps alternately reading a chunk of data
   from disk and then blocking on the socket write call until it's
   transmitted).  TFRC shapes the stream to the network's requirements,
   and the playout buffer feedback allows the server to shape the
   stream to the application's requirements.

4.1.2 Issues With Strategy 1

   The advantage of this strategy is that it provides insurance against
   an unpredictable future.  Since there's no guarantee that a
   currently supported transmit rate will continue to be supported, the
   strategy takes what the network is willing to give when it's willing
   to give it.  The data is transferred from the server to the client
   perhaps faster than is strictly necessary, but once it's there no
   network problems (or new sources of traffic) can affect the display.

   Silence suppression can be used with this strategy, since the
   transmitter doesn't actually go idle during the silence -- it just
   gets further ahead.  Variable rate video codecs can also function
   well.
   Again, the transmitter will get ahead faster during the interpolated
   frames and fall back during the index frames, but a playout buffer
   of a few seconds is probably sufficient to mask these variations.

   One obvious disadvantage, if the client is a "thin" device, is the
   large buffer required at the client.  A subtler disadvantage
   involves the way TFRC probes the network to determine its capacity.
   Basically, TFRC does not have an a priori idea of what the network
   capacity is; it simply gradually increases the transmit rate until
   packets are lost, and then backs down.  After a period of time with
   no losses, the rate is gradually increased again until more packets
   are lost.  Over the long term, the transmit rate will oscillate up
   and down, with packet loss events occurring at the rate peaks.

   This means that packet loss will likely be routine with this
   strategy.  For any given transfer, the number of lost packets is
   likely to be small, but non-zero.  Whether this causes noticeable
   quality problems depends on the characteristics of the particular
   codec in use.

4.2 Second Try -- One-way Live Media

   With one-way live media you can only transmit the data as fast as
   it's created, but end-to-end delays of several seconds or tens of
   seconds are usually acceptable.

4.2.1 Strategy 2

   Assume that we have a playout media buffer at the receiver and a
   transmit media buffer at the sender.  The transmit buffer is filled
   at the encoding rate and drained at the TFRC transmit rate.  The
   playout buffer is filled at the network arrival rate and drained at
   the decoding rate.  The playout buffer has a playback start
   threshold, and the transmit buffer has a switch encoding threshold
   and a discard data threshold.  These thresholds are on the order of
   several to tens of seconds.  Switch encoding is less than discard
   data, which is less than playback start.  Figure 4 shows this
   schematically.
                     discard data
                     |   switch encoding
                     |   |                   playback start
                     |   |                   |
    media  +---------v---v---+          +----v-----------+
   ------->| Transmit buffer |--------->| Playout buffer |---> display
   source  +-----------------+ transmit +----------------+
    fill at                   at TFRC rate        drain at
    encode rate                                   decode rate

        Figure 4: Transfer and playout of one-way live media.

   At the start of the connection, the sender places data into the
   transmit buffer at the high encoding rate.  The buffer is drained at
   the TFRC transmit rate, which at this point is in slow start and is
   probably slower than the encoding rate.  This will cause a backup in
   the transmit buffer.  Eventually TFRC will slow-start to a rate
   slightly above that necessary to sustain the encoding rate (assuming
   the network has sufficient capacity).  When this happens the
   transmit buffer will drain and we'll reach a steady state condition
   where the transmit buffer is normally empty and we're transmitting
   at a rate that is probably below the maximum allowed by TFRC.

   Meanwhile, at the receiver, the playout buffer is filling, and when
   it reaches the playback start threshold playback will start.  After
   TFRC slow start is complete and the transmit buffer is drained, this
   buffer will reach a steady state where packets are arriving from the
   network at the encoding rate (ignoring jitter) and being drained at
   the (equal) decoding rate.  The depth of the buffer will be the
   playback start threshold plus the maximum depth of the transmit
   buffer during slow start.

   Now assume that network congestion (packet loss) forces TFRC to drop
   its rate below that needed by the high encoding rate.  The transmit
   buffer will begin to fill and the playout buffer will begin to
   drain.
   When the transmit buffer reaches the switch encoding threshold, the
   sender switches to the low encoding rate, and converts all of the
   data in the transmit buffer to the low rate encoding.

   Assuming that the network can support the new, lower, rate (and a
   little more), the transmit buffer will begin to drain and the
   playout buffer will begin to fill.  Eventually the transmit buffer
   will empty and the playout buffer will be back to its steady state
   level.

   At this point (or perhaps after a slight delay) the sender can
   switch back to the higher rate encoding.  If the new rate can't be
   sustained, the transmit buffer will fill again, and the playout
   buffer will drain.  When the transmit buffer reaches the switch
   encoding threshold the sender goes back to the lower encoding rate.
   This oscillation continues until the stream ends or the network is
   able to support the high encoding rate for the long term.

   If the network can't support even the low encoding rate, the
   transmit buffer will continue to fill (and the playout buffer will
   continue to drain).  When the transmit buffer reaches the discard
   data threshold, the sender must discard a chunk of data from the
   transmit buffer for every chunk of data added.  Preferably, the
   discard should happen at the head of the transmit buffer, as these
   are the stalest data, but the application could make other choices
   (e.g. discard the earliest silence in the buffer).  This discard
   behavior continues until the transmit buffer falls below the switch
   encoding threshold.  If the playout buffer ever drains completely,
   the receiver should fill the output with suitable material (e.g.
   silence or stillness).

   Note that this strategy is also suitable for one-way pre-recorded
   media, as long as the transmit buffer is filled only at the encoding
   rate, not at the disk read rate.

4.2.2 Issues with Strategy 2

   Strategy 2 is fairly effective.
   There is a limit on the necessary size of the playout buffer at the
   client, so clients with limited resources can be supported.  When
   silence suppression is used or motion compensation sends
   interpolated frames, the transmit rate will actually go down, and
   then must slowly ramp up to return to the maximum rate, but this
   smoothing can often be masked by a playout buffer of a few seconds.

   Also, since strategy 2 limits the transmission rate to the maximum
   encoding rate, and therefore doesn't try to get every last bit of
   possible throughput from the network, routine packet loss can be
   avoided (assuming that there's enough network capacity for the
   maximum encoding rate).

4.3 One More Time -- Two-way Interactive Media

   Two-way interactive media is characterized by its low tolerance for
   end-to-end delay, usually requiring less than 150 ms for interactive
   conversation, including jitter buffering at the receiver.  Rate
   adapting buffers will insert too much delay, and the slow start
   period is likely to be noticeable ("Hello" clipping).

   This low delay requirement makes using TFRC with variable-rate
   codecs (codecs using silence suppression or motion compensation)
   highly problematic.  The extra delays imposed by the smooth rate
   increases mandated by TFRC are unlikely to be tolerated by
   interactive applications.

   There are further problems with the usual practice in interactive
   voice applications of using small packets.  In voice applications,
   the data rate is low enough that waiting to accumulate enough data
   to fill a large packet adds unacceptable delay.  For example, the
   G.711 codec generates one byte of data every 125 microseconds.  To
   accumulate enough data for a 1480-byte packet, the encoder would
   need to delay some data by 185 ms, eating up the entire delay budget
   just for packetization.  These considerations can also apply to very
   low rate video.
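   The packetization arithmetic above generalizes directly: at a
   constant encoding rate, the first byte of a packet waits while the
   rest of the payload accumulates.  A small sketch of the calculation:

```python
def packetization_delay_ms(payload_bytes, codec_bytes_per_sec):
    """Time the first byte of a packet waits while the payload fills."""
    return payload_bytes * 1000.0 / codec_bytes_per_sec

G711_BYTES_PER_SEC = 8000  # 64 kbit/s: one byte every 125 microseconds

# The 1480-byte packet from the text eats the entire delay budget...
print(packetization_delay_ms(1480, G711_BYTES_PER_SEC))  # 185.0 (ms)

# ...while an 80-byte payload (10 ms of G.711, the size typical of
# interactive voice) keeps packetization delay small.
print(packetization_delay_ms(80, G711_BYTES_PER_SEC))    # 10.0 (ms)
```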
The goal of TFRC is fair sharing of a bottleneck, in packets per
second, with a TCP application using 1480-byte packets.  Applications
using smaller packets will receive a fair share of packets per
second, but less than a fair share of bytes per second.  With the
packet sizes typically in use in interactive voice applications
(e.g., 80 bytes of user data for G.711 with 10 ms packetization), it
can be very difficult to achieve useful byte-per-second rates when in
competition with TCP applications.

Further research is needed to resolve these issues.  The strategy
below can only be applied to constant-rate codecs whose data rate is
sufficiently large to fill 1480-byte packets within tolerable delay
limits.

4.3.1 Strategy 3

To start, the calling party sends an INVITE (loosely using SIP [RFC
3261] terminology) indicating the IP address and port to use for
media at its end.  Without informing the called user, the called
system responds to the INVITE by connecting to the calling party's
media port.  Both end systems then begin exchanging test data, at the
(slowly increasing) rate allowed by TFRC.  The purpose of this test
data is to see what rate the connection can be ramped up to.  If a
minimum acceptable rate cannot be achieved within some time period,
the call is cleared (conceptually, the calling party hears "fast
busy" and the called user is never informed of the incoming call).
Note that once the rate has ramped up sufficiently for the highest
rate codec, there's no need to go further.

If an acceptable rate can be achieved (in both directions), the
called user is informed of the incoming call.  The test data
continues during this period.  Once the called user accepts the call,
the test data is replaced by real data at the same rate.

If congestion is encountered during the call, TFRC will reduce its
allowed sending rate.
When that rate falls below the codec currently in use, the sender
switches to a lower rate codec, but should pad its transmission out
to the allowed TFRC rate.  Note that this padding is only necessary
if the application wishes to return to the higher encoding rate when
possible.  If the TFRC rate continues to fall past the lowest rate
codec, the sender must discard packets to conform to that rate.

If the network capacity is sufficient to support one of the lower
rate codecs, eventually the congestion will clear and TFRC will
slowly increase the allowed transmit rate.  The application should
increase its transmission padding to keep up with the increasing TFRC
rate.  The application may switch back to the higher rate codec when
the TFRC rate reaches a sufficient value.

An application that did not wish to switch back to the higher
encoding (perhaps for reasons outlined in section 3.1) would not need
to pad its transmission out to the TFRC maximum rate.

Note that the receiver would normally implement a short playout
buffer (with playback start on the order of some tens of
milliseconds) to smooth out jitter in the packet arrival gaps.

4.3.2 Issues with Strategy 3

An obvious issue with strategy 3 is the post-dial call connection
delay imposed by the slow-start ramp up.  This is perhaps less of an
issue for two-way video applications, where post-dial delays of
several seconds are accepted practice.  For telephony applications,
however, post-dial delays significantly greater than a second are a
problem, given that users have been conditioned to expect fast call
setup by the public telephone network.  On the other hand, the four
packets per RTT initial transmit rate allowed by DCCP's CCID3 in some
circumstances is likely to be sufficient for many telephony
applications, and the ramp up will be very quick.
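The codec-switching and padding rules of Strategy 3 can be sketched
as follows.  The codec table and function names are illustrative
assumptions (and, for simplicity, the rates ignore packet-header
overhead):

```python
# Available codecs, highest data rate first (illustrative list).
CODECS = [
    ("G.711", 64000),     # bit/s
    ("G.726-32", 32000),
    ("G.729", 8000),
]

def select_codec(tfrc_rate_bps):
    """Pick the highest-rate codec the TFRC allowed rate can carry.
    Returns None if even the lowest-rate codec doesn't fit; the
    sender must then discard packets to conform to the rate."""
    for name, rate_bps in CODECS:
        if rate_bps <= tfrc_rate_bps:
            return name, rate_bps
    return None

def padding_bps(tfrc_rate_bps, want_higher_rate_later=True):
    """Padding to send on top of the codec output.  Padding out to
    the full TFRC rate is only needed if the application wants the
    option of returning to a higher-rate codec later."""
    choice = select_codec(tfrc_rate_bps)
    if choice is None or not want_higher_rate_later:
        return 0
    return tfrc_rate_bps - choice[1]
```

For example, at a TFRC rate of 40 kbit/s this sketch selects G.726-32
and pads with 8 kbit/s; once the TFRC rate climbs back to 64 kbit/s
or more, G.711 is selected again.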
As was stated in section 4.3, this strategy is only suitable for use
with constant-rate codecs with fast enough data rates to tolerate
using large packets.

5. Security Considerations

There are no security considerations for this document.  Security
considerations for TFRC and the protocols implementing TFRC are
discussed in their defining documents.

6. IANA Considerations

There are no IANA actions required for this document.

7. Thanks

Thanks to the AVT working group, especially Philippe Gentric and
Brian Rosen, for comments on the earlier version of this document.

8. Informative References

[DCCP]      E. Kohler, M. Handley, S. Floyd, J. Padhye, Datagram
            Congestion Control Protocol (DCCP), February 2004,
            draft-ietf-dccp-spec-06.txt, work in progress.

[CCID2]     S. Floyd, E. Kohler, Profile for DCCP Congestion Control
            2: TCP-Like Congestion Control, February 2004,
            draft-ietf-dccp-ccid2-05.txt, work in progress.

[CCID3]     S. Floyd, E. Kohler, J. Padhye, Profile for DCCP
            Congestion Control 3: TFRC Congestion Control, February
            2004, draft-ietf-dccp-ccid3-04.txt, work in progress.

[RFC 3448]  M. Handley, S. Floyd, J. Padhye, J. Widmer, TCP Friendly
            Rate Control (TFRC): Protocol Specification, RFC 3448.

[RFC 3714]  S. Floyd, J. Kempf, IAB Concerns Regarding Congestion
            Control for Voice Traffic in the Internet, March 2004,
            RFC 3714.

[RFC 3261]  J. Rosenberg, et al., SIP: Session Initiation Protocol,
            June 2002, RFC 3261.

[RFC 3517]  E. Blanton, M. Allman, K. Fall, L. Wang, A Conservative
            Selective Acknowledgment (SACK)-based Loss Recovery
            Algorithm for TCP, April 2003, RFC 3517.

[RFC 3550]  H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson,
            RTP: A Transport Protocol for Real-Time Applications,
            July 2003, RFC 3550.

[XTIME]     ITU-T: Series G: Transmission Systems and Media, Digital
            Systems and Networks, Recommendation G.114, One-way
            Transmission Time, May 2000.

[ECN]       K. Ramakrishnan, S. Floyd, D. Black, The Addition of
            Explicit Congestion Notification (ECN) to IP, September
            2001, RFC 3168.

[MPEG4]     ISO/IEC International Standard 14496 (MPEG-4),
            Information technology - Coding of audio-visual objects,
            January 2000.

[RTP-TFRC]  L. Gharai, RTP Profile for TCP-Friendly Rate Control,
            October 2004, draft-ietf-avt-tfrc-profile-03.txt, work
            in progress.

9. Author's Address

Tom Phelan
Sonus Networks
7 Technology Park Dr.
Westford, MA USA 01886
Phone: +1-978-614-8456
Email: tphelan@sonusnet.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights.  Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard.  Please address the information to the IETF at
ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the
Internet Society.