Strategies for Streaming Media Using TFRC

Internet Draft                                                 T. Phelan
Document: draft-ietf-dccp-tfrc-media-02.txt               Sonus Networks
Expires: January 2008                                          July 2007
Intended status: Informational

              Strategies for Streaming Media Applications
                   Using TCP-Friendly Rate Control

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 30, 2007.

Abstract

   This document discusses strategies for using streaming media
   applications with unreliable congestion-controlled transport
   protocols such as the Datagram Congestion Control Protocol (DCCP) or
   the RTP Profile for TCP Friendly Rate Control.  Of particular
   interest is how media streams, which have their own transmit rate
   requirements, can be adapted to the varying and sometimes
   conflicting transmit rate requirements of congestion control
   protocols such as TCP-Friendly Rate Control (TFRC).

Table of Contents

   1. Introduction
   2. TFRC Basics
   3. Streaming Media Applications
      3.1 Stream Switching
      3.2 Media Buffers
      3.3 Variable Rate Media Streams
   4. Strategies for Streaming Media Applications
      4.1 First Strategy -- One-way Pre-recorded Media
          4.1.1 Strategy 1
          4.1.2 Issues With Strategy 1
      4.2 Second Try -- One-way Live Media
          4.2.1 Strategy 2
          4.2.2 Issues with Strategy 2
      4.3 One More Time -- Two-way Interactive Media
          4.3.1 Strategy 3
          4.3.2 Issues with Strategy 3
   5. Security Considerations
   6. IANA Considerations
   7. Thanks
   8. Informative References
   9. Author's Address

1. Introduction

   The canonical streaming media application emits fixed-sized (often
   small) packets at a regular interval.  It relies on the network to
   deliver the packets to the receiver in roughly the same regular
   interval.  Often, the transmitter operates in a fire-and-forget
   mode, receiving no indications of packet delivery and never changing
   its mode of operation.  This often holds true even if the packets
   are encapsulated in the Real-time Transport Protocol (RTP) and the
   RTP Control Protocol (RTCP) [RFC 3550] is used to get receiver
   information; it's rare that the RTCP reports trigger changes in the
   transmitted stream.
   The IAB has expressed concerns over the stability of the Internet if
   these applications become too popular relative to TCP-based
   applications [RFC 3714].  They suggest that media applications
   should monitor their packet loss rate, and abort if it exceeds
   certain thresholds.  Unfortunately, up until this threshold is
   reached, the network, the media applications, and the other
   applications are all experiencing considerable duress.

   TCP-Friendly Rate Control (TFRC, [RFC 3448]) offers an alternative
   to the [RFC 3714] method.  The key differentiator of TFRC, relative
   to the Additive Increase Multiplicative Decrease (AIMD) method used
   in TCP and SCTP, is its smooth response to packet loss.  TFRC has
   been implemented as one of the "pluggable" congestion control
   algorithms for the Datagram Congestion Control Protocol (DCCP,
   [DCCP] and [CCID3]) and as a profile for RTP [RTP-TFRC].

   This document explores issues to consider and strategies to employ
   when adapting or creating streaming media applications to use
   transport protocols that rely on TFRC for congestion control.  The
   approach here is one of successive refinement.  Strategies are
   described and their strengths and weaknesses are explored.  New
   strategies are then presented that improve on the previous ones, and
   the process iterates.  The intent is to illuminate the issues,
   rather than to jump to solutions, in order to provide guidance to
   application designers.

2. TFRC Basics

   AIMD congestion control algorithms, such as DCCP's CCID2 [CCID2] or
   TCP's SACK-based control [RFC 3517], use a congestion window (the
   maximum number of packets or segments in flight) to limit the
   transmitter.  The congestion window is increased by one for each
   acknowledged packet, or for each window of acknowledged packets,
   depending on the phase of operation.
   If any packet is dropped (or ECN-marked [ECN]; for simplicity in the
   rest of the document assume that "dropped" equals "dropped or
   ECN-marked"), the congestion window is halved.  This produces a
   characteristic saw-tooth wave variation in throughput, where the
   throughput increases linearly up to the network capacity and then
   drops abruptly (roughly shown in Figure 1).

             |
             |    /|    /|    /|    /|    /
             |   / |   / |   / |   / |   /
   Throughput|  /  |  /  |  /  |  /  |  /
             | /   | /   | /   | /   | /
             |/    |/    |/    |/    |/
             |
             ----------------------------------
                            Time

     Figure 1: Characteristic throughput for AIMD congestion control.

   On the other hand, with TCP-Friendly Rate Control (TFRC), the
   immediate response to packet drops is less dramatic.  To compensate
   for this, TFRC is less aggressive in probing for new capacity after
   a loss.  TFRC uses a version of the TCP throughput equation to
   compute a maximum transmit rate, taking a weighted history of loss
   events as input (more weight is given to more recent losses).  The
   characteristic throughput graph for a TFRC connection looks like a
   flattened sine wave (extremely roughly shown in Figure 2).

             |
             |     --        --        --
             |    /  \      /  \      /  \
   Throughput|   /    \    /    \    /    \
             |  /      \  /      \  /      \
             |--        --        --        -
             |
             ----------------------------------
                            Time

     Figure 2: Characteristic throughput for TFRC congestion control.

   In addition to this high-level behavior, there are several details
   of TFRC operation that, at first blush at least, seem at odds with
   common media stream transmission practices.  Some particular
   considerations are:

   o  Slow Start -- A connection starts out with a transmission rate of
      up to four packets per round trip time (RTT).  After the first
      RTT, the rate is doubled each RTT until a packet is lost.  At
      this point the transmission rate is halved and we enter the
      equation-based phase of operation.
      It's likely that in many situations the initial transmit rate is
      slower than the lowest bit rate encoding of the media.  This will
      require the application to deal with a ramp-up period.

   o  Capacity Probing and Lost Packets -- If the application transmits
      for some time at the maximum rate that TFRC will allow without
      packet loss, TFRC will continuously raise the allowed rate until
      a packet is lost.  This means that, in many circumstances, if an
      application wants to transmit at the maximum possible rate,
      packet loss will not be an exceptional event, but will happen
      routinely in the course of probing for more capacity.

   o  Idleness Penalty -- TFRC follows a "use it or lose it" policy.
      If the transmitter goes idle for a few RTTs, as it would if, for
      instance, silence suppression were being used, the transmit rate
      returns to two packets per RTT, and then doubles every RTT until
      the previous rate is achieved.  This can make restarting after a
      silence suppression interval problematic.

   o  Contentment Penalty -- TFRC likes to satisfy greed.  If you are
      transmitting at the maximum allowed rate, TFRC will try to raise
      that rate.  However, if your application is transmitting below
      the maximum allowed rate, the maximum allowed rate will not be
      increased beyond twice the current transmit rate, no matter how
      long it has been since the last increase.  This can create
      problems when attempting to shift to a higher rate encoding, or
      with video codecs that vary the transmission rate with the amount
      of movement in the image.

   o  Packet Rate, not Bit Rate -- TFRC controls the rate at which
      packets may enter the network, not bytes.  To respond to a
      lowered transmit rate you must reduce the packet transmission
      rate.  Making the packets smaller while keeping the same packet
      rate will not be effective.
   o  Smooth Variance of Transmit Rate -- The strength and purpose of
      TFRC (over AIMD congestion control) is that it smoothly decreases
      the transmission rate in response to recent packet loss events,
      and smoothly increases the rate in the absence of loss events.
      This smoothness is somewhat at odds with most media stream
      encodings, where the transition from one rate to another is often
      a step function.

3. Streaming Media Applications

   While all streaming media applications have some characteristics in
   common (e.g. data must arrive at the receiver at some minimum rate
   for reasonable operation), other characteristics (e.g. tolerance of
   end-to-end delay) vary considerably from application to application.
   For the purposes of this document, it's useful to divide streaming
   media applications into three subtypes:

   o  One-way pre-recorded media
   o  One-way live media
   o  Two-way interactive media

   The relevant difference, as far as this discussion goes, between
   recorded and live media is that recorded media can be transmitted as
   fast as the network allows (assuming adequate buffering at the
   receiver) -- it could be viewed as a special file transfer
   operation.  Live media can't be transmitted faster than the rate at
   which it's encoded.

   The difference between one-way and two-way media is the sensitivity
   to delay.  For one-way applications, delays from encoding at the
   sender to playout at the receiver of several or even tens of seconds
   are acceptable.  For two-way applications, delays from encoding to
   playout of as little as 150 ms are often problematic [XTIME].

   While delay sensitivity is most problematic when dealing with
   two-way conversational applications such as telephony, it is also
   apparent in nominally one-way applications when certain user
   interactions are allowed, such as program switching ("channel
   surfing") or fast forward/skip.
   Arguably, these user interactions have turned the one-way
   application into a two-way application -- there just isn't the same
   sort of data flowing in both directions.

3.1 Stream Switching

   The discussion here assumes that media transmitters are able to
   provide their data in a number of encodings with various bit rate
   requirements and are able to dynamically change between these
   encodings with low overhead.  It also assumes that switching back
   and forth between coding rates does not cause excessive user
   annoyance.

   Given the current state of the codec art, these are big assumptions.
   As a practical matter, continuous shifts between higher and lower
   quality levels can greatly annoy users, much more so than a single
   shift to a lower quality level that then persists.  The algorithms
   given below indicate methods for returning to higher bandwidth
   encodings, but, because of the poor user perception of shifting
   quality, many media applications may choose to never invoke these
   methods.

   Also, the algorithms and results described here hold even if the
   media source can only supply media at one rate.  Obviously the
   statements about switching encoding rates don't apply, and an
   application with only one encoding rate behaves as if it is
   simultaneously at its minimum and maximum rate.

   For convenience in the discussion below, assume that all media
   streams have two encodings, a high bit rate and a low bit rate,
   unless otherwise indicated.

3.2 Media Buffers

   The strategies below make use of the concept of a media buffer.  A
   media buffer is a first-in-first-out queue of media data.  The
   buffer is filled by some source of data (the encoder or the network)
   and drained by some sink (the network or the playout device).  It
   provides rate and jitter compensation between the source and the
   sink.
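   As an illustration, a media buffer can be sketched as a small FIFO
   whose depth is tracked in seconds of media play time; the class and
   method names here are illustrative, not taken from any real
   implementation:

```python
from collections import deque

class MediaBuffer:
    """FIFO of media chunks; depth is measured in seconds of play time."""

    def __init__(self):
        self._chunks = deque()   # each entry: (payload, duration_seconds)
        self._depth = 0.0        # total seconds of media currently queued

    def fill(self, payload, duration):
        """Source side (encoder or network) adds media."""
        self._chunks.append((payload, duration))
        self._depth += duration

    def drain(self):
        """Sink side (network or playout device) removes the oldest media."""
        payload, duration = self._chunks.popleft()
        self._depth -= duration
        return payload, duration

    @property
    def depth(self):
        """Seconds of media play time buffered (compared to thresholds)."""
        return self._depth

# Example: five 20 ms voice frames accumulate 0.1 s of play time.
buf = MediaBuffer()
for i in range(5):
    buf.fill(b"frame%d" % i, 0.020)
assert abs(buf.depth - 0.100) < 1e-9
```

   The fill/drain symmetry (filled by the encoder or the network,
   drained by the network or the playout device) is what lets the same
   structure serve as either a transmit buffer or a playout buffer in
   the strategies below.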
   Media buffer contents are measured in seconds of media play time,
   not bytes or packets.  Media buffers are completely
   application-level constructs and are separate from transport-layer
   transmit and receive queues.

3.3 Variable Rate Media Streams

   The canonical media codec encodes its media as a constant rate bit
   stream.  As the technology has progressed from its time-division
   multiplexing roots, this constant rate stream has become not so
   constant.  Voice codecs often employ silence suppression (also
   called Voice Activity Detection, or VAD), where the stream (in at
   least one direction) goes totally idle, sometimes for several
   seconds, while one side listens to what the other side has to say.
   When that side wants to start talking again, the codec resumes
   sending immediately at its "constant" rate.

   Video codecs similarly employ what could be called "stillness"
   suppression, but it is instead called motion compensation.  Often
   these codecs effectively transmit the changes from one video frame
   to another.  When there is little change from frame to frame (as
   when the background is constant and a talking head is just moving
   its lips) the amount of information to send is small.  When there is
   a major motion, or a change of scene, much more information must be
   sent.  For some codecs, the variation from the minimum rate to the
   maximum rate can be a factor of ten [MPEG4] or more.  Unlike voice
   codecs, though, video codecs typically never go completely idle.

   These abrupt jumps in transmission rate are problematic for any
   congestion control algorithm.  A basic tenet of all existing
   algorithms is that increases in transmission rate must be gradual
   and smooth to avoid damaging other connections in the network.  In
   TFRC, the transmission rate in a Round Trip Time (RTT) can never be
   more than twice the rate actually delivered to the receiver in the
   previous RTT.
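   To see why this limit matters, consider a sender resuming after an
   idle period: as noted in section 2, the rate restarts at two packets
   per RTT and doubles each RTT, so recovering the previous rate takes
   a logarithmic number of RTTs.  A rough back-of-the-envelope sketch
   (not an implementation of TFRC):

```python
import math

def rtts_to_recover(previous_rate, restart_rate=2):
    """RTTs of doubling needed to climb from the post-idle restart rate
    back to the previous transmit rate (both in packets per RTT)."""
    if previous_rate <= restart_rate:
        return 0
    return math.ceil(math.log2(previous_rate / restart_rate))

# A stream that was sending 64 packets per RTT goes briefly idle;
# doubling from 2 packets/RTT (2 -> 4 -> 8 -> 16 -> 32 -> 64)
# takes 5 RTTs to recover.
assert rtts_to_recover(64) == 5
```

   With a 100 ms RTT, that recovery costs on the order of half a second
   of reduced rate -- which is why restarting after a silence
   suppression interval can be problematic.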
   TFRC uses a maximum rate of two packets per RTT after an idle
   period.  This rate might support immediate restart of voice data
   after a silence period, at least when the RTT is in the range
   suitable for two-way media.  More problematic are the factor of ten
   variations in some video codecs.  In some circumstances, TFRC allows
   an application to double its transmit rate over one RTT (assuming no
   recent packet loss events), but an immediate ten-times increase is
   not possible.

4. Strategies for Streaming Media Applications

   This section covers a number of strategies that can be used by
   streaming media applications.  Each strategy is applicable to one or
   more subtypes of streaming media.

4.1 First Strategy -- One-way Pre-recorded Media

   The first strategy is suitable for use with pre-recorded media, and
   takes advantage of the fact that the data for pre-recorded media can
   be transferred to the receiver as fast as the network will allow,
   assuming that the receiver has sufficient buffer space.

4.1.1 Strategy 1

   Assume a recorded program resides on a media server, and the server
   and its clients are capable of stream switching between two encoding
   rates, as described in section 3.1.

   The client (receiver) implements a media buffer as a playout buffer.
   This buffer is potentially big enough to hold the entire recording.
   The playout buffer has three thresholds: a low threshold, a playback
   start threshold, and a high threshold, in order of increasing size.
   These values will typically be in the several to tens of seconds
   range.  The buffer is filled by data arriving from the network and
   drained at the decoding rate necessary to display the data to the
   user.  Figure 3 shows this schematically.
                                 high threshold
                                 |    playback start threshold
                                 |    |    low threshold
   +-------+                     |    |    |
   | Media |  transmit at    +---v----v----v--+
   | File  |---------------->| Playout buffer |-------> display
   |       | TFRC max rate   +----------------+ drain at
   +-------+                  fill at network   decode rate
                              arrival rate

      Figure 3: Transfer and playout of one-way pre-recorded media.

   During the connection, the server needs to be able to determine the
   depth of data in the playout buffer.  This could be provided by
   direct feedback from the client to the server, or the server could
   estimate the depth itself (e.g. the server knows how much data has
   been sent and how much time has passed).

   To start the connection, the server begins transmitting data in the
   high bit rate encoding as fast as TFRC allows.  Since TFRC is in
   slow start, this is probably too slow initially, but eventually the
   rate should increase sufficiently (assuming sufficient capacity in
   the network path).  As the client receives data from the network it
   adds the data to the playout buffer.  Once the buffer depth reaches
   the playback start threshold, the receiver begins draining the
   buffer and playing the contents to the user.

   If the network has sufficient capacity, TFRC will eventually raise
   the transmit rate beyond what is necessary to keep up with the
   decoding rate, the playout buffer will back up as necessary, and the
   entire program will eventually be transferred.

   If the TFRC transmit rate never gets fast enough, or loss events
   make TFRC drop the rate, the receiver will drain the playout buffer
   faster than it is filled.  When the playout buffer drops below the
   low threshold, the server switches to the low bit rate encoding.
   Assuming that the network has a bit more capacity than the low bit
   rate requires, the playout buffer will begin filling again.
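   The server's switching rule amounts to hysteresis on the reported
   playout buffer depth: switch down at the low threshold, and only
   switch back up once the buffer has refilled past the high threshold.
   A minimal sketch, with illustrative threshold values:

```python
LOW_THRESHOLD = 5.0    # seconds of buffered play time (illustrative)
HIGH_THRESHOLD = 20.0  # seconds of buffered play time (illustrative)

def choose_encoding(buffer_depth, current_encoding):
    """Pick the encoding rate from the client's playout buffer depth.

    Switch to the low rate when the buffer drains below the low
    threshold; only switch back to the high rate once the buffer has
    refilled past the high threshold.  Between the two thresholds,
    keep the current encoding (the dead band prevents rapid flapping).
    """
    if buffer_depth < LOW_THRESHOLD:
        return "low"
    if buffer_depth > HIGH_THRESHOLD:
        return "high"
    return current_encoding

# The buffer drains, the server drops to the low rate, and it stays
# low until the depth has climbed back past the high threshold.
enc = "high"
for depth in (25.0, 12.0, 4.0, 8.0, 19.0, 21.0):
    enc = choose_encoding(depth, enc)
print(enc)  # back to "high" only after depth exceeded 20.0
```

   The oscillation between the two encodings described in the text is
   exactly this loop running against a buffer that alternately drains
   and refills.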
   When the buffer crosses the high threshold, the server may switch
   back to the high encoding rate.  Assuming that the network still
   doesn't have enough capacity for the high bit rate, the playout
   buffer will start draining again.  When it reaches the low threshold
   the server switches again to the low bit rate encoding.  The server
   will oscillate back and forth like this until the connection is
   concluded.

   If the network has insufficient capacity to support even the low bit
   rate encoding, the playout buffer will eventually drain completely,
   and playback will need to be paused until the buffer refills to the
   playback start level.

   Note that, in this scheme, the server doesn't need to explicitly
   know the rate that TFRC has determined; it simply always sends as
   fast as TFRC allows (perhaps alternately reading a chunk of data
   from disk and then blocking on the socket write call until it's
   transmitted).  TFRC shapes the stream to the network's requirements,
   and the playout buffer feedback allows the server to shape the
   stream to the application's requirements.

4.1.2 Issues With Strategy 1

   The advantage of this strategy is that it provides insurance against
   an unpredictable future.  Since there's no guarantee that a
   currently supported transmit rate will continue to be supported, the
   strategy takes what the network is willing to give when it's willing
   to give it.  The data is transferred from the server to the client
   perhaps faster than is strictly necessary, but once it's there no
   network problems (or new sources of traffic) can affect the display.

   Silence suppression can be used with this strategy, since the
   transmitter doesn't actually go idle during the silence -- it just
   gets further ahead.  Variable rate video codecs can also function
   well.
   Again, the transmitter will get ahead faster during the interpolated
   frames and fall back during the index frames, but a playout buffer
   of a few seconds is probably sufficient to mask these variations.

   One obvious disadvantage, if the client is a "thin" device, is the
   large buffer required at the client.  A subtler disadvantage
   involves the way TFRC probes the network to determine its capacity.
   Basically, TFRC does not have an a priori idea of what the network
   capacity is; it simply gradually increases the transmit rate until
   packets are lost, and then backs down.  After a period of time with
   no losses, the rate is gradually increased again until more packets
   are lost.  Over the long term, the transmit rate will oscillate up
   and down, with packet loss events occurring at the rate peaks.

   This means that packet loss will likely be routine with this
   strategy.  For any given transfer, the number of lost packets is
   likely to be small, but non-zero.  Whether this causes noticeable
   quality problems depends on the characteristics of the particular
   codec in use.

4.2 Second Try -- One-way Live Media

   With one-way live media you can only transmit the data as fast as
   it's created, but end-to-end delays of several seconds or tens of
   seconds are usually acceptable.

4.2.1 Strategy 2

   Assume that we have a playout media buffer at the receiver and a
   transmit media buffer at the sender.  The transmit buffer is filled
   at the encoding rate and drained at the TFRC transmit rate.  The
   playout buffer is filled at the network arrival rate and drained at
   the decoding rate.  The playout buffer has a playback start
   threshold, and the transmit buffer has a switch encoding threshold
   and a discard data threshold.  These thresholds are on the order of
   several to tens of seconds.  Switch encoding is less than discard
   data, which is less than playback start.  Figure 4 shows this
   schematically.
                     discard data
                     |   switch encoding
                     |   |                   playback start
                     |   |                   |
    media  +---------v---v---+          +----v-----------+
   ------->| Transmit buffer |--------->| Playout buffer |---> display
   source  +-----------------+ transmit +----------------+
    fill at                   at TFRC rate        drain at
    encode rate                                   decode rate

        Figure 4: Transfer and playout of one-way live media.

   At the start of the connection, the sender places data into the
   transmit buffer at the high encoding rate.  The buffer is drained at
   the TFRC transmit rate, which at this point is in slow start and is
   probably slower than the encoding rate.  This will cause a backup in
   the transmit buffer.  Eventually TFRC will slow-start to a rate
   slightly above that necessary to sustain the encoding rate (assuming
   the network has sufficient capacity).  When this happens the
   transmit buffer will drain and we'll reach a steady state condition
   where the transmit buffer is normally empty and we're transmitting
   at a rate that is probably below the maximum allowed by TFRC.

   Meanwhile, at the receiver, the playout buffer is filling, and when
   it reaches the playback start threshold playback will start.  After
   TFRC slow start is complete and the transmit buffer is drained, this
   buffer will reach a steady state where packets are arriving from the
   network at the encoding rate (ignoring jitter) and being drained at
   the (equal) decoding rate.  The depth of the buffer will be the
   playback start threshold plus the maximum depth of the transmit
   buffer during slow start.

   Now assume that network congestion (packet loss) forces TFRC to drop
   its rate below that needed by the high encoding rate.  The transmit
   buffer will begin to fill and the playout buffer will begin to
   drain.
   When the transmit buffer reaches the switch encoding threshold, the
   sender switches to the low encoding rate, and converts all of the
   data in the transmit buffer to the low rate encoding.

   Assuming that the network can support the new, lower, rate (and a
   little more), the transmit buffer will begin to drain and the
   playout buffer will begin to fill.  Eventually the transmit buffer
   will empty and the playout buffer will be back to its steady state
   level.

   At this point (or perhaps after a slight delay) the sender can
   switch back to the higher rate encoding.  If the new rate can't be
   sustained, the transmit buffer will fill again, and the playout
   buffer will drain.  When the transmit buffer reaches the switch
   encoding threshold the sender goes back to the lower encoding rate.
   This oscillation continues until the stream ends or the network is
   able to support the high encoding rate for the long term.

   If the network can't support even the low encoding rate, the
   transmit buffer will continue to fill (and the playout buffer will
   continue to drain).  When the transmit buffer reaches the discard
   data threshold, the sender must discard a chunk of data from the
   transmit buffer for every chunk of data added.  Preferably, the
   discard should happen at the head of the transmit buffer, as these
   are the stalest data, but the application could make other choices
   (e.g. discard the earliest silence in the buffer).  This discard
   behavior continues until the transmit buffer falls below the switch
   encoding threshold.  If the playout buffer ever drains completely,
   the receiver should fill the output with suitable material (e.g.
   silence or stillness).

   Note that this strategy is also suitable for one-way pre-recorded
   media, as long as the transmit buffer is filled only at the encoding
   rate, not at the disk read rate.

4.2.2 Issues with Strategy 2

   Strategy 2 is fairly effective.
   There is a limit on the necessary size of the playout buffer at the
   client, so clients with limited resources can be supported.  When
   silence suppression is used or motion compensation sends
   interpolated frames, the transmit rate will actually go down, and
   then must slowly ramp up to return to the maximum rate, but this
   smoothing can often be masked by a playout buffer of a few seconds.

   Also, since strategy 2 limits the transmission rate to the maximum
   encoding rate, and therefore doesn't try to get every last bit of
   possible throughput from the network, routine packet loss can be
   avoided (assuming that there's enough network capacity for the
   maximum encoding rate).

4.3 One More Time -- Two-way Interactive Media

   Two-way interactive media is characterized by its low tolerance for
   end-to-end delay, usually requiring less than 150 ms for interactive
   conversation, including jitter buffering at the receiver.  Rate
   adapting buffers will insert too much delay, and the slow start
   period is likely to be noticeable ("Hello" clipping).

   This low delay requirement makes using TFRC with variable-rate
   codecs (codecs using silence suppression or motion compensation)
   highly problematic.  The extra delays imposed by the smooth rate
   increases mandated by TFRC are unlikely to be tolerated by
   interactive applications.

   There are further problems with the usual practice in interactive
   voice applications of using small packets.  In voice applications,
   the data rate is low enough that waiting to accumulate enough data
   to fill a large packet adds unacceptable delay.  For example, the
   G.711 codec generates one byte of data every 125 microseconds.  To
   accumulate enough data for a 1480-byte packet, the encoder would
   need to delay some data by 185 ms, eating up the entire delay budget
   just for packetization.  These considerations can also apply to very
   low rate video.
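   The packetization arithmetic above generalizes directly: at a
   constant encoding rate, the first byte of a packet waits while the
   rest of the payload accumulates.  A small sketch of the calculation:

```python
def packetization_delay_ms(payload_bytes, codec_bytes_per_sec):
    """Time the first byte of a packet waits while the payload fills."""
    return payload_bytes * 1000.0 / codec_bytes_per_sec

G711_BYTES_PER_SEC = 8000  # 64 kbit/s: one byte every 125 microseconds

# The 1480-byte packet from the text eats the entire delay budget...
print(packetization_delay_ms(1480, G711_BYTES_PER_SEC))  # 185.0 (ms)

# ...while an 80-byte payload (10 ms of G.711, the size typical of
# interactive voice) keeps packetization delay small.
print(packetization_delay_ms(80, G711_BYTES_PER_SEC))    # 10.0 (ms)
```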
The goal of TFRC is fair sharing of a bottleneck, in packets per
second, with a TCP application using 1480-byte packets.  Applications
using smaller packets will receive a fair share of packets per
second, but less than a fair share of bytes per second.  With the
packet sizes typically in use in interactive voice applications
(e.g., 80 bytes of user data for G.711 with 10 ms packetization), it
can be very difficult to achieve useful byte-per-second rates when in
competition with TCP applications.

Further research is needed to resolve these issues.  The strategy
below can only be applied to constant-rate codecs whose data rate is
sufficiently large to fill 1480-byte packets within tolerable delay
limits.

4.3.1 Strategy 3

To start, the calling party sends an INVITE (loosely using SIP [RFC
3261] terminology) indicating the IP address and port to use for
media at its end.  Without informing the called user, the called
system responds to the INVITE by connecting to the calling party's
media port.  Both end systems then begin exchanging test data, at the
(slowly increasing) rate allowed by TFRC.  The purpose of this test
data is to see what rate the connection can be ramped up to.  If a
minimum acceptable rate cannot be achieved within some time period,
the call is cleared (conceptually, the calling party hears "fast
busy" and the called user is never informed of the incoming call).
Note that once the rate has ramped up sufficiently for the highest
rate codec, there's no need to go further.

If an acceptable rate can be achieved (in both directions), the
called user is informed of the incoming call.  The test data
continues during this period.  Once the called user accepts the call,
the test data is replaced by real data at the same rate.

If congestion is encountered during the call, TFRC will reduce its
allowed sending rate.
When that rate falls below the codec currently in use, the sender
switches to a lower rate codec, but should pad its transmission out
to the allowed TFRC rate.  Note that this padding is only necessary
if the application wishes to return to the higher encoding rate when
possible.  If the TFRC rate continues to fall past the lowest rate
codec, the sender must discard packets to conform to that rate.

If the network capacity is sufficient to support one of the lower
rate codecs, eventually the congestion will clear and TFRC will
slowly increase the allowed transmit rate.  The application should
increase its transmission padding to keep up with the increasing TFRC
rate.  The application may switch back to the higher rate codec when
the TFRC rate reaches a sufficient value.

An application that did not wish to switch back to the higher
encoding (perhaps for reasons outlined in section 3.1) would not need
to pad its transmission out to the TFRC maximum rate.

Note that the receiver would normally implement a short playout
buffer (with playback start on the order of some tens of
milliseconds) to smooth out jitter in the packet arrival gaps.

4.3.2 Issues with Strategy 3

An obvious issue with strategy 3 is the post-dial call connection
delay imposed by the slow-start ramp up.  This is perhaps less of an
issue for two-way video applications, where post-dial delays of
several seconds are accepted practice.  For telephony applications,
however, post-dial delays significantly greater than a second are a
problem, given that users have been conditioned to expect fast call
setup by the public telephone network.  On the other hand, the four
packets per RTT initial transmit rate allowed by DCCP's CCID3 in some
circumstances is likely to be sufficient for many telephony
applications, and the ramp up will be very quick.
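The codec-switching and padding rules of Strategy 3 can be sketched
as follows.  The codec table and function names are illustrative
assumptions (and, for simplicity, the rates ignore packet-header
overhead):

```python
# Available codecs, highest data rate first (illustrative list).
CODECS = [
    ("G.711", 64000),     # bit/s
    ("G.726-32", 32000),
    ("G.729", 8000),
]

def select_codec(tfrc_rate_bps):
    """Pick the highest-rate codec the TFRC allowed rate can carry.
    Returns None if even the lowest-rate codec doesn't fit; the
    sender must then discard packets to conform to the rate."""
    for name, rate_bps in CODECS:
        if rate_bps <= tfrc_rate_bps:
            return name, rate_bps
    return None

def padding_bps(tfrc_rate_bps, want_higher_rate_later=True):
    """Padding to send on top of the codec output.  Padding out to
    the full TFRC rate is only needed if the application wants the
    option of returning to a higher-rate codec later."""
    choice = select_codec(tfrc_rate_bps)
    if choice is None or not want_higher_rate_later:
        return 0
    return tfrc_rate_bps - choice[1]
```

For example, at a TFRC rate of 40 kbit/s this sketch selects G.726-32
and pads with 8 kbit/s; once the TFRC rate climbs back to 64 kbit/s
or more, G.711 is selected again.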
As was stated in section 4.3, this strategy is only suitable for use
with constant-rate codecs with fast enough data rates to tolerate
using large packets.

5. Security Considerations

There are no security considerations for this document.  Security
considerations for TFRC and the protocols implementing TFRC are
discussed in their defining documents.

6. IANA Considerations

There are no IANA actions required for this document.

7. Thanks

Thanks to the AVT working group, especially Philippe Gentric and
Brian Rosen, for comments on the earlier version of this document.

8. Informative References

[DCCP]      E. Kohler, M. Handley, S. Floyd, J. Padhye, Datagram
            Congestion Control Protocol (DCCP), February 2004,
            draft-ietf-dccp-spec-06.txt, work in progress.

[CCID2]     S. Floyd, E. Kohler, Profile for DCCP Congestion Control
            2: TCP-Like Congestion Control, February 2004,
            draft-ietf-dccp-ccid2-05.txt, work in progress.

[CCID3]     S. Floyd, E. Kohler, J. Padhye, Profile for DCCP
            Congestion Control 3: TFRC Congestion Control, February
            2004, draft-ietf-dccp-ccid3-04.txt, work in progress.

[RFC 3448]  M. Handley, S. Floyd, J. Padhye, J. Widmer, TCP Friendly
            Rate Control (TFRC): Protocol Specification, RFC 3448.

[RFC 3714]  S. Floyd, J. Kempf, IAB Concerns Regarding Congestion
            Control for Voice Traffic in the Internet, March 2004,
            RFC 3714.

[RFC 3261]  J. Rosenberg, et al., SIP: Session Initiation Protocol,
            June 2002, RFC 3261.

[RFC 3517]  E. Blanton, M. Allman, K. Fall, L. Wang, A Conservative
            Selective Acknowledgment (SACK)-based Loss Recovery
            Algorithm for TCP, April 2003, RFC 3517.

[RFC 3550]  H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson,
            RTP: A Transport Protocol for Real-Time Applications,
            July 2003, RFC 3550.

[XTIME]     ITU-T: Series G: Transmission Systems and Media, Digital
            Systems and Networks, Recommendation G.114, One-way
            Transmission Time, May 2000.

[ECN]       K. Ramakrishnan, S. Floyd, D. Black, The Addition of
            Explicit Congestion Notification (ECN) to IP, September
            2001, RFC 3168.

[MPEG4]     ISO/IEC International Standard 14496 (MPEG-4),
            Information technology - Coding of audio-visual objects,
            January 2000.

[RTP-TFRC]  L. Gharai, RTP Profile for TCP-Friendly Rate Control,
            October 2004, draft-ietf-avt-tfrc-profile-03.txt, work
            in progress.

9. Author's Address

Tom Phelan
Sonus Networks
7 Technology Park Dr.
Westford, MA USA 01886
Phone: +1-978-614-8456
Email: tphelan@sonusnet.com

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights.  Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard.  Please address the information to the IETF at
ietf-ipr@ietf.org.

Disclaimer of Validity

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the
Internet Society.