Network Working Group                                          H. Lundin
Internet-Draft                                                 S. Holmer
Intended status: Informational                        H. Alvestrand, Ed.
Expires: April 25, 2013                                           Google
                                                        October 22, 2012

  A Google Congestion Control Algorithm for Real-Time Communication on
                           the World Wide Web
                 draft-alvestrand-rtcweb-congestion-03

Abstract

   This document describes two methods of congestion control when using
   real-time communications on the World Wide Web (RTCWEB); one sender-
   based and one receiver-based.
   It is published as an input document to the RMCAT working group on
   congestion control for media streams.  The mailing list of that WG is
   rmcat@ietf.org.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 25, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
     1.1.  Mathematical notation conventions  . . . . . . . . . . . .  3
   2.  System model . . . . . . . .
 . . . . . . . . . .  4
   3.  Receiver side control  . . . . . . . . . . . . . . . . . . . .  5
     3.1.  Processing multiple streams using RTP timestamp to NTP
           time conversion  . . . . . . . . . . . . . . . . . . . . .  5
     3.2.  Arrival-time model . . . . . . . . . . . . . . . . . . . .  5
     3.3.  Arrival-time filter  . . . . . . . . . . . . . . . . . . .  7
     3.4.  Over-use detector  . . . . . . . . . . . . . . . . . . . .  8
     3.5.  Rate control . . . . . . . . . . . . . . . . . . . . . . .  9
   4.  Sender side control  . . . . . . . . . . . . . . . . . . . . . 11
   5.  Interoperability Considerations  . . . . . . . . . . . . . . . 13
   6.  Implementation Experience  . . . . . . . . . . . . . . . . . . 13
   7.  Further Work . . . . . . . . . . . . . . . . . . . . . . . . . 14
   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 15
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 15
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 16
     11.2. Informative References . . . . . . . . . . . . . . . . . . 16
   Appendix A.  Change log  . . . . . . . . . . . . . . . . . . . . . 16
     A.1.  Version -00 to -01 . . . . . . . . . . . . . . . . . . . . 16
     A.2.  Version -01 to -02 . . . . . . . . . . . . . . . . . . . . 17
     A.3.  Version -02 to -03 . . . . . . . . . . . . . . . . . . . . 17
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 17

1.  Introduction

   Congestion control is a requirement for all applications that wish to
   share the Internet [RFC2914].
   The problem of doing congestion control for real-time media is made
   difficult for a number of reasons:

   o  The media is usually encoded in forms that cannot be quickly
      changed to accommodate varying bandwidth, and bandwidth
      requirements can often be changed only in discrete, rather large
      steps

   o  The participants may have certain specific wishes on how to
      respond - which may not be reducing the bandwidth required by the
      flow on which congestion is discovered

   o  The encodings are usually sensitive to packet loss, while the
      real-time requirement precludes the repair of packet loss by
      retransmission

   This memo describes two congestion control algorithms that together
   are seen to give reasonable performance and reasonable (not perfect)
   bandwidth sharing with other conferences and with TCP-using
   applications that share the same links.

   The signalling used consists of standard RTP timestamps [RFC3550],
   possibly augmented with RTP transmission time offsets [RFC5450],
   standard RTCP feedback reports, and Temporary Maximum Media Stream
   Bit Rate Requests (TMMBR) as defined in [RFC5104] section 3.5.4, or
   the REMB feedback report defined in [I-D.alvestrand-rmcat-remb].

1.1.  Mathematical notation conventions

   The mathematics of this document have been transcribed from a more
   formula-friendly format.

   The following notational conventions are used:

   X_bar    The variable X, where X is a vector - conventionally marked
            by a bar on top of the variable name.

   X_hat    An estimate of the true value of variable X -
            conventionally marked by a circumflex accent on top of the
            variable name.

   X(i)     The "i"th value of X - conventionally marked by a subscript
            i.

   [x y z]  A row vector consisting of elements x, y and z.

   X_bar^T  The transpose of vector X_bar.

   E{X}     The expected value of the stochastic variable X.

2.  System model

   The following elements are in the system:

   o  RTP packet - an RTP packet containing media data.

   o  Frame - a set of RTP packets transmitted from the sender at the
      same time instant.  This could be a video frame, an audio frame,
      or a mix of audio and video packets.  A frame can be defined by
      the RTP packet send time (RTP timestamp + transmission time
      offset), or by the RTP timestamp if the transmission time offset
      field is not present.

   o  Incoming media stream - a stream of frames consisting of RTP
      packets.

   o  Media codec - has a bandwidth control, and encodes the incoming
      media stream into an RTP stream.

   o  RTP sender - sends the RTP stream over the network to the RTP
      receiver.  Generates the RTP timestamp.

   o  RTP receiver - receives the RTP stream and notes the time of
      arrival.  Regenerates the media stream for the recipient.

   o  RTCP sender at RTP sender - sends sender reports with mappings
      between RTP timestamps and NTP time.

   o  RTCP sender at RTP receiver - sends receiver reports and TMMBR/
      REMB messages.

   o  RTCP receiver at RTP sender - receives receiver reports and
      TMMBR/REMB messages, and reports these to the sender side
      control.

   o  RTCP receiver at RTP receiver.

   o  Sender side control - takes loss rate info, round-trip time info,
      and TMMBR/REMB messages and computes a sending bit rate.

   o  Receiver side control - takes the packet arrival info at the RTP
      receiver and decides when to send TMMBR/REMB messages.

   Together, the sender side control and receiver side control
   implement the congestion control algorithm.

3.  Receiver side control

   The receive-side algorithm can be further decomposed into four
   parts: an RTP timestamp to NTP time conversion, an arrival-time
   filter, an over-use detector, and a remote rate control.

3.1.  Processing multiple streams using RTP timestamp to NTP time
      conversion

   It is common that multiple RTP streams are sent from the sender to
   the receiver.  In such a situation the RTP timestamps of incoming
   streams can first be converted to a common time base using the RTP
   timestamp and NTP time pairs in RTCP SR reports [RFC3550].  The
   converted timestamps can then be used instead of RTP timestamps in
   the arrival-time filtering, and since all streams from the same
   sender have timestamps in the same time base, they can all be
   processed by the same filter.  This has the advantage of quicker
   reactions and reduces problems of noisy measurements due to self-
   inflicted cross-traffic.

   In the time interval from the start of the call until a stream from
   the same sender has received an RTCP SR report, the receiver-side
   control operates in single-stream mode.  In that mode only one RTP
   stream can be processed by the over-use detector.  As soon as a
   stream has received one or more RTCP SR reports, the receiver-side
   control can change to a multi-stream mode, where all RTP streams
   from the same sender which have received one or more RTCP SR reports
   can be processed by the over-use detector.  When switching to the
   multi-stream mode, the state of the over-use detector must be
   modified to avoid a time base mismatch.  This can be done either by
   resetting the stored RTP timestamp values or by converting them
   using the newly received RTCP SR report.

3.2.  Arrival-time model

   This section describes an adaptive filter that continuously updates
   estimates of network parameters based on the timing of the received
   frames.

   At the receiving side we are observing groups of incoming packets,
   where each group of packets corresponds to the same frame, having
   timestamp T(i).
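The timestamp conversion described in Section 3.1 can be sketched as follows.  This is a minimal illustration, not part of the draft: the function name is our own, the mapping is a simple linear extrapolation from the latest SR pair, and 32-bit RTP timestamp wrap-around handling is omitted.

```python
def rtp_to_ntp(rtp_ts, sr_rtp_ts, sr_ntp_secs, clock_rate):
    # Map an RTP timestamp onto the sender's NTP timeline using the
    # (RTP timestamp, NTP time) pair from the latest RTCP SR [RFC3550].
    # A real implementation must unwrap the 32-bit RTP timestamp.
    return sr_ntp_secs + (rtp_ts - sr_rtp_ts) / clock_rate

# Two streams from the same sender land on a common time base:
t_video = rtp_to_ntp(180000, 90000, 100.0, 90000)  # 90 kHz video clock
t_audio = rtp_to_ntp(56000, 8000, 100.0, 48000)    # 48 kHz audio clock
print(t_video, t_audio)  # both 101.0
```

With the two streams on one NTP time base, a single arrival-time filter can process frames from both.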
   Each frame is assigned a receive time t(i), which corresponds to the
   time at which the whole frame has been received (ignoring any packet
   losses).  A frame is delayed relative to its predecessor if
   t(i)-t(i-1) > T(i)-T(i-1), i.e., if the arrival time difference is
   larger than the timestamp difference.

   We define the (relative) inter-arrival time, d(i), as

      d(i) = t(i)-t(i-1) - (T(i)-T(i-1))

   Since the time ts to send a frame of size L over a path with a
   capacity of C is roughly

      ts = L/C

   we can model the inter-arrival time as

             L(i)-L(i-1)
      d(i) = ----------- + w(i) = dL(i)/C + w(i)
                  C

   Here, w(i) is a sample from a stochastic process W, which is a
   function of the capacity C, the current cross traffic X(i), and the
   current send bit rate R(i).  We model W as a white Gaussian process.
   If we are over-using the channel we expect w(i) to increase, and if
   a queue on the network path is being emptied, w(i) will decrease;
   otherwise the mean of w(i) will be zero.

   Breaking out the mean m(i) from w(i) to make the process zero mean,
   we get

   Equation 5

      d(i) = dL(i)/C + m(i) + v(i)

   This is our fundamental model, where we take into account that a
   large frame needs more time to traverse the link than a small frame,
   thus arriving with higher relative delay.  The noise term represents
   network jitter and other delay effects not captured by the model.

   When graphing the values for d(i) versus dL(i) on a scatterplot, we
   find that most samples cluster around the center, and the outliers
   are clustered along a line with average slope 1/C and zero offset.
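The relative inter-arrival time d(i) is directly computable from per-frame send timestamps and receive times; a minimal sketch, with times in milliseconds and names of our own choosing:

```python
def inter_arrival(t, T, i):
    # d(i) = (t(i) - t(i-1)) - (T(i) - T(i-1)): how much later frame i
    # arrived than its timestamp spacing predicts.  Positive values
    # suggest queue build-up; negative values suggest a draining queue.
    return (t[i] - t[i-1]) - (T[i] - T[i-1])

T = [0, 20, 40]      # send timestamps, ms (50 frames per second)
t = [100, 125, 150]  # arrival times, ms: each frame 5 ms later than expected
print([inter_arrival(t, T, i) for i in (1, 2)])  # [5, 5]
```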
   For instance, when using a regular video codec, most frames are
   roughly the same size after encoding (the central "cloud"); the
   exceptions are I-frames (or key frames), which are typically much
   larger than the average, causing positive outliers (the I-frame
   itself) and negative outliers (the frame after an I-frame) on the dL
   axis.  Audio frames, on the other hand, often consist of single
   packets of equal size, and an audio-only media stream would have its
   frames scattered at dL = 0.

3.3.  Arrival-time filter

   The parameters d(i) and dL(i) are readily available for each frame
   i > 1, and we want to estimate C(i) and m(i) and use those estimates
   to detect whether or not we are over-using the bandwidth currently
   available.  These parameters are easily estimated by any adaptive
   filter - we are using the Kalman filter.

   Let

      theta_bar(i) = [1/C(i)  m(i)]^T

   and call it the state at time i.  We model the state evolution from
   time i to time i+1 as

      theta_bar(i+1) = theta_bar(i) + u_bar(i)

   where u_bar(i) is the zero mean white Gaussian process noise with
   covariance

   Equation 7

      Q(i) = E{u_bar(i) u_bar(i)^T}

   Given equation 5 we get

   Equation 8

      d(i) = h_bar(i)^T theta_bar(i) + v(i)

      h_bar(i) = [dL(i)  1]^T

   where v(i) is zero mean white Gaussian measurement noise with
   variance var_v = sigma(v,i)^2.

   The Kalman filter recursively updates our estimate

      theta_hat(i) = [1/C_hat(i)  m_hat(i)]^T

   as

      z(i) = d(i) - h_bar(i)^T * theta_hat(i-1)

      theta_hat(i) = theta_hat(i-1) + z(i) * k_bar(i)

                          E(i-1) * h_bar(i)
      k_bar(i) = --------------------------------------------
                 var_v_hat + h_bar(i)^T * E(i-1) * h_bar(i)

      E(i) = (I - k_bar(i) * h_bar(i)^T) * E(i-1) + Q(i)

   I is the 2-by-2 identity matrix.
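The recursion above maps directly onto code.  The following sketch uses NumPy; var_v_hat and Q are taken as fixed inputs here (the draft estimates var_v_hat adaptively, and scales Q with the frame rate), and the initial values are illustrative assumptions:

```python
import numpy as np

def kalman_update(theta_hat, E, d_i, dL_i, var_v_hat, Q):
    # theta_hat = [1/C_hat, m_hat]^T; E is its 2x2 error covariance.
    h = np.array([dL_i, 1.0])                 # h_bar(i)
    z = d_i - h @ theta_hat                   # innovation z(i)
    k = (E @ h) / (var_v_hat + h @ E @ h)     # gain k_bar(i)
    theta_hat = theta_hat + z * k
    E = (np.eye(2) - np.outer(k, h)) @ E + Q
    return theta_hat, E

theta = np.zeros(2)           # no queuing delay assumed initially
E = np.eye(2) * 0.1           # illustrative initial uncertainty
Q = np.diag([1e-10, 1e-2]) * (30 / 1000.0)
# A frame of unchanged size (dL = 0) arriving 5 ms late pulls the
# offset estimate m_hat upward while leaving 1/C_hat untouched:
theta, E = kalman_update(theta, E, d_i=5.0, dL_i=0.0, var_v_hat=1.0, Q=Q)
```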
   The variance var_v = sigma(v,i)^2 is estimated using an exponential
   averaging filter, modified for variable sampling rate:

      var_v_hat = beta*sigma(v,i-1)^2 + (1-beta)*z(i)^2

      beta = (1-alpha)^(30/(1000 * f_max))

   where f_max = max {1/(T(j) - T(j-1))} for j in i-K+1,...,i is the
   highest rate at which frames have been captured by the camera over
   the last K frames, and alpha is a filter coefficient typically
   chosen as a number in the interval [0.1, 0.001].  Since our
   assumption that v(i) should be zero mean WGN is less accurate in
   some cases, we have introduced an additional outlier filter around
   the updates of var_v_hat.  If z(i) > 3*sqrt(var_v_hat), the filter
   is updated with 3*sqrt(var_v_hat) rather than z(i).  For instance,
   v(i) will not be white in situations where packets are sent at a
   higher rate than the channel capacity, in which case they will be
   queued behind each other.  In a similar way, Q(i) is chosen as a
   diagonal matrix with main diagonal elements given by

      diag(Q(i)) = 30/(1000 * f_max) * [10^-10  10^-2]^T

   It is necessary to scale these filter parameters with the frame rate
   to make the detector respond as quickly at low frame rates as at
   high frame rates.

3.4.  Over-use detector

   The offset estimate m(i) is compared with a threshold gamma_1.  An
   estimate above the threshold is considered an indication of over-
   use.  Such an indication alone is not enough for the detector to
   signal over-use to the rate control subsystem: a definitive over-use
   will be signaled only when over-use has been detected for at least
   gamma_2 milliseconds and at least gamma_3 frames.  However, if the
   offset estimate m(i) was decreased in the last update, over-use will
   not be signaled even if all the above conditions are met.
   Similarly, the opposite state, under-use, is detected when
   m(i) < -gamma_1.  If neither over-use nor under-use is detected, the
   detector will be in the normal state.
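The detector logic of Section 3.4 can be sketched as a small state machine.  The concrete values for gamma_1, gamma_2 and gamma_3 below are placeholders, since the draft leaves them as tuning parameters:

```python
class OveruseDetector:
    def __init__(self, gamma1=0.005, gamma2=10.0, gamma3=1):
        # gamma1: offset threshold; gamma2: ms of sustained over-use;
        # gamma3: frames of sustained over-use.  Placeholder values.
        self.gamma1, self.gamma2, self.gamma3 = gamma1, gamma2, gamma3
        self.time_over = 0.0    # ms of consecutive over-use
        self.frames_over = 0
        self.prev_m = 0.0

    def update(self, m_hat, dt_ms):
        state = 'normal'
        if m_hat > self.gamma1:
            self.time_over += dt_ms
            self.frames_over += 1
            # Signal over-use only after gamma2 ms AND gamma3 frames,
            # and only if the offset estimate did not just decrease.
            if (self.time_over >= self.gamma2 and
                    self.frames_over >= self.gamma3 and
                    m_hat >= self.prev_m):
                state = 'overuse'
        else:
            self.time_over, self.frames_over = 0.0, 0
            if m_hat < -self.gamma1:
                state = 'underuse'
        self.prev_m = m_hat
        return state

det = OveruseDetector()
print(det.update(0.02, 20))   # overuse: 20 ms and 1 frame exceed thresholds
print(det.update(0.0, 20))    # normal
print(det.update(-0.02, 20))  # underuse
```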
3.5.  Rate control

   The rate control at the receiving side is designed to increase the
   receive-side estimate of the available bandwidth A_hat as long as
   the detected state is normal.  Doing that assures that we, sooner or
   later, will reach the available bandwidth of the channel and detect
   an over-use.

   As soon as over-use has been detected, the receive-side estimate of
   the available bandwidth is decreased.  In this way we get a
   recursive and adaptive estimate of the available bandwidth.

   In this document we make the assumption that the rate control
   subsystem is executed periodically and that this period is constant.

   The rate control subsystem has 3 states: Increase, Decrease and
   Hold.  "Increase" is the state when no congestion is detected;
   "Decrease" is the state where congestion is detected; and "Hold" is
   a state that waits until built-up queues have drained before going
   to the "Increase" state.

   The state transitions (with blank fields meaning "remain in state")
   are:

      State ---->  |   Hold    | Increase  | Decrease
      Signal       |           |           |
        v          |           |           |
      -----------------------------------------------
      Over-use     | Decrease  | Decrease  |
      -----------------------------------------------
      Normal       | Increase  |           |  Hold
      -----------------------------------------------
      Under-use    |           |  Hold     |  Hold
      -----------------------------------------------

   The subsystem starts in the increase state, where it will stay until
   over-use or under-use has been detected by the detector subsystem.

   On every update the receive-side estimate of the available bandwidth
   is increased by a factor which is a function of the global system
   response time and the estimated measurement noise variance
   var_v_hat.  The global system response time is the time from an
   increase that causes over-use until that over-use can be detected by
   the over-use detector.
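The transition table can be encoded directly as a lookup; entries absent from the dictionary correspond to the blank cells, meaning "remain in state":

```python
# (signal, current state) -> next state
TRANSITIONS = {
    ('over-use', 'hold'): 'decrease',
    ('over-use', 'increase'): 'decrease',
    ('normal', 'hold'): 'increase',
    ('normal', 'decrease'): 'hold',
    ('under-use', 'increase'): 'hold',
    ('under-use', 'decrease'): 'hold',
}

def next_state(state, signal):
    return TRANSITIONS.get((signal, state), state)

state = 'increase'                      # the subsystem starts here
state = next_state(state, 'over-use')   # -> decrease
state = next_state(state, 'normal')     # -> hold (wait for queues to drain)
state = next_state(state, 'normal')     # -> increase
print(state)  # increase
```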
   The variance var_v_hat affects how responsive the Kalman filter is,
   and is thus used as an indicator of the delay inflicted by the
   Kalman filter.

      A_hat(i) = eta * A_hat(i-1)

                                       1.001+B
      eta(RTT, var_v_hat) = ------------------------------------------
                            1 + e^(b*(d*RTT - (c1 * var_v_hat + c2)))

   Here, B, b, d, c1 and c2 are design parameters.

   Since the system depends on over-using the channel to verify the
   current available bandwidth estimate, we must make sure that our
   estimate doesn't diverge from the rate at which the sender is
   actually sending.  Thus, if the sender is unable to produce a bit
   stream with the bit rate the receiver is asking for, the available
   bandwidth estimate must stay within a given bound.  Therefore we
   introduce a threshold

      A_hat(i) < 1.5 * R_hat(i)

   where R_hat(i) is the incoming bit rate measured over a T-second
   window:

      R_hat(i) = 1/T * sum(L(j)) for j from 1 to N(i)

   N(i) is the number of frames received during the past T seconds and
   L(j) is the payload size of frame j.  Ideally, T should be chosen to
   match the rate controller at the sender.  A window between 0.5 and 1
   second is recommended.

   When an over-use is detected, the system transitions to the decrease
   state, where the receive-side available bandwidth estimate is
   decreased to a factor times the currently incoming bit rate:

      A_hat(i) = alpha * R_hat(i)

   alpha is typically chosen to be in the interval [0.8, 0.95].

   When the detector signals under-use to the rate control subsystem,
   we know that queues in the network path are being emptied,
   indicating that our available bandwidth estimate is lower than the
   actual available bandwidth.
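The increase and decrease rules can be sketched as follows.  The values chosen for the design parameters B, b, d, c1 and c2 are placeholders picked so that eta exceeds 1 for small RTTs; the draft does not give concrete values:

```python
import math

def eta(rtt, var_v_hat, B=0.05, b=5.0, d=1.0, c1=1.0, c2=1.0):
    # Multiplicative increase factor; parameter values are placeholders.
    return (1.001 + B) / (1.0 + math.exp(b * (d * rtt - (c1 * var_v_hat + c2))))

def on_increase(A_prev, R_hat, rtt, var_v_hat):
    # Grow the estimate, capped at 1.5 times the measured incoming
    # rate R_hat so it cannot diverge from what is actually sent.
    return min(eta(rtt, var_v_hat) * A_prev, 1.5 * R_hat)

def on_decrease(R_hat, alpha=0.85):
    # alpha is typically chosen in [0.8, 0.95].
    return alpha * R_hat

A = on_increase(1_000_000, 1_000_000, rtt=0.05, var_v_hat=0.0)
assert A > 1_000_000                              # estimate grows while normal
assert abs(on_decrease(1_000_000) - 850_000) < 1e-6   # back off on over-use
```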
   Upon that signal the rate control subsystem will enter the hold
   state, where the receive-side available bandwidth estimate will be
   held constant while waiting for the queues to stabilize at a lower
   level - a way of keeping the delay as low as possible.  This
   decrease of delay is wanted, and expected, immediately after the
   estimate has been reduced due to over-use, but can also happen if
   the cross traffic over some links is reduced.  In either case we
   want to measure the highest incoming rate during the under-use
   interval:

      R_max = max{R_hat(i)} for i in 1..K

   where K is the number of frames of under-use before returning to the
   normal state.  R_max is a measure of the actual bandwidth available
   and is a good guess of what bit rate the sender should be able to
   transmit at.  Therefore the receive-side available bandwidth
   estimate will be set to R_max when we transition from the hold state
   to the increase state.

   One design decision is when to send rate control messages.  The time
   from a change in congestion to the sending of the feedback message
   is a limitation on how fast the sender can react.  Sending too many
   messages giving no new information is a waste of bandwidth - but in
   the case of severe congestion, feedback messages can be lost,
   resulting in a failure to react in a timely manner.

   The conclusion is that feedback messages should be sent on a
   "heartbeat" schedule, allowing the sender side control to react to
   missing feedback messages by reducing its send rate, but they should
   also be sent whenever the estimated bandwidth value has changed
   significantly, without waiting for the heartbeat time, up to some
   limiting upper bound on the send rate.

   The minimum interval is named t_min_fb_interval.

   The maximum interval is named t_max_fb_interval.
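The feedback scheduling rule can be sketched as follows.  The interval values and the 5% "significant change" threshold are assumptions of this sketch; the draft leaves all three unspecified:

```python
def should_send_feedback(now, last_fb_time, A_hat, last_sent_A_hat,
                         t_min_fb_interval=0.05, t_max_fb_interval=0.5,
                         change_threshold=0.05):
    # Heartbeat: always send once t_max_fb_interval has elapsed, so the
    # sender can treat missing feedback as a sign of severe congestion.
    elapsed = now - last_fb_time
    if elapsed >= t_max_fb_interval:
        return True
    # Change-triggered: send early on a significant estimate change,
    # but never more often than t_min_fb_interval.
    changed = abs(A_hat - last_sent_A_hat) > change_threshold * last_sent_A_hat
    return changed and elapsed >= t_min_fb_interval

print(should_send_feedback(0.6, 0.0, 1e6, 1e6))  # True  (heartbeat due)
print(should_send_feedback(0.1, 0.0, 1e6, 1e6))  # False (nothing changed)
print(should_send_feedback(0.1, 0.0, 8e5, 1e6))  # True  (estimate dropped 20%)
```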
   The permissible values of these intervals will be bounded by the RTP
   session's RTCP bandwidth and its rtcp_frr setting.

   [TODO: Get some example values for these timers]

4.  Sender side control

   An additional congestion controller resides at the sending side.  It
   bases its decisions on the round-trip time, packet loss and
   available bandwidth estimates transmitted from the receiving side.

   The available bandwidth estimates produced by the receiving side are
   only reliable when the size of the queues along the channel is large
   enough.  If the queues are very short, over-use will only be visible
   through packet losses, which aren't used by the receiving side
   algorithm.

   This algorithm is run every time a receive report arrives at the
   sender, which will happen no more often than t_min_fb_interval, and
   no less often than t_max_fb_interval.  If no receive report is
   received within 2x t_max_fb_interval (indicating at least 2 lost
   feedback reports), the algorithm will take action as if all packets
   in the interval have been lost, resulting in a halving of the send
   rate.

   o  If 2-10% of the packets have been lost since the previous report
      from the receiver, the sender available bandwidth estimate As(i)
      (As denotes 'sender available bandwidth') will be kept unchanged.

   o  If more than 10% of the packets have been lost, a new estimate is
      calculated as As(i) = As(i-1)*(1-0.5p), where p is the loss
      ratio.
   o  As long as less than 2% of the packets have been lost, As(i) will
      be increased as As(i) = 1.05*(As(i-1)+1000)

   The new send-side estimate is limited by the TCP Friendly Rate
   Control formula [RFC3448] and the receive-side estimate of the
   available bandwidth A(i):

                                       8 s
      As(i) >= ---------------------------------------------------------
               R*sqrt(2*b*p/3) + (t_RTO*(3*sqrt(3*b*p/8)*p*(1+32*p^2)))

      As(i) <= A(i)

   where b is the number of packets acknowledged by a single TCP
   acknowledgement (set to 1 per TFRC recommendations), t_RTO is the
   TCP retransmission timeout value in seconds (set to 4*R), and s is
   the average packet size in bytes.  R is the round-trip time in
   seconds.

   (The multiplication by 8 comes about because TFRC computes bandwidth
   in bytes, while this document computes bandwidth in bits.)

   In words: the sender-side estimate will never be larger than the
   receiver-side estimate, and will never be lower than the estimate
   from the TFRC formula.

   We motivate the packet loss thresholds by noting that if the
   transmission channel has a small amount of packet loss due to over-
   use, that amount will soon increase if the sender does not adjust
   its bit rate.  Therefore we will soon enough exceed the 10%
   threshold and adjust As(i).  However, if the packet loss rate does
   not increase, the losses are probably not related to self-induced
   channel over-use, and therefore we should not react to them.

5.  Interoperability Considerations

   There are three scenarios of interest:

   o  Both parties implement the algorithms described here

   o  Sender implements the algorithm described in Section 4, recipient
      does not implement Section 3

   o  Recipient implements the algorithm in Section 3, sender does not
      implement Section 4.
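The sender-side rules of Section 4 can be sketched as follows.  The function names and the clamping order are our own; the TFRC formula is transcribed from the text above, with rates in bits per second:

```python
import math

def tfrc_rate(s, R, p, b=1):
    # TCP Friendly Rate Control equation [RFC3448] in bits per second
    # (hence the factor 8); t_RTO is set to 4*R as in the draft.
    t_rto = 4.0 * R
    return (8 * s) / (R * math.sqrt(2 * b * p / 3) +
                      t_rto * (3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2)))

def sender_estimate(As_prev, p, A_rcv, s=1200, R=0.1):
    # Loss-based update: hold at 2-10% loss, back off above 10%,
    # probe upward below 2%.
    if p > 0.10:
        As = As_prev * (1 - 0.5 * p)
    elif p < 0.02:
        As = 1.05 * (As_prev + 1000)
    else:
        As = As_prev
    # Clamp: never below the TFRC rate (when there is loss to compute
    # it from), never above the receiver estimate A_rcv.
    if p > 0:
        As = max(As, tfrc_rate(s, R, p))
    return min(As, A_rcv)

print(sender_estimate(1_000_000, p=0.20, A_rcv=2_000_000))  # backs off to ~900 kbps
print(sender_estimate(1_000_000, p=0.05, A_rcv=2_000_000))  # held at 1 Mbps
```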
   In the case where both parties implement the algorithms, we expect
   to see most of the congestion control response to slowly varying
   conditions happen by TMMBR/REMB messages from recipient to sender.
   At most times, the sender will send less than the congestion-
   inducing bandwidth limit C, and when it sends more, congestion will
   be detected before packets are lost.

   If sudden changes happen, packets will be lost, and the sender side
   control will trigger, limiting traffic until the congestion becomes
   low enough that the system switches back to the receiver-controlled
   state.

   In the case where only the sender implements its algorithm, we
   expect to see somewhat higher loss rates and delays, but the system
   will still be overall TCP friendly and self-adjusting; the governing
   term in the calculation will be the TFRC formula.

   In the case where the recipient implements this algorithm and the
   sender does not, congestion will be avoided for slow changes as long
   as the sender understands and obeys TMMBR/REMB; there will be no
   backoff for packet-loss-inducing changes in capacity.  Given that
   some kind of congestion control is mandatory for the sender
   according to the TMMBR spec, this case has to be reevaluated against
   the specific congestion control implemented by the sender.

6.  Implementation Experience

   This algorithm has been implemented in the open-source WebRTC
   project.

7.  Further Work

   This draft is offered as input to the congestion control discussion.

   Work that can be done on this basis includes:

   o  Consideration of timing info: It may be sensible to use the
      proposed TFRC RTP header extensions [I-D.gharai-avtcore-rtp-tfrc]
      to carry per-packet timing information, which would both give
      more data points and a timestamp applied closer to the network
      interface.
      This draft includes consideration of using the transmission time
      offset defined in [RFC5450].

   o  Considerations of cross-channel calculation: If all packets in
      multiple streams follow the same path over the network,
      congestion or queueing information should be considered across
      all packets between two parties, not just per media stream.  A
      feedback message (REMB) that may be suitable for such a purpose
      is given in [I-D.alvestrand-rmcat-remb].

   o  Considerations of cross-channel balancing: The decision to slow
      down sending in a situation with multiple media streams should be
      taken across all media streams, not per stream.

   o  Considerations of additional input: How and where packet loss
      detected at the recipient can be added to the algorithm.

   o  Considerations of locus of control: Whether the sender or the
      recipient is in the best position to figure out which media
      streams it makes sense to slow down, and therefore whether one
      should use TMMBR to slow down one channel, signal an overall
      bandwidth change and let the sender make the decision, or signal
      the (possibly processed) delay info and let the sender run the
      algorithm.

   o  Considerations of over-bandwidth estimation: Whether we can use
      the estimate of how much we're over bandwidth in Section 3 to
      influence how much we reduce the bandwidth, rather than using a
      fixed factor.

   o  Startup considerations: It's unreasonable to assume that just
      starting at full rate is always the best strategy.

   o  Dealing with sender traffic shaping, which delays the sending of
      packets.  Using send-time timestamps rather than RTP timestamps
      may be useful here, but as long as the sender's traffic shaping
      does not spread out packets more than the bottleneck link, it
      should not matter.

   o  Stability considerations.
      It is not clear how to show that the algorithm cannot enter an
      oscillating state, either alone or when competing with other
      algorithms / flows.

   These are matters for further work; since some of them involve
   extensions that have not yet been standardized, this could take some
   time.

8.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

9.  Security Considerations

   An attacker with the ability to insert or remove messages on the
   connection will, of course, have the ability to mess up rate
   control, causing people to send either too fast or too slow, and
   causing congestion.

   In this case, the control information is carried inside RTP, and can
   be protected against modification or message insertion using SRTP,
   just as for the media.  Given that timestamps are carried in the RTP
   header, which is not encrypted, this is not protected against
   disclosure, but it seems hard to mount an attack based on timing
   information only.

10.  Acknowledgements

   Thanks to Randell Jesup, Magnus Westerlund, Varun Singh, Tim Panton,
   Soo-Hyun Choo, Jim Gettys, Ingemar Johansson, Michael Welzl and
   others for providing valuable feedback on earlier versions of this
   draft.

11.  References

11.1.  Normative References

   [I-D.alvestrand-rmcat-remb]
              Alvestrand, H., "RTCP message for Receiver Estimated
              Maximum Bitrate", draft-alvestrand-rmcat-remb-01 (work in
              progress), July 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3448]  Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 3448, January 2003.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, February 2008.

   [RFC5450]  Singer, D. and H. Desineni, "Transmission Time Offsets in
              RTP Streams", RFC 5450, March 2009.

11.2.  Informative References

   [I-D.gharai-avtcore-rtp-tfrc]
              Gharai, L. and C. Perkins, "RTP with TCP Friendly Rate
              Control", draft-gharai-avtcore-rtp-tfrc-01 (work in
              progress), September 2011.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41,
              RFC 2914, September 2000.

Appendix A.  Change log

A.1.  Version -00 to -01

   o  Added change log

   o  Added appendix outlining new extensions

   o  Added a section on when to send feedback to the end of section
      3.3 "Rate control", and defined min/max FB intervals.

   o  Added size of over-bandwidth estimate usage to "further work"
      section.

   o  Added startup considerations to "further work" section.

   o  Added sender-delay considerations to "further work" section.

   o  Filled in acknowledgements section from mailing list discussion.

A.2.  Version -01 to -02

   o  Defined the term "frame", incorporating the transmission time
      offset into its definition, and removed references to "video
      frame".

   o  Referred to "m(i)" from the text to make the derivation clearer.

   o  Made it clearer that we modify our estimates of available
      bandwidth, and not the true available bandwidth.

   o  Removed the appendixes outlining new extensions, added pointers
      to REMB draft and RFC 5450.

A.3.  Version -02 to -03

   o  Added a section on how to process multiple streams in a single
      estimator using RTP timestamp to NTP time conversion.

   o  Stated in the introduction that the draft is aimed at the RMCAT
      working group.
Authors' Addresses

   Henrik Lundin
   Google
   Kungsbron 2
   Stockholm  11122
   Sweden


   Stefan Holmer
   Google
   Kungsbron 2
   Stockholm  11122
   Sweden

   Email: holmer@google.com


   Harald Alvestrand (editor)
   Google
   Kungsbron 2
   Stockholm  11122
   Sweden

   Email: harald@alvestrand.no