Network Working Group                                          S. Holmer
Internet-Draft                                                 H. Lundin
Intended status: Informational                                    Google
Expires: March 11, 2016                                      G. Carlucci
                                                             L. De Cicco
                                                              S. Mascolo
                                                      Politecnico di Bari
                                                        September 8, 2015


    A Google Congestion Control Algorithm for Real-Time Communication
                          draft-ietf-rmcat-gcc-00

Abstract

   This document describes two methods of congestion control when
   using real-time communications on the World Wide Web (RTCWEB): one
   delay-based and one loss-based.

   It is published as an input document to the RMCAT working group on
   congestion control for media streams.  The mailing list of that
   working group is rmcat@ietf.org.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on March 11, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Mathematical notation conventions
   2.  System model
   3.  Feedback and extensions
   4.  Delay-based control
     4.1.  Arrival-time model
     4.2.  Arrival-time filter
     4.3.  Over-use detector
     4.4.  Rate control
     4.5.  Parameter settings
   5.  Loss-based control
   6.  Interoperability Considerations
   7.  Implementation Experience
   8.  Further Work
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Change log
     A.1.  Version -00 to -01
     A.2.  Version -01 to -02
     A.3.  Version -02 to -03
     A.4.  rtcweb-03 to rmcat-00
     A.5.  rmcat -00 to -01
     A.6.  rmcat -01 to -02
     A.7.  rmcat -02 to -03
   Authors' Addresses

1.  Introduction

   Congestion control is a requirement for all applications sharing
   the Internet resources [RFC2914].
   Congestion control for real-time media is challenging for a number
   of reasons:

   o  The media is usually encoded in forms that cannot be quickly
      changed to accommodate varying bandwidth, and bandwidth
      requirements can often be changed only in discrete, rather large
      steps.

   o  The participants may have specific preferences for how to
      respond, which may not be to reduce the bandwidth required by
      the flow on which congestion is discovered.

   o  The encodings are usually sensitive to packet loss, while the
      real-time requirement precludes the repair of packet loss by
      retransmission.

   This memo describes two congestion control algorithms that together
   are able to provide good performance and reasonable bandwidth
   sharing with other video flows using the same congestion control
   and with TCP flows that share the same links.

   The signaling used consists of experimental RTP header extensions
   and RTCP messages [RFC3550], as defined in [abs-send-time],
   [I-D.alvestrand-rmcat-remb] and
   [I-D.holmer-rmcat-transport-wide-cc-extensions].

1.1.  Mathematical notation conventions

   The mathematics of this document have been transcribed from a more
   formula-friendly format.

   The following notational conventions are used:

   X_bar    The variable X, where X is a vector - conventionally
            marked by a bar on top of the variable name.

   X_hat    An estimate of the true value of variable X -
            conventionally marked by a circumflex accent on top of the
            variable name.

   X(i)     The "i"th value of vector X - conventionally marked by a
            subscript i.

   [x y z]  A row vector consisting of elements x, y and z.

   X_bar^T  The transpose of vector X_bar.

   E{X}     The expected value of the stochastic variable X.

2.  System model

   The following elements are in the system:

   o  RTP packet - an RTP packet containing media data.

   o  Packet group - a set of RTP packets transmitted from the sender,
      uniquely identified by the group departure and group arrival
      times (absolute send time) [abs-send-time].  These could be
      video packets, audio packets, or a mix of audio and video
      packets.

   o  Incoming media stream - a stream of frames consisting of RTP
      packets.

   o  RTP sender - sends the RTP stream over the network to the RTP
      receiver.  It generates the RTP timestamp and the abs-send-time
      header extension.

   o  RTP receiver - receives the RTP stream and marks the time of
      arrival.

   o  RTCP sender at RTP receiver - sends receiver reports, REMB
      messages and transport-wide RTCP feedback messages.

   o  RTCP receiver at RTP sender - receives receiver reports, REMB
      messages and transport-wide RTCP feedback messages, and reports
      these to the sender-side controller.

   o  RTCP receiver at RTP receiver - receives sender reports from the
      sender.

   o  Loss-based controller - takes loss rate measurements, round-trip
      time measurements and REMB messages, and computes a target
      sending bitrate.

   o  Delay-based controller - takes the packet arrival info, either
      at the RTP receiver or from the feedback received by the RTP
      sender, and computes a maximum bitrate which it passes to the
      loss-based controller.

   Together, the loss-based controller and the delay-based controller
   implement the congestion control algorithm.

3.  Feedback and extensions

   There are two ways to implement the proposed algorithm.
   One where both controllers run at the send-side, and one where the
   delay-based controller runs on the receive-side and the loss-based
   controller runs on the send-side.

   The first version can be realized by using a per-packet feedback
   protocol as described in
   [I-D.holmer-rmcat-transport-wide-cc-extensions].  Here, the RTP
   receiver will record the arrival time and the transport-wide
   sequence number of each received packet, which will be sent back
   to the sender periodically using the transport-wide feedback
   message.  The RECOMMENDED feedback interval is once per received
   video frame, or at least once every 30 ms if audio-only or
   multi-stream.  If the feedback overhead needs to be limited, this
   interval can be increased to 100 ms.

   The sender will map the received {sequence number, arrival time}
   pairs to the send-time of each packet covered by the feedback
   report, and feed those timestamps to the delay-based controller.
   It will also compute a loss ratio based on the sequence numbers in
   the feedback message.

   The second version can be realized by having a delay-based
   controller at the receive-side, monitoring and processing the
   arrival time and size of incoming packets.  The sender SHOULD use
   the abs-send-time RTP header extension [abs-send-time] to enable
   the receiver to compute the inter-group delay variation.  The
   output from the delay-based controller will be a bitrate, which
   will be sent back to the sender using the REMB feedback message
   [I-D.alvestrand-rmcat-remb].  The packet loss ratio is sent back
   via RTCP receiver reports.  At the sender the bitrate in the REMB
   message and the fraction of packets lost are fed into the
   loss-based controller, which outputs a final target bitrate.  It
   is RECOMMENDED to send the REMB message as soon as congestion is
   detected, and otherwise at least once every second.

4.  Delay-based control

   The delay-based control algorithm can be further decomposed into
   three parts: an arrival-time filter, an over-use detector, and a
   rate controller.

4.1.  Arrival-time model

   This section describes an adaptive filter that continuously updates
   estimates of network parameters based on the timing of the received
   packets.

   We define the inter-arrival time, t(i) - t(i-1), as the difference
   in arrival time of two packets or two groups of packets.
   Correspondingly, the inter-departure time, T(i) - T(i-1), is
   defined as the difference in departure time of two packets or two
   groups of packets.  Finally, the inter-group delay variation, d(i),
   is defined as the difference between the inter-arrival time and the
   inter-departure time, or, interpreted differently, as the
   difference between the delay of group i and that of group i-1:

      d(i) = t(i) - t(i-1) - (T(i) - T(i-1))

   At the receiving side we are observing groups of incoming packets,
   where a group of packets is defined as follows:

   o  A sequence of packets which are sent within a burst_time
      interval constitutes a group.  The RECOMMENDED value for
      burst_time is 5 ms.

   o  In addition, any packet which has an inter-arrival time less
      than burst_time and an inter-group delay variation d(i) less
      than 0 is also considered part of the current group of packets.
      The reasoning behind including these packets in the group is to
      better handle delay transients, caused by packets being queued
      up for reasons unrelated to congestion.
      As an example, this has been observed to happen on many Wi-Fi
      and wireless networks.

   An inter-departure time is computed between consecutive groups as
   T(i) - T(i-1), where T(i) is the departure timestamp of the last
   packet in the current packet group being processed.  Any packets
   received out of order are ignored by the arrival-time model.

   Each group is assigned a receive time t(i), which corresponds to
   the time at which the last packet of the group was received.  A
   group is delayed relative to its predecessor if
   t(i) - t(i-1) > T(i) - T(i-1), i.e., if the inter-arrival time is
   larger than the inter-departure time.

   Since the time ts to send a group of packets of size L over a path
   with a capacity of C is roughly

      ts = L/C

   we can model the inter-group delay variation as:

      d(i) = L(i)/C(i) - L(i-1)/C(i-1) + w(i)

             L(i) - L(i-1)
           = ------------- + w(i) = dL(i)/C(i) + w(i)
                  C(i)

   Here, w(i) is a sample from a stochastic process W, which is a
   function of the capacity C(i), the current cross traffic, and the
   current sent bitrate.  C is modeled as being constant as we expect
   it to vary more slowly than other parameters of this model.  We
   model W as a white Gaussian process.  If we are over-using the
   channel we expect the mean of w(i) to increase, and if a queue on
   the network path is being emptied, the mean of w(i) will decrease;
   otherwise the mean of w(i) will be zero.

   Breaking out the mean, m(i), from w(i) to make the process zero
   mean, we get

   Equation 1

      d(i) = dL(i)/C(i) + m(i) + v(i)

   This is our fundamental model, where we take into account that a
   large group of packets needs more time to traverse the link than a
   small group, thus arriving with higher relative delay.  The noise
   term represents network jitter and other delay effects not
   captured by the model.

4.2.  Arrival-time filter

   The parameters d(i) and dL(i) are readily available for each group
   of packets, i > 1, and we want to estimate C(i) and m(i) and use
   those estimates to detect whether or not the bottleneck link is
   over-used.  These parameters can be estimated by any adaptive
   filter - we are using the Kalman filter.

   Let

      theta_bar(i) = [1/C(i)  m(i)]^T

   and call it the state at time i.  We model the state evolution from
   time i to time i+1 as

      theta_bar(i+1) = theta_bar(i) + u_bar(i)

   where u_bar(i) is the state noise that we model as a stationary
   process with Gaussian statistics, zero mean, and covariance

      Q(i) = E{u_bar(i) * u_bar(i)^T}

   Q(i) is RECOMMENDED to be a diagonal matrix with main diagonal
   elements:

      diag(Q(i)) = [10^-13  10^-3]^T

   Given equation 1 we get

      d(i) = h_bar(i)^T * theta_bar(i) + v(i)

      h_bar(i) = [dL(i)  1]^T

   where v(i) is zero-mean white Gaussian measurement noise with
   variance var_v(i) = sigma_v(i)^2.

   The Kalman filter recursively updates our estimate

      theta_hat(i) = [1/C_hat(i)  m_hat(i)]^T

   as

      z(i) = d(i) - h_bar(i)^T * theta_hat(i-1)

      theta_hat(i) = theta_hat(i-1) + z(i) * k_bar(i)

                              ( E(i-1) + Q(i) ) * h_bar(i)
      k_bar(i) = ------------------------------------------------------
                 var_v_hat(i) + h_bar(i)^T * (E(i-1) + Q(i)) * h_bar(i)

      E(i) = (I - k_bar(i) * h_bar(i)^T) * (E(i-1) + Q(i))

   where I is the 2-by-2 identity matrix.

   The variance var_v(i) = sigma_v(i)^2 is estimated using an
   exponential averaging filter, modified for variable sampling rate:

      var_v_hat(i) = max(beta * var_v_hat(i-1) + (1-beta) * z(i)^2, 1)

      beta = (1-chi)^(30/(1000 * f_max))

   where f_max = max {1/(T(j) - T(j-1))} for j in i-K+1,...,i is the
   highest rate at which the last K packet groups have been received,
   and chi is a filter coefficient typically chosen as a number in
   the interval [0.001, 0.1].  Since our assumption that v(i) should
   be zero-mean WGN is less accurate in some cases, we have
   introduced an additional outlier filter around the updates of
   var_v_hat.  If z(i) > 3*sqrt(var_v_hat) the filter is updated with
   3*sqrt(var_v_hat) rather than z(i).  For instance, v(i) will not
   be white in situations where packets are sent at a higher rate
   than the channel capacity, in which case they will be queued
   behind each other.
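
   As a non-normative illustration, the filter update above can be
   sketched in Python as follows.  The initial values chosen for
   theta_hat and var_v_hat, and the exact bookkeeping of f_max, are
   not specified by this document and are assumptions of the sketch;
   the 2-by-2 matrix operations are written out explicitly.

   # Non-normative sketch of the arrival-time Kalman filter
   # (Section 4.2).  State: theta_hat = [1/C_hat, m_hat]; E is the
   # 2x2 estimate error covariance.  Initial values below are
   # assumptions of this sketch, not recommendations of this document.
   class ArrivalTimeFilter:
       def __init__(self):
           self.theta_hat = [1e-3, 0.0]          # assumed initial [1/C_hat, m_hat]
           self.E = [[100.0, 0.0], [0.0, 0.1]]   # diag(E(0)) = [100 0.1]^T
           self.q = [1e-13, 1e-3]                # diag(Q(i)) = [10^-13 10^-3]^T
           self.var_v_hat = 50.0                 # assumed initial noise variance
           self.chi = 0.001                      # within the range [0.001, 0.1]

       def update(self, d_i, dL_i, f_max):
           """d_i: inter-group delay variation; dL_i: group size delta;
           f_max: highest rate at which the last K groups were received."""
           h = [dL_i, 1.0]
           # Innovation: z(i) = d(i) - h_bar(i)^T * theta_hat(i-1)
           z = d_i - (h[0] * self.theta_hat[0] + h[1] * self.theta_hat[1])

           # Exponential averaging filter for var_v_hat, with the outlier
           # clamp: |z| is limited to 3*sqrt(var_v_hat) before squaring.
           beta = (1.0 - self.chi) ** (30.0 / (1000.0 * f_max))
           z_sq = min(z * z, 9.0 * self.var_v_hat)
           self.var_v_hat = max(beta * self.var_v_hat + (1.0 - beta) * z_sq, 1.0)

           # M = E(i-1) + Q(i)
           M = [[self.E[0][0] + self.q[0], self.E[0][1]],
                [self.E[1][0], self.E[1][1] + self.q[1]]]

           # Kalman gain: k_bar = M * h / (var_v_hat + h^T * M * h)
           Mh = [M[0][0] * h[0] + M[0][1] * h[1],
                 M[1][0] * h[0] + M[1][1] * h[1]]
           denom = self.var_v_hat + h[0] * Mh[0] + h[1] * Mh[1]
           k = [Mh[0] / denom, Mh[1] / denom]

           # theta_hat(i) = theta_hat(i-1) + z(i) * k_bar(i)
           self.theta_hat[0] += z * k[0]
           self.theta_hat[1] += z * k[1]

           # E(i) = (I - k_bar * h_bar^T) * (E(i-1) + Q(i))
           A = [[1.0 - k[0] * h[0], -k[0] * h[1]],
                [-k[1] * h[0], 1.0 - k[1] * h[1]]]
           self.E = [[A[0][0] * M[0][0] + A[0][1] * M[1][0],
                      A[0][0] * M[0][1] + A[0][1] * M[1][1]],
                     [A[1][0] * M[0][0] + A[1][1] * M[1][0],
                      A[1][0] * M[0][1] + A[1][1] * M[1][1]]]

           return self.theta_hat[1]   # m_hat(i)

   The returned offset estimate m_hat(i) is the input to the over-use
   detector of Section 4.3.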
4.3.  Over-use detector

   The offset estimate m(i), obtained as the output of the
   arrival-time filter, is compared with a threshold gamma_1(i).  An
   estimate above the threshold is considered an indication of
   over-use.  Such an indication alone is not enough for the detector
   to signal over-use to the rate control subsystem.  A definitive
   over-use will be signaled only if over-use has been detected for
   at least gamma_2 milliseconds.  However, if m(i) < m(i-1),
   over-use will not be signaled even if all the above conditions are
   met.  Similarly, the opposite state, under-use, is detected when
   m(i) < -gamma_1(i).  If neither over-use nor under-use is
   detected, the detector will be in the normal state.

   The threshold gamma_1 has a significant impact on the overall
   dynamics and performance of the algorithm.  In particular, it has
   been shown that, using a static threshold gamma_1, a flow
   controlled by the proposed algorithm can be starved by a
   concurrent TCP flow [Pv13].  This starvation can be avoided by
   increasing the threshold gamma_1 to a sufficiently large value.

   The reason is that, by using a larger value of gamma_1, a larger
   queuing delay can be tolerated, whereas with a small gamma_1 the
   over-use detector quickly reacts to a small increase in the offset
   estimate m(i) by generating an over-use signal that reduces the
   delay-based estimate of the available bandwidth A_hat (see
   Section 4.4).  Thus, it is necessary to dynamically tune the
   threshold gamma_1 to get good performance in the most common
   scenarios, such as when competing with loss-based flows.

   For this reason, we propose to vary the threshold gamma_1(i)
   according to the following dynamic equation:

   gamma_1(i) = gamma_1(i-1) + (t(i)-t(i-1)) * K(i) * (|m(i)|-gamma_1(i-1))

   with K(i) = K_d if |m(i)| < gamma_1(i-1), and K(i) = K_u otherwise.
   The rationale is to increase gamma_1(i) when m(i) is outside of the
   range [-gamma_1(i-1), gamma_1(i-1)], whereas, when the offset
   estimate m(i) falls back into the range, gamma_1 is decreased.  In
   this way, when m(i) increases, for instance due to a TCP flow
   entering the same bottleneck, gamma_1(i) increases and avoids the
   uncontrolled generation of over-use signals which may lead to
   starvation of the flow controlled by the proposed algorithm
   [Pv13].  Moreover, gamma_1(i) SHOULD NOT be updated if this
   condition holds:

      |m(i)| - gamma_1(i) > 15

   It is also RECOMMENDED to clamp gamma_1(i) to the range [6, 600],
   since too small a gamma_1(i) can cause the detector to become
   overly sensitive.

   On the other hand, when m(i) falls back into the range
   [-gamma_1(i-1), gamma_1(i-1)], the threshold gamma_1(i) is
   decreased so that a lower queuing delay can be achieved.

   It is RECOMMENDED to choose K_u > K_d so that the rate at which
   gamma_1 is increased is higher than the rate at which it is
   decreased.  With this setting it is possible to increase the
   threshold in the case of a concurrent TCP flow, preventing
   starvation as well as enforcing intra-protocol fairness.
   RECOMMENDED values for gamma_1(0), gamma_2, K_u and K_d are 12.5
   ms, 10 ms, 0.01 and 0.00018, respectively.
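
   The detector and threshold adaptation described in this section can
   be summarized by the following non-normative Python sketch.  The
   bookkeeping of how long an over-use has persisted
   (time_over_using), and its reset behaviour on normal and under-use
   signals, are assumptions of the sketch rather than requirements of
   this document.

   # Non-normative sketch of the over-use detector with adaptive
   # threshold (Section 4.3).  All times are in milliseconds.
   class OveruseDetector:
       def __init__(self):
           self.gamma_1 = 12.5         # gamma_1(0)
           self.gamma_2 = 10.0         # over-use must last at least this long
           self.K_u = 0.01
           self.K_d = 0.00018
           self.prev_m = 0.0
           self.time_over_using = 0.0  # assumed bookkeeping variable
           self.signal = 'normal'

       def detect(self, m, t_delta):
           """m: offset estimate m(i); t_delta: t(i) - t(i-1)."""
           if m > self.gamma_1:
               self.time_over_using += t_delta
               # Signal over-use only if it lasted at least gamma_2 ms and
               # the offset estimate is not decreasing.
               if self.time_over_using >= self.gamma_2 and m >= self.prev_m:
                   self.signal = 'over-use'
           elif m < -self.gamma_1:
               self.time_over_using = 0.0
               self.signal = 'under-use'
           else:
               self.time_over_using = 0.0
               self.signal = 'normal'

           self._adapt_threshold(m, t_delta)
           self.prev_m = m
           return self.signal

       def _adapt_threshold(self, m, t_delta):
           # Do not update gamma_1 when |m(i)| - gamma_1(i) > 15.
           if abs(m) - self.gamma_1 > 15.0:
               return
           K = self.K_d if abs(m) < self.gamma_1 else self.K_u
           self.gamma_1 += t_delta * K * (abs(m) - self.gamma_1)
           # Clamp gamma_1 to [6, 600] to avoid an overly sensitive detector.
           self.gamma_1 = min(max(self.gamma_1, 6.0), 600.0)

   The detector output ('over-use', 'under-use' or 'normal') drives
   the rate control state machine of Section 4.4.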
4.4.  Rate control

   The rate control is split into two parts, one controlling the
   bandwidth estimate based on delay, and one controlling the
   bandwidth estimate based on loss.  Both are designed to increase
   the estimate of the available bandwidth A_hat as long as there is
   no detected congestion, and to ensure that we will eventually
   match the available bandwidth of the channel and detect an
   over-use.

   As soon as over-use has been detected, the available bandwidth
   estimated by the delay-based controller is decreased.  In this way
   we get a recursive and adaptive estimate of the available
   bandwidth.

   In this document we make the assumption that the rate control
   subsystem is executed periodically and that this period is
   constant.

   The rate control subsystem has 3 states: Increase, Decrease and
   Hold.  "Increase" is the state when no congestion is detected;
   "Decrease" is the state where congestion is detected; and "Hold"
   is a state that waits until built-up queues have drained before
   going to the Increase state.

   The state transitions (with blank fields meaning "remain in
   state") are:

   +----+--------+-----------+------------+--------+
   |  \   State  |   Hold    |  Increase  |Decrease|
   |   \         |           |            |        |
   | Signal\     |           |            |        |
   +--------+----+-----------+------------+--------+
   |  Over-use   |  Decrease |  Decrease  |        |
   +-------------+-----------+------------+--------+
   |   Normal    |  Increase |            |  Hold  |
   +-------------+-----------+------------+--------+
   |  Under-use  |           |    Hold    |  Hold  |
   +-------------+-----------+------------+--------+

   The subsystem starts in the Increase state, where it will stay
   until over-use or under-use has been detected by the detector
   subsystem.  On every update the delay-based estimate of the
   available bandwidth is increased, either multiplicatively or
   additively, depending on how close the estimate appears to be to
   convergence.

   The system does a multiplicative increase if the current bandwidth
   estimate appears to be far from convergence, while it does an
   additive increase if it appears to be closer to convergence.  We
   assume that we are close to convergence if the currently incoming
   bitrate, R_hat(i), is close to an average of the incoming bitrates
   at the times when we have previously been in the Decrease state.
   "Close" is defined as three standard deviations around this
   average.  It is RECOMMENDED to measure this average and standard
   deviation with an exponential moving average with the smoothing
   factor 0.95, as it is expected that this average covers multiple
   occasions at which we are in the Decrease state.
   Whenever valid estimates of these statistics are not available, we
   assume that we have not yet come close to convergence and
   therefore remain in the multiplicative increase state.

   If R_hat(i) increases above three standard deviations of the
   average max bitrate, we assume that the current congestion level
   has changed, at which point we reset the average max bitrate and
   go back to the multiplicative increase state.

   R_hat(i) is the incoming bitrate measured by the delay-based
   controller over a T seconds window:

      R_hat(i) = 1/T * sum(L(j)) for j from 1 to N(i)

   N(i) is the number of packets received during the past T seconds
   and L(j) is the payload size of packet j.  A window between 0.5
   and 1 second is RECOMMENDED.

   During multiplicative increase, the estimate is increased by at
   most 8% per second.

      eta = 1.08^min(time_since_last_update_ms / 1000, 1.0)
      A_hat(i) = eta * A_hat(i-1)

   During additive increase, the estimate is increased by at most
   half a packet per response_time interval.  The response_time
   interval is estimated as the round-trip time plus 100 ms, where
   the 100 ms accounts for the reaction time of the over-use
   estimator and detector.

      response_time_ms = 100 + rtt_ms
      beta = 0.5 * min(time_since_last_update_ms / response_time_ms, 1.0)
      A_hat(i) = A_hat(i-1) + max(1000, beta * expected_packet_size_bits)

   expected_packet_size_bits is used to get a slightly slower slope
   for the additive increase at lower bitrates.  It can for instance
   be computed from the current bitrate by assuming a frame rate of
   30 frames per second:

      bits_per_frame = A_hat(i-1) / 30
      packets_per_frame = ceil(bits_per_frame / (1200 * 8))
      expected_packet_size_bits = bits_per_frame / packets_per_frame

   Since the system depends on over-using the channel to verify the
   current available bandwidth estimate, we must make sure that our
   estimate does not diverge from the rate at which the sender is
   actually sending.  Thus, if the sender is unable to produce a bit
   stream with the bitrate the congestion controller is asking for,
   the available bandwidth estimate should stay within a given bound.
   Therefore we introduce a threshold

      A_hat(i) < 1.5 * R_hat(i)

   When an over-use is detected, the system transitions to the
   Decrease state, where the delay-based available bandwidth estimate
   is decreased to a factor times the currently incoming bitrate.

      A_hat(i) = alpha * R_hat(i)

   alpha is typically chosen in the interval [0.8, 0.95]; 0.85 is the
   RECOMMENDED value.

   When the detector signals under-use to the rate control subsystem,
   we know that queues in the network path are being emptied,
   indicating that our available bandwidth estimate A_hat is lower
   than the actual available bandwidth.  Upon that signal the rate
   control subsystem will enter the Hold state, where the
   receive-side available bandwidth estimate will be held constant
   while waiting for the queues to stabilize at a lower level - a way
   of keeping the delay as low as possible.  This decrease of delay
   is desired, and expected, immediately after the estimate has been
   reduced due to over-use, but can also happen if the cross traffic
   over some links is reduced.

   It is RECOMMENDED that the routine to update A_hat(i) is run at
   least once every response_time interval.
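
   The delay-based rate controller of this section is summarized by
   the non-normative Python sketch below.  The start bitrate and the
   particular form of the exponential moving average used for the
   average max bitrate and its variance are assumptions of this
   sketch.

   # Non-normative sketch of the delay-based rate controller
   # (Section 4.4).  Bitrates in bits per second, times in ms.
   import math

   class DelayBasedRateController:
       def __init__(self, start_bitrate=300000.0):   # start bitrate is assumed
           self.A_hat = start_bitrate
           self.state = 'increase'
           self.avg_max_R = None   # EWMA of R_hat measured at decrease time
           self.var_max_R = 0.0    # and its variance (smoothing factor 0.95)
           self.alpha = 0.85

       def _transition(self, signal):
           # State transition table of Section 4.4 (blank fields = remain).
           if signal == 'over-use':
               self.state = 'decrease'
           elif signal == 'under-use':
               self.state = 'hold'
           elif signal == 'normal':
               if self.state == 'hold':
                   self.state = 'increase'
               elif self.state == 'decrease':
                   self.state = 'hold'

       def _close_to_convergence(self, R_hat):
           if self.avg_max_R is None:
               return False
           return abs(R_hat - self.avg_max_R) <= 3.0 * math.sqrt(self.var_max_R)

       def update(self, signal, R_hat, rtt_ms, delta_ms):
           """signal: detector output; R_hat: measured incoming bitrate;
           delta_ms: time since the last update of A_hat."""
           self._transition(signal)

           if self.state == 'increase':
               if (self.avg_max_R is not None and
                       R_hat > self.avg_max_R + 3.0 * math.sqrt(self.var_max_R)):
                   self.avg_max_R = None   # congestion level changed: reset
               if self._close_to_convergence(R_hat):
                   # Additive increase: at most half a packet per response_time.
                   response_time_ms = 100.0 + rtt_ms
                   beta = 0.5 * min(delta_ms / response_time_ms, 1.0)
                   bits_per_frame = self.A_hat / 30.0
                   packets_per_frame = math.ceil(bits_per_frame / (1200 * 8))
                   expected_packet_size_bits = bits_per_frame / packets_per_frame
                   self.A_hat += max(1000.0, beta * expected_packet_size_bits)
               else:
                   # Multiplicative increase: at most 8% per second.
                   eta = 1.08 ** min(delta_ms / 1000.0, 1.0)
                   self.A_hat *= eta
               # Keep the estimate from diverging from the actual send rate.
               self.A_hat = min(self.A_hat, 1.5 * R_hat)

           elif self.state == 'decrease':
               self.A_hat = self.alpha * R_hat
               # Update the average max bitrate statistics (smoothing 0.95).
               if self.avg_max_R is None:
                   self.avg_max_R, self.var_max_R = R_hat, 0.0
               else:
                   err = R_hat - self.avg_max_R
                   self.avg_max_R = 0.95 * self.avg_max_R + 0.05 * R_hat
                   self.var_max_R = 0.95 * self.var_max_R + 0.05 * err * err

           # In the 'hold' state A_hat is kept constant.
           return self.A_hat

   The resulting A_hat is passed as an upper bound to the loss-based
   controller described in Section 5.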
4.5.  Parameter settings

   +------------+-------------------------------------+----------------+
   | Parameter  | Description                         | RECOMMENDED    |
   |            |                                     | Value          |
   +------------+-------------------------------------+----------------+
   | burst_time | Time limit in milliseconds between  | 5 ms           |
   |            | packet bursts which identifies a    |                |
   |            | group                               |                |
   | Q          | State noise covariance matrix       | diag(Q(i)) =   |
   |            |                                     | [10^-13        |
   |            |                                     |  10^-3]^T      |
   | E(0)       | Initial value of the system error   | diag(E(0)) =   |
   |            | covariance                          | [100 0.1]^T    |
   | chi        | Coefficient used for the measured   | [0.001, 0.1]   |
   |            | noise variance                      |                |
   | gamma_1(0) | Initial value for the adaptive      | 12.5 ms        |
   |            | threshold                           |                |
   | gamma_2    | Time required to trigger an overuse | 10 ms          |
   |            | signal                              |                |
   | K_u        | Coefficient for the adaptive        | 0.01           |
   |            | threshold                           |                |
   | K_d        | Coefficient for the adaptive        | 0.00018        |
   |            | threshold                           |                |
   | T          | Time window for measuring the       | [0.5, 1] s     |
   |            | received bitrate                    |                |
   | alpha      | Decrease rate factor                | 0.85           |
   +------------+-------------------------------------+----------------+

      Table 1: RECOMMENDED values for the delay-based controller

5.  Loss-based control

   A second part of the congestion controller bases its decisions on
   the round-trip time, packet loss and available bandwidth estimates
   A_hat received from the delay-based controller.  The available
   bandwidth estimates computed by the loss-based controller are
   denoted As_hat.

   The available bandwidth estimates A_hat produced by the
   delay-based controller are only reliable when the queues along the
   path are sufficiently large.  If the queues are very short,
   over-use will only be visible through packet losses, which are not
   used by the delay-based controller.

   The loss-based controller SHOULD run every time feedback from the
   receiver is received.

   o  If 2-10% of the packets have been lost since the previous
      report from the receiver, the sender available bandwidth
      estimate As_hat(i) will be kept unchanged.

   o  If more than 10% of the packets have been lost, a new estimate
      is calculated as As_hat(i) = As_hat(i-1)*(1-0.5p), where p is
      the loss ratio.

   o  As long as less than 2% of the packets have been lost,
      As_hat(i) will be increased as As_hat(i) = 1.05*As_hat(i-1).

   The new bandwidth estimate is lower-bounded by the TCP Friendly
   Rate Control formula [RFC3448] and upper-bounded by the
   delay-based estimate of the available bandwidth A_hat(i), where
   the delay-based estimate has precedence:

                                8 s
   As_hat(i) >= ---------------------------------------------------------
                R*sqrt(2*b*p/3) + (t_RTO*(3*sqrt(3*b*p/8)*p*(1+32*p^2)))

   As_hat(i) <= A_hat(i)

   where b is the number of packets acknowledged by a single TCP
   acknowledgment (set to 1 per TFRC recommendations), t_RTO is the
   TCP retransmission timeout value in seconds (set to 4*R), s is the
   average packet size in bytes, and R is the round-trip time in
   seconds.

   (The multiplication by 8 comes from TFRC computing bandwidth in
   bytes, while this document computes bandwidth in bits.)

   In words: the loss-based estimate will never be larger than the
   delay-based estimate, and it will never be lower than the estimate
   from the TFRC formula unless the delay-based estimate is lower
   than the TFRC estimate.
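
   As a non-normative illustration, one update of the loss-based
   controller could be sketched in Python as follows.  Treating the
   TFRC bound as absent when no losses have been reported is an
   assumption of this sketch.

   # Non-normative sketch of the loss-based controller (Section 5).
   # Bitrates in bits per second; R in seconds; s in bytes.
   import math

   def tfrc_bitrate(s, R, p, b=1):
       """TCP Friendly Rate Control formula [RFC3448], in bits per second."""
       if p <= 0.0:
           return 0.0        # no loss reported: no meaningful lower bound
       t_RTO = 4.0 * R
       return (8.0 * s) / (R * math.sqrt(2.0 * b * p / 3.0) +
                           t_RTO * (3.0 * math.sqrt(3.0 * b * p / 8.0) *
                                    p * (1.0 + 32.0 * p ** 2)))

   def update_loss_based(As_hat_prev, A_hat, p, R, s):
       """p: loss ratio since the last report; A_hat: delay-based estimate."""
       if p > 0.10:
           As_hat = As_hat_prev * (1.0 - 0.5 * p)   # heavy loss: back off
       elif p < 0.02:
           As_hat = 1.05 * As_hat_prev              # negligible loss: increase
       else:
           As_hat = As_hat_prev                     # 2-10% loss: hold

       # Lower-bound by TFRC and upper-bound by the delay-based estimate,
       # with the delay-based estimate taking precedence.
       As_hat = max(As_hat, tfrc_bitrate(s, R, p))
       return min(As_hat, A_hat)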
   We motivate the packet loss thresholds by noting that if the
   transmission channel has a small amount of packet loss due to
   over-use, that amount will soon increase if the sender does not
   adjust its bitrate.  Therefore we will soon enough reach above the
   10% threshold and adjust As_hat(i).  However, if the packet loss
   ratio does not increase, the losses are probably not related to
   self-inflicted congestion and therefore we should not react to
   them.

6.  Interoperability Considerations

   In case a sender implementing these algorithms talks to a receiver
   which does not implement any of the proposed RTCP messages and RTP
   header extensions, it is suggested that the sender monitors RTCP
   receiver reports and uses the fraction of lost packets and the
   round-trip time as input to the loss-based controller.  The
   delay-based controller should be left disabled.

7.  Implementation Experience

   This algorithm has been implemented in the open-source WebRTC
   project, has been in use in Chrome since M23, and is being used by
   Google Hangouts.

   Deployment of the algorithm has revealed problems related to,
   e.g., congested or otherwise problematic Wi-Fi networks, which
   have led to algorithm improvements.  The algorithm has also been
   tested in a multi-party conference scenario with a conference
   server which terminates the congestion control between endpoints.
   This ensures that the congestion control makes no assumptions
   about maximum send and receive bitrates, etc., which are typically
   outside the control of a conference server.

8.  Further Work

   This draft is offered as input to the congestion control
   discussion.

   Work that can be done on this basis includes:

   o  Considerations of integrated loss control: how loss and delay
      control can be better integrated, and the loss control
      improved.

   o  Considerations of locus of control: evaluate the performance of
      having all congestion control logic at the sender, compared to
      splitting logic between sender and receiver.

   o  Considerations of utilizing ECN as a signal for congestion
      estimation and link over-use detection.

9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as
   an RFC.

10.  Security Considerations

   An attacker with the ability to insert or remove messages on the
   connection would have the ability to disrupt rate control.  This
   could cause the algorithm to produce either a sending rate that
   under-utilizes the bottleneck link capacity, or an excessively
   high sending rate that causes network congestion.

   In this case, the control information is carried inside RTP, and
   can be protected against modification or message insertion using
   SRTP, just as for the media.  Given that timestamps are carried in
   the RTP header, which is not encrypted, the timing information is
   not protected against disclosure, but it seems hard to mount an
   attack based on timing information only.

11.  Acknowledgements

   Thanks to Randell Jesup, Magnus Westerlund, Varun Singh, Tim
   Panton, Soo-Hyun Choo, Jim Gettys, Ingemar Johansson, Michael
   Welzl and others for providing valuable feedback on earlier
   versions of this draft.

12.  References

12.1.  Normative References

   [I-D.alvestrand-rmcat-remb]
              Alvestrand, H., "RTCP message for Receiver Estimated
              Maximum Bitrate", draft-alvestrand-rmcat-remb-03 (work
              in progress), October 2013.
   [I-D.holmer-rmcat-transport-wide-cc-extensions]
              Holmer, S., Flodman, M., and E. Sprang, "RTP Extensions
              for Transport-wide Congestion Control", draft-holmer-
              rmcat-transport-wide-cc-extensions-00 (work in
              progress), March 2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3448]  Handley, M., Floyd, S., Padhye, J., and J. Widmer, "TCP
              Friendly Rate Control (TFRC): Protocol Specification",
              RFC 3448, January 2003.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [abs-send-time]
              "RTP Header Extension for Absolute Sender Time",
              <http://www.webrtc.org/experiments/rtp-hdrext/
              abs-send-time>.

12.2.  Informative References

   [Pv13]     De Cicco, L., Carlucci, G., and S. Mascolo,
              "Understanding the Dynamic Behaviour of the Google
              Congestion Control", Packet Video Workshop, December
              2013.

   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41, RFC
              2914, September 2000.

Appendix A.  Change log

A.1.  Version -00 to -01

   o  Added change log.

   o  Added appendix outlining new extensions.

   o  Added a section on when to send feedback to the end of
      Section 3.3 "Rate control", and defined min/max FB intervals.

   o  Added size of over-bandwidth estimate usage to "further work"
      section.

   o  Added startup considerations to "further work" section.

   o  Added sender-delay considerations to "further work" section.

   o  Filled in acknowledgments section from mailing list discussion.

A.2.  Version -01 to -02

   o  Defined the term "frame", incorporating the transmission time
      offset into its definition, and removed references to "video
      frame".

   o  Referred to "m(i)" from the text to make the derivation
      clearer.

   o  Made it clearer that we modify our estimates of available
      bandwidth, and not the true available bandwidth.

   o  Removed the appendixes outlining new extensions, added pointers
      to the REMB draft and RFC 5450.

A.3.  Version -02 to -03

   o  Added a section on how to process multiple streams in a single
      estimator using RTP timestamps to NTP time conversion.

   o  Stated in the introduction that the draft is aimed at the RMCAT
      working group.

A.4.  rtcweb-03 to rmcat-00

   Renamed draft to link the draft name to the RMCAT WG.

A.5.  rmcat -00 to -01

   Spellcheck.  Otherwise no changes, this is a "keepalive" release.

A.6.  rmcat -01 to -02

   o  Added Luca De Cicco and Saverio Mascolo as authors.

   o  Extended the "Over-use detector" section with new technical
      details on how to dynamically tune the offset gamma_1 for
      improved fairness properties.

   o  Added reference to a paper analyzing the behavior of the
      proposed algorithm.

A.7.  rmcat -02 to -03

   o  Swapped receiver-side/sender-side controller with delay-based/
      loss-based controller as there is no longer a requirement to
      run the delay-based controller on the receive-side.

   o  Removed the discussion about multiple streams and transmission
      time offsets.

   o  Introduced a new section about "Feedback and extensions".

   o  Improvements to the threshold adaptation in the "Over-use
      detector" section.

   o  Swapped the previous MIMD rate control algorithm for a new AIMD
      rate control algorithm.
Authors' Addresses

   Stefan Holmer
   Google
   Kungsbron 2
   Stockholm  11122
   Sweden

   Email: holmer@google.com


   Henrik Lundin
   Google
   Kungsbron 2
   Stockholm  11122
   Sweden


   Gaetano Carlucci
   Politecnico di Bari
   Via Orabona, 4
   Bari  70125
   Italy

   Email: gaetano.carlucci@poliba.it


   Luca De Cicco
   Politecnico di Bari
   Via Orabona, 4
   Bari  70125
   Italy

   Email: l.decicco@poliba.it


   Saverio Mascolo
   Politecnico di Bari
   Via Orabona, 4
   Bari  70125
   Italy

   Email: mascolo@poliba.it