Network Working Group                                           X. Zhu
Internet Draft                                                  R. Pan
Intended Status: Informational                           Cisco Systems
Expires: March 15, 2014                             September 11, 2013


      NADA: A Unified Congestion Control Scheme for Real-Time Media
                         draft-zhu-rmcat-nada-02


Abstract

   This document describes a scheme named network-assisted dynamic
   adaptation (NADA), a novel congestion control approach for
   interactive real-time media applications, such as video conferencing.
   In the proposed scheme, the sender regulates its sending rate based
   on either implicit or explicit congestion signaling, in a unified
   approach. The scheme can benefit from explicit congestion
   notification (ECN) markings from network nodes. It also maintains
   consistent sender behavior in the absence of such markings, by
   reacting to queuing delays and packet losses instead.

   We present here the overall system architecture, recommended
   behaviors at the sender and the receiver, as well as expected network
   node operations. Results from extensive simulation studies of the
   proposed scheme are available upon request.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.
   All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Terminology
   3. System Model
   4. Network Node Operations
     4.1 Default behavior of drop tail
     4.2 ECN marking
     4.3 PCN marking
     4.4 Comments and Discussions
   5. Receiver Behavior
     5.1 Monitoring per-packet statistics
     5.2 Calculating time-smoothed values
     5.3 Sending periodic feedback
   6. Sender Behavior
     6.1 Video encoder rate control
     6.2 Rate shaping buffer
     6.3 Reference rate calculator
     6.4 Video target rate and sending rate calculator
     6.5 Slow-start behavior
   7. Incremental Deployment
   8. Implementation Status
   9. IANA Considerations
   10. References
     10.1 Normative References
     10.2 Informative References
   Authors' Addresses

1. Introduction

   Interactive real-time media applications introduce a unique set of
   challenges for congestion control. Unlike TCP, the mechanism used for
   real-time media needs to adapt quickly to instantaneous bandwidth
   changes, accommodate fluctuations in the output of video encoder rate
   control, and cause low queuing delay over the network. An ideal
   scheme should also make effective use of all types of congestion
   signals, including packet losses, queuing delay, and explicit
   congestion notification (ECN) markings.

   Based on the above considerations, we present a scheme named network-
   assisted dynamic adaptation (NADA). The proposed design benefits from
   explicit congestion control signals (e.g., ECN markings) from the
   network, and remains functional when only implicit signals (delay or
   loss) are available. In addition, it supports weighted bandwidth
   sharing among competing video flows.

   This document describes the overall system architecture, recommended
   designs at the sender and receiver, as well as expected network node
   operations. The signaling mechanism consists of standard RTP
   timestamps [RFC3550] and standard RTCP feedback reports.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. System Model

   The system consists of the following elements:

   * Incoming media stream, in the form of consecutive raw video
     frames and audio samples;

   * Media encoder with rate control capabilities. It takes the
     incoming media stream and encodes it to an RTP stream at a
     target bit rate R_o. Note that the actual output rate from the
     encoder R_v may fluctuate randomly around R_o. Also, the encoder
     can only change its rate at rather coarse time intervals, on the
     order of seconds.

   * RTP sender, responsible for calculating the target bit rate
     R_o based on network congestion signals (delay or ECN marking
     reports from the receiver), and for regulating the actual
     sending rate R_s accordingly. A rate shaping buffer is employed
     to absorb the instantaneous difference between video encoder
     output rate R_v and sending rate R_s. The buffer size L_s,
     together with R_o, influences the calculation of R_s. The RTP
     sender also generates RTP timestamps in outgoing packets.

   * RTP receiver, responsible for measuring and estimating end-to-
     end delay d based on sender RTP timestamps. In the presence of
     packet losses and ECN markings, it also records the individual
     loss and marking events, and calculates the equivalent delay
     d_tilde that accounts for queuing delay, ECN marking, and packet
     losses. The receiver feeds such statistics back to the sender
     via periodic RTCP reports.

   * Network node, with several modes of operation. The system can
     work with the default behavior of a simple drop tail queue. It
     can also benefit from advanced AQM features such as RED-based
     ECN marking, and PCN marking using a token bucket algorithm.

In the following, we will elaborate on the respective operations at the
network node, the receiver, and the sender.

4. Network Node Operations

We consider three variations of queue management behavior at the network
node, leading to either implicit or explicit congestion signals.

4.1 Default behavior of drop tail

In a conventional network with drop tail or RED queues, congestion is
inferred from the estimation of end-to-end delay. No special action is
required at the network node.

Packet drops at the queue are detected at the receiver, and contribute
to the calculation of the equivalent delay d_tilde.

4.2 ECN marking

In this mode, the network node randomly marks the ECN field in the IP
packet header following the Random Early Detection (RED) algorithm
[RFC2309]. Calculation of the marking probability involves the following
steps:

* upon packet arrival, update smoothed queue size q_avg as:

   q_avg = alpha*q + (1-alpha)*q_avg.

Here, q is the instantaneous queue size. The smoothing parameter alpha
is a value between 0 and 1. A value of alpha=1 corresponds to performing
no smoothing at all.

* calculate marking probability p as:

   p = 0,                         if q_avg < q_lo;

              q_avg - q_lo
   p = p_max*--------------,      if q_lo <= q_avg < q_hi;
              q_hi - q_lo

   p = 1,                         if q_avg >= q_hi.

Here, q_lo and q_hi correspond to the low and high thresholds of queue
occupancy. The maximum marking probability is p_max.

The ECN marking events will contribute to the calculation of an
equivalent delay d_tilde at the receiver. No changes are required at the
sender.

4.3 PCN marking

As a more advanced feature, we also envision network nodes which support
PCN marking based on virtual queues. In such a case, the marking
probability of the ECN bit in the IP packet header is calculated as
follows:

* upon packet arrival, meter packet against token bucket (r,b);

* update token level b_tk;

* calculate the marking probability as:

   p = 0,                         if b-b_tk < b_lo;

               b-b_tk-b_lo
   p = p_max*--------------,      if b_lo <= b-b_tk < b_hi;
                b_hi-b_lo

   p = 1,                         if b-b_tk >= b_hi.

Here, the token bucket lower and upper limits are denoted by b_lo and
b_hi, respectively. The parameter b indicates the size of the token
bucket. The parameter r is chosen as r=gamma*C, where gamma<1 is the
target utilization ratio and C designates link capacity. The maximum
marking probability is p_max.

The ECN marking events will contribute to the calculation of an
equivalent delay d_tilde at the receiver. No changes are required at the
sender. The virtual queuing mechanism from the PCN marking algorithm
will lead to additional benefits such as zero standing queues.

4.4 Comments and Discussions

In all three flavors described above, the network queue operates with
the simple first-in-first-out (FIFO) principle. There is no need to
maintain per-flow state. Such a simple design ensures that the system
can scale easily with a large number of video flows and high link
capacity.

The sender behavior stays the same in the presence of all types of
congestion signals: delay, loss, ECN marking due to either RED/ECN or
PCN algorithms. This unified approach allows a graceful transition of
the scheme as the level of congestion in the network shifts dynamically
between different regimes.

5. Receiver Behavior

The role of the receiver is fairly straightforward. It is in charge of
four steps: a) monitoring end-to-end delay/loss/marking statistics on a
per-packet basis; b) aggregating all forms of congestion signals in
terms of the equivalent delay; c) calculating time-smoothed value of the
congestion signal; and d) sending periodic reports back to the sender.

5.1 Monitoring per-packet statistics

The receiver observes and estimates one-way delay d_n for the n-th
packet, ECN marking event 1_M, and packet loss event 1_L. Here, 1_M and
1_L are binary indicators: a value of 1 corresponds to a marked or
lost packet, and a value of 0 indicates no marking or loss.

The equivalent delay d_tilde is calculated as follows:

   d_tilde = d_n + 1_M d_M + 1_L d_L,

where d_M is a prescribed fictitious delay value corresponding to the
ECN marking event (e.g., d_M = 200 ms), and d_L is a prescribed
fictitious delay value corresponding to the packet loss event (e.g., d_L
= 1 second). By introducing a large fictitious delay penalty for ECN
marking and packet losses, our proposed scheme leads to low end-to-end
actual delays in the presence of such events.

5.2 Calculating time-smoothed values

The receiver smoothes its observations via exponential averaging:

   x_n = alpha*d_tilde + (1-alpha)*x_n-1.

The weighting parameter alpha adjusts the level of smoothing.

5.3 Sending periodic feedback

Periodically, the receiver sends back the updated value of x in RTCP
messages, to aid the sender in its calculation of target rate. The size
of acknowledgement packets is typically on the order of tens of bytes,
and is significantly smaller than average video packet sizes.
Therefore, the bandwidth overhead of the receiver acknowledgement stream
is sufficiently low.

6. Sender Behavior

                --------------------
                |                  |
                |  Reference Rate  | <--------- RTCP report
                |  Calculator      |
                |                  |
                --------------------
                         |
                         | R_n
                         |
             --------------------------
             |                        |
             |                        |
            \ /                      \ /
    --------------------      -----------------
    |                  |      |               |
    |  Video Target    |      | Sending Rate  |
    |  Rate Calculator |      | Calculator    |
    |                  |      |               |
    --------------------      -----------------
        |        /|\             /|\        |
     R_v|         |               |         |
        |         -----------------         |
        |                     |             | R_s
  ------------                |L_s          |
  |          |                |             |
  |          |   R_o    --------------     \|/
  |  Encoder |---------->    | | | | | --------------->
  |          |               | | | | |   video packets
  ------------          --------------
                     Rate Shaping Buffer

                Figure 1 NADA Sender Structure

   Figure 1 provides a more detailed view of the NADA sender.
   Upon receipt of an RTCP report from the receiver, the NADA sender
   updates its calculation of the reference rate R_n as a function of
   the network congestion signal. It further adjusts both the target
   rate for the live video encoder R_v and the sending rate R_s over the
   network based on the updated value of R_n, as well as the size of
   the rate shaping buffer.

   The following sections describe these modules in further detail,
   and explain how they interact with each other.

6.1 Video encoder rate control

The video encoder rate control procedure has the following
characteristics:

* Rate changes can happen only at large intervals, on the order of
  seconds.

* Given a target rate R_o, the encoder output rate may randomly
  fluctuate around it.

* The encoder output rate is further constrained by video content
  complexity. The range of the final rate output is [R_min, R_max].
  Note that this range is content-dependent, and may change over time.

Note that operation of the live video encoder is out of the scope of our
design for a congestion control scheme in NADA. Instead, its behavior
is treated as a black box.

6.2 Rate shaping buffer

A rate shaping buffer is employed to absorb any instantaneous mismatch
between encoder rate output R_o and regulated sending rate R_s. The size
of the buffer evolves from time t-tau to time t as:

   L_s(t) = max [0, L_s(t-tau)+R_v*tau-R_s*tau].

A large rate shaping buffer contributes to higher end-to-end delay,
which may harm the performance of real-time media communications.
Therefore, the sender has a strong incentive to constrain the size of
the shaping buffer. It can either deplete it faster by increasing the
sending rate R_s, or limit its growth by reducing the target rate for
the video encoder rate control R_v.
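
The buffer recursion above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function and
parameter names are hypothetical, rates are taken in bits per second,
the buffer size in bits, and tau in seconds. The draft itself does not
prescribe units.

```python
def update_shaping_buffer(l_s_prev, encoder_rate, send_rate, tau):
    """One step of L_s(t) = max[0, L_s(t-tau) + R_v*tau - R_s*tau].

    l_s_prev     -- L_s(t-tau), buffer occupancy in bits (assumed unit)
    encoder_rate -- R_v in the formula, bits per second (assumed unit)
    send_rate    -- R_s, bits per second (assumed unit)
    tau          -- update interval in seconds
    """
    return max(0.0, l_s_prev + (encoder_rate - send_rate) * tau)


def simulate(rates, tau, l_s_start=0.0):
    """Apply the update to a sequence of (R_v, R_s) pairs.

    Returns the buffer occupancy after each step.
    """
    history = []
    l_s = l_s_start
    for encoder_rate, send_rate in rates:
        l_s = update_shaping_buffer(l_s, encoder_rate, send_rate, tau)
        history.append(l_s)
    return history
```

For instance, with tau = 0.5 s, an encoder that overshoots a 1.0 Mbps
sending rate by 0.2 Mbps for two intervals builds up 200 kbit of
backlog, which drains once the encoder output falls below the sending
rate. The max[0, .] clamp keeps the occupancy from going negative.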

6.3 Reference rate calculator

The sender calculates the reference rate R_n based on network congestion
information from receiver RTCP reports. It first compensates for the
effect of delayed observation by one round-trip time (RTT) via a linear
predictor:

                   x_n - x_n-1
   x_hat = x_n + ---------------*tau_o                        (1)
                      delta

In (1), the arrival interval between the (n-1)-th and the n-th packets
is designated by delta. The parameter tau_o indicates the reference
round-trip time, hence the prediction step size.

The reference rate is then calculated as:

                     R_max-R_min
   R_n = R_min + w*---------------*x_ref                      (2)
                        x_hat

Here, R_min and R_max denote the content-dependent rate range the
encoder can produce. The weight of priority level is w. The reference
congestion signal x_ref is chosen so that the maximum rate of R_max can
be achieved when x_hat = w*x_ref. Note that the combination of w and
x_ref determines how sensitive the rate adaptation scheme is in reaction
to fluctuations in observed signal x. The final target rate R_o is
clipped within the range of [R_min, R_max].

Note that the sender does not need any explicit knowledge of the
management scheme inside the network. Rather, it reacts to the
aggregation of all forms of congestion indications (delay, loss, and
marking) via the composite congestion signal x_n from the receiver in a
coherent manner.

6.4 Video target rate and sending rate calculator

The target rate for the live video encoder is updated based on both the
reference rate R_n and the rate shaping buffer size L_s, as follows:

                          L_s
   R_v = R_o - beta_v * -------.                              (3)
                         tau_v

Similarly, the outgoing rate is regulated based on both the reference
rate R_n and the rate shaping buffer size L_s, such that:

                          L_s
   R_s = R_o + beta_s * -------.                              (4)
                         tau_v

In (3) and (4), the first term indicates the rate calculated from
network congestion feedback alone. The second term indicates the
influence of the rate shaping buffer. A large rate shaping buffer nudges
the encoder target rate slightly below -- and the sending rate slightly
above -- the reference rate R_n. Intuitively, the amount of extra rate
offset needed to completely drain the rate shaping buffer within the
same time frame of encoder rate adaptation tau_v is given by L_s/tau_v.
The scaling parameters beta_v and beta_s can be tuned to balance between
the competing goals of maintaining a small rate shaping buffer and
limiting deviation of the system from the reference rate point.

6.5 Slow-start behavior

Finally, special care needs to be taken during the startup phase of a
video stream, since it may take several round-trip times before the
sender can collect statistically robust information on network
congestion. We propose to constrain the reference rate R_n to grow
linearly in the beginning, to no more than R_ss at time t:

                      t-t_0
   R_ss(t) = R_min + -------(R_max-R_min).
                        T

The start time of the stream is t_0, and T represents the time horizon
over which the slow-start mechanism is effective. The encoder target
rate is chosen to be the minimum of R_n and R_ss during the first T
seconds.

7. Incremental Deployment

One nice property of the proposed design is the consistent video end
point behavior irrespective of network node variations. This facilitates
gradual, incremental adoption of the scheme.

To start off with, the proposed encoder congestion control mechanism can
be implemented without any explicit support from the network, relying
solely on observed one-way delay measurements and packet loss ratios as
implicit congestion signals.
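
As an illustration of this delay-and-loss-only mode, the following
Python sketch computes the equivalent delay of Section 5.1, with the
loss penalty d_L gated by the loss indicator 1_L, and the exponential
average of Section 5.2, with the ECN indicator simply left at zero. It
is a sketch under assumptions, not a reference implementation: the
function names are hypothetical, the default penalties are the example
values quoted in Section 5.1 (d_M = 200 ms, d_L = 1 second), alpha = 0.1
is an arbitrary choice, and the handling of a lost packet (the caller
passes its latest delay estimate as d_n, since a lost packet yields no
delay sample of its own) is an assumption the text does not spell out.

```python
D_MARK = 0.2  # d_M in seconds; example value from Section 5.1
D_LOSS = 1.0  # d_L in seconds; example value from Section 5.1


def equivalent_delay(d_n, marked=False, lost=False,
                     d_mark=D_MARK, d_loss=D_LOSS):
    """d_tilde = d_n + 1_M*d_M + 1_L*d_L, with booleans as indicators."""
    return d_n + (d_mark if marked else 0.0) + (d_loss if lost else 0.0)


def smooth(x_prev, d_tilde, alpha=0.1):
    """Exponential average: alpha*d_tilde + (1-alpha)*x_prev."""
    return alpha * d_tilde + (1.0 - alpha) * x_prev


def run_receiver(observations, x_start, alpha=0.1):
    """Fold (delay, marked, lost) tuples into the smoothed signal x."""
    x = x_start
    for d_n, marked, lost in observations:
        x = smooth(x, equivalent_delay(d_n, marked, lost), alpha)
    return x
```

On a path without ECN support, every observation carries marked=False
and the same code path applies unchanged, which is the property that
makes incremental deployment possible.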

When ECN is enabled at the network nodes with RED-based marking, the
receiver can fold its observations of ECN markings into the calculation
of the equivalent delay. The sender can react to these explicit
congestion signals without any modification.

Ultimately, networks equipped with proactive marking based on token
bucket level metering can reap the additional benefits of zero standing
queues and lower end-to-end delay, while working seamlessly with
existing senders and receivers.

8. Implementation Status

The proposed NADA scheme has been implemented in the ns-2 simulation
platform [ns2]. Extensive simulation evaluations of the scheme are
documented in [Zhu-PV13].

A Linux-based testbed implementation is currently underway.

9. IANA Considerations

There are no actions for IANA.

10. References

10.1 Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

10.2 Informative References

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
              S., Wroclawski, J., and L. Zhang, "Recommendations on
              Queue Management and Congestion Avoidance in the
              Internet", RFC 2309, April 1998.

   [ns2]      "The Network Simulator - ns-2",
              http://www.isi.edu/nsnam/ns/

   [Zhu-PV13] Zhu, X., Pan, R., "NADA: A Unified Congestion Control
              Scheme for Low-Latency Interactive Video", IEEE
              International Packet Video Workshop (PV'13), 2013.
              Submitted.

Authors' Addresses

   Xiaoqing Zhu
   Cisco Systems,
   510 McCarthy Blvd,
   Milpitas, CA 95134, USA
   EMail: xiaoqzhu@cisco.com

   Rong Pan
   Cisco Systems,
   510 McCarthy Blvd,
   Milpitas, CA 95134, USA
   EMail: ropan@cisco.com