idnits 2.17.1 

draft-zhu-rmcat-nada-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** There is 1 instance of lines with control characters in the document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 278 has weird spacing: '...ons via  expon...'

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document date (September 11, 2013) is 3872 days in the past.  Is
     this intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC3168' is defined on line 498, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 2309
     (Obsoleted by RFC 7567)


     Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                             X. Zhu
3	Internet Draft                                                    R. Pan
4	Intended Status: Informational                             Cisco Systems
5	Expires: March 15, 2014                               September 11, 2013

7	     NADA: A Unified Congestion Control Scheme for Real-Time Media
8	                        draft-zhu-rmcat-nada-02

10	Abstract

12	   This document describes a scheme named network-assisted dynamic
13	   adaptation (NADA), a novel congestion control approach for
14	   interactive real-time media applications, such as video conferencing.
15	   In the proposed scheme, the sender regulates its sending rate based
16	   on either implicit or explicit congestion signaling, in a unified
17	   approach. The scheme can benefit from explicit congestion
18	   notification (ECN) markings from network nodes. It also maintains
19	   consistent sender behavior in the absence of such markings, by
20	   reacting to queuing delays and packet losses instead.

22	   We present here the overall system architecture, recommended
23	   behaviors at the sender and the receiver, as well as expected network
24	   node operations. Results from extensive simulation studies of the
25	   proposed scheme are available upon request.

27	Status of this Memo

29	   This Internet-Draft is submitted to IETF in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF), its areas, and its working groups.  Note that
34	   other groups may also distribute working documents as
35	   Internet-Drafts.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   The list of current Internet-Drafts can be accessed at
43	   http://www.ietf.org/1id-abstracts.html

45	   The list of Internet-Draft Shadow Directories can be accessed at
46	   http://www.ietf.org/shadow.html

48	Copyright and License Notice

50	   Copyright (c) 2012 IETF Trust and the persons identified as the
51	   document authors. All rights reserved.

53	   This document is subject to BCP 78 and the IETF Trust's Legal
54	   Provisions Relating to IETF Documents
55	   (http://trustee.ietf.org/license-info) in effect on the date of
56	   publication of this document. Please review these documents
57	   carefully, as they describe your rights and restrictions with respect
58	   to this document. Code Components extracted from this document must
59	   include Simplified BSD License text as described in Section 4.e of
60	   the Trust Legal Provisions and are provided without warranty as
61	   described in the Simplified BSD License.

63	Table of Contents

65	   1. Introduction
66	   2. Terminology
67	   3. System Model
68	   4. Network Node Operations
69	     4.1 Default behavior of drop tail
70	     4.2 ECN marking
71	     4.3 PCN marking
72	     4.4 Comments and Discussions
73	   5. Receiver Behavior
74	     5.1 Monitoring per-packet statistics
75	     5.2 Calculating time-smoothed values
76	     5.3 Sending periodic feedback
77	   6. Sender Behavior
78	     6.1 Video encoder rate control
79	     6.2 Rate shaping buffer
80	     6.3 Reference rate calculator
81	     6.4 Video target rate and sending rate calculator
82	     6.5 Slow-start behavior
83	   7. Incremental Deployment
84	   8. Implementation Status
85	   9. IANA Considerations
86	   10. References
87	     10.1  Normative References
88	     10.2  Informative References
89	   Authors' Addresses

91	1. Introduction

93	   Interactive real-time media applications introduce a unique set of
94	   challenges for congestion control. Unlike TCP, the mechanism used for
95	   real-time media needs to adapt fast to instantaneous bandwidth
96	   changes, accommodate fluctuations in the output of video encoder rate
97	   control, and cause low queuing delay over the network. An ideal
98	   scheme should also make effective use of all types of congestion
99	   signals, including packet losses, queuing delay, and explicit
100	   congestion notification (ECN) markings.

102	   Based on the above considerations, we present a scheme named network-
103	   assisted dynamic adaptation (NADA). The proposed design benefits from
104	   explicit congestion control signals (e.g., ECN markings) from the
105	   network, and remains compatible in the presence of implicit signals
106	   (delay or loss) only. In addition, it supports weighted bandwidth
107	   sharing among competing video flows.

109	   This document describes the overall system architecture,
110	   recommended designs at the sender and receiver, as well as expected
111	   network node operations. The signaling mechanism consists of
112	   standard RTP timestamp [RFC3550] and standard RTCP feedback reports.

114	2. Terminology

116	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
117	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
118	   document are to be interpreted as described in RFC 2119 [RFC2119].

120	3. System Model

122	   The system consists of the following elements:

124	        * Incoming media stream, in the form of consecutive raw video
125	        frames and audio samples;

127	        * Media encoder with rate control capabilities. It takes the
128	        incoming media stream and encodes it to an RTP stream at a
129	        target bit rate R_v. Note that the actual output rate from the
130	        encoder R_o may fluctuate randomly around R_v. Also, the encoder
131	        can only change its rate at rather coarse time intervals, on the
132	        order of seconds.

134	        * RTP sender, responsible for calculating the target bit rate
135	        R_v based on network congestion signals (delay or ECN marking
136	        reports from the receiver), and for regulating the actual
137	        sending rate R_s accordingly. A rate shaping buffer is employed
138	        to absorb the instantaneous difference between video encoder
139	        output rate R_o and sending rate R_s. The buffer size L_s,
140	        together with R_v, influences the calculation of R_s. The RTP
141	        sender also generates RTP timestamps in outgoing packets.

143	        * RTP receiver, responsible for measuring and estimating end-to-
144	        end delay d based on sender RTP timestamp. In the presence of
145	        packet losses and ECN markings, it also records the individual
146	        loss and marking events, and calculates the equivalent delay
147	        d_tilde that accounts for queuing delay, ECN marking, and packet
148	        losses. The receiver feeds such statistics back to the sender
149	        via periodic RTCP reports.

151	        * Network node, with several modes of operation. The system can
152	        work with the default behavior of a simple drop tail queue.  It
153	        can also benefit from advanced AQM features such as RED-based
154	        ECN marking, and PCN marking using a token bucket algorithm.

156	In the following, we will elaborate on the respective operations at the
157	network node, the receiver, and the sender.

159	4. Network Node Operations

161	We consider three variations of queue management behavior at the network
162	node, leading to either implicit or explicit congestion signals.

164	4.1 Default behavior of drop tail

166	In a conventional network with drop tail or RED queues, congestion is
167	inferred from the estimation of end-to-end delay. No special action is
168	required at the network node.

170	Packet drops at the queue are detected at the receiver, and contribute
171	to the calculation of the equivalent delay d_tilde.

173	4.2 ECN marking

175	In this mode, the network node randomly marks the ECN field in the IP
176	packet header following the Random Early Detection (RED) algorithm
177	[RFC2309]. Calculation of the marking probability involves the following
178	steps:

180	    * upon packet arrival, update smoothed queue size q_avg as:

182	                  q_avg = alpha*q + (1-alpha)*q_avg.

184	    The smoothing parameter alpha is a value between 0 and 1. A value of
185	    alpha=1 corresponds to performing no smoothing at all.

187	    * calculate marking probability p as:

189	        p = 0, if q_avg < q_lo;

191	                   q_avg - q_lo
192	        p = p_max*--------------, if q_lo <= q_avg < q_hi;
193	                   q_hi - q_lo

195	        p = 1, if q_avg >= q_hi.

197	Here, q_lo and q_hi correspond to the low and high thresholds of queue
198	occupancy. The maximum marking probability is p_max.
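
As an illustration only, the two steps above can be sketched in Python
as follows. The function names are hypothetical, and the sketch assumes
that the thresholds q_lo and q_hi are compared against the smoothed
queue size q_avg, in line with the numerator of the middle case.

```python
def update_avg_queue(q_avg, q, alpha):
    """Smoothed queue size: q_avg = alpha*q + (1-alpha)*q_avg."""
    return alpha * q + (1.0 - alpha) * q_avg


def red_mark_probability(q_avg, q_lo, q_hi, p_max):
    """Piecewise-linear marking probability (assumed to depend on q_avg)."""
    if q_avg < q_lo:
        return 0.0
    if q_avg < q_hi:
        # Linear ramp from 0 at q_lo up to p_max just below q_hi.
        return p_max * (q_avg - q_lo) / (q_hi - q_lo)
    return 1.0
```

With alpha = 1 the first function simply returns the instantaneous
queue size q, which matches the remark that alpha = 1 performs no
smoothing.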

200	The ECN marking events will contribute to the calculation of an
201	equivalent delay d_tilde at the receiver. No changes are required at the
202	sender.

204	4.3 PCN marking

206	As a more advanced feature, we also envision network nodes which support
207	PCN marking based on virtual queues. In such a case, the marking
208	probability of the ECN bit in the IP packet header is calculated as
209	follows:

211	    * upon packet arrival, meter packet against token bucket (r,b);

213	    * update token level b_tk;

215	    * calculate the marking probability as:

217	        p = 0, if b-b_tk < b_lo;

219	                    b-b_tk-b_lo
220	        p = p_max* --------------, if b_lo<= b-b_tk <b_hi;
221	                     b_hi-b_lo

223	        p = 1, if b-b_tk>=b_hi.

225	Here, the token bucket lower and upper limits are denoted by b_lo and
226	b_hi, respectively. The parameter b indicates the size of the token
227	bucket. The parameter r is chosen as r=gamma*C, where gamma<1 is the
228	target utilization ratio and C designates link capacity. The maximum
229	marking probability is p_max.
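
A minimal Python sketch of this marking rule is given below for
illustration. The draft does not spell out how the token level b_tk is
updated, so the refill-then-drain rule, the initially full bucket, and
all class and method names are assumptions of the sketch.

```python
class TokenBucketMarker:
    """Hypothetical marker driven by a token bucket (r, b)."""

    def __init__(self, r, b, b_lo, b_hi, p_max):
        self.r = r            # token fill rate, r = gamma * C
        self.b = b            # token bucket size
        self.b_lo = b_lo      # lower limit on the token deficit
        self.b_hi = b_hi      # upper limit on the token deficit
        self.p_max = p_max    # maximum marking probability
        self.b_tk = b         # assumption: the bucket starts full
        self.last = 0.0       # time of the previous update

    def on_packet(self, now, size):
        # Assumed metering rule: refill at rate r (capped at b),
        # then remove tokens for the arriving packet (floored at 0).
        self.b_tk = min(self.b, self.b_tk + self.r * (now - self.last))
        self.b_tk = max(0.0, self.b_tk - size)
        self.last = now
        return self.mark_probability()

    def mark_probability(self):
        deficit = self.b - self.b_tk
        if deficit < self.b_lo:
            return 0.0
        if deficit < self.b_hi:
            return self.p_max * (deficit - self.b_lo) / (self.b_hi - self.b_lo)
        return 1.0
```

Because tokens accumulate at r = gamma*C rather than at the full link
capacity C, the deficit b-b_tk starts to grow before the real queue
does, which is the virtual queue effect referred to in this section.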

231	The ECN marking events will contribute to the calculation of an
232	equivalent delay d_tilde at the receiver. No changes are required at the
233	sender. The virtual queuing mechanism from the PCN marking algorithm
234	will lead to additional benefits such as zero standing queues.

236	4.4 Comments and Discussions

238	In all three flavors described above, the network queue operates with
239	the simple first-in-first-out (FIFO) principle. There is no need to
240	maintain per-flow state. Such a simple design ensures that the system
241	can scale easily with a large number of video flows and high link
242	capacity.

244	The sender behavior stays the same in the presence of all types of
245	congestion signals: delay, loss, and ECN marking due to either the
246	RED/ECN or PCN algorithm. This unified approach allows a graceful
247	transition of the scheme as the level of congestion in the network
248	shifts dynamically between different regimes.

250	5. Receiver Behavior

252	The role of the receiver is fairly straightforward. It is in charge of
253	four steps: a) monitoring end-to-end delay/loss/marking statistics on a
254	per-packet basis; b) aggregating all forms of congestion signals in
255	terms of the equivalent delay; c) calculating a time-smoothed value of
256	the congestion signal; and d) sending periodic reports to the sender.

258	5.1 Monitoring per-packet statistics

260	The receiver observes and estimates one-way delay d_n for the n-th
261	packet, ECN marking event 1_M, and packet loss event 1_L. Here, 1_M and
262	1_L are binary indicators: a value of 1 indicates a marked or lost
263	packet, and a value of 0 indicates no marking or loss.

265	The equivalent delay d_tilde is calculated as follows:

267	                  d_tilde = d_n + 1_M d_M + 1_L d_L,

269	where d_M is a prescribed fictitious delay value corresponding to the
270	ECN marking event (e.g., d_M = 200 ms), and d_L is a prescribed
271	fictitious delay value corresponding to the packet loss event (e.g., d_L
272	= 1 second). By introducing a large fictitious delay penalty for ECN
273	marking and packet losses, our proposed scheme leads to low end-to-end
274	actual delays in the presence of such events.
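
The calculation can be sketched in Python as shown below. This is an
illustration rather than a reference implementation: the function name
is hypothetical, the penalty values are the examples quoted above, and
the sketch assumes that the loss penalty d_L is gated by the loss
indicator 1_L.

```python
D_M = 0.200  # example penalty for an ECN mark, in seconds (200 ms)
D_L = 1.000  # example penalty for a packet loss, in seconds


def equivalent_delay(d_n, marked, lost, d_m=D_M, d_l=D_L):
    """d_tilde = d_n + 1_M*d_M + 1_L*d_L, with boolean indicators."""
    penalty_mark = d_m if marked else 0.0
    penalty_loss = d_l if lost else 0.0
    return d_n + penalty_mark + penalty_loss
```

For example, a packet observed with 50 ms of one-way delay and an ECN
mark is treated as if it had experienced 250 ms of delay.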

276	5.2 Calculating time-smoothed values

278	The receiver smooths its observations via exponential averaging:

280	                 x_n = alpha*d_tilde + (1-alpha)*x_n-1.

282	The weighting parameter alpha adjusts the level of smoothing.
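
A short Python sketch of this filter follows. It assumes that the
right-hand side uses the previous smoothed value x_n-1, and it seeds the
filter with an initial value x0, which the draft does not specify. The
function names are hypothetical.

```python
def smooth(x_prev, d_tilde, alpha):
    """One filter update: alpha*d_tilde + (1-alpha)*x_prev."""
    return alpha * d_tilde + (1.0 - alpha) * x_prev


def smooth_all(samples, alpha, x0=0.0):
    """Run the filter over a sequence of d_tilde samples (x0 is assumed)."""
    x = x0
    history = []
    for d_tilde in samples:
        x = smooth(x, d_tilde, alpha)
        history.append(x)
    return history
```

A larger alpha tracks new samples more closely, while a smaller alpha
gives more weight to past observations.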

284	5.3 Sending periodic feedback

286	Periodically, the receiver sends back the updated value of x in RTCP
287	messages, to aid the sender in its calculation of target rate.  The size
288	of acknowledgement packets is typically on the order of tens of bytes,
289	and is significantly smaller than average video packet sizes.
290	Therefore, the bandwidth overhead of the receiver acknowledgement stream
291	is low.

293	6. Sender Behavior

295	                    --------------------
296	                    |                  |
297	                    |  Reference Rate  | <--------- RTCP report
298	                    |  Calculator      |
299	                    |                  |
300	                    --------------------
301	                            |
302	                            | R_n
303	                            |
304	                --------------------------
305	               |                          |
306	               |                          |
307	              \ /                        \ /
308	    --------------------           -----------------
309	    |                  |           |               |
310	    |  Video Target    |           | Sending Rate  |
311	    |  Rate Calculator |           | Calculator    |
312	    |                  |           |               |
313	    --------------------           -----------------
314	       |        /|\                   /|\      |
315	    R_v|         |                     |       |
316	       |         -----------------------       |
317	       |                     |                 | R_s
318	    ------------             |L_s              |
319	    |          |             |                 |
320	    |          |  R_o    --------------       \|/
321	    |  Encoder |---------->   | | | | | --------------->
322	    |          |              | | | | |     video packets
323	    ------------         --------------
324	                         Rate Shaping Buffer

326	                     Figure 1 NADA Sender Structure

328	    Figure 1 provides a more detailed view of the NADA sender. Upon
329	    receipt of an RTCP report from the receiver, the NADA sender updates
330	    its calculation of the reference rate R_n as a function of the
331	    network congestion signal. It further adjusts both the target rate
332	    for the live video encoder R_v and the sending rate R_s over the
333	    network based on the updated value of R_n, as well as the size of
334	    the rate shaping buffer.

336	    The following sections describe these modules in further detail,
337	    and explain how they interact with each other.

339	6.1 Video encoder rate control

341	The video encoder rate control procedure has the following
342	characteristics:

344	    * Rate changes can happen only at large intervals, on the order of
345	    seconds.

347	    * Given a target rate R_v, the encoder output rate R_o may randomly
348	    fluctuate around it.

350	    * The encoder output rate is further constrained by video content
351	    complexity. The range of the final rate output is [R_min, R_max].
352	    Note that this range is content-dependent, and may change over time.

354	Note that operation of the live video encoder is out of the scope of our
355	design for a congestion control scheme in NADA. Instead, its behavior
356	is treated as a black box.

358	6.2 Rate shaping buffer

360	A rate shaping buffer is employed to absorb any instantaneous mismatch
361	between encoder rate output R_o and regulated sending rate R_s. The size
362	of the buffer evolves from time t-tau to time t as:

364	             L_s(t) = max [0, L_s(t-tau)+R_o*tau-R_s*tau].

366	A large rate shaping buffer contributes to higher end-to-end delay,
367	which may harm the performance of real-time media communications.
368	Therefore, the sender has a strong incentive to constrain the size of
369	the shaping buffer. It can either deplete it faster by increasing the
370	sending rate R_s, or limit its growth by reducing the target rate for
371	the video encoder rate control R_v.
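
The buffer evolution can be illustrated with the following Python
sketch. The function and parameter names are hypothetical. Rates are
taken to be in bits per second and tau in seconds, so that L_s is in
bits; the draft does not fix the units.

```python
def update_buffer(l_s, encoder_output_rate, sending_rate, tau):
    """Buffer size after an interval tau; it can never go negative."""
    growth = (encoder_output_rate - sending_rate) * tau
    return max(0.0, l_s + growth)
```

When the encoder produces 1.2 Mbps while the sender drains 1.0 Mbps,
the buffer grows by about 20 kbit over 100 ms. Reversing the two rates
drains the buffer until it is empty.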

373	6.3 Reference rate calculator

375	The sender calculates the reference rate R_n based on network congestion
376	information from receiver RTCP reports. It first compensates for the
377	effect of delayed observation by one round-trip time (RTT) via a linear
378	predictor:

380	                        x_n - x_n-1
381	        x_hat = x_n + ---------------*tau_o       (1)
382	                            delta

384	In (1), the arrival interval between the (n-1)-th and the n-th packets
385	is designated by delta. The parameter tau_o indicates the reference
386	round-trip time, hence the prediction step size.

388	The reference rate is then calculated as:

390	                          R_max-R_min
391	        R_n = R_min + w*---------------*x_ref     (2)
392	                            x_hat

394	Here, R_min and R_max denote the content-dependent rate range the
395	encoder can produce. The weight of priority level is w. The reference
396	congestion signal x_ref is chosen so that the maximum rate of R_max can
397	be achieved when x_hat = w*x_ref. Note that the combination of w and
398	x_ref determines how sensitive the rate adaptation scheme is in reaction
399	to fluctuations in observed signal x. The final rate R_n is
400	clipped within the range of [R_min, R_max].
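
Equations (1) and (2) can be sketched in Python as shown below. The
function names are hypothetical. Applying the clipping directly to the
result of (2), and returning R_max when x_hat is not positive, are
assumptions made to keep the sketch well defined.

```python
def predict_signal(x_n, x_prev, delta, tau_o):
    """Equation (1): extrapolate the congestion signal by tau_o."""
    return x_n + (x_n - x_prev) / delta * tau_o


def reference_rate(x_hat, r_min, r_max, w, x_ref):
    """Equation (2), clipped to the range [r_min, r_max]."""
    if x_hat <= 0.0:
        # Assumption: no measurable congestion, so allow the maximum rate.
        return r_max
    r_n = r_min + w * (r_max - r_min) * x_ref / x_hat
    return min(r_max, max(r_min, r_n))
```

With w = 1 and x_hat equal to x_ref, the sketch returns R_max. When
x_hat doubles to 2*x_ref, the rate falls halfway between R_min and
R_max.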

402	Note that the sender does not need any explicit knowledge of the
403	management scheme inside the network. Rather, it reacts to the
404	aggregation of all forms of congestion indications (delay, loss, and
405	marking) via the composite congestion signal x_n from the receiver in a
406	coherent manner.

408	6.4 Video target rate and sending rate calculator

410	The target rate for the live video encoder is updated based on both the
411	reference rate R_n and the rate shaping buffer size L_s, as follows:

413	                               L_s
414	        R_v = R_n - beta_v * -------.       (3)
415	                              tau_v

417	Similarly, the outgoing rate is regulated based on both the reference
418	rate R_n and the rate shaping buffer size L_s, such that:

420	                               L_s
421	        R_s = R_n + beta_s * -------.       (4)
422	                              tau_v

424	In (3) and (4), the first term indicates the rate calculated from
425	network congestion feedback alone. The second term indicates the
426	influence of the rate shaping buffer. A large rate shaping buffer nudges
427	the encoder target rate slightly below -- and the sending rate slightly
428	above -- the reference rate R_n. Intuitively, the amount of extra rate
429	offset needed to completely drain the rate shaping buffer within the
430	same time frame of encoder rate adaptation tau_v is given by L_s/tau_v.
431	The scaling parameters beta_v and beta_s can be tuned to balance between
432	the competing goals of maintaining a small rate shaping buffer and
433	deviating the system from the reference rate point.
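
The two calculations can be illustrated with the Python sketch below.
It takes the first term in (3) and (4) to be the reference rate R_n, as
the explanation above describes. The function names are hypothetical.

```python
def encoder_target_rate(r_n, l_s, beta_v, tau_v):
    """Equation (3): back off the encoder as the buffer fills."""
    return r_n - beta_v * l_s / tau_v


def sending_rate(r_n, l_s, beta_s, tau_v):
    """Equation (4): send faster to drain the buffer."""
    return r_n + beta_s * l_s / tau_v
```

With an empty buffer both rates equal R_n. With R_n = 500 kbps,
L_s = 20 kbit, tau_v = 1 second and both scaling parameters set to 0.5,
the encoder target is 490 kbps and the sending rate is 510 kbps.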

435	6.5 Slow-start behavior

437	Finally, special care needs to be taken during the startup phase of a
438	video stream, since it may take several round-trip times before the
439	sender can collect statistically robust information on network
440	congestion. We propose to regulate the reference rate R_n to grow
441	linearly in the beginning, to no more than R_ss(t) at time t:

443	                           t-t_0
444	        R_ss(t) = R_min + -------(R_max-R_min).
445	                             T

447	The start time of the stream is t_0, and T represents the time horizon
448	over which the slow-start mechanism is effective. The encoder target
449	rate is chosen to be the minimum of R_n and R_ss during the first T
450	seconds.
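
A minimal Python sketch of the slow-start rule is shown below. The
function names are hypothetical, and the sketch assumes that the cap no
longer applies once T seconds have elapsed since t_0.

```python
def slow_start_cap(t, t_0, horizon, r_min, r_max):
    """R_ss(t): linear ramp from r_min at t_0 to r_max at t_0 + horizon."""
    return r_min + (t - t_0) / horizon * (r_max - r_min)


def startup_rate(r_n, t, t_0, horizon, r_min, r_max):
    """Minimum of R_n and R_ss(t) during the first `horizon` seconds."""
    if t - t_0 >= horizon:
        return r_n
    return min(r_n, slow_start_cap(t, t_0, horizon, r_min, r_max))
```

Halfway through the slow-start horizon the cap sits midway between
R_min and R_max, so a larger R_n is held back to that value.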

452	7. Incremental Deployment

454	One nice property of the proposed design is the consistent video
455	endpoint behavior irrespective of network node variations. This
456	facilitates gradual, incremental adoption of the scheme.

458	To start off with, the proposed encoder congestion control mechanism can
459	be implemented without any explicit support from the network, relying
460	solely on observed one-way delay measurements and packet loss ratios as
461	implicit congestion signals.

463	When ECN is enabled at the network nodes with RED-based marking, the
464	receiver can fold its observations of ECN markings into the calculation
465	of the equivalent delay. The sender can react to these explicit
466	congestion signals without any modification.

468	Ultimately, networks equipped with proactive marking based on token
469	bucket level metering can reap the additional benefits of zero standing
470	queues and lower end-to-end delay, while working seamlessly with
471	existing senders and receivers.

473	8. Implementation Status

475	The proposed NADA scheme has been implemented in the ns-2 simulation
476	platform [ns2]. Extensive simulation evaluations of the scheme are
477	documented in [Zhu-PV13].

479	A Linux-based testbed implementation is currently underway.

481	9. IANA Considerations

483	There are no actions for IANA.

485	10. References

487	10.1  Normative References

489	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
490	              Requirement Levels", BCP 14, RFC 2119, March 1997.

492	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
493	              Jacobson, "RTP: A Transport Protocol for Real-Time
494	              Applications", STD 64, RFC 3550, July 2003.

496	10.2  Informative References

498	   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
499	              of Explicit Congestion Notification (ECN) to IP",
500	              RFC 3168, September 2001.

502	   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
503	              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
504	              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
505	              S., Wroclawski, J., and L. Zhang, "Recommendations on
506	              Queue Management and Congestion Avoidance in the
507	              Internet", RFC 2309, April 1998.

509	   [ns2] "The Network Simulator - ns-2", http://www.isi.edu/nsnam/ns/

511	   [Zhu-PV13] Zhu, X., Pan, R., "NADA: A Unified Congestion Control
512	              Scheme for Low-Latency Interactive Video", IEEE
513	              International Packet Video Workshop (PV'13), 2013.
514	              Submitted.

516	Authors' Addresses

518	   Xiaoqing Zhu
519	   Cisco Systems,
520	   510 McCarthy Blvd,
521	   Milpitas, CA 95134, USA
522	   EMail: xiaoqzhu@cisco.com

524	   Rong Pan
525	   Cisco Systems
526	   510 McCarthy Blvd,
527	   Milpitas, CA 95134, USA
528	   Email: ropan@cisco.com