2	RTP Media Congestion Avoidance                             D. Hayes, Ed.
3	Techniques                                            University of Oslo
4	Internet-Draft                                                 S. Ferlin
5	Intended status: Experimental                 Simula Research Laboratory
6	Expires: April 13, 2015                                         M. Welzl
7	                                                      University of Oslo
8	                                                        October 10, 2014

10	   Shared Bottleneck Detection for Coupled Congestion Control for RTP
11	                                 Media.
12	                        draft-hayes-rmcat-sbd-00

14	Abstract

16	   This document describes a mechanism to detect whether end-to-end data
17	   flows share a common bottleneck.  It relies on summary statistics
18	   that are calculated by a data receiver based on continuous
19	   measurements and regularly fed to a grouping algorithm that runs
20	   wherever the knowledge is needed.  This mechanism complements the
21	   coupled congestion control mechanism in draft-welzl-rmcat-coupled-cc.

23	Status of this Memo

25	   This Internet-Draft is submitted in full conformance with the
26	   provisions of BCP 78 and BCP 79.

28	   Internet-Drafts are working documents of the Internet Engineering
29	   Task Force (IETF).  Note that other groups may also distribute
30	   working documents as Internet-Drafts.  The list of current Internet-
31	   Drafts is at http://datatracker.ietf.org/drafts/current/.

33	   Internet-Drafts are draft documents valid for a maximum of six months
34	   and may be updated, replaced, or obsoleted by other documents at any
35	   time.  It is inappropriate to use Internet-Drafts as reference
36	   material or to cite them other than as "work in progress."

38	   This Internet-Draft will expire on April 13, 2015.

40	Copyright Notice

42	   Copyright (c) 2014 IETF Trust and the persons identified as the
43	   document authors.  All rights reserved.

45	   This document is subject to BCP 78 and the IETF Trust's Legal
46	   Provisions Relating to IETF Documents
47	   (http://trustee.ietf.org/license-info) in effect on the date of
48	   publication of this document.  Please review these documents
49	   carefully, as they describe your rights and restrictions with respect
50	   to this document.  Code Components extracted from this document must
51	   include Simplified BSD License text as described in Section 4.e of
52	   the Trust Legal Provisions and are provided without warranty as
53	   described in the Simplified BSD License.

55	Table of Contents

57	   1.  Introduction
58	     1.1.  The signals
59	       1.1.1.  Packet Loss
60	       1.1.2.  Packet Delay
61	       1.1.3.  Path Lag
62	   2.  Definitions
63	     2.1.  Parameter Values
64	   3.  Mechanism
65	     3.1.  Key metrics and their calculation
66	       3.1.1.  Mean delay
67	       3.1.2.  Skewness Estimate
68	       3.1.3.  Variance Estimate
69	       3.1.4.  Oscillation Estimate
70	       3.1.5.  Packet loss
71	     3.2.  Flow Grouping
72	       3.2.1.  Flow Grouping Algorithm
73	       3.2.2.  Using the flow group signal
74	   4.  Measuring OWD
75	     4.1.  Time stamp resolution
76	   5.  Acknowledgements
77	   6.  IANA Considerations
78	   7.  Security Considerations
79	   8.  References
80	     8.1.  Normative References
81	     8.2.  Informative References
82	   Authors' Addresses

84	1.  Introduction

86	   In the Internet, it is not normally known if flows (e.g., TCP
87	   connections or UDP data streams) traverse the same bottlenecks.  Even
88	   flows that have the same sender and receiver may take different paths
89	   and share a bottleneck or not.  Flows that share a bottleneck link
90	   usually compete with one another for their share of the capacity.
91	   This competition has the potential to increase packet loss and
92	   delays.  This is especially relevant for interactive applications
93	   that communicate simultaneously with multiple peers (such as multi-
94	   party video).  For RTP media applications such as RTCWEB,
95	   [I-D.welzl-rmcat-coupled-cc] describes a scheme that combines the
96	   congestion controllers of flows in order to honor their priorities
97	   and avoid unnecessary packet loss as well as delay.  This mechanism
98	   relies on some form of Shared Bottleneck Detection (SBD); here, a
99	   measurement-based SBD approach is described.

101	1.1.  The signals

103	   The current Internet is unable to explicitly inform endpoints as to
104	   which flows share bottlenecks, so endpoints need to infer this from
105	   packet loss and packet delay.

107	1.1.1.  Packet Loss

109	   Packet loss is often a relatively rare signal.  On its own it is
110	   therefore of limited use for SBD; however, it is a valuable
111	   supplementary measure when it is more prevalent.

113	1.1.2.  Packet Delay

115	   End-to-end delay measurements include noise from every device along
116	   the path in addition to the delay perturbation at the bottleneck
117	   device.  The noise is often significantly increased if the round-trip
118	   time is used.  The cleanest signal is obtained by using One-Way-Delay
119	   (OWD).

121	   Measuring absolute OWD is difficult since it requires both the sender
122	   and receiver clocks to be synchronised.  However, since the
123	   statistics being collected are relative to the mean OWD, a relative
124	   OWD measurement is sufficient.  Clock drift is not usually
125	   significant over the time intervals used by this SBD mechanism (see
126	   [RFC6817] A.2 for a discussion on clock drift and OWD measurements).

128	   Each packet arriving at the bottleneck buffer may experience very
129	   different queue lengths, and therefore waiting times.  A single OWD
130	   sample therefore does not characterize the actual OWD of a path well.
131	   However, multiple OWD measurements do reflect the distribution of
132	   delays experienced at the bottleneck.

134	1.1.3.  Path Lag

136	   Flows that share a common bottleneck may traverse different paths,
137	   and these paths will often have different base delays.  This makes it
138	   difficult to correlate changes in delay or loss.  This technique uses
139	   the long term shape of the delay distribution as a base for
140	   comparison to counter this.

142	2.  Definitions

144	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
145	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
146	   document are to be interpreted as described in RFC 2119 [RFC2119].

148	   Acronyms used in this document:

150	      OWD -- One Way Delay

152	      RTT -- Round Trip Time

154	      SBD -- Shared Bottleneck Detection

156	   Conventions used in this document:

158	      T     --     the base time interval over which measurements are
159	                   made.

161	      N     --     the number of base time, T, intervals used in some
162	                   calculations.

164	      sum_T(...) --  summation of all the measurements of the variable
165	                   in parentheses taken over the interval T

167	      sum_N(...) --  summation of N terms of the variable in parentheses

169	      sum_NT(...) --  summation of all measurements taken over the
170	                   interval N*T

172	      E_T(...) --  the expectation or mean of the measurements of the
173	                   variable in parentheses over T

175	      E_N(...) --  The expectation or mean of the last N values of the
176	                   variable in parentheses

178	      max_T(...) --  the maximum recorded measurement of the variable in
179	                   parentheses taken over the interval T

181	      p_l, p_f, p_pdv, p_s, p_d, p_v --  various thresholds used in the
182	                   mechanism.

184	2.1.  Parameter Values

186	   Reference [Hayes-LCN14] uses T=350ms, N=50, p_l = 0.1, p_f = 0.2,
187	   p_pdv = 0.3, p_s = p_d = p_v = 0.2.  These are values that seem to
188	   work well over a wide range of practical Internet conditions.

190	3.  Mechanism

192	   The mechanism described in this document is based on the observation
193	   that the delay measurements of flows that share a common bottleneck
194	   have similar shape characteristics.  The shape of these
195	   characteristics is described using 3 key summary statistics:

197	      variance (estimate PDV, see Section 3.1.3)

199	      skewness (estimate skewest, see Section 3.1.2)

201	      oscillation (estimate freqest, see Section 3.1.4)

203	   Summary statistics help to address both the noise and the path lag
204	   problems by describing the general shape over a relatively long
205	   period of time.  This is sufficient for their application in coupled
206	   congestion control for RTP Media.  They can be signalled from a
207	   receiver, which measures the OWD and calculates the summary
208	   statistics, to a sender, which is the entity that is transmitting the
209	   media stream.  An RTP Media device may be both a sender and a
210	   receiver.  SBD can be performed at both the Sender and the Receiver.

212	                                  +----+
213	                                  | H2 |
214	                                  +----+
215	                                     |
216	                                     | L2
217	                                     |
218	                         +----+  L1  |  L3  +----+
219	                         | H1 |------|------| H3 |
220	                         +----+             +----+

222	       A network with 3 hosts (H1, H2, H3) and 3 links (L1, L2, L3).

224	                                 Figure 1

226	   In Figure 1, there are two possible cases for shared bottleneck
227	   detection: a sender-based and a receiver-based case.

229	   1.  Sender-based: consider a situation where host H1 sends media
230	       streams to hosts H2 and H3, and L1 is a shared bottleneck.  H2
231	       and H3 measure the OWD and calculate summary statistics, which
232	       they send to H1 every T. H1, having this knowledge, can determine
233	       the shared bottleneck and accordingly control the send rates.

235	   2.  Receiver-based: consider that H2 is also sending media to H3, and
236	       L3 is a shared bottleneck.  If H3 sends summary statistics to H1
237	       and H2, neither H1 nor H2 alone obtains enough knowledge to
238	       detect this shared bottleneck; H3 can, however, determine it by
239	       combining the summary statistics related to H1 and H2,
240	       respectively.  This case is applicable when send rates are
241	       controlled by the receiver; then, the signal from H3 to the
242	       senders contains the sending rate.

244	   A discussion of the required signaling for the receiver-based case is
245	   beyond the scope of this document.  For the sender-based case, the
246	   messages and their data format will be defined here in future
247	   versions of this document.  We envision that an initialization
248	   message from the sender to the receiver could specify which key
249	   metrics are requested out of a possibly extensible set (losscnt, PDV,
250	   skewest, freqest).  The grouping algorithm described in this document
251	   requires all four of these metrics, and receivers MUST be able to
252	   provide them, but future algorithms may be able to exploit other
253	   metrics (e.g. metrics based on explicit network signals).  Moreover,
254	   the initialization message could specify T, N, and the necessary
255	   resolution and precision (number of bits per field).

257	3.1.  Key metrics and their calculation

259	   Measurements are calculated over a base interval, T. T should be long
260	   enough to provide enough samples for a good estimate of skewness, but
261	   short enough so that a measure of the oscillation can be made from N
262	   of these estimates.  Reference [Hayes-LCN14] uses T = 350ms and N =
263	   50, which are values that seem to work well over a wide range of
264	   practical Internet conditions.

266	3.1.1.  Mean delay

268	   The mean delay is not a useful signal for comparisons; however, it is
269	   a base measure for the 3 summary statistics.  The mean delay,
270	   E_T(OWD), is the average one way delay measured over T.

272	   To facilitate the other calculations, the last N E_T(OWD) values will
273	   need to be stored in a cyclic buffer along with the moving average of
274	   E_T(OWD):

276	      E_N(E_T(OWD)) = sum_N(E_T(OWD)) / N
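
   The buffering described above can be sketched as follows.  This is
   a minimal illustration, not part of the mechanism's specification.
   The names MeanDelayTracker, add_sample and end_interval are
   hypothetical, and averaging over fewer than N values until the
   cyclic buffer has filled is an assumption.

```python
from collections import deque


class MeanDelayTracker:
    """Tracks E_T(OWD) per interval and the moving average E_N(E_T(OWD))."""

    def __init__(self, n):
        # Cyclic buffer holding the last N values of E_T(OWD).
        self.history = deque(maxlen=n)
        self.owd_sum = 0.0   # sum_T(OWD) for the current interval
        self.owd_count = 0   # number of OWD samples in the current interval

    def add_sample(self, owd):
        """Record one relative OWD measurement in the current interval."""
        self.owd_sum += owd
        self.owd_count += 1

    def end_interval(self):
        """Close the interval T and return (E_T(OWD), E_N(E_T(OWD)))."""
        if self.owd_count == 0:
            raise ValueError("no OWD samples in this interval")
        mean_t = self.owd_sum / self.owd_count
        self.history.append(mean_t)          # oldest value drops out when full
        self.owd_sum, self.owd_count = 0.0, 0
        # Assumption: divide by the number of stored values until N exist.
        mean_n = sum(self.history) / len(self.history)
        return mean_t, mean_n
```

   A caller would invoke add_sample() for every received packet and
   end_interval() once every T.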

278	3.1.2.  Skewness Estimate

280	   Skewness is difficult to calculate efficiently and accurately.
281	   Ideally, it would be calculated over all measurements for the
282	   entire period (N * T); however, this would require storing every
283	   delay measurement over the period.  Instead, an estimate is made over
284	   T using the previous calculation of E_T(OWD).  Comparisons are made
285	   using the mean of N skew estimates.

287	   The skewness is estimated using two counters, counting the number of
288	   one way delay samples above and below the mean:

290	      skewest = (sum_T(OWD < E_T(OWD)) - sum_T(OWD > E_T(OWD)))/num(OWD)

292	         where each comparison inside sum_T(...) contributes:

294	            if (OWD < E_T(OWD)) 1 else 0

296	            if (OWD > E_T(OWD)) 1 else 0

298	         skewest is a number between -1 and 1

300	      E_N(skewest) = sum_N(skewest) /N

302	   For implementation ease, E_T(OWD) is the mean delay of the previous T
303	   interval.  Care must be taken when implementing the comparisons to
304	   ensure that rounding does not bias skewest.
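
   The two-counter estimate can be sketched as follows.  This is an
   illustration only: the function name skew_estimate is hypothetical,
   and prev_mean stands for the E_T(OWD) of the previous interval, as
   suggested above for implementation ease.

```python
def skew_estimate(owd_samples, prev_mean):
    """Return skewest for one interval T.

    owd_samples: relative OWD measurements taken during the interval.
    prev_mean:   E_T(OWD) computed over the previous interval.
    """
    if not owd_samples:
        raise ValueError("no OWD samples in this interval")
    # Samples equal to the mean are counted in neither counter.
    below = sum(1 for owd in owd_samples if owd < prev_mean)
    above = sum(1 for owd in owd_samples if owd > prev_mean)
    return (below - above) / len(owd_samples)
```

   Using integer time stamps (for example, microseconds) for both the
   samples and the mean is one way to keep the strict comparisons free
   of rounding bias.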

306	3.1.3.  Variance Estimate

308	   Packet Delay Variation (PDV) ([RFC5481] and [ITU-Y1540]) is used as
309	   an estimator of the variance of the delay signal.  We define PDV as
310	   follows:

312	      PDV = (max_T(OWD) - E_T(OWD))

314	      E_N(PDV) = sum_N(PDV) /N

316	   This modifies PDV as outlined in [RFC5481] to provide a summary
317	   statistic version that best aids the grouping decisions of the
318	   algorithm (see [Hayes-LCN14] section IVB).
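
   A minimal sketch of the PDV calculation follows.  The function
   names are hypothetical, and it is an assumption of this sketch that
   E_T(OWD) is taken from the same interval as the maximum.

```python
def pdv(owd_samples):
    """PDV for one interval T: maximum OWD minus mean OWD."""
    mean_t = sum(owd_samples) / len(owd_samples)
    return max(owd_samples) - mean_t


def mean_of_last_n(values, n):
    """E_N(...): mean of the last N values (fewer if not yet available)."""
    recent = values[-n:]
    return sum(recent) / len(recent)
```

   Here mean_of_last_n(pdv_history, N) would give E_N(PDV), where
   pdv_history is a list with one PDV value per interval.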

320	3.1.4.  Oscillation Estimate

322	   An estimate of the low frequency oscillation of the delay signal is
323	   calculated by counting and normalising the significant mean,
324	   E_T(OWD), crossings of E_N(E_T(OWD)):

326	      freqest = number_of_crossings / N

328	      Where

330	         we define a significant mean crossing as a crossing that
331	         extends p_v * E_N(PDV) from E_N(E_T(OWD)).  In our experiments
332	         we have found that p_v = 0.2 is a good value.

334	   Freqest is a number between 0 and 1.  Freqest can be approximated
335	   incrementally as follows:

337	      With each new calculation of E_T(OWD) a decision is made as to
338	      whether this value of E_T(OWD) significantly crosses the current
339	      long term mean, E_N(E_T(OWD)), with respect to the previous
340	      significant mean crossing.

342	      A cyclic buffer, last_N_crossings, records a 1 if there is a
343	      significant mean crossing, otherwise a 0.

345	      The counter, number_of_crossings, is incremented when there is a
346	      significant mean crossing and decremented when a non-zero value
347	      is removed from last_N_crossings.

349	   This approximation of freqest was not used in [Hayes-LCN14], which
350	   calculated freqest every T using the current E_N(E_T(OWD)).  Our
351	   tests show that this approximation of freqest yields results that are
352	   almost identical to when the full calculation is performed every T.
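
   The incremental approximation can be sketched as follows.  This is
   one possible reading, not a definitive implementation.  The class
   name OscillationEstimator is hypothetical.  The sketch assumes that
   "with respect to the previous significant mean crossing" means the
   new E_T(OWD) must lie beyond the margin on the opposite side of the
   long term mean from the last significant excursion, that the first
   excursion is not counted, and that the count is divided by N even
   before N intervals have been seen.

```python
from collections import deque


class OscillationEstimator:
    """Incremental freqest using a cyclic buffer of crossing flags."""

    def __init__(self, n, p_v=0.2):
        self.n = n
        self.p_v = p_v
        self.last_n_crossings = deque()   # 1 = significant crossing, 0 = none
        self.number_of_crossings = 0
        self.last_side = 0                # +1 above, -1 below, 0 = none yet

    def update(self, mean_t, long_term_mean, mean_pdv):
        """Process one new E_T(OWD) and return the current freqest.

        mean_t:         E_T(OWD) of the interval just finished
        long_term_mean: E_N(E_T(OWD))
        mean_pdv:       E_N(PDV)
        """
        margin = self.p_v * mean_pdv
        if mean_t > long_term_mean + margin:
            side = 1
        elif mean_t < long_term_mean - margin:
            side = -1
        else:
            side = 0                      # inside the margin: not significant
        crossed = side != 0 and self.last_side != 0 and side != self.last_side
        if side != 0:
            self.last_side = side
        # Remove the oldest flag once N flags are stored.
        if len(self.last_n_crossings) == self.n:
            self.number_of_crossings -= self.last_n_crossings.popleft()
        flag = 1 if crossed else 0
        self.last_n_crossings.append(flag)
        self.number_of_crossings += flag
        return self.number_of_crossings / self.n
```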

354	3.1.5.  Packet loss

356	   The proportion of packets lost is used as a supplementary measure:

358	      PL_NT = sum_NT(lost packets) / sum_NT(total packets)
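
   A short sketch of PL_NT over a sliding window of N intervals
   follows.  The class name LossTracker is hypothetical.  The sketch
   assumes that per-interval counts of lost and total packets are
   available (for example, from sequence numbers) and that the total
   includes the lost packets.

```python
from collections import deque


class LossTracker:
    """Keeps per-interval packet counts for the last N intervals."""

    def __init__(self, n):
        self.intervals = deque(maxlen=n)   # (lost, total) for each interval T

    def end_interval(self, lost, total):
        """Store the counts for one interval T and return PL_NT."""
        self.intervals.append((lost, total))
        lost_nt = sum(l for l, _ in self.intervals)
        total_nt = sum(t for _, t in self.intervals)
        # Assumption: report 0.0 when no packets were seen at all.
        return lost_nt / total_nt if total_nt else 0.0
```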

360	3.2.  Flow Grouping

362	3.2.1.  Flow Grouping Algorithm

364	   The following grouping algorithm is RECOMMENDED for SBD in this
365	   context and is sufficient and efficient for small to moderate numbers
366	   of flows.  For very large numbers of flows (e.g., hundreds), a more
367	   complex clustering algorithm may be substituted.

369	   Flows determined to be experiencing congestion are successively
370	   divided into groups based on freqest, PDV, and skewest.

372	   The first step is to determine which flows are experiencing
373	   congestion.  This is important, since if a flow is not experiencing
374	   congestion its delay based metrics will not describe the bottleneck,
375	   but the "noise" from the rest of the path.  Skewness, with proportion
376	   of packets lost as a supplementary measure, is used to do this:

378	   1.  Grouping will be performed on flows where:

380	          E_N(skewest) < 0 || PL_NT > p_l.

382	   These flows, flows experiencing congestion, are then progressively
383	   divided into groups based on the freqest, PDV, and skewest summary
384	   statistics.  The process proceeds according to the following steps:

386	   2.  Group flows whose difference in sorted freqest is less than a
387	       threshold:

389	          diff(freqest) < p_f

391	   3.  Group flows whose difference in sorted E_N(PDV) is less than a
392	       threshold:

394	          diff(E_N(PDV)) < (p_pdv * E_N(PDV))

396	   4.  Group flows whose difference in sorted E_N(skewest) or PL_NT is
397	       less than a threshold:

399	          if PL_NT < p_l

401	             diff(E_N(skewest)) < p_s

403	          otherwise

405	             diff(PL_NT) < p_d

407	   This procedure involves sorting the groups, according to the measure
408	   being used to divide them.  It is simple to implement, and efficient
409	   for small numbers of flows, such as are expected in RTCWEB.
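
   The four steps can be sketched as follows.  This is an illustrative
   reading of the algorithm, not a reference implementation.  The
   names FlowStats, split_sorted and group_flows are hypothetical.
   Two points that the text leaves open are filled in as assumptions:
   in step 3 the threshold is taken relative to the larger of the two
   neighbouring E_N(PDV) values, and in step 4 skewness is used only
   when every flow in the group has PL_NT < p_l.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    flow_id: str
    freqest: float
    pdv: float       # E_N(PDV)
    skewest: float   # E_N(skewest)
    pl_nt: float     # PL_NT


def split_sorted(flows, key, close_enough):
    """Sort by key; start a new group where neighbours differ too much."""
    groups = []
    for flow in sorted(flows, key=key):
        if groups and close_enough(key(groups[-1][-1]), key(flow)):
            groups[-1].append(flow)
        else:
            groups.append([flow])
    return groups


def group_flows(flows, p_l=0.1, p_f=0.2, p_pdv=0.3, p_s=0.2, p_d=0.2):
    """Return lists of flow ids that appear to share a bottleneck."""
    # Step 1: keep only flows that are experiencing congestion.
    congested = [f for f in flows if f.skewest < 0 or f.pl_nt > p_l]
    groups = [congested] if congested else []
    # Step 2: divide by freqest.
    groups = [sub for g in groups
              for sub in split_sorted(g, lambda f: f.freqest,
                                      lambda a, b: b - a < p_f)]
    # Step 3: divide by E_N(PDV); threshold relative to the larger value.
    groups = [sub for g in groups
              for sub in split_sorted(g, lambda f: f.pdv,
                                      lambda a, b: b - a < p_pdv * b)]
    # Step 4: divide by E_N(skewest), or by PL_NT when loss is prevalent.
    result = []
    for g in groups:
        if all(f.pl_nt < p_l for f in g):
            result += split_sorted(g, lambda f: f.skewest,
                                   lambda a, b: b - a < p_s)
        else:
            result += split_sorted(g, lambda f: f.pl_nt,
                                   lambda a, b: b - a < p_d)
    return [sorted(f.flow_id for f in g) for g in result]
```

   Groups containing a single flow are returned as well; a caller
   would treat those flows as not sharing a bottleneck with any other
   flow.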

411	3.2.2.  Using the flow group signal

413	   A grouping decision is made every T.  Network conditions can cause
414	   bottlenecks to fluctuate.  A coupled congestion controller MAY decide
415	   only to couple groups that remain stable, say grouped together 90% of
416	   the time, depending on its objectives.  Recommendations concerning
417	   this are beyond the scope of this draft and will be specific to the
418	   coupled congestion controller's objectives.
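
   One way a controller might apply such a stability criterion is
   sketched here.  Nothing in this sketch is prescribed by this draft:
   the class name GroupStability, the sliding window of recent
   decisions, and the pairwise bookkeeping are all assumptions made
   for illustration.

```python
from collections import deque
from itertools import combinations


class GroupStability:
    """Tracks how often pairs of flows were grouped together."""

    def __init__(self, window, threshold=0.9):
        # One entry per grouping decision: the set of co-grouped pairs.
        self.decisions = deque(maxlen=window)
        self.threshold = threshold

    def record(self, groups):
        """Store one grouping decision (a list of lists of flow ids)."""
        pairs = set()
        for group in groups:
            pairs.update(frozenset(p) for p in combinations(sorted(group), 2))
        self.decisions.append(pairs)

    def stable_pairs(self):
        """Pairs grouped together in at least `threshold` of decisions."""
        if not self.decisions:
            return set()
        counts = {}
        for pairs in self.decisions:
            for pair in pairs:
                counts[pair] = counts.get(pair, 0) + 1
        total = len(self.decisions)
        return {p for p, c in counts.items() if c / total >= self.threshold}
```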

420	4.  Measuring OWD

422	   This section discusses the OWD measurements required for this
423	   algorithm to detect shared bottlenecks.

425	   The SBD mechanism described in this draft relies on differences
426	   between OWD measurements to avoid the practical problems with
427	   measuring absolute OWD (see [Hayes-LCN14] section IIIC).  Since all
428	   summary statistics are relative to the mean OWD and sender/receiver
429	   clock offsets are approximately constant over the measurement
430	   periods, the offset is subtracted out in the calculation.
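
   The cancellation of a constant clock offset can be checked with a
   small sketch.  The function name shape_stats is hypothetical, the
   time stamps are assumed to be integer microseconds, and the mean of
   the same interval is used for both statistics to keep the example
   short.

```python
def shape_stats(send_ts, recv_ts):
    """Return (PDV, skewest) from sender and receiver time stamps."""
    # Relative OWD: includes the unknown sender/receiver clock offset.
    owd = [r - s for s, r in zip(send_ts, recv_ts)]
    mean = sum(owd) / len(owd)
    pdv = max(owd) - mean                      # offset cancels in the difference
    below = sum(1 for o in owd if o < mean)    # comparisons are offset-free too
    above = sum(1 for o in owd if o > mean)
    return pdv, (below - above) / len(owd)
```

   Adding the same offset to every receiver time stamp shifts each OWD
   sample and the mean by that offset, so both statistics are
   unchanged.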

432	4.1.  Time stamp resolution

434	   The SBD mechanism requires timing information precise enough to be
435	   able to make comparisons.  As a rule of thumb, the time resolution
436	   should be less than one hundredth of a typical path's range of
437	   delays.  In general, the coarser the time resolution, the more care
438	   needs to be taken to ensure rounding errors don't bias the skewness
439	   calculation.

441	   Typical RTP media flows use sub-millisecond timers, which should be
442	   adequate in most situations.

444	5.  Acknowledgements

446	   This work was part-funded by the European Community under its Seventh
447	   Framework Programme through the Reducing Internet Transport Latency
448	   (RITE) project (ICT-317700).  The views expressed are solely those of
449	   the authors.

451	6.  IANA Considerations

453	   This memo includes no request to IANA.

455	7.  Security Considerations

457	   The security considerations of RFC 3550 [RFC3550], RFC 4585
458	   [RFC4585], and RFC 5124 [RFC5124] are expected to apply.

460	   Non-authenticated RTCP packets carrying shared bottleneck indications
461	   and summary statistics could allow attackers to alter the bottleneck
462	   sharing characteristics for private gain or disruption of other
463	   parties' communication.

465	8.  References

467	8.1.  Normative References

469	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
470	              Requirement Levels", BCP 14, RFC 2119, March 1997.

472	8.2.  Informative References

474	   [Hayes-LCN14]
475	              Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive
476	              Shared Bottleneck Detection using Shape Summary
477	              Statistics", Proc. the IEEE Local Computer Networks
478	              (LCN) p150-158, September 2014, <http://heim.ifi.uio.no/
479	              davihay/
480	              hayes14__pract_passiv_shared_bottl_detec-abstract.html>.

482	   [I-D.welzl-rmcat-coupled-cc]
483	              Welzl, M., Islam, S., and S. Gjessing, "Coupled congestion
484	              control for RTP media", draft-welzl-rmcat-coupled-cc-03
485	              (work in progress), May 2014.

487	   [ITU-Y1540]
488	              ITU-T, "Internet protocol data communication service - IP
489	              packet transfer and availability performance parameters",
490	              Series Y: Global Information Infrastructure, Internet
491	              Protocol Aspects and Next-Generation Networks ,
492	              March 2011,
493	              <http://www.itu.int/rec/T-REC-Y.1540-201103-I/en>.

495	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
496	              Jacobson, "RTP: A Transport Protocol for Real-Time
497	              Applications", STD 64, RFC 3550, July 2003.

499	   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
500	              "Extended RTP Profile for Real-time Transport Control
501	              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
502	              July 2006.

504	   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
505	              Real-time Transport Control Protocol (RTCP)-Based Feedback
506	              (RTP/SAVPF)", RFC 5124, February 2008.

508	   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
509	              Applicability Statement", RFC 5481, March 2009.

511	   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
512	              "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
513	              December 2012.

515	Authors' Addresses

517	   David Hayes (editor)
518	   University of Oslo
519	   PO Box 1080 Blindern
520	   Oslo,   N-0316
521	   Norway

523	   Phone: +47 2284 5566
524	   Email: davihay@ifi.uio.no

526	   Simone Ferlin
527	   Simula Research Laboratory
528	   P.O.Box 134
529	   Lysaker,   1325
530	   Norway

532	   Phone: +47 4072 0702
533	   Email: ferlin@simula.no

535	   Michael Welzl
536	   University of Oslo
537	   PO Box 1080 Blindern
538	   Oslo,   N-0316
539	   Norway

541	   Phone: +47 2285 2420
542	   Email: michawe@ifi.uio.no