RTP Media Congestion Avoidance                             D. Hayes, Ed.
Techniques                                            University of Oslo
Internet-Draft                                                 S. Ferlin
Intended status: Experimental                 Simula Research Laboratory
Expires: November 9, 2015                                       M. Welzl
                                                      University of Oslo
                                                             May 8, 2015

  Shared Bottleneck Detection for Coupled Congestion Control for RTP
                                Media.
                        draft-ietf-rmcat-sbd-00

Abstract

   This document describes a mechanism to detect whether end-to-end data
   flows share a common bottleneck. It relies on summary statistics
   that are calculated by a data receiver based on continuous
   measurements and regularly fed to a grouping algorithm that runs
   wherever the knowledge is needed.
   This mechanism complements the coupled congestion control mechanism
   in draft-welzl-rmcat-coupled-cc.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 9, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  The signals
       1.1.1.  Packet Loss
       1.1.2.  Packet Delay
       1.1.3.  Path Lag
   2.  Definitions
     2.1.  Parameter Values
   3.  Mechanism
     3.1.  Key metrics and their calculation
       3.1.1.  Mean delay
       3.1.2.  Skewness Estimate
       3.1.3.  Variance Estimate
       3.1.4.  Oscillation Estimate
       3.1.5.  Packet loss
     3.2.  Flow Grouping
       3.2.1.  Flow Grouping Algorithm
       3.2.2.  Using the flow group signal
     3.3.  Removing Noise from the Estimates
       3.3.1.  Oscillation noise
       3.3.2.  Clock drift
       3.3.3.  Bias in the skewness measure
     3.4.  Reducing lag and Improving Responsiveness
       3.4.1.  Improving the response of the skewness estimate
       3.4.2.  Improving the response of the variance estimate
   4.  Measuring OWD
     4.1.  Time stamp resolution
   5.  Acknowledgements
   6.  IANA Considerations
   7.  Security Considerations
   8.  Change history
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   In the Internet, it is not normally known if flows (e.g., TCP
   connections or UDP data streams) traverse the same bottlenecks. Even
   flows that have the same sender and receiver may take different paths
   and share a bottleneck or not. Flows that share a bottleneck link
   usually compete with one another for their share of the capacity.
   This competition has the potential to increase packet loss and
   delays. This is especially relevant for interactive applications
   that communicate simultaneously with multiple peers (such as
   multi-party video). For RTP media applications such as RTCWEB,
   [I-D.welzl-rmcat-coupled-cc] describes a scheme that combines the
   congestion controllers of flows in order to honor their priorities
   and avoid unnecessary packet loss as well as delay. This mechanism
   relies on some form of Shared Bottleneck Detection (SBD); here, a
   measurement-based SBD approach is described.

1.1.  The signals

   The current Internet is unable to explicitly inform endpoints as to
   which flows share bottlenecks, so endpoints need to infer this from
   whatever information is available to them. The mechanism described
   here currently utilises packet loss and packet delay, but is not
   restricted to these.

1.1.1.  Packet Loss

   Packet loss is often a relatively rare signal. On its own it is
   therefore of limited use for SBD; however, it is a valuable
   supplementary measure when it is more prevalent.

1.1.2.  Packet Delay

   End-to-end delay measurements include noise from every device along
   the path in addition to the delay perturbation at the bottleneck
   device. The noise is often significantly increased if the round-trip
   time is used. The cleanest signal is obtained by using One-Way-Delay
   (OWD).

   Measuring absolute OWD is difficult since it requires both the sender
   and receiver clocks to be synchronised.
   However, since the statistics being collected are relative to the
   mean OWD, a relative OWD measurement is sufficient. Clock drift is
   not usually significant over the time intervals used by this SBD
   mechanism (see [RFC6817] A.2 for a discussion on clock drift and OWD
   measurements). However, in circumstances where it is significant,
   Section 3.3.2 outlines a way of adjusting the calculations to cater
   for it.

   Each packet arriving at the bottleneck buffer may experience very
   different queue lengths, and therefore different waiting times. A
   single OWD sample does not, therefore, characterize the path well.
   However, multiple OWD measurements do reflect the distribution of
   delays experienced at the bottleneck.

1.1.3.  Path Lag

   Flows that share a common bottleneck may traverse different paths,
   and these paths will often have different base delays. This makes it
   difficult to correlate changes in delay or loss. This technique uses
   the long term shape of the delay distribution as a base for
   comparison to counter this.

2.  Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   Acronyms used in this document:

      OWD -- One Way Delay

      PDV -- Packet Delay Variation

      RTT -- Round Trip Time

      SBD -- Shared Bottleneck Detection

   Conventions used in this document:

      T           -- the base time interval over which measurements are
                     made.

      N           -- the number of base time, T, intervals used in some
                     calculations.

      sum_T(...)  -- summation of all the measurements of the variable
                     in parentheses taken over the interval T

      sum(...)    -- summation of terms of the variable in parentheses

      sum_N(...)  -- summation of N terms of the variable in parentheses

      sum_NT(...) -- summation of all measurements taken over the
                     interval N*T

      E_T(...)    -- the expectation or mean of the measurements of the
                     variable in parentheses over T

      E_N(...)    -- the expectation or mean of the last N values of the
                     variable in parentheses

      E_M(...)    -- the expectation or mean of the last M values of the
                     variable in parentheses, where M <= N.

      max_T(...)  -- the maximum recorded measurement of the variable in
                     parentheses taken over the interval T

      min_T(...)  -- the minimum recorded measurement of the variable in
                     parentheses taken over the interval T

      num_T(...)  -- the count of measurements of the variable in
                     parentheses taken in the interval T

      num_VM(...) -- the count of valid values of the variable in
                     parentheses given M records

      PC          -- a boolean variable indicating the particular flow
                     was identified as experiencing congestion in the
                     previous interval T (i.e. Previously Congested)

      CD_T        -- an estimate of the effect of Clock Drift on the
                     mean OWD per T

      CD_Adj(...) -- Mean OWD adjusted for clock drift

      p_l, p_f, p_pdv, c_s, c_h, p_s, p_d, p_v -- various thresholds
                     used in the mechanism.

      N, M, and F -- number of values (calculated over T).

2.1.  Parameter Values

   Reference [Hayes-LCN14] uses T=350ms, N=50, p_l = 0.1. The other
   parameters have been tightened to reflect minor enhancements to the
   algorithm outlined in Section 3.3: c_s = -0.01, p_f = p_s = p_d =
   0.1, p_pdv = 0.2, p_v = 0.2. M=50, F=10, and c_h = 0.3 are
   additional parameters defined in the document. These are values that
   seem to work well over a wide range of practical Internet conditions,
   but are the subject of ongoing tests.

3.  Mechanism

   The mechanism described in this document is based on the observation
   that the distributions of delay measurements of packets from flows
   that share a common bottleneck have similar shape characteristics.
   These shape characteristics are described using 3 key summary
   statistics:

      variance (estimate var_est, see Section 3.1.3)

      skewness (estimate skew_est, see Section 3.1.2)

      oscillation (estimate freq_est, see Section 3.1.4)

   with packet loss (estimate pkt_loss, see Section 3.1.5) used as a
   supplementary statistic.

   Summary statistics help to address both the noise and the path lag
   problems by describing the general shape over a relatively long
   period of time. This is sufficient for their application in coupled
   congestion control for RTP Media. They can be signalled from a
   receiver, which measures the OWD and calculates the summary
   statistics, to a sender, which is the entity that is transmitting the
   media stream. An RTP Media device may be both a sender and a
   receiver. SBD can be performed at the sender, the receiver, or both.

                              +----+
                              | H2 |
                              +----+
                                 |
                                 | L2
                                 |
                     +----+  L1  |  L3  +----+
                     | H1 |------|------| H3 |
                     +----+             +----+

      A network with 3 hosts (H1, H2, H3) and 3 links (L1, L2, L3).

                               Figure 1

   In Figure 1, there are two possible cases for shared bottleneck
   detection: a sender-based and a receiver-based case.

   1.  Sender-based: consider a situation where host H1 sends media
       streams to hosts H2 and H3, and L1 is a shared bottleneck. H2
       and H3 measure the OWD and calculate summary statistics, which
       they send to H1 every T. H1, having this knowledge, can determine
       the shared bottleneck and accordingly control the send rates.

   2.  Receiver-based: consider that H2 is also sending media to H3, and
       L3 is a shared bottleneck. If H3 sends summary statistics to H1
       and H2, neither H1 nor H2 alone obtains enough knowledge to
       detect this shared bottleneck; H3 can however determine it by
       combining the summary statistics related to H1 and H2,
       respectively.
       This case is applicable when send rates are controlled by the
       receiver; then, the signal from H3 to the senders contains the
       sending rate.

   A discussion of the required signalling for the receiver-based case
   is beyond the scope of this document. For the sender-based case, the
   messages and their data format will be defined here in future
   versions of this document. We envision that an initialization
   message from the sender to the receiver could specify which key
   metrics are requested out of a possibly extensible set (pkt_loss,
   var_est, skew_est, freq_est). The grouping algorithm described in
   this document requires all four of these metrics, and receivers MUST
   be able to provide them, but future algorithms may be able to exploit
   other metrics (e.g. metrics based on explicit network signals).
   Moreover, the initialization message could specify T, N, and the
   necessary resolution and precision (number of bits per field).

3.1.  Key metrics and their calculation

   Measurements are calculated over a base interval, T. T should be long
   enough to provide enough samples for a good estimate of skewness, but
   short enough so that a measure of the oscillation can be made from N
   of these estimates. Reference [Hayes-LCN14] uses T = 350ms and
   N=M=50, which are values that seem to work well over a wide range of
   practical Internet conditions.

3.1.1.  Mean delay

   The mean delay is not a useful signal for comparisons between flows
   since flows may traverse quite different paths and clocks will not
   necessarily be synchronized. However, it is a base measure for the 3
   summary statistics. The mean delay, E_T(OWD), is the average one way
   delay measured over T.

   To facilitate the other calculations, the last N E_T(OWD) values will
   need to be stored in a cyclic buffer along with the moving average of
   E_T(OWD):

      mean_delay = E_M(E_T(OWD)) = sum_M(E_T(OWD)) / M

   where M <= N.
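   The cyclic buffer and moving average described above can be sketched
   as follows. This is an illustrative sketch only, not part of the
   mechanism's specification: the class name MeanDelay and its method
   names are hypothetical, and it assumes that every interval T contains
   at least one OWD sample and that, until M values have been stored,
   the mean is taken over the values available so far.

```python
from collections import deque


class MeanDelay:
    """Moving average of the per-interval mean OWD (hypothetical helper)."""

    def __init__(self, n=50, m=50):
        if not 0 < m <= n:
            raise ValueError("require 0 < M <= N")
        self.m = m
        # Cyclic buffer holding the last N values of E_T(OWD); the oldest
        # value is dropped automatically once N values are stored.
        self.history = deque(maxlen=n)

    def end_of_interval(self, owd_samples):
        """Store E_T(OWD) for the interval that just ended and return it.

        Assumes owd_samples is non-empty (at least one packet per T).
        """
        e_t = sum(owd_samples) / len(owd_samples)
        self.history.append(e_t)
        return e_t

    def mean_delay(self):
        """E_M(E_T(OWD)): mean of the most recent M stored values.

        Assumption: with fewer than M values stored, average what exists.
        """
        recent = list(self.history)[-self.m:]
        return sum(recent) / len(recent)
```

   Recomputing the sum on every call keeps the sketch short; an
   implementation that cares about cost could instead maintain a running
   sum that is updated as values enter and leave the buffer.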
   Generally M=N; setting M to be less than N allows the mechanism to be
   more responsive to changes, but potentially at the expense of a
   higher error rate (see Section 3.4 for a discussion on improving the
   responsiveness of the mechanism).

3.1.2.  Skewness Estimate

   Skewness is difficult to calculate efficiently and accurately.
   Ideally it should be calculated over the entire period (M * T) from
   the mean OWD over that period. However, this would require storing
   every delay measurement over the period. Instead, an estimate is
   made over T using the previous calculation of mean_delay.
   Comparisons are made using the mean of M skew estimates (an
   alternative that removes bias in the mean is given in Section 3.3.3).

   The skewness is estimated using two counters, counting the number of
   one way delay samples (OWD) above and below the mean:

      skew_est_T = (sum_T(OWD < mean_delay)

                    - sum_T(OWD > mean_delay)) / num_T(OWD)

   where

      if (OWD < mean_delay) 1 else 0

      if (OWD > mean_delay) 1 else 0

      skew_est_T is a number between -1 and 1

      skew_est = E_M(skew_est_T) = sum_M(skew_est_T) / M

   For implementation ease, mean_delay does not include the mean of the
   current T interval.

   Note: Care must be taken when implementing the comparisons to ensure
   that rounding does not bias skew_est. It is important that the mean
   is calculated with a higher precision than the samples.

3.1.3.  Variance Estimate

   Packet Delay Variation (PDV) ([RFC5481] and [ITU-Y1540]) is used as
   an estimator of the variance of the delay signal. We define PDV as
   follows:

      PDV = PDV_max = max_T(OWD) - E_T(OWD)

      var_est = E_M(PDV) = sum_M(PDV) / M

   This modifies PDV as outlined in [RFC5481] to provide a summary
   statistic version that best aids the grouping decisions of the
   algorithm (see [Hayes-LCN14] section IVB).
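   The per-interval calculations of Sections 3.1.2 and 3.1.3 can be
   sketched as follows. This is a minimal illustration under stated
   assumptions rather than a reference implementation: the function
   names interval_stats and moving_mean are hypothetical, OWD samples
   are plain floating point values, each interval is assumed to contain
   at least one sample, and mean_delay is the moving average from the
   previous intervals (it excludes the current T).

```python
def interval_stats(owd_samples, mean_delay):
    """Return (skew_est_T, PDV) for one interval T.

    mean_delay is the moving average computed from earlier intervals.
    Assumes owd_samples is non-empty.
    """
    # Two counters: samples below and above the long term mean.
    below = sum(1 for d in owd_samples if d < mean_delay)
    above = sum(1 for d in owd_samples if d > mean_delay)
    skew_est_t = (below - above) / len(owd_samples)

    # PDV = PDV_max = max_T(OWD) - E_T(OWD)
    e_t = sum(owd_samples) / len(owd_samples)
    pdv = max(owd_samples) - e_t
    return skew_est_t, pdv


def moving_mean(values, m):
    """E_M(...): mean of the last M values (fewer if not yet available)."""
    recent = values[-m:]
    return sum(recent) / len(recent)


# Usage: four samples in one interval, long term mean of 12.5.
skew_t, pdv = interval_stats([10.0, 11.0, 12.0, 20.0], 12.5)
# Three samples lie below the mean and one above: (3 - 1) / 4.
assert skew_t == 0.5
# max is 20.0 and the interval mean is 13.25.
assert pdv == 6.75
```

   skew_est and var_est are then obtained by applying moving_mean to the
   stored skew_est_T and PDV values respectively.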
   The use of PDV = PDV_min = E_T(OWD) - min_T(OWD) is currently being
   investigated as an alternative that is less sensitive to noise. The
   drawback of using PDV_min is that it does not distinguish between
   groups of flows with similar values of skew_est as well as PDV_max
   (see [Hayes-LCN14] section IVB).

3.1.4.  Oscillation Estimate

   An estimate of the low frequency oscillation of the delay signal is
   calculated by counting and normalising the significant mean,
   E_T(OWD), crossings of mean_delay:

      freq_est = number_of_crossings / N

   where

      we define a significant mean crossing as a crossing that
      extends p_v * var_est from mean_delay. In our experiments we
      have found that p_v = 0.2 is a good value.

   Freq_est is a number between 0 and 1. Freq_est can be approximated
   incrementally as follows:

      With each new calculation of E_T(OWD) a decision is made as to
      whether this value of E_T(OWD) significantly crosses the current
      long term mean, mean_delay, with respect to the previous
      significant mean crossing.

      A cyclic buffer, last_N_crossings, records a 1 if there is a
      significant mean crossing, otherwise a 0.

      The counter, number_of_crossings, is incremented when there is a
      significant mean crossing and decremented when a non-zero value
      is removed from last_N_crossings.

   This approximation of freq_est was not used in [Hayes-LCN14], which
   calculated freq_est every T using the current E_N(E_T(OWD)). Our
   tests show that this approximation of freq_est yields results that
   are almost identical to when the full calculation is performed every
   T.

3.1.5.  Packet loss

   The proportion of packets lost is used as a supplementary measure:

      pkt_loss = sum_NT(lost packets) / sum_NT(total packets)

   Note: When pkt_loss is small it is very variable; however, when
   pkt_loss is high it becomes a stable measure for making grouping
   decisions.

3.2.  Flow Grouping

3.2.1.  Flow Grouping Algorithm

   The following grouping algorithm is RECOMMENDED for SBD in the RMCAT
   context and is sufficient and efficient for small to moderate numbers
   of flows. For very large numbers of flows (e.g. hundreds), a more
   complex clustering algorithm may be substituted.

   Since no single metric is precise enough to group flows (due to
   noise), the algorithm uses multiple metrics. Each metric offers a
   different "view" of the bottleneck link characteristics, and used
   together they enable a more precise grouping of flows than would
   otherwise be possible.

   Flows determined to be experiencing congestion are successively
   divided into groups based on freq_est, var_est, and skew_est.

   The first step is to determine which flows are experiencing
   congestion. This is important, since if a flow is not experiencing
   congestion its delay based metrics will not describe the bottleneck,
   but the "noise" from the rest of the path. Skewness, with the
   proportion of packets lost as a supplementary measure, is used to do
   this:

   1.  Grouping will be performed on flows where:

          skew_est < c_s

             || ( skew_est < c_h && PC )

             || pkt_loss > p_l

   The parameter c_s controls how sensitive the mechanism is in
   detecting congestion. A value of c_s = 0.0 was used in
   [Hayes-LCN14]. A value of c_s = 0.05 is a little more sensitive, and
   c_s = -0.05 is a little less sensitive. The parameter c_h controls
   the hysteresis on flows that were grouped as experiencing congestion
   last time.

   The flows experiencing congestion are then progressively divided
   into groups based on the freq_est, var_est, and skew_est summary
   statistics. The process proceeds according to the following steps:

   2.  Group flows whose difference in sorted freq_est is less than a
       threshold:

          diff(freq_est) < p_f

   3.  Group flows whose difference in sorted var_est (highest to
       lowest) is less than a threshold:

          diff(var_est) < (p_pdv * var_est)

       The threshold, (p_pdv * var_est), is with respect to the highest
       value in the difference.

   4.  Group flows whose difference in sorted skew_est or pkt_loss is
       less than a threshold:

          if pkt_loss < p_l

             diff(skew_est) < p_s

          otherwise

             diff(pkt_loss) < (p_d * pkt_loss)

          The threshold, (p_d * pkt_loss), is with respect to the
          highest value in the difference.

   This procedure involves sorting estimates from highest to lowest. It
   is simple to implement, and efficient for small numbers of flows,
   such as are expected in RTCWEB.

3.2.2.  Using the flow group signal

   A grouping decision is made every T from the second T, though these
   decisions will not attain their full design accuracy until after the
   N'th T interval.

   Network conditions, and even the congestion controllers, can cause
   bottlenecks to fluctuate. A coupled congestion controller MAY decide
   only to couple groups that remain stable, say grouped together 90% of
   the time, depending on its objectives. Recommendations concerning
   this are beyond the scope of this draft and will be specific to the
   coupled congestion controller's objectives.

3.3.  Removing Noise from the Estimates

   The following describes small changes to the calculation of the key
   metrics that help remove noise from them. Currently these "tweaks"
   are described separately to keep the main description succinct. In
   future revisions of the draft these enhancements may replace the
   original key metric calculations.

3.3.1.  Oscillation noise

   When a path has no congestion, the PDV will be very small and the
   recorded significant mean crossings will be the result of path noise.
   Thus up to N-1 meaningless mean crossings can be a source of error at
   the point a link becomes a bottleneck and flows traversing it begin
   to be grouped.

   To remove this source of noise from freq_est:

   1.  Set the current PDV to PDV = NaN (a value representing an invalid
       record, i.e. Not a Number) for flows that are deemed to not be
       experiencing congestion by the first skew_est based grouping test
       (see Section 3.2.1).

   2.  Then var_est = sum_M(PDV != NaN) / num_VM(PDV)

   3.  For freq_est, only record a significant mean crossing if the flow
       is experiencing congestion.

   These three changes will remove the non-congestion noise from
   freq_est.

3.3.2.  Clock drift

   Generally sender and receiver clock drift will be too small to cause
   significant errors in the estimators. Skew_est is most sensitive to
   this type of noise. In circumstances where clock drift is high,
   making M < N can reduce this error.

   A better method is to estimate the effect the clock drift is having
   on the E_N(E_T(OWD)), and then adjust mean_delay accordingly. A
   simple method of doing this follows:

      First divide the N E_T(OWD) values into two halves (N/2 in each)
      -- old and new.

      Calculate a mean of the old half:

         Older_mean = E_old(E_T(OWD))

      Calculate a mean of the new (most recent) half:

         Newer_mean = E_new(E_T(OWD))

      A linear estimate of the Clock Drift per T is:

         CD_T = (Newer_mean - Older_mean) / (N/2)

      An adjusted mean estimate then is:

         mean_delay = CD_Adj(E_M(E_T(OWD))) = E_M(E_T(OWD)) + CD_T *
         (M/2 + 0.5)

   CD_Adj can be thought of as a prediction of what the long term mean
   will be in the current measurement period T. It is used as the basis
   for skew_est and freq_est.

3.3.3.  Bias in the skewness measure

   If successive calculations of skew_est_T are made with very different
   numbers of samples (num_T(OWD)), the simple calculation of
   E_M(skew_est_T) used for grouping decisions will be biased by the
   intervals that have few samples. This bias can be corrected if
   necessary as follows.

      skew_base_T = sum_T(OWD < mean_delay) - sum_T(OWD > mean_delay)

      skew_est = sum_MT(skew_base_T)/num_MT(OWD)

   This calculation requires slightly more state, since an
   implementation will need to maintain two cyclic buffers storing
   skew_base_T and num_T(OWD) respectively to manage the rolling
   summations (note only one cyclic buffer is needed for the calculation
   of skew_est outlined previously).

3.4.  Reducing lag and Improving Responsiveness

   Measurement based shared bottleneck detection makes decisions in the
   present based on what has been measured in the past. This means that
   there is always a lag in responding to changing conditions. This
   mechanism is based on summary statistics taken over (N*T) seconds.
   This mechanism can be made more responsive to changing conditions by:

   1.  Reducing N and/or M -- but at the expense of less accurate
       metrics, and/or

   2.  Exploiting the fact that more recent measurements are more
       valuable than older measurements and weighting them accordingly.

   Although more recent measurements are more valuable, older
   measurements are still needed to gain an accurate estimate of the
   distribution descriptor we are measuring. Unfortunately, the simple
   exponentially weighted moving average weights drop off too quickly
   for our requirements and have an infinite tail. A simple linearly
   declining weighted moving average also does not provide enough weight
   to the most recent measurements.
   We propose a piecewise linear distribution of weights, such that the
   first section (samples 1:F) is flat as in a simple moving average,
   and the second section (samples F+1:M) has linearly declining
   weights to the end of the averaging window. We choose integer
   weights, which allows incremental calculation without introducing
   rounding errors.

3.4.1.  Improving the response of the skewness estimate

   The weighted moving average for skew_est, based on skew_est in
   Section 3.3.3, can be calculated as follows:

      skew_est = ((M-F+1)*sum(skew_base_T(1:F))

                  + sum([(M-F):1].*skew_base_T(F+1:M)))

                 / ((M-F+1)*sum(numsampT(1:F))

                    + sum([(M-F):1].*numsampT(F+1:M)))

   where numsampT is an array of the number of OWD samples in each T
   (i.e. num_T(OWD)), and numsampT(1) is the most recent; skew_base_T(1)
   is the most recent calculation of skew_base_T; 1:F refers to the
   integer values 1 through to F, and [(M-F):1] refers to an array of
   the integer values (M-F) declining through to 1; and ".*" is the
   array scalar dot product operator.

3.4.2.  Improving the response of the variance estimate

   The weighted moving average for var_est can be calculated as follows:

      var_est = ((M-F+1)*sum(PDV(1:F)) + sum([(M-F):1].*PDV(F+1:M)))

                / (F*(M-F+1) + sum([(M-F):1]))

   where 1:F refers to the integer values 1 through to F, and [(M-F):1]
   refers to an array of the integer values (M-F) declining through to
   1; and ".*" is the array scalar dot product operator. When removing
   oscillation noise (see Section 3.3.1) this calculation must be
   adjusted to allow for invalid PDV records.

4.  Measuring OWD

   This section discusses the OWD measurements required for this
   algorithm to detect shared bottlenecks.

   The SBD mechanism described in this draft relies on differences
   between OWD measurements to avoid the practical problems with
   measuring absolute OWD (see [Hayes-LCN14] section IIIC).
   Since all summary statistics are relative to the mean OWD and
   sender/receiver clock offsets should be approximately constant over
   the measurement periods, the offset is subtracted out in the
   calculation.

4.1.  Time stamp resolution

   The SBD mechanism requires timing information precise enough to be
   able to make comparisons. As a rule of thumb, the time resolution
   should be finer than one hundredth of a typical path's range of
   delays. In general, the coarser the time resolution, the more care
   needs to be taken to ensure rounding errors do not bias the skewness
   calculation.

   Typical RTP media flows use sub-millisecond timers, which should be
   adequate in most situations.

5.  Acknowledgements

   This work was part-funded by the European Community under its Seventh
   Framework Programme through the Reducing Internet Transport Latency
   (RITE) project (ICT-317700). The views expressed are solely those of
   the authors.

6.  IANA Considerations

   This memo includes no request to IANA.

7.  Security Considerations

   The security considerations of RFC 3550 [RFC3550], RFC 4585
   [RFC4585], and RFC 5124 [RFC5124] are expected to apply.

   Non-authenticated RTCP packets carrying shared bottleneck indications
   and summary statistics could allow attackers to alter the bottleneck
   sharing characteristics for private gain or disruption of other
   parties' communication.

8.  Change history

   Changes made to this document:

      02->WG-00 : Fixed missing 0.5 in 3.3.2 and missing brace in 3.3.3

      01->02    : New section describing improvements to the key metric
                  calculations that help to remove noise, bias, and
                  reduce lag. Some revisions to the notation to make
                  it clearer. Some tightening of the thresholds.

      00->01    : Revisions to terminology for clarity

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2.  Informative References

   [Hayes-LCN14]
              Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive
              Shared Bottleneck Detection using Shape Summary
              Statistics", Proc. the IEEE Local Computer Networks
              (LCN) p150-158, September 2014.

   [I-D.welzl-rmcat-coupled-cc]
              Welzl, M., Islam, S., and S. Gjessing, "Coupled congestion
              control for RTP media", draft-welzl-rmcat-coupled-cc-04
              (work in progress), October 2014.

   [ITU-Y1540]
              ITU-T, "Internet Protocol Data Communication Service - IP
              Packet Transfer and Availability Performance Parameters",
              Series Y: Global Information Infrastructure, Internet
              Protocol Aspects and Next-Generation Networks,
              March 2011.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              July 2006.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, February 2008.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
              "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
              December 2012.
Authors' Addresses

   David Hayes (editor)
   University of Oslo
   PO Box 1080 Blindern
   Oslo, N-0316
   Norway

   Phone: +47 2284 5566
   Email: davihay@ifi.uio.no

   Simone Ferlin
   Simula Research Laboratory
   P.O.Box 134
   Lysaker, 1325
   Norway

   Phone: +47 4072 0702
   Email: ferlin@simula.no

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo, N-0316
   Norway

   Phone: +47 2285 2420
   Email: michawe@ifi.uio.no