RTP Media Congestion Avoidance Techniques                  D. Hayes, Ed.
Internet-Draft                                Simula Research Laboratory
Intended status: Experimental                                  S. Ferlin
Expires: August 20, 2018
                                                                M. Welzl
                                                                      K.
Hiorth
                                                      University of Oslo
                                                       February 16, 2018

   Shared Bottleneck Detection for Coupled Congestion Control for RTP
                                 Media.
                        draft-ietf-rmcat-sbd-10

Abstract

   This document describes a mechanism to detect whether end-to-end data
   flows share a common bottleneck. It relies on summary statistics
   that are calculated based on continuous measurements and used as
   input to a grouping algorithm that runs wherever the knowledge is
   needed.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 20, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. The Basic Mechanism
     1.2. The Signals
       1.2.1. Packet Loss
       1.2.2. Packet Delay
       1.2.3. Path Lag
   2. Definitions
     2.1. Parameters and Their Effect
     2.2. Recommended Parameter Values
   3. Mechanism
     3.1. SBD Feedback Requirements
       3.1.1. Feedback When All the Logic is Placed at the Sender
       3.1.2. Feedback When the Statistics are Calculated at the
              Receiver and SBD Performed at the Sender
       3.1.3. Feedback When Bottlenecks can be Determined at Both
              Senders and Receivers
     3.2. Key Metrics and Their Calculation
       3.2.1. Mean Delay
       3.2.2. Skewness Estimate
       3.2.3. Variability Estimate
       3.2.4. Oscillation Estimate
       3.2.5. Packet Loss
     3.3. Flow Grouping
       3.3.1. Flow Grouping Algorithm
       3.3.2. Using the Flow Group Signal
   4. Enhancements to the Basic SBD Algorithm
     4.1. Reducing Lag and Improving Responsiveness
       4.1.1. Improving the Response of the Skewness Estimate
       4.1.2. Improving the Response of the Variability Estimate
     4.2. Removing Oscillation Noise
   5. Measuring OWD
     5.1. Time-stamp Resolution
     5.2. Clock Skew
   6. Expected Feedback from Experiments
   7. Acknowledgments
   8. IANA Considerations
   9. Security Considerations
   10. Change history
   11. References
     11.1. Normative References
     11.2. Informative References
   Authors' Addresses

1. Introduction

   In the Internet, it is not normally known if flows (e.g., TCP
   connections or UDP data streams) traverse the same bottlenecks. Even
   flows that have the same sender and receiver may take different paths
   and may or may not share a bottleneck. Flows that share a bottleneck
   link usually compete with one another for their share of the
   capacity. This competition has the potential to increase packet loss
   and delays. This is especially relevant for interactive applications
   that communicate simultaneously with multiple peers (such as
   multi-party video). For RTP media applications such as RTCWEB,
   [I-D.ietf-rmcat-coupled-cc] describes a scheme that combines the
   congestion controllers of flows in order to honor their priorities
   and avoid unnecessary packet loss as well as delay. This mechanism
   relies on some form of Shared Bottleneck Detection (SBD); here, a
   measurement-based SBD approach is described.

1.1. The Basic Mechanism

   The mechanism groups flows that have similar statistical
   characteristics together. Section 3.3.1 describes a simple method
   for achieving this; however, a major part of this draft is concerned
   with collecting suitable statistics for this purpose.
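
   To make "grouping flows with similar statistical characteristics"
   concrete, the following is a minimal Python sketch of the kind of
   threshold test that Section 3.3.1 applies to a sorted summary
   statistic. The function name, the flow identifiers, and the sample
   values are illustrative assumptions and are not part of this
   document.

```python
def group_by_difference(stats, threshold):
    """Group flows whose sorted statistic differs by less than threshold.

    stats: dict mapping a flow id to one summary statistic (hypothetical
           input format chosen for this sketch).
    threshold: absolute gap below which neighbours share a group.
    """
    # Sort from highest to lowest, as the grouping procedure does.
    ordered = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)
    groups = []
    for flow, value in ordered:
        # Compare with the last member of the current group.
        if groups and abs(groups[-1][-1][1] - value) < threshold:
            groups[-1].append((flow, value))
        else:
            groups.append([(flow, value)])
    # Return only the flow ids of each group.
    return [[flow for flow, _ in group] for group in groups]


example = {"a": 0.30, "b": 0.32, "c": 0.70, "d": 0.69}
print(group_by_difference(example, 0.1))
```

   In this sketch, flows "c" and "d" end up in one group and flows "b"
   and "a" in another, because only the gap between 0.69 and 0.32
   exceeds the threshold.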
1.2. The Signals

   The current Internet is unable to explicitly inform endpoints as to
   which flows share bottlenecks, so endpoints need to infer this from
   whatever information is available to them. The mechanism described
   here currently utilizes packet loss and packet delay, but is not
   restricted to these. As ECN becomes more prevalent, it too will
   become a valuable base signal.

1.2.1. Packet Loss

   Packet loss is often a relatively rare signal. On its own it is
   therefore of limited use for SBD; however, it is a valuable
   supplementary measure when it is more prevalent.

1.2.2. Packet Delay

   End-to-end delay measurements include noise from every device along
   the path in addition to the delay perturbation at the bottleneck
   device. The noise is often significantly increased if the round-trip
   time is used. The cleanest signal is obtained by using One-Way-Delay
   (OWD).

   Measuring absolute OWD is difficult since it requires both the sender
   and receiver clocks to be synchronized. However, since the
   statistics being collected are relative to the mean OWD, a relative
   OWD measurement is sufficient. Clock skew is not usually significant
   over the time intervals used by this SBD mechanism (see [RFC6817],
   Appendix A.2, for a discussion on clock skew and OWD measurements).
   However, in circumstances where it is significant, Section 5.2
   outlines a way of adjusting the calculations to cater for it.

   Each packet arriving at the bottleneck buffer may experience very
   different queue lengths, and therefore different waiting times. A
   single OWD sample does not, therefore, characterize the path well.
   However, multiple OWD measurements do reflect the distribution of
   delays experienced at the bottleneck.

1.2.3. Path Lag

   Flows that share a common bottleneck may traverse different paths,
   and these paths will often have different base delays.
   This makes it difficult to correlate changes in delay or loss. To
   counter this, the technique uses the long term shape of the delay
   distribution as a base for comparison.

2. Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Acronyms used in this document:

      OWD -- One Way Delay

      MAD -- Mean Absolute Deviation

      RTT -- Round Trip Time

      SBD -- Shared Bottleneck Detection

   Conventions used in this document:

      T           -- the base time interval over which measurements are
                     made

      N           -- the number of base time, T, intervals used in some
                     calculations

      M           -- the number of base time, T, intervals used in some
                     calculations, where M <= N

      sum(...)    -- summation of terms of the variable in parentheses

      sum_T(...)  -- summation of all the measurements of the variable
                     in parentheses taken over the interval T

      sum_NT(...) -- summation of all measurements taken over the
                     interval N*T

      sum_MT(...) -- summation of all measurements taken over the
                     interval M*T

      E_T(...)    -- the expectation or mean of the measurements of the
                     variable in parentheses over T

      E_N(...)    -- the expectation or mean of the last N values of
                     the variable in parentheses

      E_M(...)    -- the expectation or mean of the last M values of
                     the variable in parentheses

      num_T(...)  -- the count of measurements of the variable in
                     parentheses taken in the interval T

      num_MT(...) -- the count of measurements of the variable in
                     parentheses taken in the interval M*T

      PB          -- a boolean variable indicating the particular flow
                     was identified transiting a bottleneck in the
                     previous interval T (i.e.
                     Previously Bottleneck)

      skew_est    -- a measure of skewness in an OWD distribution

      skew_base_T -- a variable used as an intermediate step in
                     calculating skew_est

      var_est     -- a measure of variability in OWD measurements

      var_base_T  -- a variable used as an intermediate step in
                     calculating var_est

      freq_est    -- a measure of low frequency oscillation in the OWD
                     measurements

      pkt_loss    -- a measure of the proportion of packets lost

      p_l, p_f, p_mad, c_s, c_h, p_s, p_d, p_v -- various thresholds
                     used in the mechanism

      M and F     -- number of values related to N

2.1. Parameters and Their Effect

   T      T should be long enough so that there are enough packets
          received during T for a useful estimate of short term mean
          OWD and variation statistics. Making T too large can limit
          the efficacy of freq_est. It will also increase the response
          time of the mechanism. Making T too small will make the
          metrics noisier.

   N & M  N should be large enough to provide a stable estimate of
          oscillations in OWD. Usually M=N, though having M

   [...]

   mean_delay) skew_base_T--

   The mean_delay does not include the mean of the current T interval to
   enable it to be calculated iteratively.

      skew_est = sum_MT(skew_base_T)/num_MT(OWD)

   where skew_est is a number between -1 and 1

   Note: Care must be taken when implementing the comparisons to ensure
   that rounding does not bias skew_est. It is important that the mean
   is calculated with a higher precision than the samples.

3.2.3. Variability Estimate

   Mean Absolute Deviation (MAD) delay is a robust variability measure
   that copes well with different send rates. It can be implemented in
   an online manner as follows:

      var_base_T = sum_T(|OWD - E_T(OWD)|)

   where

      |x| is the absolute value of x

      E_T(OWD) is the mean OWD calculated in the previous T

      var_est = MAD_MT = sum_MT(var_base_T)/num_MT(OWD)

3.2.4.
Oscillation Estimate

   An estimate of the low frequency oscillation of the delay signal is
   calculated by counting and normalizing the significant mean,
   E_T(OWD), crossings of mean_delay:

      freq_est = number_of_crossings / N

   where we define a significant mean crossing as a crossing that
   extends p_v * var_est from mean_delay. In our experiments we
   have found that p_v = 0.7 is a good value.

   Freq_est is a number between 0 and 1. Freq_est can be approximated
   incrementally as follows:

      With each new calculation of E_T(OWD) a decision is made as to
      whether this value of E_T(OWD) significantly crosses the current
      long term mean, mean_delay, with respect to the previous
      significant mean crossing.

      A cyclic buffer, last_N_crossings, records a 1 if there is a
      significant mean crossing, otherwise a 0.

      The counter, number_of_crossings, is incremented when there is a
      significant mean crossing and decremented when a non-zero value is
      removed from the last_N_crossings.

   This approximation of freq_est was not used in [Hayes-LCN14], which
   calculated freq_est every T using the current E_N(E_T(OWD)). Our
   tests show that this approximation of freq_est yields results that
   are almost identical to when the full calculation is performed every
   T.

3.2.5. Packet Loss

   The proportion of packets lost over the period NT is used as a
   supplementary measure:

      pkt_loss = sum_NT(lost packets) / sum_NT(total packets)

   Note: When pkt_loss is small it is very variable; however, when
   pkt_loss is high it becomes a stable measure for making grouping
   decisions.

3.3. Flow Grouping

3.3.1. Flow Grouping Algorithm

   The following grouping algorithm is RECOMMENDED for use of SBD with
   coupled congestion control for RTP media [I-D.ietf-rmcat-coupled-cc]
   and is sufficient and efficient for small to moderate numbers of
   flows. For very large numbers of flows (e.g.
   hundreds), a more complex clustering algorithm may be substituted.

   Since no single metric is precise enough to group flows (due to
   noise), the algorithm uses multiple metrics. Each metric offers a
   different "view" of the bottleneck link characteristics, and used
   together they enable a more precise grouping of flows than would
   otherwise be possible.

   Flows determined to be transiting a bottleneck are successively
   divided into groups based on freq_est, var_est, skew_est and
   pkt_loss.

   The first step is to determine which flows are transiting a
   bottleneck. This is important, since if a flow is not transiting a
   bottleneck its delay based metrics will not describe the bottleneck,
   but the "noise" from the rest of the path. Skewness, with proportion
   of packet loss as a supplementary measure, is used to do this:

   1. Grouping will be performed on flows that are inferred to be
      traversing a bottleneck by:

         skew_est < c_s

         || ( skew_est < c_h & PB ) || pkt_loss > p_l

   The parameter c_s controls how sensitive the mechanism is in
   detecting a bottleneck. c_s = 0.0 was used in [Hayes-LCN14]. A value
   of c_s = 0.1 is a little more sensitive, and c_s = -0.1 is a little
   less sensitive. c_h controls the hysteresis on flows that were
   grouped as transiting a bottleneck last time. If the test result is
   TRUE, PB=TRUE, otherwise PB=FALSE.

   These flows, flows transiting a bottleneck, are then progressively
   divided into groups based on the freq_est, var_est, and skew_est
   summary statistics. The process proceeds according to the following
   steps:

   2. Group flows whose difference in sorted freq_est is less than a
      threshold:

         diff(freq_est) < p_f

   3. Subdivide the groups obtained in 2.
      by grouping flows whose
      difference in sorted E_M(var_est) (highest to lowest) is less
      than a threshold:

         diff(var_est) < (p_mad * var_est)

      The threshold, (p_mad * var_est), is with respect to the highest
      value in the difference.

   4. Subdivide the groups obtained in 3. by grouping flows whose
      difference in sorted skew_est is less than a threshold:

         diff(skew_est) < p_s

   5. When packet loss is high enough to be reliable (pkt_loss > p_l),
      subdivide the groups obtained in 4. by grouping flows whose
      difference in sorted pkt_loss is less than a threshold:

         diff(pkt_loss) < (p_d * pkt_loss)

      The threshold, (p_d * pkt_loss), is with respect to the highest
      value in the difference.

   This procedure involves sorting estimates from highest to lowest. It
   is simple to implement, and efficient for small numbers of flows (up
   to 10-20). Figure 2 illustrates this algorithm.

                                            *********
                                            * Flows *
                                            ***.**.**
                                              /   '
                                             /     '--.
                                            /          \
                                       .---v--.    .----v---.
   1. Flows traversing                 | Cong |    | UnCong |
      a bottleneck                     '-.--.-'    '--------'
                                        /    \
                                       /      \
                                      /        \
                                  .--v--.       v-----.
   2. Divide by                   | g_1 |  ...  | g_n |
      freq_est                    '---.-.       '----..
                                     /   \          /  \
                                    /     '--.     v    '------.
                                   /          \                 \
                             .----v-.        .-v----.        .---v--.
   3. Divide by              | g_1a |  ...   | g_1z |  ...   | g_nz |
      var_est                '---.-.'        '-----..        '-.-.--'
                                /   \             /  \        /  |
                               /     '-----.     v    v      v   |
                              /             \                    |
                           .-v-----.       .-v-----.         .---v---.
   4. Divide by            | g_1ai |  ...  | g_1ax |   ...   | g_nzx |
      skew_est             '----.-.'       '------..         '-.-.---'
                               /   \             /  \         /  |
                              /     '--.        v    v       v   |
                             /          \                        |
                      .-----v--.       .-v------.           .----v---.
   5. Divide by       | g_1aiA |  ...  | g_1aiZ |    ...    | g_nzxZ |
      pkt_loss        '--------'       '--------'           '--------'
      (when applicable)

                       Simple grouping algorithm.

                                Figure 2

3.3.2. Using the Flow Group Signal

   Grouping decisions can be made every T from the second T; however,
   they will not attain their full design accuracy until after the
   2*N'th T interval.
   We recommend that grouping decisions are not made until 2*M T
   intervals have elapsed.

   Network conditions, and even the congestion controllers, can cause
   bottlenecks to fluctuate. A coupled congestion controller MAY decide
   only to couple groups that remain stable, say grouped together 90% of
   the time, depending on its objectives. Recommendations concerning
   this are beyond the scope of this document and will be specific to
   the coupled congestion controller's objectives.

4. Enhancements to the Basic SBD Algorithm

   The SBD algorithm as specified in Section 3 was found to work well
   for a broad variety of conditions. The following enhancements to the
   basic mechanisms have been found to significantly improve the
   algorithm's performance under some circumstances and SHOULD be
   implemented. These "tweaks" are described separately to keep the
   main description succinct.

4.1. Reducing Lag and Improving Responsiveness

   This section describes how to improve the responsiveness of the basic
   algorithm.

   Measurement-based shared bottleneck detection makes decisions in the
   present based on what has been measured in the past. This means that
   there is always a lag in responding to changing conditions. This
   mechanism is based on summary statistics taken over (N*T) seconds.
   This mechanism can be made more responsive to changing conditions by:

   1. Reducing N and/or M -- but at the expense of having less accurate
      metrics, and/or

   2. Exploiting the fact that more recent measurements are more
      valuable than older measurements and weighting them accordingly.

   Although more recent measurements are more valuable, older
   measurements are still needed to gain an accurate estimate of the
   distribution descriptor we are measuring. Unfortunately, the simple
   exponentially weighted moving average weights drop off too quickly
   for our requirements and have an infinite tail.
   A simple linearly declining weighted moving average also does not
   provide enough weight to the most recent measurements. We propose a
   piecewise linear distribution of weights, such that the first section
   (samples 1:F) is flat as in a simple moving average, and the second
   section (samples F+1:M) has linearly declining weights to the end of
   the averaging window. We choose integer weights, which allows
   incremental calculation without introducing rounding errors.

4.1.1. Improving the Response of the Skewness Estimate

   The weighted moving average for skew_est, based on skew_est in
   Section 3.2.2, can be calculated as follows:

      skew_est = ((M-F+1)*sum(skew_base_T(1:F))

                  + sum([(M-F):1].*skew_base_T(F+1:M)))

                 / ((M-F+1)*sum(numsampT(1:F))

                    + sum([(M-F):1].*numsampT(F+1:M)))

   where numsampT is an array of the number of OWD samples in each T
   (i.e. num_T(OWD)), and numsampT(1) is the most recent; skew_base_T(1)
   is the most recent calculation of skew_base_T; 1:F refers to the
   integer values 1 through to F, and [(M-F):1] refers to an array of
   the integer values (M-F) declining through to 1; and ".*" is the
   element-wise array multiplication operator, so that sum(a.*b) is the
   dot product of a and b.

   To calculate this weighted skew_est incrementally:

   Notation: F_ - flat portion, D_ - declining portion, W_ - weighted
             component

   Initialize: sum_skewbase = 0, F_skewbase = 0, W_D_skewbase = 0

               sum_numsamp = 0, F_numsamp = 0, W_D_numsamp = 0

               skewbase_hist = buffer of length M initialized to 0

               numsampT = buffer of length M initialized to 0

   Steps per iteration:

   1. old_skewbase = skewbase_hist(M)

   2. old_numsampT = numsampT(M)

   3. cycle(skewbase_hist)

   4. cycle(numsampT)

   5. numsampT(1) = num_T(OWD)

   6. skewbase_hist(1) = skew_base_T

   7. F_skewbase = F_skewbase + skew_base_T - skewbase_hist(F+1)

   8. W_D_skewbase = W_D_skewbase + (M-F)*skewbase_hist(F+1)
      - sum_skewbase

   9. W_D_numsamp = W_D_numsamp + (M-F)*numsampT(F+1) - sum_numsamp
      + F_numsamp

   10.
       F_numsamp = F_numsamp + numsampT(1) - numsampT(F+1)

   11. sum_skewbase = sum_skewbase + skewbase_hist(F+1) - old_skewbase

   12. sum_numsamp = sum_numsamp + numsampT(1) - old_numsampT

   13. skew_est = ((M-F+1)*F_skewbase + W_D_skewbase) /
       ((M-F+1)*F_numsamp+W_D_numsamp)

   Where cycle(....) refers to the operation on a cyclic buffer where
   the start of the buffer is now the next element in the buffer.

4.1.2. Improving the Response of the Variability Estimate

   Similarly the weighted moving average for var_est can be calculated
   as follows:

      var_est = ((M-F+1)*sum(var_base_T(1:F))

                 + sum([(M-F):1].*var_base_T(F+1:M)))

                / ((M-F+1)*sum(numsampT(1:F))

                   + sum([(M-F):1].*numsampT(F+1:M)))

   where numsampT is an array of the number of OWD samples in each T
   (i.e. num_T(OWD)), and numsampT(1) is the most recent; var_base_T(1)
   is the most recent calculation of var_base_T; 1:F refers to the
   integer values 1 through to F, and [(M-F):1] refers to an array of
   the integer values (M-F) declining through to 1; and ".*" is the
   element-wise array multiplication operator. When removing oscillation
   noise (see Section 4.2) this calculation must be adjusted to allow
   for invalid var_base_T records.

   Var_est can be calculated incrementally in the same way as skew_est
   in Section 4.1.1. However, note that the buffer numsampT is used for
   both calculations so the operations on it should not be repeated.

4.2. Removing Oscillation Noise

   When a path has no bottleneck, var_est will be very small and the
   recorded significant mean crossings will be the result of path noise.
   Thus up to N-1 meaningless mean crossings can be a source of error at
   the point a link becomes a bottleneck and flows traversing it begin
   to be grouped.

   To remove this source of noise from freq_est:

   1. Set the current var_base_T = NaN (a value representing an invalid
      record, i.e.
      Not a Number) for flows that are deemed to not be
      transiting a bottleneck by the first skew_est based grouping test
      (see Section 3.3.1).

   2. Then var_est = sum_MT(var_base_T != NaN) / num_MT(OWD)

   3. For freq_est, only record a significant mean crossing if the flow
      is deemed to be transiting a bottleneck.

   These three changes can help to remove the non-bottleneck noise from
   freq_est.

5. Measuring OWD

   This section discusses the OWD measurements required for this
   algorithm to detect shared bottlenecks.

   The SBD mechanism described in this document relies on differences
   between OWD measurements to avoid the practical problems with
   measuring absolute OWD (see [Hayes-LCN14] section IIIC). Since all
   summary statistics are relative to the mean OWD and sender/receiver
   clock offsets should be approximately constant over the measurement
   periods, the offset is subtracted out in the calculation.

5.1. Time-stamp Resolution

   The SBD mechanism requires timing information precise enough to be
   able to make comparisons. As a rule of thumb, the time resolution
   should be less than one hundredth of a typical path's range of
   delays. In general, the coarser the time resolution, the more care
   needs to be taken to ensure rounding errors do not bias the
   skewness calculation. Frequent timing information in millisecond
   resolution as described by [I-D.ietf-avtcore-cc-feedback-message]
   should be sufficient for the sender to calculate relative OWD.

5.2. Clock Skew

   Generally sender and receiver clock skew will be too small to cause
   significant errors in the estimators. Skew_est and freq_est are the
   most sensitive to this type of noise due to their use of a mean OWD
   calculated over a longer interval. In circumstances where clock skew
   is high, basing skew_est only on the previous T's mean and ignoring
   freq_est provides a noisier but reliable signal.
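
   The simpler option above, basing skew_est only on the previous T's
   mean, might be sketched in Python as follows. This is an
   illustration under stated assumptions rather than a definitive
   implementation: the class and method names are hypothetical, and the
   counting rule for skew_base_T (add one for each sample below the
   reference mean, subtract one for each sample above it) is assumed
   from Section 3.2.2.

```python
from collections import deque


class PrevMeanSkew:
    """skew_est where each interval T is compared with the previous T's mean."""

    def __init__(self, m):
        self.bases = deque(maxlen=m)   # skew_base_T for the last M intervals
        self.counts = deque(maxlen=m)  # num_T(OWD) for the last M intervals
        self.prev_mean = None          # E_T(OWD) of the previous interval

    def end_interval(self, owds):
        """Feed the relative OWD samples of one interval T; return skew_est."""
        if not owds:
            return self.skew_est()
        if self.prev_mean is not None:
            # Assumed rule: +1 per sample below the mean, -1 per sample above.
            base = sum((x < self.prev_mean) - (x > self.prev_mean) for x in owds)
            self.bases.append(base)
            self.counts.append(len(owds))
        # The reference for the next interval is this interval's mean only.
        self.prev_mean = sum(owds) / len(owds)
        return self.skew_est()

    def skew_est(self):
        """sum_MT(skew_base_T) / num_MT(OWD), or 0.0 before any data."""
        total = sum(self.counts)
        return sum(self.bases) / total if total else 0.0


est = PrevMeanSkew(m=3)
first = est.end_interval([10.0, 10.0, 10.0])   # only sets the reference mean
second = est.end_interval([9.0, 9.5, 12.0])    # two below, one above
print(first, second)
```

   Because the reference mean is refreshed every T, a slow drift between
   the sender and receiver clocks has little time to accumulate in the
   comparison, at the cost of a noisier estimate.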

   A more sophisticated method is to estimate the effect the clock skew
   is having on the summary statistics, and then adjust statistics
   accordingly. There are a number of techniques in the literature,
   including [Zhang-Infocom02].

6. Expected Feedback from Experiments

   The algorithm described in this memo has so far been evaluated using
   simulations and small scale experiments. Real network tests using
   RTP Media Congestion Avoidance Techniques (RMCAT) congestion control
   algorithms will help confirm the default parameter choice. For
   example, the time interval T may need to be made longer if the packet
   rate is very low. Implementers and testers are invited to document
   their findings in an Internet draft.

7. Acknowledgments

   This work was part-funded by the European Community under its Seventh
   Framework Programme through the Reducing Internet Transport Latency
   (RITE) project (ICT-317700). The views expressed are solely those of
   the authors.

8. IANA Considerations

   This memo includes no request to IANA.

9. Security Considerations

   The security considerations of RFC 3550 [RFC3550], RFC 4585
   [RFC4585], and RFC 5124 [RFC5124] are expected to apply.

   Non-authenticated RTCP packets carrying OWD measurements, shared
   bottleneck indications, and/or summary statistics could allow
   attackers to alter the bottleneck sharing characteristics for private
   gain or disruption of other parties' communication. When using SBD
   for coupled congestion control as described in
   [I-D.ietf-rmcat-coupled-cc], the security considerations of
   [I-D.ietf-rmcat-coupled-cc] apply.

10. Change history

   XX RFC ED - PLEASE REMOVE THIS SECTION XXX

   Changes made to this document:

   WG-09->WG-10 :  AD review addressed.

   WG-08->WG-09 :  Removed definitions that are no longer used. Added
                   pkt_loss definition. Refined c_s recommendation.

   WG-07->WG-08 :  Updates addressing https://www.ietf.org/
                   mail-archive/web/rmcat/current/msg01671.html
                   Mainly clarifications.

   WG-06->WG-07 :  Updates addressing
                   https://mailarchive.ietf.org/arch/msg/rmcat/
                   80B6q4nI7carGcf_ddBwx7nKvOw. Mainly
                   clarifications. Figure 2 to supplement grouping
                   algorithm description.

   WG-05->WG-06 :  Updates addressing WG reviews
                   https://mailarchive.ietf.org/arch/msg/rmcat/
                   -1JdrTMq1Y5T6ZNlOkrQJQ27TzE and
                   https://mailarchive.ietf.org/arch/msg/rmcat/
                   eI2Q1f8NL2SxbJgjFLR4_rEmJ_g. This has mainly
                   involved minor clarifications, including the moving
                   of 3.4.1 and 3.5 into the new Section 4, and 3.4.1
                   into Section 5

   WG-04->WG-05 :  Fix ToC formatting. Add section on expected
                   feedback from experiments replacing short section
                   on implementation status. Added comment on ECN as
                   a signal. Clarification of lost packet signaling.
                   Change term "draft" to "document" where
                   appropriate. American spelling. Some tightening
                   of the text.

   WG-03->WG-04 :  Add M to terminology table, suggest skew_est based
                   on previous T and no freq_est in clock skew
                   section, feedback requirements as a separate sub
                   section.

   WG-02->WG-03 :  Correct misspelled author

   WG-01->WG-02 :  Removed ambiguity associated with the term
                   "congestion". Expanded the description of
                   initialization messages. Removed PDV metric.
                   Added description of incremental weighted metric
                   calculations for skew_est. Various clarifications
                   based on implementation work. Fixed typos and
                   tuned parameters.

   WG-00->WG-01 :  Moved unbiased skew section to replace skew
                   estimate, more robust variability estimator, the
                   term variance replaced with variability, clock
                   drift term corrected to clock skew, revision to
                   clock skew section with a place holder, description
                   of parameters.

   02->WG-00    :  Fixed missing 0.5 in 3.3.2 and missing brace in
                   3.3.3

   01->02       :  New section describing improvements to the key
                   metric calculations that help to remove noise,
                   bias, and reduce lag. Some revisions to the
                   notation to make it clearer. Some tightening of
                   the thresholds.

   00->01       :  Revisions to terminology for clarity

11. References

11.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

11.2. Informative References

   [Hayes-LCN14]
              Hayes, D., Ferlin, S., and M. Welzl, "Practical Passive
              Shared Bottleneck Detection using Shape Summary
              Statistics", Proc. the IEEE Local Computer Networks
              (LCN) pp150-158, September 2014.

   [I-D.ietf-avtcore-cc-feedback-message]
              Sarker, Z., Perkins, C., Singh, V., and M. Ramalho, "RTP
              Control Protocol (RTCP) Feedback for Congestion Control",
              draft-ietf-avtcore-cc-feedback-message-00 (work in
              progress), October 2017.

   [I-D.ietf-rmcat-coupled-cc]
              Islam, S., Welzl, M., and S. Gjessing, "Coupled congestion
              control for RTP media", draft-ietf-rmcat-coupled-cc-07
              (work in progress), September 2017.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              DOI 10.17487/RFC4585, July 2006.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
              2008.

   [RFC6817]  Shalunov, S., Hazel, G., Iyengar, J., and M.
              Kuehlewind,
              "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
              DOI 10.17487/RFC6817, December 2012.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [Zhang-Infocom02]
              Zhang, L., Liu, Z., and H. Xia, "Clock synchronization
              algorithms for network measurements", Proc. the IEEE
              International Conference on Computer Communications
              (INFOCOM) pp160-169, September 2002.

Authors' Addresses

   David Hayes (editor)
   Simula Research Laboratory
   P.O. Box 134
   Lysaker 1325
   Norway

   Email: davidh@simula.no

   Simone Ferlin

   Email: simone@ferlin.io

   Michael Welzl
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Email: michawe@ifi.uio.no

   Kristian Hiorth
   University of Oslo
   PO Box 1080 Blindern
   Oslo N-0316
   Norway

   Email: kristahi@ifi.uio.no