idnits 2.17.1 draft-vpolak-plrsearch-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The abstract seems to contain references ([RFC2544]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 22, 2018) is 2013 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC8174' is defined on line 269, but no explicit reference was found in the text Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Benchmarking Working Group M. Konstantynowicz, Ed. 3 Internet-Draft V. Polak, Ed. 4 Intended status: Informational Cisco Systems 5 Expires: April 25, 2019 October 22, 2018 7 Probabilistic Loss Ratio Search for Packet Throughput (PLRsearch) 8 draft-vpolak-plrsearch-00 10 Abstract 12 This document addresses challenges while applying methodologies 13 described in [RFC2544] to benchmarking NFV (Network Function 14 Virtualization) over an extended period of time, sometimes referred 15 to as "soak testing". More specifically to benchmarking software 16 based implementations of NFV data planes. Packet throughput search 17 approach proposed by this document assumes that system under test is 18 probabilistic in nature, and not deterministic. 20 Status of This Memo 22 This Internet-Draft is submitted in full conformance with the 23 provisions of BCP 78 and BCP 79. 25 Internet-Drafts are working documents of the Internet Engineering 26 Task Force (IETF). Note that other groups may also distribute 27 working documents as Internet-Drafts. The list of current Internet- 28 Drafts is at https://datatracker.ietf.org/drafts/current/. 30 Internet-Drafts are draft documents valid for a maximum of six months 31 and may be updated, replaced, or obsoleted by other documents at any 32 time. It is inappropriate to use Internet-Drafts as reference 33 material or to cite them other than as "work in progress." 35 This Internet-Draft will expire on April 25, 2019. 37 Copyright Notice 39 Copyright (c) 2018 IETF Trust and the persons identified as the 40 document authors. All rights reserved. 42 This document is subject to BCP 78 and the IETF Trust's Legal 43 Provisions Relating to IETF Documents 44 (https://trustee.ietf.org/license-info) in effect on the date of 45 publication of this document. Please review these documents 46 carefully, as they describe your rights and restrictions with respect 47 to this document. Code Components extracted from this document must 48 include Simplified BSD License text as described in Section 4.e of 50 Internet-DrafProbabilistic Loss Ratio Search for Packet Thr October 2018 52 the Trust Legal Provisions and are provided without warranty as 53 described in the Simplified BSD License. 55 Table of Contents 57 1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 2 58 2. Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 59 3. Poisson Distribution . . . . . . . . . . . . . . . . . . . . 3 60 4. Fitting Function Coefficients Distribution . . . . . . . . . 4 61 5. Algorithm Formulas . . . . . . . . . . . . . . . . . . . . . 5 62 5.1. Integration . . . . . . . . . . . . . . . . . . . . . . . 5 63 5.2. Optimizations . . . . . . . . . . . . . . . . . . . . . . 5 64 6. Known Implementations . . . . . . . . . . . . . . . . . . . . 5 65 6.1. FD.io CSIT Implementation Specifics . . . . . . . . . . . 5 66 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 5 67 8. Security Considerations . . . . . . . . . . . . . . . . . . . 6 68 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 6 69 10. Normative References . . . . . . . . . . . . . . . . . . . . 6 70 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 6 72 1. Motivation 74 Network providers are interested in throughput a device can sustain. 76 RFC 2544 assumes loss ratio is given by a deterministic function of 77 offered load. But NFV software devices are not deterministic 78 (enough). This leads for deterministic algorithms (such as MLRsearch 79 with single trial) to return results, which when repeated show 80 relatively high standard deviation, thus making it harder to tell 81 what "the throughput" actually is. 83 We need another algorithm, which takes this indeterminism into 84 account. 86 2. Model 88 Each algorithm searches for an answer to a precisely formulated 89 question. When the question involves indeterministic systems, it has 90 to specify probabilities (or prior distributions) which are tied to a 91 specific probabilistic model. Different models will have different 92 number (and meaning) of parameters. Complicated (but more realistic) 93 models have many parameters, and the math involved can be very 94 complicated. It is better to start with simpler probabilistic model, 95 and only change it when the output of the simpler algorithm is not 96 stable or useful enough. 98 TODO: Refer to packet forwarding terminology, such as "offered load" 99 and "loss ratio". 101 Internet-DrafProbabilistic Loss Ratio Search for Packet Thr October 2018 103 TODO: Mention that no packet duplication is expected (or is filtered 104 out). 106 TODO: Define critical load and critical region earlier. 108 This document is focused on algorithms related to packet loss count 109 only. No latency (or other information) is taken into account. For 110 simplicity, only one type of measurement is considered: dynamically 111 computed offered load, constant within trial measurement of 112 predetermined trial duration. 114 Also, running longer trials (in some situations) could be more 115 efficient, but in order to perform trial at multiple offered loads 116 withing critical region, trial durations should be kept as short as 117 possible. 119 3. Poisson Distribution 121 TODO: Give link to more officially published literature about Poisson 122 distribution. 124 Note-1: that the algorithm makes an assumption that packet traffic 125 generator detects duplicate packets on receive detection, and reports 126 this as an error. 128 Note-2: Binomial distribution is a better fit compared to Poisson 129 distribution (acknowledging that the number of packets lost cannot be 130 higher than the number of packets offered), but the difference tends 131 to be relevant in loads far above the critical region, so using 132 Poisson distribution helps the algorithm focus on critical region 133 better. 135 When comparing different offered loads, the average loss per second 136 is assumed to increase, but the (deterministic) function from offered 137 load into average loss rate is otherwise unknown. 139 Given a loss target (configurable, by default one packet lost per 140 second), there is an unknown offered load when the average is exactly 141 that. We call that the "critical load". If critical load seems 142 higher than maximum offerable load, we should use the maximum 143 offerable load to make search output more stable. 145 Of course, there are great many increasing functions. The offered 146 load has to be chosen for each trial, and the computed posterior 147 distribution of critical load can change with each trial result. 149 To make the space of possible functions more tractable, some other 150 simplifying assumption is needed. As the algorithm will be examining 152 Internet-DrafProbabilistic Loss Ratio Search for Packet Thr October 2018 154 (also) loads close to the critical load, linear approximation to the 155 function (TODO: name the function) in the critical region is 156 important. But as the search algorithm needs to evaluate the 157 function also far away from the critical region, the approximate 158 function has to be well- behaved for every positive offered load, 159 specifically it cannot predict non-positive packet loss rate. 161 Within this document, "fitting function" is the name for such well- 162 behaved function which approximates the unknown function in the 163 critical region. 165 Results from trials far from the critical region are likely to affect 166 the critical rate estimate negatively, as the fitting function does 167 not need to be a good approximation there. Instead of discarding 168 some results, or "suppressing" their impact with ad-hoc methods 169 (other than using Poisson distribution instead of binomial) is not 170 used, as such methods tend to make the overall search unstable. We 171 rely on most of measurements being done (eventually) within the 172 critical region, and overweighting far-off measurements (eventually) 173 for well-behaved fitting functions. 175 4. Fitting Function Coefficients Distribution 177 To accomodate systems with different behaviours, the fitting function 178 is expected to have few numeric parameters affecting its shape 179 (mainly affecting the linear approximation in the critical region). 181 The general search algorithm can use whatever increasing fitting 182 function, some specific functions can be described later. 184 TODO: Describe sigmoid-based and erf-based functions. 186 It is up to implementer to chose a fitting function and prior 187 distribution of its parameters. The rest of this document assumes 188 each parameter is independently and uniformly distributed over common 189 interval. Implementers are to add non-linear transformations into 190 their fitting functions if their prior is different. 192 TODO: Move the following sentence into more appropriate place. 194 Speaking about new trials, each next trial will be done at offered 195 load equal to the current average of the critical load. 197 Exit condition is either critical load stdev becoming small enough, 198 or overal search time becoming long enough. 200 The algorithm should report both avg and stdev for critical load. If 201 the reported averages follow a trend (without reaching equilibrium), 203 Internet-DrafProbabilistic Loss Ratio Search for Packet Thr October 2018 205 avg and stdev should refer to the equilibrium estibated based on the 206 trend, not to immediate posterior values. 208 TODO: Explicitly mention the iterative character of the search. 210 5. Algorithm Formulas 212 5.1. Integration 214 The posterior distributions for fitting function parameters will not 215 be integrable in general. 217 The search algorithm utilises the fact that trial measurement takes 218 some time, so this time can be used for numeric integration (using 219 suitable method, such as Monte Carlo) to achieve sufficient 220 precision. 222 5.2. Optimizations 224 After enough trials, the posterior distribution will be concentrated 225 in a narrow area of parameter space. The integration method could 226 take advantage of that. 228 Even in the concentrated area, the likelihood can be quite small, so 229 the integration algorithm should track the logarithm of the 230 likelihood, and also avoid underflow errors bu ther means. 232 6. Known Implementations 234 The only known working implementatin of Probabilistic Loss Ratio 235 Search for Packet Throughput is in Linux Foundation FD.io CSIT 236 project. https://wiki.fd.io/view/CSIT. https://git.fd.io/csit/. 238 6.1. FD.io CSIT Implementation Specifics 240 In a sample implemenation in FD.io CSIT project, there is around 0.5 241 second delay between trials due to restrictons imposed by packet 242 traffic generator in use (T-Rex), avoiding that delay is out of scope 243 of this document. 245 TODO: Describe how the current integration algorithm finds the 246 concentrated area. 248 7. IANA Considerations 250 .. 252 Internet-DrafProbabilistic Loss Ratio Search for Packet Thr October 2018 254 8. Security Considerations 256 .. 258 9. Acknowledgements 260 .. 262 10. Normative References 264 [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for 265 Network Interconnect Devices", RFC 2544, 266 DOI 10.17487/RFC2544, March 1999, 267 . 269 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 270 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 271 May 2017, . 273 Authors' Addresses 275 Maciek Konstantynowicz (editor) 276 Cisco Systems 278 Email: mkonstan@cisco.com 280 Vratko Polak (editor) 281 Cisco Systems 283 Email: vrpolak@cisco.com