idnits 2.17.1 draft-vpolak-mkonstan-mlrsearch-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Introduction section. ** The abstract seems to contain references ([RFC2544]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 22, 2018) is 2014 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'RFC8174' is defined on line 459, but no explicit reference was found in the text Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Benchmarking Working Group M. Konstantynowicz, Ed. 3 Internet-Draft V. Polak, Ed. 4 Intended status: Informational Cisco Systems 5 Expires: April 25, 2019 October 22, 2018 7 Multiple Loss Ratio Search for Packet Throughput (MLRsearch) 8 draft-vpolak-mkonstan-mlrsearch-00 10 Abstract 12 This document proposes changes to [RFC2544], specifically to packet 13 throughput search methodology, by defining a new search algorithm 14 referred to as Multiple Loss Ratio search (MLRsearch for short). 15 Instead of relying on binary search with pre-set starting offered 16 load, it proposes a novel approach discovering the starting point in 17 the initial phase, and then searching for packet throughput based on 18 defined packet loss ratio (PLR) input criteria and defined final 19 trial duration time. One of the key design principles behind 20 MLSsearch is minimizing the total test duration and searching for 21 multiple packet throughput rates (each with a corresponding PLR) 22 concurrently, instead of doing it sequentially. 24 The main motivation behind MLRsearch is the new set of challenges and 25 requirements posed by NFV (Network Function Virtualization), 26 specifically software based implementations of NFV data planes. 27 Using [RFC2544] in the experience of the authors yields often not 28 repetitive and not replicable end results due to a large number of 29 factors that are out of scope for this draft. MLRsearch aims to 30 address this chalenge and define a common (standard?) way to evaluate 31 NFV packet throughput performance that takes into account varying 32 characteristics of NFV systems under test. 34 Status of This Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current Internet- 42 Drafts is at https://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 49 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 51 This Internet-Draft will expire on April 25, 2019. 53 Copyright Notice 55 Copyright (c) 2018 IETF Trust and the persons identified as the 56 document authors. All rights reserved. 58 This document is subject to BCP 78 and the IETF Trust's Legal 59 Provisions Relating to IETF Documents 60 (https://trustee.ietf.org/license-info) in effect on the date of 61 publication of this document. Please review these documents 62 carefully, as they describe your rights and restrictions with respect 63 to this document. Code Components extracted from this document must 64 include Simplified BSD License text as described in Section 4.e of 65 the Trust Legal Provisions and are provided without warranty as 66 described in the Simplified BSD License. 68 Table of Contents 70 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 2 71 2. MLRsearch Background . . . . . . . . . . . . . . . . . . . . 3 72 3. MLRsearch Overview . . . . . . . . . . . . . . . . . . . . . 3 73 4. Sample Implementation . . . . . . . . . . . . . . . . . . . . 5 74 4.1. Input Parameters . . . . . . . . . . . . . . . . . . . . 6 75 4.2. Initial phase . . . . . . . . . . . . . . . . . . . . . . 6 76 4.3. Non-initial phases . . . . . . . . . . . . . . . . . . . 7 77 5. Known Implementations . . . . . . . . . . . . . . . . . . . . 8 78 5.1. FD.io CSIT Implementation Deviations . . . . . . . . . . 9 79 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9 80 7. Security Considerations . . . . . . . . . . . . . . . . . . . 10 81 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 10 82 9. Normative References . . . . . . . . . . . . . . . . . . . . 10 83 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10 85 1. Terminology 87 o NDR - Non-Drop Rate, a packet throughput metric with Packet Loss 88 Ratio equal zero (a zero packet loss), expressed in packets-per- 89 second (pps). NDR packet throughput has an associated metric 90 oftentimes referred to as NDR bandwidth expressed in bits-per- 91 second (bps), and calculated as a product of: 93 * NDR packet rate for specific packet (frame) size, and 95 * Packet (L2 frame size) size in bits plus any associated L1 96 overhead. 98 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 100 o PLR - Packet Loss Ratio, a packet loss metric calculated as a 101 ratio of (packets_transmitted - packets_received) to 102 packets_transmitted, over the test trial duration. 104 o PDR - Partial-Drop Rate, a packet throughput metric with Packet 105 Loss Ratio greater than zero (a non-zero packet loss), expressed 106 in packets-per-second (pps). PDR packet throughput has an 107 associated metric oftentimes referred to as PDR bandwidth 108 expressed in bits-per- second (bps), and calculated as a product 109 of: 111 * PDR packet rate for specific packet (frame) size, and 113 * Packet (L2 frame size) size in bits plus any associated L1 114 overhead. 116 2. MLRsearch Background 118 Multiple Loss Rate search (MLRsearch) is a packet throughput search 119 algorithm suitable for deterministic (as opposed to probabilistic) 120 systems. MLRsearch discovers multiple packet throughput rates in a 121 single search, each rate associated with a distinct Packet Loss Ratio 122 (PLR) criteria. 124 Two popular names for particular PLR criteria are Non-Drop Rate (NDR, 125 with PLR=0, zero packet loss) and Partial Drop Rate (PDR, with PLR>0, 126 non-zero packet loss). MLRsearch discovers NDR and PDR in a single 127 search reducing required execution time compared to separate binary 128 searches for NDR and PDR. MLRsearch reduces execution time even 129 further by relying on shorter trial durations of intermediate steps, 130 with only the final measurements conducted at the specified final 131 trial duration. This results in the shorter overall search execution 132 time when compared to a standard NDR/PDR binary search, while 133 guaranteeing the same or similar results. (TODO: Specify "standard" 134 in the previous sentence.) 136 If needed, MLRsearch can be easily adopted to discover more 137 throughput rates with different pre-defined PLRs. 139 Unless otherwise noted, all throughput rates are _always_ bi- 140 directional aggregates of two equal (symmetric) uni-directional 141 packet rates received and reported by an external traffic generator. 143 3. MLRsearch Overview 145 The main properties of MLRsearch: 147 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 149 o MLRsearch is a duration aware multi-phase multi-rate search 150 algorithm. 152 * Initial phase determines promising starting interval for the 153 search. 155 * Intermediate phases progress towards defined final search 156 criteria. 158 * Final phase executes measurements according to the final search 159 criteria. 161 o *Initial phase*: 163 * Uses link rate as a starting transmit rate and discovers the 164 Maximum Receive Rate (MRR) used as an input to the first 165 intermediate phase. 167 o *Intermediate phases*: 169 * Start with initial trial duration (in the first phase) and 170 converge geometrically towards the final trial duration (in the 171 final phase). 173 * Track two values for NDR and two for PDR. 175 + The values are called (NDR or PDR) lower_bound and 176 upper_bound. 178 + Each value comes from a specific trial measurement (most 179 recent for that transmit rate), and as such the value is 180 associated with that measurement's duration and loss. 182 + A bound can be invalid, for example if NDR lower_bound has 183 been measured with nonzero loss. 185 + Invalid bounds are not real boundaries for the searched 186 value, but are needed to track interval widths. 188 + Valid bounds are real boundaries for the searched value. 190 + Each non-initial phase ends with all bounds valid. 192 * Start with a large (lower_bound, upper_bound) interval width 193 and geometrically converge towards the width goal (measurement 194 resolution) of the phase. Each phase halves the previous width 195 goal. 197 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 199 * Use internal and external searches: 201 + External search - measures at transmit rates outside the 202 (lower_bound, upper_bound) interval. Activated when a bound 203 is invalid, to search for a new valid bound by doubling the 204 interval width. It is a variant of "exponential search"_. 206 + Internal search - "binary search"_, measures at transmit 207 rates within the (lower_bound, upper_bound) valid interval, 208 halving the interval width. 210 o *Final phase* is executed with the final test trial duration, and 211 the final width goal that determines resolution of the overall 212 search. Intermediate phases together with the final phase are 213 called non-initial phases. 215 The main benefits of MLRsearch vs. binary search include: 217 o In general MLRsearch is likely to execute more search trials 218 overall, but less trials at a set final duration. 220 o In well behaving cases it greatly reduces (>50%) the overall 221 duration compared to a single PDR (or NDR) binary search duration, 222 while finding multiple drop rates. 224 o In all cases MLRsearch yields the same or similar results to 225 binary search. 227 o Note: both binary search and MLRsearch are susceptible to 228 reporting non-repeatable results across multiple runs for very bad 229 behaving cases. 231 Caveats: 233 o Worst case MLRsearch can take longer than a binary search e.g. in 234 case of drastic changes in behaviour for trials at varying 235 durations. 237 4. Sample Implementation 239 Following is a brief description of a sample MLRsearch implementation 240 based on the open-source code running in FD.io CSIT project as part 241 of a Continuous Integration / Continuous Development (CI/CD) 242 framework. 244 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 246 4.1. Input Parameters 248 1. *maximum_transmit_rate* - maximum packet transmit rate to be used 249 by external traffic generator, limited by either the actual 250 Ethernet link rate or traffic generator NIC model capabilities. 251 Sample defaults: 2 * 14.88 Mpps for 64B 10GE link rate, 2 * 18.75 252 Mpps for 64B 40GE NIC maximum rate. 254 2. *minimum_transmit_rate* - minimum packet transmit rate to be used 255 for measurements. MLRsearch fails if lower transmit rate needs 256 to be used to meet search criteria. Default: 2 * 10 kpps (could 257 be higher). 259 3. *final_trial_duration* - required trial duration for final rate 260 measurements. Default: 30 sec. 262 4. *initial_trial_duration* - trial duration for initial MLRsearch 263 phase. Default: 1 sec. 265 5. *final_relative_width* - required measurement resolution 266 expressed as (lower_bound, upper_bound) interval width relative 267 to upper_bound. Default: 0.5%. 269 6. *packet_loss_ratio* - maximum acceptable PLR search criteria for 270 PDR measurements. Default: 0.5%. 272 7. *number_of_intermediate_phases* - number of phases between the 273 initial phase and the final phase. Impacts the overall MLRsearch 274 duration. Less phases are required for well behaving cases, more 275 phases may be needed to reduce the overall search duration for 276 worse behaving cases. Default (2). (Value chosen based on 277 limited experimentation to date. More experimentation needed to 278 arrive to clearer guidelines.) 280 4.2. Initial phase 282 1. First trial measures at maximum rate and discovers MRR. a. _in_: 283 trial_duration = initial_trial_duration. b. _in_: 284 offered_transmit_rate = maximum_transmit_rate. c. _do_: single 285 trial. d. _out_: measured loss ratio. e. _out_: mrr = measured 286 receive rate. 288 2. Second trial measures at MRR and discovers MRR2. a. _in_: 289 trial_duration = initial_trial_duration. b. _in_: 290 offered_transmit_rate = MRR. c. _do_: single trial. d. _out_: 291 measured loss ratio. e. _out_: mrr2 = measured receive rate. 293 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 295 3. Third trial measures at MRR2. a. _in_: trial_duration = 296 initial_trial_duration. b. _in_: offered_transmit_rate = MRR2. 297 c. _do_: single trial. d. _out_: measured loss ratio. 299 4.3. Non-initial phases 301 1. Main loop: a. _in_: trial_duration for the current phase. Set to 302 initial_trial_duration for the first intermediate phase; to 303 final_trial_duration for the final phase; or to the element of 304 interpolating geometric sequence for other intermediate phases. 305 For example with two intermediate phases, trial_duration of the 306 second intermediate phase is the geometric average of 307 initial_strial_duration and final_trial_duration. b. _in_: 308 relative_width_goal for the current phase. Set to 309 final_relative_width for the final phase; doubled for each 310 preceding phase. For example with two intermediate phases, the 311 first intermediate phase uses quadruple of final_relative_width 312 and the second intermediate phase uses double of 313 final_relative_width. c. _in_: ndr_interval, pdr_interval from 314 the previous main loop iteration or the previous phase. If the 315 previous phase is the initial phase, both intervals have 316 lower_bound = MRR2, uper_bound = MRR. Note that the initial 317 phase is likely to create intervals with invalid bounds. d. 318 _do_: According to the procedure described in point 2, either 319 exit the phase (by jumping to 1.g.), or prepare new transmit rate 320 to measure with. e. _do_: Perform the trial measurement at the 321 new transmit rate and trial_duration, compute its loss ratio. f. 322 _do_: Update the bounds of both intervals, based on the new 323 measurement. The actual update rules are numerous, as NDR 324 external search can affect PDR interval and vice versa, but the 325 result agrees with rules of both internal and external search. 326 For example, any new measurement below an invalid lower_bound 327 becomes the new lower_bound, while the old measurement 328 (previously acting as the invalid lower_bound) becomes a new and 329 valid upper_bound. Go to next iteration (1.c.), taking the 330 updated intervals as new input. g. _out_: current ndr_interval 331 and pdr_interval. In the final phase this is also considered to 332 be the result of the whole search. For other phases, the next 333 phase loop is started with the current results as an input. 335 2. New transmit rate (or exit) calculation (for 1.d.): 337 * If there is an invalid bound then prepare for external search: 339 + _If_ the most recent measurement at NDR lower_bound 340 transmit rate had the loss higher than zero, then the new 341 transmit rate is NDR lower_bound decreased by two NDR 342 interval widths. 344 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 346 + Else, _if_ the most recent measurement at PDR lower_bound 347 transmit rate had the loss higher than PLR, then the new 348 transmit rate is PDR lower_bound decreased by two PDR 349 interval widths. 351 + Else, _if_ the most recent measurement at NDR upper_bound 352 transmit rate had no loss, then the new transmit rate is 353 NDR upper_bound increased by two NDR interval widths. 355 + Else, _if_ the most recent measurement at PDR upper_bound 356 transmit rate had the loss lower or equal to PLR, then the 357 new transmit rate is PDR upper_bound increased by two PDR 358 interval widths. 360 * If interval width is higher than the current phase goal: 362 + Else, _if_ NDR interval does not meet the current phase 363 width goal, prepare for internal search. The new transmit 364 rate is (NDR lower bound + NDR upper bound) / 2. 366 + Else, _if_ PDR interval does not meet the current phase 367 width goal, prepare for internal search. The new transmit 368 rate is (PDR lower bound + PDR upper bound) / 2. 370 * Else, _if_ some bound has still only been measured at a lower 371 duration, prepare to re-measure at the current duration (and 372 the same transmit rate). The order of priorities is: 374 + NDR lower_bound, 376 + PDR lower_bound, 378 + NDR upper_bound, 380 + PDR upper_bound. 382 * _Else_, do not prepare any new rate, to exit the phase. This 383 ensures that at the end of each non-initial phase all 384 intervals are valid, narrow enough, and measured at current 385 phase trial duration. 387 5. Known Implementations 389 The only known working implementatin of MLRsearch is in Linux 390 Foundation FD.io CSIT project. https://wiki.fd.io/view/CSIT. 391 https://git.fd.io/csit/. 393 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 395 5.1. FD.io CSIT Implementation Deviations 397 This document so far has been describing a simplified version of 398 MLRsearch algorithm. The full algorithm as implemented contains 399 additional logic, which makes some of the details (but not general 400 ideas) above incorrect. Here is a short description of the 401 additional logic as a list of principles, explaining their main 402 differences from (or additions to) the simplified description, but 403 without detailing their mutual interaction. 405 1. _Logarithmic transmit rate._ In order to better fit the relative 406 width goal, the interval doubling and halving is done 407 differently. For example, the middle of 2 and 8 is 4, not 5. 409 2. _Optimistic maximum rate._ The increased rate is never higher 410 than the maximum rate. Upper bound at that rate is always 411 considered valid. 413 3. _Pessimistic minimum rate._ The decreased rate is never lower 414 than the minimum rate. If a lower bound at that rate is invalid, 415 a phase stops refining the interval further (until it gets re- 416 measured). 418 4. _Conservative interval updates._ Measurements above current upper 419 bound never update a valid upper bound, even if drop ratio is 420 low. Measurements below current lower bound always update any 421 lower bound if drop ratio is high. 423 5. _Ensure sufficient interval width._ Narrow intervals make 424 external search take more time to find a valid bound. If the new 425 transmit increased or decreased rate would result in width less 426 than the current goal, increase/decrease more. This can happen 427 if the measurement for the other interval makes the current 428 interval too narrow. Similarly, take care the measurements in 429 the initial phase create wide enough interval. 431 6. _Timeout for bad cases._ The worst case for MLRsearch is when 432 each phase converges to intervals way different than the results 433 of the previous phase. Rather than suffer total search time 434 several times larger than pure binary search, the implemented 435 tests fail themselves when the search takes too long (given by 436 argument _timeout_). 438 6. IANA Considerations 440 .. 442 Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018 444 7. Security Considerations 446 .. 448 8. Acknowledgements 450 .. 452 9. Normative References 454 [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for 455 Network Interconnect Devices", RFC 2544, 456 DOI 10.17487/RFC2544, March 1999, 457 . 459 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 460 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 461 May 2017, . 463 Authors' Addresses 465 Maciek Konstantynowicz (editor) 466 Cisco Systems 468 Email: mkonstan@cisco.com 470 Vratko Polak (editor) 471 Cisco Systems 473 Email: vrpolak@cisco.com