idnits 2.17.1 

draft-vpolak-mkonstan-mlrsearch-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The abstract seems to contain references ([RFC2544]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 22, 2018) is 2014 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC8174' is defined on line 459, but no explicit
     reference was found in the text


     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Benchmarking Working Group                       M. Konstantynowicz, Ed.
3	Internet-Draft                                             V. Polak, Ed.
4	Intended status: Informational                             Cisco Systems
5	Expires: April 25, 2019                                 October 22, 2018

7	      Multiple Loss Ratio Search for Packet Throughput (MLRsearch)
8	                   draft-vpolak-mkonstan-mlrsearch-00

10	Abstract

12	   This document proposes changes to [RFC2544], specifically to packet
13	   throughput search methodology, by defining a new search algorithm
14	   referred to as Multiple Loss Ratio search (MLRsearch for short).
15	   Instead of relying on binary search with pre-set starting offered
16	   load, it proposes a novel approach discovering the starting point in
17	   the initial phase, and then searching for packet throughput based on
18	   defined packet loss ratio (PLR) input criteria and defined final
19	   trial duration time.  One of the key design principles behind
20	   MLSsearch is minimizing the total test duration and searching for
21	   multiple packet throughput rates (each with a corresponding PLR)
22	   concurrently, instead of doing it sequentially.

24	   The main motivation behind MLRsearch is the new set of challenges and
25	   requirements posed by NFV (Network Function Virtualization),
26	   specifically software based implementations of NFV data planes.
27	   Using [RFC2544] in the experience of the authors yields often not
28	   repetitive and not replicable end results due to a large number of
29	   factors that are out of scope for this draft.  MLRsearch aims to
30	   address this chalenge and define a common (standard?) way to evaluate
31	   NFV packet throughput performance that takes into account varying
32	   characteristics of NFV systems under test.

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."

49	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

51	   This Internet-Draft will expire on April 25, 2019.

53	Copyright Notice

55	   Copyright (c) 2018 IETF Trust and the persons identified as the
56	   document authors.  All rights reserved.

58	   This document is subject to BCP 78 and the IETF Trust's Legal
59	   Provisions Relating to IETF Documents
60	   (https://trustee.ietf.org/license-info) in effect on the date of
61	   publication of this document.  Please review these documents
62	   carefully, as they describe your rights and restrictions with respect
63	   to this document.  Code Components extracted from this document must
64	   include Simplified BSD License text as described in Section 4.e of
65	   the Trust Legal Provisions and are provided without warranty as
66	   described in the Simplified BSD License.

68	Table of Contents

70	   1.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   2
71	   2.  MLRsearch Background  . . . . . . . . . . . . . . . . . . . .   3
72	   3.  MLRsearch Overview  . . . . . . . . . . . . . . . . . . . . .   3
73	   4.  Sample Implementation . . . . . . . . . . . . . . . . . . . .   5
74	     4.1.  Input Parameters  . . . . . . . . . . . . . . . . . . . .   6
75	     4.2.  Initial phase . . . . . . . . . . . . . . . . . . . . . .   6
76	     4.3.  Non-initial phases  . . . . . . . . . . . . . . . . . . .   7
77	   5.  Known Implementations . . . . . . . . . . . . . . . . . . . .   8
78	     5.1.  FD.io CSIT Implementation Deviations  . . . . . . . . . .   9
79	   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   9
80	   7.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
81	   8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  10
82	   9.  Normative References  . . . . . . . . . . . . . . . . . . . .  10
83	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  10

85	1.  Terminology

87	   o  NDR - Non-Drop Rate, a packet throughput metric with Packet Loss
88	      Ratio equal zero (a zero packet loss), expressed in packets-per-
89	      second (pps).  NDR packet throughput has an associated metric
90	      oftentimes referred to as NDR bandwidth expressed in bits-per-
91	      second (bps), and calculated as a product of:

93	      *  NDR packet rate for specific packet (frame) size, and

95	      *  Packet (L2 frame size) size in bits plus any associated L1
96	         overhead.

98	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

100	   o  PLR - Packet Loss Ratio, a packet loss metric calculated as a
101	      ratio of (packets_transmitted - packets_received) to
102	      packets_transmitted, over the test trial duration.

104	   o  PDR - Partial-Drop Rate, a packet throughput metric with Packet
105	      Loss Ratio greater than zero (a non-zero packet loss), expressed
106	      in packets-per-second (pps).  PDR packet throughput has an
107	      associated metric oftentimes referred to as PDR bandwidth
108	      expressed in bits-per- second (bps), and calculated as a product
109	      of:

111	      *  PDR packet rate for specific packet (frame) size, and

113	      *  Packet (L2 frame size) size in bits plus any associated L1
114	         overhead.

116	2.  MLRsearch Background

118	   Multiple Loss Rate search (MLRsearch) is a packet throughput search
119	   algorithm suitable for deterministic (as opposed to probabilistic)
120	   systems.  MLRsearch discovers multiple packet throughput rates in a
121	   single search, each rate associated with a distinct Packet Loss Ratio
122	   (PLR) criteria.

124	   Two popular names for particular PLR criteria are Non-Drop Rate (NDR,
125	   with PLR=0, zero packet loss) and Partial Drop Rate (PDR, with PLR>0,
126	   non-zero packet loss).  MLRsearch discovers NDR and PDR in a single
127	   search reducing required execution time compared to separate binary
128	   searches for NDR and PDR.  MLRsearch reduces execution time even
129	   further by relying on shorter trial durations of intermediate steps,
130	   with only the final measurements conducted at the specified final
131	   trial duration.  This results in the shorter overall search execution
132	   time when compared to a standard NDR/PDR binary search, while
133	   guaranteeing the same or similar results.  (TODO: Specify "standard"
134	   in the previous sentence.)

136	   If needed, MLRsearch can be easily adopted to discover more
137	   throughput rates with different pre-defined PLRs.

139	   Unless otherwise noted, all throughput rates are _always_ bi-
140	   directional aggregates of two equal (symmetric) uni-directional
141	   packet rates received and reported by an external traffic generator.

143	3.  MLRsearch Overview

145	   The main properties of MLRsearch:

147	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

149	   o  MLRsearch is a duration aware multi-phase multi-rate search
150	      algorithm.

152	      *  Initial phase determines promising starting interval for the
153	         search.

155	      *  Intermediate phases progress towards defined final search
156	         criteria.

158	      *  Final phase executes measurements according to the final search
159	         criteria.

161	   o  *Initial phase*:

163	      *  Uses link rate as a starting transmit rate and discovers the
164	         Maximum Receive Rate (MRR) used as an input to the first
165	         intermediate phase.

167	   o  *Intermediate phases*:

169	      *  Start with initial trial duration (in the first phase) and
170	         converge geometrically towards the final trial duration (in the
171	         final phase).

173	      *  Track two values for NDR and two for PDR.

175	         +  The values are called (NDR or PDR) lower_bound and
176	            upper_bound.

178	         +  Each value comes from a specific trial measurement (most
179	            recent for that transmit rate), and as such the value is
180	            associated with that measurement's duration and loss.

182	         +  A bound can be invalid, for example if NDR lower_bound has
183	            been measured with nonzero loss.

185	         +  Invalid bounds are not real boundaries for the searched
186	            value, but are needed to track interval widths.

188	         +  Valid bounds are real boundaries for the searched value.

190	         +  Each non-initial phase ends with all bounds valid.

192	      *  Start with a large (lower_bound, upper_bound) interval width
193	         and geometrically converge towards the width goal (measurement
194	         resolution) of the phase.  Each phase halves the previous width
195	         goal.

197	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

199	      *  Use internal and external searches:

201	         +  External search - measures at transmit rates outside the
202	            (lower_bound, upper_bound) interval.  Activated when a bound
203	            is invalid, to search for a new valid bound by doubling the
204	            interval width.  It is a variant of "exponential search"_.

206	         +  Internal search - "binary search"_, measures at transmit
207	            rates within the (lower_bound, upper_bound) valid interval,
208	            halving the interval width.

210	   o  *Final phase* is executed with the final test trial duration, and
211	      the final width goal that determines resolution of the overall
212	      search.  Intermediate phases together with the final phase are
213	      called non-initial phases.

215	   The main benefits of MLRsearch vs. binary search include:

217	   o  In general MLRsearch is likely to execute more search trials
218	      overall, but less trials at a set final duration.

220	   o  In well behaving cases it greatly reduces (>50%) the overall
221	      duration compared to a single PDR (or NDR) binary search duration,
222	      while finding multiple drop rates.

224	   o  In all cases MLRsearch yields the same or similar results to
225	      binary search.

227	   o  Note: both binary search and MLRsearch are susceptible to
228	      reporting non-repeatable results across multiple runs for very bad
229	      behaving cases.

231	   Caveats:

233	   o  Worst case MLRsearch can take longer than a binary search e.g. in
234	      case of drastic changes in behaviour for trials at varying
235	      durations.

237	4.  Sample Implementation

239	   Following is a brief description of a sample MLRsearch implementation
240	   based on the open-source code running in FD.io CSIT project as part
241	   of a Continuous Integration / Continuous Development (CI/CD)
242	   framework.

244	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

246	4.1.  Input Parameters

248	   1.  *maximum_transmit_rate* - maximum packet transmit rate to be used
249	       by external traffic generator, limited by either the actual
250	       Ethernet link rate or traffic generator NIC model capabilities.
251	       Sample defaults: 2 * 14.88 Mpps for 64B 10GE link rate, 2 * 18.75
252	       Mpps for 64B 40GE NIC maximum rate.

254	   2.  *minimum_transmit_rate* - minimum packet transmit rate to be used
255	       for measurements.  MLRsearch fails if lower transmit rate needs
256	       to be used to meet search criteria.  Default: 2 * 10 kpps (could
257	       be higher).

259	   3.  *final_trial_duration* - required trial duration for final rate
260	       measurements.  Default: 30 sec.

262	   4.  *initial_trial_duration* - trial duration for initial MLRsearch
263	       phase.  Default: 1 sec.

265	   5.  *final_relative_width* - required measurement resolution
266	       expressed as (lower_bound, upper_bound) interval width relative
267	       to upper_bound.  Default: 0.5%.

269	   6.  *packet_loss_ratio* - maximum acceptable PLR search criteria for
270	       PDR measurements.  Default: 0.5%.

272	   7.  *number_of_intermediate_phases* - number of phases between the
273	       initial phase and the final phase.  Impacts the overall MLRsearch
274	       duration.  Less phases are required for well behaving cases, more
275	       phases may be needed to reduce the overall search duration for
276	       worse behaving cases.  Default (2).  (Value chosen based on
277	       limited experimentation to date.  More experimentation needed to
278	       arrive to clearer guidelines.)

280	4.2.  Initial phase

282	   1.  First trial measures at maximum rate and discovers MRR.  a. _in_:
283	       trial_duration = initial_trial_duration.  b. _in_:
284	       offered_transmit_rate = maximum_transmit_rate.  c. _do_: single
285	       trial.  d. _out_: measured loss ratio.  e. _out_: mrr = measured
286	       receive rate.

288	   2.  Second trial measures at MRR and discovers MRR2.  a. _in_:
289	       trial_duration = initial_trial_duration.  b. _in_:
290	       offered_transmit_rate = MRR.  c. _do_: single trial.  d. _out_:
291	       measured loss ratio.  e. _out_: mrr2 = measured receive rate.

293	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

295	   3.  Third trial measures at MRR2.  a. _in_: trial_duration =
296	       initial_trial_duration.  b. _in_: offered_transmit_rate = MRR2.
297	       c. _do_: single trial.  d. _out_: measured loss ratio.

299	4.3.  Non-initial phases

301	   1.  Main loop: a. _in_: trial_duration for the current phase.  Set to
302	       initial_trial_duration for the first intermediate phase; to
303	       final_trial_duration for the final phase; or to the element of
304	       interpolating geometric sequence for other intermediate phases.
305	       For example with two intermediate phases, trial_duration of the
306	       second intermediate phase is the geometric average of
307	       initial_strial_duration and final_trial_duration.  b. _in_:
308	       relative_width_goal for the current phase.  Set to
309	       final_relative_width for the final phase; doubled for each
310	       preceding phase.  For example with two intermediate phases, the
311	       first intermediate phase uses quadruple of final_relative_width
312	       and the second intermediate phase uses double of
313	       final_relative_width.  c. _in_: ndr_interval, pdr_interval from
314	       the previous main loop iteration or the previous phase.  If the
315	       previous phase is the initial phase, both intervals have
316	       lower_bound = MRR2, uper_bound = MRR.  Note that the initial
317	       phase is likely to create intervals with invalid bounds.  d.
318	       _do_: According to the procedure described in point 2, either
319	       exit the phase (by jumping to 1.g.), or prepare new transmit rate
320	       to measure with.  e. _do_: Perform the trial measurement at the
321	       new transmit rate and trial_duration, compute its loss ratio.  f.
322	       _do_: Update the bounds of both intervals, based on the new
323	       measurement.  The actual update rules are numerous, as NDR
324	       external search can affect PDR interval and vice versa, but the
325	       result agrees with rules of both internal and external search.
326	       For example, any new measurement below an invalid lower_bound
327	       becomes the new lower_bound, while the old measurement
328	       (previously acting as the invalid lower_bound) becomes a new and
329	       valid upper_bound.  Go to next iteration (1.c.), taking the
330	       updated intervals as new input.  g. _out_: current ndr_interval
331	       and pdr_interval.  In the final phase this is also considered to
332	       be the result of the whole search.  For other phases, the next
333	       phase loop is started with the current results as an input.

335	   2.  New transmit rate (or exit) calculation (for 1.d.):

337	       *  If there is an invalid bound then prepare for external search:

339	          +  _If_ the most recent measurement at NDR lower_bound
340	             transmit rate had the loss higher than zero, then the new
341	             transmit rate is NDR lower_bound decreased by two NDR
342	             interval widths.

344	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

346	          +  Else, _if_ the most recent measurement at PDR lower_bound
347	             transmit rate had the loss higher than PLR, then the new
348	             transmit rate is PDR lower_bound decreased by two PDR
349	             interval widths.

351	          +  Else, _if_ the most recent measurement at NDR upper_bound
352	             transmit rate had no loss, then the new transmit rate is
353	             NDR upper_bound increased by two NDR interval widths.

355	          +  Else, _if_ the most recent measurement at PDR upper_bound
356	             transmit rate had the loss lower or equal to PLR, then the
357	             new transmit rate is PDR upper_bound increased by two PDR
358	             interval widths.

360	       *  If interval width is higher than the current phase goal:

362	          +  Else, _if_ NDR interval does not meet the current phase
363	             width goal, prepare for internal search.  The new transmit
364	             rate is (NDR lower bound + NDR upper bound) / 2.

366	          +  Else, _if_ PDR interval does not meet the current phase
367	             width goal, prepare for internal search.  The new transmit
368	             rate is (PDR lower bound + PDR upper bound) / 2.

370	       *  Else, _if_ some bound has still only been measured at a lower
371	          duration, prepare to re-measure at the current duration (and
372	          the same transmit rate).  The order of priorities is:

374	          +  NDR lower_bound,

376	          +  PDR lower_bound,

378	          +  NDR upper_bound,

380	          +  PDR upper_bound.

382	       *  _Else_, do not prepare any new rate, to exit the phase.  This
383	          ensures that at the end of each non-initial phase all
384	          intervals are valid, narrow enough, and measured at current
385	          phase trial duration.

387	5.  Known Implementations

389	   The only known working implementatin of MLRsearch is in Linux
390	   Foundation FD.io CSIT project. https://wiki.fd.io/view/CSIT.
391	   https://git.fd.io/csit/.

393	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

395	5.1.  FD.io CSIT Implementation Deviations

397	   This document so far has been describing a simplified version of
398	   MLRsearch algorithm.  The full algorithm as implemented contains
399	   additional logic, which makes some of the details (but not general
400	   ideas) above incorrect.  Here is a short description of the
401	   additional logic as a list of principles, explaining their main
402	   differences from (or additions to) the simplified description, but
403	   without detailing their mutual interaction.

405	   1.  _Logarithmic transmit rate._ In order to better fit the relative
406	       width goal, the interval doubling and halving is done
407	       differently.  For example, the middle of 2 and 8 is 4, not 5.

409	   2.  _Optimistic maximum rate._ The increased rate is never higher
410	       than the maximum rate.  Upper bound at that rate is always
411	       considered valid.

413	   3.  _Pessimistic minimum rate._ The decreased rate is never lower
414	       than the minimum rate.  If a lower bound at that rate is invalid,
415	       a phase stops refining the interval further (until it gets re-
416	       measured).

418	   4.  _Conservative interval updates._ Measurements above current upper
419	       bound never update a valid upper bound, even if drop ratio is
420	       low.  Measurements below current lower bound always update any
421	       lower bound if drop ratio is high.

423	   5.  _Ensure sufficient interval width._ Narrow intervals make
424	       external search take more time to find a valid bound.  If the new
425	       transmit increased or decreased rate would result in width less
426	       than the current goal, increase/decrease more.  This can happen
427	       if the measurement for the other interval makes the current
428	       interval too narrow.  Similarly, take care the measurements in
429	       the initial phase create wide enough interval.

431	   6.  _Timeout for bad cases._ The worst case for MLRsearch is when
432	       each phase converges to intervals way different than the results
433	       of the previous phase.  Rather than suffer total search time
434	       several times larger than pure binary search, the implemented
435	       tests fail themselves when the search takes too long (given by
436	       argument _timeout_).

438	6.  IANA Considerations

440	   ..

442	Internet-DrafMultiple Loss Ratio Search for Packet Throughp October 2018

444	7.  Security Considerations

446	   ..

448	8.  Acknowledgements

450	   ..

452	9.  Normative References

454	   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
455	              Network Interconnect Devices", RFC 2544,
456	              DOI 10.17487/RFC2544, March 1999,
457	              <https://www.rfc-editor.org/info/rfc2544>.

459	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
460	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
461	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

463	Authors' Addresses

465	   Maciek Konstantynowicz (editor)
466	   Cisco Systems

468	   Email: mkonstan@cisco.com

470	   Vratko Polak (editor)
471	   Cisco Systems

473	   Email: vrpolak@cisco.com