idnits 2.17.1 

draft-ietf-bmwg-b2b-frame-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC2544, but the
     abstract doesn't seem to directly say this.  It does mention RFC2544
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
     (Using the creation date from RFC2544, updated by this document, for
     RFC5378 checks: 1999-03-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (December 18, 2020) is 1215 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Obsolete informational reference (is this intentional?): RFC 1944
     (Obsoleted by RFC 2544)


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Working Group                                          A. Morton
3	Internet-Draft                                                 AT&T Labs
4	Updates: 2544 (if approved)                            December 18, 2020
5	Intended status: Informational
6	Expires: June 21, 2021

8	        Updates for the Back-to-back Frame Benchmark in RFC 2544
9	                      draft-ietf-bmwg-b2b-frame-04

11	Abstract

13	   Fundamental Benchmarking Methodologies for Network Interconnect
14	   Devices of interest to the IETF are defined in RFC 2544.  This memo
15	   updates the procedures of the test to measure the Back-to-back frames
16	   Benchmark of RFC 2544, based on further experience.

18	   This memo updates Section 26.4 of RFC 2544.

20	Requirements Language

22	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
23	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
24	   "OPTIONAL" in this document are to be interpreted as described in BCP
25	   14 [RFC2119] [RFC8174] when, and only when, they appear in all
26	   capitals, as shown here.

28	Status of This Memo

30	   This Internet-Draft is submitted in full conformance with the
31	   provisions of BCP 78 and BCP 79.

33	   Internet-Drafts are working documents of the Internet Engineering
34	   Task Force (IETF).  Note that other groups may also distribute
35	   working documents as Internet-Drafts.  The list of current Internet-
36	   Drafts is at https://datatracker.ietf.org/drafts/current/.

38	   Internet-Drafts are draft documents valid for a maximum of six months
39	   and may be updated, replaced, or obsoleted by other documents at any
40	   time.  It is inappropriate to use Internet-Drafts as reference
41	   material or to cite them other than as "work in progress."

43	   This Internet-Draft will expire on June 21, 2021.

45	Copyright Notice

47	   Copyright (c) 2020 IETF Trust and the persons identified as the
48	   document authors.  All rights reserved.

50	   This document is subject to BCP 78 and the IETF Trust's Legal
51	   Provisions Relating to IETF Documents
52	   (https://trustee.ietf.org/license-info) in effect on the date of
53	   publication of this document.  Please review these documents
54	   carefully, as they describe your rights and restrictions with respect
55	   to this document.  Code Components extracted from this document must
56	   include Simplified BSD License text as described in Section 4.e of
57	   the Trust Legal Provisions and are provided without warranty as
58	   described in the Simplified BSD License.

60	Table of Contents

62	   1.  Introduction
63	   2.  Scope and Goals
64	   3.  Motivation
65	   4.  Prerequisites
66	   5.  Back-to-back Frames
67	     5.1.  Preparing the list of Frame sizes
68	     5.2.  Test for a Single Frame Size
69	     5.3.  Test Repetition and Benchmark
70	     5.4.  Benchmark Calculations
71	   6.  Reporting
72	   7.  Security Considerations
73	   8.  IANA Considerations
74	   9.  Acknowledgments
75	   10. References
76	     10.1.  Normative References
77	     10.2.  Informative References
78	   Author's Address

80	1.  Introduction

82	   The IETF's fundamental Benchmarking Methodologies are defined in
83	   [RFC2544], supported by the terms and definitions in [RFC1242].
84	   [RFC2544] obsoletes an earlier specification, [RFC1944].
85	   Over time, the benchmarking community has updated [RFC2544] several
86	   times, including the Device Reset Benchmark [RFC6201], and the
87	   important Applicability Statement [RFC6815] concerning use outside
88	   the Isolated Test Environment (ITE) required for accurate
89	   benchmarking.  Other specifications implicitly update [RFC2544], such
90	   as the IPv6 Benchmarking Methodologies in [RFC5180].

92	   Recent testing experience with the Back-to-back Frame test and
93	   Benchmark in Section 26.4 of [RFC2544] indicates that an update is
94	   warranted [OPNFV-2017] [VSPERF-b2b].  In particular, analysis of the
95	   results indicates that buffer size matters when compensating for
96	   interruptions of software packet processing, and this finding
97	   increases the importance of the Back-to-back frame characterization
98	   described here.  This memo describes additional rationale and
99	   provides the updated method.

101	   [RFC2544] (which obsoletes [RFC1944]) provides its own Requirements
102	   Language consistent with [RFC2119], since [RFC1944] pre-dates
103	   [RFC2119] and all three memos share common authorship.
104	   Today, [RFC8174] clarifies the usage of Requirements Language, so the
105	   requirements presented in this memo are expressed in [RFC8174] terms,
106	   and intended for those performing/reporting laboratory tests to
107	   improve clarity and repeatability, and for those designing devices
108	   that facilitate these tests.

110	2.  Scope and Goals

112	   The scope of this memo is to define an updated method to
113	   unambiguously perform tests, measure the benchmark(s), and report the
114	   results for Back-to-back Frames (presently described in Section 26.4
115	   of [RFC2544]).

117	   The goal is to provide more efficient test procedures where possible,
118	   and to expand reporting with additional interpretation of the
119	   results.  The tests described in this memo address the cases in which
120	   the maximum frame rate of a single ingress port cannot be transferred
121	   loss-free to an egress port (for some frame sizes of interest).

123	   [RFC2544] Benchmarks rely on test conditions with constant frame
124	   sizes, with the goal of understanding what network device capability
125	   has been tested.  Tests with the smallest size stress the header
126	   processing capacity, and tests with the largest size stress the
127	   overall bit processing capacity.  Tests with sizes in-between may
128	   determine the transition between these two capacities.  However,
129	   conditions simultaneously sending a mixture of Internet frame sizes
130	   (IMIX), such as those described in [RFC6985], MUST NOT be used in
131	   Back-to-back Frame testing.

133	   Section 3 of [RFC8239] describes buffer size testing for physical
134	   networking devices in a data center.  The [RFC8239] methods measure
135	   buffer latency directly with traffic on multiple ingress ports that
136	   overload an egress port on the Device Under Test (DUT) and are not
137	   subject to the revised calculations presented in this memo.
138	   Likewise, the methods of [RFC8239] SHOULD be used for test cases
139	   where the egress port buffer is the known point of overload.

141	3.  Motivation

143	   Section 3.1 of [RFC1242] describes the rationale for the Back-to-back
144	   Frames Benchmark.  To summarize, there are several reasons that
145	   devices on a network produce bursts of frames at the minimum allowed
146	   spacing; and it is, therefore, worthwhile to understand the Device
147	   Under Test (DUT) limit on the length of such bursts in practice.
148	   Also, [RFC1242] states:

150	          "Tests of this parameter are intended to determine the extent
151	          of data buffering in the device."

153	   Since this test was defined, there have been occasional discussions
154	   of the stability and repeatability of the results, both over time and
155	   across labs.  Fortunately, the Open Platform for Network Function
156	   Virtualization (OPNFV) VSPERF project's Continuous Integration (CI)
157	   [VSPERF-CI] testing routinely repeats Back-to-back Frame tests to
158	   verify that test functionality has been maintained through
159	   development of the test control programs.  These tests were used as a
160	   basis to evaluate stability and repeatability, even across lab set-
161	   ups when the test platform was migrated to new DUT hardware at the
162	   end of 2016.

164	   When the VSPERF CI results were examined [VSPERF-b2b], several
165	   aspects of the results were considered notable:

167	   1.  Back-to-back Frame Benchmark was very consistent for some fixed
168	       frame sizes, and somewhat variable for other frame sizes.

170	   2.  The number of Back-to-back Frames with zero loss reported for
171	       large frame sizes was unexpectedly long (translating to 30
172	       seconds of buffer time), and no explanation or measurement limit
173	       condition was indicated.  It was important that the buffering
174	       time calculations were part of the referenced testing and
175	       analysis [VSPERF-b2b], because the calculated buffer times of 30
176	       seconds for some frame sizes were clearly wrong or highly
177	       suspect.  On the other hand, a result expressed only as a large
178	       number of Back-to-back Frames does not permit such an easy
179	       comparison with reality.

181	   3.  Calculation of the extent of buffer time in the DUT helped to
182	       explain the results observed with all frame sizes (for example,
183	       tests with some frame sizes cannot exceed the frame header
184	       processing rate of the DUT and thus no buffering occurs;
185	       therefore, the results depended on the test equipment and not the
186	       DUT).

188	   4.  It was found that a better estimate of the DUT buffer time could
189	       be calculated using measurements of both the longest burst in
190	       frames without loss and results from the Throughput tests
191	       conducted according to Section 26.1 of [RFC2544].  It is apparent
192	       that the DUT's frame processing rate empties the buffer during a
193	       trial and tends to increase the "implied" buffer size estimate
194	       (measured according to Section 26.4 of [RFC2544]), because many
195	       frames have departed the buffer when the burst of frames ends.
196	       A calculation using the Throughput measurement can reveal a
197	       "corrected" buffer size estimate.

199	   Further, if the Throughput tests of Section 26.1 of [RFC2544] are
200	   conducted as a prerequisite test, the number of frame sizes required
201	   for Back-to-back Frame Benchmarking can be reduced to one or more of
202	   the small frame sizes, or the results for large frame sizes can be
203	   noted as invalid in the results if tested anyway (these are the
204	   larger frame sizes for which the back-to-back frame rate cannot
205	   exceed the frame header processing rate of the DUT and little or no
206	   buffering occurs).

208	   The material below provides the details of the calculation to
209	   estimate the actual buffer storage available in the DUT, using
210	   results from the Throughput tests for each frame size, and the
211	   maximum theoretical frame rate for the DUT links (which constrain the
212	   minimum frame spacing).

214	   In reality, there are many buffers and packet header processing steps
215	   in a typical DUT.  The simplified model used in these calculations
216	   for the DUT includes a packet header processing function with limited
217	   rate of operation, as shown below:

219	                        |------------ DUT --------|
220	   Generator -> Ingress -> Buffer -> HeaderProc -> Egress -> Receiver

222	   So, in the Back-to-back Frame testing:

224	   1.  The ingress burst arrives at Max Theoretical Frame Rate, and
225	       initially the frames are buffered.

227	   2.  The packet header processing function (HeaderProc) operates at
228	       the "Measured Throughput" (Section 26.1 of [RFC2544]), removing
229	       frames from the buffer (this is the best approximation we have).

231	   3.  Frames that have been processed are clearly not in the buffer, so
232	       the Corrected DUT buffer time equation (Section 5.4) estimates
233	       and removes the frames that the DUT forwarded on egress during
234	       the burst.  We define buffer time as the number of frames
235	       occupying the buffer divided by the Maximum Theoretical Frame
236	       Rate (on ingress) for the frame size under test.

238	   4.  A helpful concept is the buffer filling rate, which is the
239	       difference between the Max Theoretical Frame Rate (ingress) and
240	       the Measured Throughput (HeaderProc on egress).  If the actual
241	       buffer size in frames is known, the time to fill the buffer
242	       during a measurement can be calculated using the filling rate as
243	       a check on measurements.  However, the buffer in the model
244	       represents many buffers of different sizes in the DUT data path.

246	   Knowledge of approximate buffer storage size (in time or bytes) may
247	   be useful to estimate whether frame losses will occur if DUT
248	   forwarding is temporarily suspended in a production deployment, due
249	   to an unexpected interruption of frame processing (an interruption of
250	   duration greater than the estimated buffer would certainly cause lost
251	   frames).  In Section 5, the calculations for the corrected buffer time
252	   use the combination of offered load at Max Theoretical Frame Rate and
253	   header processing speed at 100% of Measured Throughput.  Other
254	   combinations are possible, such as changing the percent of measured
255	   Throughput to account for other processes reducing the header
256	   processing rate.

258	   The presentation of OPNFV VSPERF evaluation and development of
259	   enhanced search algorithms [VSPERF-BSLV] was discussed at IETF-102.
260	   The enhancements are intended to compensate for transient interrupts
261	   that may cause loss at near-Throughput levels of offered load.
262	   Subsequent analysis of the results indicates that buffers within the
263	   DUT can compensate for some interrupts, and this finding increases
264	   the importance of the Back-to-back frame characterization described
265	   here.

267	4.  Prerequisites

269	   The Test Setup MUST be consistent with Figure 1 of [RFC2544], or
270	   Figure 2 when the tester's sender and receiver are different devices.
271	   Other mandatory testing aspects described in [RFC2544] MUST be
272	   included, unless explicitly modified in the next section.

274	   The ingress and egress link speeds and link layer protocols MUST be
275	   specified and used to compute the maximum theoretical frame rate when
276	   respecting the minimum inter-frame gap.
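
   For Ethernet links, the computation can be sketched as follows.  This
   is an illustration only: it assumes the frame size includes the FCS
   (as in the [RFC2544] frame size list) and that each frame occupies an
   additional 20 octets on the wire (7 octets of preamble, 1 octet of
   start-of-frame delimiter, and the 12-octet minimum inter-frame gap).
   Other link layers need their own overhead values.  The names below
   are assumptions introduced here.

```python
# Per-frame Ethernet overhead in octets: preamble + SFD + minimum inter-frame gap.
ETHERNET_OVERHEAD_OCTETS = 7 + 1 + 12


def max_theoretical_frame_rate(link_bps: float, frame_octets: int) -> float:
    """Maximum frames per second on an Ethernet link of `link_bps` bits/s."""
    bits_on_wire = (frame_octets + ETHERNET_OVERHEAD_OCTETS) * 8
    return link_bps / bits_on_wire
```

   With these assumptions, a 10 Gbit/s link carrying 64-octet frames
   yields roughly 14.88 million frames per second.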

278	   The test results for the Throughput Benchmark conducted according to
279	   Section 26.1 of [RFC2544] for all [RFC2544]-RECOMMENDED frame sizes
280	   MUST be available to reduce the tested frame size list, or to note
281	   invalid results for individual frame sizes (because the burst length
282	   may be essentially infinite for large frame sizes).

284	   Note that:

286	   o  the Throughput and the Back-to-back Frame measurement
287	      configuration traffic characteristics (unidirectional or bi-
288	      directional, and number of flows generated) MUST match.

290	   o  the Throughput measurement MUST be under zero-loss conditions,
291	      according to Section 26.1 of [RFC2544].

293	   The Back-to-back Benchmark described in Section 3.1 of [RFC1242] MUST
294	   be measured directly by the tester, where buffer size is inferred
295	   from Back-to-back Frame bursts and associated packet loss
296	   measurements.  Therefore, sources of packet loss that are unrelated
297	   to consistent evaluation of buffer size SHOULD be identified and
298	   removed or mitigated.  Example sources include:

300	   o  On-path active components that are external to the DUT

302	   o  Operating system environment interrupting DUT operation

304	   o  Shared resource contention between the DUT and other off-path
305	      component(s) impacting DUT's behaviour, sometimes called the
306	      "noisy neighbour" problem with virtualized network functions.

308	   Mitigations applicable to some of the sources above are discussed in
309	   Section 5.2, with the other measurement requirements described below
310	   in Section 5.

312	5.  Back-to-back Frames

314	   Objective: To characterize the ability of a DUT to process back-to-
315	   back frames as defined in [RFC1242].

317	   The Procedure follows.

319	5.1.  Preparing the list of Frame sizes

321	   From the list of RECOMMENDED frame sizes (Section 9 of [RFC2544]),
322	   select the subset of frame sizes whose measured Throughput (during
323	   prerequisite testing) was less than the Maximum Theoretical Frame
324	   Rate of the DUT/test-set-up.  These are the only frame sizes where it
325	   is possible to produce a burst of frames that cause the DUT buffers
326	   to fill and eventually overflow, producing one or more discarded
327	   frames.
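
   The selection step can be sketched as a simple filter.  The two
   dictionaries (frame size in octets mapped to frames per second) are
   assumed inputs from the prerequisite Throughput testing and from the
   link calculation; their names are hypothetical.

```python
def select_frame_sizes(throughput_fps: dict, max_rate_fps: dict) -> list:
    """Return the frame sizes whose measured Throughput is below the
    Maximum Theoretical Frame Rate, in ascending order."""
    return [
        size
        for size in sorted(throughput_fps)
        if throughput_fps[size] < max_rate_fps[size]
    ]
```

   In practice, a tester may want a small tolerance in the comparison,
   since measured rates rarely equal the theoretical value exactly; that
   choice is left to the tester and is not specified in this memo.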

329	5.2.  Test for a Single Frame Size

331	   Each trial in the test requires the tester to send a burst of frames
332	   (after idle time) with the minimum inter-frame gap, and to count the
333	   corresponding frames forwarded by the DUT.

335	   The duration of the trial includes three REQUIRED components:

337	   1.  The time to send the burst of frames (at the back-to-back rate),
338	       determined by the search algorithm.

340	   2.  The time to receive the transferred burst of frames (at the
341	       [RFC2544] Throughput rate), possibly truncated by buffer
342	       overflow, and certainly including the latency of the DUT.

344	   3.  At least 2 seconds not overlapping the time to receive the burst
345	       (2.), to ensure that DUT buffers have depleted.  Longer times
346	       MUST be used when conditions warrant, such as when buffer times
347	       >2 seconds are measured or when burst sending times are >2
348	       seconds, but care is needed since this time component directly
349	       increases trial duration and many trials and tests comprise a
350	       complete benchmarking study.

352	   The upper search limit for the time to send each burst MUST be
353	   configurable, to values as high as 30 seconds (buffer time results
354	   reported at or near the configured upper limit are likely invalid,
355	   and the test MUST be repeated with a higher search limit).

357	   If all frames have been received, the tester increases the length of
358	   the burst according to the search algorithm and performs another
359	   trial.

361	   If the received frame count is less than the number of frames in the
362	   burst, then the limit of DUT processing and buffering may have been
363	   exceeded, and the burst length is determined by the search algorithm
364	   for the next trial (the burst length is typically reduced, but see
365	   below).

367	   Classic search algorithms have been adapted for use in benchmarking,
368	   where the search requires discovery of a pair of outcomes, one with
369	   no loss and another with loss, at load conditions within the
370	   acceptable tolerance or accuracy.  Conditions encountered when
371	   benchmarking the Infrastructure for Network Function Virtualization
372	   require algorithm enhancement.  Fortunately, the adaptation of Binary
373	   Search, and an enhanced Binary Search with Loss Verification have
374	   been specified in clause 12.3 of [TST009].  These algorithms can
375	   easily be used for Back-to-back Frame benchmarking by replacing the
376	   Offered Load level with burst length in frames.  [TST009] Annex B
377	   describes the theory behind the enhanced Binary Search with Loss
378	   Verification algorithm.

380	   There is also promising work-in-progress that may prove useful in
381	   Back-to-back Frame benchmarking.
382	   [I-D.vpolak-mkonstan-bmwg-mlrsearch] and [I-D.vpolak-bmwg-plrsearch]
383	   are two such examples.

385	   Either the [TST009] Binary Search or Binary Search with Loss
386	   Verification algorithms MUST be used, and input parameters to the
387	   algorithm(s) MUST be reported.

389	   The tester usually imposes a (configurable) minimum step size for
390	   burst length, and the step size MUST be reported with the results (as
391	   this influences the accuracy and variation of test results).
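
   To illustrate how burst length replaces Offered Load in the search,
   the following is a minimal sketch of a plain binary search with a
   minimum step size.  It is not the [TST009] algorithm text and omits
   Loss Verification.  The run_trial hook is hypothetical: it stands for
   one complete trial that sends a burst of the given length and returns
   the count of frames forwarded by the DUT.

```python
from typing import Callable


def search_longest_burst(run_trial: Callable[[int], int],
                         max_burst_frames: int,
                         min_step_frames: int = 1) -> int:
    """Return the longest burst length (in frames) observed with zero loss."""
    if min_step_frames < 1:
        raise ValueError("min_step_frames must be at least 1")
    longest_ok = 0                        # longest burst known to be loss-free
    shortest_bad = max_burst_frames + 1   # sentinel just above the search limit
    while shortest_bad - longest_ok > min_step_frames:
        burst = (longest_ok + shortest_bad) // 2
        if run_trial(burst) == burst:     # all frames received: try longer
            longest_ok = burst
        else:                             # loss observed: try shorter
            shortest_bad = burst
    return longest_ok
```

   A result equal to max_burst_frames means no loss was found below the
   configured upper limit, which is the condition under which the test
   is to be repeated with a higher search limit.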

393	   The original Section 26.4 of [RFC2544] definition is stated below:

395	      The Back-to-back Frame value is the longest burst of frames that
396	      the DUT can successfully process and buffer without frame loss, as
397	      determined from the series of trials.

399	5.3.  Test Repetition and Benchmark

401	   On this topic, Section 26.4 of [RFC2544] requires:

403	      The trial length MUST be at least 2 seconds and SHOULD be repeated
404	      at least 50 times with the average of the recorded values being
405	      reported.

407	   Therefore, the Back-to-back Frame Benchmark is the average of burst
408	   length values over repeated tests to determine the longest burst of
409	   frames that the DUT can successfully process and buffer without frame
410	   loss.  Each of the repeated tests completes an independent search
411	   process.

413	   In this update, the test MUST be repeated N times (the number of
414	   repetitions is now a variable that must be reported), for each frame
415	   size in the subset list, and each Back-to-back Frame value made
416	   available for further processing (below).

418	5.4.  Benchmark Calculations

420	   For each frame size, calculate the following summary statistics for
421	   longest Back-to-back Frame values over the N tests:

423	   o  Average (Benchmark)

424	   o  Minimum

426	   o  Maximum

428	   o  Standard Deviation
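
   A minimal sketch of these summary statistics follows.  The memo does
   not say whether the sample or the population Standard Deviation is
   intended; the sample form is assumed here, and the function name is
   hypothetical.

```python
import statistics


def summarize_b2b(values: list) -> dict:
    """Summary statistics over the N longest Back-to-back Frame values."""
    return {
        "average": statistics.mean(values),   # this is the Benchmark
        "minimum": min(values),
        "maximum": max(values),
        # Sample standard deviation (assumption); undefined for N = 1,
        # so 0.0 is returned in that case.
        "stddev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
```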

430	   Further, calculate the Implied DUT Buffer Time and the Corrected DUT
431	   Buffer Time in seconds, as follows:

433	   Implied DUT Buffer Time =

435	      Average num of Back-to-back Frames / Max Theoretical Frame Rate

437	   The formula above is simply expressing the burst of frames in units
438	   of time.
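
   Expressed as code, the conversion is a single division.  The function
   name is an assumption introduced for illustration.

```python
def implied_dut_buffer_time(avg_b2b_frames: float,
                            max_theoretical_fps: float) -> float:
    """Implied DUT Buffer Time in seconds: burst length expressed as time."""
    return avg_b2b_frames / max_theoretical_fps
```

   For example, an average of 26000 frames at a rate of about 14.88
   million frames per second corresponds to roughly 1.75 milliseconds.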

440	   The next step is to apply a correction factor that accounts for the
441	   DUT's frame forwarding operation during the test (assuming the simple
442	   model of the DUT composed of a buffer and a forwarding function,
443	   described in Section 3).

445	   Corrected DUT Buffer Time =
446	                     /                                         \
447	      Implied DUT    |Implied DUT       Measured Throughput    |
448	   =  Buffer Time -  |Buffer Time * -------------------------- |
449	                     |              Max Theoretical Frame Rate |
450	                     \                                         /

452	   where:

454	   1.  The "Measured Throughput" is the [RFC2544] Throughput Benchmark
455	       for the frame size tested, as augmented by methods including the
456	       Binary Search with Loss Verification algorithm in [TST009] where
457	       applicable, and MUST be expressed in frames per second in this
458	       equation.

460	   2.  The "Max Theoretical Frame Rate" is a calculated value for the
461	       interface speed and link layer technology used, and MUST be
462	       expressed in frames per second in this equation.

464	   The term on the far right in the formula for Corrected DUT Buffer
465	   Time accounts for all the frames in the Burst that were transmitted
466	   by the DUT *while the Burst of frames was sent in*.  So, these frames
467	   are not in the buffer and the buffer size is more accurately
468	   estimated by excluding them.
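
   The Corrected DUT Buffer Time formula can be sketched as shown below.
   The function and parameter names are assumptions introduced for
   illustration; both rates are in frames per second, as required above.

```python
def corrected_dut_buffer_time(avg_b2b_frames: float,
                              max_theoretical_fps: float,
                              measured_throughput_fps: float) -> float:
    """Corrected DUT Buffer Time in seconds."""
    if measured_throughput_fps > max_theoretical_fps:
        raise ValueError("Throughput cannot exceed the theoretical maximum")
    implied = avg_b2b_frames / max_theoretical_fps
    # Fraction of the burst that the DUT forwarded while the burst arrived.
    forwarded_fraction = measured_throughput_fps / max_theoretical_fps
    return implied - implied * forwarded_fraction
```

   When the Measured Throughput equals the Max Theoretical Frame Rate,
   the result is zero, consistent with no buffering taking place for
   that frame size.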

470	6.  Reporting

472	   The back-to-back frame results SHOULD be reported in the format of a
473	   table with a row for each of the tested frame sizes.  There SHOULD be
474	   columns for the frame size and for the resultant average frame count
475	   for each type of data stream tested.

477	   The number of tests Averaged for the Benchmark, N, MUST be reported.

479	   The Minimum, Maximum, and Standard Deviation across all complete
480	   tests SHOULD also be reported (they are referred to as
481	   "Min,Max,StdDev" in the table below).

483	   The Corrected DUT Buffer Time SHOULD also be reported.

485	   If the tester operates using a limited maximum burst length in
486	   frames, then this maximum length SHOULD be reported.

488	   +--------------+----------------+----------------+------------------+
489	   | Frame Size,  | Ave B2B        | Min,Max,StdDev | Corrected Buff   |
490	   | octets       | Length, frames |                | Time, Sec        |
491	   +--------------+----------------+----------------+------------------+
492	   | 64           | 26000          | 25500,27000,20 | 0.00004          |
493	   +--------------+----------------+----------------+------------------+

495	                        Back-to-Back Frame Results

497	   Static and configuration parameters (reported with the table above):

499	   Number of test repetitions, N

501	   Minimum Step Size (during searches), in frames.

503	   If the tester has a specific (actual) frame rate of interest (less
504	   than the Throughput rate), it is useful to estimate the buffer time
505	   at that actual frame rate:

507	   Actual Buffer Time =
508	                                      Max Theoretical Frame Rate
509	        = Corrected DUT Buffer Time * --------------------------
510	                                          Actual Frame Rate

512	   and report this value, properly labeled.
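
   As a sketch, the scaling to an actual frame rate of interest is shown
   below.  The names are assumptions introduced for illustration, and
   both rates are in frames per second.

```python
def actual_buffer_time(corrected_buffer_time_s: float,
                       max_theoretical_fps: float,
                       actual_fps: float) -> float:
    """Buffer time in seconds at a specific (actual) frame rate."""
    if actual_fps <= 0:
        raise ValueError("actual_fps must be positive")
    return corrected_buffer_time_s * max_theoretical_fps / actual_fps
```

   A lower actual frame rate stretches the same buffer over a longer
   time, so the result grows as the actual rate falls.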

514	7.  Security Considerations

516	   Benchmarking activities as described in this memo are limited to
517	   technology characterization using controlled stimuli in a laboratory
518	   environment, with dedicated address space and the other constraints
519	   of [RFC2544].

521	   The benchmarking network topology will be an independent test setup
522	   and MUST NOT be connected to devices that may forward the test
523	   traffic into a production network, or misroute traffic to the test
524	   management network.  See [RFC6815].

526	   Further, benchmarking is performed on an "opaque-box" (a.k.a.
527	   "black-box") basis, relying solely on measurements observable
528	   external to the DUT/SUT.

530	   The DUT developers are commonly independent from the personnel and
531	   institutions conducting benchmarking studies.  DUT developers might
532	   have incentives to alter the performance of the DUT if the test
533	   conditions can be detected.  Special capabilities SHOULD NOT exist in
534	   the DUT/SUT specifically for benchmarking purposes.  Procedures
535	   described in this document are not designed to detect such activity.
536	   Additional testing outside of the scope of this document would be
537	   needed and has been used successfully in the past to discover such
538	   malpractices.

540	   Any implications for network security arising from the DUT/SUT SHOULD
541	   be identical in the lab and in production networks.

543	8.  IANA Considerations

545	   This memo makes no requests of IANA.

547	9.  Acknowledgments

549	   Thanks to Trevor Cooper, Sridhar Rao, and Martin Klozik of the VSPERF
550	   project for many contributions to the early testing [VSPERF-b2b].
551	   Yoshiaki Itou has also investigated the topic, and made useful
552	   suggestions.  Maciek Konstantynowicz and Vratko Polak also provided
553	   many comments and suggestions based on extensive integration testing
554	   and resulting search algorithm proposals - the most up-to-date
555	   feedback possible.  Tim Carlin also provided comments and support for
556	   the draft.  Warren Kumari's review improved readability in several
557	   key passages.  David Black, Martin Duke, and Scott Bradner's comments
558	   improved the clarity and configuration advice on trial duration.
559	   Malisa Vucinic suggested additional text on DUT design cautions in
560	   the Security Considerations section.

562	10.  References

564	10.1.  Normative References

566	   [RFC1242]  Bradner, S., "Benchmarking Terminology for Network
567	              Interconnection Devices", RFC 1242, DOI 10.17487/RFC1242,
568	              July 1991, <https://www.rfc-editor.org/info/rfc1242>.

570	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
571	              Requirement Levels", BCP 14, RFC 2119,
572	              DOI 10.17487/RFC2119, March 1997,
573	              <https://www.rfc-editor.org/info/rfc2119>.

575	   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
576	              Network Interconnect Devices", RFC 2544,
577	              DOI 10.17487/RFC2544, March 1999,
578	              <https://www.rfc-editor.org/info/rfc2544>.

580	   [RFC6985]  Morton, A., "IMIX Genome: Specification of Variable Packet
581	              Sizes for Additional Testing", RFC 6985,
582	              DOI 10.17487/RFC6985, July 2013,
583	              <https://www.rfc-editor.org/info/rfc6985>.

585	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
586	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
587	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

589	   [RFC8239]  Avramov, L. and J. Rapp, "Data Center Benchmarking
590	              Methodology", RFC 8239, DOI 10.17487/RFC8239, August 2017,
591	              <https://www.rfc-editor.org/info/rfc8239>.

593	   [TST009]   Morton, A., "Network Functions Virtualisation (NFV)
594	              Release 3; Testing; Specification of Networking
595	              Benchmarks and Measurement Methods for NFVI", ETSI GS
596	              NFV-TST 009 V3.4.1, December 2020,
597	              <https://www.etsi.org/deliver/etsi_gs/NFV-
598	              TST/001_099/009/03.04.01_60/gs_NFV-TST009v030401p.pdf>.

600	10.2.  Informative References

602	   [I-D.vpolak-bmwg-plrsearch]
603	              Konstantynowicz, M. and V. Polak, "Probabilistic Loss
604	              Ratio Search for Packet Throughput (PLRsearch)", draft-
605	              vpolak-bmwg-plrsearch-03 (work in progress), March 2020.

607	   [I-D.vpolak-mkonstan-bmwg-mlrsearch]
608	              Konstantynowicz, M. and V. Polak, "Multiple Loss Ratio
609	              Search for Packet Throughput (MLRsearch)", draft-vpolak-
610	              mkonstan-bmwg-mlrsearch-03 (work in progress), March 2020.

612	   [OPNFV-2017]
613	              Cooper, T., Morton, A., and S. Rao, "Dataplane
614	              Performance, Capacity, and Benchmarking in OPNFV", June
615	              2017,
616	              <https://wiki.opnfv.org/download/attachments/10293193/
617	              VSPERF-Dataplane-Perf-Cap-Bench.pptx?api=v2>.

619	   [RFC1944]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
620	              Network Interconnect Devices", RFC 1944,
621	              DOI 10.17487/RFC1944, May 1996,
622	              <https://www.rfc-editor.org/info/rfc1944>.

624	   [RFC5180]  Popoviciu, C., Hamza, A., Van de Velde, G., and D.
625	              Dugatkin, "IPv6 Benchmarking Methodology for Network
626	              Interconnect Devices", RFC 5180, DOI 10.17487/RFC5180, May
627	              2008, <https://www.rfc-editor.org/info/rfc5180>.

629	   [RFC6201]  Asati, R., Pignataro, C., Calabria, F., and C. Olvera,
630	              "Device Reset Characterization", RFC 6201,
631	              DOI 10.17487/RFC6201, March 2011,
632	              <https://www.rfc-editor.org/info/rfc6201>.

634	   [RFC6815]  Bradner, S., Dubray, K., McQuaid, J., and A. Morton,
635	              "Applicability Statement for RFC 2544: Use on Production
636	              Networks Considered Harmful", RFC 6815,
637	              DOI 10.17487/RFC6815, November 2012,
638	              <https://www.rfc-editor.org/info/rfc6815>.

640	   [VSPERF-b2b]
641	              Morton, A., "Back2Back Testing Time Series (from CI)",
642	              June 2017, <https://wiki.opnfv.org/display/vsperf/
643	              Traffic+Generator+Testing#TrafficGeneratorTesting-
644	              AppendixB:Back2BackTestingTimeSeries(fromCI)>.

646	   [VSPERF-BSLV]
647	              Morton, A. and S. Rao, "Evolution of Repeatability in
648	              Benchmarking: Fraser Plugfest (Summary for IETF BMWG)",
649	              July 2018,
650	              <https://datatracker.ietf.org/meeting/102/materials/
651	              slides-102-bmwg-evolution-of-repeatability-in-
652	              benchmarking-fraser-plugfest-summary-for-ietf-bmwg-00>.

654	   [VSPERF-CI]
655	              Tahhan, M., "OPNFV VSPERF CI", June 2019,
656	              <https://wiki.opnfv.org/display/vsperf/VSPERF+CI>.

658	Author's Address

660	   Al Morton
661	   AT&T Labs
662	   200 Laurel Avenue South
663	   Middletown, NJ  07748
664	   USA

666	   Phone: +1 732 420 1571
667	   Fax:   +1 732 368 1192
668	   Email: acmorton@att.com