<?xml version="1.0" encoding="US-ASCII"?>
<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<?rfc toc="yes"?>
<?rfc tocompact="yes"?>
<?rfc tocdepth="3"?>
<?rfc tocindent="yes"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes"?>
<?rfc comments="yes"?>
<?rfc inline="yes"?>
<?rfc compact="yes"?>
<?rfc subcompact="no"?>
<rfc category="info" docName="draft-morton-bmwg-b2b-frame-05"
     ipr="trust200902" updates="2544">
  <front>
    <title abbrev="B2B Frame Update">Updates for the Back-to-back Frame
    Benchmark in RFC 2544</title>

    <author fullname="Al Morton" initials="A." surname="Morton">
      <organization>AT&amp;T Labs</organization>

      <address>
        <postal>
          <street>200 Laurel Avenue South</street>

          <city>Middletown</city>

          <region>NJ</region>

          <code>07748</code>

          <country>USA</country>
        </postal>

        <phone>+1 732 420 1571</phone>

        <facsimile>+1 732 368 1192</facsimile>

        <email>acmorton@att.com</email>

        <uri/>
      </address>
    </author>

    <date day="2" month="March" year="2019"/>

    <abstract>
      <t>Fundamental Benchmarking Methodologies for Network Interconnect
      Devices of interest to the IETF are defined in RFC 2544. This memo
      updates the procedures of the test to measure the Back-to-back frames
      Benchmark of RFC 2544, based on further experience.</t>

      <t>This memo updates Section 26.4 of RFC 2544.</t>
    </abstract>

    <note title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in BCP
      14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when,
      they appear in all capitals, as shown here.</t>

    </note>
  </front>

  <middle>
    <section title="Introduction">
      <t>The IETF's fundamental Benchmarking Methodologies are defined in <xref
      target="RFC2544"/>, supported by the terms and definitions in <xref
      target="RFC1242"/>. <xref target="RFC2544"/> obsoletes an
      earlier specification, <xref target="RFC1944"/>. Over time, the
      benchmarking community has updated <xref target="RFC2544"/> several
      times, including the Device Reset Benchmark <xref target="RFC6201"/>,
      and the important Applicability Statement <xref target="RFC6815"/>
      concerning use outside the Isolated Test Environment (ITE) required for
      accurate benchmarking. Other specifications implicitly update <xref
      target="RFC2544"/>, such as the IPv6 Benchmarking Methodologies in <xref
      target="RFC5180"/>.</t>

      <t>Recent testing experience with the Back-to-back Frame test and
      Benchmark in Section 26.4 of <xref target="RFC2544"/> indicates that an
      update is warranted <xref target="OPNFV-2017"/> <xref
      target="VSPERF-b2b"/>. This memo describes the rationale and provides
      the updated method.</t>

      <t><xref target="RFC2544"/> provides its own Requirements Language
      consistent with <xref target="RFC2119"/>, since <xref target="RFC1944"/>
      predates <xref target="RFC2119"/>. Thus, the requirements presented in
      this memo are expressed in <xref target="RFC2119"/> terms and are
      intended for those performing or reporting laboratory tests, to improve
      clarity and repeatability, and for those designing devices that
      facilitate these tests.</t>
    </section>

    <section title="Scope and Goals">
      <t>The scope of this memo is to define an updated method to
      unambiguously perform tests, measure the benchmark(s), and report the
      results for Back-to-back Frames (presently described in Section 26.4 of
      <xref target="RFC2544"/>).</t>

      <t>The goal is to provide more efficient test procedures where possible,
      and to expand reporting with additional interpretation of the results.
      The tests described in this memo address the cases where the maximum
      frame rate of a single ingress port cannot be transferred to an egress
      port loss-free (for some frame sizes of interest).</t>

      <t><xref target="RFC2544"/> Benchmarks rely on test conditions with
      constant frame sizes, with the goal of understanding what network device
      capability has been tested. Tests with the smallest size stress the
      header processing capacity, and tests with the largest size stress the
      overall bit processing capacity. Tests with sizes in-between may
      determine the transition between these two capacities. However,
      conditions simultaneously sending multiple frame sizes, such as those
      described in <xref target="RFC6985"/>, MUST NOT be used in Back-to-back
      Frame testing.</t>

      <t>Section 3 of <xref target="RFC8239"/> describes buffer size testing
      for physical networking devices in a Data Center. The <xref
      target="RFC8239"/> methods measure buffer latency directly with traffic
      on multiple ingress ports that overload an egress port on the Device
      Under Test (DUT), and are not subject to the revised calculations
      presented in this memo.</t>
    </section>

    <section title="Motivation">
      <t>Section 3.1 of <xref target="RFC1242"/> describes the rationale for
      the Back-to-back Frames Benchmark. To summarize, there are several
      reasons that devices on a network produce bursts of frames at the
      minimum allowed spacing, and it is therefore worthwhile to understand
      the Device Under Test (DUT) limit on the length of such bursts in
      practice. Also, <xref target="RFC1242"/> states: <figure>
          <artwork><![CDATA[       "Tests of this parameter are intended to determine the extent 
       of data buffering in the device."]]></artwork>
        </figure></t>

      <t>Since this test was defined, there have been occasional discussions
      of the stability and repeatability of the results, both over time and
      across labs. Fortunately, the Open Platform for Network Function
      Virtualization (OPNFV) VSPERF project's Continuous Integration (CI)
      testing routinely repeats Back-to-back Frame tests to verify that test
      functionality has been maintained through development of the test
      control programs. These tests were used as a basis to evaluate stability
      and repeatability, even across lab set-ups when the test platform was
      migrated to new DUT hardware at the end of 2016.</t>

      <t>When the VSPERF CI results were examined <xref target="VSPERF-b2b"/>,
      several aspects of the results were considered notable:<list
          style="numbers">
          <t>The Back-to-back Frame Benchmark was very consistent for some
          fixed frame sizes, and somewhat variable for others.</t>

          <t>The Back-to-back Frame length reported for large frame sizes was
          unexpectedly long, and no explanation or measurement limit condition
          was indicated.</t>

          <t>Calculation of the extent of buffer time in the DUT helped to
          explain the results observed with all frame sizes. For example, at
          some frame sizes the offered frame rate cannot exceed the frame
          header processing rate of the DUT, so no buffering occurs and the
          results depend on the test equipment rather than the DUT.</t>

          <t>It was found that the actual buffer time in the DUT could be
          estimated using results from the Throughput tests conducted
          according to Section 26.1 of <xref target="RFC2544"/>, because it
          appears that the DUT's frame processing during the burst tends to
          inflate the uncorrected estimate.</t>
        </list></t>

      <t>Further, if the Throughput tests of Section 26.1 of <xref
      target="RFC2544"/> are conducted as a prerequisite test, the number of
      frame sizes required for Back-to-back Frame Benchmarking can be reduced
      to one or more of the small frame sizes, or the results for large frame
      sizes can be noted as invalid in the results if tested anyway (these are
      the frame sizes for which the back-to-back frame rate cannot exceed the
      frame header processing rate of the DUT, so that no buffering
      occurs).</t>

      <t><xref target="VSPERF-b2b"/> provides the details of the calculation
      to estimate the actual buffer storage available in the DUT, using
      results from the Throughput tests for each frame size, and the maximum
      theoretical frame rate for the DUT links (which constrain the minimum
      frame spacing). We present some of these details here.</t>

      <t>The simplified model used in these calculations for the DUT includes
      a packet header processing function with limited rate of operation, as
      shown below:</t>

      <t><figure>
          <artwork><![CDATA[                     |------------ DUT --------|
Generator -> Ingress -> Buffer -> HeaderProc -> Egress -> Receiver
]]></artwork>
        </figure></t>

      <t>So, in Back-to-back Frame testing:<list style="numbers">
          <t>The Ingress burst arrives at the Max Theoretical Frame Rate, and
          initially the frames are buffered.</t>

          <t>The packet header processing function (HeaderProc) operates at
          approximately the &ldquo;Measured Throughput&rdquo;, removing frames
          from the buffer.</t>

          <t>Frames that have been processed are clearly not in the buffer, so
          the Corrected DUT Buffer Time equation (Section 5.4) estimates and
          removes the frames that the DUT forwarded on Egress during the
          burst.</t>
        </list></t>
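
      <t>As a rough illustration of the simplified model above, the following
      Python sketch treats the DUT as an ideal buffer that fills at the
      ingress rate and drains at the HeaderProc rate. The function names, the
      fluid (non-discrete) approximation, and the example numbers are
      assumptions made for illustration only; they are not taken from the
      VSPERF results.</t>

      <t><figure>
          <artwork><![CDATA[
```python
from fractions import Fraction


def held_at_burst_end(burst_frames, max_fps, throughput_fps):
    """Frames still in the buffer when the burst ends.

    Ideal model: frames arrive at max_fps while HeaderProc
    removes them at throughput_fps for the whole burst.
    """
    drained_share = Fraction(throughput_fps, max_fps)
    return burst_frames * (1 - drained_share)


def ideal_lossless_burst(buffer_frames, max_fps, throughput_fps):
    """Longest loss-free burst (frames) for an ideal buffer."""
    if throughput_fps >= max_fps:
        return None  # buffer never fills; no finite limit
    fill_share = 1 - Fraction(throughput_fps, max_fps)
    return int(buffer_frames / fill_share)


# Hypothetical DUT: 1000-frame buffer, HeaderProc at half line rate.
burst = ideal_lossless_burst(1000, 10_000_000, 5_000_000)
held = held_at_burst_end(burst, 10_000_000, 5_000_000)
print(burst, held)  # 2000 1000
```
]]></artwork>
        </figure></t>

      <t>In this sketch, a burst of twice the buffer capacity is absorbed
      without loss because half of the frames leave the DUT during the burst,
      which is why the burst length alone overstates the buffer size.</t>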

      <t>Knowledge of approximate buffer storage size (in time or bytes) may
      be useful to estimate whether frame losses will occur if DUT forwarding
      is temporarily suspended in a production deployment, due to an
      unexpected interruption of frame processing (an interruption of duration
      greater than the estimated buffer time would certainly cause lost
      frames).</t>

      <t>The presentation of OPNFV VSPERF evaluation and development of
      enhanced search algorithms <xref target="VSPERF-BSLV"/> was discussed
      at IETF-102. The enhancements are intended to compensate for transient
      interrupts that may cause loss at near-Throughput levels of offered
      load. Subsequent analysis of the results indicates that buffers within
      the DUT can compensate for some interrupts, and this finding increases
      the importance of the Back-to-back frame characterization described
      here.</t>
    </section>

    <section title="Prerequisites">
      <t>The Test Setup MUST be consistent with Figure 1 of <xref
      target="RFC2544"/>, or Figure 2 when the tester's sender and receiver are
      different devices. Other mandatory testing aspects described in <xref
      target="RFC2544"/> MUST be included, unless explicitly modified in the
      next section.</t>

      <t>The ingress and egress link speeds and link layer protocols MUST be
      specified and used to compute the maximum theoretical frame rate when
      respecting the minimum inter-frame gap.</t>
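
      <t>For example, on Ethernet links each frame occupies the frame size
      plus 20 octets of per-frame overhead on the wire (7 octets of preamble,
      1 octet of start-of-frame delimiter, and a 12-octet minimum inter-frame
      gap). The following Python sketch computes the maximum theoretical frame
      rate under that assumption. The function name and constant are
      illustrative, and other link layer protocols need their own overhead
      values.</t>

      <t><figure>
          <artwork><![CDATA[
```python
ETHERNET_OVERHEAD_OCTETS = 20  # preamble + SFD (8) + min IFG (12)


def max_theoretical_frame_rate(link_bps, frame_octets,
                               overhead=ETHERNET_OVERHEAD_OCTETS):
    """Maximum frames per second at the minimum inter-frame gap."""
    bits_on_wire = (frame_octets + overhead) * 8
    return link_bps / bits_on_wire


# 10 Gbit/s Ethernet, 64-octet frames
rate = max_theoretical_frame_rate(10_000_000_000, 64)
print(round(rate))  # 14880952
```
]]></artwork>
        </figure></t>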

      <t>The test results for the Throughput Benchmark conducted according to
      Section 26.1 of <xref target="RFC2544"/> for all <xref
      target="RFC2544"/>-RECOMMENDED frame sizes MUST be available to reduce
      the tested frame size list, or to note invalid results for individual
      frame sizes (because the burst length may be essentially infinite for
      large frame sizes).</t>

      <t>Note that:<list style="symbols">
          <t>the traffic characteristics (unidirectional or bi-directional)
          of the Throughput and the Back-to-back Frame measurement
          configurations MUST match.</t>

          <t>the Throughput measurement MUST be under zero-loss conditions,
          according to Section 26.1 of <xref target="RFC2544"/>.</t>
        </list>The Back-to-back Benchmark described in Section 3.1 of <xref
      target="RFC1242"/> MUST be measured directly by the tester. Additional
      measurement requirements are described below in Section 5.</t>
    </section>

    <section title="Back-to-back Frames">
      <t>Objective: To characterize the ability of a DUT to process
      back-to-back frames as defined in <xref target="RFC1242"/>.</t>

      <t>The Procedure follows.</t>

      <section title="Preparing the list of Frame sizes">
        <t>From the list of RECOMMENDED Frame sizes (Section 9 of <xref
        target="RFC2544"/>), select the subset of Frame sizes whose measured
        Throughput was less than the maximum theoretical Frame Rate. These are
        the only Frame sizes where it is possible to produce a burst of frames
        that cause the DUT buffers to fill and eventually overflow, producing
        one or more discarded frames.</t>
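
        <t>The selection above can be sketched in Python as follows. The
        function name, the dictionary layout, and the example values are
        hypothetical, and the exact comparison is an assumption; a tester
        may prefer a small tolerance when comparing measured and theoretical
        rates.</t>

        <t><figure>
            <artwork><![CDATA[
```python
def frame_sizes_to_test(throughput_fps, max_rate_fps):
    """Frame sizes with Throughput below the max frame rate.

    Both arguments map frame size (octets) to frames per second.
    """
    return sorted(size for size, fps in throughput_fps.items()
                  if fps < max_rate_fps[size])


# Hypothetical 10 Gbit/s Ethernet results, for illustration only.
max_rate = {64: 14_880_952, 512: 2_349_624, 1518: 812_744}
measured = {64: 11_000_000, 512: 2_349_624, 1518: 812_744}
print(frame_sizes_to_test(measured, max_rate))  # [64]
```
]]></artwork>
          </figure></t>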
      </section>

      <section title="Test for a Single Frame Size">
        <t>Each trial in the test requires the tester to send a burst of
        frames (after idle time) with the minimum inter-frame gap, and to
        count the corresponding frames forwarded by the DUT.</t>

        <t>The duration of the trial MUST be at least 2 seconds, to allow DUT
        buffers to deplete.</t>

        <t>If all frames have been received, the tester increases the length
        of the burst according to the search algorithm and performs another
        trial.</t>

        <t>If the received frame count is less than the number of frames in
        the burst, then the limit of DUT processing and buffering may have
        been exceeded, and the burst length is determined by the search
        algorithm for the next trial.</t>

        <t>Classic search algorithms have been adapted for use in
        benchmarking, where the search requires discovery of a pair of
        outcomes, one with no loss and another with loss, at load conditions
        within the acceptable tolerance. Also, conditions encountered when
        benchmarking the Infrastructure for Network Function Virtualization
        require algorithm enhancement. Fortunately, the adaptation of Binary
        Search, and an enhanced Binary Search with Loss Verification, have been
        specified in <xref target="TST009"/>. These algorithms (see clause
        12.3) can easily be used for Back-to-back Frame benchmarking by
        replacing the Offered Load level with burst length in frames. <xref
        target="TST009"/> Annex B describes the theory behind the enhanced
        Binary Search algorithm.</t>

        <t>Either the <xref target="TST009"/> Binary Search or the Binary
        Search with Loss Verification algorithm MUST be used, and the input
        parameters to the algorithm MUST be reported.</t>

        <t>The Back-to-back Frame value is the longest burst of frames that
        the DUT can successfully process and buffer without frame loss, as
        determined from the series of trials. The tester may impose a
        (configurable) minimum step size for burst length, and the step size
        MUST be reported with the results (as this influences the accuracy and
        variation of test results).</t>
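
        <t>The following Python sketch shows a plain binary search over
        burst length with a minimum step size. It is not the <xref
        target="TST009"/> algorithm and omits Loss Verification; the
        send_burst hook, the simulated DUT, and all parameter names are
        hypothetical and only illustrate how burst length replaces Offered
        Load as the search variable.</t>

        <t><figure>
            <artwork><![CDATA[
```python
def search_burst_length(send_burst, max_burst, min_step=1):
    """Binary search for the longest burst forwarded without loss.

    send_burst(n) is a hypothetical tester hook: it runs one trial
    with n back-to-back frames and returns the frames received.
    Returns 0 if every tested burst shows loss.
    """
    lo, hi = 0, max_burst + min_step  # lo: no loss; hi: loss or cap
    while hi - lo > min_step:
        mid = lo + (hi - lo) // 2
        if send_burst(mid) == mid:
            lo = mid   # all frames received: try a longer burst
        else:
            hi = mid   # loss seen: try a shorter burst
    return lo


def simulated_dut(limit):
    """Stand-in for a real trial: drops frames beyond 'limit'."""
    return lambda n: min(n, limit)


print(search_burst_length(simulated_dut(26_000), 100_000))
```
]]></artwork>
          </figure></t>

        <t>With a minimum step size larger than one frame, the sketch stops
        earlier and the reported value can fall short of the true limit by
        up to that step, which is one reason the step size is reported.</t>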
      </section>

      <section title="Test Repetition">
        <t>The test MUST be repeated N times for each frame size in the subset
        list, and each Back-to-back Frame value made available for further
        processing (below).</t>
      </section>

      <section title="Benchmark Calculations">
        <t>For each Frame size, calculate the following summary statistics for
        Back-to-back Frame values over the N tests:<list style="symbols">
            <t>Average (Benchmark)</t>

            <t>Minimum</t>

            <t>Maximum</t>

            <t>Standard Deviation</t>
          </list></t>

        <t>Further, calculate the Implied DUT Buffer Time and the Corrected
        DUT Buffer Time in seconds, as follows:<figure>
            <artwork><![CDATA[Implied DUT Buffer Time =

   Average num of Back-to-back Frames / Max Theoretical Frame Rate
]]></artwork>
          </figure>The formula above is simply expressing the Burst of Frames
        in units of time.</t>
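
        <t>A minimal Python sketch of the summary statistics and the Implied
        DUT Buffer Time follows. The function name, the dictionary keys, and
        the input values are hypothetical, and the use of the sample standard
        deviation is an assumption, since this memo does not state which form
        applies.</t>

        <t><figure>
            <artwork><![CDATA[
```python
import statistics


def summarize(b2b_values, max_rate_fps):
    """Summary statistics over the N Back-to-back Frame values."""
    average = statistics.mean(b2b_values)
    return {
        "n": len(b2b_values),
        "average": average,  # the Benchmark
        "minimum": min(b2b_values),
        "maximum": max(b2b_values),
        # Sample standard deviation (assumed); needs N >= 2.
        "std_dev": statistics.stdev(b2b_values),
        "implied_buffer_time_s": average / max_rate_fps,
    }


# Hypothetical values from N = 4 tests with 64-octet frames.
values = [25_500, 26_000, 26_500, 26_000]
result = summarize(values, 14_880_952)
print(result["average"], result["implied_buffer_time_s"])
```
]]></artwork>
          </figure></t>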

        <t>The next step is to apply a correction factor that accounts for the
        DUT's frame forwarding operation during the test (assuming a simple
        model of the DUT composed of a buffer and a forwarding function).</t>

        <t><figure>
            <artwork><![CDATA[Corrected DUT Buffer Time =

                  /                                         \
   Implied DUT    |Implied DUT       Measured Throughput    |
 = Buffer Time -  |Buffer Time * -------------------------- |
                  |              Max Theoretical Frame Rate |
                  \                                         /
 ]]></artwork>
          </figure></t>

        <t>where:<list style="numbers">
            <t>The &ldquo;Measured Throughput&rdquo; is the RFC2544 Throughput
            Benchmark for the frame size tested, and MUST be expressed in
            Frames per second in this equation.</t>

            <t>The &ldquo;Max Theoretical Frame Rate&rdquo; is a calculated
            value for the interface speed and link layer technology used, and
            MUST be expressed in Frames per second in this equation.</t>
          </list></t>

        <t>The term on the far right in the formula for Corrected DUT Buffer
        Time accounts for all the frames in the Burst that were transmitted by
        the DUT *while the Burst of frames was being sent in*. These frames
        are not in the Buffer, so the Buffer size is more accurately estimated
        by excluding them.</t>
      </section>
    </section>

    <section title="Reporting">
      <t>The back-to-back results SHOULD be reported in the format of a table
      with a row for each of the tested frame sizes. There SHOULD be columns
      for the frame size and for the resultant average frame count for each
      type of data stream tested.</t>

      <t>The number of tests averaged for the Benchmark, N, MUST be
      reported.</t>

      <t>The Minimum, Maximum, and Standard Deviation across all complete
      tests SHOULD also be reported.</t>

      <t>The Corrected DUT Buffer Time SHOULD also be reported.</t>

      <t>If the tester operates using a maximum burst length in frames, then
      this maximum length SHOULD be reported.</t>

      <texttable style="full" title="Back-to-Back Frame Results">
        <ttcol>Frame Size, octets</ttcol>

        <ttcol>Avg B2B Length, frames</ttcol>

        <ttcol>Min, Max, StdDev</ttcol>

        <ttcol>Corrected Buffer Time, sec</ttcol>

        <c>64</c>

        <c>26000</c>

        <c>25500,27000,20</c>

        <c>0.00004</c>
      </texttable>

      <t>Static and configuration parameters:<list style="symbols">
          <t>Number of test repetitions, N</t>

          <t>Minimum Step Size (during searches), in frames</t>
        </list></t>
    </section>

    <section title="Security Considerations">
      <t>Benchmarking activities as described in this memo are limited to
      technology characterization using controlled stimuli in a laboratory
      environment, with dedicated address space and the other constraints of
      <xref target="RFC2544"/>.</t>

      <t>The benchmarking network topology will be an independent test setup
      and MUST NOT be connected to devices that may forward the test traffic
      into a production network, or misroute traffic to the test management
      network. See <xref target="RFC6815"/>.</t>

      <t>Further, benchmarking is performed on a "black-box" basis, relying
      solely on measurements observable external to the DUT/SUT.</t>

      <t>Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
      benchmarking purposes. Any implications for network security arising
      from the DUT/SUT SHOULD be identical in the lab and in production
      networks.</t>
    </section>

    <section anchor="IANA" title="IANA Considerations">
      <t>This memo makes no requests of IANA.</t>
    </section>

    <section title="Acknowledgements">
      <t>Thanks to Trevor Cooper, Sridhar Rao, and Martin Klozik of the VSPERF
      project for many contributions to the testing <xref
      target="VSPERF-b2b"/>. Yoshiaki Itou has also investigated the topic,
      and made useful suggestions.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      <?rfc include='reference.RFC.2544'?>

      <?rfc include='reference.RFC.1944'?>

      <?rfc include='reference.RFC.1242'?>

      <?rfc include="reference.RFC.2119"?>

      <?rfc include='reference.RFC.6201'?>

      <?rfc include='reference.RFC.6815'?>

      <?rfc include='reference.RFC.5180'?>

      <?rfc include='reference.RFC.6985'?>

      <?rfc include='reference.RFC.8174'?>

    </references>

    <references title="Informative References">
      <reference anchor="OPNFV-2017"
                 target="https://wiki.opnfv.org/download/attachments/10293193/VSPERF-Dataplane-Perf-Cap-Bench.pptx?api=v2">
        <front>
          <title>Dataplane Performance, Capacity, and Benchmarking in
          OPNFV</title>

          <author fullname="Trevor Cooper" initials="T." surname="Cooper">
            <organization>Intel Corp.</organization>
          </author>

          <author fullname="Al Morton" initials="A." surname="Morton">
            <organization>AT&amp;T Labs</organization>
          </author>

          <author fullname="Sridhar Rao" initials="S." surname="Rao">
            <organization>Spirent Communications</organization>
          </author>

          <date day="15" month="June" year="2017"/>
        </front>
      </reference>

      <reference anchor="VSPERF-b2b"
                 target="https://wiki.opnfv.org/display/vsperf/Traffic+Generator+Testing#TrafficGeneratorTesting-AppendixB:Back2BackTestingTimeSeries(fromCI)">
        <front>
          <title>Back2Back Testing Time Series (from CI)</title>

          <author fullname="Al Morton" initials="A." surname="Morton">
            <organization/>
          </author>

          <date month="June" year="2017"/>
        </front>
      </reference>

      <reference anchor="VSPERF-BSLV"
                 target="https://datatracker.ietf.org/meeting/102/materials/slides-102-bmwg-evolution-of-repeatability-in-benchmarking-fraser-plugfest-summary-for-ietf-bmwg-00">
        <front>
          <title>Evolution of Repeatability in Benchmarking: Fraser Plugfest
          (Summary for IETF BMWG)</title>

          <author fullname="Al Morton" initials="A." surname="Morton">
            <organization>AT&amp;T Labs</organization>
          </author>

          <author fullname="Sridhar Rao" initials="S." surname="Rao">
            <organization>Spirent Communications</organization>
          </author>

          <date month="July" year="2018"/>
        </front>
      </reference>

      <reference anchor="TST009"
                 target="https://www.etsi.org/deliver/etsi_gs/NFV-TST/001_099/009/03.01.01_60/gs_NFV-TST009v030101p.pdf">
        <front>
          <title>ETSI GS NFV-TST 009 V3.1.1 (2018-10), "Network Functions
          Virtualisation (NFV) Release 3; Testing; Specification of Networking
          Benchmarks and Measurement Methods for NFVI"</title>

          <author fullname="Rapporteur: Al Morton">
            <organization>ETSI Network Function Virtualization
            ISG</organization>
          </author>

          <date month="October" year="2018"/>
        </front>
      </reference>

      <?rfc include='reference.RFC.8239'?>

    </references>
  </back>
</rfc>
