Network Working Group                                    B. Constantine
Internet Draft                                                     JDSU
Intended status: Informational                              R. Krishnan
Expires: December 2, 2015                       Brocade Communications
                                                           June 2, 2015

                    Traffic Management Benchmarking
               draft-ietf-bmwg-traffic-management-05.txt

Status of this Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute working
documents as Internet-Drafts. The list of current Internet-Drafts is
at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 2, 2015.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

This framework describes a practical methodology for benchmarking the
traffic management capabilities of networking devices (i.e. policing,
shaping, etc.). The goal is to provide a repeatable test method that
objectively compares performance of the device's traffic management
capabilities and to specify the means to benchmark traffic management
with representative application traffic.

Table of Contents

1. Introduction...................................................4
1.1.
Traffic Management Overview...............................4 55 1.2. DUT Lab Configuration and Testing Overview................5 56 2. Conventions used in this document..............................7 57 3. Scope and Goals................................................8 58 4. Traffic Benchmarking Metrics...................................9 59 4.1. Metrics for Stateless Traffic Tests......................10 60 4.2. Metrics for Stateful Traffic Tests.......................11 61 5. Tester Capabilities...........................................12 62 5.1. Stateless Test Traffic Generation........................13 63 5.1.1. Burst Hunt with Stateless Traffic...................13 64 5.2. Stateful Test Pattern Generation.........................13 65 5.2.1. TCP Test Pattern Definitions........................15 66 6. Traffic Benchmarking Methodology..............................16 67 6.1. Policing Tests...........................................16 68 6.1.1 Policer Individual Tests................................17 69 6.1.2 Policer Capacity Tests..............................18 70 6.1.2.1 Maximum Policers on Single Physical Port..........19 71 6.1.2.2 Single Policer on All Physical Ports..............20 72 6.1.2.3 Maximum Policers on All Physical Ports............21 73 6.2. Queue/Scheduler Tests....................................21 74 6.2.1 Queue/Scheduler Individual Tests........................21 75 6.2.1.1 Testing Queue/Scheduler with Stateless Traffic....21 76 6.2.1.2 Testing Queue/Scheduler with Stateful Traffic.....23 77 6.2.2 Queue / Scheduler Capacity Tests......................25 78 6.2.2.1 Multiple Queues / Single Port Active..............25 79 6.2.2.1.1 Strict Priority on Egress Port..................26 80 6.2.2.1.2 Strict Priority + Weighted Fair Queue (WFQ).....26 81 6.2.2.2 Single Queue per Port / All Ports Active..........27 82 6.2.2.3 Multiple Queues per Port, All Ports Active........27 83 6.3. Shaper tests.............................................28 84 6.3.1 Shaper Individual Tests...............................28 85 6.3.1.1 Testing Shaper with Stateless Traffic.............29 86 6.3.1.2 Testing Shaper with Stateful Traffic..............30 87 6.3.2 Shaper Capacity Tests.................................32 88 6.3.2.1 Single Queue Shaped, All Physical Ports Active....32 89 6.3.2.2 All Queues Shaped, Single Port Active.............32 90 6.3.2.3 All Queues Shaped, All Ports Active...............33 91 6.4. Concurrent Capacity Load Tests...........................34 92 7. Security Considerations.......................................34 93 8. IANA Considerations...........................................34 94 9. References....................................................35 95 9.1. Normative References.....................................35 96 9.2. Informative References...................................35 97 Appendix A: Open Source Tools for Traffic Management Testing.....36 98 Appendix B: Stateful TCP Test Patterns...........................37 99 Acknowledgments..................................................41 100 Authors' Addresses...............................................42 102 1. Introduction 104 Traffic management (i.e. policing, shaping, etc.) is an increasingly 105 important component when implementing network Quality of Service 106 (QoS). 108 There is currently no framework to benchmark these features 109 although some standards address specific areas which are described 110 in Section 1.1. 
This draft provides a framework to conduct repeatable traffic
management benchmarks for devices and systems in a lab environment.

Specifically, this framework defines the methods to characterize the
capacity of the following traffic management features in network
devices: classification, policing, queuing / scheduling, and traffic
shaping.

This benchmarking framework can also be used as a test procedure to
assist in the tuning of traffic management parameters before service
activation. In addition to Layer 2/3 (Ethernet / IP) benchmarking,
Layer 4 (TCP) test patterns are proposed by this draft in order to
more realistically benchmark end-user traffic.

1.1. Traffic Management Overview

In general, a device with traffic management capabilities performs
the following functions:

- Traffic classification: identifies traffic according to various
configuration rules (for example, IEEE 802.1Q Virtual LAN (VLAN),
Differentiated Services Code Point (DSCP), etc.) and marks this
traffic internally to the network device. Multiple external
priorities (DSCP, 802.1p, etc.) can map to the same priority in the
device.
- Traffic policing: limits the rate of traffic that enters a network
device according to the traffic classification. If the traffic
exceeds the provisioned limits, the traffic is either dropped or
remarked and forwarded on to the next network device.
- Traffic Scheduling: provides traffic classification within the
network device by directing packets to various types of queues and
applies a dispatching algorithm to assign the forwarding sequence of
packets.
- Traffic shaping: a traffic control technique that actively buffers
and smooths the output rate in an attempt to adapt bursty traffic to
the configured limits.
- Active Queue Management (AQM): AQM involves monitoring the status
of internal queues and proactively dropping (or remarking) packets,
which causes hosts using congestion-aware protocols to back off and
in turn alleviates queue congestion [AQM-RECO]. On the other hand,
classic traffic management techniques reactively drop (or remark)
packets based on a queue full condition. The benchmarking scenarios
for AQM are different and are outside the scope of this testing
framework.

Even though AQM is outside the scope of this framework, it should be
noted that the TCP metrics and TCP test patterns (defined in Sections
4.2 and 5.2, respectively) could be useful to test new AQM algorithms
(targeted to alleviate buffer bloat). Examples of these algorithms
include CoDel and PIE (draft-ietf-aqm-codel and draft-ietf-aqm-pie).

The following diagram is a generic model of the traffic management
capabilities within a network device. It is not intended to represent
all variations of manufacturer traffic management capabilities, but
to provide context for this test framework.
168 |----------| |----------------| |--------------| |----------| 169 | | | | | | | | 170 |Interface | |Ingress Actions | |Egress Actions| |Interface | 171 |Input | |(classification,| |(scheduling, | |Output | 172 |Queues | | marking, | | shaping, | |Queues | 173 | |-->| policing or |-->| active queue |-->| | 174 | | | shaping) | | management | | | 175 | | | | | remarking) | | | 176 |----------| |----------------| |--------------| |----------| 178 Figure 1: Generic Traffic Management capabilities of a Network Device 180 Ingress actions such as classification are defined in [RFC4689] 181 and include IP addresses, port numbers, DSCP, etc. In terms of 182 marking, [RFC2697] and [RFC2698] define a single rate and dual rate, 183 three color marker, respectively. 185 The Metro Ethernet Forum (MEF) specifies policing and shaping in 186 terms of Ingress and Egress Subscriber/Provider Conditioning 187 Functions in MEF12.1 [MEF-12.1]; Ingress and Bandwidth Profile 188 attributes in MEF10.2 [MEF-10.2] and MEF 26 [MEF-26]. 190 1.2 Lab Configuration and Testing Overview 192 The following is the description of the lab set-up for the traffic 193 management tests: 195 +--------------+ +-------+ +----------+ +-----------+ 196 | Transmitting | | | | | | Receiving | 197 | Test Host | | | | | | Test Host | 198 | |-----| Device|---->| Network |--->| | 199 | | | Under | | Delay | | | 200 | | | Test | | Emulator | | | 201 | |<----| |<----| |<---| | 202 | | | | | | | | 203 +--------------+ +-------+ +----------+ +-----------+ 205 As shown in the test diagram, the framework supports uni-directional 206 and bi-directional traffic management tests (where the transmitting 207 and receiving roles would be reversed on the return path). 209 This testing framework describes the tests and metrics for each of 210 the following traffic management functions: 211 - Classification 212 - Policing 213 - Queuing / Scheduling 214 - Shaping 216 The tests are divided into individual and rated capacity tests. 217 The individual tests are intended to benchmark the traffic management 218 functions according to the metrics defined in Section 4. The 219 capacity tests verify traffic management functions under the load of 220 many simultaneous individual tests and their flows. 222 This involves concurrent testing of multiple interfaces with the 223 specific traffic management function enabled, and increasing load to 224 the capacity limit of each interface. 226 As an example: a device is specified to be capable of shaping on all 227 of its egress ports. The individual test would first be conducted to 228 benchmark the specified shaping function against the metrics defined 229 in section 4. Then the capacity test would be executed to test the 230 shaping function concurrently on all interfaces and with maximum 231 traffic load. 233 The Network Delay Emulator (NDE) is required for TCP stateful tests 234 in order to allow TCP to utilize a significant size TCP window in its 235 control loop. 237 Also note that the Network Delay Emulator (NDE) SHOULD be passive in 238 nature such as a fiber spool. This is recommended to eliminate the 239 potential effects that an active delay element (i.e. test impairment 240 generator) may have on the test flows. In the case where a fiber 241 spool is not practical due to the desired latency, an active NDE MUST 242 be independently verified to be capable of adding the configured 243 delay without loss. In other words, the DUT would be removed and the 244 NDE performance benchmarked independently. 
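One way to record the result of this independent NDE verification is
sketched below in Python. It is a minimal sketch only: the configured
delay, tolerance, and measured values are hypothetical, and the final
line simply derives the BDP that the verified delay yields for the
stateful TCP tests using BDP = BB * RTT / 8, the relationship applied
later in Section 6.2.1.2.

   # Sketch (hypothetical values): check the NDE's measured delay per
   # frame size against the configured delay, then derive the BDP that
   # the configured RTT yields for the stateful TCP tests.

   CONFIGURED_RTT_S = 0.005          # 5 ms emulated RTT (example value)
   BOTTLENECK_BW_BPS = 100_000_000   # 100 Mbps bottleneck bandwidth (BB)
   TOLERANCE_S = 0.0005              # acceptable delay error (example value)

   # Hypothetical measured RTTs through the NDE, per frame size (bytes -> s)
   measured_rtt_s = {128: 0.00502, 512: 0.00501, 1500: 0.00505, 9600: 0.00507}

   for size, rtt in sorted(measured_rtt_s.items()):
       error = abs(rtt - CONFIGURED_RTT_S)
       status = "OK" if error <= TOLERANCE_S else "OUT OF TOLERANCE"
       print(f"{size:5d} bytes: measured RTT {rtt * 1000:.3f} ms ({status})")

   # BDP = BB * RTT / 8 (bytes), per the formula used in Section 6.2.1.2
   bdp_bytes = BOTTLENECK_BW_BPS * CONFIGURED_RTT_S / 8
   print(f"Expected BDP: {bdp_bytes / 1000:.1f} KB")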
246 Note that the NDE SHOULD be used only as emulated delay. Most NDEs 247 allow for per flow delay actions, emulating QoS prioritization. For 248 this framework, the NDE's sole purpose is simply to add delay to all 249 packets (emulate network latency). So to benchmark the performance of 250 the NDE, maximum offered load should be tested against the following 251 frame sizes: 128, 256, 512, 768, 1024, 1500,and 9600 bytes. The delay 252 accuracy at each of these packet sizes can then be used to calibrate 253 the range of expected Bandwidth Delay Product (BDP) for the TCP 254 stateful tests. 256 2. Conventions used in this document 258 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 259 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 260 document are to be interpreted as described in [RFC2119]. 262 The following acronyms are used: 264 AQM: Active Queue Management 266 BB: Bottleneck Bandwidth 268 BDP: Bandwidth Delay Product 270 BSA: Burst Size Achieved 272 CBS: Committed Burst Size 274 CIR: Committed Information Rate 276 DUT: Device Under Test 278 EBS: Excess Burst Size 280 EIR: Excess Information Rate 282 NDE: Network Delay Emulator 284 SP: Strict Priority Queuing 286 QL: Queue Length 288 QoS: Quality of Service 290 RTH: Receiving Test Host 292 RTT: Round Trip Time 294 SBB: Shaper Burst Bytes 296 SBI: Shaper Burst Interval 298 SR: Shaper Rate 300 SSB: Send Socket Buffer 302 Tc: CBS Time Interval 304 Te: EBS Time Interval 306 Ti Transmission Interval 308 TTH: Transmitting Test Host 309 TTP: TCP Test Pattern 311 TTPET: TCP Test Pattern Execution Time 313 3. Scope and Goals 315 The scope of this work is to develop a framework for benchmarking and 316 testing the traffic management capabilities of network devices in the 317 lab environment. These network devices may include but are not 318 limited to: 319 - Switches (including Layer 2/3 devices) 320 - Routers 321 - Firewalls 322 - General Layer 4-7 appliances (Proxies, WAN Accelerators, etc.) 324 Essentially, any network device that performs traffic management as 325 defined in section 1.1 can be benchmarked or tested with this 326 framework. 328 The primary goal is to assess the maximum forwarding performance 329 deemed to be within the provisioned traffic limits that a network 330 device can sustain without dropping or impairing packets, or 331 compromising the accuracy of multiple instances of traffic 332 management functions. This is the benchmark for comparison between 333 devices. 335 Within this framework, the metrics are defined for each traffic 336 management test but do not include pass / fail criterion, which is 337 not within the charter of BMWG. This framework provides the test 338 methods and metrics to conduct repeatable testing, which will 339 provide the means to compare measured performance between DUTs. 341 As mentioned in section 1.2, these methods describe the individual 342 tests and metrics for several management functions. It is also within 343 scope that this framework will benchmark each function in terms of 344 overall rated capacity. This involves concurrent testing of multiple 345 interfaces with the specific traffic management function enabled, up 346 to the capacity limit of each interface. 348 It is not within scope of this framework to specify the procedure for 349 testing multiple configurations of traffic management functions 350 concurrently. 
The multitude of possible combinations is almost unbounded and the
ability to identify functional "break points" would be almost
impossible.

However, section 6.4 provides suggestions for some profiles of
concurrent functions that would be useful to benchmark. The key
requirement for any concurrent test function is that tests MUST
produce reliable and repeatable results.

Also, it is not within scope to perform conformance testing. Tests
defined in this framework benchmark the traffic management functions
according to the metrics defined in section 4 and do not address any
conformance to standards related to traffic management.

The current specifications do not specify exact behavior or
implementation, and the specifications that do exist (cited in
Section 1.1) allow implementations to vary w.r.t. short term rate
accuracy and other factors. This is a primary driver for this
framework: to provide an objective means to compare vendor traffic
management functions.

Another goal is to devise methods that utilize flows with
congestion-aware transport (TCP) as part of the traffic load and
still produce repeatable results in the isolated test environment.
This framework will derive stateful test patterns (TCP or application
layer) that can also be used to further benchmark the performance of
applicable traffic management techniques such as queuing / scheduling
and traffic shaping. In cases where the network device is stateful in
nature (i.e. firewall, etc.), stateful test pattern traffic is
important to test along with stateless, UDP traffic in specific test
scenarios (i.e. applications using TCP transport and UDP VoIP, etc.).

As mentioned earlier in the document, repeatability of test results
is critical, especially considering the nature of stateful TCP
traffic. To this end, the stateful tests will use TCP test patterns
to emulate applications. This framework also provides guidelines for
application modeling and open source tools to achieve the repeatable
stimulus. Finally, TCP metrics from [RFC6349] MUST be measured for
each stateful test and provide the means to compare each repeated
test.

Even though the scope is targeted to TCP applications (i.e. Web,
Email, database, etc.), the framework could be applied to SCTP in
terms of test patterns. WebRTC, SS7 signaling, and 3GPP are examples
of protocols that use SCTP and that could be modeled with this
framework to benchmark SCTP's effect on traffic management
performance.

Also note that currently, this framework does not address tcpcrypt
(encrypted TCP) test patterns, although the metrics defined in
Section 4.2 can still be used since the metrics are based on TCP
retransmission and RTT measurements (versus any of the payload). Thus
if tcpcrypt becomes popular, it would be natural for benchmarkers to
consider encrypted TCP patterns and include them in test cases.

4. Traffic Benchmarking Metrics

The metrics to be measured during the benchmarks are divided into two
(2) sections: packet layer metrics used for the stateless traffic
testing and TCP layer metrics used for the stateful traffic testing.

4.1. Metrics for Stateless Traffic Tests

Stateless traffic measurements require that a sequence number and
time-stamp be inserted into the payload for lost packet analysis.
Delay analysis may be achieved by insertion of timestamps directly
into the packets or timestamps stored elsewhere (packet captures).
This framework does not specify the packet format to carry sequence
number or timing information.

However, [RFC4737] and [RFC4689] provide recommendations for sequence
tracking along with definitions of in-sequence and out-of-order
packets.

The following are the metrics that MUST be measured during the
stateless traffic benchmarking components of the tests:

- Burst Size Achieved (BSA): for the traffic policing and network
queue tests, the tester will be configured to send bursts to test
either the Committed Burst Size (CBS) or Excess Burst Size (EBS) of a
policer or the queue / buffer size configured in the DUT. The Burst
Size Achieved metric is a measure of the actual burst size received
at the egress port of the DUT with no lost packets. As an example, if
the configured CBS of a DUT is 64 KB and after the burst test only
63 KB can be achieved without packet loss, then 63 KB is the BSA.
Also, the average Packet Delay Variation (PDV, see below) as
experienced by the packets sent at the BSA burst size should be
recorded. This metric shall be reported in units of bytes, KBytes, or
MBytes.

- Lost Packets (LP): For all traffic management tests, the tester
will transmit the test packets into the DUT ingress port and the
number of packets received at the egress port will be measured. The
difference between packets transmitted into the ingress port and
received at the egress port is the number of lost packets as measured
at the egress port. These packets must have unique identifiers such
that only the test packets are measured. For cases where multiple
flows are transmitted from ingress to egress port (e.g. IP
conversations), each flow must have sequence numbers within the test
packet stream.

[RFC6703] and [RFC2680] describe the need to establish the time
threshold to wait before a packet is declared as lost, and this
threshold MUST be reported with the results. This metric shall be
reported as an integer number which cannot be negative. (see:
http://tools.ietf.org/html/rfc6703#section-4.1)

- Out of Order (OOO): in addition to the LP metric, the test packets
must be monitored for sequence. [RFC4689] defines the general
function of sequence tracking, as well as definitions for in-sequence
and out-of-order packets. Out-of-order packets will be counted per
[RFC4737]. This metric shall be reported as an integer number which
cannot be negative.

- Packet Delay (PD): the Packet Delay metric is the difference
between the timestamp of the received egress port packets and the
packets transmitted into the ingress port, as specified in [RFC1242].
The transmitting host and receiving host time must be in time sync
using NTP, GPS, etc. This metric SHALL be reported as a real number
of seconds, where a negative measurement usually indicates a time
synchronization problem between test devices.

- Packet Delay Variation (PDV): the Packet Delay Variation metric is
the variation between the timestamps of the received egress port
packets, as specified in [RFC5481]. Note that per [RFC5481], this PDV
is the variation of one-way delay across many packets in the traffic
flow.
Per the measurement formula in [RFC5481], select the high percentile
of 99%, and the units of measure will be a real number of seconds (a
negative value is not possible for PDV and would indicate a
measurement error).

- Shaper Rate (SR): The SR represents the average DUT output rate
(bps) over the test interval. The Shaper Rate is only applicable to
the traffic shaping tests.

- Shaper Burst Bytes (SBB): A traffic shaper will emit packets in
different size "trains"; these are frames transmitted "back-to-back",
respecting the mandatory inter-frame gap. This metric characterizes
the method by which the shaper emits traffic. Some shapers transmit
larger bursts per interval, and a burst of 1 packet would apply to
the extreme case of a shaper sending a CBR stream of single packets.
This metric SHALL be reported in units of bytes, KBytes, or MBytes.
Shaper Burst Bytes is only applicable to the traffic shaping tests.

- Shaper Burst Interval (SBI): the SBI is the time between shaper
emitted bursts and is measured at the DUT egress port. This metric
shall be reported as a real number of seconds. Shaper Burst Interval
is only applicable to the traffic shaping tests.

4.2. Metrics for Stateful Traffic Tests

The stateful metrics will be based on [RFC6349] TCP metrics and MUST
include:

- TCP Test Pattern Execution Time (TTPET): [RFC6349] defined the TCP
Transfer Time for bulk transfers, which is simply the measured time
to transfer bytes across single or concurrent TCP connections. The
TCP test patterns used in traffic management tests will include bulk
transfer and interactive applications. The interactive patterns
include instances such as HTTP business applications, database
applications, etc. The TTPET will be the measure of the time for a
single execution of a TCP Test Pattern (TTP). Average, minimum, and
maximum times will be measured or calculated and expressed as a real
number of seconds.

An example would be an interactive HTTP TTP session which should take
5 seconds on a GigE network with 0.5 millisecond latency. During ten
(10) executions of this TTP, the TTPET results might be: average of
6.5 seconds, minimum of 5.0 seconds, and maximum of 7.9 seconds.

- TCP Efficiency: after the execution of the TCP Test Pattern, TCP
Efficiency represents the percentage of Bytes that were not
retransmitted.

                       Transmitted Bytes - Retransmitted Bytes
   TCP Efficiency % = -----------------------------------------  X 100
                                  Transmitted Bytes

Transmitted Bytes are the total number of TCP Bytes to be transmitted
including the original and the retransmitted Bytes. These
retransmitted bytes should be recorded from the sender's TCP/IP stack
perspective, to avoid any misinterpretation that a reordered packet
is a retransmitted packet (as may be the case with packet decode
interpretation).

- Buffer Delay: represents the increase in RTT during a TCP test
versus the baseline DUT RTT (non-congested, inherent latency). RTT
and the technique to measure RTT (average versus baseline) are
defined in [RFC6349]. Referencing [RFC6349], the average RTT is
derived from the total of all measured RTTs during the actual test
sampled at every second divided by the test duration in seconds.
                                    Total RTTs during transfer
   Average RTT during transfer = -------------------------------
                                   Transfer duration in seconds

                     Average RTT during Transfer - Baseline RTT
   Buffer Delay % = -------------------------------------------- X 100
                                   Baseline RTT

Note that even though this was not explicitly stated in [RFC6349],
retransmitted packets should not be used in RTT measurements.

Also, the test results should record the average RTT in milliseconds
across the entire test duration and the number of samples.

5. Tester Capabilities

The testing capabilities of the traffic management test environment
are divided into two (2) sections: stateless traffic testing and
stateful traffic testing.

5.1. Stateless Test Traffic Generation

The test device MUST be capable of generating traffic at up to the
link speed of the DUT. The test device must be calibrated to verify
that it will not drop any packets. The test device's inherent PD and
PDV must also be calibrated and subtracted from the PD and PDV
metrics. The test device must support the encapsulation to be tested
such as IEEE 802.1Q VLAN, IEEE 802.1ad Q-in-Q, Multiprotocol Label
Switching (MPLS), etc. Also, the test device must allow control of
the classification techniques defined in [RFC4689] (i.e. IP address,
DSCP, TOS, etc.).

The open source tool "iperf" can be used to generate stateless UDP
traffic and is discussed in Appendix A. Since iperf is a software
based tool, there will be performance limitations at higher link
speeds (e.g. GigE, 10 GigE, etc.). Careful calibration of any test
environment using iperf is important. At higher link speeds, it is
recommended to use hardware based packet test equipment.

5.1.1 Burst Hunt with Stateless Traffic

A central theme for the traffic management tests is to benchmark the
specified burst parameter of a traffic management function, since
burst parameters of SLAs are specified in bytes. For testing
efficiency, it is recommended to include a burst hunt feature, which
automates the manual process of determining the maximum burst size
which can be supported by a traffic management function.

The burst hunt algorithm should start at the target burst size
(maximum burst size supported by the traffic management function) and
will send single bursts until it can determine the largest burst that
can pass without loss. If the target burst size passes, then the test
is complete. The hunt aspect occurs when the target burst size is not
achieved; the algorithm will drop down to a configured minimum burst
size and incrementally increase the burst until the maximum burst
supported by the DUT is discovered. The recommended granularity of
the incremental burst size increase is 1 KB.

Optionally, for a policer function, if the burst size passes, the
burst should be increased by increments of 1 KB to verify that the
policer is truly configured properly (or enabled at all).
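The following is a minimal sketch of this hunt logic in Python.
send_burst() is a hypothetical tester primitive: it transmits one
burst of the given size at line rate and reports whether the entire
burst was received at the DUT egress port without loss. The 1 KB step
matches the recommended granularity above; this is an illustration of
the procedure, not a normative implementation.

   # Sketch of the burst hunt procedure from Section 5.1.1.
   # send_burst(size_bytes) is a hypothetical tester primitive that
   # returns True if the whole burst exits the DUT egress without loss.

   STEP_BYTES = 1024  # recommended 1 KB granularity

   def burst_hunt(target_burst, min_burst, send_burst):
       """Return the largest burst size (bytes) that passes without loss."""
       # First try the target burst size (e.g. the configured CBS or QL).
       if send_burst(target_burst):
           return target_burst
       # Hunt: start at the configured minimum and increase in 1 KB steps
       # until a burst is lost; the previous size is the supported maximum.
       largest_passing = 0
       size = min_burst
       while size < target_burst:
           if not send_burst(size):
               break
           largest_passing = size
           size += STEP_BYTES
       return largest_passing

For the optional policer check, the returned size can then be
increased in further 1 KB steps to confirm that larger bursts are in
fact dropped or remarked.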
5.2. Stateful Test Pattern Generation

The TCP test host will have many of the same attributes as the TCP
test host defined in [RFC6349]. The TCP test device may be a standard
computer or a dedicated communications test instrument. In both
cases, it must be capable of emulating both a client and a server.

For any test using stateful TCP test traffic, the Network Delay
Emulator (NDE function from the lab set-up diagram) must be used in
order to provide a meaningful BDP. As referenced in section 1.2, the
target traffic rate and configured RTT MUST be verified independently
using just the NDE for all stateful tests (to ensure the NDE can
delay without loss).

The TCP test host MUST be capable of generating and receiving
stateful TCP test traffic at the full link speed of the DUT. As a
general rule of thumb, testing TCP Throughput at rates greater than
500 Mbps may require high performance server hardware or dedicated
hardware based test tools.

The TCP test host MUST allow adjusting both Send and Receive Socket
Buffer sizes. The Socket Buffers must be large enough to fill the BDP
for bulk transfer TCP test application traffic.

Measuring RTT and retransmissions per connection will generally
require a dedicated communications test instrument. In the absence of
dedicated hardware based test tools, these measurements may need to
be conducted with packet capture tools, i.e. conduct TCP Throughput
tests and analyze RTT and retransmissions in packet captures.

The TCP implementation used by the test host MUST be specified in the
test results (e.g. TCP New Reno, TCP options supported, etc.).
Additionally, the test results SHALL provide specific congestion
control algorithm details, as per [RFC3148].

While [RFC6349] defined the means to conduct throughput tests of TCP
bulk transfers, the traffic management framework will extend TCP test
execution into interactive TCP application traffic. Examples include
email, HTTP, business applications, etc. This interactive traffic is
bi-directional and can be chatty, meaning many turns in traffic
communication during the course of a transaction (versus the
relatively uni-directional flow of bulk transfer applications).

The test device must not only support bulk TCP transfer application
traffic but MUST also support chatty traffic. A valid stress test
SHOULD include both traffic types. This is due to the non-uniform,
bursty nature of chatty applications versus the relatively uniform
nature of bulk transfers (the bulk transfer smoothly stabilizes to
equilibrium state under lossless conditions).

While iperf is an excellent choice for TCP bulk transfer testing, the
netperf open source tool provides the ability to control the client
and server request / response behavior. The netperf-wrapper tool is a
Python wrapper to run multiple simultaneous netperf instances and
aggregate the results. Appendix A provides an overview of netperf /
netperf-wrapper and another open source application emulation tool,
iperf. As with any software based tool, the performance must be
qualified to the link speed to be tested. Hardware-based test
equipment should be considered for reliable results at higher link
speeds (e.g. 1 GigE, 10 GigE).

5.2.1. TCP Test Pattern Definitions

As mentioned in the goals of this framework, techniques are defined
to specify TCP traffic test patterns to benchmark traffic management
technique(s) and produce repeatable results. Some network devices,
such as firewalls, will not process stateless test traffic, which is
another reason why stateful TCP test traffic must be used.
An application could be fully emulated up to Layer 7; however, this
framework proposes that stateful TCP test patterns be used in order
to provide granular and repeatable control for the benchmarks. The
following diagram illustrates a simple Web Browsing application
(HTTP).

                     GET url
   Client    ------------------------>   Web

   Web            200 OK        100ms |

   Browser   <------------------------   Server

In this example, the Client Web Browser (Client) requests a URL and
then the Web Server delivers the web page content to the Client
(after a Server delay of 100 milliseconds). This asynchronous,
"request/response" behavior is intrinsic to most TCP based
applications such as Email (SMTP), File Transfers (FTP and SMB),
Database (SQL), Web Applications (SOAP), REST, etc. The impact to the
network elements is due to the multitudes of Clients and the variety
of bursty traffic, which stresses traffic management functions. The
actual emulation of the specific application protocols is not
required, and TCP test patterns can be defined to mimic the
application network traffic flows and produce repeatable results.

Application modeling techniques have been proposed in "3GPP2
C.R1002-0 v1.0", which provides examples to model the behavior of
HTTP, FTP, and WAP applications at the TCP layer. The models have
been defined with various mathematical distributions for the
Request/Response bytes and inter-request gap times. The model
definition format described in this work is the basis for the
guidelines provided in Appendix B and is also similar to formats used
by network modeling tools. Packet captures can also be used to
characterize application traffic and specify some of the test
patterns listed in Appendix B.

This framework does not specify a fixed set of TCP test patterns, but
does provide test cases that SHOULD be performed in Appendix B. Some
of these examples reflect those specified in "draft-ietf-bmwg-ca-
bench-meth-04", which suggests traffic mixes for a variety of
representative application profiles. Other examples are simply
well-known application traffic types such as HTTP.

6. Traffic Benchmarking Methodology

The traffic benchmarking methodology uses the test set-up from
section 1.2 and the metrics defined in section 4.

Each test SHOULD compare the network device's internal statistics
(available via command line management interface, SNMP, etc.) to the
measured metrics defined in section 4. This evaluates the accuracy of
the internal traffic management counters under individual test
conditions and capacity test conditions that are defined in each
subsection. This comparison is not intended to compare real-time
statistics, but the cumulative statistics reported after the test has
completed and device counters have updated (it is common for device
counters to update after a 10 second or greater interval).

From a device configuration standpoint, scheduling and shaping
functionality can be applied to logical ports such as a Link
Aggregation Group (LAG). This would result in the same scheduling and
shaping configuration applied to all the member physical ports. The
focus of this draft is only on tests at a physical port level.

The following sections provide the objective, procedure, metrics, and
reporting format for each test. For all test steps, the following
global parameters must be specified:
Test Runs (Tr). Defines the number of times the test needs to be run
to ensure accurate and repeatable results. The recommended value is a
minimum of 10.

Test Duration (Td). Defines the duration of a test iteration,
expressed in seconds. The recommended minimum value is 60 seconds.

The variability in the test results MUST be measured between Test
Runs, and if the variation is characterized as a significant portion
of the measured values, the next step may be to revise the methods to
achieve better consistency.

6.1. Policing Tests

A policer is defined as the entity performing the policing function.
The intent of the policing tests is to verify the policer performance
(i.e. CIR-CBS and EIR-EBS parameters). The tests will verify that the
network device can handle the CIR with CBS and the EIR with EBS and
will use back-to-back packet testing concepts from [RFC2544] (but
adapted to burst size algorithms and terminology). Also, [MEF-14],
[MEF-19], and [MEF-37] provide some basis for specific components of
this test. The burst hunt algorithm defined in section 5.1.1 can also
be used to automate the measurement of the CBS value.

The tests are divided into two (2) sections: individual policer tests
and then full capacity policing tests. It is important to benchmark
the basic functionality of the individual policer and then proceed
into the fully rated capacity of the device. This capacity may
include the number of policing policies per device and the number of
policers simultaneously active across all ports.

6.1.1 Policer Individual Tests

Objective:
Test a policer as defined by [RFC4115] or MEF 10.2, depending upon
the equipment's specification. In addition to verifying that the
policer allows the specified CBS and EBS bursts to pass, the policer
test MUST verify that the policer will remark or drop excess traffic,
and pass traffic at the specified CBS/EBS values.

Test Summary:
Policing tests should use stateless traffic. Stateful TCP test
traffic will generally be adversely affected by a policer in the
absence of traffic shaping. So while TCP traffic could be used, it is
more accurate to benchmark a policer with stateless traffic.

As an example for [RFC4115], consider a CBS and EBS of 64KB and CIR
and EIR of 100 Mbps on a 1GigE physical link (in color-blind mode). A
stateless traffic burst of 64KB would be sent into the policer at the
GigE rate. This equates to approximately a 0.512 millisecond burst
time (64 KB at 1 GigE). The traffic generator must space these bursts
to ensure that the aggregate throughput does not exceed the CIR. The
Ti between the bursts would equal CBS * 8 / CIR = 5.12 milliseconds
in this example.
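The arithmetic in this example can be captured in a short helper. The
sketch below is illustrative only; it assumes the 64 KB CBS is taken
as 64,000 bytes (to match the example's arithmetic) and expresses the
link rate and CIR in bits per second.

   # Sketch: compute the burst transmission time and the inter-burst
   # interval (Ti) used to keep the aggregate offered load at the CIR.
   # Values are the illustrative ones from the example above.

   CBS_BYTES = 64 * 1000          # 64 KB committed burst size
   LINK_RATE_BPS = 1_000_000_000  # 1 GigE ingress link rate
   CIR_BPS = 100_000_000          # 100 Mbps committed information rate

   burst_bits = CBS_BYTES * 8
   burst_time_s = burst_bits / LINK_RATE_BPS  # time to emit one burst at line rate
   ti_s = burst_bits / CIR_BPS                # Ti = CBS * 8 / CIR

   print(f"Burst time: {burst_time_s * 1000:.3f} ms")  # 0.512 ms
   print(f"Ti (burst spacing): {ti_s * 1000:.2f} ms")  # 5.12 ms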
Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

Procedure:
1. Configure the DUT policing parameters for the desired CIR/EIR and
   CBS/EBS values to be tested

2. Configure the tester to generate a stateless traffic burst equal
   to CBS and an interval equal to Ti (CBS in bits / CIR)

3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into
   the policer ingress port and measure the metrics defined in
   section 4.1 (BSA, LP, OOS, PD, and PDV) at the egress port and
   across the entire Td (default 60 seconds duration)

4. Excess Traffic Test: Generate bursts greater than the CBS + EBS
   limit into the policer ingress port and verify that the policer
   only allowed the BSA bytes to exit the egress. The excess burst
   MUST be recorded and the recommended value is 1000 bytes.
   Additional tests beyond the simple color-blind example might
   include: color-aware mode, configurations where EIR is greater
   than CIR, etc.

Reporting Format:
The policer individual report MUST contain all results for each
CIR/EIR/CBS/EBS test run and a recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: CIR, EIR, CBS, EBS

The results table should contain entries for each test run, (Test #1
to Test #Tr).

Compliant Traffic Test: BSA, LP, OOS, PD, and PDV

Excess Traffic Test: BSA
********************************************************

6.1.2 Policer Capacity Tests

Objective:
The intent of the capacity tests is to verify the policer performance
in a scaled environment with multiple ingress customer policers on
multiple physical ports. This test will benchmark the maximum number
of active policers as specified by the device manufacturer.

Test Summary:
The specified policing function capacity is generally expressed in
terms of the number of policers active on each individual physical
port as well as the number of unique policer rates that are utilized.
For all of the capacity tests, the benchmarking test procedure and
report format described in Section 6.1.1 for a single policer MUST be
applied to each of the physical port policers.

As an example, a Layer 2 switching device may specify that each of
the 32 physical ports can be policed using a pool of policing service
policies. The device may carry a single customer's traffic on each
physical port and a single policer is instantiated per physical port.
Another possibility is that a single physical port may carry multiple
customers, in which case many customer flows would be policed
concurrently on an individual physical port (separate policers per
customer on an individual port).

Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

The following sections provide the specific test scenarios,
procedures, and reporting formats for each policer capacity test.

6.1.2.1 Maximum Policers on Single Physical Port Test

Test Summary:
The first policer capacity test will benchmark a single physical
port, maximum policers on that physical port.

Assume multiple categories of ingress policers at rates r1, r2,...,
rn. There are multiple customers on a single physical port. Each
customer could be represented by a single-tagged VLAN, double-tagged
VLAN, VPLS instance, etc. Each customer is mapped to a different
policer. Each of the policers can be of rates r1, r2,..., rn.

An example configuration would be:
- Y1 customers, policer rate r1
- Y2 customers, policer rate r2
- Y3 customers, policer rate r3
...
- Yn customers, policer rate rn

Some bandwidth on the physical port is dedicated for other traffic
(non-customer traffic); this includes network control protocol
traffic. There is a separate policer for the other traffic.
Typical deployments have 3 categories of policers; there may be some
deployments with more or fewer than 3 categories of ingress policers.

Test Procedure:
1. Configure the DUT policing parameters for the desired CIR/EIR and
   CBS/EBS values for each policer rate (r1-rn) to be tested

2. Configure the tester to generate a stateless traffic burst equal
   to CBS and an interval equal to Ti (CBS in bits / CIR) for each
   customer stream (Y1 - Yn). The encapsulation for each customer
   must also be configured according to the service tested (VLAN,
   VPLS, IP mapping, etc.).

3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into
   the policer ingress port for each customer traffic stream and
   measure the metrics defined in section 4.1 (BSA, LP, OOS, PD, and
   PDV) at the egress port for each stream and across the entire Td
   (default 30 seconds duration)

4. Excess Traffic Test: Generate bursts greater than the CBS + EBS
   limit into the policer ingress port for each customer traffic
   stream and verify that the policer only allowed the BSA bytes to
   exit the egress for each stream. The excess burst MUST be recorded
   and the recommended value is 1000 bytes.

Reporting Format:
The policer individual report MUST contain all results for each
CIR/EIR/CBS/EBS test run, per customer traffic stream.

A recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

Customer traffic stream Encapsulation: Map each stream to VLAN,
VPLS, IP address

DUT Configuration Summary per Customer Traffic Stream: CIR, EIR,
CBS, EBS

The results table should contain entries for each test run, (Test #1
to Test #Tr).

Customer Stream Y1-Yn (see note), Compliant Traffic Test: BSA, LP,
OOS, PD, and PDV

Customer Stream Y1-Yn (see note), Excess Traffic Test: BSA
********************************************************

Note: For each test run, there will be two (2) rows for each customer
stream: the compliant traffic result and the excess traffic result.

6.1.2.2 Single Policer on All Physical Ports

Test Summary:
The second policer capacity test involves a single policer function
per physical port with all physical ports active. In this test, there
is a single policer per physical port. The policer can have one of
the rates r1, r2,..., rn. All the physical ports in the networking
device are active.

Procedure:
The procedure is identical to 6.1.1; the configured parameters must
be reported per port, and the test report must include results per
measured egress port.

6.1.2.3 Maximum Policers on All Physical Ports

Finally, the third policer capacity test involves a combination of
the first and second capacity tests, namely maximum policers active
per physical port and all physical ports active.

Procedure:
Use the procedural method from 6.1.2.1; the configured parameters
must be reported per port, and the test report must include
per-stream results per measured egress port.

6.2. Queue and Scheduler Tests

Queues and traffic scheduling are closely related in that a queue's
priority dictates the manner in which the traffic scheduler transmits
packets out of the egress port.
Since device queues / buffers are generally an egress function, this
test framework will discuss testing at the egress (although the
technique can be applied to ingress side queues).

Similar to the policing tests, the tests are divided into two
sections: individual queue/scheduler function tests and then full
capacity tests.

6.2.1 Queue/Scheduler Individual Tests Overview

The various types of scheduling techniques include FIFO, Strict
Priority (SP), and Weighted Fair Queueing (WFQ), along with other
variations. This test framework recommends testing a minimum of these
three techniques, although it is at the discretion of the tester to
benchmark other device scheduling algorithms.

6.2.1.1 Queue/Scheduler with Stateless Traffic Test

Objective:
Verify that the configured queue and scheduling technique can handle
stateless traffic bursts up to the queue depth.

Test Summary:
A network device queue is memory based, unlike a policing function,
which is token or credit based. However, the same concepts from
section 6.1 can be applied to testing network device queues.

The device's network queue should be configured to the desired size
in KB (queue length, QL) and then stateless traffic should be
transmitted to test this QL.

A queue should be able to handle repetitive bursts with the
transmission gaps proportional to the bottleneck bandwidth. This gap
is referred to as the transmission interval (Ti). Ti can be defined
for the traffic bursts and is based on the QL and the Bottleneck
Bandwidth (BB) of the egress interface.

   Ti = QL * 8 / BB

Note that this equation is similar to the Ti required for
transmission into a policer (QL = CBS, BB = CIR). Also note that the
burst hunt algorithm defined in section 5.1.1 can also be used to
automate the measurement of the queue value.

The stateless traffic burst shall be transmitted at the link speed
and spaced within the Ti time interval. The metrics defined in
section 4.1 shall be measured at the egress port and recorded; the
primary result is to verify the BSA and that no packets are dropped.

The scheduling function must also be characterized to benchmark the
device's ability to schedule the queues according to the priority. An
example would be 2 levels of priority including SP and FIFO queueing.
Under a flow load greater than the egress port speed, the higher
priority packets should be transmitted without drops (and also
maintain low latency), while packets in the lower priority (or best
effort) queue may be dropped.

Test Metrics:
The metrics defined in section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL
be measured at the egress port and recorded.

Procedure:
1. Configure the DUT queue length (QL) and scheduling technique
   (FIFO, SP, etc.) parameters

2. Configure the tester to generate a stateless traffic burst equal
   to QL and an interval equal to Ti (QL in bits / BB)
3. Generate bursts of QL traffic into the DUT and measure the metrics
   defined in section 4.1 (LP, OOS, PD, and PDV) at the egress port
   and across the entire Td (default 30 seconds duration)

Report Format:
The Queue/Scheduler Stateless Traffic individual report MUST contain
all results for each QL/BB test run and a recommended format is as
follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique, BB and QL

The results table should contain entries for each test run as
follows, (Test #1 to Test #Tr).

- LP, OOS, PD, and PDV
********************************************************

6.2.1.2 Testing Queue/Scheduler with Stateful Traffic

Objective:
Verify that the configured queue and scheduling technique can handle
stateful traffic bursts up to the queue depth.

Test Background and Summary:
To provide a more realistic benchmark and to test queues in layer 4
devices such as firewalls, stateful traffic testing is recommended
for the queue tests. Stateful traffic tests will also utilize the
Network Delay Emulator (NDE) from the network set-up configuration in
section 1.2.

The BDP of the TCP test traffic must be calibrated to the QL of the
device queue. Referencing [RFC6349], the BDP is equal to:

   BB * RTT / 8 (in bytes)

The NDE must be configured to an RTT value which is large enough to
allow the BDP to be greater than QL. An example test scenario is
defined below:

   - Ingress link = GigE
   - Egress link = 100 Mbps (BB)
   - QL = 32KB

   RTT(min) = QL * 8 / BB and would equal 2.56 ms (and the BDP = 32KB)

In this example, one (1) TCP connection with window size / SSB of
32KB would be required to test the QL of 32KB. This Bulk Transfer
Test can be accomplished using iperf as described in Appendix A.

Two types of TCP tests MUST be performed: Bulk Transfer test and
Micro Burst Test Pattern as documented in Appendix B. The Bulk
Transfer Test only bursts during the TCP Slow Start (or Congestion
Avoidance) state, while the Micro Burst test emulates application
layer bursting which may occur any time during the TCP connection.

Other test types SHOULD include: Simple Web Site, Complex Web Site,
Business Applications, Email, SMB/CIFS File Copy (which are also
documented in Appendix B).

Test Metrics:
The test results will be recorded per the stateful metrics defined in
section 4.2, primarily the TCP Test Pattern Execution Time (TTPET),
TCP Efficiency, and Buffer Delay.

Procedure:

1. Configure the DUT queue length (QL) and scheduling technique
   (FIFO, SP, etc.) parameters

2. Configure the test generator* with a profile of an emulated
   application traffic mixture

   - The application mixture MUST be defined in terms of percentage
     of the total bandwidth to be tested

   - The rate of transmission for each application within the mixture
     MUST also be configurable

   * The test generator MUST be capable of generating precise TCP
     test patterns for each application specified, to ensure
     repeatable results.
3. Generate application traffic between the ingress (client side) and
   egress (server side) ports of the DUT and measure the application
   throughput metrics (TTPET, TCP Efficiency, and Buffer Delay) per
   application stream and at the ingress and egress ports (across the
   entire Td, default 60 seconds duration).

Concerning application measurements, a couple of items require
clarification. An application session may consist of a single TCP
connection or multiple TCP connections.

For the single TCP connection application sessions, the application
throughput / metrics have a 1-1 relationship to the TCP connection
measurements.

If an application session (i.e. HTTP-based application) utilizes
multiple TCP connections, then all of the TCP connections are
aggregated in the application throughput measurement / metrics for
that application.

Then there is the case of multiple instances of an application
session (i.e. multiple FTPs emulating multiple clients). In this
situation, the test should measure / record each FTP application
session independently, tabulating the minimum, maximum, and average
for all FTP sessions.

Finally, application throughput measurements are based on Layer 4 TCP
throughput and do not include bytes retransmitted. The TCP Efficiency
metric MUST be measured during the test and provides a measure of
"goodput" during each test.

Reporting Format:
The Queue/Scheduler Stateful Traffic individual report MUST contain
all results for each traffic scheduler and QL/BB test run and a
recommended format is as follows:

********************************************************
Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique, BB and QL

Application Mixture and Intensities: this is the percent configured
of each application type

The results table should contain entries for each test run with
minimum, maximum, and average per application session as follows,
(Test #1 to Test #Tr)

- Per Application Throughput (bps) and TTPET
- Per Application Bytes In and Bytes Out
- Per Application TCP Efficiency and Buffer Delay
********************************************************
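As an illustration of how the per-application entries in this table
might be computed from raw per-session measurements, the following is
a minimal sketch. The session values are hypothetical, and the
formulas are simply the TCP Efficiency and Buffer Delay definitions
from Section 4.2.

   # Sketch: tabulate stateful metrics for one application across its
   # sessions, using the Section 4.2 formulas. All numbers are
   # hypothetical illustrations, not reference results.

   # Per-session raw measurements: (TTPET seconds, transmitted bytes,
   # retransmitted bytes, average RTT seconds)
   sessions = [
       (6.5, 1_000_000, 2_000, 0.012),
       (5.0, 1_000_000, 0, 0.010),
       (7.9, 1_000_000, 5_000, 0.014),
   ]
   BASELINE_RTT_S = 0.010  # non-congested, inherent DUT RTT (example)

   ttpets = [s[0] for s in sessions]
   tx = sum(s[1] for s in sessions)
   retx = sum(s[2] for s in sessions)
   avg_rtt = sum(s[3] for s in sessions) / len(sessions)

   tcp_efficiency_pct = (tx - retx) / tx * 100
   buffer_delay_pct = (avg_rtt - BASELINE_RTT_S) / BASELINE_RTT_S * 100

   print(f"TTPET min/avg/max: {min(ttpets):.1f}/"
         f"{sum(ttpets) / len(ttpets):.1f}/{max(ttpets):.1f} s")
   print(f"TCP Efficiency: {tcp_efficiency_pct:.2f} %")
   print(f"Buffer Delay: {buffer_delay_pct:.1f} %")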
1223 There are many types of priority schemes and combinations of 1224 priorities that are managed by the scheduler. The following 1225 sections specify the priority schemes that should be tested.
1227 6.2.2.1.1 Strict Priority on Egress Port
1229 Test Summary:
1230 For this test, Strict Priority (SP) scheduling on the egress 1231 physical port should be tested, and the benchmarking methodology 1232 specified in sections 6.2.1.1 and 6.2.1.2 (procedure, metrics, 1233 and reporting format) should be applied here. For a given 1234 priority, each ingress physical port should get a fair share of 1235 the egress physical port bandwidth.
1237 Since this is a capacity test, the configuration and report 1238 results format from 6.2.1.1 and 6.2.1.2 MUST also include:
1240 Configuration:
1241 - The number of physical ingress ports active during the test
1242 - The classification marking (DSCP, VLAN, etc.) for each physical 1243 ingress port
1244 - The traffic rate for stateless traffic and the traffic rate 1245 / mixture for stateful traffic for each physical ingress port
1247 Report results:
1248 - For each ingress port traffic stream, the achieved throughput 1249 rate and metrics at the egress port
1251 6.2.2.1.2 Strict Priority + Weighted Fair Queue (WFQ) on Egress Port
1253 Test Summary:
1254 For this test, Strict Priority (SP) and Weighted Fair Queue (WFQ) 1255 should be enabled simultaneously in the scheduler but on a single 1256 egress port. The benchmarking methodology specified in sections 1258 6.2.1.1 and 6.2.1.2 (procedure, metrics, and reporting format) 1259 should be applied here. Additionally, the egress port bandwidth 1260 sharing among weighted queues should be proportional to the assigned 1261 weights. For a given priority, each ingress physical port should get 1262 a fair share of the egress physical port bandwidth.
1264 Since this is a capacity test, the configuration and report results 1265 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1267 Configuration:
1268 - The number of physical ingress ports active during the test
1269 - The classification marking (DSCP, VLAN, etc.) for each physical 1270 ingress port
1271 - The traffic rate for stateless traffic and the traffic rate / 1272 mixture for stateful traffic for each physical ingress port
1274 Report results:
1275 - For each ingress port traffic stream, the achieved throughput rate 1276 and metrics at each queue of the egress port (both the SP 1277 and WFQ queues).
1279 Example:
1280 - Egress Port SP Queue: throughput and metrics for ingress streams 1281 1-n
1282 - Egress Port WFQ Queue: throughput and metrics for ingress streams 1283 1-n
1285 6.2.2.2 Single Queue per Port / All Ports Active
1287 Test Summary:
1288 Traffic from multiple ingress physical ports is directed to the 1289 same egress physical port, which will cause oversubscription on the 1290 egress physical port. Also, the same amount of traffic is directed 1291 to each egress physical port.
1293 The benchmarking methodology specified in sections 6.2.1.1 1294 and 6.2.1.2 (procedure, metrics, and reporting format) should be 1295 applied here. Each ingress physical port should get a fair share of 1296 the egress physical port bandwidth. Additionally, each egress 1297 physical port should receive the same amount of traffic.
1299 Since this is a capacity test, the configuration and report results 1300 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1302 Configuration:
1303 - The number of ingress ports active during the test
1304 - The number of egress ports active during the test
1305 - The classification marking (DSCP, VLAN, etc.) for each physical 1306 ingress port
1307 - The traffic rate for stateless traffic and the traffic rate / 1308 mixture for stateful traffic for each physical ingress port
1310 Report results:
1311 - For each egress port, the achieved throughput rate and metrics at 1312 the egress port queue for each ingress port stream.
1314 Example:
1315 - Egress Port 1: throughput and metrics for ingress streams 1-n
1316 - Egress Port n: throughput and metrics for ingress streams 1-n
1318 6.2.2.3 Multiple Queues per Port, All Ports Active
1320 Traffic from multiple ingress physical ports is directed to all 1321 queues of each egress physical port, which will cause 1322 oversubscription on the egress physical ports. Also, the same 1323 amount of traffic is directed to each egress physical port.
1325 The benchmarking methodology specified in sections 6.2.1.1 1326 and 6.2.1.2 (procedure, metrics, and reporting format) should be 1327 applied here. For a given priority, each ingress physical port 1328 should get a fair share of the egress physical port bandwidth. 1329 Additionally, each egress physical port should receive the same 1330 amount of traffic.
1332 Since this is a capacity test, the configuration and report results 1333 format from 6.2.1.1 and 6.2.1.2 MUST also include:
1335 Configuration:
1336 - The number of physical ingress ports active during the test
1337 - The classification marking (DSCP, VLAN, etc.) for each physical 1338 ingress port
1339 - The traffic rate for stateless traffic and the traffic rate / 1340 mixture for stateful traffic for each physical ingress port
1342 Report results:
1343 - For each egress port, the achieved throughput rate and metrics at 1344 each egress port queue for each ingress port stream.
1346 Example:
1347 - Egress Port 1, SP Queue: throughput and metrics for ingress 1348 streams 1-n
1349 - Egress Port 2, WFQ Queue: throughput and metrics for ingress 1350 streams 1-n
1351 .
1352 .
1353 - Egress Port n, SP Queue: throughput and metrics for ingress 1354 streams 1-n
1355 - Egress Port n, WFQ Queue: throughput and metrics for ingress 1356 streams 1-n
1358 6.3. Shaper tests
1360 A traffic shaper is memory based like a queue, but with the added 1361 intelligence of an active traffic scheduler. The same concepts from 1362 section 6.2 (Queue testing) can be applied to testing network device 1363 shapers.
1365 Again, the tests are divided into two sections: individual shaper 1366 benchmark tests and then full capacity shaper benchmark tests.
1368 6.3.1 Shaper Individual Tests Overview
1370 A traffic shaper generally has three (3) components that can be 1371 configured:
1373 - Ingress Queue bytes
1374 - Shaper Rate, bps
1375 - Burst Committed (Bc) and Burst Excess (Be), bytes
1377 The Ingress Queue holds burst traffic and the shaper then meters 1378 traffic out of the egress port according to the Shaper Rate and 1379 Bc/Be parameters. Shapers generally transmit into policers, so 1380 the idea is for the emitted traffic to conform to the policer's 1381 limits.
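The interaction among these components can be illustrated with a short, informative calculation. The following Python sketch is not part of the methodology; the parameter values mirror the example used in section 6.3.1.1 below (an Ingress Queue of 512,000 bytes and an SR of 50 Mbps), and the function and variable names are assumptions made purely for illustration.

   # Informative sketch only: relate the shaper Ingress Queue and
   # Shaper Rate (SR) to the time needed to drain a queued burst and
   # to the minimum interval between equal-sized ingress bursts.
   INGRESS_QUEUE_BYTES = 512_000    # assumed Ingress Queue depth
   SHAPER_RATE_BPS = 50_000_000     # assumed Shaper Rate (SR)

   def drain_time_s(queued_bytes, sr_bps):
       # Time for the shaper to meter a queued burst onto the wire.
       return queued_bytes * 8 / sr_bps

   def min_burst_interval_s(burst_bytes, sr_bps):
       # Smallest interval between bursts of burst_bytes so that the
       # long-term offered rate does not exceed the Shaper Rate.
       return burst_bytes * 8 / sr_bps

   burst = INGRESS_QUEUE_BYTES      # a single burst filling the queue
   print("Drain time: %.4f s" % drain_time_s(burst, SHAPER_RATE_BPS))
   print("Minimum inter-burst interval: %.4f s"
         % min_burst_interval_s(burst, SHAPER_RATE_BPS))

With these assumed values, both results are approximately 0.082 seconds (82 ms), which matches the multiple-burst interval derived in section 6.3.1.1.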
1383 6.3.1.1 Testing Shaper with Stateless Traffic
1385 Objective:
1386 Test a shaper by transmitting stateless traffic bursts into the 1387 shaper ingress port and verifying that the egress traffic is shaped 1388 according to the shaper traffic profile.
1390 Test Summary:
1391 The stateless traffic must be burst into the DUT ingress port and 1392 not exceed the Ingress Queue. The burst can be a single burst or 1393 multiple bursts. If multiple bursts are transmitted, then the 1394 Ti (Time interval) must be large enough so that the Shaper Rate is 1395 not exceeded. An example will clarify single and multiple burst 1396 test cases.
1398 In the example, the shaper's ingress and egress ports are both full 1399 duplex Gigabit Ethernet. The Ingress Queue is configured to be 1400 512,000 bytes, the Shaper Rate (SR) = 50 Mbps, and both Bc and Be are 1401 configured to be 32,000 bytes. For a single burst test, the 1402 transmitting test device would burst 512,000 bytes maximum into the 1403 ingress port and then stop transmitting.
1405 If a multiple burst test is to be conducted, then the burst bytes 1406 divided by the time interval between the 512,000 byte bursts must 1407 not exceed the Shaper Rate. The time interval (Ti) must adhere to 1408 a similar formula as described in section 6.2.1.1 for queues, namely:
1410 Ti = Ingress Queue x 8 / Shaper Rate
1412 For the example from the previous paragraph, Ti between bursts must 1413 be greater than 82 milliseconds (512,000 bytes x 8 / 50,000,000 bps). 1414 This yields an average rate of 50 Mbps so that the Ingress Queue 1415 would not overflow.
1417 Test Metrics:
1418 - The metrics defined in section 4.1 (LP, OOS, PDV, SR, SBB, SBI) 1419 SHALL be measured at the egress port and recorded.
1421 Procedure:
1422 1. Configure the DUT shaper ingress queue length (QL) and shaper 1423 egress rate (SR, Bc, Be) parameters
1425 2. Configure the tester to generate a stateless traffic burst equal 1426 to QL and an interval equal to Ti (QL in bits / SR)
1428 3. Generate bursts of QL traffic into the DUT and measure the metrics 1429 defined in section 4.1 (LP, OOS, PDV, SR, SBB, SBI) at the egress 1430 port and across the entire Td (default 30 seconds duration)
1432 Report Format:
1433 The Shaper Stateless Traffic individual report MUST contain all 1434 results for each QL/SR test run and a recommended format is as 1435 follows:
1436 ********************************************************
1437 Test Configuration Summary: Tr, Td
1439 DUT Configuration Summary: Ingress Burst Rate, QL, SR
1441 The results table should contain entries for each test run as 1442 follows, (Test #1 to Test #Tr).
1444 - LP, OOS, PDV, SR, SBB, SBI
1445 ********************************************************
1447 6.3.1.2 Testing Shaper with Stateful Traffic
1449 Objective:
1450 Test a shaper by transmitting stateful traffic bursts into the shaper 1451 ingress port and verifying that the egress traffic is shaped 1452 according to the shaper traffic profile.
1454 Test Summary:
1455 To provide a more realistic benchmark and to test queues in layer 4 1456 devices such as firewalls, stateful traffic testing is also 1457 recommended for the shaper tests. Stateful traffic tests will also 1458 utilize the Network Delay Emulator (NDE) from the network set-up 1459 configuration in section 2.
1461 The BDP of the TCP test traffic must be calculated as described in 1461 section 6.2.1.2.
To properly stress network buffers and the traffic 1463 shaping function, the cumulative TCP window should exceed the BDP, 1464 which will stress the shaper. BDP factors of 1.1 to 1.5 are 1465 recommended, but the values are at the discretion of the tester and 1466 should be documented.
1468 The cumulative TCP Window Sizes* (RWND at the receiving end & CWND 1469 at the transmitting end) equate to:
1471 TCP window size* for each connection x number of connections
1473 * as described in section 3 of [RFC6349], the SSB MUST be large 1474 enough to fill the BDP
1476 For example, if the BDP is equal to 256 Kbytes and a window size of 1477 64 Kbytes is used for each connection, then it would require four (4) 1478 connections to fill the BDP and 5-6 connections (to oversubscribe the 1479 BDP) to stress test the traffic shaping function.
1481 Two types of TCP tests MUST be performed: the Bulk Transfer test and the 1482 Micro Burst Test Pattern, as documented in Appendix B. The Bulk 1483 Transfer Test only bursts during the TCP Slow Start (or Congestion 1484 Avoidance) state, while the Micro Burst test emulates application 1485 layer bursting which may occur at any time during the TCP connection.
1487 Other test types SHOULD include: Simple Web Site, Complex Web Site, 1488 Business Applications, Email, and SMB/CIFS File Copy (which are also 1489 documented in Appendix B).
1491 Test Metrics:
1492 The test results will be recorded per the stateful metrics defined in 1493 section 4.2, primarily the TCP Test Pattern Execution Time (TTPET), 1494 TCP Efficiency, and Buffer Delay.
1496 Procedure:
1497 1. Configure the DUT shaper ingress queue length (QL) and shaper 1498 egress rate (SR, Bc, Be) parameters
1500 2. Configure the test generator* with a profile of an emulated 1501 application traffic mixture
1503 - The application mixture MUST be defined in terms of percentage 1504 of the total bandwidth to be tested
1506 - The rate of transmission for each application within the mixture 1507 MUST also be configurable
1509 * The test generator MUST be capable of generating precise TCP 1510 test patterns for each application specified, to ensure repeatable 1511 results.
1513 3. Generate application traffic between the ingress (client side) and 1514 egress (server side) ports of the DUT and measure the metrics 1515 (TTPET, TCP Efficiency, and Buffer Delay) per application stream 1516 and at the ingress and egress port (across the entire Td, default 1517 30 seconds duration).
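The connection sizing described in the Test Summary above can be illustrated with a short, informative calculation. The following Python sketch is not part of the methodology; the bottleneck bandwidth, RTT, per-connection window, and the 1.25 oversubscription factor are assumed values chosen only to reproduce the 256 KB BDP example.

   # Informative sketch only: size the number of TCP connections so
   # that the cumulative TCP window fills, and then oversubscribes,
   # the BDP.  All values below are illustrative assumptions.
   import math

   def connections_needed(bb_bps, rtt_s, window_bytes, bdp_factor=1.0):
       # Connections whose cumulative window reaches bdp_factor x BDP.
       bdp_bytes = bb_bps * rtt_s / 8
       return math.ceil(bdp_factor * bdp_bytes / window_bytes)

   BB = 100_000_000   # bottleneck bandwidth, bps
   RTT = 0.020        # 20 ms round-trip time configured on the NDE
   WIN = 64_000       # per-connection TCP window (SSB), bytes

   print("BDP (bytes):", BB * RTT / 8)
   print("Connections to fill the BDP:",
         connections_needed(BB, RTT, WIN))
   print("Connections to oversubscribe (1.25 x BDP):",
         connections_needed(BB, RTT, WIN, bdp_factor=1.25))

With these assumed values, the sketch reproduces the example above: the BDP is roughly 250 KB, four connections fill it, and five connections oversubscribe it.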
1519 Reporting Format:
1520 The Shaper Stateful Traffic individual report MUST contain all 1521 results for each traffic scheduler and QL/SR test run and a 1522 recommended format is as follows:
1524 ********************************************************
1525 Test Configuration Summary: Tr, Td
1527 DUT Configuration Summary: Ingress Burst Rate, QL, SR
1529 Application Mixture and Intensities: this is the percent configured 1530 of each application type
1531 The results table should contain entries for each test run with 1532 minimum, maximum, and average per application session as follows, 1533 (Test #1 to Test #Tr)
1535 - Per Application Throughput (bps) and TTPET
1536 - Per Application Bytes In and Bytes Out
1537 - Per Application TCP Efficiency and Buffer Delay
1538 ********************************************************
1540 6.3.2 Shaper Capacity Tests
1542 Objective:
1543 The intent of these scalability tests is to verify shaper performance 1544 in a scaled environment with shapers active on multiple queues on 1545 multiple egress physical ports. This test will benchmark the maximum 1546 number of shapers as specified by the device manufacturer.
1548 The following sections provide the specific test scenarios, 1549 procedures, and reporting formats for each shaper capacity test.
1551 6.3.2.1 Single Queue Shaped, All Physical Ports Active
1553 Test Summary:
1554 The first shaper capacity test involves per-port shaping with all 1555 physical ports active. Traffic from multiple ingress physical ports 1556 is directed to the same egress physical port, which will cause 1557 oversubscription on the egress physical port. Also, the same amount 1558 of traffic is directed to each egress physical port.
1560 The benchmarking methodology specified in Section 6.3.1 (procedure, 1561 metrics, and reporting format) should be applied here. Since this is 1562 a capacity test, the configuration and report results format from 1563 6.3.1 MUST also include:
1565 Configuration:
1566 - The number of physical ingress ports active during the test
1567 - The classification marking (DSCP, VLAN, etc.) for each physical 1568 ingress port
1569 - The traffic rate for stateless traffic and the traffic rate / 1570 mixture for stateful traffic for each physical ingress port
1571 - The shaper parameters (QL, SR, Bc, Be) of each shaped egress port
1573 Report results:
1574 - For each active egress port, the achieved throughput rate and 1575 shaper metrics for each ingress port traffic stream
1577 Example:
1578 - Egress Port 1: throughput and metrics for ingress streams 1-n
1579 - Egress Port n: throughput and metrics for ingress streams 1-n
1581 6.3.2.2 All Queues Shaped, Single Port Active
1583 Test Summary:
1584 The second shaper capacity test is conducted with all queues actively 1585 shaping on a single physical port. The benchmarking methodology 1586 described in the per-port shaping test (previous section) serves as the 1587 foundation for this test. Additionally, each of the SP queues on the 1588 egress physical port is configured with a shaper. For the highest 1589 priority queue, the maximum amount of bandwidth available is limited 1590 by the bandwidth of the shaper. For the lower priority queues, the 1591 maximum amount of bandwidth available is limited by the bandwidth of 1592 the shaper and traffic in higher priority queues.
1594 The benchmarking methodology specified in Section 6.3.1 (procedure, 1595 metrics, and reporting format) should be applied here.
Since this is 1596 a capacity test, the configuration and report results format from 1597 6.3.1 MUST also include:
1599 Configuration:
1600 - The number of physical ingress ports active during the test
1601 - The classification marking (DSCP, VLAN, etc.) for each physical 1602 ingress port
1603 - The traffic rate for stateless traffic and the traffic rate/mixture 1604 for stateful traffic for each physical ingress port
1605 - For the active egress port, each shaper queue's parameters (QL, SR, 1606 Bc, Be)
1608 Report results:
1609 - For each queue of the active egress port, the achieved throughput 1610 rate and shaper metrics for each ingress port traffic stream
1612 Example:
1613 - Egress Port High Priority Queue: throughput and metrics for 1614 ingress streams 1-n
1615 - Egress Port Lower Priority Queue: throughput and metrics for 1616 ingress streams 1-n
1618 6.3.2.3 All Queues Shaped, All Ports Active
1620 Test Summary:
1621 The third shaper capacity test (which is a combination of the 1622 tests in the previous two sections) is conducted with all queues actively 1623 shaping and all physical ports active.
1625 The benchmarking methodology specified in Section 6.3.1 (procedure, 1626 metrics, and reporting format) should be applied here. Since this is 1627 a capacity test, the configuration and report results format from 1628 6.3.1 MUST also include:
1630 Configuration:
1631 - The number of physical ingress ports active during the test
1632 - The classification marking (DSCP, VLAN, etc.) for each physical 1633 ingress port
1634 - The traffic rate for stateless traffic and the traffic rate / 1635 mixture for stateful traffic for each physical ingress port
1636 - For each of the active egress ports, the shaper port and per-queue 1637 parameters (QL, SR, Bc, Be)
1639 Report results:
1640 - For each queue of each active egress port, the achieved throughput 1641 rate and shaper metrics for each ingress port traffic stream
1643 Example:
1644 - Egress Port 1 High Priority Queue: throughput and metrics for 1645 ingress streams 1-n
1646 - Egress Port 1 Lower Priority Queue: throughput and metrics for 1647 ingress streams 1-n
1648 .
1649 - Egress Port n High Priority Queue: throughput and metrics for 1650 ingress streams 1-n
1651 - Egress Port n Lower Priority Queue: throughput and metrics for 1652 ingress streams 1-n
1654 6.4 Concurrent Capacity Load Tests
1656 As mentioned in the scope of this document, it is impossible to 1657 specify the various permutations of concurrent traffic management 1658 functions that should be tested in a device for capacity testing. 1659 However, some profiles are listed below which may be useful 1660 to test under capacity as well:
1662 - Policers on ingress and queuing on egress
1663 - Policers on ingress and shapers on egress (this is not intended for a 1664 flow to be policed and then shaped; rather, these would be two different 1665 flows tested at the same time)
1666 - etc.
1668 The test procedures and reporting formats from the previous 1669 sections may be modified to accommodate the capacity test profile.
1671 7. Security Considerations
1673 Documents of this type do not directly affect the security of the 1674 Internet or of corporate networks as long as benchmarking is not 1675 performed on devices or systems connected to production networks.
1677 Further, benchmarking is performed on a "black-box" basis, relying 1678 solely on measurements observable external to the DUT/SUT.
1680 Special capabilities SHOULD NOT exist in the DUT/SUT specifically for 1681 benchmarking purposes.
Any implications for network security arising 1682 from the DUT/SUT SHOULD be identical in the lab and in production 1683 networks.
1685 8. IANA Considerations
1687 This document does not require an IANA registration for ports 1688 dedicated to the TCP testing described in this document.
1690 9. References
1692 9.1. Normative References
1694 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1695 Requirement Levels", BCP 14, RFC 2119, March 1997.
1697 [RFC1242] S. Bradner, "Benchmarking Terminology for Network 1698 Interconnection Devices," RFC 1242, July 1991.
1700 [RFC2544] S. Bradner, "Benchmarking Methodology for Network 1701 Interconnect Devices," RFC 2544, March 1999.
1703 [RFC3148] M. Mathis et al., "A Framework for Defining Empirical 1704 Bulk Transfer Capacity Metrics," RFC 3148, July 2001.
1706 [RFC5481] A. Morton et al., "Packet Delay Variation Applicability 1707 Statement," RFC 5481, March 2009.
1709 [RFC6703] A. Morton et al., "Reporting IP Network Performance 1710 Metrics: Different Points of View," RFC 6703, August 2012.
1712 [RFC2680] G. Almes et al., "A One-way Packet Loss Metric for IPPM," 1713 RFC 2680, September 1999.
1715 [RFC4689] S. Poretsky et al., "Terminology for Benchmarking 1716 Network-layer Traffic Control Mechanisms," RFC 4689, 1717 October 2006.
1719 [RFC4737] A. Morton et al., "Packet Reordering Metrics," RFC 4737, 1720 November 2006.
1722 [RFC4115] O. Aboul-Magd et al., "A Differentiated Service Two-Rate, 1723 Three-Color Marker with Efficient Handling of in-Profile Traffic," 1724 RFC 4115, July 2005.
1726 [RFC6349] B. Constantine et al., "Framework for TCP Throughput 1727 Testing," RFC 6349, August 2011.
1729 9.2. Informative References
1731 [RFC2697] J. Heinanen et al., "A Single Rate Three Color Marker," 1732 RFC 2697, September 1999.
1734 [RFC2698] J. Heinanen et al., "A Two Rate Three Color Marker," 1735 RFC 2698, September 1999.
1737 [AQM-RECO] F. Baker et al., "IETF Recommendations Regarding 1738 Active Queue Management," August 2014, 1739 https://datatracker.ietf.org/doc/draft-ietf-aqm- 1740 recommendation/
1742 [MEF-10.2] "MEF 10.2: Ethernet Services Attributes Phase 2," October 1743 2009, http://metroethernetforum.org/PDF_Documents/ 1744 technical-specifications/MEF10.2.pdf
1746 [MEF-12.1] "MEF 12.1: Carrier Ethernet Network Architecture 1747 Framework -- 1748 Part 2: Ethernet Services Layer - Base Elements," April 1749 2010, https://www.metroethernetforum.org/Assets/Technical 1750 _Specifications/PDF/MEF12.1.pdf
1752 [MEF-26] "MEF 26: External Network Network Interface (ENNI) - 1753 Phase 1," January 2010, http://www.metroethernetforum.org 1754 /PDF_Documents/technical-specifications/MEF26.pdf
1756 [MEF-14] "Abstract Test Suite for Traffic Management Phase 1," 1757 https://www.metroethernetforum.org/Assets 1758 /Technical_Specifications/PDF/MEF_14.pdf
1760 [MEF-19] "Abstract Test Suite for UNI Type 1", 1761 https://www.metroethernetforum.org/Assets 1762 /Technical_Specifications/PDF/MEF_19.pdf
1764 [MEF-37] "Abstract Test Suite for ENNI", 1765 https://www.metroethernetforum.org/Assets 1766 /Technical_Specifications/PDF/MEF_37.pdf
1768 Appendix A: Open Source Tools for Traffic Management Testing
1770 This framework specifies that stateless and stateful behaviors SHOULD 1771 both be tested. Some open source tools that can be used to 1772 accomplish many of the tests proposed in this framework are: 1773 iperf, netperf (with netperf-wrapper), uperf, Tmix, 1774 TCP-incast-generator, and D-ITG (Distributed Internet Traffic 1775 Generator).
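As one illustration of how such tools might be scripted for repeatable runs, the following Python sketch wraps iperf (described in the paragraphs below) using the subprocess module. It is informative only: the server address, rates, window size, and durations are placeholders, and the option syntax shown is that of iperf version 2, which may differ in other versions or in other tools.

   # Informative sketch only: drive iperf for one stateless (UDP) run
   # and one bulk-transfer (TCP) run.  Hosts, rates, window size, and
   # durations are placeholder assumptions.
   import subprocess

   SERVER = "192.0.2.10"   # placeholder address of the iperf server

   def run(cmd):
       # Echo the command, run it, and return its standard output.
       print("+", " ".join(cmd))
       result = subprocess.run(cmd, capture_output=True, text=True,
                               check=True)
       return result.stdout

   # Stateless test traffic: UDP at a 50 Mbps target rate for 30 s.
   udp_report = run(["iperf", "-c", SERVER, "-u", "-b", "50M",
                     "-t", "30"])

   # Stateful bulk transfer: 4 parallel TCP connections, 64 KB window.
   tcp_report = run(["iperf", "-c", SERVER, "-w", "64K", "-P", "4",
                     "-t", "30"])

   print(udp_report)
   print(tcp_report)

The matching receiver would be started separately (for example, an iperf server listening in the corresponding UDP or TCP mode) before the client-side script is run.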
1777 Iperf can generate UDP or TCP based traffic; a client and server must 1778 both run the iperf software in the same traffic mode. The server is 1779 set up to listen and then the test traffic is controlled from the 1780 client. Both uni-directional and bi-directional concurrent testing 1781 are supported.
1783 The UDP mode can be used for the stateless traffic testing. The 1784 target bandwidth, packet size, UDP port, and test duration can be 1785 controlled. A report of bytes transmitted, packets lost, and delay 1786 variation is provided by the iperf receiver.
1788 Iperf (TCP mode), TCP-incast-generator, and D-ITG can be used for 1789 stateful traffic testing to test bulk transfer traffic. The TCP 1790 Window size (which is actually the SSB), the number of connections, 1791 the packet size, TCP port, and the test duration can be controlled. 1792 A report of bytes transmitted and throughput achieved is provided 1793 by the iperf sender, while TCP-incast-generator and D-ITG provide 1794 even more statistics.
1796 Netperf is a software application that provides network bandwidth 1797 testing between two hosts on a network. It supports Unix domain 1798 sockets, TCP, SCTP, DLPI and UDP via BSD Sockets. Netperf provides 1799 a number of predefined tests, e.g. to measure bulk (unidirectional) 1800 data transfer or request/response performance 1801 (http://en.wikipedia.org/wiki/Netperf). Netperf-wrapper is a Python 1802 script that runs multiple simultaneous netperf instances and 1803 aggregates the results.
1805 uperf uses a description (or model) of an application mixture and 1806 the tool generates the load according to the model descriptor. uperf 1807 is more flexible than Netperf in its ability to generate request 1808 / response application behavior within a single TCP connection. The 1809 application model descriptor can be based on empirical data, but 1810 currently the import of packet captures is not directly supported.
1812 Tmix is another application traffic emulation tool and uses packet 1813 captures directly to create the traffic profile. The packet trace is 1814 'reverse compiled' into a source-level characterization, called a 1815 connection vector, of each TCP connection present in the trace. While 1816 most widely used in the ns-2 simulation environment, Tmix also runs on 1817 Linux hosts.
1819 These open source tools' traffic generation capabilities facilitate 1820 the emulation of the TCP test patterns which are discussed in 1821 Appendix B.
1823 Appendix B: Stateful TCP Test Patterns
1825 This framework recommends at a minimum the following TCP test 1826 patterns since they are representative of real-world application 1827 traffic (section 5.2.1 describes some methods to derive other 1828 application-based TCP test patterns).
1830 - Bulk Transfer: generate concurrent TCP connections whose aggregate 1831 number of in-flight data bytes would fill the BDP. Guidelines 1832 from [RFC6349] are used to create this TCP traffic pattern.
1834 - Micro Burst: generate precise burst patterns within a single or 1835 multiple TCP connection(s). The idea is for TCP to establish 1836 equilibrium and then burst application bytes at defined sizes. The 1837 test tool must allow the burst size and burst time interval to be 1838 configurable.
1840 - Web Site Patterns: The HTTP traffic model from 1841 "3GPP2 C.R1002-0 v1.0" is referenced (Table 4.1.3.2.1) to develop 1842 these TCP test patterns.
In summary, the HTTP traffic model consists 1843 of the following parameters:
1844 - Main object size (Sm)
1845 - Embedded object size (Se)
1846 - Number of embedded objects per page (Nd)
1847 - Client processing time (Tcp)
1848 - Server processing time (Tsp)
1850 Web site test patterns are illustrated with the following examples:
1852 - Simple Web Site: mimic the request / response and object 1853 download behavior of a basic web site (small company).
1854 - Complex Web Site: mimic the request / response and object 1855 download behavior of a complex web site (ecommerce site).
1857 Referencing the HTTP traffic model parameters, the following table 1858 was derived (by analysis and experimentation) for Simple and Complex 1859 Web site TCP test patterns:

1861                            Simple        Complex
1862  Parameter                 Web Site      Web Site
1863  -----------------------------------------------------
1864  Main object               Ave. = 10KB   Ave. = 300KB
1865  size (Sm)                 Min. = 100B   Min. = 50KB
1866                            Max. = 500KB  Max. = 2MB

1868  Embedded object           Ave. = 7KB    Ave. = 10KB
1869  size (Se)                 Min. = 50B    Min. = 100B
1870                            Max. = 350KB  Max. = 1MB

1872  Number of embedded        Ave. = 5      Ave. = 25
1873  objects per page (Nd)     Min. = 2      Min. = 10
1874                            Max. = 10     Max. = 50

1876  Client processing         Ave. = 3s     Ave. = 10s
1877  time (Tcp)*               Min. = 1s     Min. = 3s
1878                            Max. = 10s    Max. = 30s

1880  Server processing         Ave. = 5s     Ave. = 8s
1881  time (Tsp)*               Min. = 1s     Min. = 2s
1882                            Max. = 15s    Max. = 30s

1884 * The client and server processing time is distributed across the 1885 transmission / receipt of all of the main and embedded objects
1887 To be clear, the parameters in this table are reasonable guidelines 1888 for the TCP test pattern traffic generation. The test tool can use 1889 fixed parameters for simpler tests and mathematical distributions for 1890 more complex tests. However, the test pattern must be repeatable to 1891 ensure that the benchmark results can be reliably compared.
1893 - Inter-active Patterns: While Web site patterns are inter-active 1894 to a degree, they mainly emulate the downloading of various 1895 complexity web sites. Inter-active patterns are more chatty in 1896 nature since there is a lot of user interaction with the servers. 1897 Examples include business applications such as PeopleSoft and Oracle, 1898 and consumer applications such as Facebook, IM, etc. For the inter- 1899 active patterns, the packet capture technique was used to 1900 characterize some business applications and also the email 1901 application.
1903 In summary, an inter-active application can be described by the 1904 following parameters:
1905 - Client message size (Scm)
1906 - Number of Client messages (Nc)
1907 - Server response size (Srs)
1908 - Number of server messages (Ns)
1909 - Client processing time (Tcp)
1910 - Server processing Time (Tsp)
1911 - File size upload (Su)*
1912 - File size download (Sd)*
1914 * The file size parameters account for attachments uploaded or 1915 downloaded and may not be present in all inter-active applications
1917 Again using packet capture as a means to characterize, the following 1918 table reflects the guidelines for Simple Business Application, 1919 Complex Business Application, eCommerce, and Email Send / Receive:

1921                      Simple        Complex
1922  Parameter           Biz. App.     Biz. App      eCommerce*    Email
1923  --------------------------------------------------------------------
1924  Client message      Ave. = 450B   Ave. = 2KB    Ave. = 1KB    Ave. = 200B
1925  size (Scm)          Min. = 100B   Min. = 500B   Min. = 100B   Min. = 100B
1926                      Max. = 1.5KB  Max. = 100KB  Max. = 50KB   Max. = 1KB
1928  Number of client    Ave. = 10     Ave. = 100    Ave. = 20     Ave. = 10
1929  messages (Nc)       Min. = 5      Min. = 50     Min. = 10     Min. = 5
1930                      Max. = 25     Max. = 250    Max. = 100    Max. = 25

1932  Client processing   Ave. = 10s    Ave. = 30s    Ave. = 15s    Ave. = 5s
1933  time (Tcp)**        Min. = 3s     Min. = 3s     Min. = 5s     Min. = 3s
1934                      Max. = 30s    Max. = 60s    Max. = 120s   Max. = 45s

1936  Server response     Ave. = 2KB    Ave. = 5KB    Ave. = 8KB    Ave. = 200B
1937  size (Srs)          Min. = 500B   Min. = 1KB    Min. = 100B   Min. = 150B
1938                      Max. = 100KB  Max. = 1MB    Max. = 50KB   Max. = 750B

1940  Number of server    Ave. = 50     Ave. = 200    Ave. = 100    Ave. = 15
1941  messages (Ns)       Min. = 10     Min. = 25     Min. = 15     Min. = 5
1942                      Max. = 200    Max. = 1000   Max. = 500    Max. = 40

1944  Server processing   Ave. = 0.5s   Ave. = 1s     Ave. = 2s     Ave. = 4s
1945  time (Tsp)**        Min. = 0.1s   Min. = 0.5s   Min. = 1s     Min. = 0.5s
1946                      Max. = 5s     Max. = 20s    Max. = 10s    Max. = 15s

1948 Simple Business Application, Complex Business Application, eCommerce, and Email Send / Receive 1949 (continued):

1951                      Simple        Complex
1952  Parameter           Biz. App.     Biz. App      eCommerce*    Email
1953  --------------------------------------------------------------------
1954  File size           Ave. = 50KB   Ave. = 100KB  Ave. = N/A    Ave. = 100KB
1955  upload (Su)         Min. = 2KB    Min. = 10KB   Min. = N/A    Min. = 20KB
1956                      Max. = 200KB  Max. = 2MB    Max. = N/A    Max. = 10MB

1958  File size           Ave. = 50KB   Ave. = 100KB  Ave. = N/A    Ave. = 100KB
1959  download (Sd)       Min. = 2KB    Min. = 10KB   Min. = N/A    Min. = 20KB
1960                      Max. = 200KB  Max. = 2MB    Max. = N/A    Max. = 10MB

1962 * eCommerce used a combination of packet capture techniques and 1963 reference traffic flows from "SPECweb2009" (need proper reference)
1964 ** The client and server processing time is distributed across the 1965 transmission / receipt of all of the messages. Client processing time 1966 consists mainly of the delay between user interactions (not machine 1967 processing).
1969 And again, the parameters in this table are the guidelines for the 1970 TCP test pattern traffic generation. The test tool can use fixed 1971 parameters for simpler tests and mathematical distributions for more 1972 complex tests. However, the test pattern must be repeatable to 1973 ensure that the benchmark results can be reliably compared.
1975 - SMB/CIFS File Copy: mimic a network file copy, both read and write. 1976 As opposed to FTP, which is a bulk transfer and is only flow 1977 controlled via TCP, SMB/CIFS divides a file into application blocks 1978 and utilizes application-level handshaking in addition to 1979 TCP flow control.
1981 In summary, an SMB/CIFS file copy can be described by the following 1982 parameters:
1983 - Client message size (Scm)
1984 - Number of client messages (Nc)
1985 - Server response size (Srs)
1986 - Number of Server messages (Ns)
1987 - Client processing time (Tcp)
1988 - Server processing time (Tsp)
1989 - Block size (Sb)
1991 The client and server messages are SMB control messages. The Block 1992 size is the data portion of the file transfer.
1994 Again using packet capture as a means to characterize, the following 1995 table reflects the guidelines for SMB/CIFS file copy:

1997                      SMB
1998  Parameter           File Copy
1999  ------------------------------
2000  Client message      Ave. = 450B
2001  size (Scm)          Min. = 100B
2002                      Max. = 1.5KB
2003  Number of client    Ave. = 10
2004  messages (Nc)       Min. = 5
2005                      Max. = 25
2006  Client processing   Ave. = 1ms
2007  time (Tcp)          Min. = 0.5ms
2008                      Max. = 2ms
2009  Server response     Ave. = 2KB
2010  size (Srs)          Min. = 500B
2011                      Max. = 100KB
2012  Number of server    Ave. = 10
2013  messages (Ns)       Min. = 10
2014                      Max. = 200
2015  Server processing   Ave. = 1ms
2016  time (Tsp)          Min. = 0.5ms
2017                      Max. = 2ms
2018  Block               Ave. = N/A
2019  Size (Sb)*          Min. = 16KB
2020                      Max. = 128KB

2022 * Depending upon the tested file size, the block size will be 2023 transferred n times to complete the file transfer. An example 2024 would be a 10 MB file test and a 64KB block size. In this case, 160 2025 blocks would be transferred after the control channel is opened 2026 between the client and server.
2028 Acknowledgments
2030 We would like to thank Al Morton for his continuous review and 2031 invaluable input to the document. We would also like to thank 2032 Scott Bradner for providing guidance early in the draft's 2033 conception in the area of the benchmarking scope of traffic management 2034 functions. Additionally, we would like to thank Tim Copley for his 2035 original input and David Taht, Gory Erg, and Toke Hoiland-Jorgensen for 2036 their review and input for the AQM group. And for the formal reviews 2037 of this document, we would like to thank Gilles Forget, 2038 Vijay Gurbani, Reinhard Schrage, and Bhuvaneswaran Vengainathan.
2040 Authors' Addresses
2042 Barry Constantine
2043 JDSU, Test and Measurement Division
2044 Germantown, MD 20876-7100, USA
2045 Phone: +1 240 404 2227
2046 Email: barry.constantine@jdsu.com
2048 Ram Krishnan
2049 Brocade Communications
2050 San Jose, 95134, USA
2051 Phone: +001-408-406-7890
2052 Email: ramk@brocade.com