Internet Engineering Task Force                             M. Hamilton
Internet-Draft                                                     Ixia
Intended status: Informational                                 S. Banks
Expires: August 5, 2013                               Aerohive Networks
                                                          February 2013

      Benchmarking Methodology for Content-Aware Network Devices
                   draft-ietf-bmwg-ca-bench-meth-04

Abstract

This document defines a set of test scenarios and metrics that can be used to benchmark content-aware network devices.  The scenarios in this document are intended to more accurately predict the performance of these devices when subjected to dynamic traffic patterns.  This document operates within the constraints of the Benchmarking Working Group charter, namely black-box characterization in a laboratory environment.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).  Note that other groups may also distribute working documents as Internet-Drafts.  The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on August 5, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights and restrictions with respect to this document.  Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scope
   3.  Test Setup
     3.1.  Test Considerations
     3.2.  Clients and Servers
     3.3.  Traffic Generation Requirements
     3.4.  Discussion of Network Limitations
     3.5.  Framework for Traffic Specification
     3.6.  Multiple Client/Server Testing
     3.7.  Device Configuration Considerations
       3.7.1.  Network Addressing
       3.7.2.  Network Address Translation
       3.7.3.  TCP Stack Considerations
       3.7.4.  Other Considerations
   4.  Benchmarking Tests
     4.1.  Maximum Application Session Establishment Rate
       4.1.1.  Objective
       4.1.2.  Setup Parameters
       4.1.3.  Procedure
       4.1.4.  Measurement
         4.1.4.1.  Maximum Application Flow Rate
         4.1.4.2.  Application Flow Duration
         4.1.4.3.  Application Efficiency
         4.1.4.4.  Application Flow Latency
     4.2.  Application Throughput
       4.2.1.  Objective
       4.2.2.  Setup Parameters
       4.2.3.  Procedure
       4.2.4.  Measurement
         4.2.4.1.  Maximum Throughput
         4.2.4.2.  Maximum Application Flow Rate
         4.2.4.3.  Application Flow Duration
         4.2.4.4.  Application Efficiency
         4.2.4.5.  Packet Loss
         4.2.4.6.  Application Flow Latency
     4.3.  Malformed Traffic Handling
       4.3.1.  Objective
       4.3.2.  Setup Parameters
       4.3.3.  Procedure
       4.3.4.  Measurement
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
     7.3.  URL References
   Appendix A.  Example Traffic Mix
   Appendix B.  Malformed Traffic Algorithm
   Authors' Addresses
1.  Introduction

Content-aware and deep packet inspection (DPI) device deployments have grown significantly in recent years.  No longer are devices simply using Ethernet and IP headers to make forwarding decisions.  This class of device now uses application-specific data to make these decisions.  For example, a web-application firewall (WAF) may use search criteria upon the HTTP Uniform Resource Identifier (URI) [1] to decide whether an HTTP GET method may traverse the network.  In the case of lawful/legal intercept technology, a device could use the phone number within the Session Description Protocol [14] to determine whether a voice-over-IP phone may be allowed to connect.  In addition to the development of entirely new classes of devices, devices that could historically be classified as 'stateless' or raw forwarding devices are now performing DPI functionality.  Devices such as core and edge routers are now being developed with DPI functionality to make more intelligent routing and forwarding decisions.

The Benchmarking Working Group (BMWG) has historically produced Internet-Drafts and Requests for Comments that are focused specifically on creating output metrics derived from a very specific and well-defined set of input parameters that are completely and unequivocally reproducible from test bed to test bed.  The end goal of such methodologies is to, in the words of RFC 2544 [2], reduce "specsmanship" in the industry and hold vendors accountable for performance claims.

The end goal of this methodology is to generate performance metrics in a lab environment that will closely relate to actual observed performance on production networks.  By utilizing dynamic traffic patterns relevant to modern networks, this methodology should be able to closely tie laboratory and production metrics.  It should be further noted that any metrics acquired from production networks SHOULD be captured according to the policies and procedures of the IPPM or PMOL working groups.

An explicit non-goal of this document is to replace existing methodology/terminology pairs such as RFC 2544 [2]/RFC 1242 [3] or RFC 3511 [4]/RFC 2647 [5].  The explicit goal of this document is to create a methodology better suited to modern devices while complementing the data acquired using existing BMWG methodologies.  This document does not assume completely repeatable input stimulus.  The nature of application-driven networks is such that a single dropped packet inherently changes the input stimulus from a network perspective.  While application flows will be specified in great detail, it simply is not practical to require totally repeatable input stimulus.

1.1.  Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [6].

2.  Scope

Content-aware devices take many forms, shapes and architectures.  These devices are advanced network interconnect devices that inspect deep into the application payload of network data packets to do classification.
They may be as simple as a firewall that uses application data inspection for rule set enforcement, or they may have advanced functionality such as performing protocol decoding and validation, anti-virus, anti-spam and even application exploit filtering.  The document will universally call these devices middleboxes, as defined by RFC 3234 [7].

This document is strictly focused on examining performance and robustness across a focused set of metrics: throughput (min/max/avg/sample std dev), transaction rates (successful/failed), application response times, concurrent flows, and unidirectional packet latency.  None of the metrics captured through this methodology are specific to a device, and the results are DUT implementation independent.  Functional testing of the DUT is outside the scope of this methodology.

Devices such as firewalls, intrusion detection and prevention devices, wireless LAN controllers, application delivery controllers, deep packet inspection devices, wide-area network (WAN) optimization devices, and unified threat management systems generally fall into the content-aware category.  While this list may become obsolete, these are a subset of the devices that fall under this scope of testing.

3.  Test Setup

This document will be applicable to most test configurations and will not be confined to a discussion on specific test configurations.  Since each DUT/SUT will have its own unique configuration, users SHOULD configure their device with the same parameters that would be used in the actual deployment of the device, or in a typical deployment if the actual deployment is unknown.  A summary of the DUT configuration MUST be published with the final benchmarking results.  In order to improve repeatability, the published configuration information SHOULD include command-line scripts used to configure the DUT, if any, and SHOULD also include any configuration information for the test equipment used.

3.1.  Test Considerations

3.2.  Clients and Servers

Content-aware device testing SHOULD involve multiple clients and multiple servers.  As with RFC 3511 [4], this methodology will use the terms virtual clients/servers because both the client and server will be represented by the tester and not by actual clients/servers.  As similarly defined in RFC 3511 [4], a data source may emulate multiple clients and/or servers within the context of the same test scenario.  The test report SHOULD indicate the number of virtual clients/servers used during the test.  IANA has reserved address ranges for laboratory characterization.  These are defined for IPv4 and IPv6 by RFC 2544 Appendix C [2] and RFC 5180 Section 5.2 [8], respectively, and SHOULD be consulted prior to testing.

3.3.  Traffic Generation Requirements

The explicit purposes of content-aware devices vary widely, but these devices use information deeper inside the application flow to make decisions and classify traffic.  This methodology will utilize traffic flows that resemble real application traffic without utilizing captures from live production networks.  Application flows, as defined in Section 1.1 of RFC 2724 [9], can be well defined without simply referring to a network capture.  An example traffic template is defined and listed in Appendix A of this document.  A user of this methodology is free to utilize the example mix as provided in that appendix.  If a user of this methodology understands the traffic patterns in their production network, that user MAY use the template provided in Appendix A to describe a traffic mix appropriate for their environment.  In all cases, users MUST report the traffic mix used in the test, and SHOULD report it using a template similar to that in Appendix A.

The test tool SHOULD be able to create application flows between every client and server, regardless of direction.  The tester SHOULD be able to open TCP connections on multiple destination ports and SHOULD be able to direct UDP traffic to multiple destination ports.
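As an informative illustration only (not part of the normative methodology), the following sketch shows one way a simple software tester could establish layer-4 flows from several emulated clients to several emulated servers on multiple TCP and UDP destination ports.  The addresses are placeholders drawn from the IANA benchmarking ranges and are assumed to be configured on the tester's interfaces; a real test tool would additionally carry the application-layer payload of each flow.

   import socket
   from itertools import product

   # Placeholder addresses from the benchmarking ranges (RFC 2544 / RFC 5180);
   # they are assumed to be assigned to the tester's interfaces.
   CLIENT_IPS = ["198.18.0.10", "198.18.0.11"]
   TCP_SERVERS = [("198.19.0.10", 80), ("198.19.0.10", 443), ("198.19.0.11", 8080)]
   UDP_SERVERS = [("198.19.0.20", 53), ("198.19.0.20", 5060)]

   def open_tcp_flows(timeout=2.0):
       """Open one TCP connection from every virtual client to every server/port."""
       flows = []
       for src_ip, (dst_ip, dst_port) in product(CLIENT_IPS, TCP_SERVERS):
           s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
           s.settimeout(timeout)
           s.bind((src_ip, 0))              # ephemeral source port per flow
           s.connect((dst_ip, dst_port))    # layer-4 setup; layer-7 payload follows
           flows.append(s)
       return flows

   def send_udp_probes(payload=b"x" * 64):
       """Direct UDP datagrams from every virtual client to multiple ports."""
       for src_ip, (dst_ip, dst_port) in product(CLIENT_IPS, UDP_SERVERS):
           s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
           s.bind((src_ip, 0))
           s.sendto(payload, (dst_ip, dst_port))
           s.close()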
3.4.  Discussion of Network Limitations

Prior to executing the methodology as outlined in the following sections, it is imperative to understand the implications of utilizing representative application flows for the traffic content of the benchmarking effort.  One interesting aspect of utilizing application flows is that each flow is inherently different from every other application flow.  The content of each flow will vary from application to application and, in most cases, even varies within the same type of application flow.  The following description of the methodology will benchmark every individual type and subset of application flow, prior to performing similar tests with a traffic mix as specified either by the example mix in Appendix A or as defined by the user of this methodology.

The purpose of this process is to ensure that any performance implications that are discovered during the mixed testing are not due to the inherent physical network limitations.  As an example of this phenomenon, it is useful to examine a network device inserted into a single path, as illustrated in the following diagram.

                     +----------+
          +---+ 1gE  |   DUT/   |  1gE +---+
          |C/S|------|   SUT    |------|C/S|
          +---+      +----------+      +---+

              Simple Inline DUT Configuration

            Figure 1: Simple Middle-box Example

For the purpose of this discussion, let's take a hypothetical application flow that utilizes UDP for the transport layer.  Assume that the sample transaction we will be using to model this particular flow requires 10 UDP datagrams to complete the transaction.  For simplicity, each datagram within the flow is exactly 64 bytes, including associated Ethernet, IP, and UDP overhead.  With any network device, there are always three metrics which interact with each other: the number of concurrent application flows, the number of application flows per second, and layer-7 throughput.

Our example test bed is a single-path device connected by 1 gigabit Ethernet links.  The purpose of this benchmark effort is to quantify the number of application flows per second that may be processed through our device under test.  Let's assume that the result from our scenario is that the DUT is able to process 10,000 application flows per second.  The question is whether that ceiling is the actual ceiling of the device, or whether it is actually being limited by one of the other metrics.  If we do the appropriate math, 10,000 flows per second, with each flow at 640 total bytes, means that we are achieving an aggregate bitrate of roughly 49 Mbps.  This is dramatically less than the 1 gigabit physical link we are using.  We can conclude that 10,000 flows per second is in fact the performance limit of the device.

If we change the example slightly and increase the size of each datagram to 1312 bytes, then it becomes necessary to recompute the load.  Assuming the same observed DUT limitation of 10,000 flows per second, it must be ensured that this is an artifact of the DUT, and not of physical limitations.  For each flow, we'll require 104,960 bits.  10,000 flows per second implies a throughput of roughly 1 Gbps.  At this point, we cannot definitively answer whether the DUT is actually limited to 10,000 flows per second.  If we are able to modify the scenario and utilize 10 gigabit interfaces, then perhaps the flow-per-second ceiling will be reached at a higher number than 10,000.

This example illustrates why a user of this methodology SHOULD benchmark each application variant individually to ensure that the cause of a measured limit is fully understood.
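The arithmetic above can be captured in a small informative helper for checking whether an observed flow-per-second ceiling could simply be the physical link saturating.  The sketch below is not part of the methodology; the constants restate the two examples in this section.

   def offered_load_bps(flows_per_second, bytes_per_flow):
       """Aggregate offered load, in bits per second, for a given flow rate."""
       return flows_per_second * bytes_per_flow * 8

   LINK_BPS = 1_000_000_000      # 1 gigabit Ethernet

   # 10 datagrams x 64 bytes = 640 bytes per flow: roughly 50 Mbit/s at
   # 10,000 flows/s, far below the 1 Gbit/s link, so the measured ceiling
   # is credible as a DUT limit.
   print(offered_load_bps(10_000, 640), LINK_BPS)

   # 10 datagrams x 1312 bytes = 13,120 bytes (104,960 bits) per flow:
   # roughly 1.05 Gbit/s at 10,000 flows/s, so the same ceiling may simply
   # be the physical link saturating.
   print(offered_load_bps(10_000, 13_120), LINK_BPS)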
3.5.  Framework for Traffic Specification

The following parameters SHOULD be specified, in tabular form, for each application flow variant:

   o  Data Exchanged By Flow, Bits

   o  Offered Percentage of Total Flows

   o  Transport Protocol(s)

   o  Destination Port(s)
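As a non-normative sketch, this information might be captured in a structure such as the following; the field names are illustrative, and the sample values echo the upstream HTTP entry of Appendix A (128 kB exchanged per flow, 7.3% of offered flows).

   from dataclasses import dataclass

   @dataclass
   class FlowSpec:
       name: str                     # application flow variant, e.g. "HTTP"
       data_exchanged_bits: int      # Data Exchanged By Flow, Bits
       offered_pct_of_flows: float   # Offered Percentage of Total Flows
       transport_protocols: list     # Transport Protocol(s), e.g. ["TCP"]
       destination_ports: list       # Destination Port(s), e.g. [80, 8080]

   # Sample values echoing the upstream HTTP entry of Appendix A.
   http_flow = FlowSpec("HTTP",
                        data_exchanged_bits=128 * 1024 * 8,
                        offered_pct_of_flows=7.3,
                        transport_protocols=["TCP"],
                        destination_ports=[80])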
3.6.  Multiple Client/Server Testing

In actual network deployments, connections are being established between multiple clients and multiple servers simultaneously.  Device vendors have been known to optimize the operation of their devices for easily defined patterns.  The connection sequence ordering scenarios a device will see on a network will likely be much less deterministic.  In fact, many application flows have multiple layer 4 connections within a single flow, with client and server reversing roles.  Flow initiation SHOULD occur in a pseudo-random manner across ingress ports.

3.7.  Device Configuration Considerations

The configuration of the DUT may have an effect on the observed results of the following methodology.  A representative, though certainly not exhaustive, list of potential considerations is given below.

3.7.1.  Network Addressing

The IANA has issued a range of IP addresses to the BMWG for purposes of benchmarking.  Please refer to RFC 2544 [2] and RFC 5180 [8] for more details.  If more IPv4 addresses are required than the RFC 2544 allotment provides, then allocations from the private address space defined in RFC 1918 [10] may be used.

3.7.2.  Network Address Translation

Many content-aware devices are capable of performing Network Address Translation (NAT) [5].  If the final deployment of the DUT will have this functionality enabled, then the DUT SHOULD also have it enabled during the execution of this methodology.  It MAY be beneficial to perform the test series in both modes in order to determine the performance differential when using NAT.  The test report SHOULD indicate whether NAT was enabled during the testing process.

3.7.3.  TCP Stack Considerations

The IETF has historically provided guidance and information on TCP stack considerations.  This methodology is strictly focused on performance metrics at layers above layer 4 and thus does not specifically define any TCP stack configuration parameters for either the tester or the DUTs.  The TCP configuration of the tester MUST remain constant across all DUTs in order to ensure comparable results.  While the following list of references is not exhaustive, each document contains a relevant discussion of TCP stack considerations.

The general IETF TCP roadmap is defined in RFC 4614 [11], and congestion control algorithms are discussed in Section 2 of RFC 3148 [12], with even more detailed references.  TCP receive and congestion window sizes are discussed in detail in RFC 6349 [13].

3.7.4.  Other Considerations

Various content-aware devices will have widely varying feature sets.  In the interest of representative test results, the DUT features that will likely be enabled in the final deployment SHOULD be used.  This methodology is not intended to advise on which features should be enabled, but to suggest using actual deployment configurations.

4.  Benchmarking Tests

Each of the following benchmark scenarios SHOULD be run with each of the single application flow templates.  Upon completion of all iterations, the mixed test SHOULD be completed, subject to the traffic mix as defined by the user.

4.1.  Maximum Application Session Establishment Rate

4.1.1.  Objective

To determine the maximum rate at which a device is able to establish and complete application flows as defined by draft-ietf-bmwg-ca-bench-term-00.

4.1.2.  Setup Parameters

The following parameters SHOULD be used and reported for all tests: for each application protocol in use during the test run, the table provided in Section 3.5 SHOULD be published.

4.1.3.  Procedure

The test SHOULD generate application network traffic that meets the conditions of Section 3.3.  The traffic pattern SHOULD begin with an application flow rate of 10% of the expected maximum.  The test SHOULD be configured to increase the attempt rate in increments of 10%, up through 110% of the expected maximum.  In the case where the expected maximum is limited by the physical link rate, as discovered through Appendix A, the maximum rate attempted will be 100% of the expected maximum, or "wire-speed performance".  The duration of each loading phase SHOULD be at least 30 seconds.  This test MAY be repeated, with each subsequent iteration beginning at 5% of the expected maximum and increasing the session establishment rate to 110% of the maximum observed in the previous test run.

This procedure MAY be repeated any reasonable number of times, with the results being averaged together.
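The loading schedule described above could, for example, be generated as in the following informative sketch; the function and parameter names are illustrative rather than mandated by this methodology.

   def ramp_schedule(expected_max_fps, link_limited=False, hold_seconds=30):
       """Target flow-establishment rates for each loading phase."""
       top_pct = 100 if link_limited else 110
       return [(pct, round(expected_max_fps * pct / 100), hold_seconds)
               for pct in range(10, top_pct + 10, 10)]

   # Example: an expected maximum of 10,000 flows/s on a DUT that is not
   # limited by the physical link rate.
   for pct, rate, hold in ramp_schedule(10_000):
       print(f"offer {rate} flows/s ({pct}% of expected max) for {hold} s")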
4.1.4.  Measurement

The following metrics MAY be determined from this test, and SHOULD be observed for each application protocol within the traffic mix:

4.1.4.1.  Maximum Application Flow Rate

The test tool SHOULD report the maximum rate at which application flows were completed, as defined by RFC 2647 [5], Section 3.7.  This rate SHOULD be reported individually for each application protocol present within the traffic mix.

4.1.4.2.  Application Flow Duration

The test tool SHOULD report the minimum, maximum and average application flow duration, as defined by RFC 2647 [5], Section 3.9.  This duration SHOULD be reported individually for each application protocol present within the traffic mix.

4.1.4.3.  Application Efficiency

The test tool SHOULD report the application efficiency, similarly defined for TCP by RFC 6349 [13].

                      Transmitted Bytes - Retransmitted Bytes
   App Efficiency % = ----------------------------------------  X 100
                                 Transmitted Bytes

          Figure 2: Application Efficiency Percent Calculation

Note that an efficiency of less than 100% does not necessarily imply noticeably degraded performance, since certain applications utilize algorithms to maintain a quality user experience in the face of data loss.
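An informative sketch of the Figure 2 calculation follows; the byte counts would come from the test tool's per-flow statistics, and the names are illustrative.

   def app_efficiency_pct(transmitted_bytes, retransmitted_bytes):
       """Figure 2: percentage of transmitted bytes that were not retransmissions."""
       if transmitted_bytes == 0:
           return 0.0
       return (transmitted_bytes - retransmitted_bytes) / transmitted_bytes * 100.0

   print(app_efficiency_pct(1_000_000, 2_500))   # 99.75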
4.1.4.4.  Application Flow Latency

The test tool SHOULD report the minimum, maximum and average amount of time an application flow member takes to traverse the DUT, as defined by RFC 1242 [3], Section 3.8.  This value SHOULD be reported individually for each application protocol present within the traffic mix.

4.2.  Application Throughput

4.2.1.  Objective

To determine the maximum rate at which a device is able to forward bits when using application flows as defined in the previous sections.

4.2.2.  Setup Parameters

The same parameter reporting procedure as described in Section 4.1.2 SHOULD be used for all tests.

4.2.3.  Procedure

This test will attempt to send application flows through the device at a flow rate of 30% of the maximum observed in Section 4.1.  This procedure MAY be repeated, with the results from each iteration averaged together.

4.2.4.  Measurement

The following metrics MAY be determined from this test, and SHOULD be observed for each application protocol within the traffic mix:

4.2.4.1.  Maximum Throughput

The test tool SHOULD report the minimum, maximum and average application throughput.

4.2.4.2.  Maximum Application Flow Rate

The test tool SHOULD report the maximum rate at which application flows were completed, as defined by RFC 2647 [5], Section 3.7.  This rate SHOULD be reported individually for each application protocol present within the traffic mix.

4.2.4.3.  Application Flow Duration

The test tool SHOULD report the minimum, maximum and average application flow duration, as defined by RFC 2647 [5], Section 3.9.  This duration SHOULD be reported individually for each application protocol present within the traffic mix.

4.2.4.4.  Application Efficiency

The test tool SHOULD report the application efficiency, as defined in Section 4.1.4.3.

4.2.4.5.  Packet Loss

The test tool SHOULD report the number of packets lost or dropped from source to destination.

4.2.4.6.  Application Flow Latency

The test tool SHOULD report the minimum, maximum and average amount of time an application flow member takes to traverse the DUT, as defined by RFC 1242 [3], Section 3.8.  This value SHOULD be reported individually for each application protocol present within the traffic mix.
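Because every metric in Section 4.1.4 and Section 4.2.4 is reported individually for each application protocol, a test tool might collect its results in a per-protocol record such as the following informative sketch; the field names are illustrative and are not required by this methodology.

   from dataclasses import dataclass
   from typing import Optional, Tuple

   @dataclass
   class ProtocolResult:
       protocol: str                                  # e.g. "HTTP"
       max_flow_rate_fps: float                       # Sections 4.1.4.1 / 4.2.4.2
       flow_duration_s: Tuple[float, float, float]    # (min, max, avg), 4.1.4.2
       app_efficiency_pct: float                      # Figure 2, 4.1.4.3
       flow_latency_s: Tuple[float, float, float]     # (min, max, avg), 4.1.4.4
       throughput_bps: Optional[Tuple[float, float, float]] = None  # 4.2.4.1
       packets_lost: Optional[int] = None             # 4.2.4.5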
4.3.  Malformed Traffic Handling

4.3.1.  Objective

To determine the effects on performance and stability that malformed traffic may have on the DUT.

4.3.2.  Setup Parameters

The same transport-layer and application-layer parameters previously specified in Section 4.1.2 and Section 4.2.2 SHOULD be used.

4.3.3.  Procedure

This test will utilize the procedures specified previously in Section 4.1.3 and Section 4.2.3.  When performing those procedures, the tester should generate malformed traffic at all protocol layers.  This is commonly known as fuzzed traffic.  Fuzzing techniques generally modify portions of packets, including introducing checksum errors, invalid protocol options, and improper protocol conformance.

The process by which the tester SHOULD generate the malformed traffic is outlined in detail in Appendix B.

4.3.4.  Measurement

For each protocol present in the traffic mix, the metrics specified by Section 4.1.4 and Section 4.2.4 MAY be determined.  This data may be used to ascertain the effects of fuzzed traffic on the DUT.

5.  IANA Considerations

This memo includes no request to IANA.

All drafts are required to have an IANA considerations section (see RFC 5226 [15], the update of RFC 2434, for a guide).  If the draft does not require IANA to do anything, the section contains an explicit statement that this is the case (as above).  If there are no requirements for IANA, the section will be removed during conversion into an RFC by the RFC Editor.

6.  Security Considerations

Benchmarking activities as described in this memo are limited to technology characterization using controlled stimuli in a laboratory environment, with dedicated address space and the other constraints of RFC 2544 [2].

The benchmarking network topology will be an independent test setup and MUST NOT be connected to devices that may forward the test traffic into a production network, or mis-route traffic to the test management network.

7.  References

7.1.  Normative References

   [1]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.

   [2]   Bradner, S. and J. McQuaid, "Benchmarking Methodology for Network Interconnect Devices", RFC 2544, March 1999.

   [3]   Bradner, S., "Benchmarking Terminology for Network Interconnection Devices", RFC 1242, July 1991.

   [4]   Hickman, B., Newman, D., Tadjudin, S., and T. Martin, "Benchmarking Methodology for Firewall Performance", RFC 3511, April 2003.

   [5]   Newman, D., "Benchmarking Terminology for Firewall Performance", RFC 2647, August 1999.

   [6]   Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [7]   Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and Issues", RFC 3234, February 2002.

   [8]   Popoviciu, C., Hamza, A., Van de Velde, G., and D. Dugatkin, "IPv6 Benchmarking Methodology for Network Interconnect Devices", RFC 5180, May 2008.

   [9]   Handelman, S., Stibler, S., Brownlee, N., and G. Ruth, "RTFM: New Attributes for Traffic Flow Measurement", RFC 2724, October 1999.

   [10]  Rekhter, Y., Moskowitz, R., Karrenberg, D., de Groot, G., and E. Lear, "Address Allocation for Private Internets", BCP 5, RFC 1918, February 1996.

   [11]  Duke, M., Braden, R., Eddy, W., and E. Blanton, "A Roadmap for Transmission Control Protocol (TCP) Specification Documents", RFC 4614, September 2006.

   [12]  Mathis, M. and M. Allman, "A Framework for Defining Empirical Bulk Transfer Capacity Metrics", RFC 3148, July 2001.

   [13]  Constantine, B., Forget, G., Geib, R., and R. Schrage, "Framework for TCP Throughput Testing", RFC 6349, August 2011.

7.2.  Informative References

   [14]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

   [15]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

7.3.  URL References

   [16]  Sandvine Corporation, "http://www.sandvine.com/general/document.download.asp?docID=58&sourceID=0", 2012.

Appendix A.  Example Traffic Mix

This appendix shows an example case of a protocol mix that may be used with this methodology.  This mix closely represents the research published by Sandvine [16] in their biannual report for the first half of 2012 on North American fixed access service provider networks.

   +------------+------------------+--------------------+--------+
   | Direction  | Application Flow | Options            | Value  |
   +------------+------------------+--------------------+--------+
   | Upstream   | BitTorrent       |                    |        |
   |            |                  | Avg Flow Size (L7) | 512 MB |
   |            |                  | Flow Percentage    | 44.4%  |
   |            | HTTP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 7.3%   |
   |            | Skype            |                    |        |
   |            |                  | Avg Flow Size (L7) | 8 MB   |
   |            |                  | Flow Percentage    | 4.9%   |
   |            | SSL/TLS          |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 3.2%   |
   |            | Netflix          |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 kB |
   |            |                  | Flow Percentage    | 3.1%   |
   |            | PPStream         |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 2.2%   |
   |            | YouTube          |                    |        |
   |            |                  | Avg Flow Size (L7) | 4 MB   |
   |            |                  | Flow Percentage    | 1.9%   |
   |            | Facebook         |                    |        |
   |            |                  | Avg Flow Size (L7) | 2 MB   |
   |            |                  | Flow Percentage    | 1.9%   |
   |            | Teredo           |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 1.2%   |
   |            | Apple iMessage   |                    |        |
   |            |                  | Avg Flow Size (L7) | 40 kB  |
   |            |                  | Flow Percentage    | 1.1%   |
   |            | Bulk TCP         |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 28.8%  |
   | Downstream | Netflix          |                    |        |
   |            |                  | Avg Flow Size (L7) | 512 MB |
   |            |                  | Flow Percentage    | 32.9%  |
   |            | YouTube          |                    |        |
   |            |                  | Avg Flow Size (L7) | 5 MB   |
   |            |                  | Flow Percentage    | 13.8%  |
   |            | HTTP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 1 MB   |
   |            |                  | Flow Percentage    | 12.1%  |
   |            | BitTorrent       |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 6.3%   |
   |            | iTunes           |                    |        |
   |            |                  | Avg Flow Size (L7) | 32 MB  |
   |            |                  | Flow Percentage    | 3.8%   |
   |            | Flash Video      |                    |        |
   |            |                  | Avg Flow Size (L7) | 100 MB |
   |            |                  | Flow Percentage    | 2.6%   |
   |            | MPEG             |                    |        |
   |            |                  | Avg Flow Size (L7) | 100 MB |
   |            |                  | Flow Percentage    | 2.0%   |
   |            | RTMP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 50 MB  |
   |            |                  | Flow Percentage    | 2.0%   |
   |            | Hulu             |                    |        |
   |            |                  | Avg Flow Size (L7) | 300 MB |
   |            |                  | Flow Percentage    | 1.8%   |
   |            | SSL/TLS          |                    |        |
   |            |                  | Avg Flow Size (L7) | 256 kB |
   |            |                  | Flow Percentage    | 1.6%   |
   |            | Bulk TCP         |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 kB |
   |            |                  | Flow Percentage    | 21.1%  |
   +------------+------------------+--------------------+--------+

              Table 1: Example Traffic Pattern

Appendix B.  Malformed Traffic Algorithm

Each application flow will be broken into multiple transport segments, IP packets, and Ethernet frames.  The malformed traffic algorithm looks very similar to that of the IP Stack Integrity Checker project at http://isic.sourceforge.net.

The algorithm is very simple and starts by defining each of the fields within the TCP/IP stack that will be malformed during transmission.  The following table illustrates the Ethernet, IPv4, IPv6, TCP, and UDP fields which are able to be malformed by the algorithm.  The first column lists the protocol, the second column shows the actual header field name, and the third column shows the percentage of packets that should have the field modified by the malformation algorithm.

   +--------------+--------------------------+-------------+
   | Protocol     | Header Field             | Malformed % |
   +--------------+--------------------------+-------------+
   | Total Frames |                          | 1%          |
   | Ethernet     |                          |             |
   |              | Destination MAC          | 0%          |
   |              | Source MAC               | 1%          |
   |              | Ethertype                | 1%          |
   |              | CRC                      | 1%          |
   | IP Version 4 |                          |             |
   |              | Version                  | 1%          |
   |              | IHL                      | 1%          |
   |              | Type of Service          | 1%          |
   |              | Total Length             | 1%          |
   |              | Identification           | 1%          |
   |              | Flags                    | 1%          |
   |              | Fragment Offset          | 1%          |
   |              | Time to Live             | 1%          |
   |              | Protocol                 | 1%          |
   |              | Header Checksum          | 1%          |
   |              | Source Address           | 1%          |
   |              | Destination Address      | 1%          |
   |              | Options                  | 1%          |
   |              | Padding                  | 1%          |
   | UDP          |                          |             |
   |              | Source Port              | 1%          |
   |              | Destination Port         | 1%          |
   |              | Length                   | 1%          |
   |              | Checksum                 | 1%          |
   | TCP          |                          |             |
   |              | Source Port              | 1%          |
   |              | Destination Port         | 1%          |
   |              | Sequence Number          | 1%          |
   |              | Acknowledgement Number   | 1%          |
   |              | Data Offset              | 1%          |
   |              | Reserved (3 bit)         | 1%          |
   |              | Flags (9 bit)            | 1%          |
   |              | Window Size              | 1%          |
   |              | Checksum                 | 1%          |
   |              | Urgent Pointer           | 1%          |
   |              | Options (Variable Length)| 1%          |
   +--------------+--------------------------+-------------+

              Table 2: Malformed Header Values

This algorithm is to be used across the regular application flows used throughout the rest of the methodology.  As each frame is emitted from the test tool, a pseudo-random number generator will indicate whether the frame is to be malformed by creating a number between 0 and 100.  If the number is less than the percentage defined in the table, then that frame will be malformed.  If the frame is to be malformed, then each of the headers in the table present within the frame will follow the same process.  If it is determined that a header field should be malformed, the same pseudo-random number generator will be used to create a random value for the specified header field.
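An informative sketch of this selection logic follows; the field keys and the random-number interface are illustrative, and a real tester would additionally overwrite each selected field with a randomly generated value.

   import random

   # Subset of Table 2 for illustration; a complete tester would carry an
   # entry for every header field listed there.
   MALFORM_PCT = {
       "frame": 1,                  # Total Frames
       "eth.src_mac": 1,
       "ip.ttl": 1,
       "ip.header_checksum": 1,
       "udp.length": 1,
       "tcp.flags": 1,
   }

   def should_malform(key, rng=random):
       """Uniform draw in [0, 100); malform when it falls below the Table 2 value."""
       return rng.uniform(0, 100) < MALFORM_PCT[key]

   def fields_to_corrupt(fields_in_frame, rng=random):
       """Return the header fields of this frame that should be malformed."""
       if not should_malform("frame", rng):
           return set()
       return {f for f in fields_in_frame
               if f in MALFORM_PCT and should_malform(f, rng)}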
Authors' Addresses

   Mike Hamilton
   Ixia
   Austin, TX  78730
   US

   Phone: +1 512 636 2303
   Email: mhamilton@ixiacom.com


   Sarah Banks
   Aerohive Networks
   San Jose, CA  95134
   US

   Email: sbanks@aerohive.com