Internet Engineering Task Force                              M. Hamilton
Internet-Draft                                     BreakingPoint Systems
Intended status: Informational                                  S. Banks
Expires: January 17, 2013                                  Cisco Systems
                                                           July 16, 2012


      Benchmarking Methodology for Content-Aware Network Devices
                    draft-ietf-bmwg-ca-bench-meth-02

Abstract

   This document defines a set of test scenarios and metrics that can
   be used to benchmark content-aware network devices.  The scenarios
   in this document are intended to more accurately predict the
   performance of these devices when subjected to dynamic traffic
   patterns.  This document operates within the constraints of the
   Benchmarking Working Group charter, namely black-box
   characterization in a laboratory environment.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 17, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Scope
   3.  Test Setup
     3.1.  Test Considerations
     3.2.  Clients and Servers
     3.3.  Traffic Generation Requirements
     3.4.  Discussion of Network Limitations
     3.5.  Framework for Traffic Specification
     3.6.  Multiple Client/Server Testing
     3.7.  Device Configuration Considerations
       3.7.1.  Network Addressing
       3.7.2.  Network Address Translation
       3.7.3.  TCP Stack Considerations
       3.7.4.  Other Considerations
   4.  Benchmarking Tests
     4.1.  Maximum Application Session Establishment Rate
       4.1.1.  Objective
       4.1.2.  Setup Parameters
         4.1.2.1.  Application-Layer Parameters
       4.1.3.  Procedure
       4.1.4.  Measurement
         4.1.4.1.  Maximum Application Flow Rate
         4.1.4.2.  Application Flow Duration
         4.1.4.3.  Application Efficiency
         4.1.4.4.  Application Flow Latency
     4.2.  Application Throughput
       4.2.1.  Objective
       4.2.2.  Setup Parameters
         4.2.2.1.  Parameters
       4.2.3.  Procedure
       4.2.4.  Measurement
         4.2.4.1.  Maximum Throughput
         4.2.4.2.  Maximum Application Flow Rate
         4.2.4.3.  Application Flow Duration
         4.2.4.4.  Application Efficiency
         4.2.4.5.  Packet Loss
         4.2.4.6.  Application Flow Latency
     4.3.  Malformed Traffic Handling
       4.3.1.  Objective
       4.3.2.  Setup Parameters
       4.3.3.  Procedure
       4.3.4.  Measurement
   5.  IANA Considerations
   6.  Security Considerations
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Appendix A.  Example Traffic Mix
   Appendix B.  Malformed Traffic Algorithm
   Authors' Addresses

1.  Introduction

   Content-aware and deep packet inspection (DPI) device deployments
   have grown significantly in recent years.  No longer are devices
   simply using Ethernet and IP headers to make forwarding decisions.
   This class of device now uses application-specific data to make
   these decisions.  For example, a web-application firewall (WAF) may
   apply search criteria to the HTTP Uniform Resource Identifier
   (URI) [1] to decide whether an HTTP GET method may traverse the
   network.  In the case of lawful/legal intercept technology, a device
   could use the phone number within the Session Description
   Protocol [14] to determine whether a voice-over-IP phone may be
   allowed to connect.  In addition to the development of entirely new
   classes of devices, devices that could historically be classified as
   'stateless' or raw forwarding devices are now performing DPI
   functions.  Devices such as core and edge routers are now being
   developed with DPI functionality to make more intelligent routing
   and forwarding decisions.

   The Benchmarking Working Group (BMWG) has historically produced
   Internet-Drafts and Requests for Comments that are focused
   specifically on creating output metrics derived from a very specific
   and well-defined set of input parameters that are completely and
   unequivocally reproducible from test bed to test bed.  The end goal
   of such methodologies is, in the words of RFC 2544 [2], to reduce
   "specsmanship" in the industry and hold vendors accountable for
   performance claims.

   The end goal of this methodology is to generate performance metrics
   in a lab environment that will closely relate to actual observed
   performance on production networks.  By utilizing dynamic traffic
   patterns relevant to modern networks, this methodology should be
   able to closely tie laboratory and production metrics.  It should be
   further noted that any metrics acquired from production networks
   SHOULD be captured according to the policies and procedures of the
   IPPM or PMOL working groups.

   An explicit non-goal of this document is to replace existing
   methodology/terminology pairs such as RFC 2544 [2]/RFC 1242 [3] or
   RFC 3511 [4]/RFC 2647 [5].  The explicit goal of this document is to
   create a methodology more suited to modern devices while
   complementing the data acquired using existing BMWG methodologies.
   This document does not assume completely repeatable input stimulus.
   The nature of application-driven networks is such that a single
   dropped packet inherently changes the input stimulus from a network
   perspective.  While application flows will be specified in great
   detail, it simply is not practical to require totally repeatable
   input stimulus.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [6].

2.  Scope

   Content-aware devices take many forms, shapes, and architectures.
   These devices are advanced network interconnect devices that inspect
   deep into the application payload of network data packets to perform
   classification.
   They may be as simple as a firewall that uses application data
   inspection for rule set enforcement, or they may have advanced
   functionality such as protocol decoding and validation, anti-virus,
   anti-spam, and even application exploit filtering.  This document
   refers to these devices collectively as middleboxes, as defined by
   RFC 3234 [7].

   This document is strictly focused on examining performance and
   robustness across a focused set of metrics: throughput (min/max/avg/
   sample std dev), transaction rates (successful/failed), application
   response times, concurrent flows, and unidirectional packet latency.
   None of the metrics captured through this methodology are specific
   to a device, and the results are DUT implementation independent.
   Functional testing of the DUT is outside the scope of this
   methodology.

   Devices such as firewalls, intrusion detection and prevention
   devices, wireless LAN controllers, application delivery controllers,
   deep packet inspection devices, wide-area network (WAN) optimization
   devices, and unified threat management systems generally fall into
   the content-aware category.  While this list may become obsolete,
   these are a subset of the devices that fall under this scope of
   testing.

3.  Test Setup

   This document is applicable to most test configurations and is not
   confined to a discussion of specific test configurations.  Since
   each DUT/SUT will have its own unique configuration, users SHOULD
   configure their device with the same parameters that would be used
   in the actual deployment of the device, or a typical deployment if
   the actual deployment is unknown.  In order to improve
   repeatability, the DUT configuration SHOULD be published with the
   final benchmarking results.  If available, command-line scripts used
   to configure the DUT and any configuration information for the
   tester SHOULD be published with the final results.

3.1.  Test Considerations

3.2.  Clients and Servers

   Content-aware device testing SHOULD involve multiple clients and
   multiple servers.  As with RFC 3511 [4], this methodology uses the
   terms virtual clients/servers because both the client and server
   will be represented by the tester and not actual clients/servers.
   As similarly defined in RFC 3511 [4], a data source may emulate
   multiple clients and/or servers within the context of the same test
   scenario.  The test report SHOULD indicate the number of virtual
   clients/servers used during the test.  IANA has reserved address
   ranges for laboratory characterization.  These are defined for IPv4
   and IPv6 by RFC 2544, Appendix C [2] and RFC 5180, Section 5.2 [8],
   respectively, and SHOULD be consulted prior to testing.

3.3.  Traffic Generation Requirements

   The explicit purposes of content-aware devices vary widely, but
   these devices use information deeper inside the application flow to
   make decisions and classify traffic.  This methodology utilizes
   traffic flows that resemble real application traffic without
   utilizing captures from live production networks.  Application
   flows, as defined in RFC 2722 [9], can be well defined without
   simply referring to a network capture.  An example traffic template
   is defined and listed in Appendix A of this document.  A user of
   this methodology is free to utilize the example mix as provided in
   the appendix.
   If a user of this methodology understands the traffic patterns in
   their production network, that user MAY use the template provided in
   Appendix A to describe a traffic mix appropriate for their
   environment.

   The test tool SHOULD be able to create application flows between
   every client and server, regardless of direction.  The tester SHOULD
   be able to open TCP connections on multiple destination ports and
   SHOULD be able to direct UDP traffic to multiple destination ports.

3.4.  Discussion of Network Limitations

   Prior to executing the methodology as outlined in the following
   sections, it is imperative to understand the implications of
   utilizing representative application flows for the traffic content
   of the benchmarking effort.  One interesting aspect of utilizing
   application flows is that each flow is inherently different from
   every other application flow.  The content of each flow will vary
   from application to application and, in most cases, even varies
   within the same type of application flow.  The following methodology
   therefore benchmarks each individual type and subset of application
   flow first, prior to performing similar tests with a traffic mix as
   specified either by the example mix in Appendix A or as defined by
   the user of this methodology.

   The purpose of this process is to ensure that any performance
   implications discovered during the mixed testing are not due to
   inherent physical network limitations.  As an example of this
   phenomenon, it is useful to examine a network device inserted into a
   single path, as illustrated in the following diagram.

                           +----------+
               +---+  1gE  |   DUT/   |  1gE  +---+
               |C/S|-------|   SUT    |-------|C/S|
               +---+       +----------+       +---+

                     Simple Inline DUT Configuration

                   Figure 1: Simple Middlebox Example

   For the purpose of this discussion, consider a hypothetical
   application flow that utilizes UDP for the transport layer.  Assume
   that the sample transaction used to model this particular flow
   requires 10 UDP datagrams to complete the transaction.  For
   simplicity, each datagram within the flow is exactly 64 bytes,
   including associated Ethernet, IP, and UDP overhead.  With any
   network device, there are always three metrics that interact with
   each other: the number of concurrent application flows, the number
   of application flows per second, and layer-7 throughput.

   Our example test bed is a single-path device connected by 1 gigabit
   Ethernet links.  The purpose of this benchmark effort is to quantify
   the number of application flows per second that may be processed
   through our device under test.  Assume that the result from this
   scenario is that the DUT is able to process 10,000 application flows
   per second.  The question is whether that ceiling is the actual
   ceiling of the device, or whether the device is actually being
   limited by one of the other metrics.  If we do the math, 10,000
   flows per second, with each flow carrying 640 total bytes, yields an
   aggregate bit rate of roughly 51 Mbps.  This is dramatically less
   than the 1 gigabit physical link we are using.  We can conclude that
   10,000 flows per second is in fact the performance limit of the
   device.

   If we change the example slightly and increase the size of each
   datagram to 1312 bytes, then it becomes necessary to recompute the
   load.  Assuming the same observed DUT limitation of 10,000 flows per
   second, it must be ensured that this is an artifact of the DUT and
   not of physical limitations.  Each flow now requires 104,960 bits,
   so 10,000 flows per second implies a throughput of roughly 1 Gbps.
   At this point, we cannot definitively answer whether the DUT is
   actually limited to 10,000 flows per second.  If we are able to
   modify the scenario and utilize 10 gigabit interfaces, then perhaps
   the flow-per-second ceiling will be reached at a number higher than
   10,000.

   This example illustrates why a user of this methodology SHOULD
   benchmark each application variant individually, to ensure that the
   cause of a measured limit is fully understood.
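   The arithmetic above can be captured in a short, non-normative
   sanity check.  The following Python sketch uses the hypothetical
   values from this section (10 datagrams per flow, 64-byte and
   1312-byte datagrams, a 1 gigabit link); the 90% utilization
   threshold is an illustrative assumption, not a requirement of this
   methodology.

      # Is a measured flow-per-second ceiling a DUT limit, or is it an
      # artifact of the physical link?  Values are the hypothetical
      # ones used in Section 3.4.

      def offered_load_bps(flows_per_second, datagrams_per_flow,
                           bytes_per_datagram):
          """Aggregate bit rate implied by a given application flow rate."""
          bits_per_flow = datagrams_per_flow * bytes_per_datagram * 8
          return flows_per_second * bits_per_flow

      LINK_CAPACITY_BPS = 1_000_000_000  # 1 gigabit Ethernet

      for bytes_per_datagram in (64, 1312):
          load = offered_load_bps(10_000, 10, bytes_per_datagram)
          utilization = load / LINK_CAPACITY_BPS
          verdict = ("likely a DUT limit" if utilization < 0.9
                     else "possibly limited by the physical link")
          print(f"{bytes_per_datagram:>5}-byte datagrams: "
                f"{load / 1e6:8.1f} Mbps "
                f"({utilization:.0%} of link) -> {verdict}")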
3.5.  Framework for Traffic Specification

   The following parameters SHOULD be specified for each application
   flow variant, as illustrated by the example values below.

   o  Flow Size in Bits

   o  Percentage of Aggregate Flows: 25%

   o  Transport Protocol(s): TCP, UDP

   o  Destination Port(s): 80
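   These fields lend themselves to a simple machine-readable record.
   The following Python sketch is a non-normative illustration of one
   way to capture a flow variant and to check that the percentages of
   an overall mix sum to 100%; the HTTP entry and its flow size are
   hypothetical examples, not values mandated by this document.

      from dataclasses import dataclass
      from typing import List

      @dataclass
      class FlowSpec:
          """One application flow variant, per the fields of Section 3.5."""
          name: str
          flow_size_bits: int             # Flow Size in Bits
          percentage_of_flows: float      # Percentage of Aggregate Flows
          transport_protocols: List[str]  # e.g., ["TCP"], ["UDP"], or both
          destination_ports: List[int]    # e.g., [80]

      def validate_mix(mix: List[FlowSpec]) -> None:
          """Reject a mix whose flow percentages do not sum to 100%."""
          total = sum(spec.percentage_of_flows for spec in mix)
          if abs(total - 100.0) > 0.1:
              raise ValueError(f"flow percentages sum to {total}%, not 100%")

      # Hypothetical entry using the example values listed above; the
      # flow size is arbitrary since the list leaves that field open.
      http_spec = FlowSpec(name="HTTP",
                           flow_size_bits=1_048_576,
                           percentage_of_flows=25.0,
                           transport_protocols=["TCP", "UDP"],
                           destination_ports=[80])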
3.6.  Multiple Client/Server Testing

   In actual network deployments, connections are established between
   multiple clients and multiple servers simultaneously.  Device
   vendors have been known to optimize the operation of their devices
   for easily defined patterns.  The connection sequence ordering a
   device will see on a production network is likely to be much less
   deterministic.  In fact, many application flows comprise multiple
   layer 4 connections within a single flow, with client and server
   reversing roles.  This methodology makes no assumptions about flow
   initiation sequence across multiple ports.

3.7.  Device Configuration Considerations

   The configuration of the DUT may have an effect on the observed
   results of the following methodology.  A list of potential
   considerations, by no means exhaustive, is given below.

3.7.1.  Network Addressing

   IANA has issued a range of IP addresses to the BMWG for the purposes
   of benchmarking.  Please refer to RFC 2544 [2] and RFC 5180 [8] for
   more details.  If more IPv4 addresses are required than the RFC 2544
   allotment provides, then allocations from the private address space
   defined in RFC 1918 [10] may be used.

3.7.2.  Network Address Translation

   Many content-aware devices are capable of performing Network Address
   Translation (NAT) [5].  If the final deployment of the DUT will have
   this functionality enabled, then the DUT SHOULD also have it enabled
   during the execution of this methodology.  It MAY be beneficial to
   perform the test series in both modes in order to determine the
   performance differential when using NAT.  The test report SHOULD
   indicate whether NAT was enabled during the testing process.

3.7.3.  TCP Stack Considerations

   The IETF has historically provided guidance and information on TCP
   stack considerations.  This methodology is strictly focused on
   performance metrics at layers above layer 4 and therefore does not
   specifically define any TCP stack configuration parameters for
   either the tester or the DUT.  The TCP configuration of the tester
   MUST remain constant across all DUTs in order to ensure comparable
   results.  While the following list of references is not exhaustive,
   each document contains a relevant discussion of TCP stack
   considerations.

   The general IETF TCP roadmap is defined in RFC 4614 [11], and
   congestion control algorithms are discussed in Section 2 of
   RFC 3148 [12], which provides even more detailed references.  TCP
   receive and congestion window sizes are discussed in detail in
   RFC 6349 [13].

3.7.4.  Other Considerations

   Various content-aware devices will have widely varying feature sets.
   In the interest of representative test results, the DUT features
   that will likely be enabled in the final deployment SHOULD be used.
   This methodology is not intended to advise on which features should
   be enabled, but to suggest using actual deployment configurations.

4.  Benchmarking Tests

   Each of the following benchmark scenarios SHOULD be run with each of
   the single application flow templates.  Upon completion of all
   iterations, the mixed test SHOULD be completed, subject to the
   traffic mix as defined by the user.

4.1.  Maximum Application Session Establishment Rate

4.1.1.  Objective

   To determine the maximum rate at which a device is able to establish
   and complete application flows as defined by
   draft-ietf-bmwg-ca-bench-term-00.

4.1.2.  Setup Parameters

   The following parameters SHOULD be used and reported for all tests:

4.1.2.1.  Application-Layer Parameters

   For each application protocol in use during the test run, the
   information described in Section 3.5 SHOULD be published.

4.1.3.  Procedure

   The test SHOULD generate application network traffic that meets the
   conditions of Section 3.3.  The traffic pattern SHOULD begin with an
   application flow rate of 10% of the expected maximum.  The test
   SHOULD be configured to increase the attempt rate in steps of 10%,
   up through 110% of the expected maximum.  In the case where the
   expected maximum is limited by the physical link rate, as discovered
   through Appendix A, the maximum attempted rate will be 100% of the
   expected maximum, or "wire-speed performance".  The duration of each
   loading phase SHOULD be at least 30 seconds.  This test MAY be
   repeated, with each subsequent iteration beginning at 5% of the
   expected maximum and increasing the session establishment rate to
   110% of the maximum observed in the previous test run.

   This procedure MAY be repeated any reasonable number of times, with
   the results being averaged together.
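   The loading profile described above can be expressed as a simple
   schedule.  The following Python sketch is a non-normative rendering
   of the ramp; run_load_phase() is a hypothetical placeholder for the
   control interface of whatever test tool is in use.

      def run_load_phase(rate, duration):
          """Placeholder for the test tool's control interface
          (hypothetical); a real tool would offer load at `rate` flows
          per second for `duration` seconds and report completions."""
          return rate

      def session_rate_ramp(expected_max_fps, phase_seconds=30,
                            start_pct=10, stop_pct=110, step_pct=10,
                            wire_speed_limited=False):
          """Step the attempted application flow rate per Section 4.1.3."""
          # When the expected maximum is bounded by the physical link,
          # stop at 100% ("wire-speed performance") instead of 110%.
          top_pct = 100 if wire_speed_limited else stop_pct
          results = []
          for pct in range(start_pct, top_pct + step_pct, step_pct):
              attempt_rate = expected_max_fps * pct / 100.0
              completed = run_load_phase(rate=attempt_rate,
                                         duration=phase_seconds)
              results.append((pct, attempt_rate, completed))
          return results

      # Example: ramp toward an expected maximum of 10,000 flows/s.
      for pct, attempted, _completed in session_rate_ramp(10_000):
          print(f"{pct:4d}%  attempted {attempted:8.0f} flows/s")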
4.1.4.  Measurement

   The following metrics MAY be determined from this test, and SHOULD
   be observed for each application protocol within the traffic mix:

4.1.4.1.  Maximum Application Flow Rate

   The test tool SHOULD report the maximum rate at which application
   flows were completed, as defined by RFC 2647 [5], Section 3.7.  This
   rate SHOULD be reported individually for each application protocol
   present within the traffic mix.

4.1.4.2.  Application Flow Duration

   The test tool SHOULD report the minimum, maximum, and average
   application flow duration, as defined by RFC 2647 [5], Section 3.9.
   This duration SHOULD be reported individually for each application
   protocol present within the traffic mix.

4.1.4.3.  Application Efficiency

   The test tool SHOULD report the application efficiency, similarly
   defined for TCP by RFC 6349 [13].

                        Transmitted Bytes - Retransmitted Bytes
   App Efficiency % =  -----------------------------------------  X 100
                                  Transmitted Bytes

           Figure 2: Application Efficiency Percent Calculation

   Note that an efficiency of less than 100% does not necessarily imply
   noticeably degraded performance, since certain applications utilize
   algorithms to maintain a quality user experience in the face of data
   loss.
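   The calculation in Figure 2 translates directly into a small helper.
   The sketch below is non-normative; the byte counts are assumed to
   come from the test tool's per-protocol counters.

      def application_efficiency(transmitted_bytes, retransmitted_bytes):
          """Application efficiency percentage per Figure 2."""
          if transmitted_bytes == 0:
              raise ValueError("no bytes transmitted")
          return ((transmitted_bytes - retransmitted_bytes)
                  / transmitted_bytes * 100.0)

      # Example: 1,000,000 bytes sent, 2,500 of them retransmitted.
      assert round(application_efficiency(1_000_000, 2_500), 2) == 99.75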
4.1.4.4.  Application Flow Latency

   The test tool SHOULD report the minimum, maximum, and average amount
   of time an application flow member takes to traverse the DUT, as
   defined by RFC 1242 [3], Section 3.8.  This value SHOULD be reported
   individually for each application protocol present within the
   traffic mix.

4.2.  Application Throughput

4.2.1.  Objective

   To determine the maximum rate at which a device is able to forward
   bits when using application flows as defined in the previous
   sections.

4.2.2.  Setup Parameters

   The following parameters SHOULD be used and reported for all tests:

4.2.2.1.  Parameters

   The same parameters as described in Section 4.1.2 SHOULD be used.

4.2.3.  Procedure

   This test will attempt to send application flows through the device
   at a flow rate of 30% of the maximum observed in Section 4.1.  This
   procedure MAY be repeated, with the results from each iteration
   averaged together.

4.2.4.  Measurement

   The following metrics MAY be determined from this test, and SHOULD
   be observed for each application protocol within the traffic mix:

4.2.4.1.  Maximum Throughput

   The test tool SHOULD report the minimum, maximum, and average
   application throughput.

4.2.4.2.  Maximum Application Flow Rate

   The test tool SHOULD report the maximum rate at which application
   flows were completed, as defined by RFC 2647 [5], Section 3.7.  This
   rate SHOULD be reported individually for each application protocol
   present within the traffic mix.

4.2.4.3.  Application Flow Duration

   The test tool SHOULD report the minimum, maximum, and average
   application flow duration, as defined by RFC 2647 [5], Section 3.9.
   This duration SHOULD be reported individually for each application
   protocol present within the traffic mix.

4.2.4.4.  Application Efficiency

   The test tool SHOULD report the application efficiency as defined in
   Section 4.1.4.3.

4.2.4.5.  Packet Loss

   The test tool SHOULD report the number of packets lost or dropped
   from source to destination.

4.2.4.6.  Application Flow Latency

   The test tool SHOULD report the minimum, maximum, and average amount
   of time an application flow member takes to traverse the DUT, as
   defined by RFC 1242 [3], Section 3.8.  This value SHOULD be reported
   individually for each application protocol present within the
   traffic mix.

4.3.  Malformed Traffic Handling

4.3.1.  Objective

   To determine the effects on performance and stability that malformed
   traffic may have on the DUT.

4.3.2.  Setup Parameters

   The same transport-layer and application-layer parameters previously
   specified in Section 4.1.2 and Section 4.2.2 SHOULD be used.

4.3.3.  Procedure

   This test utilizes the procedures specified previously in
   Section 4.1.3 and Section 4.2.3.  When performing those procedures,
   the tester SHOULD generate malformed traffic at all protocol layers.
   This is commonly known as fuzzed traffic.  Fuzzing techniques
   generally modify portions of packets, including checksum errors,
   invalid protocol options, and improper protocol conformance.

   The process by which the tester SHOULD generate the malformed
   traffic is outlined in detail in Appendix B.

4.3.4.  Measurement

   For each protocol present in the traffic mix, the metrics specified
   by Section 4.1.4 and Section 4.2.4 MAY be determined.  This data may
   be used to ascertain the effects of fuzzed traffic on the DUT.

5.  IANA Considerations

   This memo includes no request to IANA.

   All drafts are required to have an IANA Considerations section (see
   the update of RFC 2434 [15] for a guide).  If the draft does not
   require IANA to do anything, the section contains an explicit
   statement that this is the case (as above).  If there are no
   requirements for IANA, the section will be removed during conversion
   into an RFC by the RFC Editor.

6.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization using controlled stimuli in a laboratory
   environment, with dedicated address space and the other constraints
   of RFC 2544 [2].

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or mis-route traffic to the test
   management network.

7.  References

7.1.  Normative References

   [1]   Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
         Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986,
         January 2005.

   [2]   Bradner, S. and J. McQuaid, "Benchmarking Methodology for
         Network Interconnect Devices", RFC 2544, March 1999.

   [3]   Bradner, S., "Benchmarking Terminology for Network
         Interconnection Devices", RFC 1242, July 1991.

   [4]   Hickman, B., Newman, D., Tadjudin, S., and T. Martin,
         "Benchmarking Methodology for Firewall Performance", RFC 3511,
         April 2003.

   [5]   Newman, D., "Benchmarking Terminology for Firewall
         Performance", RFC 2647, August 1999.

   [6]   Bradner, S., "Key words for use in RFCs to Indicate
         Requirement Levels", BCP 14, RFC 2119, March 1997.

   [7]   Carpenter, B. and S. Brim, "Middleboxes: Taxonomy and Issues",
         RFC 3234, February 2002.

   [8]   Popoviciu, C., Hamza, A., Van de Velde, G., and D. Dugatkin,
         "IPv6 Benchmarking Methodology for Network Interconnect
         Devices", RFC 5180, May 2008.

   [9]   Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow
         Measurement: Architecture", RFC 2722, October 1999.

   [10]  Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G., and E.
         Lear, "Address Allocation for Private Internets", BCP 5,
         RFC 1918, February 1996.

   [11]  Duke, M., Braden, R., Eddy, W., and E. Blanton, "A Roadmap for
         Transmission Control Protocol (TCP) Specification Documents",
         RFC 4614, September 2006.

   [12]  Mathis, M. and M. Allman, "A Framework for Defining Empirical
         Bulk Transfer Capacity Metrics", RFC 3148, July 2001.

   [13]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
         "Framework for TCP Throughput Testing", RFC 6349, August 2011.

7.2.  Informative References

   [14]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
         Description Protocol", RFC 4566, July 2006.

   [15]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
         Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.
Appendix A.  Example Traffic Mix

   This appendix shows an example case of a protocol mix that may be
   used with this methodology.  This mix closely represents the
   research published by Sandvine in their biannual report for the
   first half of 2012 on North American fixed access service provider
   networks.

   +------------+------------------+--------------------+--------+
   | Direction  | Application Flow | Options            | Value  |
   +------------+------------------+--------------------+--------+
   | Upstream   | BitTorrent       |                    |        |
   |            |                  | Avg Flow Size (L7) | 512 MB |
   |            |                  | Flow Percentage    | 44.4%  |
   |            | HTTP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 7.3%   |
   |            | Skype            |                    |        |
   |            |                  | Avg Flow Size (L7) | 8 MB   |
   |            |                  | Flow Percentage    | 4.9%   |
   |            | SSL/TLS          |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 3.2%   |
   |            | Netflix          |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 kB |
   |            |                  | Flow Percentage    | 3.1%   |
   |            | PPStream         |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 2.2%   |
   |            | YouTube          |                    |        |
   |            |                  | Avg Flow Size (L7) | 4 MB   |
   |            |                  | Flow Percentage    | 1.9%   |
   |            | Facebook         |                    |        |
   |            |                  | Avg Flow Size (L7) | 2 MB   |
   |            |                  | Flow Percentage    | 1.9%   |
   |            | Teredo           |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 1.2%   |
   |            | Apple iMessage   |                    |        |
   |            |                  | Avg Flow Size (L7) | 40 kB  |
   |            |                  | Flow Percentage    | 1.1%   |
   |            | Bulk TCP         |                    |        |
   |            |                  | Avg Flow Size (L7) | 128 kB |
   |            |                  | Flow Percentage    | 28.8%  |
   | Downstream | Netflix          |                    |        |
   |            |                  | Avg Flow Size (L7) | 512 MB |
   |            |                  | Flow Percentage    | 32.9%  |
   |            | YouTube          |                    |        |
   |            |                  | Avg Flow Size (L7) | 5 MB   |
   |            |                  | Flow Percentage    | 13.8%  |
   |            | HTTP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 1 MB   |
   |            |                  | Flow Percentage    | 12.1%  |
   |            | BitTorrent       |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 MB |
   |            |                  | Flow Percentage    | 6.3%   |
   |            | iTunes           |                    |        |
   |            |                  | Avg Flow Size (L7) | 32 MB  |
   |            |                  | Flow Percentage    | 3.8%   |
   |            | Flash Video      |                    |        |
   |            |                  | Avg Flow Size (L7) | 100 MB |
   |            |                  | Flow Percentage    | 2.6%   |
   |            | MPEG             |                    |        |
   |            |                  | Avg Flow Size (L7) | 100 MB |
   |            |                  | Flow Percentage    | 2.0%   |
   |            | RTMP             |                    |        |
   |            |                  | Avg Flow Size (L7) | 50 MB  |
   |            |                  | Flow Percentage    | 2.0%   |
   |            | Hulu             |                    |        |
   |            |                  | Avg Flow Size (L7) | 300 MB |
   |            |                  | Flow Percentage    | 1.8%   |
   |            | SSL/TLS          |                    |        |
   |            |                  | Avg Flow Size (L7) | 256 kB |
   |            |                  | Flow Percentage    | 1.6%   |
   |            | Bulk TCP         |                    |        |
   |            |                  | Avg Flow Size (L7) | 500 kB |
   |            |                  | Flow Percentage    | 21.1%  |
   +------------+------------------+--------------------+--------+

                    Table 1: Example Traffic Pattern

Appendix B.  Malformed Traffic Algorithm

   Each application flow will be broken into multiple transport
   segments, IP packets, and Ethernet frames.  The malformed traffic
   algorithm looks very similar to the IP Stack Integrity Checker
   project at http://isic.sourceforge.net.

   The algorithm is very simple and starts by defining each of the
   fields within the TCP/IP stack that will be malformed during
   transmission.  The following table illustrates the Ethernet, IPv4,
   IPv6, TCP, and UDP fields which are able to be malformed by the
   algorithm.  The first column lists the protocol, the second column
   shows the actual header field name, with the third column showing
   the percentage of packets that should have the field modified by the
   malformation algorithm.

   +--------------+--------------------------+-------------+
   | Protocol     | Header Field             | Malformed % |
   +--------------+--------------------------+-------------+
   | Total Frames |                          | 1%          |
   | Ethernet     |                          |             |
   |              | Destination MAC          | 0%          |
   |              | Source MAC               | 1%          |
   |              | Ethertype                | 1%          |
   |              | CRC                      | 1%          |
   | IP Version 4 |                          |             |
   |              | Version                  | 1%          |
   |              | IHL                      | 1%          |
   |              | Type of Service          | 1%          |
   |              | Total Length             | 1%          |
   |              | Identification           | 1%          |
   |              | Flags                    | 1%          |
   |              | Fragment Offset          | 1%          |
   |              | Time to Live             | 1%          |
   |              | Protocol                 | 1%          |
   |              | Header Checksum          | 1%          |
   |              | Source Address           | 1%          |
   |              | Destination Address      | 1%          |
   |              | Options                  | 1%          |
   |              | Padding                  | 1%          |
   | UDP          |                          |             |
   |              | Source Port              | 1%          |
   |              | Destination Port         | 1%          |
   |              | Length                   | 1%          |
   |              | Checksum                 | 1%          |
   | TCP          |                          |             |
   |              | Source Port              | 1%          |
   |              | Destination Port         | 1%          |
   |              | Sequence Number          | 1%          |
   |              | Acknowledgement Number   | 1%          |
   |              | Data Offset              | 1%          |
   |              | Reserved(3 bit)          | 1%          |
   |              | Flags(9 bit)             | 1%          |
   |              | Window Size              | 1%          |
   |              | Checksum                 | 1%          |
   |              | Urgent Pointer           | 1%          |
   |              | Options(Variable Length) | 1%          |
   +--------------+--------------------------+-------------+

                   Table 2: Malformed Header Values

   This algorithm is to be used across the regular application flows
   used throughout the rest of the methodology.  As each frame is
   emitted from the test tool, a pseudo-random number generator will
   indicate whether the frame is to be malformed by creating a number
   between 0 and 100.  If the number is less than the percentage
   defined in the table, then that frame will be malformed.  If the
   frame is to be malformed, then each of the headers in the table
   present within the frame will follow the same process.  If it is
   determined that a header field should be malformed, the same
   pseudo-random number generator will be used to create a random
   number for the specified header field.
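   The selection process above can be sketched in a few lines of
   Python.  This is a non-normative illustration: the field list is an
   abbreviated subset of Table 2, and corrupt() is a hypothetical
   placeholder for the test tool's actual field-mutation routine.

      import random

      FRAME_MALFORM_PCT = 1.0  # "Total Frames" row of Table 2

      # Per-field percentages from Table 2 (subset shown; all remaining
      # fields are 1% except the destination MAC, which is 0%).
      FIELD_MALFORM_PCT = {
          "eth.dst_mac": 0.0,
          "eth.src_mac": 1.0,
          "ip.ttl": 1.0,
          "ip.header_checksum": 1.0,
          "tcp.flags": 1.0,
          "udp.length": 1.0,
      }

      def corrupt(frame, field_name):
          """Hypothetical mutation: overwrite the field with a random
          byte value; a real tool would apply a field-specific change."""
          frame[field_name] = random.randrange(256)

      def maybe_malform(frame, rng=random):
          """Apply the Appendix B selection algorithm to one frame."""
          if rng.uniform(0, 100) >= FRAME_MALFORM_PCT:
              return frame                    # frame passes unmodified
          for field_name, pct in FIELD_MALFORM_PCT.items():
              if field_name in frame and rng.uniform(0, 100) < pct:
                  corrupt(frame, field_name)  # malform this header field
          return frame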
Authors' Addresses

   Mike Hamilton
   BreakingPoint Systems
   Austin, TX  78717
   US

   Phone: +1 512 636 2303
   Email: mhamilton@breakingpoint.com


   Sarah Banks
   Cisco Systems
   San Jose, CA  95134
   US

   Email: sabanks@cisco.com