1 Network Working Group B. Constantine 2 Internet-Draft JDSU 3 Intended status: Informational G. Forget 4 Expires: May 14, 2011 Bell Canada (Ext. Consultant) 5 Rudiger Geib 6 Deutsche Telekom 7 Reinhard Schrage 8 Schrage Consulting 10 November 14, 2010 12 Framework for TCP Throughput Testing 13 draft-ietf-ippm-tcp-throughput-tm-08.txt 15 Abstract 17 This framework describes a methodology for measuring end-to-end TCP 18 throughput performance in a managed IP network. The intention is to 19 provide a practical methodology to validate TCP layer performance. 20 The goal is to provide a better indication of the user experience. 21 In this framework, various TCP and IP parameters are identified and 22 should be tested as part of a managed IP network verification. 24 Requirements Language 26 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 27 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 28 document are to be interpreted as described in RFC 2119 [RFC2119]. 30 Status of this Memo 32 This Internet-Draft is submitted in full conformance with the 33 provisions of BCP 78 and BCP 79. 35 Internet-Drafts are working documents of the Internet Engineering 36 Task Force (IETF). Note that other groups may also distribute 37 working documents as Internet-Drafts. The list of current Internet- 38 Drafts is at http://datatracker.ietf.org/drafts/current/. 40 Internet-Drafts are draft documents valid for a maximum of six months 41 and may be updated, replaced, or obsoleted by other documents at any 42 time. It is inappropriate to use Internet-Drafts as reference 43 material or to cite them other than as "work in progress." 45 This Internet-Draft will expire on May 14, 2011. 47 Copyright Notice 49 Copyright (c) 2010 IETF Trust and the persons identified as the 50 document authors. All rights reserved. 52 This document is subject to BCP 78 and the IETF Trust's Legal 53 Provisions Relating to IETF Documents 54 (http://trustee.ietf.org/license-info) in effect on the date of 55 publication of this document. Please review these documents 56 carefully, as they describe your rights and restrictions with respect 57 to this document.
Code Components extracted from this document must 58 include Simplified BSD License text as described in Section 4.e of 59 the Trust Legal Provisions and are provided without warranty as 60 described in the Simplified BSD License. 62 Table of Contents 64 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 65 1.1 Test Set-up and Terminology . . . . . . . . . . . . . . . 4 66 2. Scope and Goals of this methodology. . . . . . . . . . . . . . 5 67 2.1 TCP Equilibrium. . . . . . . . . . . . . . . . . . . . . . 6 68 3. TCP Throughput Testing Methodology . . . . . . . . . . . . . . 7 69 3.1 Determine Network Path MTU . . . . . . . . . . . . . . . . 9 70 3.2. Baseline Round Trip Time and Bandwidth . . . . . . . . . . 10 71 3.2.1 Techniques to Measure Round Trip Time . . . . . . . . 10 72 3.2.2 Techniques to Measure end-to-end Bandwidth. . . . . . 11 73 3.3. TCP Throughput Tests . . . . . . . . . . . . . . . . . . . 12 74 3.3.1 Calculate Ideal TCP Receive Window Size. . . . . . . . 12 75 3.3.2 Metrics for TCP Throughput Tests . . . . . . . . . . . 15 76 3.3.3 Conducting the TCP Throughput Tests. . . . . . . . . . 18 77 3.3.4 Single vs. Multiple TCP Connection Testing . . . . . . 19 78 3.3.5 Interpretation of the TCP Throughput Results . . . . . 20 79 3.4. Traffic Management Tests . . . . . . . . . . . . . . . . . 20 80 3.4.1 Traffic Shaping Tests. . . . . . . . . . . . . . . . . 21 81 3.4.1.1 Interpretation of Traffic Shaping Test Results. . . 21 82 3.4.2 RED Tests. . . . . . . . . . . . . . . . . . . . . . . 22 83 3.4.2.1 Interpretation of RED Results . . . . . . . . . . . 23 84 4. Security Considerations . . . . . . . . . . . . . . . . . . . 23 85 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23 86 6. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 23 87 7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 88 7.1 Normative References . . . . . . . . . . . . . . . . . . . 24 89 7.2 Informative References . . . . . . . . . . . . . . . . . . 24 91 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25 93 1. Introduction 95 Network providers are coming to the realization that Layer 2/3 96 testing is not enough to adequately ensure end-user's satisfaction. 97 An SLA (Service Level Agreement) is provided to business customers 98 and is generally based upon Layer 2/3 criteria such as access rate, 99 latency, packet loss and delay variations. On the other hand, 100 measuring TCP throughput provides meaningful results with respect to 101 user experience. Thus, the network provider community desires to 102 measure IP network throughput performance at the TCP layer. 104 Additionally, business enterprise customers seek to conduct 105 repeatable TCP throughput tests between locations. Since these 106 enterprises rely on the networks of the providers, a common test 107 methodology with predefined metrics will benefit both parties. 109 Note that the primary focus of this methodology is managed business 110 class IP networks; i.e. those Ethernet terminated services for which 111 businesses are provided an SLA from the network provider. End-users 112 with "best effort" access between locations can use this methodology, 113 but this framework and its metrics are intended to be used in a 114 predictable managed IP service environment. 116 So the intent behind this document is to define a methodology for 117 testing sustained TCP layer performance. 
In this document, the 118 maximum achievable TCP throughput is that amount of data per unit 119 time that TCP transports when trying to reach Equilibrium, i.e. 120 after the initial slow start and congestion avoidance phases. We 121 refer to this as the maximum achievable TCP Throughput for the TCP 122 connection(s). 124 TCP uses a congestion window (TCP CWND) to determine how many 125 packets it can send at one time. A larger TCP CWND permits a higher 126 throughput. TCP "slow start" and "congestion avoidance" algorithms 127 together determine the TCP CWND size. The Maximum TCP CWND size is 128 also limited by the buffer space allocated by the kernel for each 129 socket. For each socket, there is a default buffer size that can be 130 changed by the program using a system library call just before 131 opening the socket. There is also a kernel enforced maximum buffer 132 size. This buffer size can be adjusted at both ends of the socket 133 (send and receive). In order to obtain the maximum throughput, it 134 is critical to use optimal TCP Send and Receive Socket Buffer sizes 135 as well as the optimal TCP Receive Window size. 137 There are many variables to consider when conducting a TCP throughput 138 test and this methodology focuses on the most common: 139 - Path MTU and Maximum Segment Size (MSS) 140 - RTT and Bottleneck BW 141 - Ideal TCP Receive Window (including Ideal Receive Socket Buffer) 142 - Ideal Send Socket Buffer 143 - TCP Congestion Window (TCP CWND) 144 - Single Connection and Multiple Connections testing 145 This methodology proposes TCP testing that should be performed in 146 addition to traditional Layer 2/3 type tests. Layer 2/3 tests are 147 required to verify the integrity of the network before conducting TCP 148 tests. Examples include iperf (UDP mode) or manual packet layer test 149 techniques where packet throughput, loss, and delay measurements are 150 conducted. When available, standardized testing similar to RFC 2544 151 [RFC2544] but adapted for use in operational networks may be used. 152 Note: RFC 2544 was never meant to be used outside a lab environment. 154 1.1 Test Set-up and Terminology 156 This section provides a general overview of the test configuration 157 for this methodology. The test is intended to be conducted on an 158 end-to-end operational and managed IP network. A multitude of 159 network architectures and topologies can be tested. The following 160 set-up diagram is very general and only illustrates the 161 segmentation within end user and network provider domains. 163 Common terms used in the test methodology are: 165 - Bottleneck Bandwidth (BB), lowest bandwidth along the complete 166 path. Bottleneck Bandwidth and Bandwidth are used synonymously 167 in this document. Most of the time the Bottleneck Bandwidth is 168 in the access portion of the wide area network (CE - PE). 169 - Customer Provided Equipment (CPE), refers to customer owned 170 equipment (routers, switches, computers, etc.) 171 - Customer Edge (CE), refers to provider owned demarcation device. 172 - End-user: The business enterprise customer. For the purposes of 173 conducting TCP throughput tests, this may be the IT department. 174 - Network Under Test (NUT), refers to the tested IP network path. 175 - Provider Edge (PE), refers to provider's distribution equipment. 176 - P (Provider), refers to provider core network equipment. 177 - Round-Trip Time (RTT), refers to Layer 4 back and forth delay.
178 - Round-Trip Delay (RTD), refers to Layer 1 back and forth delay. 179 - TCP Throughput Test Device (TCP TTD), refers to compliant TCP 180 host that generates traffic and measures metrics as defined in 181 this methodology. i.e. a dedicated communications test instrument. 183 +----+ +----+ +----+ +----+ +---+ +---+ +----+ +----+ +----+ +----+ 184 | TCP|-| CPE|-| CE |--| PE |-| P |--| P |-| PE |--| CE |-| CPE|-| TCP| 185 | TTD| | | | |BB| | | | | | | |BB| | | | | TTD| 186 +----+ +----+ +----+ +----+ +---+ +---+ +----+ +----+ +----+ +----+ 187 <------------------------ NUT ------------------------> 188 R >-----------------------------------------------------------| 189 T | 190 T <-----------------------------------------------------------| 192 Note that the NUT may consist of a variety of devices including but 193 not limited to, load balancers, proxy servers or WAN acceleration 194 devices. The detailed topology of the NUT should be well understood 195 when conducting the TCP throughput tests, although this methodology 196 makes no attempt to characterize specific network architectures. 198 2. Scope and Goals of this Methodology 200 Before defining the goals, it is important to clearly define the 201 areas that are out-of-scope. 203 - This methodology is not intended to predict the TCP throughput 204 during the transient stages of a TCP connection, such as the initial 205 slow start. 207 - This methodology is not intended to definitively benchmark TCP 208 implementations of one OS to another, although some users may find 209 some value in conducting qualitative experiments. 211 - This methodology is not intended to provide detailed diagnosis 212 of problems within end-points or within the network itself as 213 related to non-optimal TCP performance, although a results 214 interpretation section for each test step may provide insight in 215 regards with potential issues. 217 - This methodology does not propose to operate permanently with high 218 measurement loads. TCP performance and optimization within 219 operational networks may be captured and evaluated by using data 220 from the "TCP Extended Statistics MIB" [RFC4898]. 222 - This methodology is not intended to measure TCP throughput as part 223 of an SLA, or to compare the TCP performance between service 224 providers or to compare between implementations of this methodology 225 in dedicated communications test instruments. 227 In contrast to the above exclusions, a primary goal is to define a 228 method to conduct a practical, end-to-end assessment of sustained 229 TCP performance within a managed business class IP network. Another 230 key goal is to establish a set of "best practices" that a non-TCP 231 expert should apply when validating the ability of a managed network 232 to carry end-user TCP applications. 234 Other specific goals are to : 236 - Provide a practical test approach that specifies IP hosts 237 configurable TCP parameters such as TCP Receive Window size, Socket 238 Buffer size, MSS (Maximum Segment Size), number of connections, and 239 how these affect the outcome of TCP performance over a network. 240 See section 3.3.3. 242 - Provide specific test conditions like link speed, RTT, TCP Receive 243 Window size, Socket Buffer size and maximum achievable TCP throughput 244 when trying to reach TCP Equilibrium. For guideline purposes, 245 provide examples of test conditions and their maximum achievable 246 TCP throughput. 
Section 2.1 provides specific details concerning the 247 definition of TCP Equilibrium within this methodology while section 3 248 provides specific test conditions with examples. 250 - Define three (3) basic metrics to compare the performance of TCP 251 connections under various network conditions. See section 3.3.2. 253 - In test situations where the recommended procedure does not yield 254 the maximum achievable TCP throughput results, this methodology 255 provides some possible areas within the end host or network that 256 should be considered for investigation. Again, this 257 methodology is not intended to provide a detailed diagnosis of these 258 issues. See section 3.3.5. 260 2.1 TCP Equilibrium 262 TCP connections have three (3) fundamental congestion window phases 263 as documented in [RFC5681]. 265 These 3 phases are: 266 1 - The Slow Start phase, which occurs at the beginning of a TCP 267 transmission or after a retransmission time out. 269 2 - The Congestion Avoidance phase, during which TCP ramps up to 270 establish the maximum attainable throughput on an end-to-end network 271 path. Retransmissions are a natural by-product of the TCP congestion 272 avoidance algorithm as it seeks to achieve maximum throughput. 274 3 - The Retransmission Time-out phase, which could include Fast 275 Retransmit (Tahoe) or Fast Recovery (Reno & New Reno). When multiple 276 packet losses occur, the Congestion Avoidance phase transitions to Fast 277 Retransmit or Fast Recovery, depending upon the TCP implementation. 278 If a Time-Out occurs, TCP transitions back to the Slow Start phase. 280 The following diagram depicts these 3 phases. 282 | Trying to reach TCP Equilibrium >>>>>>>>>>>>> 283 /\ | High ssthresh TCP CWND 3 284 /\ | Loss Event * halving Retransmission 285 /\ | * \ upon loss Time-Out Adjusted 286 /\ | * \ /\ _______ ssthresh 287 /\ | * \ / \ /M-Loss | * 288 TCP | * 2 \/ \ / Events |1 * 289 Through- | * Congestion\ / |Slow * 290 put | 1 * Avoidance \/ |Start * 291 | Slow * Half | * 292 | Start * TCP CWND * 293 |___*_______________________Minimum TCP CWND after Time-Out_ 294 Time >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 295 Note: ssthresh = Slow Start threshold. 297 Through the above 3 phases, TCP is trying to reach Equilibrium, but 298 since packet loss is currently its only available feedback indicator, 299 TCP will never reach that goal. However, a well tuned (and managed) 300 IP network with well tuned IP hosts and applications should perform 301 very close to TCP Equilibrium and to the BB (Bottleneck Bandwidth). 303 This TCP methodology provides guidelines to measure the maximum 304 achievable TCP throughput or maximum TCP sustained rate obtained 305 after TCP CWND has stabilized to an optimal value. All maximum 306 achievable TCP throughputs specified in section 3 are with respect to 307 this condition. 309 It is important to clarify the interaction between the sender's Send 310 Socket Buffer and the receiver's advertised TCP Receive Window. TCP 311 test programs such as iperf, ttcp, etc. allow the sender to control 312 the quantity of TCP Bytes transmitted and unacknowledged (in-flight), 313 commonly referred to as the Send Socket Buffer. This is done 314 independently of the TCP Receive Window size advertised by the 315 receiver. Implications for the capabilities of the Throughput Test 316 Device (TTD) are covered at the end of section 3. 318 3.
TCP Throughput Testing Methodology 320 As stated earlier in section 1, it is considered best practice to 321 verify the integrity of the network by conducting Layer2/3 tests such 322 as [RFC2544] or other methods of network stress tests. Although, it 323 is important to mention here that RFC 2544 was never meant to be used 324 outside a lab environment. 326 If the network is not performing properly in terms of packet loss, 327 jitter, etc. then the TCP layer testing will not be meaningful. A 328 dysfunctional network will not reach close enough to TCP Equilibrium 329 to provide optimal TCP throughputs with the available bandwidth. 331 TCP Throughput testing may require cooperation between the end user 332 customer and the network provider. In a Layer 2/3 VPN architecture, 333 the testing should be conducted either on the CPE or on the CE device 334 and not on the PE (Provider Edge) router. 336 The following represents the sequential order of steps for this 337 testing methodology: 339 1. Identify the Path MTU. Packetization Layer Path MTU Discovery 340 or PLPMTUD, [RFC4821], MUST be conducted to verify the network path 341 MTU. Conducting PLPMTUD establishes the upper limit for the MSS to 342 be used in subsequent steps. 344 2. Baseline Round Trip Time and Bandwidth. This step establishes the 345 inherent, non-congested Round Trip Time (RTT) and the bottleneck 346 bandwidth of the end-to-end network path. These measurements are 347 used to provide estimates of the ideal TCP Receive Window and Send 348 Socket Buffer sizes that SHOULD be used in subsequent test steps. 349 These measurements reference [RFC2681] and [RFC4898] to measure RTD 350 and the associated RTT. 352 3. TCP Connection Throughput Tests. With baseline measurements 353 of Round Trip Time and bottleneck bandwidth, single and multiple TCP 354 connection throughput tests SHOULD be conducted to baseline network 355 performance expectations. 357 4. Traffic Management Tests. Various traffic management and queuing 358 techniques can be tested in this step, using multiple TCP 359 connections. Multiple connections testing should verify that the 360 network is configured properly for traffic shaping versus policing, 361 various queuing implementations and RED. 363 Important to note are some of the key characteristics and 364 considerations for the TCP test instrument. The test host may be a 365 standard computer or a dedicated communications test instrument. 366 In both cases, they must be capable of emulating both client and 367 server. 369 The following criteria should be considered when selecting whether 370 the TCP test host can be a standard computer or has to be a dedicated 371 communications test instrument: 373 - TCP implementation used by the test host, OS version, i.e. Linux OS 374 kernel using TCP Reno, TCP options supported, etc. These will 375 obviously be more important when using dedicated communications test 376 instruments where the TCP implementation may be customized or tuned 377 to run in higher performance hardware. When a compliant TCP TTD is 378 used, the TCP implementation MUST be identified in the test results. 379 The compliant TCP TTD should be usable for complete end-to-end 380 testing through network security elements and should also be usable 381 for testing network sections. 383 - More important, the TCP test host MUST be capable to generate 384 and receive stateful TCP test traffic at the full link speed of the 385 network under test. 
Stateful TCP test traffic means that the test 386 host MUST fully implement a TCP stack; this is generally a comment 387 aimed at dedicated communications test equipments which sometimes 388 "blast" packets with TCP headers. As a general rule of thumb, testing 389 TCP throughput at rates greater than 100 Mbit/sec MAY require high 390 performance server hardware or dedicated hardware based test tools. 392 - A compliant TCP Throughput Test Device MUST allow adjusting both 393 Send Socket Buffer and TCP Receive Window sizes. The Receive Socket 394 Buffer MUST be large enough to accommodate the TCP Receive Window. 396 - Measuring RTT and retransmissions per connection will generally 397 require a dedicated communications test instrument. In the absence of 398 dedicated hardware based test tools, these measurements may need to 399 be conducted with packet capture tools, i.e. conduct TCP throughput 400 tests and analyze RTT and retransmission results in packet captures. 401 Another option may be to use "TCP Extended Statistics MIB" per 402 [RFC4898]. 404 - The RFC4821 PLPMTUD test SHOULD be conducted with a dedicated 405 tester which exposes the ability to run the PLPMTUD algorithm 406 independent from the OS stack. 408 3.1. Determine Network Path MTU 410 TCP implementations should use Path MTU Discovery techniques (PMTUD). 411 PMTUD relies on ICMP 'need to frag' messages to learn the path MTU. 412 When a device has a packet to send which has the Don't Fragment (DF) 413 bit in the IP header set and the packet is larger than the Maximum 414 Transmission Unit (MTU) of the next hop, the packet is dropped and 415 the device sends an ICMP 'need to frag' message back to the host that 416 originated the packet. The ICMP 'need to frag' message includes 417 the next hop MTU which PMTUD uses to tune the TCP Maximum Segment 418 Size (MSS). Unfortunately, because many network managers completely 419 disable ICMP, this technique does not always prove reliable. 421 Packetization Layer Path MTU Discovery or PLPMTUD [RFC4821] MUST then 422 be conducted to verify the network path MTU. PLPMTUD can be used 423 with or without ICMP. The following sections provide a summary of the 424 PLPMTUD approach and an example using TCP. [RFC4821] specifies a 425 search_high and a search_low parameter for the MTU. As specified in 426 [RFC4821], 1024 Bytes is a safe value for search_low in modern 427 networks. 429 It is important to determine the links overhead along the IP path, 430 and then to select a TCP MSS size corresponding to the Layer 3 MTU. 431 For example, if the MTU is 1024 Bytes and the TCP/IP headers are 40 432 Bytes, then the MSS would be set to 984 Bytes. 434 An example scenario is a network where the actual path MTU is 1240 435 Bytes. The TCP client probe MUST be capable of setting the MSS for 436 the probe packets and could start at MSS = 984 (which corresponds 437 to an MTU size of 1024 Bytes). 439 The TCP client probe would open a TCP connection and advertise the 440 MSS as 984. Note that the client probe MUST generate these packets 441 with the DF bit set. The TCP client probe then sends test traffic 442 per a small default Send Socket Buffer size of ~8KBytes. It should 443 be kept small to minimize the possibility of congesting the network, 444 which may induce packet loss. The duration of the test should also 445 be short (10-30 seconds), again to minimize congestive effects 446 during the test. 
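As an informal illustration (not part of the methodology), the MSS/MTU arithmetic used in the probing examples of this section can be expressed as a short Python sketch; the function names are illustrative only, and the sketch assumes IPv4 and TCP headers without options, i.e. 40 Bytes of total TCP/IP overhead, as in the examples of this section:

   # Sketch only: MSS/MTU conversion assumed in this section
   # (20 Bytes IPv4 + 20 Bytes TCP = 40 Bytes of header overhead).
   TCP_IP_OVERHEAD = 40

   def mss_from_mtu(mtu_bytes):
       return mtu_bytes - TCP_IP_OVERHEAD

   def mtu_from_mss(mss_bytes):
       return mss_bytes + TCP_IP_OVERHEAD

   assert mss_from_mtu(1024) == 984     # search_low example
   assert mss_from_mtu(1500) == 1460    # search_high example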
448 In the example of a 1240 Bytes path MTU, probing with an MSS equal to 449 984 would yield a successful probe and the test client packets would 450 be successfully transferred to the test server. 452 Also note that the test client MUST verify that the MSS advertised 453 is indeed negotiated. Network devices with built-in Layer 4 454 capabilities can intercede during the connection establishment and 455 reduce the advertised MSS to avoid fragmentation. This is certainly 456 a desirable feature from a network perspective, but it can yield 457 erroneous test results if the client test probe does not confirm the 458 negotiated MSS. 460 The next test probe would use the search_high value and this would 461 be set to MSS = 1460 to correspond to a 1500 Bytes MTU. In this 462 example, the test client will retransmit based upon time-outs, since 463 no ACKs will be received from the test server. This test probe is 464 marked as a conclusive failure if none of the test packets are 465 ACK'ed. If any of the test packets are ACK'ed, network congestion 466 may be the cause and the test probe is not conclusive. Re-testing 467 at other times of the day is recommended to further isolate the cause. 469 The test is repeated until the desired granularity of the MTU is 470 discovered. The method can yield precise results at the expense of 471 probing time. One approach may be to reduce the probe size to 472 halfway between the unsuccessful search_high and the successful 473 search_low values, and likewise to raise it by half when seeking the upper limit. 475 3.2. Baseline Round Trip Time and Bandwidth 477 Before stateful TCP testing can begin, it is important to determine 478 the baseline Round Trip Time (non-congested inherent delay) and 479 bottleneck bandwidth of the end-to-end network to be tested. These 480 measurements are used to provide estimates of the ideal TCP Receive 481 Window and Send Socket Buffer sizes that SHOULD be used in subsequent 482 test steps. 484 3.2.1 Techniques to Measure Round Trip Time 486 Following the definitions used in section 1.1, Round Trip Time (RTT) 487 is the elapsed time from the clocking in of the first bit of a 488 transmitted payload packet to the receipt of the last bit of the 489 corresponding Acknowledgment. Round Trip Delay (RTD) is used 490 synonymously with twice the Link Latency. RTT measurements SHOULD use 491 techniques defined in [RFC2681] or statistics available from MIBs 492 defined in [RFC4898]. 494 The RTT SHOULD be baselined during "off-peak" hours to obtain a 495 reliable figure for inherent network latency versus additional delay 496 caused by network buffering. When sampling values of RTT over a test 497 interval, the minimum value measured SHOULD be used as the baseline 498 RTT since this will most closely estimate the inherent network 499 latency. This inherent RTT is also used to determine the Buffer 500 Delay Percentage metric, which is defined in Section 3.3.2. 501 The following list is not meant to be exhaustive, although it 502 summarizes some of the most common ways to determine round trip time. 503 The desired resolution of the measurement (e.g. msec versus usec) may 504 dictate whether the RTT measurement can be achieved with ICMP pings 505 or by a dedicated communications test instrument with precision 506 timers. 508 The objective in this section is to list several techniques 509 in order of decreasing accuracy.
511 - Use test equipment on each end of the network, "looping" the 512 far-end tester so that a packet stream can be measured back and forth 513 from end-to-end. This RTT measurement may be compatible with delay 514 measurement protocols specified in [RFC5357]. 516 - Conduct packet captures of TCP test sessions using "iperf" or FTP, 517 or other TCP test applications. By running multiple experiments, 518 packet captures can then be analyzed to estimate RTT based upon the 519 SYN -> SYN-ACK from the 3 way handshake at the beginning of the TCP 520 sessions. Although, note that Firewalls might slow down 3 way 521 handshakes, so it might be useful to compare with measured RTT later 522 on in the same capture. 524 - ICMP Pings may also be adequate to provide round trip time 525 estimations. Some limitations with ICMP Ping may include msec 526 resolution and whether the network elements are responding to pings 527 or not. Also, ICMP is often rate-limited and segregated into 528 different buffer queues, so it is not as reliable and accurate as 529 in-band measurements. 531 3.2.2 Techniques to Measure end-to-end Bandwidth 533 There are many well established techniques available to provide 534 estimated measures of bandwidth over a network. These measurements 535 SHOULD be conducted in both directions of the network, especially for 536 access networks, which may be asymmetrical. Measurements SHOULD use 537 network capacity techniques defined in [RFC5136]. 539 Before any TCP Throughput test can be done, a bandwidth measurement 540 test MUST be run with stateless IP streams(not stateful TCP) in order 541 to determine the available bandwidths in each direction. This test 542 should obviously be performed at various intervals throughout a 543 business day or even across a week. Ideally, the bandwidth test 544 should produce logged outputs of the achieved bandwidths across the 545 test interval. 547 3.3. TCP Throughput Tests 549 This methodology specifically defines TCP throughput techniques to 550 verify sustained TCP performance in a managed business IP network, as 551 defined in section 2.1. This section and others will define the 552 method to conduct these sustained TCP throughput tests and guidelines 553 for the predicted results. 555 With baseline measurements of round trip time and bandwidth 556 from section 3.2, a series of single and multiple TCP connection 557 throughput tests SHOULD be conducted to baseline network performance 558 against expectations. The number of trials and the type of testing 559 (single versus multiple connections) will vary according to the 560 intention of the test. One example would be a single connection test 561 in which the throughput achieved by large Send Socket Buffer and TCP 562 Receive Window sizes (i.e. 256KB) is to be measured. It would be 563 advisable to test performance at various times of the business day. 565 It is RECOMMENDED to run the tests in each direction independently 566 first, then run both directions simultaneously. In each case, 567 TCP Transfer Time, TCP Efficiency, and Buffer Delay Percentage MUST 568 be measured in each direction. These metrics are defined in 3.3.2. 570 3.3.1 Calculate Ideal TCP Receive Window Size 572 The ideal TCP Receive Window size can be calculated from the 573 bandwidth delay product (BDP), which is: 575 BDP (bits) = RTT (sec) x Bandwidth (bps) 577 Note that the RTT is being used as the "Delay" variable in the 578 BDP calculations. 
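As an informal illustration, the BDP calculation can be scripted as follows (a sketch only; the function name is illustrative, with RTT expressed in seconds and Bandwidth in bps, as in the formula above):

   # Sketch only: BDP in bits from baseline RTT and Bottleneck Bandwidth.
   def bdp_bits(rtt_sec, bandwidth_bps):
       return rtt_sec * bandwidth_bps

   # Example: a T3 (44.21 Mbps) with 25 msec RTT.
   print(bdp_bits(0.025, 44.21e6))      # ~1,105,250 bits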
580 Then, by dividing the BDP by 8, we obtain the "ideal" TCP Receive 581 Window size in Bytes. For optimal results, the Send Socket Buffer 582 size must be adjusted to the same value at the opposite end of the 583 network path. 585 Ideal TCP RWIN = BDP / 8 587 An example would be a T3 link with 25 msec RTT. The BDP would equal 588 ~1,105,000 bits and the ideal TCP Receive Window would be ~138 589 KBytes. 591 The following table provides some representative network Link Speeds, 592 RTT, BDP, and their associated Ideal TCP Receive Window sizes. 594 Table 3.3.1: Link Speed, RTT and calculated BDP & TCP Receive Window 596 Link Ideal TCP 597 Speed* RTT BDP Receive Window 598 (Mbps) (ms) (bits) (KBytes) 599 --------------------------------------------------------------------- 600 1.536 20 30,720 3.84 601 1.536 50 76,800 9.60 602 1.536 100 153,600 19.20 603 44.21 10 442,100 55.26 604 44.21 15 663,150 82.89 605 44.21 25 1,105,250 138.16 606 100 1 100,000 12.50 607 100 2 200,000 25.00 608 100 5 500,000 62.50 609 1,000 0.1 100,000 12.50 610 1,000 0.5 500,000 62.50 611 1,000 1 1,000,000 125.00 612 10,000 0.05 500,000 62.50 613 10,000 0.3 3,000,000 375.00 615 * Note that link speed is the bottleneck bandwidth for the NUT 617 The following serial link speeds are used: 618 - T1 = 1.536 Mbits/sec (for a B8ZS line encoding facility) 619 - T3 = 44.21 Mbits/sec (for a C-Bit Framing facility) 621 The above table illustrates the ideal TCP Receive Window size. 622 If a smaller TCP Receive Window is used, then the TCP Throughput 623 is not optimal. To calculate the Ideal TCP Throughput, the following 624 formula is used: TCP Throughput = TCP RWIN X 8 / RTT 626 An example could be a 100 Mbps IP path with 5 ms RTT and a TCP 627 Receive Window size of 16KB, then: 629 TCP Throughput = 16 KBytes X 8 bits / 5 ms. 630 TCP Throughput = 128,000 bits / 0.005 sec. 631 TCP Throughput = 25.6 Mbps. 633 Another example for a T3 using the same calculation formula is 634 illustrated on the next page: 635 TCP Throughput = TCP RWIN X 8 / RTT. 636 TCP Throughput = 16 KBytes X 8 bits / 10 ms. 637 TCP Throughput = 128,000 bits / 0.01 sec. 638 TCP Throughput = 12.8 Mbps. 640 When the TCP Receive Window size exceeds the BDP (i.e. T3 link, 641 64 KBytes TCP Receive Window on a 10 ms RTT path), the maximum frames 642 per second limit of 3664 is reached and the calculation formula is: 644 TCP Throughput = Max FPS X MSS X 8. 645 TCP Throughput = 3664 FPS X 1460 Bytes X 8 bits. 646 TCP Throughput = 42.8 Mbps 647 The following diagram compares achievable TCP throughputs on a T3 648 with Send Socket Buffer & TCP Receive Window sizes of 16KB vs. 64KB. 650 45| 651 | _______42.8M 652 40| |64KB | 653 TCP | | | 654 Throughput 35| | | 655 in Mbps | | | +-----+34.1M 656 30| | | |64KB | 657 | | | | | 658 25| | | | | 659 | | | | | 660 20| | | | | _______20.5M 661 | | | | | |64KB | 662 15| | | | | | | 663 |12.8M+-----| | | | | | 664 10| |16KB | | | | | | 665 | | | |8.5M+-----| | | | 666 5| | | | |16KB | |5.1M+-----| | 667 |_____|_____|_____|____|_____|_____|____|16KB |_____|_____ 668 10 15 25 669 RTT in milliseconds 671 The following diagram shows the achievable TCP throughput on a 25ms 672 T3 when Send Socket Buffer & TCP Receive Window sizes are increased. 
674 45| 675 | 676 40| +-----+40.9M 677 TCP | | | 678 Throughput 35| | | 679 in Mbps | | | 680 30| | | 681 | | | 682 25| | | 683 | | | 684 20| +-----+20.5M | | 685 | | | | | 686 15| | | | | 687 | | | | | 688 10| +-----+10.2M | | | | 689 | | | | | | | 690 5| +-----+5.1M | | | | | | 691 |_____|_____|______|_____|______|_____|_______|_____|_____ 692 16 32 64 128* 693 TCP Receive Window size in KBytes 695 * Note that 128KB requires [RFC1323] TCP Window scaling option. 697 3.3.2 Metrics for TCP Throughput Tests 699 This framework focuses on a TCP throughput methodology and also 700 provides several basic metrics to compare results of various 701 throughput tests. It is recognized that the complexity and 702 unpredictability of TCP makes it impossible to develop a complete 703 set of metrics that accounts for the myriad of variables (i.e. RTT 704 variation, loss conditions, TCP implementation, etc.). However, 705 these basic metrics will facilitate TCP throughput comparisons 706 under varying network conditions and between network traffic 707 management techniques. 709 The first metric is the TCP Transfer Time, which is simply the 710 measured time it takes to transfer a block of data across 711 simultaneous TCP connections. This concept is useful when 712 benchmarking traffic management techniques and where multiple 713 TCP connections are required. 715 TCP Transfer time may also be used to provide a normalized ratio of 716 the actual TCP Transfer Time versus the Ideal Transfer Time. This 717 ratio is called the TCP Transfer Index and is defined as: 719 Actual TCP Transfer Time 720 ------------------------- 721 Ideal TCP Transfer Time 723 The Ideal TCP Transfer time is derived from the network path 724 bottleneck bandwidth and various Layer 1/2/3/4 overheads associated 725 with the network path. Additionally, both the TCP Receive Window and 726 the Send Socket Buffer sizes must be tuned to equal the bandwidth 727 delay product (BDP) as described in section 3.3.1. 729 The following table illustrates the Ideal TCP Transfer time of a 730 single TCP connection when its TCP Receive Window and Send Socket 731 Buffer sizes are equal to the BDP. 733 Table 3.3.2: Link Speed, RTT, BDP, TCP Throughput, and 734 Ideal TCP Transfer time for a 100 MB File 736 Link Maximum Ideal TCP 737 Speed BDP Achievable TCP Transfer time 738 (Mbps) RTT (ms) (KBytes) Throughput(Mbps) (seconds) 739 -------------------------------------------------------------------- 740 1.536 50 9.6 1.4 571 741 44.21 25 138.2 42.8 18 742 100 2 25.0 94.9 9 743 1,000 1 125.0 949.2 1 744 10,000 0.05 62.5 9,492 0.1 746 Transfer times are rounded for simplicity. 748 For a 100MB file(100 x 8 = 800 Mbits), the Ideal TCP Transfer Time 749 is derived as follows: 751 800 Mbits 752 Ideal TCP Transfer Time = ----------------------------------- 753 Maximum Achievable TCP Throughput 755 The maximum achievable layer 2 throughput on T1 and T3 Interfaces 756 is based on the maximum frames per second (FPS) permitted by the 757 actual layer 1 speed when the MTU is 1500 Bytes. 
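The Ideal TCP Transfer Time relation above can also be illustrated with a short, informative Python sketch; the function name and values are illustrative only and are taken from the T3 row of Table 3.3.2. The FPS calculations that determine the maximum achievable TCP throughput itself follow.

   # Sketch only: Ideal TCP Transfer Time for a given file size.
   def ideal_transfer_time_sec(file_mbytes, max_achievable_tput_mbps):
       return (file_mbytes * 8.0) / max_achievable_tput_mbps

   # 100 MB file over a T3 (42.8 Mbps maximum achievable TCP throughput).
   print(ideal_transfer_time_sec(100, 42.8))   # ~18.7 sec, listed as 18 in Table 3.3.2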
759 The maximum FPS for a T1 is 127 and the calculation formula is: 760 FPS = T1 Link Speed / ((MTU + PPP + Flags + CRC16) X 8) 761 FPS = (1.536M /((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8 ))) 762 FPS = (1.536M / (1508 Bytes X 8)) 763 FPS = 1.536 Mbps / 12064 bits 764 FPS = 127 766 The maximum FPS for a T3 is 3664 and the calculation formula is: 767 FPS = T3 Link Speed / ((MTU + PPP + Flags + CRC16) X 8) 768 FPS = (44.21M /((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8 ))) 769 FPS = (44.21M / (1508 Bytes X 8)) 770 FPS = 44.21 Mbps / 12064 bits 771 FPS = 3664 773 The 1508 equates to: 775 MTU + PPP + Flags + CRC16 777 Where MTU is 1500 Bytes, PPP is 4 Bytes, Flags are 2 Bytes and CRC16 778 is 2 Bytes. 780 Then, to obtain the Maximum Achievable TCP Throughput (layer 4), we 781 simply use: MSS in Bytes X 8 bits X max FPS. 782 For a T3, the maximum TCP Throughput = 1460 Bytes X 8 bits X 3664 FPS 783 Maximum TCP Throughput = 11680 bits X 3664 FPS 784 Maximum TCP Throughput = 42.8 Mbps. 786 The maximum achievable layer 2 throughput on Ethernet Interfaces is 787 based on the maximum frames per second permitted by the IEEE802.3 788 standard when the MTU is 1500 Bytes. 790 The maximum FPS for 100M Ethernet is 8127 and the calculation is: 791 FPS = (100Mbps /(1538 Bytes X 8 bits)) 793 The maximum FPS for GigE is 81274 and the calculation formula is: 794 FPS = (1Gbps /(1538 Bytes X 8 bits)) 796 The maximum FPS for 10GigE is 812743 and the calculation formula is: 797 FPS = (10Gbps /(1538 Bytes X 8 bits)) 798 The 1538 equates to: 800 MTU + Eth + CRC32 + IFG + Preamble + SFD 802 Where MTU is 1500 Bytes, Ethernet is 14 Bytes, CRC32 is 4 Bytes, 803 IFG is 12 Bytes, Preamble is 7 Bytes and SFD is 1 Byte. 805 Note that better results could be obtained with jumbo frames on 806 GigE and 10 GigE. 808 Then, to obtain the Maximum Achievable TCP Throughput (layer 4), we 809 simply use: MSS in Bytes X 8 bits X max FPS. 810 For a 100M, the maximum TCP Throughput = 1460 B X 8 bits X 8127 FPS 811 Maximum TCP Throughput = 11680 bits X 8127 FPS 812 Maximum TCP Throughput = 94.9 Mbps. 814 To illustrate the TCP Transfer Time Index, an example would be the 815 bulk transfer of 100 MB over 5 simultaneous TCP connections (each 816 connection uploading 100 MB). In this example, the Ethernet service 817 provides a Committed Access Rate (CAR) of 500 Mbit/s. Each 818 connection may achieve different throughputs during a test and the 819 overall throughput rate is not always easy to determine (especially 820 as the number of connections increases). 822 The ideal TCP Transfer Time would be ~8 seconds, but in this example, 823 the actual TCP Transfer Time was 12 seconds. The TCP Transfer Index 824 would then be 12/8 = 1.5, which indicates that the transfer across 825 all connections took 1.5 times longer than the ideal. 827 The second metric is TCP Efficiency, which is the percentage of Bytes 828 that were not retransmitted and is defined as: 830 Transmitted Bytes - Retransmitted Bytes 831 --------------------------------------- x 100 832 Transmitted Bytes 834 Transmitted Bytes are the total number of TCP payload Bytes to be 835 transmitted which includes the original and retransmitted Bytes. This 836 metric provides a comparative measure between various QoS mechanisms 837 like traffic management or congestion avoidance. Various TCP 838 implementations like Reno, Vegas, etc. could also be compared. 
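Expressed as a short, informative Python sketch (the function name is illustrative only), the TCP Efficiency calculation is:

   # Sketch only: TCP Efficiency as defined above. transmitted_bytes
   # counts both the original and the retransmitted Bytes.
   def tcp_efficiency_pct(transmitted_bytes, retransmitted_bytes):
       return 100.0 * (transmitted_bytes - retransmitted_bytes) / transmitted_bytes

   # e.g. tcp_efficiency_pct(102000, 2000) -> ~98.0 (worked example below)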
840 As an example, if 100,000 Bytes were sent and 2,000 had to be 841 retransmitted, the TCP Efficiency should be calculated as: 843 102,000 - 2,000 844 ---------------- x 100 = 98.03% 845 102,000 847 Note that the retransmitted Bytes may have occurred more than once, 848 and these multiple retransmissions are added to the Retransmitted 849 Bytes count (and the Transmitted Bytes count). 851 The third metric is the Buffer Delay Percentage, which represents the 852 increase in RTT during a TCP throughput test with respect to 853 inherent or baseline network RTT. The baseline RTT is the round-trip 854 time inherent to the network path under non-congested conditions. 855 (See 3.2.1 for details concerning the baseline RTT measurements). 857 The Buffer Delay Percentage is defined as: 859 Average RTT during Transfer - Baseline RTT 860 ------------------------------------------ x 100 861 Baseline RTT 863 As an example, the baseline RTT for the network path is 25 msec. 864 During the course of a TCP transfer, the average RTT across the 865 entire transfer increased to 32 msec. In this example, the Buffer 866 Delay Percentage would be calculated as: 868 32 - 25 869 ------- x 100 = 28% 870 25 872 Note that the TCP Transfer Time, TCP Efficiency, and Buffer Delay 873 Percentage MUST be measured during each throughput test. Poor TCP 874 Transfer Time Indexes (TCP Transfer Time greater than Ideal TCP 875 Transfer Times) may be diagnosed by correlating with sub-optimal TCP 876 Efficiency and/or Buffer Delay Percentage metrics. 878 3.3.3 Conducting the TCP Throughput Tests 880 Several TCP tools are currently used in the network world and one of 881 the most common is "iperf". With this tool, hosts are installed at 882 each end of the network path; one acts as client and the other as 883 a server. The Send Socket Buffer and the TCP Receive Window sizes 884 of both client and server can be manually set. The achieved 885 throughput can then be measured, either uni-directionally or 886 bi-directionally. For higher BDP situations in lossy networks 887 (long fat networks or satellite links, etc.), TCP options such as 888 Selective Acknowledgment SHOULD be considered and become part of 889 the window size / throughput characterization. 891 Host hardware performance must be well understood before conducting 892 the tests described in the following sections. A dedicated 893 communications test instrument will generally be required, especially 894 for line rates of GigE and 10 GigE. A compliant TCP TTD SHOULD 895 provide a warning message when the expected test throughput will 896 exceed 10% of the network bandwidth capacity. If the throughput test 897 is expected to exceed 10% of the provider bandwidth, then the test 898 should be coordinated with the network provider. This does not 899 include the customer premise bandwidth, the 10% refers directly to 900 the provider's bandwidth (Provider Edge to Provider router). 902 The TCP throughput test should be run over a long enough duration 903 to properly exercise network buffers (greater than 30 seconds) and 904 also characterize performance at different time periods of the day. 906 3.3.4 Single vs. Multiple TCP Connection Testing 908 The decision whether to conduct single or multiple TCP connection 909 tests depends upon the size of the BDP in relation to the configured 910 TCP Receive Window sizes configured in the end-user environment. 
911 For example, if the BDP for a long fat network turns out to be 2MB, 912 then it is probably more realistic to test this network path with 913 multiple connections. Assuming typical host computer TCP Receive 914 Window Sizes of 64 KB, using 32 TCP connections would realistically 915 test this path. 917 The following table is provided to illustrate the relationship 918 between the TCP Receive Window size and the number of TCP connections 919 required to utilize the available capacity of a given BDP. For this 920 example, the network bandwidth is 500 Mbps and the RTT is 5 ms, then 921 the BDP equates to 312.5 KBytes. 923 TCP Number of TCP Connections 924 Window to fill available bandwidth 925 ------------------------------------- 926 16KB 20 927 32KB 10 928 64KB 5 929 128KB 3 931 The TCP Transfer Time metric is useful for conducting multiple 932 connection tests. Each connection should be configured to transfer 933 payloads of the same size (i.e. 100 MB), and the TCP Transfer time 934 should provide a simple metric to verify the actual versus expected 935 results. 937 Note that the TCP transfer time is the time for all connections to 938 complete the transfer of the configured payload size. From the 939 previous table, the 64KB window is considered. Each of the 5 940 TCP connections would be configured to transfer 100MB, and each one 941 should obtain a maximum of 100 Mb/sec. So for this example, the 942 100MB payload should be transferred across the connections in 943 approximately 8 seconds (which would be the ideal TCP transfer time 944 under these conditions). 946 Additionally, the TCP Efficiency metric MUST be computed for each 947 connection tested as defined in section 3.3.2. 949 3.3.5 Interpretation of the TCP Throughput Results 951 At the end of this step, the user will document the theoretical BDP 952 and a set of Window size experiments with measured TCP throughput for 953 each TCP window size. For cases where the sustained TCP throughput 954 does not equal the ideal value, some possible causes are: 956 - Network congestion causing packet loss which MAY be inferred from 957 a poor TCP Efficiency % (higher TCP Efficiency % = less packet 958 loss) 959 - Network congestion causing an increase in RTT which MAY be inferred 960 from the Buffer Delay Percentage (i.e., 0% = no increase in RTT 961 over baseline) 962 - Intermediate network devices which actively regenerate the TCP 963 connection and can alter TCP Receive Window size, MSS, etc. 964 - Rate limiting (policing). More details on traffic management 965 tests follows in section 3.4 967 3.4. Traffic Management Tests 969 In most cases, the network connection between two geographic 970 locations (branch offices, etc.) is lower than the network connection 971 to host computers. An example would be LAN connectivity of GigE 972 and WAN connectivity of 100 Mbps. The WAN connectivity may be 973 physically 100 Mbps or logically 100 Mbps (over a GigE WAN 974 connection). In the later case, rate limiting is used to provide the 975 WAN bandwidth per the SLA. 977 Traffic management techniques are employed to provide various forms 978 of QoS, the more common include: 980 - Traffic Shaping 981 - Priority queuing 982 - Random Early Discard (RED) 984 Configuring the end-to-end network with these various traffic 985 management mechanisms is a complex under-taking. For traffic shaping 986 and RED techniques, the end goal is to provide better performance to 987 bursty traffic such as TCP,(RED is specifically intended for TCP). 
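Both the traffic shaping and RED tests described in this section build upon multiple-connections testing. As a rough, informative Python sketch (the function name is illustrative only), the number of TCP connections needed to fill a given bottleneck BDP for a given window size, the relationship illustrated by the table in section 3.3.4, can be estimated as:

   import math

   # Sketch only: connections needed to fill the bottleneck BDP
   # for a given TCP Receive Window size (both in KBytes).
   def connections_to_fill(bdp_kbytes, window_kbytes):
       return math.ceil(bdp_kbytes / window_kbytes)

   # 500 Mbps bottleneck with 5 msec RTT -> BDP of 312.5 KBytes.
   for window in (16, 32, 64, 128):
       print(window, connections_to_fill(312.5, window))   # 20, 10, 5, 3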
989 This section of the methodology provides guidelines to test traffic 990 shaping and RED implementations. As in section 3.3, host hardware 991 performance must be well understood before conducting the traffic 992 shaping and RED tests. A dedicated communications test instrument will 993 generally be REQUIRED for line rates of GigE and 10 GigE. If the 994 throughput test is expected to exceed 10% of the provider bandwidth, 995 then the test should be coordinated with the network provider. This 996 does not include the customer premises bandwidth; the 10% refers to 997 the provider's bandwidth (Provider Edge to Provider router). Note 998 that GigE and 10 GigE interfaces might benefit from hold-queue 999 adjustments in order to prevent the saw-tooth TCP traffic pattern. 1001 3.4.1 Traffic Shaping Tests 1003 For services where the available bandwidth is rate limited, two (2) 1004 techniques can be used: traffic policing or traffic shaping. 1006 Simply stated, traffic policing marks and/or drops packets which 1007 exceed the SLA bandwidth (in most cases, excess traffic is dropped). 1008 Traffic shaping employs queues to smooth the bursty 1009 traffic and then sends it out within the SLA bandwidth limit (without 1010 dropping packets unless the traffic shaping queue is exhausted). 1012 Traffic shaping is generally configured for TCP data services and 1013 can provide improved TCP performance since the retransmissions are 1014 reduced, which in turn optimizes TCP throughput for the available 1015 bandwidth. Throughout this section, the rate-limited bandwidth shall 1016 be referred to as the "bottleneck bandwidth". 1018 The ability to detect proper traffic shaping is more easily diagnosed 1019 when conducting a multiple TCP connections test. Proper shaping will 1020 provide a fair distribution of the available bottleneck bandwidth, 1021 while traffic policing will not. 1023 The traffic shaping tests are built upon the concepts of multiple 1024 connections testing as defined in section 3.3.3. Calculating the BDP 1025 for the bottleneck bandwidth is first required before selecting the 1026 number of connections and Send Buffer and TCP Receive Window sizes 1027 per connection. 1029 Similar to the example in section 3.3, a typical test scenario might 1030 be: GigE LAN with a 500Mbps bottleneck bandwidth (rate limited 1031 logical interface), and 5 msec RTT. This would require five (5) TCP 1032 connections of 64 KB Send Socket Buffer and TCP Receive Window sizes 1033 to evenly fill the bottleneck bandwidth (~100 Mbps per connection). 1035 The traffic shaping test should be run over a long enough duration to 1036 properly exercise network buffers (greater than 30 seconds) and also 1037 characterize performance during different time periods of the day. 1038 The throughput of each connection MUST be logged during the entire 1039 test, along with the TCP Transfer Time, TCP Efficiency, and 1040 Buffer Delay Percentage. 1042 3.4.1.1 Interpretation of Traffic Shaping Test Results 1044 By plotting the throughput achieved by each TCP connection, the fair 1045 sharing of the bandwidth is generally very obvious when traffic 1046 shaping is properly configured for the bottleneck interface. For the 1047 previous example of 5 connections sharing 500 Mbps, each connection 1048 would consume ~100 Mbps with a smooth variation.
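Since the throughput of each connection is logged during the test, a small post-processing sketch (informative only; the function name, connection identifiers and throughput samples below are hypothetical) can summarize how evenly the bottleneck bandwidth was shared:

   # Sketch only: per-connection mean throughput from logged samples (Mbps).
   def summarize(samples_per_connection):
       means = {cid: sum(v) / len(v) for cid, v in samples_per_connection.items()}
       return means, sum(means.values())

   # Hypothetical logs for the 5-connection shaping example (~100 Mbps each).
   logs = {1: [99, 101, 100], 2: [98, 102, 100], 3: [100, 100, 100],
           4: [97, 103, 100], 5: [101, 99, 100]}
   print(summarize(logs))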
1050 If traffic policing was present on the bottleneck interface, the 1051 bandwidth sharing may not be fair and the resulting throughput plot 1052 may reveal "spikey" throughput consumption of the competing TCP 1053 connections (due to the TCP retransmissions). 1055 3.4.2 RED Tests 1057 Random Early Discard techniques are specifically targeted to provide 1058 congestion avoidance for TCP traffic. Before the network element 1059 queue "fills" and enters the tail drop state, RED drops packets at 1060 configurable queue depth thresholds. This action causes TCP 1061 connections to back-off which helps to prevent tail drop, which in 1062 turn helps to prevent global TCP synchronization. 1064 Again, rate limited interfaces may benefit greatly from RED based 1065 techniques. Without RED, TCP may not be able to achieve the full 1066 bottleneck bandwidth. With RED enabled, TCP congestion avoidance 1067 throttles the connections on the higher speed interface (i.e. LAN) 1068 and can help achieve the full bottleneck bandwidth. The burstiness 1069 of TCP traffic is a key factor in the overall effectiveness of RED 1070 techniques; steady state bulk transfer flows will generally not 1071 benefit from RED. With bulk transfer flows, network device queues 1072 gracefully throttle the effective throughput rates due to increased 1073 delays. 1075 The ability to detect proper RED configuration is more easily 1076 diagnosed when conducting a multiple TCP connections test. Multiple 1077 TCP connections provide the bursty sources that emulate the 1078 real-world conditions for which RED was intended. 1080 The RED tests also builds upon the concepts of multiple connections 1081 testing as defined in section 3.3.3. Calculating the BDP for the 1082 bottleneck bandwidth is first required before selecting the number 1083 of connections, the Send Socket Buffer size and the TCP Receive 1084 Window size per connection. 1086 For RED testing, the desired effect is to cause the TCP connections 1087 to burst beyond the bottleneck bandwidth so that queue drops will 1088 occur. Using the same example from section 3.4.1 (traffic shaping), 1089 the 500 Mbps bottleneck bandwidth requires 5 TCP connections (with 1090 window size of 64KB) to fill the capacity. Some experimentation is 1091 required, but it is recommended to start with double the number of 1092 connections to stress the network element buffers / queues (10 1093 connections for this example). 1095 The TCP TTD must be configured to generate these connections as 1096 shorter (bursty) flows versus bulk transfer type flows. These TCP 1097 bursts should stress queue sizes in the 512KB range. Again 1098 experimentation will be required; the proper number of TCP 1099 connections, the Send Socket Buffer and TCP Receive Window sizes will 1100 be dictated by the size of the network element queue. 1102 3.4.2.1 Interpretation of RED Results 1104 The default queuing technique for most network devices is FIFO based. 1105 Without RED, the FIFO based queue may cause excessive loss to all of 1106 the TCP connections and in the worst case global TCP synchronization. 1108 By plotting the aggregate throughput achieved on the bottleneck 1109 interface, proper RED operation may be determined if the bottleneck 1110 bandwidth is fully utilized. For the previous example of 10 1111 connections (window = 64 KB) sharing 500 Mbps, each connection should 1112 consume ~50 Mbps. 
If RED was not properly enabled on the interface, 1113 then the TCP connections will retransmit at a higher rate and the 1114 net effect is that the bottleneck bandwidth is not fully utilized. 1116 Another means to study non-RED versus RED implementations is to use 1117 the TCP Transfer Time metric for all of the connections. In this 1118 example, a 100 MB payload transfer should take ideally 16 seconds 1119 across all 10 connections (with RED enabled). With RED not enabled, 1120 the throughput across the bottleneck bandwidth may be greatly 1121 reduced (generally 10-20%) and the actual TCP Transfer time may be 1122 proportionally longer than the Ideal TCP Transfer time. 1124 Additionally, non-RED implementations may exhibit a lower TCP 1125 Efficiency. 1127 4. Security Considerations 1129 The security considerations that apply to any active measurement of 1130 live networks are relevant here as well. See [RFC4656] and 1131 [RFC5357]. 1133 5. IANA Considerations 1135 This document does not REQUIRE an IANA registration for ports 1136 dedicated to the TCP testing described in this document. 1138 6. Acknowledgments 1140 Thanks to Lars Eggert, Al Morton, Matt Mathis, Matt Zekauskas, 1141 Yaakov Stein, and Loki Jorgenson for many good comments and for 1142 pointing us to great sources of information pertaining to past work 1143 in the TCP capacity area. 1145 7. References 1147 7.1 Normative References 1149 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1150 Requirement Levels", BCP 14, RFC 2119, March 1997. 1152 [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. 1153 Zekauskas, "A One-way Active Measurement Protocol 1154 (OWAMP)", RFC 4656, September 2006. 1156 [RFC5681] Allman, M., Paxson, V., and Blanton, E., "TCP Congestion 1157 Control", RFC 5681, September 2009. 1159 [RFC2544] Bradner, S. and McQuaid, J., "Benchmarking Methodology for 1160 Network Interconnect Devices", RFC 2544, June 1999. 1162 [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and Babiarz, 1163 J., "A Two-Way Active Measurement Protocol (TWAMP)", 1164 RFC 5357, October 2008. 1166 [RFC4821] Mathis, M. and Heffner, J., "Packetization Layer Path MTU 1167 Discovery", RFC 4821, June 2007. 1169 Allman, M., "A Bulk 1170 Transfer Capacity Methodology for Cooperating Hosts", 1171 draft-ietf-ippm-btc-cap-00.txt (work in progress), August 2001. 1173 [RFC2681] Almes, G., Kalidindi, S., and Zekauskas, M., "A Round-trip Delay 1174 Metric for IPPM", RFC 2681, September 1999. 1176 [RFC4898] Mathis, M., Heffner, J., and Raghunarayan, R., "TCP Extended 1177 Statistics MIB", RFC 4898, May 2007. 1179 [RFC5136] Chimento, P. and Ishac, J., "Defining Network Capacity", 1180 RFC 5136, February 2008. 1182 [RFC1323] Jacobson, V., Braden, R., and Borman, D., "TCP Extensions for 1183 High Performance", RFC 1323, May 1992. 1185 7.2. Informative References 1186 Authors' Addresses 1188 Barry Constantine 1189 JDSU, Test and Measurement Division 1190 One Milestone Center Court 1191 Germantown, MD 20876-7100 1192 USA 1194 Phone: +1 240 404 2227 1195 barry.constantine@jdsu.com 1197 Gilles Forget 1198 Independent Consultant to Bell Canada. 1199 308, rue de Monaco, St-Eustache 1200 Qc. CANADA, Postal Code : J7P-4T5 1202 Phone: (514) 895-8212 1203 gilles.forget@sympatico.ca 1205 Rudiger Geib 1206 Heinrich-Hertz-Strasse (Number: 3-7) 1207 Darmstadt, Germany, 64295 1209 Phone: +49 6151 6282747 1210 Ruediger.Geib@telekom.de 1212 Reinhard Schrage 1213 Schrage Consulting 1215 Phone: +49 (0) 5137 909540 1216 reinhard@schrageconsult.com