1 Network Working Group B. Constantine 2 Internet-Draft JDSU 3 Intended status: Informational G. Forget 4 Expires: November 30, 2011 Bell Canada (Ext. Consultant) 5 Ruediger Geib 6 Deutsche Telekom 7 Reinhard Schrage 8 Schrage Consulting 10 May 31, 2011 12 Framework for TCP Throughput Testing 13 draft-ietf-ippm-tcp-throughput-tm-13.txt 15 Abstract 17 This framework describes a practical methodology for measuring end- 18 to-end TCP Throughput in a managed IP network. The goal is to provide 19 a better indication in regards to user experience. In this framework, 20 TCP and IP parameters are specified to optimize TCP throughput. 22 Requirements Language 24 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 25 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 26 document are to be interpreted as described in RFC 2119 [RFC2119]. 28 Status of this Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at http://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on November 30, 2011. 45 Copyright Notice 47 Copyright (c) 2011 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document.
Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Table of Contents 62 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 63 1.1 Terminology. . . . . . . . . . . . . . . . . . . . . . . . 4 64 1.2 TCP Equilibrium . . . . . . . . . . . . . . . . . . . . . 5 65 2. Scope and Goals . . . . . . . . . . . . . . . . . . . . . . . 6 66 3. Methodology. . . . . . . . . . . . . . . . . . . . . . . . . . 7 67 3.1 Path MTU . . . . . . . . . . . . . . . . . . . . . . . . . 9 68 3.2 Round Trip Time (RTT) and Bottleneck Bandwidth (BB). . . . 9 69 3.2.1 Measuring RTT . . . . . . . . . . . . . . . . . . . . 9 70 3.2.2 Measuring BB . . . . . . . . . . . . . . . . . . . . 10 71 3.3. Measuring TCP Throughput . . . . . . . . . . . . . . . . . 11 72 3.3.1 Minimum TCP RWND . . . . . . . . . . . . . . . . . . . 11 73 4. TCP Metrics . . . . . . . . . . . . . . . . . . . . . . . . . 14 74 4.1 Transfer Time Ratio. . . . . . . . . . . . . . . . . . . . 14 75 4.1.1 Maximum Achievable TCP Throughput calculation . . . . 15 76 4.1.2 Transfer Time and Transfer Time Ratio calculation. . . 16 77 4.2 TCP Efficiency . . . . . . . . . . . . . . . . . . . . . . 17 78 4.2.1 TCP Efficiency Percentage calculation . . . . . . . . 17 79 4.3 Buffer Delay . . . . . . . . . . . . . . . . . . . . . . . 17 80 4.3.1 Buffer Delay Percentage calculation. . . . . . . . . . 17 81 5. Conducting TCP Throughput Tests. . . . . . . . . . . . . . . . 18 82 5.1 Single versus Multiple Connections . . . . . . . . . . . . 18 83 5.2 Results Interpretation . . . . . . . . . . . . . . . . . . 19 84 6. Security Considerations . . . . . . . . . . . . . . . . . . . 21 85 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 21 86 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 22 87 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 22 88 9.1 Normative References . . . . . . . . . . . . . . . . . . . 22 89 9.2 Informative References . . . . . . . . . . . . . . . . . . 22 91 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 23 93 1. Introduction 95 In the network industry, the SLA (Service Level Agreement) provided 96 to business class customers is generally based upon Layer 2/3 97 criteria such as: Bandwidth, latency, packet loss and delay 98 variations (jitter). Network providers are coming to the realization 99 that Layer 2/3 testing is not enough to adequately ensure end-user's 100 satisfaction. In addition to Layer 2/3 testing, this framework 101 recommends a methodology for measuring TCP Throughput in order to 102 provide meaningful results with respect to user experience. 104 Additionally, business class customers seek to conduct repeatable TCP 105 Throughput tests between locations. Since these organizations rely on 106 the networks of the providers, a common test methodology with 107 predefined metrics would benefit both parties. 109 Note that the primary focus of this methodology is managed business 110 class IP networks; e.g. those Ethernet terminated services for which 111 organizations are provided an SLA from the network provider. Because 112 of the SLA, the expectation is that the TCP Throughput should achieve 113 the guaranteed bandwidth. 
End-users with "best effort" access could 114 use this methodology, but this framework and its metrics are intended 115 to be used in a predictable managed IP network. No end-to-end 116 performance can be guaranteed when only the access portion is being 117 provisioned to a specific bandwidth capacity. 119 The intent behind this document is to define a methodology for 120 testing sustained TCP Layer performance. In this document, the 121 achievable TCP Throughput is that amount of data per unit time that 122 TCP transports when in the TCP Equilibrium state (see Section 1.2 123 for the TCP Equilibrium definition). Throughout this document, maximum 124 achievable throughput refers to the theoretical achievable throughput 125 when TCP is in the Equilibrium state. 127 TCP is connection oriented and at the transmitting side it uses a 128 congestion window (TCP CWND). At the receiving end, TCP uses a 129 receive window (TCP RWND) to inform the transmitting end of how 130 many Bytes it is capable of accepting at a given time. 132 Derived from Round Trip Time (RTT) and network Bottleneck Bandwidth 133 (BB), the Bandwidth Delay Product (BDP) determines the Send and 134 Receive Socket Buffer sizes required to achieve the maximum TCP 135 Throughput. Then, with the help of slow start and congestion 136 avoidance algorithms, a TCP CWND is calculated based on the IP 137 network path loss rate. Finally, the minimum value between the 138 calculated TCP CWND and the TCP RWND advertised by the opposite end 139 will determine how many Bytes can actually be sent by the 140 transmitting side at a given time. 142 Both TCP Window sizes (RWND and CWND) may vary during any given TCP 143 session, although, up to bandwidth limits, larger RWND and larger CWND 144 will achieve higher throughputs by permitting more in-flight Bytes. 146 At both ends of the TCP connection and for each socket, there are 147 default buffer sizes. There are also kernel enforced maximum buffer 148 sizes. These buffer sizes can be adjusted at both ends (transmitting 149 and receiving). Some TCP/IP stack implementations use Receive Window 150 Auto-Tuning, although in order to obtain the maximum throughput it is 151 critical to use large enough TCP Send and Receive Socket Buffer 152 sizes. In fact, they SHOULD be equal to or greater than the BDP. 154 Many variables are involved in TCP Throughput performance, but this 155 methodology focuses on: 156 - BB (Bottleneck Bandwidth) 157 - RTT (Round Trip Time) 158 - Send and Receive Socket Buffers 159 - Minimum TCP RWND 160 - Path MTU (Maximum Transmission Unit) 162 This methodology proposes TCP testing that SHOULD be performed in 163 addition to traditional Layer 2/3 type tests. In fact, Layer 2/3 164 tests are REQUIRED to verify the integrity of the network before 165 conducting TCP tests. Examples include iperf (UDP mode) and manual 166 packet layer test techniques where packet throughput, loss, and delay 167 measurements are conducted. When available, standardized testing 168 similar to [RFC2544] but adapted for use in operational networks MAY 169 be used. 171 Note: [RFC2544] was never meant to be used outside a lab environment. 173 Sections 2 and 3 of this document provide a general overview of the 174 proposed methodology. Section 4 defines the metrics while Section 5 175 explains how to conduct the tests and interpret the results.
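As an informal illustration of the BDP relationship described above, the following sketch (Python; the helper names and the T3 link values are examples chosen here, not values defined by this methodology) derives the minimum TCP RWND and Socket Buffer sizes from the RTT and the BB, and shows how an undersized window caps the achievable TCP Throughput. Section 3.3.1 provides the full derivations and tabulated values.

   # Informal sketch only: BDP-based buffer sizing (see Sections 1 and 3.3.1).
   # The T3 example values below are illustrative.

   def bdp_bits(bb_bps, rtt_sec):
       # Bandwidth Delay Product (bits) = BB (bps) x RTT (sec)
       return bb_bps * rtt_sec

   def min_rwnd_bytes(bb_bps, rtt_sec):
       # Minimum required TCP RWND / Socket Buffer size (Bytes) = BDP / 8
       return bdp_bits(bb_bps, rtt_sec) / 8

   def rwnd_limited_throughput_bps(rwnd_bytes, rtt_sec):
       # Achievable TCP Throughput when the window is the limiting factor
       return (rwnd_bytes * 8) / rtt_sec

   bb, rtt = 44.21e6, 0.025                        # T3 Bottleneck Bandwidth, 25 ms RTT
   print(bdp_bits(bb, rtt))                        # ~1,105,250 bits
   print(min_rwnd_bytes(bb, rtt) / 1000)           # ~138 KBytes
   print(rwnd_limited_throughput_bps(64000, rtt))  # a 64 KByte window caps at ~20.5 Mbps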
177 1.1 Terminology 179 The common definitions used in this methodology are: 181 - TCP Throughput Test Device (TCP TTD), refers to compliant TCP 182 host that generates traffic and measures metrics as defined in 183 this methodology. i.e. a dedicated communications test instrument. 184 - Customer Provided Equipment (CPE), refers to customer owned 185 equipment (routers, switches, computers, etc.) 186 - Customer Edge (CE), refers to provider owned demarcation device. 187 - Provider Edge (PE), refers to provider's distribution equipment. 188 - Bottleneck Bandwidth (BB), lowest bandwidth along the complete 189 path. Bottleneck Bandwidth and Bandwidth are used synonymously 190 in this document. Most of the time the Bottleneck Bandwidth is 191 in the access portion of the wide area network (CE - PE). 192 - Provider (P), refers to provider core network equipment. 193 - Network Under Test (NUT), refers to the tested IP network path. 194 - Round Trip Time (RTT), is the elapsed time between the clocking in 195 of the first bit of a TCP segment sent and the receipt of the last 196 bit of the corresponding TCP Acknowledgment. 197 - Bandwidth Delay Product (BDP), refers to the product of a data 198 link's capacity (in bits per second) and its end-to-end delay 199 (in seconds). 201 Figure 1.1 Devices, Links and Paths 203 +----+ +----+ +----+ +----+ +---+ +---+ +----+ +----+ +----+ +----+ 204 | TCP|-| CPE|-| CE |--| PE |-| P |--| P |-| PE |--| CE |-| CPE|-| TCP| 205 | TTD| | | | |BB| | | | | | | |BB| | | | | TTD| 206 +----+ +----+ +----+ +----+ +---+ +---+ +----+ +----+ +----+ +----+ 207 <------------------------ NUT -------------------------> 208 R >-----------------------------------------------------------| 209 T | 210 T <-----------------------------------------------------------| 212 Note that the NUT may be built with of a variety of devices including 213 but not limited to, load balancers, proxy servers or WAN acceleration 214 appliances. The detailed topology of the NUT SHOULD be well known 215 when conducting the TCP Throughput tests, although this methodology 216 makes no attempt to characterize specific network architectures. 218 1.2 TCP Equilibrium 220 TCP connections have three (3) fundamental congestion window phases, 221 which are depicted in Figure 1.2. 223 1 - The Slow Start phase, which occurs at the beginning of a TCP 224 transmission or after a retransmission time out. 226 2 - The Congestion Avoidance phase, during which TCP ramps up to 227 establish the maximum achievable throughput. It is important to note 228 that retransmissions are a natural by-product of the TCP congestion 229 avoidance algorithm as it seeks to achieve maximum throughput. 231 3 - The Loss Recovery phase, which could include Fast Retransmit 232 (Tahoe) or Fast Recovery (Reno & New Reno). When packet loss occurs, 233 Congestion Avoidance phase transitions either to Fast Retransmission 234 or Fast Recovery depending upon the TCP implementation. If a Time-Out 235 occurs, TCP transitions back to the Slow Start phase. 
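As an informal complement to these descriptions and to Figure 1.2 below, the following toy model (Python) walks a Reno-like congestion window through the three phases and shows that the Bytes actually in flight are bounded by the minimum of the TCP CWND, the advertised TCP RWND, and the Send Socket Buffer. The window sizes and the loss pattern are purely illustrative assumptions, not values defined by this methodology.

   # Illustrative toy model of the three TCP CWND phases (Reno-like behavior).
   # Units are Bytes; the loss event and all window sizes are hypothetical.

   MSS = 1460              # 1500 Byte MTU - 40 Bytes of TCP/IP headers
   RWND = 48 * 1024        # receiver's advertised window
   SEND_BUF = 128 * 1024   # sender's Send Socket Buffer

   def in_flight(cwnd):
       # Bytes that may be outstanding at any instant
       return min(cwnd, RWND, SEND_BUF)

   cwnd, ssthresh = 1 * MSS, 32 * MSS
   for rtt_round in range(1, 21):
       if rtt_round == 12:                     # pretend a single loss event here
           ssthresh = max(cwnd // 2, 2 * MSS)  # 3 - Loss Recovery: ssthresh halving;
           cwnd = ssthresh                     #     Fast Recovery resumes from ssthresh
                                               #     (a Time-Out restarts Slow Start at 1 MSS)
       elif cwnd < ssthresh:
           cwnd *= 2                           # 1 - Slow Start: exponential growth
       else:
           cwnd += MSS                         # 2 - Congestion Avoidance: linear growth
       print(rtt_round, cwnd, in_flight(cwnd))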
237 Figure 1.2 TCP CWND Phases 239 /\ | 240 /\ |High ssthresh TCP CWND TCP 241 /\ |Loss Event * halving 3-Loss Recovery Equilibrium 242 /\ | * \ upon loss 243 /\ | * \ / \ Time-Out Adjusted 244 /\ | * \ / \ +--------+ * ssthresh 245 /\ | * \/ \ / Multiple| * 246 /\ | * 2-Congestion\ / Loss | * 247 /\ | * Avoidance \/ Event | * 248 TCP | * Half | * 249 Through- | * TCP CWND | * 1-Slow Start 250 put | * 1-Slow Start Min TCP CWND after T-O 251 +----------------------------------------------------------- 252 Time > > > > > > > > > > > > > > > > > > > > > > > > > > > 254 Note: ssthresh = Slow Start threshold. 256 A well tuned and managed IP network with appropriate TCP adjustments 257 in the IP hosts and applications should perform very close to the 258 BB when TCP is in the Equilibrium state. 260 This TCP methodology provides guidelines to measure the maximum 261 achievable TCP Throughput when TCP is in the Equilibrium state. 262 All maximum achievable TCP Throughputs specified in Section 3.3 are 263 with respect to this condition. 265 It is important to clarify the interaction between the sender's Send 266 Socket Buffer and the receiver's advertised TCP RWND Size. TCP test 267 programs such as iperf, ttcp, etc. allows the sender to control the 268 quantity of TCP Bytes transmitted and unacknowledged (in-flight), 269 commonly referred to as the Send Socket Buffer. This is done 270 independently of the TCP RWND Size advertised by the receiver. 272 2. Scope and Goals 274 Before defining the goals, it is important to clearly define the 275 areas that are out-of-scope. 277 - This methodology is not intended to predict the TCP Throughput 278 during the transient stages of a TCP connection, such as during the 279 slow start phase. 281 - This methodology is not intended to definitively benchmark TCP 282 implementations of one OS to another, although some users may find 283 value in conducting qualitative experiments. 285 - This methodology is not intended to provide detailed diagnosis 286 of problems within end-points or within the network itself as 287 related to non-optimal TCP performance, although results 288 interpretation for each test step may provide insights to potential 289 issues. 291 - This methodology does not propose to operate permanently with high 292 measurement loads. TCP performance and optimization within 293 operational networks MAY be captured and evaluated by using data 294 from the "TCP Extended Statistics MIB" [RFC4898]. 296 In contrast to the above exclusions, the primary goal is to define a 297 method to conduct a practical end-to-end assessment of sustained 298 TCP performance within a managed business class IP network. Another 299 key goal is to establish a set of "best practices" that a non-TCP 300 expert SHOULD apply when validating the ability of a managed IP 301 network to carry end-user TCP applications. 303 Specific goals are to: 305 - Provide a practical test approach that specifies tunable parameters 306 (such as MTU (Maximum Transmit Unit) and Socket Buffer sizes) and how 307 these affect the outcome of TCP performances over an IP network. 309 - Provide specific test conditions like link speed, RTT, MTU, Socket 310 Buffer sizes and achievable TCP Throughput when TCP is in the 311 Equilibrium state. For guideline purposes, provide examples of 312 test conditions and their maximum achievable TCP Throughput. 
313 Section 1.2 provides specific details concerning the definition of 314 TCP Equilibrium within this methodology while Section 3 provides 315 specific test conditions with examples. 317 - Define three (3) basic metrics to compare the performance of TCP 318 connections under various network conditions. See Section 4. 320 - In test situations where the recommended procedure does not yield 321 the maximum achievable TCP Throughput, this methodology provides 322 some areas within the end host or the network that SHOULD be 323 considered for investigation. Although again, this methodology 324 is not intended to provide detailed diagnosis on these issues. 325 See Section 5.2. 327 3. Methodology 329 This methodology is intended for operational and managed IP networks. 330 A multitude of network architectures and topologies can be tested. 331 The diagram in Figure 1.1 is very general and is only there to 332 illustrate typical segmentation within end-user and network provider 333 domains. 335 Also, as stated earlier in Section 1, it is considered best practice 336 to verify the integrity of the network by conducting Layer 2/3 tests 337 such as [RFC2544] or other methods of network stress tests. 338 Although, it is important to mention here that [RFC2544] was never 339 meant to be used outside a lab environment. 341 It is not possible to make an accurate TCP Throughput measurement 342 when the network is dysfunctional. In particular, if the network is 343 exhibiting high packet loss and/or high jitter, then TCP Layer 344 Throughput testing will not be meaningful. As a guideline 5% packet 345 loss and/or 150 ms of jitter may be considered too high for an 346 accurate measurement. 348 TCP Throughput testing may require cooperation between the end-user 349 customer and the network provider. As an example, in an MPLS (Multi- 350 Protocol Label Switching) network architecture, the testing SHOULD be 351 conducted either on the CPE or on the CE device and not on the PE 352 (Provider Edge) router. 354 The following represents the sequential order of steps for this 355 testing methodology: 357 1 - Identify the Path MTU. Packetization Layer Path MTU Discovery 358 or PLPMTUD, [RFC4821], SHOULD be conducted. It is important to 359 identify the path MTU so that the TCP TTD is configured properly to 360 avoid fragmentation. 362 2 - Baseline Round Trip Time and Bandwidth. This step establishes the 363 inherent, non-congested Round Trip Time (RTT) and the Bottleneck 364 Bandwidth (BB) of the end-to-end network path. These measurements 365 are used to provide estimates of the TCP RWND and Send Socket Buffer 366 Sizes that SHOULD be used during subsequent test steps. 368 3 - TCP Connection Throughput Tests. With baseline measurements 369 of Round Trip Time and Bottleneck Bandwidth, single and multiple TCP 370 connection throughput tests SHOULD be conducted to baseline network 371 performances. 373 These three (3) steps are detailed in Sections 3.1 - 3.3. 375 Important to note are some of the key characteristics and 376 considerations for the TCP test instrument. The test host MAY be a 377 standard computer or a dedicated communications test instrument. 378 In both cases, it MUST be capable of emulating both a client and a 379 server. 381 The following criteria SHOULD be considered when selecting whether 382 the TCP test host can be a standard computer or has to be a dedicated 383 communications test instrument: 385 - TCP implementation used by the test host, OS version, i.e. 
LINUX OS 386 kernel using TCP New Reno, TCP options supported, etc. These will 387 obviously be more important when using dedicated communications test 388 instruments where the TCP implementation may be customized or tuned 389 to run in higher performance hardware. When a compliant TCP TTD is 390 used, the TCP implementation SHOULD be identified in the test 391 results. The compliant TCP TTD SHOULD be usable for complete 392 end-to-end testing through network security elements and SHOULD also 393 be usable for testing network sections. 395 - More important, the TCP test host MUST be capable to generate 396 and receive stateful TCP test traffic at the full BB of the NUT. 397 Stateful TCP test traffic means that the test host MUST fully 398 implement a TCP/IP stack; this is generally a comment aimed at 399 dedicated communications test equipments which sometimes "blast" 400 packets with TCP headers. As a general rule of thumb, testing TCP 401 Throughput at rates greater than 100 Mbps may require high 402 performance server hardware or dedicated hardware based test tools. 404 - A compliant TCP Throughput Test Device MUST allow adjusting both 405 Send and Receive Socket Buffer sizes. The Socket Buffers MUST be 406 large enough to fill the BDP. 408 - Measuring RTT and retransmissions per connection will generally 409 require a dedicated communications test instrument. In the absence of 410 dedicated hardware based test tools, these measurements may need to 411 be conducted with packet capture tools, i.e. conduct TCP Throughput 412 tests and analyze RTT and retransmissions in packet captures. 413 Another option MAY be to use "TCP Extended Statistics MIB" per 414 [RFC4898]. 416 - The [RFC4821] PLPMTUD test SHOULD be conducted with a dedicated 417 tester which exposes the ability to run the PLPMTUD algorithm 418 independently from the OS stack. 420 3.1. Path MTU 422 TCP implementations should use Path MTU Discovery techniques (PMTUD). 423 PMTUD relies on ICMP 'need to frag' messages to learn the path MTU. 424 When a device has a packet to send which has the Don't Fragment (DF) 425 bit in the IP header set and the packet is larger than the (MTU) of 426 the next hop, the packet is dropped and the device sends an ICMP 427 'need to frag' message back to the host that originated the packet. 428 The ICMP 'need to frag' message includes the next hop MTU which PMTUD 429 uses to adjust itself. Unfortunately, because many network managers 430 completely disable ICMP, this technique does not always prove 431 reliable. 433 Packetization Layer Path MTU Discovery or PLPMTUD [RFC4821] MUST then 434 be conducted to verify the network path MTU. PLPMTUD can be used 435 with or without ICMP. [RFC4821] specifies search_high and search_low 436 parameters for the MTU and we recommend to use those. The goal is to 437 avoid fragmentation during all subsequent tests. 439 3.2. Round Trip Time (RTT) and Bottleneck Bandwidth (BB) 441 Before stateful TCP testing can begin, it is important to determine 442 the baseline RTT (i.e. non-congested inherent delay) and BB of the 443 end-to-end network to be tested. These measurements are used to 444 calculate the BDP and to provide estimates of the TCP RWND and 445 Send Socket Buffer Sizes that SHOULD be used in subsequent test 446 steps. 448 3.2.1 Measuring RTT 450 As previously defined in Section 1.1, RTT is the elapsed time 451 between the clocking in of the first bit of a TCP segment sent 452 and the receipt of the last bit of the corresponding TCP 453 Acknowledgment. 
455 The RTT SHOULD be baselined during off-peak hours in order to obtain 456 a reliable figure of the inherent network latency. Otherwise, 457 additional delay caused by network buffering can occur. Also, when 458 sampling RTT values over a given test interval, the minimum 459 measured value SHOULD be used as the baseline RTT. This will most 460 closely estimate the real inherent RTT. This value is also used to 461 determine the Buffer Delay Percentage metric defined in Section 4.3. 463 The following list is not meant to be exhaustive, although it 464 summarizes some of the most common ways to determine Round Trip Time. 465 The desired measurement precision (i.e. ms versus us) may dictate 466 whether the RTT measurement can be achieved with ICMP pings or by a 467 dedicated communications test instrument with precision timers. The 468 objective in this section is to list several techniques in order of 469 decreasing accuracy. 471 - Use test equipment on each end of the network, "looping" the 472 far-end tester so that a packet stream can be measured back and forth 473 from end-to-end. This RTT measurement may be compatible with delay 474 measurement protocols specified in [RFC5357]. 476 - Conduct packet captures of TCP test sessions using "iperf" or FTP, 477 or other TCP test applications. By running multiple experiments, 478 packet captures can then be analyzed to estimate RTT. It is 479 important to note that results based upon the SYN -> SYN-ACK at the 480 beginning of TCP sessions SHOULD be avoided since Firewalls might 481 slow down 3 way handshakes. Also, at the senders side, Ostermann's 482 LINUX TCPTRACE utility with -l -r arguments can be used to extract 483 the RTT results directly from the packet captures. 485 - Obtain RTT statistics available from MIBs defined in [RFC4898]. 487 - ICMP pings may also be adequate to provide Round Trip Time 488 estimates, provided that the packet size is factored into the 489 estimates (i.e. pings with different packet sizes might be required). 490 Some limitations with ICMP Ping may include ms resolution and 491 whether the network elements are responding to pings or not. Also, 492 ICMP is often rate-limited or segregated into different buffer 493 queues. ICMP might not work if QoS (Quality of Service) 494 reclassification is done at any hop. ICMP is not as reliable and 495 accurate as in-band measurements. 497 3.2.2 Measuring BB 499 Before any TCP Throughput test can be conducted, bandwidth 500 measurement tests SHOULD be run with stateless IP streams (i.e. not 501 stateful TCP) in order to determine the BB of the NUT. 502 These measurements SHOULD be conducted in both directions, 503 especially in asymmetrical access networks (e.g. ADSL access). These 504 tests SHOULD be performed at various intervals throughout a business 505 day or even across a week. 507 Testing at various time intervals would provide a better 508 characterization of TCP throughput and better diagnosis insight (for 509 cases where there are TCP performance issues). The bandwidth tests 510 SHOULD produce logged outputs of the achieved bandwidths across the 511 complete test duration. 513 There are many well established techniques available to provide 514 estimated measures of bandwidth over a network. It is a common 515 practice for network providers to conduct Layer 2/3 bandwidth 516 capacity tests using [RFC2544], although it is understood that 517 [RFC2544] was never meant to be used outside a lab environment. 
518 These bandwidth measurements SHOULD use network capacity 519 techniques as defined in [RFC5136]. 521 3.3. Measuring TCP Throughput 523 This methodology specifically defines TCP Throughput measurement 524 techniques to verify maximum achievable TCP performance in a managed 525 business class IP network. 527 With baseline measurements of RTT and BB from Section 3.2, a series 528 of single and / or multiple TCP connection throughput tests SHOULD 529 be conducted. 531 The number of trials and single versus multiple TCP connections 532 choice will be based on the intention of the test. A single TCP 533 connection test might be enough to measure the achievable throughput 534 of a Metro Ethernet connectivity. Although, it is important to note 535 that various traffic management techniques can be used in an IP 536 network and that some of those can only be tested with multiple 537 connections. As an example, multiple TCP sessions might be required 538 to detect traffic shaping versus policing. Multiple sessions might 539 also be needed to measure Active Queue Management performances. 540 However, traffic management testing is not within the scope of this 541 test methodology. 543 In all circumstances, it is RECOMMENDED to run the tests in each 544 direction independently first and then to run in both directions 545 simultaneously. It is also RECOMMENDED to run the tests at 546 different times of day. 548 In each case, the TCP Transfer Time Ratio, the TCP Efficiency 549 Percentage, and the Buffer Delay Percentage MUST be measured in 550 each direction. These 3 metrics are defined in Section 4. 552 3.3.1 Minimum TCP RWND 554 The TCP TTD MUST allow the Send Socket Buffer and Receive Window 555 sizes to be set higher than the BDP, otherwise TCP performance will 556 be limited. In the business customer environment, these settings are 557 not generally adjustable by the average user. These settings are 558 either hard coded in the application or configured within the OS as 559 part of a corporate image. In many cases, the user's host Send 560 Socket Buffer and Receive Window size settings are not optimal. 562 This section provides derivations of BDPs under various network 563 conditions. It also provides examples of achievable TCP Throughput 564 with various TCP RWND sizes. This provides important guidelines 565 showing what can be achieved with settings higher than the BDP, 566 versus what would be achieved in a variety of real world conditions. 568 The minimum required TCP RWND Size can be calculated from the 569 Bandwidth Delay Product (BDP), which is: 571 BDP (bits) = RTT (sec) x BB (bps) 572 Note that the RTT is being used as the "Delay" variable for the BDP. 574 Then, by dividing the BDP by 8, we obtain the minimum required TCP 575 RWND Size in Bytes. For optimal results, the Send Socket Buffer 576 MUST be adjusted to the same value at each end of the network. 578 Minimum required TCP RWND = BDP / 8 580 As an example on a T3 link with 25 ms RTT, the BDP would equal 581 ~1,105,000 bits and the minimum required TCP RWND would be ~138 KB. 583 Note that separate calculations are REQUIRED on asymmetrical paths. 584 An asymmetrical path example would be a 90 ms RTT ADSL line with 585 5Mbps downstream and 640Kbps upstream. The downstream BDP would equal 586 ~450,000 bits while the upstream one would be only ~57,600 bits. 588 The following table provides some representative network Link Speeds, 589 RTT, BDP, and their associated minimum required TCP RWND Sizes. 
591 Table 3.3.1: Link Speed, RTT, calculated BDP & min. TCP RWND 593 Link Minimum required 594 Speed* RTT BDP TCP RWND 595 (Mbps) (ms) (bits) (KBytes) 596 --------------------------------------------------------------------- 597 1.536 20.00 30,720 3.84 598 1.536 50.00 76,800 9.60 599 1.536 100.00 153,600 19.20 600 44.210 10.00 442,100 55.26 601 44.210 15.00 663,150 82.89 602 44.210 25.00 1,105,250 138.16 603 100.000 1.00 100,000 12.50 604 100.000 2.00 200,000 25.00 605 100.000 5.00 500,000 62.50 606 1,000.000 0.10 100,000 12.50 607 1,000.000 0.50 500,000 62.50 608 1,000.000 1.00 1,000,000 125.00 609 10,000.000 0.05 500,000 62.50 610 10,000.000 0.30 3,000,000 375.00 612 * Note that link speed is the BB for the NUT 613 In the above table, the following serial link speeds are used: 614 - T1 = 1.536 Mbps (for a B8ZS line encoding facility) 615 - T3 = 44.21 Mbps (for a C-Bit Framing facility) 617 The previous table illustrates the minimum required TCP RWND. 618 If a smaller TCP RWND Size is used, then the TCP Throughput 619 can not be optimal. To calculate the TCP Throughput, the following 620 formula is used: TCP Throughput = TCP RWND X 8 / RTT 622 An example could be a 100 Mbps IP path with 5 ms RTT and a TCP RWND 623 of 16KB, then: 625 TCP Throughput = 16 KBytes X 8 bits / 5 ms. 626 TCP Throughput = 128,000 bits / 0.005 sec. 627 TCP Throughput = 25.6 Mbps. 629 Another example for a T3 using the same calculation formula is 630 illustrated in Figure 3.3.1a: 632 TCP Throughput = 16 KBytes X 8 bits / 10 ms. 633 TCP Throughput = 128,000 bits / 0.01 sec. 634 TCP Throughput = 12.8 Mbps. * 636 When the TCP RWND Size exceeds the BDP (T3 link and 64 KBytes TCP 637 RWND on a 10 ms RTT path), the maximum frames per second limit of 638 3664 is reached and then the formula is: 640 TCP Throughput = Max FPS X (MTU - 40) X 8. 641 TCP Throughput = 3664 FPS X 1460 Bytes X 8 bits. 642 TCP Throughput = 42.8 Mbps. ** 644 The following diagram compares achievable TCP Throughputs on a T3 645 with Send Socket Buffer & TCP RWND Sizes of 16KB vs. 64KB. 647 Figure 3.3.1a TCP Throughputs on a T3 at different RTTs 649 45| 650 | _______**42.8 651 40| |64KB | 652 TCP | | | 653 Throughput 35| | | 654 in Mbps | | | +-----+34.1 655 30| | | |64KB | 656 | | | | | 657 25| | | | | 658 | | | | | 659 20| | | | | _______20.5 660 | | | | | |64KB | 661 15| | | | | | | 662 |*12.8+-----| | | | | | 663 10| |16KB | | | | | | 664 | | | |8.5 +-----| | | | 665 5| | | | |16KB | |5.1 +-----| | 666 |_____|_____|_____|____|_____|_____|____|16KB |_____|_____ 667 10 15 25 668 RTT in milliseconds 670 The following diagram shows the achievable TCP Throughput on a 25 ms 671 T3 when Send Socket Buffer & TCP RWND Sizes are increased. 673 Figure 3.3.1b TCP Throughputs on a T3 with different TCP RWND 675 45| 676 | 677 40| +-----+40.9 678 TCP | | | 679 Throughput 35| | | 680 in Mbps | | | 681 30| | | 682 | | | 683 25| | | 684 | | | 685 20| +-----+20.5 | | 686 | | | | | 687 15| | | | | 688 | | | | | 689 10| +-----+10.2 | | | | 690 | | | | | | | 691 5| +-----+5.1 | | | | | | 692 |_____|_____|______|_____|______|_____|_______|_____|_____ 693 16 32 64 128* 694 TCP RWND Size in KBytes 696 * Note that 128KB requires [RFC1323] TCP Window scaling option. 698 4. TCP Metrics 700 This methodology focuses on a TCP Throughput and provides 3 basic 701 metrics that can be used for better understanding of the results. 
702 It is recognized that the complexity and unpredictability of TCP 703 makes it very difficult to develop a complete set of metrics that 704 accounts for the myriad of variables (i.e. RTT variations, loss 705 conditions, TCP implementations, etc.). However, these 3 metrics 706 facilitate TCP Throughput comparisons under varying network 707 conditions and host buffer size / RWND settings. 709 4.1 Transfer Time Ratio 711 The first metric is the TCP Transfer Time Ratio, which is simply the 712 ratio between the Actual and the Ideal TCP Transfer Times. 714 The Actual TCP Transfer Time is simply the time it takes to transfer 715 a block of data across TCP connection(s). 717 The Ideal TCP Transfer Time is the predicted time for which a block 718 of data SHOULD transfer across TCP connection(s), considering the BB 719 of the NUT. 721 Actual TCP Transfer Time 722 TCP Transfer Time Ratio = ------------------------- 723 Ideal TCP Transfer Time 725 The Ideal TCP Transfer Time is derived from the Maximum Achievable 726 TCP Throughput, which is related to the BB and Layer 1/2/3/4 727 overheads associated with the network path. The following sections 728 provide derivations for the Maximum Achievable TCP Throughput and 729 example calculations for the TCP Transfer Time Ratio. 731 4.1.1 Maximum Achievable TCP Throughput calculation 733 This section provides formulas to calculate the Maximum Achievable 734 TCP Throughput with examples for T3 (44.21 Mbps) and Ethernet. 736 All calculations are based on IP version 4 with TCP/IP headers of 737 20 Bytes each (20 for TCP + 20 for IP) within an MTU of 1500 Bytes. 739 First, the maximum achievable Layer 2 throughput of a T3 Interface 740 is limited by the maximum quantity of Frames Per Second (FPS) 741 permitted by the actual physical layer (Layer 1) speed. 743 The calculation formula is: 744 FPS = T3 Physical Speed / ((MTU + PPP + Flags + CRC16) X 8) 745 FPS = (44.21Mbps /((1500 Bytes + 4 Bytes + 2 Bytes + 2 Bytes) X 8)) 746 FPS = (44.21Mbps /(1508 Bytes X 8)) 747 FPS = 44.21Mbps / 12064 bits 748 FPS = 3664 750 Then, to obtain the Maximum Achievable TCP Throughput (Layer 4), we 751 simply use: (MTU - 40) in Bytes X 8 bits X max FPS. 753 For a T3, the maximum TCP Throughput = 1460 Bytes X 8 bits X 3664 FPS 754 Maximum TCP Throughput = 11680 bits X 3664 FPS 755 Maximum TCP Throughput = 42.8 Mbps. 757 On Ethernet, the maximum achievable Layer 2 throughput is limited by 758 the maximum Frames Per Second permitted by the IEEE 802.3 standard. 760 The maximum FPS for 100 Mbps Ethernet is 8127 and the calculation is: 761 FPS = (100Mbps /(1538 Bytes X 8 bits)) 763 The maximum FPS for GigE is 81274 and the calculation formula is: 764 FPS = (1Gbps /(1538 Bytes X 8 bits)) 766 The maximum FPS for 10GigE is 812743 and the calculation formula is: 767 FPS = (10Gbps /(1538 Bytes X 8 bits)) 769 The 1538 Bytes equates to: 771 MTU + Ethernet + CRC32 + IFG + Preamble + SFD 772 (IFG = Inter-Frame Gap and SFD = Start of Frame Delimiter) 773 Where MTU is 1500 Bytes, Ethernet is 14 Bytes, CRC32 is 4 Bytes, 774 IFG is 12 Bytes, Preamble is 7 Bytes and SFD is 1 Byte. 776 Then, to obtain the Maximum Achievable TCP Throughput (Layer 4), we 777 simply use: (MTU - 40) in Bytes X 8 bits X max FPS. 778 For 100 Mbps Ethernet, the max TCP Throughput = 1460 Bytes X 8 bits X 8127 FPS 779 Maximum TCP Throughput = 11680 bits X 8127 FPS 780 Maximum TCP Throughput = 94.9 Mbps.
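As a convenience, these derivations can be scripted. The short sketch below (Python) is provided only as an informal aid; it assumes the same IPv4 header sizes, 1500 Byte MTU, and Layer 1/2 overheads used in the calculations above, and simply reproduces the T3 and 100 Mbps Ethernet figures.

   # Informal helper reproducing the Section 4.1.1 derivations (IPv4, 1500 Byte MTU).
   TCP_IP_HEADERS = 40        # 20 Bytes TCP + 20 Bytes IP

   def max_fps(link_bps, l1_l2_overhead_bytes, mtu=1500):
       # FPS limit imposed by the physical speed and the Layer 1/2 framing overhead
       return link_bps / ((mtu + l1_l2_overhead_bytes) * 8)

   def max_tcp_throughput_bps(link_bps, l1_l2_overhead_bytes, mtu=1500):
       # Maximum Achievable TCP Throughput = (MTU - 40) x 8 x max FPS
       return (mtu - TCP_IP_HEADERS) * 8 * max_fps(link_bps, l1_l2_overhead_bytes)

   # T3 with PPP framing: MTU + PPP(4) + Flags(2) + CRC16(2) = 1508 Bytes per frame
   print(int(max_fps(44.21e6, 8)))                            # 3664 FPS
   print(round(max_tcp_throughput_bps(44.21e6, 8) / 1e6, 1))  # 42.8 Mbps

   # Ethernet: MTU + header(14) + CRC32(4) + IFG(12) + Preamble(7) + SFD(1) = 1538 Bytes
   print(int(max_fps(100e6, 38)))                             # 8127 FPS
   print(round(max_tcp_throughput_bps(100e6, 38) / 1e6, 1))   # 94.9 Mbps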
782 It is important to note that better results could be obtained with 783 jumbo frames on Gigabit and 10 Gigabit Ethernet interfaces. 785 4.1.2 TCP Transfer Time and Transfer Time Ratio calculation 787 The following table illustrates the Ideal TCP Transfer time of a 788 single TCP connection when its TCP RWND and Send Socket Buffer Sizes 789 equal or exceed the BDP. 791 Table 4.1.1: Link Speed, RTT, BDP, TCP Throughput, and 792 Ideal TCP Transfer time for a 100 MB File 794 Link Maximum Ideal TCP 795 Speed BDP Achievable TCP Transfer time 796 (Mbps) RTT (ms) (KBytes) Throughput(Mbps) (seconds)* 797 -------------------------------------------------------------------- 798 1.536 50.00 9.6 1.4 571.0 799 44.210 25.00 138.2 42.8 18.0 800 100.000 2.00 25.0 94.9 9.0 801 1,000.000 1.00 125.0 949.2 1.0 802 10,000.000 0.05 62.5 9,492.0 0.1 804 * Transfer times are rounded for simplicity. 806 For a 100MB file (100 x 8 = 800 Mbits), the Ideal TCP Transfer Time 807 is derived as follows: 809 800 Mbits 810 Ideal TCP Transfer Time = ----------------------------------- 811 Maximum Achievable TCP Throughput 813 To illustrate the TCP Transfer Time Ratio, an example would be the 814 bulk transfer of 500 MB over 5 simultaneous TCP connections (each 815 connection transferring 100 MB). In this example, the Ethernet 816 service provides a Committed Access Rate (CAR) of 500 Mbps. Each 817 connection may achieve different throughputs during a test and the 818 overall throughput rate is not always easy to determine (especially 819 as the number of connections increases). 821 The ideal TCP Transfer Time would be ~8 seconds, but in this example, 822 the actual TCP Transfer Time was 12 seconds. The TCP Transfer Time 823 Ratio would then be 12/8 = 1.5, which indicates that the transfer 824 across all connections took 1.5 times longer than the ideal. 826 4.2 TCP Efficiency 828 The second metric represents the percentage of Bytes that were not 829 retransmitted. 831 Transmitted Bytes - Retransmitted Bytes 832 TCP Efficiency % = --------------------------------------- X 100 833 Transmitted Bytes 835 Transmitted Bytes are the total number of TCP Bytes to be transmitted 836 including the original and the retransmitted Bytes. 838 4.2.1 TCP Efficiency Percentage calculation 840 As an example, if 100,000 Bytes of original data were sent and 2,000 841 Bytes had to be retransmitted (for a total of 102,000 Transmitted Bytes), the TCP Efficiency Percentage would be calculated as: 843 102,000 - 2,000 844 TCP Efficiency % = ----------------- x 100 = 98.03% 845 102,000 847 Note that the Retransmitted Bytes may have occurred more than once; 848 if so, these multiple retransmissions are added to the 849 Retransmitted Bytes and to the Transmitted Bytes counts. 851 4.3 Buffer Delay 853 The third metric is the Buffer Delay Percentage, which represents 854 the increase in RTT during a TCP Throughput test versus the inherent 855 or baseline RTT. The baseline RTT is the Round Trip Time inherent to 856 the network path under non-congested conditions as defined in Section 857 3.2.1. The average RTT is derived from the total of all RTTs 858 measured (at every second) during the actual test, divided by the test 859 duration in seconds. 861 Total RTTs during transfer 862 Average RTT during transfer = ----------------------------- 863 Transfer duration in seconds 865 Average RTT during Transfer - Baseline RTT 866 Buffer Delay % = ------------------------------------------ X 100 867 Baseline RTT 869 4.3.1 Buffer Delay calculation 871 As an example, consider a network path with a baseline RTT of 25 ms.
872 During the course of a TCP transfer, the average RTT across 873 the entire transfer increases to 32 ms. Then, the Buffer Delay 874 Percentage would be calculated as: 876 32 - 25 877 Buffer Delay % = ------- x 100 = 28% 878 25 880 Note that the TCP Transfer Time Ratio, TCP Efficiency Percentage, and 881 the Buffer Delay Percentage MUST all be measured during each 882 throughput test. Poor TCP Transfer Time Ratio (i.e. TCP Transfer 883 Time greater than the Ideal TCP Transfer Time) may be diagnosed by 884 correlating with sub-optimal TCP Efficiency Percentage and/or Buffer 885 Delay Percentage metrics. 887 5. Conducting TCP Throughput Tests 889 Several TCP tools are currently used in the network world and one of 890 the most common is "iperf". With this tool, hosts are installed at 891 each end of the network path; one acts as client and the other as 892 a server. The Send Socket Buffer and the TCP RWND Sizes of both 893 client and server can be manually set. The achieved throughput can 894 then be measured, either uni-directionally or bi-directionally. For 895 higher BDP situations in lossy networks (Long Fat Networks (LFNs) or 896 satellite links, etc.), TCP options such as Selective Acknowledgment 897 SHOULD become part of the window size / throughput characterization. 899 Host hardware performance must be well understood before conducting 900 the tests described in the following sections. A dedicated 901 communications test instrument will generally be REQUIRED, especially 902 for line rates of GigE and 10 GigE. A compliant TCP TTD SHOULD 903 provide a warning message when the expected test throughput will 904 exceed the subscribed customer SLA. If the throughput test is 905 expected to exceed the subscribed customer SLA, then the test 906 SHOULD be coordinated with the network provider. 908 The TCP Throughput test SHOULD be run over a long enough duration 909 to properly exercise network buffers (i.e. greater than 30 seconds) 910 and SHOULD also characterize performance at different times of day. 912 5.1 Single versus Multiple TCP Connections 914 The decision whether to conduct single or multiple TCP connection 915 tests depends upon the size of the BDP in relation to the TCP RWND 916 configured in the end-user environment. For example, if the BDP for 917 a Long Fat Network (LFN) turns out to be 2MB, then it is probably 918 more realistic to test this network path with multiple connections. 919 Assuming typical host TCP RWND Sizes of 64 KB (i.e. Windows XP), 920 using 32 TCP connections would emulate a small office scenario. 922 The following table is provided to illustrate the relationship 923 between the TCP RWND and the number of TCP connections required to 924 fill the available capacity of a given BDP. For this example, the 925 network bandwidth is 500 Mbps and the RTT is 5 ms, then the BDP 926 equates to 312.5 KBytes. 928 Table 5.1 Number of TCP connections versus TCP RWND 930 Number of TCP Connections 931 TCP RWND to fill available bandwidth 932 ------------------------------------- 933 16KB 20 934 32KB 10 935 64KB 5 936 128KB 3 938 The TCP Transfer Time Ratio metric is useful when conducting multiple 939 connection tests. Each connection SHOULD be configured to transfer 940 payloads of the same size (i.e. 100 MB), then the TCP Transfer Time 941 Ratio provides a simple metric to verify the actual versus expected 942 results. 944 Note that the TCP Transfer Time is the time required for each 945 connection to complete the transfer of the predetermined payload 946 size. 
From the previous table, the 64KB window is considered. Each 947 of the 5 TCP connections would be configured to transfer 100MB, and 948 each one should obtain a maximum of 100 Mbps. So for this example, 949 the 100MB payload should be transferred across the connections in 950 approximately 8 seconds (which would be the Ideal TCP Transfer Time 951 under these conditions). 953 Additionally, the TCP Efficiency Percentage metric MUST be computed 954 for each connection as defined in Section 4.2. 956 5.2 Results Interpretation 958 At the end, a TCP Throughput Test Device (TCP TTD) SHOULD generate a 959 report with the calculated BDP and a set of Window Size experiments. 960 Window Size refers to the minimum of the Send Socket Buffer and TCP 961 RWND. The report SHOULD include TCP Throughput results for each TCP 962 Window Size tested. The goal is to provide clear achievable versus 963 actual TCP Throughputs results with respect to the TCP Window Size 964 when no fragmentation occurs. The report SHOULD also include the 965 results for the 3 metrics defined in Section 4. The goal is to 966 provide a clear relationship between these 3 metrics and user 967 experience. As an example, for the same results in regards with 968 Transfer Time Ratio, a better TCP Efficiency could be obtained at the 969 cost of higher Buffer Delays. 971 For cases where the test results are not equal to the ideal values, 972 some possible causes are: 974 - Network congestion causing packet loss which may be inferred from 975 a poor TCP Efficiency % (i.e., higher TCP Efficiency % = less packet 976 loss) 978 - Network congestion causing an increase in RTT which may be inferred 979 from the Buffer Delay Percentage (i.e., 0% = no increase in RTT over 980 baseline) 981 - Intermediate network devices which actively regenerate the TCP 982 connection and can alter TCP RWND Size, MTU, etc. 984 - Rate limiting by policing instead of shaping. 986 - Maximum TCP Buffer space. All operating systems have a global 987 mechanism to limit the quantity of system memory to be used by TCP 988 connections. On some systems, each connection is subject to a memory 989 limit that is applied to the total memory used for input data, output 990 data and controls. On other systems, there are separate limits for 991 input and output buffer spaces per connection. Client/server IP 992 hosts might be configured with Maximum Buffer Space limits that are 993 far too small for high performance networks. 995 - Socket Buffer Sizes. Most operating systems support separate per 996 connection send and receive buffer limits that can be adjusted as 997 long as they stay within the maximum memory limits. These socket 998 buffers MUST be large enough to hold a full BDP of TCP Bytes plus 999 some overhead. There are several methods that can be used to adjust 1000 socket buffer sizes, but TCP Auto-Tuning automatically adjusts these 1001 as needed to optimally balance TCP performance and memory usage. 1003 It is important to note that Auto-Tuning is enabled by default in 1004 LINUX since the kernel release 2.6.6 and in UNIX since FreeBSD 7.0. 1005 It is also enabled by default in Windows since Vista and in MAC since 1006 OS X version 10.5 (leopard). Over buffering can cause some 1007 applications to behave poorly, typically causing sluggish interactive 1008 response and risk running the system out of memory. Large default 1009 socket buffers have to be considered carefully on multi-user systems. 1011 - TCP Window Scale Option, [RFC1323]. 
This option enables TCP to 1012 support large BDP paths. It provides a scale factor which is 1013 required for TCP to support window sizes larger than 64KB. Most 1014 systems automatically request WSCALE under some conditions, such as 1015 when the receive socket buffer is larger than 64KB or when the other 1016 end of the TCP connection requests it first. WSCALE can only be 1017 negotiated during the 3 way handshake. If either end fails to 1018 request WSCALE or requests an insufficient value, it cannot be 1019 renegotiated. Different systems use different algorithms to select 1020 WSCALE, but it is very important to have large enough buffer 1021 sizes. Note that under these constraints, a client application 1022 wishing to send data at high rates may need to set its own receive 1023 buffer to something larger than 64K Bytes before it opens the 1024 connection to ensure that the server properly negotiates WSCALE. 1025 A system administrator might have to explicitly enable [RFC1323] 1026 extensions. Otherwise, the client/server IP host would not support 1027 TCP window sizes (BDP) larger than 64KB. Most of the time, 1028 performance gains will be obtained by enabling this option in LFNs. 1030 - TCP Timestamps Option, [RFC1323]. This feature provides better 1031 measurements of the Round Trip Time and protects TCP from data 1032 corruption that might occur if packets are delivered so late that the 1033 sequence numbers wrap before they are delivered. Wrapped sequence 1034 numbers do not pose a serious risk below 100 Mbps, but the risk 1035 increases at higher data rates. Most of the time, performance gains 1036 will be obtained by enabling this option in Gigabit bandwidth 1037 networks. 1039 - TCP Selective Acknowledgments Option (SACK), [RFC2018]. This allows 1040 a TCP receiver to inform the sender about exactly which data segment 1041 is missing and needs to be retransmitted. Without SACK, TCP has to 1042 estimate which data segment is missing, which works just fine if all 1043 losses are isolated (i.e. only one loss in any given round trip). 1044 Without SACK, TCP takes a very long time to recover after multiple 1045 and consecutive losses. SACK is now supported by most operating 1046 systems, but it may have to be explicitly enabled by the system 1047 administrator. In networks with unknown load and error patterns, TCP 1048 SACK will improve throughput performances. On the other hand, 1049 security appliances vendors might have implemented TCP randomization 1050 without considering TCP SACK and under such circumstances, SACK might 1051 need to be disabled in the client/server IP hosts until the vendor 1052 corrects the issue. Also, poorly implemented SACK algorithms might 1053 cause extreme CPU loads and might need to be disabled. 1055 - Path MTU. The client/server IP host system SHOULD use the largest 1056 possible MTU for the path. This may require enabling Path MTU 1057 Discovery [RFC1191] & [RFC4821]. Since [RFC1191] is flawed, it is 1058 sometimes not enabled by default and may need to be explicitly 1059 enabled by the system administrator. [RFC4821] describes a new, more 1060 robust algorithm for MTU discovery and ICMP black hole recovery. 1062 - TOE (TCP Offload Engine). Some recent Network Interface Cards (NIC) 1063 are equipped with drivers that can do part or all of the TCP/IP 1064 protocol processing. TOE implementations require additional work 1065 (i.e. hardware-specific socket manipulation) to set up and tear down 1066 connections. 
Because TOE NICs configuration parameters are vendor 1067 specific and not necessarily RFC-compliant, they are poorly 1068 integrated with UNIX & LINUX. Occasionally, TOE might need to be 1069 disabled in a server because its NIC does not have enough memory 1070 resources to buffer thousands of connections. 1072 Note that both ends of a TCP connection MUST be properly tuned. 1074 6. Security Considerations 1076 Measuring TCP network performance raises security concerns. Metrics 1077 produced within this framework may create security issues. 1079 6.1 Denial of Service Attacks 1081 TCP network performance testing, as defined in this document, attempts 1082 to fill the NUT with stateful TCP connections. However, since the test 1083 MAY use stateless IP streams as specified in Section 3.2.2, it might 1084 appear to network operators as a Denial of Service attack. Thus, as 1085 mentioned at the beginning of Section 3, TCP Throughput testing may 1086 require cooperation between the end-user customer and the network 1087 provider. 1089 6.2 User data confidentiality 1091 Metrics within this framework generate packets from a sample, rather 1092 than taking samples based on user data. Thus, our framework does not 1093 threaten user data confidentiality. 1095 6.3 Interference with metrics 1097 The security considerations that apply to any active measurement of 1098 live networks are relevant here as well. See [RFC4656] and 1099 [RFC5357]. 1101 7. IANA Considerations 1103 This document does not REQUIRE an IANA registration for ports 1104 dedicated to the TCP testing described in this document. 1106 8. Acknowledgments 1108 Thanks to Lars Eggert, Al Morton, Matt Mathis, Matt Zekauskas, 1109 Yaakov Stein, and Loki Jorgenson for many good comments and for 1110 pointing us to great sources of information pertaining to past works 1111 in the TCP capacity area. 1113 9. References 1115 9.1 Normative References 1117 [RFC1191] Mogul, J. and S. Deering, "Path MTU Discovery", RFC 1191, 1990. 1119 [RFC1323] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions for 1120 High Performance", RFC 1323, May 1992. 1122 [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP 1123 Selective Acknowledgment Options", RFC 2018, 1996. 1125 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1126 Requirement Levels", BCP 14, RFC 2119, March 1997. 1128 [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for 1129 Network Interconnect Devices", RFC 2544, June 1999. 1131 [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. 1132 Zekauskas, "A One-way Active Measurement Protocol 1133 (OWAMP)", RFC 4656, September 2006. 1135 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 1136 Discovery", RFC 4821, June 2007. 1138 [RFC4898] Mathis, M., Heffner, J., and R. Raghunarayan, "TCP Extended 1139 Statistics MIB", RFC 4898, May 2007. 1141 [RFC5136] Chimento, P. and J. Ishac, "Defining Network Capacity", 1142 RFC 5136, February 2008. 1144 [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and 1145 J. Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)", 1146 RFC 5357, October 2008. 1152 9.2. Informative References Allman, M., "A Bulk Transfer Capacity Methodology for Cooperating Hosts", draft-ietf-ippm-btc-cap-00.txt (work in progress), August 2001. 1153 Authors' Addresses 1155 Barry Constantine 1156 JDSU, Test and Measurement Division 1157 One Milestone Center Court 1158 Germantown, MD 20876-7100 1159 USA 1161 Phone: +1 240 404 2227 1162 barry.constantine@jdsu.com 1164 Gilles Forget 1165 Independent Consultant to Bell Canada.
1166 308, rue de Monaco, St-Eustache 1167 Qc. CANADA, Postal Code: J7P-4T5 1169 Phone: (514) 895-8212 1170 gilles.forget@sympatico.ca 1172 Ruediger Geib 1173 Heinrich-Hertz-Strasse (Number: 3-7) 1174 Darmstadt, Germany, 64295 1176 Phone: +49 6151 6282747 1177 Ruediger.Geib@telekom.de 1179 Reinhard Schrage 1180 Schrage Consulting 1181 Phone: +49 (0) 5137 909540 1182 reinhard@schrageconsult.com