Network Working Group                                     B. Constantine
Internet-Draft                                                      JDSU
Intended status: Informational                                 G. Forget
Expires: November 3, 2010                  Bell Canada (Ext. Consultant)
                                                             L. Jorgenson
                                                        Apparent Networks
                                                         Reinhard Schrage
                                                        Schrage Consulting
                                                              May 3, 2010

                   TCP Throughput Testing Methodology
                draft-ietf-ippm-tcp-throughput-tm-01.txt

Abstract

   This memo describes a methodology for measuring sustained TCP
   throughput performance in an end-to-end managed network environment.
   This memo is intended to provide a practical approach to help users
   validate the TCP layer performance of a managed network, which
   should provide a better indication of end-user application level
   experience.
   In the methodology, various TCP and network parameters are
   identified that should be tested as part of the network verification
   at the TCP layer.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.  Creation date May 3, 2010.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on November 3, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the BSD License.

Table of Contents

   1. Introduction
   2. Goals of this Methodology
      2.1 TCP Equilibrium State Throughput
   3. TCP Throughput Testing Methodology
      3.1 Determine Network Path MTU
      3.2 Baseline Round-trip Delay and Bandwidth
          3.2.1 Techniques to Measure Round Trip Time
          3.2.2 Techniques to Measure End-end Bandwidth
      3.3 Single TCP Connection Throughput Tests
          3.3.1 Interpretation of the Single Connection TCP Throughput
                Results
      3.4 TCP MSS Throughput Testing
          3.4.1 MSS Size Testing Method
          3.4.2 Interpretation of TCP MSS Throughput Results
      3.5 Multiple TCP Connection Throughput Tests
          3.5.1 Multiple TCP Connections - below Link Capacity
          3.5.2 Multiple TCP Connections - over Link Capacity
          3.5.3 Interpretation of Multiple TCP Connection Results
   4. Acknowledgements
   5. References
   Authors' Addresses

1. Introduction

   Even though RFC2544 was meant to benchmark network equipment and is
   used by network equipment manufacturers (NEMs), network providers
   have also used it to benchmark operational networks in order to
   verify SLAs (Service Level Agreements) before turning on a service
   to their business customers.
   Testing an operational network prior to customer activation is
   referred to as "turn-up" testing, and the SLA is generally specified
   in terms of Layer 2/3 packet throughput, delay, loss, and jitter.

   Network providers are coming to the realization that both RFC2544
   testing and TCP layer testing are required to more adequately ensure
   end-user satisfaction.  Therefore, the network provider community
   desires to measure network throughput performance at the TCP layer.
   Measuring TCP throughput provides a meaningful measure with respect
   to the end user's application SLA (and is ultimately a step toward a
   level of TCP testing interoperability which does not exist today).

   As the complexity of the network grows, the various queuing
   mechanisms in the network greatly affect TCP layer performance
   (i.e. improper default router settings for queuing, etc.), and
   devices such as firewalls, proxies, and load-balancers can actively
   alter the TCP settings as a TCP session traverses the network (such
   as window size, MSS, etc.).  Network providers (and NEMs) are
   wrestling with the end-to-end complexities of the above, and there
   is a strong interest in the standardization of a test methodology to
   validate end-to-end TCP performance (as this is the precursor to
   acceptable end-user application performance).

   So the intent behind this draft TCP throughput work is to define a
   methodology for testing sustained TCP layer performance.  In this
   document, sustained TCP throughput is that amount of data per unit
   time that TCP transports during equilibrium (steady state), i.e.
   after the initial slow start phase.  We refer to this state as TCP
   Equilibrium, and the equilibrium throughput is the maximum
   achievable for the TCP connection(s).

   One other important note: the precursor to conducting this TCP test
   methodology is to perform "network stress tests" such as RFC2544
   Layer 2/3 tests or other conventional tests (OWAMP, etc.).  It is
   highly recommended to run traditional Layer 2/3 tests to verify the
   integrity of the network before conducting TCP testing.

2. Goals of this Methodology

   Before defining the goals of this methodology, it is important to
   clearly define the areas that are not intended to be measured or
   analyzed by such a methodology.

   - The methodology is not intended to predict TCP throughput
   behavior during the transient stages of a TCP connection, such
   as initial slow start.

   - The methodology is not intended to definitively benchmark TCP
   implementations of one OS against another, although some users may
   find some value in conducting qualitative experiments.

   - The methodology is not intended to provide detailed diagnosis
   of problems within end-points or the network itself as related to
   non-optimal TCP performance, although a results interpretation
   section for each test step may provide insight into potential
   issues within the network.

   In contrast to the above exclusions, the goals of this methodology
   are to define a method to conduct a structured, end-to-end
   assessment of sustained TCP performance within a managed business
   class IP network.  A key goal is to establish a set of "best
   practices" that an engineer should apply when validating the
   ability of a managed network to carry end-user TCP applications.
   Some specific goals are to:

   - Provide a practical test approach that specifies the more well
   understood (and end-user configurable) TCP parameters such as window
   size, MSS, and number of connections, and how these affect the
   outcome of TCP performance over a network.

   - Provide specific test conditions (link speed, RTT, window size,
   etc.) and the maximum achievable TCP throughput under TCP
   Equilibrium conditions.  For guideline purposes, provide examples of
   these test conditions and the maximum achievable TCP throughput
   during the equilibrium state.  Section 2.1 provides specific details
   concerning the definition of TCP Equilibrium within the context of
   this draft.

   - In test situations where the recommended procedure does not yield
   the maximum achievable TCP throughput result, this draft provides
   some possible areas within the end host or network that should be
   considered for investigation (although again, this draft is not
   intended to provide a detailed diagnosis of these issues).

2.1 TCP Equilibrium State Throughput

   TCP connections have three (3) fundamental congestion window phases
   as documented in RFC2581.  These states are:

   - Slow Start, which occurs during the beginning of a TCP
   transmission or after a retransmission time out event.

   - Congestion avoidance, which is the phase during which TCP ramps up
   to establish the maximum attainable throughput on an end-to-end
   network path.  Retransmissions are a natural by-product of the TCP
   congestion avoidance algorithm as it seeks to achieve maximum
   throughput on the network path.

   - Retransmission phase, which includes Fast Retransmit (Tahoe) and
   Fast Recovery (Reno and New Reno).  When a packet is lost, the
   congestion avoidance phase transitions to a Fast Retransmission or
   Recovery phase dependent upon the TCP implementation.

   The following diagram depicts these states.

            |                 ssthresh
   TCP      |                |
   Through- |                |   Equilibrium
   put      |                |\ /\/\/\/\/\   Retransmit          /\/\ ...
            |                | \/          |  Time-out          /
            |                |             |         _______  _/
            |   Slow        _/|            |        /       | Slow _/
            |   Start      _/  Congestion  |/               |Start_/ Congestion
            |             _/   Avoidance  Loss              | _/     Avoidance
            |            _/               Event             |_/
            |           _/                                  |/
            |/____________________________________________________________
                                                                      Time

   This TCP methodology provides guidelines to measure the equilibrium
   throughput, which refers to the maximum sustained rate obtained by
   congestion avoidance before packet loss conditions occur (which
   would cause the state change from congestion avoidance to a
   retransmission phase).  All maximum achievable throughputs specified
   in Section 3 are with respect to this Equilibrium state.

3. TCP Throughput Testing Methodology

   This section summarizes the specific test methodology to achieve the
   goals listed in Section 2.

   As stated in Section 1, it is considered best practice to verify
   the integrity of the network by conducting Layer 2/3 stress tests
   such as RFC2544 or other methods of network stress tests.  If the
   network is not performing properly in terms of packet loss, jitter,
   etc., then the TCP layer testing will not be meaningful, since the
   equilibrium throughput would be very difficult to achieve (in a
   "dysfunctional" network).

   The following provides the sequential order of steps to conduct the
   TCP throughput testing methodology:

   1. Identify the Path MTU.
   Packetization Layer Path MTU Discovery (PLPMTUD, RFC4821) should be
   conducted to verify the minimum network path MTU.  Conducting
   PLPMTUD establishes the upper limit for the MSS to be used in
   subsequent steps.

   2. Baseline Round-trip Delay and Bandwidth.  These measurements
   provide estimates of the ideal TCP window size, which will be used
   in subsequent test steps.

   3. Single TCP Connection Throughput Tests.  With baseline
   measurements of round trip delay and bandwidth, a series of single
   connection TCP throughput tests can be conducted to baseline the
   performance of the network against expectations.

   4. TCP MSS Throughput Testing.  By varying the MSS size of the TCP
   connection, the ability of the network to sustain expected TCP
   throughput can be verified.

   5. Multiple TCP Connection Throughput Tests.  Single connection TCP
   testing is a useful first step to measure expected versus actual TCP
   performance.  The multiple connection test more closely emulates
   customer traffic, which comprises many TCP connections over a
   network link.

   Important to note are some of the key characteristics and
   considerations for the TCP test instrument.  The test host may be a
   standard computer or a dedicated communications test instrument, and
   these TCP test hosts must be capable of emulating both a client and
   a server.  As a general rule of thumb, testing TCP throughput at
   rates greater than 250-500 Mbit/sec generally requires high
   performance server hardware or dedicated hardware based test tools.

   Whether the TCP test host is a standard computer or dedicated test
   instrument, the following areas should be considered when selecting
   a test host:

   - TCP implementation used by the test host OS, i.e. Linux OS kernel
   using TCP Reno, TCP options supported, etc.  This will obviously be
   more important when using custom test equipment where the TCP
   implementation may be customized or tuned to run in higher
   performance hardware.

   - Most importantly, the TCP test host must be capable of generating
   and receiving stateful TCP test traffic at the full link speed of
   the network under test.  This requirement is stringent and may
   require custom test equipment, especially on 1 GigE and 10 GigE
   networks.

3.1. Determine Network Path MTU

   TCP implementations should use Path MTU Discovery techniques
   (PMTUD), but this technique does not always prove reliable in real
   world situations.  Because PMTUD relies on ICMP messages (to inform
   the host that unfragmented transmission cannot occur), it fails when
   network managers completely disable ICMP.

   Increasingly, network providers and enterprises are instituting
   fixed MTU sizes on the hosts to eliminate TCP fragmentation issues
   in the application.

   Packetization Layer Path MTU Discovery (PLPMTUD, RFC4821) should be
   conducted to verify the minimum network path MTU.  Conducting the
   PLPMTUD test establishes the upper limit upon the MTU, which in turn
   establishes the upper limit for the MSS in the subsequent test
   steps.
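   For a quick sanity check of the path MTU from a standard host, the
   kernel's current estimate can be read directly.  The following
   Python sketch is illustrative only: it uses Linux-specific socket
   options, reflects classical PMTUD (and therefore inherits the ICMP
   caveat above rather than implementing PLPMTUD), and assumes a
   hypothetical far-end test host.

      import socket

      def kernel_path_mtu(host, port=5001):
          # Read the kernel's current path MTU estimate toward 'host'
          # (Linux-specific; depends on ICMP feedback, may be stale).
          s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          try:
              s.setsockopt(socket.IPPROTO_IP, socket.IP_MTU_DISCOVER,
                           socket.IP_PMTUDISC_DO)   # set DF on probes
              s.connect((host, port))
              try:
                  s.send(b"\0" * 9000)              # oversized probe
              except OSError:
                  pass                              # EMSGSIZE is expected
              return s.getsockopt(socket.IPPROTO_IP, socket.IP_MTU)
          finally:
              s.close()

      mtu = kernel_path_mtu("testhost.example.net")
      print(mtu, "path MTU;", mtu - 40, "max MSS (IPv4, no options)")

   A dedicated PLPMTUD implementation per RFC4821 remains the
   recommended approach when ICMP cannot be relied upon.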
3.2. Baseline Round-trip Delay and Bandwidth

   Before stateful TCP testing can begin, it is important to baseline
   the round trip delay and bandwidth of the network to be tested.
   These measurements provide estimates of the ideal TCP window size,
   which will be used in subsequent test steps.

   These latency and bandwidth tests should be run over a long enough
   period of time to characterize the performance of the network over
   the course of a meaningful time period.  One example would be to
   take samples during various times of the work day.  The goal would
   be to determine a representative minimum, average, and maximum RTD
   and bandwidth for the network under test.  Topology changes are to
   be avoided during this time of initial convergence (e.g. in crossing
   BGP4 boundaries).

   In some cases, baselining bandwidth may not be required, since a
   network provider's end-to-end topology may be well enough defined.

3.2.1 Techniques to Measure Round Trip Time

   We follow the definitions used in the references of the appendix;
   hence Round Trip Time (RTT) is the time elapsed between the clocking
   in of the first bit of a payload packet and the receipt of the last
   bit of the corresponding acknowledgement.  Round Trip Delay (RTD) is
   used synonymously for twice the link latency.

   In any method used to baseline round trip delay between network
   end-points, it is important to realize that network latency is the
   sum of inherent network delay and congestion.  The RTT should be
   baselined during "off-peak" hours to obtain a reliable figure for
   network latency (versus additional delay caused by congestion).

   During the actual sustained TCP throughput tests, it is critical to
   measure RTT along with the measured TCP throughput.  Congestive
   effects can be isolated if RTT is concurrently measured.

   This is not meant to be an exhaustive list, but it summarizes some
   of the more common ways to determine round trip time (RTT) through
   the network.  The desired resolution of the measurement (i.e. msec
   versus usec) may dictate whether the RTT measurement can be achieved
   with standard tools such as ICMP ping techniques or whether
   specialized test equipment with high precision timers is required.
   The objective of this section is to list several techniques in order
   of decreasing accuracy.

   - Use test equipment on each end of the network, "looping" the
   far-end tester so that a packet stream can be measured end-to-end.
   This test equipment RTT measurement may be compatible with the delay
   measurement protocols specified in RFC5357.

   - Conduct packet captures of TCP test applications using, for
   example, "iperf" or FTP.  By running multiple experiments, the
   packet captures can be studied to estimate RTT based upon the
   SYN -> SYN-ACK handshakes within the TCP connection set-up.

   - ICMP pings may also be adequate to provide round trip time
   estimations.  Some limitations of ICMP ping are the msec resolution
   and whether the network elements respond to pings (or block them).
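   As one illustration of the handshake-based estimate above,
   connection establishment can be timed directly: connect() returns
   after the SYN / SYN-ACK exchange, so the elapsed time approximates
   one RTT (millisecond resolution, plus local stack overhead).  The
   Python sketch below is illustrative only and assumes a reachable TCP
   service at a hypothetical far-end host and port.

      import socket, statistics, time

      def handshake_rtt_ms(host, port=80, samples=5):
          # Time the TCP three-way handshake; connect() returns after
          # the SYN / SYN-ACK exchange completes.
          rtts = []
          for _ in range(samples):
              s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
              s.settimeout(5.0)
              t0 = time.perf_counter()
              try:
                  s.connect((host, port))
                  rtts.append((time.perf_counter() - t0) * 1000.0)
              finally:
                  s.close()
              time.sleep(0.2)
          return min(rtts), statistics.mean(rtts), max(rtts)

      # Hypothetical far-end test host; report min/avg/max RTT in msec
      print(handshake_rtt_ms("testhost.example.net", 5001))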
3.2.2 Techniques to Measure End-end Bandwidth

   There are many well established techniques available to provide
   estimated measures of bandwidth over a network.  This measurement
   should be conducted in both directions of the network, especially
   for access networks, which are inherently asymmetrical.  Some of the
   asymmetric implications to TCP performance are documented in
   RFC3449, and the results of this work will be further studied to
   determine relevance to this draft.

   The bandwidth measurement test must be run with stateless IP streams
   (not stateful TCP) in order to determine the available bandwidth in
   each direction.  This test should be performed at various intervals
   throughout a business day (or even across a week).  Ideally, the
   bandwidth test should produce a log output of the bandwidth achieved
   across the test interval AND the round trip delay.

   Also, during the actual TCP level performance measurements (Sections
   3.3 - 3.5), the test tool must be able to track the round trip time
   of the TCP connection(s) during the test.  Measuring round trip time
   variation (a.k.a. "jitter") provides insight into the effects of
   congestive delay on the sustained throughput achieved for the TCP
   layer test.

3.3. Single TCP Connection Throughput Tests

   This draft specifically defines TCP throughput techniques to verify
   sustained TCP performance in a managed business network.  As defined
   in Section 2.1, the equilibrium throughput reflects the maximum rate
   achieved by a TCP connection within the congestion avoidance phase
   on an end-to-end network path.  This section and the following
   sections define the method to conduct these sustained throughput
   tests and provide guidelines for the predicted results.

   With the baseline measurements of round trip time and bandwidth from
   Section 3.2, a series of single connection TCP throughput tests can
   be conducted to baseline the performance of the network against
   expectations.  The optimum TCP window size can be calculated from
   the bandwidth delay product (BDP), which is:

   BDP = RTT x Bandwidth

   By dividing the BDP by 8, the "ideal" TCP window size is calculated
   in bytes.  An example would be a T3 link with 25 msec RTT.  The BDP
   would equal ~1,105,000 bits and the ideal TCP window would equal
   ~138,000 bytes.

   The following table provides some representative network link
   speeds, latency, BDP, and the associated "optimum" TCP window size.
   Sustained TCP transfers should reach nearly 100% throughput, minus
   the overhead of Layers 1-3 and the divisor of the MSS into the
   window.

   For this single connection baseline test, the MSS size will affect
   the achieved throughput (especially for smaller TCP window sizes).
   Table 3.2 provides the achievable, equilibrium TCP throughput (at
   Layer 4) using a 1000-byte MSS.  Also in this table, the case of 58
   bytes of L1-L4 overhead (including the Ethernet CRC32) is used for
   simplicity.

   Table 3.2: Link Speed, RTT and calculated BDP, TCP Throughput

   Link                           Ideal TCP        Maximum Achievable
   Speed*  RTT (ms)  BDP (bits)   Window (kbytes)  TCP Throughput (Mbps)
   ----------------------------------------------------------------------
   T1        20         30,720         3.84               1.20
   T1        50         76,800         9.60               1.44
   T1       100        153,600        19.20               1.44
   T3        10        442,100        55.26              41.60
   T3        15        663,150        82.89              41.13
   T3        25      1,105,250       138.16              41.92
   T3(ATM)   10        407,040        50.88              32.44
   T3(ATM)   15        610,560        76.32              32.44
   T3(ATM)   25      1,017,600       127.20              32.44
   100M       1        100,000        12.50              90.699
   100M       2        200,000        25.00              92.815
   100M       5        500,000        62.50              90.699
   1Gig     0.1        100,000        12.50             906.991
   1Gig     0.5        500,000        62.50             906.991
   1Gig       1      1,000,000       125.00             906.991
   10Gig   0.05        500,000        62.50           9,069.912
   10Gig    0.3      3,000,000       375.00           9,069.912

   * Note that the link speed is the minimum link speed through the
   network, i.e. a WAN path whose slowest link is a T1, etc.
   Also, the following link speeds (available payload bandwidth) were
   used for the WAN entries:

   - T1 = 1.536 Mbits/sec (B8ZS line encoding facility)
   - T3 = 44.21 Mbits/sec (C-Bit Framing)
   - T3(ATM) = 36.86 Mbits/sec (C-Bit Framing & PLCP, 96,000 cells per
     second)

   The calculation method used in this document is a three-step
   process:

   1 - We determine the optimal TCP window size value based on the
   optimal quantity of "in-flight" octets discovered by the BDP
   calculation, taking into consideration that the TCP window size has
   to be an exact multiple of the MSS.

   2 - We then calculate the achievable Layer 2 throughput by dividing
   the value determined in step 1 by the MSS (giving the number of
   in-flight frames), multiplying by the full frame size
   (MSS + L2 + L3 + L4 overheads), and dividing by the RTT.

   3 - Finally, we multiply the calculated value of step 2 by the ratio
   of the MSS to (MSS + L2 + L3 + L4 overheads).

   This gives us the achievable TCP throughput value.  Sometimes the
   maximum achievable throughput is limited by the maximum achievable
   quantity of Ethernet frames per second on the physical media; in
   that case this frame rate is used in step 2 instead of the
   window-derived value.  A scripted example of this calculation is
   provided at the end of this section.

   There are several TCP tools that are commonly used in the network
   provider world, and one of the most common is the "iperf" tool.
   With this tool, hosts are installed at each end of the network
   segment; one as client and the other as server.  The TCP window size
   of both the client and the server can be manually set, and the
   achieved throughput is measured, either uni-directionally or
   bi-directionally.  For higher BDP situations in lossy networks (long
   fat networks, satellite links, etc.), TCP options such as Selective
   Acknowledgment should be considered and also become part of the
   window size / throughput characterization.

   The following table shows the achievable TCP throughput on a T3 with
   the default Windows 2000/XP TCP window size of 17520 bytes.

      RTT (ms)    Achievable TCP Throughput (Mbps)
      --------------------------------------------
         10                  14.48
         15                   9.65
         25                   5.79

   The following table shows the achievable TCP throughput on a 25 ms
   T3 when the TCP window size is increased, using the RFC1323 TCP
   window scaling option.

      TCP Window (KBytes)    Achievable TCP Throughput (Mbps)
      -------------------------------------------------------
         16                            5.31
         32                           10.62
         64                           21.23
        128                           42.47

   The single connection TCP throughput test must be run over a long
   duration, and results must be logged at the desired interval.  The
   test must record RTT and TCP retransmissions at each interval.

   This correlation of retransmissions and RTT over the course of the
   test will clearly identify which portions of the transfer reached
   the TCP Equilibrium state and to what extent increased RTT
   (congestive effects) may have caused reduced equilibrium
   performance.

   Host hardware performance must be well understood before conducting
   this TCP single connection test and the other tests in this section.
   Dedicated test equipment may be required, especially for line rates
   of GigE and 10 GigE.
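   As noted above, the window and throughput calculation can be
   scripted.  The following Python sketch is for illustration only; the
   function names are ours, it assumes the simplified 58-byte L2-L4
   overhead of Table 3.2, and its results approximate (but do not
   exactly reproduce) the table entries, which use slightly different
   rounding.

      def ideal_window_bytes(link_bps, rtt_s, mss):
          # Step 1: BDP in octets, rounded down to an exact multiple
          # of the MSS.
          bdp_bytes = (link_bps * rtt_s) / 8.0
          return int(bdp_bytes // mss) * mss

      def achievable_tcp_throughput_bps(link_bps, rtt_s, mss,
                                        overhead=58):
          # 'overhead' is the L2+L3+L4 header total per segment.
          segments = ideal_window_bytes(link_bps, rtt_s, mss) // mss
          # Step 2: frames per second, bounded by the window and by
          # what the physical link itself can carry.
          fps = min(segments / rtt_s,
                    link_bps / ((mss + overhead) * 8.0))
          layer2_bps = fps * (mss + overhead) * 8.0
          # Step 3: scale Layer 2 throughput by the payload ratio.
          return layer2_bps * mss / (mss + overhead)

      # Example: T3 (44.21 Mbit/s payload), 25 ms RTT, 1000-byte MSS
      print(ideal_window_bytes(44.21e6, 0.025, 1000))     # ~138,000 bytes
      print(achievable_tcp_throughput_bps(44.21e6, 0.025,
                                          1000) / 1e6)    # ~41.8 Mbps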
3.3.1 Interpretation of the Single Connection TCP Throughput Results

   At the end of this step, the user will document the theoretical BDP
   and a set of window size experiments with the measured TCP
   throughput for each TCP window size setting.  For cases where the
   sustained TCP throughput does not equal the predicted value, some
   possible causes are listed:

   - Network congestion causing packet loss
   - Network congestion not causing packet loss, but effectively
     increasing the size of the required TCP window during the transfer
   - Intermediate network devices which actively regenerate the TCP
     connection and can alter window size, MSS, etc.

3.4. TCP MSS Throughput Testing

   This test setup should be conducted as a single TCP connection test.
   By varying the MSS size of the TCP connection, the ability of the
   network to sustain expected TCP throughput can be verified.  This is
   similar to the frame and packet size techniques within RFC2544,
   which aim to determine the ability of the routing/switching devices
   to handle loads in terms of packets/frames per second at various
   frame and packet sizes.  This test can also further characterize the
   performance of a network in the presence of active TCP elements
   (proxies, etc.), devices that fragment IP packets, and the actual
   end hosts themselves (servers, etc.).

3.4.1 MSS Size Testing Method

   The single connection testing listed in Section 3.3 should be
   repeated, using the appropriate window size and collecting
   throughput measurements for various MSS sizes.

   The following are typical MSS settings for various link speeds:

   - 256 bytes for very low speed links such as 9.6 Kbps (per RFC1144)
   - 536 bytes for low speed links (per RFC879)
   - 966 bytes for SLIP high speed (per RFC1055)
   - 1380 bytes for IPSec VPN tunnel testing
   - 1452 bytes for PPPoE connectivity (per RFC2516)
   - 1460 bytes for Ethernet and Fast Ethernet (per RFC895)
   - 8960 byte jumbo frames for GigE

   Using the optimum window size determined by conducting the steps of
   Sections 3.2 and 3.3, a variety of MSS sizes should be tested
   according to the link speed under test.  Using Fast Ethernet with
   5 msec RTT as an example, the optimum TCP window size would be
   62.5 kbytes and the recommended MSS for Fast Ethernet is 1460 bytes.

   Link                 Achievable TCP Throughput (Mbps) for
   Speed   RTT(ms)  MSS=1000 MSS=1260 MSS=1300 MSS=1380 MSS=1420 MSS=1460
   ----------------------------------------------------------------------
   T1        20   |   1.20     1.008    1.040    1.104    1.136    1.168
   T1        50   |   1.44     1.411    1.456    1.335    1.363    1.402
   T1       100   |   1.44     1.512    1.456    1.435    1.477    1.402
   T3        10   |  41.60    42.336   42.640   41.952   40.032   42.048
   T3        15   |  42.13    42.336   42.293   42.688   42.411   42.048
   T3        25   |  41.92    42.336   42.432   42.394   42.714   42.515
   T3(ATM)   10   |  32.44    33.815   34.477   35.482   36.022   36.495
   T3(ATM)   15   |  32.44    34.120   34.477   35.820   36.022   36.127
   T3(ATM)   25   |  32.44    34.363   34.860   35.684   36.022   36.274
   100M       1   |  90.699   89.093   91.970   86.866   89.424   91.982
   100M       2   |  92.815   93.226   93.275   88.505   90.973   93.442
   100M       5   |  90.699   92.481   92.697   88.245   90.844   93.442
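   Rows of these MSS tables can be approximated with the same
   calculation sketched at the end of Section 3.3.  A compact,
   self-contained Python version follows; it is illustrative only,
   assumes the simplified 58-byte L2-L4 overhead, and will differ
   slightly from table cells that use different rounding.

      def tcp_tput_mbps(link_bps, rtt_s, mss, overhead=58):
          # Window = largest exact multiple of the MSS within the BDP.
          segments = int(link_bps * rtt_s / 8.0) // mss
          # Frames/s bounded by the window and by the link rate.
          fps = min(segments / rtt_s,
                    link_bps / ((mss + overhead) * 8.0))
          return fps * mss * 8.0 / 1e6

      for mss in (1000, 1260, 1300, 1380, 1420, 1460):
          # T3 (44.21 Mbit/s payload) at 25 ms RTT
          print(mss, round(tcp_tput_mbps(44.21e6, 0.025, mss), 3))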
   For GigE and 10 GigE, jumbo frames (9000 bytes) are becoming more
   common.  The following table adds jumbo frames to the possible MSS
   values.

   Link                 Achievable TCP Throughput (Mbps) for
   Speed  RTT(ms)  MSS=1260 MSS=1300 MSS=1380 MSS=1420 MSS=1460 MSS=8960
   ----------------------------------------------------------------------
   1Gig    0.1  |   924.812  926.966  882.495  894.240  919.819  713.786
   1Gig    0.5  |   924.812  926.966  930.922  932.743  934.467  856.543
   1Gig    1.0  |   924.812  926.966  930.922  932.743  934.467  927.922
   10Gig   0.05 |  9248.125 9269.655 9309.218 9839.790 9344.671 8565.435
   10Gig   0.3  |  9248.125 9269.655 9309.218 9839.790 9344.671 9755.079

   Each row in the table is a separate test that should be conducted
   over a predetermined test interval, with the throughput,
   retransmissions, and RTT logged during the entire test interval.

3.4.2 Interpretation of TCP MSS Throughput Results

   For cases where the measured TCP throughput does not equal the
   throughput predicted for a given MSS, some possible causes are
   listed:

   - TBD

3.5. Multiple TCP Connection Throughput Tests

   After baselining the network under test with a single TCP connection
   (Section 3.3), the nominal capacity of the network has been
   determined.  The capacity measured in Section 3.3 may be a capacity
   range, and it is reasonable that some level of tuning may have been
   required (i.e. router shaping techniques employed, intermediary
   proxy-like devices tuned, etc.).

   Single connection TCP testing is a useful first step to measure
   expected versus actual TCP performance and as a means to diagnose
   and tune issues in the network and active elements.  However, the
   ultimate goal of this methodology is to more closely emulate
   customer traffic, which comprises many TCP connections over a
   network link.  This methodology also seeks to provide the framework
   for testing stateful TCP connections in concurrence with stateless
   traffic streams.

3.5.1 Multiple TCP Connections - below Link Capacity

   First, the ability of the network to carry multiple TCP connections
   to full network capacity should be tested.  Prioritization and QoS
   settings are not considered during this step, since the network
   capacity is not to be exceeded by the test traffic (Section 3.5.2
   covers the over capacity test case).

   For this multiple connection TCP throughput test, the number of
   connections will more than likely be limited by the test tool (host
   versus dedicated test equipment).  As an example, for a GigE link
   with 1 msec RTT, the optimum TCP window would equal ~128 KBytes.  So
   under this condition, 8 concurrent connections with a window size
   equal to 16 KB would fill the GigE link.  For 10G, 80 connections
   would be required to accomplish the same.

   Just as in Section 3.3, the end host or test tool cannot be the
   processing bottleneck, or the throughput measurements will not be
   valid.  The test tool must be benchmarked in ideal lab conditions to
   verify its ability to transfer stateful TCP traffic at the given
   network line rate.

   This test step should be conducted over a reasonable test duration,
   and results should be logged per interval, such as throughput per
   connection, RTT, and retransmissions.

   Since the network is not to be driven into over capacity (by nature
   of the BDP allocated evenly to each connection), this test verifies
   the ability of the network to carry multiple TCP connections up to
   the link speed of the network.
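   As a rough aid for sizing this test, the number of concurrent
   connections needed to fill a link for a given per-connection window
   can be estimated by dividing the BDP by that window.  The Python
   sketch below is illustrative only and simply reproduces the GigE
   example above.

      import math

      def connections_to_fill(link_bps, rtt_s, window_bytes):
          # Aggregate window across all connections should cover the
          # bandwidth delay product (BDP).
          bdp_bytes = (link_bps * rtt_s) / 8.0
          return math.ceil(bdp_bytes / window_bytes)

      # GigE link, 1 ms RTT, 16 KB per-connection window
      print(connections_to_fill(1e9, 0.001, 16 * 1024))   # -> 8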
3.5.2 Multiple TCP Connections - over Link Capacity

   In this step, the network bandwidth is intentionally exceeded with
   multiple TCP connections to test the expected prioritization and
   queuing within the network.

   All conditions related to the Section 3.3 set-up apply, especially
   the ability of the test hosts to transfer stateful TCP traffic at
   network line rates.

   Using the same example from Section 3.3, a GigE link with 1 msec RTT
   would require a window size of 128 KB to fill the link (with one TCP
   connection).  Assuming a 16 KB window, 8 concurrent connections
   would fill the GigE link capacity, and values higher than 8 would
   over-subscribe the network capacity.  The user would select values
   to over-subscribe the network (i.e. possibly 10, 15, 20, etc.) to
   conduct experiments to verify proper prioritization and queuing
   within the network.

3.5.3 Interpretation of Multiple TCP Connection Test Results

   Without any prioritization in the network, the over-subscribed test
   results could assist in queuing studies.  With proper queuing, the
   bandwidth should be shared in a reasonable manner.  The author
   understands that the term "reasonable" is too wide open, and future
   versions of this memo will attempt to quantify this sharing in more
   tangible terms.  It is known that if a network element is not set
   for proper queuing (i.e. FIFO), then an over-subscribed TCP
   connection test will generally show a very uneven distribution of
   bandwidth.

   With prioritization in the network, different TCP connections can be
   assigned various QoS settings via the various mechanisms (i.e. per
   VLAN, DSCP, etc.), and the higher priority connections must be
   verified to achieve the expected throughput.

4. Acknowledgements

   The author would like to thank Gilles Forget, Loki Jorgenson, and
   Reinhard Schrage for technical review and contributions to this
   draft-00 memo.

   Also thanks to Matt Mathis and Matt Zekauskas for many good comments
   through email exchange and for pointing me to great sources of
   information pertaining to past works in the TCP capacity area.

5. References

   [RFC2581]  Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
              Control", RFC 2581, April 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC3449]  Balakrishnan, H., Padmanabhan, V.
              N., Fairhurst, G., and M. Sooriyabandara, "TCP
              Performance Implications of Network Path Asymmetry",
              RFC 3449, December 2002.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and
              J. Babiarz, "A Two-Way Active Measurement Protocol
              (TWAMP)", RFC 5357, October 2008.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, March 2007.

   [draft-ietf-ippm-btc-cap-00.txt]
              Allman, M., "A Bulk Transfer Capacity Methodology for
              Cooperating Hosts", August 2001.

   [MSMO]     Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
              Macroscopic Behavior of the TCP Congestion Avoidance
              Algorithm", SIGCOMM Computer Communication Review,
              Volume 27, Issue 3, July 1997.

   [Stevens Vol1]
              Stevens, W. R., "TCP/IP Illustrated, Volume 1: The
              Protocols", Addison-Wesley.

Authors' Addresses

   Barry Constantine
   JDSU, Test and Measurement Division
   One Milestone Center Court
   Germantown, MD 20876-7100
   USA

   Phone: +1 240 404 2227
   Email: barry.constantine@jdsu.com

   Gilles Forget
   Independent Consultant to Bell Canada.
   308, rue de Monaco, St-Eustache
   Qc. CANADA, Postal Code: J7P-4T5

   Phone: (514) 895-8212
   Email: gilles.forget@sympatico.ca

   Loki Jorgenson
   Apparent Networks

   Phone: (604) 433-2333 ext 105
   Email: ljorgenson@apparentnetworks.com

   Reinhard Schrage
   Schrage Consulting

   Phone: +49 (0) 5137 909540
   Email: reinhard@schrageconsult.com