2 Network Working Group D. Hayes 3 Internet-Draft University of Oslo 4 Intended status: Informational D. Ros 5 Expires: January 4, 2015 Telecom Bretagne 6 L.L.H. Andrew 7 CAIA Swinburne University of Technology 8 S.
Floyd 9 ICSI 10 July 3, 2014 12 Common TCP Evaluation Suite 13 draft-irtf-iccrg-tcpeval-00 15 Abstract 17 This document presents an evaluation test suite for the initial assess- 18 ment of proposed TCP modifications. The goal of the test suite is to 19 allow researchers to quickly and easily evaluate their proposed TCP 20 extensions in simulators and testbeds using a common set of well- 21 defined, standard test cases, in order to compare and contrast proposals 22 against standard TCP as well as other proposed modifications. This test 23 suite is not intended to result in an exhaustive evaluation of a pro- 24 posed TCP modification or new congestion control mechanism. Instead, the 25 focus is on quickly and easily generating an initial evaluation report 26 that allows the networking community to understand and discuss the 27 behavioral aspects of a new proposal, in order to guide further experi- 28 mentation that will be needed to fully investigate the specific aspects 29 of a new proposal. 31 Status of This Memo 33 This Internet-Draft is submitted in full conformance with the 34 provisions of BCP 78 and BCP 79. 36 Internet-Drafts are working documents of the Internet Engineering 37 Task Force (IETF). Note that other groups may also distribute 38 working documents as Internet-Drafts. The list of current Internet- 39 Drafts is at http://datatracker.ietf.org/drafts/current/. 41 Internet-Drafts are draft documents valid for a maximum of six months 42 and may be updated, replaced, or obsoleted by other documents at any 43 time. It is inappropriate to use Internet-Drafts as reference 44 material or to cite them other than as "work in progress." 46 This Internet-Draft will expire on January 4, 2015. 48 Copyright Notice 50 Copyright (c) 2014 IETF Trust and the persons identified as the 51 document authors. All rights reserved. 53 This document is subject to BCP 78 and the IETF Trust's Legal 54 Provisions Relating to IETF Documents 55 (http://trustee.ietf.org/license-info) in effect on the date of 56 publication of this document. Please review these documents 57 carefully, as they describe your rights and restrictions with respect 58 to this document. Code Components extracted from this document must 59 include Simplified BSD License text as described in Section 4.e of 60 the Trust Legal Provisions and are provided without warranty as 61 described in the Simplified BSD License. 63 Table of Contents 65 1 Introduction . . . . . . . . . . . . . . . . . . . . . . 3 66 2 Traffic generation . . . . . . . . . . . . . . . . . . . 3 67 2.1 Desirable model characteristics . . . . . . . . . . . . . 4 68 2.2 Tmix . . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 2.2.1 Base Tmix trace files for tests . . . . . . . . . . . . . 5 70 2.3 Loads . . . . . . . . . . . . . . . . . . . . . . . . . . 5 71 2.3.1 Varying the Tmix traffic load . . . . . . . . . . . . . . 5 72 2.3.1.1 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . 5 73 2.3.2 Dealing with non-stationarity . . . . . . . . . . . . . . 6 74 2.3.2.1 Bin size . . . . . . . . . . . . . . . . . . . . . . . . 6 75 2.3.2.2 NS2 implementation specifics . . . . . . . . . . . . . . 6 76 2.4 Packet size distribution . . . . . . . . . . . . . . . . 6 77 2.4.1 Potential revision . . . . . . . . . . . . . . . . . . . 7 78 3 Achieving reliable results in minimum time . . . . . . . 7 79 3.1 Background . . . . . . . . . . . . . . . . . . . . . . . 7 80 3.2 Equilibrium or Steady State . . . . . . . . . . . . . . . 7 81 3.2.1 Note on the offered load in NS2 . 
. . . . . . . . . . . . 8 82 3.3 Accelerated test start up time . . . . . . . . . . . . . 8 83 4 Basic scenarios . . . . . . . . . . . . . . . . . . . . . 9 84 4.1 Basic topology . . . . . . . . . . . . . . . . . . . . . 9 85 4.2 Traffic . . . . . . . . . . . . . . . . . . . . . . . . . 9 86 4.3 Flows under test . . . . . . . . . . . . . . . . . . . . 11 87 4.4 Scenarios . . . . . . . . . . . . . . . . . . . . . . . . 11 88 4.4.1 Data Center . . . . . . . . . . . . . . . . . . . . . . . 11 89 4.4.1.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 11 90 4.4.2 Access Link . . . . . . . . . . . . . . . . . . . . . . . 12 91 4.4.2.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 12 92 4.4.3 Trans-Oceanic Link . . . . . . . . . . . . . . . . . . . 12 93 4.4.4 Geostationary Satellite . . . . . . . . . . . . . . . . . 12 94 4.4.5 Wireless LAN . . . . . . . . . . . . . . . . . . . . . . 13 95 4.4.5.1 NS2 implementation specifics . . . . . . . . . . . . . . 14 96 4.4.5.2 Potential revisions . . . . . . . . . . . . . . . . . . . 15 97 4.4.6 Dial-up Link . . . . . . . . . . . . . . . . . . . . . . 15 98 4.4.6.1 Note on parameters . . . . . . . . . . . . . . . . . . . 15 99 4.4.6.2 Potential revisions . . . . . . . . . . . . . . . . . . . 16 100 4.5 Metrics of interest . . . . . . . . . . . . . . . . . . . 16 101 4.6 Potential Revisions . . . . . . . . . . . . . . . . . . . 17 102 5 Latency specific experiments . . . . . . . . . . . . . . 17 103 5.1 Delay/throughput tradeoff as function of queue size 104 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 105 5.1.1 Topology . . . . . . . . . . . . . . . . . . . . . . . . 17 106 5.1.1.1 Potential revisions . . . . . . . . . . . . . . . . . . . 18 107 5.1.2 Flows under test . . . . . . . . . . . . . . . . . . . . 18 108 5.1.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 18 110 D. Hayes et. al. [Page 2a] 112 5.2 Ramp up time: completion time of one flow . . . . . . . . 18 113 5.2.1 Topology and background traffic . . . . . . . . . . . . . 19 114 5.2.2 Flows under test . . . . . . . . . . . . . . . . . . . . 20 115 5.2.2.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 20 116 5.2.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 20 117 5.3 Transients: release of bandwidth, arrival of many 118 flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 119 5.3.1 Topology and background traffic . . . . . . . . . . . . . 21 120 5.3.2 Flows under test . . . . . . . . . . . . . . . . . . . . 22 121 5.3.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 22 122 6 Throughput- and fairness-related experiments . . . . . . 22 123 6.1 Impact on standard TCP traffic . . . . . . . . . . . . . 22 124 6.1.1 Topology and background traffic . . . . . . . . . . . . . 23 125 6.1.2 Flows under test . . . . . . . . . . . . . . . . . . . . 23 126 6.1.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 23 127 6.1.3.1 Suggestions . . . . . . . . . . . . . . . . . . . . . . . 24 128 6.2 Intra-protocol and inter-RTT fairness . . . . . . . . . . 24 129 6.2.1 Topology and background traffic . . . . . . . . . . . . . 24 130 6.2.2 Flows under test . . . . . . . . . . . . . . . . . . . . 24 131 6.2.2.1 Intra-protocol fairness: . . . . . . . . . . . . . . . . 25 132 6.2.2.2 Inter-RTT fairness: . . . . . . . . . . . . . . . . . . . 25 133 6.2.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 25 134 6.3 Multiple bottlenecks . . . . . . . . . . . . . . . . . . 
25 135 6.3.1 Topology and traffic . . . . . . . . . . . . . . . . . . 25 136 6.3.1.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 26 137 6.3.2 Metrics of interest . . . . . . . . . . . . . . . . . . . 27 138 7 Implementations . . . . . . . . . . . . . . . . . . . . . 27 139 8 Acknowledgments . . . . . . . . . . . . . . . . . . . . 28 140 9 Bibliography . . . . . . . . . . . . . . . . . . . . . . 28 141 A Discussions on Traffic . . . . . . . . . . . . . . . . . 30 143 D. Hayes et. al. [Page 2b] 145 1 Introduction 147 This document describes a common test suite for the initial assessment 148 of new TCP extensions or modifications. It defines a small number of 149 evaluation scenarios, including traffic and delay distributions, network 150 topologies, and evaluation parameters and metrics. The motivation for 151 such an evaluation suite is to help researchers in evaluating their pro- 152 posed modifications to TCP. The evaluation suite will also enable inde- 153 pendent duplication and verification of reported results by others, 154 which is an important aspect of the scientific method that is not often 155 put to use by the networking community. A specific target is that the 156 evaluations should be able to be completed in a reasonable amount of 157 time by simulation, or with a reasonable amount of effort in a testbed. 159 It is not possible to provide TCP researchers with a complete set of 160 scenarios for an exhaustive evaluation of a new TCP extension; espe- 161 cially because the characteristics of a new extension will often require 162 experiments with specific scenarios that highlight its behavior. On the 163 other hand, an exhaustive evaluation of a TCP extension will need to 164 include several standard scenarios, and it is the focus of the test 165 suite described in this document to define this initial set of test 166 cases. 168 These scenarios generalize current characteristics of the Internet such 169 as round-trip times (RTT), propagation delays, and buffer sizes. It is 170 envisaged that as the Internet evolves these will need to be adjusted. 171 In particular, we expect buffer sizes will need to be adjusted as 172 latency becomes increasingly important. 174 The scenarios specified here are intended to be as generic as possible, 175 i.e., not tied to a particular simulation or emulation platform. How- 176 ever, when needed some details pertaining to implementation using a 177 given tool are described. 179 This document has evolved from a "round-table" meeting on TCP evalua- 180 tion, held at Caltech on November 8-9, 2007, reported in [1]. This doc- 181 ument is the first step in constructing the evaluation suite; the goal 182 is for the evaluation suite to be adapted in response to feedback from 183 the networking community. It revises draft-irtf-tmrg-tests-02. 185 Information related to the draft can be found at: 186 http://riteproject.eu/ietf-drafts 188 2 Traffic generation 190 Congestion control concerns the response of flows to bandwidth limita- 191 tions or to the presence of other flows. Cross-traffic and reverse-path 192 traffic are therefore important to the tests described in this suite. 194 Such traffic can have the desirable effect of reducing the occurrence of 195 pathological conditions, such as global synchronization among competing 196 flows, that might otherwise be mis-interpreted as normal average behav- 197 iours of those protocols [2,3]. 
This traffic must be reasonably realis- 198 tic for the tests to predict the behaviour of congestion control proto- 199 cols in real networks, and also well-defined so that statistical noise 200 does not mask important effects. 202 2.1 Desirable model characteristics 204 Most scenarios use traffic produced by a traffic generator, with a range 205 of start times for user sessions, connection sizes, and the like, mim- 206 icking the traffic patterns commonly observed in the Internet. It is 207 important that the same "amount" of congestion or cross-traffic be used 208 for the testing scenarios of different congestion control algorithms. 209 This is complicated by the fact that packet arrivals and even flow 210 arrivals are influenced by the behavior of the algorithms. For this rea- 211 son, a pure open-loop, packet-level generation of traffic where gener- 212 ated traffic does not respond to the behaviour of other present flows is 213 not suitable. Instead, emulating application or user behaviours at the 214 end points using reactive protocols such as TCP in a closed-loop fashion 215 results in a closer approximation of cross-traffic, where user behav- 216 iours are modeled by well-defined parameters for source inputs (e.g., 217 request sizes for HTTP), destination inputs (e.g., response size), and 218 think times between pairs of source and destination inputs. By setting 219 appropriate parameters for the traffic generator, we can emulate non- 220 greedy user-interactive traffic (e.g., HTTP 1.1, SMTP and Telnet), 221 greedy traffic (e.g., P2P and long file downloads), as well as long- 222 lived but non-greedy, non-interactive flows (or thin streams). 224 This approach models protocol reactions to the congestion caused by 225 other flows in the common paths, although it fails to model the reac- 226 tions of users themselves to the presence of congestion. A model that 227 includes end-users' reaction to congestion is beyond the scope of this 228 draft, but we invite researchers to explore how the user behavior, as 229 reflected in the connection sizes, user wait times, and number of con- 230 nections per session, might be affected by the level of congestion expe- 231 rienced within a session [4]. 233 2.2 Tmix 235 There are several traffic generators available that implement a similar 236 approach to that discussed above. For now, we have chosen to use the 237 Tmix [5] traffic generator. Tmix is available for the NS2 and NS3 simu- 238 lators, and can generate traffic for testbeds (for example GENI [6]). 240 Tmix represents each TCP connection by a connection vector consisting of 241 a sequence of (request-size, response-size, think-time) triples, thus 242 representing bi-directional traffic. Connection vectors used for traffic 243 generation can be obtained from Internet traffic traces. 245 2.2.1 Base Tmix trace files for tests 247 The traces currently defined for use in the test suite are based on cam- 248 pus traffic at the University of North Carolina (see [7] for a descrip- 249 tion of construction methods and basic statistics). 251 The traces have an additional "m" field added to each connection vector 252 to provide each direction's maximum segment size for the connection. 253 This is used to provide the packet size distribution described in sec- 254 tion 2.4. 256 These traces contain a mixture of connections, from very short flows 257 that do not exist for long enough to be "congestion controlled", to long 258 thin streams, to bulk file transfer like connections. 
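To make the connection-vector abstraction concrete, the sketch below models a Tmix-style connection as a start time, the per-direction maximum segment sizes carried in the added "m" field, and a list of (request-size, response-size, think-time) epochs. The class and field names are illustrative assumptions of this sketch and do not reproduce the actual Tmix trace grammar.

   from dataclasses import dataclass, field
   from typing import List, Tuple

   @dataclass
   class ConnectionVector:
       """Illustrative model of one Tmix connection vector."""
       start_time: float                     # start time in the trace, seconds
       mss_initiator: int                    # "m" field, initiator direction, bytes
       mss_responder: int                    # "m" field, responder direction, bytes
       epochs: List[Tuple[int, int, float]] = field(default_factory=list)
       # each epoch: (request_bytes, response_bytes, think_time_seconds)

       def offered_bytes(self) -> int:
           """Application bytes this connection offers in both directions."""
           return sum(req + resp for req, resp, _ in self.epochs)

   # Example: a short request/response exchange, a 2.5 s think time, then a
   # second exchange.
   cv = ConnectionVector(start_time=12.3, mss_initiator=1460, mss_responder=1460,
                         epochs=[(300, 15000, 2.5), (250, 4000, 0.0)])
   print(cv.offered_bytes())   # -> 19550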
The traces are available at:
http://hosting.riteproject.eu/tcpevaltmixtraces.tgz

2.3 Loads

While the protocols being tested may differ, it is important that we maintain the same "load" or level of congestion for the experimental scenarios. For many of the scenarios, such as the basic ones in section 4, each scenario is run for a range of loads, where the load is varied by varying the rate of session arrivals.

2.3.1 Varying the Tmix traffic load

To adjust the traffic load for a given scenario, the connection start times for flows in a Tmix trace are scaled as follows. Connections are actually started at:

   experiment_cv_start_time = scale * cv_start_time

where cv_start_time denotes the connection vector start time in the Tmix traces and experiment_cv_start_time is the time the connection starts in the experiment. Therefore, the smaller the scale, the higher (in general) the traffic load.

2.3.1.1 Notes

Changing the connection start times also changes the way the traffic connections interact, potentially changing the "clumping" of traffic bursts.

Very small changes in the scaling parameter can cause disproportionate changes in the offered load. This is due to the possibility of a small change causing the exclusion or inclusion of a connection vector that transfers a very large amount of data.

2.3.2 Dealing with non-stationarity

The Tmix traffic traces, as they are, offer a non-stationary load. This is exacerbated for tests that do not require use of the full trace files, but only a portion of them. While removing this non-stationarity also removes some of the "realism" of the traffic, it is necessary for the test suite to produce reliable and consistent results.

A more stationary offered load is achieved by shuffling the start times of connection vectors in the Tmix trace file. The trace file is logically partitioned into n-second bins, which are then shuffled using a Fisher-Yates shuffle [8], and the required portions written to shuffled trace files for the particular experiment being conducted.

2.3.2.1 Bin size

The bin size is chosen so that there is enough shuffling with respect to the test length. The offered traffic per test second from the Tmix trace files depends on the scale factor (see section 2.3.1), which is related to the capacity of the bottleneck link. The shuffling bin size (in seconds) is set at:

   b = 500e6 / C

where C is the bottleneck link's capacity in bits per second, and 500e6 is a scaling factor (in bits).

Thus, for the access link scenario described in section 4.4.2, the bin size for shuffling will be 5 seconds.

2.3.2.2 NS2 implementation specifics

The tcl scripts for this process are distributed with the NS2 example test suite implementation. Care must be taken when using this algorithm to employ the given random number generator and the same seed; otherwise, the resulting experimental traces will differ.

2.4 Packet size distribution

For flows generated by the traffic generator, 10% use 536-byte packets and 90% use 1500-byte packets. The base Tmix traces described in section 2.2.1 have been processed at the *connection* level to have this characteristic. As a result, *packets* in a given test will be roughly, but not exactly, in this proportion. However, the proportion of offered traffic will be consistent for each experiment.
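As an illustration of the preprocessing described in sections 2.3.1 and 2.3.2, the sketch below scales connection start times, partitions them into bins of b = 500e6 / C seconds, and applies a seeded Fisher-Yates shuffle. The function and argument names are assumptions of this sketch; the tcl scripts distributed with the NS2 example implementation remain the reference.

   import random

   def shuffle_start_times(connections, capacity_bps, scale, seed=1):
       """Sketch of the trace preprocessing in sections 2.3.1 and 2.3.2.

       connections: list of (cv_start_time, connection_vector) pairs taken
       from a Tmix trace.  Returns the list with scaled start times after a
       seeded, bin-level Fisher-Yates shuffle.
       """
       # Section 2.3.1: experiment_cv_start_time = scale * cv_start_time
       scaled = [(scale * t, cv) for t, cv in connections]

       # Section 2.3.2.1: bin size b = 500e6 / C, with C in bits per second
       b = 500e6 / capacity_bps

       # Partition connections into b-second bins, keeping each start
       # time's offset within its bin (empty bins are ignored here).
       bins = {}
       for t, cv in scaled:
           bins.setdefault(int(t // b), []).append((t % b, cv))

       # Section 2.3.2: shuffle the bins.  random.shuffle() implements a
       # Fisher-Yates shuffle, and the fixed seed keeps runs reproducible.
       order = sorted(bins)
       random.Random(seed).shuffle(order)

       shuffled = [(new_idx * b + offset, cv)
                   for new_idx, old_idx in enumerate(order)
                   for offset, cv in bins[old_idx]]
       return sorted(shuffled, key=lambda item: item[0])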
2.4.1 Potential revision

As Tmix can now read and use a connection's Maximum Segment Size (MSS) from the trace file, it will be possible to produce Tmix connection vector trace files where the packet sizes reflect actual measurements.

3 Achieving reliable results in minimum time

This section describes the techniques used to achieve reliable results in the minimum test time.

3.1 Background

Over a long time, because the session arrival times are to a large extent independent of the transfer times, load could be defined as:

   A = E[f] / E[t],

where E[f] is the mean session (flow) size in bits transferred, E[t] is the mean session inter-arrival time in seconds, and A is the load in bps.

It is important to test congestion control protocols in "overloaded" conditions. However, if A > C, where C is the capacity of the bottleneck link, then the system has no equilibrium. In long-running experiments with A > C, the expected number of flows would keep on increasing with time (because as time passes, flows would tend to last for longer and longer, thus "piling up" with newly-arriving ones). This means that, in an overload scenario, some measures will be very sensitive to the duration of the tests.

3.2 Equilibrium or Steady State

Ideally, experiments should be run until some sort of equilibrium results can be obtained. Since every test algorithm can potentially change how long this may take, the following approach is adopted:

   1. Traces are shuffled to remove non-stationarity (see section
      2.3.2).

   2. The experiment run time is determined from the traffic traces.
      The shuffled traces are compiled such that the estimate of
      traffic offered in the first third of the test is equal to the
      estimate of traffic offered in the second third of the test,
      which in turn is equal to the estimate of traffic offered in
      the final third of the test, to within a 5% tolerance. The
      length of the trace files becomes the total experiment run time
      (including the warm-up time).

   3. The warm-up time, until measurements start, is calculated as
      the time at which the NS2 simulation of standard TCP achieves
      "steady state". In this case, the warm-up time is determined as
      the time required so that the measurements have statistically
      similar first- and second-half results. The metrics used as
      reference are the bottleneck raw throughput and the average
      bottleneck queue size. The latter is stable when A >> C and
      A << C.

3.3 Accelerated test start up time

The start of each test is accelerated by prefilling the network with traffic before the warm-up period ends; the prefill_t and prefill_si values used are listed in the scenario parameter tables of section 4. Figure 1 illustrates the timing.

                 | prefill_si
                 |  <---->
  |--------------|-------|----|-------------------------|
  t=0            |       t = prefill_t              t=warmup
                 |
                 |
                 t = prefill_t - prefill_si

                      Figure 1: prefilling

4 Basic scenarios

The purpose of the basic scenarios is to explore the behavior of a TCP extension over different link types. These scenarios use the dumbbell topology described in section 4.1.

4.1 Basic topology

Most tests use a simple dumbbell topology with a central link that connects two routers, as illustrated in Figure 2. Each router is also connected to three nodes by edge links. In order to generate a typical range of round trip times, edge links have different delays. Unless specified otherwise, such delays are as follows. On one side, the one-way propagation delays are 0 ms, 12 ms and 25 ms; on the other, 2 ms, 37 ms, and 75 ms.
Traffic is uniformly shared among the nine source/destination 462 pairs, giving a distribution of per-flow RTTs in the absence of queueing 463 delay shown in Table 1. These RTTs are computed for a dumbbell topology 464 assuming a delay of 0ms for the central link. The delay for the central 465 link that is used in a specific scenario is given in the next section. 467 For dummynet experiments, delays can be obtained by specifying the delay 468 of each flow. 470 4.2 Traffic 471 Node 1 Node 4 472 \_ _/ 473 \_ _/ 474 \_ __________ Central __________ _/ 475 | | link | | 476 Node 2 ------| Router 1 |----------------| Router 2 |------ Node 5 477 _|__________| |__________|_ 478 _/ \_ 479 _/ \_ 480 Node 3 / \ Node 6 482 Figure 2: A dumbbell topology 484 --------------------------------- 485 Path RTT Path RTT Path RTT 486 --------------------------------- 487 1-4 4 1-5 74 1-6 150 488 2-4 28 2-5 98 2-6 174 489 3-4 54 3-5 124 3-6 200 490 --------------------------------- 492 Table 1: Minimum RTTs of the paths between two nodes, in milliseconds. 494 In all of the basic scenarios, *all* TCP flows use the TCP extension or 495 modification under evaluation. 497 In general, the 9 bidirectional Tmix sources are connected to nodes 1 to 498 6 of figure 2 to create the paths tabulated in table 1. 500 Offered loads are estimated directly from the shuffled and scaled Tmix 501 traces, as described in section 3.2. The actual measured loads will 502 depend on the TCP variant and the scenario being tested. 504 Buffer sizes are based on the Bandwidth Delay Product (BDP), except for 505 the Dial-up scenario where a BDP buffer does not provide enough buffer- 506 ing. 508 The load generated by Tmix with the standard trace files is asymmetric, 509 with a higher load offered in the right to left direction (refer to fig- 510 ure 2) than in the left to right direction. Loads are specified for the 511 higher traffic right to left direction. For each of the basic scenarios, 512 three offered loads are tested: moderate (60%), high (85%), and overload 513 (110%). Loads are for the bottleneck link, which is the central link in 514 all scenarios except the wireless LAN scenario. 516 The 9 tmix traces are scaled using a single scaling factor in these 517 tests. This means that the traffic offered on each of the 9 paths 518 through the network is not equal, but combined at the bottleneck pro- 519 duces the specified offered load. 521 4.3 Flows under test 523 For these basic scenarios, there is no differentiation between "cross- 524 traffic" and the "flows under test". The aggregate traffic is under 525 test, with the metrics exploring both aggregate traffic and distribu- 526 tions of flow-specific metrics. 528 4.4 Scenarios 530 4.4.1 Data Center 532 The data center scenario models a case where bandwidth is plentiful and 533 link delays are generally low. All links have a capacity of 1 Gbps. 534 Links from nodes 1, 2 and 4 have a one-way propagation delay of 10 us, 535 while those from nodes 3, 5 and 6 have 100 us [9], and the central link 536 has 0 ms delay. The central link has 10 ms buffers. 
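The scenarios specify router buffers in units of time, such as the 10 ms central-link buffer above. When configuring a simulator or testbed queue in packets, the time value has to be converted; the helper below does so under the assumption of 1500-byte packets, which the draft itself does not mandate.

   def buffer_packets(capacity_bps: float, buffer_ms: float,
                      pkt_bytes: int = 1500) -> int:
       """Convert a time-based buffer specification into whole packets.

       The 1500-byte packet size is an assumption made only for this
       conversion; the scenarios specify buffers in milliseconds.
       """
       return int(capacity_bps * (buffer_ms / 1000.0) / (8 * pkt_bytes))

   # Data center (section 4.4.1): 10 ms of buffering at 1 Gbps
   print(buffer_packets(1e9, 10))     # -> 833 packets
   # Access link (section 4.4.2): 100 ms at 100 Mbps, one BDP at the mean RTT
   print(buffer_packets(100e6, 100))  # -> 833 packets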
538 load scale experiment time warmup test_time prefill_t prefill_si 539 --------------------------------------------------------------------------- 540 60% 0.56385119 156.5 4.0 145.0 7.956 4.284117 541 85% 0.372649 358.0 19.0 328.0 11.271 6.411839 542 110% 0.295601 481.5 7.5 459 14.586 7.356242 544 Table 2: Data center scenario parameters 546 4.4.1.1 Potential Revisions 548 The rate of 1 Gbps is chosen such that NS2 simulations can run in a rea- 549 sonable time. Higher values will become feasible as computing power 550 increases, however the current traces may not be long enough to drive 551 simulations or test bed experiments at higher rates. 553 The supplied Tmix traces are used here to provide a standard comparison 554 across scenarios. Data Centers, however, have very specialised traffic 555 which may not be represented well in such traces. In the future, 556 specialised Data Center traffic traces may be needed to provide a more 557 realistic test. 559 4.4.2 Access Link 561 The access link scenario models an access link connecting an institution 562 (e.g., a university or corporation) to an ISP. The central and edge 563 links are all 100 Mbps. The one-way propagation delay of the central 564 link is 2 ms, while the edge links have the delays given in Section 4.1. 565 Our goal in assigning delays to edge links is only to give a realistic 566 distribution of round-trip times for traffic on the central link. The 567 Central link buffer size is 100 ms, which is equivalent to the BDP 568 (using the mean RTT). 570 load scale experiment time warmup test_time prefill_t prefill_si 571 -------------------------------------------------------------------------- 572 60% 4.910115 440 107.0 296.0 36.72 24.103939 573 85% 3.605109 920 135.0 733.0 52.02 23.378915 574 110% 3.0027085 2710 34.0 2609.0 67.32 35.895355 576 Table 3: Access link scenario parameters (times in seconds) 578 4.4.2.1 Potential Revisions 580 As faster access links become common, the link speed for this scenario 581 will need to be updated accordingly. Also as access link buffer sizes 582 shrink to less than BDP sized buffers, this should be updated to reflect 583 these changes in the Internet. 585 4.4.3 Trans-Oceanic Link 587 The trans-oceanic scenario models a test case where mostly lower-delay 588 edge links feed into a high-delay central link. Both the central and all 589 edge links are 1 Gbps. The central link has 100 ms buffers, and a one- 590 way propagation delay of 65 ms. 65 ms is chosen as a "typical number". 591 The actual delay on real links depends, of course, on their length. For 592 example, Melbourne to Los Angeles is about 85 ms. 594 4.4.4 Geostationary Satellite 596 The geostationary satellite scenario models an asymmetric test case with 597 a high-bandwidth downlink and a low-bandwidth uplink [10,11]. The sce- 598 nario modeled is that of nodes connected to a satellite hub which has an 599 asymmetric satellite connection to the master base station which is 600 load scale experiment time warmup test_time prefill_t prefill_si 601 ----------------------------------------------------------------------- 602 60% tbd tbd tbd 603 85% tbd tbd tbd 604 110%: tbd tbd tbd 606 Table 4: Trans-Oceanic link scenario parameters 608 connected to the Internet. The capacity of the central link is asymmet- 609 ric - 40 Mbps down, and 4 Mbps up with a one-way propagation delay of 610 300 ms. Edge links are all bidirectional 100 Mbps links with one-way 611 delays as given in Section 4.1. 
The central link buffer size is 100 ms 612 for downlink and 1000 ms for uplink. 614 Note that congestion in this case is often on the 4 Mbps uplink (left to 615 right), even though most of the traffic is in the downlink direction 616 (right to left). 618 load scale experiment time warmup test_time prefill_t prefill_si 619 ----------------------------------------------------------------------- 620 60% tbd tbd tbd 621 85% tbd tbd tbd 622 110%: tbd tbd tbd 624 Table 5: Trans-Oceanic link scenario parameters 626 4.4.5 Wireless LAN 628 The wireless LAN scenario models WiFi access to a wired backbone, as 629 depicted in Figure 3. 631 The capacity of the central link is 100 Mbps, with a one-way delay of 2 632 ms. All links to Router 2 are wired. Router 1 acts as a base station for 633 a shared wireless IEEE 802.11g links. Although 802.11g has a peak bit 634 rate of 54 Mbps, its typical throughput rate is much lower, and 635 decreases under high loads and bursty traffic. The scales specified 636 here are based on a nominal rate of 6Mbps. 638 The Node_[123] to Wireless_[123] connections are to allow the same RTT 639 distribution as for the wired scenarios. This is in addition to delays 640 on the wireless link due to CSMA. Figure 3 shows how the topology should 641 look in a test bed. 643 Node_1----Wireless_1.. Node_4 644 :. / 645 :... Base central link / 646 Node_2----Wireless_2 ....:..Station-------------- Router_2 --- Node_5 647 ...: (Router 1) \ 648 .: \ 649 Node_3----Wireless_3.: Node_6 651 Figure 3: Wireless dumbell topology for a test-bed. Wireless_n are wire- 652 less transceivers for connection to the base station 654 load scale experiment time warmup test_time prefill_t prefill_si 655 ---------------------------------------------------------------------------- 656 60% 117.852049 14917 20.0 14917 0 0 657 85% 85.203155 10250.0 20.0 10230 0 0 658 110%: 65.262840 4500.0 20.0 4480 0 0 660 Table 6: Wireless LAN scenario parameters 662 The percentage load for this scenario is based on the sum of the esti- 663 mate of offered load in both directions since the wireless bottleneck 664 link is a shared media. Also, due to contention for the bottleneck link, 665 the accelerated start up using prefill is not used for this scenario. 667 Note that the prefill values are zero as prefill was found to be of no 668 benefit in this scenario. 670 4.4.5.1 NS2 implementation specifics 672 In NS2, this is implemented as depicted in Figure 2 The delays between 673 Node_1 and Wireless_1 are implemented as delays through the Logical Link 674 layer. 676 Since NS2 don't have a simple way of measuring transport packet loss on 677 the wireless link, dropped packets are inferred based on flow arrivals 678 and departures (see figure 4). This gives a good estimate of the average 679 loss rate over a long enough period (long compared with the transit 680 delay of packets), which is the case here. 682 logical link 683 X--------------------X 684 | | 685 v | 686 n1--+---. | _n4 687 : V / 688 n2--+---.:.C0-------------C1---n5 689 : \_ 690 n3--+---.: n6 692 Figure 4: Wireless measurements in the ns2 simulator 694 4.4.5.2 Potential revisions 696 Wireless standards are continually evolving. This scenario may need 697 updating in the future to reflect these changes. 699 Wireless links have many other unique properties not captured by delay 700 and bitrate. 
In particular, the physical layer might suffer from propa- 701 gation effects that result in packet losses, and the MAC layer might add 702 high jitter under contention or large steps in bandwidth due to adaptive 703 modulation and coding. Specifying these properties is beyond the scope 704 of the current first version of this test suite but may make useful 705 additions in the future. 707 Latency in this scenario is very much affected by contention for the 708 media. It will be good to have end-to-end delay measurements to quantify 709 this characteristic. This could include per packet latency, application 710 burst completion times, and/or application session completion times. 712 4.4.6 Dial-up Link 714 The dial-up link scenario models a network with a dial-up link of 64 715 kbps and a one-way delay of 5 ms for the central link. This could be 716 thought of as modeling a scenario reported as typical in Africa, with 717 many users sharing a single low-bandwidth dial-up link. Central link 718 buffer size of 1250 ms 720 4.4.6.1 Note on parameters 722 The traffic offered by tmix over a low bandwidth link is very bursty. It 723 takes a long time to reach some sort of statistical stability. For event 724 load scale experiment time warmup test_time prefill_t prefill_si 725 ----------------------------------------------------------------------------- 726 60% 10176.2847 1214286 273900 273900 0 0 727 85% 7679.1920 1071429 513600 557165 664.275 121.147563 728 110%: 5796.7901 2223215 440.0 2221915 859.65 180.428 730 Table 7: Dial-up link scenario parameters 732 based simulators, this is not too much of a problem, as the number of 733 packets transferred is not prohibitively high, however for test beds 734 these times are prohibitively long. This scenario needs further investi- 735 gation to address this. 737 4.4.6.2 Potential revisions 739 Modems often have asymmetric up and down link rates. Asymmetry is tested 740 in the Geostationary Satellite scenario (section 4.4.4), but the dial-up 741 scenario could be modified to model this as well. 743 4.5 Metrics of interest 745 For each run, the following metrics will be collected for the central 746 link in each direction: 748 1. the aggregate link utilization, 750 2. the average packet drop rate, and 752 3. the average queueing delay. 754 These measures only provide a general overview of performance. The goal 755 of this draft is to produce a set of tests that can be "run" at all lev- 756 els of abstraction, from Grid500's WAN, through WAN-in-Lab, testbeds and 757 simulations all the way to theory. Researchers may add additional mea- 758 sures to illustrate other performance aspects as required. 760 Other metrics of general interest include: 762 1. end-to-end delay measurements 764 2. flow-centric: 766 1. sending rate, 768 2. goodput, 769 3. cumulative loss and queueing delay trajectory for each 770 flow, over time, 772 4. the transfer time per flow versus file size 774 3. stability properties: 776 1. standard deviation of the throughput and the queueing 777 delay for the bottleneck link, 779 2. worst case stability measures, especially proving (possi- 780 bly theoretically) the stability of TCP. 782 4.6 Potential Revisions 784 As with all of the scenarios in this document, the basic scenarios could 785 benefit from more measurement studies about characteristics of congested 786 links in the current Internet, and about trends that could help predict 787 the characteristics of congested links in the future. 
This would 788 include more measurements on typical packet drop rates, and on the range 789 of round-trip times for traffic on congested links. 791 5 Latency specific experiments 793 5.1 Delay/throughput tradeoff as function of queue size 795 Performance in data communications is increasingly limited by latency. 796 Smaller and smarter buffers improve this measure, but often at the 797 expense of TCP throughput. The purpose of these tests is to investigate 798 delay-throughput tradeoffs, *with and without the particular TCP exten- 799 sion under study*. 801 Different queue management mechanisms have different delay-throughput 802 tradeoffs. It is envisaged that the tests described here would be 803 extended to explore and compare the performance of different Active 804 Queue Management (AQM) techniques. However, this is an area of active 805 research and beyond the scope of this test suite at this time. For now, 806 it may be better to have a dedicated, separate test suite to look at AQM 807 performance issues. 809 5.1.1 Topology 811 These tests use the topology of Figure 4.1. They are based on the access 812 link scenario (see section 4.4.2) with the 85% offered load used for 813 this test. 815 For each Drop-Tail scenario set, five tests are run, with buffer sizes 816 of 10%, 20%, 50%, 100%, and 200% of the Bandwidth Delay Product (BDP) 817 for a 100 ms base RTT flow (the average base RTT in the access link 818 dumbell scenario is 100 ms). 820 5.1.1.1 Potential revisions 822 Buffer sizing is still an area of research. Results from this research 823 may necessitate changes to the test suite so that it models these 824 changes in the Internet. 826 AQM is currently an area of active research. It is envisaged that these 827 tests could be extended to explore and compare the performance of key 828 AQM techniques when it becomes clear what these will be. For now a dedi- 829 cated AQM test suite would best serve such research efforts. 831 5.1.2 Flows under test 833 Two kinds of tests should be run: one where all TCP flows use the TCP 834 modification under study, and another where no TCP flows use such modi- 835 fication, as a "baseline" version. 837 The level of traffic from the traffic generator is the same as that 838 described in section 4.4.2. 840 5.1.3 Metrics of interest 842 For each test, three figures are kept, the average throughput, the aver- 843 age packet drop rate, and the average queueing delay over the measure- 844 ment period. 846 Ideally it would be better to have more complete statistics, especially 847 for queueing delay where the delay distribution can be important. It 848 would also be good for this to be illustrated with delay/bandwidth 849 graph, the x-axis shows the average queueing delay, and the y-axis shows 850 the average throughput. For the drop-rate graph, the x-axis shows the 851 average queueing delay, and the y-axis shows the average packet drop 852 rate. Each pair of graphs illustrates the delay/throughput/drop-rate 853 tradeoffs with and without the TCP mechanism under evaluation. For an 854 AQM mechanism, each pair of graphs also illustrates how the throughput 855 and average queue size vary (or don't vary) as a function of the traffic 856 load. Examples of delay/throughput tradeoffs appear in Figures 1-3 857 of[12] and Figures 4-5 of[13]. 859 5.2 Ramp up time: completion time of one flow 861 These tests aim to determine how quickly existing flows make room for 862 new flows. 
5.2.1 Topology and background traffic

The ramp up time test uses the topology shown in Figure 5. Two long-lived test TCP connections are used in this experiment. Test TCP connection 1 runs between T_n1 and T_n3, with data flowing from T_n1 to T_n3, and test TCP connection 2 runs between T_n2 and T_n4, with data flowing from T_n2 to T_n4. The background traffic topology is identical to that used in the basic scenarios (see section 4 and Figure 2); i.e., background flows run between nodes B_n1 to B_n6.

              T_n2                         T_n4
               |                             |
               |                             |
      T_n1     |                             |     T_n3
          \    |                             |    /
           \   |                             |   /
     B_n1--- R1--------------------------R2--- B_n4
           /   |                             |   \
          /    |                             |    \
     B_n2      |                             |     B_n5
               |                             |
              B_n3                         B_n6

            Figure 5: Ramp up dumbbell test topology

Experiments are conducted with capacities of 56 kbps, 10 Mbps and 1 Gbps for the central link. The 56 kbps case is included to investigate the performance of low bit rate devices such as mobile handsets or dial-up modems.

For each capacity, three RTT scenarios should be tested, in which the existing and newly arriving flows have RTTs of (80, 80), (120, 30), and (30, 120) ms, respectively. These RTTs are made up of a central link with a 2 ms delay in each direction and the test link delays shown in Table 8.

Throughout the experiment, the offered load of the background (or cross) traffic is 10% of the central link capacity in the right to left direction. The background traffic is generated in the same manner as for the basic scenarios (see section 4).

   ----------------------------------
   RTT        T_n1   T_n2   T_n3   T_n4
   scenario   (ms)   (ms)   (ms)   (ms)
   ----------------------------------
   1            0      0     38     38
   2           23     12     35      1
   3           12     23      1     35
   ----------------------------------

   Table 8: Link delays for the test TCP source connections to the
   central link

   Central link   scale      experiment time   warmup   test_time   prefill_t   prefill_si
   ---------------------------------------------------------------------------------------
   56 kbps
   10 Mbps
   1 Gbps         3.355228   324                                    9.18        2.820201

   Table 9: Ramp-up time scenario parameters (times in seconds)

All traffic for this scenario uses the TCP extension under test.

5.2.2 Flows under test

Traffic is dominated by the two long-lived test flows, because we believe that to be the worst case, in which convergence is slowest.

One flow starts in "equilibrium" (at least having finished normal slow-start). A new flow then starts; slow-start is disabled by setting the initial slow-start threshold to the initial CWND. Slow-start is disabled because this is the worst case, and could happen if a loss occurred in the first RTT.

The experiment ends once the new flow has run for five minutes. Both of the flows use 1500-byte packets. The test should be run both with Standard TCP and with the TCP extension under test for comparison.

5.2.2.1 Potential Revisions

It may also be useful to conduct the tests with slow-start enabled, if time permits.

5.2.3 Metrics of interest

The output of these experiments is the time until the (1500 * 10^n)-th byte of the new flow is received, for n = 1, 2, ... . This measures how quickly the existing flow releases capacity to the new flow, without requiring a definition of when "fairness" has been achieved. By leaving the upper limit on n unspecified, the test remains applicable to very high-speed networks.
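A minimal sketch of extracting this metric from a receiver-side log of the new flow is given below; the log format, a list of (time, cumulative bytes received) samples, is an assumption of the sketch rather than something the suite prescribes.

   def ramp_up_times(samples, max_n=12):
       """Times at which the new flow has received 1500 * 10^n bytes.

       samples: iterable of (time_seconds, cumulative_bytes_received) pairs,
       assumed non-decreasing.  Returns {n: time}; thresholds never reached
       are simply absent, so the upper limit on n remains unspecified.
       """
       thresholds = {n: 1500 * 10 ** n for n in range(1, max_n + 1)}
       reached = {}
       for t, rx_bytes in samples:
           for n, needed in thresholds.items():
               if n not in reached and rx_bytes >= needed:
                   reached[n] = t
       return reached

   # Toy trace of the new flow's cumulative received bytes.
   trace = [(0.1, 9000), (0.4, 20000), (1.2, 180000), (3.0, 2.1e6)]
   print(ramp_up_times(trace))   # -> {1: 0.4, 2: 1.2, 3: 3.0}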
954 A single run of this test cannot achieve statistical reliability by run- 955 ning for a long time. Instead, an average over at least three runs 956 should be taken. Each run must use different cross traffic. Different 957 cross traffic can be generated using the standard tmix trace files by 958 changing the random number seed used to shuffle the traces. 960 5.3 Transients: release of bandwidth, arrival of many flows 962 These tests investigate the impact of a sudden change of congestion 963 level. They differ from the "Ramp up time" test in that the congestion 964 here is caused by unresponsive traffic. 966 Note that this scenario has not yet been implemented in the NS2 example 967 test suite. 969 5.3.1 Topology and background traffic 971 The network is a single bottleneck link (see Figure 6), with bit rate 972 100 Mbps, with a buffer of 1024 packets (i.e., 120% of the BDP at 100 973 ms). 975 T T 976 \ / 977 \ / 978 R1--------------------------R2 979 / \ 980 / \ 981 U U 983 Figure 6: Transient test topology 985 The transient traffic is generated using UDP, to avoid overlap with the 986 ramp-up time scenario (see section 5.2) and isolate the behavior of the 987 flows under study. 989 Three transients are tested: 991 1. step decrease from 75 Mbps to 0 Mbps, 993 2. step increase from 0 Mbps to 75 Mbps, 995 3. 30 step increases of 2.5 Mbps at 1 s intervals. 996 These transients occur after the flow under test has exited slow-start, 997 and remain until the end of the experiment. 999 There is no TCP cross traffic in this experiment. 1001 5.3.2 Flows under test 1003 There is one flow under test: a long-lived flow in the same direction as 1004 the transient traffic, with a 100 ms RTT. The test should be run both 1005 with Standard TCP and with the TCP extension under test for comparison. 1007 5.3.3 Metrics of interest 1009 For the decrease in cross traffic, the metrics are 1011 1. the time taken for the TCP flow under test to increase its 1012 window to 60%, 80% and 90% of its BDP, and 1014 2. the maximum change of the window in a single RTT while the 1015 window is increasing to that value. 1017 For cases with an increase in cross traffic, the metric is the number of 1018 *cross traffic* packets dropped from the start of the transient until 1019 100 s after the transient. This measures the harm caused by algorithms 1020 which reduce their rates too slowly on congestion. 1022 6 Throughput- and fairness-related experiments 1024 6.1 Impact on standard TCP traffic 1026 Many new TCP proposals achieve a gain, G, in their own throughput at the 1027 expense of a loss, L, in the throughput of standard TCP flows sharing a 1028 bottleneck, as well as by increasing the link utilization. In this con- 1029 text a "standard TCP flow" is defined as a flow using SACK TCP [14] but 1030 without ECN [15]. 1032 The intention is for a "standard TCP flow" to correspond to TCP as com- 1033 monly deployed in the Internet today (with the notable exception of 1034 CUBIC, which runs by default on the majority of web servers). This sce- 1035 nario quantifies this trade off. 1037 6.1.1 Topology and background traffic 1039 The basic dumbbell topology of section 4.1 is used with the same capaci- 1040 ties as for the ramp-up time tests in section 5.2. All traffic in this 1041 scenario comes from the flows under test. 
1043 A_1 A_4 1044 B_1 B_4 1045 \ / 1046 \ central link / 1047 A_2 --- Router_1 -------------- Router_2 --- A_5 1048 B_2 / \ B_5 1049 / \ 1050 A_3 A_6 1051 B_3 B_6 1053 Figure 7: Impact on Standard TCP dumbbell 1055 6.1.2 Flows under test 1057 The scenario is performed by conducting pairs of experiments, with iden- 1058 tical flow arrival times and flow sizes. Within each experiment, flows 1059 are divided into two camps. For every flow in camp A, there is a flow 1060 with the same size, source and destination in camp B, and vice versa. 1062 These experiments use duplicate copies of the Tmix traces used in the 1063 basic scenarios (see section 4). Two offered loads are tested: 50% and 1064 100%. 1066 Two experiments are conducted. A BASELINE experiment where both camp A 1067 and camp B use standard TCP. In the second, called MIX, camp A uses 1068 standard TCP and camp B uses the new TCP extension under evaluation. 1070 The rationale for having paired camps is to remove the statistical 1071 uncertainty which would come from randomly choosing half of the flows to 1072 run each algorithm. This way, camp A and camp B have the same loads. 1074 6.1.3 Metrics of interest 1075 load scale experiment time warmup test_time prefill_t prefill_si 1076 -------------------------------------------------------------------------- 1077 50% 13.780346 660 104.0 510.0 45.90 14.262121 1078 100% 5.881093 720 49.0 582.0 91.80 23.382947 1080 Table 10: Impact on Standard TCP scenario parameters 1082 The gain achieved by the new algorithm and loss incurred by standard TCP 1083 are given, respectively, by G=T(B)_Mix/T(B)_Baseline and 1084 L=T(A)_Mix/T(A)_Baseline where T(x) is the throughput obtained by camp 1085 x, measured as the amount of data acknowledged by the receivers (that 1086 is, "goodput"). 1088 The loss, L, is analogous to the "bandwidth stolen from TCP" in [16] and 1089 "throughput degradation" in [17]. 1091 A plot of G vs L represents the tradeoff between efficiency and loss. 1093 6.1.3.1 Suggestions 1095 Other statistics of interest are the values of G and L for each 1096 quartile of file sizes. This will reveal whether the new proposal is 1097 more aggressive in starting up or more reluctant to release its share of 1098 capacity. 1100 As always, testing at other loads and averaging over multiple runs is 1101 encouraged. 1103 6.2 Intra-protocol and inter-RTT fairness 1105 These tests aim to measure bottleneck bandwidth sharing among flows of 1106 the same protocol with the same RTT, which represents the flows going 1107 through the same routing path. The tests also measure inter-RTT fair- 1108 ness, the bandwidth sharing among flows of the same protocol where rout- 1109 ing paths have a common bottleneck segment but might have different 1110 overall paths with different RTTs. 1112 6.2.1 Topology and background traffic 1114 The topology, the capacity and cross traffic conditions of these tests 1115 are the same as in section 5.2. The bottleneck buffer is varied from 1116 25% to 200% of the BDP for a 100 ms base RTT flow, increasing by factors 1117 of 2. 1119 6.2.2 Flows under test 1120 We use two flows of the same protocol variant for this experiment. The 1121 RTTs of the flows range from 10 ms to 160 ms (10 ms, 20 ms, 40 ms, 80 1122 ms, and 160 ms) such that the ratio of the minimum RTT over the maximum 1123 RTT is at most 1/16. 
1125 6.2.2.1 Intra-protocol fairness: 1127 For each run, two flows with the same RTT, taken from the range of RTTs 1128 above, start randomly within the first 10% of the experiment duration. 1129 The order in which these flows start doesn't matter. An additional test 1130 of interest, but not part of this suite, would involve two extreme cases 1131 - two flows with very short or long RTTs (e.g., a delay less than 1-2 ms 1132 representing communication happening in a data-center, and a delay 1133 larger than 600 ms representing communication over a satellite link). 1135 6.2.2.2 Inter-RTT fairness: 1137 For each run, one flow with a fixed RTT of 160 ms starts first, and 1138 another flow with a different RTT taken from the range of RTTs above, 1139 joins afterward. The starting times of both two flows are randomly cho- 1140 sen within the first 10% of the experiment as before. 1142 6.2.3 Metrics of interest 1144 The output of this experiment is the ratio of the average throughput 1145 values of the two flows. The output also includes the packet drop rate 1146 for the congested link. 1148 6.3 Multiple bottlenecks 1150 These experiments explore the relative bandwidth for a flow that tra- 1151 verses multiple bottlenecks, with respect to that of flows that have the 1152 same round-trip time but each traverse only one of the bottleneck links. 1154 6.3.1 Topology and traffic 1156 The topology is a "parking-lot" topology with three (horizontal) bottle- 1157 neck links and four (vertical) access links. The bottleneck links have 1158 a rate of 100 Mbps, and the access links have a rate of 1 Gbps. 1160 All flows have a round-trip time of 60 ms, to enable the effect of 1161 traversing multiple bottlenecks to be distinguished from that of differ- 1162 ent round trip times. 1164 This can be achieved in both a symmetric and asymmetric way (see figures 1165 8 and 9). It is not clear whether there are interesting performance 1166 differences between these two topologies, and if so, which is more typi- 1167 cal of the actual internet. 1169 > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > 1170 __________ 0ms _________________ 0ms __________________ 30ms ____ 1171 | ................ | ................ | ................ | 1172 | : : | : : | : : | 1173 | : : | : : | : : | 1174 0ms : : 30ms : : 0ms : : 0ms 1175 | ^ V | ^ V | ^ V | 1177 Figure 8: Asymmetric parking lot topology 1179 > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > 1180 __________ 10ms _______________ 10ms ________________ 10ms ___ 1181 | ............... | ............... | ............... | 1182 | : : | : : | : : | 1183 | : : | : : | : : | 1184 10ms : : 10ms : : 10ms : : 10ms 1185 | ^ V | ^ V | ^ V | 1187 Figure 9: Symmetric parking lot topology 1189 The three hop topology used in the test suite is based on the symmetric 1190 topology (see figure 10). Bidirectional traffic flows between Nodes 1 1191 and 8, 2 and 3, 4 and 5, and 6 and 7. 1193 The first four Tmix trace files are used to generate the traffic. Each 1194 Tmix source offers the same load for each experiment. Three experiments 1195 are conducted at 30%, 40%, and 50% offered loads per Tmix source. As two 1196 sources share each of the three bottlenecks (A,B,C), the combined 1197 offered loads on the bottlenecks is 60%, 80%, and 100% respectively. 1199 All traffic uses the new TCP extension under test. 
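The load accounting in this scenario, and the role of the scale factors quoted in the earlier parameter tables, can be illustrated with a short sketch: scaling all start times by the scale factor stretches the mean session inter-arrival time E[t] by the same factor, so the offered load A = E[f]/E[t] of section 3.1 is divided by that factor. The helper below gives only a first-order estimate under illustrative assumptions; the published scenario tables use scale values derived from the actual shuffled traces.

   def estimate_scale(total_trace_bytes, trace_duration_s, capacity_bps,
                      sources_per_bottleneck=2, target_load=0.60):
       """First-order estimate of a common Tmix scale factor.

       The unscaled load of one source is estimated as A = E[f]/E[t]
       (section 3.1), here approximated by total bytes over trace duration.
       Multiplying start times by the scale factor (section 2.3.1) divides
       that load by the same factor, and sources_per_bottleneck sources
       share each parking-lot bottleneck (section 6.3.1).  Names and the
       estimate are illustrative only.
       """
       unscaled_load_bps = 8.0 * total_trace_bytes / trace_duration_s
       per_source_target_bps = target_load * capacity_bps / sources_per_bottleneck
       return unscaled_load_bps / per_source_target_bps

   # Hypothetical trace offering 45 GB over 3600 s, 100 Mbps bottlenecks,
   # two sources per bottleneck, 60% combined load target.
   print(round(estimate_scale(45e9, 3600.0, 100e6), 2))   # -> 3.33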
1201 6.3.1.1 Potential Revisions 1202 Node_1 Node_3 Node_5 Node_7 1203 \ | | / 1204 \ |10ms |10ms /10ms 1205 0ms \ | | / 1206 \ A | B | C / 1207 Router1 ---Router2---Router3--- Router4 1208 / 10ms | 10ms | 10ms \ 1209 / | | \ 1210 10ms/ |10ms |10ms \ 0ms 1211 / | | \ 1212 Node_2 Node_4 Node_6 Node_8 1214 Flow 1: Node_1 <--> Node_8 1215 Flow 2: Node_2 <--> Node_3 1216 Flow 3: Node_4 <--> Node_5 1217 Flow 4: Node_6 <--> Node_7 1219 Figure 10: Test suite parking lot topology 1221 load scale 1 prefill_t prefill_si scale 2 prefill_t 1222 prefill_si scale 3 prefill_t prefill_si total time warmup test_time 1223 ------------------------------------------------------------------------------------------------------------ 1224 50% tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd 1225 100% tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd 1227 Table 11: Multiple bottleneck scenario parameters 1229 Parking lot models with more hops may also be of interest. 1231 6.3.2 Metrics of interest 1233 The output for this experiment is the ratio between the average through- 1234 put of the single-bottleneck flows and the throughput of the multiple- 1235 bottleneck flow, measured after the warmup period. Output also includes 1236 the packet drop rate for the congested link. 1238 7 Implementations 1239 At the moment the only implementation effort is using the NS2 simulator. 1240 It is still a work in progress, but contains the base to most of the 1241 test, as well as the algorithms that determined the test parameters. It 1242 is being made available to the community for further development and 1243 verification through ***** url *** 1245 At the moment there are no ongoing test bed implementations. We invite 1246 the community to initiate and contribute to the development of these 1247 test beds. 1249 8 Acknowledgments 1251 This work is based on a paper by Lachlan Andrew, Cesar Marcondes, Sally 1252 Floyd, Lawrence Dunn, Romaric Guillier, Wang Gang, Lars Eggert, Sangtae 1253 Ha and Injong Rhee [1]. 1255 The authors would also like to thank Roman Chertov, Doug Leith, Saverio 1256 Mascolo, Ihsan Qazi, Bob Shorten, David Wei and Michele Weigle for valu- 1257 able feedback and acknowledge the work of Wang Gang to start the NS2 1258 implementation. 1260 This work has been partly funded by the European Community under its 1261 Seventh Framework Programme through the Reducing Internet Transport 1262 Latency (RITE) project (ICT-317700), by the Aurora-Hubert Curien Part- 1263 nership program "ANT" (28844PD / 221629), and under Australian Research 1264 Council's Discovery Projects funding scheme (project number 0985322). 1266 9 Bibliography 1268 [1] L. L. H. Andrew, C. Marcondes, S. Floyd, L. Dunn, R. Guillier, W. 1269 Gang, L. Eggert, S. Ha, and I. Rhee, "Towards a common TCP evaluation 1270 suite," in Protocols for Fast, Long Distance Networks (PFLDnet), 5-7 Mar 1271 2008. 1273 [2] S. Floyd and E. Kohler, "Internet research needs better models," 1274 SIGCOMM Comput. Commun. Rev., vol. 33, pp. 29--34, Jan. 2003. 1276 [3] S. Mascolo and F. Vacirca, "The effect of reverse traffic on the 1277 performance of new TCP congestion control algorithms for gigabit net- 1278 works," in Protocols for Fast, Long Distance Networks (PFLDnet), 2006. 1280 [4] D. Rossi, M. Mellia, and C. Casetti, "User patience and the web: a 1281 hands-on investigation," in Global Telecommunications Conference, 2003. 1282 GLOBECOM 1284 [5] M. C. Weigle, P. Adurthi, F. Hernandez-Campos, K. Jeffay, and F. D. 
1285 Smith, "Tmix: a tool for generating realistic TCP application workloads 1286 in ns-2," SIGCOMM Comput. Commun. Rev., vol. 36, pp. 65--76, July 2006. 1288 [6] G. project, "Tmix on ProtoGENI." 1290 [7] J. xxxxx, "Tmix trace generation for the TCP evaluation suite." 1291 http://web.archive.org/web/20100711061914/http://wil-ns.cs.caltech.edu/ 1292 benchmark/traffic/. 1294 [8] Wikipedia, "Fisher-Yates shuffle." 1295 http://en.wikipedia.org/wiki/Fisher-Yates_shuffle. 1297 [9] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. 1298 Prabhakar, S. Sengupta, and M. Sridharan, "Data center tcp (dctcp)," in 1299 Proceedings of the ACM SIGCOMM 2010 conference, SIGCOMM '10, (New York, 1300 NY, USA), pp. 63--74, ACM, 2010. 1302 [10] T. Henderson and R. Katz, "Transport protocols for internet-compat- 1303 ible satellite networks," Selected Areas in Communications, IEEE Journal 1304 on, vol. 17, no. 2, pp. 326--344, 1999. 1306 [11] A. Gurtov and S. Floyd, "Modeling wireless links for transport pro- 1307 tocols," SIGCOMM Comput. Commun. Rev., vol. 34, pp. 85--96, Apr. 2004. 1309 [12] S. Floyd, R. Gummadi, and S. Shenker, "Adaptive RED: An algorithm 1310 for increasing the robustness of RED," tech. rep., ICIR, 2001. 1312 [13] L. L. H. Andrew, S. V. Hanly, and R. G. Mukhtar, "Active queue man- 1313 agement for fair resource allocation in wireless networks," IEEE Trans- 1314 actions on Mobile Computing, vol. 7, pp. 231--246, Feb. 2008. 1316 [14] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky, "An Extension to 1317 the Selective Acknowledgement (SACK) Option for TCP." RFC 2883 (Proposed 1318 Standard), July 2000. 1320 [15] K. Ramakrishnan, S. Floyd, and D. Black, "The Addition of Explicit 1321 Congestion Notification (ECN) to IP." RFC 3168 (Proposed Standard), 1322 Sept. 2001. Updated by RFCs 4301, 6040. 1324 [16] E. Souza and D. Agarwal, "A highspeed TCP study: Characteristics 1325 and deployment issues," Tech. Rep. LBNL-53215, LBNL, 2003. 1327 [17] H. Shimonishi, M. Sanadidi, and T. Murase, "Assessing interactions 1328 among legacy and high-speed tcp protocols," in Protocols for Fast, Long 1329 Distance Networks (PFLDnet), 2007. 1331 [18] N. Hohn, D. Veitch, and P. Abry, "The impact of the flow arrival 1332 process in internet traffic," in Acoustics, Speech, and Signal Process- 1333 ing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Confer- 1334 ence on, vol. 6, pp. VI-37--40 vol.6, 2003. 1336 [19] F. Kelly, Reversibility and stochastic networks. University of 1337 Cambridge Statistical Laboratory, 1979. 1339 A Discussions on Traffic 1341 While the protocols being tested may differ, it is important that we 1342 maintain the same "load" or level of congestion for the experimental 1343 scenarios. To enable this, we use a hybrid of open-loop and close-loop 1344 approaches. For this test suite, network traffic consists of sessions 1345 corresponding to individual users. Because users are independent, these 1346 session arrivals are well modeled by an open-loop Poisson process. A 1347 session may consist of a single greedy TCP flow, multiple greedy flows 1348 separated by user "think" times, a single non-greedy flow with embedded 1349 think times, or many non-greedy "thin stream" flows. process forms a 1350 Poisson process [18]. Both the think times and burst sizes have heavy- 1351 tailed distributions, with the exact distribution based on empirical 1352 studies. The think times and burst sizes will be chosen independently. 
Authors' Addresses

   David Hayes
   University of Oslo
   Department of Informatics, P.O. Box 1080 Blindern
   Oslo N-0316
   Norway

   Email: davihay@ifi.uio.no

   David Ros
   Institut Mines-Telecom / Telecom Bretagne
   2 rue de la Chataigneraie
   35510 Cesson-Sevigne
   France

   Email: david.ros@telecom-bretagne.eu

   Lachlan L.H. Andrew
   CAIA Swinburne University of Technology
   P.O. Box 218, John Street
   Hawthorn Victoria 3122
   Australia

   Email: lachlan.andrew@gmail.com

   Sally Floyd
   ICSI
   1947 Center Street, Ste. 600
   Berkeley CA 94704
   United States