2 Network Working Group D. Hayes 3 Internet-Draft University of Oslo 4 Intended status: Informational D. Ros 5 Expires: January 4, 2015 Telecom Bretagne 6 L.L.H. Andrew 7 CAIA Swinburne University of Technology 8 S.
Floyd 9 ICSI 10 July 3, 2014 12 Common TCP Evaluation Suite 13 draft-irtf-iccrg-tcpeval-00 15 Abstract 17 This document presents an evaluation test suite for the initial assess- 18 ment of proposed TCP modifications. The goal of the test suite is to 19 allow researchers to quickly and easily evaluate their proposed TCP 20 extensions in simulators and testbeds using a common set of well- 21 defined, standard test cases, in order to compare and contrast proposals 22 against standard TCP as well as other proposed modifications. This test 23 suite is not intended to result in an exhaustive evaluation of a pro- 24 posed TCP modification or new congestion control mechanism. Instead, the 25 focus is on quickly and easily generating an initial evaluation report 26 that allows the networking community to understand and discuss the 27 behavioral aspects of a new proposal, in order to guide further experi- 28 mentation that will be needed to fully investigate the specific aspects 29 of a new proposal. 31 Status of This Memo 33 This Internet-Draft is submitted in full conformance with the 34 provisions of BCP 78 and BCP 79. 36 Internet-Drafts are working documents of the Internet Engineering 37 Task Force (IETF). Note that other groups may also distribute 38 working documents as Internet-Drafts. The list of current Internet- 39 Drafts is at http://datatracker.ietf.org/drafts/current/. 41 Internet-Drafts are draft documents valid for a maximum of six months 42 and may be updated, replaced, or obsoleted by other documents at any 43 time. It is inappropriate to use Internet-Drafts as reference 44 material or to cite them other than as "work in progress." 46 This Internet-Draft will expire on January 4, 2015. 48 Copyright Notice 50 Copyright (c) 2014 IETF Trust and the persons identified as the 51 document authors. All rights reserved. 53 This document is subject to BCP 78 and the IETF Trust's Legal 54 Provisions Relating to IETF Documents 55 (http://trustee.ietf.org/license-info) in effect on the date of 56 publication of this document. Please review these documents 57 carefully, as they describe your rights and restrictions with respect 58 to this document. Code Components extracted from this document must 59 include Simplified BSD License text as described in Section 4.e of 60 the Trust Legal Provisions and are provided without warranty as 61 described in the Simplified BSD License. 63 Table of Contents 65 1 Introduction . . . . . . . . . . . . . . . . . . . . . . 3 66 2 Traffic generation . . . . . . . . . . . . . . . . . . . 3 67 2.1 Desirable model characteristics . . . . . . . . . . . . . 4 68 2.2 Tmix . . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 2.2.1 Base Tmix trace files for tests . . . . . . . . . . . . . 5 70 2.3 Loads . . . . . . . . . . . . . . . . . . . . . . . . . . 5 71 2.3.1 Varying the Tmix traffic load . . . . . . . . . . . . . . 5 72 2.3.1.1 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . 5 73 2.3.2 Dealing with non-stationarity . . . . . . . . . . . . . . 6 74 2.3.2.1 Bin size . . . . . . . . . . . . . . . . . . . . . . . . 6 75 2.3.2.2 NS2 implementation specifics . . . . . . . . . . . . . . 6 76 2.4 Packet size distribution . . . . . . . . . . . . . . . . 6 77 2.4.1 Potential revision . . . . . . . . . . . . . . . . . . . 7 78 3 Achieving reliable results in minimum time . . . . . . . 7 79 3.1 Background . . . . . . . . . . . . . . . . . . . . . . . 7 80 3.2 Equilibrium or Steady State . . . . . . . . . . . . . . . 7 81 3.2.1 Note on the offered load in NS2 . 
. . . . . . . . . . . . 8 82 3.3 Accelerated test start up time . . . . . . . . . . . . . 8 83 4 Basic scenarios . . . . . . . . . . . . . . . . . . . . . 9 84 4.1 Basic topology . . . . . . . . . . . . . . . . . . . . . 9 85 4.2 Traffic . . . . . . . . . . . . . . . . . . . . . . . . . 9 86 4.3 Flows under test . . . . . . . . . . . . . . . . . . . . 11 87 4.4 Scenarios . . . . . . . . . . . . . . . . . . . . . . . . 11 88 4.4.1 Data Center . . . . . . . . . . . . . . . . . . . . . . . 11 89 4.4.1.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 11 90 4.4.2 Access Link . . . . . . . . . . . . . . . . . . . . . . . 12 91 4.4.2.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 12 92 4.4.3 Trans-Oceanic Link . . . . . . . . . . . . . . . . . . . 12 93 4.4.4 Geostationary Satellite . . . . . . . . . . . . . . . . . 12 94 4.4.5 Wireless LAN . . . . . . . . . . . . . . . . . . . . . . 13 95 4.4.5.1 NS2 implementation specifics . . . . . . . . . . . . . . 14 96 4.4.5.2 Potential revisions . . . . . . . . . . . . . . . . . . . 15 97 4.4.6 Dial-up Link . . . . . . . . . . . . . . . . . . . . . . 15 98 4.4.6.1 Note on parameters . . . . . . . . . . . . . . . . . . . 15 99 4.4.6.2 Potential revisions . . . . . . . . . . . . . . . . . . . 16 100 4.5 Metrics of interest . . . . . . . . . . . . . . . . . . . 16 101 4.6 Potential Revisions . . . . . . . . . . . . . . . . . . . 17 102 5 Latency specific experiments . . . . . . . . . . . . . . 17 103 5.1 Delay/throughput tradeoff as function of queue size 104 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 105 5.1.1 Topology . . . . . . . . . . . . . . . . . . . . . . . . 17 106 5.1.1.1 Potential revisions . . . . . . . . . . . . . . . . . . . 18 107 5.1.2 Flows under test . . . . . . . . . . . . . . . . . . . . 18 108 5.1.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 18 110 D. Hayes et. al. [Page 2a] 112 5.2 Ramp up time: completion time of one flow . . . . . . . . 18 113 5.2.1 Topology and background traffic . . . . . . . . . . . . . 19 114 5.2.2 Flows under test . . . . . . . . . . . . . . . . . . . . 20 115 5.2.2.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 20 116 5.2.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 20 117 5.3 Transients: release of bandwidth, arrival of many 118 flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 119 5.3.1 Topology and background traffic . . . . . . . . . . . . . 21 120 5.3.2 Flows under test . . . . . . . . . . . . . . . . . . . . 22 121 5.3.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 22 122 6 Throughput- and fairness-related experiments . . . . . . 22 123 6.1 Impact on standard TCP traffic . . . . . . . . . . . . . 22 124 6.1.1 Topology and background traffic . . . . . . . . . . . . . 23 125 6.1.2 Flows under test . . . . . . . . . . . . . . . . . . . . 23 126 6.1.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 23 127 6.1.3.1 Suggestions . . . . . . . . . . . . . . . . . . . . . . . 24 128 6.2 Intra-protocol and inter-RTT fairness . . . . . . . . . . 24 129 6.2.1 Topology and background traffic . . . . . . . . . . . . . 24 130 6.2.2 Flows under test . . . . . . . . . . . . . . . . . . . . 24 131 6.2.2.1 Intra-protocol fairness: . . . . . . . . . . . . . . . . 25 132 6.2.2.2 Inter-RTT fairness: . . . . . . . . . . . . . . . . . . . 25 133 6.2.3 Metrics of interest . . . . . . . . . . . . . . . . . . . 25 134 6.3 Multiple bottlenecks . . . . . . . . . . . . . . . . . . 
25 135 6.3.1 Topology and traffic . . . . . . . . . . . . . . . . . . 25 136 6.3.1.1 Potential Revisions . . . . . . . . . . . . . . . . . . . 26 137 6.3.2 Metrics of interest . . . . . . . . . . . . . . . . . . . 27 138 7 Implementations . . . . . . . . . . . . . . . . . . . . . 27 139 8 Acknowledgments . . . . . . . . . . . . . . . . . . . . 28 140 9 Bibliography . . . . . . . . . . . . . . . . . . . . . . 28 141 A Discussions on Traffic . . . . . . . . . . . . . . . . . 30 143 D. Hayes et. al. [Page 2b] 145 1 Introduction 147 This document describes a common test suite for the initial assessment 148 of new TCP extensions or modifications. It defines a small number of 149 evaluation scenarios, including traffic and delay distributions, network 150 topologies, and evaluation parameters and metrics. The motivation for 151 such an evaluation suite is to help researchers in evaluating their pro- 152 posed modifications to TCP. The evaluation suite will also enable inde- 153 pendent duplication and verification of reported results by others, 154 which is an important aspect of the scientific method that is not often 155 put to use by the networking community. A specific target is that the 156 evaluations should be able to be completed in a reasonable amount of 157 time by simulation, or with a reasonable amount of effort in a testbed. 159 It is not possible to provide TCP researchers with a complete set of 160 scenarios for an exhaustive evaluation of a new TCP extension; espe- 161 cially because the characteristics of a new extension will often require 162 experiments with specific scenarios that highlight its behavior. On the 163 other hand, an exhaustive evaluation of a TCP extension will need to 164 include several standard scenarios, and it is the focus of the test 165 suite described in this document to define this initial set of test 166 cases. 168 These scenarios generalize current characteristics of the Internet such 169 as round-trip times (RTT), propagation delays, and buffer sizes. It is 170 envisaged that as the Internet evolves these will need to be adjusted. 171 In particular, we expect buffer sizes will need to be adjusted as 172 latency becomes increasingly important. 174 The scenarios specified here are intended to be as generic as possible, 175 i.e., not tied to a particular simulation or emulation platform. How- 176 ever, when needed some details pertaining to implementation using a 177 given tool are described. 179 This document has evolved from a "round-table" meeting on TCP evalua- 180 tion, held at Caltech on November 8-9, 2007, reported in [1]. This doc- 181 ument is the first step in constructing the evaluation suite; the goal 182 is for the evaluation suite to be adapted in response to feedback from 183 the networking community. It revises draft-irtf-tmrg-tests-02. 185 Information related to the draft can be found at: 186 http://riteproject.eu/ietf-drafts 188 2 Traffic generation 190 Congestion control concerns the response of flows to bandwidth limita- 191 tions or to the presence of other flows. Cross-traffic and reverse-path 192 traffic are therefore important to the tests described in this suite. 194 Such traffic can have the desirable effect of reducing the occurrence of 195 pathological conditions, such as global synchronization among competing 196 flows, that might otherwise be mis-interpreted as normal average behav- 197 iours of those protocols [2,3]. 
This traffic must be reasonably realis- 198 tic for the tests to predict the behaviour of congestion control proto- 199 cols in real networks, and also well-defined so that statistical noise 200 does not mask important effects. 202 2.1 Desirable model characteristics 204 Most scenarios use traffic produced by a traffic generator, with a range 205 of start times for user sessions, connection sizes, and the like, mim- 206 icking the traffic patterns commonly observed in the Internet. It is 207 important that the same "amount" of congestion or cross-traffic be used 208 for the testing scenarios of different congestion control algorithms. 209 This is complicated by the fact that packet arrivals and even flow 210 arrivals are influenced by the behavior of the algorithms. For this rea- 211 son, a pure open-loop, packet-level generation of traffic where gener- 212 ated traffic does not respond to the behaviour of other present flows is 213 not suitable. Instead, emulating application or user behaviours at the 214 end points using reactive protocols such as TCP in a closed-loop fashion 215 results in a closer approximation of cross-traffic, where user behav- 216 iours are modeled by well-defined parameters for source inputs (e.g., 217 request sizes for HTTP), destination inputs (e.g., response size), and 218 think times between pairs of source and destination inputs. By setting 219 appropriate parameters for the traffic generator, we can emulate non- 220 greedy user-interactive traffic (e.g., HTTP 1.1, SMTP and Telnet), 221 greedy traffic (e.g., P2P and long file downloads), as well as long- 222 lived but non-greedy, non-interactive flows (or thin streams). 224 This approach models protocol reactions to the congestion caused by 225 other flows in the common paths, although it fails to model the reac- 226 tions of users themselves to the presence of congestion. A model that 227 includes end-users' reaction to congestion is beyond the scope of this 228 draft, but we invite researchers to explore how the user behavior, as 229 reflected in the connection sizes, user wait times, and number of con- 230 nections per session, might be affected by the level of congestion expe- 231 rienced within a session [4]. 233 2.2 Tmix 235 There are several traffic generators available that implement a similar 236 approach to that discussed above. For now, we have chosen to use the 237 Tmix [5] traffic generator. Tmix is available for the NS2 and NS3 simu- 238 lators, and can generate traffic for testbeds (for example GENI [6]). 240 Tmix represents each TCP connection by a connection vector consisting of 241 a sequence of (request-size, response-size, think-time) triples, thus 242 representing bi-directional traffic. Connection vectors used for traffic 243 generation can be obtained from Internet traffic traces. 245 2.2.1 Base Tmix trace files for tests 247 The traces currently defined for use in the test suite are based on cam- 248 pus traffic at the University of North Carolina (see [7] for a descrip- 249 tion of construction methods and basic statistics). 251 The traces have an additional "m" field added to each connection vector 252 to provide each direction's maximum segment size for the connection. 253 This is used to provide the packet size distribution described in sec- 254 tion 2.4. 256 These traces contain a mixture of connections, from very short flows 257 that do not exist for long enough to be "congestion controlled", to long 258 thin streams, to bulk file transfer like connections. 
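To make the connection-vector abstraction concrete, the sketch below models a Tmix-style connection as a start time, the per-direction maximum segment sizes carried in the added "m" field, and a list of (request-size, response-size, think-time) epochs. The class and field names are illustrative assumptions of this sketch and do not reproduce the actual Tmix trace grammar.

   from dataclasses import dataclass, field
   from typing import List, Tuple

   @dataclass
   class ConnectionVector:
       """Illustrative model of one Tmix connection vector."""
       start_time: float                     # start time in the trace, seconds
       mss_initiator: int                    # "m" field, initiator direction, bytes
       mss_responder: int                    # "m" field, responder direction, bytes
       epochs: List[Tuple[int, int, float]] = field(default_factory=list)
       # each epoch: (request_bytes, response_bytes, think_time_seconds)

       def offered_bytes(self) -> int:
           """Application bytes this connection offers in both directions."""
           return sum(req + resp for req, resp, _ in self.epochs)

   # Example: a short request/response exchange, a 2.5 s think time, then a
   # second exchange.
   cv = ConnectionVector(start_time=12.3, mss_initiator=1460, mss_responder=1460,
                         epochs=[(300, 15000, 2.5), (250, 4000, 0.0)])
   print(cv.offered_bytes())   # -> 19550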
The traces are available at:
http://hosting.riteproject.eu/tcpevaltmixtraces.tgz

2.3 Loads

While the protocols being tested may differ, it is important that we maintain the same "load" or level of congestion for the experimental scenarios. For many of the scenarios, such as the basic ones in section 4, each scenario is run for a range of loads, where the load is varied by varying the rate of session arrivals.

2.3.1 Varying the Tmix traffic load

To adjust the traffic load for a given scenario, the connection start times for flows in a Tmix trace are scaled as follows. Connections are actually started at:

   experiment_cv_start_time = scale * cv_start_time

where cv_start_time denotes the connection vector start time in the Tmix traces and experiment_cv_start_time is the time the connection starts in the experiment. Therefore, the smaller the scale, the higher (in general) the traffic load.

2.3.1.1 Notes

Changing the connection start times also changes the way the traffic connections interact, potentially changing the "clumping" of traffic bursts.

Very small changes in the scaling parameter can cause disproportionate changes in the offered load. This is due to the possibility of a small change causing the exclusion or inclusion of a connection vector that transfers a very large amount of data.

2.3.2 Dealing with non-stationarity

The Tmix traffic traces, as they are, offer a non-stationary load. This is exacerbated for tests that do not require use of the full trace files, but only a portion of them. While removing this non-stationarity also removes some of the "realism" of the traffic, it is necessary for the test suite to produce reliable and consistent results.

A more stationary offered load is achieved by shuffling the start times of connection vectors in the Tmix trace file. The trace file is logically partitioned into n-second bins, which are then shuffled using a Fisher-Yates shuffle [8], and the required portions written to shuffled trace files for the particular experiment being conducted.

2.3.2.1 Bin size

The bin size is chosen so that there is enough shuffling with respect to the test length. The offered traffic per test second from the Tmix trace files depends on the scale factor (see section 2.3.1), which is related to the capacity of the bottleneck link. The shuffling bin size (in seconds) is set at:

   b = 500e6 / C

where C is the bottleneck link's capacity in bits per second, and 500e6 is a scaling factor (in bits).

Thus, for the access link scenario described in section 4.4.2, the bin size for shuffling will be 5 seconds.

2.3.2.2 NS2 implementation specifics

The tcl scripts for this process are distributed with the NS2 example test suite implementation. Care must be taken when using this algorithm to employ the given random number generator and the same seed; otherwise, the resulting experimental traces will differ.

2.4 Packet size distribution

For flows generated by the traffic generator, 10% use 536-byte packets and 90% use 1500-byte packets. The base Tmix traces described in section 2.2.1 have been processed at the *connection* level to have this characteristic. As a result, *packets* in a given test will be roughly, but not exactly, in this proportion. However, the proportion of offered traffic will be consistent for each experiment.
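As an illustration of the preprocessing described in sections 2.3.1 and 2.3.2, the sketch below scales connection start times, partitions them into bins of b = 500e6 / C seconds, and applies a seeded Fisher-Yates shuffle. The function and argument names are assumptions of this sketch; the tcl scripts distributed with the NS2 example implementation remain the reference.

   import random

   def shuffle_start_times(connections, capacity_bps, scale, seed=1):
       """Sketch of the trace preprocessing in sections 2.3.1 and 2.3.2.

       connections: list of (cv_start_time, connection_vector) pairs taken
       from a Tmix trace.  Returns the list with scaled start times after a
       seeded, bin-level Fisher-Yates shuffle.
       """
       # Section 2.3.1: experiment_cv_start_time = scale * cv_start_time
       scaled = [(scale * t, cv) for t, cv in connections]

       # Section 2.3.2.1: bin size b = 500e6 / C, with C in bits per second
       b = 500e6 / capacity_bps

       # Partition connections into b-second bins, keeping each start
       # time's offset within its bin (empty bins are ignored here).
       bins = {}
       for t, cv in scaled:
           bins.setdefault(int(t // b), []).append((t % b, cv))

       # Section 2.3.2: shuffle the bins.  random.shuffle() implements a
       # Fisher-Yates shuffle, and the fixed seed keeps runs reproducible.
       order = sorted(bins)
       random.Random(seed).shuffle(order)

       shuffled = [(new_idx * b + offset, cv)
                   for new_idx, old_idx in enumerate(order)
                   for offset, cv in bins[old_idx]]
       return sorted(shuffled, key=lambda item: item[0])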
2.4.1 Potential revision

As Tmix can now read and use a connection's Maximum Segment Size (MSS) from the trace file, it will be possible to produce Tmix connection vector trace files where the packet sizes reflect actual measurements.

3 Achieving reliable results in minimum time

This section describes the techniques used to achieve reliable results in the minimum test time.

3.1 Background

Over a long time, because the session arrival times are to a large extent independent of the transfer times, load could be defined as:

   A = E[f] / E[t],

where E[f] is the mean session (flow) size in bits transferred, E[t] is the mean session inter-arrival time in seconds, and A is the load in bps.

It is important to test congestion control protocols in "overloaded" conditions. However, if A > C, where C is the capacity of the bottleneck link, then the system has no equilibrium. In long-running experiments with A > C, the expected number of flows would keep on increasing with time (because as time passes, flows would tend to last for longer and longer, thus "piling up" with newly-arriving ones). This means that, in an overload scenario, some measures will be very sensitive to the duration of the tests.

3.2 Equilibrium or Steady State

Ideally, experiments should be run until some sort of equilibrium results can be obtained. Since every test algorithm can potentially change how long this may take, the following approach is adopted:

   1. Traces are shuffled to remove non-stationarity (see section
      2.3.2).

   2. The experiment run time is determined from the traffic traces.
      The shuffled traces are compiled such that the estimate of
      traffic offered in the first third of the test is equal to the
      estimate of traffic offered in the second third of the test,
      which in turn is equal to the estimate of traffic offered in
      the final third of the test, to within a 5% tolerance. The
      length of the trace files becomes the total experiment run time
      (including the warm-up time).

   3. The warm-up time, until measurements start, is calculated as
      the time at which the NS2 simulation of standard TCP achieves
      "steady state". In this case, the warm-up time is determined as
      the time required so that the measurements have statistically
      similar first- and second-half results. The metrics used as
      reference are the bottleneck raw throughput and the average
      bottleneck queue size. The latter is stable when A >> C and
      A << C.

3.3 Accelerated test start up time

The start of each test is accelerated by prefilling the network with traffic before the warm-up period ends; the prefill_t and prefill_si values used are listed in the scenario parameter tables of section 4. Figure 1 illustrates the timing.

                 | prefill_si
                 |  <---->
  |--------------|-------|----|-------------------------|
  t=0            |       t = prefill_t              t=warmup
                 |
                 |
                 t = prefill_t - prefill_si

                      Figure 1: prefilling

4 Basic scenarios

The purpose of the basic scenarios is to explore the behavior of a TCP extension over different link types. These scenarios use the dumbbell topology described in section 4.1.

4.1 Basic topology

Most tests use a simple dumbbell topology with a central link that connects two routers, as illustrated in Figure 2. Each router is also connected to three nodes by edge links. In order to generate a typical range of round trip times, edge links have different delays. Unless specified otherwise, such delays are as follows. On one side, the one-way propagation delays are 0 ms, 12 ms and 25 ms; on the other, 2 ms, 37 ms, and 75 ms.
Traffic is uniformly shared among the nine source/destination 462 pairs, giving a distribution of per-flow RTTs in the absence of queueing 463 delay shown in Table 1. These RTTs are computed for a dumbbell topology 464 assuming a delay of 0ms for the central link. The delay for the central 465 link that is used in a specific scenario is given in the next section. 467 For dummynet experiments, delays can be obtained by specifying the delay 468 of each flow. 470 4.2 Traffic 471 Node 1 Node 4 472 \_ _/ 473 \_ _/ 474 \_ __________ Central __________ _/ 475 | | link | | 476 Node 2 ------| Router 1 |----------------| Router 2 |------ Node 5 477 _|__________| |__________|_ 478 _/ \_ 479 _/ \_ 480 Node 3 / \ Node 6 482 Figure 2: A dumbbell topology 484 --------------------------------- 485 Path RTT Path RTT Path RTT 486 --------------------------------- 487 1-4 4 1-5 74 1-6 150 488 2-4 28 2-5 98 2-6 174 489 3-4 54 3-5 124 3-6 200 490 --------------------------------- 492 Table 1: Minimum RTTs of the paths between two nodes, in milliseconds. 494 In all of the basic scenarios, *all* TCP flows use the TCP extension or 495 modification under evaluation. 497 In general, the 9 bidirectional Tmix sources are connected to nodes 1 to 498 6 of figure 2 to create the paths tabulated in table 1. 500 Offered loads are estimated directly from the shuffled and scaled Tmix 501 traces, as described in section 3.2. The actual measured loads will 502 depend on the TCP variant and the scenario being tested. 504 Buffer sizes are based on the Bandwidth Delay Product (BDP), except for 505 the Dial-up scenario where a BDP buffer does not provide enough buffer- 506 ing. 508 The load generated by Tmix with the standard trace files is asymmetric, 509 with a higher load offered in the right to left direction (refer to fig- 510 ure 2) than in the left to right direction. Loads are specified for the 511 higher traffic right to left direction. For each of the basic scenarios, 512 three offered loads are tested: moderate (60%), high (85%), and overload 513 (110%). Loads are for the bottleneck link, which is the central link in 514 all scenarios except the wireless LAN scenario. 516 The 9 tmix traces are scaled using a single scaling factor in these 517 tests. This means that the traffic offered on each of the 9 paths 518 through the network is not equal, but combined at the bottleneck pro- 519 duces the specified offered load. 521 4.3 Flows under test 523 For these basic scenarios, there is no differentiation between "cross- 524 traffic" and the "flows under test". The aggregate traffic is under 525 test, with the metrics exploring both aggregate traffic and distribu- 526 tions of flow-specific metrics. 528 4.4 Scenarios 530 4.4.1 Data Center 532 The data center scenario models a case where bandwidth is plentiful and 533 link delays are generally low. All links have a capacity of 1 Gbps. 534 Links from nodes 1, 2 and 4 have a one-way propagation delay of 10 us, 535 while those from nodes 3, 5 and 6 have 100 us [9], and the central link 536 has 0 ms delay. The central link has 10 ms buffers. 
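The scenarios specify router buffers in units of time, such as the 10 ms central-link buffer above. When configuring a simulator or testbed queue in packets, the time value has to be converted; the helper below does so under the assumption of 1500-byte packets, which the draft itself does not mandate.

   def buffer_packets(capacity_bps: float, buffer_ms: float,
                      pkt_bytes: int = 1500) -> int:
       """Convert a time-based buffer specification into whole packets.

       The 1500-byte packet size is an assumption made only for this
       conversion; the scenarios specify buffers in milliseconds.
       """
       return int(capacity_bps * (buffer_ms / 1000.0) / (8 * pkt_bytes))

   # Data center (section 4.4.1): 10 ms of buffering at 1 Gbps
   print(buffer_packets(1e9, 10))     # -> 833 packets
   # Access link (section 4.4.2): 100 ms at 100 Mbps, one BDP at the mean RTT
   print(buffer_packets(100e6, 100))  # -> 833 packets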
538 load scale experiment time warmup test_time prefill_t prefill_si 539 --------------------------------------------------------------------------- 540 60% 0.56385119 156.5 4.0 145.0 7.956 4.284117 541 85% 0.372649 358.0 19.0 328.0 11.271 6.411839 542 110% 0.295601 481.5 7.5 459 14.586 7.356242 544 Table 2: Data center scenario parameters 546 4.4.1.1 Potential Revisions 548 The rate of 1 Gbps is chosen such that NS2 simulations can run in a rea- 549 sonable time. Higher values will become feasible as computing power 550 increases, however the current traces may not be long enough to drive 551 simulations or test bed experiments at higher rates. 553 The supplied Tmix traces are used here to provide a standard comparison 554 across scenarios. Data Centers, however, have very specialised traffic 555 which may not be represented well in such traces. In the future, 556 specialised Data Center traffic traces may be needed to provide a more 557 realistic test. 559 4.4.2 Access Link 561 The access link scenario models an access link connecting an institution 562 (e.g., a university or corporation) to an ISP. The central and edge 563 links are all 100 Mbps. The one-way propagation delay of the central 564 link is 2 ms, while the edge links have the delays given in Section 4.1. 565 Our goal in assigning delays to edge links is only to give a realistic 566 distribution of round-trip times for traffic on the central link. The 567 Central link buffer size is 100 ms, which is equivalent to the BDP 568 (using the mean RTT). 570 load scale experiment time warmup test_time prefill_t prefill_si 571 -------------------------------------------------------------------------- 572 60% 4.910115 440 107.0 296.0 36.72 24.103939 573 85% 3.605109 920 135.0 733.0 52.02 23.378915 574 110% 3.0027085 2710 34.0 2609.0 67.32 35.895355 576 Table 3: Access link scenario parameters (times in seconds) 578 4.4.2.1 Potential Revisions 580 As faster access links become common, the link speed for this scenario 581 will need to be updated accordingly. Also as access link buffer sizes 582 shrink to less than BDP sized buffers, this should be updated to reflect 583 these changes in the Internet. 585 4.4.3 Trans-Oceanic Link 587 The trans-oceanic scenario models a test case where mostly lower-delay 588 edge links feed into a high-delay central link. Both the central and all 589 edge links are 1 Gbps. The central link has 100 ms buffers, and a one- 590 way propagation delay of 65 ms. 65 ms is chosen as a "typical number". 591 The actual delay on real links depends, of course, on their length. For 592 example, Melbourne to Los Angeles is about 85 ms. 594 4.4.4 Geostationary Satellite 596 The geostationary satellite scenario models an asymmetric test case with 597 a high-bandwidth downlink and a low-bandwidth uplink [10,11]. The sce- 598 nario modeled is that of nodes connected to a satellite hub which has an 599 asymmetric satellite connection to the master base station which is 600 load scale experiment time warmup test_time prefill_t prefill_si 601 ----------------------------------------------------------------------- 602 60% tbd tbd tbd 603 85% tbd tbd tbd 604 110%: tbd tbd tbd 606 Table 4: Trans-Oceanic link scenario parameters 608 connected to the Internet. The capacity of the central link is asymmet- 609 ric - 40 Mbps down, and 4 Mbps up with a one-way propagation delay of 610 300 ms. Edge links are all bidirectional 100 Mbps links with one-way 611 delays as given in Section 4.1. 
The central link buffer size is 100 ms 612 for downlink and 1000 ms for uplink. 614 Note that congestion in this case is often on the 4 Mbps uplink (left to 615 right), even though most of the traffic is in the downlink direction 616 (right to left). 618 load scale experiment time warmup test_time prefill_t prefill_si 619 ----------------------------------------------------------------------- 620 60% tbd tbd tbd 621 85% tbd tbd tbd 622 110%: tbd tbd tbd 624 Table 5: Trans-Oceanic link scenario parameters 626 4.4.5 Wireless LAN 628 The wireless LAN scenario models WiFi access to a wired backbone, as 629 depicted in Figure 3. 631 The capacity of the central link is 100 Mbps, with a one-way delay of 2 632 ms. All links to Router 2 are wired. Router 1 acts as a base station for 633 a shared wireless IEEE 802.11g links. Although 802.11g has a peak bit 634 rate of 54 Mbps, its typical throughput rate is much lower, and 635 decreases under high loads and bursty traffic. The scales specified 636 here are based on a nominal rate of 6Mbps. 638 The Node_[123] to Wireless_[123] connections are to allow the same RTT 639 distribution as for the wired scenarios. This is in addition to delays 640 on the wireless link due to CSMA. Figure 3 shows how the topology should 641 look in a test bed. 643 Node_1----Wireless_1.. Node_4 644 :. / 645 :... Base central link / 646 Node_2----Wireless_2 ....:..Station-------------- Router_2 --- Node_5 647 ...: (Router 1) \ 648 .: \ 649 Node_3----Wireless_3.: Node_6 651 Figure 3: Wireless dumbell topology for a test-bed. Wireless_n are wire- 652 less transceivers for connection to the base station 654 load scale experiment time warmup test_time prefill_t prefill_si 655 ---------------------------------------------------------------------------- 656 60% 117.852049 14917 20.0 14917 0 0 657 85% 85.203155 10250.0 20.0 10230 0 0 658 110%: 65.262840 4500.0 20.0 4480 0 0 660 Table 6: Wireless LAN scenario parameters 662 The percentage load for this scenario is based on the sum of the esti- 663 mate of offered load in both directions since the wireless bottleneck 664 link is a shared media. Also, due to contention for the bottleneck link, 665 the accelerated start up using prefill is not used for this scenario. 667 Note that the prefill values are zero as prefill was found to be of no 668 benefit in this scenario. 670 4.4.5.1 NS2 implementation specifics 672 In NS2, this is implemented as depicted in Figure 2 The delays between 673 Node_1 and Wireless_1 are implemented as delays through the Logical Link 674 layer. 676 Since NS2 don't have a simple way of measuring transport packet loss on 677 the wireless link, dropped packets are inferred based on flow arrivals 678 and departures (see figure 4). This gives a good estimate of the average 679 loss rate over a long enough period (long compared with the transit 680 delay of packets), which is the case here. 682 logical link 683 X--------------------X 684 | | 685 v | 686 n1--+---. | _n4 687 : V / 688 n2--+---.:.C0-------------C1---n5 689 : \_ 690 n3--+---.: n6 692 Figure 4: Wireless measurements in the ns2 simulator 694 4.4.5.2 Potential revisions 696 Wireless standards are continually evolving. This scenario may need 697 updating in the future to reflect these changes. 699 Wireless links have many other unique properties not captured by delay 700 and bitrate. 
In particular, the physical layer might suffer from propa- 701 gation effects that result in packet losses, and the MAC layer might add 702 high jitter under contention or large steps in bandwidth due to adaptive 703 modulation and coding. Specifying these properties is beyond the scope 704 of the current first version of this test suite but may make useful 705 additions in the future. 707 Latency in this scenario is very much affected by contention for the 708 media. It will be good to have end-to-end delay measurements to quantify 709 this characteristic. This could include per packet latency, application 710 burst completion times, and/or application session completion times. 712 4.4.6 Dial-up Link 714 The dial-up link scenario models a network with a dial-up link of 64 715 kbps and a one-way delay of 5 ms for the central link. This could be 716 thought of as modeling a scenario reported as typical in Africa, with 717 many users sharing a single low-bandwidth dial-up link. Central link 718 buffer size of 1250 ms 720 4.4.6.1 Note on parameters 722 The traffic offered by tmix over a low bandwidth link is very bursty. It 723 takes a long time to reach some sort of statistical stability. For event 724 load scale experiment time warmup test_time prefill_t prefill_si 725 ----------------------------------------------------------------------------- 726 60% 10176.2847 1214286 273900 273900 0 0 727 85% 7679.1920 1071429 513600 557165 664.275 121.147563 728 110%: 5796.7901 2223215 440.0 2221915 859.65 180.428 730 Table 7: Dial-up link scenario parameters 732 based simulators, this is not too much of a problem, as the number of 733 packets transferred is not prohibitively high, however for test beds 734 these times are prohibitively long. This scenario needs further investi- 735 gation to address this. 737 4.4.6.2 Potential revisions 739 Modems often have asymmetric up and down link rates. Asymmetry is tested 740 in the Geostationary Satellite scenario (section 4.4.4), but the dial-up 741 scenario could be modified to model this as well. 743 4.5 Metrics of interest 745 For each run, the following metrics will be collected for the central 746 link in each direction: 748 1. the aggregate link utilization, 750 2. the average packet drop rate, and 752 3. the average queueing delay. 754 These measures only provide a general overview of performance. The goal 755 of this draft is to produce a set of tests that can be "run" at all lev- 756 els of abstraction, from Grid500's WAN, through WAN-in-Lab, testbeds and 757 simulations all the way to theory. Researchers may add additional mea- 758 sures to illustrate other performance aspects as required. 760 Other metrics of general interest include: 762 1. end-to-end delay measurements 764 2. flow-centric: 766 1. sending rate, 768 2. goodput, 769 3. cumulative loss and queueing delay trajectory for each 770 flow, over time, 772 4. the transfer time per flow versus file size 774 3. stability properties: 776 1. standard deviation of the throughput and the queueing 777 delay for the bottleneck link, 779 2. worst case stability measures, especially proving (possi- 780 bly theoretically) the stability of TCP. 782 4.6 Potential Revisions 784 As with all of the scenarios in this document, the basic scenarios could 785 benefit from more measurement studies about characteristics of congested 786 links in the current Internet, and about trends that could help predict 787 the characteristics of congested links in the future. 
This would 788 include more measurements on typical packet drop rates, and on the range 789 of round-trip times for traffic on congested links. 791 5 Latency specific experiments 793 5.1 Delay/throughput tradeoff as function of queue size 795 Performance in data communications is increasingly limited by latency. 796 Smaller and smarter buffers improve this measure, but often at the 797 expense of TCP throughput. The purpose of these tests is to investigate 798 delay-throughput tradeoffs, *with and without the particular TCP exten- 799 sion under study*. 801 Different queue management mechanisms have different delay-throughput 802 tradeoffs. It is envisaged that the tests described here would be 803 extended to explore and compare the performance of different Active 804 Queue Management (AQM) techniques. However, this is an area of active 805 research and beyond the scope of this test suite at this time. For now, 806 it may be better to have a dedicated, separate test suite to look at AQM 807 performance issues. 809 5.1.1 Topology 811 These tests use the topology of Figure 4.1. They are based on the access 812 link scenario (see section 4.4.2) with the 85% offered load used for 813 this test. 815 For each Drop-Tail scenario set, five tests are run, with buffer sizes 816 of 10%, 20%, 50%, 100%, and 200% of the Bandwidth Delay Product (BDP) 817 for a 100 ms base RTT flow (the average base RTT in the access link 818 dumbell scenario is 100 ms). 820 5.1.1.1 Potential revisions 822 Buffer sizing is still an area of research. Results from this research 823 may necessitate changes to the test suite so that it models these 824 changes in the Internet. 826 AQM is currently an area of active research. It is envisaged that these 827 tests could be extended to explore and compare the performance of key 828 AQM techniques when it becomes clear what these will be. For now a dedi- 829 cated AQM test suite would best serve such research efforts. 831 5.1.2 Flows under test 833 Two kinds of tests should be run: one where all TCP flows use the TCP 834 modification under study, and another where no TCP flows use such modi- 835 fication, as a "baseline" version. 837 The level of traffic from the traffic generator is the same as that 838 described in section 4.4.2. 840 5.1.3 Metrics of interest 842 For each test, three figures are kept, the average throughput, the aver- 843 age packet drop rate, and the average queueing delay over the measure- 844 ment period. 846 Ideally it would be better to have more complete statistics, especially 847 for queueing delay where the delay distribution can be important. It 848 would also be good for this to be illustrated with delay/bandwidth 849 graph, the x-axis shows the average queueing delay, and the y-axis shows 850 the average throughput. For the drop-rate graph, the x-axis shows the 851 average queueing delay, and the y-axis shows the average packet drop 852 rate. Each pair of graphs illustrates the delay/throughput/drop-rate 853 tradeoffs with and without the TCP mechanism under evaluation. For an 854 AQM mechanism, each pair of graphs also illustrates how the throughput 855 and average queue size vary (or don't vary) as a function of the traffic 856 load. Examples of delay/throughput tradeoffs appear in Figures 1-3 857 of[12] and Figures 4-5 of[13]. 859 5.2 Ramp up time: completion time of one flow 861 These tests aim to determine how quickly existing flows make room for 862 new flows. 
5.2.1 Topology and background traffic

The ramp up time test uses the topology shown in Figure 5. Two long-lived test TCP connections are used in this experiment. Test TCP connection 1 runs between T_n1 and T_n3, with data flowing from T_n1 to T_n3, and test TCP connection 2 runs between T_n2 and T_n4, with data flowing from T_n2 to T_n4. The background traffic topology is identical to that used in the basic scenarios (see section 4 and Figure 2); i.e., background flows run between nodes B_n1 to B_n6.

              T_n2                         T_n4
               |                             |
               |                             |
      T_n1     |                             |     T_n3
          \    |                             |    /
           \   |                             |   /
     B_n1--- R1--------------------------R2--- B_n4
           /   |                             |   \
          /    |                             |    \
     B_n2      |                             |     B_n5
               |                             |
              B_n3                         B_n6

            Figure 5: Ramp up dumbbell test topology

Experiments are conducted with capacities of 56 kbps, 10 Mbps and 1 Gbps for the central link. The 56 kbps case is included to investigate the performance of low bit rate devices such as mobile handsets or dial-up modems.

For each capacity, three RTT scenarios should be tested, in which the existing and newly arriving flows have RTTs of (80, 80), (120, 30), and (30, 120) ms, respectively. These RTTs are made up of a central link with a 2 ms delay in each direction and the test link delays shown in Table 8.

Throughout the experiment, the offered load of the background (or cross) traffic is 10% of the central link capacity in the right to left direction. The background traffic is generated in the same manner as for the basic scenarios (see section 4).

   ----------------------------------
   RTT        T_n1   T_n2   T_n3   T_n4
   scenario   (ms)   (ms)   (ms)   (ms)
   ----------------------------------
   1            0      0     38     38
   2           23     12     35      1
   3           12     23      1     35
   ----------------------------------

   Table 8: Link delays for the test TCP source connections to the
   central link

   Central link   scale      experiment time   warmup   test_time   prefill_t   prefill_si
   ---------------------------------------------------------------------------------------
   56 kbps
   10 Mbps
   1 Gbps         3.355228   324                                    9.18        2.820201

   Table 9: Ramp-up time scenario parameters (times in seconds)

All traffic for this scenario uses the TCP extension under test.

5.2.2 Flows under test

Traffic is dominated by the two long-lived test flows, because we believe that to be the worst case, in which convergence is slowest.

One flow starts in "equilibrium" (at least having finished normal slow-start). A new flow then starts; slow-start is disabled by setting the initial slow-start threshold to the initial CWND. Slow-start is disabled because this is the worst case, and could happen if a loss occurred in the first RTT.

The experiment ends once the new flow has run for five minutes. Both of the flows use 1500-byte packets. The test should be run both with Standard TCP and with the TCP extension under test for comparison.

5.2.2.1 Potential Revisions

It may also be useful to conduct the tests with slow-start enabled, if time permits.

5.2.3 Metrics of interest

The output of these experiments is the time until the (1500 * 10^n)-th byte of the new flow is received, for n = 1, 2, ... . This measures how quickly the existing flow releases capacity to the new flow, without requiring a definition of when "fairness" has been achieved. By leaving the upper limit on n unspecified, the test remains applicable to very high-speed networks.
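A minimal sketch of extracting this metric from a receiver-side log of the new flow is given below; the log format, a list of (time, cumulative bytes received) samples, is an assumption of the sketch rather than something the suite prescribes.

   def ramp_up_times(samples, max_n=12):
       """Times at which the new flow has received 1500 * 10^n bytes.

       samples: iterable of (time_seconds, cumulative_bytes_received) pairs,
       assumed non-decreasing.  Returns {n: time}; thresholds never reached
       are simply absent, so the upper limit on n remains unspecified.
       """
       thresholds = {n: 1500 * 10 ** n for n in range(1, max_n + 1)}
       reached = {}
       for t, rx_bytes in samples:
           for n, needed in thresholds.items():
               if n not in reached and rx_bytes >= needed:
                   reached[n] = t
       return reached

   # Toy trace of the new flow's cumulative received bytes.
   trace = [(0.1, 9000), (0.4, 20000), (1.2, 180000), (3.0, 2.1e6)]
   print(ramp_up_times(trace))   # -> {1: 0.4, 2: 1.2, 3: 3.0}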
954 A single run of this test cannot achieve statistical reliability by run- 955 ning for a long time. Instead, an average over at least three runs 956 should be taken. Each run must use different cross traffic. Different 957 cross traffic can be generated using the standard tmix trace files by 958 changing the random number seed used to shuffle the traces. 960 5.3 Transients: release of bandwidth, arrival of many flows 962 These tests investigate the impact of a sudden change of congestion 963 level. They differ from the "Ramp up time" test in that the congestion 964 here is caused by unresponsive traffic. 966 Note that this scenario has not yet been implemented in the NS2 example 967 test suite. 969 5.3.1 Topology and background traffic 971 The network is a single bottleneck link (see Figure 6), with bit rate 972 100 Mbps, with a buffer of 1024 packets (i.e., 120% of the BDP at 100 973 ms). 975 T T 976 \ / 977 \ / 978 R1--------------------------R2 979 / \ 980 / \ 981 U U 983 Figure 6: Transient test topology 985 The transient traffic is generated using UDP, to avoid overlap with the 986 ramp-up time scenario (see section 5.2) and isolate the behavior of the 987 flows under study. 989 Three transients are tested: 991 1. step decrease from 75 Mbps to 0 Mbps, 993 2. step increase from 0 Mbps to 75 Mbps, 995 3. 30 step increases of 2.5 Mbps at 1 s intervals. 996 These transients occur after the flow under test has exited slow-start, 997 and remain until the end of the experiment. 999 There is no TCP cross traffic in this experiment. 1001 5.3.2 Flows under test 1003 There is one flow under test: a long-lived flow in the same direction as 1004 the transient traffic, with a 100 ms RTT. The test should be run both 1005 with Standard TCP and with the TCP extension under test for comparison. 1007 5.3.3 Metrics of interest 1009 For the decrease in cross traffic, the metrics are 1011 1. the time taken for the TCP flow under test to increase its 1012 window to 60%, 80% and 90% of its BDP, and 1014 2. the maximum change of the window in a single RTT while the 1015 window is increasing to that value. 1017 For cases with an increase in cross traffic, the metric is the number of 1018 *cross traffic* packets dropped from the start of the transient until 1019 100 s after the transient. This measures the harm caused by algorithms 1020 which reduce their rates too slowly on congestion. 1022 6 Throughput- and fairness-related experiments 1024 6.1 Impact on standard TCP traffic 1026 Many new TCP proposals achieve a gain, G, in their own throughput at the 1027 expense of a loss, L, in the throughput of standard TCP flows sharing a 1028 bottleneck, as well as by increasing the link utilization. In this con- 1029 text a "standard TCP flow" is defined as a flow using SACK TCP [14] but 1030 without ECN [15]. 1032 The intention is for a "standard TCP flow" to correspond to TCP as com- 1033 monly deployed in the Internet today (with the notable exception of 1034 CUBIC, which runs by default on the majority of web servers). This sce- 1035 nario quantifies this trade off. 1037 6.1.1 Topology and background traffic 1039 The basic dumbbell topology of section 4.1 is used with the same capaci- 1040 ties as for the ramp-up time tests in section 5.2. All traffic in this 1041 scenario comes from the flows under test. 
1043 A_1 A_4 1044 B_1 B_4 1045 \ / 1046 \ central link / 1047 A_2 --- Router_1 -------------- Router_2 --- A_5 1048 B_2 / \ B_5 1049 / \ 1050 A_3 A_6 1051 B_3 B_6 1053 Figure 7: Impact on Standard TCP dumbbell 1055 6.1.2 Flows under test 1057 The scenario is performed by conducting pairs of experiments, with iden- 1058 tical flow arrival times and flow sizes. Within each experiment, flows 1059 are divided into two camps. For every flow in camp A, there is a flow 1060 with the same size, source and destination in camp B, and vice versa. 1062 These experiments use duplicate copies of the Tmix traces used in the 1063 basic scenarios (see section 4). Two offered loads are tested: 50% and 1064 100%. 1066 Two experiments are conducted. A BASELINE experiment where both camp A 1067 and camp B use standard TCP. In the second, called MIX, camp A uses 1068 standard TCP and camp B uses the new TCP extension under evaluation. 1070 The rationale for having paired camps is to remove the statistical 1071 uncertainty which would come from randomly choosing half of the flows to 1072 run each algorithm. This way, camp A and camp B have the same loads. 1074 6.1.3 Metrics of interest 1075 load scale experiment time warmup test_time prefill_t prefill_si 1076 -------------------------------------------------------------------------- 1077 50% 13.780346 660 104.0 510.0 45.90 14.262121 1078 100% 5.881093 720 49.0 582.0 91.80 23.382947 1080 Table 10: Impact on Standard TCP scenario parameters 1082 The gain achieved by the new algorithm and loss incurred by standard TCP 1083 are given, respectively, by G=T(B)_Mix/T(B)_Baseline and 1084 L=T(A)_Mix/T(A)_Baseline where T(x) is the throughput obtained by camp 1085 x, measured as the amount of data acknowledged by the receivers (that 1086 is, "goodput"). 1088 The loss, L, is analogous to the "bandwidth stolen from TCP" in [16] and 1089 "throughput degradation" in [17]. 1091 A plot of G vs L represents the tradeoff between efficiency and loss. 1093 6.1.3.1 Suggestions 1095 Other statistics of interest are the values of G and L for each 1096 quartile of file sizes. This will reveal whether the new proposal is 1097 more aggressive in starting up or more reluctant to release its share of 1098 capacity. 1100 As always, testing at other loads and averaging over multiple runs is 1101 encouraged. 1103 6.2 Intra-protocol and inter-RTT fairness 1105 These tests aim to measure bottleneck bandwidth sharing among flows of 1106 the same protocol with the same RTT, which represents the flows going 1107 through the same routing path. The tests also measure inter-RTT fair- 1108 ness, the bandwidth sharing among flows of the same protocol where rout- 1109 ing paths have a common bottleneck segment but might have different 1110 overall paths with different RTTs. 1112 6.2.1 Topology and background traffic 1114 The topology, the capacity and cross traffic conditions of these tests 1115 are the same as in section 5.2. The bottleneck buffer is varied from 1116 25% to 200% of the BDP for a 100 ms base RTT flow, increasing by factors 1117 of 2. 1119 6.2.2 Flows under test 1120 We use two flows of the same protocol variant for this experiment. The 1121 RTTs of the flows range from 10 ms to 160 ms (10 ms, 20 ms, 40 ms, 80 1122 ms, and 160 ms) such that the ratio of the minimum RTT over the maximum 1123 RTT is at most 1/16. 
1125 6.2.2.1 Intra-protocol fairness: 1127 For each run, two flows with the same RTT, taken from the range of RTTs 1128 above, start randomly within the first 10% of the experiment duration. 1129 The order in which these flows start doesn't matter. An additional test 1130 of interest, but not part of this suite, would involve two extreme cases 1131 - two flows with very short or long RTTs (e.g., a delay less than 1-2 ms 1132 representing communication happening in a data-center, and a delay 1133 larger than 600 ms representing communication over a satellite link). 1135 6.2.2.2 Inter-RTT fairness: 1137 For each run, one flow with a fixed RTT of 160 ms starts first, and 1138 another flow with a different RTT taken from the range of RTTs above, 1139 joins afterward. The starting times of both two flows are randomly cho- 1140 sen within the first 10% of the experiment as before. 1142 6.2.3 Metrics of interest 1144 The output of this experiment is the ratio of the average throughput 1145 values of the two flows. The output also includes the packet drop rate 1146 for the congested link. 1148 6.3 Multiple bottlenecks 1150 These experiments explore the relative bandwidth for a flow that tra- 1151 verses multiple bottlenecks, with respect to that of flows that have the 1152 same round-trip time but each traverse only one of the bottleneck links. 1154 6.3.1 Topology and traffic 1156 The topology is a "parking-lot" topology with three (horizontal) bottle- 1157 neck links and four (vertical) access links. The bottleneck links have 1158 a rate of 100 Mbps, and the access links have a rate of 1 Gbps. 1160 All flows have a round-trip time of 60 ms, to enable the effect of 1161 traversing multiple bottlenecks to be distinguished from that of differ- 1162 ent round trip times. 1164 This can be achieved in both a symmetric and asymmetric way (see figures 1165 8 and 9). It is not clear whether there are interesting performance 1166 differences between these two topologies, and if so, which is more typi- 1167 cal of the actual internet. 1169 > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > 1170 __________ 0ms _________________ 0ms __________________ 30ms ____ 1171 | ................ | ................ | ................ | 1172 | : : | : : | : : | 1173 | : : | : : | : : | 1174 0ms : : 30ms : : 0ms : : 0ms 1175 | ^ V | ^ V | ^ V | 1177 Figure 8: Asymmetric parking lot topology 1179 > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > 1180 __________ 10ms _______________ 10ms ________________ 10ms ___ 1181 | ............... | ............... | ............... | 1182 | : : | : : | : : | 1183 | : : | : : | : : | 1184 10ms : : 10ms : : 10ms : : 10ms 1185 | ^ V | ^ V | ^ V | 1187 Figure 9: Symmetric parking lot topology 1189 The three hop topology used in the test suite is based on the symmetric 1190 topology (see figure 10). Bidirectional traffic flows between Nodes 1 1191 and 8, 2 and 3, 4 and 5, and 6 and 7. 1193 The first four Tmix trace files are used to generate the traffic. Each 1194 Tmix source offers the same load for each experiment. Three experiments 1195 are conducted at 30%, 40%, and 50% offered loads per Tmix source. As two 1196 sources share each of the three bottlenecks (A,B,C), the combined 1197 offered loads on the bottlenecks is 60%, 80%, and 100% respectively. 1199 All traffic uses the new TCP extension under test. 
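The load accounting in this scenario, and the role of the scale factors quoted in the earlier parameter tables, can be illustrated with a short sketch: scaling all start times by the scale factor stretches the mean session inter-arrival time E[t] by the same factor, so the offered load A = E[f]/E[t] of section 3.1 is divided by that factor. The helper below gives only a first-order estimate under illustrative assumptions; the published scenario tables use scale values derived from the actual shuffled traces.

   def estimate_scale(total_trace_bytes, trace_duration_s, capacity_bps,
                      sources_per_bottleneck=2, target_load=0.60):
       """First-order estimate of a common Tmix scale factor.

       The unscaled load of one source is estimated as A = E[f]/E[t]
       (section 3.1), here approximated by total bytes over trace duration.
       Multiplying start times by the scale factor (section 2.3.1) divides
       that load by the same factor, and sources_per_bottleneck sources
       share each parking-lot bottleneck (section 6.3.1).  Names and the
       estimate are illustrative only.
       """
       unscaled_load_bps = 8.0 * total_trace_bytes / trace_duration_s
       per_source_target_bps = target_load * capacity_bps / sources_per_bottleneck
       return unscaled_load_bps / per_source_target_bps

   # Hypothetical trace offering 45 GB over 3600 s, 100 Mbps bottlenecks,
   # two sources per bottleneck, 60% combined load target.
   print(round(estimate_scale(45e9, 3600.0, 100e6), 2))   # -> 3.33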
1201 6.3.1.1 Potential Revisions 1202 Node_1 Node_3 Node_5 Node_7 1203 \ | | / 1204 \ |10ms |10ms /10ms 1205 0ms \ | | / 1206 \ A | B | C / 1207 Router1 ---Router2---Router3--- Router4 1208 / 10ms | 10ms | 10ms \ 1209 / | | \ 1210 10ms/ |10ms |10ms \ 0ms 1211 / | | \ 1212 Node_2 Node_4 Node_6 Node_8 1214 Flow 1: Node_1 <--> Node_8 1215 Flow 2: Node_2 <--> Node_3 1216 Flow 3: Node_4 <--> Node_5 1217 Flow 4: Node_6 <--> Node_7 1219 Figure 10: Test suite parking lot topology 1221 load scale 1 prefill_t prefill_si scale 2 prefill_t 1222 prefill_si scale 3 prefill_t prefill_si total time warmup test_time 1223 ------------------------------------------------------------------------------------------------------------ 1224 50% tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd 1225 100% tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd tbd 1227 Table 11: Multiple bottleneck scenario parameters 1229 Parking lot models with more hops may also be of interest. 1231 6.3.2 Metrics of interest 1233 The output for this experiment is the ratio between the average through- 1234 put of the single-bottleneck flows and the throughput of the multiple- 1235 bottleneck flow, measured after the warmup period. Output also includes 1236 the packet drop rate for the congested link. 1238 7 Implementations 1239 At the moment the only implementation effort is using the NS2 simulator. 1240 It is still a work in progress, but contains the base to most of the 1241 test, as well as the algorithms that determined the test parameters. It 1242 is being made available to the community for further development and 1243 verification through ***** url *** 1245 At the moment there are no ongoing test bed implementations. We invite 1246 the community to initiate and contribute to the development of these 1247 test beds. 1249 8 Acknowledgments 1251 This work is based on a paper by Lachlan Andrew, Cesar Marcondes, Sally 1252 Floyd, Lawrence Dunn, Romaric Guillier, Wang Gang, Lars Eggert, Sangtae 1253 Ha and Injong Rhee [1]. 1255 The authors would also like to thank Roman Chertov, Doug Leith, Saverio 1256 Mascolo, Ihsan Qazi, Bob Shorten, David Wei and Michele Weigle for valu- 1257 able feedback and acknowledge the work of Wang Gang to start the NS2 1258 implementation. 1260 This work has been partly funded by the European Community under its 1261 Seventh Framework Programme through the Reducing Internet Transport 1262 Latency (RITE) project (ICT-317700), by the Aurora-Hubert Curien Part- 1263 nership program "ANT" (28844PD / 221629), and under Australian Research 1264 Council's Discovery Projects funding scheme (project number 0985322). 1266 9 Bibliography 1268 [1] L. L. H. Andrew, C. Marcondes, S. Floyd, L. Dunn, R. Guillier, W. 1269 Gang, L. Eggert, S. Ha, and I. Rhee, "Towards a common TCP evaluation 1270 suite," in Protocols for Fast, Long Distance Networks (PFLDnet), 5-7 Mar 1271 2008. 1273 [2] S. Floyd and E. Kohler, "Internet research needs better models," 1274 SIGCOMM Comput. Commun. Rev., vol. 33, pp. 29--34, Jan. 2003. 1276 [3] S. Mascolo and F. Vacirca, "The effect of reverse traffic on the 1277 performance of new TCP congestion control algorithms for gigabit net- 1278 works," in Protocols for Fast, Long Distance Networks (PFLDnet), 2006. 1280 [4] D. Rossi, M. Mellia, and C. Casetti, "User patience and the web: a 1281 hands-on investigation," in Global Telecommunications Conference, 2003. 1282 GLOBECOM 1284 [5] M. C. Weigle, P. Adurthi, F. Hernandez-Campos, K. Jeffay, and F. D. 
1285 Smith, "Tmix: a tool for generating realistic TCP application workloads 1286 in ns-2," SIGCOMM Comput. Commun. Rev., vol. 36, pp. 65--76, July 2006. 1288 [6] G. project, "Tmix on ProtoGENI." 1290 [7] J. xxxxx, "Tmix trace generation for the TCP evaluation suite." 1291 http://web.archive.org/web/20100711061914/http://wil-ns.cs.caltech.edu/ 1292 benchmark/traffic/. 1294 [8] Wikipedia, "Fisher-Yates shuffle." 1295 http://en.wikipedia.org/wiki/Fisher-Yates_shuffle. 1297 [9] M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. 1298 Prabhakar, S. Sengupta, and M. Sridharan, "Data center tcp (dctcp)," in 1299 Proceedings of the ACM SIGCOMM 2010 conference, SIGCOMM '10, (New York, 1300 NY, USA), pp. 63--74, ACM, 2010. 1302 [10] T. Henderson and R. Katz, "Transport protocols for internet-compat- 1303 ible satellite networks," Selected Areas in Communications, IEEE Journal 1304 on, vol. 17, no. 2, pp. 326--344, 1999. 1306 [11] A. Gurtov and S. Floyd, "Modeling wireless links for transport pro- 1307 tocols," SIGCOMM Comput. Commun. Rev., vol. 34, pp. 85--96, Apr. 2004. 1309 [12] S. Floyd, R. Gummadi, and S. Shenker, "Adaptive RED: An algorithm 1310 for increasing the robustness of RED," tech. rep., ICIR, 2001. 1312 [13] L. L. H. Andrew, S. V. Hanly, and R. G. Mukhtar, "Active queue man- 1313 agement for fair resource allocation in wireless networks," IEEE Trans- 1314 actions on Mobile Computing, vol. 7, pp. 231--246, Feb. 2008. 1316 [14] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky, "An Extension to 1317 the Selective Acknowledgement (SACK) Option for TCP." RFC 2883 (Proposed 1318 Standard), July 2000. 1320 [15] K. Ramakrishnan, S. Floyd, and D. Black, "The Addition of Explicit 1321 Congestion Notification (ECN) to IP." RFC 3168 (Proposed Standard), 1322 Sept. 2001. Updated by RFCs 4301, 6040. 1324 [16] E. Souza and D. Agarwal, "A highspeed TCP study: Characteristics 1325 and deployment issues," Tech. Rep. LBNL-53215, LBNL, 2003. 1327 [17] H. Shimonishi, M. Sanadidi, and T. Murase, "Assessing interactions 1328 among legacy and high-speed tcp protocols," in Protocols for Fast, Long 1329 Distance Networks (PFLDnet), 2007. 1331 [18] N. Hohn, D. Veitch, and P. Abry, "The impact of the flow arrival 1332 process in internet traffic," in Acoustics, Speech, and Signal Process- 1333 ing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Confer- 1334 ence on, vol. 6, pp. VI-37--40 vol.6, 2003. 1336 [19] F. Kelly, Reversibility and stochastic networks. University of 1337 Cambridge Statistical Laboratory, 1979. 1339 A Discussions on Traffic 1341 While the protocols being tested may differ, it is important that we 1342 maintain the same "load" or level of congestion for the experimental 1343 scenarios. To enable this, we use a hybrid of open-loop and close-loop 1344 approaches. For this test suite, network traffic consists of sessions 1345 corresponding to individual users. Because users are independent, these 1346 session arrivals are well modeled by an open-loop Poisson process. A 1347 session may consist of a single greedy TCP flow, multiple greedy flows 1348 separated by user "think" times, a single non-greedy flow with embedded 1349 think times, or many non-greedy "thin stream" flows. process forms a 1350 Poisson process [18]. Both the think times and burst sizes have heavy- 1351 tailed distributions, with the exact distribution based on empirical 1352 studies. The think times and burst sizes will be chosen independently. 
Authors' Addresses

   David Hayes
   University of Oslo
   Department of Informatics, P.O. Box 1080 Blindern
   Oslo N-0316
   Norway

   Email: davihay@ifi.uio.no

   David Ros
   Institut Mines-Telecom / Telecom Bretagne
   2 rue de la Chataigneraie
   35510 Cesson-Sevigne
   France

   Email: david.ros@telecom-bretagne.eu

   Lachlan L.H. Andrew
   CAIA Swinburne University of Technology
   P.O. Box 218, John Street
   Hawthorn Victoria 3122
   Australia

   Email: lachlan.andrew@gmail.com

   Sally Floyd
   ICSI
   1947 Center Street, Ste. 600
   Berkeley CA 94704
   United States