IP Performance Working Group                                   M. Mathis
Internet-Draft                                               Google, Inc
Intended status: Experimental                                  A. Morton
Expires: August 18, 2014                                       AT&T Labs
                                                       February 14, 2014

                  Model Based Bulk Performance Metrics
               draft-ietf-ippm-model-based-metrics-02.txt

Abstract

   We introduce a new class of model based metrics designed to
   determine if an end-to-end Internet path can meet predefined
   transport performance targets by applying a suite of IP diagnostic
   tests to successive subpaths.  The subpath-at-a-time tests are
   designed to accurately detect if any subpath will prevent the full
   end-to-end path from meeting the specified target performance.  Each
   IP diagnostic test consists of a precomputed traffic pattern and
   statistical criteria for evaluating packet delivery.

   The IP diagnostic tests are based on traffic patterns that are
   precomputed to mimic TCP or another transport protocol operating
   over a long path, but are independent of the actual details of the
   subpath under test.  Likewise the success criteria depend on the
   target performance and not on the actual performance of the subpath.
   This makes the measurements open loop, eliminating nearly all of the
   difficulties encountered by traditional bulk transport metrics.

   This document does not fully define diagnostic tests, but provides a
   framework for designing suites of diagnostic tests that are tailored
   to confirming the target performance.

   By making the tests open loop, we eliminate the equilibrium behavior
   of standard congestion control, which otherwise causes every
   measured parameter to be sensitive to every component of the system.
   As an open loop test, various measurable properties become
   independent and potentially subject to an algebra, enabling several
   important new uses.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 18, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. TODO
   2. Terminology
   3. New requirements relative to RFC 2330
   4. Background
      4.1. TCP properties
      4.2. Diagnostic Approach
   5. Common Models and Parameters
      5.1. Target End-to-end parameters
      5.2. Common Model Calculations
      5.3. Parameter Derating
   6. Common testing procedures
      6.1. Traffic generating techniques
         6.1.1. Paced transmission
         6.1.2. Constant window pseudo CBR
         6.1.3. Scanned window pseudo CBR
         6.1.4. Concurrent or channelized testing
         6.1.5. Intermittent Testing
         6.1.6. Intermittent Scatter Testing
      6.2. Interpreting the Results
         6.2.1. Test outcomes
         6.2.2. Statistical criteria for measuring run_length
            6.2.2.1. Alternate criteria for measuring run_length
         6.2.3. Reordering Tolerance
      6.3. Test Qualifications
   7. Diagnostic Tests
      7.1. Basic Data Rate and Run Length Tests
         7.1.1. Run Length at Paced Full Data Rate
         7.1.2. Run Length at Full Data Windowed Rate
         7.1.3. Background Run Length Tests
      7.2. Standing Queue tests
         7.2.1. Congestion Avoidance
         7.2.2. Bufferbloat
         7.2.3. Non excessive loss
         7.2.4. Duplex Self Interference
      7.3. Slowstart tests
         7.3.1. Full Window slowstart test
         7.3.2. Slowstart AQM test
      7.4. Sender Rate Burst tests
      7.5. Combined Tests
         7.5.1. Sustained burst test
         7.5.2. Live Streaming Media
   8. Examples
      8.1. Near serving HD streaming video
      8.2. Far serving SD streaming video
      8.3. Bulk delivery of remote scientific data
   9. Validation
   10. Acknowledgements
   11. Informative References
   Appendix A. Model Derivations
      A.1. Queueless Reno
      A.2. CUBIC
   Appendix B. Complex Queueing
   Appendix C. Version Control
   Authors' Addresses

1. Introduction

   Bulk performance metrics evaluate an Internet path's ability to
   carry bulk data.  Model based bulk performance metrics rely on
   mathematical TCP models to design a targeted diagnostic suite (TDS)
   of IP performance tests which can be applied independently to each
   subpath of the full end-to-end path.  These targeted diagnostic
   suites allow independent tests of subpaths to accurately detect if
   any subpath will prevent the full end-to-end path from delivering
   bulk data at the specified performance target, independent of the
   measurement vantage points or other details of the test procedures
   used for each measurement.

   The end-to-end target performance is determined by the needs of the
   user or application, which are outside the scope of this document.
   For bulk data transport, the primary performance parameter of
   interest is the target data rate.  However, since TCP's ability to
   compensate for less than ideal network conditions is fundamentally
   affected by the Round Trip Time (RTT) and the Maximum Transmission
   Unit (MTU) of the entire end-to-end path that the data traverses,
   these parameters must also be specified in advance.  They may
   reflect a specific real path through the Internet or an idealized
   path representing a typical user community.  The target values for
   these three parameters, Data Rate, RTT and MTU, inform the
   mathematical models used to design the TDS.
   Each IP diagnostic test in a TDS consists of a precomputed traffic
   pattern and statistical criteria for evaluating packet delivery.

   Mathematical models are used to design traffic patterns that mimic
   TCP or another bulk transport protocol operating at the target data
   rate, MTU and RTT over a full range of conditions, including flows
   that are bursty at multiple time scales.  The traffic patterns are
   computed in advance based on the three target parameters of the
   end-to-end path, and independent of the properties of individual
   subpaths.  As much as possible the measurement traffic is generated
   deterministically in ways that minimize the extent to which test
   methodology, measurement points, measurement vantage or path
   partitioning affect the details of the measurement traffic.

   Mathematical models are also used to compute the bounds on the
   packet delivery statistics for acceptable IP performance.  Since
   these statistics, such as packet loss, are typically aggregated from
   all subpaths of the end-to-end path, the end-to-end statistical
   bounds need to be apportioned as a separate bound for each subpath.
   Note that links that are expected to be bottlenecks are expected to
   contribute more packet loss and/or delay.  In compensation, other
   links have to be constrained to contribute less packet loss and
   delay.  The criterion for passing each test of a TDS is an
   apportioned share of the total bound determined by the mathematical
   model from the end-to-end target performance.

   In addition to passing or failing, a test can be deemed to be
   inconclusive for a number of reasons, including: the precomputed
   traffic pattern was not accurately generated, the measurement
   results were not statistically significant, or some test
   precondition was not met.

   This document describes a framework for deriving traffic patterns
   and delivery statistics for model based metrics.  It does not fully
   specify any measurement techniques.  Important details such as
   packet type-p selection, sampling techniques, vantage selection,
   etc. are not specified here.  We imagine Fully Specified Targeted
   Diagnostic Suites (FSTDS) that define all of these details.  We use
   TDS to refer to the subset of such a specification that is in scope
   for this document.  A TDS includes the target parameters,
   documentation of the models and assumptions used to derive the
   diagnostic test parameters, specifications for the traffic and
   delivery statistics for the tests themselves, and a description of a
   test setup that can be used to validate the tests and models.

   Section 2 defines terminology used throughout this document.

   It has been difficult to develop Bulk Transport Capacity [RFC3148]
   metrics due to some overlooked requirements described in Section 3
   and some intrinsic problems with using protocols for measurement,
   described in Section 4.

   In Section 5 we describe the models and common parameters used to
   derive the targeted diagnostic suite.  In Section 6 we describe
   common testing procedures.  Each subpath is evaluated using a suite
   of far simpler and more predictable diagnostic tests described in
   Section 7.
   In Section 8 we present three example TDSs: one that might be
   representative of HD video served fairly close to the user, a second
   that might be representative of standard video served from a greater
   distance, and a third that might be representative of high
   performance bulk data delivered over a transcontinental path.

   There exists a small risk that the model based metrics themselves
   might yield a false pass result, in the sense that every subpath of
   an end-to-end path passes every IP diagnostic test and yet a real
   application fails to attain the performance target over the
   end-to-end path.  If this happens, then the validation procedure
   described in Section 9 needs to be used to prove and potentially
   revise the models.

   Future documents will define model based metrics for other traffic
   classes and application types, such as real time streaming media.

1.1. TODO

   Please send comments on this draft to ippm@ietf.org.  See
   http://goo.gl/02tkD for more information including: interim drafts,
   an up-to-date todo list and information on contributing.

2. Terminology

   Terminology about paths, etc.: see [RFC2330] and
   [I-D.morton-ippm-lmap-path].

   [data] sender:  Host sending data and receiving ACKs.
   [data] receiver:  Host receiving data and sending ACKs.
   subpath:  A portion of the full path.  Note that there is no
      requirement that subpaths be non-overlapping.
   Measurement Point:  Measurement points as described in
      [I-D.morton-ippm-lmap-path].
   test path:  A path between two measurement points that includes a
      subpath of the end-to-end path under test, and could include
      infrastructure between the measurement points and the subpath.
   [Dominant] Bottleneck:  The bottleneck that generally dominates
      traffic statistics for the entire path.  It typically determines
      a flow's self clock timing, packet loss and ECN marking rate.
      See Section 4.1.
   front path:  The subpath from the data sender to the dominant
      bottleneck.
   back path:  The subpath from the dominant bottleneck to the
      receiver.
   return path:  The path taken by the ACKs from the data receiver to
      the data sender.
   cross traffic:  Other, potentially interfering, traffic competing
      for resources (network and/or queue capacity).

   The following properties are determined by the end-to-end path and
   the application.  They are described in more detail in Section 5.1.

   Application Data Rate:  General term for the data rate as seen by
      the application above the transport layer.  This is the payload
      data rate, and excludes transport and lower-level headers (TCP/IP
      or other protocols) as well as retransmissions and other data
      that does not contribute to the total quantity of data delivered
      to the application.
   Link Data Rate:  General term for the data rate as seen by the link
      or lower layers.  The link data rate includes transport and IP
      headers, retransmits and other transport layer overhead.  This
      document is agnostic as to whether the link data rate includes or
      excludes framing, MAC, or other lower layer overheads, except
      that they must be treated uniformly.
   end-to-end target parameters:  Application or transport performance
      goals for the end-to-end path.  They include the target data
      rate, RTT and MTU described below.
   Target Data Rate:  The application data rate, typically the ultimate
      user's performance goal.
   Target RTT (Round Trip Time):  The baseline (minimum) RTT of the
      longest end-to-end path over which the application expects to
      meet the target performance.  TCP and other transport protocols'
      ability to compensate for path problems is generally proportional
      to the number of round trips per second.  The Target RTT
      determines both key parameters of the traffic patterns (e.g.
      burst sizes) and the thresholds on acceptable traffic statistics.
      The Target RTT must be specified considering authentic packet
      sizes: MTU sized packets on the forward path, ACK sized packets
      (typically the header_overhead) on the return path.
   Target MTU (Maximum Transmission Unit):  The maximum MTU supported
      by the end-to-end path over which the application expects to meet
      the target performance.  Assume 1500 Byte packets unless
      otherwise specified.  If some subpath forces a smaller MTU, then
      it becomes the target MTU, and all model calculations and subpath
      tests must use the same smaller MTU.
   Effective Bottleneck Data Rate:  This is the bottleneck data rate
      inferred from the ACK stream, by looking at how much data the ACK
      stream reports delivered per unit time.  If the path is thinning
      ACKs or batching packets, the effective bottleneck rate can be
      much higher than the average link rate.  See Section 4.1 and
      Appendix B for more details.
   [sender | interface] rate:  The burst data rate, constrained by the
      data sender's interfaces.  Today 1 or 10 Gb/s are typical.
   Header_overhead:  The IP and TCP header sizes, which are the portion
      of each MTU not available for carrying application payload.
      Without loss of generality this is assumed to be the size of the
      returning acknowledgements (ACKs).  For TCP, the Maximum Segment
      Size (MSS) is the Target MTU minus the header_overhead.

   The following are basic parameters common to the models and subpath
   tests.  They are described in more detail in Section 5.2.  Note that
   these are mixed between application transport performance (which
   excludes headers) and link IP performance (which includes headers).

   pipe size:  A general term for the number of packets needed in
      flight (the window size) to exactly fill some network path or
      subpath.  This is normally the window size at the onset of
      queueing.
   target_pipe_size:  The number of packets in flight (the window size)
      needed to exactly meet the target rate, with a single stream and
      no cross traffic, for the specified application target data rate,
      RTT, and MTU.  It is the amount of circulating data required to
      meet the target data rate, and implies the scale of the bursts
      that the network might experience.
   run length:  A general term for the observed, measured, or specified
      number of packets that are (to be) delivered between losses or
      ECN marks.  Nominally one over the loss or ECN marking
      probability, if losses or marks are independently and identically
      distributed.
   target_run_length:  The target_run_length is an estimate of the
      minimum required headway between losses or ECN marks necessary to
      attain the target_data_rate over a path with the specified
      target_RTT and target_MTU, as computed by a mathematical model of
      TCP congestion control.  A reference calculation is shown in
      Section 5.2 and alternatives in Appendix A.

   The following are ancillary parameters used for some tests.

   derating:  Under some conditions the standard models are too
      conservative.
      The modeling framework permits some latitude in relaxing or
      derating some test parameters as described in Section 5.3, in
      exchange for a more stringent TDS validation procedure, described
      in Section 9.
   subpath_data_rate:  The maximum IP data rate supported by a subpath.
      This typically includes TCP/IP overhead, including headers,
      retransmits, etc.
   test_path_RTT:  The RTT between two measurement points, using
      appropriate data and ACK packet sizes.
   test_path_pipe:  The amount of data necessary to fill a test path.
      Nominally the test path RTT times the subpath_data_rate (where
      the subpath should be part of the end-to-end path).
   test_window:  The window necessary to meet the target_rate over a
      subpath.  Typically test_window = target_data_rate * test_RTT /
      (target_MTU - header_overhead).

   Tests can be classified into groups according to their
   applicability.

   Capacity tests:  determine if a network subpath has sufficient
      capacity to deliver the target performance.  As long as the test
      traffic is within the proper envelope for the target end-to-end
      performance, the average packet losses or ECN marks must be below
      the threshold computed by the model.  As such, capacity tests
      reflect parameters that can transition from passing to failing as
      a consequence of cross traffic, additional presented load or the
      actions of other network users.  By definition, capacity tests
      also consume significant network resources (data capacity and/or
      buffer space), and the test schedules must be balanced against
      their cost.
   Monitoring tests:  are designed to capture the most important
      aspects of a capacity test, but without presenting excessive
      ongoing load themselves.  As such they may miss some details of
      the network's performance, but can serve as a useful reduced-cost
      proxy for a capacity test.
   Engineering tests:  evaluate how network algorithms (such as AQM and
      channel allocation) interact with TCP-style self clocked
      protocols and adaptive congestion control based on packet loss
      and ECN marks.  These tests are likely to have complicated
      interactions with other traffic and under some conditions can be
      inversely sensitive to load.  For example, a test to verify that
      an AQM algorithm causes ECN marks or packet drops early enough to
      limit queue occupancy may experience a false pass result in the
      presence of bursty cross traffic.  It is important that
      engineering tests be performed under a wide range of conditions,
      including both in situ and bench testing, and over a wide variety
      of load conditions.  Ongoing monitoring is less likely to be
      useful for engineering tests, although sparse in situ testing
      might be appropriate.

   General Terminology:

   Targeted Diagnostic Suite (TDS):  A set of IP diagnostics designed
      to determine if a subpath can sustain flows at a specific
      target_data_rate over a path that has a target_RTT, using
      target_MTU sized packets.
   Fully Specified Targeted Diagnostic Suite (FSTDS):  A TDS together
      with additional specifications such as "type-p", which are out of
      scope for this document but need to be drawn from other standards
      documents.
   apportioned:  To divide and allocate, as in budgeting packet loss
      rates across multiple subpaths such that they accumulate to less
      than a specified end-to-end loss rate.
   open loop:  A control theory term used to describe a class of
      techniques where systems that exhibit circular dependencies can
      be analyzed by suppressing some of the dependencies, such that
      the resulting dependency graph is acyclic.
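   The relationships among these parameters are simple arithmetic.  The
   following non-normative sketch (in Python) illustrates the
   test_window formula above; all parameter values are hypothetical,
   and the bits-to-bytes conversion is our own assumption since the
   formula itself is unit-agnostic:

      # Illustrative only: compute test_window as defined above.
      target_data_rate = 5e6    # bits/s of application payload (assumed)
      test_path_RTT = 0.030     # seconds (assumed)
      target_MTU = 1500         # bytes
      header_overhead = 52      # bytes of TCP/IP headers (assumed)

      mss = target_MTU - header_overhead     # payload bytes per packet

      # test_window = target_data_rate * test_RTT /
      #               (target_MTU - header_overhead)
      test_window = (target_data_rate / 8) * test_path_RTT / mss
      print(round(test_window), "packet window")   # 13 packets here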
3. New requirements relative to RFC 2330

   Model Based Metrics are designed to fulfill some additional
   requirements that were not recognized at the time RFC 2330 was
   written [RFC2330].  These missing requirements may have
   significantly contributed to policy difficulties in the IP
   measurement space.  Some additional requirements are:
   o  IP metrics must be actionable by the ISP - they have to be
      interpreted in terms of behaviors or properties at the IP or
      lower layers that an ISP can test, repair and verify.
   o  Metrics must be vantage point invariant over a significant range
      of measurement point choices, including off path measurement
      points.  The only requirements on MP selection should be that the
      portion of the test path that is not under test is effectively
      ideal (or is non-ideal in ways that can be calibrated out of the
      measurements) and that the test RTT between the MPs is below some
      reasonable bound.
   o  Metrics must be repeatable by multiple parties with no
      specialized access to MPs or diagnostic infrastructure.  It must
      be possible for different parties to make the same measurement
      and observe the same results.  In particular it is specifically
      important that both a consumer (or their delegate) and the ISP be
      able to perform the same measurement and get the same result.

   NB: All of the metric requirements in RFC 2330 should be reviewed
   and potentially revised.  If such a document is opened soon enough,
   this entire section should be dropped.

4. Background

   At the time the IPPM WG was chartered, sound Bulk Transport Capacity
   measurement was known to be beyond our capabilities.  In hindsight
   it is now clear why it is such a hard problem:
   o  TCP is a control system with circular dependencies - everything
      affects performance, including components that are explicitly not
      part of the test.
   o  Congestion control is an equilibrium process, such that transport
      protocols change the network (raise the loss probability and/or
      RTT) to conform to their behavior.
   o  TCP's ability to compensate for network flaws is directly
      proportional to the number of round trips per second (i.e.
      inversely proportional to the RTT).  As a consequence a flawed
      link may pass a short RTT local test even though it fails when
      the path is extended by a perfect network to some larger RTT.
   o  TCP has a meta Heisenberg problem - measurement and cross traffic
      interact in unknown and ill-defined ways.  The situation is
      actually worse than the traditional physics problem, where you
      can at least estimate the relative momentum of the measurement
      and measured particles.  For network measurement you cannot in
      general determine the relative "elasticity" of the measurement
      traffic and cross traffic, so you cannot even gauge the relative
      magnitude of their effects on each other.

   These properties are a consequence of the equilibrium behavior
   intrinsic to how all throughput optimizing protocols interact with
   the network.  The protocols rely on control systems based on
   multiple network estimators to regulate the quantity of data sent
   into the network.
   The data in turn alters the network and the properties observed by
   the estimators, such that there are circular dependencies between
   every component and every property.  Since some of these estimators
   are nonlinear, the entire system is nonlinear, and any change
   anywhere causes difficult to predict changes in every parameter.

   Model Based Metrics overcome these problems by forcing the
   measurement system to be open loop: the delivery statistics (akin to
   the network estimators) do not affect the traffic.  The traffic and
   traffic patterns (bursts) are computed on the basis of the target
   performance.  In order for a network to pass, the resulting delivery
   statistics and corresponding network estimators have to be such that
   they would not cause the control systems to slow the traffic below
   the target rate.

4.1. TCP properties

   TCP and SCTP are self clocked protocols.  The dominant steady state
   behavior is to have an approximately fixed quantity of data and
   acknowledgements (ACKs) circulating in the network.  The receiver
   reports arriving data by returning ACKs to the data sender, and the
   data sender typically responds by sending exactly the same quantity
   of data back into the network.  The total quantity of data plus the
   data represented by ACKs circulating in the network is referred to
   as the window.  The mandatory congestion control algorithms
   incrementally adjust the window by sending slightly more or less
   data in response to each ACK.  The fundamentally important property
   of this system is that it is entirely self clocked: the data
   transmissions are a reflection of the ACKs that were delivered by
   the network, and the ACKs are a reflection of the data arriving from
   the network.

   A number of phenomena can cause bursts of data, even in idealized
   networks that are modeled as simple queueing systems.

   During slowstart the data rate is doubled on each RTT by sending
   twice as much data as was delivered to the receiver on the prior
   RTT.  For slowstart to be able to fill such a network, the network
   must be able to tolerate slowstart bursts up to the full pipe size
   inflated by the anticipated window reduction on the first loss or
   ECN mark.  For example, with classic Reno congestion control, an
   optimal slowstart has to end with a burst that is twice the
   bottleneck rate for exactly one RTT in duration.  This burst causes
   a queue which is exactly equal to the pipe size (i.e. the window is
   exactly twice the pipe size), so when the window is halved in
   response to the first loss, the new window will be exactly the pipe
   size.

   Note that if the bottleneck data rate is significantly slower than
   the rest of the path, the slowstart bursts will not cause
   significant queues anywhere else along the path; they primarily
   exercise the queue at the dominant bottleneck.

   Other sources of bursts include application pauses and channel
   allocation mechanisms.  Appendix B describes the treatment of
   channel allocation systems.  If the application pauses (stops
   reading or writing data) for some fraction of one RTT,
   state-of-the-art TCP catches up to the earlier window size by
   sending a burst of data at the full sender interface rate.  To fill
   such a network with a realistic application, the network has to be
   able to tolerate interface rate bursts from the data sender large
   enough to cover application pauses.

   Although the interface rate bursts are typically smaller than the
   last burst of a slowstart, they are at a higher data rate so they
   potentially exercise queues at arbitrary points along the front path
   from the data sender up to and including the queue at the dominant
   bottleneck.  There is no model for how frequent or what sizes of
   sender rate bursts should be tolerated.

   To verify that a path can meet a performance target, it is necessary
   to independently confirm that the path can tolerate bursts in the
   dimensions that can be caused by these mechanisms.  Three cases are
   likely to be sufficient:

   o  Slowstart bursts sufficient to get connections started properly.
   o  Frequent sender interface rate bursts that are small enough that
      they can be assumed not to significantly affect delivery
      statistics.  (Implicitly derated by selecting the burst size.)
   o  Infrequent sender interface rate full target_pipe_size bursts
      that do affect the delivery statistics.  (Target_run_length is
      derated.)
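   To make these three cases concrete, the following non-normative
   sketch computes the burst dimensions implied by the discussion
   above.  The parameter values and the 4 packet granularity are
   assumptions for illustration only; Section 5.2 defines
   target_pipe_size precisely:

      from math import ceil

      # Hypothetical target parameters (illustration only)
      target_rate = 1e7      # bits/s
      target_RTT = 0.100     # seconds
      mss = 1448             # payload bytes per packet (assumed)

      target_pipe_size = ceil(target_rate / 8 * target_RTT / mss)  # 87

      # Case 1: a slowstart burst large enough to start the connection.
      # Under Reno, the final slowstart burst arrives at twice the
      # bottleneck rate for one RTT, leaving a queue equal to the pipe
      # size, so the window peaks at twice the pipe size.
      slowstart_window_peak = 2 * target_pipe_size
      queue_at_slowstart_end = slowstart_window_peak - target_pipe_size

      # Case 2: frequent sender interface rate bursts, implicitly
      # derated by keeping the burst size small (e.g. 4 packets).
      small_burst = 4        # packets, an assumed choice

      # Case 3: infrequent full-sized sender interface rate bursts.
      full_burst = target_pipe_size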
4.2. Diagnostic Approach

   The MBM approach is to open loop TCP by precomputing traffic
   patterns that are typically generated by TCP operating at the given
   target parameters, and evaluating delivery statistics (packet loss,
   ECN marks and delay).  In this approach the measurement software
   explicitly controls the data rate, transmission pattern or cwnd
   (TCP's primary congestion control state variables) to create
   repeatable traffic patterns that mimic TCP behavior but are
   independent of the actual behavior of the subpath under test.  These
   patterns are manipulated to probe the network to verify that it can
   deliver all of the traffic patterns that a transport protocol is
   likely to generate under normal operation at the target rate and
   RTT.

   By opening the protocol control loops, we remove most sources of
   temporal and spatial correlation in the traffic delivery statistics,
   such that each subpath's contribution to the end-to-end statistics
   can be assumed to be independent and stationary.  (The delivery
   statistics depend on the fine structure of the data transmissions,
   but not on long time scale state embedded in the sender, receiver or
   other network components.)  Therefore each subpath's contribution to
   the end-to-end delivery statistics can be assumed to be independent,
   and spatial composition techniques such as [RFC5835] apply.

   In typical networks, the dominant bottleneck contributes the
   majority of the packet loss and ECN marks.  Often the rest of the
   path makes an insignificant contribution to these properties.  A TDS
   should apportion the end-to-end budget for the specified parameters
   (primarily packet loss and ECN marks) to each subpath or group of
   subpaths.  For example the dominant bottleneck may be permitted to
   contribute 90% of the loss budget, while the rest of the path is
   only permitted to contribute 10%.

   A TDS or FSTDS MUST apportion all relevant packet delivery
   statistics between different subpaths, such that the spatial
   composition of the metrics yields end-to-end statistics which are
   within the bounds determined by the models.
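   As an illustration, the apportionment and its spatial composition
   check might be sketched as follows (non-normative; the 90/10 split
   is the example from the preceding paragraph, and the target value is
   hypothetical):

      # Illustrative only: apportioning an end-to-end loss budget.
      target_run_length = 22707                  # packets (assumed)
      end_to_end_loss_budget = 1.0 / target_run_length

      shares = {"dominant bottleneck": 0.90, "rest of path": 0.10}
      subpath_budgets = {name: share * end_to_end_loss_budget
                         for name, share in shares.items()}

      # Spatial composition: for small loss ratios, the end-to-end loss
      # ratio is approximately the sum of the subpath loss ratios (cf.
      # [RFC5835]), so the budgets must sum to the end-to-end bound.
      assert abs(sum(subpath_budgets.values())
                 - end_to_end_loss_budget) < 1e-15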
   A network is expected to be able to sustain a Bulk TCP flow of a
   given data rate, MTU and RTT when the following conditions are met:
   o  The raw link rate is higher than the target data rate.
   o  The observed run length is larger than required by a suitable TCP
      performance model.
   o  There is sufficient buffering at the dominant bottleneck to
      absorb a slowstart rate burst large enough to get the flow out of
      slowstart at a suitable window size.
   o  There is sufficient buffering in the front path to absorb and
      smooth sender interface rate bursts at all scales that are likely
      to be generated by the application, any channel arbitration in
      the ACK path or other mechanisms.
   o  When there is a standing queue at a bottleneck for a shared media
      subpath, there are suitable bounds on how the data and ACKs
      interact, for example due to the channel arbitration mechanism.
   o  When there is a slowly rising standing queue at the bottleneck,
      the onset of packet loss has to be at an appropriate point (time
      or queue depth) and progressive.  This typically requires some
      form of Active Queue Management [RFC2309].

   We are developing a tool that can perform many of the tests
   described here [MBMSource].

5. Common Models and Parameters

5.1. Target End-to-end parameters

   The target end-to-end parameters are the target data rate, target
   RTT and target MTU as defined in Section 2.  These parameters are
   determined by the needs of the application or the ultimate end user
   and the end-to-end Internet path over which the application is
   expected to operate.  The target parameters are in units that make
   sense to upper layers: payload bytes delivered to the application,
   above TCP.  They exclude overheads associated with TCP and IP
   headers, retransmits and other protocols (e.g. DNS).

   Other end-to-end parameters defined in Section 2 include the
   effective bottleneck data rate, the sender interface data rate and
   the TCP/IP header sizes (overhead).

   The target data rate must be smaller than all link data rates by
   enough headroom to carry the transport protocol overhead, explicitly
   including retransmissions, and an allowance for fluctuations in the
   actual data rate, needed to meet the specified average rate.
   Specifying a target rate with insufficient headroom is likely to
   result in brittle measurements having little predictive value.

   Note that the target parameters can be specified for a hypothetical
   path, for example to construct a TDS designed for bench testing in
   the absence of a real application, or for a real physical test, for
   in situ testing of production infrastructure.

   The number of concurrent connections is explicitly not a parameter
   to this model.  If a subpath requires multiple connections in order
   to meet the specified performance, that must be stated explicitly
   and the procedure described in Section 6.1.4 applies.

5.2. Common Model Calculations

   The end-to-end target parameters are used to derive the
   target_pipe_size and the reference target_run_length.

   The target_pipe_size is the average window size in packets needed to
   meet the target rate, for the specified target RTT and MTU.  It is
   given by:

      target_pipe_size = target_rate * target_RTT /
                         ( target_MTU - header_overhead )

   Target_run_length is an estimate of the minimum required headway
   between losses or ECN marks, as computed by a mathematical model of
   TCP congestion control.  The derivation here follows [MSMO97], and
   by design is quite conservative.  The alternate models described in
   Appendix A generally yield smaller run_lengths (higher loss rates),
   but may not apply in all situations.  In any case alternate models
   should be compared to the reference target_run_length computed here.

   The reference target_run_length is derived as follows: assume the
   subpath_data_rate is infinitesimally larger than the
   target_data_rate plus the required header_overhead.  Then
   target_pipe_size also predicts the onset of queueing.  A larger
   window will cause a standing queue at the bottleneck.

   Assume the transport protocol is using standard Reno style Additive
   Increase, Multiplicative Decrease congestion control [RFC5681] (but
   not Appropriate Byte Counting [RFC3465]) and the receiver is using
   standard delayed ACKs.  Reno increases the window by one packet
   every pipe_size worth of ACKs.  With delayed ACKs this takes 2 Round
   Trip Times per increase.  To exactly fill the pipe, losses must be
   no closer together than when the peak of the AIMD sawtooth reaches
   exactly twice the target_pipe_size; otherwise the multiplicative
   window reduction triggered by a loss would cause the network to be
   underfilled.  Following [MSMO97], the number of packets between
   losses must be the area under the AIMD sawtooth.  Losses must be no
   more frequent than 1 in ((3/2)*target_pipe_size)*
   (2*target_pipe_size) packets, which simplifies to:

      target_run_length = 3*(target_pipe_size^2)

   Note that this calculation is very conservative and is based on a
   number of assumptions that may not apply.  Appendix A discusses
   these assumptions and provides some alternative models.  If a less
   conservative model is used, a fully specified TDS or FSTDS MUST
   document the actual method for computing target_run_length, along
   with the rationale for the underlying assumptions and the ratio of
   the chosen target_run_length to the reference target_run_length
   calculated above.

   These two parameters, target_pipe_size and target_run_length,
   directly imply most of the individual parameters for the tests in
   Section 7.
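   The reference calculations are easy to reproduce.  The following
   non-normative sketch evaluates both formulas for one hypothetical
   set of target parameters (the unit conversions are our own
   assumptions, since the formulas above are unit-agnostic):

      from math import ceil

      target_rate = 1e7         # target_data_rate, bits/s (assumed)
      target_RTT = 0.100        # seconds (assumed)
      target_MTU = 1500         # bytes
      header_overhead = 52      # bytes (assumed)

      mss = target_MTU - header_overhead   # payload bytes per packet

      # target_pipe_size = target_rate * target_RTT /
      #                    (target_MTU - header_overhead)
      target_pipe_size = ceil(target_rate / 8 * target_RTT / mss)

      # Reference model: target_run_length = 3 * target_pipe_size^2
      target_run_length = 3 * target_pipe_size ** 2

      print(target_pipe_size)   # 87 packets
      print(target_run_length)  # 22707 packets between losses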
5.3. Parameter Derating

   Since some aspects of the models are very conservative, this
   framework permits some latitude in derating test parameters.  Rather
   than trying to formalize more complicated models, we permit some
   test parameters to be relaxed as long as they meet some additional
   procedural constraints:
   o  The TDS or FSTDS MUST document and justify the actual method used
      to compute the derated metric parameters.
   o  The validation procedures described in Section 9 must be used to
      demonstrate the feasibility of meeting the performance targets
      with infrastructure that infinitesimally passes the derated
      tests.
   o  The validation process itself must be documented in such a way
      that other researchers can duplicate the validation experiments.

   Except as noted, all tests below assume no derating.  Tests where
   there is not currently a well established model for the required
   parameters explicitly include derating as a way to indicate
   flexibility in the parameters.

6. Common testing procedures

6.1. Traffic generating techniques

6.1.1. Paced transmission

   Paced (burst) transmissions: send bursts of data on a timer to meet
   a particular target rate and pattern.  In all cases the specified
   data rate can be either the application or the link rate.  Header
   overheads must be included in the calculations as appropriate.

   Paced single packets:  Send individual packets at the specified rate
      or headway.
   Bursts:  Send sender interface rate bursts on a timer.  Specify any
      3 of: average rate, packet size, burst size (number of packets)
      and burst headway (burst start to start).  These bursts are
      typically sent as back-to-back packets at the tester's interface
      rate.
   Slowstart bursts:  Send 4 packet sender interface rate bursts at an
      average data rate equal to twice the effective bottleneck link
      rate (but not more than the sender interface rate).  This
      corresponds to the average rate during a TCP slowstart when
      Appropriate Byte Counting [RFC3465] is present or delayed ACK is
      disabled.  Note that if the effective bottleneck link rate is
      more than half of the sender interface rate, slowstart bursts
      become sender interface rate bursts.
   Repeated Slowstart bursts:  Slowstart bursts are typically part of a
      larger scale pattern of repeated bursts, such as sending
      target_pipe_size packets as slowstart bursts on a target_RTT
      headway (burst start to burst start).  Such a stream has three
      different average rates, depending on the averaging interval.  At
      the finest time scale the average rate is the same as the sender
      interface rate, at a medium scale the average rate is twice the
      effective bottleneck link rate, and at the longest time scales
      the average rate is equal to the target data rate.
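   The burst arithmetic above is straightforward.  The following
   non-normative sketch (hypothetical values throughout) derives the
   fourth pacing parameter from the other three, and the slowstart
   burst headway:

      # Illustrative only: given average rate, packet size and burst
      # size, derive the burst headway (burst start to burst start).
      average_rate = 5e6        # bits/s at the link layer (assumed)
      packet_size = 1500        # bytes, including headers
      burst_size = 8            # packets per burst (assumed)

      burst_headway = burst_size * packet_size * 8 / average_rate
      # 0.0192 s between burst starts for these values

      # Slowstart bursts: 4 packet bursts at twice the effective
      # bottleneck link rate, capped at the sender interface rate.
      effective_bottleneck_rate = 20e6     # bits/s (assumed)
      sender_interface_rate = 1e9          # bits/s (assumed)
      slowstart_rate = min(2 * effective_bottleneck_rate,
                           sender_interface_rate)
      slowstart_headway = 4 * packet_size * 8 / slowstart_rate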
   Note that in conventional measurement theory, exponential
   distributions are often used to eliminate many sorts of
   correlations.  For the procedures above, the correlations are
   created by the network elements and accurately reflect their
   behavior.  At some point in the future, it may be desirable to
   introduce noise sources into the above pacing models, but they are
   not warranted at this time.

6.1.2. Constant window pseudo CBR

   Implement pseudo constant bit rate by running a standard protocol
   such as TCP with a fixed bound on the window size.  The rate is only
   maintained on average over each RTT, and is subject to limitations
   of the transport protocol.

   The bound on the window size is computed from the target_data_rate
   and the actual RTT of the test path.

   If the transport protocol fails to maintain the test rate within
   prescribed limits, the test would typically be considered
   inconclusive or failing, depending on what mechanism caused the
   reduced rate.  See the discussion of test outcomes in Section 6.2.1.

6.1.3. Scanned window pseudo CBR

   Same as the above, except the window is scanned across a range of
   sizes designed to include two key events: the onset of queueing and
   the onset of packet loss or ECN marks.  The window is scanned by
   incrementing it by one packet for every 2*target_pipe_size delivered
   packets.  This mimics the additive increase phase of standard
   congestion avoidance and normally separates the window increases by
   approximately twice the target_RTT.

   There are two versions of this test: one built by applying a window
   clamp to standard congestion control and one built by stiffening a
   non-standard transport protocol.  When standard congestion control
   is in effect, any losses or ECN marks cause the transport to revert
   to a window smaller than the clamp, such that the scanning clamp
   loses control of the window size.  The NPAD pathdiag tool is an
   example of this class of algorithms [Pathdiag].

   Alternatively, a non-standard congestion control algorithm can
   respond to losses by transmitting extra data, such that it maintains
   the specified window size independent of losses or ECN marks.  Such
   a stiffened transport explicitly violates mandatory Internet
   congestion control and is not suitable for in situ testing.  It is
   only appropriate for engineering testing under laboratory
   conditions.  The Windowed Ping tool implemented such a test [WPING].
   This tool has been updated and is under test [mpingSource].

   The test procedures in Section 7.2 describe how to partition the
   scans into regions and how to interpret the results.
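   A minimal sketch of the scanning schedule follows (non-normative; it
   assumes a transport under test that honors a window clamp, and omits
   the instrumentation that records the onset of queueing and of losses
   or ECN marks):

      # Illustrative only: raise the window clamp by one packet per
      # 2*target_pipe_size delivered packets, mimicking one additive
      # increase per two target_RTTs.
      target_pipe_size = 87                # packets (assumed)
      window = target_pipe_size // 2       # assumed starting clamp
      delivered_since_increase = 0

      def on_packet_delivered():
          global window, delivered_since_increase
          delivered_since_increase += 1
          if delivered_since_increase >= 2 * target_pipe_size:
              window += 1                  # one scan step
              delivered_since_increase = 0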
6.1.4. Concurrent or channelized testing

   The procedures described in this document are only directly
   applicable to single stream performance measurement, e.g. one TCP
   connection.  In an ideal world, we would disallow all performance
   claims based on multiple concurrent streams, but this is not
   practical due to at least two different issues.  First, many very
   high rate link technologies are channelized and pin individual flows
   to specific channels to minimize reordering or other problems, and
   second, TCP itself has scaling limits.  Although the former problem
   might be overcome through different design decisions, the latter
   problem is more deeply rooted.

   All standard [RFC5681] and de facto standard congestion control
   algorithms [CUBIC] have scaling limits, in the sense that as a long
   fast network (LFN) with a fixed RTT and MTU gets faster, all
   congestion control algorithms get less accurate and as a consequence
   have difficulty filling the network [SlowScaling].  These properties
   are a consequence of the original Reno AIMD congestion control
   design and the requirement in RFC 5681 that all transport protocols
   have a uniform response to congestion.

   There are a number of reasons to want to specify performance in
   terms of multiple concurrent flows; however this approach is not
   recommended for data rates below several Mb/s, which can be attained
   with run lengths under 10000 packets.  Since run length goes as the
   square of the data rate, at higher rates the run lengths can be
   unfeasibly large, and multiple connections might be the only
   feasible approach.  For an example of this problem see Section 8.3.

   If multiple connections are deemed necessary to meet aggregate
   performance targets, then this MUST be stated both in the design of
   the TDS and in any claims about network performance.  The tests MUST
   be performed concurrently with the specified number of connections.
   For tests that use bursty traffic, the bursts should be synchronized
   across flows.

6.1.5. Intermittent Testing

   Any test which does not depend on queueing (e.g. the CBR tests) or
   experiences periodic zero outstanding data during normal operation
   (e.g. between bursts for the various burst tests) can be formulated
   as an intermittent test, to reduce the perceived impact on other
   traffic.  The approach is to insert periodic pauses in the test at
   any point when there is no expected queue occupancy.

   Intermittent testing can be used for ongoing monitoring for changes
   in subpath quality with minimal disruption to users.  However it is
   not suitable in environments where there are reactive links
   [REACTIVE].
6.1.6. Intermittent Scatter Testing

   Intermittent scatter testing is a technique for non-disruptively
   evaluating the front path from a sender to a subscriber aggregation
   point within an ISP at full load, by intermittently testing across a
   pool of subscriber access links such that each subscriber sees a
   tolerable test traffic load.  The load on the front path should be
   limited to no more than that which would be caused by a single test
   to a subscriber known to be otherwise idle.  In aggregate this test
   mimics a full load test from a content provider to the aggregation
   point.

   Intermittent scatter testing can be used to reduce the measurement
   noise introduced by unknown traffic on customer access links.

6.2. Interpreting the Results

6.2.1. Test outcomes

   To perform an exhaustive test of an end-to-end network path, each
   test of the TDS is applied to each subpath of the end-to-end path.
   If any subpath fails any test, then an application running over the
   end-to-end path can also be expected to fail to attain the target
   performance under some conditions.

   In addition to passing or failing, a test can be deemed to be
   inconclusive for a number of reasons.  Proper instrumentation and
   treatment of inconclusive outcomes is critical to the accuracy and
   robustness of Model Based Metrics.  Tests can be inconclusive if the
   precomputed traffic pattern was not accurately generated, if the
   measurement results were not statistically significant, or for other
   causes such as failing to meet some required preconditions for the
   test.

   For example, consider a test that implements Constant Window Pseudo
   CBR (Section 6.1.2) by adding rate controls and detailed traffic
   instrumentation to TCP (e.g. [RFC4898]).  TCP includes built-in
   control systems which might interfere with the sending data rate.
   If such a test meets the run length specification while failing to
   attain the specified data rate, it must be treated as an
   inconclusive result, because we cannot a priori determine whether
   the reduced data rate was caused by a TCP problem or a network
   problem, or whether the reduced data rate had a material effect on
   the run length measurement itself.

   Note that for load tests such as this example, an observed run
   length that is too small can be considered to have failed the test,
   because it doesn't really matter that the test didn't attain the
   required data rate.

   The really important new properties of MBM, such as vantage
   independence, are a direct consequence of opening the control loops
   in the protocols, such that the test traffic does not depend on
   network conditions or traffic received.  Any mechanism that
   introduces feedback between the traffic measurements and the traffic
   generation is at risk of introducing nonlinearities that spoil these
   properties.  Any exceptional event that indicates that such feedback
   has happened should cause the test to be considered inconclusive.

   One way to view inconclusive tests is that they reflect situations
   where a test outcome is ambiguous between limitations of the network
   and some unknown limitation of the diagnostic test itself, which was
   presumably caused by some uncontrolled feedback from the network.
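   The Constant Window Pseudo CBR example above can be restated as a
   small decision procedure.  The sketch below is non-normative: the
   function and its arguments are our own names, and a real FSTDS would
   define the pass, fail and inconclusive criteria precisely:

      # Illustrative only: classify one load test outcome.
      def classify(run_length, target_run_length,
                   data_rate, target_data_rate):
          if run_length < target_run_length:
              # A too-short run length fails regardless of whether
              # the required data rate was attained.
              return "fail"
          if data_rate < target_data_rate:
              # Run length met but the rate was not: we cannot tell
              # whether TCP or the network limited the rate, nor
              # whether the reduced load inflated the run length.
              return "inconclusive"
          return "pass"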
   Note that procedures that attempt to sweep the target parameter
   space to find the bounds on some parameter (for example to find the
   highest data rate for a subpath) are likely to break the location
   independent properties of Model Based Metrics, because the boundary
   between passing and inconclusive is sensitive to the RTT: TCP's
   ability to compensate for problems scales with the number of round
   trips per second.  Repeating the same procedure from another vantage
   point with a different RTT is likely to get a different result,
   because TCP will get lower performance on the path with the longer
   RTT.

   One of the goals for evolving TDS designs will be to keep sharpening
   the distinction between inconclusive, passing and failing tests.
   The criteria for passing, failing and inconclusive tests MUST be
   explicitly stated for every test in the TDS or FSTDS.

   One of the goals of evolving the testing process, procedures, tools
   and measurement point selection should be to minimize the number of
   inconclusive tests.

   It may be useful to keep raw data delivery statistics for deeper
   study of the behavior of the network path and to measure the tools.
   This can help to drive tool evolution.  Under some conditions it
   might be possible to reevaluate the raw data for satisfying
   alternate performance targets.  However such procedures are likely
   to introduce sampling bias and other implicit feedback which can
   cause false results and exhibit MP vantage sensitivity.

6.2.2. Statistical criteria for measuring run_length

   When evaluating the observed run_length, we need to determine
   appropriate packet stream sizes and acceptable error levels for
   efficient measurement.  In practice, can we compare the empirically
   estimated packet loss and ECN marking probabilities with the targets
   as the sample size grows?  How large a sample is needed to say that
   the measurements of packet transfer indicate a particular run length
   is present?

   The generalized measurement can be described as recursive testing:
   send packets (individually or in patterns) and observe the packet
   delivery performance (loss ratio or other metric, any marking we
   define).

   As each packet is sent and measured, we have an ongoing estimate of
   the performance in terms of the ratio of packet loss or ECN marks to
   total packets (i.e. an empirical probability).  We continue to send
   until conditions support a conclusion or a maximum sending limit has
   been reached.

   We have a target_mark_probability, 1 mark per target_run_length,
   where a "mark" is defined as a lost packet, a packet with an ECN
   mark, or other signal.  This constitutes the null hypothesis:

      H0:  no more than one mark in target_run_length =
           3*(target_pipe_size)^2 packets

   and we can stop sending packets if ongoing measurements support
   accepting H0 with the specified Type I error = alpha (= 0.05 for
   example).

   We also have an alternative hypothesis to evaluate: is performance
   significantly lower than the target_mark_probability?  Based on
   analysis of typical values and practical limits on measurement
   duration, we choose four times the H0 probability:

      H1:  one or more marks in (target_run_length/4) packets

   and we can stop sending packets if measurements support rejecting H0
   with the specified Type II error = beta (= 0.05 for example), thus
   preferring the alternate hypothesis H1.
H0 and H1 constitute the Success and Failure outcomes described elsewhere in the memo; while the ongoing measurements support neither hypothesis, the current status of the measurements is inconclusive.

The problem above is formulated to match the Sequential Probability Ratio Test (SPRT) [StatQC]. Note that as originally framed, the events under consideration were all manufacturing defects. In networking, ECN marks and lost packets are not defects but signals, indicating that the transport protocol should slow down.

The Sequential Probability Ratio Test also starts with a pair of hypotheses specified as above:

    H0: p0 = one defect in target_run_length
    H1: p1 = one defect in target_run_length/4

As packets are sent and measurements collected, the tester evaluates the cumulative defect count against two boundaries representing H0 Acceptance and H0 Rejection (i.e. acceptance of H1):

    Acceptance line: Xa = -h1 + s*n
    Rejection line:  Xr =  h2 + s*n

where n increases linearly for each packet sent and

    h1 = { log((1-alpha)/beta) }/k
    h2 = { log((1-beta)/alpha) }/k
    k  = log{ (p1(1-p0)) / (p0(1-p1)) }
    s  = [ log{ (1-p0)/(1-p1) } ]/k

for p0 and p1 as defined in the null and alternative hypothesis statements above, and alpha and beta as the Type I and Type II errors.

The SPRT specifies simple stopping rules:

o  Xa < defect_count(n) < Xr: continue testing
o  defect_count(n) <= Xa: Accept H0
o  defect_count(n) >= Xr: Accept H1

The calculations above are implemented in the R-tool for Statistical Analysis [Rtool], in the add-on package for Cross-Validation via Sequential Testing (CVST) [CVST].

Using the equations above, we can calculate the minimum number of packets (n) needed to accept H0 when x defects are observed. For example, when x = 0:

    Xa = 0 = -h1 + s*n
    and n = h1 / s
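The boundary computation and stopping rules can be written directly from these equations. The following is a minimal sketch in Python, assuming alpha = beta = 0.05 and the example target_run_length of 1452 packets from Section 8.1; it illustrates the arithmetic, and is not a measurement tool.

    import math

    def sprt_boundaries(target_run_length, alpha=0.05, beta=0.05):
        p0 = 1.0 / target_run_length   # H0: one mark per target_run_length
        p1 = 4.0 / target_run_length   # H1: one mark per target_run_length/4
        k = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
        s = math.log((1 - p0) / (1 - p1)) / k
        h1 = math.log((1 - alpha) / beta) / k
        h2 = math.log((1 - beta) / alpha) / k
        return h1, h2, s

    def sprt_step(defect_count, n, h1, h2, s):
        if defect_count <= -h1 + s * n:
            return "accept H0"         # observed run length meets the target
        if defect_count >= h2 + s * n:
            return "accept H1"         # observed run length fails the target
        return "continue testing"

    h1, h2, s = sprt_boundaries(1452)
    print(sprt_step(0, 500, h1, h2, s))  # -> continue testing
    print(math.ceil(h1 / s))             # minimum n to accept H0 with 0 marks

Under these parameters the sketch reports that about 1423 packets must be delivered without a mark before H0 can be accepted.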
6.2.2.1. Alternate criteria for measuring run_length

An alternate calculation was contributed by Alex Gilgur (Google).

The probability of failure within an interval whose length is target_run_length is given by an exponential distribution with rate = 1/target_run_length (a memoryless process). The implication is that the predicted failure probability differs depending on the total count of packets that have passed through the pipe, the formula being:

    P(t1 < T < t2) = R(t1) - R(t2),

where

    T = number of packets at which a failure will occur with probability P;
    t = number of packets:
        t1 = number of packets (e.g., when failure last occurred)
        t2 = t1 + target_run_length
    R = survival function (probability that no failure has occurred by t):
        R(t1) = exp(-t1/target_run_length)
        R(t2) = exp(-t2/target_run_length)

The algorithm:

    initialize the packet.counter = 0
    initialize the failed.packet.counter = 0
    start the loop
        if packet_response = ACK:
            increment the packet.counter
        else:
            ### The packet failed
            increment the packet.counter
            increment the failed.packet.counter

            P_fail_observed = failed.packet.counter / packet.counter

            upper_bound = packet.counter + target.run.length / 2
            lower_bound = packet.counter - target.run.length / 2

            R1 = exp( -upper_bound / target.run.length )
            R0 = exp( -max(0, lower_bound) / target.run.length )

            P_fail_predicted = R0 - R1
            compare P_fail_observed vs. P_fail_predicted
        end-if
    continue the loop

This algorithm allows accurate comparison of the observed failure probability with the corresponding values predicted from a fixed target_failure_rate, which is equal to 1.0 / target_run_length.

6.2.3. Reordering Tolerance

All tests must be instrumented for packet level reordering [RFC4737]. However, there is no consensus for how much reordering should be acceptable. Over the last two decades the general trend has been to make protocols and applications more tolerant to reordering, in response to the gradual increase in reordering in the network. This increase has been due to the gradual deployment of parallelism in the network, as a consequence of such technologies as multithreaded route lookups and Equal Cost Multipath (ECMP) routing. These techniques to increase network parallelism are critical to enabling overall Internet growth to exceed Moore's Law.

Section 5 of [RFC4737] proposed a metric that may be sufficient to designate isolated reordered packets as effectively lost, because TCP's retransmission response would be the same.

TCP should be able to adapt to reordering as long as the reordering extent is no more than the maximum of one half window or 1 ms, whichever is larger. Note that there is a fundamental tradeoff between tolerance to reordering and how quickly algorithms such as fast retransmit can repair losses. Within this limit on reordering extent, there should be no bound on reordering density.

NB: Traditional TCP implementations were not compatible with this metric; newer implementations still need to be evaluated.

Parameters:
    Reordering displacement: the maximum of one half of target_pipe_size or 1 ms.

6.3. Test Qualifications

This entire section needs to be completely overhauled. @@@@ It might be summarized as "needs to be specified in a FSTDS".

Send pre-load traffic as needed to activate radios with a sleep mode, or other "reactive network" elements (term defined in [draft-morton-ippm-2330-update-01]).

In general failing to accurately generate the test traffic has to be treated as an inconclusive test, since it must be presumed that the error in traffic generation might have affected the test outcome. To the extent that the network itself had an effect on the traffic generation (e.g. in the standing queue tests), allowing too large an error margin in the traffic generation might introduce feedback loops that compromise the vantage independence properties of these tests.

The proper treatment of cross traffic is different for different subpaths. In general when testing infrastructure which is associated with only one subscriber, the test should be treated as inconclusive if that subscriber is active on the network. However, for shared infrastructure managed by an ISP, the question at hand is likely to be whether the ISP has sufficient total capacity. In such cases the presence of cross traffic due to other subscribers is explicitly part of the network conditions and its effects are explicitly part of the test.

These two cases do not cover all subpaths. For example, WiFi, which shares unmanaged channel space with other devices, is unlikely to be suitable for any prescriptive measurement.
Note that canceling tests due to load on subscriber lines may introduce sampling bias for testing other parts of the infrastructure. For this reason tests that are scheduled but not run due to load should be treated as a special case of "inconclusive".

7. Diagnostic Tests

The diagnostic tests below are organized by traffic pattern: basic data rate and run length, standing queues, slowstart bursts, and sender rate bursts. We also introduce some combined tests which are more efficient, at the expense of conflating the signatures of different failures.

7.1. Basic Data Rate and Run Length Tests

We propose several versions of the basic data rate and run length test. All measure the number of packets delivered between losses or ECN marks, using a data stream that is rate controlled at or below the target_data_rate.

The tests below differ in how the data rate is controlled. The data can be paced on a timer, or window controlled at the full target data rate. The first two tests implicitly confirm that the subpath has sufficient raw capacity to carry the target_data_rate. They are recommended for relatively infrequent testing, such as an installation or auditing process. The third, background run length, is a low rate test designed for ongoing monitoring for changes in subpath quality.

All rely on the receiver accumulating packet delivery statistics as described in Section 6.2.2 to score the outcome:

Pass: it is statistically significant that the observed run length is larger than the target_run_length.

Fail: it is statistically significant that the observed run length is smaller than the target_run_length.

A test is considered inconclusive if it failed to meet the data rate as specified below, failed to meet the qualifications defined in Section 6.3, or if neither run length statistical hypothesis was confirmed in the allotted test duration.

7.1.1. Run Length at Paced Full Data Rate

Confirm that the observed run length is at least the target_run_length while relying on a timer to send data at the target_rate, using the procedure described in Section 6.1.1 with a burst size of 1 (single packets).

The test is considered inconclusive if the packet transmission cannot be accurately controlled for any reason.

7.1.2. Run Length at Full Data Windowed Rate

Confirm that the observed run length is at least the target_run_length while sending at an average rate equal to the target_data_rate, by controlling (or clamping) the window size of a conventional transport protocol to a fixed value computed from the properties of the test path, typically test_window=target_data_rate*test_RTT/target_MTU.

Since losses and ECN marks generally cause transport protocols to at least temporarily reduce their data rates, this test is expected to be less precise about controlling its data rate. It should not be considered inconclusive as long as at least some of the round trips reached the full target_data_rate without incurring losses. To pass this test the network MUST deliver target_pipe_size packets in target_RTT time without any losses or ECN marks at least once per two target_pipe_size round trips, in addition to meeting the run length statistical test.
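The test_window calculation is a straightforward bandwidth-delay product. The sketch below illustrates it in Python; the unit conventions (target_data_rate in bits per second, target_MTU in bytes, test_RTT in seconds) and the ceiling rounding are our assumptions, since the formula above leaves units implicit, so results may differ slightly from the draft's tables depending on header accounting.

    import math

    def test_window(target_data_rate, test_RTT, target_MTU):
        # Window, in packets, needed to sustain target_data_rate over
        # a path with round trip time test_RTT.
        return math.ceil(target_data_rate * test_RTT / (target_MTU * 8))

    # The Section 8.1 example: 5 Mb/s over a 50 ms path
    print(test_window(5e6, 0.050, 1500))  # -> 21 under these units
                                          #    (Table 1 reports 22)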
7.1.3. Background Run Length Tests

The background run length test is a low rate version of the target rate test above, designed for ongoing lightweight monitoring for changes in the observed subpath run length without disrupting users. It should be used in conjunction with one of the above full rate tests because it does not confirm that the subpath can support the raw data rate.

Existing loss metrics such as [RFC6673] might be appropriate for measuring background run length.

7.2. Standing Queue tests

These tests confirm that the bottleneck is well behaved across the onset of packet loss, which typically follows after the onset of queueing. Well behaved generally means lossless for transient queues, but once the queue has been sustained for a sufficient period of time (or reaches a sufficient queue depth) there should be a small number of losses to signal to the transport protocol that it should reduce its window. Losses that come too early can prevent the transport from averaging at the target_data_rate. Losses that come too late indicate that the queue might be subject to bufferbloat [Bufferbloat] and inflict excess queuing delays on all flows sharing the bottleneck queue. Excess losses make loss recovery problematic for the transport protocol. Non-linear or erratic RTT fluctuations suggest poor interactions between the channel acquisition systems and the transport self clock. All of the tests in this section use the same basic scanning algorithm but score the link on the basis of how well it avoids each of these problems.

For some technologies the data might not be subject to increasing delays, in which case the data rate will vary with the window size all the way up to the onset of losses or ECN marks. For these technologies, the discussion of queueing does not apply, but it is still required that the onset of losses (or ECN marks) be at an appropriate point and progressive.

Use the procedure in Section 6.1.3 to sweep the window across the onset of queueing and the onset of loss. The tests below all assume that the scan emulates standard additive increase and delayed ACK by incrementing the window by one packet for every 2*target_pipe_size packets delivered. A scan can be divided into three regions: below the onset of queueing, a standing queue, and at or beyond the onset of loss.

Below the onset of queueing the RTT is typically fairly constant, and the data rate varies in proportion to the window size. Once the data rate reaches the link rate, the data rate becomes fairly constant, and the RTT increases in proportion to the window size. The precise transition from one region to the other can be identified by the maximum network power, defined to be the ratio of the data rate over the RTT [POWER].

For technologies that do not have conventional queues, start the scan at a window equal to the test_window, i.e. starting at the target rate instead of the power point.
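Locating the power point from scan data is simple. The following is a minimal sketch; the scan samples are invented for illustration, and a real scan would of course come from the Section 6.1.3 procedure.

    def power_point(scan):
        # scan: list of (window_packets, data_rate_bps, rtt_seconds) samples.
        # Network power is data rate divided by RTT; its maximum marks the
        # transition from the flat-RTT region to the standing queue region.
        return max(scan, key=lambda sample: sample[1] / sample[2])

    scan = [(10, 2.4e6, 0.050), (15, 3.6e6, 0.050),
            (21, 5.0e6, 0.051), (30, 5.0e6, 0.072)]
    print("power point near window =", power_point(scan)[0])  # -> 21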
If there is random background loss (e.g. bit errors), precise determination of the onset of packet loss may require multiple scans. Above the onset of loss, all transport protocols are expected to experience periodic losses. For the stiffened transport case these will be determined by the AQM algorithm in the network or by the details of how the window increase function responds to loss. For the standard transport case the details of periodic losses are typically dominated by the behavior of the transport protocol itself.

7.2.1. Congestion Avoidance

A link passes the congestion avoidance standing queue test if more than target_run_length packets are delivered between the power point (or test_window) and the first loss or ECN mark. If this test is implemented using a standard congestion control algorithm with a clamp, it can be used in situ in the production Internet as a capacity test. For an example of such a test see [NPAD].

7.2.2. Bufferbloat

This test confirms that there is some mechanism to limit buffer occupancy (e.g. one that prevents bufferbloat). Note that this is not strictly a requirement for single stream bulk performance, however if there is no mechanism to limit buffer occupancy then a single stream with sufficient data to deliver is likely to cause the problems described in [RFC2309] and [Bufferbloat]. This may cause only minor symptoms for the dominant flow, but has the potential to make the link unusable for other flows and applications.

Pass if the onset of loss occurs before a standing queue has introduced more delay than twice the target_RTT, or some other well defined limit. Note that there is not yet a model for how much standing queue is acceptable. The factor of two chosen here reflects a rule of thumb. Note that in conjunction with the previous test, this test implies that the first loss should occur at a queueing delay which is between one and two times the target_RTT.

7.2.3. Non excessive loss

This test confirms that the onset of loss is not excessive. Pass if losses are bounded by the fluctuations in the cross traffic, such that transient loads (bursts) do not cause dips in aggregate raw throughput; e.g. pass as long as the losses are no more bursty than would be expected from a simple drop tail queue. Although this test could be made more precise, it is really included here for pedantic completeness.

7.2.4. Duplex Self Interference

This engineering test confirms a bound on the interactions between the forward data path and the ACK return path. Fail if the RTT rises by more than some fixed bound above the expected queueing time computed from the excess window divided by the link data rate.

7.3. Slowstart tests

These tests mimic slowstart: data is sent at twice the effective bottleneck rate to exercise the queue at the dominant bottleneck.

They are deemed inconclusive if the elapsed time to send the data burst is not less than half of the time to receive the ACKs (i.e. sending data too fast is OK, but sending it slower than twice the actual bottleneck rate as indicated by the ACKs is deemed inconclusive). Space the bursts such that the average data rate is equal to the target_data_rate.

7.3.1. Full Window slowstart test

This is a capacity test to confirm that slowstart is not likely to exit prematurely. Send slowstart bursts that are target_pipe_size total packets; a sketch of the burst schedule follows.
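The following is a minimal sketch of the burst schedule, assuming the effective bottleneck rate is known and our unit conventions (bits per second, bytes, seconds): each burst is sent at twice the bottleneck rate, and bursts are spaced so the average rate equals the target_data_rate.

    def slowstart_schedule(target_pipe_size, target_MTU,
                           bottleneck_rate, target_data_rate):
        burst_bits = target_pipe_size * target_MTU * 8
        send_time = burst_bits / (2 * bottleneck_rate)  # burst duration
        interval = burst_bits / target_data_rate        # burst spacing
        return send_time, interval

    # Section 8.1 example: 22 packet bursts at 10 Mb/s, averaging 5 Mb/s
    print(slowstart_schedule(22, 1500, 5e6, 5e6))  # (0.0264 s, 0.0528 s)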
Accumulate packet delivery statistics as described in Section 6.2.2 to score the outcome. Pass if it is statistically significant that the observed run length is larger than the target_run_length. Fail if it is statistically significant that the observed run length is smaller than the target_run_length.

Note that these are the same parameters as the Sender Full Window burst test, except the burst rate is the slowstart rate rather than the sender interface rate.

7.3.2. Slowstart AQM test

Do a continuous slowstart (send data continuously at slowstart_rate) until the first loss; then stop, allow the network to drain, and repeat, gathering statistics on the last packet delivered before the loss, the loss pattern, the maximum observed RTT and the window size. Justify the results. There is not currently sufficient theory to justify requiring any particular result, however design decisions that affect the outcome of this test also affect how the network balances between long and short flows (the "mice and elephants" problem).

This is an engineering test: it would be best performed on a quiescent network or testbed, since cross traffic has the potential to change the results.

7.4. Sender Rate Burst tests

These tests determine how well the network can deliver bursts sent at the sender's interface rate. Note that this test most heavily exercises the front path, and is likely to include infrastructure that may be out of scope for a subscriber ISP.

Also, there are several details that are not precisely defined. For starters there is not a standard server interface rate. 1 Gb/s and 10 Gb/s are very common today, but higher rates will become cost effective and can be expected to be dominant some time in the future.

Current standards permit TCP to send full window bursts following an application pause. Congestion Window Validation [RFC2861] is not required, but even if it were, it does not take effect until an application pause is longer than an RTO. Since this is standard behavior, it is desirable that the network be able to deliver such bursts, otherwise application pauses will cause unwarranted losses.

It is also understood in the application and serving community that interface rate bursts have a cost to the network that has to be balanced against other costs in the servers themselves. For example TCP Segmentation Offload [TSO] reduces server CPU in exchange for larger network bursts, which increase the stress on network buffer memory.

There is not yet theory to unify these costs or to provide a framework for trying to optimize global efficiency. We do not yet have a model for how much the network should tolerate server rate bursts. Some bursts must be tolerated by the network, but it is probably unreasonable to expect the network to be able to efficiently deliver all data as a series of bursts.

For this reason, this is the only test for which we explicitly encourage derating. A TDS should include a table of pairs of derating parameters: what burst size to use as a fraction of the target_pipe_size, and how much each burst size is permitted to reduce the run length, relative to the target_run_length; a hypothetical example follows.
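The following is a hypothetical example of such a derating table; the specific pairs are invented for illustration only, and an actual TDS must choose and justify its own values.

    # Pairs of (burst size as a fraction of target_pipe_size,
    #           required fraction of target_run_length).
    # All values below are invented for illustration only.
    derating_table = [
        (0.25, 1.0),   # quarter-window bursts must meet the full run length
        (0.5,  0.5),   # half-window bursts may halve the required run length
        (1.0,  0.25),  # full-window bursts may quarter it
    ]

    target_pipe_size, target_run_length = 22, 1452  # Section 8.1 example
    for burst_frac, run_frac in derating_table:
        print(round(burst_frac * target_pipe_size), "packet bursts:",
              "required run length", round(run_frac * target_run_length))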
7.5. Combined Tests

These tests are more efficient from a deployment/operational perspective, but if they fail the cause may not be possible to diagnose.

7.5.1. Sustained burst test

Send target_pipe_size*derate sender interface rate bursts every target_RTT*derate, for derate between 0 and 1. Verify that the observed run length meets the target_run_length. Key observations:

o  This test is subpath RTT invariant, as long as the tester can generate the required pattern.
o  The subpath under test is expected to go idle for some fraction of the time: (subpath_data_rate-target_rate)/subpath_data_rate. Failing to do so suggests a problem with the procedure and an inconclusive test result.
o  This test is more strenuous than the slowstart tests: they are not needed if the link passes this test with derate=1.
o  A link that passes this test is likely to be able to sustain higher rates (close to subpath_data_rate) for paths with RTTs smaller than the target_RTT. Offsetting this performance underestimation is part of the rationale behind permitting derating in general.
o  This test can be implemented with standard instrumented TCP [RFC4898], using a specialized measurement application at one end and a minimal service at the other end [RFC 863, RFC 864]. It may require tweaks to the TCP implementation [MBMSource].
o  This test is efficient to implement, since it does not require per-packet timers, and can make use of TSO in modern NIC hardware.
o  This test is not totally sufficient: the standing window engineering tests are also needed to be sure that the link is well behaved at and beyond the onset of congestion.
o  This one test can be proven to be the one capacity test to supplant them all.

7.5.2. Live Streaming Media

Model Based Metrics can be implemented as a side effect of serving any non-throughput maximizing traffic*, such as streaming media, with some additional controls and instrumentation in the servers. The essential requirement is that the traffic be constrained such that even with arbitrary application pauses, bursts and data rate fluctuations, the traffic stays within the envelope defined by the individual tests described above, for a specific TDS.

If the serving_data_rate is less than or equal to the target_data_rate and the serving_RTT (the RTT between the sender and client) is less than the target_RTT, this constraint is most easily implemented by clamping the transport window size to:

    serving_window_clamp=target_data_rate*serving_RTT/
        (target_MTU-header_overhead)

The serving_window_clamp will limit both the serving data rate and the burst sizes to be no larger than specified by the procedures in Section 7.1.2 and Section 7.4 or Section 7.5.1. Since the serving RTT is smaller than the target_RTT, the worst case bursts that might be generated under these conditions will be smaller than called for by Section 7.4, and the sender rate burst sizes are implicitly derated by at least the serving_window_clamp divided by the target_pipe_size. (The traffic might be smoother than specified by the sender interface rate bursts test.)

Note that if the application tolerates fluctuations in its actual data rate (say by use of a playout buffer) it is important that the target_data_rate be above the actual average rate needed by the application so it can recover after transient pauses caused by congestion or the application itself.
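A minimal sketch of the clamp calculation above, assuming our unit conventions (bits per second, bytes, seconds) and an illustrative header_overhead of 52 bytes (the actual value depends on the header stack in use):

    import math

    def serving_window_clamp(target_data_rate, serving_RTT,
                             target_MTU, header_overhead=52):
        payload_bits = (target_MTU - header_overhead) * 8
        return math.ceil(target_data_rate * serving_RTT / payload_bits)

    # e.g. serving the Section 8.1 content from a replica 20 ms away
    print(serving_window_clamp(5e6, 0.020, 1500))  # -> 9 packet clamp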
Alternatively the sender data rate and bursts might be explicitly controlled by a host shaper or pacing at the sender. This would provide better control and work for serving_RTTs that are larger than the target_RTT, but it is substantially more complicated to implement. With this technique, any traffic might be used for measurement.

* Note that this technique might be applied to any content, if users are willing to tolerate a reduced data rate to inhibit TCP equilibrium behavior.

8. Examples

In this section we present example TDSes for a few performance specifications.

Tentatively: 5 Mb/s*50 ms, 1 Mb/s*50 ms, 250 kb/s*100 ms

8.1. Near serving HD streaming video

Today the best quality HD video requires slightly less than 5 Mb/s [HDvideo]. Since it is desirable to serve such content locally, we assume that the content will be within 50 ms, which is enough to cover continental Europe or either US coast from a single site.

5 Mb/s over a 50 ms path

    +----------------------+-------+---------+
    | End to End Parameter | Value | units   |
    +----------------------+-------+---------+
    | target_rate          | 5     | Mb/s    |
    | target_RTT           | 50    | ms      |
    | target_MTU           | 1500  | bytes   |
    | target_pipe_size     | 22    | packets |
    | target_run_length    | 1452  | packets |
    +----------------------+-------+---------+

    Table 1

This example uses the most conservative TCP model and no derating.

8.2. Far serving SD streaming video

Standard quality video typically fits in 1 Mb/s [SDvideo]. This can reasonably be delivered via longer paths with larger RTTs. We assume 100 ms.

1 Mb/s over a 100 ms path

    +----------------------+-------+---------+
    | End to End Parameter | Value | units   |
    +----------------------+-------+---------+
    | target_rate          | 1     | Mb/s    |
    | target_RTT           | 100   | ms      |
    | target_MTU           | 1500  | bytes   |
    | target_pipe_size     | 9     | packets |
    | target_run_length    | 243   | packets |
    +----------------------+-------+---------+

    Table 2

This example uses the most conservative TCP model and no derating.

8.3. Bulk delivery of remote scientific data

This example corresponds to 100 Mb/s bulk scientific data over a moderately long RTT. Note that the target_run_length is infeasible for most networks.

100 Mb/s over a 200 ms path

    +----------------------+---------+---------+
    | End to End Parameter | Value   | units   |
    +----------------------+---------+---------+
    | target_rate          | 100     | Mb/s    |
    | target_RTT           | 200     | ms      |
    | target_MTU           | 1500    | bytes   |
    | target_pipe_size     | 1741    | packets |
    | target_run_length    | 9093243 | packets |
    +----------------------+---------+---------+

    Table 3
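These tables follow directly from the target parameters. The sketch below reproduces the derivation under the reference model (target_run_length = 3*(target_pipe_size)^2, with target_pipe_size from the bandwidth-delay product); the unit conventions and ceiling rounding are our assumptions, so the computed pipe sizes can differ by a few packets from the tables above, which use the draft's own accounting.

    import math

    def tds_parameters(target_rate, target_RTT, target_MTU=1500):
        pipe = math.ceil(target_rate * target_RTT / (target_MTU * 8))
        return pipe, 3 * pipe ** 2  # (target_pipe_size, target_run_length)

    for rate, rtt in [(5e6, 0.050), (1e6, 0.100), (100e6, 0.200)]:
        print(rate / 1e6, "Mb/s over", rtt * 1000, "ms:",
              tds_parameters(rate, rtt))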
9. Validation

Since some aspects of the models are likely to be too conservative, Section 5.2 and Section 5.3 permit alternate protocol models and test parameter derating. In exchange for this latitude in the modelling process, we require demonstrations that such a TDS can robustly detect links that will prevent authentic applications using state-of-the-art protocol implementations from meeting the specified performance targets. This correctness criterion is potentially difficult to prove, because it implicitly requires validating a TDS against all possible links and subpaths.

We suggest two strategies, both of which should be applied: first, publish a fully open description of the TDS, including what assumptions were used and how it was derived, such that the research community can evaluate these decisions, test them and comment on their applicability; and second, demonstrate that applications running over an infinitesimally passing testbed do meet the performance targets.

An infinitesimally passing testbed resembles an epsilon-delta proof in calculus. Construct a test network such that all of the individual tests of the TDS only pass by small (infinitesimal) margins, and demonstrate that a variety of authentic applications running over real TCP implementations (or other protocols as appropriate) meet the end-to-end target parameters over such a network. The workloads should include multiple types of streaming media and transaction oriented short flows (e.g. synthetic web traffic).

For example, using the HD streaming video TDS described in Section 8.1, the bottleneck data rate should be 5 Mb/s, the per packet random background loss probability should be 1/1453 (for a run length of 1452 packets), the bottleneck queue should be 22 packets and the front path should have just enough buffering to withstand 22 packet line rate bursts. We want every one of the TDS tests to fail if we slightly increase the relevant test parameter, so for example sending a 23 packet slowstart burst should cause excess (possibly deterministic) packet drops at the dominant queue at the bottleneck. On this infinitesimally passing network it should be possible for a real application using a stock TCP implementation in the vendor's default configuration to attain 5 Mb/s over a 50 ms path.

The most difficult part of setting up such a testbed is arranging for it to infinitesimally pass the individual tests. We suggest two approaches: constraining the network devices not to use all available resources (limiting available buffer space or data rate); and preloading subpaths with cross traffic. Note that it is important that a single environment be constructed which infinitesimally passes all tests at the same time, otherwise there is a chance that TCP can exploit extra latitude in some parameters (such as data rate) to partially compensate for constraints in other parameters (such as queue space, or vice versa).

To the extent that a TDS is used to inform public dialog it should be fully publicly documented, including the details of the tests, what assumptions were used and how it was derived. All of the details of the validation experiment should also be public with sufficient detail for the experiments to be replicated by other researchers. All components should be either open source or fully described proprietary implementations that are available to the research community.

This work is inspired by open tools running on an open platform, using open techniques to collect open data. See Measurement Lab [http://www.measurementlab.net/].

10. Acknowledgements

Ganga Maguluri suggested the statistical test for measuring loss probability in the target run length. Alex Gilgur helped with the statistics and contributed an alternate model. Meredith Whittaker improved the clarity of the communications.
11. Informative References

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, "Framework for IP Performance Metrics", RFC 2330, May 1998.

[RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion Window Validation", RFC 2861, June 2000.

[RFC3148] Mathis, M. and M. Allman, "A Framework for Defining Empirical Bulk Transfer Capacity Metrics", RFC 3148, July 2001.

[RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte Counting (ABC)", RFC 3465, February 2003.

[RFC4898] Mathis, M., Heffner, J., and R. Raghunarayan, "TCP Extended Statistics MIB", RFC 4898, May 2007.

[RFC4737] Morton, A., Ciavattone, L., Ramachandran, G., Shalunov, S., and J. Perser, "Packet Reordering Metrics", RFC 4737, November 2006.

[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.

[RFC5835] Morton, A. and S. Van den Berghe, "Framework for Metric Composition", RFC 5835, April 2010.

[RFC6049] Morton, A. and E. Stephan, "Spatial Composition of Metrics", RFC 6049, January 2011.

[RFC6673] Morton, A., "Round-Trip Packet Loss Metrics", RFC 6673, August 2012.

[I-D.morton-ippm-lmap-path] Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and A. Morton, "A Reference Path and Measurement Points for LMAP", draft-morton-ippm-lmap-path-00 (work in progress), January 2013.

[MSMO97] Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", Computer Communications Review, volume 27, number 3, July 1997.

[WPING] Mathis, M., "Windowed Ping: An IP Level Performance Diagnostic", INET 94, June 1994.

[mpingSource] Fan, X., Mathis, M., and D. Hamon, "Git Repository for mping: An IP Level Performance Diagnostic", Sept 2013.

[MBMSource] Hamon, D., "Git Repository for Model Based Metrics", Sept 2013.

[Pathdiag] Mathis, M., Heffner, J., O'Neil, P., and P. Siemsen, "Pathdiag: Automated TCP Diagnosis", Passive and Active Measurement, June 2008.

[StatQC] Montgomery, D., "Introduction to Statistical Quality Control - 2nd ed.", ISBN 0-471-51988-X, 1990.

[Rtool] R Development Core Team, "R: A language and environment for statistical computing", R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, URL http://www.R-project.org/, 2011.

[CVST] Krueger, T. and M. Braun, "R package: Fast Cross-Validation via Sequential Testing", version 0.1, November 2012.

[LMCUBIC] Ledesma Goyzueta, R. and Y. Chen, "A Deterministic Loss Model Based Analysis of CUBIC", IEEE International Conference on Computing, Networking and Communications (ICNC), E-ISBN: 978-1-4673-5286-4, January 2013.
Appendix A. Model Derivations

The reference target_run_length described in Section 5.2 is based on very conservative assumptions: that all window in excess of the target_pipe_size contributes to a standing queue that raises the RTT, and that classic Reno congestion control with delayed ACKs is in effect. In this section we provide two alternative calculations using different assumptions.

It may seem out of place to allow such latitude in a measurement standard, but the section provides offsetting requirements.

The estimates provided by these models make the most sense if network performance is viewed logarithmically. In the operational Internet, data rates span more than 8 orders of magnitude, RTT spans more than 3 orders of magnitude, and loss probability spans at least 8 orders of magnitude. When viewed logarithmically (as in decibels), these correspond to 80 dB of dynamic range. On an 80 dB scale, a 3 dB error is less than 4% of the scale, even though it might represent a factor of 2 in the untransformed parameter.

This document gives a lot of latitude for calculating target_run_length, however people designing a TDS should consider the effect of their choices on the ongoing tussle about the relevance of "TCP friendliness" as an appropriate model for Internet capacity allocation. Choosing a target_run_length that is substantially smaller than the reference target_run_length specified in Section 5.2 strengthens the argument that it may be appropriate to abandon "TCP friendliness" as the Internet fairness model. This gives developers incentive and permission to develop even more aggressive applications and protocols, for example by increasing the number of connections that they open concurrently.

A.1. Queueless Reno

In Section 5.2 it is assumed that the target rate is the same as the link rate, and any excess window causes a standing queue at the bottleneck. This might be representative of a non-shared access link. An alternative situation would be a heavily aggregated subpath where individual flows do not significantly contribute to the queueing delay, and losses are determined by monitoring the average data rate, for example by the use of a virtual queue as in [AFD]. In such a scheme the RTT is constant and TCP's AIMD congestion control causes the data rate to fluctuate in a sawtooth. If the traffic is being controlled in a manner that is consistent with the metrics here, the goal would be to make the actual average rate equal to the target_data_rate.

We can derive a model for Reno TCP and delayed ACK under the above set of assumptions: for some value of Wmin, the window will sweep from Wmin to 2*Wmin in 2*Wmin round trip times. Unlike the queueing case where Wmin = target_pipe_size, we want the average of Wmin and 2*Wmin to be the target_pipe_size, so the average rate is the target rate. Thus we want Wmin = (2/3)*target_pipe_size.

Between losses each sawtooth delivers (1/2)(Wmin+2*Wmin)(2*Wmin) packets in 2*Wmin round trip times.

Substituting these together we get:

    target_run_length = (4/3)(target_pipe_size^2)

Note that this is 44% of the reference run length. This makes sense because under the assumptions in Section 5.2 the AIMD sawtooth caused a queue at the bottleneck, which raised the effective RTT by 50%.
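The arithmetic can be checked numerically. A minimal sketch, using the Section 8.1 target_pipe_size of 22 packets:

    def queueless_reno_run_length(target_pipe_size):
        # Each sawtooth sweeps from Wmin to 2*Wmin, delivering
        # (1/2)(Wmin + 2*Wmin)(2*Wmin) = 3*Wmin^2 packets, with
        # Wmin = (2/3)*target_pipe_size to keep the mean window on target.
        wmin = (2.0 / 3.0) * target_pipe_size
        return 3 * wmin ** 2            # = (4/3)*target_pipe_size^2

    tps = 22
    reference = 3 * tps ** 2            # reference model run length
    print(queueless_reno_run_length(tps) / reference)  # -> 0.444...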
A.2. CUBIC

CUBIC has three operating regions. The model for the expected value of the window size derived in [LMCUBIC] assumes operation in the "concave" region only, which is a non-TCP friendly region for long-lived flows. The authors make the following assumptions: packet loss probability, p, is independent and periodic, losses occur one at a time, and they are true losses due to tail drop or corruption. This definition of p aligns very well with our definition of target_run_length and the requirement for progressive loss (AQM).

Although the CUBIC window increase depends on continuous time, the authors express the time to reach the maximum window size in terms of the RTT and a parameter for the multiplicative rate decrease on observing loss, beta (whose default value is 0.2 in CUBIC). The expected value of the window size, E[W], is also dependent on C, a parameter of CUBIC that determines its window-growth aggressiveness (values from 0.01 to 4).

    E[W] = ( C*(RTT/p)^3 * ((4-beta)/beta) )^(1/4)

and, further assuming Poisson arrivals, the mean throughput, x, is

    x = E[W]/RTT

We note that under these conditions (deterministic single losses), the value of E[W] is always greater than 0.8 of the maximum window size ~= reference_run_length. (as far as I can tell)

Appendix B. Complex Queueing

For many network technologies simple queueing models do not apply: the network schedules, thins or otherwise alters the timing of ACKs and data, generally to raise the efficiency of the channel allocation process when confronted with relatively widely spaced small ACKs. These efficiency strategies are ubiquitous for half duplex, wireless and broadcast media.

Altering the ACK stream generally has two consequences: it raises the effective bottleneck data rate, causing slowstart to burst at higher rates (possibly as high as the sender's interface rate), and it effectively raises the RTT by the average time that the ACKs were delayed. The first effect can be partially mitigated by reclocking ACKs once they are beyond the bottleneck on the return path to the sender, however this further raises the effective RTT.

The most extreme example of this sort of behavior would be a half duplex channel that is not released as long as the end point currently holding the channel has pending traffic. Such environments cause self clocked protocols under full load to revert to extremely inefficient stop and wait behavior, where they send an entire window of data as a single burst, followed by the entire window of ACKs on the return path.

If a particular end-to-end path contains a link or device that alters the ACK stream, then the entire path from the sender up to the bottleneck must be tested at the burst parameters implied by the ACK scheduling algorithm. The most important parameter is the Effective Bottleneck Data Rate, which is the average rate at which the ACKs advance snd.una. Note that thinning the ACKs (relying on the cumulative nature of seg.ack to permit discarding some ACKs) implies an effectively infinite bottleneck data rate. It is important to note that due to the self clock, ill conceived channel allocation mechanisms can increase the stress on upstream links in a long path.
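A minimal sketch of estimating the effective bottleneck data rate from an ACK trace; the trace format (time in seconds, cumulative snd.una in bytes) is our own invention for illustration.

    def effective_bottleneck_rate(ack_trace):
        # Average rate, in bits per second, at which ACKs advance snd.una.
        (t0, una0), (t1, una1) = ack_trace[0], ack_trace[-1]
        return (una1 - una0) * 8 / (t1 - t0)

    # Thinned ACKs: much of the window acknowledged by one late ACK makes
    # the apparent bottleneck rate far higher than the true link rate.
    trace = [(0.000, 0), (0.010, 30000), (0.012, 90000)]
    print(effective_bottleneck_rate(trace))  # -> 60 Mb/s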
Holding data or ACKs for channel allocation or other reasons (such as error correction) always raises the effective RTT relative to the minimum delay for the path. Therefore it may be necessary to replace the target_RTT in the calculation in Section 5.2 with an effective_RTT, which includes the target_RTT reflecting the fixed part of the path plus a term to account for the extra delays introduced by these mechanisms.

Appendix C. Version Control

Formatted: Fri Feb 14 14:07:33 PST 2014

Authors' Addresses

Matt Mathis
Google, Inc
1600 Amphitheater Parkway
Mountain View, California 94043
USA

Email: mattmathis@google.com

Al Morton
AT&T Labs
200 Laurel Avenue South
Middletown, NJ 07748
USA

Phone: +1 732 420 1571
Email: acmorton@att.com
URI: http://home.comcast.net/~acmacm/