2 IP Performance Working Group M. Mathis 3 Internet-Draft Google, Inc 4 Intended status: Experimental A. Morton 5 Expires: January 7, 2016 AT&T Labs 6 July 6, 2015 8 Model Based Metrics for Bulk Transport Capacity 9 draft-ietf-ippm-model-based-metrics-06.txt 11 Abstract 13 We introduce a new class of Model Based Metrics designed to assess whether 14 a complete Internet path can be expected to meet a predefined Bulk 15 Transport Performance target by applying a suite of IP diagnostic 16 tests to successive subpaths. The subpath-at-a-time tests can be 17 robustly applied to key infrastructure, such as interconnects or even 18 individual devices, to accurately detect if any part of the 19 infrastructure will prevent any path traversing it from meeting the 20 specified Target Transport Performance. 22 The IP diagnostic tests consist of precomputed traffic patterns and 23 statistical criteria for evaluating packet delivery. The traffic 24 patterns are precomputed to mimic TCP or another transport protocol 25 over a long path but are constructed in such a way that they are 26 independent of the actual details of the subpath under test, end 27 systems or applications. Likewise, the success criteria depend on 28 the packet delivery statistics of the subpath, as evaluated against a 29 protocol model applied to the Target Transport Performance. The 30 success criteria also do not depend on the details of the subpath, 31 end systems or application. This makes the measurements open loop, 32 eliminating most of the difficulties encountered by traditional bulk 33 transport metrics. 35 Model Based Metrics exhibit several important new properties not 36 present in other Bulk Capacity Metrics, including the ability to 37 reason about concatenated or overlapping subpaths.
The results are 38 vantage independent which is critical for supporting independent 39 validation of tests results from multiple Measurement Points. 41 This document does not define IP diagnostic tests directly, but 42 provides a framework for designing suites of IP diagnostics tests 43 that are tailored to confirming that infrastructure can meet a 44 predetermined Target Transport Performance. 46 Status of this Memo 48 This Internet-Draft is submitted in full conformance with the 49 provisions of BCP 78 and BCP 79. 51 Internet-Drafts are working documents of the Internet Engineering 52 Task Force (IETF). Note that other groups may also distribute 53 working documents as Internet-Drafts. The list of current Internet- 54 Drafts is at http://datatracker.ietf.org/drafts/current/. 56 Internet-Drafts are draft documents valid for a maximum of six months 57 and may be updated, replaced, or obsoleted by other documents at any 58 time. It is inappropriate to use Internet-Drafts as reference 59 material or to cite them other than as "work in progress." 61 This Internet-Draft will expire on January 7, 2016. 63 Copyright Notice 65 Copyright (c) 2015 IETF Trust and the persons identified as the 66 document authors. All rights reserved. 68 This document is subject to BCP 78 and the IETF Trust's Legal 69 Provisions Relating to IETF Documents 70 (http://trustee.ietf.org/license-info) in effect on the date of 71 publication of this document. Please review these documents 72 carefully, as they describe your rights and restrictions with respect 73 to this document. Code Components extracted from this document must 74 include Simplified BSD License text as described in Section 4.e of 75 the Trust Legal Provisions and are provided without warranty as 76 described in the Simplified BSD License. 78 Table of Contents 80 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 81 1.1. Version Control . . . . . . . . . . . . . . . . . . . . . 6 82 2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 83 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 9 84 4. Background . . . . . . . . . . . . . . . . . . . . . . . . . . 14 85 4.1. TCP properties . . . . . . . . . . . . . . . . . . . . . . 16 86 4.2. Diagnostic Approach . . . . . . . . . . . . . . . . . . . 17 87 4.3. New requirements relative to RFC 2330 . . . . . . . . . . 18 88 5. Common Models and Parameters . . . . . . . . . . . . . . . . . 18 89 5.1. Target End-to-end parameters . . . . . . . . . . . . . . . 18 90 5.2. Common Model Calculations . . . . . . . . . . . . . . . . 19 91 5.3. Parameter Derating . . . . . . . . . . . . . . . . . . . . 20 92 5.4. Test Preconditions . . . . . . . . . . . . . . . . . . . . 21 93 6. Traffic generating techniques . . . . . . . . . . . . . . . . 21 94 6.1. Paced transmission . . . . . . . . . . . . . . . . . . . . 21 95 6.2. Constant window pseudo CBR . . . . . . . . . . . . . . . . 23 96 6.3. Scanned window pseudo CBR . . . . . . . . . . . . . . . . 24 97 6.4. Concurrent or channelized testing . . . . . . . . . . . . 24 98 7. Interpreting the Results . . . . . . . . . . . . . . . . . . . 25 99 7.1. Test outcomes . . . . . . . . . . . . . . . . . . . . . . 25 100 7.2. Statistical criteria for estimating run_length . . . . . . 27 101 7.3. Reordering Tolerance . . . . . . . . . . . . . . . . . . . 29 102 8. Diagnostic Tests . . . . . . . . . . . . . . . . . . . . . . . 29 103 8.1. Basic Data Rate and Packet Delivery Tests . . . . . . . . 30 104 8.1.1. 
Delivery Statistics at Paced Full Data Rate . . . . . 30 105 8.1.2. Delivery Statistics at Full Data Windowed Rate . . . . 31 106 8.1.3. Background Packet Delivery Statistics Tests . . . . . 31 107 8.2. Standing Queue Tests . . . . . . . . . . . . . . . . . . . 31 108 8.2.1. Congestion Avoidance . . . . . . . . . . . . . . . . . 33 109 8.2.2. Bufferbloat . . . . . . . . . . . . . . . . . . . . . 33 110 8.2.3. Non excessive loss . . . . . . . . . . . . . . . . . . 33 111 8.2.4. Duplex Self Interference . . . . . . . . . . . . . . . 34 112 8.3. Slowstart tests . . . . . . . . . . . . . . . . . . . . . 34 113 8.3.1. Full Window slowstart test . . . . . . . . . . . . . . 35 114 8.3.2. Slowstart AQM test . . . . . . . . . . . . . . . . . . 35 115 8.4. Sender Rate Burst tests . . . . . . . . . . . . . . . . . 35 116 8.5. Combined and Implicit Tests . . . . . . . . . . . . . . . 36 117 8.5.1. Sustained Bursts Test . . . . . . . . . . . . . . . . 36 118 8.5.2. Streaming Media . . . . . . . . . . . . . . . . . . . 37 119 9. An Example . . . . . . . . . . . . . . . . . . . . . . . . . . 38 120 10. Validation . . . . . . . . . . . . . . . . . . . . . . . . . . 40 121 11. Security Considerations . . . . . . . . . . . . . . . . . . . 41 122 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 41 123 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 42 124 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42 125 14.1. Normative References . . . . . . . . . . . . . . . . . . . 42 126 14.2. Informative References . . . . . . . . . . . . . . . . . . 42 127 Appendix A. Model Derivations . . . . . . . . . . . . . . . . . . 44 128 A.1. Queueless Reno . . . . . . . . . . . . . . . . . . . . . . 45 129 Appendix B. Complex Queueing . . . . . . . . . . . . . . . . . . 46 130 Appendix C. Version Control . . . . . . . . . . . . . . . . . . . 47 131 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 47 133 1. Introduction 135 Model Based Metrics (MBM) rely on mathematical models to specify a 136 targeted suite of IP diagnostic tests, designed to assess whether 137 common transport protocols can be expected to meet a predetermined 138 performance target over an Internet path. Each test in the Targeted 139 Diagnostic Suite (TDS) measures some aspect of IP packet transfer 140 that is required to meet the Target Transport Performance. For 141 example a TDS may have separate diagnostic tests to verify that there 142 is: sufficient IP capacity (rate); sufficient queue space to deliver 143 typical transport bursts; and that the background packet loss ratio 144 is small enough not to interfere with congestion control. Unlike 145 other metrics which yield measures of network properties, Model Based 146 Metrics nominally yield pass/fail evaluations of the ability of 147 standard transport protocols to meet a specific performance objective 148 over some network path. 150 This note describes the modeling framework to derive the IP 151 diagnostic test parameters from the Target Transport Performance 152 specified for TCP Bulk Transport Capacity. Model Based Metrics is an 153 alternative to the approach described in [RFC3148]. In the future, 154 other Model Based Metrics may cover other applications and 155 transports, such as VoIP over RTP. In most cases the IP diagnostic 156 tests can be implemented by combining existing IPPM metrics with 157 additional controls for generating precomputed traffic patterns and 158 statistical criteria for evaluating packet delivery. 
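The following non-normative sketch (in Python, with informal field names chosen purely for illustration) shows the three parameters that together constitute a Bulk Transport Capacity Target Transport Performance; nothing in this document depends on this particular representation.

      # Illustrative only: the Target Transport Performance for Bulk
      # Transport Capacity is fully specified by three parameters.
      from dataclasses import dataclass

      @dataclass
      class TargetTransportPerformance:
          target_data_rate: float   # application payload rate, bytes per second
          target_rtt: float         # baseline (minimum) round trip time, seconds
          target_mtu: int           # maximum transmission unit, bytes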
160 This approach, mapping Target Transport Performance to a targeted 161 diagnostic suite (TDS) of IP tests, solves some intrinsic problems 162 with using TCP or other throughput maximizing protocols for 163 measurement. In particular all throughput maximizing protocols (and 164 TCP congestion control in particular) cause some level of congestion 165 in order to detect when they have filled the network. This self 166 inflicted congestion obscures the network properties of interest and 167 introduces non-linear equilibrium behaviors that make any resulting 168 measurements useless as metrics because they have no predictive value 169 for conditions or paths other than that of the measurement itself. 170 These problems are discussed at length in Section 4. 172 A targeted suite of IP diagnostic tests does not have such 173 difficulties. They can be constructed such that they make strong 174 statistical statements about path properties that are independent of 175 the measurement details, such as vantage and choice of measurement 176 points. Model Based Metrics bridge the gap between empirical IP 177 measurements and expected TCP performance. 179 1.1. Version Control 181 RFC Editor: Please remove this entire subsection prior to 182 publication. 184 Please send comments about this draft to ippm@ietf.org. See 185 http://goo.gl/02tkD for more information including: interim drafts, 186 an up to date todo list and information on contributing. 188 Formatted: Mon Jul 6 13:49:30 PDT 2015 190 Changes since -05 draft: 191 o Wordsmithing on sections overhauled in -05 draft. 192 o Reorganized the document: 193 * Relocated subsection "Preconditions". 194 * Relocated subsection "New Requirements relative to RFC 2330". 195 o Addressed nits and not so nits by Ruediger Geib. (Thanks!) 196 o Substantially tightened the entire definitions section. 197 o Many terminology changes, to better conform to other docs : 198 * IP rate and IP capacity (following RFC 5136) replaces various 199 forms of link data rate. 200 * subpath replaces link. 201 * target_window_size replaces target_pipe_size. 202 * Implied Bottleneck IP Rate replaces effective bottleneck link 203 rate. 204 * Packet delivery statistics replaces delivery statistics. 206 Changes since -04 draft: 207 o The introduction was heavily overhauled: split into a separate 208 introduction and overview. 209 o The new shorter introduction: 210 * Is a problem statement; 211 * This document provides a framework; 212 * That it replaces TCP measurement by IP tests; 213 * That the results are pass/fail. 214 o Added a diagram of the framework to the overview 215 o and introduces all of the elements of the framework. 216 o Renumbered sections, reducing the depth of some section numbers. 217 o Updated definitions to better agree with other documents: 218 * Reordered section 2 219 * Bulk [data] performance -> Bulk Transport Capacity, everywhere 220 including the title. 221 * loss rate and loss probability -> packet loss ratio 222 * end-to-end path -> complete path 223 * [end-to-end][target] performance -> Target Transport 224 Performance 226 * load test -> capacity test 228 2. Overview 230 This document describes a modeling framework for deriving a Targeted 231 Diagnostic Suite from a predetermined Target Transport Performance. 232 It is not a complete specification, and relies on other standards 233 documents to define important details such as packet type-p 234 selection, sampling techniques, vantage selection, etc. 
We imagine 235 Fully Specified Targeted Diagnostic Suites (FSTDS), that define all 236 of these details. We use Targeted Diagnostic Suite (TDS) to refer to 237 the subset of such a specification that is in scope for this 238 document. This terminology is defined in Section 3. 240 Section 4 describes some key aspects of TCP behavior and what it 241 implies about the requirements for IP packet delivery. Most of the 242 IP diagnostic tests needed to confirm that the path meets these 243 properties can be built on existing IPPM metrics, with the addition 244 of statistical criteria for evaluating packet delivery and in a few 245 cases, new mechanisms to implement precomputed traffic patterns. 246 (One group of tests, the standing queue tests described in 247 Section 8.2, don't correspond to existing IPPM metrics, but suitable 248 metrics can be patterned after existing tools.) 250 Figure 1 shows the MBM modeling and measurement framework. The 251 Target Transport Performance, at the top of the figure, is determined 252 by the needs of the user or application, outside the scope of this 253 document. For Bulk Transport Capacity, the main performance 254 parameter of interest is the target data rate. However, since TCP's 255 ability to compensate for less than ideal network conditions is 256 fundamentally affected by the Round Trip Time (RTT) and the Maximum 257 Transmission Unit (MTU) of the complete path, these parameters must 258 also be specified in advance based on knowledge about the intended 259 application setting. They may reflect a specific application over 260 real path through the Internet or an idealized application and 261 hypothetical path representing a typical user community. Section 5 262 describes the common parameters and models derived from the Target 263 Transport Performance. 265 Target Transport Performance 266 (target data rate, target RTT and target MTU) 267 | 268 ________V_________ 269 | mathematical | 270 | models | 271 | | 272 ------------------ 273 Traffic parameters | | Statistical criteria 274 | | 275 _______V____________V____Targeted_______ 276 | | * * * | Diagnostic Suite | 277 _____|_______V____________V________________ | 278 __|____________V____________V______________ | | 279 | IP Diagnostic test | | | 280 | | | | | | 281 | _____________V__ __V____________ | | | 282 | | Traffic | | Delivery | | | | 283 | | Generation | | Evaluation | | | | 284 | | | | | | | | 285 | -------v-------- ------^-------- | | | 286 | | v Test Traffic via ^ | | |-- 287 | | -->======================>-- | | | 288 | | subpath under test | |- 289 ----V----------------------------------V--- | 290 | | | | | | 291 V V V V V V 292 fail/inconclusive pass/fail/inconclusive 294 Overall Modeling Framework 296 Figure 1 298 The mathematical models are used to design traffic patterns that 299 mimic TCP or other bulk transport protocol operating at the target 300 data rate, MTU and RTT over a full range of conditions, including 301 flows that are bursty at multiple time scales. The traffic patterns 302 are generated based on the three target parameters of complete path 303 and independent of the properties of individual subpaths using the 304 techniques described in Section 6. As much as possible the 305 measurement traffic is generated deterministically (precomputed) to 306 minimize the extent to which test methodology, measurement points, 307 measurement vantage or path partitioning affect the details of the 308 measurement traffic. 
310 Section 7 describes packet delivery statistics and methods to test them 311 against the bounds provided by the mathematical models. Since these 312 statistics are typically the composition of subpaths of the complete 313 path [RFC6049], in situ testing requires that the end-to-end 314 statistical bounds be apportioned as separate bounds for each 315 subpath. Subpaths that are expected to be bottlenecks may be 316 expected to contribute a larger fraction of the total packet loss. 317 In compensation, non-bottlenecked subpaths have to be constrained to 318 contribute less packet loss. The criterion for passing each test of a 319 TDS is an apportioned share of the total bound determined by the 320 mathematical model from the Target Transport Performance. 322 Section 8 describes the suite of individual tests needed to verify 323 all of the required IP delivery properties. A subpath passes if and only 324 if all of the individual IP diagnostic tests pass. Any subpath that 325 fails any test indicates that some users are likely to fail to attain 326 their Target Transport Performance under some conditions. In 327 addition to passing or failing, a test can be deemed to be 328 inconclusive for a number of reasons, including: the precomputed 329 traffic pattern was not accurately generated; the measurement results 330 were not statistically significant; or other causes such as failing to 331 meet some required test preconditions. If all tests pass except 332 some that are inconclusive, then the entire suite is deemed to be 333 inconclusive. 335 In Section 9 we present an example TDS that might be representative 336 of HD video, and illustrate how Model Based Metrics can be used to 337 address difficult measurement situations, such as confirming that 338 intercarrier exchanges have sufficient performance and capacity to 339 deliver HD video between ISPs. 341 Since there is some uncertainty in the modeling process, Section 10 342 describes a validation procedure to diagnose and minimize false 343 positive and false negative results. 345 3. Terminology 347 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 348 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 349 document are to be interpreted as described in [RFC2119]. 351 Note that terms containing underscores (rather than spaces) appear in 352 equations in the modeling sections. In some cases both forms are 353 used for aesthetic reasons; they do not have different meanings. 355 General Terminology: 357 Target: A general term for any parameter specified by or derived 358 from the user's application or transport performance requirements. 359 Target Transport Performance: Application or transport performance 360 goals for the complete path. For Bulk Transport Capacity defined 361 in this note the Target Transport Performance includes the target 362 data rate, target RTT and target MTU as described below. 363 Target Data Rate: The specified application data rate required for 364 an application's proper operation. Conventional BTC metrics are 365 focused on the target data rate, however these metrics have little 366 or no predictive value because they do not consider the effects of 367 the other two parameters of the Target Transport Performance, the 368 RTT and MTU of the complete path. 369 Target RTT (Round Trip Time): The specified baseline (minimum) RTT 370 of the longest complete path over which the application expects to 371 be able to meet the target performance.
The ability of TCP and other transport 372 protocols to compensate for path problems is generally 373 proportional to the number of round trips per second. The Target 374 RTT determines both key parameters of the traffic patterns (e.g. 375 burst sizes) and the thresholds on acceptable traffic statistics. 376 The Target RTT must be specified considering appropriate packet 377 sizes: MTU sized packets on the forward path, ACK sized packets 378 (typically header_overhead) on the return path. Note that the target 379 RTT is specified, not measured; it determines the applicability of 380 MBM evaluations to paths that are different from the measured 381 path. 382 Target MTU (Maximum Transmission Unit): The specified maximum MTU 383 supported by the complete path over which the application 384 expects to meet the target performance. Assume 1500 Byte MTU 385 unless otherwise specified. If some subpath forces a smaller MTU, 386 then it becomes the target MTU for the complete path, and all 387 model calculations and subpath tests must use the same smaller 388 MTU. 389 Targeted Diagnostic Suite (TDS): A set of IP diagnostic tests 390 designed to determine if an otherwise ideal complete path 391 containing the subpath under test can sustain flows at a specific 392 target_data_rate using target_MTU sized packets when the RTT of 393 the complete path is target_RTT. 394 Fully Specified Targeted Diagnostic Suite: A TDS together with 395 additional specifications such as "type-p", etc., which are out of 396 scope for this document but need to be drawn from other standards 397 documents. 398 Bulk Transport Capacity: Bulk Transport Capacity Metrics evaluate an 399 Internet path's ability to carry bulk data, such as large files, 400 streaming (non-real time) video, and under some conditions, web 401 images and other content. Prior efforts to define BTC metrics 402 have been based on [RFC3148], which predates our understanding of 403 TCP and the requirements described in Section 4. 405 IP diagnostic tests: Measurements or diagnostic tests to determine 406 if packet delivery statistics meet some precomputed target. 407 traffic patterns: The temporal patterns or statistics of traffic 408 generated by applications over transport protocols such as TCP. 409 There are several mechanisms that cause bursts at various time 410 scales as described in Section 4.1. Our goal here is to mimic the 411 range of common patterns (burst sizes and rates, etc), without 412 tying our applicability to specific applications, implementations 413 or technologies, which are sure to become stale. 414 packet delivery statistics: Raw, detailed or summary statistics 415 about packet delivery properties of the IP layer including packet 416 losses, ECN marks, reordering, or any other properties that may be 417 germane to transport performance. 418 packet loss ratio: As defined in [I-D.ietf-ippm-2680-bis]. 419 apportioned: To divide and allocate, for example budgeting packet 420 loss across multiple subpaths such that they will accumulate to 421 less than a specified end-to-end loss ratio. 422 open loop: A control theory term used to describe a class of 423 techniques where systems that naturally exhibit circular 424 dependencies can be analyzed by suppressing some of the 425 dependencies, such that the resulting dependency graph is acyclic. 427 Terminology about paths, etc. See [RFC2330] and [RFC7398]. 429 [data] sender: Host sending data and receiving ACKs. 430 [data] receiver: Host receiving data and sending ACKs.
431 complete path: The end-to-end path from the data sender to the data 432 receiver. 433 subpath: A portion of the complete path. Note that there is no 434 requirement that subpaths be non-overlapping. A subpath can be as 435 small as a single device, link or interface. 436 Measurement Point: Measurement points as described in [RFC7398]. 437 test path: A path between two measurement points that includes a 438 subpath of the complete path under test, and if the measurement 439 points are off path, may include "test leads" between the 440 measurement points and the subpath. 441 [Dominant] Bottleneck: The Bottleneck that generally dominates 442 packet delivery statistics for the entire path. It typically 443 determines a flow's self clock timing, packet loss and ECN marking 444 rate. See Section 4.1. 445 front path: The subpath from the data sender to the dominant 446 bottleneck. 447 back path: The subpath from the dominant bottleneck to the receiver. 448 return path: The path taken by the ACKs from the data receiver to 449 the data sender. 451 cross traffic: Other, potentially interfering, traffic competing for 452 network resources (bandwidth and/or queue capacity). 454 Properties determined by the complete path and application. They are 455 described in more detail in Section 5.1. 457 Application Data Rate: General term for the data rate as seen by the 458 application above the transport layer in bytes per second. This 459 is the payload data rate, and explicitly excludes transport and 460 lower level headers (TCP/IP or other protocols), retransmissions 461 and other overhead that is not part of the total quantity of data 462 delivered to the application. 463 IP Rate: The actual number of IP-layer bytes delivered through a 464 subpath, per unit time, including TCP and IP headers, retransmits 465 and other TCP/IP overhead. Follows from IP-type-P Link Usage 466 [RFC5136]. 467 IP Capacity: The maximum number of IP-layer bytes that can be 468 transmitted through a subpath, per unit time, including TCP and IP 469 headers, retransmits and other TCP/IP overhead. Follows from IP- 470 type-P Link Capacity [RFC5136]. 471 Bottleneck IP Rate: This is the IP rate of the data flowing through 472 the dominant bottleneck in the forward path. TCP and other 473 protocols normally derive their self clocks from the timing of 474 this data. See Section 4.1 and Appendix B for more details. 475 Implied Bottleneck IP Rate: This is the bottleneck IP rate implied 476 by the returning ACKs from the receiver. It is determined by 477 looking at how much application data the ACK stream reports 478 delivered per unit time. If the return path is thinning, batching 479 or otherwise altering the ACK timing, TCP will derive its clock from 480 the implied bottleneck IP rate of the ACK stream, which in the 481 short term might be much different from the actual bottleneck IP 482 rate. In the case of thinned or batched ACKs the front path must have 483 sufficient buffering to smooth any data bursts to the IP capacity 484 of the bottleneck. If the return path is not altering the ACK 485 stream, then the Implied Bottleneck IP Rate will be the same as 486 the Bottleneck IP Rate. See Section 4.1 and Appendix B for more 487 details. 488 [sender | interface] rate: The IP rate which corresponds to the IP 489 Capacity of the data sender's interface. Due to issues of sender 490 efficiency and technologies such as TCP offload engines, nearly 491 all modern servers deliver data in bursts at full interface link 492 rate.
Today 1 or 10 Gb/s are typical. 493 Header_overhead: The IP and TCP header sizes, which are the portion 494 of each MTU not available for carrying application payload. 495 Without loss of generality this is assumed to be the size for 496 returning acknowledgements (ACKs). For TCP, the Maximum Segment 497 Size (MSS) is the Target MTU minus the header_overhead. 499 Basic parameters common to models and subpath tests are defined here 500 and are described in more detail in Section 5.2. Note that these are 501 mixed between application transport performance (which excludes headers) 502 and IP performance (which includes TCP headers and retransmissions as 503 part of the payload). 505 Window: The total quantity of data plus the data represented by ACKs 506 circulating in the network is referred to as the window. See 507 Section 4.1. Sometimes used with other qualifiers (congestion 508 window, cwnd or receiver window) to indicate which mechanism is 509 controlling the window. 510 pipe size: A general term for the number of packets needed in flight 511 (the window size) to exactly fill some network path or subpath. 512 It corresponds to the window size which maximizes network power, 513 the observed data rate divided by the observed RTT. Often used 514 with additional qualifiers to specify which path, or under what 515 conditions, etc. 516 target_window_size: The average number of packets in flight (the 517 window size) needed to meet the target data rate, for the 518 specified target RTT and MTU. It implies the scale of the bursts 519 that the network might experience. 520 run length: A general term for the observed, measured, or specified 521 number of packets that are (expected to be) delivered between 522 losses or ECN marks. Nominally one over the sum of the loss and 523 ECN marking probabilities, if they are independently and 524 identically distributed. 525 target_run_length: The target_run_length is an estimate of the 526 minimum number of non-congestion marked packets needed between 527 losses or ECN marks necessary to attain the target_data_rate over 528 a path with the specified target_RTT and target_MTU, as computed 529 by a mathematical model of TCP congestion control. A reference 530 calculation is shown in Section 5.2 and alternatives in Appendix A. 531 reference target_run_length: target_run_length computed precisely by 532 the method in Section 5.2. This is likely to be slightly more 533 conservative than required by modern TCP implementations. 535 Ancillary parameters used for some tests: 537 derating: Under some conditions the standard models are too 538 conservative. The modeling framework permits some latitude in 539 relaxing or "derating" some test parameters as described in 540 Section 5.3 in exchange for more stringent TDS validation 541 procedures, described in Section 10. 542 subpath_IP_capacity: The IP capacity of a specific subpath. 544 test path: A subpath of a complete path under test. 545 test_path_RTT: The RTT observed between two measurement points using 546 packet sizes that are consistent with the transport protocol. 547 Generally MTU sized packets on the forward path, header_overhead 548 sized packets on the return path. 549 test_path_pipe: The pipe size of a test path. Nominally the test 550 path RTT times the test path IP_capacity. 551 test_window: The window necessary to meet the target_rate over a 552 test path. Typically test_window=target_data_rate*test_path_RTT/ 553 (target_MTU - header_overhead).
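As a non-normative illustration of how these parameters relate, the following Python sketch computes target_window_size, the reference target_run_length (using the reference model given in Section 5.2) and test_window for one hypothetical set of targets; the numeric values are examples only, not recommendations.

      from math import ceil

      target_data_rate = 1.25e6   # bytes per second (10 Mb/s), hypothetical
      target_rtt = 0.100          # seconds, hypothetical
      target_mtu = 1500           # bytes
      header_overhead = 64        # bytes, hypothetical TCP/IP header allowance
      test_path_rtt = 0.040       # seconds, RTT of one hypothetical test path

      mss = target_mtu - header_overhead             # payload bytes per packet
      target_window_size = ceil(target_data_rate * target_rtt / mss)
      target_run_length = 3 * target_window_size ** 2   # reference model, Section 5.2
      test_window = ceil(target_data_rate * test_path_rtt / mss)

For these example values target_window_size is 88 packets, the reference target_run_length is 23232 packets (a packet loss ratio of roughly 4.3e-5 or better), and test_window is 35 packets.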
555 The tests described in this note can be grouped according to their 556 applicability. 558 capacity tests: determine if a network subpath has sufficient 559 capacity to deliver the Target Transport Performance. As long as 560 the test traffic is within the proper envelope for the Target 561 Transport Performance, the average packet losses or ECN marks must 562 be below the threshold computed by the model. As such, capacity 563 tests reflect parameters that can transition from passing to 564 failing as a consequence of cross traffic, additional presented 565 load or the actions of other network users. By definition, 566 capacity tests also consume significant network resources (data 567 capacity and/or queue buffer space), and the test schedules must 568 be balanced by their cost. 569 Monitoring tests: are designed to capture the most important aspects 570 of a capacity test, but without presenting excessive ongoing load 571 themselves. As such they may miss some details of the network's 572 performance, but can serve as a useful reduced-cost proxy for a 573 capacity test, for example to support ongoing monitoring. 574 Engineering tests: evaluate how network algorithms (such as AQM and 575 channel allocation) interact with TCP-style self clocked protocols 576 and adaptive congestion control based on packet loss and ECN 577 marks. These tests are likely to have complicated interactions 578 with cross traffic and under some conditions can be inversely 579 sensitive to load. For example, a test to verify that an AQM 580 algorithm causes ECN marks or packet drops early enough to limit 581 queue occupancy may experience a false pass result in the presence 582 of cross traffic. It is important that engineering tests be 583 performed under a wide range of conditions, including both in situ 584 and bench testing, and over a wide variety of load conditions. 585 Ongoing monitoring is less likely to be useful for engineering 586 tests, although sparse in situ testing might be appropriate. 588 4. Background 590 At the time the IPPM WG was chartered, sound Bulk Transport Capacity 591 measurement was known to be well beyond our capabilities. Even at 592 the time [RFC3148] was written, we knew that we didn't fully 593 understand the problem. Now, in hindsight, we understand why BTC is 594 such a hard problem: 595 o TCP is a control system with circular dependencies - everything 596 affects performance, including components that are explicitly not 597 part of the test. 598 o Congestion control is an equilibrium process, such that transport 599 protocols change the network statistics (raise the packet loss 600 ratio and/or RTT) to conform to their behavior. By design TCP 601 congestion control keeps raising the data rate until the network 602 gives some indication that it is full by dropping or ECN marking 603 packets. If TCP successfully fills the network, the packet loss 604 and ECN marks are mostly determined by TCP and how hard TCP drives 605 the network and not by the network itself. 606 o TCP's ability to compensate for network flaws is directly 607 proportional to the number of roundtrips per second (i.e. 608 inversely proportional to the RTT). As a consequence a flawed 609 subpath may pass a short RTT local test even though it fails when 610 the path is extended by a perfect network to some larger RTT. 611 o TCP has an extreme form of the Heisenberg problem - measurement 612 and cross traffic interact in unknown and ill defined ways.
The 613 situation is actually worse than the traditional physics problem 614 where you can at least estimate bounds on the relative momentum of 615 the measurement and measured particles. For network measurement 616 you cannot in general determine the relative "mass" of either the 617 measurement traffic or the cross traffic, so you cannot gauge the 618 relative magnitude of the uncertainty that might be introduced by 619 any interaction. 621 These properties are a consequence of the equilibrium behavior 622 intrinsic to how all throughput maximizing protocols interact with 623 the Internet. These protocols rely on control systems based on 624 estimated network parameters to regulate the quantity of data traffic 625 sent into the network. The data traffic in turn alters the network and 626 the properties observed by the estimators, such that there are 627 circular dependencies between every component and every property. 628 Since some of these properties are nonlinear, the entire system is 629 nonlinear, and any change anywhere causes difficult to predict 630 changes in every parameter. 632 Model Based Metrics overcome these problems by forcing the 633 measurement system to be open loop: the packet delivery statistics 634 (akin to the network estimators) do not affect the traffic or traffic 635 patterns (bursts), which are computed on the basis of the Target 636 Transport Performance. In order for a network to pass, the resulting 637 packet delivery statistics and corresponding network estimators have 638 to be such that they would not cause the control systems to slow the 639 traffic below the target data rate. 641 4.1. TCP properties 643 TCP and SCTP are self clocked protocols. The dominant steady state 644 behavior is to have an approximately fixed quantity of data and 645 acknowledgements (ACKs) circulating in the network. The receiver 646 reports arriving data by returning ACKs to the data sender; the data 647 sender typically responds by sending exactly the same quantity of 648 data back into the network. The total quantity of data plus the data 649 represented by ACKs circulating in the network is referred to as the 650 window. The mandatory congestion control algorithms incrementally 651 adjust the window by sending slightly more or less data in response 652 to each ACK. The fundamentally important property of this system is 653 that it is self clocked: the data transmissions are a reflection of 654 the ACKs that were delivered by the network, and the ACKs are a 655 reflection of the data arriving from the network. 657 A number of phenomena can cause bursts of data, even in idealized 658 networks that can be modeled as simple queueing systems. 660 During slowstart the data rate is doubled on each RTT by sending 661 twice as much data as was delivered to the receiver on the prior RTT. 662 For slowstart to be able to fill such a network, the network must be 663 able to tolerate slowstart bursts up to the full pipe size inflated 664 by the anticipated window reduction on the first loss or ECN mark. 665 For example, with classic Reno congestion control, an optimal 666 slowstart has to end with a burst that is twice the bottleneck rate 667 for exactly one RTT in duration. This burst causes a queue which is 668 exactly equal to the pipe size (i.e. the window is exactly twice the 669 pipe size), so when the window is halved in response to the first 670 loss, the new window will be exactly the pipe size.
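A small worked example (values hypothetical, relationships exactly as described above) may help fix this arithmetic:

      # Worked example of the slowstart overshoot described above.
      pipe_size = 100                                 # packets that just fill the path
      end_of_slowstart_window = 2 * pipe_size         # optimal slowstart overshoots by 2x
      queue_at_bottleneck = end_of_slowstart_window - pipe_size   # equals pipe_size
      window_after_first_loss = end_of_slowstart_window // 2      # Reno halving
      assert window_after_first_loss == pipe_size     # path exactly full, queue drains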
672 Note that if the bottleneck data rate is significantly slower than 673 the rest of the path, the slowstart bursts will not cause significant 674 queues anywhere else along the path; they primarily exercise the 675 queue at the dominant bottleneck. 677 Other sources of bursts include application pauses and channel 678 allocation mechanisms. Appendix B describes the treatment of channel 679 allocation systems. If the application pauses (stops reading or 680 writing data) for some fraction of one RTT, state-of-the-art TCP 681 catches up to the earlier window size by sending a burst of data at 682 the full sender interface rate. To fill such a network with a 683 realistic application, the network has to be able to tolerate 684 interface rate bursts from the data sender large enough to cover 685 application pauses. 687 Although the interface rate bursts are typically smaller than the 688 last burst of a slowstart, they are at a higher data rate so they 689 potentially exercise queues at arbitrary points along the front path 690 from the data sender up to and including the queue at the dominant 691 bottleneck. There is no model for how frequent or what sizes of 692 sender rate bursts should be tolerated. 694 To verify that a path can meet a Target Transport Performance, it is 695 necessary to independently confirm that the path can tolerate bursts 696 in the dimensions that can be caused by these mechanisms. Three 697 cases are likely to be sufficient: 699 o Slowstart bursts sufficient to get connections started properly. 700 o Frequent sender interface rate bursts that are small enough where 701 they can be assumed not to significantly affect packet delivery 702 statistics. (Implicitly derated by limiting the burst size). 703 o Infrequent sender interface rate full target_window_size bursts 704 that might affect the packet delivery statistics. 705 (Target_run_length may be derated). 707 4.2. Diagnostic Approach 709 A complete path is expected to be able to sustain a Bulk TCP flow of 710 a given data rate, MTU and RTT when all of the following conditions 711 are met: 712 1. The IP capacity is above the target data rate by sufficient 713 margin to cover all TCP/IP overheads. See Section 8.1 or any 714 number of data rate tests outside of MBM. 715 2. The observed packet delivery statistics are better than required 716 by a suitable TCP performance model (e.g. fewer losses or ECN 717 marks). See Section 8.1 or any number of low rate packet loss 718 tests outside of MBM. 719 3. There is sufficient buffering at the dominant bottleneck to 720 absorb a slowstart rate burst large enough to get the flow out of 721 slowstart at a suitable window size. See Section 8.3. 722 4. There is sufficient buffering in the front path to absorb and 723 smooth sender interface rate bursts at all scales that are likely 724 to be generated by the application, any channel arbitration in 725 the ACK path or any other mechanisms. See Section 8.4. 726 5. When there is a slowly rising standing queue at the bottleneck 727 the onset of packet loss has to be at an appropriate point (time 728 or queue depth) and progressive. See Section 8.2. 729 6. When there is a standing queue at a bottleneck for a shared media 730 subpath (e.g. half duplex), there are suitable bounds on how the 731 data and ACKs interact, for example due to the channel 732 arbitration mechanism. See Section 8.2.4. 734 Note that conditions 1 through 4 require capacity tests for 735 validation, and thus may need to be monitored on an ongoing basis. 
736 Conditions 5 and 6 require engineering tests best performed in 737 controlled environments such as a bench test. They won't generally 738 fail due to load, but may fail in the field due to configuration 739 errors, etc. and should be spot checked. 741 We are developing a tool that can perform many of the tests described 742 here [MBMSource]. 744 4.3. New requirements relative to RFC 2330 746 Model Based Metrics are designed to fulfill some additional 747 requirements that were not recognized at the time RFC 2330 was 748 written [RFC2330]. These missing requirements may have significantly 749 contributed to policy difficulties in the IP measurement space. Some 750 additional requirements are: 751 o IP metrics must be actionable by the ISP - they have to be 752 interpreted in terms of behaviors or properties at the IP or lower 753 layers that an ISP can test, repair and verify. 754 o Metrics should be spatially composable, such that measures of 755 concatenated paths should be predictable from subpaths. 756 o Metrics must be vantage point invariant over a significant range 757 of measurement point choices, including off path measurement 758 points. The only requirements on MP selection should be that the 759 RTT between the MPs is below some reasonable bound, and that the 760 effects of the "test leads" connecting MPs to the subpath under 761 test can be calibrated out of the measurements. The latter 762 might be accomplished if the test leads are effectively ideal 763 or their properties can be deduced from the measurements between 764 the MPs. While many of the tests require that the test leads have at 765 least as much IP capacity as the subpath under test, some do not, 766 for example the Background Packet Delivery Tests described in 767 Section 8.1.3. 768 o Metric measurements must be repeatable by multiple parties with no 769 specialized access to MPs or diagnostic infrastructure. It must 770 be possible for different parties to make the same measurement and 771 observe the same results. In particular it is specifically 772 important that both a consumer (or their delegate) and ISP be able 773 to perform the same measurement and get the same result. Note 774 that vantage independence is key to meeting this requirement. 776 5. Common Models and Parameters 778 5.1. Target End-to-end parameters 780 The target end-to-end parameters are the target data rate, target RTT 781 and target MTU as defined in Section 3. These parameters are 782 determined by the needs of the application or the ultimate end user 783 and the complete Internet path over which the application is expected 784 to operate. The target parameters are in units that make sense to 785 upper layers: payload bytes delivered to the application, above TCP. 786 They exclude overheads associated with TCP and IP headers, 787 retransmits and other protocols (e.g. DNS). 789 Other end-to-end parameters defined in Section 3 include the 790 implied bottleneck IP rate, the sender interface data rate and 791 the TCP and IP header sizes. 793 The target_data_rate must be smaller than all subpath IP capacities 794 by enough headroom to carry the transport protocol overhead, 795 explicitly including retransmissions and an allowance for 796 fluctuations in TCP's actual data rate. Specifying a 797 target_data_rate with insufficient headroom is likely to result in 798 brittle measurements having little predictive value.
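The headroom requirement can be sketched as a simple check; in the non-normative Python fragment below, the margin factor and the header_overhead default are hypothetical placeholders rather than recommended values.

      def has_headroom(target_data_rate, subpath_ip_capacity,
                       target_mtu=1500, header_overhead=64, margin=1.1):
          """Does the subpath leave room above the target for TCP/IP headers,
          retransmissions and fluctuations in TCP's actual data rate?
          Rates are in bytes per second; margin is a hypothetical allowance."""
          # IP rate implied by carrying the application payload rate in full MTUs.
          implied_ip_rate = (target_data_rate * target_mtu
                             / (target_mtu - header_overhead))
          return implied_ip_rate * margin <= subpath_ip_capacity

      # Example: a 1.25e6 B/s (10 Mb/s) payload target over a 12.5e6 B/s subpath.
      print(has_headroom(1.25e6, 12.5e6))   # True: ample headroom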
800 Note that the target parameters can be specified for a hypothetical 801 path, for example to construct TDS designed for bench testing in the 802 absence of a real application; or for a live in situ test of 803 production infrastructure. 805 The number of concurrent connections is explicitly not a parameter to 806 this model. If a subpath requires multiple connections in order to 807 meet the specified performance, that must be stated explicitly and 808 the procedure described in Section 6.4 applies. 810 5.2. Common Model Calculations 812 The Target Transport Performance is used to derive the 813 target_window_size and the reference target_run_length. 815 The target_window_size, is the average window size in packets needed 816 to meet the target_rate, for the specified target_RTT and target_MTU. 817 It is given by: 819 target_window_size = ceiling( target_rate * target_RTT / ( target_MTU 820 - header_overhead ) ) 822 Target_run_length is an estimate of the minimum required number of 823 unmarked packets that must be delivered between losses or ECN marks, 824 as computed by a mathematical model of TCP congestion control. The 825 derivation here follows [MSMO97], and by design is quite 826 conservative. 828 Reference target_run_length is derived as follows: assume the 829 subpath_IP_capacity is infinitesimally larger than the 830 target_data_rate plus the required header_overhead. Then 831 target_window_size also predicts the onset of queueing. A larger 832 window will cause a standing queue at the bottleneck. 834 Assume the transport protocol is using standard Reno style Additive 835 Increase, Multiplicative Decrease (AIMD) congestion control [RFC5681] 836 (but not Appropriate Byte Counting [RFC3465]) and the receiver is 837 using standard delayed ACKs. Reno increases the window by one packet 838 every pipe_size worth of ACKs. With delayed ACKs this takes 2 Round 839 Trip Times per increase. To exactly fill the pipe, losses must be no 840 closer than when the peak of the AIMD sawtooth reached exactly twice 841 the target_window_size otherwise the multiplicative window reduction 842 triggered by the loss would cause the network to be underfilled. 843 Following [MSMO97] the number of packets between losses must be the 844 area under the AIMD sawtooth. They must be no more frequent than 845 every 1 in ((3/2)*target_window_size)*(2*target_window_size) packets, 846 which simplifies to: 848 target_run_length = 3*(target_window_size^2) 850 Note that this calculation is very conservative and is based on a 851 number of assumptions that may not apply. Appendix A discusses these 852 assumptions and provides some alternative models. If a different 853 model is used, a fully specified TDS or FSTDS MUST document the 854 actual method for computing target_run_length and ratio between 855 alternate target_run_length and the reference target_run_length 856 calculated above, along with a discussion of the rationale for the 857 underlying assumptions. 859 These two parameters, target_window_size and target_run_length, 860 directly imply most of the individual parameters for the tests in 861 Section 8. 863 5.3. Parameter Derating 865 Since some aspects of the models are very conservative, the MBM 866 framework permits some latitude in derating test parameters. 
Rather 867 than trying to formalize more complicated models, we permit some test 868 parameters to be relaxed as long as they meet some additional 869 procedural constraints: 870 o The TDS or FSTDS MUST document and justify the actual method used 871 to compute the derated metric parameters. 872 o The validation procedures described in Section 10 must be used to 873 demonstrate the feasibility of meeting the Target Transport 874 Performance with infrastructure that infinitesimally passes the 875 derated tests. 877 o The validation process for a FSTDS itself must be documented in 878 such a way that other researchers can duplicate the validation 879 experiments. 881 Except as noted, all tests below assume no derating. Tests where 882 there is not currently a well established model for the required 883 parameters explicitly include derating as a way to indicate 884 flexibility in the parameters. 886 5.4. Test Preconditions 888 Many tests have preconditions which are required to assure their 889 validity. For example, the presence or absence of cross traffic 890 on specific subpaths, or appropriate preloading to put reactive 891 network elements into the proper states [RFC7312]. If preconditions 892 are not properly satisfied for some reason, the tests should be 893 considered to be inconclusive. In general it is useful to preserve 894 diagnostic information about why the preconditions were not met, and 895 any test data that was collected even if it is not useful for the 896 intended test. Such diagnostic information and partial test data may 897 be useful for improving the test in the future. 899 It is important to preserve the record that a test was scheduled, 900 because otherwise precondition enforcement mechanisms can introduce 901 sampling bias. For example, canceling tests due to cross traffic on 902 subscriber access links might introduce sampling bias in tests of the 903 rest of the network by reducing the number of tests during peak 904 network load. 906 Test preconditions and failure actions MUST be specified in a FSTDS. 908 6. Traffic generating techniques 910 6.1. Paced transmission 912 Paced (burst) transmissions: send bursts of data on a timer to meet a 913 particular target rate and pattern. In all cases the specified data 914 rate can be either the application rate or the IP rate. Header overheads 915 must be included in the calculations as appropriate. 916 Packet Headway: Time interval between packets, specified from the 917 start of one to the start of the next. e.g. If packets are sent 918 with a 1 ms headway, there will be exactly 1000 packets per 919 second. 921 Burst Headway: Time interval between bursts, specified from the 922 start of the first packet of one burst to the start of the first 923 packet of the next burst. e.g. If 4 packet bursts are sent with a 924 1 ms burst headway, there will be exactly 4000 packets per second. 925 Paced single packets: Send individual packets at the specified rate 926 or packet headway. 927 Paced Bursts: Send sender interface rate bursts on a timer. Specify 928 any 3 of: average rate, packet size, burst size (number of 929 packets) and burst headway (burst start to start). The packet 930 headway within a burst is typically assumed to be the minimum 931 supported by the tester's interface, i.e. bursts are normally 932 sent as back-to-back packets. The packet headway within the 933 bursts can also be explicitly specified.
934 Slowstart burst: Mimic TCP slowstart by sending 4 packet paced 935 bursts at an average data rate equal to twice the implied 936 bottleneck IP rate (but not more than the sender interface rate). 937 If the implied bottleneck IP rate is more than half of the sender 938 interface rate, slowstart rate bursts become sender interface rate 939 bursts. See the discussion and figure below. 940 Repeated Slowstart bursts: Repeat Slowstart bursts once per 941 target_RTT. For TCP each burst would be twice as large as the 942 prior burst, and the sequence would end at the first ECN mark or 943 lost packet. For measurement, all slowstart bursts would be the 944 same size (nominally target_window_size but other sizes might be 945 specified). See the discussion and figure below. 947 The slowstart bursts mimic TCP slowstart under a particular set of 948 implementation assumptions. The burst headway shown in Figure 2 949 reflects the TCP self clock derived from the data passing through the 950 dominant bottleneck. The slowstart burst size is nominally 951 target_window_size (so it might end with a burst that is less than 4 952 packets). The slowstart bursts are repeated every target_RTT. Note 953 that a stream of repeated slowstart bursts has three different 954 average rates, depending on the averaging interval. At the finest 955 time scale (a few packet times at the sender interface) the peak of 956 the average rate is the same as the sender interface rate; at a 957 medium scale (a few packet times at the dominant bottleneck) the peak 958 of the average rate is twice the implied bottleneck IP rate; and at 959 time scales longer than the target_RTT and when the burst size is 960 equal to the target_window_size, the average rate is equal to the 961 target_data_rate. This pattern corresponds to repeating the last RTT 962 of TCP slowstart when delayed ACK and sender side byte counting are 963 present but without the limits specified in Appropriate Byte Counting 964 [RFC3465]. 966 time --> ( - = one packet) 967 Packet stream: 969 ---- ---- ---- ---- ---- ---- ---- ... 971 |<>| 4 packet sender interface rate bursts 972 |<--->| Burst headway 973 |<------------------------>| slowstart burst size 974 |<---------------------------------------------->| slowstart headway 975 \____________ _____________/ \______ __ ... 976 V V 977 One slowstart burst Repeated slowstart bursts 979 Slowstart Burst Structure 981 Figure 2 983 Note that in conventional measurement practice, exponentially 984 distributed intervals are often used to eliminate many sorts of 985 correlations. For the procedures above, the correlations are created 986 by the network or protocol elements and accurately reflect their 987 behavior. At some point in the future, it will be desirable to 988 introduce noise sources into the above pacing models, but they are 989 not warranted at this time. 991 6.2. Constant window pseudo CBR 993 Implement pseudo constant bit rate by running a standard protocol 994 such as TCP with a fixed window size, such that it is self clocked. 995 Data packets arriving at the receiver trigger acknowledgements (ACKs) 996 which travel back to the sender where they trigger additional 997 transmissions. The window size is computed from the target_data_rate 998 and the actual RTT of the test path. The rate is only maintained on 999 average over each RTT, and is subject to limitations of the transport 1000 protocol.
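A non-normative sketch of the window computation for constant window pseudo CBR follows; the values are hypothetical, and the rounding issue it exposes is exactly the concern discussed next.

      from math import ceil, floor

      target_data_rate = 1.25e6   # bytes per second, hypothetical
      test_path_rtt = 0.012       # seconds; a short test path, hypothetical
      target_mtu = 1500
      header_overhead = 64

      mss = target_mtu - header_overhead
      exact_window = target_data_rate * test_path_rtt / mss   # about 10.4 packets
      window = ceil(exact_window)                             # 11 packets (default: round up)

      # The achieved average rate is only controllable to whole packets per RTT:
      rate_up = ceil(exact_window) * mss / test_path_rtt      # ~5% above target
      rate_down = floor(exact_window) * mss / test_path_rtt   # ~4% below target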
1002 Since the window size is constrained to be an integer number of 1003 packets, for small RTTs or low data rates there may not be 1004 sufficiently precise control over the data rate. Rounding the window 1005 size up (the default) is likely to result in data rates that are 1006 higher than the target rate, but reducing the window by one packet 1007 may result in data rates that are too small. Also cross traffic 1008 potentially raises the RTT, implicitly reducing the rate. Cross 1009 traffic that raises the RTT nearly always makes the test more 1010 strenuous. A FSTDS specifying a constant window CBR test MUST 1011 explicitly indicate under what conditions errors in the data rate cause 1012 tests to be inconclusive. 1014 Since constant window pseudo CBR testing is sensitive to RTT 1015 fluctuations it is less accurate at controlling the data rate in 1016 environments with fluctuating delays. 1018 6.3. Scanned window pseudo CBR 1020 Scanned window pseudo CBR is similar to the constant window CBR 1021 described above, except the window is scanned across a range of sizes 1022 designed to include two key events, the onset of queueing and the 1023 onset of packet loss or ECN marks. The window is scanned by 1024 incrementing it by one packet every 2*target_window_size delivered 1025 packets. This mimics the additive increase phase of standard TCP 1026 congestion avoidance when delayed ACKs are in effect. Normally the 1027 window increases are separated by intervals slightly longer than twice 1028 the target_RTT. 1030 There are two ways to implement this test: one built by applying a 1031 window clamp to standard congestion control in a standard protocol 1032 such as TCP and the other built by stiffening a non-standard 1033 transport protocol. When standard congestion control is in effect, 1034 any losses or ECN marks cause the transport to revert to a window 1035 smaller than the clamp, such that the scanning clamp loses control of the 1036 window size. The NPAD pathdiag tool is an example of this class of 1037 algorithms [Pathdiag]. 1039 Alternatively a non-standard congestion control algorithm can respond 1040 to losses by transmitting extra data, such that it maintains the 1041 specified window size independent of losses or ECN marks. Such a 1042 stiffened transport explicitly violates mandatory Internet congestion 1043 control [RFC5681] and is not suitable for in situ testing. It is 1044 only appropriate for engineering testing under laboratory conditions. 1045 The Windowed Ping tool implements such a test [WPING]. The tool 1046 described in the paper has been updated [mpingSource]. 1048 The test procedures in Section 8.2 describe how to partition the 1049 scans into regions and how to interpret the results. 1051 6.4. Concurrent or channelized testing 1053 The procedures described in this document are only directly 1054 applicable to single stream measurement, e.g. one TCP connection or 1055 measurement stream. In an ideal world, we would disallow all 1056 performance claims based on multiple concurrent streams, but this is not 1057 practical due to at least two different issues. First, many very 1058 high rate link technologies are channelized and at least partially pin 1059 the flow to channel mapping to minimize packet reordering within 1060 flows. Second, TCP itself has scaling limits. Although the former 1061 problem might be overcome through different design decisions, the 1062 latter problem is more deeply rooted.
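The scaling limit can be quantified with the reference model of Section 5.2: since target_window_size grows linearly with the data rate (at fixed RTT and MTU), the required run length grows with its square. A non-normative sketch, using hypothetical values:

      from math import ceil

      def reference_run_length(rate, rtt=0.1, mtu=1500, hdr=64):
          """Reference target_run_length (Section 5.2); rate in bytes per second."""
          w = ceil(rate * rtt / (mtu - hdr))
          return 3 * w * w

      for mbps in (10, 100, 1000, 10000):
          rate = mbps * 1e6 / 8          # convert bits/s to bytes/s
          print(mbps, "Mb/s:", reference_run_length(rate), "packets between losses")

At 1 Gb/s and a 100 ms RTT the reference model already requires on the order of 2*10^8 packets between losses (a packet loss ratio of roughly 4*10^-9), which is why multiple flows may be the only feasible approach at the highest rates, as discussed below.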
All congestion control algorithms that are philosophically aligned with the standard [RFC5681] (e.g. claim some level of TCP compatibility, friendliness or fairness) have scaling limits, in the sense that as a long fast network (LFN) with a fixed RTT and MTU gets faster, these congestion control algorithms get less accurate and as a consequence have difficulty filling the network [CCscaling].  These properties are a consequence of the original Reno AIMD congestion control design and the requirement in [RFC5681] that all transport protocols have similar responses to congestion.

There are a number of reasons to want to specify performance in terms of multiple concurrent flows, however this approach is not recommended for data rates below several megabits per second, which can be attained with run lengths under 10000 packets on many paths.  Since the required run length goes as the square of the data rate, at higher rates the run lengths can be unreasonably large, and multiple flows might be the only feasible approach.

If multiple flows are deemed necessary to meet aggregate performance targets then this MUST be stated both in the design of the TDS and in any claims about network performance.  The IP diagnostic tests MUST be performed concurrently with the specified number of connections.  For the tests that use bursty traffic, the bursts should be synchronized across flows.

7. Interpreting the Results

7.1. Test outcomes

To perform an exhaustive test of a complete network path, each test of the TDS is applied to each subpath of the complete path.  If any subpath fails any test then a standard transport protocol running over the complete path can also be expected to fail to attain the Target Transport Performance under some conditions.

In addition to passing or failing, a test can be deemed to be inconclusive for a number of reasons.  Proper instrumentation and treatment of inconclusive outcomes is critical to the accuracy and robustness of Model Based Metrics.  Tests can be inconclusive if the precomputed traffic pattern or data rates were not accurately generated; if the measurement results were not statistically significant; or for other causes, such as failing to meet some required preconditions for the test.  See Section 5.4.

For example consider a test that implements Constant Window Pseudo CBR (Section 6.2) by adding rate controls and detailed traffic instrumentation to TCP (e.g. [RFC4898]).  TCP includes built in control systems which might interfere with the sending data rate.  If such a test meets the required packet delivery statistics (e.g. run length) while failing to attain the specified data rate, it must be treated as an inconclusive result, because we can not a priori determine if the reduced data rate was caused by a TCP problem or a network problem, or if the reduced data rate had a material effect on the observed packet delivery statistics.

Note that for capacity tests, if the observed packet delivery statistics meet the statistical criteria for failing (accepting hypothesis H1 in Section 7.2), the test can be considered to have failed because it doesn't really matter that the test didn't attain the required data rate.
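As an illustration only (a sketch, not part of the framework; the names and the three way result are hypothetical), the outcome logic described above can be summarized as:

   # Hypothetical sketch of scoring a single test outcome.
   def score_test(is_capacity_test, run_length_verdict, attained_traffic_spec):
       # run_length_verdict: "pass", "fail" or "unresolved" per Section 7.2.
       # attained_traffic_spec: True if the precomputed traffic pattern and
       # data rate were generated as specified.
       if is_capacity_test and run_length_verdict == "fail":
           # Failing delivery statistics settle a capacity test even if the
           # specified data rate was never attained.
           return "fail"
       if not attained_traffic_spec or run_length_verdict == "unresolved":
           return "inconclusive"
       return run_length_verdict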
The really important new properties of MBM, such as vantage independence, are a direct consequence of opening the control loops in the protocols, such that the test traffic does not depend on network conditions or traffic received.  Any mechanism that introduces feedback between the path's measurements and the traffic generation is at risk of introducing nonlinearities that spoil these properties.  Any exceptional event that indicates that such feedback has happened should cause the test to be considered inconclusive.

One way to view inconclusive tests is that they reflect situations where a test outcome is ambiguous between limitations of the network and some unknown limitation of the IP diagnostic test itself, which may have been caused by some uncontrolled feedback from the network.

Note that procedures that attempt to sweep the target parameter space to find the limits on some parameter such as target_data_rate are at risk of breaking the location independent properties of Model Based Metrics, if any part of the boundary between passing and inconclusive is sensitive to RTT (which is normally the case).

One of the goals for evolving TDS designs will be to keep sharpening the distinction between inconclusive, passing and failing tests.  The criteria for passing, failing and inconclusive tests MUST be explicitly stated for every test in the TDS or FSTDS.

One of the goals of evolving the testing process, procedures, tools and measurement point selection should be to minimize the number of inconclusive tests.

It may be useful to keep raw packet delivery statistics and ancillary metrics [RFC3148] for deeper study of the behavior of the network path and to measure the tools themselves.  Raw packet delivery statistics can help to drive tool evolution.  Under some conditions it might be possible to reevaluate the raw data against an alternate Target Transport Performance.  However it is important to guard against sampling bias and other implicit feedback which can cause false results and exhibit measurement point vantage sensitivity.  Simply applying different delivery criteria based on a different Target Transport Performance is insufficient if the test traffic patterns (bursts, etc.) do not match the alternate Target Transport Performance.

7.2. Statistical criteria for estimating run_length

When evaluating the observed run_length, we need to determine appropriate packet stream sizes and acceptable error levels for efficient measurement.  In practice, can we compare the empirically estimated packet loss and ECN marking ratios with the targets as the sample size grows?  How large a sample is needed to say that the measurements of packet transfer indicate a particular run length is present?

The generalized measurement can be described as recursive testing: send packets (individually or in patterns) and observe the packet delivery performance (packet loss ratio or other metric, any marking we define).

As each packet is sent and measured, we have an ongoing estimate of the performance in terms of the ratio of packet loss or ECN marks to total packets (i.e. an empirical probability).  We continue to send until conditions support a conclusion or a maximum sending limit has been reached.
We have a target_mark_probability, 1 mark per target_run_length, where a "mark" is defined as a lost packet, a packet with ECN mark, or other signal.  This constitutes the null Hypothesis:

   H0:  no more than one mark in target_run_length =
        3*(target_window_size)^2 packets

and we can stop sending packets if on-going measurements support accepting H0 with the specified Type I error = alpha (= 0.05 for example).

We also have an alternative Hypothesis to evaluate: whether performance is significantly lower than the target_mark_probability.  Based on analysis of typical values and practical limits on measurement duration, we choose four times the H0 probability:

   H1:  one or more marks in (target_run_length/4) packets

and we can stop sending packets if measurements support rejecting H0 with the specified Type II error = beta (= 0.05 for example), thus preferring the alternate hypothesis H1.

H0 and H1 constitute the Success and Failure outcomes described elsewhere in the memo; while the ongoing measurements do not support either hypothesis the current status of measurements is inconclusive.

The problem above is formulated to match the Sequential Probability Ratio Test (SPRT) [StatQC].  Note that as originally framed the events under consideration were all manufacturing defects.  In networking, ECN marks and lost packets are not defects but signals, indicating that the transport protocol should slow down.

The Sequential Probability Ratio Test also starts with a pair of hypotheses specified as above:

   H0:  p0 = one defect in target_run_length
   H1:  p1 = one defect in target_run_length/4

As packets are sent and measurements collected, the tester evaluates the cumulative defect count against two boundaries representing H0 Acceptance or Rejection (and acceptance of H1):

   Acceptance line:  Xa = -h1 + s*n
   Rejection line:   Xr =  h2 + s*n

where n increases linearly for each packet sent and

   h1 = { log((1-alpha)/beta) }/k
   h2 = { log((1-beta)/alpha) }/k
   k  = log{ (p1(1-p0)) / (p0(1-p1)) }
   s  = [ log{ (1-p0)/(1-p1) } ]/k

for p0 and p1 as defined in the null and alternative Hypotheses statements above, and alpha and beta as the Type I and Type II errors.

The SPRT specifies simple stopping rules:

   o  Xa < defect_count(n) < Xr: continue testing
   o  defect_count(n) <= Xa: Accept H0
   o  defect_count(n) >= Xr: Accept H1

The calculations above are implemented in the R-tool for Statistical Analysis [Rtool], in the add-on package for Cross-Validation via Sequential Testing (CVST) [CVST].

Using the equations above, we can calculate the minimum number of packets (n) needed to accept H0 when x defects are observed.  For example, when x = 0:

   Xa = 0 = -h1 + s*n
   and n = h1 / s
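As an illustration only (a minimal sketch under the stated assumptions alpha = beta = 0.05; the function names are hypothetical), the boundaries and stopping rule above can be computed as follows:

   # Hypothetical sketch of the SPRT described above, with
   # p0 = 1/target_run_length and p1 = 4/target_run_length.
   from math import log, ceil

   def sprt_parameters(target_run_length, alpha=0.05, beta=0.05):
       p0 = 1.0 / target_run_length
       p1 = 4.0 / target_run_length
       k = log((p1 * (1 - p0)) / (p0 * (1 - p1)))
       h1 = log((1 - alpha) / beta) / k
       h2 = log((1 - beta) / alpha) / k
       s = log((1 - p0) / (1 - p1)) / k
       return h1, h2, s

   def sprt_decision(defect_count, n, h1, h2, s):
       if defect_count <= -h1 + s * n:
           return "accept H0"
       if defect_count >= h2 + s * n:
           return "accept H1"
       return "continue"

   # Example: with target_run_length = 363 (Section 9) and zero marks,
   # H0 can be accepted after roughly n = h1/s packets.
   h1, h2, s = sprt_parameters(363)
   print(ceil(h1 / s))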
7.3. Reordering Tolerance

All tests must be instrumented for packet level reordering [RFC4737].  However, there is no consensus for how much reordering should be acceptable.  Over the last two decades the general trend has been to make protocols and applications more tolerant to reordering (see for example [RFC4015]), in response to the gradual increase in reordering in the network.  This increase has been due to the deployment of technologies such as multi threaded routing lookups and Equal Cost MultiPath (ECMP) routing.  These techniques increase parallelism in the network and are critical to enabling overall Internet growth to exceed Moore's Law.

Note that transport retransmission strategies can trade off reordering tolerance vs how quickly they can repair losses vs overhead from spurious retransmissions.  In advance of new retransmission strategies we propose the following strawman: transport protocols should be able to adapt to reordering as long as the reordering extent is no more than the maximum of one quarter window or 1 ms, whichever is larger.  Within this limit on reordering extent, there should be no bound on reordering density.

By implication, reordering which is less than these bounds should not be treated as a network impairment.  However [RFC4737] still applies: reordering should be instrumented, and the maximum reordering that can be properly characterized by the test (e.g. bound on history buffers) should be recorded with the measurement results.

Reordering tolerance and diagnostic limitations, such as history buffer size, MUST be specified in a FSTDS.

8. Diagnostic Tests

The IP diagnostic tests below are organized by traffic pattern: basic data rate and packet delivery statistics, standing queues, slowstart bursts, and sender rate bursts.  We also introduce some combined tests which are more efficient when networks are expected to pass, but conflate diagnostic signatures when they fail.

There are a number of test details which are not fully defined here.  They must be fully specified in a FSTDS.  From a standardization perspective, this lack of specificity will weaken this version of Model Based Metrics, however it is anticipated that this will be more than offset by the extent to which MBM suppresses the problems caused by using transport protocols for measurement, e.g. non-specific MBM metrics are likely to have better repeatability than many existing BTC like metrics.  Once we have good field experience, the missing details can be fully specified.

8.1. Basic Data Rate and Packet Delivery Tests

We propose several versions of the basic data rate and packet delivery statistics test.  All measure the number of packets delivered between losses or ECN marks, using a data stream that is rate controlled at or below the target_data_rate.

The tests below differ in how the data rate is controlled.  The data can be paced on a timer, or window controlled at full target data rate.  The first two tests implicitly confirm that the subpath has sufficient raw capacity to carry the target_data_rate.  They are recommended for relatively infrequent testing, such as an installation or periodic auditing process.  The third, background packet delivery statistics, is a low rate test designed for ongoing monitoring for changes in subpath quality.

All rely on the receiver accumulating packet delivery statistics as described in Section 7.2 to score the outcome:

Pass: it is statistically significant that the observed interval between losses or ECN marks is larger than the target_run_length.

Fail: it is statistically significant that the observed interval between losses or ECN marks is smaller than the target_run_length.
A test is considered to be inconclusive if it failed to meet the data rate as specified below, failed to meet the qualifications defined in Section 5.4, or if neither run length statistical hypothesis was confirmed in the allotted test duration.

8.1.1. Delivery Statistics at Paced Full Data Rate

Confirm that the observed run length is at least the target_run_length while relying on a timer to send data at the target_rate, using the procedure described in Section 6.1 with a burst size of 1 (single packets) or 2 (packet pairs).

The test is considered to be inconclusive if the packet transmission can not be accurately controlled for any reason.

RFC 6673 [RFC6673] is appropriate for measuring packet delivery statistics at full data rate.

8.1.2. Delivery Statistics at Full Data Windowed Rate

Confirm that the observed run length is at least the target_run_length while sending at an average rate approximately equal to the target_data_rate, by controlling (or clamping) the window size of a conventional transport protocol to a fixed value computed from the properties of the test path, typically test_window=target_data_rate*test_path_RTT/target_MTU.  Note that if there is any interaction between the forward and return path, test_window may need to be adjusted slightly to compensate for the resulting inflated RTT.

Since losses and ECN marks generally cause transport protocols to at least temporarily reduce their data rates, this test is expected to be less precise about controlling its data rate.  It should not be considered inconclusive as long as at least some of the round trips reached the full target_data_rate without incurring losses or ECN marks.  To pass this test the network MUST deliver target_window_size packets in target_RTT time without any losses or ECN marks at least once per two target_window_size round trips, in addition to meeting the run length statistical test.

8.1.3. Background Packet Delivery Statistics Tests

The background run length is a low rate version of the target rate test above, designed for ongoing lightweight monitoring for changes in the observed subpath run length without disrupting users.  It should be used in conjunction with one of the above full rate tests because it does not confirm that the subpath can support the raw data rate.

RFC 6673 [RFC6673] is appropriate for measuring background packet delivery statistics.

8.2. Standing Queue Tests

These engineering tests confirm that the bottleneck is well behaved across the onset of packet loss, which typically follows after the onset of queueing.  Well behaved generally means lossless for transient queues, but once the queue has been sustained for a sufficient period of time (or reaches a sufficient queue depth) there should be a small number of losses to signal to the transport protocol that it should reduce its window.  Losses that are too early can prevent the transport from averaging at the target_data_rate.  Losses that are too late indicate that the queue might be subject to bufferbloat [wikiBloat] and inflict excess queuing delays on all flows sharing the bottleneck queue.  Excess losses (more than half of the window) at the onset of congestion make loss recovery problematic for the transport protocol.
Non-linear, erratic or excessive RTT increases suggest poor interactions between the channel acquisition algorithms and the transport self clock.  All of the tests in this section use the same basic scanning algorithm, described here, but score the link or subpath on the basis of how well it avoids each of these problems.

For some technologies the data might not be subject to increasing delays, in which case the data rate will vary with the window size all the way up to the onset of load induced losses or ECN marks.  For these technologies, the discussion of queueing does not apply, but it is still required that the onset of losses or ECN marks be at an appropriate point and progressive.

Use the procedure in Section 6.3 to sweep the window across the onset of queueing and the onset of loss.  The tests below all assume that the scan emulates standard additive increase and delayed ACK by incrementing the window by one packet for every 2*target_window_size packets delivered.  A scan can typically be divided into three regions: below the onset of queueing, a standing queue, and at or beyond the onset of loss.

Below the onset of queueing the RTT is typically fairly constant, and the data rate varies in proportion to the window size.  Once the data rate reaches the subpath IP rate, the data rate becomes fairly constant, and the RTT increases in proportion to the increase in window size.  The precise transition across the start of queueing can be identified by the maximum network power, defined to be the ratio of the data rate over the RTT.  The network power can be computed at each window size, and the window with the maximum is taken as the start of the queueing region.

For technologies that do not have conventional queues, start the scan at a window equal to test_window=target_data_rate*test_path_RTT/target_MTU, i.e. starting at the target rate, instead of the power point.

If there is random background loss (e.g. bit errors, etc.), precise determination of the onset of queue induced packet loss may require multiple scans.  Above the onset of queuing loss, all transport protocols are expected to experience periodic losses determined by the interaction between the congestion control and AQM algorithms.  For standard congestion control algorithms the periodic losses are likely to be relatively widely spaced, and the details are typically dominated by the behavior of the transport protocol itself.  For the stiffened transport protocol case (with non-standard, aggressive congestion control algorithms) the details of periodic losses will be dominated by how the window increase function responds to loss.

8.2.1. Congestion Avoidance

A subpath passes the congestion avoidance standing queue test if more than target_run_length packets are delivered between the onset of queueing (as determined by the window with the maximum network power) and the first loss or ECN mark.  If this test is implemented using a standard congestion control algorithm with a clamp, it can be performed in situ in the production Internet as a capacity test.  For an example of such a test see [Pathdiag].
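As an illustration only (a sketch with hypothetical names, assuming the scan results are available as (window, data rate, RTT) samples), the onset of queueing can be located as the window with the maximum network power:

   # Hypothetical sketch: find the window at the start of the queueing
   # region, i.e. the sample with the maximum data_rate/RTT.
   def onset_of_queueing(scan):
       # scan: list of (window_packets, data_rate_bps, rtt_seconds)
       return max(scan, key=lambda sample: sample[1] / sample[2])[0]

   # Assumed scan data: the rate grows with the window until the subpath
   # IP rate is reached, after which only the RTT grows.
   scan = [(8, 2.0e6, 0.050), (10, 2.4e6, 0.050),
           (12, 2.5e6, 0.055), (14, 2.5e6, 0.065)]
   print(onset_of_queueing(scan))   # window 10 has the maximum power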
For technologies that do not have conventional queues, use the test_window in place of the onset of queueing, i.e. a subpath passes the congestion avoidance standing queue test if more than target_run_length packets are delivered between the start of the scan at test_window and the first loss or ECN mark.

8.2.2. Bufferbloat

This test confirms that there is some mechanism to limit buffer occupancy (e.g. that prevents bufferbloat).  Note that this is not strictly a requirement for single stream bulk transport capacity, however if there is no mechanism to limit buffer queue occupancy then a single stream with sufficient data to deliver is likely to cause the problems described in [RFC2309], [I-D.ietf-aqm-recommendation] and [wikiBloat].  This may cause only minor symptoms for the dominant flow, but has the potential to make the subpath unusable for other flows and applications.

Pass if the onset of loss occurs before a standing queue has introduced more delay than twice the target_RTT, or another well defined and specified limit.  Note that there is not yet a model for how much standing queue is acceptable.  The factor of two chosen here reflects a rule of thumb.  In conjunction with the previous test, this test implies that the first loss should occur at a queueing delay which is between one and two times the target_RTT.

Specified RTT limits that are larger than twice the target_RTT must be fully justified in the FSTDS.

8.2.3. Non excessive loss

This test confirms that the onset of loss is not excessive.  Pass if losses are equal to or less than the increase in the cross traffic plus the test traffic window increase on the previous RTT.  This could be restated as non-decreasing subpath throughput at the onset of loss, which is easy to meet as long as discarding packets is not more expensive than delivering them.  (Note that when there is a transient drop in subpath throughput, outside of a standing queue test, a subpath that passes other queue tests in this document will have sufficient queue space to hold one RTT worth of data.)

Note that conventional Internet traffic policers will not pass this test, which is correct.  TCP often fails to come into equilibrium at more than a small fraction of the available capacity, if the capacity is enforced by a policer.  [Citation Pending].

8.2.4. Duplex Self Interference

This engineering test confirms a bound on the interactions between the forward data path and the ACK return path.

Some historical half duplex technologies had the property that each direction held the channel until it completely drained its queue.  When a self clocked transport protocol, such as TCP, has data and ACKs passing in opposite directions through such a link, the behavior often reverts to stop-and-wait.  Each additional packet added to the window raises the observed RTT by two forward path packet times, once as it passes through the data path, and once for the additional delay incurred by the ACK waiting on the return path.

The duplex self interference test fails if the RTT rises by more than some fixed bound above the expected queueing time computed from the excess window divided by the subpath IP capacity.  This bound must be smaller than target_RTT/2 to avoid reverting to stop and wait behavior.  (e.g. data packets and ACKs have to be released at least twice per RTT.)
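As an illustration only (a sketch with hypothetical names; delays and RTTs in seconds, rates in bits per second, windows in bits), the standing queue limits above can be checked as:

   # Hypothetical sketch of the Section 8.2.2 and Section 8.2.4 limits.
   def bufferbloat_ok(queue_delay_at_first_loss, target_rtt, limit_factor=2.0):
       # Loss should begin before the standing queue adds more than
       # (nominally) twice the target_RTT of delay.
       return queue_delay_at_first_loss <= limit_factor * target_rtt

   def duplex_self_interference_ok(observed_rtt, base_rtt, excess_window_bits,
                                   subpath_ip_capacity, bound):
       # The RTT rise must stay within a fixed bound (itself smaller than
       # target_RTT/2) above the expected queueing time, which is the
       # excess window divided by the subpath IP capacity.
       expected_queueing = excess_window_bits / subpath_ip_capacity
       return (observed_rtt - base_rtt) <= expected_queueing + bound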
8.3. Slowstart tests

These tests mimic slowstart: data is sent at twice the effective bottleneck rate to exercise the queue at the dominant bottleneck.

In general they are deemed inconclusive if the elapsed time to send the data burst is not less than half of the time to receive the ACKs (i.e. sending data too fast is ok, but sending it slower than twice the actual bottleneck rate as indicated by the ACKs is deemed inconclusive).  Space the bursts such that the average data rate is equal to the target_data_rate.

8.3.1. Full Window slowstart test

This is a capacity test to confirm that slowstart is not likely to exit prematurely.  Send slowstart bursts that are target_window_size total packets.

Accumulate packet delivery statistics as described in Section 7.2 to score the outcome.  Pass if it is statistically significant that the observed number of good packets delivered between losses or ECN marks is larger than the target_run_length.  Fail if it is statistically significant that the observed interval between losses or ECN marks is smaller than the target_run_length.

Note that these are the same parameters as the Sender Full Window burst test, except the burst rate is at the slowstart rate, rather than the sender interface rate.

8.3.2. Slowstart AQM test

Do a continuous slowstart (send data continuously at slowstart_rate) until the first loss, then stop, allow the network to drain, and repeat, gathering statistics on the last packet delivered before the loss, the loss pattern, maximum observed RTT and window size.  Justify the results.  There is not currently sufficient theory justifying requiring any particular result, however design decisions that affect the outcome of this test also affect how the network balances between long and short flows (the "mice and elephants" problem).  The queue at the time of the first loss should be at least one half of the target_RTT.

This is an engineering test: it would be best performed on a quiescent network or testbed, since cross traffic has the potential to change the results.

8.4. Sender Rate Burst tests

These tests determine how well the network can deliver bursts sent at the sender's interface rate.  Note that this test most heavily exercises the front path, and is likely to include infrastructure that may be out of scope for an access ISP, even though the bursts might be caused by ACK compression, thinning or channel arbitration in the access ISP.  See Appendix B.

Also, there are several details that are not precisely defined.  For starters there is not a standard sender interface rate.  1 Gb/s and 10 Gb/s are very common today, but higher rates will become cost effective and can be expected to be dominant some time in the future.

Current standards permit TCP to send a full window burst following an application pause.  (Congestion Window Validation [RFC2861] is not required, but even if it was, it does not take effect until an application pause is longer than an RTO.)  Since full window bursts are consistent with standard behavior, it is desirable that the network be able to deliver such bursts, otherwise application pauses will cause unwarranted losses.  Note that the AIMD sawtooth requires a peak window that is twice target_window_size, so the worst case burst may be 2*target_window_size.
It is also understood in the application and serving community that interface rate bursts have a cost to the network that has to be balanced against other costs in the servers themselves.  For example TCP Segmentation Offload (TSO) reduces server CPU in exchange for larger network bursts, which increase the stress on network buffer memory.

There is not yet a theory to unify these costs or to provide a framework for trying to optimize global efficiency.  We do not yet have a model for how much the network should tolerate sender rate bursts.  Some bursts must be tolerated by the network, but it is probably unreasonable to expect the network to be able to efficiently deliver all data as a series of bursts.

For this reason, this is the only test for which we encourage derating.  A TDS could include a table of pairs of derating parameters: what burst size to use as a fraction of the target_window_size, and how much each burst size is permitted to reduce the run length, relative to the target_run_length.

8.5. Combined and Implicit Tests

Combined tests efficiently confirm multiple network properties in a single test, possibly as a side effect of normal content delivery.  They require less measurement traffic than other testing strategies at the cost of conflating diagnostic signatures when they fail.  These are by far the most efficient for monitoring networks that are nominally expected to pass all tests.

8.5.1. Sustained Bursts Test

The sustained burst test implements a combined worst case version of all of the capacity tests above.  It is simply:

Send target_window_size bursts of packets at sender interface rate with target_RTT burst headway (burst start to burst start).  Verify that the observed packet delivery statistics meet the target_run_length.

Key observations:
o  The subpath under test is expected to go idle for some fraction of the time: (subpath_IP_capacity - target_rate*target_MTU/(target_MTU - header_overhead))/subpath_IP_capacity.  Failing to do so indicates a problem with the procedure and an inconclusive test result.
o  The burst sensitivity can be derated by sending smaller bursts more frequently.  E.g. send target_window_size*derate packet bursts every target_RTT*derate.
o  When not derated, this test is the most strenuous capacity test.
o  A subpath that passes this test is likely to be able to sustain higher rates (close to subpath_IP_capacity) for paths with RTTs significantly smaller than the target_RTT.
o  This test can be implemented with instrumented TCP [RFC4898], using a specialized measurement application at one end [MBMSource] and a minimal service at the other end [RFC0863] [RFC0864].
o  This test is efficient to implement, since it does not require per-packet timers, and can make use of TSO in modern NIC hardware.
o  This test by itself is not sufficient: the standing window engineering tests are also needed to ensure that the subpath is well behaved at and beyond the onset of congestion.
o  Assuming the subpath passes relevant standing window engineering tests (particularly that it has a progressive onset of loss at an appropriate queue depth), a passing sustained burst test is (believed to be) sufficient to verify that the subpath will not impair a stream at the target performance under all conditions.
   Proving this statement will be the subject of ongoing research.

Note that this test is clearly independent of the subpath RTT, or other details of the measurement infrastructure, as long as the measurement infrastructure can accurately and reliably deliver the required bursts to the subpath under test.

8.5.2. Streaming Media

Model Based Metrics can be implicitly implemented as a side effect of serving any non-throughput maximizing traffic, such as streaming media, with some additional controls and instrumentation in the servers.  The essential requirement is that the traffic be constrained such that even with arbitrary application pauses, bursts and data rate fluctuations, the traffic stays within the envelope defined by the individual tests described above.

If the application's serving_data_rate is less than or equal to the target_data_rate and the serving_RTT (the RTT between the sender and client) is less than the target_RTT, this constraint is most easily implemented by clamping the transport window size to be no larger than:

   serving_window_clamp=target_data_rate*serving_RTT/
   (target_MTU-header_overhead)

Under the above constraints the serving_window_clamp will limit both the serving data rate and burst sizes to be no larger than those called for by the procedures in Section 8.1.2 and Section 8.4 or Section 8.5.1.  Since the serving RTT is smaller than the target_RTT, the worst case bursts that might be generated under these conditions will be smaller than called for by Section 8.4, and the sender rate burst sizes are implicitly derated by the serving_window_clamp divided by the target_window_size at the very least.  (Depending on the application behavior, the data traffic might be significantly smoother than specified by any of the burst tests.)

In an alternative implementation the data rate and bursts might be explicitly controlled by a host shaper or pacing at the sender.  This would provide better control over transmissions but it is substantially more complicated to implement and would be likely to have a higher CPU overhead.

Note that these techniques can be applied to any content delivery that can be subjected to a reduced data rate in order to inhibit TCP equilibrium behavior.

9. An Example

In this section we illustrate a TDS designed to confirm that an access ISP can reliably deliver HD video from multiple content providers to all of their customers.  With modern codecs, minimal HD video (720p) generally fits in 2.5 Mb/s.  Due to their geographical size, network topology and modem designs, the ISP determines that most content is within a 50 ms RTT of their users (this is sufficient to cover continental Europe or either US coast from a single serving site).

                  2.5 Mb/s over a 50 ms path

        +----------------------+-------+---------+
        | End-to-End Parameter | value | units   |
        +----------------------+-------+---------+
        | target_rate          | 2.5   | Mb/s    |
        | target_RTT           | 50    | ms      |
        | target_MTU           | 1500  | bytes   |
        | header_overhead      | 64    | bytes   |
        | target_window_size   | 11    | packets |
        | target_run_length    | 363   | packets |
        +----------------------+-------+---------+

                         Table 1

Table 1 shows the default TCP model with no derating, and as such is quite conservative.
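As an illustration only (a worked sketch under the stated assumptions that the window is the ceiling of target_rate*target_RTT expressed in data packets of target_MTU minus header_overhead bytes, and that the reference run length is 3*(target_window_size)^2 as in Section 7.2), the derived rows of Table 1 can be reproduced as:

   # Hypothetical sketch reproducing the derived rows of Table 1.
   import math

   target_rate = 2.5e6        # bits per second
   target_rtt = 0.050         # seconds
   target_mtu = 1500          # bytes
   header_overhead = 64       # bytes

   payload_bits = 8 * (target_mtu - header_overhead)
   target_window_size = math.ceil(target_rate * target_rtt / payload_bits)
   target_run_length = 3 * target_window_size ** 2

   print(target_window_size, target_run_length)   # -> 11 363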
The simplest TDS would be to use the sustained burst test, described in Section 8.5.1.  Such a test would send 11 packet bursts every 50 ms, confirming that there was no more than 1 packet loss per 33 bursts (363 total packets in 1.650 seconds).

Since this number represents the entire end-to-end loss budget, independent subpath tests could be implemented by apportioning the packet loss ratio across subpaths.  For example 50% of the losses might be allocated to the access or last mile link to the user, 40% to the interconnects with other ISPs and 1% to each internal hop (assuming no more than 10 internal hops).  Then all of the subpaths can be tested independently, and the spatial composition of passing subpaths would be expected to be within the end-to-end loss budget.

Testing interconnects has generally been problematic: conventional performance tests run between Measurement Points adjacent to either side of the interconnect are not generally useful.  Unconstrained TCP tests, such as iperf [iperf], are usually overly aggressive because the RTT is so small (often less than 1 ms).  With a short RTT these tools are likely to report inflated numbers because for short RTTs these tools can tolerate very high packet loss ratios and can push other cross traffic off of the network.  As a consequence they are useless for predicting actual user performance, and may themselves be quite disruptive.  Model Based Metrics solves this problem.  The same test pattern as used on other subpaths can be applied to the interconnect.  For our example, when apportioned 40% of the losses, 11 packet bursts sent every 50 ms should have fewer than one loss per 82 bursts (902 packets).

10. Validation

Since some aspects of the models are likely to be too conservative, Section 5.2 permits alternate protocol models and Section 5.3 permits test parameter derating.  If either of these techniques is used, we require demonstrations that such a TDS can robustly detect subpaths that will prevent authentic applications using state-of-the-art protocol implementations from meeting the specified Target Transport Performance.  This correctness criterion is potentially difficult to prove, because it implicitly requires validating a TDS against all possible paths and subpaths.  The procedures described here are still experimental.

We suggest two approaches, both of which should be applied: first, publish a fully open description of the TDS, including what assumptions were used and how it was derived, such that the research community can evaluate the design decisions, test them and comment on their applicability; and second, demonstrate that applications running over an infinitesimally passing testbed do meet the performance targets.

An infinitesimally passing testbed resembles an epsilon-delta proof in calculus.  Construct a test network such that all of the individual tests of the TDS pass by only small (infinitesimal) margins, and demonstrate that a variety of authentic applications running over real TCP implementations (or other protocols as appropriate) meet the Target Transport Performance over such a network.  The workloads should include multiple types of streaming media and transaction oriented short flows (e.g. synthetic web traffic).
For example, for the HD streaming video TDS described in Section 9, the IP capacity should be exactly the header overhead above 2.5 Mb/s, the per packet random background loss ratio should be 1/363 (for a run length of 363 packets), the bottleneck queue should be 11 packets, and the front path should have just enough buffering to withstand 11 packet interface rate bursts.  We want every one of the TDS tests to fail if we slightly increase the relevant test parameter, so for example sending a 12 packet burst should cause excess (possibly deterministic) packet drops at the dominant queue at the bottleneck.  On this infinitesimally passing network it should be possible for a real application using a stock TCP implementation in the vendor's default configuration to attain 2.5 Mb/s over a 50 ms path.

The most difficult part of setting up such a testbed is arranging for it to infinitesimally pass the individual tests.  Two approaches are suggested: constraining the network devices not to use all available resources (e.g. by limiting available buffer space or data rate); and preloading subpaths with cross traffic.  Note that it is important that a single environment be constructed which infinitesimally passes all tests at the same time; otherwise there is a chance that TCP can exploit extra latitude in some parameters (such as data rate) to partially compensate for constraints in other parameters (such as queue space), or vice versa.

To the extent that a TDS is used to inform public dialog it should be fully publicly documented, including the details of the tests, what assumptions were used and how it was derived.  All of the details of the validation experiment should also be published with sufficient detail for the experiments to be replicated by other researchers.  All components should either be open source or fully described proprietary implementations that are available to the research community.

11. Security Considerations

Measurement is often used to inform business and policy decisions, and as a consequence is potentially subject to manipulation.  Model Based Metrics are expected to be a huge step forward because equivalent measurements can be performed from multiple vantage points, such that performance claims can be independently validated by multiple parties.

Much of the acrimony in the Net Neutrality debate is due to the historical lack of any effective vantage independent tools to characterize network performance.  Traditional methods for measuring Bulk Transport Capacity are sensitive to RTT and as a consequence often yield very different results when run local to an ISP or interconnect and when run over a customer's complete path.  Neither the ISP nor customer can repeat the other's measurements, leading to high levels of distrust and acrimony.  Model Based Metrics are expected to greatly improve this situation.

This document only describes a framework for designing a Fully Specified Targeted Diagnostic Suite.  Each FSTDS MUST include its own security section.

12. Acknowledgements

Ganga Maguluri suggested the statistical test for measuring loss probability in the target run length.  Alex Gilgur helped with the statistics.

Meredith Whittaker improved the clarity of the communications.
1858 Ruediger Geib provided feedback which greatly improved the document. 1860 This work was inspired by Measurement Lab: open tools running on an 1861 open platform, using open tools to collect open data. See 1862 http://www.measurementlab.net/ 1864 13. IANA Considerations 1866 This document has no actions for IANA. 1868 14. References 1870 14.1. Normative References 1872 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1873 Requirement Levels", BCP 14, RFC 2119, March 1997. 1875 14.2. Informative References 1877 [RFC0863] Postel, J., "Discard Protocol", STD 21, RFC 863, May 1983. 1879 [RFC0864] Postel, J., "Character Generator Protocol", STD 22, 1880 RFC 864, May 1983. 1882 [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, 1883 S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., 1884 Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, 1885 S., Wroclawski, J., and L. Zhang, "Recommendations on 1886 Queue Management and Congestion Avoidance in the 1887 Internet", RFC 2309, April 1998. 1889 [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, 1890 "Framework for IP Performance Metrics", RFC 2330, 1891 May 1998. 1893 [RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion 1894 Window Validation", RFC 2861, June 2000. 1896 [RFC3148] Mathis, M. and M. Allman, "A Framework for Defining 1897 Empirical Bulk Transfer Capacity Metrics", RFC 3148, 1898 July 2001. 1900 [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte 1901 Counting (ABC)", RFC 3465, February 2003. 1903 [RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm 1904 for TCP", RFC 4015, February 2005. 1906 [RFC4737] Morton, A., Ciavattone, L., Ramachandran, G., Shalunov, 1907 S., and J. Perser, "Packet Reordering Metrics", RFC 4737, 1908 November 2006. 1910 [RFC4898] Mathis, M., Heffner, J., and R. Raghunarayan, "TCP 1911 Extended Statistics MIB", RFC 4898, May 2007. 1913 [RFC5136] Chimento, P. and J. Ishac, "Defining Network Capacity", 1914 RFC 5136, February 2008. 1916 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1917 Control", RFC 5681, September 2009. 1919 [RFC6049] Morton, A. and E. Stephan, "Spatial Composition of 1920 Metrics", RFC 6049, January 2011. 1922 [RFC6673] Morton, A., "Round-Trip Packet Loss Metrics", RFC 6673, 1923 August 2012. 1925 [RFC7312] Fabini, J. and A. Morton, "Advanced Stream and Sampling 1926 Framework for IP Performance Metrics (IPPM)", RFC 7312, 1927 August 2014. 1929 [RFC7398] Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and 1930 A. Morton, "A Reference Path and Measurement Points for 1931 Large-Scale Measurement of Broadband Performance", 1932 RFC 7398, February 2015. 1934 [I-D.ietf-ippm-2680-bis] 1935 Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, "A 1936 One-Way Loss Metric for IPPM", draft-ietf-ippm-2680-bis-02 1937 (work in progress), June 2015. 1939 [I-D.ietf-aqm-recommendation] 1940 Baker, F. and G. Fairhurst, "IETF Recommendations 1941 Regarding Active Queue Management", 1942 draft-ietf-aqm-recommendation-11 (work in progress), 1943 February 2015. 1945 [MSMO97] Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The 1946 Macroscopic Behavior of the TCP Congestion Avoidance 1947 Algorithm", Computer Communications Review volume 27, 1948 number3, July 1997. 1950 [WPING] Mathis, M., "Windowed Ping: An IP Level Performance 1951 Diagnostic", INET 94, June 1994. 1953 [mpingSource] 1954 Fan, X., Mathis, M., and D. 
Hamon, "Git Repository for 1955 mping: An IP Level Performance Diagnostic", Sept 2013, 1956 . 1958 [MBMSource] 1959 Hamon, D., Stuart, S., and H. Chen, "Git Repository for 1960 Model Based Metrics", Sept 2013, 1961 . 1963 [Pathdiag] 1964 Mathis, M., Heffner, J., O'Neil, P., and P. Siemsen, 1965 "Pathdiag: Automated TCP Diagnosis", Passive and Active 1966 Measurement , June 2008. 1968 [iperf] Wikipedia Contributors, "iPerf", Wikipedia, The Free 1969 Encyclopedia , cited March 2015, . 1972 [StatQC] Montgomery, D., "Introduction to Statistical Quality 1973 Control - 2nd ed.", ISBN 0-471-51988-X, 1990. 1975 [Rtool] R Development Core Team, "R: A language and environment 1976 for statistical computing. R Foundation for Statistical 1977 Computing, Vienna, Austria. ISBN 3-900051-07-0, URL 1978 http://www.R-project.org/", , 2011. 1980 [CVST] Krueger, T. and M. Braun, "R package: Fast Cross- 1981 Validation via Sequential Testing", version 0.1, 11 2012. 1983 [AFD] Pan, R., Breslau, L., Prabhakar, B., and S. Shenker, 1984 "Approximate fairness through differential dropping", 1985 SIGCOMM Comput. Commun. Rev. 33, 2, April 2003. 1987 [wikiBloat] 1988 Wikipedia, "Bufferbloat", http://en.wikipedia.org/w/ 1989 index.php?title=Bufferbloat&oldid=608805474, March 2015. 1991 [CCscaling] 1992 Fernando, F., Doyle, J., and S. Steven, "Scalable laws for 1993 stable network congestion control", Proceedings of 1994 Conference on Decision and 1995 Control, http://www.ee.ucla.edu/~paganini, December 2001. 1997 Appendix A. Model Derivations 1999 The reference target_run_length described in Section 5.2 is based on 2000 very conservative assumptions: that all window above 2001 target_window_size contributes to a standing queue that raises the 2002 RTT, and that classic Reno congestion control with delayed ACKs are 2003 in effect. In this section we provide two alternative calculations 2004 using different assumptions. 2006 It may seem out of place to allow such latitude in a measurement 2007 standard, but this section provides offsetting requirements. 2009 The estimates provided by these models make the most sense if network 2010 performance is viewed logarithmically. In the operational Internet, 2011 data rates span more than 8 orders of magnitude, RTT spans more than 2012 3 orders of magnitude, and packet loss ratio spans at least 8 orders 2013 of magnitude if not more. When viewed logarithmically (as in 2014 decibels), these correspond to 80 dB of dynamic range. On an 80 dB 2015 scale, a 3 dB error is less than 4% of the scale, even though it 2016 represents a factor of 2 in untransformed parameter. 2018 This document gives a lot of latitude for calculating 2019 target_run_length, however people designing a TDS should consider the 2020 effect of their choices on the ongoing tussle about the relevance of 2021 "TCP friendliness" as an appropriate model for Internet capacity 2022 allocation. Choosing a target_run_length that is substantially 2023 smaller than the reference target_run_length specified in Section 5.2 2024 strengthens the argument that it may be appropriate to abandon "TCP 2025 friendliness" as the Internet fairness model. This gives developers 2026 incentive and permission to develop even more aggressive applications 2027 and protocols, for example by increasing the number of connections 2028 that they open concurrently. 2030 A.1. 
Queueless Reno

In Section 5.2 it was assumed that the subpath IP rate matches the target rate plus overhead, such that the excess window needed for the AIMD sawtooth causes a fluctuating queue at the bottleneck.

An alternate situation would be a bottleneck where there is no significant queue and losses are caused by some mechanism that does not involve extra delay, for example by the use of a virtual queue as in Approximate Fair Dropping [AFD].  A flow controlled by such a bottleneck would have a constant RTT and a data rate that fluctuates in a sawtooth due to AIMD congestion control.  Assume the losses are being controlled to make the average data rate meet some goal which is equal to or greater than the target_rate.  The necessary run length can be computed as follows:

For some value of Wmin, the window will sweep from Wmin packets to 2*Wmin packets in 2*Wmin RTT (due to delayed ACK).  Unlike the queueing case where Wmin = target_window_size, we want the average of Wmin and 2*Wmin to be the target_window_size, so the average rate is the target rate.  Thus we want Wmin = (2/3)*target_window_size.

Between losses each sawtooth delivers (1/2)(Wmin + 2*Wmin)(2*Wmin) packets in 2*Wmin round trip times.

Substituting these together we get:

   target_run_length = (4/3)(target_window_size^2)

Note that this is 44% of the reference_run_length computed earlier.  This makes sense because under the assumptions in Section 5.2 the AIMD sawtooth caused a queue at the bottleneck, which raised the effective RTT by 50%.

Appendix B. Complex Queueing

For many network technologies simple queueing models don't apply: the network schedules, thins or otherwise alters the timing of ACKs and data, generally to raise the efficiency of the channel allocation when confronted with relatively widely spaced small ACKs.  These efficiency strategies are ubiquitous for half duplex, wireless and broadcast media.

Altering the ACK stream generally has two consequences: it raises the implied bottleneck IP capacity, causing slowstart to burst at higher rates (possibly as high as the sender's interface rate), and it effectively raises the RTT by the average time that the ACKs and data were delayed.  The first effect can be partially mitigated by reclocking ACKs once they are beyond the bottleneck on the return path to the sender, however this further raises the effective RTT.

The most extreme example of this sort of behavior would be a half duplex channel that is not released as long as the end point currently holding the channel has more traffic (data or ACKs) to send.  Such environments cause self clocked protocols under full load to revert to extremely inefficient stop and wait behavior, where they send an entire window of data as a single burst on the forward path, followed by the entire window of ACKs on the return path.  It is important to note that due to self clocking, ill conceived channel allocation mechanisms can increase the stress on upstream subpaths in a long path: they cause larger and faster bursts.

If a particular return path contains a subpath or device that alters the ACK stream, then the entire path from the sender up to the bottleneck must be tested at the burst parameters implied by the ACK scheduling algorithm.
The most important parameter is the Implied Bottleneck IP Capacity, which is the average rate at which the ACKs advance snd.una.  Note that thinning the ACKs (relying on the cumulative nature of seg.ack to permit discarding some ACKs) implies an effectively infinite Implied Bottleneck IP Capacity.

Holding data or ACKs for channel allocation or other reasons (such as forward error correction) always raises the effective RTT relative to the minimum delay for the path.  Therefore it may be necessary to replace target_RTT in the calculation in Section 5.2 by an effective_RTT, which includes the target_RTT plus a term to account for the extra delays introduced by these mechanisms.

Appendix C. Version Control

This section to be removed prior to publication.

Formatted: Mon Jul 6 13:49:30 PDT 2015

Authors' Addresses

   Matt Mathis
   Google, Inc
   1600 Amphitheater Parkway
   Mountain View, California  94043
   USA

   Email: mattmathis@google.com

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/