idnits 2.17.1 

draft-ietf-ippm-model-based-metrics-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1191 has weird spacing: '...   and  n = h1...'

  -- The document date (June 13, 2015) is 3239 days in the past.  Is this
     intentional?


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC2680bis' is mentioned on line 386, but not defined

  ** Obsolete undefined reference: RFC 2680 (Obsoleted by RFC 7680)

  == Missing Reference: 'Dominant' is mentioned on line 426, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 2309
     (Obsoleted by RFC 7567)

  -- Obsolete informational reference (is this intentional?): RFC 2861
     (Obsoleted by RFC 7661)


     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	IP Performance Working Group                                   M. Mathis
3	Internet-Draft                                               Google, Inc
4	Intended status: Experimental                                  A. Morton
5	Expires: December 15, 2015                                     AT&T Labs
6	                                                           June 13, 2015

8	            Model Based Metrics for Bulk Transport Capacity
9	               draft-ietf-ippm-model-based-metrics-05.txt

11	Abstract

13	   We introduce a new class of model based metrics designed to determine
14	   if a complete Internet path can meet predefined bulk transport
15	   performance targets by applying a suite of IP diagnostic tests to
16	   successive subpaths.  The subpath-at-a-time tests can be robustly
17	   applied to key infrastructure, such as interconnects or even
18	   individual devices, to accurately detect if any part of the
19	   infrastructure will prevent any path traversing it from meeting the
20	   specified target performance.

22	   The diagnostic tests consist of precomputed traffic patterns and
23	   statistical criteria for evaluating packet delivery.  The traffic
24	   patterns are precomputed to mimic TCP or other transport protocols
25	   over a long path but are constructed in such a way that they are
26	   independent of the actual details of the subpath under test, end
27	   systems or applications.  Likewise the success criteria depend on
28	   the packet delivery statistics of the subpath, as evaluated against a
29	   protocol model applied to the target performance.  The success
30	   criteria also do not depend on the details of the subpath, end
31	   systems or applications.  This makes the measurements open loop,
32	   eliminating most of the difficulties encountered by traditional bulk
33	   transport metrics.

35	   Model based metrics exhibit several important new properties not
36	   present in other Bulk Capacity Metrics, including the ability to
37	   reason about concatenated or overlapping subpaths.  The results are
38	   vantage independent, which is critical for supporting independent
39	   validation of test results from multiple Measurement Points.

41	   This document does not define diagnostic tests directly, but provides
42	   a framework for designing suites of IP diagnostic tests that are
43	   tailored to confirming that infrastructure can meet a predetermined
44	   target performance.

46	   Interim DRAFT Formatted: Sat Jun 13 16:25:01 PDT 2015

48	Status of this Memo
49	   This Internet-Draft is submitted in full conformance with the
50	   provisions of BCP 78 and BCP 79.

52	   Internet-Drafts are working documents of the Internet Engineering
53	   Task Force (IETF).  Note that other groups may also distribute
54	   working documents as Internet-Drafts.  The list of current Internet-
55	   Drafts is at http://datatracker.ietf.org/drafts/current/.

57	   Internet-Drafts are draft documents valid for a maximum of six months
58	   and may be updated, replaced, or obsoleted by other documents at any
59	   time.  It is inappropriate to use Internet-Drafts as reference
60	   material or to cite them other than as "work in progress."

62	   This Internet-Draft will expire on December 15, 2015.

64	Copyright Notice

66	   Copyright (c) 2015 IETF Trust and the persons identified as the
67	   document authors.  All rights reserved.

69	   This document is subject to BCP 78 and the IETF Trust's Legal
70	   Provisions Relating to IETF Documents
71	   (http://trustee.ietf.org/license-info) in effect on the date of
72	   publication of this document.  Please review these documents
73	   carefully, as they describe your rights and restrictions with respect
74	   to this document.  Code Components extracted from this document must
75	   include Simplified BSD License text as described in Section 4.e of
76	   the Trust Legal Provisions and are provided without warranty as
77	   described in the Simplified BSD License.

79	Table of Contents

81	   1.  Introduction
82	     1.1.  Version Control
83	   2.  Overview
84	   3.  Terminology
85	   4.  New requirements relative to RFC 2330
86	   5.  Background
87	     5.1.  TCP properties
88	     5.2.  Diagnostic Approach
89	   6.  Common Models and Parameters
90	     6.1.  Target End-to-end parameters
91	     6.2.  Common Model Calculations
92	     6.3.  Parameter Derating
93	   7.  Traffic generating techniques
94	     7.1.  Paced transmission
95	     7.2.  Constant window pseudo CBR
96	     7.3.  Scanned window pseudo CBR
97	     7.4.  Concurrent or channelized testing
98	   8.  Interpreting the Results
99	     8.1.  Test outcomes
100	     8.2.  Statistical criteria for estimating run_length
101	     8.3.  Reordering Tolerance
102	   9.  Test Preconditions
103	   10. Diagnostic Tests
104	     10.1. Basic Data Rate and Delivery Statistics Tests
105	       10.1.1.  Delivery Statistics at Paced Full Data Rate
106	       10.1.2.  Delivery Statistics at Full Data Windowed Rate
107	       10.1.3.  Background Delivery Statistics Tests
108	     10.2. Standing Queue Tests
109	       10.2.1.  Congestion Avoidance
110	       10.2.2.  Bufferbloat
111	       10.2.3.  Non excessive loss
112	       10.2.4.  Duplex Self Interference
113	     10.3. Slowstart tests
114	       10.3.1.  Full Window slowstart test
115	       10.3.2.  Slowstart AQM test
116	     10.4. Sender Rate Burst tests
117	     10.5. Combined and Implicit Tests
118	       10.5.1.  Sustained Bursts Test
119	       10.5.2.  Streaming Media
120	   11. An Example
121	   12. Validation
122	   13. Security Considerations
123	   14. Acknowledgements
124	   15. IANA Considerations
125	   16. References
126	     16.1. Normative References
127	     16.2. Informative References
128	   Appendix A.  Model Derivations
129	     A.1.  Queueless Reno
130	   Appendix B.  Complex Queueing
131	   Appendix C.  Version Control
132	   Authors' Addresses

134	1.  Introduction

136	   Model Based Metrics (MBM) rely on mathematical models to specify a
137	   targeted diagnostic suite of IP diagnostic tests, designed to verify
138	   that common transport protocols can meet a predetermined performance
139	   target over an Internet path.  Each diagnostic in the suite measures
140	   some aspect of IP delivery that is required to meet the performance
141	   target.  For example a TDS may have separate diagnostic tests to
142	   verify that there is sufficient data rate and sufficient queueing
143	   buffer space to deliver typical transport bursts, and that the
144	   background packet loss is small enough not to interfere with
145	   congestion control.  Unlike other metrics which yield measures of
146	   network properties, Model Based Metrics nominally yield pass/fail
147	   evaluations of the ability of transport protocols to meet a
148	   performance objective as needed by a user application over a
149	   particular network path.

151	   This note describes the modeling framework to derive the IP
152	   diagnostic test parameters from the target performance specified for
153	   TCP bulk transport capacity.  In the future, other Model Based
154	   Metrics may cover other applications and transports, such as VoIP
155	   over RTP.  In most cases the IP diagnostic tests can be implemented
156	   by combining existing IPPM metrics with additional controls for
157	   precomputed traffic patterns and statistical criteria for evaluating
158	   packet delivery.

160	   This approach, mapping transport performance targets to a targeted
161	   diagnostic suite (TDS) of IP diagnostic tests, solves an intrinsic
162	   problem with using TCP or other throughput maximizing protocols for
163	   measurement.  In particular all throughput maximizing protocols (and
164	   TCP congestion control in particular) cause some level of congestion
165	   in order to fill the network.  This self inflicted congestion
166	   obscures the network properties of interest and introduces non-linear
167	   equilibrium behaviors that make any resulting measurements useless as
168	   metrics because they have no predictive value for conditions or paths
169	   different from those of the measurement itself.  This problem is
170	   discussed in Section 5.

172	   A targeted suite of IP diagnostic tests does not have such
173	   difficulties.  It can be constructed to make strong statistical
174	   statements about path properties that are independent of the
175	   measurement details, such as vantage and choice of measurement
176	   points.  Model Based Metrics bridge the gap between empirical IP
177	   measurements and expected TCP performance.

179	1.1.  Version Control

181	   RFC Editor: Please remove this entire subsection prior to
182	   publication.

184	   Please send comments about this draft to ippm@ietf.org.  See
185	   http://goo.gl/02tkD for more information including: interim drafts,
186	   an up to date todo list and information on contributing.

188	   Formatted: Sat Jun 13 16:25:01 PDT 2015

190	   Changes since -04 draft:
191	   o  The introduction was heavily overhauled: split into a separate
192	      introduction and overview.
193	   o  The new shorter introduction:
194	      *  Is a problem statement;
195	      *  States that this document provides a framework;
196	      *  States that it replaces TCP measurement by IP tests;
197	      *  States that the results are pass/fail.
198	   o  Added a diagram of the framework to the overview, which
199	      introduces all of the elements of the framework.
200	   o  Renumbered sections, reducing the depth of some section numbers.
201	   o  Updated definitions to better agree with other documents:
202	      *  Reordered section 2
203	      *  Bulk [data] performance -> Bulk Transport Capacity, everywhere
204	         including the title.
205	      *  loss rate and loss probability -> loss ratio
206	      *  end-to-end path -> complete path
207	      *  [end-to-end][target] performance -> target transport
208	         performance
209	      *  load test -> capacity test

211	   This interim draft is a partial update since the WGLC, to collect an
212	   additional round of feedback on the Introduction, overview, and
213	   terminology sections.  Note that some of the prior WGLC comments are
214	   still pending.  Later sections (4 and beyond) have only been updated
215	   to track changes in the terminology section.  We intend to produce an
216	   additional draft prior to the IETF, incorporating still pending
217	   comments from the WGLC and any additional comments on the
218	   introduction and overview.

220	2.  Overview

222	   This document describes a modeling framework for deriving Targeted
223	   Diagnostic Suites to determine if an IP path can be expected to meet
224	   a predetermined target performance.  It relies on other standards
225	   documents to define important details such as packet type-p
226	   selection, sampling techniques, vantage selection, etc., which are
227	   not specified here.  We imagine Fully Specified Targeted Diagnostic
228	   Suites (FSTDS) that define all of these details.  We use TDS to
229	   refer to the subset of such a specification that is in scope for this
230	   document.

232	   Figure 1 shows the MBM modeling and measurement framework.  (See
233	   Section 3 for terminology used throughout this document.)  The target
234	   transport performance is determined by the needs of the user or
235	   application, outside the scope of this document.  For bulk transport
236	   capacity, the performance parameter of interest is the target data
237	   rate.  However, since TCP's ability to compensate for less than ideal
238	   network conditions is fundamentally affected by the Round Trip Time
239	   (RTT) and the Maximum Transmission Unit (MTU) of the complete path,
240	   these parameters must also be specified in advance using knowledge
241	   about the intended application setting.  Section 6 describes the
242	   common parameters and models used to derive a targeted diagnostic
243	   suite.

245	   The target transport performance may reflect a specific application
246	   over a real path through the Internet or an idealized application and
247	   path representing a typical user community.

249	               target transport performance
250	     (target data rate, target RTT and target MTU)
251	                            |
252	                    ________V_________
253	                    |  mathematical  |
254	                    |     models     |
255	                    |                |
256	                    ------------------
257	   Traffic parameters |            | Statistical criteria
258	                      |            |
259	               _______V____________V____Targeted_______
260	              |       |   * * *    | Diagnostic Suite  |
261	         _____|_______V____________V________________   |
262	       __|____________V____________V______________  |  |
263	       |           IP Diagnostic test             | |  |
264	       |              |            |              | |  |
265	       | _____________V__        __V____________  | |  |
266	       | |    Traffic   |        |   Delivery  |  | |  |
267	       | |  Generation  |        |  Evaluation |  | |  |
268	       | |              |        |             |  | |  |
269	       | -------v--------        ------^--------  | |  |
270	       |   |    v   Test Traffic via   ^      |   | |--
271	       |   |  -->======================>--    |   | |
272	       |   |       subpath under test         |   |-
273	       ----V----------------------------------V--- |
274	           | |  |                             | |  |
275	           V V  V                             V V  V
276	       fail/inconclusive            pass/fail/inconclusive

278	   Overall Modeling Framework

280	                                 Figure 1

282	   Section 5 describes some key aspects of TCP behavior and what they
283	   imply about the requirements for IP packet delivery.  Most of the IP
284	   diagnostic tests needed to confirm that the path meets these
285	   properties can be built on existing IPPM metrics, with the addition
286	   of statistical criteria for evaluating packet delivery and in some
287	   cases new mechanisms to implement precomputed traffic patterns.  One
288	   group of tests, the standing queue tests described in
289	   Section 10.2, don't correspond to existing IPPM metrics, but suitable
290	   metrics can be patterned after existing tools.

292	   Mathematical models are used to design traffic patterns that mimic
293	   TCP or other bulk transport protocols operating at the target data
294	   rate, MTU and RTT over a full range of conditions, including flows
295	   that are bursty at multiple time scales.  The traffic patterns are
296	   generated based on the three target parameters of the complete path
297	   and independent of the properties of individual subpaths as described
298	   in Section 7.  As much as possible the measurement traffic is
299	   generated deterministically to minimize the extent to which test
300	   methodology, measurement points, measurement vantage or path
301	   partitioning affect the details of the measurement traffic.

303	   Section 8 describes packet delivery statistics and methods to test
304	   them against the bounds provided by the mathematical models.  Since
305	   these statistics are typically aggregated from all subpaths of the
306	   complete path, in situ testing requires that the end-to-end
307	   statistical bounds be apportioned as a separate bound for each
308	   subpath.  Links that are expected to be bottlenecks are expected to
309	   contribute a larger fraction of the total packet loss.  In
310	   compensation, other links have to be constrained to contribute less
311	   packet loss.  The criterion for passing each test of a TDS is an
312	   apportioned share of the total bound determined by the mathematical
313	   model from the target transport performance.
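
	   As a rough illustration of apportioning, the following sketch
	   divides an end-to-end loss ratio bound among subpaths in
	   proportion to weights, giving expected bottlenecks a larger
	   share.  The function names, subpath names, weights and the
	   example bound are assumptions made for illustration only; this
	   document does not prescribe how the shares are chosen.  The
	   accumulation check assumes that losses on different subpaths
	   are independent.

```python
def apportion_loss_budget(total_loss_ratio, weights):
    """Split an end-to-end loss ratio bound into per-subpath bounds.

    weights maps a (hypothetical) subpath name to its relative share;
    expected bottlenecks would be given larger weights.
    """
    if not 0.0 <= total_loss_ratio <= 1.0:
        raise ValueError("loss ratio must be between 0 and 1")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and positive")
    total_weight = sum(weights.values())
    return {name: total_loss_ratio * w / total_weight
            for name, w in weights.items()}


def accumulated_loss_ratio(per_subpath):
    """Loss ratio of subpaths in series, assuming independent losses."""
    delivered = 1.0
    for loss_ratio in per_subpath.values():
        delivered *= 1.0 - loss_ratio
    return 1.0 - delivered


# Illustrative numbers only: the bottleneck receives 70% of the budget.
budget = apportion_loss_budget(
    1e-4, {"access": 1, "interconnect": 2, "bottleneck": 7})
assert accumulated_loss_ratio(budget) <= 1e-4
```

	   Because the per-subpath shares sum to the end-to-end bound, the
	   accumulated loss ratio of the subpaths in series cannot exceed
	   that bound.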

315	   Section 10 describes the suite of individual tests needed to verify
316	   all of the required IP delivery properties.  A subpath passes if and
317	   only if all of the individual IP diagnostic tests pass.  Any subpath
318	   that fails any test indicates that some users are likely to fail to
319	   attain their target transport performance under some conditions.  In
320	   addition to passing or failing, a test can be deemed inconclusive
321	   for a number of reasons including: the precomputed traffic pattern
322	   was not accurately generated; the measurement results were not
323	   statistically significant; and others such as failing to meet some
324	   required test preconditions.  If all tests pass, except that some
325	   are inconclusive, then the entire suite is deemed inconclusive.
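
	   The combination rule described above can be sketched as
	   follows.  The names Outcome and suite_outcome are hypothetical.
	   Treating any failure as taking precedence over inconclusive
	   results, and treating an empty suite as inconclusive, are
	   assumptions of this sketch rather than statements from the
	   text.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    INCONCLUSIVE = "inconclusive"


def suite_outcome(test_outcomes):
    """Combine individual IP diagnostic test outcomes for one subpath."""
    outcomes = list(test_outcomes)
    if not outcomes:
        # Assumption: with no tests run, nothing can be concluded.
        return Outcome.INCONCLUSIVE
    if any(o is Outcome.FAIL for o in outcomes):
        # Assumption: a single failure outweighs inconclusive results.
        return Outcome.FAIL
    if any(o is Outcome.INCONCLUSIVE for o in outcomes):
        # All remaining tests passed, but some could not be evaluated.
        return Outcome.INCONCLUSIVE
    # Every individual test passed.
    return Outcome.PASS
```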

327	   Since there is some uncertainty in this process, Section 12
328	   describes a validation procedure to diagnose and minimize false
329	   positive and false negative results.

331	   In Section 11 we present an example TDS that might be representative
332	   of HD video, and illustrate how Model Based Metrics can be used to
333	   address difficult measurement situations, such as confirming that
334	   intercarrier exchanges have sufficient performance and capacity to
335	   deliver HD video between ISPs.

337	   A TDS includes the target parameters, documentation of the models and
338	   assumptions used to derive the IP diagnostic test parameters,
339	   specifications for the traffic and delivery statistics for the tests
340	   themselves, and a description of a test setup that can be used to
341	   validate the tests and models.

343	3.  Terminology

345	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
346	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
347	   document are to be interpreted as described in [RFC2119].

349	   General Terminology:

351	   Target:  A general term for any parameter specified by or derived
352	      from the user's application or transport performance requirements.
353	   Complete Path:  From RFC 5835
354	   target transport performance:  Application or transport performance
355	      goals for the complete path.  For bulk transport capacity defined
356	      in this note the target transport performance includes the target
357	      data rate, target RTT and target MTU as described below.
358	   Target Data Rate:  The specified application data rate required for
359	      an application's proper operation.  This is typically the
360	      performance goal as needed by the ultimate user.
361	   Target RTT (Round Trip Time):  The baseline (minimum) RTT of the
362	      longest complete path over which the application expects to be
363	      able to meet the target performance.  The ability of TCP and other
364	      transport protocols to compensate for path problems is generally
365	      proportional to the number of round trips per second.  The Target
366	      RTT determines both key parameters of the traffic patterns (e.g.
367	      burst sizes) and the thresholds on acceptable traffic statistics.
368	      The Target RTT must be specified considering authentic packet
369	      sizes: MTU sized packets on the forward path, ACK sized packets
370	      (typically header_overhead) on the return path.
371	   Target MTU (Maximum Transmission Unit):  The maximum MTU supported by
372	      the complete path over which the application expects to meet
373	      the target performance.  Assume 1500 Byte MTU unless otherwise
374	      specified.  If some subpath forces a smaller MTU, then it becomes
375	      the target MTU, and all model calculations and subpath tests must
376	      use the same smaller MTU.
377	   Targeted Diagnostic Suite (TDS):  A set of IP Diagnostics designed to
378	      determine if an otherwise ideal complete path containing the
379	      subpath under test can sustain flows at a specific
380	      target_data_rate using target_MTU sized packets when the RTT of
381	      the complete path is target_RTT.
382	   Fully Specified Targeted Diagnostic Suite:  A TDS together with
383	      additional specifications such as "type-p", etc., which are out of
384	      scope for this document, but need to be drawn from other standards
385	      documents.
386	   loss ratio:  See "Packet Loss Ratio" in [RFC2680bis]
387	   apportioned:  To divide and allocate, for example budgeting packet
388	      loss ratio across multiple subpaths such that they will accumulate
389	      to less than a specified end-to-end loss ratio.
390	   open loop:  A control theory term used to describe a class of
391	      techniques where systems that naturally exhibit circular
392	      dependencies can be analyzed by suppressing some of the
393	      dependencies, such that the resulting dependency graph is acyclic.
394	   Bulk Transport Capacity:  Bulk Transport Capacity Metrics evaluate an
395	      Internet path's ability to carry bulk data, such as large files,
396	      streaming (non-real time) video, and under some conditions, web
397	      images and other content.  Prior efforts to define BTC metrics
398	      have been based on [RFC3148], which never succeeded due to some
399	      overlooked requirements described in Section 4 and other
400	      problems.  The metrics presented in this document reflect an
401	      entirely different approach to the problem outlined in [RFC3148].
402	   traffic patterns:  The temporal patterns or statistics of traffic
403	      generated by applications over transport protocols such as TCP.
404	      There are several mechanisms that cause bursts at various time
405	      scales.  Our goal here is to mimic the range of common patterns
406	      (burst sizes and rates, etc.), without tying our applicability to
407	      specific applications, implementations or technologies, which are
408	      sure to become stale.
409	   delivery Statistics:  Raw or summary statistics about packet delivery
410	      properties of the IP layer including packet losses, ECN marks,
411	      reordering, or any other properties that may be germane to
412	      transport performance.
413	   IP performance tests:  Measurements or diagnostic tests to determine
414	      delivery statistics.

416	   Terminology about paths, etc.  See [RFC2330] and [RFC7398].

418	   [data] sender:  Host sending data and receiving ACKs.
419	   [data] receiver:  Host receiving data and sending ACKs.
420	   subpath:  A portion of the full path.  Note that there is no
421	      requirement that subpaths be non-overlapping.
422	   Measurement Point:  Measurement points as described in [RFC7398].
423	   test path:  A path between two measurement points that includes a
424	      subpath of the complete path under test, and could include
425	      infrastructure between the measurement points and the subpath.
426	   [Dominant] Bottleneck:  The Bottleneck that generally dominates
427	      traffic statistics for the entire path.  It typically determines a
428	      flow's self clock timing, packet loss and ECN marking rate.  See
429	      Section 5.1.
430	   front path:  The subpath from the data sender to the dominant
431	      bottleneck.
433	   back path:  The subpath from the dominant bottleneck to the receiver.
434	   return path:  The path taken by the ACKs from the data receiver to
435	      the data sender.
436	   cross traffic:  Other, potentially interfering, traffic competing for
437	      network resources (bandwidth and/or queue capacity).

439	   Properties determined by the complete path and application.  They are
440	   described in more detail in Section 6.1.

442	   Application Data Rate:  General term for the data rate as seen by the
443	      application above the transport layer.  This is the payload data
444	      rate, and explicitly excludes transport and lower level headers
445	      (TCP/IP or other protocols), retransmissions and other overhead
446	      that is not part of the total quantity of data delivered to the
447	      application.
448	   Link Data Rate:  General term for the data rate as seen by the link
449	      or lower layers.  The link data rate includes transport and IP
450	      headers, retransmissions and other transport layer overhead.  This
451	      document is agnostic as to whether the link data rate includes or
452	      excludes framing, MAC, or other lower layer overheads, except that
453	      they must be treated uniformly.
454	   Effective Bottleneck Data Rate:  This is the bottleneck data rate
455	      implied by the returning ACKs, by looking at how much application
456	      data the ACK stream reports delivered per unit time.  If the path
457	      is thinning ACKs or batching ACKs the effective bottleneck rate
458	      can be much higher than the average link rate.  See Section 5.1
459	      and Appendix B for more details.
460	   [sender | interface] rate:  The burst data rate, constrained by the
461	      data sender's interface.  Today 1 or 10 Gb/s are typical.
462	   Header_overhead:  The IP and TCP header sizes, which are the portion
463	      of each MTU not available for carrying application payload.
464	      Without loss of generality this is assumed to be the size for
465	      returning acknowledgements (ACKs).  For TCP, the Maximum Segment
466	      Size (MSS) is the Target MTU minus the header_overhead.

468	   Basic parameters common to models and subpath tests are defined here
469	   and described in more detail in Section 6.2.  Note that these are
470	   mixed between application transport performance (excludes headers)
471	   and link IP performance (includes headers).

473	   Window:  The total quantity of data plus the data represented by ACKs
474	      circulating in the network is referred to as the window.  See
475	      Section 5.1.
476	   pipe size:  A general term for the number of packets needed in flight
477	      (the window size) to exactly fill some network path or subpath.
478	      It corresponds to the window size which maximizes network power,
479	      the observed data rate divided by the observed RTT.  Often used
480	      with additional qualifiers to specify which path, etc.
482	   target_pipe_size:  The number of packets in flight (the window size)
483	      needed to exactly meet the target rate, with a single stream and
484	      no cross traffic for the specified application target data rate,
485	      RTT, and MTU.  It is the amount of circulating data required to
486	      meet the target data rate, and implies the scale of the bursts
487	      that the network might experience.
488	   run length:  A general term for the observed, measured, or specified
489	      number of packets that are (to be) delivered between losses or ECN
490	      marks.  Nominally one over the sum of the loss and ECN marking
491	      probabilities, if they are independently and identically
492	      distributed.
493	   target_run_length:  The target_run_length is an estimate of the
494	      minimum number of non-congestion marked packets needed between
495	      losses or ECN marks necessary to attain the target_data_rate over
496	      a path with the specified target_RTT and target_MTU, as computed
497	      by a mathematical model of TCP congestion control.  A reference
498	      calculation is in Section 6.2 and alternatives are in Appendix A.
499	   reference target_run_length:  target_run_length computed precisely by
500	      the method in Section 6.2.  This is likely to be slightly more
501	      conservative than required by modern TCP algorithms.
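
	   The nominal run length defined above can be written out as a
	   short sketch.  The function name and argument names are
	   hypothetical, and the calculation assumes, as the definition
	   does, that losses and ECN marks are independently and
	   identically distributed.

```python
import math


def nominal_run_length(loss_probability, ecn_mark_probability=0.0):
    """Nominal packets delivered between losses or ECN marks.

    Computed as one over the sum of the loss and ECN marking
    probabilities.
    """
    if loss_probability < 0 or ecn_mark_probability < 0:
        raise ValueError("probabilities must be non-negative")
    total = loss_probability + ecn_mark_probability
    if total == 0:
        # No losses or marks at all: the run length is unbounded.
        return math.inf
    return 1.0 / total


def meets_target(run_length, target_run_length):
    """True if the run length is at least the target_run_length."""
    return run_length >= target_run_length
```

	   For instance, a loss probability of 0.0005 combined with an ECN
	   marking probability of 0.0005 gives a nominal run length of
	   about 1000 packets, which would then be compared against the
	   target_run_length produced by the model.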

503	   Ancillary parameters used for some tests

505	   derating:  Under some conditions the standard models are too
506	      conservative.  The modeling framework permits some latitude in
507	      relaxing or "derating" some test parameters as described in
508	      Section 6.3 in exchange for more stringent TDS validation
509	      procedures, described in Section 12.
510	   subpath_data_rate:  The maximum data rate supported by a subpath.
511	      This typically includes TCP/IP overhead, including all headers and
512	      retransmits, etc.
513	   test_path_RTT:  The RTT observed between two measurement points using
514	      packet sizes that are consistent with the transport protocol.
515	      Generally MTU sized packets on the forward path, header_overhead
516	      sized packets on the return path.
517	   test_path_pipe:  The amount of data necessary to fill a test path.
518	      Nominally the test path RTT times the subpath_data_rate.
519	   test_window:  The window necessary to meet the target_data_rate
520	      over a subpath.  Typically test_window=target_data_rate*
521	      test_path_RTT/(target_MTU - header_overhead).

523	   Tests can be grouped according to their applicability.

525	   Capacity tests:  determine if a network subpath has sufficient
526	      capacity to deliver the target performance.  As long as the test
527	      traffic is within the proper envelope for the target performance,
528	      the average packet losses or ECN marks must be below the threshold
529	      computed by the model.  As such, capacity tests reflect parameters
530	      that can transition from passing to failing as a consequence of
531	      cross traffic, additional presented load or the actions of other
532	      network users.  By definition, capacity tests also consume
533	      significant network resources (data capacity and/or buffer space),
534	      and the test schedules must be balanced by their cost.
535	   Monitoring tests:  are designed to capture the most important aspects
536	      of a capacity test, but without presenting excessive ongoing load
537	      themselves.  As such they may miss some details of the network's
538	      performance, but can serve as a useful reduced-cost proxy for a
539	      capacity test, for example to support ongoing monitoring.
540	   Engineering tests:  evaluate how network algorithms (such as AQM and
541	      channel allocation) interact with TCP-style self clocked protocols
542	      and adaptive congestion control based on packet loss and ECN
543	      marks.  These tests are likely to have complicated interactions
544	      with cross traffic and under some conditions can be inversely
545	      sensitive to load.  For example a test to verify that an AQM
546	      algorithm causes ECN marks or packet drops early enough to limit
547	      queue occupancy may experience a false pass result in the presence
548	      of cross traffic.  It is important that engineering tests be
549	      performed under a wide range of conditions, including both in situ
550	      and bench testing, and over a wide variety of load conditions.
551	      Ongoing monitoring is less likely to be useful for engineering
552	      tests, although sparse in situ testing might be appropriate.

554	4.  New requirements relative to RFC 2330

556	   Model Based Metrics are designed to fulfill some additional
557	   requirements that were not recognized at the time RFC 2330 was written
558	   [RFC2330].  These missing requirements may have significantly
559	   contributed to policy difficulties in the IP measurement space.  Some
560	   additional requirements are:
561	   o  IP metrics must be actionable by the ISP - they have to be
562	      interpreted in terms of behaviors or properties at the IP or lower
563	      layers, that an ISP can test, repair and verify.
564	   o  Metrics should be spatially composable, such that measures of
565	      concatenated paths should be predictable from subpaths.
566	   o  Metrics must be vantage point invariant over a significant range
567	      of measurement point choices, including off path measurement
568	      points.  The only requirements on MP selection should be that the
569	      portion of the test path that is not under test between the MP and
570	      the part that is under test is effectively ideal, or is non ideal
571	      in ways that can be calibrated out of the measurements, and the
572	      test RTT between the MPs is below some reasonable bound.
573	   o  Metric measurements must be repeatable by multiple parties with no
574	      specialized access to MPs or diagnostic infrastructure.  It must
575	      be possible for different parties to make the same measurement and
576	      observe the same results.  In particular it is specifically
577	      important that both a consumer (or their delegate) and ISP be able
578	      to perform the same measurement and get the same result.  Note
579	      that vantage independence is key to this requirement.

581	5.  Background

583	   At the time the IPPM WG was chartered, sound Bulk Transport Capacity
584	   measurement was known to be well beyond our capabilities.  Even at
585	   the time [RFC3148] was written we knew that we didn't fully
586	   understand the problem.  Now, in hindsight, we understand why BTC is
587	   such a hard problem:
588	   o  TCP is a control system with circular dependencies - everything
589	      affects performance, including components that are explicitly not
590	      part of the test.
591	   o  Congestion control is an equilibrium process, such that transport
592	      protocols change the network (raise the loss ratio and/or RTT) to
593	      conform to their behavior.  By design TCP congestion control keeps
594	      raising the data rate until the network gives some indication that
595	      it is full by delaying, dropping or ECN marking packets.
596	   o  TCP's ability to compensate for network flaws is directly
597	      proportional to the number of roundtrips per second (i.e.
598	      inversely proportional to the RTT).  As a consequence a flawed
599	      link may pass a short RTT local test even though it fails when the
600	      path is extended by a perfect network to some larger RTT.
601	   o  TCP has a meta Heisenberg problem - Measurement and cross traffic
602	      interact in unknown and ill defined ways.  The situation is
603	      actually worse than the traditional physics problem where you can
604	      at least estimate bounds on the relative momentum of the
605	      measurement and measured particles.  For network measurement you
606	      can not in general determine the relative "mass" of the
607	      measurement traffic and cross traffic, so you can not even gauge
608	      the relative magnitude of their effects on each other.

610	   These properties are a consequence of the equilibrium behavior
611	   intrinsic to how all throughput optimizing protocols interact with
612	   the Internet.  The protocols rely on control systems based on
613	   multiple network estimators to regulate the quantity of data traffic
614	   sent into the network.  The data traffic in turn alters the network
615	   and the properties observed by the estimators, such that there are
616	   circular dependencies between every component and every property.
617	   Since some of these properties are nonlinear, the entire system is
618	   nonlinear, and any change anywhere causes difficult to predict
619	   changes in every parameter.

621	   Model Based Metrics overcome these problems by forcing the
622	   measurement system to be open loop: the delivery statistics (akin to
623	   the network estimators) do not affect the traffic or traffic patterns
624	   (bursts), which are computed on the basis of the target performance.
625	   In order for a network to pass, the resulting delivery statistics and
626	   corresponding network estimators have to be such that they would not
627	   cause the control systems to slow the traffic below the target rate.

629	5.1.  TCP properties

631	   TCP and SCTP are self clocked protocols.  The dominant steady state
632	   behavior is to have an approximately fixed quantity of data and
633	   acknowledgements (ACKs) circulating in the network.  The receiver
634	   reports arriving data by returning ACKs to the data sender, the data
635	   sender typically responds by sending exactly the same quantity of
636	   data back into the network.  The total quantity of data plus the data
637	   represented by ACKs circulating in the network is referred to as the
638	   window.  The mandatory congestion control algorithms incrementally
639	   adjust the window by sending slightly more or less data in response
640	   to each ACK.  The fundamentally important property of this system is
641	   that it is entirely self clocked: The data transmissions are a
642	   reflection of the ACKs that were delivered by the network, the ACKs
643	   are a reflection of the data arriving from the network.

645	   A number of phenomena can cause bursts of data, even in idealized
646	   networks that are modeled as simple queueing systems.

648	   During slowstart the data rate is doubled on each RTT by sending
649	   twice as much data as was delivered to the receiver on the prior RTT.
650	   For slowstart to be able to fill such a network the network must be
651	   able to tolerate slowstart bursts up to the full pipe size inflated
652	   by the anticipated window reduction on the first loss or ECN mark.
653	   For example, with classic Reno congestion control, an optimal
654	   slowstart has to end with a burst that is twice the bottleneck rate
655	   for exactly one RTT in duration.  This burst causes a queue which is
656	   exactly equal to the pipe size (i.e. the window is exactly twice the
657	   pipe size) so when the window is halved in response to the first
658	   loss, the new window will be exactly the pipe size.

660	   Note that if the bottleneck data rate is significantly slower than
661	   the rest of the path, the slowstart bursts will not cause significant
662	   queues anywhere else along the path; they primarily exercise the
663	   queue at the dominant bottleneck.

665	   Other sources of bursts include application pauses and channel
666	   allocation mechanisms.  Appendix B describes the treatment of channel
667	   allocation systems.  If the application pauses (stops reading or
668	   writing data) for some fraction of one RTT, state-of-the-art TCP
669	   catches up to the earlier window size by sending a burst of data at
670	   the full sender interface rate.  To fill such a network with a
671	   realistic application, the network has to be able to tolerate
672	   interface rate bursts from the data sender large enough to cover
673	   application pauses.

675	   Although the interface rate bursts are typically smaller than the
676	   last burst of a slowstart, they are at a higher data rate so they
677	   potentially exercise queues at arbitrary points along the front path
678	   from the data sender up to and including the queue at the dominant
679	   bottleneck.  There is no model for how frequent or what sizes of
680	   sender rate bursts should be tolerated.

682	   To verify that a path can meet a performance target, it is necessary
683	   to independently confirm that the path can tolerate bursts in the
684	   dimensions that can be caused by these mechanisms.  Three cases are
685	   likely to be sufficient:

687	   o  Slowstart bursts sufficient to get connections started properly.
688	   o  Frequent sender interface rate bursts that are small enough that
689	      they can be assumed not to significantly affect delivery
690	      statistics.  (Implicitly derated by selecting the burst size).
691	   o  Infrequent sender interface rate full target_pipe_size bursts that
692	      do affect the delivery statistics.  (Target_run_length may be
693	      derated).

695	5.2.  Diagnostic Approach

697	   The MBM approach is to open loop TCP by precomputing traffic patterns
698	   that are typically generated by TCP operating at the given target
699	   parameters, and evaluating delivery statistics (packet loss, ECN
700	   marks and delay).  In this approach the measurement software
701	   explicitly controls the data rate, transmission pattern or cwnd
702	   (TCP's primary congestion control state variables) to create
703	   repeatable traffic patterns that mimic TCP behavior but are
704	   independent of the actual behavior of the subpath under test.  These
705	   patterns are manipulated to probe the network to verify that it can
706	   deliver all of the traffic patterns that a transport protocol is
707	   likely to generate under normal operation at the target rate and RTT.

709	   By opening the protocol control loops, we remove most sources of
710	   temporal and spatial correlation in the traffic delivery statistics,
711	   such that each subpath's contribution to the end-to-end delivery
712	   statistics can be assumed to be independent and stationary.  (The
713	   delivery statistics depend on the fine structure of the data
714	   transmissions, but not on long time scale state embedded in the
715	   sender, receiver or other network components.)  Therefore each
716	   subpath's contribution to the end-to-end delivery statistics can be
717	   assumed to be independent, and spatial composition techniques such as
718	   [RFC5835] and [RFC6049] apply.

720	   In typical networks, the dominant bottleneck contributes the majority
721	   of the packet loss and ECN marks.  Often the rest of the path makes an
722	   insignificant contribution to these properties.  A TDS should
723	   apportion the end-to-end budget for the specified parameters
724	   (primarily packet loss and ECN marks) to each subpath or group of
725	   subpaths.  For example the dominant bottleneck may be permitted to
726	   contribute 90% of the loss budget, while the rest of the path is only
727	   permitted to contribute 10%.
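
The 90%/10% apportionment above can be illustrated with a short
calculation.  The following Python sketch is not part of the
framework: the function names are hypothetical, the budget is taken
to be the per-packet loss or mark probability (one over the
end-to-end run length), and the composition step assumes that
subpath events are independent.

```python
def apportion_run_length(target_run_length, shares):
    """Split an end-to-end loss/mark budget across subpaths.

    shares maps a subpath name to its fraction of the budget; the
    result maps each name to the run length that subpath must meet.
    """
    if any(s <= 0 for s in shares.values()):
        raise ValueError("every share must be positive")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    budget = 1.0 / target_run_length  # end-to-end event probability
    return {name: 1.0 / (budget * s) for name, s in shares.items()}


def composed_run_length(subpath_run_lengths):
    """End-to-end run length, assuming independent subpath events."""
    p_clean = 1.0
    for run_length in subpath_run_lengths:
        p_clean *= 1.0 - 1.0 / run_length
    return 1.0 / (1.0 - p_clean)


if __name__ == "__main__":
    parts = apportion_run_length(363, {"bottleneck": 0.9, "rest": 0.1})
    print(parts)
    print(composed_run_length(parts.values()))
```

Under these assumptions the bottleneck must deliver a run length of
about 403 packets and the rest of the path about 3630 packets, and
the composed end-to-end run length is no smaller than the original
target.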

729	   A TDS or FSTDS MUST apportion all relevant packet delivery statistics
730	   between successive subpaths, such that the spatial composition of the
731	   apportioned metrics will yield end-to-end delivery statistics which
732	   are within the bounds determined by the models.

734	   A network is expected to be able to sustain a Bulk TCP flow of a
735	   given data rate, MTU and RTT when all of the following conditions are
736	   met:
737	   1.  The raw link rate is higher than the target data rate.  See
738	       Section 10.1 or any number of data rate tests outside of MBM.
739	   2.  The observed packet delivery statistics are better than required
740	       by a suitable TCP performance model (e.g. fewer losses or ECN
741	       marks).  See Section 10.1 or any number of low rate packet loss
742	       tests outside of MBM.
743	   3.  There is sufficient buffering at the dominant bottleneck to
744	       absorb a slowstart rate burst large enough to get the flow out of
745	       slowstart at a suitable window size.  See Section 10.3.
746	   4.  There is sufficient buffering in the front path to absorb and
747	       smooth sender interface rate bursts at all scales that are likely
748	       to be generated by the application, any channel arbitration in
749	       the ACK path or any other mechanisms.  See Section 10.4.
750	   5.  When there is a standing queue at a bottleneck for a shared media
751	       subpath (e.g. half duplex), there are suitable bounds on how the
752	       data and ACKs interact, for example due to the channel
753	       arbitration mechanism.  See Section 10.2.4.
754	   6.  When there is a slowly rising standing queue at the bottleneck
755	       the onset of packet loss has to be at an appropriate point (time
756	       or queue depth) and progressive.  See Section 10.2.

758	   Note that conditions 1 through 4 require capacity tests for
759	   confirmation, and thus need to be monitored on an ongoing basis.
760	   Conditions 5 and 6 require engineering tests.  They won't generally
761	   fail due to load, but may fail in the field due to configuration
762	   errors, etc. and should be spot checked.

764	   We are developing a tool that can perform many of the tests described
765	   here [MBMSource].

767	6.  Common Models and Parameters

769	6.1.  Target End-to-end parameters

771	   The target end-to-end parameters are the target data rate, target RTT
772	   and target MTU as defined in Section 3.  These parameters are
773	   determined by the needs of the application or the ultimate end user
774	   and the complete Internet path over which the application is expected
775	   to operate.  The target parameters are in units that make sense to
776	   upper layers: payload bytes delivered to the application, above TCP.
777	   They exclude overheads associated with TCP and IP headers,
778	   retransmits and other protocols (e.g.  DNS).

780	   Other end-to-end parameters defined in Section 3 include the
781	   effective bottleneck data rate, the sender interface data rate and
782	   the TCP/IP header sizes (overhead).

784	   The target data rate must be smaller than all link data rates by
785	   enough headroom to carry the transport protocol overhead, explicitly
786	   including retransmissions and an allowance for fluctuations in the
787	   actual data rate, needed to meet the specified average rate.
788	   Specifying a target rate with insufficient headroom is likely to
789	   result in brittle measurements having little predictive value.

791	   Note that the target parameters can be specified for a hypothetical
792	   path, for example to construct a TDS designed for bench testing in the
793	   absence of a real application, or for a real physical test for in
794	   situ testing of production infrastructure.

796	   The number of concurrent connections is explicitly not a parameter to
797	   this model.  If a subpath requires multiple connections in order to
798	   meet the specified performance, that must be stated explicitly and
799	   the procedure described in Section 7.4 applies.

801	6.2.  Common Model Calculations

803	   The target transport performance is used to derive the
804	   target_pipe_size and the reference target_run_length.

806	   The target_pipe_size is the average window size in packets needed to
807	   meet the target rate, for the specified target RTT and MTU.  It is
808	   given by:

810	   target_pipe_size = ceiling( target_rate * target_RTT / ( target_MTU -
811	   header_overhead ) )

813	   Target_run_length is an estimate of the minimum required number of
814	   unmarked packets that must be delivered between losses or ECN marks,
815	   as computed by a mathematical model of TCP congestion control.  The
816	   derivation here follows [MSMO97], and by design is quite
817	   conservative.  The alternate models described in Appendix A generally
818	   yield smaller run_lengths (higher acceptable loss or ECN marking
819	   rates), but may not apply in all situations.  A FSTDS that uses an
820	   alternate model MUST compare it to the reference target_run_length
821	   computed here.

823	   Reference target_run_length is derived as follows: assume the
824	   subpath_data_rate is infinitesimally larger than the target_data_rate
825	   plus the required header_overhead.  Then target_pipe_size also
826	   predicts the onset of queueing.  A larger window will cause a
827	   standing queue at the bottleneck.

829	   Assume the transport protocol is using standard Reno style Additive
830	   Increase, Multiplicative Decrease (AIMD) congestion control [RFC5681]
831	   (but not Appropriate Byte Counting [RFC3465]) and the receiver is
832	   using standard delayed ACKs.  Reno increases the window by one packet
833	   every pipe_size worth of ACKs.  With delayed ACKs this takes 2 Round
834	   Trip Times per increase.  To exactly fill the pipe, losses must be no
835	   closer than when the peak of the AIMD sawtooth reached exactly twice
836	   the target_pipe_size, otherwise the multiplicative window reduction
837	   triggered by the loss would cause the network to be underfilled.
838	   Following [MSMO97] the number of packets between losses must be the
839	   area under the AIMD sawtooth.  They must be no more frequent than
840	   every 1 in ((3/2)*target_pipe_size)*(2*target_pipe_size) packets,
841	   which simplifies to:

843	   target_run_length = 3*(target_pipe_size^2)
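
As a worked example of the two calculations above, the following
Python sketch computes target_pipe_size and the reference
target_run_length.  The units (target rate in bits per second, RTT in
seconds, MTU and header_overhead in bytes) and the sample values are
assumptions made for illustration only.

```python
import math


def target_pipe_size(target_rate_bps, target_rtt_s, target_mtu,
                     header_overhead):
    """Window, in packets, needed to meet the target rate.

    Assumed units: rate in bits per second, RTT in seconds, MTU and
    header_overhead in bytes.
    """
    payload_bytes = target_mtu - header_overhead
    return math.ceil((target_rate_bps / 8) * target_rtt_s / payload_bytes)


def reference_target_run_length(pipe_size):
    """Area under one AIMD sawtooth with delayed ACKs:
    (3/2 * pipe_size) packets per RTT over (2 * pipe_size) RTTs."""
    return 3 * pipe_size ** 2


if __name__ == "__main__":
    pipe = target_pipe_size(2.5e6, 0.050, 1500, 64)
    print(pipe, reference_target_run_length(pipe))
```

With the assumed values of 2.5 Mb/s, a 50 ms RTT, a 1500 byte MTU and
64 bytes of header overhead, this gives a target_pipe_size of 11
packets and a reference target_run_length of 363 packets.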

845	   Note that this calculation is very conservative and is based on a
846	   number of assumptions that may not apply.  Appendix A discusses these
847	   assumptions and provides some alternative models.  If a different
848	   model is used, a fully specified TDS or FSTDS MUST document the
849	   actual method for computing target_run_length and the ratio between
850	   the alternate target_run_length and the reference target_run_length
851	   calculated above, along with a discussion of the rationale for the
852	   underlying assumptions.

854	   These two parameters, target_pipe_size and target_run_length,
855	   directly imply most of the individual parameters for the tests in
856	   Section 10.

858	6.3.  Parameter Derating

860	   Since some aspects of the models are very conservative, the MBM
861	   framework permits some latitude in derating test parameters.  Rather
862	   than trying to formalize more complicated models we permit some test
863	   parameters to be relaxed as long as they meet some additional
864	   procedural constraints:
865	   o  The TDS or FSTDS MUST document and justify the actual method used
866	      to compute the derated metric parameters.
867	   o  The validation procedures described in Section 12 must be used to
868	      demonstrate the feasibility of meeting the performance targets
869	      with infrastructure that infinitesimally passes the derated tests.
870	   o  The validation process itself must be documented in such a way
871	      that other researchers can duplicate the validation experiments.

873	   Except as noted, all tests below assume no derating.  Tests where
874	   there is not currently a well established model for the required
875	   parameters explicitly include derating as a way to indicate
876	   flexibility in the parameters.

878	7.  Traffic generating techniques

880	7.1.  Paced transmission

882	   Paced (burst) transmissions: send bursts of data on a timer to meet a
883	   particular target rate and pattern.  In all cases the specified data
884	   rate can be either the application or link rate.  Header overheads
885	   must be included in the calculations as appropriate.
886	   Packet Headway:  Time interval between packets, specified from the
887	      start of one to the start of the next, e.g. if packets are sent
888	      with a 1 ms headway, there will be exactly 1000 packets per
889	      second.
890	   Burst Headway:  Time interval between bursts, specified from the
891	      start of the first packet of one burst to the start of the first
892	      packet of the next burst, e.g. if 4 packet bursts are sent with a
893	      1 ms headway, there will be exactly 4000 packets per second.
894	   Paced single packets:  Send individual packets at the specified rate
895	      or packet headway. [@@@@ Site RFC 3432, update definition?]
896	   Paced Bursts:  Send sender interface rate bursts on a timer.  Specify
897	      any 3 of: average rate, packet size, burst size (number of
898	      packets) and burst headway (burst start to start).  The packet
899	      headway within a burst is typically assumed to be the minimum
900	      supported by the tester's interface, i.e. bursts are normally
901	      sent as back-to-back packets.  The packet headway within the
902	      bursts can be explicitly specified.
903	   Slowstart bursts:  Send 4 packet paced bursts at an average data rate
904	      equal to twice the effective bottleneck link rate (but not more than
905	      the sender interface rate).  This corresponds to the average rate
906	      during a TCP slowstart when Appropriate Byte Counting [RFC3465] is
907	      present or delayed ack is disabled.  Note that if the effective
908	      bottleneck link rate is more than half of the sender interface
909	      rate, slowstart rate bursts become sender interface rate bursts.

911	      [@@@@ Add figure --MM].
912	   Repeated Slowstart bursts:  Slowstart bursts are typically part of
913	      a larger scale pattern of repeated bursts, such as sending
914	      target_pipe_size packets as slowstart bursts on a target_RTT
915	      headway (burst start to burst start).  Such a stream has three
916	      different average rates, depending on the averaging interval.  At
917	      the finest time scale the average rate is the same as the sender
918	      interface rate, at a medium scale the average rate is twice the
919	      effective bottleneck link rate and at the longest time scales the
920	      average rate is equal to the target data rate.
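
The headway arithmetic used in the definitions above can be sketched
in Python.  The function names and units (seconds, bytes, bits per
second) are assumptions for illustration; they are not taken from any
existing tool.

```python
def packets_per_second(burst_size, burst_headway_s):
    """Average packet rate for bursts of burst_size packets sent on
    a fixed burst headway (burst start to burst start)."""
    return burst_size / burst_headway_s


def burst_headway_for_rate(rate_bps, packet_size_bytes, burst_size):
    """Burst headway, in seconds, that yields the given average rate.

    Illustrates choosing three of average rate, packet size, burst
    size and burst headway, and deriving the fourth.
    """
    return burst_size * packet_size_bytes * 8 / rate_bps


def slowstart_burst_rate(bottleneck_rate_bps, interface_rate_bps):
    """Twice the effective bottleneck link rate, but not more than
    the sender interface rate."""
    return min(2 * bottleneck_rate_bps, interface_rate_bps)


if __name__ == "__main__":
    print(packets_per_second(1, 0.001))   # single packets, 1 ms headway
    print(packets_per_second(4, 0.001))   # 4 packet bursts, 1 ms headway
    print(burst_headway_for_rate(12e6, 1500, 4))
    print(slowstart_burst_rate(600e6, 1e9))
```

The last call shows the case noted above: when the effective
bottleneck link rate exceeds half of the sender interface rate, the
slowstart burst rate is capped at the sender interface rate.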

922	   Note that in conventional measurement theory, exponential
923	   distributions are often used to eliminate many sorts of correlations.
924	   For the procedures above, the correlations are created by the network
925	   elements and accurately reflect their behavior.  At some point in the
926	   future, it will be desirable to introduce noise sources into the
927	   above pacing models, but they are not warranted at this time.

929	7.2.  Constant window pseudo CBR

931	   Implement pseudo constant bit rate by running a standard protocol
932	   such as TCP with a fixed window size, such that it is self clocked.
933	   Data packets arriving at the receiver trigger acknowledgements (ACKs)
934	   which travel back to the sender where they trigger additional
935	   transmissions.  The window size is computed from the target_data_rate
936	   and the actual RTT of the test path.  The rate is only maintained in
937	   average over each RTT, and is subject to limitations of the transport
938	   protocol.

940	   Since the window size is constrained to be an integer number of
941	   packets, for small RTTs or low data rates there may not be
942	   sufficiently precise control over the data rate.  Rounding the window
943	   size up (the default) is likely to result in data rates that are
944	   higher than the target rate, but reducing the window by one packet
945	   may result in data rates that are too small.  Also cross traffic
946	   potentially raises the RTT, implicitly reducing the rate.  Cross
947	   traffic that raises the RTT nearly always makes the test more
948	   strenuous.  A FSTDS specifying a constant window CBR test MUST
949	   explicitly indicate under what conditions errors in the data rate
950	   cause tests to be inconclusive.  See the discussion of test outcomes
951	   in Section 8.1.
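
The effect of constraining the window to an integer number of packets
can be seen in a small calculation.  This Python sketch is
illustrative only: the function names are hypothetical, the payload
size stands in for target_MTU minus header_overhead, and the sample
values are assumptions.

```python
import math


def achieved_rate_bps(window_packets, rtt_s, payload_bytes):
    """Average payload rate of a self clocked flow with a fixed
    window, measured over one RTT."""
    return window_packets * payload_bytes * 8 / rtt_s


def window_choices(target_data_rate_bps, rtt_s, payload_bytes):
    """Return ((window, rate), (window, rate)) for rounding the
    window up (the default) and for one packet less."""
    exact = target_data_rate_bps * rtt_s / (8 * payload_bytes)
    up = math.ceil(exact)
    down = max(up - 1, 1)
    return ((up, achieved_rate_bps(up, rtt_s, payload_bytes)),
            (down, achieved_rate_bps(down, rtt_s, payload_bytes)))


if __name__ == "__main__":
    # Assumed example: 2.5 Mb/s target, 20 ms RTT, 1436 byte payload.
    print(window_choices(2.5e6, 0.020, 1436))
    # The same 5 packet window at a 30 ms RTT delivers a lower rate.
    print(achieved_rate_bps(5, 0.030, 1436))
```

For these assumed values the exact window is about 4.35 packets, so a
5 packet window overshoots the target rate by roughly 15% while a 4
packet window undershoots it by roughly 8%.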

953	   Since constant window pseudo CBR testing is sensitive to RTT
954	   fluctuations it can not accurately control the data rate in
955	   environments with fluctuating delays.

957	7.3.  Scanned window pseudo CBR

959	   Scanned window pseudo CBR is similar to the constant window CBR
960	   described above, except the window is scanned across a range of sizes
961	   designed to include two key events, the onset of queueing and the
962	   onset of packet loss or ECN marks.  The window is scanned by
963	   incrementing it by one packet every 2*target_pipe_size delivered
964	   packets.  This mimics the additive increase phase of standard TCP
965	   congestion avoidance when delayed ACKs are in effect.  It normally
966	   separates the window increases by approximately twice the
967	   target_RTT.
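
The window scan described above can be sketched as a simple schedule
generator.  The Python below is illustrative only; the function name
and the start_window and end_window parameters are hypothetical, and
choosing a range that brackets the onset of queueing and the onset of
loss is left to the test designer.

```python
def scanned_window_schedule(start_window, end_window, target_pipe_size):
    """Yield (delivered_packets, window) pairs for a window scan.

    The window grows by one packet for every 2*target_pipe_size
    delivered packets, mimicking additive increase with delayed ACKs.
    """
    packets_per_step = 2 * target_pipe_size
    delivered = 0
    for window in range(start_window, end_window + 1):
        yield delivered, window
        delivered += packets_per_step


if __name__ == "__main__":
    for delivered, window in scanned_window_schedule(9, 12, 11):
        print(delivered, window)
```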

969	   There are two ways to implement this test: one built by applying a
970	   window clamp to standard congestion control in a standard protocol
971	   such as TCP and the other built by stiffening a non-standard
972	   transport protocol.  When standard congestion control is in effect,
973	   any losses or ECN marks cause the transport to revert to a window
974	   smaller than the clamp such that the scanning clamp loses control of
975	   the window size.  The NPAD pathdiag tool is an example of this class
976	   of algorithms [Pathdiag].

978	   Alternatively a non-standard congestion control algorithm can respond
979	   to losses by transmitting extra data, such that it maintains the
980	   specified window size independent of losses or ECN marks.  Such a
981	   stiffened transport explicitly violates mandatory Internet congestion
982	   control [RFC5681] and is not suitable for in situ testing.  It is
983	   only appropriate for engineering testing under laboratory conditions.
984	   The Windowed Ping tool implements such a test [WPING].  The tool
985	   described in the paper has been updated [mpingSource].

987	   The test procedures in Section 10.2 describe how to partition the
988	   scans into regions and how to interpret the results.

990	7.4.  Concurrent or channelized testing

992	   The procedures described in this document are only directly
993	   applicable to single stream performance measurement, e.g. one TCP
994	   connection.  In an ideal world, we would disallow all performance
995	   claims based on multiple concurrent streams, but this is not practical
996	   due to at least two different issues.  First, many very high rate
997	   link technologies are channelized and pin individual flows to
998	   specific channels to minimize reordering or other problems and
999	   second, TCP itself has scaling limits.  Although the former problem
1000	   might be overcome through different design decisions, the latter
1001	   problem is more deeply rooted.

1003	   All congestion control algorithms that are philosophically aligned
1004	   with the standard [RFC5681] (e.g. claim some level of TCP
1005	   friendliness) have scaling limits, in the sense that as a long fast
1006	   network (LFN) with a fixed RTT and MTU gets faster, these congestion
1007	   control algorithms get less accurate and as a consequence have
1008	   difficulty filling the network [CCscaling].  These properties are a
1009	   consequence of the original Reno AIMD congestion control design and
1010	   the requirement in [RFC5681] that all transport protocols have
1011	   uniform response to congestion.

1013	   There are a number of reasons to want to specify performance in terms
1014	   of multiple concurrent flows, however this approach is not
1015	   recommended for data rates below several megabits per second, which
1016	   can be attained with run lengths under 10000 packets.  Since the
1017	   required run length goes as the square of the data rate, at higher
1018	   rates the run lengths can be unreasonably large, and multiple
1019	   connections might be the only feasible approach.

1021	   If multiple connections are deemed necessary to meet aggregate
1022	   performance targets then this MUST be stated both in the design of
1023	   the TDS and in any claims about network performance.  The tests MUST
1024	   be performed concurrently with the specified number of connections.
1025	   For the tests that use bursty traffic, the bursts should be
1026	   synchronized across flows.

1028	8.  Interpreting the Results

1030	8.1.  Test outcomes

1032	   To perform an exhaustive test of a complete network path, each test
1033	   of the TDS is applied to each subpath of the complete path.  If any
1034	   subpath fails any test then an application running over the complete
1035	   path can also be expected to fail to attain the target performance
1036	   under some conditions.

1038	   In addition to passing or failing, a test can be deemed to be
1039	   inconclusive for a number of reasons.  Proper instrumentation and
1040	   treatment of inconclusive outcomes is critical to the accuracy and
1041	   robustness of Model Based Metrics.  Tests can be inconclusive if the
1042	   precomputed traffic pattern or data rates were not accurately
1043	   generated; the measurement results were not statistically
1044	   significant; or for other causes such as failing to meet some
1045	   required preconditions for the test.

1047	   For example consider a test that implements Constant Window Pseudo
1048	   CBR (Section 7.2) by adding rate controls and detailed traffic
1049	   instrumentation to TCP (e.g.  [RFC4898]).  TCP includes built in
1050	   control systems which might interfere with the sending data rate.  If
1051	   such a test meets the required delivery statistics (e.g. run length)
1052	   while failing to attain the specified data rate it must be treated as
1053	   an inconclusive result, because we can not a priori determine if the
1054	   reduced data rate was caused by a TCP problem or a network problem,
1055	   or if the reduced data rate had a material effect on the observed
1056	   delivery statistics.

1058	   Note that for capacity tests, if the observed delivery statistics
1059	   fail to meet the targets, the test can be considered to have
1060	   failed because it doesn't really matter that the test didn't attain
1061	   the required data rate.
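
   The outcome rules in the preceding paragraphs can be sketched in
   Python.  This is only an illustrative sketch: the names Outcome and
   classify_capacity_test, and the boolean inputs, are assumptions
   introduced here; a real test would derive them from its own
   instrumentation and from the criteria stated in its TDS or FSTDS.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    INCONCLUSIVE = "inconclusive"


def classify_capacity_test(stats_meet_target, stats_significant,
                           rate_attained, preconditions_met=True):
    """Classify one capacity test run (hypothetical helper).

    stats_meet_target: delivery statistics (e.g. run length) met the target
    stats_significant: the measurement result is statistically significant
    rate_attained:     the specified data rate was actually generated
    preconditions_met: all required test preconditions held
    """
    # Unmet preconditions or insignificant results cannot support
    # either a pass or a fail.
    if not preconditions_met or not stats_significant:
        return Outcome.INCONCLUSIVE
    # For capacity tests a miss on delivery statistics is a failure
    # even if the required data rate was not attained.
    if not stats_meet_target:
        return Outcome.FAIL
    # Good delivery statistics at a reduced rate are ambiguous.
    if not rate_attained:
        return Outcome.INCONCLUSIVE
    return Outcome.PASS
```

   Note that the order of the checks matters: the failure case is
   evaluated before the data rate check, which mirrors the note above
   about capacity tests.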

1063	   The really important new properties of MBM, such as vantage
1064	   independence, are a direct consequence of opening the control loops
1065	   in the protocols, such that the test traffic does not depend on
1066	   network conditions or traffic received.  Any mechanism that
1067	   introduces feedback between the path's measurements and the traffic
1068	   generation is at risk of introducing nonlinearities that spoil these
1069	   properties.  Any exceptional event that indicates that such feedback
1070	   has happened should cause the test to be considered inconclusive.

1072	   One way to view inconclusive tests is that they reflect situations
1073	   where a test outcome is ambiguous between limitations of the network
1074	   and some unknown limitation of the diagnostic test itself, which may
1075	   have been caused by some uncontrolled feedback from the network.

1077	   Note that procedures that attempt to sweep the target parameter space
1078	   to find the limits on some parameter such as target_data_rate are at
1079	   risk of breaking the location independent properties of Model Based
1080	   Metrics, if the boundary between passing and inconclusive is at all
1081	   sensitive to RTT.

1083	   One of the goals for evolving TDS designs will be to keep sharpening
1084	   the distinction between inconclusive, passing and failing tests.
1085	   The criteria for passing, failing and inconclusive tests MUST be
1086	   explicitly stated for every test in the TDS or FSTDS.

1088	   One of the goals of evolving the testing process, procedures, tools
1089	   and measurement point selection should be to minimize the number of
1090	   inconclusive tests.

1092	   It may be useful to keep raw data delivery statistics for deeper
1093	   study of the behavior of the network path and to measure the tools
1094	   themselves.  Raw delivery statistics can help to drive tool
1095	   evolution.  Under some conditions it might be possible to reevaluate
1096	   the raw data for satisfying alternate performance targets.  However
1097	   it is important to guard against sampling bias and other implicit
1098	   feedback which can cause false results and exhibit measurement point
1099	   vantage sensitivity.

1101	8.2.  Statistical criteria for estimating run_length

1103	   When evaluating the observed run_length, we need to determine
1104	   appropriate packet stream sizes and acceptable error levels for
1105	   efficient measurement.  In practice, can we compare the empirically
1106	   estimated packet loss and ECN marking ratios with the targets as the
1107	   sample size grows?  How large a sample is needed to say that the
1108	   measurements of packet transfer indicate a particular run length is
1109	   present?

1111	   The generalized measurement can be described as recursive testing:
1112	   send packets (individually or in patterns) and observe the packet
1113	   delivery performance (loss ratio or other metric, any marking we
1114	   define).

1116	   As each packet is sent and measured, we have an ongoing estimate of
1117	   the performance in terms of the ratio of packet loss or ECN mark to
1118	   total packets (i.e. an empirical probability).  We continue to send
1119	   until conditions support a conclusion or a maximum sending limit has
1120	   been reached.

1122	   We have a target_mark_probability, 1 mark per target_run_length,
1123	   where a "mark" is defined as a lost packet, a packet with ECN mark,
1124	   or other signal.  This constitutes the null Hypothesis:

1126	   H0:  no more than one mark in target_run_length =
1127	      3*(target_pipe_size)^2 packets

1129	   and we can stop sending packets if on-going measurements support
1130	   accepting H0 with the specified Type I error = alpha (= 0.05 for
1131	   example).

1133	   We also have an alternative Hypothesis to evaluate: if performance is
1134	   significantly lower than the target_mark_probability.  Based on
1135	   analysis of typical values and practical limits on measurement
1136	   duration, we choose four times the H0 probability:

1138	   H1:  one or more marks in (target_run_length/4) packets

1140	   and we can stop sending packets if measurements support rejecting H0
1141	   with the specified Type II error = beta (= 0.05 for example), thus
1142	   preferring the alternate hypothesis H1.

1144	   H0 and H1 constitute the Success and Failure outcomes described
1145	   elsewhere in the memo, and while the ongoing measurements do not
1146	   support either hypothesis the current status of measurements is
1147	   inconclusive.

1149	   The problem above is formulated to match the Sequential Probability
1150	   Ratio Test (SPRT) [StatQC].  Note that as originally framed the
1151	   events under consideration were all manufacturing defects.  In
1152	   networking, ECN marks and lost packets are not defects but signals,
1153	   indicating that the transport protocol should slow down.

1155	   The Sequential Probability Ratio Test also starts with a pair of
1156	   hypotheses specified as above:

1158	   H0:  p0 = one defect in target_run_length
1159	   H1:  p1 = one defect in target_run_length/4
1160	   As packets are sent and measurements collected, the tester evaluates
1161	   the cumulative defect count against two boundaries representing H0
1162	   Acceptance or Rejection (and acceptance of H1):

1164	   Acceptance line:  Xa = -h1 + s*n
1165	   Rejection line:  Xr = h2 + s*n
1166	   where n increases linearly for each packet sent and

1168	   h1 =  { log((1-alpha)/beta) }/k
1169	   h2 =  { log((1-beta)/alpha) }/k
1170	   k  =  log{ (p1(1-p0)) / (p0(1-p1)) }
1171	   s  =  [ log{ (1-p0)/(1-p1) } ]/k
1172	   for p0 and p1 as defined in the null and alternative Hypotheses
1173	   statements above, and alpha and beta as the Type I and Type II
1174	   errors.

1176	   The SPRT specifies simple stopping rules:

1178	   o  Xa < defect_count(n) < Xr: continue testing
1179	   o  defect_count(n) <= Xa: Accept H0
1180	   o  defect_count(n) >= Xr: Accept H1
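
   The boundaries and stopping rules above can be sketched in Python as
   follows.  The function names sprt_parameters and sprt_decision are
   hypothetical, and alpha = beta = 0.05 is only the example value used
   in the text.  Natural logarithms are used, but the choice of base
   does not matter because every term is divided by k.

```python
import math


def sprt_parameters(p0, p1, alpha=0.05, beta=0.05):
    """Return (h1, h2, s) for the acceptance and rejection lines."""
    k = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
    h1 = math.log((1 - alpha) / beta) / k
    h2 = math.log((1 - beta) / alpha) / k
    s = math.log((1 - p0) / (1 - p1)) / k
    return h1, h2, s


def sprt_decision(defect_count, n, p0, p1, alpha=0.05, beta=0.05):
    """Apply the stopping rules after n packets with defect_count marks."""
    h1, h2, s = sprt_parameters(p0, p1, alpha, beta)
    xa = -h1 + s * n   # acceptance line
    xr = h2 + s * n    # rejection line
    if defect_count <= xa:
        return "accept H0"
    if defect_count >= xr:
        return "accept H1"
    return "continue"
```

   For instance, with a target_run_length of 1000 (p0 = 1/1000 and
   p1 = 4/1000), sprt_decision(0, 500, 0.001, 0.004) is still
   undecided, while five marks in the same 500 packets is enough to
   accept H1.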

1182	   The calculations above are implemented in the R-tool for Statistical
1183	   Analysis [Rtool], in the add-on package for Cross-Validation via
1184	   Sequential Testing (CVST) [CVST].

1186	   Using the equations above, we can calculate the minimum number of
1187	   packets (n) needed to accept H0 when x defects are observed.  For
1188	   example, when x = 0:

1190	   Xa = 0  = -h1 + s*n
1191	   and  n = h1 / s
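
   More generally, setting Xa equal to x and solving for n gives
   n = (x + h1)/s.  The Python sketch below evaluates this for a given
   target_run_length, taking p0 = 1/target_run_length and
   p1 = 4/target_run_length from the hypotheses above.  The function
   name and the rounding up to a whole packet are assumptions made for
   illustration.

```python
import math


def min_packets_to_accept_h0(target_run_length, defects=0,
                             alpha=0.05, beta=0.05):
    """Smallest n at which `defects` marks fall on or below Xa."""
    if target_run_length <= 4:
        raise ValueError("target_run_length must exceed 4 so that p1 < 1")
    p0 = 1.0 / target_run_length
    p1 = 4.0 / target_run_length
    k = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
    h1 = math.log((1 - alpha) / beta) / k
    s = math.log((1 - p0) / (1 - p1)) / k
    # Xa = -h1 + s*n >= defects  <=>  n >= (defects + h1) / s
    return math.ceil((defects + h1) / s)
```

   With alpha = beta = 0.05 and a target_run_length of 1000, the
   defect-free case needs slightly fewer than 1000 packets, and each
   observed defect raises the required count.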

1193	8.3.  Reordering Tolerance

1195	   All tests must be instrumented for packet level reordering [RFC4737].
1196	   However, there is no consensus for how much reordering should be
1197	   acceptable.  Over the last two decades the general trend has been to
1198	   make protocols and applications more tolerant to reordering (see for
1199	   example [RFC4015]), in response to the gradual increase in reordering
1200	   in the network.  This increase has been due to the deployment of
1201	   technologies such as multi threaded routing lookups and Equal Cost
1202	   MultiPath (ECMP) routing.  These techniques increase parallelism in
1203	   the network and are critical to enabling overall Internet growth to
1204	   exceed Moore's Law.

1206	   Note that transport retransmission strategies can trade off
1207	   reordering tolerance vs how quickly they can repair losses vs
1208	   overhead from spurious retransmissions.  In advance of new
1209	   retransmission strategies we propose the following strawman:
1210	   Transport protocols should be able to adapt to reordering as long as
1211	   the reordering extent is no more than one quarter window or 1 ms,
1212	   whichever is larger.  Within this limit on reorder extent, there
1213	   should be no bound on reordering density.
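
   The strawman limit can be sketched in Python.  The sketch assumes
   that the reordering extent is expressed as a time, and that one
   window is delivered per RTT so that a quarter window corresponds to
   RTT/4.  Neither assumption comes from this document, and the
   function name is hypothetical.

```python
def reordering_tolerated(extent_s, rtt_s, floor_s=0.001):
    """True if the reordering extent is within the strawman limit.

    extent_s: observed reordering extent, in seconds (assumed unit)
    rtt_s:    round trip time, used as the duration of one window
    floor_s:  the 1 ms lower bound from the strawman
    """
    limit_s = max(rtt_s / 4.0, floor_s)
    return extent_s <= limit_s
```

   On a 50 ms path the limit is 12.5 ms, while on a 2 ms path the 1 ms
   floor applies.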

1215	   By implication, reordering which is less than these bounds should not
1216	   be treated as a network impairment.  However [RFC4737] still applies:
1217	   reordering should be instrumented and the maximum reordering that can
1218	   be properly characterized by the test (e.g. bound on history buffers)
1219	   should be recorded with the measurement results.

1221	   Reordering tolerance and diagnostic limitations, such as history
1222	   buffer size, MUST be specified in a FSTDS.

1224	9.  Test Preconditions

1226	   Many tests have preconditions which are required to assure their
1227	   validity.  Examples include the presence or absence of cross traffic
1228	   on specific subpaths, or appropriate preloading to put reactive
1229	   network elements into the proper states [RFC7312].  If preconditions
1230	   are not properly satisfied for some reason, the tests should be
1231	   considered to be inconclusive.  In general it is useful to preserve
1232	   diagnostic information about why the preconditions were not met, and
1233	   any test data that was collected even if it is not useful for the
1234	   intended test.  Such diagnostic information and partial test data may
1235	   be useful for improving the test in the future.

1237	   It is important to preserve the record that a test was scheduled,
1238	   because otherwise precondition enforcement mechanisms can introduce
1239	   sampling bias.  For example, canceling tests due to cross traffic on
1240	   subscriber access links might introduce sampling bias of tests of the
1241	   rest of the network by reducing the number of tests during peak
1242	   network load.

1244	   Test preconditions and failure actions MUST be specified in a FSTDS.

1246	10.  Diagnostic Tests

1248	   The diagnostic tests below are organized by traffic pattern: basic
1249	   data rate and delivery statistics, standing queues, slowstart bursts,
1250	   and sender rate bursts.  We also introduce some combined tests which
1251	   are more efficient when networks are expected to pass, but conflate
1252	   diagnostic signatures when they fail.

1254	   There are a number of test details which are not fully defined here.
1255	   They must be fully specified in a FSTDS.  From a standardization
1256	   perspective, this lack of specificity will weaken this version of
1257	   Model Based Metrics, however it is anticipated that this will be more
1258	   than offset by the extent to which MBM suppresses the problems caused
1259	   by using transport protocols for measurement; e.g., non-specific MBM
1260	   metrics are likely to have better repeatability than many existing
1261	   BTC-like metrics.  Once we have good field experience, the missing
1262	   details can be fully specified.

1264	10.1.  Basic Data Rate and Delivery Statistics Tests

1266	   We propose several versions of the basic data rate and delivery
1267	   statistics test.  All measure the number of packets delivered between
1268	   losses or ECN marks, using a data stream that is rate controlled at
1269	   or below the target_data_rate.

1271	   The tests below differ in how the data rate is controlled.  The data
1272	   can be paced on a timer, or window controlled at full target data
1273	   rate.  The first two tests implicitly confirm that the subpath has
1274	   sufficient raw capacity to carry the target_data_rate.  They are
1275	   recommended for relatively infrequent testing, such as an installation
1276	   or periodic auditing process.  The third, background delivery
1277	   statistics, is a low rate test designed for ongoing monitoring for
1278	   changes in subpath quality.

1280	   All rely on the receiver accumulating packet delivery statistics as
1281	   described in Section 8.2 to score the outcome:

1283	   Pass: it is statistically significant that the observed interval
1284	   between losses or ECN marks is larger than the target_run_length.

1286	   Fail: it is statistically significant that the observed interval
1287	   between losses or ECN marks is smaller than the target_run_length.

1289	   A test is considered to be inconclusive if it failed to meet the data
1290	   rate as specified below, failed to meet the qualifications defined in
1291	   Section 9, or neither run length statistical hypothesis was confirmed
1292	   in the allotted test duration.

1294	10.1.1.  Delivery Statistics at Paced Full Data Rate

1296	   Confirm that the observed run length is at least the
1297	   target_run_length while relying on a timer to send data at the
1298	   target_rate using the procedure described in Section 7.1 with a
1299	   burst size of 1 (single packets) or 2 (packet pairs).

1301	   The test is considered to be inconclusive if the packet transmission
1302	   can not be accurately controlled for any reason.

1304	   RFC 6673 [RFC6673] is appropriate for measuring delivery statistics
1305	   at full data rate.

1307	10.1.2.  Delivery Statistics at Full Data Windowed Rate

1309	   Confirm that the observed run length is at least the
1310	   target_run_length while sending at an average rate approximately
1311	   equal to the target_data_rate, by controlling (or clamping) the
1312	   window size of a conventional transport protocol to a fixed value
1313	   computed from the properties of the test path, typically
1314	   test_window=target_data_rate*test_RTT/target_MTU.  Note that if there
1315	   is any interaction between the forward and return path, test_window
1316	   may need to be adjusted slightly to compensate for the resulting
1317	   inflated RTT.
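
   As a hedged illustration of the test_window computation, the Python
   sketch below assumes the data rate is given in bits per second and
   the MTU in bytes, and rounds up to a whole packet.  The function
   name, the units and the rounding are assumptions made here, not
   requirements of this document.

```python
import math


def compute_test_window(target_data_rate_bps, test_rtt_s, target_mtu_bytes):
    """test_window = target_data_rate * test_RTT / target_MTU, in packets."""
    bits_per_packet = target_mtu_bytes * 8
    return math.ceil(target_data_rate_bps * test_rtt_s / bits_per_packet)
```

   For example, 2.5 Mb/s over a 50 ms test path with an assumed 1500
   byte MTU gives 125000/12000, or about 10.4 packets, which rounds up
   to a clamp of 11 packets.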

1319	   Since losses and ECN marks generally cause transport protocols to at
1320	   least temporarily reduce their data rates, this test is expected to
1321	   be less precise about controlling its data rate.  It should not be
1322	   considered inconclusive as long as at least some of the round trips
1323	   reached the full target_data_rate without incurring losses or ECN
1324	   marks.  To pass this test the network MUST deliver target_pipe_size
1325	   packets in target_RTT time without any losses or ECN marks at least
1326	   once per two target_pipe_size round trips, in addition to meeting the
1327	   run length statistical test.

1329	10.1.3.  Background Delivery Statistics Tests

1331	   The background run length is a low rate version of the target rate
1332	   rate test above, designed for ongoing lightweight monitoring for
1333	   changes in the observed subpath run length without disrupting users.
1334	   It should be used in conjunction with one of the above full rate
1335	   tests because it does not confirm that the subpath can support the
1336	   raw data rate.

1338	   RFC 6673 [RFC6673] is appropriate for measuring background delivery
1339	   statistics.

1341	10.2.  Standing Queue Tests

1343	   These engineering tests confirm that the bottleneck is well behaved
1344	   across the onset of packet loss, which typically follows after the
1345	   onset of queueing.  Well behaved generally means lossless for
1346	   transient queues, but once the queue has been sustained for a
1347	   sufficient period of time (or reaches a sufficient queue depth) there
1348	   should be a small number of losses to signal to the transport
1349	   protocol that it should reduce its window.  Losses that are too early
1350	   can prevent the transport from averaging at the target_data_rate.
1351	   Losses that are too late indicate that the queue might be subject to
1352	   bufferbloat [wikiBloat] and inflict excess queuing delays on all
1353	   flows sharing the bottleneck queue.  Excess losses (more than half of
1354	   the window) at the onset of congestion make loss recovery problematic
1355	   for the transport protocol.  Non-linear, erratic or excessive RTT
1356	   increases suggest poor interactions between the channel acquisition
1357	   algorithms and the transport self clock.  All of the tests in this
1358	   section use the same basic scanning algorithm, described here, but
1359	   score the link on the basis of how well it avoids each of these
1360	   problems.

1362	   For some technologies the data might not be subject to increasing
1363	   delays, in which case the data rate will vary with the window size
1364	   all the way up to the onset of load induced losses or ECN marks.  For
1365	   these technologies, the discussion of queueing does not apply, but
1366	   it is still required that the onset of losses or ECN marks be at an
1367	   appropriate point and progressive.

1369	   Use the procedure in Section 7.3 to sweep the window across the onset
1370	   of queueing and the onset of loss.  The tests below all assume that
1371	   the scan emulates standard additive increase and delayed ACK by
1372	   incrementing the window by one packet for every 2*target_pipe_size
1373	   packets delivered.  A scan can typically be divided into three
1374	   regions: below the onset of queueing, a standing queue, and at or
1375	   beyond the onset of loss.

1377	   Below the onset of queueing the RTT is typically fairly constant, and
1378	   the data rate varies in proportion to the window size.  Once the data
1379	   rate reaches the link rate, the data rate becomes fairly constant,
1380	   and the RTT increases in proportion to the increase in window size.
1381	   The precise transition across the start of queueing can be identified
1382	   by the maximum network power, defined to be the ratio of data rate
1383	   over the RTT.  The network power can be computed at each window size,
1384	   and the window with the maximum is taken as the start of the queueing
1385	   region.
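
   Locating the start of the queueing region from scan data can be
   sketched in Python.  The scan is assumed to be a sequence of
   (window, data_rate, RTT) samples, one per window size; this layout
   and the function name are hypothetical.

```python
def onset_of_queueing(scan):
    """Return the window size with the maximum network power.

    scan: iterable of (window_packets, data_rate_bps, rtt_s) samples.
    Network power is data rate divided by RTT.  On ties the smallest
    window seen first is kept.
    """
    best_window = None
    best_power = float("-inf")
    for window, data_rate_bps, rtt_s in scan:
        power = data_rate_bps / rtt_s
        if power > best_power:
            best_window, best_power = window, power
    return best_window
```

   Below the onset the power grows with the data rate at a roughly
   constant RTT; above it the data rate is flat while the RTT grows, so
   the power falls again and the maximum marks the transition.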

1387	   For technologies that do not have conventional queues, start the scan
1388	   at a window equal to the test_window=target_data_rate*test_RTT/
1389	   target_MTU, i.e. starting at the target rate, instead of the power
1390	   point.

1392	   If there is random background loss (e.g. bit errors, etc), precise
1393	   determination of the onset of queue induced packet loss may require
1394	   multiple scans.  Above the onset of queuing loss, all transport
1395	   protocols are expected to experience periodic losses determined by
1396	   the interaction between the congestion control and AQM algorithms.
1397	   For standard congestion control algorithms the periodic losses are
1398	   likely to be relatively widely spaced and the details are typically
1399	   dominated by the behavior of the transport protocol itself.  For the
1400	   stiffened transport protocols case (with non-standard, aggressive
1401	   congestion control algorithms) the details of periodic losses will be
1402	   dominated by how the window increase function responds to loss.

1404	10.2.1.  Congestion Avoidance

1406	   A link passes the congestion avoidance standing queue test if more
1407	   than target_run_length packets are delivered between the onset of
1408	   queueing (as determined by the window with the maximum network power)
1409	   and the first loss or ECN mark.  If this test is implemented using a
1410	   standard congestion control algorithm with a clamp, it can be
1411	   performed in situ in the production Internet as a capacity test.  For
1412	   an example of such a test see [Pathdiag].

1414	   For technologies that do not have conventional queues, use the
1415	   test_window in place of the onset of queueing; i.e., a link passes
1416	   the congestion avoidance standing queue test if more than
1417	   target_run_length packets are delivered between the start of the
1418	   scan at test_window and the first loss or ECN mark.

1420	10.2.2.  Bufferbloat

1422	   This test confirms that there is some mechanism to limit buffer
1423	   occupancy (e.g. that prevents bufferbloat).  Note that this is not
1424	   strictly a requirement for single stream bulk performance, however if
1425	   there is no mechanism to limit buffer queue occupancy then a single
1426	   stream with sufficient data to deliver is likely to cause the
1427	   problems described in [RFC2309], [I-D.ietf-aqm-recommendation] and
1428	   [wikiBloat].  This may cause only minor symptoms for the dominant
1429	   flow, but has the potential to make the link unusable for other flows
1430	   and applications.

1432	   Pass if the onset of loss occurs before a standing queue has
1433	   introduced more delay than twice target_RTT, or other well
1434	   defined and specified limit.  Note that there is not yet a model for
1435	   how much standing queue is acceptable.  The factor of two chosen here
1436	   reflects a rule of thumb.  In conjunction with the previous test,
1437	   this test implies that the first loss should occur at a queueing
1438	   delay which is between one and two times the target_RTT.

1440	   Specified RTT limits that are larger than twice the target_RTT must
1441	   be fully justified in the FSTDS.

1443	10.2.3.  Non excessive loss

1445	   This test confirms that the onset of loss is not excessive.  Pass if
1446	   losses are equal to or less than the increase in the cross traffic
1447	   plus the test traffic window increase on the previous RTT.  This
1448	   could be restated as non-decreasing link throughput at the onset of
1449	   loss, which is easy to meet as long as discarding packets is not more
1450	   expensive than delivering them.  (Note when there is a transient drop
1451	   in link throughput, outside of a standing queue test, a link that
1452	   passes other queue tests in this document will have sufficient queue
1453	   space to hold one RTT worth of data).

1455	   Note that conventional Internet traffic policers will not pass this
1456	   test, which is correct.  TCP often fails to come into equilibrium at
1457	   more than a small fraction of the available capacity, if the capacity
1458	   is enforced by a policer.  [Citation Pending].

1460	10.2.4.  Duplex Self Interference

1462	   This engineering test confirms a bound on the interactions between
1463	   the forward data path and the ACK return path.

1465	   Some historical half duplex technologies had the property that each
1466	   direction held the channel until it completely drained its queue.
1467	   When a self clocked transport protocol, such as TCP, has data and
1468	   acks passing in opposite directions through such a link, the behavior
1469	   often reverts to stop-and-wait.  Each additional packet added to the
1470	   window raises the observed RTT by two forward path packet times, once
1471	   as it passes through the data path, and once for the additional delay
1472	   incurred by the ACK waiting on the return path.

1474	   The duplex self interference test fails if the RTT rises by more than
1475	   some fixed bound above the expected queueing time computed from the
1476	   excess window divided by the link data rate.  This bound must be
1477	   smaller than target_RTT/2 to avoid reverting to stop and wait
1478	   behavior (e.g., packets have to be released at least twice per RTT
1479	   to avoid stop and wait behavior).
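
   The failure criterion above can be sketched in Python.  The sketch
   assumes that the RTT rise is measured relative to the RTT observed
   below the onset of queueing, and that the excess window is counted
   in packets of a known size.  The function and parameter names are
   hypothetical.

```python
def duplex_self_interference_fails(observed_rtt_s, base_rtt_s,
                                   excess_window_packets, packet_bits,
                                   link_rate_bps, bound_s, target_rtt_s):
    """True if the RTT rose more than bound_s above expected queueing."""
    # The bound itself must be smaller than target_RTT/2.
    if not bound_s < target_rtt_s / 2.0:
        raise ValueError("bound must be smaller than target_RTT/2")
    # Expected queueing time: excess window divided by link data rate.
    expected_queueing_s = excess_window_packets * packet_bits / link_rate_bps
    rtt_rise_s = observed_rtt_s - base_rtt_s
    return rtt_rise_s > expected_queueing_s + bound_s
```

   With 10 excess packets of 12000 bits on a 10 Mb/s link the expected
   queueing time is 12 ms, so with a 5 ms bound an RTT rise of 14 ms
   does not fail the test while a rise of 25 ms does.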

1481	10.3.  Slowstart tests

1483	   These tests mimic slowstart: data is sent at twice the effective
1484	   bottleneck rate to exercise the queue at the dominant bottleneck.

1486	   In general they are deemed inconclusive if the elapsed time to send
1487	   the data burst is not less than half of the time to receive the ACKs.
1488	   (i.e. sending data too fast is ok, but sending it slower than twice
1489	   the actual bottleneck rate as indicated by the ACKs is deemed
1490	   inconclusive).  Space the bursts such that the average data rate is
1491	   equal to the target_data_rate.

1493	10.3.1.  Full Window slowstart test

1495	   This is a capacity test to confirm that slowstart is not likely to
1496	   exit prematurely.  Send slowstart bursts that are target_pipe_size
1497	   total packets.

1499	   Accumulate packet delivery statistics as described in Section 8.2 to
1500	   score the outcome.  Pass if it is statistically significant that the
1501	   observed number of good packets delivered between losses or ECN marks
1502	   is larger than the target_run_length.  Fail if it is statistically
1503	   significant that the observed interval between losses or ECN marks is
1504	   smaller than the target_run_length.

1506	   Note that these are the same parameters as the Sender Full Window
1507	   burst test, except the burst rate is at slowstart rate, rather than
1508	   sender interface rate.

1510	10.3.2.  Slowstart AQM test

1512	   Do a continuous slowstart (send data continuously at slowstart_rate),
1513	   until the first loss, stop, allow the network to drain and repeat,
1514	   gathering statistics on the last packet delivered before the loss,
1515	   the loss pattern, maximum observed RTT and window size.  Justify the
1516	   results.  There is not currently sufficient theory justifying
1517	   requiring any particular result, however design decisions that affect
1518	   the outcome of these tests also affect how the network balances
1519	   between long and short flows (the "mice and elephants" problem).  The
1520	   queue at the time of the first loss should be at least one half of
1521	   the target_RTT.

1523	   This is an engineering test: It would be best performed on a
1524	   quiescent network or testbed, since cross traffic has the potential
1525	   to change the results.

1527	10.4.  Sender Rate Burst tests

1529	   These tests determine how well the network can deliver bursts sent at
1530	   the sender's interface rate.  Note that this test most heavily
1531	   exercises the front path, and is likely to include infrastructure
1532	   that may be out of scope for an access ISP, even though the bursts
1533	   might be caused by ACK compression, thinning or channel arbitration
1534	   in the access ISP.  See Appendix B.

1536	   Also, there are several details that are not precisely defined.
1537	   For starters there is not a standard server interface rate. 1 Gb/s
1538	   and 10 Gb/s are very common today, but higher rates will become cost
1539	   effective and can be expected to be dominant some time in the future.

1541	   Current standards permit TCP to send full window bursts following
1542	   an application pause.  (Congestion Window Validation [RFC2861] is
1543	   not required, but even if it was, it does not take effect until an
1544	   application pause is longer than an RTO.)  Since full window bursts
1545	   are consistent with standard behavior, it is desirable that the
1546	   network be able to deliver such bursts, otherwise application pauses
1547	   will cause unwarranted losses.  Note that the AIMD sawtooth requires
1548	   a peak window that is twice target_pipe_size, so the worst case burst
1549	   may be 2*target_pipe_size.

1551	   It is also understood in the application and serving community that
1552	   interface rate bursts have a cost to the network that has to be
1553	   balanced against other costs in the servers themselves.  For example
1554	   TCP Segmentation Offload (TSO) reduces server CPU in exchange for
1555	   larger network bursts, which increase the stress on network buffer
1556	   memory.

1558	   There is not yet theory to unify these costs or to provide a
1559	   framework for trying to optimize global efficiency.  We do not yet
1560	   have a model for how much the network should tolerate server rate
1561	   bursts.  Some bursts must be tolerated by the network, but it is
1562	   probably unreasonable to expect the network to be able to efficiently
1563	   deliver all data as a series of bursts.

1565	   For this reason, this is the only test for which we encourage
1566	   derating.  A TDS could include a table of pairs of derating
1567	   parameters: what burst size to use as a fraction of the
1568	   target_pipe_size, and how much each burst size is permitted to reduce
1569	   the run length, relative to the target_run_length.

1571	10.5.  Combined and Implicit Tests

1573	   Combined tests efficiently confirm multiple network properties in a
1574	   single test, possibly as a side effect of normal content delivery.

1576	   They require less measurement traffic than other testing strategies
1577	   at the cost of conflating diagnostic signatures when they fail.
1578	   These are by far the most efficient for monitoring networks that are
1579	   nominally expected to pass all tests.

1581	10.5.1.  Sustained Bursts Test

1583	   The sustained burst test implements a combined worst case version of
1584	   all of the capacity tests above.  It is simply:

1586	   Send target_pipe_size bursts of packets at server interface rate with
1587	   target_RTT burst headway (burst start to burst start).  Verify that
1588	   the observed delivery statistics meet the target_run_length.

1590	   Key observations:
1591	   o  The subpath under test is expected to go idle for some fraction of
1592	      the time: (subpath_data_rate-target_rate)/subpath_data_rate.
1593	      Failing to do so indicates a problem with the procedure and an
1594	      inconclusive test result.
1595	   o  The burst sensitivity can be derated by sending smaller bursts
1596	      more frequently.  E.g. send target_pipe_size*derate packet bursts
1597	      every target_RTT*derate.
1598	   o  When not derated, this test is the most strenuous capacity test.
1599	   o  A link that passes this test is likely to be able to sustain
1600	      higher rates (close to subpath_data_rate) for paths with RTTs
1601	      significantly smaller than the target_RTT.
1602	   o  This test can be implemented with instrumented TCP [RFC4898],
1603	      using a specialized measurement application at one end [MBMSource]
1604	      and a minimal service at the other end [RFC0863] [RFC0864].
1605	   o  This test is efficient to implement, since it does not require
1606	      per-packet timers, and can make use of TSO in modern NIC hardware.
1607	   o  This test by itself is not sufficient: the standing window
1608	      engineering tests are also needed to ensure that the link is well
1609	      behaved at and beyond the onset of congestion.
1610	   o  Assuming the link passes relevant standing window engineering
1611	      tests (particularly that it has a progressive onset of loss at an
1612	      appropriate queue depth) passing the sustained burst test is
1613	      (believed to be) sufficient to verify that the subpath will not
1614	      impair a stream at the target performance under all conditions.
1615	      Proving this statement will be the subject of ongoing research.
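
   The burst schedule and the expected idle fraction from the
   observations above can be sketched in Python.  The function name is
   hypothetical.  Rounding the burst to a whole number of packets, and
   then deriving the headway from the rounded burst so that the average
   rate stays at the target, is a design choice made for this sketch
   rather than something this document specifies.

```python
def sustained_burst_plan(target_pipe_size, target_rtt_s, target_rate_bps,
                         subpath_rate_bps, derate=1.0):
    """Return (burst_packets, headway_s, idle_fraction).

    derate = 1.0 is the full-strength test; smaller values send
    smaller bursts more frequently.
    """
    if not 0.0 < derate <= 1.0:
        raise ValueError("derate must be in (0, 1]")
    if target_rate_bps > subpath_rate_bps:
        raise ValueError("target rate exceeds the subpath data rate")
    # Nominally target_pipe_size*derate packets every target_RTT*derate.
    burst_packets = max(1, round(target_pipe_size * derate))
    headway_s = target_rtt_s * burst_packets / target_pipe_size
    # Fraction of time the subpath under test is expected to be idle.
    idle_fraction = (subpath_rate_bps - target_rate_bps) / subpath_rate_bps
    return burst_packets, headway_s, idle_fraction
```

   A measured idle fraction well below the returned value would point
   to a problem with the procedure and hence an inconclusive result.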

1617	   Note that this test is clearly independent of the subpath RTT, or
1618	   other details of the measurement infrastructure, as long as the
1619	   measurement infrastructure can accurately and reliably deliver the
1620	   required bursts to the subpath under test.

1622	10.5.2.  Streaming Media

1624	   Model Based Metrics can be implicitly implemented as a side effect of
1625	   serving any non-throughput maximizing traffic, such as streaming
1626	   media, with some additional controls and instrumentation in the
1627	   servers.  The essential requirement is that the traffic be
1628	   constrained such that even with arbitrary application pauses, bursts
1629	   and data rate fluctuations, the traffic stays within the envelope
1630	   defined by the individual tests described above.

1632	   If the application's serving_data_rate is less than or equal to the
1633	   target_data_rate and the serving_RTT (the RTT between the sender and
1634	   client) is less than the target_RTT, this constraint is most easily
1635	   implemented by clamping the transport window size to be no larger
1636	   than:

1638	   serving_window_clamp=target_data_rate*serving_RTT/
1639	   (target_MTU-header_overhead)

1641	   Under the above constraints the serving_window_clamp will limit
1642	   both the serving data rate and burst sizes to be no larger than the
1643	   procedures in Section 10.1.2 and Section 10.4 or Section 10.5.1.
1644	   Since the serving RTT is smaller than the target_RTT, the worst case
1645	   bursts that might be generated under these conditions will be smaller
1646	   than called for by Section 10.4 and the sender rate burst sizes are
1647	   implicitly derated by the serving_window_clamp divided by the
1648	   target_pipe_size at the very least.  (Depending on the application
1649	   behavior, the data traffic might be significantly smoother than
1650	   specified by any of the burst tests.)
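
   The clamp calculation above can be sketched in a few lines of Python.
   This is only an illustration, not part of any specified procedure:
   the function name, the units (bits per second, seconds, bytes) and
   the choice to round down, so that the window is never larger than
   the formula allows, are assumptions made for this example.

```python
def serving_window_clamp(target_data_rate_bps, serving_rtt_s,
                         target_mtu_bytes, header_overhead_bytes):
    """Return the transport window clamp in packets.

    Implements target_data_rate * serving_RTT /
    (target_MTU - header_overhead), with the rate given in bits per
    second.  Rounding down is an assumption of this sketch.
    """
    payload_bits = 8 * (target_mtu_bytes - header_overhead_bytes)
    return int(target_data_rate_bps * serving_rtt_s / payload_bits)


# Hypothetical server 20 ms away from the client, 2.5 Mb/s target.
clamp = serving_window_clamp(2.5e6, 0.020, 1500, 64)
print(clamp)
```

   With a 2.5 Mb/s target, a 1500 byte MTU, 64 bytes of header overhead
   and a hypothetical 20 ms serving_RTT, this sketch yields a clamp of
   4 packets.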

1652	   Note that it is important that the target_data_rate be above the
1653	   actual average rate needed by the application so it can recover after
1654	   transient pauses caused by congestion or the application itself.

1656	   In an alternative implementation the data rate and bursts might be
1657	   explicitly controlled by a host shaper or pacing at the sender.  This
1658	   would provide better control over transmissions but it is
1659	   substantially more complicated to implement and would be likely to
1660	   have a higher CPU overhead.

1662	   Note that these techniques can be applied to any content delivery
1663	   that can be subjected to a reduced data rate in order to inhibit TCP
1664	   equilibrium behavior.

1666	11.  An Example

1668	   In this section we illustrate a TDS designed to confirm that an
1669	   access ISP can reliably deliver HD video from multiple content
1670	   providers to all of their customers.  With modern codecs, minimal HD
1671	   video (720p) generally fits in 2.5 Mb/s.  Due to their geographical
1672	   size, network topology and modem designs the ISP determines that most
1673	   content is within a 50 ms RTT of their users.  (This is sufficient
1674	   to cover continental Europe or either US coast from a single serving
1675	   site.)

1677	                        2.5 Mb/s over a 50 ms path

1679	                +----------------------+-------+---------+
1680	                | End-to-End Parameter | value | units   |
1681	                +----------------------+-------+---------+
1682	                | target_rate          | 2.5   | Mb/s    |
1683	                | target_RTT           | 50    | ms      |
1684	                | target_MTU           | 1500  | bytes   |
1685	                | header_overhead      | 64    | bytes   |
1686	                | target_pipe_size     | 11    | packets |
1687	                | target_run_length    | 363   | packets |
1688	                +----------------------+-------+---------+

1690	                                  Table 1

1692	   Table 1 shows the default TCP model with no derating, and as such is
1693	   quite conservative.  The simplest TDS would be to use the sustained
1694	   burst test, described in Section 10.5.1.  Such a test would send 11
1695	   packet bursts every 50 ms, and confirm that there was no more than
1696	   1 packet loss per 33 bursts (363 total packets in 1.650 seconds).
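
   The derived rows of Table 1 and the burst test numbers above can be
   reproduced with the following Python sketch.  It assumes that
   target_pipe_size is the rate-delay product divided by the payload
   size and rounded up, and that the reference target_run_length is
   3*target_pipe_size^2, which is consistent with the values in
   Table 1.  The function names are illustrative only.

```python
import math


def target_pipe_size(rate_bps, rtt_s, mtu_bytes, header_overhead_bytes):
    """Packets in flight needed to fill the pipe (rounded up)."""
    payload_bits = 8 * (mtu_bytes - header_overhead_bytes)
    return math.ceil(rate_bps * rtt_s / payload_bits)


def reference_run_length(pipe_size):
    """Reference target_run_length in packets (assumed 3 * pipe^2)."""
    return 3 * pipe_size ** 2


rtt_s = 0.050
pipe = target_pipe_size(2.5e6, rtt_s, 1500, 64)   # 11 packets
run_length = reference_run_length(pipe)           # 363 packets
bursts = run_length // pipe                       # 33 bursts of 11 packets
duration_s = bursts * rtt_s                       # about 1.65 seconds
print(pipe, run_length, bursts, round(duration_s, 3))
```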

1698	   Since this number represents the entire end-to-end loss budget,
1699	   independent subpath tests could be implemented by apportioning the
1700	   loss ratio across subpaths.  For example 50% of the losses might be
1701	   allocated to the access or last mile link to the user, 40% to the
1702	   interconnects with other ISPs and 1% to each internal hop (assuming
1703	   no more than 10 internal hops).  Then all of the subpaths can be
1704	   tested independently, and the spatial composition of passing subpaths
1705	   would be expected to be within the end-to-end loss budget.

1707	   Testing interconnects has generally been problematic: conventional
1708	   performance tests run between Measurement Points adjacent to either
1709	   side of the interconnect are not generally useful.  Unconstrained
1710	   TCP tests, such as iperf [iperf], are usually overly aggressive
1711	   because the RTT is so small (often less than 1 ms).  With a short RTT
1712	   these tools are likely to report inflated numbers because they can
1713	   tolerate very high loss ratios and can push other
1714	   cross traffic off of the network.  As a consequence they are useless
1715	   for predicting actual user performance, and may themselves be quite
1716	   disruptive.  Model Based Metrics solves this problem.  The same test
1717	   pattern as used on other links can be applied to the interconnect.
1718	   For our example, when apportioned 40% of the losses, 11 packet bursts
1719	   sent every 50 ms should have fewer than one loss per 82 bursts (902
1720	   packets).
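
   The apportioning arithmetic in this section can be sketched as
   follows.  This is an illustration under stated assumptions: the
   function name and the use of integer percentages are choices made
   for this example, and the result is rounded down to whole bursts,
   which matches the 82 burst figure above.

```python
def subpath_loss_budget(end_to_end_run_length, share_percent, burst_size):
    """Return (bursts, packets) to be sent per permitted loss.

    A subpath allotted share_percent of the losses must sustain a run
    length of end_to_end_run_length / (share_percent / 100) packets.
    """
    run_length = end_to_end_run_length * 100 // share_percent
    bursts = run_length // burst_size
    return bursts, bursts * burst_size


# Shares from the example: access, interconnect, each internal hop.
shares = {"access": 50, "interconnect": 40, "internal_hop": 1}
budgets = {name: subpath_loss_budget(363, pct, 11)
           for name, pct in shares.items()}
for name, (bursts, packets) in budgets.items():
    print(name, bursts, packets)
```

   For the interconnect this gives 82 bursts (902 packets), as stated
   above; the access link gets 66 bursts (726 packets), and each of up
   to 10 internal hops gets 3300 bursts (36300 packets).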

1722	12.  Validation

1724	   Since some aspects of the models are likely to be too conservative,
1725	   Section 6.2 permits alternate protocol models and Section 6.3 permits
1726	   test parameter derating.  If either of these techniques is used, we
1727	   require demonstrations that such a TDS can robustly detect links that
1728	   will prevent authentic applications using state-of-the-art protocol
1729	   implementations from meeting the specified performance targets.  This
1730	   correctness criterion is potentially difficult to prove, because it
1731	   implicitly requires validating a TDS against all possible links and
1732	   subpaths.  The procedures described here are still experimental.

1734	   We suggest two approaches, both of which should be applied: first,
1735	   publish a fully open description of the TDS, including what
1736	   assumptions were used and how it was derived, such that the
1737	   research community can evaluate the design decisions, test them and
1738	   comment on their applicability; and second, demonstrate that
1739	   applications running over an infinitesimally passing testbed do meet
1740	   the performance targets.

1742	   An infinitesimally passing testbed resembles an epsilon-delta proof
1743	   in calculus.  Construct a test network such that all of the
1744	   individual tests of the TDS pass by only small (infinitesimal)
1745	   margins, and demonstrate that a variety of authentic applications
1746	   running over real TCP implementations (or other protocols as
1747	   appropriate) meet the target transport performance over such a
1748	   network.  The workloads should include multiple types of streaming
1749	   media and transaction oriented short flows (e.g. synthetic web
1750	   traffic).

1752	   For example, for the HD streaming video TDS described in Section 11,
1753	   the link layer bottleneck data rate should be exactly the header
1754	   overhead above 2.5 Mb/s, the per packet random background loss ratio
1755	   should be 1/363, for a run length of 363 packets, the bottleneck
1756	   queue should be 11 packets and the front path should have just enough
1757	   buffering to withstand 11 packet interface rate bursts.  We want
1758	   every one of the TDS tests to fail if we slightly increase the
1759	   relevant test parameter, so for example sending a 12 packet burst
1760	   should cause excess (possibly deterministic) packet drops at the
1761	   dominant queue at the bottleneck.  On this infinitesimally passing
1762	   network it should be possible for a real application using a stock
1763	   TCP implementation in the vendor's default configuration to attain
1764	   2.5 Mb/s over a 50 ms path.
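
   As a rough sketch, the testbed settings for this example could be
   derived as shown below.  The interpretation that "the header
   overhead above 2.5 Mb/s" means scaling the target rate by
   target_MTU/(target_MTU - header_overhead) is an assumption of this
   example, as are the function and key names.

```python
def passing_testbed(target_rate_bps, mtu_bytes, header_overhead_bytes,
                    pipe_size, run_length):
    """Return illustrative settings for a marginally passing testbed."""
    # Link rate that carries the target payload rate plus headers
    # (assumed interpretation of "header overhead above" the target).
    link_rate = target_rate_bps * mtu_bytes / (
        mtu_bytes - header_overhead_bytes)
    return {
        "link_rate_bps": link_rate,
        "loss_ratio": 1 / run_length,            # random background loss
        "queue_packets": pipe_size,              # bottleneck queue depth
        "failing_burst_packets": pipe_size + 1,  # should cause drops
    }


settings = passing_testbed(2.5e6, 1500, 64, 11, 363)
print(round(settings["link_rate_bps"]))
```

   Under these assumptions the bottleneck link rate comes out near
   2.61 Mb/s, with a loss ratio of 1/363 and an 11 packet queue.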

1766	   The most difficult part of setting up such a testbed is arranging for
1767	   it to infinitesimally pass the individual tests.  Two approaches are:
1768	   constraining the network devices not to use all available resources
1769	   (e.g. by limiting available buffer space or data rate); and
1770	   preloading subpaths with cross traffic.  Note that it is important
1771	   that a single environment be constructed which infinitesimally
1772	   passes all tests at the same time, otherwise there is a chance that
1773	   TCP can exploit extra latitude in some parameters (such as data rate)
1774	   to partially compensate for constraints in other parameters (such as
1775	   queue space), or vice versa.

1777	   To the extent that a TDS is used to inform public dialog it should be
1778	   fully publicly documented, including the details of the tests, what
1779	   assumptions were used and how it was derived.  All of the details of
1780	   the validation experiment should also be published with sufficient
1781	   detail for the experiments to be replicated by other researchers.
1782	   All components should either be open source or fully described
1783	   proprietary implementations that are available to the research
1784	   community.

1786	13.  Security Considerations

1788	   Measurement is often used to inform business and policy decisions,
1789	   and as a consequence is potentially subject to manipulation for
1790	   illicit gains.  Model Based Metrics are expected to be a huge step
1791	   forward because equivalent measurements can be performed from
1792	   multiple vantage points, such that performance claims can be
1793	   independently validated by multiple parties.

1795	   Much of the acrimony in the Net Neutrality debate is due to the
1796	   historical lack of any effective vantage independent tools to
1797	   characterize network performance.  Traditional methods for measuring
1798	   Bulk Transport Capacity are sensitive to RTT and as a consequence
1799	   often yield very different results local to an ISP and when run over
1800	   a customer's complete path.  Neither the ISP nor customer can repeat
1801	   the other's measurements, leading to high levels of distrust and
1802	   acrimony.  Model Based Metrics are expected to greatly improve this
1803	   situation.

1805	   This document only describes a framework for designing a Fully
1806	   Specified Targeted Diagnostic Suite.  Each FSTDS MUST include its own
1807	   security section.

1809	14.  Acknowledgements

1811	   Ganga Maguluri suggested the statistical test for measuring loss
1812	   probability in the target run length.  Alex Gilgur helped with
1813	   the statistics.

1815	   Meredith Whittaker improved the clarity of the communications.

1817	   This work was inspired by Measurement Lab: open tools running on an
1818	   open platform, using open tools to collect open data.  See
1819	   http://www.measurementlab.net/

1821	15.  IANA Considerations

1823	   This document has no actions for IANA.

1825	16.  References

1827	16.1.  Normative References

1829	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1830	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1832	16.2.  Informative References

1834	   [RFC0863]  Postel, J., "Discard Protocol", STD 21, RFC 863, May 1983.

1836	   [RFC0864]  Postel, J., "Character Generator Protocol", STD 22,
1837	              RFC 864, May 1983.

1839	   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
1840	              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
1841	              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
1842	              S., Wroclawski, J., and L. Zhang, "Recommendations on
1843	              Queue Management and Congestion Avoidance in the
1844	              Internet", RFC 2309, April 1998.

1846	   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
1847	              "Framework for IP Performance Metrics", RFC 2330,
1848	              May 1998.

1850	   [RFC2861]  Handley, M., Padhye, J., and S. Floyd, "TCP Congestion
1851	              Window Validation", RFC 2861, June 2000.

1853	   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
1854	              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
1855	              July 2001.

1857	   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
1858	              Counting (ABC)", RFC 3465, February 2003.

1860	   [RFC4015]  Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm
1861	              for TCP", RFC 4015, February 2005.

1863	   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
1864	              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
1865	              November 2006.

1867	   [RFC4898]  Mathis, M., Heffner, J., and R. Raghunarayan, "TCP
1868	              Extended Statistics MIB", RFC 4898, May 2007.

1870	   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
1871	              Control", RFC 5681, September 2009.

1873	   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
1874	              Composition", RFC 5835, April 2010.

1876	   [RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of
1877	              Metrics", RFC 6049, January 2011.

1879	   [RFC6673]  Morton, A., "Round-Trip Packet Loss Metrics", RFC 6673,
1880	              August 2012.

1882	   [RFC7312]  Fabini, J. and A. Morton, "Advanced Stream and Sampling
1883	              Framework for IP Performance Metrics (IPPM)", RFC 7312,
1884	              August 2014.

1886	   [RFC7398]  Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and
1887	              A. Morton, "A Reference Path and Measurement Points for
1888	              Large-Scale Measurement of Broadband Performance",
1889	              RFC 7398, February 2015.

1891	   [I-D.ietf-aqm-recommendation]
1892	              Baker, F. and G. Fairhurst, "IETF Recommendations
1893	              Regarding Active Queue Management",
1894	              draft-ietf-aqm-recommendation-11 (work in progress),
1895	              February 2015.

1897	   [MSMO97]   Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The
1898	              Macroscopic Behavior of the TCP Congestion Avoidance
1899	              Algorithm", Computer Communications Review volume 27,
1900	              number 3, July 1997.

1902	   [WPING]    Mathis, M., "Windowed Ping: An IP Level Performance
1903	              Diagnostic", INET 94, June 1994.

1905	   [mpingSource]
1906	              Fan, X., Mathis, M., and D. Hamon, "Git Repository for
1907	              mping: An IP Level Performance Diagnostic", Sept 2013,
1908	              <https://github.com/m-lab/mping>.

1910	   [MBMSource]
1911	              Hamon, D., Stuart, S., and H. Chen, "Git Repository for
1912	              Model Based Metrics", Sept 2013,
1913	              <https://github.com/m-lab/MBM>.

1915	   [Pathdiag]
1916	              Mathis, M., Heffner, J., O'Neil, P., and P. Siemsen,
1917	              "Pathdiag: Automated TCP Diagnosis", Passive and Active
1918	              Measurement, June 2008.

1920	   [iperf]    Wikipedia Contributors, "iPerf", Wikipedia, The Free
1921	              Encyclopedia, cited March 2015, <http://en.wikipedia.org/
1922	              w/index.php?title=Iperf&oldid=649720021>.

1924	   [StatQC]   Montgomery, D., "Introduction to Statistical Quality
1925	              Control - 2nd ed.", ISBN 0-471-51988-X, 1990.

1927	   [Rtool]    R Development Core Team, "R: A language and environment
1928	              for statistical computing", R Foundation for Statistical
1929	              Computing, Vienna, Austria, ISBN 3-900051-07-0, 2011,
1930	              <http://www.R-project.org/>.

1932	   [CVST]     Krueger, T. and M. Braun, "R package: Fast Cross-
1933	              Validation via Sequential Testing", version 0.1, November 2012.

1935	   [AFD]      Pan, R., Breslau, L., Prabhakar, B., and S. Shenker,
1936	              "Approximate fairness through differential dropping",
1937	              SIGCOMM Comput. Commun. Rev.  33, 2, April 2003.

1939	   [wikiBloat]
1940	              Wikipedia, "Bufferbloat", http://en.wikipedia.org/w/
1941	              index.php?title=Bufferbloat&oldid=608805474, March 2015.

1943	   [CCscaling]
1944	              Paganini, F., Doyle, J., and S. Low, "Scalable laws for
1945	              stable network congestion control", Proceedings of
1946	              Conference on Decision and
1947	              Control, http://www.ee.ucla.edu/~paganini, December 2001.

1949	Appendix A.  Model Derivations

1951	   The reference target_run_length described in Section 6.2 is based on
1952	   very conservative assumptions: that all window above target_pipe_size
1953	   contributes to a standing queue that raises the RTT, and that classic
1954	   Reno congestion control with delayed ACKs is in effect.  In this
1955	   section we provide two alternative calculations using different
1956	   assumptions.

1958	   It may seem out of place to allow such latitude in a measurement
1959	   standard, but this section provides offsetting requirements.

1961	   The estimates provided by these models make the most sense if network
1962	   performance is viewed logarithmically.  In the operational Internet,
1963	   data rates span more than 8 orders of magnitude, RTT spans more than
1964	   3 orders of magnitude, and loss ratio spans at least 8 orders of
1965	   magnitude.  When viewed logarithmically (as in decibels), these
1966	   correspond to 80 dB of dynamic range.  On an 80 dB scale, a 3 dB
1967	   error is less than 4% of the scale, even though it might represent a
1968	   factor of 2 in the untransformed parameter.
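
   The decibel arithmetic in the preceding paragraph can be checked
   with a short Python sketch.  The helper name to_db is an
   illustrative choice for this example.

```python
import math


def to_db(ratio):
    """Express a ratio between two values of a parameter in decibels."""
    return 10 * math.log10(ratio)


dynamic_range_db = to_db(1e8)          # 8 orders of magnitude
factor_of_two_db = to_db(2)            # roughly 3 dB
share_of_scale = 3 / dynamic_range_db  # a 3 dB error on that scale
print(round(dynamic_range_db), round(factor_of_two_db, 2),
      round(100 * share_of_scale, 2))
```

   Eight orders of magnitude correspond to 80 dB, a factor of 2 is
   about 3.01 dB, and 3 dB is 3.75% of an 80 dB scale.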

1970	   This document gives a lot of latitude for calculating
1971	   target_run_length; however, people designing a TDS should consider the
1972	   effect of their choices on the ongoing tussle about the relevance of
1973	   "TCP friendliness" as an appropriate model for Internet capacity
1974	   allocation.  Choosing a target_run_length that is substantially
1975	   smaller than the reference target_run_length specified in Section 6.2
1976	   strengthens the argument that it may be appropriate to abandon "TCP
1977	   friendliness" as the Internet fairness model.  This gives developers
1978	   incentive and permission to develop even more aggressive applications
1979	   and protocols, for example by increasing the number of connections
1980	   that they open concurrently.

1982	A.1.  Queueless Reno

1984	   In Section 6.2 it was assumed that the link rate matches the target
1985	   rate plus overhead, such that the excess window needed for the AIMD
1986	   sawtooth causes a fluctuating queue at the bottleneck.

1988	   An alternate situation would be a bottleneck where there is no
1989	   significant queue and losses are caused by some mechanism that does
1990	   not involve extra delay, for example by the use of a virtual queue as
1991	   in Approximate Fair Dropping [AFD].  A flow controlled by such a
1992	   bottleneck would have a constant RTT and a data rate that fluctuates
1993	   in a sawtooth due to AIMD congestion control.  Assume the losses are
1994	   being controlled to make the average data rate meet some goal which
1995	   is equal to or greater than the target_rate.  The necessary run length
1996	   can be computed as follows:

1998	   For some value of Wmin, the window will sweep from Wmin packets to
1999	   2*Wmin packets in 2*Wmin RTT (due to delayed ACK).  Unlike the
2000	   queueing case where Wmin = target_pipe_size, we want the average of
2001	   Wmin and 2*Wmin to be the target_pipe_size, so the average rate is
2002	   the target rate.  Thus we want Wmin = (2/3)*target_pipe_size.

2004	   Between losses each sawtooth delivers (1/2)(Wmin+2*Wmin)(2Wmin)
2005	   packets in 2*Wmin round trip times.

2007	   Substituting these together we get:

2009	   target_run_length = (4/3)(target_pipe_size^2)

2011	   Note that this is 44% of the reference_run_length computed earlier.
2012	   This makes sense because under the assumptions in Section 6.2 the
2013	   AIMD sawtooth caused a queue at the bottleneck, which raised the
2014	   effective RTT by 50%.
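
   The substitution above can be verified with exact rational
   arithmetic.  The Python sketch below is illustrative only: the
   function names are assumptions, and the reference calculation is
   taken to be 3*target_pipe_size^2, which is consistent with Table 1.

```python
from fractions import Fraction


def queueless_run_length(target_pipe_size):
    """Packets delivered per sawtooth when there is no standing queue."""
    w_min = Fraction(2, 3) * target_pipe_size
    # Average window (Wmin + 2*Wmin)/2 held over 2*Wmin round trips.
    return Fraction(1, 2) * (w_min + 2 * w_min) * (2 * w_min)


def reference_run_length(target_pipe_size):
    """Assumed reference calculation: 3 * target_pipe_size^2."""
    return 3 * target_pipe_size ** 2


pipe = 11
queueless = queueless_run_length(pipe)
ratio = queueless / reference_run_length(pipe)
print(queueless, ratio)
```

   For any target_pipe_size the result equals
   (4/3)(target_pipe_size^2), and the ratio to the reference value is
   exactly 4/9, or about 44%.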

2016	Appendix B.  Complex Queueing

2018	   For many network technologies simple queueing models don't apply: the
2019	   network schedules, thins or otherwise alters the timing of ACKs and
2020	   data, generally to raise the efficiency of the channel allocation
2021	   when confronted with relatively widely spaced small ACKs.  These
2022	   efficiency strategies are ubiquitous for half duplex, wireless and
2023	   broadcast media.

2025	   Altering the ACK stream generally has two consequences: it raises the
2026	   effective bottleneck data rate, making slowstart burst at higher
2027	   rates (possibly as high as the sender's interface rate) and it
2028	   effectively raises the RTT by the average time that the ACKs and data
2029	   were delayed.  The first effect can be partially mitigated by
2030	   reclocking ACKs once they are beyond the bottleneck on the return
2031	   path to the sender, however this further raises the effective RTT.

2033	   The most extreme example of this sort of behavior would be a half
2034	   duplex channel that is not released as long as the end point currently
2035	   holding the channel has more traffic (data or ACKs) to send.  Such
2036	   environments cause self clocked protocols under full load to revert
2037	   to extremely inefficient stop and wait behavior, where they send an
2038	   entire window of data as a single burst on the forward path, followed
2039	   by the entire window of ACKs on the return path.  It is important to
2040	   note that due to self clocking, ill conceived channel allocation
2041	   mechanisms can increase the stress on upstream links in a long path:
2042	   they cause larger and faster bursts.

2044	   If a particular return path contains a link or device that alters the
2045	   ACK stream, then the entire path from the sender up to the bottleneck
2046	   must be tested at the burst parameters implied by the ACK scheduling
2047	   algorithm.  The most important parameter is the Effective Bottleneck
2048	   Data Rate, which is the average rate at which the ACKs advance
2049	   snd.una.  Note that thinning the ACKs (relying on the cumulative
2050	   nature of seg.ack to permit discarding some ACKs) implies an
2051	   effectively infinite bottleneck data rate.

2053	   Holding data or ACKs for channel allocation or other reasons (such as
2054	   forward error correction) always raises the effective RTT relative to
2055	   the minimum delay for the path.  Therefore it may be necessary to
2056	   replace target_RTT in the calculation in Section 6.2 by an
2057	   effective_RTT, which includes the target_RTT plus a term to account
2058	   for the extra delays introduced by these mechanisms.
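
   The effect of substituting an effective_RTT can be illustrated with
   the following Python sketch.  The 20 ms of extra delay is a
   hypothetical value chosen for this example, and the sketch assumes
   the reference calculation is 3*target_pipe_size^2 with the pipe size
   rounded up, which is consistent with Table 1.

```python
import math


def run_length_for_rtt(rate_bps, rtt_s, mtu_bytes, header_overhead_bytes):
    """Return (pipe_size, run_length) in packets for a given RTT."""
    payload_bits = 8 * (mtu_bytes - header_overhead_bytes)
    pipe_size = math.ceil(rate_bps * rtt_s / payload_bits)
    return pipe_size, 3 * pipe_size ** 2


target_rtt_s = 0.050
extra_delay_s = 0.020   # hypothetical channel allocation delay
effective_rtt_s = target_rtt_s + extra_delay_s

baseline = run_length_for_rtt(2.5e6, target_rtt_s, 1500, 64)
adjusted = run_length_for_rtt(2.5e6, effective_rtt_s, 1500, 64)
print(baseline, adjusted)
```

   In this hypothetical case the pipe size grows from 11 to 16 packets
   and the required run length from 363 to 768 packets, so the added
   delay substantially tightens the loss requirement.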

2060	Appendix C.  Version Control

2062	   This section to be removed prior to publication.

2064	   Formatted: Sat Jun 13 16:25:01 PDT 2015

2066	Authors' Addresses

2068	   Matt Mathis
2069	   Google, Inc
2070	   1600 Amphitheater Parkway
2071	   Mountain View, California  94043
2072	   USA

2074	   Email: mattmathis@google.com

2076	   Al Morton
2077	   AT&T Labs
2078	   200 Laurel Avenue South
2079	   Middletown, NJ  07748
2080	   USA

2082	   Phone: +1 732 420 1571
2083	   Email: acmorton@att.com
2084	   URI:   http://home.comcast.net/~acmacm/