IP Performance Working Group                                   M. Mathis
Internet-Draft                                               Google, Inc
Intended status: Experimental                                  A. Morton
Expires: August 18, 2014                                       AT&T Labs
                                                       February 14, 2014

                  Model Based Bulk Performance Metrics
               draft-ietf-ippm-model-based-metrics-02.txt

Abstract

   We introduce a new class of model based metrics designed to
   determine if an end-to-end Internet path can meet predefined
   transport performance targets by applying a suite of IP diagnostic
   tests to successive subpaths.  The subpath-at-a-time tests are
   designed to accurately detect if any subpath will prevent the full
   end-to-end path from meeting the specified target performance.  Each
   IP diagnostic test consists of a precomputed traffic pattern and
   statistical criteria for evaluating packet delivery.

   The IP diagnostic tests are based on traffic patterns that are
   precomputed to mimic TCP or another transport protocol operating
   over a long path, but are independent of the actual details of the
   subpath under test.  Likewise the success criteria depend on the
   target performance and not on the actual performance of the subpath.
   This makes the measurements open loop, eliminating nearly all of the
   difficulties encountered by traditional bulk transport metrics.

   This document does not fully define diagnostic tests, but provides a
   framework for designing suites of diagnostic tests that are tailored
   to confirming the target performance.

   By making the tests open loop, we eliminate the equilibrium behavior
   of standard congestion control, which otherwise causes every
   measured parameter to be sensitive to every component of the system.
   As an open loop test, various measurable properties become
   independent and potentially subject to an algebra, enabling several
   important new uses.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 18, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. TODO
   2. Terminology
   3. New requirements relative to RFC 2330
   4. Background
      4.1. TCP properties
      4.2. Diagnostic Approach
   5. Common Models and Parameters
      5.1. Target End-to-end parameters
      5.2. Common Model Calculations
      5.3. Parameter Derating
   6. Common testing procedures
      6.1. Traffic generating techniques
         6.1.1. Paced transmission
         6.1.2. Constant window pseudo CBR
         6.1.3. Scanned window pseudo CBR
         6.1.4. Concurrent or channelized testing
         6.1.5. Intermittent Testing
         6.1.6. Intermittent Scatter Testing
      6.2. Interpreting the Results
         6.2.1. Test outcomes
         6.2.2. Statistical criteria for measuring run_length
            6.2.2.1. Alternate criteria for measuring run_length
         6.2.3. Reordering Tolerance
      6.3. Test Qualifications
   7. Diagnostic Tests
      7.1. Basic Data Rate and Run Length Tests
         7.1.1. Run Length at Paced Full Data Rate
         7.1.2. Run Length at Full Data Windowed Rate
         7.1.3. Background Run Length Tests
      7.2. Standing Queue tests
         7.2.1. Congestion Avoidance
         7.2.2. Bufferbloat
         7.2.3. Non excessive loss
         7.2.4. Duplex Self Interference
      7.3. Slowstart tests
         7.3.1. Full Window slowstart test
         7.3.2. Slowstart AQM test
      7.4. Sender Rate Burst tests
      7.5. Combined Tests
         7.5.1. Sustained burst test
         7.5.2. Live Streaming Media
   8. Examples
      8.1. Near serving HD streaming video
      8.2. Far serving SD streaming video
      8.3. Bulk delivery of remote scientific data
   9. Validation
   10. Acknowledgements
   11. Informative References
   Appendix A. Model Derivations
      A.1. Queueless Reno
      A.2. CUBIC
   Appendix B. Complex Queueing
   Appendix C. Version Control
   Authors' Addresses

1. Introduction

   Bulk performance metrics evaluate an Internet path's ability to
   carry bulk data.  Model based bulk performance metrics rely on
   mathematical TCP models to design a targeted diagnostic suite (TDS)
   of IP performance tests which can be applied independently to each
   subpath of the full end-to-end path.  These targeted diagnostic
   suites allow independent tests of subpaths to accurately detect if
   any subpath will prevent the full end-to-end path from delivering
   bulk data at the specified performance target, independent of the
   measurement vantage points or other details of the test procedures
   used for each measurement.

   The end-to-end target performance is determined by the needs of the
   user or application, which are outside the scope of this document.
   For bulk data transport, the primary performance parameter of
   interest is the target data rate.  However, since TCP's ability to
   compensate for less than ideal network conditions is fundamentally
   affected by the Round Trip Time (RTT) and the Maximum Transmission
   Unit (MTU) of the entire end-to-end path that the data traverses,
   these parameters must also be specified in advance.  They may
   reflect a specific real path through the Internet or an idealized
   path representing a typical user community.  The target values for
   these three parameters, Data Rate, RTT and MTU, inform the
   mathematical models used to design the TDS.
   Each IP diagnostic test in a TDS consists of a precomputed traffic
   pattern and statistical criteria for evaluating packet delivery.

   Mathematical models are used to design traffic patterns that mimic
   TCP or another bulk transport protocol operating at the target data
   rate, MTU and RTT over a full range of conditions, including flows
   that are bursty at multiple time scales.  The traffic patterns are
   computed in advance based on the three target parameters of the
   end-to-end path, and independent of the properties of individual
   subpaths.  As much as possible the measurement traffic is generated
   deterministically in ways that minimize the extent to which test
   methodology, measurement points, measurement vantage or path
   partitioning affect the details of the measurement traffic.

   Mathematical models are also used to compute the bounds on the
   packet delivery statistics for acceptable IP performance.  Since
   these statistics, such as packet loss, are typically aggregated from
   all subpaths of the end-to-end path, the end-to-end statistical
   bounds need to be apportioned as a separate bound for each subpath.
   Note that links that are expected to be bottlenecks are expected to
   contribute more packet loss and/or delay.  In compensation, other
   links have to be constrained to contribute less packet loss and
   delay.  The criterion for passing each test of a TDS is an
   apportioned share of the total bound determined by the mathematical
   model from the end-to-end target performance.

   In addition to passing or failing, a test can be deemed to be
   inconclusive for a number of reasons, including: the precomputed
   traffic pattern was not accurately generated, the measurement
   results were not statistically significant, or some test
   precondition was not met.

   This document describes a framework for deriving traffic patterns
   and delivery statistics for model based metrics.  It does not fully
   specify any measurement techniques.  Important details such as
   packet type-p selection, sampling techniques, vantage selection,
   etc. are not specified here.  We imagine Fully Specified Targeted
   Diagnostic Suites (FSTDS) that define all of these details.  We use
   TDS to refer to the subset of such a specification that is in scope
   for this document.  A TDS includes the target parameters,
   documentation of the models and assumptions used to derive the
   diagnostic test parameters, specifications for the traffic and
   delivery statistics for the tests themselves, and a description of a
   test setup that can be used to validate the tests and models.

   Section 2 defines terminology used throughout this document.

   It has been difficult to develop Bulk Transport Capacity [RFC3148]
   metrics due to some overlooked requirements described in Section 3
   and some intrinsic problems with using protocols for measurement,
   described in Section 4.

   In Section 5 we describe the models and common parameters used to
   derive the targeted diagnostic suite.  In Section 6 we describe
   common testing procedures.  Each subpath is evaluated using a suite
   of far simpler and more predictable diagnostic tests described in
   Section 7.
   In Section 8 we present three example TDSs: one that might be
   representative of HD video served fairly close to the user, a second
   that might be representative of standard video served from a greater
   distance, and a third that might be representative of high
   performance bulk data delivered over a transcontinental path.

   There exists a small risk that the model based metrics themselves
   might yield a false pass result, in the sense that every subpath of
   an end-to-end path passes every IP diagnostic test and yet a real
   application fails to attain the performance target over the
   end-to-end path.  If this happens, then the validation procedure
   described in Section 9 needs to be used to prove and potentially
   revise the models.

   Future documents will define model based metrics for other traffic
   classes and application types, such as real time streaming media.

1.1. TODO

   Please send comments on this draft to ippm@ietf.org.  See
   http://goo.gl/02tkD for more information including: interim drafts,
   an up-to-date todo list and information on contributing.

2. Terminology

   Terminology about paths, etc.: see [RFC2330] and
   [I-D.morton-ippm-lmap-path].

   [data] sender:  Host sending data and receiving ACKs.
   [data] receiver:  Host receiving data and sending ACKs.
   subpath:  A portion of the full path.  Note that there is no
      requirement that subpaths be non-overlapping.
   Measurement Point:  Measurement points as described in
      [I-D.morton-ippm-lmap-path].
   test path:  A path between two measurement points that includes a
      subpath of the end-to-end path under test, and could include
      infrastructure between the measurement points and the subpath.
   [Dominant] Bottleneck:  The bottleneck that generally dominates
      traffic statistics for the entire path.  It typically determines
      a flow's self clock timing, packet loss and ECN marking rate.
      See Section 4.1.
   front path:  The subpath from the data sender to the dominant
      bottleneck.
   back path:  The subpath from the dominant bottleneck to the
      receiver.
   return path:  The path taken by the ACKs from the data receiver to
      the data sender.
   cross traffic:  Other, potentially interfering, traffic competing
      for resources (network and/or queue capacity).

   The following properties are determined by the end-to-end path and
   the application.  They are described in more detail in Section 5.1.

   Application Data Rate:  General term for the data rate as seen by
      the application above the transport layer.  This is the payload
      data rate, and excludes transport and lower-level headers (TCP/IP
      or other protocols) as well as retransmissions and other data
      that does not contribute to the total quantity of data delivered
      to the application.
   Link Data Rate:  General term for the data rate as seen by the link
      or lower layers.  The link data rate includes transport and IP
      headers, retransmits and other transport layer overhead.  This
      document is agnostic as to whether the link data rate includes or
      excludes framing, MAC, or other lower layer overheads, except
      that they must be treated uniformly.
   end-to-end target parameters:  Application or transport performance
      goals for the end-to-end path.  They include the target data
      rate, RTT and MTU described below.
   Target Data Rate:  The application data rate, typically the ultimate
      user's performance goal.
   Target RTT (Round Trip Time):  The baseline (minimum) RTT of the
      longest end-to-end path over which the application expects to
      meet the target performance.  TCP and other transport protocols'
      ability to compensate for path problems is generally proportional
      to the number of round trips per second.  The Target RTT
      determines both key parameters of the traffic patterns (e.g.
      burst sizes) and the thresholds on acceptable traffic statistics.
      The Target RTT must be specified considering authentic packet
      sizes: MTU sized packets on the forward path, ACK sized packets
      (typically the header_overhead) on the return path.
   Target MTU (Maximum Transmission Unit):  The maximum MTU supported
      by the end-to-end path over which the application expects to meet
      the target performance.  Assume 1500 Byte packets unless
      otherwise specified.  If some subpath forces a smaller MTU, then
      it becomes the target MTU, and all model calculations and subpath
      tests must use the same smaller MTU.
   Effective Bottleneck Data Rate:  This is the bottleneck data rate
      inferred from the ACK stream, by looking at how much data the ACK
      stream reports delivered per unit time.  If the path is thinning
      ACKs or batching packets, the effective bottleneck rate can be
      much higher than the average link rate.  See Section 4.1 and
      Appendix B for more details.
   [sender | interface] rate:  The burst data rate, constrained by the
      data sender's interfaces.  Today 1 or 10 Gb/s are typical.
   Header_overhead:  The IP and TCP header sizes, which are the portion
      of each MTU not available for carrying application payload.
      Without loss of generality this is assumed to be the size of the
      returning acknowledgements (ACKs).  For TCP, the Maximum Segment
      Size (MSS) is the Target MTU minus the header_overhead.

   The following are basic parameters common to the models and subpath
   tests.  They are described in more detail in Section 5.2.  Note that
   these are mixed between application transport performance (which
   excludes headers) and link IP performance (which includes headers).

   pipe size:  A general term for the number of packets needed in
      flight (the window size) to exactly fill some network path or
      subpath.  This is normally the window size at the onset of
      queueing.
   target_pipe_size:  The number of packets in flight (the window size)
      needed to exactly meet the target rate, with a single stream and
      no cross traffic, for the specified application target data rate,
      RTT, and MTU.  It is the amount of circulating data required to
      meet the target data rate, and implies the scale of the bursts
      that the network might experience.
   run length:  A general term for the observed, measured, or specified
      number of packets that are (to be) delivered between losses or
      ECN marks.  Nominally one over the loss or ECN marking
      probability, if losses or marks are independently and identically
      distributed.
   target_run_length:  The target_run_length is an estimate of the
      minimum required headway between losses or ECN marks necessary to
      attain the target_data_rate over a path with the specified
      target_RTT and target_MTU, as computed by a mathematical model of
      TCP congestion control.  A reference calculation is shown in
      Section 5.2 and alternatives in Appendix A.

   The following are ancillary parameters used for some tests.

   derating:  Under some conditions the standard models are too
      conservative.
      The modeling framework permits some latitude in relaxing or
      derating some test parameters as described in Section 5.3, in
      exchange for a more stringent TDS validation procedure, described
      in Section 9.
   subpath_data_rate:  The maximum IP data rate supported by a subpath.
      This typically includes TCP/IP overhead, including headers,
      retransmits, etc.
   test_path_RTT:  The RTT between two measurement points, using
      appropriate data and ACK packet sizes.
   test_path_pipe:  The amount of data necessary to fill a test path.
      Nominally the test path RTT times the subpath_data_rate (where
      the subpath should be part of the end-to-end path).
   test_window:  The window necessary to meet the target_rate over a
      subpath.  Typically test_window = target_data_rate * test_RTT /
      (target_MTU - header_overhead).

   Tests can be classified into groups according to their
   applicability.

   Capacity tests:  determine if a network subpath has sufficient
      capacity to deliver the target performance.  As long as the test
      traffic is within the proper envelope for the target end-to-end
      performance, the average packet losses or ECN marks must be below
      the threshold computed by the model.  As such, capacity tests
      reflect parameters that can transition from passing to failing as
      a consequence of cross traffic, additional presented load or the
      actions of other network users.  By definition, capacity tests
      also consume significant network resources (data capacity and/or
      buffer space), and the test schedules must be balanced against
      their cost.
   Monitoring tests:  are designed to capture the most important
      aspects of a capacity test, but without presenting excessive
      ongoing load themselves.  As such they may miss some details of
      the network's performance, but can serve as a useful reduced-cost
      proxy for a capacity test.
   Engineering tests:  evaluate how network algorithms (such as AQM and
      channel allocation) interact with TCP-style self clocked
      protocols and adaptive congestion control based on packet loss
      and ECN marks.  These tests are likely to have complicated
      interactions with other traffic and under some conditions can be
      inversely sensitive to load.  For example, a test to verify that
      an AQM algorithm causes ECN marks or packet drops early enough to
      limit queue occupancy may experience a false pass result in the
      presence of bursty cross traffic.  It is important that
      engineering tests be performed under a wide range of conditions,
      including both in situ and bench testing, and over a wide variety
      of load conditions.  Ongoing monitoring is less likely to be
      useful for engineering tests, although sparse in situ testing
      might be appropriate.

   General Terminology:

   Targeted Diagnostic Suite (TDS):  A set of IP diagnostics designed
      to determine if a subpath can sustain flows at a specific
      target_data_rate over a path that has a target_RTT, using
      target_MTU sized packets.
   Fully Specified Targeted Diagnostic Suite (FSTDS):  A TDS together
      with additional specifications such as "type-p", which are out of
      scope for this document but need to be drawn from other standards
      documents.
   apportioned:  To divide and allocate, as in budgeting packet loss
      rates across multiple subpaths such that they accumulate to less
      than a specified end-to-end loss rate.
   open loop:  A control theory term used to describe a class of
      techniques where systems that exhibit circular dependencies can
      be analyzed by suppressing some of the dependencies, such that
      the resulting dependency graph is acyclic.
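   The relationships among these parameters are simple arithmetic.  The
   following non-normative sketch (in Python) illustrates the
   test_window formula above; all parameter values are hypothetical,
   and the bits-to-bytes conversion is our own assumption since the
   formula itself is unit-agnostic:

      # Illustrative only: compute test_window as defined above.
      target_data_rate = 5e6    # bits/s of application payload (assumed)
      test_path_RTT = 0.030     # seconds (assumed)
      target_MTU = 1500         # bytes
      header_overhead = 52      # bytes of TCP/IP headers (assumed)

      mss = target_MTU - header_overhead     # payload bytes per packet

      # test_window = target_data_rate * test_RTT /
      #               (target_MTU - header_overhead)
      test_window = (target_data_rate / 8) * test_path_RTT / mss
      print(round(test_window), "packet window")   # 13 packets here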
3. New requirements relative to RFC 2330

   Model Based Metrics are designed to fulfill some additional
   requirements that were not recognized at the time RFC 2330 was
   written [RFC2330].  These missing requirements may have
   significantly contributed to policy difficulties in the IP
   measurement space.  Some additional requirements are:
   o  IP metrics must be actionable by the ISP - they have to be
      interpreted in terms of behaviors or properties at the IP or
      lower layers that an ISP can test, repair and verify.
   o  Metrics must be vantage point invariant over a significant range
      of measurement point choices, including off path measurement
      points.  The only requirements on MP selection should be that the
      portion of the test path that is not under test is effectively
      ideal (or is non-ideal in ways that can be calibrated out of the
      measurements) and that the test RTT between the MPs is below some
      reasonable bound.
   o  Metrics must be repeatable by multiple parties with no
      specialized access to MPs or diagnostic infrastructure.  It must
      be possible for different parties to make the same measurement
      and observe the same results.  In particular it is specifically
      important that both a consumer (or their delegate) and the ISP be
      able to perform the same measurement and get the same result.

   NB: All of the metric requirements in RFC 2330 should be reviewed
   and potentially revised.  If such a document is opened soon enough,
   this entire section should be dropped.

4. Background

   At the time the IPPM WG was chartered, sound Bulk Transport Capacity
   measurement was known to be beyond our capabilities.  In hindsight
   it is now clear why it is such a hard problem:
   o  TCP is a control system with circular dependencies - everything
      affects performance, including components that are explicitly not
      part of the test.
   o  Congestion control is an equilibrium process, such that transport
      protocols change the network (raise the loss probability and/or
      RTT) to conform to their behavior.
   o  TCP's ability to compensate for network flaws is directly
      proportional to the number of round trips per second (i.e.
      inversely proportional to the RTT).  As a consequence a flawed
      link may pass a short RTT local test even though it fails when
      the path is extended by a perfect network to some larger RTT.
   o  TCP has a meta Heisenberg problem - measurement and cross traffic
      interact in unknown and ill-defined ways.  The situation is
      actually worse than the traditional physics problem, where you
      can at least estimate the relative momentum of the measurement
      and measured particles.  For network measurement you cannot in
      general determine the relative "elasticity" of the measurement
      traffic and cross traffic, so you cannot even gauge the relative
      magnitude of their effects on each other.

   These properties are a consequence of the equilibrium behavior
   intrinsic to how all throughput optimizing protocols interact with
   the network.  The protocols rely on control systems based on
   multiple network estimators to regulate the quantity of data sent
   into the network.
   The data in turn alters the network and the properties observed by
   the estimators, such that there are circular dependencies between
   every component and every property.  Since some of these estimators
   are nonlinear, the entire system is nonlinear, and any change
   anywhere causes difficult to predict changes in every parameter.

   Model Based Metrics overcome these problems by forcing the
   measurement system to be open loop: the delivery statistics (akin to
   the network estimators) do not affect the traffic.  The traffic and
   traffic patterns (bursts) are computed on the basis of the target
   performance.  In order for a network to pass, the resulting delivery
   statistics and corresponding network estimators have to be such that
   they would not cause the control systems to slow the traffic below
   the target rate.

4.1. TCP properties

   TCP and SCTP are self clocked protocols.  The dominant steady state
   behavior is to have an approximately fixed quantity of data and
   acknowledgements (ACKs) circulating in the network.  The receiver
   reports arriving data by returning ACKs to the data sender, and the
   data sender typically responds by sending exactly the same quantity
   of data back into the network.  The total quantity of data plus the
   data represented by ACKs circulating in the network is referred to
   as the window.  The mandatory congestion control algorithms
   incrementally adjust the window by sending slightly more or less
   data in response to each ACK.  The fundamentally important property
   of this system is that it is entirely self clocked: the data
   transmissions are a reflection of the ACKs that were delivered by
   the network, and the ACKs are a reflection of the data arriving from
   the network.

   A number of phenomena can cause bursts of data, even in idealized
   networks that are modeled as simple queueing systems.

   During slowstart the data rate is doubled on each RTT by sending
   twice as much data as was delivered to the receiver on the prior
   RTT.  For slowstart to be able to fill such a network, the network
   must be able to tolerate slowstart bursts up to the full pipe size
   inflated by the anticipated window reduction on the first loss or
   ECN mark.  For example, with classic Reno congestion control, an
   optimal slowstart has to end with a burst that is twice the
   bottleneck rate for exactly one RTT in duration.  This burst causes
   a queue which is exactly equal to the pipe size (i.e. the window is
   exactly twice the pipe size), so when the window is halved in
   response to the first loss, the new window will be exactly the pipe
   size.

   Note that if the bottleneck data rate is significantly slower than
   the rest of the path, the slowstart bursts will not cause
   significant queues anywhere else along the path; they primarily
   exercise the queue at the dominant bottleneck.

   Other sources of bursts include application pauses and channel
   allocation mechanisms.  Appendix B describes the treatment of
   channel allocation systems.  If the application pauses (stops
   reading or writing data) for some fraction of one RTT,
   state-of-the-art TCP catches up to the earlier window size by
   sending a burst of data at the full sender interface rate.  To fill
   such a network with a realistic application, the network has to be
   able to tolerate interface rate bursts from the data sender large
   enough to cover application pauses.

   Although the interface rate bursts are typically smaller than the
   last burst of a slowstart, they are at a higher data rate so they
   potentially exercise queues at arbitrary points along the front path
   from the data sender up to and including the queue at the dominant
   bottleneck.  There is no model for how frequent or what sizes of
   sender rate bursts should be tolerated.

   To verify that a path can meet a performance target, it is necessary
   to independently confirm that the path can tolerate bursts in the
   dimensions that can be caused by these mechanisms.  Three cases are
   likely to be sufficient:

   o  Slowstart bursts sufficient to get connections started properly.
   o  Frequent sender interface rate bursts that are small enough that
      they can be assumed not to significantly affect delivery
      statistics.  (Implicitly derated by selecting the burst size.)
   o  Infrequent sender interface rate full target_pipe_size bursts
      that do affect the delivery statistics.  (Target_run_length is
      derated.)
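   To make these three cases concrete, the following non-normative
   sketch computes the burst dimensions implied by the discussion
   above.  The parameter values and the 4 packet granularity are
   assumptions for illustration only; Section 5.2 defines
   target_pipe_size precisely:

      from math import ceil

      # Hypothetical target parameters (illustration only)
      target_rate = 1e7      # bits/s
      target_RTT = 0.100     # seconds
      mss = 1448             # payload bytes per packet (assumed)

      target_pipe_size = ceil(target_rate / 8 * target_RTT / mss)  # 87

      # Case 1: a slowstart burst large enough to start the connection.
      # Under Reno, the final slowstart burst arrives at twice the
      # bottleneck rate for one RTT, leaving a queue equal to the pipe
      # size, so the window peaks at twice the pipe size.
      slowstart_window_peak = 2 * target_pipe_size
      queue_at_slowstart_end = slowstart_window_peak - target_pipe_size

      # Case 2: frequent sender interface rate bursts, implicitly
      # derated by keeping the burst size small (e.g. 4 packets).
      small_burst = 4        # packets, an assumed choice

      # Case 3: infrequent full-sized sender interface rate bursts.
      full_burst = target_pipe_size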
4.2. Diagnostic Approach

   The MBM approach is to open loop TCP by precomputing traffic
   patterns that are typically generated by TCP operating at the given
   target parameters, and evaluating delivery statistics (packet loss,
   ECN marks and delay).  In this approach the measurement software
   explicitly controls the data rate, transmission pattern or cwnd
   (TCP's primary congestion control state variables) to create
   repeatable traffic patterns that mimic TCP behavior but are
   independent of the actual behavior of the subpath under test.  These
   patterns are manipulated to probe the network to verify that it can
   deliver all of the traffic patterns that a transport protocol is
   likely to generate under normal operation at the target rate and
   RTT.

   By opening the protocol control loops, we remove most sources of
   temporal and spatial correlation in the traffic delivery statistics,
   such that each subpath's contribution to the end-to-end statistics
   can be assumed to be independent and stationary.  (The delivery
   statistics depend on the fine structure of the data transmissions,
   but not on long time scale state embedded in the sender, receiver or
   other network components.)  Therefore each subpath's contribution to
   the end-to-end delivery statistics can be assumed to be independent,
   and spatial composition techniques such as [RFC5835] apply.

   In typical networks, the dominant bottleneck contributes the
   majority of the packet loss and ECN marks.  Often the rest of the
   path makes an insignificant contribution to these properties.  A TDS
   should apportion the end-to-end budget for the specified parameters
   (primarily packet loss and ECN marks) to each subpath or group of
   subpaths.  For example the dominant bottleneck may be permitted to
   contribute 90% of the loss budget, while the rest of the path is
   only permitted to contribute 10%.

   A TDS or FSTDS MUST apportion all relevant packet delivery
   statistics between different subpaths, such that the spatial
   composition of the metrics yields end-to-end statistics which are
   within the bounds determined by the models.
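   As an illustration, the apportionment and its spatial composition
   check might be sketched as follows (non-normative; the 90/10 split
   is the example from the preceding paragraph, and the target value is
   hypothetical):

      # Illustrative only: apportioning an end-to-end loss budget.
      target_run_length = 22707                  # packets (assumed)
      end_to_end_loss_budget = 1.0 / target_run_length

      shares = {"dominant bottleneck": 0.90, "rest of path": 0.10}
      subpath_budgets = {name: share * end_to_end_loss_budget
                         for name, share in shares.items()}

      # Spatial composition: for small loss ratios, the end-to-end loss
      # ratio is approximately the sum of the subpath loss ratios (cf.
      # [RFC5835]), so the budgets must sum to the end-to-end bound.
      assert abs(sum(subpath_budgets.values())
                 - end_to_end_loss_budget) < 1e-15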
   A network is expected to be able to sustain a Bulk TCP flow of a
   given data rate, MTU and RTT when the following conditions are met:
   o  The raw link rate is higher than the target data rate.
   o  The observed run length is larger than required by a suitable TCP
      performance model.
   o  There is sufficient buffering at the dominant bottleneck to
      absorb a slowstart rate burst large enough to get the flow out of
      slowstart at a suitable window size.
   o  There is sufficient buffering in the front path to absorb and
      smooth sender interface rate bursts at all scales that are likely
      to be generated by the application, any channel arbitration in
      the ACK path or other mechanisms.
   o  When there is a standing queue at a bottleneck for a shared media
      subpath, there are suitable bounds on how the data and ACKs
      interact, for example due to the channel arbitration mechanism.
   o  When there is a slowly rising standing queue at the bottleneck,
      the onset of packet loss has to be at an appropriate point (time
      or queue depth) and progressive.  This typically requires some
      form of Active Queue Management [RFC2309].

   We are developing a tool that can perform many of the tests
   described here [MBMSource].

5. Common Models and Parameters

5.1. Target End-to-end parameters

   The target end-to-end parameters are the target data rate, target
   RTT and target MTU as defined in Section 2.  These parameters are
   determined by the needs of the application or the ultimate end user
   and the end-to-end Internet path over which the application is
   expected to operate.  The target parameters are in units that make
   sense to upper layers: payload bytes delivered to the application,
   above TCP.  They exclude overheads associated with TCP and IP
   headers, retransmits and other protocols (e.g. DNS).

   Other end-to-end parameters defined in Section 2 include the
   effective bottleneck data rate, the sender interface data rate and
   the TCP/IP header sizes (overhead).

   The target data rate must be smaller than all link data rates by
   enough headroom to carry the transport protocol overhead, explicitly
   including retransmissions, and an allowance for fluctuations in the
   actual data rate, needed to meet the specified average rate.
   Specifying a target rate with insufficient headroom is likely to
   result in brittle measurements having little predictive value.

   Note that the target parameters can be specified for a hypothetical
   path, for example to construct a TDS designed for bench testing in
   the absence of a real application, or for a real physical test, for
   in situ testing of production infrastructure.

   The number of concurrent connections is explicitly not a parameter
   to this model.  If a subpath requires multiple connections in order
   to meet the specified performance, that must be stated explicitly
   and the procedure described in Section 6.1.4 applies.

5.2. Common Model Calculations

   The end-to-end target parameters are used to derive the
   target_pipe_size and the reference target_run_length.

   The target_pipe_size is the average window size in packets needed to
   meet the target rate, for the specified target RTT and MTU.  It is
   given by:

      target_pipe_size = target_rate * target_RTT /
                         ( target_MTU - header_overhead )

   Target_run_length is an estimate of the minimum required headway
   between losses or ECN marks, as computed by a mathematical model of
   TCP congestion control.  The derivation here follows [MSMO97], and
   by design is quite conservative.  The alternate models described in
   Appendix A generally yield smaller run_lengths (higher loss rates),
   but may not apply in all situations.  In any case alternate models
   should be compared to the reference target_run_length computed here.

   The reference target_run_length is derived as follows: assume the
   subpath_data_rate is infinitesimally larger than the
   target_data_rate plus the required header_overhead.  Then
   target_pipe_size also predicts the onset of queueing.  A larger
   window will cause a standing queue at the bottleneck.

   Assume the transport protocol is using standard Reno style Additive
   Increase, Multiplicative Decrease congestion control [RFC5681] (but
   not Appropriate Byte Counting [RFC3465]) and the receiver is using
   standard delayed ACKs.  Reno increases the window by one packet
   every pipe_size worth of ACKs.  With delayed ACKs this takes 2 Round
   Trip Times per increase.  To exactly fill the pipe, losses must be
   no closer together than when the peak of the AIMD sawtooth reaches
   exactly twice the target_pipe_size; otherwise the multiplicative
   window reduction triggered by a loss would cause the network to be
   underfilled.  Following [MSMO97], the number of packets between
   losses must be the area under the AIMD sawtooth.  Losses must be no
   more frequent than 1 in ((3/2)*target_pipe_size)*
   (2*target_pipe_size) packets, which simplifies to:

      target_run_length = 3*(target_pipe_size^2)

   Note that this calculation is very conservative and is based on a
   number of assumptions that may not apply.  Appendix A discusses
   these assumptions and provides some alternative models.  If a less
   conservative model is used, a fully specified TDS or FSTDS MUST
   document the actual method for computing target_run_length, along
   with the rationale for the underlying assumptions and the ratio of
   the chosen target_run_length to the reference target_run_length
   calculated above.

   These two parameters, target_pipe_size and target_run_length,
   directly imply most of the individual parameters for the tests in
   Section 7.
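   The reference calculations are easy to reproduce.  The following
   non-normative sketch evaluates both formulas for one hypothetical
   set of target parameters (the unit conversions are our own
   assumptions, since the formulas above are unit-agnostic):

      from math import ceil

      target_rate = 1e7         # target_data_rate, bits/s (assumed)
      target_RTT = 0.100        # seconds (assumed)
      target_MTU = 1500         # bytes
      header_overhead = 52      # bytes (assumed)

      mss = target_MTU - header_overhead   # payload bytes per packet

      # target_pipe_size = target_rate * target_RTT /
      #                    (target_MTU - header_overhead)
      target_pipe_size = ceil(target_rate / 8 * target_RTT / mss)

      # Reference model: target_run_length = 3 * target_pipe_size^2
      target_run_length = 3 * target_pipe_size ** 2

      print(target_pipe_size)   # 87 packets
      print(target_run_length)  # 22707 packets between losses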
5.3. Parameter Derating

   Since some aspects of the models are very conservative, this
   framework permits some latitude in derating test parameters.  Rather
   than trying to formalize more complicated models, we permit some
   test parameters to be relaxed as long as they meet some additional
   procedural constraints:
   o  The TDS or FSTDS MUST document and justify the actual method used
      to compute the derated metric parameters.
   o  The validation procedures described in Section 9 must be used to
      demonstrate the feasibility of meeting the performance targets
      with infrastructure that infinitesimally passes the derated
      tests.
   o  The validation process itself must be documented in such a way
      that other researchers can duplicate the validation experiments.

   Except as noted, all tests below assume no derating.  Tests where
   there is not currently a well established model for the required
   parameters explicitly include derating as a way to indicate
   flexibility in the parameters.

6. Common testing procedures

6.1. Traffic generating techniques

6.1.1. Paced transmission

   Paced (burst) transmissions: send bursts of data on a timer to meet
   a particular target rate and pattern.  In all cases the specified
   data rate can be either the application or the link rate.  Header
   overheads must be included in the calculations as appropriate.

   Paced single packets:  Send individual packets at the specified rate
      or headway.
   Bursts:  Send sender interface rate bursts on a timer.  Specify any
      3 of: average rate, packet size, burst size (number of packets)
      and burst headway (burst start to start).  These bursts are
      typically sent as back-to-back packets at the tester's interface
      rate.
   Slowstart bursts:  Send 4 packet sender interface rate bursts at an
      average data rate equal to twice the effective bottleneck link
      rate (but not more than the sender interface rate).  This
      corresponds to the average rate during a TCP slowstart when
      Appropriate Byte Counting [RFC3465] is present or delayed ACK is
      disabled.  Note that if the effective bottleneck link rate is
      more than half of the sender interface rate, slowstart bursts
      become sender interface rate bursts.
   Repeated Slowstart bursts:  Slowstart bursts are typically part of a
      larger scale pattern of repeated bursts, such as sending
      target_pipe_size packets as slowstart bursts on a target_RTT
      headway (burst start to burst start).  Such a stream has three
      different average rates, depending on the averaging interval.  At
      the finest time scale the average rate is the same as the sender
      interface rate, at a medium scale the average rate is twice the
      effective bottleneck link rate, and at the longest time scales
      the average rate is equal to the target data rate.
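   The burst arithmetic above is straightforward.  The following
   non-normative sketch (hypothetical values throughout) derives the
   fourth pacing parameter from the other three, and the slowstart
   burst headway:

      # Illustrative only: given average rate, packet size and burst
      # size, derive the burst headway (burst start to burst start).
      average_rate = 5e6        # bits/s at the link layer (assumed)
      packet_size = 1500        # bytes, including headers
      burst_size = 8            # packets per burst (assumed)

      burst_headway = burst_size * packet_size * 8 / average_rate
      # 0.0192 s between burst starts for these values

      # Slowstart bursts: 4 packet bursts at twice the effective
      # bottleneck link rate, capped at the sender interface rate.
      effective_bottleneck_rate = 20e6     # bits/s (assumed)
      sender_interface_rate = 1e9          # bits/s (assumed)
      slowstart_rate = min(2 * effective_bottleneck_rate,
                           sender_interface_rate)
      slowstart_headway = 4 * packet_size * 8 / slowstart_rate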
   Note that in conventional measurement theory, exponential
   distributions are often used to eliminate many sorts of
   correlations.  For the procedures above, the correlations are
   created by the network elements and accurately reflect their
   behavior.  At some point in the future, it may be desirable to
   introduce noise sources into the above pacing models, but they are
   not warranted at this time.

6.1.2. Constant window pseudo CBR

   Implement pseudo constant bit rate by running a standard protocol
   such as TCP with a fixed bound on the window size.  The rate is only
   maintained on average over each RTT, and is subject to limitations
   of the transport protocol.

   The bound on the window size is computed from the target_data_rate
   and the actual RTT of the test path.

   If the transport protocol fails to maintain the test rate within
   prescribed limits, the test would typically be considered
   inconclusive or failing, depending on what mechanism caused the
   reduced rate.  See the discussion of test outcomes in Section 6.2.1.

6.1.3. Scanned window pseudo CBR

   Same as the above, except the window is scanned across a range of
   sizes designed to include two key events: the onset of queueing and
   the onset of packet loss or ECN marks.  The window is scanned by
   incrementing it by one packet for every 2*target_pipe_size delivered
   packets.  This mimics the additive increase phase of standard
   congestion avoidance and normally separates the window increases by
   approximately twice the target_RTT.

   There are two versions of this test: one built by applying a window
   clamp to standard congestion control and one built by stiffening a
   non-standard transport protocol.  When standard congestion control
   is in effect, any losses or ECN marks cause the transport to revert
   to a window smaller than the clamp, such that the scanning clamp
   loses control of the window size.  The NPAD pathdiag tool is an
   example of this class of algorithms [Pathdiag].

   Alternatively, a non-standard congestion control algorithm can
   respond to losses by transmitting extra data, such that it maintains
   the specified window size independent of losses or ECN marks.  Such
   a stiffened transport explicitly violates mandatory Internet
   congestion control and is not suitable for in situ testing.  It is
   only appropriate for engineering testing under laboratory
   conditions.  The Windowed Ping tool implemented such a test [WPING].
   This tool has been updated and is under test [mpingSource].

   The test procedures in Section 7.2 describe how to partition the
   scans into regions and how to interpret the results.
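   A minimal sketch of the scanning schedule follows (non-normative; it
   assumes a transport under test that honors a window clamp, and omits
   the instrumentation that records the onset of queueing and of losses
   or ECN marks):

      # Illustrative only: raise the window clamp by one packet per
      # 2*target_pipe_size delivered packets, mimicking one additive
      # increase per two target_RTTs.
      target_pipe_size = 87                # packets (assumed)
      window = target_pipe_size // 2       # assumed starting clamp
      delivered_since_increase = 0

      def on_packet_delivered():
          global window, delivered_since_increase
          delivered_since_increase += 1
          if delivered_since_increase >= 2 * target_pipe_size:
              window += 1                  # one scan step
              delivered_since_increase = 0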
6.1.4. Concurrent or channelized testing

   The procedures described in this document are only directly
   applicable to single stream performance measurement, e.g. one TCP
   connection.  In an ideal world, we would disallow all performance
   claims based on multiple concurrent streams, but this is not
   practical due to at least two different issues.  First, many very
   high rate link technologies are channelized and pin individual flows
   to specific channels to minimize reordering or other problems, and
   second, TCP itself has scaling limits.  Although the former problem
   might be overcome through different design decisions, the latter
   problem is more deeply rooted.

   All standard [RFC5681] and de facto standard congestion control
   algorithms [CUBIC] have scaling limits, in the sense that as a long
   fast network (LFN) with a fixed RTT and MTU gets faster, all
   congestion control algorithms get less accurate and as a consequence
   have difficulty filling the network [SlowScaling].  These properties
   are a consequence of the original Reno AIMD congestion control
   design and the requirement in RFC 5681 that all transport protocols
   have a uniform response to congestion.

   There are a number of reasons to want to specify performance in
   terms of multiple concurrent flows; however this approach is not
   recommended for data rates below several Mb/s, which can be attained
   with run lengths under 10000 packets.  Since run length goes as the
   square of the data rate, at higher rates the run lengths can be
   unfeasibly large, and multiple connections might be the only
   feasible approach.  For an example of this problem see Section 8.3.

   If multiple connections are deemed necessary to meet aggregate
   performance targets, then this MUST be stated both in the design of
   the TDS and in any claims about network performance.  The tests MUST
   be performed concurrently with the specified number of connections.
   For tests that use bursty traffic, the bursts should be synchronized
   across flows.

6.1.5. Intermittent Testing

   Any test which does not depend on queueing (e.g. the CBR tests) or
   experiences periodic zero outstanding data during normal operation
   (e.g. between bursts for the various burst tests) can be formulated
   as an intermittent test, to reduce the perceived impact on other
   traffic.  The approach is to insert periodic pauses in the test at
   any point when there is no expected queue occupancy.

   Intermittent testing can be used for ongoing monitoring for changes
   in subpath quality with minimal disruption to users.  However it is
   not suitable in environments where there are reactive links
   [REACTIVE].
6.1.6. Intermittent Scatter Testing

   Intermittent scatter testing is a technique for non-disruptively
   evaluating the front path from a sender to a subscriber aggregation
   point within an ISP at full load, by intermittently testing across a
   pool of subscriber access links such that each subscriber sees a
   tolerable test traffic load.  The load on the front path should be
   limited to no more than that which would be caused by a single test
   to a subscriber known to be otherwise idle.  In aggregate this test
   mimics a full load test from a content provider to the aggregation
   point.

   Intermittent scatter testing can be used to reduce the measurement
   noise introduced by unknown traffic on customer access links.

6.2. Interpreting the Results

6.2.1. Test outcomes

   To perform an exhaustive test of an end-to-end network path, each
   test of the TDS is applied to each subpath of the end-to-end path.
   If any subpath fails any test, then an application running over the
   end-to-end path can also be expected to fail to attain the target
   performance under some conditions.

   In addition to passing or failing, a test can be deemed to be
   inconclusive for a number of reasons.  Proper instrumentation and
   treatment of inconclusive outcomes is critical to the accuracy and
   robustness of Model Based Metrics.  Tests can be inconclusive if the
   precomputed traffic pattern was not accurately generated, if the
   measurement results were not statistically significant, or for other
   causes such as failing to meet some required preconditions for the
   test.

   For example, consider a test that implements Constant Window Pseudo
   CBR (Section 6.1.2) by adding rate controls and detailed traffic
   instrumentation to TCP (e.g. [RFC4898]).  TCP includes built-in
   control systems which might interfere with the sending data rate.
   If such a test meets the run length specification while failing to
   attain the specified data rate, it must be treated as an
   inconclusive result, because we cannot a priori determine whether
   the reduced data rate was caused by a TCP problem or a network
   problem, or whether the reduced data rate had a material effect on
   the run length measurement itself.

   Note that for load tests such as this example, an observed run
   length that is too small can be considered to have failed the test,
   because it doesn't really matter that the test didn't attain the
   required data rate.

   The really important new properties of MBM, such as vantage
   independence, are a direct consequence of opening the control loops
   in the protocols, such that the test traffic does not depend on
   network conditions or traffic received.  Any mechanism that
   introduces feedback between the traffic measurements and the traffic
   generation is at risk of introducing nonlinearities that spoil these
   properties.  Any exceptional event that indicates that such feedback
   has happened should cause the test to be considered inconclusive.

   One way to view inconclusive tests is that they reflect situations
   where a test outcome is ambiguous between limitations of the network
   and some unknown limitation of the diagnostic test itself, which was
   presumably caused by some uncontrolled feedback from the network.
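   The Constant Window Pseudo CBR example above can be restated as a
   small decision procedure.  The sketch below is non-normative: the
   function and its arguments are our own names, and a real FSTDS would
   define the pass, fail and inconclusive criteria precisely:

      # Illustrative only: classify one load test outcome.
      def classify(run_length, target_run_length,
                   data_rate, target_data_rate):
          if run_length < target_run_length:
              # A too-short run length fails regardless of whether
              # the required data rate was attained.
              return "fail"
          if data_rate < target_data_rate:
              # Run length met but the rate was not: we cannot tell
              # whether TCP or the network limited the rate, nor
              # whether the reduced load inflated the run length.
              return "inconclusive"
          return "pass"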
   Note that procedures that attempt to sweep the target parameter
   space to find the bounds on some parameter (for example to find the
   highest data rate for a subpath) are likely to break the location
   independent properties of Model Based Metrics, because the boundary
   between passing and inconclusive is sensitive to the RTT: TCP's
   ability to compensate for problems scales with the number of round
   trips per second.  Repeating the same procedure from another vantage
   point with a different RTT is likely to get a different result,
   because TCP will get lower performance on the path with the longer
   RTT.

   One of the goals for evolving TDS designs will be to keep sharpening
   the distinction between inconclusive, passing and failing tests.
   The criteria for passing, failing and inconclusive tests MUST be
   explicitly stated for every test in the TDS or FSTDS.

   One of the goals of evolving the testing process, procedures, tools
   and measurement point selection should be to minimize the number of
   inconclusive tests.

   It may be useful to keep raw data delivery statistics for deeper
   study of the behavior of the network path and to measure the tools.
   This can help to drive tool evolution.  Under some conditions it
   might be possible to reevaluate the raw data for satisfying
   alternate performance targets.  However such procedures are likely
   to introduce sampling bias and other implicit feedback which can
   cause false results and exhibit MP vantage sensitivity.

6.2.2. Statistical criteria for measuring run_length

   When evaluating the observed run_length, we need to determine
   appropriate packet stream sizes and acceptable error levels for
   efficient measurement.  In practice, can we compare the empirically
   estimated packet loss and ECN marking probabilities with the targets
   as the sample size grows?  How large a sample is needed to say that
   the measurements of packet transfer indicate a particular run length
   is present?

   The generalized measurement can be described as recursive testing:
   send packets (individually or in patterns) and observe the packet
   delivery performance (loss ratio or other metric, any marking we
   define).

   As each packet is sent and measured, we have an ongoing estimate of
   the performance in terms of the ratio of packet loss or ECN marks to
   total packets (i.e. an empirical probability).  We continue to send
   until conditions support a conclusion or a maximum sending limit has
   been reached.

   We have a target_mark_probability, 1 mark per target_run_length,
   where a "mark" is defined as a lost packet, a packet with an ECN
   mark, or other signal.  This constitutes the null hypothesis:

      H0:  no more than one mark in target_run_length =
           3*(target_pipe_size)^2 packets

   and we can stop sending packets if ongoing measurements support
   accepting H0 with the specified Type I error = alpha (= 0.05 for
   example).

   We also have an alternative hypothesis to evaluate: is performance
   significantly lower than the target_mark_probability?  Based on
   analysis of typical values and practical limits on measurement
   duration, we choose four times the H0 probability:

      H1:  one or more marks in (target_run_length/4) packets

   and we can stop sending packets if measurements support rejecting H0
   with the specified Type II error = beta (= 0.05 for example), thus
   preferring the alternate hypothesis H1.
H0 and H1 constitute the Success and Failure outcomes described elsewhere in the memo; while the ongoing measurements support neither hypothesis, the current status of the measurements is inconclusive.

The problem above is formulated to match the Sequential Probability Ratio Test (SPRT) [StatQC]. Note that as originally framed, the events under consideration were all manufacturing defects. In networking, ECN marks and lost packets are not defects but signals, indicating that the transport protocol should slow down.

The Sequential Probability Ratio Test also starts with a pair of hypotheses specified as above:

    H0: p0 = one defect in target_run_length
    H1: p1 = one defect in target_run_length/4

As packets are sent and measurements collected, the tester evaluates the cumulative defect count against two boundaries representing H0 Acceptance and H0 Rejection (i.e. acceptance of H1):

    Acceptance line: Xa = -h1 + s*n
    Rejection line:  Xr =  h2 + s*n

where n increases linearly for each packet sent and

    h1 = { log((1-alpha)/beta) }/k
    h2 = { log((1-beta)/alpha) }/k
    k  = log{ (p1(1-p0)) / (p0(1-p1)) }
    s  = [ log{ (1-p0)/(1-p1) } ]/k

for p0 and p1 as defined in the null and alternative hypothesis statements above, and alpha and beta as the Type I and Type II errors.

The SPRT specifies simple stopping rules:

o  Xa < defect_count(n) < Xr: continue testing
o  defect_count(n) <= Xa: Accept H0
o  defect_count(n) >= Xr: Accept H1

The calculations above are implemented in the R-tool for Statistical Analysis [Rtool], in the add-on package for Cross-Validation via Sequential Testing (CVST) [CVST].

Using the equations above, we can calculate the minimum number of packets (n) needed to accept H0 when x defects are observed. For example, when x = 0:

    Xa = 0 = -h1 + s*n
    and n = h1 / s
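The boundary computation and stopping rules can be written directly from these equations. The following is a minimal sketch in Python, assuming alpha = beta = 0.05 and the example target_run_length of 1452 packets from Section 8.1; it illustrates the arithmetic, and is not a measurement tool.

    import math

    def sprt_boundaries(target_run_length, alpha=0.05, beta=0.05):
        p0 = 1.0 / target_run_length   # H0: one mark per target_run_length
        p1 = 4.0 / target_run_length   # H1: one mark per target_run_length/4
        k = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
        s = math.log((1 - p0) / (1 - p1)) / k
        h1 = math.log((1 - alpha) / beta) / k
        h2 = math.log((1 - beta) / alpha) / k
        return h1, h2, s

    def sprt_step(defect_count, n, h1, h2, s):
        if defect_count <= -h1 + s * n:
            return "accept H0"         # observed run length meets the target
        if defect_count >= h2 + s * n:
            return "accept H1"         # observed run length fails the target
        return "continue testing"

    h1, h2, s = sprt_boundaries(1452)
    print(sprt_step(0, 500, h1, h2, s))  # -> continue testing
    print(math.ceil(h1 / s))             # minimum n to accept H0 with 0 marks

Under these parameters the sketch reports that about 1423 packets must be delivered without a mark before H0 can be accepted.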
6.2.2.1. Alternate criteria for measuring run_length

An alternate calculation was contributed by Alex Gilgur (Google).

The probability of failure within an interval whose length is target_run_length is given by an exponential distribution with rate = 1/target_run_length (a memoryless process). The implication is that the predicted failure probability differs depending on the total count of packets that have passed through the pipe, the formula being:

    P(t1 < T < t2) = R(t1) - R(t2),

where

    T = number of packets at which a failure will occur with probability P;
    t = number of packets:
        t1 = number of packets (e.g., when failure last occurred)
        t2 = t1 + target_run_length
    R = survival function (probability that no failure has occurred by t):
        R(t1) = exp(-t1/target_run_length)
        R(t2) = exp(-t2/target_run_length)

The algorithm:

    initialize the packet.counter = 0
    initialize the failed.packet.counter = 0
    start the loop
        if packet_response = ACK:
            increment the packet.counter
        else:
            ### The packet failed
            increment the packet.counter
            increment the failed.packet.counter

            P_fail_observed = failed.packet.counter / packet.counter

            upper_bound = packet.counter + target.run.length / 2
            lower_bound = packet.counter - target.run.length / 2

            R1 = exp( -upper_bound / target.run.length )
            R0 = exp( -max(0, lower_bound) / target.run.length )

            P_fail_predicted = R0 - R1
            compare P_fail_observed vs. P_fail_predicted
        end-if
    continue the loop

This algorithm allows accurate comparison of the observed failure probability with the corresponding values predicted from a fixed target_failure_rate, which is equal to 1.0 / target_run_length.

6.2.3. Reordering Tolerance

All tests must be instrumented for packet level reordering [RFC4737]. However, there is no consensus for how much reordering should be acceptable. Over the last two decades the general trend has been to make protocols and applications more tolerant to reordering, in response to the gradual increase in reordering in the network. This increase has been due to the gradual deployment of parallelism in the network, as a consequence of such technologies as multithreaded route lookups and Equal Cost Multipath (ECMP) routing. These techniques to increase network parallelism are critical to enabling overall Internet growth to exceed Moore's Law.

Section 5 of [RFC4737] proposed a metric that may be sufficient to designate isolated reordered packets as effectively lost, because TCP's retransmission response would be the same.

TCP should be able to adapt to reordering as long as the reordering extent is no more than the maximum of one half window or 1 ms, whichever is larger. Note that there is a fundamental tradeoff between tolerance to reordering and how quickly algorithms such as fast retransmit can repair losses. Within this limit on reordering extent, there should be no bound on reordering density.

NB: Traditional TCP implementations were not compatible with this metric; newer implementations still need to be evaluated.

Parameters:
    Reordering displacement: the maximum of one half of target_pipe_size or 1 ms.

6.3. Test Qualifications

This entire section needs to be completely overhauled. @@@@ It might be summarized as "needs to be specified in a FSTDS".

Send pre-load traffic as needed to activate radios with a sleep mode, or other "reactive network" elements (term defined in [draft-morton-ippm-2330-update-01]).

In general failing to accurately generate the test traffic has to be treated as an inconclusive test, since it must be presumed that the error in traffic generation might have affected the test outcome. To the extent that the network itself had an effect on the traffic generation (e.g. in the standing queue tests), allowing too large an error margin in the traffic generation might introduce feedback loops that compromise the vantage independence properties of these tests.

The proper treatment of cross traffic is different for different subpaths. In general when testing infrastructure which is associated with only one subscriber, the test should be treated as inconclusive if that subscriber is active on the network. However, for shared infrastructure managed by an ISP, the question at hand is likely to be whether the ISP has sufficient total capacity. In such cases the presence of cross traffic due to other subscribers is explicitly part of the network conditions and its effects are explicitly part of the test.

These two cases do not cover all subpaths. For example, WiFi, which shares unmanaged channel space with other devices, is unlikely to be suitable for any prescriptive measurement.
Note that canceling tests due to load on subscriber lines may introduce sampling bias for testing other parts of the infrastructure. For this reason tests that are scheduled but not run due to load should be treated as a special case of "inconclusive".

7. Diagnostic Tests

The diagnostic tests below are organized by traffic pattern: basic data rate and run length, standing queues, slowstart bursts, and sender rate bursts. We also introduce some combined tests which are more efficient, at the expense of conflating the signatures of different failures.

7.1. Basic Data Rate and Run Length Tests

We propose several versions of the basic data rate and run length test. All measure the number of packets delivered between losses or ECN marks, using a data stream that is rate controlled at or below the target_data_rate.

The tests below differ in how the data rate is controlled. The data can be paced on a timer, or window controlled at the full target data rate. The first two tests implicitly confirm that the subpath has sufficient raw capacity to carry the target_data_rate. They are recommended for relatively infrequent testing, such as an installation or auditing process. The third, background run length, is a low rate test designed for ongoing monitoring for changes in subpath quality.

All rely on the receiver accumulating packet delivery statistics as described in Section 6.2.2 to score the outcome:

Pass: it is statistically significant that the observed run length is larger than the target_run_length.

Fail: it is statistically significant that the observed run length is smaller than the target_run_length.

A test is considered inconclusive if it failed to meet the data rate as specified below, failed to meet the qualifications defined in Section 6.3, or if neither run length statistical hypothesis was confirmed in the allotted test duration.

7.1.1. Run Length at Paced Full Data Rate

Confirm that the observed run length is at least the target_run_length while relying on a timer to send data at the target_rate, using the procedure described in Section 6.1.1 with a burst size of 1 (single packets).

The test is considered inconclusive if the packet transmission cannot be accurately controlled for any reason.

7.1.2. Run Length at Full Data Windowed Rate

Confirm that the observed run length is at least the target_run_length while sending at an average rate equal to the target_data_rate, by controlling (or clamping) the window size of a conventional transport protocol to a fixed value computed from the properties of the test path, typically test_window=target_data_rate*test_RTT/target_MTU.

Since losses and ECN marks generally cause transport protocols to at least temporarily reduce their data rates, this test is expected to be less precise about controlling its data rate. It should not be considered inconclusive as long as at least some of the round trips reached the full target_data_rate without incurring losses. To pass this test the network MUST deliver target_pipe_size packets in target_RTT time without any losses or ECN marks at least once per two target_pipe_size round trips, in addition to meeting the run length statistical test.
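The test_window calculation is a straightforward bandwidth-delay product. The sketch below illustrates it in Python; the unit conventions (target_data_rate in bits per second, target_MTU in bytes, test_RTT in seconds) and the ceiling rounding are our assumptions, since the formula above leaves units implicit, so results may differ slightly from the draft's tables depending on header accounting.

    import math

    def test_window(target_data_rate, test_RTT, target_MTU):
        # Window, in packets, needed to sustain target_data_rate over
        # a path with round trip time test_RTT.
        return math.ceil(target_data_rate * test_RTT / (target_MTU * 8))

    # The Section 8.1 example: 5 Mb/s over a 50 ms path
    print(test_window(5e6, 0.050, 1500))  # -> 21 under these units
                                          #    (Table 1 reports 22)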
7.1.3. Background Run Length Tests

The background run length test is a low rate version of the target rate test above, designed for ongoing lightweight monitoring for changes in the observed subpath run length without disrupting users. It should be used in conjunction with one of the above full rate tests because it does not confirm that the subpath can support the raw data rate.

Existing loss metrics such as [RFC6673] might be appropriate for measuring background run length.

7.2. Standing Queue tests

These tests confirm that the bottleneck is well behaved across the onset of packet loss, which typically follows after the onset of queueing. Well behaved generally means lossless for transient queues, but once the queue has been sustained for a sufficient period of time (or reaches a sufficient queue depth) there should be a small number of losses to signal to the transport protocol that it should reduce its window. Losses that come too early can prevent the transport from averaging at the target_data_rate. Losses that come too late indicate that the queue might be subject to bufferbloat [Bufferbloat] and inflict excess queuing delays on all flows sharing the bottleneck queue. Excess losses make loss recovery problematic for the transport protocol. Non-linear or erratic RTT fluctuations suggest poor interactions between the channel acquisition systems and the transport self clock. All of the tests in this section use the same basic scanning algorithm but score the link on the basis of how well it avoids each of these problems.

For some technologies the data might not be subject to increasing delays, in which case the data rate will vary with the window size all the way up to the onset of losses or ECN marks. For these technologies, the discussion of queueing does not apply, but it is still required that the onset of losses (or ECN marks) be at an appropriate point and progressive.

Use the procedure in Section 6.1.3 to sweep the window across the onset of queueing and the onset of loss. The tests below all assume that the scan emulates standard additive increase and delayed ACK by incrementing the window by one packet for every 2*target_pipe_size packets delivered. A scan can be divided into three regions: below the onset of queueing, a standing queue, and at or beyond the onset of loss.

Below the onset of queueing the RTT is typically fairly constant, and the data rate varies in proportion to the window size. Once the data rate reaches the link rate, the data rate becomes fairly constant, and the RTT increases in proportion to the window size. The precise transition from one region to the other can be identified by the maximum network power, defined to be the ratio of the data rate over the RTT [POWER].

For technologies that do not have conventional queues, start the scan at a window equal to the test_window, i.e. starting at the target rate instead of the power point.
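Locating the power point from scan data is simple. The following is a minimal sketch; the scan samples are invented for illustration, and a real scan would of course come from the Section 6.1.3 procedure.

    def power_point(scan):
        # scan: list of (window_packets, data_rate_bps, rtt_seconds) samples.
        # Network power is data rate divided by RTT; its maximum marks the
        # transition from the flat-RTT region to the standing queue region.
        return max(scan, key=lambda sample: sample[1] / sample[2])

    scan = [(10, 2.4e6, 0.050), (15, 3.6e6, 0.050),
            (21, 5.0e6, 0.051), (30, 5.0e6, 0.072)]
    print("power point near window =", power_point(scan)[0])  # -> 21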
If there is random background loss (e.g. bit errors), precise determination of the onset of packet loss may require multiple scans. Above the onset of loss, all transport protocols are expected to experience periodic losses. For the stiffened transport case these will be determined by the AQM algorithm in the network or by the details of how the window increase function responds to loss. For the standard transport case the details of periodic losses are typically dominated by the behavior of the transport protocol itself.

7.2.1. Congestion Avoidance

A link passes the congestion avoidance standing queue test if more than target_run_length packets are delivered between the power point (or test_window) and the first loss or ECN mark. If this test is implemented using a standard congestion control algorithm with a clamp, it can be used in situ in the production Internet as a capacity test. For an example of such a test see [NPAD].

7.2.2. Bufferbloat

This test confirms that there is some mechanism to limit buffer occupancy (e.g. one that prevents bufferbloat). Note that this is not strictly a requirement for single stream bulk performance, however if there is no mechanism to limit buffer occupancy then a single stream with sufficient data to deliver is likely to cause the problems described in [RFC2309] and [Bufferbloat]. This may cause only minor symptoms for the dominant flow, but has the potential to make the link unusable for other flows and applications.

Pass if the onset of loss occurs before a standing queue has introduced more delay than twice the target_RTT, or some other well defined limit. Note that there is not yet a model for how much standing queue is acceptable. The factor of two chosen here reflects a rule of thumb. Note that in conjunction with the previous test, this test implies that the first loss should occur at a queueing delay which is between one and two times the target_RTT.

7.2.3. Non excessive loss

This test confirms that the onset of loss is not excessive. Pass if losses are bounded by the fluctuations in the cross traffic, such that transient loads (bursts) do not cause dips in aggregate raw throughput; e.g. pass as long as the losses are no more bursty than would be expected from a simple drop tail queue. Although this test could be made more precise, it is really included here for pedantic completeness.

7.2.4. Duplex Self Interference

This engineering test confirms a bound on the interactions between the forward data path and the ACK return path. Fail if the RTT rises by more than some fixed bound above the expected queueing time computed from the excess window divided by the link data rate.

7.3. Slowstart tests

These tests mimic slowstart: data is sent at twice the effective bottleneck rate to exercise the queue at the dominant bottleneck.

They are deemed inconclusive if the elapsed time to send the data burst is not less than half of the time to receive the ACKs (i.e. sending data too fast is OK, but sending it slower than twice the actual bottleneck rate as indicated by the ACKs is deemed inconclusive). Space the bursts such that the average data rate is equal to the target_data_rate.

7.3.1. Full Window slowstart test

This is a capacity test to confirm that slowstart is not likely to exit prematurely. Send slowstart bursts that are target_pipe_size total packets; a sketch of the burst schedule follows.
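The following is a minimal sketch of the burst schedule, assuming the effective bottleneck rate is known and our unit conventions (bits per second, bytes, seconds): each burst is sent at twice the bottleneck rate, and bursts are spaced so the average rate equals the target_data_rate.

    def slowstart_schedule(target_pipe_size, target_MTU,
                           bottleneck_rate, target_data_rate):
        burst_bits = target_pipe_size * target_MTU * 8
        send_time = burst_bits / (2 * bottleneck_rate)  # burst duration
        interval = burst_bits / target_data_rate        # burst spacing
        return send_time, interval

    # Section 8.1 example: 22 packet bursts at 10 Mb/s, averaging 5 Mb/s
    print(slowstart_schedule(22, 1500, 5e6, 5e6))  # (0.0264 s, 0.0528 s)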
Accumulate packet delivery statistics as described in Section 6.2.2 to score the outcome. Pass if it is statistically significant that the observed run length is larger than the target_run_length. Fail if it is statistically significant that the observed run length is smaller than the target_run_length.

Note that these are the same parameters as the Sender Full Window burst test, except the burst rate is the slowstart rate rather than the sender interface rate.

7.3.2. Slowstart AQM test

Do a continuous slowstart (send data continuously at slowstart_rate) until the first loss; then stop, allow the network to drain, and repeat, gathering statistics on the last packet delivered before the loss, the loss pattern, the maximum observed RTT and the window size. Justify the results. There is not currently sufficient theory to justify requiring any particular result, however design decisions that affect the outcome of this test also affect how the network balances between long and short flows (the "mice and elephants" problem).

This is an engineering test: it would be best performed on a quiescent network or testbed, since cross traffic has the potential to change the results.

7.4. Sender Rate Burst tests

These tests determine how well the network can deliver bursts sent at the sender's interface rate. Note that this test most heavily exercises the front path, and is likely to include infrastructure that may be out of scope for a subscriber ISP.

Also, there are several details that are not precisely defined. For starters there is not a standard server interface rate. 1 Gb/s and 10 Gb/s are very common today, but higher rates will become cost effective and can be expected to be dominant some time in the future.

Current standards permit TCP to send full window bursts following an application pause. Congestion Window Validation [RFC2861] is not required, but even if it were, it does not take effect until an application pause is longer than an RTO. Since this is standard behavior, it is desirable that the network be able to deliver such bursts, otherwise application pauses will cause unwarranted losses.

It is also understood in the application and serving community that interface rate bursts have a cost to the network that has to be balanced against other costs in the servers themselves. For example TCP Segmentation Offload [TSO] reduces server CPU in exchange for larger network bursts, which increase the stress on network buffer memory.

There is not yet theory to unify these costs or to provide a framework for trying to optimize global efficiency. We do not yet have a model for how much the network should tolerate server rate bursts. Some bursts must be tolerated by the network, but it is probably unreasonable to expect the network to be able to efficiently deliver all data as a series of bursts.

For this reason, this is the only test for which we explicitly encourage derating. A TDS should include a table of pairs of derating parameters: what burst size to use as a fraction of the target_pipe_size, and how much each burst size is permitted to reduce the run length, relative to the target_run_length; a hypothetical example follows.
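The following is a hypothetical example of such a derating table; the specific pairs are invented for illustration only, and an actual TDS must choose and justify its own values.

    # Pairs of (burst size as a fraction of target_pipe_size,
    #           required fraction of target_run_length).
    # All values below are invented for illustration only.
    derating_table = [
        (0.25, 1.0),   # quarter-window bursts must meet the full run length
        (0.5,  0.5),   # half-window bursts may halve the required run length
        (1.0,  0.25),  # full-window bursts may quarter it
    ]

    target_pipe_size, target_run_length = 22, 1452  # Section 8.1 example
    for burst_frac, run_frac in derating_table:
        print(round(burst_frac * target_pipe_size), "packet bursts:",
              "required run length", round(run_frac * target_run_length))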
7.5. Combined Tests

These tests are more efficient from a deployment/operational perspective, but if they fail the cause may not be possible to diagnose.

7.5.1. Sustained burst test

Send target_pipe_size*derate sender interface rate bursts every target_RTT*derate, for derate between 0 and 1. Verify that the observed run length meets the target_run_length. Key observations:

o  This test is subpath RTT invariant, as long as the tester can generate the required pattern.
o  The subpath under test is expected to go idle for some fraction of the time: (subpath_data_rate-target_rate)/subpath_data_rate. Failing to do so suggests a problem with the procedure and an inconclusive test result.
o  This test is more strenuous than the slowstart tests: they are not needed if the link passes this test with derate=1.
o  A link that passes this test is likely to be able to sustain higher rates (close to subpath_data_rate) for paths with RTTs smaller than the target_RTT. Offsetting this performance underestimation is part of the rationale behind permitting derating in general.
o  This test can be implemented with standard instrumented TCP [RFC4898], using a specialized measurement application at one end and a minimal service at the other end [RFC 863, RFC 864]. It may require tweaks to the TCP implementation [MBMSource].
o  This test is efficient to implement, since it does not require per-packet timers, and can make use of TSO in modern NIC hardware.
o  This test is not totally sufficient: the standing window engineering tests are also needed to be sure that the link is well behaved at and beyond the onset of congestion.
o  This one test can be proven to be the one capacity test to supplant them all.

7.5.2. Live Streaming Media

Model Based Metrics can be implemented as a side effect of serving any non-throughput maximizing traffic*, such as streaming media, with some additional controls and instrumentation in the servers. The essential requirement is that the traffic be constrained such that even with arbitrary application pauses, bursts and data rate fluctuations, the traffic stays within the envelope defined by the individual tests described above, for a specific TDS.

If the serving_data_rate is less than or equal to the target_data_rate and the serving_RTT (the RTT between the sender and client) is less than the target_RTT, this constraint is most easily implemented by clamping the transport window size to:

    serving_window_clamp=target_data_rate*serving_RTT/
        (target_MTU-header_overhead)

The serving_window_clamp will limit both the serving data rate and the burst sizes to be no larger than specified by the procedures in Section 7.1.2 and Section 7.4 or Section 7.5.1. Since the serving RTT is smaller than the target_RTT, the worst case bursts that might be generated under these conditions will be smaller than called for by Section 7.4, and the sender rate burst sizes are implicitly derated by at least the serving_window_clamp divided by the target_pipe_size. (The traffic might be smoother than specified by the sender interface rate bursts test.)

Note that if the application tolerates fluctuations in its actual data rate (say by use of a playout buffer) it is important that the target_data_rate be above the actual average rate needed by the application so it can recover after transient pauses caused by congestion or the application itself.
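A minimal sketch of the clamp calculation above, assuming our unit conventions (bits per second, bytes, seconds) and an illustrative header_overhead of 52 bytes (the actual value depends on the header stack in use):

    import math

    def serving_window_clamp(target_data_rate, serving_RTT,
                             target_MTU, header_overhead=52):
        payload_bits = (target_MTU - header_overhead) * 8
        return math.ceil(target_data_rate * serving_RTT / payload_bits)

    # e.g. serving the Section 8.1 content from a replica 20 ms away
    print(serving_window_clamp(5e6, 0.020, 1500))  # -> 9 packet clamp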
Alternatively the sender data rate and bursts might be explicitly controlled by a host shaper or pacing at the sender. This would provide better control and work for serving_RTTs that are larger than the target_RTT, but it is substantially more complicated to implement. With this technique, any traffic might be used for measurement.

* Note that this technique might be applied to any content, if users are willing to tolerate a reduced data rate to inhibit TCP equilibrium behavior.

8. Examples

In this section we present example TDSes for a few performance specifications.

Tentatively: 5 Mb/s*50 ms, 1 Mb/s*50 ms, 250 kb/s*100 ms

8.1. Near serving HD streaming video

Today the best quality HD video requires slightly less than 5 Mb/s [HDvideo]. Since it is desirable to serve such content locally, we assume that the content will be within 50 ms, which is enough to cover continental Europe or either US coast from a single site.

5 Mb/s over a 50 ms path

    +----------------------+-------+---------+
    | End to End Parameter | Value | units   |
    +----------------------+-------+---------+
    | target_rate          | 5     | Mb/s    |
    | target_RTT           | 50    | ms      |
    | target_MTU           | 1500  | bytes   |
    | target_pipe_size     | 22    | packets |
    | target_run_length    | 1452  | packets |
    +----------------------+-------+---------+

    Table 1

This example uses the most conservative TCP model and no derating.

8.2. Far serving SD streaming video

Standard quality video typically fits in 1 Mb/s [SDvideo]. This can reasonably be delivered via longer paths with larger RTTs. We assume 100 ms.

1 Mb/s over a 100 ms path

    +----------------------+-------+---------+
    | End to End Parameter | Value | units   |
    +----------------------+-------+---------+
    | target_rate          | 1     | Mb/s    |
    | target_RTT           | 100   | ms      |
    | target_MTU           | 1500  | bytes   |
    | target_pipe_size     | 9     | packets |
    | target_run_length    | 243   | packets |
    +----------------------+-------+---------+

    Table 2

This example uses the most conservative TCP model and no derating.

8.3. Bulk delivery of remote scientific data

This example corresponds to 100 Mb/s bulk scientific data over a moderately long RTT. Note that the target_run_length is infeasible for most networks.

100 Mb/s over a 200 ms path

    +----------------------+---------+---------+
    | End to End Parameter | Value   | units   |
    +----------------------+---------+---------+
    | target_rate          | 100     | Mb/s    |
    | target_RTT           | 200     | ms      |
    | target_MTU           | 1500    | bytes   |
    | target_pipe_size     | 1741    | packets |
    | target_run_length    | 9093243 | packets |
    +----------------------+---------+---------+

    Table 3
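These tables follow directly from the target parameters. The sketch below reproduces the derivation under the reference model (target_run_length = 3*(target_pipe_size)^2, with target_pipe_size from the bandwidth-delay product); the unit conventions and ceiling rounding are our assumptions, so the computed pipe sizes can differ by a few packets from the tables above, which use the draft's own accounting.

    import math

    def tds_parameters(target_rate, target_RTT, target_MTU=1500):
        pipe = math.ceil(target_rate * target_RTT / (target_MTU * 8))
        return pipe, 3 * pipe ** 2  # (target_pipe_size, target_run_length)

    for rate, rtt in [(5e6, 0.050), (1e6, 0.100), (100e6, 0.200)]:
        print(rate / 1e6, "Mb/s over", rtt * 1000, "ms:",
              tds_parameters(rate, rtt))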
9. Validation

Since some aspects of the models are likely to be too conservative, Section 5.2 and Section 5.3 permit alternate protocol models and test parameter derating. In exchange for this latitude in the modelling process, we require demonstrations that such a TDS can robustly detect links that will prevent authentic applications using state-of-the-art protocol implementations from meeting the specified performance targets. This correctness criterion is potentially difficult to prove, because it implicitly requires validating a TDS against all possible links and subpaths.

We suggest two strategies, both of which should be applied: first, publish a fully open description of the TDS, including what assumptions were used and how it was derived, such that the research community can evaluate these decisions, test them and comment on their applicability; and second, demonstrate that applications running over an infinitesimally passing testbed do meet the performance targets.

An infinitesimally passing testbed resembles an epsilon-delta proof in calculus. Construct a test network such that all of the individual tests of the TDS only pass by small (infinitesimal) margins, and demonstrate that a variety of authentic applications running over real TCP implementations (or other protocols as appropriate) meet the end-to-end target parameters over such a network. The workloads should include multiple types of streaming media and transaction oriented short flows (e.g. synthetic web traffic).

For example, using the HD streaming video TDS described in Section 8.1, the bottleneck data rate should be 5 Mb/s, the per packet random background loss probability should be 1/1453 (for a run length of 1452 packets), the bottleneck queue should be 22 packets and the front path should have just enough buffering to withstand 22 packet line rate bursts. We want every one of the TDS tests to fail if we slightly increase the relevant test parameter, so for example sending a 23 packet slowstart burst should cause excess (possibly deterministic) packet drops at the dominant queue at the bottleneck. On this infinitesimally passing network it should be possible for a real application using a stock TCP implementation in the vendor's default configuration to attain 5 Mb/s over a 50 ms path.

The most difficult part of setting up such a testbed is arranging for it to infinitesimally pass the individual tests. We suggest two approaches: constraining the network devices not to use all available resources (limiting available buffer space or data rate); and preloading subpaths with cross traffic. Note that it is important that a single environment be constructed which infinitesimally passes all tests at the same time, otherwise there is a chance that TCP can exploit extra latitude in some parameters (such as data rate) to partially compensate for constraints in other parameters (such as queue space, or vice versa).

To the extent that a TDS is used to inform public dialog it should be fully publicly documented, including the details of the tests, what assumptions were used and how it was derived. All of the details of the validation experiment should also be public with sufficient detail for the experiments to be replicated by other researchers. All components should be either open source or fully described proprietary implementations that are available to the research community.

This work is inspired by open tools running on an open platform, using open techniques to collect open data. See Measurement Lab [http://www.measurementlab.net/].

10. Acknowledgements

Ganga Maguluri suggested the statistical test for measuring loss probability in the target run length. Alex Gilgur helped with the statistics and contributed an alternate model. Meredith Whittaker improved the clarity of the communications.
11. Informative References

[RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering, S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G., Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S., Wroclawski, J., and L. Zhang, "Recommendations on Queue Management and Congestion Avoidance in the Internet", RFC 2309, April 1998.

[RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, "Framework for IP Performance Metrics", RFC 2330, May 1998.

[RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion Window Validation", RFC 2861, June 2000.

[RFC3148] Mathis, M. and M. Allman, "A Framework for Defining Empirical Bulk Transfer Capacity Metrics", RFC 3148, July 2001.

[RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte Counting (ABC)", RFC 3465, February 2003.

[RFC4898] Mathis, M., Heffner, J., and R. Raghunarayan, "TCP Extended Statistics MIB", RFC 4898, May 2007.

[RFC4737] Morton, A., Ciavattone, L., Ramachandran, G., Shalunov, S., and J. Perser, "Packet Reordering Metrics", RFC 4737, November 2006.

[RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion Control", RFC 5681, September 2009.

[RFC5835] Morton, A. and S. Van den Berghe, "Framework for Metric Composition", RFC 5835, April 2010.

[RFC6049] Morton, A. and E. Stephan, "Spatial Composition of Metrics", RFC 6049, January 2011.

[RFC6673] Morton, A., "Round-Trip Packet Loss Metrics", RFC 6673, August 2012.

[I-D.morton-ippm-lmap-path] Bagnulo, M., Burbridge, T., Crawford, S., Eardley, P., and A. Morton, "A Reference Path and Measurement Points for LMAP", draft-morton-ippm-lmap-path-00 (work in progress), January 2013.

[MSMO97] Mathis, M., Semke, J., Mahdavi, J., and T. Ott, "The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm", Computer Communications Review, volume 27, number 3, July 1997.

[WPING] Mathis, M., "Windowed Ping: An IP Level Performance Diagnostic", INET 94, June 1994.

[mpingSource] Fan, X., Mathis, M., and D. Hamon, "Git Repository for mping: An IP Level Performance Diagnostic", Sept 2013.

[MBMSource] Hamon, D., "Git Repository for Model Based Metrics", Sept 2013.

[Pathdiag] Mathis, M., Heffner, J., O'Neil, P., and P. Siemsen, "Pathdiag: Automated TCP Diagnosis", Passive and Active Measurement, June 2008.

[StatQC] Montgomery, D., "Introduction to Statistical Quality Control - 2nd ed.", ISBN 0-471-51988-X, 1990.

[Rtool] R Development Core Team, "R: A language and environment for statistical computing", R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, URL http://www.R-project.org/, 2011.

[CVST] Krueger, T. and M. Braun, "R package: Fast Cross-Validation via Sequential Testing", version 0.1, November 2012.

[LMCUBIC] Ledesma Goyzueta, R. and Y. Chen, "A Deterministic Loss Model Based Analysis of CUBIC", IEEE International Conference on Computing, Networking and Communications (ICNC), E-ISBN: 978-1-4673-5286-4, January 2013.
Appendix A. Model Derivations

The reference target_run_length described in Section 5.2 is based on very conservative assumptions: that all window in excess of the target_pipe_size contributes to a standing queue that raises the RTT, and that classic Reno congestion control with delayed ACKs is in effect. In this section we provide two alternative calculations using different assumptions.

It may seem out of place to allow such latitude in a measurement standard, but the section provides offsetting requirements.

The estimates provided by these models make the most sense if network performance is viewed logarithmically. In the operational Internet, data rates span more than 8 orders of magnitude, RTT spans more than 3 orders of magnitude, and loss probability spans at least 8 orders of magnitude. When viewed logarithmically (as in decibels), these correspond to 80 dB of dynamic range. On an 80 dB scale, a 3 dB error is less than 4% of the scale, even though it might represent a factor of 2 in the untransformed parameter.

This document gives a lot of latitude for calculating target_run_length, however people designing a TDS should consider the effect of their choices on the ongoing tussle about the relevance of "TCP friendliness" as an appropriate model for Internet capacity allocation. Choosing a target_run_length that is substantially smaller than the reference target_run_length specified in Section 5.2 strengthens the argument that it may be appropriate to abandon "TCP friendliness" as the Internet fairness model. This gives developers incentive and permission to develop even more aggressive applications and protocols, for example by increasing the number of connections that they open concurrently.

A.1. Queueless Reno

In Section 5.2 it is assumed that the target rate is the same as the link rate, and any excess window causes a standing queue at the bottleneck. This might be representative of a non-shared access link. An alternative situation would be a heavily aggregated subpath where individual flows do not significantly contribute to the queueing delay, and losses are determined by monitoring the average data rate, for example by the use of a virtual queue as in [AFD]. In such a scheme the RTT is constant and TCP's AIMD congestion control causes the data rate to fluctuate in a sawtooth. If the traffic is being controlled in a manner that is consistent with the metrics here, the goal would be to make the actual average rate equal to the target_data_rate.

We can derive a model for Reno TCP and delayed ACK under the above set of assumptions: for some value of Wmin, the window will sweep from Wmin to 2*Wmin in 2*Wmin round trip times. Unlike the queueing case where Wmin = target_pipe_size, we want the average of Wmin and 2*Wmin to be the target_pipe_size, so the average rate is the target rate. Thus we want Wmin = (2/3)*target_pipe_size.

Between losses each sawtooth delivers (1/2)(Wmin+2*Wmin)(2*Wmin) packets in 2*Wmin round trip times.

Substituting these together we get:

    target_run_length = (4/3)(target_pipe_size^2)

Note that this is 44% of the reference run length. This makes sense because under the assumptions in Section 5.2 the AIMD sawtooth caused a queue at the bottleneck, which raised the effective RTT by 50%.
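The arithmetic can be checked numerically. A minimal sketch, using the Section 8.1 target_pipe_size of 22 packets:

    def queueless_reno_run_length(target_pipe_size):
        # Each sawtooth sweeps from Wmin to 2*Wmin, delivering
        # (1/2)(Wmin + 2*Wmin)(2*Wmin) = 3*Wmin^2 packets, with
        # Wmin = (2/3)*target_pipe_size to keep the mean window on target.
        wmin = (2.0 / 3.0) * target_pipe_size
        return 3 * wmin ** 2            # = (4/3)*target_pipe_size^2

    tps = 22
    reference = 3 * tps ** 2            # reference model run length
    print(queueless_reno_run_length(tps) / reference)  # -> 0.444...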
A.2. CUBIC

CUBIC has three operating regions. The model for the expected value of the window size derived in [LMCUBIC] assumes operation in the "concave" region only, which is a non-TCP friendly region for long-lived flows. The authors make the following assumptions: packet loss probability, p, is independent and periodic, losses occur one at a time, and they are true losses due to tail drop or corruption. This definition of p aligns very well with our definition of target_run_length and the requirement for progressive loss (AQM).

Although the CUBIC window increase depends on continuous time, the authors express the time to reach the maximum window size in terms of the RTT and a parameter for the multiplicative rate decrease on observing loss, beta (whose default value is 0.2 in CUBIC). The expected value of the window size, E[W], is also dependent on C, a parameter of CUBIC that determines its window-growth aggressiveness (values from 0.01 to 4).

    E[W] = ( C*(RTT/p)^3 * ((4-beta)/beta) )^(1/4)

and, further assuming Poisson arrivals, the mean throughput, x, is

    x = E[W]/RTT

We note that under these conditions (deterministic single losses), the value of E[W] is always greater than 0.8 of the maximum window size ~= reference_run_length. (as far as I can tell)

Appendix B. Complex Queueing

For many network technologies simple queueing models do not apply: the network schedules, thins or otherwise alters the timing of ACKs and data, generally to raise the efficiency of the channel allocation process when confronted with relatively widely spaced small ACKs. These efficiency strategies are ubiquitous for half duplex, wireless and broadcast media.

Altering the ACK stream generally has two consequences: it raises the effective bottleneck data rate, causing slowstart to burst at higher rates (possibly as high as the sender's interface rate), and it effectively raises the RTT by the average time that the ACKs were delayed. The first effect can be partially mitigated by reclocking ACKs once they are beyond the bottleneck on the return path to the sender, however this further raises the effective RTT.

The most extreme example of this sort of behavior would be a half duplex channel that is not released as long as the end point currently holding the channel has pending traffic. Such environments cause self clocked protocols under full load to revert to extremely inefficient stop and wait behavior, where they send an entire window of data as a single burst, followed by the entire window of ACKs on the return path.

If a particular end-to-end path contains a link or device that alters the ACK stream, then the entire path from the sender up to the bottleneck must be tested at the burst parameters implied by the ACK scheduling algorithm. The most important parameter is the Effective Bottleneck Data Rate, which is the average rate at which the ACKs advance snd.una. Note that thinning the ACKs (relying on the cumulative nature of seg.ack to permit discarding some ACKs) implies an effectively infinite bottleneck data rate. It is important to note that due to the self clock, ill conceived channel allocation mechanisms can increase the stress on upstream links in a long path.
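A minimal sketch of estimating the effective bottleneck data rate from an ACK trace; the trace format (time in seconds, cumulative snd.una in bytes) is our own invention for illustration.

    def effective_bottleneck_rate(ack_trace):
        # Average rate, in bits per second, at which ACKs advance snd.una.
        (t0, una0), (t1, una1) = ack_trace[0], ack_trace[-1]
        return (una1 - una0) * 8 / (t1 - t0)

    # Thinned ACKs: much of the window acknowledged by one late ACK makes
    # the apparent bottleneck rate far higher than the true link rate.
    trace = [(0.000, 0), (0.010, 30000), (0.012, 90000)]
    print(effective_bottleneck_rate(trace))  # -> 60 Mb/s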
Holding data or ACKs for channel allocation or other reasons (such as error correction) always raises the effective RTT relative to the minimum delay for the path. Therefore it may be necessary to replace the target_RTT in the calculation in Section 5.2 with an effective_RTT, which includes the target_RTT reflecting the fixed part of the path plus a term to account for the extra delays introduced by these mechanisms.

Appendix C. Version Control

Formatted: Fri Feb 14 14:07:33 PST 2014

Authors' Addresses

Matt Mathis
Google, Inc
1600 Amphitheater Parkway
Mountain View, California 94043
USA

Email: mattmathis@google.com

Al Morton
AT&T Labs
200 Laurel Avenue South
Middletown, NJ 07748
USA

Phone: +1 732 420 1571
Email: acmorton@att.com
URI: http://home.comcast.net/~acmacm/