idnits 2.17.1 

draft-ietf-ippm-rt-delay-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Abstract section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 1998) is 9293 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 2330 (ref. '1')

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Obsolete normative reference: RFC 1305 (ref. '3') (Obsoleted by RFC 5905)

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'


     Summary: 12 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                           G. Almes
3	Internet Draft                                              S. Kalidindi
4	Expiration Date: May 1999                                   M. Zekauskas
5	                                             Advanced Network & Services
6	                                                           November 1998

8	                   A Round-trip Delay Metric for IPPM
9	                   <draft-ietf-ippm-rt-delay-00.txt>

11	1. Status of this Memo

13	   This document is an Internet-Draft.  Internet-Drafts are working
14	   documents of the Internet Engineering Task Force (IETF), its areas,
15	   and its working groups.  Note that other groups may also distribute
16	   working documents as Internet-Drafts.

18	   Internet-Drafts are draft documents valid for a maximum of six
19	   months, and may be updated, replaced, or obsoleted by other documents
20	   at any time.  It is inappropriate to use Internet-Drafts as reference
21	   material or to cite them other than as "work in progress."

23	   To view the entire list of current Internet-Drafts, please check the
24	   "1id-abstracts.txt" listing contained in the Internet-Drafts shadow
25	   directories on ftp.is.co.za (Africa), nic.nordu.net (Northern
26	   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
27	   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

29	   This memo provides information for the Internet community.  This memo
30	   does not specify an Internet standard of any kind.  Distribution of
31	   this memo is unlimited.

33	2. Introduction

35	   This memo defines a metric for round-trip delay of packets across
36	   Internet paths.  It builds on notions introduced and discussed in the
37	   IPPM Framework document, RFC 2330 [1], and follows closely the
38	   corresponding metric for One-way Delay ("A One-way Delay Metric for
39	   IPPM" <draft-ietf-ippm-delay-05.txt>) [2]; the reader is assumed to
40	   be familiar with those documents.

42	   The memo was largely written by copying material from the One-way
43	   Delay metric.  The intention is that, where the two metrics are
44	   similar, they will be described with similar or identical text, and
45	   that where the two metrics differ, new or modified text will be used.

47	   This memo is intended to be parallel in structure to a future
48	   companion document for Packet Loss.

50	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
51	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
52	   document are to be interpreted as described in RFC 2119 [6].
53	   Although RFC 2119 was written with protocols in mind, the key words
54	   are used in this document for similar reasons.  They are used to
55	   ensure the results of measurements from two different implementations
56	   are comparable, and to note instances when an implementation could
57	   perturb the network.

59	   The structure of the memo is as follows:

61	   +  A 'singleton' analytic metric, called Type-P-Round-trip-Delay,
62	      will be introduced to measure a single observation of round-trip
63	      delay.

65	   +  Using this singleton metric, a 'sample', called Type-P-Round-trip-
66	      Delay-Poisson-Stream, will be introduced to measure a sequence of
67	      singleton delays measured at times taken from a Poisson process.

69	   +  Using this sample, several 'statistics' of the sample will be
70	      defined and discussed.

72	   This progression from singleton to sample to statistics, with clear
73	   separation among them, is important.

75	   Whenever a technical term from the IPPM Framework document is first
76	   used in this memo, it will be tagged with a trailing asterisk.  For
77	   example, "term*" indicates that "term" is defined in the Framework.

79	2.1. Motivation

81	   Round-trip delay of a Type-P* packet from a source host* to a
82	   destination host is useful for several reasons:

84	   +  Some applications do not perform well (or at all) if end-to-end
85	      delay between hosts is large relative to some threshold value.

87	   +  Erratic variation in delay makes it difficult (or impossible) to
88	      support many real-time applications.

90	   +  The larger the value of delay, the more difficult it is for
91	      transport-layer protocols to sustain high bandwidths.

93	   +  The minimum value of this metric provides an indication of the
94	      delay due only to propagation and transmission delay.

96	   +  The minimum value of this metric provides an indication of the
97	      delay that will likely be experienced when the path* traversed is
98	      lightly loaded.

100	   +  Values of this metric above the minimum provide an indication of
101	      the congestion present in the path.

103	   The measurement of round-trip delay instead of one-way delay has
104	   several weaknesses, summarized here:

106	   +  The Internet path from a source to a destination may differ from
107	      the path from the destination back to the source ("asymmetric
108	      paths"), such that different sequences of routers are used for the
109	      forward and reverse paths.  Therefore round-trip measurements
110	      actually measure the performance of two distinct paths together.

112	   +  Even when the two paths are symmetric, they may have radically
113	      different performance characteristics due to asymmetric queueing.

115	   +  Performance of an application may depend mostly on the performance
116	      in one direction.  For example, a file transfer using TCP may
117	      depend more on the performance in the direction that data flows,
118	      rather than the direction in which acknowledgements travel.

120	   +  In quality-of-service (QoS) enabled networks, provisioning in one
121	      direction may be radically different than provisioning in the
122	      reverse direction, and thus the QoS guarantees differ.

124	   On the other hand, the measurement of round-trip delay has two
125	   specific advantages:

127	   +  Ease of deployment: unlike in one-way measurement, it is often
128	      possible to perform some form of round-trip delay measurement
129	      without installing measurement-specific software at the intended
130	      destination.  A variety of approaches are well-known, including
131	      use of ICMP Echo or of TCP-based methodologies (similar to those
132	      outlined in "IPPM Metrics for Measuring Connectivity" [4]).
133	      However, some approaches may introduce greater uncertainty in the
134	      time for the destination to produce a response (see
135	      Section 3.7.3).

137	   +  Ease of interpretation: in some circumstances, the round-trip time
138	      is in fact the quantity of interest; deducing it from matching
139	      one-way measurements and an assumption of the destination
140	      processing time is less direct and potentially less accurate.

142	2.2. General Issues Regarding Time

144	   Whenever a time (i.e., a moment in history) is mentioned here, it is
145	   understood to be measured in seconds (and fractions) relative to UTC.

147	   As described more fully in the Framework document, there are four
148	   distinct, but related notions of clock uncertainty:

150	   synchronization*

152	        measures the extent to which two clocks agree on what time it
153	        is.  For example, the clock on one host might be 5.4 msec ahead
154	        of the clock on a second host.

156	   accuracy*

158	        measures the extent to which a given clock agrees with UTC.  For
159	        example, the clock on a host might be 27.1 msec behind UTC.

161	   resolution*

163	        measures the precision of a given clock.  For example, the clock
164	        on an old Unix host might tick only once every 10 msec, and thus
165	        have a resolution of only 10 msec.

167	   skew*

169	        measures the change of accuracy, or of synchronization, with
170	        time.  For example, the clock on a given host might gain 1.3
171	        msec per hour and thus be 27.1 msec behind UTC at one time and
172	        only 25.8 msec an hour later.  In this case, we say that the
173	        clock of the given host has a skew of 1.3 msec per hour relative
174	        to UTC, which threatens accuracy.  We might also speak of the
175	        skew of one clock relative to another clock, which threatens
176	        synchronization.
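
   {Comment: The arithmetic of the skew example can be checked with a
   short sketch.  The function name offset_after and the sign
   convention (a negative offset means "behind UTC", a positive skew
   means the clock gains time) are assumptions made here for
   illustration; they are not taken from the Framework document.}

```python
def offset_after(offset_msec: float, skew_msec_per_hour: float,
                 hours: float) -> float:
    """Clock offset relative to UTC after `hours`, assuming constant skew.

    Assumed convention: negative offset = behind UTC, positive skew = gaining.
    """
    return offset_msec + skew_msec_per_hour * hours


# The example from the text: 27.1 msec behind UTC, gaining 1.3 msec per hour.
later = offset_after(-27.1, 1.3, 1.0)
print(f"{abs(later):.1f} msec behind UTC")  # prints "25.8 msec behind UTC"
```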

178	3. A Singleton Definition for Round-trip Delay

180	3.1. Metric Name:

182	   Type-P-Round-trip-Delay

184	3.2. Metric Parameters:

186	   +  Src, the IP address of a host

188	   +  Dst, the IP address of a host

190	   +  T, a time

192	3.3. Metric Units:

194	   The value of a Type-P-Round-trip-Delay is either a non-negative real
195	   number, or an undefined (informally, infinite) number of seconds.

197	3.4. Definition:

199	   For a non-negative real number dT, >>the *Type-P-Round-trip-Delay*
200	   from Src to Dst at T is dT<< means that Src sent the first bit of a
201	   Type-P packet to Dst at wire-time* T, that Dst received that packet,
202	   then sent a Type-P packet back to Src, and that Src received the last
203	   bit of that packet at wire-time T+dT.

205	   >>The *Type-P-Round-trip-Delay* from Src to Dst at T is undefined
206	   (informally, infinite)<< means that Src sent the first bit of a Type-
207	   P packet to Dst at wire-time T and that (either Dst did not receive
208	   the packet, Dst did not send a Type-P packet in response, or) Src did
209	   not receive that response packet.

211	   >>The *Type-P-Round-trip-Delay* between Src and Dst at T<< means
212	   either the *Type-P-Round-trip-Delay* from Src to Dst at T or the
213	   *Type-P-Round-trip-Delay* from Dst to Src at T.  When this notion is
214	   used, it is understood to be specifically ambiguous which host acts
215	   as Src and which as Dst.  {This ambiguity will usually be a small
216	   price to pay for being able to have one measurement, launched from
217	   either Src or Dst, rather than having two measurements.}

219	   Suggestions for what to report along with metric values appear in
220	   Section 3.8 after a discussion of the metric, methodologies for
221	   measuring the metric, and error analysis.

223	3.5. Discussion:

225	   Type-P-Round-trip-Delay is a relatively simple analytic metric, and
226	   one that we believe will afford effective methods of measurement.

228	   The following issues are likely to come up in practice:

230	   +  The timestamp values (T) for the time at which delays are measured
231	      should be fairly accurate in order to draw meaningful conclusions
232	      about the state of the network at a given T.  Therefore, Src
233	      should have an accurate knowledge of time-of-day.  NTP [3] affords
234	      one way to achieve time accuracy to within several milliseconds.
235	      Depending on the NTP server, higher accuracy may be achieved, for
236	      example when NTP servers make use of GPS systems as a time source.
237	      Note that NTP will adjust the instrument's clock.  If an
238	      adjustment is made between the time the initial timestamp is taken
239	      and the time the final timestamp is taken, the adjustment will
240	      affect the uncertainty in the measured delay.  This uncertainty
241	      must be accounted for in the instrument's calibration.

243	   +  A given methodology will have to include a way to determine
244	      whether a delay value is infinite or whether it is merely very
245	      large (and the packet is yet to arrive at Dst).  As noted by
246	      Mahdavi and Paxson [4], simple upper bounds (such as the 255
247	      seconds theoretical upper bound on the lifetimes of IP
248	      packets [5]) could be used, but good engineering, including an
249	      understanding of packet lifetimes, will be needed in practice.
250	      {Comment: Note that, for many applications of these metrics, the
251	      harm in treating a large delay as infinite might be zero or very
252	      small.  A TCP data packet, for example, that arrives only after
253	      several multiples of the RTT may as well have been lost.}

255	   +  If the packet is duplicated so that multiple non-corrupt instances
256	      of the response arrive back at the source, then the packet is
257	      counted as received, and the first instance to arrive back at the
258	      source determines the packet's round-trip delay.

260	   +  If the packet is fragmented and if, for whatever reason,
261	      reassembly does not occur, then the packet will be deemed lost.
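
   {Comment: As a minimal sketch of the last three points, the
   following shows one way an implementation might turn the observed
   arrival times of a response into a singleton value.  The function
   name singleton_delay, the use of None for "undefined", and the
   explicit loss_threshold parameter are assumptions of this sketch,
   not part of the metric definition.}

```python
from typing import Iterable, Optional


def singleton_delay(t_sent: float,
                    response_arrivals: Iterable[float],
                    loss_threshold: float) -> Optional[float]:
    """Return the round-trip delay in seconds, or None if it is undefined.

    response_arrivals holds the arrival time at Src of every non-corrupt
    copy of the response (more than one if the packet was duplicated).
    A response that never arrives, or that arrives after loss_threshold
    seconds, is treated as lost.
    """
    delays = [t_arrival - t_sent for t_arrival in response_arrivals]
    # Discard nonsensical negative values and anything beyond the threshold.
    in_time = [d for d in delays if 0.0 <= d <= loss_threshold]
    if not in_time:
        return None  # undefined (informally, infinite)
    # The first instance to arrive determines the round-trip delay.
    return min(in_time)
```

   For example, singleton_delay(100.0, [100.30, 100.25], 255.0) yields
   a delay of about 0.25 seconds, while an empty list of arrivals
   yields None.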

263	3.6. Methodologies:

265	   As with other Type-P-* metrics, the detailed methodology will depend
266	   on the Type-P (e.g., protocol number, UDP/TCP port number, size,
267	   precedence).

269	   Generally, for a given Type-P, the methodology would proceed as
270	   follows:

272	   +  At the Src host, select Src and Dst IP addresses, and form a test
273	      packet of Type-P with these addresses.  Any 'padding' portion of
274	      the packet needed only to make the test packet a given size should
275	      be filled with randomized bits to avoid a situation in which the
276	      measured delay is lower than it would otherwise be due to
277	      compression techniques along the path.  The test packet must have
278	      some identifying information so that the response to it can be
279	      identified by Src when Src receives the response; one means to do
280	      this is by placing the timestamp generated just before sending the
281	      test packet in the packet itself.

283	   +  At the Dst host, arrange to receive and respond to the test
284	      packet.  At the Src host, arrange to receive the corresponding
285	      response packet.

287	   +  At the Src host, take the initial timestamp and then send the
288	      prepared Type-P packet towards Dst.  Note that the timestamp could
289	      be placed inside the packet, or kept separately as long as the
290	      packet contains a suitable identifier so the received timestamp
291	      can be compared with the send timestamp.

293	   +  If the packet arrives at Dst, send a corresponding response packet
294	      back from Dst to Src as soon as possible.

296	   +  If the response packet arrives within a reasonable period of time,
297	      take the final timestamp as soon as possible upon the receipt of
298	      the packet.  By subtracting the two timestamps, an estimate of
299	      round-trip delay can be computed.  If the delay between the
300	      initial timestamp and the actual sending of the packet is known,
301	      then the estimate could be adjusted by subtracting this amount;
302	      uncertainty in this value must be taken into account in error
303	      analysis.  Similarly, if the delay between the actual receipt of
304	      the response packet and final timestamp is known, then the
305	      estimate could be adjusted by subtracting this amount; uncertainty
306	      in this value must be taken into account in error analysis.  See
307	      the next section, "Errors and Uncertainties", for a more detailed
308	      discussion.

310	   +  If the packet fails to arrive within a reasonable period of time,
311	      the round-trip delay is taken to be undefined (informally,
312	      infinite).  Note that the threshold of 'reasonable' is a parameter
313	      of the methodology.
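
   {Comment: The Src-side steps above can be sketched as follows.
   This is an illustration under stated assumptions, not a reference
   implementation.  The header layout, the names build_test_packet and
   measure_once, the default threshold, and the "exchange" callable
   (which stands in for whatever transport sends the probe and waits
   up to the threshold for a response) are all hypothetical.  The
   sketch uses a monotonic clock for the interval so that clock
   adjustments between the two timestamps do not disturb the
   difference; recording the time T relative to UTC would need a
   separate wall-clock reading.}

```python
import os
import struct
import time
from typing import Callable, Optional

# Hypothetical probe header: 32-bit sequence number plus the send timestamp.
HEADER = struct.Struct("!Id")


def build_test_packet(seq: int, t_initial: float, size: int) -> bytes:
    """Form a test payload of `size` bytes; padding is random, so that
    compression along the path cannot shorten the measured delay."""
    if size < HEADER.size:
        raise ValueError("size is too small to hold the header")
    return HEADER.pack(seq, t_initial) + os.urandom(size - HEADER.size)


def measure_once(exchange: Callable[[bytes, float], Optional[bytes]],
                 seq: int,
                 size: int = 60,
                 threshold: float = 10.0,   # assumed loss threshold, seconds
                 send_lag: float = 0.0,     # known timestamp-to-send delay
                 recv_lag: float = 0.0,     # known receipt-to-timestamp delay
                 clock: Callable[[], float] = time.monotonic
                 ) -> Optional[float]:
    """Return an estimate of round-trip delay, or None if undefined."""
    t_initial = clock()                      # initial timestamp
    probe = build_test_packet(seq, t_initial, size)
    response = exchange(probe, threshold)    # None means nothing arrived in time
    t_final = clock()                        # final timestamp
    # The response must carry the identifying header of this probe.
    if response is None or response[:HEADER.size] != probe[:HEADER.size]:
        return None
    return (t_final - t_initial) - send_lag - recv_lag
```

   Passing the clock and the transport as parameters keeps the
   procedure separate from any particular Type-P, and makes it
   possible to exercise the arithmetic with a simulated reflector.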

315	   Issues such as the packet format and the means by which Dst knows
316	   when to expect the test packet are outside the scope of this
317	   document.

319	   {Comment: Note that you cannot in general add two Type-P-One-way-
320	   Delay values (see [2]) to form a Type-P-Round-trip-Delay value.  In
321	   order to form a Type-P-Round-trip-Delay value, the return packet must
322	   be triggered by the reception of a packet from Src.}

324	   {Comment: "ping" would qualify as a round-trip measure under this
325	   definition, with a Type-P of ICMP echo request/reply with 60-byte
326	   packets.  However, the uncertainties associated with a typical ping
327	   program must be analyzed as in the next section, including the type
328	   of reflecting point (a router may not handle an ICMP request in the
329	   fast path) and effects of load on the reflecting point.}

331	3.7. Errors and Uncertainties:

333	   The description of any specific measurement method should include an
334	   accounting and analysis of various sources of error or uncertainty.
335	   The Framework document provides general guidance on this point, but
336	   we note here the following specifics related to delay metrics:

338	   +  Errors or uncertainties due to uncertainty in the clock of the Src
339	      host.

341	   +  Errors or uncertainties due to the difference between 'wire time'
342	      and 'host time'.

344	   +  Errors or uncertainties due to time required by the Dst to receive
345	      the packet from the Src and send the corresponding response.

347	   In addition, the loss threshold may affect the results.  Each of
348	   these is discussed in more detail below, along with a section
349	   ("Calibration") on accounting for these errors and uncertainties.

351	3.7.1. Errors or Uncertainties Related to Clocks

353	   The uncertainty in a measurement of round-trip delay is related, in
354	   part, to uncertainty in the clock of the Src host.  In the following,
355	   we refer to the clock used to measure when the packet was sent from
356	   Src as the source clock, and we refer to the observed time when the
357	   packet was sent by the source as Tinitial, and the observed time when
358	   the packet was received by the source as Tfinal.  Alluding to the
359	   notions of synchronization, accuracy, resolution, and skew mentioned
360	   in the Introduction, we note the following:

362	   +  While in one-way delay there is an issue of the synchronization of
363	      the source clock and the destination clock, in round-trip delay
364	      there is an (easier) issue of self-synchronization, as it were,
365	      between the source clock at the time the test packet is sent and
366	      the (same) source clock at the time the response packet is
367	      received.  Theoretically a very severe case of skew could threaten
368	      this.  In practice, the greater threat is anything that would
369	      cause a discontinuity in the source clock during the time between
370	      the taking of the initial and final timestamp.  This might happen,
371	      for example, with certain implementations of NTP.

373	   +  The accuracy of a clock is important only in identifying the time
374	      at which a given delay was measured.  Accuracy, per se, has no
375	      importance to the accuracy of the measurement of delay.

377	   +  The resolution of a clock adds to uncertainty about any time
378	      measured with it.  Thus, if the source clock has a resolution of
379	      10 msec, then this adds 10 msec of uncertainty to any time value
380	      measured with it.  We will denote the resolution of the source
381	      clock as Rsource.

383	   Taking these items together, we note that the naive computation
384	   Tfinal-Tinitial may be off by as much as 2*Rsource.

386	3.7.2. Errors or Uncertainties Related to Wire-time vs Host-time

388	   As we have defined round-trip delay, we would like to measure the
389	   time between when the test packet leaves the network interface of Src
390	   and when the corresponding response packet (completely) arrives at
391	   the network interface of Src, and we refer to these as "wire times".
392	   If the timings are themselves performed by software on Src, however,
393	   then this software can only directly measure the time between when
394	   Src grabs a timestamp just prior to sending the test packet and when
395	   it grabs a timestamp just after having received the response packet,
396	   and we refer to these two points as "host times".

398	   Another contributor to this problem is time spent at Dst between the
399	   receipt there of the test packet and the sending of the response
400	   packet.  Ideally, this time is zero; it is explored further in the
401	   next section.

403	   To the extent that the difference between wire time and host time is
404	   accurately known, this knowledge can be used to correct for host time
405	   measurements and the corrected value more accurately estimates the
406	   desired (wire time) metric.

408	   To the extent, however, that the difference between wire time and
409	   host time is uncertain, this uncertainty must be accounted for in an
410	   analysis of a given measurement method.  We denote by Hinitial an
411	   upper bound on the uncertainty in the difference between wire time
412	   and host time on the Src host in sending the test packet, and
413	   similarly define Hfinal for the difference on the Src host in
414	   receiving the response packet.  We then note that these problems
415	   introduce a total uncertainty of Hinitial + Hfinal.  This estimate of
416	   total wire-vs-host uncertainty should be included in the
417	   error/uncertainty analysis of any measurement implementation.

419	3.7.3. Errors or Uncertainties Related to Dst Producing a Response

421	   Any time spent by the destination host in receiving and recognizing
422	   the packet from Src, and then producing and sending the corresponding
423	   response adds additional error and uncertainty to the round-trip
424	   delay measurement.  The error equals the difference between the wire-
425	   time the first bit of the packet is received by Dst and the wire-time
426	   the first bit of the response is sent by Dst.  To the extent that
427	   this difference is accurately known, this knowledge can be used to
428	   correct the desired metric.  To the extent, however, that this
429	   difference is uncertain, this uncertainty must be accounted for in
430	   the error analysis of a measurement implementation. We denote by
431	   Hrefl the difference between the two wire-times.

433	3.7.4. Calibration

435	   Generally, the measured values can be decomposed as follows:

437	       measured value = true value + systematic error + random error

439	   If the systematic error (the constant bias in measured values) can be
440	   determined, it can be compensated for in the reported results.

442	       reported value = measured value - systematic error

444	   therefore

446	       reported value = true value + random error

448	   The goal of calibration is to determine the systematic and random
449	   error generated by the instruments themselves in as much detail as
450	   possible.  At a minimum, a bound ("e") should be found such that the
451	   reported value is in the range (true value - e) to (true value + e)
452	   at least 95 percent of the time.  We call "e" the calibration error
453	   for the measurements.  It represents the degree to which the values
454	   produced by the measurement instrument are repeatable; that is, how
455	   closely an actual delay of 30 ms is reported as 30 ms.  {Comment: 95
456	   percent was chosen because (1) some confidence level is desirable to
457	   be able to remove outliers which will be found in measuring any
458	   physical property; and (2) a particular confidence level should be
459	   specified so that the results of independent implementations can be
460	   compared.}

462	   From the discussion in the previous three sections, the error in
463	   measurements could be bounded by determining all the individual
464	   uncertainties, and adding them together to form
465	       2*Rsource + Hinitial + Hfinal + Hrefl.
466	   However, reasonable bounds on both the clock-related uncertainty
467	   captured by the first term and the host-related uncertainty captured
468	   by the last three terms should be possible by careful design
469	   techniques and calibrating the instruments using a known, isolated,
470	   network in a lab.
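
   {Comment: The bound above is a plain sum, as the following sketch
   shows.  The function name error_bound and the sample values are
   assumptions chosen only for illustration; all quantities are in
   seconds.}

```python
def error_bound(r_source: float, h_initial: float,
                h_final: float, h_refl: float) -> float:
    """Worst-case bound 2*Rsource + Hinitial + Hfinal + Hrefl, in seconds."""
    clock_related = 2 * r_source                  # two timestamps, one clock
    host_related = h_initial + h_final + h_refl   # wire-vs-host and reflection
    return clock_related + host_related


# Illustrative values: a 10 msec clock resolution dominates the bound.
bound = error_bound(r_source=0.010, h_initial=0.001,
                    h_final=0.001, h_refl=0.002)
print(f"{bound * 1000:.1f} msec")  # prints "24.0 msec"
```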

472	   The host-related uncertainties, Hinitial + Hfinal + Hrefl, could be
473	   bounded by connecting two instruments back-to-back with a high-speed
474	   serial link or isolated LAN segment.  In this case, repeated
475	   measurements are measuring the same round-trip delay.

477	   If the test packets are small, such a network connection has a
478	   minimal delay that may be approximated by zero.  The measured delay
479	   therefore contains only systematic and random error in the
480	   instrumentation.  The "average value" of repeated measurements is the
481	   systematic error, and the variation is the random error.

483	   One way to compute the systematic error, and the random error to a
484	   95% confidence, is to repeat the experiment many times (at least
485	   hundreds of tests).  The systematic error would then be the median.
486	   The random error could then be found by removing the systematic error
487	   from the measured values.  The 95% confidence interval would be the
488	   range from the 2.5th percentile to the 97.5th percentile of these
489	   deviations from the true value.  The calibration error "e" could then
490	   be taken to be the largest absolute value of these two numbers, plus
491	   the clock-related uncertainty.  {Comment: as described, this bound is
492	   relatively loose since the uncertainties are added, and the absolute
493	   value of the largest deviation is used.  As long as the resulting
494	   value is not a significant fraction of the measured values, it is a
495	   reasonable bound.  If the resulting value is a significant fraction
496	   of the measured values, then more exact methods will be needed to
497	   compute the calibration error.}
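
   {Comment: The procedure in the preceding paragraph can be sketched
   as follows.  The names percentile and calibrate, and the choice of
   linear interpolation between ranks when taking percentiles, are
   assumptions of this sketch; the text does not prescribe a
   particular percentile method.  Units are arbitrary but must be the
   same for the measurements and the clock-related uncertainty.}

```python
import statistics
from typing import List, Sequence, Tuple


def percentile(sorted_values: List[float], p: float) -> float:
    """Percentile p (0..100) of a sorted list, interpolating between ranks."""
    if not sorted_values:
        raise ValueError("no values")
    rank = (len(sorted_values) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (rank - lo)


def calibrate(measured: Sequence[float],
              clock_uncertainty: float) -> Tuple[float, float]:
    """Return (systematic error, calibration error e) from back-to-back
    measurements over a link whose true delay is approximated by zero."""
    systematic = statistics.median(measured)
    # Random error: what is left after removing the systematic error.
    deviations = sorted(m - systematic for m in measured)
    low = percentile(deviations, 2.5)
    high = percentile(deviations, 97.5)
    # Loose bound: largest absolute deviation plus clock-related uncertainty.
    e = max(abs(low), abs(high)) + clock_uncertainty
    return systematic, e
```

   With hundreds of calibration measurements as input, the first
   returned value would be subtracted from field measurements and the
   second reported as the calibration error.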

499	   Note that random error is a function of measurement load.  For
500	   example, if many paths will be measured by one instrument, this might
501	   increase interrupts, process scheduling, and disk I/O (for example,
502	   recording the measurements), all of which may increase the random
503	   error in measured singletons.  Therefore, in addition to minimal load
504	   measurements to find the systematic error, calibration measurements
505	   should be performed with the same measurement load that the
506	   instruments will see in the field.

508	   We wish to reiterate that this statistical treatment refers to the
509	   calibration of the instrument; it is used to "calibrate the meter
510	   stick" and say how well the meter stick reflects reality.

512	   In addition to calibrating the instruments for finite delay, two
513	   checks should be made to ensure that packets reported as losses were
514	   really lost.  First, the threshold for loss should be verified.  In
515	   particular, ensure the "reasonable" threshold is reasonable: that it
516	   is very unlikely a packet will arrive after the threshold value, and
517	   therefore the number of packets lost over an interval is not
518	   sensitive to the error bound on measurements.  Second, consider the
519	   possibility that a packet arrives at the network interface, but is
520	   lost due to congestion on that interface or to other resource
521	   exhaustion (e.g. buffers) in the instrument.

523	3.8. Reporting the Metric:

525	   The calibration and context in which the metric is measured MUST be
526	   carefully considered, and SHOULD always be reported along with metric
527	   results.  We now present four items to consider: the Type-P of test
528	   packets, the threshold of infinite delay (if any), error calibration,
529	   and the path traversed by the test packets.  This list is not
530	   exhaustive; any additional information that could be useful in
531	   interpreting applications of the metrics should also be reported.

533	3.8.1. Type-P

535	   As noted in the Framework document [1], the value of the metric may
536	   depend on the type of IP packets used to make the measurement, or
537	   "type-P".  The value of Type-P-Round-trip-Delay could change if the
538	   protocol (UDP or TCP), port number, size, or arrangement for special
539	   treatment (e.g., IP precedence or RSVP) changes.  The exact Type-P
540	   used to make the measurements MUST be accurately reported.

542	3.8.2. Loss threshold

544	   In addition, the threshold (or methodology to distinguish) between a
545	   large finite delay and loss MUST be reported.

547	3.8.3. Calibration Results

549	   +  If the systematic error can be determined, it SHOULD be removed
550	      from the measured values.

552	   +  You SHOULD also report the calibration error, e, such that the
553	      true value is the reported value plus or minus e, with 95%
554	      confidence (see the previous section).

556	   +  If possible, the conditions under which a test packet with finite
557	      delay is reported as lost due to resource exhaustion on the
558	      measurement instrument SHOULD be reported.
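
   As an illustration only, the following sketch shows one way the
   systematic error and the calibration error e might be derived from
   calibration runs.  It assumes a list of differences between measured
   and known delay is available, takes the median as the systematic
   error, and bounds the random error by the 2.5th and 97.5th
   percentiles of the residuals.  The names calibrate, percentile, and
   clock_uncertainty are hypothetical and do not come from this memo.

```python
import math
from statistics import median


def percentile(sorted_vals, x):
    """Smallest value v such that at least x percent of the values are <= v."""
    idx = max(math.ceil(len(sorted_vals) * x / 100) - 1, 0)
    return sorted_vals[idx]


def calibrate(differences, clock_uncertainty=0.0):
    """Estimate (systematic_error, e) from calibration measurements.

    differences: measured delay minus known delay, in seconds.
    clock_uncertainty: assumed extra clock-related uncertainty, in seconds.
    """
    systematic = median(differences)
    # Residuals after the systematic error has been removed.
    residuals = sorted(d - systematic for d in differences)
    low = percentile(residuals, 2.5)
    high = percentile(residuals, 97.5)
    # Loose bound: largest absolute deviation plus the clock uncertainty.
    e = max(abs(low), abs(high)) + clock_uncertainty
    return systematic, e
```

   A reported value would then be the measured value minus the
   systematic error, stated as accurate to plus or minus e.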

560	3.8.4. Path

562	   Finally, the path traversed by the packet SHOULD be reported, if
563	   possible.  In general it is impractical to know the precise path a
564	   given packet takes through the network.  The precise path may be
565	   known for certain Type-P on short or stable paths.  For example, if
566	   Type-P includes the record route (or loose-source route) option in
567	   the IP header, and the path is short enough, and all routers* on the
568	   path support record (or loose-source) route, and the Dst host copies
569	   the path from Src to Dst into the corresponding reply packet, then
570	   the path will be precisely recorded.  This is impractical because the
571	   route must be short enough, many routers do not support (or are not
572	   configured for) record route, and use of this feature would often
573	   artificially worsen the performance observed by removing the packet
574	   from common-case processing.  However, partial information is still
575	   valuable context.  For example, if a host can choose between two
576	   links* (and hence two separate routes from Src to Dst), then the
577	   initial link used is valuable context.  {Comment: For example, with
578	   Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP
579	   by any of several different backbone networks.}
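
   As an illustration only, the four reporting items of this section
   could be carried alongside the results in a record such as the one
   sketched below.  The class and field names are hypothetical; the memo
   does not define a reporting format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RoundTripDelayReport:
    """Context reported with Type-P-Round-trip-Delay results (hypothetical)."""
    type_p: str                                   # 3.8.1: exact Type-P (MUST)
    loss_threshold_s: float                       # 3.8.2: loss threshold (MUST)
    calibration_error_s: Optional[float] = None   # 3.8.3: e at 95% (SHOULD)
    path: List[str] = field(default_factory=list)  # 3.8.4: full or partial path

    def missing_items(self) -> List[str]:
        """Names of the SHOULD-level items that have not been filled in."""
        missing = []
        if self.calibration_error_s is None:
            missing.append("calibration_error_s")
        if not self.path:
            missing.append("path")
        return missing
```

   The two MUST-level items are required constructor arguments, so a
   report cannot be built without them.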

581	4. A Definition for Samples of Round-trip Delay

583	   Given the singleton metric Type-P-Round-trip-Delay, we now define one
584	   particular sample of such singletons.  The idea of the sample is to
585	   select a particular binding of the parameters Src, Dst, and Type-P,
586	   then define a sample of values of parameter T.  The means for
587	   defining the values of T is to select a beginning time T0, a final
588	   time Tf, and an average rate lambda, then define a pseudo-random
589	   Poisson process of rate lambda, whose values fall between T0 and Tf.

591	   The time interval between successive values of T will then average
592	   1/lambda.

594	   {Comment: Note that Poisson sampling is only one way of defining a
595	   sample.  Poisson has the advantage of limiting bias, but other
596	   methods of sampling might be appropriate for different situations.
597	   We encourage others who find such appropriate cases to use this
598	   general framework and submit their sampling method for
599	   standardization.}

601	4.1. Metric Name:

603	   Type-P-Round-trip-Delay-Poisson-Stream

605	4.2. Metric Parameters:

607	   +  Src, the IP address of a host

609	   +  Dst, the IP address of a host

611	   +  T0, a time

613	   +  Tf, a time

615	   +  lambda, a rate in reciprocal seconds

617	4.3. Metric Units:

619	   A sequence of pairs; the elements of each pair are:

621	   +  T, a time, and

623	   +  dT, either a non-negative real number or an undefined number of
624	      seconds.

626	   The values of T in the sequence are monotonically increasing.  Note
627	   that T would be a valid parameter to Type-P-Round-trip-Delay, and
628	   that dT would be a valid value of Type-P-Round-trip-Delay.

630	4.4. Definition:

632	   Given T0, Tf, and lambda, we compute a pseudo-random Poisson process
633	   beginning at or before T0, with average arrival rate lambda, and
634	   ending at or after Tf.  Those time values greater than or equal to T0
635	   and less than or equal to Tf are then selected.  At each of the times
636	   in this process, we obtain the value of Type-P-Round-trip-Delay at
637	   this time.  The value of the sample is the sequence made up of the
638	   resulting <time, delay> pairs.  If there are no such pairs, the
639	   sequence is of length zero and the sample is said to be empty.
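
   As an illustration only, the selection of times can be sketched by
   summing exponentially distributed gaps with mean 1/lambda.  The
   function name poisson_schedule and the optional rng argument are
   hypothetical.  The sketch starts the process exactly at T0, which is
   one permitted choice of "at or before T0".

```python
import random


def poisson_schedule(t0, tf, lam, rng=None):
    """Return times T with t0 <= T <= tf from a Poisson process of rate lam.

    t0, tf: start and end times in seconds; lam: rate in reciprocal seconds.
    rng: optional random.Random instance, so a run can be reproduced.
    """
    rng = rng or random.Random()
    times = []
    # Gaps between events of a Poisson process are exponential, mean 1/lam.
    t = t0 + rng.expovariate(lam)
    while t <= tf:
        times.append(t)
        t += rng.expovariate(lam)
    return times
```

   An empty list corresponds to the empty sample described above.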

641	4.5. Discussion:

643	   The reader should be familiar with the in-depth discussion of Poisson
644	   sampling in the Framework document [1], which includes methods to
645	   compute and verify the pseudo-random Poisson process.

647	   We specifically do not constrain the value of lambda, except to note
648	   the extremes.  If the rate is too large, then the measurement traffic
649	   will perturb the network, and itself cause congestion.  If the rate
650	   is too small, then you might not capture interesting network
651	   behavior.  {Comment: We expect to document our experiences with, and
652	   suggestions for, lambda elsewhere, culminating in a "best current
653	   practices" document.}

655	   Since a pseudo-random number sequence is employed, the sequence of
656	   times, and hence the value of the sample, is not fully specified.
657	   Pseudo-random number generators of good quality will be needed to
658	   achieve the desired qualities.

660	   The sample is defined in terms of a Poisson process both to avoid the
661	   effects of self-synchronization and to capture a sample that is
662	   statistically as unbiased as possible.  {Comment: there is, of
663	   course, no claim that real Internet traffic arrives according to a
664	   Poisson arrival process.}  The Poisson process is used to schedule
665	   the delay measurements.  The test packets will generally not arrive
666	   at Dst according to a Poisson distribution, nor will response packets
667	   arrive at Src according to a Poisson distribution, since they are
668	   influenced by the network.

670	   All the singleton Type-P-Round-trip-Delay metrics in the sequence
671	   will have the same values of Src, Dst, and Type-P.

673	   Note also that, given one sample that runs from T0 to Tf, and given
674	   new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the
675	   subsequence of the given sample whose time values fall between T0'
676	   and Tf' is also a valid Type-P-Round-trip-Delay-Poisson-Stream
677	   sample.

679	4.6. Methodologies:

681	   The methodologies follow directly from:

683	   +  the selection of specific times, using the specified Poisson
684	      arrival process, and

686	   +  the methodologies discussion already given for the singleton Type-
687	      P-Round-trip-Delay metric.

689	   Care must, of course, be given to correctly handle out-of-order
690	   arrival of test or response packets; it is possible that the Src
691	   could send one test packet at TS[i], then send a second test packet
692	   (later) at TS[i+1], and it could receive the second response packet
693	   at TR[i+1], and then receive the first response packet (later) at
694	   TR[i].
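
   As an illustration only, out-of-order arrival can be handled by
   matching each response to its test packet by an identifier rather
   than by arrival order.  The sketch assumes each test packet carries a
   sequence number that Dst echoes in the response; that field, and the
   name pair_responses, are hypothetical.  An undefined delay is
   represented as math.inf.

```python
import math


def pair_responses(sent, received, loss_threshold):
    """Build <T, dT> pairs from send and receive records.

    sent: dict mapping sequence number to send time TS, in seconds.
    received: iterable of (sequence number, receive time TR), arrival order.
    loss_threshold: delay in seconds beyond which a packet counts as lost.
    """
    arrival = {}
    for seq, tr in received:
        # Keep the first response for each sequence number; drop duplicates.
        arrival.setdefault(seq, tr)
    sample = []
    for seq, ts in sorted(sent.items(), key=lambda kv: kv[1]):
        tr = arrival.get(seq)
        if tr is None or tr - ts > loss_threshold:
            sample.append((ts, math.inf))   # undefined delay
        else:
            sample.append((ts, tr - ts))
    return sample
```

   The output is ordered by send time, regardless of the order in which
   the responses arrived.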

696	4.7. Errors and Uncertainties:

698	   In addition to sources of errors and uncertainties associated with
699	   methods employed to measure the singleton values that make up the
700	   sample, care must be given to analyze the accuracy of the Poisson
701	   process with respect to the wire-times of the sending of the test
702	   packets.  Problems with this process could be caused by several
703	   things, including problems with the pseudo-random number techniques
704	   used to generate the Poisson arrival process, or with jitter in the
705	   value of Hinitial (mentioned above as uncertainty in the singleton
706	   delay metric).  The Framework document shows how to use the Anderson-
707	   Darling test to verify the accuracy of a Poisson process over small
708	   time frames.  {Comment: The goal is to ensure that test packets are
709	   sent "close enough" to a Poisson schedule, and avoid periodic
710	   behavior.}
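
   As an illustration only, the Anderson-Darling statistic A^2 for the
   gaps between successive send times, tested against an exponential
   distribution of known rate lambda, can be computed as sketched below.
   The function name is hypothetical, and the sketch is not the code
   given in the Framework document.

```python
import math


def anderson_darling_expon(gaps, lam):
    """A^2 statistic of gaps against an exponential distribution of rate lam.

    gaps: strictly positive intervals between successive send times, seconds.
    """
    x = sorted(gaps)
    n = len(x)
    s = 0.0
    for i in range(n):
        # log F(x_i), computed with expm1 for accuracy on small gaps.
        log_cdf = math.log(-math.expm1(-lam * x[i]))
        # log(1 - F(x_(n-1-i))) is exactly -lam * x for the exponential.
        log_sf = -lam * x[n - 1 - i]
        s += (2 * i + 1) * (log_cdf + log_sf)
    return -n - s / n
```

   Large values of A^2 argue against a Poisson schedule.  For a fully
   specified distribution, the commonly tabulated critical value at 5%
   significance is about 2.492.  A strictly periodic schedule of 100
   gaps of 1/lambda yields roughly 45.9, far above that value.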

712	4.8. Reporting the Metric:

714	   You MUST report the calibration and context for the underlying
715	   singletons along with the stream.  (See "Reporting the metric" for
716	   Type-P-Round-trip-Delay.)

718	5. Some Statistics Definitions for Round-trip Delay

720	   Given the sample metric Type-P-Round-trip-Delay-Poisson-Stream, we
721	   now offer several statistics of that sample.  These statistics are
722	   offered mostly to be illustrative of what could be done.

724	5.1. Type-P-Round-trip-Delay-Percentile

726	   Given a Type-P-Round-trip-Delay-Poisson-Stream and a percent X
727	   between 0% and 100%, the Xth percentile of all the dT values in the
728	   Stream.  In computing this percentile, undefined values are treated
729	   as infinitely large.  Note that this means that the percentile could
730	   thus be undefined (informally, infinite).  In addition, the Type-P-
731	   Round-trip-Delay-Percentile is undefined if the sample is empty.

733	   Example: suppose we take a sample and the results are:
734	      Stream1 = <
735	      <T1, 100 msec>
736	      <T2, 110 msec>
737	      <T3, undefined>
738	      <T4, 90 msec>
739	      <T5, 500 msec>
740	      >
741	   Then the 50th percentile would be 110 msec, since 90 msec and 100
742	   msec are smaller and 500 msec and 'undefined' are larger.

744	   Note that if the possibility that a packet with finite delay is
745	   reported as lost is significant, then a high percentile (90th or
746	   95th) might be reported as infinite instead of finite.
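
   As an illustration only, the percentile can be sketched as follows.
   The sketch assumes that the Xth percentile is the smallest value for
   which at least X percent of the sample is less than or equal to it.
   Delays are in msec, None stands for an undefined dT, and the function
   name is hypothetical.

```python
import math


def delay_percentile(dts, x):
    """Xth percentile of the dT values; None means undefined.

    dts: delays in msec, with None for an undefined delay.
    x: percent between 0 and 100.
    """
    if not dts:
        return None   # undefined for an empty sample
    # Undefined delays sort as infinitely large.
    vals = sorted(math.inf if d is None else d for d in dts)
    idx = max(math.ceil(len(vals) * x / 100) - 1, 0)
    v = vals[idx]
    return None if math.isinf(v) else v
```

   For the values of Stream1, delay_percentile([100, 110, None, 90,
   500], 50) returns 110, and the 100th percentile is undefined.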

748	5.2. Type-P-Round-trip-Delay-Median

750	   Given a Type-P-Round-trip-Delay-Poisson-Stream, the median of all the
751	   dT values in the Stream.  In computing the median, undefined values
752	   are treated as infinitely large.  As with Type-P-Round-trip-Delay-
753	   Percentile, Type-P-Round-trip-Delay-Median is undefined if the sample
754	   is empty.

756	   As noted in the Framework document, the median differs from the 50th
757	   percentile only when the sample contains an even number of values, in
758	   which case the mean of the two central values is used.

760	   Example: suppose we take a sample and the results are:
761	      Stream2 = <
762	      <T1, 100 msec>
763	      <T2, 110 msec>
764	      <T3, undefined>
765	      <T4, 90 msec>
766	      >
767	   Then the median would be 105 msec, the mean of 100 msec and 110 msec,
768	   the two central values.
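
   As an illustration only, the median can be sketched as follows, with
   delays in msec and None standing for an undefined dT.  The function
   name is hypothetical.

```python
import math


def delay_median(dts):
    """Median of the dT values; None means undefined.

    dts: delays in msec, with None for an undefined delay.
    """
    if not dts:
        return None   # undefined for an empty sample
    # Undefined delays sort as infinitely large.
    vals = sorted(math.inf if d is None else d for d in dts)
    n = len(vals)
    mid = n // 2
    if n % 2:
        m = vals[mid]
    else:
        # Even count: mean of the two central values.
        m = (vals[mid - 1] + vals[mid]) / 2
    return None if math.isinf(m) else m
```

   For the values of Stream2, delay_median([100, 110, None, 90]) returns
   105.0.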

770	5.3. Type-P-Round-trip-Delay-Minimum

772	   Given a Type-P-Round-trip-Delay-Poisson-Stream, the minimum of all
773	   the dT values in the Stream.  In computing this, undefined values are
774	   treated as infinitely large.  Note that this means that the minimum
775	   could thus be undefined (informally, infinite) if all the dT values
776	   are undefined.  In addition, the Type-P-Round-trip-Delay-Minimum is
777	   undefined if the sample is empty.

779	   In the Stream2 example above, the minimum would be 90 msec.

781	5.4. Type-P-Round-trip-Delay-Inverse-Percentile

783	   Given a Type-P-Round-trip-Delay-Poisson-Stream and a non-negative
784	   time duration threshold, the fraction of all the dT values in the
785	   Stream less than or equal to the threshold.  The result could be as
786	   low as 0% (if all the dT values exceed threshold) or as high as 100%.
787	   Type-P-Round-trip-Delay-Inverse-Percentile is undefined if the sample
788	   is empty.

790	   In the Stream2 example above, the Inverse-Percentile of 103 msec
791	   would be 50%, since two of the four dT values are at or below it.
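
   As an illustration only, the inverse percentile can be sketched as
   follows, with delays in msec and None standing for an undefined dT.
   The function name is hypothetical.

```python
def delay_inverse_percentile(dts, threshold):
    """Percent of dT values less than or equal to threshold.

    dts: delays in msec, with None for an undefined delay.
    Returns None (undefined) for an empty sample.
    """
    if not dts:
        return None
    # Undefined delays are infinitely large, so they never count.
    within = sum(1 for d in dts if d is not None and d <= threshold)
    return 100.0 * within / len(dts)
```

   For the values of Stream2 and a threshold of 103 msec this returns
   50.0; for the values of Stream1 it returns 40.0.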

793	6. Security Considerations

795	   Conducting Internet measurements raises both security and privacy
796	   concerns.  This memo does not specify an implementation of the
797	   metrics, so it does not directly affect the security of the Internet
798	   nor of applications which run on the Internet.  However,
799	   implementations of these metrics must be mindful of security and
800	   privacy concerns.

802	   There are two types of security concerns: potential harm caused by
803	   the measurements, and potential harm to the measurements.  The
804	   measurements could cause harm because they are active, and inject
805	   packets into the network.  The measurement parameters MUST be
806	   carefully selected so that the measurements inject trivial amounts of
807	   additional traffic into the networks they measure.  If they inject
808	   "too much" traffic, they can skew the results of the measurement, and
809	   in extreme cases cause congestion and denial of service.

811	   The measurements themselves could be harmed by routers giving
812	   measurement traffic a different priority than "normal" traffic, or by
813	   an attacker injecting artificial measurement traffic.  If routers can
814	   recognize measurement traffic and treat it separately, the
815	   measurements will not reflect actual user traffic.  If an attacker
816	   injects artificial traffic that is accepted as legitimate, the loss
817	   rate will be artificially lowered.  Therefore, the measurement
818	   methodologies SHOULD include appropriate techniques to reduce the
819	   probability measurement traffic can be distinguished from "normal"
820	   traffic.  Authentication techniques, such as digital signatures, may
821	   be used where appropriate to guard against injected traffic attacks.

823	   The privacy concerns of network measurement are limited by the active
824	   measurements described in this memo.  Unlike passive measurements,
825	   there can be no release of existing user data.

827	7. Acknowledgements

829	   Special thanks are due to Vern Paxson and to Will Leland for several
830	   useful suggestions.

832	8. References

834	   [1]  V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for
835	        IP Performance Metrics", RFC 2330, May 1998.

837	   [2]  G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay
838	        Metric for IPPM", Internet Draft <draft-ietf-ippm-delay-05.txt>,
839	        November 1998.

841	   [3]  D. Mills, "Network Time Protocol (v3)", RFC 1305, April 1992.

843	   [4]  J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
844	        Connectivity", Internet-Draft <draft-ietf-ippm-
845	        connectivity-03.txt>, October 1998.

847	   [5]  J. Postel, "Internet Protocol", RFC 791, September 1981.

849	   [6]  S. Bradner, "Key words for use in RFCs to Indicate Requirement
850	        Levels", RFC 2119, March 1997.

852	9. Authors' Addresses

854	   Guy Almes
855	   Advanced Network & Services, Inc.
856	   200 Business Park Drive
857	   Armonk, NY  10504
858	   USA

860	   Phone: +1 914 765 1120
861	   EMail: almes@advanced.org

863	   Sunil Kalidindi
864	   Advanced Network & Services, Inc.
865	   200 Business Park Drive
866	   Armonk, NY  10504
867	   USA

869	   Phone: +1 914 765 1128
870	   EMail: kalidindi@advanced.org

872	   Matthew J. Zekauskas
873	   Advanced Network & Services, Inc.
874	   200 Business Park Drive
875	   Armonk, NY  10504
876	   USA

878	   Phone: +1 914 765 1112
879	   EMail: matt@advanced.org

881	   Expiration date: May, 1999