Network Working Group                                          G. Almes
Internet Draft                                             S. Kalidindi
Expiration Date: May 1999                                  M. Zekauskas
                                            Advanced Network & Services
                                                          November 1998


                  A Round-trip Delay Metric for IPPM


1. Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months, and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   To view the entire list of current Internet-Drafts, please check the
   "1id-abstracts.txt" listing contained in the Internet-Drafts shadow
   directories on ftp.is.co.za (Africa), nic.nordu.net (Northern
   Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific
   Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

   This memo provides information for the Internet community. This memo
   does not specify an Internet standard of any kind. Distribution of
   this memo is unlimited.

2. Introduction

   This memo defines a metric for round-trip delay of packets across
   Internet paths. It builds on notions introduced and discussed in the
   IPPM Framework document, RFC 2330 [1], and follows closely the
   corresponding metric for One-way Delay ("A One-way Delay Metric for
   IPPM") [2]; the reader is assumed to be familiar with those
   documents.

   The memo was largely written by copying material from the One-way
   Delay metric.
   The intention is that, where the two metrics are
   similar, they will be described with similar or identical text, and
   that where the two metrics differ, new or modified text will be used.

   This memo is intended to be parallel in structure to a future
   companion document for Packet Loss.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [6].
   Although RFC 2119 was written with protocols in mind, the key words
   are used in this document for similar reasons. They are used to
   ensure the results of measurements from two different implementations
   are comparable, and to note instances when an implementation could
   perturb the network.

   The structure of the memo is as follows:

   + A 'singleton' analytic metric, called Type-P-Round-trip-Delay,
     will be introduced to measure a single observation of round-trip
     delay.

   + Using this singleton metric, a 'sample', called
     Type-P-Round-trip-Delay-Poisson-Stream, will be introduced to
     measure a sequence of singleton delays measured at times taken
     from a Poisson process.

   + Using this sample, several 'statistics' of the sample will be
     defined and discussed.

   This progression from singleton to sample to statistics, with clear
   separation among them, is important.

   Whenever a technical term from the IPPM Framework document is first
   used in this memo, it will be tagged with a trailing asterisk. For
   example, "term*" indicates that "term" is defined in the Framework.

2.1. Motivation

   Round-trip delay of a Type-P* packet from a source host* to a
   destination host is useful for several reasons:

   + Some applications do not perform well (or at all) if end-to-end
     delay between hosts is large relative to some threshold value.

   + Erratic variation in delay makes it difficult (or impossible) to
     support many real-time applications.

   + The larger the value of delay, the more difficult it is for
     transport-layer protocols to sustain high bandwidths.

   + The minimum value of this metric provides an indication of the
     delay due only to propagation and transmission delay.

   + The minimum value of this metric provides an indication of the
     delay that will likely be experienced when the path* traversed is
     lightly loaded.

   + Values of this metric above the minimum provide an indication of
     the congestion present in the path.

   The measurement of round-trip delay instead of one-way delay has
   several weaknesses, summarized here:

   + The Internet path from a source to a destination may differ from
     the path from the destination back to the source ("asymmetric
     paths"), such that different sequences of routers are used for the
     forward and reverse paths. Therefore round-trip measurements
     actually measure the performance of two distinct paths together.

   + Even when the two paths are symmetric, they may have radically
     different performance characteristics due to asymmetric queueing.

   + Performance of an application may depend mostly on the performance
     in one direction. For example, a file transfer using TCP may
     depend more on the performance in the direction that data flows,
     rather than the direction in which acknowledgements travel.

   + In quality-of-service (QoS) enabled networks, provisioning in one
     direction may be radically different than provisioning in the
     reverse direction, and thus the QoS guarantees differ.

   On the other hand, the measurement of round-trip delay has two
   specific advantages:

   + Ease of deployment: unlike in one-way measurement, it is often
     possible to perform some form of round-trip delay measurement
     without installing measurement-specific software at the intended
     destination.
     A variety of approaches are well-known, including
     use of ICMP Echo or of TCP-based methodologies (similar to those
     outlined in "IPPM Metrics for Measuring Connectivity" [4]).
     However, some approaches may introduce greater uncertainty in the
     time for the destination to produce a response (see
     Section 3.7.3).

   + Ease of interpretation: in some circumstances, the round-trip time
     is in fact the quantity of interest; deducing it from matching
     one-way measurements and an assumption of the destination
     processing time is less direct and potentially less accurate.

2.2. General Issues Regarding Time

   Whenever a time (i.e., a moment in history) is mentioned here, it is
   understood to be measured in seconds (and fractions) relative to UTC.

   As described more fully in the Framework document, there are four
   distinct, but related notions of clock uncertainty:

   synchronization*

      measures the extent to which two clocks agree on what time it
      is. For example, the clock on one host might be 5.4 msec ahead
      of the clock on a second host.

   accuracy*

      measures the extent to which a given clock agrees with UTC. For
      example, the clock on a host might be 27.1 msec behind UTC.

   resolution*

      measures the precision of a given clock. For example, the clock
      on an old Unix host might tick only once every 10 msec, and thus
      have a resolution of only 10 msec.

   skew*

      measures the change of accuracy, or of synchronization, with
      time. For example, the clock on a given host might gain 1.3
      msec per hour and thus be 27.1 msec behind UTC at one time and
      only 25.8 msec an hour later. In this case, we say that the
      clock of the given host has a skew of 1.3 msec per hour relative
      to UTC, which threatens accuracy. We might also speak of the
      skew of one clock relative to another clock, which threatens
      synchronization.

3. A Singleton Definition for Round-trip Delay

3.1. Metric Name:

   Type-P-Round-trip-Delay

3.2. Metric Parameters:

   + Src, the IP address of a host

   + Dst, the IP address of a host

   + T, a time

3.3. Metric Units:

   The value of a Type-P-Round-trip-Delay is either a non-negative real
   number, or an undefined (informally, infinite) number of seconds.

3.4. Definition:

   For a non-negative real number dT, >>the *Type-P-Round-trip-Delay*
   from Src to Dst at T is dT<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time* T, that Dst received that packet,
   then sent a Type-P packet back to Src, and that Src received the last
   bit of that packet at wire-time T+dT.

   >>The *Type-P-Round-trip-Delay* from Src to Dst at T is undefined
   (informally, infinite)<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time T and that (either Dst did not
   receive the packet, Dst did not send a Type-P packet in response,
   or) Src did not receive that response packet.

   >>The *Type-P-Round-trip-Delay* between Src and Dst at T<< means
   either the *Type-P-Round-trip-Delay* from Src to Dst at T or the
   *Type-P-Round-trip-Delay* from Dst to Src at T. When this notion is
   used, it is understood to be specifically ambiguous which host acts
   as Src and which as Dst. {This ambiguity will usually be a small
   price to pay for being able to have one measurement, launched from
   either Src or Dst, rather than having two measurements.}

   Suggestions for what to report along with metric values appear in
   Section 3.8 after a discussion of the metric, methodologies for
   measuring the metric, and error analysis.

3.5. Discussion:

   Type-P-Round-trip-Delay is a relatively simple analytic metric, and
   one that we believe will afford effective methods of measurement.
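
   {Comment: As an illustration only, the definition in Section 3.4 can
   be sketched in Python as shown here. The names round_trip_delay,
   t_sent, and t_received are hypothetical and do not come from this
   memo; representing the undefined (informally, infinite) value as
   None is a choice made for this sketch.}

```python
from typing import Optional


def round_trip_delay(t_sent: float,
                     t_received: Optional[float]) -> Optional[float]:
    """Return dT for a test packet whose first bit left Src at wire-time T.

    t_sent     -- wire-time T, in seconds relative to UTC (assumed input)
    t_received -- wire-time at which Src received the last bit of the
                  response, or None if no response was received

    The result is a non-negative number of seconds, or None to stand for
    "undefined (informally, infinite)".
    """
    if t_received is None:
        # Dst never got the packet, never answered, or the answer was lost.
        return None
    dt = t_received - t_sent
    if dt < 0:
        # A negative value can only come from a clock problem at Src.
        raise ValueError("response timestamp precedes send timestamp")
    return dt
```

   A finite result corresponds to the first case of the definition and
   None to the second; the sketch deliberately ignores the difference
   between wire time and host time, which Section 3.7.2 treats as a
   source of uncertainty.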

   The following issues are likely to come up in practice:

   + The timestamp values (T) for the time at which delays are measured
     should be fairly accurate in order to draw meaningful conclusions
     about the state of the network at a given T. Therefore, Src
     should have an accurate knowledge of time-of-day. NTP [3] affords
     one way to achieve time accuracy to within several milliseconds.
     Depending on the NTP server, higher accuracy may be achieved, for
     example when NTP servers make use of GPS systems as a time source.
     Note that NTP will adjust the instrument's clock. If an
     adjustment is made between the time the initial timestamp is taken
     and the time the final timestamp is taken, the adjustment will
     affect the uncertainty in the measured delay. This uncertainty
     must be accounted for in the instrument's calibration.

   + A given methodology will have to include a way to determine
     whether a delay value is infinite or whether it is merely very
     large (and the packet is yet to arrive at Dst). As noted by
     Mahdavi and Paxson [4], simple upper bounds (such as the 255
     seconds theoretical upper bound on the lifetimes of IP
     packets [5]) could be used, but good engineering, including an
     understanding of packet lifetimes, will be needed in practice.
     {Comment: Note that, for many applications of these metrics, the
     harm in treating a large delay as infinite might be zero or very
     small. A TCP data packet, for example, that arrives only after
     several multiples of the RTT may as well have been lost.}

   + If the packet is duplicated so that multiple non-corrupt instances
     of the response arrive back at the source, then the packet is
     counted as received, and the first instance to arrive back at the
     source determines the packet's round-trip delay.

   + If the packet is fragmented and if, for whatever reason,
     reassembly does not occur, then the packet will be deemed lost.

3.6. Methodologies:

   As with other Type-P-* metrics, the detailed methodology will depend
   on the Type-P (e.g., protocol number, UDP/TCP port number, size,
   precedence).

   Generally, for a given Type-P, the methodology would proceed as
   follows:

   + At the Src host, select Src and Dst IP addresses, and form a test
     packet of Type-P with these addresses. Any 'padding' portion of
     the packet needed only to make the test packet a given size should
     be filled with randomized bits to avoid a situation in which the
     measured delay is lower than it would otherwise be due to
     compression techniques along the path. The test packet must have
     some identifying information so that the response to it can be
     identified by Src when Src receives the response; one means to do
     this is by placing the timestamp generated just before sending the
     test packet in the packet itself.

   + At the Dst host, arrange to receive and respond to the test
     packet. At the Src host, arrange to receive the corresponding
     response packet.

   + At the Src host, take the initial timestamp and then send the
     prepared Type-P packet towards Dst. Note that the timestamp could
     be placed inside the packet, or kept separately as long as the
     packet contains a suitable identifier so the received timestamp
     can be compared with the send timestamp.

   + If the packet arrives at Dst, send a corresponding response packet
     back from Dst to Src as soon as possible.

   + If the response packet arrives within a reasonable period of time,
     take the final timestamp as soon as possible upon the receipt of
     the packet. By subtracting the two timestamps, an estimate of
     round-trip delay can be computed. If the delay between the
     initial timestamp and the actual sending of the packet is known,
     then the estimate could be adjusted by subtracting this amount;
     uncertainty in this value must be taken into account in error
     analysis.
     Similarly, if the delay between the actual receipt of
     the response packet and final timestamp is known, then the
     estimate could be adjusted by subtracting this amount; uncertainty
     in this value must be taken into account in error analysis. See
     the next section, "Errors and Uncertainties", for a more detailed
     discussion.

   + If the packet fails to arrive within a reasonable period of time,
     the round-trip delay is taken to be undefined (informally,
     infinite). Note that the threshold of 'reasonable' is a parameter
     of the methodology.

   Issues such as the packet format and the means by which Dst knows
   when to expect the test packet are outside the scope of this
   document.

   {Comment: Note that you cannot in general add two Type-P-One-way-
   Delay values (see [2]) to form a Type-P-Round-trip-Delay value. In
   order to form a Type-P-Round-trip-Delay value, the return packet must
   be triggered by the reception of a packet from Src.}

   {Comment: "ping" would qualify as a round-trip measure under this
   definition, with a Type-P of ICMP echo request/reply with 60-byte
   packets. However, the uncertainties associated with a typical ping
   program must be analyzed as in the next section, including the type
   of reflecting point (a router may not handle an ICMP request in the
   fast path) and effects of load on the reflecting point.}

3.7. Errors and Uncertainties:

   The description of any specific measurement method should include an
   accounting and analysis of various sources of error or uncertainty.
   The Framework document provides general guidance on this point, but
   we note here the following specifics related to delay metrics:

   + Errors or uncertainties due to uncertainty in the clock of the Src
     host.

   + Errors or uncertainties due to the difference between 'wire time'
     and 'host time'.

   + Errors or uncertainties due to time required by the Dst to receive
     the packet from the Src and send the corresponding response.

   In addition, the loss threshold may affect the results. Each of
   these is discussed in more detail below, along with a section
   ("Calibration") on accounting for these errors and uncertainties.

3.7.1. Errors or Uncertainties Related to Clocks

   The uncertainty in a measurement of round-trip delay is related, in
   part, to uncertainty in the clock of the Src host. In the following,
   we refer to the clock used to measure when the packet was sent from
   Src as the source clock, and we refer to the observed time when the
   packet was sent by the source as Tinitial, and the observed time when
   the packet was received by the source as Tfinal. Alluding to the
   notions of synchronization, accuracy, resolution, and skew mentioned
   in the Introduction, we note the following:

   + While in one-way delay there is an issue of the synchronization of
     the source clock and the destination clock, in round-trip delay
     there is an (easier) issue of self-synchronization, as it were,
     between the source clock at the time the test packet is sent and
     the (same) source clock at the time the response packet is
     received. Theoretically a very severe case of skew could threaten
     this. In practice, the greater threat is anything that would
     cause a discontinuity in the source clock during the time between
     the taking of the initial and final timestamp. This might happen,
     for example, with certain implementations of NTP.

   + The accuracy of a clock is important only in identifying the time
     at which a given delay was measured. Accuracy, per se, has no
     importance to the accuracy of the measurement of delay.

   + The resolution of a clock adds to uncertainty about any time
     measured with it.
     Thus, if the source clock has a resolution of
     10 msec, then this adds 10 msec of uncertainty to any time value
     measured with it. We will denote the resolution of the source
     clock as Rsource.

   Taking these items together, we note that the naive computation
   Tfinal - Tinitial may be off by as much as 2*Rsource, since each of
   the two timestamps carries up to Rsource of uncertainty.

3.7.2. Errors or Uncertainties Related to Wire-time vs Host-time

   As we have defined round-trip delay, we would like to measure the
   time between when the test packet leaves the network interface of Src
   and when the corresponding response packet (completely) arrives at
   the network interface of Src, and we refer to these as "wire times".
   If the timings are themselves performed by software on Src, however,
   then this software can only directly measure the time between when
   Src grabs a timestamp just prior to sending the test packet and when
   it grabs a timestamp just after having received the response packet,
   and we refer to these two points as "host times".

   Another contributor to this problem is time spent at Dst between the
   receipt there of the test packet and the sending of the response
   packet. Ideally, this time is zero; it is explored further in the
   next section.

   To the extent that the difference between wire time and host time is
   accurately known, this knowledge can be used to correct for host time
   measurements and the corrected value more accurately estimates the
   desired (wire time) metric.

   To the extent, however, that the difference between wire time and
   host time is uncertain, this uncertainty must be accounted for in an
   analysis of a given measurement method. We denote by Hinitial an
   upper bound on the uncertainty in the difference between wire time
   and host time on the Src host in sending the test packet, and
   similarly define Hfinal for the difference on the Src host in
   receiving the response packet.
   We then note that these problems
   introduce a total uncertainty of Hinitial + Hfinal. This estimate of
   total wire-vs-host uncertainty should be included in the
   error/uncertainty analysis of any measurement implementation.

3.7.3. Errors or Uncertainties Related to Dst Producing a Response

   Any time spent by the destination host in receiving and recognizing
   the packet from Src, and then producing and sending the corresponding
   response adds additional error and uncertainty to the round-trip
   delay measurement. The error equals the difference between the
   wire-time the first bit of the packet is received by Dst and the
   wire-time the first bit of the response is sent by Dst. To the
   extent that this difference is accurately known, this knowledge can
   be used to correct the desired metric. To the extent, however, that
   this difference is uncertain, this uncertainty must be accounted for
   in the error analysis of a measurement implementation. We denote by
   Hrefl the difference between the two wire-times.

3.7.4. Calibration

   Generally, the measured values can be decomposed as follows:

      measured value = true value + systematic error + random error

   If the systematic error (the constant bias in measured values) can be
   determined, it can be compensated for in the reported results.

      reported value = measured value - systematic error

   therefore

      reported value = true value + random error

   The goal of calibration is to determine the systematic and random
   error generated by the instruments themselves in as much detail as
   possible. At a minimum, a bound ("e") should be found such that the
   reported value is in the range (true value - e) to (true value + e)
   at least 95 percent of the time. We call "e" the calibration error
   for the measurements.
   It represents the degree to which the values
   produced by the measurement instrument are repeatable; that is, how
   closely an actual delay of 30 ms is reported as 30 ms. {Comment: 95
   percent was chosen because (1) some confidence level is desirable to
   be able to remove outliers which will be found in measuring any
   physical property; and (2) a particular confidence level should be
   specified so that the results of independent implementations can be
   compared.}

   From the discussion in the previous three sections, the error in
   measurements could be bounded by determining all the individual
   uncertainties, and adding them together to form

      2*Rsource + Hinitial + Hfinal + Hrefl.

   However, reasonable bounds on both the clock-related uncertainty
   captured by the first term and the host-related uncertainty captured
   by the last three terms should be possible by careful design
   techniques and calibrating the instruments using a known, isolated,
   network in a lab.

   The host-related uncertainties, Hinitial + Hfinal + Hrefl, could be
   bounded by connecting two instruments back-to-back with a high-speed
   serial link or isolated LAN segment. In this case, repeated
   measurements are measuring the same round-trip delay.

   If the test packets are small, such a network connection has a
   minimal delay that may be approximated by zero. The measured delay
   therefore contains only systematic and random error in the
   instrumentation. The "average value" of repeated measurements is the
   systematic error, and the variation is the random error.

   One way to compute the systematic error, and the random error to a
   95% confidence, is to repeat the experiment many times - at least
   hundreds of tests. The systematic error would then be the median.
   The random error could then be found by removing the systematic error
   from the measured values.
   The 95% confidence interval would be the
   range from the 2.5th percentile to the 97.5th percentile of these
   deviations from the true value. The calibration error "e" could then
   be taken to be the largest absolute value of these two numbers, plus
   the clock-related uncertainty. {Comment: as described, this bound is
   relatively loose since the uncertainties are added, and the absolute
   value of the largest deviation is used. As long as the resulting
   value is not a significant fraction of the measured values, it is a
   reasonable bound. If the resulting value is a significant fraction
   of the measured values, then more exact methods will be needed to
   compute the calibration error.}

   Note that random error is a function of measurement load. For
   example, if many paths will be measured by one instrument, this might
   increase interrupts, process scheduling, and disk I/O (for example,
   recording the measurements), all of which may increase the random
   error in measured singletons. Therefore, in addition to minimal load
   measurements to find the systematic error, calibration measurements
   should be performed with the same measurement load that the
   instruments will see in the field.

   We wish to reiterate that this statistical treatment refers to the
   calibration of the instrument; it is used to "calibrate the meter
   stick" and say how well the meter stick reflects reality.

   In addition to calibrating the instruments for finite delay, two
   checks should be made to ensure that packets reported as losses were
   really lost. First, the threshold for loss should be verified. In
   particular, ensure the "reasonable" threshold is reasonable: that it
   is very unlikely a packet will arrive after the threshold value, and
   therefore the number of packets lost over an interval is not
   sensitive to the error bound on measurements.
   Second, consider the
   possibility that a packet arrives at the network interface, but is
   lost due to congestion on that interface or to other resource
   exhaustion (e.g., buffers) in the instrument.

3.8. Reporting the Metric:

   The calibration and context in which the metric is measured MUST be
   carefully considered, and SHOULD always be reported along with metric
   results. We now present four items to consider: the Type-P of test
   packets, the threshold of infinite delay (if any), error calibration,
   and the path traversed by the test packets. This list is not
   exhaustive; any additional information that could be useful in
   interpreting applications of the metrics should also be reported.

3.8.1. Type-P

   As noted in the Framework document [1], the value of the metric may
   depend on the type of IP packets used to make the measurement, or
   "type-P". The value of Type-P-Round-trip-Delay could change if the
   protocol (UDP or TCP), port number, size, or arrangement for special
   treatment (e.g., IP precedence or RSVP) changes. The exact Type-P
   used to make the measurements MUST be accurately reported.

3.8.2. Loss threshold

   In addition, the threshold (or methodology to distinguish) between a
   large finite delay and loss MUST be reported.

3.8.3. Calibration Results

   + If the systematic error can be determined, it SHOULD be removed
     from the measured values.

   + You SHOULD also report the calibration error, e, such that the
     true value is the reported value plus or minus e, with 95%
     confidence (see Section 3.7.4).

   + If possible, the conditions under which a test packet with finite
     delay is reported as lost due to resource exhaustion on the
     measurement instrument SHOULD be reported.

3.8.4. Path

   Finally, the path traversed by the packet SHOULD be reported, if
   possible.
   In general it is impractical to know the precise path a
   given packet takes through the network. The precise path may be
   known for certain Type-P on short or stable paths. For example, if
   Type-P includes the record route (or loose-source route) option in
   the IP header, and the path is short enough, and all routers* on the
   path support record (or loose-source) route, and the Dst host copies
   the path from Src to Dst into the corresponding reply packet, then
   the path will be precisely recorded. This is impractical because the
   route must be short enough, many routers do not support (or are not
   configured for) record route, and use of this feature would often
   artificially worsen the performance observed by removing the packet
   from common-case processing. However, partial information is still
   valuable context. For example, if a host can choose between two
   links* (and hence two separate routes from Src to Dst), then the
   initial link used is valuable context. {Comment: For example, with
   Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP
   by any of several different backbone networks.}

4. A Definition for Samples of Round-trip Delay

   Given the singleton metric Type-P-Round-trip-Delay, we now define one
   particular sample of such singletons. The idea of the sample is to
   select a particular binding of the parameters Src, Dst, and Type-P,
   then define a sample of values of parameter T. The means for
   defining the values of T is to select a beginning time T0, a final
   time Tf, and an average rate lambda, then define a pseudo-random
   Poisson process of rate lambda, whose values fall between T0 and Tf.

   The time interval between successive values of T will then average
   1/lambda.

   {Comment: Note that Poisson sampling is only one way of defining a
   sample.
   Poisson has the advantage of limiting bias, but other methods of
   sampling might be appropriate for different situations.  We
   encourage others who find such appropriate cases to use this general
   framework and submit their sampling method for standardization.}

4.1. Metric Name:

   Type-P-Round-trip-Delay-Poisson-Stream

4.2. Metric Parameters:

   +  Src, the IP address of a host

   +  Dst, the IP address of a host

   +  T0, a time

   +  Tf, a time

   +  lambda, a rate in reciprocal seconds

4.3. Metric Units:

   A sequence of pairs; the elements of each pair are:

   +  T, a time, and

   +  dT, either a non-negative real number or an undefined number of
      seconds.

   The values of T in the sequence are monotonically increasing.  Note
   that T would be a valid parameter to Type-P-Round-trip-Delay, and
   that dT would be a valid value of Type-P-Round-trip-Delay.

4.4. Definition:

   Given T0, Tf, and lambda, we compute a pseudo-random Poisson process
   beginning at or before T0, with average arrival rate lambda, and
   ending at or after Tf.  Those time values greater than or equal to
   T0 and less than or equal to Tf are then selected.  At each of the
   selected times, we obtain the value of Type-P-Round-trip-Delay at
   that time.  The value of the sample is the sequence made up of the
   resulting pairs.  If there are no such pairs, the sequence is of
   length zero and the sample is said to be empty.

4.5. Discussion:

   The reader should be familiar with the in-depth discussion of
   Poisson sampling in the Framework document [1], which includes
   methods to compute and verify the pseudo-random Poisson process.

   We specifically do not constrain the value of lambda, except to note
   the extremes.  If the rate is too large, then the measurement
   traffic will perturb the network, and itself cause congestion.  If
   the rate is too small, then you might not capture interesting
   network behavior.
   {Comment: We expect to document our experiences with, and
   suggestions for, lambda elsewhere, culminating in a "best current
   practices" document.}

   Since a pseudo-random number sequence is employed, the sequence of
   times, and hence the value of the sample, is not fully specified.
   Pseudo-random number generators of good quality will be needed to
   achieve the desired qualities.

   The sample is defined in terms of a Poisson process both to avoid
   the effects of self-synchronization and to capture a sample that is
   statistically as unbiased as possible.  {Comment: there is, of
   course, no claim that real Internet traffic arrives according to a
   Poisson arrival process.}  The Poisson process is used to schedule
   the delay measurements.  The test packets will generally not arrive
   at Dst according to a Poisson distribution, nor will response
   packets arrive at Src according to a Poisson distribution, since
   they are influenced by the network.

   All the singleton Type-P-Round-trip-Delay metrics in the sequence
   will have the same values of Src, Dst, and Type-P.

   Note also that, given one sample that runs from T0 to Tf, and given
   new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the
   subsequence of the given sample whose time values fall between T0'
   and Tf' is also a valid Type-P-Round-trip-Delay-Poisson-Stream
   sample.

4.6. Methodologies:

   The methodologies follow directly from:

   +  the selection of specific times, using the specified Poisson
      arrival process, and

   +  the methodologies discussion already given for the singleton
      Type-P-Round-trip-Delay metric.
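
   As an informal illustration of the first item above, the following
   Python sketch generates such a schedule by accumulating
   exponentially distributed gaps of mean 1/lambda and keeping the
   times that fall in [T0, Tf].  The function name poisson_schedule,
   its seed argument, and the choice of Python's standard random
   module are assumptions made for this example only; they are not
   part of the metric definition, and the Framework document [1]
   remains the reference for computing and verifying the process.

```python
import random


def poisson_schedule(t0, tf, lam, seed=None):
    """Return increasing send times T with t0 <= T <= tf.

    Hypothetical helper: t0 and tf are in seconds, lam is the average
    rate in reciprocal seconds.
    """
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if tf < t0:
        raise ValueError("Tf must not precede T0")

    rng = random.Random(seed)  # seeded so a schedule can be reproduced
    times = []
    t = t0  # assumption: the process begins exactly at T0
    while True:
        # Gaps of a Poisson process are exponential with mean 1/lam.
        t += rng.expovariate(lam)
        if t > tf:
            break
        times.append(t)
    return times


if __name__ == "__main__":
    # One hour of measurements at an average of one every 10 seconds.
    schedule = poisson_schedule(0.0, 3600.0, 0.1, seed=42)
    print(len(schedule), "scheduled send times")
```

   Starting the process exactly at T0 is one permitted choice, since
   the definition allows it to begin at or before T0.  An empty list
   corresponds to an empty sample.  A real instrument would likely
   need a carefully vetted generator, and would still have to account
   for the difference between scheduled times and wire-times.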

   Care must, of course, be taken to handle correctly the out-of-order
   arrival of test or response packets; it is possible that the Src
   could send one test packet at TS[i], then send a second test packet
   (later) at TS[i+1], and it could receive the second response packet
   at TR[i+1], and then receive the first response packet (later) at
   TR[i].

4.7. Errors and Uncertainties:

   In addition to sources of errors and uncertainties associated with
   methods employed to measure the singleton values that make up the
   sample, care must be taken to analyze the accuracy of the Poisson
   process with respect to the wire-times of the sending of the test
   packets.  Problems with this process could be caused by several
   things, including problems with the pseudo-random number techniques
   used to generate the Poisson arrival process, or with jitter in the
   value of Hinitial (mentioned above as uncertainty in the singleton
   delay metric).  The Framework document shows how to use the
   Anderson-Darling test to verify the accuracy of a Poisson process
   over small time frames.  {Comment: The goal is to ensure that test
   packets are sent "close enough" to a Poisson schedule, and avoid
   periodic behavior.}

4.8. Reporting the Metric:

   You MUST report the calibration and context for the underlying
   singletons along with the stream.  (See Section 3.8, "Reporting the
   Metric", for Type-P-Round-trip-Delay.)

5. Some Statistics Definitions for Round-trip Delay

   Given the sample metric Type-P-Round-trip-Delay-Poisson-Stream, we
   now offer several statistics of that sample.  These statistics are
   offered mostly to be illustrative of what could be done.

5.1. Type-P-Round-trip-Delay-Percentile

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a percent X
   between 0% and 100%, the Xth percentile of all the dT values in the
   Stream.  In computing this percentile, undefined values are treated
   as infinitely large.
   Note that this means that the percentile could be undefined
   (informally, infinite).  In addition, the
   Type-P-Round-trip-Delay-Percentile is undefined if the sample is
   empty.

   Example: suppose we take a sample and the results are:

      Stream1 = <
         ...
      >

   Then the 50th percentile would be 110 msec, since 90 msec and 100
   msec are smaller and 110 msec and 'undefined' are larger.

   Note that if the possibility that a packet with finite delay is
   reported as lost is significant, then a high percentile (90th or
   95th) might be reported as infinite instead of finite.

5.2. Type-P-Round-trip-Delay-Median

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the median of all
   the dT values in the Stream.  In computing the median, undefined
   values are treated as infinitely large.  As with
   Type-P-Round-trip-Delay-Percentile, Type-P-Round-trip-Delay-Median
   is undefined if the sample is empty.

   As noted in the Framework document, the median differs from the 50th
   percentile only when the sample contains an even number of values,
   in which case the mean of the two central values is used.

   Example: suppose we take a sample and the results are:

      Stream2 = <
         ...
      >

   Then the median would be 105 msec, the mean of 100 msec and 110
   msec, the two central values.

5.3. Type-P-Round-trip-Delay-Minimum

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the minimum of all
   the dT values in the Stream.  In computing this, undefined values
   are treated as infinitely large.  Note that the minimum could thus
   be undefined (informally, infinite) if all the dT values are
   undefined.  In addition, the Type-P-Round-trip-Delay-Minimum is
   undefined if the sample is empty.

   In the above example, the minimum would be 90 msec.

5.4.
Type-P-Round-trip-Delay-Inverse-Percentile

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a non-negative
   time duration threshold, the fraction of all the dT values in the
   Stream less than or equal to the threshold.  The result could be as
   low as 0% (if all the dT values exceed the threshold) or as high as
   100%.  Type-P-Round-trip-Delay-Inverse-Percentile is undefined if
   the sample is empty.

   In the above example, the Inverse-Percentile of 103 msec would be
   50%.

6. Security Considerations

   Conducting Internet measurements raises both security and privacy
   concerns.  This memo does not specify an implementation of the
   metrics, so it does not directly affect the security of the Internet
   or of applications that run on the Internet.  However,
   implementations of these metrics must be mindful of security and
   privacy concerns.

   There are two types of security concerns: potential harm caused by
   the measurements, and potential harm to the measurements.  The
   measurements could cause harm because they are active, and inject
   packets into the network.  The measurement parameters MUST be
   carefully selected so that the measurements inject trivial amounts
   of additional traffic into the networks they measure.  If they
   inject "too much" traffic, they can skew the results of the
   measurement, and in extreme cases cause congestion and denial of
   service.

   The measurements themselves could be harmed by routers giving
   measurement traffic a different priority than "normal" traffic, or
   by an attacker injecting artificial measurement traffic.  If routers
   can recognize measurement traffic and treat it separately, the
   measurements will not reflect actual user traffic.  If an attacker
   injects artificial traffic that is accepted as legitimate, the loss
   rate will be artificially lowered.
   Therefore, the measurement methodologies SHOULD include appropriate
   techniques to reduce the probability that measurement traffic can be
   distinguished from "normal" traffic.  Authentication techniques,
   such as digital signatures, may be used where appropriate to guard
   against injected traffic attacks.

   The privacy concerns of network measurement are limited because the
   measurements described in this memo are active.  Unlike passive
   measurements, they cannot release existing user data.

7. Acknowledgements

   Special thanks are due to Vern Paxson and to Will Leland for several
   useful suggestions.

8. References

   [1]  V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for
        IP Performance Metrics", RFC 2330, May 1998.

   [2]  G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay
        Metric for IPPM", Internet-Draft, November 1998.

   [3]  D. Mills, "Network Time Protocol (v3)", RFC 1305, April 1992.

   [4]  J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
        Connectivity", Internet-Draft, October 1998.

   [5]  J. Postel, "Internet Protocol", RFC 791, September 1981.

   [6]  S. Bradner, "Key words for use in RFCs to Indicate Requirement
        Levels", RFC 2119, March 1997.

9. Authors' Addresses

   Guy Almes
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1120
   EMail: almes@advanced.org

   Sunil Kalidindi
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1128
   EMail: kalidindi@advanced.org

   Matthew J. Zekauskas
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1112
   EMail: matt@advanced.org

Expiration date: May 1999