idnits 2.17.1

draft-ietf-ippm-rt-delay-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 20 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Abstract section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 1999) is 9112 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Downref: Normative reference to an Informational RFC: RFC 2330 (ref. '1')

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Obsolete normative reference: RFC 1305 (ref. '3') (Obsoleted by RFC 5905)

  ** Obsolete normative reference: RFC 2498 (ref. '4') (Obsoleted by RFC 2678)

     Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                           G. Almes
Internet Draft                                              S. Kalidindi
Expiration Date: October 1999                               M. Zekauskas
                                             Advanced Network & Services
                                                                May 1999

                   A Round-trip Delay Metric for IPPM

1. Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft shadow directories can be accessed at
   http://www.ietf.org/shadow.html.

2. Introduction

   This memo defines a metric for round-trip delay of packets across
   Internet paths. It builds on notions introduced and discussed in the
   IPPM Framework document, RFC 2330 [1], and follows closely the
   corresponding metric for One-way Delay ("A One-way Delay Metric for
   IPPM") [2]; the reader is assumed to be familiar with those
   documents.

   The memo was largely written by copying material from the One-way
   Delay metric. The intention is that, where the two metrics are
   similar, they will be described with similar or identical text, and
   that where the two metrics differ, new or modified text will be used.

   This memo is intended to be parallel in structure to a future
   companion document for Packet Loss.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [6].
   Although RFC 2119 was written with protocols in mind, the key words
   are used in this document for similar reasons. They are used to
   ensure the results of measurements from two different implementations
   are comparable, and to note instances when an implementation could
   perturb the network.

   The structure of the memo is as follows:

   + A 'singleton' analytic metric, called Type-P-Round-trip-Delay,
     will be introduced to measure a single observation of round-trip
     delay.

   + Using this singleton metric, a 'sample', called
     Type-P-Round-trip-Delay-Poisson-Stream, will be introduced to
     measure a sequence of singleton delays measured at times taken
     from a Poisson process.

   + Using this sample, several 'statistics' of the sample will be
     defined and discussed.

   This progression from singleton to sample to statistics, with clear
   separation among them, is important.

   Whenever a technical term from the IPPM Framework document is first
   used in this memo, it will be tagged with a trailing asterisk. For
   example, "term*" indicates that "term" is defined in the Framework.

2.1. Motivation

   Round-trip delay of a Type-P* packet from a source host* to a
   destination host is useful for several reasons:

   + Some applications do not perform well (or at all) if end-to-end
     delay between hosts is large relative to some threshold value.

   + Erratic variation in delay makes it difficult (or impossible) to
     support many interactive real-time applications.

   + The larger the value of delay, the more difficult it is for
     transport-layer protocols to sustain high bandwidths.

   + The minimum value of this metric provides an indication of the
     delay due only to propagation and transmission delay.

   + The minimum value of this metric provides an indication of the
     delay that will likely be experienced when the path* traversed is
     lightly loaded.

   + Values of this metric above the minimum provide an indication of
     the congestion present in the path.

   The measurement of round-trip delay instead of one-way delay has
   several weaknesses, summarized here:

   + The Internet path from a source to a destination may differ from
     the path from the destination back to the source ("asymmetric
     paths"), such that different sequences of routers are used for the
     forward and reverse paths. Therefore round-trip measurements
     actually measure the performance of two distinct paths together.

   + Even when the two paths are symmetric, they may have radically
     different performance characteristics due to asymmetric queueing.

   + Performance of an application may depend mostly on the performance
     in one direction.

   + In quality-of-service (QoS) enabled networks, provisioning in one
     direction may be radically different than provisioning in the
     reverse direction, and thus the QoS guarantees differ.

   On the other hand, the measurement of round-trip delay has two
   specific advantages:

   + Ease of deployment: unlike in one-way measurement, it is often
     possible to perform some form of round-trip delay measurement
     without installing measurement-specific software at the intended
     destination. A variety of approaches are well-known, including
     use of ICMP Echo or of TCP-based methodologies (similar to those
     outlined in "IPPM Metrics for Measuring Connectivity" [4]).
     However, some approaches may introduce greater uncertainty in the
     time for the destination to produce a response (see
     Section 3.7.3).

   + Ease of interpretation: in some circumstances, the round-trip time
     is in fact the quantity of interest. Deducing the round-trip time
     from matching one-way measurements and an assumption of the
     destination processing time is less direct and potentially less
     accurate.

2.2. General Issues Regarding Time

   Whenever a time (i.e., a moment in history) is mentioned here, it is
   understood to be measured in seconds (and fractions) relative to UTC.

   As described more fully in the Framework document, there are four
   distinct, but related notions of clock uncertainty:

   synchronization*

      measures the extent to which two clocks agree on what time it
      is. For example, the clock on one host might be 5.4 msec ahead
      of the clock on a second host.

   accuracy*

      measures the extent to which a given clock agrees with UTC. For
      example, the clock on a host might be 27.1 msec behind UTC.

   resolution*

      measures the precision of a given clock. For example, the clock
      on an old Unix host might tick only once every 10 msec, and thus
      have a resolution of only 10 msec.

   skew*

      measures the change of accuracy, or of synchronization, with
      time.
      For example, the clock on a given host might gain 1.3
      msec per hour and thus be 27.1 msec behind UTC at one time and
      only 25.8 msec an hour later. In this case, we say that the
      clock of the given host has a skew of 1.3 msec per hour relative
      to UTC, which threatens accuracy. We might also speak of the
      skew of one clock relative to another clock, which threatens
      synchronization.

3. A Singleton Definition for Round-trip Delay

3.1. Metric Name:

   Type-P-Round-trip-Delay

3.2. Metric Parameters:

   + Src, the IP address of a host

   + Dst, the IP address of a host

   + T, a time

3.3. Metric Units:

   The value of a Type-P-Round-trip-Delay is either a real number, or an
   undefined (informally, infinite) number of seconds.

3.4. Definition:

   For a real number dT, >>the *Type-P-Round-trip-Delay* from Src to Dst
   at T is dT<< means that Src sent the first bit of a Type-P packet to
   Dst at wire-time* T, that Dst received that packet, then immediately
   sent a Type-P packet back to Src, and that Src received the last bit
   of that packet at wire-time T+dT.

   >>The *Type-P-Round-trip-Delay* from Src to Dst at T is undefined
   (informally, infinite)<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time T and that (either Dst did not
   receive the packet, Dst did not send a Type-P packet in response, or)
   Src did not receive that response packet.

   >>The *Type-P-Round-trip-Delay* between Src and Dst at T<< means
   either the *Type-P-Round-trip-Delay* from Src to Dst at T or the
   *Type-P-Round-trip-Delay* from Dst to Src at T. When this notion is
   used, it is understood to be specifically ambiguous which host acts
   as Src and which as Dst.
   {This ambiguity will usually be a small
   price to pay for being able to have one measurement, launched from
   either Src or Dst, rather than having two measurements.}

   Suggestions for what to report along with metric values appear in
   Section 3.8 after a discussion of the metric, methodologies for
   measuring the metric, and error analysis.

3.5. Discussion:

   Type-P-Round-trip-Delay is a relatively simple analytic metric, and
   one that we believe will afford effective methods of measurement.

   The following issues are likely to come up in practice:

   + The timestamp values (T) for the time at which delays are measured
     should be fairly accurate in order to draw meaningful conclusions
     about the state of the network at a given T. Therefore, Src
     should have an accurate knowledge of time-of-day. NTP [3] affords
     one way to achieve time accuracy to within several milliseconds.
     Depending on the NTP server, higher accuracy may be achieved, for
     example when NTP servers make use of GPS systems as a time source.
     Note that NTP will adjust the instrument's clock. If an
     adjustment is made between the time the initial timestamp is taken
     and the time the final timestamp is taken the adjustment will
     affect the uncertainty in the measured delay. This uncertainty
     must be accounted for in the instrument's calibration.

   + A given methodology will have to include a way to determine
     whether a delay value is infinite or whether it is merely very
     large (and the packet is yet to arrive at Dst). As noted by
     Mahdavi and Paxson [4], simple upper bounds (such as the 255
     seconds theoretical upper bound on the lifetimes of IP
     packets [5]) could be used, but good engineering, including an
     understanding of packet lifetimes, will be needed in practice.
     {Comment: Note that, for many applications of these metrics, the
     harm in treating a large delay as infinite might be zero or very
     small.
     A TCP data packet, for example, that arrives only after
     several multiples of the RTT may as well have been lost.}

   + If the packet is duplicated so that multiple non-corrupt instances
     of the response arrive back at the source, then the packet is
     counted as received, and the first instance to arrive back at the
     source determines the packet's round-trip delay.

   + If the packet is fragmented and if, for whatever reason,
     reassembly does not occur, then the packet will be deemed lost.

3.6. Methodologies:

   As with other Type-P-* metrics, the detailed methodology will depend
   on the Type-P (e.g., protocol number, UDP/TCP port number, size,
   precedence).

   Generally, for a given Type-P, the methodology would proceed as
   follows:

   + At the Src host, select Src and Dst IP addresses, and form a test
     packet of Type-P with these addresses. Any 'padding' portion of
     the packet needed only to make the test packet a given size should
     be filled with randomized bits to avoid a situation in which the
     measured delay is lower than it would otherwise be due to
     compression techniques along the path. The test packet must have
     some identifying information so that the response to it can be
     identified by Src when Src receives the response; one means to do
     this is by placing the timestamp generated just before sending the
     test packet in the packet itself.

   + At the Dst host, arrange to receive and respond to the test
     packet. At the Src host, arrange to receive the corresponding
     response packet.

   + At the Src host, take the initial timestamp and then send the
     prepared Type-P packet towards Dst. Note that the timestamp could
     be placed inside the packet, or kept separately as long as the
     packet contains a suitable identifier so the received timestamp
     can be compared with the send timestamp.

   + If the packet arrives at Dst, send a corresponding response packet
     back from Dst to Src as soon as possible.

   + If the response packet arrives within a reasonable period of time,
     take the final timestamp as soon as possible upon the receipt of
     the packet. By subtracting the two timestamps, an estimate of
     round-trip delay can be computed. If the delay between the
     initial timestamp and the actual sending of the packet is known,
     then the estimate could be adjusted by subtracting this amount;
     uncertainty in this value must be taken into account in error
     analysis. Similarly, if the delay between the actual receipt of
     the response packet and final timestamp is known, then the
     estimate could be adjusted by subtracting this amount; uncertainty
     in this value must be taken into account in error analysis. See
     the next section, "Errors and Uncertainties", for a more detailed
     discussion.

   + If the packet fails to arrive within a reasonable period of time,
     the round-trip delay is taken to be undefined (informally,
     infinite). Note that the threshold of 'reasonable' is a parameter
     of the methodology.

   Issues such as the packet format and the means by which Dst knows
   when to expect the test packet are outside the scope of this
   document.

   {Comment: Note that you cannot in general add two Type-P-One-way-
   Delay values (see [2]) to form a Type-P-Round-trip-Delay value. In
   order to form a Type-P-Round-trip-Delay value, the return packet must
   be triggered by the reception of a packet from Src.}

   {Comment: "ping" would qualify as a round-trip measure under this
   definition, with a Type-P of ICMP echo request/reply with 60-byte
   packets.
   However, the uncertainties associated with a typical ping
   program must be analyzed as in the next section, including the type
   of reflecting point (a router may not handle an ICMP request in the
   fast path) and effects of load on the reflecting point.}

3.7. Errors and Uncertainties:

   The description of any specific measurement method should include an
   accounting and analysis of various sources of error or uncertainty.
   The Framework document provides general guidance on this point, but
   we note here the following specifics related to delay metrics:

   + Errors or uncertainties due to uncertainty in the clock of the Src
     host.

   + Errors or uncertainties due to the difference between 'wire time'
     and 'host time'.

   + Errors or uncertainties due to time required by the Dst to receive
     the packet from the Src and send the corresponding response.

   In addition, the loss threshold may affect the results. Each of
   these is discussed in more detail below, along with a section
   ("Calibration") on accounting for these errors and uncertainties.

3.7.1. Errors or Uncertainties Related to Clocks

   The uncertainty in a measurement of round-trip delay is related, in
   part, to uncertainty in the clock of the Src host. In the following,
   we refer to the clock used to measure when the packet was sent from
   Src as the source clock, and we refer to the observed time when the
   packet was sent by the source as Tinitial, and the observed time when
   the packet was received by the source as Tfinal.
   Alluding to the
   notions of synchronization, accuracy, resolution, and skew mentioned
   in the Introduction, we note the following:

   + While in one-way delay there is an issue of the synchronization of
     the source clock and the destination clock, in round-trip delay
     there is an (easier) issue of self-synchronization, as it were,
     between the source clock at the time the test packet is sent and
     the (same) source clock at the time the response packet is
     received. Theoretically a very severe case of skew could threaten
     this. In practice, the greater threat is anything that would
     cause a discontinuity in the source clock during the time between
     the taking of the initial and final timestamp. This might happen,
     for example, with certain implementations of NTP.

   + The accuracy of a clock is important only in identifying the time
     at which a given delay was measured. Accuracy, per se, has no
     importance to the accuracy of the measurement of delay.

   + The resolution of a clock adds to uncertainty about any time
     measured with it. Thus, if the source clock has a resolution of
     10 msec, then this adds 10 msec of uncertainty to any time value
     measured with it. We will denote the resolution of the source
     clock as Rsource.

   Taking these items together, we note that the naive computation
   Tfinal-Tinitial may be off by as much as 2*Rsource, since each of
   the two timestamps carries an uncertainty of Rsource.

3.7.2. Errors or Uncertainties Related to Wire-time vs Host-time

   As we have defined round-trip delay, we would like to measure the
   time between when the test packet leaves the network interface of Src
   and when the corresponding response packet (completely) arrives at
   the network interface of Src, and we refer to these as "wire times".
   If the timings are themselves performed by software on Src, however,
   then this software can only directly measure the time between when
   Src grabs a timestamp just prior to sending the test packet and when
   it grabs a timestamp just after having received the response packet,
   and we refer to these two points as "host times".

   Another contributor to this problem is time spent at Dst between the
   receipt there of the test packet and the sending of the response
   packet. Ideally, this time is zero; it is explored further in the
   next section.

   To the extent that the difference between wire time and host time is
   accurately known, this knowledge can be used to correct for host time
   measurements and the corrected value more accurately estimates the
   desired (wire time) metric.

   To the extent, however, that the difference between wire time and
   host time is uncertain, this uncertainty must be accounted for in an
   analysis of a given measurement method. We denote by Hinitial an
   upper bound on the uncertainty in the difference between wire time
   and host time on the Src host in sending the test packet, and
   similarly define Hfinal for the difference on the Src host in
   receiving the response packet. We then note that these problems
   introduce a total uncertainty of Hinitial + Hfinal. This estimate of
   total wire-vs-host uncertainty should be included in the
   error/uncertainty analysis of any measurement implementation.

3.7.3. Errors or Uncertainties Related to Dst Producing a Response

   Any time spent by the destination host in receiving and recognizing
   the packet from Src, and then producing and sending the corresponding
   response adds additional error and uncertainty to the round-trip
   delay measurement. The error equals the difference between the
   wire-time the first bit of the packet is received by Dst and the
   wire-time the first bit of the response is sent by Dst.
   To the extent that
   this difference is accurately known, this knowledge can be used to
   correct the desired metric. To the extent, however, that this
   difference is uncertain, this uncertainty must be accounted for in
   the error analysis of a measurement implementation. We denote this
   uncertainty by Hrefl. This estimate of uncertainty should be
   included in the error/uncertainty analysis of any measurement
   implementation.

3.7.4. Calibration

   Generally, the measured values can be decomposed as follows:

      measured value = true value + systematic error + random error

   If the systematic error (the constant bias in measured values) can be
   determined, it can be compensated for in the reported results.

      reported value = measured value - systematic error

   therefore

      reported value = true value + random error

   The goal of calibration is to determine the systematic and random
   error generated by the instruments themselves in as much detail as
   possible. At a minimum, a bound ("e") should be found such that the
   reported value is in the range (true value - e) to (true value + e)
   at least 95 percent of the time. We call "e" the calibration error
   for the measurements. It represents the degree to which the values
   produced by the measurement instrument are repeatable; that is, how
   closely an actual delay of 30 ms is reported as 30 ms. {Comment: 95
   percent was chosen because (1) some confidence level is desirable to
   be able to remove outliers which will be found in measuring any
   physical property; and (2) a particular confidence level should be
   specified so that the results of independent implementations can be
   compared.}

   From the discussion in the previous three sections, the error in
   measurements could be bounded by determining all the individual
   uncertainties, and adding them together to form
      2*Rsource + Hinitial + Hfinal + Hrefl.
   However, reasonable bounds on both the clock-related uncertainty
   captured by the first term and the host-related uncertainty captured
   by the last three terms should be possible by careful design
   techniques and calibrating the instruments using a known, isolated,
   network in a lab.

   The host-related uncertainties, Hinitial + Hfinal + Hrefl, could be
   bounded by connecting two instruments back-to-back with a high-speed
   serial link or isolated LAN segment. In this case, repeated
   measurements are measuring the same round-trip delay.

   If the test packets are small, such a network connection has a
   minimal delay that may be approximated by zero. The measured delay
   therefore contains only systematic and random error in the
   instrumentation. The "average value" of repeated measurements is the
   systematic error, and the variation is the random error.

   One way to compute the systematic error, and the random error to a
   95% confidence is to repeat the experiment many times - at least
   hundreds of tests. The systematic error would then be the median.
   The random error could then be found by removing the systematic error
   from the measured values. The 95% confidence interval would be the
   range from the 2.5th percentile to the 97.5th percentile of these
   deviations from the true value. The calibration error "e" could then
   be taken to be the largest absolute value of these two numbers, plus
   the clock-related uncertainty. {Comment: as described, this bound is
   relatively loose since the uncertainties are added, and the absolute
   value of the largest deviation is used. As long as the resulting
   value is not a significant fraction of the measured values, it is a
   reasonable bound. If the resulting value is a significant fraction
   of the measured values, then more exact methods will be needed to
   compute the calibration error.}

   Note that random error is a function of measurement load.
   For
   example, if many paths will be measured by one instrument, this might
   increase interrupts, process scheduling, and disk I/O (for example,
   recording the measurements), all of which may increase the random
   error in measured singletons. Therefore, in addition to minimal load
   measurements to find the systematic error, calibration measurements
   should be performed with the same measurement load that the
   instruments will see in the field.

   We wish to reiterate that this statistical treatment refers to the
   calibration of the instrument; it is used to "calibrate the meter
   stick" and say how well the meter stick reflects reality.

   In addition to calibrating the instruments for finite delay, two
   checks should be made to ensure that packets reported as losses were
   really lost. First, the threshold for loss should be verified. In
   particular, ensure the "reasonable" threshold is reasonable: that it
   is very unlikely a packet will arrive after the threshold value, and
   therefore the number of packets lost over an interval is not
   sensitive to the error bound on measurements. Second, consider the
   possibility that a packet arrives at the network interface, but is
   lost due to congestion on that interface or to other resource
   exhaustion (e.g., buffers) in the instrument.

3.8. Reporting the Metric:

   The calibration and context in which the metric is measured MUST be
   carefully considered, and SHOULD always be reported along with metric
   results. We now present four items to consider: the Type-P of test
   packets, the threshold of infinite delay (if any), error calibration,
   and the path traversed by the test packets. This list is not
   exhaustive; any additional information that could be useful in
   interpreting applications of the metrics should also be reported.

3.8.1. Type-P

   As noted in the Framework document [1], the value of the metric may
   depend on the type of IP packets used to make the measurement, or
   "type-P". The value of Type-P-Round-trip-Delay could change if the
   protocol (UDP or TCP), port number, size, or arrangement for special
   treatment (e.g., IP precedence or RSVP) changes. The exact Type-P
   used to make the measurements MUST be accurately reported.

3.8.2. Loss threshold

   In addition, the threshold (or methodology to distinguish) between a
   large finite delay and loss MUST be reported.

3.8.3. Calibration Results

   + If the systematic error can be determined, it SHOULD be removed
     from the measured values.

   + You SHOULD also report the calibration error, e, such that the
     true value is the reported value plus or minus e, with 95%
     confidence (see Section 3.7.4).

   + If possible, the conditions under which a test packet with finite
     delay is reported as lost due to resource exhaustion on the
     measurement instrument SHOULD be reported.

3.8.4. Path

   Finally, the path traversed by the packet SHOULD be reported, if
   possible. In general it is impractical to know the precise path a
   given packet takes through the network. The precise path may be
   known for certain Type-P on short or stable paths. For example, if
   Type-P includes the record route (or loose-source route) option in
   the IP header, and the path is short enough, and all routers* on the
   path support record (or loose-source) route, and the Dst host copies
   the path from Src to Dst into the corresponding reply packet, then
   the path will be precisely recorded. This is impractical because the
   route must be short enough, many routers do not support (or are not
   configured for) record route, and use of this feature would often
   artificially worsen the performance observed by removing the packet
   from common-case processing.
However, partial information is still 574 valuable context. For example, if a host can choose between two 575 links* (and hence two separate routes from Src to Dst), then the 576 initial link used is valuable context. {Comment: For example, with 577 Merit's NetNow setup, a Src on one NAP can reach a Dst on another NAP 578 by either of several different backbone networks.} 580 4. A Definition for Samples of Round-trip Delay 582 Given the singleton metric Type-P-Round-trip-Delay, we now define one 583 particular sample of such singletons. The idea of the sample is to 584 select a particular binding of the parameters Src, Dst, and Type-P, 585 then define a sample of values of parameter T. The means for 586 defining the values of T is to select a beginning time T0, a final 587 time Tf, and an average rate lambda, then define a pseudo-random 588 Poisson process of rate lambda, whose values fall between T0 and Tf. 589 The time interval between successive values of T will then average 590 1/lambda. 592 {Comment: Note that Poisson sampling is only one way of defining a 593 sample. Poisson has the advantage of limiting bias, but other 594 methods of sampling might be appropriate for different situations. 595 We encourage others who find such appropriate cases to use this 596 general framework and submit their sampling method for 597 standardization.} 599 4.1. Metric Name: 601 Type-P-Round-trip-Delay-Poisson-Stream 603 4.2. Metric Parameters: 605 + Src, the IP address of a host 607 + Dst, the IP address of a host 609 + T0, a time 611 + Tf, a time 613 + lambda, a rate in reciprocal seconds 615 4.3. Metric Units: 617 A sequence of pairs; the elements of each pair are: 619 + T, a time, and 621 + dT, either a real number or an undefined number of seconds. 623 The values of T in the sequence are monotonic increasing. Note that 624 T would be a valid parameter to Type-P-Round-trip-Delay, and that dT 625 would be a valid value of Type-P-Round-trip-Delay. 627 4.4. 
Definition:

   Given T0, Tf, and lambda, we compute a pseudo-random Poisson process
   beginning at or before T0, with average arrival rate lambda, and
   ending at or after Tf. Those time values greater than or equal to T0
   and less than or equal to Tf are then selected. At each of the times
   in this process, we obtain the value of Type-P-Round-trip-Delay at
   this time. The value of the sample is the sequence made up of the
   resulting pairs. If there are no such pairs, the sequence is of
   length zero and the sample is said to be empty.

4.5. Discussion:

   The reader should be familiar with the in-depth discussion of Poisson
   sampling in the Framework document [1], which includes methods to
   compute and verify the pseudo-random Poisson process.

   We specifically do not constrain the value of lambda, except to note
   the extremes. If the rate is too large, then the measurement traffic
   will perturb the network, and itself cause congestion. If the rate
   is too small, then you might not capture interesting network
   behavior. {Comment: We expect to document our experiences with, and
   suggestions for, lambda elsewhere, culminating in a "best current
   practices" document.}

   Since a pseudo-random number sequence is employed, the sequence of
   times, and hence the value of the sample, is not fully specified.
   Pseudo-random number generators of good quality will be needed to
   achieve the desired qualities.

   The sample is defined in terms of a Poisson process both to avoid the
   effects of self-synchronization and to capture a sample that is
   statistically as unbiased as possible. {Comment: there is, of
   course, no claim that real Internet traffic arrives according to a
   Poisson arrival process.} The Poisson process is used to schedule
   the delay measurements.
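
   A minimal sketch of such a schedule, in Python, is given here for
   illustration only; it is not part of the metric definition. The
   function name poisson_schedule and its seed parameter are assumptions
   made for this sketch, and Python's random module merely stands in for
   whatever pseudo-random number generator an implementation would
   choose. The sketch starts the process at T0 and draws exponentially
   distributed gaps with mean 1/lambda, which is one way to obtain a
   Poisson process covering the interval from T0 to Tf.

```python
import random


def poisson_schedule(t0, tf, lam, seed=None):
    """Return pseudo-random Poisson send times T with t0 <= T <= tf.

    t0, tf -- start and end of the sampling interval, in seconds
    lam    -- average rate lambda, in reciprocal seconds
    seed   -- optional seed (hypothetical parameter, for repeatability)
    """
    if lam <= 0:
        raise ValueError("lambda must be positive")
    rng = random.Random(seed)
    times = []
    t = t0
    while True:
        # Gaps between successive events of a Poisson process are
        # exponentially distributed with mean 1/lambda.
        t += rng.expovariate(lam)
        if t > tf:
            break
        times.append(t)
    return times  # empty list corresponds to an empty sample


if __name__ == "__main__":
    # Example: one hour of measurements at an average of one per 10 s.
    schedule = poisson_schedule(0.0, 3600.0, 0.1, seed=42)
    print(len(schedule), "send times scheduled")
```

   Each returned time would be used as the parameter T of one singleton
   Type-P-Round-trip-Delay measurement.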
   The test packets will generally not arrive at Dst according to a
   Poisson distribution, nor will response packets arrive at Src
   according to a Poisson distribution, since they are influenced by
   the network.

   All the singleton Type-P-Round-trip-Delay metrics in the sequence
   will have the same values of Src, Dst, and Type-P.

   Note also that, given one sample that runs from T0 to Tf, and given
   new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the
   subsequence of the given sample whose time values fall between T0'
   and Tf' is also a valid Type-P-Round-trip-Delay-Poisson-Stream
   sample.

4.6. Methodologies:

   The methodologies follow directly from:

   + the selection of specific times, using the specified Poisson
     arrival process, and

   + the methodologies discussion already given for the singleton
     Type-P-Round-trip-Delay metric.

   Care must, of course, be taken to correctly handle out-of-order
   arrival of test or response packets; it is possible that the Src
   could send one test packet at TS[i], then send a second test packet
   (later) at TS[i+1], and it could receive the second response packet
   at TR[i+1], and then receive the first response packet (later) at
   TR[i].

4.7. Errors and Uncertainties:

   In addition to sources of errors and uncertainties associated with
   methods employed to measure the singleton values that make up the
   sample, care must be taken to analyze the accuracy of the Poisson
   process with respect to the wire-times of the sending of the test
   packets. Problems with this process could be caused by several
   things, including problems with the pseudo-random number techniques
   used to generate the Poisson arrival process, or with jitter in the
   value of Hinitial (mentioned above as uncertainty in the singleton
   delay metric).
   The Framework document shows how to use the Anderson-Darling test to
   verify the accuracy of a Poisson process over small time frames.
   {Comment: The goal is to ensure that test packets are sent "close
   enough" to a Poisson schedule, and avoid periodic behavior.}

4.8. Reporting the Metric:

   You MUST report the calibration and context for the underlying
   singletons along with the stream. (See "Reporting the metric" for
   Type-P-Round-trip-Delay.)

5. Some Statistics Definitions for Round-trip Delay

   Given the sample metric Type-P-Round-trip-Delay-Poisson-Stream, we
   now offer several statistics of that sample. These statistics are
   offered mostly to be illustrative of what could be done.

5.1. Type-P-Round-trip-Delay-Percentile

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a percent X
   between 0% and 100%, the Xth percentile of all the dT values in the
   Stream. In computing this percentile, undefined values are treated
   as infinitely large. Note that this means the percentile could
   itself be undefined (informally, infinite). In addition, the
   Type-P-Round-trip-Delay-Percentile is undefined if the sample is
   empty.

   Example: suppose we take a sample and the results are:

      Stream1 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      <T5, 500 msec>
      >

   Then the 50th percentile would be 110 msec, since 90 msec and 100
   msec are smaller and 110 msec and 'undefined' are larger.

   Note that if the possibility that a packet with finite delay is
   reported as lost is significant, then a high percentile (90th or
   95th) might be reported as infinite instead of finite.

5.2. Type-P-Round-trip-Delay-Median

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the median of all the
   dT values in the Stream. In computing the median, undefined values
   are treated as infinitely large. As with
   Type-P-Round-trip-Delay-Percentile, Type-P-Round-trip-Delay-Median
   is undefined if the sample is empty.
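
   The percentile and median statistics defined so far can be sketched
   in Python as follows. This is an illustration under stated
   assumptions, not a normative algorithm: the function names, the use
   of None to stand for an undefined dT or an undefined result, and the
   sample delay values (in msec) are all choices made for this sketch.
   The percentile is taken to be the smallest value x for which at
   least X percent of the dT values are less than or equal to x.

```python
import math


def _sorted_delays(dts):
    """Sort dT values, treating None (undefined) as infinitely large."""
    return sorted(math.inf if dt is None else dt for dt in dts)


def percentile(dts, x):
    """Xth percentile of the dT values; None if empty or undefined."""
    if not 0 <= x <= 100:
        raise ValueError("x must be between 0 and 100")
    if not dts:
        return None  # undefined for an empty sample
    values = _sorted_delays(dts)
    n = len(values)
    for rank, value in enumerate(values, start=1):
        # Smallest value such that at least x percent of the sample
        # is less than or equal to it.
        if rank * 100 >= x * n:
            return None if math.isinf(value) else value


def median(dts):
    """Median of the dT values; None if empty or undefined."""
    if not dts:
        return None  # undefined for an empty sample
    values = _sorted_delays(dts)
    n = len(values)
    mid = n // 2
    if n % 2:
        m = values[mid]
    else:
        # Even number of values: mean of the two central values.
        m = (values[mid - 1] + values[mid]) / 2
    return None if math.isinf(m) else m


if __name__ == "__main__":
    odd_sample = [100, 110, None, 90, 500]   # illustrative values, msec
    even_sample = [100, 110, None, 90]       # illustrative values, msec
    print(percentile(odd_sample, 50), median(even_sample))
```

   Representing lost packets as infinitely large values lets a high
   percentile come out undefined when enough packets are lost.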
   As noted in the Framework document, the median differs from the 50th
   percentile only when the sample contains an even number of values, in
   which case the mean of the two central values is used.

   Example: suppose we take a sample and the results are:

      Stream2 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      >

   Then the median would be 105 msec, the mean of 100 msec and 110 msec,
   the two central values.

5.3. Type-P-Round-trip-Delay-Minimum

   Given a Type-P-Round-trip-Delay-Poisson-Stream, the minimum of all
   the dT values in the Stream. In computing this, undefined values are
   treated as infinitely large. Note that this means the minimum could
   itself be undefined (informally, infinite) if all the dT values are
   undefined. In addition, the Type-P-Round-trip-Delay-Minimum is
   undefined if the sample is empty.

   In the above example, the minimum would be 90 msec.

5.4. Type-P-Round-trip-Delay-Inverse-Percentile

   Given a Type-P-Round-trip-Delay-Poisson-Stream and a time duration
   threshold, the fraction of all the dT values in the Stream less than
   or equal to the threshold. The result could be as low as 0% (if all
   the dT values exceed threshold) or as high as 100%.
   Type-P-Round-trip-Delay-Inverse-Percentile is undefined if the
   sample is empty.

   In the above example, the Inverse-Percentile of 103 msec would be
   50%.

6. Security Considerations

   Conducting Internet measurements raises both security and privacy
   concerns. This memo does not specify an implementation of the
   metrics, so it does not directly affect the security of the Internet
   nor of applications which run on the Internet. However,
   implementations of these metrics must be mindful of security and
   privacy concerns.

   There are two types of security concerns: potential harm caused by
   the measurements, and potential harm to the measurements.
   The measurements could cause harm because they are active, and inject
   packets into the network. The measurement parameters MUST be
   carefully selected so that the measurements inject trivial amounts of
   additional traffic into the networks they measure. If they inject
   "too much" traffic, they can skew the results of the measurement, and
   in extreme cases cause congestion and denial of service.

   The measurements themselves could be harmed by routers giving
   measurement traffic a different priority than "normal" traffic, or by
   an attacker injecting artificial measurement traffic. If routers can
   recognize measurement traffic and treat it separately, the
   measurements will not reflect actual user traffic. If an attacker
   injects artificial traffic that is accepted as legitimate, the loss
   rate will be artificially lowered. Therefore, the measurement
   methodologies SHOULD include appropriate techniques to reduce the
   probability that measurement traffic can be distinguished from
   "normal" traffic. Authentication techniques, such as digital
   signatures, may be used where appropriate to guard against injected
   traffic attacks.

   The privacy concerns of network measurement are limited by the active
   measurements described in this memo. Unlike passive measurements,
   there can be no release of existing user data.

7. Acknowledgements

   Special thanks are due to Vern Paxson and to Will Leland for several
   useful suggestions.

8. References

   [1] V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for
       IP Performance Metrics", RFC 2330, May 1998.

   [2] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay
       Metric for IPPM", Internet Draft, April 1999.

   [3] D. Mills, "Network Time Protocol (v3)", RFC 1305, April 1992.

   [4] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
       Connectivity", RFC 2498, January 1999.

   [5] J.
       Postel, "Internet Protocol", RFC 791, September 1981.

   [6] S. Bradner, "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997.

9. Authors' Addresses

   Guy Almes
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1120
   EMail: almes@advanced.org

   Sunil Kalidindi
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1128
   EMail: kalidindi@advanced.org

   Matthew J. Zekauskas
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1112
   EMail: matt@advanced.org

   Expiration date: October, 1999