idnits 2.17.1 draft-ietf-ippm-delay-07.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

** Looks like you're using RFC 2026 boilerplate. This must be updated to
   follow RFC 3978/3979, as updated by RFC 4748.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

** Missing expiration date. The document expiration date should appear on
   the first and last page.

** The document seems to lack a 1id_guidelines paragraph about the list of
   Shadow Directories.

** The document is more than 15 pages and seems to lack a Table of
   Contents.

== No 'Intended status' indicated for this document; assuming Proposed
   Standard.

== It seems as if not all pages are separated by form feeds - found 0 form
   feeds but 20 pages.

Checking nits according to https://www.ietf.org/id-info/checklist:
----------------------------------------------------------------------------

** The document seems to lack an Abstract section.

** The document seems to lack an IANA Considerations section. (See Section
   2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
   when there are no actions for IANA.)

** The document seems to lack separate sections for Informative/Normative
   References. All references will be assumed normative when checking for
   downward references.

Miscellaneous warnings:
----------------------------------------------------------------------------

-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
   have content which was first submitted before 10 November 2008. If you
   have contacted all the original authors and they are all willing to
   grant the BCP78 rights to the IETF Trust, then this is fine, and you
   can ignore this comment. If not, you may need to add the pre-RFC5378
   disclaimer.
   (See the Legal Provisions document at
   https://trustee.ietf.org/license-info for more information.)

-- The document date (May 1999) is 9075 days in the past. Is this
   intentional?

Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------

(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)

== Unused Reference: '3' is defined on line 821, but no explicit reference
   was found in the text.

== Unused Reference: '7' is defined on line 831, but no explicit reference
   was found in the text.

** Downref: Normative reference to an Informational RFC: RFC 2330
   (ref. '1')

-- Possible downref: Non-RFC (?) normative reference: ref. '2'

** Obsolete normative reference: RFC 1305 (ref. '3') (Obsoleted by
   RFC 5905)

** Obsolete normative reference: RFC 2498 (ref. '4') (Obsoleted by
   RFC 2678)

Summary: 10 errors (**), 0 flaws (~~), 4 warnings (==), 3 comments (--).

Run idnits with the --verbose option for more detailed information about
the items above.

--------------------------------------------------------------------------------

Network Working Group                                           G. Almes
Internet Draft                                              S. Kalidindi
Expiration Date: November 1999                              M. Zekauskas
                                             Advanced Network & Services
                                                                May 1999

                     A One-way Delay Metric for IPPM

1. Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and
may be updated, replaced, or obsoleted by other documents at any time. It
is inappropriate to use Internet-Drafts as reference material or to cite
them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft shadow directories can be accessed at
http://www.ietf.org/shadow.html

This memo provides information for the Internet community. This memo does
not specify an Internet standard of any kind. Distribution of this memo
is unlimited.

2. Introduction

This memo defines a metric for one-way delay of packets across Internet
paths. It builds on notions introduced and discussed in the IPPM
Framework document, RFC 2330 [1]; the reader is assumed to be familiar
with that document.

This memo is intended to be parallel in structure to a companion document
for Packet Loss ("A Packet Loss Metric for IPPM") [2].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [6]. Although RFC
2119 was written with protocols in mind, the key words are used in this
document for similar reasons. They are used to ensure the results of
measurements from two different implementations are comparable, and to
note instances when an implementation could perturb the network.

The structure of the memo is as follows:

+ A 'singleton' analytic metric, called Type-P-One-way-Delay, will be
  introduced to measure a single observation of one-way delay.

+ Using this singleton metric, a 'sample', called
  Type-P-One-way-Delay-Poisson-Stream, will be introduced to measure a
  sequence of singleton delays measured at times taken from a Poisson
  process.

+ Using this sample, several 'statistics' of the sample will be defined
  and discussed.

This progression from singleton to sample to statistics, with clear
separation among them, is important.
Whenever a technical term from the IPPM Framework document is first used
in this memo, it will be tagged with a trailing asterisk. For example,
"term*" indicates that "term" is defined in the Framework.

2.1. Motivation:

One-way delay of a Type-P* packet from a source host* to a destination
host is useful for several reasons:

+ Some applications do not perform well (or at all) if end-to-end delay
  between hosts is large relative to some threshold value.

+ Erratic variation in delay makes it difficult (or impossible) to
  support many real-time applications.

+ The larger the value of delay, the more difficult it is for
  transport-layer protocols to sustain high bandwidths.

+ The minimum value of this metric provides an indication of the delay
  due only to propagation and transmission delay.

+ The minimum value of this metric provides an indication of the delay
  that will likely be experienced when the path* traversed is lightly
  loaded.

+ Values of this metric above the minimum provide an indication of the
  congestion present in the path.

The measurement of one-way delay instead of round-trip delay is motivated
by the following factors:

+ In today's Internet, the path from a source to a destination may be
  different from the path from the destination back to the source
  ("asymmetric paths"), such that different sequences of routers are used
  for the forward and reverse paths. Round-trip measurements therefore
  actually measure the performance of two distinct paths together.
  Measuring each path independently highlights the performance difference
  between the two paths, which may traverse different Internet service
  providers, and even radically different types of networks (for example,
  research versus commodity networks, or ATM versus packet-over-SONET).
+ Even when the two paths are symmetric, they may have radically
  different performance characteristics due to asymmetric queueing.

+ Performance of an application may depend mostly on the performance in
  one direction. For example, a file transfer using TCP may depend more
  on the performance in the direction that data flows than on the
  direction in which acknowledgements travel.

+ In quality-of-service (QoS) enabled networks, provisioning in one
  direction may be radically different from provisioning in the reverse
  direction, and thus the QoS guarantees differ. Measuring the paths
  independently allows the verification of both guarantees.

It is outside the scope of this document to say precisely how delay
metrics would be applied to specific problems.

2.2. General Issues Regarding Time

{Comment: the terminology below differs from that defined by ITU-T
documents (e.g., G.810, "Definitions and terminology for synchronization
networks" and I.356, "B-ISDN ATM layer cell transfer performance"), but
is consistent with the IPPM Framework document. In general, these
differences derive from the authors' different backgrounds; the ITU-T
documents historically have a telephony origin, while the authors of this
document (and the Framework) have a computer systems background.
Although the terms defined below have no direct equivalent in the ITU-T
definitions, after our definitions we will provide a rough mapping.
However, note one potential confusion: our definition of "clock" is the
computer operating systems definition denoting a time-of-day clock, while
the ITU-T definition of clock denotes a frequency reference.}

Whenever a time (i.e., a moment in history) is mentioned here, it is
understood to be measured in seconds (and fractions) relative to UTC.
As described more fully in the Framework document, there are four
distinct, but related, notions of clock uncertainty:

synchronization*

   measures the extent to which two clocks agree on what time it is. For
   example, the clock on one host might be 5.4 msec ahead of the clock on
   a second host. {Comment: A rough ITU-T equivalent is "time error".}

accuracy*

   measures the extent to which a given clock agrees with UTC. For
   example, the clock on a host might be 27.1 msec behind UTC. {Comment:
   A rough ITU-T equivalent is "time error from UTC".}

resolution*

   measures the precision of a given clock. For example, the clock on an
   old Unix host might tick only once every 10 msec, and thus have a
   resolution of only 10 msec. {Comment: A very rough ITU-T equivalent is
   "sampling period".}

skew*

   measures the change of accuracy, or of synchronization, with time.
   For example, the clock on a given host might gain 1.3 msec per hour
   and thus be 27.1 msec behind UTC at one time and only 25.8 msec behind
   an hour later. In this case, we say that the clock of the given host
   has a skew of 1.3 msec per hour relative to UTC, which threatens
   accuracy. We might also speak of the skew of one clock relative to
   another clock, which threatens synchronization. {Comment: A rough
   ITU-T equivalent is "time drift".}

3. A Singleton Definition for One-way Delay

3.1. Metric Name:

Type-P-One-way-Delay

3.2. Metric Parameters:

+ Src, the IP address of a host

+ Dst, the IP address of a host

+ T, a time

3.3. Metric Units:

The value of a Type-P-One-way-Delay is either a real number, or an
undefined (informally, infinite) number of seconds.

3.4. Definition:

For a real number dT, >>the *Type-P-One-way-Delay* from Src to Dst at T
is dT<< means that Src sent the first bit of a Type-P packet to Dst at
wire-time* T and that Dst received the last bit of that packet at
wire-time T+dT.

>>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
(informally, infinite)<< means that Src sent the first bit of a Type-P
packet to Dst at wire-time T and that Dst did not receive that packet.

Suggestions for what to report along with metric values appear in Section
3.8, after a discussion of the metric, methodologies for measuring the
metric, and error analysis.

3.5. Discussion:

Type-P-One-way-Delay is a relatively simple analytic metric, and one that
we believe will afford effective methods of measurement.

The following issues are likely to come up in practice:

+ Real delay values will be positive. Therefore, it does not make sense
  to report a negative value as a real delay. However, an individual
  zero or negative delay value might be useful as part of a stream when
  trying to discover the distribution of a stream of delay values.

+ Since delay values will often be as low as the 100 usec to 10 msec
  range, it will be important for Src and Dst to synchronize very
  closely. GPS systems afford one way to achieve synchronization to
  within several tens of usec. Ordinary application of NTP may allow
  synchronization to within several msec, but this depends on the
  stability and symmetry of delay properties among the NTP agents used,
  and this delay is exactly what we are trying to measure. A combination
  of some GPS-based NTP servers and a conservatively designed and
  deployed set of other NTP servers should yield good results, but this
  is yet to be tested.
+ A given methodology will have to include a way to determine whether a
  delay value is infinite or whether it is merely very large (and the
  packet is yet to arrive at Dst). As noted by Mahdavi and Paxson [4],
  simple upper bounds (such as the theoretical upper bound of 255 seconds
  on the lifetime of an IP packet [5]) could be used, but good
  engineering, including an understanding of packet lifetimes, will be
  needed in practice. {Comment: Note that, for many applications of
  these metrics, the harm in treating a large delay as infinite might be
  zero or very small. A TCP data packet, for example, that arrives only
  after several multiples of the RTT may as well have been lost.}

+ If the packet is duplicated along the path (or paths) so that multiple
  non-corrupt copies arrive at the destination, then the packet is
  counted as received, and the first copy to arrive determines the
  packet's one-way delay.

+ If the packet is fragmented and if, for whatever reason, reassembly
  does not occur, then the packet will be deemed lost.

3.6. Methodologies:

As with other Type-P-* metrics, the detailed methodology will depend on
the Type-P (e.g., protocol number, UDP/TCP port number, size,
precedence).

Generally, for a given Type-P, the methodology would proceed as follows:

+ Arrange that Src and Dst are synchronized; that is, that they have
  clocks that are very closely synchronized with each other and each
  fairly close to the actual time.

+ At the Src host, select Src and Dst IP addresses, and form a test
  packet of Type-P with these addresses. Any 'padding' portion of the
  packet needed only to make the test packet a given size should be
  filled with randomized bits, to avoid a situation in which the measured
  delay is lower than it would otherwise be due to compression techniques
  along the path.

+ At the Dst host, arrange to receive the packet.
+ At the Src host, place a timestamp in the prepared Type-P packet, and
  send it towards Dst.

+ If the packet arrives within a reasonable period of time, take a
  timestamp as soon as possible upon receipt of the packet. By
  subtracting the two timestamps, an estimate of one-way delay can be
  computed. Error analysis of a given implementation of the method must
  take into account the closeness of synchronization between Src and
  Dst. If the delay between Src's timestamp and the actual sending of
  the packet is known, then the estimate could be adjusted by subtracting
  this amount; uncertainty in this value must be taken into account in
  error analysis. Similarly, if the delay between the actual receipt of
  the packet and Dst's timestamp is known, then the estimate could be
  adjusted by subtracting this amount; again, uncertainty in this value
  must be taken into account in error analysis. See the next section,
  "Errors and Uncertainties", for a more detailed discussion.

+ If the packet fails to arrive within a reasonable period of time, the
  one-way delay is taken to be undefined (informally, infinite). Note
  that the threshold of 'reasonable' is a parameter of the methodology.

Issues such as the packet format, the means by which Dst knows when to
expect the test packet, and the means by which Src and Dst are
synchronized are outside the scope of this document. {Comment: We plan
to document our own work on such detailed implementation techniques
elsewhere, and we encourage others to do so as well.}

3.7. Errors and Uncertainties:

The description of any specific measurement method should include an
accounting and analysis of various sources of error or uncertainty.
The Framework document provides general guidance on this point, but we
note here the following specifics related to delay metrics:

+ Errors or uncertainties due to uncertainties in the clocks of the Src
  and Dst hosts.

+ Errors or uncertainties due to the difference between 'wire time' and
  'host time'.

In addition, the loss threshold may affect the results. Each of these is
discussed in more detail below, along with a section ("Calibration") on
accounting for these errors and uncertainties.

3.7.1. Errors or uncertainties related to Clocks

The uncertainty in a measurement of one-way delay is related, in part, to
uncertainties in the clocks of the Src and Dst hosts. In the following,
we refer to the clock used to measure when the packet was sent from Src
as the source clock, and to the clock used to measure when the packet was
received by Dst as the destination clock; we refer to the observed time
when the packet was sent by the source clock as Tsource, and to the
observed time when the packet was received by the destination clock as
Tdest. Alluding to the notions of synchronization, accuracy, resolution,
and skew mentioned in the Introduction, we note the following:

+ Any error in the synchronization between the source clock and the
  destination clock will contribute to error in the delay measurement.
  We say that the source clock and the destination clock have a
  synchronization error of Tsynch if the source clock is Tsynch ahead of
  the destination clock. Thus, if we knew the value of Tsynch exactly,
  we could correct for clock synchronization by adding Tsynch to the
  uncorrected value of Tdest-Tsource.

+ The accuracy of a clock is important only in identifying the time at
  which a given delay was measured. Accuracy, per se, has no importance
  to the accuracy of the measurement of delay. When computing delays, we
  are interested only in the differences between clock values, not the
  values themselves.

+ The resolution of a clock adds to uncertainty about any time measured
  with it. Thus, if the source clock has a resolution of 10 msec, then
  this adds 10 msec of uncertainty to any time value measured with it.
  We will denote the resolution of the source clock and the destination
  clock as Rsource and Rdest, respectively.

+ The skew of a clock is not so much an additional issue as it is a
  realization of the fact that Tsynch is itself a function of time.
  Thus, if we attempt to measure or to bound Tsynch, this needs to be
  done periodically. Over some periods of time, this function can be
  approximated as a linear function plus some higher-order terms; in
  these cases, one option is to use knowledge of the linear component to
  correct the clock. Using this correction, the residual Tsynch is made
  smaller, but it remains a source of uncertainty that must be accounted
  for. We use the function Esynch(t) to denote an upper bound on the
  uncertainty in synchronization. Thus, |Tsynch(t)| <= Esynch(t).

Taking these items together, we note that the naive computation
Tdest-Tsource will be off by Tsynch(t) +/- (Rsource + Rdest). Using the
notion of Esynch(t), we note that these clock-related problems introduce
a total uncertainty of Esynch(t) + Rsource + Rdest. This estimate of
total clock-related uncertainty should be included in the
error/uncertainty analysis of any measurement implementation.

3.7.2. Errors or uncertainties related to Wire-time vs Host-time

As we have defined one-way delay, we would like to measure the time
between when the test packet leaves the network interface of Src and when
it (completely) arrives at the network interface of Dst, and we refer to
these as "wire times."
If the timings are themselves performed by software on Src and Dst,
however, then this software can only directly measure the time between
when Src grabs a timestamp just prior to sending the test packet and when
Dst grabs a timestamp just after having received the test packet, and we
refer to these two points as "host times".

To the extent that the difference between wire time and host time is
accurately known, this knowledge can be used to correct host time
measurements, and the corrected value more accurately estimates the
desired (wire-time) metric.

To the extent, however, that the difference between wire time and host
time is uncertain, this uncertainty must be accounted for in an analysis
of a given measurement method. We denote by Hsource an upper bound on
the uncertainty in the difference between wire time and host time on the
Src host, and similarly define Hdest for the Dst host. We then note that
these problems introduce a total uncertainty of Hsource+Hdest. This
estimate of total wire-vs-host uncertainty should be included in the
error/uncertainty analysis of any measurement implementation.

3.7.3. Calibration

Generally, the measured values can be decomposed as follows:

   measured value = true value + systematic error + random error

If the systematic error (the constant bias in measured values) can be
determined, it can be compensated for in the reported results:

   reported value = measured value - systematic error

and therefore

   reported value = true value + random error

The goal of calibration is to determine the systematic and random error
generated by the instruments themselves in as much detail as possible.
At a minimum, a bound ("e") should be found such that the reported value
is in the range (true value - e) to (true value + e) at least 95 percent
of the time. We call "e" the calibration error for the measurements.
It represents the degree to which the values produced by the measurement
instrument are repeatable; that is, how closely an actual delay of 30
msec is reported as 30 msec. {Comment: 95 percent was chosen because (1)
some confidence level is desirable to be able to remove outliers, which
will be found in measuring any physical property; (2) a particular
confidence level should be specified so that the results of independent
implementations can be compared; and (3) even with a prototype user-level
implementation, 95% was loose enough to exclude outliers.}

From the discussion in the previous two sections, the error in
measurements could be bounded by determining all the individual
uncertainties and adding them together to form

   Esynch(t) + Rsource + Rdest + Hsource + Hdest.

However, reasonable bounds on both the clock-related uncertainty captured
by the first three terms and the host-related uncertainty captured by the
last two terms should be possible through careful design techniques and
by calibrating the instruments using a known, isolated network in a lab.

For example, the clock-related uncertainties are greatly reduced through
the use of a GPS time source. The sum Esynch(t) + Rsource + Rdest is
then small, and is also bounded for the duration of the measurement
because of the global time source.

The host-related uncertainties, Hsource + Hdest, could be bounded by
connecting two instruments back-to-back with a high-speed serial link or
an isolated LAN segment. In this case, repeated measurements are
measuring the same one-way delay.

If the test packets are small, such a network connection has a minimal
delay that may be approximated by zero. The measured delay therefore
contains only the systematic and random error of the instrumentation.
The "average value" of repeated measurements is the systematic error, and
the variation is the random error.
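This decomposition lends itself to a short worked example. The following Python sketch is illustrative only and not part of the memo: the function name is our own, and the use of the median for the systematic error and of the 2.5th/97.5th percentiles for the 95% bound are assumptions about one reasonable realization. It estimates the systematic error and the calibration error "e" from repeated back-to-back measurements whose true delay is approximately zero:

```python
import statistics

def calibrate(measured_delays, clock_uncertainty=0.0):
    """Estimate instrument error from repeated back-to-back measurements.

    Assumes the true delay of the isolated link is approximately zero,
    so the median of the measured values estimates the systematic error
    (constant bias).  The calibration error "e" bounds 95% of the
    residuals, plus any clock-related uncertainty.  Sketch only; the
    memo does not prescribe a particular algorithm.
    """
    n = len(measured_delays)
    systematic = statistics.median(measured_delays)
    residuals = sorted(d - systematic for d in measured_delays)
    lo = residuals[int(0.025 * (n - 1))]   # 2.5th percentile of residuals
    hi = residuals[int(0.975 * (n - 1))]   # 97.5th percentile of residuals
    e = max(abs(lo), abs(hi)) + clock_uncertainty
    return systematic, e
```

A later field measurement would then be reported as the measured value minus `systematic`, quoted as correct to within plus or minus `e` at 95% confidence.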
One way to compute the systematic error, and the random error at a 95%
confidence level, is to repeat the experiment many times - at least
hundreds of tests. The systematic error would then be the median. The
random error could then be found by removing the systematic error from
the measured values. The 95% confidence interval would be the range from
the 2.5th percentile to the 97.5th percentile of these deviations from
the true value. The calibration error "e" could then be taken to be the
largest absolute value of these two numbers, plus the clock-related
uncertainty. {Comment: as described, this bound is relatively loose
since the uncertainties are added, and the absolute value of the largest
deviation is used. As long as the resulting value is not a significant
fraction of the measured values, it is a reasonable bound. If the
resulting value is a significant fraction of the measured values, then
more exact methods will be needed to compute the calibration error.}

Note that random error is a function of measurement load. For example,
if many paths will be measured by one instrument, this might increase
interrupts, process scheduling, and disk I/O (for example, recording the
measurements), all of which may increase the random error in measured
singletons. Therefore, in addition to minimal-load measurements to find
the systematic error, calibration measurements should be performed with
the same measurement load that the instruments will see in the field.

We wish to reiterate that this statistical treatment refers to the
calibration of the instrument; it is used to "calibrate the meter stick"
and say how well the meter stick reflects reality.

In addition to calibrating the instruments for finite one-way delay, two
checks should be made to ensure that packets reported as losses were
really lost. First, the threshold for loss should be verified.
In particular, ensure that the "reasonable" threshold is indeed
reasonable: that it is very unlikely a packet will arrive after the
threshold value, and therefore that the number of packets counted as lost
over an interval is not sensitive to the error bound on the measurements.
Second, consider the possibility that a packet arrives at the network
interface, but is lost due to congestion on that interface or to other
resource exhaustion (e.g., buffers) in the instrument.

3.8. Reporting the metric:

The calibration and context in which the metric is measured MUST be
carefully considered, and SHOULD always be reported along with the metric
results. We now present four items to consider: the Type-P of the test
packets, the threshold of infinite delay (if any), error calibration, and
the path traversed by the test packets. This list is not exhaustive; any
additional information that could be useful in interpreting applications
of the metrics should also be reported.

3.8.1. Type-P

As noted in the Framework document [1], the value of the metric may
depend on the type of IP packets used to make the measurement, or
"Type-P". The value of Type-P-One-way-Delay could change if the protocol
(UDP or TCP), port number, size, or arrangement for special treatment
(e.g., IP precedence or RSVP) changes. The exact Type-P used to make the
measurements MUST be accurately reported.

3.8.2. Loss threshold

In addition, the threshold (or the methodology used to distinguish)
between a large finite delay and loss MUST be reported.

3.8.3. Calibration results

+ If the systematic error can be determined, it SHOULD be removed from
  the measured values.

+ You SHOULD also report the calibration error, e, such that the true
  value is the reported value plus or minus e, with 95% confidence (see
  the previous section).
+ If possible, the conditions under which a test packet with finite delay
  is reported as lost due to resource exhaustion on the measurement
  instrument SHOULD be reported.

3.8.4. Path

Finally, the path traversed by the packet SHOULD be reported, if
possible. In general it is impractical to know the precise path a given
packet takes through the network. The precise path may be known for
certain Type-P on short or stable paths. If Type-P includes the record
route (or loose-source route) option in the IP header, and the path is
short enough, and all routers* on the path support record (or
loose-source) route, then the path will be precisely recorded. This is
impractical because the route must be short enough, many routers do not
support (or are not configured for) record route, and use of this feature
would often artificially worsen the performance observed by removing the
packet from common-case processing. However, partial information is
still valuable context. For example, if a host can choose between two
links* (and hence two separate routes from Src to Dst), then the initial
link used is valuable context. {Comment: For example, with Merit's
NetNow setup, a Src on one NAP can reach a Dst on another NAP by either
of several different backbone networks.}

4. A Definition for Samples of One-way Delay

Given the singleton metric Type-P-One-way-Delay, we now define one
particular sample of such singletons. The idea of the sample is to
select a particular binding of the parameters Src, Dst, and Type-P, then
define a sample of values of the parameter T. The means for defining the
values of T is to select a beginning time T0, a final time Tf, and an
average rate lambda, then define a pseudo-random Poisson process of rate
lambda whose values fall between T0 and Tf. The time interval between
successive values of T will then average 1/lambda.
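Such a schedule can be generated by drawing exponentially distributed inter-arrival gaps with mean 1/lambda, since the gaps of a Poisson process are independent and exponential. The following Python sketch is our own illustration: the memo leaves the pseudo-random number source unspecified, and for simplicity the process here starts exactly at T0 rather than "at or before" it as the definition below allows.

```python
import random

def poisson_sample_times(t0, tf, lam, seed=None):
    """Return the times of a pseudo-random Poisson process of rate
    `lam` (in reciprocal seconds) falling between t0 and tf.

    Successive times are accumulated exponential gaps with mean
    1/lam.  Simplification: the process starts at t0 itself rather
    than at or before T0."""
    rng = random.Random(seed)
    times = []
    t = t0
    while True:
        t += rng.expovariate(lam)   # next exponential gap, mean 1/lam
        if t > tf:
            break
        times.append(t)
    return times
```

Each resulting time would then serve as the parameter T of one singleton Type-P-One-way-Delay measurement.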
{Comment: Note that Poisson sampling is only one way of defining a
sample. Poisson has the advantage of limiting bias, but other methods of
sampling might be appropriate for different situations. We encourage
others who find such appropriate cases to use this general framework and
submit their sampling method for standardization.}

4.1. Metric Name:

Type-P-One-way-Delay-Poisson-Stream

4.2. Metric Parameters:

+ Src, the IP address of a host

+ Dst, the IP address of a host

+ T0, a time

+ Tf, a time

+ lambda, a rate in reciprocal seconds

4.3. Metric Units:

A sequence of pairs; the elements of each pair are:

+ T, a time, and

+ dT, either a real number or an undefined number of seconds.

The values of T in the sequence are monotonically increasing. Note that
T would be a valid parameter to Type-P-One-way-Delay, and that dT would
be a valid value of Type-P-One-way-Delay.

4.4. Definition:

Given T0, Tf, and lambda, we compute a pseudo-random Poisson process
beginning at or before T0, with average arrival rate lambda, and ending
at or after Tf. Those time values greater than or equal to T0 and less
than or equal to Tf are then selected. At each of the times in this
process, we obtain the value of Type-P-One-way-Delay at that time. The
value of the sample is the sequence made up of the resulting
<time, delay> pairs. If there are no such pairs, the sequence is of
length zero and the sample is said to be empty.

4.5. Discussion:

The reader should be familiar with the in-depth discussion of Poisson
sampling in the Framework document [1], which includes methods to compute
and verify the pseudo-random Poisson process.

We specifically do not constrain the value of lambda, except to note the
extremes. If the rate is too large, then the measurement traffic will
perturb the network, and itself cause congestion.
If the rate 632 is too small, then you might not capture interesting network 633 behavior. {Comment: We expect to document our experiences with, and 634 suggestions for, lambda elsewhere, culminating in a "best current 635 practices" document.} 637 Since a pseudo-random number sequence is employed, the sequence of 638 times, and hence the value of the sample, is not fully specified. 639 Pseudo-random number generators of good quality will be needed to 640 achieve the desired qualities. 642 The sample is defined in terms of a Poisson process both to avoid the 643 effects of self-synchronization and also capture a sample that is 644 statistically as unbiased as possible. {Comment: there is, of 645 course, no claim that real Internet traffic arrives according to a 646 Poisson arrival process.} The Poisson process is used to schedule 647 the delay measurements. The test packets will generally not arrive 648 at Dst according to a Poisson distribution, since they are influenced 649 by the network. 651 All the singleton Type-P-One-way-Delay metrics in the sequence will 652 have the same values of Src, Dst, and Type-P. 654 Note also that, given one sample that runs from T0 to Tf, and given 655 new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the 656 subsequence of the given sample whose time values fall between T0' 657 and Tf' are also a valid Type-P-One-way-Delay-Poisson-Stream sample. 659 4.6. Methodologies: 661 The methodologies follow directly from: 663 + the selection of specific times, using the specified Poisson 664 arrival process, and 666 + the methodologies discussion already given for the singleton Type- 667 P-One-way-Delay metric. 
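   Combining the two items above, a measurement run can be sketched
   as follows.  This is illustrative only: measure_one_way_delay is a
   hypothetical stand-in for any conforming singleton methodology,
   returning a delay in seconds, or None when the delay is undefined
   (the packet was lost or exceeded the loss threshold).

```python
def collect_sample(sample_times, measure_one_way_delay):
    """Assemble a Type-P-One-way-Delay-Poisson-Stream value: a
    sequence of (T, dT) pairs with T monotonic increasing.  dT is
    None when the singleton delay is undefined.
    """
    # Sorting keeps T monotonic even if times were supplied out
    # of order; it does not reorder packets on the wire.
    return [(t, measure_one_way_delay(t)) for t in sorted(sample_times)]

# Toy run with a fake singleton measurement: constant 100 msec
# delay, with the last packet treated as lost (undefined).
sample = collect_sample(
    [2.5, 1.0, 4.0],                       # deliberately out of order
    lambda t: 0.100 if t < 4.0 else None)
# sample == [(1.0, 0.1), (2.5, 0.1), (4.0, None)]
```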
   Care must, of course, be given to correctly handle out-of-order
   arrival of test packets; it is possible that the Src could send one
   test packet at TS[i], then send a second one (later) at TS[i+1],
   while the Dst could receive the second test packet at TR[i+1], and
   then receive the first one (later) at TR[i].

4.7. Errors and Uncertainties:

   In addition to sources of errors and uncertainties associated with
   methods employed to measure the singleton values that make up the
   sample, care must be given to analyze the accuracy of the Poisson
   process with respect to the wire-times of the sending of the test
   packets.  Problems with this process could be caused by several
   things, including problems with the pseudo-random number techniques
   used to generate the Poisson arrival process, or with jitter in the
   value of Hsource (mentioned above as uncertainty in the singleton
   delay metric).  The Framework document shows how to use the
   Anderson-Darling test to verify the accuracy of a Poisson process
   over small time frames.  {Comment: The goal is to ensure that test
   packets are sent "close enough" to a Poisson schedule, and avoid
   periodic behavior.}

4.8. Reporting the metric:

   You MUST report the calibration and context for the underlying
   singletons along with the stream.  (See "Reporting the metric" for
   Type-P-One-way-Delay.)

5. Some Statistics Definitions for One-way Delay

   Given the sample metric Type-P-One-way-Delay-Poisson-Stream, we now
   offer several statistics of that sample.  These statistics are
   offered mostly to be illustrative of what could be done.

5.1. Type-P-One-way-Delay-Percentile

   Given a Type-P-One-way-Delay-Poisson-Stream and a percent X between
   0% and 100%, the Xth percentile of all the dT values in the Stream.
   In computing this percentile, undefined values are treated as
   infinitely large.
   Note that this means that the percentile could thus be undefined
   (informally, infinite).  In addition, the Type-P-One-way-Delay-
   Percentile is undefined if the sample is empty.

   Example: suppose we take a sample and the results are:

      Stream1 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      <T5, 500 msec>
      >

   Then the 50th percentile would be 110 msec, since 90 msec and 100
   msec are smaller and 500 msec and 'undefined' are larger.

   Note that if the possibility that a packet with finite delay is
   reported as lost is significant, then a high percentile (90th or
   95th) might be reported as infinite instead of finite.

5.2. Type-P-One-way-Delay-Median

   Given a Type-P-One-way-Delay-Poisson-Stream, the median of all the
   dT values in the Stream.  In computing the median, undefined values
   are treated as infinitely large.  As with Type-P-One-way-Delay-
   Percentile, Type-P-One-way-Delay-Median is undefined if the sample
   is empty.

   As noted in the Framework document, the median differs from the
   50th percentile only when the sample contains an even number of
   values, in which case the mean of the two central values is used.

   Example: suppose we take a sample and the results are:

      Stream2 = <
      <T1, 100 msec>
      <T2, 110 msec>
      <T3, undefined>
      <T4, 90 msec>
      >

   Then the median would be 105 msec, the mean of 100 msec and 110
   msec, the two central values.

5.3. Type-P-One-way-Delay-Minimum

   Given a Type-P-One-way-Delay-Poisson-Stream, the minimum of all the
   dT values in the Stream.  In computing this, undefined values are
   treated as infinitely large.  Note that this means that the minimum
   could thus be undefined (informally, infinite) if all the dT values
   are undefined.  In addition, the Type-P-One-way-Delay-Minimum is
   undefined if the sample is empty.

   In the above example, the minimum would be 90 msec.

5.4. Type-P-One-way-Delay-Inverse-Percentile

   Given a Type-P-One-way-Delay-Poisson-Stream and a time duration
   threshold, the fraction of all the dT values in the Stream less
   than or equal to the threshold.  The result could be as low as 0%
   (if all the dT values exceed the threshold) or as high as 100%.
   Type-P-One-way-Delay-Inverse-Percentile is undefined if the sample
   is empty.

   In the above example, the Inverse-Percentile of 103 msec would be
   50%.

6. Security Considerations

   Conducting Internet measurements raises both security and privacy
   concerns.  This memo does not specify an implementation of the
   metrics, so it does not directly affect the security of the
   Internet nor of applications which run on the Internet.  However,
   implementations of these metrics must be mindful of security and
   privacy concerns.

   There are two types of security concerns: potential harm caused by
   the measurements, and potential harm to the measurements.  The
   measurements could cause harm because they are active, and inject
   packets into the network.  The measurement parameters MUST be
   carefully selected so that the measurements inject trivial amounts
   of additional traffic into the networks they measure.  If they
   inject "too much" traffic, they can skew the results of the
   measurement, and in extreme cases cause congestion and denial of
   service.

   The measurements themselves could be harmed by routers giving
   measurement traffic a different priority than "normal" traffic, or
   by an attacker injecting artificial measurement traffic.  If
   routers can recognize measurement traffic and treat it separately,
   the measurements will not reflect actual user traffic.  If an
   attacker injects artificial traffic that is accepted as legitimate,
   the loss rate will be artificially lowered.
   Therefore, the measurement methodologies SHOULD include appropriate
   techniques to reduce the probability that measurement traffic can
   be distinguished from "normal" traffic.  Authentication techniques,
   such as digital signatures, may be used where appropriate to guard
   against injected traffic attacks.

   The privacy concerns of network measurement are limited by the
   active measurements described in this memo.  Unlike passive
   measurements, there can be no release of existing user data.

7. Acknowledgements

   Special thanks are due to Vern Paxson of Lawrence Berkeley Labs for
   his helpful comments on issues of clock uncertainty and statistics.
   Thanks also to Garry Couch, Will Leland, Andy Scherrer, Sean
   Shapira, and Roland Wittig for several useful suggestions.

8. References

   [1] V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for
       IP Performance Metrics", RFC 2330, May 1998.

   [2] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Packet
       Loss Metric for IPPM", Internet-Draft, May 1999.

   [3] D. Mills, "Network Time Protocol (v3)", RFC 1305, April 1992.

   [4] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
       Connectivity", RFC 2498, January 1999.

   [5] J. Postel, "Internet Protocol", RFC 791, September 1981.

   [6] S. Bradner, "Key words for use in RFCs to Indicate Requirement
       Levels", RFC 2119, March 1997.

   [7] S. Bradner, "The Internet Standards Process -- Revision 3",
       RFC 2026, October 1996.

9. Authors' Addresses

   Guy Almes
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1120
   EMail: almes@advanced.org

   Sunil Kalidindi
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1128
   EMail: kalidindi@advanced.org

   Matthew J. Zekauskas
   Advanced Network & Services, Inc.
   200 Business Park Drive
   Armonk, NY 10504
   USA

   Phone: +1 914 765 1112
   EMail: matt@advanced.org

Expiration date: November, 1999