Network Working Group                                           G. Almes
Internet-Draft                                                 Texas A&M
Obsoletes: 2679 (if approved)                               S. Kalidindi
Intended status: Standards Track                                    Ixia
Expires: October 24, 2013                                   M. Zekauskas
                                                               Internet2
                                                          A. Morton, Ed.
                                                               AT&T Labs
                                                          April 22, 2013

                    A One-Way Delay Metric for IPPM
                     draft-morton-ippm-2679-bis-02

Abstract

   This memo (RFC 2679 bis) defines a metric for one-way delay of
   packets across Internet paths. It builds on notions introduced and
   discussed in the IPPM Framework document, RFC 2330; the reader is
   assumed to be familiar with that document.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 24, 2013.
Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  RFC 2679 bis
   2.  Introduction
     2.1.  Motivation
     2.2.  General Issues Regarding Time
   3.  A Singleton Definition for One-way Delay
     3.1.  Metric Name:
     3.2.  Metric Parameters:
     3.3.  Metric Units:
     3.4.  Definition:
     3.5.  Discussion:
     3.6.  Methodologies:
     3.7.  Errors and Uncertainties:
       3.7.1.  Errors or uncertainties related to Clocks
       3.7.2.  Errors or uncertainties related to Wire-time vs
               Host-time
       3.7.3.  Calibration
     3.8.  Reporting the metric:
       3.8.1.  Type-P
       3.8.2.  Loss Threshold
       3.8.3.  Calibration Results
       3.8.4.  Path
   4.  A Definition for Samples of One-way Delay
     4.1.  Metric Name:
     4.2.  Metric Parameters:
     4.3.  Metric Units:
     4.4.  Definition:
     4.5.  Discussion:
     4.6.  Methodologies:
     4.7.  Errors and Uncertainties:
     4.8.  Reporting the metric:
   5.  Some Statistics Definitions for One-way Delay
     5.1.  Type-P-One-way-Delay-Percentile
     5.2.  Type-P-One-way-Delay-Median
     5.3.  Type-P-One-way-Delay-Minimum
     5.4.  Type-P-One-way-Delay-Inverse-Percentile
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References (temporary)
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1. RFC 2679 bis

   The following text constitutes RFC 2679 bis proposed for advancement
   on the IETF Standards Track.

   [I-D.ietf-ippm-testplan-rfc2679] (now approved) provides the test
   plan and results supporting [RFC2679] advancement along the standards
   track, according to the process in [RFC6576].
   The conclusions of [I-D.ietf-ippm-testplan-rfc2679] list four minor
   modifications for inclusion:

   1. Section 6.2.3 of [I-D.ietf-ippm-testplan-rfc2679] asserts that
      the assumption of post-processing to enforce a constant waiting
      time threshold is compliant, and that the text of the RFC should
      be revised slightly to include this point (see the last list item
      of section 3.6, below).

   2. Section 6.5 of [I-D.ietf-ippm-testplan-rfc2679] indicates that
      the Type-P-One-way-Delay-Inverse-Percentile statistic has been
      ignored in both implementations, so it is a candidate for removal
      or deprecation in RFC2679bis (this small discrepancy does not
      affect candidacy for advancement) (see section 5.4, below).

   3. The IETF has reached consensus on guidance for reporting metrics
      in [RFC6703], and this memo should be referenced in RFC2679bis to
      incorporate recent experience where appropriate (see the last
      list item of section 3.6, section 3.8, and section 5 below).

   4. There is currently one erratum with status "Held for document
      update" for [RFC2679], and it appears this minor revision and
      additional text should be incorporated in RFC2679bis (see section
      5.1).

   A small number of updates to the [RFC2679] text have been proposed
   (by the current Editor) in the text below, principally to reference
   key IPPM RFCs that were approved after [RFC2679].

   Section 5.4.4 of RFC 6390 suggests a common template for performance
   metrics, partially derived from previous IPPM and BMWG RFCs but also
   containing some new items. All of the RFC 6390 Normative points are
   covered, but not quite in the same section names or orientation.
   Several of the Informative points are covered. It is proposed to
   "grandfather-in" bis RFCs with respect to RFC 6390 (keeping the
   familiar outline and minimizing unnecessary differences), and to
   focus efforts on applying the template with new metric memos instead.
   The publication of RFC 6921 suggests an area where this memo might be
   updated. Packet transfer on Faster-Than-Light (FTL) networks could
   result in negative delays and packet reordering, and both are covered
   as possibilities in the current text.

2. Introduction

   This memo defines a metric for one-way delay of packets across
   Internet paths. It builds on notions introduced and discussed in the
   IPPM Framework document, RFC 2330 [1]; the reader is assumed to be
   familiar with that document.

   This memo is intended to be parallel in structure to a companion
   document for Packet Loss ("A One-way Packet Loss Metric for IPPM")
   [2].

   Although RFC 2119 was written with protocols in mind, the key words
   are used in this document for similar reasons. They are used to
   ensure the results of measurements from two different implementations
   are comparable, and to note instances when an implementation could
   perturb the network.

   The structure of the memo is as follows:

   + A 'singleton' analytic metric, called Type-P-One-way-Delay, will be
     introduced to measure a single observation of one-way delay.

   + Using this singleton metric, a 'sample', called
     Type-P-One-way-Delay-Poisson-Stream, will be introduced to measure
     a sequence of singleton delays measured at times taken from a
     Poisson process.

   + Using this sample, several 'statistics' of the sample will be
     defined and discussed. This progression from singleton to sample to
     statistics, with clear separation among them, is important.

   Whenever a technical term from the IPPM Framework document is first
   used in this memo, it will be tagged with a trailing asterisk. For
   example, "term*" indicates that "term" is defined in the Framework.

2.1. Motivation

   One-way delay of a Type-P* packet from a source host* to a
   destination host is useful for several reasons:

   + Some applications do not perform well (or at all) if end-to-end
     delay between hosts is large relative to some threshold value.

   + Erratic variation in delay makes it difficult (or impossible) to
     support many real-time applications.

   + The larger the value of delay, the more difficult it is for
     transport-layer protocols to sustain high bandwidths.

   + The minimum value of this metric provides an indication of the
     delay due only to propagation and transmission delay.

   + The minimum value of this metric provides an indication of the
     delay that will likely be experienced when the path* traversed is
     lightly loaded.

   + Values of this metric above the minimum provide an indication of
     the congestion present in the path.

   The measurement of one-way delay instead of round-trip delay is
   motivated by the following factors:

   + In today's Internet, the path from a source to a destination may be
     different than the path from the destination back to the source
     ("asymmetric paths"), such that different sequences of routers are
     used for the forward and reverse paths. Therefore round-trip
     measurements actually measure the performance of two distinct paths
     together. Measuring each path independently highlights the
     performance difference between the two paths which may traverse
     different Internet service providers, and even radically different
     types of networks (for example, research versus commodity networks,
     or ATM versus packet-over-SONET).

   + Even when the two paths are symmetric, they may have radically
     different performance characteristics due to asymmetric queueing.

   + Performance of an application may depend mostly on the performance
     in one direction.
     For example, a file transfer using TCP may depend more on the
     performance in the direction that data flows, rather than the
     direction in which acknowledgements travel.

   + In quality-of-service (QoS) enabled networks, provisioning in one
     direction may be radically different than provisioning in the
     reverse direction, and thus the QoS guarantees differ. Measuring
     the paths independently allows the verification of both guarantees.

   It is outside the scope of this document to say precisely how delay
   metrics would be applied to specific problems.

2.2. General Issues Regarding Time

   {Comment: the terminology below differs from that defined by ITU-T
   documents (e.g., G.810, "Definitions and terminology for
   synchronization networks" and I.356, "B-ISDN ATM layer cell transfer
   performance"), but is consistent with the IPPM Framework document.
   In general, these differences derive from the different backgrounds;
   the ITU-T documents historically have a telephony origin, while the
   authors of this document (and the Framework) have a computer systems
   background. Although the terms defined below have no direct
   equivalent in the ITU-T definitions, after our definitions we will
   provide a rough mapping. However, note one potential confusion: our
   definition of "clock" is the computer operating systems definition
   denoting a time-of-day clock, while the ITU-T definition of clock
   denotes a frequency reference.}

   Whenever a time (i.e., a moment in history) is mentioned here, it is
   understood to be measured in seconds (and fractions) relative to UTC.

   As described more fully in the Framework document, there are four
   distinct, but related notions of clock uncertainty:

   synchronization*

      measures the extent to which two clocks agree on what time it is.
      For example, the clock on one host might be 5.4 msec ahead of the
      clock on a second host.
      {Comment: A rough ITU-T equivalent is "time error".}

   accuracy*

      measures the extent to which a given clock agrees with UTC. For
      example, the clock on a host might be 27.1 msec behind UTC.
      {Comment: A rough ITU-T equivalent is "time error from UTC".}

   resolution*

      measures the precision of a given clock. For example, the clock on
      an old Unix host might tick only once every 10 msec, and thus have
      a resolution of only 10 msec. {Comment: A very rough ITU-T
      equivalent is "sampling period".}

   skew*

      measures the change of accuracy, or of synchronization, with time.
      For example, the clock on a given host might gain 1.3 msec per
      hour and thus be 27.1 msec behind UTC at one time and only 25.8
      msec an hour later. In this case, we say that the clock of the
      given host has a skew of 1.3 msec per hour relative to UTC, which
      threatens accuracy. We might also speak of the skew of one clock
      relative to another clock, which threatens synchronization.
      {Comment: A rough ITU-T equivalent is "time drift".}

3. A Singleton Definition for One-way Delay

3.1. Metric Name:

   Type-P-One-way-Delay

3.2. Metric Parameters:

   + Src, the IP address of a host

   + Dst, the IP address of a host

   + T, a time

3.3. Metric Units:

   The value of a Type-P-One-way-Delay is either a real number, or an
   undefined (informally, infinite) number of seconds.

3.4. Definition:

   For a real number dT, >>the *Type-P-One-way-Delay* from Src to Dst at
   T is dT<< means that Src sent the first bit of a Type-P packet to Dst
   at wire-time* T and that Dst received the last bit of that packet at
   wire-time T+dT.

   >>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
   (informally, infinite)<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time T and that Dst did not receive that
   packet.
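   The two cases of the definition above can be modeled in a few lines.
   The following Python sketch is illustrative only: the function name
   one_way_delay and the use of None to stand for the undefined
   (informally, infinite) value are assumptions of this example, not
   part of the metric definition, and both inputs are taken to be
   wire-times already expressed in seconds relative to UTC.

```python
from typing import Optional


def one_way_delay(t_sent: float, t_received: Optional[float]) -> Optional[float]:
    """Return the singleton value dT in seconds, or None if undefined.

    t_sent     -- wire-time T at which Src sent the first bit
    t_received -- wire-time T+dT at which Dst received the last bit,
                  or None if Dst did not receive the packet
    """
    if t_received is None:
        # Packet never arrived: the delay is undefined (informally, infinite).
        return None
    # The difference is returned as-is; no clamping to positive values.
    return t_received - t_sent


print(one_way_delay(10.0, 10.25))  # 0.25
print(one_way_delay(10.0, None))   # None
```

   Keeping the undefined case as a distinct value, rather than a very
   large number, avoids mixing lost packets into arithmetic on finite
   delays.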
   Suggestions for what to report along with metric values appear in
   Section 3.8 after a discussion of the metric, methodologies for
   measuring the metric, and error analysis.

3.5. Discussion:

   Type-P-One-way-Delay is a relatively simple analytic metric, and one
   that we believe will afford effective methods of measurement.

   The following issues are likely to come up in practice:

   + Real delay values will be positive. Therefore, it does not make
     sense to report a negative value as a real delay. However, an
     individual zero or negative delay value might be useful as part of
     a stream when trying to discover a distribution of a stream of
     delay values.

   + Since delay values will often be as low as the 100 usec to 10 msec
     range, it will be important for Src and Dst to synchronize very
     closely. GPS systems afford one way to achieve synchronization to
     within several 10s of usec. Ordinary application of NTP may allow
     synchronization to within several msec, but this depends on the
     stability and symmetry of delay properties among those NTP agents
     used, and this delay is what we are trying to measure. A
     combination of some GPS-based NTP servers and a conservatively
     designed and deployed set of other NTP servers should yield good
     results, but this is yet to be tested.

   + A given methodology will have to include a way to determine whether
     a delay value is infinite or whether it is merely very large (and
     the packet is yet to arrive at Dst). As noted by Mahdavi and Paxson
     [4], simple upper bounds (such as the 255 seconds theoretical upper
     bound on the lifetimes of IP packets [5]) could be used, but good
     engineering, including an understanding of packet lifetimes, will
     be needed in practice. {Comment: Note that, for many applications
     of these metrics, the harm in treating a large delay as infinite
     might be zero or very small.
     A TCP data packet, for example, that arrives only after several
     multiples of the RTT may as well have been lost.}

   + If the packet is duplicated along the path (or paths) so that
     multiple non-corrupt copies arrive at the destination, then the
     packet is counted as received, and the first copy to arrive
     determines the packet's one-way delay.

   + If the packet is fragmented and if, for whatever reason, reassembly
     does not occur, then the packet will be deemed lost.

3.6. Methodologies:

   As with other Type-P-* metrics, the detailed methodology will depend
   on the Type-P (e.g., protocol number, UDP/TCP port number, size,
   precedence).

   Generally, for a given Type-P, the methodology would proceed as
   follows:

   + Arrange that Src and Dst are synchronized; that is, that they have
     clocks that are very closely synchronized with each other and each
     fairly close to the actual time.

   + At the Src host, select Src and Dst IP addresses, and form a test
     packet of Type-P with these addresses. Any 'padding' portion of the
     packet needed only to make the test packet a given size should be
     filled with randomized bits to avoid a situation in which the
     measured delay is lower than it would otherwise be due to
     compression techniques along the path.

   + At the Dst host, arrange to receive the packet.

   + At the Src host, place a timestamp in the prepared Type-P packet,
     and send it towards Dst.

   + If the packet arrives within a reasonable period of time, take a
     timestamp as soon as possible upon the receipt of the packet. By
     subtracting the two timestamps, an estimate of one-way delay can be
     computed. Error analysis of a given implementation of the method
     must take into account the closeness of synchronization between Src
     and Dst.
     If the delay between Src's timestamp and the actual sending of the
     packet is known, then the estimate could be adjusted by subtracting
     this amount; uncertainty in this value must be taken into account
     in error analysis. Similarly, if the delay between the actual
     receipt of the packet and Dst's timestamp is known, then the
     estimate could be adjusted by subtracting this amount; uncertainty
     in this value must be taken into account in error analysis. See the
     next section, "Errors and Uncertainties", for a more detailed
     discussion.

   + If the packet fails to arrive within a reasonable period of time,
     the one-way delay is taken to be undefined (informally, infinite).
     Note that the threshold of 'reasonable' is a parameter of the
     methodology. These points are examined in detail in [RFC6703],
     including the analysis preference for assigning undefined delay to
     packets that fail to arrive (given the difficulties that emerge
     from the informal "infinite delay" assignment), and an estimation
     of an upper bound on waiting time for packets in transit. Further,
     enforcing a specific constant waiting time on stored singletons of
     one-way delay is compliant with this specification and may allow
     the results to serve more than one reporting audience.

   Issues such as the packet format, the means by which Dst knows when
   to expect the test packet, and the means by which Src and Dst are
   synchronized are outside the scope of this document. {Comment: We
   plan to document elsewhere our own work in describing such more
   detailed implementation techniques and we encourage others to do so
   as well.}

3.7. Errors and Uncertainties:

   The description of any specific measurement method should include an
   accounting and analysis of various sources of error or uncertainty.
   The Framework document provides general guidance on this point, but
   we note here the following specifics related to delay metrics:

   + Errors or uncertainties due to uncertainties in the clocks of the
     Src and Dst hosts.

   + Errors or uncertainties due to the difference between 'wire time'
     and 'host time'.

   In addition, the loss threshold may affect the results. Each of
   these is discussed in more detail below, along with a section
   ("Calibration") on accounting for these errors and uncertainties.

3.7.1. Errors or uncertainties related to Clocks

   The uncertainty in a measurement of one-way delay is related, in
   part, to uncertainties in the clocks of the Src and Dst hosts. In
   the following, we refer to the clock used to measure when the packet
   was sent from Src as the source clock, we refer to the clock used to
   measure when the packet was received by Dst as the destination clock,
   we refer to the observed time when the packet was sent by the source
   clock as Tsource, and the observed time when the packet was received
   by the destination clock as Tdest. Alluding to the notions of
   synchronization, accuracy, resolution, and skew mentioned in the
   Introduction, we note the following:

   + Any error in the synchronization between the source clock and the
     destination clock will contribute to error in the delay
     measurement. We say that the source clock and the destination clock
     have a synchronization error of Tsynch if the source clock is
     Tsynch ahead of the destination clock. Thus, if we know the value
     of Tsynch exactly, we could correct for clock synchronization by
     adding Tsynch to the uncorrected value of Tdest-Tsource.

   + The accuracy of a clock is important only in identifying the time
     at which a given delay was measured. Accuracy, per se, has no
     importance to the accuracy of the measurement of delay.
     When computing delays, we are interested only in the differences
     between clock values, not the values themselves.

   + The resolution of a clock adds to uncertainty about any time
     measured with it. Thus, if the source clock has a resolution of 10
     msec, then this adds 10 msec of uncertainty to any time value
     measured with it. We will denote the resolution of the source clock
     and the destination clock as Rsource and Rdest, respectively.

   + The skew of a clock is not so much an additional issue as it is a
     realization of the fact that Tsynch is itself a function of time.
     Thus, if we attempt to measure or to bound Tsynch, this needs to be
     done periodically. Over some periods of time, this function can be
     approximated as a linear function plus some higher order terms; in
     these cases, one option is to use knowledge of the linear component
     to correct the clock. Using this correction, the residual Tsynch is
     made smaller, but remains a source of uncertainty that must be
     accounted for. We use the function Esynch(t) to denote an upper
     bound on the uncertainty in synchronization. Thus,
     |Tsynch(t)| <= Esynch(t).

   Taking these items together, we note that the naive computation
   Tdest-Tsource will be off by Tsynch(t) +/- (Rsource + Rdest). Using
   the notion of Esynch(t), we note that these clock-related problems
   introduce a total uncertainty of Esynch(t) + Rsource + Rdest. This
   estimate of total clock-related uncertainty should be included in the
   error/uncertainty analysis of any measurement implementation.

3.7.2. Errors or uncertainties related to Wire-time vs Host-time

   As we have defined one-way delay, we would like to measure the time
   between when the test packet leaves the network interface of Src and
   when it (completely) arrives at the network interface of Dst, and we
   refer to these as "wire times."
   If the timings are themselves performed by software on Src and Dst,
   however, then this software can only directly measure the time
   between when Src grabs a timestamp just prior to sending the test
   packet and when Dst grabs a timestamp just after having received the
   test packet, and we refer to these two points as "host times".

   To the extent that the difference between wire time and host time is
   accurately known, this knowledge can be used to correct for host time
   measurements and the corrected value more accurately estimates the
   desired (wire time) metric.

   To the extent, however, that the difference between wire time and
   host time is uncertain, this uncertainty must be accounted for in an
   analysis of a given measurement method. We denote by Hsource an
   upper bound on the uncertainty in the difference between wire time
   and host time on the Src host, and similarly define Hdest for the Dst
   host. We then note that these problems introduce a total uncertainty
   of Hsource+Hdest. This estimate of total wire-vs-host uncertainty
   should be included in the error/uncertainty analysis of any
   measurement implementation.

3.7.3. Calibration

   Generally, the measured values can be decomposed as follows:

      measured value = true value + systematic error + random error

   If the systematic error (the constant bias in measured values) can be
   determined, it can be compensated for in the reported results.

      reported value = measured value - systematic error

   therefore

      reported value = true value + random error

   The goal of calibration is to determine the systematic and random
   error generated by the instruments themselves in as much detail as
   possible. At a minimum, a bound ("e") should be found such that the
   reported value is in the range (true value - e) to (true value + e)
   at least 95 percent of the time. We call "e" the calibration error
   for the measurements.
   It represents the degree to which the values produced by the
   measurement instrument are repeatable; that is, how closely an actual
   delay of 30 ms is reported as 30 ms. {Comment: 95 percent was chosen
   because (1) some confidence level is desirable to be able to remove
   outliers, which will be found in measuring any physical property; (2)
   a particular confidence level should be specified so that the results
   of independent implementations can be compared; and (3) even with a
   prototype user-level implementation, 95% was loose enough to exclude
   outliers.}

   From the discussion in the previous two sections, the error in
   measurements could be bounded by determining all the individual
   uncertainties, and adding them together to form

      Esynch(t) + Rsource + Rdest + Hsource + Hdest.

   However, reasonable bounds on both the clock-related uncertainty
   captured by the first three terms and the host-related uncertainty
   captured by the last two terms should be possible by careful design
   techniques and calibrating the instruments using a known, isolated,
   network in a lab.

   For example, the clock-related uncertainties are greatly reduced
   through the use of a GPS time source. The sum of Esynch(t) + Rsource
   + Rdest is small, and is also bounded for the duration of the
   measurement because of the global time source.

   The host-related uncertainties, Hsource + Hdest, could be bounded by
   connecting two instruments back-to-back with a high-speed serial link
   or isolated LAN segment. In this case, repeated measurements are
   measuring the same one-way delay.

   If the test packets are small, such a network connection has a
   minimal delay that may be approximated by zero. The measured delay
   therefore contains only systematic and random error in the
   instrumentation. The "average value" of repeated measurements is the
   systematic error, and the variation is the random error.
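   A back-to-back calibration run of this kind can be reduced to the two
   error components with a short script. The Python sketch below is one
   possible reading, not a prescribed procedure: it assumes the true
   delay is approximately zero, takes the median as the "average value",
   uses a nearest-rank percentile, and adds a caller-supplied bound on
   the clock-related uncertainty (Esynch(t) + Rsource + Rdest) to obtain
   the calibration error e. The names percentile and calibrate, and the
   choice of microseconds as the unit, are assumptions of the example.

```python
import math
import statistics


def percentile(sorted_values, p):
    """Nearest-rank percentile (0 < p <= 100) of an ascending list."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def calibrate(back_to_back_delays, clock_uncertainty):
    """Return (systematic_error, calibration_error_e).

    back_to_back_delays -- delays measured over a link whose true delay
                           is assumed to be approximately zero
    clock_uncertainty   -- bound on Esynch(t) + Rsource + Rdest,
                           in the same unit as the delays
    """
    # Assumption: the median serves as the "average value" (constant bias).
    systematic = statistics.median(back_to_back_delays)
    # Random error: what remains after removing the systematic error.
    deviations = sorted(d - systematic for d in back_to_back_delays)
    low = percentile(deviations, 2.5)
    high = percentile(deviations, 97.5)
    # Loose bound: largest absolute deviation plus clock uncertainty.
    e = max(abs(low), abs(high)) + clock_uncertainty
    return systematic, e


# Hypothetical data: 201 measurements of 0..200 usec, 10 usec clock bound.
systematic, e = calibrate(list(range(201)), clock_uncertainty=10)
print(systematic, e)  # 100 105
```

   A measured field value would then be reported as the measured value
   minus the systematic error, with e quoted as the 95 percent bound.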
   One way to compute the systematic error, and the random error to a
   95% confidence level, is to repeat the experiment many times - at
   least hundreds of tests. The systematic error would then be the
   median. The random error could then be found by removing the
   systematic error from the measured values. The 95% confidence
   interval would be the range from the 2.5th percentile to the 97.5th
   percentile of these deviations from the true value. The calibration
   error "e" could then be taken to be the largest absolute value of
   these two numbers, plus the clock-related uncertainty. {Comment: as
   described, this bound is relatively loose since the uncertainties are
   added, and the absolute value of the largest deviation is used. As
   long as the resulting value is not a significant fraction of the
   measured values, it is a reasonable bound. If the resulting value is
   a significant fraction of the measured values, then more exact
   methods will be needed to compute the calibration error.}

   Note that random error is a function of measurement load. For
   example, if many paths will be measured by one instrument, this might
   increase interrupts, process scheduling, and disk I/O (for example,
   recording the measurements), all of which may increase the random
   error in measured singletons. Therefore, in addition to minimal load
   measurements to find the systematic error, calibration measurements
   should be performed with the same measurement load that the
   instruments will see in the field.

   We wish to reiterate that this statistical treatment refers to the
   calibration of the instrument; it is used to "calibrate the meter
   stick" and say how well the meter stick reflects reality.

   In addition to calibrating the instruments for finite one-way delay,
   two checks should be made to ensure that packets reported as losses
   were really lost. First, the threshold for loss should be verified.
620 In particular, ensure the "reasonable" threshold is reasonable: that 621 it is very unlikely a packet will arrive after the threshold value, 622 and therefore the number of packets lost over an interval is not 623 sensitive to the error bound on measurements. Second, consider the 624 possibility that a packet arrives at the network interface, but is 625 lost due to congestion on that interface or to other resource 626 exhaustion (e.g. buffers) in the instrument. 628 3.8. Reporting the metric: 630 The calibration and context in which the metric is measured MUST be 631 carefully considered, and SHOULD always be reported along with metric 632 results. We now present four items to consider: the Type-P of test 633 packets, the threshold of infinite delay (if any), error calibration, 634 and the path traversed by the test packets. This list is not 635 exhaustive; any additional information that could be useful in 636 interpreting applications of the metrics should also be reported (see 637 [RFC6703] for extensive discussion of reporting considerations for 638 different audiences). 640 3.8.1. Type-P 642 As noted in the Framework document [1], the value of the metric may 643 depend on the type of IP packets used to make the measurement, or 644 "type-P". The value of Type-P-One-way-Delay could change if the 645 protocol (UDP or TCP), port number, size, or arrangement for special 646 treatment (e.g., IP precedence or RSVP) changes. The exact Type-P 647 used to make the measurements MUST be accurately reported. 649 3.8.2. Loss Threshold 651 In addition, the threshold (or methodology to distinguish) between a 652 large finite delay and loss MUST be reported. 654 3.8.3. Calibration Results 656 + If the systematic error can be determined, it SHOULD be removed 657 from the measured values. 659 + You SHOULD also report the calibration error, e, such that the true 660 value is the reported value plus or minus e, with 95% confidence (see 661 the last section.) 
663 + If possible, the conditions under which a test packet with finite 664 delay is reported as lost due to resource exhaustion on the 665 measurement instrument SHOULD be reported. 667 3.8.4. Path 669 Finally, the path traversed by the packet SHOULD be reported, if 670 possible. In general it is impractical to know the precise path a 671 given packet takes through the network. The precise path may be 672 known for certain Type-P on short or stable paths. If Type-P 673 includes the record route (or loose-source route) option in the IP 674 header, and the path is short enough, and all routers* on the path 675 support record (or loose-source) route, then the path will be 676 precisely recorded. This is impractical because the route must be 677 short enough, many routers do not support (or are not configured for) 678 record route, and use of this feature would often artificially worsen 679 the performance observed by removing the packet from common-case 680 processing. However, partial information is still valuable context. 681 For example, if a host can choose between two links* (and hence two 682 separate routes from Src to Dst), then the initial link used is 683 valuable context. {Comment: For example, with Merit's NetNow setup, 684 a Src on one NAP can reach a Dst on another NAP by either of several 685 different backbone networks.} 687 4. A Definition for Samples of One-way Delay 689 Given the singleton metric Type-P-One-way-Delay, we now define one 690 particular sample of such singletons. The idea of the sample is to 691 select a particular binding of the parameters Src, Dst, and Type-P, 692 then define a sample of values of parameter T. The means for 693 defining the values of T is to select a beginning time T0, a final 694 time Tf, and an average rate lambda, then define a pseudo-random 695 Poisson process of rate lambda, whose values fall between T0 and Tf. 696 The time interval between successive values of T will then average 1/ 697 lambda. 
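As a minimal sketch of how such a schedule might be produced, the fragment below draws exponentially distributed gaps with mean 1/lambda and keeps the times that fall between T0 and Tf. The function name, the seed argument, and the use of Python's standard generator are assumptions for illustration. Starting the process exactly at T0 is one permitted choice, and a real instrument should verify the quality of its generator.

```python
import random


def poisson_schedule(t0, tf, lam, seed=None):
    """Return increasing send times in [t0, tf] for a Poisson process of rate lam.

    t0, tf: start and end times in seconds; lam: rate in reciprocal seconds.
    seed: optional value that makes the pseudo-random schedule repeatable.
    """
    if lam <= 0 or tf < t0:
        raise ValueError("need lam > 0 and t0 <= tf")
    rng = random.Random(seed)
    times = []
    # Gaps between events of a Poisson process are exponential with mean 1/lam.
    t = t0 + rng.expovariate(lam)
    while t <= tf:
        times.append(t)
        t += rng.expovariate(lam)
    return times
```

For example, poisson_schedule(0.0, 3600.0, 0.5) would yield roughly 1800 send times over one hour, with gaps averaging about 2 seconds.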
699 {Comment: Note that Poisson sampling is only one way of defining a 700 sample. Poisson has the advantage of limiting bias, but other 701 methods of sampling might be appropriate for different situations. 702 We encourage others who find such appropriate cases to use this 703 general framework and submit their sampling method for 704 standardization.} 706 >>> Editor proposal: Add ref to RFC 3432 Periodic sampling above. 708 4.1. Metric Name: 710 Type-P-One-way-Delay-Poisson-Stream 712 4.2. Metric Parameters: 714 + Src, the IP address of a host 715 + Dst, the IP address of a host 717 + T0, a time 719 + Tf, a time 721 + lambda, a rate in reciprocal seconds 723 4.3. Metric Units: 725 A sequence of pairs; the elements of each pair are: 727 + T, a time, and 729 + dT, either a real number or an undefined number of seconds. 731 The values of T in the sequence are monotonically increasing. Note that 732 T would be a valid parameter to Type-P-One-way-Delay, and that dT 733 would be a valid value of Type-P-One-way-Delay. 735 4.4. Definition: 737 Given T0, Tf, and lambda, we compute a pseudo-random Poisson process 738 beginning at or before T0, with average arrival rate lambda, and 739 ending at or after Tf. Those time values greater than or equal to T0 740 and less than or equal to Tf are then selected. At each of the times 741 in this process, we obtain the value of Type-P-One-way-Delay at this 742 time. The value of the sample is the sequence made up of the 743 resulting <time, delay> pairs. If there are no such pairs, the 744 sequence is of length zero and the sample is said to be empty. 746 4.5. Discussion: 748 The reader should be familiar with the in-depth discussion of Poisson 749 sampling in the Framework document [1], which includes methods to 750 compute and verify the pseudo-random Poisson process. 752 We specifically do not constrain the value of lambda, except to note 753 the extremes.
If the rate is too large, then the measurement traffic 754 will perturb the network, and itself cause congestion. If the rate 755 is too small, then you might not capture interesting network 756 behavior. {Comment: We expect to document our experiences with, and 757 suggestions for, lambda elsewhere, culminating in a "best current 758 practices" document.} 759 Since a pseudo-random number sequence is employed, the sequence of 760 times, and hence the value of the sample, is not fully specified. 761 Pseudo-random number generators of good quality will be needed to 762 achieve the desired qualities. 764 The sample is defined in terms of a Poisson process both to avoid the 765 effects of self-synchronization and to capture a sample that is 766 statistically as unbiased as possible. {Comment: there is, of 767 course, no claim that real Internet traffic arrives according to a 768 Poisson arrival process.} The Poisson process is used to schedule the 769 delay measurements. The test packets will generally not arrive at 770 Dst according to a Poisson distribution, since they are influenced by 771 the network. 773 All the singleton Type-P-One-way-Delay metrics in the sequence will 774 have the same values of Src, Dst, and Type-P. 776 Note also that, given one sample that runs from T0 to Tf, and given 777 new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the 778 subsequence of the given sample whose time values fall between T0' 779 and Tf' is also a valid Type-P-One-way-Delay-Poisson-Stream sample. 781 4.6. Methodologies: 783 The methodologies follow directly from: 785 + the selection of specific times, using the specified Poisson 786 arrival process, and 788 + the methodologies discussion already given for the singleton Type-P 789 -One-way-Delay metric.
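A minimal sketch of how these two pieces might be combined into a sample is shown here. It assumes, hypothetically, that each test packet carries a sequence number so that send and receive timestamps can be matched regardless of arrival order. The function name, the record layout, and the loss_threshold parameter are illustrative only.

```python
def build_stream(sent, received, loss_threshold):
    """Pair send and receive timestamps into a list of (T, dT) tuples.

    sent: dict mapping sequence number -> send wire time (seconds).
    received: iterable of (sequence number, receive wire time) in arrival order.
    loss_threshold: delay (seconds) beyond which a packet is treated as lost.
    dT is None when the delay is undefined.
    """
    first_arrival = {}
    for seq, t_recv in received:
        # Keep the first copy of each packet; later duplicates are ignored.
        first_arrival.setdefault(seq, t_recv)
    stream = []
    # Order the sample by send time so that T is monotonically increasing.
    for seq, t_send in sorted(sent.items(), key=lambda item: item[1]):
        t_recv = first_arrival.get(seq)
        if t_recv is None or t_recv - t_send > loss_threshold:
            stream.append((t_send, None))
        else:
            stream.append((t_send, t_recv - t_send))
    return stream
```

Because matching is done by sequence number rather than by arrival position, a packet that overtakes an earlier one is still paired with its own send time.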
791 Care must, of course, be given to correctly handle out-of-order 792 arrival of test packets; it is possible that the Src could send one 793 test packet at TS[i], then send a second one (later) at TS[i+1], 794 while the Dst could receive the second test packet at TR[i+1], and 795 then receive the first one (later) at TR[i]. 797 >>> Editor proposal: Add ref to RFC 4737 Reordering metric above. 799 4.7. Errors and Uncertainties: 801 In addition to sources of errors and uncertainties associated with 802 methods employed to measure the singleton values that make up the 803 sample, care must be given to analyze the accuracy of the Poisson 804 process with respect to the wire-times of the sending of the test 805 packets. Problems with this process could be caused by several 806 things, including problems with the pseudo-random number techniques 807 used to generate the Poisson arrival process, or with jitter in the 808 value of Hsource (mentioned above as uncertainty in the singleton 809 delay metric). The Framework document shows how to use the Anderson- 810 Darling test to verify the accuracy of a Poisson process over small 811 time frames. {Comment: The goal is to ensure that test packets are 812 sent "close enough" to a Poisson schedule, and avoid periodic 813 behavior.} 815 4.8. Reporting the metric: 817 You MUST report the calibration and context for the underlying 818 singletons along with the stream. (See "Reporting the metric" for 819 Type-P-One-way-Delay.) 821 5. Some Statistics Definitions for One-way Delay 823 Given the sample metric Type-P-One-way-Delay-Poisson-Stream, we now 824 offer several statistics of that sample. These statistics are 825 offered mostly to be illustrative of what could be done. See 826 [RFC6703] for additional discussion of statistics that are relevant 827 to different audiences. 829 5.1. 
Type-P-One-way-Delay-Percentile 831 Given a Type-P-One-way-Delay-Poisson-Stream and a percent X between 832 0% and 100%, the Xth percentile of all the dT values in the Stream. 833 In computing this percentile, undefined values are treated as 834 infinitely large. Note that this means that the percentile could 835 thus be undefined (informally, infinite). In addition, the Type-P- 836 One-way-Delay-Percentile is undefined if the sample is empty. 838 Example: suppose we take a sample and the results are: 840 Stream1 = < 842 <T1, 100 msec> 844 <T2, 110 msec> 846 <T3, undefined> 848 <T4, 90 msec> 850 <T5, 500 msec> 852 > 853 Then the 50th percentile would be 110 msec, since 90 msec and 100 854 msec are smaller and 500 msec and 'undefined' are larger. See 855 Section 11.3 of [1] for computing percentiles. 857 Note that if the possibility that a packet with finite delay is 858 reported as lost is significant, then a high percentile (90th or 859 95th) might be reported as infinite instead of finite. 861 5.2. Type-P-One-way-Delay-Median 863 Given a Type-P-One-way-Delay-Poisson-Stream, the median of all the dT 864 values in the Stream. In computing the median, undefined values are 865 treated as infinitely large. As with Type-P-One-way-Delay- 866 Percentile, Type-P-One-way-Delay-Median is undefined if the sample is 867 empty. 869 As noted in the Framework document, the median differs from the 50th 870 percentile only when the sample contains an even number of values, in 871 which case the mean of the two central values is used. 873 Example: suppose we take a sample and the results are: 875 Stream2 = < <T1, 100 msec> <T2, 110 msec> <T3, undefined> <T4, 90 msec> > 878 Then the median would be 105 msec, the mean of 100 msec and 110 msec, 879 the two central values. 881 5.3. Type-P-One-way-Delay-Minimum 883 Given a Type-P-One-way-Delay-Poisson-Stream, the minimum of all the 884 dT values in the Stream. In computing this, undefined values are 885 treated as infinitely large. Note that this means that the minimum 886 could thus be undefined (informally, infinite) if all the dT values 887 are undefined.
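The percentile, median, and minimum statistics defined so far can be sketched as follows. This is an illustration under stated assumptions rather than a normative algorithm. A stream is represented as a list of (T, dT) tuples with dT set to None when undefined. The function names are hypothetical, an infinite result is returned as math.inf, and an empty sample yields None. The percentile rule picks the smallest value with at least X percent of the dT values at or below it, which reproduces the worked examples in the text.

```python
import math


def _sorted_delays(stream):
    """dT values in ascending order, with undefined (None) mapped to infinity."""
    return sorted(math.inf if dt is None else dt for _, dt in stream)


def percentile(stream, x):
    """Xth percentile of dT; None if the sample is empty."""
    vals = _sorted_delays(stream)
    if not vals:
        return None
    k = max(math.ceil(x * len(vals) / 100), 1)
    return vals[k - 1]


def median(stream):
    """Median of dT; the mean of the two central values for an even count."""
    vals = _sorted_delays(stream)
    if not vals:
        return None
    mid = len(vals) // 2
    if len(vals) % 2:
        return vals[mid]
    return (vals[mid - 1] + vals[mid]) / 2


def minimum(stream):
    """Minimum of dT; infinite if every dT is undefined, None if empty."""
    vals = _sorted_delays(stream)
    return vals[0] if vals else None


# Delay values (msec) taken from the examples in the text; the time values
# 1..5 are placeholders chosen for this sketch.
stream1 = [(1, 100), (2, 110), (3, None), (4, 90), (5, 500)]
stream2 = stream1[:4]
print(percentile(stream1, 50))  # 110
print(median(stream2))          # 105.0
print(minimum(stream2))         # 90
```

Mapping undefined delays to infinity before sorting means a high percentile of a lossy sample comes out infinite without any special casing.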
In addition, the Type-P-One-way-Delay-Minimum is 888 undefined if the sample is empty. 890 In the above example, the minimum would be 90 msec. 892 5.4. Type-P-One-way-Delay-Inverse-Percentile 894 Note: This statistic is deprecated in this version of the memo 895 because of lack of use. 897 Given a Type-P-One-way-Delay-Poisson-Stream and a time duration 898 threshold, the fraction of all the dT values in the Stream less than 899 or equal to the threshold. The result could be as low as 0% (if all 900 the dT values exceed threshold) or as high as 100%. Type-P-One-way- 901 Delay-Inverse-Percentile is undefined if the sample is empty. 903 In the above example, the Inverse-Percentile of 103 msec would be 904 50%. 906 6. Security Considerations 908 Conducting Internet measurements raises both security and privacy 909 concerns. This memo does not specify an implementation of the 910 metrics, so it does not directly affect the security of the Internet 911 nor of applications which run on the Internet. However, 912 implementations of these metrics must be mindful of security and 913 privacy concerns. 915 There are two types of security concerns: potential harm caused by 916 the measurements, and potential harm to the measurements. The 917 measurements could cause harm because they are active, and inject 918 packets into the network. The measurement parameters MUST be 919 carefully selected so that the measurements inject trivial amounts of 920 additional traffic into the networks they measure. If they inject 921 "too much" traffic, they can skew the results of the measurement, and 922 in extreme cases cause congestion and denial of service. 924 The measurements themselves could be harmed by routers giving 925 measurement traffic a different priority than "normal" traffic, or by 926 an attacker injecting artificial measurement traffic. If routers can 927 recognize measurement traffic and treat it separately, the 928 measurements will not reflect actual user traffic. 
If an attacker 929 injects artificial traffic that is accepted as legitimate, the loss 930 rate will be artificially lowered. Therefore, the measurement 931 methodologies SHOULD include appropriate techniques to reduce the 932 probability that measurement traffic can be distinguished from "normal" 933 traffic. Authentication techniques, such as digital signatures, may 934 be used where appropriate to guard against injected traffic attacks. 936 The privacy concerns of network measurement are limited by the active 937 measurements described in this memo. Unlike passive measurements, 938 there can be no release of existing user data. 940 7. IANA Considerations 942 This memo makes no requests of IANA. 944 8. Acknowledgements 946 Special thanks are due to Vern Paxson of Lawrence Berkeley Labs for 947 his helpful comments on issues of clock uncertainty and statistics. 948 Thanks also to Garry Couch, Will Leland, Andy Scherrer, Sean Shapira, 949 and Roland Wittig for several useful suggestions. 951 9. References (temporary) 953 [1] Paxson, V., Almes, G., Mahdavi, J. and M. Mathis, "Framework 954 for IP Performance Metrics", RFC 2330, May 1998. 956 [2] Almes, G., Kalidindi, S. and M. Zekauskas, "A One-way Packet 957 Loss Metric for IPPM", RFC 2680, September 1999. 959 [3] Mills, D., "Network Time Protocol (v3)", RFC 1305, April 1992. 961 [4] Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring 962 Connectivity", RFC 2678, September 1999. 964 [5] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981. 966 [6] Bradner, S., "Key words for use in RFCs to Indicate Requirement 967 Levels", BCP 14, RFC 2119, March 1997. 969 [7] Bradner, S., "The Internet Standards Process -- Revision 3", BCP 970 9, RFC 2026, October 1996.
979 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 980 Requirement Levels", BCP 14, RFC 2119, March 1997. 982 [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, 983 "Framework for IP Performance Metrics", RFC 2330, May 984 1998. 986 [RFC2679] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way 987 Delay Metric for IPPM", RFC 2679, September 1999. 989 [RFC2680] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way 990 Packet Loss Metric for IPPM", RFC 2680, September 1999. 992 [RFC3432] Raisanen, V., Grotefeld, G., and A. Morton, "Network 993 performance measurement with periodic streams", RFC 3432, 994 November 2002. 996 [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. 997 Zekauskas, "A One-way Active Measurement Protocol 998 (OWAMP)", RFC 4656, September 2006. 1000 [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J. 1001 Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)", 1002 RFC 5357, October 2008. 1004 [RFC5657] Dusseault, L. and R. Sparks, "Guidance on Interoperation 1005 and Implementation Reports for Advancement to Draft 1006 Standard", BCP 9, RFC 5657, September 2009. 1008 [RFC5835] Morton, A. and S. Van den Berghe, "Framework for Metric 1009 Composition", RFC 5835, April 2010. 1011 [RFC6049] Morton, A. and E. Stephan, "Spatial Composition of 1012 Metrics", RFC 6049, January 2011. 1014 [RFC6576] Geib, R., Morton, A., Fardid, R., and A. Steinmitz, "IP 1015 Performance Metrics (IPPM) Standard Advancement Testing", 1016 BCP 176, RFC 6576, March 2012. 1018 [RFC6703] Morton, A., Ramachandran, G., and G. Maguluri, "Reporting 1019 IP Network Performance Metrics: Different Points of View", 1020 RFC 6703, August 2012. 1022 10.2. Informative References 1024 [ADK] Scholz, F.W. and M.A. Stephens, "K-sample Anderson-Darling 1025 Tests of fit, for continuous and discrete cases", 1026 University of Washington, Technical Report No. 81, May 1027 1986. 
1029 [I-D.ietf-ippm-testplan-rfc2679] 1030 Ciavattone, L., Geib, R., Morton, A., and M. Wieser, "Test 1031 Plan and Results Supporting Advancement of RFC 2679 on the 1032 Standards Track", draft-ietf-ippm-testplan-rfc2679-03 1033 (work in progress), September 2012. 1035 [RFC3931] Lau, J., Townsley, M., and I. Goyret, "Layer Two Tunneling 1036 Protocol - Version 3 (L2TPv3)", RFC 3931, March 2005. 1038 Authors' Addresses 1040 Guy Almes 1041 Texas A&M 1043 Sunil Kalidindi 1044 Ixia 1046 Matt Zekauskas 1047 Internet2 1049 Email: matt@internet2.edu 1051 Al Morton (editor) 1052 AT&T Labs 1053 200 Laurel Avenue South 1054 Middletown, NJ 07748 1055 USA 1057 Phone: +1 732 420 1571 1058 Fax: +1 732 368 1192 1059 Email: acmorton@att.com 1060 URI: http://home.comcast.net/~acmacm/