Network Working Group                                          A. Morton
Internet-Draft                                           G. Ramachandran
Intended status: Informational                               G. Maguluri
Expires: April 28, 2011                                        AT&T Labs
                                                        October 25, 2010

             Reporting Metrics: Different Points of View
                  draft-ietf-ippm-reporting-metrics-04

Abstract

   Consumers of IP network performance metrics have many different uses
   in mind.  This memo provides "long-term" reporting considerations
   (e.g., days, weeks, or months, as opposed to 10 seconds), based on
   analysis of the two key audience points-of-view.  It describes how
   the audience categories affect the selection of metric parameters
   and options when seeking information that serves their needs.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on April 28, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Purpose and Scope
   3.  Reporting Results
       3.1.  Overview of Metric Statistics
       3.2.  Long-Term Reporting Considerations
   4.  Effect of POV on the Loss Metric
       4.1.  Loss Threshold
             4.1.1.  Network Characterization
             4.1.2.  Application Performance
       4.2.  Errored Packet Designation
       4.3.  Causes of Lost Packets
       4.4.  Summary for Loss
   5.  Effect of POV on the Delay Metric
       5.1.  Treatment of Lost Packets
             5.1.1.  Application Performance
             5.1.2.  Network Characterization
             5.1.3.  Delay Variation
             5.1.4.  Reordering
       5.2.  Preferred Statistics
       5.3.  Summary for Delay
   6.  Effect of POV on Raw Capacity Metrics
       6.1.  Type-P Parameter
       6.2.  a priori Factors
       6.3.  IP-layer Capacity
       6.4.  IP-layer Utilization
       6.5.  IP-layer Available Capacity
       6.6.  Variability in Utilization and Avail. Capacity
   7.  Effect of POV on Restricted Capacity Metrics
       7.1.  Type-P Parameter and Type-C Parameter
       7.2.  a priori Factors
       7.3.  Measurement Interval
       7.4.  Bulk Transfer Capacity Reporting
       7.5.  Variability in Bulk Transfer Capacity
   8.  Test Streams and Sample Size
       8.1.  Test Stream Characteristics
       8.2.  Sample Size
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
       12.1.  Normative References
       12.2.  Informative References
   Authors' Addresses

1.  Introduction

   When designing measurements of IP networks and presenting the
   results, knowledge of the audience is a key consideration.  To
   present a useful and relevant portrait of network conditions, one
   must answer the following question:

   "How will the results be used?"

   There are two main audience categories:

   1.  Network Characterization - describes conditions in an IP network
       for quality assurance, troubleshooting, modeling, Service Level
       Agreements (SLA), etc.  The point-of-view looks inward, toward
       the network, and the consumer of the results intends to act on
       the network itself.

   2.  Application Performance Estimation - describes the network
       conditions in a way that facilitates determining effects on user
       applications, and ultimately on the users themselves.  This
       point-of-view looks outward, toward the user(s), accepting the
       network as-is.  This consumer intends to estimate a network-
       dependent aspect of performance, or to design some aspect of an
       application's accommodation of the network.  (These are *not*
       application metrics; they are defined at the IP layer.)

   This memo considers how these different points-of-view affect both
   the measurement design (parameters and options of the metrics) and
   the statistics reported to serve those needs.

   The IPPM framework [RFC2330] and other RFCs describing IPPM metrics
   provide a background for this memo.

2.  Purpose and Scope

   The purpose of this memo is to clearly delineate two points-of-view
   (POV) for using measurements, and to describe their effects on the
   test design, including the selection of metric parameters and the
   reporting of results.

   The scope of this memo primarily covers the design and reporting of
   the loss and delay metrics [RFC2680] [RFC2679].  It also discusses
   the delay variation [RFC3393] and reordering metrics [RFC4737] where
   applicable.

   With capacity metrics growing in relevance to the industry, the memo
   also covers POV and reporting considerations for metrics resulting
   from the Bulk Transfer Capacity Framework [RFC3148] and Network
   Capacity Definitions [RFC5136].  These memos effectively describe
   two different categories of metrics:

   o  [RFC3148], with congestion flow-control and the notion of unique
      data bits delivered, and

   o  [RFC5136], using a definition of raw capacity without the
      restrictions of data uniqueness or congestion-awareness.

   It might seem at first glance that each of these metric categories
   has an obvious audience (Raw = Network Characterization, Restricted
   = Application Performance), but the reality is more complex, as is
   consistent with the overall topic of capacity measurement and
   reporting.  For example, TCP is usually used in Restricted capacity
   measurement methods, while UDP appears in Raw capacity measurement.
   The Raw and Restricted capacity metrics will be treated in separate
   sections, although they share one common reporting issue:
   representing variability in capacity metric results as part of a
   long-term report.

   Sampling, or the design of the active packet stream that is the
   basis for the measurements, is also discussed.

3.  Reporting Results

   This section gives an overview of recommendations, followed by
   additional considerations for reporting results in the "long-term",
   based on the discussion and conclusions of the major sections that
   follow.

3.1.  Overview of Metric Statistics

   This section gives an overview of reporting recommendations for the
   loss, delay, and delay variation metrics.

   The minimal report on measurements MUST include both Loss and Delay
   Metrics.

   For Packet Loss, the loss ratio defined in [RFC2680] is a sufficient
   starting point, especially the guidance for setting the loss
   threshold waiting time.  In Section 4.1.1 below, we calculate a
   waiting time that should be sufficient to differentiate between
   packets that are truly lost and those with long finite delays under
   general measurement circumstances: 51 seconds.  Knowledge of
   specific conditions can help to reduce this threshold, but 51
   seconds is considered to be manageable in practice.

   We note that a loss ratio calculated according to [Y.1540] would
   exclude errored packets from the numerator.  In practice, the
   difference between these two loss metrics is small if any, depending
   on whether the last link prior to the destination contributes
   errored packets.

   For Packet Delay, we recommend providing both the mean delay and the
   median delay, with lost packets designated undefined (as permitted
   by [RFC2679]).  Both statistics are based on a conditional
   distribution, and the condition is packet arrival prior to a waiting
   time dT, where dT has been set to take maximum packet lifetimes into
   account, as discussed below.  Using a long dT helps to ensure that
   delay distributions are not truncated.

   For Packet Delay Variation (PDV), the minimum delay of the
   conditional distribution should be used as the reference delay for
   computing PDV according to [Y.1540] or [RFC5481] and [RFC3393].  A
   useful value to report is a pseudo range of delay variation based on
   calculating the difference between a high percentile of delay and
   the minimum delay.  For example, the 99.9%-ile minus the minimum
   will give a value that can be compared with objectives in [Y.1541].
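
   As an informative illustration only (not part of any metric
   definition), the following Python sketch computes this pseudo range
   from an assumed sample of one-way delays in which lost packets have
   already been excluded (a conditional distribution):

      # Pseudo range of delay variation: a high percentile of delay
      # minus the minimum delay, per the recommendation above.
      import math

      def pdv_pseudo_range(delays_s, percentile=99.9):
          """delays_s: one-way delays (seconds) of arriving packets."""
          d = sorted(delays_s)
          # nearest-rank percentile; other estimators are also valid
          rank = max(1, math.ceil(percentile / 100.0 * len(d)))
          return d[rank - 1] - d[0]
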
3.2.  Long-Term Reporting Considerations

   [I-D.ietf-ippm-reporting] describes methods to conduct measurements
   and report the results on a near-immediate time scale (10 seconds,
   which we consider to be "short-term").

   Measurement intervals and reporting intervals need not be the same
   length.  Sometimes, the user is only concerned with the performance
   levels achieved over a relatively long interval of time (e.g., days,
   weeks, or months, as opposed to 10 seconds).  However, there can be
   risks involved with running a measurement continuously over a long
   period without recording intermediate results:

   o  Temporary power failure may cause loss of all the results to
      date.

   o  Measurement system timing synchronization signals may experience
      a temporary outage, causing subsets of measurements to be in
      error or invalid.

   o  Maintenance may be necessary on the measurement system, or on its
      connectivity to the network under test.

   For these and other reasons, such as

   o  the constraint to collect measurements on intervals similar to
      user session length, or

   o  the dual use of measurements in monitoring activities where
      results are needed at intervals of a few minutes,

   there is value in conducting measurements on intervals that are much
   shorter than the reporting interval.

   There are several approaches for aggregating a series of measurement
   results over time in order to make a statement about the longer
   reporting interval.  One approach requires the storage of all metric
   singletons collected throughout the reporting interval, even though
   the measurement interval stops and starts many times.

   Another approach is described in [RFC5835] as "temporal
   aggregation".  This approach would estimate the results for the
   reporting interval based on many individual measurement interval
   statistics (results) alone.  The result would ideally appear in the
   same form as though a continuous measurement had been conducted.  A
   memo to address the details of temporal aggregation is yet to be
   prepared.

   Yet another approach requires a numerical objective for the metric,
   and the results of each measurement interval are compared with the
   objective.  Every measurement interval where the results meet the
   objective contributes to the fraction of time with performance as
   specified.  When the reporting interval contains many measurement
   intervals, it is possible to present the results as "metric A was
   less than or equal to objective X during Y% of time".

   NOTE that numerical thresholds of acceptability are not set in IETF
   performance work and are explicitly excluded from the IPPM charter.
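
   As a hedged illustration of this objective-comparison approach (the
   IETF does not define such objectives, so the threshold here is
   purely an assumed input), a minimal Python sketch might look like:

      # Fraction of measurement intervals whose summary statistic met
      # a locally chosen numerical objective.
      def percent_time_meeting(results, objective):
          """results: one statistic per measurement interval."""
          met = sum(1 for r in results if r <= objective)
          return 100.0 * met / len(results)

      # Example: per-interval 99.9%-ile delays vs. a 100 ms objective
      # percent_time_meeting([0.081, 0.093, 0.142, 0.088], 0.100)
      #   -> 75.0, i.e., "delay <= objective during 75% of time"
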
   In all measurement, it is important to avoid unintended
   synchronization with network events.  This topic is treated in
   [RFC2330] for Poisson-distributed inter-packet time streams, and in
   [RFC3432] for Periodic streams.  Both avoid synchronization through
   use of random start times.

   There are network conditions where it is simply more useful to
   report the connectivity status of the Source-Destination path, and
   to distinguish time intervals where connectivity can be demonstrated
   from other time intervals (where connectivity does not appear to
   exist).  [RFC2678] specifies a number of one-way and two-way
   connectivity metrics of increasing complexity.  In this memo, we
   RECOMMEND that long-term reporting of loss, delay, and other metrics
   be limited to time intervals where connectivity can be demonstrated,
   and that other intervals be summarized as the percent of time where
   connectivity does not appear to exist.  We note that this same
   approach has been adopted in ITU-T Recommendation [Y.1540], where
   performance parameters are only valid during periods of service
   "availability" (evaluated according to a function based on packet
   loss, and sustained periods of loss ratio greater than a threshold
   are declared "unavailable").

4.  Effect of POV on the Loss Metric

   This section describes the ways in which the Loss metric can be
   tuned to reflect the preferences of the two audience categories, or
   different POV.  The waiting time to declare a packet lost, or loss
   threshold, is one area where there would appear to be a difference,
   but the ability to post-process the results may resolve it.

4.1.  Loss Threshold

   RFC 2680 [RFC2680] defines the concept of a waiting time for packets
   to arrive, beyond which they are declared lost.  The text of the RFC
   declines to recommend a value, instead saying that "good
   engineering, including an understanding of packet lifetimes, will be
   needed in practice."  Later, in the methodology, reasons are given
   for waiting "a reasonable period of time", with the definition of
   "reasonable" left intentionally vague.

4.1.1.  Network Characterization

   Practical measurement experience has shown that unusual network
   circumstances can cause long delays.  One such circumstance is when
   routing loops form during IGP re-convergence following a failure or
   drastic link cost change.  Packets will loop between two routers
   until new routes are installed, or until the IPv4 Time-to-Live (TTL)
   field (or the IPv6 Hop Limit) decrements to zero.  Very long delays
   on the order of several seconds have been measured [Casner] [Cia03].

   Therefore, network characterization activities prefer a long waiting
   time in order to distinguish these events from other causes of loss
   (such as packet discard at a full queue, or tail drop).  This way,
   the metric design helps to distinguish more reliably between packets
   that might yet arrive, and those that are no longer traversing the
   network.

   It is possible to calculate a worst-case waiting time, assuming that
   a routing loop is the cause.  We model the path between Source and
   Destination as a series of delays in links (t) and queues (q), as
   these two are the dominant contributors to delay.  The normal path
   delay across n hops without encountering a loop, D, is

                     n
                    ---
                    \
           D = t  +  >  (t  + q )
                0   /     i    i
                    ---
                   i = 1

                       Figure 1: Normal Path Delay

   and the time spent in the loop with L hops, R, is

               i+L-1
                ---
                \                             (TTL - n)
       R = C     >  (t  + q )   where C    =  ---------
                /     i    i           max        L
                ---
                 i

               Figure 2: Delay due to Rotations in a Loop

   and where C is the number of times a packet circles the loop (at
   most C_max).

   If we take the delays of all links and queues as 100 ms each, with
   TTL=255, the number of hops n=5, and the hops in the loop L=4, then

      D = 1.1 sec and R ~= 50 sec, and D + R ~= 51.1 seconds

   We note that link delays of 100 ms would span most continents, and a
   constant queue length of 100 ms is also very generous.  When a loop
   occurs, it is almost certain to be resolved in 10 seconds or less.
   The value calculated above is an upper limit for almost any
   realistic circumstance.

   A waiting time threshold parameter, dT, set consistent with this
   calculation would not truncate the delay distribution (possibly
   causing a change in its mathematical properties), because the
   packets that might arrive have been given sufficient time to
   traverse the network.

   It is worth noting that packets that are stored and deliberately
   forwarded at a much later time constitute a replay attack on the
   measurement system, and are beyond the scope of normal performance
   reporting.
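
   The calculation above is easy to reproduce; the following Python
   sketch uses the same assumed values (100 ms per link and per queue)
   and is illustrative only:

      # Worst-case waiting time with a routing loop (Figures 1 and 2).
      def worst_case_wait(t=0.1, q=0.1, ttl=255, n=5, loop_hops=4):
          d = t + n * (t + q)              # normal path delay, D
          c_max = (ttl - n) / loop_hops    # max loop rotations, C_max
          r = c_max * loop_hops * (t + q)  # time spent looping, R
          return d + r                     # -> 51.1 seconds
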
4.1.2.  Application Performance

   Fortunately, application performance estimation activities are not
   adversely affected by the estimated worst-case transfer time.
   Although the designer's tendency might be to set the Loss Threshold
   at a value equivalent to a particular application's threshold, this
   specific threshold can be applied when post-processing the
   measurements.  A shorter waiting time can be enforced by locating
   packets with delays longer than the application's threshold, and re-
   designating such packets as lost.  Thus, the measurement system can
   use a single loss threshold and support both application and network
   performance POVs simultaneously.

4.2.  Errored Packet Designation

   RFC 2680 designates packets that arrive containing errors as lost
   packets.  Many packets that are corrupted by bit errors are
   discarded within the network and do not reach their intended
   destination.

   This is consistent with applications that would check the payload
   integrity at higher layers, and discard the packet.  However, some
   applications prefer to deal with errored payloads on their own, and
   even a corrupted payload is better than no packet at all.

   To address this possibility, and to make network characterization
   more complete, it is recommended to distinguish between packets that
   do not arrive (lost) and errored packets that arrive (conditionally
   lost).

4.3.  Causes of Lost Packets

   Although many measurement systems use a waiting time to determine if
   a packet is lost or not, most of the waiting is in vain.  The
   packets are no longer traversing the network, and have not reached
   their destination.

   There are many causes of packet loss, including:

   1.  Queue drop, or discard

   2.  Corruption of the IP header, or other essential header
       information

   3.  TTL expiration (or use of a TTL value that is too small)

   4.  Link or router failure

   After waiting sufficient time, packet loss can probably be
   attributed to one of these causes.

4.4.  Summary for Loss

   Given that measurement post-processing is possible (even encouraged
   in the definitions of IPPM metrics), measurements of loss can easily
   serve both points of view:

   o  Use a long waiting time to serve network characterization, and
      revise results for specific application delay thresholds as
      needed (see the sketch below).

   o  Distinguish between errored packets and lost packets when
      possible to aid network characterization, and combine the results
      for application performance if appropriate.
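
   A minimal sketch of the post-processing described in the first
   bullet (the application threshold and the sample are assumed values,
   and None marks a packet that never arrived):

      # Re-designate packets as lost under a shorter application-
      # specific threshold, without re-measuring.
      def redesignate(delays_s, app_threshold_s):
          return [None if (d is None or d > app_threshold_s) else d
                  for d in delays_s]

      # redesignate([0.02, None, 0.31, 0.05], app_threshold_s=0.15)
      #   -> [0.02, None, None, 0.05]
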
5.  Effect of POV on the Delay Metric

   This section describes the ways in which the Delay metric can be
   tuned to reflect the preferences of the two consumer categories, or
   different POV.

5.1.  Treatment of Lost Packets

   The Delay Metric [RFC2679] specifies the treatment of packets that
   do not successfully traverse the network: their delay is undefined.

   ">>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
   (informally, infinite)<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time T and that Dst did not receive
   that packet."

   It is an accepted, but informal, practice to assign infinite delay
   to lost packets.  We next look at how these two different treatments
   align with the needs of measurement consumers who wish to
   characterize networks or estimate application performance.  Also, we
   look at the way that lost packets have been treated in other
   metrics: delay variation and reordering.

5.1.1.  Application Performance

   Applications need to perform different functions, dependent on
   whether or not each packet arrives within some finite tolerance.  In
   other words, a receiver's packet processing takes one of two
   directions (or "forks" in the road):

   o  Packets that arrive within expected tolerance are handled by
      processes that remove headers, restore smooth delivery timing (as
      in a de-jitter buffer), restore sending order, check for errors
      in payloads, and many other operations.

   o  Packets that do not arrive when expected spawn other processes
      that attempt recovery from the apparent loss, such as
      retransmission requests, loss concealment, or forward error
      correction to replace the missing packet.

   So, it is important to maintain a distinction between packets that
   actually arrive, and those that do not.  Therefore, it is preferable
   to leave the delay of lost packets undefined, and to characterize
   the delay distribution as a conditional distribution (conditioned on
   arrival).

5.1.2.  Network Characterization

   In this discussion, we assume that both loss and delay metrics will
   be reported for network characterization (at least).

   Assume packets that do not arrive are reported as Lost, usually as a
   fraction of all sent packets.  If these lost packets are assigned
   undefined delay, then the network's inability to deliver them (in a
   timely way) is captured only in the loss metric when we report
   statistics on the Delay distribution conditioned on the event of
   packet arrival (within the Loss waiting time threshold).  We can say
   that the Delay and Loss metrics are orthogonal, in that they convey
   non-overlapping information about the network under test.

   However, if we assign infinite delay to all lost packets, then:

   o  The delay metric results are influenced both by packets that
      arrive and by those that do not.

   o  The delay singleton and the loss singleton do not appear to be
      orthogonal (Delay is finite when Loss=0, Delay is infinite when
      Loss=1).

   o  The network is penalized in both the loss and delay metrics,
      effectively double-counting the lost packets.

   As further evidence of overlap, consider the Cumulative Distribution
   Function (CDF) of Delay when the value positive infinity is assigned
   to all lost packets.  Figure 3 shows a CDF where a small fraction of
   packets are lost.

      1 | - - - - - - - - - - - - - - - - - -+
        |                                    |
        |           _..----''''''''''''''''''
        |         ,-''
        |        ,'
        |       /                  Mass at
        |      /                  +infinity
        |     /                   = fraction
        |    |                       lost
        |   /
      0 |_____________________________________

        0              Delay              +inf

         Figure 3: Cumulative Distribution Function for Delay when
                            Loss = +Infinity

   We note that a Delay CDF that is conditioned on packet arrival would
   not exhibit this apparent overlap with loss.

   Although infinity is a familiar mathematical concept, it is somewhat
   disconcerting to see any time-related metric reported as infinity,
   in the opinion of the authors.  Questions are bound to arise, and
   tend to detract from the goal of informing the consumer with a
   performance report.
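
   The orthogonality argument can be made concrete with a small
   numerical sketch (sample values are assumed; None marks a lost
   packet).  Loss appears only in the loss ratio, and the delay
   statistic is computed only on the packets that arrived:

      # Conditional delay statistics keep Loss and Delay orthogonal.
      def loss_and_mean_delay(delays_s):
          arrived = [d for d in delays_s if d is not None]
          loss_ratio = 1.0 - len(arrived) / len(delays_s)
          mean_delay = sum(arrived) / len(arrived)  # conditional mean
          return loss_ratio, mean_delay

      # loss_and_mean_delay([0.05, None, 0.06, 0.07]) -> (0.25, 0.06)
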
5.1.3.  Delay Variation

   [RFC3393] excludes lost packets from samples, effectively assigning
   an undefined delay to packets that do not arrive in a reasonable
   time.  Section 4.1 of that document describes this specification and
   its rationale (ipdv = inter-packet delay variation in the quote
   below).

   "The treatment of lost packets as having "infinite" or "undefined"
   delay complicates the derivation of statistics for ipdv.
   Specifically, when packets in the measurement sequence are lost,
   simple statistics such as sample mean cannot be computed.  One
   possible approach to handling this problem is to reduce the event
   space by conditioning.  That is, we consider conditional statistics;
   namely we estimate the mean ipdv (or other derivative statistic)
   conditioned on the event that selected packet pairs arrive at the
   destination (within the given timeout).  While this itself is not
   without problems (what happens, for example, when every other packet
   is lost), it offers a way to make some (valid) statements about
   ipdv, at the same time avoiding events with undefined outcomes."

   We note that the argument above applies to all forms of packet delay
   variation that can be constructed using the "selection function"
   concept of [RFC3393].  In recent work, the two main forms of delay
   variation metrics have been compared, and the results are summarized
   in [RFC5481].

5.1.4.  Reordering

   [RFC4737] defines metrics that are based on evaluation of packet
   arrival order, and includes a waiting time to declare a packet lost
   (to exclude such packets from further processing).

   If lost packets were assigned a delay value instead, then the
   reordering metric would declare any packets with infinite delay to
   be reordered, because their sequence numbers will surely be less
   than the "Next Expected" threshold when (or if) they arrive.  But
   this practice would fail to maintain orthogonality between the
   reordering metric and the loss metric.  Confusion can be avoided by
   designating the delay of non-arriving packets as undefined, and
   reserving delay values only for packets that arrive within a
   sufficiently long waiting time.

5.2.  Preferred Statistics

   Today in network characterization, the sample mean is one statistic
   that is almost ubiquitously reported.  It is easily computed and
   understood by virtually everyone in this audience category.  Also,
   the sample is usually filtered on packet arrival, so that the mean
   is based on a conditional distribution.

   The median is another statistic that summarizes a distribution,
   having somewhat different properties from the sample mean.  The
   median is stable in distributions with or without a few outliers.
   However, the median's stability prevents it from indicating when a
   large fraction of the distribution changes value: 50% or more of the
   values would need to change for the median to capture the change.

   Both the median and sample mean have difficulty with bimodal
   distributions.  The median will reside in only one of the modes, and
   the mean may not lie in either mode range.  For this and other
   reasons, additional statistics such as the minimum, maximum, and
   95%-ile have value when summarizing a distribution.

   When both the sample mean and median are available, a comparison
   will sometimes be informative, because these two statistics coincide
   for a perfectly symmetrical delay distribution, and a difference
   between them reveals asymmetry.

   Also, these statistics are generally useful from the Application
   Performance POV, so there is a common set that should satisfy both
   audiences.
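
   To illustrate the comparison suggested above (with assumed sample
   values), a single outlier moves the sample mean noticeably while the
   median barely changes:

      # Mean vs. median on a conditional delay sample (seconds).
      from statistics import mean, median

      base = [0.050, 0.051, 0.052, 0.053, 0.054]
      with_outlier = base + [0.500]
      print(mean(base), median(base))                  # 0.052, 0.052
      print(mean(with_outlier), median(with_outlier))  # ~0.127, 0.0525
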
   Plots of the delay distribution may also be useful when single-value
   statistics indicate that new conditions are present.  An
   empirically-derived probability distribution function will usually
   describe multiple modes more efficiently than any other form of
   result.

5.3.  Summary for Delay

   From the perspectives of:

   1.  application/receiver analysis, where subsequent processing
       depends on whether the packet arrives or times out,

   2.  straightforward network characterization without double-counting
       defects, and

   3.  consistency with the Delay Variation and Reordering metric
       definitions,

   the most efficient practice is to distinguish between truly lost and
   delayed packets with a sufficiently long waiting time, and to
   designate the delay of non-arriving packets as undefined.

6.  Effect of POV on Raw Capacity Metrics

   This section describes the ways that raw capacity metrics can be
   tuned to reflect the preferences of the two audiences, or different
   Points-of-View (POV).  Raw capacity refers to the metrics defined in
   [RFC5136], which do not include restrictions such as data uniqueness
   or flow-control response to congestion.

   In summary, the metrics considered are IP-layer Capacity,
   Utilization (or used capacity), and Available Capacity, for
   individual links and complete paths.  These three metrics form a
   triad: knowing one metric constrains the other two (within their
   allowed range), and knowing two determines the third.  The link
   metrics have another key aspect in common: they are single-
   measurement-point metrics at the egress of a link.  The path
   Capacity and Available Capacity are derived by examining the set of
   single-point link measurements and taking the minimum value, as
   illustrated in the sketch below.
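
   A rough sketch of these relationships, using assumed per-link values
   (illustrative only, not a measurement method):

      # RFC 5136 triad per link: available = capacity - used capacity,
      # so knowing two of the three determines the third.  Path values
      # take the minimum across the links of the path.
      links = [  # (capacity, used) in Mbit/s, assumed values
          (1000.0, 600.0),
          ( 622.0, 100.0),
          (1000.0, 850.0),
      ]
      available = [c - u for (c, u) in links]     # [400, 522, 150]
      path_capacity = min(c for (c, u) in links)  # 622.0 Mbit/s
      path_available = min(available)             # 150.0 Mbit/s
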
6.1.  Type-P Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   type-P categorization has critical relevance in all forms of
   capacity measurement and reporting.  The ability to categorize
   packets based on header fields for assignment to different queues
   and scheduling mechanisms is now commonplace.  When unused resources
   are shared across queues, the conditions in all packet categories
   will affect capacity and related measurements.  This is one source
   of variability in the results that all audiences would prefer to see
   reported in a useful and easily understood way.

   Type-P in OWAMP and TWAMP is essentially confined to the Diffserv
   Codepoint [ref].  The DSCP is the most common qualifier for type-P.

   Each audience will have a set of type-P qualifications and value
   combinations that are of interest.  Measurements and reports SHOULD
   have the flexibility to report performance per type-P and in
   aggregate.

6.2.  a priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more of the raw capacity metrics.  This scenario is quite
   common, and has spawned a substantial number of experimental
   measurement methods [ref to CAIDA survey page, etc.].  Many of these
   methods respect that their users want a result fairly quickly and in
   one trial.  Thus, the measurement interval is kept short (a few
   seconds to a minute).  For long-term reporting, a sample of short-
   term results needs to be summarized.

6.3.  IP-layer Capacity

   For links, this metric's theoretical maximum value can be determined
   from the physical-layer bit rate and the bit rate reduction due to
   the layers between the physical layer and IP.  When measured, this
   metric takes additional factors into account, such as the ability of
   the sending device to process and forward traffic under various
   conditions.  For example, the arrival of routing updates may spawn
   high-priority processes that reduce the sending rate temporarily.
   Thus, the measured capacity of a link will be variable, and the
   maximum capacity observed applies to a specific time, time interval,
   and other relevant circumstances.

   For paths composed of a series of links, it is easy to see how the
   sources of variability for the results grow with each link in the
   path.  Results variability will be discussed in more detail below.

6.4.  IP-layer Utilization

   The ideal metric definition of Link Utilization [RFC5136] is based
   on the actual usage (bits successfully received during a time
   interval) and the Maximum Capacity for the same interval.

   In practice, Link Utilization can be calculated by counting the IP-
   layer (or other layer) octets received over a time interval and
   dividing by the theoretical maximum of octets that could have been
   delivered in the same interval.  A commonly used time interval is 5
   minutes, and this interval has been sufficient to support network
   operations and design for some time.  5 minutes is somewhat long
   compared with the expected download time for web pages, but short
   with respect to large file transfers and TV program viewing.  It is
   fair to say that considerable variability is concealed by reporting
   a single (average) Utilization value for each 5-minute interval.
   Some performance management systems have begun to make 1-minute
   averages available.

   There is also a limit on the smallest useful measurement interval.
   Intervals on the order of the serialization time for a single
   Maximum Transmission Unit (MTU) packet will observe on/off behavior
   and report 100% or 0%.  The smallest interval needs to be some
   multiple of MTU serialization time for averaging to be effective.

6.5.  IP-layer Available Capacity

   The Available Capacity of a link can be calculated using the
   Capacity and Utilization metrics.

   When the Available Capacity of a link or path is estimated through
   some measurement technique, the following parameters SHOULD be
   reported:

   o  Name of, and reference to, the exact method of measurement

   o  IP packet length, in octets (including IP header)

   o  Maximum Capacity that can be assessed in the measurement
      configuration

   o  The time and duration of the measurement

   o  All other parameters specific to the measurement method

   Many methods of Available Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the actual Available Capacity of the link or path.  Therefore, it is
   important to know the capacity value beyond which there will be no
   measured improvement.

   The Application Design audience may have a target capacity value and
   simply wish to assess whether there is sufficient Available
   Capacity.  This case simplifies measurement of link and path
   capacity to some degree, as long as the measurable maximum exceeds
   the target capacity.
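
   As a rough sketch tying Sections 6.4 and 6.5 together (interval
   length, octet count, and link capacity are assumed values):

      # Utilization from an octet count over an interval, and the
      # available capacity it implies (illustrative only).
      def utilization(octets_received, capacity_bps, interval_s):
          return (8.0 * octets_received) / (capacity_bps * interval_s)

      # A 5-minute interval on a 100 Mbit/s link:
      u = utilization(750e6, capacity_bps=100e6, interval_s=300.0)
      available_bps = 100e6 * (1 - u)   # u = 0.2, so 80 Mbit/s
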
6.6.  Variability in Utilization and Avail. Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   Two questions are raised here for further discussion:

   In what ways can Utilization be measured and summarized to describe
   the potential variability in a useful way?

   How can the variability in Available Capacity estimates be reported,
   so that the confidence in the results is also conveyed?

7.  Effect of POV on Restricted Capacity Metrics

   This section describes the ways that restricted capacity metrics can
   be tuned to reflect the preferences of the two audiences, or
   different Points-of-View (POV).  Restricted capacity refers to the
   metrics defined in [RFC3148], which include restrictions such as
   data uniqueness or flow-control response to congestion.

   The primary metric considered is Bulk Transfer Capacity (BTC) for
   complete paths.  [RFC3148] defines

      BTC = data_sent / elapsed_time

   for a connection with congestion-aware flow control, where data_sent
   is the total of unique payload bits (no headers).

   We note that this definition *differs* from the raw capacity
   definition in Section 2.3.1 of [RFC5136], where IP-layer Capacity
   *includes* all bits in the IP header and payload.  This means that
   Restricted Capacity BTC is already operating at a disadvantage when
   compared to the raw capacity at layers below TCP.  Further, there
   are cases where "the IP layer" is encapsulated in another IP layer
   or other form of tunneling protocol, designating more and more of
   the fundamental transport capacity as header bits that are pure
   overhead to the BTC measurement.

   When thinking about the triad of raw capacity metrics, BTC is most
   akin to the "IP-Type-P Available Path Capacity", at least in the
   eyes of a network user who seeks to know what transmission
   performance a path might support.
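
   To make the header-overhead point concrete, consider a sketch with
   assumed sizes (1500-octet IP packets, IPv4 and TCP headers with no
   options); it is illustrative only:

      # BTC counts only unique payload bits, while RFC 5136 IP-layer
      # Capacity counts IP header bits too.
      ip_total = 1500               # octets in each IP packet
      headers = 20 + 20             # assumed IPv4 + TCP header octets
      payload = ip_total - headers  # 1460 octets of unique data

      def btc_bps(unique_octets, elapsed_s):  # per [RFC3148]
          return 8.0 * unique_octets / elapsed_s

      # Best case per packet: payload/ip_total ~= 97.3% of the
      # IP-layer rate; each layer of tunneling adds header octets and
      # lowers this ratio further.
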
7.1.  Type-P Parameter and Type-C Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   considerations for Restricted Capacity are identical to those in the
   raw capacity section on this topic, with the addition that the
   various fields and options in the TCP header MUST be included in the
   description.

   The vast array of TCP flow-control options is not well captured by
   Type-P, because these options do not exist in the TCP header bits.
   Therefore, we introduce a new notion here: TCP Configuration of
   "Type-C".  The elements of Type-C describe all of the settings for
   TCP options and congestion control algorithm variables, including
   the main form of congestion control in use.

7.2.  a priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more BTC metrics.  This scenario is quite common, and has
   spawned a substantial number of experimental measurement methods
   [ref to CAIDA survey page, etc.].  Many of these methods respect
   that their users want a result fairly quickly and in one trial.
   Thus, the measurement interval is kept short (a few seconds to a
   minute).  For long-term reporting, a sample of short-term results
   needs to be summarized.

7.3.  Measurement Interval

   There are limits on a useful measurement interval for BTC.  Three
   factors that influence the interval duration are listed below:

   1.  Measurements may choose to include or exclude the 3-way
       handshake of TCP connection establishment, which requires at
       least 1.5 * RTT and contains both the delay of the path and the
       host processing time for responses.  However, user experience
       includes the 3-way handshake for all new TCP connections.

   2.  Measurements may choose to include or exclude Slow-Start,
       preferring instead to focus on a portion of the transfer that
       represents "equilibrium" <<<< which needs a definition for this
       purpose >>>>.  However, user experience includes the Slow-Start
       for all new TCP connections.

   3.  Measurements may choose to use a fixed block of data to
       transfer, where the size of the block has a relationship to the
       file size of the application of interest.  This approach yields
       variable-size measurement intervals, where a path with faster
       BTC is measured for less time than a slower path, and this has
       implications when path impairments are time-varying, or
       transient.  Users are likely to turn their immediate attention
       elsewhere when a very large file must be transferred, thus they
       do not directly experience such a long transfer -- they see the
       result (success or fail) and possibly an objective measurement
       of the transfer time (which will likely include the 3-way
       handshake, Slow-Start, and application file management
       processing time as well as the BTC).

   Individual measurement intervals may be short or long, but there is
   a need to report the results on a long-term basis that captures the
   BTC variability experienced between each interval.  Consistent BTC
   is a valuable commodity along with the value attained.

7.4.  Bulk Transfer Capacity Reporting

   When the BTC of a link or path is estimated through some measurement
   technique, the following parameters SHOULD be reported:

   o  Name of, and reference to, the exact method of measurement

   o  Maximum Transmission Unit (MTU)

   o  Maximum BTC that can be assessed in the measurement configuration

   o  The time and duration of the measurement

   o  The number of BTC connections used simultaneously

   o  *All* other parameters specific to the measurement method,
      especially the Congestion Control algorithm in use

   See also
   [http://tools.ietf.org/wg/ippm/draft-ietf-ippm-tcp-throughput-tm/].

   Many methods of Bulk Transfer Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the available capacity of the link or path.  Therefore, it is
   important to specify the measured BTC value beyond which there will
   be no measured improvement.

   The Application Design audience may have a target capacity value and
   simply wish to assess whether there is sufficient BTC.  This case
   simplifies measurement of link and path capacity to some degree, as
   long as the measurable maximum exceeds the target capacity.
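
   A sketch of a report record carrying the parameters listed above
   (the field names and values are illustrative, not a defined format):

      # Hypothetical record for one BTC measurement report.
      btc_report = {
          "method": "name and reference to the exact method",
          "mtu_octets": 1500,
          "max_measurable_btc_bps": 950e6,  # configuration ceiling
          "start_time": "2010-10-25T14:00:00Z",
          "duration_s": 60,
          "connections": 4,                 # simultaneous connections
          "congestion_control": "NewReno",  # plus all other options
      }
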
7.5.  Variability in Bulk Transfer Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   Two questions are raised here for further discussion:

   In what ways can BTC be measured and summarized to describe the
   potential variability in a useful way?

   How can the variability in BTC estimates be reported, so that the
   confidence in the results is also conveyed?

8.  Test Streams and Sample Size

   This section discusses two key aspects of measurement that are
   sometimes omitted from the report: the description of the test
   stream on which the measurements are based, and the sample size.

8.1.  Test Stream Characteristics

   Network Characterization has traditionally used Poisson-distributed
   inter-packet spacing, as this provides an unbiased sample.  The
   average inter-packet spacing may be selected to allow observation of
   specific network phenomena.  Other test streams are designed to
   sample some property of the network, such as the presence of
   congestion, link bandwidth, or packet reordering.

   If measuring a network in order to make inferences about
   applications or receiver performance, then there are usually
   efficiencies derived from a test stream that has characteristics
   similar to the sender's.  In some cases, it is essential to
   synthesize the sender stream, as with Bulk Transfer Capacity
   estimates.  In other cases, it may be sufficient to sample with a
   "known bias", e.g., a Periodic stream to estimate real-time
   application performance.

8.2.  Sample Size

   Sample size is directly related to the accuracy of the results, and
   plays a critical role in the report.  Even if only the sample size
   (in terms of number of packets) is given for each value or summary
   statistic, it imparts a notion of the confidence in the result.

   In practice, the sample size will be selected taking both
   statistical and practical factors into account.  Among these factors
   are:

   1.  The estimated variability of the quantity being measured

   2.  The desired confidence in the result (although this may be
       dependent on assumptions about the underlying distribution of
       the measured quantity)

   3.  The effects of active measurement traffic on user traffic

   4.  etc.

   A sample size may sometimes be referred to as "large".  This is a
   relative and qualitative term.  It is preferable to describe what
   one is attempting to achieve with the sample.  For example, stating
   an implication may be helpful: this sample is large enough such that
   a single outlying value at ten times the "typical" sample mean (the
   mean without the outlying value) would influence the mean by no more
   than X.
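
   The example above can be made precise with a short calculation
   (assuming exactly one outlier at ten times the "typical" mean m of
   the other n-1 values): the new mean is m*(n+9)/n, so the outlier
   shifts the mean by a fraction 9/n, and keeping the shift at or below
   X requires n >= 9/X.

      # Minimum sample size so one 10x outlier shifts the mean by <= X.
      def min_sample_size(max_influence):
          return 9.0 / max_influence

      # min_sample_size(0.01) -> 900.0 packets for at most a 1% shift
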
9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

10.  Security Considerations

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656].

11.  Acknowledgements

   The authors thank: Phil Chimento for his suggestion to employ
   conditional distributions for Delay; Steve Konish Jr. for his
   careful review and suggestions; Dave McDysan and Don McLachlan for
   useful comments based on their long experience with measurement and
   reporting; and Matt Zekauskas for suggestions on organizing the memo
   for easier consumption.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
              Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5136]  Chimento, P. and J. Ishac, "Defining Network Capacity",
              RFC 5136, February 2008.

12.2.  Informative References

   [Casner]   "A Fine-Grained View of High Performance Networking",
              NANOG 22 Conf.,
              http://www.nanog.org/mtg-0105/agenda.html, May 20-22,
              2001.

   [Cia03]    "Standardized Active Measurements on a Tier 1 IP
              Backbone", IEEE Communications Mag., pp. 90-97, June
              2003.

   [I-D.ietf-ippm-reporting]
              Shalunov, S. and M. Swany, "Reporting IP Performance
              Metrics to Users", draft-ietf-ippm-reporting-05 (work in
              progress), July 2010.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
              Composition", RFC 5835, April 2010.

   [Y.1540]   ITU-T Recommendation Y.1540, "Internet protocol data
              communication service - IP packet transfer and
              availability performance parameters", December 2002.

   [Y.1541]   ITU-T Recommendation Y.1541, "Network Performance
              Objectives for IP-Based Services", February 2006.

Authors' Addresses

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/

   Gomathi Ramachandran
   AT&T Labs
   200 Laurel Avenue South
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2353
   Email: gomathi@att.com

   Ganga Maguluri
   AT&T Labs
   200 Laurel Avenue
   Middletown, New Jersey  07748
   USA

   Phone: 732-420-2486
   Email: gmaguluri@att.com