Network Working Group                                          A. Morton
Internet-Draft                                           G. Ramachandran
Intended status: Informational                               G. Maguluri
Expires: July 10, 2012                                         AT&T Labs
                                                         January 7, 2012

             Reporting Metrics: Different Points of View
                 draft-ietf-ippm-reporting-metrics-06

Abstract

   Consumers of IP network performance metrics have many different
   uses in mind.  This memo provides "long-term" reporting
   considerations (e.g., days, weeks, or months, as opposed to 10
   seconds), based on analysis of the two key audience points-of-view.
   It describes how the audience categories affect the selection of
   metric parameters and options when seeking information that serves
   their needs.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on July 10, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s)
   controlling the copyright in such materials, this document may not
   be modified outside the IETF Standards Process, and derivative
   works of it may not be created outside the IETF Standards Process,
   except to format it for publication as an RFC or to translate it
   into languages other than English.

Table of Contents

   1.  Introduction
   2.  Purpose and Scope
   3.  Reporting Results
     3.1.  Overview of Metric Statistics
     3.2.  Long-Term Reporting Considerations
   4.  Effect of POV on the Loss Metric
     4.1.  Loss Threshold
       4.1.1.  Network Characterization
       4.1.2.  Application Performance
     4.2.  Errored Packet Designation
     4.3.  Causes of Lost Packets
     4.4.  Summary for Loss
   5.  Effect of POV on the Delay Metric
     5.1.  Treatment of Lost Packets
       5.1.1.  Application Performance
       5.1.2.  Network Characterization
       5.1.3.  Delay Variation
       5.1.4.  Reordering
     5.2.  Preferred Statistics
     5.3.  Summary for Delay
   6.  Effect of POV on Raw Capacity Metrics
     6.1.  Type-P Parameter
     6.2.  a priori Factors
     6.3.  IP-layer Capacity
     6.4.  IP-layer Utilization
     6.5.  IP-layer Available Capacity
     6.6.  Variability in Utilization and Avail. Capacity
       6.6.1.  General Summary of Variability
   7.  Effect of POV on Restricted Capacity Metrics
     7.1.  Type-P Parameter and Type-C Parameter
     7.2.  a priori Factors
     7.3.  Measurement Interval
     7.4.  Bulk Transfer Capacity Reporting
     7.5.  Variability in Bulk Transfer Capacity
   8.  Test Streams and Sample Size
     8.1.  Test Stream Characteristics
     8.2.  Sample Size
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1. Normative References
     12.2. Informative References
   Authors' Addresses

1.  Introduction

   When designing measurements of IP networks and presenting the
   results, knowledge of the audience is a key consideration.  To
   present a useful and relevant portrait of network conditions, one
   must answer the following question:

   "How will the results be used?"

   There are two main audience categories:

   1.  Network Characterization - describes conditions in an IP
       network for quality assurance, troubleshooting, modeling,
       Service Level Agreements (SLA), etc.  This point-of-view looks
       inward, toward the network, where the consumer of the results
       intends to act.

   2.  Application Performance Estimation - describes the network
       conditions in a way that facilitates determining effects on
       user applications, and ultimately on the users themselves.
       This point-of-view looks outward, toward the user(s), accepting
       the network as-is.  This consumer intends to estimate a
       network-dependent aspect of performance, or to design some
       aspect of an application's accommodation of the network.
       (These are *not* application metrics; they are defined at the
       IP layer.)

   This memo considers how these different points-of-view affect both
   the measurement design (parameters and options of the metrics) and
   the statistics reported to serve each audience's needs.

   The IPPM framework [RFC2330] and other RFCs describing IPPM metrics
   provide a background for this memo.

2.  Purpose and Scope

   The purpose of this memo is to clearly delineate two points-of-view
   (POV) for using measurements, and to describe their effects on the
   test design, including the selection of metric parameters and the
   reporting of results.

   The scope of this memo primarily covers the design and reporting of
   the loss and delay metrics [RFC2680] [RFC2679].  It also discusses
   the delay variation [RFC3393] and reordering [RFC4737] metrics
   where applicable.

   With capacity metrics growing in relevance to the industry, the
   memo also covers POV and reporting considerations for metrics
   resulting from the Bulk Transfer Capacity Framework [RFC3148] and
   the Network Capacity Definitions [RFC5136].
   These memos effectively describe two different categories of
   metrics:

   o  [RFC3148], with congestion flow control and the notion of unique
      data bits delivered, and

   o  [RFC5136], using a definition of raw capacity without the
      restrictions of data uniqueness or congestion awareness.

   It might seem at first glance that each of these metrics has an
   obvious audience (Raw = Network Characterization, Restricted =
   Application Performance), but reality is more complex, consistent
   with the overall topic of capacity measurement and reporting.  For
   example, TCP is usually used in Restricted capacity measurement
   methods, while UDP appears in Raw capacity measurement.  The Raw
   and Restricted capacity metrics will be treated in separate
   sections, although they share one common reporting issue:
   representing variability in capacity metric results as part of a
   long-term report.

   Sampling, or the design of the active packet stream that is the
   basis for the measurements, is also discussed.

3.  Reporting Results

   This section gives an overview of recommendations, followed by
   additional considerations for reporting results in the "long-
   term", based on the discussion and conclusions of the major
   sections that follow.

3.1.  Overview of Metric Statistics

   This section gives an overview of reporting recommendations for the
   loss, delay, and delay variation metrics.

   The minimal report on measurements MUST include both Loss and Delay
   Metrics.

   For Packet Loss, the loss ratio defined in [RFC2680] is a
   sufficient starting point, especially the guidance for setting the
   loss threshold waiting time.  We calculate a waiting time in
   Section 4.1.1 that should be sufficient to differentiate between
   packets that are truly lost and packets that have long finite
   delays under general measurement circumstances: 51 seconds.
   Knowledge of specific conditions can help to reduce this threshold,
   but 51 seconds is considered to be manageable in practice.

   We note that a loss ratio calculated according to [Y.1540] would
   exclude errored packets from the numerator.  In practice, the
   difference between these two loss metrics is small, if any,
   depending on whether the last link prior to the destination
   contributes errored packets.

   For Packet Delay, we recommend providing both the mean delay and
   the median delay, with lost packets designated undefined (as
   permitted by [RFC2679]).  Both statistics are based on a
   conditional distribution, where the condition is packet arrival
   prior to a waiting time dT, and dT has been set to take maximum
   packet lifetimes into account, as discussed below.  Using a long dT
   helps to ensure that delay distributions are not truncated.

   For Packet Delay Variation (PDV), the minimum delay of the
   conditional distribution should be used as the reference delay for
   computing PDV according to [Y.1540] or [RFC5481] and [RFC3393].  A
   useful value to report is a pseudo range of delay variation, based
   on calculating the difference between a high percentile of delay
   and the minimum delay.  For example, the 99.9%-ile minus the
   minimum will give a value that can be compared with objectives in
   [Y.1541].
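   As an informal illustration of these recommendations (not part of
   any cited specification), the following minimal Python sketch
   computes the loss ratio, the conditional mean and median delay,
   and the pseudo range of delay variation from a sample of one-way
   delays, where a lost packet is represented by the value None:

      import statistics

      def summarize(delays, high_quantile=0.999):
          # Condition on arrival: lost packets (None) are excluded,
          # not assigned infinite delay (see Section 5.1).
          arrived = sorted(d for d in delays if d is not None)
          loss_ratio = 1 - len(arrived) / len(delays)
          # High percentile by simple rank; a real report would state
          # the interpolation method used.
          idx = min(int(high_quantile * len(arrived)),
                    len(arrived) - 1)
          return {
              "loss_ratio": loss_ratio,
              "mean_delay": statistics.mean(arrived),
              "median_delay": statistics.median(arrived),
              # Pseudo range of PDV: high percentile minus minimum.
              "pdv_pseudo_range": arrived[idx] - arrived[0],
          }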
3.2.  Long-Term Reporting Considerations

   [I-D.ietf-ippm-reporting] describes methods to conduct measurements
   and report the results on a near-immediate time scale (10 seconds,
   which we consider to be "short-term").

   Measurement intervals and reporting intervals need not be the same
   length.  Sometimes, the user is only concerned with the performance
   levels achieved over a relatively long interval of time (e.g.,
   days, weeks, or months, as opposed to 10 seconds).  However, there
   can be risks involved with running a measurement continuously over
   a long period without recording intermediate results:

   o  Temporary power failure may cause loss of all the results to
      date.

   o  Measurement system timing synchronization signals may experience
      a temporary outage, causing sub-sets of measurements to be in
      error or invalid.

   o  Maintenance may be necessary on the measurement system, or on
      its connectivity to the network under test.

   For these and other reasons, such as

   o  the constraint to collect measurements on intervals similar to
      user session length, or

   o  the dual use of measurements in monitoring activities where
      results are needed on intervals of a few minutes,

   there is value in conducting measurements on intervals that are
   much shorter than the reporting interval.

   There are several approaches for aggregating a series of
   measurement results over time in order to make a statement about
   the longer reporting interval.  One approach requires the storage
   of all metric singletons collected throughout the reporting
   interval, even though the measurement interval stops and starts
   many times.

   Another approach is described in [RFC5835] as "temporal
   aggregation".  This approach would estimate the results for the
   reporting interval based on many individual measurement interval
   statistics (results) alone.  The result would ideally appear in the
   same form as though a continuous measurement had been conducted.  A
   memo to address the details of temporal aggregation is yet to be
   prepared.

   Yet another approach requires a numerical objective for the metric,
   and the results of each measurement interval are compared with the
   objective.  Every measurement interval where the results meet the
   objective contributes to the fraction of time with performance as
   specified.  When the reporting interval contains many measurement
   intervals, it is possible to present the results as "metric A was
   less than or equal to objective X during Y% of time" (see the
   sketch at the end of this section).

   NOTE that numerical thresholds of acceptability are not set in IETF
   performance work and are explicitly excluded from the IPPM charter.

   In all measurement, it is important to avoid unintended
   synchronization with network events.  This topic is treated in
   [RFC2330] for Poisson-distributed inter-packet time streams, and in
   [RFC3432] for Periodic streams.  Both avoid synchronization through
   use of random start times.

   There are network conditions where it is simply more useful to
   report the connectivity status of the Source-Destination path, and
   to distinguish time intervals where connectivity can be
   demonstrated from other time intervals (where connectivity does not
   appear to exist).  [RFC2678] specifies a number of one-way and two-
   way connectivity metrics of increasing complexity.  In this memo,
   we RECOMMEND that long-term reporting of loss, delay, and other
   metrics be limited to time intervals where connectivity can be
   demonstrated, and that other intervals be summarized as the
   percentage of time where connectivity does not appear to exist.  We
   note that this same approach has been adopted in ITU-T
   Recommendation [Y.1540], where performance parameters are only
   valid during periods of service "availability" (evaluated according
   to a function based on packet loss, and sustained periods of loss
   ratio greater than a threshold are declared "unavailable").
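   Returning to the objective-comparison approach above, a brief
   sketch (the objective X is supplied by the user of the results;
   per the NOTE, the IETF sets no such threshold):

      def fraction_meeting_objective(interval_results, objective):
          # interval_results: one statistic per measurement interval,
          # e.g., a loss ratio for each minute of a month-long report.
          # Returns Y such that "metric A was <= objective X during
          # Y% of time" (weighting each interval equally).
          met = sum(1 for r in interval_results if r <= objective)
          return 100.0 * met / len(interval_results)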
4.  Effect of POV on the Loss Metric

   This section describes the ways in which the Loss metric can be
   tuned to reflect the preferences of the two audience categories, or
   different POV.  The waiting time to declare a packet lost, or loss
   threshold, is one area where there would appear to be a difference,
   but the ability to post-process the results may resolve it.

4.1.  Loss Threshold

   RFC 2680 [RFC2680] defines the concept of a waiting time for
   packets to arrive, beyond which they are declared lost.  The text
   of the RFC declines to recommend a value, instead saying that "good
   engineering, including an understanding of packet lifetimes, will
   be needed in practice."  Later, in the methodology, reasons are
   given for waiting "a reasonable period of time", with the
   definition of "reasonable" left intentionally vague.

4.1.1.  Network Characterization

   Practical measurement experience has shown that unusual network
   circumstances can cause long delays.  One such circumstance is when
   routing loops form during IGP re-convergence following a failure or
   drastic link cost change.  Packets will loop between two routers
   until new routes are installed, or until the IPv4 Time-to-Live
   (TTL) field (or the IPv6 Hop Limit) decrements to zero.  Very long
   delays on the order of several seconds have been measured [Casner]
   [Cia03].

   Therefore, network characterization activities prefer a long
   waiting time in order to distinguish these events from other causes
   of loss (such as packet discard at a full queue, or tail drop).
   This way, the metric design helps to distinguish more reliably
   between packets that might yet arrive and those that are no longer
   traversing the network.

   It is possible to calculate a worst-case waiting time, assuming
   that a routing loop is the cause.  We model the path between Source
   and Destination as a series of delays in links (t) and queues (q),
   as these two are the dominant contributors to delay.  The normal
   path delay across n hops without encountering a loop, D, is

                                n
                               ---
                               \
                    D = t   +   >   t  + q
                         0     /     i    i
                               ---
                              i = 1

                      Figure 1: Normal Path Delay

   and the time spent in the loop with L hops, R, is

                             i+L-1
                              ---
                              \                            (TTL - n)
                    R = C      >   t  + q    where C    =  ---------
                              /     i    i          max        L
                              ---
                               i

                Figure 2: Delay due to Rotations in a Loop

   and where C is the number of times a packet circles the loop.

   If we take the delays of all links and queues as 100 ms each, with
   TTL=255, the number of hops n=5, and the hops in the loop L=4, then

      D = 1.1 sec and R ~= 50 sec, and D + R ~= 51.1 seconds

   We note that link delays of 100 ms would span most continents, and
   a constant queue length of 100 ms is also very generous.  When a
   loop occurs, it is almost certain to be resolved in 10 seconds or
   less.  The value calculated above is an upper limit for almost any
   realistic circumstance.
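   The arithmetic above is easy to check in a few lines; this
   hypothetical Python helper simply transcribes Figures 1 and 2 with
   every link delay t and queue delay q taken as equal:

      def worst_case_wait(t=0.1, q=0.1, ttl=255, n=5, loop_hops=4):
          # Worst-case waiting time when a routing loop is the cause.
          d = t + n * (t + q)          # normal path delay, Figure 1
          c = (ttl - n) / loop_hops    # loop rotations (C at its max)
          r = c * loop_hops * (t + q)  # time in the loop, Figure 2
          return d + r                 # ~51.1 seconds with defaults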
   A waiting time threshold parameter, dT, set consistent with this
   calculation would not truncate the delay distribution (possibly
   causing a change in its mathematical properties), because the
   packets that might arrive have been given sufficient time to
   traverse the network.

   It is worth noting that packets that are stored and deliberately
   forwarded at a much later time constitute a replay attack on the
   measurement system, and are beyond the scope of normal performance
   reporting.

4.1.2.  Application Performance

   Fortunately, application performance estimation activities are not
   adversely affected by the estimated worst-case transfer time.
   Although the designer's tendency might be to set the Loss Threshold
   at a value equivalent to a particular application's threshold, this
   specific threshold can be applied when post-processing the
   measurements.  A shorter waiting time can be enforced by locating
   packets with delays longer than the application's threshold, and
   re-designating such packets as lost.  Thus, the measurement system
   can use a single loss threshold and support both application and
   network performance POVs simultaneously.

4.2.  Errored Packet Designation

   RFC 2680 designates packets that arrive containing errors as lost
   packets.  Many packets that are corrupted by bit errors are
   discarded within the network and do not reach their intended
   destination.

   This is consistent with applications that would check the payload
   integrity at higher layers and discard the packet.  However, some
   applications prefer to deal with errored payloads on their own;
   for them, even a corrupted payload is better than no packet at all.

   To address this possibility, and to make network characterization
   more complete, it is recommended to distinguish between packets
   that do not arrive (lost) and errored packets that arrive
   (conditionally lost).

4.3.  Causes of Lost Packets

   Although many measurement systems use a waiting time to determine
   if a packet is lost or not, most of the waiting is in vain.  The
   packets are no longer traversing the network, and have not reached
   their destination.

   There are many causes of packet loss, including:

   1.  Queue drop, or discard

   2.  Corruption of the IP header, or other essential header info

   3.  TTL expiration (or use of a TTL value that is too small)

   4.  Link or router failure

   After waiting sufficient time, packet loss can probably be
   attributed to one of these causes.

4.4.  Summary for Loss

   Given that measurement post-processing is possible (even encouraged
   in the definitions of IPPM metrics), measurements of loss can
   easily serve both points of view:

   o  Use a long waiting time to serve network characterization, and
      revise results for specific application delay thresholds as
      needed (see the sketch below).

   o  Distinguish between errored packets and lost packets when
      possible to aid network characterization, and combine the
      results for application performance if appropriate.
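   The post-processing step in the first bullet amounts to a one-pass
   filter; a minimal sketch, assuming lost packets are already marked
   None in the measured sample:

      def apply_app_threshold(delays, app_threshold):
          # Re-designate packets that arrived later than a specific
          # application's threshold as lost (Sections 4.1.2 and 4.4),
          # starting from a sample measured with the long network-
          # characterization waiting time.
          return [None if d is None or d > app_threshold else d
                  for d in delays]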
470 " >>The *Type-P-One-way-Delay* from Src to Dst at T is undefined 471 (informally, infinite)<< means that Src sent the first bit of a 472 Type-P packet to Dst at wire-time T and that Dst did not receive that 473 packet." 475 It is an accepted, but informal practice to assign infinite delay to 476 lost packets. We next look at how these two different treatments 477 align with the needs of measurement consumers who wish to 478 characterize networks or estimate application performance. Also, we 479 look at the way that lost packets have been treated in other metrics: 480 delay variation and reordering. 482 5.1.1. Application Performance 484 Applications need to perform different functions, dependent on 485 whether or not each packet arrives within some finite tolerance. In 486 other words, a receivers' packet processing takes one of two 487 directions (or "forks" in the road): 489 o Packets that arrive within expected tolerance are handled by 490 processes that remove headers, restore smooth delivery timing (as 491 in a de-jitter buffer), restore sending order, check for errors in 492 payloads, and many other operations. 494 o Packets that do not arrive when expected spawn other processes 495 that attempt recovery from the apparent loss, such as 496 retransmission requests, loss concealment, or forward error 497 correction to replace the missing packet. 499 So, it is important to maintain a distinction between packets that 500 actually arrive, and those that do not. Therefore, it is preferable 501 to leave the delay of lost packets undefined, and to characterize the 502 delay distribution as a conditional distribution (conditioned on 503 arrival). 505 5.1.2. Network Characterization 507 In this discussion, we assume that both loss and delay metrics will 508 be reported for network characterization (at least). 510 Assume packets that do not arrive are reported as Lost, usually as a 511 fraction of all sent packets. If these lost packets are assigned 512 undefined delay, then network's inability to deliver them (in a 513 timely way) is captured only in the loss metric when we report 514 statistics on the Delay distribution conditioned on the event of 515 packet arrival (within the Loss waiting time threshold). We can say 516 that the Delay and Loss metrics are Orthogonal, in that they convey 517 non-overlapping information about the network under test. 519 However, if we assign infinite delay to all lost packets, then: 521 o The delay metric results are influenced both by packets that 522 arrive and those that do not. 524 o The delay singleton and the loss singleton do not appear to be 525 orthogonal (Delay is finite when Loss=0, Delay is infinite when 526 Loss=1). 528 o The network is penalized in both the loss and delay metrics, 529 effectively double-counting the lost packets. 531 As further evidence of overlap, consider the Cumulative Distribution 532 Function (CDF) of Delay when the value positive infinity is assigned 533 to all lost packets. Figure 3 shows a CDF where a small fraction of 534 packets are lost. 536 1 | - - - - - - - - - - - - - - - - - -+ 537 | | 538 | _..----'''''''''''''''''''' 539 | ,-'' 540 | ,' 541 | / Mass at 542 | / +infinity 543 | / = fraction 544 || lost 545 |/ 546 0 |_____________________________________ 548 0 Delay +o0 550 Figure 3: Cumulative Distribution Function for Delay when Loss = 551 +Infinity 553 We note that a Delay CDF that is conditioned on packet arrival would 554 not exhibit this apparent overlap with loss. 
   Although infinity is a familiar mathematical concept, it is
   somewhat disconcerting to see any time-related metric reported as
   infinity, in the opinion of the authors.  Questions are bound to
   arise, and tend to detract from the goal of informing the consumer
   with a performance report.

5.1.3.  Delay Variation

   [RFC3393] excludes lost packets from samples, effectively assigning
   an undefined delay to packets that do not arrive in a reasonable
   time.  Section 4.1 of that document describes this specification
   and its rationale (ipdv = inter-packet delay variation in the quote
   below).

      "The treatment of lost packets as having "infinite" or
      "undefined" delay complicates the derivation of statistics for
      ipdv.  Specifically, when packets in the measurement sequence
      are lost, simple statistics such as sample mean cannot be
      computed.  One possible approach to handling this problem is to
      reduce the event space by conditioning.  That is, we consider
      conditional statistics; namely we estimate the mean ipdv (or
      other derivative statistic) conditioned on the event that
      selected packet pairs arrive at the destination (within the
      given timeout).  While this itself is not without problems (what
      happens, for example, when every other packet is lost), it
      offers a way to make some (valid) statements about ipdv, at the
      same time avoiding events with undefined outcomes."

   We note that the argument above applies to all forms of packet
   delay variation that can be constructed using the "selection
   function" concept of [RFC3393].  In recent work, the two main forms
   of delay variation metrics have been compared, and the results are
   summarized in [RFC5481].

5.1.4.  Reordering

   [RFC4737] defines metrics that are based on evaluation of packet
   arrival order, and includes a waiting time to declare a packet lost
   (to exclude such packets from further processing).

   If lost packets were assigned infinite delay, then the reordering
   metric would declare them reordered, because their sequence numbers
   will surely be less than the "Next Expected" threshold when (or if)
   they arrive.  But this practice would fail to maintain
   orthogonality between the reordering metric and the loss metric.
   Confusion can be avoided by designating the delay of non-arriving
   packets as undefined, and by reserving delay values only for
   packets that arrive within a sufficiently long waiting time.

5.2.  Preferred Statistics

   Today, in network characterization, the sample mean is one
   statistic that is almost ubiquitously reported.  It is easily
   computed and understood by virtually everyone in this audience
   category.  Also, the sample is usually filtered on packet arrival,
   so that the mean is based on a conditional distribution.

   The median is another statistic that summarizes a distribution,
   having somewhat different properties from the sample mean.  The
   median is stable in distributions with or without a few outliers.
   However, the median's stability prevents it from indicating when a
   large fraction of the distribution changes value.  50% or more of
   the values would need to change for the median to capture the
   change.

   Both the median and the sample mean have difficulty with bimodal
   distributions: the median will reside in only one of the modes, and
   the mean may not lie in either mode range.  For this and other
   reasons, additional statistics such as the minimum, maximum, and
   95%-ile have value when summarizing a distribution.
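   A small sketch of this preferred summary, computing each statistic
   over the conditional (arrived-only) sample; the rank-based
   percentile shown is one of several common conventions:

      import statistics

      def delay_report(arrived):
          s = sorted(arrived)
          pct95 = s[min(int(0.95 * len(s)), len(s) - 1)]
          return {
              "min": s[0],
              "median": statistics.median(s),  # stable vs. outliers
              "mean": statistics.mean(s),      # pulled by outliers
              "95%-ile": pct95,
              "max": s[-1],
          }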
   When both the sample mean and the median are available, a
   comparison will sometimes be informative, because these two
   statistics are equal only when the delay distribution is perfectly
   symmetrical.

   Also, these statistics are generally useful from the Application
   Performance POV, so there is a common set that should satisfy both
   audiences.

   Plots of the delay distribution may also be useful when single-
   value statistics indicate that new conditions are present.  An
   empirically derived probability distribution function will usually
   describe multiple modes more efficiently than any other form of
   result.

5.3.  Summary for Delay

   From the perspectives of:

   1.  application/receiver analysis, where subsequent processing
       depends on whether the packet arrives or times out,

   2.  straightforward network characterization without double-
       counting defects, and

   3.  consistency with the delay variation and reordering metric
       definitions,

   the most efficient practice is to distinguish between truly lost
   and delayed packets with a sufficiently long waiting time, and to
   designate the delay of non-arriving packets as undefined.

6.  Effect of POV on Raw Capacity Metrics

   This section describes the ways that raw capacity metrics can be
   tuned to reflect the preferences of the two audiences, or different
   Points-of-View (POV).  Raw capacity refers to the metrics defined
   in [RFC5136], which do not include restrictions such as data
   uniqueness or flow-control response to congestion.

   In summary, the metrics considered are IP-layer Capacity,
   Utilization (or used capacity), and Available Capacity, for
   individual links and complete paths.  These three metrics form a
   triad: knowing one metric constrains the other two (within their
   allowed range), and knowing two determines the third.  The link
   metrics have another key aspect in common: they are single-
   measurement-point metrics at the egress of a link.  The path
   Capacity and Available Capacity are derived by examining the set of
   single-point link measurements and taking the minimum value.

6.1.  Type-P Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   type-P categorization has critical relevance in all forms of
   capacity measurement and reporting.  The ability to categorize
   packets based on header fields, for assignment to different queues
   and scheduling mechanisms, is now commonplace.  When unused
   resources are shared across queues, the conditions in all packet
   categories will affect capacity and related measurements.  This is
   one source of variability in the results that all audiences would
   prefer to see reported in a useful and easily understood way.

   Type-P in OWAMP and TWAMP is essentially confined to the Diffserv
   Codepoint (DSCP) [RFC4656], and the DSCP is the most common
   qualifier for type-P.

   Each audience will have a set of type-P qualifications and value
   combinations that are of interest.  Measurements and reports SHOULD
   have the flexibility to report performance for each type-P of
   interest as well as in aggregate.

6.2.  a priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.
   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more of the raw capacity metrics.  This scenario is quite
   common, and has spawned a substantial number of experimental
   measurement methods [ref to CAIDA survey page, etc.].  Many of
   these methods respect that their users want a result fairly quickly
   and in one trial.  Thus, the measurement interval is kept short (a
   few seconds to a minute).  For long-term reporting, a sample of
   short-term results needs to be summarized.

6.3.  IP-layer Capacity

   For links, this metric's theoretical maximum value can be
   determined from the physical-layer bit rate and the bit-rate
   reduction due to the layers between the physical layer and IP.
   When measured, this metric takes additional factors into account,
   such as the ability of the sending device to process and forward
   traffic under various conditions.  For example, the arrival of
   routing updates may spawn high-priority processes that reduce the
   sending rate temporarily.  Thus, the measured capacity of a link
   will be variable, and the maximum capacity observed applies to a
   specific time, time interval, and other relevant circumstances.

   For paths composed of a series of links, it is easy to see how the
   sources of variability for the results grow with each link in the
   path.  Results variability will be discussed in more detail below.

6.4.  IP-layer Utilization

   The ideal metric definition of Link Utilization [RFC5136] is based
   on the actual usage (bits successfully received during a time
   interval) and the Maximum Capacity for the same interval.

   In practice, Link Utilization can be calculated by counting the
   IP-layer (or other-layer) octets received over a time interval and
   dividing by the theoretical maximum of octets that could have been
   delivered in the same interval.  A commonly used time interval is 5
   minutes, and this interval has been sufficient to support network
   operations and design for some time.  Five minutes is somewhat long
   compared with the expected download time for web pages, but short
   with respect to large file transfers and TV program viewing.  It is
   fair to say that considerable variability is concealed by reporting
   a single (average) Utilization value for each 5-minute interval.
   Some performance management systems have begun to make 1-minute
   averages available.

   There is also a limit on the smallest useful measurement interval.
   Intervals on the order of the serialization time for a single
   Maximum Transmission Unit (MTU) packet will observe on/off behavior
   and report 100% or 0%.  The smallest interval needs to be some
   multiple of the MTU serialization time for averaging to be
   effective.
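   The practical calculation above reduces to a one-line ratio; a
   sketch with an invented example, where capacity_bps stands for the
   theoretical maximum IP-layer bit rate of the link:

      def link_utilization(octets_received, interval_sec,
                           capacity_bps):
          # Bits actually received divided by the most the link could
          # have delivered in the same interval (Section 6.4).
          return (8 * octets_received / interval_sec) / capacity_bps

      # Example: 2.7e9 octets in a 5-minute interval on a 100 Mbit/s
      # link: 8 * 2.7e9 / 300 / 100e6 = 0.72, i.e., 72% average
      # utilization, concealing any variation within the interval.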
6.5.  IP-layer Available Capacity

   The Available Capacity of a link can be calculated using the
   Capacity and Utilization metrics.

   When the Available Capacity of a link or path is estimated through
   some measurement technique, the following parameters SHOULD be
   reported:

   o  Name of, and reference to, the exact method of measurement

   o  IP packet length, in octets (including the IP header)

   o  Maximum Capacity that can be assessed in the measurement
      configuration

   o  The time and duration of the measurement

   o  All other parameters specific to the measurement method

   Many methods of Available Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the actual Available Capacity of the link or path.  Therefore, it
   is important to know the capacity value beyond which there will be
   no measured improvement.

   The Application Design audience may have a target capacity value
   and simply wish to assess whether there is sufficient Available
   Capacity.  This case simplifies measurement of link and path
   capacity to some degree, as long as the measurable maximum exceeds
   the target capacity.

6.6.  Variability in Utilization and Avail. Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   What ways can Utilization be measured and summarized to describe
   the potential variability in a useful way?

   How can the variability in Available Capacity estimates be
   reported, so that the confidence in the results is also conveyed?

   We suggest some methods below.

6.6.1.  General Summary of Variability

   With a set of singleton Utilization or Available Capacity
   estimates, each representing a time interval needed to ascertain
   the estimate, we seek to describe the variation over the set of
   singletons, as though reporting summary statistics of a
   distribution.  Three useful summary statistics are:

   o  Minimum,

   o  Maximum, and

   o  Range

   An alternate way to represent the Range is as the ratio of the
   Maximum to the Minimum value.  This yields an easily understandable
   statistic to describe the range observed.  For example, when
   Maximum = 3*Minimum, the Max/Min Ratio is 3, and users may see
   variability of this order.  On the other hand, Capacity estimates
   with a Max/Min Ratio near 1 are quite consistent and near the
   central measure or statistic reported.

   For an on-going series of singleton estimates, a moving average of
   n estimates may provide a single-value estimate to more easily
   distinguish substantial changes in performance over time.  For
   example, in a window of n singletons observed in time interval t, a
   percentage change of x% is declared to be a substantial change and
   reported as an exception.

   Often, the most informative summary of the results is a two-axis
   plot rather than a table of statistics, where time is plotted on
   the x-axis and the singleton value on the y-axis.  The time-series
   plot can illustrate sudden changes in an otherwise stable range,
   identify bi-modality easily, and help quickly assess correlation
   with other time series.  Plots of the frequency of the singleton
   values are likewise useful tools to visualize the variation.
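   A sketch of the two numeric summaries suggested above; the window
   size n and the exception threshold x are illustrative choices, not
   recommendations:

      def variability_summary(estimates):
          # Min, Max, Range, and Max/Min Ratio over a set of capacity
          # or utilization singletons; a ratio near 1 means the
          # results are quite consistent (Section 6.6.1).
          lo, hi = min(estimates), max(estimates)
          ratio = hi / lo if lo > 0 else float("inf")
          return {"min": lo, "max": hi, "range": hi - lo,
                  "max_min_ratio": ratio}

      def exceptions(estimates, n=5, x=20.0):
          # Flag changes of more than x% between successive moving
          # averages of n singletons as reportable exceptions.
          # Assumes positive-valued estimates.
          avg = [sum(estimates[i:i + n]) / n
                 for i in range(len(estimates) - n + 1)]
          return [i for i in range(1, len(avg))
                  if abs(avg[i] - avg[i - 1]) / avg[i - 1] * 100 > x]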
7.  Effect of POV on Restricted Capacity Metrics

   This section describes the ways that restricted capacity metrics
   can be tuned to reflect the preferences of the two audiences, or
   different Points-of-View (POV).  Restricted capacity refers to the
   metrics defined in [RFC3148], which include restrictions such as
   data uniqueness and flow-control response to congestion.

   The primary metric considered is Bulk Transfer Capacity (BTC) for
   complete paths.  [RFC3148] defines

      BTC = data_sent / elapsed_time

   for a connection with congestion-aware flow control, where
   data_sent is the total of unique payload bits (no headers).

   We note that this definition *differs* from the raw capacity
   definition in Section 2.3.1 of [RFC5136], where IP-layer Capacity
   *includes* all bits in the IP header and payload.  This means that
   Restricted Capacity BTC is already operating at a disadvantage when
   compared to the raw capacity at layers below TCP.  Further, there
   are cases where "the IP layer" is encapsulated in another IP layer
   or other form of tunneling protocol, designating more and more of
   the fundamental transport capacity as header bits that are pure
   overhead to the BTC measurement.

   When thinking about the triad of raw capacity metrics, BTC is most
   akin to the "IP-Type-P Available Path Capacity", at least in the
   eyes of a network user who seeks to know what transmission
   performance a path might support.

7.1.  Type-P Parameter and Type-C Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   considerations for Restricted Capacity are identical to those in
   the raw capacity section on this topic, with the addition that the
   various fields and options in the TCP header MUST be included in
   the description.

   The vast array of TCP flow-control options is not well captured by
   Type-P, because these options do not exist in the TCP header bits.
   Therefore, we introduce a new notion here: TCP Configuration of
   "Type-C".  The elements of Type-C describe all of the settings for
   TCP options and congestion control algorithm variables, including
   the main form of congestion control in use.

7.2.  a priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more BTC metrics.  This scenario is quite common, and has
   spawned a substantial number of experimental measurement methods
   [ref to CAIDA survey page, etc.].  Many of these methods respect
   that their users want a result fairly quickly and in one trial.
   Thus, the measurement interval is kept short (a few seconds to a
   minute).  For long-term reporting, a sample of short-term results
   needs to be summarized.

7.3.  Measurement Interval

   There are limits on a useful measurement interval for BTC.  Three
   factors that influence the interval duration are listed below:

   1.  Measurements may choose to include or exclude the 3-way
       handshake of TCP connection establishment, which requires at
       least 1.5 * RTT and contains both the delay of the path and the
       host processing time for responses.  However, user experience
       includes the 3-way handshake for all new TCP connections.
   2.  Measurements may choose to include or exclude Slow-Start,
       preferring instead to focus on a portion of the transfer that
       represents "equilibrium" <<<< which needs a definition for this
       purpose >>>>.  However, user experience includes the Slow-Start
       for all new TCP connections.

   3.  Measurements may choose to use a fixed block of data to
       transfer, where the size of the block has a relationship to the
       file size of the application of interest.  This approach yields
       variable-size measurement intervals, where a path with faster
       BTC is measured for less time than a slower path, and this has
       implications when path impairments are time-varying or
       transient.  Users are likely to turn their immediate attention
       elsewhere when a very large file must be transferred; thus,
       they do not directly experience such a long transfer -- they
       see the result (success or failure) and possibly an objective
       measurement of the transfer time (which will likely include the
       3-way handshake, Slow-Start, and application file management
       processing time, as well as the BTC).

   Individual measurement intervals may be short or long, but there is
   a need to report the results on a long-term basis that captures the
   BTC variability experienced between each interval.  Consistent BTC
   is a valuable commodity, along with the value attained.

7.4.  Bulk Transfer Capacity Reporting

   When the BTC of a link or path is estimated through some
   measurement technique, the following parameters SHOULD be reported:

   o  Name of, and reference to, the exact method of measurement

   o  Maximum Transmission Unit (MTU)

   o  Maximum BTC that can be assessed in the measurement
      configuration

   o  The time and duration of the measurement

   o  The number of BTC connections used simultaneously

   o  *All* other parameters specific to the measurement method,
      especially the Congestion Control algorithm in use

   See also
   [http://tools.ietf.org/wg/ippm/draft-ietf-ippm-tcp-throughput-tm/].

   Many methods of Bulk Transfer Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the available capacity of the link or path.  Therefore, it is
   important to specify the measured BTC value beyond which there will
   be no measured improvement.

   The Application Design audience may have a target capacity value
   and simply wish to assess whether there is sufficient BTC.  This
   case simplifies measurement of link and path capacity to some
   degree, as long as the measurable maximum exceeds the target
   capacity.

7.5.  Variability in Bulk Transfer Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   With two questions looming:

   1.  What ways can BTC be measured and summarized to describe the
       potential variability in a useful way?

   2.  How can the variability in BTC estimates be reported, so that
       the confidence in the results is also conveyed?

   we suggest the methods of Section 6.6.1 above, plus the additional
   results presentations given in [RFC6349].
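   For concreteness, a sketch of the per-interval BTC singleton (per
   the [RFC3148] definition quoted at the start of this section),
   whose series could then be summarized with the methods of Section
   6.6.1:

      def btc(unique_payload_bits, elapsed_seconds):
          # Bulk Transfer Capacity per [RFC3148]: unique payload bits
          # delivered (no headers, no retransmitted duplicates)
          # divided by the elapsed time of the congestion-aware
          # transfer.  Report alongside the parameters of Section 7.4.
          return unique_payload_bits / elapsed_seconds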
8.  Test Streams and Sample Size

   This section discusses two key aspects of measurement that are
   sometimes omitted from the report: the description of the test
   stream on which the measurements are based, and the sample size.

8.1.  Test Stream Characteristics

   Network Characterization has traditionally used Poisson-distributed
   inter-packet spacing, as this provides an unbiased sample.  The
   average inter-packet spacing may be selected to allow observation
   of specific network phenomena.  Other test streams are designed to
   sample some property of the network, such as the presence of
   congestion, link bandwidth, or packet reordering.

   If measuring a network in order to make inferences about
   applications or receiver performance, then there are usually
   efficiencies derived from a test stream that has characteristics
   similar to the sender's traffic.  In some cases, it is essential to
   synthesize the sender stream, as with Bulk Transfer Capacity
   estimates.  In other cases, it may be sufficient to sample with a
   "known bias", e.g., a Periodic stream to estimate real-time
   application performance.

8.2.  Sample Size

   Sample size is directly related to the accuracy of the results, and
   it plays a critical role in the report.  Even if only the sample
   size (in terms of number of packets) is given for each value or
   summary statistic, it imparts a notion of the confidence in the
   result.

   In practice, the sample size will be selected taking both
   statistical and practical factors into account.  Among these
   factors are:

   1.  The estimated variability of the quantity being measured

   2.  The desired confidence in the result (although this may depend
       on assumptions about the underlying distribution of the
       measured quantity)

   3.  The effects of active measurement traffic on user traffic

   4.  etc.

   A sample size may sometimes be referred to as "large".  This is a
   relative and qualitative term.  It is preferable to describe what
   one is attempting to achieve with their sample.  For example,
   stating an implication may be helpful: this sample is large enough
   such that a single outlying value at ten times the "typical" sample
   mean (the mean without the outlying value) would influence the mean
   by no more than X.
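   Working out that closing example: if one of n values sits at ten
   times the typical mean m, the sample mean becomes
   ((n - 1)*m + 10*m)/n, a relative increase of 9/n.  A sketch:

      def outlier_influence(n, factor=10.0):
          # Relative shift of the sample mean caused by one outlier
          # at `factor` times the typical mean (Section 8.2):
          # ((n - 1)*m + factor*m)/n - m = (factor - 1)*m/n.
          return (factor - 1) / n

      # Example: with n = 1000, one value at 10x the typical mean
      # moves the sample mean by 9/1000, i.e., 0.9%.  To keep the
      # influence below X, choose n >= (factor - 1)/X.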
9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as
   an RFC.

10.  Security Considerations

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656].

11.  Acknowledgements

   The authors thank Phil Chimento for his suggestion to employ
   conditional distributions for Delay, Steve Konish Jr. for his
   careful review and suggestions, Dave McDysan and Don McLachlan for
   useful comments based on their long experience with measurement and
   reporting, and Matt Zekauskas for suggestions on organizing the
   memo for easier consumption.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams",
              RFC 3432, November 2002.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and
              M. Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics",
              RFC 4737, November 2006.

   [RFC5136]  Chimento, P. and J. Ishac, "Defining Network Capacity",
              RFC 5136, February 2008.

12.2.  Informative References

   [Casner]   "A Fine-Grained View of High Performance Networking",
              NANOG 22 Conf., May 20-22, 2001,
              http://www.nanog.org/mtg-0105/agenda.html.

   [Cia03]    "Standardized Active Measurements on a Tier 1 IP
              Backbone", IEEE Communications Mag., pp. 90-97,
              June 2003.

   [I-D.ietf-ippm-reporting]
              Shalunov, S. and M. Swany, "Reporting IP Performance
              Metrics to Users", draft-ietf-ippm-reporting-06 (work
              in progress), March 2011.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
              Composition", RFC 5835, April 2010.

   [RFC6349]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
              "Framework for TCP Throughput Testing", RFC 6349,
              August 2011.

   [Y.1540]   ITU-T Recommendation Y.1540, "Internet protocol data
              communication service - IP packet transfer and
              availability performance parameters", December 2002.

   [Y.1541]   ITU-T Recommendation Y.1541, "Network Performance
              Objectives for IP-Based Services", February 2006.

Authors' Addresses

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/

   Gomathi Ramachandran
   AT&T Labs
   200 Laurel Avenue South
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2353
   Email: gomathi@att.com

   Ganga Maguluri
   AT&T Labs
   200 Laurel Avenue
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2486
   Email: gmaguluri@att.com