Network Working Group                                          A. Morton
Internet-Draft                                           G. Ramachandran
Intended status: Informational                               G. Maguluri
Expires: August 16, 2012                                        AT&T Labs
                                                        February 13, 2012

              Reporting Metrics: Different Points of View
                   draft-ietf-ippm-reporting-metrics-07

Abstract

   Consumers of IP network performance metrics have many different uses
   in mind.  This memo provides "long-term" reporting considerations
   (e.g., days, weeks, or months, as opposed to 10 seconds), based on
   analysis of the two key audience points of view.  It describes how
   the audience categories affect the selection of metric parameters
   and options when seeking information that serves their needs.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on August 16, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Purpose and Scope
   3.  Reporting Results
     3.1.  Overview of Metric Statistics
     3.2.  Long-Term Reporting Considerations
   4.  Effect of POV on the Loss Metric
     4.1.  Loss Threshold
       4.1.1.  Network Characterization
       4.1.2.  Application Performance
     4.2.  Errored Packet Designation
     4.3.  Causes of Lost Packets
     4.4.  Summary for Loss
   5.  Effect of POV on the Delay Metric
     5.1.  Treatment of Lost Packets
       5.1.1.  Application Performance
       5.1.2.  Network Characterization
       5.1.3.  Delay Variation
       5.1.4.  Reordering
     5.2.  Preferred Statistics
     5.3.  Summary for Delay
   6.  Reporting Raw Capacity Metrics
     6.1.  Type-P Parameter
     6.2.  A priori Factors
     6.3.  IP-layer Capacity
     6.4.  IP-layer Utilization
     6.5.  IP-layer Available Capacity
     6.6.  Variability in Utilization and Avail. Capacity
       6.6.1.  General Summary of Variability
   7.  Reporting Restricted Capacity Metrics
     7.1.  Type-P Parameter and Type-C Parameter
     7.2.  A priori Factors
     7.3.  Measurement Interval
     7.4.  Bulk Transfer Capacity Reporting
     7.5.  Variability in Bulk Transfer Capacity
   8.  Reporting on Test Streams and Sample Size
     8.1.  Test Stream Characteristics
     8.2.  Sample Size
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1. Normative References
     12.2. Informative References
   Authors' Addresses

1.  Introduction

   When designing measurements of IP networks and presenting the
   results, knowledge of the audience is a key consideration.  To
   present a useful and relevant portrait of network conditions, one
   must answer the following question:

   "How will the results be used?"

   There are two main audience categories:

   1.  Network Characterization - describes conditions in an IP network
       for quality assurance, troubleshooting, modeling, Service Level
       Agreements (SLAs), etc.  This point of view looks inward, toward
       the network, and the consumer intends to act on the network
       itself.

   2.  Application Performance Estimation - describes the network
       conditions in a way that facilitates determining effects on user
       applications, and ultimately on the users themselves.  This
       point of view looks outward, toward the user(s), accepting the
       network as-is.  This consumer intends to estimate a network-
       dependent aspect of performance, or to design some aspect of an
       application's accommodation of the network.  (These are *not*
       application metrics; they are defined at the IP layer.)

   This memo considers how these different points of view affect both
   the measurement design (parameters and options of the metrics) and
   the statistics reported when serving their needs.

   The IPPM framework [RFC2330] and other RFCs describing IPPM metrics
   provide a background for this memo.

2.  Purpose and Scope

   The purpose of this memo is to clearly delineate two points of view
   (POV) for using measurements, and to describe their effects on the
   test design, including the selection of metric parameters and the
   reporting of results.

   The scope of this memo primarily covers the design and reporting of
   the loss and delay metrics [RFC2680] [RFC2679].  It will also
   discuss the delay variation [RFC3393] and reordering [RFC4737]
   metrics where applicable.

   With capacity metrics growing in relevance to the industry, the memo
   also covers POV and reporting considerations for metrics resulting
   from the Bulk Transfer Capacity Framework [RFC3148] and Network
   Capacity Definitions [RFC5136].
These memos effectively describe two 165 different categories of metrics, 167 o [RFC3148] with congestion flow-control and the notion of unique 168 data bits delivered, and 170 o [RFC5136] using a definition of raw capacity without the 171 restrictions of data uniqueness or congestion-awareness. 173 It might seem at first glance that each of these metrics has an 174 obvious audience (Raw = Network Characterization, Restricted = 175 Application Performance), but reality is more complex and consistent 176 with the overall topic of capacity measurement and reporting. For 177 example, TCP is usually used in Restricted capacity measurement 178 methods, while UDP appears in Raw capacity measurement. The Raw and 179 Restricted capacity metrics will be treated in separate sections, 180 although they share one common reporting issue: representing 181 variability in capacity metric results as part of a long-term report. 183 Sampling, or the design of the active packet stream that is the basis 184 for the measurements, is also discussed. 186 3. Reporting Results 188 This section gives an overview of recommendations, followed by 189 additional considerations for reporting results in the "long-term", 190 based on the discussion and conclusions of the major sections that 191 follow. 193 3.1. Overview of Metric Statistics 195 This section gives an overview of reporting recommendations for the 196 loss, delay, and delay variation metrics. 198 The minimal report on measurements MUST include both Loss and Delay 199 Metrics. 201 For Packet Loss, the loss ratio defined in [RFC2680] is a sufficient 202 starting point, especially the existing guidance for setting the loss 203 threshold waiting time. We have calculated a waiting time above that 204 should be sufficient to differentiate between packets that are truly 205 lost or have long finite delays under general measurement 206 circumstances, 51 seconds. Knowledge of specific conditions can help 207 to reduce this threshold, but 51 seconds is considered to be 208 manageable in practice. 210 We note that a loss ratio calculated according to [Y.1540] would 211 exclude errored packets from the numerator. In practice, the 212 difference between these two loss metrics is small if any, depending 213 on whether the last link prior to the destination contributes errored 214 packets. 216 For Packet Delay, we recommend providing both the mean delay and the 217 median delay with lost packets designated undefined (as permitted by 218 [RFC2679]). Both statistics are based on a conditional distribution, 219 and the condition is packet arrival prior to a waiting time dT, where 220 dT has been set to take maximum packet lifetimes into account, as 221 discussed above for loss. Using a long dT helps to ensure that delay 222 distributions are not truncated. 224 For Packet Delay Variation (PDV), the minimum delay of the 225 conditional distribution should be used as the reference delay for 226 computing PDV according to [Y.1540] or [RFC5481] and [RFC3393]. A 227 useful value to report is a pseudo range of delay variation based on 228 calculating the difference between a high percentile of delay and the 229 minimum delay. For example, the 99.9%-ile minus the minimum will 230 give a value that can be compared with objectives in [Y.1541]. 232 For Capacity, both Raw and Restricted, reporting the variability in a 233 useful way is identified as the main challenge. The Min, Max, and 234 Range statistics are suggested along with a ratio of Max to Min and 235 moving averages. 
3.2.  Long-Term Reporting Considerations

   [I-D.ietf-ippm-reporting] describes methods to conduct measurements
   and report the results on a near-immediate time scale (10 seconds,
   which we consider to be "short-term").

   Measurement intervals and reporting intervals need not be the same
   length.  Sometimes, the user is only concerned with the performance
   levels achieved over a relatively long interval of time (e.g., days,
   weeks, or months, as opposed to 10 seconds).  However, there can be
   risks involved with running a measurement continuously over a long
   period without recording intermediate results:

   o  Temporary power failure may cause loss of all the results to
      date.

   o  Measurement system timing synchronization signals may experience
      a temporary outage, causing subsets of measurements to be in
      error or invalid.

   o  Maintenance may be necessary on the measurement system, or on its
      connectivity to the network under test.

   For these and other reasons, such as

   o  the constraint to collect measurements on intervals similar to
      user session length, or

   o  the dual use of measurements in monitoring activities where
      results are needed within a period of a few minutes,

   there is value in conducting measurements on intervals that are much
   shorter than the reporting interval.

   There are several approaches for aggregating a series of measurement
   results over time in order to make a statement about the longer
   reporting interval.  One approach requires the storage of all metric
   singletons collected throughout the reporting interval, even though
   the measurement interval stops and starts many times.

   Another approach is described in [RFC5835] as "temporal
   aggregation".  This approach would estimate the results for the
   reporting interval based on many individual measurement interval
   statistics (results) alone.  The result would ideally appear in the
   same form as though a continuous measurement had been conducted.  A
   memo to address the details of temporal aggregation is yet to be
   prepared.

   Yet another approach requires a numerical objective for the metric,
   and the results of each measurement interval are compared with the
   objective.  Every measurement interval where the results meet the
   objective contributes to the fraction of time with performance as
   specified.  When the reporting interval contains many measurement
   intervals, it is possible to present the results as "metric A was
   less than or equal to objective X during Y% of the time."

   NOTE that numerical thresholds of acceptability are not set in IETF
   performance work and are explicitly excluded from the IPPM charter.
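   As a simple illustration of this style of aggregation, the Python
   sketch below compares per-interval loss ratios against a purely
   hypothetical objective (recall that IPPM sets no thresholds; the
   value 0.005 is an arbitrary example, as are the measured ratios).

      # Loss ratio measured in each short measurement interval.
      interval_loss_ratios = [0.000, 0.001, 0.000, 0.012, 0.000, 0.000]
      objective = 0.005  # hypothetical objective "X", not an IPPM value

      met = sum(1 for r in interval_loss_ratios if r <= objective)
      pct = 100.0 * met / len(interval_loss_ratios)
      print(f"Loss ratio <= {objective} during {pct:.1f}% of the time")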
   In all measurement, it is important to avoid unintended
   synchronization with network events.  This topic is treated in
   [RFC2330] for Poisson-distributed inter-packet time streams, and in
   [RFC3432] for Periodic streams.  Both avoid synchronization through
   the use of random start times.

   There are network conditions where it is simply more useful to
   report the connectivity status of the Source-Destination path, and
   to distinguish time intervals where connectivity can be demonstrated
   from other time intervals (where connectivity does not appear to
   exist).  [RFC2678] specifies a number of one-way and two-way
   connectivity metrics of increasing complexity.  In this memo, we
   RECOMMEND that long-term reporting of loss, delay, and other metrics
   be limited to time intervals where connectivity can be demonstrated,
   and that other intervals be summarized as the percentage of time
   where connectivity does not appear to exist.  We note that this same
   approach has been adopted in ITU-T Recommendation [Y.1540], where
   performance parameters are only valid during periods of service
   "availability" (evaluated according to a function based on packet
   loss, where sustained periods of loss ratio greater than a threshold
   are declared "unavailable").

4.  Effect of POV on the Loss Metric

   This section describes the ways in which the Loss metric can be
   tuned to reflect the preferences of the two audience categories, or
   different POV.  The waiting time to declare a packet lost, or loss
   threshold, is one area where there would appear to be a difference,
   but the ability to post-process the results may resolve it.

4.1.  Loss Threshold

   RFC 2680 [RFC2680] defines the concept of a waiting time for packets
   to arrive, beyond which they are declared lost.  The text of the RFC
   declines to recommend a value, instead saying that "good
   engineering, including an understanding of packet lifetimes, will be
   needed in practice."  Later, in the methodology, it gives reasons
   for waiting "a reasonable period of time", leaving the definition of
   "reasonable" intentionally vague.

4.1.1.  Network Characterization

   Practical measurement experience has shown that unusual network
   circumstances can cause long delays.  One such circumstance is when
   routing loops form during IGP re-convergence following a failure or
   drastic link cost change.  Packets will loop between two routers
   until new routes are installed, or until the IPv4 Time-to-Live (TTL)
   field (or the IPv6 Hop Limit) decrements to zero.  Very long delays
   on the order of several seconds have been measured [Casner] [Cia03].

   Therefore, network characterization activities prefer a long waiting
   time in order to distinguish these events from other causes of loss
   (such as packet discard at a full queue, or tail drop).  This way,
   the metric design helps to distinguish more reliably between packets
   that might yet arrive and those that are no longer traversing the
   network.

   It is possible to calculate a worst-case waiting time, assuming that
   a routing loop is the cause.  We model the path between Source and
   Destination as a series of delays in links (t) and queues (q), as
   these two are the dominant contributors to delay.  The normal path
   delay across n hops without encountering a loop, D, is

                     n
                    ---
                    \
         D = t   +   >   (t  + q )
              0     /      i    i
                    ---
                   i = 1

                      Figure 1: Normal Path Delay

   and the time spent in the loop with L hops is

               i + L-1
                 ---
                 \                            (TTL - n)
         R = C    >   (t  + q )  where C    = ---------
                 /      i    i          max       L
                 ---
                  i

               Figure 2: Delay due to Rotations in a Loop

   and where C is the number of times a packet circles the loop.

   If we take the delays of all links and queues as 100 ms each, with
   TTL=255, the number of hops n=5, and the hops in the loop L=4, then

      D = 1.1 sec and R ~= 50 sec, and D + R ~= 51.1 seconds

   We note that link delays of 100 ms would span most continents, and
   that a constant queue length of 100 ms is also very generous.  When
   a loop occurs, it is almost certain to be resolved in 10 seconds or
   less.  The value calculated above is an upper limit for almost any
   real-world circumstance.
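   The calculation is easy to reproduce; the short Python sketch below
   simply restates Figures 1 and 2 with the generous assumptions used
   above (it also takes the delay ahead of the first hop, t0, as
   100 ms).

      # Worst-case waiting time when a routing loop is the cause.
      t = q = 0.100          # link and queue delay per hop, seconds
      TTL, n, L = 255, 5, 4  # IPv4 TTL, path hops, hops in the loop

      D = t + n * (t + q)    # normal path delay (Figure 1)
      C = (TTL - n) / L      # worst case: packet circles until TTL = 0
      R = C * L * (t + q)    # time spent in the loop (Figure 2)

      print(f"D = {D:.1f} s, R = {R:.1f} s, D + R = {D + R:.1f} s")
      # -> D = 1.1 s, R = 50.0 s, D + R = 51.1 s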
   A waiting time threshold parameter, dT, set consistent with this
   calculation would not truncate the delay distribution (possibly
   causing a change in its mathematical properties), because the
   packets that might yet arrive have been given sufficient time to
   traverse the network.

   It is worth noting that packets that are stored and deliberately
   forwarded at a much later time constitute a replay attack on the
   measurement system and are beyond the scope of normal performance
   reporting.

4.1.2.  Application Performance

   Fortunately, application performance estimation activities are not
   adversely affected by the estimated worst-case transfer time.
   Although the designer's tendency might be to set the Loss Threshold
   at a value equivalent to a particular application's threshold, this
   specific threshold can be applied when post-processing the
   measurements.  A shorter waiting time can be enforced by locating
   packets with delays longer than the application's threshold and
   re-designating such packets as lost.  Thus, the measurement system
   can use a single loss waiting time and support both application and
   network performance POVs simultaneously.
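   A sketch of this post-processing step follows; the 100 ms threshold
   is a hypothetical application requirement, not a recommendation of
   this memo.

      # Delays measured with the single long waiting time (dT = 51 s);
      # None marks a packet that never arrived.
      measured = [0.021, 0.250, 0.023, None, 0.022]
      dT_app = 0.100  # hypothetical application loss threshold, seconds

      # Re-designate packets later than the application threshold.
      app_view = [None if d is None or d > dT_app else d
                  for d in measured]
      app_loss_ratio = app_view.count(None) / len(app_view)
      print(f"application-level loss ratio: {app_loss_ratio:.2f}")  # 0.40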
4.2.  Errored Packet Designation

   RFC 2680 designates packets that arrive containing errors as lost
   packets.  Many packets that are corrupted by bit errors are
   discarded within the network and do not reach their intended
   destination.

   This is consistent with applications that would check the payload
   integrity at higher layers and discard the packet.  However, some
   applications prefer to deal with errored payloads on their own, and
   for them even a corrupted payload is better than no packet at all.

   To address this possibility, and to make network characterization
   more complete, it is recommended to distinguish between packets that
   do not arrive (lost) and errored packets that arrive (conditionally
   lost).

4.3.  Causes of Lost Packets

   Although many measurement systems use a waiting time to determine if
   a packet is lost or not, most of the waiting is in vain.  The
   packets are no longer traversing the network and have not reached
   their destination.

   There are many causes of packet loss, including:

   1.  Queue drop, or discard

   2.  Corruption of the IP header, or other essential header
       information

   3.  TTL expiration (or use of a TTL value that is too small)

   4.  Link or router failure

   After waiting sufficient time, packet loss can probably be
   attributed to one of these causes.

4.4.  Summary for Loss

   Given that measurement post-processing is possible (and even
   encouraged in the definitions of IPPM metrics), measurements of loss
   can easily serve both points of view:

   o  Use a long waiting time to serve network characterization, and
      revise results for specific application delay thresholds as
      needed.

   o  Distinguish between errored packets and lost packets when
      possible to aid network characterization, and combine the results
      for application performance if appropriate.

5.  Effect of POV on the Delay Metric

   This section describes the ways in which the Delay metric can be
   tuned to reflect the preferences of the two consumer categories, or
   different POV.

5.1.  Treatment of Lost Packets

   The Delay Metric [RFC2679] specifies the treatment of packets that
   do not successfully traverse the network: their delay is undefined.

      ">>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
      (informally, infinite)<< means that Src sent the first bit of a
      Type-P packet to Dst at wire-time T and that Dst did not receive
      that packet."

   It is an accepted, but informal, practice to assign infinite delay
   to lost packets.  We next look at how these two different treatments
   align with the needs of measurement consumers who wish to
   characterize networks or estimate application performance.  Also, we
   look at the way that lost packets have been treated in other
   metrics: delay variation and reordering.

5.1.1.  Application Performance

   Applications need to perform different functions, dependent on
   whether or not each packet arrives within some finite tolerance.  In
   other words, a receiver's packet processing takes one of two
   directions (or "forks" in the road):

   o  Packets that arrive within expected tolerance are handled by
      processes that remove headers, restore smooth delivery timing (as
      in a de-jitter buffer), restore sending order, check for errors
      in payloads, and perform many other operations.

   o  Packets that do not arrive when expected spawn other processes
      that attempt recovery from the apparent loss, such as
      retransmission requests, loss concealment, or forward error
      correction to replace the missing packet.

   So, it is important to maintain a distinction between packets that
   actually arrive and those that do not.  Therefore, it is preferable
   to leave the delay of lost packets undefined and to characterize the
   delay distribution as a conditional distribution (conditioned on
   arrival).

5.1.2.  Network Characterization

   In this discussion, we assume that both loss and delay metrics will
   be reported for network characterization (at least).

   Assume that packets that do not arrive are reported as Lost, usually
   as a fraction of all sent packets.  If these lost packets are
   assigned undefined delay, then the network's inability to deliver
   them (in a timely way) is reflected only in the Loss metric when we
   report statistics on the Delay distribution conditioned on the event
   of packet arrival (within the Loss waiting time threshold).  We can
   say that the Delay and Loss metrics are orthogonal, in that they
   convey non-overlapping information about the network under test.
   This is a valuable property, whose absence is discussed below.

   However, if we assign infinite delay to all lost packets, then (as
   the sketch following this list also illustrates):

   o  The delay metric results are influenced both by packets that
      arrive and by those that do not.

   o  The delay singleton and the loss singleton do not appear to be
      orthogonal (Delay is finite when Loss=0, Delay is infinite when
      Loss=1).

   o  The network is penalized in both the loss and delay metrics,
      effectively double-counting the lost packets.
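   A small numerical sketch (with illustrative values) shows the
   practical consequence: any statistic that sums delay values, such as
   the sample mean, becomes infinite as soon as one lost packet is
   assigned infinite delay, while the conditional mean remains
   informative and leaves loss to the Loss metric.

      import math, statistics

      delays = [0.021, 0.023, 0.022, 0.024]  # packets that arrived
      lost = 1                               # one packet did not arrive

      with_infinity = delays + [math.inf] * lost
      print(statistics.mean(with_infinity))  # inf -- uninformative
      print(statistics.mean(delays))         # conditional mean, 0.0225
      print(lost / (len(delays) + lost))     # loss ratio, 0.2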
   As further evidence of overlap, consider the Cumulative Distribution
   Function (CDF) of Delay when the value positive infinity is assigned
   to all lost packets.  Figure 3 shows a CDF where a small fraction of
   packets are lost.

    1 | - - - - - - - - - - - - - - - - - -+
      |                                    |
      |          _..----''''''''''''''''''''
      |      ,-''
      |    ,'
      |   /   Mass at
      |  /    +infinity
      | /     = fraction
      ||      lost
      |/
    0 |_____________________________________
      0            Delay                 +oo

       Figure 3: Cumulative Distribution Function for Delay when
                           Loss = +Infinity

   We note that a Delay CDF that is conditioned on packet arrival would
   not exhibit this apparent overlap with loss.

   Although infinity is a familiar mathematical concept, it is somewhat
   disconcerting to see any time-related metric reported as infinity,
   in the opinion of the authors.  Questions are bound to arise, and
   they tend to detract from the goal of informing the consumer with a
   performance report.

5.1.3.  Delay Variation

   [RFC3393] excludes lost packets from samples, effectively assigning
   an undefined delay to packets that do not arrive in a reasonable
   time.  Section 4.1 of [RFC3393] describes this specification and its
   rationale (ipdv = inter-packet delay variation in the quote below):

      "The treatment of lost packets as having "infinite" or
      "undefined" delay complicates the derivation of statistics for
      ipdv.  Specifically, when packets in the measurement sequence are
      lost, simple statistics such as sample mean cannot be computed.
      One possible approach to handling this problem is to reduce the
      event space by conditioning.  That is, we consider conditional
      statistics; namely we estimate the mean ipdv (or other derivative
      statistic) conditioned on the event that selected packet pairs
      arrive at the destination (within the given timeout).  While this
      itself is not without problems (what happens, for example, when
      every other packet is lost), it offers a way to make some (valid)
      statements about ipdv, at the same time avoiding events with
      undefined outcomes."

   We note that the argument above applies to all forms of packet delay
   variation that can be constructed using the "selection function"
   concept of [RFC3393].  In recent work, the two main forms of delay
   variation metrics have been compared, and the results are summarized
   in [RFC5481].

5.1.4.  Reordering

   [RFC4737] defines metrics that are based on evaluation of packet
   arrival order, and they include a waiting time to declare a packet
   lost (to exclude it from further processing).

   If lost packets were assigned a delay value (infinity), then the
   reordering metric would declare any such packets to be reordered,
   because their sequence numbers will surely be less than the "Next
   Expected" threshold when (or if) they arrive.  But this practice
   would fail to maintain orthogonality between the reordering metric
   and the loss metric.  Confusion can be avoided by designating the
   delay of non-arriving packets as undefined, and by reserving delay
   values only for packets that arrive within a sufficiently long
   waiting time.

5.2.  Preferred Statistics

   Today in network characterization, the sample mean is one statistic
   that is almost ubiquitously reported.  It is easily computed and
   understood by virtually everyone in this audience category.  Also,
   the sample is usually filtered on packet arrival, so that the mean
   is based on a conditional distribution.

   The median is another statistic that summarizes a distribution,
   having somewhat different properties from the sample mean.  The
   median is stable in distributions with a few outliers or without
   them.
   However, the median's stability prevents it from indicating when a
   large fraction of the distribution changes value; 50% or more of the
   values would need to change for the median to capture the change.

   Both the median and the sample mean have difficulty with bimodal
   distributions.  The median will reside in only one of the modes, and
   the mean may not lie in either mode range.  For this and other
   reasons, additional statistics such as the minimum, maximum, and
   95th percentile have value when summarizing a distribution.

   When both the sample mean and median are available, a comparison
   will sometimes be informative, because these two statistics are
   equal only when the delay distribution is perfectly symmetrical.

   Also, these statistics are generally useful from the Application
   Performance POV, so there is a common set that should satisfy both
   audiences.

   Plots of the delay distribution may also be useful when single-value
   statistics indicate that new conditions are present.  An empirically
   derived probability distribution function will usually describe
   multiple modes more efficiently than any other form of result.
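   A brief sketch (with invented values) illustrates the bimodal
   caution above: the median falls inside one mode and the mean falls
   between the modes, so neither summarizes the sample well on its own.

      import statistics

      # Delays clustered near 20 ms and near 80 ms (two modes).
      sample = [0.020, 0.021, 0.022, 0.080, 0.081, 0.082, 0.083]

      print(f"median: {1000 * statistics.median(sample):.0f} ms")  # 80
      print(f"mean  : {1000 * statistics.mean(sample):.0f} ms")    # ~56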
5.3.  Summary for Delay

   From the perspectives of:

   1.  application/receiver analysis, where subsequent processing
       depends on whether the packet arrives or times out,

   2.  straightforward network characterization without double-
       counting defects, and

   3.  consistency with the delay variation and reordering metric
       definitions,

   the most efficient practice is to distinguish between truly lost and
   delayed packets with a sufficiently long waiting time, and to
   designate the delay of non-arriving packets as undefined.

6.  Reporting Raw Capacity Metrics

   Raw capacity refers to the metrics defined in [RFC5136], which do
   not include restrictions such as data uniqueness or flow-control
   response to congestion.

   The metrics considered are IP-layer Capacity, Utilization (or used
   capacity), and Available Capacity, for individual links and complete
   paths.  These three metrics form a triad: knowing one metric
   constrains the other two (within their allowed range), and knowing
   two determines the third.  The link metrics have another key aspect
   in common: they are single-measurement-point metrics at the egress
   of a link.  The path Capacity and Available Capacity are derived by
   examining the set of single-point link measurements and taking the
   minimum value.

6.1.  Type-P Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   Type-P categorization has critical relevance in all forms of
   capacity measurement and reporting.  The ability to categorize
   packets based on header fields, for assignment to different queues
   and scheduling mechanisms, is now commonplace.  When unused
   resources are shared across queues, the conditions in all packet
   categories will affect capacity and related measurements.  This is
   one source of variability in the results that all audiences would
   prefer to see reported in a useful and easily understood way.

   In OWAMP and TWAMP, Type-P is essentially confined to the Diffserv
   Code Point (DSCP) [RFC4656], which is the most common qualifier for
   Type-P.

   Each audience will have a set of Type-P qualifications and value
   combinations that are of interest.  Measurements and reports SHOULD
   have the flexibility to report per-type and aggregate performance.

6.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link) and wishes to measure
   one or more of the raw capacity metrics.  This scenario is quite
   common and has spawned a substantial number of experimental
   measurement methods (e.g., http://www.caida.org/tools/taxonomy/ ).
   Many of these methods respect that their users want a result fairly
   quickly and in one trial.  Thus, the measurement interval is kept
   short (a few seconds to a minute).  For long-term reporting, a
   sample of short-term results needs to be summarized.

6.3.  IP-layer Capacity

   For links, this metric's theoretical maximum value can be determined
   from the physical-layer bit rate and the bit-rate reduction due to
   the layers between the physical layer and IP.  When measured, this
   metric takes additional factors into account, such as the ability of
   the sending device to process and forward traffic under various
   conditions.  For example, the arrival of routing updates may spawn
   high-priority processes that reduce the sending rate temporarily.
   Thus, the measured capacity of a link will be variable, and the
   maximum capacity observed applies to a specific time, time interval,
   and other relevant circumstances.

   For paths composed of a series of links, it is easy to see how the
   sources of variability for the results grow with each link in the
   path.  Results variability will be discussed in more detail below.

6.4.  IP-layer Utilization

   The ideal metric definition of Link Utilization [RFC5136] is based
   on the actual usage (bits successfully received during a time
   interval) and the Maximum Capacity for the same interval.

   In practice, Link Utilization can be calculated by counting the IP-
   layer (or other-layer) octets received over a time interval and
   dividing by the theoretical maximum of octets that could have been
   delivered in the same interval.  A commonly used time interval is 5
   minutes, and this interval has been sufficient to support network
   operations and design for some time.  5 minutes is somewhat long
   compared with the expected download time for web pages, but short
   with respect to large file transfers and TV program viewing.  It is
   fair to say that considerable variability is concealed by reporting
   a single (average) Utilization value for each 5-minute interval.
   Some performance management systems have begun to make 1-minute
   averages available.

   There is also a limit on the smallest useful measurement interval.
   Intervals on the order of the serialization time for a single
   Maximum Transmission Unit (MTU) packet will observe on/off behavior
   and report 100% or 0%.  The smallest interval needs to be some
   multiple of MTU serialization time for averaging to be effective.

6.5.  IP-layer Available Capacity

   The Available Capacity of a link can be calculated using the
   Capacity and Utilization metrics.
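   For illustration, the sketch below works through the triad for a
   single link; the link rate, octet count, and interval are invented
   for the example.

      capacity_bps = 100e6       # measured IP-layer capacity of the link
      octets_received = 2.625e9  # IP-layer octets counted in the interval
      interval_s = 300           # the common 5-minute interval

      used_bps = octets_received * 8 / interval_s
      utilization = used_bps / capacity_bps
      available_bps = capacity_bps - used_bps  # triad: two determine the third

      print(f"utilization       : {utilization:.0%}")                 # 70%
      print(f"available capacity: {available_bps / 1e6:.0f} Mbit/s")  # 30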
   When the Available Capacity of a link or path is estimated through
   some measurement technique, the following parameters SHOULD be
   reported:

   o  Name and reference to the exact method of measurement

   o  IP packet length, in octets (including the IP header)

   o  Maximum Capacity that can be assessed in the measurement
      configuration

   o  The time and duration of the measurement

   o  All other parameters specific to the measurement method

   Many methods of Available Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the actual Available Capacity of the link or path.  Therefore, it is
   important to know the capacity value beyond which there will be no
   measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient
   Available Capacity.  This case simplifies measurement of link and
   path capacity to some degree, as long as the measurable maximum
   exceeds the target capacity.

6.6.  Variability in Utilization and Avail. Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or for the spread of the underlying distribution of the
   singleton measurements.

   How can Utilization be measured and summarized to describe the
   potential variability in a useful way?

   How can the variability in Available Capacity estimates be reported,
   so that the confidence in the results is also conveyed?

   We suggest some methods below.

6.6.1.  General Summary of Variability

   With a set of singleton Utilization or Available Capacity estimates,
   each representing the time interval needed to ascertain the
   estimate, we seek to describe the variation over the set of
   singletons, as though reporting summary statistics of a
   distribution.  Three useful summary statistics are:

   o  Minimum,

   o  Maximum, and

   o  Range.

   An alternate way to represent the Range is as the ratio of the
   Maximum to the Minimum value.  This yields an easily understandable
   statistic to describe the range observed.  For example, when Maximum
   = 3*Minimum, the Max/Min Ratio is 3, and users may see variability
   of this order.  On the other hand, Capacity estimates with a Max/Min
   Ratio near 1 are quite consistent and near the central measure or
   statistic reported.

   For an ongoing series of singleton estimates, a moving average of n
   estimates may provide a single-value estimate to more easily
   distinguish substantial changes in performance over time.  For
   example, in a window of n singletons observed in time interval t, a
   percentage change of x% could be declared a substantial change and
   reported as an exception.

   Often, the most informative summary of the results is a two-axis
   plot rather than a table of statistics, with time plotted on the
   x-axis and the singleton value on the y-axis.  The time-series plot
   can illustrate sudden changes in an otherwise stable range, identify
   bi-modality easily, and help quickly assess correlation with other
   time series.  Plots of the frequency of the singleton values are
   likewise useful tools to visualize the variation.
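   A sketch of these summaries, using invented Available Capacity
   singletons and arbitrary values for the window size n and the
   exception threshold x, appears below.

      singletons = [92.0, 95.0, 90.0, 31.0, 33.0, 94.0]  # Mbit/s
      n, x = 3, 25.0  # moving-average window, exception threshold (%)

      print(f"Min {min(singletons)}, Max {max(singletons)}, "
            f"Range {max(singletons) - min(singletons)}, "
            f"Max/Min {max(singletons) / min(singletons):.1f}")

      # Moving average of n estimates; report an exception when the
      # average moves by more than x% between successive windows.
      prev = None
      for i in range(len(singletons) - n + 1):
          avg = sum(singletons[i:i + n]) / n
          if prev is not None and 100 * abs(avg - prev) / prev > x:
              print(f"exception: moving average changed to {avg:.1f}")
          prev = avg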
7.  Reporting Restricted Capacity Metrics

   Restricted capacity refers to the metrics defined in [RFC3148],
   which include criteria of data uniqueness or flow-control response
   to congestion.

   The primary metric considered is Bulk Transfer Capacity (BTC) for
   complete paths.  [RFC3148] defines

      BTC = data_sent / elapsed_time

   for a connection with congestion-aware flow control, where data_sent
   is the total of unique payload bits (no headers).

   We note that this definition *differs* from the raw capacity
   definition in Section 2.3.1 of [RFC5136], where IP-layer Capacity
   *includes* all bits in the IP header and payload.  This means that
   Restricted Capacity BTC is already operating at a disadvantage when
   compared to the raw capacity at layers below TCP.  Further, there
   are cases where one IP layer is encapsulated in another IP layer or
   other form of tunneling protocol, designating more and more of the
   fundamental transport capacity as header bits that are pure overhead
   to the BTC measurement.
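   The difference is easy to quantify.  The sketch below contrasts the
   two definitions for the same hypothetical transfer (typical IPv4/TCP
   header sizes, no options, and every payload bit assumed unique,
   i.e., no retransmissions).

      payload = 1460     # TCP payload bytes per packet
      headers = 20 + 20  # IPv4 + TCP header bytes (no options)
      packets = 100000
      elapsed = 10.0     # seconds

      btc = payload * packets * 8 / elapsed                   # [RFC3148]
      ip_layer = (payload + headers) * packets * 8 / elapsed  # [RFC5136]
      print(f"BTC      : {btc / 1e6:.1f} Mbit/s")             # 116.8
      print(f"IP-layer : {ip_layer / 1e6:.1f} Mbit/s")        # 120.0
      print(f"overhead : {(ip_layer - btc) / ip_layer:.1%}")  # 2.7%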
   We also note that the Raw and Restricted Capacity metrics are not
   orthogonal in the sense defined in Section 5.1.2 above.  The
   information they convey about the network under test certainly
   overlaps, but they reveal two different and important aspects of
   performance.

   When thinking about the triad of raw capacity metrics, BTC is most
   akin to the "IP-Type-P Available Path Capacity", at least in the
   eyes of a network user who seeks to know what transmission
   performance a path might support.

7.1.  Type-P Parameter and Type-C Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   considerations for Restricted Capacity are identical to those of the
   raw capacity section on this topic, with the addition that the
   various fields and options in the TCP header MUST be included in the
   description.

   The vast array of TCP flow-control options is not well captured by
   Type-P, because these options do not exist in the TCP header bits.
   Therefore, we introduce a new notion here: TCP Configuration of
   "Type-C".  The elements of Type-C describe all of the settings for
   TCP options and congestion control algorithm variables, including
   the main form of congestion control in use.

7.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link) and wishes to measure
   one or more BTC metrics.  The discussion of Section 6.2 applies here
   as well.

7.3.  Measurement Interval

   There are limits on a useful measurement interval for BTC.  Three
   factors that influence the interval duration are listed below:

   1.  Measurements may choose to include or exclude the three-way
       handshake of TCP connection establishment, which requires at
       least 1.5 * RTT and contains both the delay of the path and the
       host processing time for responses.  However, user experience
       includes the three-way handshake for all new TCP connections.

   2.  Measurements may choose to include or exclude Slow-Start,
       preferring instead to focus on a portion of the transfer that
       represents "equilibrium" (which needs to be defined for
       particular circumstances if used).  However, user experience
       includes Slow-Start for all new TCP connections.

   3.  Measurements may choose to use a fixed block of data to
       transfer, where the size of the block has a relationship to the
       file size of the application of interest.  This approach yields
       variable-size measurement intervals, where a path with faster
       BTC is measured for less time than a path with slower BTC, and
       this has implications when path impairments are time-varying or
       transient.  Users are likely to turn their immediate attention
       elsewhere when a very large file must be transferred; thus, they
       do not directly experience such a long transfer -- they see the
       result (success or fail) and possibly an objective measurement
       of the transfer time (which will likely include the three-way
       handshake, Slow-Start, and application file management
       processing time, as well as the BTC).

   Individual measurement intervals may be short or long, but there is
   a need to report the results on a long-term basis that captures the
   BTC variability experienced between each interval.  Consistent BTC
   is a valuable commodity, along with the value attained.

7.4.  Bulk Transfer Capacity Reporting

   When the BTC of a link or path is estimated through some measurement
   technique, the following parameters SHOULD be reported:

   o  Name and reference to the exact method of measurement

   o  Maximum Transmission Unit (MTU)

   o  Maximum BTC that can be assessed in the measurement configuration

   o  The time and duration of the measurement

   o  The number of BTC connections used simultaneously

   o  *All* other parameters specific to the measurement method,
      especially the congestion control algorithm in use

   See also [RFC6349].

   Many methods of Bulk Transfer Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the available capacity of the link or path.  Therefore, it is
   important to specify the measured BTC value beyond which there will
   be no measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient BTC.
   This case simplifies measurement of link and path capacity to some
   degree, as long as the measurable maximum exceeds the target
   capacity.

7.5.  Variability in Bulk Transfer Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree of confidence that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   Two questions loom:

   1.  In what ways can BTC be measured and summarized to describe the
       potential variability in a useful way?

   2.  How can the variability in BTC estimates be reported, so that
       the confidence in the results is also conveyed?

   We suggest the methods of Section 6.6.1 above, along with the
   additional presentations of results given in [RFC6349].

8.  Reporting on Test Streams and Sample Size

   This section discusses two key aspects of measurement that are
   sometimes omitted from the report: the description of the test
   stream on which the measurements are based, and the sample size.

8.1.  Test Stream Characteristics

   Network Characterization has traditionally used Poisson-distributed
   inter-packet spacing, as this provides an unbiased sample.  The
   average inter-packet spacing may be selected to allow observation of
   specific network phenomena.  Other test streams are designed to
   sample some property of the network, such as the presence of
   congestion, link bandwidth, or packet reordering.
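   A sketch of a Poisson test-stream generator follows; the 100 ms
   average spacing and the 10-packet stream length are arbitrary
   choices for the example.

      import random

      # Poisson stream per [RFC2330]: exponentially distributed
      # inter-packet gaps, with a random start time to avoid
      # synchronization with network events.
      avg_gap = 0.100                   # average gap between packets, s
      start = random.uniform(0.0, 1.0)  # random start time, s

      send_times, t = [], start
      for _ in range(10):
          send_times.append(t)
          t += random.expovariate(1.0 / avg_gap)  # mean gap = avg_gap

      print([f"{s:.3f}" for s in send_times])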
   If measuring a network in order to make inferences about
   applications or receiver performance, then there are usually
   efficiencies derived from a test stream that has similar
   characteristics to the sender.  In some cases, it is essential to
   synthesize the sender stream, as with Bulk Transfer Capacity
   estimates.  In other cases, it may be sufficient to sample with a
   "known bias", e.g., a Periodic stream to estimate real-time
   application performance.

8.2.  Sample Size

   Sample size is directly related to the accuracy of the results, and
   it plays a critical role in the report.  Even if only the sample
   size (in terms of number of packets) is given for each value or
   summary statistic, it imparts a notion of the confidence in the
   result.

   In practice, the sample size will be selected taking both
   statistical and practical factors into account.  Among these factors
   are:

   1.  The estimated variability of the quantity being measured

   2.  The desired confidence in the result (although this may be
       dependent on assumptions about the underlying distribution of
       the measured quantity)

   3.  The effects of active measurement traffic on user traffic

   A sample size may sometimes be referred to as "large".  This is a
   relative and qualitative term.  It is preferable to describe what
   one is attempting to achieve with the sample.  For example, stating
   an implication may be helpful: this sample is large enough such that
   a single outlying value at ten times the "typical" sample mean (the
   mean without the outlying value) would influence the mean by no more
   than X.
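   This implication can be made concrete.  If one outlier at ten times
   the typical mean m joins n well-behaved samples, the new mean is
   (n*m + 10*m)/(n + 1), so the relative change is 9/(n + 1); solving
   for an influence of at most X gives n >= 9/X - 1.  A sketch with a
   few example sizes:

      # Relative influence of a single 10x outlier on the sample mean.
      for n in (99, 899, 8999):
          influence = 9.0 / (n + 1)
          print(f"n = {n:5d}: one 10x outlier shifts the mean "
                f"by {influence:.1%}")
      # -> 9.0%, 1.0%, 0.1%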
9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

10.  Security Considerations

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656].

11.  Acknowledgements

   The authors thank: Phil Chimento for his suggestion to employ
   conditional distributions for Delay; Steve Konish Jr. for his
   careful review and suggestions; Dave McDysan and Don McLachlan for
   useful comments based on their long experience with measurement and
   reporting; Daniel Genin for his observation of non-orthogonality
   between the Raw and Restricted Capacity metrics (and our omission of
   this fact); and Matt Zekauskas for suggestions on organizing the
   memo for easier consumption.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams",
              RFC 3432, November 2002.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and
              M. Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics",
              RFC 4737, November 2006.

   [RFC5136]  Chimento, P. and J. Ishac, "Defining Network Capacity",
              RFC 5136, February 2008.

12.2.  Informative References

   [Casner]   "A Fine-Grained View of High Performance Networking",
              NANOG 22 Conf., May 20-22, 2001,
              http://www.nanog.org/mtg-0105/agenda.html.

   [Cia03]    "Standardized Active Measurements on a Tier 1 IP
              Backbone", IEEE Communications Mag., pp. 90-97,
              June 2003.

   [I-D.ietf-ippm-reporting]
              Shalunov, S. and M. Swany, "Reporting IP Performance
              Metrics to Users", draft-ietf-ippm-reporting-06 (work
              in progress), March 2011.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
              Composition", RFC 5835, April 2010.

   [RFC6349]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
              "Framework for TCP Throughput Testing", RFC 6349,
              August 2011.

   [Y.1540]   ITU-T Recommendation Y.1540, "Internet protocol data
              communication service - IP packet transfer and
              availability performance parameters", December 2011.

   [Y.1541]   ITU-T Recommendation Y.1541, "Network Performance
              Objectives for IP-Based Services", February 2011.

Authors' Addresses

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/

   Gomathi Ramachandran
   AT&T Labs
   200 Laurel Avenue South
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2353
   Email: gomathi@att.com

   Ganga Maguluri
   AT&T Labs
   200 Laurel Avenue
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2486
   Email: gmaguluri@att.com