Network Working Group                                          A. Morton
Internet-Draft                                           G. Ramachandran
Intended status: Informational                                G. Maguluri
Expires: September 12, 2012                                     AT&T Labs
                                                           March 11, 2012

              Reporting Metrics: Different Points of View
                  draft-ietf-ippm-reporting-metrics-08

Abstract

   Consumers of IP network performance metrics have many different uses
   in mind.  This memo provides "long-term" reporting considerations
   (e.g., days, weeks, or months, as opposed to 10 seconds), based on
   analysis of the two key audience points-of-view.  It describes how
   the audience categories affect the selection of metric parameters
   and options when seeking information that serves their needs.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.
   It is inappropriate to use Internet-Drafts as reference material or
   to cite them other than as "work in progress."

   This Internet-Draft will expire on September 12, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Purpose and Scope
   3.  Reporting Results
     3.1.  Overview of Metric Statistics
     3.2.  Long-Term Reporting Considerations
   4.  Effect of POV on the Loss Metric
     4.1.  Loss Threshold
       4.1.1.  Network Characterization
       4.1.2.  Application Performance
     4.2.  Errored Packet Designation
     4.3.  Causes of Lost Packets
     4.4.  Summary for Loss
   5.  Effect of POV on the Delay Metric
     5.1.  Treatment of Lost Packets
       5.1.1.  Application Performance
       5.1.2.  Network Characterization
       5.1.3.  Delay Variation
       5.1.4.  Reordering
     5.2.  Preferred Statistics
     5.3.  Summary for Delay
   6.  Reporting Raw Capacity Metrics
     6.1.  Type-P Parameter
     6.2.  A priori Factors
     6.3.  IP-layer Capacity
     6.4.  IP-layer Utilization
     6.5.  IP-layer Available Capacity
     6.6.  Variability in Utilization and Avail. Capacity
       6.6.1.  General Summary of Variability
   7.  Reporting Restricted Capacity Metrics
     7.1.  Type-P Parameter and Type-C Parameter
     7.2.  A priori Factors
     7.3.  Measurement Interval
     7.4.  Bulk Transfer Capacity Reporting
     7.5.  Variability in Bulk Transfer Capacity
   8.  Reporting on Test Streams and Sample Size
     8.1.  Test Stream Characteristics
     8.2.  Sample Size
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   When designing measurements of IP networks and presenting the
   results, knowledge of the audience is a key consideration.  To
   present a useful and relevant portrait of network conditions, one
   must answer the following question:

   "How will the results be used?"

   There are two main audience categories:

   1.  Network Characterization - describes conditions in an IP network
       for quality assurance, troubleshooting, modeling, Service Level
       Agreements (SLA), etc.  This point-of-view looks inward, toward
       the network, and the consumer intends to take action there.

   2.  Application Performance Estimation - describes the network
       conditions in a way that facilitates determining effects on user
       applications, and ultimately on the users themselves.  This
       point-of-view looks outward, toward the user(s), accepting the
       network as-is.  This consumer intends to estimate a network-
       dependent aspect of performance, or to design some aspect of an
       application's accommodation of the network.  (These are *not*
       application metrics; they are defined at the IP layer.)

   This memo considers how these different points-of-view affect both
   the measurement design (parameters and options of the metrics) and
   the statistics reported to serve the audiences' needs.

   The IPPM framework [RFC2330] and other RFCs describing IPPM metrics
   provide a background for this memo.

2.  Purpose and Scope

   The purpose of this memo is to clearly delineate two points-of-view
   (POV) for using measurements, and to describe their effects on the
   test design, including the selection of metric parameters and the
   reporting of results.

   The scope of this memo primarily covers the design and reporting of
   the loss and delay metrics [RFC2680] [RFC2679].  It also discusses
   the delay variation [RFC3393] and reordering metrics [RFC4737] where
   applicable.

   With capacity metrics growing in relevance to the industry, the memo
   also covers POV and reporting considerations for metrics resulting
   from the Bulk Transfer Capacity Framework [RFC3148] and Network
   Capacity Definitions [RFC5136].
   These memos effectively describe two different categories of
   metrics:

   o  [RFC3148] includes the restrictions of congestion control and the
      notion of unique data bits delivered, and

   o  [RFC5136] uses a definition of raw capacity, without the
      restrictions of data uniqueness or congestion awareness.

   It might seem at first glance that each of these metrics has an
   obvious audience (Raw = Network Characterization, Restricted =
   Application Performance), but reality is more complex and consistent
   with the overall topic of capacity measurement and reporting.  For
   example, TCP is usually used in Restricted capacity measurement
   methods, while UDP appears in Raw capacity measurement.  The Raw and
   Restricted capacity metrics are treated in separate sections,
   although they share one common reporting issue: representing
   variability in capacity metric results as part of a long-term
   report.

   Sampling, or the design of the active packet stream that is the
   basis for the measurements, is also discussed.

3.  Reporting Results

   This section gives an overview of recommendations, followed by
   additional considerations for reporting results in the "long-term",
   based on the discussion and conclusions of the major sections that
   follow.

3.1.  Overview of Metric Statistics

   This section gives an overview of reporting recommendations for the
   loss, delay, and delay variation metrics.

   The minimal report on measurements MUST include both Loss and Delay
   Metrics.

   For Packet Loss, the loss ratio defined in [RFC2680] is a sufficient
   starting point, especially the existing guidance for setting the
   loss threshold waiting time.  In Section 4.1.1 we calculate a
   waiting time that should be sufficient to differentiate between
   packets that are truly lost and packets that have long finite delays
   under general measurement circumstances: 51 seconds.  Knowledge of
   specific conditions can help to reduce this threshold, but 51
   seconds is considered to be manageable in practice.

   We note that a loss ratio calculated according to [Y.1540] would
   exclude errored packets from the numerator.  In practice, the
   difference between these two loss metrics is small, if any,
   depending on whether the last link prior to the destination
   contributes errored packets.

   For Packet Delay, we recommend providing both the mean delay and the
   median delay, with lost packets designated undefined (as permitted
   by [RFC2679]).  Both statistics are based on a conditional
   distribution, and the condition is packet arrival prior to a waiting
   time dT, where dT has been set to take maximum packet lifetimes into
   account, as discussed above for loss.  Using a long dT helps to
   ensure that delay distributions are not truncated.

   For Packet Delay Variation (PDV), the minimum delay of the
   conditional distribution should be used as the reference delay for
   computing PDV according to [Y.1540] or [RFC5481] and [RFC3393].  A
   useful value to report is a pseudo range of delay variation based on
   calculating the difference between a high percentile of delay and
   the minimum delay.  For example, the 99.9%-ile minus the minimum
   will give a value that can be compared with objectives in [Y.1541].
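   As an illustration (ours, not part of the original memo), the
   recommended delay statistics can be sketched in a few lines of
   Python.  The function name and input format are assumptions:
   singletons are one-way delays in seconds, with lost packets recorded
   as None (undefined delay), so every statistic below is conditioned
   on arrival.

      # Sketch: conditional delay statistics and the PDV pseudo range.
      import statistics

      def delay_report(singletons, high_quantile=0.999):
          # condition on arrival: drop packets with undefined delay
          arrived = sorted(d for d in singletons if d is not None)
          if not arrived:
              return None
          idx = min(len(arrived) - 1, int(high_quantile * len(arrived)))
          return {
              "loss_ratio": 1 - len(arrived) / len(singletons),
              "mean_delay": statistics.mean(arrived),
              "median_delay": statistics.median(arrived),
              # pseudo range of PDV: high percentile minus minimum
              "pdv_pseudo_range": arrived[idx] - arrived[0],
          }

   Comparing "mean_delay" and "median_delay" from such a report also
   supports the symmetry comparison discussed in Section 5.2.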
   For Capacity, both Raw and Restricted, reporting the variability in
   a useful way is identified as the main challenge.  The Min, Max, and
   Range statistics are suggested, along with the ratio of Max to Min
   and moving averages.  In the end, a simple plot of the singleton
   results over time may succeed where summary metrics fail, or serve
   to confirm that the summaries are valid.

3.2.  Long-Term Reporting Considerations

   [I-D.ietf-ippm-reporting] describes methods to conduct measurements
   and report the results on a near-immediate time scale (10 seconds,
   which we consider to be "short-term").

   Measurement intervals and reporting intervals need not be the same
   length.  Sometimes, the user is only concerned with the performance
   levels achieved over a relatively long interval of time (e.g., days,
   weeks, or months, as opposed to 10 seconds).  However, there can be
   risks involved with running a measurement continuously over a long
   period without recording intermediate results:

   o  Temporary power failure may cause loss of all the results to
      date.

   o  Measurement system timing synchronization signals may experience
      a temporary outage, causing subsets of measurements to be in
      error or invalid.

   o  Maintenance may be necessary on the measurement system, or on its
      connectivity to the network under test.

   For these and other reasons, such as

   o  the constraint to collect measurements on intervals similar to
      user session length, or

   o  the dual use of measurements in monitoring activities where
      results are needed on a period of a few minutes,

   there is value in conducting measurements on intervals that are much
   shorter than the reporting interval.

   There are several approaches for aggregating a series of measurement
   results over time in order to make a statement about the longer
   reporting interval.  One approach requires the storage of all metric
   singletons collected throughout the reporting interval, even though
   the measurement interval stops and starts many times.

   Another approach is described in [RFC5835] as "temporal
   aggregation".  This approach would estimate the results for the
   reporting interval based on many individual measurement interval
   statistics (results) alone.  The result would ideally appear in the
   same form as though a continuous measurement had been conducted.  A
   memo to address the details of temporal aggregation is yet to be
   prepared.

   Yet another approach requires a numerical objective for the metric,
   and the results of each measurement interval are compared with the
   objective.  Every measurement interval where the results meet the
   objective contributes to the fraction of time with performance as
   specified.  When the reporting interval contains many measurement
   intervals, it is possible to present the results as "metric A was
   less than or equal to objective X during Y% of time."

   NOTE that numerical thresholds of acceptability are not set in IETF
   performance work and are explicitly excluded from the IPPM charter.

   In all measurement, it is important to avoid unintended
   synchronization with network events.  This topic is treated in
   [RFC2330] for Poisson-distributed inter-packet time streams, and in
   [RFC3432] for Periodic streams.  Both avoid synchronization through
   the use of random start times.
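   The objective-comparison approach above reduces to a simple
   computation.  The following Python sketch is ours, not the memo's;
   the objective X is supplied by the user, since numerical thresholds
   are outside the IPPM charter as noted above.

      # Sketch: fraction of measurement intervals meeting an objective.
      def percent_meeting_objective(interval_results, objective):
          meeting = sum(1 for r in interval_results if r <= objective)
          return 100.0 * meeting / len(interval_results)

      # e.g., "loss ratio was <= 0.001 during Y% of time":
      # y = percent_meeting_objective(per_interval_loss_ratios, 0.001)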
   There are network conditions where it is simply more useful to
   report the connectivity status of the Source-Destination path, and
   to distinguish time intervals where connectivity can be demonstrated
   from other time intervals (where connectivity does not appear to
   exist).  [RFC2678] specifies a number of one-way and two-way
   connectivity metrics of increasing complexity.  In this memo, we
   RECOMMEND that long-term reporting of loss, delay, and other metrics
   be limited to time intervals where connectivity can be demonstrated,
   and that other intervals be summarized as the percent of time where
   connectivity does not appear to exist.  We note that this same
   approach has been adopted in ITU-T Recommendation [Y.1540], where
   performance parameters are only valid during periods of service
   "availability" (availability is evaluated according to a function
   based on packet loss, and sustained periods of loss ratio greater
   than a threshold are declared "unavailable").

4.  Effect of POV on the Loss Metric

   This section describes the ways in which the Loss metric can be
   tuned to reflect the preferences of the two audience categories, or
   different POV.  The waiting time to declare a packet lost, or loss
   threshold, is one area where there would appear to be a difference,
   but the ability to post-process the results may resolve it.

4.1.  Loss Threshold

   RFC 2680 [RFC2680] defines the concept of a waiting time for packets
   to arrive, beyond which they are declared lost.  The text of the RFC
   declines to recommend a value, instead saying that "good
   engineering, including an understanding of packet lifetimes, will be
   needed in practice."  Later, in the methodology, reasons are given
   for waiting "a reasonable period of time", with the definition of
   "reasonable" left intentionally vague.

4.1.1.  Network Characterization

   Practical measurement experience has shown that unusual network
   circumstances can cause long delays.  One such circumstance is when
   routing loops form during IGP re-convergence following a failure or
   drastic link cost change.  Packets will loop between two routers
   until new routes are installed, or until the IPv4 Time-to-Live (TTL)
   field (or the IPv6 Hop Limit) decrements to zero.  Very long delays
   on the order of several seconds have been measured [Casner] [Cia03].

   Therefore, network characterization activities prefer a long waiting
   time in order to distinguish these events from other causes of loss
   (such as packet discard at a full queue, or tail drop).  This way,
   the metric design helps to distinguish more reliably between packets
   that might yet arrive and those that are no longer traversing the
   network.

   It is possible to calculate a worst-case waiting time, assuming that
   a routing loop is the cause.  We model the path between Source and
   Destination as a series of delays in links (t) and queues (q), as
   these two are the dominant contributors to delay.
   The normal path delay across n hops without encountering a loop, D,
   is

                        n
                       ---
                       \
            D = t   +   >  (t  + q )
                 0     /     i    i
                       ---
                      i = 1

                     Figure 1: Normal Path Delay

   and the time spent in the loop with L hops is

                     j + L-1
                       ---
                       \                              (TTL - n)
            R = C       >  (t  + q )    where C    = -----------
                 max   /     i    i            max        L
                       ---
                      i=j        where j is the hop number where
                                 the loop begins

               Figure 2: Delay due to Rotations in a Loop

   where C_max is the maximum number of times a packet can circle the
   loop, and where TTL is the packet's initial Time-to-Live value at
   the source (or Hop Limit in IPv6).

   If we take the delays of all links and queues as 100 ms each, the
   TTL = 255, the number of hops n = 5, and the hops in the loop L = 4,
   then

      D = 1.1 sec and R ~= 50 sec, and D + R ~= 51.1 seconds

   We note that link delays of 100 ms would span most continents, and a
   constant queue length of 100 ms is also very generous.  When a loop
   occurs, it is almost certain to be resolved in 10 seconds or less.
   The value calculated above is an upper limit for almost any real-
   world circumstance.
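   As an illustration (ours, not the memo's), the calculation above is
   easy to reproduce in Python; the function name and defaults are
   illustrative, using the example values from the text:

      # Sketch: worst-case waiting time from Figures 1 and 2.
      # t, q: per-hop link and queue delays in seconds; n: path hops;
      # loop_hops: hops in the loop (L); ttl: initial Time-to-Live.
      def worst_case_waiting_time(t=0.1, q=0.1, ttl=255, n=5,
                                  loop_hops=4):
          d = t + n * (t + q)              # Figure 1: normal path delay
          c_max = (ttl - n) / loop_hops    # max rotations in the loop
          r = c_max * loop_hops * (t + q)  # Figure 2: loop delay
          return d + r

      print(worst_case_waiting_time())     # ~51.1 seconds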
   A waiting time threshold parameter, dT, set consistent with this
   calculation would not truncate the delay distribution (possibly
   causing a change in its mathematical properties), because the
   packets that might arrive have been given sufficient time to
   traverse the network.

   It is worth noting that packets that are stored and deliberately
   forwarded at a much later time constitute a replay attack on the
   measurement system, and are beyond the scope of normal performance
   reporting.

4.1.2.  Application Performance

   Fortunately, application performance estimation activities are not
   adversely affected by the estimated worst-case transfer time.
   Although the designer's tendency might be to set the Loss Threshold
   at a value equivalent to a particular application's threshold, this
   specific threshold can be applied when post-processing the
   measurements.  A shorter waiting time can be enforced by locating
   packets with delays longer than the application's threshold, and
   re-designating such packets as lost.  Thus, the measurement system
   can use a single loss waiting time and support both application and
   network performance POVs simultaneously.

4.2.  Errored Packet Designation

   RFC 2680 designates packets that arrive containing errors as lost
   packets.  Many packets that are corrupted by bit errors are
   discarded within the network and do not reach their intended
   destination.

   This is consistent with applications that would check the payload
   integrity at higher layers and discard the packet.  However, some
   applications prefer to deal with errored payloads on their own; for
   them, even a corrupted payload is better than no packet at all.

   To address this possibility, and to make network characterization
   more complete, it is recommended to distinguish between packets that
   do not arrive (lost) and errored packets that arrive (conditionally
   lost).

4.3.  Causes of Lost Packets

   Although many measurement systems use a waiting time to determine if
   a packet is lost or not, most of the waiting is in vain.  The
   packets are no longer traversing the network and have not reached
   their destination.

   There are many causes of packet loss, including:

   1.  Queue drop, or discard

   2.  Corruption of the IP header, or other essential header
       information

   3.  TTL expiration (or use of a TTL value that is too small)

   4.  Link or router failure

   5.  Discard at layers below the source-to-destination IP layer,
       because such layers can discard packets that fail error
       checking, and link-layer checksums often cover the entire packet

   After waiting sufficient time, packet loss can probably be
   attributed to one of these causes.

4.4.  Summary for Loss

   Given that measurement post-processing is possible (even encouraged
   in the definitions of IPPM metrics), measurements of loss can easily
   serve both points of view:

   o  Use a long waiting time to serve network characterization, and
      revise results for specific application delay thresholds as
      needed.

   o  Distinguish between errored packets and lost packets when
      possible to aid network characterization, and combine the results
      for application performance if appropriate.

5.  Effect of POV on the Delay Metric

   This section describes the ways in which the Delay metric can be
   tuned to reflect the preferences of the two consumer categories, or
   different POV.

5.1.  Treatment of Lost Packets

   The Delay Metric [RFC2679] specifies the treatment of packets that
   do not successfully traverse the network: their delay is undefined.

      ">>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
      (informally, infinite)<< means that Src sent the first bit of a
      Type-P packet to Dst at wire-time T and that Dst did not receive
      that packet."

   It is an accepted, but informal, practice to assign infinite delay
   to lost packets.  We next look at how these two different treatments
   align with the needs of measurement consumers who wish to
   characterize networks or estimate application performance.  Also, we
   look at the way that lost packets have been treated in other
   metrics: delay variation and reordering.

5.1.1.  Application Performance

   Applications need to perform different functions, dependent on
   whether or not each packet arrives within some finite tolerance.  In
   other words, a receiver's packet processing takes one of two
   directions (or "forks" in the road):

   o  Packets that arrive within expected tolerance are handled by
      processes that remove headers, restore smooth delivery timing (as
      in a de-jitter buffer), restore sending order, check for errors
      in payloads, and perform many other operations.

   o  Packets that do not arrive when expected spawn other processes
      that attempt recovery from the apparent loss, such as
      retransmission requests, loss concealment, or forward error
      correction to replace the missing packet.

   So, it is important to maintain a distinction between packets that
   actually arrive and those that do not.  Therefore, it is preferable
   to leave the delay of lost packets undefined, and to characterize
   the delay distribution as a conditional distribution (conditioned on
   arrival).

5.1.2.  Network Characterization

   In this discussion, we assume that both loss and delay metrics will
   be reported for network characterization (at least).

   Assume that packets which do not arrive are reported as Lost,
   usually as a fraction of all sent packets.  If these lost packets
   are assigned undefined delay, then the network's inability to
   deliver them (in a timely way) is relegated to the Loss metric alone
   when we report statistics on the Delay distribution conditioned on
   the event of packet arrival (within the Loss waiting time
   threshold).  We can say that the Delay and Loss metrics are
   orthogonal, in that they convey non-overlapping information about
   the network under test.  This is a valuable property, whose absence
   is discussed below.
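   The contrast can be made concrete with a small Python sketch (ours,
   with illustrative delay values): a single lost packet assigned
   infinite delay drives the unconditional mean to infinity, while the
   conditional mean (delay conditioned on arrival) remains informative.

      # Sketch: infinite vs. undefined delay for lost packets.
      import math

      sample = [0.021, 0.023, 0.022, None, 0.025]  # None = lost

      as_inf = [math.inf if d is None else d for d in sample]
      print(sum(as_inf) / len(as_inf))     # inf: loss leaks into delay

      arrived = [d for d in sample if d is not None]
      print(sum(arrived) / len(arrived))   # 0.02275: delay alone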
   However, if we assign infinite delay to all lost packets, then:

   o  The delay metric results are influenced both by packets that
      arrive and by those that do not.

   o  The delay singleton and the loss singleton do not appear to be
      orthogonal (Delay is finite when Loss=0, Delay is infinite when
      Loss=1).

   o  The network is penalized in both the loss and delay metrics,
      effectively double-counting the lost packets.

   As further evidence of overlap, consider the Cumulative Distribution
   Function (CDF) of Delay when the value positive infinity is assigned
   to all lost packets.  Figure 3 shows a CDF where a small fraction of
   packets are lost.

      1 | - - - - - - - - - - - - - - - - - -+
        |                                    |
        |        _..----''''''''''''''''''''
        |     ,-''
        |    ,'
        |   /                       Mass at
        |  /                      +infinity
        | /                      = fraction
        ||                             lost
        |/
      0 |_____________________________________

        0                Delay             +oo

        Figure 3: Cumulative Distribution Function for Delay when
                            Loss = +Infinity

   We note that a Delay CDF that is conditioned on packet arrival would
   not exhibit this apparent overlap with loss.

   Although infinity is a familiar mathematical concept, it is somewhat
   disconcerting to see any time-related metric reported as infinity,
   in the opinion of the authors.  Questions are bound to arise, and
   they tend to detract from the goal of informing the consumer with a
   performance report.

5.1.3.  Delay Variation

   [RFC3393] excludes lost packets from samples, effectively assigning
   an undefined delay to packets that do not arrive in a reasonable
   time.  Section 4.1 of [RFC3393] describes this specification and its
   rationale (ipdv = inter-packet delay variation in the quote below).

      "The treatment of lost packets as having "infinite" or
      "undefined" delay complicates the derivation of statistics for
      ipdv.  Specifically, when packets in the measurement sequence are
      lost, simple statistics such as sample mean cannot be computed.
      One possible approach to handling this problem is to reduce the
      event space by conditioning.  That is, we consider conditional
      statistics; namely we estimate the mean ipdv (or other derivative
      statistic) conditioned on the event that selected packet pairs
      arrive at the destination (within the given timeout).  While this
      itself is not without problems (what happens, for example, when
      every other packet is lost), it offers a way to make some (valid)
      statements about ipdv, at the same time avoiding events with
      undefined outcomes."

   We note that the argument above applies to all forms of packet delay
   variation that can be constructed using the "selection function"
   concept of [RFC3393].  In recent work, the two main forms of delay
   variation metrics have been compared, and the results are summarized
   in [RFC5481].

5.1.4.  Reordering

   [RFC4737] defines metrics that are based on evaluation of packet
   arrival order, and it includes a waiting time to declare a packet
   lost (to exclude such packets from further processing).

   If lost packets were assigned infinite delay instead, then the
   reordering metric would declare any such packets to be reordered,
   because their sequence numbers will surely be less than the "Next
   Expected" threshold when (or if) they arrive.
   But this practice would fail to maintain orthogonality between the
   reordering metric and the loss metric.  Confusion can be avoided by
   designating the delay of non-arriving packets as undefined, and by
   reserving delay values only for packets that arrive within a
   sufficiently long waiting time.

5.2.  Preferred Statistics

   Today, in network characterization, the sample mean is one statistic
   that is almost ubiquitously reported.  It is easily computed and
   understood by virtually everyone in this audience category.  Also,
   the sample is usually filtered on packet arrival, so that the mean
   is based on a conditional distribution.

   The median is another statistic that summarizes a distribution,
   having somewhat different properties from the sample mean.  The
   median is stable in distributions with or without a few outliers.
   However, the median's stability prevents it from indicating when a
   large fraction of the distribution changes value.  50% or more of
   the values would need to change for the median to capture the
   change.

   Both the median and the sample mean have difficulty with bimodal
   distributions.  The median will reside in only one of the modes, and
   the mean may not lie in either mode range.  For this and other
   reasons, additional statistics such as the minimum, maximum, and
   95%-ile have value when summarizing a distribution.

   When both the sample mean and the median are available, a comparison
   will sometimes be informative, because these two statistics are
   equal only when the delay distribution is perfectly symmetrical.

   Also, these statistics are generally useful from the Application
   Performance POV, so there is a common set that should satisfy both
   audiences.

   Plots of the delay distribution may also be useful when single-value
   statistics indicate that new conditions are present.  An
   empirically derived probability distribution function will usually
   describe multiple modes more efficiently than any other form of
   result.

5.3.  Summary for Delay

   From the perspectives of:

   1.  application/receiver analysis, where subsequent processing
       depends on whether the packet arrives or times out,

   2.  straightforward network characterization without double-counting
       defects, and

   3.  consistency with the Delay Variation and Reordering metric
       definitions,

   the most efficient practice is to distinguish between truly lost and
   delayed packets with a sufficiently long waiting time, and to
   designate the delay of non-arriving packets as undefined.

6.  Reporting Raw Capacity Metrics

   Raw capacity refers to the metrics defined in [RFC5136], which do
   not include restrictions such as data uniqueness or flow-control
   response to congestion.

   The metrics considered are IP-layer Capacity, Utilization (or used
   capacity), and Available Capacity, for individual links and complete
   paths.  These three metrics form a triad: knowing one metric
   constrains the other two (within their allowed range), and knowing
   two determines the third.  The link metrics have another key aspect
   in common: they are single-measurement-point metrics at the egress
   of a link.  The path Capacity and Available Capacity are derived by
   examining the set of single-point link measurements and taking the
   minimum value.
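   A minimal sketch (ours, with illustrative numbers and names) of the
   triad relationship and the min-over-links path derivation:

      # Sketch: capacity triad and path metrics for one interval.
      def available(capacity, utilization):
          # knowing two of the triad determines the third
          return capacity - utilization

      links = [              # (capacity, used capacity) in Mbit/s
          (1000.0, 620.0),
          (10000.0, 100.0),
          (1000.0, 850.0),
      ]

      path_capacity = min(c for c, u in links)
      path_available = min(available(c, u) for c, u in links)
      print(path_capacity, path_available)   # 1000.0 150.0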
6.1.  Type-P Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   type-P categorization has critical relevance in all forms of
   capacity measurement and reporting.  The ability to categorize
   packets based on header fields, for assignment to different queues
   and scheduling mechanisms, is now commonplace.  When unused
   resources are shared across queues, the conditions in all packet
   categories will affect capacity and related measurements.  This is
   one source of variability in the results that all audiences would
   prefer to see reported in a useful and easily understood way.

   Type-P in OWAMP and TWAMP is essentially confined to the
   Differentiated Services Code Point (DSCP) [RFC4656].  The DSCP is
   the most common qualifier for type-P.

   Each audience will have a set of type-P qualifications and value
   combinations that are of interest.  Measurements and reports SHOULD
   have the flexibility to report both per-type and aggregate
   performance.

6.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more of the raw capacity metrics.  This scenario is quite
   common, and it has spawned a substantial number of experimental
   measurement methods (e.g., http://www.caida.org/tools/taxonomy/ ).
   Many of these methods respect that their users want a result fairly
   quickly and in one trial.  Thus, the measurement interval is kept
   short (a few seconds to a minute).  For long-term reporting, a
   sample of short-term results needs to be summarized.

6.3.  IP-layer Capacity

   For links, this metric's theoretical maximum value can be determined
   from the physical-layer bit rate and the bit rate reduction due to
   the layers between the physical layer and IP.  When measured, this
   metric takes additional factors into account, such as the ability of
   the sending device to process and forward traffic under various
   conditions.  For example, the arrival of routing updates may spawn
   high-priority processes that reduce the sending rate temporarily.
   Thus, the measured capacity of a link will be variable, and the
   maximum capacity observed applies to a specific time, time interval,
   and other relevant circumstances.

   For paths composed of a series of links, it is easy to see how the
   sources of variability for the results grow with each link in the
   path.  Results variability will be discussed in more detail below.

6.4.  IP-layer Utilization

   The ideal metric definition of Link Utilization [RFC5136] is based
   on the actual usage (bits successfully received during a time
   interval) and the Maximum Capacity for the same interval.

   In practice, Link Utilization can be calculated by counting the
   IP-layer (or other layer) octets received over a time interval and
   dividing by the theoretical maximum of octets that could have been
   delivered in the same interval.  A commonly used time interval is 5
   minutes, and this interval has been sufficient to support network
   operations and design for some time.  Five minutes is somewhat long
   compared with the expected download time for web pages, but short
   with respect to large file transfers and TV program viewing.  It is
   fair to say that considerable variability is concealed by reporting
   a single (average) Utilization value for each 5-minute interval.
   Some performance management systems have begun to make 1-minute
   averages available.
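   The practical calculation above amounts to a one-line formula; the
   following Python sketch (ours, with an assumed link rate and octet
   counter) makes the units explicit:

      # Sketch: utilization = octets received / theoretical maximum.
      def utilization(octets_received, interval_seconds, link_rate_bps):
          max_octets = link_rate_bps * interval_seconds / 8
          return octets_received / max_octets

      # 5-minute interval on a 100 Mbit/s link:
      print(utilization(9.0e8, 300, 100e6))   # 0.24, i.e., 24%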
   There is also a limit on the smallest useful measurement interval.
   Intervals on the order of the serialization time for a single
   Maximum Transmission Unit (MTU) packet will observe on/off behavior
   and report 100% or 0%.  The smallest interval needs to be some
   multiple of the MTU serialization time for averaging to be
   effective.

6.5.  IP-layer Available Capacity

   The Available Capacity of a link can be calculated using the
   Capacity and Utilization metrics.

   When the Available Capacity of a link or path is estimated through
   some measurement technique, the following parameters SHOULD be
   reported:

   o  Name of, and reference to, the exact method of measurement

   o  IP packet length, in octets (including the IP header)

   o  Maximum Capacity that can be assessed in the measurement
      configuration

   o  The time and duration of the measurement

   o  All other parameters specific to the measurement method

   Many methods of Available Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the actual Available Capacity of the link or path.  Therefore, it is
   important to know the capacity value beyond which there will be no
   measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient
   Available Capacity.  This case simplifies measurement of link and
   path capacity to some degree, as long as the measurable maximum
   exceeds the target capacity.

6.6.  Variability in Utilization and Avail. Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree (or confidence) that any one value is representative of other
   results, or of the spread of the underlying distribution of the
   singleton measurements.

   How can Utilization be measured and summarized to describe the
   potential variability in a useful way?

   How can the variability in Available Capacity estimates be reported,
   so that the confidence in the results is also conveyed?

   We suggest some methods below.

6.6.1.  General Summary of Variability

   With a set of singleton Utilization or Available Capacity estimates,
   each representing a time interval needed to ascertain the estimate,
   we seek to describe the variation over the set of singletons, as
   though reporting summary statistics of a distribution.  Three useful
   summary statistics are:

   o  Minimum,

   o  Maximum, and

   o  Range.

   An alternate way to represent the Range is as the ratio of the
   Maximum to the Minimum value.  This yields an easily understandable
   statistic to describe the range observed.  For example, when Maximum
   = 3*Minimum, then the Max/Min Ratio is 3, and users may see
   variability of this order.  On the other hand, Capacity estimates
   with a Max/Min Ratio near 1 are quite consistent and near the
   central measure or statistic reported.

   For an ongoing series of singleton estimates, a moving average of n
   estimates may provide a single-value estimate to more easily
   distinguish substantial changes in performance over time.  For
   example, in a window of n singletons observed in a time interval t,
   a percentage change of x% is declared to be a substantial change and
   reported as an exception.
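   A minimal sketch (ours; the window size n and threshold x are
   illustrative parameters, not values from the memo) of these summary
   statistics and the moving-average exception test:

      # Sketch: variability summary and moving-average exceptions.
      def variability_summary(estimates):
          lo, hi = min(estimates), max(estimates)
          return {"min": lo, "max": hi, "range": hi - lo,
                  "max_min_ratio": hi / lo}

      def exceptions(estimates, n=5, x=20.0):
          # flag windows whose n-point average changes by more than x%
          # relative to the previous window's average
          out, prev = [], None
          for i in range(n, len(estimates) + 1):
              avg = sum(estimates[i - n:i]) / n
              if prev is not None and abs(avg - prev) / prev * 100.0 > x:
                  out.append(i - 1)  # index of last singleton in window
              prev = avg
          return out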
   Often, the most informative summary of the results is a two-axis
   plot rather than a table of statistics, where time is plotted on the
   x-axis and the singleton value on the y-axis.  The time-series plot
   can illustrate sudden changes in an otherwise stable range, identify
   bi-modality easily, and help to quickly assess correlation with
   other time series.  Plots of the frequency of the singleton values
   are likewise useful tools to visualize the variation.

7.  Reporting Restricted Capacity Metrics

   Restricted capacity refers to the metrics defined in [RFC3148],
   which include criteria of data uniqueness or flow-control response
   to congestion.

   The primary metric considered is Bulk Transfer Capacity (BTC) for
   complete paths.  [RFC3148] defines

      BTC = data_sent / elapsed_time

   for a connection with congestion-aware flow control, where data_sent
   is the total of unique payload bits (no headers).

   We note that this definition *differs* from the raw capacity
   definition in Section 2.3.1 of [RFC5136], where IP-layer Capacity
   *includes* all bits in the IP header and payload.  This means that
   Restricted Capacity BTC is already operating at a disadvantage when
   compared to the raw capacity at layers below TCP.  Further, there
   are cases where one IP layer is encapsulated in another IP layer or
   in another form of tunneling protocol, designating more and more of
   the fundamental transport capacity as header bits that are pure
   overhead to the BTC measurement.

   We also note that the Raw and Restricted Capacity metrics are not
   orthogonal in the sense defined in Section 5.1.2 above.  The
   information they convey about the network under test is certainly
   overlapping, but they reveal two different and important aspects of
   performance.

   When thinking about the triad of raw capacity metrics, BTC is most
   akin to the "IP-Type-P Available Path Capacity", at least in the
   eyes of a network user who seeks to know what transmission
   performance a path might support.

7.1.  Type-P Parameter and Type-C Parameter

   The concept of "packets of type-P" is defined in [RFC2330].  The
   considerations for Restricted Capacity are identical to those in the
   raw capacity section on this topic, with the addition that the
   various fields and options in the TCP header MUST be included in the
   description.

   The vast array of TCP flow-control options is not well captured by
   Type-P, because these options do not appear in the TCP header bits.
   Therefore, we introduce a new notion here: the TCP Configuration of
   "Type-C".  The elements of Type-C describe all of the settings for
   TCP options and congestion control algorithm variables, including
   the main form of congestion control in use.

7.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link), and wishes to measure
   one or more BTC metrics.  The discussion of Section 6.2 applies here
   as well.
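   A minimal sketch (ours, with illustrative header sizes and traffic
   figures) of the BTC definition in Section 7 above, contrasted with a
   raw IP-layer figure that counts header bits as well:

      # Sketch: BTC per [RFC3148] vs. an IP-layer bit count.
      def btc(unique_payload_bits, elapsed_seconds):
          # unique data bits delivered, headers excluded
          return unique_payload_bits / elapsed_seconds

      def ip_layer_bits(payload_bits, packets, header_octets=40):
          # raw capacity counts IP header bits too (40 = IPv4 + TCP
          # without options; illustrative only)
          return payload_bits + packets * header_octets * 8

      payload = 1.0e9                              # unique payload bits
      print(btc(payload, 10.0))                    # 1.0e8 bit/s
      print(ip_layer_bits(payload, 86000) / 10.0)  # higher, w/ headers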
7.3.  Measurement Interval

   There are limits on a useful measurement interval for BTC.  Three
   factors that influence the interval duration are listed below:

   1.  Measurements may choose to include or exclude the three-way
       handshake of TCP connection establishment, which requires at
       least 1.5 * RTT and contains both the delay of the path and the
       host processing time for responses.  However, user experience
       includes the three-way handshake for all new TCP connections.

   2.  Measurements may choose to include or exclude Slow-Start,
       preferring instead to focus on a portion of the transfer that
       represents "equilibrium" (which needs to be defined for
       particular circumstances if used).  However, user experience
       includes Slow-Start for all new TCP connections.

   3.  Measurements may choose to use a fixed block of data to
       transfer, where the size of the block has a relationship to the
       file size of the application of interest.  This approach yields
       variable-size measurement intervals, where a path with faster
       BTC is measured for less time than a path with slower BTC, and
       this has implications when path impairments are time-varying or
       transient.  Users are likely to turn their immediate attention
       elsewhere when a very large file must be transferred; thus, they
       do not directly experience such a long transfer -- they see the
       result (success or failure) and possibly an objective
       measurement of the transfer time (which will likely include the
       three-way handshake, Slow-Start, and application file management
       processing time, as well as the BTC).

   Individual measurement intervals may be short or long, but there is
   a need to report the results on a long-term basis that captures the
   BTC variability experienced between each interval.  Consistent BTC
   is a valuable commodity, along with the value attained.

7.4.  Bulk Transfer Capacity Reporting

   When the BTC of a link or path is estimated through some measurement
   technique, the following parameters SHOULD be reported:

   o  Name of, and reference to, the exact method of measurement

   o  Maximum Transmission Unit (MTU)

   o  Maximum BTC that can be assessed in the measurement configuration

   o  The time and duration of the measurement

   o  The number of BTC connections used simultaneously

   o  *All* other parameters specific to the measurement method,
      especially the congestion control algorithm in use

   See also [RFC6349].

   Many methods of Bulk Transfer Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the available capacity of the link or path.  Therefore, it is
   important to specify the measured BTC value beyond which there will
   be no measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient BTC.
   This case simplifies measurement of link and path capacity to some
   degree, as long as the measurable maximum exceeds the target
   capacity.

7.5.  Variability in Bulk Transfer Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree (or confidence) that any one value is representative of other
   results, or of the underlying distribution from which these
   singleton measurements have come.

   With two questions looming:

   1.  How can BTC be measured and summarized to describe the potential
       variability in a useful way?

   2.  How can the variability in BTC estimates be reported, so that
       the confidence in the results is also conveyed?
   we suggest the methods of Section 6.6.1 above, and the additional
   results presentations given in [RFC6349].

8.  Reporting on Test Streams and Sample Size

   This section discusses two key aspects of measurement that are
   sometimes omitted from the report: the description of the test
   stream on which the measurements are based, and the sample size.

8.1.  Test Stream Characteristics

   Network Characterization has traditionally used Poisson-distributed
   inter-packet spacing, as this provides an unbiased sample.  The
   average inter-packet spacing may be selected to allow observation of
   specific network phenomena.  Other test streams are designed to
   sample some property of the network, such as the presence of
   congestion, link bandwidth, or packet reordering.

   If measuring a network in order to make inferences about
   applications or receiver performance, then there are usually
   efficiencies derived from a test stream with characteristics similar
   to the sender's.  In some cases, it is essential to synthesize the
   sender stream, as with Bulk Transfer Capacity estimates.  In other
   cases, it may be sufficient to sample with a "known bias", e.g., a
   Periodic stream to estimate real-time application performance.

8.2.  Sample Size

   Sample size is directly related to the accuracy of the results, and
   it plays a critical role in the report.  Even if only the sample
   size (in terms of number of packets) is given for each value or
   summary statistic, it imparts a notion of the confidence in the
   result.

   In practice, the sample size will be selected taking both
   statistical and practical factors into account.  Among these factors
   are:

   1.  The estimated variability of the quantity being measured.

   2.  The desired confidence in the result (although this may depend
       on assumptions about the underlying distribution of the measured
       quantity).

   3.  The effects of active measurement traffic on user traffic.

   A sample size may sometimes be referred to as "large".  This is a
   relative and qualitative term.  It is preferable to describe what
   one is attempting to achieve with the sample.  For example, stating
   an implication may be helpful: this sample is large enough such that
   a single outlying value at ten times the "typical" sample mean (the
   mean without the outlying value) would influence the mean by no more
   than X.

9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

10.  Security Considerations

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656].

11.  Acknowledgements

   The authors thank: Phil Chimento for his suggestion to employ
   conditional distributions for Delay, Steve Konish Jr. for his
   careful review and suggestions, Dave McDysan and Don McLachlan for
   useful comments based on their long experience with measurement and
   reporting, Daniel Genin for his observation of non-orthogonality
   between the Raw and Restricted Capacity metrics (and our omission of
   this fact), and Matt Zekauskas for suggestions on organizing the
   memo for easier consumption.

12.  References
12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams",
              RFC 3432, November 2002.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
              Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5136]  Chimento, P. and J. Ishac, "Defining Network Capacity",
              RFC 5136, February 2008.

12.2.  Informative References

   [Casner]   "A Fine-Grained View of High Performance Networking",
              NANOG 22 Conf., May 20-22, 2001,
              http://www.nanog.org/mtg-0105/agenda.html.

   [Cia03]    "Standardized Active Measurements on a Tier 1 IP
              Backbone", IEEE Communications Magazine, pp. 90-97,
              June 2003.

   [I-D.ietf-ippm-reporting]
              Shalunov, S. and M. Swany, "Reporting IP Performance
              Metrics to Users", draft-ietf-ippm-reporting-06 (work in
              progress), March 2011.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
              Composition", RFC 5835, April 2010.

   [RFC6349]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
              "Framework for TCP Throughput Testing", RFC 6349,
              August 2011.

   [Y.1540]   ITU-T Recommendation Y.1540, "Internet protocol data
              communication service - IP packet transfer and
              availability performance parameters", December 2011.

   [Y.1541]   ITU-T Recommendation Y.1541, "Network Performance
              Objectives for IP-Based Services", February 2011.

Authors' Addresses

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/

   Gomathi Ramachandran
   AT&T Labs
   200 Laurel Avenue South
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2353
   Email: gomathi@att.com

   Ganga Maguluri
   AT&T Labs
   200 Laurel Avenue
   Middletown, New Jersey  07748
   USA

   Phone: 732-420-2486
   Email: gmaguluri@att.com