Network Working Group                                          A. Morton
Internet-Draft                                           G. Ramachandran
Intended status: Informational                               G. Maguluri
Expires: November 11, 2012                                     AT&T Labs
                                                             May 10, 2012

   Reporting IP Network Performance Metrics: Different Points of View
                  draft-ietf-ippm-reporting-metrics-09

Abstract

   Consumers of IP network performance metrics have many different uses
   in mind.  This memo provides "long-term" reporting considerations
   (e.g., hours, days, weeks, or months, as opposed to 10 seconds),
   based on analysis of the two key audience points of view.  It
   describes how the audience categories affect the selection of metric
   parameters and options when seeking information that serves their
   needs.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 11, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
   2. Purpose and Scope
   3. Reporting Results
      3.1. Overview of Metric Statistics
      3.2. Long-Term Reporting Considerations
   4. Effect of POV on the Loss Metric
      4.1. Loss Threshold
           4.1.1. Network Characterization
           4.1.2. Application Performance
      4.2. Errored Packet Designation
      4.3. Causes of Lost Packets
      4.4. Summary for Loss
   5. Effect of POV on the Delay Metric
      5.1. Treatment of Lost Packets
           5.1.1. Application Performance
           5.1.2. Network Characterization
           5.1.3. Delay Variation
           5.1.4. Reordering
      5.2. Preferred Statistics
      5.3. Summary for Delay
   6. Reporting Raw Capacity Metrics
      6.1. Type-P Parameter
      6.2. A priori Factors
      6.3. IP-layer Capacity
      6.4. IP-layer Utilization
      6.5. IP-layer Available Capacity
      6.6. Variability in Utilization and Available Capacity
           6.6.1. General Summary of Variability
   7. Reporting Restricted Capacity Metrics
      7.1. Type-P Parameter and Type-C Parameter
      7.2. A priori Factors
      7.3. Measurement Interval
      7.4. Bulk Transfer Capacity Reporting
      7.5. Variability in Bulk Transfer Capacity
   8. Reporting on Test Streams and Sample Size
      8.1. Test Stream Characteristics
      8.2. Sample Size
   9. IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
       12.1. Normative References
       12.2. Informative References
   Authors' Addresses

1.  Introduction

   When designing measurements of IP networks and presenting a result,
   knowledge of the audience is a key consideration.  To present a
   useful and relevant portrait of network conditions, one must answer
   the following question:

   "How will the results be used?"

   There are two main audience categories for the report of results:

   1.  Network Characterization - describes conditions in an IP network
       for quality assurance, troubleshooting, modeling, Service Level
       Agreements (SLA), etc.  This point of view looks inward, toward
       the network, where the report consumer intends to act.

   2.  Application Performance Estimation - describes the network
       conditions in a way that facilitates determining effects on user
       applications, and ultimately on the users themselves.  This
       point of view looks outward, toward the user(s), accepting the
       network as-is.  This report consumer intends to estimate a
       network-dependent aspect of performance, or to design some
       aspect of an application's accommodation of the network.  (These
       are *not* application metrics; they are defined at the IP
       layer.)

   This memo considers how these different points of view affect both
   the measurement design (parameters and options of the metrics) and
   the statistics reported to serve their needs.

   The IPPM framework [RFC2330] and other RFCs describing IPPM metrics
   provide a background for this memo.

2.  Purpose and Scope

   The purpose of this memo is to clearly delineate two points of view
   (POV) for using measurements, and to describe their effects on the
   test design, including the selection of metric parameters and the
   reporting of results.

   The scope of this memo primarily covers the design and reporting of
   the loss and delay metrics [RFC2680] [RFC2679].  It also discusses
   the delay variation [RFC3393] and reordering metrics [RFC4737] where
   applicable.

   With capacity metrics growing in relevance to the industry, the memo
   also covers POV and reporting considerations for metrics resulting
   from the Bulk Transfer Capacity Framework [RFC3148] and Network
   Capacity Definitions [RFC5136].  These memos effectively describe
   two different categories of metrics:

   o  Restricted: [RFC3148] includes restrictions of congestion control
      and the notion of unique data bits delivered, and

   o  Raw: [RFC5136] uses a definition of raw capacity without the
      restrictions of data uniqueness or congestion awareness.
   It might seem at first glance that each of these metrics has an
   obvious audience (Raw = Network Characterization, Restricted =
   Application Performance), but reality is more complex, consistent
   with the overall topic of capacity measurement and reporting.  For
   example, TCP is usually used in Restricted capacity measurement
   methods, while UDP appears in Raw capacity measurement.  The Raw and
   Restricted capacity metrics are treated in separate sections,
   although they share one common reporting issue: representing
   variability in capacity metric results as part of a long-term
   report.

   Sampling, or the design of the active packet stream that is the
   basis for the measurements, is also discussed.

3.  Reporting Results

   This section gives an overview of recommendations, followed by
   additional considerations for reporting results in the "long term",
   based on the discussion and conclusions of the major sections that
   follow.

3.1.  Overview of Metric Statistics

   This section gives an overview of reporting recommendations for the
   loss, delay, and delay variation metrics.

   The minimal report on measurements must include both Loss and Delay
   Metrics.

   For Packet Loss, the loss ratio defined in [RFC2680] is a sufficient
   starting point, especially the existing guidance for setting the
   loss threshold waiting time.  In Section 4.1.1, we calculate a
   waiting time of 51 seconds that should be sufficient to
   differentiate between packets that are truly lost and packets with
   long finite delays under general measurement circumstances.
   Knowledge of specific conditions can help to reduce this threshold,
   and a waiting time of approximately 50 seconds is considered to be
   manageable in practice.

   We note that a loss ratio calculated according to [Y.1540] would
   exclude errored packets from the numerator.  In practice, the
   difference between these two loss metrics is small, if any,
   depending on whether the last link prior to the destination
   contributes errored packets.

   For Packet Delay, we recommend providing both the mean delay and the
   median delay, with lost packets designated undefined (as permitted
   by [RFC2679]).  Both statistics are based on a conditional
   distribution, where the condition is packet arrival prior to a
   waiting time dT, and dT has been set to take maximum packet
   lifetimes into account, as discussed above for loss.  Using a long
   dT helps to ensure that delay distributions are not truncated.

   For Packet Delay Variation (PDV), the minimum delay of the
   conditional distribution should be used as the reference delay for
   computing PDV according to [Y.1540] or [RFC5481] and [RFC3393].  A
   useful value to report is a pseudo range of delay variation, based
   on calculating the difference between a high percentile of delay and
   the minimum delay.  For example, the 99.9th percentile minus the
   minimum will give a value that can be compared with objectives in
   [Y.1541].

   For Capacity, both Raw and Restricted, reporting the variability in
   a useful way is identified as the main challenge.  The Min, Max, and
   Range statistics are suggested, along with a ratio of Max to Min and
   moving averages.  In the end, a simple plot of the singleton results
   over time may succeed where summary metrics fail, or serve to
   confirm that the summaries are valid.
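   As a non-normative illustration of these statistics, the following
   Python sketch computes the [RFC2680] loss ratio, the conditional
   mean and median delay, and a pseudo range of delay variation.  The
   sample values are hypothetical, and the sample maximum stands in for
   the 99.9th percentile only because the sample is tiny.

      import statistics

      # One-way delay singletons in seconds; None marks packets that
      # did not arrive within the waiting time dT (delay undefined).
      delays = [0.021, 0.023, None, 0.022, 0.250,
                0.021, None, 0.024, 0.022, 0.023]

      arrived = [d for d in delays if d is not None]  # conditional dist.
      loss_ratio = delays.count(None) / len(delays)   # RFC 2680 ratio

      mean_delay = statistics.mean(arrived)           # conditional mean
      median_delay = statistics.median(arrived)       # conditional median

      # Pseudo range of delay variation: a high percentile of delay
      # minus the minimum delay (max replaces the 99.9th percentile
      # here because there are only eight arrivals).
      pdv_pseudo_range = max(arrived) - min(arrived)

      print(loss_ratio, mean_delay, median_delay, pdv_pseudo_range)

   Note that the single long delay (0.250 s) pulls the conditional mean
   well above the median, an effect discussed further in Section 5.2.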
3.2.  Long-Term Reporting Considerations

   [I-D.ietf-ippm-reporting] describes methods to conduct measurements
   and report the results on a near-immediate time scale (10 seconds,
   which we consider to be "short-term").

   Measurement intervals and reporting intervals need not be the same
   length.  Sometimes, the user is only concerned with the performance
   levels achieved over a relatively long interval of time (e.g., days,
   weeks, or months, as opposed to 10 seconds).  However, there can be
   risks involved with running a measurement continuously over a long
   period without recording intermediate results:

   o  Temporary power failure may cause loss of all the results to
      date.

   o  Measurement system timing synchronization signals may experience
      a temporary outage, causing subsets of measurements to be in
      error or invalid.

   o  Maintenance may be necessary on the measurement system, or on its
      connectivity to the network under test.

   For these and other reasons, such as

   o  the constraint to collect measurements on intervals similar to
      user session length,

   o  the dual use of measurements in monitoring activities where
      results are needed over periods of a few minutes, or

   o  the ability to inspect the results of a single measurement
      interval for deeper analysis,

   there is value in conducting measurements on intervals that are much
   shorter than the reporting interval.

   There are several approaches for aggregating a series of measurement
   results over time in order to make a statement about the longer
   reporting interval.  One approach requires the storage of all metric
   singletons collected throughout the reporting interval, even though
   the measurement interval stops and starts many times.

   Another approach is described in [RFC5835] as "temporal
   aggregation".  This approach would estimate the results for the
   reporting interval based on many individual measurement interval
   statistics (results) alone.  The result would ideally appear in the
   same form as though a continuous measurement had been conducted.  A
   memo addressing the details of temporal aggregation is yet to be
   prepared.

   Yet another approach requires a numerical objective for the metric,
   and the results of each measurement interval are compared with the
   objective.  Every measurement interval where the results meet the
   objective contributes to the fraction of time with performance as
   specified.  When the reporting interval contains many measurement
   intervals, it is possible to present the results as "metric A was
   less than or equal to objective X during Y% of time".
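   A minimal sketch of this aggregation approach (Python; the
   per-interval results and the objective X are hypothetical, since
   numerical objectives are outside the scope of this memo):

      # Mean delay (ms) for each measurement interval within the
      # reporting interval, and a hypothetical objective X.
      interval_results = [22.0, 23.5, 21.9, 95.0, 22.4, 22.1, 130.2, 22.8]
      objective = 25.0

      met = sum(1 for r in interval_results if r <= objective)
      percent_of_time = 100.0 * met / len(interval_results)

      # "Metric A was <= objective X during Y% of time"
      print(f"metric A <= {objective} ms during "
            f"{percent_of_time:.1f}% of time")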
   NOTE that numerical thresholds of acceptability are not set in IETF
   performance work and are therefore excluded from the scope of this
   memo.

   In all measurements, it is important to avoid unintended
   synchronization with network events.  This topic is treated in
   [RFC2330] for Poisson-distributed inter-packet time streams, and in
   [RFC3432] for Periodic streams.  Both avoid synchronization through
   the use of random start times.

   There are network conditions where it is simply more useful to
   report the connectivity status of the Source-Destination path, and
   to distinguish time intervals where connectivity can be demonstrated
   from other time intervals (where connectivity does not appear to
   exist).  [RFC2678] specifies a number of one-way and two-way
   connectivity metrics of increasing complexity.  In this memo, we
   RECOMMEND that long-term reporting of loss, delay, and other metrics
   be limited to time intervals where connectivity can be demonstrated,
   and that other intervals be summarized as the percentage of time
   where connectivity does not appear to exist.  We note that this same
   approach has been adopted in ITU-T Recommendation [Y.1540], where
   performance parameters are only valid during periods of service
   "availability" (evaluated according to a function of packet loss,
   where sustained periods with loss ratio greater than a threshold are
   declared "unavailable").

4.  Effect of POV on the Loss Metric

   This section describes the ways in which the Loss metric can be
   tuned to reflect the preferences of the two audience categories, or
   different POV.  The waiting time to declare a packet lost, or loss
   threshold, is one area where there would appear to be a difference,
   but the ability to post-process the results may resolve it.

4.1.  Loss Threshold

   RFC 2680 [RFC2680] defines the concept of a waiting time for packets
   to arrive, beyond which they are declared lost.  The text of the RFC
   declines to recommend a value, instead saying that "good
   engineering, including an understanding of packet lifetimes, will be
   needed in practice."  Later, in the methodology, it gives reasons
   for waiting "a reasonable period of time" and intentionally leaves
   the definition of "reasonable" vague.  We estimate a practical bound
   on waiting time below.

4.1.1.  Network Characterization

   Practical measurement experience has shown that unusual network
   circumstances can cause long delays.  One such circumstance is when
   routing loops form during IGP re-convergence following a failure or
   a drastic link cost change.  Packets will loop between two routers
   until new routes are installed, or until the IPv4 Time-to-Live (TTL)
   field (or the IPv6 Hop Limit) decrements to zero.  Very long delays
   on the order of several seconds have been measured [Casner] [Cia03].

   Therefore, network characterization activities prefer a long waiting
   time in order to distinguish these events from other causes of loss
   (such as packet discard at a full queue, or tail drop).  This way,
   the metric design helps to distinguish more reliably between packets
   that might yet arrive and those that are no longer traversing the
   network.

   It is possible to calculate a worst-case waiting time, assuming that
   a routing loop is the cause.  We model the path between Source and
   Destination as a series of delays in links (t) and queues (q), as
   these are the dominant contributors to delay (in active measurement,
   the Source and Destination hosts contribute minimal delay).  The
   normal path delay, D, across n queues (where TTL is decremented at a
   node with a queue) and n+1 links without encountering a loop, is

   Path model with n=5:

      Source --- q1 --- q2 --- q3 --- q4 --- q5 --- Destination
             t0     t1     t2     t3     t4     t5

                        n
                       ---
                       \
          D = t    +    >  (t  + q )
               0       /     i    i
                       ---
                      i = 1

                     Figure 1: Normal Path Delay

   and the time spent in a loop with L queues is

   Path model with n=5 and L=3:
   Time in one loop = (qx+tx + qy+ty + qz+tz)

                     qy -- qz
                      |   ?/exit?
                     qx--/\
      Src --- q1 --- q2 ---/ q3 --- q4 --- q5 --- Dst
          t0     t1     t2      t3     t4     t5

                     j + L-1
                       ---
                       \                              (TTL - n)
          R = C         >  (t  + q ),  where C     =  ---------
                       /     i    i           max         L
                       ---
                      i = j

             Figure 2: Delay due to Rotations in a Loop

   where n is the total number of queues in the non-loop path (with n+1
   links), j is the queue number where the loop begins, C is the number
   of times a packet circles the loop, and TTL is the packet's initial
   Time-to-Live value at the source (or Hop Count in IPv6).

   If we take the delays of all links and queues as 100 ms each, the
   TTL=255, the number of queues n=5, and the queues in the loop L=3
   (as shown in Figure 2), then using C_max:

      D = 1.1 sec and R ~= 50 sec, and D + R ~= 51.1 seconds

   (Note that, with equal link and queue delays of 100 ms, R = C_max *
   L * 0.2 sec = (TTL - n) * 0.2 sec, independent of the loop length
   L.)

   We note that link delays of 100 ms would span most continents, and a
   constant queue length of 100 ms is also very generous.  When a loop
   occurs, it is almost certain to be resolved in 10 seconds or less.
   The value calculated above is an upper limit for almost any real-
   world circumstance.
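   The calculation above can be reproduced with the short Python sketch
   below; the per-hop delays, loop size, and TTL are the assumed values
   from this example, not recommended parameters.

      def worst_case_waiting_time(link_s, queue_s, n, loop_len, ttl):
          """Normal path delay D (Figure 1) plus worst-case routing-
          loop delay R (Figure 2); all per-hop delays assumed equal."""
          per_hop = link_s + queue_s
          d = link_s + n * per_hop          # D = t0 + sum(ti + qi)
          c_max = (ttl - n) / loop_len      # rotations before TTL expiry
          r = c_max * loop_len * per_hop    # R = C_max * sum over loop
          return d, r

      d, r = worst_case_waiting_time(link_s=0.1, queue_s=0.1,
                                     n=5, loop_len=3, ttl=255)
      print(d, r, d + r)                    # -> 1.1, 50.0, 51.1 seconds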
   A waiting time threshold parameter, dT, set consistently with this
   calculation would not truncate the delay distribution (possibly
   causing a change in its mathematical properties), because the
   packets that might arrive have been given sufficient time to
   traverse the network.

   It is worth noting that packets that are stored and deliberately
   forwarded at a much later time constitute a replay attack on the
   measurement system and are beyond the scope of normal performance
   reporting.

4.1.2.  Application Performance

   Fortunately, application performance estimation activities are not
   adversely affected by the long estimated limit on waiting time,
   because most applications will use shorter time thresholds.
   Although the designer's tendency might be to set the Loss Threshold
   at a value equivalent to a particular application's threshold, this
   specific threshold can be applied when post-processing the
   measurements.  A shorter waiting time can be enforced by locating
   packets with delays longer than the application's threshold and re-
   designating such packets as lost.  Thus, the measurement system can
   use a single loss waiting time and support both application and
   network performance POVs simultaneously.
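   A sketch of this post-processing step (Python; the 200 ms
   application threshold is purely illustrative):

      def apply_app_threshold(delays, app_threshold_s):
          """Re-designate packets arriving after the application's
          threshold as lost (delay undefined, marked None)."""
          return [None if (d is not None and d > app_threshold_s) else d
                  for d in delays]

      network_view = [0.021, 0.023, None, 0.350, 0.022]  # long dT used
      app_view = apply_app_threshold(network_view, app_threshold_s=0.200)
      # app_view -> [0.021, 0.023, None, None, 0.022]

   The original sample remains available for network characterization,
   while the re-designated copy serves the application POV.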
4.2.  Errored Packet Designation

   RFC 2680 designates packets that arrive containing errors as lost
   packets.  Many packets that are corrupted by bit errors are
   discarded within the network and do not reach their intended
   destination.

   This is consistent with applications that would check the payload
   integrity at higher layers and discard the packet.  However, some
   applications prefer to deal with errored payloads on their own, and
   for them even a corrupted payload is better than no packet at all.

   To address this possibility, and to make network characterization
   more complete, it is recommended to distinguish between packets that
   do not arrive (lost) and errored packets that arrive (conditionally
   lost).

4.3.  Causes of Lost Packets

   Although many measurement systems use a waiting time to determine if
   a packet is lost or not, most of the waiting is in vain.  The
   packets are no longer traversing the network and have not reached
   their destination.

   There are many causes of packet loss, including:

   1.  Queue drop, or discard

   2.  Corruption of the IP header, or other essential header
       information

   3.  TTL expiration (or use of a TTL value that is too small)

   4.  Link or router failure

   5.  Discard at layers below the source-to-destination IP layer, when
       packets fail error checking (link-layer checksums often cover
       the entire packet)

   It is reasonable to consider a packet that has not arrived after a
   large amount of time to be lost (due to one of the causes above),
   because packets do not "live forever" in the network or have
   infinite delay.

4.4.  Summary for Loss

   Given that measurement post-processing is possible (even encouraged
   in the definitions of IPPM metrics), measurements of loss can easily
   serve both points of view:

   o  Use a long waiting time to serve network characterization, and
      revise the results for specific application delay thresholds as
      needed.

   o  Distinguish between errored packets and lost packets when
      possible to aid network characterization, and combine the results
      for application performance if appropriate.

5.  Effect of POV on the Delay Metric

   This section describes the ways in which the Delay metric can be
   tuned to reflect the preferences of the two consumer categories, or
   different POV.

5.1.  Treatment of Lost Packets

   The Delay Metric [RFC2679] specifies the treatment of packets that
   do not successfully traverse the network: their delay is undefined.

   " >>The *Type-P-One-way-Delay* from Src to Dst at T is undefined
   (informally, infinite)<< means that Src sent the first bit of a
   Type-P packet to Dst at wire-time T and that Dst did not receive
   that packet."

   It is an accepted, but informal, practice to assign infinite delay
   to lost packets.  We next look at how these two different treatments
   align with the needs of measurement consumers who wish to
   characterize networks or estimate application performance.  We also
   look at the way that lost packets have been treated in other
   metrics: delay variation and reordering.

5.1.1.  Application Performance

   Applications need to perform different functions, depending on
   whether or not each packet arrives within some finite tolerance.  In
   other words, a receiver's packet processing takes one of two
   alternative directions (a "fork" in the road):

   o  Packets that arrive within the expected tolerance are handled by
      removing headers, restoring smooth delivery timing (as in a de-
      jitter buffer), restoring sending order, checking for errors in
      payloads, and many other operations.

   o  Packets that do not arrive when expected lead to attempted
      recovery from the apparent loss, such as retransmission requests,
      loss concealment, or forward error correction to replace the
      missing packet.

   So, it is important to maintain a distinction between packets that
   actually arrive and those that do not.  Therefore, it is preferable
   to leave the delay of lost packets undefined and to characterize the
   delay distribution as a conditional distribution (conditioned on
   arrival).

5.1.2.  Network Characterization

   In this discussion, we assume that both loss and delay metrics will
   be reported for network characterization (at least).
   Assume that packets that do not arrive are reported as Lost, usually
   as a fraction of all sent packets.  If these lost packets are
   assigned undefined delay, then the network's inability to deliver
   them (in a timely way) is reflected only in the Loss metric when we
   report statistics on the Delay distribution conditioned on the event
   of packet arrival (within the Loss waiting time threshold).  We can
   say that the Delay and Loss metrics are orthogonal, in that they
   convey non-overlapping information about the network under test.
   This is a valuable property, whose absence is discussed below.

   However, if we assign infinite delay to all lost packets, then:

   o  The delay metric results are influenced both by packets that
      arrive and by those that do not.

   o  The delay singleton and the loss singleton do not appear to be
      orthogonal (Delay is finite when Loss=0; Delay is infinite when
      Loss=1).

   o  The network is penalized in both the loss and delay metrics,
      effectively double-counting the lost packets.

   As further evidence of overlap, consider the Cumulative Distribution
   Function (CDF) of Delay when the value "positive infinity" is
   assigned to all lost packets.  Figure 3 shows a CDF where a small
   fraction of packets are lost.

      1 |- - - - - - - - - - - - - - - - - -+
        |                                   |
        |          _..----''''''''''''''''''
        |       ,-'
        |     ,'
        |    /                   Mass at
        |   /                    +infinity
        |  /                     = fraction
        | |                        lost
        |/
      0 |_____________________________________

        0               Delay               +oo

       Figure 3: Cumulative Distribution Function for Delay when
                           Loss = +Infinity

   We note that a Delay CDF that is conditioned on packet arrival would
   not exhibit this apparent overlap with loss.

   Although infinity is a familiar mathematical concept, it is somewhat
   disconcerting to see any time-related metric reported as infinity.
   Questions are bound to arise, and they tend to detract from the goal
   of informing the consumer with a performance report.

5.1.3.  Delay Variation

   [RFC3393] excludes lost packets from samples, effectively assigning
   an undefined delay to packets that do not arrive in a reasonable
   time.  Section 4.1 of [RFC3393] describes this specification and its
   rationale (ipdv = inter-packet delay variation in the quote below).

   "The treatment of lost packets as having "infinite" or "undefined"
   delay complicates the derivation of statistics for ipdv.
   Specifically, when packets in the measurement sequence are lost,
   simple statistics such as sample mean cannot be computed.  One
   possible approach to handling this problem is to reduce the event
   space by conditioning.  That is, we consider conditional statistics;
   namely we estimate the mean ipdv (or other derivative statistic)
   conditioned on the event that selected packet pairs arrive at the
   destination (within the given timeout).  While this itself is not
   without problems (what happens, for example, when every other packet
   is lost), it offers a way to make some (valid) statements about
   ipdv, at the same time avoiding events with undefined outcomes."

   We note that the argument above applies to all forms of packet delay
   variation that can be constructed using the "selection function"
   concept of [RFC3393].  In recent work, the two main forms of delay
   variation metrics have been compared, and the results are summarized
   in [RFC5481].
5.1.4.  Reordering

   [RFC4737] defines metrics that are based on the evaluation of packet
   arrival order and include a waiting time to declare a packet lost
   (to exclude it from further processing).

   If lost packets are assigned a delay value, then the reordering
   metric would declare any packets with infinite delay to be
   reordered, because their sequence numbers will surely be less than
   the "Next Expected" threshold when (or if) they arrive.  But this
   practice would fail to maintain orthogonality between the reordering
   metric and the loss metric.  Confusion can be avoided by designating
   the delay of non-arriving packets as undefined, and by reserving
   delay values only for packets that arrive within a sufficiently long
   waiting time.

5.2.  Preferred Statistics

   Today in network characterization, the sample mean is one statistic
   that is almost ubiquitously reported.  It is easily computed and
   understood by virtually everyone in this audience category.  Also,
   the sample is usually filtered on packet arrival, so that the mean
   is based on a conditional distribution.

   The median is another statistic that summarizes a distribution,
   having somewhat different properties from the sample mean.  The
   median is stable in distributions with or without a few outliers.
   However, the median's stability prevents it from indicating when a
   large fraction of the distribution changes value: 50% or more of the
   values would need to change for the median to capture the change.

   Both the median and the sample mean have difficulty with bimodal
   distributions.  The median will reside in only one of the modes, and
   the mean may not lie in either mode range.  For this and other
   reasons, additional statistics such as the minimum, maximum, and
   95th percentile have value when summarizing a distribution, as
   illustrated in the sketch at the end of this section.

   When both the sample mean and median are available, a comparison
   will sometimes be informative, because these two statistics are
   equal only under unusual circumstances, such as when the delay
   distribution is perfectly symmetrical.

   Also, these statistics are generally useful from the Application
   Performance POV, so there is a common set that should satisfy both
   audiences.

   Plots of the delay distribution may also be useful when single-value
   statistics indicate that new conditions are present.  An empirically
   derived probability distribution function will usually describe
   multiple modes more efficiently than any other form of result.
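   The bimodal difficulty is easy to demonstrate numerically (Python;
   the two delay modes are hypothetical):

      import statistics

      # Hypothetical bimodal delay sample (ms): six packets take a
      # direct route (~20 ms), four take a backup route (~120 ms).
      delays_ms = [19, 20, 20, 20, 21, 22, 118, 119, 120, 121]

      print(statistics.mean(delays_ms))      # 60.0, in neither mode
      print(statistics.median(delays_ms))    # 21.5, in one mode only
      print(min(delays_ms), max(delays_ms))  # extremes show the spread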
5.3.  Summary for Delay

   From the perspectives of:

   1.  application/receiver analysis, where subsequent processing
       depends on whether the packet arrives or times out,

   2.  straightforward network characterization without double-counting
       defects, and

   3.  consistency with the delay variation and reordering metric
       definitions,

   the most efficient practice is to distinguish between truly lost and
   delayed packets with a sufficiently long waiting time, and to
   designate the delay of non-arriving packets as undefined.

6.  Reporting Raw Capacity Metrics

   Raw capacity refers to the metrics defined in [RFC5136], which do
   not include restrictions such as data uniqueness or flow-control
   response to congestion.

   The metrics considered are IP-layer Capacity, Utilization (or used
   capacity), and Available Capacity, for individual links and complete
   paths.  These three metrics form a triad: knowing one metric
   constrains the other two (within their allowed range), and knowing
   two determines the third.  The link metrics have another key aspect
   in common: they are single-measurement-point metrics at the egress
   of a link.  The path Capacity and Available Capacity are derived by
   examining the set of single-point link measurements and taking the
   minimum value.

6.1.  Type-P Parameter

   The concept of "packets of Type-P" is defined in [RFC2330].  The
   Type-P categorization has critical relevance in all forms of
   capacity measurement and reporting.  The ability to categorize
   packets based on header fields for assignment to different queues
   and scheduling mechanisms is now commonplace.  When unused resources
   are shared across queues, the conditions in all packet categories
   will affect capacity and related measurements.  This is one source
   of variability in the results that all audiences would prefer to see
   reported in a useful and easily understood way.

   Communication of Type-P within the One-way Active Measurement
   Protocol (OWAMP) and the Two-way Active Measurement Protocol (TWAMP)
   is essentially confined to the Diffserv Code Point (DSCP) [RFC4656].
   The DSCP is the most common qualifier for Type-P.

   Each audience will have a set of Type-P qualifications and value
   combinations that are of interest.  Measurements and reports should
   have the flexibility to report per-type and aggregate performance.

6.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link) and wishes to measure
   one or more of the raw capacity metrics.  This scenario is quite
   common and has spawned a substantial number of experimental
   measurement methods (e.g., http://www.caida.org/tools/taxonomy/ ).
   Many of these methods respect that their users want a result fairly
   quickly and in one trial.  Thus, the measurement interval is kept
   short (a few seconds to a minute).  For long-term reporting, a
   sample of short-term results needs to be summarized.

6.3.  IP-layer Capacity

   For links, this metric's theoretical maximum value can be determined
   from the physical-layer bit rate and the bit rate reduction due to
   the layers between the physical layer and IP.  When measured, this
   metric takes additional factors into account, such as the ability of
   the sending device to process and forward traffic under various
   conditions.  For example, the arrival of routing updates may spawn
   high-priority processes that reduce the sending rate temporarily.
   Thus, the measured capacity of a link will be variable, and the
   maximum capacity observed applies to a specific time, time interval,
   and other relevant circumstances.

   For paths composed of a series of links, it is easy to see how the
   sources of variability for the results grow with each link in the
   path.  Results variability will be discussed in more detail below.

6.4.  IP-layer Utilization

   The ideal metric definition of Link Utilization [RFC5136] is based
   on the actual usage (bits successfully received during a time
   interval) and the Maximum Capacity for the same interval.

   In practice, Link Utilization can be calculated by counting the IP-
   layer (or other-layer) octets received over a time interval and
   dividing by the theoretical maximum of octets that could have been
   delivered in the same interval.  A commonly used time interval is 5
   minutes, and this interval has been sufficient to support network
   operations and design for some time.  Five minutes is somewhat long
   compared with the expected download time for web pages, but short
   with respect to large file transfers and TV program viewing.  It is
   fair to say that considerable variability is concealed by reporting
   a single (average) Utilization value for each 5-minute interval.
   Some performance management systems have begun to make 1-minute
   averages available.

   There is also a limit on the smallest useful measurement interval.
   Intervals on the order of the serialization time for a single
   Maximum Transmission Unit (MTU) packet will observe on/off behavior
   and report 100% or 0%.  The smallest interval needs to be some
   multiple of the MTU serialization time for averaging to be
   effective.
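   A minimal sketch of this calculation (Python; the counter value,
   interval, and link rate are hypothetical):

      def link_utilization(octets_received, interval_s, link_rate_bps):
          """Utilization = bits received / bits that could have been
          delivered in the same interval."""
          return (8 * octets_received) / (link_rate_bps * interval_s)

      # 18 x 10^9 IP-layer octets over a 5-minute interval on a
      # 1 Gbit/s link.
      u = link_utilization(octets_received=18_000_000_000,
                           interval_s=300, link_rate_bps=1_000_000_000)
      print(f"utilization = {100 * u:.1f}%")   # -> 48.0%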
6.5.  IP-layer Available Capacity

   The Available Capacity of a link can be calculated using the
   Capacity and Utilization metrics.

   When the Available Capacity of a link or path is estimated through
   some measurement technique, the following parameters should be
   reported:

   o  Name of, and reference to, the exact method of measurement

   o  IP packet length, in octets (including the IP header)

   o  Maximum Capacity that can be assessed in the measurement
      configuration

   o  The time duration of the measurement

   o  All other parameters specific to the measurement method

   Many methods of Available Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the actual Available Capacity of the link or path.  Therefore, it is
   important to know the capacity value beyond which there will be no
   measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient
   Available Capacity.  This case simplifies the measurement of link
   and path capacity to some degree, as long as the measurable maximum
   exceeds the target capacity.

6.6.  Variability in Utilization and Available Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree (or confidence) to which any one value is representative of
   other results, or of the spread of the underlying distribution of
   the singleton measurements.

   How can Utilization be measured and summarized to describe the
   potential variability in a useful way?

   How can the variability in Available Capacity estimates be reported,
   so that the confidence in the results is also conveyed?

   We suggest some methods below.

6.6.1.  General Summary of Variability

   With a set of singleton Utilization or Available Capacity estimates,
   each representing a time interval needed to ascertain the estimate,
   we seek to describe the variation over the set of singletons, as
   though reporting summary statistics of a distribution.  Three useful
   summary statistics are:

   o  Minimum,

   o  Maximum, and

   o  Range.

   An alternate way to represent the Range is as the ratio of the
   Maximum to the Minimum value.  This is an easily understandable
   statistic to describe the range observed.  For example, when Maximum
   = 3*Minimum, the Max/Min Ratio is 3, and users may see variability
   of this order.  On the other hand, Capacity estimates with a Max/Min
   Ratio near 1 are quite consistent and near the central measure or
   statistic reported.

   For an ongoing series of singleton estimates, a moving average of n
   estimates may provide a single-value estimate that more easily
   distinguishes substantial changes in performance over time.  For
   example, in a window of n singletons observed in time interval t, a
   percentage change of x% may be declared a substantial change and
   reported as an exception.
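   These summary statistics and the moving-average exception check
   might be sketched as follows (Python; the capacity series, window
   size n, and threshold x are hypothetical parameters):

      def variability_summary(estimates):
          """Min, Max, Range, and Max/Min ratio over the singletons."""
          lo, hi = min(estimates), max(estimates)
          return {"min": lo, "max": hi, "range": hi - lo,
                  "max_min_ratio": hi / lo}

      def moving_average_exceptions(estimates, n, x_percent):
          """Flag points where the n-point moving average changes by
          more than x% relative to the previous window's average."""
          exceptions, prev = [], None
          for i in range(n, len(estimates) + 1):
              avg = sum(estimates[i - n:i]) / n
              if prev and abs(avg - prev) / prev * 100.0 > x_percent:
                  exceptions.append((i, avg))
              prev = avg
          return exceptions

      available_mbps = [94, 96, 95, 93, 95, 60, 58, 59, 94, 95]
      print(variability_summary(available_mbps))
      print(moving_average_exceptions(available_mbps, n=3, x_percent=10))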
   Often, the most informative summary of the results is a two-axis
   plot rather than a table of statistics, where time is plotted on the
   x-axis and the singleton value on the y-axis.  The time-series plot
   can illustrate sudden changes in an otherwise stable range, identify
   bi-modality easily, and help quickly assess correlation with other
   time series.  Plots of the frequency of the singleton values are
   likewise useful tools to visualize the variation.

7.  Reporting Restricted Capacity Metrics

   Restricted capacity refers to the metrics defined in [RFC3148],
   which include criteria of data uniqueness or flow-control response
   to congestion.

   The primary metric considered is Bulk Transfer Capacity (BTC) for
   complete paths.  [RFC3148] defines

      BTC = data_sent / elapsed_time

   for a connection with congestion-aware flow control, where data_sent
   is the total of unique payload bits (no headers).

   We note that this definition *differs* from the raw capacity
   definition in Section 2.3.1 of [RFC5136], where IP-layer Capacity
   *includes* all bits in the IP header and payload.  This means that
   Restricted Capacity BTC is already operating at a disadvantage when
   compared to the raw capacity at layers below TCP.  Further, there
   are cases where one IP layer is encapsulated in another IP layer or
   in another form of tunneling protocol, designating more and more of
   the fundamental transport capacity as header bits that are pure
   overhead to the BTC measurement.

   We also note that the Raw and Restricted Capacity metrics are not
   orthogonal in the sense defined in Section 5.1.2 above.  The
   information they convey about the network under test certainly
   overlaps, but they reveal two different and important aspects of
   performance.

   When thinking about the triad of raw capacity metrics, BTC is most
   akin to the "IP-Type-P Available Path Capacity", at least in the
   eyes of a network user who seeks to know what transmission
   performance a path might support.

7.1.  Type-P Parameter and Type-C Parameter

   The concept of "packets of Type-P" is defined in [RFC2330].  The
   considerations for Restricted Capacity are identical to those in the
   raw capacity section on this topic, with the addition that the
   various fields and options in the TCP header must be included in the
   description.

   The vast array of TCP flow-control options is not well captured by
   Type-P, because these options do not exist in the TCP header bits.
   Therefore, we introduce a new notion here: the TCP Configuration of
   "Type-C".  The elements of Type-C describe all of the settings for
   TCP options and congestion control algorithm variables, including
   the main form of congestion control in use.  Readers should consider
   the parameters and variables of [RFC3148] and [RFC6349] when
   constructing Type-C.
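   Since no standard encoding of Type-C exists, the following is a
   hypothetical example of the kind of information a Type-C descriptor
   might capture alongside Type-P when reporting a BTC measurement; the
   field names are illustrative only.

      # Hypothetical Type-C descriptor for one BTC measurement.
      type_c = {
          "congestion_control": "cubic",  # main congestion control form
          "initial_cwnd_segments": 10,
          "sack_permitted": True,         # RFC 2018 option
          "window_scale": 7,              # RFC 1323 option
          "timestamps": True,             # RFC 1323 option
          "max_segment_size": 1460,
      }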
7.2.  A priori Factors

   The audience for Network Characterization may have detailed
   information about each link that comprises a complete path (due to
   ownership, for example), or about some of the links in the path but
   not others, or about none of the links.

   There are cases where the measurement audience only has information
   on one of the links (the local access link) and wishes to measure
   one or more BTC metrics.  The discussion of Section 6.2 applies here
   as well.

7.3.  Measurement Interval

   There are limits on a useful measurement interval for BTC.  Three
   factors that influence the interval duration are listed below; a
   rough model of their effect on measurement duration appears at the
   end of this section.

   1.  Measurements may choose to include or exclude the three-way
       handshake of TCP connection establishment, which requires at
       least 1.5 * RTT and includes both the delay of the path and the
       host processing time for responses.  However, user experience
       includes the three-way handshake for all new TCP connections.

   2.  Measurements may choose to include or exclude Slow-Start,
       preferring instead to focus on a portion of the transfer that
       represents "equilibrium" (which needs to be defined for
       particular circumstances if used).  However, user experience
       includes Slow-Start for all new TCP connections.

   3.  Measurements may choose to transfer a fixed block of data, where
       the size of the block has a relationship to the file size of the
       application of interest.  This approach yields variable-length
       measurement intervals, where a path with faster BTC is measured
       for less time than a path with slower BTC, and this has
       implications when path impairments are time-varying or
       transient.  Users are likely to turn their immediate attention
       elsewhere when a very large file must be transferred; thus, they
       do not directly experience such a long transfer -- they see the
       result (success or fail) and possibly an objective measurement
       of the transfer time (which will likely include the three-way
       handshake, Slow-Start, and application file management
       processing time, as well as the BTC).

   Individual measurement intervals may be short or long, but there is
   a need to report the results on a long-term basis that captures the
   BTC variability experienced between each interval.  Consistent BTC
   is a valuable commodity, along with the value attained.
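   The rough model below (Python) illustrates factors 1 and 3: the
   block size, RTT, and rates are hypothetical, and Slow-Start (factor
   2) is deliberately omitted from this crude estimate.

      def measurement_interval_s(block_bytes, btc_bps, rtt_s,
                                 include_handshake=True):
          """Approximate duration of a fixed-size transfer: optional
          three-way handshake (1.5 * RTT) plus bulk transfer at the
          measured BTC; Slow-Start is ignored here."""
          t = 1.5 * rtt_s if include_handshake else 0.0
          return t + (8 * block_bytes) / btc_bps

      # The same 100 MB block measures a fast path for far less time
      # than a slow one -- a concern when impairments are transient.
      for btc in (1_000_000_000, 10_000_000):   # 1 Gbit/s vs 10 Mbit/s
          print(measurement_interval_s(100_000_000, btc, rtt_s=0.05))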
7.4.  Bulk Transfer Capacity Reporting

   When the BTC of a link or path is estimated through some measurement
   technique, the following parameters should be reported:

   o  Name of, and reference to, the exact method of measurement

   o  Maximum Transmission Unit (MTU)

   o  Maximum BTC that can be assessed in the measurement configuration

   o  The time and duration of the measurement

   o  The number of BTC connections used simultaneously

   o  *All* other parameters specific to the measurement method,
      especially the congestion control algorithm in use

   See also [RFC6349].

   Many methods of Bulk Transfer Capacity measurement have a maximum
   capacity that they can measure, and this maximum may be less than
   the available capacity of the link or path.  Therefore, it is
   important to specify the measured BTC value beyond which there will
   be no measured improvement.

   The Application Design audience may have a desired target capacity
   value and simply wish to assess whether there is sufficient BTC.
   This case simplifies the measurement of link and path capacity to
   some degree, as long as the measurable maximum exceeds the target
   capacity.

7.5.  Variability in Bulk Transfer Capacity

   As with most metrics and measurements, assessing the consistency or
   variability in the results gives the user an intuitive feel for the
   degree (or confidence) to which any one value is representative of
   other results, or of the underlying distribution from which these
   singleton measurements have come.

   Two questions loom:

   1.  In what ways can BTC be measured and summarized to describe the
       potential variability in a useful way?

   2.  How can the variability in BTC estimates be reported, so that
       the confidence in the results is also conveyed?

   We suggest the methods of Section 6.6.1 above, and the additional
   results presentations given in [RFC6349].

8.  Reporting on Test Streams and Sample Size

   This section discusses two key aspects of measurement that are
   sometimes omitted from the report: the description of the test
   stream on which the measurements are based, and the sample size.

8.1.  Test Stream Characteristics

   Network Characterization has traditionally used Poisson-distributed
   inter-packet spacing, as this provides an unbiased sample.  The
   average inter-packet spacing may be selected to allow observation of
   specific network phenomena.  Other test streams are designed to
   sample some property of the network, such as the presence of
   congestion, link bandwidth, or packet reordering.

   If one is measuring a network in order to make inferences about
   applications or receiver performance, then there are usually
   efficiencies derived from a test stream that has characteristics
   similar to the sender's traffic.  In some cases, it is essential to
   synthesize the sender stream, as with Bulk Transfer Capacity
   estimates.  In other cases, it may be sufficient to sample with a
   "known bias", e.g., a Periodic stream to estimate real-time
   application performance.

8.2.  Sample Size

   Sample size is directly related to the accuracy of the results and
   plays a critical role in the report.  Even if only the sample size
   (in terms of number of packets) is given for each value or summary
   statistic, it imparts a notion of the confidence in the result.

   In practice, the sample size will be selected taking both
   statistical and practical factors into account.  Among these factors
   are:

   1.  The estimated variability of the quantity being measured

   2.  The desired confidence in the result (although this may depend
       on assumptions about the underlying distribution of the measured
       quantity)

   3.  The effects of active measurement traffic on user traffic

   A sample size may sometimes be referred to as "large".  This is a
   relative and qualitative term.  It is preferable to describe what
   one is attempting to achieve with the sample.  For example, stating
   an implication may be helpful: this sample is large enough such that
   a single outlying value at ten times the "typical" sample mean (the
   mean without the outlying value) would influence the mean by no more
   than X.
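   The outlier implication above can be made precise with a little
   algebra, sketched here in Python (assuming the outlier at ten times
   the typical mean m joins n samples whose mean is m):

      def outlier_influence(n):
          """Relative change in the sample mean when one outlier at
          10*m joins n samples with mean m:
          new mean = (n*m + 10*m) / (n + 1), a change of 9 / (n + 1)."""
          return 9.0 / (n + 1)

      for n in (128, 899, 8192):
          print(n, outlier_influence(n))
      # To keep the influence below X = 1%, n >= 9/0.01 - 1 = 899.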
   The Appendix of [RFC2330] indicates that a sample size of 128
   singletons worked well for goodness-of-fit testing, while a much
   larger size (8192 singletons) almost always failed.

9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

10.  Security Considerations

   The security considerations that apply to any active measurement of
   live networks are relevant here as well.  See [RFC4656] for
   mandatory-to-implement security features that intend to mitigate the
   attacks described in the corresponding Security Considerations
   section.

   Measurement systems conducting long-term measurements are more
   exposed to threats as a by-product of ports held open longer to
   perform their task, and of more easily detected measurement activity
   on those ports.  Further, the use of long packet waiting times
   affords an attacker a better opportunity to prepare and launch a
   replay attack.

11.  Acknowledgements

   The authors thank: Phil Chimento for his suggestion to employ
   conditional distributions for Delay; Steve Konish Jr. for his
   careful review and suggestions; Dave McDysan and Don McLachlan for
   useful comments based on their long experience with measurement and
   reporting; Daniel Genin for his observation of non-orthogonality
   between the Raw and Restricted Capacity metrics (and our omission of
   this fact); and Matt Zekauskas for suggestions on organizing the
   memo for easier consumption.

12.  References

12.1.  Normative References

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2678]  Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring
              Connectivity", RFC 2678, September 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC3148]  Mathis, M. and M. Allman, "A Framework for Defining
              Empirical Bulk Transfer Capacity Metrics", RFC 3148,
              July 2001.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams",
              RFC 3432, November 2002.

   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and
              M. Zekauskas, "A One-way Active Measurement Protocol
              (OWAMP)", RFC 4656, September 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5136]  Chimento, P. and J. Ishac, "Defining Network Capacity",
              RFC 5136, February 2008.

12.2.  Informative References

   [Casner]   "A Fine-Grained View of High Performance Networking",
              NANOG 22 Conf.,
              http://www.nanog.org/mtg-0105/agenda.html, May 20-22,
              2001.

   [Cia03]    "Standardized Active Measurements on a Tier 1 IP
              Backbone", IEEE Communications Mag., pp. 90-97,
              June 2003.

   [I-D.ietf-ippm-reporting]
              Shalunov, S. and M. Swany, "Reporting IP Performance
              Metrics to Users", draft-ietf-ippm-reporting-06 (work in
              progress), March 2011.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC5835]  Morton, A. and S. Van den Berghe, "Framework for Metric
              Composition", RFC 5835, April 2010.

   [RFC6349]  Constantine, B., Forget, G., Geib, R., and R. Schrage,
              "Framework for TCP Throughput Testing", RFC 6349,
              August 2011.
   [Y.1540]   ITU-T Recommendation Y.1540, "Internet protocol data
              communication service - IP packet transfer and
              availability performance parameters", December 2011.

   [Y.1541]   ITU-T Recommendation Y.1541, "Network Performance
              Objectives for IP-Based Services", February 2011.

Authors' Addresses

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ  07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/

   Gomathi Ramachandran
   AT&T Labs
   200 Laurel Avenue South
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2353
   Email: gomathi@att.com

   Ganga Maguluri
   AT&T Labs
   200 Laurel Avenue
   Middletown, New Jersey  07748
   USA

   Phone: +1 732 420 2486
   Email: gmaguluri@att.com