Internet Draft                                          Henk Uijterwaal
Document: draft-ietf-ippm-owmetric-as-01.txt                Merike Kaeo
Expires: June 2003                                        November 2002

               One-Way Metric Applicability Statement

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

Active traffic measurements are becoming more widely used to ascertain
network performance characteristics. All active measurement systems
have the capability to measure one-way delay and one-way loss metrics,
as defined in RFC 2679 [1], "A One-way Delay Metric for IPPM", and
RFC 2680 [2], "A One-way Packet Loss Metric for IPPM", respectively.
To ensure that the resulting numbers have some meaning, we attempt to
characterize how the measurements are taken and what is required for
the final numbers to be meaningful. This document is an applicability
statement (formerly known as best current practices) for measuring the
one-way delay and one-way loss metrics in operational networks.

Overview

As more people start measuring one-way delay and one-way loss, a large
set of numbers results. To ensure that these numbers have some meaning,
we attempt to characterize how the measurements are taken and what is
required for the final numbers to be meaningful. Much of the work
relates to RFC 2679 [1], "A One-way Delay Metric for IPPM", and
RFC 2680 [2], "A One-way Packet Loss Metric for IPPM". It is assumed
that the reader is familiar with both of these documents, as well as
the related framework document RFC 2330 [3].

Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [4].

1. Introduction and Terminology

Active traffic measurements are becoming more widely used to ascertain
network performance characteristics.
All active measurement systems have the capability to measure one-way
delay and one-way loss metrics, as defined in RFC 2679 [1] and
RFC 2680 [2], respectively. However, while these standards define how
to measure the quantities, a large number of parameters have to be set
by the operator of a measurement device. To ensure that the resulting
numbers have some meaning, we attempt to characterize how the
measurements are taken and what is required for the final numbers to be
meaningful. This document describes best current practices for
measuring the one-way delay and one-way loss metrics in operational
networks.

2. Ambiguities in one-way measurement metrics

RFC 2679 [1] and RFC 2680 [2] define metrics for one-way delay and
one-way loss, respectively. In practice, a large number of instances of
these metrics are measured, and when results from different measurement
entities are compared, the numbers sometimes vary. This is partly due
to ambiguities in the current documents for variables such as the
frequency of measurement samples, packet size, timing issues, test
duration, and data volumes. This draft gives recommendations for these
variables for both inter-provider networks and internal networks.
Inter-provider networks are those where the path between the
measurement end-points crosses administrative domain boundaries, such
as from one ISP to another ISP. Internal networks are those where the
measurement end-points are contained within one administrative domain.
This draft also discusses ambiguities in reporting the metrics, such as
deciding when two results differ, setting alarm thresholds, and
choosing between average/sigma and percentiles.

3. Recommendations for one-way delay and loss measurements

3.1 Measurement samples

The number of measurement samples needs to be clearly defined.
Specifically, we need to specify how many packets are needed to say
something about a connection.
The packet rate should be high enough that one has a reasonable chance
of seeing effects on the link, but low enough that the regular traffic
on the link is not affected by the measurement. In addition, it is
important to ascertain how many packets have to be sent before the
probability of a statistical fluke becomes small.

[Question: Can we benefit from packet sampling BOF work here? Ideally,
we would have the math to calculate that if an effect occurs at a rate
of N Hz and we send traffic at M Hz, there is a probability >X that one
packet will see this effect.]

3.2 Packet size

The size of the packets is important because some devices tend to give
preferential treatment to smaller packets. This causes the delay for
small packets to appear lower than for large packets, and can also
cause overtaking or reordering. In all cases, packet sizes should be
smaller than the MTU to avoid effects due to fragmentation and
reassembly.

Before running any actual measurements, one should perform tests to see
whether delay depends on packet size in any way other than scaling with
the packet size. If this appears to be the case, one should try to
estimate packet sizes for "user" data using passive measurements and
adjust the packet size accordingly, or use a variable packet size
following the distribution seen in user data. These tests should be
repeated when the path between source and destination changes.

Also note that some line card designs have buffer pools of different
sizes. This can lead to loss being different for different packet
sizes.

When packets are larger than the minimum size required by the
measurement device, the remainder of the packet should be padded with
random bits to prevent compression from being applied to any
measurement packets.
The algorithm used to generate these random bits, as well as any seed
values, has to be known in order to fully understand any remaining
issues with compression.

3.3 Timing issues

Reports of the measured metric should include the experimental error
due to the accuracy of the clocks. In practice, this has been seen to
be an issue only during measurement start-up. NTP, for example, starts
with an estimate and corrects the internal clock of the device as the
clock stabilizes.

When the IPDV metric is being measured, one uses four time-stamps: the
send and arrival times of the first packet, and the send and arrival
times of the second packet. The difference between these time-stamps
will be small. One should take care that sufficient accuracy for the
calculation is available and check that the experimental error on the
overall result is still small compared to the result.

The clock should be checked for correct performance at regular
intervals, and measurements should be discarded when there is a
problem.

One should check that the overall experimental error is small compared
to the delay before further processing of the data. The errors should
be recorded so that they are available when calculating derived metrics
such as IPDV.

3.4 Test duration

Depending on the metric and application, a test can run indefinitely.
In order to easily see traffic variations, measurements should run for
a long time but have a limited life-time. The former requirement makes
it easier to use the data for traffic engineering or load balancing.

The latter requirement allows for easy failure detection: suppose one
is measuring between A and B. At some point in time, B stops receiving
packets. Until the measurement session times out, there is no way to
tell if this is due to full connectivity loss between A and B, or due
to a failure of the device A.
When the measurement session ends, one can attempt to restart it. If
one can contact the host at A, one can conservatively assume that A
crashed.

Question: how should intermediate results be reported while the test is
in progress?

3.5 Data volumes

It is important to ensure that any measurement traffic does not
interfere with normal network operations. First, one should check that
the outgoing/incoming data volume for a box is small with respect to
the link capacity of the first few hops, to avoid measurements being
affected by loaded links. Second, one should check that the machine
sending/receiving the data can cope with the expected offered load.
Lastly, one should make sure that the total test traffic volume sent or
received by a machine is small compared to the total link capacity. A
figure of 3% of the total available capacity seems reasonable for
routine monitoring of the performance of a link without affecting the
performance of that link.

Capacity and reordering measurements that fill a link at (almost) its
maximum line rate should not be used on production networks except
during scheduled maintenance or test periods.

4. Reporting metrics

4.1 When is a result different?

Given two sets of measurements, when is set 1 statistically different
from set 2?

When is there a reasonable probability that nothing has changed, or
that the network is operating correctly? The answer may vary with the
application of the data.

4.2 Alarms

The previous section determines when two results are different. This
can be used to define thresholds for delay alarms.

4.3 Average/sigma versus 2.5%/median/97.5% percentiles

Since average/sigma is not well defined for a one-way delay
distribution, whereas percentiles are, percentiles should be used.

If it is necessary to use average/sigma, then it should be specified
how losses are treated in the calculation.
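
The percentile approach can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions, not part of
RFC 2679 or RFC 2680: the function name delay_percentiles, the use of
None to mark a lost packet, and the nearest-rank percentile rule are
all choices made for this example. Lost packets are counted as having
infinite delay, following the informal reading of an undefined delay in
RFC 2679 [1], so that the percentiles stay well defined in the presence
of loss.

```python
import math


def delay_percentiles(delays, percentiles=(2.5, 50, 97.5)):
    """Return {percentile: delay} for a sample of one-way delays.

    delays      -- delays in seconds; None marks a lost packet (assumption)
    percentiles -- percentiles to report, each in the range (0, 100]
    """
    if not delays:
        raise ValueError("at least one measurement is required")

    # Assumption: a lost packet is treated as having infinite delay, so it
    # sorts after every packet that did arrive.
    ordered = sorted(math.inf if d is None else d for d in delays)
    n = len(ordered)

    result = {}
    for p in percentiles:
        # Nearest-rank rule: the smallest value with at least p% of the
        # sample at or below it.
        rank = max(1, math.ceil(p * n / 100))
        result[p] = ordered[rank - 1]
    return result


# Ten probes, one of which was lost.
sample = [0.010, 0.012, None, 0.011, 0.015,
          0.013, 0.014, 0.016, 0.017, 0.018]
report = delay_percentiles(sample)
```

With one loss in ten probes, the median of this sample is still a
finite delay, while the 97.5% percentile is infinite, which signals
that more than 2.5% of the packets were lost. An average over the same
sample would be either infinite or dependent on a rule for discarding
losses, which is why that rule has to be stated.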

Question: what about the loss metrics: average/sigma or percentiles?

Question: Larry Dunn suggests filtering theory to get a feeling for the
shape of a curve. Anybody who wants to elaborate?

5. Reporting the IPDV metric

Using average/sigma for reporting the IPDV metric does not work. First,
the average will almost always be close to zero. Second, the
distribution generally is not Gaussian, and the sigma is not well
defined for the distributions that are being seen.

Using percentiles suffers from the same problem: the median will almost
always be 0, and the 2.5% and 97.5% percentiles will be the same.

What appears to work is reporting two percentiles, for example 5% and
25%; this gives a reasonable description of the shape of the
distribution.

Question: Stas: do you have some better wording?

6. Access to the data

Measurement results comprise both raw data and derived results. The raw
data should be kept accessible to allow for historical trend analysis.

A minimum set of informative fields to be stored is:
* IP address of source
* IP address of destination
* Time the packet was sent (or arrived)
* Delay
* Experimental error on sending and receiving clock
* Packet Size
* ...

7. Control/Configuration

To be defined: the maximum acceptable time to set up a measurement, and
the latency between a configuration change and its effect on the
measurement. The answer is not yet known and may differ from operator
to operator.

8. IANA Considerations

None at the moment.

9. Security Considerations

One-way delay packets can be used for a distributed denial-of-service
(DDoS) attack. Even if each sending box carefully checks that the
outgoing rate to a destination is small, a large number of sending
boxes can still be used to overflow a link. To protect against this,
the configuration should be sent to the receiving device before the
measurements start.

Question: are other sanity checks needed, and if so, what are they?

10. References

[1] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way Delay Metric
    for IPPM", RFC 2679, September 1999.
[2] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way Packet Loss
    Metric for IPPM", RFC 2680, September 1999.
[3] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, "Framework for
    IP Performance Metrics", RFC 2330, May 1998.
[4] Bradner, S., "Key words for use in RFCs to Indicate Requirement
    Levels", BCP 14, RFC 2119, March 1997.

11. Acknowledgments

Comments from Victor Reijs (HEANET) of July 9 and from Stanislav
Shalunov of July 26 and August 8 have been incorporated.

12. Authors' Addresses

Henk Uijterwaal
RIPE Network Coordination Centre
Singel 258
1016 AB Amsterdam
The Netherlands

Phone: +31.20.5354414
Fax: +31.20.5354445
Email: henk.uijterwaal@ripe.net

Merike Kaeo
Merike, Inc.
123 Ross Street
Santa Cruz, CA 95060
USA

Phone: +1 831 818 4864
Fax: +1 831 457 2654
Email: kaeo@merike.com

Full Copyright Statement

Copyright (C) The Internet Society (2002). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it or
assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are included
on all such copies and derivative works. However, this document itself
may not be modified in any way, such as by removing the copyright notice
or references to the Internet Society or other Internet organizations,
except as needed for the purpose of developing Internet standards in
which case the procedures for copyrights defined in the Internet
Standards process must be followed, or as required to translate it into
languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS
IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.