idnits 2.17.1 draft-ietf-ippm-loss-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-19) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an Abstract section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 1998) is 9379 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Downref: Normative reference to an Informational RFC: RFC 2330 (ref. '1') -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' Summary: 10 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group G. Almes 3 INTERNET-DRAFT S. Kalidindi 4 Expiration Date: March 1999 M. Zekauskas 5 Advanced Network & Services 6 August 1998 8 A Packet Loss Metric for IPPM 9 11 1. Status of this Memo 13 This document is an Internet-Draft. Internet-Drafts are working 14 documents of the Internet Engineering Task Force (IETF), its areas, 15 and its working groups. Note that other groups may also distribute 16 working documents as Internet-Drafts. 18 Internet-Drafts are draft documents valid for a maximum of six 19 months, and may be updated, replaced, or obsoleted by other documents 20 at any time. It is inappropriate to use Internet-Drafts as 21 reference material or to cite them other than as "work in progress." 23 To view the entire list of current Internet-Drafts, please check the 24 "1id-abstracts.txt" listing contained in the Internet-Drafts shadow 25 directories on ftp.is.co.za (Africa), nic.nordu.net (Northern 26 Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific 27 Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast). 29 This memo provides information for the Internet community. This memo 30 does not specify an Internet standard of any kind. Distribution of 31 this memo is unlimited. 33 2. Introduction 35 This memo defines a metric for packet loss across Internet paths. It 36 builds on notions introduced and discussed in the IPPM Framework 37 document, RFC 2330 [1]; the reader is assumed to be familiar with 38 that document. 40 This memo is intended to be parallel in structure to a companion 41 document for One-way Delay (currently "A One-way Delay Metric for 42 IPPM" ) [2]; the reader is assumed to 43 be familiar with that document. 45 The structure of the memo is as follows: 47 + A 'singleton' analytic metric, called Type-P-One-way-Packet-Loss, is 48 introduced to measure a single observation of packet transmission 49 or loss. 
51 + Using this singleton metric, a 'sample', called Type-P-One-way- 52 Packet-Loss-Poisson-Stream, is introduced to measure a sequence of 53 singleton transmissions and/or losses measured at times taken from 54 a Poisson process. 56 + Using this sample, several 'statistics' of the sample are defined 57 and discussed. 59 This progression from singleton to sample to statistics, with clear 60 separation among them, is important. 62 Whenever a technical term from the IPPM Framework document is first 63 used in this memo, it will be tagged with a trailing asterisk. For 64 example, "term*" indicates that "term" is defined in the Framework. 66 2.1. Motivation: 68 Understanding one-way packet loss of Type-P* packets from a source 69 host* to a destination host is useful for several reasons: 71 + Some applications do not perform well (or at all) if end-to-end 72 loss between hosts is large relative to some threshold value. 74 + Excessive packet loss may make it difficult to support certain 75 real-time applications (where the precise threshold of "excessive" 76 depends on the application). 78 + The larger the value of packet loss, the more difficult it is for 79 transport-layer protocols to sustain high bandwidths. 81 + The sensitivity of real-time applications and of transport-layer 82 protocols to loss becomes especially important when very large 83 delay-bandwidth products must be supported. 85 It is outside the scope of this document to say precisely how loss 86 metrics would be applied to specific problems. 88 2.2. General Issues Regarding Time 90 Whenever a time (i.e., a moment in history) is mentioned here, it is 91 understood to be measured in seconds (and fractions) relative to UTC. 93 As described more fully in the Framework document, there are four 94 distinct but related notions of clock uncertainty: 96 synchronization* 98 Synchronization measures the extent to which two clocks agree on 99 what time it is. 
For example, the clock on one host might be 100 5.4 msec ahead of the clock on a second host. 102 accuracy* 104 Accuracy measures the extent to which a given clock agrees with 105 UTC. For example, the clock on a host might be 27.1 msec behind 106 UTC. 108 resolution* 110 Resolution measures the precision of a given clock. For 111 example, the clock on an old Unix host might advance only once 112 every 10 msec, and thus have a resolution of only 10 msec. 114 skew* 116 Skew measures the change of accuracy, or of synchronization, 117 with time. For example, the clock on a given host might gain 118 1.3 msec per hour and thus be 27.1 msec behind UTC at one time 119 and only 25.8 msec an hour later. In this case, we say that the 120 clock of the given host has a skew of 1.3 msec per hour relative 121 to UTC, and this threatens accuracy. We might also speak of the 122 skew of one clock relative to another clock, and this threatens 123 synchronization. 125 3. A Singleton Definition for One-way Packet Loss 127 3.1. Metric Name: 129 Type-P-One-way-Packet-Loss 131 3.2. Metric Parameters: 133 + Src, the IP address of a host 135 + Dst, the IP address of a host 137 + T, a time 139 3.3. Metric Units: 141 The value of a Type-P-One-way-Packet-Loss is either a zero 142 (signifying successful transmission of the packet) or a one 143 (signifying loss). 145 3.4. Definition: 147 >>The *Type-P-One-way-Packet-Loss* from Src to Dst at T is 0<< means 148 that Src sent the first bit of a Type-P packet to Dst at wire-time* T 149 and that Dst received that packet. 151 >>The *Type-P-One-way-Packet-Loss* from Src to Dst at T is 1<< means 152 that Src sent the first bit of a Type-P packet to Dst at wire-time T 153 and that Dst did not receive that packet. 155 3.5. Discussion: 157 Thus, Type-P-One-way-Packet-Loss is 0 exactly when Type-P-One-way- 158 Delay is a finite positive value, and it is 1 exactly when Type-P- 159 One-way-Delay is undefined. 
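The relationship between the singleton loss value and one-way delay stated above can be sketched in a few lines. This is only an illustration, not part of the metric definition: the class name, field names, and the use of None to stand for an undefined delay are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OneWayPacketLoss:
    """One Type-P-One-way-Packet-Loss singleton (field names are illustrative)."""
    src: str    # Src, the IP address of a host
    dst: str    # Dst, the IP address of a host
    t: float    # T, wire-time of the first bit, in seconds relative to UTC
    loss: int   # 0 = packet received, 1 = packet lost


def loss_from_delay(delay: Optional[float]) -> int:
    """Map a one-way delay to a loss value; None stands for an undefined delay."""
    return 1 if delay is None else 0
```

A finite delay therefore maps to a loss of 0, and an undefined delay maps to a loss of 1.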
161 The following issues are likely to come up in practice: 163 + A given methodology will have to include a way to distinguish 164 between a packet loss and a very large (but finite) delay. As 165 noted by Mahdavi and Paxson [3], simple upper bounds (such as the 166 255 seconds theoretical upper bound on the lifetimes of IP 167 packets [4]) could be used, but good engineering, including an 168 understanding of packet lifetimes, will be needed in practice. 169 {Comment: Note that, for many applications of these metrics, there 170 may be no harm in treating a large delay as packet loss. An audio 171 playback packet, for example, that arrives only after the playback 172 point may as well have been lost.} 174 + If the packet arrives, but is corrupted, then it is counted as 175 lost. {Comment: One is tempted to count the packet as received 176 since corruption and packet loss are related but distinct 177 phenomena. If the IP header is corrupted, however, one cannot be 178 sure about the source or destination IP addresses and is thus on 179 shaky ground about knowing that the corrupted received packet 180 corresponds to a given sent test packet. Similarly, if other 181 parts of the packet needed by the methodology to know that the 182 corrupted received packet corresponds to a given sent test packet 183 are corrupted, then such a packet would have to be counted as lost. Counting 184 these packets as lost but packets with corruption in other parts of 185 the packet as not lost would be inconsistent.} 187 + If the packet is duplicated along the path (or paths) so that 188 multiple non-corrupt copies arrive at the destination, then the 189 packet is counted as received. 191 + If the packet is fragmented and if, for whatever reason, 192 reassembly does not occur, then the packet will be deemed lost. 194 3.6. Methodologies: 196 As with other Type-P-* metrics, the detailed methodology will depend 197 on the Type-P (e.g., protocol number, UDP/TCP port number, size, 198 precedence). 
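The counting rules from the Discussion in 3.5 (corrupted packets count as lost, duplicated packets count as received, unreassembled fragments count as lost) can be sketched as follows. The ReceivedCopy record and its two flags are hypothetical names introduced only for this example; how a real instrument detects corruption or failed reassembly is outside the sketch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReceivedCopy:
    """One copy of a test packet seen at Dst (hypothetical record)."""
    corrupted: bool     # any corruption was detected in the packet
    reassembled: bool   # False if fragments were never reassembled


def classify(copies: Iterable[ReceivedCopy]) -> int:
    """Return 0 if at least one intact copy arrived, otherwise 1."""
    for pkt in copies:
        if pkt.reassembled and not pkt.corrupted:
            return 0
    return 1
```

An empty list of copies models a packet that never arrived, which yields a loss of 1.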
200 Generally, for a given Type-P, one possible methodology would proceed 201 as follows: 203 + Arrange that Src and Dst have clocks that are synchronized with 204 each other. The degree of synchronization is a parameter of the 205 methodology, and depends on the threshold used to determine loss 206 (see below). 208 + At the Src host, select Src and Dst IP addresses, and form a test 209 packet of Type-P with these addresses. 211 + At the Dst host, arrange to receive the packet. 213 + At the Src host, place a timestamp in the prepared Type-P packet, 214 and send it towards Dst. 216 + If the packet arrives within a reasonable period of time, the one- 217 way packet-loss is taken to be zero. 219 + If the packet fails to arrive within a reasonable period of time, 220 the one-way packet-loss is taken to be one. Note that the 221 threshold of "reasonable" here is a parameter of the methodology. 223 {Comment: The definition of reasonable is intentionally vague, and 224 is intended to indicate a value "Th" so large that any value in 225 the closed interval [Th-delta, Th+delta] is an equivalent 226 threshold for loss. Here, delta encompasses all error in clock 227 synchronization along the measured path. If there is a single 228 value after which the packet must be counted as lost, then we 229 reintroduce the need for a degree of clock synchronization similar 230 to that needed for one-way delay. Therefore, if a measure of 231 packet loss parameterized by a specific non-huge "reasonable" 232 time-out value is needed, one can always measure one-way delay and 233 see what percentage of packets from a given stream exceed a given 234 time-out value.} 236 Issues such as the packet format, the means by which Dst knows when 237 to expect the test packet, and the means by which Src and Dst are 238 synchronized are outside the scope of this document. 
{Comment: We 239 plan to document elsewhere our own work in describing such more 240 detailed implementation techniques and we encourage others to do so as 241 well.} 243 3.7. Errors and Uncertainties: 245 The description of any specific measurement method should include an 246 accounting and analysis of various sources of error or uncertainty. 247 The Framework document provides general guidance on this point. 249 For loss, there are three sources of error: 251 + Synchronization between clocks on Src and Dst. 253 + The packet-loss threshold (which is related to the synchronization 254 between clocks). 256 + Resource limits in the network interface or software on the 257 receiving instrument. 259 The first two sources are interrelated and could result in a test 260 packet with finite delay being reported as lost. Type-P-One-way- 261 Packet-Loss is 1 if the test packet does not arrive, or if it does 262 arrive and the difference between Src timestamp and Dst timestamp is 263 greater than the "reasonable period of time", or loss threshold. If 264 the clocks are not sufficiently synchronized, the loss threshold may 265 not be "reasonable" - the packet may take much less time to arrive 266 than its Src timestamp indicates. Similarly, if the loss threshold 267 is set too low, then many packets may be counted as lost. The loss 268 threshold must be high enough, and the clocks synchronized well 269 enough so that a packet that arrives is rarely counted as lost. (See 270 the discussions in the previous two sections.) 272 Since the sensitivity of packet loss measurement to lack of clock 273 synchronization is less than for delay, we refer the reader to the 274 treatment of synchronization errors in the One-way Delay metric [2] 275 for more details. 277 The last source of error, resource limits, causes the packet to be 278 dropped by the measurement instrument, and counted as lost when in 279 fact the network delivered the packet in reasonable time. 
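The threshold rule used by the methodology in 3.6 and analyzed in this section can be sketched as below. This is a minimal illustration under stated assumptions: timestamps are in seconds, None stands for a packet that never arrived, and the function and parameter names are invented for the example. The second function follows the suggestion in 3.6 of deriving loss for a specific time-out from measured one-way delays; treating undefined delays as exceeding the time-out is an assumption of this sketch.

```python
from typing import Iterable, Optional


def one_way_loss(sent_at: float, received_at: Optional[float],
                 threshold: float) -> int:
    """Return 0 if the packet arrived within `threshold` seconds, else 1."""
    if received_at is None:
        return 1  # the packet never arrived
    return 0 if (received_at - sent_at) <= threshold else 1


def fraction_exceeding(delays: Iterable[Optional[float]],
                       timeout: float) -> float:
    """Fraction of packets whose delay is undefined or greater than `timeout`."""
    values = list(delays)
    if not values:
        raise ValueError("empty stream")
    late = sum(1 for d in values if d is None or d > timeout)
    return late / len(values)
```

Because the difference received_at - sent_at mixes readings from two clocks, any synchronization error shifts it directly, which is why the threshold should be large compared with that error.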
281 The measurement instruments should be calibrated such that the loss 282 threshold is reasonable for application of the metrics and the clocks 283 are synchronized well enough that the loss threshold remains reasonable. 285 In addition, the instruments should be checked to ensure the 286 probability is low that a packet arrives at the network interface, 287 but is lost due to congestion on the interface or to other resource 288 exhaustion (e.g., buffers) on the instrument. 290 3.8. Reporting the metric: 292 The calibration and context in which the metric is measured must be 293 carefully considered, and should always be reported along with metric 294 results. We now present four items to consider: Type-P of the test 295 packets, the loss threshold, instrument calibration, and the path 296 traversed by the test packets. This list is not exhaustive; any 297 additional information that could be useful in interpreting 298 applications of the metrics should also be reported. 300 3.8.1. Type-P 302 As noted in the Framework document [1], the value of the metric may 303 depend on the type of IP packets used to make the measurement, or 304 "Type-P". The value of Type-P-One-way-Packet-Loss could change if the 305 protocol (UDP or TCP), port number, size, or arrangement for special 306 treatment (e.g., IP precedence or RSVP) changes. The exact Type-P 307 used to make the measurements must be accurately reported. 309 3.8.2. Loss threshold 311 The threshold (or methodology to distinguish) between a large finite 312 delay and loss should be reported. 314 3.8.3. Calibration results 316 The degree of synchronization between the Src and Dst clocks should 317 be reported. If possible, report the probability that a test packet 318 that arrives at the Dst network interface is reported as lost due to 319 resource exhaustion on Dst. 321 3.8.4. Path 323 Finally, the path traversed by the packet should be reported, if 324 possible. 
In general it is impractical to know the precise path a 325 given packet takes through the network. The precise path may be 326 known for certain Type-P on short or stable paths. If Type-P 327 includes the record route (or loose-source route) option in the IP 328 header, and the path is short enough, and all routers* on the path 329 support record (or loose-source) route, then the path will be 330 precisely recorded. This is impractical because the route must be 331 short enough, many routers do not support (or are not configured for) 332 record route, and use of this feature would often artificially worsen 333 the performance observed by removing the packet from common-case 334 processing. However, partial information is still valuable context. 335 For example, if a host can choose between two links* (and hence two 336 separate routes from Src to Dst), then the initial link used is 337 valuable context. {Comment: For example, with Merit's NetNow setup, 338 a Src on one NAP can reach a Dst on another NAP by any of several 339 different backbone networks.} 341 4. A Definition for Samples of One-way Packet Loss 343 Given the singleton metric Type-P-One-way-Packet-Loss, we now define 344 one particular sample of such singletons. The idea of the sample is 345 to select a particular binding of the parameters Src, Dst, and Type- 346 P, then define a sample of values of parameter T. The means for 347 defining the values of T is to select a beginning time T0, a final 348 time Tf, and an average rate lambda, then define a pseudo-random 349 Poisson arrival process of rate lambda, whose values fall between T0 350 and Tf. The time interval between successive values of T will then 351 average 1/lambda. 353 4.1. Metric Name: 355 Type-P-One-way-Packet-Loss-Poisson-Stream 357 4.2. Metric Parameters: 359 + Src, the IP address of a host 361 + Dst, the IP address of a host 363 + T0, a time 365 + Tf, a time 367 + lambda, a rate in reciprocal seconds 369 4.3. 
Metric Units: 371 A sequence of pairs; the elements of each pair are: 373 + T, a time, and 375 + L, either a zero or a one 377 The values of T in the sequence are monotonically increasing. Note that 378 T would be a valid parameter to Type-P-One-way-Packet-Loss, and that 379 L would be a valid value of Type-P-One-way-Packet-Loss. 381 4.4. Definition: 383 Given T0, Tf, and lambda, we compute a pseudo-random Poisson process 384 beginning at or before T0, with average arrival rate lambda, and 385 ending at or after Tf. Those time values greater than or equal to T0 386 and less than or equal to Tf are then selected. At each of the times 387 in this process, we obtain the value of Type-P-One-way-Packet-Loss at 388 this time. The value of the sample is the sequence made up of the 389 resulting pairs. If there are no such pairs, the 390 sequence is of length zero and the sample is said to be empty. 392 4.5. Discussion: 394 Note first that, since a pseudo-random number sequence is employed, 395 the sequence of times, and hence the value of the sample, is not 396 fully specified. Pseudo-random number generators of good quality 397 will be needed to achieve the desired qualities. 399 The sample is defined in terms of a Poisson process both to avoid the 400 effects of self-synchronization and to capture a sample that is 401 statistically as unbiased as possible. {Comment: there is, of 402 course, no claim that real Internet traffic arrives according to a 403 Poisson arrival process. 405 It is important to note that, in contrast to this metric, loss rates 406 observed by transport connections do not reflect unbiased samples. 
407 For example, TCP transmissions both (1) occur in bursts, which can 408 induce loss due to the burst volume that would not otherwise have 409 been observed, and (2) adapt their transmission rate in an attempt to 410 minimize the loss rate observed by the connection.} 412 All the singleton Type-P-One-way-Packet-Loss metrics in the sequence 413 will have the same values of Src, Dst, and Type-P. 415 Note also that, given one sample that runs from T0 to Tf, and given 416 new time values T0' and Tf' such that T0 <= T0' <= Tf' <= Tf, the 417 subsequence of the given sample whose time values fall between T0' 418 and Tf' is also a valid Type-P-One-way-Packet-Loss-Poisson-Stream 419 sample. 421 4.6. Methodologies: 423 The methodologies follow directly from: 425 + the selection of specific times, using the specified Poisson 426 arrival process, and 428 + the methodologies discussion already given for the singleton Type- 429 P-One-way-Packet-Loss metric. 431 Care must be given to correctly handle out-of-order arrival of test 432 packets; it is possible that the Src could send one test packet at 433 TS[i], then send a second one (later) at TS[i+1], while the Dst could 434 receive the second test packet at TR[i+1], and then receive the first 435 one (later) at TR[i]. 437 4.7. Errors and Uncertainties: 439 In addition to sources of errors and uncertainties associated with 440 methods employed to measure the singleton values that make up the 441 sample, care must be given to analyze the accuracy of the Poisson 442 arrival process of the wire-time of the sending of the test packets. 443 Problems with this process could be caused by several things, 444 including problems with the pseudo-random number techniques used to 445 generate the Poisson arrival process. The Framework document shows 446 how to use the Anderson-Darling test to verify the Poisson process. 448 4.8. 
Reporting the metric: 450 The calibration and context for the underlying singletons should be 451 reported along with the stream. (See "Reporting the metric" for 452 Type-P-One-way-Packet-Loss.) 454 5. Some Statistics Definitions for One-way Packet Loss 456 Given the sample metric Type-P-One-way-Packet-Loss-Poisson-Stream, we 457 now offer several statistics of that sample. These statistics are 458 offered mostly to be illustrative of what could be done. 460 5.1. Type-P-One-way-Packet-Loss-Average 462 Given a Type-P-One-way-Packet-Loss-Poisson-Stream, the average of all 463 the L values in the Stream. In addition, the Type-P-One-way-Packet- 464 Loss-Average is undefined if the sample is empty. 466 Example: suppose we take a sample and the results are: 467 Stream1 = < 468 469 470 471 472 473 > 474 Then the average would be 0.2. 476 Note that, since healthy Internet paths should be operating at loss 477 rates below 1% (particularly if high delay-bandwidth products are to 478 be sustained), the sample sizes needed might be larger than one would 479 like. Thus, for example, if one wants to discriminate between 480 various fractions of 1% over one-minute periods, then several hundred 481 samples per minute might be needed. This would result in larger 482 values of lambda than one would ordinarily want. 484 Note that although the loss threshold should be set such that any 485 errors in loss are not significant, if the probability that a packet 486 which arrived is counted as lost due to resource exhaustion is 487 significant compared to the loss rate of interest, Type-P-One-way- 488 Packet-Loss-Average will be meaningless. 490 6. Security Considerations 492 Conducting Internet measurements raises both security and privacy 493 concerns. This memo does not specify an implementation of the 494 metrics, so it does not directly affect the security of the Internet 495 nor of applications which run on the Internet. 
However, 496 implementations of these metrics must be mindful of security and 497 privacy concerns. 499 There are two types of security concerns: potential harm caused by 500 the measurements, and potential harm to the measurements. The 501 measurements could cause harm because they are active, and inject 502 packets into the network. The measurement parameters must be 503 carefully selected so that the measurements inject trivial amounts of 504 additional traffic into the networks they measure. If they inject 505 "too much" traffic, they can skew the results of the measurement, and 506 in extreme cases cause congestion and denial of service. 508 The measurements themselves could be harmed by routers giving 509 measurement traffic a different priority than "normal" traffic, or by 510 an attacker injecting artificial measurement traffic. If routers can 511 recognize measurement traffic and treat it separately, the 512 measurements will not reflect actual user traffic. If an attacker 513 injects artificial traffic that is accepted as legitimate, the loss 514 rate will be artificially lowered. Therefore, the measurement 515 methodologies should include appropriate techniques to reduce the 516 probability that measurement traffic can be distinguished from "normal" 517 traffic. Authentication techniques, such as digital signatures, may 518 be used where appropriate to guard against injected traffic attacks. 520 The privacy concerns of network measurement are limited by the active 521 measurements described in this memo. Unlike passive measurements, 522 there can be no release of existing user data. 524 7. Acknowledgements 526 Thanks are due to Matt Mathis for encouraging this work and for 527 calling attention on so many occasions to the significance of packet 528 loss. 530 Thanks are due also to Vern Paxson for his valuable comments on early 531 drafts. 533 8. References 535 [1] V. Paxson, G. Almes, J. Mahdavi, and M. 
Mathis, "Framework for 536 IP Performance Metrics", RFC 2330, May 1998. 538 [2] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way Delay 539 Metric for IPPM", Internet-Draft , 540 August 1998. 542 [3] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring 543 Connectivity", Internet-Draft , August 1998. 546 [4] J. Postel, "Internet Protocol", RFC 791, September 1981. 548 9. Authors' Addresses 550 Guy Almes 551 Advanced Network & Services, Inc. 552 200 Business Park Drive 553 Armonk, NY 10504 554 USA 556 Phone: +1 914 765 1120 557 EMail: almes@advanced.org 559 Sunil Kalidindi 560 Advanced Network & Services, Inc. 561 200 Business Park Drive 562 Armonk, NY 10504 563 USA 565 Phone: +1 914 765 1128 566 EMail: kalidindi@advanced.org 567 Matthew J. Zekauskas 568 Advanced Network & Services, Inc. 569 200 Business Park Drive 570 Armonk, NY 10504 571 USA 573 Phone: +1 914 765 1112 574 EMail: matt@advanced.org 576 Expiration date: March, 1999