idnits 2.17.1 

draft-ietf-ippm-owmetric-as-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 7
     longer pages, the longest (page 2) being 60 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 8 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2], [3], [1]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 2002) is 7834 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2679 (ref. '1') (Obsoleted by RFC 7679)

  ** Obsolete normative reference: RFC 2680 (ref. '2') (Obsoleted by RFC 7680)

  ** Downref: Normative reference to an Informational RFC: RFC 2330 (ref. '3')


     Summary: 8 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Draft                                    Henk Uijterwaal
3	Document: draft-ietf-ippm-owmetric-as-01.txt      Merike Kaeo
4	Expires: June 2003                                November 2002

6	                   One-Way Metric Applicability Statement

8	Status of this Memo

10	This document is an Internet-Draft and is in full conformance with all
11	provisions of Section 10 of RFC2026. Internet-Drafts are working
12	documents of the Internet Engineering Task Force (IETF), its areas, and
13	its working groups.  Note that other groups may also distribute working
14	documents as Internet-Drafts.

16	Internet-Drafts are draft documents valid for a maximum of six months
17	and may be updated, replaced, or obsoleted by other documents at any
18	time.  It is inappropriate to use Internet-Drafts as reference material
19	or to cite them other than as "work in progress."

21	The list of current Internet-Drafts can be accessed at
22	        http://www.ietf.org/ietf/1id-abstracts.txt

24	The list of Internet-Draft Shadow Directories can be accessed at
25	        http://www.ietf.org/shadow.html.

27	Abstract

29	Active traffic measurements are becoming more widely used to
30	ascertain network performance characteristics.  All active measurement
31	systems have the capability to measure one-way delay and one-way loss
32	metrics, as defined in RFC 2679 [1], "A One-way Delay Metric for IPPM", and
33	RFC 2680 [2], "A One-way Packet Loss Metric for IPPM", respectively.  To
34	ensure that the resulting numbers have some meaning, we attempt to
35	characterize how the measurements are taken and what would ensure that
36	the end numbers are indeed meaningful.  This document describes an
37	applicability statement (formerly known as best current practices) for
38	measuring the one-way delay and one-way loss metrics in operational
39	networks.

41	Overview

43	As more people start measuring one-way delay and one-way loss, the
44	result is a large set of numbers.  To ensure that these numbers have
45	some meaning, we attempt to characterize how the measurements are taken
46	and what would ensure that the end numbers are indeed meaningful.  Much
47	of the work relates to RFC 2679 [1], "A One-way Delay Metric for IPPM",
48	and RFC 2680 [2], "A One-way Packet Loss Metric for IPPM".  It is assumed
49	that the reader is familiar with both of these documents, as well as the
50	related framework document, RFC 2330 [3].

52	Conventions used in this document

54	The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
55	"SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
56	document are to be interpreted as described in RFC-2119 [4].

58	1. Introduction and Terminology

60	Active traffic measurements are starting to become more widely used to
61	ascertain network performance characteristics.  All active measurement
62	systems have the capability to measure one-way delay and one-way loss
63	metrics, as defined in RFC2679 [1] and RFC 2680 [2], respectively.
64	However, while these standards define how to measure quantities, there
65	are a large number of parameters that have to be set by the operator of
66	a measurement device. To ensure that the resulting numbers have some
67	meaning, we attempt to characterize how the measurements are taken and
68	what would ensure that the end numbers are indeed meaningful.  This
69	document describes best current practices for measuring the one-way
70	delay and one-way loss metrics in operational networks.

72	2. Ambiguities in one-way measurement metrics

74	RFC2679[1] and RFC2680[2] define metrics for one-way delay and one-way
75	loss, respectively. In practice, a large number of instances of these
76	metrics are measured and when comparing results from different
77	measurement entities, the numbers sometimes vary.  This is partly due to
78	ambiguities in the current documents for variables such as frequency of
79	measurement samples, packet size, timing issues, test duration and data
80	volumes.  This draft will give recommendations for these variables for
81	both inter-provider networks and internal networks.  Inter-provider
82	networks are those where the measurement end-points cross administrative
83	domain boundaries, such as from one ISP to another ISP.  Internal
84	networks are those where the measurement end-points are contained
85	within one administrative domain. This draft also discusses ambiguity
86	issues related to reporting the metrics, such as when a result is
87	different, alarms, and average/sigma versus percentiles.

89	3. Recommendations for one-way delay and loss measurements

91	3.1 Measurement samples

93	The number of measurement samples needs to be clearly defined.
94	Specifically, we need to specify how many packets are needed to say
95	something about a connection.   The frequency of packets should be high
96	enough that one has a reasonable chance to see effects on the link but low
97	enough that the regular traffic on the link is not affected by the
98	measurement.  In addition, it is important to ascertain how many
99	packets must be sent before the probability of a statistical fluke
100	becomes small.

102	[Question: Can we benefit from packet sampling BOF work here?  Ideally,
103	we would have math to calculate that, if an effect occurs at a rate of
104	N Hz and we send traffic at M Hz, there is a probability >X that at
105	least one packet will see this effect.]
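
One possible form of this calculation is sketched below.  It is not
taken from the metric documents and rests on assumptions that need to
be checked for a given network: probes are sent as a Poisson process
at M Hz, each occurrence of the effect lasts a fixed time, and
occurrences do not overlap.  The names detection_probability,
required_observation_time and event_duration_s are hypothetical.

```python
import math


def affected_fraction(event_rate_hz, event_duration_s):
    """Fraction of time the link shows the effect (assumes non-overlapping events)."""
    return min(1.0, event_rate_hz * event_duration_s)


def detection_probability(probe_rate_hz, event_rate_hz, event_duration_s, observation_s):
    """Probability that at least one probe is sent while the effect is present.

    Assumption: probes form a Poisson process, so the number of probes that
    fall into the affected time is Poisson distributed with the mean below.
    """
    expected_hits = (probe_rate_hz
                     * affected_fraction(event_rate_hz, event_duration_s)
                     * observation_s)
    return 1.0 - math.exp(-expected_hits)


def required_observation_time(probe_rate_hz, event_rate_hz, event_duration_s,
                              target_probability):
    """Observation time after which the detection probability reaches the target."""
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must be strictly between 0 and 1")
    hit_rate = probe_rate_hz * affected_fraction(event_rate_hz, event_duration_s)
    if hit_rate <= 0.0:
        raise ValueError("probe rate and event rate/duration must be positive")
    return -math.log(1.0 - target_probability) / hit_rate
```

Under these assumptions, with M = 1 Hz and an effect that occurs at
N = 0.01 Hz and lasts 1 second, required_observation_time(1.0, 0.01,
1.0, 0.95) gives the time one has to measure before the chance of
seeing the effect at least once reaches 95%.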

107	3.2 Packet size

109	The size of the packets is important as some devices tend to give
110	preferential treatment to smaller packets, thus causing the delay for
111	small packets to appear lower than for large packets, as well as
112	overtaking or reordering.  In all cases, packet sizes should be smaller
113	than the MTU to avoid effects due to fragmentation and reassembly.

115	Before running any actual measurements, one should perform tests to see
116	whether delay depends on packet size beyond simple scaling with size.
117	If this appears to be the case, one should try to estimate packet sizes
118	for "user" data using passive measurements and adjust the packet size
119	accordingly, or use a variable packet size according to the distribution
120	seen in user data. These tests should be repeated when the path between
121	source and destination changes.

123	Also note that some line card designs have buffer pools of different
124	sizes.  This can lead to loss being different for different packet
125	sizes.

127	When packets are larger than the minimum size required by the
128	measurement device, the remainder of the packet should be padded with
129	random bits in order to avoid compression being applied to any
130	measurement packets.  The algorithm used to generate these random bits,
131	as well as any seed values, has to be known in order to fully
132	understand any remaining issues with compression.
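
A minimal sketch of such padding follows.  It assumes that the
generator of Python's standard random module is an acceptable source
of padding and that the seed is recorded with the measurement; the
names make_padding and build_probe and the header layout are
hypothetical.

```python
import random


def make_padding(seed, length):
    """Return `length` pseudo-random bytes that can be regenerated from `seed`."""
    rng = random.Random(seed)  # private generator, so the global state is untouched
    return bytes(rng.getrandbits(8) for _ in range(length))


def build_probe(header, total_size, seed):
    """Pad a measurement header (bytes) up to `total_size` with seeded random bytes."""
    if total_size < len(header):
        raise ValueError("total_size is smaller than the measurement header")
    return header + make_padding(seed, total_size - len(header))
```

Because the generator and the seed are both known, the exact padding
of any packet can be regenerated later when compression effects are
being investigated.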

134	3.3 Timing issues

136	The measured metric should report experimental errors on the accuracy of
137	the clocks. This has been seen to be an issue only during measurement
138	test start-up.  NTP, for example, starts with an estimate and
139	corrects the internal clock of the device as the clock starts to
140	stabilize.

142	When the IPDV metric is being measured, one uses 4 time-stamps: the send
143	and arrival time of the first packet, and the send and arrival time of the
144	second packet.   The difference between these time-stamps will be small.
145	One should take care that sufficient accuracy for the calculation is
146	available and check that the experimental error on the overall result is
147	still small compared to the result.

149	The clock should be checked for correct performance at regular intervals
150	and measurements should be discarded when there is a problem.

152	One should check if the overall experimental error is small compared to
153	the delay before further processing of the data. The errors should be
154	recorded so they are available when calculating derived metrics such as
155	IPDV.
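
The calculation of IPDV from the 4 time-stamps, together with its
experimental error, can be sketched as follows.  The sketch assumes
that the 4 clock errors are independent, so that they can be added in
quadrature, and it uses a hypothetical threshold max_ratio to decide
what "small compared to the result" means; neither assumption comes
from the metric definitions.

```python
import math


def ipdv_with_error(first, second):
    """Return (ipdv, error) for two consecutive packets.

    Each packet is a tuple
    (send_time, arrival_time, send_clock_error, recv_clock_error), in seconds.
    IPDV is the one-way delay of the second packet minus that of the first.
    """
    s1, r1, es1, er1 = first
    s2, r2, es2, er2 = second
    ipdv = (r2 - s2) - (r1 - s1)
    # Assumption: independent errors, combined in quadrature.
    error = math.sqrt(es1 ** 2 + er1 ** 2 + es2 ** 2 + er2 ** 2)
    return ipdv, error


def error_is_small(value, error, max_ratio=0.1):
    """True if the error is at most `max_ratio` times the size of the value."""
    return error <= max_ratio * abs(value)
```

For delays of 10 ms and 12 ms with clock errors of 10 microseconds
each, the IPDV is about 2 ms with an error of 20 microseconds, which
passes the check.  With clock errors of 1 ms each, the same IPDV
would have to be discarded.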

157	3.4 Test duration

159	The test duration can be infinitely long depending on the metric and
160	application.  In order to easily see traffic variations, measurements
161	should run for a long time but have a limited life-time.  The former
162	requirement makes it easier to use the data for traffic engineering or
163	load balancing.

165	The latter requirement allows for easy failure detection: suppose one
166	is measuring between A and B. At some point in time, B stops receiving
167	packets. Until the measurement session times out, there is no way to
168	tell if this is due to full connectivity loss between A and B, or due to
169	a failure of device A.  When the measurement session ends, one can
170	attempt to restart it.  If one can contact the host at A, one can
171	conservatively assume that A crashed.

173	Question: How should intermediate results be reported while the test is
	in progress?

175	3.5. Data volumes

177	It is important to ensure that any measurement traffic does not
178	interfere with normal network operations.  Initially, one should check
179	if outgoing/incoming data volume for a box is small with respect to link
180	capacity of the first few hops to avoid measurements being affected by
181	loaded links. Also, one should check that the machine sending/receiving
182	the data can cope with the expected offered load. Lastly, make sure that
183	the total test traffic volume sent or received by a machine is small
184	compared to total link capacity.  A figure of 3% of the total available
185	capacity seems reasonable for routine monitoring of the performance of a
186	link without affecting the performance of that link.
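
The 3% check is simple arithmetic and can be sketched as follows.
The function names are hypothetical, and the sketch counts only the
bytes given as the packet size, so any lower-layer overhead has to be
included by the caller.

```python
def probe_load_fraction(packets_per_second, packet_size_bytes, link_capacity_bps):
    """Fraction of the link capacity used by the measurement traffic."""
    if link_capacity_bps <= 0:
        raise ValueError("link capacity must be positive")
    return packets_per_second * packet_size_bytes * 8 / link_capacity_bps


def within_budget(packets_per_second, packet_size_bytes, link_capacity_bps,
                  budget=0.03):
    """True if the measurement traffic stays at or below the budget (default 3%)."""
    load = probe_load_fraction(packets_per_second, packet_size_bytes,
                               link_capacity_bps)
    return load <= budget
```

For example, 100 packets per second of 1500 bytes on a 100 Mbit/s
link use 1.2% of the capacity and stay within the budget, whereas
1000 such packets per second use 12% and do not.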

188	Capacity and reordering measurements that fill a link at (almost) its
189	maximum line rate should not be used on production networks except
190	during scheduled maintenance or test periods.

192	4. Reporting metrics

194	4.1. When is a result different?

196	Given 2 sets of measurements, when is set 1 statistically different from
197	set 2?

199	When can one say with reasonable probability that nothing has changed
200	and that the network is OK?  This might vary from application to
201	application of the data.
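
One candidate answer, given here only as an illustration and not as a
recommendation of this draft, is a two-sample Kolmogorov-Smirnov
test, which compares the shapes of the two delay distributions
without assuming that they are Gaussian.  The sketch below uses the
usual large-sample approximation for the critical value; the names
ks_statistic and differ and the default alpha of 0.05 are
assumptions.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest distance between the empirical distribution functions of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in points)


def differ(sample_a, sample_b, alpha=0.05):
    """True if the samples differ at significance level `alpha`.

    Uses the asymptotic critical value, which is an approximation that
    is only reasonable for fairly large samples.
    """
    n, m = len(sample_a), len(sample_b)
    critical = math.sqrt(-math.log(alpha / 2.0) / 2.0) * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical
```

Whether a significance level of 5% is appropriate depends on the
application of the data, as noted above.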

203	4.2. Alarms

205	The criterion from the previous section determines when 2 results are
206	different.  This can be used to define thresholds for delay alarms.

208	4.3. Average/Sigma versus 2.5/median/97.5%

210	Since Average/Sigma for a one-way delay distribution is not well
211	defined, and percentiles are, we should use the latter.

213	If it is necessary to use Average/Sigma, then it should be specified how
214	losses are treated in the calculation.
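
A minimal sketch of the percentile report follows.  It assumes a
nearest-rank definition of the percentile and treats a lost packet as
an infinitely large delay; both choices are assumptions of this
sketch and should be stated whenever results are reported.  The names
percentile and delay_summary are hypothetical.

```python
import math


def percentile(values, percent):
    """Nearest-rank percentile: the smallest sample with at least
    `percent`% of all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100.0 * len(ordered)))
    return ordered[rank - 1]


def delay_summary(delays_s):
    """Return the (2.5%, median, 97.5%) delays in seconds.

    `delays_s` holds one-way delays, with None for a lost packet.
    Assumption: a lost packet counts as an infinite delay.
    """
    values = [math.inf if d is None else d for d in delays_s]
    return tuple(percentile(values, p) for p in (2.5, 50.0, 97.5))
```

With this treatment a loss never distorts the median as long as fewer
than half of the packets are lost, whereas an average would be
undefined as soon as a single packet is lost.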

216	Question: what about the loss metrics: average/sigma or percentiles?

218	Question: Larry Dunn suggests filtering theory to get a feeling for
219	          the shape of a curve.  Anybody who wants to elaborate?

221	5.0 Reporting the IPDV metric.

223	Using average/sigma for reporting the IPDV metric does not work: first
224	of all, the average will almost always be close to zero.  Second, the
225	distribution generally is not Gaussian and the sigma is not well defined
226	for the distributions that are being seen.

228	Using percentiles suffers from the same problem: the median will almost
229	always be 0, and the 2.5 and 97.5% will be the same.

231	What appears to work is using 2 percentiles, for example 5% and 25%; this
232	gives a reasonable description of the shape of the distribution.

234	Question: Stas: do you have some better wording?

236	6.0 Access to the data

238	Measurement results comprise both raw data and derived results.  The
239	raw data should be kept accessible to allow for historical trend
240	analysis.

242	A minimum set of informative fields to be stored is:
243	*    IP address of source
244	*    IP address of destination
245	*    Time the packet was sent (or arrived)
246	*    Delay
247	*    Experimental error on sending and receiving clock
248	*    Packet Size
249	*    ...
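
As an illustration, the fields listed above could be stored in a
record such as the following.  The class name, field names and units
are hypothetical, and the list above is open-ended, so further fields
may be needed.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DelayRecord:
    """One raw one-way delay measurement (field names are illustrative)."""
    source_ip: str
    destination_ip: str
    timestamp_s: float         # time the packet was sent (or arrived)
    delay_s: float             # measured one-way delay
    send_clock_error_s: float  # experimental error on the sending clock
    recv_clock_error_s: float  # experimental error on the receiving clock
    packet_size_bytes: int


record = DelayRecord(
    source_ip="192.0.2.1",
    destination_ip="198.51.100.1",
    timestamp_s=1037000000.0,
    delay_s=0.0123,
    send_clock_error_s=1e-5,
    recv_clock_error_s=1e-5,
    packet_size_bytes=100,
)
row = asdict(record)  # plain dict, ready to be written to a file or database
```

Keeping the clock errors next to each delay value makes them
available later when derived metrics are calculated.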

251	7.0. Control/Configuration

253	Define the maximum acceptable time to set up a measurement, and the
254	latency between a configuration change and its effect on the measurement.
255	The answer is not yet known; it might vary from operator to operator.

257	8. IANA Considerations

259	NONE at the moment.

261	9. Security Considerations

263	One-way delay packets can be used for a DDoS attack.  Even if each sending
264	box carefully checks that the outgoing rate to a destination is small, a
265	large number of sending boxes can still be used to overflow a link. To
266	protect against this, the configuration should be sent to the receiving
267	device before the measurements start.

269	Question: Are other sanity checks needed, and if so, what are they?

271	10. References

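273	[1] RFC 2679, "A One-way Delay Metric for IPPM"
274	[2] RFC 2680, "A One-way Packet Loss Metric for IPPM"
275	[3] RFC 2330, "Framework for IP Performance Metrics"
276	[4] RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels"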

278	11. Acknowledgments

280	Comments from Victor Reijs (HEANET) of July 9 and from Stanislav
281	Shalunov of July 26 and Aug 8 have been incorporated.

283	12. Authors' Addresses
284	Henk Uijterwaal
285	RIPE Network Coordination Centre
286	Singel 258
287	1016 AB Amsterdam
288	The Netherlands

290	Phone: +31.20.5354414
291	Fax: +31.20.5354445
292	Email: henk.uijterwaal@ripe.net

294	Merike Kaeo
295	Merike, Inc.
296	123 Ross Street
297	Santa Cruz, CA 95060
298	USA

300	Phone: +1 831 818 4864
301	Fax:   +1 831 457 2654
302	Email: kaeo@merike.com

304	Full Copyright Statement

305	Copyright (C) The Internet Society (2002).  All Rights Reserved.

307	This document and translations of it may be copied and furnished to
308	others, and derivative works that comment on or otherwise explain it or
309	assist in its implementation may be prepared, copied, published and
310	distributed, in whole or in part, without restriction of any kind,
311	provided that the above copyright notice and this paragraph are included
312	on all such copies and derivative works.  However, this document itself
313	may not be modified in any way, such as by removing the copyright notice
314	or references to the Internet Society or other Internet organizations,
315	except as needed for the purpose of developing Internet standards in
316	which case the procedures for copyrights defined in the Internet
317	Standards process must be followed, or as required to translate it into
318	languages other than English.

320	The limited permissions granted above are perpetual and will not be
321	revoked by the Internet Society or its successors or assigns.

323	This document and the information contained herein is provided on an "AS
324	IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK
325	FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT
326	LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT
327	INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
328	FITNESS FOR A PARTICULAR PURPOSE.