2.8.3 IP Performance Metrics (ippm)

NOTE: This charter is a snapshot of the 50th IETF Meeting in Minneapolis, Minnesota. It may now be out-of-date. Last Modified: 14-Mar-01


Chair(s):

Merike Kaeo <kaeo@merike.com>
Matthew Zekauskas <matt@advanced.org>

Transport Area Director(s):

Scott Bradner <sob@harvard.edu>
Allison Mankin <mankin@east.isi.edu>

Transport Area Advisor:

Allison Mankin <mankin@east.isi.edu>

Mailing Lists:

General Discussion: ippm@advanced.org
To Subscribe: ippm-request@advanced.org
Archive: http://www.advanced.org/IPPM/archive/

Description of Working Group:

The IPPM WG will develop a set of standard metrics that can be applied to the quality, performance, and reliability of Internet data delivery services. These metrics will be designed such that they can be performed by network operators, end users, or independent testing groups. It is important that the metrics not represent a value judgement (i.e., define "good" and "bad"), but rather provide unbiased quantitative measures of performance.

Functions peripheral to Internet data delivery services, such as NOC/NIC services, are beyond the scope of this working group.
The IPPM WG will define specific metrics, cultivate technology for the accurate measurement and documentation of these metrics, and promote the sharing of effective tools and procedures for measuring these metrics. It will also offer a forum for sharing information about the implementation and application of these metrics, but actual implementations and applications are understood to be beyond the scope of this working group.

Goals and Milestones:

Oct 97    Submit drafts of standard metrics for connectivity and treno-bulk-throughput.

Nov 97    Submit a framework document describing terms and notions used in the IPPM effort, and the creation of metrics by the working group, to IESG for publication as an Informational RFC.

Feb 98    Submit documents on delay and loss to IESG for publication as Informational RFCs.

Apr 98    Submit a document on connectivity to IESG for publication as an Informational RFC.

Sep 98    Submit a document on bulk-throughput to IESG for publication as an Informational RFC.

Request For Comments:

Framework for IP Performance Metrics
IPPM Metrics for Measuring Connectivity
A One-way Delay Metric for IPPM
A One-way Packet Loss Metric for IPPM
A Round-trip Delay Metric for IPPM

Current Meeting Report

Minutes of the IP Performance Metrics WG (ippm)
Tuesday, 20 March 2001, 14:15-15:15 and 15:45-16:45

The meeting was chaired by the working group chairs, Matt Zekauskas and Merike Kaeo. Thanks to Will Leland and Bob Mandeville who took notes during the meeting.


1. Agenda bashing
2. Discussion on advancing metrics through the IETF standards process
3. Discussion on IPDV Reorganization
4. Preliminary results on periodic streams: periodic vs. Poisson and discussion on npmps implementation

-- Cookies --

5. Discussion on last call for BTC framework, and the CAP metric
6. One-way delay protocol document discussion; requirements document
7. An implementation of one-way delay
8. Input for a protocol requirements document
9. Charter milestone revisions

IETF home page: http://www.ietf.org/html.charters/ippm-charter.html
IPPM home page: http://www.advanced.org/IPPM/

2. Discussion on advancing metrics through the IETF standards process -- Vern Paxson

Matt Z noted that there was not much discussion yet on the standards draft authored by Scott Bradner, Vern Paxson, and Allison Mankin.

Vern Paxson wanted to start discussions on the question of giving metrics the full weight of standards. He presented two requirements for progressing to an IETF standard: interoperable implementations and an assessment of which options are actually implemented and used. Interoperability demonstrates that the spec is unambiguous, and shows that there is sufficient community interest. The purpose of the options assessment is to remove unneeded options from the standard.

What does it mean for metrics to interoperate? Vern noted that similar issues exist for MIBs. He suggested that a procedure must be agreed upon to ensure that measurements are repeatable and reliable. The metrics standards process should also be applicable to the BMWG metrics that are placed on the standards track. For IPPM metrics, since network conditions change, measurement implementations might be verified by running them at the same time between the same network endpoints. They might also be verified by running two implementations on a randomized schedule; this approach might be important for tests that cannot be run simultaneously, such as bulk transfer capacity. An example criterion for agreement between two tests is that 90% of the values reported by method A fall within two sigma of the values reported by method B. He noted that the IESG should be explicitly given the opportunity to decide how much and how often a measurement method may deviate from the standard and still be validated as "interoperable". The Area Directors will determine whether the coverage offered by the proposed implementations is complete.
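The two-sigma criterion can be made concrete. The sketch below is only an illustration: the function name is invented, and reading "within two sigma of the values reported by method B" as "within two standard deviations of B's sample mean" is an assumption, since the draft leaves the exact rule open.

```python
import statistics

def agrees(method_a, method_b, frac=0.90, nsigma=2.0):
    # Example criterion from the discussion: do at least `frac` of
    # method A's reported values fall within `nsigma` standard
    # deviations of method B's sample mean?
    mu = statistics.mean(method_b)
    sigma = statistics.stdev(method_b)
    within = sum(1 for x in method_a if abs(x - mu) <= nsigma * sigma)
    return within / len(method_a) >= frac
```

With one-way delay samples in milliseconds, two implementations measuring the same path on comparable schedules would be expected to pass this check, while systematically offset results would fail it.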

On testing option values, the existing draft will be reworded to clarify that a representative range of values must be covered. For example, the definition of Type P packets is very generic. Testing is expected to cover a representative range rather than literally cover all possible Type P values.

Vern ended by stressing that the current draft is intended to generate discussion, and is not the final version.

In the discussion that followed, it was first suggested that lab tests might determine the similarity of results in a controlled environment. Matt Z noted that the existing IPPM metrics mention the use of lab testing to help determine the error range of an implementation. The issue of coverage was raised by asking what would happen if methods A and B produced the same results, and C and D did as well, but A and C did not. There is also a need to determine that the population is the same when measurements using different implementations are performed on a live network. A comment was made that the 90% rule is ad hoc; ideally, implementations should be compared using a statistical framework that accounts for the observed population of empirical values. The goal would be to accept or reject the hypothesis that two implementations are recording values drawn from the same population. However, such a framework might be difficult to achieve in practice. The question of two-dimensional distributions was raised; Vern recognized the importance of this requirement but pointed out that no approach had yet been defined for it.
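One conventional choice for the same-population hypothesis test discussed here is a two-sample Kolmogorov-Smirnov test. The sketch below (function name assumed; the minutes do not prescribe a particular test) computes the KS statistic from scratch:

```python
def ks_statistic(sample_a, sample_b):
    # Maximum vertical gap between the two empirical CDFs:
    # 0.0 means the samples look identical, 1.0 means no overlap.
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance past ties in both samples before measuring the gap.
        while i < na and a[i] == x:
            i += 1
        while j < nb and b[j] == x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d
```

At a 5% significance level the same-population hypothesis is conventionally rejected when the statistic exceeds 1.36 * sqrt((na + nb) / (na * nb)), which gives one concrete, if simple, instance of the framework the discussion called for.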

Scott Bradner pointed out that the draft was germane to both the BMWG and IPPM. He distinguished the BMWG lab environment from the IPPM live network environment and suggested that the test of a metric should be relative to the environment for which it was intended. He questioned the usefulness of a lab test for a network metric and said that the draft should be expanded to cover both and to note the right things in the right places. Lab testing might isolate problems for an implementation, but simply passing lab testing does not provide sufficient confidence that an implementation measures real networks faithfully.

3. Discussion on IPDV Reorganization
-- Matt Zekauskas, from slides by Phil Chimento

The IPDV draft has been cleaned up by Phil Chimento. The major new definition is the concept of a selection function, which allows explicit designation of the selected pairs of singleton measurements whose delay values are compared to compute one-way delay variation. The definition of jitter does not change.
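As a toy illustration of the selection-function idea (the function name is invented, and the consecutive-pair default is just one common choice, not the draft's definition):

```python
def delay_variation(delays, select=None):
    # `select` maps the number of singletons to the index pairs whose
    # one-way delays are compared. The default compares consecutive
    # singletons; any other selection function can be plugged in.
    n = len(delays)
    pairs = select(n) if select else zip(range(n - 1), range(1, n))
    return [delays[j] - delays[i] for i, j in pairs]
```

For one-way delays of 10, 12, and 11 ms the default yields variations of +2 and -1 ms; a selection function pairing every singleton against a fixed reference singleton would be one way to express an ITU-style variant as a special case.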

In the discussion it was pointed out that not all comments had been incorporated in the new version of the draft; in particular, Poisson streams were included but not periodic streams. A question arose as to whether the statistics in the drafts are examples or part of the proposed standard: in particular, did they all need to be implemented in order to advance the metric along the standards track? Vern Paxson did not feel that the example statistics (derived from measurement streams) are required parts of the standard. Vern and Matt both suggested that as we gain experience with useful statistics, they could be described in best current practices documents. It was also pointed out that there may be a conflict with the standards work being done by the ITU, which in some respects is more advanced than the IPPM's. The main point is that we do not need two incompatible standards for the same thing. The existing approaches do differ; for example, the ITU definition does not consider route flapping. In the IPDV metric, the selection function is extremely powerful and is intended to allow the IPDV metric to include the ITU approach as a special case. Matt Z said he would review the current ITU standards work and send a note to the mailing list. At the end of the IPDV discussion there was rough consensus to move the draft forward as a proposed standard.

4. Preliminary results on periodic streams: periodic vs. Poisson
and discussion on npmps implementation -- Al Morton

Al Morton presented results to fill in the lack of experimental data supporting the work done on periodic stream measurement. Measurements included a mix of Poisson and periodic packet distributions and were run between three points in an existing network. For details on the specific setup and measurement parameters, see the presentation slides. For the most part, results were very stable and could have been captured with a frequency of only ten packets per day, so the environment did not reveal any significant differences between the behavior observed by Poisson and periodic measurements. However, a period was found where four re-routes occurred between two of the measurement points over one hour. The Poisson sampling missed one short burst of lost packets that was picked up by the periodic sampling during the re-routes. Al suggested that these results argue in favor of using both sampling distributions, which work well together.

In the discussion, it was observed that Poisson sampling would have worked equally well for detecting short bursts of loss. To detect short events, any sampling method must be done on a finer time scale than the event length. It was pointed out that the infrequent stretches of periodic measurements just happened to pick up one of the (presumably frequent) short bursts; in response, Al pointed out that the result obtained from the periodic sampling was not fortuitous: the measurement technique is not intended to capture every short period of loss, but to detect the existence of short loss bursts if they recur in typical network behavior. That is, the sampling method was designed to pick up the type of anomaly that it indeed picked up. It was also suggested that a Markov-modulated Poisson process might be used to switch between "sparse" and "dense" rates of Poisson sampling, though Al suggested that would have been more work to implement in their particular tests.
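For reference, the two sampling disciplines under comparison differ only in how probe send times are spaced. A minimal sketch (function names assumed):

```python
import random

def periodic_schedule(start, period, count):
    # Fixed spacing: good at catching recurring events, but can alias
    # with periodic network behavior.
    return [start + k * period for k in range(count)]

def poisson_schedule(start, rate, count, seed=1):
    # Exponentially distributed gaps (a Poisson process with the given
    # average rate): unbiased sampling, but it may miss short events
    # that fall between widely spaced probes.
    rng = random.Random(seed)
    t, times = start, []
    for _ in range(count):
        t += rng.expovariate(rate)
        times.append(t)
    return times
```

This mirrors the discussion's point directly: either schedule detects a short event only if its local probe spacing happens to be finer than the event length.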

5. Discussion on last call for BTC framework, and the CAP metric -- Mark Allman

Mark Allman led a discussion of the BTC Framework draft that he and Matt Mathis co-authored. Mark said the last call on the BTC framework was based on version 5 of the draft, which rolled in all comments. There were no known outstanding issues.

In the discussion there was a question on whether the abstract and text should state that reference implementations and pseudocode were expected to be the norm for defining a BTC metric, since IETF policy is not to embed reference implementations in standards. Mark pointed out that Matt Mathis' original concern was that a precise definition of a specific BTC metric would require code. However, the statement is now vestigial -- based on later drafts and the CAP draft's experience, it should be possible to eliminate code in the standard's definition. It was also pointed out that limited use of pseudocode to illuminate specific algorithms is already widely used in IETF standards.

Mark also presented an opportunity to discuss his new draft for a specific BTC metric, CAP, between co-operating sending and receiving processes. No issues had been raised so far. Matt Z encouraged people to find the time to implement the metric, to clarify and debug the draft.

6. One-way delay protocol document discussion; requirements document
--Stanislav Shalunov

Merike introduced the discussion on the one-way delay protocol by noting that there has been much interest on the mailing list, but that in order to revise the charter to cover protocol development, the ADs requested a requirements document stating what problem the protocol is meant to solve. Today there would be a discussion of the current owdp draft, a presentation by another implementer of one-way delay measurements, and an opportunity to raise requirements. Merike and Stanislav will create the first draft of the requirements document in a timely manner for community comment.

Stanislav Shalunov then took up the discussion of the protocol, starting by noting that the protocol needed both a requirements document and, eventually, an outside security review. Matt Z noted that the Chairs had asked a Security AD if there was a proper procedure; the AD responded that when the time is right the WG chairs should forward a request to the Security ADs, who would find someone from the security area to perform a security review of the protocol. Stanislav then went into the changes in this version; see the slides for details. Stanislav said that the test protocol could be run without the control protocol, and that provision was made for partial results retrieval, which would be useful for continuous measurements. Type P descriptions are currently provided by DSCP and PHB values, since no universal service identifiers have been standardized. Per-session precision was removed, replaced by a coding of the timestamp precision with each timestamp value. One of the objectives was to fit the measurement packet into an ATM cell. Issues on the table were non-Poisson packet distributions and support for timestamps known a posteriori. Stanislav did not see the value in adding a full TLV format, and invited comments.

In the discussion, Henk contested the newly proposed implementation, which added manipulation of the timestamp to the original RFC 1305 timestamp definition. Matt Z suggested that a discussion of the change be taken up off-line. A request to make it possible to pick up measurement packets in the middle of the network, by identifying them with a marker, was opposed by Stanislav on the grounds that measurement packets had to be hidden to avoid cheating. However, there was some recognition that not all uses of these metrics are in environments where cheating is an issue, and that in those cases it is advantageous to be able to detect probe packets at intermediate points; this consideration suggests that encryption be optional rather than required. It was agreed to further discuss TLV on the list. An issue was raised about the protocol's statement that connections should be accepted even if the implementation could not set a DSCP value: the questioner believed that if two test streams were established, one with DSCP set and one without, the two streams would in fact be indistinguishable. Stanislav explained that the two streams would use separate TCP or UDP ports, and therefore be distinguishable.
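For background on the timestamp format at issue: an RFC 1305 (NTP) timestamp is a 64-bit fixed-point value, with 32 bits of whole seconds since 1 January 1900 and 32 bits of binary fraction. A minimal conversion sketch (function names assumed, non-negative Unix times only):

```python
NTP_EPOCH_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01

def to_ntp(unix_time):
    # Pack a Unix time into the 64-bit RFC 1305 layout:
    # high 32 bits = seconds since 1900, low 32 bits = fraction.
    secs = int(unix_time) + NTP_EPOCH_OFFSET
    frac = int((unix_time - int(unix_time)) * (1 << 32))
    return (secs << 32) | frac

def from_ntp(ts):
    secs = (ts >> 32) - NTP_EPOCH_OFFSET
    frac = (ts & 0xFFFFFFFF) / (1 << 32)
    return secs + frac
```

The 32-bit fraction nominally resolves to about 230 picoseconds, far finer than real clocks; that gap is one motivation for coding the actual precision alongside each timestamp value, as described in the preceding section.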

7. An implementation of one-way delay -- Kaynan Hedayat

Kaynan Hedayat presented the Brix architecture, a one-way delay measurement method and hardware timestamping technique. He said they used three different protocols between servers and agents: one using HTTP between server and agents, another to start and stop tests, and a third carrying test data in UDP packets. The latter two correspond to the proposed OWDP-Control and OWDP-Test protocols. He said it was important to eliminate machine time from measurements of transit time, so that only the precise network transit time is reported; in Brix, hardware sniffs the packets and registers the timestamp. He wanted to see support for emulation of applications like VoIP, and asserted that VoIP vendors want periodic traffic stream testing. Matt Z asked if the periodic streams draft required any changes to support the desired traffic; no problems were noted. Kaynan also suggested that the control protocol support requests for simultaneous measurements in both directions between two measurement points; an audience member pointed out that one could just set up one measurement in each direction.

In the discussion Henk pointed out evidence that machine time (the "A" and "C" in the presentation time diagram) was negligible and could be compensated for. Henk referred to a paper by Stephen Donnelly.

Passive Calibration of an Active Measurement System
Stephen Donnelly+, Ian Graham+, Rene Wilhelm*
+ The University of Waikato, * RIPE NCC

To be published in:
Proceedings of the PAM2001 workshop
Amsterdam, April 2001.

URL will be http://www.ripe.net/pam2001/Proceedings

8. Input for a protocol requirements document -- Merike Kaeo

Merike led a discussion on the requirements document and said the first draft would be posted on the list in a few weeks.

* Bob Cole suggested that RMON might serve as an example of how to separate control and test protocols; we might decide that there is a slightly different separation of control and data messages that would be more desirable.

* Encryption was not required for measurements performed on the same network and should not be mandatory.

* In response to a question, Stanislav confirmed that the control protocol retrieves all data.

* The possibility of having multiple control protocols will be looked into.

* We might agree on a standard more quickly if we do not define our own control protocol; in addition we could take advantage of existing solutions to hard problems (such as secure control of remote devices).

There was a discussion as to the urgency of the delay protocol. It appeared that the requirements document was being hurried in order to get back to protocol design. Merike agreed that we need carefully designed requirements, and explained that the short timeframe for the first draft reflects the discussion that has already occurred; the urgency should not compromise the quality of the work. The goal is simply not to let the document lag. In addition, there were comments from the floor that today one cannot buy interoperable solutions, and network providers need solutions now. One person questioned the urgency, given that few people raised their hands when asked if they would implement the protocol. Another comment stated that if there were an RFC, there would be more implementers.

9. Charter milestone revisions -- Matt Zekauskas

See the slides for the complete list.

Matt Z noted the need for implementation reports to advance the established IPPM metrics further along the standards track, since they have been RFCs for almost two years now. He said it might be useful to develop a draft on the rigorous comparison and validation of metrics among implementations, in addition to the comparisons in the metrics advancement draft. Is it the right time for a BCP on delay and loss? Experience with commercial solutions is just starting; is there enough existing experience to warrant a BCP? Merike volunteered to serve as editor if enough contributors volunteered experience. Matt also questioned whether it was time to do a comparison of the ITU and IPPM metrics; Al Morton stated that we should wait until "the targets stop moving" -- the metrics are still being developed. Everything listed in the milestones was within the existing charter. It was also pointed out that the further work on the OWDP required the working group to be re-chartered; the re-charter was not noted as a milestone.


Slides

Advancement of metrics specifications on the IETF Standards Track
Status of the IPDV draft
Network Measurements with Periodic and Poisson Sampling: Preliminary Results
Bulk Transfer Capacity Status
A One-way Delay Measurement Protocol
An Overview of Brix Network’s One Way Delay Performance Test