IP Performance Metrics WG (ippm)
Tuesday, July 24, 09:00--11:30 CDT

This session was chaired by Henk Uijterwaal and Matt Zekauskas. Al Morton acted as
scribe, and his notes (along with notes taken by Matt) were edited into these
minutes by the Chairs.

AGENDA:

1. Administrivia
2. TWAMP updates, and next steps
3. Duplication Draft Update
4. Composition Framework and Spatial Composition Additions
5. Multimetrics Draft Developments
6. Delay Variation Draft Developments
7. Any Other Business

1. Administrivia
   Agenda bashing, Scribe, Minutes, Blue Sheets
   Status of drafts and milestones

Henk opened the meeting.  There were no changes to the agenda.

During the discussion of drafts and milestones, Henk mentioned that we are still
sitting on the implementation draft awaiting guidance on progressing metrics along
the standards track. While initially the thought was we should just publish the
document as informational, Lars skimmed draft-bradner-metricstest-02.txt, and it
said metrics would progress when test equipment from different vendors had results
in a controlled experiment that were verifiably equivalent. However, the definition
of "verifiably equivalent" is still open, and the results of that definition might
imply that changes need to be made to the draft.  Therefore we will let the draft
sit, and encourage this group (and bmwg) to contribute to the definition of
"verifiably equivalent".

Henk asked if Al had any idea of realistic dates for his composition drafts, since
the milestones for those drafts had passed.  Al thought that the framework was in
good shape, but didn't want to progress it before one of the composition drafts
was ready.  The big barrier is that it needs more review by working group members. 
We need some rigorous reviews. December 2007 is probably too soon, mid-2008 is a
more realistic target. Folks in the group are requested to read and comment on the
mailing list.

Emile said that the multimetrics draft is more-or-less complete and has been
stable.  It too needs review, and he thought a WGLC would be the means to coerce
folks to review.  However, he has been speaking with Jurgen Quittek, and Jurgen
has a series of comments he is going to send in.  Therefore, he requests that the
working group review the document after the next revision. Al offered to help with
readability going into the next version as well.  The big issue with the draft
is the passive aspects, which were discussed during the draft discussion below.

2. TWAMP updates, and next steps
   --Kaynam Hedayat

   http://tools.ietf.org/id/draft-ietf-ippm-twamp-04.txt
   http://tools.ietf.org/wg/ippm/draft-ietf-ippm-twamp/

Kaynam presented a single slide on TWAMP status.  At the last meeting they agreed
to add sender TTL into test packets, as well as clarifying security text with
respect to TWAMP-Light, and both have been done.  In addition, text has been
clarified for sender port and receiver port.

Someone on the mailing list (ML) brought up setting DSCP.  What does the group feel
we should do?  Matt asked if Kaynam had seen a use for it, and he said he did, but
more for one-way than two-way tests.  It seems like a fine idea, but should we make
such a change now?  No one felt strongly that we should do so.  Therefore, no
further changes for DSCP are proposed.  This question was also asked on the list
before the meeting, with no response.

There were additional clarifications that were suggested on the ML and a new
revision is in the works to address those.

Matt mentioned that he was supposed to help the authors come up with some IANA
text for OWAMP-Control commands, since this document adds a new value.  He said he
would do that by the end of the week.

3. Duplication Draft Update
   --Henk Uijterwaal

   http://tools.ietf.org/id/draft-ietf-ippm-duplicate-01.txt
   http://tools.ietf.org/wg/ippm/draft-ietf-ippm-duplicate/

Henk next presented updates on the duplication draft.  Based on previous comments,
a new version was created in April.  There are two statistics, and two ways to
express duplication.  This draft spun off from the original reporting draft, which
wanted to reference duplication but there was no definition.

There is a placeholder Y.1540 section.  However, references have been made along
the way, so Henk does not see the need to keep the section.  Al agreed, although
he also stated that it might not be a bad idea to see where we could harmonize the
definitions, and make the differences go away (Y.1540 is currently under revision;
see the final topic).

Al had reviewed the draft and made a few comments. He thought the names for some
metrics might be changed to be more descriptive. For example,
"type-P-one-way-packet-duplication-average" was new in this version and didn't
seem to capture the meaning of the metric.

Al also noted that the real singleton here is not "duplication", but an arrival
counter.  It starts at 0, and goes to 1, 2, 3, etc, for each packet.  Including
"duplication" in the name seems like forming a conclusion that we haven't reached
yet.  We should evaluate the arrival counter and see if a packet arrived only once
or more than once. Only those that arrive more than once should be called
duplicates. Al offered to send his comments to the list.  They may be found here:
http://www1.ietf.org/mail-archive/ippm/current/msg01174.html
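
As a rough illustration of the arrival-counter idea, the sketch below counts
arrivals per sent packet first and only afterwards decides which packets were
duplicated.  The function names, the use of sequence numbers as packet
identifiers, and the sample data are assumptions made for illustration; they are
not taken from the draft.

```python
from collections import Counter

def arrival_counts(sent_ids, received_ids):
    """Return {packet_id: number of arrivals} for every packet that was sent."""
    seen = Counter(received_ids)
    # Packets that never arrived keep a counter of 0.
    return {pid: seen.get(pid, 0) for pid in sent_ids}

def duplicated(counts):
    """Packets whose arrival counter exceeded 1, i.e. the duplicates."""
    return sorted(pid for pid, n in counts.items() if n > 1)

# Hypothetical sample: packet 2 is lost, packet 3 arrives three times.
counts = arrival_counts([1, 2, 3, 4], [1, 3, 3, 4, 3])
print(counts)              # {1: 1, 2: 0, 3: 3, 4: 1}
print(duplicated(counts))  # [3]
```

Keeping the counter as the singleton means that loss (counter 0), normal arrival
(counter 1) and duplication (counter above 1) all fall out of the same
measurement.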

4. Composition Framework and Spatial Composition Additions
   -- Al Morton

   http://tools.ietf.org/id/draft-ietf-ippm-framework-compagg-04.txt
   http://tools.ietf.org/wg/ippm/draft-ietf-ippm-framework-compagg/
   http://tools.ietf.org/id/draft-ietf-ippm-spatial-composition-04.txt
   http://tools.ietf.org/wg/ippm/draft-ietf-ippm-spatial-composition/

Al started by reviewing the composition framework.  Overall, it's been fairly
stable; the definitions have been refined, and a maintenance update has been
performed.  However, there has not been much feedback recently.  Al said he is
declaring this stable, pending feedback or the completion of one of the composition
drafts.

Al noted the Metro Ethernet Forum is specifying some end-to-end performance
parameters, and they are going to need a solution to this problem as well.

Jurgen Quittek and Emile Stephan offered to read the draft.

Al then did an overview of the spatial composition draft, and
the metrics that are defined there.  The metric names have been
simplified, allowing for easier exposition.

Scott Bradner and Loki Jorgenson offered to read this draft.

Matt admitted he had not read the draft recently, but the description triggered a
thought -- are there auxiliary metrics or parameters to specify when a
composition is valid?  For example, min composition works as long as the path is
relatively stable and uncongested. If the path is congested, the min will jump
around.  Al thought that it might just mean the "error bars" are larger, but you
can still get an estimate of the minimum.

Craig White remarked: Given a large enough sample size, the minimum will tend to
be the one that occurs the most frequently, as long as the path is relatively
stable.  Even if the path is not stable, you do get some bimodal distributions
(for example, if a SONET ring goes to its protect side, and then back).

Al said that bimodal distributions are dealt with in the document. You don't want
to mix them together because the minimum for one is different than the other.  The
intent is to measure for a long time and, if you see some path changes, to
segregate the measurements into different intervals.
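
One way to picture this segregation is the sketch below, which assumes the times
of the path changes are already known and computes a separate minimum delay for
each interval between them.  The function name, the (timestamp, delay) sample
format and the numbers are hypothetical and are not taken from the draft.

```python
from bisect import bisect_right

def minimum_per_interval(samples, change_times):
    """samples: list of (timestamp, delay); change_times: sorted path-change times.

    Returns {interval_index: minimum delay} for every interval that contains
    at least one sample (index 0 is the interval before the first change).
    """
    minima = {}
    for t, delay in samples:
        idx = bisect_right(change_times, t)  # number of changes at or before t
        minima[idx] = min(delay, minima.get(idx, delay))
    return minima

# Hypothetical data: the path changes at t=10 and the delay floor moves
# from about 20 ms to about 35 ms.
samples = [(1, 21.0), (4, 20.2), (9, 23.5), (11, 36.1), (15, 35.0), (18, 40.3)]
print(minimum_per_interval(samples, [10]))  # {0: 20.2, 1: 35.0}
```

Taking a single minimum over all six samples would report 20.2 ms for the whole
period, which says nothing about the second mode.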

Emile said that in his opinion you can't consider making an aggregation that
consists of measurements where the path changes -- the result would be unusable. 
Perhaps we need to say that in the framework or in the specific metrics.

Al said that path stability needs to be mentioned when we talk about measurement
validity; he thought it belonged in an overall section since it applies to all
parameters.

5. Multimetrics Draft Developments
   -- Emile Stephan

   http://tools.ietf.org/id/draft-ietf-ippm-multimetrics-04.txt
   http://tools.ietf.org/wg/ippm/draft-ietf-ippm-multimetrics/

Emile started with a quick review of the multimetrics draft, starting with the
spatial metrics, and then multicast metrics.

For the spatial metrics, there has been some harmonization and discussion of
scalability aspects.  These metrics are used for either path composition or by the
end user.  Emile felt that these descriptions are now stable, and he didn't plan
to make more changes except to clean up wording.

The segment metrics have been renamed so that it is clear that they look at
performance between any two hops on the network. They are purely passive.

One participant had an issue with purely passive measurements, especially those
based only on user packets.  Did we want to discuss this now or later?

Jurgen said that he had read the document, and thinks we run into problems if we
focus on passive measurements.  A lot of the terminology has been inherited from
other drafts, and doesn't apply to passive measurements.  If there are active
probes, there is one sender and one receiver.  If you passively examine user
traffic, there are many senders and receivers that pass by.  Emile thought that
each flow had one sender and one receiver.  There is some time when the packet was
sent, even if it is not measurable.  You want to know the performance between two
points, you don't need to know the original sending time, you just need the two
points. The terminology could be clarified.

This would also provide psamp with some purely passive metrics; the section could
be removed if it doesn't make sense.

Emile thought that 80% of the wording is OK, and we need strong input, such as
that Jurgen is giving.  He would like a WGLC to force people to read the document.

Al said that he wanted to make a pass over the document first, to improve grammar. 
He thought the document could benefit from an editorial pass to make it more
readable, and as a co-author felt that he was responsible for that.  After that,
he agreed a WGLC would be a good idea.

Al also wanted to know the group's point of view with respect to purely passive
metrics.  There was some discussion among the authors.  The IPPM charter says that
the "metrics will not characterize traffic", and thus would not be used to
characterize VOIP traffic, for example.  He reads this as passive is out of bounds
for the working group.  However, other people are entitled to give interpretations
of that.

Emile said that it is not easy to mix passive and active metrics, but that we
needed some ground truth.  If we want to compose metrics, we will need to make
approximations, and we need to include passive metrics in the algorithms.  The
spatial definition relies on passive techniques in the nodes the packet crosses.

If there is an active source that provides packets that can be observed along the
path, that was still an active measurement. The discussion is about qualifying the
stream you are using for the basis of measurement.

One opinion was that we haven't done this yet, and that there is enough material
to create a separate draft on that topic.

Lars (our area director) agreed that the charter text has an underlying notion
that the metrics are based on active sampling.  The working group could of course
decide to take on passive measurements. The previous presentation goes in that
direction.  But he thought it was out of scope in the current charter, and he
would like to see us finish the metrics and related documents on active techniques
first. He thought passive measurements were a pretty large piece of new work -- in
short, he agreed with Al.

Emile reiterated his opinion that ignoring passive metrics is a mistake, and that
the best way to have some "ground-truth" measurements is to have some passive
measurements to compare to.

Alan Clark said he agreed with both positions, in that if you look at composition
and define it using active measurements, the basic metrics are equally applicable
to passive techniques.  Work in composing delay, delay variation, and others, can
be picked up and used with passive measurements.  He felt that once the basic
ideas have been developed, that the group should look at passive applications.
However, it is difficult to make passive measurements of an IP stream, because you
have to decode protocols, and that runs the risk of overlapping with other groups.

Al Morton thought that was an interesting point; that the working group has
focused on IP, dabbled a bit in TCP, but has focused on active live measurements
of the IP layer of the stack, and that's our core charter.  If you work on passive
metrics, and start to look at things like sequence numbers in higher layers, that
definitely goes beyond our charter.

Al also wanted to note that there are other gaps, places where people want to grow
the group beyond the charter, both here and in BMWG. Interested folks should
follow the "application performance metrics" BOF developments.

Jurgen said that he agreed with Emile, that it would be good to have a methodology
to apply.  However, if the working group wanted to pursue this, it would take more
than just a paragraph; there are many things to consider.  There should be a
framework for how we do passive measurements in general, before we apply them to
any specific document.

Emile felt we had to fix this before going further in composition, and that
something was needed now to promote comparable metrics in this space.  He also
agreed that squeezing it in as a subsection of a draft isn't the right place for
this to be.

There was a comment that at least with the active measurements, we would be
combining metrics where we understand the characteristics; if we combine passive
measurements, there may be less we can rely on to combine them meaningfully.

Emile noted that companies and people will do this combination anyway, so we
should be working to define standard metrics.

There was a bit of further discussion on how large this effort might be and where
the right place to do it would be; there was no definitive conclusion.

Next, Karsten Fleischhauer presented a multimetrics use case provided by Ruediger
Geib, who could not attend this meeting.  The goal in this case is to improve
multicast monitoring within a provider's network.  It would give an operator
increased understanding of what was happening with multicast, and an easy view of
where failures occur.

Several simple multicast trees and failures were presented, along with showing
how the metrics, especially the segment metrics, would improve understanding. 
(See slides for details.)

Emile pointed out that the current draft does not mix segment metrics and
multicast metrics, which these examples seem to do.  Karsten said that his
understanding is that there are no requests to change the draft, the example is
just showing how the metrics can be used together.


6. Delay Variation Draft Developments
   -- Al Morton

   http://tools.ietf.org/wg/ippm/draft-morton-ippm-delay-var-as-03.txt

Next Al Morton presented the delay variation applicability statement draft, which
is not yet a working group item, but has had extensive review in the group.  He
spent some time presenting the problem space and showing how the different metric
formulations are useful for different problems.  See the slides for details.

There was a recent section that made suggestions on what could be done to reduce
variation.  This provoked a discussion of whether the section was appropriate for
the draft.  Scott Bradner mentioned that there was a different set of expertise
involved in fixing a problem than how to describe and measure the problem clearly. 
We would need to reach out to operators in different environments including
enterprise and ISP for validation.  He thought this section was better in a
different document, or even a Wiki to keep advice up to date.

Al noted that Roman Krzanowski, who first asked for this work, worked for an
operator, and that Al himself was "tier-5 support" at his company.

Wenji Wu asked about applicability of this work, combining it with loss to
indicate congestion.  Al said that the draft was not trying to combine delay
variation with packet loss in this analysis; loss affects the delay variation
metric, however.  In the extreme case of losing every other packet you cannot
produce an inter-packet delay variation result.
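
The extreme case can be seen in a small sketch.  It assumes the
consecutive-packet formulation of inter-packet delay variation (the difference
between the one-way delays of adjacent packets in sending order) and marks lost
packets with None; the function name and the data are illustrative only.

```python
def ipdv(delays):
    """Inter-packet delay variation for consecutive packets in sending order.

    delays: one-way delay per packet, or None if the packet was lost.
    A pair yields a result only when both of its packets arrived.
    """
    return [
        None if a is None or b is None else b - a
        for a, b in zip(delays, delays[1:])
    ]

no_loss = [10, 12, 11, 15]
every_other_lost = [10, None, 11, None, 15]
print(ipdv(no_loss))           # [2, -1, 4]
print(ipdv(every_other_lost))  # [None, None, None, None]
```

With every other packet lost, no consecutive pair has two arrivals, so every
result is undefined even though three packets were received.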

Wenji said that if you have the minimum delay, and an indication of the maximum
delay, then you have an indication of the minimum latency from source to
destination, and an indication of queueing. Have there been thoughts to
standardize this application? Al said that the draft includes an estimate of queue
occupation; in other words, this is a task that is already identified.  Al asked
that Wenji read the draft and comment.

Lars made the comment that there are transport protocols that are trying to infer
congestion by looking at both delay and loss changes. They are experimental. 
There are other causes for loss than congestion, and other causes for delay than
congestion, for example a wireless link changing its rate.  He said that there is
ongoing research here but there is nothing baked enough to become a metric.

Al noted that one thing the draft could talk about with respect to minimum delay
is that it is an indication of path changes and could be used to separate a
bimodal distribution.  If we can detect that a path has changed positively by TTL
changes, and correlate that with bursts of loss and delay change (and possibly by
reordering caused by "microloops" during a change) you may be able to reset the
minimum delay at that point.  If path changes are frequent, delay variation is
probably not your biggest problem.

Resetting the minimum at appropriate points would be good for long-term
measurements.
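
A minimal sketch of such a reset follows.  As a simplification it treats any
change in the received TTL as a path change, and leaves out the correlation
with loss, delay change and reordering mentioned above.  The function name and
the (ttl, delay) input format are assumptions made for illustration.

```python
def delay_variation_with_reset(packets):
    """packets: iterable of (ttl, delay) in arrival order.

    Yields delay minus the running minimum, restarting the minimum
    whenever the received TTL differs from the previous packet's TTL.
    """
    last_ttl = None
    current_min = None
    for ttl, delay in packets:
        if ttl != last_ttl:
            current_min = delay  # treat a TTL change as a path change
            last_ttl = ttl
        else:
            current_min = min(current_min, delay)
        yield delay - current_min

# Hypothetical stream: the TTL drops from 60 to 58 when the path gets longer.
stream = [(60, 20), (60, 22), (60, 21), (58, 35), (58, 34), (58, 38)]
print(list(delay_variation_with_reset(stream)))  # [0, 2, 1, 0, 0, 4]
```

Without the reset, the packets on the longer path would be reported with a
variation of 14 to 18 relative to the old minimum of 20.  Note that this sketch
uses the minimum seen so far, so early packets in an interval are measured
against a minimum that may still fall.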

Alan said that at the last meeting he raised a possible case where inter-packet
delay variation might be useful, and that is the packet video buffer smoothing
problem.  Video packets have timestamps in them, but the rate is highly variable
due to big complete frames being mixed with packets indicating changes since the
last complete frame.  There is a smoothing buffer to try and even things out. You
need to take fragmentation and MTU size into account in this case.  He's thought
about it a bit, but has no conclusion yet.

Alan also noted another interesting problem, not specific to video: the case where
the bandwidth of the test signal, whatever it may be, is sufficient to cause some
delay variation itself.  Can you separate this variation induced by the test signal
(due to bandwidth limits in the network) from variation due to other sources in the
network?  It doesn't matter what the absolute value of variation is in this case.
He has been doing some simulation that might apply, with scene characteristics of
video streams.

In summary, Al said that he'll go through the comparisons again, and felt he was
still trying to achieve consensus, but that he has not seen much resistance to the
basic results.

Speaking to slide 9, Alan Clark made the observation that rather than calling
these guidelines in the document, we should call them measurement considerations;
to him guidelines are directing test vendors to do something rather than giving
them advice about things to watch out for.  Packet Delay Variation is one of the
most difficult things to measure and understand properly, so he thinks a section
like this is extremely useful.  Scott Bradner noted that his earlier reaction to
the guidance in the appendix did not apply here, measurement considerations are
good to state.

Al concluded noting that there had been significant support in the room, comments
from group members both here and on the mailing list and that this draft had
already been cited to satisfy an item in the charter, so it's time to make this
document a working group item.

Henk took a quick poll of the room; about 1/2 indicated support for the document. 
When asked who thought it was a bad idea to continue with this document, there was
no opposition.  Therefore Henk said the Chairs would verify the sense of the room
on the mailing list, and assuming no major change would work with the area
director to make this a working group draft.

A group member thought the document should be broadened from just focusing on
delay variation.  Al felt that it was better to stay focused in order to finish
the document and complete the milestone.  [Later note by MZ: we have had a couple
of other starts at applicability statements that did not have enough support to
follow through to completion; others are welcome to make suggestions, but we
should just focus on finishing the document in an area where there is a need and
support from the working group.]

7. Any Other Business

Henk mentioned a passive measurement draft that had been sent to the list:
draft-kikuchi-passive-measure-00.txt. No group members had read the draft; given
the earlier discussion about passive measurements and the upcoming "Application
Performance Metrics BOF", it seems that the draft is currently out of scope for
this working group but potentially in scope for the results of the BOF.  Henk will
send out a note to that effect.

Al Morton made a comment about his ITU-T liaison work.  Years ago, Al was put in
charge of the IP Performance question in SG 12. We have been working to reduce the
differences between the work done there and here, harmonizing the results as much
as possible.  The scope of SG 12 is different and wider -- it produces numerical
objectives based on performance parameters (from metrics); this group is focused
on the metrics.

One way to reduce the differences between the groups is to share information so
that each group can comment on the other group's documents. Following the model of
several other ITU-T study groups, SG 12 has created an easily accessible FTP site.
See: www1.ietf.org/mail-archive/web/ippm/current/msg1163.html and
www.itu.int/ITU-T/studygroups/com12/packet/index.html.

Scott Bradner noted that working documents are what are covered by the FTP site;
right now finished documents are available for free.  Brian Carpenter noted that
the plan to distribute finished documents for free was approved last January for
one year, so it is not yet clear what will happen in 2008.

SG 12 would specifically like feedback on two documents that are now in the
repository.  The first is a revised version of Y.1540, the IP performance metrics
document. This revision has more metrics, with reordering pulled into the
normative text, and new text on duplication and replication.  We should try to
harmonize on terminology if we can.  At the moment the group is grappling with the
problem of how to include a Markov model description of burst loss.  There are two
competing proposals of how to define states (like burst or gap, alternatively
stable or un-stable).

The current burst/gap proposal needs to evaluate packets in sending order and
requires an evaluation interval whose length has a dependence on the application of
interest.  An alternative proposal might be to use arrival order instead, so it
would include the effects of reordering.

The other document is a multicast metrics draft.  Within the ITU the scope is
expanded so that it covers the IGMP aspects of multicast, not just packet transfer
performance.  This is a complement to Y.1540. It goes "above and below" to look at
session establishment and teardown of IGMP.  It would be useful to get feedback
from IETF folks.

If one wanted to post a comment document, they could send it to Al Morton,
acmorton at att.com.  The IPPM chairs and the Transport AD also have the
ability to post to the site. If there is a document that is relevant to the whole
community we will post it to the web site.

Someone asked about the intellectual property rules of documents posted to that
space.  Al said he did not know, but would look into it.  Alan noted one of
the differences between the ITU and IETF: here people are encouraged to disclose
IPR as early as possible.  Within the ITU, the onus on members is to formally
offer to license IPR, if any exists, prior to any recommendation being approved.
There is less pressure to disclose early.

Scott Bradner replied that this is accurate, except that the IETF has no requirement
to say what the licensing is but the ITU does; everything else is exactly right.

The Application Performance Metrics (APM) BOF meets tomorrow afternoon at 1510.