This session was chaired by Henk Uijterwaal and Matt Zekauskas. Tom Taylor and Al Morton scribed the meeting, and their notes were edited into these minutes by the Chairs.
(see slides)
Matt presented the changes to the document, which completed the addition
of references to metrics and applied all the changes discussed during and
after the last meeting (see slides).
There are no other known issues, and the document is ready for WGLC
(perhaps after a minor revision; the percentile calculation in the sample
code still has to be double-checked, but that could be done during or after last call).
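The draft's sample code is not reproduced in these minutes. For reference, a minimal sketch of the usual empirical-percentile definition (the smallest observed value such that at least a fraction p of the samples fall at or below it); this is an illustration, not the draft's sample code:

```python
import math

def empirical_percentile(samples, p):
    """Return the p-th empirical percentile: the smallest observed value x
    such that at least a fraction p of the samples are <= x (0 < p <= 1).
    Illustrative only; not the sample code from the draft under review."""
    if not samples or not (0.0 < p <= 1.0):
        raise ValueError("need non-empty samples and 0 < p <= 1")
    ordered = sorted(samples)
    # At least ceil(p * n) samples must lie at or below the percentile value.
    rank = math.ceil(p * len(ordered))
    return ordered[rank - 1]
```

For example, the 0.5 percentile of [10, 20, 30, 40] under this definition is 20, since two of the four samples lie at or below it.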
Al Morton commented that, based on recent IESG feedback on the TWAMP draft,
the title should reflect the short-term nature of the metrics (which
would also further differentiate it from the (currently) individual draft).
Henk indicated that WGLC would occur around the September time frame.
Al discussed his separate document on reporting, which he views
as complementary to the group draft. He talked about status and changes
(see slides). The idea is to finish the working group draft, and then
make this a working group item. Al is looking for people to involve
themselves in the work on this draft.
Henk stated that we would look at it as a WG item after finishing the
current group draft. A sense of the room was taken: 5-6 people were in
favour of making it a WG item once the short-term draft is done, and none were opposed.
Henk will confirm on list.
David McDysan asked how the long-term reporting work relates to availability work
in progress at the ITU. Al said that most of the draft's focus is
on traffic parameters. He has not tackled availability yet; further,
IPPM defines connectivity, not availability. Al is interested if
others have suggestions here. David related that he thought the main
thing was to make sure the definitions in the two groups were aligned.
The Framework draft has completed last call.
Al gave a brief overview of the composition draft, the problems
addressed, and the updates for this revision (see slides).
In particular, Yaakov's comments are now incorporated in the
composition document.
There is one more minor revision planned, and then Al believes
the document is ready for WGLC. However, more readers would be
useful; please read and comment.
No additional comments from the room; Henk agreed with the plan.
Al began with the session control draft. The individual session control feature
gives more flexibility in starting sessions, although it loses global start control.
There is no longer a wait for the server to assign a sessionID. The commands
are assigned an index number. This allows pipelining of commands with
correlation of responses. The authors feel that this draft is now done.
There were no other comments from the room; the authors will ask that the
document go to last call.
Al then turned to the reflection draft. There is currently an asymmetry of
packet sizes between the sent and reflected packets. For round-trip
measurements, same-size packets are useful. Al then went through
a detailed review of options to provide same-size packets for reflection
(see slides). The authors' current view is that the sending header should
be changed so there is no need to move bytes around in the packet at the
reflector -- the "simple Sender format".
There was a hum of those in favor of this direction, with a moderate response.
There were none opposed. Henk asked that Al request those that reviewed
TWAMP review this. Al agreed modulo those folks that have moved on.
Comments on all aspects are appreciated. The authors are fully satisfied
with the current content.
(See slides.) Gerhard started with a review of what the proposal is
trying to do, noting how it fits with RFC 2330. RFC 2330 has the
notion of a goodness-of-fit test, rather than calibration, for sampling
processes. If you compare histograms, the null hypothesis is that they
come from the same distribution. The Anderson-Darling empirical distribution
function (EDF) two-sample test is used. This draft also has a notion
of how to run measurements of metrics over the same path, running through
some sort of tunnel. Finally, Gerhard presented an
example given some data he had available to him.
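For readers unfamiliar with the test, a sketch of the two-sample Anderson-Darling statistic in its Scholz-Stephens form for tie-free data; this is an illustration of the technique the draft builds on, not code from the draft itself:

```python
def anderson_darling_2sample(x, y):
    """Two-sample Anderson-Darling statistic A^2 (Scholz-Stephens form),
    assuming no ties in the pooled data. Larger values indicate the two
    empirical distributions differ more. Illustrative sketch only."""
    pooled = sorted(list(x) + list(y))
    n, m = len(x), len(y)
    N = n + m
    total = 0.0
    for sample, size in ((x, n), (y, m)):
        s = sorted(sample)
        inner = 0.0
        for j in range(1, N):  # j-th order statistic of the pooled data
            z = pooled[j - 1]
            # M = number of this sample's observations <= z
            M = sum(1 for v in s if v <= z)
            inner += (N * M - j * size) ** 2 / (j * (N - j))
        total += inner / size
    return total / N
```

As a sanity check, two well-separated samples (e.g. [1,2,3,4] versus [5,6,7,8]) yield a much larger statistic than two interleaved samples drawn over the same range, matching the intuition that the null hypothesis of a common distribution is rejected when the statistic is large.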
For valid comparison, you need knowledge about network conditions.
If you don't know typical variations, it is really difficult to compare,
but measurements should tell you when conditions are invalid.
Nick Duffield noted that the question of trying to have comparable conditions
is an interesting one, and that he could envision two scenarios.
Nick thought another potential problem is that these significance tests
assume two different sets of samples that are independent. If you take
the same measurements, measuring the same variables, there
may be correlations between the two samples, which might lead standard
tests to be overly optimistic. There seems to be a tradeoff:
the same conditions versus introduced correlations.
David McDysan noted the potential for sample correlation in the example because
of path overlap.
Henk asked whether this draft is of interest to the group. Al Morton
thought the emphasis on comparison of measured results is necessary
but not sufficient. The test comparisons could be simplified, perhaps
by adding conditions (e.g., a fully common path rather
than a partially common one); see the next presentation. Another person
in the room felt that empirical results of this sort were
important.
(See Slides.)
Al wanted to join the editorial team, but did not have a chance to
participate until June. He started writing a very long email, which became
long enough that he thought he should just post it as a draft instead.
He believes this is a complementary solution: focus on the metrics
themselves, since implementations of metrics are not required to be
interoperable.
On slide 3, he notes that comparing implementations to check out
proposed measurements has its problems: one has to separate variability
due to implementations from other sources of variability.
Al argues that much testing has to be done under controlled conditions in the lab.
Lars noted that there is no requirement to show interoperability over the open
Internet. Al notes that IETF is not in certification business.
The key point is to work clause by clause through the RFC in making the comparisons.
Basically, the intent is to make most comparisons simple rather than statistical.
Looking for expressions of interest or otherwise.
David McDysan said he could see its usefulness. Matt Zekauskas found the lab
tests and the idea of rigor in examining options appealing, but felt
that this seems to test the implementations more than the
specification. Lars thought that we were not actually evaluating the
accuracy and usefulness of the metric with this process, drawing an
analogy with evaluating the accuracy/usefulness of a protocol. Henk
thought this should be done in parallel with testing in networks. It could
test corner cases.
Henk directed discussion to the list.
Nick presented the current revision of the burst loss metrics,
along with some history and motivation (see slides). The main
contribution is to standardize a methodology. Bi-packet probing is
used to estimate burst duration. Nick is looking for readers,
comments, and adoption as WG item.
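To illustrate the intuition behind bi-packet probing (this is not the draft's methodology, just a toy example), consider a simulation under an assumed two-state Gilbert loss model: when loss is bursty, the conditional loss ratio of the second packet of a back-to-back pair, given the first was lost, substantially exceeds the overall loss ratio:

```python
import random

def simulate_bipacket_probe(p_enter, p_stay, pairs, seed=0):
    """Toy two-state (Gilbert) loss simulation. The channel enters the
    'loss' state with probability p_enter and stays in it with probability
    p_stay. Each probe is a back-to-back packet pair. Returns (overall
    loss ratio, conditional loss ratio of the second packet given the
    first was lost). Illustrative assumption, not the draft's method."""
    rng = random.Random(seed)
    lost_state = False
    losses = first_lost = both_lost = 0
    for _ in range(pairs):
        outcomes = []
        for _ in range(2):
            lost_state = rng.random() < (p_stay if lost_state else p_enter)
            outcomes.append(lost_state)
            losses += lost_state
        if outcomes[0]:
            first_lost += 1
            both_lost += outcomes[1]
    overall = losses / (2 * pairs)
    conditional = both_lost / first_lost if first_lost else 0.0
    return overall, conditional
```

With bursty parameters (p_stay much larger than p_enter), the conditional ratio tracks p_stay, which is the information about burst duration that single-packet probes cannot see.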
The Chairs noted the IPR disclosure (see https://datatracker.ietf.org/ipr/1126/). There were no opinions from the group
either way on whether to adopt it as a WG draft. There was a question
as to whether the WG could see the patent application.
Lars noted that there is no requirement in IETF rules to provide that.
Lars also noted that if the WG decides to take this forward, it is not like a protocol where implementation is mandatory; it is just one tool.
We are looking for volunteers to read draft. Matt noted the Chairs might
press for some reviews.
(See slides.) Rock presented some potential measurements to supply
performance data for "next generation mobile network" backhaul. However,
they were mainly passive, and some were based on packet counting.
Al Morton objected to the measurement strategy. Going back to slide 2,
he noted that comparing packet counts at two different points does not
work in connectionless networks -- you don't know if packets are supposed
to appear at both points. Round-trip delay can already be measured with
ICMP or other tools. Round-trip delay variation and loss are not very
useful, and Al recommended they not be pursued. Rock said that more details
could be made available; Matt felt that we couldn't judge the usefulness
just from these slides. Al disagreed, and thought the measurement premise
was flawed. Lars thought that the specific methods here were not acceptable.
These measurements also led to a discussion about whether passive measurements
are acceptable within IPPM. The charter talks about applying the active
measurements in a passive mode, not inventing new ones. Work in this area
would require specific AD approval to add a milestone, and some new work
might require a charter revision. Therefore, there needs to be a good definition
and justification. Passive measurements such as those discussed here
are not something that would currently fit.
Al recommended the authors look at ITU recommendation Y.1540, which is agnostic to
active or passive measurements.
3. Reporting draft
3.1 Group draft (Swany/Shalunov, 10')
draft-ietf-ippm-reporting-04.txt
--Matt Zekauskas presenting
3.2 Alternative view
draft-morton-ippm-reporting-metrics-07.txt
--Al Morton
4. Composition Drafts
draft-ietf-ippm-framework-compagg-08.txt
draft-ietf-ippm-spatial-composition-09.txt
---Al Morton
5. TWAMP features
draft-ietf-ippm-twamp-reflect-octets-02.txt
draft-ietf-ippm-twamp-session-cntrl-01.txt
---Al Morton
(See slides)
6. Advancing Metrics along the standards track
6.1 Editorial team progress
draft-geib-ippm-metrictest-00.txt
--Gerhard Hasslinger
If possible, he thought that they should be done under same conditions.
Thus, in parallel at the same time would be favorable.
6.2 Thoughts on advancing metrics
draft-morton-ippm-advance-metrics-00.txt
--Al Morton
7. Burst Loss Draft
draft-duffield-ippm-burst-loss-metrics-01.txt
--Nick Duffield
8. AOB
Backhaul Network Performance
--Rock Xie