Last Modified: 2005-02-18
|Done||Submit drafts of standard metrics for connectivity and treno-bulk-throughput.|
|Done||Submit a framework document describing terms and notions used in the IPPM effort, and the creation of metrics by the working group to IESG for publication as an Informational RFC.|
|Done||Submit documents on delay and loss to IESG for publication as Informational RFCs.|
|Done||Submit a document on connectivity to IESG for publication as an Informational RFC.|
|Done||Submit a document on bulk-throughput to IESG for publication as an Informational RFC.|
|Done||Submit draft on loss pattern sample metrics to the IESG for publication as an Informational RFC.|
|Done||Submit draft on metrics for periodic streams to the IESG for publication as a Proposed Standard RFC.|
|Done||Submit draft on IP delay variation to the IESG for publication as a Proposed Standard RFC.|
|Done||First draft for AS on one-way delay and loss.|
|Done||Submit draft on One-Way Active Measurement Protocol Requirements to the IESG for consideration as an Informational RFC.|
|Done||Create initial draft on a MIB for reporting IPPM metrics.|
|Done||Create initial draft on a packet reordering metric.|
|Done||Create draft on a One-Way Active Measurement Protocol that satisfies the requirements document.|
|Done||Submit draft on the One-Way Active Measurement Protocol to the IESG for consideration as a PS.|
|Oct 04||Submit draft on a packet reordering metric to the IESG for Proposed Standard.|
|Dec 04||Submit link bandwidth capacity definitions draft to the IESG, for consideration as an Informational RFC.|
|Dec 04||Collect implementation reports for RFCs 2678-2681|
|Jan 05||Create initial draft on the definitions of link bandwidth capacity.|
|Feb 05||Discuss rechartering or ending working group.|
|RFC2330||I||Framework for IP Performance Metrics|
|RFC2678||E||IPPM Metrics for Measuring Connectivity|
|RFC2679||PS||A One-way Delay Metric for IPPM|
|RFC2680||PS||A One-way Packet Loss Metric for IPPM|
|RFC2681||PS||A Round-trip Delay Metric for IPPM|
|RFC3148||I||A Framework for Defining Empirical Bulk Transfer Capacity Metrics|
|RFC3357||I||One-way Loss Pattern Sample Metrics|
|RFC3393||PS||IP Packet Delay Variation Metric for IPPM|
|RFC3432||PS||Network performance measurement for periodic streams|
|RFC3763||I||A One-way Active Measurement Protocol Requirements|
IP Performance Metrics (ippm) WG
Monday 7-March-2005 19:30-22:00
The meeting was chaired by Henk Uijterwaal and Matt Zekauskas. Al Morton took notes, which were edited into these minutes by the Chairs.
2. Al Morton: Reordering
3. Nischal Piratla: Reordering
4. Henk Uijterwaal: Implementation reports
5. Al Morton: ITU Performance Work Update
6. Phil Chimento: Bandwidth draft
7. Kaynam Hedayat: A Two-way Active Measurement Protocol (TWAMP)
8. Emile Stephan-Lei Liang: Multiparty metrics
9. Juergen Quittek for Saverio Niccolini: Traceroutes
1. Administrivia, Milestones
Henk opened the meeting, reviewing milestones and draft status. He did note that one of the ADs had told him yesterday that they didn't think the OWAMP spec met the requirements doc with respect to "unauthenticated" mode. Stanislav asked if there was more information. Henk and Matt proposed to re-read the requirements document, re-read the sections on unauthenticated mode in the draft, and work with the AD to propose changes. The proposed changes would then be given to the authors (and the WG) for review.
2. Al Morton: Reordering
Al Morton then reviewed the reordering metrics, and updates to the reordering document to address comments as a result of the second WG Last Call. We had good email comments from Vern Paxson, Mark Allman, and Phil Chimento, which have been incorporated into the document. (See slides for details.)
In section 3, when defining the singleton, there was an email suggestion from Phil to mention general mathematical requirements for order evaluation. This was done, but it is quickly limited to message sequence numbers. There were still some mentions of time and byte sequence in the text, which are now gone.
In section three there was also a question about whether duplicate packets are reordered. The definition keeps duplicate packets orthogonal to reordering. Scott Poretzky asked if that meant duplicates were reordered or not. Al replied that the first copy is used for evaluation and the second is ignored; you can note that a duplicate has occurred but it is not used in the metric. This is the result of previous discussion, and Al considers this issue closed (as agreed at IETF-61). There was no dissension from the room.
The "next expected" packet parameter still mentioned time and bytes, and these were removed. Mark emailed a comment that the source byte, and the payload size parameters could be optional. He also said that source time could be optional. But all the other metrics have a time, "T", parameter. For any singleton, that is the time the test applies to. So we left the source time as one of the mandatory parameters.
Mark Allman stated that he thinks it is fine to include the time. His only problem with the time was when we were using it to determine orderness.
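As a rough illustration of the singleton discussed above, the following Python sketch labels each arrival by comparing its sequence number with a "next expected" value, and sets duplicates aside so that only the first copy is evaluated. The function name, the label strings, and the use of a set to detect duplicates are assumptions made for this example, not text from the draft.

```python
def classify_arrivals(arrivals):
    """Label each arriving sequence number (illustrative sketch only).

    arrivals: source sequence numbers in order of arrival.
    Returns a list of (sequence_number, label) tuples.
    """
    next_exp = None   # "next expected" sequence number; unset until first arrival
    seen = set()      # sequence numbers already evaluated (assumed duplicate check)
    labels = []
    for seq in arrivals:
        if seq in seen:
            # Second and later copies are noted but not used in the metric.
            labels.append((seq, "duplicate"))
            continue
        seen.add(seq)
        if next_exp is None or seq >= next_exp:
            labels.append((seq, "in-order"))
            next_exp = seq + 1
        else:
            labels.append((seq, "reordered"))
    return labels


# Packet 3 arrives after packet 4, then arrives a second time.
print(classify_arrivals([1, 2, 4, 3, 3, 5]))
# A lost packet (3 never arrives) is not counted as reordering.
print(classify_arrivals([1, 2, 4, 5]))
```

In this sketch a gap caused by loss simply advances the next expected value, so loss on its own never produces a "reordered" label.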
Section 3.4 now talks about loss and reordering; please read and comment. Phil also emailed a comment that sequence discontinuity is a local phenomenon; we added material to discuss this, so please check it.
The plan is to make one more revision before LC again (at least remove a mistake introduced in the first example).
In section 4, in email Mark observed that section looked like it had been written by committee. The whole section has been revised so that it all has the same level of formality, and no parameter assignments/names are repeated. There are a lot of diffs, but little substantive change.
One substantive change: the late time (offset of a reordered packet) was originally keyed to the arrival index (i), but has now been redefined to be based on the source index s[i]. All parameters/metrics key to the original source index.
Another comment in Mark's mail was that the arrival sequence 1, 10, 5 (with other packets missing), late time wouldn't indicate how late packet 5 was, because there was no frame of reference. There is a concept in some of the ITU-T recs, single point delay variation, where you establish a time frame of reference for subsequent packet arrivals. Using that point of reference you know expected arrival times, and can calculate single point jitter measurements. We added text and mentioned these references.
Both Vern and Phil in email found the byte stream offset definition confusing. This section had not been modified for a long time, because we expected it might be replaced. Since it has not, it has been revised substantially. The definition looks at all earlier packets with higher sequence numbers, and those are the ones that would have to be stored in a buffer. Al felt this now effectively obviates the reordering buffer density (proposed elsewhere) as a reordering metric.
In the reordering free runs definition, the minimum run length was changed to zero. It makes more sense this way, and we think it is what Jon Bennett intended. Please look and comment.
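To make the zero-length case concrete, here is a minimal Python sketch of counting reordering-free runs. It is one reading of the idea, not the draft's definition: a run is taken to be the count of in-order packets between reordered packets, the input is assumed to contain no duplicates, and the function name is hypothetical.

```python
def reordering_free_runs(arrivals):
    """Return the lengths of in-order runs, split at each reordered packet.

    arrivals: source sequence numbers in order of arrival (no duplicates assumed).
    """
    runs = []
    next_exp = None
    run = 0
    for seq in arrivals:
        if next_exp is None or seq >= next_exp:
            next_exp = seq + 1
            run += 1
        else:
            # A reordered packet closes the current run, which may be empty.
            runs.append(run)
            run = 0
    runs.append(run)  # the run still open when the sample ends
    return runs


# Two reordered packets arrive back to back (1 then 2, both after 3),
# so the run between them has length zero.
print(reordering_free_runs([3, 1, 2, 4]))
```

Allowing a minimum run length of zero means adjacent reordered packets still produce a run entry, so the number of runs stays tied to the number of reordered packets.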
In email, Mark complained that n-reordering is still bogus, although less so (see slides for details). To address the issue, we took out the claim that n-reordering can predict packets "as good as lost" on its own, but it does tell you how you might set the duplicate acknowledgement threshold. We now require a BTC sending discipline, and added references to RFC 2581 and to RFC 2960 for SCTP. DCCP is mentioned by name, but with no RFC number.
See the slides for the rest of the changes.
Al has been working on this for three years, and the draft has been through two last calls. All changes can be seen at http://home.comcast.net/~acmacm/ .
Are we ready for another last call? Any further comments now?
Nischal Piratla: a few questions. First, in the measurement section, if you are expecting a given sequence, say 100 packets, and you apply n-reordering or even reordering extent, how do you report the reordering? A histogram? An array?
Al: we show some hints about reporting formats in every metric section. We say how they can be summarized when we talk about the sample metrics.
Nischal: for the reordering extent, if I remember correctly, you say it can be summarized by a histogram. If you use something like a histogram, and have a sequence of 100 packets, and have two packets with extent 2 and 3; and then another sequence of 1000 packets and it also has two packets with extent two and three, what's the difference between the two? Do you think normalizing these would be a good thing to do? And how far do you go, there is no threshold like RD has, where you can say beyond this you aren't going to represent it because it's not feasible to store it.
Al: there are two questions in there. All the questions about the difference between 2 reordered packets in 100 or 1000 packet stream are answered by giving the sample size. Other metrics capture that... reordering free runs would capture the length of time reordering free runs exist. It sounds like you are looking for more information in reporting aspects. If you have some suggestions, we are open.
Mark Allman: it sounds to me like you're saying the various metrics will all be summarized and presented together, right?
Al: if you're looking for a comprehensive set of metrics on all aspects of reordering, it would be a good idea to report as many as you feel are useful.
Mark: you should state that in the draft.
Nischal: in section 7.2 you say reorder extent may overestimate buffer size. How is that useful?
Al: it's purely a positional offset -- how many packets arrived between the reordering discontinuity and this particular packet. Just giving packet counts is not as effective as giving the byte offset, which would give the amount of storage necessary. The definition of extent helps these other metrics, because it associates a Reordering Discontinuity with each reordered packet (needed for LateTime, ByteOffset and Gap metric definitions).
Mark: don't understand. Overestimate in terms of bytes? Al: no, it is an overestimate in the number of packets that might have to be stored.
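The overestimate can be seen in a small Python sketch. It assumes, following the exchange above, that extent is the positional distance from a reordered packet back to the earliest arrival with a higher sequence number, and it compares that with a plain count of earlier arrivals with higher sequence numbers (the packets that would actually sit in a buffer). Function names are hypothetical, duplicates are assumed absent, and counts are in packets rather than bytes.

```python
def reordering_extent(arrivals):
    """Map each reordered sequence number to its positional extent."""
    extents = {}
    next_exp = None
    for i, seq in enumerate(arrivals):
        if next_exp is None or seq >= next_exp:
            next_exp = seq + 1
            continue
        # Earliest arrival with a higher sequence number: the discontinuity.
        j = next(k for k in range(i) if arrivals[k] > seq)
        extents[seq] = i - j
    return extents


def buffered_packets(arrivals):
    """Map each reordered sequence number to the count of earlier,
    higher-numbered arrivals (packets a receiver would have to hold)."""
    held = {}
    next_exp = None
    for i, seq in enumerate(arrivals):
        if next_exp is None or seq >= next_exp:
            next_exp = seq + 1
            continue
        held[seq] = sum(1 for k in range(i) if arrivals[k] > seq)
    return held


sample = [1, 5, 2, 3, 4]          # packet 5 arrives early
print(reordering_extent(sample))  # {2: 1, 3: 2, 4: 3}
print(buffered_packets(sample))   # {2: 1, 3: 1, 4: 1}
```

In this sample only packet 5 ever needs to be held, yet the extent of packet 4 is 3, which is the sense in which extent can overstate storage.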
Scott Poretzky: directed at the test equip vendors in the room. At least two in the room. I think it's a great draft, and I remember sharing it a year ago with some test equipment vendors, some of which counted loss as reordering. They promised they would fix that. Are vendors here willing to say that they are compliant?
Kaynam Hedayat: Our reordering metric, since day 1, about 5 years ago, does not include loss. This is a great draft, will implement all the metrics. Brix Networks.
Diego Dugatkin (IXIA) commented that they liked the draft and will follow it.
3. Nischal Piratla: Reordering metric comparison, and status of RD-RBD draft
Nischal Piratla presented his view of the metrics in the current draft and the reordering density personal submission. (See slides for details.) The basis for comparison is to look at essential attributes (some must and some should). The most fundamental one: capture reordering. They think that capturing earliness and lateness is also important, and that both should be considered (see the example in the slides). In the second example sequence, 3 is not n-reordered, for any n. So they feel that n-reordering really doesn't capture reordering, so it may not be a reorder metric.
Other desirable attributes are a low sensitivity to loss and reordering. They do not want to call it orthogonal, because it's not really orthogonal; it's not totally independent. On-the-fly computation. Usefulness - not just a number. Simple. Informative. Not required to buffer whole sequence. Low computational complexity, and in particular not rising dramatically with sequence length. Be able to combine metrics.
Nischal then presented a table with metrics and how they compared against the attributes. Not claiming that one is better than another.
RD - captures reordering, because it captures both earliness and lateness; low sensitivity to loss and duplication because of the threshold; useful because max displacement of RD will indicate buffer size needed to recover from reordering.
RBD - reordering, but only buffering early packets.
reorder extent - captures only lateness. Low sensitivity to reordering and loss. For duplication, you can identify and remove them, but reorder extent doesn't give an adequate mechanism to do it. Usefulness... as just asked, not sure how useful it is.
byte offset - is the same thing, but it is useful in estimating buffer sizes
n-reordering - doesn't capture reordering. Low sensitivity to loss. Duplication as above. Usefulness... it can be used for TCP dupack estimation, but nothing more than that.
RD, RBD - thresholds limit spatial requirement. The thresholds also limit computational complexity.
RD only metric that can be combined in cascaded networks (assuming some stationary conditions). Still working on RBD for this.
reordering extent - spatial requirement order n if you have to detect the duplicates (by storing missing packets or early arrivals).
Status of the draft: RD is accepted for publication, and have several papers in the pipeline for publication. Removed loss-orthogonal RBD.
Believe that this should be an IPPM reorder metrics draft.
Scott Poretzky: what do you mean by "low sensitivity"? As a user of test equipment that measures this stuff, if a vendor says low sensitivity, I'd be afraid.
Nischal: If you make a measurement using the metric, and a stream has losses, how much will that affect the measurement.
Scott: I want the measurement to be 100% deterministic. And I want it to mean there is not duplication and there is not packet loss.
Nischal: if a packet is late by 1000 places, it is probably as good as lost. In reality, all packets before that are early. If a metric has no attribute for robustness, at that point the measurement will drastically change.
Scott: for that answer and a lot of your examples, you use small buffers. Do you require a small buffer size?
Nischal: no that is just so it fits on a page. Could be 50-60...
Scott: thinking about operational networks that providers run, with OC 192. 100,000 packets per second on those links. For that application, need huge buffers to analyze reordering within that second or in sequential seconds. Your approach wouldn't work with such a network.
Nischal: you're talking about buffering sequence numbers. But we don't have to buffer sequence numbers, we can measure on the fly. If a packet comes in very late, you drop it because of the threshold and keep going with the measurement. I don't see a problem.
Scott: but you say you use less buffer size, and I see that as a disadvantage. What about load balancing across many high-speed links. Many could be misordered, and you want to measure that.
Nischal: The metric has a tunable size, you can make it big. You must choose something.
Al Morton wanted to make two points. First, this definition of reordering is still different than the definition in the other draft. That draft is consistent with the way reordering is defined in a bunch of other venues. Second, the real question is that you have to take a different view of loss in order to make RD and RBD work. Looking for loss within fixed window of packet arrivals. If look into implementations of metrics, have a time we're willing to wait, not a number of packets. Even with the fundamental definition of loss, we appear to be in two different universes. Nischal stated that he believes the definition of reordering in Al's draft is flawed. It only counts late packets. That means that there is no hope of combining cascaded network measurements; it is not carrying all information. Looking at other metrics, see how TCP works. If you don't get the ACK you hope for, after 3 duplicate acks you retransmit. That's waiting for packets, not time. Al stated there is the retransmit timeout. The action on a triple duplicate acknowledgement is a performance optimization (whether the missing packet is lost or reordered is ambiguous when TCP takes action). Nischal felt that it is what needed to be captured, and no matter what you need to capture you can set the threshold appropriately.
Al wanted to ask about the current working group draft metrics being incomplete because they could not be concatenated. Nischal said that concatenation is different than cascading. None of these metrics work for concatenating measurements on network segments. Look at a network, find a steady-state reordering for one, then another, then cascade together. What is the reorder response of the cascaded network? This can be estimated using RD only. Also, this was not named an essential property. In more detail, Nischal said if you measure the response of one network to a given set of packets, then another network to the same set of packets, then the combined networks with the same set of packets, that is what he is talking about estimating with RD only. What use? ISPs would be interested in whether a set of conjoined ASes is going to be worse or better.
Stanislav Shalunov likes the metric, it captures a lot of information. What he finds disconcerting is the insistence that your metric captures reordering, and others don't capture reordering. Nischal said that he only stated that he believed n-reordering did not capture reordering. Stanislav pointed out that we are defining reordering. You can't say a definition doesn't capture something it is defining.
Nischal stated that he sees a packet out of sequence, and the metric doesn't capture it. Stanislav wondered whether the argument was that, because it is different from Nischal's metric, it is not really reordering and therefore doesn't capture reordering. Nischal said (repeating what's on the mailing list) that n-reordering doesn't keep with Al's definition from the second slide; packets 4-5-6 are reordered, but according to n-reordering only 4 is reordered and 5 and 6 are not, which means they are in order. Henk said that the fact that 4 is reordered means there is reordering.
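The disagreement is easier to follow with a concrete sequence. The Python sketch below assumes the usual reading of n-reordering (a packet is n-reordered when all n arrivals immediately before it carry higher sequence numbers) and uses a hypothetical arrival order, 1 2 3 7 4 5 6, since the slide's actual sequence is not reproduced in these minutes.

```python
def max_n_reordering(arrivals):
    """Map each n-reordered sequence number to the largest n for which
    all n immediately preceding arrivals have higher sequence numbers."""
    result = {}
    for i, seq in enumerate(arrivals):
        n = 0
        j = i - 1
        # Walk backwards while the preceding arrivals are higher-numbered.
        while j >= 0 and arrivals[j] > seq:
            n += 1
            j -= 1
        if n > 0:
            result[seq] = n
    return result


# Hypothetical sample: packet 7 arrives ahead of 4, 5 and 6.
print(max_n_reordering([1, 2, 3, 7, 4, 5, 6]))  # {4: 1}
```

Under this reading packets 5 and 6 are not n-reordered for any n, because each is immediately preceded by a lower-numbered packet, even though all three arrive after packet 7. That is the behaviour Nischal objected to and that others described as intended.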
Stanislav commented that it is easy to agree if a sample has reordering, or no reordering. That has intuitive meaning. If a sequence number is not monotonically increasing -- intuitive meaning. If you want properties on top of that, there is no intuitive meaning, just the usefulness of properties. If the definition is not useful, then your intuition is no better than someone else's definition. There is no master dictionary that says 5 should or should not be counted.
Nischal accepted the comment, but within the same draft it appears that you have multiple definitions. Will there be one definition? Henk cut off discussion to move the line, and asked that additional discussion take place on the list.
Mark Allman wanted to make a couple of points. First, although he had read the buffer density draft today, he still doesn't see how it's useful in a concrete way. The draft could do a better job of stating the application. Second, he likes the RBD metric; it has some properties that other definitions do not have. The histogram idea is good, it gives a better idea of where to put buffer size and not just one number. Third, it does not make a lot of sense to have two different ways to measure buffer size.
Matt Mathis wanted to underscore the idea of there being no natural definition of reordering. The reason n-reordering makes sense is that it reflects the amount of effort needed to put data back in order. Early packets don't matter. This is different than the "abstract mathematical order" of reordering, and it is designed that way.
Mark said that underscores the reason for multiple definitions...there are multiple ways things work in the world, multiple applications, which might require multiple definitions.
Scott Poretzky made the point that if test equipment has to implement, and each implementer picks a different one, we as users of equipment are in trouble.
Henk said that the way we have discussed this is that if there are multiple metrics, you need to define their applicability. You may want different metrics for different cases, and document what you use in each case. The test box has to implement both, or more than one in that case.
Matt Z felt that discussion hasn't been closed, and that he would like to see more discussion on the list. Not sure how to phrase the right question to move forward. Al said that we had this discussion last August, and agreed that the WG draft would proceed through working group last call. If there was a demonstrated value in the other draft, then would take it up. Don't forget that... Henk agreed. We will proceed with the current draft, and put the question of whether to pick up the Colorado State draft to the list.
Mark had a final comment. I understand what you say, but I don't see why we need to give out two different ways to estimate buffer sizes that are pretty similar. There is room to collaborate here.
4. Henk Uijterwaal: Implementation reports
Henk reported on implementation reports of existing Proposed Standard metrics. Five responses were received (there's one more that may come in); they generally cover the metrics, and show two separate implementations. The question is what next... we propose to give it to the ADs, and ask their advice.
Al Morton thought this might be a time to see if there is anything we should change. For example, should we revisit the group's decision to assign infinite 1-way delay to lost packets and related intransigence with respect to using delay averages -- "everyone else does it" (and in particular the ITU, so introducing the notion would harmonize the worlds a bit).
5. Al Morton: ITU Performance Work Update
Al Morton gave a short report on ITU work in performance metrics (in particular question 17, study group 12). He's in charge of performance work in ITU-T now (as of January). They are revising Y.1540 and Y.1541; in particular looking at loss performance objectives of 10**(-5) driven by broadcast video, large bandwidth-x-delay TCPs and circuit emulation over packet networks. They also want to look at concatenation rules for loss, delay, and jitter. Recall that we have had some good comparisons in the past, Will Leland gave reports in IETF 44 and Al at IETF 55. Al said that he would keep the group abreast.
Matt asked if this was all of the ITU-T's work on IP Performance. Al mentioned that there are other related activities, such as VoIP-specific packet performance metrics defined in Recommendation G.1020.
Nischal asked if lost packets also have infinite jitter. Al said yes (if they are assigned infinite delay) but the ITU-T metrics emphasize the delay histogram of successful packet transfers instead of interpacket arrival (as in RFC 3393). Recommendation Y.1540 mentions IPPM terminology in an Appendix and compares the various metrics. The IPPM jitter metric can be equivalent to ITU-T if you choose the selection function of IPDV appropriately.
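One way to picture the role of the selection function is the Python sketch below. It treats delay variation as the difference in one-way delay between a selected pair of packets and swaps in two selection functions: consecutive packets, and every packet against the minimum-delay packet. Pairing against the minimum is an assumption about what an "appropriate" choice might look like, not a statement of what Y.1540 specifies; the function names and sample delays are made up, and only successfully delivered packets are included.

```python
def ipdv(delays, select_pairs):
    """Delay variation for each (i, j) pair chosen by the selection function."""
    return [delays[j] - delays[i] for i, j in select_pairs(delays)]


def consecutive_pairs(delays):
    """Select each packet together with the packet sent just before it."""
    return [(i, i + 1) for i in range(len(delays) - 1)]


def pairs_against_minimum(delays):
    """Select every packet together with the minimum-delay packet (assumed
    reference), giving variation relative to the smallest observed delay."""
    ref = min(range(len(delays)), key=delays.__getitem__)
    return [(ref, j) for j in range(len(delays))]


one_way_delays_ms = [20, 25, 22, 30]  # hypothetical delays, delivered packets only
print(ipdv(one_way_delays_ms, consecutive_pairs))      # [5, -3, 8]
print(ipdv(one_way_delays_ms, pairs_against_minimum))  # [0, 5, 2, 10]
```

The second result is non-negative by construction and can be binned directly into a histogram of delay above the minimum.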
6. Phil Chimento: Bandwidth draft
Phil Chimento presented the current "draft" draft on bandwidth definitions, which should soon become a real internet draft. (See slides.) It's a very rough stake in the ground, intended to solicit comments. There are two sets of definitions, one related to capacity and the other related to available capacity. Focus is on what IP layer sees coming from L2.
Matt Mathis mentioned that with these kinds of measurements done actively, you don't know how much your measurement perturbs traffic. You don't know if you are measuring golf balls with bowling balls.
Stanislav brought up an issue about the definitions and whether they have unstated assumptions about the underlying links. Phil will investigate.
Mark Allman brought up that the work on definitions was good, but work on ways to measure them was probably premature. This was echoed by a few other folks in the room.
An equipment vendor echoed that just having agreed definitions was important... and that customers want to know "what is the 'bandwidth' in the last mile".
7. Kaynam Hedayat: A Two-way Active Measurement Protocol (TWAMP)
Kaynam Hedayat reported on an individual submission "TWAMP" -- a two-way active measurement protocol. (See slides for detail.) Roman Krzanowski spoke to give motivation, in some sense reprising his talk from the last IETF in one slide. The architecture for TWAMP follows OWAMP, but allows for a reflector, not just a receiver, and the reflector can either be stateful or stateless. Thus, this allows for "TWAMP Light", with a simple reflector.
The test packets from sender to reflector are the same as OWAMP; the opposite direction needs a new format.
Matt Mathis asked about any new security concerns with this method of use. The security concerns are the same as the one-way protocol; you can use keys. With TWAMP Light, there definitely could be problems, but it depends on how a provider provisions. Could limit hosts, or encrypt packets. Matt will take his concerns off line.
Stanislav related the history of OWAMP, and asked why they wanted something that had (in his view) fewer good properties. Why TWAMP rather than two OWAMP sessions? Basically, the authors want a standard way to do the reflection, so lightweight systems could interoperate; and this does not require synchronization. Stanislav also noted that if you consider the distribution of reflected packets, they can be arbitrarily distorted in this design. For example, a receiver on a power-saving wireless network might get packets bunched up into 100ms intervals. There's also a problem with distinguishing losses. Stanislav will take this issue offline. Kaynam said that there are two things this kind of design might provide. First, you get a round-trip jitter measurement, which is something you can't add from two one-way measurements. Second, since OWAMP requires synchronization of time, it is hard to add the two one-way delays precisely if you cannot tell if the clocks changed with respect to one another (or were adjusted to keep the clocks synchronized). Emile said he would be interested in seeing the definition for round-trip jitter.
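The minutes do not spell out why reflection removes the need for synchronization. A common way to see it, assumed here rather than taken from the TWAMP submission, is a four-timestamp calculation: the sender stamps departure and return on its own clock, the reflector stamps arrival and departure on its clock, and any fixed offset between the two clocks cancels. The parameter names below are hypothetical and times are integer microseconds.

```python
def round_trip_delay(t_send, t_reflect_rx, t_reflect_tx, t_return):
    """Round-trip delay excluding the time spent inside the reflector.

    t_send, t_return:            read from the sender's clock
    t_reflect_rx, t_reflect_tx:  read from the reflector's clock
    Each difference uses a single clock, so a constant offset between
    the two clocks does not affect the result.
    """
    total_elapsed = t_return - t_send             # sender clock only
    reflector_hold = t_reflect_tx - t_reflect_rx  # reflector clock only
    return total_elapsed - reflector_hold


# Reflector clock runs 50000 us ahead of the sender clock in this example.
print(round_trip_delay(1000, 51000, 51200, 2400))  # 1200
```

Clock drift during the exchange, and the bunching of reflected packets that Stanislav raised, are not addressed by this arithmetic.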
Henk asked for a sense of the room as to whether the group was interested in this as a WG item; there was support, and no dissent. Henk said we'd take it to the list and the ADs.
8. Emile Stephan: Multiparty metrics
Finally, Emile Stephan talked about "multiparty metrics" -- basically combining one-way metrics and methods with (probably passive) measurement points to get information about a packet from more than just the source and destination (or in the case of multicast, from the source and many destinations). See slides for detail. This follows on from items he had brought to the group earlier and a draft Lei Liang had sent in before the last IETF meeting. The PCE WG is evidently interested in such a thing. The draft was late, missing the cutoff, and no one in the room had read it. The chairs said that once the draft officially appeared in the Internet Draft directories, they would try to start a discussion on the mailing list.
There was one question from the audience, wondering if members of the group could be given individual weights; here it seems like they all have the same weight. Emile replied that this tries to use the definitions from the IPPM framework; it is not possible to mix definitions. The questioner asked about having one receiver not too important to you having a large delay. Emile thought that anything should be reported. Henk recommended that the questioner read the draft, and if he still had a question, post it to the list.
Henk also stated that we would take the question of whether this draft should be a working group item to the list.
9. Juergen Quittek for Saverio Niccolini: Traceroutes
Juergen Quittek was unable to free himself from his other working group to present on the traceroute storage draft; again, we'll try to generate a discussion on the mailing list.
We told the room about Juergen's status, asked for comment on the draft, and closed the meeting.