IETF-73 BMWG Minutes

Benchmarking Methodology WG

Monday, November 17, 2008, 1300-1500 Afternoon Session I

Room: Rochester

CHAIR: Al Morton

These minutes were prepared by Al Morton and appear in two parts: a one-paragraph summary, and detailed minutes based on the notes of Bill Cerveny as official note-taker.

Summary:

About 20 people attended the meeting, with another four joining remotely
(including both of the Cisco folks who commented during the IGP-Dataplane WGLC).
MPLS Forwarding Benchmarking has completed WGLC and been revised based
on comments, and it may be the next draft to reach IESG review.
The status of the IPsec Benchmarking Drafts will be posted
to the list next week by the primary author, Merike Kaeo. After an extensive
discussion, there appears to be a way to satisfy the latest round of comments
on the IGP Data Plane Convergence drafts and resolve the IESG concerns.
A draft on SIP device benchmarking terminology is ready to be considered
as a working group item.  Protection Mechanism benchmarking needs to consider
comments from the last meeting during the upcoming WGLC.
Interest in several of the
"old" work proposals seems to have faded, but IPFIX benchmarking
could be considered after the proposal is modified to match
bmwg's scope of black-box DUT comparisons more closely. A new proposal on DNS device
benchmarking was also briefly reviewed.

Detailed Minutes:

Al Morton began at 1:05pm CST

1.  Working Group Status (Chair)

Charter has been extended

Rajiv Asati — the MPLS work has been socialized with the MPLS WG; it may get some reviews from that working group

SIP Networking Devices — now on our charter

BMWG Activity review (see slide)

Accelerated stress benchmarking — Scott Poretsky has lost the co-author on this draft and is looking for someone to help co-author it

Old Action Items

BMWG Activity (slide 6)

draft-poretsky-sip-bench-term-04 — active; the authors want to complete the terminology document before the methodology document.

Supplementary BMWG page —

Standard “Paragraph” (intro/security) (slide 8)

— Underscores that the work is to be done only in the laboratory

— Every draft must have explicit scope section

Additional Standard Sections or paragraphs (slide 9)

Al puts his foot down on the need for a scope section. For example, with RFC 2544 one can see that it is meant for the lab, but people have turned it into a pre-service test on live networks. You need to have control of the device under test in a lab environment. If we have scope sections, less misuse will happen.

Test Device Capabilities — listed the tester capabilities needed for the RFC tests

— Scott Poretsky — in fact, the most recent Accelerated Stress methodology document had a testing capability section in it.  “You wonder why it isn’t in every document. It adds a lot of value to the document.”

2. IGP Dataplane Convergence Status — Scott Poretsky

Changes

Held a WGLC

Incorporated two sets of comments into version -17

Changes in Meth-16

— Added the concept of route-specific convergence and benchmarking it — from the IESG review

— Added tester capabilities

Planned Changes for -17

— Include scope and model in the introduction — a real help in positioning the document within our charter

— Group terms by states, events, times and interfaces

— Remove references to full convergence = max(route-specific convergence time) since this is not true

— Is route-specific convergence now outside the scope? — we need to convince ourselves of this (a later slide discusses this)

— Added term “convergence event trigger”

    Methodologies for Convergence Events Due to Route Withdrawal and Cost Change

    Updated convergence timeline — vendors showed timeline from right to left — I-D does the same.

    Adding term “convergence event trigger”. We also have “route convergence time” now. Do we want route convergence time in addition to convergence time? — Al says (since you are looking at me) I think the answer is “yes”.  Measurement of individual route convergence times was asked for in the DISCUSSes. People are measuring the complete distribution of route convergence times today, and talking about the time for 99% of routes to converge. It could be a very long time, and if it is, that’s what people want to know.

    Scott Poretsky — if anyone has comments on this, post on mailing list.

Any comments? — Al — didn’t see the introduction of a separate benchmark and methodology approach; that would be a benchmark, either via a loss-based or a rate-based methodology.  Scott Poretsky — need to identify that route convergence time is NOT used to benchmark full convergence.  Al — 10-11 procedures all have a step that uses route-specific convergence time; need to identify an explicit methodology.  Scott Poretsky — we’re benchmarking full convergence.  Al — need to figure out how to incorporate the route convergence methods.
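
To make the per-route measurement under discussion concrete, here is a minimal illustrative sketch (not text from the draft): it derives route-specific convergence times from per-prefix packet loss, assuming a constant per-prefix offered rate, and reports the 99th-percentile figure mentioned above. All names, prefixes, and numbers are hypothetical.

    # Illustrative sketch only (not from the draft): derive per-route
    # (route-specific) convergence times from per-prefix packet loss and
    # summarize the distribution. Assumes a constant per-prefix offered rate.

    import math

    def route_convergence_times(loss_per_prefix, offered_pps_per_prefix):
        """Convergence time per prefix = packets lost on that prefix / offered rate."""
        return {prefix: lost / offered_pps_per_prefix
                for prefix, lost in loss_per_prefix.items()}

    def percentile(values, pct):
        """Nearest-rank percentile, e.g. pct=99 for the 99th percentile."""
        ordered = sorted(values)
        rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
        return ordered[rank - 1]

    # Example: five prefixes, tester offering 1000 packets/s toward each prefix.
    loss = {"10.0.0.0/24": 120, "10.0.1.0/24": 450, "10.0.2.0/24": 300,
            "10.0.3.0/24": 80, "10.0.4.0/24": 900}
    times = route_convergence_times(loss, offered_pps_per_prefix=1000)
    print("99th percentile route convergence time: %.3f s"
          % percentile(times.values(), 99))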

Online comment — Peter De Vriendt

[19:38:14] <Peter De Vriendt> Route specific is often used with small amount of prefixes (igp) because the accuracy is much better,

[19:38:23] <Peter De Vriendt>  than the one achieved with rate derived

[19:40:26] <Peter De Vriendt> So which method to use (rate vs loss vs route-specific) may depend on the requirements.

 

- Al: Peter, it would be good to have text to make that recommendation. A small number of routes might be the condition for that recommendation.

 

[19:44:11] <Peter De Vriendt> Example, small topology is 5k IGP prefixes, and I could achieve sub-second convergence, but today I'm not able to measure this accurately using rate derived if the  sampling interval is 1 sec.

- Al: so Peter is suggesting that the route-specific convergence time may be the most useful metric for this case, and the packet sampling interval would need to be small enough to resolve it.

 

[19:47:08] <kris michielsen> route-specific only differs from rate-derived if both convergence and recovery transitions are non-instantaneous

 

Al: Agree, so this is another point to note in the text.

 

Online comment on another subject — Kris Michielsen

 

[19:42:12] <kris michielsen> What does the igp benchmark want to measure: convergence time or packet loss during convergence?

Al: In some cases, we are using packet loss to infer convergence time. The ideal thing we are seeking to learn is the time interval itself, i.e., the convergence time.

 

[19:43:29] <kris michielsen> So if there is no packet loss, the convergence is instantaneous?

Al: We need statements in the document that address this case (no packet loss) explicitly. The document needs to address this eventuality. But that’s the scope: dataplane measurements only.

Scott Poretsky — we’re measuring on the data plane.
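
To illustrate Peter’s and Kris’s accuracy point (an illustrative calculation, not draft text): with a loss-derived figure the convergence time is simply packets lost divided by the offered rate, whereas a rate-derived figure can resolve no finer than the tester’s rate-sampling interval. A minimal sketch with hypothetical numbers:

    # Illustrative only: why a 1 s rate-sampling interval cannot resolve
    # sub-second convergence, while a loss-derived figure can.
    # Assumes a constant offered load; all values are hypothetical.

    import math

    OFFERED_PPS = 100_000        # offered load toward the DUT, packets per second
    SAMPLING_INTERVAL_S = 1.0    # tester's forwarding-rate sampling interval
    ACTUAL_OUTAGE_S = 0.4        # the "true" convergence time being measured

    # Loss-derived: count every packet that never arrived during the event.
    packets_lost = int(OFFERED_PPS * ACTUAL_OUTAGE_S)
    loss_derived_time = packets_lost / OFFERED_PPS
    print(f"loss-derived convergence time: {loss_derived_time:.3f} s")   # 0.400 s

    # Rate-derived: the tester only sees per-interval forwarding rates, so the
    # event is smeared across whole sampling intervals.
    intervals_with_loss = math.ceil(ACTUAL_OUTAGE_S / SAMPLING_INTERVAL_S)
    print(f"rate-derived resolution: no finer than "
          f"{intervals_with_loss * SAMPLING_INTERVAL_S:.1f} s")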

 

SIP Performance Benchmarking (Scott Poretsky speaker)

This topic is of tremendous interest to the SIPPING working group.

Two individual submissions —

— We’ve reworked the terminology document significantly to add some detail.  We want to walk through these items.

— Excellent comments from SIPPING … they seem quite enthusiastic about this work

Major changes

    Removed terms related … (see slide)

------------- NOTE: the recorded audio file dies about HERE.--------------------------

Scope — DUT/SUT

DUT — must be RFC 3261-capable network equipment.

Tester — emulated agents

Focused on the control plane, not on media performance — performance benchmarks and how to obtain them.

Scope — Signaling and Media

The type of DUT will determine whether media traverses it

Scope - Scenarios

Session establishment

Vijay Gurbani — Presence is not being covered in this document.

Al - checking charter regarding Presence – it’s not mentioned there.

Scope - Overload

Falls in line with performance — strongly recommended that we cover overload — we need to know where the overload work is headed

— Some level of dependency on overload specification development

Out of Scope Scenarios

- Not benchmarking media, but media will be present.

- Not covering IMS — methodology being developed could be applied to IMS

- Session disconnect not considered in scope – this generated some discussion:

- Andrew Dolganow: Benchmark session establishment and flow disconnect — Scott Poretsky: that is worthwhile

- Vijay Gurbani — explained why it was excluded

- Carol Davids — idea that at any given load, DISCONNECTS will take precedence.

- Andrew: having done similar things in the past, if you don’t do that, things don’t disconnect.  Look at the impact of disconnects.

- Carol Davids — this is the only time we were looking at a ?? transaction.

- Scott Poretsky — at this time we aren’t doing measurement of session disconnects.  We are going to study the case where disconnects are present.

Removed Terms

Session Terms

Al - session overload capacity — is that clear to everybody? If you stop responding to session attempts, it will start again

- Andrew Dolganow — it can respond negatively; we want the point where we no longer respond.

Appendix for Session Attempt Arrival Rate

Moved benchmarks to an appendix, like the accelerated stress document.  White-box measurements are available in the appendix.  Al — we wouldn’t be doing comparative benchmarks with these.  Scott — no, we wouldn’t, but this is not what we did.

- Carol Davids: We put it in the appendix because it was outside of scope.

- Al — it’s a white-box measurement, so it can’t be used when comparing vendors; the text of the appendix needs to make this clear.

Next steps

Target co-WGLC with BMWG and SIPPING in Feb. 09 — Poretsky says this puts us on schedule with the milestones.

Al asked who had read the document; Andrew Dolganow said he had.  Al asked Andrew Dolganow if this was sufficient for it to become a working group draft.  Andrew Dolganow said he didn’t think there had been a sufficient number of readers.  Al will pose the question of adopting this draft as a WG item on the list, and ask for more readers.

 

Sub-IP Layer Protection Mechanism Performance Benchmarking (Scott Poretsky speaker)

Status

Agilent ran through the methodology and liked it.

Comments focused on terminology; worked on the consistency people were looking for.

Major Changes in Term-05

PLR versus Headend was clarified — significant update

 

Comments:

Bill Cerveny - Questioned extensive capitalization. 

- Scott - Capitalized only defined terms.

- Al – RFC Editor is OK with this as long as capitalization is done consistently.

Ron Bonica: ok to capitalize defined terms. Questioned if “Backup Path” was defined elsewhere (in another document); don’t redefine the term.

- Al — we are doing a WGLC… asked that people carefully read the document.

 

Milestones Slide

 

Work Proposal Summary Matrix

Stuff in gray may be dropped if people don’t start working on it.

Would like to see people do the reading and revive WLAN switching work – very detailed drafts are available, but Al’s the only person who’s read them, AFAIK.

 

IPFIX Benchmarking Proposal

Benoit Claise

Changes since 00 (see slides)

Issues — it could benefit from more involvement from the IPFIX world

Rate limiting issues

Bandwidth — CPU, impact on the forwarding plane, bandwidth

Scott Poretsky — can you tell us what performance benchmarks you are measuring?  Looks like a white-box measurement.

Al — Most familiar benchmark proposed here is impact on throughput.  Agree that CPU utilization can be used as a basis for comparison between software releases on a single DUT.  Effectively, the procedure could be turning feature on and looking at utilization change, but this would not be a benchmark. 

Scott Poretsky — There is a lot more about CPU load versus throughput.

Brian Trammell — Issue with CPU benchmarks and specific implementations.  Good for comparing one implementation to itself; throughput is much more useful between vendors’ devices. Need to specify that this is an IPFIX measurement on an IP forwarding device. There is another class of devices that are dedicated to packet sampling and IPFIX, and throughput would not be a relevant metric for those sorts of devices.

Bill Cerveny commented that the IPFIX measurement needs to have some sort of accuracy component.

Andrew agreed that Bill’s point was a significant issue.
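
As a sketch of the black-box comparison suggested above for an IP forwarding device (per Brian’s point, not for a dedicated measurement device) — measure the impact of enabling IPFIX on forwarding throughput rather than reading CPU counters — here is a minimal illustration. The measure_throughput and configure_ipfix helpers are hypothetical stand-ins for an RFC 2544-style throughput search and the DUT configuration step:

    # Illustrative sketch only (not from the proposal): a black-box comparison
    # of the DUT's forwarding throughput with the IPFIX feature disabled vs.
    # enabled. measure_throughput() and configure_ipfix() are hypothetical
    # helpers standing in for a throughput search and a configuration step.

    def throughput_impact(measure_throughput, configure_ipfix):
        """Return (baseline, with_ipfix, percent impact on throughput)."""
        configure_ipfix(enabled=False)
        baseline = measure_throughput()          # e.g. packets/s at zero loss

        configure_ipfix(enabled=True)
        with_ipfix = measure_throughput()

        impact_pct = 100.0 * (baseline - with_ipfix) / baseline
        return baseline, with_ipfix, impact_pct

    # Stubbed example so the sketch runs stand-alone (rates are hypothetical):
    if __name__ == "__main__":
        results = iter([1_480_000, 1_310_000])
        base, feat, pct = throughput_impact(lambda: next(results),
                                            lambda enabled: None)
        print(f"baseline {base} pps, with IPFIX {feat} pps, impact {pct:.1f}%")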

 

 

Measuring DNS (Server) Performance (presentation slides not in online materials)

Peter Koch, DNSOP co-chair — asking for help from the IETF performance community

What DNSOP does

DNS and Performance

- On the Internet: Round Trip Time

- In the lab: Queries per second — combination of hardware, software, and other factors.

New area of interest

- Apples and oranges comparison problem

Lab Testing

- DNS authoritative server

- Find peak performance

- But

— No agreed-upon set of benchmarking zones

— No agreed-upon set of benchmarking DNS queries

— Measurements are not always consistent (a rough measurement sketch follows below)
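
As a rough illustration of the lab measurement being proposed (not part of any draft), the sketch below sends a fixed number of queries to an authoritative server under test and reports answered queries per second. It assumes the third-party dnspython package; the server address, names, and query mix are placeholders since, as noted above, there is no agreed-upon benchmarking zone or query set, and a serial loop like this measures round-trip latency more than peak server capacity.

    # Illustrative sketch only: crude queries-per-second measurement against
    # an authoritative server in the lab. Requires the dnspython package.
    # Server address, zone names, and query mix are hypothetical placeholders.

    import time
    import dns.message
    import dns.query

    SERVER = "192.0.2.53"                       # lab DUT (placeholder address)
    NAMES = ["www.example.com.", "mail.example.com.", "ns1.example.com."]
    TOTAL_QUERIES = 10_000

    answered = 0
    start = time.monotonic()
    for i in range(TOTAL_QUERIES):
        query = dns.message.make_query(NAMES[i % len(NAMES)], "A")
        try:
            dns.query.udp(query, SERVER, timeout=1.0)
            answered += 1
        except Exception:
            pass                                # timeouts/errors count as unanswered
    elapsed = time.monotonic() - start
    print(f"{answered / elapsed:.0f} answered queries/s over {elapsed:.1f} s")

A real benchmark would use many concurrent senders and ramp the offered rate to find the peak, which is part of what the proposal would need to specify.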

Al: Anyone with interest should contact Peter and start to flesh-out a proposal.

Al: Action items: SIP draft to be posted and WGLC on Sub-IP Layer Protection Mechanism Performance Benchmarking