2.4.2 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot of the 45th IETF Meeting in Oslo, Norway. It may now be out-of-date. Last Modified: 04-Jun-99


Chair(s):

Kevin Dubray <kdubray@ironbridgenetworks.com>

Operations and Management Area Director(s):

Randy Bush <randy@psg.com>
Bert Wijnen <wijnen@vnet.ibm.com>

Operations and Management Area Advisor:

Randy Bush <randy@psg.com>

Mailing Lists:

General Discussion: bmwg@ironbridgenetworks.com
To Subscribe: bmwg-request@ironbridgenetworks.com
Archive: http://www.alvestrand.no/archives/bmwg/

Description of Working Group:

The major goal of the Benchmarking Methodology Working Group is to make a series of recommendations concerning the measurement of the performance characteristics of various internetworking technologies; further, these recommendations may focus on the systems or services that are built from these technologies.

Each recommendation will describe the class of equipment, system, or service being addressed; discuss the performance characteristics that are pertinent to that class; clearly identify a set of metrics that aid in the description of those characteristics; specify the methodologies required to collect said metrics; and lastly, present the requirements for the common, unambiguous reporting of benchmarking results.

Because the demands of a class may vary from deployment to deployment, a specific non-goal of the Working Group is to define acceptance criteria or performance requirements.

An ongoing task is to provide a forum for discussion regarding the advancement of measurements designed to provide insight into the operation of internetworking technologies.

Goals and Milestones:

Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless, LAN switches.

Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of the document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.

Take controversial components of the multicast draft to the mailing list for discussion. Incorporate changes to the draft and reissue appropriately.

Submit workplan for continuing work on the Terminology for Cell/Call Benchmarking draft.

Submit workplan for initiating work on Benchmarking Methodology for LAN Switching Devices.

Submit initial draft of Benchmarking Methodology for LAN Switches.

Submit Terminology for IP Multicast Benchmarking draft for AD Review.

Sep 98  Incorporate BMWG input and continue to progress the Cell/Call Terminology Draft. Reissue draft as appropriate.

Sep 98  Submit first draft of Latency Benchmarking Terminology.

Dec 98  Submit Benchmarking Terminology for Firewall Performance for AD review.

Mar 99  Submit Terminology for Cell/Call Benchmarking draft for AD review.

Mar 99  Submit Benchmarking Methodology for LAN Switching Devices draft for AD review.

Jul 99  Submit Latency Benchmarking Terminology draft for AD review.


Request For Comments:

RFC 1242: Benchmarking Terminology for Network Interconnection Devices
RFC 2285: Benchmarking Terminology for LAN Switching Devices
RFC 2432: Terminology for IP Multicast Benchmarking
RFC 2544: Benchmarking Methodology for Network Interconnect Devices

Current Meeting Report

Benchmarking Methodology WG Minutes

WG Chair: Kevin Dubray

Minutes reported by Kevin Dubray.

The BMWG met at the 45th IETF in Oslo, Norway, on July 14, 1999. Over 30 people attended.

Dubray noted many of the group's editors were unable to attend this IETF. With that, he presented the agenda:

1. Agenda/Administration
2. IP Multicast Benchmarking Methodology

Dubray offered an amendment to the agenda - highlight outstanding issues associated with the LAN switch draft or any of the other BMWG drafts. There were no dissenting voices.

1. Administration

The eighth revision of the Firewall Benchmarking Terminology draft was approved for distribution as an Informational RFC. So far, no word from the RFC Editor.

2. IP Multicast Benchmarking Methodology

Hardev Soor was introduced to lead a discussion of the latest multicast benchmarking methodology draft, now in its second revision.

Hardev revisited the latest changes to the document:

It was questioned why references to Forwarding Burdened Multicast Latency and Join Delay were removed. Hardev said that was the impression he had been left with from the 44th IETF BMWG meeting. It was pointed out that while Burdened Response was a clarifying term and didn't need a corresponding methodology, that didn't exempt the two forwarding-burdened metrics from RFC 2432 from receiving a methodology. Hardev said he would look into that.

Hardev informed the group that the only comments posted were in David Meyer's email to the BMWG list on 30 June 1999. (This can be retrieved from the BMWG mail archive.) Points from that email were subsequently addressed:

A. Test setup ignores few-to-many and many-to-many test scenarios.

Hardev indicated that he believes those methodologies can be extrapolated from the current draft's methodologies. Another voice offered that the methodological extensions might not be as straightforward as proposed. Another person recounted the decision at earlier meetings to focus on the one-to-many scenario, as it was the most straightforward; the extrapolation could be handled in subsequent BMWG work, if need be.

B. Lack of IGMPv3 support?

There was significant discussion on what should be cited: a vague reference to a group membership protocol, or an exact specification of a protocol (e.g., IGMPv3)? It was recommended that the draft require no less than IGMPv2; this way you get the improvements over version 1 (e.g., leave messages, which aid IGMP snooping).

C. Multicast Test Address Ranges
- Hardev recommends leaving as is, but changing the MUST qualifier to SHOULD.

It was stated that the draft's proposed ranges include scoped addresses, which might cause trouble; administrative scope might be one such case.

D. Fanout & Forwarding/Throughput Metrics
- Meyer posited that the methodology didn't adequately capture the effect of the metric's primary factors (source, destination, and port distribution, i.e., fanout) on the metric.

There was a discussion of how these factors may impact forwarding (e.g., the number of source/destination pairs in the offered load may impact routing table size; a larger table could impact forwarding). Hardev said they would consider this.

E. Re-Encapsulation Throughput.

Meyer's email questioned the utility of such metrics in relation to multicast. It was pointed out that there were several routing protocols to handle multicast natively. It was countered that these routing protocols were not omnipresent. Moreover, several in the group were rather vocal in stating that, given that today's multicast traffic must often traverse unicast-only regions via tunnels, these metrics weren't brain dead at all. It was agreed there was a need for encapsulation-related metrics.

The methodology to employ two DUTs versus one DUT was questioned, however. It was thought that a single DUT could better characterize the encapsulation related functions. A system of DUTs may mask behavior or at least make it harder to identify where the performance issues lie.

It was also mentioned that the methodology might benefit by offering specific encapsulation types and scenarios (VLANs?).

F. Multicast group capacity.
Meyer's email states the corresponding methodology doesn't define success or, as it was pointed out, failure. It was agreed that since the proposed methodology uses the concepts of success and failure as conditionals, the criteria for these conditionals must be clearly stated.
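The need for an explicit criterion can be illustrated with a small sketch (all names and the stand-in DUT model below are hypothetical): a group-capacity search only terminates meaningfully if "failure" is a stated, testable condition.

```python
def group_capacity(max_groups, forwards_all_groups):
    """Largest group count for which the DUT still forwards traffic to
    every joined group. forwards_all_groups(n) encodes the explicit
    success criterion (a hypothetical interface for this sketch)."""
    capacity = 0
    for n in range(1, max_groups + 1):
        if forwards_all_groups(n):
            capacity = n
        else:
            break  # first failure ends the search
    return capacity

# Stand-in DUT model that stops forwarding beyond 500 groups:
print(group_capacity(1000, lambda n: n <= 500))  # 500
```

Without a stated failure condition (e.g., any group's test frames not forwarded), two test implementations could report different "capacities" for the same device.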

Other comments were offered outside the scope of Meyer's memo. It was stated that the inconsistent or confusing use of nomenclature within the document detracted from it (e.g., SUT/DUT/SUT in sec 7.1; tester/test tool/DUT).

Specifically, it was thought that the document should be scanned and brought in line with other BMWG terminology and concepts such as offered load, intended load, forwarding rate, and throughput.

Citation of a theoretical max of 148,809 fps (section 4.1) would be clearer if it were bound to the Fast Ethernet medium.
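That figure follows from Fast Ethernet framing overhead: each minimum 64-byte frame is preceded on the wire by an 8-byte preamble/SFD and followed by a 12-byte inter-frame gap, which works out to roughly 148,809 frames per second at 100 Mb/s. A minimal sketch of the arithmetic (the helper name is illustrative):

```python
PREAMBLE_SFD = 8  # preamble + start-of-frame delimiter (bytes)
IFG = 12          # inter-frame gap (bytes)

def max_frame_rate(link_bps, frame_bytes):
    """Theoretical maximum frames per second at 100% utilization."""
    bits_per_frame = (frame_bytes + PREAMBLE_SFD + IFG) * 8
    return link_bps / bits_per_frame

# Fast Ethernet (100 Mb/s) with minimum-size 64-byte frames:
print(int(max_frame_rate(100_000_000, 64)))  # 148809
```

The same formula gives the familiar 14,880 fps figure for 10 Mb/s Ethernet, which is why binding the cited number to the medium matters.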

For section 4.1, it was thought traffic class mix MUST be reported; it was further suggested that a class "interleaving" factor MUST be reported. It was offered that the Appendix would benefit from a discussion of how traffic class mixing and packet interleaving may affect performance. This was considered important as many routing implementations consider multicast an exception and the grouping of exception arrival events may yield different characterizations.

It was decided that 30-second trial durations should be mandated.

It was thought that section 5.2, Min/Max/Average Multicast Latency, missed the mark set by RFC 2432. That is to say, the methodology departs from the metric's definition, which mandates reporting the difference between the minimum AND maximum individual latencies ONLY. It was thought that if statistical gymnastics were to be provided, the provision should be tied to the generic multicast latency methodology stated in section 5.1.
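The RFC 2432 definition in question treats Multicast Latency as the set of latencies from the ingress port to each egress port. A sketch of the reporting being referenced, with hypothetical per-port values:

```python
# Hypothetical per-egress-port latencies (microseconds) from one
# one-to-many trial. The reporting discussed above is the minimum and
# maximum of this set (and their difference), not an average.
latencies_us = {"port2": 41.7, "port3": 44.2, "port4": 58.9}

lat_min = min(latencies_us.values())
lat_max = max(latencies_us.values())
print(f"min={lat_min} max={lat_max} spread={lat_max - lat_min:.1f} us")
```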

There was a brief discussion as to whether the draft should conform to RFC 2119's guidelines for the usage of terms like MUST, MAY, etc. The counsel was to pick a direction and adjust as needed.

It was mentioned that the mixed-class methodology should provide for a class of traffic that is part of an offered stream but is destined to be DROPPED rather than forwarded; an example would be scope boundary violations. Drop functions are as important to characterize as forwarding functions.

Another question was directed at the 20 nanosecond clock resolution reference in section 6.1: if this is cogent to the methodology, should there not be an explanatory comment?

A general latency question was posed to the group: "Is it more meaningful to know the latency of a packet before or after the corresponding information has been 'learned' in the address cache or forwarding table?" It was communicated that after-the-learning latency was meaningful; the initial packet's latency (which generally triggers the address "learning" event) was nice to know, but is not well suited to the traditional latency collection tests.
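The distinction can be sketched as follows, with hypothetical samples in which the first packet's latency is inflated by the learning event:

```python
# Hypothetical latency samples (microseconds). The first packet
# transits before the DUT has "learned" the address, triggering the
# learning event, so it is reported separately from steady state.
samples_us = [210.0, 43.1, 42.8, 43.5, 42.9]

first_packet_us = samples_us[0]   # unlearned-path latency (nice to know)
steady_state = samples_us[1:]     # after-the-learning latencies
avg_learned_us = sum(steady_state) / len(steady_state)
print(first_packet_us, round(avg_learned_us, 3))
```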

Hardev thanked the group for its input and asked for continued comments.

Dubray said that he traversed the BMWG archive to glean outstanding issues with other BMWG works in progress. A quick review was conducted to try to close or get a determination on those issues.

On the ATM draft:

1. There was a thread on the mailing list with respect to reducing the number of base definitions in the draft. Specifically, the group was asked whether the draft should address TM4.0 terminology and benchmarking issues. The group indicated that it was OK to exclude the TM4.0 terminology and benchmarking issues from this draft; these could be addressed in a later draft.

2. There was an outstanding question on the mailing list: if the WG chooses to restrict the draft's focus to certain ATM service user applications, which should be chosen? IP? TCP? FTP? Etc.? The attendees thought that an IP/UDP focus was sufficient for now.

On the LAN Switch Methodology draft:

It was noted the latest draft <draft-ietf-bmwg-mswitch-01.txt> enjoyed a sizable improvement over the previous draft. The discussion benefited from having the document's editor, Jerry Perser, unexpectedly present.

1. There was a discussion on frame tagging and impact on frame sizes. Some explanatory text was offered on the list that the draft's other editor, B. Mandeville, found OK. Jerry said it would make it into the next draft.

2. Questions were raised as to the stated methodologies for sections 5.9.3 (address cache capacity) and 5.10.3 (Address learning rate). These methodologies require the use of only 3 tested ports. Some alternative text was offered that specified 3 ports per VLAN (one receive, one transmit, and one "spy" port to detect flooding). It was suggested that the editor check the archive and consider the input.

3. With respect to Section 5.6 (Filter illegal frames) and Section 5.7 (Broadcast frame handling and latency tests): will there be a reporting format description for these sections? Are they TBD? Jerry indicated the appropriate text will be added.

4. With respect to address learning, the group seemed to think that the zero-missed mark was an acceptable target. If it weren't a lossless type of metric, what could be offered as an alternative? No change of wording is anticipated.

5. It was mentioned on the mailing list that the draft didn't make use of "offered load," "intended load," and "forwarding rate," as cited in RFC 2285. Jerry agreed that use of these terms should be consistent with the RFC. Jerry proposed, however, modifying the associated forwarding-type metrics' reporting formats to:
a) MUST report intended load,
b) MUST report forwarding rates, and
c) MAY report offered load.
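As a rough sketch of how these RFC 2285 terms relate (all counts below are hypothetical): the intended load is what the tester tries to transmit, the offered load is what it actually transmits, and the forwarding rate is what the DUT is observed to emit.

```python
TRIAL_S = 30                  # trial duration, per the 30-second mandate
frames_attempted = 4_464_000  # what the tester intended to send
frames_sent = 4_400_000       # what actually went on the wire
frames_forwarded = 4_180_000  # what was observed leaving the DUT

intended_load = frames_attempted / TRIAL_S    # fps; MUST be reported
forwarding_rate = frames_forwarded / TRIAL_S  # fps; MUST be reported
offered_load = frames_sent / TRIAL_S          # fps; MAY be reported

print(intended_load, forwarding_rate, offered_load)
```

Reporting intended load alongside forwarding rate lets a reader see whether a shortfall came from the DUT or from the tester failing to achieve its own target rate.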

6. On the mailing list there was a question as to the 4 port requirement in the Head of Line blocking (sec 5.3.3) methodology. Jerry indicated that 4 ports were mandated by RFC 2285.

7. With respect to duration, it was agreed to mandate 30-second trials. It was thought, however, that the reporting format must require declaration of any non-standard trial duration.

8. It was noted that a backpressure methodology was missing. It was acknowledged that this needed to be addressed.

9. On the mailing list, alternate wording with respect to half and full duplex testing was offered. Jerry indicated he would have to review the thread on the mailing list archive.

Goals for next period:

1. Have BMWG drafts address and reflect issue resolution.
2. Find out where the Firewall Benchmarking Terminology draft was in the RFC process.
3. Progress the state of the Frame Relay Benchmarking Terminology draft.


Slides

None received.