2.4.1 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot of the 39th IETF Meeting in Munich, Bavaria, Germany. It may now be out-of-date.


Chair(s):

Guy Almes <almes@advanced.org>
Kevin Dubray <kdubray@baynetworks.com>

Operations and Management Area Director(s):

John Curran <jcurran@bbn.com>
Michael O'Dell <mo@uu.net>

Operations and Management Area Advisor:

John Curran <jcurran@bbn.com>

Mailing Lists:

General Discussion: bmwg@baynetworks.com
To Subscribe: bmwg-request@baynetworks.com
Archive: ftp://ndtl.harvard.edu/pub/bmwg/mailing.list

Description of Working Group:

The major goal of the Benchmarking Methodology Working Group is to make a series of recommendations concerning the measurement of the performance characteristics of various internetworking technologies; further, these recommendations may focus on the systems or services that are built from these technologies.

Each recommendation will describe the class of equipment, system, or service being addressed; discuss the performance characteristics that are pertinent to that class; clearly identify a set of metrics that aid in the description of those characteristics; specify the methodologies required to collect said metrics; and lastly, present the requirements for the common, unambiguous reporting of benchmarking results.

Because the demands of a class may vary from deployment to deployment, this Working Group will not attempt to define acceptance criteria or performance requirements.

Currently, there are two distinct efforts underway in the BMWG. The first addresses the metrics and methodologies associated with benchmarking network interconnect devices. The second effort (IPPM) focuses on determining the practical benchmarks and procedures needed to give users and providers of IP Internet services insight into those services.

An ongoing task is to provide a forum for the discussion and the advancement of measurements designed to provide insight on the operation of internetworking technologies.

Goals and Milestones:



Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless, LAN switches.

Aug 97


Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.

Aug 97


Take controversial components of multicast draft to mailing list for discussion. Incorporate changes to draft and reissue appropriately.

Aug 97


Incorporate BMWG input and continue to progress the Cell/Call Terminology Draft. Reissue draft as appropriate.


Request For Comments:

Benchmarking Terminology for Network Interconnection Devices

Benchmarking Methodology for Network Interconnect Devices

Current Meeting Report

Minutes of the Benchmarking Methodology Working Group (bmwg) Meeting

Reported by Kevin Dubray

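The BMWG met at the 39th IETF in Munich on 12 Aug 97. Fifty-five people attended the session. The agenda was approved as presented. The major topics discussed were:

1. Benchmarking Terminology for LAN Switching Devices
2. Terminology for IP Multicast Benchmarking
3. Benchmarking Terminology for Network Security Devices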

A question was asked as to why the Cell/Call Terminology draft was removed from this session's agenda. The chair informed the group that the editor had recently changed employers and was unable to effect the changes mandated at the Memphis meeting in the required time frame.

The chair further announced that the IPPM effort was being detached from the BMWG and that the BMWG charter was being amended to reflect this separation. A draft of the revised BMWG charter had been submitted to the Area Directors.

I. Benchmarking Terminology for LAN Switching Devices

Dubray stated that the LAN Switch terminology draft appeared to be nearing Last Call status. With that, he introduced Bob Mandeville to lead a discussion to fine-tune the draft in preparation for its Last Call.

Bob mentioned that one change he would like to make is to return the term "one-to-one mapped traffic" (item 3.3.1) to the wording of an earlier draft, "non-mesh traffic." David Newman said that when people talk to him about the condition conveyed by one-to-one mapped traffic, many of them call it "non-mesh." Dubray stated that while the two terms are closely related, they are different. He further stated that one-to-one mapping has industry precedent in several benchmarking applications. Jim McQuaid suggested that an effective compromise might be to keep the generic term, non-mesh traffic, but give it the meaning of "one-to-one mapped traffic." The group agreed that this would be an acceptable workaround. However, Newman stated that he had issues with the use of non-meshed traffic distribution in benchmarking scenarios. Dubray noted that the use of one traffic distribution pattern over another was really a methodological detail best left to the future methodology document.

On the burst-related terms, section 3.4, Bob asked the group whether the notion of capture effect should be introduced in the terminology draft or in the subsequent methodology document. There was a general feeling that mentioning capture effect as an "issue" in the terminology draft would suffice, and that a detailed treatment of the issue in the related methodology document was reasonable.

Dubray raised some other minor issues:

1. He did not see the need for the last paragraph of section 3.2.3.
2. The example chart in the discussion of Maximum Forwarding Rate had an incorrect entry, "Offered Rate," where "Offered Load" was intended.
3. In the discussion of the term Oload, wording should be added that requires Oload to be reported in association with Forwarding Rate.

Bob articulated that he would also like to change the last paragraph in term 3.2.3 by dropping it and re-introducing related wording from a previous draft.

Bob commented that there was an additional paragraph in the draft that he wished to discuss with Scott Bradner before modifying. Bob also mentioned that there had been a query as to why multicast was not considered in the LAN switch draft. Bob stated that his reply was that a separate multicast draft was in progress.

With the discussion of the LAN switch draft drawing to a close, Mandeville turned the floor over to Dubray. Dubray asked the group whether a Working Group Last Call would be appropriate for the document once the proposed modifications were folded into a subsequent draft. Nearly all indicated that it would be. In response to a related question, a short review of the steps associated with taking a draft to an Informational RFC was presented. The AD, John Curran, verified that the procedure was correct.

II. Terminology for IP Multicast Benchmarking

The next item on the docket was the discussion of the Multicast Benchmarking Terminology draft. Dubray gave a very brief recap of the Memphis presentation. He then went over the three major modifications to the current draft:

1. Moving the basic nomenclature (e.g., Iload, Oload, Forwarding Rate, etc.) to the LAN Switch Terminology document.
2. Changing the name of term 3.1.1, Flow, to Traffic Class.
3. Replacing the Scaled Group Throughput (SGT) metric with Scaled Group Forwarding Matrix (SGFM).

The first item was straightforward; there was no associated discussion.

On the second item, there was agreement that the identification of logical equivalence classes of traffic was a good and needed thing. There was no disagreement regarding the recasting of the term "Flow" to "Traffic Class." Dubray pointed out that the definition of Traffic Class cascaded down to two related definitions, Group Class and Service Class. The group noted that the definition of Service Class was a particularly good thing.

On the topic of Scaled Group Forwarding Matrix, Dubray differentiated between that term and its predecessor, Scaled Group Throughput. In one slide (mcast-02, slide 5), Dubray showed the power of the SGT metric in contrasting the throughputs of various target multicast groups in a single DUT scenario. In the next slide (mcast-02, slide 6), Dubray demonstrated the issue that Scott Bradner and others had with the SGT metric: it did not lend itself to straightforward comparisons across multiple DUTs (represented by the variously shaded bars) because the baseline, throughput, moved from one DUT to the next, as represented by the horizontal lines.

Dubray said the metric was reworked to allow for better comparison by using Forwarding Rate as the baselining mechanism. A target Forwarding Rate can easily be ascertained from a known Offered Load, thereby serving as a baseline or fixed reference for the tested devices in a particular configuration.
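
The idea of a fixed reference can be illustrated with a short sketch. It is not the draft's definition of SGFM, which these minutes do not reproduce; it only shows how dividing every measured Forwarding Rate by one known Offered Load puts several DUTs on the same scale. The function name, the DUT labels, and all figures are hypothetical and invented for illustration.

```python
def normalize_to_offered_load(offered_load_fps, measured_fps):
    """Express each measured forwarding rate as a fraction of the offered load.

    measured_fps maps a DUT label to {number_of_groups: forwarding_rate}.
    Because every entry is divided by the same offered load, rows for
    different DUTs can be compared directly.
    """
    if offered_load_fps <= 0:
        raise ValueError("offered load must be positive")
    return {
        dut: {groups: rate / offered_load_fps for groups, rate in by_groups.items()}
        for dut, by_groups in measured_fps.items()
    }


# Invented figures for illustration only; these are not measurements.
OFFERED_LOAD_FPS = 10_000
measured = {
    "dut_a": {1: 10_000, 10: 9_000, 100: 5_000},
    "dut_b": {1: 8_000, 10: 8_000, 100: 6_000},
}

matrix = normalize_to_offered_load(OFFERED_LOAD_FPS, measured)
for dut, row in matrix.items():
    print(dut, {groups: round(frac, 2) for groups, frac in row.items()})
```

With throughput as the baseline, each DUT would be scaled against its own, different reference; here the denominator is the same for every row.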

Dubray indicated some general problems with the multicast forwarding benchmarks. The currently proposed benchmarks do not consider multiple source addresses. Nor do they consider many-to-many relationships. In general, he said, multicast was problematic in its characterization because there were potentially "many axes" to consider when conveying results.

Others in the group were quick to point out that other factors may need to be conveyed, such as forwarding decisions as a function of forwarding rate. Jim McQuaid believed that expanding benchmarks beyond the standard two dimensions was fast becoming a requirement across other BMWG actions, such as the Network Security Device Terminology draft. Dubray stated that he would be very receptive to suggestions with regard to generic wording, were it applicable. He cautioned, however, that generic wording sometimes lacked the specificity required to convey the information needed for correct and consistent interpretation of the metric with regard to a particular test case.

Coming back directly to multicast, Dubray noted that other multicast-related working groups seemed to share the understanding that multicast is currently most mature in the one-source-to-many-destinations scenario. While a similar focus is most likely sufficient for this draft, Dubray invited others to participate in adapting the benchmarks for many-to-many test cases.

III. Benchmarking Terminology for Network Security Devices

David Newman was introduced to lead a discussion of the initial draft on "Benchmarking Terminology for Network Security Devices." David was immediately greeted with a variety of input:

1. Avoid redefining previously defined BMWG terminology.
2. Use, if needed, the term's "Discussion" section to articulate interpretation details of previously defined terms in the context of the draft.
3. Consider reworking the usage of the terms "multi-homed" and "policy."
4. Presentation style: Consider using a Table of Contents and grouping related concepts together. This may "flow" better than presenting the terms alphabetically.

David thanked the group for its input; he went on to say that the document, in its initial form, is a seed document. He had hoped to elicit a discussion from which the future direction of the draft could be better determined. Specifically, he had a few basic questions for the purposes of the draft:

1. What is a security device?
2. At what layer in the 7-layer OSI model are measurements useful with regard to security device performance? And what do you measure?
3. Does security device architecture matter?

During the discussion of the first question, "What is a security device?", there was much talk about features and access. One person commented that the word "access" is itself an ambiguous concept, ambiguous enough to warrant caution should one decide to use it to build a definition for a security device.

Another idea presented was synchronizing the definition of a security device with the work done in the IPSEC area. Newman commented that this would be fine for a Layer 3-centric approach, but asked whether that was enough.

Another discussion on the defining features for a security device ensued. John Curran commented that an idea of methodology and a notion of what would be done with a defined term is as important as the term itself. There is no utility in defining a metric that can't be gathered or a term that is too difficult to define. For instance, it may be better to discretely focus on firewall performance than to address the performance of "security devices." It was conceded that this question required further thought and discussion.

David focused the group on the second question: "At what Layer do we measure security device performance?" For example, should the focus be on "sessions" or "packets"?

One attendee added that the aforementioned division excluded ATM security services in favor of IP security services. Another comment suggested that useful metrics may not necessarily reside exclusively on "one Layer." Useful metrics may be had by combining data points from multiple layers, such as plotting authentication rate against traffic forwarding rate.

Bob Mandeville offered the general observation that "you can mix what you well define."

Someone raised the question of test configurations. Is it more important to scrutinize a single device or a security solution built from heterogeneous components acting as some black box? Many were of the mind that the black box approach is more realistic.

John Curran restated that this effort would have a better chance of success if its scope were limited, for example by specializing in Network Address Translation, encryption, web proxy, etc.

Others thought that providing a narrower set of metrics to be applied to a varied set of security devices may be useful as well. Jim McQuaid stated that one such metric, throughput, may be better suited for the task at hand than a forwarding rate type of metric. It is important to ascertain the rate at which there is no loss of policy enforcement.

During the throughput versus forwarding rate discussion, a question was raised about how one differentiates router functionality from security device functionality, especially given that many routers now embed "security" features. Another security device feature discussion ensued.
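
One way to read the remark about finding the rate at which there is no loss of policy enforcement is as a search for the highest offered rate that a device sustains cleanly. The sketch below is an assumption about how such a search could be organized, not a procedure agreed by the group. The `probe` callback, the simulated device, and the 7,300 frames/s limit are all hypothetical, and the search assumes the device passes at every rate below its limit and fails at every rate above it.

```python
def highest_clean_rate(probe, low_fps, high_fps, resolution_fps=1):
    """Binary-search the highest offered rate for which probe(rate) is True.

    probe(rate) is a hypothetical callback that runs one trial at `rate`
    and returns True only if no frame was lost and no policy was violated.
    Returns None if the device fails even at low_fps.
    """
    if not probe(low_fps):
        return None
    if probe(high_fps):
        return high_fps
    # Invariant from here on: probe(low_fps) is True, probe(high_fps) is False.
    while high_fps - low_fps > resolution_fps:
        mid = (low_fps + high_fps) // 2
        if probe(mid):
            low_fps = mid
        else:
            high_fps = mid
    return low_fps


# Stand-in for a real test bed: a simulated device whose policy
# enforcement starts to slip above 7,300 frames/s (an invented figure).
def simulated_probe(rate_fps):
    return rate_fps <= 7_300


result = highest_clean_rate(simulated_probe, low_fps=1, high_fps=10_000)
print(result)
```

A forwarding rate measurement, by contrast, would report how much traffic gets through at a given offered load without requiring that enforcement stay intact.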

David refocused the group by rephrasing the second question: "What and where do you measure?" There was some discussion about the benefits of measuring "PDUs" versus other values. Dubray stated that one measures what has relative importance in the context of the characterization; it may be a forwarding rate, or it may be some generic transaction rate.

There was a discussion of characterizing functionality versus characterizing performance. Some thought that conformance testing was needed to enforce interoperability. Dubray interjected the IETF line: "The IETF is not in the conformance test business." He also noted, however, that the methodological qualifier "correctly" was a very powerful word.

The discussion came full circle when it was commented that Curran's original suggestion to de-scope the effort to focus on firewall testing was not unappealing. Newman countered that this was acceptable, but that time and again we would return to referencing a wider set of features.

With regard to the last question, "Does device architecture matter?", most thought that it really does not matter. There may be divisions of where you look, such as Layer 2/Layer 3 for address-based scrutiny or Layer 7 for proxy-related issues. However, in the end, it is the treatment of the data unit that matters.

As time was waning, David thanked people for their input and asked folks to continue to give him feedback. Dubray reviewed the goals for the next period:

1. Prepare for WG Last Call on the LAN switch draft.
2. Update and reissue the Cell/Call Terminology draft.
3. Continue the multicast draft discussion and reissue the draft as necessary.
4. Progress the Security Device Benchmarking Terminology draft.
5. Post the revised BMWG charter.


BMWG - Traffic Class

Attendees List

Roster Not Received
