2.4.1 Benchmarking Methodology (bmwg)

NOTE: This charter is a snapshot of the 40th IETF Meeting in Washington, DC. It may now be out-of-date. Last Modified: 04-Nov-97


Chair(s):

Guy Almes <almes@advanced.org>
Kevin Dubray <kdubray@baynetworks.com>

Operations and Management Area Director(s):

John Curran <jcurran@bbn.com>
Michael O'Dell <mo@uu.net>

Operations and Management Area Advisor:

John Curran <jcurran@bbn.com>

Mailing Lists:

General Discussion: bmwg@baynetworks.com
To Subscribe: bmwg-request@baynetworks.com
Archive: ftp://ndtl.harvard.edu/pub/bmwg/mailing.list

Description of Working Group:

The major goal of the Benchmarking Methodology Working Group is to make a series of recommendations concerning the measurement of the performance characteristics of various internetworking technologies; further, these recommendations may focus on the systems or services that are built from these technologies.

Each recommendation will describe the class of equipment, system, or service being addressed; discuss the performance characteristics that are pertinent to that class; clearly identify a set of metrics that aid in the description of those characteristics; specify the methodologies required to collect said metrics; and lastly, present the requirements for the common, unambiguous reporting of benchmarking results.

Because the demands of a class may vary from deployment to deployment, this Working Group will not attempt to define acceptance criteria or performance requirements.

Currently, there are two distinct efforts underway in the BMWG. The first addresses the metrics and methodologies associated with benchmarking network interconnect devices. The second effort (IPPM) focuses on determining the practical benchmarks and procedures needed in gaining insight for users and providers of IP Internet services.

An ongoing task is to provide a forum for the discussion and the advancement of measurements designed to provide insight into the operation of internetworking technologies.

Goals and Milestones:



Aug 97     Expand the current Ethernet switch benchmarking methodology draft to define the metrics and methodologies particular to the general class of connectionless LAN switches.

Aug 97     Edit the LAN switch draft to reflect the input from BMWG. Issue a new version of the document for comment. If appropriate, ascertain consensus on whether to recommend the draft for consideration as an RFC.

Aug 97     Take controversial components of the multicast draft to the mailing list for discussion. Incorporate changes to the draft and reissue appropriately.

           Incorporate BMWG input and continue to progress the Cell/Call Terminology Draft. Reissue the draft as appropriate.


Request For Comments:

Benchmarking Terminology for Network Interconnection Devices

Benchmarking Methodology for Network Interconnect Devices

Current Meeting Report

Minutes of the Benchmarking Methodology (BMWG) Working Group

Reported by Kevin Dubray

The Benchmarking Methodology Working Group met on Wednesday, December 10, 1997, in Washington, D.C. Thirty people attended this session.

Kevin Dubray opened the session, and the agenda was approved as presented:

I. Administration (Solicitation for editor(s) and Charter presentation)
II. Status of <draft-ietf-bmwg-lanswitch-07.txt>
III. Benchmarking Terminology for Firewall Performance
IV. Terminology for IP Multicast Benchmarking.
V. Goals for the next period

I. Administration

Dubray announced that the vacancies for the two open editorships have been filled: Bob Mandeville volunteered to pick up the LAN Switch Benchmarking Methodology draft. Dr. Raj Jain agreed to get the Terminology for Cell/Call Benchmarking draft going again.

Dubray also presented the proposed charter revision, summarizing the changes as essentially providing for the spinoff of the IPPM effort to the IETF's Transport Area. There was some concern that the notion of "services" being scoped in the charter might conflict with other WGs, most notably IPPM. A discussion ensued on the many senses of "service" in internetworking (e.g., quality of service, service provision, frame relay service). It was agreed that it was the duty of the BMWG chair and Area Directors to moderate the BMWG scope so that overlap with other groups' efforts is minimized.

A point was raised regarding the need for BMWG metrics to be clear and have the property of yielding uniform results. The chair asked whether the second paragraph of "Description of the Working Group" section of the BMWG charter fulfilled this requirement. The group agreed that it did.

With that, the anticipated BMWG goals and milestones through 1999 were presented. The group appeared satisfied with the direction. John Curran indicated that the Operations and Management Area Directorate had given its approval.

II. Status of <draft-ietf-bmwg-lanswitch-07.txt>

The chair pinged Operations and Management Area Director, John Curran, as to the status of this Internet-Draft on its way to an Informational RFC. Mr. Curran indicated that the draft was in AD review. He anticipated a response in the near future.

III. Benchmarking Terminology for Firewall Performance

David Newman was on hand to lead the discussion on <draft-ietf-bmwg-secperf-01.txt>. David opened with the comment that input from the Munich meeting was reflected in this latest draft. He noted the draft's scope had been narrowed from general network security devices to a more focused view of firewall devices.

David went on to qualify the draft by saying that it was not a complete reference on the subject -- he doubted that it ever would be or could be complete. A discussion followed supporting Newman's premise: while an absolute reference is desirable, it is often unattainable.

Moving on, there was a bit of a discussion revolving around the draft's use of "data connection," and what level or layer the concept encompassed. It was articulated that, conceptually, the term data connection should be thought of the way IPSEC considers it: a flow of data between two endpoints.

David mentioned that he tried to use careful wordsmithing. For example, he cited that he employed the term "stateful inspection" versus a more commercially recognized term. He said this was sometimes hard to do. His research into firewalls seemed to indicate that the devices have been around since at least 1988; consequently, a substantial, often contradictory, and sometimes vendor-specific vocabulary exists. John Curran lauded the effort of keeping terms vendor-independent, reciting that beloved IETF phrase, "to no exclusion of others..."

Michael Richardson brought up the concern that for a benchmark reference, there was a noticeable absence of measurable units. Jeff Dunn added that while "auditable events" such as frame forwarding are good, one should not dismiss other events. Examples of such events could be rejected traffic or rejected sessions. Michael offered that auditing things that a device SHOULD NOT be doing may be desirable as well. "Badput" metrics that measure when a device bleeds frames or when a port becomes blocked can offer excellent insight into device behavior.
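The "auditable events" and badput ideas above can be pictured as a simple tally over observed test events. A minimal sketch follows; the event labels and the choice of which events count as badput are hypothetical illustrations, not terms from the draft.

```python
from collections import Counter

def summarize_events(events):
    """Tally auditable events from a hypothetical firewall test run.

    `events` is a list of strings such as "forwarded", "rejected_frame",
    "rejected_session", "bled_frame", or "blocked_port" (invented labels).
    Returns the full tally plus a simple "badput" count: events where the
    device did something it SHOULD NOT be doing.
    """
    tally = Counter(events)
    # A missing key in a Counter yields 0, so absent event types are safe.
    badput = tally["bled_frame"] + tally["blocked_port"]
    return tally, badput

# Example run: 95 frames forwarded, 3 legitimately rejected, 2 bled.
events = ["forwarded"] * 95 + ["rejected_frame"] * 3 + ["bled_frame"] * 2
tally, badput = summarize_events(events)
assert tally["forwarded"] == 95
assert badput == 2
```

The point of the sketch is only that both the expected events (forwarding, rejection) and the unexpected ones (badput) come from one uniform event record, so they can be reported side by side.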

Newman acknowledged the excellent points, but he cautioned against putting too much of the methodology draft in the terminology draft. The group did agree that identifying a characterization as a metric was desirable when the characterization added value to understanding the behavior of a class of devices.

There was a suggestion that the draft be made less generic: a document covering firewalls with respect to ingredient technologies may suffer from attempting to address too many of them. It was further suggested to remove ATM from the draft. Dunn countered by asking: if one starts filtering, where does one stop? If the draft is restricted to IP only, does that mean IP version 4, version 6, or both? How does UDP fit in vis-a-vis TCP? He thought it was counterproductive to limit the scope too much. The suggestion to remove ATM from the draft was withdrawn.

Newman added that essentially his approach was: "Here's some user data; here's its associated wrapper..."

A comment from the group suggested that this approach was fine at the atomic level, but asked about streaming those wrappers and payloads across "data connections" or "sessions." A discussion of these terms followed. Newman outlined some of the issues:

A. Abstract at too high a level, and you might not get a good basis for comparison; abstract too low, and you run the risk of ambiguity.

B. Some of the differences may be methodological details.

One suggestion for a workaround was offered: identify the problematic nature of the term "data connection" and use examples to clarify.

As time was growing short, David asked the WG for any additional input. Dubray responded that the significance of the terms "dual-homed" and "tri-homed" was lost in definitions that merely cite interface counts; the definitions might benefit from reflecting the role that the type or class of device configuration plays in securing a network.

The group offered suggestions for other possible firewall metrics: boot/initialization time, recovery rate, and fail-over (a measure of how quickly a device resumes forwarding after a link failure). Newman thanked the group for all the suggestions and insight. He said he would communicate the input to the draft's other contributors.

IV. Terminology for IP Multicast Benchmarking

After a quick re-hash of the purpose of the draft, Dubray indicated that there was acceptance for the following terms: traffic class, scaled group forwarding matrix, and aggregated multicast throughput. Kevin also said that he had not received enough input on the draft since Munich to warrant re-issue. The goal of this session was to build consensus or gain input on other terms.

With regard to the term Mixed Class Throughput (3.2.1), Dubray stated that the original intent of the metric was to differentiate how a DUT forwards traffic in the presence of BOTH unicast and multicast loads. Knowing that other classes of differentiation exist (e.g., an IDMR teleconference over the MBONE versus a multicast push technology such as a stock update), is the focus on unicast and multicast too restrictive? Or should another specialized metric be developed? The group expressed that a generic metric may be best in this case. It added that the word "rate" in the definition may be problematic.

The discussion turned to the term "Translational Throughput," (3.2.4). Dubray voiced that in the term's existing wording, the "translational" function was a methodological variable. In this way, the metric was made as generic as possible. He indicated, though, Dave Thaler emailed his concerns that references to "transitional format" and "final format" in the metric's definition were ambiguous. Given the possible translations,

A. Native frame to encapsulated frame,
B. Encapsulated frame to native frame,
C. One type of encapsulation to another type of encapsulation,

he thought the draft would be better served by independent metrics. The working group supported his position, indicating they would like to see Encapsulation Throughput, Decapsulation Throughput, and Re-encapsulation Throughput added to the set of metrics.

Dubray stated that the Multicast Latency metric (3.4.1) was extended from the definition of the latency metric defined in RFC 1242. RFC 1242 defines a single measurement from a pair of ports. Multicast Latency provides a set of measurements in a one-to-many configuration. He noted that the metric did not attempt to limit the number of statistical gymnastics that could be derived from the set; rather, the metric serves to provide a foundation. The group was supportive of this metric and liked its flexibility. A suggestion was made to differentiate 3.4.1 from its unicast analog in the metrics discussion section.

Follow-up discussion ensued on one of those statistical latency exercises, Min/Max Multicast Latency (3.4.2). Dubray articulated that this explicit latency metric may provide significant insight for certain multicast applications. The group liked this metric, but some queried about potentially defining additional latency metrics. One participant suggested that a Standard Deviation style of metric would be useful. Mr. Dunn countered with a warning that advocating a statistic where it may not be appropriate is dangerous; he cited, for example, computing a standard deviation on a non-normal distribution. Dubray reinforced the beauty of the generic Multicast Latency metric (3.4.1): by defining a metric that provides a uniform way of collecting a sample set, one could then apply the correct statistical exercise, such as computing a variance.
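The distinction drawn above can be illustrated with a small sketch: a one-to-many latency measurement yields a set of per-port samples, and min/max or a variance are then derived from that set rather than being metrics of their own. The port names and latency values below are invented for the example.

```python
import statistics

# Hypothetical per-egress-port latencies (microseconds) observed for one
# multicast frame offered to a DUT in a one-to-many configuration.
# This set is the Multicast Latency result in the spirit of 3.4.1.
latency_set = {"port2": 41.0, "port3": 44.5, "port4": 39.8, "port5": 58.2}

samples = list(latency_set.values())

# Min/Max Multicast Latency in the spirit of 3.4.2: the extremes of the
# spread across destination ports for the same offered frame.
min_latency, max_latency = min(samples), max(samples)

# Further statistics can be applied to the same uniform sample set --
# bearing in mind Dunn's caution that a standard deviation is only
# meaningful when the underlying distribution is roughly normal.
var = statistics.variance(samples)

assert min_latency == 39.8
assert max_latency == 58.2
```

The design point mirrors the discussion: the base metric only fixes how the sample set is collected, so any appropriate statistic can be layered on afterward.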

In closing the multicast benchmarking terminology draft discussion, Dubray asked that folks give consideration to some of the unaddressed sections of the document, such as fairness, overhead, and capacity.

V. Goals for the Next Period

Dubray reviewed the goals for the next period:


Slides

None Received
