IETF July 2000 Proceedings

2.3.4 Configuration Management with SNMP (snmpconf)

NOTE: This charter is a snapshot of the 48th IETF Meeting in Pittsburgh, Pennsylvania. It may now be out-of-date. Last Modified: 17-Jul-00

Chair(s):

Jonathan Saperia <saperia@mediaone.net>
David Harrington <dbh@cabletron.com>

Operations and Management Area Director(s):

Randy Bush <randy@psg.com>
Bert Wijnen <bwijnen@lucent.com>

Operations and Management Area Advisor:

Bert Wijnen <bwijnen@lucent.com>

Mailing Lists:

General Discussion: snmpconf@snmp.com
To Subscribe: snmpconf-request@snmp.com
In Body: subscribe snmpconf
Archive: snmpconf-request@snmp.com (index snmpconf in body)

Description of Working Group:

The working group will create a Best Current Practices document which outlines the most effective methods for using the SNMP Framework to accomplish configuration management. The scope of the work will include recommendations for device-specific as well as network-wide (Policy) configuration. The group is also chartered to write any MIB modules necessary to facilitate configuration management. Specifically, it will write a MIB module which describes a network entity's capabilities and capacities, which can be used by management entities making policy decisions at a network level or device-specific level.

As a proof of concept, the working group will also write a MIB module which describes management objects for the control of differentiated services policy in coordination with the effort currently taking place in the Differentiated Services Working Group.

Deliverables

1. A Best Current Practices document to provide guidelines on how to best use the existing Internet Standard Management Framework to perform configuration management.

2. A MIB module which describes a network entity's capabilities, such as support for a particular type of security or a particular queuing method on certain interfaces. The module will also convey the capacity of the device to perform certain work.

3. A MIB module which can be used to concisely convey information about desired network-wide Diffserv-based QoS behavior.

4. A document which describes potential future work needed to meet all the Requirements for Configuration Management.

Goals and Milestones:

Jan 00    Announce Working Group and call for Input

Feb 00    Submit Initial Drafts for BCP and MIB Documents

Mar 00    Meet at 47th IETF in Adelaide

May 00    Interim Meeting

May 00    Revised Drafts for BCP and MIB Documents and WG Last Call these Drafts. Submit to AD for consideration as BCP and PS.

Jun 00    Conduct Interoperability Testing

Jul 00    New Internet Drafts, including a document describing potential future work.

Aug 00    Meet at 48th IETF meeting in Pittsburgh

Sep 00    WG Last Call on remaining Drafts. Submit to AD for consideration as BCP and PS.

Oct 00    Re-charter or shutdown WG.

Internet-Drafts:

· The DiffServ Policy MIB

· Policy Based Management MIB

· Configuring Networks and Devices with SNMP

No Request For Comments

Current Meeting Report

IETF 48 Pittsburgh snmpconf Tue 0900 1 Aug 2000

Reported by Dale Francisco (dfrancis@cisco.com), Rob Frye (rfrye@longsys.com), and Steve Moulton (moulton@snmp.com).

Summary
=======
The bulk of the Tuesday morning meeting was devoted to Jon Saperia's presentation "Configuration Management with SNMP: SNMPCONF WG Status" (see also accompanying slides), and discussion of the topics he raised. Then Mike MacFaden and Jon discussed the BCP doc "Configuring Networks and Devices with SNMP" (draft-ietf-snmpconf-bcp-02.txt). Finally, Steve Waldbusser began his presentation on "Policy Processing Questions" originally scheduled to begin at the Friday meeting.

Attributions
============
Andrea == Andrea Westerinen andreaw@cisco.com
Andy == Andy Bierman abierman@cisco.com
Case == Jeff Case case@snmp.com
Dan == Dan Romascanu dromasca@lucent.com
Harrington == Dave Harrington dbh@enterasys.com
Jon == Jon Saperia saperia@mediaone.net
Juergen == Juergen Schoenwaelder schoenw@ibr.cs.tu-bs.de
Mike == Mike MacFaden mrm@riverstonenet.com
Partain == Dave Partain David.Partain@ericsson.com
Randy == Randy Presuhn Randy_Presuhn@bmc.com
Shai == Shai Herzog herzog@iphighway.com
Steve == Steve Waldbusser waldbusser@nextbeacon.com

Jon Saperia began his presentation with an agenda:

- Review of charter, goals, and work items.
- Policy management with SNMP: Goals, Terms and Operational Model
- Policy-based Management MIB Module: quick status and work items.
- DiffServ Policy MIB Module: quick status and work items.

Jon went on to discuss the wg charter, goals and work items.

The overall goal is to improve the utility of SNMP as a configuration management framework. In support of this goal, the wg will add MIB objects to support policy-based provisioning concepts.

Work items include:

- Best Current Practices document for configuration using SNMP and for documenting policy-based work
- Policy Management MIB module
- DiffServ Policy MIB module

Jon argued that it made sense to use a common framework (SNMP) for all management activities, in order to achieve benefits of efficiency and scale, and to leverage existing knowledge and software. There will also be operational benefits from integration of configuration data with other data such as that from performance and fault monitoring.

The wg approach will involve adding MIB modules at different levels of abstraction. Jon proposed defining those levels as:

Domain: A domain is a general area of technology such as service quality or security. Services, or service level agreements, may span several domains, each of them potentially including many policies. Generally people will not discuss these domains in the abstract. They will most often use technology or application-specific examples. Examples include IPSec and Differentiated Services.

Mechanism: A mechanism is a management technology over a domain--the dials that you turn to affect a domain. Comprises standard MIB modules.

Implementation: Vendors may define implementation-specific parameters to augment a standard set of mechanism-specific parameters. These vendor-specific extensions are described in private (enterprise) MIB modules.

Instance: The low-level details; the actual parameter values associated with particular instances of a managed element.

It is useful to have _all_ of these levels represented.

Jon proceeded to define terminology for policy-based management using SNMP:

Policy-based management: The practice of applying management operations globally on all managed objects that share certain attributes.

Policy: The association of a boolean filter that selects objects with an operation to be performed on those objects. Expressed as: if (policyFilter) then (policyAction).

Filter: A boolean expression that determines if an object is a member of a set of objects upon which an action is to be performed.

Action: An operation performed on a set of objects (efficiency gain--don't set individual instances).

Role: An abstract characteristic assigned to an element that expresses a notion such as political, financial, legal, or geographical association; a way of identifying who gets a particular QoS, for example.

Element: A uniquely addressable entity on a managed device.
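
As a rough illustration of the terminology above, the following Python sketch models a policy as a filter paired with an action and applies it to every element the filter selects. The Element and Policy classes, the attribute names, and apply_policy are hypothetical names chosen for this illustration; they are not taken from any snmpconf draft, and a real agent would operate on MIB objects rather than Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Element:
    """A uniquely addressable entity on a managed device (hypothetical model)."""
    name: str
    attrs: Dict[str, object] = field(default_factory=dict)
    roles: Set[str] = field(default_factory=set)


@dataclass
class Policy:
    """if (policy_filter) then (policy_action)."""
    policy_filter: Callable[[Element], bool]
    policy_action: Callable[[Element], None]


def apply_policy(policy: Policy, elements: List[Element]) -> List[str]:
    """Run the action on each element selected by the filter; return their names."""
    touched = []
    for element in elements:
        if policy.policy_filter(element):
            policy.policy_action(element)
            touched.append(element.name)
    return touched


# Example: administratively disable every Ethernet interface with role "edge".
elements = [
    Element("if1", {"ifType": "ethernet", "ifAdminStatus": "up"}, {"edge"}),
    Element("if2", {"ifType": "ethernet", "ifAdminStatus": "up"}, {"core"}),
    Element("if3", {"ifType": "frameRelay", "ifAdminStatus": "up"}, {"edge"}),
]

edge_ethernet_down = Policy(
    policy_filter=lambda e: e.attrs.get("ifType") == "ethernet" and "edge" in e.roles,
    policy_action=lambda e: e.attrs.__setitem__("ifAdminStatus", "down"),
)

touched = apply_policy(edge_ethernet_down, elements)
```

Only if1 satisfies both the ifType test and the role test, so the manager states one policy instead of setting individual instances.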

Jon described the inputs to managers (slide #9, Policy Management with SNMP: Operational Model--Applications): Roles, schedules (only want something to be true at certain times), filters (for a particular policy, select only these elements), and actions (mechanism and implementation-specific info).

Managed elements include capabilities, capacity info, utilization, and other state info.

For example:

- The House Lighting MIB module: A table might have light intensity and other parameters for every light in the house.
- Next level of abstraction (mechanism-specific): House Lighting Policy MIB module.
- Next level: Policy Based Mgmt MIB module (communicates w/ other modules as needed).

Note: The standard, low-level management framework is unchanged. Policy mgmt is just an overlay.

Jon then gave the current status of the Policy-Based Mgmt MIB module:

draft-ietf-snmpconf-02 published in July.

Modifications to capabilities table
- pointer to mechanism- and technology-specific modules
- support for implementations with restricted capabilities (how does a system tell the world it supports WRED, or security, but with exceptions?).

General functional questions:
- policy precedence
- notifications (if there's a config change, should tell system)
- policy override mechanisms

Execution environment (scheduling)

Policy expression issues (versioning, extensibility)

Architectural relationship w/ other docs e.g., interaction w/ mechanism- and implementation-specific modules

Conflict resolution

Jon then gave status on the DiffServ Policy MIB module:

draft-ietf-snmpconf-diffpolicy-02.txt, published in June.

Work items:
- Ongoing sync with QoS device model
- Ongoing sync with DiffServ MIB
- Expansion of how module relates to other MIB modules
- Better usage examples
- Integration of state, other info

Then followed discussion on Jon's presentation:

Partain: In slide 10 ("Policy Management with SNMP: Agents", using the House Lighting MIB example) I'd like to restate the terminology to see if I understand it. The "House Lighting MIB module" is the nuts and bolts--what we SNMP IETF people love. The "House Lighting Policy MIB module" is a template for operations (e.g. vacation lighting policy). The "Policy Based Management MIB module" sets up filters/mechanisms (a role might be "edge light").

Dan: I have a question about the different definition of "role" (different from PIB/COPS). I better understand the COPS one.

Jon: We talked about this: a role can be an arbitrarily assigned attribute of an element.

Dan: Other definition: A role is a selector for policy rules that determines the applicability of a policy to a network element.

Jon: My sense is that they don't mean exactly the same thing.

Andrea: Why don't we work together and come up with a better word? "Role" means function in COPS world.

Steve: They really are different things. In COPS, a role is the entire selector for whether policy applies to a particular element. In snmpconf, we draw on many variables (e.g. ifAdminStatus) for selection; "roles" are "special", high-level notions only known by humans. Maybe we need another word.

Shai: "Role" is a problematic usage. COPS already has this well-defined. No reason to twist an existing term.

Jon: Someone will contact Andrea to work with her on syncing terminology.

Next, Mike MacFaden discussed the BCP document "Configuring Networks and Devices with SNMP" (draft-ietf-snmpconf-bcp-02.txt).

The goals are to:

- Document what we already know
- Document BCPs in the use of SNMP as a configuration tool, and correct misperceptions about SNMP-based configuration and policy. Illustrate the practices through examples.
- Highlight application and MIB module design choices to produce efficient and effective configuration mgmt systems.
- Update general knowledge of SNMP-based mgmt to the year 2000.

The intended audience is network vendors and people who design and deploy MIBs.

Some MIB design guidelines:

Distinguish row creation from activation; need to make explicit the effect a change has on the stability of a system under configuration.

Newer MIBs often provide objects to turn all or some features off. Turning off is better than removing.

Design in ability to save the active config easily. It's nice to have clear separation of config variables from others. One idea is to use MIB design convention of groupings. What do I have to save from a MIB in order to capture active config? A BCP would state how to partition the config info.

Don't use OCTET STRINGs to aggregate multiple objects. Does this conflict with PortList TC? No...that's a good example of efficient containerization.

Specify an upper limit on the number of traps that can be sent in response to a given event. The effects of configuration can generate an inordinate number of notification events. E.g., when an FR link is set admin down, don't need to send down traps on each DLCI.

Provide user controls to avoid cascading notifications due to hierarchies. When possible, specify a notification equivalent to the layer being configured. Provide a mechanism that can control lower layer notifications e.g., ifLinkUpDownEnable. (Otherwise things could get bad in policy changes, where you change lots of things at once.)

Provide user controls to avoid a flood of notifications due to parallel notifications, e.g., a line card w/ 400 IFs goes down--just need one trap for the "containing" object.
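
The containment idea in the guidelines above can be sketched as follows: when an object and the things it contains all go down together, only the outermost object generates a notification. This is a minimal Python sketch under assumed names (traps_to_send, parent_of, and the "card3/ifN" labels are all hypothetical); it models the decision only and does not send SNMP traps.

```python
from typing import Dict, List, Optional, Set


def traps_to_send(went_down: Set[str], parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return only the objects whose containers did not also go down."""
    result = []
    for obj in sorted(went_down):
        ancestor = parent_of.get(obj)
        suppressed = False
        # Walk up the containment chain looking for a container that also failed.
        while ancestor is not None:
            if ancestor in went_down:
                suppressed = True
                break
            ancestor = parent_of.get(ancestor)
        if not suppressed:
            result.append(obj)
    return result


# A line card containing 400 interfaces (hypothetical naming).
parent_of: Dict[str, Optional[str]] = {"card3": None}
for n in range(1, 401):
    parent_of[f"card3/if{n}"] = "card3"

went_down = set(parent_of)  # the card and all 400 interfaces fail together
notifications = traps_to_send(went_down, parent_of)
single_if = traps_to_send({"card3/if7"}, parent_of)
```

When the whole card fails, one notification for the containing object replaces 401; when a single interface fails on its own, it is still reported.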

Persistence of config objects should be well specified. Consider using StorageType TC as opposed to using DESCRIPTIONS. e.g., disman schedule MIB. Alternate approach: defining static config tables.

Design w/ multiple managers in mind (provide appropriate locking mechanisms, such as spinlock, ownerstring).
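
One common form of the spin lock mentioned above is an object using the TestAndIncr textual convention: a set succeeds only if the supplied value equals the current value, after which the value is incremented (wrapping from 2147483647 to 0). The Python sketch below is an in-memory model of that rule, assuming hypothetical names (SpinLock, test_and_incr); it is not an agent implementation, and a real agent would report the failed case with an inconsistentValue error.

```python
class SpinLock:
    """In-memory model of a TestAndIncr-style advisory lock object."""

    MAX_VALUE = 2147483647  # 2^31 - 1

    def __init__(self, value: int = 0) -> None:
        self.value = value

    def get(self) -> int:
        return self.value

    def test_and_incr(self, supplied: int) -> bool:
        """Succeed only if the caller supplies the value currently held."""
        if supplied != self.value:
            return False  # a real agent would answer inconsistentValue
        self.value = 0 if self.value == self.MAX_VALUE else self.value + 1
        return True


# Two managers read the lock, then both try to start a configuration change.
lock = SpinLock()
seen_by_a = lock.get()
seen_by_b = lock.get()
a_ok = lock.test_and_incr(seen_by_a)  # first writer wins
b_ok = lock.test_and_incr(seen_by_b)  # stale value, must re-read and retry
```

In practice the lock value is carried in the same SET PDU as the configuration varbinds, so the whole request fails if another manager got there first.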

Define read-write objs at "right" level of abstraction. Too many instances to configure may mean abstracting the mgmt interface. A lot of MIBs have too many details--if I have to config 1000 rows to put ACLs on my router, I probably won't use SNMP.

Randy: Providing individual controls on generation of notifications at the MIB level interacts with the Notification Log MIB and access control in RFC 2575; it seems like it invites a multi-manager problem.

Discussion:

Jon: One clarification: Wrt notifications, you want to create them at the appropriate level of abstraction. Some things shouldn't be notifiable (e.g. queue depth change). Try to decide where does it make sense to have a notification, then choose level.

Jon Saperia then continued with discussion of the BCP doc. More recommendations include:

Management software: Make SET PDUs as efficient as possible by grouping as many varbinds as makes sense. Take advantage of AGENT-CAPS if available. Make config ops as parallel as possible (support concurrent Sets as makes sense).
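
A minimal sketch of the varbind-grouping advice, in Python: whole rows are packed into as few SET PDUs as possible without exceeding a per-PDU limit. The function name pack_set_pdus and the max_varbinds limit are assumptions for illustration; the sketch only builds lists and does not encode or send SNMP messages. How large a limit is sensible depends on the device.

```python
from typing import Dict, List, Tuple

VarBind = Tuple[str, object]  # (object name with instance suffix, value)


def pack_set_pdus(rows: Dict[str, List[VarBind]], max_varbinds: int) -> List[List[VarBind]]:
    """Pack whole rows into PDUs, never splitting a row across two PDUs."""
    pdus: List[List[VarBind]] = []
    current: List[VarBind] = []
    for row_key, varbinds in rows.items():
        if len(varbinds) > max_varbinds:
            raise ValueError(f"row {row_key} alone exceeds max_varbinds")
        # Start a new PDU if this row would not fit in the current one.
        if current and len(current) + len(varbinds) > max_varbinds:
            pdus.append(current)
            current = []
        current.extend(varbinds)
    if current:
        pdus.append(current)
    return pdus


rows = {
    "1": [("ifAdminStatus.1", 2), ("ifAlias.1", "uplink")],
    "2": [("ifAdminStatus.2", 2), ("ifAlias.2", "spare")],
    "3": [("ifAdminStatus.3", 1), ("ifAlias.3", "lab")],
}
pdus = pack_set_pdus(rows, max_varbinds=4)
pdu_sizes = [len(p) for p in pdus]
```

With a limit of four varbinds, three two-varbind rows become two PDUs instead of three, and each row still succeeds or fails as a unit.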

Use notifications to relate changed configuration to one or more mgmt stations. (ability to timestamp, rollback, etc.)

Policy management:

Provide implementation-specific MIB objects if instance-specific vendor extensions have been made.

Policy mgmt apps should determine the time-keeping abilities of managed systems. E.g., could either have the mgmt app send down policy at particular times, or have devices that know what time it is turn policy on at particular time.

A notification should be sent when a policy override occurs. Overrides may be necessary, but when allowed, need to let mgr know.

Q: On making sets as large as possible: You need to know more about the device before you can know if an entire Set of varbinds will work together. So sometimes conservative policy is best.

Harrington: If you send large SET packets, you can tie up bandwidth (v1) figuring out what went wrong.

Perkins: Business logic is duplicated in the mgr app as well as agent. If I decide to change rules, I need to update in two places. It's been proposed over the past 10 years that SETS could return an app-level error, so that business logic could be done in agent. Agent might return "can't do this op because of constraints X,Y,Z".

Jon: Agent error reporting for config ops would be useful.

Perkins: Access control is very fine-grained in SNMP. It's hard to relate from a user point of view what I need to config in VACM to perform high-level operations.

Q: You say it's necessary to notify when config changes or when there's a policy override. Are they really different?

Jon: Not every element in a system is necessarily under policy control. So if something isn't under policy control, you need to send config change trap. Secondly, the policy override is meant to keep the policy mgmt system informed about things it thinks it controls being temporarily removed from its control.

Q: About throttling notifications. Would be nice to have mgrs subscribe to different sets of traps. More flexibility.

Juergen: Need better specifications in MIB docs. Procedures for app writers on proper sequence of SET ops. So really need better documentation in MIB.

Perkins: I support that, I typically advise that the table DESCRIPTION tells meta-level info (e.g. if you have this many slots, you have this many rows), ROW description gives entry-specific info.

Andy: One of the big problems with SETs is arbitrary and incomplete rows that you need to handle. (E.g., many set requests for different rows in 11 different tables, all bundled in one SET PDU). We could get rid of a lot of this complexity. BCP touched on rowStatus and how many items in a PDU. We really should address this in the SMI.

Randy: Regarding complex SET requests--the infamous "as if" simultaneous requirement in the protocol is the root cause.

Andy: About the House lighting MIB and House lighting Policy MIB...why not just add a table to the House Lighting MIB? House lighting MIB could have templates. Not clear how there's a reduction of work in having two MIBs. You still need to have perfect knowledge of the low-level MIB to do the Policy MIB.

As there was time remaining, the wg moved on to Friday agenda items, with a presentation on "Policy Processing Questions" by Steve Waldbusser:

1) An agent that gets a bad policy should do the right thing, where the right thing is TBD. A bad policy is one that has a syntax error or that creates a processing error. The right thing is terminate immediately.

2) Expression examples would be helpful (the new draft is OK in this).

3) How do we handle syntax errors?

Harrington: Need clarification: syntax errors at processing time vs load time.

Andy: Delaying the return of syntax errors goes against the SNMP approach of having the agent reply about success/failure of SETs immediately.

- Can a management station step through an expression statement to find errors? No.
- An agent can't reasonably go through every code path in a script to look for potential errors & return such at SET time.
- To the extent that these are described in BNF, the agent should be able to verify that they adhere to the language (BNF) syntax without evaluating correctness.
- Is there value in doing the check at SET time? Limited value.
- Protecting the agent from overruns is different from other types of checking of valid operations.
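
The distinction drawn above between checking grammar at SET time and catching errors at processing time can be sketched as follows. The policy language under discussion is a subset of C; this Python sketch uses Python's own parser purely as a stand-in grammar, and the names check_syntax, evaluate, if_speed, and if_count are hypothetical.

```python
import ast


def check_syntax(source: str) -> bool:
    """Load-time check: does the text conform to the grammar? Nothing is evaluated."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


def evaluate(source: str, variables: dict) -> object:
    """Processing-time evaluation; can still fail after the syntax check passed."""
    code = compile(ast.parse(source, mode="eval"), "<policy>", "eval")
    return eval(code, {"__builtins__": {}}, dict(variables))


good = "if_speed / if_count > 100"
bad = "if_speed / > 100"

syntax_ok = check_syntax(good)   # would be accepted at SET time
syntax_bad = check_syntax(bad)   # would be rejected at SET time

# The accepted expression still raises once real values are supplied.
try:
    evaluate(good, {"if_speed": 1000, "if_count": 0})
    runtime_error = None
except ZeroDivisionError as exc:
    runtime_error = type(exc).__name__
```

The grammar check is cheap and immediate, but it cannot predict the divide-by-zero, which is why run-time exception handling remains a separate question.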

4) How do we handle run time exceptions? How many errors are OK in a policy? (0?)

5) How to deal with partial failures and notifications thereof?

6) Different types of exceptions, e.g. divide-by-zero or system failures--how to handle different error severities?

Juergen: What is the relationship of this work to the disman Script and Expression MIBs?

Steve: They are different; they address different (though similar) problems.

Case: There are strong similarities between disman and this work.

Randy: Why are there scripts?

Steve: Network operators are comfortable with the use of scripts.

7) What is the role of the management application in evaluating expressions?

8) What type of error reporting from agents is required?

The meeting adjourned for further discussion on Friday.

================================================================================

IETF 48 Pittsburgh snmpconf Fri 1130 4 Aug 2000

Reported by Dale Francisco (dfrancis@cisco.com) and Steve Moulton (moulton@snmp.com).

Summary
=======
Continued with Steve Waldbusser's presentation, in the section "Policy Processing Questions". We resumed discussion at question (9). After completing this section, we moved on to "Execution Environment Questions", then "Language Related Questions".

The questions that generated the most discussion were on how to select instances in filters (Policy Processing questions 21-23 below), and whether the language and the accessor functions should be extensible (Language Related question 4 below).

We decided that the next interim meeting would be in Knoxville, TN, hosted by SNMP Research.

Attributions
============
Andy == Andy Bierman abierman@cisco.com
Bert == Bert Wijnen bwijnen@lucent.com
Case == Jeff Case case@snmp.com
Harrington == Dave Harrington dbh@enterasys.com
Joel == Joel Halpern joel@omniplex.mcquillan.com
Jon == Jon Saperia saperia@mediaone.net
Partain == Dave Partain David.Partain@ericsson.com
Randy == Randy Presuhn Randy_Presuhn@bmc.com
Steve == Steve Waldbusser waldbusser@nextbeacon.com

Policy Processing Questions
===========================

9) Is it a requirement that we support UTF8 in our accessor functions?

Randy Presuhn will work with Steve on this.

10) Do we want to feed values from filters into policy action? E.g., suppose there's filtering on interfaces, and say I/Fs 1 and 2 fail, but I/F 3 succeeds...should there be any state variables that linger from the earlier, failed processing? And after we get a return value do we want to fire an action, and if so with a param other than true/false?

Andy: Suppose I go through RMON control table where ifIndex is a columnar value, not an index...how would I pass this into an action?

Steve: The same way the filter got it.

11) Is the constraint of left and right useful (cleavage into two scripts, filter and action)?

Jon: One of the compelling reasons for keeping them separate is it may be handy to do the eval on left side at a different time than eval on right side, especially if one or the other is computationally complex. Left decides what policy applies to.

Steve: Another advantage is code reuse. E.g., imagine an icon for "select all enet I/Fs", drag to left, "turn off all I/Fs", drag to right. Also, Policy group supports this split.

12) Do we need both action and filter or would one do? What are the interactions allowed?

This just restates questions 10 and 11. There was no further discussion on this.

13) Need to address the question of the syntax of an identifier, what is the proper syntax of ifType.$1--what is the structure including wildcarding?

Steve: Does anyone want a different syntax? I'd like to prove what we're doing is inefficient before changing it.

Joel: How do we handle wildcarding on multiple indices? Will they each get a wildcard?

Steve: They should, the design isn't finished yet.

14) What are implications of wildcarding and implications in both directions in the role tables?

Steve: I can't remember what this one meant.

Joel: Larger issue: We're trying to build a solution, but what if there are base attributes that are used by both alternatives [snmpconf and rap]. "Role" is particularly problematic. I would like to have one table where roles are dealt with, i.e., not different than what rap group is doing.

Steve: In an offline activity, we're working with rap to either resolve the identity or come up with different names.

Joel: It'd be nice if it were easy to factor out the simple cases of role use.

Steve: What we called assertions...as we go through usage examples, we may find it'll be difficult for a manager to look at a filter and know whether it should be downloaded to a system...but you do know what roles and capabilities are on a system. "Assertion" is a term for pulling some terms of a filter out that would allow a mgmt station to know it didn't need to download a particular policy.

Case: Wildcarding has been used in two places. One associated with roles, one with identifiers. Item #14 really isn't about roles, it's about identifiers. With roles: How do you do roles and role combinations? #14 is suppose you had a table that's not the ifTable that has two indexes, and you want to wildcard over one and hold the other constant. Another person asked what is the balance between complexity and functionality (is wildcarding too expensive?).

Jon: I'm keeping a list of side topics such as assertions and roles for Saturday afternoon.

15) How big should an expression be allowed to be?

Steve: No larger than necessary [laughter]. Most filters will be 1 or 2 or 3 lines, so I picked something very high (65k).

Jon: I think that's fine for now. We can express local limitations using the CAPS file if desired. So different vendors with different memory constraints can restrict as necessary.

16) Not possible to download all policies everywhere. Must subset what to send, so must handle two errors: download superfluous policy, fail to download needed policy.

Steve: Two cases: (1) False positive (superfluous download)...the filter never executes on that system. (2) False negative...this would have operational impact (didn't download something that was needed...if it had been, it would have executed). Assertions will make it easier to decide what to download, but I haven't heard an idea of how to prevent false negatives.

Case: This is more something that should be observed than something we need to worry about. Just need to alert the readers. This really isn't an open issue.

Randy: I agree with what Jeff just said, also feel that we need to be careful about saying whether or not it's a problem. If we try to determine whether a filter could ever be true, it gets messy, possibly even not computable with a sufficiently powerful filter language.

17) How do we detect and how to standardize error handling?

Steve: No text yet on notifications of errors (not necessarily SNMP notifications, but even just error counting variables)...will investigate further.

Randy: Should look at history of script and expression MIBs...this is a deceptively devilish problem. This has some things in common with both of them.

18) Estimate the # of policies we need to handle. We need the architecture to scale up and down--what attribs needed to meet the goals?

Jon: Start collecting usage scenarios, need a base level understanding of how many expressions are out there, how large they'd be, accessor functions. Anyone who has experience in this area please get in touch with Steve...any volunteers? [No response.]

19) Where to start the policy evaluation?

Steve: That is, is the role the first thing to evaluate in order to reduce computation? (Move role comparison to front of filter to short-circuit evaluation). Should this be in the form of a recommendation? Role table and element table help bound the problem.
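
The short-circuit idea can be illustrated with a small Python sketch: putting the cheap role comparison first means the costlier test runs only for elements that already carry the role. The element layout, has_role, expensive_check, and the 0.8 threshold are all assumptions made for this illustration.

```python
calls = {"expensive": 0}


def has_role(element: dict, role: str) -> bool:
    """Cheap test: a set membership check."""
    return role in element["roles"]


def expensive_check(element: dict) -> bool:
    """Stand-in for a computationally costly filter term; counts its invocations."""
    calls["expensive"] += 1
    return element["utilization"] > 0.8


elements = [
    {"name": "if1", "roles": {"edge"}, "utilization": 0.9},
    {"name": "if2", "roles": {"core"}, "utilization": 0.95},
    {"name": "if3", "roles": {"core"}, "utilization": 0.1},
]

# Role comparison first: "and" stops evaluating as soon as has_role() is False.
selected = [e["name"] for e in elements if has_role(e, "edge") and expensive_check(e)]
expensive_calls = calls["expensive"]
```

Here the costly term is evaluated once rather than three times; with the operands reversed it would run for every element.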

Jon: Even as we get more examples, we may want to modify Rules for Filter writers. What is the general sequence of events that a policy mgr takes to get a policy properly executing..."This is how the system works start to finish."

Case: All of these questions are subitems of "What does the execution environment look like?". 19-24 are all related.

20) How often to eval filter and when?

Steve: Don't want to leave this completely up to agent. Need to specify on a policy-by-policy basis. Straw proposal: 2 new objects, filterMaxLatency, actionMaxLatency. Not the time between when a policy executes on I/F 7 and I/F 8, but on the same I/F. Agent is free to execute filters and actions in any order.

Harrington: On ordering, if I write an action that has Sets, is that not guaranteed order?

Steve: No, within an action, order of execution is assured.

Harrington: Just for sanity's sake, should we add language saying that agent is ultimate arbiter of timing?

Randy: Language that can be stolen from Expression MIB.

21) Is the iterator inside or outside the filter statement?

Steve: E.g., should the expression have a for-loop that would iterate through interfaces? I think it's a bad idea. How to find instances quickly is something that the execution environment should provide.

Harrington: I think it may be clearer for people to be allowed to write iterators.

Steve: It may take some mindset change to think in terms of policy rather than iterators.

22) Is one iterator sufficient or do we need nested iterators?

[same as 21].

23) How do we select instances?

Steve: Execution environment hands elements to filter. That's how we know what "this" element is.

Jon: I think we'd had discussions where it was the method that was used to select from all possible objects...

Steve: Right, this was the element type registration table. A way for mgmt station to download to agent what elements it wants under policy control. Agent doesn't know (out of the box) whether policies are executed on I/Fs, circuits, whatevers. This table provides the OID of say, right up to ifIndex. The agent conceptually does a walk of that OID, comes up with a hit for every I/F instance.

Jon: One thing that isn't covered is shortlived instances.

Steve: This table is just a way to describe to the system things it doesn't understand. It would be difficult to create a standard set of element types. Shouldn't confuse conceptual algorithm with how agent implements it. Text allows for builtin element types, so e.g. for ifIndex, it already knows how to handle objects in this table. I.e., not a polling arch.

Randy: Just knowing the base OID doesn't give you the info to get to indices...they're not self-describing.

Steve: This system treats the OIDs opaquely. Treats suffixes opaquely as well, whether one subid or many. Suppose FR circuit table, 1.111, 1.112, 5.317 (ifIndex, DLCI). $1==1, $2==111.
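
The conceptual walk and the $1/$2 bindings described above can be sketched in Python as follows. OIDs are handled as opaque tuples of integers; BASE is a placeholder value, not a real registration, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Oid = Tuple[int, ...]


def instance_suffixes(base: Oid, instance_oids: Iterable[Oid]) -> List[Oid]:
    """Conceptual walk: keep OIDs under base and return the sub-ids after it."""
    n = len(base)
    return [oid[n:] for oid in instance_oids if len(oid) > n and oid[:n] == base]


def bind_positional(suffix: Oid) -> Dict[str, int]:
    """Treat the suffix opaquely: $1 is the first sub-id, $2 the second, and so on."""
    return {f"${i}": subid for i, subid in enumerate(suffix, start=1)}


BASE: Oid = (1, 2, 3)  # placeholder for a column of an FR circuit table

agent_oids = [
    BASE + (1, 111),   # ifIndex 1, DLCI 111
    BASE + (1, 112),
    BASE + (5, 317),
    (1, 2, 4, 9),      # unrelated object, not under BASE
]

suffixes = instance_suffixes(BASE, agent_oids)
first = bind_positional(suffixes[0])
```

For the first circuit this yields $1 == 1 and $2 == 111, matching the example in the discussion; nothing in the sketch needs to know that the two sub-identifiers are an ifIndex and a DLCI.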

Randy: Isn't it then possible to derive from the filter expression itself the contents of this table?

Steve: Yes, sigh. It's not artificial intelligence, but it's close.

Randy: At expression download time, you look for the OIDs that this filter expression cares about, that gives you the element types that the table needs. If the filter expression needs to know how the indexes are composed, then the filter is required to know what the element name is. If ifOperStatus is in the filter exp, then you know ifTable is an element type.

Steve: Doesn't that make your brain hurt? We haven't discussed expressions that affect things that contain things that they work on (e.g., touching the parent ifIndex of an FR circuit).

Randy: Is the type table necessary to evaluate the filter expressions? If there's sufficient info in the filter itself, then the type table is superfluous. You need a way of specifying the iteration context per filter--if you have that, the table is superfluous. If there's always a (type==something), the table is superfluous.

Steve: I'll take an action item to see if I can get comfortable with it.

Randy: I think that since all your expressions have a "type==something", that says something about the structure of the filters.

Steve: Autoregistering element types by inspection of policy filters (especially if there's a mandatory "type==something").
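
A minimal sketch of what autoregistration by inspection might look like, in Python. It assumes every filter contains a literal "type == name" test; the filter strings, the element type names, and the regular expression are hypothetical and do not reflect the syntax in the draft.

```python
import re
from typing import Iterable, Set

# Matches "type == someName" with optional whitespace around the operator.
TYPE_TEST = re.compile(r"\btype\s*==\s*([A-Za-z_][A-Za-z0-9_]*)")


def element_types_from_filters(filters: Iterable[str]) -> Set[str]:
    """Collect the element types named by a mandatory type test in each filter."""
    found: Set[str] = set()
    for text in filters:
        found.update(TYPE_TEST.findall(text))
    return found


filters = [
    "type == ifEntry && ifOperStatus.$1 == 1",
    "type==frCircuitEntry && $2 > 100",
]

registered = element_types_from_filters(filters)
```

If the type test is mandatory, the set of element types can be derived at download time and a separately managed registration table adds no information.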

24) What do we want for role--iterators...how many, where?

[No comments.]

Execution Environment Questions
===============================

1) What are the implementation target environment requirements for the work?

Steve: E.g., what's the size of 'int'? Don't want people to have to check ints with sizeof() every time they use them. Still not entirely defined...please send notes to mailing list when you find things like this.

2) What are minimal requirements for systems that participate in policy configuration?

Steve: E.g., date and time?

Jon: I wanted to ask...I think we want elements to execute on internal schedules, but do we want to allow a model of operation where the managed device says "I don't have date and time, so you have to do it for me"?

Randy: I think it's a deployment decision on whether time is local or distributed.

Jon: Architecturally we shouldn't add a constraint.

3) Would like a clearer definition of what "this element" is.

[No comments.]

4) What's the scheduling environment?

[No comments.]

Language Related Questions
==========================
1) Language for expressions: Do we want to use an existing one? Do we want to create a new one?

Steve: We've been given a directive that we're not allowed to define a new language. The IESG will look askance if we do so. So instead, let's subset an existing language. There's a BNF in the doc that shows a subset of ANSI C. The BNF doesn't fully describe the syntax. Normative reference to ANSI C. Would subset of C or subset of PERL be better? What we have is such a restricted subset of C that I think it's also PERL [laughter]. There is no normative ref for PERL, and the PERL5 to PERL6 transition is a huge non-backward-compatible change, so I think C really is the right choice.

Case: Primary difference between having two identical grammars (subset of C / subset of PERL) is marketing. You're concerned that it's true, and we're talking about marketing.

Joel: I think we need this level of complexity, but I think we need to spell out exactly why we need this level of richness.

Steve: I agree. We'll come up with a list of examples, learn and realize what's only achievable through this language, then write down rationale.

Randy: There's a rather PERL-like script language that BMC has used for years that has a normative ref in an ISO standard. SMSL is the name...one of the languages for ISO command sequencer stuff. May or may not be useful, but thought I should mention it.

2) Temp variables? Garbage collection?

[No comments]

3) Is UTF8 required?

Randy: Anything outside the 7-bit range gets entered as multiple "\" escaped sequences if we're using subset of C.

Case: We have to be sensitive to internationalization issues, it has to be UTF8, other schemes wouldn't be good. If we decided to support C except for UTF8, I don't think the IESG would shoot us down. So let's just do good engineering and it'll work out.

Joel: IESG mandate is that we shall internationalize. We need to know how to put UTF8 strings in expressions, but _not_ in variable names.

Randy: there are languages that allow UTF8 variable names, and in some dev environments that's useful.

Steve: Randy and I already have action item to fix this.

4) Extensibility of language and accessor functions without revisiting the standard? Current doc says no subsetting, no extensions.

Randy: I agree with that for the language. In the library it gets trickier. We dealt with that in the Script MIB (extending accessor functions).

Steve: Regardless of what we say, vendors will extend accessor functions.

Case: You want consistency across the system. But there are two issues here: product differentiation, and spec evolution. We should be cautious in terms of blocking spec evolution. One way to allow evolution is a standardized system for extension. So as we find useful improvements, we can upgrade easily. I think it's also true that vendors will make non-standard extensions, but I think we can contain the damage by specifying how such extensions are made. It might be as simple as having an app level err code that is returned to indicate you didn't have a particular accessor function available. I don't think we should say "no extensibility".

Randy: You want to be able to interrogate a system to discover the language level supported and accessor functions supported, so you don't have to wait for a runtime error to find out that a policy wasn't the right version for a device.

Steve: Do we need to do anything explicitly?

Partain: There are going to be extensions. We should make it possible to do in a standards based way.

Andy: I think the draft currently says only the specified accessor functions are allowed...that's a mistake...I think there needs to be an accessor registry.

Steve: The high level goal is to manage a network as a whole, not a device. The network will include devices from many vendors. If we have subsetting and extensibility, it gets hard to manage.

Andy: Capabilities grow over time, and we can't ignore it. If I'm managing a class of devices and I know it has the extension, then I want to use it. A CAPS table, read-only.

Steve: I don't want to allow subsetting of this very minimal set of accessors.

Case: Unlikely we'll resolve this today. Two separate issues: Language extensibility, accessor extensibility (Andy was just referring to accessor extensibility). I'm uncomfortable with waiting until the first extension comes along to figure out how to allow for extensibility. Andy is correct that the difference between extensions and subsets...they're really the same problem from an interoperability point of view. We need to design something that will evolve gracefully over time.

Bert: Are we struggling over whether we need the absolutely minimal set of accessors? I think we need a minimal subset of accessors available, even if there's a registry.

[end of discussion]

SNMPCONF Interim Working Group Meeting Summary
Pittsburgh, PA, USA
August 4th and 5th, 2000

This set of notes is for the part of the meeting that took place on August 4, 2000.

The Configuration Management with SNMP (SNMPCONF) working group met during an interim session on Friday and Saturday, August 4 and 5, 2000. The purpose of this meeting was to continue making forward progress while parties were present for the 48th IETF.

The note takers for this session were Shawn Routhier and Steve Moulton.

The first topic for discussion was which language might be used to express policy to the SNMPCONF policy. A subset of C that might be used has been laid out in draft-ietf-snmpconf-pm-02.txt. This discussion was moderated by Steve Waldbusser, and is a continuation of the discussion started in the second meeting of the SNMPCONF Working Group during the 48th IETF.

Item 4: Language Extensibility.

The first question on the floor is whether or not we should state that we permit vendor-specific language subsetting. In particular:

Issue 1: Do we allow vendor specific subsetting of the language

There is general agreement that as a working group we can't stop producers from behaving in various fashions, but we can specify what is compliant with the specification and what isn't. There was some discussion about what level of extensibility we can and should provide. The options included trying to limit extensibility, allowing it but frowning on it, attempting to provide for it directly, and allowing unknown extensions to be handled gracefully. There were the usual interoperability concerns and the concern that limiting growth of the language will limit its deployment.

Specifically on the issue of subsetting the language, the statement was made that not allowing subsetting may drive us to reduce the number of functions included in the specification in order to get as wide deployment as possible of the required functionality set. The observation was also made that we can expand the requirements of the language in a second version.

On putting the question, there was general agreement that we should not permit vendor specific subsetting of the language in the first version.

Issue 2: Do we allow vendor specific subsetting of the accessor functions?

The first concern had to do with systems that do not have the concept of time. The sense of the room from this morning's session is that we will not mandate time of day clocks.

Is the current set of accessor functions sufficient? It does not allow many possible options such as getting an entire row. It is generally agreed that the current set is not sufficient and more will be required.

But how should they be presented? As one example, if we include accessor functions for SNMPv3, then what should happen if the agent doesn't support v3? In such a case we might want to have something similar to libraries such that different sets can be included.

For the first revision we may want to keep the list short and simple and add new ones after we gain some experience with the scheme. Some of the experience can be gathered from other similar areas.

But if the current list is not the final one then attempting to discuss this issue may be premature. Until we have a reasonable list we can't be specific.

One probably will not have a good handle on which accessor functions are needed until one has built a set of applications. There are expression evaluators that have been in the industry for a few years, from which we can draw lessons.

. Can we have a statement of direction that we should attempt to define the minimum mandatory set such that everybody can implement it?

The suggestion was made that we have containers of functions, the equivalent of libraries, and that the minimum compliance object would be those groups. So some of the libraries would be conditionally required but we would not end up with a constantly growing list. The libraries might be viewed as putting some hierarchy into the naming: instead of one flat name space, we would have a two-level name space.

Steve Waldbusser proposed a set of classes of things that might not be implemented on a given system:

. Can't do (turnOffPower())
. Too Complex (code size/cpu speed limitations)
. Convenience Functions: too many functions specified

One proposal was that, rather than follow this approach, we specify a minimal set of functions so that no one will have a good excuse not to implement the entire set of functions in version one of this language.

Another proposal is that we have two classes of accessor functions, one which must be implemented, and one that should, but that without which one may still claim compliance. Functions to do RMON-specific things are one such example quoted.

One participant expressed a wish to have date and time functions split off into a separate group for compliance purposes. Another suggested that there be three classes of accessor functions:

. mandatory
. MIB-related (i.e. RMON support)
. device-related (power on/off, date and time).

Furthermore we need a way for the managed element to tell a manager what subsets of functions a device supports.

When put to the group, there was general agreement that we need groups of accessor functions for compliance purposes. Furthermore there was general agreement that we were going to be very rigorous in adding functions to the various groups.

Discussion then shifted to how one indicates whether a particular group is supported in an agent. One proposal was to specify using a bit string object, the other using a table of supported groups (an agent capabilities table). The first proposal would require less agent resources, the second may be easier to implement on run-time extensible agents.
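
As an illustration (a sketch, not something shown at the meeting), the two advertising proposals can be compared side by side. The group names, bit positions, and OID values below are hypothetical and are not taken from any draft; 99999 is a placeholder enterprise number.

```python
# Hypothetical accessor groups: names, bit positions and OIDs are assumptions.
GROUP_BITS = {"core": 0, "dateTime": 1, "rmon": 2}
GROUP_OIDS = {
    "core": "1.3.6.1.4.1.99999.1.1",
    "dateTime": "1.3.6.1.4.1.99999.1.2",
    "rmon": "1.3.6.1.4.1.99999.1.3",
}


def encode_bits(supported):
    """Proposal 1: pack the supported groups into one bit-string value."""
    value = 0
    for name in supported:
        value |= 1 << GROUP_BITS[name]
    return value


def bits_supports(value, name):
    """Test a single group in the bit-string form."""
    return bool(value & (1 << GROUP_BITS[name]))


def capabilities_table(supported):
    """Proposal 2: one table row per supported group, keyed by group OID."""
    return {GROUP_OIDS[name] for name in supported}


agent_groups = ["core", "rmon"]
bits = encode_bits(agent_groups)
table = capabilities_table(agent_groups)
```

The bit string is a single small object, but its bit assignments are fixed when the object is defined. The table costs one row per group, but an agent that is extended at run time can add rows for groups nobody anticipated.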

A concern was raised that we are departing from the policy model, where we send a policy to all boxes and have the boxes decide for themselves how to implement that policy. This would require that the policy-issuing entity know capabilities of each device. One way to handle this would be to bind accessor groups to MIBs being accessed by a particular policy.

A smaller team was chartered to discuss this outside of the group and report tomorrow. The group consisted of Jeff Case, Steve Waldbusser, David Harrington and Walter Weiss.

[break]

Jon Saperia asked the subgroup to study issues of collapsing capabilities table information also. What is the best way to advertise availability of accessor libraries?

The next set of language and accessor issues:

Issue 3: Do we allow vendor extension of the language
Issue 4: Do we allow vendor extension of the accessor functions
Issue 5: Do we recognize vendor extensions of the accessor functions

Walter Weiss made the point that the premise of policy is that we are trying to work across devices to make management a simpler world. He suggests no to issue 3 (c) and yes to issue 4 (d). Bert Wijnen made the opposing point that this working group was formed because of concern from the SNMP community about two management protocols. This group was tasked to do both policy-based management and configuration management. He was concerned that in the interest of simplicity and adherence to the policy model we may remove too much functionality for configuration via SNMP.

Jon Saperia made the point that we need to differentiate between policy based configuration (across systems) and configuration of policy (diffserv). It is also true that we need to know a bunch of capabilities of the machine before we send policy to it. This is very valuable configuration management. In the charter we have two goals. I don't think they conflict with each other, so we need to do both. We haven't excluded one or the other at this point.

After a considerable discussion the question was called on issue 3 (c), and there was fairly strong consensus that we did not want to allow vendor extension of the language. There was general agreement that we do want to allow vendor extension of the accessor functions (issue 4).

There were three different proposals for determining the presence of vendor extensions. One was to have an accessorExists(accessorFunction) function, another to have a BITS construct for enumerating what functionality is present, and the third was to use the capabilities mechanism already familiar. In response to a concern about name space conflicts, a suggestion was made to use part of the company's DNS name as part of the function name to guarantee uniqueness (as is done in Java). The observation was also made that if we can do this within the SMI, we already know how to deal with it.

Issue 5 is being deferred to the smaller group.

Issue 6: How do we handle versioning issues, is being sent to the small group.

Issue 7: Changes in OID issue is also being sent to the small group.

Issue 8: How do we specify attributes of an element? Can there be multiple instances? As discussed this morning [ IETF48 SNMPCONF Working Group Meeting -ed], multiple instances can be specified as name.$1.$2.$3.
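
As an illustration of the name.$1.$2.$3 notation (a sketch, not something shown at the meeting), the code below expands the $n placeholders from the index components of the element being evaluated. The helper name expand_instance and the reading of $n as the n-th index component are assumptions made for the example.

```python
import re


def expand_instance(template, index):
    """Replace each $n in template with the n-th (1-based) index component."""
    def substitute(match):
        position = int(match.group(1))
        return str(index[position - 1])
    return re.sub(r"\$(\d+)", substitute, template)


# A singly indexed object and a triply indexed one.
expanded = expand_instance("ifAdminStatus.$1", [7])
multi = expand_instance("name.$1.$2.$3", [10, 0, 2])
```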

Issue 9: deferred (not available in notetaker's notes)

Issue 10: How big can an expression be? An arbitrary limit of 64K was mentioned; there is no real limit. This was discussed in the morning session. The point was made that we need to pay special attention to buffer overflow problems since this is a script issue. Good programming practice should be adequate.

Issue 11: Temporary Variables: do we want them? How do we dispose of them?

There was strong sentiment that temporary variables are needed. Questions were raised about the scope and lifetime of temporary variables. Possibilities such as the use of a global scratch pad and emulating MIB variables were raised. The use of temporary variables for lag variables and trending was discussed, as well as for generally saving state in a loop. It is necessary that any global variables created have some indexing to prevent access anomalies across different scripts. The requirement that each policy has to have a local temporary variable space was mentioned.
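
As an illustration (a sketch, not something shown at the meeting), one way to get a per-policy temporary variable space is to key every variable by the index of the policy that owns it. The class and method names are hypothetical; the example uses a lag variable of the kind mentioned for trending.

```python
class ScratchPad:
    """Temporary variables for policy scripts, indexed by owning policy."""

    def __init__(self):
        self._values = {}  # (policy_index, name) -> value

    def set(self, policy_index, name, value):
        self._values[(policy_index, name)] = value

    def get(self, policy_index, name, default=None):
        return self._values.get((policy_index, name), default)

    def release(self, policy_index):
        """Dispose of every variable owned by one policy."""
        for key in [k for k in self._values if k[0] == policy_index]:
            del self._values[key]


pad = ScratchPad()
pad.set(1, "lastOctets", 1000)   # lag variable kept from the previous run
pad.set(2, "lastOctets", 5)      # same name, different policy: no clash
delta = 1500 - pad.get(1, "lastOctets")
pad.release(1)                   # policy 1 goes away, its variables with it
```

Indexing by policy answers the disposal question in the simplest way: variables live exactly as long as the policy that created them. Whether that is the right lifetime was left open in the discussion.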

The issue of short-circuiting loop execution arose. Once again the discussion arose whether we are taking the policy view (all devices) or configuration view (one or more devices).

It was suggested that this topic be taken away and discussed further tomorrow.

Jon Saperia reviewed the working assumptions again:

1) We cannot invent a new language

2) We want to use something that is as familiar as possible to as many people as possible. We settled on Java, Perl, and C as possibilities. What is currently in the document is a subset of the intersection of C and PERL.

3) We are not saying that different languages can be used in different evaluations.
We are discussing the concept of a break statement in a loop construct, the storage of temporary variables, and the ability to preserve information across executions.
We need to have a side discussion on this issue.

[break]

Issue 12: Debug-able execution environment. Issues include
. Memory leak detection
. ability to find and fix bugs
. find looping and iterators

The suggestion was made that we may need to have watchdog timers to break out of iterations, or perhaps to set a limit on the number of times to iterate. If using a timer scheme, one needs to be able to match timeout information to restart information.
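
As an illustration (a sketch, not something shown at the meeting), an iteration cap and a watchdog timer can be combined so that whichever limit is hit first stops the loop and hands back a restart position. The function name and the default limits are assumptions.

```python
import time


def run_bounded(elements, action, start=0, max_iterations=1000, max_seconds=5.0):
    """Apply action to elements[start:], stopping at the first limit reached.

    Returns the position of the first unprocessed element (the restart
    point), or None when every element has been processed.
    """
    deadline = time.monotonic() + max_seconds
    done = 0
    for position in range(start, len(elements)):
        if done >= max_iterations or time.monotonic() >= deadline:
            return position
        action(elements[position])
        done += 1
    return None


seen = []
items = list(range(10))
restart = run_bounded(items, seen.append, max_iterations=4)
finished = run_bounded(items, seen.append, start=restart, max_iterations=100)
```

Returning the restart position is one way to match timeout information to restart information: the caller knows exactly where the interrupted pass left off.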

The ability to trace execution through a script is very important. One might use a log table to log output from the script for debugging purposes. Implementing a trace function as an accessor function was suggested.

Issue 13: Do we permit comments? We need to explicitly say so.

There was a general feeling in the room that comments should not disappear from scripts. Gets on the script should retrieve the same contents that sets put with respect to comments.

A more general issue about string length arose at this point. There was concern about string length being limited by PDU size. The question was put as to whether we should modify the policy table and add a new table to allow for longer strings. There was strong agreement for this change.

David Harrington mentioned that some script interpreter code was in use at Enterasys, and that he might be able to contribute this interpreter to the community at large. The possibility of using bc (the unix binary calculator) was also mentioned, as was PHP (from the Apache world).

Script re-usability was also discussed at this point; specifically whether a script can refer to another script already in the device. Currently this is not a capability of the MIB. The following MIBlet was put on screen

PmPolicyEntry ::= SEQUENCE {
pmPolicyIndex Integer32,
pmPolicyFilter OCTET STRING,
pmPolicyAction OCTET STRING
...
}

The suggestion was made to replace the pmPolicyFilter with a script MIB entry. The opposing view was that we should not pull in the script MIB, but add multiple lines of text to the pmPolicyTable.

The suggestion was made to change the pmPolicyFilter and the pmPolicyAction octet strings into integers. The clear sense of the room was to split out the pmPolicyTable now, and replace these objects with integer indices.
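
As an illustration of the split (a sketch, not something shown at the meeting), script text can live in a separate table as numbered segments, with the policy row holding only integer indices. The segment size, the code table layout, and the script text itself are made up for the example; only the pmPolicy* names come from the MIBlet above.

```python
SEGMENT_SIZE = 16  # deliberately tiny here; a real limit would follow PDU size


def store_script(code_table, code_index, text, segment_size=SEGMENT_SIZE):
    """Split text into rows keyed by (code_index, segment number)."""
    offsets = range(0, len(text), segment_size)
    for segment, offset in enumerate(offsets, start=1):
        code_table[(code_index, segment)] = text[offset:offset + segment_size]


def load_script(code_table, code_index):
    """Reassemble a script from its segments in segment order."""
    keys = sorted(key for key in code_table if key[0] == code_index)
    return "".join(code_table[key] for key in keys)


filter_text = "/* core ports */ type == 6 && speed > 1000000"
action_text = "/* hypothetical action */ enablePort()"

code_table = {}
store_script(code_table, 1, filter_text)
store_script(code_table, 2, action_text)

# The policy row now carries integer indices instead of octet strings.
policy_row = {"pmPolicyIndex": 1, "pmPolicyFilter": 1, "pmPolicyAction": 2}
```

Because the text is stored and returned verbatim, comments survive the round trip, and two policies can point at the same code index to reuse a script.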

At this point, a brief review of issues for tomorrow's meeting was made.

---------------------------------------------------------------------
SNMPCONF Interim Working Group Meeting Summary
Pittsburgh, PA, USA
August 4th and 5th, 2000

The note takers for this session were Shawn Routhier and Steve Moulton.

SNMPCONF 8/5/00 9 AM

Jon Saperia was the first moderator for the day. He started the meeting stating that he would like to pick a few issues that are solvable so that we have some progress by the end of the day.

Joel Halpern has agreed to lead a discussion on what we mean by policy. (he is now Policy Framework WG Co-Chair)

Lists of items that we may tackle today (not in any order):

Review of Policy Override (90 mins)
Capabilities table in BCP (scratch)
Schedule and Time (45 minutes)
testing and debug (deferred)
what about diagnostic information - what might be required for larger scale scripts
fate sharing and groups - do groups have special meaning in eval; possible use of groups with precedence; groups could be used to collect all members of one class (all QoS) or in order to pick a member of a class (best security option from the list) (90 mins)
diffserv policy module (5 mins)

planning next steps (30 mins)
discussion of policy examples (45 mins)

David Partain asked from the floor if there was any chance of getting closure on the language issue. He would like to take what is there and run with it, or see if we are going to spin our wheels on it a little bit more. This affects many other issues. It would be nice to have some resolution.

The comment was made that some of the difficulty of yesterday is we were trying too hard to reach closure on everything in the absence of examples.

Jon Saperia stated that there does not seem to be a consensus to add language to today's topics.

There was a discussion that resulted in the table above.

Andy Bierman requested that we discuss putting mechanisms in the MIB to help debug when configuring arbitrarily large systems. He does not want to schedule time at this point though.

Topic 1: Discussion of Policy Examples. Led by Joel Halpern.

Joel: At this point I am trying to collect suggestions as to what policy is. What kinds of problems are we trying to solve?

Many dimensions were brought up, including

. Creating a persistent change (persistent across reboot) versus a temporary change

. Cause a change on all entities running a particular application

. Selection based on vendor, model, software/firmware versions, installed capabilities.

. Selection based on resource availability, disk usage, cpu usage, forwarding capability.

. Selection based on utilization

. Selection based on role.

A specific case was mentioned, where one has a configuration that one wants to apply to N different systems, but each is slightly different. Sometimes the selection substitutions are easy and sometimes they are hard. An example: IP address/netmask substitution is easier than distributing a configuration that is thresholding on a temporary storage area on a variety of different systems. This is quite different on an NT vs. a unix system, or a Cisco vs. a Nortel device. We need to hide the differences from an operator who might not be completely aware of the implications.

Subsequent discussion brought up many issues. Many vendors currently have proprietary solutions for doing these types of things, where we are trying to evolve an interoperable multi-vendor environment. It is normal to provision for what you care about, and services/customers that don't receive any preference get the leftovers. The opposite approach is to limit by service (say let video have as a maximum 30% of the bandwidth), but many will want to do this. It was mentioned that this approach may have multiple ways of being provided and that stating the policy goal in this way may lead to an under-constrained solution.

Other possible requirements were brought up:

. The ability to specify roles to the system about how to react over time to configuration or state changes.

. For security reasons, the ability to disable all ports that are not explicitly enabled.

. Avoid multiple scripts for minor device variations.

. Enable coherence of configuration across devices.

Systems such as diffserv or QoS are distributed by nature and for them to work you need to have global coherence - while each device may be slightly different or use a slightly different style it should be pointing in the correct direction.

. Well constrained programming environment in devices for configurable behavior.

. Be able to change state of running activities or processes.

. Minimize the number of distinct OS versions: either fix or report. Not every box runs the same set of features, and there is version skew/quality skew/feature skew based on device.

One can't really have only one version, even in things like routers. One may have reasons for running different versions, such as some features working better in one version or another. The script may need to take notice of these differences. You may also want to run an older version for internal reasons, so the script may need to down-rev new entities. Scripts should be able to notice the mismatch and report it somewhere, possibly causing a download itself or possibly leaving that for some other program or a human.

. Find and repair misfits.

. Be able to drop all traffic of a given nature (say Microsoft Workgroup) on interfaces facing the internet.

. Apply "standard" route filtering at all "appropriate" places (possibly trap on exceptions, rather than filter them).

. Set VPN information based on a user "login".

. Create "access ports" with just public internet access (no intranet).

. Change connectivity of a port or resource for a time based on payment, and be able to cancel such connectivity either at end of time period or end of usage ("I'm done, now").

Suppose you are in a hotel that has fee-based ethernet access. You may want to restrict access to the site where you pay the fee. You want to provision a particular to be available until noon the next day. You can have a cancel capability.

One needs to configure to the mission, and then abstract to SNMP operations.

. Hiding complexity.

. Abstract access to type-specific and complex instance information.

. Specify roles to the system how to react over time to configuration or state changes -- decompose the configuration.

A warning about the diffserv arena: There is seldom a crossing between "user wants service level" and "service level x means this and that".

. Disable a BGP operation over a scheduled period.

. Enable coherence of configuration across devices. They cannot be consistent, but must be coherent.

. Use subnet as the parameter, for both values and conditions. May be several approaches.

No one objected that anything was out of scope. We need to encourage people to add more. This list is not in any order of priority. Nor is the list a working list for this working group.

[break]

Topic 2: Schedule and Time. Presentation by Thippanna Hongal (hongal@riverstonenet.com)

The slide presentation touched on several topics, such as RFC2445 (Internet Calendaring and Scheduling Core Object Specification), local time objects and explicit dates. Several possible methods for dealing with time issues (accessor functions, duration, predefined scalar variables, explicit start and end date, and using RFC2591 (schedule MIB) with duration extensions) were discussed. The slides are available.

The presentation was well received, with the following comments made:

. One needs to have a software flipflop. If something is not done, then do it (so that you have idempotency).

. This can be made a little richer by adding semantics to the accessor functions so that duration is automatically handled.

. Change the single letter to H for Thursday, or use digits for days. (the example in question used MTWTF).

. The accessor function needs to be richer to allow for some sort of between operation (between 9 and 5 rather than a trigger happening at 9 and 5).

. When using the schedule MIB, you would need to create a MIB entry that would point back to the entry that would be started. Actually you would need two entries in the schedule MIB one for starting (setting the admin status to enable) and one for stopping (setting the admin status to disable). We may be able to add an augments clause to the schedule MIB to pack the information into one entry.

. One needs to make sure that the forward and backward pointers stay in sync. To do this, there should be a one-to-one pairing. Having two schedule entries pointing to one ifAdminStatus would be difficult to maintain.

. If we use the schedule MIB, we need to bear in mind that it has up to one minute latency after one presses the go button.

. One could group things into a policy filter result. You could make it so that the expressions are evaluated synchronously based on a max latency timer. You can make it so that when the result column is retrieved it causes an evaluation.

. One would like a policy that says if an action follows in a certain time frame, page someone, otherwise turn on a blue light in the machine room.

. If you have both a trigger column and a pointer to time stuff, things will still work if a schedule tool on a network management station or elsewhere can set the object on a system without a concept of time.

The general idea is that you have a trigger button that when twiddled causes an edge-like action. You also have a time base set of arguments that cause it to happen based on time, and you have a coupling that allows the time based stuff to diddle the trigger. (synchronous and asynchronous).

Another way to do that would be to have that trigger something in the event MIB from DISMAN to do the discrimination.

In summary, we are talking about augmenting the schedule MIB so that we have the start and stop times rather than having multiple entries. This is a proposed change.
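
As an illustration of two of the comments above (a sketch, not something shown at the meeting), a "between" test can replace separate triggers at 9 and 5, and the action can be written as a flip-flop that only acts when the current state differs from the wanted one. The function names and the state dictionary are hypothetical.

```python
from datetime import time as clock_time


def between(now, start, end):
    """True while now lies in [start, end); handles windows spanning midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def apply_window(state, now, start, end):
    """Idempotent flip-flop: change state only if it differs from the target.

    Returns True when a change was made, False when nothing needed doing.
    """
    target = "enabled" if between(now, start, end) else "disabled"
    changed = state["admin"] != target
    state["admin"] = target
    return changed


port = {"admin": "disabled"}
nine, five = clock_time(9, 0), clock_time(17, 0)
first = apply_window(port, clock_time(10, 30), nine, five)   # turns it on
second = apply_window(port, clock_time(11, 0), nine, five)   # already on
third = apply_window(port, clock_time(18, 0), nine, five)    # turns it off
```

Because the evaluation is idempotent, it does not matter how often or how late it runs inside the window, which also tolerates the scheduling latency noted above.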

[lunch break]

Topic 3: Report of the Accessor Function Design Team.

. Functions are designed in groups
. All functions of a group must be implemented to have the group advertised.
. We are silent about the point that you may implement less than the whole group and not advertise the group. i.e., we do not prohibit partial implementation but we do explicitly prohibit advertising a partial implementation.
. Groups are exposed via MIB objects in the capabilities table, except the mandatory core group, which is explicitly implicit (which may need exposure in the capabilities table anyway, for versioning).
. Access functions are always defined in information modules
. Groups are identified by Object ID
. Accessor functions are identified by name and are not exposed individually
. vendors are allowed to define their own groups in their own name spaces
. We have an accessor function capMatch(group)
. We do not support a functionExists(accessor function)
. Versioning is done at the group, not the accessor level (capabilities subtype)
. We may have more issues with versioning.

Following the presentation, there was a discussion about naming, name spaces, and versioning, summarized below.

. Libraries (groups) should be backwards compatible (can only add functions, cannot remove functions). The "you cannot change count or type of arguments to a function" rule was mentioned.
. Names are assumed to be unique within the standards name-space; names within a vendor name space should have a prefix attached based on some sort of vendor-specific information (enterprise number-type information) and the vendor will be responsible for ensuring the uniqueness of the function names. The prepending cannot be done via dots, as this has a meaning within C-like languages, which is a constraint we have due to our restriction to extant languages.
. Function groups can be identified by OID and invoked by name.
. The semantics of capMatch should be such that capMatch(1.0) means that 1.0 functions are available, and capMatch(2.0) means that 1.0 and 2.0 functions are available. However, since the name space and signature of the functions are unique, it can be argued that you don't need versioning.
. We have to make sure the standards never define two groups with the same name, and that no two standard groups contain functions with identical names.
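
As an illustration of the design team's rules (a sketch, not something shown at the meeting), the code below advertises a group only when every function in it is implemented, answers capMatch by group OID, and builds vendor function names with a dot-free prefix. All OIDs, group contents, and the "ent" prefix format are hypothetical; 99999 is a placeholder enterprise number.

```python
# Hypothetical group OIDs and function names.
CORE_GROUP = "1.3.6.1.4.1.99999.2.1"
TIME_GROUP = "1.3.6.1.4.1.99999.2.2"

GROUP_FUNCTIONS = {
    CORE_GROUP: {"readValue", "writeValue"},
    TIME_GROUP: {"timeOfDay", "dayOfWeek"},
}


def advertised_groups(implemented):
    """A group is advertised only if all of its functions are implemented."""
    return {
        oid for oid, functions in GROUP_FUNCTIONS.items()
        if functions <= implemented
    }


def cap_match(advertised, group_oid):
    """capMatch(group): is the whole group available on this element?"""
    return group_oid in advertised


def vendor_function_name(enterprise_number, name):
    """Prefix a vendor function name; no dots, since dots mean something in C."""
    return "ent{}_{}".format(enterprise_number, name)


# This agent implements only part of the time group, so it stays unadvertised.
implemented = {"readValue", "writeValue", "timeOfDay"}
advertised = advertised_groups(implemented)
vendor_name = vendor_function_name(99999, "turnOffPower")
```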

[End of design team report and hash]

Topic 4: Diffserv Policy MIB Report, by David Partain.

First thing we discovered is that there was some logic that was not exactly right in the -02 draft. In discussion with Joel Halpern, just like in the diffserv MIB, we want to build up a linked list of elements that are acted on. To know where the linked list ends, we have to do a linked list traversal. To fix this, we put in an "associative index". So diffservPolicy #1 means the whole list of things. We added a linked list of all of the lists.

How do you instantiate this stuff? Joel H came up with a start table where you provide the instance information that is needed (ifIndex, direction) for that policy and tell it which policy to start. We added a table indexed by (ifIndex, direction). You push a button (set an integer index to, say, 3), and this instantiates a policy.

In the diffserv working group, Joel suggested the same things, as a result of which changes were made to their MIB. This way we no longer have to take each revision of their draft and clean it up. Whether or not Joel and others can push this through the diffserv working group is an open question.

Harrie found a couple of things that were bugs. As a part of this, we realized that if the instance and policy information go into a separate table in the diffserv MIB, then it is likely that the diffserv policy MIB may no longer be needed.

This provides an interesting lesson on how to model this stuff template-wise. We now understand diffserv a lot better.

Jon Saperia raised several points:

. Have you written this up? We should have a policy section in the BCP document.
. Also, could you tell the working group if we are omitting this MIB.
. This working group has been asked by the IPSEC group for some help in writing up the MIB, etc., to configure IPSEC.

Brief Presentation on instance/object fanout, by Jeff Case.

Jeff put up a drawing showing various ways of traveling the MIBs to get to instance information. There were three basic routes:

1 NMS -> diffserv MIB
2 NMS -> Policy MIB -> diffserv MIB
3 NMS -> Policy MIB -> QOS Policy MIB -> diffserv MIB

The Policy MIB gives you fanout on the number of instances. The QoS MIB gives you fanout on the number of objects. Going through both, you get m*n fanout. Two points of interest:

. the first fanout (Policy MIB) is more interesting as you may get a much larger fanout (for example 1000s to several).
. if the accessor functions are allowed to go off board (to another managed element) then some of the policy instrumentation can be located elsewhere. All of our work so far has assumed that the QoS MIB, Policy MIB and diffserv MIB are all on the same managed system. If our accessor functions can do an SNMP get, then the Policy MIB or QoS MIB can be colocated at the agent or the manager or a mid-level manager.
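The m*n fanout can be illustrated with a small sketch. The interface numbers and object names below are made up for the example; the only point is that one policy over m instances, expanded into n objects each, yields m*n individual settings.

```python
from itertools import product


def fan_out(instances, objects):
    """Expand one policy into (instance, object) pairs: m instances x n objects."""
    return list(product(instances, objects))


# Hypothetical values: the Policy MIB selects the instances (m = 3),
# the QoS MIB expands one setting into several diffserv objects (n = 2).
instances = [1, 2, 3]
objects = ["meterRate", "queueWeight"]
settings = fan_out(instances, objects)
```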

Topic 5: Policy Override Discussion, led by Steve Waldbusser.

Steve Waldbusser: In the general area of policy conflicts, I am uncomfortable with the term "conflicts". It implies things that happen behind our backs.

When policies have relationships, sometimes the policy writer wants to specify enough information to have the agent know what to do when two or more policies might apply.

You might have related sets of policies that need arbitration:

if capMatch(802) configure 802
if capMatch(diffServ) configure DiffServ

You may not want both happening at the same time if there is a conflict. Some approaches one might use are:

. Pick one set based on capabilities or other state
. Use defined configuration for non-matching elements
. Use a default configuration

You also need to have a defined state when a policy goes out of effect (rather than leave policy in limbo).

You have completely different policies with different domains, goals, and human authors, yet they diddle the same object. Which one wins?

Everything falls into one of these two groups:

. Related sets of policies that need arbitration
. Unrelated sets of policies

We might use two objects to arbitrate related policies

. pmPolicyGroup
. pmPolicyGroupPriority

and one object to arbitrate instance conflicts

. pmPolicyConflictPriority

Walter Weiss had two issues with this approach:

. If we have two alternatives, which one do I select?
. If we have two alternatives, which do I do first?

Jeff Case broke it down into the following cases:

. Policy 1 modifies object/instance a, policy 2 modifies object/instance b: no problem.
. Policy 1 modifies object/instance a and b, policy 2 modifies a and b, and there is a perfect overlap. You only do the one with highest precedence.
. Policy 1 modifies object/instance a and b, policy 2 modifies a and b, and you want a way to say "by the way, I want you to do both a and b or neither a nor b". You need groups for this case.
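A minimal sketch of the first two cases, assuming a single numeric precedence per policy where a higher number wins (that convention, and all names here, are assumptions for illustration, not taken from the Policy MIB). A policy is kept or dropped whole, so no policy is ever half applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    precedence: int      # assumption: higher value wins
    targets: frozenset   # object/instance pairs the policy would set


def select(policies):
    """Pick the policies to run, highest precedence first.

    A policy is skipped entirely if a higher-precedence policy already
    claims any of its targets. Ties keep the input order.
    """
    chosen, claimed = [], set()
    for policy in sorted(policies, key=lambda p: p.precedence, reverse=True):
        if claimed.isdisjoint(policy.targets):
            chosen.append(policy.name)
            claimed |= policy.targets
    return chosen


# Case 1: disjoint targets, both policies run.
disjoint = select([Policy("p1", 1, frozenset({"a"})),
                   Policy("p2", 2, frozenset({"b"}))])

# Case 2: perfect overlap, only the higher precedence runs.
overlap = select([Policy("p1", 1, frozenset({"a", "b"})),
                  Policy("p2", 2, frozenset({"a", "b"}))])
```

This does not cover the third case; expressing "both or neither" across policies is what the group objects are for.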

Joel Halpern disagreed with this approach, as the policy is an atomic thing. You cannot do half a policy.

Two approaches to this were discussed. One might use two policies with different precedences, or you might use a group in a logical switch statement. Joel Halpern liked the logical composition, as using precedence is more likely to lead to programmer error. He went on to say that there is no good way to catch overlapping policies; they'll just have to diagnose it on the box. We need to say we thought about it. He thinks a precedence value is about as good as we can get. There may be a corollary in the execution environment: if we have several sets from a policy action, you need to check the precedence to make sure the sets are ordered correctly.

Jon Saperia made the point that policies that simultaneously control various MIB objects are difficult to implement. You put two things in a group because their actions could conflict. If these two things match on the same instance then the box burns up.

The observation was made that you cannot make sure that policy writers do groups correctly. From policy group, it is important that you be able to detect all potential policy conflicts. The problem is when you are not coordinating your policy writers.

Walter Weiss: Do we have the scripts or policies invoke other scripts or policies? It seems to me that for many of the conflict issues this may be the ideal level of sophistication. We are playing games on criteria. More often than not, what is more important is the circumstances rather than the conflict.

Joel Halpern: In most cases, policy conflicts are caused by underspecified conditions. "It can't happen! Well, unfortunately it did." It is messy when unrelated sets of policies conflict.

The discussion over the next few minutes focused on policy interactions and the need for human supervision to deal with interaction problems. Concerns were expressed about how to handle a partial policy implementation due to partial failures and whether or not we should be able to do policy unwinding (like SNMP sets are handled now). The problem of having initial configuration policies, subsequent incremental policies, and interaction resolution between the two was mentioned. The point was made that there are cases where we don't want the agent to choose randomly.

A policy set of actions can be done starting at the beginning and ending at the end. This is unlike how we do PDU processing. A more SNMP-set-like set of operations (must all succeed or they fail) will not work, as most policies aren't going to work on a larger machine (since some sets will almost always fail). We are going to have to live with this.

Sometimes we want a partial policy execution to succeed, sometimes we want it to fail (roll back). For example, we may have a policy that says provide power to the oxygen generator if possible. The other is "when on emergency generator shut down as much as you can". I don't think that, just because we cannot shut down the oxygen generator, we can't shut down any outlets. I believe you will have times that say "do as much as you can", and sometimes you say "do all or none". You want to have both kinds of semantics expressible. We have a mechanism for doing both but we don't understand the computational complexity yet. When you want partial results, put them in separate groups, otherwise put them in the same group. You could have an accessor function that says "do all or none", or separate them into unrelated actions.
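The two semantics might be sketched as below. This is a toy model under stated assumptions: the function name, the `writable` set standing in for "sets that can succeed", and the outlet names are all hypothetical, and a dictionary stands in for the managed objects.

```python
def run_policy(state, sets, writable, all_or_none):
    """Apply (name, value) sets to a copy of state.

    A set fails when its name is not in `writable`. With all_or_none,
    any failure returns the original state unchanged (roll back);
    otherwise the sets that succeeded are kept ("do as much as you can").
    Returns (new_state, failed_names).
    """
    working = dict(state)
    failed = []
    for name, value in sets:
        if name in writable:
            working[name] = value
        else:
            failed.append(name)
    if failed and all_or_none:
        return dict(state), failed
    return working, failed


# "When on emergency generator, shut down as much as you can."
state = {"outlet1": "on", "outlet2": "on", "oxygen": "on"}
sets = [("outlet1", "off"), ("outlet2", "off"), ("oxygen", "off")]
writable = {"outlet1", "outlet2"}  # the oxygen generator cannot be shut down

best_effort, failed_be = run_policy(state, sets, writable, all_or_none=False)
atomic, failed_atomic = run_policy(state, sets, writable, all_or_none=True)
```

In the best-effort run the two outlets go off and the oxygen generator stays on; in the all-or-none run nothing changes.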

Meeting wrapup.

How do people feel about having the interim at the end of the IETF?

Great to minimize travel, but we are toastier than toast.

We made a great deal of progress between last interim and this interim.

Proposed dates for next interim: Oct 12/13 and Oct 19/20. Knoxville has been generally agreed upon as the location. We'll take the dates issue to the list.

The Policy MIB document has a lot of stuff that has been affected.
The policy module is probably going to go to a tiny set of objects.

Harrie says he wants to wait to see what diffserv is going to do.

Diffserv is probably going to decide in the next two weeks.

We've made some changes that require that we change the BCP document.
My goal is to republish before the end of September.

Thanks to SNMP Research and Ericsson for sponsoring this interim meeting.

---
Administrivia

Since there are no blue sheets for an interim meeting, an attendance list must be submitted. Many thanks to Shawn Routhier, both for his excellent notes, and for the attendance list.

Attendance list for meeting
David Partain David.Partain@ericsson.com
Steve Moulton moulton@snmp.com
Jon Saperia saperia@jdscons.com
Matt White mwhite@torrentnet.com
Harrie Hazewinkel harrie@covalent.net
Walter Weiss wweiss@ellacoy.com
Thippanna Hongal hongal@riverstonenet.com
David T. Perkins dperkins@dsperkins.com
Omar Cherkaoui cherkaoui.omar@ugam.ca
Kwok Ho Chan khchan@nortelnetworks.com
Zhifeng Xiao zhifeng@cs.mcgill.ca
Joel Halpern joel@longsys.com
Rob Frye rfrye@longsys.com
Mike MacFaden mrm@riverstonenet.com
Bert Wijnen bwijnen@lucent.com
Shawn A. Routhier sar@epilogue.com
Andy Bierman abierman@cisco.com
Chris Elliott chelliot@cisco.com
Dan Romascanu dromasca@avaya.com
Dale Francisco dfrancis@cisco.com
Barr Hibbs rbhibbs@ultradns.com
Jeff Case case@snmp.com
Assaf Zeira assafz@p-cube.com
Steve Waldbusser waldbusser@nextbeacon.com
David Harrington dbh@enterasys.com

Slides

BCP Status
Configuration Management with SNMP
Policy Processing Questions