MADMAN Working Group                 Gordon B. Jones [gbjones@mitre.org]
INTERNET-DRAFT                                                     MITRE
draft-ietf-madman-alarmmib-01.txt       Niraj Jain [njain@us.oracle.com]
                                                      Oracle Corporation
                                       Glenn Mansfield [glenn@aic.co.jp]
                                                  AIC Systems Laboratory
                                                             August 1996

                       Mail and Directory Alarms

Status of this Memo

This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress."

To learn the current status of any Internet-Draft, please check the
1id-abstracts.txt listing contained in the Internet-Drafts Shadow
Directories on ds.internic.net, nic.nordu.net, ftp.nisc.sri.com, or
munnari.oz.au.

Abstract

This document defines alarms for Mail and Directory usage. It is to be
used in conjunction with the Mail and Directory Management (MADMAN)
RFCs.

1. The SNMPv2 Network Management Framework

The major components of the SNMPv2 Network Management framework are
described in the documents listed below.

   o  RFC 1902 [1] defines the Structure of Management Information
      (SMI), the mechanisms used for describing and naming objects
      for the purpose of management.

   o  STD 17, RFC 1213 [2] defines MIB-II, the core set of managed
      objects (MO) for the Internet suite of protocols.

   o  RFC 1905 [3] defines the protocol used for network access to
      managed objects.

The framework is adaptable/extensible by defining new MIBs to suit the
requirements of specific applications/protocols/situations.

Managed objects are accessed via a virtual information store, the MIB.
Objects in the MIB are defined using the subset of Abstract Syntax
Notation One (ASN.1) defined in the SMI. In particular, each object type
is named by an OBJECT IDENTIFIER, which is an administratively assigned
name. The object type together with an object instance serves to
uniquely identify a specific instantiation of the object. For human
convenience, often a textual string, termed the descriptor, is used to
refer to the object type.

2. The Need for Alarms in Messaging

Alarms are notifications of abnormalities associated with an MTA or a
message processed by an MTA. Alarms are generated by a Management
Console. Two facilities aid the Management Console in the generation of
alarms. The first facility is the trap, which is an unsolicited event
initiated by the Management Agent and directed to the Management
Console. Traps generated by an agent may optionally convey the values
of MIB variables inside them. The Management Console interprets the
traps and generates alarms as it determines appropriate.

The second facility consists of variables that can be polled by the
Management Console.
These variables include the existing MIB variables
defined in the other MADMAN RFCs (Network Services Monitoring MIB,
Directory Services Monitoring MIB, Mail Monitoring MIB), plus more
variables defined herein specifically to augment support for alarm
generation. If the Management Console detects a variable value which
indicates that a threshold has been reached, or that some other
worrisome trend or event has occurred, it generates an alarm as it
determines appropriate. It is expected that when an abnormality occurs,
a trap will be generated indicating the specific cause of the problem.
If the trap is lost or discarded by the network, the console may still
detect the abnormality on its next regular polling cycle through
inspection of the MIB variables. This combination of mechanisms
provides a flexible alarm functionality that is event-driven,
polling-driven, or both.

It is understood that traps are an unreliable mechanism. However, traps
may enhance the effects of polling-based alarms, because traps can
provide a more immediate discovery of a problem than polling alone can,
which may be important within some operational environments. For
example, when component availability is required to exceed 99%, a
polling cycle consisting of fifteen-minute intervals to detect if a
component is operational may fail this requirement. A polling cycle
more frequent than fifteen minutes might saturate the network with SNMP
traffic. When a fifteen-minute polling cycle with 99% reliability is
combined with an event-driven mechanism that is itself 99% reliable, the
probability that a given component failure goes undetected by both
mechanisms is about one one-hundredth of one percent, assuming that the
two mechanisms fail independently. This scenario is also applicable to
the case of message throughput requirements, where the detection of
queue saturation may be both event-driven and polling-driven.
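
The arithmetic behind this example can be checked with a short sketch.
It assumes that the trap and polling mechanisms fail independently and
that availability is measured over a 24-hour window. Both are
assumptions made for illustration, and the function names are
hypothetical.

```python
def undetected_probability(*reliabilities):
    """Chance that every detection mechanism misses the same failure.

    Assumes the mechanisms fail independently of one another (an
    assumption made for this sketch).
    """
    miss = 1.0
    for reliability in reliabilities:
        miss *= 1.0 - reliability
    return miss


def downtime_budget_minutes(availability, window_minutes=24 * 60):
    """Downtime allowed within the window (default: one day, assumed)."""
    return (1.0 - availability) * window_minutes


polling_only = undetected_probability(0.99)
polling_and_traps = undetected_probability(0.99, 0.99)
daily_budget = downtime_budget_minutes(0.99)

print(f"polling only:      {polling_only:.4%} of failures missed")
print(f"polling and traps: {polling_and_traps:.4%} of failures missed")
print(f"downtime budget:   {daily_budget:.1f} minutes per day")
```

Under these assumptions the combined miss probability is 0.01%, and a
99% daily availability target leaves about 14.4 minutes of downtime,
which is slightly less than a single fifteen-minute polling interval.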

Alarms denote cases where outstanding intervention is required.
Implementations that result in a bombardment of superfluous traps should
be avoided (some fault conditions may lend themselves to this). Traps
should not be issued repetitively to signify one basic fault condition.
The setting of threshold conditions and the evaluation of other
composite information are the responsibility of the console, or are a
local implementation matter within the agent. The selection of
destinations for SNMP traps by the SNMP agents or applications is also
a local matter.

3. MIB Data to Support Alarms

The following material is a definition of the traps and MIB variables
defined specifically to support alarm functionality. The MADMAN
variables used to support alarms are defined in RFCs 19??, 19??, and
19??. The usage of these traps and MIB variables to fulfill specific
requirements is defined in a later section.

3.1 Traps to Support Alarms

Two forms of specific traps are defined to support alarms. The first,
called mADAlarm, denotes an MTA- or DSA-related failure, and the second,
messageAlarm, denotes a message-related failure in an MTA.

mADAlarm
This trap is generated by the agent in an unsolicited fashion to signify
that a failure has occurred within the MTA or DSA. Examples of such
failures may include one MTA's inability to contact another MTA, or the
detection of message queue saturation. The mADAlarm trap may convey a
number of values, including the name of the MTA or DSA reporting the
problem, the name of the remote MTA or DSA purportedly causing the
problem, and variables describing the problem itself.

messageAlarm
This trap is generated by the agent in an unsolicited fashion to signify
that a non-recoverable failure has occurred in processing a message due
to some sort of structural flaw in the message itself or in its
addressing.
Examples may include cases where a message cannot be delivered,
non-delivered, or redirected, or the case where a messaging loop was
detected. The messageAlarm trap may convey a number of values, including
the name of the MTA that processed the message, and variables describing
the problem itself.

3.2 MIB Variables to Support Alarms

A new table is defined in the MIB to supply supplementary fault-related
information to support alarm generation. When a failure occurs, the
identities of the applications responsible are retained in the MIB,
along with the ID of the message most recently involved in a failure.
Through polling, any changes in the values of these variables can
signify a recent failure. The following paragraphs describe each
variable in the MIB.

lastMessageIdFailure
This is the identifier of the most recent message that was the cause of
a message-related failure. A message-related failure is defined to be a
non-recoverable error in the processing of a message. In the event of
multiple message failures, it is a clue to the administrator or
application to inspect the message queues to determine which messages
are defective.

numMessagesFailed
This is the total number of messages that have failed processing since
the messaging application was last initialized. This variable may be
used in conjunction with lastMessageIdFailure to detect multiple message
failures within a single unit of time.

lastFailureMtaGroupName
When an error involving a neighboring MTA occurs, this variable holds
the mtaGroupName (from the MADMAN mtaGroupTable) of the MTA most
recently involved in a failure.

lastFailureMtaApplName
This variable holds the applName (from the MADMAN applTable) of the MTA
that most recently reported a failure.

4. SNMP Format for Alarms

Alarms are supported under SNMP using traps and additional MIB
variables.
An additional table called mADAlarmTable is defined here.
Elements of the existing MADMAN tables and proposed extensions are also
utilized for alarm purposes. It is expected that traps will be
implemented under SNMP v1, but that the grammatical constructs used to
define them are taken from SNMP v2. Page 31 of RFC 1157 shows how trap
Protocol Data Units (PDUs) are formed in SNMP v1. We would add two
enterprise-specific traps (generic-trap type 6) whose specific-trap
values are set to either mADAlarm (specific-trap 0) or messageAlarm
(specific-trap 1). The enterprise field of the trap would contain the
OID "experimental ??" designating the MADMAN alarm MIB (mADAlarmMIB).
The values of variables and their corresponding OBJECT IDENTIFIERs are
conveyed within the VarBindList. These variables are obtained from
either the mADAlarmTable or tables found in the other MADMAN RFCs.

MADMAN-ALARM-MIB DEFINITIONS ::= BEGIN

IMPORTS
    MODULE-IDENTITY, OBJECT-TYPE,
    NOTIFICATION-TYPE, experimental, Counter32, Gauge32
        FROM SNMPv2-SMI
    DisplayString,
    TEXTUAL-CONVENTION
        FROM SNMPv2-TC
    MODULE-COMPLIANCE, OBJECT-GROUP, NOTIFICATION-GROUP
        FROM SNMPv2-CONF
    applOperStatus, applName
        FROM APPLICATION-MIB
    mtaGroupName, mtaGroupInboundRejectionReason,
    mtaGroupStoredVolume, mtaLoopsDetected, mtaGroupLoopsDetected,
    mtaGroupOutboundConnectFailureReason
        FROM MTA-MIB;

mADAlarmMIB MODULE-IDENTITY
    LAST-UPDATED "9608230000Z"
    ORGANIZATION "IETF Mail and Directory Management Working
                  Group"
    CONTACT-INFO
        "         Glenn Mansfield
          Postal: AIC Systems Laboratory
                  6-6-3, Minami Yoshinari
                  Aoba-ku, Sendai, Japan 989-32.

          Tel:    +81-22-279-3310
          Fax:    +81-22-279-3640
          E-mail: glenn@aic.co.jp"
    DESCRIPTION
        "The MIB module describing alarms for MADMAN"
    ::= { experimental 73 }

mADAlarmTable OBJECT-TYPE
    SYNTAX SEQUENCE OF MADAlarmEntry
    MAX-ACCESS not-accessible
    STATUS current
    DESCRIPTION
        "The table holding alarm information for an individual MTA or
        DSA."
    ::= { mADAlarmMIB 1 }

mADAlarmEntry OBJECT-TYPE
    SYNTAX MADAlarmEntry
    MAX-ACCESS not-accessible
    STATUS current
    DESCRIPTION
        "The alarm entry associated with each MTA or DSA."
    ::= { mADAlarmTable 1 }

MADAlarmEntry ::= SEQUENCE {
    lastMessageIdFailure     DisplayString,
    numMessagesFailed        Counter32,
    lastFailureMtaGroupName  DisplayString,
    lastFailureMtaApplName   DisplayString
}

lastMessageIdFailure OBJECT-TYPE
    SYNTAX DisplayString
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION
        "This is the message ID of the last message to either loop or
        have an unrecoverable error while processing"
    ::= { mADAlarmEntry 1 }

numMessagesFailed OBJECT-TYPE
    SYNTAX Counter32
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION
        "This is the number of messages that have had an unrecoverable
        error while processing since MTA initialization"
    ::= { mADAlarmEntry 2 }

lastFailureMtaGroupName OBJECT-TYPE
    SYNTAX DisplayString
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION
        "This is the group name of the last MTA group to have a
        connectivity failure"
    ::= { mADAlarmEntry 3 }

lastFailureMtaApplName OBJECT-TYPE
    SYNTAX DisplayString
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION
        "This is the application name of the last MTA to have a
        connectivity failure"
    ::= { mADAlarmEntry 4 }

mADAlarmNotifications OBJECT IDENTIFIER ::= { mADAlarmMIB 2 }

mADAlarm NOTIFICATION-TYPE
    OBJECTS { applOperStatus, applName, mtaGroupName,
              mtaGroupOutboundConnectFailureReason,
              mtaGroupStoredVolume }
    -- these OBJECTS are the things that an mADAlarm may convey
    STATUS current
    DESCRIPTION
        "Signifies that a failure has occurred within the MTA or DSA."
    ::= { mADAlarmNotifications 1 }

messageAlarm NOTIFICATION-TYPE
    OBJECTS { lastMessageIdFailure, numMessagesFailed }
    STATUS current
    DESCRIPTION
        "Signifies that a non-recoverable failure has occurred in
        processing a message."
    ::= { mADAlarmNotifications 2 }

mADAlarmConformance OBJECT IDENTIFIER ::= { mADAlarmMIB 3 }

mADAlarmGroup       OBJECT IDENTIFIER ::= { mADAlarmConformance 1 }
mADAlarmCompliances OBJECT IDENTIFIER ::= { mADAlarmConformance 2 }

mADAlarmTrapCompliance MODULE-COMPLIANCE
    STATUS current
    DESCRIPTION
        "The most basic level of compliance for MAD SNMPv2 entities
        that implement MAD alarms."
    MODULE
        MANDATORY-GROUPS { mADAlarmTrapGroup }
    ::= { mADAlarmCompliances 1 }

mADAlarmVariableCompliance MODULE-COMPLIANCE
    STATUS current
    DESCRIPTION
        "The compliance statement for MAD SNMPv2 entities that
        implement MIB variables to support alarms for MTAs."
    MODULE
        MANDATORY-GROUPS { mADAlarmVariableGroup }
    ::= { mADAlarmCompliances 2 }

mADAlarmTrapGroup NOTIFICATION-GROUP
    NOTIFICATIONS { mADAlarm, messageAlarm }
    STATUS current
    DESCRIPTION
        "Two traps providing the basic level of support for alarms for
        MTAs."
    ::= { mADAlarmGroup 1 }

mADAlarmVariableGroup OBJECT-GROUP
    OBJECTS { lastMessageIdFailure, numMessagesFailed,
              lastFailureMtaGroupName, lastFailureMtaApplName }
    STATUS current
    DESCRIPTION
        "A collection of objects providing support for alarms for MTAs
        that includes some other alarm-specific MIB variables"
    ::= { mADAlarmGroup 2 }

END

5. Scenarios

The following scenarios provide examples of how the mADAlarm and
messageAlarm traps are used in various fault conditions.

5.1 Connectivity Failure

When an MTA or DSA detects that another MTA or DSA cannot be contacted,
an mADAlarm is sent. The mADAlarm contains the applName of the MTA
reporting the problem, the mtaGroupName for the MTA that cannot be
contacted, and the mtaGroupOutboundConnectFailureReason.
In the case of a more general connectivity failure, such as the general
unavailability of the network element, the mADAlarm contains only the
variable mtaGroupOutboundConnectFailureReason. Care should be taken to
report these conditions only in the case of permanent failure, since
intermittent failures are more frequent and might result in too many
traps being generated. For example, when an MTA cannot connect to
another MTA in order to deliver a message, the MTA delivering the
message usually retries the delivery attempt for a specified duration or
for a specified number of tries. If the retry limit is exceeded, a case
that should not occur, the message is returned. In this case, a trap
would be sent when the retry limit is exceeded, but would not be sent
for each individual retry.

5.2 MTA or DSA Down

This condition signifies that the MTA or DSA is not operational (but
should be) or has not recently registered with the management system.
This condition is reported with an mADAlarm containing the values of
applOperStatus and applName from the MADMAN Application Monitoring MIB.
Support for this feature is optional, since an MTA or DSA that has
crashed cannot report that fact to an agent, and since off-the-shelf
agents cannot be expected to monitor the liveness of applications by
themselves.

5.3 Messaging Loop Detection

This condition may signify that a particular message has been detected,
received, and sent multiple times, perhaps exceeding a locally
established threshold value. The condition is reported with a
messageAlarm trap, where the trap contains the applName of the MTA
reporting the problem, and optionally the values of
lastMessageIdFailure, mtaLoopsDetected, and mtaGroupLoopsDetected.
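
The threshold test described in this section might be sketched as
follows. This is an illustration only. The class name, the default
threshold, and the use of a per-message counter are assumptions and are
not part of the MIB. The attributes merely mirror the roles of
mtaLoopsDetected and lastMessageIdFailure.

```python
from collections import Counter


class LoopDetector:
    """Counts sightings of each message ID and flags suspected loops.

    The threshold stands in for the locally established value mentioned
    in the text; the default of 5 is an arbitrary placeholder.
    """

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.sightings = Counter()
        self.loops_detected = 0            # role of mtaLoopsDetected
        self.last_message_id_failure = ""  # role of lastMessageIdFailure

    def observe(self, message_id):
        """Record one pass of a message; return True if a trap is due.

        A trap is due only on the pass that first exceeds the threshold,
        so one looping message yields one messageAlarm, not a stream.
        """
        self.sightings[message_id] += 1
        if self.sightings[message_id] == self.threshold + 1:
            self.loops_detected += 1
            self.last_message_id_failure = message_id
            return True
        return False


detector = LoopDetector(threshold=3)
alarms = [detector.observe("<msg-1@example.org>") for _ in range(6)]
print(alarms)  # [False, False, False, True, False, False]
```

Reporting only the first crossing of the threshold keeps the sketch in
line with the earlier guidance that traps should not be issued
repetitively to signify one basic fault condition.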

5.4 Message Processing Failure

When an MTA encounters certain non-recoverable errors processing a
message (e.g., a "dead" message that cannot be delivered, non-delivered,
or redirected), a messageAlarm is generated. The messageAlarm contains
the applName of the MTA reporting the failure, and optionally the
lastMessageIdFailure, which identifies the most recent message that
failed, and numMessagesFailed, which aids in detecting multiple message
failures. If other messages had failed processing prior to the immediate
condition being reported and after the most recent polling cycle, the
identities of these messages may be determined manually, for example by
inspecting the message queues.

5.5 Queue Error

When an MTA or agent detects that a queue is full or is approaching
saturation, an mADAlarm is sent. The applName of the MTA reporting the
problem is conveyed within the variable bindings list of the mADAlarm.
The mADAlarm also contains the values of the MIB variables mtaGroupName
and mtaGroupStoredVolume (both from the mtaGroupTable).

5.6 Security Error

When an MTA or agent detects a security error such as an authentication
failure (e.g., when an MTA or DSA fails to authenticate itself to
another), an mADAlarm is sent. The applName of the MTA reporting the
problem is conveyed within the variable bindings list of the mADAlarm.
The mADAlarm also contains the values of the MIB variables
mtaGroupInboundRejectionReason (stating an authentication failure) and
the mtaGroupName.

When an MTA or agent detects a security error such as a data integrity
violation (e.g., while processing a message), a messageAlarm is sent.
The applName of the MTA reporting the problem is conveyed within the
variable bindings list of the messageAlarm. The messageAlarm also
contains the values of the MIB variables mtaGroupInboundRejectionReason
(stating an integrity violation) and the mtaGroupName.

6. Acknowledgements

This draft is the product of discussions and deliberations carried out
in the following groups:

   ietf-madman-wg        ietf-madman@innosoft.com

This draft also incorporates the intellectual contributions of

   Bruce Greenblatt
   Sue Lebeck
   Roger Mizumori
   Edward Owens

7. References

[1] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Structure
    of Management Information for version 2 of the Simple Network
    Management Protocol (SNMPv2)", RFC 1902, SNMP Research, Inc.,
    Hughes LAN Systems, Dover Beach Consulting, Inc., Carnegie Mellon
    University, February 1996.

[2] McCloghrie, K., and M. Rose, Editors, "Management Information
    Base for Network Management of TCP/IP-based internets: MIB-II",
    STD 17, RFC 1213, Hughes LAN Systems, Performance Systems
    International, March 1991.

[3] Case, J., McCloghrie, K., Rose, M., and S. Waldbusser, "Protocol
    Operations for version 2 of the Simple Network Management
    Protocol (SNMPv2)", RFC 1905, SNMP Research, Inc., Hughes LAN
    Systems, Dover Beach Consulting, Inc., Carnegie Mellon
    University, February 1996.

[4] Freed, N., and S. Kille, "Network Services Monitoring MIB",
    RFC 1565, Innosoft, ISODE Consortium, January 1994.

[5] Freed, N., and S. Kille, "Mail Monitoring MIB", RFC 1566,
    Innosoft, ISODE Consortium, January 1994.

[6] Mansfield, G., and S. Kille, "X.500 Directory Monitoring MIB",
    RFC 1567, AIC Systems Lab, ISODE Consortium, January 1994.

Security Considerations

Security issues are not discussed in this memo.

Authors' Addresses

Glenn Mansfield
AIC Systems Laboratory
6-6-3 Minami Yoshinari
Aoba-ku, Sendai 989-32
Japan

Phone: +81-22-279-3310
E-Mail: glenn@aic.co.jp

Gordon B. Jones
MITRE Corporation
1820 Dolley Madison Blvd.
McLean, VA 22102-3481

Phone: (703) 883-76701
E-Mail: gbjones@mitre.org

Niraj Jain
Oracle Corporation
500 Oracle Parkway
Redwood Shores
California 94065

Phone: (415) 506-2581
E-Mail: njain@us.oracle.com