Network Working Group                                          S. Vallin
Internet-Draft                                          Stefan Vallin AB
Intended status: Standards Track                            M. Bjorklund
Expires: May 4, 2017                                               Cisco
                                                        October 31, 2016


                           YANG Alarm Module
                  draft-vallin-netmod-alarm-module-01

Abstract

   This document defines a YANG module for alarm management.  It
   includes functions for alarm list management, alarm shelving and
   notifications to inform management systems.  There are also RPCs to
   manage the operator state of an alarm and administrative alarm
   procedures.  The module carefully maps to relevant alarm standards.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 4, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Requirements notation
   2.  Introduction
     2.1.  Terminology
   3.  Objectives
   4.  Background and Usability Requirements
   5.  Alarm Concepts
     5.1.  What is an Alarm?
     5.2.  What is an Alarm Type?
     5.3.  How are Resources Identified?
     5.4.  How are Alarm Instances Identified?
     5.5.  What is the Life-Cycle of an Alarm?
       5.5.1.  Resource Alarm Life-Cycle
       5.5.2.  Operator Alarm Life-cycle
       5.5.3.  Administrative Alarm Life-Cycle
     5.6.  Alarm Shelving
   6.  Alarm Data Model
     6.1.  Alarm Control
       6.1.1.  Alarm Shelving
     6.2.  Alarm Inventory
     6.3.  Alarm Summary
     6.4.  The Alarm List
     6.5.  The Shelved Alarms List
     6.6.  RPCs
     6.7.  Notifications
   7.  Alarm YANG Module
   8.  X.733 Alarm Mapping Data Model
   9.  X.733 Alarm Mapping YANG Module
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Enterprise-specific Alarm-Types Example
   Appendix B.  Alarm Inventory Example
   Appendix C.  Alarm List Example
   Appendix D.  Alarm Shelving Example
   Appendix E.  X.733 Mapping Example
   Authors' Addresses

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   This document defines a YANG [RFC7950] module for alarm management.
   The purpose is to define a standardised alarm interface for network
   devices that can be easily integrated into management applications.
   The model is also applicable as a northbound alarm interface in the
   management applications.

   Alarm monitoring is a fundamental part of monitoring the network.
   Raw alarms from devices do not always tell the status of the network
   services or necessarily point to the root cause.  However, being able
   to feed alarms to the network management system in a standardised
   format is a starting point for performing higher level network
   assurance tasks.

   The telecommunication domain has standardised an alarm interface in
   ITU-T X.733 [X.733].  This continued in mobile networks within the
   3GPP organisation [ALARMIRP].  Although SNMP is the dominant
   mechanism for monitoring devices, the IETF did not standardise an
   alarm MIB early on.  Instead, management systems interpreted the
   enterprise-specific traps per MIB and device to build an alarm list.
   When the Alarm MIB [RFC3877] was finally published, it had to
   address the existence of enterprise traps and map these into alarms.
   This requirement led to a MIB that is not easy to use.

   This document defines a standardised YANG module for alarm
   management.  The design of the module is based on experience from
   using and implementing the above mentioned alarm standards.

2.1.  Terminology

   The following terms are defined in [RFC7950]:

   o  action

   o  client

   o  data tree

   o  RPC

   o  server

   The following terms are used within this document:

   o  Alarm (the general concept): An alarm signifies an undesirable
      state in a resource that requires corrective action.

   o  Alarm Instance: The alarm state for a specific resource and alarm
      type.  For example (GigabitEthernet0/15, link-alarm).  An entry in
      the alarm list.

   o  Alarm Inventory: A list of all possible alarm types on a system.

   o  Alarm Shelving: Blocking alarms according to specific criteria.

   o  Alarm Type: An alarm type identifies a possible unique alarm state
      for a resource.  Alarm types are names to identify the state like
      'link-alarm', 'jitter-violation', 'high-disk-utilization'.

   o  Management System: The alarm management application that consumes
      the alarms, i.e., acts as a client.

   o  Resource: A fine-grained identification of the alarming resource,
      for example: an interface, a process.

   o  System: The system that implements this YANG alarm module, i.e.,
      acts as a server.  This corresponds to a network device or a
      management application that provides a north-bound alarm
      interface.

3.  Objectives

   The objectives for the design of the Alarm Module are:

   o  Simple to use.  If a system supports this module, it shall be
      straightforward to integrate this into a YANG based alarm
      manager.

   o  View alarms as states on resources and not as discrete
      notifications.

   o  Clear definition of "alarm" in order to exclude general events
      that should not be forwarded as alarm notifications.

   o  Clear and precise identification of alarm types and alarm
      instances.

   o  A management system should be able to pull all available alarm
      types from a system, i.e., read the alarm inventory from a system.
      This makes it possible to prepare alarm operators with
      corresponding alarm instructions.

   o  Address alarm usability requirements.  While the IETF has not
      really addressed alarm management, telecom standards have
      addressed it purely from a protocol perspective.  The process
      industry has published several relevant standards addressing
      requirements for a useful alarm interface: [EEMUA], [ISA182].
      This alarm module defines usability requirements as well as a
      YANG data model.

   o  Mapping to X.733, which is a requirement for many alarm systems.
      Still, keep some of the X.733 concepts out of the core model in
      order to make the model small and easy to understand.

4.  Background and Usability Requirements

   Common alarm problems and their causes are summarised in Table 1.
   This summary is adapted to networking based on the ISA [ISA182] and
   EEMUA [EEMUA] standards.

   +------------------+--------------------------------+---------------+
   | Problem          | Cause                          | How this      |
   |                  |                                | module        |
   |                  |                                | addresses the |
   |                  |                                | cause         |
   +------------------+--------------------------------+---------------+
   | Alarms are       | "Nuisance" alarms (chattering  | Strict        |
   | generated but    | alarms and fleeting alarms),   | definition of |
   | they are ignored | faulty hardware, redundant     | alarms        |
   | by the operator. | alarms, cascading alarms,      | requiring     |
   |                  | incorrect alarm settings,      | corrective    |
   |                  | alarms have not been           | response.     |
   |                  | rationalised, the alarms       | Alarm         |
   |                  | represent log information      | requirements  |
   |                  | rather than true alarms.       | in Table 2.   |
   |                  |                                |               |
   | When alarms      | Insufficient alarm response    | The alarm     |
   | occur, operators | procedures and not well        | inventory     |
   | do not know how  | defined alarm types.           | lists all     |
   | to respond.      |                                | alarm types   |
   |                  |                                | and           |
   |                  |                                | corrective    |
   |                  |                                | actions.      |
   |                  |                                | Alarm         |
   |                  |                                | requirements  |
   |                  |                                | in Table 2.   |
   |                  |                                |               |
   | The alarm        | Nuisance alarms, stale alarms, | The alarm     |
   | display is full  | alarms from equipment not in   | definition    |
   | of alarms, even  | service.                       | and alarm     |
   | when there is    |                                | shelving.     |
   | nothing wrong.   |                                |               |
   |                  |                                |               |
   | During a         | Incorrect prioritization of    | State-based   |
   | failure,         | alarms.  Not using advanced    | alarm model,  |
   | operators are    | alarm techniques (e.g. state-  | alarm rate    |
   | flooded with so  | based alarming).               | requirements  |
   | many alarms that |                                | in Table 3    |
   | they do not know |                                | and Table 4   |
   | which ones are   |                                |               |
   | the most         |                                |               |
   | important.       |                                |               |
   +------------------+--------------------------------+---------------+

                    Table 1: Alarm Problems and Causes

   Based upon the above problems EEMUA gives the following definition of
   a good alarm:

   +----------------+--------------------------------------------------+
   | Characteristic | Explanation                                      |
   +----------------+--------------------------------------------------+
   | Relevant       | Not spurious or of low operational value.        |
   |                |                                                  |
   | Unique         | Not duplicating another alarm.                   |
   |                |                                                  |
   | Timely         | Not long before any response is needed or too    |
   |                | late to do anything.                             |
   |                |                                                  |
   | Prioritised    | Indicating the importance that the operator      |
   |                | deals with the problem.                          |
   |                |                                                  |
   | Understandable | Having a message which is clear and easy to      |
   |                | understand.                                      |
   |                |                                                  |
   | Diagnostic     | Identifying the problem that has occurred.       |
   |                |                                                  |
   | Advisory       | Indicative of the action to be taken.            |
   |                |                                                  |
   | Focusing       | Drawing attention to the most important issues.  |
   +----------------+--------------------------------------------------+

                    Table 2: Definition of a Good Alarm

   Vendors SHOULD rationalise all alarms according to Table 2.  Another
   crucial requirement is acceptable alarm rates.  Vendors SHOULD make
   sure that they do not exceed the recommendations from EEMUA below:

   +-----------------------------------+-------------------------------+
   | Long Term Alarm Rate in Steady    | Acceptability                 |
   | Operation                         |                               |
   +-----------------------------------+-------------------------------+
   | More than one per minute          | Very likely to be             |
   |                                   | unacceptable.                 |
   |                                   |                               |
   | One per 2 minutes                 | Likely to be over-demanding.  |
   |                                   |                               |
   | One per 5 minutes                 | Manageable.                   |
   |                                   |                               |
   | Less than one per 10 minutes      | Very likely to be acceptable. |
   +-----------------------------------+-------------------------------+

               Table 3: Acceptable Alarm Rates, Steady State

   +----------------------------+--------------------------------------+
   | Number of alarms displayed | Acceptability                        |
   | in 10 minutes following a  |                                      |
   | major network problem      |                                      |
   +----------------------------+--------------------------------------+
   | More than 100              | Definitely excessive and very likely |
   |                            | to lead the operator to abandon      |
   |                            | the use of the alarm system.         |
   |                            |                                      |
   | 20-100                     | Hard to cope with.                   |
   |                            |                                      |
   | Under 10                   | Should be manageable - but may be    |
   |                            | difficult if several of the alarms   |
   |                            | require a complex operator response. |
   +----------------------------+--------------------------------------+

                  Table 4: Acceptable Alarm Rates, Burst

   The numbers in Table 3 and Table 4 are the sum of all alarms for a
   network being managed from one alarm console.  So every individual
   system or NMS contributes to these numbers.

   Vendors SHOULD make sure that the following rules are used in
   designing the alarm interface:

   1.  Rationalize the alarms in the system to ensure that every alarm
       is necessary, has a purpose, and follows the cardinal rule that
       it requires an operator response.  Every alarm should adhere to
       the rules of Table 2.

   2.  Audit the quality of the alarms.  Talk with the operators about
       how well the alarm information supports them.  Do they know what
       to do in the event of an alarm?  Are they able to quickly
       diagnose the problem and determine the corrective action?  Does
       the alarm text adhere to the requirements in Table 2?

   3.  Analyze and benchmark the performance of the system and compare
       it to the recommended metrics in Table 3 and Table 4.  Start by
       identifying nuisance alarms, standing alarms at normal state and
       startup.

5.  Alarm Concepts

   This section defines the fundamental concepts behind the data model.
   This section is rooted in the works of Vallin et al. [ALARMSEM].

5.1.  What is an Alarm?

   There are two misconceptions regarding alarms and alarm interfaces
   that are important to sort out.  The first problem is that alarms are
   mixed with events in general.  Alarms MUST correspond to an
   undesirable state that needs corrective action.  Many implementations
   of alarm interfaces do not adhere to this principle and just send
   events in general.  In order to qualify as an alarm, there must exist
   a corrective action.  If that is not true, it is an event that can go
   into logs.

   The other misconception is that the term alarm refers to the
   notification itself.  Rather, an alarm is a state of a resource in
   the system.  The alarm notifications report state changes of the
   alarm, such as alarm raise and alarm clear.
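
   The distinction between an alarm as a state and the notifications
   that report its state changes can be sketched as follows.  This is
   a minimal illustration only, not part of the YANG module: the
   "Alarm" class, the "apply_notification" function and the alarm
   texts are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    """State of one alarm on one resource (hypothetical structure)."""
    is_cleared: bool
    severity: str
    text: str
    # Every reported state change is kept as history.
    status_changes: list = field(default_factory=list)


def apply_notification(alarms, resource, alarm_type,
                       is_cleared, severity, text):
    """Update the single alarm state that a notification refers to."""
    key = (resource, alarm_type)
    alarm = alarms.get(key)
    if alarm is None:
        # First notification for this resource and alarm type.
        alarm = Alarm(is_cleared, severity, text)
        alarms[key] = alarm
    else:
        # Later notifications change the state of the same alarm.
        alarm.is_cleared = is_cleared
        alarm.severity = severity
        alarm.text = text
    alarm.status_changes.append((is_cleared, severity, text))
    return alarm


alarms = {}
for cleared, text in [(False, "link down"), (True, "link up"),
                      (False, "link down")]:
    apply_notification(alarms, "GigabitEthernet0/15", "link-alarm",
                       cleared, "major", text)
```

   In this sketch three notifications result in one alarm whose
   history holds three entries, rather than in three separate alarms.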

   Based upon the above, we will use the following alarm definition:

      An alarm signifies an undesirable state in a resource that
      requires corrective action.

   "One of the most important principles of alarm management is that an
   alarm requires an action.  This means that if the operator does not
   need to respond to an alarm (because unacceptable consequences do not
   occur), then it is not an alarm.  Following this cardinal rule will
   help eliminate many potential alarm management issues."  [ISA182]

5.2.  What is an Alarm Type?

   One of the fundamental requirements stated in the previous section is
   that every alarm must have a corresponding corrective action.  This
   means that every vendor should be able to prepare a list of available
   alarms and their corrective actions.  We use the term 'alarm type' to
   refer to every possible alarm that could be active in the system.

   Alarm types are also fundamental in order to provide a state-based
   alarm list.  The alarm list correlates alarm state changes for the
   same alarm type and the same resource into one alarm.

   Different alarm interfaces use different mechanisms to define alarm
   types, ranging from simple error numbers to more advanced mechanisms
   like the X.733 triplet of event type, probable cause and specific
   problem.

   This document defines an alarm type with an alarm type id and an
   alarm type qualifier.

   The alarm type id is modeled as a YANG identity.  With YANG
   identities, new alarm types can be defined in a distributed fashion.
   YANG identities are hierarchical, which means that a hierarchy of
   alarm types can be defined.

   The primary goal for the alarm module has been to provide a simple
   but extensible mechanism.  YANG identities are a good mechanism for
   enumerated values that are easy to extend.
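
   How a hierarchy of identities supports distributed extension can be
   pictured with a small sketch.  It is an illustration under
   assumptions, not how a YANG implementation stores identities: the
   "BASES" table and both helper functions are hypothetical, each
   identity is given a single base for simplicity, and the identity
   names are examples only.

```python
# Hypothetical identity table: each identity maps to its base identity
# (None marks the root of the hierarchy).
BASES = {
    "alarm-type": None,
    "vendor-alarms": "alarm-type",
    "communications-alarm": "vendor-alarms",
    "link-alarm": "communications-alarm",
}


def derived_from(identity, ancestor, bases):
    """Return True if identity is (transitively) derived from ancestor."""
    current = bases[identity]
    while current is not None:
        if current == ancestor:
            return True
        current = bases[current]
    return False


def descendants(ancestor, bases):
    """List every identity derived from ancestor, sorted by name."""
    return sorted(i for i in bases if derived_from(i, ancestor, bases))


# Another party extends the hierarchy without touching existing entries.
extended = dict(BASES)
extended["jitter-violation"] = "communications-alarm"

link_is_alarm_type = derived_from("link-alarm", "alarm-type", extended)
comm_alarms = descendants("communications-alarm", extended)
```

   The point of the sketch is that adding "jitter-violation" requires
   no change to the existing identities, and that a tool can enumerate
   all alarm types below any given base.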

   This means that every possible alarm type that can appear in a system
   exists as a well defined hierarchical identity along with a
   description.  Tools can provide a list of possible alarms by parsing
   the YANG identities rather than reading user guides.

   Standards and vendors should define their own alarm type identities
   based on this definition.

   The use of YANG identities means that all possible alarms are
   identified at design time.  This explicit declaration of alarm types
   makes it easier to allow for alarm qualification reviews and
   preparation of alarm actions and documentation.

   There are occasions where the alarm types are not known at design
   time.  For example, a system with digital inputs that allows users to
   connect detectors (e.g., smoke detector) to the inputs.  In this
   case it is a configuration action that says that certain connectors
   are fire alarms for example.  The drawback of this is that there is a
   big risk that alarm operators will receive alarm types as a surprise;
   they do not know how to resolve the problem since a defined alarm
   procedure does not necessarily exist.

   In order to allow for dynamic addition of alarm types the alarm
   module also allows for further qualification of the identity based
   alarm type using a string.

   A common misunderstanding is that individual alarm notifications are
   alarm types.  This is not correct; e.g., "link-up" and "link-down"
   are two notifications reporting different states for the same alarm
   type, "link-alarm".

   A vendor or standard can then define their own alarm-type hierarchy.
   The example below shows a hierarchy based on X.733 event types:

     import ietf-alarms {
       prefix al;
     }
     identity vendor-alarms {
       base al:alarm-type;
     }
     identity communications-alarm {
       base vendor-alarms;
     }
     identity link-alarm {
       base communications-alarm;
     }

   Alarm types can be abstract.  An abstract alarm type is used as a
   base for defining hierarchical alarm types.  Concrete alarm types are
   used for alarm states and appear in the alarm inventory.  There are
   two kinds of concrete alarm types:

   1.  The last subordinate identity in the 'alarm-type-id' hierarchy is
       concrete, for example: "alarm-identity.environmental-alarm.smoke".
       In this example "alarm-identity" and "environmental-alarm" are
       abstract YANG identities, whereas "smoke" is a concrete YANG
       identity.

   2.  The YANG identity hierarchy is abstract and the concrete alarm
       type is defined by the dynamic alarm qualifier string, for
       example: "alarm-identity.environmental-alarm.external-detector"
       with alarm-type-qualifier "smoke".

   For example:

     // Alternative 1: concrete alarm type identity
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity smoke {
       base environmental-alarm;
       description "Concrete alarm type";
     }

     // Alternative 2: concrete alarm type qualifier
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity external-detector {
       base environmental-alarm;
       description
         "Abstract alarm type, a run-time configuration
          procedure sets the type of alarm detected.  This will
          be reported in the alarm-qualifier.";
     }

5.3.  How are Resources Identified?

   It is of vital importance to be able to refer to the alarming
   resource.  This reference must be as fine-grained as possible.  If
   the alarming resource exists in the data tree then an instance-
   identifier MUST be used with the full path to the object.

   This module also allows for alternate naming of the alarming resource
   if it is not available in the data tree.

5.4.  How are Alarm Instances Identified?

   A primary goal of this alarm module is to remove any ambiguity in how
   alarm notifications are mapped to an update of an alarm instance.
   X.733 and especially 3GPP were not really clear on this point.  This
   YANG alarm module states that the tuple (resource, alarm type
   identifier, alarm type qualifier) identifies a single alarm
   instance.  This means that alarm notifications for the same resource
   and same alarm type are matched to update the same alarm instance.
   These three leafs are therefore used as the key in the alarm list:

     list alarm {
       key "resource alarm-type-id alarm-type-qualifier";
       ...
     }

5.5.  What is the Life-Cycle of an Alarm?

   The alarm model clearly separates the resource alarm life-cycle from
   the operator and administrative life-cycles of an alarm.

   o  resource alarm life-cycle: the alarm instrumentation that controls
      alarm raise, clearance, and severity changes.

   o  operator alarm life-cycle: operators acting upon alarms with
      actions like acknowledgment and closing.  Closing an alarm implies
      that the operator considers the corrective action performed.
      Operators can also shelve alarms in order to avoid nuisance
      alarms.

   o  administrative alarm life-cycle: deleting (purging) alarms and
      compressing the alarm status change list.  This module exposes
      operations to manage the administrative life-cycle.  The server
      may also perform these operations based on other policies, but how
      that is done is out of scope for this document.

5.5.1.  Resource Alarm Life-Cycle

   From a resource perspective, an alarm can have the following life-
   cycle: raise, change severity, change severity, clear, being raised
   again etc.  All of these status changes can have different alarm
   texts generated by the instrumentation.  Two important things to
   note:

   1.  Alarms are not deleted when they are cleared.  Deleting alarms is
       an administrative process.  The alarm module defines an rpc
       "purge-alarms" that deletes alarms.

   2.  Alarms are not cleared by operators; only the underlying
       instrumentation can clear an alarm.  Operators can close alarms.

   The YANG tree representation below illustrates the resource oriented
   life-cycle:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro is-cleared            boolean
        +--ro last-changed          yang:date-and-time
        +--ro perceived-severity    severity
        +--ro alarm-text            alarm-text
        +--ro status-change* [time]
           +--ro time                  yang:date-and-time
           +--ro perceived-severity    severity
           +--ro alarm-text            alarm-text

   For every status change from the resource perspective a row is added
   to the 'status-change' list.  The last status values are also
   represented as leafs for the alarm.  Note well that the alarm
   severity does not include 'cleared'; alarm clearance is a flag.

   An alarm can therefore look like this: ((GigabitEthernet0/25,
   link-alarm, ""), false, T, major, "Interface GigabitEthernet0/25
   down")

5.5.2.  Operator Alarm Life-cycle

   Operators can also act upon alarms using the set-operator-state
   action:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro operator-state-change* [time] {operator-actions}?
        |  +--ro time        yang:date-and-time
        |  +--ro operator    string
        |  +--ro state       operator-state
        |  +--ro text?       string
        +---x set-operator-state {operator-actions}?
           +---w input
              +---w state    operator-state
              +---w text?    string

   The operator state for an alarm can be: 'none', 'ack', 'shelved', and
   'closed'.  Alarm deletion, using the "purge-alarms" rpc, can use this
   state as a criterion.  A closed alarm is an alarm where the operator
   has performed any required corrective actions.  Closed alarms are
   good candidates for being deleted.

5.5.3.  Administrative Alarm Life-Cycle

   Deleting alarms from the alarm list is considered an administrative
   action.
This is supported by the "purge-alarms" rpc. The "purge- 607 alarms" rpc takes a filter as input. The filter selects alarms based 608 on the operator and resource life-cycle such as "all closed cleared 609 alarms older than a time specification". The server may also perform 610 these operations based on other policies, but how that is done is out 611 of scope for this document. 613 Alarms can be compressed. Compressing an alarm deletes all entries 614 in the alarm's "status-change" list except for the last status 615 change. A client can perform this using the "compress-alarms" rpc. 616 The server may also perform these operations based on other policies, 617 but how that is done is out of scope for this document. 619 5.6. Alarm Shelving 621 Alarm shelving is an important function in order for alarm management 622 applications and operators to stop superfluous alarms. A shelved 623 alarm implies that any alarms fulfilling this criteria are ignored. 624 Shelved alarms appear in a dedicated shelved alarm list in order not 625 to disturb the relevant alarms. Shelved alarms do not generate 626 notifications. 628 6. Alarm Data Model 630 Alarm shelving and operator actions are YANG features so that a 631 server can select not to support these. 633 The data model has the following overall structure: 635 +--rw alarms 636 +--rw control 637 | +--rw max-alarm-status-changes? union 638 | +--rw notify-status-changes? boolean 639 | +--rw alarm-shelving {alarm-shelving}? 640 | +--rw shelf* [shelf-name] 641 | +--rw shelf-name string 642 | +--rw resource? resource 643 | +--rw alarm-type-id? alarm-type-id 644 | +--rw alarm-type-qualifier? alarm-type-qualifier 645 | +--rw description? 
string 646 +--ro alarm-inventory 647 | +--ro alarm-type* [alarm-type-id alarm-type-qualifier] 648 | +--ro alarm-type-id alarm-type-id 649 | +--ro alarm-type-qualifier alarm-type-qualifier 650 | +--ro resource* string 651 | +--ro has-clear boolean 652 | +--ro description string 653 +--ro summary 654 | +--ro alarm-summary* [severity] 655 | | +--ro severity severity 656 | | +--ro total? yang:gauge32 657 | | +--ro cleared? yang:gauge32 658 | | +--ro cleared-not-closed? yang:gauge32 659 | | | {operator-actions}? 660 | | +--ro cleared-closed? yang:gauge32 661 | | | {operator-actions}? 662 | | +--ro not-cleared-closed? yang:gauge32 663 | | | {operator-actions}? 664 | | +--ro not-cleared-not-closed? yang:gauge32 665 | | {operator-actions}? 666 | +--ro shelves-active? empty {alarm-shelving}? 667 +--ro alarm-list 668 | +--ro number-of-alarms? yang:gauge32 669 | +--ro last-changed? yang:date-and-time 670 | +--ro alarm* [resource alarm-type-id alarm-type-qualifier] 671 | +--ro time-created yang:date-and-time 672 | +--ro resource resource 673 | +--ro alarm-type-id alarm-type-id 674 | +--ro alarm-type-qualifier alarm-type-qualifier 675 | +--ro alt-resource* resource 676 | +--ro related-alarm* 677 | | [resource alarm-type-id alarm-type-qualifier] 678 | | +--ro resource 679 | | | -> /alarms/alarm-list/alarm/resource 680 | | +--ro alarm-type-id leafref 681 | | +--ro alarm-type-qualifier leafref 682 | +--ro impacted-resource* resource 683 | +--ro root-cause-resource* resource 684 | +--ro is-cleared boolean 685 | +--ro last-changed yang:date-and-time 686 | +--ro perceived-severity severity 687 | +--ro alarm-text alarm-text 688 | +--ro status-change* [time] {alarm-history}? 689 | | +--ro time yang:date-and-time 690 | | +--ro perceived-severity severity-with-clear 691 | | +--ro alarm-text alarm-text 692 | +--ro operator-state-change* [time] {operator-actions}? 693 | | +--ro time yang:date-and-time 694 | | +--ro operator string 695 | | +--ro state operator-state 696 | | +--ro text? 
string 697 | +---x set-operator-state {operator-actions}? 698 | +---w input 699 | +---w state operator-state 700 | +---w text? string 701 +--ro shelved-alarms {alarm-shelving}? 702 +--ro number-of-shelved-alarms? yang:gauge32 703 +--ro alarm-shelf-last-changed? yang:date-and-time 704 +--ro shelved-alarm* 706 [resource alarm-type-id alarm-type-qualifier] 707 +--ro resource resource 708 +--ro alarm-type-id alarm-type-id 709 +--ro alarm-type-qualifier alarm-type-qualifier 710 +--ro alt-resource* resource 711 +--ro related-alarm* 712 | [resource alarm-type-id alarm-type-qualifier] 713 | +--ro resource 714 | | -> /alarms/alarm-list/alarm/resource 715 | +--ro alarm-type-id leafref 716 | +--ro alarm-type-qualifier leafref 717 +--ro impacted-resource* resource 718 +--ro root-cause-resource* resource 719 +--ro is-cleared boolean 720 +--ro last-changed yang:date-and-time 721 +--ro perceived-severity severity 722 +--ro alarm-text alarm-text 723 +--ro status-change* [time] {alarm-history}? 724 | +--ro time yang:date-and-time 725 | +--ro perceived-severity severity-with-clear 726 | +--ro alarm-text alarm-text 727 +--ro operator-state-change* [time] {operator-actions}? 728 +--ro time yang:date-and-time 729 +--ro operator string 730 +--ro state operator-state 731 +--ro text? string 733 6.1. Alarm Control 735 The "/alarms/control/notify-status-changes" leaf controls if 736 notifications are sent for all state changes, severity change and 737 alarm text change, or just for new and cleared alarms. 739 Every alarm has a list of status changes, this is a circular list. 740 The length of this list is controlled by "/alarms/control/max-alarm- 741 status-changes". 743 6.1.1. Alarm Shelving 745 The shelving control tree is shown below: 747 +--rw alarm-shelving {alarm-shelving}? 748 +--rw shelf* [shelf-name] 749 +--rw shelf-name string 750 +--rw resource? resource 751 +--rw alarm-type-id? alarm-type-id 752 +--rw alarm-type-qualifier? 
alarm-type-qualifier

754      Shelved alarms are shown in a dedicated shelved alarm list.  The
755      instrumentation MUST move shelved alarms from the alarm list
756      (/alarms/alarm-list) to the shelved alarm list
757      (/alarms/shelved-alarms/).  Shelved alarms do not generate any
758      notifications.  When the shelving criteria are removed or changed,
759      the alarm list MUST be updated to the actual state of the alarms.

761      A leaf (/alarms/summary/shelves-active) in the alarm summary
762      indicates whether there are shelved alarms.

764      A system can choose not to support the shelving feature.

766   6.2.  Alarm Inventory

768      The alarm inventory represents all possible alarm types that may
769      occur in the system.  A management system may use this to build alarm
770      procedures.  The alarm inventory is relevant for several reasons:

772         The system might not instrument all alarm type identities.

774         The system has configured dynamic alarm types using the alarm
775         qualifier.  The inventory makes it possible for the management
776         system to discover these.

778      Note that the mechanism whereby dynamic alarm types are added using
779      the alarm type qualifier MUST populate this list.

781      The optional leaf-list "resource" in the alarm inventory enables the
782      system to publish for which resources a given alarm type may appear.

784      The alarm inventory tree is shown below:

786        +--ro alarm-inventory
787           +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
788              +--ro alarm-type-id           alarm-type-id
789              +--ro alarm-type-qualifier    alarm-type-qualifier
790              +--ro resource*               string
791              +--ro has-clear               boolean
792              +--ro description             string

794   6.3.  Alarm Summary

796      The alarm summary list summarises alarms per severity: how many are
797      cleared, cleared and closed, and closed.  It also indicates whether
798      there are shelved alarms.

800   6.4.  The Alarm List

802      The alarm list (/alarms/alarm-list) is a function from (resource,
803      alarm type, alarm type qualifier) to the current alarm state.
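
   As a rough illustration of this "function" view, the following
   Python sketch models the alarm list as a dictionary keyed by the
   (resource, alarm type, alarm type qualifier) tuple.  The names
   AlarmState, AlarmList and report are hypothetical and not defined
   by the module; the sketch only assumes that a report for an
   existing key updates that entry and that a clear sets a flag
   instead of deleting the entry.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Key of the alarm list: (resource, alarm-type-id, alarm-type-qualifier).
AlarmKey = Tuple[str, str, str]


@dataclass
class AlarmState:
    """Hypothetical subset of the state kept for each alarm."""
    is_cleared: bool
    perceived_severity: str
    alarm_text: str


class AlarmList:
    """Illustrative stand-in for /alarms/alarm-list (not a real API)."""

    def __init__(self) -> None:
        self.alarms: Dict[AlarmKey, AlarmState] = {}

    def report(self, resource: str, type_id: str, qualifier: str,
               severity: str, text: str) -> None:
        key = (resource, type_id, qualifier)
        alarm = self.alarms.get(key)
        if severity == "cleared":
            # A clear only flips the flag; the entry stays in the list
            # and keeps its last perceived severity.
            if alarm is not None:
                alarm.is_cleared = True
                alarm.alarm_text = text
            return
        if alarm is None:
            # First time this alarm type is active on this resource.
            self.alarms[key] = AlarmState(False, severity, text)
        else:
            # Same key: update the existing alarm, do not add a new one.
            alarm.is_cleared = False
            alarm.perceived_severity = severity
            alarm.alarm_text = text


alarm_list = AlarmList()
key = ("GigabitEthernet0/25", "link-alarm", "")
alarm_list.report(*key, "major", "interface down")
alarm_list.report(*key, "cleared", "interface up")
```

   After the two reports the list still holds a single entry for the
   key, now marked as cleared, which mirrors how raise and clear refer
   to the same alarm.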

805        +--ro alarm-list
806           +--ro number-of-alarms?   yang:gauge32
807           +--ro last-changed?       yang:date-and-time
808           +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
809              +--ro time-created            yang:date-and-time
810              +--ro resource                resource
811              +--ro alarm-type-id           alarm-type-id
812              +--ro alarm-type-qualifier    alarm-type-qualifier
813              +--ro alt-resource*           resource
814              +--ro related-alarm*
815              |       [resource alarm-type-id alarm-type-qualifier]
816              |  +--ro resource
817              |  |       -> /alarms/alarm-list/alarm/resource
818              |  +--ro alarm-type-id           leafref
819              |  +--ro alarm-type-qualifier    leafref
820              +--ro impacted-resource*      resource
821              +--ro root-cause-resource*    resource
822              +--ro is-cleared              boolean
823              +--ro last-changed            yang:date-and-time
824              +--ro perceived-severity      severity
825              +--ro alarm-text              alarm-text
826              +--ro status-change* [time] {alarm-history}?
827              |  +--ro time                  yang:date-and-time
828              |  +--ro perceived-severity    severity-with-clear
829              |  +--ro alarm-text            alarm-text
830              +--ro operator-state-change* [time] {operator-actions}?
831              |  +--ro time        yang:date-and-time
832              |  +--ro operator    string
833              |  +--ro state       operator-state
834              |  +--ro text?       string
835              +---x set-operator-state {operator-actions}?
836                 +---w input
837                    +---w state    operator-state
838                    +---w text?    string

840      Every alarm has three important states: the resource clearance state
841      "is-cleared", the severity "perceived-severity", and the operator
842      state available in the operator state change list.

844      To show the alarm history, the resource state changes are
845      available in the "status-change" list and the operator history is
846      available in the "operator-state-change" list.

848   6.5.  The Shelved Alarms List

850      The shelved alarm list has the same structure as the alarm list
851      above.  It shows all the alarms that match the shelving criteria
852      (/alarms/control/alarm-shelving).

854   6.6.
RPCs

856      The alarm module supports RPCs and actions to manage alarms:

858         "purge-alarms" (RPC): delete alarms according to specific
859         criteria, for example all cleared alarms older than a specific
860         date.

862         "compress" and "compress-alarms" RPCs: compress the status-change
863         list for the alarms.

865         "set-operator-state" action: change the operator state for an
866         alarm, for example to acknowledge it.

868   6.7.  Notifications

870      The alarm module supports a general notification to report alarm
871      state changes.  It carries all relevant parameters for the alarm
872      management application.

874      There is also a notification to report that an operator changed the
875      operator state on an alarm, for example by acknowledging it.

877      If the alarm inventory is changed, for example because a new card
878      type is inserted, a notification tells the management application
879      that new alarm types are available.

881   7.  Alarm YANG Module

883      file "ietf-alarms.yang"
884      module ietf-alarms {
885        yang-version 1.1;
886        namespace "urn:ietf:params:xml:ns:yang:ietf-alarms";
887        prefix al;

889        import ietf-yang-types {
890          prefix yang;
891        }

893        organization
894          "IETF NETMOD (NETCONF Data Modeling Language) Working Group";

896        contact
897          "WG Web:
898           WG List:

900           Editor:   Stefan Vallin
901

903           Editor:   Martin Bjorklund
904          ";

906        description
907          "This module defines an interface for managing alarms.  Main
908           inputs to the module design are the 3GPP Alarm IRP, ITU-T X.733
909           and ANSI/ISA-18.2 alarm standards.

911           Main features of this module include:

913             * Alarm list:
914                       A list of all alarms.  Cleared alarms stay in
915                       the list until explicitly removed.

917             * Operator actions on alarms:
918                       Acknowledging and closing alarms.

920             * Administrative actions on alarms:
921                       Purging alarms from the list according to specific
922                       criteria.

924             * Alarm inventory:
925                       A management application can read all
926                       alarm types implemented by the system.

928             * Alarm shelving:
929                       Shelving (blocking) alarms according
930                       to specific criteria.

932           This module uses a stateful view on alarms.  An alarm is a state
933           for a specific resource (note that an alarm is not a
934           notification).  An alarm type is a possible alarm state for a
935           resource.  For example, the tuple:

937             ('link-alarm', 'GigabitEthernet0/25')

939           is an alarm of type 'link-alarm' on the resource
940           'GigabitEthernet0/25'.

942           Alarm types are identified using YANG identities and an optional
943           string-based qualifier.  The string-based qualifier allows for
944           dynamic extension of the statically defined alarm types.  Alarm
945           types identify a possible alarm state and not the individual
946           notifications.  For example, the traditional 'link-down' and
947           'link-up' notifications are two notifications referring to the
948           same alarm type 'link-alarm'.

950           With this design there is no ambiguity about how alarm and alarm
951           clear correlation should be performed: notifications that report
952           the same resource and alarm type are considered updates of the
953           same alarm, such as clearing an active alarm or changing the
954           severity of an alarm.

956           The instrumentation can update 'severity' and 'alarm-text' on an
957           existing alarm.
The above alarm example can therefore look
958           like:

960             (('link-alarm', 'GigabitEthernet0/25'),
961              warning,
962              'interface down while interface admin state is up')

964           There is a clear separation between updates on the alarm from
965           the underlying resource, like clear, and updates from an
966           operator like acknowledge or closing an alarm:

968             (('link-alarm', 'GigabitEthernet0/25'),
969              warning,
970              'interface down while interface admin state is up',
971              cleared,
972              closed)

974           Administrative actions like removing closed alarms older than a
975           given time are supported.";

977        revision 2016-10-27 {
978          description
979            "Initial revision.";
980          reference
981            "RFC XXXX: YANG Alarm Module";
982        }

984        /*
985         * Features
986         */

988        feature operator-actions {
989          description
990            "This feature means that the system supports operator states
991             on alarms.";
992        }

994        feature alarm-shelving {
995          description
996            "This feature means that the system supports shelving
997             (blocking) alarms.";
998        }

1000       feature alarm-history {
1001         description
1002           "This feature means that the alarm list also maintains a
1003            history of state changes for each alarm.  For example, if an
1004            alarm toggles between cleared and active 10 times, a list for
1005            that alarm will show those state changes with time-stamps.";
1006       }
1007       /*
1008        * Identities
1009        */

1011       identity alarm-identity {
1012         description
1013           "Base identity for alarm types.  A unique identification of the
1014            alarm, not including the resource.  Different resources can
1015            share alarm types.  If the resource reports the same alarm
1016            type, it is to be considered to be the same alarm.  The alarm
1017            type is a simplification of the different X.733 and 3GPP alarm
1018            IRP alarm correlation mechanisms and it allows for
1019            hierarchical extensions.

1021            A string-based qualifier can be used in addition to the
1022            identity in order to have different alarm types based on
1023            information not known at design-time, such as values in
1024            textual SNMP Notification var-binds.

1026            Standards and vendors can define sub-identities to clearly
1027            identify specific alarm types.

1029            This identity is abstract and shall not be used for alarms.";
1030       }

1032       /*
1033        * Common types
1034        */

1036       typedef resource {
1037         type union {
1038           type instance-identifier {
1039             require-instance false;
1040           }
1041           type yang:object-identifier;
1042           type string;
1043         }
1044         description
1045           "This is an identification of the alarming resource, such as an
1046            interface.  It should be as fine-grained as possible both to
1047            guide the operator and to guarantee uniqueness of the
1048            alarms.  If a resource has both a config and a state tree
1049            normally this should identify the state tree,
1050            (e.g., /interfaces-state/interface/name).
1051            But if the instrumentation can detect a broken config, this
1052            should be identified as the resource.
1053            If the alarming resource is modelled in YANG, this
1054            type will be an instance-identifier.  If the resource is an
1055            SNMP object, the type will be an object-identifier.  If the
1056            resource is anything else, for example a distinguished name or
1057            a CIM path, this type will be a string.";
1058       }

1060       typedef alarm-text {
1061         type string;
1062         description
1063           "The string used to inform operators about the alarm.  This
1064            MUST contain enough information for an operator to be able
1065            to understand the problem and how to resolve it.  If this
1066            string contains structure, this format should be clearly
1067            documented for programs to be able to parse that
1068            information.";
1069       }

1071       typedef severity {
1072         type enumeration {
1073           enum indeterminate {
1074             value 2;
1075             description
1076               "Indicates that the severity level could not be
1077                determined.
This level SHOULD be avoided.";
1078           }
1079           enum minor {
1080             value 3;
1081             description
1082               "The 'minor' severity level indicates the existence of a
1083                non-service affecting fault condition and that corrective
1084                action should be taken in order to prevent a more serious
1085                (for example, service affecting) fault.  Such a severity
1086                can be reported, for example, when the detected alarm
1087                condition is not currently degrading the capacity of the
1088                resource.";
1089           }
1090           enum warning {
1091             value 4;
1092             description
1093               "The 'warning' severity level indicates the detection of
1094                a potential or impending service affecting fault, before
1095                any significant effects have been felt.  Action should be
1096                taken to further diagnose (if necessary) and correct the
1097                problem in order to prevent it from becoming a more
1098                serious service affecting fault.";
1099           }
1100           enum major {
1101             value 5;
1102             description
1103               "The 'major' severity level indicates that a service
1104                affecting condition has developed and an urgent
1105                corrective action is required.  Such a severity can be
1106                reported, for example, when there is a severe
1107                degradation in the capability of the resource
1108                and its full capability must be restored.";
1109           }
1110           enum critical {
1111             value 6;
1112             description
1113               "The 'critical' severity level indicates that a service
1114                affecting condition has occurred and an immediate
1115                corrective action is required.  Such a severity can be
1116                reported, for example, when a resource becomes totally
1117                out of service and its capability must be restored.";
1118           }
1119         }
1120         description
1121           "The severity level of the alarm.  Note well that value 'clear'
1122            is not included.
Whether an alarm is cleared is a separate
1123            boolean flag.";
1124         reference
1125           "ITU Recommendation X.733: Information Technology
1126              - Open Systems Interconnection
1127              - System Management: Alarm Reporting Function";
1128       }

1130       typedef severity-with-clear {
1131         type union {
1132           type enumeration {
1133             enum cleared {
1134               value 1;
1135               description
1136                 "The alarm is cleared by the instrumentation.";
1137             }
1138           }
1139           type severity;
1140         }
1141         description
1142           "The severity level of the alarm including clear.
1143            This is used *only* in notifications reporting state changes
1144            for an alarm.";
1145       }

1147       typedef operator-state {
1148         type enumeration {
1149           enum none {
1150             value 1;
1151             description
1152               "The alarm is not being taken care of.";
1153           }
1154           enum ack {
1155             value 2;
1156             description
1157               "The alarm is being taken care of.  Corrective action not
1158                taken yet, or failed";
1159           }
1160           enum closed {
1161             value 3;
1162             description
1163               "Corrective action taken successfully.";
1164           }
1165           enum shelved {
1166             value 4;
1167             description
1168               "Alarm shelved.  Alarms in alarms/shelved-alarms/
1169                MUST be assigned this operator state by the server as
1170                the last entry in the operator-state-change list.";
1171           }
1172           enum un-shelved {
1173             value 5;
1174             description
1175               "Alarm moved back to alarm-list from shelf.
1176                Alarms 'moved' from /alarms/shelved-alarms/
1177                to /alarms/alarm-list MUST be assigned this
1178                state by the server as the last entry in the
1179                operator-state-change list.";
1180           }

1182         }
1183         description
1184           "Operator states on an alarm.  The 'closed' state indicates
1185            that an operator considers the alarm being resolved.  This
1186            is separate from the resource alarm clear flag.";
1187       }

1189       /* Alarm type */

1191       typedef alarm-type-id {
1192         type identityref {
1193           base alarm-identity;
1194         }
1195         description
1196           "Identifies an alarm type.
The description of the alarm type
1197            id MUST indicate if the alarm type is abstract or not.  An
1198            abstract alarm type is used as a base for other alarm type ids
1199            and will not be used as a value for an alarm or be present in
1200            the alarm inventory.";
1201       }

1203       typedef alarm-type-qualifier {
1204         type string;
1205         description
1206           "If an alarm type cannot be fully specified at design time by
1207            alarm-type-id, this string qualifier is used in addition to
1208            fully define a unique alarm type.

1210            The definition of alarm qualifiers is considered to be part
1211            of the instrumentation and out of scope for this module.
1212            An empty string is used when this is part of a key.";
1213       }

1215       /*
1216        * Groupings
1217        */

1219       grouping common-alarm-parameters {
1220         description
1221           "Common parameters for an alarm.

1223            This grouping is used both in the alarm list and in the
1224            notification representing an alarm state change.";

1226         leaf resource {
1227           type resource;
1228           mandatory true;
1229           description
1230             "The alarming resource.  See also 'alt-resource'.
1231              This could for example be a reference to the alarming
1232              interface";
1233         }

1235         leaf alarm-type-id {
1236           type alarm-type-id;
1237           mandatory true;
1238           description
1239             "This leaf and the leaf 'alarm-type-qualifier' together
1240              provide a unique identification of the alarm type.";
1241         }

1243         leaf alarm-type-qualifier {
1244           type alarm-type-qualifier;
1245           description
1246             "This leaf is used when the 'alarm-type-id' leaf cannot
1247              uniquely identify the alarm type.  Normally, this is not
1248              the case, and this leaf is the empty string.";
1249         }

1251         leaf-list alt-resource {
1252           type resource;
1253           description
1254             "Used if the alarming resource is available over other
1255              interfaces.
This field can contain SNMP OIDs, CIM paths, or
1256              3GPP Distinguished Names, for example.";
1257         }

1259         list related-alarm {
1260           key "resource alarm-type-id alarm-type-qualifier";

1262           description
1263             "References to related alarms.  Note that the related alarm
1264              might have been removed from the alarm list.";

1266           leaf resource {
1267             type leafref {
1268               path "/alarms/alarm-list/alarm/resource";
1269               require-instance false;
1270             }
1271             description
1272               "The alarming resource for the related alarm.";
1273           }
1274           leaf alarm-type-id {
1275             type leafref {
1276               path "/alarms/alarm-list/alarm"
1277                  + "[resource=current()/../resource]"
1278                  + "/alarm-type-id";
1279               require-instance false;
1280             }
1281             description
1282               "The alarm type identifier for the related alarm.";
1283           }
1284           leaf alarm-type-qualifier {
1285             type leafref {
1286               path "/alarms/alarm-list/alarm"
1287                  + "[resource=current()/../resource]"
1288                  + "[alarm-type-id=current()/../alarm-type-id]"
1289                  + "/alarm-type-qualifier";
1290               require-instance false;
1291             }
1292             description
1293               "The alarm qualifier for the related alarm.";
1294           }
1295         }
1296         leaf-list impacted-resource {
1297           type resource;
1298           description
1299             "Resources that might be affected by this alarm.  If the
1300              system creates an alarm on a resource and also has a mapping
1301              to other resources that might be impacted, these resources
1302              can be listed in this leaf-list.  In this way the system can
1303              create one alarm instead of several.  For example, if an
1304              interface has an alarm, the 'impacted-resource' can reference
1305              the aggregated port channels.";
1306         }
1307         leaf-list root-cause-resource {
1308           type resource;
1309           description
1310             "Resources that are candidates for causing the alarm.  If the
1311              system has a mechanism to understand the candidate root
1312              causes of an alarm, this leaf-list can be used to list the
1313              root cause candidate resources.  In this way the system can
1314              create one alarm instead of several.
An example might be a
1315              logging system (alarm resource) that fails; the alarm can
1316              reference the file-system in the 'root-cause-resource'
1317              leaf-list.";
1318         }
1319       }

1321       grouping alarm-state-change-parameters {
1322         description
1323           "Parameters for an alarm state change.

1325            This grouping is used both in the alarm list's
1326            status-change list and in the notification representing an
1327            alarm state change.";

1329         leaf time {
1330           type yang:date-and-time;
1331           mandatory true;
1332           description
1333             "The time the status of the alarm changed.  The value
1334              represents the time the real alarm state change appeared
1335              in the resource and not when it was added to the
1336              alarm list.  The /alarm-list/alarm/last-changed MUST be
1337              set to the same value.";
1338         }
1339         leaf perceived-severity {
1340           type severity-with-clear;
1341           mandatory true;
1342           description
1343             "The severity of the alarm as defined by X.733.  Note
1344              that this may not be the original severity since the alarm
1345              may have changed severity.";
1346           reference
1347             "ITU Recommendation X.733: Information Technology
1348                - Open Systems Interconnection
1349                - System Management: Alarm Reporting Function";
1350         }
1351         leaf alarm-text {
1352           type alarm-text;
1353           mandatory true;
1354           description
1355             "A user-friendly text describing the alarm state change.";
1356           reference
1357             "ITU Recommendation X.733: Information Technology
1358                - Open Systems Interconnection
1359                - System Management: Alarm Reporting Function";
1360         }
1361       }

1363       grouping operator-parameters {
1364         description
1365           "This grouping defines parameters that can
1366            be changed by an operator";
1367         leaf time {
1368           type yang:date-and-time;
1369           mandatory true;
1370           description
1371             "Timestamp for operator action on alarm.";
1372         }
1373         leaf operator {
1374           type string;
1375           mandatory true;
1376           description
1377             "The name of the operator that has acted on this
1378              alarm.";
1379         }
1380         leaf state {
1381           type operator-state;
1382
mandatory true;
1383           description
1384             "The operator's view of the alarm state.";
1385         }
1386         leaf text {
1387           type string;
1388           description
1389             "Additional optional textual information provided by
1390              the operator.";
1391         }
1392       }

1394       grouping resource-alarm-parameters {
1395         description
1396           "Alarm parameters that originate from the resource view.";
1397         leaf is-cleared {
1398           type boolean;
1399           mandatory true;
1400           description
1401             "Indicates the current clearance state of the alarm.  An
1402              alarm might toggle from active alarm to cleared alarm and
1403              back to active again.";
1404         }

1406         leaf last-changed {
1407           type yang:date-and-time;
1408           mandatory true;
1409           description
1410             "A timestamp when the alarm status was last changed.  Status
1411              changes are changes to 'is-cleared', 'perceived-severity',
1412              and 'alarm-text'.";
1413         }

1415         leaf perceived-severity {
1416           type severity;
1417           mandatory true;
1418           description
1419             "The last severity of the alarm.

1421              If an alarm was raised with severity 'warning', but later
1422              changed to 'major', this leaf will show 'major'.";
1423         }

1425         leaf alarm-text {
1426           type alarm-text;
1427           mandatory true;
1428           description
1429             "The last reported alarm text.  This text should contain
1430              information for an operator to be able to understand
1431              the problem and how to resolve it.";
1432         }

1434         list status-change {
1435           if-feature alarm-history;
1436           key time;
1437           min-elements 1;
1438           description
1439             "A list of status change events for this alarm.

1441              The entry with latest time-stamp in this list MUST
1442              correspond to the leafs 'is-cleared', 'perceived-severity'
1443              and 'alarm-text' for the alarm.  The time-stamp for that
1444              entry MUST be equal to the 'last-changed' leaf.

1446              This list is ordered according to the timestamps of
1447              alarm state changes.  The last item corresponds to the
1448              latest state change.

1450              The following state changes create an entry in this
1451              list:
1452              - changed severity (warning, minor, major, critical)
1453              - clearance status, this also updates the 'is-cleared'
1454                leaf
1455              - alarm text update";

1457           uses alarm-state-change-parameters;
1458         }
1459       }

1461       /*
1462        * The /alarms data tree
1463        */

1465       container alarms {
1466         description
1467           "The top container for this module";
1468         container control {
1469           description
1470             "Configuration to control the alarm behaviour.";
1471           leaf max-alarm-status-changes {
1472             type union {
1473               type uint16;
1474               type enumeration {
1475                 enum infinite {
1476                   description
1477                     "The status change entries are accumulated
1478                      infinitely.";
1479                 }
1480               }
1481             }
1482             default 32;
1483             description
1484               "The status-change entries are kept in a circular list
1485                per alarm.  When this number is exceeded, the oldest
1486                status change entry is automatically removed.  If the
1487                value is 'infinite', the status change entries are
1488                accumulated infinitely.";
1489           }

1491           leaf notify-status-changes {
1492             type boolean;
1493             default false;
1494             description
1495               "This leaf controls whether notifications are sent on all
1496                alarm status updates, e.g., updated perceived-severity or
1497                alarm-text.  By default the notifications are only sent
1498                when a new alarm is raised, re-raised after being cleared
1499                and when an alarm is cleared.";
1500           }
1501           container alarm-shelving {
1502             if-feature alarm-shelving;
1503             description
1504               "This list is used to shelve alarms.  The server will move
1505                any alarms corresponding to the shelving criteria from the
1506                alarms/alarm-list/alarm list to the
1507                alarms/shelved-alarms/shelved-alarm list.  It will also
1508                stop sending notifications for the shelved alarms.  The
1509                conditions in the shelf criteria are logically ANDed.
1510                When the shelving criteria are deleted or changed, the
1511                non-matching alarms MUST appear in the
1512                alarms/alarm-list/alarm list according to the real state.
1513                This means that the instrumentation MUST maintain states
1514                for the shelved alarms.  Alarms that match the criteria
1515                shall have an operator-state 'shelved'.";
1516             list shelf {
1517               key shelf-name;
1518               leaf shelf-name {
1519                 type string;
1520                 description
1521                   "An arbitrary name for the alarm shelf.";
1522               }
1523               description
1524                 "Each entry defines the criteria for shelving alarms.
1525                  Criteria are ANDed.";

1527               leaf resource {
1528                 type resource;
1529                 description
1530                   "Shelve alarms for this resource.";
1531               }
1532               leaf alarm-type-id {
1533                 type alarm-type-id;
1534                 description
1535                   "Shelve alarms for this alarm type identifier.";
1536               }
1537               leaf alarm-type-qualifier {
1538                 type alarm-type-qualifier;
1539                 description
1540                   "Shelve alarms for this alarm type qualifier.";
1541               }
1542               leaf description {
1543                 type string;
1544                 description
1545                   "An optional textual description of the shelf.  This
1546                    description should include the reason for shelving
1547                    these alarms.";
1548               }
1549             }
1550           }
1551         }

1553         container alarm-inventory {
1554           config false;
1555           description
1556             "This list contains all possible alarm types for the system.
1557              If the system knows for which resources a specific alarm
1558              type can appear, this is also identified in the inventory.
1559              The list also tells if each alarm type has a corresponding
1560              clear state.  The inventory shall only contain concrete
1561              alarm types.

1563              The alarm inventory MUST be updated by the system when new
1564              alarms can appear.  This can be the case when installing new
1565              software modules or inserting new card types.
A
1566              notification 'alarm-inventory-changed' is sent when the
1567              inventory is changed.";

1569           list alarm-type {
1570             key "alarm-type-id alarm-type-qualifier";
1571             description
1572               "An entry in this list defines a possible alarm.";
1573             leaf alarm-type-id {
1574               type alarm-type-id;
1575               mandatory true;
1576               description
1577                 "The statically defined alarm type identifier for this
1578                  possible alarm.";
1579             }
1580             leaf alarm-type-qualifier {
1581               type alarm-type-qualifier;
1582               description
1583                 "The optionally dynamically defined alarm type identifier
1584                  for this possible alarm.";
1585             }
1586             leaf-list resource {
1587               type string;
1588               description
1589                 "Optionally, specifies for which resources the alarm type
1590                  is valid.  This string is for human consumption but
1591                  SHOULD refer to paths in the model.";
1592             }
1593             leaf has-clear {
1594               type boolean;
1595               mandatory true;
1596               description
1597                 "This leaf tells the operator if the alarm will be
1598                  cleared when the correct corrective action has been
1599                  taken.  Implementations SHOULD strive for detecting the
1600                  cleared state for all alarm types.  If this leaf is
1601                  true, the operator can monitor the alarm until it
1602                  becomes cleared after the corrective action has been
1603                  taken.  If this leaf is false the operator needs to
1604                  validate that the alarm is no longer active using other
1605                  mechanisms.  Alarms can lack a corresponding clear due
1606                  to missing instrumentation or because there is no logical
1607                  corresponding clear state.";
1608             }
1609             leaf description {
1610               type string;
1611               mandatory true;
1612               description
1613                 "A description of the possible alarm.
It SHOULD include
1614                  information on possible underlying root causes and
1615                  corrective actions.";
1616             }
1617           }
1618         }

1620         container summary {
1621           config false;
1622           description
1623             "This container gives a summary of the number of alarms
1624              and shelved alarms";
1625           list alarm-summary {
1626             key severity;
1627             description
1628               "A global summary of all alarms in the system.";
1629             leaf severity {
1630               type severity;
1631               description
1632                 "Alarm summary for this severity level.";
1633             }
1634             leaf total {
1635               type yang:gauge32;
1636               description
1637                 "Total number of alarms of this severity level.";
1638             }
1639             leaf cleared {
1640               type yang:gauge32;
1641               description
1642                 "For this severity level, the number of alarms that are
1643                  cleared.";
1644             }
1645             leaf cleared-not-closed {
1646               if-feature operator-actions;
1647               type yang:gauge32;
1648               description
1649                 "For this severity level, the number of alarms that are
1650                  cleared but not closed.";
1651             }
1652             leaf cleared-closed {
1653               if-feature operator-actions;
1654               type yang:gauge32;
1655               description
1656                 "For this severity level, the number of alarms that are
1657                  cleared and closed.";
1658             }
1659             leaf not-cleared-closed {
1660               if-feature operator-actions;
1661               type yang:gauge32;
1662               description
1663                 "For this severity level, the number of alarms that are
1664                  not cleared but closed.";
1665             }
1666             leaf not-cleared-not-closed {
1667               if-feature operator-actions;
1668               type yang:gauge32;
1669               description
1670                 "For this severity level, the number of alarms that are
1671                  not cleared and not closed.";
1672             }
1673           }
1674           leaf shelves-active {
1675             if-feature alarm-shelving;
1676             type empty;
1677             description
1678               "This is a hint to the operator that there are active
1679                alarm shelves.
This leaf MUST exist if the
1680                alarms/shelved-alarms/number-of-shelved-alarms is > 0.";
1681           }
1682         }

1684         container alarm-list {
1685           config false;
1686           description
1687             "The alarms in the system.";
1688           leaf number-of-alarms {
1689             type yang:gauge32;
1690             description
1691               "This object shows the total number of
1692                alarms in the system, i.e., the total number
1693                of entries in the alarm list.";
1694           }

1696           leaf last-changed {
1697             type yang:date-and-time;
1698             description
1699               "A timestamp when the alarm list was last
1700                changed.  The value can be used by a manager to
1701                initiate an alarm resynchronization procedure.";
1702           }

1704           list alarm {
1705             key "resource alarm-type-id alarm-type-qualifier";

1707             description
1708               "The list of alarms.  Each entry in the list holds one
1709                alarm for a given alarm type and resource.
1710                An alarm can be updated from the underlying resource or
1711                by the user.  The following leafs are maintained by the
1712                resource: is-cleared, last-changed, perceived-severity,
1713                and alarm-text.  An operator can change: operator-state
1714                and operator-text.

1716                Entries appear in the alarm list the first time an
1717                alarm becomes active for a given alarm-type and resource.
1718                Entries do not get deleted when the alarm is cleared; this
1719                is a boolean state in the alarm.

1721                Alarm entries are removed, purged, from the list by an
1722                explicit purge action.  For example, delete all alarms
1723                that are cleared and in closed operator-state that are
1724                older than 24 hours.  Systems may also remove alarms based
1725                on locally configured policies which is out of scope for
1726                this module.";
1727             leaf time-created {
1728               type yang:date-and-time;
1729               mandatory true;
1730               description
1731                 "The time-stamp when this alarm entry was created.  This
1732                  represents the first time the alarm appeared, it can
1733                  also represent that the alarm re-appeared after a purge.
1734                  Further state-changes of the same alarm do not change
1735                  this leaf; these changes will update the 'last-changed'
1736                  leaf.";
1737             }

1739             uses common-alarm-parameters;
1740             uses resource-alarm-parameters;
1741             list operator-state-change {
1742               if-feature operator-actions;
1743               key time;
1744               description
1745                 "This list is used by operators to indicate
1746                  the state of human intervention on an alarm.
1747                  For example, if an operator has seen an alarm,
1748                  the operator can add a new item to this list indicating
1749                  that the alarm is acknowledged.";
1750               uses operator-parameters;
1751             }

1753             action set-operator-state {
1754               if-feature operator-actions;
1755               description
1756                 "This is a means for the operator to indicate
1757                  the level of human intervention on an alarm.";
1758               input {
1759                 leaf state {
1760                   type operator-state;
1761                   mandatory true;
1762                   description
1763                     "Set this operator state.";
1764                 }
1765                 leaf text {
1766                   type string;
1767                   description
1768                     "Additional optional textual information.";
1769                 }
1770               }
1771             }
1772           }
1773         }

1775         container shelved-alarms {
1776           if-feature alarm-shelving;
1777           config false;
1778           description
1779             "The shelved alarms.  Alarms appear here if they match the
1780              criteria in /alarms/control/alarm-shelving.  This list does
1781              not generate any notifications.  The list represents alarms
1782              that are considered not relevant by the operator.  Alarms in
1783              this list have an operator-state of 'shelved'.  This cannot
1784              be changed.";
1785           leaf number-of-shelved-alarms {
1786             type yang:gauge32;
1787             description
1788               "This object shows the total number of currently shelved
1789                alarms, i.e., the total number of entries
1790                in the shelved alarm list.";
1791           }

1793           leaf alarm-shelf-last-changed {
1794             type yang:date-and-time;
1795             description
1796               "A timestamp when the shelved alarm list was last
1797                changed.
           The value can be used by a manager to
           initiate an alarm resynchronization procedure.";
      }

      list shelved-alarm {
        key "resource alarm-type-id alarm-type-qualifier";

        description
          "The list of shelved alarms. Each entry in the list holds
           one alarm for a given alarm type and resource. An alarm
           can be updated from the underlying resource or by the
           user. These changes are reflected in different lists
           below the corresponding alarm.";

        uses common-alarm-parameters;
        uses resource-alarm-parameters;

        list operator-state-change {
          if-feature operator-actions;
          key time;
          description
            "This list is used by operators to indicate
             the state of human intervention on an alarm.
             For example, if an operator has seen an alarm,
             the operator can add a new item to this list indicating
             that the alarm is acknowledged.";
          uses operator-parameters;
        }
      }
    }
  }

  /*
   * Operations
   */

  rpc compress-alarms {
    if-feature alarm-history;
    description
      "This operation requests the server to compress entries in the
       alarm list by removing all but the latest state change for all
       alarms. Conditions in the input are logically ANDed.
       If no
       input condition is given, all alarms are compressed.";
    input {
      leaf resource {
        type leafref {
          path "/alarms/alarm-list/alarm/resource";
          require-instance false;
        }
        description
          "Compress the alarms with this resource.";
      }
      leaf alarm-type-id {
        type leafref {
          path "/alarms/alarm-list/alarm/alarm-type-id";
        }
        description
          "Compress alarms with this alarm-type-id.";
      }
      leaf alarm-type-qualifier {
        type leafref {
          path "/alarms/alarm-list/alarm/alarm-type-qualifier";
        }
        description
          "Compress the alarms with this alarm-type-qualifier.";
      }
    }
    output {
      leaf compressed-alarms {
        type uint32;
        description
          "Number of compressed alarm entries.";
      }
    }
  }

  grouping filter-input {
    description
      "Grouping to specify a filter construct on alarm information.";
    leaf alarm-status {
      type enumeration {
        enum any {
          description
            "Ignore alarm clearance status.";
        }
        enum cleared {
          description
            "Filter cleared alarms.";
        }
        enum not-cleared {
          description
            "Filter not cleared alarms.";
        }
      }
      mandatory true;
      description
        "The clearance status of the alarm.";
    }

    container older-than {
      presence "Age specification";
      description
        "Matches the 'last-status-change' leaf in the alarm.";
      choice age-spec {
        description
          "Filter using date and time age.";
        case seconds {
          leaf seconds {
            type uint16;
            description
              "Seconds part";
          }
        }
        case minutes {
          leaf minutes {
            type uint16;
            description
              "Minute part";
          }
        }
        case hours {
          leaf hours {
            type uint16;
            description
              "Hours part.";
          }
        }
        case days {
          leaf days {
            type uint16;
            description
              "Day part";
          }
        }
        case weeks {
          leaf weeks {
            type
              uint16;
            description
              "Week part";
          }
        }
      }
    }
    container severity {
      presence "Severity filter";
      choice sev-spec {
        description
          "Filter based on severity level.";
        leaf below {
          type severity;
          description
            "Severity level lower than this leaf.";
        }
        leaf is {
          type severity;
          description
            "Severity level equal to this leaf.";
        }
        leaf above {
          type severity;
          description
            "Severity level higher than this leaf.";
        }
      }
      description
        "Filter based on severity.";
    }
    container operator-state-filter {
      if-feature operator-actions;
      presence "Operator state filter";
      leaf state {
        type operator-state;
        description
          "Filter on operator state.";
      }
      leaf user {
        type string;
        description
          "Filter based on which operator.";
      }
      description
        "Filter based on operator state.";
    }
  }

  rpc purge-alarms {
    description
      "This operation requests the server to delete entries from the
       alarm list according to the supplied criteria. Typically it
       can be used to delete alarms that are in closed operator state
       and older than a specified time. The number of purged alarms
       is returned as an output parameter.";
    input {
      uses filter-input;
    }
    output {
      leaf purged-alarms {
        type uint32;
        description
          "Number of purged alarms.";
      }
    }
  }

  /*
   * Notifications
   */

  notification alarm-notification {
    description
      "This notification is used to report a state change for an
       alarm.
       The same notification is used for reporting a newly
       raised alarm, a cleared alarm or changing the text and/or
       severity of an existing alarm.";

    uses common-alarm-parameters;
    uses alarm-state-change-parameters;
  }

  notification alarm-inventory-changed {
    description
      "This notification is used to report that the list of possible
       alarms has changed. This can happen, for example, when a new
       software module is installed or a new physical card is
       inserted.";
  }

  notification operator-action {
    if-feature operator-actions;
    description
      "This notification is used to report that an operator
       acted upon an alarm.";

    leaf resource {
      type leafref {
        path "/alarms/alarm-list/alarm/resource";
        require-instance false;
      }
      description
        "The alarming resource.";
    }
    leaf alarm-type-id {
      type leafref {
        path "/alarms/alarm-list/alarm"
           + "[resource=current()/../resource]"
           + "/alarm-type-id";
        require-instance false;
      }
      description
        "The alarm type identifier for the alarm.";
    }
    leaf alarm-type-qualifier {
      type leafref {
        path "/alarms/alarm-list/alarm"
           + "[resource=current()/../resource]"
           + "[alarm-type-id=current()/../alarm-type-id]"
           + "/alarm-type-qualifier";
        require-instance false;
      }
      description
        "The alarm qualifier for the alarm.";
    }
    uses operator-parameters;
  }
}

8.  X.733 Alarm Mapping Data Model

   Many alarm management systems are based on the X.733 alarm standard.
   This YANG module allows a mapping from alarm types to X.733
   event-type and probable-cause.

   The module augments the alarm inventory, the alarm list and the alarm
   notification with X.733 parameters.

   The module also supports a feature whereby the alarm manager can
   configure the mapping.
   This might be needed when the default mapping
   provided by the system conflicts with other systems or is
   otherwise considered unsuitable.

9.  X.733 Alarm Mapping YANG Module

   This YANG module references [X.733] and [X.736].

file "ietf-alarms-x733.yang"

module ietf-alarms-x733 {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733";
  prefix x733;

  import ietf-alarms {
    prefix al;
  }

  organization
    "IETF NETMOD (NETCONF Data Modeling Language) Working Group";

  contact
    "WG Web:
     WG List:

     Editor: Stefan Vallin

     Editor: Martin Bjorklund
    ";

  description
    "This module augments the ietf-alarms module with X.733 mapping
     information. The following structures are augmented with
     event type and probable cause:

     1) alarm inventory: all possible alarms.
     2) alarm: every alarm in the system.
     3) alarm notification: notifications indicating alarm state
        changes.

     The module also optionally allows the alarm management system
     to configure the mapping. The mapping does not include a
     corresponding specific problem value.
     The recommendation is
     to use alarm-type-qualifier, which serves the same purpose.";
  reference
    "ITU Recommendation X.733: Information Technology
       - Open Systems Interconnection
       - System Management: Alarm Reporting Function";

  revision 2016-10-05 {
    description
      "Initial revision.";
    reference
      "RFC XXXX: YANG Alarm Module";
  }

  /*
   * Features
   */

  feature configure-x733-mapping {
    description
      "The system supports configurable X.733 mapping from
       alarm type to event type and probable cause.";
  }

  /*
   * Typedefs
   */

  typedef event-type {
    type enumeration {
      enum other {
        value 1;
        description
          "None of the below.";
      }
      enum communications-alarm {
        value 2;
        description
          "An alarm of this type is principally associated with the
           procedures and/or processes required to convey
           information from one point to another.";
        reference
          "ITU Recommendation X.733: Information Technology
             - Open Systems Interconnection
             - System Management: Alarm Reporting Function";
      }
      enum quality-of-service-alarm {
        value 3;
        description
          "An alarm of this type is principally associated with a
           degradation in the quality of a service.";
        reference
          "ITU Recommendation X.733: Information Technology
             - Open Systems Interconnection
             - System Management: Alarm Reporting Function";
      }
      enum processing-error-alarm {
        value 4;
        description
          "An alarm of this type is principally associated with a
           software or processing fault.";
        reference
          "ITU Recommendation X.733: Information Technology
             - Open Systems Interconnection
             - System Management: Alarm Reporting Function";
      }
      enum equipment-alarm {
        value 5;
        description
          "An alarm of this type is principally associated with an
           equipment fault.";
        reference
          "ITU Recommendation X.733:
           Information Technology
             - Open Systems Interconnection
             - System Management: Alarm Reporting Function";
      }
      enum environmental-alarm {
        value 6;
        description
          "An alarm of this type is principally associated with a
           condition relating to an enclosure in which the equipment
           resides.";
        reference
          "ITU Recommendation X.733: Information Technology
             - Open Systems Interconnection
             - System Management: Alarm Reporting Function";
      }
      enum integrity-violation {
        value 7;
        description
          "An indication that information may have been illegally
           modified, inserted or deleted.";
        reference
          "ITU Recommendation X.736: Information Technology
             - Open Systems Interconnection
             - System Management: Security Alarm Reporting Function";
      }
      enum operational-violation {
        value 8;
        description
          "An indication that the provision of the requested service
           was not possible due to the unavailability, malfunction or
           incorrect invocation of the service.";
        reference
          "ITU Recommendation X.736: Information Technology
             - Open Systems Interconnection
             - System Management: Security Alarm Reporting Function";
      }
      enum physical-violation {
        value 9;
        description
          "An indication that a physical resource has been violated
           in a way that suggests a security attack.";
        reference
          "ITU Recommendation X.736: Information Technology
             - Open Systems Interconnection
             - System Management: Security Alarm Reporting Function";
      }
      enum security-service-or-mechanism-violation {
        value 10;
        description
          "An indication that a security attack has been detected by
           a security service or mechanism.";
        reference
          "ITU Recommendation X.736: Information Technology
             - Open Systems Interconnection
             - System Management: Security Alarm Reporting Function";
      }
      enum time-domain-violation {
        value 11;
        description
          "An indication that an event has occurred at an unexpected
           or prohibited time.";
        reference
          "ITU Recommendation X.736: Information Technology
             - Open Systems Interconnection
             - System Management: Security Alarm Reporting Function";
      }
    }
    description
      "The event types as defined by X.733 and X.736. The term
       'event' is potentially confusing; in an alarm context, these
       are top-level alarm types.";
  }

  /*
   * Groupings
   */

  grouping x733-alarm-parameters {
    description
      "Common X.733 parameters for alarms.";

    leaf event-type {
      type event-type;
      description
        "The X.733/X.736 event type for this alarm.";
    }
    leaf probable-cause {
      type uint32;
      description
        "The X.733 probable cause for this alarm.";
    }
  }

  grouping x733-alarm-definition-parameters {
    description
      "Common X.733 parameters for alarm definitions.";

    leaf event-type {
      type event-type;
      description
        "The alarm type has this X.733/X.736 event type.";
    }
    leaf probable-cause {
      type uint32;
      description
        "The alarm type has this X.733 probable cause value.
         This module defines probable cause as an integer
         and not as an enumeration. The reason is that the
         primary use of probable cause is in the management
         application if it is based on the X.733 standard.
         However, most management applications have their own
         defined enum definitions and merging enums from
         different systems might create conflicts. By using
         a configurable uint32 the system can be configured
         to match the enum values in the manager.";
    }
  }

  /*
   * Add X.733 parameters to the alarm definitions, alarms,
   * and notification.
   */

  augment "/al:alarms/al:alarm-inventory/al:alarm-type" {
    description
      "Augment X.733 mapping information to the alarm inventory.";

    uses x733-alarm-definition-parameters;
  }

  augment "/al:alarms/al:control" {
    description
      "Add X.733 mapping capabilities.";
    list x733-mapping {
      if-feature configure-x733-mapping;
      key "alarm-type-id alarm-type-qualifier-match";
      description
        "This list allows a management application to control the
         X.733 mapping for all alarm types in the system. Any entry
         in this list will allow the alarm manager to override the
         default X.733 mapping in the system, and the final mapping
         will be shown in the alarm-inventory.";

      leaf alarm-type-id {
        type al:alarm-type-id;
        description
          "Map the alarm type with this alarm type identifier.";
      }
      leaf alarm-type-qualifier-match {
        type string;
        description
          "A W3C regular expression that is used when mapping an
           alarm type and alarm-type-qualifier to X.733 parameters.";
      }

      uses x733-alarm-definition-parameters;
    }
  }

  augment "/al:alarms/al:alarm-list/al:alarm" {
    description
      "Augment X.733 information to the alarm.";

    uses x733-alarm-parameters;
  }

  augment "/al:alarms/al:shelved-alarms/al:shelved-alarm" {
    description
      "Augment X.733 information to the shelved alarm.";

    uses x733-alarm-parameters;
  }

  augment "/al:alarm-notification" {
    description
      "Augment X.733 information to the alarm notification.";

    uses x733-alarm-parameters;
  }
}

10.  Security Considerations

   None.

11.  Acknowledgements

   The authors wish to thank Viktor Leijon and Johan Nordlander for
   their valuable input on forming the alarm model.

12.  References

12.1.
Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
              RFC 7950, DOI 10.17487/RFC7950, August 2016.

12.2.  Informative References

   [ALARMIRP]
              3GPP, "Telecommunication management; Fault Management;
              Part 2: Alarm Integration Reference Point (IRP):
              Information Service (IS)", 3GPP TS 32.111-2 3.4.0, March
              2005.

   [ALARMSEM]
              Wallin, S., Leijon, V., Nordlander, J., and N. Bystedt,
              "The semantics of alarm definitions: enabling systematic
              reasoning about alarms", International Journal of Network
              Management, Volume 22, Issue 3, John Wiley and Sons, Ltd,
              http://dx.doi.org/10.1002/nem.800, March 2012.

   [EEMUA]    Engineering Equipment and Materials Users Association,
              "Alarm Systems: A Guide to Design, Management and
              Procurement", EEMUA Publication No. 191, 2nd edition,
              London, 2007.

   [ISA182]   International Society of Automation (ISA),
              "ANSI/ISA-18.2-2009 Management of Alarm Systems for the
              Process Industries", 2009.

   [RFC3877]  Chisholm, S. and D. Romascanu, "Alarm Management
              Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877,
              September 2004.

   [X.733]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Alarm Reporting Function", ITU-T
              Recommendation X.733, 1992.

   [X.736]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Security alarm reporting function", ITU-T
              Recommendation X.736, 1992.

Appendix A.  Enterprise-specific Alarm-Types Example

   This example shows how to define alarm-types in an
   enterprise-specific module.
   In this case "xyz" has chosen to define top-level
   identities according to X.733 event types.

   module example-xyz-alarms {
     namespace "urn:example:xyz-alarms";
     prefix xyz-al;

     import ietf-alarms {
       prefix al;
     }

     identity xyz-alarms {
       base al:alarm-identity;
     }

     identity communications-alarm {
       base xyz-alarms;
     }
     identity quality-of-service-alarm {
       base xyz-alarms;
     }
     identity processing-error-alarm {
       base xyz-alarms;
     }
     identity equipment-alarm {
       base xyz-alarms;
     }
     identity environmental-alarm {
       base xyz-alarms;
     }

     // communications alarms
     identity link-alarm {
       base communications-alarm;
     }

     // QoS alarms
     identity high-jitter-alarm {
       base quality-of-service-alarm;
     }
   }

Appendix B.  Alarm Inventory Example

   This example shows an alarm inventory with two alarm types: one
   defined only by its identifier, and another that is dynamically
   configured. In the latter case a digital input has been connected
   to a smoke detector; therefore the 'alarm-type-qualifier' is set to
   "smoke-alarm" and the 'alarm-type-id' to "environmental-alarm".

     xyz-al:link-alarm

     true
     Link failure, operational state down but admin state up

     xyz-al:environmental-alarm
     smoke-alarm
     true
     Connected smoke detector to digital input

Appendix C.  Alarm List Example

   In this example we show an alarm that has toggled [major, clear,
   major]. An operator has acknowledged the alarm.

     1
     2015-04-08T08:39:50.00Z

     /dev:interfaces/dev:interface[name='FastEthernet1/0']
     xyz-al:link-alarm

     2015-04-08T08:39:50.00Z
     false
     1.3.6.1.2.1.2.2.1.1.17
     2015-04-08T08:39:40.00Z
     major
     Link operationally down but administratively up

     major
     Link operationally down but administratively up

     cleared
     Link operationally up and administratively up

     major
     Link operationally down but administratively up

     ack
     joe
     Will investigate, ticket TR764999

Appendix D.  Alarm Shelving Example

   This example shows how to shelve alarms. We shelve alarms related
   to the smoke detectors since they are being installed and tested.
   We also shelve all alarms from FastEthernet1/0.

     FE10
     /dev:interfaces/dev:interface[name='FastEthernet1/0']

     detectortest
     xyz-al:environmental-alarm
     smoke-alarm

Appendix E.  X.733 Mapping Example

   This example shows how to map a dynamic alarm type
   (alarm-type-id=environmental-alarm,
   alarm-type-qualifier=smoke-alarm) to the corresponding X.733
   event-type and probable cause parameters.

     xyz-al:environmental-alarm
     smoke-alarm
     quality-of-service-alarm
     777

Authors' Addresses

   Stefan Vallin
   Stefan Vallin AB

   Email: stefan@wallan.se

   Martin Bjorklund
   Cisco

   Email: mbj@tail-f.com