Network Working Group                                          S. Vallin
Internet-Draft                                          Stefan Vallin AB
Intended status: Standards Track                            M. Bjorklund
Expires: November 9, 2017                                          Cisco
                                                             May 8, 2017


                           YANG Alarm Module
                  draft-vallin-netmod-alarm-module-02

Abstract

   This document defines a YANG module for alarm management. It
   includes functions for alarm list management, alarm shelving and
   notifications to inform management systems. There are also RPCs to
   manage the operator state of an alarm and administrative alarm
   procedures. The module carefully maps to relevant alarm standards.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 9, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Requirements notation
   2.  Introduction
     2.1.  Terminology
   3.  Objectives
   4.  Background and Usability Requirements
   5.  Alarm Concepts
     5.1.  What is an Alarm?
     5.2.  What is an Alarm Type?
     5.3.  How are Resources Identified?
     5.4.  How are Alarm Instances Identified?
     5.5.  What is the Life-Cycle of an Alarm?
       5.5.1.  Resource Alarm Life-Cycle
       5.5.2.  Operator Alarm Life-Cycle
       5.5.3.  Administrative Alarm Life-Cycle
     5.6.  Root Cause and Impacted Resources
     5.7.  Alarm Shelving
   6.  Alarm Data Model
     6.1.  Alarm Control
       6.1.1.  Alarm Shelving
     6.2.  Alarm Inventory
     6.3.  Alarm Summary
     6.4.  The Alarm List
     6.5.  The Shelved Alarms List
     6.6.  RPCs
     6.7.  Notifications
   7.  Alarm YANG Module
   8.  X.733 Alarm Mapping Data Model
   9.  X.733 Alarm Mapping YANG Module
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Enterprise-specific Alarm-Types Example
   Appendix B.  Alarm Inventory Example
   Appendix C.  Alarm List Example
   Appendix D.  Alarm Shelving Example
   Appendix E.  X.733 Mapping Example
   Authors' Addresses

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   This document defines a YANG [RFC7950] module for alarm management.
   The purpose is to define a standardised alarm interface for network
   devices that can be easily integrated into management applications.
   The model is also applicable as a northbound alarm interface in the
   management applications.

   Alarm monitoring is a fundamental part of monitoring the network.
   Raw alarms from devices do not always tell the status of the network
   services or necessarily point to the root cause. However, being able
   to feed alarms to the network management system in a standardised
   format is a starting point for performing higher level network
   assurance tasks.

   The telecommunication domain has standardised an alarm interface in
   ITU-T X.733 [X.733]. This continued in mobile networks within the
   3GPP organisation [ALARMIRP]. Although SNMP is the dominant
   mechanism for monitoring devices, the IETF did not standardise an
   alarm MIB early on. Instead, management systems interpreted the
   enterprise-specific traps per MIB and device to build an alarm list.
   When the Alarm MIB [RFC3877] was finally published, it had to address
   the existence of enterprise traps and map these into alarms. This
   requirement led to a MIB that is not easy to use.

   This document defines a standardised YANG module for alarm
   management. The design of the module is based on experience from
   using and implementing the above-mentioned alarm standards.

2.1.  Terminology

   The following terms are defined in [RFC7950]:

   o  action

   o  client

   o  data tree

   o  RPC

   o  server

   The following terms are used within this document:

   o  Alarm (the general concept): An alarm signifies an undesirable
      state in a resource that requires corrective action.

   o  Alarm Instance: The alarm state for a specific resource and alarm
      type, for example (GigabitEthernet0/15, link-alarm). An alarm
      instance is an entry in the alarm list.

   o  Alarm Inventory: A list of all possible alarm types on a system.

   o  Alarm Shelving: Blocking alarms according to specific criteria.

   o  Alarm Type: An alarm type identifies a possible unique alarm state
      for a resource. Alarm types are names that identify the state,
      such as 'link-alarm', 'jitter-violation', and
      'high-disk-utilization'.

   o  Management System: The alarm management application that consumes
      the alarms, i.e., acts as a client.

   o  Resource: A fine-grained identification of the alarming resource,
      for example, an interface or a process.

   o  System: The system that implements this YANG alarm module, i.e.,
      acts as a server. This corresponds to a network device or a
      management application that provides a north-bound alarm
      interface.

3.  Objectives

   The objectives for the design of the Alarm Module are:

   o  Simple to use. If a system supports this module, it shall be
      straightforward to integrate it into a YANG-based alarm
      manager.

   o  View alarms as states on resources and not as discrete
      notifications.

   o  Clear definition of "alarm" in order to exclude general events
      that should not be forwarded as alarm notifications.

   o  Clear and precise identification of alarm types and alarm
      instances.

   o  A management system should be able to pull all available alarm
      types from a system, i.e., read the alarm inventory from a system.
      This makes it possible to prepare alarm operators with
      corresponding alarm instructions.

   o  Address alarm usability requirements. While the IETF has not
      really addressed alarm management, telecom standards have
      addressed it purely from a protocol perspective. The process
      industry has published several relevant standards addressing
      requirements for a useful alarm interface: [EEMUA] and [ISA182].
      This alarm module defines usability requirements as well as a
      YANG data model.

   o  Mapping to X.733, which is a requirement for many alarm systems.
      Still, keep some of the X.733 concepts out of the core model in
      order to make the model small and easy to understand.

4.  Background and Usability Requirements

   Common alarm problems and their causes are summarised in Table 1.
   This summary is adapted to networking based on the ISA [ISA182] and
   EEMUA [EEMUA] standards.

   +------------------+--------------------------------+---------------+
   | Problem          | Cause                          | How this      |
   |                  |                                | module        |
   |                  |                                | addresses the |
   |                  |                                | cause         |
   +------------------+--------------------------------+---------------+
   | Alarms are       | "Nuisance" alarms (chattering  | Strict        |
   | generated but    | alarms and fleeting alarms),   | definition of |
   | they are ignored | faulty hardware, redundant     | alarms        |
   | by the operator. | alarms, cascading alarms,      | requiring     |
   |                  | incorrect alarm settings,      | corrective    |
   |                  | alarms have not been           | response.     |
   |                  | rationalised, the alarms       | Alarm         |
   |                  | represent log information      | requirements  |
   |                  | rather than true alarms.       | in Table 2.   |
   |                  |                                |               |
   | When alarms      | Insufficient alarm response    | The alarm     |
   | occur, operators | procedures and not well        | inventory     |
   | do not know how  | defined alarm types.           | lists all     |
   | to respond.      |                                | alarm types   |
   |                  |                                | and           |
   |                  |                                | corrective    |
   |                  |                                | actions.      |
   |                  |                                | Alarm         |
   |                  |                                | requirements  |
   |                  |                                | in Table 2.   |
   |                  |                                |               |
   | The alarm        | Nuisance alarms, stale alarms, | The alarm     |
   | display is full  | alarms from equipment not in   | definition    |
   | of alarms, even  | service.                       | and alarm     |
   | when there is    |                                | shelving.     |
   | nothing wrong.   |                                |               |
   |                  |                                |               |
   | During a         | Incorrect prioritization of    | State-based   |
   | failure,         | alarms. Not using advanced     | alarm model,  |
   | operators are    | alarm techniques (e.g. state-  | alarm rate    |
   | flooded with so  | based alarming).               | requirements  |
   | many alarms that |                                | in Table 3    |
   | they do not know |                                | and Table 4.  |
   | which ones are   |                                |               |
   | the most         |                                |               |
   | important.       |                                |               |
   +------------------+--------------------------------+---------------+

                    Table 1: Alarm Problems and Causes

   Based upon the above problems, EEMUA gives the following definition
   of a good alarm:

   +----------------+--------------------------------------------------+
   | Characteristic | Explanation                                      |
   +----------------+--------------------------------------------------+
   | Relevant       | Not spurious or of low operational value.        |
   |                |                                                  |
   | Unique         | Not duplicating another alarm.                   |
   |                |                                                  |
   | Timely         | Not long before any response is needed or too    |
   |                | late to do anything.                             |
   |                |                                                  |
   | Prioritised    | Indicating the importance that the operator      |
   |                | deals with the problem.                          |
   |                |                                                  |
   | Understandable | Having a message which is clear and easy to      |
   |                | understand.                                      |
   |                |                                                  |
   | Diagnostic     | Identifying the problem that has occurred.       |
   |                |                                                  |
   | Advisory       | Indicative of the action to be taken.            |
   |                |                                                  |
   | Focusing       | Drawing attention to the most important issues.  |
   +----------------+--------------------------------------------------+

                    Table 2: Definition of a Good Alarm

   Vendors SHOULD rationalise all alarms according to the above.
   Another crucial requirement is acceptable alarm rates. Vendors
   SHOULD make sure that they do not exceed the recommendations from
   EEMUA below:

   +-----------------------------------+-------------------------------+
   | Long Term Alarm Rate in Steady    | Acceptability                 |
   | Operation                         |                               |
   +-----------------------------------+-------------------------------+
   | More than one per minute          | Very likely to be             |
   |                                   | unacceptable.                 |
   |                                   |                               |
   | One per 2 minutes                 | Likely to be over-demanding.  |
   |                                   |                               |
   | One per 5 minutes                 | Manageable.                   |
   |                                   |                               |
   | Less than one per 10 minutes      | Very likely to be acceptable. |
   +-----------------------------------+-------------------------------+

               Table 3: Acceptable Alarm Rates, Steady State

   +----------------------------+--------------------------------------+
   | Number of alarms displayed | Acceptability                        |
   | in 10 minutes following a  |                                      |
   | major network problem      |                                      |
   +----------------------------+--------------------------------------+
   | More than 100              | Definitely excessive and very likely |
   |                            | to lead the operator to abandon      |
   |                            | the use of the alarm system.         |
   |                            |                                      |
   | 20-100                     | Hard to cope with.                   |
   |                            |                                      |
   | Under 10                   | Should be manageable - but may be    |
   |                            | difficult if several of the alarms   |
   |                            | require a complex operator response. |
   +----------------------------+--------------------------------------+

                  Table 4: Acceptable Alarm Rates, Burst

   The numbers in Table 3 and Table 4 are the sum of all alarms for a
   network being managed from one alarm console.
   So every individual system or NMS contributes to these numbers.

   Vendors SHOULD make sure that the following rules are used in
   designing the alarm interface:

   1.  Rationalize the alarms in the system to ensure that every alarm
       is necessary, has a purpose, follows the cardinal rule that it
       requires an operator response, and adheres to the rules of
       Table 2.

   2.  Audit the quality of the alarms. Talk with the operators about
       how well the alarm information supports them. Do they know what
       to do in the event of an alarm? Are they able to quickly
       diagnose the problem and determine the corrective action? Does
       the alarm text adhere to the requirements in Table 2?

   3.  Analyze and benchmark the performance of the system and compare
       it to the recommended metrics in Table 3 and Table 4. Start by
       identifying nuisance alarms, standing alarms at normal state and
       startup.

5.  Alarm Concepts

   This section defines the fundamental concepts behind the data model.
   It is rooted in the work of Vallin et al. [ALARMSEM].

5.1.  What is an Alarm?

   There are two misconceptions regarding alarms and alarm interfaces
   that are important to sort out. The first is that alarms are mixed
   up with events in general. Alarms MUST correspond to an undesirable
   state that needs corrective action. Many implementations of alarm
   interfaces do not adhere to this principle and just send events in
   general. In order to qualify as an alarm, there must exist a
   corrective action. If that is not true, it is an event that can go
   into logs.

   The other misconception is that the term alarm refers to the
   notification itself. Rather, an alarm is a state of a resource in
   the system. The alarm notifications report state changes of the
   alarm, such as alarm raise and alarm clear.

   Based upon the above, we will use the following alarm definition:

      An alarm signifies an undesirable state in a resource that
      requires corrective action.

   "One of the most important principles of alarm management is that an
   alarm requires an action. This means that if the operator does not
   need to respond to an alarm (because unacceptable consequences do not
   occur), then it is not an alarm. Following this cardinal rule will
   help eliminate many potential alarm management issues." [ISA182]

5.2.  What is an Alarm Type?

   One of the fundamental requirements stated in the previous section is
   that every alarm must have a corresponding corrective action. This
   means that every vendor should be able to prepare a list of available
   alarms and their corrective actions. We use the term 'alarm type' to
   refer to every possible alarm that could be active in the system.

   Alarm types are also fundamental in order to provide a state-based
   alarm list. The alarm list correlates alarm state changes for the
   same alarm type and the same resource into one alarm.

   Different alarm interfaces use different mechanisms to define alarm
   types, ranging from simple error numbers to more advanced mechanisms
   like the X.733 triplet of event type, probable cause and specific
   problem.

   This document defines an alarm type with an alarm type id and an
   alarm type qualifier.

   The alarm type id is modeled as a YANG identity. With YANG
   identities, new alarm types can be defined in a distributed fashion.
   YANG identities are hierarchical, which means that a hierarchy of
   alarm types can be defined.

   The primary goal for the alarm module has been to provide a simple
   but extensible mechanism. YANG identities are a good mechanism for
   enumerated values that are easy to extend.
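
   As a non-normative illustration of how a client might work with a
   hierarchy of alarm type identities, the following Python sketch
   stores each identity together with its base and checks whether one
   alarm type is derived from another. The identity names, the BASES
   table, and the two helper functions are hypothetical and exist only
   for this sketch; it also assumes that each identity has a single
   base, which YANG does not require.

```python
from typing import Dict, List, Optional

# Hypothetical identity table: identity name -> base identity name.
# None marks the root of the hierarchy. Assumption: one base per identity.
BASES: Dict[str, Optional[str]] = {
    "alarm-type": None,
    "vendor-alarms": "alarm-type",
    "communications-alarm": "vendor-alarms",
    "link-alarm": "communications-alarm",
    "environmental-alarm": "vendor-alarms",
}


def is_derived_from_or_self(identity: str, base: str) -> bool:
    """Return True if 'identity' is 'base' or derives from it."""
    current: Optional[str] = identity
    while current is not None:
        if current == base:
            return True
        # Walk one step up towards the root of the hierarchy.
        current = BASES[current]
    return False


def derived_identities(base: str) -> List[str]:
    """Return all identities strictly derived from 'base', sorted by name."""
    return sorted(
        name
        for name in BASES
        if name != base and is_derived_from_or_self(name, base)
    )
```

   A client holding such a table could, for instance, call
   derived_identities("vendor-alarms") to enumerate every alarm type a
   vendor has defined below its own root identity.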

   This means that every possible alarm type that can appear in a system
   exists as a well defined hierarchical identity along with a
   description. Tools can provide a list of possible alarms by parsing
   the YANG identities rather than reading user guides.

   Standards and vendors should define their own alarm type identities
   based on this definition.

   The use of YANG identities means that all possible alarms are
   identified at design time. This explicit declaration of alarm types
   makes it easier to allow for alarm qualification reviews and
   preparation of alarm actions and documentation.

   There are occasions where the alarm types are not known at design
   time. For example, a system with digital inputs that allows users to
   connect detectors (e.g., smoke detector) to the inputs. In this
   case, a configuration action declares that certain connectors are,
   for example, fire alarms. The drawback of this is that there is a
   big risk that alarm operators will receive alarm types as a surprise
   and will not know how to resolve the problem, since a defined alarm
   procedure does not necessarily exist.

   In order to allow for dynamic addition of alarm types the alarm
   module also allows for further qualification of the identity-based
   alarm type using a string.

   A common misunderstanding is that individual alarm notifications are
   alarm types. This is not correct; e.g., "link-up" and "link-down"
   are two notifications reporting different states for the same alarm
   type, "link-alarm".

   A vendor or standard can then define their own alarm-type hierarchy.
   The example below shows a hierarchy based on X.733 event types:

     import ietf-alarms {
       prefix al;
     }
     identity vendor-alarms {
       base al:alarm-type;
     }
     identity communications-alarm {
       base vendor-alarms;
     }
     identity link-alarm {
       base communications-alarm;
     }

   Alarm types can be abstract.
   An abstract alarm type is used as a base for defining hierarchical
   alarm types. Concrete alarm types are used for alarm states and
   appear in the alarm inventory. There are two kinds of concrete alarm
   types:

   1.  The last subordinate identity in the 'alarm-type-id' hierarchy is
       concrete, for example:
       "alarm-identity.environmental-alarm.smoke". In this example
       "alarm-identity" and "environmental-alarm" are abstract YANG
       identities, whereas "smoke" is a concrete YANG identity.

   2.  The YANG identity hierarchy is abstract and the concrete alarm
       type is defined by the dynamic alarm qualifier string, for
       example: "alarm-identity.environmental-alarm.external-detector"
       with alarm-type-qualifier "smoke".

   For example:

     // Alternative 1: concrete alarm type identity
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity smoke {
       base environmental-alarm;
       description "Concrete alarm type";
     }

     // Alternative 2: concrete alarm type qualifier
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity external-detector {
       base environmental-alarm;
       description
         "Abstract alarm type, a run-time configuration
          procedure sets the type of alarm detected. This will
          be reported in the alarm-type-qualifier.";
     }

5.3.  How are Resources Identified?

   It is of vital importance to be able to refer to the alarming
   resource. This reference must be as fine-grained as possible. If
   the alarming resource exists in the data tree then an
   instance-identifier MUST be used with the full path to the object.

   This module also allows for alternate naming of the alarming resource
   if it is not available in the data tree.

5.4.  How are Alarm Instances Identified?

   A primary goal of this alarm module is to remove any ambiguity in how
   alarm notifications are mapped to an update of an alarm instance.
   X.733 and especially 3GPP were not really clear on this point. This
   YANG alarm module states that the tuple (resource, alarm type
   identifier, alarm type qualifier) identifies a single alarm
   instance. This means that alarm notifications for the same resource
   and same alarm type are matched to update the same alarm instance.
   These three leafs are therefore used as the key in the alarm list:

     list alarm {
       key "resource alarm-type-id alarm-type-qualifier";
       ...
     }

5.5.  What is the Life-Cycle of an Alarm?

   The alarm model clearly separates the resource alarm life-cycle from
   the operator and administrative life-cycles of an alarm.

   o  resource alarm life-cycle: the alarm instrumentation that controls
      alarm raise, clearance, and severity changes.

   o  operator alarm life-cycle: operators acting upon alarms with
      actions like acknowledgment and closing. Closing an alarm implies
      that the operator considers the corrective action performed.
      Operators can also shelve alarms in order to avoid nuisance
      alarms.

   o  administrative alarm life-cycle: deleting (purging) alarms and
      compressing the alarm status change list. This module exposes
      operations to manage the administrative life-cycle. The server
      may also perform these operations based on other policies, but how
      that is done is out of scope for this document.

5.5.1.  Resource Alarm Life-Cycle

   From a resource perspective, an alarm can have the following
   life-cycle: raise, change severity, change severity, clear, raise
   again, etc. All of these status changes can have different alarm
   texts generated by the instrumentation. Two important things to
   note:

   1.  Alarms are not deleted when they are cleared. Deleting alarms is
       an administrative process.
       The alarm module defines an rpc "purge-alarms" that deletes
       alarms.

   2.  Alarms are not cleared by operators; only the underlying
       instrumentation can clear an alarm. Operators can close alarms.

   The YANG tree representation below illustrates the resource oriented
   life-cycle:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro is-cleared            boolean
        +--ro last-changed          yang:date-and-time
        +--ro perceived-severity    severity
        +--ro alarm-text            alarm-text
        +--ro status-change* [time]
           +--ro time                  yang:date-and-time
           +--ro perceived-severity    severity
           +--ro alarm-text            alarm-text

   For every status change from the resource perspective a row is added
   to the 'status-change' list. The last status values are also
   represented as leafs of the alarm. Note well that the alarm
   severity does not include 'cleared'; alarm clearance is a flag.

   An alarm can therefore look like this: ((GigabitEthernet0/25,
   link-alarm,""), false, T, major, "Interface GigabitEthernet0/25
   down")

5.5.2.  Operator Alarm Life-Cycle

   Operators can also act upon alarms using the set-operator-state
   action:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro operator-state-change* [time] {operator-actions}?
        |  +--ro time        yang:date-and-time
        |  +--ro operator    string
        |  +--ro state       operator-state
        |  +--ro text?       string
        +---x set-operator-state {operator-actions}?
           +---w input
              +---w state    operator-state
              +---w text?    string

   The operator state for an alarm can be: 'none', 'ack', 'shelved', and
   'closed'. Alarm deletion (the "purge-alarms" rpc) can use this state
   as a criterion. A closed alarm is an alarm where the operator has
   performed any required corrective actions. Closed alarms are good
   candidates for being deleted.

5.5.3.  Administrative Alarm Life-Cycle

   Deleting alarms from the alarm list is considered an administrative
   action.
   This is supported by the "purge-alarms" rpc. The "purge-alarms" rpc
   takes a filter as input. The filter selects alarms based on the
   operator and resource life-cycle such as "all closed cleared
   alarms older than a time specification". The server may also perform
   these operations based on other policies, but how that is done is out
   of scope for this document.

   Alarms can be compressed. Compressing an alarm deletes all entries
   in the alarm's "status-change" list except for the last status
   change. A client can perform this using the "compress-alarms" rpc.
   The server may also perform these operations based on other policies,
   but how that is done is out of scope for this document.

5.6.  Root Cause and Impacted Resources

   The general principle of this alarm module is to limit the number of
   alarms. The alarm has two leaf-lists to identify possible impacted
   resources and possible root-cause resources. The system should not
   send individual alarms for the possible root-cause resources and
   impacted resources. These serve as hints only. It is up to the
   client application to use this information to present the overall
   status.

5.7.  Alarm Shelving

   Alarm shelving is an important function in order for alarm management
   applications and operators to stop superfluous alarms. Shelving
   implies that any alarms fulfilling the shelving criteria are ignored.
   Shelved alarms appear in a dedicated shelved alarm list in order not
   to disturb the relevant alarms. Shelved alarms do not generate
   notifications.

6.  Alarm Data Model

   Alarm shelving and operator actions are YANG features so that a
   server can choose not to support them.

   The data model has the following overall structure:

     +--rw alarms
        +--rw control
        |  +--rw max-alarm-status-changes?   union
        |  +--rw notify-status-changes?      boolean
        |  +--rw alarm-shelving {alarm-shelving}?
        |     +--rw shelf* [shelf-name]
        |        +--rw shelf-name              string
        |        +--rw resource?               resource
        |        +--rw alarm-type-id?          alarm-type-id
        |        +--rw alarm-type-qualifier?   alarm-type-qualifier
        |        +--rw description?            string
        +--ro alarm-inventory
        |  +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
        |     +--ro alarm-type-id           alarm-type-id
        |     +--ro alarm-type-qualifier    alarm-type-qualifier
        |     +--ro resource*               string
        |     +--ro has-clear               boolean
        |     +--ro description             string
        +--ro summary
        |  +--ro alarm-summary* [severity]
        |  |  +--ro severity                  severity
        |  |  +--ro total?                    yang:gauge32
        |  |  +--ro cleared?                  yang:gauge32
        |  |  +--ro cleared-not-closed?       yang:gauge32
        |  |  |       {operator-actions}?
        |  |  +--ro cleared-closed?           yang:gauge32
        |  |  |       {operator-actions}?
        |  |  +--ro not-cleared-closed?       yang:gauge32
        |  |  |       {operator-actions}?
        |  |  +--ro not-cleared-not-closed?   yang:gauge32
        |  |          {operator-actions}?
        |  +--ro shelves-active?   empty {alarm-shelving}?
        +--ro alarm-list
        |  +--ro number-of-alarms?   yang:gauge32
        |  +--ro last-changed?       yang:date-and-time
        |  +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        |     +--ro time-created             yang:date-and-time
        |     +--ro resource                 resource
        |     +--ro alarm-type-id            alarm-type-id
        |     +--ro alarm-type-qualifier     alarm-type-qualifier
        |     +--ro alt-resource*            resource
        |     +--ro related-alarm*
        |     |       [resource alarm-type-id alarm-type-qualifier]
        |     |  +--ro resource
        |     |  |       -> /alarms/alarm-list/alarm/resource
        |     |  +--ro alarm-type-id           leafref
        |     |  +--ro alarm-type-qualifier    leafref
        |     +--ro impacted-resource*       resource
        |     +--ro root-cause-resource*     resource
        |     +--ro is-cleared               boolean
        |     +--ro last-changed             yang:date-and-time
        |     +--ro perceived-severity       severity
        |     +--ro alarm-text               alarm-text
        |     +--ro status-change* [time] {alarm-history}?
700 | | +--ro time yang:date-and-time 701 | | +--ro perceived-severity severity-with-clear 702 | | +--ro alarm-text alarm-text 703 | +--ro operator-state-change* [time] {operator-actions}? 704 | | +--ro time yang:date-and-time 705 | | +--ro operator string 706 | | +--ro state operator-state 707 | | +--ro text? string 708 | +---x set-operator-state {operator-actions}? 709 | +---w input 710 | +---w state operator-state 711 | +---w text? string 712 +--ro shelved-alarms {alarm-shelving}? 713 +--ro number-of-shelved-alarms? yang:gauge32 714 +--ro alarm-shelf-last-changed? yang:date-and-time 715 +--ro shelved-alarm* 716 [resource alarm-type-id alarm-type-qualifier] 717 +--ro resource resource 718 +--ro alarm-type-id alarm-type-id 719 +--ro alarm-type-qualifier alarm-type-qualifier 720 +--ro alt-resource* resource 721 +--ro related-alarm* 722 | [resource alarm-type-id alarm-type-qualifier] 723 | +--ro resource 724 | | -> /alarms/alarm-list/alarm/resource 725 | +--ro alarm-type-id leafref 726 | +--ro alarm-type-qualifier leafref 727 +--ro impacted-resource* resource 728 +--ro root-cause-resource* resource 729 +--ro is-cleared boolean 730 +--ro last-changed yang:date-and-time 731 +--ro perceived-severity severity 732 +--ro alarm-text alarm-text 733 +--ro status-change* [time] {alarm-history}? 734 | +--ro time yang:date-and-time 735 | +--ro perceived-severity severity-with-clear 736 | +--ro alarm-text alarm-text 737 +--ro operator-state-change* [time] {operator-actions}? 738 +--ro time yang:date-and-time 739 +--ro operator string 740 +--ro state operator-state 741 +--ro text? string 743 6.1. Alarm Control 745 The "/alarms/control/notify-status-changes" leaf controls if 746 notifications are sent for all state changes, severity change and 747 alarm text change, or just for new and cleared alarms. 749 Every alarm has a list of status changes, this is a circular list. 750 The length of this list is controlled by "/alarms/control/max-alarm- 751 status-changes". 753 6.1.1. 
Alarm Shelving 755 The shelving control tree is shown below: 757 +--rw alarm-shelving {alarm-shelving}? 758 +--rw shelf* [shelf-name] 759 +--rw shelf-name string 760 +--rw resource? resource 761 +--rw alarm-type-id? alarm-type-id 762 +--rw alarm-type-qualifier? alarm-type-qualifier 764 Shelved alarms are shown in a dedicated shelved alarm list. The 765 instrumentation MUST move shelved alarms from the alarm list 766 (/alarms/alarm-list) to the shelved alarm list (/alarms/shelved- 767 alarms/). Shelved alarms do not generate any notifications. When 768 the shelving criteria are removed or changed, the alarm list MUST be 769 updated to the correct actual state of the alarms. 771 A leaf (/alarms/summary/shelves-active) in the alarm summary indicates 772 whether there are shelved alarms. 774 A system may choose not to support the shelving feature. 776 6.2. Alarm Inventory 778 The alarm inventory represents all possible alarm types that may 779 occur in the system. A management system may use this to build alarm 780 procedures. The alarm inventory is relevant for several reasons: 782 The system might not instrument all alarm type identities. 784 The system has configured dynamic alarm types using the alarm 785 qualifier. The inventory makes it possible for the management 786 system to discover these. 788 Note that the mechanism whereby dynamic alarm types are added using 789 the alarm type qualifier MUST populate this list. 791 The optional leaf-list "resource" in the alarm inventory enables the 792 system to publish for which resources a given alarm type may appear. 794 The alarm inventory tree is shown below: 796 +--ro alarm-inventory 797 +--ro alarm-type* [alarm-type-id alarm-type-qualifier] 798 +--ro alarm-type-id alarm-type-id 799 +--ro alarm-type-qualifier alarm-type-qualifier 800 +--ro resource* string 801 +--ro has-clear boolean 802 +--ro description string 804 6.3.
Alarm Summary 806 The alarm summary list summarises alarms per severity; how many 807 cleared, cleared and closed, and closed. It also gives an indication 808 if there are shelved alarms. 810 6.4. The Alarm List 812 The alarm list (/alarms/alarm-list) is a function from (resource, 813 alarm type, alarm type qualifier) to the current alarm state. 815 +--ro alarm-list 816 +--ro number-of-alarms? yang:gauge32 817 +--ro last-changed? yang:date-and-time 818 +--ro alarm* [resource alarm-type-id alarm-type-qualifier] 819 +--ro time-created yang:date-and-time 820 +--ro resource resource 821 +--ro alarm-type-id alarm-type-id 822 +--ro alarm-type-qualifier alarm-type-qualifier 823 +--ro alt-resource* resource 824 +--ro related-alarm* 825 | [resource alarm-type-id alarm-type-qualifier] 826 | +--ro resource 827 | | -> /alarms/alarm-list/alarm/resource 828 | +--ro alarm-type-id leafref 829 | +--ro alarm-type-qualifier leafref 830 +--ro impacted-resource* resource 831 +--ro root-cause-resource* resource 832 +--ro is-cleared boolean 833 +--ro last-changed yang:date-and-time 834 +--ro perceived-severity severity 835 +--ro alarm-text alarm-text 836 +--ro status-change* [time] {alarm-history}? 837 | +--ro time yang:date-and-time 838 | +--ro perceived-severity severity-with-clear 839 | +--ro alarm-text alarm-text 840 +--ro operator-state-change* [time] {operator-actions}? 841 | +--ro time yang:date-and-time 842 | +--ro operator string 843 | +--ro state operator-state 844 | +--ro text? string 845 +---x set-operator-state {operator-actions}? 846 +---w input 847 +---w state operator-state 848 +---w text? string 850 Every alarm has three important states, the resource clearance state 851 "is-cleared", the severity "perceived-severity" and the operator 852 state available in the operator state change list. 
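The mapping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the YANG module: the names Alarm, AlarmList, raise_alarm and clear_alarm are hypothetical, and severities and operator states are plain strings rather than the module's enumerations. It shows that the key (resource, alarm-type-id, alarm-type-qualifier) identifies exactly one entry, and that clearing an alarm updates that entry instead of deleting it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical key type: (resource, alarm-type-id, alarm-type-qualifier).
AlarmKey = Tuple[str, str, str]


@dataclass
class Alarm:
    """The three states of one alarm entry (illustrative names only)."""
    is_cleared: bool
    perceived_severity: str
    # Operator states in the order they were set; empty if no operator acted.
    operator_state_change: List[str] = field(default_factory=list)

    def operator_state(self) -> str:
        # Assumption for this sketch: "none" when no operator has acted yet.
        if self.operator_state_change:
            return self.operator_state_change[-1]
        return "none"


class AlarmList:
    """A function from alarm key to current alarm state, held as a dict."""

    def __init__(self) -> None:
        self.alarms: Dict[AlarmKey, Alarm] = {}

    def raise_alarm(self, key: AlarmKey, severity: str) -> Alarm:
        # A report for an existing key updates that alarm; it never
        # creates a second entry for the same resource and alarm type.
        alarm = self.alarms.get(key)
        if alarm is None:
            alarm = Alarm(is_cleared=False, perceived_severity=severity)
            self.alarms[key] = alarm
        else:
            alarm.is_cleared = False
            alarm.perceived_severity = severity
        return alarm

    def clear_alarm(self, key: AlarmKey) -> None:
        # Clearing is a boolean state change; the entry stays in the list
        # and keeps its last severity.
        self.alarms[key].is_cleared = True


alarm_list = AlarmList()
key = ("GigabitEthernet0/25", "link-alarm", "")
alarm_list.raise_alarm(key, "warning")
alarm_list.raise_alarm(key, "major")
alarm_list.clear_alarm(key)
```

After these calls the list still holds a single entry for the key, with is_cleared set to true and the last reported severity retained. The empty string as qualifier follows the module's convention for an unused alarm-type-qualifier in a key.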
854 In order to see the alarm history, the resource state changes are 855 available in the "status-change" list and the operator history is 856 available in the "operator-state-change" list. 858 6.5. The Shelved Alarms List 860 The shelved alarm list has the same structure as the alarm list 861 above. It shows all the alarms that match the shelving criteria 862 (/alarms/control/alarm-shelving). 864 6.6. RPCs 866 The alarm module supports RPCs and actions to manage the alarms: 868 "purge-alarms" (RPC): delete alarms according to specific 869 criteria, for example all cleared alarms older than a specific 870 date. 872 "compress" and "compress-alarms" RPCs: compress the status-change 873 list for the alarms. 875 "set-operator-state" action: change the operator state for an 876 alarm, for example to acknowledge it. 878 6.7. Notifications 880 The alarm module supports a general notification to report alarm 881 state changes. It carries all relevant parameters for the alarm 882 management application. 884 There is also a notification to report that an operator changed the 885 operator state on an alarm, for example by acknowledging it. 887 If the alarm inventory is changed, for example when a new card type is 888 inserted, a notification will tell the management application that 889 new alarm types are available. 891 7. Alarm YANG Module 893 file "ietf-alarms@2017-05-08.yang" 894 module ietf-alarms { 895 yang-version 1.1; 896 namespace "urn:ietf:params:xml:ns:yang:ietf-alarms"; 897 prefix al; 899 import ietf-yang-types { 900 prefix yang; 901 } 903 organization 904 "IETF NETMOD (NETCONF Data Modeling Language) Working Group"; 906 contact 907 "WG Web: 908 WG List: 910 Editor: Stefan Vallin 911 913 Editor: Martin Bjorklund 914 "; 916 description 917 "This module defines an interface for managing alarms. Main 918 inputs to the module design are the 3GPP Alarm IRP, ITU-T X.733 919 and ANSI/ISA-18.2 alarm standards. 921 Main features of this module include: 923 * Alarm list: 924 A list of all alarms.
Cleared alarms stay in 925 the list until explicitly removed. 927 * Operator actions on alarms: 928 Acknowledging and closing alarms. 930 * Administrative actions on alarms: 931 Purging alarms from the list according to specific 932 criteria. 934 * Alarm inventory: 935 A management application can read all 936 alarm types implemented by the system. 938 * Alarm shelving: 939 Shelving (blocking) alarms according 940 to specific criteria. 942 This module uses a stateful view on alarms. An alarm is a state 943 for a specific resource (note that an alarm is not a 944 notification). An alarm type is a possible alarm state for a 945 resource. For example, the tuple: 947 ('link-alarm', 'GigabitEthernet0/25') 949 is an alarm of type 'link-alarm' on the resource 950 'GigabitEthernet0/25'. 952 Alarm types are identified using YANG identities and an optional 953 string-based qualifier. The string-based qualifier allows for 954 dynamic extension of the statically defined alarm types. Alarm 955 types identify a possible alarm state and not the individual 956 notifications. For example, the traditional 'link-down' and 957 'link-up' notifications are two notifications referring to the 958 same alarm type 'link-alarm'. 960 With this design there is no ambiguity about how alarm and alarm 961 clear correlation should be performed: notifications that report 962 the same resource and alarm type are considered updates of the 963 same alarm, such as clearing an active alarm or changing the 964 severity of an alarm. 966 The instrumentation can update 'severity' and 'alarm-text' on an 967 existing alarm. 
The above alarm example can therefore look 968 like: 970 (('link-alarm', 'GigabitEthernet0/25'), 971 warning, 972 'interface down while interface admin state is up') 974 There is a clear separation between updates on the alarm from 975 the underlying resource, like clear, and updates from an 976 operator, like acknowledging or closing an alarm: 978 (('link-alarm', 'GigabitEthernet0/25'), 979 warning, 980 'interface down while interface admin state is up', 981 cleared, 982 closed) 984 Administrative actions like removing closed alarms older than a 985 given time are supported."; 987 revision 2017-05-08 { 988 description 989 "Initial revision."; 990 reference 991 "RFC XXXX: YANG Alarm Module"; 992 } 994 /* 995 * Features 996 */ 998 feature operator-actions { 999 description 1000 "This feature means that the system supports operator states 1001 on alarms."; 1002 } 1004 feature alarm-shelving { 1005 description 1006 "This feature means that the system supports shelving 1007 (blocking) alarms."; 1008 } 1010 feature alarm-history { 1011 description 1012 "This feature means that the alarm list also maintains a 1013 history of state changes for each alarm. For example, if an 1014 alarm toggles between cleared and active 10 times, a list for 1015 that alarm will show those state changes with time-stamps."; 1016 } 1017 /* 1018 * Identities 1019 */ 1021 identity alarm-identity { 1022 description 1023 "Base identity for alarm types. A unique identification of the 1024 alarm, not including the resource. Different resources can 1025 share alarm types. If the resource reports the same alarm 1026 type, it is to be considered to be the same alarm. The alarm 1027 type is a simplification of the different X.733 and 3GPP alarm 1028 IRP alarm correlation mechanisms and it allows for 1029 hierarchical extensions.
1031 A string-based qualifier can be used in addition to the 1032 identity in order to have different alarm types based on 1033 information not known at design-time, such as values in 1034 textual SNMP Notification var-binds. 1036 Standards and vendors can define sub-identities to clearly 1037 identify specific alarm types. 1039 This identity is abstract and shall not be used for alarms."; 1040 } 1042 /* 1043 * Common types 1044 */ 1046 typedef resource { 1047 type union { 1048 type instance-identifier { 1049 require-instance false; 1050 } 1051 type yang:object-identifier; 1052 type string; 1053 } 1054 description 1055 "This is an identification of the alarming resource, such as an 1056 interface. It should be as fine-grained as possible both to 1057 guide the operator and to guarantee uniqueness of the 1058 alarms. If a resource has both a config and a state tree 1059 normally this should identify the state tree, 1060 (e.g., /interfaces-state/interface/name). 1061 But if the instrumentation can detect a broken config, this 1062 should be identified as the resource. 1063 If the alarming resource is modelled in YANG, this 1064 type will be an instance-identifier. If the resource is an 1065 SNMP object, the type will be an object-identifier. If the 1066 resource is anything else, for example a distinguished name or 1067 a CIM path, this type will be a string."; 1068 } 1070 typedef alarm-text { 1071 type string; 1072 description 1073 "The string used to inform operators about the alarm. This 1074 MUST contain enough information for an operator to be able 1075 to understand the problem and how to resolve it. If this 1076 string contains structure, this format should be clearly 1077 documented for programs to be able to parse that 1078 information."; 1079 } 1081 typedef severity { 1082 type enumeration { 1083 enum indeterminate { 1084 value 2; 1085 description 1086 "Indicates that the severity level could not be 1087 determined. 
This level SHOULD be avoided."; 1088 } 1089 enum minor { 1090 value 3; 1091 description 1092 "The 'minor' severity level indicates the existence of a 1093 non-service affecting fault condition and that corrective 1094 action should be taken in order to prevent a more serious 1095 (for example, service affecting) fault. Such a severity 1096 can be reported, for example, when the detected alarm 1097 condition is not currently degrading the capacity of the 1098 resource."; 1099 } 1100 enum warning { 1101 value 4; 1102 description 1103 "The 'warning' severity level indicates the detection of 1104 a potential or impending service affecting fault, before 1105 any significant effects have been felt. Action should be 1106 taken to further diagnose (if necessary) and correct the 1107 problem in order to prevent it from becoming a more 1108 serious service affecting fault."; 1109 } 1110 enum major { 1111 value 5; 1112 description 1113 "The 'major' severity level indicates that a service 1114 affecting condition has developed and an urgent 1115 corrective action is required. Such a severity can be 1116 reported, for example, when there is a severe 1117 degradation in the capability of the resource 1118 and its full capability must be restored."; 1119 } 1120 enum critical { 1121 value 6; 1122 description 1123 "The 'critical' severity level indicates that a service 1124 affecting condition has occurred and an immediate 1125 corrective action is required. Such a severity can be 1126 reported, for example, when a resource becomes totally 1127 out of service and its capability must be restored."; 1128 } 1129 } 1130 description 1131 "The severity level of the alarm. Note well that value 'clear' 1132 is not included. 
If an alarm is cleared or not is a separate 1133 boolean flag."; 1134 reference 1135 "ITU Recommendation X.733: Information Technology 1136 - Open Systems Interconnection 1137 - System Management: Alarm Reporting Function"; 1138 } 1140 typedef severity-with-clear { 1141 type union { 1142 type enumeration { 1143 enum cleared { 1144 value 1; 1145 description 1146 "The alarm is cleared by the instrumentation."; 1147 } 1148 } 1149 type severity; 1150 } 1151 description 1152 "The severity level of the alarm including clear. 1153 This is used *only* in notifications reporting state changes 1154 for an alarm."; 1155 } 1157 typedef operator-state { 1158 type enumeration { 1159 enum none { 1160 value 1; 1161 description 1162 "The alarm is not being taken care of."; 1163 } 1164 enum ack { 1165 value 2; 1166 description 1167 "The alarm is being taken care of. Corrective action not 1168 taken yet, or failed"; 1169 } 1170 enum closed { 1171 value 3; 1172 description 1173 "Corrective action taken successfully."; 1174 } 1175 enum shelved { 1176 value 4; 1177 description 1178 "Alarm shelved. Alarms in alarms/shelved-alarms/ 1179 MUST be assigned this operator state by the server as 1180 the last entry in the operator-state-change list."; 1181 } 1182 enum un-shelved { 1183 value 5; 1184 description 1185 "Alarm moved back to alarm-list from shelf. 1186 Alarms 'moved' from /alarms/shelved-alarms/ 1187 to /alarms/alarm-list MUST be assigned this 1188 state by the server as the last entry in the 1189 operator-state-change list."; 1190 } 1192 } 1193 description 1194 "Operator states on an alarm. The 'closed' state indicates 1195 that an operator considers the alarm being resolved. This 1196 is separate from the resource alarm clear flag."; 1197 } 1199 /* Alarm type */ 1201 typedef alarm-type-id { 1202 type identityref { 1203 base alarm-identity; 1204 } 1205 description 1206 "Identifies an alarm type. 
The description of the alarm type 1207 id MUST indicate if the alarm type is abstract or not. An 1208 abstract alarm type is used as a base for other alarm type ids 1209 and will not be used as a value for an alarm or be present in 1210 the alarm inventory."; 1211 } 1213 typedef alarm-type-qualifier { 1214 type string; 1215 description 1216 "If an alarm type can not be fully specified at design time by 1217 alarm-type-id, this string qualifier is used in addition to 1218 fully define a unique alarm type. 1220 The definition of alarm qualifiers is considered being part 1221 of the instrumentation and out of scope for this module. 1222 An empty string is used when this is part of a key."; 1223 } 1225 /* 1226 * Groupings 1227 */ 1229 grouping common-alarm-parameters { 1230 description 1231 "Common parameters for an alarm. 1233 This grouping is used both in the alarm list and in the 1234 notification representing an alarm state change."; 1236 leaf resource { 1237 type resource; 1238 mandatory true; 1239 description 1240 "The alarming resource. See also 'alt-resource'. 1241 This could for example be a reference to the alarming 1242 interface"; 1243 } 1245 leaf alarm-type-id { 1246 type alarm-type-id; 1247 mandatory true; 1248 description 1249 "This leaf and the leaf 'alarm-type-qualifier' together 1250 provides a unique identification of the alarm type."; 1251 } 1253 leaf alarm-type-qualifier { 1254 type alarm-type-qualifier; 1255 description 1256 "This leaf is used when the 'alarm-type-id' leaf cannot 1257 uniquely identify the alarm type. Normally, this is not 1258 the case, and this leaf is the empty string."; 1259 } 1261 leaf-list alt-resource { 1262 type resource; 1263 description 1264 "Used if the alarming resource is available over other 1265 interfaces. 
This field can contain SNMP OID's, CIM paths or 1266 3GPP Distinguished names for example."; 1267 } 1269 list related-alarm { 1270 key "resource alarm-type-id alarm-type-qualifier"; 1272 description 1273 "References to related alarms. Note that the related alarm 1274 might have been removed from the alarm list."; 1276 leaf resource { 1277 type leafref { 1278 path "/alarms/alarm-list/alarm/resource"; 1279 require-instance false; 1280 } 1281 description 1282 "The alarming resource for the related alarm."; 1283 } 1284 leaf alarm-type-id { 1285 type leafref { 1286 path "/alarms/alarm-list/alarm" 1287 + "[resource=current()/../resource]" 1288 + "/alarm-type-id"; 1289 require-instance false; 1290 } 1291 description 1292 "The alarm type identifier for the related alarm."; 1293 } 1294 leaf alarm-type-qualifier { 1295 type leafref { 1296 path "/alarms/alarm-list/alarm" 1297 + "[resource=current()/../resource]" 1298 + "[alarm-type-id=current()/../alarm-type-id]" 1299 + "/alarm-type-qualifier"; 1300 require-instance false; 1301 } 1302 description 1303 "The alarm qualifier for the related alarm."; 1304 } 1305 } 1306 leaf-list impacted-resource { 1307 type resource; 1308 description 1309 "Resources that might be affected by this alarm. If the 1310 system creates an alarm on a resource and also has a mapping 1311 to other resources that might be impacted, these resources 1312 can be listed in this leaf-list. In this way the system can 1313 create one alarm instead of several. For example, if an 1314 interface has an alarm, the 'impacted-resource' can 1315 reference the aggregated port channels."; 1316 } 1317 leaf-list root-cause-resource { 1318 type resource; 1319 description 1320 "Resources that are candidates for causing the alarm. If the 1321 system has a mechanism to understand the candidate root 1322 causes of an alarm, this leaf-list can be used to list the 1323 root cause candidate resources. In this way the system can 1324 create one alarm instead of several. 
An example might be a 1325 logging system (alarm resource) that fails; the alarm can 1326 reference the file-system in the 'root-cause-resource' 1327 leaf-list. Note that the intended use is not to also send 1328 an alarm with the root-cause-resource as alarming resource. 1329 The root-cause-resource leaf-list is a hint and should not 1330 also generate an alarm for the same problem."; 1331 } 1332 } 1334 grouping alarm-state-change-parameters { 1335 description 1336 "Parameters for an alarm state change. 1338 This grouping is used both in the alarm list's 1339 status-change list and in the notification representing an 1340 alarm state change."; 1342 leaf time { 1343 type yang:date-and-time; 1344 mandatory true; 1345 description 1346 "The time the status of the alarm changed. The value 1347 represents the time the real alarm state change appeared 1348 in the resource and not when it was added to the 1349 alarm list. The /alarms/alarm-list/alarm/last-changed MUST be 1350 set to the same value."; 1351 } 1352 leaf perceived-severity { 1353 type severity-with-clear; 1354 mandatory true; 1355 description 1356 "The severity of the alarm as defined by X.733.
Note 1357 that this may not be the original severity since the alarm 1358 may have changed severity."; 1359 reference 1360 "ITU Recommendation X.733: Information Technology 1361 - Open Systems Interconnection 1362 - System Management: Alarm Reporting Function"; 1363 } 1364 leaf alarm-text { 1365 type alarm-text; 1366 mandatory true; 1367 description 1368 "A user-friendly text describing the alarm state change."; 1369 reference 1370 "ITU Recommendation X.733: Information Technology 1371 - Open Systems Interconnection 1372 - System Management: Alarm Reporting Function"; 1373 } 1374 } 1376 grouping operator-parameters { 1377 description 1378 "This grouping defines parameters that can 1379 be changed by an operator"; 1380 leaf time { 1381 type yang:date-and-time; 1382 mandatory true; 1383 description 1384 "Timestamp for operator action on alarm."; 1385 } 1386 leaf operator { 1387 type string; 1388 mandatory true; 1389 description 1390 "The name of the operator that has acted on this 1391 alarm."; 1392 } 1393 leaf state { 1394 type operator-state; 1395 mandatory true; 1396 description 1397 "The operator's view of the alarm state."; 1398 } 1399 leaf text { 1400 type string; 1401 description 1402 "Additional optional textual information provided by 1403 the operator."; 1404 } 1405 } 1407 grouping resource-alarm-parameters { 1408 description 1409 "Alarm parameters that originate from the resource view."; 1410 leaf is-cleared { 1411 type boolean; 1412 mandatory true; 1413 description 1414 "Indicates the current clearance state of the alarm. An 1415 alarm might toggle from active alarm to cleared alarm and 1416 back to active again."; 1417 } 1419 leaf last-changed { 1420 type yang:date-and-time; 1421 mandatory true; 1422 description 1423 "A timestamp when the alarm status was last changed.
Status 1424 changes are changes to 'is-cleared', 'perceived-severity', 1425 and 'alarm-text'."; 1426 } 1428 leaf perceived-severity { 1429 type severity; 1430 mandatory true; 1431 description 1432 "The last severity of the alarm. 1434 If an alarm was raised with severity 'warning', but later 1435 changed to 'major', this leaf will show 'major'."; 1436 } 1438 leaf alarm-text { 1439 type alarm-text; 1440 mandatory true; 1441 description 1442 "The last reported alarm text. This text should contain 1443 information for an operator to be able to understand 1444 the problem and how to resolve it."; 1445 } 1447 list status-change { 1448 if-feature alarm-history; 1449 key time; 1450 min-elements 1; 1451 description 1452 "A list of status change events for this alarm. 1454 The entry with the latest time-stamp in this list MUST 1455 correspond to the leafs 'is-cleared', 'perceived-severity' 1456 and 'alarm-text' for the alarm. The time-stamp for that 1457 entry MUST be equal to the 'last-changed' leaf. 1459 This list is ordered according to the timestamps of 1460 alarm state changes. The last item corresponds to the 1461 latest state change. 1463 The following state changes create an entry in this 1464 list: 1465 - changed severity (warning, minor, major, critical) 1466 - clearance status, this also updates the 'is-cleared' 1467 leaf 1468 - alarm text update"; 1470 uses alarm-state-change-parameters; 1471 } 1472 } 1474 /* 1475 * The /alarms data tree 1476 */ 1478 container alarms { 1479 description 1480 "The top container for this module"; 1481 container control { 1482 description 1483 "Configuration to control the alarm behaviour."; 1484 leaf max-alarm-status-changes { 1485 type union { 1486 type uint16; 1487 type enumeration { 1488 enum infinite { 1489 description 1490 "The status change entries are accumulated 1491 infinitely."; 1492 } 1493 } 1494 } 1495 default 32; 1496 description 1497 "The status-change entries are kept in a circular list 1498 per alarm.
When this number is exceeded, the oldest 1499 status change entry is automatically removed. If the 1500 value is 'infinite', the status change entries are 1501 accumulated infinitely."; 1502 } 1504 leaf notify-status-changes { 1505 type boolean; 1506 default false; 1507 description 1508 "This leaf controls whether notifications are sent on all 1509 alarm status updates, e.g., updated perceived-severity or 1510 alarm-text. By default the notifications are only sent 1511 when a new alarm is raised, re-raised after being cleared 1512 and when an alarm is cleared."; 1513 } 1514 container alarm-shelving { 1515 if-feature alarm-shelving; 1516 description 1517 "This list is used to shelve alarms. The server will move 1518 any alarms corresponding to the shelving criteria from the 1519 alarms/alarm-list/alarm list to the 1520 alarms/shelved-alarms/shelved-alarm list. It will also 1521 stop sending notifications for the shelved alarms. The 1522 conditions in the shelf criteria are logically ANDed. 1523 When the shelving criteria is deleted or changed, the 1524 non-matching alarms MUST appear in the 1525 alarms/alarm-list/alarm list according to the real state. 1526 This means that the instrumentation MUST maintain states 1527 for the shelved alarms. Alarms that match the criteria 1528 shall have an operator-state 'shelved'."; 1529 list shelf { 1530 key shelf-name; 1531 leaf shelf-name { 1532 type string; 1533 description 1534 "An arbitrary name for the alarm shelf."; 1535 } 1536 description 1537 "Each entry defines the criteria for shelving alarms. 
1538 Criteria are ANDed."; 1540 leaf resource { 1541 type resource; 1542 description 1543 "Shelve alarms for this resource."; 1544 } 1545 leaf alarm-type-id { 1546 type alarm-type-id; 1547 description 1548 "Shelve alarms for this alarm type identifier."; 1549 } 1550 leaf alarm-type-qualifier { 1551 type alarm-type-qualifier; 1552 description 1553 "Shelve alarms for this alarm type qualifier."; 1554 } 1555 leaf description { 1556 type string; 1557 description 1558 "An optional textual description of the shelf. This 1559 description should include the reason for shelving 1560 these alarms."; 1561 } 1562 } 1563 } 1564 } 1566 container alarm-inventory { 1567 config false; 1568 description 1569 "This list contains all possible alarm types for the system. 1570 If the system knows for which resources a specific alarm 1571 type can appear, this is also identified in the inventory. 1572 The list also tells if each alarm type has a corresponding 1573 clear state. The inventory shall only contain concrete 1574 alarm types. 1576 The alarm inventory MUST be updated by the system when new 1577 alarms can appear. This can be the case when installing new 1578 software modules or inserting new card types. A 1579 notification 'alarm-inventory-changed' is sent when the 1580 inventory is changed."; 1582 list alarm-type { 1583 key "alarm-type-id alarm-type-qualifier"; 1584 description 1585 "An entry in this list defines a possible alarm."; 1586 leaf alarm-type-id { 1587 type alarm-type-id; 1588 mandatory true; 1589 description 1590 "The statically defined alarm type identifier for this 1591 possible alarm."; 1592 } 1593 leaf alarm-type-qualifier { 1594 type alarm-type-qualifier; 1595 description 1596 "The optionally dynamically defined alarm type identifier 1597 for this possible alarm."; 1598 } 1599 leaf-list resource { 1600 type string; 1601 description 1602 "Optionally, specifies for which resources the alarm type 1603 is valid.
This string is for human consumption but 1604 SHOULD refer to paths in the model."; 1605 } 1606 leaf has-clear { 1607 type boolean; 1608 mandatory true; 1609 description 1610 "This leaf tells the operator if the alarm will be 1611 cleared when the correct corrective action has been 1612 taken. Implementations SHOULD strive for detecting the 1613 cleared state for all alarm types. If this leaf is 1614 true, the operator can monitor the alarm until it 1615 becomes cleared after the corrective action has been 1616 taken. If this leaf is false, the operator needs to 1617 validate that the alarm is no longer active using other 1618 mechanisms. Alarms can lack a corresponding clear due 1619 to missing instrumentation or because there is no logical 1620 corresponding clear state."; 1621 } 1622 leaf description { 1623 type string; 1624 mandatory true; 1625 description 1626 "A description of the possible alarm. It SHOULD include 1627 information on possible underlying root causes and 1628 corrective actions."; 1629 } 1630 } 1631 } 1633 container summary { 1634 config false; 1635 description 1636 "This container gives a summary of the number of alarms 1637 and shelved alarms"; 1638 list alarm-summary { 1639 key severity; 1640 description 1641 "A global summary of all alarms in the system."; 1642 leaf severity { 1643 type severity; 1644 description 1645 "Alarm summary for this severity level."; 1646 } 1647 leaf total { 1648 type yang:gauge32; 1649 description 1650 "Total number of alarms of this severity level."; 1651 } 1652 leaf cleared { 1653 type yang:gauge32; 1654 description 1655 "For this severity level, the number of alarms that are 1656 cleared."; 1657 } 1658 leaf cleared-not-closed { 1659 if-feature operator-actions; 1660 type yang:gauge32; 1661 description 1662 "For this severity level, the number of alarms that are 1663 cleared but not closed."; 1664 } 1665 leaf cleared-closed { 1666 if-feature operator-actions; 1667 type yang:gauge32; 1668 description 1669 "For this
severity level, the number of alarms that are 1670 cleared and closed."; 1671 } 1672 leaf not-cleared-closed { 1673 if-feature operator-actions; 1674 type yang:gauge32; 1675 description 1676 "For this severity level, the number of alarms that are 1677 not cleared but closed."; 1678 } 1679 leaf not-cleared-not-closed { 1680 if-feature operator-actions; 1681 type yang:gauge32; 1682 description 1683 "For this severity level, the number of alarms that are 1684 not cleared and not closed."; 1685 } 1686 } 1687 leaf shelves-active { 1688 if-feature alarm-shelving; 1689 type empty; 1690 description 1691 "This is a hint to the operator that there are active 1692 alarm shelves. This leaf MUST exist if the 1693 alarms/shelved-alarms/number-of-shelved-alarms is > 0."; 1694 } 1695 } 1697 container alarm-list { 1698 config false; 1699 description 1700 "The alarms in the system."; 1701 leaf number-of-alarms { 1702 type yang:gauge32; 1703 description 1704 "This object shows the total number of 1705 alarms in the system, i.e., the total number 1706 of entries in the alarm list."; 1707 } 1709 leaf last-changed { 1710 type yang:date-and-time; 1711 description 1712 "A timestamp when the alarm list was last 1713 changed. The value can be used by a manager to 1714 initiate an alarm resynchronization procedure."; 1715 } 1717 list alarm { 1718 key "resource alarm-type-id alarm-type-qualifier"; 1719 description 1720 "The list of alarms. Each entry in the list holds one 1721 alarm for a given alarm type and resource. 1722 An alarm can be updated from the underlying resource or 1723 by the user. The following leafs are maintained by the 1724 resource: is-cleared, last-changed, perceived-severity, 1725 and alarm-text. An operator can change: operator-state 1726 and operator-text. 1728 Entries appear in the alarm list the first time an 1729 alarm becomes active for a given alarm-type and resource.
1730 Entries do not get deleted when the alarm is cleared; this 1731 is a boolean state in the alarm. 1733 Alarm entries are removed, purged, from the list by an 1734 explicit purge action. For example, delete all alarms 1735 that are cleared and in closed operator-state that are 1736 older than 24 hours. Systems may also remove alarms based 1737 on locally configured policies, which is out of scope for 1738 this module."; 1739 leaf time-created { 1740 type yang:date-and-time; 1741 mandatory true; 1742 description 1743 "The time-stamp when this alarm entry was created. This 1744 represents the first time the alarm appeared; it can 1745 also represent that the alarm re-appeared after a purge. 1746 Further state-changes of the same alarm do not change 1747 this leaf; these changes will update the 'last-changed' 1748 leaf."; 1749 } 1751 uses common-alarm-parameters; 1752 uses resource-alarm-parameters; 1753 list operator-state-change { 1754 if-feature operator-actions; 1755 key time; 1756 description 1757 "This list is used by operators to indicate 1758 the state of human intervention on an alarm. 1759 For example, if an operator has seen an alarm, 1760 the operator can add a new item to this list indicating 1761 that the alarm is acknowledged."; 1762 uses operator-parameters; 1763 } 1765 action set-operator-state { 1766 if-feature operator-actions; 1767 description 1768 "This is a means for the operator to indicate 1769 the level of human intervention on an alarm."; 1770 input { 1771 leaf state { 1772 type operator-state; 1773 mandatory true; 1774 description 1775 "Set this operator state."; 1776 } 1777 leaf text { 1778 type string; 1779 description 1780 "Additional optional textual information."; 1781 } 1782 } 1783 } 1784 } 1785 } 1787 container shelved-alarms { 1788 if-feature alarm-shelving; 1789 config false; 1790 description 1791 "The shelved alarms. Alarms appear here if they match the 1792 criteria in /alarms/control/alarm-shelving.
            This list does not generate any notifications. The list
            represents alarms that are considered not relevant by the
            operator. Alarms in this list have an operator-state of
            'shelved'. This cannot be changed.";
         leaf number-of-shelved-alarms {
           type yang:gauge32;
           description
             "This object shows the total number of currently
              shelved alarms, i.e., the total number of entries
              in the shelved alarm list.";
         }

         leaf alarm-shelf-last-changed {
           type yang:date-and-time;
           description
             "A timestamp when the shelved alarm list was last
              changed. The value can be used by a manager to
              initiate an alarm resynchronization procedure.";
         }

         list shelved-alarm {
           key "resource alarm-type-id alarm-type-qualifier";
           description
             "The list of shelved alarms. Each entry in the list holds
              one alarm for a given alarm type and resource. An alarm
              can be updated from the underlying resource or by the
              user. These changes are reflected in different lists
              below the corresponding alarm.";

           uses common-alarm-parameters;
           uses resource-alarm-parameters;

           list operator-state-change {
             if-feature operator-actions;
             key time;
             description
               "This list is used by operators to indicate
                the state of human intervention on an alarm.
                For example, if an operator has seen an alarm,
                the operator can add a new item to this list indicating
                that the alarm is acknowledged.";
             uses operator-parameters;
           }
         }
       }
     }

     /*
      * Operations
      */

     rpc compress-alarms {
       if-feature alarm-history;
       description
         "This operation requests the server to compress entries in the
          alarm list by removing all but the latest state change for all
          alarms. Conditions in the input are logically ANDed.
          If no
          input condition is given, all alarms are compressed.";
       input {
         leaf resource {
           type leafref {
             path "/alarms/alarm-list/alarm/resource";
             require-instance false;
           }
           description
             "Compress the alarms with this resource.";
         }
         leaf alarm-type-id {
           type leafref {
             path "/alarms/alarm-list/alarm/alarm-type-id";
           }
           description
             "Compress alarms with this alarm-type-id.";
         }
         leaf alarm-type-qualifier {
           type leafref {
             path "/alarms/alarm-list/alarm/alarm-type-qualifier";
           }
           description
             "Compress the alarms with this alarm-type-qualifier.";
         }
       }
       output {
         leaf compressed-alarms {
           type uint32;
           description
             "Number of compressed alarm entries.";
         }
       }
     }

     grouping filter-input {
       description
         "Grouping to specify a filter construct on alarm information.";
       leaf alarm-status {
         type enumeration {
           enum any {
             description
               "Ignore alarm clearance status.";
           }
           enum cleared {
             description
               "Filter cleared alarms.";
           }
           enum not-cleared {
             description
               "Filter not cleared alarms.";
           }
         }
         mandatory true;
         description
           "The clearance status of the alarm.";
       }

       container older-than {
         presence "Age specification";
         description
           "Matches the 'last-status-change' leaf in the alarm.";

         choice age-spec {
           description
             "Filter using date and time age.";
           case seconds {
             leaf seconds {
               type uint16;
               description
                 "Seconds part.";
             }
           }
           case minutes {
             leaf minutes {
               type uint16;
               description
                 "Minute part.";
             }
           }
           case hours {
             leaf hours {
               type uint16;
               description
                 "Hours part.";
             }
           }
           case days {
             leaf days {
               type uint16;
               description
                 "Day part.";
             }
           }
           case weeks {
             leaf weeks {
               type
                 uint16;
               description
                 "Week part.";
             }
           }
         }
       }
       container severity {
         presence "Severity filter";
         choice sev-spec {
           description
             "Filter based on severity level.";
           leaf below {
             type severity;
             description
               "Severity level lower than this leaf.";
           }
           leaf is {
             type severity;
             description
               "Severity level equal to this leaf.";
           }
           leaf above {
             type severity;
             description
               "Severity level higher than this leaf.";
           }
         }
         description
           "Filter based on severity.";
       }
       container operator-state-filter {
         if-feature operator-actions;
         presence "Operator state filter";
         leaf state {
           type operator-state;
           description
             "Filter on operator state.";
         }
         leaf user {
           type string;
           description
             "Filter based on which operator.";
         }
         description
           "Filter based on operator state.";
       }
     }

     rpc purge-alarms {
       description
         "This operation requests the server to delete entries from the
          alarm list according to the supplied criteria. Typically it
          can be used to delete alarms that are in closed operator state
          and older than a specified time. The number of purged alarms
          is returned as an output parameter.";
       input {
         uses filter-input;
       }
       output {
         leaf purged-alarms {
           type uint32;
           description
             "Number of purged alarms.";
         }
       }
     }

     /*
      * Notifications
      */

     notification alarm-notification {
       description
         "This notification is used to report a state change for an
          alarm.
          The same notification is used for reporting a newly
          raised alarm, a cleared alarm, or a change to the text and/or
          severity of an existing alarm.";

       uses common-alarm-parameters;
       uses alarm-state-change-parameters;
     }

     notification alarm-inventory-changed {
       description
         "This notification is used to report that the list of possible
          alarms has changed. This can happen, for example, when a new
          software module is installed or a new physical card is
          inserted.";
     }

     notification operator-action {
       if-feature operator-actions;
       description
         "This notification is used to report that an operator
          acted upon an alarm.";

       leaf resource {
         type leafref {
           path "/alarms/alarm-list/alarm/resource";
           require-instance false;
         }
         description
           "The alarming resource.";
       }
       leaf alarm-type-id {
         type leafref {
           path "/alarms/alarm-list/alarm"
              + "[resource=current()/../resource]"
              + "/alarm-type-id";
           require-instance false;
         }
         description
           "The alarm type identifier for the alarm.";
       }
       leaf alarm-type-qualifier {
         type leafref {
           path "/alarms/alarm-list/alarm"
              + "[resource=current()/../resource]"
              + "[alarm-type-id=current()/../alarm-type-id]"
              + "/alarm-type-qualifier";
           require-instance false;
         }
         description
           "The alarm qualifier for the alarm.";
       }
       uses operator-parameters;
     }
   }

8.  X.733 Alarm Mapping Data Model

   Many alarm management systems are based on the X.733 alarm standard.
   This YANG module allows a mapping from alarm types to X.733
   event-type and probable-cause.

   The module augments the alarm inventory, the alarm list, and the
   alarm notification with X.733 parameters.

   The module also supports a feature whereby the alarm manager can
   configure the mapping.
   This might be needed when the default mapping
   provided by the system conflicts with other systems or is otherwise
   considered unsuitable.

9.  X.733 Alarm Mapping YANG Module

   This YANG module references [X.733] and [X.736].

   file "ietf-alarms-x733@2017-05-08.yang"
   module ietf-alarms-x733 {
     yang-version 1.1;
     namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733";
     prefix x733;

     import ietf-alarms {
       prefix al;
     }
     organization
       "IETF NETMOD (NETCONF Data Modeling Language) Working Group";

     contact
       "WG Web:
        WG List:

        Editor:   Stefan Vallin

        Editor:   Martin Bjorklund
       ";

     description
       "This module augments the ietf-alarms module with X.733 mapping
        information. The following structures are augmented with
        event type and probable cause:

        1) alarm inventory: all possible alarms.
        2) alarm: every alarm in the system.
        3) alarm notification: notifications indicating alarm state
           changes.

        The module also optionally allows the alarm management system
        to configure the mapping. The mapping does not include a
        corresponding specific problem value.
        The recommendation is
        to use alarm-type-qualifier, which serves the same purpose.";
     reference
       "ITU Recommendation X.733: Information Technology
          - Open Systems Interconnection
          - System Management: Alarm Reporting Function";

     revision 2017-05-08 {
       description
         "Initial revision.";
       reference
         "RFC XXXX: YANG Alarm Module";
     }

     /*
      * Features
      */

     feature configure-x733-mapping {
       description
         "The system supports configurable X.733 mapping from
          alarm type to event type and probable cause.";
     }

     /*
      * Typedefs
      */

     typedef event-type {
       type enumeration {
         enum other {
           value 1;
           description
             "None of the below.";
         }
         enum communications-alarm {
           value 2;
           description
             "An alarm of this type is principally associated with the
              procedures and/or processes required to convey
              information from one point to another.";
           reference
             "ITU Recommendation X.733: Information Technology
                - Open Systems Interconnection
                - System Management: Alarm Reporting Function";
         }
         enum quality-of-service-alarm {
           value 3;
           description
             "An alarm of this type is principally associated with a
              degradation in the quality of a service.";
           reference
             "ITU Recommendation X.733: Information Technology
                - Open Systems Interconnection
                - System Management: Alarm Reporting Function";
         }
         enum processing-error-alarm {
           value 4;
           description
             "An alarm of this type is principally associated with a
              software or processing fault.";
           reference
             "ITU Recommendation X.733: Information Technology
                - Open Systems Interconnection
                - System Management: Alarm Reporting Function";
         }
         enum equipment-alarm {
           value 5;
           description
             "An alarm of this type is principally associated with an
              equipment fault.";
           reference
             "ITU Recommendation X.733:
              Information Technology
                - Open Systems Interconnection
                - System Management: Alarm Reporting Function";
         }
         enum environmental-alarm {
           value 6;
           description
             "An alarm of this type is principally associated with a
              condition relating to an enclosure in which the equipment
              resides.";
           reference
             "ITU Recommendation X.733: Information Technology
                - Open Systems Interconnection
                - System Management: Alarm Reporting Function";
         }
         enum integrity-violation {
           value 7;
           description
             "An indication that information may have been illegally
              modified, inserted or deleted.";
           reference
             "ITU Recommendation X.736: Information Technology
                - Open Systems Interconnection
                - System Management: Security Alarm Reporting Function";
         }
         enum operational-violation {
           value 8;
           description
             "An indication that the provision of the requested service
              was not possible due to the unavailability, malfunction or
              incorrect invocation of the service.";
           reference
             "ITU Recommendation X.736: Information Technology
                - Open Systems Interconnection
                - System Management: Security Alarm Reporting Function";
         }
         enum physical-violation {
           value 9;
           description
             "An indication that a physical resource has been violated
              in a way that suggests a security attack.";
           reference
             "ITU Recommendation X.736: Information Technology
                - Open Systems Interconnection
                - System Management: Security Alarm Reporting Function";
         }
         enum security-service-or-mechanism-violation {
           value 10;
           description
             "An indication that a security attack has been detected by
              a security service or mechanism.";
           reference
             "ITU Recommendation X.736: Information Technology
                - Open Systems Interconnection
                - System Management: Security Alarm Reporting Function";
         }
         enum time-domain-violation {
           value 11;
           description
             "An indication that an event has occurred at an unexpected
              or prohibited time.";
           reference
             "ITU Recommendation X.736: Information Technology
                - Open Systems Interconnection
                - System Management: Security Alarm Reporting Function";
         }
       }
       description
         "The event types as defined by X.733 and X.736. The use of the
          term 'event' is a bit confusing. In an alarm context these
          are top-level alarm types.";
     }

     /*
      * Groupings
      */

     grouping x733-alarm-parameters {
       description
         "Common X.733 parameters for alarms.";

       leaf event-type {
         type event-type;
         description
           "The X.733/X.736 event type for this alarm.";
       }
       leaf probable-cause {
         type uint32;
         description
           "The X.733 probable cause for this alarm.";
       }
     }

     grouping x733-alarm-definition-parameters {
       description
         "Common X.733 parameters for alarm definitions.";

       leaf event-type {
         type event-type;
         description
           "The alarm type has this X.733/X.736 event type.";
       }
       leaf probable-cause {
         type uint32;
         description
           "The alarm type has this X.733 probable cause value.
            This module defines probable cause as an integer
            and not as an enumeration. The reason is that the
            primary use of probable cause is in the management
            application if it is based on the X.733 standard.
            However, most management applications have their own
            defined enum definitions and merging enums from
            different systems might create conflicts. By using
            a configurable uint32 the system can be configured
            to match the enum values in the manager.";
       }
     }

     /*
      * Add X.733 parameters to the alarm definitions, alarms,
      * and notification.
      */

     augment "/al:alarms/al:alarm-inventory/al:alarm-type" {
       description
         "Augment X.733 mapping information to the alarm inventory.";

       uses x733-alarm-definition-parameters;
     }

     augment "/al:alarms/al:control" {
       description
         "Add X.733 mapping capabilities.";
       list x733-mapping {
         if-feature configure-x733-mapping;
         key "alarm-type-id alarm-type-qualifier-match";
         description
           "This list allows a management application to control the
            X.733 mapping for all alarm types in the system. Any entry
            in this list will allow the alarm manager to override the
            default X.733 mapping in the system and the final mapping
            will be shown in the alarm-inventory.";

         leaf alarm-type-id {
           type al:alarm-type-id;
           description
             "Map the alarm type with this alarm type identifier.";
         }
         leaf alarm-type-qualifier-match {
           type string;
           description
             "A W3C regular expression that is used when mapping an
              alarm type and alarm-type-qualifier to X.733 parameters.";
         }

         uses x733-alarm-definition-parameters;
       }
     }

     augment "/al:alarms/al:alarm-list/al:alarm" {
       description
         "Augment X.733 information to the alarm.";

       uses x733-alarm-parameters;
     }

     augment "/al:alarms/al:shelved-alarms/al:shelved-alarm" {
       description
         "Augment X.733 information to the shelved alarm.";

       uses x733-alarm-parameters;
     }

     augment "/al:alarm-notification" {
       description
         "Augment X.733 information to the alarm notification.";

       uses x733-alarm-parameters;
     }
   }

10.  Security Considerations

   None.

11.  Acknowledgements

   The authors wish to thank Viktor Leijon and Johan Nordlander for
   their valuable input on forming the alarm model.

12.  References

12.1.
Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
              RFC 7950, DOI 10.17487/RFC7950, August 2016.

12.2.  Informative References

   [ALARMIRP]
              3GPP, "Telecommunication management; Fault Management;
              Part 2: Alarm Integration Reference Point (IRP):
              Information Service (IS)", 3GPP TS 32.111-2 3.4.0, March
              2005.

   [ALARMSEM]
              Wallin, S., Leijon, V., Nordlander, J., and N. Bystedt,
              "The semantics of alarm definitions: enabling systematic
              reasoning about alarms", International Journal of Network
              Management, Volume 22, Issue 3, John Wiley and Sons, Ltd,
              http://dx.doi.org/10.1002/nem.800, March 2012.

   [EEMUA]    Engineering Equipment and Materials Users Association,
              "Alarm Systems: A Guide to Design, Management and
              Procurement", EEMUA Publication No. 191, 2nd edition,
              London, 2007.

   [ISA182]   International Society of Automation (ISA),
              "ANSI/ISA-18.2-2009 Management of Alarm Systems for the
              Process Industries", 2009.

   [RFC3877]  Chisholm, S. and D. Romascanu, "Alarm Management
              Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877,
              September 2004.

   [X.733]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Alarm Reporting Function",
              ITU-T Recommendation X.733, 1992.

   [X.736]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Security alarm reporting function",
              ITU-T Recommendation X.736, 1992.

Appendix A.  Enterprise-specific Alarm-Types Example

   This example shows how to define alarm-types in an
   enterprise-specific module.
   In this case "xyz" has chosen to define top-level
   identities according to X.733 event types.

   module example-xyz-alarms {
     namespace "urn:example:xyz-alarms";
     prefix xyz-al;

     import ietf-alarms {
       prefix al;
     }

     identity xyz-alarms {
       base al:alarm-identity;
     }

     identity communications-alarm {
       base xyz-alarms;
     }
     identity quality-of-service-alarm {
       base xyz-alarms;
     }
     identity processing-error-alarm {
       base xyz-alarms;
     }
     identity equipment-alarm {
       base xyz-alarms;
     }
     identity environmental-alarm {
       base xyz-alarms;
     }

     // communications alarms
     identity link-alarm {
       base communications-alarm;
     }

     // QoS alarms
     identity high-jitter-alarm {
       base quality-of-service-alarm;
     }
   }

Appendix B.  Alarm Inventory Example

   This example shows an alarm inventory with two alarm types: one
   defined only with the identifier, and another that is dynamically
   configured. In the latter case a digital input has been connected
   to a smoke detector; therefore the 'alarm-type-qualifier' is set to
   "smoke-alarm" and the 'alarm-type-id' to "environmental-alarm".

     xyz-al:link-alarm
     true
     Link failure, operational state down but admin state up

     xyz-al:environmental-alarm
     smoke-alarm
     true
     Connected smoke detector to digital input

Appendix C.  Alarm List Example

   In this example we show an alarm that has toggled [major, clear,
   major]. An operator has acknowledged the alarm.

     1
     2015-04-08T08:39:50.00Z

     /dev:interfaces/dev:interface[name='FastEthernet1/0']
     xyz-al:link-alarm
     2015-04-08T08:39:50.00Z
     false
     1.3.6.1.2.1.2.2.1.1.17
     2015-04-08T08:39:40.00Z
     major
     Link operationally down but administratively up

     major
     Link operationally down but administratively up

     cleared
     Link operationally up and administratively up

     major
     Link operationally down but administratively up

     ack
     joe
     Will investigate, ticket TR764999

Appendix D.  Alarm Shelving Example

   This example shows how to shelve alarms. We shelve alarms related to
   the smoke detectors since they are being installed and tested. We
   also shelve all alarms from FastEthernet1/0.

     FE10
     /dev:interfaces/dev:interface[name='FastEthernet1/0']

     detectortest
     xyz-al:environmental-alarm
     smoke-alarm

Appendix E.  X.733 Mapping Example

   This example shows how to map a dynamic alarm type
   (alarm-type-id=environmental-alarm,
   alarm-type-qualifier=smoke-alarm) to the corresponding X.733
   event-type and probable cause parameters.

     xyz-al:environmental-alarm
     smoke-alarm
     quality-of-service-alarm
     777

Authors' Addresses

   Stefan Vallin
   Stefan Vallin AB

   Email: stefan@wallan.se


   Martin Bjorklund
   Cisco

   Email: mbj@tail-f.com