Network Working Group                                          S. Vallin
Internet-Draft                                          Stefan Vallin AB
Intended status: Standards Track                            M. Bjorklund
Expires: August 1, 2019                                            Cisco
                                                        January 28, 2019

                           YANG Alarm Module
                    draft-ietf-ccamp-alarm-module-07

Abstract

   This document defines a YANG module for alarm management. It
   includes functions for alarm list management, alarm shelving and
   notifications to inform management systems. There are also
   operations to manage the operator state of an alarm and
   administrative alarm procedures. The module carefully maps to
   relevant alarm standards.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 1, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology and Notation
   2.  Objectives
   3.  Alarm Module Concepts
     3.1.  Alarm Definition
     3.2.  Alarm Type
     3.3.  Identifying the Alarming Resource
     3.4.  Identifying Alarm Instances
     3.5.  Alarm Life-Cycle
       3.5.1.  Resource Alarm Life-Cycle
       3.5.2.  Operator Alarm Life-Cycle
       3.5.3.  Administrative Alarm Life-Cycle
     3.6.  Root Cause, Impacted Resources and Related Alarms
     3.7.  Alarm Shelving
     3.8.  Alarm Profiles
   4.  Alarm Data Model
     4.1.  Alarm Control
       4.1.1.  Alarm Shelving
     4.2.  Alarm Inventory
     4.3.  Alarm Summary
     4.4.  The Alarm List
     4.5.  The Shelved Alarms List
     4.6.  Alarm Profiles
     4.7.  Operations
     4.8.  Notifications
   5.  Relationship to the ietf-hardware YANG module
   6.  Alarm YANG Module
   7.  X.733 Extensions
   8.  The X.733 Mapping Module
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Vendor-specific Alarm-Types Example
   Appendix B.  Alarm Inventory Example
   Appendix C.  Alarm List Example
   Appendix D.  Alarm Shelving Example
   Appendix E.  X.733 Mapping Example
   Appendix F.  Relationship to other alarm standards
     F.1.  Alarm definition
     F.2.  Data model
       F.2.1.  X.733
       F.2.2.  RFC 3877, the Alarm MIB
       F.2.3.  3GPP Alarm IRP
       F.2.4.  G.7710
   Appendix G.  Alarm Usability Requirements
   Authors' Addresses

1.  Introduction

   This document defines a YANG [RFC7950] module for alarm management.
   The purpose is to define a standardized alarm interface for network
   devices that can be easily integrated into management applications.
   The model is also applicable as a northbound alarm interface in the
   management applications.

   Alarm monitoring is a fundamental part of monitoring the network.
   Raw alarms from devices do not always tell the status of the network
   services or necessarily point to the root cause.
   However, being able to feed alarms to the alarm management
   application in a standardized format is a starting point for
   performing higher level network assurance tasks.

   The design of the module is based on experience from using and
   implementing available alarm standards from ITU [X.733], 3GPP
   [ALARMIRP] and ANSI [ISA182].

1.1.  Terminology and Notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   The following terms are defined in [RFC7950]:

   o  action

   o  client

   o  data tree

   o  server

   The following terms are used within this document:

   o  Alarm (the general concept): An alarm signifies an undesirable
      state in a resource that requires corrective action.

   o  Alarm Type: An alarm type identifies a possible unique alarm state
      for a resource. Alarm types are names to identify the state like
      "link-alarm", "jitter-violation", "high-disk-utilization".

   o  Resource: A fine-grained identification of the alarming resource,
      for example: an interface, a process.

   o  Alarm Instance: The alarm state for a specific resource and alarm
      type. For example (GigabitEthernet0/15, link-alarm). An entry in
      the alarm list.

   o  Alarm Inventory: A list of all possible alarm types on a system.

   o  Alarm Shelving: Blocking alarms according to specific criteria.

   o  Corrective Action: An action taken by an operator or automation
      routine in order to minimize the impact of the alarm or to resolve
      the root cause.

   o  Management System: The alarm management application that consumes
      the alarms, i.e., acts as a client.

   o  System: The system that implements this YANG alarm module, i.e.,
      acts as a server.
      This corresponds to a network device or a management application
      that provides a north-bound alarm interface.

   Tree diagrams used in this document follow the notation defined in
   [RFC8340].

2.  Objectives

   The objectives for the design of the Alarm Module are:

   o  Simple to use. If a system supports this module, it shall be
      straightforward to integrate this into a YANG based alarm
      manager.

   o  View alarms as states on resources and not as discrete
      notifications.

   o  Clear definition of "alarm" in order to exclude general events
      that should not be forwarded as alarm notifications.

   o  Clear and precise identification of alarm types and alarm
      instances.

   o  A management system should be able to pull all available alarm
      types from a system, i.e., read the alarm inventory from a system.
      This makes it possible to prepare alarm operators with
      corresponding alarm instructions.

   o  Address alarm usability requirements, see Appendix G. While IETF
      has not really addressed alarm management, telecom standards have
      addressed it purely from a protocol perspective. The process
      industry has published several relevant standards addressing
      requirements for a useful alarm interface; [EEMUA], [ISA182].
      This alarm module defines usability requirements as well as a YANG
      data model.

   o  Mapping to X.733, which is a requirement for some alarm systems.
      Still, keep some of the X.733 concepts out of the core model in
      order to make the model small and easy to understand.

3.  Alarm Module Concepts

   This section defines the fundamental concepts behind the data model.
   This section is rooted in the works of Vallin et al. [ALARMSEM].

3.1.  Alarm Definition

   An alarm signifies an undesirable state in a resource that requires
   corrective action.

   There are two main things to remember from this definition:

   1.  the definition focuses on leaving out events and logging
       information in general. Alarms should only be used for undesired
       states that require action.

   2.  the definition also focuses on alarms as a state on a resource,
       not the notifications that report the state changes.

   See Appendix F for information on how this definition relates to
   other alarm standards.

3.2.  Alarm Type

   This document defines an alarm type with an alarm type id and an
   alarm type qualifier.

   The alarm type id is modeled as a YANG identity. With YANG
   identities, new alarm types can be defined in a distributed fashion.
   YANG identities are hierarchical, which means that a hierarchy of
   alarm types can be defined.

   Standards and vendors should define their own alarm type identities
   based on this definition.

   The use of YANG identities means that all possible alarms are
   identified at design time. This explicit declaration of alarm types
   makes it easier to allow for alarm qualification reviews and
   preparation of alarm actions and documentation.

   There are occasions where the alarm types are not known at design
   time. For example, a system with digital inputs that allows users to
   connect detectors (e.g., smoke detector) to the inputs. In this
   case it is a configuration action that says that certain connectors
   are fire alarms for example.

   In order to allow for dynamic addition of alarm types the alarm
   module allows for further qualification of the identity based alarm
   type using a string. A potential drawback of this is that there is a
   big risk that alarm operators will receive alarm types as a surprise
   and will not know how to resolve the problem, since a defined alarm
   procedure does not necessarily exist. To avoid this risk the system
   MUST publish all possible alarm types in the alarm inventory, see
   Section 4.2.
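
   The publication rule above can be illustrated with a small sketch.
   This is not part of the module; it is a minimal Python illustration
   under stated assumptions. The names AlarmType, AlarmInventory,
   publish and may_raise are hypothetical, and the inventory is modeled
   as a plain in-memory set rather than a YANG datastore. The empty
   qualifier stands for an alarm type whose identity is already
   concrete.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AlarmType:
    """An alarm type: identity name plus optional qualifier string."""
    alarm_type_id: str              # name of the YANG identity
    alarm_type_qualifier: str = ""  # empty when the identity is concrete


@dataclass
class AlarmInventory:
    """Hypothetical in-memory stand-in for the alarm inventory."""
    types: set = field(default_factory=set)

    def publish(self, alarm_type: AlarmType) -> None:
        # Called for design-time identities and again whenever a
        # configuration action adds a dynamically qualified type.
        self.types.add(alarm_type)

    def may_raise(self, alarm_type: AlarmType) -> bool:
        # Only alarm types already published may appear as alarms.
        return alarm_type in self.types


inventory = AlarmInventory()
# A user configures a digital input as a smoke detector; the new
# (identity, qualifier) pair is published before it can be raised.
smoke = AlarmType("external-detector", "smoke")
inventory.publish(smoke)
```

   With this ordering, a management system that reads the inventory
   sees the "smoke" qualifier before any alarm of that type can appear.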

   A vendor or standards organization can define their own alarm-type
   hierarchy. The example below shows a hierarchy based on X.733 event
   types:

     import ietf-alarms {
       prefix al;
     }
     identity vendor-alarms {
       base al:alarm-type;
     }
     identity communications-alarm {
       base vendor-alarms;
     }
     identity link-alarm {
       base communications-alarm;
     }

   Alarm types can be abstract. An abstract alarm type is used as a
   base for defining hierarchical alarm types. Concrete alarm types are
   used for alarm states and appear in the alarm inventory. There are
   two kinds of concrete alarm types:

   1.  The last subordinate identity in the "alarm-type-id" hierarchy is
       concrete, for example: "alarm-identity.environmental-
       alarm.smoke". In this example "alarm-identity" and
       "environmental-alarm" are abstract YANG identities, whereas
       "smoke" is a concrete YANG identity.

   2.  The YANG identity hierarchy is abstract and the concrete alarm
       type is defined by the dynamic alarm qualifier string, for
       example: "alarm-identity.environmental-alarm.external-detector"
       with alarm-type-qualifier "smoke".

   For example:

     // Alternative 1: concrete alarm type identity
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity smoke {
       base environmental-alarm;
       description "Concrete alarm type";
     }

     // Alternative 2: concrete alarm type qualifier
     import ietf-alarms {
       prefix al;
     }
     identity environmental-alarm {
       base al:alarm-type;
       description "Abstract alarm type";
     }
     identity external-detector {
       base environmental-alarm;
       description
         "Abstract alarm type, a run-time configuration
          procedure sets the type of alarm detected.
          This will be reported in the alarm-type-qualifier.";
     }

   A server SHOULD strive to minimize the number of dynamically defined
   alarm types.

3.3.  Identifying the Alarming Resource

   It is of vital importance to be able to refer to the alarming
   resource. This reference must be as fine-grained as possible. If
   the alarming resource exists in the data tree then an instance-
   identifier MUST be used with the full path to the object.

   When the module is used in a controller/orchestrator/manager the
   original device resource identification can be modified to include
   the device in the path. The details depend on how devices are
   identified, and are out of scope for this specification.

   Example:

      The original device alarm might identify the resource as
      "/dev:interfaces/dev:interface[dev:name='FastEthernet1/0']".

      The resource identification in the manager could look something
      like: "/mgr:devices/mgr:device[mgr:name='xyz123']/dev:interfaces/
      dev:interface[dev:name='FastEthernet1/0']"

   This module also allows for alternate naming of the alarming resource
   if it is not available in the data tree.

3.4.  Identifying Alarm Instances

   A primary goal of this alarm module is to remove any ambiguity in how
   alarm notifications are mapped to an update of an alarm instance.
   X.733 and especially 3GPP were not really clear on this point. This
   YANG alarm module states that the tuple (resource, alarm type
   identifier, alarm type qualifier) corresponds to a single alarm
   instance. This means that alarm notifications for the same resource
   and same alarm type are matched to update the same alarm instance.
   These three leafs are therefore used as the key in the alarm list:

     list alarm {
       key "resource alarm-type-id alarm-type-qualifier";
       ...
     }

3.5.  Alarm Life-Cycle

   The alarm model clearly separates the resource alarm life-cycle from
   the operator and administrative life-cycles of an alarm.

   o  resource alarm life-cycle: the alarm instrumentation that controls
      alarm raise, clearance, and severity changes.

   o  operator alarm life-cycle: operators acting upon alarms with
      actions like acknowledgment and closing. Closing an alarm implies
      that the operator considers the corrective action performed.
      Operators can also shelve (block/filter) alarms in order to avoid
      nuisance alarms.

   o  administrative alarm life-cycle: purging (deleting) unwanted
      alarms and compressing the alarm status change list. This module
      exposes operations to manage the administrative life-cycle. The
      server may also perform these operations based on other policies,
      but how that is done is out of scope for this document.

   A server SHOULD describe how long it retains cleared/closed alarms:
   until manually purged or if it has an automatic removal policy.

3.5.1.  Resource Alarm Life-Cycle

   From a resource perspective, an alarm can for example have the
   following life-cycle: raise, change severity, change severity, clear,
   being raised again etc. All of these status changes can have
   different alarm texts generated by the instrumentation. Two
   important things to note:

   1.  Alarms are not deleted when they are cleared. Deleting alarms is
       an administrative process. The alarm module defines an action
       "purge-alarms" that deletes alarms.

   2.  Alarms are not cleared by operators, only the underlying
       instrumentation can clear an alarm. Operators can close alarms.

   The YANG tree representation below illustrates the resource oriented
   life-cycle:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro is-cleared            boolean
        +--ro last-changed          yang:date-and-time
        +--ro perceived-severity    severity
        +--ro alarm-text            alarm-text
        +--ro status-change* [time] {alarm-history}?
           +--ro time                  yang:date-and-time
           +--ro perceived-severity    severity-with-clear
           +--ro alarm-text            alarm-text

   For every status change from the resource perspective a row is added
   to the "status-change" list. The last status values are also
   represented as leafs for the alarm. Note well that the alarm
   severity does not include "cleared"; alarm clearance is a boolean
   flag.

   An alarm can therefore look like this: ((GigabitEthernet0/25, link-
   alarm,""), false, T, major, "Interface GigabitEthernet0/25 down")

3.5.2.  Operator Alarm Life-Cycle

   Operators can also act upon alarms using the set-operator-state
   action:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro operator-state-change* [time] {operator-actions}?
        |  +--ro time        yang:date-and-time
        |  +--ro operator    string
        |  +--ro state       operator-state
        |  +--ro text?       string
        +---x set-operator-state {operator-actions}?
           +---w input
              +---w state    writable-operator-state
              +---w text?    string

   The operator state for an alarm can be: "none", "ack", "shelved", and
   "closed". Alarm deletion (using the action "purge-alarms") can use
   this state as a criterion. A closed alarm is an alarm where the
   operator has performed any required corrective actions. Closed
   alarms are good candidates for being purged.

3.5.3.  Administrative Alarm Life-Cycle

   Deleting alarms from the alarm list is considered an administrative
   action. This is supported by the "purge-alarms" action. The "purge-
   alarms" action takes a filter as input. The filter selects alarms
   based on the operator and resource life-cycle such as "all closed
   cleared alarms older than a time specification".
   The server may also perform these operations based on other
   policies, but how that is done is out of scope for this document.

   Purged alarms are removed from the alarm list. Note well, if the
   alarm resource state changes after a purge, the alarm will reappear
   in the alarm list.

   Alarms can be compressed. Compressing an alarm deletes all entries
   in the alarm's "status-change" list except for the last status
   change. A client can perform this using the "compress-alarms"
   action. The server may also perform these operations based on other
   policies, but how that is done is out of scope for this document.

3.6.  Root Cause, Impacted Resources and Related Alarms

   The alarm module does not mandate any requirements for the system to
   support alarm correlation or root-cause and service-impact analysis.
   However, if such features are supported, this section describes how
   the results of such analysis are represented in the data model.
   These parts of the model are optional. The module supports three
   scenarios:

   Root cause analysis:  An alarm can indicate candidate root cause
      resources, for example: a database issue alarm referring to a full
      disk partition.

   Service impact analysis:  An alarm can refer to potential impacted
      resources, for example: an interface alarm referring to impacted
      network services.

   Alarm correlation:  Dependencies between alarms; several alarms can
      be grouped as relating to each other, for example a streaming
      media alarm relating to a high jitter alarm.

   Different systems have various degrees of alarm correlation and
   analysis capabilities, and the intent of the alarm module is to
   enable any capability, including none.

   The general principle of this alarm module is to limit the number of
   alarms. In many cases several resources are affected for a given
   underlying problem. A full disk will of course impact databases and
   applications as well.
   The recommendation is to have a single alarm for the underlying
   problem and list the affected resources in the alarm, rather than
   having separate alarms for each resource.

   The alarm has one leaf-list, "impacted-resource", to identify
   possibly impacted resources and one leaf-list, "root-cause-resource",
   to identify possible root-cause resources. These serve as hints
   only. It is up to the client application to use this information to
   present the overall status. Using the disk full example, a "good"
   alarm would be to use the hard disk partition as the alarming
   resource and add the database and applications into the
   "impacted-resource" leaf-list.

   A system should always strive to identify the resource that can be
   acted upon as the "resource" leaf. The "impacted-resource" leaf-list
   shall be used to identify any side-effects of the alarm. The
   impacted resources cannot be acted upon to fix the problem. The
   disk full example above illustrates the principle; you cannot fix
   the underlying issue by database operations. However, you need to
   pay attention to the database to perform any operations that limit
   the impact of the problem.

   On some occasions the system might not be capable of detecting the
   root cause, the resource that can be acted upon. The instrumentation
   in this case only monitors the side-effect and needs to represent an
   alarm that indicates a situation that needs acting upon. The
   instrumentation still might identify possible candidates for the
   root-cause resource. In this case the "root-cause-resource" leaf-
   list can be used to indicate the candidate root-cause resources. An
   example of this kind of alarm might be an active test tool that
   detects an SLA violation on a VPN connection and identifies the
   devices along the chain as candidate root causes.

   The alarm module also supports a way to associate different alarms to
   each other with the "related-alarm" list.
   This list enables the server to inform the client that certain
   alarms are related to other alarms.

   Note well that this module does not prescribe any dependencies or
   preference between the above alarm correlation mechanisms. Different
   systems have different capabilities and the above described
   mechanisms are available to support the instrumentation features.

3.7.  Alarm Shelving

   Alarm shelving is an important function in order for alarm management
   applications and operators to stop superfluous alarms. Shelving
   implies that any alarms fulfilling the shelving criteria are ignored
   (blocked/filtered). Shelved alarms appear in a dedicated shelved
   alarm list in order not to disturb the relevant alarms. Shelved
   alarms do not generate notifications but the shelved alarm list is
   updated with any alarm state changes.

3.8.  Alarm Profiles

   Alarm profiles are used to configure further information for an alarm
   type. This module supports configuring severity levels overriding
   the system default levels. This corresponds to the Alarm Severity
   Assignment Profile (ASAP) functionality in M.3100 [M.3100] and M.3160
   [M.3160]. Other standard or enterprise modules can augment this list
   with further alarm type information.

4.  Alarm Data Model

   The fundamental parts of the data model are the "alarm-list" with
   associated notifications and the "alarm-inventory" list of all
   possible alarm types. These MUST be implemented by a system. The
   rest of the data model is made conditional with the YANG features
   "operator-actions", "alarm-shelving", "alarm-history", "alarm-
   summary", "alarm-profile", and "severity-assignment".

   The data model has the following overall structure:

     +--rw control
     |  +--rw max-alarm-status-changes?   union
     |  +--rw (notify-status-changes)?
     |  |     ...
     |  +--rw alarm-shelving {alarm-shelving}?
     |        ...
     +--ro alarm-inventory
     |  +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
     |        ...
     +--ro summary {alarm-summary}?
     |  +--ro alarm-summary* [severity]
     |  |     ...
     |  +--ro shelves-active?   empty {alarm-shelving}?
     +--ro alarm-list
     |  +--ro number-of-alarms?   yang:gauge32
     |  +--ro last-changed?       yang:date-and-time
     |  +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
     |  |     ...
     |  +---x purge-alarms
     |  |     ...
     |  +---x compress-alarms {alarm-history}?
     |        ...
     +--ro shelved-alarms {alarm-shelving}?
     |  +--ro number-of-shelved-alarms?      yang:gauge32
     |  +--ro shelved-alarms-last-changed?   yang:date-and-time
     |  +--ro shelved-alarm*
     |  |       [resource alarm-type-id alarm-type-qualifier]
     |  |     ...
     |  +---x purge-shelved-alarms
     |  |     ...
     |  +---x compress-shelved-alarms {alarm-history}?
     |        ...
     +--rw alarm-profile*
             [alarm-type-id alarm-type-qualifier-match resource]
             {alarm-profile}?
        +--rw alarm-type-id                        alarm-type-id
        +--rw alarm-type-qualifier-match           string
        +--rw resource                             resource-match
        +--rw description                          string
        +--rw alarm-severity-assignment-profile
                {severity-assignment}?
              ...

4.1.  Alarm Control

   The "/alarms/control/notify-status-changes" choice controls whether
   notifications are sent for all state changes, only raise and clear,
   or only notifications more severe than a configured level. This
   feature in combination with alarm shelving corresponds to the ITU
   Alarm Report Control functionality.

   Every alarm has a list of status changes; this is a circular list.
   The length of this list is controlled by "/alarms/control/max-alarm-
   status-changes".

4.1.1.  Alarm Shelving

   The shelving control tree is shown below:

     +--rw control
        +--rw alarm-shelving {alarm-shelving}?
           +--rw shelf* [name]
              +--rw name           string
              +--rw resource*      resource-match
              +--rw alarm-type*
              |       [alarm-type-id alarm-type-qualifier-match]
              |  +--rw alarm-type-id                 alarm-type-id
              |  +--rw alarm-type-qualifier-match    string
              +--rw description?   string

   Shelved alarms are shown in a dedicated shelved alarm list. The
   instrumentation MUST move shelved alarms from the alarm list
   (/alarms/alarm-list) to the shelved alarm list (/alarms/shelved-
   alarms/). Shelved alarms do not generate any notifications. When
   the shelving criteria are removed or changed the alarm list MUST be
   updated to the correct actual state of the alarms.

   Shelving and unshelving can only be performed by editing the shelf
   configuration. It cannot be performed on individual alarms. The
   server will add an operator state indicating that the alarm was
   shelved/unshelved.

   A leaf (/alarms/summary/shelves-active) in the alarm summary
   indicates if there are shelved alarms.

   A system can choose not to support the shelving feature.

4.2.  Alarm Inventory

   The alarm inventory represents all possible alarm types that may
   occur in the system. A management system may use this to build alarm
   procedures. The alarm inventory is relevant for several reasons:

      The system might not instrument all defined alarm type identities,
      and some alarm identities are abstract.

      The system has configured dynamic alarm types using the alarm
      qualifier. The inventory makes it possible for the management
      system to discover these.

   Note that the mechanism whereby dynamic alarm types are added using
   the alarm type qualifier MUST populate this list.

   The optional leaf-list "resource" in the alarm inventory enables the
   system to publish for which resources a given alarm type may appear.

   A server MUST implement the alarm inventory in order to enable
   controlled alarm procedures in the client.
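
   As an illustration of how a management system might use the
   inventory to prepare alarm procedures, the following Python sketch
   lists the alarm types that have no operator procedure yet. It is a
   sketch under stated assumptions, not a prescribed client design:
   the function name missing_procedures, the dictionary shape of the
   inventory entries, and the procedure texts are all hypothetical;
   only the key leaf names "alarm-type-id" and "alarm-type-qualifier"
   come from this document.

```python
def missing_procedures(inventory, procedures):
    """Return inventory alarm types that have no operator procedure.

    inventory:  iterable of dicts shaped like alarm-inventory entries
    procedures: dict keyed by (alarm-type-id, alarm-type-qualifier)
    """
    missing = []
    for entry in inventory:
        key = (entry["alarm-type-id"], entry["alarm-type-qualifier"])
        if key not in procedures:
            missing.append(key)
    return missing


# Hypothetical inventory content, as a client might hold it after
# reading /alarms/alarm-inventory from a server.
inventory = [
    {"alarm-type-id": "link-alarm", "alarm-type-qualifier": ""},
    {"alarm-type-id": "external-detector",
     "alarm-type-qualifier": "smoke"},
]
# Hypothetical procedures maintained by the management system.
procedures = {("link-alarm", ""): "Check cabling and peer interface."}

uncovered = missing_procedures(inventory, procedures)
```

   In this example the dynamically configured "smoke" alarm type is
   reported as uncovered, which is the situation the inventory is meant
   to expose before the alarm is ever raised.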

   A server implementer may want to document the alarm inventory for
   off-line processing by clients. The file format defined in
   [I-D.ietf-netmod-yang-instance-file-format] can be used for this
   purpose.

   The alarm inventory tree is shown below:

     +--ro alarm-inventory
        +--ro alarm-type* [alarm-type-id alarm-type-qualifier]
           +--ro alarm-type-id           alarm-type-id
           +--ro alarm-type-qualifier    alarm-type-qualifier
           +--ro resource*               resource-match
           +--ro has-clear               boolean
           +--ro severity-levels*        severity
           +--ro description             string

4.3.  Alarm Summary

   The alarm summary list summarizes alarms per severity: how many are
   cleared, cleared and closed, and closed. It also gives an indication
   if there are shelved alarms.

   The alarm summary tree is shown below:

     +--ro summary {alarm-summary}?
        +--ro alarm-summary* [severity]
        |  +--ro severity                  severity
        |  +--ro total?                    yang:gauge32
        |  +--ro not-cleared?              yang:gauge32
        |  +--ro cleared?                  yang:gauge32
        |  +--ro cleared-not-closed?       yang:gauge32
        |  |       {operator-actions}?
        |  +--ro cleared-closed?           yang:gauge32
        |  |       {operator-actions}?
        |  +--ro not-cleared-closed?       yang:gauge32
        |  |       {operator-actions}?
        |  +--ro not-cleared-not-closed?   yang:gauge32
        |          {operator-actions}?
        +--ro shelves-active?   empty {alarm-shelving}?

4.4.  The Alarm List

   The alarm list (/alarms/alarm-list) is a function from (resource,
   alarm type, alarm type qualifier) to the current composite alarm
   state. The composite state includes states for the resource life-
   cycle such as severity, clearance flag and operator states such as
   acknowledgment. This means that for a given resource and alarm-type
   the alarm list shows the current states of the alarm such as
   acknowledged and cleared status.

     +--ro alarm-list
        +--ro number-of-alarms?   yang:gauge32
        +--ro last-changed?       yang:date-and-time
        +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        |  +--ro resource                 resource
        |  +--ro alarm-type-id            alarm-type-id
        |  +--ro alarm-type-qualifier     alarm-type-qualifier
        |  +--ro alt-resource*            resource
        |  +--ro related-alarm*
        |  |       [resource alarm-type-id alarm-type-qualifier]
        |  |       {alarm-correlation}?
        |  |  +--ro resource
        |  |  |       -> /alarms/alarm-list/alarm/resource
        |  |  +--ro alarm-type-id           leafref
        |  |  +--ro alarm-type-qualifier    leafref
        |  +--ro impacted-resource*       resource
        |  |       {service-impact-analysis}?
        |  +--ro root-cause-resource*     resource
        |  |       {root-cause-analysis}?
        |  +--ro time-created             yang:date-and-time
        |  +--ro is-cleared               boolean
        |  +--ro last-raised              yang:date-and-time
        |  +--ro last-changed             yang:date-and-time
        |  +--ro perceived-severity       severity
        |  +--ro alarm-text               alarm-text
        |  +--ro status-change* [time] {alarm-history}?
        |  |  +--ro time                  yang:date-and-time
        |  |  +--ro perceived-severity    severity-with-clear
        |  |  +--ro alarm-text            alarm-text
        |  +--ro operator-state-change* [time] {operator-actions}?
        |  |  +--ro time        yang:date-and-time
        |  |  +--ro operator    string
        |  |  +--ro state       operator-state
        |  |  +--ro text?       string
        |  +---x set-operator-state {operator-actions}?
        |  |  +---w input
        |  |     +---w state    writable-operator-state
        |  |     +---w text?    string
        |  +---n operator-action {operator-actions}?
        |     +-- time        yang:date-and-time
        |     +-- operator    string
        |     +-- state       operator-state
        |     +-- text?       string
        +---x purge-alarms
        |  +---w input
        |  |  +---w alarm-clearance-status    enumeration
        |  |  +---w older-than!
        |  |  |  +---w (age-spec)?
        |  |  |     +--:(seconds)
        |  |  |     |  +---w seconds?   uint16
        |  |  |     +--:(minutes)
        |  |  |     |  +---w minutes?   uint16
        |  |  |     +--:(hours)
        |  |  |     |  +---w hours?     uint16
        |  |  |     +--:(days)
        |  |  |     |  +---w days?      uint16
        |  |  |     +--:(weeks)
        |  |  |        +---w weeks?     uint16
        |  |  +---w severity!
795 | | | +---w (sev-spec)? 796 | | | +--:(below) 797 | | | | +---w below? severity 798 | | | +--:(is) 799 | | | | +---w is? severity 800 | | | +--:(above) 801 | | | +---w above? severity 802 | | +---w operator-state-filter! {operator-actions}? 803 | | +---w state? operator-state 804 | | +---w user? string 805 | +--ro output 806 | +--ro purged-alarms? uint32 807 +---x compress-alarms {alarm-history}? 808 +---w input 809 | +---w resource? resource-match 810 | +---w alarm-type-id? 811 | | -> /alarms/alarm-list/alarm/alarm-type-id 812 | +---w alarm-type-qualifier? leafref 813 +--ro output 814 +--ro compressed-alarms? uint32 816 Every alarm has three important states: the resource clearance state 817 "is-cleared", the severity "perceived-severity" and the operator 818 state available in the operator state change list. 820 In order to see the alarm history the resource state changes are 821 available in the "status-change" list and the operator history is 822 available in the "operator-state-change" list. 824 4.5. The Shelved Alarms List 826 The shelved alarm list has the same structure as the alarm list 827 above. It shows all the alarms that match the shelving criteria 828 (/alarms/control/alarm-shelving). 830 4.6. Alarm Profiles 832 Alarm profiles (/alarms/alarm-profile/) is a list of configurable 833 alarm types. The list supports configurable alarm severity levels in 834 the container "alarm-severity-assignment-profile". If an alarm 835 matches the configured alarm type it MUST use the configured severity 836 level(s) instead of the system default. This configuration MUST also 837 be represented in the alarm inventory. 839 +--rw alarm-profile* 840 [alarm-type-id alarm-type-qualifier-match resource] 841 {alarm-profile}? 842 +--rw alarm-type-id alarm-type-id 843 +--rw alarm-type-qualifier-match string 844 +--rw resource resource-match 845 +--rw description string 846 +--rw alarm-severity-assignment-profile 847 {severity-assignment}?
848 +--rw severity-levels* severity 850 4.7. Operations 852 The alarm module supports the following actions to manage the alarms: 854 /alarms/alarm-list/purge-alarms: Delete alarms from the "alarm-list" 855 according to specific criteria, for example all cleared alarms 856 older than a specific date. 858 /alarms/alarm-list/compress-alarms: Compress the "status-change" 859 list for the alarms. 861 /alarms/alarm-list/alarm/set-operator-state: Change the operator 862 state for an alarm. For example, an alarm can be acknowledged by 863 setting the operator state to "ack". 865 /alarms/shelved-alarm-list/purge-shelved-alarms: Delete alarms from 866 the "shelved-alarm-list" according to specific criteria, for 867 example all alarms older than a specific date. 869 /alarms/shelved-alarm-list/compress-shelved-alarms: Compress the 870 "status-change" list for the alarms. 872 4.8. Notifications 874 The alarm module supports a general notification to report alarm 875 state changes. It carries all relevant parameters for the alarm 876 management application. 878 There is also a notification to report that an operator changed the 879 operator state on an alarm, for example by acknowledging it. 881 If the alarm inventory is changed, for example a new card type is 882 inserted, a notification will tell the management application that 883 new alarm types are available. 885 5. Relationship to the ietf-hardware YANG module 887 RFC 8348 [RFC8348] defines the "ietf-hardware" YANG data model for 888 the management of hardware. The "alarm-state" in RFC 8348 is a 889 summary of the alarm severity levels that may be active on the 890 specific hardware component. It does not say anything about how 891 alarms are reported, and it does not provide any details of the 892 alarms.
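The relationship between individual alarms and the summary in "alarm-state" can be sketched in Python as follows, using the mapping given in this section. This is only one possible reading of that mapping, not a normative algorithm. The "Alarm" class and the function name are hypothetical. The sketch assumes that the severity bits in "alarm-state" carry the same names as the severity levels of this module, that only the latest operator state of an alarm is considered, and that a cleared alarm contributes no bit at all, not even "under-repair".

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

# Severity levels of this module; assumed to match the names of the
# corresponding bits in the RFC 8348 'alarm-state'.
SEVERITY_BITS = {"indeterminate", "warning", "minor", "major", "critical"}


# Hypothetical, reduced view of one alarm on a hardware component.
@dataclass(frozen=True)
class Alarm:
    is_cleared: bool
    perceived_severity: str
    operator_state: Optional[str] = None  # latest operator state, if any


def alarm_state_bits(alarms: Iterable[Alarm]) -> Set[str]:
    """Derive the set of 'alarm-state' bits for one component."""
    bits: Set[str] = set()
    for alarm in alarms:
        if alarm.is_cleared:
            # Assumption: a cleared alarm sets no bit.
            continue
        if alarm.perceived_severity not in SEVERITY_BITS:
            raise ValueError("unknown severity: " + alarm.perceived_severity)
        bits.add(alarm.perceived_severity)
        if alarm.operator_state == "ack":
            # An acknowledged alarm is reported as 'under-repair'.
            bits.add("under-repair")
    return bits


example_bits = alarm_state_bits([
    Alarm(False, "major", "ack"),
    Alarm(True, "critical"),
    Alarm(False, "warning"),
])
```

In this example the cleared critical alarm is ignored, so the component summary contains only the "major", "warning" and "under-repair" bits.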
894 The mapping between the alarm YANG data model and the "alarm-state" 895 in RFC 8348 is as follows: 897 resource: Corresponds to an entry in the list "/hardware/component/" 899 is-cleared: No bit set in "/hardware/component/state/alarm-state" 901 perceived-severity: Corresponding bit set in 902 "/hardware/component/state/alarm-state". 904 operator-state-change/state: If the alarm is acknowledged by the 905 operator, the bit "under-repair" is in "/hardware/component/state/ 906 alarm-state". 908 6. Alarm YANG Module 910 This YANG module references [RFC6991]. 912 file "ietf-alarms@2019-01-27.yang" 913 module ietf-alarms { 914 yang-version 1.1; 915 namespace "urn:ietf:params:xml:ns:yang:ietf-alarms"; 916 prefix al; 918 import ietf-yang-types { 919 prefix yang; 920 reference 921 "RFC 6991: Common YANG Data Types."; 922 } 924 organization 925 "IETF CCAMP Working Group"; 926 contact 927 "WG Web: 928 WG List: 930 Editor: Stefan Vallin 931 933 Editor: Martin Bjorklund 934 "; 936 // RFC Ed.: replace XXXX with actual RFC number and 937 // remove this note. 939 description 940 "This module defines an interface for managing alarms. Main 941 inputs to the module design are the 3GPP Alarm IRP, ITU-T X.733 942 and ANSI/ISA-18.2 alarm standards. 944 Main features of this module include: 946 * Alarm list: 947 A list of all alarms. Cleared alarms stay in 948 the list until explicitly purged. 950 * Operator actions on alarms: 951 Acknowledging and closing alarms. 953 * Administrative actions on alarms: 954 Purging alarms from the list according to specific 955 criteria. 957 * Alarm inventory: 958 A management application can read all 959 alarm types implemented by the system. 961 * Alarm shelving: 962 Shelving (blocking) alarms according 963 to specific criteria. 965 * Alarm profiles: 966 A management system can attach further 967 information to alarm types, for example 968 overriding system default severity 969 levels. 971 This module uses a stateful view on alarms. 
An alarm is a state 972 for a specific resource (note that an alarm is not a 973 notification). An alarm type is a possible alarm state for a 974 resource. For example, the tuple: 976 ('link-alarm', 'GigabitEthernet0/25') 978 is an alarm of type 'link-alarm' on the resource 979 'GigabitEthernet0/25'. 981 Alarm types are identified using YANG identities and an optional 982 string-based qualifier. The string-based qualifier allows for 983 dynamic extension of the statically defined alarm types. Alarm 984 types identify a possible alarm state and not the individual 985 notifications. For example, the traditional 'link-down' and 986 'link-up' notifications are two notifications referring to the 987 same alarm type 'link-alarm'. 989 With this design there is no ambiguity about how alarm and alarm 990 clear correlation should be performed: notifications that report 991 the same resource and alarm type are considered updates of the 992 same alarm, e.g., clearing an active alarm or changing the 993 severity of an alarm. 995 The instrumentation can update 'severity' and 'alarm-text' on an 996 existing alarm. The above alarm example can therefore look 997 like: 999 (('link-alarm', 'GigabitEthernet0/25'), 1000 warning, 1001 'interface down while interface admin state is up') 1003 There is a clear separation between updates on the alarm from 1004 the underlying resource, like clear, and updates from an 1005 operator like acknowledge or closing an alarm: 1007 (('link-alarm', 'GigabitEthernet0/25'), 1008 warning, 1009 'interface down while interface admin state is up', 1010 cleared, 1011 closed) 1013 Administrative actions like removing closed alarms older than a 1014 given time is supported. 1016 This alarm module does not define how the underlying 1017 instrumentation detects and clears the specific alarms. That 1018 belongs to the SDO or enterprise that owns that specific 1019 technology. 
1021 The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL 1022 NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 1023 'MAY', and 'OPTIONAL' in this document are to be interpreted as 1024 described in BCP 14 (RFC 2119) (RFC 8174) when, and only when, 1025 they appear in all capitals, as shown here. 1027 Copyright (c) 2019 IETF Trust and the persons identified as 1028 authors of the code. All rights reserved. 1030 Redistribution and use in source and binary forms, with or 1031 without modification, is permitted pursuant to, and subject to 1032 the license terms contained in, the Simplified BSD License set 1033 forth in Section 4.c of the IETF Trust's Legal Provisions 1034 Relating to IETF Documents 1035 (https://trustee.ietf.org/license-info). 1037 This version of this YANG module is part of RFC XXXX 1038 (https://tools.ietf.org/html/rfcXXXX); see the RFC itself for 1039 full legal notices."; 1041 // RFC Ed.: update the date below with the date of RFC publication 1042 // and remove this note. 1044 revision 2019-01-27 { 1045 description 1046 "Initial revision."; 1047 reference 1048 "RFC XXXX: YANG Alarm Module"; 1049 } 1051 /* 1052 * Features 1053 */ 1055 feature operator-actions { 1056 description 1057 "This feature indicates that the system supports operator 1058 states on alarms."; 1059 } 1061 feature alarm-shelving { 1062 description 1063 "This feature indicates that the system supports shelving 1064 (blocking) alarms."; 1065 } 1067 feature alarm-history { 1068 description 1069 "This feature indicates that the server maintains a history of 1070 state changes for each alarm.
For example, if an alarm 1071 toggles between cleared and active 10 times, these state 1072 changes are present in a separate list in the alarm."; 1073 } 1075 feature alarm-summary { 1076 description 1077 "This feature indicates that the server summarizes the number 1078 of alarms per severity and operator state."; 1079 } 1081 feature alarm-profile { 1082 description 1083 "The system allows clients to configure further information 1084 for each alarm type."; 1085 } 1087 feature severity-assignment { 1088 description 1089 "The system supports configurable alarm severity levels."; 1090 reference 1091 "M.3160/M.3100 Alarm Severity Assignment Profile, ASAP"; 1093 } 1095 feature root-cause-analysis { 1096 description 1097 "The system supports identifying candidate root-cause 1098 resources for an alarm, for example a disc partition 1099 root cause for a logger failure alarm."; 1100 } 1102 feature service-impact-analysis { 1103 description 1104 "The system supports identifying candidate impacted 1105 resources for an alarm, for example a link being impacted 1106 by an interface alarm."; 1107 } 1109 feature alarm-correlation { 1110 description 1111 "The system supports correlating/grouping alarms 1112 that belong together."; 1113 } 1115 /* 1116 * Identities 1117 */ 1119 identity alarm-type-id { 1120 description 1121 "Base identity for alarm types. A unique identification of the 1122 alarm, not including the resource. Different resources can 1123 share alarm types. If the resource reports the same alarm 1124 type, it is to be considered to be the same alarm. The alarm 1125 type is a simplification of the different X.733 and 3GPP alarm 1126 IRP alarm correlation mechanisms and it allows for 1127 hierarchical extensions. 1129 A string-based qualifier can be used in addition to the 1130 identity in order to have different alarm types based on 1131 information not known at design-time, such as values in 1132 textual SNMP Notification var-binds.
1134 Standards and vendors can define sub-identities to clearly 1135 identify specific alarm types. 1137 This identity is abstract and MUST NOT be used for alarms."; 1138 } 1140 /* 1141 * Common types 1142 */ 1144 typedef resource { 1145 type union { 1146 type instance-identifier { 1147 require-instance false; 1148 } 1149 type yang:object-identifier; 1150 type string; 1151 type yang:uuid; 1152 } 1153 description 1154 "This is an identification of the alarming resource, such as an 1155 interface. It should be as fine-grained as possible both to 1156 guide the operator and to guarantee uniqueness of the alarms. 1158 If the alarming resource is modelled in YANG, this type will 1159 be an instance-identifier. 1161 If the resource is an SNMP object, the type will be an 1162 object-identifier. 1164 If the resource is anything else, for example a distinguished 1165 name or a CIM path, this type will be a string. 1167 If the alarming object is identified by a UUID use the uuid 1168 type. Be cautious when using this type, since a UUID is hard 1169 to use for an operator. 1171 If the server supports several models, the precedence should 1172 be in the order as given in the union definition."; 1173 } 1175 typedef resource-match { 1176 type union { 1177 type yang:xpath1.0; 1178 type yang:object-identifier; 1179 type string; 1180 } 1181 description 1182 "This type is used to match resources of type 'resource'. 1183 Since the type 'resource' is a union of different types, the 1184 'resource-match' type is also a union of corresponding types. 1186 If the type is given as an XPath 1.0 expression, a resource of 1187 type 'instance-identifier' matches if the instance is part of 1188 the node set that is the result of evaluating the XPath 1.0 1189 expression. 
For example, the XPath 1.0 expression: 1191 /ietf-interfaces:interfaces/ietf-interfaces:interface 1192 [ietf-interfaces:type='ianaift:ethernetCsmacd'] 1194 would match the resource instance-identifier: 1196 /if:interfaces/if:interface[if:name='eth1'], 1198 assuming that the interface 'eth1' is of type 1199 'ianaift:ethernetCsmacd'. 1201 If the type is given as an object identifier, a resource of 1202 type 'object-identifier' matches if the match object 1203 identifier is a prefix of the resource's object identifier. 1204 For example, the value: 1206 1.3.6.1.2.1.2.2 1208 would match the resource object identifier: 1210 1.3.6.1.2.1.2.2.1.1.5 1212 If the type is given as an UUID or a string, it is interpreted 1213 as a W3C regular expression, which matches a resource of type 1214 'yang:uuid' or 'string' if the given regular expression 1215 matches the resource string. 1217 If the type is given as an XPath expression it is evaluated 1218 in the following XPath context: 1220 o The set of namespace declarations is the set of prefix 1221 and namespace pairs for all YANG modules implemented by 1222 the server, where the prefix is the YANG module name and 1223 the namespace is as defined by the 'namespace' statement 1224 in the YANG module. 1226 If a leaf of this type is encoded in XML, all namespace 1227 declarations in scope on the leaf element are added to 1228 the set of namespace declarations. If a prefix found in 1229 the XML is already present in the set of namespace 1230 declarations, the namespace in the XML is used. 1232 o The set of variable bindings is empty. 1234 o The function library is the core function library 1235 and the functions defined in Section 10 of RFC 7950. 1237 o The context node is the root node in the data tree."; 1238 } 1240 typedef alarm-text { 1241 type string; 1242 description 1243 "The string used to inform operators about the alarm. 
This 1244 MUST contain enough information for an operator to be able to 1245 understand the problem and how to resolve it. If this string 1246 contains structure, this format should be clearly documented 1247 for programs to be able to parse that information."; 1248 } 1250 typedef severity { 1251 type enumeration { 1252 enum indeterminate { 1253 value 2; 1254 description 1255 "Indicates that the severity level could not be 1256 determined. This level SHOULD be avoided."; 1257 } 1258 enum warning { 1259 value 3; 1260 description 1261 "The 'warning' severity level indicates the detection of a 1262 potential or impending service affecting fault, before any 1263 significant effects have been felt. Action should be 1264 taken to further diagnose (if necessary) and correct the 1265 problem in order to prevent it from becoming a more 1266 serious service affecting fault."; 1267 } 1268 enum minor { 1269 value 4; 1270 description 1271 "The 'minor' severity level indicates the existence of a 1272 non-service affecting fault condition and that corrective 1273 action should be taken in order to prevent a more serious 1274 (for example, service affecting) fault. Such a severity 1275 can be reported, for example, when the detected alarm 1276 condition is not currently degrading the capacity of the 1277 resource."; 1278 } 1279 enum major { 1280 value 5; 1281 description 1282 "The 'major' severity level indicates that a service 1283 affecting condition has developed and an urgent corrective 1284 action is required. Such a severity can be reported, for 1285 example, when there is a severe degradation in the 1286 capability of the resource and its full capability must be 1287 restored."; 1288 } 1289 enum critical { 1290 value 6; 1291 description 1292 "The 'critical' severity level indicates that a service 1293 affecting condition has occurred and an immediate 1294 corrective action is required. 
Such a severity can be 1295 reported, for example, when a resource becomes totally out 1296 of service and its capability must be restored."; 1297 } 1298 } 1299 description 1300 "The severity level of the alarm. Note well that value 'clear' 1301 is not included. If an alarm is cleared or not is a separate 1302 boolean flag."; 1303 reference 1304 "ITU Recommendation X.733: Information Technology 1305 - Open Systems Interconnection 1306 - System Management: Alarm Reporting Function"; 1307 } 1309 typedef severity-with-clear { 1310 type union { 1311 type enumeration { 1312 enum cleared { 1313 value 1; 1314 description 1315 "The alarm is cleared by the instrumentation."; 1316 } 1317 } 1318 type severity; 1319 } 1320 description 1321 "The severity level of the alarm including clear. This is used 1322 only in notifications reporting state changes for an alarm."; 1323 } 1325 typedef writable-operator-state { 1326 type enumeration { 1327 enum none { 1328 value 1; 1329 description 1330 "The alarm is not being taken care of."; 1331 } 1332 enum ack { 1333 value 2; 1334 description 1335 "The alarm is being taken care of. Corrective action not 1336 taken yet, or failed"; 1337 } 1338 enum closed { 1339 value 3; 1340 description 1341 "Corrective action taken successfully."; 1342 } 1343 } 1344 description 1345 "Operator states on an alarm. The 'closed' state indicates 1346 that an operator considers the alarm being resolved. This is 1347 separate from the alarm's 'is-cleared' leaf."; 1348 } 1350 typedef operator-state { 1351 type union { 1352 type writable-operator-state; 1353 type enumeration { 1354 enum shelved { 1355 value 4; 1356 description 1357 "The alarm is shelved. Alarms in /alarms/shelved-alarms/ 1358 MUST be assigned this operator state by the server as 1359 the last entry in the operator-state-change list. 
The 1360 text for that entry SHOULD include the shelf name."; 1361 } 1362 enum un-shelved { 1363 value 5; 1364 description 1365 "The alarm is moved back to 'alarm-list' from a shelf. 1366 Alarms that are moved from /alarms/shelved-alarms/ to 1367 /alarms/alarm-list MUST be assigned this state by the 1368 server as the last entry in the 'operator-state-change' 1369 list. The text for that entry SHOULD include the shelf 1370 name."; 1371 } 1372 } 1373 } 1374 description 1375 "Operator states on an alarm. The 'closed' state indicates 1376 that an operator considers the alarm being resolved. This is 1377 separate from the alarm's 'is-cleared' leaf."; 1378 } 1380 /* Alarm type */ 1381 typedef alarm-type-id { 1382 type identityref { 1383 base alarm-type-id; 1384 } 1385 description 1386 "Identifies an alarm type. The description of the alarm type 1387 id MUST indicate if the alarm type is abstract or not. An 1388 abstract alarm type is used as a base for other alarm type ids 1389 and will not be used as a value for an alarm or be present in 1390 the alarm inventory."; 1391 } 1393 typedef alarm-type-qualifier { 1394 type string; 1395 description 1396 "If an alarm type can not be fully specified at design time by 1397 alarm-type-id, this string qualifier is used in addition to 1398 fully define a unique alarm type. 1400 The definition of alarm qualifiers is considered being part of 1401 the instrumentation and out of scope for this module. An 1402 empty string is used when this is part of a key."; 1403 } 1405 /* 1406 * Groupings 1407 */ 1409 grouping common-alarm-parameters { 1410 description 1411 "Common parameters for an alarm. 1413 This grouping is used both in the alarm list and in the 1414 notification representing an alarm state change."; 1415 leaf resource { 1416 type resource; 1417 mandatory true; 1418 description 1419 "The alarming resource. See also 'alt-resource'. 
This could 1420 for example be a reference to the alarming interface"; 1421 } 1422 leaf alarm-type-id { 1423 type alarm-type-id; 1424 mandatory true; 1425 description 1426 "This leaf and the leaf 'alarm-type-qualifier' together 1427 provides a unique identification of the alarm type."; 1428 } 1429 leaf alarm-type-qualifier { 1430 type alarm-type-qualifier; 1431 description 1432 "This leaf is used when the 'alarm-type-id' leaf cannot 1433 uniquely identify the alarm type. Normally, this is not the 1434 case, and this leaf is the empty string."; 1435 } 1436 leaf-list alt-resource { 1437 type resource; 1438 description 1439 "Used if the alarming resource is available over other 1440 interfaces. This field can contain SNMP OID's, CIM paths or 1441 3GPP Distinguished names for example."; 1442 } 1443 list related-alarm { 1444 if-feature "alarm-correlation"; 1445 key "resource alarm-type-id alarm-type-qualifier"; 1446 description 1447 "References to related alarms. Note that the related alarm 1448 might have been purged from the alarm list."; 1449 leaf resource { 1450 type leafref { 1451 path "/alarms/alarm-list/alarm/resource"; 1452 require-instance false; 1453 } 1454 description 1455 "The alarming resource for the related alarm."; 1456 } 1457 leaf alarm-type-id { 1458 type leafref { 1459 path "/alarms/alarm-list/alarm" 1460 + "[resource=current()/../resource]" 1461 + "/alarm-type-id"; 1462 require-instance false; 1463 } 1464 description 1465 "The alarm type identifier for the related alarm."; 1466 } 1467 leaf alarm-type-qualifier { 1468 type leafref { 1469 path "/alarms/alarm-list/alarm" 1470 + "[resource=current()/../resource]" 1471 + "[alarm-type-id=current()/../alarm-type-id]" 1472 + "/alarm-type-qualifier"; 1473 require-instance false; 1474 } 1475 description 1476 "The alarm qualifier for the related alarm."; 1478 } 1479 } 1480 leaf-list impacted-resource { 1481 if-feature "service-impact-analysis"; 1482 type resource; 1483 description 1484 "Resources that might be 
affected by this alarm. If the 1485 system creates an alarm on a resource and also has a mapping 1486 to other resources that might be impacted, these resources 1487 can be listed in this leaf-list. In this way the system can 1488 create one alarm instead of several. For example, if an 1489 interface has an alarm, the 'impacted-resource' can 1490 reference the aggregated port channels."; 1491 } 1492 leaf-list root-cause-resource { 1493 if-feature "root-cause-analysis"; 1494 type resource; 1495 description 1496 "Resources that are candidates for causing the alarm. If the 1497 system has a mechanism to understand the candidate root 1498 causes of an alarm, this leaf-list can be used to list the 1499 root cause candidate resources. In this way the system can 1500 create one alarm instead of several. An example might be a 1501 logging system (alarm resource) that fails; the alarm can 1502 reference the file-system in the 'root-cause-resource' 1503 leaf-list. Note that the intended use is not to also send 1504 an alarm with the root-cause-resource as alarming 1505 resource. The root-cause-resource leaf list is a hint and 1506 should not also generate an alarm for the same problem."; 1507 } 1508 } 1510 grouping alarm-state-change-parameters { 1511 description 1512 "Parameters for an alarm state change. 1514 This grouping is used both in the alarm list's status-change 1515 list and in the notification representing an alarm state 1516 change."; 1517 leaf time { 1518 type yang:date-and-time; 1519 mandatory true; 1520 description 1521 "The time the status of the alarm changed. The value 1522 represents the time the real alarm state change appeared in 1523 the resource and not when it was added to the alarm 1524 list. The /alarm-list/alarm/last-changed MUST be set to the 1525 same value."; 1527 } 1528 leaf perceived-severity { 1529 type severity-with-clear; 1530 mandatory true; 1531 description 1532 "The severity of the alarm as defined by X.733.
Note that 1533 this may not be the original severity since the alarm may 1534 have changed severity."; 1535 reference 1536 "ITU Recommendation X.733: Information Technology 1537 - Open Systems Interconnection 1538 - System Management: Alarm Reporting Function"; 1539 } 1540 leaf alarm-text { 1541 type alarm-text; 1542 mandatory true; 1543 description 1544 "A user friendly text describing the alarm state change."; 1545 reference 1546 "ITU Recommendation X.733: Information Technology 1547 - Open Systems Interconnection 1548 - System Management: Alarm Reporting Function"; 1549 } 1550 } 1552 grouping operator-parameters { 1553 description 1554 "This grouping defines parameters that can be changed by an 1555 operator."; 1556 leaf time { 1557 type yang:date-and-time; 1558 mandatory true; 1559 description 1560 "Timestamp for operator action on alarm."; 1561 } 1562 leaf operator { 1563 type string; 1564 mandatory true; 1565 description 1566 "The name of the operator that has acted on this alarm."; 1567 } 1568 leaf state { 1569 type operator-state; 1570 mandatory true; 1571 description 1572 "The operator's view of the alarm state."; 1573 } 1574 leaf text { 1575 type string; 1576 description 1577 "Additional optional textual information provided by the 1578 operator."; 1579 } 1580 } 1582 grouping resource-alarm-parameters { 1583 description 1584 "Alarm parameters that originates from the resource view."; 1585 leaf is-cleared { 1586 type boolean; 1587 mandatory true; 1588 description 1589 "Indicates the current clearance state of the alarm. An 1590 alarm might toggle from active alarm to cleared alarm and 1591 back to active again."; 1592 } 1593 leaf last-raised { 1594 type yang:date-and-time; 1595 mandatory true; 1596 description 1597 "An alarm may change severity level and toggle between 1598 active and cleared during its life-time. 
This leaf indicates 1599 the last time it was last raised (is-cleared = false)."; 1600 } 1601 leaf last-changed { 1602 type yang:date-and-time; 1603 mandatory true; 1604 description 1605 "A timestamp when the alarm status was last changed. Status 1606 changes are changes to 'is-cleared', 'perceived-severity', 1607 and 'alarm-text'."; 1608 } 1609 leaf perceived-severity { 1610 type severity; 1611 mandatory true; 1612 description 1613 "The last severity of the alarm. 1615 If an alarm was raised with severity 'warning', but later 1616 changed to 'major', this leaf will show 'major'."; 1617 } 1618 leaf alarm-text { 1619 type alarm-text; 1620 mandatory true; 1621 description 1622 "The last reported alarm text. This text should contain 1623 information for an operator to be able to understand the 1624 problem and how to resolve it."; 1625 } 1626 list status-change { 1627 if-feature "alarm-history"; 1628 key "time"; 1629 min-elements 1; 1630 description 1631 "A list of status change events for this alarm. 1633 The entry with latest time-stamp in this list MUST 1634 correspond to the leafs 'is-cleared', 'perceived-severity' 1635 and 'alarm-text' for the alarm. The time-stamp for that 1636 entry MUST be equal to the 'last-changed' leaf. 1638 This list is ordered according to the timestamps of alarm 1639 state changes. The last item corresponds to the latest 1640 state change. 
1642 The following state changes creates an entry in this 1643 list: 1644 - changed severity (warning, minor, major, critical) 1645 - clearance status, this also updates the 'is-cleared' 1646 leaf 1647 - alarm text update"; 1648 uses alarm-state-change-parameters; 1649 } 1650 } 1652 grouping filter-input { 1653 description 1654 "Grouping to specify a filter construct on alarm information."; 1655 leaf alarm-clearance-status { 1656 type enumeration { 1657 enum any { 1658 description 1659 "Ignore alarm clearance status."; 1660 } 1661 enum cleared { 1662 description 1663 "Filter cleared alarms."; 1664 } 1665 enum not-cleared { 1666 description 1667 "Filter not cleared alarms."; 1668 } 1669 } 1670 mandatory true; 1671 description 1672 "The clearance status of the alarm."; 1673 } 1674 container older-than { 1675 presence "Age specification"; 1676 description 1677 "Matches the 'last-status-change' leaf in the alarm."; 1678 choice age-spec { 1679 description 1680 "Filter using date and time age."; 1681 case seconds { 1682 leaf seconds { 1683 type uint16; 1684 description 1685 "Seconds part"; 1686 } 1687 } 1688 case minutes { 1689 leaf minutes { 1690 type uint16; 1691 description 1692 "Minute part"; 1693 } 1694 } 1695 case hours { 1696 leaf hours { 1697 type uint16; 1698 description 1699 "Hours part."; 1700 } 1701 } 1702 case days { 1703 leaf days { 1704 type uint16; 1705 description 1706 "Day part"; 1707 } 1708 } 1709 case weeks { 1710 leaf weeks { 1711 type uint16; 1712 description 1713 "Week part"; 1714 } 1715 } 1716 } 1717 } 1718 container severity { 1719 presence "Severity filter"; 1720 choice sev-spec { 1721 description 1722 "Filter based on severity level."; 1723 leaf below { 1724 type severity; 1725 description 1726 "Severity less than this leaf."; 1727 } 1728 leaf is { 1729 type severity; 1730 description 1731 "Severity level equal this leaf."; 1732 } 1733 leaf above { 1734 type severity; 1735 description 1736 "Severity level higher than this leaf."; 1737 } 1738 } 
1739 description 1740 "Filter based on severity."; 1741 } 1742 container operator-state-filter { 1743 if-feature "operator-actions"; 1744 presence "Operator state filter"; 1745 leaf state { 1746 type operator-state; 1747 description 1748 "Filter on operator state."; 1749 } 1750 leaf user { 1751 type string; 1752 description 1753 "Filter based on which operator."; 1754 } 1755 description 1756 "Filter based on operator state."; 1757 } 1758 } 1760 /* 1761 * The /alarms data tree 1762 */ 1764 container alarms { 1765 description 1766 "The top container for this module."; 1768 container control { 1769 description 1770 "Configuration to control the alarm behaviour."; 1771 leaf max-alarm-status-changes { 1772 type union { 1773 type uint16; 1774 type enumeration { 1775 enum infinite { 1776 description 1777 "The status change entries are accumulated 1778 infinitely."; 1779 } 1780 } 1781 } 1782 default "32"; 1783 description 1784 "The status-change entries are kept in a circular list per 1785 alarm. When this number is exceeded, the oldest status 1786 change entry is automatically removed. If the value is 1787 'infinite', the status change entries are accumulated 1788 infinitely."; 1789 } 1790 choice notify-status-changes { 1791 description 1792 "This choice controls the notifications sent for alarm status 1793 updates. There are three options: 1795 1. Notifications are sent for all updates, severity level 1796 changes and alarm text changes. 1798 2. Notifications are only sent for alarm raise and clear. 1800 3. Notifications are sent for status changes equal to or 1801 above the specified severity level. Clear 1802 notifications shall always be sent. Notifications shall 1803 also be sent for state changes that make an alarm less 1804 severe than the specified level.
1806 For example, in option 3, assuming the severity level is 1807 set to major and that the alarm has the following state 1808 changes: 1810 [(Time, severity, clear)]: 1811 [(T1, major, -), (T2, minor, -), (T3, warning, -), 1812 (T4, minor, -), (T5, major, -), (T6, critical, -), 1813 (T7, major, -), (T8, major, clear)] 1815 In that case, notifications will be sent at times 1816 T1, T2, T5, T6, T7 and T8."; 1817 leaf notify-all-state-changes { 1818 type empty; 1819 description 1820 "Send notifications for all status changes."; 1821 } 1822 leaf notify-raise-and-clear { 1823 type empty; 1824 description 1825 "Send notifications only for raise, clear, and re-raise. 1826 Notifications for severity level changes or alarm text 1827 changes are not sent."; 1828 } 1829 leaf notify-severity-level { 1830 type severity; 1831 description 1832 "Only send notifications for alarm state changes crossing 1833 the specified level. Always send clear notifications."; 1834 } 1835 } 1836 container alarm-shelving { 1837 if-feature "alarm-shelving"; 1838 description 1839 "The alarm-shelving/shelf list is used to shelve 1840 (block/filter) alarms. The first matching shelf is used, 1841 and an alarm is shelved only for this first match. 1842 The server will move any alarms corresponding to the 1843 shelving criteria from the 1844 alarms/alarm-list/alarm list to the 1845 alarms/shelved-alarms/shelved-alarm list. It will also 1846 stop sending notifications for the shelved alarms. The 1847 conditions in the shelf criteria are logically ANDed. 1848 When the shelving criteria are deleted or changed, the 1849 non-matching alarms MUST appear in the 1850 alarms/alarm-list/alarm list according to the real state. 1852 This means that the instrumentation MUST maintain states 1853 for the shelved alarms. Alarms that match the criteria 1854 shall have an operator-state 'shelved'.
When the shelf 1855 configuration removes an alarm from the shelf the 1856 server shall add an operator state 'unshelved'."; 1857 list shelf { 1858 key "name"; 1859 ordered-by user; 1860 leaf name { 1861 type string; 1862 description 1863 "An arbitrary name for the alarm shelf."; 1865 } 1866 description 1867 "Each entry defines the criteria for shelving alarms. 1868 Criteria are ANDed. If no criteria are specified, 1869 all alarms will be shelved."; 1870 leaf-list resource { 1871 type resource-match; 1872 description 1873 "Shelve alarms for matching resources."; 1874 } 1875 list alarm-type { 1876 key "alarm-type-id alarm-type-qualifier-match"; 1877 description 1878 "Any alarm matching the combined criteria of 1879 alarm-type-id and alarm-type-qualifier-match 1880 MUST be matched."; 1881 leaf alarm-type-id { 1882 type alarm-type-id; 1883 description 1884 "Shelve all alarms that have an alarm-type-id that is 1885 equal to or derived from the given alarm-type-id."; 1886 } 1887 leaf alarm-type-qualifier-match { 1888 type string; 1889 description 1890 "A W3C regular expression that is used to match an 1891 alarm type qualifier. Shelve all alarms that 1892 match this regular expression for the alarm type 1893 qualifier."; 1894 } 1895 } 1896 leaf description { 1897 type string; 1898 description 1899 "An optional textual description of the shelf. This 1900 description should include the reason for shelving 1901 these alarms."; 1902 } 1903 } 1904 } 1905 } 1906 container alarm-inventory { 1907 config false; 1908 description 1909 "This alarm-inventory/alarm-type list contains all possible 1910 alarm types for the system. 1912 If the system knows for which resources a specific alarm 1913 type can appear, this is also identified in the inventory. 1914 The list also tells if each alarm type has a corresponding 1915 clear state. The inventory shall only contain concrete 1916 alarm types. 1918 The alarm inventory MUST be updated by the system when new 1919 alarms can appear.
This can be the case when installing new 1920 software modules or inserting new card types. A 1921 notification 'alarm-inventory-changed' is sent when the 1922 inventory is changed."; 1923 list alarm-type { 1924 key "alarm-type-id alarm-type-qualifier"; 1925 description 1926 "An entry in this list defines a possible alarm."; 1927 leaf alarm-type-id { 1928 type alarm-type-id; 1929 description 1930 "The statically defined alarm type identifier for this 1931 possible alarm."; 1932 } 1933 leaf alarm-type-qualifier { 1934 type alarm-type-qualifier; 1935 description 1936 "The optionally dynamically defined alarm type identifier 1937 for this possible alarm."; 1938 } 1939 leaf-list resource { 1940 type resource-match; 1941 description 1942 "Optionally, specifies for which resources the alarm type 1943 is valid."; 1944 } 1945 leaf has-clear { 1946 type boolean; 1947 mandatory true; 1948 description 1949 "This leaf tells the operator if the alarm will be 1950 cleared when the correct corrective action has been 1951 taken. Implementations SHOULD strive for detecting the 1952 cleared state for all alarm types. 1954 If this leaf is 'true', the operator can monitor the 1955 alarm until it becomes cleared after the corrective 1956 action has been taken. 1958 If this leaf is 'false', the operator needs to validate 1959 that the alarm is no longer active using other 1960 mechanisms. Alarms can lack a corresponding clear due 1961 to missing instrumentation or because there is no logical 1962 corresponding clear state."; 1963 } 1964 leaf-list severity-levels { 1965 type severity; 1966 description 1967 "This leaf-list indicates the possible severity levels of 1968 this alarm type. Note well that 'clear' is not part of 1969 the severity type. In general, the severity level 1970 should be defined by the instrumentation based on 1971 dynamic state and not defined statically by the alarm 1972 type in order to provide relevant severity level based 1973 on dynamic state and context.
However most alarm types 1974 have a defined set of possible severity levels and this 1975 should be provided here."; 1976 } 1977 leaf description { 1978 type string; 1979 mandatory true; 1980 description 1981 "A description of the possible alarm. It SHOULD include 1982 information on possible underlying root causes and 1983 corrective actions."; 1984 } 1985 } 1986 } 1987 container summary { 1988 if-feature "alarm-summary"; 1989 config false; 1990 description 1991 "This container gives a summary of number of alarms."; 1992 list alarm-summary { 1993 key "severity"; 1994 description 1995 "A global summary of all alarms in the system. The summary 1996 does not include shelved alarms."; 1997 leaf severity { 1998 type severity; 1999 description 2000 "Alarm summary for this severity level."; 2001 } 2002 leaf total { 2003 type yang:gauge32; 2004 description 2005 "Total number of alarms of this severity level."; 2006 } 2007 leaf not-cleared { 2008 type yang:gauge32; 2009 description 2010 "Total number of alarms of this severity level 2011 that are not cleared."; 2012 } 2013 leaf cleared { 2014 type yang:gauge32; 2015 description 2016 "For this severity level, the number of alarms that are 2017 cleared."; 2018 } 2019 leaf cleared-not-closed { 2020 if-feature "operator-actions"; 2021 type yang:gauge32; 2022 description 2023 "For this severity level, the number of alarms that are 2024 cleared but not closed."; 2025 } 2026 leaf cleared-closed { 2027 if-feature "operator-actions"; 2028 type yang:gauge32; 2029 description 2030 "For this severity level, the number of alarms that are 2031 cleared and closed."; 2032 } 2033 leaf not-cleared-closed { 2034 if-feature "operator-actions"; 2035 type yang:gauge32; 2036 description 2037 "For this severity level, the number of alarms that are 2038 not cleared but closed."; 2039 } 2040 leaf not-cleared-not-closed { 2041 if-feature "operator-actions"; 2042 type yang:gauge32; 2043 description 2044 "For this severity level, the number of alarms 
that are 2045 not cleared and not closed."; 2046 } 2047 } 2048 leaf shelves-active { 2049 if-feature "alarm-shelving"; 2050 type empty; 2051 description 2052 "This is a hint to the operator that there are active 2053 alarm shelves. This leaf MUST exist if the 2054 alarms/shelved-alarms/number-of-shelved-alarms is > 0."; 2055 } 2056 } 2057 container alarm-list { 2058 config false; 2059 description 2060 "The alarms in the system."; 2061 leaf number-of-alarms { 2062 type yang:gauge32; 2063 description 2064 "This object shows the total number of 2065 alarms in the system, i.e., the total number 2066 of entries in the alarm list."; 2067 } 2068 leaf last-changed { 2069 type yang:date-and-time; 2070 description 2071 "A timestamp when the alarm list was last 2072 changed. The value can be used by a manager to 2073 initiate an alarm resynchronization procedure."; 2074 } 2075 list alarm { 2076 key "resource alarm-type-id alarm-type-qualifier"; 2077 description 2078 "The list of alarms. Each entry in the list holds one 2079 alarm for a given alarm type and resource. An alarm can 2080 be updated from the underlying resource or by the user. 2081 The following leafs are maintained by the resource: 2082 is-cleared, last-change, perceived-severity, and 2083 alarm-text. An operator can change: operator-state and 2084 operator-text. 2086 Entries appear in the alarm list the first time an alarm 2087 becomes active for a given alarm-type and resource. 2088 Entries do not get deleted when the alarm is cleared, this 2089 is a boolean state in the alarm. 2091 Alarm entries are removed, purged, from the list by an 2092 explicit purge action. For example, purge all alarms that 2093 are cleared and in closed operator-state that are older 2094 than 24 hours. Purged alarms are removed from the alarm 2095 list. If the alarm resource state changes after a purge, 2096 the alarm will reappear in the alarm list. 
2098 Systems may also remove alarms based on locally configured 2099 policies, which is out of scope for this module."; 2100 uses common-alarm-parameters; 2101 leaf time-created { 2102 type yang:date-and-time; 2103 mandatory true; 2104 description 2105 "The time-stamp when this alarm entry was created. This 2106 represents the first time the alarm appeared; it can 2107 also represent that the alarm re-appeared after a purge. 2108 Further state-changes of the same alarm do not change 2109 this leaf; these changes will update the 'last-changed' 2110 leaf."; 2111 } 2112 uses resource-alarm-parameters; 2113 list operator-state-change { 2114 if-feature "operator-actions"; 2115 key "time"; 2116 description 2117 "This list is used by operators to indicate the state of 2118 human intervention on an alarm. For example, if an 2119 operator has seen an alarm, the operator can add a new 2120 item to this list indicating that the alarm is 2121 acknowledged."; 2122 uses operator-parameters; 2123 } 2124 action set-operator-state { 2125 if-feature "operator-actions"; 2126 description 2127 "This is a means for the operator to indicate the level 2128 of human intervention on an alarm."; 2129 input { 2130 leaf state { 2131 type writable-operator-state; 2132 mandatory true; 2133 description 2134 "Set this operator state."; 2135 } 2136 leaf text { 2137 type string; 2138 description 2139 "Additional optional textual information."; 2140 } 2141 } 2142 } 2143 notification operator-action { 2144 if-feature "operator-actions"; 2145 description 2146 "This notification is used to report that an operator 2147 acted upon an alarm."; 2148 uses operator-parameters; 2149 } 2150 } 2151 action purge-alarms { 2152 description 2153 "This operation requests the server to delete entries from 2154 the alarm list according to the supplied criteria. 2156 Typically this operation is used to delete alarms that are 2157 in closed operator state and older than a specified time.
2159 The number of purged alarms is returned as an output 2160 parameter."; 2161 input { 2162 uses filter-input; 2163 } 2164 output { 2165 leaf purged-alarms { 2166 type uint32; 2167 description 2168 "Number of purged alarms."; 2169 } 2170 } 2171 } 2172 action compress-alarms { 2173 if-feature "alarm-history"; 2174 description 2175 "This operation requests the server to compress entries in 2176 the alarm list by removing all but the latest 2177 'status-change' entry for all matching alarms. Conditions 2178 in the input are logically ANDed. If no input condition 2179 is given, all alarms are compressed."; 2180 input { 2181 leaf resource { 2182 type resource-match; 2183 description 2184 "Compress the alarms matching this resource."; 2185 } 2186 leaf alarm-type-id { 2187 type leafref { 2188 path "/alarms/alarm-list/alarm/alarm-type-id"; 2189 require-instance false; 2190 } 2191 description 2192 "Compress alarms with this alarm-type-id."; 2193 } 2194 leaf alarm-type-qualifier { 2195 type leafref { 2196 path "/alarms/alarm-list/alarm/alarm-type-qualifier"; 2197 require-instance false; 2198 } 2199 description 2200 "Compress the alarms with this alarm-type-qualifier."; 2202 } 2203 } 2204 output { 2205 leaf compressed-alarms { 2206 type uint32; 2207 description 2208 "Number of compressed alarm entries."; 2209 } 2210 } 2211 } 2212 } 2213 container shelved-alarms { 2214 if-feature "alarm-shelving"; 2215 config false; 2216 description 2217 "The shelved alarms. Alarms appear here if they match the 2218 criteria in /alarms/control/alarm-shelving. This list does 2219 not generate any notifications. The list represents alarms 2220 that are considered not relevant by the operator. Alarms in 2221 this list have an operator-state of 'shelved'. 
This cannot 2222 be changed."; 2223 leaf number-of-shelved-alarms { 2224 type yang:gauge32; 2225 description 2226 "This object shows the total number of currently 2227 shelved alarms, i.e., the total number of entries 2228 in the shelved alarm list."; 2229 } 2230 leaf shelved-alarms-last-changed { 2231 type yang:date-and-time; 2232 description 2233 "A timestamp when the shelved alarm list was last changed. 2234 The value can be used by a manager to initiate an alarm 2235 resynchronization procedure."; 2236 } 2237 list shelved-alarm { 2238 key "resource alarm-type-id alarm-type-qualifier"; 2239 description 2240 "The list of shelved alarms. Shelved alarms can only be 2241 updated from the underlying resource, no operator actions 2242 are supported."; 2243 uses common-alarm-parameters; 2244 leaf shelf-name { 2245 type leafref { 2246 path "/alarms/control/alarm-shelving/shelf/name"; 2247 require-instance false; 2248 } 2249 description 2250 "The name of the shelf."; 2251 } 2252 uses resource-alarm-parameters; 2253 list operator-state-change { 2254 if-feature "operator-actions"; 2255 key "time"; 2256 description 2257 "This list is used by operators to indicate the state of 2258 human intervention on an alarm. For shelved alarms, the 2259 system has set the list item in the list to 'shelved'."; 2260 uses operator-parameters; 2261 } 2262 } 2263 action purge-shelved-alarms { 2264 description 2265 "This operation requests the server to delete entries from 2266 the shelved alarms list according to the supplied 2267 criteria. 2269 In the shelved alarm list it makes sense to delete alarms 2270 that are not relevant anymore.
2272 The number of purged alarms is returned as an output 2273 parameter."; 2274 input { 2275 uses filter-input; 2276 } 2277 output { 2278 leaf purged-alarms { 2279 type uint32; 2280 description 2281 "Number of purged alarms."; 2282 } 2283 } 2284 } 2285 action compress-shelved-alarms { 2286 if-feature "alarm-history"; 2287 description 2288 "This operation requests the server to compress entries in 2289 the shelved alarm list by removing all but the latest 2290 'status-change' entry for all matching shelved alarms. 2291 Conditions in the input are logically ANDed. If no input 2292 condition is given, all alarms are compressed."; 2293 input { 2294 leaf resource { 2295 type leafref { 2296 path "/alarms/shelved-alarms/shelved-alarm/resource"; 2297 require-instance false; 2299 } 2300 description 2301 "Compress the alarms with this resource."; 2302 } 2303 leaf alarm-type-id { 2304 type leafref { 2305 path "/alarms/shelved-alarms/shelved-alarm" 2306 + "/alarm-type-id"; 2307 require-instance false; 2308 } 2309 description 2310 "Compress alarms with this alarm-type-id."; 2311 } 2312 leaf alarm-type-qualifier { 2313 type leafref { 2314 path "/alarms/shelved-alarms/shelved-alarm" 2315 + "/alarm-type-qualifier"; 2316 require-instance false; 2317 } 2318 description 2319 "Compress the alarms with this alarm-type-qualifier."; 2320 } 2321 } 2322 output { 2323 leaf compressed-alarms { 2324 type uint32; 2325 description 2326 "Number of compressed alarm entries."; 2327 } 2328 } 2329 } 2330 } 2331 list alarm-profile { 2332 if-feature "alarm-profile"; 2333 key "alarm-type-id alarm-type-qualifier-match resource"; 2334 ordered-by user; 2335 description 2336 "This list is used to assign further information or 2337 configuration for each alarm type. This module supports a 2338 mechanism where the client can override the system default 2339 alarm severity levels. 
The alarm-profile is also a useful 2340 augmentation point for specific additions to alarm types."; 2341 leaf alarm-type-id { 2342 type alarm-type-id; 2343 description 2344 "The alarm type identifier to match."; 2345 } 2346 leaf alarm-type-qualifier-match { 2347 type string; 2348 description 2349 "A W3C regular expression that is used to match the alarm 2350 type qualifier."; 2351 } 2352 leaf resource { 2353 type resource-match; 2354 description 2355 "Specifies which resources to match."; 2356 } 2357 leaf description { 2358 type string; 2359 mandatory true; 2360 description 2361 "A description of the alarm profile."; 2362 } 2363 container alarm-severity-assignment-profile { 2364 if-feature "severity-assignment"; 2365 description 2366 "The client can override the system default severity 2367 level."; 2368 reference 2369 "ITU M.3100, ITU M.3160 2370 - Generic Network Information Model, Alarm Severity 2371 Assignment Profile"; 2372 leaf-list severity-levels { 2373 type severity; 2374 ordered-by user; 2375 description 2376 "Specifies the configured severity level(s) for the 2377 matching alarm. If the alarm has several severity 2378 levels the leaf-list shall be given in rising severity 2379 order. The original M3100/M3160 ASAP function only 2380 allows for a one-to-one mapping between alarm type and 2381 severity but since the IETF alarm module supports 2382 stateful alarms the mapping must allow for several 2383 severity levels. 2385 Assume a high-utilisation alarm type with two thresholds 2386 with the system default severity levels of threshold1 = 2387 warning and threshold2 = minor. Setting this leaf-list 2388 to (minor, major) will assign the severity levels 2389 threshold1 = minor and threshold2 = major"; 2390 } 2391 } 2392 } 2393 } 2394 /* 2395 * Notifications 2396 */ 2398 notification alarm-notification { 2399 description 2400 "This notification is used to report a state change for an 2401 alarm. 
The same notification is used for reporting a newly 2402 raised alarm, a cleared alarm or changing the text and/or 2403 severity of an existing alarm."; 2404 uses common-alarm-parameters; 2405 uses alarm-state-change-parameters; 2406 } 2408 notification alarm-inventory-changed { 2409 description 2410 "This notification is used to report that the list of possible 2411 alarms has changed. This can happen, for example, when a new 2412 software module is installed or a new physical card is 2413 inserted."; 2414 } 2415 } 2417 2419 7. X.733 Extensions 2421 Many alarm systems are based on the X.733 [X.733] and X.736 [X.736] 2422 alarm standards. This module augments the alarm inventory, the alarm 2423 lists and the alarm notification with X.733 and X.736 parameters. 2425 The module also supports a feature whereby the alarm manager can 2426 configure the mapping from alarm types to X.733 event-type and 2427 probable-cause parameters. This might be needed when the default 2428 mapping provided by the system is in conflict with other management 2429 systems or not considered correct. 2431 Note that the IETF Alarm Module term 'resource' is synonymous with the 2432 ITU term 'managed object'. 2434 8. The X.733 Mapping Module 2436 This YANG module references [X.721], [X.733] and [X.736]. 2438 file "ietf-alarms-x733@2019-01-27.yang" 2439 module ietf-alarms-x733 { 2440 yang-version 1.1; 2441 namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733"; 2442 prefix x733; 2444 import ietf-alarms { 2445 prefix al; 2446 } 2447 import ietf-yang-types { 2448 prefix yang; 2449 reference 2450 "RFC 6991: Common YANG Data Types"; 2451 } 2453 organization 2454 "IETF CCAMP Working Group"; 2455 contact 2456 "WG Web: 2457 WG List: 2459 Editor: Stefan Vallin 2460 2462 Editor: Martin Bjorklund 2463 "; 2464 description 2465 "This module augments the ietf-alarms module with X.733 alarm 2466 parameters.
2468 The following structures are augmented with X.733 event type 2469 and probable cause: 2471 1) alarms/alarm-inventory: all possible alarm types 2472 2) alarms/alarm-list: every alarm in the system 2473 3) alarm-notification: notifications indicating alarm state 2474 changes 2475 4) alarms/shelved-alarms 2477 The module also optionally allows the alarm management system 2478 to configure the mapping from the IETF Alarm module alarm keys 2479 to the ITU tuple (event-type, probable-cause). 2481 The mapping does not include a corresponding X.733 specific 2482 problem value. The recommendation is to use the 2483 'alarm-type-qualifier' leaf which serves the same purpose. 2485 The module uses an integer and a corresponding string for 2486 probable cause instead of a globally defined enumeration, in 2487 order to be able to manage conflicting enumeration definitions. 2488 A single globally defined enumeration is challenging to 2489 maintain. 2491 The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL 2492 NOT', 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'NOT RECOMMENDED', 2493 'MAY', and 'OPTIONAL' in this document are to be interpreted as 2494 described in BCP 14 (RFC 2119) (RFC 8174) when, and only when, 2495 they appear in all capitals, as shown here. 2497 Copyright (c) 2019 IETF Trust and the persons identified as 2498 authors of the code. All rights reserved. 2500 Redistribution and use in source and binary forms, with or 2501 without modification, is permitted pursuant to, and subject to 2502 the license terms contained in, the Simplified BSD License set 2503 forth in Section 4.c of the IETF Trust's Legal Provisions 2504 Relating to IETF Documents 2505 (https://trustee.ietf.org/license-info). 
2507 This version of this YANG module is part of RFC XXXX 2508 (https://tools.ietf.org/html/rfcXXXX); see the RFC itself for 2509 full legal notices."; 2510 reference 2511 "ITU Recommendation X.733: Information Technology 2512 - Open Systems Interconnection 2513 - System Management: Alarm Reporting Function"; 2515 revision 2019-01-27 { 2516 description 2517 "Initial revision."; 2518 reference 2519 "RFC XXXX: YANG Alarm Module"; 2520 } 2522 /* 2523 * Features 2524 */ 2526 feature configure-x733-mapping { 2527 description 2528 "The system supports configurable X733 mapping from 2529 the IETF alarm module alarm-type to X733 event-type 2530 and probable-cause."; 2531 } 2533 /* 2534 * Typedefs 2535 */ 2537 typedef event-type { 2538 type enumeration { 2539 enum other { 2540 value 1; 2541 description 2542 "None of the below."; 2543 } 2544 enum communications-alarm { 2545 value 2; 2546 description 2547 "An alarm of this type is principally associated with the 2548 procedures and/or processes required to convey 2549 information from one point to another."; 2550 } 2551 enum quality-of-service-alarm { 2552 value 3; 2553 description 2554 "An alarm of this type is principally associated with a 2555 degradation in the quality of a service."; 2556 } 2557 enum processing-error-alarm { 2558 value 4; 2559 description 2560 "An alarm of this type is principally associated with a 2561 software or processing fault."; 2562 } 2563 enum equipment-alarm { 2564 value 5; 2565 description 2566 "An alarm of this type is principally associated with an 2567 equipment fault."; 2568 } 2569 enum environmental-alarm { 2570 value 6; 2571 description 2572 "An alarm of this type is principally associated with a 2573 condition relating to an enclosure in which the equipment 2574 resides."; 2575 } 2576 enum integrity-violation { 2577 value 7; 2578 description 2579 "An indication that information may have been illegally 2580 modified, inserted or deleted."; 2581 } 2582 enum operational-violation { 2583 
value 8; 2584 description 2585 "An indication that the provision of the requested service 2586 was not possible due to the unavailability, malfunction or 2587 incorrect invocation of the service."; 2588 } 2589 enum physical-violation { 2590 value 9; 2591 description 2592 "An indication that a physical resource has been violated 2593 in a way that suggests a security attack."; 2594 } 2595 enum security-service-or-mechanism-violation { 2596 value 10; 2597 description 2598 "An indication that a security attack has been detected by 2599 a security service or mechanism."; 2600 } 2601 enum time-domain-violation { 2602 value 11; 2603 description 2604 "An indication that an event has occurred at an unexpected 2605 or prohibited time."; 2606 } 2607 } 2608 description 2609 "The event types as defined by X.733 and X.736."; 2610 reference 2611 "ITU Recommendation X.733: Information Technology 2612 - Open Systems Interconnection 2613 - System Management: Alarm Reporting Function 2614 ITU Recommendation X.736: Information Technology 2615 - Open Systems Interconnection 2616 - System Management: Security Alarm Reporting Function"; 2617 } 2619 typedef trend { 2620 type enumeration { 2621 enum less-severe { 2622 description 2623 "There is at least one outstanding alarm of a 2624 severity higher (more severe) than that in the 2625 current alarm."; 2626 } 2627 enum no-change { 2628 description 2629 "The Perceived severity reported in the current 2630 alarm is the same as the highest (most severe) 2631 of any of the outstanding alarms"; 2632 } 2633 enum more-severe { 2634 description 2635 "The Perceived severity in the current alarm is 2636 higher (more severe) than that reported in any 2637 of the outstanding alarms."; 2638 } 2639 } 2640 description 2641 "This type is used to describe the 2642 severity trend of the alarming resource"; 2643 reference 2644 "ITU Recommendation X.721: Information Technology 2645 - Open Systems Interconnection 2646 - Structure of management information: 
2647 Definition of management information 2648 Module Attribute-ASN1Module"; 2649 } 2651 typedef value-type { 2652 type union { 2653 type int64; 2654 type uint64; 2655 type decimal64 { 2656 fraction-digits 2; 2657 } 2658 } 2659 description 2660 "A generic union type to match ITU choice of integer 2661 and real."; 2662 } 2664 /* 2665 * Groupings 2666 */ 2668 grouping x733-alarm-parameters { 2669 description 2670 "Common X.733 parameters for alarms."; 2671 leaf event-type { 2672 type event-type; 2673 description 2674 "The X.733/X.736 event type for this alarm."; 2675 } 2676 leaf probable-cause { 2677 type uint32; 2678 description 2679 "The X.733 probable cause for this alarm."; 2680 } 2681 leaf probable-cause-string { 2682 type string; 2683 description 2684 "The user friendly string matching 2685 the probable cause integer value. The string 2686 SHOULD match the X.733 enumeration. For example, 2687 value 27 is 'localNodeTransmissionError'."; 2688 } 2689 container threshold-information { 2690 description 2691 "This parameter shall be present when the alarm 2692 is a result of crossing a threshold. "; 2693 leaf triggered-threshold { 2694 type string; 2695 description 2696 "The identifier of the threshold attribute that 2697 caused the notification."; 2698 } 2699 leaf observed-value { 2700 type value-type; 2701 description 2702 "The value of the gauge or counter which crossed 2703 the threshold. 
This may be different from the 2704 threshold value if, for example, the gauge may 2705 only take on discrete values."; 2706 } 2707 choice threshold-level { 2708 description 2709 "In the case of a gauge the threshold level specifies 2710 a pair of threshold values, the first being the value 2711 of the crossed threshold and the second, its corresponding 2712 hysteresis; in the case of a counter the threshold level 2713 specifies only the threshold value."; 2714 case up { 2715 leaf up-high { 2716 type value-type; 2717 description 2718 "The going up threshold for raising the alarm."; 2719 } 2720 leaf up-low { 2721 type value-type; 2722 description 2723 "The threshold level for clearing the alarm. 2724 This is used for hysteresis functions for gauges."; 2725 } 2726 } 2727 case down { 2728 leaf down-low { 2729 type value-type; 2730 description 2731 "The going down threshold for raising the alarm."; 2732 } 2733 leaf down-high { 2734 type value-type; 2735 description 2736 "The threshold level for clearing the alarm. 2737 This is used for hysteresis functions for gauges."; 2738 } 2739 } 2740 } 2741 leaf arm-time { 2742 type yang:date-and-time; 2743 description 2744 "For a gauge threshold, the time at which the threshold 2745 was last re-armed, namely the time after the previous 2746 threshold crossing at which the hysteresis value of the 2747 threshold was exceeded thus again permitting generation 2748 of notifications when the threshold is crossed.
2749 For a counter threshold, the later of the time at which 2750 the threshold offset was last applied, or the time at 2751 which the counter was last initialized (for resettable 2752 counters)."; 2753 } 2754 } 2755 list monitored-attributes { 2756 uses attribute; 2757 key "id"; 2758 description 2759 "The Monitored attributes parameter, when present, defines 2760 one or more attributes of the resource and their 2761 corresponding values at the time of the alarm."; 2762 } 2763 leaf-list proposed-repair-actions { 2764 type string; 2765 description 2766 "This parameter, when present, is used if the cause is 2767 known and the system being managed can suggest one or 2768 more solutions (such as switch in standby equipment, 2769 retry, replace media)."; 2770 } 2771 leaf trend-indication { 2772 type trend; 2773 description 2774 "This parameter specifies the current 2775 severity trend of the resource. If present it 2776 indicates that there are one or more alarms 2777 ('outstanding alarms') which have not been cleared, 2778 and pertain to the same resource as that to which 2779 this alarm ('current alarm') pertains. 2780 The possible values are: 2782 more-severe: The Perceived severity in the current 2783 alarm is higher (more severe) than that reported in 2784 any of the outstanding alarms. 2786 no-change: The Perceived severity reported in the 2787 current alarm is the same as the highest (most severe) 2788 of any of the outstanding alarms. 2790 less-severe: There is at least one outstanding alarm 2791 of a severity higher (more severe) than that in the 2792 current alarm."; 2793 } 2794 leaf backedup-status { 2795 type boolean; 2796 description 2797 "This parameter, when present, specifies whether or not 2798 the object emitting the alarm has been backed-up, and 2799 services provided to the user have, therefore, not been 2800 disrupted. 
The use of this field in conjunction with the 2801 severity field provides information in an independent form 2802 to qualify the seriousness of the alarm and the ability of 2803 the system as a whole to continue to provide services. 2804 If the value of this parameter is true, it indicates that 2805 the object emitting the alarm has been backed-up; if false, 2806 the object has not been backed-up."; 2807 } 2808 leaf backup-object { 2809 type al:resource; 2810 description 2811 "This parameter shall be present when the Backed-up status 2812 parameter is present and has the value true. This parameter 2813 specifies the managed object instance that is providing 2814 back-up services for the managed object about which the 2815 notification pertains. This parameter is useful, 2816 for example, when the back-up object is from a pool of 2817 objects any of which may be dynamically allocated to 2818 replace a faulty object."; 2819 } 2820 list additional-information { 2821 key "identifier"; 2822 description 2823 "This parameter allows the inclusion of a 2824 set of additional information in the alarm. 
            It is a series of data structures, each of which contains
            three items of information: an identifier, a significance
            indicator, and the problem information.";
         leaf identifier {
           type string;
           description
             "Identifies the data-type of the information parameter.";
         }
         leaf significant {
           type boolean;
           description
             "Set to true if the receiving system must be able to
              parse the contents of the information subparameter
              for the event report to be fully understood.";
         }
         leaf information {
           type string;
           description
             "Additional information about the alarm.";
         }
       }
       leaf security-alarm-detector {
         type al:resource;
         description
           "This parameter identifies the detector of the security
            alarm.";
       }
       leaf service-user {
         type al:resource;
         description
           "This parameter identifies the service-user whose request
            for service led to the generation of the security alarm.";
       }
       leaf service-provider {
         type al:resource;
         description
           "This parameter identifies the intended service-provider
            of the service that led to the generation of the security
            alarm.";
       }
       reference
         "ITU Recommendation X.733: Information Technology
            - Open Systems Interconnection
            - System Management: Alarm Reporting Function
          ITU Recommendation X.736: Information Technology
            - Open Systems Interconnection
            - System Management: Security Alarm Reporting Function";
     }

     grouping x733-alarm-definition-parameters {
       description
         "Common X.733 parameters for alarm definitions.
          This grouping is used to define those alarm
          attributes that can be mapped from the alarm-type
          mechanism in the ietf-alarms module.";
       leaf event-type {
         type event-type;
         description
           "The alarm type has this X.733/X.736 event type.";
       }
       leaf probable-cause {
         type uint32;
         description
           "The alarm type has this X.733 probable cause value.
            This module defines probable cause as an integer
            and not as an enumeration. The reason is that the
            primary use of probable cause is in the management
            application if it is based on the X.733 standard.
            However, most management applications have their own
            enum definitions, and merging enums from
            different systems might create conflicts. By using
            a configurable uint32, the system can be configured
            to match the enum values in the management application.";
       }
       leaf probable-cause-string {
         type string;
         description
           "This string can be used to give a user-friendly string
            to the probable cause value.";
       }
     }

     grouping attribute {
       description
         "A grouping to match the ITU generic reference to
          an attribute.";
       leaf id {
         type al:resource;
         description
           "The resource representing the attribute.";
       }
       leaf value {
         type string;
         description
           "The value represented as a string since it could
            be of any type.";
       }
       reference
         "ITU Recommendation X.721: Information Technology
            - Open Systems Interconnection
            - Structure of management information:
              Definition of management information
          Module Attribute-ASN1Module";
     }

     /*
      * Add X.733 parameters to the alarm definitions, alarms,
      * and notification.
      */

     augment "/al:alarms/al:alarm-inventory/al:alarm-type" {
       description
         "Augment X.733 mapping information to the alarm inventory.";
       uses x733-alarm-definition-parameters;
     }

     /*
      * Add X.733 configurable mapping.
      */

     augment "/al:alarms/al:control" {
       description
         "Add X.733 mapping capabilities.";
       list x733-mapping {
         if-feature "configure-x733-mapping";
         key "alarm-type-id alarm-type-qualifier-match";
         description
           "This list allows a management application to control the
            X.733 mapping for all alarm types in the system. Any entry
            in this list will allow the alarm manager to override the
            default X.733 mapping in the system and the final mapping
            will be shown in the alarm inventory.";
         leaf alarm-type-id {
           type al:alarm-type-id;
           description
             "Map the alarm type with this alarm type identifier.";
         }
         leaf alarm-type-qualifier-match {
           type string;
           description
             "A W3C regular expression that is used when mapping an
              alarm type and alarm-type-qualifier to X.733 parameters.";
         }
         uses x733-alarm-definition-parameters;
       }
     }

     augment "/al:alarms/al:alarm-list/al:alarm" {
       description
         "Augment X.733 information to the alarm.";
       uses x733-alarm-parameters;
     }

     augment "/al:alarms/al:shelved-alarms/al:shelved-alarm" {
       description
         "Augment X.733 information to the shelved alarm.";
       uses x733-alarm-parameters;
     }

     augment "/al:alarm-notification" {
       description
         "Augment X.733 information to the alarm notification.";
       uses x733-alarm-parameters;
     }
   }

9.  IANA Considerations

   This document registers two URIs in the IETF XML registry [RFC3688].
   Following the format in RFC 3688, the following registrations are
   requested to be made.

      URI: urn:ietf:params:xml:ns:yang:ietf-alarms
      Registrant Contact: The IESG.
      XML: N/A, the requested URI is an XML namespace.

      URI: urn:ietf:params:xml:ns:yang:ietf-alarms-x733
      Registrant Contact: The IESG.
      XML: N/A, the requested URI is an XML namespace.

   This document registers two YANG modules in the YANG Module Names
   registry [RFC6020].

      name: ietf-alarms
      namespace: urn:ietf:params:xml:ns:yang:ietf-alarms
      prefix: al
      reference: RFC XXXX

      name: ietf-alarms-x733
      namespace: urn:ietf:params:xml:ns:yang:ietf-alarms-x733
      prefix: x733
      reference: RFC XXXX

10.  Security Considerations

   The YANG module specified in this document defines a schema for data
   that is designed to be accessed via network management protocols such
   as NETCONF [RFC6241] or RESTCONF [RFC8040].  The lowest NETCONF layer
   is the secure transport layer, and the mandatory-to-implement secure
   transport is Secure Shell (SSH) [RFC6242].  The lowest RESTCONF layer
   is HTTPS, and the mandatory-to-implement secure transport is TLS
   [RFC8446].

   The NETCONF access control model [RFC8341] provides the means to
   restrict access for particular NETCONF or RESTCONF users to a
   preconfigured subset of all available NETCONF or RESTCONF protocol
   operations and content.

   There are a number of data nodes defined in this YANG module that are
   writable/creatable/deletable (i.e., config true, which is the
   default).  These data nodes may be considered sensitive or vulnerable
   in some network environments.  Write operations (e.g., edit-config)
   to these data nodes without proper protection can have a negative
   effect on network operations.  These are the subtrees and data nodes
   and their sensitivity/vulnerability:

   /alarms/control/notify-status-changes:  This leaf controls whether
      notifications are sent only for raise and clear or for all
      severity level changes.
      Unauthorized access to this leaf could have a negative impact
      on operational procedures relying on fine-grained alarm state
      change reporting.

   /alarms/control/alarm-shelving/shelf:  This list controls the
      shelving (blocking) of alarms.  Unauthorized access to this list
      could jeopardize the alarm management procedures since these
      alarms will not be notified and not be part of the alarm list.

   Some of the operations in this YANG module may be considered
   sensitive or vulnerable in some network environments.  It is thus
   important to control access to these operations.  These are the
   operations and their sensitivity/vulnerability:

   /alarms/alarm-list/purge-alarms:  This action deletes alarms from the
      alarm list.  Unauthorized use of this action could jeopardize the
      alarm management procedures since the deleted alarms may be vital
      for the alarm management application.

11.  Acknowledgements

   The authors wish to thank Viktor Leijon and Johan Nordlander for
   their valuable input on forming the alarm model.

   The authors also wish to thank Nick Hancock, Joey Boyd, Tom Petch and
   Balazs Lengyel for their extensive reviews and contributions to this
   document.

12.  References

12.1.  Normative References

   [M.3100]   International Telecommunication Union, "Generic Network
              Information Model", ITU-T Recommendation M.3100, 2005.

   [M.3160]   International Telecommunication Union, "Generic,
              protocol-neutral management information model", ITU-T
              Recommendation M.3160, 2008.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3688]  Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688,
              DOI 10.17487/RFC3688, January 2004.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

   [RFC6242]  Wasserman, M., "Using the NETCONF Protocol over Secure
              Shell (SSH)", RFC 6242, DOI 10.17487/RFC6242, June 2011.

   [RFC6991]  Schoenwaelder, J., Ed., "Common YANG Data Types",
              RFC 6991, DOI 10.17487/RFC6991, July 2013.

   [RFC7950]  Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language",
              RFC 7950, DOI 10.17487/RFC7950, August 2016.

   [RFC8040]  Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF
              Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8341]  Bierman, A. and M. Bjorklund, "Network Configuration
              Access Control Model", STD 91, RFC 8341,
              DOI 10.17487/RFC8341, March 2018.

   [RFC8348]  Bierman, A., Bjorklund, M., Dong, J., and D. Romascanu, "A
              YANG Data Model for Hardware Management", RFC 8348,
              DOI 10.17487/RFC8348, March 2018.

   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

   [X.721]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Structure of
              management information: Definition of management
              information", ITU-T Recommendation X.721, 1992.

   [X.733]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Alarm Reporting Function", ITU-T
              Recommendation X.733, 1992.

12.2.  Informative References

   [ALARMIRP]
              3GPP, "Telecommunication management; Fault Management;
              Part 2: Alarm Integration Reference Point (IRP):
              Information Service (IS)", 3GPP TS 32.111-2 3.4.0, March
              2005.

   [ALARMSEM]
              Wallin, S., Leijon, V., Nordlander, J., and N. Bystedt,
              "The semantics of alarm definitions: enabling systematic
              reasoning about alarms", International Journal of Network
              Management, Volume 22, Issue 3, John Wiley and Sons, Ltd,
              http://dx.doi.org/10.1002/nem.800, March 2012.

   [EEMUA]    Engineering Equipment and Materials Users Association,
              "Alarm Systems: A Guide to Design, Management and
              Procurement", EEMUA Publication No. 191, 2nd edition,
              London, 2007.

   [G.7710]   ITU-T, "SERIES G: TRANSMISSION SYSTEMS AND MEDIA, DIGITAL
              SYSTEMS AND NETWORKS Data over Transport - Generic aspects
              - Transport network control aspects.  Common equipment
              management function requirements", 2012.

   [I-D.ietf-netmod-yang-instance-file-format]
              Lengyel, B. and B. Claise, "YANG Instance Data File
              Format", draft-ietf-netmod-yang-instance-file-format-00
              (work in progress), November 2018.

   [ISA182]   International Society of Automation (ISA), "ANSI/ISA-
              18.2-2009 Management of Alarm Systems for the Process
              Industries", 2009.

   [RFC3877]  Chisholm, S. and D. Romascanu, "Alarm Management
              Information Base (MIB)", RFC 3877, DOI 10.17487/RFC3877,
              September 2004.

   [RFC8340]  Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams",
              BCP 215, RFC 8340, DOI 10.17487/RFC8340, March 2018.

   [X.736]    International Telecommunication Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Security alarm reporting function", ITU-T
              Recommendation X.736, 1992.

Appendix A.  Vendor-specific Alarm-Types Example

   This example shows how to define alarm-types in a vendor-specific
   module.
   In this case, the vendor "xyz" has chosen to define top-level
   identities according to X.733 event types.

   module example-xyz-alarms {
     namespace "urn:example:xyz-alarms";
     prefix xyz-al;

     import ietf-alarms {
       prefix al;
     }

     identity xyz-alarms {
       base al:alarm-type-id;
     }

     identity communications-alarm {
       base xyz-alarms;
     }
     identity quality-of-service-alarm {
       base xyz-alarms;
     }
     identity processing-error-alarm {
       base xyz-alarms;
     }
     identity equipment-alarm {
       base xyz-alarms;
     }
     identity environmental-alarm {
       base xyz-alarms;
     }

     // communications alarms
     identity link-alarm {
       base communications-alarm;
     }

     // QoS alarms
     identity high-jitter-alarm {
       base quality-of-service-alarm;
     }
   }

Appendix B.  Alarm Inventory Example

   This example shows an alarm inventory with two alarm types: one
   defined only with the identifier, and another that is dynamically
   configured.  In the latter case, a digital input has been connected
   to a smoke detector; therefore, the 'alarm-type-qualifier' is set to
   "smoke-alarm" and the 'alarm-type-id' to "environmental-alarm".

         xyz-al:link-alarm
         /dev:interfaces/dev:interface
         true
         Link failure, operational state down but admin state up

         xyz-al:environmental-alarm
         smoke-alarm
         true
         Connected smoke detector to digital input

Appendix C.  Alarm List Example

   In this example, we show an alarm that has toggled [major, clear,
   major].  An operator has acknowledged the alarm.
       1
       2018-04-08T08:39:50.00Z
         /dev:interfaces/dev:interface[name='FastEthernet1/0']
         xyz-al:link-alarm
         2018-04-08T08:20:10.00Z
         false
         1.3.6.1.2.1.2.2.1.1.17
         2018-04-08T08:39:40.00Z
         2018-04-08T08:39:50.00Z
         major
         Link operationally down but administratively up

           major
           Link operationally down but administratively up

           cleared
           Link operationally up and administratively up

           major
           Link operationally down but administratively up

           ack
           joe
           Will investigate, ticket TR764999

Appendix D.  Alarm Shelving Example

   This example shows how to shelve alarms.  We shelve alarms related to
   the smoke detectors since they are being installed and tested.  We
   also shelve all alarms from FastEthernet1/0.

           FE10
           /dev:interfaces/dev:interface[name='FastEthernet1/0']

           detectortest
           xyz-al:environmental-alarm
           smoke-alarm

Appendix E.  X.733 Mapping Example

   This example shows how to map a dynamic alarm type
   (alarm-type-id=environmental-alarm, alarm-type-qualifier=smoke-alarm)
   to the corresponding X.733 event-type and probable cause parameters.

         xyz-al:environmental-alarm
         smoke-alarm
         quality-of-service-alarm
         777

Appendix F.  Relationship to other alarm standards

   This section briefly describes how this alarm module relates to other
   relevant standards.

F.1.  Alarm definition

   The table below summarizes relevant definitions of the term "alarm"
   in other alarm standards.

   +------------+---------------------------+--------------------------+
   | Standard   | Definition                | Comment                  |
   +------------+---------------------------+--------------------------+
   | X.733      | error: A deviation of a   | The X.733 alarm          |
   | [X.733]    | system from normal        | definition is focused on |
   |            | operation. fault: The     | the notification as such |
   |            | physical or algorithmic   | and not the state. It    |
   |            | cause of a malfunction.   | also uses the basic      |
   |            | Faults manifest           | criteria of deviation    |
   |            | themselves as errors.     | from normal condition.   |
   |            | alarm: A notification, of | There is no requirement  |
   |            | the form defined by this  | for an operator action   |
   |            | function, of a specific   | to be required.          |
   |            | event. An alarm may or    |                          |
   |            | may not represent an      |                          |
   |            | error.                    |                          |
   |            |                           |                          |
   | G.7710     | Alarms are indications    | The G.7710 definition is |
   | [G.7710]   | that are automatically    | close to the original    |
   |            | generated by an NE as a   | X.733 definition.        |
   |            | result of the declaration |                          |
   |            | of a failure.             |                          |
   |            |                           |                          |
   | Alarm MIB  | Alarm: Persistent         | RFC 3877 defines alarm   |
   | [RFC3877]  | indication of a fault.    | referring back to "a     |
   |            | Fault: Lasting error or   | deviation from normal    |
   |            | warning condition.        | operation". This is      |
   |            | Error: A deviation of a   | problematic, since this  |
   |            | system from normal        | might not require an     |
   |            | operation.                | operator action. The     |
   |            |                           | alarm MIB is state       |
   |            |                           | oriented rather than     |
   |            |                           | notification oriented,   |
   |            |                           | an alarm is a "lasting   |
   |            |                           | condition", not a        |
   |            |                           | discrete notification    |
   |            |                           | reporting about a        |
   |            |                           | condition state change.  |
   |            |                           |                          |
   | ISA        | Alarm: An audible and/or  | The ISA standard adds an |
   | [ISA182]   | visible means of          | important requirement to |
   |            | indicating to the         | the "deviation from      |
   |            | operator an equipment     | normal condition state"; |
   |            | malfunction, process      | requiring a response.    |
   |            | deviation or abnormal     |                          |
   |            | condition requiring a     |                          |
   |            | response.                 |                          |
   |            |                           |                          |
   | EEMUA      | An alarm is an event to   | This is the foundation   |
   | [EEMUA]    | which an operator must    | for the definition of    |
   |            | knowingly react, respond, | alarm in this document.  |
   |            | and acknowledge - not     | It focuses on the core   |
   |            | simply acknowledge and    | criteria that an action  |
   |            | ignore.                   | is really needed.        |
   |            |                           |                          |
   | 3GPP Alarm | 3GPP v15: An alarm        | The latest 3GPP Alarm    |
   | IRP        | signifies an undesired    | IRP version uses         |
   | [ALARMIRP] | condition of a resource   | literally the same alarm |
   |            | (e.g. network element,    | definition as this alarm |
   |            | link) for which an        | module. It is worth      |
   |            | operator action is        | noting that earlier      |
   |            | required. It emphasizes a | versions used a          |
   |            | key requirement that      | definition not requiring |
   |            | operators [...] should    | an operator action and   |
   |            | not be informed about an  | the more broad           |
   |            | undesired condition       | definition of deviation  |
   |            | unless it requires        | from normal condition.   |
   |            | operator action. 3GPP     | The earlier version also |
   |            | v12: alarm: abnormal      | defined an alarm as a    |
   |            | network entity condition, | special case of "event". |
   |            | which categorizes an      |                          |
   |            | event as a fault. fault:  |                          |
   |            | a deviation of a system   |                          |
   |            | from normal operation,    |                          |
   |            | which may result in the   |                          |
   |            | loss of operational       |                          |
   |            | capabilities [...]        |                          |
   +------------+---------------------------+--------------------------+

              Table 1: Definition of alarm in standards

   The definition of alarm has evolved from a focus on events reporting
   a deviation from normal operation towards an undesired *state* that
   *requires an operator action*.

F.2.  Data model

   This section describes how this YANG alarm module relates to other
   standard data models.  Note that we cover only other data models for
   alarm interfaces, not other standards such as SDO-specific alarms.

F.2.1.  X.733

   X.733 has acted as a base for several alarm data models over the
   years.  The YANG alarm module differs in the following ways:

      X.733 models the alarm list as a list of notifications.  The YANG
      alarm module defines the alarm list as the current alarm states
      for the resources, which is generated from the state change
      reporting notifications.

      In X.733 an alarm can have the severity level clear.  In the YANG
      alarm module "clear" is not a severity level; it is a separate
      state of the alarm.  For example, an alarm can have the following
      states: (major, cleared), (minor, not cleared).

      X.733 uses a flat, globally defined enumerated "probable cause" to
      identify alarm types.  This alarm module uses a hierarchical YANG
      identity, alarm-type.  This enables delegation of alarm types
      within organizations.  It also lets management reason about
      "abstract" alarm-types corresponding to base identities, see
      Section 3.2.

      The YANG alarm module has not included the majority of the X.733
      alarm attributes.  Rather, these are defined in an augmenting
      module if "strict" X.733 compliance is needed.

F.2.2.  RFC 3877, the Alarm MIB

   The MIB in RFC 3877 takes a different approach: rather than defining
   a concrete data model for alarms, it defines a model to map existing
   SNMP managed objects and notifications into alarm states and alarm
   notifications.  This was necessary since MIBs were already defined
   with both managed objects and notifications indicating alarms, for
   example linkUp and linkDown notifications in combination with
   ifAdminStatus and ifOperStatus.  So RFC 3877 cannot really be
   compared to the alarm YANG module in that sense.

   The Alarm MIB maps existing MIB definitions into alarms,
   alarmModelTable.  The upside of that is that an SNMP manager can at
   runtime read the possible alarm types.  This corresponds to the
   alarm inventory in the alarm YANG module.

F.2.3.  3GPP Alarm IRP

   The 3GPP Alarm IRP is an evolution of X.733.  The main differences
   between the alarm YANG module and 3GPP are:

      3GPP keeps the majority of the X.733 attributes; the alarm YANG
      module does not.

      3GPP introduced overlapping and possibly conflicting keys for
      alarms, alarmId and (managed object, event type, probable cause,
      specific problem).  (See Annex C in [X.733] Example 3).  In the
      YANG alarm module the key for identifying an alarm instance is
      clearly defined by (resource, alarm-type, alarm-type-qualifier).
      See also Section 3.4 for more information.

      The alarm YANG module clearly separates the resource/
      instrumentation life cycle from the operator life cycle.  3GPP
      allows operators to set the alarm severity to clear; this is not
      allowed by this module.  Rather, an operator closes an alarm,
      which does not affect the severity.

F.2.4.  G.7710

   G.7710 is different from the previously referenced alarm standards.
   It does not define a data model for alarm reporting.  It defines
   common equipment management function requirements including alarm
   instrumentation.
   The scope is transport networks.

   The requirements in G.7710 correspond to features in the alarm YANG
   module in the following way:

      Alarm Severity Assignment Profile (ASAP): the alarm profile
      "/alarms/alarm-profile/".

      Alarm Reporting Control (ARC): alarm shelving "/alarms/control/
      alarm-shelving/" and the ability to control alarm notifications
      "/alarms/control/notify-status-changes".  Alarm shelving
      corresponds to the use case of turning off alarm reporting for a
      specific resource, the NALM state in M.3100.

Appendix G.  Alarm Usability Requirements

   This section defines usability requirements for alarms.  Alarm
   usability is important for an alarm interface.  A data model will
   help in defining the format, but if the actual alarms are of low
   value we have not achieved the goal of alarm management.

   Common alarm problems and the causes of the problems are summarized
   in Table 2.  This summary is adapted to networking based on the ISA
   [ISA182] and EEMUA [EEMUA] standards.

   +------------------+--------------------------------+---------------+
   | Problem          | Cause                          | How this      |
   |                  |                                | module        |
   |                  |                                | addresses the |
   |                  |                                | cause         |
   +------------------+--------------------------------+---------------+
   | Alarms are       | "Nuisance" alarms (chattering  | Strict        |
   | generated but    | alarms and fleeting alarms),   | definition of |
   | they are ignored | faulty hardware, redundant     | alarms        |
   | by the operator. | alarms, cascading alarms,      | requiring     |
   |                  | incorrect alarm settings,      | corrective    |
   |                  | alarms have not been           | response.     |
   |                  | rationalized, the alarms       | Alarm         |
   |                  | represent log information      | requirements  |
   |                  | rather than true alarms.       | in Table 3.   |
   |                  |                                |               |
   | When alarms      | Insufficient alarm response    | The alarm     |
   | occur, operators | procedures and not well        | inventory     |
   | do not know how  | defined alarm types.           | lists all     |
   | to respond.      |                                | alarm types   |
   |                  |                                | and           |
   |                  |                                | corrective    |
   |                  |                                | actions.      |
   |                  |                                | Alarm         |
   |                  |                                | requirements  |
   |                  |                                | in Table 3.   |
   |                  |                                |               |
   | The alarm        | Nuisance alarms, stale alarms, | The alarm     |
   | display is full  | alarms from equipment not in   | definition    |
   | of alarms, even  | service.                       | and alarm     |
   | when there is    |                                | shelving.     |
   | nothing wrong.   |                                |               |
   |                  |                                |               |
   | During a         | Incorrect prioritization of    | State-based   |
   | failure,         | alarms. Not using advanced     | alarm model,  |
   | operators are    | alarm techniques (e.g. state-  | alarm rate    |
   | flooded with so  | based alarming).               | requirements  |
   | many alarms that |                                | in Table 4    |
   | they do not know |                                | and Table 5.  |
   | which ones are   |                                |               |
   | the most         |                                |               |
   | important.       |                                |               |
   +------------------+--------------------------------+---------------+

                  Table 2: Alarm Problems and Causes

   Based upon the above problems, EEMUA gives the following definition
   of a good alarm:

   +----------------+--------------------------------------------------+
   | Characteristic | Explanation                                      |
   +----------------+--------------------------------------------------+
   | Relevant       | Not spurious or of low operational value.        |
   |                |                                                  |
   | Unique         | Not duplicating another alarm.                   |
   |                |                                                  |
   | Timely         | Not long before any response is needed or too    |
   |                | late to do anything.                             |
   |                |                                                  |
   | Prioritized    | Indicating the importance that the operator      |
   |                | deals with the problem.                          |
   |                |                                                  |
   | Understandable | Having a message which is clear and easy to      |
   |                | understand.                                      |
   |                |                                                  |
   | Diagnostic     | Identifying the problem that has occurred.       |
   |                |                                                  |
   | Advisory       | Indicative of the action to be taken.            |
   |                |                                                  |
   | Focusing       | Drawing attention to the most important issues.  |
   +----------------+--------------------------------------------------+

                 Table 3: Definition of a Good Alarm

   Vendors SHOULD rationalize all alarms according to the
   characteristics above.  Another crucial requirement is acceptable
   alarm notification rates.  Vendors SHOULD make sure that they do not
   exceed the recommendations from EEMUA below:

   +-----------------------------------+-------------------------------+
   | Long Term Alarm Rate in Steady    | Acceptability                 |
   | Operation                         |                               |
   +-----------------------------------+-------------------------------+
   | More than one per minute          | Very likely to be             |
   |                                   | unacceptable.                 |
   |                                   |                               |
   | One per 2 minutes                 | Likely to be over-demanding.  |
   |                                   |                               |
   | One per 5 minutes                 | Manageable.                   |
   |                                   |                               |
   | Less than one per 10 minutes      | Very likely to be acceptable. |
   +-----------------------------------+-------------------------------+

            Table 4: Acceptable Alarm Rates, Steady State

   +----------------------------+--------------------------------------+
   | Number of alarms displayed | Acceptability                        |
   | in 10 minutes following a  |                                      |
   | major network problem      |                                      |
   +----------------------------+--------------------------------------+
   | More than 100              | Definitely excessive and very likely |
   |                            | to lead the operator to abandon      |
   |                            | the use of the alarm system.         |
   |                            |                                      |
   | 20-100                     | Hard to cope with.                   |
   |                            |                                      |
   | Under 10                   | Should be manageable - but may be    |
   |                            | difficult if several of the alarms   |
   |                            | require a complex operator response. |
   +----------------------------+--------------------------------------+

                Table 5: Acceptable Alarm Rates, Burst

   The numbers in Table 4 and Table 5 are the sum of all alarms for a
   network being managed from one alarm console.  So every individual
   system or NMS contributes to these numbers.
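
   As a rough illustration of how the rates in Table 4 and Table 5
   might be checked in a benchmarking script, the following Python
   sketch classifies measured alarm counts.  It is not part of the
   alarm module.  The function names are hypothetical, and because
   Table 4 lists reference rates rather than contiguous ranges, the
   cut points between bands are assumptions of this sketch.  Table 5
   says nothing about 10-19 alarms, so that range is reported as
   unspecified (None).

```python
from typing import Optional


def steady_state_acceptability(alarms: int, minutes: float) -> str:
    """Classify a long-term alarm rate against the Table 4 reference rates.

    Table 4 gives four reference rates, not contiguous ranges, so the
    cut points between bands below are assumptions of this sketch.
    """
    if alarms < 0 or minutes <= 0:
        raise ValueError("alarms must be >= 0 and minutes must be > 0")
    per_minute = alarms / minutes
    if per_minute > 1.0:        # more than one per minute
        return "very likely to be unacceptable"
    if per_minute >= 1.0 / 2:   # around one per 2 minutes (assumed cut)
        return "likely to be over-demanding"
    if per_minute >= 1.0 / 10:  # around one per 5 minutes (assumed cut)
        return "manageable"
    return "very likely to be acceptable"  # less than one per 10 minutes


def burst_acceptability(alarms_in_10_minutes: int) -> Optional[str]:
    """Classify the alarm count in the 10 minutes after a major problem.

    Returns None for 10-19 alarms, a range Table 5 does not cover.
    """
    if alarms_in_10_minutes < 0:
        raise ValueError("alarm count must be >= 0")
    if alarms_in_10_minutes > 100:
        return "definitely excessive"
    if alarms_in_10_minutes >= 20:
        return "hard to cope with"
    if alarms_in_10_minutes < 10:
        return "should be manageable"
    return None


if __name__ == "__main__":
    # Counts are totals across the whole network seen at one alarm console.
    print(steady_state_acceptability(alarms=12, minutes=60))
    print(burst_acceptability(150))
```

   Since the tables apply to the sum of all alarms seen at one alarm
   console, the counts passed in would need to be aggregated across
   every system and NMS feeding that console before classification.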

   Vendors SHOULD make sure that the following rules are used in
   designing the alarm interface:

   1.  Rationalize the alarms in the system to ensure that every alarm
       is necessary, has a purpose, and follows the cardinal rule - that
       it requires an operator response.  Adhere to the rules of
       Table 3.

   2.  Audit the quality of the alarms.  Talk with the operators about
       how well the alarm information supports them.  Do they know what
       to do in the event of an alarm?  Are they able to quickly
       diagnose the problem and determine the corrective action?  Does
       the alarm text adhere to the requirements in Table 3?

   3.  Analyze and benchmark the performance of the system and compare
       it to the recommended metrics in Table 4 and Table 5.  Start by
       identifying nuisance alarms, standing alarms at normal state and
       startup.

Authors' Addresses

   Stefan Vallin
   Stefan Vallin AB

   Email: stefan@wallan.se

   Martin Bjorklund
   Cisco

   Email: mbj@tail-f.com