Delay-Tolerant Networking                                   E.J. Birrane
Internet-Draft                                                  E. Annis
Intended status: Informational                               S.E. Heiner
Expires: 28 April 2022          Johns Hopkins Applied Physics Laboratory
                                                            October 2021


                  Asynchronous Management Architecture
                         draft-ietf-dtn-ama-03

Abstract

   This document describes a management architecture suitable for
   deployment in challenged networking environments for the
   configuration, monitoring, and local control of application services.
   Challenged networking environments exhibit interruptions in
   end-to-end connectivity and communications delays that are both
   long-lived and unpredictable. Even in these challenging conditions,
   such networks must provide some type of end-to-end information
   transport and fault protection while also supporting configuration
   and performance reporting. This management may need to operate
   without human- or system-in-the-loop synchronous interactivity and
   without the preservation of transport-layer sessions. In such a
   context, challenged networks must exhibit behavior that is both
   determinable and autonomous while maintaining as much compatibility
   with non-challenged-network operational concepts as possible.

   The architecture described in this document is termed the
   Asynchronous Management Architecture (AMA). The AMA supports two
   types of asynchronous behavior. First, the AMA does not presuppose
   any synchronized transport behavior between managed and managing
   devices. Second, the AMA does not support any query-response
   semantics. In this way, the AMA allows for operation in extremely
   challenging conditions, to include over uni-directional links and
   cases where delays/disruptions would otherwise prevent operation over
   traditional transport layers, such as when exceeding the Maximum
   Segment Lifetime (MSL) of the Transmission Control Protocol (TCP).

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 4 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Scope
     1.2.  Requirements Language
     1.3.  Organization
   2.  Terminology
   3.  Motivation
     3.1.  Challenged Networks
     3.2.  Current Approaches and Their Limitations
       3.2.1.  Simple Network Management Protocol (SNMP)
       3.2.2.  YANG, NETCONF, and RESTCONF
       3.2.3.  Constrained RESTful Network Management
   4.  Services Provided by an AMA
     4.1.  Configuration
     4.2.  Reporting
     4.3.  Autonomous Parameterized Procedure Calls
     4.4.  Administration
   5.  Desirable Properties of an AMA
     5.1.  Intelligent Push of Information
     5.2.  Minimize Message Size Not Node Processing
     5.3.  Absolute Data Identification
     5.4.  Custom Data Definition
     5.5.  Autonomous Operation
   6.  AMA Roles and Responsibilities
     6.1.  Agent Responsibilities
     6.2.  Manager Responsibilities
   7.  Logical Data Model
     7.1.  Data Representations: Constants, Externally Defined Data,
           and Variables
     7.2.  Data Collections: Reports and Tables
       7.2.1.  Report Templates and Reports
       7.2.2.  Table Templates and Tables
     7.3.  Command Execution: Controls and Macros
     7.4.  Autonomy: Time and State-Based Rules
       7.4.1.  State-Based Rule (SBR)
       7.4.2.  Time-Based Rule (TBR)
     7.5.  Calculations: Expressions, Literals, and Operators
   8.  System Model
     8.1.  Control and Data Flows
     8.2.  Control Flow by Role
       8.2.1.  Notation
       8.2.2.  Serialized Management
       8.2.3.  Multiplexed Management
       8.2.4.  Data Fusion
   9.  IANA Considerations
   10. Security Considerations
   11. Informative References
   Authors' Addresses

1.  Introduction

   The Asynchronous Management Architecture (AMA) provides a novel
   approach for the configuration, monitoring, and local control of
   application services on a managed device over a challenged network.
   The unique properties of a challenged network are as defined in
   [RFC7228] and include cases where an end-to-end transport path may
   not be feasible at any moment in time and delivery delays may prevent
   timely communications between a network operator and a managed
   device. These delays may be caused by long signal propagations or
   frequent link disruptions (such as described in [RFC4838]) or by
   non-environmental factors such as quality-of-service prioritizations
   and service-level agreements.

   Importantly, the management approach for a challenged network must be
   one which remains operational in the most restrictive environments in
   which such networks might be instantiated. The AMA approach should
   be functional in a variety of potential management scenarios, to
   include the following.

   *  Managed devices that are only accessible via a uni-directional
      link, or via a link whose duration is shorter than a single
      round-trip propagation time.

   *  Links that may be significantly constrained by capacity or
      reliability, but at (predictable or unpredictable) times may offer
      significant throughput.

   *  Multi-hop challenged networks that interconnect two or more
      unchallenged networks such that managed and managing devices exist
      in different networks.

   In these and related scenarios, managed devices need to operate with
   a certain level of local autonomy because managing devices may not be
   available within operationally-relevant timeframes. Managing devices
   deliver instruction sets that govern the local, autonomous behavior
   of the managed device. These behaviors include, but are not limited
   to, collecting performance data, state, and error conditions, and
   applying pre-determined responses to pre-determined events.

   The AMA is a novel approach to management that can leverage
   transport, network, and security solutions designed for challenged
   networks, but is not bound to any single solution. The goal is
   asynchronous communication between the device being managed and the
   manager, at times never expecting a reply, and with knowledge that
   commands and queries may be delivered much later than the initial
   request.

   More generally, the AMA approach is designed such that it can be
   deployed in all environments in which the Delay/Disruption-Tolerant
   Networking (DTN) Bundle Protocol (BPv7) [I-D.ietf-dtn-bpbis] may be
   deployed.

1.1.  Scope

   This document describes the motivation, services, desirable
   properties, roles/responsibilities, logical data model, and system
   model that form the AMA. These descriptions comprise a concept of
   operations for management in challenged networks with sufficient
   specificity that implementations conformant with this architecture
   will operate successfully in a challenged networking environment.

   The AMA described herein is strictly a framework for application
   management over a challenged network. The document is not a
   prescriptive standardization of a physical data model or any
   protocol. Instead, it serves as informative guidance to authors and
   users of such models and protocols.

   The AMA is independent of transport and network layers. It does not,
   for example, require the use of TCP or UDP. Similarly, the AMA does
   not presuppose the use of IPv4 or IPv6.

   The AMA is not bound to a particular security solution. It is
   assumed that any network using this architecture supports those
   services such as naming, addressing, integrity, confidentiality, and
   authentication required to communicate AMA messages. Therefore, the
   transport of these messages is outside of the scope of the AMA.

   While it is possible that a challenged network may interface with an
   unchallenged network, this document does not address the concept of
   compatibility with other management approaches.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

1.3.  Organization

   The remainder of this document is organized into seven sections that,
   together, describe an AMA suitable for management of challenged
   networks. The description of each section is as follows.

   *  Terminology - This section identifies those terms critical to
      understanding the proper operation of the AMA. Whenever possible,
      these terms align in both word selection and meaning with their
      analogs from other management protocols.

   *  Motivation - This section provides an overall motivation for this
      work as providing a novel and useful alternative to other network
      management approaches.

   *  Services - This section identifies and defines the services that
      an AMA will provide to network and mission operators that are
      unique to operating in a challenged environment.

   *  Desirable Properties - This section identifies those properties of
      a challenged network management system required to effectively
      implement needed services. These properties guide the subsequent
      definition of the system and logical models that comprise the AMA.

   *  Roles and Responsibilities - This section identifies roles in the
      AMA and their associated responsibilities. It provides the
      context for discussing how services are provided by both managers
      and agents.

   *  Logical Data Model - This section describes the kinds of data,
      procedures, autonomy, and associated hierarchical structure
      inherent to the AMA.

   *  System Model - This section describes data flows amongst various
      defined AMA roles. These flows capture how the AMA system works
      to manage devices across a challenged network.

2.  Terminology

   *  Actor - A software service running on either managed or managing
      devices for the purpose of implementing management protocols
      between such devices. Actors may implement the "Manager" role,
      "Agent" role, or both.

   *  Agent Role (or Agent) - A role associated with a managed device,
      responsible for reporting performance data, accepting/performing
      controls, error handling and validation, and executing any
      autonomous behaviors. AMA Agents exchange information with AMA
      Managers operating either on the same device or on a remote
      managing device.

   *  Asynchronous Management - Management that does not depend on
      stateful connections or real-time delivery of management messages.
      It allows for delivery of management messages and instruction sets
      for autonomous behavior that governs the expected actions, rules
      associated with those actions, and expected reporting procedures.
      Asynchronous management does not depend on underlying transport or
      network protocols for reliability or addressing of source and
      destination.

   *  Asynchronous Management Model (AMM) - The data types and data
      structures needed to manage applications in asynchronous networks.

   *  Externally Defined Data (EDD) - Information made available to an
      AMA Agent by a managed device, but not computed directly by the
      AMA Agent itself.

   *  Variables (VARs) - Typed information that is computed by an AMA
      Agent, typically as a function of EDD values and/or other
      variables.

   *  Constants (CONST) - A constant represents a typed, immutable value
      that is referred to by a semantic name. Constants are used in
      situations where substituting a name for a fixed value provides
      useful semantic information. For example, the named constant PI
      may be used rather than the literal value 3.14.

   *  Controls (CTRLs) - Procedures run by an AMA Actor to change the
      behavior, configuration, or state of an application or protocol
      being asynchronously managed. Controls may also be used to
      request data from an agent and define the rules associated with
      generation and delivery.

   *  Literals (LITs) - A literal represents a typed value without a
      semantic name. Literals are used in cases where adding a semantic
      name to a fixed value provides no useful semantic information.
      For example, the number 4 is a literal value.

   *  Macros (MACROs) - A named, ordered collection of Controls and/or
      other Macros.

   *  Manager Role (or Manager) - A role associated with a managing
      device responsible for configuring the behavior of, and eventually
      receiving information from, AMA Agents. AMA Managers interact
      with one or more AMA Agents located on the same device and/or on
      remote devices in the network.

   *  Operator (OP) - The enumeration and specification of a
      mathematical function used to calculate variable values and
      construct expressions to evaluate AMA Agent state.

   *  Report (RPT) - A typed, ordered collection of data values gathered
      by one or more AMA Agents and provided to one or more AMA
      Managers. Reports only contain typed data values and the identity
      of the Report Template (RPTT) to which they conform.

   *  Report Template (RPTT) - A named, typed, ordered collection of
      data types that represent the structure of a report (RPT). This
      is the schema for a report, generated by an AMA Manager and
      communicated to one or more AMA Agents.

   *  Rule - A unit of autonomous specification that provides a
      stimulus-response relationship between time or state on an AMA
      Agent and the actions or operations to be run as a result of that
      time or state. A rule might trigger updating a variable,
      populating a report/table, executing a control, or initiating the
      transmission of a report/table.

   *  State-Based Rule (SBR) - A state-based rule is any rule in which
      the rule stimulus is triggered by the calculable internal state of
      the data model associated with the AMA Agent.

   *  Synchronous Management - Management that assumes messages will be
      delivered and acted upon in real or near-real-time. Synchronous
      management often involves immediate replies of acknowledgment or
      error status. Synchronous management is often bound to underlying
      transport protocols and network protocols to ensure reliability or
      source and sender identification.

   *  Table (TBL) - A typed collection of data values organized in a
      tabular way in which columns represent homogeneous types of data
      and rows represent unique sets of data values conforming to column
      types. Tables only contain typed data values and the identity of
      the Table Template (TBLT) to which they conform.

   *  Table Template (TBLT) - A named, typed, ordered collection of
      columns that comprise the structure for representing tabular data
      values. This template forms the structure of a table (TBL).

   *  Time-Based Rule (TBR) - A time-based rule is a specialization, and
      simplification, of a state-based rule in which the rule stimulus
      is triggered by the relative time as it is known on the Agent as a
      function of either matched value or frequency.

3.  Motivation

   Early work into the rationale and motivation for specialized
   management for challenged networks was captured in [BIRRANE1],
   [BIRRANE2], and [BIRRANE3]. Some of the properties and feasibility
   of such a management system were adopted from prototyping work done
   in accordance with the DTN Research Group within the IRTF as
   documented in [I-D.irtf-dtnrg-dtnmp].

   The unique nature of challenged networks requires new network
   capabilities to deliver expected network functions. For example, the
   unique nature of DTNs required the development of the Bundle Protocol
   for transport functions and the Bundle Protocol Security Protocol
   (BPSec) is required to secure bundles in certain types of DTNs.
   Similarly, new management capabilities are needed to implement
   management in challenged environments, such as those defined as DTNs.

   The AMA provides a method of configuring AMA Agents with local,
   autonomous management functions, such as rules-based execution of
   procedures and generation of reports, to achieve expected behavior
   when managed devices exist over a challenged network. It further
   allows for dynamic instantiation and population of Variables and
   reports through local operations defined by the manager, as well as
   custom formatting of tables and reports to be sent back. This gives
   the AMA significant flexibility to operate over challenged networks,
   both providing new degrees of freedom over existing
   configuration-based data models used in synchronous networks and
   allowing for more concise formatting over constrained networks. This
   architecture makes very few assumptions about the nature of the
   network and allows for continuous operation through periods of
   connectivity and lack of connectivity. The AMA deviates from
   synchronous management approaches because it never requires periods
   of bi-directional connectivity, and provides the manager flexibility
   to describe agent behavior that was unpredicted at the time of the
   data model creation.

   To understand the unique motivations for the architecture, this
   section discusses motivating characteristics of challenged networks,
   current network management approaches, and how they might behave in a
   challenged environment.

3.1.  Challenged Networks

   A challenged network is one that "has serious trouble maintaining
   what an application would today expect of the end-to-end IP model"
   ([RFC7228]). This includes cases where there is never simultaneous
   end-to-end connectivity, when such connectivity is interrupted at
   planned or unplanned intervals, or when delays exceed those that
   could be accommodated by IP-based transport. Links in such networks
   are often unavailable due to attenuations, propagation delays,
   mobility, occultation, and other limitations imposed by energy and
   mass considerations.

   Challenged networks exhibit the following properties that impact the
   way in which the function of network management is considered.

   *  No end-to-end path is guaranteed to exist at any given time
      between any two nodes.

   *  Round-trip communications between any two nodes within any given
      time window may be impossible.

   *  Latencies on the order of seconds, hours, or days must be
      tolerated.

   *  Links may be uni-directional.

   *  Bi-directional links may have asymmetric data rates.

   One way in which constrained networks differ from challenged networks
   is the way in which the topology, roles, and responsibilities of the
   network may evolve over time. From the time at which data is
   generated on a source node to the time at which the data is received
   at a destination node, the topology of the network may have changed.
   In certain circumstances, the physical node receiving messages for a
   given node identifier may also have changed.

   When this topological change impacts the transport of messages, then
   transports must wait for the incremental connectivity necessary to
   advance messages along their expected route. Therefore, these
   networks cannot guarantee timely data exchange between managing and
   managed devices. For example, the Bundle Protocol transport protocol
   for use in DTNs implements this type of store-and-forward operation.

   When topological change impacts the semantic roles and
   responsibilities of nodes in the network, then local configuration
   and autonomy at nodes must be present to determine time-variant
   changes. For example, the BPSec protocol does not encode security
   destinations and, instead, requires nodes in a network to identify as
   verifiers or acceptors when receiving secured messages.

   When applied to network management, the semantic roles of Agent and
   Manager may also change with the changing topology of the network.
   Individual nodes must implement desirable behavior without reliance
   on a single oracle of configuration or other coordinating function
   such as an operator-in-the-loop. This implies that there MUST NOT be
   a defined relationship between a particular manager and agent in a
   network.

3.2.  Current Approaches and Their Limitations

   Network management solutions have been prevalent for many years in
   both local-area and wide-area networks. These range from the simple
   ability to configure settings of operational devices or report on
   state and operational conditions to the more complex modeling of an
   entire managed device's settings, state, and behavior, pushing and
   receiving large sets of configuration data between the manager and
   the agent. Autonomy has more recently been applied to network
   management but is focused more on well-resourced, unchallenged
   networks where devices self-configure, self-heal, and self-optimize
   with other nodes within their vicinity. This section describes some
   of the well-known standardized protocols for network management as
   well as various proposed solutions and aims to differentiate their
   purpose from the needs of challenged network management solutions.

3.2.1.  Simple Network Management Protocol (SNMP)

   Historically, network management tools in unchallenged networks
   provide mechanisms for communicating locally-collected data from
   devices to operators and managing applications, typically using a
   "pull" mechanism where data must be explicitly requested by a Manager
   in order to be transmitted by an Agent. A legacy method for
   management in unchallenged networks today is the Simple Network
   Management Protocol (SNMP) [RFC3416]. SNMP utilizes a
   request/response model to set and retrieve data values such as host
   identifiers, link utilizations, error rates, and counters between
   application software on Agents and Managers. Data may be directly
   sampled or consolidated into representative statistics.
   Additionally, SNMP supports a model for asynchronous notification
   messages, called traps, based on predefined triggering events. Thus,
   Managers can query Agents for status information, send new
   configurations, and be informed when specific events have occurred.
485 Traps and queryable data are defined in one or more Managed 486 Information Bases (MIBs) which define the information for a 487 particular data standard, protocol, device, or application. 489 While there is a large installation base for SNMP there are several 490 aspects of the protocol that make in inappropriate for use in a 491 challenged networking environment. SNMP relies on sessions with low 492 round-trip latency to support its "pull" model. Complex management 493 can be achieved but only through craftful orchestration using a 494 series of real-time manager generated query and response logic not 495 possible in challenged networks. The SNMP trap model provides some 496 Agent-side processing, however because the processing has very low 497 fidelity and traps are typically "fire and forget." Adaptive 498 modifications to SNMP to support challenged networks and more complex 499 application-level management, would alter the basic function of the 500 protocol (data models, control flows, and syntax) so as to be 501 functionally incompatible with existing SNMP installations. 502 Therefore, this approach is not suitable for an asynchronous network 503 management system. 505 3.2.2. YANG, NETCONF, and RESTCONF 507 Yet Another Next Generation (YANG) [RFC6020] is a data modeling 508 language used to model configuration and state data of managed 509 devices and applications. The YANG model defines a schema for 510 organizing and accessing a device's configuration or operational 511 information. Once a model is developed, it is loaded to both the 512 client (manager) and server (agent) and serves as a contract between 513 the two. A YANG model can be complex, describing many containers of 514 managed elements, each with many configuration or operational state 515 data nodes. It can further define lists of like elements. 
YANG 516 allows for the definition of parameterized Remote Procedure Calls 517 (RPCs) to be executed on managed nodes as well as the definition of 518 asynchronous notifications within the model. 520 YANG by itself serves no purpose other than to organize data and 521 describe the allowed configuration parameters on the managed device. 522 The Network Configuration Protocol (NETCONF) [RFC6241] and the 523 RESTCONF protocol [RFC8040] provide the mechanisms to install, 524 manipulate, and delete the configuration of network devices, using 525 the YANG modules. NETCONF is a stateful, XML-based protocol that 526 provides the RPC syntax to retrieve, edit, copy, or delete any data 527 nodes or exposed functionality on the server. NETCONF connections 528 are required to provide authentication, data integrity, 529 confidentiality, and replay protection through secure transport 530 protocols such as SSH or TLS. RESTCONF is a stateless RESTful 531 protocol based on HTTP that uses JSON encoding to GET, POST, PUT, 532 PATCH, or DELETE data nodes within the YANG modules similar to 533 NETCONF. RESTCONF, while stateless, still requires secure transport 534 such as TLS. Both NETCONF and RESTCONF place no specific functional 535 requirements or constraints on the capabilities of the server, which 536 makes it a very flexible tool for configuring a homogeneous network 537 of devices, however they are limiting in challenged networks due to 538 their requirements of underlying transport and dependence on the YANG 539 data models. 541 NETCONF places specific constraints on any underlying transport 542 protocol: a long-lived, reliable, low-latency sequenced data delivery 543 session. No data is transferred without first establishing this bi- 544 directional NETCONF session. RESTCONF relaxes this constraint 545 however is limited to requesting or configuring individual data 546 elements or entire containers within the YANG data model. 
It is 547 therefore quite verbose and limited by the structure previously 548 defined in the YANG module and any autonomous behavior depends on 549 client slide orchestration similar to SNMP. 551 As previously noted, YANG allows for the definition of RPCs within 552 the model and notification elements for asynchronous messaging. The 553 RPCs provide both the definition of input and output parameters 554 however are strictly allowed in NETCONF and RESTCONF to be sent as 555 sequential procedures. Even if multiple procedures are sent, the 556 server is required to execute them and reply in the order they were 557 received. There is also no flexibility for the state-based execution 558 of those procedures on the server. The RPCs are executed as soon as 559 they are received, ultimately limiting the degrees of autonomy of the 560 server. YANG notifications are quite promising for asynchronous 561 network management, defined as both subscriptions to YANG 562 notifications [RFC8639] and YANG PUSH notifications [RFC8641]. 563 Notification containers must first be defined within the YANG module 564 declaring the containers or data nodes of interest. The events can 565 be filtered according to XPATH filtering defined in [RFC8639] 566 Section 6, however generation of events are streamed and generally 567 limited to the external changing state of a data node. YANG PUSH 568 allows for both periodic and on-change event notification but 569 supports no rules-based triggering. While the YANG data model offers 570 many great features, the features today are simply limiting for the 571 autonomous behavior required by challenged network management. 573 YANG is additionally limiting for challenged networks because of its 574 non-hierarchal schema. While the YANG model flexibility is great for 575 the management of nodes and applications of any type in an 576 unchallenged network, it becomes a burden in challenged networks 577 where concise encoding is necessary. 
All the data nodes within a 578 YANG model are referenced by a verbose string-based path of the module, 579 sub-module, container, and any data nodes such as lists, leaf-lists, 580 or leafs. Efforts are underway to allow for CBOR encoding 581 of YANG models [I-D.ietf-core-yang-cbor] and addressing of data nodes 582 through integer-valued YANG Schema Item iDentifiers (SIDs) 583 [I-D.ietf-core-sid]; however, these lack any formal hierarchical 584 structure. All mapping of SIDs to YANG modules and data nodes is 585 performed manually, which limits the portability of models and further 586 increases the size of any encoding scheme. 588 3.2.3. Constrained RESTful Network Management 590 Due to the advent and ubiquity of the Internet of Things (IoT), the 591 Constrained Application Protocol (CoAP) [RFC7252] has been recently 592 developed for communicating with nodes and applications in 593 constrained networks. CoAP is merely a messaging framework 594 designed to limit message size and fragmentation, operating over IP 595 networks. Because constrained networks could experience interruptions 596 similar to those in DTNs, the protocol provides for application-layer 597 store-and-forward as well as proxy delivery of messages, but is bound 598 to UDP transport. An approach to network management has been 599 authored that uses CoAP for transport and YANG as the data model, and 600 is defined as CORECONF [I-D.ietf-core-comi]. This proposed protocol 601 makes use of the YANG-to-CBOR encoding, including the use of SIDs to 602 limit message size; however, it is currently bound to the UDP/IP transport of 603 CoAP and further defines security requirements including DTLS or 604 OSCORE. This explicit binding to transport and security protocols is 605 limiting when applied to novel DTN approaches designed for challenged 606 networks. 608 4. Services Provided by an AMA 610 This section identifies the services that an AMA would provide for 611 management of challenged network resources.
These services include 612 configuration, reporting, parameterized control, and administration. 614 4.1. Configuration 616 Configuration services update Agent data associated with managed 617 applications and protocols. Some configuration data might be defined 618 in the context of an application or protocol, such that any network 619 using that application or protocol would understand that data. Other 620 configuration data may be defined tactically for use in a specific 621 network deployment and not available to other networks even if they 622 use the same applications or protocols. 624 New configurations received by an Agent must be validated to ensure 625 that they neither conflict with other configurations nor 626 otherwise prevent the Agent from effectively working with other 627 Actors in its region. With no guarantee of round-trip data exchange, 628 Agents cannot rely on remote Managers to correct erroneous or stale 629 configurations before they harm the flow of data through a challenged 630 network. 632 Examples of configuration service behavior include the following. 634 * Creating a new datum as a function of other well-known data: 636 C = A + B. 638 * Creating a new report as a unique, ordered collection of known 639 data: 641 RPT = {A, B, C}. 643 * Storing predefined, parameterized responses to potential future 644 conditions: 646 IF (X > 3) THEN RUN CMD(PARM). 648 4.2. Reporting 650 Reporting services populate report templates with values collected or 651 computed by an Agent. The resultant reports are sent to one or more 652 Managers by the Agent. The term "reporting" is used in place of the 653 term "monitoring", as monitoring implies a timeliness and regularity 654 that cannot be guaranteed by a challenged network. Reports sent by 655 an Agent provide best-effort information to receiving Managers.
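The configuration examples of Section 4.1 (C = A + B and RPT = {A, B, C}) and the population of a report template can be sketched as follows. This is a minimal, non-normative illustration in Python. The class AgentState and its methods define_var, define_report, read, and populate are hypothetical names chosen for this sketch and are not defined by the AMA.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Hypothetical in-memory model of an Agent's data (an assumption, not AMA-defined)."""
    values: Dict[str, float] = field(default_factory=dict)   # sampled data such as A and B
    derived: Dict[str, Callable[[Dict[str, float]], float]] = field(default_factory=dict)
    templates: Dict[str, List[str]] = field(default_factory=dict)  # report name -> ordered items

    def define_var(self, name: str, fn: Callable[[Dict[str, float]], float]) -> None:
        """Configuration: create a new datum as a function of known data."""
        self.derived[name] = fn

    def define_report(self, name: str, items: List[str]) -> None:
        """Configuration: create a report as an ordered collection of known data."""
        unknown = [i for i in items if i not in self.values and i not in self.derived]
        if unknown:
            # Validate locally; no Manager may be reachable to correct a bad definition.
            raise ValueError(f"unknown items in report {name}: {unknown}")
        self.templates[name] = list(items)

    def read(self, name: str) -> float:
        """Return the current value of a sampled or derived datum."""
        if name in self.derived:
            return self.derived[name](self.values)
        return self.values[name]

    def populate(self, name: str) -> List[float]:
        """Reporting: fill the template with current values, in template order."""
        return [self.read(i) for i in self.templates[name]]


agent = AgentState(values={"A": 2.0, "B": 5.0})
agent.define_var("C", lambda v: v["A"] + v["B"])   # C = A + B
agent.define_report("RPT", ["A", "B", "C"])        # RPT = {A, B, C}
report = agent.populate("RPT")
```

In this sketch the template is stored once and only values are produced at reporting time, which mirrors the separation between defining a report and populating it.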
657 Since a Manager is not actively "monitoring" an Agent, the Agent must 658 make its own determination on when to send what Reports based on its 659 own local time and state information. Agents should produce Reports 660 of varying fidelity and with varying frequency based on thresholds 661 and other information set as part of configuration services. 663 Examples of reporting service behavior include the following. 665 * Generate Report R1 every hour (time-based production). 667 * Generate Report R2 when X > 3 (state-based production). 669 4.3. Autonomous Parameterized Procedure Calls 671 Similar to an RPC call, some mechanism MUST exist which allows a 672 procedure to be run on an Agent in order to affect its behavior or 673 otherwise change its internal state. Since there is no guarantee 674 that a Manager will be in contact with an Agent at any given time, 675 the decisions of whether and when a procedure should be run MUST be 676 made locally and autonomously by the Agent. Two types of automation 677 triggers are identified in the AMA: triggers based on the internal 678 state of the Agent and triggers based on an Agent's notion of time. 679 As such, the autonomous execution of procedures can be viewed as a 680 stimulus-response system, where the stimulus is the positive 681 evaluation of a state or time based predicate and the response is the 682 function to be executed. 684 The autonomous nature of procedure execution by an Agent implies that 685 the full suite of information necessary to run a procedure may not be 686 known by a Manager in advance. To address this situation, a 687 parameterization mechanism MUST be available so that required data 688 can be provided at the time of execution on the Agent rather than at 689 the time of definition/configuration by the Manager. 
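The stimulus-response model and the execution-time parameterization described above can be sketched as follows. This is a minimal, non-normative illustration in Python. The names Rule, evaluate, and free_storage, and the state keys "used" and "quota", are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


@dataclass
class Rule:
    """Hypothetical stimulus-response pairing held by an Agent."""
    predicate: Callable[[State], bool]          # stimulus: state- or time-based test
    control: Callable[..., None]                # response: procedure run locally
    params: Dict[str, Callable[[State], Any]]   # resolved at execution time


def evaluate(rules: List[Rule], state: State) -> int:
    """Run every rule whose predicate holds; return how many rules fired."""
    fired = 0
    for rule in rules:
        if rule.predicate(state):
            # Parameter values come from the Agent's state when the rule fires,
            # not from the Manager at the time the rule was defined.
            args = {name: resolve(state) for name, resolve in rule.params.items()}
            rule.control(**args)
            fired += 1
    return fired


log: List[str] = []


def free_storage(bytes_over: int) -> None:
    """Stand-in for a real procedure; it only records what it was asked to do."""
    log.append(f"free {bytes_over} bytes")


rules = [
    Rule(
        predicate=lambda s: s["used"] > s["quota"],
        control=free_storage,
        params={"bytes_over": lambda s: s["used"] - s["quota"]},
    )
]

evaluate(rules, {"used": 900, "quota": 1000})    # predicate is false, nothing runs
evaluate(rules, {"used": 1200, "quota": 1000})   # rule fires with bytes_over=200
```

The Manager would supply only the rule definition in advance. Whether and when the procedure runs, and with which parameter values, is decided by the Agent from its own state.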
691 Autonomous, parameterized procedure calls provide a powerful 692 mechanism for Managers to "manage" an Agent asynchronously during 693 periods of no communication by pre-configuring responses to events 694 that may be encountered by the Agent at a future time. 696 Examples of potential behavior include the following. 698 * Updating local routing information based on instantaneous link 699 analysis. 701 * Managing storage on the device to enforce quotas. 703 * Applying or modifying local security policy. 705 4.4. Administration 707 Administration services enforce the potentially complex mapping of 708 configuration, reporting, and control services amongst Agents and 709 Managers in the network. Fine-grained access controls that specify 710 which Managers may apply which services to which Agents may be 711 necessary in networks that either deal with multiple administrative 712 entities or are overlay networks that cross administrative boundaries. 713 Whitelists, blacklists, key-based infrastructures, or other schemes 714 may be used for this purpose. 716 Examples of administration service behavior include the following. 718 * Agent A1 only sends reports for Protocol P1 to Manager M1. 720 * Agent A2 only accepts configurations for Application Y from 721 Managers M2 and M3. 723 * Agent A3 accepts services from any Manager providing the proper 724 authentication token. 726 Note that the administrative enforcement of access control is 727 different from security services provided by the networking stack 728 carrying such messages. 730 5. Desirable Properties of an AMA 732 This section describes those design properties that are desirable 733 when defining an architecture that must operate across challenged 734 links in a network. These properties ensure that network management 735 capabilities are retained even as delays and disruptions in the 736 network scale. Ultimately, these properties are the driving design 737 principles for the AMA. 739 5.1.
Intelligent Push of Information 741 Pull management mechanisms require that a Manager send a query to an 742 Agent and then wait for the response to that query. This practice 743 implies a control session between entities and increases the overall 744 message traffic in the network. Challenged networks cannot guarantee 745 that the round-trip data exchange will occur in a timely fashion. In 746 extreme cases, networks may be composed solely of uni-directional 747 links, which drastically increases the amount of time needed for a 748 round-trip data exchange. Therefore, pull mechanisms must be avoided 749 in favor of push mechanisms. 751 Push mechanisms, in this context, refer to the ability of Agents to 752 leverage rule-based criteria to determine when and what information 753 should be sent to Managers. This could be based solely on logic 754 applied to existing VARs or EDDs, or based on operations applied to 755 data elements. Such mechanisms do not require round-trip 756 communications as Managers do not request each reporting instance; 757 Managers need only request once, in advance, that information be 758 produced in accordance with a predetermined schedule or in response 759 to a predefined state on the Agent. In this way information is 760 "pushed" from Agents to Managers and the push is "intelligent" 761 because it is based on some internal evaluation performed by the 762 Agent. 764 5.2. Minimize Message Size Not Node Processing 766 Protocol designers must balance message size versus message 767 processing time at sending and receiving nodes. Verbose 768 representations of data simplify node processing whereas compact 769 representations require additional activities to generate/parse the 770 compacted message. There is no asynchronous management advantage to 771 minimizing node processing time in a challenged network. However, 772 there is a significant advantage to smaller message sizes in such 773 networks.
Compact messages require shorter periods of viable 774 transmission for communication, incur less re-transmission cost, and 775 consume fewer resources when persistently stored en route in the 776 network. An Asynchronous Management Protocol (AMP) should minimize 777 PDUs whenever practical, to include packing and unpacking binary 778 data, variable-length fields, and pre-configured data definitions. 780 5.3. Absolute Data Identification 782 Elements within the management system must be uniquely identifiable 783 so that they can be individually manipulated. Identification schemes 784 that are relative to system configuration make data exchange between 785 Agents and Managers difficult as system configurations may change 786 faster than nodes can communicate. 788 Consider the following common technique for approximating an 789 associative array lookup. A Manager wishing to do an associative 790 lookup for some key K1 will (1) query a list of array keys from the 791 Agent, (2) find the key that matches K1 and infer the index of K1 792 from the returned key list, and (3) query the discovered index on the 793 Agent to retrieve the desired data. 795 Ignoring the inefficiency of two pull requests, this mechanism fails 796 when the Agent changes its key-index mapping between the first and 797 second query. Rather than constructing an artificial mapping from K1 798 to an index, an AMP must provide an absolute mechanism to look up the 799 value K1 without an abstraction between the Agent and Manager. 801 5.4. Custom Data Definition 803 Custom definition of new data from existing data (such as through 804 data fusion, averaging, sampling, or other mechanisms) provides the 805 ability to communicate desired information in as compact a form as 806 possible. Specifically, an Agent should not be required to transmit 807 a large data set for a Manager that only wishes to calculate a 808 smaller, inferred data set.
These newly defined data elements could be 809 calculated and used either as parameters for local stimulus-response 810 rules-based criteria or simply to populate custom reports and 811 tables. Since the identification of custom data sets is likely to 812 occur in the context of a specific network deployment, AMPs must 813 provide a mechanism for their definition. 815 Aggregation of controls and custom formatting of reports and tables 816 are equally important. Custom reporting provides the flexibility 817 for the Manager to define the desired format of all information 818 to be sent over the challenged network from the Agents, serving to 819 both save link capacity and increase the value of returned 820 information. Aggregation of controls allows a Manager to specify a 821 set of controls to execute, specifying both the order and criteria of 822 execution. This aggregate set of controls can be sent as a single 823 command rather than a series of sequential commands. In this case it 824 is additionally possible to use the outputs of one command as an 825 input to the next at the Agent. 827 5.5. Autonomous Operation 829 AMA network functions must be achievable using only knowledge local 830 to the Agent. Rather than directly controlling an Agent, a Manager 831 configures an engine of the Agent to take its own action under the 832 appropriate conditions in accordance with the Agent's notion of local 833 state and time. 835 Such an engine may be used for simple automation of predefined tasks 836 or to support semi-autonomous behavior in determining when to run 837 tasks and how to configure or parameterize tasks when they are run. 838 Wholly autonomous operations MAY be supported where required. 839 Generally, autonomous operations should provide the following 840 benefits. 842 * Distributed Operation - The concept of pre-configuration allows 843 the Agent to operate without regular contact with Managers in the 844 system.
The initial configuration (and periodic update) of the 845 system remains difficult in a challenged network, but an initial 846 synchronization on stimuli and responses drastically reduces the need 847 for centralized operations. 849 * Deterministic Behavior - Such behavior is necessary in critical 850 operational systems where the actions of a platform must be well 851 understood even in the absence of an operator in the loop. 852 Depending on the types of stimuli and responses, these systems may 853 be considered to be maintaining simple automation or semi- 854 autonomous behavior. In either case, this preserves the ability 855 of a frequently-out-of-contact Manager to predict the state of an 856 Agent with more reliability than in cases where Agents implement 857 independent and fully autonomous systems. 859 * Engine-Based Behavior - Several operational systems are unable to 860 deploy "mobile code" based solutions due to network bandwidth, 861 memory or processor loading, or security concerns. Engine-based 862 approaches provide configurable behavior without incurring these 863 types of concerns associated with mobile code. 865 6. AMA Roles and Responsibilities 867 By definition, Agents reside on managed devices and Managers reside 868 on managing devices. There is, however, no presupposed architecture 869 that connects Managers and Agents, and therefore a single device could 870 assume both roles. This section describes the responsibilities 871 associated with each role and how these roles participate in network 872 management. 874 6.1. Agent Responsibilities 876 Application Support 877 Agents MUST collect all data, execute all procedures, 878 populate all reports, and run operations required by each 879 application which the Agent manages. Agents MUST report 880 supported applications so that Managers in a network 881 understand what information is understood by which Agent.
883 Local Data Collection 884 Agents MUST collect from local firmware (or other on-board 885 mechanisms) and report all data defined for the management of 886 applications for which they have been configured. 888 Autonomous Control 889 Agents MUST determine, as previously prescribed by a manager, 890 whether a procedure should be invoked. 892 User Data Definition 893 Agents MUST provide mechanisms for operators in the network 894 to use configuration services to create customized data 895 definitions in the context of a specific network or network 896 use-case. Agents MUST allow for the creation, listing, and 897 removal of such definitions in accordance with whatever 898 security models are deployed within the particular network. 900 Where applicable, Agents MUST verify the validity of these 901 definitions when they are configured and respond in a way 902 consistent with the logging/error-handling policies of the 903 Agent and the network. 905 Autonomous Reporting 906 Agents MUST determine, without real-time Manager 907 intervention, whether and when to populate and transmit a 908 given report targeted to one or more Managers in the network. 910 Consolidate Messages 911 Agents SHOULD produce as few messages as possible when 912 sending information. For example, rather than sending 913 multiple messages, each with one report to a Manager, an 914 Agent SHOULD prefer to send a single message containing 915 multiple reports. 917 6.2. Manager Responsibilities 919 Agent Capabilities Mapping 920 Managers MUST understand what applications are managed by the 921 various Agents with which they communicate. Managers should 922 not attempt to request, invoke, or refer to application 923 information for applications not managed by an Agent. 925 Data Collection 926 Managers MUST receive information from Agents by 927 asynchronously configuring the production of reports and then 928 waiting for, and collecting, responses from Agents over time. 
929 Managers MAY try to detect conditions where Agent information 930 has not been received within operationally relevant time 931 spans and react in accordance with network policy. 933 Custom Definitions 934 Managers should provide the ability to define custom data 935 definitions. Any custom definitions MUST be transmitted to 936 appropriate Agents and these definitions MUST be remembered 937 to interpret the reporting of these custom values from Agents 938 in the future. 940 Data Translation 941 Managers should provide some interface to other network 942 management protocols. Managers MAY accomplish this by 943 accumulating a repository of push-data from high-latency 944 parts of the network from which data may be pulled by low- 945 latency parts of the network. 947 Data Fusion 948 Managers MAY support the fusion of data from multiple Agents 949 with the purpose of transmitting fused data results to other 950 Managers within the network. Managers MAY receive fused 951 reports from other Managers pursuant to appropriate security 952 and administrative configurations. 954 7. Logical Data Model 956 The AMA logical data model captures the types of information that 957 should be collected and exchanged to implement necessary roles and 958 responsibilities. The data model presented in this section does not 959 presuppose a specific mapping to a physical data model or encoding 960 technique; it is included to provide a way to logically reason about 961 the types of data that should be exchanged in an asynchronously 962 managed network. 964 The elements of the AMA logical data model are described as follows. 966 7.1. 
Data Representations: Constants, Externally Defined Data, and 967 Variables 969 There are three fundamental representations of data in the AMA: (1) 970 data whose values do not change as a function of time or state, (2) 971 data whose values change as determined by sampling/calculation 972 external to the network management system, and (3) data whose values 973 are calculated internal to the network management system. 975 Data whose values do not change as a function of time or state are 976 defined as Constants (CONST). CONST values are strongly typed, named 977 values that cannot be modified once they have been defined. 979 Data sampled/calculated external to the network management system are 980 defined as Externally Defined Data (EDD). EDD values represent the 981 most useful information in the management system as they are provided 982 by the applications or protocols being managed on the Agent. It is 983 RECOMMENDED that EDD values be strongly typed to avoid issues with 984 interpreting the data value. It is also RECOMMENDED that the 985 timeliness/staleness of the data value be considered when using the 986 data in the context of autonomous action on the Agent. 988 Data that is calculated internal to the network management system is 989 defined as a Variable (VAR). VARs allow the creation of new data 990 values for use in the network management system. New value 991 definitions are useful for storing user-defined information, storing 992 the results of complex calculations for easier re-use, and providing 993 a mechanism for combining information from multiple external sources. 994 It is RECOMMENDED that VARs be strongly typed to avoid issues with 995 interpreting the data value. In cases where a VAR definition relies 996 on other VAR definitions, mechanisms to prevent circular references 997 MUST be included in any actual data model or implementation. 999 7.2.
Data Collections: Reports and Tables 1001 Individual data values may be exchanged amongst Agents and Managers 1002 in the AMA. However, data are typically most useful to a Manager 1003 when received as part of a set of information. Ordered collections 1004 of data values can be produced by Agents and sent to Managers as a 1005 way of efficiently communicating Agent status. Within the AMA, the 1006 structure of the ordered collection is treated separately from the 1007 values that populate such a structure. 1009 The AMA provides two ways of defining collections of data: reports 1010 and tables. Reports are ordered sets of data values, whereas tables 1011 are special types of reports whose entries have a regular, tabular 1012 structure. 1014 7.2.1. Report Templates and Reports 1016 The typed, ordered structure of a data collection is defined as a 1017 Report Template (RPTT). A particular set of data values provided in 1018 compliance with such a template is called a Report (RPT). 1020 Separating the structure and content of a report reduces the overall 1021 size of RPTs in cases where reporting structures are well known and 1022 unchanging. RPTTs can be synchronized between an Agent and a Manager 1023 so that RPTs themselves do not incur the overhead of carrying self- 1024 describing data. RPTTs may include EDD values, VARs, and also other 1025 RPTTs. In cases where an RPTT includes another RPTT, mechanisms to 1026 prevent circular references MUST be included in any actual data model 1027 or implementation. 1029 Protocols and applications managed in the AMA may define common 1030 RPTTs. Additionally, users within a network may define their own 1031 RPTTs that are useful in the context of a particular deployment. 1033 Unlike tables, reports do not exploit assumptions on the underlying 1034 structure of their data. Therefore, unlike tables, operators can 1035 define new reports at any time as part of the runtime configuration 1036 of the network. 1038 7.2.2.
Table Templates and Tables 1040 Tables optimize the communication of multiple sets of data in 1041 situations where each data set has the same syntactic structure and 1042 the same semantic meaning. Unlike reports, the regularity of 1043 tabular data representations allows for the addition of new rows 1044 without changing the structure of the table. Attempting to add a new 1045 data set at the end of a report would require alterations to the 1046 report template. 1048 The typed, ordered structure of a table is defined as a 1049 Table Template (TBLT). A particular instance of values populating 1050 the table template is called a Table (TBL). 1052 A TBLT describes the "columns" that define the table schema. A TBL 1053 represents the instance of a specific TBLT that holds actual data 1054 values. These data values represent the "rows" of the table. 1056 The prescriptive nature of the TBLT allows for the possibility of 1057 advanced filtering which may reduce traffic between Agents and 1058 Managers. However, the unique structure of each TBLT may make it 1059 difficult or impossible to change dynamically in a network. 1061 7.3. Command Execution: Controls and Macros 1063 Low-latency, high-availability approaches to network management use 1064 mechanisms such as (or similar to) RPCs to cause some action to be 1065 performed on an Agent. The AMA enables similar capabilities without 1066 requiring that the Manager be in the processing loop of the Agent. 1067 Command execution in the AMA happens through the use of controls and 1068 macros. 1070 A Control (CTRL) represents a parameterized, predefined procedure 1071 that can be run on an Agent. While conceptually similar to a "remote 1072 procedure call", CTRLs differ in that they do not provide numeric 1073 return codes.
The concept of a return code when running a procedure 1074 implies a synchronous relationship between the caller of the 1075 procedure and the procedure being called, which is disallowed in an 1076 asynchronous management system. Instead, CTRLs may create reports 1077 which describe the status and other summarizations of their 1078 operation, and these reports may be sent to the Manager(s) calling 1079 the CTRL. 1081 Parameters can be provided when running a command from a Manager, 1082 pre-configured as part of a response to a time-based or state-based 1083 rule on the Agent, or auto-generated as needed on the Agent. The 1084 success or failure of a control MAY be inferred by reports generated 1085 for that purpose. 1087 NOTE: The AMA term control is derived in part from the concept of 1088 Command and Control (C2) where control implies the operational 1089 instructions that must be undertaken to implement (or maintain) a 1090 commanded objective. An asynchronous management function controls an 1091 Agent to allow it to fulfill its commanded purpose in a variety of 1092 operational scenarios. For example, attempting to maintain a safe 1093 internal thermal environment for a spacecraft is considered "thermal 1094 control" (not "thermal commanding") even though thermal control 1095 involves "commanding" heaters, louvers, radiators, and other 1096 temperature-affecting components. 1098 Often, a series of controls must be executed in sequence to achieve a 1099 particular outcome. A Macro (MACRO) represents an ordered collection 1100 of controls (or other macros). In cases where a MACRO includes 1101 another MACRO, mechanisms to prevent circular references and maximum 1102 nesting levels MUST be included in any actual data model or 1103 implementation. 1105 7.4. Autonomy: Time and State-Based Rules 1107 The AMA data model contains EDDs and VARs that capture the state of 1108 applications on an Agent. 
The model also contains controls and 1109 macros to perform actions on an Agent. A mechanism is needed to 1110 relate these two capabilities: to perform an action on the Agent in 1111 response to the state of the Agent. This mechanism in the AMA is the 1112 "rule" and can be activated based on Agent internal state (state- 1113 based rule) or based on the Agent's notion of relative time (time- 1114 based rule). 1116 7.4.1. State-Based Rule (SBR) 1118 State-Based Rules (SBRs) perform actions based on the Agent's 1119 internal state, as identified by EDD and VAR values. An SBR 1120 represents a stimulus-response pairing in the following form: "IF 1121 predicate THEN response". The predicate is a logical expression that 1122 evaluates to true if the rule stimulus is present and evaluates to 1123 false otherwise. The response may be any control or macro known to 1124 the Agent. 1126 An example of an SBR could be to turn off a heater if some internal 1127 temperature is greater than a threshold: "IF (current_temp > 1128 maximum_temp) THEN turn_heater_off". 1129 Rules may construct their stimuli from the full set of values known 1130 to the network management system. Similarly, responses may be 1131 constructed from the full set of controls and macros that can be run 1132 on the Agent. By allowing rules to evaluate the variety of all known 1133 data and run the variety of all known controls, multiple applications 1134 can be monitored and managed by one (or few) Agent instances. 1136 7.4.2. Time-Based Rule (TBR) 1138 Time-Based Rules (TBRs) perform actions based on the Agent's notion of 1139 the passage of time. A possible TBR construct would be to perform 1140 some action at 1Hz on the Agent. 1142 A TBR is a specialization of an SBR as the Agent's notion of time is 1143 a type of Agent state. For example, a TBR to perform an action every 1144 24 hours could be expressed using some type of predicate of the form: 1145 IF (((current_time - base_time) % 24_hours) == 0) THEN ...
However, 1146 time-based events are common enough that special semantics for 1147 expressing them would likely significantly reduce the computations 1148 necessary to represent time functions in an SBR. 1150 7.5. Calculations: Expressions, Literals, and Operators 1152 Actions such as computing a VAR value or describing a rule predicate 1153 require some mechanism for calculating the value of mathematical 1154 expressions. In addition to the aforementioned AMA logical data 1155 objects, Literals, Operators, and Expressions are used to perform 1156 these calculations. 1158 A Literal (LIT) represents a strongly typed datum whose identity is 1159 equivalent to its value. An example of a LIT value is "4" - its 1160 identifier (4) is the same as its value (4). Literals differ from 1161 constants in that constants have an identifier separate from their 1162 value. For example, the constant PI may refer to a value of 3.14. 1163 However, the literal 3.14159 always refers to the value 3.14159. 1165 An Operator (OP) represents a mathematical operation in an 1166 expression. OPs should support multiple operands based on the 1167 operation supported. A common set of OPs SHOULD be defined for any 1168 Agent and systems MAY choose to allow individual applications to 1169 define new OPs to assist in the generation of new VAR values and 1170 predicates for managing that application. OPs may be simple binary 1171 operations such as "A + B" or more complex functions such as sin(A) 1172 or avg(A,B,C,D). Additionally, OPs may be typed. For example, 1173 addition of integers may be defined separately from addition of real 1174 numbers. 1176 An Expression (EXPR) is a combination of operators and operands used 1177 to construct a numerical value from a series of other elements of the 1178 AMA logical model. Operands include any AMA logical data model 1179 object that can be interpreted as a value, such as EDD, VAR, CONST, 1180 and LIT values.
Operators perform some function on operands to 1181 generate new values. 1183 8. System Model 1185 This section describes the notional data flows and control flows that 1186 illustrate how Managers and Agents within an AMA cooperate to perform 1187 network management services. 1189 8.1. Control and Data Flows 1191 The AMA identifies three significant data flows: control flows from 1192 Managers to Agents, report flows from Agents to Managers, and fusion 1193 reports from Managers to other Managers. These data flows are 1194 illustrated in Figure 1.
1196 AMA Control and Data Flows
1198 +---------+       +------------------------+      +---------+
1199 | Node A  |       |         Node B         |      |  Node C |
1200 |         |       |                        |      |         |
1201 |+-------+|       |+-------+      +-------+|      |+-------+|
1202 ||       ||=====>>||Manager|====>>|       ||====>>||       ||
1203 ||       ||<<=====||   B   |<<====|Agent B||<<====||       ||
1204 ||       ||       |+--++---+      +-------+|      ||Manager||
1205 || Agent ||       +---||-------------------+      ||   C   ||
1206 ||   A   ||           ||                          ||       ||
1207 ||       ||<<=========||==========================||       ||
1208 ||       ||===========++========================>>||       ||
1209 |+-------+|                                       |+-------+|
1210 +---------+                                       +---------+
1212 Figure 1
1214 In this data flow, the Agent on node A receives Controls from 1215 Managers on nodes B and C, and replies with Report Entries back to 1216 these Managers. Similarly, the Agent on node B interacts with the 1217 local Manager on node B and the remote Manager on node C. Finally, 1218 the Manager on node B may fuse Report Entries received from Agents at 1219 nodes A and B and send these fused Report Entries back to the Manager 1220 on node C. From this figure it is clear that there exist many-to- 1221 many relationships amongst Managers, amongst Agents, and between 1222 Agents and Managers. Note that Agents and Managers are roles, not 1223 necessarily different software applications.
   Node A may represent a single software application fulfilling only
   the Agent role, whereas node B may have a single software application
   fulfilling both the Agent and Manager roles.  The specifics of how
   these roles are realized are an implementation matter.

8.2.  Control Flow by Role

   This section describes three common configurations of Agents and
   Managers and the flow of messages between them.  These configurations
   involve local and remote management and data fusion.

8.2.1.  Notation

   The notation outlined in Table 1 describes the types of control
   messages exchanged between Agents and Managers.

   +============+===================================+===========+
   | Term       | Definition                        | Example   |
   +============+===================================+===========+
   | EDD#       | EDD definition.                   | EDD1      |
   +------------+-----------------------------------+-----------+
   | V#         | Variable definition.              | V1 = EDD1 |
   |            |                                   | + V0.     |
   +------------+-----------------------------------+-----------+
   | DEF([ACL], | Define ID from expression.  Allow | DEF([*],  |
   | ID,EXPR)   | managers in access control list   | V1, EDD1  |
   |            | (ACL) to request this ID.         | + EDD2)   |
   +------------+-----------------------------------+-----------+
   | PROD(P,ID) | Produce ID according to predicate | PROD(1s,  |
   |            | P.  P may be a time period (1s)   | EDD1)     |
   |            | or an expression (EDD1 > 10).     |           |
   +------------+-----------------------------------+-----------+
   | RPT(ID)    | A report identified by ID.        | RPT(EDD1) |
   +------------+-----------------------------------+-----------+

                         Table 1: Terminology

8.2.2.  Serialized Management

   This is a nominal configuration of network management where a Manager
   interacts with a set of Agents.  The control flows for this are
   outlined in Figure 2.
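   The time-based production described by PROD(P,ID) in Table 1 can be
   sketched as a small simulation.  This is an illustrative model only:
   the Agent class, its on_prod and tick methods, the sample EDD1
   values, and the whole-second clock are all assumptions made for this
   sketch and are not defined by the AMA.

```python
"""Illustrative simulation of periodic production (all names hypothetical)."""

from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    edd: dict                                   # sample data, e.g. {"EDD1": 7}
    rules: list = field(default_factory=list)   # (period_s, ident, manager)

    def on_prod(self, period_s, ident, manager):
        """Handle PROD(period, ID): store the rule; nothing is sent yet."""
        self.rules.append((period_s, ident, manager))

    def tick(self, now_s):
        """Return the reports that are due at time now_s (whole seconds)."""
        due = []
        for period_s, ident, manager in self.rules:
            if now_s % period_s == 0:
                due.append((manager, self.name, f"RPT({ident})", self.edd[ident]))
        return due


agents = [Agent("A", {"EDD1": 7}), Agent("B", {"EDD1": 9})]

# The Manager sends one control to each Agent, then stays silent.
for agent in agents:
    agent.on_prod(1, "EDD1", "Manager")

# Three seconds pass; the Agents report without any further Manager input.
received = []
for now_s in range(1, 4):
    for agent in agents:
        received.extend(agent.tick(now_s))
```

   The Manager in this sketch never polls: once the production rule is
   stored, every report is initiated by the Agent, which is the property
   that lets the approach work over delayed or uni-directional links.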
                  Serialized Management Control Flow

   +----------+            +---------+           +---------+
   | Manager  |            | Agent A |           | Agent B |
   +----+-----+            +----+----+           +----+----+
        |                       |                     |
        |-----PROD(1s, EDD1)--->|                     |   (1)
        |----------------------------PROD(1s, EDD1)-->|
        |                       |                     |
        |                       |                     |
        |<-------RPT(EDD1)------|                     |   (2)
        |<----------------------------RPT(EDD1)-------|
        |                       |                     |
        |                       |                     |
        |<-------RPT(EDD1)------|                     |
        |<----------------------------RPT(EDD1)-------|
        |                       |                     |
        |                       |                     |
        |<-------RPT(EDD1)------|                     |
        |<----------------------------RPT(EDD1)-------|
        |                       |                     |

                               Figure 2

   In a simple network, a Manager interacts with multiple Agents.

   In this figure, the Manager configures Agents A and B to produce EDD1
   every second in (1).  Upon receiving this message and configuring the
   requested production, Agents A and B each build a Report Entry
   containing EDD1 and send it back to the Manager in (2).  The Agents
   then repeat this action every second without requiring further input
   from the Manager.

8.2.3.  Multiplexed Management

   Networks spanning multiple administrative domains may require
   multiple Managers (for example, one per domain).  When a Manager
   defines custom Reports/Variables to an Agent, that definition may be
   tagged with an Access Control List (ACL) to limit which other
   Managers will be privy to this information.  Managers in such
   networks should synchronize with those other Managers granted access
   to their custom data definitions.  When Agents generate messages,
   they MUST only send messages to Managers according to these ACLs, if
   present.  The control flows in this scenario are outlined in
   Figure 3.
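   The ACL rule stated above can be sketched in a few lines of Python.
   This is a minimal model under stated assumptions, not a definitive
   implementation: the Agent class, its method names, the use of "*" as
   a wildcard ACL entry, and the choice to treat a definition without an
   ACL as unrestricted are all hypothetical and are not specified by the
   AMA.

```python
"""Sketch of ACL-gated production on an Agent (all names hypothetical)."""

WILDCARD = "*"  # assumed ACL entry meaning "any Manager"


class Agent:
    def __init__(self):
        self.acls = {}        # definition ID -> set of Managers, or None
        self.production = {}  # Manager -> list of IDs it asked to be produced

    def define(self, ident, acl=None):
        """Record a custom definition; acl=None means no ACL is present."""
        self.acls[ident] = None if acl is None else set(acl)

    def permitted(self, manager, ident):
        """Return True if the Manager may receive reports for this ID."""
        if ident not in self.acls:
            return False  # unknown definition: nothing to report
        acl = self.acls[ident]
        return acl is None or WILDCARD in acl or manager in acl

    def request_production(self, manager, ident):
        """Accept a production request only if the ACL allows it."""
        if not self.permitted(manager, ident):
            return False
        self.production.setdefault(manager, []).append(ident)
        return True

    def reports(self):
        """Return (Manager, ID) pairs for one production cycle."""
        return [(manager, ident)
                for manager, idents in self.production.items()
                for ident in idents
                if self.permitted(manager, ident)]


agent = Agent()
agent.define("V1", acl=["A"])        # visible to Manager A only
agent.define("V2", acl=["B"])        # visible to Manager B only
agent.define("V3", acl=[WILDCARD])   # visible to every Manager

accepted = [agent.request_production("A", "V1"),
            agent.request_production("B", "V2"),
            agent.request_production("B", "V1"),   # B is not in V1's ACL
            agent.request_production("B", "V3")]
cycle = agent.reports()
```

   The permission check is applied both when a production request
   arrives and again when reports are generated, so a report is never
   addressed to a Manager outside the ACL of the reported definition.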
                  Multiplexed Management Control Flow

   +-----------+            +-------+            +-----------+
   | Manager A |            | Agent |            | Manager B |
   +-----+-----+            +---+---+            +-----+-----+
         |                      |                      |
         |---DEF(A,V1,EDD1*2)-->|<-DEF(B, V2, EDD2*2)--|    (1)
         |                      |                      |
         |---PROD(1s, V1)------>|<---PROD(1s, V2)------|    (2)
         |                      |                      |
         |<--------RPT(V1)------|                      |    (3)
         |                      |--------RPT(V2)------>|
         |<--------RPT(V1)------|                      |
         |                      |--------RPT(V2)------>|
         |                      |                      |
         |                      |<---PROD(1s, V1)------|    (4)
         |                      |                      |
         |                      |---ERR(V1 no perm.)-->|
         |                      |                      |
         |--DEF(*,V3,EDD3*3)--->|                      |    (5)
         |                      |                      |
         |---PROD(1s, V3)------>|                      |    (6)
         |                      |                      |
         |                      |<----PROD(1s, V3)-----|
         |                      |                      |
         |<--------RPT(V3)------|--------RPT(V3)------>|    (7)
         |<--------RPT(V1)------|                      |
         |                      |--------RPT(V2)------>|
         |<-------RPT(V3)-------|--------RPT(V3)------>|
         |<-------RPT(V1)-------|                      |
         |                      |--------RPT(V2)------>|

                               Figure 3

   Complex networks require multiple Managers interfacing with Agents.

   In more complex networks, any Manager may choose to define custom
   Reports and Variables, and Agents may need to accept such definitions
   from multiple Managers.  Variable definitions may include an ACL that
   describes which Managers may request and otherwise understand these
   definitions.  In (1), Manager A defines V1 only for A while Manager B
   defines V2 only for B.  Managers may then request the production of
   Report Entries containing these definitions, as shown in (2).  Agents
   produce different data for different Managers in accordance with
   configured production rules, as shown in (3).  If a Manager requests
   the production of a custom definition for which it has no
   permissions, the Agent should respond in a manner consistent with its
   configured logging policy, as shown in (4).
   Alternatively, as shown in (5), a Manager may define custom data
   with no access restrictions, allowing all other Managers to request
   and use this definition.  This allows all Managers to request the
   production of Report Entries containing this definition, as shown in
   (6), and to receive this and other data going forward, as shown in
   (7).

8.2.4.  Data Fusion

   Data fusion reduces the number and size of messages in the network,
   which can lead to more efficient utilization of networking resources.
   The AMA supports fusion of NM reports by co-locating Agents and
   Managers on nodes and offloading fusion activities to the Manager.
   This process is illustrated in Figure 4.

                       Data Fusion Control Flow

   +-----------+        +-----------+      +---------+      +---------+
   | Manager A |        | Manager B |      | Agent B |      | Agent C |
   +---+-------+        +-----+-----+      +----+----+      +----+----+
       |                      |                 |                |
       |-DEF(A,V0,EDD1+EDD2)->|                 |                | (1)
       |-PROD(EDD1&EDD2,V0)-->|                 |                |
       |                      |                 |                |
       |                      |--PROD(1s,EDD1)->|                | (2)
       |                      |------------------PROD(1s, EDD2)->|
       |                      |                 |                |
       |                      |<---RPT(EDD1)----|                | (3)
       |                      |<------------------RPT(EDD2)------|
       |                      |                 |                |
       |<-----RPT(A,V0)-------|                 |                | (4)
       |                      |                 |                |

                               Figure 4

   Data fusion occurs amongst Managers in the network.

   In this example, Manager A requests the production of a Variable V0
   from node B, as shown in (1).  The Manager role at node B understands
   which data is available from which Agents in the subnetwork local to
   B: EDD1 is available locally and EDD2 is available remotely.
   Production messages are sent in (2), and the resulting data is
   collected in (3).  This allows the Manager at node B to fuse the
   collected Report Entries into V0 and return it in (4).
   While a trivial example, the mechanism of associating fusion with the
   Manager function rather than the Agent function scales with fusion
   complexity.  It is important to reiterate that Agent and Manager
   designations are roles, not individual software components.  There
   may be a single software application running on node B implementing
   both the Manager B and Agent B roles.

9.  IANA Considerations

   This document has no fields registered by IANA.

10.  Security Considerations

   Security within an AMA MUST exist in two layers: transport-layer
   security and access control.

   Transport-layer security addresses the questions of authentication,
   integrity, and confidentiality associated with the transport of
   messages between and amongst Managers and Agents in the AMA.  This
   security is applied before any particular Actor in the system
   receives data and, therefore, is outside of the scope of this
   document.

   Finer-grained application security is provided by ACLs, which are
   defined via configuration messages and are implementation specific.

11.  Informative References

   [BIRRANE1] Birrane, E.B. and R.C. Cole, "Management of
              Disruption-Tolerant Networks: A Systems Engineering
              Approach", 2010.

   [BIRRANE2] Birrane, E.B., Burleigh, S.B., and V.C. Cerf, "Defining
              Tolerance: Impacts of Delay and Disruption when Managing
              Challenged Networks", 2011.

   [BIRRANE3] Birrane, E.B. and H.K. Kruse, "Delay-Tolerant Network
              Management: The Definition and Exchange of Infrastructure
              Information in High Delay Environments", 2011.

   [I-D.ietf-core-comi]
              Veillette, M., Stok, P. V. D., Pelov, A., Bierman, A., and
              I. Petrov, "CoAP Management Interface (CORECONF)", Work in
              Progress, Internet-Draft, draft-ietf-core-comi-11, 17
              January 2021.

   [I-D.ietf-core-sid]
              Veillette, M., Pelov, A., Petrov, I., and C.
              Bormann, "YANG Schema Item iDentifier (YANG SID)", Work in
              Progress, Internet-Draft, draft-ietf-core-sid-16, 24 June
              2021.

   [I-D.ietf-core-yang-cbor]
              Veillette, M., Petrov, I., Pelov, A., and C. Bormann,
              "CBOR Encoding of Data Modeled with YANG", Work in
              Progress, Internet-Draft, draft-ietf-core-yang-cbor-16, 24
              June 2021.

   [I-D.ietf-dtn-bpbis]
              Burleigh, S., Fall, K., and E. J. Birrane, "Bundle
              Protocol Version 7", Work in Progress, Internet-Draft,
              draft-ietf-dtn-bpbis-31, 25 January 2021.

   [I-D.irtf-dtnrg-dtnmp]
              Birrane, E. J. and V. Ramachandran, "Delay Tolerant
              Network Management Protocol", Work in Progress,
              Internet-Draft, draft-irtf-dtnrg-dtnmp-01, 31 December
              2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3416]  Presuhn, R., Ed., "Version 2 of the Protocol Operations
              for the Simple Network Management Protocol (SNMP)",
              STD 62, RFC 3416, DOI 10.17487/RFC3416, December 2002.

   [RFC4838]  Cerf, V., Burleigh, S., Hooke, A., Torgerson, L., Durst,
              R., Scott, K., Fall, K., and H. Weiss, "Delay-Tolerant
              Networking Architecture", RFC 4838, DOI 10.17487/RFC4838,
              April 2007.

   [RFC6020]  Bjorklund, M., Ed., "YANG - A Data Modeling Language for
              the Network Configuration Protocol (NETCONF)", RFC 6020,
              DOI 10.17487/RFC6020, October 2010.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

   [RFC7228]  Bormann, C., Ersue, M., and A. Keranen, "Terminology for
              Constrained-Node Networks", RFC 7228,
              DOI 10.17487/RFC7228, May 2014.

   [RFC7252]  Shelby, Z., Hartke, K., and C.
              Bormann, "The Constrained Application Protocol (CoAP)",
              RFC 7252, DOI 10.17487/RFC7252, June 2014.

   [RFC8040]  Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF
              Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017.

   [RFC8639]  Voit, E., Clemm, A., Gonzalez Prieto, A., Nilsen-Nygaard,
              E., and A. Tripathy, "Subscription to YANG Notifications",
              RFC 8639, DOI 10.17487/RFC8639, September 2019.

   [RFC8641]  Clemm, A. and E. Voit, "Subscription to YANG Notifications
              for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641,
              September 2019.

Authors' Addresses

   Edward J. Birrane
   Johns Hopkins Applied Physics Laboratory

   Email: Edward.Birrane@jhuapl.edu


   Emery Annis
   Johns Hopkins Applied Physics Laboratory

   Email: Emery.Annis@jhuapl.edu


   Sarah E. Heiner
   Johns Hopkins Applied Physics Laboratory

   Email: Sarah.Heiner@jhuapl.edu