Network Working Group                                    B. E. Carpenter
Internet-Draft                                         Univ. of Auckland
Intended status: Informational                              L. Ciavaglia
Expires: 5 August 2022                                    Rakuten Mobile
                                                                S. Jiang
                                            Huawei Technologies Co., Ltd
                                                               P. Peloso
                                                                   Nokia
                                                         1 February 2022

                Guidelines for Autonomic Service Agents
                   draft-ietf-anima-asa-guidelines-07

Abstract

   This document proposes guidelines for the design of Autonomic Service
   Agents for autonomic networks.
   Autonomic Service Agents, together with the Autonomic Network
   Infrastructure, the Autonomic Control Plane, and the Generic
   Autonomic Signaling Protocol, constitute base elements of an
   autonomic networking ecosystem.

Discussion Venue

   This note is to be removed before publishing as an RFC.

   Discussion of this document takes place on the ANIMA mailing list
   (anima@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/anima/.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 5 August 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Logical Structure of an Autonomic Service Agent
   3.  Interaction with the Autonomic Networking Infrastructure
     3.1.  Interaction with the security mechanisms
     3.2.  Interaction with the Autonomic Control Plane
     3.3.  Interaction with GRASP and its API
     3.4.  Interaction with policy mechanisms
   4.  Interaction with Non-Autonomic Components and Systems
   5.  Design of GRASP Objectives
   6.  Life Cycle
     6.1.  Installation phase
       6.1.1.  Installation phase inputs and outputs
     6.2.  Instantiation phase
       6.2.1.  Operator's goal
       6.2.2.  Instantiation phase inputs and outputs
       6.2.3.  Instantiation phase requirements
     6.3.  Operation phase
     6.4.  Removal phase
   7.  Coordination and Data Models
     7.1.  Coordination between Autonomic Functions
     7.2.  Coordination with Traditional Management Functions
     7.3.  Data Models
   8.  Robustness
   9.  Security Considerations
   10. IANA Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Change log
   Appendix B.  Terminology
   Appendix C.  Example Logic Flows
   Authors' Addresses

1.  Introduction

   This document proposes guidelines for the design of Autonomic Service
   Agents (ASAs) in the context of an Autonomic Network (AN) based on
   the Autonomic Network Infrastructure (ANI) outlined in the autonomic
   networking reference model [RFC8993]. This infrastructure makes use
   of the Autonomic Control Plane (ACP) [RFC8994] and the Generic
   Autonomic Signaling Protocol (GRASP) [RFC8990]. A general
   introduction to this environment may be found in [IPJ], which also
   includes explanatory diagrams, and a summary of terminology is in
   Appendix B.

   This document is a contribution to the description of an autonomic
   networking ecosystem, recognizing that a deployable autonomic network
   needs more than just ACP and GRASP implementations. Such an
   autonomic network must achieve management tasks that a Network
   Operations Center (NOC) cannot readily achieve manually, such as
   continuous resource optimization or automated fault detection and
   repair. These tasks, and other management automation goals, are
   described at length in [RFC7575]. The net result should be
   significant operational improvement. To achieve this, the autonomic
   networking ecosystem must include at least a library of ASAs and
   corresponding GRASP technical objective definitions. A GRASP
   objective [RFC8990] is a data structure whose main contents are a
   name and a value. The value consists of a single configurable
   parameter or a set of parameters of some kind.

   There must also be tools to deploy and oversee ASAs, and integration
   with existing operational mechanisms [RFC8368]. However, this
   document focuses on the design of ASAs, with some reference to
   implementation and operational aspects.

   There is considerable literature about autonomic agents, with a
   variety of proposals about how they should be characterized. Some
   examples are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].
   However, for the present document, the basic definitions and goals
   for autonomic networking given in [RFC7575] apply. According to RFC
   7575, an Autonomic Service Agent is "An agent implemented on an
   autonomic node that implements an autonomic function, either in part
   (in the case of a distributed function) or whole."

   ASAs must be distinguished from other forms of software component.
   They are components of network or service management; they do not in
   themselves provide services to end users. They do, however, provide
   management services to network operators and administrators. For
   example, the services envisaged for network function virtualisation
   [NFV] or for service function chaining [RFC7665] might be managed by
   an ASA rather than by traditional configuration tools.

   Another example is that an existing script running within a router to
   locally monitor or configure functions or services could be upgraded
   to an ASA that could communicate with peer scripts on neighboring or
   remote routers. A high-level API will allow such upgraded scripts to
   take full advantage of the secure ACP and the discovery, negotiation
   and synchronization features of GRASP. Familiar tasks such as
   configuring an Interior Gateway Protocol (IGP) on neighboring routers
   or even exchanging IGP security keys could be performed securely in
   this way. This document mainly addresses issues affecting quite
   complex ASAs, but initially the most useful ASAs may in fact be
   rather simple evolutions of existing scripts.

   The reference model [RFC8993] for autonomic networks explains further
   the functionality of ASAs by adding "[An ASA is] a process that makes
   use of the features provided by the ANI to achieve its own goals,
   usually including interaction with other ASAs via the GRASP protocol
   [RFC8990] or otherwise. Of course, it also interacts with the
   specific targets of its function, using any suitable mechanism.
   Unless its function is very simple, the ASA will need to handle
   overlapping asynchronous operations. It may therefore be a quite
   complex piece of software in its own right, forming part of the
   application layer above the ANI."

   As mentioned, there will certainly be simple ASAs that manage a
   single objective in a straightforward way and do not need
   asynchronous operations. In nodes where computing power and memory
   space are limited, ASAs should run at a much lower frequency than the
   primary workload, so CPU load should not be a big issue, but memory
   footprint in a constrained node is certainly a concern. ASAs
   installed in constrained devices will have limited functionality. In
   such cases, many aspects of the current document do not apply.
   However, in the general case, an ASA may be a relatively complex
   software component that will in many cases control and monitor
   simpler entities in the same or remote host(s). For example, a
   device controller that manages tens or hundreds of simple devices
   might contain a single ASA.

   The remainder of this document offers guidance on the design of
   complex ASAs. Some of the material may be familiar to those
   experienced in distributed fault-tolerant and real-time control
   systems. Robustness and security are of particular importance in
   autonomic networks and are discussed in Section 8 and Section 9.

2.  Logical Structure of an Autonomic Service Agent

   As mentioned above, all but the simplest ASAs will need to support
   asynchronous operations. Different programming environments support
   asynchronicity in different ways. In this document, we use an
   explicit multi-threading model to describe operations. This is
   illustrative, and alternatives to multi-threading are discussed in
   detail in connection with the GRASP API (see Section 3.3).

   A typical ASA will have a main thread that performs various initial
   housekeeping actions such as:

   *  Obtain authorization credentials, if needed.

   *  Register the ASA with GRASP.

   *  Acquire relevant policy parameters.

   *  Declare data structures for relevant GRASP objectives.

   *  Register with GRASP those objectives that it will actively manage.

   *  Launch a self-monitoring thread.

   *  Enter its main loop.

   The logic of the main loop will depend on the details of the
   autonomic function concerned. Whenever asynchronous operations are
   required, extra threads may be launched. Examples of such threads
   include:

   *  Repeatedly flood an objective to the AN, so that any ASA can
      receive the objective's latest value.

   *  Accept incoming synchronization requests for an objective managed
      by this ASA.

   *  Accept incoming negotiation requests for an objective managed by
      this ASA, and then conduct the resulting negotiation with the
      counterpart ASA.

   *  Manage subsidiary non-autonomic devices directly.

   These threads should all either exit after their job is done, or
   enter a wait state for new work, to avoid wasting system resources.

   According to the degree of parallelism needed by the application,
   some of these threads might be launched in multiple instances.
   In particular, if negotiation sessions with other ASAs are expected
   to be long or to involve wait states, the ASA designer might allow
   for multiple simultaneous negotiating threads, with appropriate use
   of queues and synchronization primitives to maintain consistency.

   The main loop itself could act as the initiator of synchronization
   requests or negotiation requests, when the ASA needs data or
   resources from other ASAs. In particular, the main loop should watch
   for changes in policy parameters that affect its operation, and if
   appropriate, occasionally refresh authorization credentials. It
   should also do whatever is required to avoid unnecessary resource
   consumption, for example by limiting its frequency of execution.

   The self-monitoring thread is of considerable importance. Failure of
   autonomic service agents is highly undesirable. To a large extent
   this depends on careful coding and testing, with no unhandled error
   returns or exceptions, but if there is nevertheless some sort of
   failure, the self-monitoring thread should detect it, fix it if
   possible, and in the worst case restart the entire ASA.

   Appendix C presents some example logic flows in informal pseudocode.

3.  Interaction with the Autonomic Networking Infrastructure

3.1.  Interaction with the security mechanisms

   An ASA by definition runs in an autonomic node. Before any normal
   ASAs are started, such nodes must be bootstrapped into the autonomic
   network's secure key infrastructure, typically in accordance with
   [RFC8995]. This key infrastructure will be used to secure the ACP
   (next section) and may be used by ASAs to set up additional secure
   interactions with their peers, if needed.

   Note that the secure bootstrap process itself incorporates simple
   special-purpose ASAs that use a restricted mode of GRASP (Section 4
   of [RFC8995]).

3.2.  Interaction with the Autonomic Control Plane

   In a normal autonomic network, ASAs will run as clients of the ACP,
   which will provide a fully secured network environment for all
   communication with other ASAs, in most cases mediated by GRASP (next
   section).

   Note that the ACP formation process itself incorporates simple
   special-purpose ASAs that use a restricted mode of GRASP (Section 6.4
   of [RFC8994]).

3.3.  Interaction with GRASP and its API

   In a node where a significant number of ASAs are installed, GRASP
   [RFC8990] is likely to run as a separate process with its API
   [RFC8991] available in user space. Thus, ASAs may operate without
   special privilege, unless they need it for other reasons. The ASA's
   view of GRASP is built around GRASP objectives (Section 5), defined
   as data structures containing administrative information such as the
   objective's unique name, and its current value. The format and size
   of the value is not restricted by the protocol, except that it must
   be possible to serialise it for transmission in Concise Binary Object
   Representation (CBOR) [RFC8949], subject only to GRASP's maximum
   message size as discussed in Section 5.

   As discussed in Section 2, GRASP is an asynchronous protocol, and
   this document uses a multi-threading model to describe operations.
   In many programming environments, an 'event loop' model is used
   instead, in which case each thread would be implemented as an event
   handler called in turn by the main loop. For this case, the GRASP
   API must provide non-blocking calls and possibly support callbacks.
   This topic is discussed in more detail in [RFC8991], and other
   asynchronicity models are also possible. Whenever necessary, the
   GRASP session identifier will be used to distinguish simultaneous
   operations.
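
   As an illustration of the event loop model mentioned above, the
   following Python sketch runs a synchronization listener as an
   event handler and uses a session identifier to tell overlapping
   requests apart. It is only a sketch under stated assumptions:
   ToyGrasp, sync_listener and the objective name "EX1" are
   hypothetical, and nothing here is the API defined in [RFC8991].

```python
import asyncio


class ToyGrasp:
    """Hypothetical in-memory stand-in for a GRASP API (not RFC 8991)."""

    def __init__(self):
        self._last_session = 0
        # Pending requests as (session_id, objective_name) pairs.
        self.requests = asyncio.Queue()

    def new_session(self):
        """Return a fresh session identifier."""
        self._last_session += 1
        return self._last_session


async def sync_listener(grasp, objectives, replies):
    """Event handler: answer synchronization requests as they arrive."""
    while True:
        # Awaiting yields control to the event loop instead of blocking it.
        session_id, name = await grasp.requests.get()
        if name is None:  # sentinel used by this sketch to stop the handler
            return
        # The session identifier keeps overlapping requests apart.
        replies[session_id] = objectives.get(name)


async def main_loop():
    grasp = ToyGrasp()
    objectives = {"EX1": 42}  # objectives managed by this toy ASA
    replies = {}
    listener = asyncio.create_task(sync_listener(grasp, objectives, replies))
    first, second = grasp.new_session(), grasp.new_session()
    await grasp.requests.put((first, "EX1"))   # known objective
    await grasp.requests.put((second, "EX2"))  # unknown objective
    await grasp.requests.put((0, None))        # stop the handler
    await listener
    return replies


def run_demo():
    """Run the sketch and return the replies keyed by session identifier."""
    return asyncio.run(main_loop())
```

   In a threaded design, sync_listener would instead be a thread that
   blocks on the listening call; the session identifier plays the same
   role in both models.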

   The GRASP API should offer the following features:

   *  Registration functions, so that an ASA can register itself and the
      objectives that it manages.

   *  A discovery function, by which an ASA can discover other ASAs
      supporting a given objective.

   *  A negotiation request function, by which an ASA can start
      negotiation of an objective with a counterpart ASA. With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to negotiation requests, and a set of functions to
      support negotiating steps. Once a negotiation starts, it is a
      symmetric process with both sides sending successive objective
      values to each other until agreement is reached (or the
      negotiation fails).

   *  A synchronization function, by which an ASA can request the
      current value of an objective from a counterpart ASA. With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to synchronization requests. Unlike negotiation,
      synchronization is an asymmetric process in which the listener
      sends a single objective value to the requester.

   *  A flood function, by which an ASA can cause the current value of
      an objective to be flooded throughout the AN so that any ASA can
      receive it.

   For further details and some additional housekeeping functions, see
   [RFC8991].

   The GRASP API is intended to support the various interactions
   expected between most ASAs, such as the interactions outlined in
   Section 2. However, if ASAs require additional communication between
   themselves, they can do so directly over the ACP to benefit from its
   security. One option is to use GRASP discovery and synchronization
   as a rendezvous mechanism between two ASAs, passing communication
   parameters such as a TCP port number via GRASP. The use of TLS over
   the ACP for such communications is advisable, as described in
   Section 6.9.2 of [RFC8994].

3.4.  Interaction with policy mechanisms

   At the time of writing, the policy mechanisms for the ANI are
   undefined. In particular, the use of declarative policies (aka
   Intents) for the definition and management of ASA behaviors remains
   a research topic [I-D.irtf-nmrg-ibn-concepts-definitions].

   In the cases where ASAs are defined as closed control loops, the
   specifications defined in [ZSM009-1] regarding imperative and
   declarative goal statements may be applicable.

   In the ANI, policy dissemination is expected to operate by an
   information distribution mechanism (e.g., via GRASP [RFC8990]) that
   can reach all autonomic nodes, and therefore every ASA. However,
   each ASA must be capable of operating "out of the box" in the absence
   of locally defined policy, so every ASA implementation must include
   carefully chosen default values and settings for all policy
   parameters.

4.  Interaction with Non-Autonomic Components and Systems

   An ASA, to have any external effects, must also interact with non-
   autonomic components of the node where it is installed. For example,
   an ASA whose purpose is to manage a resource must interact with that
   resource. An ASA managing an entity that is also managed by local
   software must interact with that software. For example, if such
   management is performed by NETCONF [RFC6241], the ASA must interact
   with the NETCONF server as an independent NETCONF client in the same
   node to avoid any inconsistency between configuration changes
   delivered via NETCONF and configuration changes made by the ASA.

   In an environment where systems are virtualized and specialized using
   techniques such as network function virtualization or network
   slicing, there will be a design choice whether ASAs are deployed once
   per physical node or once per virtual context.
   A related issue is whether the ANI as a whole is deployed once on a
   physical network, or whether several virtual ANIs are deployed. This
   aspect needs to be considered by the ASA designer.

5.  Design of GRASP Objectives

   The design of an ASA will often require the design of a new GRASP
   objective. The general rules for the format of GRASP objectives,
   their names, and IANA registration are given in [RFC8990].
   Additionally, that document discusses various general considerations
   for the design of objectives, which are not repeated here. However,
   note that the GRASP protocol, like HTTP, does not provide
   transactional integrity. In particular, steps in a GRASP negotiation
   are not idempotent. The design of a GRASP objective and the logic
   flow of the ASA should take this into account. One approach, which
   should be used when possible, is to design objectives with idempotent
   semantics. If this is not possible, typically if an ASA is
   allocating part of a shared resource to other ASAs, it needs to
   ensure that the same part of the resource is not allocated twice.
   The easiest way is to run only one negotiation at a time. If an ASA
   is capable of overlapping several negotiations, it must avoid
   interference between these negotiations.

   Negotiations will always end, normally because one end or the other
   declares success or failure. If this does not happen, either a
   timeout or exhaustion of the loop count will occur. The definition
   of a GRASP objective should describe a specific negotiation policy if
   it is not self-evident.

   GRASP allows a 'dry run' mode of negotiation, where a negotiation
   session follows its normal course but is not committed at either end
   until a subsequent live negotiation session.
   If 'dry run' mode is defined for the objective, its specification,
   and every implementation, must consider what state needs to be saved
   following a dry run negotiation, such that a subsequent live
   negotiation can be expected to succeed. It must be clear how long
   this state is kept, and what happens if the live negotiation occurs
   after this state is deleted. An ASA that requests a dry run
   negotiation must take account of the possibility that a successful
   dry run is followed by a failed live negotiation. Because of these
   complexities, the dry run mechanism should only be supported by
   objectives and ASAs where there is a significant benefit from it.

   The actual value field of an objective is limited by the GRASP
   protocol definition to any data structure that can be expressed in
   Concise Binary Object Representation (CBOR) [RFC8949]. For some
   objectives, a single data item will suffice; for example an integer,
   a floating point number, a UTF-8 string or an arbitrary byte string.
   For more complex cases, a simple tuple structure such as [item1,
   item2, item3] could be used. Since CBOR is closely linked to JSON,
   it is also rather easy to define an objective whose value is a JSON
   structure. The formats acceptable by the GRASP API will limit the
   options in practice. A generic solution is for the API to accept and
   deliver the value field in raw CBOR, with the ASA itself encoding and
   decoding it via a CBOR library (Section 2.3.2.4 of [RFC8991]).

   The maximum size of the value field of an objective is limited by the
   GRASP maximum message size. If the default maximum size specified as
   GRASP_DEF_MAX_SIZE by [RFC8990] is not enough, the specification of
   the objective must indicate the required maximum message size, both
   for unicast and multicast messages.

   A mapping from YANG to CBOR is defined by [I-D.ietf-core-yang-cbor].
   Subject to the size limit defined for GRASP messages, nothing
   prevents objectives from transporting YANG in this way.

   The flexibility of CBOR implies that the value field of many
   objectives can be extended in service, to add additional information
   or alternative content, especially if JSON-like structures are used.
   This has consequences for the robustness of ASAs, as discussed in
   Section 8.

6.  Life Cycle

   The ASA life cycle was discussed in
   [I-D.peloso-anima-autonomic-function], from which the following text
   was derived. It does not cover all details, and some of the terms
   used would require precise definitions in a given implementation.

   In simple cases, autonomic functions could be permanent, in the sense
   that ASAs are shipped as part of a product and persist throughout the
   product's life. However, in complex cases, a more likely situation
   is that ASAs need to be installed or updated dynamically, because of
   new requirements or bugs. This section describes one approach to the
   resulting life cycle of individual ASAs. It does not consider wider
   issues such as updates of shared libraries.

   Because continuity of service is fundamental to autonomic networking,
   the process of seamlessly replacing a running instance of an ASA with
   a new version needs to be part of the ASA's design. The implication
   of service continuity on the design of ASAs can be illustrated along
   the three main phases of the ASA life cycle, namely Installation,
   Instantiation and Operation.
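
   The life cycle can also be read as a small state machine. The
   following Python sketch is one possible reading of Figure 1, not
   part of any specification: the Phase names mirror the boxes in the
   figure, while the event labels ("install", "mandate_received" and
   so on) and the next_phase helper are hypothetical names chosen for
   illustration.

```python
from enum import Enum


class Phase(Enum):
    """Life cycle states, named after the boxes in Figure 1."""
    UNDEPLOYED = "undeployed"
    INSTALLED = "installed"
    INSTANTIATED = "instantiated"
    OPERATIONAL = "operational"


# (current phase, event) -> next phase.
# The event labels are illustrative assumptions, not defined terms.
TRANSITIONS = {
    (Phase.UNDEPLOYED, "install"): Phase.INSTALLED,
    (Phase.INSTALLED, "uninstall"): Phase.UNDEPLOYED,
    (Phase.INSTALLED, "mandate_received"): Phase.INSTANTIATED,
    (Phase.INSTANTIATED, "mandate_revoked"): Phase.INSTALLED,
    (Phase.INSTANTIATED, "set_up"): Phase.OPERATIONAL,
    (Phase.OPERATIONAL, "set_down"): Phase.INSTANTIATED,
}


def next_phase(current, event):
    """Return the phase reached from `current` on `event`.

    Raises ValueError if the figure has no such transition.
    """
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in phase {current.name}"
        ) from None
```

   For example, an Operational ASA cannot be uninstalled directly in
   this reading: it must first be set down and have its mandate
   revoked, which is where a seamless replacement procedure would
   have to hook in.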

                    +--------------+
  Undeployed ------>|              |------> Undeployed
                    |  Installed   |
                +-->|              |---+
       Mandate  |   +--------------+   | Receives a
     is revoked |   +--------------+   |  Mandate
                +---|              |<--+
                    | Instantiated |
                +-->|              |---+
            set |   +--------------+   | set
           down |   +--------------+   | up
                +---|              |<--+
                    | Operational  |
                    |              |
                    +--------------+

           Figure 1: Life Cycle of an Autonomic Service Agent

6.1.  Installation phase

   We define "installation" to mean that a piece of software is loaded
   into a device, along with any necessary libraries, but is not yet
   activated.

   Before being able to instantiate and run ASAs, the operator will
   first provision the infrastructure with the sets of ASA software
   corresponding to its needs and objectives. Such software must be
   checked for integrity and authenticity before installation. The
   provisioning of the infrastructure is realized in the installation
   phase and consists of installing (or checking the availability of)
   the pieces of software of the different ASAs in a set of Installation
   Hosts within the autonomic network.

   There are three properties applicable to the installation of ASAs:

   *  The dynamic installation property allows installing an ASA on
      demand, on any host compatible with the ASA.

   *  The decoupling property allows an ASA on one machine to control
      resources in another machine (known as "decoupled mode").

   *  The multiplicity property allows controlling multiple sets of
      resources from a single ASA.

   These three properties are very important in the context of the
   installation phase as their variations condition how the ASA could be
   installed on the infrastructure.

6.1.1.  Installation phase inputs and outputs

   Inputs are:

   *  [ASA_type] specifies which ASA to install.

   *  [Installation_target_infrastructure] specifies the candidate
      installation Hosts.

   *  [ASA_placement_function] specifies how the installation phase will
      meet the operator's needs and objectives for the provision of the
      infrastructure. This function is only useful in the decoupled
      mode. It can be as simple as an explicit list of hosts on which
      the ASAs are to be installed, or it could consist of operator-
      defined criteria and constraints.

   The main output of the installation phase is a [List_of_ASAs]
   installed on [List_of_hosts]. This output is also useful for the
   coordination function where it acts as a static interaction map (see
   Section 7.1).

   The condition for passing to the next phase is that the
   [List_of_ASAs] are correctly installed on the [List_of_hosts]. A
   minimum set of primitives to support the installation of ASAs could
   be: install(List_of_ASAs, Installation_target_infrastructure,
   ASA_placement_function), and uninstall(List_of_ASAs).

6.2.  Instantiation phase

   We define "instantiation" as the operation of creating a single ASA
   instance from the corresponding piece of installed software.

   Once the ASAs are installed on the appropriate hosts in the network,
   these ASAs may start to operate. From the operator viewpoint, an
   operating ASA means the ASA manages the network resources as per the
   objectives given. At the ASA local level, operating means executing
   its control loop algorithm.

   There are two aspects to take into consideration. First, having a
   piece of code installed and available to run on a host is not the
   same as having an agent based on this piece of code running inside
   the host.
   Second, in a coupled case, determining which resources are
   controlled by an ASA is straightforward (the ASA runs on the same
   autonomic node as the resources it is controlling); in a decoupled
   mode, determining this is a bit more complex: either the starting
   agent has to discover the set of resources it ought to control, or
   that information has to be communicated to it.

   The instantiation phase of an ASA covers both these aspects: starting
   the agent code (when this does not start automatically) and
   determining which resources have to be controlled (when this is not
   straightforward).

6.2.1.  Operator's goal

   Through this phase, the operator wants to control its autonomic
   network regarding at least two aspects:

   1  determine the scope of autonomic functions by instructing which
      network resources have to be managed by which autonomic function
      (and more precisely by which release of the ASA software code,
      e.g., version number or provider),

   2  determine how the autonomic functions are organized by
      instantiating a set of ASAs across one or more autonomic nodes and
      instructing them accordingly about the other ASAs in the set as
      necessary.

   In this phase, the operator may also want to set goals for autonomic
   functions, e.g., by configuring GRASP objectives.

   The operator's goal can be summarized in an instruction to the
   autonomic ecosystem matching the following format, explained in
   detail in the next sub-section:

      [Instances_of_ASA_type] ready to control
      [Instantiation_target_infrastructure] with
      [Instantiation_target_parameters]

6.2.2.  Instantiation phase inputs and outputs

   Inputs are:

   *  [Instances_of_ASA_type] that specifies which ASAs to instantiate.

   *  [Instantiation_target_infrastructure] that specifies which
      resources are to be managed by the autonomic function; this can be
      the whole network or a subset of it, such as a domain, a physical
      segment or even a specific list of resources.

   *  [Instantiation_target_parameters] that specifies which GRASP
      objectives are to be sent to ASAs (e.g., an optimization target).

   Outputs are:

   *  [Set_of_ASA_resources_relations] describing which resources are
      managed by which ASA instances; this is not a formal message, but
      a resulting configuration log for a set of ASAs.

6.2.3.  Instantiation phase requirements

   The instructions described in Section 6.2 could be either:

   *  Sent to a targeted ASA. In this case, the receiving agent will
      have to manage the specified list of
      [Instantiation_target_infrastructure], with the
      [Instantiation_target_parameters].

   *  Broadcast to all ASAs. In this case, the ASAs would determine
      from the list which ASAs would handle which
      [Instantiation_target_infrastructure], with the
      [Instantiation_target_parameters].

   These instructions may be grouped as a specific data structure,
   referred to as an ASA Instance Mandate. The specification of such an
   ASA Instance Mandate is beyond the scope of this document.

   The conclusion of this instantiation phase is a set of ASA instances
   ready to operate. These ASA instances are characterized by the
   resources they manage, the metrics being monitored and the actions
   that can be executed (like modifying certain parameter values). The
   description of the ASA instance may be defined in an ASA Instance
   Manifest data structure. The specification of such an ASA Instance
   Manifest is beyond the scope of this document.
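
   Since neither the ASA Instance Mandate nor the ASA Instance
   Manifest is specified here, the following Python sketch is purely
   illustrative. It only shows how the instantiation inputs listed
   above might be grouped into a mandate and how a manifest might
   record the resources, metrics and actions of the resulting
   instance. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AsaInstanceMandate:
    """Hypothetical grouping of the instantiation phase inputs."""
    asa_type: str                  # stands in for [Instances_of_ASA_type]
    target_infrastructure: list    # [Instantiation_target_infrastructure]
    target_parameters: dict = field(default_factory=dict)
    # target_parameters stands in for [Instantiation_target_parameters]


@dataclass
class AsaInstanceManifest:
    """Hypothetical description of an instance that is ready to operate."""
    asa_type: str
    managed_resources: list
    monitored_metrics: list
    available_actions: list
    objectives: dict


def instantiate(mandate, monitored_metrics, available_actions):
    """Build a manifest for an instance created from `mandate`.

    A real instantiation would also start the agent code; this sketch
    only records what the instance would report back to the operator.
    """
    return AsaInstanceManifest(
        asa_type=mandate.asa_type,
        managed_resources=sorted(mandate.target_infrastructure),
        monitored_metrics=list(monitored_metrics),
        available_actions=list(available_actions),
        objectives=dict(mandate.target_parameters),
    )
```

   Collecting the managed_resources of every manifest would give
   something like the [Set_of_ASA_resources_relations] output
   described in Section 6.2.2.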

   The ASA Instance Manifest does not only serve informational purposes
   such as acknowledgement of successful instantiation to the operator,
   but is also necessary for further autonomic operations with:

   *  coordinated entities (see Section 7.1)

   *  collaborative entities with purposes such as to establish
      knowledge exchange (some ASAs may produce knowledge or monitor
      metrics that would be useful for other ASAs)

6.3. Operation phase

   During the Operation phase, the operator can:

   *  Activate/Deactivate ASAs: enable/disable their autonomic loops.

   *  Modify ASA targets: set different technical objectives.

   *  Modify ASA managed resources: update the instance mandate to
      specify a different set of resources to manage (only applicable to
      decoupled ASAs).

   During the Operation phase, running ASAs can interact with other
   ASAs:

   *  in order to exchange knowledge (e.g., an ASA providing traffic
      predictions to a load balancing ASA)

   *  in order to collaboratively reach an objective (e.g., ASAs
      pertaining to the same autonomic function will collaborate, as
      in the case of a load balancing function, by modifying link
      metrics according to neighboring resource loads)

   During the Operation phase, running ASAs are expected to apply
   coordination schemes as per Section 7.1.

6.4. Removal phase

   When an ASA is removed from service and uninstalled, the above steps
   are reversed. It is important that its data, especially any security
   key material, is purged.

7. Coordination and Data Models

7.1. Coordination between Autonomic Functions

   Some autonomic functions will be completely independent of each
   other. However, others are at risk of interfering with each other -
   for example, two different optimization functions might both attempt
   to modify the same underlying parameter in different ways.
   In a complete system, a method is needed of identifying ASAs that
   might interfere with each other and coordinating their actions when
   necessary.

7.2. Coordination with Traditional Management Functions

   Some ASAs will have functions that overlap with existing
   configuration tools and network management mechanisms such as command
   line interfaces, DHCP, DHCPv6, SNMP, NETCONF, and RESTCONF. This is
   of course an existing problem whenever multiple configuration tools
   are in use by the NOC. Each ASA designer will need to consider this
   issue and how to avoid clashes and inconsistencies in various
   deployment scenarios. Some specific considerations for interaction
   with OAM tools are given in [RFC8368]. As another example, [RFC8992]
   describes how autonomic management of IPv6 prefixes can interact with
   prefix delegation via DHCPv6. The description of a GRASP objective
   and of an ASA using it should include a discussion of any such
   interactions.

7.3. Data Models

   Management functions often include a shared data model, quite likely
   to be expressed in a formal notation such as YANG. This aspect
   should not be an afterthought in the design of an ASA. To the
   contrary, the design of the ASA and of its GRASP objectives should
   match the data model; as noted in Section 5, YANG serialized as CBOR
   may be used directly as the value of a GRASP objective.

8. Robustness

   It is of great importance that all components of an autonomic system
   are highly robust. Although ASA designers should aim for their
   component to never fail, it is more important to design the ASA to
   assume that failures will happen and to gracefully recover from those
   failures when they occur. Hence, this section lists various aspects
   of robustness that ASA designers should consider:

   1.   If, despite all precautions, an ASA does encounter a fatal
        error, it should in any case restart automatically and try
        again. To mitigate a loop in case of persistent failure, a
        suitable pause should be inserted before such a restart. The
        length of the pause depends on the use case; randomization and
        exponential backoff should be considered.

   2.   If a newly received or calculated value for a parameter falls
        out of bounds, the corresponding parameter should be either left
        unchanged or restored to a value known to be safe in all
        configurations.

   3.   If a GRASP synchronization or negotiation session fails for any
        reason, it may be repeated after a suitable pause. The length
        of the pause depends on the use case; randomization and
        exponential backoff should be considered.

   4.   If a session fails repeatedly, the ASA should consider that its
        peer has failed, and cause GRASP to flush its discovery cache
        and repeat peer discovery.

   5.   In any case, it may be prudent to repeat discovery periodically,
        depending on the use case.

   6.   Any received GRASP message should be checked. If it is wrongly
        formatted, it should be ignored. Within a unicast session, an
        Invalid message (M_INVALID) may be sent. This function may be
        provided by the GRASP implementation itself.

   7.   Any received GRASP objective should be checked. Basic
        formatting errors like invalid CBOR will likely be detected by
        GRASP itself, but the ASA is responsible for checking the
        precise syntax and semantics of a received objective. If it is
        wrongly formatted, it should be ignored. Within a negotiation
        session, a Negotiation End message (M_END) with a Decline option
        (O_DECLINE) should be sent. An ASA may log such events for
        diagnostic purposes.

   8.   On the other hand, the definitions of GRASP objectives are very
        likely to be extended, using the flexibility of CBOR or JSON.
        Therefore, ASAs should be able to deal gracefully with unknown
        components within the values of objectives. The specification
        of an objective should describe how unknown components are to be
        handled (ignored, logged and ignored, or rejected as an error).

   9.   If an ASA receives either an Invalid message (M_INVALID) or a
        Negotiation End message (M_END) with a Decline option
        (O_DECLINE), one possible reason is that the peer ASA does not
        support a new feature of either GRASP or of the objective in
        question. In such a case the ASA may choose to repeat the
        operation concerned without using that new feature.

   10.  All other possible exceptions should be handled in an orderly
        way. There should be no such thing as an unhandled exception
        (but see point 1 above).

   At a slightly more general level, ASAs are not services in
   themselves, but they automate services. This has a fundamental
   impact on how to design robust ASAs. In general, when an ASA
   observes a particular state (1) of operations of the
   services/resources it controls, it typically aims to improve this
   state to a better state, say (2). Ideally, the ASA is built so that
   it can ensure that any error encountered can still lead to returning
   to (1) instead of a state (3) which is worse than (1). One example
   instance of this principle is "make-before-break" used in
   reconfiguration of routing protocols in manual operations. This
   principle of operations can accordingly be coded into the operation
   of an ASA. The GRASP dry run option mentioned in Section 5 is
   another tool helpful for this ASA design goal of "test-before-make".

9. Security Considerations

   ASAs are intended to run in an environment that is protected by the
   Autonomic Control Plane [RFC8994], admission to which depends on an
   initial secure bootstrap process such as BRSKI [RFC8995].
   Those documents describe security considerations relating to the use
   of and properties provided by the ACP and BRSKI, respectively. Such
   an ACP can provide keying material for mutual authentication between
   ASAs as well as confidential communication channels for messages
   between ASAs. In some deployments, a secure partition of the link
   layer might be used instead. GRASP itself has significant security
   considerations [RFC8990]. However, this does not relieve ASAs of
   responsibility for security. When ASAs configure or manage network
   elements outside the ACP, potentially in a different physical node,
   they must interact with other non-autonomic software components to
   perform their management functions. The details are specific to each
   case, but this has an important security implication. An ASA might
   act as a loophole by which the managed entity could penetrate the
   security boundary of the ANI. Thus, ASAs must be designed to avoid
   loopholes such as passing on executable code or proxying unverified
   commands, and should if possible operate in an unprivileged mode. In
   particular, they must use secure coding practices, e.g., carefully
   validate all incoming information and avoid unnecessary elevation of
   privilege. This will apply in particular when an ASA interacts with
   a management component such as a NETCONF server.

   A similar situation will arise if an ASA acts as a gateway between
   two separate autonomic networks, i.e., it has access to two separate
   ACPs. Such an ASA must also be designed to avoid loopholes and to
   validate incoming information from both sides.

   As a reminder, GRASP does not intrinsically provide transactional
   integrity (Section 5).

   As appropriate to their specific functions, ASAs should take account
   of relevant privacy considerations [RFC6973].
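
   To illustrate the advice above to validate all incoming information,
   the following minimal sketch checks an already decoded objective
   value before an ASA acts on it. It is an assumption-laden example
   and not part of GRASP: the "threshold" component, its bounds and the
   safe default are all hypothetical, and unknown components are simply
   reported back to the caller, which is only one of the handling
   choices mentioned in Section 8.

```python
SAFE_DEFAULT = 50           # hypothetical value known to be safe
BOUNDS = (0, 100)           # hypothetical permitted range
KNOWN_KEYS = {"threshold"}  # hypothetical components of the value


def validate_params(value):
    """Check a decoded objective value.

    Returns (accepted_value, unknown_keys). Raises ValueError when the
    value is wrongly formatted, so that the caller can ignore it or
    decline the negotiation.
    """
    if not isinstance(value, dict):
        raise ValueError("objective value must be a map")
    # Unknown components are tolerated and reported, not acted upon.
    unknown = [key for key in value if key not in KNOWN_KEYS]
    threshold = value.get("threshold")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, int):
        raise ValueError("threshold must be an integer")
    low, high = BOUNDS
    if not low <= threshold <= high:
        # Out of bounds: fall back to the value known to be safe.
        threshold = SAFE_DEFAULT
    return {"threshold": threshold}, unknown
```

   An ASA using such a check would call it on every received value and
   treat a ValueError as a reason to ignore the objective or to end the
   negotiation with a Decline option.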

   The initial version of the autonomic infrastructure assumes that all
   autonomic nodes are trusted by virtue of their admission to the ACP.
   ASAs are therefore trusted to manipulate any GRASP objective, simply
   because they are installed on a node that has successfully joined the
   ACP. In the general case, a node may have multiple roles and a role
   may use multiple ASAs, each using multiple GRASP objectives.
   Additional mechanisms for the fine-grained authorization of nodes and
   ASAs to manipulate specific GRASP objectives could be designed.
   Meanwhile, we repeat that ASAs should run without special privilege
   if possible. Independently of this, interfaces between ASAs and the
   router configuration and monitoring services of the node can be
   subject to authentication that provides more fine-grained
   authorization for specific services. These additional authentication
   parameters could be passed to an ASA during its instantiation phase.

10. IANA Considerations

   This document makes no request of the IANA.

11. Acknowledgements

   Valuable comments were received from Michael Behringer, Menachem
   Dodge, Martin Dürst, Toerless Eckert, Thomas Fossati, Alex Galis,
   Bing Liu, Benno Overeinder, Michael Richardson, Rob Wilton and other
   IESG members.

12. References

12.1. Normative References

   [RFC8949]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", STD 94, RFC 8949,
              DOI 10.17487/RFC8949, December 2020.

   [RFC8990]  Bormann, C., Carpenter, B., Ed., and B. Liu, Ed., "GeneRic
              Autonomic Signaling Protocol (GRASP)", RFC 8990,
              DOI 10.17487/RFC8990, May 2021.

   [RFC8994]  Eckert, T., Ed., Behringer, M., Ed., and S. Bjarnason, "An
              Autonomic Control Plane (ACP)", RFC 8994,
              DOI 10.17487/RFC8994, May 2021.

   [RFC8995]  Pritikin, M., Richardson, M., Eckert, T., Behringer, M.,
              and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructure (BRSKI)", RFC 8995, DOI 10.17487/RFC8995,
              May 2021.

12.2. Informative References

   [DeMola06] De Mola, F. and R. Quitadamo, "Towards an Agent Model for
              Future Autonomic Communications", Proceedings of the 7th
              WOA 2006 Workshop From Objects to Agents 51-59, September
              2006.

   [GANA13]   "Autonomic network engineering for the self-managing
              Future Internet (AFI): GANA Architectural Reference Model
              for Autonomic Networking, Cognitive Networking and
              Self-Management.", April 2013.

   [Huebscher08]
              Huebscher, M. C. and J. A. McCann, "A survey of autonomic
              computing - degrees, models, and applications", ACM
              Computing Surveys (CSUR) Volume 40 Issue 3 DOI:
              10.1145/1380584.1380585, August 2008.

   [I-D.ietf-core-yang-cbor]
              Veillette, M., Petrov, I., Pelov, A., Bormann, C., and M.
              Richardson, "CBOR Encoding of Data Modeled with YANG",
              Work in Progress, Internet-Draft,
              draft-ietf-core-yang-cbor-18, 19 December 2021.

   [I-D.irtf-nmrg-ibn-concepts-definitions]
              Clemm, A., Ciavaglia, L., Granville, L. Z., and J.
              Tantsura, "Intent-Based Networking - Concepts and
              Definitions", Work in Progress, Internet-Draft,
              draft-irtf-nmrg-ibn-concepts-definitions-06, 15 December
              2021.

   [I-D.peloso-anima-autonomic-function]
              Peloso, P. and L. Ciavaglia, "A Day in the Life of an
              Autonomic Function", Work in Progress, Internet-Draft,
              draft-peloso-anima-autonomic-function-01, 21 March 2016.

   [IPJ]      Behringer, M., Bormann, C., Carpenter, B. E., Eckert, T.,
              Campos Nobre, J., Jiang, S., Li, Y., and M. C. Richardson,
              "Autonomic Networking Gets Serious", The Internet Protocol
              Journal Volume: 24, Issue: 3, ISSN 1944-1134, Page(s):
              2-18, October 2021.

   [Movahedi12]
              Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
              Survey of Autonomic Network Architectures and Evaluation
              Criteria", IEEE Communications Surveys & Tutorials Volume:
              14, Issue: 2 DOI: 10.1109/SURV.2011.042711.00078,
              Page(s): 464-490, 2012.

   [NFV]      "Network Functions Virtualisation - Introductory White
              Paper", SDN and OpenFlow World Congress, Darmstadt,
              Germany 1-16, October 2012.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
              Networking: Definitions and Design Goals", RFC 7575,
              DOI 10.17487/RFC7575, June 2015.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC8368]  Eckert, T., Ed. and M. Behringer, "Using an Autonomic
              Control Plane for Stable Connectivity of Network
              Operations, Administration, and Maintenance (OAM)",
              RFC 8368, DOI 10.17487/RFC8368, May 2018.

   [RFC8991]  Carpenter, B., Liu, B., Ed., Wang, W., and X. Gong,
              "GeneRic Autonomic Signaling Protocol Application Program
              Interface (GRASP API)", RFC 8991, DOI 10.17487/RFC8991,
              May 2021.

   [RFC8992]  Jiang, S., Ed., Du, Z., Carpenter, B., and Q. Sun,
              "Autonomic IPv6 Edge Prefix Management in Large-Scale
              Networks", RFC 8992, DOI 10.17487/RFC8992, May 2021.

   [RFC8993]  Behringer, M., Ed., Carpenter, B., Eckert, T., Ciavaglia,
              L., and J. Nobre, "A Reference Model for Autonomic
              Networking", RFC 8993, DOI 10.17487/RFC8993, May 2021.

   [ZSM009-1] "Zero-touch network and Service Management (ZSM); Closed-
              Loop Automation; Part 1: Enablers", June 2021.

Appendix A. Change log

   This section is to be removed before publishing as an RFC.

   draft-ietf-anima-asa-guidelines-07, 2022-02-01:

   *  Editorial

   draft-ietf-anima-asa-guidelines-06, 2022-01-27:

   *  Clarified two sentences about special-purpose ASAs (Section 3.1,
      Section 3.2).
   *  Fixed indentation bug and added one statement to MAIN PROGRAM
      pseudocode.
   *  Removed mention of software image servers in section 6.1, which
      confused the reader.
   *  Other improvements from IESG reviews.

   draft-ietf-anima-asa-guidelines-05, 2021-12-20:

   *  Clarified NETCONF wording.
   *  Removed on advice from IETF Trust
   *  Noted resource limits in constrained nodes
   *  Strengthened text on data integrity in resource management example
   *  Strengthen discussion of extensibility of GRASP objectives.
   *  Other editorial improvements from IETF Last Call reviews

   draft-ietf-anima-asa-guidelines-04, 2021-11-20:

   *  Added terminology appendix
   *  Further clarified discussion of asynch operations
   *  Other editorial improvements from AD review

   draft-ietf-anima-asa-guidelines-03, 2021-11-07:

   *  Added security consideration for gateway ASAs
   *  Cite IPJ article

   draft-ietf-anima-asa-guidelines-02, 2021-09-13:

   *  Added note on maximum message size.
   *  Editorial fixes

   draft-ietf-anima-asa-guidelines-01, 2021-06-27:

   *  Incorporated shepherd's review comments
   *  Editorial fixes

   draft-ietf-anima-asa-guidelines-00, 2020-11-14:

   *  Adopted by WG
   *  Editorial fixes

   draft-carpenter-anima-asa-guidelines-09, 2020-07-25:

   *  Additional text on future authorization.
   *  Editorial fixes

   draft-carpenter-anima-asa-guidelines-08, 2020-01-10:

   *  Introduced notion of autonomic ecosystem.
   *  Minor technical clarifications.
   *  Converted to v3 format.

   draft-carpenter-anima-asa-guidelines-07, 2019-07-17:

   *  Improved explanation of threading vs event-loop
   *  Other editorial improvements.

   draft-carpenter-anima-asa-guidelines-06, 2018-01-07:

   *  Expanded and improved example logic flow.
   *  Editorial corrections.

   draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

   *  Added section on relationship with non-autonomic components.
   *  Editorial corrections.

   draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

   *  Added note about simple ASAs.
   *  Added note about NFV/SFC services.
   *  Improved text about threading v event loop model
   *  Added section about coordination with traditional tools.
   *  Added appendix with example logic flow.

   draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

   *  Added details on life cycle.
   *  Added details on robustness.
   *  Added co-authors.

   draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

   *  Expanded description of event-loop case.
   *  Added note about 'dry run' mode.

   draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

   *  More sections filled in.

   draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

   *  Initial version

Appendix B. Terminology

   This appendix summarises various acronyms and terminology used in the
   document. Where no other reference is given, please consult
   [RFC8993] or [RFC7575].

   *  Autonomic: Self-managing (self-configuring, self-protecting, self-
      healing, self-optimizing), but allowing high-level guidance by a
      central entity such as a NOC.
   *  Autonomic Function: A function that adapts on its own to a
      changing environment.
   *  Autonomic Node: A node that employs autonomic functions.
   *  ACP: Autonomic Control Plane [RFC8994].
   *  AN: Autonomic Network: A network of autonomic nodes, which
      interact directly with each other.
   *  ANI: Autonomic Network Infrastructure.
   *  ASA: Autonomic Service Agent. An agent installed on an autonomic
      node that implements an autonomic function, either partially (in
      the case of a distributed function) or completely.
   *  BRSKI: Bootstrapping Remote Secure Key Infrastructure [RFC8995].
   *  CBOR: Concise Binary Object Representation [RFC8949].
   *  GRASP: Generic Autonomic Signaling Protocol [RFC8990].
   *  GRASP API: GRASP Application Programming Interface [RFC8991].
   *  NOC: Network Operations Center [RFC8368].
   *  Objective: A GRASP technical objective is a data structure whose
      main contents are a name and a value. The value consists of a
      single configurable parameter or a set of parameters of some kind.
      [RFC8990].

Appendix C. Example Logic Flows

   This appendix describes generic logic flows that combine to act as an
   Autonomic Service Agent (ASA) for resource management. Note that
   these are illustrative examples, and in no sense requirements. As
   long as the rules of GRASP are followed, a real implementation could
   be different. The reader is assumed to be familiar with GRASP
   [RFC8990] and its conceptual API [RFC8991].

   A complete autonomic function for a distributed resource will consist
   of a number of instances of the ASA placed at relevant points in a
   network. Specific details will of course depend on the resource
   concerned. One example is IP address prefix management, as specified
   in [RFC8992]. In this case, an instance of the ASA will exist in
   each delegating router.

   An underlying assumption is that there is an initial source of the
   resource in question, referred to here as an origin ASA. The other
   ASAs, known as delegators, obtain supplies of the resource from the
   origin, and then delegate quantities of the resource to consumers
   that request it, and recover it when no longer needed.

   Another assumption is that there is a set of network-wide policy
   parameters, which the origin will provide to the delegators. These
   parameters will control how the delegators decide how much resource
   to provide to consumers. Thus, the ASA logic has two operating
   modes: origin and delegator. When running as an origin, it starts by
   obtaining a quantity of the resource from the NOC, and it acts as a
   source of policy parameters, via both GRASP flooding and GRASP
   synchronization. (In some scenarios, flooding or synchronization
   alone might be sufficient, but this example includes both.)

   When running as a delegator, it starts with an empty resource pool,
   it acquires the policy parameters by GRASP synchronization, and it
   delegates quantities of the resource to consumers that request it.
   Both as an origin and as a delegator, when its pool is low it seeks
   quantities of the resource by requesting GRASP negotiation with peer
   ASAs. When its pool is sufficient, it hands out resource to peer
   ASAs in response to negotiation requests. Thus, over time, the
   initial resource pool held by the origin will be shared among all the
   delegators according to demand.

   In theory a network could include any number of origins and any
   number of delegators, with the only condition being that each
   origin's initial resource pool is unique. A realistic scenario is to
   have exactly one origin and as many delegators as you like. A
   scenario with no origin is useless.

   An implementation requirement is that resource pools are kept in
   stable storage. Otherwise, if a delegator exits for any reason, all
   the resources it has obtained or delegated are lost. If an origin
   exits, its entire spare pool is lost. The logic for using stable
   storage and for crash recovery is not included in the pseudocode
   below, which focuses on communication between ASAs.
   Since GRASP operations are not intrinsically idempotent, data
   integrity during failure scenarios is the responsibility of the ASA
   designer. This is a complex topic in its own right that is not
   discussed in the present document.

   The description below does not implement GRASP's 'dry run' function.
   That would require temporarily marking any resource handed out in a
   dry run negotiation as reserved, until either the peer obtains it in
   a live run, or a suitable timeout occurs.

   The main data structures used in each instance of the ASA are:

   *  The resource_pool, for example an ordered list of available
      resources. Depending on the nature of the resource, units of
      resource are split when appropriate, and a background garbage
      collector recombines split resources if they are returned to the
      pool.

   *  The delegated_list, where a delegator stores the resources it has
      given to subsidiary devices.

   Possible main logic flows are below, using a threaded implementation
   model. As noted above, alternative approaches to asynchronous
   operations are possible. The transformation to an event loop model
   should be apparent - each thread would correspond to one event in the
   event loop.

   The GRASP objectives are as follows:

   *  ["EX1.Resource", flags, loop_count, value] where the value depends
      on the resource concerned, but will typically include its size and
      identification.

   *  ["EX1.Params", flags, loop_count, value] where the value will be,
      for example, a JSON object defining the applicable parameters.

   In the outline logic flows below, these objectives are represented
   simply by their names.
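
   Before turning to the logic flows, the resource_pool described above
   can be made concrete with a small sketch. This is only one possible
   shape, under the assumption that a resource is a contiguous integer
   range written as (start, size), such as a block of addresses. The
   class and method names are hypothetical, and stable storage, which
   this appendix requires, is omitted.

```python
import threading


class ResourcePool:
    """Ordered list of free (start, size) blocks, guarded by a lock."""

    def __init__(self, blocks=()):
        self._lock = threading.Lock()
        self._free = sorted(blocks)

    def request(self, amount):
        """Take `amount` units from the first block that is large enough.

        Returns the (start, size) block handed out, or None on failure.
        """
        with self._lock:
            for i, (start, size) in enumerate(self._free):
                if size >= amount:
                    if size == amount:
                        del self._free[i]
                    else:
                        # Split: the remainder stays in the pool.
                        self._free[i] = (start + amount, size - amount)
                    return (start, amount)
            return None

    def release(self, block):
        """Return a block to the pool, keeping the list ordered."""
        with self._lock:
            self._free.append(block)
            self._free.sort()

    def merge_adjacent(self):
        """One garbage collector pass: recombine touching blocks."""
        with self._lock:
            merged = []
            for start, size in self._free:
                if merged and merged[-1][0] + merged[-1][1] == start:
                    prev_start, prev_size = merged[-1]
                    merged[-1] = (prev_start, prev_size + size)
                else:
                    merged.append((start, size))
            self._free = merged

    def total(self):
        """Number of free units currently in the pool."""
        with self._lock:
            return sum(size for _, size in self._free)
```

   In terms of the flows that follow, a NEGOTIATOR or DELEGATOR thread
   would call request() and release(), while the GARBAGE_COLLECTOR
   thread would call merge_adjacent() on its periodic timer. The single
   lock corresponds to the "associated lock" created by the main
   program.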

   MAIN PROGRAM:

   Create empty resource_pool (and an associated lock)
   Create empty delegated_list
   Determine whether to act as origin
   if origin:
       Obtain initial resource_pool contents from NOC
       Obtain value of EX1.Params from NOC
   Register ASA with GRASP
   Register GRASP objectives EX1.Resource and EX1.Params
   if origin:
       Start FLOODER thread to flood EX1.Params
       Start SYNCHRONIZER listener for EX1.Params
   Start MAIN_NEGOTIATOR thread for EX1.Resource
   if not origin:
       Obtain value of EX1.Params from GRASP flood or synchronization
       Start DELEGATOR thread
   Start GARBAGE_COLLECTOR thread
   good_peer = none
   do forever:
       if resource_pool is low:
           Calculate amount A of resource needed
           Discover peers using GRASP M_DISCOVER / M_RESPONSE
           if good_peer in peers:
               peer = good_peer
           else:
               peer = #any choice among peers
           grasp.request_negotiate("EX1.Resource", peer)
           #i.e., send negotiation request
           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
           if OK:
               if offered amount of resource sufficient:
                   Send M_END + O_ACCEPT #negotiation succeeded
                   Add resource to pool
                   good_peer = peer #remember this choice
               else:
                   Send M_END + O_DECLINE #negotiation failed
                   good_peer = none #forget this choice
       sleep() #periodic timer suitable for application scenario

   MAIN_NEGOTIATOR thread:

   do forever:
       grasp.listen_negotiate("EX1.Resource")
       #i.e., wait for negotiation request
       Start a separate new NEGOTIATOR thread for requested amount A

   NEGOTIATOR thread:

   Request resource amount A from resource_pool
   if not OK:
       while not OK and A > Amin:
           A = A-1
           Request resource amount A from resource_pool
   if OK:
       Offer resource amount A to peer by GRASP M_NEGOTIATE
       if received M_END + O_ACCEPT:
           #negotiation succeeded
       elif received M_END + O_DECLINE or other error:
           #negotiation failed
           Return resource to resource_pool
   else:
       Send M_END + O_DECLINE #negotiation failed
   #thread exits

   DELEGATOR thread:

   do forever:
       Wait for request or release for resource amount A
       if request:
           Get resource amount A from resource_pool
           if OK:
               Delegate resource to consumer #atomic
               Record in delegated_list      #operation
           else:
               Signal failure to consumer
               Signal main thread that resource_pool is low
       else:
           Delete resource from delegated_list
           Return resource amount A to resource_pool

   SYNCHRONIZER thread:

   do forever:
       Wait for M_REQ_SYN message for EX1.Params
       Reply with M_SYNCH message for EX1.Params

   FLOODER thread:

   do forever:
       Send M_FLOOD message for EX1.Params
       sleep() #periodic timer suitable for application scenario

   GARBAGE_COLLECTOR thread:

   do forever:
       Search resource_pool for adjacent resources
       Merge adjacent resources
       sleep() #periodic timer suitable for application scenario

Authors' Addresses

   Brian Carpenter
   School of Computer Science
   University of Auckland
   PB 92019
   Auckland 1142
   New Zealand

   Email: brian.e.carpenter@gmail.com

   Laurent Ciavaglia
   Rakuten Mobile
   Paris
   France

   Email: laurent.ciavaglia@rakuten.com

   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14 Huawei Campus
   156 Beiqing Road
   Hai-Dian District
   Beijing
   100095
   China

   Email: jiangsheng@huawei.com

   Pierre Peloso
   Nokia
   Villarceaux
   91460 Nozay
   France

   Email: pierre.peloso@nokia.com