Network Working Group                                    B. E. Carpenter
Internet-Draft                                         Univ. of Auckland
Intended status: Informational                              L. Ciavaglia
Expires: 13 July 2020                                              Nokia
                                                                S. Jiang
                                            Huawei Technologies Co., Ltd
                                                               P. Peloso
                                                                   Nokia
                                                         10 January 2020


                Guidelines for Autonomic Service Agents
                draft-carpenter-anima-asa-guidelines-08

Abstract

   This document proposes guidelines for the design of Autonomic Service
   Agents for autonomic networks, as a contribution to describing an
   autonomic ecosystem.  It is based on the Autonomic Network
   Infrastructure outlined in the ANIMA reference model, using the
   Autonomic Control Plane and the Generic Autonomic Signaling Protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 13 July 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Logical Structure of an Autonomic Service Agent
   3.  Interaction with the Autonomic Networking Infrastructure
     3.1.  Interaction with the security mechanisms
     3.2.  Interaction with the Autonomic Control Plane
     3.3.  Interaction with GRASP and its API
     3.4.  Interaction with policy mechanism
   4.  Interaction with Non-Autonomic Components
   5.  Design of GRASP Objectives
   6.  Life Cycle
     6.1.  Installation phase
       6.1.1.  Installation phase inputs and outputs
     6.2.  Instantiation phase
       6.2.1.  Operator's goal
       6.2.2.  Instantiation phase inputs and outputs
       6.2.3.  Instantiation phase requirements
     6.3.  Operation phase
   7.  Coordination between Autonomic Functions
   8.  Coordination with Traditional Management Functions
   9.  Robustness
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Change log [RFC Editor: Please remove]
   Appendix B.  Example Logic Flows
   Authors' Addresses

1.  Introduction

   This document proposes guidelines for the design of Autonomic Service
   Agents (ASAs) in the context of an Autonomic Network (AN) based on
   the Autonomic Network Infrastructure (ANI) outlined in the ANIMA
   reference model [I-D.ietf-anima-reference-model].  This
   infrastructure makes use of the Autonomic Control Plane (ACP)
   [I-D.ietf-anima-autonomic-control-plane] and the Generic Autonomic
   Signaling Protocol (GRASP) [I-D.ietf-anima-grasp].  This document is
   a contribution to the description of an autonomic ecosystem,
   recognizing that a deployable autonomic network needs more than just
   ACP and GRASP implementations.  It must achieve management goals that
   a Network Operations Center (NOC) cannot achieve manually, which
   requires at least a library of ASAs and corresponding GRASP objective
   definitions.  There must also be tools to deploy and oversee ASAs,
   and integration with existing operational mechanisms [RFC8368].
   However, this document focuses on the design of ASAs, with some
   reference to implementation and operational aspects.

   There is considerable literature about autonomic agents, with a
   variety of proposals about how they should be characterized.  Some
   examples are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].
   However, for the present document, the basic definitions and goals
   for autonomic networking given in [RFC7575] apply.  According to RFC
   7575, an Autonomic Service Agent is "An agent implemented on an
   autonomic node that implements an autonomic function, either in part
   (in the case of a distributed function) or whole."

   ASAs must be distinguished from other forms of software component.
   They are components of network or service management; they do not in
   themselves provide services.  For example, the services envisaged for
   network function virtualisation [RFC8568] or for service function
   chaining [RFC7665] might be managed by an ASA rather than by
   traditional configuration tools.

   The reference model [I-D.ietf-anima-reference-model] expands this by
   adding that an ASA is "a process that makes use of the features
   provided by the ANI to achieve its own goals, usually including
   interaction with other ASAs via the GRASP protocol
   [I-D.ietf-anima-grasp] or otherwise.  Of course it also interacts
   with the specific targets of its function, using any suitable
   mechanism.  Unless its function is very simple, the ASA will need to
   handle overlapping asynchronous operations.  This will require either
   a multi-threaded implementation, or a logically equivalent event loop
   structure.  It may therefore be a quite complex piece of software in
   its own right, forming part of the application layer above the ANI."

   There will certainly be very simple ASAs that manage a single
   objective in a straightforward way and do not need asynchronous
   operations.  In such a case, many aspects of the current document do
   not apply.  However, in general a basic property of an ASA is that it
   is a relatively complex software component that will in many cases
   control and monitor simpler entities in the same host or elsewhere.
   For example, a device controller that manages tens or hundreds of
   simple devices might contain a single ASA.

   The remainder of this document offers guidance on the design of such
   ASAs.

2.  Logical Structure of an Autonomic Service Agent

   As mentioned above, all but the simplest ASAs will need to support
   asynchronous operations.  Not all programming environments explicitly
   support multi-threading.
   In that case, an 'event loop' style of implementation should be
   adopted, in which each logical thread would be implemented as an
   event handler called in turn by the main loop.  For this, the GRASP
   API (Section 3.3) must provide non-blocking calls.  If necessary, the
   GRASP session identifier will be used to distinguish simultaneous
   operations.

   A typical ASA will have a main thread that performs various initial
   housekeeping actions such as:

   *  Obtain authorization credentials.

   *  Register the ASA with GRASP.

   *  Acquire relevant policy parameters.

   *  Define data structures for relevant GRASP objectives.

   *  Register with GRASP those objectives that it will actively manage.

   *  Launch a self-monitoring thread.

   *  Enter its main loop.

   The logic of the main loop will depend on the details of the
   autonomic function concerned.  Whenever asynchronous operations are
   required, extra threads will be launched, or events added to the
   event loop.  Examples include:

   *  Repeatedly flood an objective to the AN, so that any ASA can
      receive the objective's latest value.

   *  Accept incoming synchronization requests for an objective managed
      by this ASA.

   *  Accept incoming negotiation requests for an objective managed by
      this ASA, and then conduct the resulting negotiation with the
      counterpart ASA.

   *  Manage subsidiary non-autonomic devices directly.

   These threads or events should all either exit after their job is
   done, or enter a wait state for new work, to avoid blocking others
   unnecessarily.

   According to the degree of parallelism needed by the application,
   some of these threads or events might be launched in multiple
   instances.
   In particular, if negotiation sessions with other ASAs are expected
   to be long or to involve wait states, the ASA designer might allow
   for multiple simultaneous negotiating threads, with appropriate use
   of queues and locks to maintain consistency.

   The main loop itself could act as the initiator of synchronization
   requests or negotiation requests, when the ASA needs data or
   resources from other ASAs.  In particular, the main loop should watch
   for changes in policy parameters that affect its operation.  It
   should also do whatever is required to avoid unnecessary resource
   consumption, such as including an arbitrary wait time in each cycle
   of the main loop.

   The self-monitoring thread is of considerable importance.  Autonomic
   service agents must never fail.  To a large extent this depends on
   careful coding and testing, with no unhandled error returns or
   exceptions, but if there is nevertheless some sort of failure, the
   self-monitoring thread should detect it, fix it if possible, and in
   the worst case restart the entire ASA.

   Appendix B presents some example logic flows in informal pseudocode.

3.  Interaction with the Autonomic Networking Infrastructure

3.1.  Interaction with the security mechanisms

   An ASA by definition runs in an autonomic node.  Before any normal
   ASAs are started, such nodes must be bootstrapped into the autonomic
   network's secure key infrastructure in accordance with
   [I-D.ietf-anima-bootstrapping-keyinfra].  This key infrastructure
   will be used to secure the ACP (next section) and may be used by ASAs
   to set up additional secure interactions with their peers, if needed.

   Note that the secure bootstrap process itself may include
   special-purpose ASAs that run in a constrained insecure mode.

3.2.  Interaction with the Autonomic Control Plane

   In a normal autonomic network, ASAs will run as users of the ACP,
   which will provide a fully secured network environment for all
   communication with other ASAs, in most cases mediated by GRASP (next
   section).

   Note that the ACP formation process itself may include
   special-purpose ASAs that run in a constrained insecure mode.

3.3.  Interaction with GRASP and its API

   GRASP [I-D.ietf-anima-grasp] is expected to run as a separate process
   with its API [I-D.ietf-anima-grasp-api] available in user space.
   Thus ASAs may operate without special privilege, unless they need it
   for other reasons.  The ASA's view of GRASP is built around GRASP
   objectives (Section 5), defined as data structures containing
   administrative information such as the objective's unique name, and
   its current value.  The format and size of the value are not
   restricted by the protocol, except that it must be possible to
   serialise it for transmission in CBOR [RFC7049], which is no
   restriction at all in practice.

   The GRASP API should offer the following features:

   *  Registration functions, so that an ASA can register itself and the
      objectives that it manages.

   *  A discovery function, by which an ASA can discover other ASAs
      supporting a given objective.

   *  A negotiation request function, by which an ASA can start
      negotiation of an objective with a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to negotiation requests, and a set of functions to
      support negotiating steps.

   *  A synchronization function, by which an ASA can request the
      current value of an objective from a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to synchronization requests.

   *  A flood function, by which an ASA can cause the current value of
      an objective to be flooded throughout the AN so that any ASA can
      receive it.

   For further details and some additional housekeeping functions, see
   [I-D.ietf-anima-grasp-api].

   This API is intended to support the various interactions expected
   between most ASAs, such as the interactions outlined in Section 2.
   However, if ASAs require additional communication between themselves,
   they can do so using any desired protocol.  One option is to use
   GRASP discovery and synchronization as a rendez-vous mechanism
   between two ASAs, passing communication parameters such as a TCP port
   number via GRASP.  As noted above, either the ACP or in special cases
   the autonomic key infrastructure will be used to secure such
   communications.

3.4.  Interaction with policy mechanism

   At the time of writing, the policy (or "Intent") mechanism for the
   ANI is undefined and is regarded as a research topic.  It is expected
   to operate by an information distribution mechanism (e.g.
   [I-D.liu-anima-grasp-distribution]) that can reach all autonomic
   nodes, and therefore every ASA.  However, each ASA must be capable of
   operating "out of the box" in the absence of locally defined policy,
   so every ASA implementation must include carefully chosen default
   values and settings for all policy parameters.

4.  Interaction with Non-Autonomic Components

   An ASA, to have any external effects, must also interact with
   non-autonomic components of the node where it is installed.  For
   example, an ASA whose purpose is to manage a resource must interact
   with that resource.  An ASA whose purpose is to manage an entity that
   is already managed by local software must interact with that
   software.  For example, if such management is performed by NETCONF
   [RFC6241], the ASA must interact directly with the NETCONF server in
   the same node.
   This is stating the obvious, and the details are specific to each
   case, but it has an important security implication.  The ASA might
   act as a loophole by which the managed entity could penetrate the
   security boundary of the ANI.  The ASA must be designed to avoid such
   loopholes, and should if possible operate in an unprivileged mode.

   In an environment where systems are virtualized and specialized using
   techniques such as network function virtualization or network
   slicing, there will be a design choice whether ASAs are deployed once
   per physical node or once per virtual context.  A related issue is
   whether the ANI as a whole is deployed once on a physical network, or
   whether several virtual ANIs are deployed.  This aspect needs to be
   considered by the ASA designer.

5.  Design of GRASP Objectives

   The general rules for the format of GRASP Objective options, their
   names, and IANA registration are given in [I-D.ietf-anima-grasp].
   Additionally that document discusses various general considerations
   for the design of objectives, which are not repeated here.  However,
   we emphasize that the GRASP protocol does not provide transactional
   integrity.  In other words, if an ASA is capable of overlapping
   several negotiations for a given objective, then the ASA itself must
   use suitable locking techniques to avoid interference between these
   negotiations.  For example, if an ASA is allocating part of a shared
   resource to other ASAs, it needs to ensure that the same part of the
   resource is not allocated twice.  This might impact the design of the
   objective as well as the logic flow of the ASA.

   In particular, if 'dry run' mode is defined for the objective, its
   specification, and every implementation, must consider what state
   needs to be saved following a dry run negotiation, such that a
   subsequent live negotiation can be expected to succeed.
   It must be clear how long this state is kept, and what happens if the
   live negotiation occurs after this state is deleted.  An ASA that
   requests a dry run negotiation must take account of the possibility
   that a successful dry run is followed by a failed live negotiation.
   Because of these complexities, the dry run mechanism should only be
   supported by objectives and ASAs where there is a significant benefit
   from it.

   The actual value field of an objective is limited by the GRASP
   protocol definition to any data structure that can be expressed in
   Concise Binary Object Representation (CBOR) [RFC7049].  For some
   objectives, a single data item will suffice; for example an integer,
   a floating point number or a UTF-8 string.  For more complex cases, a
   simple tuple structure such as [item1, item2, item3] could be used.
   Nothing prevents using other formats such as JSON, but this requires
   the ASA to be capable of parsing and generating JSON.  The formats
   acceptable by the GRASP API will limit the options in practice.  A
   fallback solution is for the API to accept and deliver the value
   field in raw CBOR, with the ASA itself encoding and decoding it via a
   CBOR library.

   Note that a mapping from YANG to CBOR is defined by
   [I-D.ietf-core-yang-cbor].  Subject to the size limit defined for
   GRASP messages, nothing prevents objectives using YANG in this way.

6.  Life Cycle

   Autonomic functions could be permanent, in the sense that ASAs are
   shipped as part of a product and persist throughout the product's
   life.  However, a more likely situation is that ASAs need to be
   installed or updated dynamically, because of new requirements or
   bugs.  Because continuity of service is fundamental to autonomic
   networking, the process of seamlessly replacing a running instance of
   an ASA with a new version needs to be part of the ASA's design.
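
   As an informal illustration, the life cycle states and transitions
   drawn in Figure 1 can be written down as a small state machine.
   This is only a sketch: the Python names (AsaState, TRANSITIONS,
   next_state) and the event labels are assumptions chosen for this
   example, and are not defined by this document or by the GRASP API.

```python
from enum import Enum


class AsaState(Enum):
    """Life-cycle states shown in Figure 1 (identifiers are illustrative)."""
    UNDEPLOYED = "undeployed"
    INSTALLED = "installed"
    INSTANTIATED = "instantiated"
    OPERATIONAL = "operational"


# (current state, event) -> next state.  The event labels are hypothetical
# shorthand for the arrows in Figure 1.
TRANSITIONS = {
    (AsaState.UNDEPLOYED, "install"): AsaState.INSTALLED,
    (AsaState.INSTALLED, "uninstall"): AsaState.UNDEPLOYED,
    (AsaState.INSTALLED, "receive_mandate"): AsaState.INSTANTIATED,
    (AsaState.INSTANTIATED, "revoke_mandate"): AsaState.INSTALLED,
    (AsaState.INSTANTIATED, "set_up"): AsaState.OPERATIONAL,
    (AsaState.OPERATIONAL, "set_down"): AsaState.INSTANTIATED,
}


def next_state(state, event):
    """Return the state reached from `state` when `event` occurs.

    Raises ValueError for any transition not drawn in Figure 1, so that
    an out-of-order request is rejected rather than silently applied.
    """
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{event!r} is not allowed in state {state.name}"
        ) from None
```

   A table-driven design of this kind keeps the permitted transitions
   in one place, which may make it easier to check that replacing a
   running ASA instance only passes through states the operator
   expects.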

   The implication of service continuity on the design of ASAs can be
   illustrated along the three main phases of the ASA life-cycle, namely
   Installation, Instantiation and Operation.

                   +--------------+
 Undeployed ------>|              |------> Undeployed
                   |  Installed   |
               +-->|              |---+
       Mandate |   +--------------+   | Receives a
    is revoked |   +--------------+   | Mandate
               +---|              |<--+
                   | Instantiated |
               +-->|              |---+
           set |   +--------------+   | set
          down |   +--------------+   | up
               +---|              |<--+
                   | Operational  |
                   |              |
                   +--------------+

           Figure 1: Life cycle of an Autonomic Service Agent

6.1.  Installation phase

   Before being able to instantiate and run ASAs, the operator must
   first provision the infrastructure with the sets of ASA software
   corresponding to its needs and objectives.  The provisioning of the
   infrastructure is realized in the installation phase and consists of
   installing (or checking the availability of) the pieces of software
   of the different ASA classes in a set of Installation Hosts.

   There are three properties applicable to the installation of ASAs:

   The dynamic installation property  allows installing an ASA on
      demand, on any hosts compatible with the ASA.

   The decoupling property  allows controlling resources of a NE from a
      remote ASA, i.e. an ASA installed on a host machine different from
      the resources' NE.

   The multiplicity property  allows controlling multiple sets of
      resources from a single ASA.

   These three properties are very important in the context of the
   installation phase, as their variations determine how the ASA class
   can be installed on the infrastructure.

6.1.1.  Installation phase inputs and outputs

   Inputs are:

   [ASA class of type_x]  that specifies which classes of ASAs to
      install,

   [Installation_target_Infrastructure]  that specifies the candidate
      Installation Hosts,

   [ASA class placement function, e.g. under which criteria/constraints
   as defined by the operator]
      that specifies how the installation phase shall meet the
      operator's needs and objectives for the provision of the
      infrastructure.  In the coupled mode, the placement function is
      not necessary, whereas in the decoupled mode, the placement
      function is mandatory, even though it can be as simple as an
      explicit list of Installation Hosts.

   The main output of the installation phase is an up-to-date directory
   of installed ASAs which corresponds to [list of ASA classes]
   installed on [list of installation Hosts].  This output is also
   useful for the coordination function and corresponds to the static
   interaction map (see next section).

   The condition for moving to the next phase is that [list of ASA
   classes] are correctly installed on [list of installation Hosts].
   The state of the ASA at the end of the installation phase is:
   installed (not instantiated).  The following commands or messages are
   foreseen: install(list of ASA classes,
   Installation_target_Infrastructure, ASA class placement function),
   and un-install(list of ASA classes).

6.2.  Instantiation phase

   Once the ASAs are installed on the appropriate hosts in the network,
   these ASAs may start to operate.  From the operator viewpoint, an
   operating ASA means the ASA manages the network resources as per the
   objectives given.  At the ASA local level, operating means executing
   their control loop/algorithm.

   But right before that, there are two things to take into
   consideration.  First, there is a difference between (1) having a
   piece of code available to run on a host and (2) having an agent
   based on this piece of code running inside the host.
   Second, in a coupled case, determining which resources are controlled
   by an ASA is straightforward (the determination is embedded), whereas
   in a decoupled mode determining this is a bit more complex (hence a
   starting agent will have to either discover it or be taught it).

   The instantiation phase of an ASA covers both these aspects: starting
   the agent piece of code (when this does not start automatically) and
   determining which resources have to be controlled (when this is not
   obvious).

6.2.1.  Operator's goal

   Through this phase, the operator wants to control two aspects of its
   autonomic network:

   1.  determine the scope of autonomic functions by instructing which
       of the network resources have to be managed by which autonomic
       function (and more precisely which class, e.g. (1) version X or
       version Y, or (2) provider A or provider B),

   2.  determine how the autonomic functions are organized by
       instructing which ASAs have to interact with which other ASAs (or
       more precisely which set of network resources have to be handled
       as an autonomous group by their managing ASAs).

   Additionally in this phase, the operator may want to set objectives
   for autonomic functions, by configuring the ASAs' technical
   objectives.

   The operator's goal can be summarized in an instruction to the ANIMA
   ecosystem matching the following pattern:

      [ASA of type_x instances] ready to control
      [Instantiation_target_Infrastructure] with
      [Instantiation_target_parameters]

6.2.2.  Instantiation phase inputs and outputs

   Inputs are:

   [ASA of type_x instances]  that specifies which ASAs are to be
      targeted (and more precisely which class, e.g. (1) version X or
      version Y, or (2) provider A or provider B),

   [Instantiation_target_Infrastructure]  that specifies which resources
      are to be managed by the autonomic function; this can be the whole
      network or a subset of it such as a domain, a technology segment
      or even a specific list of resources,

   [Instantiation_target_parameters]  that specifies which technical
      objectives are to be set for the ASAs (e.g. an optimization
      target).

   Outputs are:

   [Set of ASAs - Resources relations]  describing which resources are
      managed by which ASA instances; this is not a formal message, but
      a resulting configuration of a set of ASAs.

6.2.3.  Instantiation phase requirements

   The instructions described in Section 6.2.1 could be either:

   sent to a targeted ASA  In which case, the receiving Agent will have
      to manage the specified list of
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   broadcast to all ASAs  In which case, the ASAs would collectively
      determine from the list which Agent(s) would handle which
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   This set of instructions can be materialized through a message that
   is named an Instance Mandate (description TBD).

   The outcome of this instantiation phase is a ready-to-operate ASA (or
   interacting set of ASAs).  This ASA (or these ASAs) can then describe
   themselves by listing the resources they manage and what this means
   in terms of metrics being monitored and in terms of actions that can
   be executed (such as modifying parameter values).  A message
   conveying such a self-description is named an Instance Manifest
   (description TBD).
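
   Since the formats of the Instance Mandate and the Instance Manifest
   are still to be defined, the following is only a hypothetical sketch
   of the information they are described as carrying above: the mandate
   names the target resources and technical objectives, and the
   manifest reports the managed resources together with the monitored
   metrics and the available actions.  Every class, field and function
   name, and every sample value, is an assumption made for this
   illustration and not a proposed format.

```python
from dataclasses import dataclass, field


@dataclass
class InstanceMandate:
    """Hypothetical container for the instantiation instruction."""
    asa_class: str               # [ASA of type_x instances]
    target_infrastructure: list  # [Instantiation_target_Infrastructure]
    # [Instantiation_target_parameters], e.g. an optimization target
    target_parameters: dict = field(default_factory=dict)


@dataclass
class InstanceManifest:
    """Hypothetical self-description produced by an instantiated ASA."""
    asa_class: str
    managed_resources: list
    monitored_metrics: list
    available_actions: list


def build_manifest(mandate, metrics, actions):
    """Describe what an ASA manages once it has accepted `mandate`."""
    return InstanceManifest(
        asa_class=mandate.asa_class,
        managed_resources=sorted(mandate.target_infrastructure),
        monitored_metrics=list(metrics),
        available_actions=list(actions),
    )


# Made-up example values, for illustration only.
mandate = InstanceMandate(
    asa_class="load-balancing-asa",
    target_infrastructure=["router-b", "router-a"],
    target_parameters={"max_link_load": 0.8},
)
manifest = build_manifest(mandate, ["link_load"], ["set_link_metric"])
```

   Whatever encoding is eventually chosen, a manifest of this shape
   would have to be serialisable in CBOR if it is carried as the value
   of a GRASP objective.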

   Though the operator may well use such a self-description "per se",
   the final goal of such a description is to be shared with other ANIMA
   entities such as:

   *  the coordination entities (see [I-D.ciavaglia-anima-coordination])

   *  collaborative entities, for the purpose of establishing knowledge
      exchanges (some ASAs may produce knowledge or monitor metrics that
      other ASAs cannot obtain by themselves, although these would be
      useful for their execution)

6.3.  Operation phase

   Note: This section is to be further developed in future revisions of
   the document, especially the implications on the design of ASAs.

   During the Operation phase, the operator can:

   Activate/Deactivate ASAs:  enabling or disabling the execution of
      their autonomic loop.

   Modify ASAs' targets:  setting them different objectives.

   Modify ASAs' managed resources:  updating the instance mandate to
      specify a different set of resources to manage (only applicable to
      decoupled ASAs).

   During the Operation phase, running ASAs can interact with one
   another:

      in order to exchange knowledge (e.g. an ASA providing traffic
      predictions to a load balancing ASA)

      in order to collaboratively reach an objective (e.g. ASAs
      pertaining to the same autonomic function and targeted to manage a
      network domain will collaborate; in the case of load balancing, by
      modifying link metrics according to the loads of neighboring
      resources)

   During the Operation phase, running ASAs are expected to apply
   coordination schemes, then execute their control loop under
   coordination supervision/instructions.

   The ASA life-cycle is discussed in more detail in "A Day in the Life
   of an Autonomic Function" [I-D.peloso-anima-autonomic-function].

7.  Coordination between Autonomic Functions

   Some autonomic functions will be completely independent of each
   other.
   However, others are at risk of interfering with each other - for
   example, two different optimization functions might both attempt to
   modify the same underlying parameter in different ways.  In a
   complete system, a method is needed of identifying ASAs that might
   interfere with each other and coordinating their actions when
   necessary.  This issue is considered in "Autonomic Functions
   Coordination" [I-D.ciavaglia-anima-coordination].

8.  Coordination with Traditional Management Functions

   Some ASAs will have functions that overlap with existing
   configuration tools and network management mechanisms such as command
   line interfaces, DHCP, DHCPv6, SNMP, NETCONF, RESTCONF and YANG-based
   solutions.  Each ASA designer will need to consider this issue and
   how to avoid clashes and inconsistencies.  Some specific
   considerations for interaction with OAM tools are given in [RFC8368].
   As another example, [I-D.ietf-anima-prefix-management] describes how
   autonomic management of IPv6 prefixes can interact with prefix
   delegation via DHCPv6.  The description of a GRASP objective and of
   an ASA using it should include a discussion of any such interactions.

   A related aspect is that management functions often include a data
   model, quite likely to be expressed in a formal notation such as
   YANG.  This aspect should not be an afterthought in the design of an
   ASA.  To the contrary, the design of the ASA and of its GRASP
   objectives should match the data model; as noted above, YANG
   serialized as CBOR may be used directly as the value of a GRASP
   objective.

9.  Robustness

   It is of great importance that all components of an autonomic system
   are highly robust.  In principle they must never fail.  This section
   lists various aspects of robustness that ASA designers should
   consider.

   1.  If despite all precautions, an ASA does encounter a fatal error,
       it should in any case restart automatically and try again.  To
       mitigate a hard loop in case of persistent failure, a suitable
       pause should be inserted before such a restart.  The length of
       the pause depends on the use case.

   2.  If a newly received or calculated value for a parameter falls out
       of bounds, the corresponding parameter should be either left
       unchanged or restored to a safe value.

   3.  If a GRASP synchronization or negotiation session fails for any
       reason, it may be repeated after a suitable pause.  The length of
       the pause depends on the use case.

   4.  If a session fails repeatedly, the ASA should consider that its
       peer has failed, and cause GRASP to flush its discovery cache and
       repeat peer discovery.

   5.  In any case, it may be prudent to repeat discovery periodically,
       depending on the use case.

   6.  Any received GRASP message should be checked.  If it is wrongly
       formatted, it should be ignored.  Within a unicast session, an
       Invalid message (M_INVALID) may be sent.  This function may be
       provided by the GRASP implementation itself.

   7.  Any received GRASP objective should be checked.  If it is wrongly
       formatted, it should be ignored.  Within a negotiation session, a
       Negotiation End message (M_END) with a Decline option (O_DECLINE)
       should be sent.  An ASA may log such events for diagnostic
       purposes.

   8.  If an ASA receives either an Invalid message (M_INVALID) or a
       Negotiation End message (M_END) with a Decline option
       (O_DECLINE), one possible reason is that the peer ASA does not
       support a new feature of either GRASP or of the objective in
       question.  In such a case the ASA may choose to repeat the
       operation concerned without using that new feature.

   9.  All other possible exceptions should be handled in an orderly
       way.
There should be no such thing as an unhandled exception 681 (but see point 1 above). 683 10. Security Considerations 685 ASAs are intended to run in an environment that is protected by the 686 Autonomic Control Plane [I-D.ietf-anima-autonomic-control-plane], 687 admission to which depends on an initial secure bootstrap process 688 [I-D.ietf-anima-bootstrapping-keyinfra]. In some deployments, a 689 secure partition of the link layer might be used instead 690 [I-D.carpenter-anima-l2acp-scenarios]. However, this does not 691 relieve ASAs of responsibility for security. In particular, when 692 ASAs configure or manage network elements outside the ACP, they must 693 use secure techniques and carefully validate any incoming 694 information. As noted above, this will apply in particular when an 695 ASA interacts with a management component such as a NETCONF server. 697 As appropriate to their specific functions, ASAs should take account 698 of relevant privacy considerations [RFC6973]. 700 Authorization of ASAs is a subject for future study. At present, 701 ASAs are trusted by virtue of being installed on a node that has 702 successfully joined the ACP. 704 11. IANA Considerations 706 This document makes no request of the IANA. 708 12. Acknowledgements 710 Useful comments were received from Toerless Eckert, Alex Galis, Bing 711 Liu, and other members of the ANIMA WG. 713 13. References 715 13.1. Normative References 717 [I-D.ietf-anima-autonomic-control-plane] 718 Eckert, T., Behringer, M., and S. Bjarnason, "An Autonomic 719 Control Plane (ACP)", Work in Progress, Internet-Draft, 720 draft-ietf-anima-autonomic-control-plane-21, 3 November 721 2019, . 724 [I-D.ietf-anima-bootstrapping-keyinfra] 725 Pritikin, M., Richardson, M., Eckert, T., Behringer, M., 726 and K. Watsen, "Bootstrapping Remote Secure Key 727 Infrastructures (BRSKI)", Work in Progress, Internet- 728 Draft, draft-ietf-anima-bootstrapping-keyinfra-34, 3 729 January 2020, . 
   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic
              Autonomic Signaling Protocol (GRASP)", Work in Progress,
              Internet-Draft, draft-ietf-anima-grasp-15, 13 July 2017.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

13.2.  Informative References

   [DeMola06] De Mola, F. and R. Quitadamo, "An Agent Model for Future
              Autonomic Communications", Proceedings of the 7th WOA 2006
              Workshop From Objects to Agents 51-59, September 2006.

   [GANA13]   "Autonomic network engineering for the self-managing
              Future Internet (AFI): GANA Architectural Reference Model
              for Autonomic Networking, Cognitive Networking and
              Self-Management.", April 2013.

   [Huebscher08]
              Huebscher, M. C. and J. A. McCann, "A survey of autonomic
              computing--degrees, models, and applications", ACM
              Computing Surveys (CSUR) Volume 40 Issue 3 DOI:
              10.1145/1380584.1380585, August 2008.

   [I-D.carpenter-anima-l2acp-scenarios]
              Carpenter, B. and B. Liu, "Scenarios and Requirements for
              Layer 2 Autonomic Control Planes", Work in Progress,
              Internet-Draft, draft-carpenter-anima-l2acp-scenarios-01,
              2 October 2019.

   [I-D.ciavaglia-anima-coordination]
              Ciavaglia, L. and P. Peloso, "Autonomic Functions
              Coordination", Work in Progress, Internet-Draft,
              draft-ciavaglia-anima-coordination-01, 21 March 2016.

   [I-D.ietf-anima-grasp-api]
              Carpenter, B., Liu, B., Wang, W., and X. Gong, "Generic
              Autonomic Signaling Protocol Application Program Interface
              (GRASP API)", Work in Progress, Internet-Draft,
              draft-ietf-anima-grasp-api-04, 6 October 2019.

   [I-D.ietf-anima-prefix-management]
              Jiang, S., Du, Z., Carpenter, B., and Q. Sun, "Autonomic
              IPv6 Edge Prefix Management in Large-scale Networks", Work
              in Progress, Internet-Draft,
              draft-ietf-anima-prefix-management-07, 18 December 2017.

   [I-D.ietf-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              and J. Nobre, "A Reference Model for Autonomic
              Networking", Work in Progress, Internet-Draft,
              draft-ietf-anima-reference-model-10, 22 November 2018.

   [I-D.ietf-core-yang-cbor]
              Veillette, M., Petrov, I., and A. Pelov, "CBOR Encoding of
              Data Modeled with YANG", Work in Progress, Internet-Draft,
              draft-ietf-core-yang-cbor-11, 11 September 2019.

   [I-D.liu-anima-grasp-distribution]
              Liu, B., Xiao, X., Hecker, A., Jiang, S., and Z.
              Despotovic, "Information Distribution in Autonomic
              Networking", Work in Progress, Internet-Draft,
              draft-liu-anima-grasp-distribution-13, 12 December 2019.

   [I-D.peloso-anima-autonomic-function]
              Peloso, P. and L. Ciavaglia, "A Day in the Life of an
              Autonomic Function", Work in Progress, Internet-Draft,
              draft-peloso-anima-autonomic-function-01, 21 March 2016.

   [Movahedi12]
              Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
              Survey of Autonomic Network Architectures and Evaluation
              Criteria", IEEE Communications Surveys & Tutorials Volume:
              14, Issue: 2 DOI: 10.1109/SURV.2011.042711.00078,
              Page(s): 464 - 490, 2012.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L.
              Ciavaglia, "Autonomic Networking: Definitions and Design
              Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

   [RFC8368]  Eckert, T., Ed. and M. Behringer, "Using an Autonomic
              Control Plane for Stable Connectivity of Network
              Operations, Administration, and Maintenance (OAM)",
              RFC 8368, DOI 10.17487/RFC8368, May 2018.

   [RFC8568]  Bernardos, CJ., Rahman, A., Zuniga, JC., Contreras, LM.,
              Aranda, P., and P. Lynch, "Network Virtualization Research
              Challenges", RFC 8568, DOI 10.17487/RFC8568, April 2019.

Appendix A.  Change log [RFC Editor: Please remove]

   draft-carpenter-anima-asa-guidelines-08, 2020-01-10:

   *  Introduced notion of autonomic ecosystem.
   *  Minor technical clarifications.
   *  Converted to v3 format.

   draft-carpenter-anima-asa-guidelines-07, 2019-07-17:

   *  Improved explanation of threading vs event-loop
   *  Other editorial improvements.

   draft-carpenter-anima-asa-guidelines-06, 2019-01-07:

   *  Expanded and improved example logic flow.
   *  Editorial corrections.

   draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

   *  Added section on relationship with non-autonomic components.
   *  Editorial corrections.

   draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

   *  Added note about simple ASAs.
   *  Added note about NFV/SFC services.
   *  Improved text about threading v event loop model
   *  Added section about coordination with traditional tools.
   *  Added appendix with example logic flow.

   draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

   *  Added details on life cycle.
   *  Added details on robustness.
   *  Added co-authors.

   draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

   *  Expanded description of event-loop case.
   *  Added note about 'dry run' mode.

   draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

   *  More sections filled in.

   draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

   *  Initial version

Appendix B.  Example Logic Flows

   This appendix describes generic logic flows for an Autonomic Service
   Agent (ASA) for resource management. Note that these are
   illustrative examples, and in no sense requirements. As long as the
   rules of GRASP are followed, a real implementation could be
   different. The reader is assumed to be familiar with GRASP
   [I-D.ietf-anima-grasp] and its conceptual API
   [I-D.ietf-anima-grasp-api].

   A complete autonomic function for a resource would consist of a
   number of instances of the ASA placed at relevant points in a
   network. Specific details will of course depend on the resource
   concerned. One example is IP address prefix management, as specified
   in [I-D.ietf-anima-prefix-management]. In this case, an instance of
   the ASA would exist in each delegating router.

   An underlying assumption is that there is an initial source of the
   resource in question, referred to here as a master ASA. The other
   ASAs, known as delegators, obtain supplies of the resource from the
   master, and then delegate quantities of the resource to consumers
   that request it, and recover it when no longer needed.

   Another assumption is that there is a set of network-wide policy
   parameters, which the master will provide to the delegators. These
   parameters will control how the delegators decide how much resource
   to provide to consumers. Thus the ASA logic has two operating modes:
   master and delegator. When running as a master, it starts by
   obtaining a quantity of the resource from the NOC, and it acts as a
   source of policy parameters, via both GRASP flooding and GRASP
   synchronization. (In some scenarios, flooding or synchronization
   alone might be sufficient, but this example includes both.)

   When running as a delegator, it starts with an empty resource pool,
   it acquires the policy parameters by GRASP synchronization, and it
   delegates quantities of the resource to consumers that request it.
   Both as a master and as a delegator, when its pool is low it seeks
   quantities of the resource by requesting GRASP negotiation with peer
   ASAs. When its pool is sufficient, it hands out resource to peer
   ASAs in response to negotiation requests. Thus, over time, the
   initial resource pool held by the master will be shared among all the
   delegators according to demand.

   In theory a network could include any number of masters and any
   number of delegators, with the only condition being that each
   master's initial resource pool is unique. A realistic scenario is to
   have exactly one master and as many delegators as you like. A
   scenario with no master is useless.

   An implementation requirement is that resource pools are kept in
   stable storage. Otherwise, if a delegator exits for any reason, all
   the resources it has obtained or delegated are lost. If a master
   exits, its entire spare pool is lost. The logic for using stable
   storage and for crash recovery is not included in the pseudocode
   below.

   The description below does not implement GRASP's 'dry run' function.
   That would require temporarily marking any resource handed out in a
   dry run negotiation as reserved, until either the peer obtains it in
   a live run, or a suitable timeout expires.

   The main data structures used in each instance of the ASA are:

   *  The resource_pool, for example an ordered list of available
      resources. Depending on the nature of the resource, units of
      resource are split when appropriate, and a background garbage
      collector recombines split resources if they are returned to the
      pool.

   *  The delegated_list, where a delegator stores the resources it has
      given to consumer routers.

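   As an illustration only, the two data structures above might be
   sketched in Python as follows. The representation of a unit of
   resource as a (start, size) pair, the first-fit allocation policy,
   the use of a dictionary for the delegated_list, and all class and
   method names are assumptions made for this sketch; they are not
   defined by GRASP or by this document. Stable storage, which the
   text above requires, is deliberately omitted.

```python
import threading
from typing import Dict, List, Optional, Tuple

# Hypothetical unit of resource: a contiguous range (start, size),
# e.g. a block of an address space. Not defined by the draft.
Block = Tuple[int, int]


class ResourcePool:
    """Ordered list of available blocks, protected by a lock."""

    def __init__(self, blocks: Optional[List[Block]] = None) -> None:
        self._lock = threading.Lock()
        self._free: List[Block] = sorted(blocks or [])

    def request(self, amount: int) -> Optional[Block]:
        """Take `amount` units from the first block that is large
        enough, splitting that block if it is larger than needed."""
        with self._lock:
            for i, (start, size) in enumerate(self._free):
                if size >= amount:
                    if size == amount:
                        del self._free[i]
                    else:
                        # Split: keep the remainder in the pool.
                        self._free[i] = (start + amount, size - amount)
                    return (start, amount)
            return None  # pool is low; the caller may negotiate for more

    def release(self, block: Block) -> None:
        """Return a block to the pool, keeping the list ordered."""
        with self._lock:
            self._free.append(block)
            self._free.sort()

    def merge_adjacent(self) -> None:
        """One pass of the background garbage collector: recombine
        blocks that are contiguous."""
        with self._lock:
            merged: List[Block] = []
            for start, size in self._free:
                if merged and merged[-1][0] + merged[-1][1] == start:
                    prev_start, prev_size = merged[-1]
                    merged[-1] = (prev_start, prev_size + size)
                else:
                    merged.append((start, size))
            self._free = merged

    def snapshot(self) -> List[Block]:
        """Copy of the current free list, for inspection."""
        with self._lock:
            return list(self._free)


if __name__ == "__main__":
    pool = ResourcePool([(0, 16)])
    # delegated_list: which consumer currently holds which block.
    delegated_list: Dict[str, Block] = {}

    block = pool.request(4)
    if block is not None:
        delegated_list["consumer-1"] = block

    # The consumer releases its resource; the collector recombines it.
    pool.release(delegated_list.pop("consumer-1"))
    pool.merge_adjacent()
    print(pool.snapshot())
```

   A single lock around the whole pool keeps the sketch simple; a real
   ASA with many concurrent NEGOTIATOR threads might prefer a different
   locking or allocation strategy.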
   Possible main logic flows are below, using a threaded implementation
   model. The transformation to an event loop model should be apparent
   - each thread would correspond to one event in the event loop.

   The GRASP objectives are as follows:

   *  ["EX1.Resource", flags, loop_count, value] where the value depends
      on the resource concerned, but will typically include its size and
      identification.

   *  ["EX1.Params", flags, loop_count, value] where the value will be,
      for example, a JSON object defining the applicable parameters.

   In the outline logic flows below, these objectives are represented
   simply by their names.

   MAIN PROGRAM:

   Create empty resource_pool (and an associated lock)
   Create empty delegated_list
   Determine whether to act as master
   if master:
       Obtain initial resource_pool contents from NOC
       Obtain value of EX1.Params from NOC
   Register ASA with GRASP
   Register GRASP objectives EX1.Resource and EX1.Params
   if master:
       Start FLOODER thread to flood EX1.Params
       Start SYNCHRONIZER listener for EX1.Params
   Start MAIN_NEGOTIATOR thread for EX1.Resource
   if not master:
       Obtain value of EX1.Params from GRASP flood or synchronization
       Start DELEGATOR thread
   Start GARBAGE_COLLECTOR thread
   good_peer = none    #set once, so that a good peer is remembered
   do forever:
       if resource_pool is low:
           Calculate amount A of resource needed
           Discover peers using GRASP M_DISCOVER / M_RESPONSE
           if good_peer in peers:
               peer = good_peer
           else:
               peer = #any choice among peers
           grasp.request_negotiate("EX1.Resource", peer)
           #i.e., send M_REQ_NEG
           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
           if OK:
               if offered amount of resource sufficient:
                   Send M_END + O_ACCEPT #negotiation succeeded
                   Add resource to pool
                   good_peer = peer
               else:
                   Send M_END + O_DECLINE #negotiation failed
       sleep() #sleep time depends on application scenario

   MAIN_NEGOTIATOR thread:

   do forever:
       grasp.listen_negotiate("EX1.Resource")
       #i.e., wait for M_REQ_NEG
       Start a separate new NEGOTIATOR thread for requested amount A

   NEGOTIATOR thread:

   Request resource amount A from resource_pool
   if not OK:
       while not OK and A > Amin:
           A = A-1
           Request resource amount A from resource_pool
   if OK:
       Offer resource amount A to peer by GRASP M_NEGOTIATE
       if received M_END + O_ACCEPT:
           #negotiation succeeded
       elif received M_END + O_DECLINE or other error:
           #negotiation failed
           Return resource amount A to resource_pool
   else:
       Send M_END + O_DECLINE #negotiation failed

   DELEGATOR thread:

   do forever:
       Wait for request or release for resource amount A
       if request:
           Get resource amount A from resource_pool
           if OK:
               Delegate resource to consumer
               Record in delegated_list
           else:
               Signal failure to consumer
               Signal main thread that resource_pool is low
       else:
           Delete resource from delegated_list
           Return resource amount A to resource_pool

   SYNCHRONIZER thread:

   do forever:
       Wait for M_REQ_SYN message for EX1.Params
       Reply with M_SYNCH message for EX1.Params

   FLOODER thread:

   do forever:
       Send M_FLOOD message for EX1.Params
       sleep() #sleep time depends on application scenario

   GARBAGE_COLLECTOR thread:

   do forever:
       Search resource_pool for adjacent resources
       Merge adjacent resources
       sleep() #sleep time depends on application scenario

Authors' Addresses

   Brian Carpenter
   School of Computer Science
   University of Auckland
   PB 92019
   Auckland 1142
   New Zealand

   Email: brian.e.carpenter@gmail.com


   Laurent Ciavaglia
   Nokia
   Villarceaux
   91460 Nozay
   France

   Email: laurent.ciavaglia@nokia.com


   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14 Huawei Campus
   156 Beiqing Road
   Hai-Dian District
   Beijing
   100095
   China

   Email: jiangsheng@huawei.com


   Pierre Peloso
   Nokia
   Villarceaux
   91460 Nozay
   France

   Email: pierre.peloso@nokia.com