idnits 2.17.1

draft-carpenter-anima-asa-guidelines-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 401 has weird spacing: '...roperty allow...'

  == Line 404 has weird spacing: '...roperty allow...'

  == Line 408 has weird spacing: '...roperty allow...'

  -- The document date (July 7, 2019) is 1755 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-30) exists of
     draft-ietf-anima-autonomic-control-plane-19

  == Outdated reference: A later version (-45) exists of
     draft-ietf-anima-bootstrapping-keyinfra-22

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

  == Outdated reference: A later version (-02) exists of
     draft-carpenter-anima-l2acp-scenarios-00

  == Outdated reference: A later version (-10) exists of
     draft-ietf-anima-grasp-api-03

  == Outdated reference: A later version (-20) exists of
     draft-ietf-core-yang-cbor-10

     Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 2 comments (--).
     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  Network Working Group                                       B. Carpenter
3  Internet-Draft                                         Univ. of Auckland
4  Intended status: Informational                              L. Ciavaglia
5  Expires: January 8, 2020                                           Nokia
6                                                                  S. Jiang
7                                              Huawei Technologies Co., Ltd
8                                                                 P. Peloso
9                                                                     Nokia
10                                                             July 7, 2019

12               Guidelines for Autonomic Service Agents
13               draft-carpenter-anima-asa-guidelines-07

15  Abstract

17     This document proposes guidelines for the design of Autonomic Service
18     Agents for autonomic networks.  It is based on the Autonomic Network
19     Infrastructure outlined in the ANIMA reference model, making use of
20     the Autonomic Control Plane and the Generic Autonomic Signaling
21     Protocol.

23  Status of This Memo

25     This Internet-Draft is submitted in full conformance with the
26     provisions of BCP 78 and BCP 79.

28     Internet-Drafts are working documents of the Internet Engineering
29     Task Force (IETF).  Note that other groups may also distribute
30     working documents as Internet-Drafts.  The list of current Internet-
31     Drafts is at https://datatracker.ietf.org/drafts/current/.

33     Internet-Drafts are draft documents valid for a maximum of six months
34     and may be updated, replaced, or obsoleted by other documents at any
35     time.  It is inappropriate to use Internet-Drafts as reference
36     material or to cite them other than as "work in progress."

38     This Internet-Draft will expire on January 8, 2020.

40  Copyright Notice

42     Copyright (c) 2019 IETF Trust and the persons identified as the
43     document authors.  All rights reserved.

45     This document is subject to BCP 78 and the IETF Trust's Legal
46     Provisions Relating to IETF Documents
47     (https://trustee.ietf.org/license-info) in effect on the date of
48     publication of this document.  Please review these documents
49     carefully, as they describe your rights and restrictions with respect
50     to this document.  Code Components extracted from this document must
51     include Simplified BSD License text as described in Section 4.e of
52     the Trust Legal Provisions and are provided without warranty as
53     described in the Simplified BSD License.

55  Table of Contents

57     1.  Introduction
58     2.  Logical Structure of an Autonomic Service Agent
59     3.  Interaction with the Autonomic Networking Infrastructure
60       3.1.  Interaction with the security mechanisms
61       3.2.  Interaction with the Autonomic Control Plane
62       3.3.  Interaction with GRASP and its API
63       3.4.  Interaction with policy mechanism
64     4.  Interaction with Non-Autonomic Components
65     5.  Design of GRASP Objectives
66     6.  Life Cycle
67       6.1.  Installation phase
68         6.1.1.  Installation phase inputs and outputs
69       6.2.  Instantiation phase
70         6.2.1.  Operator's goal
71         6.2.2.  Instantiation phase inputs and outputs
72         6.2.3.  Instantiation phase requirements
73       6.3.  Operation phase
74     7.  Coordination between Autonomic Functions
75     8.  Coordination with Traditional Management Functions
76     9.  Robustness
77     10. Security Considerations
78     11. IANA Considerations
79     12. Acknowledgements
80     13. References
81       13.1.  Normative References
82       13.2.  Informative References
83     Appendix A.  Change log [RFC Editor: Please remove]
84     Appendix B.  Example Logic Flows
85     Authors' Addresses

87  1.  Introduction

89     This document proposes guidelines for the design of Autonomic Service
90     Agents (ASAs) in the context of an Autonomic Network (AN) based on
91     the Autonomic Network Infrastructure (ANI) outlined in the ANIMA
92     reference model [I-D.ietf-anima-reference-model].  This
93     infrastructure makes use of the Autonomic Control Plane (ACP)
94     [I-D.ietf-anima-autonomic-control-plane] and the Generic Autonomic
95     Signaling Protocol (GRASP) [I-D.ietf-anima-grasp].

97     There is a considerable literature about autonomic agents with a
98     variety of proposals about how they should be characterized.  Some
99     examples are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].
100     However, for the present document, the basic definitions and goals
101     for autonomic networking given in [RFC7575] apply.  According to RFC
102     7575, an Autonomic Service Agent is "An agent implemented on an
103     autonomic node that implements an autonomic function, either in part
104     (in the case of a distributed function) or whole."

106     ASAs must be distinguished from other forms of software component.
107     They are components of network or service management; they do not in
108     themselves provide services.  For example, the services envisaged for
109     network function virtualisation [RFC8568] or for service function
110     chaining [RFC7665] might be managed by an ASA rather than by
111     traditional configuration tools.

113     The reference model [I-D.ietf-anima-reference-model] expands this by
114     adding that an ASA is "a process that makes use of the features
115     provided by the ANI to achieve its own goals, usually including
116     interaction with other ASAs via the GRASP protocol
117     [I-D.ietf-anima-grasp] or otherwise.  Of course it also interacts
118     with the specific targets of its function, using any suitable
119     mechanism.  Unless its function is very simple, the ASA will need to
120     handle overlapping asynchronous operations.  This will require either
121     a multi-threaded implementation, or a logically equivalent event loop
122     structure.  It may therefore be a quite complex piece of software in
123     its own right, forming part of the application layer above the ANI."

125     There will certainly be very simple ASAs that manage a single
126     objective in a straightforward way and do not need asynchronous
127     operations.  In such a case, many aspects of the current document do
128     not apply.  However, in general a basic property of an ASA is that it
129     is a relatively complex software component that will in many cases
130     control and monitor simpler entities in the same host or elsewhere.
131     For example, a device controller that manages tens or hundreds of
132     simple devices might contain a single ASA.

134     The remainder of this document offers guidance on the design of such
135     ASAs.

137  2.  Logical Structure of an Autonomic Service Agent

139     As mentioned above, all but the simplest ASAs will need to support
140     asynchronous operations.  Not all programming environments explicitly
141     support multi-threading.  In that case, an 'event loop' style of
142     implementation should be adopted, in which case each thread would be
143     implemented as an event handler called in turn by the main loop.  For
144     this, the GRASP API (Section 3.3) must provide non-blocking calls.

146     If necessary, the GRASP session identifier will be used to
147     distinguish simultaneous operations.

149     A typical ASA will have a main thread that performs various initial
150     housekeeping actions such as:

152     o  Obtain authorization credentials.

154     o  Register the ASA with GRASP.

156     o  Acquire relevant policy parameters.

158     o  Define data structures for relevant GRASP objectives.

160     o  Register with GRASP those objectives that it will actively manage.

162     o  Launch a self-monitoring thread.

164     o  Enter its main loop.

166     The logic of the main loop will depend on the details of the
167     autonomic function concerned.  Whenever asynchronous operations are
168     required, extra threads will be launched, or events added to the
169     event loop.  Examples include:

171     o  Repeatedly flood an objective to the AN, so that any ASA can
172        receive the objective's latest value.

174     o  Accept incoming synchronization requests for an objective managed
175        by this ASA.

177     o  Accept incoming negotiation requests for an objective managed by
178        this ASA, and then conduct the resulting negotiation with the
179        counterpart ASA.

181     o  Manage subsidiary non-autonomic devices directly.

183     These threads or events should all either exit after their job is
184     done, or enter a wait state for new work, to avoid blocking others
185     unnecessarily.

187     According to the degree of parallelism needed by the application,
188     some of these threads or events might be launched in multiple
189     instances.  In particular, if negotiation sessions with other ASAs
190     are expected to be long or to involve wait states, the ASA designer
191     might allow for multiple simultaneous negotiating threads, with
192     appropriate use of queues and locks to maintain consistency.

194     The main loop itself could act as the initiator of synchronization
195     requests or negotiation requests, when the ASA needs data or
196     resources from other ASAs.  In particular, the main loop should watch
197     for changes in policy parameters that affect its operation.  It
198     should also do whatever is required to avoid unnecessary resource
199     consumption, such as including an arbitrary wait time in each cycle
200     of the main loop.

202     The self-monitoring thread is of considerable importance.  Autonomic
203     service agents must never fail.  To a large extent this depends on
204     careful coding and testing, with no unhandled error returns or
205     exceptions, but if there is nevertheless some sort of failure, the
206     self-monitoring thread should detect it, fix it if possible, and in
207     the worst case restart the entire ASA.

209     Appendix B presents some example logic flows in informal pseudocode.

211  3.  Interaction with the Autonomic Networking Infrastructure

213  3.1.  Interaction with the security mechanisms

215     An ASA by definition runs in an autonomic node.  Before any normal
216     ASAs are started, such nodes must be bootstrapped into the autonomic
217     network's secure key infrastructure in accordance with
218     [I-D.ietf-anima-bootstrapping-keyinfra].  This key infrastructure
219     will be used to secure the ACP (next section) and may be used by ASAs
220     to set up additional secure interactions with their peers, if needed.

222     Note that the secure bootstrap process itself may include special-
223     purpose ASAs that run in a constrained insecure mode.

225  3.2.  Interaction with the Autonomic Control Plane

227     In a normal autonomic network, ASAs will run as clients of the ACP.
228     It will provide a fully secured network environment for all
229     communication with other ASAs, in most cases mediated by GRASP (next
230     section).

232     Note that the ACP formation process itself may include special-
233     purpose ASAs that run in a constrained insecure mode.

235  3.3.  Interaction with GRASP and its API

237     GRASP [I-D.ietf-anima-grasp] is expected to run as a separate process
238     with its API [I-D.ietf-anima-grasp-api] available in user space.
239     Thus ASAs may operate without special privilege, unless they need it
240     for other reasons.  The ASA's view of GRASP is built around GRASP
241     objectives (Section 5), defined as data structures containing
242     administrative information such as the objective's unique name, and
243     its current value.  The format and size of the value are not
244     restricted by the protocol, except that it must be possible to
245     serialise it for transmission in CBOR [RFC7049], which is no
246     restriction at all in practice.

248     The GRASP API should offer the following features:

250     o  Registration functions, so that an ASA can register itself and the
251        objectives that it manages.

253     o  A discovery function, by which an ASA can discover other ASAs
254        supporting a given objective.

256     o  A negotiation request function, by which an ASA can start
257        negotiation of an objective with a counterpart ASA.  With this,
258        there is a corresponding listening function for an ASA that wishes
259        to respond to negotiation requests, and a set of functions to
260        support negotiating steps.

262     o  A synchronization function, by which an ASA can request the
263        current value of an objective from a counterpart ASA.  With this,
264        there is a corresponding listening function for an ASA that wishes
265        to respond to synchronization requests.

267     o  A flood function, by which an ASA can cause the current value of
268        an objective to be flooded throughout the AN so that any ASA can
269        receive it.

271     For further details and some additional housekeeping functions, see
272     [I-D.ietf-anima-grasp-api].

274     This API is intended to support the various interactions expected
275     between most ASAs, such as the interactions outlined in Section 2.
276     However, if ASAs require additional communication between themselves,
277     they can do so using any desired protocol.  One option is to use
278     GRASP discovery and synchronization as a rendez-vous mechanism
279     between two ASAs, passing communication parameters such as a TCP port
280     number via GRASP.  As noted above, either the ACP or in special cases
281     the autonomic key infrastructure will be used to secure such
282     communications.

284  3.4.  Interaction with policy mechanism

286     At the time of writing, the policy (or "Intent") mechanism for the
287     ANI is undefined.  It is expected to operate by an information
288     distribution mechanism that can reach all autonomic nodes, and
289     therefore every ASA.  However, each ASA must be capable of operating
290     "out of the box" in the absence of locally defined policy, so every
291     ASA implementation must include carefully chosen default values and
292     settings for all policy parameters.

294  4.  Interaction with Non-Autonomic Components

296     An ASA, to have any external effects, must also interact with non-
297     autonomic components of the node where it is installed.  For example,
298     an ASA whose purpose is to manage a resource must interact with that
299     resource.  An ASA whose purpose is to manage an entity that is
300     already managed by local software must interact with that software.
301     This is stating the obvious, and the details are specific to each
302     case, but it has an important security implication.  The ASA might
303     act as a loophole by which the managed entity could penetrate the
304     security boundary of the ANI.  The ASA must be designed to avoid such
305     loopholes, and should if possible operate in an unprivileged mode.

307     In an environment where systems are virtualized and specialized using
308     techniques such as network function virtualization or network
309     slicing, there will be a design choice whether ASAs are deployed once
310     per physical node or once per virtual context.  A related issue is
311     whether the ANI as a whole is deployed once on a physical network, or
312     whether several virtual ANIs are deployed.  This aspect needs to be
313     considered by the ASA designer.

315  5.  Design of GRASP Objectives

317     The general rules for the format of GRASP Objective options, their
318     names, and IANA registration are given in [I-D.ietf-anima-grasp].
319     Additionally that document discusses various general considerations
320     for the design of objectives, which are not repeated here.  However,
321     we emphasize that the GRASP protocol does not provide transactional
322     integrity.  In other words, if an ASA is capable of overlapping
323     several negotiations for a given objective, then the ASA itself must
324     use suitable locking techniques to avoid interference between these
325     negotiations.  For example, if an ASA is allocating part of a shared
326     resource to other ASAs, it needs to ensure that the same part of the
327     resource is not allocated twice.  This might impact the design of the
328     objective as well as the logic flow of the ASA.

330     In particular, if 'dry run' mode is defined for the objective, its
331     specification, and every implementation, must consider what state
332     needs to be saved following a dry run negotiation, such that a
333     subsequent live negotiation can be expected to succeed.  It must be
334     clear how long this state is kept, and what happens if the live
335     negotiation occurs after this state is deleted.  An ASA that requests
336     a dry run negotiation must take account of the possibility that a
337     successful dry run is followed by a failed live negotiation.  Because
338     of these complexities, the dry run mechanism should only be supported
339     by objectives and ASAs where there is a significant benefit from it.

341     The actual value field of an objective is limited by the GRASP
342     protocol definition to any data structure that can be expressed in
343     Concise Binary Object Representation (CBOR) [RFC7049].  For some
344     objectives, a single data item will suffice; for example an integer,
345     a floating point number or a UTF-8 string.  For more complex cases, a
346     simple tuple structure such as [item1, item2, item3] could be used.
347     Nothing prevents using other formats such as JSON, but this requires
348     the ASA to be capable of parsing and generating JSON.  The formats
349     acceptable by the GRASP API will limit the options in practice.  A
350     fallback solution is for the API to accept and deliver the value
351     field in raw CBOR, with the ASA itself encoding and decoding it via a
352     CBOR library.

354     Note that a mapping from YANG to CBOR is defined by
355     [I-D.ietf-core-yang-cbor].  Subject to the size limit defined for
356     GRASP messages, nothing prevents objectives using YANG in this way.

358  6.  Life Cycle

360     Autonomic functions could be permanent, in the sense that ASAs are
361     shipped as part of a product and persist throughout the product's
362     life.  However, a more likely situation is that ASAs need to be
363     installed or updated dynamically, because of new requirements or
364     bugs.  Because continuity of service is fundamental to autonomic
365     networking, the process of seamlessly replacing a running instance of
366     an ASA with a new version needs to be part of the ASA's design.

368     The implication of service continuity on the design of ASAs can be
369     illustrated along the three main phases of the ASA life-cycle, namely
370     Installation, Instantiation and Operation.

372                      +--------------+
373    Undeployed ------>|              |------> Undeployed
374                      |  Installed   |
375                  +-->|              |---+
376         Mandate  |   +--------------+   | Receives a
377       is revoked |   +--------------+   |  Mandate
378                  +---|              |<--+
379                      | Instantiated |
380                  +-->|              |---+
381              set |   +--------------+   | set
382             down |   +--------------+   | up
383                  +---|              |<--+
384                      | Operational  |
385                      |              |
386                      +--------------+

388             Figure 1: Life cycle of an Autonomic Service Agent

390  6.1.  Installation phase

392     Before being able to instantiate and run ASAs, the operator must
393     first provision the infrastructure with the sets of ASA software
394     corresponding to its needs and objectives.  The provisioning of the
395     infrastructure is realized in the installation phase and consists in
396     installing (or checking the availability of) the pieces of software
397     of the different ASA classes in a set of Installation Hosts.

399     There are 3 properties applicable to the installation of ASAs:

401     The dynamic installation property  allows installing an ASA on
402        demand, on any hosts compatible with the ASA.

404     The decoupling property  allows controlling resources of a NE from a
405        remote ASA, i.e. an ASA installed on a host machine different from
406        the resources' NE.

408     The multiplicity property  allows controlling multiple sets of
409        resources from a single ASA.

411     These three properties are very important in the context of the
412     installation phase as their variations condition how the ASA class
413     could be installed on the infrastructure.

415  6.1.1.  Installation phase inputs and outputs

417     Inputs are:

419     [ASA class of type_x]  that specifies which classes of ASAs to install,

421     [Installation_target_Infrastructure]  that specifies the candidate
422        Installation Hosts,

424     [ASA class placement function, e.g. under which criteria/constraints
425     as defined by the operator]
426        that specifies how the installation phase shall meet the
427        operator's needs and objectives for the provision of the
428        infrastructure.  In the coupled mode, the placement function is
429        not necessary, whereas in the decoupled mode, the placement
430        function is mandatory, even though it can be as simple as an
431        explicit list of Installation hosts.

433     The main output of the installation phase is an up-to-date directory
434     of installed ASAs which corresponds to [list of ASA classes]
435     installed on [list of installation Hosts].  This output is also
436     useful for the coordination function and corresponds to the static
437     interaction map (see next section).

439     The condition to validate in order to pass to the next phase is to
440     ensure that [list of ASA classes] are correctly installed on [list of
441     installation Hosts].  The state of the ASA at the end of the
442     installation phase is: installed (not instantiated).  The following
443     commands or messages are foreseen: install(list of ASA classes,
444     Installation_target_Infrastructure, ASA class placement function),
445     and un-install (list of ASA classes).

447  6.2.  Instantiation phase

449     Once the ASAs are installed on the appropriate hosts in the network,
450     these ASAs may start to operate.  From the operator viewpoint, an
451     operating ASA means the ASA manages the network resources as per the
452     objectives given.  At the ASA local level, operating means executing
453     their control loop/algorithm.

455     But right before that, there are two things to take into
456     consideration.  First, there is a difference between 1. having a
457     piece of code available to run on a host and 2. having an agent based
458     on this piece of code running inside the host.  Second, in a coupled
459     case, determining which resources are controlled by an ASA is
460     straightforward (the determination is embedded); in a decoupled mode
461     determining this is a bit more complex (hence a starting agent will
462     have to either discover or be taught it).

464     The instantiation phase of an ASA covers both these aspects: starting
465     the agent piece of code (when this does not start automatically) and
466     determining which resources have to be controlled (when this is not
467     obvious).

469  6.2.1.  Operator's goal

471     Through this phase, the operator wants to control its autonomic
472     network in two respects:

474     1  determine the scope of autonomic functions by instructing which of
475        the network resources have to be managed by which autonomic
476        function (and more precisely which class e.g. 1. version X or
477        version Y or 2. provider A or provider B),

479     2  determine how the autonomic functions are organized by instructing
480        which ASAs have to interact with which other ASAs (or more
481        precisely which set of network resources have to be handled as an
482        autonomous group by their managing ASAs).

484     Additionally in this phase, the operator may want to set objectives
485     to autonomic functions, by configuring the ASAs' technical objectives.

487     The operator's goal can be summarized in an instruction to the ANIMA
488     ecosystem matching the following pattern:

490        [ASA of type_x instances] ready to control
491        [Instantiation_target_Infrastructure] with
492        [Instantiation_target_parameters]

494  6.2.2.  Instantiation phase inputs and outputs

496     Inputs are:

498     [ASA of type_x instances]  that specifies which are the ASAs to be
499        targeted (and more precisely which class e.g. 1. version X or
500        version Y or 2. provider A or provider B),

502     [Instantiation_target_Infrastructure]  that specifies which are the
503        resources to be managed by the autonomic function, this can be the
504        whole network or a subset of it like a domain, a technology segment
505        or even a specific list of resources,

507     [Instantiation_target_parameters]  that specifies which are the
508        technical objectives to be set to ASAs (e.g. an optimization
509        target)

511     Outputs are:

513     [Set of ASAs - Resources relations]  describing which resources are
514        managed by which ASA instances, this is not a formal message, but
515        a resulting configuration of a set of ASAs.

517  6.2.3.  Instantiation phase requirements

519     The instructions described in Section 6.2.1 could be either:

521     sent to a targeted ASA  In which case, the receiving Agent will have
522        to manage the specified list of
523        [Instantiation_target_Infrastructure], with the
524        [Instantiation_target_parameters].

526     broadcast to all ASAs  In which case, the ASAs would collectively
527        determine from the list which Agent(s) would handle which
528        [Instantiation_target_Infrastructure], with the
529        [Instantiation_target_parameters].

531     This set of instructions can be materialized through a message that
532     is named an Instance Mandate (description TBD).
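
Since the Instance Mandate is still to be described, the following Python sketch shows only one hypothetical way of holding its three inputs and of distinguishing the targeted and broadcast cases. Every name in it (InstanceMandate, resources_for, the field names, the example identifiers) is an assumption made for illustration. In particular, the round-robin split used in the broadcast case is merely a stand-in for whatever collective determination the ASAs would actually perform.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InstanceMandate:
    """Hypothetical container for the three instantiation inputs."""
    asa_class: str                       # [ASA of type_x instances]
    target_infrastructure: List[str]     # [Instantiation_target_Infrastructure]
    target_parameters: Dict[str, object] = field(default_factory=dict)
    target_asa: Optional[str] = None     # None means "broadcast to all ASAs"


def resources_for(mandate: InstanceMandate, asa_id: str,
                  known_asas: List[str]) -> List[str]:
    """Return the resources this ASA would manage under the mandate."""
    if mandate.target_asa is not None:
        # Targeted case: the named ASA manages the whole specified list.
        if asa_id == mandate.target_asa:
            return list(mandate.target_infrastructure)
        return []
    # Broadcast case (assumption): every ASA applies the same deterministic
    # round-robin rule over the sorted ASA identifiers, so that each
    # resource ends up with exactly one managing ASA.
    members = sorted(known_asas)
    if asa_id not in members:
        return []
    index = members.index(asa_id)
    return [res for pos, res in enumerate(mandate.target_infrastructure)
            if pos % len(members) == index]


targeted = InstanceMandate("load-balancer-vX", ["r1", "r2", "r3"],
                           {"max_utilisation": 0.8}, target_asa="asa-1")
broadcast = InstanceMandate("load-balancer-vX", ["r1", "r2", "r3"])
print(resources_for(targeted, "asa-1", ["asa-1", "asa-2"]))
print(resources_for(broadcast, "asa-2", ["asa-1", "asa-2"]))
```

Whatever rule is finally chosen for the broadcast case, it has to give every ASA the same answer from the same inputs; otherwise two agents could claim the same resource, or a resource could be left unmanaged.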

534     The conclusion of this instantiation phase is a ready-to-operate ASA
535     (or interacting set of ASAs).  This ASA (or these ASAs) can then
536     describe themselves by depicting which resources they manage
537     and what this means in terms of metrics being monitored and in terms
538     of actions that can be executed (like modifying the parameter
539     values).  A message conveying such a self description is named an
540     Instance Manifest (description TBD).

542     Though the operator may well use such a self-description "per se",
543     the final goal of such a description is to be shared with other ANIMA
544     entities like:

546     o  the coordination entities (see [I-D.ciavaglia-anima-coordination]
547        - Autonomic Functions Coordination)

549     o  collaborative entities in the purpose of establishing knowledge
550        exchanges (some ASAs may produce knowledge or even monitor metrics
551        that other ASAs cannot produce by themselves, but which would be
552        useful for their execution)

554  6.3.  Operation phase

556     Note: This section is to be further developed in future revisions of
557     the document, especially the implications on the design of ASAs.

559     During the Operation phase, the operator can:

561     Activate/Deactivate ASA:  meaning enabling those to execute their
562        autonomic loop or not.

564     Modify ASAs targets:  meaning setting them different objectives.

566     Modify ASAs managed resources:  by updating the instance mandate
567        which would specify a different set of resources to manage (only
568        applicable to decoupled ASAs).

570     During the Operation phase, running ASAs can interact with one
571     another:

573        in order to exchange knowledge (e.g. an ASA providing traffic
574        predictions to a load balancing ASA)

576        in order to collaboratively reach an objective (e.g.  ASAs
577        pertaining to the same autonomic function targeted to manage a
578        network domain, these ASAs will collaborate - in the case of a load
579        balancing one, by modifying the link metrics according to the
580        neighboring resource loads)

582     During the Operation phase, running ASAs are expected to apply
583     coordination schemes

585        then execute their control loop under coordination supervision/
586        instructions

588     The ASA life-cycle is discussed in more detail in "A Day in the Life
589     of an Autonomic Function" [I-D.peloso-anima-autonomic-function].

591  7.  Coordination between Autonomic Functions

593     Some autonomic functions will be completely independent of each
594     other.  However, others are at risk of interfering with each other -
595     for example, two different optimization functions might both attempt
596     to modify the same underlying parameter in different ways.  In a
597     complete system, a method is needed of identifying ASAs that might
598     interfere with each other and coordinating their actions when
599     necessary.  This issue is considered in "Autonomic Functions
600     Coordination" [I-D.ciavaglia-anima-coordination].

602  8.  Coordination with Traditional Management Functions

604     Some ASAs will have functions that overlap with existing
605     configuration tools and network management mechanisms such as command
606     line interfaces, DHCP, DHCPv6, SNMP, NETCONF, RESTCONF and YANG-based
607     solutions.  Each ASA designer will need to consider this issue and
608     how to avoid clashes and inconsistencies.  Some specific
609     considerations for interaction with OAM tools are given in [RFC8368].
610     As another example, [I-D.ietf-anima-prefix-management] describes how
611     autonomic management of IPv6 prefixes can interact with prefix
612     delegation via DHCPv6.  The description of a GRASP objective and of
613     an ASA using it should include a discussion of any such interactions.
615 A related aspect is that management functions often include a data 616 model, quite likely to be expressed in a formal notation such as 617 YANG. This aspect should not be an afterthought in the design of an 618 ASA. To the contrary, the design of the ASA and of its GRASP 619 objectives should match the data model; as noted above, YANG 620 serialized as CBOR may be used directly as the value of a GRASP 621 objective. 623 9. Robustness 625 It is of great importance that all components of an autonomic system 626 are highly robust. In principle they must never fail. This section 627 lists various aspects of robustness that ASA designers should 628 consider. 630 1. If despite all precautions, an ASA does encounter a fatal error, 631 it should in any case restart automatically and try again. To 632 mitigate a hard loop in case of persistent failure, a suitable 633 pause should be inserted before such a restart. The length of 634 the pause depends on the use case. 636 2. If a newly received or calculated value for a parameter falls out 637 of bounds, the corresponding parameter should be either left 638 unchanged or restored to a safe value. 640 3. If a GRASP synchronization or negotiation session fails for any 641 reason, it may be repeated after a suitable pause. The length of 642 the pause depends on the use case. 644 4. If a session fails repeatedly, the ASA should consider that its 645 peer has failed, and cause GRASP to flush its discovery cache and 646 repeat peer discovery. 648 5. Any received GRASP message should be checked. If it is wrongly 649 formatted, it should be ignored. Within a unicast session, an 650 Invalid message (M_INVALID) may be sent. This function may be 651 provided by the GRASP implementation itself. 653 6. Any received GRASP objective should be checked. If it is wrongly 654 formatted, it should be ignored. Within a negotiation session, a 655 Negotiation End message (M_END) with a Decline option (O_DECLINE) 656 should be sent. 
   An ASA may log such events for diagnostic purposes.

7. If an ASA receives either an Invalid message (M_INVALID) or a
   Negotiation End message (M_END) with a Decline option
   (O_DECLINE), one possible reason is that the peer ASA does not
   support a new feature of either GRASP or of the objective in
   question. In such a case the ASA may choose to repeat the
   operation concerned without using that new feature.

8. All other possible exceptions should be handled in an orderly
   way. There should be no such thing as an unhandled exception
   (but see point 1 above).

10. Security Considerations

ASAs are intended to run in an environment that is protected by the
Autonomic Control Plane [I-D.ietf-anima-autonomic-control-plane],
admission to which depends on an initial secure bootstrap process
[I-D.ietf-anima-bootstrapping-keyinfra]. In some deployments, a
secure partition of the link layer might be used instead
[I-D.carpenter-anima-l2acp-scenarios]. However, this does not
relieve ASAs of responsibility for security. In particular, when
ASAs configure or manage network elements outside the ACP, they must
use secure techniques and carefully validate any incoming
information. As appropriate to their specific functions, ASAs should
take account of relevant privacy considerations [RFC6973].

Authorization of ASAs is a subject for future study. At present,
ASAs are trusted by virtue of being installed on a node that has
successfully joined the ACP.

11. IANA Considerations

This document makes no request of the IANA.

12. Acknowledgements

Useful comments were received from Toerless Eckert, Alex Galis, Bing
Liu, and other members of the ANIMA WG.

13. References

13.1. Normative References

[I-D.ietf-anima-autonomic-control-plane]
          Eckert, T., Behringer, M., and S.
          Bjarnason, "An Autonomic Control Plane (ACP)",
          draft-ietf-anima-autonomic-control-plane-19 (work in
          progress), March 2019.

[I-D.ietf-anima-bootstrapping-keyinfra]
          Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
          S., and K. Watsen, "Bootstrapping Remote Secure Key
          Infrastructures (BRSKI)",
          draft-ietf-anima-bootstrapping-keyinfra-22 (work in
          progress), June 2019.

[I-D.ietf-anima-grasp]
          Bormann, C., Carpenter, B., and B. Liu, "A Generic
          Autonomic Signaling Protocol (GRASP)",
          draft-ietf-anima-grasp-15 (work in progress), July 2017.

[RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object
          Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
          October 2013.

13.2. Informative References

[DeMola06]
          De Mola, F. and R. Quitadamo, "An Agent Model for Future
          Autonomic Communications", Proceedings of the 7th WOA 2006
          Workshop From Objects to Agents 51-59, September 2006.

[GANA13]  "Autonomic network engineering for the self-managing
          Future Internet (AFI): GANA Architectural Reference Model
          for Autonomic Networking, Cognitive Networking and
          Self-Management.", April 2013.

[Huebscher08]
          Huebscher, M. and J. McCann, "A survey of autonomic
          computing--degrees, models, and applications", ACM
          Computing Surveys (CSUR) Volume 40 Issue 3 DOI:
          10.1145/1380584.1380585, August 2008.

[I-D.carpenter-anima-l2acp-scenarios]
          Carpenter, B. and B. Liu, "Scenarios and Requirements for
          Layer 2 Autonomic Control Planes",
          draft-carpenter-anima-l2acp-scenarios-00 (work in
          progress), February 2019.

[I-D.ciavaglia-anima-coordination]
          Ciavaglia, L. and P. Peloso, "Autonomic Functions
          Coordination", draft-ciavaglia-anima-coordination-01 (work
          in progress), March 2016.

[I-D.ietf-anima-grasp-api]
          Carpenter, B., Liu, B., Wang, W., and X.
          Gong, "Generic Autonomic Signaling Protocol Application
          Program Interface (GRASP API)",
          draft-ietf-anima-grasp-api-03 (work in progress),
          January 2019.

[I-D.ietf-anima-prefix-management]
          Jiang, S., Du, Z., Carpenter, B., and Q. Sun, "Autonomic
          IPv6 Edge Prefix Management in Large-scale Networks",
          draft-ietf-anima-prefix-management-07 (work in progress),
          December 2017.

[I-D.ietf-anima-reference-model]
          Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
          and J. Nobre, "A Reference Model for Autonomic
          Networking", draft-ietf-anima-reference-model-10 (work in
          progress), November 2018.

[I-D.ietf-core-yang-cbor]
          Veillette, M., Petrov, I., and A. Pelov, "CBOR Encoding of
          Data Modeled with YANG", draft-ietf-core-yang-cbor-10
          (work in progress), April 2019.

[I-D.peloso-anima-autonomic-function]
          Peloso, P. and L. Ciavaglia, "A Day in the Life of an
          Autonomic Function",
          draft-peloso-anima-autonomic-function-01 (work in
          progress), March 2016.

[Movahedi12]
          Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
          Survey of Autonomic Network Architectures and Evaluation
          Criteria", IEEE Communications Surveys & Tutorials Volume:
          14, Issue: 2, DOI: 10.1109/SURV.2011.042711.00078,
          Page(s): 464-490, 2012.

[RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
          Morris, J., Hansen, M., and R. Smith, "Privacy
          Considerations for Internet Protocols", RFC 6973,
          DOI 10.17487/RFC6973, July 2013.

[RFC7575] Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
          Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
          Networking: Definitions and Design Goals", RFC 7575,
          DOI 10.17487/RFC7575, June 2015.

[RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
          Chaining (SFC) Architecture", RFC 7665,
          DOI 10.17487/RFC7665, October 2015.

[RFC8368] Eckert, T., Ed. and M.
          Behringer, "Using an Autonomic Control Plane for Stable
          Connectivity of Network Operations, Administration, and
          Maintenance (OAM)", RFC 8368, DOI 10.17487/RFC8368,
          May 2018.

[RFC8568] Bernardos, CJ., Rahman, A., Zuniga, JC., Contreras, LM.,
          Aranda, P., and P. Lynch, "Network Virtualization Research
          Challenges", RFC 8568, DOI 10.17487/RFC8568, April 2019.

Appendix A. Change log [RFC Editor: Please remove]

draft-carpenter-anima-asa-guidelines-07, 2019-07-17:

Improved explanation of threading vs event-loop

Other editorial improvements.

draft-carpenter-anima-asa-guidelines-06, 2018-01-07:

Expanded and improved example logic flow.

Editorial corrections.

draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

Added section on relationship with non-autonomic components.

Editorial corrections.

draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

Added note about simple ASAs.

Added note about NFV/SFC services.

Improved text about threading v event loop model

Added section about coordination with traditional tools.

Added appendix with example logic flow.

draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

Added details on life cycle.

Added details on robustness.

Added co-authors.

draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

Expanded description of event-loop case.

Added note about 'dry run' mode.

draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

More sections filled in

draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

Initial version

Appendix B. Example Logic Flows

This appendix describes generic logic flows for an Autonomic Service
Agent (ASA) for resource management. Note that these are
illustrative examples, and in no sense requirements. As long as the
rules of GRASP are followed, a real implementation could be
different.
The reader is assumed to be familiar with GRASP
[I-D.ietf-anima-grasp] and its conceptual API
[I-D.ietf-anima-grasp-api].

A complete autonomic function for a resource would consist of a
number of instances of the ASA placed at relevant points in a
network. Specific details will of course depend on the resource
concerned. One example is IP address prefix management, as specified
in [I-D.ietf-anima-prefix-management]. In this case, an instance of
the ASA would exist in each delegating router.

An underlying assumption is that there is an initial source of the
resource in question, referred to here as a master ASA. The other
ASAs, known as delegators, obtain supplies of the resource from the
master, and then delegate quantities of the resource to consumers
that request it, and recover it when no longer needed.

Another assumption is that there is a set of network-wide policy
parameters, which the master will provide to the delegators. These
parameters will control how the delegators decide how much resource
to provide to consumers. Thus the ASA logic has two operating modes:
master and delegator. When running as a master, it starts by
obtaining a quantity of the resource from the NOC, and it acts as a
source of policy parameters, via both GRASP flooding and GRASP
synchronization. (In some scenarios, flooding or synchronization
alone might be sufficient, but this example includes both.)

When running as a delegator, it starts with an empty resource pool,
it acquires the policy parameters by GRASP synchronization, and it
delegates quantities of the resource to consumers that request it.
Both as a master and as a delegator, when its pool is low it seeks
quantities of the resource by requesting GRASP negotiation with peer
ASAs. When its pool is sufficient, it hands out resource to peer
ASAs in response to negotiation requests.
Thus, over time, the initial resource pool held by the master will be
shared among all the delegators according to demand.

In theory a network could include any number of masters and any
number of delegators, with the only condition being that each
master's initial resource pool is unique. A realistic scenario is to
have exactly one master and as many delegators as you like. A
scenario with no master is useless.

An implementation requirement is that resource pools are kept in
stable storage. Otherwise, if a delegator exits for any reason, all
the resources it has obtained or delegated are lost. If a master
exits, its entire spare pool is lost. The logic for using stable
storage and for crash recovery is not included below.

The description below does not implement GRASP's 'dry run' function.
That would require temporarily marking any resource handed out in a
dry run negotiation as reserved, until either the peer obtains it in
a live run, or a suitable timeout expires.

The main data structures used in each instance of the ASA are:

o  The resource_pool, for example an ordered list of available
   resources. Depending on the nature of the resource, units of
   resource are split when appropriate, and a background garbage
   collector recombines split resources if they are returned to the
   pool.

o  The delegated_list, where a delegator stores the resources it has
   given to consumer routers.

Possible main logic flows are below, using a threaded implementation
model. The transformation to an event loop model should be apparent
- each thread would correspond to one event in the event loop.

The GRASP objectives are as follows:

   ["EX1.Resource", flags, loop_count, value] where the value depends
   on the resource concerned, but will typically include its size and
   identification.

   ["EX1.Params", flags, loop_count, value] where the value will be,
   for example, a JSON object defining the applicable parameters.

In the outline logic flows below, these objectives are represented
simply by their names.

   MAIN PROGRAM:

   Create empty resource_pool (and an associated lock)
   Create empty delegated_list
   Determine whether to act as master
   if master:
       Obtain initial resource_pool contents from NOC
       Obtain value of EX1.Params from NOC
   Register ASA with GRASP
   Register GRASP objectives EX1.Resource and EX1.Params
   if master:
       Start FLOODER thread to flood EX1.Params
       Start SYNCHRONIZER listener for EX1.Params
   Start MAIN_NEGOTIATOR thread for EX1.Resource
   if not master:
       Obtain value of EX1.Params from GRASP flood or synchronization
       Start DELEGATOR thread
   Start GARBAGE_COLLECTOR thread
   good_peer = none #set once, so a good peer is remembered
   do forever:
       if resource_pool is low:
           Calculate amount A of resource needed
           Discover peers using GRASP M_DISCOVER / M_RESPONSE
           if good_peer in peers:
               peer = good_peer
           else:
               peer = #any choice among peers
           grasp.request_negotiate("EX1.Resource", peer)
           i.e., send M_REQ_NEG
           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
           if OK:
               if offered amount of resource sufficient:
                   Send M_END + O_ACCEPT #negotiation succeeded
                   Add resource to pool
                   good_peer = peer
               else:
                   Send M_END + O_DECLINE #negotiation failed
       sleep() #sleep time depends on application scenario

   MAIN_NEGOTIATOR thread:

   do forever:
       grasp.listen_negotiate("EX1.Resource")
       i.e., wait for M_REQ_NEG
       Start a separate new NEGOTIATOR thread for requested amount A

   NEGOTIATOR thread:

   Request resource amount A from resource_pool
   if not OK:
       while not OK and A > Amin:
           A = A-1
           Request resource amount A from resource_pool
   if OK:
       Offer resource amount A to peer by GRASP M_NEGOTIATE
       if received M_END + O_ACCEPT:
           #negotiation succeeded
       elif received M_END + O_DECLINE or other error:
           #negotiation failed
   else:
       Send M_END + O_DECLINE #negotiation failed

   DELEGATOR thread:

   do forever:
       Wait for request or release for resource amount A
       if request:
           Get resource amount A from resource_pool
           if OK:
               Delegate resource to consumer
               Record in delegated_list
           else:
               Signal failure to consumer
               Signal main thread that resource_pool is low
       else:
           Delete resource from delegated_list
           Return resource amount A to resource_pool

   SYNCHRONIZER thread:

   do forever:
       Wait for M_REQ_SYN message for EX1.Params
       Reply with M_SYNCH message for EX1.Params

   FLOODER thread:

   do forever:
       Send M_FLOOD message for EX1.Params
       sleep() #sleep time depends on application scenario

   GARBAGE_COLLECTOR thread:

   do forever:
       Search resource_pool for adjacent resources
       Merge adjacent resources
       sleep() #sleep time depends on application scenario

Authors' Addresses

Brian Carpenter
School of Computer Science
University of Auckland
PB 92019
Auckland 1142
New Zealand

Email: brian.e.carpenter@gmail.com

Laurent Ciavaglia
Nokia
Villarceaux
Nozay 91460
FR

Email: laurent.ciavaglia@nokia.com

Sheng Jiang
Huawei Technologies Co., Ltd
Q14, Huawei Campus, No.156 Beiqing Road
Hai-Dian District, Beijing, 100095
P.R. China

Email: jiangsheng@huawei.com

Pierre Peloso
Nokia
Villarceaux
Nozay 91460
FR

Email: pierre.peloso@nokia.com