Network Working Group                                       B. Carpenter
Internet-Draft                                         Univ. of Auckland
Intended status: Informational                              L. Ciavaglia
Expires: July 11, 2019                                             Nokia
                                                                S. Jiang
                                            Huawei Technologies Co., Ltd
                                                               P. Peloso
                                                                   Nokia
                                                         January 7, 2019


                Guidelines for Autonomic Service Agents
                draft-carpenter-anima-asa-guidelines-06

Abstract

   This document proposes guidelines for the design of Autonomic Service
   Agents for autonomic networks.  It is based on the Autonomic Network
   Infrastructure outlined in the ANIMA reference model, making use of
   the Autonomic Control Plane and the Generic Autonomic Signaling
   Protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 11, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Logical Structure of an Autonomic Service Agent
   3.  Interaction with the Autonomic Networking Infrastructure
     3.1.  Interaction with the security mechanisms
     3.2.  Interaction with the Autonomic Control Plane
     3.3.  Interaction with GRASP and its API
     3.4.  Interaction with policy mechanism
   4.  Interaction with Non-Autonomic Components
   5.  Design of GRASP Objectives
   6.  Life Cycle
     6.1.  Installation phase
       6.1.1.  Installation phase inputs and outputs
     6.2.  Instantiation phase
       6.2.1.  Operator's goal
       6.2.2.  Instantiation phase inputs and outputs
       6.2.3.  Instantiation phase requirements
     6.3.  Operation phase
   7.  Coordination between Autonomic Functions
   8.  Coordination with Traditional Management Functions
   9.  Robustness
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Change log [RFC Editor: Please remove]
   Appendix B.  Example Logic Flows
   Authors' Addresses

1.  Introduction

   This document proposes guidelines for the design of Autonomic Service
   Agents (ASAs) in the context of an Autonomic Network (AN) based on
   the Autonomic Network Infrastructure (ANI) outlined in the ANIMA
   reference model [I-D.ietf-anima-reference-model].  This
   infrastructure makes use of the Autonomic Control Plane (ACP)
   [I-D.ietf-anima-autonomic-control-plane] and the Generic Autonomic
   Signaling Protocol (GRASP) [I-D.ietf-anima-grasp].

   There is considerable literature on autonomic agents, with a variety
   of proposals for how they should be characterized.  Some examples
   are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].  However,
   for the present document, the basic definitions and goals for
   autonomic networking given in [RFC7575] apply.  According to RFC
   7575, an Autonomic Service Agent is "An agent implemented on an
   autonomic node that implements an autonomic function, either in part
   (in the case of a distributed function) or whole."

   ASAs must be distinguished from other forms of software component.
   They are components of network or service management; they do not in
   themselves provide services.  For example, the services envisaged for
   network function virtualisation
   [I-D.irtf-nfvrg-gaps-network-virtualization] or for service function
   chaining [RFC7665] might be managed by an ASA rather than by
   traditional configuration tools.

   The reference model [I-D.ietf-anima-reference-model] expands this by
   adding that an ASA is "a process that makes use of the features
   provided by the ANI to achieve its own goals, usually including
   interaction with other ASAs via the GRASP protocol
   [I-D.ietf-anima-grasp] or otherwise.  Of course it also interacts
   with the specific targets of its function, using any suitable
   mechanism.  Unless its function is very simple, the ASA will need to
   handle overlapping asynchronous operations.  It may therefore be a
   quite complex piece of software in its own right, forming part of the
   application layer above the ANI."

   There will certainly be very simple ASAs that manage a single
   objective in a straightforward way and do not need asynchronous
   operations.  In such a case, many aspects of the current document do
   not apply.  However, in general a basic property of an ASA is that it
   is a relatively complex software component that will in many cases
   control and monitor simpler entities in the same host or elsewhere.
   For example, a device controller that manages tens or hundreds of
   simple devices might contain a single ASA.

   The remainder of this document offers guidance on the design of such
   ASAs.

2.  Logical Structure of an Autonomic Service Agent

   As mentioned above, all but the simplest ASAs will be multi-threaded
   programs.

   A typical ASA will have a main thread that performs various initial
   housekeeping actions such as:

   o  Obtain authorization credentials.

   o  Register the ASA with GRASP.

   o  Acquire relevant policy parameters.

   o  Define data structures for relevant GRASP objectives.

   o  Register with GRASP those objectives that it will actively manage.

   o  Launch a self-monitoring thread.

   o  Enter its main loop.

   The logic of the main loop will depend on the details of the
   autonomic function concerned.
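
   As an illustration only, the start-up sequence listed above might be
   sketched in Python as follows.  The class StubGrasp and all of its
   methods are hypothetical stand-ins invented for this sketch; they
   are not the API defined in [I-D.ietf-anima-grasp-api].  The
   credentials, policy values and objective name are likewise
   placeholders, and the loop is bounded only so that the sketch
   terminates.

```python
import threading
import time


class StubGrasp:
    """Hypothetical stand-in for a GRASP binding (not the real API)."""

    def __init__(self):
        self.asas = []
        self.objectives = []

    def register_asa(self, name):
        self.asas.append(name)
        return name  # a real binding would return an opaque handle

    def register_objective(self, handle, objective):
        self.objectives.append((handle, objective["name"]))


def self_monitor(stop, health):
    # Placeholder: a real monitor would check thread liveness, queues...
    while not stop.wait(0.01):
        health["checks"] += 1


def run_asa(grasp, max_cycles=3, cycle_wait=0.01):
    credentials = "placeholder"                  # obtain credentials
    handle = grasp.register_asa("ExampleASA")    # register the ASA
    policy = {"cycle_wait": cycle_wait}          # default policy values
    objective = {"name": "EX1", "value": 0}      # objective data structure
    grasp.register_objective(handle, objective)  # register the objective

    # Launch the self-monitoring thread.
    stop = threading.Event()
    health = {"checks": 0}
    monitor = threading.Thread(
        target=self_monitor, args=(stop, health), daemon=True
    )
    monitor.start()

    # Main loop; a real ASA would loop indefinitely.
    cycles = 0
    while cycles < max_cycles:
        objective["value"] += 1           # function-specific work
        cycles += 1
        time.sleep(policy["cycle_wait"])  # avoid a busy loop

    stop.set()
    monitor.join()
    return cycles, objective["value"]
```

   In this sketch, run_asa(StubGrasp()) performs three cycles and then
   stops the monitor thread.  The short sleep in each cycle stands in
   for the wait time recommended later in this section.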
   Whenever asynchronous operations are required, extra threads will
   be launched.  Examples of such threads include:

   o  A background thread to repeatedly flood an objective to the AN, so
      that any ASA can receive the objective's latest value.

   o  A thread to accept incoming synchronization requests for an
      objective managed by this ASA.

   o  A thread to accept incoming negotiation requests for an objective
      managed by this ASA, and then to conduct the resulting negotiation
      with the counterpart ASA.

   o  A thread to manage subsidiary non-autonomic devices directly.

   These threads should all either exit after their job is done, or
   enter a wait state for new work, to avoid blocking other threads
   unnecessarily.

   Not all programming environments explicitly support multi-threading.
   In such cases, an 'event loop' style of implementation could be
   adopted, in which case each of the above threads would be implemented
   as an event handler called in turn by the main loop.  In this case,
   the GRASP API (Section 3.3) must provide non-blocking calls.  If
   necessary, the GRASP session identifier will be used to distinguish
   simultaneous operations.

   According to the degree of parallelism needed by the application,
   some of these threads might be launched in multiple instances.  In
   particular, if negotiation sessions with other ASAs are expected to
   be long or to involve wait states, the ASA designer might allow for
   multiple simultaneous negotiating threads, with appropriate use of
   queues and locks to maintain consistency.

   The main loop itself could act as the initiator of synchronization
   requests or negotiation requests, when the ASA needs data or
   resources from other ASAs.  In particular, the main loop should watch
   for changes in policy parameters that affect its operation.
   It should also do whatever is required to avoid unnecessary resource
   consumption, such as including an arbitrary wait time in each cycle
   of the main loop.

   The self-monitoring thread is of considerable importance.  Autonomic
   service agents must never fail.  To a large extent this depends on
   careful coding and testing, with no unhandled error returns or
   exceptions, but if there is nevertheless some sort of failure, the
   self-monitoring thread should detect it, fix it if possible, and in
   the worst case restart the entire ASA.

   Appendix B presents some example logic flows in informal pseudocode.

3.  Interaction with the Autonomic Networking Infrastructure

3.1.  Interaction with the security mechanisms

   An ASA by definition runs in an autonomic node.  Before any normal
   ASAs are started, such nodes must be bootstrapped into the autonomic
   network's secure key infrastructure in accordance with
   [I-D.ietf-anima-bootstrapping-keyinfra].  This key infrastructure
   will be used to secure the ACP (next section) and may be used by ASAs
   to set up additional secure interactions with their peers, if needed.

   Note that the secure bootstrap process itself may include special-
   purpose ASAs that run in a constrained insecure mode.

3.2.  Interaction with the Autonomic Control Plane

   In a normal autonomic network, ASAs will run as clients of the ACP.
   It will provide a fully secured network environment for all
   communication with other ASAs, in most cases mediated by GRASP (next
   section).

   Note that the ACP formation process itself may include special-
   purpose ASAs that run in a constrained insecure mode.

3.3.  Interaction with GRASP and its API

   GRASP [I-D.ietf-anima-grasp] is expected to run as a separate process
   with its API [I-D.ietf-anima-grasp-api] available in user space.
   Thus ASAs may operate without special privilege, unless they need it
   for other reasons.
   The ASA's view of GRASP is built around GRASP
   objectives (Section 5), defined as data structures containing
   administrative information such as the objective's unique name, and
   its current value.  The format and size of the value is not
   restricted by the protocol, except that it must be possible to
   serialise it for transmission in CBOR [RFC7049], which is no
   restriction at all in practice.

   The GRASP API should offer the following features:

   o  Registration functions, so that an ASA can register itself and the
      objectives that it manages.

   o  A discovery function, by which an ASA can discover other ASAs
      supporting a given objective.

   o  A negotiation request function, by which an ASA can start
      negotiation of an objective with a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to negotiation requests, and a set of functions to
      support negotiating steps.

   o  A synchronization function, by which an ASA can request the
      current value of an objective from a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to synchronization requests.

   o  A flood function, by which an ASA can cause the current value of
      an objective to be flooded throughout the AN so that any ASA can
      receive it.

   For further details and some additional housekeeping functions, see
   [I-D.ietf-anima-grasp-api].

   This API is intended to support the various interactions expected
   between most ASAs, such as the interactions outlined in Section 2.
   However, if ASAs require additional communication between themselves,
   they can do so using any desired protocol.  One option is to use
   GRASP discovery and synchronization as a rendez-vous mechanism
   between two ASAs, passing communication parameters such as a TCP port
   number via GRASP.
   As noted above, either the ACP or in special cases
   the autonomic key infrastructure will be used to secure such
   communications.

3.4.  Interaction with policy mechanism

   At the time of writing, the policy (or "Intent") mechanism for the
   ANI is undefined.  It is expected to operate by an information
   distribution mechanism that can reach all autonomic nodes, and
   therefore every ASA.  However, each ASA must be capable of operating
   "out of the box" in the absence of locally defined policy, so every
   ASA implementation must include carefully chosen default values and
   settings for all policy parameters.

4.  Interaction with Non-Autonomic Components

   An ASA, to have any external effects, must also interact with non-
   autonomic components of the node where it is installed.  For example,
   an ASA whose purpose is to manage a resource must interact with that
   resource.  An ASA whose purpose is to manage an entity that is
   already managed by local software must interact with that software.
   This is stating the obvious, and the details are specific to each
   case, but it has an important security implication.  The ASA might
   act as a loophole by which the managed entity could penetrate the
   security boundary of the ANI.  The ASA must be designed to avoid such
   loopholes, and should if possible operate in an unprivileged mode.

   In an environment where systems are virtualized and specialized using
   techniques such as network function virtualization or network
   slicing, there will be a design choice whether ASAs are deployed once
   per physical node or once per virtual context.  A related issue is
   whether the ANI as a whole is deployed once on a physical network, or
   whether several virtual ANIs are deployed.  This aspect needs to be
   considered by the ASA designer.

5.  Design of GRASP Objectives

   The general rules for the format of GRASP Objective options, their
   names, and IANA registration are given in [I-D.ietf-anima-grasp].
   Additionally that document discusses various general considerations
   for the design of objectives, which are not repeated here.  However,
   we emphasize that the GRASP protocol does not provide transactional
   integrity.  In other words, if an ASA is capable of overlapping
   several negotiations for a given objective, then the ASA itself must
   use suitable locking techniques to avoid interference between these
   negotiations.  For example, if an ASA is allocating part of a shared
   resource to other ASAs, it needs to ensure that the same part of the
   resource is not allocated twice.  This might impact the design of the
   objective as well as the logic flow of the ASA.

   In particular, if 'dry run' mode is defined for the objective, its
   specification, and every implementation, must consider what state
   needs to be saved following a dry run negotiation, such that a
   subsequent live negotiation can be expected to succeed.  It must be
   clear how long this state is kept, and what happens if the live
   negotiation occurs after this state is deleted.  An ASA that requests
   a dry run negotiation must take account of the possibility that a
   successful dry run is followed by a failed live negotiation.  Because
   of these complexities, the dry run mechanism should only be supported
   by objectives and ASAs where there is a significant benefit from it.

   The actual value field of an objective is limited by the GRASP
   protocol definition to any data structure that can be expressed in
   Concise Binary Object Representation (CBOR) [RFC7049].  For some
   objectives, a single data item will suffice; for example an integer,
   a floating point number or a UTF-8 string.  For more complex cases, a
   simple tuple structure such as [item1, item2, item3] could be used.
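
   To make the tuple case concrete, the following Python sketch
   hand-encodes a small objective value in CBOR.  It is illustrative
   only: it handles just non-negative integers below 2^32, UTF-8
   strings and lists, and the names cbor_encode and _head are invented
   here.  A real ASA would normally rely on a complete CBOR library or
   on whatever encoding the GRASP API provides.

```python
import struct


def _head(major, n):
    """Return the CBOR initial byte(s) for major type `major`, argument n."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    raise ValueError("value too large for this sketch")


def cbor_encode(item):
    """Encode a non-negative int, a str, or a list of those as CBOR."""
    if isinstance(item, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(item, int) and item >= 0:
        return _head(0, item)          # major type 0: unsigned integer
    if isinstance(item, str):
        data = item.encode("utf-8")
        return _head(3, len(data)) + data   # major type 3: text string
    if isinstance(item, list):
        body = b"".join(cbor_encode(x) for x in item)
        return _head(4, len(item)) + body   # major type 4: array
    raise TypeError("type not handled in this sketch")
```

   For example, cbor_encode([1, "a", 500]).hex() yields
   "830161611901f4": an array header for three items, the integer 1,
   the one-character string "a", and the integer 500 in two bytes.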
   Nothing prevents using other formats such as JSON, but this requires
   the ASA to be capable of parsing and generating JSON.  The formats
   acceptable by the GRASP API will limit the options in practice.  A
   fallback solution is for the API to accept and deliver the value
   field in raw CBOR, with the ASA itself encoding and decoding it via a
   CBOR library.

   Note that a mapping from YANG to CBOR is defined by
   [I-D.ietf-core-yang-cbor].  Subject to the size limit defined for
   GRASP messages, nothing prevents objectives using YANG in this way.

6.  Life Cycle

   Autonomic functions could be permanent, in the sense that ASAs are
   shipped as part of a product and persist throughout the product's
   life.  However, a more likely situation is that ASAs need to be
   installed or updated dynamically, because of new requirements or
   bugs.  Because continuity of service is fundamental to autonomic
   networking, the process of seamlessly replacing a running instance of
   an ASA with a new version needs to be part of the ASA's design.

   The implication of service continuity on the design of ASAs can be
   illustrated along the three main phases of the ASA life-cycle, namely
   Installation, Instantiation and Operation.

                     +--------------+
   Undeployed ------>|              |------> Undeployed
                     |  Installed   |
                 +-->|              |---+
         Mandate |   +--------------+   | Receives a
      is revoked |   +--------------+   |  Mandate
                 +---|              |<--+
                     | Instantiated |
                 +-->|              |---+
             set |   +--------------+   | set
            down |   +--------------+   | up
                 +---|              |<--+
                     |  Operational |
                     |              |
                     +--------------+

         Figure 1: Life cycle of an Autonomic Service Agent

6.1.  Installation phase

   Before being able to instantiate and run ASAs, the operator must
   first provision the infrastructure with the sets of ASA software
   corresponding to its needs and objectives.
   The provisioning of the
   infrastructure is realized in the installation phase and consists of
   installing (or checking the availability of) the pieces of software
   of the different ASA classes in a set of Installation Hosts.

   There are three properties applicable to the installation of ASAs:

   The dynamic installation property allows installing an ASA on
      demand, on any host compatible with the ASA.

   The decoupling property allows controlling resources of a NE from a
      remote ASA, i.e. an ASA installed on a host machine different from
      the resources' NE.

   The multiplicity property allows controlling multiple sets of
      resources from a single ASA.

   These three properties are very important in the context of the
   installation phase, as their variations condition how the ASA class
   could be installed on the infrastructure.

6.1.1.  Installation phase inputs and outputs

   Inputs are:

   [ASA class of type_x]  that specifies which classes of ASAs to
      install,

   [Installation_target_Infrastructure]  that specifies the candidate
      Installation Hosts,

   [ASA class placement function, e.g. under which criteria/constraints
   as defined by the operator]
      that specifies how the installation phase shall meet the
      operator's needs and objectives for the provision of the
      infrastructure.  In the coupled mode, the placement function is
      not necessary, whereas in the decoupled mode, the placement
      function is mandatory, even though it can be as simple as an
      explicit list of Installation Hosts.

   The main output of the installation phase is an up-to-date directory
   of installed ASAs which corresponds to [list of ASA classes]
   installed on [list of installation Hosts].  This output is also
   useful for the coordination function and corresponds to the static
   interaction map (see next section).
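
   The inputs and output above can be pictured with a small Python
   sketch.  Everything in it is hypothetical: the function install, its
   argument names, the placement function explicit_list, and the host
   and class names are invented for illustration and are not defined by
   this document.  In particular, treating the absence of a placement
   function as "install every class on every candidate host" is only an
   assumption made to keep the sketch short.

```python
def install(asa_classes, candidate_hosts, placement=None):
    """Return a directory {host: [ASA classes installed on it]}.

    `placement(asa_class, hosts)` returns the hosts chosen for a class.
    Without a placement function, this sketch assumes every class is
    installed on every candidate host.
    """
    directory = {host: [] for host in candidate_hosts}
    for asa_class in asa_classes:
        if placement is None:
            chosen = candidate_hosts
        else:
            chosen = placement(asa_class, candidate_hosts)
        for host in chosen:
            if host not in directory:
                raise ValueError(f"{host} is not a candidate host")
            directory[host].append(asa_class)
    return directory


def explicit_list(asa_class, hosts):
    """Simplest placement function: an explicit list of hosts per class."""
    table = {
        "load-balancer-asa": ["host-a"],
        "prefix-asa": ["host-a", "host-b"],
    }
    return table.get(asa_class, [])
```

   With these definitions, install(["load-balancer-asa", "prefix-asa"],
   ["host-a", "host-b"], explicit_list) returns a directory in which
   host-a carries both classes and host-b carries only prefix-asa.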

   The condition to validate in order to pass to the next phase is that
   [list of ASA classes] are correctly installed on [list of
   installation Hosts].  The state of the ASA at the end of the
   installation phase is: installed (not instantiated).  The following
   commands or messages are foreseen: install(list of ASA classes,
   Installation_target_Infrastructure, ASA class placement function),
   and un-install(list of ASA classes).

6.2.  Instantiation phase

   Once the ASAs are installed on the appropriate hosts in the network,
   these ASAs may start to operate.  From the operator viewpoint, an
   operating ASA means the ASA manages the network resources as per the
   objectives given.  At the ASA local level, operating means executing
   its control loop/algorithm.

   But right before that, there are two things to take into
   consideration.  First, there is a difference between 1. having a
   piece of code available to run on a host and 2. having an agent based
   on this piece of code running inside the host.  Second, in a coupled
   case, determining which resources are controlled by an ASA is
   straightforward (the determination is embedded), whereas in a
   decoupled mode determining this is a bit more complex (hence a
   starting agent will have to either discover it or be taught it).

   The instantiation phase of an ASA covers both these aspects: starting
   the agent piece of code (when this does not start automatically) and
   determining which resources have to be controlled (when this is not
   obvious).

6.2.1.  Operator's goal

   Through this phase, the operator wants to control its autonomic
   network in two ways:

   1  determine the scope of autonomic functions by instructing which of
      the network resources have to be managed by which autonomic
      function (and more precisely which class e.g. 1. version X or
      version Y or 2. provider A or provider B),

   2  determine how the autonomic functions are organized by instructing
      which ASAs have to interact with which other ASAs (or more
      precisely which set of network resources have to be handled as an
      autonomous group by their managing ASAs).

   Additionally in this phase, the operator may want to set objectives
   for autonomic functions, by configuring the ASAs' technical
   objectives.

   The operator's goal can be summarized in an instruction to the ANIMA
   ecosystem matching the following pattern:

      [ASA of type_x instances] ready to control
      [Instantiation_target_Infrastructure] with
      [Instantiation_target_parameters]

6.2.2.  Instantiation phase inputs and outputs

   Inputs are:

   [ASA of type_x instances]  that specifies which are the ASAs to be
      targeted (and more precisely which class e.g. 1. version X or
      version Y or 2. provider A or provider B),

   [Instantiation_target_Infrastructure]  that specifies which are the
      resources to be managed by the autonomic function; this can be the
      whole network or a subset of it, like a domain, a technology
      segment or even a specific list of resources,

   [Instantiation_target_parameters]  that specifies which are the
      technical objectives to be set to ASAs (e.g. an optimization
      target)

   Outputs are:

   [Set of ASAs - Resources relations]  describing which resources are
      managed by which ASA instances; this is not a formal message, but
      a resulting configuration of a set of ASAs.

6.2.3.  Instantiation phase requirements

   The instructions described in Section 6.2.1 could be either:

   sent to a targeted ASA  In which case, the receiving Agent will have
      to manage the specified list of
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   broadcast to all ASAs  In which case, the ASAs would collectively
      determine from the list which Agent(s) would handle which
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   This set of instructions can be materialized through a message that
   is named an Instance Mandate (description TBD).

   The conclusion of this instantiation phase is a ready-to-operate ASA
   (or interacting set of ASAs).  This ASA (or these ASAs) can then
   describe themselves by depicting which resources they manage and
   what this means in terms of metrics being monitored and in terms of
   actions that can be executed (like modifying parameter values).  A
   message conveying such a self-description is named an Instance
   Manifest (description TBD).

   Though the operator may well use such a self-description "per se",
   the final goal of such a description is to be shared with other ANIMA
   entities like:

   o  the coordination entities (see [I-D.ciavaglia-anima-coordination]
      - Autonomic Functions Coordination)

   o  collaborative entities, for the purpose of establishing knowledge
      exchanges (some ASAs may produce knowledge or monitor metrics
      that other ASAs cannot produce by themselves, even though they
      would be useful for their execution)

6.3.  Operation phase

   Note: This section is to be further developed in future revisions of
   the document, especially the implications on the design of ASAs.

   During the Operation phase, the operator can:

      Activate/Deactivate ASAs: meaning enabling them to execute their
      autonomic loop or not.

      Modify ASAs' targets: meaning setting them different objectives.

      Modify ASAs' managed resources: by updating the instance mandate,
      which would specify a different set of resources to manage (only
      applicable to decoupled ASAs).

   During the Operation phase, running ASAs can interact with one
   another:

      in order to exchange knowledge (e.g. an ASA providing traffic
      predictions to a load balancing ASA)

      in order to collaboratively reach an objective (e.g. ASAs
      pertaining to the same autonomic function targeted to manage a
      network domain; these ASAs will collaborate - in the case of a
      load balancing one, by modifying the link metrics according to
      the neighboring resources' loads)

   During the Operation phase, running ASAs are expected to apply
   coordination schemes

      then execute their control loop under coordination supervision/
      instructions

   The ASA life-cycle is discussed in more detail in "A Day in the Life
   of an Autonomic Function" [I-D.peloso-anima-autonomic-function].

7.  Coordination between Autonomic Functions

   Some autonomic functions will be completely independent of each
   other.  However, others are at risk of interfering with each other -
   for example, two different optimization functions might both attempt
   to modify the same underlying parameter in different ways.  In a
   complete system, a method is needed of identifying ASAs that might
   interfere with each other and coordinating their actions when
   necessary.  This issue is considered in "Autonomic Functions
   Coordination" [I-D.ciavaglia-anima-coordination].

8.  Coordination with Traditional Management Functions

   Some ASAs will have functions that overlap with existing
   configuration tools and network management mechanisms such as command
   line interfaces, DHCP, DHCPv6, SNMP, NETCONF, RESTCONF and YANG-based
   solutions.  Each ASA designer will need to consider this issue and
   how to avoid clashes and inconsistencies.  Some specific
   considerations for interaction with OAM tools are given in
   [I-D.ietf-anima-stable-connectivity].  As another example,
   [I-D.ietf-anima-prefix-management] describes how autonomic management
   of IPv6 prefixes can interact with prefix delegation via DHCPv6.
   The description of a GRASP objective and of an ASA using it should
   include a discussion of any such interactions.

   A related aspect is that management functions often include a data
   model, quite likely to be expressed in a formal notation such as
   YANG.  This aspect should not be an afterthought in the design of an
   ASA.  To the contrary, the design of the ASA and of its GRASP
   objectives should match the data model; as noted above, YANG
   serialized as CBOR may be used directly as the value of a GRASP
   objective.

9.  Robustness

   It is of great importance that all components of an autonomic system
   are highly robust.  In principle they must never fail.  This section
   lists various aspects of robustness that ASA designers should
   consider.

   1.  If despite all precautions, an ASA does encounter a fatal error,
       it should in any case restart automatically and try again.  To
       mitigate a hard loop in case of persistent failure, a suitable
       pause should be inserted before such a restart.  The length of
       the pause depends on the use case.

   2.  If a newly received or calculated value for a parameter falls out
       of bounds, the corresponding parameter should be either left
       unchanged or restored to a safe value.

   3.  If a GRASP synchronization or negotiation session fails for any
       reason, it may be repeated after a suitable pause.  The length of
       the pause depends on the use case.

   4.  If a session fails repeatedly, the ASA should consider that its
       peer has failed, and cause GRASP to flush its discovery cache and
       repeat peer discovery.

   5.  Any received GRASP message should be checked.  If it is wrongly
       formatted, it should be ignored.  Within a unicast session, an
       Invalid message (M_INVALID) may be sent.  This function may be
       provided by the GRASP implementation itself.

   6.  Any received GRASP objective should be checked.  If it is wrongly
       formatted, it should be ignored.
       Within a negotiation session, a Negotiation End message (M_END)
       with a Decline option (O_DECLINE) should be sent.  An ASA may
       log such events for diagnostic purposes.

   7.  If an ASA receives either an Invalid message (M_INVALID) or a
       Negotiation End message (M_END) with a Decline option
       (O_DECLINE), one possible reason is that the peer ASA does not
       support a new feature of either GRASP or the objective in
       question.  In such a case the ASA may choose to repeat the
       operation concerned without using that new feature.

   8.  All other possible exceptions should be handled in an orderly
       way.  There should be no such thing as an unhandled exception
       (but see point 1 above).

10.  Security Considerations

   ASAs are intended to run in an environment that is protected by the
   Autonomic Control Plane [I-D.ietf-anima-autonomic-control-plane],
   admission to which depends on an initial secure bootstrap process
   [I-D.ietf-anima-bootstrapping-keyinfra].  However, this does not
   relieve ASAs of responsibility for security.  In particular, when
   ASAs configure or manage network elements outside the ACP, they must
   use secure techniques and carefully validate any incoming
   information.  As appropriate to their specific functions, ASAs should
   take account of relevant privacy considerations [RFC6973].

   Authorization of ASAs is a subject for future study.  At present,
   ASAs are trusted by virtue of being installed on a node that has
   successfully joined the ACP.

11.  IANA Considerations

   This document makes no request of the IANA.

12.  Acknowledgements

   Useful comments were received from Toerless Eckert, Alex Galis, Bing
   Liu, and other members of the ANIMA WG.

13.  References

13.1.  Normative References

   [I-D.ietf-anima-autonomic-control-plane]
              Eckert, T., Behringer, M., and S.
              Bjarnason, "An Autonomic Control Plane (ACP)",
              draft-ietf-anima-autonomic-control-plane-18 (work in
              progress), August 2018.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
              S., and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)",
              draft-ietf-anima-bootstrapping-keyinfra-17 (work in
              progress), November 2018.

   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic
              Autonomic Signaling Protocol (GRASP)",
              draft-ietf-anima-grasp-15 (work in progress), July 2017.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

13.2.  Informative References

   [DeMola06]
              De Mola, F. and R. Quitadamo, "An Agent Model for Future
              Autonomic Communications", Proceedings of the 7th WOA 2006
              Workshop From Objects to Agents 51-59, September 2006.

   [GANA13]   ETSI GS AFI 002, "Autonomic network engineering for the
              self-managing Future Internet (AFI): GANA Architectural
              Reference Model for Autonomic Networking, Cognitive
              Networking and Self-Management.", April 2013.

   [Huebscher08]
              Huebscher, M. and J. McCann, "A survey of autonomic
              computing--degrees, models, and applications", ACM
              Computing Surveys (CSUR) Volume 40 Issue 3 DOI:
              10.1145/1380584.1380585, August 2008.

   [I-D.ciavaglia-anima-coordination]
              Ciavaglia, L. and P. Peloso, "Autonomic Functions
              Coordination", draft-ciavaglia-anima-coordination-01 (work
              in progress), March 2016.

   [I-D.ietf-anima-grasp-api]
              Carpenter, B., Liu, B., Wang, W., and X. Gong, "Generic
              Autonomic Signaling Protocol Application Program Interface
              (GRASP API)", draft-ietf-anima-grasp-api-02 (work in
              progress), June 2018.

   [I-D.ietf-anima-prefix-management]
              Jiang, S., Du, Z., Carpenter, B., and Q.
              Sun, "Autonomic IPv6 Edge Prefix Management in Large-scale
              Networks", draft-ietf-anima-prefix-management-07 (work in
              progress), December 2017.

   [I-D.ietf-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              and J. Nobre, "A Reference Model for Autonomic
              Networking", draft-ietf-anima-reference-model-10 (work in
              progress), November 2018.

   [I-D.ietf-anima-stable-connectivity]
              Eckert, T. and M. Behringer, "Using Autonomic Control
              Plane for Stable Connectivity of Network OAM",
              draft-ietf-anima-stable-connectivity-10 (work in
              progress), February 2018.

   [I-D.ietf-core-yang-cbor]
              Veillette, M., Pelov, A., Somaraju, A., Turner, R., and A.
              Minaburo, "CBOR Encoding of Data Modeled with YANG",
              draft-ietf-core-yang-cbor-07 (work in progress), September
              2018.

   [I-D.irtf-nfvrg-gaps-network-virtualization]
              Bernardos, C., Rahman, A., Zuniga, J., Contreras, L.,
              Aranda, P., and P. Lynch, "Network Virtualization Research
              Challenges",
              draft-irtf-nfvrg-gaps-network-virtualization-10 (work in
              progress), September 2018.

   [I-D.peloso-anima-autonomic-function]
              Peloso, P. and L. Ciavaglia, "A Day in the Life of an
              Autonomic Function",
              draft-peloso-anima-autonomic-function-01 (work in
              progress), March 2016.

   [Movahedi12]
              Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
              Survey of Autonomic Network Architectures and Evaluation
              Criteria", IEEE Communications Surveys & Tutorials Volume:
              14, Issue: 2 DOI: 10.1109/SURV.2011.042711.00078,
              Page(s): 464 - 490, 2012.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L.
              Ciavaglia, "Autonomic Networking: Definitions and Design
              Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

Appendix A.  Change log [RFC Editor: Please remove]

   draft-carpenter-anima-asa-guidelines-06, 2019-01-07:

   Expanded and improved example logic flow.

   Editorial corrections.

   draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

   Added section on relationship with non-autonomic components.

   Editorial corrections.

   draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

   Added note about simple ASAs.

   Added note about NFV/SFC services.

   Improved text about threading vs. event loop model.

   Added section about coordination with traditional tools.

   Added appendix with example logic flow.

   draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

   Added details on life cycle.

   Added details on robustness.

   Added co-authors.

   draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

   Expanded description of event-loop case.

   Added note about 'dry run' mode.

   draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

   More sections filled in.

   draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

   Initial version.

Appendix B.  Example Logic Flows

   This appendix describes generic logic flows for an Autonomic Service
   Agent (ASA) for resource management.  Note that these are
   illustrative examples, and in no sense requirements.  As long as the
   rules of GRASP are followed, a real implementation could be
   different.  The reader is assumed to be familiar with GRASP
   [I-D.ietf-anima-grasp] and its conceptual API
   [I-D.ietf-anima-grasp-api].

   A complete autonomic function for a resource would consist of a
   number of instances of the ASA placed at relevant points in a
   network.
   Specific details will of course depend on the resource
   concerned.  One example is IP address prefix management, as specified
   in [I-D.ietf-anima-prefix-management].  In this case, an instance of
   the ASA would exist in each delegating router.

   An underlying assumption is that there is an initial source of the
   resource in question, referred to here as a master ASA.  The other
   ASAs, known as delegators, obtain supplies of the resource from the
   master, and then delegate quantities of the resource to consumers
   that request it, and recover it when no longer needed.

   Another assumption is that there is a set of network-wide policy
   parameters, which the master will provide to the delegators.  These
   parameters will control how the delegators decide how much resource
   to provide to consumers.  Thus the ASA logic has two operating modes:
   master and delegator.  When running as a master, it starts by
   obtaining a quantity of the resource from the NOC, and it acts as a
   source of policy parameters, via both GRASP flooding and GRASP
   synchronization.  (In some scenarios, flooding or synchronization
   alone might be sufficient, but this example includes both.)

   When running as a delegator, it starts with an empty resource pool,
   it acquires the policy parameters by GRASP synchronization, and it
   delegates quantities of the resource to consumers that request it.
   Both as a master and as a delegator, when its pool is low it seeks
   quantities of the resource by requesting GRASP negotiation with peer
   ASAs.  When its pool is sufficient, it hands out resource to peer
   ASAs in response to negotiation requests.  Thus, over time, the
   initial resource pool held by the master will be shared among all the
   delegators according to demand.

   In theory a network could include any number of masters and any
   number of delegators, with the only condition being that each
   master's initial resource pool is unique.
   A realistic scenario is to
   have exactly one master and as many delegators as you like.  A
   scenario with no master is useless.

   An implementation requirement is that resource pools are kept in
   stable storage.  Otherwise, if a delegator exits for any reason, all
   the resources it has obtained or delegated are lost.  If a master
   exits, its entire spare pool is lost.  The logic for using stable
   storage and for crash recovery is not included below.

   The description below doesn't implement GRASP's 'dry run' function.
   That would mean temporarily marking any resource handed out in a dry
   run negotiation as reserved, until either the peer obtains it in a
   live run, or a suitable timeout expires.

   The main data structures used in each instance of the ASA are:

   o  The resource_pool, for example an ordered list of available
      resources.  Depending on the nature of the resource, units of
      resource are split when appropriate, and a background garbage
      collector recombines split resources if they are returned to the
      pool.

   o  The delegated_list, where a delegator stores the resources it has
      given to consumer routers.

   Possible main logic flows are below, using a threaded implementation
   model.  The transformation to an event loop model should be apparent
   - each thread would correspond to one event in the event loop.

   The GRASP objectives are as follows:

      ["EX1.Resource", flags, loop_count, value] where the value depends
      on the resource concerned, but will typically include its size and
      identification.

      ["EX1.Params", flags, loop_count, value] where the value will be,
      for example, a JSON object defining the applicable parameters.

   In the outline logic flows below, these objectives are represented
   simply by their names.
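
   As a purely illustrative sketch, and not part of GRASP or its API,
   the resource_pool and delegated_list described above might be
   modelled in Python as follows.  The sketch assumes, hypothetically,
   that a unit of resource can be represented as a (start, size) block
   of integers, as might suit an address range.  The class name
   ResourcePool and all of its method names are invented for this
   example.  Stable storage, which the text above requires, is omitted.

```python
import threading


class ResourcePool:
    """Ordered list of free (start, size) blocks, guarded by a lock.

    The (start, size) representation is an assumption made for this
    sketch; a real ASA would use whatever suits its resource.
    """

    def __init__(self, blocks=()):
        self._lock = threading.Lock()
        self._free = sorted(blocks)

    def request(self, amount):
        """Take `amount` units from the first block large enough,
        splitting it if necessary. Returns None if no block fits."""
        with self._lock:
            for i, (start, size) in enumerate(self._free):
                if size == amount:
                    del self._free[i]
                    return (start, amount)
                if size > amount:
                    # Split: hand out the front, keep the remainder.
                    self._free[i] = (start + amount, size - amount)
                    return (start, amount)
            return None

    def release(self, block):
        """Return a block to the pool, keeping the list ordered."""
        with self._lock:
            self._free.append(block)
            self._free.sort()

    def merge_adjacent(self):
        """One garbage-collector pass: recombine touching blocks."""
        with self._lock:
            merged = []
            for start, size in self._free:
                if merged and sum(merged[-1]) == start:
                    merged[-1] = (merged[-1][0], merged[-1][1] + size)
                else:
                    merged.append((start, size))
            self._free = merged

    def blocks(self):
        """Snapshot of the free list, for inspection."""
        with self._lock:
            return list(self._free)


# delegated_list: which consumer currently holds which block.
delegated_list = {}

pool = ResourcePool([(0, 16)])
delegated_list["consumer-1"] = pool.request(4)
delegated_list["consumer-2"] = pool.request(4)

# Both consumers release; the collector recombines the pieces.
pool.release(delegated_list.pop("consumer-1"))
pool.release(delegated_list.pop("consumer-2"))
pool.merge_adjacent()
print(pool.blocks())  # prints [(0, 16)]
```

   A single lock around the free list corresponds to the "associated
   lock" created by the main program in the logic flow, since the
   negotiator, delegator and garbage collector threads all touch the
   pool concurrently.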

   MAIN PROGRAM:

   Create empty resource_pool (and an associated lock)
   Create empty delegated_list
   Determine whether to act as master
   if master:
       Obtain initial resource_pool contents from NOC
       Obtain value of EX1.Params from NOC
   Register ASA with GRASP
   Register GRASP objectives EX1.Resource and EX1.Params
   if master:
       Start FLOODER thread to flood EX1.Params
       Start SYNCHRONIZER listener for EX1.Params
   Start MAIN_NEGOTIATOR thread for EX1.Resource
   if not master:
       Obtain value of EX1.Params from GRASP flood or synchronization
       Start DELEGATOR thread
   Start GARBAGE_COLLECTOR thread
   good_peer = none  #set once, so that a good peer is remembered
   do forever:
       if resource_pool is low:
           Calculate amount A of resource needed
           Discover peers using GRASP M_DISCOVER / M_RESPONSE
           if good_peer in peers:
               peer = good_peer
           else:
               peer = #any choice among peers
           grasp.request_negotiate("EX1.Resource", peer)
           #i.e., send M_REQ_NEG
           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
           if OK:
               if offered amount of resource sufficient:
                   Send M_END + O_ACCEPT #negotiation succeeded
                   Add resource to pool
                   good_peer = peer
               else:
                   Send M_END + O_DECLINE #negotiation failed
       sleep() #sleep time depends on application scenario

   MAIN_NEGOTIATOR thread:

   do forever:
       grasp.listen_negotiate("EX1.Resource")
       #i.e., wait for M_REQ_NEG
       Start a separate new NEGOTIATOR thread for requested amount A

   NEGOTIATOR thread:

   Request resource amount A from resource_pool
   if not OK:
       while not OK and A > Amin:
           A = A-1
           Request resource amount A from resource_pool
   if OK:
       Offer resource amount A to peer by GRASP M_NEGOTIATE
       if received M_END + O_ACCEPT:
           #negotiation succeeded
       elif received M_END + O_DECLINE or other error:
           #negotiation failed
   else:
       Send M_END + O_DECLINE #negotiation failed

   DELEGATOR thread:

   do forever:
       Wait for request or release for resource amount A
       if request:
           Get resource amount A from resource_pool
           if OK:
               Delegate resource to consumer
               Record in delegated_list
           else:
               Signal failure to consumer
               Signal main thread that resource_pool is low
       else:
           Delete resource from delegated_list
           Return resource amount A to resource_pool

   SYNCHRONIZER thread:

   do forever:
       Wait for M_REQ_SYN message for EX1.Params
       Reply with M_SYNCH message for EX1.Params

   FLOODER thread:

   do forever:
       Send M_FLOOD message for EX1.Params
       sleep() #sleep time depends on application scenario

   GARBAGE_COLLECTOR thread:

   do forever:
       Search resource_pool for adjacent resources
       Merge adjacent resources
       sleep() #sleep time depends on application scenario

Authors' Addresses

   Brian Carpenter
   Department of Computer Science
   University of Auckland
   PB 92019
   Auckland  1142
   New Zealand

   Email: brian.e.carpenter@gmail.com


   Laurent Ciavaglia
   Nokia
   Villarceaux
   Nozay  91460
   FR

   Email: laurent.ciavaglia@nokia.com


   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14, Huawei Campus, No.156 Beiqing Road
   Hai-Dian District, Beijing, 100095
   P.R. China

   Email: jiangsheng@huawei.com


   Pierre Peloso
   Nokia
   Villarceaux
   Nozay  91460
   FR

   Email: pierre.peloso@nokia.com