Network Working Group                                       B. Carpenter
Internet-Draft                                         Univ. of Auckland
Intended status: Informational                              L. Ciavaglia
Expires: January 1, 2019                                           Nokia
                                                                S. Jiang
                                            Huawei Technologies Co., Ltd
                                                               P. Peloso
                                                                   Nokia
                                                           June 30, 2018


                Guidelines for Autonomic Service Agents
                draft-carpenter-anima-asa-guidelines-05

Abstract

   This document proposes guidelines for the design of Autonomic Service
   Agents for autonomic networks.  It is based on the Autonomic Network
   Infrastructure outlined in the ANIMA reference model, making use of
   the Autonomic Control Plane and the Generic Autonomic Signaling
   Protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 1, 2019.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Logical Structure of an Autonomic Service Agent
   3.  Interaction with the Autonomic Networking Infrastructure
     3.1.  Interaction with the security mechanisms
     3.2.  Interaction with the Autonomic Control Plane
     3.3.  Interaction with GRASP and its API
     3.4.  Interaction with Intent mechanism
   4.  Interaction with Non-Autonomic Components
   5.  Design of GRASP Objectives
   6.  Life Cycle
     6.1.  Installation phase
       6.1.1.  Installation phase inputs and outputs
     6.2.  Instantiation phase
       6.2.1.  Operator's goal
       6.2.2.  Instantiation phase inputs and outputs
       6.2.3.  Instantiation phase requirements
     6.3.  Operation phase
   7.  Coordination between Autonomic Functions
   8.  Coordination with Traditional Management Functions
   9.  Robustness
   10. Security Considerations
   11. IANA Considerations
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Change log [RFC Editor: Please remove]
   Appendix B.  Example Logic Flows
     B.1.  Threaded Example
     B.2.  Event Loop Example
   Authors' Addresses

1.  Introduction

   This document proposes guidelines for the design of Autonomic Service
   Agents (ASAs) in the context of an Autonomic Network (AN) based on
   the Autonomic Network Infrastructure (ANI) outlined in the ANIMA
   reference model [I-D.ietf-anima-reference-model].  This
   infrastructure makes use of the Autonomic Control Plane (ACP)
   [I-D.ietf-anima-autonomic-control-plane] and the Generic Autonomic
   Signaling Protocol (GRASP) [I-D.ietf-anima-grasp].

   There is a considerable literature about autonomic agents with a
   variety of proposals about how they should be characterized.  Some
   examples are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].
   However, for the present document, the basic definitions and goals
   for autonomic networking given in [RFC7575] apply.  According to RFC
   7575, an Autonomic Service Agent is "An agent implemented on an
   autonomic node that implements an autonomic function, either in part
   (in the case of a distributed function) or whole."

   ASAs must be distinguished from other forms of software component.
   They are components of network or service management; they do not in
   themselves provide services.  For example, the services envisaged for
   network function virtualisation
   [I-D.irtf-nfvrg-gaps-network-virtualization] or for service function
   chaining [RFC7665] might be managed by an ASA rather than by
   traditional configuration tools.

   The reference model [I-D.ietf-anima-reference-model] expands this by
   adding that an ASA is "a process that makes use of the features
   provided by the ANI to achieve its own goals, usually including
   interaction with other ASAs via the GRASP protocol
   [I-D.ietf-anima-grasp] or otherwise.  Of course it also interacts
   with the specific targets of its function, using any suitable
   mechanism.  Unless its function is very simple, the ASA will need to
   handle overlapping asynchronous operations.  It may therefore be a
   quite complex piece of software in its own right, forming part of the
   application layer above the ANI."

   There will certainly be very simple ASAs that manage a single
   objective in a straightforward way and do not need asynchronous
   operations.  In such a case, many aspects of the current document do
   not apply.  However, in general a basic property of an ASA is that it
   is a relatively complex software component that will in many cases
   control and monitor simpler entities in the same host or elsewhere.
   For example, a device controller that manages tens or hundreds of
   simple devices might contain a single ASA.

   The remainder of this document offers guidance on the design of such
   ASAs.

2.  Logical Structure of an Autonomic Service Agent

   As mentioned above, all but the simplest ASAs will be multi-threaded
   programs.

   A typical ASA will have a main thread that performs various initial
   housekeeping actions such as:

   o  Obtain authorization credentials.

   o  Register the ASA with GRASP.

   o  Acquire relevant policy Intent.

   o  Define data structures for relevant GRASP objectives.

   o  Register with GRASP those objectives that it will actively manage.

   o  Launch a self-monitoring thread.

   o  Enter its main loop.

   The logic of the main loop will depend on the details of the
   autonomic function concerned.
   Whenever asynchronous operations are
   required, extra threads will be launched.  Examples of such threads
   include:

   o  A background thread to repeatedly flood an objective to the AN, so
      that any ASA can receive the objective's latest value.

   o  A thread to accept incoming synchronization requests for an
      objective managed by this ASA.

   o  A thread to accept incoming negotiation requests for an objective
      managed by this ASA, and then to conduct the resulting negotiation
      with the counterpart ASA.

   o  A thread to manage subsidiary non-autonomic devices directly.

   These threads should all either exit after their job is done, or
   enter a wait state for new work, to avoid blocking other threads
   unnecessarily.

   Not all programming environments explicitly support multi-threading.
   In such cases, an 'event loop' style of implementation could be
   adopted, in which case each of the above threads would be implemented
   as an event handler called in turn by the main loop.  In this case,
   the GRASP API (Section 3.3) must provide non-blocking calls.  If
   necessary, the GRASP session identifier will be used to distinguish
   simultaneous operations.

   According to the degree of parallelism needed by the application,
   some of these threads might be launched in multiple instances.  In
   particular, if negotiation sessions with other ASAs are expected to
   be long or to involve wait states, the ASA designer might allow for
   multiple simultaneous negotiating threads, with appropriate use of
   queues and locks to maintain consistency.

   The main loop itself could act as the initiator of synchronization
   requests or negotiation requests, when the ASA needs data or
   resources from other ASAs.  In particular, the main loop should watch
   for changes in policy Intent that affect its operation.
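
   As a rough illustration of the threading pattern described in this
   section, the following Python sketch runs several negotiation worker
   threads that draw on one shared resource pool.  It is not taken from
   any GRASP implementation: the names ResourcePool, negotiation_worker
   and run_demo are hypothetical, and an in-process queue stands in for
   incoming negotiation requests.  The point is only that each worker
   waits for new work and that a lock keeps concurrent allocations
   consistent.

```python
import queue
import threading


class ResourcePool:
    """Shared resource that several negotiation threads draw from."""

    def __init__(self, total):
        self._free = total
        self._lock = threading.Lock()

    def allocate(self, wanted):
        """Grant up to `wanted` units; the lock prevents double grants."""
        with self._lock:
            granted = min(wanted, self._free)
            self._free -= granted
            return granted

    @property
    def free(self):
        with self._lock:
            return self._free


def negotiation_worker(requests, pool, results):
    """Serve queued requests until a None sentinel arrives, then exit."""
    while True:
        item = requests.get()   # wait state: blocks until work arrives
        if item is None:
            break
        peer, wanted = item
        results.put((peer, pool.allocate(wanted)))


def run_demo(total=100, n_workers=4, n_requests=50, wanted=3):
    """Hypothetical driver: queue requests, run workers, collect grants."""
    pool = ResourcePool(total)
    requests, results = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=negotiation_worker,
                         args=(requests, pool, results))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for i in range(n_requests):
        requests.put((f"peer-{i}", wanted))
    for _ in workers:
        requests.put(None)      # one sentinel per worker
    for worker in workers:
        worker.join()
    granted = [results.get() for _ in range(n_requests)]
    return pool, granted
```

   In this sketch the requests ask for more than the pool holds, so
   some peers receive a partial or zero grant; a real ASA would turn
   that outcome into a negotiation step or a decline.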
   The main loop should
   also do whatever is required to avoid unnecessary resource
   consumption, such as including an arbitrary wait time in each cycle
   of the main loop.

   The self-monitoring thread is of considerable importance.  Autonomic
   service agents must never fail.  To a large extent this depends on
   careful coding and testing, with no unhandled error returns or
   exceptions, but if there is nevertheless some sort of failure, the
   self-monitoring thread should detect it, fix it if possible, and in
   the worst case restart the entire ASA.

   Appendix B presents some example logic flows in informal pseudocode.

3.  Interaction with the Autonomic Networking Infrastructure

3.1.  Interaction with the security mechanisms

   An ASA by definition runs in an autonomic node.  Before any normal
   ASAs are started, such nodes must be bootstrapped into the autonomic
   network's secure key infrastructure in accordance with
   [I-D.ietf-anima-bootstrapping-keyinfra].  This key infrastructure
   will be used to secure the ACP (next section) and may be used by ASAs
   to set up additional secure interactions with their peers, if needed.

   Note that the secure bootstrap process itself may include special-
   purpose ASAs that run in a constrained insecure mode.

3.2.  Interaction with the Autonomic Control Plane

   In a normal autonomic network, ASAs will run as clients of the ACP.
   It will provide a fully secured network environment for all
   communication with other ASAs, in most cases mediated by GRASP (next
   section).

   Note that the ACP formation process itself may include special-
   purpose ASAs that run in a constrained insecure mode.

3.3.  Interaction with GRASP and its API

   GRASP [I-D.ietf-anima-grasp] is expected to run as a separate process
   with its API [I-D.ietf-anima-grasp-api] available in user space.
   Thus ASAs may operate without special privilege, unless they need it
   for other reasons.
   The ASA's view of GRASP is built around GRASP
   objectives (Section 5), defined as data structures containing
   administrative information such as the objective's unique name, and
   its current value.  The format and size of the value are not
   restricted by the protocol, except that it must be possible to
   serialise it for transmission in CBOR [RFC7049], which is no
   restriction at all in practice.

   The GRASP API should offer the following features:

   o  Registration functions, so that an ASA can register itself and the
      objectives that it manages.

   o  A discovery function, by which an ASA can discover other ASAs
      supporting a given objective.

   o  A negotiation request function, by which an ASA can start
      negotiation of an objective with a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to negotiation requests, and a set of functions to
      support negotiating steps.

   o  A synchronization function, by which an ASA can request the
      current value of an objective from a counterpart ASA.  With this,
      there is a corresponding listening function for an ASA that wishes
      to respond to synchronization requests.

   o  A flood function, by which an ASA can cause the current value of
      an objective to be flooded throughout the AN so that any ASA can
      receive it.

   For further details and some additional housekeeping functions, see
   [I-D.ietf-anima-grasp-api].

   This API is intended to support the various interactions expected
   between most ASAs, such as the interactions outlined in Section 2.
   However, if ASAs require additional communication between themselves,
   they can do so using any desired protocol.  One option is to use
   GRASP discovery and synchronization as a rendez-vous mechanism
   between two ASAs, passing communication parameters such as a TCP port
   number via GRASP.
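
   The rendez-vous idea can be sketched with a toy, in-memory stand-in
   for GRASP.  This is emphatically not the GRASP API defined in
   [I-D.ietf-anima-grasp-api]: the class ToyGrasp, its method names, the
   objective name "EX_rendezvous" and the port number are all
   assumptions made for illustration.  The sketch only shows the order
   of operations: register, discover, then synchronize to obtain the
   communication parameters.

```python
class ToyGrasp:
    """In-memory stand-in for a GRASP daemon; NOT the real GRASP API."""

    def __init__(self):
        self._asas = set()
        self._objectives = {}   # objective name -> (owner ASA, value)

    def register_asa(self, asa_name):
        self._asas.add(asa_name)

    def register_objective(self, asa_name, objective, value):
        # Only a registered ASA may register an objective.
        if asa_name not in self._asas:
            raise PermissionError(f"{asa_name} is not registered")
        self._objectives[objective] = (asa_name, value)

    def discover(self, objective):
        """Return the name of the ASA supporting `objective`, or None."""
        entry = self._objectives.get(objective)
        return entry[0] if entry else None

    def synchronize(self, objective):
        """Return the current value of `objective`, or None."""
        entry = self._objectives.get(objective)
        return entry[1] if entry else None


def rendezvous_demo():
    """Hypothetical flow: a server ASA advertises a TCP port to a client."""
    grasp = ToyGrasp()
    grasp.register_asa("asa-server")
    grasp.register_asa("asa-client")
    # The server publishes where its private protocol is listening.
    grasp.register_objective("asa-server", "EX_rendezvous",
                             {"proto": "tcp", "port": 40123})
    # The client finds the peer, then fetches the parameters.
    peer = grasp.discover("EX_rendezvous")
    params = grasp.synchronize("EX_rendezvous")
    return peer, params
```

   After this exchange the client would open its own connection to the
   advertised port; that connection is outside GRASP.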
   As noted above, either the ACP or in special cases
   the autonomic key infrastructure will be used to secure such
   communications.

3.4.  Interaction with Intent mechanism

   At the time of writing, the Intent mechanism for the ANI is
   undefined.  It is expected to operate by an information distribution
   mechanism that can reach all autonomic nodes, and therefore every
   ASA.  However, each ASA must be capable of operating "out of the box"
   in the absence of locally defined Intent, so every ASA implementation
   must include carefully chosen default values and settings for all
   parameters and choices that might depend on Intent.

4.  Interaction with Non-Autonomic Components

   An ASA, to have any external effects, must also interact with non-
   autonomic components of the node where it is installed.  For example,
   an ASA whose purpose is to manage a resource must interact with that
   resource.  An ASA whose purpose is to manage an entity that is
   already managed by local software must interact with that software.
   This is stating the obvious, and the details are specific to each
   case, but it has an important security implication.  The ASA might
   act as a loophole by which the managed entity could penetrate the
   security boundary of the ANI.  The ASA must be designed to avoid such
   loopholes, and should if possible operate in an unprivileged mode.

   In an environment where systems are virtualized and specialized using
   techniques such as network function virtualization or network
   slicing, there will be a design choice whether ASAs are deployed once
   per physical node or once per virtual context.  A related issue is
   whether the ANI as a whole is deployed once on a physical network, or
   whether several virtual ANIs are deployed.  This aspect needs to be
   considered by the ASA designer.

5.  Design of GRASP Objectives

   The general rules for the format of GRASP Objective options, their
   names, and IANA registration are given in [I-D.ietf-anima-grasp].
   Additionally that document discusses various general considerations
   for the design of objectives, which are not repeated here.  However,
   we emphasize that the GRASP protocol does not provide transactional
   integrity.  In other words, if an ASA is capable of overlapping
   several negotiations for a given objective, then the ASA itself must
   use suitable locking techniques to avoid interference between these
   negotiations.  For example, if an ASA is allocating part of a shared
   resource to other ASAs, it needs to ensure that the same part of the
   resource is not allocated twice.  This might impact the design of the
   objective as well as the logic flow of the ASA.

   In particular, if 'dry run' mode is defined for the objective, its
   specification, and every implementation, must consider what state
   needs to be saved following a dry run negotiation, such that a
   subsequent live negotiation can be expected to succeed.  It must be
   clear how long this state is kept, and what happens if the live
   negotiation occurs after this state is deleted.  An ASA that requests
   a dry run negotiation must take account of the possibility that a
   successful dry run is followed by a failed live negotiation.  Because
   of these complexities, the dry run mechanism should only be supported
   by objectives and ASAs where there is a significant benefit from it.

   The actual value field of an objective is limited by the GRASP
   protocol definition to any data structure that can be expressed in
   Concise Binary Object Representation (CBOR) [RFC7049].  For some
   objectives, a single data item will suffice; for example an integer,
   a floating point number or a UTF-8 string.  For more complex cases, a
   simple tuple structure such as [item1, item2, item3] could be used.
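
   To make the serialisation of such values concrete, the following
   Python sketch encodes integers, UTF-8 strings and tuples into CBOR
   by hand.  It is a deliberately partial illustration, not a
   substitute for a CBOR library: floating point numbers, maps, byte
   strings and tags are omitted, and the function names _head and
   cbor_encode are assumptions of this sketch.

```python
import struct


def _head(major, n):
    """Encode the CBOR initial byte and argument for a major type."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                              (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([(major << 5) | extra]) + struct.pack(fmt, n)
    raise ValueError("integer too large for this sketch")


def cbor_encode(value):
    """Encode ints, UTF-8 strings and lists/tuples; nothing else."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        if value >= 0:
            return _head(0, value)          # major type 0: unsigned
        return _head(1, -1 - value)         # major type 1: negative
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data   # major type 3: text string
    if isinstance(value, (list, tuple)):    # major type 4: array
        return _head(4, len(value)) + b"".join(cbor_encode(v) for v in value)
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

   For instance, cbor_encode([1, 2, 3]) yields the four bytes
   0x83 0x01 0x02 0x03, which matches the example encoding given in
   Appendix A of [RFC7049].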
   Nothing prevents using other formats such as JSON, but this requires
   the ASA to be capable of parsing and generating JSON.  The formats
   acceptable by the GRASP API will limit the options in practice.  A
   fallback solution is for the API to accept and deliver the value
   field in raw CBOR, with the ASA itself encoding and decoding it via a
   CBOR library.

   Note that a mapping from YANG to CBOR is defined by
   [I-D.ietf-core-yang-cbor].  Subject to the size limit defined for
   GRASP messages, nothing prevents objectives using YANG in this way.

6.  Life Cycle

   Autonomic functions could be permanent, in the sense that ASAs are
   shipped as part of a product and persist throughout the product's
   life.  However, a more likely situation is that ASAs need to be
   installed or updated dynamically, because of new requirements or
   bugs.  Because continuity of service is fundamental to autonomic
   networking, the process of seamlessly replacing a running instance of
   an ASA with a new version needs to be part of the ASA's design.

   The implication of service continuity on the design of ASAs can be
   illustrated along the three main phases of the ASA life-cycle, namely
   Installation, Instantiation and Operation.

                     +--------------+
   Undeployed ------>|              |------> Undeployed
                     |   Installed  |
                 +-->|              |---+
        Mandate  |   +--------------+   | Receives a
      is revoked |   +--------------+   | Mandate
                 +---|              |<--+
                     | Instantiated |
                 +-->|              |---+
             set |   +--------------+   | set
            down |   +--------------+   | up
                 +---|              |<--+
                     |  Operational |
                     |              |
                     +--------------+

         Figure 1: Life cycle of an Autonomic Service Agent

6.1.  Installation phase

   Before being able to instantiate and run ASAs, the operator must
   first provision the infrastructure with the sets of ASA software
   corresponding to its needs and objectives.
   The provisioning of the
   infrastructure is realized in the installation phase and consists of
   installing (or checking the availability of) the pieces of software
   of the different ASA classes in a set of Installation Hosts.

   There are three properties applicable to the installation of ASAs:

   The dynamic installation property  allows installing an ASA on
      demand, on any host compatible with the ASA.

   The decoupling property  allows controlling the resources of a
      network element (NE) from a remote ASA, i.e. an ASA installed on
      a host machine different from the resources' NE.

   The multiplicity property  allows controlling multiple sets of
      resources from a single ASA.

   These three properties are very important in the context of the
   installation phase, as their variations determine how the ASA class
   could be installed on the infrastructure.

6.1.1.  Installation phase inputs and outputs

   Inputs are:

   [ASA class of type_x]  that specifies which classes of ASAs to
      install,

   [Installation_target_Infrastructure]  that specifies the candidate
      Installation Hosts,

   [ASA class placement function, e.g. under which criteria/constraints
   as defined by the operator]
      that specifies how the installation phase shall meet the
      operator's needs and objectives for the provision of the
      infrastructure.  In the coupled mode, the placement function is
      not necessary, whereas in the decoupled mode, the placement
      function is mandatory, even though it can be as simple as an
      explicit list of Installation hosts.

   The main output of the installation phase is an up-to-date directory
   of installed ASAs which corresponds to [list of ASA classes]
   installed on [list of installation Hosts].  This output is also
   useful for the coordination function and corresponds to the static
   interaction map (see next section).
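
   The relationship between the three inputs and the directory output
   described above can be sketched in Python as follows.  This is only
   an illustration under assumed names: the signature of install, the
   helper explicit_list, and the host and class names are hypothetical
   and are not defined by this document.  The placement function here is
   the simplest case mentioned above, an explicit list of hosts.

```python
def install(asa_classes, candidate_hosts, placement):
    """Return a directory {host: sorted list of ASA classes installed}."""
    directory = {}
    for asa_class in asa_classes:
        for host in placement(asa_class, candidate_hosts):
            if host not in candidate_hosts:
                raise ValueError(f"{host} is not a candidate host")
            directory.setdefault(host, set()).add(asa_class)
    return {host: sorted(classes) for host, classes in directory.items()}


def explicit_list(mapping):
    """Simplest placement function: an explicit host list per class."""
    def placement(asa_class, candidate_hosts):
        return mapping.get(asa_class, [])
    return placement


def demo_directory():
    """Hypothetical example with two ASA classes and three hosts."""
    hosts = ["router1", "router2", "controller"]
    placement = explicit_list({
        "load_balancer": ["router1", "router2"],
        "prefix_manager": ["controller"],
    })
    return install(["load_balancer", "prefix_manager"], hosts, placement)
```

   A richer placement function could apply operator constraints (for
   example, host capabilities) instead of a fixed mapping, without
   changing the shape of the resulting directory.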

   The condition to validate in order to pass to the next phase is that
   [list of ASA classes] are correctly installed on [list of
   installation Hosts].  The state of the ASA at the end of the
   installation phase is: installed (not instantiated).  The following
   commands or messages are foreseen: install(list of ASA classes,
   Installation_target_Infrastructure, ASA class placement function),
   and un-install(list of ASA classes).

6.2.  Instantiation phase

   Once the ASAs are installed on the appropriate hosts in the network,
   these ASAs may start to operate.  From the operator viewpoint, an
   operating ASA means the ASA manages the network resources as per the
   objectives given.  At the ASA local level, operating means executing
   their control loop/algorithm.

   But right before that, there are two things to take into
   consideration.  First, there is a difference between (1) having a
   piece of code available to run on a host and (2) having an agent
   based on this piece of code running inside the host.  Second, in the
   coupled case, determining which resources are controlled by an ASA is
   straightforward (the determination is embedded), whereas in the
   decoupled mode it is more complex (hence a starting agent will have
   to either discover it or be taught it).

   The instantiation phase of an ASA covers both these aspects: starting
   the agent piece of code (when this does not start automatically) and
   determining which resources have to be controlled (when this is not
   obvious).

6.2.1.  Operator's goal

   Through this phase, the operator wants to control its autonomic
   network in two respects:

   1  determine the scope of autonomic functions by instructing which of
      the network resources have to be managed by which autonomic
      function (and more precisely which class, e.g. (a) version X or
      version Y, or (b) provider A or provider B),

   2  determine how the autonomic functions are organized by instructing
      which ASAs have to interact with which other ASAs (or more
      precisely which set of network resources have to be handled as an
      autonomous group by their managing ASAs).

   Additionally in this phase, the operator may want to set objectives
   to autonomic functions, by configuring the ASAs' technical
   objectives.

   The operator's goal can be summarized in an instruction to the ANIMA
   ecosystem matching the following pattern:

      [ASA of type_x instances] ready to control
      [Instantiation_target_Infrastructure] with
      [Instantiation_target_parameters]

6.2.2.  Instantiation phase inputs and outputs

   Inputs are:

   [ASA of type_x instances]  that specifies which are the ASAs to be
      targeted (and more precisely which class, e.g. (a) version X or
      version Y, or (b) provider A or provider B),

   [Instantiation_target_Infrastructure]  that specifies which are the
      resources to be managed by the autonomic function; this can be the
      whole network or a subset of it such as a domain, a technology
      segment, or even a specific list of resources,

   [Instantiation_target_parameters]  that specifies which are the
      technical objectives to be set to ASAs (e.g. an optimization
      target).

   Outputs are:

   [Set of ASAs - Resources relations]  describing which resources are
      managed by which ASA instances; this is not a formal message, but
      a resulting configuration of a set of ASAs.

6.2.3.  Instantiation phase requirements

   The instructions described in Section 6.2.1 could be either:

   sent to a targeted ASA  In which case, the receiving Agent will have
      to manage the specified list of
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   broadcast to all ASAs  In which case, the ASAs would collectively
      determine from the list which Agent(s) would handle which
      [Instantiation_target_Infrastructure], with the
      [Instantiation_target_parameters].

   This set of instructions can be materialized through a message that
   is named an Instance Mandate (description TBD).

   The conclusion of this instantiation phase is a ready-to-operate ASA
   (or interacting set of ASAs).  This ASA (or these ASAs) can then
   describe themselves by depicting which resources they manage and
   what this means in terms of metrics being monitored and in terms of
   actions that can be executed (like modifying the parameter values).
   A message conveying such a self-description is named an Instance
   Manifest (description TBD).

   Though the operator may well use such a self-description "per se",
   the final goal of such a description is to be shared with other ANIMA
   entities like:

   o  the coordination entities (see [I-D.ciavaglia-anima-coordination]
      - Autonomic Functions Coordination)

   o  collaborative entities, for the purpose of establishing knowledge
      exchanges (some ASAs may produce knowledge or even monitor metrics
      that other ASAs cannot produce by themselves but that would be
      useful for their execution)

6.3.  Operation phase

   Note: This section is to be further developed in future revisions of
   the document, especially the implications on the design of ASAs.

   During the Operation phase, the operator can:

      Activate/Deactivate ASAs: meaning enabling them to execute their
      autonomic loop or not.

      Modify ASAs' targets: meaning setting them different objectives.

      Modify ASAs' managed resources: by updating the instance mandate,
      which would specify a different set of resources to manage (only
      applicable to decoupled ASAs).

   During the Operation phase, running ASAs can interact with each
   other:

      in order to exchange knowledge (e.g. an ASA providing traffic
      predictions to a load balancing ASA)

      in order to collaboratively reach an objective (e.g. ASAs
      pertaining to the same autonomic function targeted to manage a
      network domain; these ASAs will collaborate - in the case of a
      load balancing function, by modifying the link metrics according
      to the neighboring resources' loads)

   During the Operation phase, running ASAs are expected to apply
   coordination schemes

      then execute their control loop under coordination supervision/
      instructions

   The ASA life-cycle is discussed in more detail in "A Day in the Life
   of an Autonomic Function" [I-D.peloso-anima-autonomic-function].

7.  Coordination between Autonomic Functions

   Some autonomic functions will be completely independent of each
   other.  However, others are at risk of interfering with each other -
   for example, two different optimization functions might both attempt
   to modify the same underlying parameter in different ways.  In a
   complete system, a method is needed of identifying ASAs that might
   interfere with each other and coordinating their actions when
   necessary.  This issue is considered in "Autonomic Functions
   Coordination" [I-D.ciavaglia-anima-coordination].

8.  Coordination with Traditional Management Functions

   Some ASAs will have functions that overlap with existing
   configuration tools and network management mechanisms such as command
   line interfaces, DHCP, DHCPv6, SNMP, NETCONF, RESTCONF and YANG-based
   solutions.  Each ASA designer will need to consider this issue and
   how to avoid clashes and inconsistencies.  Some specific
   considerations for interaction with OAM tools are given in
   [I-D.ietf-anima-stable-connectivity].  As another example,
   [I-D.ietf-anima-prefix-management] describes how autonomic management
   of IPv6 prefixes can interact with prefix delegation via DHCPv6.
   The
   description of a GRASP objective and of an ASA using it should
   include a discussion of any such interactions.

   A related aspect is that management functions often include a data
   model, quite likely to be expressed in a formal notation such as
   YANG.  This aspect should not be an afterthought in the design of an
   ASA.  On the contrary, the design of the ASA and of its GRASP
   objectives should match the data model; as noted above, YANG
   serialized as CBOR may be used directly as the value of a GRASP
   objective.

9.  Robustness

   It is of great importance that all components of an autonomic system
   are highly robust.  In principle they must never fail.  This section
   lists various aspects of robustness that ASA designers should
   consider.

   1.  If despite all precautions, an ASA does encounter a fatal error,
       it should in any case restart automatically and try again.  To
       mitigate a hard loop in case of persistent failure, a suitable
       pause should be inserted before such a restart.  The length of
       the pause depends on the use case.

   2.  If a newly received or calculated value for a parameter falls out
       of bounds, the corresponding parameter should be either left
       unchanged or restored to a safe value.

   3.  If a GRASP synchronization or negotiation session fails for any
       reason, it may be repeated after a suitable pause.  The length of
       the pause depends on the use case.

   4.  If a session fails repeatedly, the ASA should consider that its
       peer has failed, and cause GRASP to flush its discovery cache and
       repeat peer discovery.

   5.  Any received GRASP message should be checked.  If it is wrongly
       formatted, it should be ignored.  Within a unicast session, an
       Invalid message (M_INVALID) may be sent.  This function may be
       provided by the GRASP implementation itself.

   6.  Any received GRASP objective should be checked.  If it is wrongly
       formatted, it should be ignored.
       Within a negotiation session, a Negotiation End message (M_END)
       with a Decline option (O_DECLINE) should be sent.  An ASA may
       log such events for diagnostic purposes.

   7.  If an ASA receives either an Invalid message (M_INVALID) or a
       Negotiation End message (M_END) with a Decline option
       (O_DECLINE), one possible reason is that the peer ASA does not
       support a new feature of either GRASP or the objective in
       question.  In such a case the ASA may choose to repeat the
       operation concerned without using that new feature.

   8.  All other possible exceptions should be handled in an orderly
       way.  There should be no such thing as an unhandled exception
       (but see point 1 above).

10.  Security Considerations

   ASAs are intended to run in an environment that is protected by the
   Autonomic Control Plane [I-D.ietf-anima-autonomic-control-plane],
   admission to which depends on an initial secure bootstrap process
   [I-D.ietf-anima-bootstrapping-keyinfra].  However, this does not
   relieve ASAs of responsibility for security.  In particular, when
   ASAs configure or manage network elements outside the ACP, they must
   use secure techniques and carefully validate any incoming
   information.  As appropriate to their specific functions, ASAs
   should take account of relevant privacy considerations [RFC6973].

   Authorization of ASAs is a subject for future study.  At present,
   ASAs are trusted by virtue of being installed on a node that has
   successfully joined the ACP.

11.  IANA Considerations

   This document makes no request of the IANA.

12.  Acknowledgements

   Useful comments were received from Toerless Eckert, Alex Galis, Bing
   Liu, and other members of the ANIMA WG.

13.  References

13.1.  Normative References

   [I-D.ietf-anima-autonomic-control-plane]
              Eckert, T., Behringer, M., and S.
              Bjarnason, "An Autonomic Control Plane (ACP)",
              draft-ietf-anima-autonomic-control-plane-16 (work in
              progress), June 2018.

   [I-D.ietf-anima-bootstrapping-keyinfra]
              Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
              S., and K. Watsen, "Bootstrapping Remote Secure Key
              Infrastructures (BRSKI)",
              draft-ietf-anima-bootstrapping-keyinfra-16 (work in
              progress), June 2018.

   [I-D.ietf-anima-grasp]
              Bormann, C., Carpenter, B., and B. Liu, "A Generic
              Autonomic Signaling Protocol (GRASP)",
              draft-ietf-anima-grasp-15 (work in progress), July 2017.

   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
              October 2013.

13.2.  Informative References

   [DeMola06]
              De Mola, F. and R. Quitadamo, "An Agent Model for Future
              Autonomic Communications", Proceedings of the 7th WOA
              2006 Workshop From Objects to Agents, pages 51-59,
              September 2006.

   [GANA13]   ETSI GS AFI 002, "Autonomic network engineering for the
              self-managing Future Internet (AFI): GANA Architectural
              Reference Model for Autonomic Networking, Cognitive
              Networking and Self-Management.", April 2013.

   [Huebscher08]
              Huebscher, M. and J. McCann, "A survey of autonomic
              computing--degrees, models, and applications", ACM
              Computing Surveys (CSUR) Volume 40 Issue 3,
              DOI 10.1145/1380584.1380585, August 2008.

   [I-D.ciavaglia-anima-coordination]
              Ciavaglia, L. and P. Peloso, "Autonomic Functions
              Coordination", draft-ciavaglia-anima-coordination-01
              (work in progress), March 2016.

   [I-D.ietf-anima-grasp-api]
              Carpenter, B., Liu, B., Wang, W., and X. Gong, "Generic
              Autonomic Signaling Protocol Application Program
              Interface (GRASP API)", draft-ietf-anima-grasp-api-01
              (work in progress), March 2018.

   [I-D.ietf-anima-prefix-management]
              Jiang, S., Du, Z., Carpenter, B., and Q.
              Sun, "Autonomic IPv6 Edge Prefix Management in
              Large-scale Networks",
              draft-ietf-anima-prefix-management-07 (work in progress),
              December 2017.

   [I-D.ietf-anima-reference-model]
              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
              and J. Nobre, "A Reference Model for Autonomic
              Networking", draft-ietf-anima-reference-model-06 (work in
              progress), February 2018.

   [I-D.ietf-anima-stable-connectivity]
              Eckert, T. and M. Behringer, "Using Autonomic Control
              Plane for Stable Connectivity of Network OAM",
              draft-ietf-anima-stable-connectivity-10 (work in
              progress), February 2018.

   [I-D.ietf-core-yang-cbor]
              Veillette, M., Pelov, A., Somaraju, A., Turner, R., and
              A. Minaburo, "CBOR Encoding of Data Modeled with YANG",
              draft-ietf-core-yang-cbor-06 (work in progress), February
              2018.

   [I-D.irtf-nfvrg-gaps-network-virtualization]
              Bernardos, C., Rahman, A., Zuniga, J., Contreras, L.,
              Aranda, P., and P. Lynch, "Network Virtualization
              Research Challenges",
              draft-irtf-nfvrg-gaps-network-virtualization-09 (work in
              progress), February 2018.

   [I-D.peloso-anima-autonomic-function]
              Peloso, P. and L. Ciavaglia, "A Day in the Life of an
              Autonomic Function",
              draft-peloso-anima-autonomic-function-01 (work in
              progress), March 2016.

   [Movahedi12]
              Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
              Survey of Autonomic Network Architectures and Evaluation
              Criteria", IEEE Communications Surveys & Tutorials
              Volume 14, Issue 2, DOI 10.1109/SURV.2011.042711.00078,
              pages 464-490, 2012.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R. Smith, "Privacy
              Considerations for Internet Protocols", RFC 6973,
              DOI 10.17487/RFC6973, July 2013.

   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
              Carpenter, B., Jiang, S., and L.
              Ciavaglia, "Autonomic Networking: Definitions and Design
              Goals", RFC 7575, DOI 10.17487/RFC7575, June 2015.

   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
              Chaining (SFC) Architecture", RFC 7665,
              DOI 10.17487/RFC7665, October 2015.

Appendix A.  Change log [RFC Editor: Please remove]

   draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

   Added section on relationship with non-autonomic components.

   Editorial corrections.

   draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

   Added note about simple ASAs.

   Added note about NFV/SFC services.

   Improved text about threading vs. event loop model.

   Added section about coordination with traditional tools.

   Added appendix with example logic flow.

   draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

   Added details on life cycle.

   Added details on robustness.

   Added co-authors.

   draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

   Expanded description of event-loop case.

   Added note about 'dry run' mode.

   draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

   More sections filled in.

   draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

   Initial version.

Appendix B.  Example Logic Flows

   This appendix outlines logic flows for a general purpose resource
   management ASA.  It is assumed that all ASA instances managing this
   resource use the same logic.  However, one instance acts as a
   master, initialised with a resource pool and a set of policy
   parameters.  The ASA uses a notional objective EX1 and an associated
   policy parameters objective EX1.Params.

B.1.  Threaded Example

   MAIN Thread:

   Create empty resource pool
   Decide whether to act as master
   if master:
       Obtain initial resources from NOC and add to pool
       Obtain EX1.Params values from NOC, or use default values
   Register ASA with GRASP
   Register objectives EX1 and EX1.Params
   if master:
       Start FLOODER thread to flood EX1.Params
       Start SYNCHRONIZER listener thread for EX1.Params
   Start MAIN_NEGOTIATOR and GARBAGE_COLLECTOR threads
   if not master:
       Obtain value of EX1.Params (from flood cache or via M_REQ_SYN
           message)
   Start ASSIGN thread
   while True:
       if resource pool is low:
           Calculate needed amount of resource
           Discover peers (M_DISCOVERY / M_RESPONSE)
           Choose a peer (prefer good_peer if available)
           Send M_REQ_NEG("EX1", peer)
           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
           if OK:
               if offered resource is sufficient:
                   Negotiation succeeded: Send M_END + O_ACCEPT
                   Add resource to pool
                   good_peer = peer
               else:
                   Fail negotiation: Send M_END + O_DECLINE
       sleep(10s)

   MAIN_NEGOTIATOR Thread:

   while True:
       Wait for M_REQ_NEG for EX1
       Start a separate new NEGOTIATOR thread
           (allows simultaneous negotiations)

   NEGOTIATOR Thread:

   Fetch available resource from pool
   if OK:
       Offer resource to peer: Send M_NEGOTIATE for EX1 objective
       if OK:
           Received M_END + O_ACCEPT
           Negotiation succeeded
       else:
           Received M_END + O_DECLINE or other error
           Return resource to pool
   else:
       Fail negotiation: Send M_END + O_DECLINE

   ASSIGN Thread:

   while True:
       Wait for resource request from managed entity
       Get resource from pool
       if OK:
           Assign resource to managed entity
       else:
           Signal main thread that pool is low

   GARBAGE_COLLECTOR Thread:

   while True:
       Return unused resources to pool
       sleep(5s)

   SYNCHRONIZER Thread:

   while True:
       Wait for M_REQ_SYN message for EX1.Params
       Reply with M_SYNCH message for EX1.Params

   FLOODER Thread:

   while True:
       Send M_FLOOD message for EX1.Params
       sleep(60s)

B.2.  Event Loop Example

   TBD

Authors' Addresses

   Brian Carpenter
   Department of Computer Science
   University of Auckland
   PB 92019
   Auckland  1142
   New Zealand

   Email: brian.e.carpenter@gmail.com


   Laurent Ciavaglia
   Nokia
   Villarceaux
   Nozay  91460
   FR

   Email: laurent.ciavaglia@nokia.com


   Sheng Jiang
   Huawei Technologies Co., Ltd
   Q14, Huawei Campus, No.156 Beiqing Road
   Hai-Dian District, Beijing, 100095
   P.R. China

   Email: jiangsheng@huawei.com


   Pierre Peloso
   Nokia
   Villarceaux
   Nozay  91460
   FR

   Email: pierre.peloso@nokia.com