idnits 2.17.1

draft-carpenter-anima-asa-guidelines-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 401 has weird spacing: '...roperty allow...'

  == Line 404 has weird spacing: '...roperty allow...'

  == Line 408 has weird spacing: '...roperty allow...'

  -- The document date (July 7, 2019) is 1755 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-30) exists of
     draft-ietf-anima-autonomic-control-plane-19

  == Outdated reference: A later version (-45) exists of
     draft-ietf-anima-bootstrapping-keyinfra-22

  ** Obsolete normative reference: RFC 7049 (Obsoleted by RFC 8949)

  == Outdated reference: A later version (-02) exists of
     draft-carpenter-anima-l2acp-scenarios-00

  == Outdated reference: A later version (-10) exists of
     draft-ietf-anima-grasp-api-03

  == Outdated reference: A later version (-20) exists of
     draft-ietf-core-yang-cbor-10


     Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                       B. Carpenter
3	Internet-Draft                                         Univ. of Auckland
4	Intended status: Informational                              L. Ciavaglia
5	Expires: January 8, 2020                                           Nokia
6	                                                                S. Jiang
7	                                            Huawei Technologies Co., Ltd
8	                                                               P. Peloso
9	                                                                   Nokia
10	                                                            July 7, 2019

12	                Guidelines for Autonomic Service Agents
13	                draft-carpenter-anima-asa-guidelines-07

15	Abstract

17	   This document proposes guidelines for the design of Autonomic Service
18	   Agents for autonomic networks.  It is based on the Autonomic Network
19	   Infrastructure outlined in the ANIMA reference model, making use of
20	   the Autonomic Control Plane and the Generic Autonomic Signaling
21	   Protocol.

23	Status of This Memo

25	   This Internet-Draft is submitted in full conformance with the
26	   provisions of BCP 78 and BCP 79.

28	   Internet-Drafts are working documents of the Internet Engineering
29	   Task Force (IETF).  Note that other groups may also distribute
30	   working documents as Internet-Drafts.  The list of current Internet-
31	   Drafts is at https://datatracker.ietf.org/drafts/current/.

33	   Internet-Drafts are draft documents valid for a maximum of six months
34	   and may be updated, replaced, or obsoleted by other documents at any
35	   time.  It is inappropriate to use Internet-Drafts as reference
36	   material or to cite them other than as "work in progress."

38	   This Internet-Draft will expire on January 8, 2020.

40	Copyright Notice

42	   Copyright (c) 2019 IETF Trust and the persons identified as the
43	   document authors.  All rights reserved.

45	   This document is subject to BCP 78 and the IETF Trust's Legal
46	   Provisions Relating to IETF Documents
47	   (https://trustee.ietf.org/license-info) in effect on the date of
48	   publication of this document.  Please review these documents
49	   carefully, as they describe your rights and restrictions with respect
50	   to this document.  Code Components extracted from this document must
51	   include Simplified BSD License text as described in Section 4.e of
52	   the Trust Legal Provisions and are provided without warranty as
53	   described in the Simplified BSD License.

55	Table of Contents

57	   1.  Introduction
58	   2.  Logical Structure of an Autonomic Service Agent
59	   3.  Interaction with the Autonomic Networking Infrastructure
60	     3.1.  Interaction with the security mechanisms
61	     3.2.  Interaction with the Autonomic Control Plane
62	     3.3.  Interaction with GRASP and its API
63	     3.4.  Interaction with policy mechanism
64	   4.  Interaction with Non-Autonomic Components
65	   5.  Design of GRASP Objectives
66	   6.  Life Cycle
67	     6.1.  Installation phase
68	       6.1.1.  Installation phase inputs and outputs
69	     6.2.  Instantiation phase
70	       6.2.1.  Operator's goal
71	       6.2.2.  Instantiation phase inputs and outputs
72	       6.2.3.  Instantiation phase requirements
73	     6.3.  Operation phase
74	   7.  Coordination between Autonomic Functions
75	   8.  Coordination with Traditional Management Functions
76	   9.  Robustness
77	   10. Security Considerations
78	   11. IANA Considerations
79	   12. Acknowledgements
80	   13. References
81	     13.1.  Normative References
82	     13.2.  Informative References
83	   Appendix A.  Change log [RFC Editor: Please remove]
84	   Appendix B.  Example Logic Flows
85	   Authors' Addresses

87	1.  Introduction

89	   This document proposes guidelines for the design of Autonomic Service
90	   Agents (ASAs) in the context of an Autonomic Network (AN) based on
91	   the Autonomic Network Infrastructure (ANI) outlined in the ANIMA
92	   reference model [I-D.ietf-anima-reference-model].  This
93	   infrastructure makes use of the Autonomic Control Plane (ACP)
94	   [I-D.ietf-anima-autonomic-control-plane] and the Generic Autonomic
95	   Signaling Protocol (GRASP) [I-D.ietf-anima-grasp].

97	   There is a considerable literature about autonomic agents with a
98	   variety of proposals about how they should be characterized.  Some
99	   examples are [DeMola06], [Huebscher08], [Movahedi12] and [GANA13].
100	   However, for the present document, the basic definitions and goals
101	   for autonomic networking given in [RFC7575] apply.  According to RFC
102	   7575, an Autonomic Service Agent is "An agent implemented on an
103	   autonomic node that implements an autonomic function, either in part
104	   (in the case of a distributed function) or whole."

106	   ASAs must be distinguished from other forms of software component.
107	   They are components of network or service management; they do not in
108	   themselves provide services.  For example, the services envisaged for
109	   network function virtualisation [RFC8568] or for service function
110	   chaining [RFC7665] might be managed by an ASA rather than by
111	   traditional configuration tools.

113	   The reference model [I-D.ietf-anima-reference-model] expands this by
114	   adding that an ASA is "a process that makes use of the features
115	   provided by the ANI to achieve its own goals, usually including
116	   interaction with other ASAs via the GRASP protocol
117	   [I-D.ietf-anima-grasp] or otherwise.  Of course it also interacts
118	   with the specific targets of its function, using any suitable
119	   mechanism.  Unless its function is very simple, the ASA will need to
120	   handle overlapping asynchronous operations.  This will require either
121	   a multi-threaded implementation, or a logically equivalent event loop
122	   structure.  It may therefore be a quite complex piece of software in
123	   its own right, forming part of the application layer above the ANI."

125	   There will certainly be very simple ASAs that manage a single
126	   objective in a straightforward way and do not need asynchronous
127	   operations.  In such a case, many aspects of the current document do
128	   not apply.  However, in general a basic property of an ASA is that it
129	   is a relatively complex software component that will in many cases
130	   control and monitor simpler entities in the same host or elsewhere.
131	   For example, a device controller that manages tens or hundreds of
132	   simple devices might contain a single ASA.

134	   The remainder of this document offers guidance on the design of such
135	   ASAs.

137	2.  Logical Structure of an Autonomic Service Agent

139	   As mentioned above, all but the simplest ASAs will need to support
140	   asynchronous operations.  Not all programming environments explicitly
141	   support multi-threading.  Where they do not, an 'event loop' style of
142	   implementation should be adopted, in which each logical thread is
143	   implemented as an event handler called in turn by the main loop.  For
144	   this, the GRASP API (Section 3.3) must provide non-blocking calls.

146	   If necessary, the GRASP session identifier will be used to
147	   distinguish simultaneous operations.
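
   The event loop style described above can be sketched as follows.
   This is a minimal illustration only, assuming Python's asyncio as
   the event loop; the handler name, the operation kinds and the use
   of a simple counter as a stand-in for GRASP session identifiers are
   all assumptions made for this example, not part of any GRASP API.

```python
import asyncio
import itertools

async def handle_operation(session_ids, kind, delay, log):
    """Event handler for one operation, tagged with its own session id."""
    session_id = next(session_ids)      # stand-in for a GRASP session identifier
    log.append((session_id, kind, "start"))
    await asyncio.sleep(delay)          # non-blocking wait; other handlers run
    log.append((session_id, kind, "done"))

async def main_loop():
    """Run two overlapping operations on a single event loop."""
    session_ids = itertools.count(1)    # hypothetical id source for this sketch
    log = []
    await asyncio.gather(
        handle_operation(session_ids, "negotiate", 0.05, log),
        handle_operation(session_ids, "synchronize", 0.01, log),
    )
    return log

def run_demo():
    return asyncio.run(main_loop())
```

   Both operations are started before either completes, and each log
   entry carries the session identifier that tells them apart.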

149	   A typical ASA will have a main thread that performs various initial
150	   housekeeping actions such as:

152	   o  Obtain authorization credentials.

154	   o  Register the ASA with GRASP.

156	   o  Acquire relevant policy parameters.

158	   o  Define data structures for relevant GRASP objectives.

160	   o  Register with GRASP those objectives that it will actively manage.

162	   o  Launch a self-monitoring thread.

164	   o  Enter its main loop.

166	   The logic of the main loop will depend on the details of the
167	   autonomic function concerned.  Whenever asynchronous operations are
168	   required, extra threads will be launched, or events added to the
169	   event loop.  Examples include:

171	   o  Repeatedly flood an objective to the AN, so that any ASA can
172	      receive the objective's latest value.

174	   o  Accept incoming synchronization requests for an objective managed
175	      by this ASA.

177	   o  Accept incoming negotiation requests for an objective managed by
178	      this ASA, and then conduct the resulting negotiation with the
179	      counterpart ASA.

181	   o  Manage subsidiary non-autonomic devices directly.

183	   These threads or events should all either exit after their job is
184	   done, or enter a wait state for new work, to avoid blocking others
185	   unnecessarily.

187	   According to the degree of parallelism needed by the application,
188	   some of these threads or events might be launched in multiple
189	   instances.  In particular, if negotiation sessions with other ASAs
190	   are expected to be long or to involve wait states, the ASA designer
191	   might allow for multiple simultaneous negotiating threads, with
192	   appropriate use of queues and locks to maintain consistency.
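
   One possible arrangement of multiple negotiating threads is
   sketched below, assuming Python threads, a shared work queue and a
   lock on the shared results.  The function names, the tuple format
   of a request and the placeholder "grant at most 10" rule are
   assumptions for illustration; a real ASA would call the GRASP API
   where the placeholder logic appears.

```python
import queue
import threading

def negotiator(requests, results, results_lock):
    """Worker: wait for new work, handle it, exit on the None sentinel."""
    while True:
        request = requests.get()        # wait state until work arrives
        if request is None:             # sentinel: no more work, so exit
            break
        peer, wanted = request
        granted = min(wanted, 10)       # placeholder negotiation logic
        with results_lock:              # keep the shared results consistent
            results[peer] = granted

def run_negotiators(pending, workers=3):
    """Start several negotiator threads and feed them the pending requests."""
    requests = queue.Queue()
    results, results_lock = {}, threading.Lock()
    threads = [
        threading.Thread(target=negotiator,
                         args=(requests, results, results_lock))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for request in pending:
        requests.put(request)
    for _ in threads:
        requests.put(None)              # one sentinel per worker
    for thread in threads:
        thread.join()
    return results
```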

194	   The main loop itself could act as the initiator of synchronization
195	   requests or negotiation requests, when the ASA needs data or
196	   resources from other ASAs.  In particular, the main loop should watch
197	   for changes in policy parameters that affect its operation.  It
198	   should also do whatever is required to avoid unnecessary resource
199	   consumption, such as including an arbitrary wait time in each cycle
200	   of the main loop.
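
   A main loop that watches for policy changes and pauses in each
   cycle might look like the following sketch.  The callables
   read_policy and apply_policy, and the bounded cycle count used to
   make the example terminate, are assumptions for illustration only.

```python
import time

def main_loop(read_policy, apply_policy, cycles, wait_seconds=0.01):
    """Run a bounded number of cycles; a real ASA would loop indefinitely."""
    current = None
    for _ in range(cycles):
        latest = read_policy()
        if latest != current:           # policy parameters have changed
            apply_policy(latest)
            current = latest
        time.sleep(wait_seconds)        # pause so the loop does not spin
    return current
```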

202	   The self-monitoring thread is of considerable importance.  Autonomic
203	   service agents must never fail.  To a large extent this depends on
204	   careful coding and testing, with no unhandled error returns or
205	   exceptions, but if there is nevertheless some sort of failure, the
206	   self-monitoring thread should detect it, fix it if possible, and in
207	   the worst case restart the entire ASA.
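
   The self-monitoring thread could be built around a heartbeat check,
   as in the sketch below.  This is one possible approach rather than
   a recommended design: the Supervisor class, the heartbeat
   convention and the restart counter (standing in for a real restart
   action) are assumptions introduced for this example.

```python
import threading
import time

class Supervisor:
    """Detect a stalled activity by the age of its last heartbeat."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.restarts = 0
        self.last_beat = time.monotonic()

    def heartbeat(self):
        """Called by healthy worker threads to show they are alive."""
        self.last_beat = time.monotonic()

    def check(self, now=None):
        """Restart the monitored activity if its heartbeat is stale."""
        now = time.monotonic() if now is None else now
        if now - self.last_beat > self.timeout:
            self.restarts += 1          # placeholder for a real restart action
            self.last_beat = now
            return True
        return False

    def run(self, stop, poll=0.005):
        """Body of the self-monitoring thread; ends when `stop` is set."""
        while not stop.is_set():
            self.check()
            stop.wait(poll)
```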

209	   Appendix B presents some example logic flows in informal pseudocode.

211	3.  Interaction with the Autonomic Networking Infrastructure

213	3.1.  Interaction with the security mechanisms

215	   An ASA by definition runs in an autonomic node.  Before any normal
216	   ASAs are started, such nodes must be bootstrapped into the autonomic
217	   network's secure key infrastructure in accordance with
218	   [I-D.ietf-anima-bootstrapping-keyinfra].  This key infrastructure
219	   will be used to secure the ACP (next section) and may be used by ASAs
220	   to set up additional secure interactions with their peers, if needed.

222	   Note that the secure bootstrap process itself may include special-
223	   purpose ASAs that run in a constrained insecure mode.

225	3.2.  Interaction with the Autonomic Control Plane

227	   In a normal autonomic network, ASAs will run as clients of the ACP.
228	   It will provide a fully secured network environment for all
229	   communication with other ASAs, in most cases mediated by GRASP (next
230	   section).

232	   Note that the ACP formation process itself may include special-
233	   purpose ASAs that run in a constrained insecure mode.

235	3.3.  Interaction with GRASP and its API

237	   GRASP [I-D.ietf-anima-grasp] is expected to run as a separate process
238	   with its API [I-D.ietf-anima-grasp-api] available in user space.
239	   Thus ASAs may operate without special privilege, unless they need it
240	   for other reasons.  The ASA's view of GRASP is built around GRASP
241	   objectives (Section 5), defined as data structures containing
242	   administrative information such as the objective's unique name, and
243	   its current value.  The format and size of the value is not
244	   restricted by the protocol, except that it must be possible to
245	   serialise it for transmission in CBOR [RFC7049], which is no
246	   restriction at all in practice.

248	   The GRASP API should offer the following features:

250	   o  Registration functions, so that an ASA can register itself and the
251	      objectives that it manages.

253	   o  A discovery function, by which an ASA can discover other ASAs
254	      supporting a given objective.

256	   o  A negotiation request function, by which an ASA can start
257	      negotiation of an objective with a counterpart ASA.  With this,
258	      there is a corresponding listening function for an ASA that wishes
259	      to respond to negotiation requests, and a set of functions to
260	      support negotiating steps.

262	   o  A synchronization function, by which an ASA can request the
263	      current value of an objective from a counterpart ASA.  With this,
264	      there is a corresponding listening function for an ASA that wishes
265	      to respond to synchronization requests.

267	   o  A flood function, by which an ASA can cause the current value of
268	      an objective to be flooded throughout the AN so that any ASA can
269	      receive it.

271	   For further details and some additional housekeeping functions, see
272	   [I-D.ietf-anima-grasp-api].
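
   To make the feature list above concrete, the following toy,
   in-memory stand-in shows how registration, discovery,
   synchronization and flooding relate to each other.  It is not the
   GRASP API: the class, method names and signatures are illustrative
   assumptions, negotiation is omitted for brevity, and the
   authoritative definitions are those in [I-D.ietf-anima-grasp-api].

```python
class ToyGrasp:
    """In-memory stand-in for a GRASP API; not the real interface."""

    def __init__(self):
        self.asas = set()
        self.owners = {}      # objective name -> set of ASA names
        self.values = {}      # (ASA name, objective name) -> current value
        self.flooded = {}     # objective name -> last flooded value

    def register_asa(self, asa):
        self.asas.add(asa)

    def register_objective(self, asa, objective, value=None):
        if asa not in self.asas:
            raise ValueError("ASA must register itself first")
        self.owners.setdefault(objective, set()).add(asa)
        self.values[(asa, objective)] = value

    def discover(self, objective):
        """Return the ASAs that support the given objective."""
        return sorted(self.owners.get(objective, ()))

    def synchronize(self, objective, peer):
        """Return the current value held by a counterpart ASA."""
        return self.values[(peer, objective)]

    def flood(self, asa, objective, value):
        """Make a value available to every ASA."""
        self.values[(asa, objective)] = value
        self.flooded[objective] = value
```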

274	   This API is intended to support the various interactions expected
275	   between most ASAs, such as the interactions outlined in Section 2.
276	   However, if ASAs require additional communication between themselves,
277	   they can do so using any desired protocol.  One option is to use
278	   GRASP discovery and synchronization as a rendezvous mechanism
279	   between two ASAs, passing communication parameters such as a TCP port
280	   number via GRASP.  As noted above, either the ACP or in special cases
281	   the autonomic key infrastructure will be used to secure such
282	   communications.

284	3.4.  Interaction with policy mechanism

286	   At the time of writing, the policy (or "Intent") mechanism for the
287	   ANI is undefined.  It is expected to operate by an information
288	   distribution mechanism that can reach all autonomic nodes, and
289	   therefore every ASA.  However, each ASA must be capable of operating
290	   "out of the box" in the absence of locally defined policy, so every
291	   ASA implementation must include carefully chosen default values and
292	   settings for all policy parameters.
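
   Since the policy mechanism is undefined, the following sketch only
   illustrates the "out of the box" requirement: built-in defaults
   that distributed policy may override.  The parameter names and
   default values are hypothetical, as is the choice to ignore
   unknown parameters.

```python
# Hypothetical policy parameters with built-in default values.
DEFAULT_POLICY = {"max_prefix_length": 64, "retry_seconds": 30}

def effective_policy(distributed=None):
    """Overlay distributed policy on the defaults; ignore unknown keys."""
    policy = dict(DEFAULT_POLICY)
    for key, value in (distributed or {}).items():
        if key in policy:
            policy[key] = value
    return policy
```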

294	4.  Interaction with Non-Autonomic Components

296	   An ASA, to have any external effects, must also interact with non-
297	   autonomic components of the node where it is installed.  For example,
298	   an ASA whose purpose is to manage a resource must interact with that
299	   resource.  An ASA whose purpose is to manage an entity that is
300	   already managed by local software must interact with that software.
301	   This is stating the obvious, and the details are specific to each
302	   case, but it has an important security implication.  The ASA might
303	   act as a loophole by which the managed entity could penetrate the
304	   security boundary of the ANI.  The ASA must be designed to avoid such
305	   loopholes, and should if possible operate in an unprivileged mode.

307	   In an environment where systems are virtualized and specialized using
308	   techniques such as network function virtualization or network
309	   slicing, there will be a design choice whether ASAs are deployed once
310	   per physical node or once per virtual context.  A related issue is
311	   whether the ANI as a whole is deployed once on a physical network, or
312	   whether several virtual ANIs are deployed.  This aspect needs to be
313	   considered by the ASA designer.

315	5.  Design of GRASP Objectives

317	   The general rules for the format of GRASP Objective options, their
318	   names, and IANA registration are given in [I-D.ietf-anima-grasp].
319	   Additionally that document discusses various general considerations
320	   for the design of objectives, which are not repeated here.  However,
321	   we emphasize that the GRASP protocol does not provide transactional
322	   integrity.  In other words, if an ASA is capable of overlapping
323	   several negotiations for a given objective, then the ASA itself must
324	   use suitable locking techniques to avoid interference between these
325	   negotiations.  For example, if an ASA is allocating part of a shared
326	   resource to other ASAs, it needs to ensure that the same part of the
327	   resource is not allocated twice.  This might impact the design of the
328	   objective as well as the logic flow of the ASA.
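
   The locking requirement can be illustrated with a shared pool whose
   units are handed out under a lock, so that overlapping negotiations
   never receive the same unit.  This is a minimal sketch; the
   ResourcePool class and its all-or-nothing grant rule are
   assumptions made for the example.

```python
import threading

class ResourcePool:
    """Shared resource split into numbered units, each granted at most once."""

    def __init__(self, units):
        self._free = set(range(units))
        self._owner = {}                # unit number -> ASA name
        self._lock = threading.Lock()

    def allocate(self, asa, count):
        """Grant `count` units to `asa`, or nothing if too few remain."""
        with self._lock:                # one negotiation updates the pool at a time
            if count > len(self._free):
                return []
            granted = sorted(self._free)[:count]
            self._free.difference_update(granted)
            for unit in granted:
                self._owner[unit] = asa
            return granted
```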

330	   In particular, if 'dry run' mode is defined for the objective, its
331	   specification, and every implementation, must consider what state
332	   needs to be saved following a dry run negotiation, such that a
333	   subsequent live negotiation can be expected to succeed.  It must be
334	   clear how long this state is kept, and what happens if the live
335	   negotiation occurs after this state is deleted.  An ASA that requests
336	   a dry run negotiation must take account of the possibility that a
337	   successful dry run is followed by a failed live negotiation.  Because
338	   of these complexities, the dry run mechanism should only be supported
339	   by objectives and ASAs where there is a significant benefit from it.
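
   One way to make the lifetime of dry-run state explicit is sketched
   below.  The DryRunCache class, its fixed lifetime and the rule that
   a live request without valid state is simply refused are
   assumptions for illustration; each objective's specification would
   need to define its own rules.

```python
import time

class DryRunCache:
    """Remember dry-run outcomes for a limited time (hypothetical design)."""

    def __init__(self, lifetime_seconds):
        self.lifetime = lifetime_seconds
        self._held = {}                 # peer -> (amount, expiry time)

    def record_dry_run(self, peer, amount, now=None):
        now = time.monotonic() if now is None else now
        self._held[peer] = (amount, now + self.lifetime)

    def confirm_live(self, peer, amount, now=None):
        """Return True if a live request matches unexpired dry-run state."""
        now = time.monotonic() if now is None else now
        held = self._held.pop(peer, None)
        if held is None:
            return False                # no dry run on record for this peer
        held_amount, expiry = held
        return now <= expiry and amount == held_amount
```

   A False result corresponds to the case the text warns about: the
   requesting ASA must be prepared for a live negotiation to fail even
   though the dry run succeeded.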

341	   The actual value field of an objective is limited by the GRASP
342	   protocol definition to any data structure that can be expressed in
343	   Concise Binary Object Representation (CBOR) [RFC7049].  For some
344	   objectives, a single data item will suffice; for example an integer,
345	   a floating point number or a UTF-8 string.  For more complex cases, a
346	   simple tuple structure such as [item1, item2, item3] could be used.
347	   Nothing prevents using other formats such as JSON, but this requires
348	   the ASA to be capable of parsing and generating JSON.  The formats
349	   acceptable by the GRASP API will limit the options in practice.  A
350	   fallback solution is for the API to accept and deliver the value
351	   field in raw CBOR, with the ASA itself encoding and decoding it via a
352	   CBOR library.
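
   To show what raw CBOR looks like for the simple cases mentioned
   above (an integer, a UTF-8 string, a tuple), here is a deliberately
   tiny encoder written for this example.  It covers only integers,
   text strings and arrays; a real ASA would be expected to use a
   complete CBOR library instead.

```python
import struct

def _head(major, n):
    """Encode the CBOR initial byte and argument n for a major type."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                             (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([(major << 5) | info]) + struct.pack(fmt, n)
    raise ValueError("integer too large for this sketch")

def encode(value):
    """Encode ints, text strings and lists/tuples as CBOR bytes."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(value, int):
        if value < 0:
            return _head(1, -1 - value)    # major type 1: negative integer
        return _head(0, value)             # major type 0: unsigned integer
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data  # major type 3: text string
    if isinstance(value, (list, tuple)):   # major type 4: array
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    raise TypeError("unsupported type: " + type(value).__name__)
```

   For instance, the tuple [1, 2, 3] becomes the four bytes 83 01 02
   03 in hexadecimal.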

354	   Note that a mapping from YANG to CBOR is defined by
355	   [I-D.ietf-core-yang-cbor].  Subject to the size limit defined for
356	   GRASP messages, nothing prevents objectives using YANG in this way.

358	6.  Life Cycle

360	   Autonomic functions could be permanent, in the sense that ASAs are
361	   shipped as part of a product and persist throughout the product's
362	   life.  However, a more likely situation is that ASAs need to be
363	   installed or updated dynamically, because of new requirements or
364	   bugs.  Because continuity of service is fundamental to autonomic
365	   networking, the process of seamlessly replacing a running instance of
366	   an ASA with a new version needs to be part of the ASA's design.

368	   The implication of service continuity on the design of ASAs can be
369	   illustrated along the three main phases of the ASA life-cycle, namely
370	   Installation, Instantiation and Operation.

372	                     +--------------+
373	   Undeployed ------>|              |------> Undeployed
374	                     |  Installed   |
375	                 +-->|              |---+
376	        Mandate  |   +--------------+   | Receives a
377	      is revoked |   +--------------+   |  Mandate
378	                 +---|              |<--+
379	                     | Instantiated |
380	                 +-->|              |---+
381	             set |   +--------------+   | set
382	            down |   +--------------+   | up
383	                 +---|              |<--+
384	                     |  Operational |
385	                     |              |
386	                     +--------------+

388	            Figure 1: Life cycle of an Autonomic Service Agent
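
   The life cycle in Figure 1 can be read as a small state machine,
   sketched below.  The state names come from the figure.  The event
   names are assumptions: "receive_mandate", "revoke_mandate",
   "set_up" and "set_down" paraphrase the figure's labels, while
   "install" and "uninstall" are chosen here for the two unlabeled
   arrows to and from "Undeployed".

```python
# (current state, event) -> next state, following Figure 1.
TRANSITIONS = {
    ("Undeployed", "install"): "Installed",
    ("Installed", "uninstall"): "Undeployed",
    ("Installed", "receive_mandate"): "Instantiated",
    ("Instantiated", "revoke_mandate"): "Installed",
    ("Instantiated", "set_up"): "Operational",
    ("Operational", "set_down"): "Instantiated",
}

def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            "event %r not allowed in state %r" % (event, state)) from None
```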

390	6.1.  Installation phase

392	   Before being able to instantiate and run ASAs, the operator must
393	   first provision the infrastructure with the sets of ASA software
394	   corresponding to its needs and objectives.  The provisioning of the
395	   infrastructure is realized in the installation phase and consists of
396	   installing (or checking the availability of) the pieces of software
397	   of the different ASA classes in a set of Installation Hosts.

399	   There are 3 properties applicable to the installation of ASAs:

401	   The dynamic installation property  allows installing an ASA on
402	      demand, on any hosts compatible with the ASA.

404	   The decoupling property  allows controlling resources of a NE from a
405	      remote ASA, i.e. an ASA installed on a host machine different from
406	      the resources' NE.

408	   The multiplicity property  allows controlling multiple sets of
409	      resources from a single ASA.

411	   These three properties are important in the context of the
412	   installation phase, as their variations determine how the ASA class
413	   can be installed on the infrastructure.

415	6.1.1.  Installation phase inputs and outputs

417	   Inputs are:

419	   [ASA class of type_x]  that specifies which classes of ASAs to install,

421	   [Installation_target_Infrastructure]  that specifies the candidate
422	      Installation Hosts,

424	   [ASA class placement function, e.g. under which criteria/constraints
425	   as defined by the operator]
426	      that specifies how the installation phase shall meet the
427	      operator's needs and objectives for the provision of the
428	      infrastructure.  In the coupled mode, the placement function is
429	      not necessary, whereas in the decoupled mode, the placement
430	      function is mandatory, even though it can be as simple as an
431	      explicit list of Installation hosts.

433	   The main output of the installation phase is an up-to-date directory
434	   of installed ASAs which corresponds to [list of ASA classes]
435	   installed on [list of installation Hosts].  This output is also
436	   useful for the coordination function and corresponds to the static
437	   interaction map (see next section).

439	   The condition for passing to the next phase is that the
440	   [list of ASA classes] is correctly installed on the [list of
441	   installation Hosts].  The state of the ASA at the end of the
442	   installation phase is "installed" (not "instantiated").  The following
443	   commands or messages are foreseen: install (list of ASA classes,
444	   Installation_target_Infrastructure, ASA class placement function),
445	   and un-install (list of ASA classes).
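
   The install and un-install commands and the resulting directory of
   installed ASAs could be modelled as follows.  This is a sketch
   under stated assumptions: the directory is a plain mapping from ASA
   class to a set of hosts, the placement function is a callable, and
   when no placement function is given every candidate host is used.

```python
def install(directory, asa_classes, target_hosts, placement=None):
    """Record each ASA class on the hosts chosen by the placement function."""
    if placement is None:
        placement = lambda asa_class, hosts: hosts   # assumed default: all hosts
    for asa_class in asa_classes:
        chosen = placement(asa_class, target_hosts)
        directory.setdefault(asa_class, set()).update(chosen)
    return directory

def uninstall(directory, asa_classes):
    """Remove the given ASA classes from the directory."""
    for asa_class in asa_classes:
        directory.pop(asa_class, None)
    return directory
```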

447	6.2.  Instantiation phase

449	   Once the ASAs are installed on the appropriate hosts in the network,
450	   these ASAs may start to operate.  From the operator's viewpoint, an
451	   operating ASA is one that manages the network resources as per the
452	   objectives given.  At the ASA local level, operating means executing
453	   its control loop/algorithm.

455	   Before that, however, there are two things to take into
456	   consideration.  First, there is a difference between (1) having a
457	   piece of code available to run on a host and (2) having an agent based
458	   on this piece of code running inside the host.  Second, in the coupled
459	   case, determining which resources are controlled by an ASA is
460	   straightforward (the determination is embedded), whereas in the
461	   decoupled case it is more complex (a starting agent will have to
462	   either discover the resources or be told which they are).

464	   The instantiation phase of an ASA covers both these aspects: starting
465	   the agent piece of code (when this does not start automatically) and
466	   determining which resources have to be controlled (when this is not
467	   obvious).

469	6.2.1.  Operator's goal

471	   Through this phase, the operator wants to control its autonomic
472	   network in two respects:

474	   1  determine the scope of autonomic functions by instructing which of
475	      the network resources have to be managed by which autonomic
476	      function (and more precisely which class, e.g. (a) version X or
477	      version Y, or (b) provider A or provider B),

479	   2  determine how the autonomic functions are organized by instructing
480	      which ASAs have to interact with which other ASAs (or more
481	      precisely which set of network resources have to be handled as an
482	      autonomous group by their managing ASAs).

484	   Additionally in this phase, the operator may want to set objectives
485	   for autonomic functions, by configuring the ASAs' technical objectives.

487	   The operator's goal can be summarized in an instruction to the ANIMA
488	   ecosystem matching the following pattern:

490	      [ASA of type_x instances] ready to control
491	      [Instantiation_target_Infrastructure] with
492	      [Instantiation_target_parameters]

494	6.2.2.  Instantiation phase inputs and outputs

496	   Inputs are:

498	   [ASA of type_x instances]  that specifies which are the ASAs to be
499	      targeted (and more precisely which class, e.g. (a) version X or
500	      version Y, or (b) provider A or provider B),

502	   [Instantiation_target_Infrastructure]  that specifies which are the
503	      resources to be managed by the autonomic function; this can be the
504	      whole network or a subset of it, such as a domain, a technology
505	      segment or even a specific list of resources,

507	   [Instantiation_target_parameters]  that specifies which are the
508	      technical objectives to be set to ASAs (e.g. an optimization
509	      target).

511	   Outputs are:

513	   [Set of ASAs - Resources relations]  describing which resources are
514	      managed by which ASA instances; this is not a formal message, but
515	      a resulting configuration of a set of ASAs.

517	6.2.3.  Instantiation phase requirements

519	   The instructions described in Section 6.2.1 could be either:

521	   sent to a targeted ASA  In which case, the receiving Agent will have
522	      to manage the specified list of
523	      [Instantiation_target_Infrastructure], with the
524	      [Instantiation_target_parameters].

526	   broadcast to all ASAs  In which case, the ASAs would collectively
527	      determine from the list which Agent(s) would handle which
528	      [Instantiation_target_Infrastructure], with the
529	      [Instantiation_target_parameters].

531	   This set of instructions can be materialized through a message that
532	   is named an Instance Mandate (description TBD).

534	   The result of this instantiation phase is a ready-to-operate ASA
535	   (or interacting set of ASAs).  This ASA (or these ASAs) can then
536	   describe themselves by listing the resources they manage and what
537	   this means in terms of metrics being monitored and in terms of
538	   actions that can be executed (such as modifying parameter
539	   values).  A message conveying such a self-description is named an
540	   Instance Manifest (description TBD).
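
   Because the Instance Mandate and Instance Manifest are still to be
   described, the following is only a speculative sketch of how the
   information named in this section might be grouped.  The field
   names and the instantiate function are assumptions derived from the
   prose above, not a proposed message format.

```python
from dataclasses import dataclass, field

@dataclass
class InstanceMandate:
    """Hypothetical grouping of the instantiation inputs."""
    asa_instance: str
    target_infrastructure: list
    target_parameters: dict = field(default_factory=dict)

@dataclass
class InstanceManifest:
    """Hypothetical self-description of a ready-to-operate ASA."""
    asa_instance: str
    managed_resources: list
    monitored_metrics: list
    available_actions: list

def instantiate(mandate, metrics, actions):
    """Turn an accepted mandate into the ASA's self-description."""
    return InstanceManifest(
        asa_instance=mandate.asa_instance,
        managed_resources=list(mandate.target_infrastructure),
        monitored_metrics=list(metrics),
        available_actions=list(actions),
    )
```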

542	   Though the operator may well use such a self-description "per se",
543	   the final goal of such a description is to be shared with other ANIMA
544	   entities like:

546	   o  the coordination entities (see [I-D.ciavaglia-anima-coordination]
547	      - Autonomic Functions Coordination)

549	   o  collaborative entities, for the purpose of establishing knowledge
550	      exchanges (some ASAs may produce knowledge or monitor metrics
551	      that other ASAs cannot obtain by themselves but that would be
552	      useful for their execution)

554	6.3.  Operation phase

556	   Note: This section is to be further developed in future revisions of
557	   the document, especially the implications on the design of ASAs.

559	   During the Operation phase, the operator can:

561	      Activate/Deactivate ASAs: enabling or disabling execution of their
562	      autonomic loop.

564	      Modify ASAs' targets: setting them different objectives.

566	      Modify ASAs' managed resources: updating the instance mandate
567	      to specify a different set of resources to manage (only
568	      applicable to decoupled ASAs).

570	   During the Operation phase, running ASAs can interact with each
571	   other:

573	      in order to exchange knowledge (e.g. an ASA providing traffic
574	      predictions to a load-balancing ASA)

576	      in order to collaboratively reach an objective (e.g. ASAs
577	      pertaining to the same autonomic function and managing the same
578	      network domain will collaborate; for load balancing, they would
579	      modify the link metrics according to the loads of neighboring
580	      resources)

582	   During the Operation phase, running ASAs are expected to apply
583	   coordination schemes and then execute their control loop under
584	   coordination supervision or instructions.

588	   The ASA life-cycle is discussed in more detail in "A Day in the Life
589	   of an Autonomic Function" [I-D.peloso-anima-autonomic-function].

591	7.  Coordination between Autonomic Functions

593	   Some autonomic functions will be completely independent of each
594	   other.  However, others are at risk of interfering with each other -
595	   for example, two different optimization functions might both attempt
596	   to modify the same underlying parameter in different ways.  In a
597	   complete system, a method is needed of identifying ASAs that might
598	   interfere with each other and coordinating their actions when
599	   necessary.  This issue is considered in "Autonomic Functions
600	   Coordination" [I-D.ciavaglia-anima-coordination].
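
   As a purely illustrative sketch, and not the method of the cited
   coordination draft, potential interference could be detected by
   comparing the parameters that each ASA declares it may modify.  The
   function name find_conflicts and the ASA and parameter names below
   are hypothetical.

```python
from itertools import combinations


def find_conflicts(managed):
    """Return pairs of ASAs that declare at least one common parameter.

    `managed` maps an ASA name to the set of parameters it may modify.
    The result maps each conflicting pair to the shared parameters.
    """
    conflicts = {}
    for a, b in combinations(sorted(managed), 2):
        shared = managed[a] & managed[b]
        if shared:
            conflicts[(a, b)] = shared
    return conflicts


# Hypothetical declarations: two optimization functions touch link_metric.
declared = {
    "load_balancer": {"link_metric", "queue_size"},
    "energy_saver": {"link_metric"},
    "prefix_manager": {"prefix_pool"},
}
conflicting = find_conflicts(declared)
```

   Pairs reported in this way would be candidates for coordination;
   how their actions are then arbitrated is outside this sketch.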

602	8.  Coordination with Traditional Management Functions

604	   Some ASAs will have functions that overlap with existing
605	   configuration tools and network management mechanisms such as command
606	   line interfaces, DHCP, DHCPv6, SNMP, NETCONF, RESTCONF and YANG-based
607	   solutions.  Each ASA designer will need to consider this issue and
608	   how to avoid clashes and inconsistencies.  Some specific
609	   considerations for interaction with OAM tools are given in [RFC8368].
610	   As another example, [I-D.ietf-anima-prefix-management] describes how
611	   autonomic management of IPv6 prefixes can interact with prefix
612	   delegation via DHCPv6.  The description of a GRASP objective and of
613	   an ASA using it should include a discussion of any such interactions.

615	   A related aspect is that management functions often include a data
616	   model, quite likely to be expressed in a formal notation such as
617	   YANG.  This aspect should not be an afterthought in the design of an
618	   ASA.  To the contrary, the design of the ASA and of its GRASP
619	   objectives should match the data model; as noted above, YANG
620	   serialized as CBOR may be used directly as the value of a GRASP
621	   objective.

623	9.  Robustness

625	   It is of great importance that all components of an autonomic system
626	   are highly robust.  In principle they must never fail.  This section
627	   lists various aspects of robustness that ASA designers should
628	   consider.

630	   1.  If despite all precautions, an ASA does encounter a fatal error,
631	       it should in any case restart automatically and try again.  To
632	       mitigate a hard loop in case of persistent failure, a suitable
633	       pause should be inserted before such a restart.  The length of
634	       the pause depends on the use case.

636	   2.  If a newly received or calculated value for a parameter falls out
637	       of bounds, the corresponding parameter should be either left
638	       unchanged or restored to a safe value.

640	   3.  If a GRASP synchronization or negotiation session fails for any
641	       reason, it may be repeated after a suitable pause.  The length of
642	       the pause depends on the use case.

644	   4.  If a session fails repeatedly, the ASA should consider that its
645	       peer has failed, and cause GRASP to flush its discovery cache and
646	       repeat peer discovery.

648	   5.  Any received GRASP message should be checked.  If it is wrongly
649	       formatted, it should be ignored.  Within a unicast session, an
650	       Invalid message (M_INVALID) may be sent.  This function may be
651	       provided by the GRASP implementation itself.

653	   6.  Any received GRASP objective should be checked.  If it is wrongly
654	       formatted, it should be ignored.  Within a negotiation session, a
655	       Negotiation End message (M_END) with a Decline option (O_DECLINE)
656	       should be sent.  An ASA may log such events for diagnostic
657	       purposes.

659	   7.  If an ASA receives either an Invalid message (M_INVALID) or a
660	       Negotiation End message (M_END) with a Decline option
661	       (O_DECLINE), one possible reason is that the peer ASA does not
662	       support a new feature of either GRASP or of the objective in
663	       question.  In such a case the ASA may choose to repeat the
664	       operation concerned without using that new feature.

666	   8.  All other possible exceptions should be handled in an orderly
667	       way.  There should be no such thing as an unhandled exception
668	       (but see point 1 above).
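
   As a non-normative illustration of points 1 and 2 above, the
   following Python sketch restarts a failing ASA main function after a
   growing pause and rejects out-of-bounds parameter values.  The names
   run_with_restart and accept_value, and the pause durations, are
   hypothetical; they are not part of GRASP or its API.

```python
import time


def run_with_restart(asa_main, max_restarts=None, base_pause=1.0,
                     max_pause=60.0, sleep=time.sleep):
    """Run asa_main(); on any exception, pause and restart (point 1).

    The pause doubles after each failure, up to max_pause, to avoid a
    hard loop. max_restarts=None means retry forever; a finite value is
    mainly useful for testing. Returns the number of restarts performed.
    """
    restarts = 0
    pause = base_pause
    while True:
        try:
            asa_main()
            return restarts          # asa_main ended normally
        except Exception:            # nothing is left unhandled (point 8)
            if max_restarts is not None and restarts >= max_restarts:
                raise
            sleep(pause)
            pause = min(pause * 2, max_pause)
            restarts += 1


def accept_value(current, received, low, high):
    """Return received if within [low, high], else keep current (point 2)."""
    if low <= received <= high:
        return received
    return current
```

   The doubling pause is only one possible policy; as noted above, the
   appropriate length of the pause depends on the use case.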

670	10.  Security Considerations

672	   ASAs are intended to run in an environment that is protected by the
673	   Autonomic Control Plane [I-D.ietf-anima-autonomic-control-plane],
674	   admission to which depends on an initial secure bootstrap process
675	   [I-D.ietf-anima-bootstrapping-keyinfra].  In some deployments, a
676	   secure partition of the link layer might be used instead
677	   [I-D.carpenter-anima-l2acp-scenarios].  However, this does not
678	   relieve ASAs of responsibility for security.  In particular, when
679	   ASAs configure or manage network elements outside the ACP, they must
680	   use secure techniques and carefully validate any incoming
681	   information.  As appropriate to their specific functions, ASAs should
682	   take account of relevant privacy considerations [RFC6973].

684	   Authorization of ASAs is a subject for future study.  At present,
685	   ASAs are trusted by virtue of being installed on a node that has
686	   successfully joined the ACP.

688	11.  IANA Considerations

690	   This document makes no request of the IANA.

692	12.  Acknowledgements

694	   Useful comments were received from Toerless Eckert, Alex Galis, Bing
695	   Liu, and other members of the ANIMA WG.

697	13.  References

699	13.1.  Normative References

701	   [I-D.ietf-anima-autonomic-control-plane]
702	              Eckert, T., Behringer, M., and S. Bjarnason, "An Autonomic
703	              Control Plane (ACP)", draft-ietf-anima-autonomic-control-
704	              plane-19 (work in progress), March 2019.

706	   [I-D.ietf-anima-bootstrapping-keyinfra]
707	              Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
708	              S., and K. Watsen, "Bootstrapping Remote Secure Key
709	              Infrastructures (BRSKI)", draft-ietf-anima-bootstrapping-
710	              keyinfra-22 (work in progress), June 2019.

712	   [I-D.ietf-anima-grasp]
713	              Bormann, C., Carpenter, B., and B. Liu, "A Generic
714	              Autonomic Signaling Protocol (GRASP)", draft-ietf-anima-
715	              grasp-15 (work in progress), July 2017.

717	   [RFC7049]  Bormann, C. and P. Hoffman, "Concise Binary Object
718	              Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049,
719	              October 2013.

721	13.2.  Informative References

723	   [DeMola06]
724	              De Mola, F. and R. Quitadamo, "An Agent Model for Future
725	              Autonomic Communications", Proceedings of the 7th WOA 2006
726	              Workshop From Objects to Agents 51-59, September 2006.

728	   [GANA13]   "Autonomic network engineering for the self-managing
729	              Future Internet (AFI): GANA Architectural Reference Model
730	              for Autonomic Networking, Cognitive Networking and Self-
731	              Management.", April 2013.

735	   [Huebscher08]
736	              Huebscher, M. and J. McCann, "A survey of autonomic
737	              computing--degrees, models, and applications", ACM
738	              Computing Surveys (CSUR) Volume 40 Issue 3 DOI:
739	              10.1145/1380584.1380585, August 2008.

741	   [I-D.carpenter-anima-l2acp-scenarios]
742	              Carpenter, B. and B. Liu, "Scenarios and Requirements for
743	              Layer 2 Autonomic Control Planes", draft-carpenter-anima-
744	              l2acp-scenarios-00 (work in progress), February 2019.

746	   [I-D.ciavaglia-anima-coordination]
747	              Ciavaglia, L. and P. Peloso, "Autonomic Functions
748	              Coordination", draft-ciavaglia-anima-coordination-01 (work
749	              in progress), March 2016.

751	   [I-D.ietf-anima-grasp-api]
752	              Carpenter, B., Liu, B., Wang, W., and X. Gong, "Generic
753	              Autonomic Signaling Protocol Application Program Interface
754	              (GRASP API)", draft-ietf-anima-grasp-api-03 (work in
755	              progress), January 2019.

757	   [I-D.ietf-anima-prefix-management]
758	              Jiang, S., Du, Z., Carpenter, B., and Q. Sun, "Autonomic
759	              IPv6 Edge Prefix Management in Large-scale Networks",
760	              draft-ietf-anima-prefix-management-07 (work in progress),
761	              December 2017.

763	   [I-D.ietf-anima-reference-model]
764	              Behringer, M., Carpenter, B., Eckert, T., Ciavaglia, L.,
765	              and J. Nobre, "A Reference Model for Autonomic
766	              Networking", draft-ietf-anima-reference-model-10 (work in
767	              progress), November 2018.

769	   [I-D.ietf-core-yang-cbor]
770	              Veillette, M., Petrov, I., and A. Pelov, "CBOR Encoding of
771	              Data Modeled with YANG", draft-ietf-core-yang-cbor-10
772	              (work in progress), April 2019.

774	   [I-D.peloso-anima-autonomic-function]
775	              Peloso, P. and L. Ciavaglia, "A Day in the Life of an
776	              Autonomic Function", draft-peloso-anima-autonomic-
777	              function-01 (work in progress), March 2016.

779	   [Movahedi12]
780	              Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A
781	              Survey of Autonomic Network Architectures and Evaluation
782	              Criteria", IEEE Communications Surveys & Tutorials Volume:
783	              14 , Issue: 2 DOI: 10.1109/SURV.2011.042711.00078,
784	              Page(s): 464 - 490, 2012.

786	   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
787	              Morris, J., Hansen, M., and R. Smith, "Privacy
788	              Considerations for Internet Protocols", RFC 6973,
789	              DOI 10.17487/RFC6973, July 2013.

792	   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
793	              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
794	              Networking: Definitions and Design Goals", RFC 7575,
795	              DOI 10.17487/RFC7575, June 2015.

798	   [RFC7665]  Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
799	              Chaining (SFC) Architecture", RFC 7665,
800	              DOI 10.17487/RFC7665, October 2015.

803	   [RFC8368]  Eckert, T., Ed. and M. Behringer, "Using an Autonomic
804	              Control Plane for Stable Connectivity of Network
805	              Operations, Administration, and Maintenance (OAM)",
806	              RFC 8368, DOI 10.17487/RFC8368, May 2018.

809	   [RFC8568]  Bernardos, CJ., Rahman, A., Zuniga, JC., Contreras, LM.,
810	              Aranda, P., and P. Lynch, "Network Virtualization Research
811	              Challenges", RFC 8568, DOI 10.17487/RFC8568, April 2019.

814	Appendix A.  Change log [RFC Editor: Please remove]

816	   draft-carpenter-anima-asa-guidelines-07, 2019-07-17:

818	   Improved explanation of threading vs event-loop

820	   Other editorial improvements.

822	   draft-carpenter-anima-asa-guidelines-06, 2019-01-07:

824	   Expanded and improved example logic flow.

826	   Editorial corrections.

828	   draft-carpenter-anima-asa-guidelines-05, 2018-06-30:

830	   Added section on relationship with non-autonomic components.

832	   Editorial corrections.

834	   draft-carpenter-anima-asa-guidelines-04, 2018-03-03:

836	   Added note about simple ASAs.

838	   Added note about NFV/SFC services.

840	   Improved text about threading vs event-loop model.

842	   Added section about coordination with traditional tools.

844	   Added appendix with example logic flow.

846	   draft-carpenter-anima-asa-guidelines-03, 2017-10-25:

848	   Added details on life cycle.

850	   Added details on robustness.

852	   Added co-authors.

854	   draft-carpenter-anima-asa-guidelines-02, 2017-07-01:

856	   Expanded description of event-loop case.

858	   Added note about 'dry run' mode.

860	   draft-carpenter-anima-asa-guidelines-01, 2017-01-06:

862	   More sections filled in

864	   draft-carpenter-anima-asa-guidelines-00, 2016-09-30:

866	   Initial version

868	Appendix B.  Example Logic Flows

870	   This appendix describes generic logic flows for an Autonomic Service
871	   Agent (ASA) for resource management.  Note that these are
872	   illustrative examples, and in no sense requirements.  As long as the
873	   rules of GRASP are followed, a real implementation could be
874	   different.  The reader is assumed to be familiar with GRASP
875	   [I-D.ietf-anima-grasp] and its conceptual API
876	   [I-D.ietf-anima-grasp-api].

878	   A complete autonomic function for a resource would consist of a
879	   number of instances of the ASA placed at relevant points in a
880	   network.  Specific details will of course depend on the resource
881	   concerned.  One example is IP address prefix management, as specified
882	   in [I-D.ietf-anima-prefix-management].  In this case, an instance of
883	   the ASA would exist in each delegating router.

885	   An underlying assumption is that there is an initial source of the
886	   resource in question, referred to here as a master ASA.  The other
887	   ASAs, known as delegators, obtain supplies of the resource from the
888	   master, and then delegate quantities of the resource to consumers
889	   that request it, and recover it when no longer needed.

891	   Another assumption is that there is a set of network-wide policy
892	   parameters, which the master will provide to the delegators.  These
893	   parameters will control how the delegators decide how much resource
894	   to provide to consumers.  Thus the ASA logic has two operating modes:
895	   master and delegator.  When running as a master, it starts by
896	   obtaining a quantity of the resource from the NOC, and it acts as a
897	   source of policy parameters, via both GRASP flooding and GRASP
898	   synchronization.  (In some scenarios, flooding or synchronization
899	   alone might be sufficient, but this example includes both.)

901	   When running as a delegator, it starts with an empty resource pool,
902	   it acquires the policy parameters by GRASP synchronization, and it
903	   delegates quantities of the resource to consumers that request it.
904	   Both as a master and as a delegator, when its pool is low it seeks
905	   quantities of the resource by requesting GRASP negotiation with peer
906	   ASAs.  When its pool is sufficient, it hands out resource to peer
907	   ASAs in response to negotiation requests.  Thus, over time, the
908	   initial resource pool held by the master will be shared among all the
909	   delegators according to demand.

911	   In theory a network could include any number of masters and any
912	   number of delegators, with the only condition being that each
913	   master's initial resource pool is unique.  A realistic scenario is to
914	   have exactly one master and as many delegators as you like.  A
915	   scenario with no master is useless.

917	   An implementation requirement is that resource pools are kept in
918	   stable storage.  Otherwise, if a delegator exits for any reason, all
919	   the resources it has obtained or delegated are lost.  If a master
920	   exits, its entire spare pool is lost.  The logic for using stable
921	   storage and for crash recovery is not included below.

923	   The description below does not implement GRASP's 'dry run' function.
924	   That would require temporarily marking any resource handed out in a
925	   dry run negotiation as reserved, until either the peer obtains it in
926	   a live run, or a suitable timeout expires.
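
   Although dry run support is not part of the flows below, the
   reservation mechanism just described might be sketched as follows.
   The class DryRunReservations, its method names, and the 30 second
   default timeout are hypothetical choices made for illustration only.

```python
import time


class DryRunReservations:
    """Track resources offered in dry-run negotiations until they expire."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._reserved = {}          # resource -> (peer, expiry time)

    def reserve(self, resource, peer):
        """Mark a resource as reserved for `peer` after a dry-run offer."""
        self._reserved[resource] = (peer, self._clock() + self._timeout)

    def claim(self, resource, peer):
        """Live run: return True if `peer` holds an unexpired reservation."""
        self.expire()
        holder = self._reserved.get(resource)
        if holder is not None and holder[0] == peer:
            del self._reserved[resource]
            return True
        return False

    def expire(self):
        """Drop reservations whose timeout has passed and return them."""
        now = self._clock()
        expired = [r for r, (_, t) in self._reserved.items() if t <= now]
        for r in expired:
            del self._reserved[r]
        return expired
```

   Resources returned by expire() would go back to the resource pool;
   the clock is injectable so that the timeout can be tested without
   waiting.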

928	   The main data structures used in each instance of the ASA are:

930	   o  The resource_pool, for example an ordered list of available
931	      resources.  Depending on the nature of the resource, units of
932	      resource are split when appropriate, and a background garbage
933	      collector recombines split resources if they are returned to the
934	      pool.

936	   o  The delegated_list, where a delegator stores the resources it has
937	      given to consumer routers.
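
   A minimal sketch of the resource_pool follows, under the assumption
   that the resource is a set of contiguous integer ranges (as might
   loosely model address blocks).  The class ResourcePool and its method
   names are hypothetical.  It shows splitting on allocation, the
   associated lock, and the merge step that a garbage collector would
   run.

```python
import threading


class ResourcePool:
    """Pool of integer ranges, each stored as (start, length)."""

    def __init__(self, ranges=()):
        self._lock = threading.Lock()
        self._ranges = sorted(ranges)

    def take(self, amount):
        """Remove and return a range of exactly `amount` units, or None.

        The first range that is large enough is split if necessary.
        """
        with self._lock:
            for i, (start, length) in enumerate(self._ranges):
                if length >= amount:
                    if length == amount:
                        del self._ranges[i]
                    else:
                        self._ranges[i] = (start + amount, length - amount)
                    return (start, amount)
            return None

    def give_back(self, rng):
        """Return a range to the pool, keeping the list ordered."""
        with self._lock:
            self._ranges.append(rng)
            self._ranges.sort()

    def merge_adjacent(self):
        """Recombine ranges that touch each other (garbage collection)."""
        with self._lock:
            merged = []
            for start, length in self._ranges:
                if merged and merged[-1][0] + merged[-1][1] == start:
                    merged[-1] = (merged[-1][0], merged[-1][1] + length)
                else:
                    merged.append((start, length))
            self._ranges = merged

    def snapshot(self):
        """Return a copy of the current ranges."""
        with self._lock:
            return list(self._ranges)
```

   In this sketch the delegated_list could simply be a dictionary
   mapping each consumer to the ranges returned by take().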

939	   Possible main logic flows are below, using a threaded implementation
940	   model.  The transformation to an event loop model should be apparent
941	   - each thread would correspond to one event in the event loop.
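
   To make the correspondence concrete, the following sketch shows how
   two of the periodic threads described below might become tasks on a
   single event loop, here using Python's asyncio.  The helper periodic
   and the rounds limit are assumptions made to keep the example finite;
   a real ASA would loop forever and perform GRASP operations instead of
   appending to a list.

```python
import asyncio


async def periodic(name, interval, action, rounds):
    """Call action(name) every `interval` seconds, `rounds` times."""
    for _ in range(rounds):
        action(name)
        await asyncio.sleep(interval)


async def main(log):
    # FLOODER and GARBAGE_COLLECTOR run as tasks on one event loop
    # instead of as separate threads.
    await asyncio.gather(
        periodic("FLOODER", 0.01, log.append, rounds=2),
        periodic("GARBAGE_COLLECTOR", 0.01, log.append, rounds=2),
    )


events = []
asyncio.run(main(events))
```

   Because the tasks only yield at await points, a shared structure
   such as the resource pool would need no lock in this model.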

943	   The GRASP objectives are as follows:

945	      ["EX1.Resource", flags, loop_count, value] where the value depends
946	      on the resource concerned, but will typically include its size and
947	      identification.

949	      ["EX1.Params", flags, loop_count, value] where the value will be,
950	      for example, a JSON object defining the applicable parameters.

952	   In the outline logic flows below, these objectives are represented
953	   simply by their names.
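
   For illustration, the two objectives could be built as Python lists
   before being handed to a GRASP implementation.  The flag bit
   positions, the loop count of 6, the parameter names, and the helper
   make_objective are assumptions for this sketch; the GRASP
   specification is the authority for the actual encoding.  The JSON
   object is carried here as a JSON text string, which is also an
   assumption.

```python
import json

# Illustrative flag bit positions; treat these as assumptions and check
# the GRASP specification for the authoritative definitions.
F_DISC, F_NEG, F_SYNCH = 0, 1, 2


def make_objective(name, flag_bits, loop_count, value):
    """Build an objective as [name, flags, loop_count, value]."""
    flags = 0
    for bit in flag_bits:
        flags |= 1 << bit
    return [name, flags, loop_count, value]


# EX1.Resource is negotiated; its value here is a hypothetical
# [size, identification] pair.
ex1_resource = make_objective("EX1.Resource", (F_DISC, F_NEG), 6,
                              [8, "block-0"])

# EX1.Params is flooded or synchronized; its value is a JSON text.
params = {"max_delegation": 8, "low_water_mark": 2}   # hypothetical
ex1_params = make_objective("EX1.Params", (F_DISC, F_SYNCH), 6,
                            json.dumps(params))
```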

955	   MAIN PROGRAM:

957	   Create empty resource_pool (and an associated lock)
958	   Create empty delegated_list
959	   Determine whether to act as master
960	   if master:
961	       Obtain initial resource_pool contents from NOC
962	       Obtain value of EX1.Params from NOC
963	   Register ASA with GRASP
964	   Register GRASP objectives EX1.Resource and EX1.Params
965	   if master:
966	       Start FLOODER thread to flood EX1.Params
967	       Start SYNCHRONIZER listener for EX1.Params
968	   Start MAIN_NEGOTIATOR thread for EX1.Resource
969	   if not master:
970	       Obtain value of EX1.Params from GRASP flood or synchronization
971	       Start DELEGATOR thread
972	   Start GARBAGE_COLLECTOR thread
973	   good_peer = none
974	   do forever:
975	       if resource_pool is low:
976	           Calculate amount A of resource needed
977	           Discover peers using GRASP M_DISCOVER / M_RESPONSE
978	           if good_peer in peers:
979	               peer = good_peer
980	           else:
981	               peer = #any choice among peers
982	           grasp.request_negotiate("EX1.Resource", peer)
983	           #i.e., send M_REQ_NEG
984	           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
985	           if OK:
986	               if offered amount of resource sufficient:
987	                   Send M_END + O_ACCEPT #negotiation succeeded
988	                   Add resource to pool
989	                   good_peer = peer #remember this choice
990	               else:
991	                   Send M_END + O_DECLINE #negotiation failed
992	       sleep() #sleep time depends on application scenario

994	   MAIN_NEGOTIATOR thread:

996	   do forever:
997	       grasp.listen_negotiate("EX1.Resource")
998	       #i.e., wait for M_REQ_NEG
999	       Start a separate new NEGOTIATOR thread for requested amount A

1001	   NEGOTIATOR thread:

1003	   Request resource amount A from resource_pool
1004	   if not OK:
1005	       while not OK and A > Amin:
1006	           A = A-1
1007	           Request resource amount A from resource_pool
1008	   if OK:
1009	       Offer resource amount A to peer by GRASP M_NEGOTIATE
1010	       if received M_END + O_ACCEPT:
1011	           #negotiation succeeded
1012	       elif received M_END + O_DECLINE or other error:
1013	           #negotiation failed
1014	   else:
1015	       Send M_END + O_DECLINE #negotiation failed
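
   The step-down logic of the NEGOTIATOR thread above can be sketched
   in Python as follows.  The function choose_offer and the callable
   try_take are hypothetical; try_take(amount) stands for "Request
   resource amount A from resource_pool" and returns None on failure.

```python
def choose_offer(try_take, requested, minimum):
    """Try to obtain `requested` units, stepping down to `minimum`.

    Returns (amount, resource) for the first successful request, or
    None if nothing could be obtained, in which case the caller would
    send M_END + O_DECLINE.
    """
    amount = requested
    resource = try_take(amount)
    while resource is None and amount > minimum:
        amount -= 1
        resource = try_take(amount)
    if resource is None:
        return None
    return amount, resource
```

   Decrementing by one unit mirrors the flow above; a real
   implementation might step down in units that suit the resource.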

1017	   DELEGATOR thread:

1019	   do forever:
1020	       Wait for request or release for resource amount A
1021	       if request:
1022	           Get resource amount A from resource_pool
1023	           if OK:
1024	               Delegate resource to consumer
1025	               Record in delegated_list
1026	           else:
1027	               Signal failure to consumer
1028	               Signal main thread that resource_pool is low
1029	       else:
1030	           Delete resource from delegated_list
1031	           Return resource amount A to resource_pool

1033	   SYNCHRONIZER thread:

1035	   do forever:
1036	       Wait for M_REQ_SYN message for EX1.Params
1037	       Reply with M_SYNCH message for EX1.Params

1039	   FLOODER thread:

1041	   do forever:
1042	       Send M_FLOOD message for EX1.Params
1043	       sleep() #sleep time depends on application scenario

1045	   GARBAGE_COLLECTOR thread:

1047	   do forever:
1048	       Search resource_pool for adjacent resources
1049	       Merge adjacent resources
1050	       sleep() #sleep time depends on application scenario

1052	Authors' Addresses

1054	   Brian Carpenter
1055	   School of Computer Science
1056	   University of Auckland
1057	   PB 92019
1058	   Auckland  1142
1059	   New Zealand

1061	   Email: brian.e.carpenter@gmail.com

1063	   Laurent Ciavaglia
1064	   Nokia
1065	   Villarceaux
1066	   Nozay  91460
1067	   FR

1069	   Email: laurent.ciavaglia@nokia.com

1071	   Sheng Jiang
1072	   Huawei Technologies Co., Ltd
1073	   Q14, Huawei Campus, No.156 Beiqing Road
1074	   Hai-Dian District, Beijing, 100095
1075	   P.R. China

1077	   Email: jiangsheng@huawei.com

1079	   Pierre Peloso
1080	   Nokia
1081	   Villarceaux
1082	   Nozay  91460
1083	   FR

1085	   Email: pierre.peloso@nokia.com