2	ANIMA WG                                              CJ. Bernardos, Ed.
3	Internet-Draft                                                      UC3M
4	Intended status: Experimental                                  A. Mourad
5	Expires: 25 November 2022                                   InterDigital
6	                                                       P. Martinez-Julia
7	                                                                    NICT
8	                                                             24 May 2022

10	                Autonomic setup of fog monitoring agents
11	                draft-bernardos-anima-fog-monitoring-06

13	Abstract

15	   The concept of fog computing has emerged, driven by the Internet of
16	   Things (IoT), from the need to handle the data generated by end-user
17	   devices.  The term fog refers to any networked computational
18	   resource in the continuum between things and cloud.  In fog
19	   computing, functions can be stitched together to compose a service
20	   function chain.  These functions might be hosted on resources that
21	   are inherently heterogeneous, volatile and mobile.  This means that
22	   resources might appear and disappear, and the connectivity
23	   characteristics between these resources may also change dynamically.
24	   This calls for new orchestration solutions able to cope with dynamic
25	   changes to the resources at runtime or ahead of time (in anticipation
26	   through prediction), as opposed to today's solutions, which are
27	   inherently reactive and static or semi-static.

29	   A fog monitoring solution can be used to help predict events so that
30	   action can be taken before an event actually takes place.  This
31	   solution is composed of agents running on the fog nodes plus a
32	   controller hosted at another device (running in the infrastructure or
33	   in another fog node).  Since fog environments are inherently volatile
34	   and extremely dynamic, it is convenient to use autonomic
35	   technologies to set up the fog monitoring platform autonomously.
36	   This document presents this use case and specifies how to use GRASP
37	   as needed in this scenario.

39	Status of This Memo

41	   This Internet-Draft is submitted in full conformance with the
42	   provisions of BCP 78 and BCP 79.

44	   Internet-Drafts are working documents of the Internet Engineering
45	   Task Force (IETF).  Note that other groups may also distribute
46	   working documents as Internet-Drafts.  The list of current Internet-
47	   Drafts is at https://datatracker.ietf.org/drafts/current/.

49	   Internet-Drafts are draft documents valid for a maximum of six months
50	   and may be updated, replaced, or obsoleted by other documents at any
51	   time.  It is inappropriate to use Internet-Drafts as reference
52	   material or to cite them other than as "work in progress."

54	   This Internet-Draft will expire on 25 November 2022.

56	Copyright Notice

58	   Copyright (c) 2022 IETF Trust and the persons identified as the
59	   document authors.  All rights reserved.

61	   This document is subject to BCP 78 and the IETF Trust's Legal
62	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
63	   license-info) in effect on the date of publication of this document.
64	   Please review these documents carefully, as they describe your rights
65	   and restrictions with respect to this document.  Code Components
66	   extracted from this document must include Revised BSD License text as
67	   described in Section 4.e of the Trust Legal Provisions and are
68	   provided without warranty as described in the Revised BSD License.

70	Table of Contents

72	   1.  Introduction
73	     1.1.  Problem statement
74	     1.2.  Fog monitoring framework
75	     1.3.  Supporting simple and complex monitoring metrics
76	   2.  Terminology
77	   3.  Autonomic setup of fog monitoring framework
78	   4.  IANA Considerations
79	   5.  Security Considerations
80	   6.  Acknowledgments
81	   7.  Informative References
82	   Authors' Addresses

84	1.  Introduction

86	   The concept of fog computing has emerged, driven by the Internet of
87	   Things (IoT), from the need to handle the data generated by end-user
88	   devices.  The term fog refers to any networked computational
89	   resource in the continuum between things and cloud.  A fog node may
90	   therefore be an infrastructure network node such as an eNodeB or
91	   gNodeB, an edge server, a customer premises equipment (CPE), or even
92	   a user equipment (UE) terminal node such as a laptop, a smartphone,
93	   or a computing unit on board a vehicle, robot or drone.

95	   In fog computing, functions might be organized in service function
96	   chains (SFCs), hosted on resources that are inherently heterogeneous,
97	   volatile and mobile.  This means that resources might appear and
98	   disappear, and the connectivity characteristics between these
99	   resources may also change dynamically.  This calls for new
100	   orchestration solutions able to cope with dynamic changes to the
101	   resources at runtime or ahead of time (in anticipation through
102	   prediction), as opposed to today's solutions, which are inherently
103	   reactive and static or semi-static.

105	1.1.  Problem statement

107	   Figure 1 shows an exemplary scenario of a (robot) network service.  A
108	   robot device has its (navigation) control application running in the
109	   fog away from the robot, as a network service in the form of an SFC
110	   "F1-F2" (e.g., F1 might be in charge of identifying obstacles and F2
111	   takes decisions on the robot navigation).  Initially, function F1 is
112	   assumed to be hosted at fog node A and F2 at fog node B.  At a given
113	   point in time, fog node A becomes unavailable (e.g., due to low
114	   battery or because fog node A moves out of the coverage of the
115	   robot).  It is therefore necessary to predict when function F1 must
116	   be migrated to another node (e.g., fog node C in the figure), and
117	   the migration needs to complete before the fog/edge node stops being
118	   capable or available.  Such dynamic migration cannot be dealt
119	   with by today's orchestration solutions, which are rather
120	   reactive and static or semi-static (e.g., resources may fail, but
121	   this is an exceptional event, happening with low frequency, and only
122	   scaling actions are supported to react to SLA-related events).

124	              --------------
125	              |    ====    |
126	             ------+F1+----------
127	            / |  | ==== |  |     \
128	           /  |  +------+  |      \
129	           |  | fog node C |       \
130	           |  --------------        \
131	           |                         \
132	           |       --------------  ---\----------
133	           |       |    ====    |  |   \====    |
134	           | -----------+F1+------------+F2|    |
135	           |/      |  | ==== |  |  |  | ==== |  |
136	           o       |  +------+  |  |  +------+  |
137	           |       | fog node A |  | fog node B |
138	   --------+-      --------------  --------------
139	   |        |
140	   --0----0--

142	                         Figure 1: Example scenario

144	   Existing frameworks rely on monitoring platforms that react to
145	   resource failure events and ensure that negotiated SLAs are met.
146	   However, these are not designed to predict events likely to happen in
147	   a volatile fog environment, such as resources moving away, resources
148	   becoming unavailable due to battery issues, or changes in the
149	   availability of resources caused by variations in the use of the
150	   local resources on the nodes.  Besides, it is not feasible in this
151	   kind of volatile and extremely mobile environment to perform
152	   continuous monitoring and reporting of every possible variable or
153	   parameter from all the nodes hosting resources, as this would not
154	   scale and would consume many resources and generate extra overhead.

156	   In volatile and mobile environments, prediction (make-before-break)
157	   is needed, as pure reaction (break-before-make) is not enough.  This
158	   prediction is not generic, and depends on the nature of the network
159	   service/SFC: the functions of the SFC, the connectivity between them,
160	   the service-specific requirements, etc.  Monitoring has to be set up
161	   differently on the nodes, depending on the specifics of the network
162	   service.  Besides, in order to act proactively and predict what might
163	   need to be done, monitoring in such volatile and mobile
164	   environments involves not only the nodes currently hosting the
165	   resources running the network service/service function chain (i.e.,
166	   hosting a function), but also other nodes that are potential
167	   candidates to join, either in addition to or in substitution for
168	   current nodes, for running the network service in accordance with
169	   the orchestration decisions.

171	   In the example of Figure 1, the fog node initially hosting function
172	   F1 (fog node A) might be running out of battery, and this should be
173	   detected before node A actually becomes unavailable, so that
174	   function F1 can be migrated in time to a different fog
175	   node C, capable of meeting the requirements of F1 (compute,
176	   networking, location, expected availability, etc.).  In order to
177	   predict the need for such a migration and to have already
178	   identified a target fog node for the function, a monitoring
179	   solution must be in place that instructs each node involved in the
180	   service (A and B), and also each neighboring node that is a
181	   candidate (C) to host function F1, to monitor and report on metrics
182	   that are relevant for the specific network service "F1-F2" that is
183	   currently running.
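
   As a purely illustrative sketch (this document does not specify any
   prediction method), the following Python fragment shows one way the
   monitoring logic could estimate when fog node A will run out of
   battery and decide whether the migration of F1 should start.  The
   linear battery model, the function names and the safety factor are
   assumptions made for this example only.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, battery level in percent)


def seconds_until_empty(samples: Sequence[Sample]) -> Optional[float]:
    """Estimate the time left until the battery reaches 0 percent.

    Fits a least-squares line through the samples (oldest first) and
    extrapolates it to level 0.  Returns None when there are too few
    samples or when the level is not decreasing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples share one timestamp, no trend available
    slope = sum((t - mean_t) * (b - mean_b) for t, b in samples) / var_t
    if slope >= 0:
        return None  # battery is stable or charging
    intercept = mean_b - slope * mean_t
    t_empty = -intercept / slope
    last_t = samples[-1][0]
    return max(0.0, t_empty - last_t)


def should_migrate(samples: Sequence[Sample],
                   migration_time_s: float,
                   safety_factor: float = 2.0) -> bool:
    """Return True when the predicted time left is within the safety margin.

    migration_time_s is the (assumed known) time needed to move the
    function; safety_factor is an arbitrary margin chosen for the sketch.
    """
    remaining = seconds_until_empty(samples)
    return remaining is not None and remaining <= migration_time_s * safety_factor
```

   With this sketch, a node whose battery drops by one percentage point
   per minute would be flagged for migration once the predicted time
   left falls below twice the expected migration time.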

185	1.2.  Fog monitoring framework

187	   Fog environments differ from data-center ones in three key aspects:
188	   heterogeneity, volatility and mobility.  The fog monitoring framework
189	   is used to predict events that trigger an orchestration event (e.g.,
190	   migrating a function to a different resource).

192	   The monitoring framework we propose for fog environments is composed
193	   of two logical components:

195	   *  Fog agents running on each fog node.  An agent is responsible for
196	      sending the value of a variable or parameter to a fog monitoring
197	      controller and to other fog agents.  What variable or parameter
198	      will be monitored and what data will be sent (including frequency)
199	      is configured per agent considering the specifics of the network
200	      service or SFC.  A fog agent might also take some autonomous
201	      actions (such as request migration of a function to a neighbor
202	      node) in certain situations where connectivity with the fog
203	      monitoring controller is temporarily unavailable.

205	   *  A fog monitoring controller (e.g., running at the edge or at a fog
206	      node).  This node obtains input from the orchestration logic (MANO
207	      stack) and autonomously decides what variables or parameters will
208	      be monitored, where the data will be collected, and how this will
209	      be done, based on the requirements provided by the orchestration
210	      logic managing the network services instantiated in the fog.  This
211	      configuration is specific to a network service, a function, or an
212	      SFC as a whole.

214	      -  It interacts with the orchestration logic to coordinate and
215	         trigger orchestration events, such as function migration,
216	         connectivity updates, etc.  In some deployments, this entity
217	         might be co-located with the orchestration logic (e.g., the
218	         NFVO).

220	      -  It interacts with the fog agents to instruct them which
221	         variables and/or parameters need to be monitored, and to
222	         retrieve the resulting monitoring data.  This interaction is not
223	         limited to fog agents at nodes currently involved in a given
224	         network service or SFC, but also includes other nodes that are
225	         suitable for hosting a function that needs to be migrated.
226	         This allows the controller to provide the orchestration logic
227	         with candidate nodes proactively.

229	      -  It can autonomously discover and set up fog agents.
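
   The per-agent configuration described above could be represented, for
   instance, as in the following Python sketch.  All class, field and
   variable names ("MetricRequest", "battery_level", the reporting
   periods, etc.) are hypothetical and chosen only for illustration;
   this document does not define a data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class MetricRequest:
    variable: str     # what to monitor (hypothetical variable name)
    period_s: float   # how often to report, in seconds
    report_to: str    # ID of the controller or peer agent receiving the data


@dataclass
class MonitoringPlan:
    service_id: str
    per_agent: Dict[str, List[MetricRequest]] = field(default_factory=dict)

    def add(self, agent_id: str, request: MetricRequest) -> None:
        self.per_agent.setdefault(agent_id, []).append(request)


def build_plan(service_id: str,
               hosting: Iterable[str],
               candidates: Iterable[str],
               controller_id: str) -> MonitoringPlan:
    """Build a plan covering hosting nodes and candidate nodes.

    Hosting nodes report more variables and more often than candidates;
    the concrete variables and periods are assumptions of this sketch.
    """
    plan = MonitoringPlan(service_id)
    for agent_id in hosting:
        plan.add(agent_id, MetricRequest("battery_level", 5.0, controller_id))
        plan.add(agent_id, MetricRequest("cpu_load", 5.0, controller_id))
    for agent_id in candidates:
        plan.add(agent_id, MetricRequest("cpu_load", 30.0, controller_id))
    return plan
```

   For the scenario of Figure 1, build_plan("F1-F2", ["A", "B"], ["C"],
   "ctrl") would configure agents A and B as hosting nodes and agent C
   as a candidate.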

231	1.3.  Supporting simple and complex monitoring metrics

233	   Fog monitoring nodes will be capable of providing raw monitoring data
234	   as well as processed data.  The former are obtained directly from the
235	   measured variables or parameters.  The latter are obtained by
236	   applying some processing function to several monitoring data items.
237	   The fog monitoring controllers will specify the function to be
238	   executed, which data will be collected and processed by the
239	   functions, and the additional parameters that will control the
240	   processing and will determine the particularities of the output of
241	   each function.

243	   The complexity of the functions that can be executed is arbitrary.
244	   They can be either pre-instructed in the fog agents or dynamically
245	   instructed by the requester (the fog monitoring controller), which
246	   provides the sequence of functions to execute and their input
247	   parameters.
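
   A minimal sketch of this idea, assuming that the agent holds a table
   of pre-instructed functions and that the controller sends a sequence
   of (function name, parameters) steps, could look as follows in
   Python.  The function names ("last_n", "mean", "above") and the step
   format are assumptions of this example, not part of any
   specification.

```python
from statistics import mean
from typing import Any, Callable, Dict, Sequence, Tuple

Step = Tuple[str, Dict[str, Any]]  # (function name, keyword parameters)

# Functions pre-instructed in the agent (hypothetical set).
FUNCTIONS: Dict[str, Callable[..., Any]] = {
    "last_n": lambda data, n: list(data)[-n:],
    "mean": lambda data: mean(data),
    "above": lambda value, threshold: value > threshold,
}


def run_pipeline(raw: Sequence[float], steps: Sequence[Step]) -> Any:
    """Apply the requested functions in order to the raw monitoring data.

    The output of each step is the input of the next one.
    """
    value: Any = raw
    for name, params in steps:
        if name not in FUNCTIONS:
            raise KeyError(f"function not available on this agent: {name}")
        value = FUNCTIONS[name](value, **params)
    return value
```

   For example, the steps [("last_n", {"n": 2}), ("mean", {}),
   ("above", {"threshold": 80})] turn a list of raw samples into a
   single boolean that tells whether the mean of the two most recent
   samples exceeds 80.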

249	   Complex monitoring metrics (the processed data) can also be used as
250	   part of the condition that determines the distributed and autonomic
251	   actions.  Thus, the logic that defines those actions is simplified
252	   and the actuation components can concentrate on their task
253	   without requiring extra effort to process the raw monitoring data.

255	   Adding support for complex monitoring metrics enables the fog
256	   monitoring framework to avoid the transmission of unneeded data and
257	   thus optimize its overall operation.  For example, if the controller
258	   is interested in the average of the CPU load of a fog agent for the
259	   last 5 minutes, it can just request it, providing the period to
260	   average as an input parameter and specifying the source from which
261	   to measure the CPU load variable.
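
   The CPU load example can be sketched as follows in Python.  The
   class name and its interface are assumptions made for illustration;
   the agent keeps only the samples that fall inside the requested
   period and returns a single averaged value instead of the raw
   series.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WindowedAverage:
    """Average of the samples received during the last window_s seconds."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        self._samples: Deque[Tuple[float, float]] = deque()

    def add(self, timestamp: float, value: float) -> None:
        """Store a new sample and drop the ones older than the window."""
        self._samples.append((timestamp, value))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        while self._samples and self._samples[0][0] < now - self.window_s:
            self._samples.popleft()

    def average(self, now: float) -> Optional[float]:
        """Return the average over the window ending at now, or None."""
        self._evict(now)
        if not self._samples:
            return None
        return sum(v for _, v in self._samples) / len(self._samples)
```

   A controller interested in the last 5 minutes would ask the agent for
   WindowedAverage(300).average(now), so only one value crosses the
   network per request.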

263	2.  Terminology

265	   The following terms are used in this document:

267	   fog:          Fog extends to the Extreme Edge, that is, as close as
268	                 possible to the user, including the user device
269	                 itself.

271	   fog node:     Any device that is capable of participating in the Fog.
272	                 A Fog node might be volatile, mobile and constrained
273	                 (in terms of computing resources).  Fog nodes may be
274	                 heterogeneous and may belong to different owners.

276	   orchestrator:  In this document, the terms orchestrator and NFVO are
277	                 used interchangeably.

279	3.  Autonomic setup of fog monitoring framework

281	   Fog nodes autonomously start fog agents at bootstrap time, then
282	   start looking for other agents and the fog monitoring controller.
283	   This autonomic setup can be performed using GRASP.  The procedure is
284	   represented in Figure 2.  The different steps are described next:

286	   +--------+    +--------+    +--------+
287	   |  fog   |    |  fog   |    |  fog   |
288	   | node C |    | node A |    | node B |                       +------+
289	   |        |    |        |    |        |                       | fog  |
290	   | |    | |    | |    | |    | |    | |        +------+       | mon. |
291	   | +----+ |    | +----+ |    | +----+ |        | NFVO |       | ctrl |
292	   +--------+    +--------+    +--------+        +------+       +------+
293	                      |             |                |              |
294	               (fog nodes A & B bootstrap)           |              |
295	                      |             |                |              |
296	                      |             |   periodic mcast advertisement|
297	                      |             |               (ID, fog_scope) |
298	                      |             |  <----------------------------+
299	                      | Mcast discovery (fog_node_ID, scope)        |
300	                      +-------------------------------------------->|
301	                      +------------>|                |              |
302	                      |    Mcast discovery (fog_node_ID, scope)     |
303	                      |             +------------------------------>|
304	                      |<------------+                |              |
305	                      |             |                |              |
306	                      |       Unicast advertisement (ID, fog_scope) |
307	                      |             |<------------------------------+
308	                      |<--------------------------------------------+
309	                      |             |                |              |
310	                      |    Unicast registration (ID, fog_node_ID    |
311	                      |             |            fog_scope, capab.) |
312	                      |             +------------------------------>|
313	                      +-------------------------------------------->|
314	                      |             |                |              |
315	               (fog nodes A & B registered)          |              |
316	                      |             |                |              |
317	   (fog node C bootstraps)          |                |              |
318	        |             |             |                |              |
319	        | Mcast discovery (fog_node_ID, scope)       |              |
320	        +---------------------------------------------------------->|
321	        +-------------------------->|                |              |
322	        +------------>|       Unicast advertisement (ID, fog_scope) |
323	        |<----------------------------------------------------------+
324	        |<--------------------------+                |              |
325	        |<------------+    Unicast registration (ID, fog_node_ID    |
326	        |             |             |            fog_scope, capab.) |
327	        +---------------------------------------------------------->|
328	   (fog node C registered)          |                |              |
329	        |             |             |                |              |

331	                  Figure 2: Autonomic setup of fog agents

333	   *  The fog monitoring controller periodically sends multicast
334	      advertisement messages, which include its ID as well as the
335	      scope of the advertisement messages (i.e., the scope within
336	      which the messages have to be flooded).

338	      M_DISCOVERY messages are used, with new objectives and objective
339	      options.  GRASP specifies that "an objective option is used to
340	      identify objectives for the purposes of discovery, negotiation or
341	      synchronization".  New objective options are defined for the
342	      purposes of discovering potential fog agents with certain
343	      characteristics.  Non-limiting examples of these options are
344	      listed below (note that the names are just examples, and the ones
345	      used have to be registered with IANA):

347	      -  FOGNODERADIO: used to specify a given type of radio technology,
348	         e.g., WiFi (version), D2D, LTE, 5G, Bluetooth (version), etc.

350	      -  FOGNODECONNECTIVITY: used to specify a given type of
351	         connectivity, e.g., layer-2, IPv4, IPv6.

353	      -  FOGNODEVIRTUALIZATION: used to specify a given type of
354	         virtualization supported by the node where the agent runs.
355	         Examples are: hypervisor (type), container, micro-kernel, bare-
356	         metal, etc.

358	      -  FOGNODEDOMAIN: used to specify the domain/owner of the node.
359	         This is useful to support operation of multiple domains/
360	         operators simultaneously on the same fog network.

362	      An example of a discovery message using GRASP is the following (in
363	      this example, the fog monitoring controller is identified by its
364	      IPv6 address: 2001:DB8:1111:2222:3333:4444:5555:6666):

366	      [M_DISCOVERY, 13948745, h'20010db8111122223333444455556666',
367	      ["FOGNODEDOMAIN", F_SYNCH_bits, 2, "operator1"]]

369	      GRASP is used to let the fog agents and the controller discover
370	      each other autonomically.  The extensions defined above, together
371	      with properly scoped multicast addresses (as explained below),
372	      make it possible to define precisely which nodes participate in
373	      the monitoring and to gather their principal characteristics.

375	   *  When a fog node bootstraps, such as nodes A and B in the figure,
376	      it starts sending multicast discovery messages within a given
377	      scope, that is, the intended area that composes the fog.  The
378	      definition of the scope depends on the scenario, and examples of
379	      possible scopes are:

381	      -  All-resources of a given manufacturer.

383	      -  All-resources of a given type.

385	      -  All-resources of a given administrative domain.

387	      -  All-resources of a given user.

389	      -  All-resources within a topological network distance (e.g.,
390	         number of hops).

392	      -  All-resources within a geographical location.

394	      -  Etc.

396	      Combinations of the previous scopes are also possible.

398	      The discovery messages are multicast within the scope, reaching
399	      all the nodes that compose the specified fog resources.  This can
400	      be done, for example, using well-defined IPv6 multicast addresses
401	      specified for each of the different scopes.  This signaling is
402	      based on GRASP.  Different IPv6 multicast addresses need to be
403	      defined to reach each different scope, using scopes equal to or
404	      larger than Admin-Local according to [RFC7346].

406	   *  In response to multicast fog discovery messages, the fog
407	      monitoring controller replies with unicast messages providing its
408	      information.

410	   *  Fog agents can then register with a controller.  The registration
411	      message is unicast, and includes information on the capabilities
412	      of the fog node, such as:

414	      -  Type of node.

416	      -  Vendor.

418	      -  Energy source: battery-powered or not.

420	      -  Connectivity (number of network interfaces and information
421	         associated to them, such as radio technology type, layer-2 and
422	         layer-3 addresses, etc.).

424	      -  Etc.

426	      Note that registration with multiple fog monitoring controller
427	      instances would also be possible if a fog node wants to belong to
428	      several fog domains at the same time (but note that how the
429	      orchestration of the same resource is done by multiple
430	      orchestrators is not covered by this document).  The defined
431	      mechanisms support this via the use of fog IDs and FOGNODEDOMAIN
432	      options.

434	   *  A fog node C bootstraps after nodes A and B are already
435	      registered.  The same discovery process is followed by fog node C,
436	      but in addition to the regular advertisement and registration
437	      procedures described before, existing neighboring fog agents (such
438	      as A and B in this example) might also respond to discovery
439	      messages sent by bootstrapping nodes to provide required
440	      information.  This makes the procedure faster, more efficient and
441	      reliable.  In addition to helping the fog monitoring controller in
442	      the fog agent discovery process, fog agents themselves learn about
443	      the existence and associated capabilities of other fog agents.
444	      This can be used to allow autonomous monitoring by the fog agents
445	      without the involvement of the central controller.
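
   The steps above can be sketched in Python as follows.  This is a
   simplified model under stated assumptions, not an implementation of
   GRASP: messages are plain Python lists rather than CBOR, the numeric
   values of M_DISCOVERY and F_SYNCH_bits are taken from the authors'
   reading of the GRASP specification and should be verified there, and
   the Controller class with its method names is hypothetical.  The
   scope check reads the 4-bit scope field of an IPv6 multicast address,
   where Admin-Local is value 4 in [RFC7346].

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed values; verify against the GRASP specification before use.
M_DISCOVERY = 1
F_SYNCH_BITS = 0b0101  # synchronization objective that is also discoverable
ADMIN_LOCAL = 4        # Admin-Local scope value in RFC 7346


def multicast_scope(addr: str) -> int:
    """Return the scope field (low 4 bits of the second byte) of an
    IPv6 multicast address."""
    ip = ipaddress.IPv6Address(addr)
    if not ip.is_multicast:
        raise ValueError(f"not an IPv6 multicast address: {addr}")
    return ip.packed[1] & 0x0F


def build_discovery(session_id: int, initiator: str, domain: str,
                    loop_count: int = 2) -> list:
    """Build the discovery message structure shown in this section,
    using the FOGNODEDOMAIN option name as the objective name."""
    return [
        M_DISCOVERY,
        session_id,
        ipaddress.IPv6Address(initiator).packed,  # 16-byte address
        ["FOGNODEDOMAIN", F_SYNCH_BITS, loop_count, domain],
    ]


@dataclass
class Controller:
    """Hypothetical fog monitoring controller keeping a registry of agents."""
    controller_id: str
    fog_scope: str
    agents: Dict[str, dict] = field(default_factory=dict)

    def on_discovery(self, fog_node_id: str, scope: str) -> Optional[dict]:
        """Answer a multicast discovery with a unicast advertisement,
        or ignore it (None) when the scope does not match."""
        if scope != self.fog_scope:
            return None
        return {"id": self.controller_id, "fog_scope": self.fog_scope}

    def on_registration(self, fog_node_id: str, fog_scope: str,
                        capabilities: dict) -> bool:
        """Record the capabilities announced by a registering fog agent."""
        if fog_scope != self.fog_scope:
            return False
        self.agents[fog_node_id] = capabilities
        return True
```

   In this sketch, a multicast address is acceptable for fog discovery
   when multicast_scope(addr) >= ADMIN_LOCAL, and a fog node is
   considered registered once on_registration() has stored its
   capabilities.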

447	4.  IANA Considerations

449	   TBD.

451	5.  Security Considerations

453	   TBD.

455	6.  Acknowledgments

457	   The work in this draft will be further developed and explored under
458	   the framework of the H2020 5G-DIVE project (Grant 859881).

460	7.  Informative References

462	   [RFC7346]  Droms, R., "IPv6 Multicast Address Scopes", RFC 7346,
463	              DOI 10.17487/RFC7346, August 2014,
464	              <https://www.rfc-editor.org/info/rfc7346>.

466	Authors' Addresses

468	   Carlos J. Bernardos (editor)
469	   Universidad Carlos III de Madrid
470	   Av. Universidad, 30
471	   28911 Leganes, Madrid
472	   Spain
473	   Phone: +34 91624 6236
474	   Email: cjbc@it.uc3m.es
475	   URI:   http://www.it.uc3m.es/cjbc/

476	   Alain Mourad
477	   InterDigital Europe
478	   Email: Alain.Mourad@InterDigital.com
479	   URI:   http://www.InterDigital.com/

481	   Pedro Martinez-Julia
482	   NICT
483	   4-2-1, Nukui-Kitamachi, Koganei, Tokyo
484	   184-8795
485	   Japan
486	   Phone: +81 42 327 7293
487	   Email: pedro@nict.go.jp