ANIMA WG                                                   CJ. Bernardos
Internet-Draft                                                      UC3M
Intended status: Experimental                                  A. Mourad
Expires: May 23, 2021                                       InterDigital
                                                       November 19, 2020


                Autonomic setup of fog monitoring agents
                draft-bernardos-anima-fog-monitoring-03

Abstract

   The concept of fog computing has emerged, driven by the Internet of
   Things (IoT), due to the need to handle the data generated by
   end-user devices. The term fog refers to any networked
   computational resource in the continuum between things and cloud. In
   fog computing, functions can be stitched together to compose a
   service function chain.
   These functions might be hosted on resources that
   are inherently heterogeneous, volatile and mobile. This means that
   resources might appear and disappear, and the connectivity
   characteristics between these resources may also change dynamically.
   This calls for new orchestration solutions able to cope with dynamic
   changes to the resources at runtime or ahead of time (in anticipation
   through prediction), as opposed to today's solutions, which are
   inherently reactive and static or semi-static.

   A fog monitoring solution can be used to help predict events so that
   an action can be taken before an event actually takes place. This
   solution is composed of agents running on the fog nodes plus a
   controller hosted on another device (running in the infrastructure
   or on another fog node). Since fog environments are inherently
   volatile and extremely dynamic, it is convenient to use autonomic
   technologies to set up the fog monitoring platform autonomously.
   This document presents this use case and specifies how to use GRASP
   as needed in this scenario.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 23, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Problem statement
     1.2.  Fog monitoring framework
   2.  Terminology
   3.  Autonomic setup of fog monitoring framework
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgments
   7.  Informative References
   Authors' Addresses

1.  Introduction

   The concept of fog computing has emerged, driven by the Internet of
   Things (IoT), due to the need to handle the data generated by
   end-user devices. The term fog refers to any networked
   computational resource in the continuum between things and cloud. A
   fog node may therefore be an infrastructure network node such as an
   eNodeB or gNodeB, an edge server, a customer premises equipment
   (CPE), or even a user equipment (UE) terminal node such as a laptop,
   a smartphone, or a computing unit on board a vehicle, robot or drone.

   In fog computing, functions might be organized in service function
   chains (SFCs), hosted on resources that are inherently heterogeneous,
   volatile and mobile. This means that resources might appear and
   disappear, and the connectivity characteristics between these
   resources may also change dynamically. This calls for new
   orchestration solutions able to cope with dynamic changes to the
   resources at runtime or ahead of time (in anticipation through
   prediction), as opposed to today's solutions, which are inherently
   reactive and static or semi-static.

1.1.  Problem statement

   Figure 1 shows an example scenario of a (robot) network service. A
   robot device has its (navigation) control application running in the
   fog, away from the robot, as a network service in the form of an SFC
   "F1-F2" (e.g., F1 might be in charge of identifying obstacles and F2
   might take decisions on the robot navigation). Initially, function
   F1 is assumed to be hosted at fog node A and F2 at fog node B. At a
   given point in time, fog node A becomes unavailable (e.g., due to low
   battery or because fog node A moves out of the coverage of the
   robot). It is therefore necessary to predict the need to migrate
   function F1 to another node (e.g., fog node C in the figure), and
   the migration needs to be done before the fog/edge node becomes
   unavailable or no longer capable. Such dynamic migration cannot be
   dealt with by today's orchestration solutions, which are rather
   reactive and static or semi-static (e.g., resources may fail, but
   this is an exceptional event, happening with low frequency, and only
   scaling actions are supported to react to SLA-related events).

                  --------------
                  |    ====    |
                 ------+F1+----------
                / |  | ==== |  |     \
               /  |  +------+  |      \
               |  | fog node C |       \
               |  --------------        \
               |                         \
               |       --------------  ---\----------
               |       |    ====    |  |   \====    |
               | -----------+F1+------------+F2|    |
               |/      |  | ==== |  |  |  | ==== |  |
               o       |  +------+  |  |  +------+  |
               |       | fog node A |  | fog node B |
       --------+-      --------------  --------------
       |        |
       --0----0--

                       Figure 1: Example scenario

   Existing frameworks rely on monitoring platforms that react to
   resource failure events and ensure that negotiated SLAs are met.
   However, these are not designed to predict events likely to happen in
   a volatile fog environment, such as resources moving away, resources
   becoming unavailable due to battery issues, or changes in the
   availability of resources caused by variations in the use of the
   local resources on the nodes. Besides, it is not feasible in this
   kind of volatile and extremely mobile environment to continuously
   monitor and report every possible parameter on all the nodes hosting
   resources, as this would not scale, would consume many resources and
   would generate extra overhead.

   In volatile and mobile environments, prediction (make-before-break)
   is needed, as pure reaction (break-before-make) is not enough. This
   prediction is not generic, and depends on the nature of the network
   service/SFC: the functions of the SFC, the connectivity between them,
   the service-specific requirements, etc. Monitoring has to be set up
   differently on the nodes, depending on the specifics of the network
   service.
   Besides, in order to act proactively and predict what might
   need to be done, monitoring in such volatile and mobile
   environments does not only involve the nodes currently hosting the
   resources running the network service/service function chain (i.e.,
   hosting a function), but also other nodes that are potential
   candidates to join, either in addition to or in substitution for
   current nodes, for running the network service in accordance with
   the orchestration decisions.

   In the example of Figure 1, the fog node initially hosting function
   F1 (fog node A) might be running out of battery, and this should be
   detected before node A actually becomes unavailable, so that
   function F1 can be migrated in time to a different fog node C,
   capable of meeting the requirements of F1 (compute, networking,
   location, expected availability, etc.). In order to be able to
   predict the need for such a migration and to have already identified
   a target fog node to which to move the function, a monitoring
   solution needs to be in place that instructs each node involved in
   the service (A and B), as well as the neighboring node that is a
   candidate to host function F1 (C), to monitor and report on metrics
   that are relevant for the specific network service "F1-F2" that is
   currently running.

1.2.  Fog monitoring framework

   Fog environments differ from data-center ones in three key aspects:
   heterogeneity, volatility and mobility. The fog monitoring framework
   is used to predict events that trigger an orchestration event (e.g.,
   migrating a function to a different resource).

   The monitoring framework we propose for fog environments is composed
   of two logical components:

   o  Fog agents running on each fog node. An agent is responsible for
      sending information to a fog monitoring controller and to other
      fog agents.
      What to monitor and what information to send
      (including frequency) is configured per agent, considering the
      specifics of the network service/SFC. A fog agent might also take
      some autonomous actions (such as requesting the migration of a
      function to a neighbor node) in situations where connectivity
      with the fog monitoring controller is temporarily unavailable.

   o  A fog monitoring controller (e.g., running at the edge or at a fog
      node). This node obtains input from the orchestration logic (MANO
      stack) and autonomously decides what information to monitor, where
      and how, based on the requirements provided by the orchestration
      logic managing the network services instantiated in the fog. This
      configuration is network service/function specific.

      *  It interacts with the orchestration logic to coordinate and
         trigger orchestration events, such as function migration,
         connectivity updates, etc. In some deployments, this entity
         might be co-located with the orchestration logic (e.g., the
         NFVO).

      *  It interacts with the fog agents to instruct them on what
         information and parameters need to be monitored, as well as
         to obtain such information. This interaction is not limited
         to fog agents at nodes currently involved in a given network
         service/SFC, but also includes other nodes that are suitable
         for hosting a function that needs to be migrated. This allows
         the controller to provide the orchestration logic with
         candidate nodes in a proactive way.

      *  It is capable of autonomously discovering and setting up fog
         agents.

2.  Terminology

   The following terms are used in this document:

   fog:          Fog goes to the Extreme Edge, that is, as close as
                 possible to the user, including the user device
                 itself.

   fog node:     Any device that is capable of participating in the Fog.
                 A Fog node might be volatile, mobile and constrained
                 (in terms of computing resources).
                 Fog nodes may be heterogeneous and may belong to
                 different owners.

   orchestrator: In this document we use the terms orchestrator and NFVO
                 interchangeably.

3.  Autonomic setup of fog monitoring framework

   Fog nodes autonomously start fog agents when bootstrapping. The
   agents then start looking for other agents and for the fog
   monitoring controller. This autonomic setup can be performed using
   GRASP. The procedure is represented in Figure 2. The different steps
   are described next:

   +--------+    +--------+    +--------+
   |  fog   |    |  fog   |    |  fog   |
   | node C |    | node A |    | node B |                      +------+
   |        |    |        |    |        |                      | fog  |
   | |    | |    | |    | |    | |    | |        +------+      | mon. |
   | +----+ |    | +----+ |    | +----+ |        | NFVO |      | ctrl |
   +--------+    +--------+    +--------+        +------+      +------+
                     |             |                 |             |
               (fog nodes A & B bootstrap)           |             |
                     |             |                 |             |
                     |             |   periodic mcast advertisement|
                     |             |         (ID, fog_scope)       |
                     |             |  <----------------------------+
                     |    Mcast discovery (fog_node_ID, scope)     |
                     +-------------------------------------------->|
                     +------------>|                 |             |
                     |    Mcast discovery (fog_node_ID, scope)     |
                     |             +------------------------------>|
                     |<------------+                 |             |
                     |             |                 |             |
                     |    Unicast advertisement (ID, fog_scope)    |
                     |             |<------------------------------+
                     |<--------------------------------------------+
                     |             |                 |             |
                     |   Unicast registration (ID, fog_node_ID,    |
                     |             |         fog_scope, capab.)    |
                     |             +------------------------------>|
                     +-------------------------------------------->|
                     |             |                 |             |
               (fog nodes A & B registered)          |             |
                     |             |                 |             |
 (fog node C bootstraps)           |                 |             |
       |             |             |                 |             |
       |    Mcast discovery (fog_node_ID, scope)     |             |
       +---------------------------------------------------------->|
       +-------------------------->|                 |             |
       +------------>|    Unicast advertisement (ID, fog_scope)    |
       |<----------------------------------------------------------+
       |<--------------------------+                 |             |
       |<------------+   Unicast registration (ID, fog_node_ID,    |
       |             |             |         fog_scope, capab.)    |
       +---------------------------------------------------------->|
 (fog node C registered)           |                 |             |
       |             |             |                 |             |

                Figure 2: Autonomic setup of fog agents

   o  The fog monitoring controller periodically sends multicast
      advertisement messages, which include its ID as well as the scope
      of the advertisement messages (i.e., the scope within which the
      messages have to be flooded).

      M_DISCOVERY messages are used, with new objectives and objective
      options. GRASP specifies that "an objective option is used to
      identify objectives for the purposes of discovery, negotiation or
      synchronization". New objective options are defined for the
      purposes of discovering potential fog agents with certain
      characteristics. Non-limiting examples of these options are
      listed below (note that the names are just examples, and the ones
      used have to be registered with IANA):

      *  FOGNODERADIO: used to specify a given type of radio technology,
         e.g., WiFi (version), D2D, LTE, 5G, Bluetooth (version), etc.

      *  FOGNODECONNECTIVITY: used to specify a given type of
         connectivity, e.g., layer-2, IPv4, IPv6.

      *  FOGNODEVIRTUALIZATION: used to specify a given type of
         virtualization supported by the node where the agent runs.
         Examples are: hypervisor (type), container, micro-kernel,
         bare-metal, etc.

      *  FOGNODEDOMAIN: used to specify the domain/owner of the node.
         This is useful to support the simultaneous operation of
         multiple domains/operators on the same fog network.

      An example of a discovery message using GRASP would be the
      following (in this example, the fog monitoring controller is
      identified by its IPv6 address,
      2001:DB8:1111:2222:3333:4444:5555:6666):

      [M_DISCOVERY, 13948745, h'20010db8111122223333444455556666',
      ["FOGNODEDOMAIN", F_SYNCH_bits, 2, "operator1"]]

      GRASP is used so that the fog agents and the controller can
      discover each other in an autonomic way. The extensions defined
      above, together with the use of properly scoped multicast
      addresses (as explained below), make it possible to define
      precisely which nodes participate in the monitoring and to gather
      their principal characteristics.

   o  When fog nodes bootstrap, such as nodes A and B in the figure,
      they start sending multicast discovery messages within a given
      scope, that is, the intended area that composes the fog. The
      definition of the scope depends on the scenario, and examples of
      possible scopes are:

      *  All resources of a given manufacturer.

      *  All resources of a given type.

      *  All resources of a given administrative domain.

      *  All resources of a given user.

      *  All resources within a topological network distance (e.g.,
         number of hops).

      *  All resources within a geographical location.

      *  Etc.

      Combinations of the previous scopes are also possible.

      The discovery messages are multicast within the scope, reaching
      all the nodes that compose the specified fog resources. This can
      be done, for example, using well-defined IPv6 multicast addresses,
      specified for each of the different scopes. This signaling is
      based on GRASP. Different IPv6 multicast addresses need to be
      defined to reach each different scope, using scopes equal to or
      larger than Admin-Local according to [RFC7346].
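
      As a rough illustration of the discovery step above, the
      following Python sketch builds the example discovery message as a
      plain list and checks that a multicast address has a scope of
      Admin-Local or larger. The helper names (multicast_scope,
      usable_for_fog_scope, build_discovery) are hypothetical, the
      numeric values of M_DISCOVERY and F_SYNCH are taken from the
      GRASP specification as understood here and should be verified,
      and excluding the reserved scope 0xF is a choice of this sketch.
      The CBOR encoding that GRASP uses on the wire is left out.

```python
import ipaddress

# Assumed GRASP values (message type and flag bit position); verify them
# against the GRASP specification before relying on them.
M_DISCOVERY = 1
F_SYNCH = 2  # bit position inside the objective flags bitmap

# IPv6 multicast "scop" values from RFC 7346.
ADMIN_LOCAL = 0x4
GLOBAL = 0xE  # 0xF is reserved, so this sketch excludes it


def multicast_scope(addr: str) -> int:
    """Return the 4-bit scop field of an IPv6 multicast address."""
    ip = ipaddress.IPv6Address(addr)
    if not ip.is_multicast:
        raise ValueError(f"{addr} is not an IPv6 multicast address")
    # Layout: 0xFF | flags (4 bits) | scop (4 bits) | 112-bit group ID.
    return ip.packed[1] & 0x0F


def usable_for_fog_scope(addr: str) -> bool:
    """True if the scope is Admin-Local or larger and not reserved."""
    return ADMIN_LOCAL <= multicast_scope(addr) <= GLOBAL


def build_discovery(session_id: int, initiator: str, domain: str) -> list:
    """Build the discovery message as a plain list (no CBOR encoding)."""
    # Objective: [name, flags, loop count, value], as in the example above.
    objective = ["FOGNODEDOMAIN", 1 << F_SYNCH, 2, domain]
    initiator_bytes = ipaddress.IPv6Address(initiator).packed
    return [M_DISCOVERY, session_id, initiator_bytes, objective]


if __name__ == "__main__":
    msg = build_discovery(
        13948745, "2001:DB8:1111:2222:3333:4444:5555:6666", "operator1"
    )
    print(msg[2].hex(), msg[3])
    for group in ("ff02::1", "ff04::1234", "ff05::1"):
        print(group, usable_for_fog_scope(group))
```

      In this sketch a Link-Local group such as ff02::1 is rejected,
      while Admin-Local (ff04::/16) and Site-Local (ff05::/16) groups
      are accepted. The group IDs used are arbitrary illustrations.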

   o  In response to multicast fog discovery messages, the fog
      monitoring controller replies with unicast advertisement messages.

   o  Fog agents can then register with a controller. The registration
      message is unicast, and includes information on the capabilities
      of the fog node, such as:

      *  Type of node.

      *  Vendor.

      *  Energy source: battery-powered or not.

      *  Connectivity (number of network interfaces and information
         associated with them, such as radio technology type, layer-2
         and layer-3 addresses, etc.).

      *  Etc.

      Note that registration with multiple fog monitoring controller
      instances is also possible if a fog node wants to belong to
      several fog domains at the same time (how the same resource is
      orchestrated by multiple orchestrators is not covered by this
      document). The defined mechanisms support this via the use of fog
      IDs and FOGNODEDOMAIN options.

   o  A fog node C bootstraps after nodes A and B are already
      registered. The same discovery process is followed by fog node C,
      but in addition to the regular advertisement and registration
      procedures described before, existing neighboring fog agents (such
      as A and B in this example) might also respond to discovery
      messages sent by bootstrapping nodes to provide the required
      information. This makes the procedure faster, more efficient and
      more reliable. In addition to helping the fog monitoring
      controller in the fog agent discovery process, fog agents
      themselves learn about the existence and associated capabilities
      of other fog agents. This can be used to allow autonomous
      monitoring by the fog agents without the involvement of the
      central controller.

4.  IANA Considerations

   TBD.

5.  Security Considerations

   TBD.

6.  Acknowledgments

   The work in this draft will be further developed and explored under
   the framework of the H2020 5G-DIVE project (Grant 859881).

7.  Informative References

   [RFC7346]  Droms, R., "IPv6 Multicast Address Scopes", RFC 7346,
              DOI 10.17487/RFC7346, August 2014.

Authors' Addresses

   Carlos J. Bernardos
   Universidad Carlos III de Madrid
   Av. Universidad, 30
   Leganes, Madrid 28911
   Spain

   Phone: +34 91624 6236
   Email: cjbc@it.uc3m.es
   URI:   http://www.it.uc3m.es/cjbc/


   Alain Mourad
   InterDigital Europe

   Email: Alain.Mourad@InterDigital.com
   URI:   http://www.InterDigital.com/