idnits 2.17.1 

draft-irtf-nmrg-autonomic-sla-violation-detection-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (June 8, 2017) is 2514 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'LMAP' is mentioned on line 470, but not defined

  == Missing Reference: 'IPFIX' is mentioned on line 478, but not defined

  == Missing Reference: 'ALTO' is mentioned on line 487, but not defined

  == Outdated reference: A later version (-30) exists of
     draft-ietf-anima-autonomic-control-plane-06

  -- Obsolete informational reference (is this intentional?): RFC 4148
     (Obsoleted by RFC 6248)


     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Network Management Research Group                               J. Nobre
3	Internet-Draft                       University of Vale do Rio dos Sinos
4	Intended status: Informational                              L. Granville
5	Expires: December 10, 2017       Federal University of Rio Grande do Sul
6	                                                                A. Clemm
7	                                                                  Huawei
8	                                                      A. Gonzalez Prieto
9	                                                            June 8, 2017

11	     Autonomic Networking Use Case for Distributed Detection of SLA
12	                               Violations
13	          draft-irtf-nmrg-autonomic-sla-violation-detection-08

15	Abstract

17	   This document describes a use case for autonomic networking
18	   concerning monitoring of Service Level Agreements (SLAs).  The use
19	   case aims to detect violations of SLAs in distributed fashion,
20	   striving to optimize the autonomic deployment of active measurement
21	   probes in a way that maximizes the likelihood of detecting service
22	   level violations without any outside guidance or intervention.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on December 10, 2017.

41	Copyright Notice

43	   Copyright (c) 2017 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	   2.  Definitions and Acronyms
60	   3.  Current Approaches
61	   4.  Use Case Description
62	   5.  A Distributed Autonomic Solution
63	   6.  Intended User Experience
64	   7.  Implementation Considerations
65	     7.1.  Device Based Self-Knowledge and Decisions
66	     7.2.  Interaction with other devices
67	   8.  Comparison with current solutions
68	   9.  Related IETF Work
69	   10. Acknowledgements
70	   11. IANA Considerations
71	   12. Security Considerations
72	   13. Informative References
73	   Authors' Addresses

75	1.  Introduction

77	   The Internet has been growing dramatically in terms of size,
78	   capacity, and accessibility in recent years.  Communication
79	   requirements of distributed services and applications running on top
80	   of the Internet have become increasingly demanding.  Some examples
81	   are real-time interactive video or financial trading.  Providing such
82	   services involves stringent requirements in terms of acceptable
83	   latency, loss, or jitter.

85	   Performance requirements lead to the articulation of Service Level
86	   Objectives (SLOs) which must be met.  Those SLOs are part of Service
87	   Level Agreements (SLAs) that define a contract between the provider
88	   and the consumer of a service.  SLOs, in effect, constitute a service
89	   level guarantee that the consumer of the service can expect to
90	   receive (and often has to pay for).  Likewise, the provider of a
91	   service needs to ensure that the service level guarantee and
92	   associated SLOs are met.  Some examples of clauses that relate to
93	   service level objectives can be found in [RFC7297].

95	   Violations of SLOs can be associated with significant financial loss,
96	   which can be divided into two categories.  For one, there is the loss
97	   that can be incurred by the user of a service when the agreed service
98	   levels are not provided.  For example, a financial brokerage's stock
99	   orders might suffer losses when it is unable to execute stock
100	   transactions in a timely manner.  An electronic retailer may lose
101	   customers when their online presence is perceived by customers as
102	   sluggish.  An online gaming provider may not be able to provide fair
103	   access to online players, resulting in frustrated players who are
104	   lost as customers.  In each case, the failure of a service provider
105	   to meet promised service level guarantees can have a substantial
106	   financial impact on users of the service.  By the same token, there
107	   is the loss that is incurred by the provider of a service who is
108	   unable to meet promised service level objectives.  Those losses can
109	   take several forms, such as penalties for missed service levels
110	   and, often more importantly, loss of revenue due to reduced
111	   customer satisfaction.  Hence, service level objectives are a key
112	   concern for the service provider.  In order to ensure that SLOs are
113	   not being violated, service levels need to be continuously monitored
114	   at the network infrastructure layer in order to know, for example,
115	   when mitigating actions need to be taken.  To that end, service level
116	   measurements must take place.

118	   Network measurements can be performed using active or passive
119	   measurement techniques.  In passive measurements, production traffic
120	   is observed, and no monitoring traffic is created by the measurement
121	   process itself.  That is, network conditions are checked in a
122	   non-intrusive way.  In the context of the IP Flow Information EXport
123	   (IPFIX) WG, several documents were produced to define passive
124	   measurement mechanisms (e.g., flow records specification [RFC3954]).
125	   Active measurements, on the other hand, are intrusive in the sense
126	   that they involve injecting synthetic test traffic into the network
127	   to measure network service levels.  The IP Performance Metrics (IPPM)
128	   WG produced documents that describe active measurement mechanisms,
129	   such as: One-Way Active Measurement Protocol (OWAMP) [RFC4656],
130	   Two-Way Active Measurement Protocol (TWAMP) [RFC5357], and Cisco
131	   Service Level Assurance Protocol (SLA) [RFC6812].  In addition, there
132	   are some mechanisms that do not fit into either active or passive
133	   categories, such as Performance and Diagnostic Metrics Destination
134	   Option (PDM) techniques [draft-ietf-ippm-6man-pdm-option].

136	   Active measurement mechanisms offer a high level of control of what
137	   and how to measure.  They do not require inspecting production
138	   traffic.  Because of this, active measurements usually offer better
139	   accuracy and privacy than passive measurement mechanisms.  Traffic
140	   encryption and regulations that limit the amount of payload
141	   inspection that can occur are non-issues.  Furthermore, active
142	   measurement mechanisms are able to detect end-to-end network
143	   performance problems in a fine-grained way (e.g., simulating the
144	   traffic that must be handled considering specific Service Level
145	   Objectives - SLOs).  As a result, active measurements are often
146	   preferred over passive measurement for SLA monitoring.  Measurement
147	   probes must be hosted in network devices and measurement sessions
148	   must be activated to compute the current network metrics (e.g.,
149	   considering those described in [RFC4148]).  This activation should be
150	   dynamic in order to follow changes in network conditions, such as
151	   those related to routes being added or new customer demands.

153	   While offering many advantages, active measurements are expensive in
154	   terms of network resource consumption.  Active measurements generally
155	   involve measurement probes that generate synthetic test traffic that
156	   is directed at a responder.  The responder needs to timestamp test
157	   traffic it receives and reflect it back to the originating
158	   measurement probe.  The measurement probe subsequently processes the
159	   returned packets along with time stamping information in order to
160	   compute service levels.  Accordingly, active measurements consume
161	   substantial CPU cycles as well as memory of network devices to
162	   generate and process test traffic.  In addition, synthetic traffic
163	   increases network load.  Active measurements thus compete for
164	   resources with other functions, including routing and switching.
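
   The processing step described above can be sketched as follows.
   This is a minimal illustration, not the procedure of any specific
   protocol cited here: the names ReflectedPacket and t1 to t4 are
   assumptions made for this example.  The round-trip delay subtracts
   the time the responder held the packet, so the two clocks do not
   need to be synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReflectedPacket:
    """Timestamps (in seconds) for one reflected test packet.

    Field names are hypothetical and chosen for this sketch only.
    """
    t1: float  # probe sends the packet (probe clock)
    t2: float  # responder receives it (responder clock)
    t3: float  # responder reflects it (responder clock)
    t4: float  # probe receives the reflection (probe clock)


def round_trip_delay(packet: ReflectedPacket) -> float:
    """Round-trip delay excluding the responder's processing time."""
    return (packet.t4 - packet.t1) - (packet.t3 - packet.t2)


def loss_ratio(sent: int, received: int) -> float:
    """Fraction of test packets that were never reflected back."""
    if sent <= 0 or not 0 <= received <= sent:
        raise ValueError("need sent > 0 and 0 <= received <= sent")
    return (sent - received) / sent


if __name__ == "__main__":
    pkt = ReflectedPacket(t1=0.000, t2=10.010, t3=10.012, t4=0.030)
    print(f"round-trip delay: {round_trip_delay(pkt) * 1000:.1f} ms")
    print(f"loss ratio: {loss_ratio(sent=100, received=97):.2f}")
```

   Every such computation runs on the network device itself, which is
   why the number of concurrent sessions directly affects its CPU and
   memory budget.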

166	   The resources required and traffic generated by the active
167	   measurement sessions are to a large part a function of the number of
168	   measured network destinations.  (In addition, the amount of traffic
169	   generated for each measurement plays a role, which in turn influences
170	   the accuracy of the measurement.)  The more destinations are being
171	   measured, the larger the amount of resources consumed and traffic
172	   needed to perform the measurements.  Thus, better monitoring
173	   coverage requires deploying more sessions, which consequently
174	   increases consumed resources.  Conversely, enabling the
175	   observation of just a small subset of all network flows can lead to
176	   insufficient coverage.
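
   As a back-of-the-envelope illustration of this scaling, the sketch
   below estimates the one-way test traffic generated by a single
   device.  The function name and the example figures are assumptions
   made for illustration, not values taken from this document.

```python
def probe_load_bps(destinations: int, packets_per_second: float,
                   packet_size_bytes: int) -> float:
    """Estimate one-way synthetic test traffic from one device, in bit/s.

    Assumes one measurement session per destination, each sending
    fixed-size packets at a constant rate (illustrative assumption).
    """
    if destinations < 0 or packets_per_second < 0 or packet_size_bytes < 0:
        raise ValueError("inputs must be non-negative")
    return destinations * packets_per_second * packet_size_bytes * 8


if __name__ == "__main__":
    # Doubling the number of measured destinations doubles the load.
    for count in (200, 400):
        print(count, "destinations:", probe_load_bps(count, 10, 100), "bit/s")
```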

178	   Furthermore, while some end-to-end service levels can be determined
179	   by adding up the service levels observed across different path
180	   segments, the same is not true for all service levels.  For example,
181	   the end-to-end delay or packet loss from a node A to a node C routed
182	   via a node B can often be computed simply by adding delays (or loss)
183	   from A to B, and B to C.  This allows decomposing a large set of
184	   end-to-end measurements into a much smaller set of segment
185	   measurements.  However, end-to-end jitter and (for example) Mean
186	   Opinion Scores cannot be decomposed as easily and, for higher
187	   accuracy, must be measured end-to-end.
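
   The decomposition for delay and loss can be sketched as follows.
   Delay is summed across segments.  For loss, the sketch assumes that
   segments drop packets independently (an assumption not stated in the
   text above), in which case the end-to-end loss ratio is one minus
   the product of the per-segment delivery ratios.  For small loss
   ratios this is close to the simple sum.  Function names are
   illustrative.

```python
from math import prod
from typing import Sequence


def end_to_end_delay(segment_delays_ms: Sequence[float]) -> float:
    """Delay is additive along the path."""
    return sum(segment_delays_ms)


def end_to_end_loss(segment_loss_ratios: Sequence[float]) -> float:
    """Loss ratio along the path, assuming independent segment losses."""
    return 1.0 - prod(1.0 - p for p in segment_loss_ratios)


if __name__ == "__main__":
    # Path A -> B -> C measured as two segments.
    print(end_to_end_delay([12.0, 8.5]))
    print(end_to_end_loss([0.01, 0.02]))
```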

189	   Hence, the decision of how to place measurement probes becomes an
190	   important management activity.  The goal is to obtain maximum
191	   benefits of service level monitoring with a limited amount of
192	   measurement overhead.  Specifically, the goal is to maximize the
193	   number of service level violations that are detected with a limited
194	   amount of resources.

196	2.  Definitions and Acronyms

198	   Active Measurements: Techniques to measure service levels that
199	   involve generating and observing synthetic test traffic

201	   Passive Measurements: Techniques used to measure service levels based
202	   on observation of production traffic

204	   AN: Autonomic Network; a network containing exclusively autonomic
205	   nodes, requiring no configuration and deriving all required
206	   information through self-knowledge, discovery, or intent.

208	   Measurement Session: A communications association between a Probe and
209	   a Responder used to send and reflect synthetic test traffic for
210	   active measurements

212	   Probe: The source of synthetic test traffic in an active measurement

214	   Responder: The destination for synthetic test traffic in an active
215	   measurement

217	   SLA: Service Level Agreement

219	   SLO: Service Level Objective

221	   P2P: Peer-to-Peer

223	3.  Current Approaches

225	   The current best practice for distributing the available measurement
226	   sessions across the network in feasible deployments of active
227	   measurement solutions consists of relying entirely on the human
228	   administrator's expertise to infer the best locations at which to
229	   activate such sessions.  This is done through several steps.  First,
230	   it is necessary to collect traffic information in order to grasp the
231	   traffic matrix.  Then, the administrator uses this information to
232	   infer which are the best destinations for measurement sessions.
233	   After that, the administrator activates sessions on the chosen subset
234	   of destinations considering the available resources.  This practice,
235	   however, does not scale well because it is still labor intensive and
236	   error-prone for the administrator to determine which sessions should
237	   be activated given the set of critical flows that needs to be
238	   measured.  Even worse, this practice completely fails in networks
239	   whose critical flows are short-lived and dynamic in terms of the
240	   network paths they traverse, as in modern cloud environments.  That
241	   is so because fast reactions are necessary to reconfigure the
242	   sessions, and administrators are not fast enough in computing and
243	   activating the new set of required sessions every time the network
244	   traffic pattern changes.  Finally, the current active measurements
245	   practice usually covers only a fraction of the network flows that
246	   should be observed, which invariably leads to the damaging
247	   consequence of undetected SLA violations.

249	4.  Use Case Description

251	   The use case involves a service level provider who needs to monitor
252	   the network to detect service level violations using active service
253	   level measurements, and wants to be able to do so with minimal human
254	   intervention.  The goal is to conduct the measurements in an
255	   effective manner maximizing the percentage of detected service level
256	   violations.  The service level provider has a bounded resource budget
257	   with regard to measurements that can be performed, specifically,
258	   with regard to the number of measurements that can be conducted
259	   concurrently from any one network device, and possibly with regard
260	   to the total amount of measurement traffic on the network.  However,
261	   while at any one point in time the number of measurements conducted
262	   is limited, it is possible for a device to change which destinations
263	   to measure over time.  This can be exploited to achieve a balance of
264	   eventually covering all possible destinations using a reasonable
265	   amount of "sampling" where measurement coverage of a destination
266	   cannot be continuous.  The solution needs to be dynamic and be able
267	   to cope with network conditions which may change over time.  The
268	   solution should also be embeddable inside network devices that
269	   control the deployment of active measurement mechanisms.
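
   One simple way to realize the "sampling" described above is a
   rotating schedule: with a fixed number of concurrent session slots,
   each round measures the next few destinations, so that every
   destination is eventually covered.  The sketch below is only one
   possible strategy, and the class name RotatingSchedule is an
   assumption made for this example.

```python
from collections import deque
from typing import Iterable, List


class RotatingSchedule:
    """Round-robin choice of destinations under a fixed slot budget."""

    def __init__(self, destinations: Iterable[str], slots: int) -> None:
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._queue = deque(destinations)
        self._slots = slots

    def next_round(self) -> List[str]:
        """Return the destinations to measure in the next round."""
        count = min(self._slots, len(self._queue))
        chosen = [self._queue[i] for i in range(count)]
        # Move the chosen destinations to the back of the queue.
        self._queue.rotate(-count)
        return chosen


if __name__ == "__main__":
    schedule = RotatingSchedule(["a", "b", "c", "d", "e"], slots=2)
    for _ in range(3):
        print(schedule.next_round())
```

   A plain rotation treats all destinations alike.  The prioritization
   discussed in the following paragraph would replace the fixed order
   with one driven by observations.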

271	   The goal is to conduct the measurements in a smart manner that
272	   ensures that the network is broadly covered and the likelihood of
273	   detecting service level violations is maximized.  In order to
274	   maximize that likelihood, it is reasonable to focus measurement
275	   resources on destinations that are more likely to incur a violation,
276	   while spending fewer resources on destinations that are more likely
277	   to be in compliance.  In order to do so, there are various aspects
278	   that can be exploited, including past measurements (destinations
279	   close to a service level threshold requiring more focus than
280	   destinations further from it), complementation with passive
281	   measurements such as flow data (to identify network destinations that
282	   are currently popular and critical), and observations from other
283	   parts of the network.  In addition, measurements can be coordinated
284	   among different network devices to avoid hitting the same destination
285	   at the same time and to be able to share results that may be useful
286	   in future probe placement.
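
   The prioritization described above can be sketched as a score per
   destination followed by a budgeted selection.  The linear weighting,
   the default weight of 0.7, and the function names are hypothetical
   choices made for this example.  This document does not prescribe a
   particular scoring function.

```python
import heapq
from typing import Dict, List


def violation_score(measured_ms: float, slo_threshold_ms: float,
                    traffic_share: float, margin_weight: float = 0.7) -> float:
    """Score in [0, 1]; higher means the destination deserves more probing.

    closeness: how near the last measurement is to the SLO threshold.
    traffic_share: fraction of observed flows going to the destination,
    for example derived from passive flow data.
    """
    if slo_threshold_ms <= 0:
        raise ValueError("slo_threshold_ms must be positive")
    closeness = min(max(measured_ms / slo_threshold_ms, 0.0), 1.0)
    share = min(max(traffic_share, 0.0), 1.0)
    return margin_weight * closeness + (1.0 - margin_weight) * share


def select_destinations(scores: Dict[str, float], budget: int) -> List[str]:
    """Pick the `budget` highest-scoring destinations, best first."""
    if budget <= 0:
        return []
    return heapq.nlargest(budget, scores, key=scores.get)


if __name__ == "__main__":
    scores = {
        "192.0.2.1": violation_score(95.0, 100.0, 0.10),
        "192.0.2.2": violation_score(40.0, 100.0, 0.60),
        "192.0.2.3": violation_score(70.0, 100.0, 0.30),
    }
    print(select_destinations(scores, budget=2))
```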

288	   Clearly, static solutions will have severe limitations.  At the same
289	   time, human administrators cannot be in the loop for continuous
290	   dynamic measurement probe reconfigurations.  Accordingly, an
291	   automated or, ideally, autonomic solution is needed in which network
292	   measurements are automatically orchestrated and dynamically
293	   reconfigured from within the network.

295	5.  A Distributed Autonomic Solution

297	   The use of Autonomic Networking (AN) [RFC7575] can help such
298	   detection through an efficient activation of measurement sessions
299	   [P2PBNM-Nobre-2012].  The problem to be solved by AN in the present
300	   use case is how to steer the process of Measurement Session
301	   activation by a complete solution that sets all necessary parameters
302	   for this activation to operate efficiently, reliably and securely,
303	   with no required human intervention other than setting overall
304	   policy.

306	   We advocate for embedding Peer-to-Peer (P2P) technology in network
307	   devices in order to conduct the Measurement Session activation
308	   decisions using autonomic control loops.  Specifically, we advocate
309	   for network devices to implement an autonomic function to monitor
310	   service levels for violations of service level objectives,
311	   determining which Measurement Sessions to set up at any given point
312	   in time based on current and past observations of the node, and of
313	   other peer nodes.  By performing these functions locally and
314	   autonomically on the device itself, which measurements to conduct can
315	   be modified quickly based on local observations while taking local
316	   resource availability into account.  This allows a solution to be
317	   more robust and react more dynamically to rapidly changing service
318	   levels than a solution that has to rely on central coordination.
319	   However, in order to optimize decisions about which measurements to
320	   conduct, a node will need to communicate with other nodes.  This
321	   allows a node to take into account other nodes' observations in
322	   addition to its own in its decisions.  For example, remote
323	   destinations whose observed service levels are on the verge of
324	   violating stated objectives may require closer monitoring than remote
325	   destinations that are comfortably within a range of tolerance.  It
326	   also allows nodes to coordinate their probing decisions to
327	   collectively achieve the best possible measurement coverage.  This
328	   requires the use of a P2P overlay.

330	   A P2P overlay is important for several reasons:

332	   o  It makes it possible for nodes in the network to autonomically set
333	      up Measurement Sessions, without having to rely on a central
334	      management system or controller to perform configuration
335	      operations associated with configuring measurement probes and
336	      responders.

338	   o  It facilitates the exchange of data between different nodes to
339	      share measurement results so that each node can refine its
340	      measurement strategy based not just on its own observations, but
341	      also on observations from its peers.

343	   o  It allows nodes to coordinate their measurements to obtain the
344	      best possible test coverage and avoid measurements that have a
345	      very low likelihood of detecting service level violations.
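
   The data exchange and coordination roles listed above can be
   sketched as two small helpers: one merges local and peer
   observations, keeping the most recent one per destination, and one
   filters out destinations that peers are already probing.  The
   Observation record and both function names are assumptions made for
   this example.  How observations are transported over the overlay is
   out of scope for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Observation:
    destination: str  # measured destination
    delay_ms: float   # measured service level (delay, for simplicity)
    timestamp: float  # when the measurement was taken, in seconds
    reporter: str     # node that performed the measurement


def merge_observations(local: Iterable[Observation],
                       from_peers: Iterable[Observation]
                       ) -> Dict[str, Observation]:
    """Keep the most recent observation for each destination."""
    latest: Dict[str, Observation] = {}
    for obs in (*local, *from_peers):
        current = latest.get(obs.destination)
        if current is None or obs.timestamp > current.timestamp:
            latest[obs.destination] = obs
    return latest


def not_yet_probed(candidates: Iterable[str],
                   probed_by_peers: Set[str]) -> List[str]:
    """Drop candidates that a peer is already measuring."""
    return [dest for dest in candidates if dest not in probed_by_peers]


if __name__ == "__main__":
    mine = [Observation("192.0.2.10", 80.0, 100.0, "node-a")]
    theirs = [Observation("192.0.2.10", 95.0, 130.0, "node-b"),
              Observation("192.0.2.20", 20.0, 120.0, "node-b")]
    for dest, obs in merge_observations(mine, theirs).items():
        print(dest, obs.delay_ms, obs.reporter)
    print(not_yet_probed(["192.0.2.10", "192.0.2.30"], {"192.0.2.10"}))
```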

347	   The provisioning of the P2P overlay should be transparent for the
348	   network administrator.  An Autonomic Control Plane such as defined in
349	   [I-D.anima-autonomic-control-plane] provides an ideal candidate for
350	   the P2P overlay to run on.

352	   An autonomic solution for the distributed detection of SLA violations
353	   provides several benefits.  First, efficiency: this solution should
354	   optimize the resource consumption and avoid resource starvation on
355	   the network devices.  A device that is "self-aware" of its available
356	   resources will be able to adjust measurement activities rapidly as
357	   needed, without requiring a separate control loop involving resource
358	   monitoring by an external system.  Second, placing the logic about
359	   where to conduct measurements in the node enables rapid control loops
360	   in which devices are able to react instantly to observations and
361	   adjust their measurement strategy.  For example, a device could
362	   decide to adjust the amount of synthetic test traffic being sent
363	   during the measurement itself depending on results observed so far on
364	   this and on other concurrent measurement sessions.  As a result, the
365	   solution could decrease the time necessary to detect SLA violations.
366	   Adaptivity features of an autonomic loop could capture the network
367	   dynamics faster than a human administrator or even a central
368	   controller.  Finally, the solution could help to reduce the workload
369	   of human administrators, or, at least, to avoid their need to perform
370	   operational tasks.
370	   operational tasks.
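
   The in-session adjustment mentioned above could take the form of a
   small local control rule.  The sketch below halves the probing
   interval when the observed value approaches the SLO threshold and
   doubles it otherwise, within fixed bounds.  The halving and doubling
   rule, the 80% warning fraction, and the bounds are assumptions made
   for this example.

```python
def next_probe_interval(current_interval_s: float, observed_ms: float,
                        slo_threshold_ms: float,
                        min_interval_s: float = 0.1,
                        max_interval_s: float = 10.0,
                        warn_fraction: float = 0.8) -> float:
    """Return the interval to wait before sending the next test packet.

    Probe more often when the observation is within `warn_fraction` of
    the threshold, and back off when it is comfortably below it.
    """
    if observed_ms >= warn_fraction * slo_threshold_ms:
        proposed = current_interval_s / 2.0
    else:
        proposed = current_interval_s * 2.0
    # Clamp so that probing neither floods the network nor stops.
    return max(min_interval_s, min(max_interval_s, proposed))


if __name__ == "__main__":
    interval = 1.0
    for observed in (50.0, 85.0, 92.0, 97.0):
        interval = next_probe_interval(interval, observed, 100.0)
        print(f"observed {observed} ms -> next interval {interval} s")
```

   Because the rule only needs local observations, it can run on the
   device without a round trip to an external controller.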

372	   In practice, these factors combine to maximize the likelihood of SLA
373	   violations being detected while operating within a given resource
374	   budget, enabling a continuous measurement strategy that
375	   takes into account past measurement results, observations of other
376	   measures such as link utilization or flow data, sharing of
377	   measurement results between network devices, and coordinating future
378	   measurement activities among nodes.  Combined, this can result in
379	   efficient measurement decisions that achieve a golden balance between
380	   broad network coverage and homing in on service level "hot spots".

382	6.  Intended User Experience

384	   The autonomic solution should not require any human intervention in
385	   the distributed detection of SLA violations.  By virtue of the
386	   solution being autonomic, human users will not have to plan which
387	   measurements to conduct in a network, often a very labor intensive
388	   task today that requires detailed analysis of traffic matrices and
389	   network topologies and does not lend itself to dynamic adjustment.
390	   Likewise, they will not have to configure measurement probes and
391	   responders.

393	   There are some ways in which a human administrator may still interact
394	   with the solution.  For one, the human administrator will of course
395	   be notified and obtain reports about service level violations that
396	   are observed.  Second, a human administrator may set policies
397	   regarding how closely to monitor the network for service level
398	   violations and how many resources to spend.  For example, an
399	   administrator may set a resource budget that is assigned to network
400	   devices for measurement operations.  With that given budget, the
401	   number of SLO violations that are detected will be maximized.
402	   Alternatively, an administrator may set a target for the percentage
403	   of SLO violations that must be detected, i.e., a target for the ratio
404	   between the number of detected SLO violations and the number of
405	   total SLO violations that are actually occurring (some of which might
406	   go undetected).  In that case, the solution will aim to minimize the
407	   resources spent (i.e., the amount of test traffic and Measurement
408	   Sessions) that are required to achieve that target.
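
   The two policy styles described above, and the detection ratio they
   refer to, can be sketched as follows.  The MeasurementPolicy record
   and its field names are assumptions made for this example.  Note
   that the true number of occurring violations is generally not known
   to the solution at run time, so the ratio would typically be
   estimated or evaluated offline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MeasurementPolicy:
    """Either a resource budget or a detection target, not both."""
    max_sessions_per_device: Optional[int] = None
    target_detection_ratio: Optional[float] = None

    def __post_init__(self) -> None:
        has_budget = self.max_sessions_per_device is not None
        has_target = self.target_detection_ratio is not None
        if has_budget == has_target:
            raise ValueError("set exactly one of budget or target")
        if has_budget and self.max_sessions_per_device <= 0:
            raise ValueError("budget must be positive")
        if has_target and not 0.0 < self.target_detection_ratio <= 1.0:
            raise ValueError("target must be in (0, 1]")


def detection_ratio(detected: int, occurring: int) -> float:
    """Detected SLO violations divided by all occurring violations."""
    if not 0 <= detected <= occurring:
        raise ValueError("need 0 <= detected <= occurring")
    # With no violations occurring, nothing was missed.
    return 1.0 if occurring == 0 else detected / occurring


if __name__ == "__main__":
    budget_policy = MeasurementPolicy(max_sessions_per_device=20)
    target_policy = MeasurementPolicy(target_detection_ratio=0.9)
    ratio = detection_ratio(detected=45, occurring=60)
    print(budget_policy, target_policy, ratio)
    print("target met:", ratio >= target_policy.target_detection_ratio)
```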

410	7.  Implementation Considerations

412	   The active measurement model assumes that a typical infrastructure
413	   will have multiple network segments and Autonomous Systems (ASs), and
414	   a reasonably large number of routers.  It also considers that
415	   multiple SLOs can be in place at a given time.  Since
416	   interoperability in a heterogeneous network is a goal, features found
417	   on different active measurement mechanisms (e.g., OWAMP, TWAMP, and
418	   IPSLA) and device programmability interfaces (such as Juniper's Junos
419	   API or Cisco's Embedded Event Manager) could be used for the
420	   implementation.  The autonomic solution should include and/or
421	   reference specific algorithms, protocols, metrics and technologies
422	   for the implementation of distributed detection of SLA violations as
423	   a whole.

425	7.1.  Device Based Self-Knowledge and Decisions

427	   Each device has self-knowledge about the local SLA monitoring.  This
428	   could be in the form of historical measurement data and SLOs.
429	   Besides that, the devices would have algorithms that could decide
430	   which probes should be activated at a given time.  The choice of
431	   which algorithm is better for a specific situation would also be
432	   autonomic.

434	7.2.  Interaction with other devices

436	   Network devices should share information about service level
437	   measurement results.  This information can speed up the detection of
438	   SLA violations and increase the number of detected SLA violations.
439	   For example, if one device detects that a remote destination is in
440	   danger of violating an SLO, other devices may conduct additional
441	   measurements to the same destination or other destinations in its
442	   proximity.  For any given network device, the exchange of data may be
443	   more important with some devices (for example, devices in the same
444	   network neighborhood, or devices that are "correlated" by some other
445	   means) than with others.  The definition of network devices that
446	   exchange measurement data, i.e., management peers, creates a new
447	   topology.  Different approaches could be used to define this topology
448	   (e.g., correlated peers [P2PBNM-Nobre-2012]).  To bootstrap peer
449	   selection, each device should use its known endpoint neighbors
450	   (e.g., FIB and RIB tables) as the initial seed to get possible peers.
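
   The bootstrap step can be sketched as collecting candidate peers
   from next hops already known to the device.  How the RIB and FIB are
   read is platform specific and is not shown.  The function name, its
   parameters, and the cap on the number of seeds are assumptions made
   for this example.

```python
from typing import Iterable, List


def seed_peers(own_address: str, rib_next_hops: Iterable[str],
               fib_next_hops: Iterable[str], max_peers: int) -> List[str]:
    """Build the initial list of candidate management peers.

    Next hops are taken in order, skipping duplicates and the device's
    own address, and the list is capped at `max_peers` entries.
    """
    seeds: List[str] = []
    for address in (*rib_next_hops, *fib_next_hops):
        if address != own_address and address not in seeds:
            seeds.append(address)
    return seeds[:max(max_peers, 0)]


if __name__ == "__main__":
    print(seed_peers(
        own_address="192.0.2.1",
        rib_next_hops=["192.0.2.2", "192.0.2.3"],
        fib_next_hops=["192.0.2.3", "192.0.2.1", "192.0.2.4"],
        max_peers=3,
    ))
```

   The seed list is only a starting point.  Peers would later be
   replaced or reordered as correlations between devices are learned.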

452	8.  Comparison with current solutions

454	   There is no standardized solution for distributed autonomic detection
455	   of SLA violations.  Current solutions are restricted to ad hoc
456	   scripts running in a per-node fashion to automate some of the
457	   administrator's actions.  There are some proposals for passive probe
458	   activation (e.g., DECON and CSAMP), but without the focus on
459	   autonomic features.  It is also worth mentioning a proposal from
460	   Barford et al. to detect and localize links which cause anomalies
461	   along a network path.

463	9.  Related IETF Work

465	   The following paragraphs discuss related IETF work and are provided
466	   for reference.  This section is not exhaustive, rather it provides an
467	   overview of the various initiatives and how they relate to autonomic
468	   distributed detection of SLA violations.

470	   1.  [LMAP]: The Large-Scale Measurement of Broadband Performance
471	       Working Group aims at the standards for performance management.
472	       Since their mechanisms also consist in deploying measurement
473	       probes, the autonomic solution could be relevant for LMAP,
474	       especially considering SLA violation screening.  Besides that, a
475	       solution to decrease the workload of human administrators in
476	       service providers is probably highly desirable.

478	   2.  [IPFIX]: IP Flow Information EXport (IPFIX) aims at the process
479	       of standardization of IP flows (i.e., netflows).  IPFIX uses
480	       measurement probes (i.e., metering exporters) to gather flow
481	       data.  In this context, the autonomic solution for the activation
482	       of active measurement probes could possibly be extended to
483	       also address passive measurement probes.  Besides that, flow
484	       information could be used in the decision making of probe
485	       activation.

487	   3.  [ALTO]: The Application Layer Traffic Optimization Working Group
488	       aims to provide topological information at a higher abstraction
489	       layer, which can be based upon network policy, and with
490	       application-relevant service functions located in it.  Their work
491	       could be leveraged for the definition of the topology regarding
492	       the network devices which exchange measurement data.

494	10.  Acknowledgements

496	   We wish to acknowledge the helpful contributions, comments, and
497	   suggestions that were received from Mohamed Boucadair, Bruno Klauser,
498	   Eric Voit, and Hanlin Fang.

500	11.  IANA Considerations

502	   This memo includes no request to IANA.

504	12.  Security Considerations

506	   Security of the solution hinges on the security of the network
507	   underlay, i.e. the Autonomic Control Plane.  If the Autonomic Control
508	   Plane were to be compromised, an attacker could undermine the
509	   effectiveness of measurement coordination by reporting fraudulent
510	   measurement results to peers.  This would cause measurement probes to
511	   be deployed in an ineffective manner that would increase
512	   the likelihood that violations of service level objectives would go
513	   undetected.

515	   Likewise, security of the solution hinges on the security of the
516	   deployment mechanism for autonomic functions, in this case, the
517	   autonomic function that conducts the service level measurements.  If
518	   an attacker were able to hijack an autonomic function, it could try
519	   to exhaust or exceed the resources that should be spent on autonomic
520	   measurements in order to deplete network resources, as well as
521	   report misleading results.

523	13.  Informative References

525	   [draft-anima-boot]
526	              Pritikin, M., Richardson, M., Behringer, M., Bjarnason,
527	              S., and K. Watsen, "draft-ietf-anima-bootstrapping-
528	              keyinfra", draft-ietf-anima-bootstrapping-keyinfra-06
529	              (work in progress), May 2017.

531	   [draft-ietf-ippm-6man-pdm-option]
532	              Elkins, N., Hamilton, R., and M. Ackermann, "draft-ietf-
533	              ippm-6man-pdm-option", draft-ietf-ippm-6man-pdm-option-11
534	              (work in progress), June 2017.

536	   [I-D.anima-autonomic-control-plane]
537	              Behringer, M., Eckert, T., and S. Bjarnason, "An Autonomic
538	              Control Plane", draft-ietf-anima-autonomic-control-
539	              plane-06 (work in progress), March 2017.

541	   [P2PBNM-Nobre-2012]
542	              Nobre, J., Granville, L., Clemm, A., and A. Gonzalez
543	              Prieto, "Decentralized Detection of SLA Violations Using
544	              P2P Technology", 8th International Conference on Network
545	              and Service Management (CNSM), 2012,
546	              <http://ieeexplore.ieee.org/xpls/
547	              abs_all.jsp?arnumber=6379997>.

549	   [RFC3954]  Claise, B., Ed., "Cisco Systems NetFlow Services Export
550	              Version 9", RFC 3954, DOI 10.17487/RFC3954, October 2004,
551	              <http://www.rfc-editor.org/info/rfc3954>.

553	   [RFC4148]  Stephan, E., "IP Performance Metrics (IPPM) Metrics
554	              Registry", BCP 108, RFC 4148, DOI 10.17487/RFC4148, August
555	              2005, <http://www.rfc-editor.org/info/rfc4148>.

557	   [RFC4656]  Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
558	              Zekauskas, "A One-way Active Measurement Protocol
559	              (OWAMP)", RFC 4656, DOI 10.17487/RFC4656, September 2006,
560	              <http://www.rfc-editor.org/info/rfc4656>.

562	   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
563	              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
564	              RFC 5357, DOI 10.17487/RFC5357, October 2008,
565	              <http://www.rfc-editor.org/info/rfc5357>.

567	   [RFC6812]  Chiba, M., Clemm, A., Medley, S., Salowey, J., Thombare,
568	              S., and E. Yedavalli, "Cisco Service-Level Assurance
569	              Protocol", RFC 6812, DOI 10.17487/RFC6812, January 2013,
570	              <http://www.rfc-editor.org/info/rfc6812>.

572	   [RFC7297]  Boucadair, M., Jacquenet, C., and N. Wang, "IP
573	              Connectivity Provisioning Profile (CPP)", RFC 7297,
574	              DOI 10.17487/RFC7297, July 2014,
575	              <http://www.rfc-editor.org/info/rfc7297>.

577	   [RFC7575]  Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A.,
578	              Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic
579	              Networking: Definitions and Design Goals", RFC 7575,
580	              DOI 10.17487/RFC7575, June 2015,
581	              <http://www.rfc-editor.org/info/rfc7575>.

583	Authors' Addresses

585	   Jeferson Campos Nobre
586	   University of Vale do Rio dos Sinos
587	   Porto Alegre
588	   Brazil

590	   Email: jcnobre@unisinos.br

592	   Lisandro Zambenedetti Granville
593	   Federal University of Rio Grande do Sul
594	   Porto Alegre
595	   Brazil

597	   Email: granville@inf.ufrgs.br

599	   Alexander Clemm
600	   Huawei
601	   Santa Clara, California
602	   USA

604	   Email: ludwig@clemm.org

606	   Alberto Gonzalez Prieto
607	   Santa Clara, California
608	   USA

610	   Email: albertgo.gonzalezprieto@yahoo.com