Network Working Group                                              Q. Wu
Internet-Draft                                              J. Strassner
Intended status: Informational                                    Huawei
Expires: September 10, 2016                                    A. Farrel
                                                      Old Dog Consulting
                                                                L. Zhang
                                                                  Huawei
                                                           March 9, 2016

              Network Telemetry and Big Data Analysis
              draft-wu-t2trg-network-telemetry-00

Abstract

   This document focuses on network measurement and analysis in the
   network environment.  It first defines network telemetry, describes
   an exemplary network telemetry architecture, and then explores the
   characteristics of network telemetry data.  It ends with detailing a
   set of issues with retrieving and processing network telemetry data.
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 10, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  The definition of Network Telemetry
   3.  Network Telemetry architecture
   4.  Measurement data Characteristics
   5.  Issues
     5.1.  Data Fetching Efficiency
     5.2.  Existing Network Level Metrics Inefficiency issue
     5.3.  Measurement data format consistency issue
     5.4.  Data Correlation issue
     5.5.  Data Synchronization Issues
   6.  Informative References
   Appendix A.  Network Telemetry data source Classification
   Appendix B.  Existing Network Data Collection Methods
     B.1.  Network Log Collection
       B.1.1.  Text based data collection
       B.1.2.  SNMP Trap
       B.1.3.  Syslog based Collection
     B.2.  Network Traffic Collection
     B.3.  Network Performance Collection
     B.4.  Network Faults Collection
     B.5.  Network Topology data Collection
     B.6.  Other Data Collection
   Authors' Addresses

1.  Introduction

   Today, billions of devices connect to the Internet and to VPNs,
   forming a rich ecosystem of connectivity.  Our daily lives have also
   been greatly changed by the large number of IoT and mobile
   applications built on top of it (e.g., smart tags on everyday
   objects, wearable health monitoring sensors, smartphones,
   intelligent cars, and smart home appliances).  However, the growing
   number of connected devices and the proliferation of web and
   multimedia services also have a great impact on the network.
   Examples include:

   o  The massive scale and highly dynamic nature of IoT and mobile
      applications (e.g., interaction with other things at any time and
      in any location)

   o  The increasingly vast amounts of data gathered from the network
      environment at varying speeds and with different degrees of
      accuracy, and the new communication patterns created

   o  The disparate types of pre- and post-processing necessary to
      understand the meaning and context (e.g., semantics) of measured
      data

   As a result, the network may be subject to more incidents and
   unregulated changes.  Without better network visibility and a good
   view of the available network resources and topology, it is not easy
   to:

   o  schedule network resources to adapt to near real-time service
      demands

   o  measure network performance and assess network quality as a whole

   o  provide quick network diagnosis, prove network innocence when
      application quality gets worse, or identify which parts of the
      network may cause problems if a network glitch or service
      interruption happens.

   In this document, we first define network telemetry in the context
   of the network environment, followed by an exemplary architecture
   for collecting and processing telemetry data.  We then explore the
   characteristics of network telemetry data, and end with describing a
   set of issues with retrieving and processing network telemetry data.

2.  The definition of Network Telemetry

   Network Telemetry describes how information from various data
   sources can be collected using a set of automated communication
   processes and transmitted to one or more receiving entities for
   analysis tasks.  Analysis tasks may include event correlation,
   anomaly detection, performance monitoring, metric calculation, trend
   analysis, and other related processes.
3.  Network Telemetry architecture

   A Network Telemetry architecture describes how different types of
   Network Telemetry data are transmitted from different network
   sources and received by different collection entities.  In an ideal
   network telemetry architecture, the ability to collect data should
   be independent of any specific application and vendor limitations.
   This means that protocol and data format translation are required,
   so that a normalized form of data can be used to simplify the
   various analysis and processing tasks required.

   The Network Telemetry architecture is made up of the following three
   key functional components:

   o  Data Source:  The Data Source can be any type of network device
      that generates data.  Examples include the management system that
      accesses IGP/BGP routing information, network inventory,
      topology, and resource data, as well as other types of
      information that provide data to be measured and/or contextual
      information to better understand the network telemetry data.

   o  Data Collector:  The Data Collector may be part of a control
      and/or management system (e.g., NMS/OSS, SDN Controller, or OAM
      system) and/or a dedicated set of entities.  It gathers data from
      various Data Sources and performs processing tasks to feed raw
      and/or processed data to the Data Analyzer.

   o  Data Analyzer:  The Data Analyzer processes data from various
      Data Collectors to provide actionable insight.  This ranges from
      generating simple statistical metrics to inferring problems to
      recommending solutions to said problems.

   Figure 1 shows an exemplary architecture for network telemetry and
   analysis.
                        +----------------------+
                        | Policy-based Manager |
                        +----------+-----------+
                                  / \
                                   |
                                   |
        +--------------------------+-------------------------+
        |                          |                         |
        |                          |                         |
       \ /                        \ /                       \ /
+----------------+        +--------+-----------+        +----+-----+
| Data Analyzer, |/      \| Data Fusion,       |/      \| Decision |
| Normalizer,    +--------+ Analytics,         +--------+ Logic    |
| Filter, etc.   |\      /| and other Apps     |\      /| and Apps |
+--------+-------+        +---+------------+---+        +----------+
        / \                  / \          / \
         |                    |            |
         |                    |            |
        \ /                  \ /          \ /
+--------+-------------+ +----+----+  +----+----+
| Data Abstraction and | | Other   |  | Other   |
| Modeling Software    | | OT Data |  | IT Data |
+------+--------+------+ +---------+  +---------+
      / \      / \
       |        |
       |        |
      \ /      \ /
  +----+--------+-----+
  |  Data Collectors  |
  +----+---------+----+
      / \       / \
       |         |
       |         |
       |        \ /
       |    +----+------------+        +-----------+
       |    | Edge Software   |/      \| Temporary |
       |    | (analysis &     +--------+ Data      |
       |    | transformation) |\      /| Storage   |
       |    +------------+----+        +-----------+
       |                / \
       |                 |
       |                 |
      \ /               \ /
+------+-------+ +-------+------+
| Data Sources | | Data Sources |
+--------------+ +--------------+

        Figure 1: Network Telemetry and Analysis Architecture

   o  Data Abstraction and Modeling Software:  This component uses an
      overarching information model to define relevant terms, objects,
      and values that all components in the Network Telemetry
      Architecture can use.

   o  Edge Software refers to performing compute, storage, and/or
      networking functions on nodes at the edges of a network.  This
      enables processing of data to occur at or near the source of the
      data.  Figure 1 shows that some information from some Data
      Sources may be sent directly to Data Collectors, while other
      data may be sent first to Edge Software for further processing
      before it is consumed by Data Collectors.

   o  Policy-based Manager.
      This component is responsible for managing different aspects of
      the Network Telemetry Architecture in a distributed and
      extensible manner through the use of a set of policies that
      govern the behavior of the system.  Examples include defining
      rules that determine what data to collect when, where, and how,
      as well as defining rules that, given a specific context,
      determine how to process collected data.

   This reference architecture assumes that Data Collectors can choose
   different measurement data formats to gather measurement data, and
   different protocols to transmit said data; the Data Abstraction and
   Modeling Software normalizes collected data into a common form.
   Both the Data Collector and the Data Analyzer may support data
   filtering, correlation, and other types of data processing
   mechanisms.  In the above architecture, bi-directional communication
   is shown for generality.  This may be implemented in a number of
   different ways, such as using a request-response mechanism, a
   publish-subscribe mechanism, or even as a set of uni-directional
   (e.g., push and pull) requests.

4.  Measurement data Characteristics

   Measurement data is generated from different data sources and has
   varying characteristics, including (but not limited to) the
   following:

   o  Measurement data can be any of network performance data, network
      logging data, network warning and defect data, network statistics
      and state data, and network resource operation data (e.g.,
      operations on RIBs and FIBs [RFC4984]).

   o  Most measurement data are monitored state data rather than
      configuration data.  However, on occasion, network configuration
      data may also be included (e.g., to establish context for the
      measurement data).

   o  In many cases, telemetry data requires real-time delivery with
      high-throughput, multi-channel data collection mechanisms.
   o  In most cases, the required frequency of access to monitored
      state data is extremely high.

5.  Issues

5.1.  Data Fetching Efficiency

   Today, the existing data fetching methods (see Appendix B) prove
   insufficient due to the following factors:

   o  The existing network management protocols are neither dedicated
      to nor sufficient for data collection.  For example, NETCONF
      focuses on network configuration and can only retrieve
      operational data on request.

   o  SNMP relies on periodic fetching.  Periodic fetching of data is
      not an adequate solution for many types of applications (e.g.,
      applications that require frequent updates to the stored data).
      In addition, it adds significant load on participating networks,
      devices, and applications.

   o  We increasingly rely on RPC-style interactions [RFC5531] to
      fetch data on demand by an application.  However, most
      applications are interested in updates of the data or changes to
      the data.

   o  Once a data fetching protocol is selected, human-readable
      formats such as XML and JSON can encode structured data in a way
      that can be parsed without knowing the schema; however, such
      encodings lack efficiency on the wire.

5.2.  Existing Network Level Metrics Inefficiency issue

   Quality of Service (QoS) and Quality of Experience (QoE) assessment
   [RFC7266] of multimedia services has been well studied in ITU-T SG
   12.  Media quality is commonly expressed in terms of MOS (Mean
   Opinion Score) [RFC3611][G107].  MOS is typically rated on a scale
   from 1 to 5, in which 5 represents excellent and 1 represents
   unacceptable.  When multimedia application quality becomes bad, it
   is hard to know whether this is a network problem or an
   application-specific problem (e.g., codec type, coding bit rate,
   packetization scheme, loss recovery technique, or the interaction
   between transport problems and application-layer protocols).
   To determine whether the network is at fault, or how serious a
   network event or interruption is, a network health index, network
   key performance indicators (KPIs), or key quality indicators (KQIs)
   become important.

   However, QoS/QoE assessment of generic network services, whether or
   not dependent on the underlying network technology (e.g., MPLS,
   IP), is not well studied or defined by any body or organization.
   The QoS/QoE of generic network services requires a set of
   appropriate network performance, reliability, or other metric
   definitions.  This may take the form of key quality and/or
   performance indicators, ranging from high-level metrics (e.g.,
   dropped calls) to low-level metrics (e.g., packet loss, delay, and
   jitter).  IP service performance parameters are defined in ITU-T
   Y.1540 [Y1540]; however, these existing network performance metrics
   are proving insufficient due to several factors:

   o  These transport-specific metrics are defined for specific
      technologies.  For example, the network performance parameters
      in Y.1540 are only designed for IP networks and do not apply to
      connection-oriented networks, such as an MPLS-TP network.

   o  Not all the metrics are end-to-end performance metrics at the
      network level.  For example, the TE performance metrics defined
      in IS-IS TE [RFC5305] are only defined for per-link usage.

   o  These transport-specific metrics are all single-objective
      metrics; there are no transport-specific metrics defined as
      multi-objective metrics.  For example, IP packet Transfer Delay
      (IPTD) is a single-objective metric and cannot be used to
      measure similar and important performance behaviors such as IP
      packet Delay Variation [Y1541].

   o  Different services have different performance requirements.  It
      is hard to measure network QoS to satisfy all possible services
      using a single metric.
   o  Transport-specific metrics are not applied to the whole network,
      but to a specific flow passing through the network corresponding
      to matched QoS classes.

   o  If there are multiple paths from source to destination in the IP
      network, then transport-specific metrics change with the path
      selected, and it may also be hard to know which path a packet
      will traverse.

5.3.  Measurement data format consistency issue

   The data format is typically vendor- and device-specific.  This
   also means that different commands, having different syntax and
   semantic characteristics and using different protocols, may have to
   be issued to retrieve the same type of data from different devices.

   The Data Analyzer may need to ingest data in a specific format that
   is not supported by the Data Collectors that service it.  For
   example, the ALTO data format used between a Data Source and a Data
   Collector generates an abstracted network topology and provides it
   to network-aware applications (i.e., a Data Analyzer) over a web-
   service-based API [I-D.wu-alto-te-metrics].  In this case, prefix
   data in the network topology information needs to be generated into
   ALTO Network Maps, and TE (topology) data needs to be generated
   into ALTO Cost Maps.  To provide better data format mapping, the
   ALTO Network Map and Cost Map need to be modeled in the same way as
   the prefix data and TE data in the network topology information.
   However, these data use different data formats and do not have a
   common model structure to represent them in a consistent way.

   This is why the architecture shown in Figure 1 has a "Data
   Abstraction and Modeling Software" component.  This component
   normalizes all data received into a common format for analysis and
   processing by the Data Analyzer.
   If this component is not present, then the Data Analyzer would, at
   a minimum, have to deal with m vendor devices times n versions of
   software for each device.  Furthermore, different protocols have
   different capabilities and may or may not be able to transmit and
   receive different types of data.  The Data Abstraction and Modeling
   Software component can provide information that defines the
   structure of the data that should be received; this can be useful
   for checking for incomplete collection data as well as missing
   collection data.

5.4.  Data Correlation issue

   To provide consistent configuration, reporting, and representation
   of OAM information, the LIME YANG model
   [I-D.ietf-lime-yang-oam-model] is proposed to correlate defects,
   faults, and network failures between the different layers,
   regardless of network technologies.  This helps improve the
   efficiency of fault detection and localization and provides better
   OAM visibility.

   Today we see large amounts of data collected from different data
   sources.  These data can be network log data, network event data,
   network performance data, network fault data, network statistics
   state, or network operation state.  However, these data are only
   meaningful if they are correlated in time and space.  In
   particular, useful trend analysis and anomaly detection depend on
   proper correlation of the data collected from the different Data
   Sources.  In addition, correlating different types of data from
   different Data Sources in time or space can provide better network
   visibility, but such correlation is still a challenging issue.

5.5.  Data Synchronization Issues

   When retrieving data from Data Sources or Data Collectors,
   synchronizing the same type of data between a Data Source and a
   Data Collector, or between a Data Collector and a Data Analyzer, is
   complicated:
   o  Arranging for source and destination to be synchronized is hard,
      especially when multiple Data Sources feed one Data Collector,
      or multiple Data Collectors feed one Data Analyzer.

   o  Aggregating data from different Data Sources and synchronizing
      the data to the Data Analyzer is also not an easy task.

   The reference architecture of Figure 1 defines a "Policy-based
   Manager" to manage how, when, where, and by which devices the set
   of data is collected.  This component provides mechanisms that help
   ensure that needed information is collected by the appropriate
   components of the Network Telemetry Architecture.  It also
   facilitates the synchronization of the different components that
   make up the Network Telemetry Architecture, since these are likely
   distributed throughout one or more networks.

   It also provides a mechanism for the Data Analyzer, or other
   applications (e.g., the "Data Fusion, Analytics, and other Apps"
   and the "Decision Logic and Apps" components in Figure 1), to
   provide information to the Policy-based Manager in the form of
   feedback (e.g., see [I-D.strassner-anima-control-loops]).

6.  Informative References

   [G107]     ITU-T, "The E-model: a computational model for use in
              transmission planning", ITU-T Recommendation G.107, June
              2015.

   [I-D.ietf-idr-ls-distribution]
              Gredler, H., Medved, J., Previdi, S., Farrel, A., and S.
              Ray, "North-Bound Distribution of Link-State and TE
              Information using BGP", draft-ietf-idr-ls-distribution-13
              (work in progress), October 2015.

   [I-D.ietf-idr-te-pm-bgp]
              Wu, Q., Previdi, S., Gredler, H., Ray, S., and J.
              Tantsura, "BGP attribute for North-Bound Distribution of
              Traffic Engineering (TE) performance Metrics",
              draft-ietf-idr-te-pm-bgp-02 (work in progress), January
              2015.

   [I-D.ietf-lime-yang-oam-model]
              Senevirathne, T., Finn, N., Kumar, D., Salam, S., Wu, Q.,
              and Z.
Wang, "Generic YANG Data Model for Connection 463 Oriented Operations, Administration, and Maintenance(OAM) 464 protocols", draft-ietf-lime-yang-oam-model-02 (work in 465 progress), February 2016. 467 [I-D.strassner-anima-control-loops] 468 Strassner, J., "The Use of Control Loops in Autonomic 469 Networking", draft-strassner-anima-control-loops-00 (work 470 in progress), October 2015. 472 [I-D.wu-alto-te-metrics] 473 Wu, W., Yang, Y., Lee, Y., Dhody, D., and S. Randriamasy, 474 "ALTO Traffic Engineering Cost Metrics", draft-wu-alto-te- 475 metrics-06 (work in progress), April 2015. 477 [RFC3611] Friedman, T., Ed., Caceres, R., Ed., and A. Clark, Ed., 478 "RTP Control Protocol Extended Reports (RTCP XR)", 479 RFC 3611, DOI 10.17487/RFC3611, November 2003, 480 . 482 [RFC4984] Meyer, D., Ed., Zhang, L., Ed., and K. Fall, Ed., "Report 483 from the IAB Workshop on Routing and Addressing", 484 RFC 4984, DOI 10.17487/RFC4984, September 2007, 485 . 487 [RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic 488 Engineering", RFC 5305, DOI 10.17487/RFC5305, October 489 2008, . 491 [RFC5531] Thurlow, R., "RPC: Remote Procedure Call Protocol 492 Specification Version 2", RFC 5531, DOI 10.17487/RFC5531, 493 May 2009, . 495 [RFC5693] Seedorf, J. and E. Burger, "Application-Layer Traffic 496 Optimization (ALTO) Problem Statement", RFC 5693, 497 DOI 10.17487/RFC5693, October 2009, 498 . 500 [RFC7266] Clark, A., Wu, Q., Schott, R., and G. Zorn, "RTP Control 501 Protocol (RTCP) Extended Report (XR) Blocks for Mean 502 Opinion Score (MOS) Metric Reporting", RFC 7266, 503 DOI 10.17487/RFC7266, June 2014, 504 . 506 [Y1540] ITU-T, "Internet protocol data communication service - IP 507 packet transfer and availability performance parameters", 508 ITU-T Recommendation Y.1540, March 2011. 510 [Y1541] ITU-T, "Network performance objectives for IP-based 511 services", ITU-T Recommendation Y.1541, December 2011. 513 Appendix A. 
Appendix A.  Network Telemetry data source Classification

   +-----------------------------+------------------------------+
   | Data Source Category        | Information                  |
   +-----------------------------+------------------------------+
   | Network Data                | Usage records                |
   |                             | Performance Monitoring Data  |
   |                             | Fault Monitoring Data        |
   |                             | Real Time Traffic Data       |
   |                             | Real Time Statistics Data    |
   |                             | Network Configuration Data   |
   |                             | Provision Data               |
   +-----------------------------+------------------------------+
   | Subscriber Data             | Profile Data                 |
   |                             | Network Registry             |
   |                             | Operation Data               |
   |                             | Billing Data                 |
   +-----------------------------+------------------------------+
   | Application Data derived    | Traffic Analysis             |
   | from interfaces, channels,  | Web, Search, SMS, Email      |
   | software, etc.              | Social Media Data            |
   |                             | Mobile apps                  |
   +-----------------------------+------------------------------+

Appendix B.  Existing Network Data Collection Methods

B.1.  Network Log Collection

   There are three typical log data collection methods:

   o  Text based Collection

   o  SNMP Trap

   o  Syslog based Collection

B.1.1.  Text based data collection

   Text-based log data is designed for low-speed networks, and the
   amount of data cannot be too large.  It can only be parsed by
   network personnel with the experience to define such logs.  The log
   data can be transferred either by email or via FTP.  The
   differences between using email and using FTP are:

   o  The volume of data transferred by FTP can be much larger than
      via email.

   o  FTP-based collection is active data collection, while email-
      based collection is passive data collection.

B.1.2.  SNMP Trap

   An SNMP Trap is a notification mechanism that enables an agent to
   notify the management system of significant events by way of an
   unsolicited SNMP message.
   When there are a large number of devices and each device has a
   large number of objects, an SNMP Trap is a more efficient way to
   get the data than polling information from every object on every
   device.

B.1.3.  Syslog based Collection

   The syslog protocol is used to convey event notification messages
   and allows the use of any number of transport protocols for the
   transmission of syslog messages.  It is widely used in network
   devices (e.g., switches, routers).

B.2.  Network Traffic Collection

   Network traffic collection is a process of exporting network
   traffic flow information from routers, probes, and other devices.
   It is not concerned with the operational state of the network
   devices, but with the traffic flow characteristics on the links
   between any two adjacent network devices.  Taking IPFIX as an
   example, it is widely adopted in routers and switches to provide IP
   traffic flow information to the network management system.

B.3.  Network Performance Collection

   Network performance collection is a process of exporting network
   performance information from routers, probes, and other devices.
   The network performance information can be applied to the quality,
   performance, and reliability of data delivery services and
   applications running over the network.  It also applies to the
   traffic contract agreed between the user and the network service
   provider.  The measurement mechanisms defined in the IPPM WG, as
   well as OAM technologies and OAM tools, can be used to perform
   performance measurement.

B.4.  Network Faults Collection

   Network fault collection is a process of exporting network faults,
   failures, warnings, and defects from routers, probes, and other
   devices.  It usually adopts OAM technologies, OAM tools, and OAM
   models (e.g., SNMP MIBs or NETCONF YANG models) to localize faults
   and pinpoint fault locations.
   However, the OAM YANG model is mainly focused on configuring OAM
   functionality on the network element; how to use the OAM YANG model
   to collect more data (e.g., warnings, failures, and defects) and
   how to use these data need to be further standardized.

B.5.  Network Topology data Collection

   For network topology data collection, routing protocols are an
   important collection method, since every router needs to propagate
   its information throughout the whole network.  In addition, we can
   use an NMS/OSS to get network topology data if it has access to a
   network topology database or to the routing protocols.

   Network topology data comprises node information and link
   information.  It can be collected in two typical ways.  If the
   network topology data is within one IGP area or one AS, we can use
   the IS-IS or OSPF protocol to gather it and write it into a RIB or
   topology datastore, and then use the I2RS protocol to read the
   network topology data.  If the network topology data goes beyond
   one IGP area and spans several domains, we can use BGP-LS
   [I-D.ietf-idr-ls-distribution] [I-D.ietf-idr-te-pm-bgp] to collect
   the network topology data in the different domains and aggregate it
   in a central network topology database.

B.6.  Other Data Collection

   To collect and process large volumes of data in real time or near
   real time to detect subtle events and aid failure diagnosis, we can
   choose other efficient data fetching tools, e.g., Facebook's Scribe
   or Chukwa (built on top of the Hadoop file system), to parse out
   structured data from some of the logs and load it into a datastore.
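   As a rough illustration of the kind of parsing such tools perform,
   the following Python sketch turns syslog-style lines (Appendix
   B.1.3) into structured records that could then be loaded into a
   datastore.  The line format and the field names are illustrative
   assumptions only and are not taken from any of the tools mentioned
   above; only the facility/severity encoding follows the syslog
   convention of RFC 3164.

```python
import re

# Hypothetical syslog-style line format: "<PRI>TIMESTAMP HOST TAG: MESSAGE"
LOG_PATTERN = re.compile(
    r"<(?P<pri>\d+)>"                     # syslog priority value
    r"(?P<timestamp>\w{3} +\d+ [\d:]+) "  # e.g. "Mar  9 12:00:01"
    r"(?P<host>\S+) "                     # originating device
    r"(?P<tag>[^:]+): "                   # process/daemon tag
    r"(?P<message>.*)"                    # free-form message text
)

def parse_log_line(line):
    """Parse one syslog-style line into a structured record, or None."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None  # unparseable lines would be counted or queued elsewhere
    record = match.groupdict()
    pri = int(record.pop("pri"))
    # RFC 3164 encodes facility and severity into the priority value.
    record["facility"] = pri // 8
    record["severity"] = pri % 8
    return record

rec = parse_log_line("<34>Mar  9 12:00:01 router1 bgpd: peer 192.0.2.1 down")
# rec["host"] == "router1", rec["severity"] == 2, rec["facility"] == 4
```

   A real collector would additionally batch such records, attach a
   reception timestamp, and write them to the datastore of choice.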
Authors' Addresses

   Qin Wu
   Huawei
   101 Software Avenue, Yuhua District
   Nanjing, Jiangsu  210012
   China

   Email: bill.wu@huawei.com

   John Strassner
   Huawei
   2230 Central Expressway
   San Jose, CA
   USA

   Email: john.sc.strassner@huawei.com

   Adrian Farrel
   Old Dog Consulting

   Email: adrian@olddog.co.uk

   Liang Zhang
   Huawei

   Email: zhangliang1@huawei.com