idnits 2.17.1 

draft-kunze-coin-industrial-use-cases-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (July 4, 2019) is 1757 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-05) exists of
     draft-mcbride-edge-data-discovery-overview-01


     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	COIN                                                            I. Kunze
3	Internet-Draft                                                  J. Rueth
4	Intended status: Informational                                 K. Wehrle
5	Expires: January 5, 2020                          RWTH Aachen University
6	                                                            July 4, 2019

8	             Industrial Use Cases for In-Network Computing
9	                draft-kunze-coin-industrial-use-cases-00

11	Abstract

13	   Cyber-physical systems and the Industrial Internet of Things are
14	   characterized by diverse sets of requirements which can hardly be
15	   satisfied using standard networking technology.  One example is
16	   latency-critical computations, which become increasingly complex and
17	   are consequently outsourced to more powerful cloud platforms for
18	   feasibility reasons.  The intrinsic physical propagation delay to
19	   these remote sites can, however, already be too high for the given
20	   requirements.  The challenge is to develop techniques that reconcile
21	   these requirements.  Utilizing available computational capabilities
22	   within the network can be a solution to this challenge, which makes
23	   in-network computing concepts a promising starting point.
24	   This document discusses select industrial use cases to demonstrate
25	   how in-network computing concepts can be applied to the industrial
26	   domain and to point out essential requirements of industrial
27	   applications.

29	Status of This Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at https://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on January 5, 2020.

46	Copyright Notice

48	   Copyright (c) 2019 IETF Trust and the persons identified as the
49	   document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (https://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	Table of Contents

63	   1.  Introduction
64	   2.  In-Network Control / Time-sensitive applications
65	     2.1.  Characterization and Requirements
66	       2.1.1.  Approaches
67	   3.  Large Volume Applications/ Traffic Filtering
68	     3.1.  Characterization and Requirements
69	     3.2.  Approaches
70	       3.2.1.  Traffic Filters
71	       3.2.2.  In-Network (Pre-)Processing
72	   4.  Industrial Safety (Dead Man's Switch)
73	     4.1.  Characterization and Requirements
74	       4.1.1.  Approaches
75	   5.  Security Considerations
76	   6.  IANA Considerations
77	   7.  Conclusion
78	   8.  Informative References
79	   Authors' Addresses

81	1.  Introduction

83	   The Internet is based on a best-effort network that provides limited
84	   guarantees regarding the timely and successful transmission of
85	   packets.  This design choice is suitable for general Internet-based
86	   applications, but specialized industrial applications demand a number
87	   of strict performance guarantees, e.g., regarding real-time
88	   capabilities, which cannot be provided over regular best-effort
89	   networks.

91	   Enhancements to standard Ethernet such as Time-Sensitive
92	   Networking [TSN] try to meet these requirements on the link layer by
93	   statically reserving shares of the bandwidth.  These concepts are
94	   well-suited for traditional industrial settings where the
95	   communication paths are encapsulated at the respective factory sites
96	   and where the communication patterns are well understood.  Following
97	   the vision of the Industrial Internet of Things (IIoT), more and more
98	   parts of the industrial production domain are interconnected.  This
99	   increases the complexity of the industrial networks, making them more
100	   dynamic and creating more diverse sets of requirements.  Furthermore,
101	   process control is envisioned to be exercised from remote clouds for
102	   feasibility reasons, which is why solutions on the link layer alone
103	   are not sufficient in these scenarios.

105	   Common components of the IIoT can be divided into three categories as
106	   illustrated in Figure 1.  Following
107	   [I-D.draft-mcbride-edge-data-discovery-overview-01], EDGE DEVICES,
108	   such as sensors and actuators, constitute the boundary between the
109	   physical and the digital world.  They communicate the current state
110	   of the physical world to the digital world by transmitting sensor
111	   data, or let the digital world interact with or manipulate the
112	   physical world by executing actions after receiving (simple) control
113	   information.  The processing of the sensor data as well as the
114	   creation of the control information is done on COMPUTING DEVICES.
115	   They range from low-powered controllers in close proximity to the
116	   EDGE DEVICES to more powerful edge or remote clouds at greater
117	   distances.  The connection between the EDGE and COMPUTING DEVICES is
118	   established by NETWORKING DEVICES.  In the industrial domain, they
119	   range from standard devices, e.g., typical Ethernet switches, which
120	   can interconnect all Ethernet-capable hosts, to proprietary equipment
121	   with proprietary protocols that only supports hosts of specific
122	   vendors.

124	   The challenge is to develop concepts that can integrate off-premise
125	   entities (such as distant cloud platforms) as well as proprietary
126	   hosts into the communication and still satisfy the performance
127	   requirements of modern industrial networks.  The in-network computing
128	   paradigm presents a promising starting point because (pre-)processing
129	   data within the network can speed up the communication, e.g., by
130	   reducing the amount of transmitted data and thus congestion.
131	   Flexibly distributing the computation tasks across the network helps
132	   to manage dynamic changes.  Specifying general requirements for the
133	   different application scenarios is difficult due to the mentioned
134	   diversity.  In an effort to showcase potential requirements for the
135	   domain of industrial production, we characterize and analyze three
136	   distinct scenarios to illustrate how in-network computations can be
137	   helpful.

139	    --------
140	    |Sensor| ------------|              ~~~~~~~~~~~~      ------------
141	    --------       -------------        { Internet } --- |Remote Cloud|
142	       .           |Access Point|---    ~~~~~~~~~~~~      ------------
143	    --------       -------------   |          |
144	    |Sensor| ----|        |        |          |
145	    --------     |        |       --------    |
146	       .         |        |       |Switch| ----------------------
147	       .         |        |       --------                       |
148	       .         |        |                   ------------       |
149	    ----------   |        |----------------- | Controller |      |
150	    |Actuator| ------------                   ------------       |
151	    ----------   |    --------                            ------------
152	       .         |----|Switch|---------------------------| Edge Cloud |
153	    ----------        --------                            ------------
154	    |Actuator|  ---------|
155	    ----------

157	   |-----------|       |------------------|     |-------------------|
158	    EDGE DEVICES        NETWORKING DEVICES        COMPUTING DEVICES
159	     Figure 1: Industrial networks show a high level of heterogeneity.

161	2.  In-Network Control / Time-sensitive applications

163	   The control of physical processes and components of a production line
164	   is a cornerstone of the industrial domain.  It is essential for the
165	   growing automation of production and ideally allows for a consistent
166	   quality level.  Traditionally, the control has been exercised by
167	   control software running on programmable logic controllers (PLCs)
168	   located directly next to the controlled process or component.  This
169	   approach is best suited for settings with a simple model that is
170	   focused on a single or a few controlled components.

172	   Modern production lines and shop floors are characterized by an
173	   increasing number of devices and sensors, a growing level of
174	   dependency between the different components, and more complex control
175	   models.  Centralized control is desirable to manage the large amount
176	   of available information, which often has to be pre-processed or
177	   aggregated with other information before it can be used.  PLCs are
178	   not designed for this array of tasks, so the computations could in
179	   principle be moved to more powerful devices.  These devices, however,
180	   are no longer in close proximity to the controlled objects and thus
181	   induce additional latency.

183	   It is worthwhile to investigate whether the outsourcing of control
184	   functionality to distant computation platforms is viable, because
185	   these platforms have a high level of flexibility and scalability.  In
186	   the following, we describe the requirements and characteristics of
187	   the control setting in more detail.

189	2.1.  Characterization and Requirements

191	   A control process consists of two main components as is illustrated
192	   in Figure 2: a system under control and a controller.  In feedback
193	   control, the current state of the system is monitored, e.g., using
194	   sensors, and the controller influences the system based on the
195	   difference between the current and the reference state to keep it
196	   close to this reference state.

198	   Apart from the control model, the quality of the control primarily
199	   depends on the timely reception of the sensor feedback, because the
200	   controller can only react if it is notified about changes in the
201	   system state.  Depending on the dynamics of the controlled system,
202	   the control can be subject to tight latency constraints, often in the
203	   single-digit millisecond range.  While low latencies are important,
204	   there is an even greater need for stable and deterministic levels of
205	   latency, because controllers can generally cope with different levels
206	   of latency if they are designed for them, but they are significantly
207	   challenged by dynamically changing or unstable latencies.  This is
208	   especially true if off-premise cloud platforms are included due to
209	   the unpredictable latency of the Internet.

211	   The main requirements for the industrial control scenario are low and
212	   stable latencies to ensure that processes can work continuously and
213	   that no machines are damaged.

215	    reference
216	      state      ------------        --------    Output
217	   ---------->  | Controller | ---> | System | ---------->
218	              ^  ------------        --------       |
219	              |                                     |
220	              |   observed state                    |
221	              |                    ---------        |
222	               -------------------| Sensors | <-----
223	                                   ---------
224	            Figure 2: Simple feedback control model
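
   The loop in Figure 2 can be sketched in a few lines of Python.  This
   is a minimal illustration under stated assumptions, not a method
   taken from this document: the proportional control law, the "gain"
   parameter, and the toy system model (the state changes by exactly
   the control signal in each step) are hypothetical choices made for
   brevity.

```python
def p_controller(reference: float, observed: float, gain: float) -> float:
    """Return a control signal proportional to the control error."""
    return gain * (reference - observed)


def simulate(reference: float, initial_state: float,
             gain: float, steps: int) -> list:
    """Run the feedback loop for a toy system (hypothetical plant model:
    the state changes by exactly the control signal in every step)."""
    state = initial_state
    history = [state]
    for _ in range(steps):
        observed = state                              # sensor feedback
        control = p_controller(reference, observed, gain)
        state = state + control                       # system reacts
        history.append(state)
    return history
```

   Calling simulate(10.0, 0.0, 0.5, 3) moves the state towards the
   reference in every step.  The sketch assumes that the sensor
   feedback arrives without delay; in a real deployment, the latency
   between "observed" and "control" is exactly what the requirements
   above constrain.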

226	2.1.1.  Approaches

228	   Control models in general can become complex, but there is a variety
229	   of control algorithms that are composed of simple computations such
230	   as matrix multiplication.  As these are supported by programmable
231	   network devices, it is possible to compose simplified approximations
232	   of the more complex algorithms and deploy them in the network.  While
233	   the simplified versions yield less accurate control, they allow for a
234	   quicker response and might be sufficient to operate a basic tight
235	   control loop while the overall control can still be exercised from
236	   the cloud.  The problem, however, is that networking devices
237	   typically only allow for integer computation, while most control
238	   algorithms need floating-point precision.  Early approaches like
239	   [RUETH] have already shown the general applicability of such ideas,
240	   but there are still many open research questions, including but not
241	   limited to the following:

243	   o  How can one derive the simplified versions of the overall
244	      controller?

246	      *  How complex can they become?

248	      *  How can one take the limited computational precision of
249	         networking devices into account when deriving them?

251	   o  How does one distribute the simplified versions in the network?

253	   o  How does the overall controller interact with the simplified
254	      versions?
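
   One common way to work around integer-only arithmetic is fixed-point
   representation.  The following Python sketch shows a matrix-vector
   product, as it might appear in a simple control law, using only
   integer multiplication, addition, and bit shifts.  It is an
   illustration under assumptions and not the approach of [RUETH]: the
   number of fractional bits (SCALE_BITS) and all function names are
   hypothetical.

```python
SCALE_BITS = 8             # hypothetical number of fractional bits
SCALE = 1 << SCALE_BITS    # 256: one unit equals 1/256


def to_fixed(x: float) -> int:
    """Convert a real number to its nearest fixed-point integer."""
    return int(round(x * SCALE))


def to_float(x: int) -> float:
    """Convert a fixed-point integer back to a real number."""
    return x / SCALE


def fixed_matvec(matrix, vector):
    """Multiply a fixed-point matrix by a fixed-point vector using only
    integer multiplication, addition, and a right shift."""
    result = []
    for row in matrix:
        acc = 0
        for a, b in zip(row, vector):
            acc += a * b                   # carries 2 * SCALE_BITS fractional bits
        result.append(acc >> SCALE_BITS)   # rescale to SCALE_BITS (rounds down)
    return result
```

   With SCALE_BITS = 8, a gain such as 0.1 is stored as 26/256, so the
   representation error per value is at most 1/512.  Choosing the number
   of fractional bits is a trade-off between this error and the risk of
   overflowing the register width of the target device.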

256	3.  Large Volume Applications/ Traffic Filtering

258	   In the IIoT, processes and machines can be monitored more effectively,
259	   resulting in more available information.  This data can be used to
260	   deploy machine learning (ML) techniques and consequently help to find
261	   previously unknown correlations between different components of the
262	   production, which in turn helps to improve the overall production
263	   system.  Newly gained knowledge can be shared between different sites
264	   of the same company or even between different companies.

266	   Traditional company infrastructure is neither equipped for the
267	   management and storage of such large amounts of data nor for the
268	   computationally expensive training of ML approaches.  Similar to the
269	   considerations in Section 2, off-premise cloud platforms offer cost-
270	   effective solutions with a high degree of flexibility and
271	   scalability.  While the unpredictable latency of the Internet is only
272	   a subordinate problem for this use case, moving all data to off-
273	   premise locations primarily poses infrastructural and security
274	   challenges, which are presented in more detail in the following.

276	3.1.  Characterization and Requirements

278	   Processes in the industrial domain are monitored by distributed
279	   sensors which range from simple binary sensors (e.g., light barriers)
280	   to complex sensors measuring the system with varying degrees of
281	   resolution.  Sensors can further serve different purposes, as some
282	   might be used for the time-critical process control while others are
283	   only used as redundant fallback platforms.  Overall, there is a high
284	   level of heterogeneity which makes managing the sensor output a
285	   challenging task.

287	   Depending on the deployed sensors and the complexity of the observed
288	   system, the resulting overall data volume can easily be in the range
289	   of several Gbit/s [GLEBKE].  Using off-premise clouds for managing
290	   the data requires uploading or streaming the growing volume of sensor
291	   data using the companies' Internet access, which is typically limited
292	   to a few hundred Mbit/s.  While large networking companies can
293	   simply upgrade their infrastructure, most industrial companies rely
294	   on traditional ISPs for their Internet access.  Higher access speeds
295	   are hence tied to higher costs and, above all, subject to the
296	   offerings of the ISPs and consequently not always available.  A major
297	   challenge is thus to devise a methodology that can handle such
298	   amounts of data over limited access links.
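
   The size of this gap can be made concrete with a short calculation.
   The sketch below computes by which factor the sensor data has to be
   reduced before it fits a given access link.  The function names are
   hypothetical, and the rates in the usage note are illustrative
   values within the ranges mentioned above, not measurements.

```python
def required_reduction_factor(sensor_rate_mbit_s: float,
                              uplink_rate_mbit_s: float) -> float:
    """Factor by which the data volume must shrink to fit the uplink.
    A result of 1.0 means that no reduction is needed."""
    if sensor_rate_mbit_s < 0 or uplink_rate_mbit_s <= 0:
        raise ValueError("rates must be non-negative and the uplink positive")
    return max(1.0, sensor_rate_mbit_s / uplink_rate_mbit_s)


def fraction_to_remove(sensor_rate_mbit_s: float,
                       uplink_rate_mbit_s: float) -> float:
    """Share of the sensor data that has to be filtered or aggregated
    away before leaving the premises."""
    factor = required_reduction_factor(sensor_rate_mbit_s, uplink_rate_mbit_s)
    return 1.0 - 1.0 / factor
```

   For an assumed sensor volume of 4 Gbit/s (4000 Mbit/s) and an assumed
   access link of 200 Mbit/s, the data has to be reduced by a factor of
   20, i.e., 95% of it cannot leave the premises unprocessed.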

300	   Another aspect is that business data leaving the premises and control
301	   of the company comes with security concerns, as it might contain
302	   sensitive information or valuable business secrets.  Typical security
303	   measures such as encrypting the data make in-network computing
304	   techniques hardly applicable, as they typically work on unencrypted
305	   data.  Adding security to in-network computing approaches, either by
306	   adding functionality for handling encrypted data or by devising
307	   general security measures, is thus a very promising field for
308	   research.

310	3.2.  Approaches

312	   While there is no work on the question of security yet, there are at
313	   least two concepts which might be suitable for reducing the amount of
314	   transmitted data in a meaningful way:

316	   1.  filtering out redundant or unnecessary data

318	   2.  aggregating data by applying preprocessing steps within the
319	       network

321	   Both concepts require detailed knowledge about the monitoring
322	   infrastructure at the factories and the purpose of the transmitted
323	   data.

325	3.2.1.  Traffic Filters

327	   Sensors are often set up redundantly, i.e., part of the collected
328	   data might also be redundant.  Moreover, they are often hard to
329	   configure or not configurable at all, which is why their resolution or
330	   sampling frequency is often higher than required.  Consequently, it
331	   is likely that more data is transmitted than is actually needed or
332	   desired.  A trivial idea for reducing the amount of data is thus to
333	   filter out redundant or undesired data before it leaves the premise
334	   using simple traffic filters that are deployed in the on-premise
335	   network.  In this context, the following research questions can be of
336	   interest:

338	   o  How can traffic filters be designed?

340	   o  How can traffic filters be coordinated and deployed?

342	   o  How can traffic filters be changed dynamically?
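
   A simple traffic filter of this kind can be sketched as follows.  The
   sketch drops readings from sensors that are marked as redundant and
   thins out oversampled streams to a minimum interval between forwarded
   readings.  It is an illustration only: the Reading record, the
   TrafficFilter class, and its parameters are hypothetical, and a real
   filter would run on a programmable networking device rather than in
   Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    timestamp_ms: int
    value: float


class TrafficFilter:
    """Decides per reading whether it may leave the premises."""

    def __init__(self, redundant_ids, min_interval_ms: int):
        self.redundant_ids = set(redundant_ids)   # sensors to suppress
        self.min_interval_ms = min_interval_ms    # target sampling interval
        self._last_forwarded = {}                 # sensor_id -> timestamp

    def forward(self, reading: Reading) -> bool:
        # Rule 1: drop everything from sensors marked as redundant.
        if reading.sensor_id in self.redundant_ids:
            return False
        # Rule 2: drop readings that arrive faster than required.
        last = self._last_forwarded.get(reading.sensor_id)
        if last is not None and reading.timestamp_ms - last < self.min_interval_ms:
            return False
        self._last_forwarded[reading.sensor_id] = reading.timestamp_ms
        return True
```

   Both rules rely on configuration (the set of redundant sensors and
   the required interval) that has to come from knowledge about the
   monitoring infrastructure.  Changing this configuration at runtime
   corresponds to the question of dynamic filter updates above.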

344	3.2.2.  In-Network (Pre-)Processing

346	   There are manifold computations that can be performed on the sensor
347	   data in the cloud.  Some of them are very complex or need the
348	   complete sensor data during the computation, but there are also
349	   simpler operations which can be done on subsets of the overall
350	   dataset or earlier on the communication path as soon as all data is
351	   available.  One example is finding the maximum of all sensor values,
352	   which can either be done iteratively at each intermediate hop or at
353	   the first hop where all data is available.
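
   The iterative variant of the maximum example can be sketched as
   follows.  Each hop only forwards a single running maximum instead of
   all values it has seen.  The function names and the representation
   of a path as a list of per-hop value lists are assumptions made for
   this illustration.

```python
from typing import Iterable, List, Optional


def merge_max(upstream_max: Optional[float],
              local_values: Iterable[float]) -> Optional[float]:
    """Combine the maximum received from the previous hop with the
    sensor values that enter the network at this hop."""
    candidates = list(local_values)
    if upstream_max is not None:
        candidates.append(upstream_max)
    return max(candidates) if candidates else None


def max_along_path(values_per_hop: List[List[float]]) -> Optional[float]:
    """Apply merge_max hop by hop; only one value travels between hops."""
    running = None
    for local_values in values_per_hop:
        running = merge_max(running, local_values)
    return running
```

   The result equals the maximum over all values, but the data sent
   between hops shrinks to one value per hop.  This works because the
   maximum can be computed from partial results; operations without
   this property need the complete data at a single location.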

355	   Using expert knowledge about the exact computation steps and the
356	   concrete transmission path of the sensor data, simple computation
357	   steps can be deployed in the on-premise network to reduce the overall
358	   data volume and potentially speed up the processing time in the
359	   cloud.

361	   Related work has already shown that in-network aggregation can help
362	   to improve the performance of distributed machine learning
363	   applications [SAPIO].  Investigating the applicability of stream data
364	   processing techniques to programmable networking devices is also
365	   interesting, because sensor data is usually streamed.  In this
366	   context, the following research questions can be of interest:

368	   o  Which (pre-)processing steps can be deployed in the network?

370	      *  How complex can they become?

372	   o  How can applications incorporate the (pre-)processing steps?

374	   o  How can the programming of the techniques be streamlined?

376	4.  Industrial Safety (Dead Man's Switch)

378	   Despite increasing automation in production processes, human workers
379	   are still often necessary.  This gives safety measures a high
380	   priority to ensure that no human life is endangered.  In traditional
381	   factories, the regions of contact between humans and machines are
382	   well-defined and interactions are simple.  Simple safety measures
383	   like emergency switches at the working positions are enough to
384	   provide a decent level of safety.

386	   Modern factories are characterized by increasingly dynamic and
387	   complex environments with new interaction scenarios between humans
388	   and robots.  Robots can either directly assist humans or perform
389	   tasks autonomously.  The overlap between the human working area and
390	   that of the robots grows, and it is harder for human workers to fully
391	   observe the complete environment.

393	   Additional safety measures are important to prevent accidents and
394	   support humans in observing the environment.  The increased
395	   availability of sensor data and the detailed monitoring of the
396	   factories can help to build additional safety measures if the
397	   corresponding data is collected early at the correct position.

399	4.1.  Characterization and Requirements

401	   Industrial safety measures are typically hardware solutions, because
402	   they have to pass rigorous testing before they are certified and
403	   deployment-ready.  Common measures include safety switches, which
404	   need to be triggered manually, and light barriers.  Additionally, the
405	   working area can be explicitly divided into 'contact' and 'safe'
406	   areas, indicating when workers have to watch out for interactions
407	   with machinery.

409	   These measures are static solutions, potentially relying on special
410	   hardware, and are challenged by the increased dynamics of modern
411	   factories.  Software solutions offer a higher flexibility as they can
412	   dynamically respect new information gathered by the sensor systems.
413	   Depending on the corresponding occupational safety laws, the software
414	   has to satisfy very strict requirements which cannot be satisfied by
415	   regular best-effort networks.

417	4.1.1.  Approaches

419	   Software-based solutions can take advantage of the large amount of
420	   available sensor data.  Different safety indicators within the
421	   production hall can be combined within the network so that
422	   programmable networking devices can give early responses if a
423	   potential safety breach is detected.  A rather simple possibility
424	   could be to track the positions of human workers and robots.
425	   Whenever a robot gets too close to a human in a non-working area or
426	   if a human enters a certain safety zone, robots are stopped to
427	   prevent injuries.  More advanced concepts could also include image
428	   data or combine arbitrary sensor data.
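
   The position-tracking idea can be sketched as a simple decision
   function.  The sketch stops the robots if a human is inside a safety
   zone or closer to a robot than a minimum distance.  All names, the
   two-dimensional coordinates, and the rectangular zone are
   hypothetical simplifications; the sketch makes no claim about
   meeting any certification requirement.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    x: float
    y: float


@dataclass(frozen=True)
class Zone:
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, p: Position) -> bool:
        return self.x_min <= p.x <= self.x_max and self.y_min <= p.y <= self.y_max


def must_stop(robots, humans, safety_zone: Zone, min_distance_m: float) -> bool:
    """Return True if the robots have to be stopped."""
    for human in humans:
        # Rule 1: a human has entered the safety zone.
        if safety_zone.contains(human):
            return True
        # Rule 2: a robot is closer to a human than allowed.
        for robot in robots:
            if math.hypot(robot.x - human.x, robot.y - human.y) < min_distance_m:
                return True
    return False
```

   The decision only needs comparisons and a distance computation per
   robot-human pair, which is why it is a candidate for early
   evaluation close to the sensors.  How the positions reach the
   deciding device reliably and on time is the open question addressed
   below.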

430	   In this context, the following research questions can be of interest:

432	   o  How can the software give guaranteed safety over best-effort
433	      networks?

435	   o  Which sensor information can be combined and how?

437	5.  Security Considerations

439	   N/A

441	6.  IANA Considerations

443	   N/A

445	7.  Conclusion

447	   In-network computing concepts have the potential to improve
448	   industrial applications.  There are at least three scenarios for
449	   which in-network processing can be beneficial, each having a unique
450	   set of requirements.

452	   In the control scenario, tight latency constraints in the single-
453	   digit millisecond range have to be satisfied despite the use of cloud
454	   platforms and the corresponding unstable latency of the Internet.

456	   In a second scenario, large amounts of data have to be transmitted to
457	   cloud platforms for further evaluation.  One important task here is
458	   to reduce the amount of data that needs to be transmitted as the
459	   available Internet access speed is most likely insufficient.  Apart
460	   from that, security measures have to be implemented as business data
461	   is transmitted to the Internet.

463	   Regarding safety, software-based measures often lack the required
464	   guarantees and do not withstand the testing for certification.  In-
465	   network processing with its potential for early responses can be a
466	   solution by combining different sensor outputs early and acting
467	   quickly.

469	8.  Informative References

471	   [GLEBKE]   Glebke, R., "A Case for Integrated Data Processing in
472	              Large-Scale Cyber-Physical Systems", DOI: 10125/60162, in
473	              HICSS, January 2019.

475	   [I-D.draft-mcbride-edge-data-discovery-overview-01]
476	              McBride, M., Kutscher, D., Schooler, E., and C. Bernardos,
477	              "Overview of Edge Data Discovery", draft-mcbride-edge-
478	              data-discovery-overview-01 (work in progress), March 2019.

480	   [RUETH]    Rueth, J., "Towards In-Network Industrial Feedback
481	              Control", DOI: 10.1145/3229591.3229592, in ACM SIGCOMM
482	              NetCompute, August 2018.

484	   [SAPIO]    Sapio, A., "Scaling Distributed Machine Learning with In-
485	              Network Aggregation", 2019,
486	              <https://arxiv.org/abs/1903.06701>.

488	   [TSN]      "Time-Sensitive Networking (TSN) Task Group", 2019,
489	              <https://1.ieee802.org/tsn/>.

491	Authors' Addresses

493	   Ike Kunze
494	   RWTH Aachen University
495	   Ahornstr. 55
496	   Aachen  D-50274
497	   Germany

499	   Phone: +49-241-80-21422
500	   Email: kunze@comsys.rwth-aachen.de

502	   Jan Rueth
503	   RWTH Aachen University
504	   Ahornstr. 55
505	   Aachen  D-50274
506	   Germany

508	   Phone: +49-241-80-21417
509	   Email: rueth@comsys.rwth-aachen.de

510	   Klaus Wehrle
511	   RWTH Aachen University
512	   Ahornstr. 55
513	   Aachen  D-50274
514	   Germany

516	   Phone: +49-241-80-21401
517	   Email: wehrle@comsys.rwth-aachen.de