COIN                                                            I. Kunze
Internet-Draft                                                  J. Rueth
Intended status: Informational                                 K. Wehrle
Expires: January 5, 2020                          RWTH Aachen University
                                                            July 4, 2019


             Industrial Use Cases for In-Network Computing
                draft-kunze-coin-industrial-use-cases-00

Abstract

   Cyber-physical systems and the Industrial Internet of Things are
   characterized by diverse sets of requirements which can hardly be
   satisfied using standard networking technology. One example is
   latency-critical computations, which become increasingly complex and
   are consequently outsourced to more powerful cloud platforms for
   feasibility reasons.
   The intrinsic physical propagation delay to these remote sites can,
   however, already be too high for given requirements. The challenge
   is to develop techniques that reconcile these requirements.
   Utilizing available computational capabilities within the network
   can be a solution to this challenge, which makes in-network computing
   concepts a promising starting point. This document discusses
   selected industrial use cases to demonstrate how in-network computing
   concepts can be applied to the industrial domain and to point out
   essential requirements of industrial applications.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 5, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  In-Network Control / Time-Sensitive Applications
     2.1.  Characterization and Requirements
       2.1.1.  Approaches
   3.  Large Volume Applications / Traffic Filtering
     3.1.  Characterization and Requirements
     3.2.  Approaches
       3.2.1.  Traffic Filters
       3.2.2.  In-Network (Pre-)Processing
   4.  Industrial Safety (Dead Man's Switch)
     4.1.  Characterization and Requirements
       4.1.1.  Approaches
   5.  Security Considerations
   6.  IANA Considerations
   7.  Conclusion
   8.  Informative References
   Authors' Addresses

1.  Introduction

   The Internet is based on a best-effort network that provides limited
   guarantees regarding the timely and successful transmission of
   packets. This design choice is suitable for general Internet-based
   applications, but specialized industrial applications demand a number
   of strict performance guarantees, e.g., regarding real-time
   capabilities, which cannot be provided over regular best-effort
   networks.

   Enhancements to standard Ethernet such as Time-Sensitive Networking
   [TSN] try to meet these requirements on the link layer by statically
   reserving shares of the bandwidth. These concepts are well-suited
   for traditional industrial settings where the communication paths are
   confined to the respective factory sites and where the communication
   patterns are well understood. Following the vision of the Industrial
   Internet of Things (IIoT), more and more parts of the industrial
   production domain are interconnected. This increases the complexity
   of the industrial networks, making them more dynamic and creating
   more diverse sets of requirements. Furthermore, process control is
   envisioned to be exercised from remote clouds for feasibility
   reasons, which is why solutions on the link layer alone are not
   sufficient in these scenarios.

   Common components of the IIoT can be divided into three categories as
   illustrated in Figure 1. Following
   [I-D.draft-mcbride-edge-data-discovery-overview-01], EDGE DEVICES,
   such as sensors and actuators, constitute the boundary between the
   physical and the digital world. They communicate the current state
   of the physical world to the digital world by transmitting sensor
   data, or let the digital world interact with or manipulate the
   physical world by executing actions after receiving (simple) control
   information. The processing of the sensor data as well as the
   creation of the control information is done on COMPUTING DEVICES.
   They range from low-powered controllers in close proximity to the
   EDGE DEVICES to more powerful edge or remote clouds at larger
   distances. The connection between the EDGE and COMPUTING DEVICES is
   established by NETWORKING DEVICES. In the industrial domain, they
   range from standard devices, e.g.,
   typical Ethernet switches, which can interconnect all Ethernet-
   capable hosts, to proprietary equipment with proprietary protocols
   which only supports hosts of specific vendors.

   The challenge is to develop concepts which can include off-premise
   entities (such as distant cloud platforms) as well as proprietary
   hosts into the communication and still satisfy the performance
   requirements of modern industrial networks. The in-network computing
   paradigm presents a promising starting point because (pre-)processing
   data within the network can speed up the communication, e.g., by
   reducing the amount of transmitted data and thus congestion.
   Flexibly distributing the computation tasks across the network helps
   to manage dynamic changes. Specifying general requirements for the
   different application scenarios is difficult due to the mentioned
   diversity. In an effort to showcase potential requirements for the
   domain of industrial production, we characterize and analyze three
   distinct scenarios to illustrate how in-network computations can be
   helpful.

   --------
   |Sensor|  ------------|               ~~~~~~~~~~~~      ------------
   --------        -------------         { Internet } --- |Remote Cloud|
       .           |Access Point|---     ~~~~~~~~~~~~      ------------
   --------        -------------    |          |
   |Sensor| ----|        |          |          |
   --------     |        |       --------      |
       .        |        |       |Switch| ----------------------
       .        |        |       --------                      |
       .        |        |                   ------------      |
   ----------   |        |----------------- | Controller |     |
   |Actuator| ------------                   ------------      |
   ----------   |    --------                            ------------
       .        |----|Switch|---------------------------| Edge Cloud |
   ----------        --------                            ------------
   |Actuator| ---------|
   ----------

   |-----------|     |------------------|        |-------------------|
   EDGE DEVICES       NETWORKING DEVICES           COMPUTING DEVICES

    Figure 1: Industrial networks show a high level of heterogeneity.

2.  In-Network Control / Time-Sensitive Applications

   The control of physical processes and components of a production line
   is a cornerstone of the industrial domain. It is essential for the
   growing automation of production and ideally allows for a consistent
   quality level. Traditionally, the control has been exercised by
   control software running on programmable logic controllers (PLCs)
   located directly next to the controlled process or component. This
   approach is best-suited for settings with a simple model that is
   focused on a single or a few controlled components.

   Modern production lines and shop floors are characterized by an
   increasing number of devices and sensors, a growing level of
   dependency between the different components, and more complex control
   models. Centralized control is desirable to manage the large amount
   of available information, which often has to be pre-processed or
   aggregated with other information before it can be used. PLCs are
   not designed for this array of tasks, so the computations could in
   principle be moved to more powerful devices. These devices, however,
   are no longer in close proximity to the controlled objects and induce
   additional latency.

   It is worthwhile to investigate whether the outsourcing of control
   functionality to distant computation platforms is viable, because
   these platforms have a high level of flexibility and scalability. In
   the following, we describe the requirements and characteristics of
   the control setting in more detail.

2.1.  Characterization and Requirements

   A control process consists of two main components, as illustrated in
   Figure 2: a system under control and a controller.
   In feedback control, the current state of the system is monitored,
   e.g., using sensors, and the controller influences the system based
   on the difference between the current and the reference state to keep
   it close to this reference state.

   Apart from the control model, the quality of the control primarily
   depends on the timely reception of the sensor feedback, because the
   controller can only react if it is notified about changes in the
   system state. Depending on the dynamics of the controlled system,
   the control can be subject to tight latency constraints, often in the
   single-digit millisecond range. While low latencies are important,
   there is an even greater need for stable and deterministic levels of
   latency: controllers can generally cope with different levels of
   latency if they are designed for them, but they are significantly
   challenged by dynamically changing or unstable latencies. This is
   especially true if off-premise cloud platforms are included, due to
   the unpredictable latency of the Internet.

   The main requirements for the industrial control scenario are low and
   stable latencies to ensure that processes can work continuously and
   that no machines are damaged.

    reference
      state      ------------        --------    Output
   ----------> | Controller | ---> | System | ---------->
              ^ ------------        --------       |
              |                                    |
              | observed state                     |
              |                    ---------       |
               -------------------| Sensors | <-----
                                   ---------

                Figure 2: Simple feedback control model

2.1.1.  Approaches

   Control models in general can become complex, but there is a variety
   of control algorithms that are composed of simple computations such
   as matrix multiplication. As these are supported by programmable
   network devices, it is possible to compose simplified approximations
   of the more complex algorithms and deploy them in the network.
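
   As a rough illustration of what such a simplified controller might
   look like, the following Python sketch evaluates a state-feedback law
   u = -K x with integer operations only. The fixed-point format, the
   gain matrix K, and all function names are assumptions made for this
   example; they are not taken from any particular device or prior work.

```python
# Hypothetical illustration: integer-only approximation of a
# state-feedback control law u = -K x, in the spirit of what a
# programmable network device without floating point could compute.
# The format and all values below are assumptions for this example.

FRACTION_BITS = 8              # assumed fixed-point format: value * 2**8
SCALE = 1 << FRACTION_BITS


def to_fixed(value: float) -> int:
    """Quantize a real number to the assumed fixed-point format."""
    return round(value * SCALE)


def control_output(gain_fixed, state_fixed):
    """Compute u = -K x using only integer multiply, add, and shift."""
    outputs = []
    for row in gain_fixed:
        acc = 0
        for k, x in zip(row, state_fixed):
            acc += k * x       # each product carries 2 * FRACTION_BITS
        # Shift back to FRACTION_BITS fractional bits, then negate.
        outputs.append(-(acc >> FRACTION_BITS))
    return outputs


# Offline step (e.g., in the cloud): quantize the gain matrix once.
K = [[1.5, 0.25]]
K_FIXED = [[to_fixed(k) for k in row] for row in K]

# In-network step: sensor readings are assumed to arrive as
# fixed-point integers in the same format.
x_fixed = [to_fixed(2.0), to_fixed(-4.0)]
u_fixed = control_output(K_FIXED, x_fixed)
```

   On an actual device, the same multiply-accumulate-shift pattern would
   have to be expressed in the device's own programming model; Python is
   used here only for readability. The choice of FRACTION_BITS trades
   value range against resolution and would need to be matched to the
   controlled system.
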
   While the simplified versions yield less accurate control, they allow
   for a quicker response and might be sufficient to operate a basic
   tight control loop while the overall control can still be exercised
   from the cloud. The problem, however, is that networking devices
   typically only allow for integer precision computation, while
   floating point precision is needed by most control algorithms. Early
   approaches like [RUETH] have already shown the general applicability
   of such ideas, but many open research questions remain, including but
   not limited to the following:

   o  How can one derive the simplified versions of the overall
      controller?

      *  How complex can they become?

      *  How can one take the limited computational precision of
         networking devices into account when deriving them?

   o  How does one distribute the simplified versions in the network?

   o  How does the overall controller interact with the simplified
      versions?

3.  Large Volume Applications / Traffic Filtering

   In the IIoT, processes and machines can be monitored more
   effectively, resulting in more available information. This data can
   be used to deploy machine learning (ML) techniques and consequently
   help to find previously unknown correlations between different
   components of the production, which in turn helps to improve the
   overall production system. Newly gained knowledge can be shared
   between different sites of the same company or even between different
   companies.

   Traditional company infrastructure is neither equipped for the
   management and storage of such large amounts of data nor for the
   computationally expensive training of ML approaches. Similar to the
   considerations in Section 2, off-premise cloud platforms offer cost-
   effective solutions with a high degree of flexibility and
   scalability.
   While the unpredictable latency of the Internet is only a subordinate
   problem for this use case, moving all data to off-premise locations
   primarily poses infrastructural and security challenges, which are
   presented in more detail in the following.

3.1.  Characterization and Requirements

   Processes in the industrial domain are monitored by distributed
   sensors which range from simple binary sensors (e.g., light barriers)
   to complex sensors measuring the system with varying degrees of
   resolution. Sensors can further serve different purposes, as some
   might be used for time-critical process control while others are only
   used as redundant fallback platforms. Overall, there is a high level
   of heterogeneity, which makes managing the sensor output a
   challenging task.

   Depending on the deployed sensors and the complexity of the observed
   system, the resulting overall data volume can easily be in the range
   of several Gbit/s [GLEBKE]. Using off-premise clouds for managing
   the data requires uploading or streaming the growing volume of sensor
   data using the company's Internet access, which is typically limited
   to a few hundred Mbit/s. While large networking companies can simply
   upgrade their infrastructure, most industrial companies rely on
   traditional ISPs for their Internet access. Higher access speeds are
   hence tied to higher costs and, above all, subject to the supply of
   the ISPs and consequently not always available. A major challenge is
   thus to devise methods which are able to handle such amounts of data
   over limited access links.

   Another aspect is that business data leaving the premises and the
   control of the company raises security concerns, as sensitive
   information or valuable business secrets might be contained in it.
   Typical security measures such as encrypting the data make in-network
   computing techniques hardly applicable, as they typically work on
   unencrypted data. Adding security to in-network computing
   approaches, either by adding functionality for handling encrypted
   data or by devising general security measures, is thus a very
   promising field for research.

3.2.  Approaches

   While there is no work on the question of security yet, there are at
   least two concepts which might be suitable for reducing the amount of
   transmitted data in a meaningful way:

   1.  filtering out redundant or unnecessary data

   2.  aggregating data by applying preprocessing steps within the
       network

   Both concepts require detailed knowledge about the monitoring
   infrastructure at the factories and the purpose of the transmitted
   data.

3.2.1.  Traffic Filters

   Sensors are often set up redundantly, i.e., part of the collected
   data might also be redundant. Moreover, they are often hard to
   configure or not configurable at all, which is why their resolution
   or sampling frequency is often higher than required. Consequently,
   it is likely that more data is transmitted than is actually needed or
   desired. A straightforward idea for reducing the amount of data is
   thus to filter out redundant or undesired data before it leaves the
   premises using simple traffic filters that are deployed in the on-
   premise network. In this context, the following research questions
   can be of interest:

   o  How can traffic filters be designed?

   o  How can traffic filters be coordinated and deployed?

   o  How can traffic filters be changed dynamically?

3.2.2.  In-Network (Pre-)Processing

   There are manifold computations that can be performed on the sensor
   data in the cloud.
   Some of them are very complex or need the complete sensor data during
   the computation, but there are also simpler operations which can be
   done on subsets of the overall dataset or earlier on the
   communication path as soon as all data is available. One example is
   finding the maximum of all sensor values, which can either be done
   iteratively on each intermediate hop or at the first hop at which all
   data is available.

   Using expert knowledge about the exact computation steps and the
   concrete transmission path of the sensor data, simple computation
   steps can be deployed in the on-premise network to reduce the overall
   data volume and potentially speed up the processing time in the
   cloud.

   Related work has already shown that in-network aggregation can help
   to improve the performance of distributed machine learning
   applications [SAPIO]. Investigating the applicability of stream data
   processing techniques to programmable networking devices is also
   interesting, because sensor data is usually streamed. In this
   context, the following research questions can be of interest:

   o  Which (pre-)processing steps can be deployed in the network?

      *  How complex can they become?

   o  How can applications incorporate the (pre-)processing steps?

   o  How can the programming of the techniques be streamlined?

4.  Industrial Safety (Dead Man's Switch)

   Despite increasing automation in production processes, human workers
   are still often necessary. This gives safety measures a high
   priority to ensure that no human life is endangered. In traditional
   factories, the regions of contact between humans and machines are
   well-defined and interactions are simple. Simple safety measures
   like emergency switches at the working positions are enough to
   provide a decent level of safety.

   Modern factories are characterized by increasingly dynamic and
   complex environments with new interaction scenarios between humans
   and robots. Robots can either directly assist humans or perform
   tasks autonomously. The overlap between the human working area and
   the robots grows, and it is harder for human workers to fully observe
   the complete environment.

   Additional safety measures are important to prevent accidents and
   support humans in observing the environment. The increased
   availability of sensor data and the detailed monitoring of the
   factories can help to build additional safety measures if the
   corresponding data is collected early at the correct position.

4.1.  Characterization and Requirements

   Industrial safety measures are typically hardware solutions, because
   they have to pass rigorous testing before they are certified and
   deployment-ready. Common measures include safety switches, which
   need to be triggered manually, and light barriers. Additionally, the
   working area can be explicitly divided into 'contact' and 'safe'
   areas, indicating when workers have to watch out for interactions
   with machinery.

   These measures are static solutions, potentially relying on special
   hardware, and are challenged by the increased dynamics of modern
   factories. Software solutions offer higher flexibility, as they can
   dynamically take new information gathered by the sensor systems into
   account. Depending on the corresponding occupational safety laws,
   the software has to satisfy very strict requirements which cannot be
   satisfied by regular best-effort networks.

4.1.1.  Approaches

   Software-based solutions can take advantage of the large amount of
   available sensor data. Different safety indicators within the
   production hall can be combined within the network so that
   programmable networking devices can give early responses if a
   potential safety breach is detected.
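
   To make the idea of combining indicators more concrete, the following
   Python sketch shows one conceivable decision rule evaluated at a
   single device. The indicator names, the staleness limit, and the
   fail-safe treatment of outdated readings are assumptions made for
   this example and are not requirements stated in this document.

```python
# Hypothetical sketch: combining several safety indicators at one
# programmable network device. Indicator names, the staleness limit,
# and the stop rule are assumptions for illustration only.

from dataclasses import dataclass

STALE_AFTER_MS = 20   # assumed: older readings count as missing


@dataclass
class Indicator:
    name: str
    tripped: bool      # True if the sensor reports a hazard
    timestamp_ms: int  # time at which the reading was taken


def should_stop(indicators, now_ms):
    """Return True if any indicator is tripped or too old to trust."""
    for ind in indicators:
        if ind.tripped:
            return True
        if now_ms - ind.timestamp_ms > STALE_AFTER_MS:
            # Fail safe: missing or late data is treated as a hazard.
            return True
    return False


readings = [
    Indicator("light_barrier_3", tripped=False, timestamp_ms=1000),
    Indicator("zone_b_occupancy", tripped=False, timestamp_ms=1005),
]
stop_with_fresh_data = should_stop(readings, now_ms=1010)
stop_with_stale_data = should_stop(readings, now_ms=1030)
```

   Treating outdated readings as a hazard is one possible design choice
   for operation over networks without delivery guarantees; whether such
   a rule would satisfy certification requirements is an open question.
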
   A rather simple possibility could be to track the positions of human
   workers and robots. Whenever a robot gets too close to a human in a
   non-working area or if a human enters a certain safety zone, robots
   are stopped to prevent injuries. More advanced concepts could also
   include image data or combine arbitrary sensor data.

   In this context, the following research questions can be of interest:

   o  How can the software give guaranteed safety over best-effort
      networks?

   o  Which sensor information can be combined and how?

5.  Security Considerations

   N/A

6.  IANA Considerations

   N/A

7.  Conclusion

   In-network computing concepts have the potential to improve
   industrial applications. There are at least three scenarios for
   which in-network processing can be beneficial, each having a unique
   set of requirements.

   In the control scenario, tight latency constraints in the single-
   digit millisecond range have to be satisfied despite the use of cloud
   platforms and the corresponding unstable latency of the Internet.

   In a second scenario, large amounts of data have to be transmitted to
   cloud platforms for further evaluation. One important task here is
   to reduce the amount of data that needs to be transmitted, as the
   available Internet access speed is most likely insufficient. Apart
   from that, security measures have to be implemented, as business data
   is transmitted over the Internet.

   Regarding safety, software-based measures often lack the required
   guarantees and do not withstand the testing for certification. In-
   network processing with its potential for early responses can be a
   solution by combining different sensor outputs early and acting
   quickly.

8.  Informative References

   [GLEBKE]   Glebke, R., "A Case for Integrated Data Processing in
              Large-Scale Cyber-Physical Systems", DOI: 10125/60162, in
              HICSS, January 2019.

   [I-D.draft-mcbride-edge-data-discovery-overview-01]
              McBride, M., Kutscher, D., Schooler, E., and C. Bernardos,
              "Overview of Edge Data Discovery", draft-mcbride-edge-
              data-discovery-overview-01 (work in progress), March 2019.

   [RUETH]    Rueth, J., "Towards In-Network Industrial Feedback
              Control", DOI: 10.1145/3229591.3229592, in ACM SIGCOMM
              NetCompute, August 2018.

   [SAPIO]    Sapio, A., "Scaling Distributed Machine Learning with In-
              Network Aggregation", 2019.

   [TSN]      "Time-Sensitive Networking (TSN) Task Group", 2019.

Authors' Addresses

   Ike Kunze
   RWTH Aachen University
   Ahornstr. 55
   Aachen D-50274
   Germany

   Phone: +49-241-80-21422
   Email: kunze@comsys.rwth-aachen.de


   Jan Rueth
   RWTH Aachen University
   Ahornstr. 55
   Aachen D-50274
   Germany

   Phone: +49-241-80-21417
   Email: rueth@comsys.rwth-aachen.de


   Klaus Wehrle
   RWTH Aachen University
   Ahornstr. 55
   Aachen D-50274
   Germany

   Phone: +49-241-80-21401
   Email: wehrle@comsys.rwth-aachen.de