idnits 2.17.1 draft-ietf-roll-rpl-industrial-applicability-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 3) being 60 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 31 instances of too long lines in the document, the longest one being 3 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 929 has weird spacing: '...ing the use o...' == Line 1010 has weird spacing: '... of the wirel...' == Line 1037 has weird spacing: '... in the in th...' -- The document date (October 21, 2013) is 3837 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'IEEE802154e' is mentioned on line 830, but not defined == Missing Reference: 'ZigBeeIP' is mentioned on line 1468, but not defined == Missing Reference: 'HART' is mentioned on line 1459, but not defined == Unused Reference: 'I-D.ietf-roll-terminology' is defined on line 1376, but no explicit reference was found in the text == Unused Reference: 'I-D.thubert-6lo-forwarding-fragments' is defined on line 1425, but no explicit reference was found in the text == Unused Reference: 'I-D.vilajosana-6tisch-minimal' is defined on line 1452, but no explicit reference was found in the text == Outdated reference: A later version (-13) exists of draft-ietf-roll-terminology-12 == Outdated reference: A later version (-08) exists of draft-thubert-6lo-forwarding-fragments-00 == Outdated reference: A later version (-01) exists of draft-thubert-6tisch-architecture-00 Summary: 1 error (**), 0 flaws (~~), 14 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ROLL T. Phinney, Ed. 3 Internet-Draft consultant 4 Intended status: Informational P. Thubert 5 Expires: April 22, 2014 cisco 6 RA. Assimiti 7 Nivis 8 October 21, 2013 10 RPL applicability in industrial networks 11 draft-ietf-roll-rpl-industrial-applicability-02 13 Abstract 15 The wide deployment of wireless devices, with their low installed 16 cost (compared to wired devices), will significantly improve the 17 productivity and safety of industrial plants. It will simultaneously 18 increase the efficiency and safety of the plant's workers, by 19 extending and making more timely the information set available about 20 plant operations. 
The new Routing Protocol for Low Power and Lossy 21 Networks (RPL) defines a Distance Vector protocol that is designed 22 for such networks. The aim of this document is to analyze the 23 applicability of that routing protocol in industrial LLNs formed of 24 field devices. 26 Status of this Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at http://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on April 22, 2014. 43 Copyright Notice 45 Copyright (c) 2013 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents (http://trustee.ietf.org/ 50 license-info) in effect on the date of publication of this document. 51 Please review these documents carefully, as they describe your rights 52 and restrictions with respect to this document. Code Components 53 extracted from this document must include Simplified BSD License text 54 as described in Section 4.e of the Trust Legal Provisions and are 55 provided without warranty as described in the Simplified BSD License. 57 Table of Contents 59 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 60 1.1. Requirements Language . . . . . . . . . . . . . . . . . . 4 61 1.2. Required Reading . . . . . . . . . . . . . . . . . . . . . 4 62 1.3. Out of scope requirements . . . . . . . . . . . . . . . . 4 63 2. 
Deployment Scenario . . . . . . . . . . . . . . . . . . . . . 4 64 2.1. Network Topologies . . . . . . . . . . . . . . . . . . . . 6 65 2.1.1. Traffic Characteristics . . . . . . . . . . . . . . . 6 66 2.1.2. Topologies . . . . . . . . . . . . . . . . . . . . . . 8 67 2.1.3. Source-sink (SS) communication paradigm . . . . . . . 10 68 2.1.4. Publish-subscribe (PS, or pub/sub) communication paradigm 11 69 2.1.5. Peer-to-peer (P2P) communication paradigm . . . . . . 13 70 2.1.6. Peer-to-multipeer (P2MP) communication paradigm . . . 14 71 2.1.7. Additional considerations: Duocast and N-cast . . . . 14 72 2.1.8. RPL applicability per communication paradigm . . . . . 16 73 2.2. Layer 2 applicability. . . . . . . . . . . . . . . . . . . 18 74 3. Using RPL to Meet Functional Requirements . . . . . . . . . . 18 75 4. RPL Profile . . . . . . . . . . . . . . . . . . . . . . . . . 20 76 4.1. RPL Features . . . . . . . . . . . . . . . . . . . . . . . 20 77 4.1.1. RPL Instances . . . . . . . . . . . . . . . . . . . . 20 78 4.1.2. Storing vs. Non-Storing Mode . . . . . . . . . . . . . 22 79 4.1.3. DAO Policy . . . . . . . . . . . . . . . . . . . . . . 23 80 4.1.4. Path Metrics . . . . . . . . . . . . . . . . . . . . . 23 81 4.1.5. Objective Function . . . . . . . . . . . . . . . . . . 24 82 4.1.6. DODAG Repair . . . . . . . . . . . . . . . . . . . . . 24 83 4.1.7. MPL Profile . . . . . . . . . . . . . . . . . . . . . 25 84 4.1.8. Security . . . . . . . . . . . . . . . . . . . . . . . 25 85 4.1.9. P2P communications . . . . . . . . . . . . . . . . . . 25 86 4.2. Layer-two features . . . . . . . . . . . . . . . . . . . . 26 87 4.3. Recommended Configuration Defaults and Ranges . . . . . . 26 88 4.3.1. Trickle Parameters . . . . . . . . . . . . . . . . . . 26 89 4.3.2. Other Parameters . . . . . . . . . . . . . . . . . . . 27 90 5. Manageability Considerations . . . . . . . . . . . . . . . . . 27 91 6. Security Considerations . . . . . . . . . . . . . . . . . . . 28 92 6.1. 
Security Considerations during initial deployment . . . . 28 93 6.2. Security Considerations during incremental deployment . . 28 94 7. Other Related Protocols . . . . . . . . . . . . . . . . . . . 28 95 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 28 96 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 28 97 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 28 98 10.1. Normative References . . . . . . . . . . . . . . . . . . 28 99 10.2. Informative References . . . . . . . . . . . . . . . . . 28 100 10.3. External Informative References . . . . . . . . . . . . . 30 102 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 30 104 1. Introduction 106 Information Technology (IT) is already, and increasingly will be 107 applied to Industrial Automation and Control System (IACS) technology 108 in application areas where those IT technologies can be constrained 109 sufficiently by Service Level Agreements (SLA) or other modest change 110 that they are able to meet the operational needs of IACS. When that 111 happens, the IACS benefits from the large intellectual, experiential 112 and training investment that has already occurred in those IT 113 precursors. One can conclude that future reuse of additional IT 114 protocols for IACS will continue to occur due to the significant 115 intellectual, experiential and training economies which result from 116 that reuse. 118 Following that logic, many vendors are already extending or replacing 119 their local field-bus technology with Ethernet and IP-based 120 solutions. Examples of this evolution include CIP EtherNet/IP, 121 Modbus/TCP, Foundation Fieldbus HSE, PROFInet and Invensys/Foxboro 122 FOXnet. At the same time, wireless, low power field devices are 123 being introduced that facilitate a significant increase in the amount 124 of information which industrial users can collect and the number of 125 control points that can be remotely managed. 
127 IPv6 appears as a core technology at the conjunction of both trends, 128 as illustrated by the current [ISA100.11a] industrial Wireless Sensor 129 Networking (WSN) specification, where layer 1-4 technologies 130 developed for end uses other than IACS - IEEE 802.15.4 PHY and MAC, 131 6LoWPAN and IPv6, and UDP - are adapted to IACS use. But due to the 132 lack of open standards for routing in Low Power and Lossy Networks 133 (LLN) at the time ISA100.11a was crafted, routing was accomplished at 134 the link layer and is specific to that standard. 136 The IETF ROLL Working Group has defined application-specific routing 137 requirements for an LLN routing protocol, specified in: 139 Routing Requirements for Urban LLNs [RFC5548], 141 Industrial Routing Requirements in LLNs [RFC5673], 143 Home Automation Routing Requirements in LLNs [RFC5826], and 145 Building Automation Routing Requirements in LLNs [RFC5867]. 147 The Routing Protocol for Low Power and Lossy Networks (RPL) 148 [RFC6550] specification and its point-to-point extension/optimization 149 [RFC6997] define a generic Distance Vector protocol that is adapted 150 to a variety of Low Power and Lossy Network (LLN) types by the 151 application of specific Objective Functions (OFs). RPL forms 152 Destination Oriented Directed Acyclic Graphs (DODAGs) within 153 instances of the protocol, each instance being associated with an 154 Objective Function to form a routing topology. 156 A field device that belongs to an instance uses the OF to determine 157 which DODAG and which Version of that DODAG the device should join. 158 The device also uses the OF to select a number of routers within the 159 current and subsequent Versions of the DODAG to serve as parents or as 160 feasible successors. A new Version of the DODAG is periodically 161 reconstructed to enable a global reoptimization of the graph. 
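As a rough illustration of the selection step just described, the following sketch shows how a device might pick a preferred parent and feasible successors from the routers it hears. It is not OF0, MRHOF, or any other standardized Objective Function: the `Candidate` fields, the additive `rank + link_cost` comparison, the preference for the newest Version, and the `max_parents` limit are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A neighboring router advertising a DODAG (all fields are illustrative)."""
    name: str
    dodag_version: int
    rank: int       # Rank advertised by the neighbor
    link_cost: int  # hypothetical cost of the link to that neighbor, >= 1


def select_parents(candidates, max_parents=3):
    """Return (own_rank, parents), with the preferred parent first.

    Hypothetical OF: join the newest DODAG Version heard, minimize
    rank + link_cost, and keep as parents only routers whose Rank is
    lower than this node's resulting Rank, so the graph stays acyclic.
    """
    if not candidates:
        raise ValueError("no candidate routers heard")
    if any(c.link_cost < 1 for c in candidates):
        raise ValueError("link_cost must be at least 1")
    newest = max(c.dodag_version for c in candidates)
    current = [c for c in candidates if c.dodag_version == newest]
    ordered = sorted(current, key=lambda c: (c.rank + c.link_cost, c.name))
    preferred = ordered[0]
    own_rank = preferred.rank + preferred.link_cost
    parents = [c for c in ordered if c.rank < own_rank][:max_parents]
    return own_rank, parents


heard = [
    Candidate("A", dodag_version=2, rank=256, link_cost=128),
    Candidate("B", dodag_version=2, rank=512, link_cost=64),
    Candidate("C", dodag_version=1, rank=128, link_cost=64),  # stale Version
    Candidate("D", dodag_version=2, rank=256, link_cost=256),
]
own_rank, parents = select_parents(heard)
print(own_rank, [p.name for p in parents])
```

In this example router C is ignored because it advertises an older Version, and router B is rejected because its Rank is not lower than the Rank the device would take through its preferred parent A.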
163 A RPL OF states the outcome of the process used by a RPL node to 164 select and optimize routes within a RPL Instance based on the 165 information objects available. The separation of OFs from the core 166 protocol specification allows RPL to be adapted to meet the different 167 optimization criteria required by the wide range of industrial 168 classes of traffic and applications. 170 This document provides information on how RPL can accommodate the 171 industrial requirements for LLNs, in particular as specified in 172 [RFC5673]. 174 1.1. Requirements Language 176 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 177 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 178 "OPTIONAL" in this document are to be interpreted as described in RFC 179 2119 [RFC2119]. 181 Additionally, this document uses terminology from [I-D.ietf-roll- 182 terminology], and uses usual terminology from the Process Control and 183 Factory Automation industries, some of which is recapitulated below: 185 FEC: Forward error correction 187 IACS: Industrial automation and control systems 189 RAND: reasonable and non-discriminatory (relative to licensing of 190 patents) 192 1.2. Required Reading 194 1.3. Out of scope requirements 196 This applicability statement does not address requirements related to 197 wireless LLNs employed in factory automation and related 198 applications. 200 2. Deployment Scenario 202 [RFC5673] describes in detail the routing requirements for industrial 203 LLNs. This RFC provides information on the varying deployment 204 scenarios for such LLNs and how RPL assists in meeting those 205 requirements. 
207 Large industrial plants, or major operating areas within such plants, 208 repeatedly go through four major phases, each of which typically 209 lasts from months to years: 211 P1: Construction or major modification phase 213 P2: Planned startup phase 215 P3: Normal operation phase 217 P4: Planned shutdown phase 219 followed eventually by an (at least theoretical) 221 P5: Plant decommissioning phase. 223 After a major catastrophe at a plant, there is also likely to be a 225 P6: Post-emergency recovery and repair phase. 227 The deployment scenarios for wireless LLN devices may be different in 228 each of these phases. In particular, during the Construction or 229 major modification phase (P1), LLN devices may be installed months 230 before the intended LLN can become usefully operational (because 231 needed routers and infrastructure devices are not yet installed or 232 active), and there are likely to be many personnel in whom the plant 233 owner/operator has only limited trust, such as subcontractors and 234 others in the plant area who have undergone only a cursory background 235 investigation (if any at all). In general, during this phase, plant 236 instrumentation is not yet operational, so it could be removed and 237 replaced by a Trojaned device without much likelihood of physical 238 detection of the substitution. Thus physical security of LLN devices 239 is generally a more significant risk factor during this phase than 240 once the plant is operational, when simple replacement of device 241 electronics is detectable. 243 Extra LLN devices and even extra LLN subnets may be employed during 244 Planned startup (P2) and Planned shutdown (P4) phases, in support of 245 the task of transitioning the plant or plant area between operational 246 and shutdown states. The extra devices typically provide extra 247 monitoring as the plant transitions through infrequent activity states. 
(In 248 many continuous process plants, up to 2x extra staff are employed at 249 monitoring and control workstations during these two phases, 250 precisely because the plant is undergoing extraordinary behavior as 251 it transitions to or from its steady-state operational condition.) 253 Similar transient devices and subnets may be used during an 254 unscheduled Post-emergency recovery and repair phase (P6) of 255 operation, but in that case the extra devices usually are routers 256 substituting for plant LLN devices that have been damaged by the 257 incident (such as a fire, explosion, flood, tornado or hurricane) 258 that induced the emergency. 260 The Planned startup (P2) and Planned shutdown (P4) phases are similar 261 in many respects, but the LLN environment of the two can be quite 262 different, since the Planned shutdown phase can assume that the 263 stable LLN environment used for Normal operation (P3) is functional 264 during shutdown, whereas that stable environment usually is still 265 being established during startup. 267 The Post-emergency recovery and repair phase (P6) typically operates 268 in an LLN environment that is somewhere between that of the Planned 269 startup (P2) and Normal operation (P3) phases, but with an 270 indeterminate number of temporary routers placed to facilitate 271 communication across and around the area affected by the catastrophe. 273 Smaller industrial plants and sites may go through similar phases, 274 but often commingle the phases because, in those smaller plants, the 275 phases require less planning and structuring of personnel 276 responsibilities and thus permit less formalization and partitioning 277 of the operating scenarios. 
For example, it is much simpler, and 278 usually requires much less planning, to bring new equipment on a skid 279 into a plant, using a forklift, than to lay temporary railroad track 280 or employ an extended-axle heavy haul tractor-trailer to deliver a 281 multi-ton process vessel, and temporarily deploy and use very large 282 heavy-lift cranes to install it. In the former case, nearby 283 equipment usually can continue normal operation while the 284 installation proceeds; in the latter case that is almost always 285 impossible, due to safety and other concerns. 287 The domain of applicability for the RPL protocol may include all 288 phases but the Normal operation phase, where the bandwidth allocation 289 and the routes are usually optimized by an external Path Computing 290 Engine (PCE), e.g. an ISA100.11a System Manager. 292 Additionally, RPL could be included in normal 293 operation, provided that a new Objective Function is defined that 294 actually interacts with the PCE in order to establish the reference 295 topology, in which case RPL operations would only apply to emergency 296 repair actions, when the reference topology becomes unusable because of 297 some failure, and for as long as the problem persists. 299 2.1. Network Topologies 301 2.1.1. Traffic Characteristics 303 The industrial market classifies process applications into three 304 broad categories and six classes. 
306 o Safety 308 * Class 0: Emergency action - Always a critical function 310 o Control 311 * Class 1: Closed loop regulatory control - Often a critical 312 function 314 * Class 2: Closed loop supervisory control - Usually non-critical 315 function 317 * Class 3: Open loop control - Operator takes action and controls 318 the actuator (human in the loop) 320 o Monitoring 322 * Class 4: Alerting - Short-term operational effect (for example 323 event-based maintenance) 325 * Class 5: Logging and downloading / uploading - No immediate 326 operational consequence (e.g., history collection, sequence-of- 327 events, preventive maintenance) 329 Safety critical functions affect the basic safety integrity of the 330 plant. These normally dormant functions kick in only when process 331 control systems, or their operators, have failed. By design and by 332 regular interval inspection, they have a well-understood probability 333 of failure on demand, typically in the range of once per 10 to 1000 334 years. 336 In-time delivery of messages becomes more relevant as the class 337 number decreases. 339 Note that for a control application, jitter is just as important 340 as latency and has the potential to destabilize control algorithms. 342 The domain of applicability for the RPL protocol probably matches the 343 range of classes where industrial users are interested in deploying 344 wireless networks. This domain includes monitoring classes (4 and 345 5), and the non-critical portions of control classes (2 and 3). RPL 346 might also be considered as an additional repair mechanism in all 347 situations, independently of the flow classification and the 348 medium type. 350 It appears from the above sections that whether and how RPL can 351 be applied for a given flow depends both on the deployment scenario 352 and on the class of application / traffic. 
At a high level, this can 353 be summarized by the following matrix: 355 +---------------------+------------------------------------------------+ 356 | Phase \ Class | 0 1 2 3 4 5 | 357 +=====================+================================================+ 358 | Construction | X X X X | 359 +---------------------+------------------------------------------------+ 360 | Planned startup | X X X X | 361 +---------------------+------------------------------------------------+ 362 | Normal operation | ? ? ? | 363 +---------------------+------------------------------------------------+ 364 | Planned shutdown | X X X X | 365 +---------------------+------------------------------------------------+ 366 |Plant decommissioning| X X X X | 367 +---------------------+------------------------------------------------+ 368 | Recovery and repair | X X X X X X | 369 +---------------------+------------------------------------------------+ 371 ? : typically usable for all but higher-rate classes 0,1 PS traffic 373 2.1.2. Topologies 375 In an IACS, high-rate communications flows (e.g., 1 Hz or 4 Hz for a 376 traditional process automation network) typically are such that only 377 a single wireless LLN hop separates the source device from a LLN 378 Border Router (LBR) to a significantly higher data-rate backbone 379 network, typically based on IEEE 802.3, IEEE 802.11, or IEEE 802.16, 380 as illustrated in Figure 2. 
382 ---+------------------------ 383 | Plant Network 384 | 385 +-----+ 386 | | Gateway 387 | | 388 +-----+ 389 | 390 | Backbone 391 +--------------------+------------------+ 392 | | | 393 +-----+ +-----+ +-----+ 394 | | LLN border | | LLN border | | LLN border 395 o | | router | | router | | router 396 +-----+ +-----+ +-----+ 397 o o o o 398 o o o o o o o o o o o 399 LLN 401 o : stationary wireless field device, seldom acting as an LLN router 403 For factory automation networks, the basic communications cycle for 404 control is typically much faster, on the order of 100 Hz or more. In 405 this case the LLN itself may be based on high-data-rate IEEE 802.11 406 or a 100 Mbit/s or faster optical link, and the higher-rate network 407 used by the LBRs to connect the LLN to superior automation equipment 408 typically might be based on fiber-optic IEEE 802.3, with multiple 409 LBRs around the periphery of the factory area, so that most high-rate 410 communications again require only a single wireless LLN hop. 412 Multi-hop LLN routing is used within the LLN portion of such networks 413 to provide backup communications paths when primary single-hop LLN 414 paths fail, or for lower repetition rate communications where longer 415 LLN transit times and higher variance are not an issue. Typically, 416 the majority of devices in an IACS can tolerate such higher-delay, 417 higher-variance paths, so routing choices often are driven by energy 418 considerations for the affected devices, rather than simply by IACS 419 performance requirements, as illustrated in Figure 3. 
421 ---+------------------------ 422 | Plant Network 423 | 424 +-----+ 425 | | Gateway 426 | | 427 +-----+ 428 | 429 | Backbone 430 +--------------------+------------------+ 431 | | | 432 +-----+ +-----+ +-----+ 433 | | Backbone | | Backbone | | Backbone 434 | | router | | router | | router 435 +-----+ +-----+ +-----+ 436 o o o o o o o o o o o o o 437 o o o o o o o o o o o o o o o o o o 438 o o o o o o o o o o o M o o o o o 439 o o M o o o o o o o o o o o o o 440 o o o o o o o o o 441 o o o o o 442 LLN 444 o : stationary wireless field device, often acting as an LLN router 445 M : mobile wireless device 447 Two decades of experience with digital fieldbuses has shown that four 448 communications paradigms dominate in IACS: 450 SS: Source-sink 452 PS: Publish-subscribe 454 P2P: Peer-to-peer 456 P2MP: Peer-to-multipeer 458 2.1.3. Source-sink (SS) communication paradigm 460 In SS, the source-sink communication paradigm, each of many devices 461 in one set, S1, sends UDP-like messages, usually infrequently and 462 intermittently, to a second set of devices, S2, determined by a 463 common multicast address. A typical example would be that all 464 devices within a given process unit N are configured to send process 465 alarm messages to the multicast address 466 Receivers_of_process_alarms_for_unit_N. Receiving devices, typically 467 on non-LLN networks accessed via LBRs, are configured to receive such 468 multicast messages if their work assignment covers process unit N, 469 and not otherwise. 471 Timeliness of message delivery is a significant aspect of some SS 472 communication. When the SS traffic conveys process alarms or device 473 alerts, there is often a contractual requirement, and sometimes even 474 a regulatory requirement, on the maximum end-to-end transit delay of 475 the SS message, including both the LLN and non-LLN components of that 476 delay. 
However, there is no requirement on relative jitter in the 477 delivery of multiple SS messages from the same source, and message 478 reordering during transit is irrelevant. 480 Within the LLN, the SS paradigm simply requires that messages so 481 addressed be forwarded to the responsible LBR (or set of equivalent 482 LBRs) for further forwarding outside the LLN. Within the LLN such 483 traffic typically is device-to-LBR or device-to-redundant-set-of- 484 equivalent-LBRs. In general, SS traffic may be aggregated before 485 forwarding when both the multicast destination address and other QoS 486 attributes are identical. If information on the target delivery 487 times for SS messages is available to the aggregating forwarding 488 device, that device may intentionally delay forwarding somewhat to 489 facilitate further aggregation, which can significantly reduce LLN 490 alarm-reporting traffic during major plant upset events. 492 2.1.4. Publish-subscribe (PS, or pub/sub) communication paradigm 494 In PS, the publish-subscribe communication paradigm, a device sends 495 UDP-like messages, usually periodically or cyclically (i.e., 496 repetitively but without fixed periodicity), to a single multicast 497 address derived from or correlated with the device's own address. A 498 typical example would be that each sensor and actuator device within 499 a given process unit N is configured to send process state messages 500 to the multicast address that designates its specific publications. 501 In essence the derived multicast address for device D is 502 Receivers_of_publications_by_device_D. Typically those receivers are 503 in two categories: controllers (C) for control loops in which device 504 D participates, and devices accessed via the LLN's LBRs that monitor 505 and/or accumulate historical information about device D's status and 506 outputs. 
508 If the controller(s) that receive device D's publication are all 509 outside the LLN and accessed by LBRs, then within the LLN such 510 traffic typically is device-to-LBR or device-to-redundant-set-of- 511 equivalent-LBRs. But if a controller (Cn) is within the LLN, then a 512 number of different LLN-local traffic patterns may be employed, 513 depending on the capabilities of the underlying link technology and 514 on configured performance requirements for such reporting. Typically 515 in such a case, publication by device D is forwarded up a DODAG to an 516 LLN router that is also on a downward DODAG to a destination 517 controller Cn, then forwarded down that second DODAG to that 518 destination controller Cn. Of course, if the LLN router (or even the 519 LBR) is itself the intended destination controller, which will often 520 be the case, then no downward forwarding occurs. 522 Timeliness of message delivery is a critical aspect of PS 523 communication. Individual messages can be lost without significant 524 impact on the controlled physical process, but typically a sequence 525 of four consecutive lost messages will trigger fallback behavior of 526 the control algorithms, which is considered a system failure by most 527 system owner/operators. (In general, and unless a local catastrophic 528 event such as a major explosion or a tornado occurs in the plant, 529 invocation of more than one instance of such fallback handling per 530 year, per plant, is considered unacceptable.) 532 Message loss, delay and jitter in delivery of PS messaging is a 533 relative matter. PS messaging is used for transfer of process 534 measurements and associated status from sensors to control 535 computation elements, from control computation elements to actuators, 536 and of current commanded position and status from actuators back to 537 control computation elements. 
The actual time interval of interest 538 is that which starts with sensing of the physical process (which 539 necessarily occurs before the sensed value can be sent in the first 540 message) and which ends when the computed control correction is 541 applied to the physical process by the appropriate actuator (which 542 cannot occur until after the second message containing the computed 543 control output has been received by that actuator). With rare 544 exception, the control algorithms used with PS messaging in the 545 process automation industries - those managing continuous material 546 flows - rely on fixed-period sampling, computation and transfer of 547 outputs, while those in the factory automation industries - those 548 managing discrete manufacturing operations - rely on bounded delay 549 between sampling of inputs, control computation and transfer of 550 outputs to physical actuators that affect the controlled process. 552 Deliberately manipulated message delay and jitter in delivery of PS 553 messaging has the potential to destabilize control loops. It is the 554 responsibility of conveyed higher-level protocols to protect against 555 such potential security attacks by detecting overly delayed or 556 jittered messages at delivery, converting them into instances of 557 message loss. Thus network and data-link protocols such as IPv6 and 558 Ethernet need not themselves address such issues, although their 559 selection and employment should take the existence (or lack) of such 560 higher-layer protection mechanisms, and the resulting consequences 561 due to excessive delay and jitter, into consideration in their 562 parameterization. 564 In general, PS traffic within the LLN is not aggregated before 565 forwarding, to minimize message loss and delay in reception by any 566 relevant controller(s) that are outside the LLN. 
However, if all 567 intended destination controllers are within the LLN, and at least one 568 of those intended controllers also serves as an LLN router on a DODAG 569 to off-LLN destinations that all are not controllers, then the router 570 functions in that device may aggregate PS traffic before forwarding 571 when the required routing and other QoS attributes are identical. If 572 information on the target delivery times for PS messages to non- 573 controller devices is available to the aggregating forwarding device, 574 that device may intentionally delay forwarding somewhat to facilitate 575 further aggregation. 577 In some system architectures, message streams that use PS to convey 578 current process measurements and status are compressed at the source 579 through a 2-dimensional winnowing process that compares 581 1) the process measurement values and status of the about-to-be-sent 582 message with that of the last actually-sent message, and 584 2) the current time vs. the queueing time for the last actually-sent 585 message. 587 If the interval since that last-sent message is less than a 588 predefined maximum time, and the status is unchanged, and the process 589 measurement(s) conveyed in the message is within predefined 590 deadband(s) of the last-sent measurement value(s), then transmission 591 of the new message is suppressed. Often this suppression takes the 592 form of not queuing the new message for transmission, but in some 593 protocols a brief placeholder message indicating "no significant 594 change" is queued in its stead. 596 2.1.5. Peer-to-peer (P2P) communication paradigm 598 In P2P, the peer-to-peer communication paradigm, a device sends UDP- 599 like or TCP-like messages from one device (D1) to a second device 600 (D2), usually with bidirectional but asymmetric flow of application 601 data, where the amount of data is significantly greater in one 602 direction than the other. 
Typical examples are transfer of 603 configuration information to or from a process field device, or 604 transfer of captured process diagnostics (e.g., time-stamped noise 605 signatures from a coriolis flowmeter) to an off-LLN higher-level 606 asset management system. Unicast addressing is used in both 607 directions of data flow. 609 In general, specific P2P traffic has only loose timeliness 610 requirements, typically just those required so that response times to 611 human-operator-initiated actions meet human factors requirements. As 612 a consequence, in general, message aggregation is permitted, although 613 few opportunities are likely to present themselves for such 614 aggregation due to the sporadic nature of such messaging to a single 615 destination, and/or due to the large message payloads that often 616 occur in at least one direction of transmission. 618 2.1.6. Peer-to-multipeer (P2MP) communication paradigm 620 In P2MP, the peer-to-multipeer communication paradigm, a device sends 621 UDP-like messages downward, from one device (D1) to a set of other 622 devices (Dn). Typical examples are bulk downloads to a set of devices 623 that use identical code image segments or identically-structured 624 database segments; group commands to enable device state transitions 625 that are quasi-synchronized across all or part of the local network 626 (e.g., switch to the next set of point-to-point downloaded session 627 keys, or notifying that the network is switching to an emergency 628 repair and recovery mode); etc. Multicast addressing is used in the 629 downward direction of data flow. 631 Devices can be assigned to a number of multicast groups, for instance 632 by device type. Then, if it becomes necessary to reflash all devices 633 of a given type with a new load image, a multicast distribution 634 mechanism can be leveraged to optimize the distribution operation. 636 In general, P2MP traffic has only loose timeliness requirements. 
As 637 a consequence, in general, message aggregation is permitted, although 638 few opportunities are likely to present themselves for such 639 aggregation due to the sporadic nature of such messaging to a single 640 multicast group destination, and/or due to the large message payloads 641 that often occur when P2MP is used for group downloads. However, in 642 general, message aggregation negatively impacts the delivery success 643 rate for each of the aggregated messages, since the probability of 644 error in a received message increases with message length. Together 645 these considerations often lead to a policy of non-aggregation for 646 P2MP messaging. 648 Note: Reliable group download protocols, such as the no-longer- 649 published IEEE 802.1E (ISO/IEC 15802-4) system load protocol, and 650 reliable multicast protocols based on the guidance of [RFC2887], are 651 instructive in how P2MP can be used for initial bulk download, 652 followed by either P2MP or P2P selective retransmissions for missed 653 download segments. 655 2.1.7. Additional considerations: Duocast and N-cast 657 In industrial automation systems, some traffic is from (relatively) 658 high-rate monitoring and control loops, of Class 0 and Class 1 as 659 described in [RFC5673]. In such systems, the wireless link protocol, 660 which typically uses immediate in-band acknowledgement to confirm 661 delivery (or, on failure, conclude that a retransmission is 662 required), can be adapted to attempt simultaneous delivery to more 663 than one receiving device, with separated, sequenced immediate in- 664 band acknowledgement by each of those intended receivers. (This 665 mechanism is known colloquially as "duocast" (for two intended 666 receivers), or more generically as "N-cast" (for N intended 667 receivers).)
Transmission is deemed successful if at least one such 668 immediate acknowledgement is received by the sending device; 669 otherwise the device queues the message for retransmission, up until 670 the maximum configured number of retries has been attempted. 672 The logic behind duocast/N-cast is very simple: In wireless systems 673 without FEC (forward error correction), the overall rate of success 674 for transactions consisting of an initial transmission and an 675 immediate acknowledgement is typically 95%. In other words, 5% of 676 such transactions fail, either because the initial message of the 677 transaction is not received correctly by the intended receiver, or 678 because the immediate acknowledgment by that receiver is not received 679 correctly by the transaction initiator. 681 In the generalized case of N-cast, where any received acknowledgement 682 serves to complete the transaction, and where the N intended 683 receivers are spatially diverse, physically separated from each other 684 by multiple wavelengths, the probability that all such receivers fail 685 to receive the initial message of the transaction, or that all 686 generated immediate acknowledgements are not received by the 687 transaction initiator, is typically approximately (5%)^N. Thus, for 688 duocast, the expected success rate for a single transaction goes from 689 95% (1.0 - 0.05) to 99.75% (1.0 - 0.05^2), to 99.9875% (1.0 - 0.05^3) 690 when N=3, and even higher when N>3. 692 From the above analysis, it is obvious that the primary benefit of 693 N-cast occurs when N goes from N=1 (unicast) to N=2 (duocast); the 694 reduction in transaction loss rate for increasing N>2 is quite small, 695 and for N>3 it is infinitesimal. 
In the typical industrial 696 automation environment of class 1 process control loops, which 697 typically repeat at a 1 Hz or 4 Hz rate, in a very large process 698 plant with thousands of field devices reporting at that rate, the 699 maximum number of transmission retries that must be planned, and for 700 which capacity must be scheduled (within the requisite 250 ms or 1 s 701 interval) is seven (7) retries for unicast PS reporting, but only 702 three (3) retries with duocast PS reporting. (This is determined by 703 the requirement to not miss four successive reports more than once 704 per year, across the entire plant, as such a loss typically triggers 705 fallback behavior in the controlled loop, which is considered a 706 failure of the wireless system by the plant owner/operator.) In 707 practice, the enormous reduction in both planned and used 708 retransmission capacity provided by duocast/N-cast is what enables 4 709 Hz loops to be supported in large wireless systems. 711 When available, duocast/N-cast typically is used only for one-hop PS 712 traffic on Class 1 and Class 0 control loops. It may also be 713 employed for rapid, reliable one-hop delivery of Class 0 and 714 sometimes Class 1 process alarms and device alerts, which use the SS 715 paradigm. Because it requires scheduling of multiple receivers that 716 are prepared to acknowledge the received message during the 717 transaction, in general it is not appropriate for the other types of 718 traffic in such systems - P2P and P2MP - and is not needed for other 719 classes of control loops or other types of traffic, which do not have 720 such stringent reporting requirements. 
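The single-transaction success rates quoted in the N-cast analysis earlier in this section can be reproduced with a short calculation. The sketch below is illustrative only: it assumes, as the text does, a 5% failure rate per transmit/acknowledge transaction and statistically independent failures across spatially diverse receivers. The function name is hypothetical and not part of any specification.

```python
def ncast_success_rate(n_receivers: int, failure_rate: float = 0.05) -> float:
    """Probability that at least one of N intended receivers completes
    the transaction, assuming independent failures (an assumption)."""
    if n_receivers < 1:
        raise ValueError("n_receivers must be at least 1")
    # The transaction fails only if every receiver's exchange fails.
    return 1.0 - failure_rate ** n_receivers


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"N={n}: {ncast_success_rate(n):.4%}")
```

Under these assumptions the loss rate drops by a factor of twenty from N=1 to N=2, while each further receiver recovers a much smaller absolute share of transactions, which matches the observation that duocast captures most of the benefit.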
722 Note: Although there are known patent applications for duocast and 723 N-cast, at the time of this writing the patent assignee, Honeywell 724 International, has offered to permit cost-free RAND use in those 725 industrial wireless standards that have chosen to employ the 726 technology, under a reciprocal licensing requirement relative to that 727 use. Since duocast and N-cast provide performance and energy 728 optimizations, they are not essential for use in wireless systems. 729 However, in practice, their use makes it possible to support 4 Hz 730 wireless loops and meet sub-second safety alarm reporting 731 requirements in large plants, where that might otherwise be 732 impractical without use of a wired network. When duocast/N-cast is 733 not employed, the wireless retransmission capacity that is needed to 734 support such fast loops often is excessive, typically over 100x that 735 actually used for retransmission (i.e., providing for seven retries 736 per transaction when the mean number used is only 0.06 retries). 738 2.1.8. RPL applicability per communication paradigm 740 To match the requirements above, RPL provides a number of RPL Modes 741 of Operation (MOP): 743 No downward route: defined in [RFC6550], section 6.3.1, MOP of 0. 744 This mode allows only upward routing, that is from 745 nodes (devices) that reside inside the RPL network 746 toward the outside via the DODAG root. 748 Non-storing mode: defined in [RFC6550], section 6.3.1, MOP of 1. This 749 mode improves MOP 0 by adding the capability to use 750 source routing from the root towards registered 751 targets within the instance DODAG. 753 Storing mode without multicast support: defined in [RFC6550], section 754 6.3.1, MOP of 2. This mode 755 improves MOP 0 by adding the 756 capability to use stateful 757 routing from the root towards 758 registered targets within the 759 instance DODAG.
761 Storing mode with link-scope multicast DAO: defined in [RFC6550] 762 section 9.10, this mode 763 improves MOP 2 by adding 764 the capability to send 765 Destination 766 Advertisements to all 767 nodes over a single Layer 768 2 link (e.g. a wireless 769 hop) and enables line-of- 770 sight direct 771 communication. 773 Storing mode with multicast support: defined in [RFC6550], Mode-of- 774 operation (MOP) of 3. This mode 775 improves MOP 2 by adding the 776 capability to register multicast 777 groups and perform multicast 778 forwarding along the instance 779 DODAG (or a spanning subtree 780 within the DODAG). 782 Reactive: defined in [RFC6997], the reactive mode creates on-demand 783 additional DAGs that are used to reach a given node acting 784 as DODAG root within a certain number of hops. This mode 785 can typically be used for an ad-hoc closed-loop 786 communication. 788 The RPL MOP that can be applied for a given flow depends on the 789 communication paradigm. It must be noted that a DODAG that is used 790 for PS traffic can also be used for SS traffic since the MOP 2 791 extends the MOP 0, and that a DODAG that is used for P2MP 792 distribution can also be used for downward PS since the MOP 3 extends 793 the MOP 2. 795 On the other hand, an Objective Function (OF) that optimizes metrics 796 for a pure upwards DODAG might differ from the OF that optimizes a 797 mixed upward and downward DODAG. 799 As a result, it can be expected that different RPL instances are 800 installed with different OFs, different channel allocations, etc... 801 that result in different routing and forwarding topologies, sometimes 802 with differing delay vs. energy profiles, optimized separately for 803 the different flows at hand. 
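The "improves" and "extends" relationships among the numbered modes of operation described above can be expressed as a small lookup. This is a minimal sketch, not a normative model: the MOP values 0 through 3 are those of [RFC6550], but the names Mop and can_serve are hypothetical, and treating the relationships as transitive (so that MOP 3 also covers MOP 0) is an assumption of this example.

```python
from enum import IntEnum
from typing import Optional


class Mop(IntEnum):
    NO_DOWNWARD = 0        # upward routing only
    NON_STORING = 1        # downward via source routing from the root
    STORING = 2            # downward via stateful routing
    STORING_MULTICAST = 3  # storing mode plus multicast forwarding


# Direct "improves/extends" relations as stated in the text.
_EXTENDS = {
    Mop.NON_STORING: Mop.NO_DOWNWARD,
    Mop.STORING: Mop.NO_DOWNWARD,
    Mop.STORING_MULTICAST: Mop.STORING,
}


def can_serve(dodag_mop: Mop, required_mop: Mop) -> bool:
    """True if a DODAG built in dodag_mop also offers required_mop,
    following the extension chain (assumed to be transitive)."""
    mop: Optional[Mop] = dodag_mop
    while mop is not None:
        if mop == required_mop:
            return True
        mop = _EXTENDS.get(mop)
    return False
```

For instance, can_serve(Mop.STORING, Mop.NO_DOWNWARD) holds, reflecting that a DODAG used for PS traffic can also carry SS traffic, whereas a non-storing DODAG does not provide stateful downward routing.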
805 This can be broadly summarized in the following table:
807 +---------------------+------------+-----------------------------------+
808 | Paradigm\RPL MOP    | RPL spec   | Mode of operation                 |
809 +=====================+============+===================================+
810 | Peer-to-peer        | RPL P2P    | reactive (on-demand)              |
811 +---------------------+------------+-----------------------------------+
812 | P2P line-of-sight   | RPL base   | 2 (storing) with multicast DAO    |
813 +---------------------+------------+-----------------------------------+
814 | P2MP distribution   | RPL base   | 3 (storing with multicast)        |
815 +---------------------+------------+-----------------------------------+
816 | Publish-subscribe   | RPL base   | 1 or 2 (non-storing or storing)   |
817 +---------------------+------------+-----------------------------------+
818 | Source-sink         | RPL base   | 0 (no downward route)             |
819 +---------------------+------------+-----------------------------------+
820 | N-cast publish      | RPL base   | 0 (no downward route)             |
821 +---------------------+------------+-----------------------------------+
823 2.2. Layer 2 applicability. 825 Work at the 6TiSCH WG details layer 2 operations for the most 826 commonly used link layer for industrial operations, the Timeslotted 827 Channel Hopping (TSCH) mode of IEEE802.15.4e [IEEE802154e]. 829 [I-D.watteyne-6tisch-tsch] provides in-depth information on the 830 IEEE802.15.4e [IEEE802154e] TSCH MAC operation whereas the 6TiSCH 831 architecture [I-D.thubert-6tisch-architecture] provides additional 832 information on how RPL can be used over TSCH. 834 This contrasts with the SmartGrid area where ZigBee IP [ZigBeeIP] 835 ("ZigBee" is a registered trademark of the ZigBee Alliance) defines 836 an application of RPL over a more classical contention-based 837 operation but will not exhibit the deterministic capabilities that 838 industrial control loops require. 840 3.
Using RPL to Meet Functional Requirements 842 The functional requirements for most industrial automation 843 deployments are similar to those listed in [RFC5673] 845 The routing protocol MUST be capable of supporting the 846 organization of a large number of nodes into regions, usually 847 corresponding to partitions of the automated process, each 848 containing on the order of 30 to 3000 nodes. 850 The routing protocol MUST provide mechanisms to support 851 configuration of the routing protocol itself. 853 The routing protocol MUST provide mechanisms to support instructed 854 configuration of explicit routing, so that in the absence of 855 failure the routing used for selected flow classes is that which 856 has been remotely configured (typically by a centralized 857 configurator). In such circumstances RPL is used 859 for local network repair; 861 for flow classes to which explicit routing has not been 862 assigned; 864 during bootstrapping of the network itself (which is really 865 just an instance of routing without such an externally-imposed 866 assignment). 868 The routing protocol SHOULD support directed flows with different 869 QoS characteristics, typically with different energy vs. delay 870 tradeoffs, for traffic directed to LBRs. 
In practice only two 871 such sets of QoS are relevant: 873 one that emphasizes energy minimization for energy-constrained 874 nodes at the expense of greater mean transit delay and variance 875 in transit delay; and 877 one that emphasizes minimization of mean transit delay and 878 transit delay variance at the expense of greater energy demand 879 on originating and intermediary energy-constrained nodes, 880 typically used for critical SS traffic (e.g., infrequent and 881 unpredictable safety alarms with legally-mandated maximum 882 reporting delays) and critical PS traffic (e.g., predictable 883 periodic (for process automation) or cyclic (for factory 884 automation) high-speed safety control loops needed to protect 885 life, the environment, and/or critical national infrastructure 886 assets). 888 In the absence of configured routing, or when such routes have 889 failed, the routing protocol MUST dynamically compute and select 890 effective routes composed of low-power and lossy links. Local 891 network dynamics SHOULD NOT impact the entire network. The 892 routing protocol MUST compute multiple paths when possible. 894 The routing protocol MUST support multicast addressing, including 896 multicast originating with a LBR or off the LLN, directed to a 897 predefined group within the LLN 899 multicast originating within the LLN, directed to one or more 900 equivalent LBRs, in support of SS traffic 902 multicast originating within the LLN, directed to one or more 903 equivalent LBRs, in support of PS traffic. 905 The routing protocol SHOULD support and utilize a large number of 906 highly directed flows to a few LBRs, to handle scalability. 908 The routing protocol SHOULD support formation of groups of field 909 devices in the network. 911 The routing protocol NEED NOT support anycast addressing because, 912 as of the date of writing of this document, such addressing is not 913 used by automation and control field devices. In general, no two
In general, no two 914 such devices are equivalent, except perhaps for intermediary LBRs, 915 so unicast suffices for situations where anycast might otherwise 916 be employed. 918 RPL supports: 920 Large-scale networks characterized by highly directed traffic 921 flows between each field device and servers close to the head-end 922 of the automation network. To this end, RPL builds Directed 923 Acyclic Graphs (DAGs) rooted at LBRs. 925 Zero-touch configuration. This is done through in-band methods 926 for configuring RPL variables using DIO messages. 928 The use of links with time-varying availability and quality 929 characteristics. This is accomplished by allowing the use of 930 metrics that effectively capture the quality of a path (e.g., in 931 terms of the mean and maximum impact of use of that path on packet 932 delivery timing and on endpoint energy demands), and by limiting 933 the impact of changing local conditions by discovering and 934 maintaining multiple DAG parents, and by using local repair 935 mechanisms when DAG links break. 937 For wireless installations of small size with undemanding 938 communication requirements, RPL is likely to generate satisfactory 939 routing without any special effort. However, in larger installations 940 or where timeliness considerations do not permit multi-second 941 wireless-subnet transit times, then flow labeling is likely required 942 so that forwarding routers can make informed tradeoffs between 943 conserving their own energy resources and meeting overall system 944 needs. 946 4. RPL Profile 948 This section outlines a RPL profile for a representative deployment 949 in a process control application. Process monitoring without control 950 is typically less demanding, so a subset of this profile generally 951 will suffice. 953 4.1. RPL Features 955 4.1.1. RPL Instances 957 RPL allows formation of multiple instances that operate independently 958 of each other. 
Each instance may use a different objective function 959 and different modes of operation. It is highly recommended that 960 wireless field devices participate in different instances that 961 utilize objective functions that meet different optimization goals. 962 These optimization goals target: 964 1. Minimizing latency and ensuring that a guaranteed latency is met 966 2. Maximizing the communication reliability of the packets 967 transferred over the wireless media 969 3. Minimizing aggregate power consumption for multi-hop LLNs that 970 are composed of battery powered field devices. 972 Some of these optimization goals will have to be met concurrently in 973 a single instance by imposing various constraints. 975 Each wireless field device should participate in a set composed of a 976 minimum of three instances that meet optimization goals associated 977 with three traffic flows which need to be supported by all industrial 978 LLNs. 980 Management Instance: Wireless industrial networks are highly 981 deterministic in nature, meaning that wireless field devices do 982 not make any decisions locally but are managed by a centralized 983 System Manager that oversees the join process as well as all 984 communication and security settings present in the devices. The 985 management traffic flow is downward traffic and needs to meet 986 strictly enforced latency and reliability requirements in order to 987 ensure proper operation of the wireless LLN. Hence each field 988 device should participate in an instance dedicated to management 989 traffic. All decisions made while constructing this instance will 990 need to be approved by the Path Computation Engine present in the 991 System Manager due to the deterministic, centralized nature of 992 wireless industrial LLNs. Shallow LLNs with a hop count of up to 993 one accommodate this downward traffic using non-storing mode. Non- 994 storing involves source routing that is detrimental to the packet 995 size.
For large transfers such as image download and 996 configuration files, this can be factorized for a large packet. 997 In that case, a method such as [I-D.thubert-6lo-forwarding- 998 fragments] is required over multi-hop networks to forward and 999 recover individual fragments without the overhead of the source 1000 route information in each fragment. If the hop count in the 1001 wireless LLN grows (LLN becomes deeper) it is highly recommended 1002 that the management instance rely on storing mode in order to 1003 relay management related packets. 1005 Operational Instance: The bulk of the data that is transferred over 1006 wireless LLN consists of process automation related payloads. 1007 This data is of paramount importance to the smooth operation of 1008 the process that is being monitored. Hence data reliability is of 1009 paramount importance. It is also important to note that a vast 1010 majority of the wireless field devices that operate in industrial 1011 LLNs are battery powered. The operational instance should hence 1012 ensure high reliability of the data transmitted while also 1013 minimizing the aggregate power consumption of the field devices 1014 operating in the LLN. All decisions made while constructing this 1015 instance will need to be approved by the Path Computation Engine 1016 present in the System Manager. This is due to the deterministic, 1017 centralized nature of wireless LLNs. 1019 Autonomous instance: An autonomous instance requires limited to no 1020 configuration. Its primary purpose is to serve as a backup for 1021 the operational instance in case the operational instance fails. 1022 It is also useful in non-production phases of the network, when 1023 the plant is installed or dismantled. [I-D.thubert-roll-asymlink] 1024 provides rules and mechanisms whereby an instance can be used as a 1025 fallback to another upon failure to forward a packet further.
The 1026 autonomous instance should always be active and during normal 1027 operations it should be maintained through local repair 1028 mechanisms. In normal operation global repairs should be 1029 sparingly employed in order to conserve batteries. But a global 1030 repair is also probably the fastest and most economical technique 1031 in the case the network is extensively damaged. It is recommended 1032 to rely on automation that will trigger a global repair upon the 1033 detection of a large scale incident such as an explosion or a 1034 crash. As the name suggests, the autonomous instance is formed 1035 without any dependence on the System Manager. Decisions made 1036 during the construction of the autonomous instance do not need 1037 approval from the Path Computation Engine present in the 1038 System Manager. 1040 Participation of each wireless field device in at least one instance 1041 that hosts a DODAG with a virtual root is highly recommended. 1043 Wireless industrial networks are typically composed of multiple LLNs 1044 that terminate in a LLN Border Router (LBR). The LBRs communicate 1045 with each other and with other entities present on the backbone (such 1046 as the Gateway and the System Manager) over a wired or wireless 1047 backbone infrastructure. When a device A that operates in LLN1 1048 sends a packet to a device B that operates in LLN2, the packet 1049 egresses LLN1 through LBR1 and ingresses LLN2 through LBR2 after 1050 travelling over the backbone infrastructure that connects the LBRs. 1051 In order to accommodate this packet flow that travels from one LLN to 1052 another, it is highly recommended that wireless field devices 1053 participate in at least one instance that has a DODAG with a virtual 1054 root. 1056 4.1.2. Storing vs. Non-Storing Mode 1058 In general, storing mode is required for high-reporting-rate devices 1059 (where "high rate" is with respect to the underlying link data 1060 conveyance capability).
Such devices, in the absence of path failure, 1061 are typically only one hop from the LBR(s) that convey their 1062 messaging to other parts of the system. Fortunately, in such cases, 1063 the routing tables required by such nodes are small, even when they 1064 include information on DODAGs that are used as backup alternate 1065 routes. 1067 Deeper multi-hop wireless LLNs (hop count > 1) should support storing 1068 mode in order to minimize the overhead associated with source routing 1069 given the limited header capacity associated with typical physical 1070 layers employed in wireless LLNs. Support for storing mode requires 1071 additional RAM resources be present in the constrained wireless 1072 field devices. Typical wireless LLNs scale to a maximum of one 1073 hundred field devices. Hence the appropriate RAM resources for 1074 supporting storing mode should be part of the hardware requirements 1075 imposed upon wireless field devices during the design phase. 1077 The ISA100.11a standard mandates that all LBRs maintain routing 1078 tables with enough capacity to accommodate operation in storing mode. 1079 The standard also mandates that all wireless field devices maintain 1080 routing tables but it does not make any capacity assumptions, 1081 allowing for null routing tables. The System Manager should read the 1082 routing table capacity of each wireless field router and LBR during 1083 their join phase, and determine if support for storing mode in a 1084 particular LLN is feasible. 1086 Lack of support for storing mode is also detrimental to battery 1087 operated wireless field devices due to the power consumption 1088 associated with transporting the hefty headers associated with source 1089 routing. Support for storing mode also ensures path redundancy which 1090 in turn allows for better prediction of the latency associated with 1091 downward traffic flows.
Guaranteed latencies are of paramount 1092 importance for various traffic flows in wireless industrial LLNs. 1094 4.1.3. DAO Policy 1096 Support for both upward and downward traffic flows is a requirement 1097 in industrial automation systems. As a result, nodes send DAO 1098 messages to establish downward paths from the root to themselves. 1099 DAO messages are not acknowledged in wireless industrial LLNs that 1100 are composed of battery operated field devices in order to minimize 1101 the power consumption overhead associated with path discovery. Given 1102 that wireless field devices in LLNs will typically participate in 1103 multiple RPL instances and DODAGs, it is highly recommended that both 1104 the RPLInstance ID and the DODAGID be included in the DAO. 1106 4.1.4. Path Metrics 1108 RPL relies on an Objective Function for selecting parents and 1109 computing path costs and rank. This objective function is decoupled 1110 from the core RPL mechanisms and also from the metrics in use in the 1111 network. Two objective functions for RPL have been defined at the 1112 time of this writing, the RPL Objective Function 0 [RFC6552] and the 1113 Minimum Rank with Hysteresis Objective Function [RFC6719], both of 1114 which define a selection method for a preferred parent and backup 1115 parents, and are suitable for industrial automation network 1116 deployments. 1118 4.1.5. Objective Function 1120 Industrial wireless LLNs are subject to swift variations in terms of 1121 the propagation of the wireless signal, variations that can affect 1122 the quality of the links between field devices. This is due to the 1123 nature of the environment in which they operate which can be 1124 characterized as metal jungles that cause wireless propagation 1125 distortions, multi-path fading and scattering. Hence support for 1126 hysteresis is needed in order to ensure relative link stability which 1127 in turn ensures route stability.
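The role of hysteresis in parent selection can be sketched as follows. This is an illustration in the spirit of the hysteresis objective function [RFC6719], not a reproduction of it: the function name, the dictionary of path costs, and the threshold value of 192 are assumptions made for this example. A node keeps its current preferred parent unless a candidate improves the path cost by at least the threshold.

```python
from typing import Dict, Optional


def select_parent(current: Optional[str],
                  path_costs: Dict[str, int],
                  switch_threshold: int = 192) -> str:
    """Return the preferred parent given per-candidate path costs.

    current:          name of the present preferred parent, or None
    path_costs:       candidate parent -> path cost through it
    switch_threshold: minimum improvement required to switch
                      (illustrative value, an assumption)
    """
    best = min(path_costs, key=path_costs.get)
    # With no usable current parent, take the lowest-cost candidate.
    if current is None or current not in path_costs:
        return best
    # Switch only when the improvement reaches the threshold; smaller
    # fluctuations are ignored to keep routes stable.
    if path_costs[current] - path_costs[best] >= switch_threshold:
        return best
    return current
```

With a current parent "A" at cost 512, a candidate "B" at cost 400 does not trigger a switch under this threshold, while a candidate at cost 300 does. Short-lived fading that briefly favors another neighbor therefore does not cause route churn.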
1129 As mentioned in previous sections of this document, different traffic 1130 flows require different optimization goals. Wireless field devices 1131 should participate in multiple instances associated with multiple 1132 objective functions. 1134 Management Instance: Should utilize an objective function that 1135 focuses on optimization of latency and data reliability. 1137 Operational instance: Should utilize an objective function that 1138 focuses on data reliability and minimizing aggregate power 1139 consumption for battery operated field devices. 1141 Autonomous instance: Should utilize an objective function that 1142 optimizes data latency. The primary purpose of the autonomous 1143 instance is as a fallback instance in case the operational 1144 instance fails. Data latency is hence paramount for ensuring that 1145 the wireless field devices can exchange packets in order to repair 1146 the operational instance. 1148 More complex objective functions are needed that take in 1149 consideration multiple constraints and utilize weighted sums of 1150 multiple additive and multiplicative metrics. Additional objective 1151 functions specifically designed for such networks may be defined in 1152 companion RFCs. 1154 4.1.6. DODAG Repair 1156 To effectively handle time-varying link characteristics and 1157 availability, industrial automation network deployments SHOULD 1158 utilize the local repair mechanisms in RPL. 1160 Local repair is triggered by broken link detection, and in storing 1161 mode also by loop detection. 1163 The first local repair mechanism consists of a node detaching from a 1164 DODAG and then re-attaching to the same or to a different DODAG at a 1165 later time. While detached, a node advertises an infinite rank value 1166 so that its children can select a different parent. This process is 1167 known as poisoning and is described in Section 8.2.2.5 of [RFC6550]. 
1168 While RPL provides an option to form a local DODAG, doing so in 1169 industrial automation network deployments is of little benefit since 1170 applications typically communicate through a LBR. After the detached 1171 node has made sufficient effort to send notification to its children 1172 that it is detached, the node can rejoin the same DODAG with a higher 1173 rank value. The configured duration of the poisoning mechanism needs 1174 to take into account the disconnection time applications running over 1175 the network can tolerate. Note that when joining a different DODAG, 1176 the node need not perform poisoning. 1178 The second local repair mechanism controls how much a node can 1179 increase its rank within a given DODAG Version (e.g., after detaching 1180 from the DODAG as a result of broken link or loop detection). 1181 Setting the DAGMaxRankIncrease to a non-zero value enables this 1182 mechanism, and setting it to a value of less than infinity limits the 1183 cost of count-to-infinity scenarios when they occur, thus controlling 1184 the duration of disconnection applications may experience. 1186 4.1.7. MPL Profile 1188 The applicability of MPL is left to be determined. There is a 1189 potential for Source/Sink flows in order to control the flooding 1190 incurred by alarms and alerts. 1192 4.1.8. Security 1194 Industrial automation network deployments typically operate in areas 1195 that provide limited physical security (relative to the risk of 1196 attack). For this reason, the link layer, transport layer and 1197 application layer technologies utilized within such networks 1198 typically provide security mechanisms to ensure authentication, 1199 confidentiality, integrity, timeliness and freshness. As a result, 1200 such deployments may not need to implement RPL's security mechanisms 1201 and could rely on link layer and higher layer security features. 1203 4.1.9. 
P2P communications 1205 There is definitely a need for route optimizations for the close 1206 control loops that sustain the automation systems. [I-D.thubert- 1207 6tisch-architecture] discusses the applicability of a central routing 1208 computation based on a Path Computation Element (PCE), which would be 1209 the natural IETF correspondent to the System Managers or Network 1210 Managers that can be found in existing industrial standards. 1212 The RPL point to point extension/optimization [RFC6997] 1213 (experimental) or its standard track successor may be used as well to 1214 establish on-demand paths or repair existing ones. 1216 4.2. Layer-two features 1218 This section defers to work that is taking place at the 6TiSCH WG. In 1219 particular [I-D.wang-6tisch-6top] defines the Link Layer Control 1220 (LLC) operation that sustains RPL and IPv6 whereas [I-D.vilajosana- 1221 6tisch-minimal] specifies a minimal RPL operation based on a static 1222 TSCH schedule. 1224 4.3. Recommended Configuration Defaults and Ranges 1226 4.3.1. Trickle Parameters 1228 Trickle was designed to be density-aware and perform well in networks 1229 characterized by a wide range of node densities. The combination of 1230 DIO packet suppression and adaptive timers for sending updates allows 1231 Trickle to perform well in both sparse and dense environments. 1233 Node densities in industrial automation network deployments can vary 1234 greatly, from nodes having only one or a handful of neighbors to 1235 nodes having several hundred neighbors. In high density 1236 environments, relatively low values for Imin may cause a short period 1237 of congestion when an inconsistency is detected and DIO updates are 1238 sent by a large number of neighboring nodes nearly simultaneously. 1239 While the Trickle timer will back off exponentially, some time may 1240 elapse before the congestion subsides.
Although some link layers 1241 employ contention mechanisms that attempt to avoid congestion, 1242 relying solely on the link layer to avoid congestion caused by a 1243 large number of DIO updates can result in increased communication 1244 latency for other control and data traffic in the network. 1246 To mitigate this kind of short-term congestion, this document 1247 recommends a more conservative set of values for the Trickle 1248 parameters than those specified in [RFC6206]. In particular, 1249 DIOIntervalMin is set to a larger value to avoid periods of 1250 congestion in dense environments, and DIORedundancyConstant is 1251 parameterized accordingly as described below. These values are 1252 appropriate for the timely distribution of DIO updates in both sparse 1253 and dense scenarios while avoiding the short-term congestion that 1254 might arise in dense scenarios. 1256 Because the actual link capacity depends on the particular link 1257 technology used within an industrial automation network deployment, 1258 the Trickle parameters are specified in terms of the link's maximum 1259 capacity for conveying link-local multicast messages. If the link 1260 can convey m link-local multicast packets per second on average, the 1261 expected time it takes to transmit a link-local multicast packet is 1 1262 /m seconds. 1264 DIOIntervalMin: Industrial automation network deployments SHOULD set 1265 DIOIntervalMin such that the Trickle Imin is at least 50 times as 1266 long as it takes to convey a link-local multicast packet. This value 1267 is larger than that recommended in [RFC6206] to avoid congestion in 1268 dense plant deployments as described above. 1270 DIOIntervalDoublings: Industrial automation network deployments 1271 SHOULD set DIOIntervalDoublings such that the Trickle Imax is at 1272 least TBD minutes. 1274 DIORedundancyConstant: Industrial automation network deployments 1275 SHOULD set DIORedundancyConstant to a value of at least 10.
   This is due to the larger chosen value for DIOIntervalMin and the
   proportional relationship between Imin and k suggested in
   [RFC6206].  The increase is intended to compensate for the
   increased communication latency of DIO updates caused by the larger
   DIOIntervalMin value.  The proportional relationship between Imin
   and k suggested in [RFC6206] is not fully preserved, however:
   DIORedundancyConstant is set lower than strict proportionality
   would require, in order to reduce the number of packet
   transmissions in dense environments.

4.3.2.  Other Parameters

   None identified at this time.  Further work is required to refine
   this analysis.

5.  Manageability Considerations

   RPL enables automatic and consistent configuration of RPL routers
   through parameters specified by the DODAG root and disseminated
   through DIO packets.  The use of Trickle for scheduling DIO
   transmissions ensures lightweight yet timely propagation of
   important network and parameter updates, and allows network
   operators to choose the trade-off they are comfortable with between
   overhead and the reliability and timeliness of network updates.

   The metrics in use in the network, along with the Trickle Timer
   parameters used to control the frequency and redundancy of network
   updates, can be dynamically varied by the root during the lifetime
   of the network.  To that end, all DIO messages SHOULD contain a
   Metric Container option for disseminating the metrics and metric
   values used for DODAG setup.  In addition, DIO messages SHOULD
   contain a DODAG Configuration option for disseminating the Trickle
   Timer parameters throughout the network.
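
   As an illustration only, the Trickle recommendations of Section
   4.3.1 can be turned into DODAG Configuration option values with a
   short calculation.  The sketch below assumes the [RFC6550] encoding,
   in which Imin is (2^DIOIntervalMin) ms and Imax is Imin doubled
   DIOIntervalDoublings times.  The function name, its arguments, and
   in particular the Imax target are hypothetical: this document
   leaves the Imax value as TBD, so the caller must supply one.

```python
def _ceil_log2(x):
    """Return the smallest non-negative integer n such that 2**n >= x."""
    n = 0
    while 2 ** n < x:
        n += 1
    return n


def trickle_parameters(m, imax_minutes, k=10, factor=50):
    """Derive DIO Trickle settings from the link's multicast capacity.

    m            -- link-local multicast packets per second the link can
                    convey on average (so one packet takes 1/m seconds)
    imax_minutes -- HYPOTHETICAL target for Imax; the draft leaves it TBD
    k            -- DIORedundancyConstant (the draft says at least 10)
    factor       -- Imin must be at least this many packet times (50)
    """
    if m <= 0 or imax_minutes <= 0:
        raise ValueError("m and imax_minutes must be positive")
    if k < 10:
        raise ValueError("DIORedundancyConstant should be at least 10")

    # Imin target: `factor` packet times, expressed in milliseconds.
    imin_target_ms = factor * 1000.0 / m

    # RFC 6550 encodes Imin as 2**DIOIntervalMin ms, so round the
    # exponent up to stay at or above the target.
    dio_interval_min = _ceil_log2(imin_target_ms)
    imin_ms = 2 ** dio_interval_min

    # Imax = Imin * 2**DIOIntervalDoublings; again round up.
    imax_target_ms = imax_minutes * 60 * 1000
    doublings = _ceil_log2(imax_target_ms / imin_ms)

    return {
        "DIOIntervalMin": dio_interval_min,
        "Imin_ms": imin_ms,
        "DIOIntervalDoublings": doublings,
        "Imax_ms": imin_ms * 2 ** doublings,
        "DIORedundancyConstant": k,
    }


if __name__ == "__main__":
    # Example: a link conveying 50 multicast packets per second and an
    # assumed (not draft-specified) Imax target of 20 minutes.
    print(trickle_parameters(m=50, imax_minutes=20))
```

   Because both intervals are powers of two of a millisecond base, the
   resulting Imin and Imax generally exceed their targets; rounding up
   rather than down keeps the "at least" wording of the
   recommendations satisfied.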

   The possibility of dynamically updating the metrics in use in the
   network, as well as the frequency of network updates, allows
   deployment characteristics (e.g., network density) to be discovered
   during network bring-up and used to tailor network parameters once
   the network is operational, rather than having to rely on precise
   pre-configuration.  This also allows the network parameters and the
   overall routing protocol behavior to evolve during the lifetime of
   the network.

   RPL specifies a number of variables and events that can be tracked
   for purposes of network fault and performance monitoring of RPL
   routers.  Depending on the memory and processing capabilities of
   each device, various subsets of these can be employed in the field.

6.  Security Considerations

   Industrial automation network deployments typically operate in
   areas that provide limited physical security (relative to the risk
   of attack).  For this reason, the link-layer, transport-layer and
   application-layer technologies utilized within such networks
   typically provide security mechanisms to ensure authentication,
   confidentiality, integrity, timeliness and freshness.  As a result,
   such deployments may not need to implement RPL's security
   mechanisms and could rely on link-layer and higher-layer security
   features.

   This document does not specify operations that could introduce new
   threats.  Security considerations for RPL deployments are to be
   developed in accordance with recommendations laid out in, for
   example, [I-D.tsao-roll-security-framework].

   Industrial automation networks are subject to stringent security
   requirements as they are considered a critical infrastructure
   component.
   At the same time, since they are composed of large numbers of
   resource-constrained devices interconnected with limited-throughput
   links, many available security mechanisms are not practical for use
   in such networks.  As a result, the choice of security mechanisms
   is highly dependent on the device and network capabilities
   characterizing a particular deployment.

   In contrast to other types of LLNs, in industrial automation
   networks centralized administrative control and access to a
   permanent secure infrastructure are available.  As a result,
   link-layer, transport-layer and/or application-layer security
   mechanisms are typically in place and may make use of RPL's secure
   mode unnecessary.

6.1.  Security Considerations during initial deployment

6.2.  Security Considerations during incremental deployment

7.  Other Related Protocols

8.  IANA Considerations

   This specification has no requirement on IANA.

9.  Acknowledgements

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2.  Informative References

   [I-D.ietf-roll-terminology]
              Vasseur, J., "Terminology in Low power And Lossy
              Networks", Internet-Draft draft-ietf-roll-terminology-12,
              March 2013.

   [RFC2887]  Handley, M., Floyd, S., Whetten, B., Kermode, R.,
              Vicisano, L. and M. Luby, "The Reliable Multicast Design
              Space for Bulk Data Transfer", RFC 2887, August 2000.

   [RFC5548]  Dohler, M., Watteyne, T., Winter, T. and D. Barthel,
              "Routing Requirements for Urban Low-Power and Lossy
              Networks", RFC 5548, May 2009.

   [RFC5826]  Brandt, A., Buron, J. and G. Porcu, "Home Automation
              Routing Requirements in Low-Power and Lossy Networks",
              RFC 5826, April 2010.

   [RFC5867]  Martocci, J., De Mil, P., Riou, N. and W.
              Vermeylen, "Building Automation Routing Requirements in
              Low-Power and Lossy Networks", RFC 5867, June 2010.

   [RFC5673]  Pister, K., Thubert, P., Dwars, S. and T. Phinney,
              "Industrial Routing Requirements in Low-Power and Lossy
              Networks", RFC 5673, October 2009.

   [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O. and J. Ko,
              "The Trickle Algorithm", RFC 6206, March 2011.

   [RFC6550]  Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R.,
              Levis, P., Pister, K., Struik, R., Vasseur, JP. and R.
              Alexander, "RPL: IPv6 Routing Protocol for Low-Power and
              Lossy Networks", RFC 6550, March 2012.

   [RFC6552]  Thubert, P., "Objective Function Zero for the Routing
              Protocol for Low-Power and Lossy Networks (RPL)", RFC
              6552, March 2012.

   [RFC6719]  Gnawali, O. and P. Levis, "The Minimum Rank with
              Hysteresis Objective Function", RFC 6719, September 2012.

   [RFC6997]  Goyal, M., Baccelli, E., Philipp, M., Brandt, A. and J.
              Martocci, "Reactive Discovery of Point-to-Point Routes in
              Low-Power and Lossy Networks", RFC 6997, August 2013.

   [I-D.thubert-roll-asymlink]
              Thubert, P., "RPL adaptation for asymmetrical links",
              Internet-Draft draft-thubert-roll-asymlink-02, December
              2011.

   [I-D.thubert-6lo-forwarding-fragments]
              Thubert, P. and J. Hui, "LLN Fragment Forwarding and
              Recovery", Internet-Draft
              draft-thubert-6lo-forwarding-fragments-00, October 2013.

   [I-D.thubert-6tisch-architecture]
              Thubert, P., Assimiti, R. and T. Watteyne, "An
              Architecture for IPv6 over the TSCH mode of
              IEEE802.15.4e", Internet-Draft
              draft-thubert-6tisch-architecture-00, October 2013.

   [I-D.tsao-roll-security-framework]
              Tsao, T., Alexander, R., Daza, V. and A. Lozano, "A
              Security Framework for Routing over Low Power and Lossy
              Networks", Internet-Draft
              draft-tsao-roll-security-framework-02, March 2010.

   [I-D.watteyne-6tisch-tsch]
              Watteyne, T., "Using IEEE802.15.4e TSCH in an LLN
              context: Overview, Problem Statement and Goals",
              Internet-Draft draft-watteyne-6tisch-tsch-00, October
              2013.

   [I-D.wang-6tisch-6top]
              Wang, Q., Vilajosana, X. and T. Watteyne, "6TiSCH
              Operation Sublayer (6top)", Internet-Draft
              draft-wang-6tisch-6top-00, October 2013.

   [I-D.vilajosana-6tisch-minimal]
              Vilajosana, X. and K. Pister, "Minimal 6TiSCH
              Configuration", Internet-Draft
              draft-vilajosana-6tisch-minimal-00, October 2013.

10.3.  External Informative References

   [HART]     www.hartcomm.org, "Highway Addressable Remote Transducer,
              a group of specifications for industrial process and
              control devices administered by the HART Foundation".

   [ISA100.11a]
              ISA, "ISA100, Wireless Systems for Automation", May 2008,
              <http://www.isa.org/Community/
              SP100WirelessSystemsforAutomation>.

   [ZigBeeIP]
              ZigBee Public Document 15-002r00, "ZigBee IP
              Specification", 2013.

Authors' Addresses

   Tom Phinney, editor
   consultant
   5012 W. Torrey Pines Circle
   Glendale, AZ 85308-3221
   USA

   Phone: +1 602 938 3163
   Email: tom.phinney@cox.net

   Pascal Thubert
   Cisco Systems, Inc
   Building D
   45 Allee des Ormes - BP1200
   MOUGINS - Sophia Antipolis, 06254
   FRANCE

   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com

   Robert Assimiti
   Nivis
   1000 Circle 75 Parkway SE, Ste 300
   Atlanta, GA 30339
   USA

   Phone: +1 678 202 6859
   Email: robert.assimiti@nivis.com