ROLL                                                     T. Phinney, Ed.
Internet-Draft                                                consultant
Intended status: Informational                                P. Thubert
Expires: March 11, 2014                                            cisco
                                                            RA. Assimiti
                                                                   Nivis
                                                      September 09, 2013


                RPL applicability in industrial networks
            draft-ietf-roll-rpl-industrial-applicability-01

Abstract

   The wide deployment of wireless devices, with their low installed
   cost (compared to wired devices), will significantly improve the
   productivity and safety of industrial plants. It will simultaneously
   increase the efficiency and safety of the plant's workers, by
   extending and making more timely the information set available about
   plant operations. The new Routing Protocol for Low Power and Lossy
   Networks (RPL) defines a Distance Vector protocol that is designed
   for such networks. The aim of this document is to analyze the
   applicability of that routing protocol in industrial LLNs formed of
   field devices.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 11, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Requirements Language
     1.2. Required Reading
     1.3. Out of scope requirements
   2. Deployment Scenario
     2.1. Network Topologies
       2.1.1. Traffic Characteristics
       2.1.2. Topologies
       2.1.3. Source-sink (SS) communication paradigm
       2.1.4. Publish-subscribe (PS, or pub/sub) communication paradigm
       2.1.5. Peer-to-peer (P2P) communication paradigm
       2.1.6. Peer-to-multipeer (P2MP) communication paradigm
       2.1.7. Additional considerations: Duocast and N-cast
       2.1.8. RPL applicability per communication paradigm
     2.2. Layer 2 applicability.
   3. Using RPL to Meet Functional Requirements
   4. RPL Profile
     4.1. RPL Features
       4.1.1. RPL Instances
       4.1.2. Storing vs. Non-Storing Mode
       4.1.3. DAO Policy
       4.1.4. Path Metrics
       4.1.5. Objective Function
       4.1.6. DODAG Repair
       4.1.7. Multicast
       4.1.8. Security
       4.1.9. P2P communications
     4.2. Layer-two features
       4.2.1. Need layer-2 expert here.
       4.2.2. Security functions provided by layer-2.
       4.2.3. 6LowPAN options assumed.
       4.2.4. MLE and other things
     4.3. Recommended Configuration Defaults and Ranges
       4.3.1. Trickle Parameters
       4.3.2. Other Parameters
   5. Manageability Considerations
   6. Security Considerations
     6.1. Security Considerations during initial deployment
     6.2. Security Considerations during incremental deployment
   7. Other Related Protocols
   8. IANA Considerations
   9. Acknowledgements
   10. References
     10.1. Normative References
     10.2. Informative References
     10.3. External Informative References
   Authors' Addresses

1. Introduction

   Information Technology (IT) is already, and increasingly will be,
   applied to Industrial Automation and Control System (IACS) technology
   in application areas where those IT technologies can be constrained
   sufficiently by Service Level Agreements (SLA) or other modest
   changes that they are able to meet the operational needs of IACS.
   When that happens, the IACS benefits from the large intellectual,
   experiential and training investment that has already occurred in
   those IT precursors. One can conclude that future reuse of additional
   IT protocols for IACS will continue to occur due to the significant
   intellectual, experiential and training economies which result from
   that reuse.

   Following that logic, many vendors are already extending or replacing
   their local field-bus technology with Ethernet and IP-based
   solutions. Examples of this evolution include CIP EtherNet/IP,
   Modbus/TCP, Foundation Fieldbus HSE, PROFInet and Invensys/Foxboro
   FOXnet. At the same time, wireless, low power field devices are
   being introduced that facilitate a significant increase in the amount
   of information which industrial users can collect and the number of
   control points that can be remotely managed.

   IPv6 appears as a core technology at the conjunction of both trends,
   as illustrated by the current [ISA100.11a] industrial Wireless Sensor
   Networking (WSN) specification, where layers 1-4 technologies
   developed for end uses other than IACS - IEEE 802.15.4 PHY and MAC,
   6LoWPAN and IPv6, and UDP - are adapted to IACS use. But due to the
   lack of open standards for routing in Low power and Lossy Networks
   (LLN) at the time ISA100.11a was crafted, routing was accomplished at
   the link layer and is specific to that standard.

   The IETF ROLL Working Group has defined application-specific routing
   requirements for an LLN routing protocol, specified in:

      Routing Requirements for Urban LLNs [RFC5548],

      Industrial Routing Requirements in LLNs [RFC5673],

      Home Automation Routing Requirements in LLNs [RFC5826], and

      Building Automation Routing Requirements in LLNs [RFC5867].

   The Routing Protocol for Low Power and Lossy Networks (RPL)
   [RFC6550] specification and its point-to-point extension/optimization
   [RFC6997] define a generic Distance Vector protocol that is adapted
   to a variety of Low Power and Lossy Network (LLN) types by the
   application of specific Objective Functions (OFs). RPL forms
   Destination Oriented Directed Acyclic Graphs (DODAGs) within
   instances of the protocol, each instance being associated with an
   Objective Function to form a routing topology.

   A field device that belongs to an instance uses the OF to determine
   which DODAG and which Version of that DODAG the device should join.
   The device also uses the OF to select a number of routers within the
   DODAG's current and subsequent Versions to serve as parents or as
   feasible successors. A new Version of the DODAG is periodically
   reconstructed to enable a global reoptimization of the graph.

   A RPL OF states the outcome of the process used by a RPL node to
   select and optimize routes within a RPL Instance based on the
   information objects available. The separation of OFs from the core
   protocol specification allows RPL to be adapted to meet the different
   optimization criteria required by the wide range of industrial
   classes of traffic and applications.

   This document provides information on how RPL can accommodate the
   industrial requirements for LLNs, in particular as specified in
   [RFC5673].

1.1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119 [RFC2119].

   Additionally, this document uses terminology from
   [I-D.ietf-roll-terminology], and uses the usual terminology of the
   Process Control and Factory Automation industries, some of which is
   recapitulated below:

   FEC: Forward error correction

   IACS: Industrial automation and control systems

   RAND: Reasonable and non-discriminatory (relative to licensing of
      patents)

1.2. Required Reading

1.3. Out of scope requirements

   This applicability statement does not address requirements related to
   wireless LLNs employed in factory automation and related
   applications.

2. Deployment Scenario

   [RFC5673] describes in detail the routing requirements for industrial
   LLNs. This document provides information on the varying deployment
   scenarios for such LLNs and how RPL assists in meeting those
   requirements.

   Large industrial plants, or major operating areas within such plants,
   repeatedly go through four major phases, each of which typically
   lasts from months to years:

   P1: Construction or major modification phase

   P2: Planned startup phase

   P3: Normal operation phase

   P4: Planned shutdown phase

   followed eventually by an (at least theoretical)

   P5: Plant decommissioning phase.

   It is also likely, after a major catastrophe at a plant, to have a

   P6: Post-emergency recovery and repair phase.

   The deployment scenarios for wireless LLN devices may be different in
   each of these phases. In particular, during the Construction or
   major modification phase (P1), LLN devices may be installed months
   before the intended LLN can become usefully operational (because
   needed routers and infrastructure devices are not yet installed or
   active), and there are likely to be many personnel in whom the plant
   owner/operator has only limited trust, such as subcontractors and
   others in the plant area who have undergone only a cursory background
   investigation (if any at all). In general, during this phase, plant
   instrumentation is not yet operational, so a device could be removed
   and replaced by a Trojaned device without much likelihood of physical
   detection of the substitution. Thus physical security of LLN devices
   is generally a more significant risk factor during this phase than
   once the plant is operational, where simple replacement of device
   electronics is detectable.

   Extra LLN devices and even extra LLN subnets may be employed during
   Planned startup (P2) and Planned shutdown (P4) phases, in support of
   the task of transitioning the plant or plant area between operational
   and shutdown states. The extra devices typically provide extra
   monitoring as the plant transitions through infrequent activity
   states. (In many continuous process plants, up to 2x extra staff are
   employed at monitoring and control workstations during these two
   phases, precisely because the plant is undergoing extraordinary
   behavior as it transitions to or from its steady-state operational
   condition.)

   Similar transient devices and subnets may be used during an
   unscheduled Post-emergency recovery and repair phase (P6) of
   operation, but in that case the extra devices usually are routers
   substituting for plant LLN devices that have been damaged by the
   incident (such as a fire, explosion, flood, tornado or hurricane)
   that induced the emergency.

   The Planned startup (P2) and Planned shutdown (P4) phases are similar
   in many respects, but the LLN environment of the two can be quite
   different, since the Planned shutdown phase can assume that the
   stable LLN environment used for Normal operation (P3) is functional
   during shutdown, whereas that stable environment usually is still
   being established during startup.

   The Post-emergency recovery and repair phase (P6) typically operates
   in an LLN environment that is somewhere between that of the Planned
   startup (P2) and Normal operation (P3) phases, but with an
   indeterminate number of temporary routers placed to facilitate
   communication across and around the area affected by the catastrophe.

   Smaller industrial plants and sites may go through similar phases,
   but often commingle the phases because, in those smaller plants, the
   phases require less planning and structuring of personnel
   responsibilities and thus permit less formalization and partitioning
   of the operating scenarios. For example, it is much simpler, and
   usually requires much less planning, to bring new equipment on a skid
   into a plant, using a forklift, than to lay temporary railroad track
   or employ an extended-axle heavy haul tractor-trailer to deliver a
   multi-ton process vessel, and temporarily deploy and use very large
   heavy-lift cranes to install it. In the former case, nearby
   equipment usually can continue normal operation while the
   installation proceeds; in the latter case that is almost always
   impossible, due to safety and other concerns.

   The domain of applicability for the RPL protocol may include all
   phases except the Normal operation phase (P3), where the bandwidth
   allocation and the routes are usually optimized by an external Path
   Computing Engine (PCE), e.g., an ISA100.11a System Manager.

   Additionally, it could be envisioned to include RPL in normal
   operation, provided that a new Objective Function is defined that
   actually interacts with the PCE in order to establish the reference
   topology. In that case, RPL operations would apply only to emergency
   repair actions, when the reference topology becomes unusable because
   of some failure, and for as long as the problem persists.

2.1. Network Topologies

2.1.1. Traffic Characteristics

   The industrial market classifies process applications into three
   broad categories and six classes.

   o  Safety

      *  Class 0: Emergency action - Always a critical function

   o  Control

      *  Class 1: Closed loop regulatory control - Often a critical
         function

      *  Class 2: Closed loop supervisory control - Usually non-critical
         function

      *  Class 3: Open loop control - Operator takes action and controls
         the actuator (human in the loop)

   o  Monitoring

      *  Class 4: Alerting - Short-term operational effect (for example,
         event-based maintenance)

      *  Class 5: Logging and downloading / uploading - No immediate
         operational consequence (e.g., history collection, sequence-of-
         events, preventive maintenance)

   Safety critical functions affect the basic safety integrity of the
   plant. These normally dormant functions kick in only when process
   control systems, or their operators, have failed. By design and by
   regular interval inspection, they have a well-understood probability
   of failure on demand, typically in the range of once per 10 to 1000
   years.

   In-time delivery of messages becomes more relevant as the class
   number decreases.

   Note that for a control application, jitter is just as important as
   latency and has the potential to destabilize control algorithms.

   The domain of applicability for the RPL protocol probably matches the
   range of classes where industrial users are interested in deploying
   wireless networks. This domain includes monitoring classes (4 and
   5), and the non-critical portions of control classes (2 and 3). RPL
   might also be considered as an additional repair mechanism in all
   situations, independently of the flow classification and the medium
   type.

   It appears from the above sections that whether and how RPL can be
   applied for a given flow depends both on the deployment scenario and
   on the class of application / traffic. At a high level, this can be
   summarized by the following matrix:

   +---------------------+------------------------------------------------+
   |    Phase \ Class    |  0        1        2        3        4      5  |
   +=====================+================================================+
   |    Construction     |                    X        X        X      X  |
   +---------------------+------------------------------------------------+
   |   Planned startup   |                    X        X        X      X  |
   +---------------------+------------------------------------------------+
   |  Normal operation   |                             ?        ?      ?  |
   +---------------------+------------------------------------------------+
   |  Planned shutdown   |                    X        X        X      X  |
   +---------------------+------------------------------------------------+
   |Plant decommissioning|                    X        X        X      X  |
   +---------------------+------------------------------------------------+
   | Recovery and repair |  X        X        X        X        X      X  |
   +---------------------+------------------------------------------------+

   ? : typically usable for all but higher-rate classes 0,1 PS traffic

2.1.2. Topologies

   In an IACS, high-rate communications flows (e.g., 1 Hz or 4 Hz for a
   traditional process automation network) typically are such that only
   a single wireless LLN hop separates the source device from an LLN
   Border Router (LBR) to a significantly higher data-rate backbone
   network, typically based on IEEE 802.3, IEEE 802.11, or IEEE 802.16,
   as illustrated in Figure 2.

                         ---+------------------------
                            |      Plant Network
                            |
                         +-----+
                         |     | Gateway
                         |     |
                         +-----+
                            |
                            |      Backbone
       +--------------------+------------------+
       |                    |                  |
    +-----+              +-----+            +-----+
    |     | LLN border   |     | LLN border |     | LLN border
  o |     | router       |     | router     |     | router
    +-----+              +-----+            +-----+
       o                   o   o               o
    o  o  o   o         o   o   o         o   o   o   o
                           LLN

   o : stationary wireless field device, seldom acting as an LLN router

   For factory automation networks, the basic communications cycle for
   control is typically much faster, on the order of 100 Hz or more. In
   this case the LLN itself may be based on high-data-rate IEEE 802.11
   or a 100 Mbit/s or faster optical link, and the higher-rate network
   used by the LBRs to connect the LLN to superior automation equipment
   typically might be based on fiber-optic IEEE 802.3, with multiple
   LBRs around the periphery of the factory area, so that most high-rate
   communications again require only a single wireless LLN hop.

   Multi-hop LLN routing is used within the LLN portion of such networks
   to provide backup communications paths when primary single-hop LLN
   paths fail, or for lower repetition rate communications where longer
   LLN transit times and higher variance are not an issue. Typically,
   the majority of devices in an IACS can tolerate such higher-delay
   higher-variance paths, so routing choices often are driven by energy
   considerations for the affected devices, rather than simply by IACS
   performance requirements, as illustrated in Figure 3.

                         ---+------------------------
                            |      Plant Network
                            |
                         +-----+
                         |     | Gateway
                         |     |
                         +-----+
                            |
                            |      Backbone
       +--------------------+------------------+
       |                    |                  |
    +-----+              +-----+            +-----+
    |     | Backbone     |     | Backbone   |     | Backbone
    |     | router       |     | router     |     | router
    +-----+              +-----+            +-----+
      o    o   o    o     o    o   o   o    o    o   o    o   o
   o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
     o  o  o  o  o  o  o  o  o  o  o  M  o  o  o  o  o
       o  o  M  o  o  o  o  o  o  o  o  o  o  o  o  o
              o   o   o   o   o   o   o   o   o
                     o    o    o    o    o
                           LLN

   o : stationary wireless field device, often acting as an LLN router
   M : mobile wireless device

   Two decades of experience with digital fieldbuses have shown that
   four communications paradigms dominate in IACS:

   SS: Source-sink

   PS: Publish-subscribe

   P2P: Peer-to-peer

   P2MP: Peer-to-multipeer

2.1.3. Source-sink (SS) communication paradigm

   In SS, the source-sink communication paradigm, each of many devices
   in one set, S1, sends UDP-like messages, usually infrequently and
   intermittently, to a second set of devices, S2, determined by a
   common multicast address. A typical example would be that all
   devices within a given process unit N are configured to send process
   alarm messages to the multicast address
   Receivers_of_process_alarms_for_unit_N. Receiving devices, typically
   on non-LLN networks accessed via LBRs, are configured to receive such
   multicast messages if their work assignment covers process unit N,
   and not otherwise.

   Timeliness of message delivery is a significant aspect of some SS
   communication. When the SS traffic conveys process alarms or device
   alerts, there is often a contractual requirement, and sometimes even
   a regulatory requirement, on the maximum end-to-end transit delay of
   the SS message, including both the LLN and non-LLN components of that
   delay. However, there is no requirement on relative jitter in the
   delivery of multiple SS messages from the same source, and message
   reordering during transit is irrelevant.

   Within the LLN, the SS paradigm simply requires that messages so
   addressed be forwarded to the responsible LBR (or set of equivalent
   LBRs) for further forwarding outside the LLN. Within the LLN such
   traffic typically is device-to-LBR or device-to-redundant-set-of-
   equivalent-LBRs. In general, SS traffic may be aggregated before
   forwarding when both the multicast destination address and other QoS
   attributes are identical. If information on the target delivery
   times for SS messages is available to the aggregating forwarding
   device, that device may intentionally delay forwarding somewhat to
   facilitate further aggregation, which can significantly reduce LLN
   alarm-reporting traffic during major plant upset events.

2.1.4. Publish-subscribe (PS, or pub/sub) communication paradigm

   In PS, the publish-subscribe communication paradigm, a device sends
   UDP-like messages, usually periodically or cyclically (i.e.,
   repetitively but without fixed periodicity), to a single multicast
   address derived from or correlated with the device's own address. A
   typical example would be that each sensor and actuator device within
   a given process unit N is configured to send process state messages
   to the multicast address that designates its specific publications.
   In essence the derived multicast address for device D is
   Receivers_of_publications_by_device_D. Typically those receivers are
   in two categories: controllers (C) for control loops in which device
   D participates, and devices accessed via the LLN's LBRs that monitor
   and/or accumulate historical information about device D's status and
   outputs.
512 If the controller(s) that receive device D's publication are all 513 outside the LLN and accessed by LBRs, then within the LLN such 514 traffic typically is device-to-LBR or device-to-redundant-set-of- 515 equivalent-LBRs. But if a controller (Cn) is within the LLN, then a 516 number of different LLN-local traffic patterns may be employed, 517 depending on the capabilities of the underlying link technology and 518 on configured performance requirements for such reporting. Typically 519 in such a case, publication by device D is forwarded up a DODAG to an 520 LLN router that is also on a downward DODAG to a destination 521 controller Cn, then forwarded down that second DODAG to that 522 destination controller Cn. Of course, if the LLN router (or even the 523 LBR) is itself the intended destination controller, which will often 524 be the case, then no downward forwarding occurs. 526 Timeliness of message delivery is a critical aspect of PS 527 communication. Individual messages can be lost without significant 528 impact on the controlled physical process, but typically a sequence 529 of four consecutive lost messages will trigger fallback behavior of 530 the control algorithms, which is considered a system failure by most 531 system owner/operators. (In general, and unless a local catastrophic 532 event such as a major explosion or a tornado occurs in the plant, 533 invocation of more than one instance of such fallback handling per 534 year, per plant, is considered unacceptable.) 536 Message loss, delay and jitter in delivery of PS messaging is a 537 relative matter. PS messaging is used for transfer of process 538 measurements and associated status from sensors to control 539 computation elements, from control computation elements to actuators, 540 and of current commanded position and status from actuators back to 541 control computation elements. 
The actual time interval of interest 542 is that which starts with sensing of the physical process (which 543 necessarily occurs before the sensed value can be sent in the first 544 message) and which ends when the computed control correction is 545 applied to the physical process by the appropriate actuator (which 546 cannot occur until after the second message containing the computed 547 control output has been received by that actuator). With rare 548 exception, the control algorithms used with PS messaging in the 549 process automation industries - those managing continuous material 550 flows - rely on fixed-period sampling, computation and transfer of 551 outputs, while those in the factory automation industries - those 552 managing discrete manufacturing operations - rely on bounded delay 553 between sampling of inputs, control computation and transfer of 554 outputs to physical actuators that affect the controlled process. 556 Deliberately manipulated message delay and jitter in delivery of PS 557 messaging has the potential to destabilize control loops. It is the 558 responsibility of conveyed higher-level protocols to protect against 559 such potential security attacks by detecting overly delayed or 560 jittered messages at delivery, converting them into instances of 561 message loss. Thus network and data-link protocols such as IPv6 and 562 Ethernet need not themselves address such issues, although their 563 selection and employment should take the existence (or lack) of such 564 higher-layer protection mechanisms, and the resulting consequences 565 due to excessive delay and jitter, into consideration in their 566 parameterization. 568 In general, PS traffic within the LLN is not aggregated before 569 forwarding, to minimize message loss and delay in reception by any 570 relevant controller(s) that are outside the LLN. 
However, if all 571 intended destination controllers are within the LLN, and at least one 572 of those intended controllers also serves as an LLN router on a DODAG 573 to off-LLN destinations that all are not controllers, then the router 574 functions in that device may aggregate PS traffic before forwarding 575 when the required routing and other QoS attributes are identical. If 576 information on the target delivery times for PS messages to non- 577 controller devices is available to the aggregating forwarding device, 578 that device may intentionally delay forwarding somewhat to facilitate 579 further aggregation. 581 In some system architectures, message streams that use PS to convey 582 current process measurements and status are compressed at the source 583 through a 2-dimensional winnowing process that compares 585 1) the process measurement values and status of the about-to-be-sent 586 message with that of the last actually-sent message, and 588 2) the current time vs. the queueing time for the last actually-sent 589 message. 591 If the interval since that last-sent message is less than a 592 predefined maximum time, and the status is unchanged, and the process 593 measurement(s) conveyed in the message is within predefined 594 deadband(s) of the last-sent measurement value(s), then transmission 595 of the new message is suppressed. Often this suppression takes the 596 form of not queuing the new message for transmission, but in some 597 protocols a brief placeholder message indicating "no significant 598 change" is queued in its stead. 600 2.1.5. Peer-to-peer (P2P) communication paradigm 602 In P2P, the peer-to-peer communication paradigm, a device sends UDP- 603 like or TCP-like messages from one device (D1) to a second device 604 (D2), usually with bidirectional but asymmetric flow of application 605 data, where the amount of data is significantly greater in one 606 direction than the other. 
Typical examples are transfer of 607 configuration information to or from a process field device, or 608 transfer of captured process diagnostics (e.g., time-stamped noise 609 signatures from a coriolis flowmeter) to an off-LLN higher-level 610 asset management system. Unicast addressing is used in both 611 directions of data flow. 613 In general, specific P2P traffic has only loose timeliness 614 requirements, typically just those required so that response times to 615 human-operator-initiated actions meet human factors requirements. As 616 a consequence, in general, message aggregation is permitted, although 617 few opportunities are likely to present themselves for such 618 aggregation due to the sporadic nature of such messaging to a single 619 destination, and/or due to the large message payloads that often 620 occur in at least one direction of transmission. 622 2.1.6. Peer-to-multipeer (P2MP) communication paradigm 624 In P2MP, the peer-to-multipeer communication paradigm, a device sends 625 UDP-like messages downward, from one device (D1) to a set of other 626 devices (Dn). Typical examples are bulk downloads to a set of devices 627 that use identical code image segments or identically-structured 628 database segments; group commands to enable device state transitions 629 that are quasi-synchronized across all or part of the local network 630 (e.g., switch to the next set of point-to-point downloaded session 631 keys, or notifying that the network is switching to an emergency 632 repair and recovery mode); etc. Multicast addressing is used in the 633 downward direction of data flow. 635 Devices can be assigned to a number of multicast groups, for instance 636 by device type. Then, if it becomes necessary to reflash all devices 637 of a given type with a new load image, a multicast distribution 638 mechanism can be leveraged to optimize the distribution operation. 640 In general, P2MP traffic has only loose timeliness requirements. 
As 641 a consequence, in general, message aggregation is permitted, although 642 few opportunities are likely to present themselves for such 643 aggregation due to the sporadic nature of such messaging to a single 644 multicast group destination, and/or due to the large message payloads 645 that often occur when P2MP is used for group downloads. However, in 646 general, message aggregation negatively impacts the delivery success 647 rate for each of the aggregated messages, since the probability of 648 error in a received message increases with message length. Together 649 these considerations often lead to a policy of non-aggregation for 650 P2MP messaging. 652 Note: Reliable group download protocols, such as the no-longer- 653 published IEEE 802.1E (ISO/IEC 15802-4) system load protocol, and 654 reliable multicast protocols based on the guidance of [RFC2887], are 655 instructive in how P2MP can be used for initial bulk download, 656 followed by either P2MP or P2P selective retransmissions for missed 657 download segments. 659 2.1.7. Additional considerations: Duocast and N-cast 660 In industrial automation systems, some traffic is from (relatively) 661 high-rate monitoring and control loops, of Class 0 and Class 1 as 662 described in [RFC5673]. In such systems, the wireless link protocol, 663 which typically uses immediate in-band acknowledgement to confirm 664 delivery (or, on failure, conclude that a retransmission is 665 required), can be adapted to attempt simultaneous delivery to more 666 than one receiving device, with separated, sequenced immediate in- 667 band acknowledgement by each of those intended receivers. (This 668 mechanism is known colloquially as "duocast" (for two intended 669 receivers), or more generically as "N-cast" (for N intended 670 receivers).)
Transmission is deemed successful if at least one such 671 immediate acknowledgement is received by the sending device; 672 otherwise the device queues the message for retransmission, up until 673 the maximum configured number of retries has been attempted. 675 The logic behind duocast/N-cast is very simple: In wireless systems 676 without FEC (forward error correction), the overall rate of success 677 for transactions consisting of an initial transmission and an 678 immediate acknowledgement is typically 95%. In other words, 5% of 679 such transactions fail, either because the initial message of the 680 transaction is not received correctly by the intended receiver, or 681 because the immediate acknowledgment by that receiver is not received 682 correctly by the transaction initiator. 684 In the generalized case of N-cast, where any received acknowledgement 685 serves to complete the transaction, and where the N intended 686 receivers are spatially diverse, physically separated from each other 687 by multiple wavelengths, the probability that all such receivers fail 688 to receive the initial message of the transaction, or that all 689 generated immediate acknowledgements are not received by the 690 transaction initiator, is typically approximately (5%)^N. Thus, for 691 duocast, the expected success rate for a single transaction goes from 692 95% (1.0 - 0.05) to 99.75% (1.0 - 0.05^2), to 99.9875% (1.0 - 0.05^3) 693 when N=3, and even higher when N>3. 695 From the above analysis, it is obvious that the primary benefit of 696 N-cast occurs when N goes from N=1 (unicast) to N=2 (duocast); the 697 reduction in transaction loss rate for increasing N>2 is quite small, 698 and for N>3 it is infinitesimal. 
In the typical industrial 699 automation environment of class 1 process control loops, which 700 typically repeat at a 1 Hz or 4 Hz rate, in a very large process 701 plant with thousands of field devices reporting at that rate, the 702 maximum number of transmission retries that must be planned, and for 703 which capacity must be scheduled (within the requisite 250 ms or 1 s 704 interval) is seven (7) retries for unicast PS reporting, but only 705 three (3) retries with duocast PS reporting. (This is determined by 706 the requirement to not miss four successive reports more than once 707 per year, across the entire plant, as such a loss typically triggers 708 fallback behavior in the controlled loop, which is considered a 709 failure of the wireless system by the plant owner/operator.) In 710 practice, the enormous reduction in both planned and used 711 retransmission capacity provided by duocast/N-cast is what enables 4 712 Hz loops to be supported in large wireless systems. 714 When available, duocast/N-cast typically is used only for one-hop PS 715 traffic on Class 1 and Class 0 control loops. It may also be 716 employed for rapid, reliable one-hop delivery of Class 0 and 717 sometimes Class 1 process alarms and device alerts, which use the SS 718 paradigm. Because it requires scheduling of multiple receivers that 719 are prepared to acknowledge the received message during the 720 transaction, in general it is not appropriate for the other types of 721 traffic in such systems - P2P and P2MP - and is not needed for other 722 classes of control loops or other types of traffic, which do not have 723 such stringent reporting requirements. 
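The N-cast success-rate arithmetic given earlier in this section can be reproduced with a short calculation. The following Python fragment is only an illustrative sketch: the function name is hypothetical, and it assumes, as the text does, that the receivers are spatially diverse so that their failures are independent and that a single transaction fails about 5% of the time.

```python
def ncast_success_rate(n_receivers: int, single_failure: float = 0.05) -> float:
    """Approximate success rate of one N-cast transaction.

    Assumes independent failures across the N intended receivers, so the
    transaction fails only if every receiver's exchange fails.
    """
    if n_receivers < 1:
        raise ValueError("at least one intended receiver is required")
    return 1.0 - single_failure ** n_receivers


# Unicast (N=1), duocast (N=2) and N=3, as discussed in the text.
for n in (1, 2, 3):
    print(f"N={n}: {ncast_success_rate(n):.4%}")
```

The step from N=1 to N=2 removes most of the residual loss, which is consistent with the observation that larger values of N add little.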
725 Note: Although there are known patent applications for duocast and 726 N-cast, at the time of this writing the patent assignee, Honeywell 727 International, has offered to permit cost-free RAND use in those 728 industrial wireless standards that have chosen to employ the 729 technology, under a reciprocal licensing requirement relative to that 730 use. Since duocast and N-cast provide performance and energy 731 optimizations, they are not essential for use in wireless systems. 732 However, in practice, their use makes it possible to support 4 Hz 733 wireless loops and meet sub-second safety alarm reporting 734 requirements in large plants, where that might otherwise be 735 impractical without use of a wired network. When duocast/N-cast is 736 not employed, the wireless retransmission capacity that is needed to 737 support such fast loops often is excessive, typically over 100x that 738 actually used for retransmission (i.e., providing for seven retries 739 per transaction when the mean number used is only 0.06 retries). 741 2.1.8. RPL applicability per communication paradigm 743 To match the requirements above, RPL provides a number of RPL Modes 744 of Operation (MOP): 746 No downward route: defined in [RFC6550], section 6.3.1, MOP of 0. 747 This mode allows only upward routing, that is from 748 nodes (devices) that reside inside the RPL network 749 toward the outside via the DODAG root. 751 Non-storing mode: defined in [RFC6550], section 6.3.1, MOP of 1. This 752 mode improves MOP 0 by adding the capability to use 753 source routing from the root towards registered 754 targets within the instance DODAG. 756 Storing mode without multicast support: defined in [RFC6550], section 757 6.3.1, MOP of 2. This mode 758 improves MOP 0 by adding the 759 capability to use stateful 760 routing from the root towards 761 registered targets within the 762 instance DODAG.
764 Storing mode with link-scope multicast DAO: defined in [RFC6550] 765 section 9.10, this mode 766 improves MOP 2 by adding 767 the capability to send 768 Destination 769 Advertisements to all 770 nodes over a single Layer 771 2 link (e.g. a wireless 772 hop) and enables line-of- 773 sight direct 774 communication. 776 Storing mode with multicast support: defined in [RFC6550], Mode-of- 777 operation (MOP) of 3. This mode 778 improves MOP 2 by adding the 779 capability to register multicast 780 groups and perform multicast 781 forwarding along the instance 782 DODAG (or a spanning subtree 783 within the DODAG). 785 Reactive: defined in [RFC6997], the reactive mode creates on-demand 786 additional DAGs that are used to reach a given node acting 787 as DODAG root within a certain number of hops. This mode 788 can typically be used for an ad-hoc closed-loop 789 communication. 791 The RPL MOP that can be applied for a given flow depends on the 792 communication paradigm. It must be noted that a DODAG that is used 793 for PS traffic can also be used for SS traffic since the MOP 2 794 extends the MOP 0, and that a DODAG that is used for P2MP 795 distribution can also be used for downward PS since the MOP 3 extends 796 the MOP 2. 798 On the other hand, an Objective Function (OF) that optimizes metrics 799 for a pure upwards DODAG might differ from the OF that optimizes a 800 mixed upward and downward DODAG. 802 As a result, it can be expected that different RPL instances are 803 installed with different OFs, different channel allocations, etc... 804 that result in different routing and forwarding topologies, sometimes 805 with differing delay vs. energy profiles, optimized separately for 806 the different flows at hand. 
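The numbered Modes of Operation listed above, and the "improves" relations stated for them, can be captured in a small data structure. This is a hedged sketch rather than an implementation of [RFC6550]: the class, table and function names are hypothetical, and only the four numbered MOP values are modeled (the link-scope multicast DAO and reactive variants carry no separate MOP number in the list above).

```python
from enum import IntEnum
from typing import Optional


class ModeOfOperation(IntEnum):
    """MOP values as enumerated in this section."""
    NO_DOWNWARD_ROUTES = 0
    NON_STORING = 1
    STORING_NO_MULTICAST = 2
    STORING_WITH_MULTICAST = 3


# "X improves Y" relations, taken from the mode descriptions above.
IMPROVES = {
    ModeOfOperation.NON_STORING: ModeOfOperation.NO_DOWNWARD_ROUTES,
    ModeOfOperation.STORING_NO_MULTICAST: ModeOfOperation.NO_DOWNWARD_ROUTES,
    ModeOfOperation.STORING_WITH_MULTICAST: ModeOfOperation.STORING_NO_MULTICAST,
}


def covers(dodag_mop: ModeOfOperation, required_mop: ModeOfOperation) -> bool:
    """Return True if a DODAG run in dodag_mop also offers required_mop."""
    mop: Optional[ModeOfOperation] = dodag_mop
    while mop is not None:
        if mop == required_mop:
            return True
        mop = IMPROVES.get(mop)  # walk down the chain of improved modes
    return False
```

For example, `covers(ModeOfOperation.STORING_WITH_MULTICAST, ModeOfOperation.STORING_NO_MULTICAST)` holds, which mirrors the statement that a DODAG used for P2MP distribution can also carry downward PS traffic.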
808 This can be broadly summarized in the following table:
810 +---------------------+------------+-----------------------------------+
811 | Paradigm\RPL MOP    | RPL spec   | Mode of operation                 |
812 +=====================+============+===================================+
813 | Peer-to-peer        | RPL P2P    | reactive (on-demand)              |
814 +---------------------+------------+-----------------------------------+
815 | P2P line-of-sight   | RPL base   | 2 (storing) with multicast DAO    |
816 +---------------------+------------+-----------------------------------+
817 | P2MP distribution   | RPL base   | 3 (storing with multicast)        |
818 +---------------------+------------+-----------------------------------+
819 | Publish-subscribe   | RPL base   | 1 or 2 (non-storing or storing)   |
820 +---------------------+------------+-----------------------------------+
821 | Source-sink         | RPL base   | 0 (no downward route)             |
822 +---------------------+------------+-----------------------------------+
823 | N-cast publish      | RPL base   | 0 (no downward route)             |
824 +---------------------+------------+-----------------------------------+
826 2.2. Layer 2 applicability. 828 To be completed. 830 3. Using RPL to Meet Functional Requirements 832 The functional requirements for most industrial automation 833 deployments are similar to those listed in [RFC5673]: 835 The routing protocol MUST be capable of supporting the 836 organization of a large number of nodes into regions, usually 837 corresponding to partitions of the automated process, each 838 containing on the order of 30 to 3000 nodes. 840 The routing protocol MUST provide mechanisms to support 841 configuration of the routing protocol itself. 843 The routing protocol MUST provide mechanisms to support instructed 844 configuration of explicit routing, so that in the absence of 845 failure the routing used for selected flow classes is that which 846 has been remotely configured (typically by a centralized 847 configurator).
In such circumstances RPL is used 849 for local network repair; 851 for flow classes to which explicit routing has not been 852 assigned; 854 during bootstrapping of the network itself (which is really 855 just an instance of routing without such an externally-imposed 856 assignment). 858 The routing protocol SHOULD support directed flows with different 859 QoS characteristics, typically with different energy vs. delay 860 tradeoffs, for traffic directed to LBRs. In practice only two 861 such sets of QoS are relevant: 863 one that emphasizes energy minimization for energy-constrained 864 nodes at the expense of greater mean transit delay and variance 865 in transit delay; and 867 one that emphasizes minimization of mean transit delay and 868 transit delay variance at the expense of greater energy demand 869 on originating and intermediary energy-constrained nodes, 870 typically used for critical SS traffic (e.g., infrequent and 871 unpredictable safety alarms with legally-mandated maximum 872 reporting delays) and critical PS traffic (e.g., predictable 873 periodic (for process automation) or cyclic (for factory 874 automation) high-speed safety control loops needed to protect 875 life, the environment, and/or critical national infrastructure 876 assets). 878 In the absence of configured routing, or when such routes have 879 failed, the routing protocol MUST dynamically compute and select 880 effective routes composed of low-power and lossy links. Local 881 network dynamics SHOULD NOT impact the entire network. The 882 routing protocol MUST compute multiple paths when possible.
884 The routing protocol MUST support multicast addressing, including 886 multicast originating with a LBR or off the LLN, directed to a 887 predefined group within the LLN 889 multicast originating within the LLN, directed to one or more 890 equivalent LBRs, in support of SS traffic 892 multicast originating within the LLN, directed to one or more 893 equivalent LBRs, in support of PS traffic, including all three 894 of the following situations: 896 1: 898 2: 900 3: 902 The routing protocol SHOULD support and utilize a large number of 903 highly directed flows to a few LBRs, to handle scalability. 905 The routing protocol SHOULD support formation of groups of field 906 devices in the network. 908 The routing protocol NEED NOT support anycast addressing because, 909 as of the date of writing of this document, such addressing is not 910 used by automation and control field devices. In general, no two 911 such devices are equivalent, except perhaps for intermediary LBRs, 912 so unicast suffices for situations where anycast might otherwise 913 be employed. 915 RPL supports: 917 Large-scale networks characterized by highly directed traffic 918 flows between each field device and servers close to the head-end 919 of the automation network. To this end, RPL builds Directed 920 Acyclic Graphs (DAGs) rooted at LBRs. 922 Zero-touch configuration. This is done through in-band methods 923 for configuring RPL variables using DIO messages. 925 The use of links with time-varying availability and quality 926 characteristics. This is accomplished by allowing the use of 927 metrics that effectively capture the quality of a path (e.g., in 928 terms of the mean and maximum impact of use of that path on packet 929 delivery timing and on endpoint energy demands), and by limiting 930 the impact of changing local conditions by discovering and 931 maintaining multiple DAG parents, and by using local repair 932 mechanisms when DAG links break. 
934 For wireless installations of small size with undemanding 935 communication requirements, RPL is likely to generate satisfactory 936 routing without any special effort. However, in larger installations 937 or where timeliness considerations do not permit multi-second 938 wireless-subnet transit times, then flow labeling is likely required 939 so that forwarding routers can make informed tradeoffs between 940 conserving their own energy resources and meeting overall system 941 needs. 943 4. RPL Profile 945 This section outlines a RPL profile for a representative deployment 946 in a process control application. Process monitoring without control 947 is typically less demanding, so a subset of this profile generally 948 will suffice. 950 4.1. RPL Features 952 4.1.1. RPL Instances 954 RPL allows formation of multiple instances that operate independently 955 of each other. Each instance may use a different objective function 956 and different modes of operation. It is highly recommended that 957 wireless field devices participate in different instances that 958 utilize objective functions that meet different optimization goals. 959 These optimization goals target: 961 1. Minimizing and ensuring that a guaranteed latency is being met 963 2. Maximizing the communication reliability of the packets 964 transferred over the wireless media 966 3. Minimizing aggregate power consumption for multi-hop LLNs that 967 are composed of battery powered field devices. 969 Some of these optimization goals will have to be met concurrently in 970 a single instance by imposing various constraints. 972 Each wireless field device should participate in a set composed of a 973 minimum of three instances that meet optimization goals associated 974 with three traffic flows which need to be supported by all industrial 975 LLNs. 
977 Management Instance: Wireless industrial networks are highly 978 deterministic in nature, meaning that wireless field devices do 979 not make any decisions locally but are managed by a centralized 980 System Manager that oversees the join process as well as all 981 communication and security settings present in the devices. The 982 management traffic flow is downward traffic and needs to meet 983 strictly enforced latency and reliability requirements in order to 984 ensure proper operation of the wireless LLN. Hence each field 985 device should participate in an instance dedicated to management 986 traffic. All decisions made while constructing this instance will 987 need to be approved by the Path Computation Engine present in the 988 System Manager due to the deterministic, centralized nature of 989 wireless industrial LLNs. Shallow LLNs with a hop count of up to 990 one accommodate this downward traffic using non-storing mode. 991 Non-storing involves source routing that is detrimental to the packet 992 size. For large transfers such as image download and 993 configuration files, this can be factorized for a large packet. 994 In that case, a method such as [I-D.thubert-roll-forwarding-frags] 995 is required over multi-hop networks to forward and recover 996 individual fragments without the overhead of the source route 997 information in each fragment. If the hop count in the wireless 998 LLN grows (LLN becomes deeper) it is highly recommended that the 999 management instance rely on storing mode in order to relay 1000 management related packets. 1002 Operational Instance: The bulk of the data that is transferred over 1003 a wireless LLN consists of process automation related payloads. 1004 This data is of paramount importance to the smooth operation of 1005 the process that is being monitored. Hence data reliability is of 1006 paramount importance.
It is also important to note that a vast 1007 majority of the wireless field devices that operate in industrial 1008 LLNs are battery powered. The operational instance should hence 1009 ensure high reliability of the data transmitted while also 1010 minimizing the aggregate power consumption of the field devices 1011 operating in the LLN. All decisions made while constructing this 1012 instance will need to be approved by the Path Computation Engine 1013 present in the System Manager. This is due to the deterministic, 1014 centralized nature of wireless LLNs. 1016 Autonomous instance: An autonomous instance requires limited to no 1017 configuration. Its primary purpose is to serve as a backup for 1018 the operational instance in case the operational instance fails. 1019 It is also useful in non-production phases of the network, when 1020 the plant is installed or dismantled. [I-D.thubert-roll-asymlink] 1021 provides rules and mechanisms whereby an instance can be used as a 1022 fallback to another upon failure to forward a packet further. The 1023 autonomous instance should always be active and during normal 1024 operations it should be maintained through local repair 1025 mechanisms. In normal operation global repairs should be 1026 sparingly employed in order to conserve batteries. But a global 1027 repair is also probably the fastest and most economical technique 1028 in case the network is extensively damaged. It is recommended 1029 to rely on automation that will trigger a global repair upon the 1030 detection of a large scale incident such as an explosion or a 1031 crash. As the name suggests, the autonomous instance is formed 1032 without any dependence on the System Manager. Decisions made 1033 during the construction of the autonomous instance do not need 1034 approval from the Path Computation Engine present in the 1035 System Manager.
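The three instances described above can be summarized as a small configuration table. The sketch below is illustrative only: the type and field names are hypothetical, the goals for the management and operational instances are paraphrased from the descriptions above, and the goal for the autonomous instance is taken from Section 4.1.5.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InstanceProfile:
    """Hypothetical summary of one RPL instance in this profile."""
    name: str
    goals: Tuple[str, ...]
    needs_pce_approval: bool  # construction decisions approved by the PCE


INSTANCE_PROFILES = (
    InstanceProfile("management", ("latency", "reliability"), True),
    InstanceProfile("operational", ("reliability", "aggregate power"), True),
    InstanceProfile("autonomous", ("latency",), False),
)

# Instances whose construction depends on the System Manager.
managed = [p.name for p in INSTANCE_PROFILES if p.needs_pce_approval]
print(managed)
```

Only the autonomous instance is formed without approval from the Path Computation Engine, which is what allows it to serve as a fallback.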
1037 Participation of each wireless field device in at least one instance 1038 that hosts a DODAG with a virtual root is highly recommended. 1040 Wireless industrial networks are typically composed of multiple LLNs, 1041 each of which terminates in an LLN Border Router (LBR). The LBRs communicate 1042 with each other and with other entities present on the backbone (such 1043 as the Gateway and the System Manager) over a wired or wireless 1044 backbone infrastructure. When a device A that operates in LLN1 1045 sends a packet to a device B that operates in LLN2, the packet 1046 egresses LLN1 through LBR1 and ingresses LLN2 through LBR2 after 1047 travelling over the backbone infrastructure that connects the LBRs. 1048 In order to accommodate this packet flow that travels from one LLN to 1049 another, it is highly recommended that wireless field devices 1050 participate in at least one instance that has a DODAG with a virtual 1051 root. 1053 4.1.2. Storing vs. Non-Storing Mode 1055 In general, storing mode is required for high-reporting-rate devices 1056 (where "high rate" is with respect to the underlying link data 1057 conveyance capability). Such devices, in the absence of path failure, 1058 are typically only one hop from the LBR(s) that convey their 1059 messaging to other parts of the system. Fortunately, in such cases, 1060 the routing tables required by such nodes are small, even when they 1061 include information on DODAGs that are used as backup alternate 1062 routes. 1064 Deeper multi-hop wireless LLNs (hop count > 1) should support storing 1065 mode in order to minimize the overhead associated with source routing 1066 given the limited header capacity associated with typical physical 1067 layers employed in wireless LLNs. Support for storing mode requires 1068 additional RAM resources be present in the constrained wireless 1069 field devices. Typical wireless LLNs scale to a maximum of one 1070 hundred field devices.
Hence the appropriate RAM resources for 1071 supporting storing mode should be part of the hardware requirements 1072 imposed upon wireless field devices during the design phase. 1074 The ISA100.11a standard mandates that all LBRs maintain routing 1075 tables with enough capacity to accommodate operation in storing mode. 1076 The standard also mandates that all wireless field devices maintain 1077 routing tables but it does not make any capacity assumptions, 1078 allowing for null routing tables. The System Manager should read the 1079 routing table capacity of each wireless field router and LBR during 1080 their join phase, and determine if support for storing mode in a 1081 particular LLN is feasible. 1083 Lack of support for storing mode is also detrimental to battery 1084 operated wireless field devices due to the power consumption 1085 associated with transporting the hefty headers associated with source 1086 routing. Support for storing mode also ensures path redundancy which 1087 in turn allows for better prediction of the latency associated with 1088 downward traffic flows. Guaranteed latencies are of paramount 1089 importance for various traffic flows in wireless industrial LLNs. 1091 4.1.3. DAO Policy 1093 Support for both upward and downward traffic flows is a requirement 1094 in industrial automation systems. As a result, nodes send DAO 1095 messages to establish downward paths from the root to themselves. 1096 DAO messages are not acknowledged in wireless industrial LLNs that 1097 are composed of battery operated field devices in order to minimize 1098 the power consumption overhead associated with path discovery. Given 1099 that wireless field devices in LLNs will typically participate in 1100 multiple RPL instances and DODAGs, it is highly recommended that both 1101 the RPLInstanceID and the DODAGID be included in the DAO. 1103 4.1.4. Path Metrics 1105 RPL relies on an Objective Function for selecting parents and 1106 computing path costs and rank.
This objective function is decoupled 1107 from the core RPL mechanisms and also from the metrics in use in the 1108 network. Two objective functions for RPL have been defined at the 1109 time of this writing, the RPL Objective Function 0 [RFC6552] and the 1110 Minimum Rank with Hysteresis Objective Function [RFC6719], both of 1111 which define a selection method for a preferred parent and backup 1112 parents, and are suitable for industrial automation network 1113 deployments. 1115 4.1.5. Objective Function 1117 Industrial wireless LLNs are subject to swift variations in terms of 1118 the propagation of the wireless signal, variations that can affect 1119 the quality of the links between field devices. This is due to the 1120 nature of the environment in which they operate, which can be 1121 characterized as metal jungles that cause wireless propagation 1122 distortions, multi-path fading and scattering. Hence support for 1123 hysteresis is needed in order to ensure relative link stability which 1124 in turn ensures route stability. 1126 As mentioned in previous sections of this document, different traffic 1127 flows require different optimization goals. Wireless field devices 1128 should participate in multiple instances associated with multiple 1129 objective functions. 1131 Management Instance: Should utilize an objective function that 1132 focuses on optimization of latency and data reliability. 1134 Operational instance: Should utilize an objective function that 1135 focuses on data reliability and minimizing aggregate power 1136 consumption for battery operated field devices. 1138 Autonomous instance: Should utilize an objective function that 1139 optimizes data latency. The primary purpose of the autonomous 1140 instance is as a fallback instance in case the operational 1141 instance fails. Data latency is hence paramount for ensuring that 1142 the wireless field devices can exchange packets in order to repair 1143 the operational instance.
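The need for hysteresis stated at the start of this section can be illustrated with a simple parent-selection rule. The following sketch is in the spirit of the Minimum Rank with Hysteresis Objective Function [RFC6719] but is not taken from it: the function name, the cost units and the threshold value are assumptions chosen for illustration.

```python
from typing import Dict, Optional


def select_parent(current: Optional[str],
                  path_costs: Dict[str, float],
                  switch_threshold: float) -> str:
    """Pick a preferred parent, switching only on a clear improvement.

    path_costs maps each candidate parent to the path cost through it
    (lower is better). The current parent is kept unless the best
    candidate beats it by more than switch_threshold.
    """
    best = min(path_costs, key=path_costs.get)
    if current is None or current not in path_costs:
        return best  # no usable current parent: take the best candidate
    if path_costs[current] - path_costs[best] > switch_threshold:
        return best  # improvement is large enough to justify a switch
    return current   # small fluctuation: stay put for route stability
```

With a hypothetical threshold of 100, a parent "A" at cost 300 is retained when "B" briefly drops to 250, but is replaced if the cost through "A" rises to 500.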
1145 More complex objective functions are needed that take in 1146 consideration multiple constraints and utilize weighted sums of 1147 multiple additive and multiplicative metrics. Additional objective 1148 functions specifically designed for such networks may be defined in 1149 companion RFCs. 1151 4.1.6. DODAG Repair 1153 To effectively handle time-varying link characteristics and 1154 availability, industrial automation network deployments SHOULD 1155 utilize the local repair mechanisms in RPL. 1157 Local repair is triggered by broken link detection, and in storing 1158 mode also by loop detection. 1160 The first local repair mechanism consists of a node detaching from a 1161 DODAG and then re-attaching to the same or to a different DODAG at a 1162 later time. While detached, a node advertises an infinite rank value 1163 so that its children can select a different parent. This process is 1164 known as poisoning and is described in Section 8.2.2.5 of [RFC6550]. 1165 While RPL provides an option to form a local DODAG, doing so in 1166 industrial automation network deployments is of little benefit since 1167 applications typically communicate through a LBR. After the detached 1168 node has made sufficient effort to send notification to its children 1169 that it is detached, the node can rejoin the same DODAG with a higher 1170 rank value. The configured duration of the poisoning mechanism needs 1171 to take into account the disconnection time applications running over 1172 the network can tolerate. Note that when joining a different DODAG, 1173 the node need not perform poisoning. 1175 The second local repair mechanism controls how much a node can 1176 increase its rank within a given DODAG Version (e.g., after detaching 1177 from the DODAG as a result of broken link or loop detection). 
1178 Setting the DAGMaxRankIncrease to a non-zero value enables this 1179 mechanism, and setting it to a value of less than infinity limits the 1180 cost of count-to-infinity scenarios when they occur, thus controlling 1181 the duration of disconnection applications may experience. 1183 4.1.7. Multicast 1185 4.1.8. Security 1187 Industrial automation network deployments typically operate in areas 1188 that provide limited physical security (relative to the risk of 1189 attack). For this reason, the link layer, transport layer and 1190 application layer technologies utilized within such networks 1191 typically provide security mechanisms to ensure authentication, 1192 confidentiality, integrity, timeliness and freshness. As a result, 1193 such deployments may not need to implement RPL's security mechanisms 1194 and could rely on link layer and higher layer security features. 1196 4.1.9. P2P communications 1198 1200 4.2. Layer-two features 1202 4.2.1. Need layer-2 expert here. 1204 4.2.2. Security functions provided by layer-2. 1206 4.2.3. 6LowPAN options assumed. 1208 4.2.4. MLE and other things 1210 4.3. Recommended Configuration Defaults and Ranges 1212 4.3.1. Trickle Parameters 1213 Trickle was designed to be density-aware and perform well in networks 1214 characterized by a wide range of node densities. The combination of 1215 DIO packet suppression and adaptive timers for sending updates allows 1216 Trickle to perform well in both sparse and dense environments. 1218 Node densities in industrial automation network deployments can vary 1219 greatly, from nodes having only one or a handful of neighbors to 1220 nodes having several hundred neighbors. In high density 1221 environments, relatively low values for Imin may cause a short period 1222 of congestion when an inconsistency is detected and DIO updates are 1223 sent by a large number of neighboring nodes nearly simultaneously. 
1224 While the Trickle timer will exponentially back off, some time may 1225 elapse before the congestion subsides. Although some link layers 1226 employ contention mechanisms that attempt to avoid congestion, 1227 relying solely on the link layer to avoid congestion caused by a 1228 large number of DIO updates can result in increased communication 1229 latency for other control and data traffic in the network. 1231 To mitigate this kind of short-term congestion, this document 1232 recommends a more conservative set of values for the Trickle 1233 parameters than those specified in [RFC6206]. In particular, 1234 DIOIntervalMin is set to a larger value to avoid periods of 1235 congestion in dense environments, and DIORedundancyConstant is 1236 parameterized accordingly as described below. These values are 1237 appropriate for the timely distribution of DIO updates in both sparse 1238 and dense scenarios while avoiding the short-term congestion that 1239 might arise in dense scenarios. 1241 Because the actual link capacity depends on the particular link 1242 technology used within an industrial automation network deployment, 1243 the Trickle parameters are specified in terms of the link's maximum 1244 capacity for conveying link-local multicast messages. If the link 1245 can convey m link-local multicast packets per second on average, the 1246 expected time it takes to transmit a link-local multicast packet is 1247 1/m seconds. 1249 DIOIntervalMin: Industrial automation network deployments SHOULD set 1250 DIOIntervalMin such that the Trickle Imin is at least 50 times as 1251 long as it takes to convey a link-local multicast packet. This value 1252 is larger than that recommended in [RFC6206] to avoid congestion in 1253 dense plant deployments as described above. 1255 DIOIntervalDoublings: Industrial automation network deployments 1256 SHOULD set DIOIntervalDoublings such that the Trickle Imax is at 1257 least TBD minutes.
1259 DIORedundancyConstant: Industrial automation network deployments 1260 SHOULD set DIORedundancyConstant to a value of at least 10. This is 1261 due to the larger chosen value for DIOIntervalMin and the 1262 proportional relationship between Imin and k suggested in [RFC6206]. 1263 This increase is intended to compensate for the increased 1264 communication latency of DIO updates caused by the increase in the 1265 DIOIntervalMin value, though the proportional relationship between 1266 Imin and k suggested in [RFC6206] is not preserved. Instead, 1267 DIORedundancyConstant is set to a lower value in order to reduce the 1268 number of packet transmissions in dense environments. 1270 4.3.2. Other Parameters 1272 1274 5. Manageability Considerations 1276 RPL enables automatic and consistent configuration of RPL routers 1277 through parameters specified by the DODAG root and disseminated 1278 through DIO packets. The use of Trickle for scheduling DIO 1279 transmissions ensures lightweight yet timely propagation of important 1280 network and parameter updates and allows network operators to choose 1281 the trade-off point they are comfortable with respect to overhead vs. 1282 reliability and timeliness of network updates. 1284 The metrics in use in the network along with the Trickle Timer 1285 parameters used to control the frequency and redundancy of network 1286 updates can be dynamically varied by the root during the lifetime of 1287 the network. To that end, all DIO messages SHOULD contain a Metric 1288 Container option for disseminating the metrics and metric values used 1289 for DODAG setup. In addition, DIO messages SHOULD contain a DODAG 1290 Configuration option for disseminating the Trickle Timer parameters 1291 throughout the network. 
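As an illustration of how a root might derive the DIOIntervalMin value it advertises in the DODAG Configuration option, the sketch below applies the Section 4.3.1 guidance that Imin be at least 50 times the link-local multicast packet time of 1/m seconds. It assumes the [RFC6550] encoding in which Imin equals 2^DIOIntervalMin milliseconds; the function name and the example link capacity are hypothetical.

```python
def dio_interval_min(multicast_pkts_per_sec: float, factor: int = 50) -> int:
    """Smallest DIOIntervalMin exponent meeting the Section 4.3.1 guidance.

    Assumes Imin = 2**DIOIntervalMin milliseconds, and requires
    Imin >= factor * (1 / m) seconds for a link conveying m packets/s.
    """
    if multicast_pkts_per_sec <= 0:
        raise ValueError("link capacity must be positive")
    target_ms = factor * 1000.0 / multicast_pkts_per_sec
    exponent = 0
    while (1 << exponent) < target_ms:  # grow Imin until it covers target
        exponent += 1
    return exponent


# Hypothetical link that conveys 100 link-local multicast packets/s:
# 1/m = 10 ms, so Imin must be at least 500 ms.
exp = dio_interval_min(100)
print(exp, 2 ** exp, "ms")
```

Because Imin is constrained to powers of two, the resulting interval is generally somewhat longer than the 50x minimum.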
1293 The possibility of dynamically updating the metrics in use in the 1294 network as well as the frequency of network updates allows deployment 1295 characteristics (e.g., network density) to be discovered during 1296 network bring-up and to be used to tailor network parameters once the 1297 network is operational rather than having to rely on precise 1298 pre-configuration. This also allows the network parameters and the 1299 overall routing protocol behavior to evolve during the lifetime of 1300 the network. 1302 RPL specifies a number of variables and events that can be tracked 1303 for purposes of network fault and performance monitoring of RPL 1304 routers. Depending on the memory and processing capabilities of each 1305 field device, various subsets of these can be employed in the 1306 field. 1308 6. Security Considerations 1309 Industrial automation network deployments typically operate in areas 1310 that provide limited physical security (relative to the risk of 1311 attack). For this reason, the link layer, transport layer and 1312 application layer technologies utilized within such networks 1313 typically provide security mechanisms to ensure authentication, 1314 confidentiality, integrity, timeliness and freshness. As a result, 1315 such deployments may not need to implement RPL's security mechanisms 1316 and could rely on link layer and higher layer security features. 1318 This document does not specify operations that could introduce new 1319 threats. Security considerations for RPL deployments are to be 1320 developed in accordance with recommendations laid out in, for 1321 example, [I-D.tsao-roll-security-framework]. 1323 Industrial automation networks are subject to stringent security 1324 requirements as they are considered a critical infrastructure 1325 component.
At the same time, since they are composed of large 1326 numbers of resource-constrained devices interconnected with 1327 limited-throughput links, many available security mechanisms are not 1328 practical for use in such networks. As a result, the choice of 1329 security mechanisms is highly dependent on the device and network 1330 capabilities characterizing a particular deployment. 1332 In contrast to other types of LLNs, in industrial automation networks 1333 centralized administrative control and access to a permanent secure 1334 infrastructure are available. As a result, link-layer, transport-layer 1335 and/or application-layer security mechanisms are typically in place 1336 and may make use of RPL's secure mode unnecessary. 1338 6.1. Security Considerations during initial deployment 1340 6.2. Security Considerations during incremental deployment 1342 7. Other Related Protocols 1344 8. IANA Considerations 1346 This specification has no requirement on IANA. 1348 9. Acknowledgements 1350 10. References 1352 10.1. Normative References 1354 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1355 Requirement Levels", BCP 14, RFC 2119, March 1997. 1357 10.2. Informative References 1359 [I-D.ietf-roll-terminology] 1360 Vasseur, J., "Terminology in Low power And Lossy 1361 Networks", Internet-Draft draft-ietf-roll-terminology-12, 1362 March 2013. 1364 [RFC2887] Handley, M., Floyd, S., Whetten, B., Kermode, R., 1365 Vicisano, L. and M. Luby, "The Reliable Multicast Design 1366 Space for Bulk Data Transfer", RFC 2887, August 2000. 1368 [RFC5548] Dohler, M., Watteyne, T., Winter, T. and D. Barthel, 1369 "Routing Requirements for Urban Low-Power and Lossy 1370 Networks", RFC 5548, May 2009. 1372 [RFC5826] Brandt, A., Buron, J. and G. Porcu, "Home Automation 1373 Routing Requirements in Low-Power and Lossy Networks", RFC 1374 5826, April 2010. 1376 [RFC5867] Martocci, J., De Mil, P., Riou, N. and W.
Vermeylen, 1377 "Building Automation Routing Requirements in Low-Power and 1378 Lossy Networks", RFC 5867, June 2010. 1380 [RFC5673] Pister, K., Thubert, P., Dwars, S. and T. Phinney, 1381 "Industrial Routing Requirements in Low-Power and Lossy 1382 Networks", RFC 5673, October 2009. 1384 [RFC6206] Levis, P., Clausen, T., Hui, J., Gnawali, O. and J. Ko, 1385 "The Trickle Algorithm", RFC 6206, March 2011. 1387 [RFC6550] Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R., 1388 Levis, P., Pister, K., Struik, R., Vasseur, JP. and R. 1389 Alexander, "RPL: IPv6 Routing Protocol for Low-Power and 1390 Lossy Networks", RFC 6550, March 2012. 1392 [RFC6552] Thubert, P., "Objective Function Zero for the Routing 1393 Protocol for Low-Power and Lossy Networks (RPL)", RFC 1394 6552, March 2012. 1396 [RFC6719] Gnawali, O. and P. Levis, "The Minimum Rank with 1397 Hysteresis Objective Function", RFC 6719, September 2012. 1399 [RFC6997] Goyal, M., Baccelli, E., Philipp, M., Brandt, A. and J. 1400 Martocci, "Reactive Discovery of Point-to-Point Routes in 1401 Low-Power and Lossy Networks", RFC 6997, August 2013. 1403 [I-D.thubert-roll-asymlink] 1404 Thubert, P., "RPL adaptation for asymmetrical links", 1405 Internet-Draft draft-thubert-roll-asymlink-02, December 1406 2011. 1408 [I-D.thubert-roll-forwarding-frags] 1409 Thubert, P. and J. Hui, "LLN Fragment Forwarding and 1410 Recovery", Internet-Draft 1411 draft-thubert-roll-forwarding-frags-01, February 2013. 1413 [I-D.tsao-roll-security-framework] 1414 Tsao, T., Alexander, R., Daza, V. and A. Lozano, "A 1415 Security Framework for Routing over Low Power and Lossy 1416 Networks", Internet-Draft 1417 draft-tsao-roll-security-framework-02, March 2010. 1419 10.3. External Informative References 1421 [HART] www.hartcomm.org, "Highway Addressable Remote Transducer, 1422 a group of specifications for industrial process and 1423 control devices administered by the HART Foundation".
1425 [ISA100.11a] 1426 ISA, "ISA100, Wireless Systems for Automation", May 2008, 1427 <http://www.isa.org/Community/SP100WirelessSystemsforAutomation>. 1430 Authors' Addresses 1432 Tom Phinney, editor 1433 consultant 1434 5012 W. Torrey Pines Circle 1435 Glendale, AZ 85308-3221 1436 USA 1438 Phone: +1 602 938 3163 1439 Email: tom.phinney@cox.net 1441 Pascal Thubert 1442 Cisco Systems, Inc 1443 Building D 1444 45 Allee des Ormes - BP1200 1445 MOUGINS - Sophia Antipolis, 06254 1446 FRANCE 1448 Phone: +33 497 23 26 34 1449 Email: pthubert@cisco.com 1451 Robert Assimiti 1452 Nivis 1453 1000 Circle 75 Parkway SE, Ste 300 1454 Atlanta, GA 30339 1455 USA 1457 Phone: +1 678 202 6859 1458 Email: robert.assimiti@nivis.com