ROLL                                                     T. Phinney, Ed.
Internet-Draft                                                consultant
Intended status: Informational                                P. Thubert
Expires: September 13, 2013                                        Cisco
                                                            RA. Assimiti
                                                                   Nivis
                                                          March 12, 2013


                RPL applicability in industrial networks
            draft-ietf-roll-rpl-industrial-applicability-00

Abstract

   The wide deployment of wireless devices, with their low installed
   cost (compared to wired devices), will significantly improve the
   productivity and safety of industrial plants.  It will simultaneously
   increase the efficiency and safety of the plant's workers, by
   extending and making more timely the information set available about
   plant operations.  The new Routing Protocol for Low Power and Lossy
   Networks (RPL) defines a Distance Vector protocol that is designed
   for such networks.  The aim of this document is to analyze the
   applicability of that routing protocol in industrial LLNs formed of
   field devices.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Required Reading
     1.3.  Out of scope requirements
   2.  Deployment Scenario
     2.1.  Network Topologies
       2.1.1.  Traffic Characteristics
       2.1.2.  Topologies
       2.1.3.  Source-sink (SS) communication paradigm
       2.1.4.  Publish-subscribe (PS, or pub/sub) communication
               paradigm
       2.1.5.  Peer-to-peer (P2P) communication paradigm
       2.1.6.  Peer-to-multipeer (P2MP) communication paradigm
       2.1.7.  Additional considerations: Duocast and N-cast
       2.1.8.  RPL applicability per communication paradigm
     2.2.  Layer 2 applicability.
   3.  Using RPL to Meet Functional Requirements
   4.  RPL Profile
     4.1.  RPL Features
       4.1.1.  RPL Instances
       4.1.2.  Storing vs. Non-Storing Mode
       4.1.3.  DAO Policy
       4.1.4.  Path Metrics
       4.1.5.  Objective Function
       4.1.6.  DODAG Repair
       4.1.7.  Multicast
       4.1.8.  Security
       4.1.9.  P2P communications
     4.2.  Layer-two features
       4.2.1.  Need layer-2 expert here.
       4.2.2.  Security functions provided by layer-2.
       4.2.3.  6LowPAN options assumed.
       4.2.4.  MLE and other things
     4.3.  Recommended Configuration Defaults and Ranges
       4.3.1.  Trickle Parameters
       4.3.2.  Other Parameters
   5.  Manageability Considerations
   6.  Security Considerations
     6.1.  Security Considerations during initial deployment
     6.2.  Security Considerations during incremental deployment
   7.  Other Related Protocols
   8.  IANA Considerations
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
     10.3.  External Informative References
   Authors' Addresses

1.  Introduction

   Information Technology (IT) is already, and increasingly will be,
   applied to Industrial Automation and Control System (IACS) technology
   in application areas where those IT technologies can be constrained
   sufficiently by Service Level Agreements (SLA) or other modest
   changes that they are able to meet the operational needs of IACS.
   When that happens, the IACS benefits from the large intellectual,
   experiential and training investment that has already occurred in
   those IT precursors.  One can conclude that future reuse of
   additional IT protocols for IACS will continue to occur due to the
   significant intellectual, experiential and training economies which
   result from that reuse.

   Following that logic, many vendors are already extending or replacing
   their local field-bus technology with Ethernet and IP-based
   solutions.  Examples of this evolution include CIP EtherNet/IP,
   Modbus/TCP, Foundation Fieldbus HSE, PROFInet and Invensys/Foxboro
   FOXnet.  At the same time, wireless, low power field devices are
   being introduced that facilitate a significant increase in the amount
   of information which industrial users can collect and the number of
   control points that can be remotely managed.

   IPv6 appears as a core technology at the conjunction of both trends,
   as illustrated by the current [ISA100.11a] industrial Wireless Sensor
   Networking (WSN) specification, where layers 1-4 technologies
   developed for end uses other than IACS - IEEE 802.15.4 PHY and MAC,
   6LoWPAN and IPv6, and UDP - are adapted to IACS use.  But due to the
   lack of open standards for routing in Low power and Lossy Networks
   (LLN) at the time ISA100.11a was crafted, routing was accomplished at
   the link layer and is specific to that standard.

   The IETF ROLL Working Group has defined application-specific routing
   requirements for a LLN routing protocol, specified in:

      Routing Requirements for Urban LLNs [RFC5548],

      Industrial Routing Requirements in LLNs [RFC5673],

      Home Automation Routing Requirements in LLNs [RFC5826], and

      Building Automation Routing Requirements in LLNs [RFC5867].

   The Routing Protocol for Low Power and Lossy Networks (RPL) [RFC6550]
   specification and its point-to-point extension/optimization
   [I-D.ietf-roll-p2p-rpl] define a generic Distance Vector protocol
   that is adapted to a variety of Low Power and Lossy Networks (LLN)
   types by the application of specific Objective Functions (OFs).  RPL
   forms Destination Oriented Directed Acyclic Graphs (DODAGs) within
   instances of the protocol, each instance being associated with an
   Objective Function to form a routing topology.

   A field device that belongs to an instance uses the OF to determine
   which DODAG and which Version of that DODAG the device should join.
   The device also uses the OF to select a number of routers within the
   DODAG's current and subsequent Versions to serve as parents or as
   feasible successors.  A new Version of the DODAG is periodically
   reconstructed to enable a global reoptimization of the graph.

   A RPL OF states the outcome of the process used by a RPL node to
   select and optimize routes within a RPL Instance based on the
   information objects available.  The separation of OFs from the core
   protocol specification allows RPL to be adapted to meet the different
   optimization criteria required by the wide range of industrial
   classes of traffic and applications.

   This document provides information on how RPL can accommodate the
   industrial requirements for LLNs, in particular as specified in
   [RFC5673].

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119 [RFC2119].

   Additionally, this document uses terminology from
   [I-D.ietf-roll-terminology], and uses the usual terminology from the
   Process Control and Factory Automation industries, some of which is
   recapitulated below:

   FEC:  Forward error correction

   IACS:  Industrial automation and control systems

   RAND:  reasonable and non-discriminatory (relative to licensing of
      patents)

1.2.  Required Reading

1.3.  Out of scope requirements

   This applicability statement does not address requirements related to
   wireless LLNs employed in factory automation and related
   applications.

2.  Deployment Scenario

   [RFC5673] describes in detail the routing requirements for industrial
   LLNs.  This document provides information on the varying deployment
   scenarios for such LLNs and how RPL assists in meeting those
   requirements.

   Large industrial plants, or major operating areas within such plants,
   repeatedly go through four major phases, each of which typically
   lasts from months to years:

   P1:  Construction or major modification phase

   P2:  Planned startup phase

   P3:  Normal operation phase

   P4:  Planned shutdown phase

   followed eventually by an (at least theoretical)

   P5:  Plant decommissioning phase.

   It is also likely, after a major catastrophe at a plant, to have a

   P6:  Post-emergency recovery and repair phase.

   The deployment scenarios for wireless LLN devices may be different in
   each of these phases.  In particular, during the Construction or
   major modification phase (P1), LLN devices may be installed months
   before the intended LLN can become usefully operational (because
   needed routers and infrastructure devices are not yet installed or
   active), and there are likely to be many personnel in whom the plant
   owner/operator has only limited trust, such as subcontractors and
   others in the plant area who have undergone only a cursory background
   investigation (if any at all).  In general, during this phase, plant
   instrumentation is not yet operational, so it could be removed and
   replaced by a Trojaned device without much likelihood of physical
   detection of the substitution.  Thus physical security of LLN devices
   is generally a more significant risk factor during this phase than
   once the plant is operational, where simple replacement of device
   electronics is detectable.

   Extra LLN devices and even extra LLN subnets may be employed during
   Planned startup (P2) and Planned shutdown (P4) phases, in support of
   the task of transitioning the plant or plant area between operational
   and shutdown states.  The extra devices typically provide extra
   monitoring as the plant transitions through infrequent activity
   states.  (In many continuous process plants, up to 2x extra staff are
   employed at monitoring and control workstations during these two
   phases, precisely because the plant is undergoing extraordinary
   behavior as it transitions to or from its steady-state operational
   condition.)

   Similar transient devices and subnets may be used during an
   unscheduled Post-emergency recovery and repair phase (P6) of
   operation, but in that case the extra devices usually are routers
   substituting for plant LLN devices that have been damaged by the
   incident (such as a fire, explosion, flood, tornado or hurricane)
   that induced the emergency.

   The Planned startup (P2) and Planned shutdown (P4) phases are similar
   in many respects, but the LLN environment of the two can be quite
   different, since the Planned shutdown phase can assume that the
   stable LLN environment used for Normal operation (P3) is functional
   during shutdown, whereas that stable environment usually is still
   being established during startup.

   The Post-emergency recovery and repair phase (P6) typically operates
   in an LLN environment that is somewhere between that of the Planned
   startup (P2) and Normal operation (P3) phases, but with an
   indeterminate number of temporary routers placed to facilitate
   communication across and around the area affected by the catastrophe.

   Smaller industrial plants and sites may go through similar phases,
   but often commingle the phases because, in those smaller plants, the
   phases require less planning and structuring of personnel
   responsibilities and thus permit less formalization and partitioning
   of the operating scenarios.  For example, it is much simpler, and
   usually requires much less planning, to bring new equipment on a skid
   into a plant, using a forklift, than to lay temporary railroad track
   or employ an extended-axle heavy haul tractor-trailer to deliver a
   multi-ton process vessel, and temporarily deploy and use very large
   heavy-lift cranes to install it.  In the former case, nearby
   equipment usually can continue normal operation while the
   installation proceeds; in the latter case that is almost always
   impossible, due to safety and other concerns.

   The domain of applicability for the RPL protocol may include all
   phases but the Normal operation phase, where the bandwidth allocation
   and the routes are usually optimized by an external Path Computing
   Engine (PCE), e.g. an ISA100.11a System Manager.

   Additionally, RPL could be included in Normal operation, provided
   that a new Objective Function is defined that actually interacts with
   the PCE in order to establish the reference topology.  In that case,
   RPL operations would apply only to emergency repair actions, when the
   reference topology becomes unusable because of some failure, and for
   as long as the problem persists.

2.1.  Network Topologies

2.1.1.  Traffic Characteristics

   The industrial market classifies process applications into three
   broad categories and six classes.

   o  Safety

      *  Class 0: Emergency action - Always a critical function

   o  Control

      *  Class 1: Closed loop regulatory control - Often a critical
         function

      *  Class 2: Closed loop supervisory control - Usually non-critical
         function

      *  Class 3: Open loop control - Operator takes action and controls
         the actuator (human in the loop)

   o  Monitoring

      *  Class 4: Alerting - Short-term operational effect (for example
         event-based maintenance)

      *  Class 5: Logging and downloading / uploading - No immediate
         operational consequence (e.g., history collection,
         sequence-of-events, preventive maintenance)

   Safety critical functions affect the basic safety integrity of the
   plant.  These normally dormant functions kick in only when process
   control systems, or their operators, have failed.  By design and by
   regular interval inspection, they have a well-understood probability
   of failure on demand in the range of typically once per 10-1000
   years.

   In-time delivery of messages becomes more relevant as the class
   number decreases.

   Note that for a control application, the jitter is just as important
   as latency and has the potential to destabilize control algorithms.

   The domain of applicability for the RPL protocol probably matches the
   range of classes where industrial users are interested in deploying
   wireless networks.  This domain includes monitoring classes (4 and
   5), and the non-critical portions of control classes (2 and 3).  RPL
   might also be considered as an additional repair mechanism in all
   situations, and independently of the flow classification and the
   medium type.

   It appears from the above sections that whether and how RPL can
   be applied for a given flow depends both on the deployment scenario
   and on the class of application / traffic.  At a high level, this can
   be summarized by the following matrix:

   +---------------------+------------------------------------------------+
   |    Phase \ Class    |  0       1       2       3       4       5     |
   +=====================+================================================+
   |    Construction     |                  X       X       X       X     |
   +---------------------+------------------------------------------------+
   |   Planned startup   |                  X       X       X       X     |
   +---------------------+------------------------------------------------+
   |  Normal operation   |                          ?       ?       ?     |
   +---------------------+------------------------------------------------+
   |  Planned shutdown   |                  X       X       X       X     |
   +---------------------+------------------------------------------------+
   |Plant decommissioning|                  X       X       X       X     |
   +---------------------+------------------------------------------------+
   | Recovery and repair |  X       X       X       X       X       X     |
   +---------------------+------------------------------------------------+

   ? : typically usable for all but higher-rate classes 0,1 PS traffic

                    Figure 1: RPL applicability matrix

2.1.2.  Topologies

   In an IACS, high-rate communications flows (e.g., 1 Hz or 4 Hz for a
   traditional process automation network) typically are such that only
   a single wireless LLN hop separates the source device from a LLN
   Border Router (LBR) to a significantly higher data-rate backbone
   network, typically based on IEEE 802.3, IEEE 802.11, or IEEE 802.16,
   as illustrated in Figure 2.

             ---+------------------------
                |          Plant Network
                |
             +-----+
             |     | Gateway
             |     |
             +-----+
                |
                |      Backbone
          +--------------------+------------------+
          |                    |                  |
       +-----+             +-----+             +-----+
       |     | LLN border  |     | LLN border  |     | LLN border
     o |     | router      |     | router      |     | router
       +-----+             +-----+             +-----+
          o                  o   o                o
      o o   o      o   o  o   o  o  o         o     o
                          LLN

   o : stationary wireless field device, seldom acting as an LLN router

        Figure 2: High-rate low-delay low-variance IACS topology

   For factory automation networks, the basic communications cycle for
   control is typically much faster, on the order of 100 Hz or more.  In
   this case the LLN itself may be based on high-data-rate IEEE 802.11
   or a 100 Mbit/s or faster optical link, and the higher-rate network
   used by the LBRs to connect the LLN to superior automation equipment
   typically might be based on fiber-optic IEEE 802.3, with multiple
   LBRs around the periphery of the factory area, so that most high-rate
   communications again require only a single wireless LLN hop.

   Multi-hop LLN routing is used within the LLN portion of such networks
   to provide backup communications paths when primary single-hop LLN
   paths fail, or for lower repetition rate communications where longer
   LLN transit times and higher variance are not an issue.  Typically,
   the majority of devices in an IACS can tolerate such higher-delay
   higher-variance paths, so routing choices often are driven by energy
   considerations for the affected devices, rather than simply by IACS
   performance requirements, as illustrated in Figure 3.

             ---+------------------------
                |          Plant Network
                |
             +-----+
             |     | Gateway
             |     |
             +-----+
                |
                |      Backbone
          +--------------------+------------------+
          |                    |                  |
       +-----+             +-----+             +-----+
       |     | Backbone    |     | Backbone    |     | Backbone
       |     | router      |     | router      |     | router
       +-----+             +-----+             +-----+
          o    o   o    o     o   o  o   o   o   o  o   o o
      o    o  o   o  o  o   o  o  o  o  o   o  o  o  o  o  o o
         o   o  o  o   o  o  o   o  o  o  o   M  o  o  o  o  o
       o   o  M   o  o  o   o  o  o  o   o  o  o  o   o  o
               o   o   o   o    o   o   o    o   o
                      o    o    o    o    o
                          LLN

   o : stationary wireless field device, often acting as an LLN router
   M : mobile wireless device

      Figure 3: Low-rate higher-delay higher-variance IACS topology

   Two decades of experience with digital fieldbuses has shown that four
   communications paradigms dominate in IACS:

   SS:  Source-sink

   PS:  Publish-subscribe

   P2P:  Peer-to-peer

   P2MP:  Peer-to-multipeer

2.1.3.  Source-sink (SS) communication paradigm

   In SS, the source-sink communication paradigm, each of many devices
   in one set, S1, sends UDP-like messages, usually infrequently and
   intermittently, to a second set of devices, S2, determined by a
   common multicast address.  A typical example would be that all
   devices within a given process unit N are configured to send process
   alarm messages to the multicast address
   Receivers_of_process_alarms_for_unit_N.  Receiving devices, typically
   on non-LLN networks accessed via LBRs, are configured to receive such
   multicast messages if their work assignment covers process unit N,
   and not otherwise.

   Timeliness of message delivery is a significant aspect of some SS
   communication.  When the SS traffic conveys process alarms or device
   alerts, there is often a contractual requirement, and sometimes even
   a regulatory requirement, on the maximum end-to-end transit delay of
   the SS message, including both the LLN and non-LLN components of that
   delay.
   However, there is no requirement on relative jitter in the
   delivery of multiple SS messages from the same source, and message
   reordering during transit is irrelevant.

   Within the LLN, the SS paradigm simply requires that messages so
   addressed be forwarded to the responsible LBR (or set of equivalent
   LBRs) for further forwarding outside the LLN.  Within the LLN such
   traffic typically is device-to-LBR or
   device-to-redundant-set-of-equivalent-LBRs.  In general, SS traffic
   may be aggregated before forwarding when both the multicast
   destination address and other QoS attributes are identical.  If
   information on the target delivery times for SS messages is available
   to the aggregating forwarding device, that device may intentionally
   delay forwarding somewhat to facilitate further aggregation, which
   can significantly reduce LLN alarm-reporting traffic during major
   plant upset events.

2.1.4.  Publish-subscribe (PS, or pub/sub) communication paradigm

   In PS, the publish-subscribe communication paradigm, a device sends
   UDP-like messages, usually periodically or cyclically (i.e.,
   repetitively but without fixed periodicity), to a single multicast
   address derived from or correlated with the device's own address.  A
   typical example would be that each sensor and actuator device within
   a given process unit N is configured to send process state messages
   to the multicast address that designates its specific publications.
   In essence the derived multicast address for device D is
   Receivers_of_publications_by_device_D.  Typically those receivers are
   in two categories: controllers (C) for control loops in which device
   D participates, and devices accessed via the LLN's LBRs that monitor
   and/or accumulate historical information about device D's status and
   outputs.

   If the controller(s) that receive device D's publication are all
   outside the LLN and accessed by LBRs, then within the LLN such
   traffic typically is device-to-LBR or
   device-to-redundant-set-of-equivalent-LBRs.  But if a controller (Cn)
   is within the LLN, then a number of different LLN-local traffic
   patterns may be employed, depending on the capabilities of the
   underlying link technology and on configured performance requirements
   for such reporting.  Typically in such a case, publication by device
   D is forwarded up a DODAG to an LLN router that is also on a downward
   DODAG to a destination controller Cn, then forwarded down that second
   DODAG to that destination controller Cn.  Of course, if the LLN
   router (or even the LBR) is itself the intended destination
   controller, which will often be the case, then no downward forwarding
   occurs.

   Timeliness of message delivery is a critical aspect of PS
   communication.  Individual messages can be lost without significant
   impact on the controlled physical process, but typically a sequence
   of four consecutive lost messages will trigger fallback behavior of
   the control algorithms, which is considered a system failure by most
   system owner/operators.  (In general, and unless a local catastrophic
   event such as a major explosion or a tornado occurs in the plant,
   invocation of more than one instance of such fallback handling per
   year, per plant, is considered unacceptable.)

   Message loss, delay and jitter in delivery of PS messaging is a
   relative matter.  PS messaging is used for transfer of process
   measurements and associated status from sensors to control
   computation elements, from control computation elements to actuators,
   and of current commanded position and status from actuators back to
   control computation elements.
   The actual time interval of interest
   is that which starts with sensing of the physical process (which
   necessarily occurs before the sensed value can be sent in the first
   message) and which ends when the computed control correction is
   applied to the physical process by the appropriate actuator (which
   cannot occur until after the second message containing the computed
   control output has been received by that actuator).  With rare
   exception, the control algorithms used with PS messaging in the
   process automation industries - those managing continuous material
   flows - rely on fixed-period sampling, computation and transfer of
   outputs, while those in the factory automation industries - those
   managing discrete manufacturing operations - rely on bounded delay
   between sampling of inputs, control computation and transfer of
   outputs to physical actuators that affect the controlled process.

   Deliberately manipulated message delay and jitter in delivery of PS
   messaging have the potential to destabilize control loops.  It is the
   responsibility of conveyed higher-level protocols to protect against
   such potential security attacks by detecting overly delayed or
   jittered messages at delivery, converting them into instances of
   message loss.  Thus network and data-link protocols such as IPv6 and
   Ethernet need not themselves address such issues, although their
   selection and employment should take the existence (or lack) of such
   higher-layer protection mechanisms, and the resulting consequences
   due to excessive delay and jitter, into consideration in their
   parameterization.

   In general, PS traffic within the LLN is not aggregated before
   forwarding, to minimize message loss and delay in reception by any
   relevant controller(s) that are outside the LLN.
   However, if all
   intended destination controllers are within the LLN, and at least one
   of those intended controllers also serves as an LLN router on a DODAG
   to off-LLN destinations, none of which are controllers, then the
   router functions in that device may aggregate PS traffic before
   forwarding when the required routing and other QoS attributes are
   identical.  If information on the target delivery times for PS
   messages to non-controller devices is available to the aggregating
   forwarding device, that device may intentionally delay forwarding
   somewhat to facilitate further aggregation.

   In some system architectures, message streams that use PS to convey
   current process measurements and status are compressed at the source
   through a 2-dimensional winnowing process that compares

   1)  the process measurement values and status of the about-to-be-sent
       message with that of the last actually-sent message, and

   2)  the current time vs. the queueing time for the last actually-sent
       message.

   If the interval since that last-sent message is less than a
   predefined maximum time, and the status is unchanged, and the process
   measurement(s) conveyed in the message are within predefined
   deadband(s) of the last-sent measurement value(s), then transmission
   of the new message is suppressed.  Often this suppression takes the
   form of not queuing the new message for transmission, but in some
   protocols a brief placeholder message indicating "no significant
   change" is queued in its stead.

2.1.5.  Peer-to-peer (P2P) communication paradigm

   In P2P, the peer-to-peer communication paradigm, a device sends
   UDP-like or TCP-like messages from one device (D1) to a second device
   (D2), usually with bidirectional but asymmetric flow of application
   data, where the amount of data is significantly greater in one
   direction than the other.
   Typical examples are transfer of
   configuration information to or from a process field device, or
   transfer of captured process diagnostics (e.g., time-stamped noise
   signatures from a coriolis flowmeter) to an off-LLN higher-level
   asset management system.  Unicast addressing is used in both
   directions of data flow.

   In general, specific P2P traffic has only loose timeliness
   requirements, typically just those required so that response times to
   human-operator-initiated actions meet human factors requirements.  As
   a consequence, in general, message aggregation is permitted, although
   few opportunities are likely to present themselves for such
   aggregation due to the sporadic nature of such messaging to a single
   destination, and/or due to the large message payloads that often
   occur in at least one direction of transmission.

2.1.6.  Peer-to-multipeer (P2MP) communication paradigm

   In P2MP, the peer-to-multipeer communication paradigm, a device sends
   UDP-like messages downward, from one device (D1) to a set of other
   devices (Dn).  Typical examples are bulk downloads to a set of
   devices that use identical code image segments or
   identically-structured database segments; group commands to enable
   device state transitions that are quasi-synchronized across all or
   part of the local network (e.g., switch to the next set of
   point-to-point downloaded session keys, or notifying that the network
   is switching to an emergency repair and recovery mode); etc.
   Multicast addressing is used in the downward direction of data flow.

   Devices can be assigned to a number of multicast groups, for instance
   by device type.  Then, if it becomes necessary to reflash all devices
   of a given type with a new load image, a multicast distribution
   mechanism can be leveraged to optimize the distribution operation.

   In general, P2MP traffic has only loose timeliness requirements.
As 651 a consequence, in general, message aggregation is permitted, although 652 few opportunities are likely to present themselves for such 653 aggregation due to the sporadic nature of such messaging to a single 654 multicast group destination, and/or due to the large message payloads 655 that often occur when P2MP is used for group downloads. However, in 656 general, message aggregation negatively impacts the delivery success 657 rate for each of the aggregated messages, since the probability of 658 error in a received message increases with message length. Together 659 these considerations often lead to a policy of non-aggregation for 660 P2MP messaging. 662 Note: Reliable group download protocols, such as the no-longer- 663 published IEEE 802.1E (ISO/IEC 15802-4) system load protocol, and 664 reliable multicast protocols based on the guidance of [RFC2887], are 665 instructive in how P2MP can be used for initial bulk download, 666 followed by either P2MP or P2P selective retransmissions for missed 667 download segments. 669 2.1.7. Additional considerations: Duocast and N-cast 671 In industrial automation systems, some traffic is from (relatively) 672 high-rate monitoring and control loops, of Class 0 and Class 1 as 673 described in [RFC5673]. In such systems, the wireless link protocol, 674 which typically uses immediate in-band acknowledgement to confirm 675 delivery (or, on failure, conclude that a retransmission is 676 required), can be adapted to attempt simultaneous delivery to more 677 than one receiving device, with separated, sequenced immediate in- 678 band acknowledgement by each of those intended receivers. (This 679 mechanism is known colloquially as "duocast" (for two intended 680 receivers), or more generically as "N-cast" (for N intended 681 receivers).)
Transmission is deemed successful if at least one such 682 immediate acknowledgement is received by the sending device; 683 otherwise the device queues the message for retransmission, up until 684 the maximum configured number of retries has been attempted. 686 The logic behind duocast/N-cast is very simple: In wireless systems 687 without FEC (forward error correction), the overall rate of success 688 for transactions consisting of an initial transmission and an 689 immediate acknowledgement is typically 95%. In other words, 5% of 690 such transactions fail, either because the initial message of the 691 transaction is not received correctly by the intended receiver, or 692 because the immediate acknowledgment by that receiver is not received 693 correctly by the transaction initiator. 695 In the generalized case of N-cast, where any received acknowledgement 696 serves to complete the transaction, and where the N intended 697 receivers are spatially diverse, physically separated from each other 698 by multiple wavelengths, the probability that all such receivers fail 699 to receive the initial message of the transaction, or that all 700 generated immediate acknowledgements are not received by the 701 transaction initiator, is typically approximately (5%)^N. Thus, for 702 duocast, the expected success rate for a single transaction goes from 703 95% (1.0 - 0.05) to 99.75% (1.0 - 0.05^2), to 99.9875% (1.0 - 0.05^3) 704 when N=3, and even higher when N>3. 706 From the above analysis, it is obvious that the primary benefit of 707 N-cast occurs when N goes from N=1 (unicast) to N=2 (duocast); the 708 reduction in transaction loss rate for increasing N>2 is quite small, 709 and for N>3 it is infinitesimal. 
In the typical industrial 710 automation environment of class 1 process control loops, which 711 typically repeat at a 1 Hz or 4 Hz rate, in a very large process 712 plant with thousands of field devices reporting at that rate, the 713 maximum number of transmission retries that must be planned, and for 714 which capacity must be scheduled (within the requisite 250 ms or 1 s 715 interval) is seven (7) retries for unicast PS reporting, but only 716 three (3) retries with duocast PS reporting. (This is determined by 717 the requirement to not miss four successive reports more than once 718 per year, across the entire plant, as such a loss typically triggers 719 fallback behavior in the controlled loop, which is considered a 720 failure of the wireless system by the plant owner/operator.) In 721 practice, the enormous reduction in both planned and used 722 retransmission capacity provided by duocast/N-cast is what enables 723 4 Hz loops to be supported in large wireless systems. 725 When available, duocast/N-cast typically is used only for one-hop PS 726 traffic on Class 1 and Class 0 control loops. It may also be 727 employed for rapid, reliable one-hop delivery of Class 0 and 728 sometimes Class 1 process alarms and device alerts, which use the SS 729 paradigm. Because it requires scheduling of multiple receivers that 730 are prepared to acknowledge the received message during the 731 transaction, in general it is not appropriate for the other types of 732 traffic in such systems - P2P and P2MP - and is not needed for other 733 classes of control loops or other types of traffic, which do not have 734 such stringent reporting requirements. 
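The per-transaction success rates quoted earlier in this section for unicast, duocast, and N=3 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of any specification; the function name ncast_success and the parameter p_fail are illustrative assumptions, and the model assumes that failures at spatially diverse receivers are independent, as the text's (5%)^N approximation does.

```python
def ncast_success(n, p_fail=0.05):
    """Probability that at least one of n intended receivers completes
    the transaction, assuming independent per-receiver failure p_fail.

    p_fail=0.05 is the typical no-FEC figure quoted in the text.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - p_fail ** n


# Unicast (N=1), duocast (N=2), and N=3.
for n in (1, 2, 3):
    print(f"N={n}: {ncast_success(n):.4%}")
```

Under these assumptions the residual loss rate shrinks by a factor of 20 for each added receiver, which is why most of the absolute gain is already obtained at N=2.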
736 Note: Although there are known patent applications for duocast and 737 N-cast, at the time of this writing the patent assignee, Honeywell 738 International, has offered to permit cost-free RAND use in those 739 industrial wireless standards that have chosen to employ the 740 technology, under a reciprocal licensing requirement relative to that 741 use. Since duocast and N-cast provide performance and energy 742 optimizations, they are not essential for use in wireless systems. 743 However, in practice, their use makes it possible to support 4 Hz 744 wireless loops and meet sub-second safety alarm reporting 745 requirements in large plants, where that might otherwise be 746 impractical without use of a wired network. When duocast/N-cast is 747 not employed, the wireless retransmission capacity that is needed to 748 support such fast loops often is excessive, typically over 100x that 749 actually used for retransmission (i.e., providing for seven retries 750 per transaction when the mean number used is only 0.06 retries). 752 2.1.8. RPL applicability per communication paradigm 754 To match the requirements above, RPL provides a number of RPL Modes 755 of Operation (MOP): 757 No downward route: defined in [RFC6550], section 6.3.1, MOP of 0. 758 This mode allows only upward routing, that is from nodes (devices) 759 that reside inside the RPL network toward the outside via the 760 DODAG root. 762 Non-storing mode: defined in [RFC6550], section 6.3.1, MOP of 1. 763 This mode improves MOP 0 by adding the capability to use source 764 routing from the root towards registered targets within the 765 instance DODAG. 767 Storing mode without multicast support: defined in [RFC6550], 768 section 6.3.1, MOP of 2. This mode improves MOP 0 by adding the 769 capability to use stateful routing from the root towards 770 registered targets within the instance DODAG.
772 Storing mode with link-scope multicast DAO: defined in [RFC6550] 773 section 9.10, this mode improves MOP 2 by adding the capability to 774 send Destination Advertisements to all nodes over a single Layer 2 775 link (e.g. a wireless hop) and enables line-of-sight direct 776 communication. 778 Storing mode with multicast support: defined in [RFC6550], Mode-of- 779 operation (MOP) of 3. This mode improves MOP 2 by adding the 780 capability to register multicast groups and perform multicast 781 forwarding along the instance DODAG (or a spanning subtree within 782 the DODAG). 784 Reactive: defined in [I-D.ietf-roll-p2p-rpl], the reactive mode 785 creates on-demand additional DAGs that are used to reach a given 786 node acting as DODAG root within a certain number of hops. This 787 mode can typically be used for an ad-hoc closed-loop 788 communication. 790 The RPL MOP that can be applied for a given flow depends on the 791 communication paradigm. It must be noted that a DODAG that is used 792 for PS traffic can also be used for SS traffic since the MOP 2 793 extends the MOP 0, and that a DODAG that is used for P2MP 794 distribution can also be used for downward PS since the MOP 3 extends 795 the MOP 2. 797 On the other hand, an Objective Function (OF) that optimizes metrics 798 for a pure upwards DODAG might differ from the OF that optimizes a 799 mixed upward and downward DODAG. 801 As a result, it can be expected that different RPL instances are 802 installed with different OFs, different channel allocations, etc... 803 that result in different routing and forwarding topologies, sometimes 804 with differing delay vs. energy profiles, optimized separately for 805 the different flows at hand. 
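The "improves" relationships among the numbered MOPs listed above determine which DODAG can be reused for which flow. The following is a minimal sketch of that reasoning; the names Mop, EXTENDS, and can_serve are illustrative assumptions and are not defined by RPL, and the link-scope multicast DAO and reactive modes are deliberately left out because the text assigns them no MOP number.

```python
from enum import IntEnum


class Mop(IntEnum):
    """Numbered RPL Modes of Operation as listed in this section."""
    NO_DOWNWARD = 0
    NON_STORING = 1
    STORING = 2
    STORING_MULTICAST = 3


# "This mode improves MOP x" relationships, taken from the list above.
EXTENDS = {
    Mop.NON_STORING: Mop.NO_DOWNWARD,
    Mop.STORING: Mop.NO_DOWNWARD,
    Mop.STORING_MULTICAST: Mop.STORING,
}


def can_serve(dodag_mop, required_mop):
    """True if a DODAG running dodag_mop also offers required_mop."""
    mop = dodag_mop
    while mop is not None:
        if mop == required_mop:
            return True
        mop = EXTENDS.get(mop)  # None once MOP 0 has been checked
    return False


# A MOP 3 DODAG built for P2MP distribution can also carry downward PS.
print(can_serve(Mop.STORING_MULTICAST, Mop.STORING))
```

Whether reuse is actually desirable is a separate question, since the Objective Function suited to one flow may differ from that suited to another.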
807 This can be broadly summarized in the following table:

809 +---------------------+------------+-----------------------------------+
810 | Paradigm\RPL MOP    | RPL spec   | Mode of operation                 |
811 +=====================+============+===================================+
812 | Peer-to-peer        | RPL P2P    | reactive (on-demand)              |
813 +---------------------+------------+-----------------------------------+
814 | P2P line-of-sight   | RPL base   | 2 (storing) with multicast DAO    |
815 +---------------------+------------+-----------------------------------+
816 | P2MP distribution   | RPL base   | 3 (storing with multicast)        |
817 +---------------------+------------+-----------------------------------+
818 | Publish-subscribe   | RPL base   | 1 or 2 (non-storing or storing)   |
819 +---------------------+------------+-----------------------------------+
820 | Source-sink         | RPL base   | 0 (no downward route)             |
821 +---------------------+------------+-----------------------------------+
822 | N-cast publish      | RPL base   | 0 (no downward route)             |
823 +---------------------+------------+-----------------------------------+

825 Figure 4: RPL applicability per communication paradigm

827 2.2. Layer 2 applicability. 829 To be completed. 831 3. Using RPL to Meet Functional Requirements 833 The functional requirements for most industrial automation 834 deployments are similar to those listed in [RFC5673]: 836 The routing protocol MUST be capable of supporting the 837 organization of a large number of nodes into regions, usually 838 corresponding to partitions of the automated process, each 839 containing on the order of 30 to 3000 nodes. 841 The routing protocol MUST provide mechanisms to support 842 configuration of the routing protocol itself.
844 The routing protocol MUST provide mechanisms to support instructed 845 configuration of explicit routing, so that in the absence of 846 failure the routing used for selected flow classes is that which 847 has been remotely configured (typically by a centralized 848 configurator). In such circumstances RPL is used 850 for local network repair; 852 for flow classes to which explicit routing has not been 853 assigned; 855 during bootstrapping of the network itself (which is really 856 just an instance of routing without such an externally-imposed 857 assignment). 859 The routing protocol SHOULD support directed flows with different 860 QoS characteristics, typically with different energy vs. delay 861 tradeoffs, for traffic directed to LBRs. In practice only two 862 such sets of QoS are relevant: 864 one that emphasizes energy minimization for energy-constrained 865 nodes at the expense of greater mean transit delay and variance 866 in transit delay; and 868 one that emphasizes minimization of mean transit delay and 869 transit delay variance at the expense of greater energy demand 870 on originating and intermediary energy-constrained nodes, 871 typically used for critical SS traffic (e.g., infrequent and 872 unpredictable safety alarms with legally-mandated maximum 873 reporting delays) and critical PS traffic (e.g., predictable 874 periodic (for process automation) or cyclic (for factory 875 automation) high-speed safety control loops needed to protect 876 life, the environment, and/or critical national infrastructure 877 assets). 879 In the absence of configured routing, or when such routes have 880 failed, the routing protocol MUST dynamically compute and select 881 effective routes composed of low-power and lossy links. Local 882 network dynamics SHOULD NOT impact the entire network. The 883 routing protocol MUST compute multiple paths when possible.
885 The routing protocol MUST support multicast addressing, including 887 multicast originating with a LBR or off the LLN, directed to a 888 predefined group within the LLN 890 multicast originating within the LLN, directed to one or more 891 equivalent LBRs, in support of SS traffic 893 multicast originating within the LLN, directed to one or more 894 equivalent LBRs, in support of PS traffic, including all three 895 of the following situations: 897 1: 899 2: 901 3: 903 The routing protocol SHOULD support and utilize a large number of 904 highly directed flows to a few LBRs, to handle scalability. 906 The routing protocol SHOULD support formation of groups of field 907 devices in the network. 909 The routing protocol NEED NOT support anycast addressing because, 910 as of the date of writing of this document, such addressing is not 911 used by automation and control field devices. In general, no two 912 such devices are equivalent, except perhaps for intermediary LBRs, 913 so unicast suffices for situations where anycast might otherwise 914 be employed. 916 RPL supports: 918 Large-scale networks characterized by highly directed traffic 919 flows between each field device and servers close to the head-end 920 of the automation network. To this end, RPL builds Directed 921 Acyclic Graphs (DAGs) rooted at LBRs. 923 Zero-touch configuration. This is done through in-band methods 924 for configuring RPL variables using DIO messages. 926 The use of links with time-varying availability and quality 927 characteristics. This is accomplished by allowing the use of 928 metrics that effectively capture the quality of a path (e.g., in 929 terms of the mean and maximum impact of use of that path on packet 930 delivery timing and on endpoint energy demands), and by limiting 931 the impact of changing local conditions by discovering and 932 maintaining multiple DAG parents, and by using local repair 933 mechanisms when DAG links break. 
935 For wireless installations of small size with undemanding 936 communication requirements, RPL is likely to generate satisfactory 937 routing without any special effort. However, in larger installations 938 or where timeliness considerations do not permit multi-second 939 wireless-subnet transit times, then flow labeling is likely required 940 so that forwarding routers can make informed tradeoffs between 941 conserving their own energy resources and meeting overall system 942 needs. 944 4. RPL Profile 946 This section outlines a RPL profile for a representative deployment 947 in a process control application. Process monitoring without control 948 is typically less demanding, so a subset of this profile generally 949 will suffice. 951 4.1. RPL Features 953 4.1.1. RPL Instances 955 RPL allows formation of multiple instances that operate independently 956 of each other. Each instance may use a different objective function 957 and different modes of operation. It is highly recommended that 958 wireless field devices participate in different instances that 959 utilize objective functions that meet different optimization goals. 960 These optimization goals target: 962 1. Minimizing and ensuring that a guaranteed latency is being met 964 2. Maximizing the communication reliability of the packets 965 transferred over the wireless media 967 3. Minimizing aggregate power consumption for multi-hop LLNs that 968 are composed of battery powered field devices. 970 Some of these optimization goals will have to be met concurrently in 971 a single instance by imposing various constraints. 973 Each wireless field device should participate in a set composed of a 974 minimum of three instances that meet optimization goals associated 975 with three traffic flows which need to be supported by all industrial 976 LLNs. 
978 Management Instance: Wireless industrial networks are highly 979 deterministic in nature, meaning that wireless field devices do 980 not make any decisions locally but are managed by a centralized 981 System Manager that oversees the join process as well as all 982 communication and security settings present in the devices. The 983 management traffic flow is downward traffic and needs to meet 984 strictly enforced latency and reliability requirements in order to 985 ensure proper operation of the wireless LLN. Hence each field 986 device should participate in an instance dedicated to management 987 traffic. All decisions made while constructing this instance will 988 need to be approved by the Path Computation Engine present in the 989 System Manager due to the deterministic, centralized nature of 990 wireless industrial LLNs. Shallow LLNs with a hop count of up to 991 one accommodate this downward traffic using non-storing mode. Non- 992 storing involves source routing that is detrimental to the packet 993 size. For large transfers such as image download and 994 configuration files, this can be factorized for a large packet. 995 In that case, a method such as [I-D.thubert-roll-forwarding-frags] 996 is required over multi-hop networks to forward and recover 997 individual fragments without the overhead of the source route 998 information in each fragment. If the hop count in the wireless 999 LLN grows (LLN becomes deeper) it is highly recommended that the 1000 management instance rely on storing mode in order to relay 1001 management related packets. 1003 Operational Instance: The bulk of the data that is transferred over 1004 wireless LLN consists of process automation related payloads. 1005 This data is of paramount importance to the smooth operation of 1006 the process that is being monitored. Hence data reliability is of 1007 paramount importance.
It is also important to note that a vast 1008 majority of the wireless field devices that operate in industrial 1009 LLNs are battery powered. The operational instance should hence 1010 ensure high reliability of the data transmitted while also 1011 minimizing the aggregate power consumption of the field devices 1012 operating in the LLN. All decisions made while constructing this 1013 instance will need to be approved by the Path Computation Engine 1014 present in the System Manager. This is due to the deterministic, 1015 centralized nature of wireless LLNs. 1017 Autonomous instance: An autonomous instance requires limited to no 1018 configuration. Its primary purpose is to serve as a backup for 1019 the operational instance in case the operational instance fails. 1020 It is also useful in non-production phases of the network, when 1021 the plant is installed or dismantled. [I-D.thubert-roll-asymlink] 1022 provides rules and mechanisms whereby an instance can be used as a 1023 fallback to another upon failure to forward a packet further. The 1024 autonomous instance should always be active and during normal 1025 operations it should be maintained through local repair 1026 mechanisms. In normal operation global repairs should be 1027 sparingly employed in order to conserve batteries. But a global 1028 repair is also probably the fastest and most economical technique 1029 in the case the network is extensively damaged. It is recommended 1030 to rely on automation that will trigger a global repair upon the 1031 detection of a large scale incident such as an explosion or a 1032 crash. As the name suggests, the autonomous instance is formed 1033 without any dependence on the System Manager. Decisions made 1034 during the construction of the autonomous instance do not need 1035 approval from the Path Computation Engine present in the 1036 System Manager.
1038 Participation of each wireless field device in at least one instance 1039 that hosts a DODAG with a virtual root is highly recommended. 1041 Wireless industrial networks are typically composed of multiple LLNs 1042 that terminate in a LLN Border Router (LBR). The LBRs communicate 1043 with each other and with other entities present on the backbone (such 1044 as the Gateway and the System Manager) over a wired or wireless 1045 backbone infrastructure. When a device A that operates in LLN1 1046 sends a packet to a device B that operates in LLN2, the packet 1047 egresses LLN1 through LBR1 and ingresses LLN2 through LBR2 after 1048 travelling over the backbone infrastructure that connects the LBRs. 1049 In order to accommodate this packet flow that travels from one LLN to 1050 another, it is highly recommended that wireless field devices 1051 participate in at least one instance that has a DODAG with a virtual 1052 root. 1054 4.1.2. Storing vs. Non-Storing Mode 1056 In general, storing mode is required for high-reporting-rate devices 1057 (where "high rate" is with respect to the underlying link data 1058 conveyance capability). Such devices, in the absence of path 1059 failure, are typically only one hop from the LBR(s) that convey their 1060 messaging to other parts of the system. Fortunately, in such cases, 1061 the routing tables required by such nodes are small, even when they 1062 include information on DODAGs that are used as backup alternate 1063 routes. 1065 Deeper multi-hop wireless LLNs (hop count > 1) should support storing 1066 mode in order to minimize the overhead associated with source routing 1067 given the limited header capacity associated with typical physical 1068 layers employed in wireless LLNs. Support for storing mode requires 1069 additional RAM resources be present in the constrained wireless 1070 field devices. Typical wireless LLNs scale to a maximum of one 1071 hundred field devices.
Hence the appropriate RAM resources for 1072 supporting storing mode should be part of the hardware requirements 1073 imposed upon wireless field devices during the design phase. 1075 The ISA100.11a standard mandates that all LBRs maintain routing 1076 tables with enough capacity to accommodate operation in storing mode. 1077 The standard also mandates that all wireless field devices maintain 1078 routing tables but it does not make any capacity assumptions, 1079 allowing for null routing tables. The System Manager should read the 1080 routing table capacity of each wireless field router and LBR during 1081 their join phase, and determine if support for storing mode in a 1082 particular LLN is feasible. 1084 Lack of support for storing mode is also detrimental to battery 1085 operated wireless field devices due to the power consumption 1086 associated with transporting the hefty headers associated with source 1087 routing. Support for storing mode also ensures path redundancy which 1088 in turn allows for better prediction of the latency associated with 1089 downward traffic flows. Guaranteed latencies are of paramount 1090 importance for various traffic flows in wireless industrial LLNs. 1092 4.1.3. DAO Policy 1094 Support for both upward and downward traffic flows is a requirement 1095 in industrial automation systems. As a result, nodes send DAO 1096 messages to establish downward paths from the root to themselves. 1097 DAO messages are not acknowledged in wireless industrial LLNs that 1098 are composed of battery operated field devices in order to minimize 1099 the power consumption overhead associated with path discovery. Given 1100 that wireless field devices in LLNs will typically participate in 1101 multiple RPL instances and DODAGs, it is highly recommended that both 1102 the RPLInstanceID and the DODAGID be included in the DAO. 1104 4.1.4. Path Metrics 1106 RPL relies on an Objective Function for selecting parents and 1107 computing path costs and rank.
This objective function is decoupled 1108 from the core RPL mechanisms and also from the metrics in use in the 1109 network. Two objective functions for RPL have been defined at the 1110 time of this writing, the RPL Objective Function 0 [RFC6552] and the 1111 Minimum Rank with Hysteresis Objective Function [RFC6719], both of 1112 which define a selection method for a preferred parent and backup 1113 parents, and are suitable for industrial automation network 1114 deployments. 1116 4.1.5. Objective Function 1118 Industrial wireless LLNs are subject to swift variations in terms of 1119 the propagation of the wireless signal, variations that can affect 1120 the quality of the links between field devices. This is due to the 1121 nature of the environment in which they operate which can be 1122 characterized as metal jungles that cause wireless propagation 1123 distortions, multi-path fading and scattering. Hence support for 1124 hysteresis is needed in order to ensure relative link stability which 1125 in turn ensures route stability. 1127 As mentioned in previous sections of this document, different traffic 1128 flows require different optimization goals. Wireless field devices 1129 should participate in multiple instances associated with multiple 1130 objective functions. 1132 Management Instance: Should utilize an objective function that 1133 focuses on optimization of latency and data reliability. 1135 Operational instance: Should utilize an objective function that 1136 focuses on data reliability and minimizing aggregate power 1137 consumption for battery operated field devices. 1139 Autonomous instance: Should utilize an objective function that 1140 optimizes data latency. The primary purpose of the autonomous 1141 instance is as a fallback instance in case the operational 1142 instance fails. Data latency is hence paramount for ensuring that 1143 the wireless field devices can exchange packets in order to repair 1144 the operational instance.
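The three-instance profile described above can be written down as a small configuration table. The following is only an illustrative sketch: the InstanceProfile type, its field names, and the helper instances_optimizing are hypothetical and do not correspond to any RPL data structure; the goal labels and the approval flags simply restate what this section says about each instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceProfile:
    """Hypothetical per-instance record; not an RPL-defined structure."""
    name: str
    of_goals: tuple           # what the objective function optimizes
    needs_pce_approval: bool  # approval by the System Manager's PCE


PROFILES = (
    InstanceProfile("management", ("latency", "reliability"), True),
    InstanceProfile("operational", ("reliability", "aggregate power"), True),
    # Formed without dependence on the System Manager; serves as the
    # fallback when the operational instance fails.
    InstanceProfile("autonomous", ("latency",), False),
)


def instances_optimizing(goal):
    """Names of the instances whose objective function targets goal."""
    return [p.name for p in PROFILES if goal in p.of_goals]


print(instances_optimizing("latency"))
```

A table of this kind makes it explicit that a field device participates in all three instances at once, each with its own optimization goal.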
1146 More complex objective functions are needed that take in 1147 consideration multiple constraints and utilize weighted sums of 1148 multiple additive and multiplicative metrics. Additional objective 1149 functions specifically designed for such networks may be defined in 1150 companion RFCs. 1152 4.1.6. DODAG Repair 1154 To effectively handle time-varying link characteristics and 1155 availability, industrial automation network deployments SHOULD 1156 utilize the local repair mechanisms in RPL. 1158 Local repair is triggered by broken link detection, and in storing 1159 mode also by loop detection. 1161 The first local repair mechanism consists of a node detaching from a 1162 DODAG and then re-attaching to the same or to a different DODAG at a 1163 later time. While detached, a node advertises an infinite rank value 1164 so that its children can select a different parent. This process is 1165 known as poisoning and is described in Section 8.2.2.5 of [RFC6550]. 1166 While RPL provides an option to form a local DODAG, doing so in 1167 industrial automation network deployments is of little benefit since 1168 applications typically communicate through a LBR. After the detached 1169 node has made sufficient effort to send notification to its children 1170 that it is detached, the node can rejoin the same DODAG with a higher 1171 rank value. The configured duration of the poisoning mechanism needs 1172 to take into account the disconnection time applications running over 1173 the network can tolerate. Note that when joining a different DODAG, 1174 the node need not perform poisoning. 1176 The second local repair mechanism controls how much a node can 1177 increase its rank within a given DODAG Version (e.g., after detaching 1178 from the DODAG as a result of broken link or loop detection). 
1179 Setting the DAGMaxRankIncrease to a non-zero value enables this 1180 mechanism, and setting it to a value of less than infinity limits the 1181 cost of count-to-infinity scenarios when they occur, thus controlling 1182 the duration of disconnection applications may experience. 1184 4.1.7. Multicast 1186 4.1.8. Security 1188 Industrial automation network deployments typically operate in areas 1189 that provide limited physical security (relative to the risk of 1190 attack). For this reason, the link layer, transport layer and 1191 application layer technologies utilized within such networks 1192 typically provide security mechanisms to ensure authentication, 1193 confidentiality, integrity, timeliness and freshness. As a result, 1194 such deployments may not need to implement RPL's security mechanisms 1195 and could rely on link layer and higher layer security features. 1197 4.1.9. P2P communications 1199 1201 4.2. Layer-two features 1203 4.2.1. Need layer-2 expert here. 1205 4.2.2. Security functions provided by layer-2. 1207 4.2.3. 6LowPAN options assumed. 1209 4.2.4. MLE and other things 1211 4.3. Recommended Configuration Defaults and Ranges 1213 4.3.1. Trickle Parameters 1215 Trickle was designed to be density-aware and perform well in networks 1216 characterized by a wide range of node densities. The combination of 1217 DIO packet suppression and adaptive timers for sending updates allows 1218 Trickle to perform well in both sparse and dense environments. 1220 Node densities in industrial automation network deployments can vary 1221 greatly, from nodes having only one or a handful of neighbors to 1222 nodes having several hundred neighbors. In high density 1223 environments, relatively low values for Imin may cause a short period 1224 of congestion when an inconsistency is detected and DIO updates are 1225 sent by a large number of neighboring nodes nearly simultaneously. 
1226 While the Trickle timer will exponentially back off, some time may 1227 elapse before the congestion subsides. Although some link layers 1228 employ contention mechanisms that attempt to avoid congestion, 1229 relying solely on the link layer to avoid congestion caused by a 1230 large number of DIO updates can result in increased communication 1231 latency for other control and data traffic in the network. 1233 To mitigate this kind of short-term congestion, this document 1234 recommends a more conservative set of values for the Trickle 1235 parameters than those specified in [RFC6206]. In particular, 1236 DIOIntervalMin is set to a larger value to avoid periods of 1237 congestion in dense environments, and DIORedundancyConstant is 1238 parameterized accordingly as described below. These values are 1239 appropriate for the timely distribution of DIO updates in both sparse 1240 and dense scenarios while avoiding the short-term congestion that 1241 might arise in dense scenarios. 1243 Because the actual link capacity depends on the particular link 1244 technology used within an industrial automation network deployment, 1245 the Trickle parameters are specified in terms of the link's maximum 1246 capacity for conveying link-local multicast messages. If the link 1247 can convey m link-local multicast packets per second on average, the 1248 expected time it takes to transmit a link-local multicast packet is 1249 1/m seconds. 1251 DIOIntervalMin: Industrial automation network deployments SHOULD set 1252 DIOIntervalMin such that the Trickle Imin is at least 50 times as 1253 long as it takes to convey a link-local multicast packet. This value 1254 is larger than that recommended in [RFC6206] to avoid congestion in 1255 dense plant deployments as described above. 1257 DIOIntervalDoublings: Industrial automation network deployments 1258 SHOULD set DIOIntervalDoublings such that the Trickle Imax is at 1259 least TBD minutes.
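The rule for DIOIntervalMin can be turned into a small calculation. The following is a minimal sketch under stated assumptions: it takes the RPL encoding in which Imin equals 2^DIOIntervalMin milliseconds and Imax equals Imin doubled DIOIntervalDoublings times; the function names are illustrative; and the Imax target is left as a caller-supplied parameter because this document has not yet fixed a value for it.

```python
def dio_interval_min(m_pkts_per_sec, factor=50):
    """Smallest exponent x such that Imin = 2**x ms is at least
    factor times the 1/m seconds needed per link-local multicast."""
    if m_pkts_per_sec <= 0:
        raise ValueError("link capacity must be positive")
    target_ms = factor * 1000.0 / m_pkts_per_sec
    x = 0
    while (1 << x) < target_ms:
        x += 1
    return x


def dio_interval_doublings(interval_min_exp, imax_target_ms):
    """Smallest number of doublings d such that
    Imax = (2**interval_min_exp ms) * 2**d reaches imax_target_ms.
    The target is an input; no recommended value is implied."""
    imin_ms = 1 << interval_min_exp
    d = 0
    while imin_ms * (1 << d) < imax_target_ms:
        d += 1
    return d


# A link conveying 100 link-local multicast packets per second needs
# Imin >= 50 * 10 ms = 500 ms, so the exponent is 9 (512 ms).
print(dio_interval_min(100))
```

Because Imin can only be a power of two in milliseconds, the computed interval is rounded up and may exceed the 50x target by up to a factor of two.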
1261 DIORedundancyConstant: Industrial automation network deployments 1262 SHOULD set DIORedundancyConstant to a value of at least 10. This is 1263 due to the larger chosen value for DIOIntervalMin and the 1264 proportional relationship between Imin and k suggested in [RFC6206]. 1265 This increase is intended to compensate for the increased 1266 communication latency of DIO updates caused by the increase in the 1267 DIOIntervalMin value, though the proportional relationship between 1268 Imin and k suggested in [RFC6206] is not preserved. Instead, 1269 DIORedundancyConstant is set to a lower value in order to reduce the 1270 number of packet transmissions in dense environments. 1272 4.3.2. Other Parameters 1274 1276 5. Manageability Considerations 1278 RPL enables automatic and consistent configuration of RPL routers 1279 through parameters specified by the DODAG root and disseminated 1280 through DIO packets. The use of Trickle for scheduling DIO 1281 transmissions ensures lightweight yet timely propagation of important 1282 network and parameter updates and allows network operators to choose 1283 the trade-off point they are comfortable with respect to overhead vs. 1284 reliability and timeliness of network updates. 1286 The metrics in use in the network along with the Trickle Timer 1287 parameters used to control the frequency and redundancy of network 1288 updates can be dynamically varied by the root during the lifetime of 1289 the network. To that end, all DIO messages SHOULD contain a Metric 1290 Container option for disseminating the metrics and metric values used 1291 for DODAG setup. In addition, DIO messages SHOULD contain a DODAG 1292 Configuration option for disseminating the Trickle Timer parameters 1293 throughout the network. 

   The ability to update dynamically both the metrics in use in the
   network and the frequency of network updates allows deployment
   characteristics (e.g., network density) to be discovered during
   network bring-up and used to tailor network parameters once the
   network is operational, rather than having to rely on precise
   pre-configuration.  This also allows the network parameters and the
   overall routing protocol behavior to evolve during the lifetime of
   the network.

   RPL specifies a number of variables and events that can be tracked
   for purposes of network fault and performance monitoring of RPL
   routers.  Depending on the memory and processing capabilities of each
   field device, various subsets of these can be employed in the field.

6.  Security Considerations

   Industrial automation network deployments typically operate in areas
   that provide limited physical security (relative to the risk of
   attack).  For this reason, the link-layer, transport-layer and
   application-layer technologies used within such networks typically
   provide security mechanisms to ensure authentication,
   confidentiality, integrity, timeliness and freshness.  As a result,
   such deployments may not need to implement RPL's security mechanisms
   and could rely on link-layer and higher-layer security features.

   This document does not specify operations that could introduce new
   threats.  Security considerations for RPL deployments are to be
   developed in accordance with recommendations laid out in, for
   example, [I-D.tsao-roll-security-framework].

   Industrial automation networks are subject to stringent security
   requirements as they are considered a critical infrastructure
   component.
   At the same time, since they are composed of large numbers of
   resource-constrained devices interconnected with limited-throughput
   links, many available security mechanisms are not practical for use
   in such networks.  As a result, the choice of security mechanisms is
   highly dependent on the device and network capabilities
   characterizing a particular deployment.

   In contrast to other types of LLNs, in industrial automation networks
   centralized administrative control and access to a permanent secure
   infrastructure are available.  As a result, link-layer,
   transport-layer and/or application-layer security mechanisms are
   typically in place and may make the use of RPL's secure mode
   unnecessary.

6.1.  Security Considerations during initial deployment

6.2.  Security Considerations during incremental deployment

7.  Other Related Protocols

8.  IANA Considerations

   This specification has no requirements for IANA.

9.  Acknowledgements

10.  References

10.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2.  Informative References

   [I-D.ietf-roll-p2p-rpl]
              Goyal, M., Baccelli, E., Philipp, M., Brandt, A., and J.
              Martocci, "Reactive Discovery of Point-to-Point Routes in
              Low Power and Lossy Networks", draft-ietf-roll-p2p-rpl-16
              (work in progress), February 2013.

   [I-D.ietf-roll-terminology]
              Vasseur, J., "Terminology in Low power And Lossy
              Networks", draft-ietf-roll-terminology-11 (work in
              progress), February 2013.

   [RFC2887]  Handley, M., Floyd, S., Whetten, B., Kermode, R.,
              Vicisano, L., and M. Luby, "The Reliable Multicast Design
              Space for Bulk Data Transfer", RFC 2887, August 2000.

   [RFC5548]  Dohler, M., Watteyne, T., Winter, T., and D. Barthel,
              "Routing Requirements for Urban Low-Power and Lossy
              Networks", RFC 5548, May 2009.

   [RFC5826]  Brandt, A., Buron, J., and G. Porcu, "Home Automation
              Routing Requirements in Low-Power and Lossy Networks",
              RFC 5826, April 2010.

   [RFC5867]  Martocci, J., De Mil, P., Riou, N., and W. Vermeylen,
              "Building Automation Routing Requirements in Low-Power and
              Lossy Networks", RFC 5867, June 2010.

   [RFC5673]  Pister, K., Thubert, P., Dwars, S., and T. Phinney,
              "Industrial Routing Requirements in Low-Power and Lossy
              Networks", RFC 5673, October 2009.

   [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko,
              "The Trickle Algorithm", RFC 6206, March 2011.

   [RFC6550]  Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R.,
              Levis, P., Pister, K., Struik, R., Vasseur, JP., and R.
              Alexander, "RPL: IPv6 Routing Protocol for Low-Power and
              Lossy Networks", RFC 6550, March 2012.

   [RFC6552]  Thubert, P., "Objective Function Zero for the Routing
              Protocol for Low-Power and Lossy Networks (RPL)",
              RFC 6552, March 2012.

   [RFC6719]  Gnawali, O. and P. Levis, "The Minimum Rank with
              Hysteresis Objective Function", RFC 6719, September 2012.

   [I-D.thubert-roll-asymlink]
              Thubert, P., "RPL adaptation for asymmetrical links",
              draft-thubert-roll-asymlink-02 (work in progress),
              December 2011.

   [I-D.thubert-roll-forwarding-frags]
              Thubert, P. and J. Hui, "LLN Fragment Forwarding and
              Recovery", draft-thubert-roll-forwarding-frags-01 (work in
              progress), February 2013.

   [I-D.tsao-roll-security-framework]
              Tsao, T., Alexander, R., Daza, V., and A. Lozano, "A
              Security Framework for Routing over Low Power and Lossy
              Networks", draft-tsao-roll-security-framework-02 (work in
              progress), March 2010.

10.3.  External Informative References

   [HART]     www.hartcomm.org, "Highway Addressable Remote Transducer,
              a group of specifications for industrial process and
              control devices administered by the HART Foundation".

   [ISA100.11a]
              ISA, "ISA100, Wireless Systems for Automation", May 2008,
              <http://www.isa.org/Community/SP100WirelessSystemsforAutomation>.

Authors' Addresses

   Tom Phinney (editor)
   consultant
   5012 W. Torrey Pines Circle
   Glendale, AZ  85308-3221
   USA

   Phone: +1 602 938 3163
   Email: tom.phinney@cox.net

   Pascal Thubert
   Cisco Systems
   Village d'Entreprises Green Side
   400, Avenue de Roumanille
   Batiment T3
   Biot - Sophia Antipolis  06410
   FRANCE

   Phone: +33 497 23 26 34
   Email: pthubert@cisco.com

   Robert Assimiti
   Nivis
   1000 Circle 75 Parkway SE, Ste 300
   Atlanta, GA  30339
   USA

   Phone: +1 678 202 6859
   Email: robert.assimiti@nivis.com