idnits 2.17.1 

draft-ietf-roll-rpl-industrial-applicability-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 3) being 60 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 31 instances of too long lines in the document, the longest
     one being 3 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 929 has weird spacing: '...ing the  use o...'

  == Line 1010 has weird spacing: '... of the  wirel...'

  == Line 1037 has weird spacing: '... in the  in th...'

  -- The document date (October 21, 2013) is 3837 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'IEEE802154e' is mentioned on line 830, but not
     defined

  == Missing Reference: 'ZigBeeIP' is mentioned on line 1468, but not defined

  == Missing Reference: 'HART' is mentioned on line 1459, but not defined

  == Unused Reference: 'I-D.ietf-roll-terminology' is defined on line 1376,
     but no explicit reference was found in the text

  == Unused Reference: 'I-D.thubert-6lo-forwarding-fragments' is defined on
     line 1425, but no explicit reference was found in the text

  == Unused Reference: 'I-D.vilajosana-6tisch-minimal' is defined on line
     1452, but no explicit reference was found in the text

  == Outdated reference: A later version (-13) exists of
     draft-ietf-roll-terminology-12

  == Outdated reference: A later version (-08) exists of
     draft-thubert-6lo-forwarding-fragments-00

  == Outdated reference: A later version (-01) exists of
     draft-thubert-6tisch-architecture-00


     Summary: 1 error (**), 0 flaws (~~), 14 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	ROLL                                                     T. Phinney, Ed.
3	Internet-Draft                                                consultant
4	Intended status: Informational                                P. Thubert
5	Expires: April 22, 2014                                            cisco
6	                                                            RA. Assimiti
7	                                                                   Nivis
8	                                                        October 21, 2013

10	                RPL applicability in industrial networks
11	            draft-ietf-roll-rpl-industrial-applicability-02

13	Abstract

15	   The wide deployment of wireless devices, with their low installed
16	   cost (compared to wired devices), will significantly improve the
17	   productivity and safety of industrial plants.  It will simultaneously
18	   increase the efficiency and safety of the plant's workers, by
19	   extending and making more timely the information set available about
20	   plant operations.  The new Routing Protocol for Low Power and Lossy
21	   Networks (RPL) defines a Distance Vector protocol that is designed
22	   for such networks.  The aim of this document is to analyze the
23	   applicability of that routing protocol in industrial LLNs formed of
24	   field devices.

26	Status of this Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on April 22, 2014.

43	Copyright Notice

45	   Copyright (c) 2013 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents (http://trustee.ietf.org/
50	   license-info) in effect on the date of publication of this document.
51	   Please review these documents carefully, as they describe your rights
52	   and restrictions with respect to this document.  Code Components
53	   extracted from this document must include Simplified BSD License text
54	   as described in Section 4.e of the Trust Legal Provisions and are
55	   provided without warranty as described in the Simplified BSD License.

57	Table of Contents

59	   1.  Introduction
60	     1.1.  Requirements Language
61	     1.2.  Required Reading
62	     1.3.  Out of scope requirements
63	   2.  Deployment Scenario
64	     2.1.  Network Topologies
65	       2.1.1.  Traffic Characteristics
66	       2.1.2.  Topologies
67	       2.1.3.  Source-sink (SS) communication paradigm
68	       2.1.4.  Publish-subscribe (PS, or pub/sub) communication paradigm
69	       2.1.5.  Peer-to-peer (P2P) communication paradigm
70	       2.1.6.  Peer-to-multipeer (P2MP) communication paradigm
71	       2.1.7.  Additional considerations: Duocast and N-cast
72	       2.1.8.  RPL applicability per communication paradigm
73	     2.2.  Layer 2 applicability
74	   3.  Using RPL to Meet Functional Requirements
75	   4.  RPL Profile
76	     4.1.  RPL Features
77	       4.1.1.  RPL Instances
78	       4.1.2.  Storing vs. Non-Storing Mode
79	       4.1.3.  DAO Policy
80	       4.1.4.  Path Metrics
81	       4.1.5.  Objective Function
82	       4.1.6.  DODAG Repair
83	       4.1.7.  MPL Profile
84	       4.1.8.  Security
85	       4.1.9.  P2P communications
86	     4.2.  Layer-two features
87	     4.3.  Recommended Configuration Defaults and Ranges
88	       4.3.1.  Trickle Parameters
89	       4.3.2.  Other Parameters
90	   5.  Manageability Considerations
91	   6.  Security Considerations
92	     6.1.  Security Considerations during initial deployment
93	     6.2.  Security Considerations during incremental deployment
94	   7.  Other Related Protocols
95	   8.  IANA Considerations
96	   9.  Acknowledgements
97	   10. References
98	     10.1.  Normative References
99	     10.2.  Informative References
100	     10.3.  External Informative References

102	   Authors' Addresses

104	1.  Introduction

106	   Information Technology (IT) is already, and increasingly will be,
107	   applied to Industrial Automation and Control System (IACS) technology
108	   in application areas where those IT technologies can be constrained
109	   sufficiently by Service Level Agreements (SLA) or other modest change
110	   that they are able to meet the operational needs of IACS.  When that
111	   happens, the IACS benefits from the large intellectual, experiential
112	   and training investment that has already occurred in those IT
113	   precursors.  One can conclude that future reuse of additional IT
114	   protocols for IACS will continue to occur due to the significant
115	   intellectual, experiential and training economies which result from
116	   that reuse.

118	   Following that logic, many vendors are already extending or replacing
119	   their local field-bus technology with Ethernet and IP-based
120	   solutions.  Examples of this evolution include CIP EtherNet/IP,
121	   Modbus/TCP, Foundation Fieldbus HSE, PROFInet and Invensys/Foxboro
122	   FOXnet.  At the same time, wireless, low power field devices are
123	   being introduced that facilitate a significant increase in the amount
124	   of information which industrial users can collect and the number of
125	   control points that can be remotely managed.

127	   IPv6 appears as a core technology at the conjunction of both trends,
128	   as illustrated by the current [ISA100.11a] industrial Wireless Sensor
129	   Networking (WSN) specification, where layers 1-4 technologies
130	   developed for end uses other than IACS - IEEE 802.15.4 PHY and MAC,
131	   6LoWPAN and IPv6, and UDP - are adapted to IACS use.  But due to the
132	   lack of open standards for routing in Low power and Lossy Networks
133	   (LLN) at the time ISA100.11a was crafted, routing was accomplished at
134	   the link layer and is specific to that standard.

136	   The IETF ROLL Working Group has defined application-specific routing
137	   requirements for an LLN routing protocol, specified in:

139	      Routing Requirements for Urban LLNs [RFC5548],

141	      Industrial Routing Requirements in LLNs [RFC5673],

143	      Home Automation Routing Requirements in LLNs [RFC5826], and

145	      Building Automation Routing Requirements in LLNs [RFC5867].

147	   The Routing Protocol for Low Power and Lossy Networks (RPL)
148	   [RFC6550] specification and its point-to-point extension/optimization
149	   [RFC6997] define a generic Distance Vector protocol that is adapted
150	   to a variety of Low Power and Lossy Networks (LLN) types by the
151	   application of specific Objective Functions (OFs).  RPL forms
152	   Destination Oriented Directed Acyclic Graphs (DODAGs) within
153	   instances of the protocol, each instance being associated with an
154	   Objective Function to form a routing topology.

156	   A field device that belongs to an instance uses the OF to determine
157	   which DODAG and which Version of that DODAG the device should join.
158	   The device also uses the OF to select a number of routers within the
159	   DODAG's current and subsequent Versions to serve as parents or as
160	   feasible successors.  A new Version of the DODAG is periodically
161	   reconstructed to enable a global reoptimization of the graph.

163	   A RPL OF states the outcome of the process used by a RPL node to
164	   select and optimize routes within a RPL Instance based on the
165	   information objects available.  The separation of OFs from the core
166	   protocol specification allows RPL to be adapted to meet the different
167	   optimization criteria required by the wide range of industrial
168	   classes of traffic and applications.
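
   As a rough illustration of how an OF can drive parent selection, the
   sketch below orders candidate routers by a path cost.  The cost
   formula (advertised rank plus a link cost), the Candidate type, and
   the max_parents limit are assumptions made for illustration only;
   they do not reproduce OF0, MRHOF, or any other specified OF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A neighboring router heard by the field device (hypothetical type)."""
    name: str
    rank: int        # rank advertised by the neighbor
    link_cost: int   # assumed cost of the link to that neighbor


def select_parents(candidates, max_parents=3):
    """Order candidates by assumed path cost, lowest first.

    The first entry plays the role of preferred parent; the remaining
    entries stand in for additional parents or feasible successors.
    Ties are broken by name so the result is deterministic.
    """
    ranked = sorted(candidates, key=lambda c: (c.rank + c.link_cost, c.name))
    return ranked[:max_parents]


neighbors = [
    Candidate("A", rank=256, link_cost=128),
    Candidate("B", rank=512, link_cost=64),
    Candidate("C", rank=256, link_cost=256),
]
parents = select_parents(neighbors, max_parents=2)
```

   Swapping the sort key is all it takes to optimize for a different
   criterion, which mirrors the separation of the OF from the core
   protocol described above.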

170	   This document provides information on how RPL can accommodate the
171	   industrial requirements for LLNs, in particular as specified in
172	   [RFC5673].

174	1.1.  Requirements Language

176	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
177	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
178	   "OPTIONAL" in this document are to be interpreted as described in RFC
179	   2119 [RFC2119].

181	   Additionally, this document uses terminology from
182	   [I-D.ietf-roll-terminology] and common terminology from the Process
183	   Control and Factory Automation industries, some recapitulated below:

185	   FEC:  Forward error correction

187	   IACS: Industrial automation and control systems

189	   RAND: reasonable and non-discriminatory (relative to licensing of
190	         patents)

192	1.2.  Required Reading

194	1.3.  Out of scope requirements

196	   This applicability statement does not address requirements related to
197	   wireless LLNs employed in factory automation and related
198	   applications.

200	2.  Deployment Scenario

202	   [RFC5673] describes in detail the routing requirements for industrial
203	   LLNs.  This document provides information on the varying deployment
204	   scenarios for such LLNs and how RPL assists in meeting those
205	   requirements.

207	   Large industrial plants, or major operating areas within such plants,
208	   repeatedly go through four major phases, each of which typically
209	   lasts from months to years:

211	     P1: Construction or major modification phase

213	     P2: Planned startup phase

215	     P3: Normal operation phase

217	     P4: Planned shutdown phase

219	   followed eventually by an (at least theoretical)

221	     P5: Plant decommissioning phase.

223	   It is also likely, after a major catastrophe at a plant, to have a

225	     P6: Post-emergency recovery and repair phase.

227	   The deployment scenarios for wireless LLN devices may be different in
228	   each of these phases.  In particular, during the Construction or
229	   major modification phase (P1), LLN devices may be installed months
230	   before the intended LLN can become usefully operational (because
231	   needed routers and infrastructure devices are not yet installed or
232	   active), and there are likely to be many personnel in whom the plant
233	   owner/operator has only limited trust, such as subcontractors and
234	   others in the plant area who have undergone only a cursory background
235	   investigation (if any at all).  In general, during this phase, plant
236	   instrumentation is not yet operational, so a device could be removed
237	   and replaced by a Trojaned one without much likelihood of physical
238	   detection of the substitution.  Thus physical security of LLN devices
239	   is generally a more significant risk factor during this phase than
240	   once the plant is operational, where simple replacement of device
241	   electronics is detectable.

243	   Extra LLN devices and even extra LLN subnets may be employed during
244	   Planned startup (P2) and Planned shutdown (P4) phases, in support of
245	   the task of transitioning the plant or plant area between operational
246	   and shutdown states.  The extra devices typically provide extra
247	   monitoring as the plant traverses infrequent activity states.  (In
248	   many continuous process plants, up to 2x extra staff are employed at
249	   monitoring and control workstations during these two phases,
250	   precisely because the plant is undergoing extraordinary behavior as
251	   it transitions to or from its steady-state operational condition.)

253	   Similar transient devices and subnets may be used during an
254	   unscheduled Post-emergency recovery and repair phase (P6) of
255	   operation, but in that case the extra devices usually are routers
256	   substituting for plant LLN devices that have been damaged by the
257	   incident (such as a fire, explosion, flood, tornado or hurricane)
258	   that induced the emergency.

260	   The Planned startup (P2) and Planned shutdown (P4) phases are similar
261	   in many respects, but the LLN environment of the two can be quite
262	   different, since the Planned shutdown phase can assume that the
263	   stable LLN environment used for Normal operation (P3) is functional
264	   during shutdown, whereas that stable environment usually is still
265	   being established during startup.

267	   The Post-emergency recovery and repair phase (P6) typically operates
268	   in an LLN environment that is somewhere between that of the Planned
269	   startup (P2) and Normal operation (P3) phases, but with an
270	   indeterminate number of temporary routers placed to facilitate
271	   communication across and around the area affected by the catastrophe.

273	   Smaller industrial plants and sites may go through similar phases,
274	   but often commingle the phases because, in those smaller plants, the
275	   phases require less planning and structuring of personnel
276	   responsibilities and thus permit less formalization and partitioning
277	   of the operating scenarios.  For example, it is much simpler, and
278	   usually requires much less planning, to bring new equipment on a skid
279	   into a plant, using a forklift, than to lay temporary railroad track
280	   or employ an extended-axle heavy haul tractor-trailer to deliver a
281	   multi-ton process vessel, and temporarily deploy and use very large
282	   heavy-lift cranes to install it.  In the former case, nearby
283	   equipment usually can continue normal operation while the
284	   installation proceeds; in the latter case that is almost always
285	   impossible, due to safety and other concerns.

287	   The domain of applicability for the RPL protocol may include all
288	   phases but the Normal Operation phase, where the bandwidth allocation
289	   and the routes are usually optimized by an external Path Computing
290	   Engine (PCE), e.g., an ISA100.11a System Manager.

292	   Additionally, RPL could be used during Normal operation, provided
293	   that a new Objective Function is defined that actually interacts
294	   with the PCE in order to establish the reference topology.  In that
295	   case, RPL operations would only apply to emergency repair actions,
296	   when the reference topology becomes unusable because of some
297	   failure, and for as long as the problem persists.

299	2.1.  Network Topologies

301	2.1.1.  Traffic Characteristics

303	   The industrial market classifies process applications into three
304	   broad categories and six classes.

306	   o  Safety

308	      *  Class 0: Emergency action - Always a critical function

310	   o  Control
311	      *  Class 1: Closed loop regulatory control - Often a critical
312	         function

314	      *  Class 2: Closed loop supervisory control - Usually non-critical
315	         function

317	      *  Class 3: Open loop control - Operator takes action and controls
318	         the actuator (human in the loop)

320	   o  Monitoring

322	      *  Class 4: Alerting - Short-term operational effect (for example
323	         event-based maintenance)

325	      *  Class 5: Logging and downloading / uploading - No immediate
326	         operational consequence (e.g., history collection, sequence-of-
327	         events, preventive maintenance)

329	   Safety critical functions affect the basic safety integrity of the
330	   plant.  These normally dormant functions kick in only when process
331	   control systems, or their operators, have failed.  By design and by
332	   regular interval inspection, they have a well-understood probability
333	   of failure on demand in the range of typically once per 10-1000
334	   years.

336	   In-time delivery of messages becomes more relevant as the class
337	   number decreases.
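
   The classification above can be captured in a small lookup
   structure.  The sketch below is illustrative only: the type and
   function names are not taken from any specification, and the helper
   simply encodes the rule that a lower class number implies a stronger
   need for in-time delivery.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    SAFETY = "Safety"
    CONTROL = "Control"
    MONITORING = "Monitoring"


@dataclass(frozen=True)
class TrafficClass:
    number: int
    category: Category
    description: str


# The six classes, transcribed from the list above.
TRAFFIC_CLASSES = {
    0: TrafficClass(0, Category.SAFETY, "Emergency action"),
    1: TrafficClass(1, Category.CONTROL, "Closed loop regulatory control"),
    2: TrafficClass(2, Category.CONTROL, "Closed loop supervisory control"),
    3: TrafficClass(3, Category.CONTROL, "Open loop control"),
    4: TrafficClass(4, Category.MONITORING, "Alerting"),
    5: TrafficClass(5, Category.MONITORING,
                    "Logging and downloading / uploading"),
}


def more_time_critical(a: int, b: int) -> TrafficClass:
    """Return whichever of two classes needs in-time delivery more.

    Timeliness matters more as the class number decreases, so the
    class with the lower number wins.
    """
    return TRAFFIC_CLASSES[min(a, b)]
```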

339	   Note that for a control application, jitter is just as important as
340	   latency and has the potential to destabilize control algorithms.

342	   The domain of applicability for the RPL protocol probably matches the
343	   range of classes where industrial users are interested in deploying
344	   wireless networks.  This domain includes monitoring classes (4 and
345	   5), and the non-critical portions of control classes (2 and 3). RPL
346	   might also be considered as an additional repair mechanism in all
347	   situations, and independently of the flow classification and the
348	   medium type.

350	   It appears from the above sections that whether and how RPL can
351	   be applied for a given flow depends both on the deployment scenario
352	   and on the class of application / traffic.  At a high level, this can
353	   be summarized by the following matrix:

355	   +---------------------+------------------------------------------------+
356	   |   Phase \  Class    |   0       1       2       3       4       5    |
357	   +=====================+================================================+
358	   |   Construction      |                   X       X       X       X    |
359	   +---------------------+------------------------------------------------+
360	   |   Planned startup   |                   X       X       X       X    |
361	   +---------------------+------------------------------------------------+
362	   |   Normal operation  |                           ?       ?       ?    |
363	   +---------------------+------------------------------------------------+
364	   |   Planned shutdown  |                   X       X       X       X    |
365	   +---------------------+------------------------------------------------+
366	   |Plant decommissioning|                   X       X       X       X    |
367	   +---------------------+------------------------------------------------+
368	   | Recovery and repair |   X       X       X       X       X       X    |
369	   +---------------------+------------------------------------------------+

371	    ? : typically usable for all but higher-rate classes 0,1 PS traffic
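
   For readers who want to consult the matrix programmatically, it can
   be transcribed as a nested mapping.  This is a minimal sketch: the
   names APPLICABILITY and rpl_applicability are illustrative, and an
   empty string is an assumed encoding for a blank cell.

```python
# "X" marks a cell where the matrix indicates RPL is applicable;
# "?" marks a cell that carries the caveat given in the legend.
_CLASSES_2_TO_5 = {2: "X", 3: "X", 4: "X", 5: "X"}

APPLICABILITY = {
    "Construction": dict(_CLASSES_2_TO_5),
    "Planned startup": dict(_CLASSES_2_TO_5),
    "Normal operation": {3: "?", 4: "?", 5: "?"},
    "Planned shutdown": dict(_CLASSES_2_TO_5),
    "Plant decommissioning": dict(_CLASSES_2_TO_5),
    "Recovery and repair": {c: "X" for c in range(6)},
}


def rpl_applicability(phase: str, traffic_class: int) -> str:
    """Return "X", "?", or "" (blank cell) for a phase and class."""
    if traffic_class not in range(6):
        raise ValueError("traffic class must be in the range 0..5")
    return APPLICABILITY[phase].get(traffic_class, "")
```

   For example, rpl_applicability("Normal operation", 4) yields "?",
   while rpl_applicability("Construction", 0) yields the empty string.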

373	2.1.2.  Topologies

375	   In an IACS, high-rate communications flows (e.g., 1 Hz or 4 Hz for a
376	   traditional process automation network) typically are such that only
377	   a single wireless LLN hop separates the source device from an LLN
378	   Border Router (LBR) to a significantly higher data-rate backbone
379	   network, typically based on IEEE 802.3, IEEE 802.11, or IEEE 802.16,
380	   as illustrated in Figure 2.

382	                  ---+------------------------
383	                     |          Plant Network
384	                     |
385	                  +-----+
386	                  |     | Gateway
387	                  |     |
388	                  +-----+
389	                     |
390	                     |      Backbone
391	               +--------------------+------------------+
392	               |                    |                  |
393	            +-----+             +-----+             +-----+
394	            |     | LLN border  |     | LLN border  |     | LLN border
395	       o    |     | router      |     | router      |     | router
396	            +-----+             +-----+             +-----+
397	       o                  o                   o                 o
398	           o    o   o         o   o  o   o         o  o   o o
399	                                   LLN

401	    o : stationary wireless field device, seldom acting as an LLN router

403	   For factory automation networks, the basic communications cycle for
404	   control is typically much faster, on the order of 100 Hz or more.  In
405	   this case the LLN itself may be based on high-data-rate IEEE 802.11
406	   or a 100 Mbit/s or faster optical link, and the higher-rate network
407	   used by the LBRs to connect the LLN to superior automation equipment
408	   typically might be based on fiber-optic IEEE 802.3, with multiple
409	   LBRs around the periphery of the factory area, so that most high-rate
410	   communications again require only a single wireless LLN hop.

412	   Multi-hop LLN routing is used within the LLN portion of such networks
413	   to provide backup communications paths when primary single-hop LLN
414	   paths fail, or for lower repetition rate communications where longer
415	   LLN transit times and higher variance are not an issue.  Typically,
416	   the majority of devices in an IACS can tolerate such higher-delay
417	   higher-variance paths, so routing choices often are driven by energy
418	   considerations for the affected devices, rather than simply by IACS
419	   performance requirements, as illustrated in Figure 3.

421	                  ---+------------------------
422	                     |          Plant Network
423	                     |
424	                  +-----+
425	                  |     | Gateway
426	                  |     |
427	                  +-----+
428	                     |
429	                     |      Backbone
430	               +--------------------+------------------+
431	               |                    |                  |
432	            +-----+             +-----+             +-----+
433	            |     | Backbone    |     | Backbone    |     | Backbone
434	            |     | router      |     | router      |     | router
435	            +-----+             +-----+             +-----+
436	               o    o   o    o     o   o  o   o   o   o  o   o o
437	           o o   o  o   o  o  o o   o  o  o   o   o   o  o  o  o o
438	          o  o o  o o    o   o   o  o  o  o    M    o  o  o o o
439	          o   o  M o  o  o     o  o    o  o  o    o  o   o  o   o
440	            o   o o       o        o  o         o        o o
441	                    o           o          o             o     o
442	                                   LLN

444	    o : stationary wireless field device, often acting as an LLN router
445	    M : mobile wireless device

447	   Two decades of experience with digital fieldbuses has shown that four
448	   communications paradigms dominate in IACS:

450	   SS:   Source-sink

452	   PS:   Publish-subscribe

454	   P2P:  Peer-to-peer

456	   P2MP: Peer-to-multipeer

458	2.1.3.  Source-sink (SS) communication paradigm

460	   In SS, the source-sink communication paradigm, each of many devices
461	   in one set, S1, sends UDP-like messages, usually infrequently and
462	   intermittently, to a second set of devices, S2, determined by a
463	   common multicast address.  A typical example would be that all
464	   devices within a given process unit N are configured to send process
465	   alarm messages to the multicast address
466	   Receivers_of_process_alarms_for_unit_N. Receiving devices, typically
467	   on non-LLN networks accessed via LBRs, are configured to receive such
468	   multicast messages if their work assignment covers process unit N,
469	   and not otherwise.

471	   Timeliness of message delivery is a significant aspect of some SS
472	   communication.  When the SS traffic conveys process alarms or device
473	   alerts, there is often a contractual requirement, and sometimes even
474	   a regulatory requirement, on the maximum end-to-end transit delay of
475	   the SS message, including both the LLN and non-LLN components of that
476	   delay.  However, there is no requirement on relative jitter in the
477	   delivery of multiple SS messages from the same source, and message
478	   reordering during transit is irrelevant.

480	   Within the LLN, the SS paradigm simply requires that messages so
481	   addressed be forwarded to the responsible LBR (or set of equivalent
482	   LBRs) for further forwarding outside the LLN. Within the LLN such
483	   traffic typically is device-to-LBR or device-to-redundant-set-of-
484	   equivalent-LBRs.  In general, SS traffic may be aggregated before
485	   forwarding when both the multicast destination address and other QoS
486	   attributes are identical.  If information on the target delivery
487	   times for SS messages is available to the aggregating forwarding
488	   device, that device may intentionally delay forwarding somewhat to
489	   facilitate further aggregation, which can significantly reduce LLN
490	   alarm-reporting traffic during major plant upset events.
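
   The aggregation behavior described above can be sketched as follows.
   This is an illustration under stated assumptions, not a specified
   mechanism: the SSMessage fields, the use of a single integer for
   "other QoS attributes", and the per-message forwarding deadline are
   all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SSMessage:
    dest: str        # multicast destination address
    qos: int         # stand-in for the other QoS attributes
    deadline: float  # latest time at which this node must forward it
    payload: bytes


class SSAggregator:
    """Holds SS messages and forwards them in per-destination batches."""

    def __init__(self):
        # (dest, qos) -> messages waiting to be aggregated
        self._pending = defaultdict(list)

    def enqueue(self, msg: SSMessage) -> None:
        self._pending[(msg.dest, msg.qos)].append(msg)

    def flush_due(self, now: float):
        """Return one aggregate per group whose earliest deadline is due.

        Groups that are not yet due stay queued, which is the
        intentional delay that allows further aggregation.
        """
        due = []
        for key in list(self._pending):
            batch = self._pending[key]
            if min(m.deadline for m in batch) <= now:
                due.append((key, [m.payload for m in batch]))
                del self._pending[key]
        return due
```

   Only messages with the same destination and QoS value share a batch,
   and a batch leaves as soon as its most urgent member requires it.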

492	2.1.4.  Publish-subscribe (PS, or pub/sub) communication paradigm

494	   In PS, the publish-subscribe communication paradigm, a device sends
495	   UDP-like messages, usually periodically or cyclically (i.e.,
496	   repetitively but without fixed periodicity), to a single multicast
497	   address derived from or correlated with the device's own address.  A
498	   typical example would be that each sensor and actuator device within
499	   a given process unit N is configured to send process state messages
500	   to the multicast address that designates its specific publications.
501	   In essence the derived multicast address for device D is
502	   Receivers_of_publications_by_device_D. Typically those receivers are
503	   in two categories: controllers (C) for control loops in which device
504	   D participates, and devices accessed via the LLN's LBRs that monitor
505	   and/or accumulate historical information about device D's status and
506	   outputs.

508	   If the controller(s) that receive device D's publication are all
509	   outside the LLN and accessed by LBRs, then within the LLN such
510	   traffic typically is device-to-LBR or device-to-redundant-set-of-
511	   equivalent-LBRs.  But if a controller (Cn) is within the LLN, then a
512	   number of different LLN-local traffic patterns may be employed,
513	   depending on the capabilities of the underlying link technology and
514	   on configured performance requirements for such reporting.  Typically
515	   in such a case, publication by device D is forwarded up a DODAG to an
516	   LLN router that is also on a downward DODAG to a destination
517	   controller Cn, then forwarded down that second DODAG to that
518	   destination controller Cn.  Of course, if the LLN router (or even the
519	   LBR) is itself the intended destination controller, which will often
520	   be the case, then no downward forwarding occurs.
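
   The up-then-down forwarding pattern can be illustrated with a
   simplified model.  The sketch assumes a single tree described by a
   hypothetical map from each node to its preferred parent, and assumes
   that map is acyclic; a real deployment may involve separate upward
   and downward DODAGs and multiple parents per node.

```python
def path_to_root(node, parent):
    """List the nodes from `node` up to the root, following parents."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def up_down_path(src, dst, parent):
    """Forward up from src to the first router that is also above dst
    (or is dst itself), then down to dst."""
    up = path_to_root(src, parent)
    down = path_to_root(dst, parent)
    down_index = {n: i for i, n in enumerate(down)}
    for i, n in enumerate(up):
        if n in down_index:
            # Climb to the shared router, then descend toward dst.
            return up[:i + 1] + down[:down_index[n]][::-1]
    raise ValueError("src and dst share no router")


# Device D and controller Cn both sit below router R1.
parent = {"D": "R1", "R1": "LBR", "Cn": "R2", "R2": "R1"}
route = up_down_path("D", "Cn", parent)
```

   When the destination controller is itself the shared router, for
   example up_down_path("D", "R1", parent), the downward segment is
   empty, matching the case in which no downward forwarding occurs.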

522	   Timeliness of message delivery is a critical aspect of PS
523	   communication.  Individual messages can be lost without significant
524	   impact on the controlled physical process, but typically a sequence
525	   of four consecutive lost messages will trigger fallback behavior of
526	   the control algorithms, which is considered a system failure by most
527	   system owner/operators.  (In general, and unless a local catastrophic
528	   event such as a major explosion or a tornado occurs in the plant,
529	   invocation of more than one instance of such fallback handling per
530	   year, per plant, is considered unacceptable.)
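
   The loss-tolerance rule above amounts to a counter of consecutive
   missed publications.  The following sketch is illustrative; the
   class name and the default threshold of four (taken from the
   "typically four" figure above) are assumptions rather than values
   from any control-system specification.

```python
class LossMonitor:
    """Tracks consecutive missed publications for one subscription."""

    def __init__(self, fallback_threshold: int = 4):
        self.fallback_threshold = fallback_threshold
        self.consecutive_losses = 0
        self.fallback_count = 0

    def on_period(self, received: bool) -> bool:
        """Record one publication period.

        Returns True only in the period in which the run of losses
        reaches the threshold, i.e. when fallback would be triggered.
        """
        if received:
            self.consecutive_losses = 0
            return False
        self.consecutive_losses += 1
        if self.consecutive_losses == self.fallback_threshold:
            self.fallback_count += 1
            return True
        return False


monitor = LossMonitor()
# Three losses, a successful delivery, then four losses in a row.
history = [False, False, False, True, False, False, False, False]
triggers = [monitor.on_period(r) for r in history]
```

   Isolated losses reset harmlessly on the next delivery; only the
   final run of four consecutive losses triggers fallback here.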

532	   Message loss, delay and jitter in delivery of PS messaging are a
533	   relative matter.  PS messaging is used for transfer of process
534	   measurements and associated status from sensors to control
535	   computation elements, from control computation elements to actuators,
536	   and of current commanded position and status from actuators back to
537	   control computation elements.  The actual time interval of interest
538	   is that which starts with sensing of the physical process (which
539	   necessarily occurs before the sensed value can be sent in the first
540	   message) and which ends when the computed control correction is
541	   applied to the physical process by the appropriate actuator (which
542	   cannot occur until after the second message containing the computed
543	   control output has been received by that actuator). With rare
544	   exception, the control algorithms used with PS messaging in the
545	   process automation industries - those managing continuous material
546	   flows - rely on fixed-period sampling, computation and transfer of
547	   outputs, while those in the factory automation industries - those
548	   managing discrete manufacturing operations - rely on bounded delay
549	   between sampling of inputs, control computation and transfer of
550	   outputs to physical actuators that affect the controlled process.

552	   Deliberately manipulated message delay and jitter in delivery of PS
553	   messaging have the potential to destabilize control loops.  It is the
554	   responsibility of conveyed higher-level protocols to protect against
555	   such potential security attacks by detecting overly delayed or
556	   jittered messages at delivery, converting them into instances of
557	   message loss.  Thus network and data-link protocols such as IPv6 and
558	   Ethernet need not themselves address such issues, although their
559	   selection and employment should take the existence (or lack) of such
560	   higher-layer protection mechanisms, and the resulting consequences
561	   due to excessive delay and jitter, into consideration in their
562	   parameterization.
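
   A receiving higher-layer protocol might implement the delay and
   jitter check described above along the following lines.  This is a
   sketch under stated assumptions: the function and parameter names
   are hypothetical, timestamps are integer milliseconds from
   synchronized clocks, and jitter is measured as deviation of the
   inter-arrival time from the nominal reporting period.

```python
def classify_delivery(sent_at, received_at, prev_received_at,
                      period, max_delay, max_jitter):
    """Return "accepted" or "lost" for one PS message (all times in ms)."""
    delay = received_at - sent_at
    # An overly delayed (or impossibly early) message is treated as lost.
    if delay < 0 or delay > max_delay:
        return "lost"
    if prev_received_at is not None:
        # Jitter: deviation of the inter-arrival time from the nominal period.
        jitter = abs((received_at - prev_received_at) - period)
        if jitter > max_jitter:
            return "lost"
    return "accepted"
```

   Messages classified as "lost" are simply discarded, so a manipulated
   delay looks to the control algorithm like ordinary message loss.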

564	   In general, PS traffic within the LLN is not aggregated before
565	   forwarding, to minimize message loss and delay in reception by any
566	   relevant controller(s) that are outside the LLN. However, if all
567	   intended destination controllers are within the LLN, and at least one
568	   of those intended controllers also serves as an LLN router on a DODAG
569	   to off-LLN destinations that all are not controllers, then the router
570	   functions in that device may aggregate PS traffic before forwarding
571	   when the required routing and other QoS attributes are identical.  If
572	   information on the target delivery times for PS messages to non-
573	   controller devices is available to the aggregating forwarding device,
574	   that device may intentionally delay forwarding somewhat to facilitate
575	   further aggregation.

577	   In some system architectures, message streams that use PS to convey
578	   current process measurements and status are compressed at the source
579	   through a 2-dimensional winnowing process that compares

581	   1) the process measurement values and status of the about-to-be-sent
582	      message with that of the last actually-sent message, and

584	   2) the current time vs.  the queueing time for the last actually-sent
585	      message.

587	   If the interval since that last-sent message is less than a
588	   predefined maximum time, and the status is unchanged, and the process
589	   measurement(s) conveyed in the message is within predefined
590	   deadband(s) of the last-sent measurement value(s), then transmission
591	   of the new message is suppressed.  Often this suppression takes the
592	   form of not queuing the new message for transmission, but in some
593	   protocols a brief placeholder message indicating "no significant
594	   change" is queued in its stead.
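
   The two-dimensional winnowing test can be sketched as follows.  The
   function name, the tuple layout of the last-sent record, and the use
   of a single scalar measurement with one deadband are assumptions
   made for illustration.

```python
def should_suppress(new_value, new_status, now, last_sent,
                    max_interval, deadband):
    """Return True if the new message need not be transmitted.

    last_sent is None or a tuple (value, status, queued_at) describing
    the last actually-sent message (hypothetical layout).
    """
    if last_sent is None:
        return False          # nothing sent yet, so always send
    last_value, last_status, last_queued_at = last_sent
    if now - last_queued_at >= max_interval:
        return False          # maximum quiet time reached
    if new_status != last_status:
        return False          # any status change is always reported
    # Suppress only while the measurement stays inside the deadband.
    return abs(new_value - last_value) <= deadband
```

   A sender would update its last-sent record only when this function
   returns False and the message is actually queued, so that the
   comparison is always against the last transmitted value rather than
   the last sampled one.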

596	2.1.5.  Peer-to-peer (P2P) communication paradigm

598	   In P2P, the peer-to-peer communication paradigm, a device sends UDP-
599	   like or TCP-like messages from one device (D1) to a second device
600	   (D2), usually with bidirectional but asymmetric flow of application
601	   data, where the amount of data is significantly greater in one
602	   direction than the other.  Typical examples are transfer of
603	   configuration information to or from a process field device, or
604	   transfer of captured process diagnostics (e.g., time-stamped noise
605	   signatures from a coriolis flowmeter) to an off-LLN higher-level
606	   asset management system.  Unicast addressing is used in both
607	   directions of data flow.

609	   In general, specific P2P traffic has only loose timeliness
610	   requirements, typically just those required so that response times to
611	   human-operator-initiated actions meet human factors requirements.  As
612	   a consequence, in general, message aggregation is permitted, although
613	   few opportunities are likely to present themselves for such
614	   aggregation due to the sporadic nature of such messaging to a single
615	   destination, and/or due to the large message payloads that often
616	   occur in at least one direction of transmission.

618	2.1.6.  Peer-to-multipeer (P2MP) communication paradigm

620	   In P2MP, the peer-to-multipeer communication paradigm, a device sends
621	   UDP-like messages downward, from one device (D1) to a set of other
622	   devices (Dn). Typical examples are bulk downloads to a set of devices
623	   that use identical code image segments or identically-structured
624	   database segments; group commands to enable device state transitions
625	   that are quasi-synchronized across all or part of the local network
626	   (e.g., switch to the next set of point-to-point downloaded session
627	   keys, or notifying that the network is switching to an emergency
628	   repair and recovery mode); etc.  Multicast addressing is used in the
629	   downward direction of data flow.

631	   Devices can be assigned to a number of multicast groups, for instance
632	   by device type.  Then, if it becomes necessary to reflash all devices
633	   of a given type with a new load image, a multicast distribution
634	   mechanism can be leveraged to optimize the distribution operation.

636	   In general, P2MP traffic has only loose timeliness requirements.  As
637	   a consequence, in general, message aggregation is permitted, although
638	   few opportunities are likely to present themselves for such
639	   aggregation due to the sporadic nature of such messaging to a single
640	   multicast group destination, and/or due to the large message payloads
641	   that often occur when P2MP is used for group downloads.  However, in
642	   general, message aggregation negatively impacts the delivery success
643	   rate for each of the aggregated messages, since the probability of
644	   error in a received message increases with message length.  Together
645	   these considerations often lead to a policy of non-aggregation for
646	   P2MP messaging.

648	   Note: Reliable group download protocols, such as the no-longer-
649	   published IEEE 802.1E (ISO/IEC 15802-4) system load protocol, and
650	   reliable multicast protocols based on the guidance of [RFC2887], are
651	   instructive in how P2MP can be used for initial bulk download,
652	   followed by either P2MP or P2P selective retransmissions for missed
653	   download segments.

655	2.1.7.  Additional considerations: Duocast and N-cast

657	   In industrial automation systems, some traffic is from (relatively)
658	   high-rate monitoring and control loops, of Class 0 and Class 1 as
659	   described in [RFC5673].  In such systems, the wireless link protocol,
660	   which typically uses immediate in-band acknowledgement to confirm
661	   delivery (or, on failure, conclude that a retransmission is
662	   required), can be adapted to attempt simultaneous delivery to more
663	   than one receiving device, with separated, sequenced immediate in-
664	   band acknowledgement by each of those intended receivers.  (This
665	   mechanism is known colloquially as "duocast" (for two intended
666	   receivers), or more generically as "N-cast" (for N intended
667	   receivers).) Transmission is deemed successful if at least one such
668	   immediate acknowledgement is received by the sending device;
669	   otherwise the device queues the message for retransmission, up until
670	   the maximum configured number of retries has been attempted.

672	   The logic behind duocast/N-cast is very simple: In wireless systems
673	   without FEC (forward error correction), the overall rate of success
674	   for transactions consisting of an initial transmission and an
675	   immediate acknowledgement is typically 95%. In other words, 5% of
676	   such transactions fail, either because the initial message of the
677	   transaction is not received correctly by the intended receiver, or
678	   because the immediate acknowledgment by that receiver is not received
679	   correctly by the transaction initiator.

681	   In the generalized case of N-cast, where any received acknowledgement
682	   serves to complete the transaction, and where the N intended
683	   receivers are spatially diverse, physically separated from each other
684	   by multiple wavelengths, the probability that all such receivers fail
685	   to receive the initial message of the transaction, or that all
686	   generated immediate acknowledgements are not received by the
687	   transaction initiator, is typically approximately (5%)^N.  Thus the
688	   expected success rate for a single transaction goes from 95%
689	   (1.0 - 0.05) for unicast to 99.75% (1.0 - 0.05^2) for duocast, to
690	   99.9875% (1.0 - 0.05^3) when N=3, and even higher when N>3.
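
   The arithmetic above can be reproduced with a few lines of code.
   The sketch assumes, as the text does, that the N receivers fail
   independently with the same 5% per-transaction failure rate; the
   function name is hypothetical.

```python
def ncast_success_rate(n, p_fail=0.05):
    """Probability that at least one of n independent receivers completes
    the transaction, given a per-receiver failure probability p_fail."""
    return 1.0 - p_fail ** n


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        # N=1 is unicast, N=2 is duocast.
        print(f"N={n}: {ncast_success_rate(n):.6%}")
```

   The gain from N=1 to N=2 is 4.75 percentage points, while the gain
   from N=2 to N=3 is under 0.24 points, which is why duocast captures
   most of the benefit.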

692	   From the above analysis, it is obvious that the primary benefit of
693	   N-cast occurs when N goes from N=1 (unicast) to N=2 (duocast); the
694	   reduction in transaction loss rate for increasing N>2 is quite small,
695	   and for N>3 it is infinitesimal.  In the typical industrial
696	   automation environment of class 1 process control loops, which
697	   typically repeat at a 1 Hz or 4 Hz rate, in a very large process
698	   plant with thousands of field devices reporting at that rate, the
699	   maximum number of transmission retries that must be planned, and for
700	   which capacity must be scheduled (within the requisite 250 ms or 1 s
701	   interval) is seven (7) retries for unicast PS reporting, but only
702	   three (3) retries with duocast PS reporting.  (This is determined by
703	   the requirement to not miss four successive reports more than once
704	   per year, across the entire plant, as such a loss typically triggers
705	   fallback behavior in the controlled loop, which is considered a
706	   failure of the wireless system by the plant owner/operator.) In
707	   practice, the enormous reduction in both planned and used
708	   retransmission capacity provided by duocast/N-cast is what enables 4
709	   Hz loops to be supported in large wireless systems.

711	   When available, duocast/N-cast typically is used only for one-hop PS
712	   traffic on Class 1 and Class 0 control loops.  It may also be
713	   employed for rapid, reliable one-hop delivery of Class 0 and
714	   sometimes Class 1 process alarms and device alerts, which use the SS
715	   paradigm.  Because it requires scheduling of multiple receivers that
716	   are prepared to acknowledge the received message during the
717	   transaction, in general it is not appropriate for the other types of
718	   traffic in such systems - P2P and P2MP - and is not needed for other
719	   classes of control loops or other types of traffic, which do not have
720	   such stringent reporting requirements.

722	   Note: Although there are known patent applications for duocast and
723	   N-cast, at the time of this writing the patent assignee, Honeywell
724	   International, has offered to permit cost-free RAND use in those
725	   industrial wireless standards that have chosen to employ the
726	   technology, under a reciprocal licensing requirement relative to that
727	   use.  Duocast and N-cast are performance and energy optimizations,
728	   so they are not essential for use in wireless systems.
729	   However, in practice, their use makes it possible to support 4 Hz
730	   wireless loops and meet sub-second safety alarm reporting
731	   requirements in large plants, where that might otherwise be
732	   impractical without use of a wired network.  When duocast/N-cast is
733	   not employed, the wireless retransmission capacity that is needed to
734	   support such fast loops often is excessive, typically over 100x that
735	   actually used for retransmission (i.e., providing for seven retries
736	   per transaction when the mean number used is only 0.06 retries).

738	2.1.8.  RPL applicability per communication paradigm

740	   To match the requirements above, RPL provides a number of RPL Modes
741	   of Operation (MOP):

743	   No downward route: defined in [RFC6550], section 6.3.1, MOP of 0.
744	                      This mode allows only upward routing, that is from
745	                      nodes (devices) that reside inside the RPL network
746	                      toward the outside via the DODAG root.

748	   Non-storing mode: defined in [RFC6550], section 6.3.1, MOP of 1. This
749	                     mode improves MOP 0 by adding the capability to use
750	                     source routing from the root towards registered
751	                     targets within the instance DODAG.

753	   Storing mode without multicast support: defined in [RFC6550], section
754	                                           6.3.1, MOP of 2.  This mode
755	                                           improves MOP 0 by adding the
756	                                           capability to use stateful
757	                                           routing from the root towards
758	                                           registered targets within the
759	                                           instance DODAG.

761	   Storing mode with link-scope multicast DAO: defined in [RFC6550],
762	                                               section 9.10, this mode
763	                                               improves MOP 2 by adding
764	                                               the capability to send
765	                                               Destination
766	                                               Advertisements to all
767	                                               nodes over a single Layer
768	                                               2 link (e.g.  a wireless
769	                                               hop) and enables line-of-
770	                                               sight direct
771	                                               communication.

773	   Storing mode with multicast support: defined in [RFC6550], Mode-of-
774	                                        operation (MOP) of 3. This mode
775	                                        improves MOP 2 by adding the
776	                                        capability to register multicast
777	                                        groups and perform multicast
778	                                        forwarding along the instance
779	                                        DODAG (or a spanning subtree
780	                                        within the DODAG).

782	   Reactive: defined in [RFC6997], the reactive mode creates on-demand
783	             additional DAGs that are used to reach a given node acting
784	             as DODAG root within a certain number of hops.  This mode
785	             can typically be used for an ad-hoc closed-loop
786	             communication.

788	   The RPL MOP that can be applied for a given flow depends on the
789	   communication paradigm.  It must be noted that a DODAG that is used
790	   for PS traffic can also be used for SS traffic since the MOP 2
791	   extends the MOP 0, and that a DODAG that is used for P2MP
792	   distribution can also be used for downward PS since the MOP 3 extends
793	   the MOP 2.

795	   On the other hand, an Objective Function (OF) that optimizes metrics
796	   for a pure upwards DODAG might differ from the OF that optimizes a
797	   mixed upward and downward DODAG.

799	   As a result, it can be expected that different RPL instances are
800	   installed with different OFs, different channel allocations, etc.,
801	   that result in different routing and forwarding topologies, sometimes
802	   with differing delay vs.  energy profiles, optimized separately for
803	   the different flows at hand.

805	   This can be broadly summarized in the following table:

807	   +---------------------+------------+-----------------------------------+
808	   |   Paradigm\RPL MOP  |  RPL spec  |         Mode of operation         |
809	   +=====================+============+===================================+
810	   |   Peer-to-peer      |  RPL P2P   |     reactive (on-demand)          |
811	   +---------------------+------------+-----------------------------------+
812	   |   P2P line-of-sight |  RPL base  |  2 (storing) with multicast DAO   |
813	   +---------------------+------------+-----------------------------------+
814	   |   P2MP distribution |  RPL base  |     3 (storing with multicast)    |
815	   +---------------------+------------+-----------------------------------+
816	   |   Publish-subscribe |  RPL base  |  1 or 2 (non-storing or storing)  |
817	   +---------------------+------------+-----------------------------------+
818	   |   Source-sink       |  RPL base  |     0 (no downward route)         |
819	   +---------------------+------------+-----------------------------------+
820	   |   N-cast publish    |  RPL base  |     0 (no downward route)         |
821	   +---------------------+------------+-----------------------------------+
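
   For implementers, the table can be expressed as a lookup structure.
   The sketch below covers only the rows that map to a numeric MOP of
   the base RPL specification; the reactive peer-to-peer row is left
   out because it relies on [RFC6997].  All identifiers are
   hypothetical.

```python
from enum import IntEnum


class Mop(IntEnum):
    """RPL Mode of Operation values as listed in this section."""
    NO_DOWNWARD_ROUTES = 0
    NON_STORING = 1
    STORING_NO_MULTICAST = 2
    STORING_WITH_MULTICAST = 3


# Candidate MOPs per communication paradigm, mirroring the table rows.
PARADIGM_TO_MOP = {
    "source-sink": (Mop.NO_DOWNWARD_ROUTES,),
    "n-cast publish": (Mop.NO_DOWNWARD_ROUTES,),
    "publish-subscribe": (Mop.NON_STORING, Mop.STORING_NO_MULTICAST),
    "p2mp distribution": (Mop.STORING_WITH_MULTICAST,),
    # Line-of-sight P2P additionally relies on link-scope multicast DAO.
    "p2p line-of-sight": (Mop.STORING_NO_MULTICAST,),
}


def candidate_mops(paradigm):
    """Return the tuple of MOPs that the table associates with a paradigm."""
    return PARADIGM_TO_MOP[paradigm.lower()]
```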

823	2.2.  Layer 2 applicability

825	   Work at the 6TiSCH WG details layer 2 operations for the most
826	   commonly used link layer for industrial operations, the Timeslotted
827	   Channel Hopping (TSCH) mode of IEEE802.15.4e [IEEE802154e].

829	   [I-D.watteyne-6tisch-tsch] provides in-depth information on the
830	   IEEE802.15.4e [IEEE802154e] TSCH MAC operation whereas the 6TiSCH
831	   architecture [I-D.thubert-6tisch-architecture] provides additional
832	   information on how RPL can be used over TSCH.

834	   This contrasts with the SmartGrid area where ZigBee IP [ZigBeeIP]
835	   ("ZigBee" is a registered trademark of the ZigBee Alliance) defines
836	   an application of RPL over a more classical contention-based
837	   operation but will not exhibit the deterministic capabilities that
838	   industrial control loops require.

840	3.  Using RPL to Meet Functional Requirements

842	   The functional requirements for most industrial automation
843	   deployments are similar to those listed in [RFC5673]:

845	      The routing protocol MUST be capable of supporting the
846	      organization of a large number of nodes into regions, usually
847	      corresponding to partitions of the automated process, each
848	      containing on the order of 30 to 3000 nodes.

850	      The routing protocol MUST provide mechanisms to support
851	      configuration of the routing protocol itself.

853	      The routing protocol MUST provide mechanisms to support instructed
854	      configuration of explicit routing, so that in the absence of
855	      failure the routing used for selected flow classes is that which
856	      has been remotely configured (typically by a centralized
857	      configurator). In such circumstances RPL is used

859	         for local network repair;

861	         for flow classes to which explicit routing has not been
862	         assigned;

864	         during bootstrapping of the network itself (which is really
865	         just an instance of routing without such an externally-imposed
866	         assignment).

868	      The routing protocol SHOULD support directed flows with different
869	      QoS characteristics, typically with different energy vs.  delay
870	      tradeoffs, for traffic directed to LBRs.  In practice only two
871	      such sets of QoS are relevant:

873	         one that emphasizes energy minimization for energy-constrained
874	         nodes at the expense of greater mean transit delay and variance
875	         in transit delay; and

877	         one that emphasizes minimization of mean transit delay and
878	         transit delay variance at the expense of greater energy demand
879	         on originating and intermediary energy-constrained nodes,
880	         typically used for critical SS traffic (e.g., infrequent and
881	         unpredictable safety alarms with legally-mandated maximum
882	         reporting delays) and critical PS traffic (e.g., predictable
883	         periodic (for process automation) or cyclic (for factory
884	         automation) high-speed safety control loops needed to protect
885	         life, the environment, and/or critical national infrastructure
886	         assets).

888	      In the absence of configured routing, or when such routes have
889	      failed, the routing protocol MUST dynamically compute and select
890	      effective routes composed of low-power and lossy links.  Local
891	      network dynamics SHOULD NOT impact the entire network.  The
892	      routing protocol MUST compute multiple paths when possible.

894	      The routing protocol MUST support multicast addressing, including

896	         multicast originating with an LBR or off the LLN, directed to a
897	         predefined group within the LLN;

899	         multicast originating within the LLN, directed to one or more
900	         equivalent LBRs, in support of SS traffic; and

902	         multicast originating within the LLN, directed to one or more
903	         equivalent LBRs, in support of PS traffic.

905	      The routing protocol SHOULD support and utilize a large number of
906	      highly directed flows to a few LBRs, to handle scalability.

908	      The routing protocol SHOULD support formation of groups of field
909	      devices in the network.

911	      The routing protocol NEED NOT support anycast addressing because,
912	      as of the date of writing of this document, such addressing is not
913	      used by automation and control field devices.  In general, no two
914	      such devices are equivalent, except perhaps for intermediary LBRs,
915	      so unicast suffices for situations where anycast might otherwise
916	      be employed.

918	   RPL supports:

920	      Large-scale networks characterized by highly directed traffic
921	      flows between each field device and servers close to the head-end
922	      of the automation network.  To this end, RPL builds Directed
923	      Acyclic Graphs (DAGs) rooted at LBRs.

925	      Zero-touch configuration.  This is done through in-band methods
926	      for configuring RPL variables using DIO messages.

928	      The use of links with time-varying availability and quality
929	      characteristics.  This is accomplished by allowing the use of
930	      metrics that effectively capture the quality of a path (e.g., in
931	      terms of the mean and maximum impact of use of that path on packet
932	      delivery timing and on endpoint energy demands), and by limiting
933	      the impact of changing local conditions by discovering and
934	      maintaining multiple DAG parents, and by using local repair
935	      mechanisms when DAG links break.

937	   For wireless installations of small size with undemanding
938	   communication requirements, RPL is likely to generate satisfactory
939	   routing without any special effort.  However, in larger installations
940	   or where timeliness considerations do not permit multi-second
941	   wireless-subnet transit times, flow labeling is likely required
942	   so that forwarding routers can make informed tradeoffs between
943	   conserving their own energy resources and meeting overall system
944	   needs.

946	4.  RPL Profile

948	   This section outlines a RPL profile for a representative deployment
949	   in a process control application.  Process monitoring without control
950	   is typically less demanding, so a subset of this profile generally
951	   will suffice.

953	4.1.  RPL Features

955	4.1.1.  RPL Instances

957	   RPL allows formation of multiple instances that operate independently
958	   of each other.  Each instance may use a different objective function
959	   and different modes of operation.  It is highly recommended that
960	   wireless field devices participate in different instances that
961	   utilize objective functions that meet different optimization goals.
962	   These optimization goals target:

964	   1.  Minimizing latency and ensuring that a guaranteed bound is met

966	   2.  Maximizing the communication reliability of the packets
967	       transferred over the wireless media

969	   3.  Minimizing aggregate power consumption for multi-hop LLNs that
970	       are composed of battery powered field devices.

972	   Some of these optimization goals will have to be met concurrently in
973	   a single instance by imposing various constraints.

975	   Each wireless field device should participate in a set composed of a
976	   minimum of three instances that meet optimization goals associated
977	   with three traffic flows which need to be supported by all industrial
978	   LLNs.

980	   Management Instance: Wireless industrial networks are highly
981	      deterministic in nature, meaning that wireless field devices do
982	      not make any decisions locally but are managed by a centralized
983	      System Manager that oversees the join process as well as all
984	      communication and security settings present in the devices.  The
985	      management traffic flow is downward traffic and needs to meet
986	      strictly enforced latency and reliability requirements in order to
987	      ensure proper operation of the wireless LLN. Hence each field
988	      device should participate in an instance dedicated to management
989	      traffic.  All decisions made while constructing this instance will
990	      need to be approved by the Path Computation Engine present in the
991	      System Manager due to the deterministic, centralized nature of
992	      wireless industrial LLNs.  Shallow LLNs with a hop count of up to
993	      one accommodate this downward traffic using non-storing mode.
994	      Non-storing mode involves source routing, which increases packet
995	      size.  For large transfers such as image downloads and
996	      configuration files, this overhead can be amortized over a large
997	      packet.  In that case, a method such as
998	      [I-D.thubert-6lo-forwarding-fragments] is required over multi-hop
999	      networks to forward and recover individual fragments without the
1000	      overhead of the source route information in each fragment.  If the
1001	      hop count in the wireless LLN grows (LLN becomes deeper), it is
1002	      highly recommended that the management instance rely on storing
1003	      mode in order to relay management-related packets.

1005	   Operational Instance: The bulk of the data that is transferred over
1006	      the wireless LLN consists of process automation related payloads.
1007	      This data is of paramount importance to the smooth operation of
1008	      the process that is being monitored.  Hence data reliability is of
1009	      paramount importance.  It is also important to note that a vast
1010	      majority of the wireless field devices that operate in industrial
1011	      LLNs are battery powered.  The operational instance should hence
1012	      ensure high reliability of the data transmitted while also
1013	      minimizing the aggregate power consumption of the field devices
1014	      operating in the LLN.  All decisions made while constructing this
1015	      instance will need to be approved by the Path Computation Engine
1016	      present in the System Manager.  This is due to the deterministic,
1017	      centralized nature of wireless LLNs.

1019	   Autonomous Instance: An autonomous instance requires limited to no
1020	      configuration.  Its primary purpose is to serve as a backup for
1021	      the operational instance in case the operational instance fails.
1022	      It is also useful in non-production phases of the network, when
1023	      the plant is installed or dismantled.  [I-D.thubert-roll-asymlink]
1024	      provides rules and mechanisms whereby an instance can be used as a
1025	      fallback to another upon failure to forward a packet further.  The
1026	      autonomous instance should always be active and during normal
1027	      operations it should be maintained through local repair
1028	      mechanisms.  In normal operation global repairs should be
1029	      sparingly employed in order to conserve batteries.  But a global
1030	      repair is also probably the fastest and most economical technique
1031	      in the case the network is extensively damaged.  It is recommended
1032	      to rely on automation that will trigger a global repair upon the
1033	      detection of a large scale incident such as an explosion or a
1034	      crash.  As the name suggests, the autonomous instance is formed
1035	      without any dependence on the System Manager.  Decisions made
1036	      during the construction of the autonomous instance do not need
1037	      approval from the Path Computation Engine present in the
1038	      System Manager.

1040	   Participation of each wireless field device in at least one instance
1041	   that hosts a DODAG with a virtual root is highly recommended.

1043	   Wireless industrial networks are typically composed of multiple LLNs
1044	   that terminate in an LLN Border Router (LBR).  The LBRs communicate
1045	   with each other and with other entities present on the backbone (such
1046	   as the Gateway and the System Manager) over a wired or wireless
1047	   backbone infrastructure.  When a device A that operates in LLN1
1048	   sends a packet to a device B that operates in LLN2, the packet
1049	   egresses LLN1 through LBR1 and ingresses LLN2 through LBR2 after
1050	   travelling over the backbone infrastructure that connects the LBRs.
1051	   In order to accommodate this packet flow that travels from one LLN to
1052	   another, it is highly recommended that wireless field devices
1053	   participate in at least one instance that has a DODAG with a virtual
1054	   root.

1056	4.1.2.  Storing vs.  Non-Storing Mode

1058	   In general, storing mode is required for high-reporting-rate devices
1059	   (where "high rate" is with respect to the underlying link data
1060	   conveyance capability). Such devices, in the absence of path failure,
1061	   are typically only one hop from the LBR(s) that convey their
1062	   messaging to other parts of the system.  Fortunately, in such cases,
1063	   the routing tables required by such nodes are small, even when they
1064	   include information on DODAGs that are used as backup alternate
1065	   routes.

1067	   Deeper multi-hop wireless LLNs (hop count > 1) should support storing
1068	   mode in order to minimize the overhead associated with source routing
1069	   given the limited header capacity associated with typical physical
1070	   layers employed in wireless LLNs.  Support for storing mode requires
1071	   additional RAM resources be present in the constrained wireless
1072	   field devices.  Typical wireless LLNs scale to a maximum of one
1073	   hundred field devices.  Hence the appropriate RAM resources for
1074	   supporting storing mode should be part of the hardware requirements
1075	   imposed upon wireless field devices during the design phase.

1077	   The ISA100.11a standard mandates that all LBRs maintain routing
1078	   tables with enough capacity to accommodate operation in storing mode.
1079	   The standard also mandates that all wireless field devices maintain
1080	   routing tables but it does not make any capacity assumptions,
1081	   allowing for null routing tables.  The System Manager should read the
1082	   routing table capacity of each wireless field router and LBR during
1083	   their join phase, and determine if support for storing mode in a
1084	   particular LLN is feasible.
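
   The feasibility decision made by the System Manager can be sketched
   as a comparison of reported capacities against required table sizes.
   How the required number of entries per device is derived is outside
   the scope of this sketch and is simply taken as an input; the
   function name, the dictionary layout, and the device identifiers are
   hypothetical.

```python
def storing_mode_feasible(table_capacity, required_entries):
    """Check whether every router can hold the routes it would need.

    table_capacity:   device id -> routing table capacity read at join
                      (0 models a null routing table).
    required_entries: device id -> number of route entries the device
                      would need in storing mode (assumed known).
    Returns (feasible, sorted list of devices that fall short).
    """
    short = [dev for dev, need in required_entries.items()
             if table_capacity.get(dev, 0) < need]
    return (len(short) == 0, sorted(short))
```

   Returning the list of under-provisioned devices, rather than only a
   boolean, lets the System Manager decide whether to fall back to
   non-storing mode for the whole LLN or to re-plan the topology around
   the constrained routers.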

1086	   Lack of support for storing mode is also detrimental to battery
1087	   operated wireless field devices due to the power consumption
1088	   associated with transporting the hefty headers associated with source
1089	   routing.  Support for storing mode also ensures path redundancy which
1090	   in turn allows for better prediction of the latency associated with
1091	   downward traffic flows.  Guaranteed latencies are of paramount
1092	   importance for various traffic flows in wireless industrial LLNs.

1094	4.1.3.  DAO Policy

1096	   Support for both upward and downward traffic flows is a requirement
1097	   in industrial automation systems.  As a result, nodes send DAO
1098	   messages to establish downward paths from the root to themselves.
1099	   DAO messages are not acknowledged in wireless industrial LLNs that
1100	   are composed of battery operated field devices in order to minimize
1101	   the power consumption overhead associated with path discovery.  Given
1102	   that wireless field devices in LLNs will typically participate in
1103	   multiple RPL instances and DODAGs, it is highly recommended that both
1104	   the RPLInstanceID and the DODAGID be included in the DAO.
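
   As a rough illustration of this policy, the sketch below packs a DAO
   base object that carries both identifiers and does not request an
   acknowledgement.  The field layout (RPLInstanceID, K and D flags,
   Reserved, DAOSequence, optional DODAGID) follows the DAO Base Object
   as described in [RFC6550]; the function name and the example values
   are assumptions made for illustration, and the ICMPv6 header and DAO
   options are deliberately left out.

```python
import ipaddress
import struct

K_FLAG = 0x80  # sender expects a DAO-ACK
D_FLAG = 0x40  # DODAGID field is present


def build_dao_base(instance_id, dao_sequence, dodag_id, want_ack=False):
    """Pack a DAO base object that always carries the DODAGID.

    Hypothetical helper: covers only the base object, not the ICMPv6
    header or the Target/Transit options that follow it.
    """
    flags = D_FLAG | (K_FLAG if want_ack else 0)
    header = struct.pack("!BBBB", instance_id, flags, 0, dao_sequence)
    return header + ipaddress.IPv6Address(dodag_id).packed


# Unacknowledged DAO (K clear) with RPLInstanceID and DODAGID included.
dao = build_dao_base(instance_id=1, dao_sequence=7, dodag_id="2001:db8::1")
```

   Leaving the K flag clear corresponds to the unacknowledged DAO
   operation described above for battery operated field devices.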

1106	4.1.4.  Path Metrics

1108	   RPL relies on an Objective Function for selecting parents and
1109	   computing path costs and rank.  This objective function is decoupled
1110	   from the core RPL mechanisms and also from the metrics in use in the
1111	   network.  Two objective functions for RPL have been defined at the
1112	   time of this writing: Objective Function Zero [RFC6552] and the
1113	   Minimum Rank with Hysteresis Objective Function [RFC6719], both of
1114	   which define a selection method for a preferred parent and backup
1115	   parents, and are suitable for industrial automation network
1116	   deployments.

1118	4.1.5.  Objective Function

1120	   Industrial wireless LLNs are subject to swift variations in terms of
1121	   the propagation of the wireless signal, variations that can affect
1122	   the quality of the links between field devices.  This is due to the
1123	   nature of the environment in which they operate, which can be
1124	   characterized as metal jungles that cause wireless propagation
1125	   distortions, multi-path fading and scattering.  Hence support for
1126	   hysteresis is needed in order to ensure relative link stability which
1127	   in turn ensures route stability.
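
   The effect of hysteresis on parent selection can be sketched as
   follows, in the spirit of the parent switch threshold used by
   [RFC6719].  The function name, the cost values and the threshold of
   192 are illustrative assumptions, not values recommended by this
   document.

```python
def select_parent(current, candidates, switch_threshold):
    """Pick a preferred parent, switching only on a clear improvement.

    candidates maps parent -> path cost through that parent and must
    not be empty. The current parent is kept unless the best candidate
    is better by at least switch_threshold.
    """
    best = min(candidates, key=candidates.get)
    if current not in candidates:
        return best  # current parent lost; take the best available
    if candidates[current] - candidates[best] >= switch_threshold:
        return best
    return current


# A small fluctuation does not trigger a switch...
kept = select_parent("a", {"a": 300, "b": 250}, switch_threshold=192)
# ...but a large, sustained improvement does.
switched = select_parent("a", {"a": 300, "b": 100}, switch_threshold=192)
```

   With a threshold in place, short-lived fading on one link does not
   cause the node to flap between parents, which is the route stability
   property argued for above.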

1129	   As mentioned in previous sections of this document, different traffic
1130	   flows require different optimization goals.  Wireless field devices
1131	   should participate in multiple instances associated with multiple
1132	   objective functions.

1134	   Management Instance: Should utilize an objective function that
1135	      focuses on optimization of latency and data reliability.

1137	   Operational instance: Should utilize an objective function that
1138	      focuses on data reliability and minimizing aggregate power
1139	      consumption for battery operated field devices.

1141	   Autonomous instance: Should utilize an objective function that
1142	      optimizes data latency.  The primary purpose of the autonomous
1143	      instance is as a fallback instance in case the operational
1144	      instance fails.  Data latency is hence paramount for ensuring that
1145	      the wireless field devices can exchange packets in order to repair
1146	      the operational instance.

1148	   More complex objective functions are needed that take into
1149	   consideration multiple constraints and utilize weighted sums of
1150	   multiple additive and multiplicative metrics.  Additional objective
1151	   functions specifically designed for such networks may be defined in
1152	   companion RFCs.
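
   One possible shape for such a composite cost is sketched below.  It
   is a hypothetical example, not an objective function defined by any
   RFC: latency is treated as an additive metric, the per-link packet
   delivery ratio as a multiplicative one (made additive by taking its
   negative logarithm), and the two are combined with weights that a
   given instance could choose.  All names and numbers are assumptions.

```python
import math


def path_cost(links, w_latency, w_reliability):
    """Weighted sum of an additive and a multiplicative path metric.

    links is a list of dicts with 'latency_ms' (additive) and 'pdr',
    the packet delivery ratio in (0, 1] (multiplicative).
    """
    total_latency = sum(link["latency_ms"] for link in links)
    delivery = math.prod(link["pdr"] for link in links)
    # -log turns the product of delivery ratios into a cost that grows
    # as end-to-end reliability drops.
    return w_latency * total_latency + w_reliability * -math.log(delivery)


path_a = [{"latency_ms": 10, "pdr": 0.9}, {"latency_ms": 10, "pdr": 0.9}]
path_b = [{"latency_ms": 30, "pdr": 0.99}]

# A latency-focused weighting prefers path_a; a reliability-focused
# weighting prefers path_b.
latency_choice = min((path_a, path_b), key=lambda p: path_cost(p, 1.0, 0.0))
reliable_choice = min((path_a, path_b), key=lambda p: path_cost(p, 0.0, 1.0))
```

   Different instances (management, operational, autonomous) would
   simply use different weights over the same link measurements.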

1154	4.1.6.  DODAG Repair

1156	   To effectively handle time-varying link characteristics and
1157	   availability, industrial automation network deployments SHOULD
1158	   utilize the local repair mechanisms in RPL.

1160	   Local repair is triggered by broken link detection, and in storing
1161	   mode also by loop detection.

1163	   The first local repair mechanism consists of a node detaching from a
1164	   DODAG and then re-attaching to the same or to a different DODAG at a
1165	   later time.  While detached, a node advertises an infinite rank value
1166	   so that its children can select a different parent.  This process is
1167	   known as poisoning and is described in Section 8.2.2.5 of [RFC6550].
1168	   While RPL provides an option to form a local DODAG, doing so in
1169	   industrial automation network deployments is of little benefit since
1170	   applications typically communicate through an LBR.  After the detached
1171	   node has made sufficient effort to send notification to its children
1172	   that it is detached, the node can rejoin the same DODAG with a higher
1173	   rank value.  The configured duration of the poisoning mechanism needs
1174	   to take into account the disconnection time applications running over
1175	   the network can tolerate.  Note that when joining a different DODAG,
1176	   the node need not perform poisoning.

1178	   The second local repair mechanism controls how much a node can
1179	   increase its rank within a given DODAG Version (e.g., after detaching
1180	   from the DODAG as a result of broken link or loop detection).
1181	   Setting the DAGMaxRankIncrease to a non-zero value enables this
1182	   mechanism, and setting it to a value of less than infinity limits the
1183	   cost of count-to-infinity scenarios when they occur, thus controlling
1184	   the duration of disconnection applications may experience.
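
   The rank limit imposed by DAGMaxRankIncrease can be sketched as a
   simple check, following the rule in [RFC6550] that a node may not
   advertise a rank above the lowest rank it has advertised in the
   DODAG Version plus DAGMaxRankIncrease, with infinite rank always
   permitted.  The function name and the example rank values are
   assumptions for illustration.

```python
INFINITE_RANK = 0xFFFF


def may_advertise(rank, lowest_advertised, max_rank_increase):
    """Return True if `rank` may be advertised in this DODAG Version.

    lowest_advertised is the lowest rank this node has advertised in
    the current DODAG Version.
    """
    if rank == INFINITE_RANK:
        return True  # poisoning is always allowed
    if max_rank_increase == 0:
        return True  # a value of zero disables the mechanism
    return rank <= lowest_advertised + max_rank_increase


within_limit = may_advertise(768, lowest_advertised=512, max_rank_increase=256)
beyond_limit = may_advertise(1024, lowest_advertised=512, max_rank_increase=256)
```

   A node that would exceed the limit has to poison its routes (or wait
   for a new DODAG Version) instead of counting upward indefinitely.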

1186	4.1.7.  MPL Profile

1188	   The applicability of MPL is left to be determined.  There is a
1189	   potential for Source/Sink flows in order to control the flooding
1190	   incurred by alarms and alerts.

1192	4.1.8.  Security

1194	   Industrial automation network deployments typically operate in areas
1195	   that provide limited physical security (relative to the risk of
1196	   attack).  For this reason, the link layer, transport layer and
1197	   application layer technologies utilized within such networks
1198	   typically provide security mechanisms to ensure authentication,
1199	   confidentiality, integrity, timeliness and freshness.  As a result,
1200	   such deployments may not need to implement RPL's security mechanisms
1201	   and could rely on link layer and higher layer security features.

1203	4.1.9.  P2P communications

1205	   There is a clear need for route optimizations for the closed
1206	   control loops that sustain the automation systems.  [I-D.thubert-
1207	   6tisch-architecture] discusses the applicability of a central routing
1208	   computation based on a Path Computation Element (PCE), which would be
1209	   the natural IETF correspondent to the System Managers or Network
1210	   Managers that can be found in existing industrial standards.

1212	   The RPL point to point extension/optimization [RFC6997]
1213	   (experimental) or its standard track successor may be used as well to
1214	   establish on-demand paths or repair existing ones.

1216	4.2.  Layer-two features

1218	   This section defers to work that is taking place in the 6TiSCH WG.
1219	   In particular, [I-D.wang-6tisch-6top] defines the Link Layer Control
1220	   (LLC) operation that sustains RPL and IPv6, whereas [I-D.vilajosana-
1221	   6tisch-minimal] specifies a minimal RPL operation based on a static
1222	   TSCH schedule.

1224	4.3.  Recommended Configuration Defaults and Ranges

1226	4.3.1.  Trickle Parameters

1228	   Trickle was designed to be density-aware and perform well in networks
1229	   characterized by a wide range of node densities.  The combination of
1230	   DIO packet suppression and adaptive timers for sending updates allows
1231	   Trickle to perform well in both sparse and dense environments.

1233	   Node densities in industrial automation network deployments can vary
1234	   greatly, from nodes having only one or a handful of neighbors to
1235	   nodes having several hundred neighbors.  In high density
1236	   environments, relatively low values for Imin may cause a short period
1237	   of congestion when an inconsistency is detected and DIO updates are
1238	   sent by a large number of neighboring nodes nearly simultaneously.
1239	   While the Trickle timer will back off exponentially, some time may
1240	   elapse before the congestion subsides.  Although some link layers
1241	   employ contention mechanisms that attempt to avoid congestion,
1242	   relying solely on the link layer to avoid congestion caused by a
1243	   large number of DIO updates can result in increased communication
1244	   latency for other control and data traffic in the network.
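
   For readers unfamiliar with Trickle, the suppression and back-off
   behaviour discussed above can be sketched as below.  This is a
   minimal model of the algorithm in [RFC6206] with no real timers or
   radio; the class and method names are illustrative assumptions.

```python
import random


class Trickle:
    """Minimal Trickle state machine (no real clock; the caller drives it)."""

    def __init__(self, imin, doublings, k, rng=None):
        self.imin = imin
        self.imax = imin * 2 ** doublings
        self.k = k
        self.rng = rng or random.Random()
        self.interval = imin
        self._start_interval()

    def _start_interval(self):
        self.counter = 0
        # The transmission time t is drawn from the second half of the
        # interval.
        self.t = self.rng.uniform(self.interval / 2, self.interval)

    def hear_consistent(self):
        self.counter += 1

    def hear_inconsistent(self):
        # Reset to Imin unless already there.
        if self.interval > self.imin:
            self.interval = self.imin
            self._start_interval()

    def should_transmit(self):
        """To be evaluated at time t: send only if fewer than k heard."""
        return self.counter < self.k

    def interval_expired(self):
        # Exponential back-off, capped at Imax.
        self.interval = min(self.interval * 2, self.imax)
        self._start_interval()


timer = Trickle(imin=1.0, doublings=3, k=2, rng=random.Random(0))
```

   In a dense neighbourhood, an inconsistency resets many such timers to
   Imin at once, which is the source of the short congestion burst
   described above.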

1246	   To mitigate this kind of short-term congestion, this document
1247	   recommends a more conservative set of values for the Trickle
1248	   parameters than those specified in [RFC6206].  In particular,
1249	   DIOIntervalMin is set to a larger value to avoid periods of
1250	   congestion in dense environments, and DIORedundancyConstant is
1251	   parameterized accordingly as described below.  These values are
1252	   appropriate for the timely distribution of DIO updates in both sparse
1253	   and dense scenarios while avoiding the short-term congestion that
1254	   might arise in dense scenarios.

1256	   Because the actual link capacity depends on the particular link
1257	   technology used within an industrial automation network deployment,
1258	   the Trickle parameters are specified in terms of the link's maximum
1259	   capacity for conveying link-local multicast messages.  If the link
1260	   can convey m link-local multicast packets per second on average, the
1261	   expected time it takes to transmit a link-local multicast packet is
1262	   1/m seconds.

1264	   DIOIntervalMin:  Industrial automation network deployments SHOULD set
1265	   DIOIntervalMin such that the Trickle Imin is at least 50 times as
1266	   long as it takes to convey a link-local multicast packet.  This value
1267	   is larger than that recommended in [RFC6206] to avoid congestion in
1268	   dense plant deployments as described above.

1270	   DIOIntervalDoublings:  Industrial automation network deployments
1271	   SHOULD set DIOIntervalDoublings such that the Trickle Imax is at
1272	   least TBD minutes.

1274	   DIORedundancyConstant:  Industrial automation network deployments
1275	   SHOULD set DIORedundancyConstant to a value of at least 10.  This
1276	   increased value follows from the larger value chosen for
1277	   DIOIntervalMin and is intended to compensate for the increased
1278	   communication latency of DIO updates that the larger DIOIntervalMin
1279	   causes.  The proportional relationship between Imin and k suggested
1280	   in [RFC6206] is not preserved, however.  Instead,
1281	   DIORedundancyConstant is set to a lower value than proportionality
1282	   would give in order to reduce the number of packet transmissions in
1283	   dense environments.
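
   A worked example may help relate these recommendations to the values
   carried in the DODAG Configuration option.  The sketch below assumes
   the encoding from [RFC6550], in which Imin = 2^DIOIntervalMin
   milliseconds and Imax is Imin doubled DIOIntervalDoublings times.
   The function names, the link rate of 100 packets per second and the
   600-second Imax target are placeholders only; in particular, this
   document leaves the Imax target as TBD.

```python
import math


def dio_interval_min(m_packets_per_second, factor=50):
    """Smallest exponent such that 2**exp ms >= factor * (1/m) seconds."""
    imin_ms = factor * 1000.0 / m_packets_per_second
    return max(0, math.ceil(math.log2(imin_ms)))


def dio_interval_doublings(interval_min_exp, target_imax_seconds):
    """Smallest number of doublings so that Imax reaches the target."""
    imin_ms = 2 ** interval_min_exp
    ratio = target_imax_seconds * 1000.0 / imin_ms
    return max(0, math.ceil(math.log2(ratio)))


# Link conveying m = 100 multicast packets/s: 1/m = 10 ms, so Imin must
# be at least 500 ms; the next power of two is 2**9 = 512 ms.
min_exp = dio_interval_min(100)

# Placeholder target of 600 s (the document's value is TBD).
doublings = dio_interval_doublings(min_exp, target_imax_seconds=600)
```

   Because both parameters are exponents, the resulting Imin and Imax
   are rounded up to the next power of two above the requested bound.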

1285	4.3.2.  Other Parameters

1287	   None identified at this time.  Further work is required to refine
1288	   this analysis.

1290	5.  Manageability Considerations

1292	   RPL enables automatic and consistent configuration of RPL routers
1293	   through parameters specified by the DODAG root and disseminated
1294	   through DIO packets.  The use of Trickle for scheduling DIO
1295	   transmissions ensures lightweight yet timely propagation of important
1296	   network and parameter updates and allows network operators to choose
1297	   the trade-off they are comfortable with between overhead and the
1298	   reliability and timeliness of network updates.

1300	   The metrics in use in the network along with the Trickle Timer
1301	   parameters used to control the frequency and redundancy of network
1302	   updates can be dynamically varied by the root during the lifetime of
1303	   the network.  To that end, all DIO messages SHOULD contain a Metric
1304	   Container option for disseminating the metrics and metric values used
1305	   for DODAG setup.  In addition, DIO messages SHOULD contain a DODAG
1306	   Configuration option for disseminating the Trickle Timer parameters
1307	   throughout the network.

1309	   The possibility of dynamically updating the metrics in use in the
1310	   network as well as the frequency of network updates allows deployment
1311	   characteristics (e.g., network density) to be discovered during
1312	   network bring-up and to be used to tailor network parameters once the
1313	   network is operational rather than having to rely on precise pre-
1314	   configuration.  This also allows the network parameters and the
1315	   overall routing protocol behavior to evolve during the lifetime of
1316	   the network.

1318	   RPL specifies a number of variables and events that can be tracked
1319	   for purposes of network fault and performance monitoring of RPL
1320	   routers.  Depending on the memory and processing capabilities of each
1321	   field device, various subsets of these can be employed in the
1322	   field.

1324	6.  Security Considerations

1326	   Industrial automation network deployments typically operate in areas
1327	   that provide limited physical security (relative to the risk of
1328	   attack).  For this reason, the link layer, transport layer and
1329	   application layer technologies utilized within such networks
1330	   typically provide security mechanisms to ensure authentication,
1331	   confidentiality, integrity, timeliness and freshness.  As a result,
1332	   such deployments may not need to implement RPL's security mechanisms
1333	   and could rely on link layer and higher layer security features.

1335	   This document does not specify operations that could introduce new
1336	   threats.  Security considerations for RPL deployments are to be
1337	   developed in accordance with recommendations laid out in, for
1338	   example, [I-D.tsao-roll-security-framework].

1340	   Industrial automation networks are subject to stringent security
1341	   requirements as they are considered a critical infrastructure
1342	   component.  At the same time, since they are composed of large
1343	   numbers of resource-constrained devices interconnected with
1344	   limited-throughput links, many available security mechanisms are not
1345	   practical for use in such networks.  As a result, the choice of
1346	   security mechanisms is highly dependent on the device and network
1347	   capabilities characterizing a particular deployment.

1349	   In contrast to other types of LLNs, in industrial automation networks
1350	   centralized administrative control and access to a permanent secure
1351	   infrastructure are available.  As a result, link-layer,
1352	   transport-layer and/or application-layer security mechanisms are
1353	   typically in place and may make use of RPL's secure mode unnecessary.

1355	6.1.  Security Considerations during initial deployment

1357	6.2.  Security Considerations during incremental deployment

1359	7.  Other Related Protocols

1361	8.  IANA Considerations

1363	   This specification has no requirement on IANA.

1365	9.  Acknowledgements

1367	10.  References

1369	10.1.  Normative References

1371	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1372	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1374	10.2.  Informative References

1376	   [I-D.ietf-roll-terminology]
1377	              Vasseur, J., "Terminology in Low power And Lossy
1378	              Networks", Internet-Draft draft-ietf-roll-terminology-12,
1379	              March 2013.

1381	   [RFC2887]  Handley, M., Floyd, S., Whetten, B., Kermode, R.,
1382	              Vicisano, L. and M. Luby, "The Reliable Multicast Design
1383	              Space for Bulk Data Transfer", RFC 2887, August 2000.

1385	   [RFC5548]  Dohler, M., Watteyne, T., Winter, T. and D. Barthel,
1386	              "Routing Requirements for Urban Low-Power and Lossy
1387	              Networks", RFC 5548, May 2009.

1389	   [RFC5826]  Brandt, A., Buron, J. and G. Porcu, "Home Automation
1390	              Routing Requirements in Low-Power and Lossy Networks", RFC
1391	              5826, April 2010.

1393	   [RFC5867]  Martocci, J., De Mil, P., Riou, N. and W. Vermeylen,
1394	              "Building Automation Routing Requirements in Low-Power and
1395	              Lossy Networks", RFC 5867, June 2010.

1397	   [RFC5673]  Pister, K., Thubert, P., Dwars, S. and T. Phinney,
1398	              "Industrial Routing Requirements in Low-Power and Lossy
1399	              Networks", RFC 5673, October 2009.

1401	   [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O. and J. Ko,
1402	              "The Trickle Algorithm", RFC 6206, March 2011.

1404	   [RFC6550]  Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R.,
1405	              Levis, P., Pister, K., Struik, R., Vasseur, JP. and R.
1406	              Alexander, "RPL: IPv6 Routing Protocol for Low-Power and
1407	              Lossy Networks", RFC 6550, March 2012.

1409	   [RFC6552]  Thubert, P., "Objective Function Zero for the Routing
1410	              Protocol for Low-Power and Lossy Networks (RPL)", RFC
1411	              6552, March 2012.

1413	   [RFC6719]  Gnawali, O. and P. Levis, "The Minimum Rank with
1414	              Hysteresis Objective Function", RFC 6719, September 2012.

1416	   [RFC6997]  Goyal, M., Baccelli, E., Philipp, M., Brandt, A. and J.
1417	              Martocci, "Reactive Discovery of Point-to-Point Routes in
1418	              Low-Power and Lossy Networks", RFC 6997, August 2013.

1420	   [I-D.thubert-roll-asymlink]
1421	              Thubert, P., "RPL adaptation for asymmetrical links",
1422	              Internet-Draft draft-thubert-roll-asymlink-02, December
1423	              2011.

1425	   [I-D.thubert-6lo-forwarding-fragments]
1426	              Thubert, P. and J. Hui, "LLN Fragment Forwarding and
1427	              Recovery", Internet-Draft draft-thubert-6lo-forwarding-
1428	              fragments-00, October 2013.

1430	   [I-D.thubert-6tisch-architecture]
1431	              Thubert, P., Assimiti, R. and T. Watteyne, "An
1432	              Architecture for IPv6 over the TSCH mode of
1433	              IEEE802.15.4e", Internet-Draft draft-thubert-6tisch-
1434	              architecture-00, October 2013.

1436	   [I-D.tsao-roll-security-framework]
1437	              Tsao, T., Alexander, R., Daza, V. and A. Lozano, "A
1438	              Security Framework for Routing over Low Power and Lossy
1439	              Networks", Internet-Draft draft-tsao-roll-security-
1440	              framework-02, March 2010.

1442	   [I-D.watteyne-6tisch-tsch]
1443	              Watteyne, T., "Using IEEE802.15.4e TSCH in an LLN context:
1444	              Overview, Problem Statement and Goals", Internet-Draft
1445	              draft-watteyne-6tisch-tsch-00, October 2013.

1447	   [I-D.wang-6tisch-6top]
1448	              Wang, Q., Vilajosana, X. and T. Watteyne, "6TiSCH
1449	              Operation Sublayer (6top)", Internet-Draft draft-wang-
1450	              6tisch-6top-00, October 2013.

1452	   [I-D.vilajosana-6tisch-minimal]
1453	              Vilajosana, X. and K. Pister, "Minimal 6TiSCH
1454	              Configuration", Internet-Draft draft-vilajosana-6tisch-
1455	              minimal-00, October 2013.

1457	10.3.  External Informative References

1459	   [HART]     www.hartcomm.org, "Highway Addressable Remote Transducer,
1460	              a group of specifications for industrial process and
1461	              control devices administered by the HART Foundation", .

1463	   [ISA100.11a]
1464	              ISA, "ISA100, Wireless Systems for Automation", May 2008,
1465	              <http://www.isa.org/Community/
1466	              SP100WirelessSystemsforAutomation>.

1468	   [ZigBeeIP]
1469	              ZigBee Public Document 15-002r00, "ZigBee IP
1470	              Specification", 2013.

1472	Authors' Addresses

1474	   Tom Phinney, editor
1475	   consultant
1476	   5012 W. Torrey Pines Circle
1477	   Glendale, AZ 85308-3221
1478	   USA

1480	   Phone: +1 602 938 3163
1481	   Email: tom.phinney@cox.net
1482	   Pascal Thubert
1483	   Cisco Systems, Inc
1484	   Building D
1485	   45 Allee des Ormes - BP1200
1486	   MOUGINS - Sophia Antipolis, 06254
1487	   FRANCE

1489	   Phone: +33 497 23 26 34
1490	   Email: pthubert@cisco.com

1492	   Robert Assimiti
1493	   Nivis
1494	   1000 Circle 75 Parkway SE, Ste 300
1495	   Atlanta, GA 30339
1496	   USA

1498	   Phone: +1 678 202 6859
1499	   Email: robert.assimiti@nivis.com