Network Working Group                                         T. Clausen
Internet-Draft                                      A. Colin de Verdiere
Intended status: Informational                                     J. Yi
Expires: April 30, 2015                         LIX, Ecole Polytechnique
                                                              U. Herberg
                                         Fujitsu Laboratories of America
                                                             Y. Igarashi
                                        Hitachi, Ltd., Yokohama Research
                                                              Laboratory
                                                        October 27, 2014

  Observations of RPL: IPv6 Routing Protocol for Low power and Lossy
                                Networks
                  draft-clausen-lln-rpl-experiences-09

Abstract

   With RPL - the "IPv6 Routing Protocol for Low-power Lossy Networks" -
   having been published as a Proposed Standard after a ~2-year
   development cycle, this document presents an evaluation of the
   resulting protocol, of its applicability, and of its limits. The
   document presents a selection of observations of the protocol
   characteristics, exposes experiences acquired when producing various
   prototype implementations of RPL, and presents results obtained from
   testing this protocol - by way of network simulations, in network
   testbeds and in deployments. The document aims at providing a better
   understanding of possible limits of RPL, notably the possible
   directions that further protocol developments should explore, in
   order to address these.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  RPL Overview
     3.1.  RPL Message Emission Timing - Trickle Timers
   4.  Requirement Of DODAG Root
     4.1.  Observations
   5.  RPL Data Traffic Flows
     5.1.  Observations
   6.  Fragmentation Of RPL Control Messages And Data Packet
     6.1.  Observations
   7.  The DAO Mechanism: Downward and Point-to-Point Routes
     7.1.  Observations
   8.  Address Aggregation and Summarization
     8.1.  Observations
   9.  Link Bidirectionality Verification
     9.1.  Observations
   10. Neighbor Unreachability Detection For Unidirectional Links
     10.1. Observations
   11. RPL Implementability and Complexity
     11.1. Observations
   12. Underspecification
     12.1. Observations
   13. Protocol Convergence
     13.1. Observations
     13.2. Caveat
   14. Loops
     14.1. Observations
   15. Security Considerations
   16. IANA Considerations
   17. Acknowledgements
   18. Informative References
   Authors' Addresses

1.  Introduction

   RPL - the "Routing Protocol for Low Power and Lossy Networks"
   [RFC6550] - is a proposal for an IPv6 routing protocol for Low-power
   Lossy Networks (LLNs), by the ROLL Working Group in the Internet
   Engineering Task Force (IETF). This routing protocol is intended to
   be the IPv6 routing protocol for LLNs and sensor networks, applicable
   in all kinds of deployments and applications of LLNs.

   The objective of RPL and ROLL is to provide routing in networks which
   "comprise up to thousands of nodes" [roll-charter], where the
   majority of the nodes have very constrained resources [RFC7102], and
   where handling mobility is not an explicit design criterion
   [RFC5867], [RFC5826], [RFC5673], [RFC5548].

   [roll-charter] states that "Typical traffic patterns are not simply
   unicast flows (e.g. in some cases most if not all traffic can be
   point to multipoint)", and [RFC7102] further categorizes the
   supported traffic types into "upward" traffic from sensors to a
   collection sink or LBR (LLN Border Router) (denoted
   multipoint-to-point), "downward" traffic from the collection sink or
   LBR to the sensors (denoted point-to-multipoint) and traffic from
   "sensor to sensor" (denoted point-to-point traffic), and establishes
   this terminology for these traffic types. Thus, while the target for
   RPL and ROLL is to support all of these traffic types, the emphasis
   among these, according to [roll-charter], appears to be to optimize
   for multipoint-to-point traffic, while also supporting
   point-to-multipoint and point-to-point traffic.

   With experiences obtained since the publication of RPL as [RFC6550],
   it is opportune to document observations of the protocol, in order to
   understand which aspects of it work well and which necessitate
   further investigations. Understanding possible limitations is
   important to identify issues which may restrict the deployment scope
   of the protocol and which may need further protocol work or
   enhancements.

   The observations made in this document, except for when explicitly
   noted otherwise, do not depend on any specific implementation or
   deployment, but can be understood from simply analyzing the protocol
   specification [RFC6550]. That said, all observations made have been
   confirmed to also be present in, at least, some deployments or test
   platforms with RPL, i.e., have been experimentally confirmed.

   This document is explicitly not an implementation guidebook for RPL.
   Its objective is to document observations of behaviors of
   [RFC6550], in the spirit of better understanding the characteristics
   and limits of the protocol.

2.  Terminology

   This document uses the terminology and notation defined in [RFC6550].

   Additionally, this document uses terminology from [RFC7102],
   specifically the terms defined for the traffic types "MP2P"
   (Multipoint-to-Point), "P2P" (Point To Point) and "P2MP"
   (Point-to-Multipoint).

3.  RPL Overview

   The basic construct in RPL is a "Destination Oriented Directed
   Acyclic Graph" (DODAG), depicted in Figure 1, with a single router
   acting as DODAG Root. The DODAG Root has responsibilities in
   addition to those of other routers, including initiating,
   configuring, and managing the DODAG, and (in some cases) acting as a
   central relay for traffic through and between routers in the LLN.

                               (s)
                              ^ ^ ^
                             /  |  \
                          (a)   |   (b)
                           ^   (c)   ^
                          /     ^    (d)
                       (f)      |    ^ ^
                               (e)--/   \
                                        (g)

                          Figure 1: RPL DODAG

   In an LLN, in which RPL has converged to a stable state, each router
   has identified a stable set of parents, each of which is a potential
   next-hop on a route towards the DODAG Root. One of the parents is
   selected as preferred parent. Each router, which is part of a DODAG
   (i.e., which has selected parents and a preferred parent) will emit
   DODAG Information Object (DIO) messages, using link-local multicast,
   indicating its respective rank in the DODAG (i.e., distance to the
   DODAG Root according to some metric(s), in the simplest form
   hop-count). Upon having received a (number of such) DIO messages, a
   router will calculate its own rank such that it is greater than the
   rank of each of its parents, select a preferred parent and then
   itself start emitting DIO messages.

   DODAG formation thus starts at the DODAG Root (initially, the only
   router which is part of a DODAG), and spreads gradually to cover the
   whole LLN as DIOs are received, parents and preferred parents are
   selected, and further routers participate in the DODAG.
   The DODAG Root also includes, in DIO messages, a DODAG
   Configuration Object, describing common configuration attributes for
   all routers in that network - including their mode of operation,
   timer characteristics etc. Routers in a DODAG include a verbatim
   copy of the last received DODAG Configuration Object in their DIO
   messages, which also permits such configuration parameters to
   propagate through the network.

   As a Distance Vector protocol, RPL restricts the ability for a router
   to change rank. A router can freely assume a smaller rank than
   previously advertised (i.e., logically move closer to the DODAG Root)
   if it discovers a parent advertising a lower rank, and must then
   disregard all previous parents of ranks higher than the router's new
   rank. The ability for a router to assume a greater rank (i.e.,
   logically move farther from the DODAG Root) than previously
   advertised is restricted in order to avoid count-to-infinity
   problems. The DODAG Root can trigger "global recalculation" of the
   DODAG by increasing a sequence number, DODAG version, in DIO
   messages.

   The DODAG so constructed is used for installing routes: the
   "preferred parent" of a router can serve as a default route towards
   the DODAG Root, and the DODAG Root can embed in its DIO messages the
   destination prefixes, included by DIOs generated by routers through
   the LLN, to which connectivity is provided by the DODAG Root. Thus,
   RPL by way of DIO generation provides "upward routes" or
   "multipoint-to-point routes" from the sensors inside the LLN and
   towards the DODAG Root (and, possibly, to destinations reachable
   through the DODAG Root).

   "Downward routes" are enabled by having sensors issue Destination
   Advertisement Object (DAO) messages, propagating as unicast via
   preferred parents towards the DODAG Root. These describe which
   prefixes belong to, and can be reached via, which router.
   In a network, all routers must operate in either storing mode or
   non-storing mode, specified by way of a "Mode of Operation" (MOP)
   flag in the DODAG Configuration Object from the DODAG Root. Those
   two modes are non-interoperable, i.e., a mixture of routers running
   in different modes is impossible in the same routing domain.
   Depending on the MOP, DAO messages are forwarded differently towards
   the DODAG Root:

   o  In "non-storing mode", a router originates DAO messages,
      advertising one or more of its parents, and unicasts these to the
      DODAG Root. Once the DODAG Root has received DAOs from a router,
      and from all routers on the route between it and the DODAG Root,
      it can use source routing for reaching advertised destinations
      inside the LLN.

   o  In "storing mode", each router on the route between the originator
      of a DAO and the DODAG Root records a route to the prefixes
      advertised in the DAO, as well as the next-hop towards these (the
      router, from which the DAO was received), then forwards the DAO to
      its preferred parent.

   "Point-to-point routes", for communication between devices inside the
   LLN and where neither of the communicating devices is the DODAG
   Root, are by default supported by having the source sensor transmit a
   data packet, via its default route to the DODAG Root (i.e., using the
   upward routes), which will then, depending on the "Mode of Operation"
   for the DODAG, either add a source-route to the received data packet
   for reaching the destination sensor (downward routes in non-storing
   mode), or simply use hop-by-hop routing (downward routes in storing
   mode) for forwarding the data packet.
   In the case of storing mode, if the source and the destination for
   a point-to-point data packet share a common ancestor other than the
   DODAG Root, a downward route may be available in a router (and,
   thus, used) before the data packet reaches the DODAG Root.

3.1.  RPL Message Emission Timing - Trickle Timers

   RPL message generation is timer-based, with the DODAG Root being able
   to configure back-off of message emission intervals using Trickle
   [RFC6206]. Trickle, as used in RPL, stipulates that a router
   transmits a DIO "every so often" - except if receiving a number of
   DIOs from neighbor routers, enabling the router to determine if its
   DIO transmission is redundant.

   When a router transmits a DIO, there are two possible outcomes:
   either every neighbor router that hears the message finds that the
   information contained is consistent with its own state (i.e., the
   received DODAG version number corresponds with that which the router
   has recorded, and no better rank is advertised than that which is
   recorded in the parent set) - or, a recipient router detects that
   either the sender of the DIO or itself has out-of-date information.
   If the sender has out-of-date information, then the recipient router
   schedules transmission of a DIO to update this information. If the
   recipient router has out-of-date information, then it updates based
   on the information received in the DIO.

   With Trickle, a router will schedule emission of a DIO at some time,
   t, in the future. When receiving a DIO containing information
   consistent with its own information, the router will record that
   "redundant information has been received" by incrementing a
   redundancy counter, c. At the time t, if c is below some "redundancy
   threshold", then it transmits its DIO.
   Otherwise, transmission of a DIO at this time is suppressed, c is
   reset and a new t is selected twice as far in the future - bounded
   by a pre-configured maximum value for t. If, on the other hand, the
   router has received an out-of-date DIO from one of its neighbors, t
   is reset to a pre-configured minimum value and c is set to zero. In
   both cases, at the expiration of t, the router will verify if c is
   below the "redundancy threshold" and if so transmit - otherwise,
   increase t and stay quiet.

4.  Requirement Of DODAG Root

   As indicated in Section 3, the DODAG Root both has a special
   responsibility and is subject to special requirements. The DODAG
   Root is responsible for determining and maintaining the configuration
   parameters for the DODAG, and for initiating DIO emissions.

   When downward routes are supported, the DODAG Root is also
   responsible (in both storing and non-storing mode) for maintaining
   sufficient topological information to be able to construct routes to
   all destinations in the network.

   When operating in non-storing mode, this entails that the DODAG Root
   is required to have sufficient memory and sufficient computational
   resources to be able to record a network graph containing all routes
   from itself to all destinations and to calculate routes.

   When operating in storing mode, this entails that the DODAG Root
   needs enough memory to keep a list of all routers in the RPL
   instance, and a next hop for each of those routers. If aggregation
   is used, the memory requirements can be reduced in storing mode (see
   Section 8 for observations about aggregation in RPL).

   The DODAG Root is also required to have sufficient energy available
   so as to be able to ensure the relay functions required. This
   applies especially to non-storing mode, where all data packets
   transit through the DODAG Root.

4.1.  Observations

   In a given deployment, select routers can be provisioned with the
   required energy, memory and computational resources so as to serve as
   DODAG Roots, and be administratively configured as such - with the
   remainder of the routers in the network being of typically lesser
   capacity. In storing mode, the DODAG Root needs to keep a routing
   entry for each router in the RPL instance. In non-storing mode, the
   resource requirements on the DODAG Root are likely much higher than
   in storing mode, as the DODAG Root needs to store a network graph
   containing complete routes to all destinations in the RPL instance,
   in order to calculate the routing table (whereas in storing mode,
   only the next hop for each destination in the RPL instance needs to
   be stored, and aggregation may be used to further reduce the resource
   requirements).

   A router provisioned with resources to act as a DODAG Root, and
   administratively configured to act as such, represents a single point
   of failure for the DODAG it serves. It is possible for a given RPL
   deployment to contain several DODAGs, each rooted in a border router.
   RPL also supports that several border routers participate in the same
   DODAG - with the caveat that in this case, a "virtual" DODAG Root,
   external to the LLN, exists and coordinates DODAGVersionNumbers
   and other DODAG parameters. The precise coordination mechanism is
   not specified in [RFC6550], which instead states that:

      The method of coordination is out of scope for this specification
      (to be defined in future companion specifications).

   As the memory requirements for the DODAG Root and for other routers
   are substantially different, unless all routers are provisioned with
   resources (memory, energy, ...) to act as DODAG Roots, effectively if
   the designated DODAG Root fails, the network fails and RPL is unable
   to operate.
   Even if another router is elected as temporary DODAG Root (e.g.,
   for forming a "Floating" DODAG) for providing internal connectivity
   between routers, this router may not have the necessary resources to
   fulfill this role as (temporary) DODAG Root.

   Thus, although in principle RPL provides, by way of "Floating
   DODAGs", protocol mechanisms for establishing a DODAG for providing
   internal connectivity even in case of failure of the administratively
   provisioned DODAG Root, all (or at least a large number) of the
   routers need to have resources to act as roots to support floating
   DODAGs, especially in non-storing mode.

   Another possible LLN scenario is that only internal point-to-point
   connectivity is sought, and no router has a more "central" role than
   any other - a self-organizing LLN. In those cases, it would be hard
   to designate such a "super-device" as DODAG Root, and doing so can
   result in non-optimal routes.

5.  RPL Data Traffic Flows

   [RFC7102] defines three data traffic types: multipoint-to-point
   traffic, point-to-multipoint traffic, and point-to-point traffic.
   Multipoint-to-point traffic reflects telemetry flowing "from sensors
   to a sink", with point-to-multipoint traffic reflecting control
   (commands) "from a central authority to actuators".

   RPL is designed to support these three data traffic types, but in
   doing so implicitly makes two assumptions regarding the targeted
   deployment scenarios:

   o  Telemetry "from sensors to a sink" is common, control (commands)
      "from a central authority to actuators" is rare - and while
      traffic between two sensors is supported, it is extremely rare.

   o  The "sink" and the "central authority" are co-located with, or
      reachable via, the DODAG Root.

   While not specifically called out thus in [RFC6550], the resulting
   protocol design reflects these assumptions in that the mechanism
   constructing multipoint-to-point routes is efficient in terms of
   control traffic generated and state required,
   point-to-multipoint route construction much less so - and
   point-to-point routes subject to potentially significant route
   stretch (routes going through the DODAG Root in non-storing mode)
   and over-the-wire overhead from using source routing (from the DODAG
   Root to the destination) (see Section 7) - or, in case of storing
   mode, considerable memory requirements in all LLN routers inside the
   network (see Section 7).

   A router selects from among its parents a "preferred parent", to
   serve as a default route towards the DODAG Root (and to prefixes
   advertised by the DODAG Root). Thus, RPL provides "upward routes" or
   "multipoint-to-point routes" from the routers below the DODAG Root
   and towards the DODAG Root.

   A router which wishes to act as a destination for data traffic
   ("downward routes" or "point-to-multipoint") issues DAOs upwards in
   the DODAG towards the DODAG Root, describing which prefixes belong
   to, and can be reached via, that router.

   Point-to-Point routes between routers below the DODAG Root are
   supported by having the source router transmit, via its default
   route, data traffic towards the DODAG Root. In non-storing mode, the
   data traffic will reach the DODAG Root, which will reflect the data
   traffic downward towards the destination router, adding a strict
   source routing header indicating the precise route for the data
   traffic to reach the intended destination router.
   In storing mode, the source and the destination may possibly
   (although, may also not) have a common ancestor other than the DODAG
   Root, which may provide a downward route to the destination before
   the data traffic reaches the DODAG Root.

5.1.  Observations

   RPL is well suited for networks in which the sink for data traffic is
   co-located with, (or is outside the LLN and reachable via), the DODAG
   Root. However, these data traffic characteristics do not represent
   a universal distribution of traffic types in LLNs. There are
   scenarios where the sink is not co-located with (or is outside the
   LLN and reachable via) the DODAG Root. These include:

   o  Command/control networks in which sensor-to-sensor traffic is a
      more common occurrence, documented, e.g., in [RFC5867] ("Building
      Automation Routing Requirements in Low Power and Lossy Networks").

   o  Networks in which all traffic is bi-directional, e.g., in case
      sensor devices in the LLN are, in majority, "actively read": a
      request is issued by the DODAG Root to a specific sensor, and the
      sensor value is expected to be returned. In fact, unless all
      traffic in the LLN is unidirectional, without acknowledgements
      (e.g., as in UDP), and no control messages (e.g., for service
      discovery) or other data packets are sent from the DODAG Root to
      the routers, traffic will be bi-directional. The IETF protocol
      for use in constrained environments, CoAP [RFC7252], makes use of
      acknowledgements to control packet loss and ensure that packets
      are received by the packet destination. Of the four message
      types defined for CoAP (confirmable, acknowledgement, reset and
      non-confirmable), the first three are dedicated to the
      sending/acknowledgement cycle.
      Another example is that the ZigBee Alliance SEP 2.0
      specification [SEP2.0] (adopted by the IEEE) describes the use of
      HTTP over TCP over ZigBeeIP, between routers and the DODAG Root -
      and with the use of TCP inherently causing bidirectional traffic
      by way of data-packets and their corresponding acknowledgements.
      In fact, current Internet protocols generally require some form
      of acknowledgment, and foregoing an acknowledgment probably means
      a trade-off in the area of reliable transmission or repeated
      retransmissions or both.

   o  Telemetry scenarios where the DODAG Root and the sink are not
      co-located. This can happen if different kinds of information
      are sent to different central authorities for processing: for
      example, temperature goes to Server A and humidity goes to Server
      B. A possible solution for RPL is to run several DODAGs with
      different roots, which incurs extra overhead.

   For scenarios where sensor-to-sensor traffic is a more common
   occurrence, all sensor-to-sensor routes include the DODAG Root,
   possibly causing congestion on the communication medium near the
   DODAG Root, and draining energy from the intermediate routers on an
   unnecessarily long route. If sensor-to-sensor traffic is common,
   routers near the DODAG Root will be particularly solicited as relays,
   especially in non-storing mode.

   For scenarios with bi-directional traffic, as there is no provision
   for on-demand generation of routing information from the DODAG Root
   to a proper subset of all routers, each router (besides the Root) is
   required to generate DAOs. In particular in non-storing mode, each
   router will unicast a DAO to the DODAG Root (whereas in storing mode,
   the DAOs propagate upwards towards the Root).
   The effects of the requirement to establish downward routes to all
   routers are:

   o  Increased memory and processing requirements at the DODAG Root (in
      particular in non-storing mode) and in routers near the DODAG Root
      (in storing mode).

   o  A considerable control traffic overhead [bidir], in particular at
      and near the DODAG Root, therefore:

   o  Potentially congested channels, and:

   o  Energy drain from the routers.

6.  Fragmentation Of RPL Control Messages And Data Packet

   Some link layers used in LLNs, such as IEEE 802.15.4 [ieee802154],
   are unable to provide an MTU of at least 1280 octets - as otherwise
   required for IPv6 [RFC2460]. In such LLNs, link fragmentation and
   reassembly of IP packets at a layer below IPv6 is used to transport
   larger IP packets, providing the required minimum 1280 octet MTU
   [RFC4919].

   When such link fragmentation is used, the IP packet has to be
   reassembled at every hop. Every fragment must be received
   successfully by the receiving device, or the entire IP packet is
   lost. Moreover, the additional link-layer frame overhead (and IPv6
   Fragment header overhead in case of IP fragmentation) for each of the
   fragments increases the capacity required from the medium, and may
   consume more energy for transmitting a higher number of frames on the
   network interface.

   RPL is an IPv6 routing protocol, designed to operate on constrained
   link layers, such as [ieee802154], with a maximum frame size of 127
   bytes - a much smaller value than the specified minimum MTU of 1280
   bytes for IPv6 [RFC2460].
   To reduce the need for fragmentation of IP datagrams on such a link
   layer, 6LoWPAN provides an adaptation layer [RFC4944], [RFC6282],
   which provides link fragmentation in order to accommodate IPv6
   packet transmissions over the maximum IEEE 802.15.4 frame size of
   127 octets, and which compresses the IPv6 header, reducing the
   overhead of the IPv6 header from at least 40 octets to a minimum of
   2 octets. Given the IEEE 802.15.4 frame size of 127 octets, a
   maximum frame overhead of 25 octets and 21 octets for link layer
   security [RFC4944], 81 octets remain for L2 payload. Further
   subtracting 2 octets for the compressed IPv6 header leaves 79 octets
   for L3 data payload if link fragmentation is to be avoided.

   As the second L in LLN indicates Lossy [roll-charter], higher loss
   rates than typically seen in IP networks are expected, which makes
   link fragmentation important to avoid, in particular because, as
   mentioned above, the whole IP packet is dropped if only a single
   fragment is lost [RFC4944].

6.1.  Observations

   [RFC4919] makes the following observation regarding using IP in
   LoWPAN networks based on IEEE 802.15.4 frames:

      Applications within LoWPANs are expected to originate small
      packets. Adding all layers for IP connectivity should still allow
      transmission in one frame, without incurring excessive
      fragmentation and reassembly. Furthermore, protocols must be
      designed or chosen so that the individual "control/protocol
      packets" fit within a single 802.15.4 frame. Along these lines,
      IPv6's requirement of sub-IP reassembly [...] may pose challenges
      for low-end LoWPAN devices that do not have enough RAM or storage
      for a 1280-octet packet.

   In order to avoid the link fragmentation and thus to adhere to the
   recommendation in [RFC4919], each control packet of RPL must fit into
   the remaining 79 octets of the 802.15.4 frame.
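
   The 79-octet budget used here follows from simple subtraction. The
   Python sketch below only restates the figures quoted in this
   section; the constant and function names are illustrative
   assumptions and are not taken from [RFC4944] or any implementation.

```python
# Octet budget of a single IEEE 802.15.4 frame, using the figures quoted
# in the text. All names below are illustrative, not from any specification.
FRAME_SIZE = 127             # maximum IEEE 802.15.4 frame size
MAX_FRAME_OVERHEAD = 25      # maximum link-layer frame overhead
LINK_SECURITY_OVERHEAD = 21  # link layer security
COMPRESSED_IPV6_HEADER = 2   # minimum size of the compressed IPv6 header


def l2_payload_octets() -> int:
    """Octets left for the L2 payload after frame and security overhead."""
    return FRAME_SIZE - MAX_FRAME_OVERHEAD - LINK_SECURITY_OVERHEAD


def l3_payload_octets() -> int:
    """Octets left for L3 data once the compressed IPv6 header is deducted."""
    return l2_payload_octets() - COMPRESSED_IPV6_HEADER


print(l2_payload_octets(), l3_payload_octets())  # 81 79
```

   Any larger header, for instance an uncompressed 40-octet IPv6
   header, would shrink the L3 budget accordingly.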
   While 79 octets may seem to be sufficient to carry RPL control
   messages, consider the following: RPL control messages are carried
   in ICMPv6, and the mandatory ICMPv6 header consumes 4 octets. The
   DIO base consumes another 24 octets. If link metrics are used, that
   consumes at least another 8 octets - and this is when using a simple
   hop count metric; other metrics may require more. The DODAG
   Configuration Object consumes up to a further 16 octets, for a total
   of 52 octets. Adding a Prefix Information Object for address
   configuration consumes another 32 octets, for a total of 84 octets -
   thus exceeding the 79 octets available for L3 data payload and
   causing link fragmentation of such a DIO. As a point of reference,
   the ContikiRPL [rpl-contiki] implementation includes both the DODAG
   Configuration option and the Prefix Information option in all DIO
   messages. Any other options, e.g., Route Information options
   indicating prefixes reachable through the DODAG Root, increase the
   overhead and thus the probability of fragmentation.

   RPL may further increase the probability of link fragmentation of
   data traffic: for non-storing mode, RPL employs source-routing for
   all downward traffic. [RFC6554] specifies the RPL Source Routing
   header, which imposes a fixed overhead of 8 octets per IP packet,
   leaving 71 octets remaining from the link-layer MTU in order to
   contain the whole IP packet in a single frame - from which must be
   deducted a variable number of octets, depending on the length of the
   route. With fewer octets available for data payload, RPL thus
   increases the probability of link fragmentation of data packets as
   well. This applies in particular to longer routes, e.g., for
   point-to-point data traffic between sensors inside the LLN, where
   data traffic transits through the DODAG Root and is then
   source-routed to the destination.
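
   The DIO size tally above can be reproduced mechanically. The
   following Python sketch uses only the octet counts quoted in this
   section; the list layout and the helper name first_overflow are
   illustrative assumptions, not part of RPL or of any implementation.

```python
# Running total of the DIO components quoted in the text, checked against
# the 79-octet L3 budget. Names are illustrative, not from any specification.
L3_BUDGET = 79
SRH_FIXED_OVERHEAD = 8  # fixed part of the RPL Source Routing header

DIO_PARTS = [
    ("ICMPv6 header", 4),
    ("DIO base", 24),
    ("hop count metric", 8),
    ("DODAG Configuration Object", 16),
    ("Prefix Information Object", 32),
]


def first_overflow(parts, budget):
    """Return (name, running_total) of the first part that exceeds budget."""
    total = 0
    for name, size in parts:
        total += size
        if total > budget:
            return name, total
    return None  # everything fits in a single frame


print(first_overflow(DIO_PARTS, L3_BUDGET))  # ('Prefix Information Object', 84)
print(L3_BUDGET - SRH_FIXED_OVERHEAD)        # 71
```

   The second figure is the payload left for a source-routed data
   packet before any per-hop address entries are deducted.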
The overhead of source routing is further detailed in 588 Section 7. 590 Given the minimal packet size of LLNs, the routing protocol must 591 impose low (or no) overhead on data packets, hopefully independently 592 of the number of hops [RFC4919]. However, source-routing not only 593 causes increased overhead in the IP header, it also leads to a 594 variable available payload for data (depending on how long the source 595 route is). In point-to-point communication and when non-storing mode 596 is used for downward traffic, the source of a data packet will be 597 unaware of how many octets will be available for payload (without 598 incurring link fragmentation) when the DODAG Root relays the data 599 packet and adds the source routing header. Thus, the source may 600 choose an inefficient size for the data payload: if the data payload 601 is large, it may exceed the link-layer MTU at the DODAG Root after 602 adding the source-routing header; on the other hand, if the data 603 payload is low, the network resources are not used efficiently, which 604 introduces more overhead and more frame transmissions. 606 Unless the DODAG Root is the source of an IPv6 packet to be forwarded 607 through an RPL LLN, the IPv6 packet must be encapsulated in IPv6-in- 608 IPv6 tunneling, with the RPL extension added to the outer IPv6 609 header. Similarly, in non-storing mode, the original IPv6 packet 610 must be carried in IPv6-in-IPv6 tunneling, with the RPL routing 611 header added to the outer IPv6 header. Both of these mechanisms add 612 additional overhead, increasing the likelihood that link 613 fragmentation will be required to deliver the IPv6 packet. In 614 addition, even IPv6 packets that are the minimum MTU size of 1280 615 octets will require IPv6 fragmentation to accommodate the RPL tunnel 616 and headers on a deployment using the [RFC4944] specification to 617 carry IPv6 over IEEE 802.15.4, because RFC4944 defines the MTU for 618 such deployments to be 1280 octets. 
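   The octet budgets quoted earlier in this section can be checked with
   a few lines of arithmetic.  The following Python sketch is
   illustrative only: the octet counts are those given in the text,
   while all constant and function names are assumptions made for this
   example and are not taken from any RPL implementation.

```python
# Octet counts as quoted in Section 6; all names are hypothetical.
FRAME_SIZE = 127            # maximum IEEE 802.15.4 frame size
MAX_FRAME_OVERHEAD = 25     # maximum frame overhead
LINK_SECURITY = 21          # link-layer security
COMPRESSED_IPV6_HEADER = 2  # minimum compressed IPv6 header

# Octets left for L3 data payload if link fragmentation is to be avoided.
L3_BUDGET = (FRAME_SIZE - MAX_FRAME_OVERHEAD
             - LINK_SECURITY - COMPRESSED_IPV6_HEADER)

# Components of the DIO discussed in the text, in octets.
DIO_PARTS = [
    ("ICMPv6 header", 4),
    ("DIO base", 24),
    ("metric container (hop count)", 8),
    ("DODAG Configuration", 16),
    ("Prefix Information", 32),
]

SRH_FIXED_OVERHEAD = 8      # fixed part of the RPL Source Routing header


def dio_size(parts):
    """Sum the octet counts of the listed DIO components."""
    return sum(size for _, size in parts)


def overflow(size, budget=L3_BUDGET):
    """Octets by which `size` exceeds `budget` (0 if it fits)."""
    return max(0, size - budget)


if __name__ == "__main__":
    total = dio_size(DIO_PARTS)
    print("L3 budget:", L3_BUDGET)
    print("DIO size:", total, "overflow:", overflow(total))
    print("budget after fixed SRH part:", L3_BUDGET - SRH_FIXED_OVERHEAD)
```

   Under these figures the DIO with all listed components exceeds the
   single-frame budget by 5 octets, before any further options are
   added.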
The ZigBee Alliance has relaxed 619 [RFC4944] to use a longer MTU for accommodating 1280 octet IPv6 620 packets with the required tunnel overhead without fragmentation. The 621 "ZigBee IP Specification" (ZIP) [ZigBeeIP] specifies in section 622 5.4.3: 624 A ZIP node MUST ensure that the insertion of a RPL extension 625 header, either directly or via IPv6-in-IPv6 tunneling, does not 626 cause IPv6 fragmentation. This is done by using a different MTU 627 value for packets where the IPv6 header includes a RPL extension 628 header. The RPL tunnel entry point SHOULD be considered as a 629 separate interface whose MTU is set to the 6LoWPAN interface MTU 630 plus RPL_MTU_EXTENSION bytes. 632 Section 7.1 of [ZigBeeIP] defines RPL_MTU_EXTENSION to be 100 bytes. 634 7. The DAO Mechanism: Downward and Point-to-Point Routes 636 RPL specifies two distinct and incompatible "modes of operation" for 637 downward traffic: storing mode, where each router is assumed to 638 maintain routes to all destinations in its sub-DODAG, i.e., routers 639 that are "deeper down" in the DODAG, and non-storing mode, where only 640 the DODAG Root stores routes to destinations inside the LLN, and 641 where the DODAG Root employs strict source routing in order to route 642 data traffic to the destination router. 644 7.1. Observations 646 In addition to possible fragmentation, as occurs when using 647 potentially long source routing headers over a medium with a small 648 MTU - similar to what is discussed in Section 6 - the maximum length 649 of the source routing header [RFC6554] is limited to 136 octets, 650 including an 8 octet long header. As each IPv6 address has a length 651 of 16 octets, not more than 8 hops from the source to the destination 652 are possible for "raw IPv6". Using address compression (e.g., as 653 specified in [RFC4944]), the maximum route length may not exceed 64 654 hops. 
   This excludes deployment of RPL for scenarios with long "chain-like"
   topologies, such as traffic lights along a street.

   In storing mode, each router has to store routes for destinations in
   its sub-DODAG.  This implies that, for routers near the DODAG Root,
   the required storage is only bounded by the number of destinations in
   the network.  As RPL targets constrained devices with little memory,
   but also has the ambition of operating networks consisting of
   thousands of routers [roll-charter], the storage capacity of these
   routers may need to be the same as that of the DODAG Root - or, at
   least, the storage requirements of routers "near the DODAG Root" and
   "far from the DODAG Root" are not homogeneous.  Thus, some sort of
   administrative deployment, and continued administrative maintenance
   of devices as the network evolves, is needed.

   In an experimental testbed, [rpl-eval-UCB] argues that practical
   experiences suggest that RPL in storing mode, with routers having
   10 kB of RAM (TELOSB mote with TinyOS, 16-bit RISC, 48 kB program
   flash memory, 16 kB configuration EEPROM), should be limited to
   networks of fewer than ~30 routers.  Note that this figure of fewer
   than 30 routers reflects only the results obtained from the specific
   testbed and implementation in [rpl-eval-UCB].  Aggregation /
   summarization of addresses may be advanced as a possible argument
   that this issue is of little significance - Section 8 discusses why
   such an argument does not apply.  Moreover, if the LoWPAN adaptation
   layer [RFC4944] is used in the LLN, route aggregation is not possible
   since the same /64 is applied across the entire network.
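   Returning to the source routing header limits given at the start of
   this subsection, the hop counts follow directly from the header
   sizes.  The Python sketch below is illustrative only; the function
   name is hypothetical, and the compressed address length of 2 octets
   is an assumption chosen because it reproduces the 64-hop figure (the
   text does not state the compressed length explicitly).

```python
SRH_MAX_LENGTH = 136   # maximum source routing header length, in octets
SRH_FIXED_HEADER = 8   # fixed part of the header, in octets


def max_hops(address_octets):
    """Number of addresses that fit in the source routing header."""
    if address_octets <= 0:
        raise ValueError("address length must be positive")
    return (SRH_MAX_LENGTH - SRH_FIXED_HEADER) // address_octets


if __name__ == "__main__":
    print("raw IPv6 (16-octet addresses):", max_hops(16))
    # Assumption: addresses compressed to 2 octets each.
    print("compressed (2-octet addresses):", max_hops(2))
```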
682 In short, the mechanisms in RPL force the choice between requiring 683 all routers to have sufficient memory to store route entries for all 684 destinations (storing mode) - or, suffer increased risk of 685 fragmentation, and thus loss of data packets, while consuming network 686 capacity by way of source routing through the DODAG Root (non-storing 687 mode). 689 In RPL, the "mode of operation" stipulates that either downward 690 routes are not supported (MOP=0), or that they are supported by way 691 of either storing or non-storing mode. In case downward routes are 692 supported, RPL does not provide any mechanism for discriminating 693 between which routes should or should not be maintained. In 694 particular, in order to calculate routes to a given destination, all 695 intermediaries between the DODAG Root and that destination must 696 themselves be reachable - effectively rendering downward routes in 697 RPL an "all-or-none" situation. In case a destination is 698 unreachable, all the DODAG Root may do is increase DTSN (Destination 699 Advertisement Trigger Sequence Number) to trigger DAO message 700 transmission, or eventually increase the DODAG version number in case 701 the destination is still unreachable, which possibly provokes a 702 broadcast-storm-like situation. This, in particular, as [RFC6550] 703 does not specify DAO message transmission constraints, nor any 704 mechanism for adapting DAO emission to the network capacity. 706 In storing mode, a DTSN increment by the DODAG Root works only if all 707 routers, on the path from the DODAG Root to the "lost" target router, 708 have kept their routing table up-to-date by triggering DAO updates, 709 and thus have a route to the target router. In non-storing mode, the 710 DODAG Root incrementing its DTSN will trigger global DAO updates, and 711 thus extra overhead in the network and delay in the recalculation of 712 the missing route. 714 Furthermore, DTSN increments are carried by way of DIO messages. 
   In case the "lost" target router has lost all of its parents, it will
   not be able to receive DIO messages from them, and thus will have to
   wait until it has poisoned its sub-DODAG and joined the DODAG through
   another parent.  The only way the DODAG Root can speed up this
   process is by incrementing the DODAG version number, thus triggering
   global recalculation of the DODAG.

   Even in case the DTSN increment is carried to the "lost" target
   router through another parent, the triggered DAO will need to go up
   the DODAG to the DODAG Root via another route, which might itself be
   broken.  This would necessitate the use of local repair mechanisms,
   potentially causing loops in the network (see Section 14) and
   eventually global DODAG recalculation.

8.  Address Aggregation and Summarization

   As indicated in Section 7, in storing mode, a router is expected to
   be able to store routing entries for all destinations in its
   "sub-DODAG", i.e., routing entries for all destinations in the
   network where the route to the DODAG Root includes that router.

   In the Internet, no single router stores explicit routing entries for
   all destinations.  Rather, IP addresses are assigned hierarchically,
   such that an IP address does not only uniquely identify a network
   interface, but also its topological location in the network, as
   illustrated in Figure 2.  All addresses with the same prefix are
   reachable by way of the same router - which can, therefore, advertise
   only that prefix.  Other routers need only record a single routing
   entry for that prefix, knowing that as the IP packet reaches the
   router advertising that prefix, more precise routing information is
   available.

                              .---.
                              |   |
                              '---'
                                |
                                |
                               (a)
                                |
                                |1.x.x.x/8
                                |
                               (b)
                               / \
                    1.1.x.x/16/   \ 1.2.x.x/16
                             /     \
                         .---.     .---.
                         | c |     | d |
                         '---'     '---'

                    Figure 2: Address Hierarchies

   Any aggregated routes require the use of a prefix shorter than /64,
   and subsequent hierarchical assignment of prefixes down to a /64 (as
   any router itself provides a /64 subnet to any hosts connected to the
   router).

   Moreover, if the 6LoWPAN adaptation layer [RFC4944] is used in the
   LLN, route aggregation is not possible since the same /64 is applied
   across the entire network.

8.1.  Observations

   In RPL, each router acquires a number of parents, as described in
   Section 3, from among which it selects one as its preferred parent
   and, thus, next-hop on the route to the DODAG Root.  Routers maintain
   a parent set containing possibly more than a single parent so as to
   be able to rapidly select an alternative preferred parent, should the
   previously selected one become unavailable.  Thus, the expected
   behavior is for a router to be able to change its point of attachment
   towards the DODAG Root.  If IP addresses are assigned in a strictly
   hierarchical fashion, and if scalability of the routing state
   maintained in storing mode is based on this hierarchy, then this
   entails that each time a router changes its preferred parent, it must
   also change its own IP address - as well as cause routers in its
   "sub-DODAG" to do the same.  RPL does not specify signaling for
   reconfiguring addresses in a sub-DODAG, while [RFC6550] specifically
   allows for aggregation (e.g., in Section 18.2.6.: "[...] it is
   recommended to delay the sending of DAO message to DAO parents in
   order to maximize the chances to perform route aggregation").
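   The dependency between hierarchical addresses and the point of
   attachment can be illustrated with the prefixes of Figure 2.  The
   Python sketch below is illustrative only: it uses the standard
   "ipaddress" module, writes the figure's 1.x.x.x/8, 1.1.x.x/16 and
   1.2.x.x/16 as concrete networks, and introduces a hypothetical host
   address below c that does not appear in the figure.

```python
import ipaddress

# Prefixes from Figure 2, written as concrete networks.
aggregate = ipaddress.ip_network("1.0.0.0/8")   # advertised on the (a)-(b) link
prefix_c = ipaddress.ip_network("1.1.0.0/16")   # sub-network below c
prefix_d = ipaddress.ip_network("1.2.0.0/16")   # sub-network below d

# Hypothetical host attached below c (not part of the figure).
host = ipaddress.ip_address("1.1.0.5")


def needs_explicit_route(address, parent_prefix):
    """True if `address` is not covered by the parent's aggregate prefix."""
    return address not in parent_prefix


if __name__ == "__main__":
    # Both /16 prefixes are covered by the single /8 aggregate.
    print(prefix_c.subnet_of(aggregate), prefix_d.subnet_of(aggregate))
    # While attached below c, the host is covered by c's prefix.
    print(needs_explicit_route(host, prefix_c))
    # If it re-attaches below d while keeping its address, d's prefix
    # no longer covers it.
    print(needs_explicit_route(host, prefix_d))
```

   In this sketch, a host that moves from c to d without renumbering
   falls outside d's aggregate, which is the situation that forces
   either an address change or an explicit routing entry.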
795 A slightly less strict hierarchy can be envisioned, where a router 796 can change its preferred parent without necessarily changing 797 addresses of itself and of its sub-DODAG, provided that its former 798 and new preferred parents both have the same preferred parent, and 799 have addresses hierarchically assigned from that - from the 800 "preferred grandparent". With reference to Figure 1, this could be e 801 changing its preferred parent from d to c, provided that both d and c 802 have b as preferred parent. Doing so would impose a restriction on 803 the parent-set selection, admitting only parents which have 804 themselves the same parent, losing redundancy in the network 805 connectivity. RPL does not specify rules for admitting only parents 806 with identical grand-parents into the parent set - although such is 807 not prohibited either, if the loss of redundancy is acceptable. 809 The DODAG Root incrementing the DODAG version number is the mechanism 810 by which RPL enables global reconfiguration of the network, 811 reconstructing the DODAG with (intended) more optimal routes. In 812 case of addressing hierarchies being enforced, so as to enable 813 aggregation, this will either restrict the ability for an optimal 814 DODAG construction, or will also have to trigger global address 815 autoconfiguration so as to ensure addressing hierarchies. 817 Finally, with IP addresses serving a dual role of an identifier of 818 both an end-point for communication and a topological location in the 819 network, changing the IP address of a device, so as to reflect a 820 change in network topology, also entails interrupting ongoing 821 communication to or through that device. Additional mechanisms 822 (e.g., a DNS-like system) mapping "communications identifiers" and 823 "IP addresses" are required. 825 9. Link Bidirectionality Verification 827 Parents (and the preferred parent) are selected based on receipt of 828 DIOs. 
   This, alone, does not guarantee the ability of a router to
   successfully communicate with the parent.  However, the basic use of
   links is for "upward" routes, i.e., for the router to use a parent
   (the preferred parent) as relay towards the DODAG Root - in the
   opposite direction of the one in which the DIO was received.

9.1.  Observations

   Unidirectional links are no rare occurrence, as is known from
   wireless multi-hop networks.  Preliminary results from a test-bed of
   AMI (Automated Metering Infrastructure) devices using 950 MHz radio
   interfaces, and with a total of 22 links, show that 36% of these
   links are unidirectional.  If a router receives a DIO on such a
   unidirectional link and selects the originator of the DIO as parent,
   that would be a bad choice: unicast traffic in the upward direction
   would be lost.  If the router had verified the bidirectionality of
   links, it might have selected a better parent, to which it has a
   bidirectional link.

   [RFC6550] discusses some mechanisms which can (if deemed needed) be
   used to verify that a link is bidirectional before choosing a router
   as a parent.  While requiring one mechanism for bidirectional
   verification to be used, the document does not specify which method
   is to be used, nor how it is to be used.  The mechanisms discussed
   include NUD [RFC4861], BFD [RFC5881] and [RFC5184].  BFD is
   explicitly called out as "often not desirable" as it uses a proactive
   approach (exchange of periodic HELLO messages), and thus would "lead
   to excessive control traffic".  Furthermore, not all L2 protocols
   provide L2 acknowledgements; even less so for multicast packets - and
   so, not on RPL DIOs, the multicast transmission of which is a
   requirement for the Trickle timer flooding reduction to be effective
   (see Section 3.1).
   As a consequence, such L2 acknowledgements can only be used to
   determine if a given link is bidirectional or unidirectional once the
   router already has selected parents AND actually has data traffic to
   forward by way of these parents - in contradiction with RPL's stated
   design principle, which requires that the reachability of a router be
   verified before choosing it as a parent ([RFC6550], Section 1.1).
   Absent any mechanism specified by RPL to verify the bidirectionality
   of links, routers thus have to rely on NUD to choose their parent
   correctly (see Section 10).

10.  Neighbor Unreachability Detection For Unidirectional Links

   [RFC6550] suggests using Neighbor Unreachability Detection (NUD)
   [RFC4861] to detect and recover from the situation of unidirectional
   links between a router and its (preferred) parent(s).  When, e.g., a
   router tries (and fails) to actually use another router for
   forwarding traffic, NUD is supposed to engage in order to detect this
   and prompt corrective action, e.g., by way of selecting an
   alternative preferred parent.

   NUD is based upon observing if a data packet is making forward
   progress towards the destination, either by way of indicators from
   upper-layer protocols (such as TCP) and, though not called out in
   [RFC4861], also from lower-layer protocols (such as link-layer ACKs)
   or - failing that - by unicast probing by way of transmitting a
   unicast Neighbor Solicitation message and expecting that a solicited
   Neighbor Advertisement message be returned.

10.1.  Observations

   A router may receive, transiently, a DIO from a router closer (in
   terms of rank) to the DODAG Root than any other router from which a
   DIO has been received.
Some, especially wireless, link layers may 892 exhibit different transmission characteristics between multicast and 893 unicast transmissions (such is the case for some implementations of 894 IEEE 802.11b, where multicast/broadcast transmissions are sent at 895 much lower bit-rates than are unicast; IEEE 802.11b is, of course, 896 not suggested as a viable L2 for LLNs, but serves to illustrate that 897 such asymmetric designs exist), leading to a (multicast) DIO being 898 received from farther away than a unicast transmission can reach. 899 DIOs are sent (downward) using link-local multicast, whereas the 900 traffic flowing in the opposite direction (upward) is unicast. Thus, 901 a received (multicast) DIO may not be indicative of useful unicast 902 connectivity - yet, RPL might cause this router to select this 903 seemingly attractive router as its preferred parent. This may happen 904 both at initialization, or at any time during the LLN lifetime as RPL 905 allows attachment to a "better parent" over the network lifetime. 907 A DODAG so constructed may appear stable and converged until such 908 time that unicast traffic is to be sent and, thus, NUD invoked. 909 Detecting only at that point that unicast connectivity is not 910 maintained, and causing local (and possibly global) repairs exactly 911 at that time, may lead to traffic not being deliverable. As 912 indicated in Section 8, if scalability is dependent on addresses 913 being assigned hierarchically, changing point-of-attachment may 914 entail more than switching preferred parent. 916 A router may detect that its preferred parent is lost by way of NUD, 917 when trying to communicate to the DODAG Root. If that router has no 918 other parents in its parent set, all it can do is wait: RPL does not 919 provide other mechanisms for a router to react to such an event. 
In 920 the case where there is no downward traffic (i.e., no data or 921 acknowledgements are sent from the DODAG Root), neither the DODAG 922 Root nor the preferred parent, to which upward connectivity was lost, 923 will be able to detect and react to the event of connectivity loss. 925 In other words, for upward traffic, the routers that by way of NUD 926 detect connectivity loss, will be unable to act in order to restore 927 connectivity (e.g., by way of a signaling mechanism to the DODAG 928 Root, to request DODAG reconstruction by way of version number 929 increase). The routers, which could react (the "preferred parents") 930 will for upward traffic not generate any traffic "downward" allowing 931 NUD to engage and detect connectivity loss. 933 It is worth noting that RPL is optimized for upward traffic 934 (multipoint-to-point traffic), and that this is exactly the type of 935 traffic where NUD is not applicable as a mechanism for detecting and 936 reacting to connectivity loss. 938 Also, absent all routers consistently advertising their reachability 939 through DAO messages, a protocol requiring bidirectional flows 940 between the communicating devices, such as TCP or CoAP confirmable- 941 acknowledgement exchange, will be unable to operate. 943 Finally, upon having been notified by NUD that the "next hop" is 944 unreachable, a router must discard the preferred parent and select 945 another - hoping that this time, the preferred parent is actually 946 reachable. Also, if NUD indicates "no forward progress" based on an 947 upper-layer protocol, there is no guarantee that the problem stems 948 exclusively from the preferred parent being unreachable. Indeed, it 949 may be a problem further ahead, possibly outside the LLN, thus 950 changing preferred parent will not alleviate the situation. 
951 Moreover, using information from an upper-layer protocol, e.g., to 952 return TCP ACKs back to the source, requires established downward 953 routes in the DODAG (i.e., each router needs to send DAO messages to 954 the DODAG Root, as described in Section 7). 956 Incidentally, this stems from a fundamental difference between "fixed 957 links in the Internet" and "wireless links": whereas the former, as a 958 rule, are reliable, predictable and with losses being rare 959 exceptions, the latter are characterized by frequent losses and 960 general unpredictability. 962 11. RPL Implementability and Complexity 964 RPL is designed to operate on "routers [...] with constraints on 965 processing power, memory, and energy (battery power)" [RFC6550]. 966 However, the 163 pages long specification of RPL, plus additional 967 specifications for routing headers [RFC6554], Trickle timer 968 [RFC6206], routing metrics [RFC6551] and objective function 969 [RFC6552], describes complex mechanisms (e.g., the upwards and 970 downward data traffic, a security solution, manageability of routers, 971 auxiliary functions for autoconfiguration of routers, etc.), and 972 provides no less than 9 message types, and 10 different message 973 options. 975 To give one example, the ContikiRPL implementation 976 (http://www.sics.se/contiki), which provides only storing mode and no 977 security features, consumes about 50 KByte of memory. Sensor 978 hardware, such as MSP430 sensor platforms, does not contain much more 979 memory than that, i.e., there may not be much space left to deploy 980 any application on the router. 982 11.1. 
Observations 984 Since RPL is intended as the routing protocol for LLNs, which covers 985 all the diverse applications requirements listed in [RFC5867], 986 [RFC5673], [RFC5826], [RFC5548], it is likely that (i) due to limited 987 memory capacity of the routers, and (ii) due to expensive development 988 cost of the routing protocol implementation, RPL implementations will 989 only support a partial set of features from the specification, 990 leading to non-interoperable implementations. 992 In order to accommodate the verbose exchange format, route stretching 993 and source routing for point-to-point traffic, several additional 994 Internet-Drafts are being discussed for adoption in the ROLL Working 995 Group - adding complexity to an already complex specification which, 996 it is worth recalling, was intended to be of a protocol for low- 997 capacity devices. 999 12. Underspecification 1001 While [RFC6550] provides various options and extensions in many 1002 parts, which makes a complex protocol, as described in Section 11, 1003 some mechanisms are underspecified. 1005 While for DIOs, the Trickle timer specifies a relatively efficient 1006 and easy-to-understand timing for message transmission, the timing of 1007 DAO transmission is not explicit. As each DAO may have a limited 1008 lifetime, one "best guess" for implementers would be to send DAO 1009 periodically, just before the life-time of the previous DAO expires. 1010 Since DAOs may be lost, another "best guess" would be to send several 1011 DAOs shortly one after the other in order to increase probability 1012 that at least one DAO is successfully received. 1014 The same underspecification applies for DAO-ACK messages: optionally, 1015 on reception of a DAO, a router may acknowledge successful reception 1016 by returning a DAO-ACK. Timing of DAO-ACK messages is unspecified by 1017 RPL. 1019 12.1. 
Observations 1021 By not specifying details about message transmission intervals and 1022 required actions when receiving DAO and DAO-ACKs, implementations may 1023 exhibit a bad performance if not carefully implemented. Some 1024 examples are: 1026 1. If DAO messages are not sent in due time before the previous DAO 1027 expires (or if the DAO is lost during transmission), the routing 1028 entry will expire before it is renewed, leading to a possible 1029 data traffic loss. 1031 2. RPL does not specify to use jitter [RFC5148] (i.e., small random 1032 delay for message transmissions). If DAOs are sent periodically, 1033 adjacent routers may transmit DAO messages at the same time, 1034 leading to link layer collisions. 1036 3. In non-storing mode, the "piece-wise calculation" of routes to a 1037 destination from which a DAO has been received, relies on 1038 previous reception of DAOs from intermediate routers along the 1039 route. If not all of these DAOs from intermediate routers have 1040 been received, route calculation is not possible, and DAO-ACKs or 1041 data traffic cannot be sent to that destination. 1043 Other examples of underspecification include detection of 1044 connectivity loss, as described in Section 10, as well as the local 1045 repair mechanism, which may lead to loops and thus data traffic loss, 1046 if not carefully implemented: a router discovering that all its 1047 parents are unreachable, may - according to the RPL specification - 1048 "detach" from the DODAG, i.e., increase its own rank to infinity. It 1049 may then "poison" its sub-DODAG by advertising its infinite rank in 1050 its DIOs. If, however, the router receives a DIO before it transmits 1051 the "poisoned" DIO, it may attach to its own sub-DODAG, creating a 1052 loop. If, instead, it had waited some time before processing DIOs 1053 again, chances are it would have succeeded in poisoning its sub-DODAG 1054 and thus avoided the loop. 1056 13. 
Protocol Convergence

   Trickle [RFC6206] is used by RPL to schedule transmission of DIO
   messages, with the objective of minimizing the number of transmitted
   DIOs while ensuring a low convergence time of the network.  The
   theoretical behavior of Trickle is well understood, and the
   convergence properties are well studied.  Simulations of the
   mechanism, such as those documented in [trickle-multicast], confirm
   these theoretical studies.

   In real-world environments, however, varying link qualities may cause
   the algorithm to converge less well: frequent message losses entail
   resets of the Trickle timer and more frequent and unpredicted message
   emissions.

13.1.  Observations

   The varying link quality in real-world environments results in
   frequent changes of the best parent, which triggers a reset of the
   Trickle timer and thus the emission of DIOs.  Therefore, over links
   of fluctuating quality, Trickle does not converge as well as theory
   predicts.

   This has been observed, e.g., in an experimental testbed: 69 routers
   (MSP430-based wireless sensor routers with IEEE 802.15.4, using the
   [rpl-contiki] IPv6 stack and RPL without downward routes; the
   parameters of the Trickle timer were set to the implementation
   defaults (minimum DIO interval: 4 s, DIO interval doublings: 8,
   redundancy constant: 10)) were positioned in a fixed grid topology.
   This resulted in DODAGs being constructed with an average of 2.45
   children per router and an average rank of 3.58.

   In this small test network, the number of DIO messages emitted - as
   expected - spiked within the first ~10 seconds.  Alas, rather than
   taper off to become zero (as the simulation studies would suggest),
   the DIO emission rate remained constant at about 70 DIOs per second.
   Details on this experiment can be found in [rpl-eval].
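   As a rough point of comparison for the figures above, the Trickle
   parameters quoted for this testbed imply a maximum interval of the
   minimum interval multiplied by two to the power of the number of
   doublings.  The Python sketch below is illustrative only and is not
   taken from any implementation; the function names are assumptions
   made for this example.  It computes that maximum interval and the
   resulting upper bound on Trickle-scheduled DIO transmissions under
   the assumption that every router has settled at the maximum interval
   and that no Trickle resets occur (each router then transmits at most
   once per interval).

```python
def max_interval(imin_s, doublings):
    """Largest Trickle interval, in seconds: Imin * 2^doublings."""
    return imin_s * (2 ** doublings)


def steady_state_rate_bound(n_routers, imin_s, doublings):
    """Upper bound on DIOs per second with all routers at the maximum
    interval and no resets (at most one DIO per router per interval)."""
    return n_routers / max_interval(imin_s, doublings)


if __name__ == "__main__":
    # Parameters quoted for the 69-router testbed.
    print("maximum interval (s):", max_interval(4, 8))
    print("bound (DIOs/s):", steady_state_rate_bound(69, 4, 8))
```

   With the quoted parameters this gives a maximum interval of 1024 s
   and a bound of roughly 0.07 DIOs per second for 69 routers, well
   below the rate of about 70 DIOs per second reported above.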
   In another experimental testbed with 17 routers (Tmote Sky, Contiki
   platform [powertrace]), the authors also showed that the DIO emission
   continues at a constant rate.  Even with a relatively high data rate
   for sensor networks (every router sends 1 packet to the root per
   minute), the energy used for routing control packets is higher than
   that used for data traffic transmission.

   The resulting higher control overhead due to frequent DIO emission
   leads to higher bandwidth and energy consumption as well as possibly
   to an increased number of collisions of frames, as observed in
   [trickle-multicast].

13.2.  Caveat

   Note that these observations do not claim that it is impossible to
   parametrize Trickle timers so that a given deployment exhibits the
   theoretical characteristics (or, characteristics sufficiently close
   thereto) of the Trickle mechanism.  These observations suggest that
   the default parameter values, provided for Trickle timers in
   [RFC6550], were not appropriate for the small network tested.  These
   observations also suggest that special care is required when
   selecting the values for the parameters for Trickle timers, and that
   these values are likely to have to be determined experimentally, and
   individually for each deployment.

14.  Loops

   [RFC6550] states that it "guarantees neither loop free route
   selection nor tight delay convergence times, but can detect and
   repair a loop as soon as it is used.  RPL uses this loop detection to
   ensure that packets make forward progress [...] and trigger repairs
   when necessary".  This implies that a loop may be detected and fixed
   only when data traffic is sent through the network.

   In order to trigger a local repair, RPL relies on the "direction"
   information (with values "up" or "down"), contained in an IPv6
   hop-by-hop option header of a received data packet.
If an "upward" 1131 data packet is received by a router, but the previous hop of the 1132 packet is listed with a lower rank in the neighbor set, the router 1133 concludes that there must be a routing loop and it may therefore 1134 trigger a local repair. For downward traffic in non-storing mode, 1135 the DODAG Root can detect loops if the same router identifier (i.e., 1136 IP address) appears at least twice in the route towards a 1137 destination. 1139 14.1. Observations 1141 The reason for RPL to repair loops only when detected by a data 1142 traffic transmission is to reduce control traffic overhead. However, 1143 there are two problems in repairing loops only when so triggered: (i) 1144 the triggered local repair mechanism delays forward progress of data 1145 packets, increasing end-to-end delays, and (ii) the data packet has 1146 to be buffered during repair. 1148 (i) may seem as the lesser of the two problems, since in a number of 1149 applications, such as data acquisition in smart metering 1150 applications, an increased delay may be acceptable. However, for 1151 applications such as alarm signals or in home automation (e.g., a 1152 light switch), increased delay may be undesirable. 1154 As for (ii), RPL is supposed to run on LLN routers with "constraints 1155 on [...] memory" [RFC6550]; buffering incoming packets during the 1156 route repair may not be possible for all incoming data packets, 1157 leading to dropped packets. Depending on the transport protocol, 1158 these data packets must be retransmitted by the source or are 1159 definitely lost. 1161 If carefully implemented with respect to avoiding loops before they 1162 occur, the impact of the loop detection in RPL may be minimized. 1163 However, it can be observed that with current implementations of RPL, 1164 such as the ContikiRPL implementation, loops do occur - and, 1165 frequently. During the same experiments described in Section 13, a 1166 snapshot of the DODAG was made every ten seconds. 
In 74.14% of the 1167 4114 snapshots, at least one loop was observed. Further 1168 investigation revealed that in all these cases the DODAG was 1169 partitioned, and the loop occurred in the sub-DODAG that no longer 1170 had a connection to the DODAG Root. When the link to the only parent 1171 of a router breaks, the router may increase its rank and - when 1172 receiving a DIO from a router in its sub-DODAG - attach itself to its 1173 own sub-DODAG, thereby creating a loop - as detailed in Section 12.1. 1175 While it can be argued that the observed loops are harmless since 1176 they occur in a DODAG partition that has no connection to the DODAG 1177 Root, they show that the routes are not built correctly. Even worse, 1178 when the broken link re-appears, it is possible that in certain 1179 situations, the loop is only repaired when data traffic is sent, 1180 possibly leading to data loss (as described above). This can occur 1181 if the link to the previous parent is reestablished, but the rank of 1182 that previous parent has increased in the meantime. 1184 Another problem with the loop repair mechanism arises in non-storing 1185 mode when using only downward traffic: while the DODAG Root can 1186 easily detect loops (as described above), it has no direct means to 1187 trigger a local repair where the loop occurs. Indeed, it can only 1188 trigger a global repair by increasing the DODAG version number, 1189 leading to a Trickle timer reset and increased control traffic 1190 overhead in the network caused by DIO messages, and therefore a 1191 possible energy drain of the routers and congestion of the channel. 1193 Finally, loop detection for every data packet increases the 1194 processing overhead. RPL is targeted for deployments on very 1195 constrained devices with little CPU power, therefore a loop detection 1196 for every packet reduces available resources of the LLN router for 1197 other tasks (such as sensing). 
Moreover, each IPv6 packet needs to contain the "RPL Option for Carrying RPL Information in Data-Plane Datagrams" [RFC6553] in order to use loop detection (as well as to determine the RPL instance), which in turn implies an extra IPv6 header (and thus overhead) for IPv6-in-IPv6 tunneling. As this RPL option is a hop-by-hop option, it needs to be carried in an encapsulating IPv6-in-IPv6 tunnel and then regenerated at each hop.

15. Security Considerations

This document does not currently specify any security considerations. This document also does not provide any evaluation of the security mechanisms of RPL.

16. IANA Considerations

This document has no actions for IANA.

17. Acknowledgements

The authors would like to thank Matthias Philipp (INRIA) for his contributions to conducting many of the experiments, revealing or confirming the issues described in this document.

Moreover, the authors would like to express their gratitude to Ralph Droms (Cisco) for his careful review of various versions of this document, for many long discussions, and for his considerable contributions to both the content and the quality of this document.

18. Informative References

[RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 (IPv6) Specification", RFC 2460, December 1998.

[RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, September 2007.

[RFC4919] Kushalnagar, N., Montenegro, G., and C. Schumacher, "IPv6 over Low-Power Wireless Personal Area Networks (6LoWPANs): Overview, Assumptions, Problem Statement, and Goals", RFC 4919, August 2007.

[RFC4944] Montenegro, G., Kushalnagar, N., Hui, J., and D. Culler, "Transmission of IPv6 Packets over IEEE 802.15.4 Networks", RFC 4944, September 2007.

[RFC5148] Clausen, T., Dearlove, C., and B.
Adamson, "Jitter Considerations in Mobile Ad Hoc Networks (MANETs)", RFC 5148, February 2008.

[RFC5184] Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5184, June 2010.

[RFC5548] Dohler, M., Watteyne, T., Winter, T., and D. Barthel, "Routing Requirements for Urban Low-Power and Lossy Networks", RFC 5548, May 2009.

[RFC5673] Pister, K., Thubert, P., Dwars, S., and T. Phinney, "Industrial Routing Requirements in Low-Power and Lossy Networks", RFC 5673, October 2009.

[RFC5826] Brandt, A., Buron, J., and G. Porcu, "Home Automation Routing Requirements in Low-Power and Lossy Networks", RFC 5826, April 2010.

[RFC5867] Martocci, J., Mi, P., Riou, N., and W. Vermeylen, "Building Automation Routing Requirements in Low Power and Lossy Networks", RFC 5867, June 2010.

[RFC5881] Ward, D. and D. Katz, "Bidirectional Forwarding Detection (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881, June 2010.

[RFC6206] Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko, "The Trickle Algorithm", RFC 6206, March 2011.

[RFC6282] Hui, J. and P. Thubert, "Compression Format for IPv6 Datagrams over IEEE 802.15.4-Based Networks", RFC 6282, September 2011.

[RFC6550] Winter, T., Thubert, P., Hui, J., Vasseur, J., Brandt, A., Kelsey, R., Levis, P., Pister, K., Struik, R., and R. Alexander, "RPL: IPv6 Routing Protocol for Low-Power and Lossy Networks", RFC 6550, March 2012.

[RFC6551] Vasseur, J., Pister, K., Dejean, N., and D. Barthel, "Routing Metrics Used for Path Calculation in Low-Power and Lossy Networks", RFC 6551, March 2012.

[RFC6552] Thubert, P., "Objective Function Zero for the Routing Protocol for Low-Power and Lossy Networks (RPL)", RFC 6552, March 2012.

[RFC6553] Hui, J. and J.
Vasseur, "The Routing Protocol for Low-Power and Lossy Networks (RPL) Option for Carrying RPL Information in Data-Plane Datagrams", RFC 6553, March 2012.

[RFC6554] Hui, J., Vasseur, J., Culler, D., and V. Manral, "An IPv6 Routing Header for Source Routes with the Routing Protocol for Low-Power and Lossy Networks (RPL)", RFC 6554, March 2012.

[RFC7102] Vasseur, JP., "Terms Used in Routing for Low-Power and Lossy Networks", RFC 7102, January 2014.

[RFC7252] Shelby, Z., Hartke, K., and C. Bormann, "The Constrained Application Protocol (CoAP)", RFC 7252, June 2014.

[SEP2.0] Computer Society, IEEE., "P2030.5 IEEE Draft Standard for Smart Energy Profile 2.0 Application Protocol", 2014.

[ZigBeeIP] Alliance, ZigBee., "ZigBee IP Specification", February 2013.

[bidir] Clausen, T. and U. Herberg, "A Comparative Performance Study of the Routing Protocols LOAD and RPL with Bi-Directional Traffic in Low-power and Lossy Networks (LLN)", Proceedings of the Eighth ACM International Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, and Ubiquitous Networks (PE-WASUN), 2011.

[ieee802154] Computer Society, IEEE., "IEEE Std. 802.15.4-2003", October 2003.

[powertrace] Dunkels, A., Eriksson, J., Finne, N., and N. Tsiftes, "Powertrace: Network-level Power Profiling for Low-power Wireless Networks", Technical Report SICS T2011:05.

[roll-charter] "ROLL Charter", web http://datatracker.ietf.org/wg/roll/charter/, February 2012.

[rpl-contiki] Tsiftes, N., Eriksson, J., and A. Dunkels, "Low-Power Wireless IPv6 Routing with ContikiRPL", Proceedings of the 9th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), 2011.

[rpl-eval] Clausen, T., Herberg, U., and M.
Philipp, "A Critical Evaluation of the IPv6 Routing Protocol for Low Power and Lossy Networks (RPL)", Proceedings of the 5th IEEE International Conference on Wireless & Mobile Computing, Networking & Communication (WiMob), 2011.

[rpl-eval-UCB] Ko, J., Dawson-Haggerty, S., Culler, D., and A. Terzis, "Evaluating the Performance of RPL and 6LoWPAN in TinyOS", Proceedings of the Workshop on Extending the Internet to Low power and Lossy Networks (IP+SN), 2011.

[trickle-multicast] Clausen, T. and U. Herberg, "Study of Multipoint-to-Point and Broadcast Traffic Performance in the 'IPv6 Routing Protocol for Low Power and Lossy Networks' (RPL)", Journal of Ambient Intelligence and Humanized Computing, 2011.

Authors' Addresses

Thomas Clausen
LIX, Ecole Polytechnique
91128 Palaiseau Cedex,
France

Phone: +33 6 6058 9349
Email: T.Clausen@computer.org
URI: http://www.thomasclausen.org

Axel Colin de Verdiere
LIX, Ecole Polytechnique
91128 Palaiseau Cedex,
France

Phone: +33 6 1264 7119
Email: axel@axelcdv.com
URI: http://www.axelcdv.com/

Jiazi Yi
LIX, Ecole Polytechnique
91128 Palaiseau Cedex,
France

Phone: +33 1 6933 4031
Email: jiazi@jiaziyi.com
URI: http://www.jiaziyi.com/

Ulrich Herberg
Fujitsu Laboratories of America
1240 E Arques Ave
Sunnyvale, CA 94085
USA

Email: ulrich@herberg.name
URI: http://www.herberg.name/

Yuichi Igarashi
Hitachi, Ltd., Yokohama Research Laboratory

Phone: +81 45 860 3083
Email: yuichi.igarashi.hb@hitachi.com
URI: http://www.hitachi.com/