ROLL                                                      R. Jadhav, Ed.
Internet-Draft                                                  R. Sahoo
Intended status: Standards Track                                   Y. Wu
Expires: November 22, 2020                                        Huawei
                                                            May 21, 2020


                            RPL Observations
                  draft-ietf-roll-rpl-observations-04

Abstract

   This document describes RPL protocol design issues, various
   observations and possible consequences of the design and
   implementation choices.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 22, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Motivation
   2.  Introduction
     2.1.  Requirements Language and Terminology
   3.  DTSN increment in storing MOP
     3.1.  Deliberations
   4.  DAO retransmission and use of DAO-ACK in storing MOP
     4.1.  Significance of bidirectional Path establishment
           indication and relevance of DAO-ACK
     4.2.  Problems with hop-by-hop DAO-ACK
     4.3.  Problems with end-to-end DAO-ACK
     4.4.  Deliberations
     4.5.  Implementation Notes
   5.  Interpreting Trickle Timer
   6.  Handling resource unavailability
     6.1.  Deliberations
   7.  Handling aggregated targets
     7.1.  Deliberations
   8.  RPL Transit Information in DAO
     8.1.  Deliberations
   9.  Upgrades or Extensions to RPL protocol
   10. Path Control bits handling
   11. Asymmetric Links and RPL
   12. Adjacencies probing with RPL
     12.1.  Deliberations
   13. Control Options eliding mechanism in RPL
   14. Managing persistent variables across node reboots
     14.1.  Persistent storage and RPL state information
     14.2.  Lollipop Counters
     14.3.  RPL State variables
       14.3.1.  DODAG Version
       14.3.2.  DTSN field in DIO
       14.3.3.  PathSequence
     14.4.  State variables update frequency
     14.5.  Deliberations
     14.6.  Implementation Notes
   15. Capabilities and its role in RPL
     15.1.  Handshaking node capabilities
     15.2.  How do Capabilities differ from MOP and Configuration
            Option?
     15.3.  Deliberations
   16. Backward Compatibility issues with RPL Options
   17. RPL under-specification
   18. Acknowledgements
   19. IANA Considerations
   20. Security Considerations
   21. References
     21.1.  Normative References
     21.2.  Informative References
   Appendix A.  Additional Stuff
   Authors' Addresses

1.  Motivation

   The primary motivation for this draft is to list different issues
   with RPL operation and to invoke a discussion within the working
   group.  This draft by itself is not intended for the RFC track but
   for WG discussion.  This draft may in turn result in other work
   items taken up by the WG which may improve on the issues mentioned
   herein.

2.  Introduction

   RPL [RFC6550] specifies a proactive distance-vector routing scheme
   designed for LLNs (Low Power and Lossy Networks).  RPL enables the
   network to be formed as a DODAG and supports storing and non-storing
   modes of operation.
   Non-storing mode reduces memory usage on the nodes by allowing
   non-BR nodes to operate without managing a routing table, and
   involves the use of source routing by the Root to direct the traffic
   along a specific path.  In storing mode of operation, intermediate
   routers maintain routing tables.

   This work aims to highlight various issues with RPL which make it
   difficult to handle certain scenarios.  It highlights such issues in
   the context of RPL's modes of operation (storing versus non-storing).
   There are cases where RPL does not provide clear rules and
   implementations have to make their own choices, hindering
   interoperability and performance.

   [I-D.clausen-lln-rpl-experiences] provides some interesting points.
   Some sections in this draft may overlap with some observations in
   [I-D.clausen-lln-rpl-experiences], but this has been done to further
   extend some scenarios or observations.  Readers are encouraged to
   also read [I-D.clausen-lln-rpl-experiences] for other insights.
   Regardless, this draft is self-sufficient in that it does not expect
   the reader to have read [I-D.clausen-lln-rpl-experiences].

2.1.  Requirements Language and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   NS-MOP = RPL Non-storing Mode of Operation
   S-MOP  = RPL Storing Mode of Operation

   This document uses terminology described in [RFC6550] and [RFC6775].

3.  DTSN increment in storing MOP

   DTSN increment has a major impact on the overall RPL control traffic
   and on the efficiency of downstream route update.  DTSN is sent as
   part of the DIO message and signals the downstream nodes to trigger
   the target advertisement.  The 6LR needs to decide when to update the
   DTSN and usually it should do so in a conservative way.
   The DTSN update mechanism determines how soon the downward routes
   are established along the new path.  The RPL specification does not
   provide any clear mechanism on how the DTSN update should happen in
   case of storing mode.

                          (6LBR)
                            |
                            |
                            |
                           (A)
                           / \
                          /   \
                         /     \
                       (B)    -(C)
                        |    /  |
                        |   /   |
                        |  /    |
                       (D)-    (E)
                         \      ;
                          \    ;
                           \  ;
                           (F)
                           / \
                          /   \
                         /     \
                       (G)     (H)

                      Figure 1: Sample topology

   Consider the example topology shown in Figure 1, and assume that
   node D switches its parent from node B to node C.  Ideally, node D
   and the nodes in its sub-DODAG should send their target
   advertisements along the new path via node C.  Achieving this in an
   efficient way is a challenge.  Incrementing DTSN is the only way to
   trigger DAOs from downstream nodes.  But this trigger has to reach
   not only the first-hop children but all the grandchild nodes as
   well.  Thus DTSN has to be incremented throughout the complete
   sub-DODAG rooted at node D, resulting in a DIO/DAO storm along the
   sub-DODAG.  This is a particularly big issue in high-density
   networks, where metric deterioration might happen transiently even
   though the signal strength is good.

   The primary implementation issue is whether a child node should
   increment its own DTSN when it receives a DTSN update from its
   parent node.  Incrementing would result in DAO updates throughout
   the sub-DODAG, so the cost could be very high.  Not incrementing may
   result in serious loss of connectivity for nodes in the sub-DODAG.

3.1.  Deliberations

   (1)  In S-MOP, should the child node increment its DTSN on seeing
        that its preferred parent has updated its DTSN?

   (2)  What are the rules for DTSN increment in S-MOP that multiple
        implementations can follow, thus allowing consistent
        performance across different implementations?

4.  DAO retransmission and use of DAO-ACK in storing MOP

   [RFC6550] has an optional DAO-ACK mechanism using which an upstream
   parent confirms the reception of a DAO from the downstream child.
   In case of storing mode, the DAO is addressed to the immediate
   upstream parent, resulting in a DAO-ACK from that parent.  There are
   two implementations possible:

   (1)  Hop-by-hop ACK: A parent responds with a DAO-ACK immediately
        after receiving the DAO.

   (2)  End-to-End ACK: A node waits for its upstream parent to send a
        DAO-ACK before responding with a DAO-ACK downstream.  The node
        may make as many attempts as needed to successfully send the
        DAO upstream.  In other words, the parent node takes on the
        responsibility of sending the DAO upstream until it is ACKed,
        and only then responds with its own ACK to the child.

                    1->            3->
                    DAO            DAO
          (TgtNode)--------(6LR)-------(root)
                    ACK            ACK
                    <-2            <-4

                    Figure 2: Hop-by-hop DAO-ACK

                    1->            2->
                    DAO            DAO
          (TgtNode)--------(6LR)-------(root)
                    ACK            ACK
                    <-4            <-3

                    Figure 3: End-to-End DAO-ACK

4.1.  Significance of bidirectional Path establishment indication and
      relevance of DAO-ACK

   Many application traffic patterns require that a bidirectional path
   be established between the target node and the root.  A typical
   example is a CoAP request that requires an acknowledgement from the
   end receiver, and thus warrants bidirectional path establishment.
   It is imperative that the target node first ascertains whether such
   a bidirectional path is established before initiating such
   application traffic.  In case of non-storing MOP, the DAO-ACK works
   perfectly fine to ascertain such bidirectional connectivity, since
   it indicates that the root, which usually is the direct destination
   of the DAO, has received the DAO.
   But in case of storing MOP, things are more complicated since the
   DAO is sent hop-by-hop and the DAO-ACK semantics are not clear
   enough in the current specification.  As mentioned in Section 4, an
   implementation can choose to implement hop-by-hop ACK or end-to-end
   ACK.

4.2.  Problems with hop-by-hop DAO-ACK

   The primary issue with this mode is that the target node cannot
   ascertain bidirectional path connectivity on reception of the
   DAO-ACK.

4.3.  Problems with end-to-end DAO-ACK

   In this case, the target node can ascertain that the DAO has indeed
   reached the root, since reception of the DAO-ACK on the target node
   confirms this.  However, extra state information needs to be
   maintained on the 6LRs on behalf of all the child nodes.  Also, it
   is very difficult for the target node to choose a timer value for
   deciding that the DAO transmission has failed to reach the root.

4.4.  Deliberations

   (1)  How should an implementation interpret the DAO-ACK semantics?

   (2)  What is the best way for the target node to know that the
        end-to-end bidirectional path is successfully installed or
        updated?  In NS-MOP, the DAO-ACK provides a clear way to do
        this.  Can the same be achieved for S-MOP?

   (3)  What happens if a DAO-ACK with Status != 0 is sent by an
        ancestor node?

   (4)  How can a subset of targets be selectively NACKed when target
        options are aggregated?

4.5.  Implementation Notes

   Current RPL open source implementations have both types of DAO-ACK
   implementations.  For example, RIOT supports hop-by-hop DAO-ACK.
   Older versions of Contiki supported hop-by-hop ACK, but recent
   versions have changed to an end-to-end ACK implementation.

   The sequence of sending no-path DAO and DAO matters when updating
   the routing adjacencies on a parent switch.
   If an implementation chooses to send the no-path DAO before the DAO,
   it results in significantly more overhead for route invalidation.
   This is because the no-path DAO would traverse all the way up to the
   BR, clearing the routes on the way.  If there is a common ancestor
   beyond which the old and new paths are the same, it is better to
   send the regular DAO first, thus limiting the propagation of the
   subsequent no-path DAO to this common ancestor.

5.  Interpreting Trickle Timer

   The Trickle algorithm defines a mechanism to reset the timer.  A
   Trickle timer reset is unlike that of regular periodic timers,
   wherein the timer is simply restarted.  Resetting the Trickle timer
   implies setting the interval back to Imin and starting a new
   interval, as mentioned in Section 4.2 of [RFC6206].

   |----|--------|----------------|--------------------------------| . . .
    Imin    I2           I3                       I4                  I5

                  Figure 4: Trickle Timer Operation

   The above figure shows an example of Trickle intervals.  Each
   interval is double the size of the previous interval.  Section 4.2
   of [RFC6206] states that,

   "If Trickle hears a transmission that is "inconsistent" and I is
   greater than Imin, it resets the Trickle timer.  To reset the timer,
   Trickle sets I to Imin and starts a new interval as in step 2.  If I
   is equal to Imin when Trickle hears an "inconsistent" transmission,
   Trickle does nothing.  Trickle can also reset its timer in response
   to external "events"."

   Thus if the Trickle timer has advanced to subsequent intervals,
   i.e., >= I2, then a reset of the Trickle timer implies going back to
   Imin.  However, if the Trickle timer is currently in Imin and it
   hears an inconsistent transmission, then it does nothing.

   In the context of multicast DIS/DIO operation, this implies that if
   the DIO Trickle timer is already at Imin and the node hears a
   multicast DIS, then the timer does nothing.
   It MUST NOT reset the timer again in this case.

   An implementation MUST never restart the timer within an interval.
   For example, in the above figure, if the timer is in interval I2,
   the implementation MUST never restart the timer at the beginning of
   the current interval, i.e., I2.  If the timer is in interval I2 and
   a reset is to be done, then the interval is set back to Imin.  If
   the timer is already in Imin, then the reset should do nothing.

6.  Handling resource unavailability

   The nodes in constrained networks have to maintain various records,
   such as neighbor cache entries and routing entries, on behalf of
   other targets to facilitate packet forwarding.  Because of the
   constrained nature of the devices, the memory available may be very
   limited, and thus the path selection algorithm may have to take such
   resource constraints into consideration as well.

   RPL currently does not have any mechanism to advertise such resource
   indicator metrics.  The primary tables associated with RPL are the
   routing table and the neighbor cache.  Even though the neighbor
   cache is not directly linked with the RPL protocol, the maintenance
   of routing adjacencies results in updates to the neighbor cache.

6.1.  Deliberations

      Is it possible to know that an upstream parent/ancestor cannot
      hold enough routing entries and thus this path should not be
      used?

      Is it possible to know that an upstream parent cannot hold any
      more neighbor cache entries and thus this upstream parent should
      not be used?

7.  Handling aggregated targets

   RPL allows and defines specific procedures to aid target aggregation
   in DAO.  Having said that, the specification neither mandates the
   use of aggregated targets nor says whether a receiving node needs to
   handle them.  Target aggregation is a useful tool and especially
   helps with link layer technologies that do not suffer from low MTUs,
   such as PLC.
   Even if an implementation does not support aggregating targets,
   reception of aggregated targets in DAO should at least be mandated.

   RPL currently has a mechanism to ACK the DAO but it does not have a
   mechanism to ACK an individual target option.  Thus in case of
   aggregated targets in the DAO, if a subset of the targets fails then
   it is impossible for the DAO-ACK to signal this to the DAO sender.

7.1.  Deliberations

      Even if the implementation does not support aggregating targets,
      should reception and handling of aggregated targets in DAO at
      least be mandated?

      There is good scope for compressing aggregated targets, which can
      significantly reduce the RPL control overhead.

      How can a subset of targets be selectively NACKed when target
      options are aggregated?

      The DEFAULT_DAO_DELAY of 1sec does not help much with
      aggregation.  The upstream parent nodes should wait longer than
      the child nodes so as to aggregate effectively.  Can
      DEFAULT_DAO_DELAY be a function of the level/rank the node is at?

8.  RPL Transit Information in DAO

   RPL allows associating a target or set of targets with a Transit
   Information Option, which contains attributes for a path to one or
   more destinations identified by the set of targets.  In case of
   NS-MOP, the Transit Information will contain the all-critical Parent
   Address, which allows the common ancestor, usually the root, to
   identify the source route header for the target node.  The Transit
   Information also contains other information, such as Path Sequence
   and Path Lifetime, which is critical for maintaining route
   adjacencies.

   RPL however does not mandate the use of the Transit Information
   Option for targets.

8.1.  Deliberations

      Is it ok to let implementations decide on the inclusion of the
      Transit Information Option?

      Is it possible to achieve interop without mandating use of the
      Transit Information Option?

      If the Transit Information Option is sent, should the handling of
      PathSequence be mandated?

9.  Upgrades or Extensions to RPL protocol

   RPL extensibility is highly desirable and is controlled by protocol
   elements within the messaging framework.  In the pursuit of keeping
   the signalling overhead low, the RPL specification has been
   restrictive in extending its field ranges, thus in some cases
   putting extensibility at stake.  Consider, for example, the Mode of
   Operation field, which is three bits wide in the RPL specification.
   These bits are already saturated and it may be difficult to add
   major upgrades without extending them.

   Addition of new Control Options or new RPL Codes almost certainly
   results in backward compatibility issues.  RFC6550 clearly mentions
   that a message with an unknown RPL Code MUST be silently discarded.
   However, no explicit handling is suggested for unknown RPL control
   option types.  In some cases, implementations simply copy-forward an
   unknown option as it is, while in other cases the unknown option is
   stripped off before forwarding the message.

   Deliberations:

   (1)  What are the extensibility options RPL could implement?  How
        much overhead would they incur?

   (2)  Most of the extensions are in the form of new control options.
        Should RPL have a generic mechanism to handle such extensions
        in a backward-compatible manner?

10.  Path Control bits handling

   RPL uses Path Control bits in the DAO's Transit Information Option
   for installing multiple downward routes to the nodes.  These
   multiple routes could be used for reliability, latency or traffic
   load-balancing within a DAG.  The Path Control bits are usable in
   both storing and non-storing modes of operation.

   RFC6550 Section 9.9 bullet point 9 requires a mandatory setting of
   Path Control bits in all the unicast DAOs sent by the Target node.
   However, no existing implementation of RPL supports this.  There is
   no reason for a network which only requires a single path to the
   root to mandatorily support Path Control bits.

   Deliberations:

   (1)  Should the mandatory clause for supporting Path Control bits in
        RFC6550 Section 9.9 point 9 be removed?

   (2)  Handling Path Control bits may be complex.  An implementation
        guideline explaining the use-cases and resource (memory
        requirements) assumptions would help implementors decide the
        utility of this technique.

11.  Asymmetric Links and RPL

   Section 3.1 of [I-D.ietf-intarea-adhoc-wireless-com] explains
   asymmetric link characteristics and what it takes for a protocol to
   support asymmetric links.  RPL depends on bidirectional links for
   control, even though near-perfect symmetry is not expected.  The
   implication is that the upstream and downstream paths remain the
   same within a given RPL instance for any pair of nodes.  The
   following questions arise from this design:

   (1)  Is it possible to detect asymmetric links?

   (2)  In the presence of asymmetric links, what is the impact on the
        control overhead, and is there a way to mitigate or alleviate
        any negative impact?

   [I-D.ietf-roll-aodv-rpl] defines a mechanism to use a pair of
   coupled instances.  This allows disjoint upstream and downstream
   paths between a pair of nodes, assuming that link asymmetry is
   detected using some outside technique.  The mechanism assumes that
   the link asymmetry is already known to the nodes in the form of
   static configuration.  In case of 6TiSCH networks, the availability
   of transmission slot information can be used to identify link
   asymmetry.  The challenge in detecting link asymmetry arises from
   scenarios where, for example, the nodes transmit with unequal power
   levels.

12.  Adjacencies probing with RPL

   RPL avoids periodic hello messaging, unlike other distance-vector
   protocols.  It uses a Trickle timer based mechanism to update
   configuration parameters.  This significantly reduces the RPL
   control overhead.  One fallout of this design choice is that, in the
   absence of regular traffic, the adjacencies cannot be tested and
   repaired if broken.

   RPL provides a mechanism in the form of unicast DIS to query a
   particular node for its DIO.  A node receiving a unicast DIS MUST
   respond with a unicast DIO with Configuration Option.  This
   mechanism could also be used for probing adjacencies, and certain
   implementations such as Contiki use it.  The periodicity of the
   probing is implementation dependent, but the node is expected to
   invoke probing only when

   (1)  There is no data traffic based on which the links could be
        tested.

   (2)  There is no L2 feedback.  In some cases, L2 might provide
        periodic beacons at the link layer, and the absence of beacons
        could be used for link tests.

12.1.  Deliberations

   (1)  Should the probing scheme be standardized?

   (2)  In some cases using multicast based probing may prove
        advantageous.  Currently RPL does not have multicast based
        probing.  Multicast DIS/DIO may not be suitable for probing
        because it could possibly lead to a change of states.

13.  Control Options eliding mechanism in RPL

   RPL configuration changes are rare, and thus various configuration
   options may not change over a long period of time.  RPL provides a
   way for the configuration options to be elided, but there are no
   clear guidelines on how the eliding should be handled.  In the
   absence of such guidelines, it is possible that certain nodes may
   end up using stale configuration in the event of transient link
   failures.

14.  Managing persistent variables across node reboots

14.1.  Persistent storage and RPL state information

   Devices are required to be functional for several years without
   manual maintenance.  Usually battery power consumption is considered
   key for operating the devices for several (tens of) years.  But
   apart from the battery, flash memory endurance may prove to be a
   lifetime bottleneck in constrained networks.  Endurance is defined
   as the maximum number of erase-write cycles that a NAND/NOR cell can
   undergo before losing its 'guaranteed' write operation.  In some
   cases (cheaper NAND-MLC/TLC), the endurance can be as low as 2K
   cycles.  Thus, for example, if a given cell is written 5 times a
   day, that NAND-flash cell, assuming an endurance of 10K cycles, may
   last for less than 6 years (10,000 / 5 = 2,000 days, or about 5.5
   years).

   Wear leveling is a popular technique used in flash memory to
   minimize the impact of limited cell endurance.  Wear leveling works
   by arranging data so that erasures and re-writes are distributed
   evenly across the medium.  The memory sectors are over-provisioned
   so that the writes are distributed across multiple sectors.  Many
   IoT platforms do not necessarily consider this over-provisioning and
   usually provision only the memory that is required.  Some scenarios,
   such as street-lighting, may not require the application layer to
   write any information to persistent storage, and thus the
   over-provisioning is often ignored.  In such cases, if the network
   stack ends up using persistent storage for maintaining its state
   information, it becomes counter-productive.

   In a star topology, the amount of persistent data written by network
   protocols is very limited.  But ad-hoc networks employing routing
   protocols such as RPL assume certain state information to be
   retained across node reboots.  In case of IoT devices, this storage
   is mostly floating-gate based NAND/NOR flash memory.
   The impact of loss of this state information differs depending upon
   the type (6LN/6LR/6LBR) of the node.

14.2.  Lollipop Counters

   Section 7.2 of [RFC6550] explains sequence counter operation,
   defining lollipop [Perlman83] style counters.  Lollipop counters
   specify a mechanism in which, even if the counter value wraps, the
   algorithm is able to tell whether the received value is the latest
   or not.  This mechanism also helps in "some cases" to recover from a
   node reboot, but is not foolproof.

   Consider an example where Node A boots up and initialises the seqcnt
   to 240 as recommended in [RFC6550].  Node A communicates with Node B
   using this seqcnt, and Node B uses this seqcnt to determine whether
   the information Node A sent in the packet is the latest.  Now let's
   assume the counter value reaches 250 after some operations on Node
   A, and Node B keeps receiving the updated seqcnt from Node A.  Now
   consider that Node A reboots, reinitializes the seqcnt value to 240,
   and sends the information to Node B (which has a seqcnt of 250
   stored on behalf of Node A).  As per Section 7.2 of [RFC6550], when
   Node B receives this packet it will consider the information to be
   old (since 240 < 250).

                       +-----+-----+----------+
                       |  A  |  B  |  Output  |
                       +-----+-----+----------+
                       | 240 | 240 | AB, new  |
                       | 240 | ::  | A>B, new |
                       | 240 | 127 | A>B, new |
                       +-----+-----+----------+

      Default values for lollipop counters considered from [RFC6550]
                               Section 7.2.

               Table 1: Example lollipop counter operation

   Based on this table, there is a dead zone (240 to 0): if A operates
   in it after reboot, its seqcnt will always be considered smaller.
   Thus node A needs to maintain the seqcnt in persistent storage and
   reuse it on reboot.

14.3.  RPL State variables

   The impact of loss of RPL state information differs depending upon
   the node type (6LN/6LR/6LBR).
   The following sections explain the different state variables and the
   impact in case this information is lost on reboot.

14.3.1.  DODAG Version

   The tuple (RPLInstanceID, DODAGID, DODAGVersionNumber) uniquely
   identifies a DODAG Version.  DODAGVersionNumber is incremented every
   time a global repair is initiated for the instance (global or
   local).  A node receiving an older DODAGVersionNumber will ignore
   the DIO message, assuming it to be from an old DODAG Version.  Thus
   a 6LBR node (and a 6LR node in case of a local DODAG) needs to
   maintain the DODAGVersionNumber in persistent storage so that it is
   available on reboot.  If the 6LBR cannot use the latest
   DODAGVersionNumber, the implication is that it won't be able to
   recover/re-establish the routing table.

14.3.2.  DTSN field in DIO

   DTSN (Destination advertisement Trigger Sequence Number) is a DIO
   message field used as part of the procedure to maintain Downward
   routes.  A 6LBR/6LR node may increment the DTSN when it requires the
   downstream nodes to send DAOs and thus update downward routes on the
   6LBR/6LR node.  In case of RPL NS-MOP, only the 6LBR maintains the
   downward routes and thus controls this field update.  In case of
   S-MOP, 6LRs additionally keep downward routes and thus also control
   this field update.

   In S-MOP, when a 6LR node switches parent it may have to issue a DIO
   with an incremented DTSN to trigger downstream child nodes to send
   DAOs, so that the downward routes are established in the whole
   parent/ancestor set.  Thus in S-MOP, the frequency of DTSN updates
   might be relatively high (given the node density and the hysteresis
   set by the objective function to switch parent).

14.3.3.  PathSequence

   PathSequence is part of the RPL Transit Information Option and is
   associated with the RPL Target Option.  A node which owns a target
   address can associate a PathSequence in the DAO message to denote
   the freshness of the target information.
This is especially useful when a node uses multiple
678     paths or multiple parents to advertise its reachability.

680     Loss of the PathSequence information maintained on the target node
681     can result in routing adjacencies being lost on 6LRs/6LBR/6BBR.

683  14.4.  State variables update frequency

685     +--------------------+-------------------+------------------------+
686     | State variable     | Update frequency  | Impacts node type      |
687     +--------------------+-------------------+------------------------+
688     | DODAGVersionNumber | Low               | 6LBR, 6LR(local DODAG) |
689     | DTSN               | High(SM),Low(NSM) | 6LBR, 6LR              |
690     | PathSequence       | High(SM),Low(NSM) | 6LR, 6LN               |
691     +--------------------+-------------------+------------------------+

693     Low: <5 per day; High: >5 per day; SM=Storing MOP, NSM=Non-Storing MOP

695                       Table 2: RPL State variables

697  14.5.  Deliberations

699     (1)  Is it possible for RPL to remove the use of persistent storage
700          for maintaining state information?

702     (2)  In most cases, node reboots will happen very rarely.  Thus
703          persistent-storage book-keeping just for handling node reboots
704          might not make sense.  Is it possible to consider signaling
705          (especially after the node reboots) so as to avoid maintaining
706          this persistent state?  Is it possible to use one-time on-reboot
707          signaling to recover some state information?

709     (3)  It is necessary that RPL avoid using persistent storage as far
710          as possible.  Ideally, extensions to RPL should consider this as
711          a design requirement, especially for 6LR and 6LN nodes.  DTSN and
712          PathSequence are the primary state variables which have a major
713          impact.

715  14.6.  Implementation Notes

717     An implementation should use a random DAOSequence number on reboot so
718     as to avoid the risk of reusing the same DAOSequence on reboot.
719     Regardless, a sequence counter size of 8 bits does not offer much of a
720     guarantee that a randomly chosen value avoids reuse.
A parent node will
721     not respond with a DAO-ACK if it sees a DAO with the same
722     DAOSequence as the previous one.

724     Write-Before-Use: The state information should be written to
725     flash before it is used in messaging.  If it is done the other way
726     around, there is a chance that the node powers down before writing to
727     persistent storage.

729  15.  Capabilities and its role in RPL

731     RPL is a distributed protocol, and it requires that the participating
732     nodes agree on a basic set of primitives to follow.  RPL currently
733     handles this using the MOP (Mode of Operation) bits in the DIO.  The
734     MOP bits inform nodes of the basic mode of operation a node MUST
735     support to join the Instance as a 6LR.  The MOP is decided and
736     advertised by the root of the RPL Instance.  A node not supporting the
737     given MOP may still join the Instance as a leaf node or 6LN.

739     RPL further uses the DIO Configuration Option to advertise the
740     configuration each node needs to use (e.g., for the Trickle timer).

742  15.1.  Handshaking node capabilities

744     Currently there exists no mechanism to handshake capabilities of the
745     root, 6LRs, or 6LNs.  If a feature is optional and is supported by
746     6LRs/6LNs, there is currently no mechanism to signal it.
747     Several RPL extension proposals define features that are possibly
748     optional.  The root needs to know whether the 6LR/6LN supports these
749     optional features to enable the extension in that path context.
750     Similarly, 6LRs and 6LNs need to know whether the root supports
751     certain extensions that they can make use of.

753  15.2.  How do Capabilities differ from MOP and Configuration Option?

755     Unlike the MOP and Configuration Option, which are issued by the root
756     of the Instance, Capabilities can be issued by any node.  A 6LN/6LR
757     node can advertise its capabilities such that they can be seen by
758     intermediate 6LRs and the root of the Instance.

760  15.3.
Deliberations

762     (1)  Is it possible for leaf nodes to advertise their set of
763          capabilities, which can be used by the root and/or intermediate
764          6LRs to make run-time decisions?

766     (2)  How should these capabilities be carried?  Should they be carried
767          in DAO/DIO/DAO-ACK?

769     (3)  Should the definition of capabilities be the same in both
770          directions (upstream/downstream)?

772  16.  Backward Compatibility issues with RPL Options

774     Most of the new work in ROLL requires the addition of new control
775     options.  Every time a new control option is added, all
776     the nodes are required to upgrade to support this option.  In many
777     cases, the new specification declares a flag day to switch to the new
778     functionality.

780     New control options may not require mandatory handling on every node,
781     but they require at least some processing.  For example, assume that a
782     new control option is added to the DIO message.  The option does not
783     require any handling on the nodes not supporting it, but it does
784     require these nodes at least to forward the new control option
785     downstream.  Currently the new control option may be stripped off.

787     It should be possible for unknown control options to be copied
788     as-is to the downstream/upstream node(s).  The specification defining
789     the new control option will decide whether a node should strip off or
790     copy the unknown control option.

792  17.  RPL under-specification

794     (a)  PathSequence: Is it mandatory to use PathSequence in the DAO
795          Transit Information Option?  RPL mentions that a 6LR/6LBR hosting
796          the routing entry on behalf of a target node should refresh the
797          lifetime on reception of a new Path Sequence.  But RPL does not
798          necessarily mandate the use of Path Sequence.  Most of the open
799          source implementations [RIOT] [CONTIKI] currently do not issue
800          Path Sequence in the DAO message.

802     (b)  Target Option aggregation in DAO: RPL allows multiple targets to
803          be aggregated in a single DAO message and has introduced the
804          notion of DelayDAO, with which a 6LR node can delay its DAO to
805          enable such aggregation.  But RPL does not have clear text on the
806          handling of aggregated DAOs, and this hinders
807          interoperability.

809     (c)  DTSN Update: RPL does not clearly define in which cases the DTSN
810          should be updated in the storing mode of operation.  More
811          details on this are presented in Section 3.

813  18.  Acknowledgements

815     Many thanks to Pascal Thubert for hallway chats and for helping us
816     understand the existing design rationales.  Thanks to Michael
817     Richardson for the Unstrung RPL implementation rationale.  Thanks to
818     the ML discussions, in particular
819     (https://www.ietf.org/mail-archive/web/roll/current/msg09443.html).

821  19.  IANA Considerations

823     This memo includes no request to IANA.

825  20.  Security Considerations

827     This is an informational draft and does not add any changes to the
828     existing specifications.

830  21.  References

832  21.1.  Normative References

834     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
835                Requirement Levels", BCP 14, RFC 2119,
836                DOI 10.17487/RFC2119, March 1997.

839     [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko,
840                "The Trickle Algorithm", RFC 6206, DOI 10.17487/RFC6206,
841                March 2011.

843     [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
844                Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
845                JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
846                Low-Power and Lossy Networks", RFC 6550,
847                DOI 10.17487/RFC6550, March 2012.

850     [RFC6551]  Vasseur, JP., Ed., Kim, M., Ed., Pister, K., Dejean, N.,
851                and D. Barthel, "Routing Metrics Used for Path Calculation
852                in Low-Power and Lossy Networks", RFC 6551,
853                DOI 10.17487/RFC6551, March 2012.

856     [RFC6552]  Thubert, P., Ed., "Objective Function Zero for the Routing
857                Protocol for Low-Power and Lossy Networks (RPL)",
858                RFC 6552, DOI 10.17487/RFC6552, March 2012.

861     [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
862                Bormann, "Neighbor Discovery Optimization for IPv6 over
863                Low-Power Wireless Personal Area Networks (6LoWPANs)",
864                RFC 6775, DOI 10.17487/RFC6775, November 2012.

867     [RFC6997]  Goyal, M., Ed., Baccelli, E., Philipp, M., Brandt, A., and
868                J. Martocci, "Reactive Discovery of Point-to-Point Routes
869                in Low-Power and Lossy Networks", RFC 6997,
870                DOI 10.17487/RFC6997, August 2013.

873  21.2.  Informative References

875     [I-D.clausen-lln-rpl-experiences]
876                Clausen, T., Verdiere, A., Yi, J., Herberg, U., and Y.
877                Igarashi, "Observations on RPL: IPv6 Routing Protocol for
878                Low power and Lossy Networks",
879                draft-clausen-lln-rpl-experiences-11 (work in progress),
                   March 2018.

881     [I-D.ietf-intarea-adhoc-wireless-com]
882                Baccelli, E. and C. Perkins, "Multi-hop Ad Hoc Wireless
883                Communication", draft-ietf-intarea-adhoc-wireless-com-02
884                (work in progress), July 2016.

886     [I-D.ietf-roll-aodv-rpl]
887                Anamalamudi, S., Zhang, M., Perkins, C., Anand, S., and B.
888                Liu, "AODV based RPL Extensions for Supporting Asymmetric
889                P2P Links in Low-Power and Lossy Networks",
890                draft-ietf-roll-aodv-rpl-08 (work in progress), May 2020.

892     [Perlman83]
893                Perlman, R., "Fault-Tolerant Broadcast of Routing
894                Information", North-Holland Computer Networks, Vol.7,
895                December 1983.

897  Appendix A.
Additional Stuff

899  Authors' Addresses

901     Rahul Arvind Jadhav (editor)
902     Huawei
903     Kundalahalli Village, Whitefield,
904     Bangalore, Karnataka  560037
905     India

907     Phone: +91-080-49160700
908     Email: rahul.ietf@gmail.com

910     Rabi Narayan Sahoo
911     Huawei
912     Kundalahalli Village, Whitefield,
913     Bangalore, Karnataka  560037
914     India

916     Phone: +91-080-49160700
917     Email: rabinarayans@huawei.com

919     Yuefeng Wu
920     Huawei
921     No.101, Software Avenue, Yuhuatai District,
922     Nanjing, Jiangsu  210012
923     China

925     Phone: +86-15251896569
926     Email: wuyuefeng@huawei.com