ROLL                                                     R.A. Jadhav, Ed.
Internet-Draft
Intended status: Standards Track                              R.N. Sahoo
Expires: 2 June 2022                                             Juniper
                                                                   Y. Wu
                                                                  Huawei
                                                        29 November 2021


                            RPL Observations
                  draft-ietf-roll-rpl-observations-07

Abstract

   This document describes RPL protocol design issues, various
   observations and possible consequences of the design and
   implementation choices.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 2 June 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Motivation
   2.  Introduction
     2.1.  Requirements Language and Terminology
   3.  DTSN increment in storing MOP
     3.1.  Deliberations
   4.  DAO retransmission and use of DAO-ACK in storing MOP
     4.1.  Significance of bidirectional Path establishment indication
           and relevance of DAO-ACK
     4.2.  Problems with hop-by-hop DAO-ACK
     4.3.  Problems with end-to-end DAO-ACK
     4.4.  Deliberations
     4.5.  Implementation Notes
   5.  Interpreting Trickle Timer
   6.  Handling resource unavailability
     6.1.  Deliberations
   7.  Handling aggregated targets
     7.1.  Deliberations
   8.  RPL Transit Information in DAO
     8.1.  Deliberations
   9.  Upgrades or Extensions to RPL protocol
   10. Path Control bits handling
   11. Asymmetric Links and RPL
   12. Adjacencies probing with RPL
     12.1.  Deliberations
   13. Control Options eliding mechanism in RPL
   14. Managing persistent variables across node reboots
     14.1.  Persistent storage and RPL state information
     14.2.  Lollipop Counters
     14.3.  RPL State variables
       14.3.1.  DODAG Version
       14.3.2.  DTSN field in DIO
       14.3.3.  PathSequence
     14.4.  State variables update frequency
     14.5.  Deliberations
     14.6.  Implementation Notes
   15. Capabilities and its role in RPL
     15.1.  Handshaking node capabilities
     15.2.  How do Capabilities differ from MOP and Configuration
            Option?
     15.3.  Deliberations
   16. Backward Compatibility issues with RPL Options
   17. RPL under-specification
   18. Acknowledgements
   19. IANA Considerations
   20. Security Considerations
   21. References
     21.1.  Normative References
     21.2.  Informative References
   Appendix A.  Additional Stuff
   Authors' Addresses

1.  Motivation

   The primary motivation for this draft is to list different issues
   with RPL operation and to prompt a discussion within the working
   group.  This draft by itself is not intended for the RFC track but
   serves as a WG discussion document.  It may in turn result in other
   work items taken up by the WG which may improve on the issues
   mentioned herein.

2.  Introduction

   RPL [RFC6550] specifies a proactive distance-vector routing scheme
   designed for LLNs (Low Power and Lossy Networks).  RPL enables the
   network to be formed as a DODAG and supports storing and non-storing
   modes of operation.  Non-storing mode reduces memory usage on the
   nodes by allowing non-BR nodes to operate without managing a routing
   table, and involves use of source routing by the Root to direct the
   traffic along a specific path.  In storing mode of operation,
   intermediate routers maintain routing tables.

   This work aims to highlight various issues with RPL that make it
   difficult to handle certain scenarios.
   This work will highlight such
   issues in the context of RPL's modes of operation (storing versus
   non-storing).  There are cases where RPL does not provide clear rules
   and implementations have to make their own choices, hindering
   interoperability and performance.

   [I-D.clausen-lln-rpl-experiences] provides some interesting points.
   Some sections in this draft may overlap with some observations in
   [I-D.clausen-lln-rpl-experiences], but this has been done to further
   extend some scenarios or observations.  Readers are encouraged to
   consult [I-D.clausen-lln-rpl-experiences] for other insights.
   Regardless, this draft is self-contained and does not assume that the
   reader has read [I-D.clausen-lln-rpl-experiences].

2.1.  Requirements Language and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   NS-MOP = RPL Non-storing Mode of Operation

   S-MOP = RPL Storing Mode of Operation

   This document uses terminology described in [RFC6550] and [RFC6775].

3.  DTSN increment in storing MOP

   DTSN increment has a major impact on the overall RPL control traffic
   and on the efficiency of downstream route updates.  DTSN is sent as
   part of the DIO message and signals the downstream nodes to trigger
   the target advertisement.  The 6LR needs to decide when to update the
   DTSN, and usually it should do so in a conservative way.  The DTSN
   update mechanism determines how soon the downward routes are
   established along the new path.  The RPL specification does not
   provide any clear mechanism for how the DTSN update should happen in
   storing mode.
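
   The DTSN trigger described above can be sketched as follows.  This
   is a minimal illustration and not taken from any implementation: the
   class name, the method names and the propagate_dtsn policy knob are
   hypothetical, and the freshness test is a deliberately naive
   placeholder where a real node would apply the sequence counter rules
   of [RFC6550] Section 7.2.

```python
def naive_newer(new: int, old: int) -> bool:
    """Placeholder freshness test (assumption): plain integer comparison.

    A real node would use the lollipop comparison of RFC 6550 Section 7.2.
    """
    return new > old


class StoringNode:
    """Hypothetical child-side DTSN handling for storing mode."""

    def __init__(self, propagate_dtsn: bool, is_newer=naive_newer):
        # propagate_dtsn is an illustrative policy knob; the specification
        # does not say which setting a storing-mode node should use.
        self.propagate_dtsn = propagate_dtsn
        self.is_newer = is_newer
        self.own_dtsn = 240
        self.dao_scheduled = False
        self.parent_dtsn = {}  # last DTSN seen from each parent

    def on_parent_dio(self, parent: str, dtsn: int) -> None:
        last = self.parent_dtsn.get(parent)
        self.parent_dtsn[parent] = dtsn
        if last is None or not self.is_newer(dtsn, last):
            return  # first DIO from this parent, or no DTSN increment
        # The parent incremented its DTSN: advertise our targets again.
        self.dao_scheduled = True
        if self.propagate_dtsn:
            # Pass the trigger on to our own children.  Counter wrap-around
            # handling is omitted in this sketch.
            self.own_dtsn += 1
```

   With propagate_dtsn set, every node in the sub-DODAG repeats the
   trigger, which is the DIO/DAO storm case.  With it cleared, only the
   first hop re-advertises, which is the case where nodes deeper in the
   sub-DODAG may never refresh their downward routes.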

                              (6LBR)
                                |
                                |
                                |
                               (A)
                               / \
                              /   \
                             /     \
                           (B)    -(C)
                            |    /  |
                            |   /   |
                            |  /    |
                           (D)-    (E)
                             \     ;
                              \   ;
                               \ ;
                               (F)
                               / \
                              /   \
                             /     \
                           (G)     (H)

                       Figure 1: Sample topology

   Consider the example topology shown in Figure 1, and assume that node
   D switches its parent from node B to node C.  Ideally, node D and the
   nodes in its sub-DODAG should send their target advertisements along
   the new path via node C.  Achieving this efficiently is a challenge.
   Incrementing DTSN is the only way to trigger the DAO on downstream
   nodes.  But this trigger has to reach not only the first hop but all
   descendant nodes.  Thus DTSN has to be incremented in the complete
   sub-DODAG rooted at node D, resulting in a DIO/DAO storm along the
   sub-DODAG.  This is a particularly big issue in high-density networks
   where metric deterioration might happen transiently even though the
   signal strength is good.

   The primary implementation issue is whether a child node should
   increment its own DTSN when it receives a DTSN update from its parent
   node.  Doing so would result in DAO updates throughout the sub-DODAG,
   so the cost could be very high.  Not doing so may result in serious
   loss of connectivity for nodes in the sub-DODAG.

3.1.  Deliberations

   (1)  In S-MOP, should the child node increment its DTSN on seeing
        that its preferred parent has updated its DTSN?

   (2)  What are the rules for DTSN increment in S-MOP that multiple
        implementations can follow, thus allowing consistent
        performance across different implementations?

4.  DAO retransmission and use of DAO-ACK in storing MOP

   [RFC6550] has an optional DAO-ACK mechanism by which an upstream
   parent confirms the reception of a DAO from the downstream child.  In
   storing mode, the DAO is addressed to the immediate upstream parent,
   resulting in a DAO-ACK from the parent.
   There are two
   possible implementations:

   (1)  Hop-by-hop ACK: A parent responds with a DAO-ACK immediately
        after receiving the DAO.

   (2)  End-to-End ACK: A node waits for the upstream parent to send a
        DAO-ACK before responding with a DAO-ACK downstream.  The
        upstream parent may make as many attempts as needed to send
        this DAO upstream.  In other words, the moment the parent node
        responds with its own ACK to the child, it accepts the
        responsibility of sending the DAO upstream until it is ACKed.

                       1->           3->
                       DAO           DAO
             (TgtNode)--------(6LR)-------(root)
                       ACK           ACK
                       <-2           <-4

                      Figure 2: Hop-by-hop DAO-ACK

                       1->           2->
                       DAO           DAO
             (TgtNode)--------(6LR)-------(root)
                       ACK           ACK
                       <-4           <-3

                      Figure 3: End-to-End DAO-ACK

4.1.  Significance of bidirectional Path establishment indication and
      relevance of DAO-ACK

   Many application traffic patterns require a bidirectional path to be
   established between the target node and the root.  A typical example
   is a CoAP request with the ACK bit set, which requires an
   acknowledgement from the end receiver and thus warrants bidirectional
   path establishment.  It is imperative that the target node first
   ascertains whether such a bidirectional path is established before
   initiating such application traffic.  In non-storing MOP, the DAO-ACK
   works perfectly well to ascertain such bidirectional connectivity,
   since it indicates that the root, which usually is the direct
   destination of the DAO, has received the DAO.  In storing MOP, things
   are more complicated because the DAO is sent hop-by-hop and the
   DAO-ACK semantics are not clear enough in the current specification.
   As mentioned above, an implementation can choose to implement
   hop-by-hop ACK or end-to-end ACK.

4.2.  Problems with hop-by-hop DAO-ACK

   The primary issue with this mode is that the target node cannot
   ascertain bidirectional path connectivity on reception of the
   DAO-ACK.

4.3.  Problems with end-to-end DAO-ACK

   In this case, the target node can ascertain that the DAO has indeed
   reached the root, since the reception of the DAO-ACK on the target
   node confirms this.  However, extra state information needs to be
   maintained on the 6LRs on behalf of all the child nodes.  It is also
   very difficult for the target node to choose a timer value for
   deciding that the DAO transmission has failed to reach the root.

4.4.  Deliberations

   (1)  How should an implementation interpret the DAO-ACK semantics?

   (2)  What is the best way for the target node to know that the
        end-to-end bidirectional path is successfully installed or
        updated?  In NS-MOP, the DAO-ACK provides a clear way to do
        this.  Can the same be achieved for S-MOP?

   (3)  What happens if a DAO-ACK with Status!=0 is sent by an ancestor
        node?

   (4)  How can a subset of targets be selectively NACKed when target
        options are aggregated?

4.5.  Implementation Notes

   Current open source RPL implementations include both types of DAO-ACK
   handling.  For example, RIOT supports hop-by-hop DAO-ACK.  Older
   versions of Contiki supported hop-by-hop ACK, but recent versions
   have changed to an end-to-end ACK implementation.

   The order in which the no-path DAO and the DAO are sent matters when
   updating the routing adjacencies on a parent switch.  If an
   implementation chooses to send the no-path DAO before the DAO, it
   incurs significantly more overhead for route invalidation.  This is
   because the no-path DAO would traverse all the way up to the BR,
   clearing the routes on the way.
   If
   there is a common ancestor beyond which the old and new paths are the
   same, it is better to send the regular DAO first, thus limiting the
   propagation of the subsequent no-path DAO to this common ancestor.

5.  Interpreting Trickle Timer

   The Trickle algorithm defines a mechanism to reset the timer.  A
   Trickle timer reset is unlike that of a regular periodic timer, where
   the timer is simply restarted.  Resetting the Trickle timer means
   setting the interval back to Imin and starting a new interval, as
   described in Section 4.2 of [RFC6206].

   |----|--------|----------------|------------------------------| . . . .
    Imin    I2           I3                     I4                  I5

                   Figure 4: Trickle Timer Operation

   Figure 4 shows an example of Trickle intervals.  Each interval is
   double the size of the previous one.  Section 4.2 of [RFC6206] states
   that,

   "If Trickle hears a transmission that is "inconsistent" and I is
   greater than Imin, it resets the Trickle timer.  To reset the timer,
   Trickle sets I to Imin and starts a new interval as in step 2.  If I
   is equal to Imin when Trickle hears an "inconsistent" transmission,
   Trickle does nothing.  Trickle can also reset its timer in response
   to external "events"."

   Thus, if the Trickle timer has advanced to a subsequent interval,
   i.e., >= I2, then a reset of the Trickle timer means going back to
   Imin.  However, if the Trickle timer is currently in Imin and it
   hears an inconsistent transmission, then it does nothing.

   In the context of multicast DIS/DIO operation, this implies that if
   the DIO Trickle timer is already at Imin and the node hears a
   multicast DIS, then the node does nothing.  It MUST NOT reset the
   timer again in this case.

   An implementation MUST NOT restart the timer within an interval.
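
   The reset rule described above can be sketched as follows.  This is
   a minimal illustration that tracks only the interval size; the
   class, method and attribute names are hypothetical, and the Trickle
   counter c and redundancy constant k of [RFC6206] are left out.

```python
import random


class TrickleTimer:
    """Interval bookkeeping only (sketch); suppression logic is omitted."""

    def __init__(self, imin: float, doublings: int):
        self.imin = imin
        self.imax = imin * 2 ** doublings  # Imax given as doublings of Imin
        self.interval = imin
        self.intervals_started = 0
        self._start_interval()

    def _start_interval(self) -> None:
        # Pick the transmission point t in the second half of the interval.
        self.t = random.uniform(self.interval / 2, self.interval)
        self.intervals_started += 1

    def on_interval_expired(self) -> None:
        # Double the interval, capped at Imax, and start the next one.
        self.interval = min(self.interval * 2, self.imax)
        self._start_interval()

    def on_inconsistency(self) -> None:
        # A reset goes back to Imin only if I is currently greater than Imin.
        if self.interval > self.imin:
            self.interval = self.imin
            self._start_interval()
        # Already at Imin: do nothing, and in particular do not restart
        # the current interval.
```

   Note that on_inconsistency() never restarts an interval of the same
   size: it either falls back to Imin or leaves the running interval
   untouched.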

   For example, in Figure 4, if the timer is in interval I2, the
   implementation MUST NOT restart the timer at the beginning of the
   current interval, i.e., I2.  If the timer is in interval I2 and a
   reset is to be done, then the interval is set back to Imin.  If the
   timer is already in Imin, then the reset should do nothing.

6.  Handling resource unavailability

   Nodes in constrained networks have to maintain various records, such
   as neighbor cache entries and routing entries, on behalf of other
   targets to facilitate packet forwarding.  Because of the constrained
   nature of the devices, the available memory may be very limited, and
   thus the path selection algorithm may have to take such resource
   constraints into consideration as well.

   RPL currently does not have any mechanism to advertise such resource
   indicator metrics.  The primary tables associated with RPL are the
   routing table and the neighbor cache.  Even though the neighbor cache
   is not directly linked with the RPL protocol, the maintenance of
   routing adjacencies results in updates to the neighbor cache.

6.1.  Deliberations

      Is it possible to know that an upstream parent/ancestor cannot
      hold enough routing entries and thus this path should not be used?

      Is it possible to know that an upstream parent cannot hold any
      more neighbor cache entries and thus this upstream parent should
      not be used?

7.  Handling aggregated targets

   RPL allows target aggregation in the DAO and defines specific
   procedures to aid it.  Having said that, the specification does not
   mandate the use of aggregated targets, nor does it make any comment
   on whether a receiving node needs to handle them.  Target aggregation
   is a useful tool and especially helps with link layer technologies
   that do not suffer from low MTUs, such as PLC.
   Even if an
   implementation does not support aggregating targets, reception of
   aggregated targets in a DAO should at least be mandated.

   RPL currently has a mechanism to ACK the DAO, but it does not have a
   mechanism to ACK an individual target option.  Thus, with aggregated
   targets in the DAO, if a subset of the targets fails, it is
   impossible for the DAO-ACK to signal this to the DAO sender.

7.1.  Deliberations

      Even if the implementation does not support aggregating targets,
      should reception and handling of aggregated targets in a DAO at
      least be mandated?

      There is good scope for compressing aggregated targets, which can
      significantly reduce the RPL control overhead.

      How can a subset of targets be selectively NACKed when target
      options are aggregated?

      The DEFAULT_DAO_DELAY of 1 second does not help much with
      aggregation.  Upstream parent nodes should wait longer than their
      child nodes so as to aggregate effectively.  Can DEFAULT_DAO_DELAY
      be made a function of the level/rank the node is at?

8.  RPL Transit Information in DAO

   RPL allows associating a target or set of targets with a Transit
   Information Option, which contains attributes for a path to one or
   more destinations identified by the set of targets.  In NS-MOP, the
   Transit Information will contain the all-important Parent Address,
   which allows the common ancestor, usually the root, to identify the
   source route header for the target node.  The Transit Information
   also contains other information, such as Path Sequence and Path
   Lifetime, which is critical for maintaining route adjacencies.

   RPL, however, does not mandate the use of the Transit Information
   Option for targets.

8.1.  Deliberations

      Is it acceptable to let implementations decide on the inclusion of
      the Transit Information Option?

      Is it possible to achieve interoperability without mandating use
      of the Transit Information Option?

      If the Transit Information Option is sent, should the handling of
      PathSequence be mandated?

9.  Upgrades or Extensions to RPL protocol

   RPL extensibility is highly desirable and is controlled by protocol
   elements within the messaging framework.  In the pursuit of keeping
   the signalling overhead low, the RPL specification has been
   restrictive about the size of its field ranges, in some cases
   putting extensibility at stake.  Consider, for example, the Mode of
   Operation field, which is three bits wide in the RPL specification.
   These bits are already saturated, and it may be difficult to add
   major upgrades without extending them.

   Addition of new Control Options or new RPL Codes almost certainly
   results in backward compatibility issues.  RFC6550 clearly states
   that a message with an unknown RPL Code MUST be silently discarded.
   However, no explicit handling is suggested for unknown RPL control
   option types.  In some cases, implementations simply copy and forward
   an unknown option as it is, while in other cases the unknown option
   is stripped off before the message is forwarded.

   Deliberations:

   (1)  What are the extensibility options RPL could implement?  How
        much overhead would they incur?

   (2)  Most of the extensions are in the form of new control options.
        Should RPL have a mechanism to handle such extensions in a
        backward compatible and generic manner?

10.  Path Control bits handling

   RPL uses Path Control bits in the DAO's Transit Information Option
   for installing multiple downward routes to the nodes.  These multiple
   routes could be used for reliability, latency or traffic
   load-balancing within a DAG.  The Path Control bits are usable in
   both storing and non-storing modes of operation.

   RFC6550 Section 9.9 bullet point 9 requires a mandatory setting of
   Path Control bits in all the unicast DAOs sent by the Target node.
   However, no existing implementation of RPL supports this.  There is
   no reason for a network which only requires a single path to the root
   to be required to support Path Control bits.

   Deliberations:

   (1)  Should the mandatory clause for supporting Path Control bits in
        RFC6550 Section 9.9 point 9 be removed?

   (2)  Handling Path Control bits may be complex.  An implementation
        guideline explaining the use cases and resource (memory
        requirement) assumptions would help implementors decide the
        utility of this technique.

11.  Asymmetric Links and RPL

   Section 3.1 of [I-D.ietf-intarea-adhoc-wireless-com] explains
   asymmetric link characteristics and what it takes for a protocol to
   support asymmetric links.  RPL depends on bidirectional links for
   control, even though near-perfect symmetry is not expected.  The
   implication of this is that the upstream and downstream paths remain
   the same within a given RPL instance for any pair of nodes.  The
   following questions arise from this design:

   (1)  Is it possible to detect asymmetric links?

   (2)  In the presence of asymmetric links, what is the impact on the
        control overhead, and is there a way to mitigate or alleviate
        any negative impact?

   [I-D.ietf-roll-aodv-rpl] defines a mechanism to use a pair of coupled
   instances.  This allows disjoint upstream and downstream paths
   between a pair of nodes, assuming that the link asymmetry is detected
   using some outside technique.  The draft assumes that the link
   asymmetry is already known to the nodes in the form of static
   configuration.  In 6TiSCH networks, the availability of transmission
   slot information can be used to identify link asymmetry.  The
   challenge in detecting link asymmetry arises from scenarios where,
   for example, the nodes transmit with unequal power levels.

12.  Adjacencies probing with RPL

   RPL avoids periodic hello messaging, unlike other distance-vector
   protocols.  It uses a Trickle timer based mechanism to update
   configuration parameters.  This significantly reduces the RPL control
   overhead.  One fallout of this design choice is that, in the absence
   of regular traffic, the adjacencies cannot be tested and repaired if
   broken.

   RPL provides a mechanism in the form of unicast DIS to query a
   particular node for its DIO.  A node receiving a unicast DIS MUST
   respond with a unicast DIO with the Configuration Option.  This
   mechanism could also be used for probing adjacencies, and certain
   implementations such as Contiki use it this way.  The periodicity of
   the probing is implementation dependent, but the node is expected to
   invoke probing only when

   (1)  There is no data traffic based on which the links could be
        tested.

   (2)  There is no L2 feedback.  In some cases, L2 might provide
        periodic beacons at the link layer, and the absence of beacons
        could be used for link tests.

12.1.  Deliberations

   (1)  Should the probing scheme be standardized?

   (2)  In some cases, using multicast based probing may prove
        advantageous.  Currently RPL does not have multicast based
        probing.  Multicast DIS/DIO may not be suitable for probing
        because it could possibly lead to a change of state.

13.  Control Options eliding mechanism in RPL

   RPL configuration changes are rare, and thus various configuration
   options may not change over a long period of time.  RPL provides a
   way for the configuration options to be elided, but there are no
   clear guidelines on how the eliding should be handled.  In the
   absence of such guidelines, it is possible that certain nodes end up
   using stale configuration in the event of transient link failures.

14.  Managing persistent variables across node reboots

14.1.  Persistent storage and RPL state information

   Devices are required to be functional for several years without
   manual maintenance.  Usually battery power consumption is considered
   key for operating the devices for several (tens of) years.  But apart
   from the battery, flash memory endurance may prove to be a lifetime
   bottleneck in constrained networks.  Endurance is defined as the
   maximum number of erase-write cycles that a NAND/NOR cell can undergo
   before losing its 'guaranteed' write operation.  In some cases
   (cheaper NAND-MLC/TLC), the endurance can be as low as 2K cycles.
   Thus, for example, if a given cell is written 5 times a day, that
   NAND flash cell, assuming an endurance of 10K cycles, may last for
   less than 6 years.

   Wear leveling is a popular technique used in flash memory to minimize
   the impact of limited cell endurance.  Wear leveling works by
   arranging data so that erasures and re-writes are distributed evenly
   across the medium.  The memory sectors are over-provisioned so that
   the writes are distributed across multiple sectors.  Many IoT
   platforms do not necessarily consider this over-provisioning and
   usually provision only the memory that is required.  Some scenarios,
   such as street lighting, may not require the application layer to
   write any information to persistent storage, and thus the
   over-provisioning is often ignored.  In such cases, if the network
   stack ends up using persistent storage for maintaining its state
   information, it becomes counter-productive.

   In a star topology, the amount of persistent data written by network
   protocols is very limited.  But ad-hoc networks employing routing
   protocols such as RPL assume certain state information to be retained
   across node reboots.  In IoT devices this storage is mostly
   floating-gate NAND/NOR flash memory.
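
   The endurance arithmetic used earlier in this section can be
   reproduced with a short calculation.  This is a minimal sketch: the
   function name and the sectors parameter are hypothetical, and the
   wear leveling model assumes an ideal, perfectly even distribution of
   writes, which real flash translation layers only approximate.

```python
def flash_lifetime_years(endurance_cycles: int,
                         writes_per_day: float,
                         sectors: int = 1) -> float:
    """Years until the erase-write budget of the storage is used up.

    sectors > 1 models ideal wear leveling (assumption): writes rotate
    evenly over that many sectors, so each cell sees 1/sectors of them.
    """
    days = endurance_cycles * sectors / writes_per_day
    return days / 365.0


# Endurance figures mentioned in the text, at 5 writes per day.
for cycles in (2_000, 10_000):
    years = flash_lifetime_years(cycles, writes_per_day=5)
    print(f"{cycles} cycles: {years:.2f} years")
```

   With 10K cycles and 5 writes a day, a single cell lasts about five
   and a half years, consistent with the "less than 6 years" figure.
   With 2K cycles it lasts only slightly more than one year.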
   The impact
   of losing this state information differs depending upon the type
   (6LN/6LR/6LBR) of the node.

14.2.  Lollipop Counters

   [RFC6550] Section 7.2 explains sequence counter operation, defining
   lollipop [Perlman83] style counters.  Lollipop counters specify a
   mechanism in which, even if the counter value wraps, the algorithm is
   able to tell whether the received value is the latest or not.  This
   mechanism also helps in "some cases" to recover from a node reboot,
   but it is not foolproof.

   Consider an example where Node A boots up and initialises the seqcnt
   to 240, as recommended in [RFC6550].  Node A communicates with Node B
   using this seqcnt, and node B uses this seqcnt to determine whether
   the information node A sent in the packet is the latest.  Now assume
   that the counter value reaches 250 after some operations on Node A,
   and node B keeps receiving the updated seqcnt from node A.  Now
   consider that node A reboots, reinitializes the seqcnt value to 240,
   and sends the information to node B (which has a seqcnt of 250 stored
   on behalf of node A).  As per Section 7.2 of [RFC6550], when node B
   receives this packet it will consider the information to be old
   (since 240 < 250).

                       +=====+=====+==========+
                       | A   | B   | Output   |
                       +=====+=====+==========+
                       | 240 | 240 | AB, new  |
                       +-----+-----+----------+
                       | 240 | ::  | A>B, new |
                       +-----+-----+----------+
                       | 240 | 127 | A>B, new |
                       +-----+-----+----------+

                 Table 1: Example lollipop counter operation

   Default values for lollipop counters are taken from [RFC6550]
   Section 7.2.

   Based on this table, there is a dead zone (240 to 0) in which, if A
   operates after reboot, its seqcnt will always be considered smaller.
   Thus node A needs to maintain the seqcnt in persistent storage and
   reuse it on reboot.

14.3.  RPL State variables

   The impact of losing RPL state information differs depending upon
   the node type (6LN/6LR/6LBR).  The following sections explain the
   different state variables and the impact when this information is
   lost on reboot.

14.3.1.  DODAG Version

   The tuple (RPLInstanceID, DODAGID, DODAGVersionNumber) uniquely
   identifies a DODAG Version.  DODAGVersionNumber is incremented every
   time a global repair is initiated for the instance (global or local).
   A node receiving an older DODAGVersionNumber will ignore the DIO
   message, assuming it to be from an old DODAG version.  Thus a 6LBR
   node (and a 6LR node in case of a local DODAG) needs to maintain the
   DODAGVersionNumber in persistent storage so that it is available on
   reboot.  If the 6LBR cannot use the latest DODAGVersionNumber, the
   implication is that it will not be able to recover or re-establish
   the routing table.

14.3.2.  DTSN field in DIO

   DTSN (Destination Advertisement Trigger Sequence Number) is a DIO
   message field used as part of the procedure to maintain Downward
   routes.  A 6LBR/6LR node may increment its DTSN when it requires the
   downstream nodes to send DAOs and thus update downward routes on the
   6LBR/6LR node.  In NS-MOP, only the 6LBR maintains the downward
   routes and thus controls this field update.  In S-MOP, 6LRs
   additionally keep downward routes and thus also control this field
   update.

   In S-MOP, when a 6LR node switches parent, it may have to issue a DIO
   with an incremented DTSN to trigger downstream child nodes to send
   DAOs, so that the downward routes are established in the whole
   parent/ancestor set.  Thus in S-MOP, the frequency of DTSN updates
   might be relatively high (given the node density and the hysteresis
   set by the objective function for switching parent).

14.3.3.  PathSequence

   PathSequence is part of the RPL Transit Information Option and is
   associated with the RPL Target Option.
   A node which owns a target address can associate a PathSequence in
   the DAO message to denote freshness of the target information.
   This is especially useful when a node uses multiple paths or
   multiple parents to advertise its reachability.

   Loss of PathSequence information maintained on the target node can
   result in routing adjacencies being lost on 6LRs/6LBR/6BBR.

14.4.  State variables update frequency

   +====================+===================+========================+
   | State variable     | Update frequency  | Impacts node type      |
   +====================+===================+========================+
   | DODAGVersionNumber | Low               | 6LBR, 6LR(local DODAG) |
   +--------------------+-------------------+------------------------+
   | DTSN               | High(SM),Low(NSM) | 6LBR, 6LR              |
   +--------------------+-------------------+------------------------+
   | PathSequence       | High(SM),Low(NSM) | 6LR, 6LN               |
   +--------------------+-------------------+------------------------+

                      Table 2: RPL State variables

   Low = <5 per day, High = >5 per day; SM = Storing MOP,
   NSM = Non-Storing MOP

14.5.  Deliberations

   (1)  Is it possible that RPL removes the use of persistent storage
        for maintaining state information?

   (2)  In most cases, node reboots will happen very rarely.  Thus
        doing persistent storage book-keeping for handling node
        reboot might not make sense.  Is it possible to consider
        signaling (especially after the node reboots) so as to avoid
        maintaining this persistent state?  Is it possible to use
        one-time on-reboot signaling to recover some state
        information?

   (3)  It is necessary that RPL avoids using persistent storage as
        far as possible.  Ideally, extensions to RPL should consider
        this as a design requirement, especially for 6LR and 6LN
        nodes.  DTSN and PathSequence are the primary state variables
        which have major impact.

14.6.  Implementation Notes

   An implementation should use a random DAOSequence number on reboot
   so as to avoid the risk of reusing the same DAOSequence on reboot.
   Regardless, the sequence counter size of 8 bits does not provide
   much of a guarantee towards choosing a good random number.  A
   parent node will not respond with a DAO-ACK in case it sees a DAO
   with the same DAOSequence as the previous one.

   Write-Before-Use: The state information should be written to the
   flash before using it in the messaging.  If it is done the other
   way around, then chances are that the node powers down before
   writing to the persistent storage.

15.  Capabilities and its role in RPL

   RPL is a distributed protocol and it requires that the
   participating nodes agree on a basic set of primitives to follow.
   RPL currently handles this using MOP (Mode of Operation) bits in
   the DIO.  MOP bits inform the nodes of the basic mode of operation
   a node MUST support to join the Instance as a 6LR.  The MOP is
   decided and advertised by the root of the RPL Instance.  A node
   not supporting the given MOP may still join the Instance as a leaf
   node or 6LN.

   RPL further uses the DIO Configuration Option to advertise the
   configuration each node needs to use (e.g., for the trickle timer).

15.1.  Handshaking node capabilities

   Currently there exists no mechanism to handshake capabilities of
   the root or 6LRs or 6LNs.  If a feature is optional and is
   supported by 6LRs/6LNs then currently there exists no mechanism to
   signal it.  There are several RPL extension proposals which are
   possibly optional features.  The root needs to know if the 6LR/6LN
   supports these optional features to enable the extension in that
   path context.  Similarly, 6LRs and 6LNs need to know whether the
   root supports certain extensions that they can make use of.

15.2.  How do Capabilities differ from MOP and Configuration Option?

   Unlike MOP and Configuration Option, which are issued by the root
   of the Instance, Capabilities can be issued by any node.  A 6LN/6LR
   node can advertise its capabilities such that those can be seen by
   intermediate 6LRs and the root of the Instance.

15.3.  Deliberations

   (1)  Is it possible for leaf nodes to advertise their set of
        capabilities, which can be used by the root and/or
        intermediate 6LRs to make run-time decisions?

   (2)  How should these capabilities be carried?  Should they be
        carried in DAO/DIO/DAO-ACK?

   (3)  Should the definition of capabilities be the same in both
        directions (upstream/downstream)?

16.  Backward Compatibility issues with RPL Options

   Most of the new work in ROLL requires addition of new control
   options.  Every time a new control option is added, it is required
   that all the nodes upgrade to support this option.  In many cases,
   the new specification declares a flag day to switch to the new
   functionality.

   New control options may not require mandatory handling on every
   node but they require at least some processing.  For example,
   assume that a new control option is added to the DIO message.  The
   option does not require any handling on the nodes not supporting
   it, but it requires at least that these nodes forward this new
   control option downstream.  Currently the new control option may
   be stripped off.

   It should be possible for unknown control options to be copied
   as-is to the downstream/upstream node(s).  The specification
   defining the new control option will decide whether a node should
   strip off or copy the unknown control option.

17.  RPL under-specification

   (a)  PathSequence: Is it mandatory to use PathSequence in the DAO
        Transit Information Option?  RPL mentions that a 6LR/6LBR
        hosting the routing entry on behalf of a target node should
        refresh the lifetime on reception of a new Path Sequence.
        But RPL does not necessarily mandate use of Path Sequence.
        Most of the open source implementations [RIOT] [CONTIKI]
        currently do not issue Path Sequence in the DAO message.

   (b)  Target Option aggregation in DAO: RPL allows multiple targets
        to be aggregated in a single DAO message and has introduced a
        notion of DelayDAO, using which a 6LR node could delay its DAO
        to enable such aggregation.  But RPL does not have clear text
        on handling of aggregated DAOs, and thus it hinders
        interoperability.

   (c)  DTSN Update: RPL does not clearly define in which cases DTSN
        should be updated in case of storing mode of operation.  More
        details on this are presented in Section 3.

18.  Acknowledgements

   Many thanks to Pascal Thubert for hallway chats and for helping
   understand the existing design rationales.  Thanks to Michael
   Richardson for Unstrung RPL implementation rationale.  Thanks to ML
   discussions, in particular
   (https://www.ietf.org/mail-archive/web/roll/current/msg09443.html).

19.  IANA Considerations

   This memo includes no request to IANA.

20.  Security Considerations

   This is an informational draft and does not add any changes to the
   existing specifications.

21.  References

21.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko,
              "The Trickle Algorithm", RFC 6206, DOI 10.17487/RFC6206,
              March 2011.

   [RFC6550]  Winter, T., Ed., Thubert, P., Ed., Brandt, A., Hui, J.,
              Kelsey, R., Levis, P., Pister, K., Struik, R., Vasseur,
              JP., and R. Alexander, "RPL: IPv6 Routing Protocol for
              Low-Power and Lossy Networks", RFC 6550,
              DOI 10.17487/RFC6550, March 2012.

   [RFC6775]  Shelby, Z., Ed., Chakrabarti, S., Nordmark, E., and C.
              Bormann, "Neighbor Discovery Optimization for IPv6 over
              Low-Power Wireless Personal Area Networks (6LoWPANs)",
              RFC 6775, DOI 10.17487/RFC6775, November 2012.

21.2.  Informative References

   [I-D.clausen-lln-rpl-experiences]
              Clausen, T., Verdiere, A. C. D., Yi, J., Herberg, U., and
              Y. Igarashi, "Observations on RPL: IPv6 Routing Protocol
              for Low power and Lossy Networks", Work in Progress,
              Internet-Draft, draft-clausen-lln-rpl-experiences-11,
              27 March 2018.

   [I-D.ietf-intarea-adhoc-wireless-com]
              Baccelli, E. and C. E. Perkins, "Multi-hop Ad Hoc
              Wireless Communication", Work in Progress,
              Internet-Draft,
              draft-ietf-intarea-adhoc-wireless-com-02, 20 July 2016.

   [I-D.ietf-roll-aodv-rpl]
              Anamalamudi, S., Perkins, C. E., Anand, S., and B. Liu,
              "Supporting Asymmetric Links in Low Power Networks:
              AODV-RPL", Work in Progress, Internet-Draft,
              draft-ietf-roll-aodv-rpl-11, 16 September 2021.

   [Perlman83]
              Perlman, R., "Fault-Tolerant Broadcast of Routing
              Information", North-Holland Computer Networks, Vol.7,
              December 1983.

Appendix A.  Additional Stuff

Authors' Addresses

   Rahul Arvind Jadhav (editor)
   Marathahalli
   Bangalore 560037
   Karnataka
   India

   Email: rahul.ietf@gmail.com


   Rabi Narayan Sahoo
   Juniper
   Whitefield
   Bangalore 560037
   Karnataka
   India

   Email: rabinarayans0828@gmail.com


   Yuefeng Wu
   Huawei
   No.101, Software Avenue, Yuhuatai District,
   Nanjing
   Jiangsu, 210012
   China

   Phone: +86-15251896569
   Email: wuyuefeng@huawei.com
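
   Illustration for Section 14.2: the reboot scenario described there
   can be reproduced with a small sketch of the comparison rules from
   Section 7.2 of [RFC6550].  The names lollipop_compare and
   reboot_dead_zone are hypothetical and are not taken from any
   implementation.  The sketch assumes a literal reading of the rule
   for two counters in the same region (plain absolute difference, no
   wrap handling inside the circular region), and it leaves the
   handling of non-comparable values to the caller.

```python
SEQUENCE_WINDOW = 16  # value given in RFC 6550, Section 7.2


def lollipop_compare(a, b):
    """Compare two 8-bit lollipop counters (hypothetical helper).

    Returns 1 if a is newer than b, -1 if a is older than b, 0 if they
    are equal, and None if the values are not comparable.
    """
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("counters must be 8-bit values")
    if a == b:
        return 0
    a_linear, b_linear = a >= 128, b >= 128
    if a_linear and not b_linear:
        # a in [128..255], b in [0..127]: b is newer only if it lies
        # within SEQUENCE_WINDOW of a after the wrap to 0.
        return -1 if (256 + b - a) <= SEQUENCE_WINDOW else 1
    if b_linear and not a_linear:
        # Mirror case: swap the arguments and negate the result.
        return -lollipop_compare(b, a)
    # Both counters are in the same region (assumption: no wrap
    # handling inside the circular region in this sketch).
    if abs(a - b) > SEQUENCE_WINDOW:
        return None  # desynchronized, not comparable
    return 1 if a > b else -1


def reboot_dead_zone(init=240):
    """Stored peer values for which a restart at `init` looks old."""
    return [s for s in range(256) if lollipop_compare(init, s) == -1]
```

   Under these assumptions, lollipop_compare(240, 250) reports the
   rebooted value as older, while lollipop_compare(240, 127) reports
   it as newer.  reboot_dead_zone() lists the stored values for which
   a restart at 240 is treated as stale, which corresponds to the dead
   zone (240 to 0) discussed in Section 14.2.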