idnits 2.17.1

draft-ietf-homenet-dncp-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 197 has weird spacing: '...ntifier an o...'

  == Line 252 has weird spacing: '...e trust the ...'

  == Line 256 has weird spacing: '...y graph the...'

  == Line 260 has weird spacing: '...ionally a pe...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     If keep-alives specified in Section 6.1 are NOT sent by the peer
     (either the DNCP profile does not specify the use of keep-alives or
     the particular peer chooses not to send keep-alives), some other
     existing local transport-specific means (such as Ethernet
     carrier-detection or TCP keep-alive) MUST be used to ensure its
     presence. If the peer does not send keep-alives, and no means to verify
     presence of the peer are available, the peer MUST be considered no
     longer present and it SHOULD not be added back as a peer until it
     starts sending keep-alives again. When the peer is no longer present,
     the Peer TLV and the local DNCP peer state MUST be removed.

  -- The document date (August 5, 2015) is 3187 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

  ** Downref: Normative reference to an Informational RFC: RFC 6234

     Summary: 3 errors (**), 0 flaws (~~), 6 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

--------------------------------------------------------------------------------

Homenet Working Group                                        M. Stenberg
Internet-Draft                                                  S. Barth
Intended status: Standards Track                             Independent
Expires: February 6, 2016                                 August 5, 2015

                  Distributed Node Consensus Protocol
                       draft-ietf-homenet-dncp-09

Abstract

   This document describes the Distributed Node Consensus Protocol
   (DNCP), a generic state synchronization protocol that uses the
   Trickle algorithm and hash trees. DNCP is an abstract protocol, and
   must be combined with a specific profile to make a complete
   implementable protocol.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 6, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Applicability
   2.  Terminology
     2.1.  Requirements Language
   3.  Overview
   4.  Operation
     4.1.  Hash Tree
     4.2.  Data Transport
     4.3.  Trickle-Driven Status Updates
     4.4.  Processing of Received TLVs
     4.5.  Adding and Removing Peers
     4.6.  Data Liveliness Validation
   5.  Data Model
   6.  Optional Extensions
     6.1.  Keep-Alives
       6.1.1.  Data Model Additions
       6.1.2.  Per-Endpoint Periodic Keep-Alives
       6.1.3.  Per-Peer Periodic Keep-Alives
       6.1.4.  Received TLV Processing Additions
       6.1.5.  Peer Removal
     6.2.  Support For Dense Multicast-Enabled Links
   7.  Type-Length-Value Objects
     7.1.  Request TLVs
       7.1.1.  Request Network State TLV
       7.1.2.  Request Node State TLV
     7.2.  Data TLVs
       7.2.1.  Node Endpoint TLV
       7.2.2.  Network State TLV
       7.2.3.  Node State TLV
     7.3.  Data TLVs within Node State TLV
       7.3.1.  Peer TLV
       7.3.2.  Keep-Alive Interval TLV
   8.  Security and Trust Management
     8.1.  Pre-Shared Key Based Trust Method
     8.2.  PKI Based Trust Method
     8.3.  Certificate Based Trust Consensus Method
       8.3.1.  Trust Verdicts
       8.3.2.  Trust Cache
       8.3.3.  Announcement of Verdicts
       8.3.4.  Bootstrap Ceremonies
   9.  DNCP Profile-Specific Definitions
   10. Security Considerations
   11. IANA Considerations
   12. References
     12.1.  Normative references
     12.2.  Informative references
   Appendix A.  Alternative Modes of Operation
     A.1.  Read-only Operation
     A.2.  Forwarding Operation
   Appendix B.  Some Questions and Answers [RFC Editor: please remove]
   Appendix C.  Changelog [RFC Editor: please remove]
   Appendix D.  Draft Source [RFC Editor: please remove]
   Appendix E.  Acknowledgements
   Authors' Addresses

1.  Introduction

   DNCP is designed to provide a way for each participating node to
   publish a set of TLV (Type-Length-Value) tuples, and to provide a
   shared and common view about the data published by every currently or
   recently bidirectionally reachable DNCP node in a network.

   For state synchronization a hash tree is used. It is formed by first
   calculating a hash for the dataset published by each node, called
   node data, and then calculating another hash over those node data
   hashes. The single resulting hash, called network state hash, is
   transmitted using the Trickle algorithm [RFC6206] to ensure that all
   nodes share the same view of the current state of the published data
   within the network. The use of Trickle with only short network state
   hashes sent infrequently (in steady state, once the maximum Trickle
   interval per link or unicast connection has been reached) makes DNCP
   very thrifty when updates happen rarely.

   For maintaining liveliness of the topology and the data within it, a
   combination of Trickled network state, keep-alives, and "other" means
   of ensuring reachability is used. The core idea is that if every
   node ensures its peers are present, transitively, the whole network
   state also stays up-to-date.

1.1.  Applicability

   DNCP is most suitable for data that changes only infrequently to gain
   the maximum benefit from using Trickle.
   As the network of nodes grows, or the frequency of data changes per
   node increases, Trickle is eventually used less and less and the
   benefit of using DNCP diminishes. In these cases Trickle just
   provides extra complexity within the specification and little added
   value.

   The suitability of DNCP for a particular application can roughly be
   evaluated by considering the expected average network-wide state
   change interval A_NC_I; it is computed by dividing the mean interval
   at which a node originates a new TLV set by the number of
   participating nodes. If keep-alives are used, A_NC_I is the minimum
   of the computed A_NC_I and the keep-alive interval. If A_NC_I is
   less than the (application-specific) Trickle minimum interval, DNCP
   is most likely unsuitable for the application as Trickle will not be
   utilized most of the time.

   If constant rapid state changes are needed, the preferable choice is
   to use an additional point-to-point channel whose address or locator
   is published using DNCP. Nevertheless, if doing so does not raise
   A_NC_I above the (sensibly chosen) Trickle interval parameters for a
   particular application, using DNCP is probably not suitable for the
   application.

   Another consideration is the size of the TLV set published by a node
   compared to the size of deltas in the TLV set. If the TLV set
   published by a node is very large, and has frequent small changes,
   DNCP as currently specified may be unsuitable since it does not
   define any delta synchronization scheme but always transmits the
   complete updated TLV set verbatim.

   DNCP can be used in networks where only unicast transport is
   available.
   While DNCP uses the least amount of bandwidth when multicast is
   utilized, even in pure unicast mode, the use of Trickle (ideally with
   k < 2) results in a protocol with an exponential backoff timer and
   fewer transmissions than a simpler protocol not using Trickle.

2.  Terminology

   DNCP profile      the values for the set of parameters, given in
                     Section 9. They are prefixed with DNCP_ in this
                     document. The profile also specifies the set of
                     optional DNCP extensions to be used.

   DNCP-based        a protocol which provides a DNCP profile, according
   protocol          to Section 9, and zero or more TLV assignments from
                     the per-DNCP profile TLV registry as well as their
                     processing rules.

   DNCP node         a single node which runs a DNCP-based protocol.

   Link              a link-layer medium over which directly connected
                     nodes can communicate.

   DNCP network      a set of DNCP nodes running DNCP-based protocol(s)
                     with matching DNCP profile(s). The set consists of
                     nodes that have discovered each other using the
                     transport method defined in the DNCP profile, via
                     multicast on local links, and / or by using unicast
                     communication.

   Node identifier   an opaque fixed-length identifier consisting of
                     DNCP_NODE_IDENTIFIER_LENGTH bytes which uniquely
                     identifies a DNCP node within a DNCP network.

   Interface         a node's attachment to a particular link.

   Address           an identifier used as source or destination of a
                     DNCP message flow, e.g., a tuple (IPv6 address, UDP
                     port) for an IPv6 UDP transport.

   Endpoint          a locally configured termination point for
                     (potential or established) DNCP message flows. An
                     endpoint is the source and destination for separate
                     unicast message flows to individual nodes and
                     optionally for multicast messages to all thereby
                     reachable nodes (e.g., for node discovery).
                     Endpoints are usually in one of the transport modes
                     specified in Section 4.2.

   Endpoint          a 32-bit opaque and locally unique value, which
   identifier        identifies a particular endpoint of a particular
                     DNCP node. The value 0 is reserved for DNCP and
                     DNCP-based protocol purposes and not used to
                     identify an actual endpoint. This definition is in
                     sync with the interface index definition in
                     [RFC3493], as the non-zero small positive integers
                     should comfortably fit within 32 bits.

   Peer              another DNCP node with which a DNCP node
                     communicates using a particular local and remote
                     endpoint pair.

   Node data         a set of TLVs published and owned by a node in the
                     DNCP network. Other nodes pass it along as-is, even
                     if they cannot fully interpret it.

   Node state        a set of metadata attributes for node data. It
                     includes a sequence number for versioning, a hash
                     value for comparing equality of stored node data,
                     and a timestamp indicating the time passed since
                     its last publication. The hash function and the
                     length of the hash value are defined in the DNCP
                     profile.

   Network state     a hash value which represents the current state of
   hash              the network. The hash function and the length of
                     the hash value are defined in the DNCP profile.
                     Whenever a node is added, removed or updates its
                     published node data this hash value changes as
                     well. For calculation, please see Section 4.1.

   Trust verdict     a statement about the trustworthiness of a
                     certificate announced by a node participating in
                     the certificate based trust consensus mechanism.

   Effective trust   the trust verdict with the highest priority within
   verdict           the set of trust verdicts announced for the
                     certificate in the DNCP network.

   Topology graph    the undirected graph of DNCP nodes produced by
                     retaining only bidirectional peer relationships
                     between nodes.

   Bidirectionally   a peer is locally unidirectionally reachable if a
   reachable         recent and consistent multicast or any unicast DNCP
                     message has been received by the local node (see
                     Section 4.5). If said peer in return also
                     considers the local node unidirectionally
                     reachable, then bidirectional reachability is
                     established. As this process is based on
                     publishing peer relationships and evaluating the
                     resulting topology graph as described in Section
                     4.6, this information is available to the whole
                     DNCP network.

2.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in RFC
   2119 [RFC2119].

3.  Overview

   DNCP operates primarily using unicast exchanges between nodes, and
   may use multicast for Trickle-based shared state dissemination and
   topology discovery. If used in pure unicast mode with unreliable
   transport, Trickle is also used between peers.

   DNCP discovers the topology of the nodes in the DNCP network and
   maintains the liveliness of published node data by ensuring that the
   publishing node was - at least recently - bidirectionally reachable.
   New potential peers can be discovered autonomously on
   multicast-enabled links, their addresses may be manually configured,
   or they may be found by some other means defined in a later
   specification.

   A hash tree of height 1, rooted in itself, is maintained by each node
   to represent the state of all currently reachable nodes (see
   Section 4.1) and the Trickle algorithm is used to trigger
   synchronization (see Section 4.3). The need to check peer nodes for
   state changes is thereby determined by comparing the current root of
   their respective hash trees, i.e., their individually calculated
   network state hashes.
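
   As a rough illustration of how two nodes can compare the roots of
   their height-1 hash trees, the Python sketch below computes a network
   state hash from per-node sequence numbers and node data, following
   the leaf layout given in Section 4.1. The hash function (SHA-256
   truncated to 8 bytes), the 4-byte node identifiers, and the names
   HASH_LEN, h, and network_state_hash are assumptions made for this
   example only; the real values are chosen by the DNCP profile.

```python
import hashlib
import struct

HASH_LEN = 8  # assumption: the profile truncates SHA-256 to 8 bytes


def h(data: bytes) -> bytes:
    """Stand-in for the profile-defined hash function (assumed here)."""
    return hashlib.sha256(data).digest()[:HASH_LEN]


def network_state_hash(nodes: dict) -> bytes:
    """Return the root of the height-1 hash tree.

    nodes maps node identifier (bytes) -> (sequence number, node data),
    where node data is the already ordered and encoded TLV set. Only
    nodes that passed the reachability check should be passed in.
    """
    leaves = b""
    for node_id in sorted(nodes):  # ascending order of node identifiers
        seq, node_data = nodes[node_id]
        # Leaf: 32-bit sequence number in network byte order,
        # concatenated with the hash of the node data.
        leaves += struct.pack("!I", seq) + h(node_data)
    return h(leaves)


# Two views of the same network that differ in one node's state.
local = {b"\x00\x00\x00\x02": (7, b"tlv-b"), b"\x00\x00\x00\x01": (3, b"tlv-a")}
peer = {b"\x00\x00\x00\x01": (3, b"tlv-a"), b"\x00\x00\x00\x02": (8, b"tlv-b2")}
in_sync = network_state_hash(local) == network_state_hash(peer)
```

   In this sketch a mismatch (in_sync being false) is the condition
   under which a node would go on to request the peer's list of node
   states.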

   Before joining a DNCP network, a node starts with a hash tree that
   has only one leaf if the node publishes some TLVs, and no leaves
   otherwise. It then announces the network state hash calculated from
   the hash tree by means of the Trickle algorithm on all its configured
   endpoints.

   When an update is detected by a node (e.g., by receiving a different
   network state hash from a peer) the originator of the event is
   requested to provide a list of the state of all nodes, i.e., all the
   information it uses to calculate its own hash tree. The node uses
   the list to determine whether its own information is outdated and -
   if necessary - requests the actual node data that has changed.

   Whenever a node's local copy of any node data and its hash tree are
   updated (e.g., due to its own or another node's node state changing
   or due to a peer being added or removed) its Trickle instances are
   reset which eventually causes any update to be propagated to all of
   its peers.

4.  Operation

4.1.  Hash Tree

   Each DNCP node maintains an arbitrary width hash tree of height 1.
   Each leaf represents one recently bidirectionally reachable DNCP node
   (see Section 4.6), and is represented by a tuple consisting of the
   node's sequence number in network byte order concatenated with the
   hash-value of the node's ordered node data published in the Node
   State TLV (Section 7.2.3). These leaves are ordered in ascending
   order of the respective node identifiers. The root of the tree - the
   network state hash - is represented by the hash-value calculated over
   all such leaf tuples concatenated in order. It is used to determine
   whether the view of the network of two or more nodes is consistent
   and shared.

   The node data hashes in the leaves and the root network state hash
   are updated on-demand and whenever any locally stored per-node state
   changes.
   This includes local unidirectional reachability encoded in
   the published Peer TLVs (Section 7.3.1) and - when combined with
   remote data - results in awareness of bidirectional reachability
   changes.

4.2.  Data Transport

   DNCP has few requirements for the underlying transport; it requires
   some way of transmitting either unicast datagram or stream data to a
   peer and, if used in multicast mode, a way of sending multicast
   datagrams. As multicast is used only to identify potential new DNCP
   nodes and to send status messages which merely notify that a unicast
   exchange should be triggered, the multicast transport does not have
   to be secured. If unicast security is desired and one of the
   built-in security methods is to be used, support for some TLS-derived
   transport scheme - such as TLS [RFC5246] on top of TCP or DTLS
   [RFC6347] on top of UDP - is also required. A specific definition of
   the transport(s) in use and their parameters MUST be provided by the
   DNCP profile.

   TLVs are sent across the transport as is, and they SHOULD be sent
   together where, e.g., MTU considerations do not recommend sending
   them in multiple batches. TLVs in general are handled individually
   and statelessly, with one exception: To form bidirectional peer
   relationships DNCP requires identification of the endpoints used for
   communication. As bidirectional peer relationships are required for
   validating liveliness of published node data as described in
   Section 4.6, a DNCP node MUST send a Node Endpoint TLV
   (Section 7.2.1). When it is sent varies, depending on the underlying
   transport, but conceptually it should be available whenever
   processing a Network State TLV:

   o  If using a stream transport, the TLV MUST be sent at least once
      per connection, but SHOULD NOT be sent more than once.

   o  If using a datagram transport, it MUST be included in every
      datagram that also contains a Network State TLV (Section 7.2.2)
      and MUST be located before any such TLV. It SHOULD also be
      included in any other datagram, to speed up initial peer
      detection.

   Given the assorted transport options as well as potential endpoint
   configuration, a DNCP endpoint may be used in various transport
   modes:

   Unicast:

      *  If only reliable unicast transport is used, Trickle is not used
         at all. Where Trickle reset has been specified, a single
         Network State TLV (Section 7.2.2) is sent instead to every
         unicast peer. Additionally, recently changed Node State TLVs
         (Section 7.2.3) MAY be included.

      *  If only unreliable unicast transport is used, Trickle state is
         kept per peer and it is used to send Network State TLVs
         intermittently, as specified in Section 4.3.

   Multicast+Unicast:  If multicast datagram transport is available on
      an endpoint, Trickle state is only maintained for the endpoint as
      a whole. It is used to send Network State TLVs every now and
      then, as specified in Section 4.3. Additionally, per-endpoint
      keep-alives MAY be defined in the DNCP profile, as specified in
      Section 6.1.2.

   MulticastListen+Unicast:  Just like Unicast, except multicast
      transmissions are listened to in order to detect changes of the
      highest node identifier. This mode is used only if the DNCP
      profile supports dense multicast-enabled link optimization
      (Section 6.2).

4.3.  Trickle-Driven Status Updates

   The Trickle algorithm [RFC6206] has 3 parameters: Imin, Imax and k.
   Imin and Imax represent the minimum and maximum values for I, which
   is the time interval during which at least k Trickle updates must be
   seen on an endpoint to prevent local state transmission. The actual
   suggested Trickle algorithm parameters are DNCP profile specific, as
   described in Section 9.
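
   To illustrate how Imin, Imax, and k interact, the following Python
   sketch models the per-instance Trickle state (I, t, c) from
   [RFC6206] as an event-driven object without real timers. The class
   name TrickleInstance and its method names are hypothetical, and
   imax is treated here as a time value (as in the paragraph above),
   whereas [RFC6206] itself expresses Imax as a number of doublings of
   Imin. It is a minimal sketch, not a complete implementation.

```python
import random


class TrickleInstance:
    """Event-driven Trickle state (I, t, c); names are illustrative."""

    def __init__(self, imin: float, imax: float, k: int, rng=None):
        self.imin, self.imax, self.k = imin, imax, k
        self.rng = rng or random.Random()
        self.I = imin  # start with the shortest interval
        self._begin_interval()

    def _begin_interval(self) -> None:
        self.c = 0  # consistent messages heard in this interval
        # t is picked uniformly from [I/2, I).
        self.t = self.I / 2 + self.rng.random() * self.I / 2

    def heard_consistent(self) -> None:
        """A matching network state hash was seen on the endpoint."""
        self.c += 1

    def should_transmit(self) -> bool:
        """Evaluated at time t: send only if fewer than k were heard."""
        return self.c < self.k

    def interval_expired(self) -> None:
        """Interval I ended: double it, capped at imax, and restart."""
        self.I = min(self.I * 2, self.imax)
        self._begin_interval()

    def reset(self) -> None:
        """Inconsistency (local hash changed): fall back to imin."""
        if self.I > self.imin:
            self.I = self.imin
            self._begin_interval()
```

   A caller would schedule should_transmit() at offset t within each
   interval and interval_expired() at offset I; in steady state the
   interval settles at imax, which is what keeps the traffic low.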

   The Trickle state for all Trickle instances is considered
   inconsistent and reset if and only if the locally calculated network
   state hash changes. This occurs either due to a change in the local
   node's own node data, or due to receipt of more recent data from
   another node. A node MUST NOT reset its Trickle state merely based
   on receiving a Network State TLV (Section 7.2.2) with a network state
   hash which is different from its locally calculated one.

   Every time a particular Trickle instance indicates that an update
   should be sent, the node MUST send a Network State TLV
   (Section 7.2.2) if and only if:

   o  the endpoint is in Multicast+Unicast transport mode, in which case
      the TLV MUST be sent over multicast.

   o  the endpoint is NOT in Multicast+Unicast transport mode, and the
      unicast transport is unreliable, in which case the TLV MUST be
      sent over unicast.

   A (sub)set of all Node State TLVs (Section 7.2.3) MAY also be
   included, unless it is defined as undesirable for some reason by the
   DNCP profile, or to avoid exposure of the node state TLVs by
   transmitting them within insecure multicast when using secure
   unicast.

4.4.  Processing of Received TLVs

   This section describes how received TLVs are processed. The DNCP
   profile may specify when to ignore particular TLVs, e.g., to modify
   security properties - see Section 9 for what may be safely defined to
   be ignored in a profile. Any 'reply' mentioned in the steps below
   denotes sending of the specified TLV(s) over unicast to the
   originator of the TLV being processed. If the TLV being replied to
   was received via multicast and it was sent to a multiple access link,
   the reply SHOULD be delayed by a random timespan in [0, Imin/2], to
   avoid potential simultaneous replies that may cause problems on some
   links. Sending of replies MAY also be rate-limited or omitted for a
   short period of time by an implementation.
   However, an implementation MUST eventually reply to similar repeated
   requests, as otherwise state synchronization breaks.

   A DNCP node MUST process TLVs received from any valid address, as
   specified by the DNCP profile and the configuration of a particular
   endpoint, whether this address is known to be the address of a peer
   or not. This provision satisfies the needs of monitoring or other
   host software that needs to discover the DNCP topology without adding
   to the state in the network.

   Upon receipt of:

   o  Request Network State TLV (Section 7.1.1): The receiver MUST reply
      with a Network State TLV (Section 7.2.2) and a Node State TLV
      (Section 7.2.3) for each node data used to calculate the network
      state hash. The Node State TLVs SHOULD NOT contain the optional
      node data part to avoid redundant transmission of node data,
      unless explicitly specified in the DNCP profile.

   o  Request Node State TLV (Section 7.1.2): If the receiver has node
      data for the corresponding node, it MUST reply with a Node State
      TLV (Section 7.2.3) for the corresponding node. The optional node
      data part MUST be included in the TLV.

   o  Network State TLV (Section 7.2.2): If the network state hash
      differs from the locally calculated network state hash, and the
      receiver is unaware of any particular node state differences with
      the sender, the receiver MUST reply with a Request Network State
      TLV (Section 7.1.1). These replies MUST be rate limited to only
      at most one reply per link per unique network state hash within
      Imin. The simplest way to ensure this rate limit is a timestamp
      indicating requests, and sending at most one Request Network State
      TLV (Section 7.1.1) per Imin. To facilitate faster state
      synchronization, if a Request Network State TLV is sent in a
      reply, a local, current Network State TLV MAY also be sent.

   o  Node State TLV (Section 7.2.3):

      *  If the node identifier matches the local node identifier and
         the TLV has a greater sequence number than its current local
         value, or the same sequence number and a different hash, the
         node SHOULD re-publish its own node data with a sequence number
         significantly (e.g., 1000) greater than the received one, to
         reclaim the node identifier. This difference is needed in
         order to ensure that it is higher than any potentially
         lingering copies of the node state in the network. This may
         occur normally once due to the local node restarting and not
         storing the most recently used sequence number. If this occurs
         more than once or for nodes not re-publishing their own node
         data, the DNCP profile MUST provide guidance on how to handle
         these situations as it indicates the existence of another
         active node with the same node identifier.

      *  If the node identifier does not match the local node
         identifier, and one or more of the following conditions are
         true:

         +  The local information is outdated for the corresponding node
            (local sequence number is less than that within the TLV).

         +  The local information is potentially incorrect (local
            sequence number matches but the node data hash differs).

         +  There is no data for that node altogether.

         Then:

         +  If the TLV contains the Node Data field, it SHOULD also be
            verified by ensuring that the locally calculated H(Node
            Data) matches the content of the H(Node Data) field within
            the TLV. If they differ, the TLV SHOULD be ignored and not
            processed further.

         +  If the TLV does not contain the Node Data field, and the
            H(Node Data) field within the TLV differs from the local
            node data hash for that node (or there is none), the
            receiver MUST reply with a Request Node State TLV
            (Section 7.1.2) for the corresponding node.

         +  Otherwise the receiver MUST update its locally stored state
            for that node (node data based on Node Data field if
            present, sequence number and relative time) to match the
            received TLV.

      For comparison purposes of the sequence number, a looping
      comparison function MUST be used to avoid problems in case of
      overflow. The comparison function
      a < b <=> ((a - b) % (2^32)) & (2^31) != 0, where (a % b)
      represents the remainder of a modulo b and (a & b) represents the
      bitwise conjunction of a and b, is RECOMMENDED unless the DNCP
      profile defines another.

   o  Any other TLV: TLVs not recognized by the receiver MUST be
      silently ignored unless they are sent within another TLV (for
      example, TLVs within the Node Data field of a Node State TLV).

   If secure unicast transport is configured for an endpoint, any Node
   State TLVs received over insecure multicast MUST be silently ignored.

4.5.  Adding and Removing Peers

   When receiving a Node Endpoint TLV (Section 7.2.1) on an endpoint
   from an unknown peer:

   o  If received over unicast, the remote node MUST be added as a peer
      on the endpoint and a Peer TLV (Section 7.3.1) MUST be created for
      it.

   o  If received over multicast, the node MAY be sent a (possibly
      rate-limited) unicast Request Network State TLV (Section 7.1.1).

   If keep-alives specified in Section 6.1 are NOT sent by the peer
   (either the DNCP profile does not specify the use of keep-alives or
   the particular peer chooses not to send keep-alives), some other
   existing local transport-specific means (such as Ethernet
   carrier-detection or TCP keep-alive) MUST be used to ensure its
   presence. If the peer does not send keep-alives, and no means to
   verify presence of the peer are available, the peer MUST be
   considered no longer present and it SHOULD NOT be added back as a
   peer until it starts sending keep-alives again.
   When the peer is no longer present, the Peer TLV and the local DNCP
   peer state MUST be removed.

   If the local endpoint is in the MulticastListen+Unicast transport
   mode, a Peer TLV (Section 7.3.1) MUST NOT be published for the peers
   not having the highest node identifier.

4.6.  Data Liveliness Validation

   The topology graph MUST be traversed either immediately or with a
   small delay shorter than the DNCP profile-defined Trickle Imin,
   whenever:

   o  A Peer TLV or a whole node is added or removed, or

   o  the origination time (in milliseconds) of some node's node data is
      less than current time - 2^32 + 2^15.

   The topology graph traversal starts with the local node marked as
   reachable. Other nodes are then iteratively marked as reachable
   using the following algorithm: A candidate not-yet-reachable node N
   with an endpoint NE is marked as reachable if there is a reachable
   node R with an endpoint RE that meet all of the following criteria:

   o  The origination time (in milliseconds) of R's node data is greater
      than current time - 2^32 + 2^15.

   o  R publishes a Peer TLV with:

      *  Peer Node Identifier = N's node identifier

      *  Peer Endpoint Identifier = NE's endpoint identifier

      *  Endpoint Identifier = RE's endpoint identifier

   o  N publishes a Peer TLV with:

      *  Peer Node Identifier = R's node identifier

      *  Peer Endpoint Identifier = RE's endpoint identifier

      *  Endpoint Identifier = NE's endpoint identifier

   The algorithm terminates when no more candidate nodes fulfilling
   these criteria can be found.

   DNCP nodes that have not been reachable in the most recent topology
   graph traversal MUST NOT be used for calculation of the network state
   hash, be provided to any applications that need to use the whole TLV
   graph, or be provided to remote nodes.
They MAY be removed 623 immediately after the topology graph traversal, however it is 624 RECOMMENDED to keep them at least briefly to improve the speed of 625 DNCP network state convergence and to reduce the number of redundant 626 state transmissions between nodes. 628 5. Data Model 630 This section describes the local data structures a minimal 631 implementation might use. This section is provided only as a 632 convenience for the implementor. Some of the optional extensions 633 (Section 6) describe additional data requirements, and some optional 634 parts of the core protocol may also require more. 636 A DNCP node has: 638 o A data structure containing data about the most recently sent 639 Request Network State TLVs (Section 7.1.1). The simplest option 640 is keeping a timestamp of the most recent request (required to 641 fulfill reply rate limiting specified in Section 4.4). 643 A DNCP node has for every DNCP node in the DNCP network: 645 o Node identifier: the unique identifier of the node. The length, 646 how it is produced, and how collisions are handled, is up to the 647 DNCP profile. 649 o Node data: the set of TLV tuples published by that particular 650 node. As they are transmitted ordered (see Node State TLV 651 (Section 7.2.3) for details), maintaining the order within the 652 data structure here may be reasonable. 654 o Latest sequence number: the 32-bit sequence number that is 655 incremented any time the TLV set is published. The comparison 656 function used to compare them is described in Section 4.4. 658 o Origination time: the (estimated) time when the current TLV set 659 with the current sequence number was published. It is used to 660 populate the Milliseconds Since Origination field in a Node State 661 TLV (Section 7.2.3). Ideally it also has millisecond accuracy. 663 Additionally, a DNCP node has a set of endpoints for which DNCP is 664 configured to be used. 
For each such endpoint, a node has: 666 o Endpoint identifier: the 32-bit opaque locally unique value 667 identifying the endpoint within a node. It SHOULD NOT be reused 668 immediately after an endpoint is disabled. 670 o Trickle instance: the endpoint's Trickle instance with parameters 671 I, T, and c (only on an endpoint in Multicast+Unicast transport 672 mode). 674 and one (or more) of the following: 676 o Interface: the assigned local network interface. 678 o Unicast address: the DNCP node it should connect with. 680 o Set of addresses: the DNCP nodes from which connections are 681 accepted. 683 For each remote (peer, endpoint) pair detected on a local endpoint, a 684 DNCP node has: 686 o Node identifier: the unique identifier of the peer. 688 o Endpoint identifier: the unique endpoint identifier used by the 689 peer. 691 o Peer address: the most recently used address of the peer 692 (authenticated and authorized, if security is enabled). 694 o Trickle instance: the particular peer's Trickle instance with 695 parameters I, T, and c (only on an endpoint in Unicast mode, when 696 using an unreliable unicast transport) . 698 6. Optional Extensions 700 This section specifies extensions to the core protocol that a DNCP 701 profile may specify to be used. 703 6.1. Keep-Alives 705 Trickle-driven status updates (Section 4.3) provide a mechanism for 706 handling of new peer detection on an endpoint, as well as state 707 change notifications. Another mechanism may be needed to get rid of 708 old, no longer valid peers if the transport or lower layers do not 709 provide one. 711 If keep-alives are not specified in the DNCP profile, the rest of 712 this subsection MUST be ignored. 714 A DNCP profile MAY specify either per-endpoint (sent using multicast 715 to all DNCP nodes connected to a multicast-enabled link) or per-peer 716 (sent using unicast to each peer individually) keep-alive support. 
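For illustration, the per-node, per-endpoint and per-peer state of Section 5, together with the looping sequence number comparison of Section 4.4, might be held in structures like the following Python sketch. All class, field and function names here are assumptions of the sketch, not part of DNCP, and profile-specific details such as the node identifier length and the Trickle instances are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


def seq_less(a: int, b: int) -> bool:
    """Looping 32-bit comparison a < b, as recommended in Section 4.4."""
    return (((a - b) % (2**32)) & (2**31)) != 0


@dataclass
class NodeState:
    node_id: bytes                 # length is fixed by the DNCP profile
    seq: int = 0                   # latest 32-bit sequence number
    origination_ms: int = 0        # estimated publication time
    tlvs: List[Tuple[int, bytes]] = field(default_factory=list)  # (type, value)


@dataclass
class PeerState:
    node_id: bytes                 # identifier of the peer
    endpoint_id: int               # endpoint identifier used by the peer
    address: str                   # most recently used address of the peer


@dataclass
class Endpoint:
    endpoint_id: int               # 32-bit, locally unique
    interface: Optional[str] = None
    # Keyed by the remote (node identifier, endpoint identifier) pair.
    peers: Dict[Tuple[bytes, int], PeerState] = field(default_factory=dict)
```

With this comparison, seq_less(0xFFFFFFFF, 0) is true, so a sequence number that has wrapped around to 0 is still treated as the newer one.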
718 For every endpoint that a keep-alive is specified for in the DNCP 719 profile, the endpoint-specific keep-alive interval MUST be 720 maintained. By default, it is DNCP_KEEPALIVE_INTERVAL. If a 721 different local value is preferred for any reason 722 (configuration, energy conservation, media type, etc.), it can be 723 substituted instead. If a non-default keep-alive interval is used on 724 any endpoint, a DNCP node MUST publish appropriate Keep-Alive 725 Interval TLV(s) (Section 7.3.2) within its node data. 727 6.1.1. Data Model Additions 729 The following additions to the Data Model (Section 5) are needed to 730 support keep-alives: 732 For each configured endpoint that has per-endpoint keep-alives 733 enabled: 735 o Last sent: a timestamp which indicates the last time a Network 736 State TLV (Section 7.2.2) was sent over that interface. 738 For each remote (peer, endpoint) pair detected on a local endpoint, a 739 DNCP node has: 741 o Last contact timestamp: a timestamp which indicates the last time 742 a consistent Network State TLV (Section 7.2.2) was received from 743 the peer over multicast, or anything was received over unicast. 744 When adding a new peer, it is initialized to the current time. 746 o Last sent: If per-peer keep-alives are enabled, a timestamp which 747 indicates the last time a Network State TLV (Section 7.2.2) was 748 sent to that point-to-point peer. When adding a new peer, it 749 is initialized to the current time. 751 6.1.2. Per-Endpoint Periodic Keep-Alives 753 If per-endpoint keep-alives are enabled on an endpoint in 754 Multicast+Unicast transport mode, and if no traffic containing a 755 Network State TLV (Section 7.2.2) has been sent to a particular 756 endpoint within the endpoint-specific keep-alive interval, a Network 757 State TLV (Section 7.2.2) MUST be sent on that endpoint, and a new 758 Trickle interval started, as specified in step 2 of Section 4.2 759 of [RFC6206].
The actual sending time SHOULD be further delayed by a 760 random timespan in [0, Imin/2]. 762 6.1.3. Per-Peer Periodic Keep-Alives 764 If per-peer keep-alives are enabled on a unicast-only endpoint, and 765 if no traffic containing a Network State TLV (Section 7.2.2) has been 766 sent to a particular peer within the endpoint-specific keep-alive 767 interval, a Network State TLV (Section 7.2.2) MUST be sent to the 768 peer, and a new Trickle interval started, as specified in the step 2 769 of Section 4.2 of [RFC6206]. 771 6.1.4. Received TLV Processing Additions 773 If a TLV is received over unicast from the peer, the Last contact 774 timestamp for the peer MUST be updated. 776 On receipt of a Network State TLV (Section 7.2.2) which is consistent 777 with the locally calculated network state hash, the Last contact 778 timestamp for the peer MUST be updated. 780 6.1.5. Peer Removal 782 For every peer on every endpoint, the endpoint-specific keep-alive 783 interval must be calculated by looking for Keep-Alive Interval TLVs 784 (Section 7.3.2) published by the node, and if none exist, using the 785 default value of DNCP_KEEPALIVE_INTERVAL. If the peer's last contact 786 timestamp has not been updated for at least locally chosen 787 potentially endpoint-specific keep-alive multiplier (defaults to 788 DNCP_KEEPALIVE_MULTIPLIER) times the peer's endpoint-specific keep- 789 alive interval, the Peer TLV for that peer and the local DNCP peer 790 state MUST be removed. 792 6.2. Support For Dense Multicast-Enabled Links 794 This optimization is needed to avoid a state space explosion. Given 795 a large set of DNCP nodes publishing data on an endpoint that uses 796 multicast on a link, every node will add a Peer TLV (Section 7.3.1) 797 for each peer. While Trickle limits the amount of traffic on the 798 link in stable state to some extent, the total amount of data that is 799 added to and maintained in the DNCP network given N nodes on a 800 multicast-enabled link is O(N^2). 
Additionally if per-peer keep- 801 alives are used, there will be O(N^2) keep-alives running on the link 802 if liveliness of peers is not ensured using some other way (e.g., TCP 803 connection lifetime, layer 2 notification, per-endpoint keep-alive). 805 An upper bound for the number of peers that are allowed for a 806 particular type of link that an endpoint in Multicast+Unicast 807 transport mode is used on SHOULD be provided by a DNCP profile, but 808 MAY also be chosen at runtime. The main consideration when selecting 809 a bound (if any) for a particular type of link should be whether it 810 supports multicast traffic, and whether a too large number of peers 811 case is likely to happen during the use of that DNCP profile on that 812 particular type of link. If neither is likely, there is little point 813 specifying support for this for that particular link type. 815 If a DNCP profile does not support this extension at all, the rest of 816 this subsection MUST be ignored. This is because when this extension 817 is used, the state within the DNCP network only contains a subset of 818 the full topology of the network. Therefore every node must be aware 819 of the potential of it being used in a particular DNCP profile. 821 If the specified upper bound is exceeded for some endpoint in 822 Multicast+Unicast transport mode and if the node does not have the 823 highest node identifier on the link, it SHOULD treat the endpoint as 824 a unicast endpoint connected to the node that has the highest node 825 identifier detected on the link, therefore transitioning to 826 Multicast-listen+Unicast transport mode. See Section 4.2 for 827 implications on the specific endpoint behavior. The nodes in 828 Multicast-listen+Unicast transport mode MUST keep listening to 829 multicast traffic to both receive messages from the node(s) still in 830 Multicast+Unicast mode, and as well to react to nodes with a greater 831 node identifier appearing. 
If the highest node identifier present on 832 the link changes, the remote unicast address of the endpoints in 833 Multicast-Listen+Unicast transport mode MUST be changed. If the node 834 identifier of the local node is the highest one, the node MUST switch 835 back to, or stay in, Multicast+Unicast mode, and normally form peer 836 relationships with all peers. 838 7. Type-Length-Value Objects 840 Each TLV is encoded as a 2-byte type field, followed by a 2-byte 841 length field (of the value excluding header, in bytes, 0 meaning no 842 value) followed by the value itself, if any. Both type and length 843 fields in the header, as well as all integer fields inside the value 844 unless explicitly stated otherwise, are represented unsigned and in 845 network byte order. Padding bytes with value zero MUST be added up 846 to the next 4-byte boundary if the length is not divisible by 4. 847 These padding bytes MUST NOT be included in the number stored in the 848 length field. Each TLV which does not define optional fields or 849 variable-length content MAY be sent with additional nested TLVs 850 appended after the required TLV fields (and padding, if applicable) 851 to allow for extensibility. In this case the length field includes 852 the length of the original TLV, the length of the padding that is 853 inserted before the embedded TLVs, and the length of the added TLVs. 854 Therefore, each node MUST accept received TLVs that are longer than 855 the fixed fields specified and ignore embedded TLVs it does not 856 understand. 858 0 1 2 3 859 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 860 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 861 | Type | Length | 862 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 863 | Value | 864 ..
865 | (variable # of bytes) | 866 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 | (Optional nested TLVs) | 868 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 870 For example, a TLV of type=123 (0x7b) with value 'x' (120 = 0x78) is 871 encoded as: 007B 0001 7800 0000. If it were to have a sub-TLV of 872 type=124 (0x7c) with value 'y', it would be encoded as 007B 0009 7800 873 0000 007C 0001 7900 0000. 875 In this section, the following special notation is used: 877 .. = octet string concatenation operation. 879 H(x) = non-cryptographic hash function specified by DNCP profile. 881 7.1. Request TLVs 883 7.1.1. Request Network State TLV 885 0 1 2 3 886 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 887 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 888 | Type: REQ-NETWORK-STATE (1) | Length: >= 0 | 889 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 891 This TLV is used to request a response with a Network State TLV 892 (Section 7.2.2) and all Node State TLVs (Section 7.2.3) (without node 893 data). 895 7.1.2. Request Node State TLV 896 0 1 2 3 897 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 898 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 899 | Type: REQ-NODE-STATE (2) | Length: > 0 | 900 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 901 | Node Identifier | 902 | (length fixed in DNCP profile) | 903 ... 904 | | 905 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 907 This TLV is used to request a Node State TLV (Section 7.2.3) 908 (including node data) for the node with the matching node identifier. 910 7.2. Data TLVs 912 7.2.1.
Node Endpoint TLV 914 0 1 2 3 915 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 | Type: NODE-ENDPOINT (3) | Length: > 4 | 918 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 919 | Node Identifier | 920 | (length fixed in DNCP profile) | 921 ... 922 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 923 | Endpoint Identifier | 924 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 926 This TLV identifies both the local node's node identifier, as well as 927 the particular endpoint's endpoint identifier. Section 4.2 specifies 928 when it is sent. 930 7.2.2. Network State TLV 932 0 1 2 3 933 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 934 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 935 | Type: NETWORK-STATE (4) | Length: > 0 | 936 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 937 | H(sequence number of node 1 .. H(node data of node 1) .. | 938 | .. sequence number of node N .. H(node data of node N)) | 939 | (length fixed in DNCP profile) | 940 ... 941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 942 This TLV contains the current locally calculated network state hash, 943 see Section 4.1 for how it is calculated. 945 7.2.3. Node State TLV 947 0 1 2 3 948 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 950 | Type: NODE-STATE (5) | Length: > 8 | 951 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 952 | Node Identifier | 953 | (length fixed in DNCP profile) | 954 ... 
955 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 956 | Sequence Number | 957 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 958 | Milliseconds Since Origination | 959 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 960 | H(Node Data) | 961 | (length fixed in DNCP profile) | 962 ... 963 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 964 | (optionally) Node Data (a set of nested TLVs) | 965 ... 966 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 968 This TLV represents the local node's knowledge about the published 969 state of a node in the DNCP network identified by the Node Identifier 970 field in the TLV. 972 Every node, including the node publishing the node data, MUST update 973 the Milliseconds Since Origination whenever it sends a Node State TLV 974 based on when the node estimates the data was originally published. 975 This is, e.g., to ensure that any relative timestamps contained 976 within the published node data can be correctly offset and 977 interpreted. Ultimately, what is provided is just an approximation, 978 as transmission delays are not accounted for. 980 Absent any changes, if the originating node notices that the 32-bit 981 milliseconds since origination value would be close to overflow 982 (greater than 2^32-2^16), the node MUST re-publish its TLVs even if 983 there is no change. In other words, absent any other changes, the 984 TLV set MUST be re-published roughly every 48 days. 986 The actual node data of the node may be included within the TLV as 987 well in the optional Node Data field. The set of TLVs MUST be 988 strictly ordered based on ascending binary content (including TLV 989 type and length). This enables, e.g., efficient state delta 990 processing and no-copy indexing by TLV type by the recipient. The 991 Node Data content MUST be passed along exactly as it was received. 
992 It SHOULD also be verified on receipt that the locally calculated 993 H(Node Data) matches the content of the field within the TLV, and if 994 the hash differs, the TLV SHOULD be ignored. 996 7.3. Data TLVs within Node State TLV 998 These TLVs are published by the DNCP nodes, and therefore only 999 encoded within the Node State TLVs. If encountered outside a Node 1000 State TLV, they MUST be silently ignored. 1002 7.3.1. Peer TLV 1004 0 1 2 3 1005 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1006 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1007 | Type: PEER (8) | Length: > 8 | 1008 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1009 | Peer Node Identifier | 1010 | (length fixed in DNCP profile) | 1011 ... 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1013 | Peer Endpoint Identifier | 1014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1015 | Local Endpoint Identifier | 1016 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1018 This TLV indicates that the node in question vouches that the 1019 specified peer is reachable by it on the specified local endpoint. 1020 The presence of this TLV at least guarantees that the node publishing 1021 it has received traffic from the peer recently. For guaranteed up- 1022 to-date bidirectional reachability, the existence of both nodes' 1023 matching Peer TLVs needs to be checked. 1025 7.3.2.
Keep-Alive Interval TLV 1027 0 1 2 3 1028 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1029 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1030 | Type: KEEP-ALIVE-INTERVAL (9) | Length: >= 8 | 1031 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1032 | Endpoint Identifier | 1033 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1034 | Interval | 1035 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1036 This TLV indicates a non-default interval being used to send keep- 1037 alives specified in Section 6.1. 1039 Endpoint identifier is used to identify the particular endpoint for 1040 which the interval applies. If 0, it applies for ALL endpoints for 1041 which no specific TLV exists. 1043 Interval specifies the interval in milliseconds at which the node 1044 sends keep-alives. A value of zero means no keep-alives are sent at 1045 all; in that case, some lower layer mechanism that ensures presence 1046 of nodes MUST be available and used. 1048 8. Security and Trust Management 1050 If specified in the DNCP profile, either DTLS [RFC6347] or TLS 1051 [RFC5246] may be used to authenticate and encrypt either some (if 1052 specified optional in the profile), or all unicast traffic. The 1053 following methods for establishing trust are defined, but it is up to 1054 the DNCP profile to specify which ones may, should or must be 1055 supported. 1057 8.1. Pre-Shared Key Based Trust Method 1059 A PSK-based trust model is a simple security management mechanism 1060 that allows an administrator to deploy devices to an existing network 1061 by configuring them with a pre-defined key, similar to the 1062 configuration of an administrator password or WPA-key. Although 1063 limited in nature it is useful to provide a user-friendly security 1064 mechanism for smaller networks. 1066 8.2. 
PKI Based Trust Method 1068 A PKI-based trust-model enables more advanced management capabilities 1069 at the cost of increased complexity and bootstrapping effort. It 1070 however allows trust to be managed in a centralized manner and is 1071 therefore useful for larger networks with a need for an authoritative 1072 trust management. 1074 8.3. Certificate Based Trust Consensus Method 1076 The certificate-based consensus model is designed to be a compromise 1077 between trust management effort and flexibility. It is based on 1078 X.509-certificates and allows each DNCP node to provide a trust 1079 verdict on any other certificate and a consensus is found to 1080 determine whether a node using this certificate or any certificate 1081 signed by it is to be trusted. 1083 A DNCP node not using this security method MUST ignore all announced 1084 trust verdicts and MUST NOT announce any such verdicts by itself, 1085 i.e., any other normative language in this subsection does not apply 1086 to it. 1088 The current effective trust verdict for any certificate is defined as 1089 the one with the highest priority from all trust verdicts announced 1090 for said certificate at the time. 1092 8.3.1. Trust Verdicts 1094 Trust verdicts are statements of DNCP nodes about the trustworthiness 1095 of X.509-certificates. There are 5 possible trust verdicts in order 1096 of ascending priority: 1098 0 (Neutral): no trust verdict exists but the DNCP network should 1099 determine one. 1101 1 (Cached Trust): the last known effective trust verdict was 1102 Configured or Cached Trust. 1104 2 (Cached Distrust): the last known effective trust verdict was 1105 Configured or Cached Distrust. 1107 3 (Configured Trust): trustworthy based upon an external ceremony 1108 or configuration. 1110 4 (Configured Distrust): not trustworthy based upon an external 1111 ceremony or configuration. 
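As a minimal sketch, the five verdicts and the "highest priority wins" rule for the effective trust verdict might be expressed in Python as follows. The names Verdict and effective_verdict are assumptions of this sketch; only the numeric values and their ordering come from the list above.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Verdict(IntEnum):
    """Trust verdicts in order of ascending priority."""
    NEUTRAL = 0
    CACHED_TRUST = 1
    CACHED_DISTRUST = 2
    CONFIGURED_TRUST = 3
    CONFIGURED_DISTRUST = 4


def effective_verdict(announced: Iterable[Verdict]) -> Optional[Verdict]:
    """Pick the highest-priority verdict announced for one certificate.

    Returns None when no verdict is announced at all.
    """
    return max(announced, default=None)
```

Because the verdicts are numbered by priority, a Distrust verdict always outranks the Trust verdict of the same kind (cached or configured).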
1113 Trust verdicts are differentiated in 3 groups: 1115 o Configured verdicts are used to announce explicit trust verdicts a 1116 node has based on any external trust bootstrap or predefined 1117 relation a node has formed with a given certificate. 1119 o Cached verdicts are used to retain the last known trust state in 1120 case all nodes with configured verdicts about a given certificate 1121 have been disconnected or turned off. 1123 o The Neutral verdict is used to announce a new node intending to 1124 join the network so a final verdict for it can be found. 1126 The current effective trust verdict for any certificate is defined as 1127 the one with the highest priority within the set of trust verdicts 1128 announced for the certificate in the DNCP network. A node MUST be 1129 trusted for participating in the DNCP network if and only if the 1130 current effective trust verdict for its own certificate or any one in 1131 its certificate hierarchy is (Cached or Configured) Trust and none of 1132 the certificates in its hierarchy have an effective trust verdict of 1133 (Cached or Configured) Distrust. In case a node has a configured 1134 verdict, which is different from the current effective trust verdict 1135 for a certificate, the current effective trust verdict takes 1136 precedence in deciding trustworthiness. Despite that, the node still 1137 retains and announces its configured verdict. 1139 8.3.2. Trust Cache 1141 Each node SHOULD maintain a trust cache containing the current 1142 effective trust verdicts for all certificates currently announced in 1143 the DNCP network. This cache is used as a backup of the last known 1144 state in case there is no node announcing a configured verdict for a 1145 known certificate. It SHOULD be saved to a non-volatile memory at 1146 reasonable time intervals to survive a reboot or power outage. 
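The trust rule of Section 8.3.1 (a node is trusted if and only if some certificate in its hierarchy has an effective verdict of Trust and none has an effective verdict of Distrust) can be illustrated with a short Python sketch. The function name node_is_trusted and the representation of the hierarchy as a list of effective verdict numbers (None meaning that no verdict exists) are assumptions of this sketch.

```python
from typing import Optional, Sequence

TRUST = {1, 3}     # Cached Trust, Configured Trust
DISTRUST = {2, 4}  # Cached Distrust, Configured Distrust


def node_is_trusted(hierarchy: Sequence[Optional[int]]) -> bool:
    """Decide trust from the effective verdicts of a certificate hierarchy.

    hierarchy holds one effective verdict per certificate, starting with
    the node's own certificate; None means no verdict is announced.
    """
    has_trust = any(v in TRUST for v in hierarchy)
    has_distrust = any(v in DISTRUST for v in hierarchy)
    return has_trust and not has_distrust
```

For example, a node whose own certificate has no verdict but whose issuing certificate carries Configured Trust is trusted, unless any certificate in the hierarchy is distrusted.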
1148 Every time a node (re)joins the network or detects the change of an 1149 effective trust verdict for any certificate, it will synchronize its 1150 cache, i.e., store new effective trust verdicts overwriting any 1151 previously cached verdicts. Configured verdicts are stored in the 1152 cache as their respective cached counterparts. Neutral verdicts are 1153 never stored and do not override existing cached verdicts. 1155 8.3.3. Announcement of Verdicts 1157 A node SHOULD always announce any configured trust verdicts it has 1158 established by itself, and it MUST do so if announcing the configured 1159 trust verdict leads to a change in the current effective trust 1160 verdict for the respective certificate. In absence of configured 1161 verdicts, it MUST announce cached trust verdicts it has stored in its 1162 trust cache, if one of the following conditions applies: 1164 o The stored trust verdict is Cached Trust and the current effective 1165 trust verdict for the certificate is Neutral or does not exist. 1167 o The stored trust verdict is Cached Distrust and the current 1168 effective trust verdict for the certificate is Cached Trust. 1170 A node rechecks these conditions whenever it detects changes of 1171 announced trust verdicts anywhere in the network. 1173 Upon encountering a node with a hierarchy of certificates for which 1174 there is no effective trust verdict, a node adds a Neutral Trust- 1175 Verdict-TLV to its node data for all certificates found in the 1176 hierarchy, and publishes it until an effective trust verdict 1177 different from Neutral can be found for any of the certificates, or a 1178 reasonable amount of time (10 minutes is suggested) with no reaction 1179 and no further authentication attempts has passed. Such trust 1180 verdicts SHOULD also be limited in rate and number to prevent denial- 1181 of-service attacks. 
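The cache synchronization rule of Section 8.3.2 and the two conditions for announcing a cached verdict might be sketched in Python as follows. The function names, the dictionary keyed by certificate fingerprint, and the use of None for "no effective verdict exists" are assumptions of this sketch; should_announce_cached only covers the case in which the node has no configured verdict for the certificate.

```python
from typing import Dict, Optional

NEUTRAL, CACHED_TRUST, CACHED_DISTRUST = 0, 1, 2
CONFIGURED_TRUST, CONFIGURED_DISTRUST = 3, 4

# Configured verdicts are stored as their cached counterparts.
_TO_CACHED = {CONFIGURED_TRUST: CACHED_TRUST,
              CONFIGURED_DISTRUST: CACHED_DISTRUST}


def sync_cache(cache: Dict[bytes, int], fingerprint: bytes,
               effective: Optional[int]) -> None:
    """Store a new effective verdict, overwriting any cached one."""
    if effective is None or effective == NEUTRAL:
        return  # Neutral is never stored and never overrides the cache
    cache[fingerprint] = _TO_CACHED.get(effective, effective)


def should_announce_cached(stored: int, effective: Optional[int]) -> bool:
    """Check whether a stored cached verdict has to be announced."""
    if stored == CACHED_TRUST:
        return effective is None or effective == NEUTRAL
    if stored == CACHED_DISTRUST:
        return effective == CACHED_TRUST
    return False
```

A node would re-evaluate should_announce_cached whenever the announced verdicts in the network change.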
1183 Trust verdicts are announced using Trust-Verdict TLVs: 1185 0 1 2 3 1186 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1187 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1188 | Type: Trust-Verdict (10) | Length: > 36 | 1189 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1190 | Verdict | (reserved) | 1191 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1192 | | 1193 | | 1194 | | 1195 | SHA-256 Fingerprint | 1196 | | 1197 | | 1198 | | 1199 | | 1200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1201 | Common Name | 1203 Verdict represents the numerical index of the trust verdict. 1205 (reserved) is reserved for future additions and MUST be set to 0 1206 when creating TLVs and ignored when parsing them. 1208 SHA-256 Fingerprint contains the SHA-256 [RFC6234] hash value of 1209 the certificate in DER-format. 1211 Common Name contains the variable-length (1-64 bytes) common name 1212 of the certificate. 1214 8.3.4. Bootstrap Ceremonies 1216 The following non-exhaustive list of methods describes possible ways 1217 to establish trust relationships between DNCP nodes and node 1218 certificates. Trust establishment is a two-way process in which the 1219 existing network must trust the newly added node and the newly added 1220 node must trust at least one of its peer nodes. It is therefore 1221 necessary that both the newly added node and an already trusted node 1222 perform such a ceremony to successfully introduce a node into the 1223 DNCP network. In all cases an administrator MUST be provided with 1224 external means to identify the node belonging to a certificate based 1225 on its fingerprint and a meaningful common name. 1227 8.3.4.1. 
Trust by Identification 1229 A node implementing certificate-based trust MUST provide an interface 1230 to retrieve the current set of effective trust verdicts, fingerprints 1231 and names of all certificates currently known and set configured 1232 trust verdicts to be announced. Alternatively it MAY provide a 1233 companion DNCP node or application with these capabilities with which 1234 it has a pre-established trust relationship. 1236 8.3.4.2. Preconfigured Trust 1238 A node MAY be preconfigured to trust a certain set of node or CA 1239 certificates. However such trust relationships MUST NOT result in 1240 unwanted or unrelated trust for nodes not intended to be run inside 1241 the same network (e.g., all other devices by the same manufacturer). 1243 8.3.4.3. Trust on Button Press 1245 A node MAY provide a physical or virtual interface to put one or more 1246 of its internal network interfaces temporarily into a mode in which 1247 it trusts the certificate of the first DNCP node it can successfully 1248 establish a connection with. 1250 8.3.4.4. Trust on First Use 1252 A node which is not associated with any other DNCP node MAY trust the 1253 certificate of the first DNCP node it can successfully establish a 1254 connection with. This method MUST NOT be used when the node has 1255 already associated with any other DNCP node. 1257 9. DNCP Profile-Specific Definitions 1259 Each DNCP profile MUST specify the following aspects: 1261 o Unicast and optionally multicast transport protocol(s) to be used. 1262 If multicast-based node and status discovery is desired, a 1263 datagram-based transport supporting multicast has to be available. 1265 o How the chosen transport(s) are secured: Not at all, optionally or 1266 always with the TLS scheme defined here using one or more of the 1267 methods, or with something else. 
If the links with DNCP nodes can 1268 be sufficiently secured or isolated, it is possible to run DNCP in 1269 a secure manner without using any form of authentication or 1270 encryption. 1272 o Transport protocols' parameters such as port numbers to be used, 1273 or multicast address to be used. Unicast, multicast, and secure 1274 unicast may each require different parameters, if applicable. 1276 o When receiving TLVs, what sort of TLVs are additionally ignored 1277 (as specified in Section 4.4), e.g., for security reasons. A DNCP 1278 profile may safely define the following DNCP TLVs to be 1279 ignored: 1281 * Anything received over multicast, except Node Endpoint TLV 1282 (Section 7.2.1) and Network State TLV (Section 7.2.2). 1284 * Any TLVs received over unreliable unicast or multicast at too 1285 high a rate; Trickle will ensure eventual convergence given the 1286 rate slows down at some point. 1288 o How to deal with node identifier collision as described in 1289 Section 4.4. Main options are either for one or both nodes to 1290 assign new node identifiers to themselves, or to notify someone 1291 about a fatal error condition in the DNCP network. 1293 o Imin, Imax and k ranges to be suggested for implementations to be 1294 used in the Trickle algorithm. The Trickle algorithm does not 1295 require these to be the same across all implementations for it to 1296 work, but similar orders of magnitude help implementations of a 1297 DNCP profile to behave more consistently and facilitate 1298 estimation of lower and upper bounds for convergence behavior of 1299 the network. 1301 o Hash function H(x) to be used, and how many bits of the output are 1302 actually used. The chosen hash function is used to handle both 1303 hashing of node specific data, and network state hash, which is a 1304 hash of node specific data hashes. SHA-256 defined in [RFC6234] 1305 is the recommended default choice, but a non-cryptographic hash 1306 function could be used as well.
1308 o DNCP_NODE_IDENTIFIER_LENGTH: The fixed length of a node identifier 1309 (in bytes). 1311 o Whether to send keep-alives, and if so, whether per-endpoint 1312 (requires multicast transport), or per-peer. Keep-alives also have 1313 associated parameters: 1315 * DNCP_KEEPALIVE_INTERVAL: How often keep-alives are to be sent 1316 by default (if enabled). 1318 * DNCP_KEEPALIVE_MULTIPLIER: How many multiples of the 1319 DNCP_KEEPALIVE_INTERVAL (or peer-supplied keep-alive interval 1320 value) may elapse without hearing from a node while it is still 1321 considered valid. This is just a default used in absence of any other 1322 configuration information, or particular per-endpoint 1323 configuration. 1325 10. Security Considerations 1327 DNCP-based protocols may use multicast to indicate DNCP state changes 1328 and for keep-alive purposes. However, no actual published data TLVs 1329 will be sent across that channel. Therefore an attacker may only 1330 learn hash values of the state within DNCP and may be able to trigger 1331 unicast synchronization attempts between nodes on a local link this 1332 way. A DNCP node MUST therefore rate-limit its reactions to 1333 multicast packets. 1335 When using DNCP to bootstrap a network, PKI based solutions may have 1336 issues when validating certificates due to potentially unavailable 1337 accurate time, or due to inability to use the network to either check 1338 Certificate Revocation Lists or perform on-line validation. 1340 The Certificate-based trust consensus mechanism defined in this 1341 document allows for a consenting revocation; however, in case of a 1342 compromised device the trust cache may be poisoned before the actual 1343 revocation happens, allowing the distrusted device to rejoin the 1344 network using a different identity. Stopping such an attack might 1345 require physical intervention and flushing of the trust caches. 1347 11.
IANA Considerations

   IANA should set up a registry for the (decimal 16-bit) "DNCP TLV
   Types" under "Distributed Node Consensus Protocol (DNCP)", with the
   following initial contents: ([RFC Editor: please remove] ideally as
   http://www.iana.org/assignments/dncp-registry)

      0: Reserved

      1: Request network state

      2: Request node state

      3: Node endpoint

      4: Network state

      5: Node state

      6: Reserved (was: Custom)

      7: Reserved (was: Fragment count)

      8: Peer

      9: Keep-alive interval

      10: Trust-Verdict

      11-31: Free - policy of 'standards action' should be used

      32-511: Reserved for per-DNCP profile use

      512-767: Free - policy of 'standards action' should be used

      768-1023: Private use

      1024-65535: Reserved for future protocol evolution (for example,
      DNCP version 2)

12. References

12.1. Normative references

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6206]  Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko,
              "The Trickle Algorithm", RFC 6206,
              DOI 10.17487/RFC6206, March 2011.

   [RFC6347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
              Security Version 1.2", RFC 6347, DOI 10.17487/RFC6347,
              January 2012.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer
              Security (TLS) Protocol Version 1.2", RFC 5246,
              DOI 10.17487/RFC5246, August 2008.

   [RFC6234]  Eastlake 3rd, D. and T. Hansen, "US Secure Hash
              Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234,
              DOI 10.17487/RFC6234, May 2011.

12.2. Informative references

   [RFC3493]  Gilligan, R., Thomson, S., Bound, J., McCann, J., and W.
              Stevens, "Basic Socket Interface Extensions for IPv6",
              RFC 3493, DOI 10.17487/RFC3493, February 2003.

Appendix A.
Alternative Modes of Operation

   Beyond what is described in the main text, the protocol allows for
   other uses. These are provided as examples.

A.1. Read-only Operation

   If a node uses just a single endpoint and does not need to publish
   any TLVs, full DNCP node functionality is not required. Such a
   limited node can acquire and maintain a view of the TLV space by
   implementing the processing logic as specified in Section 4.4.
   Such a node would not need Trickle, peer-maintenance or even
   keep-alives at all, as the DNCP nodes' use of them would guarantee
   eventual receipt of network state hashes, and synchronization of
   node data, even in the presence of unreliable transport.

A.2. Forwarding Operation

   If a node with a pair of endpoints does not need to publish any
   TLVs, it can detect (for example) the node with the highest node
   identifier on each of the endpoints (if any). Any TLVs received
   from one of them would be forwarded verbatim as unicast to the
   other node with the highest node identifier.

   Any tinkering with the TLVs would remove the guarantees of this
   scheme working; however, passive monitoring would obviously be
   fine. This type of simple forwarding cannot be chained, as it does
   not send anything proactively.

Appendix B. Some Questions and Answers [RFC Editor: please remove]

   Q: 32-bit endpoint id?

   A: Here, it would save 32 bits per peer if it were 16 bits (and
   less is not realistic). However, TLVs defined elsewhere would not
   seem to even gain that much on average. 32 bits is also used for
   ifindex in various operating systems, making for a simpler
   implementation.

   Q: Why have topology information at all?

   A: It is an alternative to the more traditional seq#/TTL-based
   flooding schemes. In steady state, there is no need to, e.g.,
   re-publish every now and then.

Appendix C.
Changelog [RFC Editor: please remove]

   draft-ietf-homenet-dncp-09:

   o  Reserved 1024+ TLV types for future versions (=versioning
      mechanism); private use section moved from 192-255 to 512-767.

   o  Added applicability statement and clarified some text based on
      reviews.

   draft-ietf-homenet-dncp-08:

   o  Removed fragmentation as it is somewhat underspecified and
      unimplemented. It may be specified in some future extension
      draft or new version of DNCP.

   o  Added generic sub-TLV extensibility mechanism.

   draft-ietf-homenet-dncp-06:

   o  Removed custom TLV.

   o  Made keep-alive multipliers a local implementation choice;
      profiles just provide guidance on a sane default value.

   o  Removed the DNCP_GRACE_INTERVAL as it is really an
      implementation choice.

   o  Simplified the suggested structures in the data model.

   o  Reorganized the document and provided an overview section.

   draft-ietf-homenet-dncp-04:

   o  Added mandatory rate limiting for network state requests, and an
      optional, slightly faster convergence mechanism that includes
      the current local network state in the remote network state
      requests.

   draft-ietf-homenet-dncp-03:

   o  Renamed connection -> endpoint.

   o  !!! Backwards incompatible change: Renumbered TLVs, and got rid
      of node data TLV; instead, node data TLV's contents are
      optionally within node state TLV.

   draft-ietf-homenet-dncp-02:

   o  Changed DNCP "messages" into series of TLV streams, allowing
      optimized round-trip saving synchronization.

   o  Added fragmentation support for bigger node data and for
      chunking in the absence of reliable L2 and L3 fragmentation.

   draft-ietf-homenet-dncp-01:

   o  Fixed keep-alive semantics to also consider unicast requests as
      updates of the most recently consistent timestamp, and added a
      proactive unicast request to ensure that even inconsistent
      keep-alive messages eventually trigger a consistency timestamp
      update.

   o  Facilitated (simple) read-only clients by making Node Connection
      TLV optional if just using DNCP for read-only purposes.

   o  Added text describing how to deal with "dense" networks, but
      left actual numbers and mechanics up to DNCP profiles and
      (local) configurations.

   draft-ietf-homenet-dncp-00: Split from pre-version of
   draft-ietf-homenet-hncp-03 generic parts. Changes that affect
   implementations:

   o  TLVs were renumbered.

   o  TLV length does not include header (=-4). This facilitates,
      e.g., use of DHCPv6 option parsing libraries (same encoding),
      and reduces complexity (no need to handle error values of length
      less than 4).

   o  Trickle is reset only when the locally calculated network state
      hash changes, not when a different remote network state hash is
      seen. This prevents, e.g., attacks in which a single multicast
      packet forces a Trickle reset on every interface of every node
      on a link.

   o  Instead of 'ping', use 'keep-alive' (optional) for dead peer
      detection. Different message used!

Appendix D. Draft Source [RFC Editor: please remove]

   As usual, this draft is available at
   https://github.com/fingon/ietf-drafts/ in source format (with nice
   Makefile too). Feel free to send comments and/or pull requests if
   and when you have changes to it!

Appendix E. Acknowledgements

   Thanks to Ole Troan, Pierre Pfister, Mark Baugher, Mark Townsley,
   Juliusz Chroboczek, Jiazi Yi, Mikael Abrahamsson, Brian Carpenter,
   Thomas Clausen, DENG Hui and Margaret Cullen for their
   contributions to the draft.

Authors' Addresses

   Markus Stenberg
   Independent
   Helsinki 00930
   Finland

   Email: markus.stenberg@iki.fi


   Steven Barth
   Independent
   Halle 06114
   Germany

   Email: cyrus@openwrt.org