idnits 2.17.1 draft-ietf-homenet-dncp-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 169 has weird spacing: '...ntifier an o...' == Line 226 has weird spacing: '...e trust the ...' == Line 230 has weird spacing: '...y graph the...' -- The document date (July 3, 2015) is 3191 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 6347 (Obsoleted by RFC 9147) ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446) Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Homenet Working Group M. Stenberg 3 Internet-Draft 4 Intended status: Standards Track S. 
Barth 5 Expires: January 4, 2016 6 July 3, 2015 8 Distributed Node Consensus Protocol 9 draft-ietf-homenet-dncp-07 11 Abstract 13 This document describes the Distributed Node Consensus Protocol 14 (DNCP), a generic state synchronization protocol which uses Trickle 15 and Merkle trees. DNCP leaves some details unspecified or provides 16 alternative options. Therefore, only profiles which specify those 17 missing parts define actual implementable DNCP-based protocols. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at http://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on January 4, 2016. 36 Copyright Notice 38 Copyright (c) 2015 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 54 2. Terminology . . . . . . . 
. . . . . . . . . . . . . . . . . . 4 55 2.1. Requirements Language . . . . . . . . . . . . . . . . . . 6 56 3. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 6 57 4. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . 7 58 4.1. Merkle Tree . . . . . . . . . . . . . . . . . . . . . . . 7 59 4.2. Data Transport . . . . . . . . . . . . . . . . . . . . . 7 60 4.3. Trickle-Driven Status Updates . . . . . . . . . . . . . . 8 61 4.4. Processing of Received TLVs . . . . . . . . . . . . . . . 9 62 4.5. Adding and Removing Peers . . . . . . . . . . . . . . . . 11 63 4.6. Data Liveliness Validation . . . . . . . . . . . . . . . 12 64 5. Data Model . . . . . . . . . . . . . . . . . . . . . . . . . 13 65 6. Optional Extensions . . . . . . . . . . . . . . . . . . . . . 14 66 6.1. Keep-Alives . . . . . . . . . . . . . . . . . . . . . . . 14 67 6.1.1. Data Model Additions . . . . . . . . . . . . . . . . 15 68 6.1.2. Per-Endpoint Periodic Keep-Alives . . . . . . . . . . 15 69 6.1.3. Per-Peer Periodic Keep-Alives . . . . . . . . . . . . 16 70 6.1.4. Received TLV Processing Additions . . . . . . . . . . 16 71 6.1.5. Neighbor Removal . . . . . . . . . . . . . . . . . . 16 72 6.2. Support For Dense Broadcast Links . . . . . . . . . . . . 16 73 6.3. Node Data Fragmentation . . . . . . . . . . . . . . . . . 17 74 7. Type-Length-Value Objects . . . . . . . . . . . . . . . . . . 18 75 7.1. Request TLVs . . . . . . . . . . . . . . . . . . . . . . 19 76 7.1.1. Request Network State TLV . . . . . . . . . . . . . . 19 77 7.1.2. Request Node State TLV . . . . . . . . . . . . . . . 19 78 7.2. Data TLVs . . . . . . . . . . . . . . . . . . . . . . . . 19 79 7.2.1. Node Endpoint TLV . . . . . . . . . . . . . . . . . . 19 80 7.2.2. Network State TLV . . . . . . . . . . . . . . . . . . 20 81 7.2.3. Node State TLV . . . . . . . . . . . . . . . . . . . 20 82 7.3. Data TLVs within Node State TLV . . . . . . . . . . . . . 21 83 7.3.1. Fragment Count TLV . . . . . . . . . . . . . 
. . . . 21 84 7.3.2. Neighbor TLV . . . . . . . . . . . . . . . . . . . . 22 85 7.3.3. Keep-Alive Interval TLV . . . . . . . . . . . . . . . 22 86 8. Security and Trust Management . . . . . . . . . . . . . . . . 23 87 8.1. Pre-Shared Key Based Trust Method . . . . . . . . . . . . 23 88 8.2. PKI Based Trust Method . . . . . . . . . . . . . . . . . 23 89 8.3. Certificate Based Trust Consensus Method . . . . . . . . 23 90 8.3.1. Trust Verdicts . . . . . . . . . . . . . . . . . . . 24 91 8.3.2. Trust Cache . . . . . . . . . . . . . . . . . . . . . 25 92 8.3.3. Announcement of Verdicts . . . . . . . . . . . . . . 25 93 8.3.4. Bootstrap Ceremonies . . . . . . . . . . . . . . . . 26 94 9. DNCP Profile-Specific Definitions . . . . . . . . . . . . . . 27 95 10. Security Considerations . . . . . . . . . . . . . . . . . . . 29 96 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 29 97 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 30 98 12.1. Normative references . . . . . . . . . . . . . . . . . . 30 99 12.2. Informative references . . . . . . . . . . . . . . . . . 30 100 Appendix A. Alternative Modes of Operation . . . . . . . . . . . 30 101 A.1. Read-only Operation . . . . . . . . . . . . . . . . . . . 30 102 A.2. Forwarding Operation . . . . . . . . . . . . . . . . . . 31 103 Appendix B. Some Questions and Answers [RFC Editor: please 104 remove] . . . . . . . . . . . . . . . . . . . . . . 31 105 Appendix C. Changelog [RFC Editor: please remove] . . . . . . . 31 106 Appendix D. Draft Source [RFC Editor: please remove] . . . . . . 33 107 Appendix E. Acknowledgements . . . . . . . . . . . . . . . . . . 33 108 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 33 110 1. 
Introduction 112 DNCP is designed to provide a way for each participating node to 113 publish a set of TLV (Type-Length-Value) tuples, and to provide a 114 shared and common view about the data published by every currently or 115 recently bidirectionally reachable DNCP node in a network. 117 For state synchronization a Merkle tree is used. It is formed by 118 first calculating a hash for the dataset, called node data, published 119 by each node, and then calculating another hash over those node data 120 hashes. The single resulting hash, called network state hash, is 121 transmitted using the Trickle algorithm [RFC6206] to ensure that all 122 nodes share the same view of the current state of the published data 123 within the network. The use of Trickle with only short network state 124 hashes sent infrequently (in steady state) makes DNCP very thrifty 125 when updates happen rarely. 127 For maintaining liveliness of the topology and the data within it, a 128 combination of Trickled network state, keep-alives, and "other" means 129 of ensuring reachability are used. The core idea is that if every 130 node ensures its neighbors are present, transitively, the whole 131 network state also stays up-to-date. 133 DNCP is most suitable for data that changes only infrequently to gain 134 the maximum benefit from using Trickle. As the network of nodes, or 135 the rate of data changes grows over a given time interval, Trickle is 136 eventually used less and less and the benefit of using DNCP 137 diminishes. In these cases Trickle just provides extra complexity 138 within the specification and little added value. If constant rapid 139 state changes are needed, the preferable choice is to use an 140 additional point-to-point channel whose address or locator is 141 published using DNCP. 143 2. Terminology 145 DNCP profile a definition of the set of rules and values 146 defining the behavior of a fully specified, 147 implementable protocol which uses DNCP. 
The DNCP 148 profile specifies the transport method to be used, 149 which optional parts of the DNCP specification are 150 required by that particular protocol, and various 151 parameters and optional behaviors. In this 152 document any parameter that a DNCP profile 153 specifies is prefixed with DNCP_. Contents of a 154 DNCP profile are specified in Section 9. 156 DNCP-based a protocol which provides a DNCP profile, and 157 protocol potentially much more, e.g., protocol-specific TLVs 158 and guidance on how they should be used. 159 DNCP node a single node which runs a DNCP-based protocol. 161 Link a link-layer medium over which directly connected 162 nodes can communicate. 163 DNCP network a set of DNCP nodes running the same DNCP-based 164 protocol. The set consists of nodes that have 165 discovered each other using the transport method 166 defined in the DNCP profile, via multicast on local 167 links, and/or by using unicast communication. 169 Node identifier an opaque fixed-length identifier consisting of 170 DNCP_NODE_IDENTIFIER_LENGTH bytes which uniquely 171 identifies a DNCP node within a DNCP network. 173 Interface a node's attachment to a particular link. 175 Address As DNCP itself is relatively transport agnostic, an 176 address in this specification denotes just 177 something that identifies an endpoint used by the 178 transport protocol employed by a DNCP-based 179 protocol. In the case of an IPv6 UDP transport, an 180 address in this specification refers to a tuple 181 (IPv6 address, UDP port). 182 Endpoint a locally configured communication endpoint of a 183 DNCP node, such as a network socket. It is either 184 bound to an Interface for multicast and unicast 185 communication, or configured for explicit unicast 186 communication with a predefined set of remote 187 addresses. Endpoints are usually in one of the 188 transport modes specified in Section 4.2. 
190 Endpoint a 32-bit opaque value, which identifies a 191 identifier particular endpoint of a particular DNCP node. The 192 value 0 is reserved for DNCP and DNCP-based 193 protocol purposes and not used to identify an 194 actual endpoint. This definition is in sync with 195 the interface index definition in [RFC3493], as the 196 non-zero small positive integers should comfortably 197 fit within 32 bits. 199 Peer another DNCP node with which a DNCP node 200 communicates using a particular local and remote 201 endpoint pair. 203 Node data a set of TLVs published and owned by a node in the 204 DNCP network. Other nodes pass it along as-is, even 205 if they cannot fully interpret it. 207 Node state a set of metadata attributes for node data. It 208 includes a sequence number for versioning, a hash 209 value for comparing equality of stored node data, 210 and a timestamp indicating the time passed since 211 its last publication. The hash function and the 212 length of the hash value are defined in the DNCP 213 profile. 215 Network state a hash value which represents the current state of 216 hash the network. The hash function and the length of 217 the hash value are defined in the DNCP profile. 218 Whenever a node is added, removed or updates its 219 published node data this hash value changes as 220 well. For calculation, please see Section 4.1. 222 Trust verdict a statement about the trustworthiness of a 223 certificate announced by a node participating in 224 the certificate based trust consensus mechanism. 226 Effective trust the trust verdict with the highest priority within 227 verdict the set of trust verdicts announced for the 228 certificate in the DNCP network. 230 Topology graph the undirected graph of DNCP nodes produced by 231 retaining only bidirectional peer relationships 232 between nodes. 234 2.1. 
Requirements Language 236 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 237 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 238 "OPTIONAL" in this document are to be interpreted as described in RFC 239 2119 [RFC2119]. 241 3. Overview 243 DNCP operates primarily using unicast exchanges between nodes, and 244 may use multicast for Trickle-based shared state dissemination and 245 topology discovery. If used in pure unicast mode with unreliable 246 transport, Trickle is also used between peers. 248 DNCP discovers the topology of its nodes and maintains the liveliness 249 of published node data by ensuring that the publishing node was - at 250 least recently - bidirectionally reachable. This is determined, 251 e.g., by a recent and consistent multicast or unicast TLV exchange 252 with its peers. New potential peers can be discovered autonomously 253 on multicast-enabled links; alternatively, their addresses may be 254 manually configured, or they may be found by some other means defined 255 in a later specification. 257 A Merkle tree is maintained by each node to represent the state of 258 all currently reachable nodes, and the Trickle algorithm is used to 259 trigger synchronization. The need to check neighboring nodes for 260 state changes is thereby determined by comparing the current root of 261 their respective trees, i.e., their individually calculated network 262 state hashes. 264 Before joining a DNCP network, a node starts with a Merkle tree (and 265 therefore a calculated network state hash) consisting only of the 266 node itself. It then announces said hash by means of the Trickle 267 algorithm on all its configured endpoints. 269 When an update is detected by a node (e.g., by receiving a different 270 network state hash from a peer) the originator of the event is 271 requested to provide a list of the state of all nodes, i.e., all the 272 information it uses to calculate its own Merkle tree. 
The node uses 273 the list to determine whether its own information is outdated and - 274 if necessary - requests the actual node data that has changed. 276 Whenever a node's local copy of any node data and its Merkle tree are 277 updated (e.g., due to its own or another node's node state changing 278 or due to a peer being added or removed) its Trickle instances are 279 reset which eventually causes any update to be propagated to all of 280 its peers. 282 4. Operation 284 4.1. Merkle Tree 286 Each DNCP node maintains a Merkle tree of height 1 to manage state 287 updates of individual DNCP nodes, the leaves of the tree, and the 288 network as a whole, the root of the tree. 290 Each leaf represents one recently bidirectionally reachable DNCP node 291 (see Section 4.6), and is represented by a tuple consisting of the 292 node's sequence number in network byte order concatenated with the 293 hash-value of the node's ordered node data published in the Node 294 State TLV (Section 7.2.3). These leaves are ordered in ascending 295 order of the respective node identifiers. The root of the tree - the 296 network state hash - is represented by the hash-value calculated over 297 all such leaf tuples concatenated in order. It is used to determine 298 whether the view of the network of two or more nodes is consistent 299 and shared. 301 The leaves and the root network state hash are updated on-demand and 302 whenever any locally stored per-node state changes. This includes 303 local unidirectional reachability encoded in the published Neighbor 304 TLVs (Section 7.3.2) and - when combined with remote data - results 305 in awareness of bidirectional reachability changes. 307 4.2. Data Transport 309 DNCP has relatively few requirements for the underlying transport; it 310 requires some way of transmitting either unicast datagram or stream 311 data to a peer and, if used in multicast mode, a way of sending 312 multicast datagrams. 
As multicast is used only to identify potential 313 new DNCP nodes and to send status messages which merely notify that a 314 unicast exchange should be triggered, the multicast transport does 315 not have to be secured. If unicast security is desired and one of 316 the built-in security methods is to be used, support for some TLS- 317 derived transport scheme - such as TLS [RFC5246] on top of TCP or 318 DTLS [RFC6347] on top of UDP - is also required. A specific 319 definition of the transport(s) in use and their parameters MUST be 320 provided by the DNCP profile. 322 TLVs are sent across the transport as is, and they SHOULD be sent 323 together where, e.g., MTU considerations do not recommend sending 324 them in multiple batches. TLVs in general are handled individually 325 and statelessly, with one exception: To form bidirectional peer 326 relationships DNCP requires identification of the endpoints used for 327 communication. As bidirectional peer relationships are required for 328 validating liveliness of published node data as described in 329 Section 4.6, a DNCP node MUST send a Node Endpoint TLV (Section 7.2.1). 331 When it is sent varies, depending on the underlying transport, but 332 conceptually it should be available whenever processing a Network 333 State TLV: 335 o If using a stream transport, the TLV MUST be sent at least once, 336 and it SHOULD be sent only once. 338 o If using a datagram transport, it MUST be included in every 339 datagram that also contains a Network State TLV (Section 7.2.2) 340 and MUST be located before any such TLV. It SHOULD also be 341 included in any other datagram, to speed up initial peer 342 detection. 344 Given the assorted transport options as well as potential endpoint 345 configuration, a DNCP endpoint may be used in various transport 346 modes: 348 Unicast: 350 * If only reliable unicast transport is employed, Trickle is not 351 used at all. 
Where Trickle reset has been specified, a single 352 Network State TLV (Section 7.2.2) is sent instead to every 353 unicast peer. Additionally, recently changed Node State TLVs 354 (Section 7.2.3) MAY be included. 356 * If only unreliable unicast transport is employed, Trickle state 357 is kept per each peer and it is used to send Network State TLVs 358 every now and then, as specified in Section 4.3. 360 Multicast+Unicast: If multicast datagram transport is available on 361 an endpoint, Trickle state is only maintained for the endpoint as 362 a whole. It is used to send Network State TLVs every now and 363 then, as specified in Section 4.3. Additionally, per-endpoint 364 keep-alives MAY be defined in the DNCP profile, as specified in 365 Section 6.1.2. 367 MulticastListen+Unicast: Just like Unicast, except multicast 368 transmissions are listened to in order to detect changes of the 369 highest node identifier. This mode is used only if the DNCP 370 profile supports dense broadcast link optimization (Section 6.2). 372 4.3. Trickle-Driven Status Updates 374 The Trickle algorithm has 3 parameters: Imin, Imax and k. Imin and 375 Imax represent the minimum and maximum values for I, which is the 376 time interval during which at least k Trickle updates must be seen on 377 an endpoint to prevent local state transmission. The actual 378 suggested Trickle algorithm parameters are DNCP profile specific, as 379 described in Section 9. 381 The Trickle state for all Trickle instances is considered 382 inconsistent and reset if and only if the locally calculated network 383 state hash changes. This occurs either due to a change in the local 384 node's own node data, or due to receipt of more recent data from 385 another node. A node MUST NOT reset its Trickle state merely based 386 on receiving a Network State TLV (Section 7.2.2) with a network state 387 hash which is different from its locally calculated one. 
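The reset rule above can be sketched with a minimal, non-normative Trickle (RFC 6206) instance in Python. The class and method names are hypothetical, and the parameter values are placeholders, since the actual Imin, Imax, and k are DNCP profile specific (Section 9); Imax is also simplified to a direct maximum interval length rather than a number of doublings. The key points illustrated are that I doubles up to Imax at each interval end, transmission is suppressed once k consistent updates are heard, and reset() is invoked if and only if the locally calculated network state hash changes:

```python
import random

class TrickleInstance:
    """Non-normative sketch of one Trickle (RFC 6206) instance.
    Parameter defaults are placeholders; real values come from the
    DNCP profile (Section 9)."""

    def __init__(self, imin=0.25, imax=32.0, k=1):
        self.imin, self.imax, self.k = imin, imax, k
        self.reset()

    def reset(self):
        # Invoked if and only if the locally calculated network state
        # hash changes (Section 4.3): I goes back to Imin.
        self.i = self.imin
        self._begin_interval()

    def _begin_interval(self):
        self.c = 0  # consistent transmissions heard this interval
        # Random transmission time t in [I/2, I].
        self.t = random.uniform(self.i / 2, self.i)

    def hear_consistent(self):
        # A received Network State TLV matching the local hash.
        self.c += 1

    def should_send(self):
        # At time t: transmit a Network State TLV only if fewer than
        # k consistent updates were heard (suppression).
        return self.c < self.k

    def interval_expired(self):
        # At interval end: double I, capped at Imax, and start over.
        self.i = min(self.i * 2, self.imax)
        self._begin_interval()
```

Note that, per the rule above, merely receiving a Network State TLV with a differing hash does not call reset(); only a change in the locally calculated hash does.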
389 Every time a particular Trickle instance indicates that an update 390 should be sent, the node MUST send a Network State TLV 391 (Section 7.2.2) if and only if: 393 o the endpoint is in Multicast+Unicast transport mode, in which case 394 the TLV MUST be sent over multicast. 396 o the endpoint is NOT in Multicast+Unicast transport mode, and the 397 unicast transport is unreliable, in which case the TLV MUST be 398 sent over unicast. 400 A (sub)set of all Node State TLVs (Section 7.2.3) MAY also be 401 included, unless it is defined as undesirable for some reason by the 402 DNCP profile, or to avoid exposure of the node state TLVs by 403 transmitting them within insecure multicast when also using secure 404 unicast. 406 4.4. Processing of Received TLVs 408 This section describes how received TLVs are processed. The DNCP 409 profile may specify when to ignore particular TLVs, e.g., to modify 410 security properties - see Section 9 for what may be safely defined to 411 be ignored in a profile. Any 'reply' mentioned in the steps below 412 denotes sending of the specified TLV(s) over unicast to the 413 originator of the TLV being processed. If the TLV being replied to 414 was received via multicast and it was sent to a link with shared 415 bandwidth, the reply SHOULD be delayed by a random timespan in [0, 416 Imin/2], to avoid potential simultaneous replies that may cause 417 problems on some links. Sending of replies MAY also be rate-limited 418 or omitted for a short period of time by an implementation. However, 419 an implementation MUST eventually reply to similar repeated requests, 420 as otherwise state synchronization breaks. 422 A DNCP node MUST process TLVs received from any valid address, as 423 specified by the DNCP profile and the configuration of a particular 424 endpoint, whether this address is known to be the address of a 425 neighbor or not. 
This provision satisfies the needs of monitoring or 426 other host software that needs to discover the DNCP topology without 427 adding to the state in the network. 429 Upon receipt of: 431 o Request Network State TLV (Section 7.1.1): The receiver MUST reply 432 with a Network State TLV (Section 7.2.2) and a Node State TLV 433 (Section 7.2.3) for each node data used to calculate the network 434 state hash. The Node State TLVs MUST NOT contain the optional 435 node data part unless explicitly specified in the DNCP profile. 437 o Request Node State TLV (Section 7.1.2): If the receiver has node 438 data for the corresponding node, it MUST reply with a Node State 439 TLV (Section 7.2.3) for the corresponding node. The optional node 440 data part MUST be included in the TLV. 442 o Network State TLV (Section 7.2.2): If the network state hash 443 differs from the locally calculated network state hash, and the 444 receiver is unaware of any particular node state differences with 445 the sender, the receiver MUST reply with a Request Network State 446 TLV (Section 7.1.1). These replies MUST be rate limited to only 447 at most one reply per link per unique network state hash within 448 Imin. The simplest way to ensure this rate limit is keeping a 449 timestamp of the most recent request, and sending at most one 450 Request Network State TLV (Section 7.1.1) per Imin. To facilitate 451 faster state synchronization, if a Request Network State TLV is 452 sent in a reply, a local, current Network State TLV MAY also be 453 sent. 454 o Node State TLV (Section 7.2.3): 456 * If the node identifier matches the local node identifier and 457 the TLV has a greater sequence number than its current local 458 value, or the same sequence number and a different hash, the 459 node SHOULD re-publish its own node data with a sequence 460 number significantly (e.g., 1000) greater than the received 461 one, to reclaim the node identifier. 
This may occur normally 462 once due to the local node restarting and not storing the most 463 recently used sequence number. If this occurs more than once 464 or for nodes not re-publishing their own node data, the DNCP 465 profile MUST provide guidance on how to handle these situations 466 as it indicates the existence of another active node with the 467 same node identifier. 469 * If the node identifier does not match the local node 470 identifier, and one or more of the following conditions are 471 true: 473 + The local information is outdated for the corresponding node 474 (local sequence number is less than that within the TLV). 476 + The local information is potentially incorrect (local 477 sequence number matches but the node data hash differs). 479 + There is no data for that node altogether. 481 Then: 483 + If the TLV contains the Node Data field, it SHOULD also be 484 verified by ensuring that the locally calculated H(Node 485 Data) matches the content of the H(Node Data) field within 486 the TLV. If they differ, the TLV SHOULD be ignored and not 487 processed further. 489 + If the TLV does not contain the Node Data field, and the 490 H(Node Data) field within the TLV differs from the local 491 node data hash for that node (or there is none), the 492 receiver MUST reply with a Request Node State TLV 493 (Section 7.1.2) for the corresponding node. 495 + Otherwise the receiver MUST update its locally stored state 496 for that node (node data based on Node Data field if 497 present, sequence number and relative time) to match the 498 received TLV. 500 For comparison purposes of the sequence number, a looping 501 comparison function MUST be used to avoid problems in case of 502 overflow. The comparison function a < b <=> (a - b) % 2^32 & 2^31 503 != 0 is RECOMMENDED unless the DNCP profile defines another. 505 o Any other TLV: TLVs not recognized by the receiver MUST be 506 silently ignored. 
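The RECOMMENDED looping comparison function above can be expressed as a short non-normative sketch (Python is used here purely for exposition; the helper names are hypothetical). It treats the 32-bit sequence number space as circular, so a node whose sequence number wraps from 2^32 - 1 back to 0 is still seen as advancing:

```python
def seq_lt(a, b):
    """Looping sequence number comparison from Section 4.4:
    a < b  <=>  ((a - b) % 2^32) & 2^31 != 0.
    Both a and b are 32-bit unsigned sequence numbers."""
    return ((a - b) % 2**32) & 2**31 != 0

def received_is_newer(stored_seq, received_seq):
    # A received Node State TLV is newer if the stored sequence number
    # is "less than" the received one under this circular ordering.
    return seq_lt(stored_seq, received_seq)
```

For example, received_is_newer(2**32 - 1, 0) holds, so a republication just after overflow is still accepted, whereas a plain integer comparison would wrongly treat the wrapped value as stale.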
508 If secure unicast transport is configured for an endpoint, any Node 509 State TLVs received over insecure multicast MUST be silently ignored. 511 4.5. Adding and Removing Peers 513 When receiving a Node Endpoint TLV (Section 7.2.1) on an endpoint 514 from an unknown peer: 516 o If received over unicast, the remote node MUST be added as a peer 517 on the endpoint and a Neighbor TLV (Section 7.3.2) MUST be created 518 for it. 520 o If received over multicast, the node MAY be sent a (possibly rate- 521 limited) unicast Request Network State TLV (Section 7.1.1). 523 If keep-alives specified in Section 6.1 are NOT sent by the peer 524 (either the DNCP profile does not specify the use of keep-alives or 525 the particular peer chooses not to send keep-alives), some other 526 existing local transport-specific means (such as Ethernet carrier- 527 detection or TCP keep-alive) MUST be employed to ensure its presence. 528 When the peer is no longer present, the Neighbor TLV and the local 529 DNCP peer state MUST be removed. 531 If the local endpoint is in the Multicast-Listen+Unicast transport 532 mode, a Neighbor TLV (Section 7.3.2) MUST NOT be published for the 533 peers not having the highest node identifier. 535 4.6. Data Liveliness Validation 537 The topology graph MUST be traversed either immediately or with a 538 small delay shorter than the DNCP profile-defined Trickle Imin, 539 whenever: 541 o A Neighbor TLV or a whole node is added or removed, or 543 o the origination time (in milliseconds) of some node's node data is 544 less than current time - 2^32 + 2^15. 546 The topology graph traversal starts with the local node marked as 547 reachable. 
Other nodes are then iteratively marked as reachable 548 using the following algorithm: A candidate not-yet-reachable node N 549 with an endpoint NE is marked as reachable if there is a reachable 550 node R with an endpoint RE that meets all of the following criteria: 552 o The origination time (in milliseconds) of R's node data is greater 553 than current time - 2^32 + 2^15. 555 o R publishes a Neighbor TLV with: 557 * Neighbor Node Identifier = N's node identifier 559 * Neighbor Endpoint Identifier = NE's endpoint identifier 561 * Endpoint Identifier = RE's endpoint identifier 563 o N publishes a Neighbor TLV with: 565 * Neighbor Node Identifier = R's node identifier 567 * Neighbor Endpoint Identifier = RE's endpoint identifier 568 * Endpoint Identifier = NE's endpoint identifier 570 The algorithm terminates when no more candidate nodes fulfilling 571 these criteria can be found. 573 DNCP nodes that have not been reachable in the most recent topology 574 graph traversal MUST NOT be used for calculation of the network state 575 hash, be provided to any applications that need to use the whole TLV 576 graph, or be provided to remote nodes. They MAY be removed 577 immediately after the topology graph traversal; however, it is 578 RECOMMENDED to keep them at least briefly to improve the speed of 579 DNCP network state convergence and to reduce the number of redundant 580 state transmissions between nodes. 582 5. Data Model 584 This section describes the local data structures a minimal 585 implementation might use. This section is provided only as a 586 convenience for the implementor. Some of the optional extensions 587 (Section 6) describe additional data requirements, and some optional 588 parts of the core protocol may also require more. 590 A DNCP node has: 592 o A data structure containing data about the most recently sent 593 Request Network State TLVs (Section 7.1.1). 
The simplest option 594 is keeping a timestamp of the most recent request (required to 595 fulfill reply rate limiting specified in Section 4.4). 597 A DNCP node has for every DNCP node in the DNCP network: 599 o Node identifier: the unique identifier of the node. The length, 600 how it is produced, and how collisions are handled, is up to the 601 DNCP profile. 603 o Node data: the set of TLV tuples published by that particular 604 node. As they are transmitted ordered (see Node State TLV 605 (Section 7.2.3) for details), maintaining the order within the 606 data structure here may be reasonable. 608 o Latest sequence number: the 32-bit sequence number that is 609 incremented any time the TLV set is published. The comparison 610 function used to compare them is described in Section 4.4. 612 o Origination time: the (estimated) time when the current TLV set 613 with the current sequence number was published. It is used to 614 populate the Milliseconds Since Origination field in a Node State 615 TLV (Section 7.2.3). Ideally it also has millisecond accuracy. 617 Additionally, a DNCP node has a set of endpoints for which DNCP is 618 configured to be used. For each such endpoint, a node has: 620 o Endpoint identifier: the 32-bit opaque value uniquely identifying 621 it within the local node. 623 o Trickle instance: the endpoint's Trickle instance with parameters 624 I, T, and c (only on an endpoint in Multicast+Unicast transport 625 mode). 627 and one (or more) of the following: 629 o Interface: the assigned local network interface. 631 o Unicast address: the DNCP node it should connect with. 633 o Range of addresses: the DNCP nodes that are allowed to connect. 635 For each remote (peer, endpoint) pair detected on a local endpoint, a 636 DNCP node has: 638 o Node identifier: the unique identifier of the peer. 640 o Endpoint identifier: the unique endpoint identifier used by the 641 peer. 
643 o Peer address: the most recently used address of the peer 644 (authenticated and authorized, if security is enabled). 646 o Trickle instance: the particular peer's Trickle instance with 647 parameters I, T, and c (only on an endpoint in Unicast mode, when 648 using an unreliable unicast transport). 650 6. Optional Extensions 652 This section specifies extensions to the core protocol that a DNCP 653 profile may specify to be used. 655 6.1. Keep-Alives 657 Trickle-driven status updates (Section 4.3) provide a mechanism for 658 detecting new peers on an endpoint, as well as for state 659 change notifications. Another mechanism may be needed to remove 660 old, no longer valid peers if the transport or lower layers do not 661 provide one. 663 If keep-alives are not specified in the DNCP profile, the rest of 664 this subsection MUST be ignored. 666 A DNCP profile MAY specify either per-endpoint or per-peer keep-alive 667 support. 669 For every endpoint that a keep-alive is specified for in the DNCP 670 profile, the endpoint-specific keep-alive interval MUST be 671 maintained. By default, it is DNCP_KEEPALIVE_INTERVAL. If a 672 different local value is preferred for any reason 673 (configuration, energy conservation, media type, ...), it can be 674 substituted instead. If a non-default keep-alive interval is used on 675 any endpoint, a DNCP node MUST publish appropriate Keep-Alive 676 Interval TLV(s) (Section 7.3.3) within its node data. 678 6.1.1. Data Model Additions 680 The following additions to the Data Model (Section 5) are needed to 681 support keep-alives: 683 For each configured endpoint that has per-endpoint keep-alives 684 enabled: 686 o Last sent: a timestamp which indicates the last time a Network 687 State TLV (Section 7.2.2) was sent over that interface.
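The "Last sent" timestamp above is what drives periodic keep-alive transmission: a keep-alive is due once nothing carrying a Network State TLV has been sent within the endpoint-specific interval. A minimal sketch of that bookkeeping in Python follows; the class and field names are illustrative, not from the spec, and the 20-second interval is an assumed profile value standing in for DNCP_KEEPALIVE_INTERVAL.

```python
import time
from dataclasses import dataclass, field

@dataclass
class EndpointKeepalive:
    # Per-endpoint keep-alive state (hypothetical names); the interval
    # defaults to the profile's DNCP_KEEPALIVE_INTERVAL (20s assumed here).
    interval: float = 20.0
    last_sent: float = field(default_factory=time.monotonic)

    def due(self, now=None):
        # A Network State TLV needs to be sent if nothing containing one
        # has been sent within the endpoint-specific keep-alive interval.
        now = time.monotonic() if now is None else now
        return now - self.last_sent >= self.interval

ep = EndpointKeepalive(interval=20.0, last_sent=0.0)
print(ep.due(now=25.0))  # True: 25s elapsed >= 20s interval
print(ep.due(now=5.0))   # False: within the interval
```

The same structure works for the per-peer variant, with one timestamp kept per (peer, endpoint) pair instead of per endpoint.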
689 For each remote (peer, endpoint) pair detected on a local endpoint, a 690 DNCP node has: 692 o Last contact timestamp: a timestamp which indicates the last time 693 a consistent Network State TLV (Section 7.2.2) was received from 694 the peer over multicast, or anything was received over unicast. 695 When adding a new peer, it is initialized to the current time. 697 o Last sent: If per-peer keep-alives are enabled, a timestamp which 698 indicates the last time a Network State TLV (Section 7.2.2) was 699 sent to that point-to-point peer. When adding a new peer, it 700 is initialized to the current time. 702 6.1.2. Per-Endpoint Periodic Keep-Alives 704 If per-endpoint keep-alives are enabled on an endpoint in 705 Multicast+Unicast transport mode, and if no traffic containing a 706 Network State TLV (Section 7.2.2) has been sent to a particular 707 endpoint within the endpoint-specific keep-alive interval, a Network 708 State TLV (Section 7.2.2) MUST be sent on that endpoint, and a new 709 Trickle transmission time 't' in [I/2, I] MUST be randomly chosen. 710 The actual sending time SHOULD be further delayed by a random 711 timespan in [0, Imin/2]. 713 6.1.3. Per-Peer Periodic Keep-Alives 715 If per-peer keep-alives are enabled on a unicast-only endpoint, and 716 if no traffic containing a Network State TLV (Section 7.2.2) has been 717 sent to a particular peer within the endpoint-specific keep-alive 718 interval, a Network State TLV (Section 7.2.2) MUST be sent to the 719 peer and a new Trickle transmission time 't' in [I/2, I] MUST be 720 randomly chosen. 722 6.1.4. Received TLV Processing Additions 724 If a TLV is received over unicast from the peer, the Last contact 725 timestamp for the peer MUST be updated. 727 On receipt of a Network State TLV (Section 7.2.2) which is consistent 728 with the locally calculated network state hash, the Last contact 729 timestamp for the peer MUST be updated. 731 6.1.5.
Neighbor Removal 733 For every peer on every endpoint, the endpoint-specific keep-alive 734 interval must be calculated by looking for Keep-Alive Interval TLVs 735 (Section 7.3.3) published by the node, and if none exist, using the 736 default value of DNCP_KEEPALIVE_INTERVAL. If the peer's last contact 737 timestamp has not been updated for at least the locally chosen, 738 potentially endpoint-specific keep-alive multiplier (default 739 DNCP_KEEPALIVE_MULTIPLIER) times the peer's endpoint-specific keep- 740 alive interval, the Neighbor TLV for that peer and the local DNCP 741 peer state MUST be removed. 743 6.2. Support For Dense Broadcast Links 745 This optimization is needed to avoid a state space explosion. Given 746 a large set of DNCP nodes publishing data on an endpoint that 747 actually uses multicast on a link, every node will add a Neighbor TLV 748 (Section 7.3.2) for each peer. While Trickle limits the amount of 749 traffic on the link in stable state to some extent, the total amount 750 of data that is added to and maintained in the DNCP network given N 751 nodes on a multicast-enabled link is O(N^2). Additionally, if per- 752 peer keep-alives are employed, there will be O(N^2) keep-alives 753 running on the link if liveliness of peers is not ensured in some 754 other way (e.g., TCP connection lifetime, layer 2 notification, per- 755 endpoint keep-alive). 757 An upper bound for the number of neighbors that are allowed for a 758 particular type of link that an endpoint in Multicast+Unicast 759 transport mode is used on SHOULD be provided by a DNCP profile, but 760 MAY also be chosen at runtime. The main consideration when selecting a 761 bound (if any) for a particular type of link should be whether it 762 supports broadcast traffic, and whether an excessively large number of 763 neighbors is likely to occur during the use of that DNCP 764 profile on that particular type of link.
If neither is likely, there 765 is little point in specifying support for this extension for that 766 particular link type. 768 If a DNCP profile does not support this extension at all, the rest of 769 this subsection MUST be ignored. This is because when this extension 770 is employed, the state within the DNCP network only contains a subset 771 of the full topology of the network. Therefore, every node must be 772 aware of the potential for it to be used in a particular DNCP profile. 774 If the specified upper bound is exceeded for some endpoint in 775 Multicast+Unicast transport mode and if the node does not have the 776 highest node identifier on the link, it SHOULD treat the endpoint as 777 a unicast endpoint connected to the node that has the highest node 778 identifier detected on the link, therefore transitioning to 779 Multicast-listen+Unicast transport mode. The nodes in Multicast- 780 listen+Unicast transport mode MUST keep listening to multicast 781 traffic both to receive messages from the node(s) still in 782 Multicast+Unicast mode and to react to nodes with a greater 783 node identifier appearing. If the highest node identifier present on 784 the link changes, the remote unicast address of the endpoints in 785 Multicast-Listen+Unicast transport mode MUST be changed. If the node 786 identifier of the local node is the highest one, the node MUST switch 787 back to, or stay in, Multicast+Unicast mode, and form peer 788 relationships with all peers as normal. 790 6.3. Node Data Fragmentation 792 A DNCP-based protocol may be required to support node data which 793 would not fit the maximum size of a single Node State TLV 794 (Section 7.2.3) (roughly 64KB of payload), or use a datagram-only 795 transport with a limited MTU and no reliable support for 796 fragmentation. To handle such cases, a DNCP profile MAY specify a 797 fixed number of trailing bytes in the node identifier to represent a 798 fragment number indicating a part of a node's node data.
The profile 799 MAY also specify an upper bound for the size of a single fragment to 800 accommodate limitations of links in the network. Note that the 801 maximum size of a fragment also constrains the maximum size of a single 802 TLV published by a node. 804 The data within Node State TLVs of all fragments MUST be valid, as 805 specified in Section 7.2.3. The locally used node data for a 806 particular node MUST be produced by concatenating node data in each 807 fragment, in ascending fragment number order. The locally used 808 concatenated node data MUST still follow the ordering described in 809 Section 7.2.3. 811 Any transmitted node identifiers used to identify the local node or any 812 other node MUST have the fragment number 0. For algorithm purposes, 813 the relative time since the most recent fragment change MUST be used, 814 regardless of fragment number. Therefore, even if just some of the 815 node data fragments change, they all are considered refreshed if one 816 of them is. 818 If using fragmentation, the data liveliness validation defined in 819 Section 4.6 is extended so that if a Fragment Count TLV 820 (Section 7.3.1) is present within the fragment number 0, all 821 fragments up to the fragment number specified in the Count field are also 822 considered reachable if the fragment number 0 itself is reachable 823 based on graph traversal. 825 7. Type-Length-Value Objects 827 Each TLV is encoded as a 2 byte type field, followed by a 2 byte 828 length field (of the value excluding header, in bytes, 0 meaning no 829 value), followed by the value itself, if any. Both type and length 830 fields in the header as well as all integer fields inside the value - 831 unless explicitly stated otherwise - are represented in network byte 832 order. Padding bytes with value zero MUST be added up to the next 4 833 byte boundary if the length is not divisible by 4. These padding 834 bytes MUST NOT be included in the number stored in the length field.
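The header and padding rules just described can be sketched as follows; this is an illustrative Python encoder, not part of the specification.

```python
import struct

def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    # 2-byte type and 2-byte length (length counts the value only,
    # excluding the header), both in network byte order, followed by
    # the value and zero padding up to the next 4-byte boundary.
    body = struct.pack("!HH", tlv_type, len(value)) + value
    pad = (-len(body)) % 4  # padding bytes are NOT counted in Length
    return body + b"\x00" * pad

# Type 123 (0x7b) with value 'x' (0x78) encodes as 007b 0001 7800 0000.
print(encode_tlv(123, b"x").hex())
```

A zero-length value needs no padding, since the 4-byte header is already aligned.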
836 0 1 2 3 837 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 838 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 839 | Type | Length | 840 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 841 | Value | 842 | (variable # of bytes) | 843 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 845 For example, type=123 (0x7b) TLV with value 'x' (120 = 0x78) is 846 encoded as: 007B 0001 7800 0000. 848 In this section, the following special notation is used: 850 .. = octet string concatenation operation. 852 H(x) = hash function specified by the DNCP profile. 854 7.1. Request TLVs 856 7.1.1. Request Network State TLV 858 0 1 2 3 859 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 860 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 861 | Type: REQ-NETWORK-STATE (1) | Length: 0 | 862 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 864 This TLV is used to request a response with a Network State TLV 865 (Section 7.2.2) and all Node State TLVs (Section 7.2.3) (without node 866 data). 868 7.1.2. Request Node State TLV 870 0 1 2 3 871 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 872 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 873 | Type: REQ-NODE-STATE (2) | Length: >0 | 874 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 875 | Node Identifier | 876 | (length fixed in DNCP profile) | 877 ... 878 | | 879 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 881 This TLV is used to request a Node State TLV (Section 7.2.3) 882 (including node data) for the node with the matching node identifier. 884 7.2. Data TLVs 886 7.2.1.
Node Endpoint TLV 888 0 1 2 3 889 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 890 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 891 | Type: NODE-ENDPOINT (3) | Length: > 4 | 892 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 893 | Node Identifier | 894 | (length fixed in DNCP profile) | 895 ... 896 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 897 | Endpoint Identifier | 898 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 900 This TLV identifies both the local node's node identifier and 901 the particular endpoint's endpoint identifier. 903 7.2.2. Network State TLV 905 0 1 2 3 906 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 907 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 908 | Type: NETWORK-STATE (4) | Length: > 0 | 909 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 910 | H(sequence number of node 1 .. H(node data of node 1) .. | 911 | .. sequence number of node N .. H(node data of node N)) | 912 | (length fixed in DNCP profile) | 913 ... 914 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 916 This TLV contains the current locally calculated network state hash; 917 see Section 4.1 for how it is calculated. 919 7.2.3. Node State TLV 921 0 1 2 3 922 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 923 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 924 | Type: NODE-STATE (5) | Length: > 8 | 925 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 926 | Node Identifier | 927 | (length fixed in DNCP profile) | 928 ...
929 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 930 | Sequence Number | 931 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 932 | Milliseconds Since Origination | 933 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 934 | H(Node Data) | 935 | (length fixed in DNCP profile) | 936 ... 937 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 938 | (optionally) Node Data (a set of nested TLVs) | 939 ... 940 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 942 This TLV represents the local node's knowledge about the published 943 state of a node in the DNCP network identified by the Node Identifier 944 field in the TLV. 946 Every node, including the originating one, MUST update the 947 Milliseconds Since Origination whenever it sends a Node State TLV 948 based on when the node estimates the data was originally published. 949 This is, e.g., to ensure that any relative timestamps contained 950 within the published node data can be correctly offset and 951 interpreted. Ultimately, what is provided is just an approximation, 952 as transmission delays are not accounted for. 954 Absent any changes, if the originating node notices that the 32-bit 955 milliseconds since origination value would be close to overflow 956 (greater than 2^32-2^16), the node MUST re-publish its TLVs even if 957 there is no change. In other words, absent any other changes, the 958 TLV set MUST be re-published roughly every 48 days. 960 The actual node data of the node may also be included within the TLV, 961 in the optional Node Data field. In a DNCP profile which 962 supports fragmentation, described in Section 6.3, the TLV data may be 963 only partial, but it MUST contain full individual TLVs. The set of 964 TLVs MUST be strictly ordered based on ascending binary content 965 (including TLV type and length). This enables, e.g., efficient state 966 delta processing and no-copy indexing by TLV type by the recipient.
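The strict binary ordering and the published H(Node Data) value can be sketched as below; this is an illustrative Python fragment, using SHA-256 truncation as a stand-in for the profile-defined H(x) (SHA-256 being only the recommended default), with hypothetical function names.

```python
import hashlib

def canonical_node_data(tlvs):
    # Node data TLVs must be strictly ordered by ascending binary
    # content, including the type and length header bytes.
    return b"".join(sorted(tlvs))

def node_data_hash(tlvs, bits=256):
    # H(x) is profile-defined; SHA-256 truncated to the profile's
    # chosen output width is assumed here for illustration.
    digest = hashlib.sha256(canonical_node_data(tlvs)).digest()
    return digest[: bits // 8]

# An already-encoded type-3 TLV sorts before a type-8 TLV, regardless
# of the order the publisher produced them in.
tlvs = [b"\x00\x08\x00\x04abcd", b"\x00\x03\x00\x00"]
print(canonical_node_data(tlvs).hex())
```

Because the ordering is defined on the raw bytes, any two implementations derive the same canonical concatenation and therefore the same H(Node Data) for the same TLV set.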
967 The Node Data content MUST be passed along exactly as it was 968 received. It SHOULD also be verified on receipt that the locally 969 calculated H(Node Data) matches the content of the field within the 970 TLV, and if the hash differs, the TLV SHOULD be ignored. 972 7.3. Data TLVs within Node State TLV 974 These TLVs are published by the DNCP nodes, and therefore only 975 encoded within the Node State TLVs. If encountered outside a Node 976 State TLV, they MUST be silently ignored. 978 7.3.1. Fragment Count TLV 980 0 1 2 3 981 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 982 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 983 | Type: FRAGMENT-COUNT (7) | Length: > 0 | 984 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 985 | Count | 986 | (length fixed in DNCP profile) | 987 ... 988 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 990 If the DNCP profile supports node data fragmentation as specified in 991 Section 6.3, this TLV indicates that the node data is encoded as a 992 sequence of Node State TLVs. Node State TLVs that follow, with Node 993 Identifier fragment numbers up to Count greater than the current one's, MUST be 994 considered reachable and part of the same logical set of node data 995 that this TLV is within. The fragment portion of the Node Identifier 996 of the Node State TLV this TLV appears in MUST be zero. 998 7.3.2. Neighbor TLV 1000 0 1 2 3 1001 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1002 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1003 | Type: NEIGHBOR (8) | Length: > 8 | 1004 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1005 | Neighbor Node Identifier | 1006 | (length fixed in DNCP profile) | 1007 ...
1008 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1009 | Neighbor Endpoint Identifier | 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | Local Endpoint Identifier | 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1014 This TLV indicates that the node in question vouches that the 1015 specified neighbor is reachable by it on the specified local 1016 endpoint. The presence of this TLV at least guarantees that the node 1017 publishing it has received traffic from the neighbor recently. For 1018 guaranteed up-to-date bidirectional reachability, the existence of 1019 both nodes' matching Neighbor TLVs needs to be checked. 1021 7.3.3. Keep-Alive Interval TLV 1023 0 1 2 3 1024 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1025 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1026 | Type: KEEP-ALIVE-INTERVAL (9) | Length: 8 | 1027 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1028 | Endpoint Identifier | 1029 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1030 | Interval | 1031 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1033 This TLV indicates a non-default interval being used to send keep- 1034 alives as specified in Section 6.1. 1036 Endpoint identifier is used to identify the particular endpoint for 1037 which the interval applies. If 0, it applies for ALL endpoints for 1038 which no specific TLV exists. 1040 Interval specifies the interval in milliseconds at which the node 1041 sends keep-alives. A value of zero means no keep-alives are sent at 1042 all; in that case, some lower layer mechanism that ensures presence 1043 of nodes MUST be available and used. 1045 8. Security and Trust Management 1047 If specified in the DNCP profile, either DTLS [RFC6347] or TLS 1048 [RFC5246] may be used to authenticate and encrypt either some (if 1049 specified as optional in the profile) or all unicast traffic.
The 1050 following methods for establishing trust are defined, but it is up to 1051 the DNCP profile to specify which ones may, should, or must be 1052 supported. 1054 8.1. Pre-Shared Key Based Trust Method 1056 A PSK-based trust model is a simple security management mechanism 1057 that allows an administrator to deploy devices to an existing network 1058 by configuring them with a pre-defined key, similar to the 1059 configuration of an administrator password or WPA-key. Although 1060 limited in nature, it is useful to provide a user-friendly security 1061 mechanism for smaller networks. 1063 8.2. PKI Based Trust Method 1065 A PKI-based trust model enables more advanced management capabilities 1066 at the cost of increased complexity and bootstrapping effort. It 1067 however allows trust to be managed in a centralized manner and is 1068 therefore useful for larger networks with a need for authoritative 1069 trust management. 1071 8.3. Certificate Based Trust Consensus Method 1073 The certificate-based consensus model is designed to be a compromise 1074 between trust management effort and flexibility. It is based on 1075 X.509-certificates and allows each DNCP node to provide a trust 1076 verdict on any certificate, and a consensus is found to 1077 determine whether a node using this certificate or any certificate 1078 signed by it is to be trusted. 1080 A DNCP node not using this security method MUST ignore all announced 1081 trust verdicts and MUST NOT announce any such verdicts by itself, 1082 i.e., any other normative language in this subsection does not apply 1083 to it. 1085 The current effective trust verdict for any certificate is defined as 1086 the one with the highest priority from all trust verdicts announced 1087 for said certificate at the time. 1089 8.3.1. Trust Verdicts 1091 Trust verdicts are statements of DNCP nodes about the trustworthiness 1092 of X.509-certificates.
There are 5 possible trust verdicts in order 1093 of ascending priority: 1095 0 (Neutral): no trust verdict exists but the DNCP network should 1096 determine one. 1098 1 (Cached Trust): the last known effective trust verdict was 1099 Configured or Cached Trust. 1101 2 (Cached Distrust): the last known effective trust verdict was 1102 Configured or Cached Distrust. 1104 3 (Configured Trust): trustworthy based upon an external ceremony 1105 or configuration. 1107 4 (Configured Distrust): not trustworthy based upon an external 1108 ceremony or configuration. 1110 Trust verdicts are differentiated in 3 groups: 1112 o Configured verdicts are used to announce explicit trust verdicts a 1113 node has based on any external trust bootstrap or predefined 1114 relation a node has formed with a given certificate. 1116 o Cached verdicts are used to retain the last known trust state in 1117 case all nodes with configured verdicts about a given certificate 1118 have been disconnected or turned off. 1120 o The Neutral verdict is used to announce a new node intending to 1121 join the network so a final verdict for it can be found. 1123 The current effective trust verdict for any certificate is defined as 1124 the one with the highest priority within the set of trust verdicts 1125 announced for the certificate in the DNCP network. A node MUST be 1126 trusted for participating in the DNCP network if and only if the 1127 current effective trust verdict for its own certificate or any one in 1128 its certificate hierarchy is (Cached or Configured) Trust and none of 1129 the certificates in its hierarchy have an effective trust verdict of 1130 (Cached or Configured) Distrust. In case a node has a configured 1131 verdict, which is different from the current effective trust verdict 1132 for a certificate, the current effective trust verdict takes 1133 precedence in deciding trustworthiness. Despite that, the node still 1134 retains and announces its configured verdict. 1136 8.3.2. 
Trust Cache 1138 Each node SHOULD maintain a trust cache containing the current 1139 effective trust verdicts for all certificates currently announced in 1140 the DNCP network. This cache is used as a backup of the last known 1141 state in case there is no node announcing a configured verdict for a 1142 known certificate. It SHOULD be saved to non-volatile memory at 1143 reasonable time intervals to survive a reboot or power outage. 1145 Every time a node (re)joins the network or detects the change of an 1146 effective trust verdict for any certificate, it will synchronize its 1147 cache, i.e., store new effective trust verdicts, overwriting any 1148 previously cached verdicts. Configured verdicts are stored in the 1149 cache as their respective cached counterparts. Neutral verdicts are 1150 never stored and do not override existing cached verdicts. 1152 8.3.3. Announcement of Verdicts 1154 A node SHOULD always announce any configured trust verdicts it has 1155 established by itself, and it MUST do so if announcing the configured 1156 trust verdict leads to a change in the current effective trust 1157 verdict for the respective certificate. In the absence of configured 1158 verdicts, it MUST announce cached trust verdicts it has stored in its 1159 trust cache, if one of the following conditions applies: 1161 o The stored trust verdict is Cached Trust and the current effective 1162 trust verdict for the certificate is Neutral or does not exist. 1164 o The stored trust verdict is Cached Distrust and the current 1165 effective trust verdict for the certificate is Cached Trust. 1167 A node rechecks these conditions whenever it detects changes of 1168 announced trust verdicts anywhere in the network.
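The precedence and caching rules above (highest-priority verdict wins; configured verdicts are cached as their cached counterparts; Neutral never overrides the cache) can be sketched as follows. This is an illustrative Python fragment with hypothetical names, not normative behavior.

```python
# Verdict priorities from Section 8.3.1, in ascending order.
NEUTRAL, CACHED_TRUST, CACHED_DISTRUST, CONFIGURED_TRUST, CONFIGURED_DISTRUST = range(5)

def effective_verdict(announced):
    # The current effective trust verdict for a certificate is the
    # highest-priority verdict announced for it in the DNCP network.
    return max(announced, default=None)

def cache_update(cached, effective):
    # Configured verdicts are stored as their cached counterparts;
    # Neutral verdicts never override an existing cached verdict.
    if effective == CONFIGURED_TRUST:
        return CACHED_TRUST
    if effective == CONFIGURED_DISTRUST:
        return CACHED_DISTRUST
    if effective in (CACHED_TRUST, CACHED_DISTRUST):
        return effective
    return cached  # Neutral or no verdict: keep the previous entry

print(effective_verdict([NEUTRAL, CACHED_TRUST, CONFIGURED_DISTRUST]))  # 4
print(cache_update(CACHED_TRUST, NEUTRAL))  # 1: cache retained
```

In a real implementation the announced verdicts would be collected from Trust-Verdict TLVs keyed by certificate fingerprint; the integer priorities alone suffice to express the consensus rule.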
1170 Upon encountering a node with a hierarchy of certificates for which 1171 there is no effective trust verdict, a node adds a Neutral Trust- 1172 Verdict TLV to its node data for all certificates found in the 1173 hierarchy, and publishes it until an effective trust verdict 1174 different from Neutral can be found for any of the certificates, or a 1175 reasonable amount of time (10 minutes is suggested) with no reaction 1176 and no further authentication attempts has passed. Such trust 1177 verdicts SHOULD also be limited in rate and number to prevent denial- 1178 of-service attacks. 1180 Trust verdicts are announced using Trust-Verdict TLVs: 1182 0 1 2 3 1183 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1184 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1185 | Type: Trust-Verdict (10) | Length: 37-100 | 1186 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1187 | Verdict | (reserved) | 1188 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1189 | | 1190 | | 1191 | | 1192 | SHA-256 Fingerprint | 1193 | | 1194 | | 1195 | | 1196 | | 1197 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1198 | Common Name | 1200 Verdict represents the numerical index of the trust verdict. 1202 (reserved) is reserved for future additions and MUST be set to 0 1203 when creating TLVs and ignored when parsing them. 1205 SHA-256 Fingerprint contains the SHA-256 [RFC6234] hash value of 1206 the certificate in DER-format. 1208 Common Name contains the variable-length (1-64 bytes) common name 1209 of the certificate. The final byte MUST have a value of 0. 1211 8.3.4. Bootstrap Ceremonies 1213 The following non-exhaustive list of methods describes possible ways 1214 to establish trust relationships between DNCP nodes and node 1215 certificates.
Trust establishment is a two-way process in which the 1216 existing network must trust the newly added node and the newly added 1217 node must trust at least one of its neighboring nodes. It is 1218 therefore necessary that both the newly added node and an already 1219 trusted node perform such a ceremony to successfully introduce a node 1220 into the DNCP network. In all cases an administrator MUST be 1221 provided with external means to identify the node belonging to a 1222 certificate based on its fingerprint and a meaningful common name. 1224 8.3.4.1. Trust by Identification 1226 A node implementing certificate-based trust MUST provide an interface 1227 to retrieve the current set of effective trust verdicts, fingerprints 1228 and names of all certificates currently known and set configured 1229 trust verdicts to be announced. Alternatively it MAY provide a 1230 companion DNCP node or application with these capabilities with which 1231 it has a pre-established trust relationship. 1233 8.3.4.2. Preconfigured Trust 1235 A node MAY be preconfigured to trust a certain set of node or CA 1236 certificates. However such trust relationships MUST NOT result in 1237 unwanted or unrelated trust for nodes not intended to be run inside 1238 the same network (e.g., all other devices by the same manufacturer). 1240 8.3.4.3. Trust on Button Press 1242 A node MAY provide a physical or virtual interface to put one or more 1243 of its internal network interfaces temporarily into a mode in which 1244 it trusts the certificate of the first DNCP node it can successfully 1245 establish a connection with. 1247 8.3.4.4. Trust on First Use 1249 A node which is not associated with any other DNCP node MAY trust the 1250 certificate of the first DNCP node it can successfully establish a 1251 connection with. This method MUST NOT be used when the node has 1252 already associated with any other DNCP node. 1254 9. 
DNCP Profile-Specific Definitions 1256 Each DNCP profile MUST specify the following aspects: 1258 o Unicast and optionally multicast transport protocol(s) to be used. 1259 If multicast-based node and status discovery is desired, a 1260 datagram-based transport supporting multicast has to be available. 1262 o How the chosen transport(s) are secured: Not at all, optionally, or 1263 always with the TLS scheme defined here using one or more of the 1264 methods, or with something else. If the links with DNCP nodes can 1265 be sufficiently secured or isolated, it is possible to run DNCP in 1266 a secure manner without using any form of authentication or 1267 encryption. 1269 o Transport protocols' parameters such as port numbers to be used, 1270 or multicast address to be used. Unicast, multicast, and secure 1271 unicast may each require different parameters, if applicable. 1273 o When receiving TLVs, what sort of TLVs are ignored in addition 1274 to those specified in Section 4.4, e.g., for security reasons. A DNCP 1275 profile may define the following DNCP TLVs to be safely 1276 ignored: 1278 * Anything received over multicast, except Node Endpoint TLV 1279 (Section 7.2.1) and Network State TLV (Section 7.2.2). 1281 * Any TLVs received over unreliable unicast or multicast at too 1282 high a rate; Trickle will ensure eventual convergence given the 1283 rate slows down at some point. 1285 o How to deal with node identifier collisions as described in 1286 Section 4.4. The main options are either for one or both nodes to 1287 assign new node identifiers to themselves, or to notify someone 1288 about a fatal error condition in the DNCP network. 1290 o Imin, Imax and k ranges to be suggested for implementations to be 1291 used in the Trickle algorithm.
The Trickle algorithm does not 1292 require these to be the same across all implementations for it to 1293 work, but similar orders of magnitude help implementations of a 1294 DNCP profile to behave more consistently and to facilitate 1295 estimation of lower and upper bounds for convergence behavior of 1296 the network. 1298 o Hash function H(x) to be used, and how many bits of the output are 1299 actually used. The chosen hash function is used to handle both 1300 hashing of node-specific data and the network state hash, which is a 1301 hash of node-specific data hashes. SHA-256 defined in [RFC6234] 1302 is the recommended default choice, but a non-cryptographic hash 1303 function could be used as well. 1305 o DNCP_NODE_IDENTIFIER_LENGTH: The fixed length of a node identifier 1306 (in bytes). 1308 o Whether to send keep-alives, and if so, whether per-endpoint 1309 (requires multicast transport), or per-peer. Keep-alives also have 1310 associated parameters: 1312 * DNCP_KEEPALIVE_INTERVAL: How often keep-alives are to be sent 1313 by default (if enabled). 1315 * DNCP_KEEPALIVE_MULTIPLIER: The number of 1316 DNCP_KEEPALIVE_INTERVAL (or peer-supplied keep-alive interval 1317 value) periods during which a node may remain unheard from and still 1318 be considered valid. This is just a default used in absence of any other 1319 configuration information, or particular per-endpoint 1320 configuration. 1322 o Whether to support fragmentation, and if so, the number of bytes 1323 reserved for fragment count in the node identifier. 1325 10. Security Considerations 1327 DNCP-based protocols may use multicast to indicate DNCP state changes 1328 and for keep-alive purposes. However, no actual published data TLVs 1329 will be sent across that channel. Therefore an attacker may only 1330 learn hash values of the state within DNCP and may be able to trigger 1331 unicast synchronization attempts between nodes on a local link this 1332 way.
A DNCP node should therefore rate-limit its reactions to 1333 multicast packets. 1335 When using DNCP to bootstrap a network, PKI-based solutions may have 1336 issues when validating certificates due to accurate time potentially 1337 being unavailable, or due to inability to use the network to either check 1338 Certificate Revocation Lists or perform on-line validation. 1340 The Certificate-based trust consensus mechanism defined in this 1341 document allows for consenting revocation; however, in case of a 1342 compromised device, the trust cache may be poisoned before the actual 1343 revocation happens, allowing the distrusted device to rejoin the 1344 network using a different identity. Stopping such an attack might 1345 require physical intervention and flushing of the trust caches. 1347 11. IANA Considerations 1349 IANA should set up a registry for DNCP TLV types, with the following 1350 initial contents: 1352 0: Reserved 1354 1: Request network state 1356 2: Request node state 1358 3: Node endpoint 1360 4: Network state 1362 5: Node state 1364 6: Reserved (was: Custom) 1366 7: Fragment count 1368 8: Neighbor 1370 9: Keep-alive interval 1372 10: Trust-Verdict 1373 32-191: Reserved for per-DNCP profile use 1375 192-255: Reserved for per-implementation experimentation. How 1376 collisions are avoided is out of scope of this document. 1378 For the rest of the values (11-31, 256-65535), policy of 'standards 1379 action' should be used. 1381 12. References 1383 12.1. Normative references 1385 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1386 Requirement Levels", BCP 14, RFC 2119, March 1997. 1388 [RFC6206] Levis, P., Clausen, T., Hui, J., Gnawali, O., and J. Ko, 1389 "The Trickle Algorithm", RFC 6206, March 2011. 1391 [RFC6347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer 1392 Security Version 1.2", RFC 6347, January 2012. 1394 [RFC5246] Dierks, T. and E.
Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

12.2.  Informative References

   [RFC3493]  Gilligan, R., Thomson, S., Bound, J., McCann, J., and W.
              Stevens, "Basic Socket Interface Extensions for IPv6",
              RFC 3493, February 2003.

   [RFC6234]  Eastlake, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 2011.

Appendix A.  Alternative Modes of Operation

   Beyond what is described in the main text, the protocol allows for
   other uses.  These are provided as examples.

A.1.  Read-only Operation

   If a node uses just a single endpoint and does not need to publish
   any TLVs, full DNCP node functionality is not required.  Such a
   limited node can acquire and maintain a view of the TLV space by
   implementing the processing logic as specified in Section 4.4.  Such
   a node would not need Trickle, peer maintenance, or even keep-alives
   at all, as the DNCP nodes' use of Trickle would guarantee eventual
   receipt of network state hashes and synchronization of node data,
   even in the presence of an unreliable transport.

A.2.  Forwarding Operation

   If a node with a pair of endpoints does not need to publish any
   TLVs, it can detect (for example) the nodes with the highest node
   identifier on each of the endpoints (if any).  Any TLVs received
   from one of them would be forwarded verbatim as unicast to the other
   node with the highest node identifier.

   Any tinkering with the TLVs would remove the guarantees of this
   scheme working; however, passive monitoring would obviously be fine.
   This type of simple forwarding cannot be chained, as it does not
   send anything proactively.

Appendix B.  Some Questions and Answers [RFC Editor: please remove]

   Q: 32-bit endpoint id?

   A: Here, it would save 32 bits per neighbor if it was 16 bits (and
   less is not realistic).
However, TLVs defined elsewhere would not
   seem to gain even that much on average.  32 bits is also used for
   the ifindex in various operating systems, making for a simpler
   implementation.

   Q: Why have topology information at all?

   A: It is an alternative to the more traditional seq#/TTL-based
   flooding schemes.  In steady state, there is no need to, e.g.,
   re-publish every now and then.

Appendix C.  Changelog [RFC Editor: please remove]

   draft-ietf-homenet-dncp-06:

   o  Removed custom TLV.

   o  Made keep-alive multipliers a local implementation choice;
      profiles just provide guidance on a sane default value.

   o  Removed DNCP_GRACE_INTERVAL, as it is really an implementation
      choice.

   o  Simplified the suggested structures in the data model.

   o  Reorganized the document and provided an overview section.

   draft-ietf-homenet-dncp-04:

   o  Added mandatory rate limiting for network state requests, and an
      optional slightly faster convergence mechanism by including the
      current local network state in the remote network state requests.

   draft-ietf-homenet-dncp-03:

   o  Renamed connection -> endpoint.

   o  !!! Backwards incompatible change: Renumbered TLVs and got rid of
      the node data TLV; instead, the node data TLV's contents are
      optionally within the node state TLV.

   draft-ietf-homenet-dncp-02:

   o  Changed DNCP "messages" into series of TLV streams, allowing
      optimized round-trip-saving synchronization.

   o  Added fragmentation support for bigger node data and for chunking
      in the absence of reliable L2 and L3 fragmentation.

   draft-ietf-homenet-dncp-01:

   o  Fixed keep-alive semantics to consider unicast requests also as
      updates of the most recently consistent state, and added a
      proactive unicast request to ensure that even inconsistent
      keep-alive messages eventually trigger a consistency timestamp
      update.
   o  Facilitated (simple) read-only clients by making the Node
      Connection TLV optional if DNCP is used for read-only purposes
      only.

   o  Added text describing how to deal with "dense" networks, but left
      the actual numbers and mechanics up to DNCP profiles and (local)
      configurations.

   draft-ietf-homenet-dncp-00: Split from a pre-version of
   draft-ietf-homenet-hncp-03 generic parts.  Changes that affect
   implementations:

   o  TLVs were renumbered.

   o  TLV length does not include the header (=-4).  This facilitates,
      e.g., the use of DHCPv6 option parsing libraries (same encoding)
      and reduces complexity (no need to handle error values of length
      less than 4).

   o  Trickle is reset only when the locally calculated network state
      hash changes, not when a different remote network state hash is
      seen.  This prevents, e.g., attacks in which a single multicast
      packet forces a Trickle reset on every interface of every node on
      a link.

   o  Instead of 'ping', use (optional) 'keep-alive' for dead peer
      detection.  A different message is used!

Appendix D.  Draft Source [RFC Editor: please remove]

   As usual, this draft is available at https://github.com/fingon/ietf-
   drafts/ in source format (with a nice Makefile too).  Feel free to
   send comments and/or pull requests if and when you have changes to
   it!

Appendix E.  Acknowledgements

   Thanks to Ole Troan, Pierre Pfister, Mark Baugher, Mark Townsley,
   Juliusz Chroboczek, Jiazi Yi, Mikael Abrahamsson, Brian Carpenter,
   Thomas Clausen and DENG Hui for their contributions to the draft.

Authors' Addresses

   Markus Stenberg
   Helsinki  00930
   Finland

   Email: markus.stenberg@iki.fi

   Steven Barth
   Halle  06114
   Germany

   Email: cyrus@openwrt.org