idnits 2.17.1 draft-summermatter-set-union-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 42 instances of too long lines in the document, the longest one being 4 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (16 June 2021) is 1035 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '1' on line 2870 -- Looks like a reference, but probably isn't: '2' on line 2870 -- Looks like a reference, but probably isn't: '4' on line 2870 -- Looks like a reference, but probably isn't: '8' on line 2869 -- Looks like a reference, but probably isn't: '10' on line 2868 -- Looks like a reference, but probably isn't: '6' on line 2868 -- Looks like a reference, but probably isn't: '26' on line 2869 -- Looks like a reference, but probably isn't: '17' on line 2869 -- Looks like a reference, but probably isn't: '19' on line 2869 -- Looks like a reference, but probably isn't: '15' on line 2869 -- Looks like a reference, but probably isn't: '0' on line 2870 -- Looks like a reference, but probably isn't: '3' on line 2870 Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 14 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Independent Stream E. Summermatter 3 Internet-Draft Seccom GmbH 4 Intended status: Informational C. Grothoff 5 Expires: 18 December 2021 Berner Fachhochschule 6 16 June 2021 8 Byzantine Fault Tolerant Set Reconciliation 9 draft-summermatter-set-union-01 11 Abstract 13 This document contains a protocol specification for Byzantine fault- 14 tolerant Set Reconciliation. 16 Status of This Memo 18 This Internet-Draft is submitted in full conformance with the 19 provisions of BCP 78 and BCP 79. 21 Internet-Drafts are working documents of the Internet Engineering 22 Task Force (IETF). Note that other groups may also distribute 23 working documents as Internet-Drafts. The list of current Internet- 24 Drafts is at https://datatracker.ietf.org/drafts/current/. 26 Internet-Drafts are draft documents valid for a maximum of six months 27 and may be updated, replaced, or obsoleted by other documents at any 28 time. It is inappropriate to use Internet-Drafts as reference 29 material or to cite them other than as "work in progress." 31 This Internet-Draft will expire on 18 December 2021. 33 Copyright Notice 35 Copyright (c) 2021 IETF Trust and the persons identified as the 36 document authors. All rights reserved. 38 This document is subject to BCP 78 and the IETF Trust's Legal 39 Provisions Relating to IETF Documents (https://trustee.ietf.org/ 40 license-info) in effect on the date of publication of this document. 41 Please review these documents carefully, as they describe your rights 42 and restrictions with respect to this document. Code Components 43 extracted from this document must include Simplified BSD License text 44 as described in Section 4.e of the Trust Legal Provisions and are 45 provided without warranty as described in the Simplified BSD License. 47 Table of Contents 49 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 50 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 5 51 2.1. Bloom Filter . . . . . . . . . . . . . . . . . . . . . . 6 52 2.2. Counting Bloom Filter . . . . . . . . . . . . . . . . . . 7 53 3. Invertible Bloom Filter . . . . . . . . . . . . . . . . . . . 8 54 3.1. Structure . . . . . . . . . . . . . . . . . . . . . . . . 8 55 3.2. Salted Element ID Calculation . . . . . . . . . . . . . . 9 56 3.3. HASH calculation . . . . . . . . . . . . . . . . . . . . 10 57 3.4. Mapping Function . . . . . . . . . . . . . . . . . . . . 10 58 3.5. Operations . . . . . . . . . . . . . . . . . . . . . . . 11 59 3.5.1. Insert Element . . . . . . . . . . . . . . . . . . . 11 60 3.5.2. Remove Element . . . . . . . . . . . . . . . . . . . 13 61 3.5.3. Extracting elements . . . . . . . . . . . . . . . . . 14 62 3.5.4. Set Difference . . . . . . . . . . . . . . . . . . . 16 63 3.6. Wire format . . . . . . . . . . . . . . . . . . . . . . . 18 64 4. Strata Estimator . . . . . . . . . . . . . . . . . . . . . . 18 65 5. Mode of Operation . . . . . . . . . . . . . . . . . . . . . . 19 66 5.1. Full Synchronisation Mode . . . . . . . . . . . . . . . . 20 67 5.2. Differential Synchronisation Mode . . . . . . . . . . . . 21 68 5.3. Combined Mode . . . . . . . . . . . . . . . . . . . . . . 25 69 6. Messages . . . . . . . . . . . . . . . . . . . . . . . . . . 25 70 6.1. Operation Request . . . . . . . . . . . . . . . . . . . . 25 71 6.1.1. Description . . . . . . . . . . . . . . . . . . . . . 25 72 6.1.2. Structure . . . . . . . . . . . . . . . . . . . . . . 26 73 6.2. IBF . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 74 6.2.1. Description . . . . . . . . . . . . . . . . . . . . . 26 75 6.2.2. Structure . . . . . . . . . . . . . . . . . . . . . . 26 76 6.3. IBF Last . . . . . . . . . . . . . . . . . . . . . . . . 28 77 6.3.1. Description . . . . . . . . . . . . . . . . . . . . . 28 78 6.3.2. Structure . . . . . . . . . . . . . . . . . . . . . . 29 79 6.4. Element . . . . . . . . . . . . . . . . . . . . . . . . . 29 80 6.4.1. Description . . . . . . . . . . . . . . . . . . . . . 29 81 6.4.2. Structure . . . . . . . . . . . . . . . . . . . . . . 29 82 6.5. Offer . . . . . . . . . . . . . . . . . . . . . . . . . . 30 83 6.5.1. Description . . . . . . . . . . . . . . . . . . . . . 30 84 6.5.2. Structure . . . . . . . . . . . . . . . . . . . . . . 30 85 6.6. Inquiry . . . . . . . . . . . . . . . . . . . . . . . . . 31 86 6.6.1. Description . . . . . . . . . . . . . . . . . . . . . 31 87 6.6.2. Structure . . . . . . . . . . . . . . . . . . . . . . 31 88 6.7. Demand . . . . . . . . . . . . . . . . . . . . . . . . . 32 89 6.7.1. Description . . . . . . . . . . . . . . . . . . . . . 32 90 6.7.2. Structure . . . . . . . . . . . . . . . . . . . . . . 32 91 6.8. Done . . . . . . . . . . . . . . . . . . . . . . . . . . 32 92 6.8.1. Description . . . . . . . . . . . . . . . . . . . . . 33 93 6.8.2. Structure . . . . . . . . . . . . . . . . . . . . . . 33 94 6.9. Full Done . . . . . . . . . . . . . . . . . . . . . . . . 33 95 6.9.1. Description . . . . . . . . . . . . . . . . . . . . . 33 96 6.9.2. Structure . . . . . . . . . . . . . . . . . . . . . . 34 97 6.10. Request Full . . . . . . . . . . . . . . . . . . . . . . 34 98 6.10.1. Description . . . . . . . . . . . . . . . . . . . . 34 99 6.10.2. Structure . . . . . . . . . . . . . . . . . . . . . 34 100 6.11. Send Full . . . . . . . . . . . . . . . . . . . . . . . . 35 101 6.11.1. Description . . . . . . . . . . . . . . . . . . . . 35 102 6.11.2. Structure . . . . . . . . . . . . . . . . . . . . . 35 103 6.12. Strata Estimator . . . . . . . . . . . . . . . . . . . . 36 104 6.12.1. Description . . . . . . . . . . . . . . . . . . . . 36 105 6.12.2. Structure . . . . . . . . . . . . . . . . . . . . . 37 106 6.13. Strata Estimator Compressed . . . . . . . . . . . . . . . 38 107 6.13.1. Description . . . . . . . . . . . . . . . . . . . . 38 108 6.14. Full Element . . . . . . . . . . . . . . . . . . . . . . 38 109 6.14.1. Description . . . . . . . . . . . . . . . . . . . . 38 110 6.14.2. Structure . . . . . . . . . . . . . . . . . . . . . 39 111 7. Performance Considerations . . . . . . . . . . . . . . . . . 39 112 7.1. Formulas . . . . . . . . . . . . . . . . . . . . . . . . 39 113 7.1.1. Operation Mode . . . . . . . . . . . . . . . . . . . 40 114 7.1.2. IBF Size . . . . . . . . . . . . . . . . . . . . . . 42 115 7.1.3. Number of Buckets an Element is Hashed into . . . . . 43 116 7.2. Variable Counter Size . . . . . . . . . . . . . . . . . . 44 117 7.3. Multi Strata Estimators . . . . . . . . . . . . . . . . . 47 118 8. Security Considerations . . . . . . . . . . . . . . . . . . . 48 119 8.1. General Security Checks . . . . . . . . . . . . . . . . . 48 120 8.1.1. Input validation . . . . . . . . . . . . . . . . . . 49 121 8.1.2. Byzantine Boundaries . . . . . . . . . . . . . . . . 49 122 8.1.3. Valid State . . . . . . . . . . . . . . . . . . . . . 50 123 8.1.4. Message Flow Control . . . . . . . . . . . . . . . . 50 124 8.1.5. Limit Active/Passive Decoding changes . . . . . . . . 51 125 8.1.6. Full Synchronisation Plausibility Check . . . . . . . 52 126 8.2. States . . . . . . . . . . . . . . . . . . . . . . . . . 53 127 8.2.1. Expecting IBF . . . . . . . . . . . . . . . . . . . . 54 128 8.2.2. Full Sending . . . . . . . . . . . . . . . . . . . . 54 129 8.2.3. Expecting IBF Last . . . . . . . . . . . . . . . . . 54 130 8.2.4. Active Decoding . . . . . . . . . . . . . . . . . . . 55 131 8.2.5. Finish Closing . . . . . . . . . . . . . . . . . . . 57 132 8.2.6. Finished . . . . . . . . . . . . . . . . . . . . . . 57 133 8.2.7. Expect SE . . . . . . . . . . . . . . . . . . . . . . 57 134 8.2.8. Full Receiving . . . . . . . . . . . . . . . . . . . 57 135 8.2.9. Passive Decoding . . . . . . . . . . . . . . . . . . 58 136 8.2.10. Finish Waiting . . . . . . . . . . . . . . . . . . . 59 137 9. Constants . . . . . . . . . . . . . . . . . . . . . . . . . . 59 138 10. GANA Considerations . . . . . . . . . . . . . . . . . . . . . 60 139 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 61 140 12. Normative References . . . . . . . . . . . . . . . . . . . . 62 141 Appendix A. Test Vectors . . . . . . . . . . . . . . . . . . . . 63 142 A.1. Map Function . . . . . . . . . . . . . . . . . . . . . . 63 143 A.2. ID Calculation Function . . . . . . . . . . . . . . . . . 63 144 A.3. Counter Compression Function . . . . . . . . . . . . . . 64 145 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 64 147 1. Introduction 149 This document describes a byzantine fault tolerant set reconciliation 150 protocol used to efficient and securely compute the union of two sets 151 across a network. 153 This byzantine fault tolerant set reconciliation protocol can be used 154 in a variety of applications. Our primary envisioned application 155 domain is the distribution of revocation messages in the GNU Name 156 System (GNS) [GNS]. In GNS, key revocation messages are usually 157 flooded across the peer-to-peer overlay network to all connected 158 peers whenever a key is revoked. However, as peers may be offline or 159 the network might have been partitioned, there is a need to reconcile 160 revocation lists whenever network partitions are healed or peers go 161 online. The GNU Name System uses the protocol described in this 162 specification to efficiently distribute revocation messages whenever 163 network partitions are healed. Another application domain for the 164 protocol described in this specification are Byzantine fault-tolerant 165 bulletin boards, like those required in some secure multiparty 166 computations. A well-known example for secure multiparty 167 computations are various E-voting protocols 168 [CryptographicallySecureVoting] which use a bulletin board to share 169 the votes and intermediate computational results. We note that for 170 such systems, the set reconciliation protocol is merely a component 171 of a multiparty consensus protocol, such as the one described in 172 Dold's "Byzantine set-union consensus using efficient set 173 reconciliation" 174 [ByzantineSetUnionConsensusUsingEfficientSetReconciliation]. 176 The protocol described in this report is generic and suitable for a 177 wide range of applications. As a result, the internal structure of 178 the elements in the sets MUST be defined and verified by the 179 application using the protocol. This document thus does not cover 180 the element structure, except for imposing a limit on the maximum 181 size of an element. 183 The protocol faces an inherent trade-off between minimizing the 184 number of network round-trips and the number of bytes sent over the 185 network. Thus, for the protocol to choose the right parameters for a 186 given situation, applications using an implementation of the protocol 187 SHOULD provide a parameter that specifies the cost-ratio of round- 188 trips vs. bandwidth usage. Given this trade-off factor, an 189 implementation CAN then choose parameters that minimize total 190 execution cost. In particular, there is one major choice to be made, 191 namely between sending the complete set of elements, or computing the 192 set differences and transmitting only the elements in the set 193 differences. In the latter case, our design is basically a concrete 194 implementation of a proposal by Eppstein.[Eppstein] 196 We say that our set reconciliation protocol is Byzantine fault- 197 tolerant because it provides cryptographic and probabilistic methods 198 to discover if the other peer is dishonest or misbehaving. Here, the 199 security objective is to limit resources wasted on malicious actors. 200 Malicious actors could send malformed messages, including malformed 201 set elements, claim to have much larger numbers of valid set elements 202 than they actually hold, or request the retransmission of elements 203 that they have already received in previous interactions. Bounding 204 resources consumed by malicous actors is important to ensure that 205 higher-level protocols can use set reconciliation and still meet 206 their resource targets. This can be particularly critical in multi- 207 round synchronous consensus protocols where peers that cannot answer 208 in a timely fashion would have to be treated as failed or malicious. 210 To defend against some of these attacks, applications SHOULD remember 211 the number of elements previously shared with a peer, and SHOULD 212 provide a way to check that elements are well-formed. Applications 213 MAY also provide an upper bound on the total number of valid elements 214 that exist. For example, in E-voting, the number of eligible voters 215 MAY be used to provide such an upper bound. 217 A first draft of this RFC is part of Elias Summermatter's bachelor 218 thesis. Many of the algorithms and parameters documented in this RFC 219 are derived from experiments detailed in this thesis. 220 [byzantine_fault_tolerant_set_reconciliation] 222 This document defines the normative wire format of resource records, 223 resolution processes, cryptographic routines and security 224 considerations for use by implementors. SETU requires a 225 bidirectional secure communication channel between the two parties. 226 Specification of the communication channel is out of scope of this 227 document. 229 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 230 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 231 document are to be interpreted as described in [RFC2119]. 233 2. Background 234 2.1. Bloom Filter 236 A Bloom filter (BF) is a space-efficient probabilistic datastructure 237 to test if an element is part of a set of elements. Elements are 238 identified by an element ID. Since a BF is a probabilistic 239 datastructure, it is possible to have false-positives: when asked if 240 an element is in the set, the answer from a BF is either "no" or 241 "maybe". 243 A BF consists of L buckets. Every bucket is a binary value that can 244 be either 0 or 1. All buckets are initialized to 0. A mapping 245 function M is used to map each ID of each element from the set to a 246 subset of k buckets. In the original proposal by Bloom, M is non- 247 injective and can thus map the same element multiple times to the 248 same bucket. The type of the mapping function can thus be described 249 by the following mathematical notation: 251 ------------------------------------ 252 # M: E->B^k 253 ------------------------------------ 254 # L = Number of buckets 255 # B = 0,1,2,3,4,...L-1 (the buckets) 256 # k = Number of buckets per element 257 # E = Set of elements 258 ------------------------------------ 259 Example: L=256, k=3 260 M('element-data') = {4,6,255} 262 Figure 1 264 A typical mapping function is constructed by hashing the element, for 265 example using the well-known Section 2 of HKDF construction 266 [RFC5869]. 268 To add an element to the BF, the corresponding buckets under the map 269 M are set to 1. To check if an element may be in the set, one tests 270 if all buckets under the map M are set to 1. 272 In the BF the buckets are set to 1 if the corresponding bit in the 273 bitstream is 1. If there is a collision and a bucket is already set 274 to 1, the bucket stays at 1. 276 In the following example the element e0 with M(e0) = {1,3} has been 277 added: 279 bucket-0 bucket-1 bucket-2 bucket-3 280 +-------------+-------------+-------------+-------------+ 281 | 0 | 1 | 0 | 1 | 282 +-------------+-------------+-------------+-------------+ 284 Figure 2 286 It is easy to see that an element e1 with M(e1) = {0,3} could have 287 been added to the BF below, while an element e2 with M(e2) = {0,2} 288 cannot be in the set represented by the BF below: 290 bucket-0 bucket-1 bucket-2 bucket-3 291 +-------------+-------------+-------------+-------------+ 292 | 1 | 0 | 0 | 1 | 293 +-------------+-------------+-------------+-------------+ 295 Figure 3 297 The parameters L and k depend on the set size and MUST be chosen 298 carefully to ensure that the BF does not return too many false- 299 positives. 301 It is not possible to remove an element from the BF because buckets 302 can only be set to 1 or 0. Hence it is impossible to differentiate 303 between buckets containing one or more elements. To remove elements 304 from the BF a Counting Bloom Filter is required. 306 2.2. Counting Bloom Filter 308 A Counting Bloom Filter (CBF) is a variation on the idea of a Bloom 309 Filter. With a CBF, buckets are unsigned numbers instead of binary 310 values. This allows the removal of an element from the CBF. 312 Adding an element to the CBF is similar to the adding operation of 313 the BF. However, instead of setting the buckets to 1 the numeric 314 value stored in the bucket is increased by 1. For example, if two 315 colliding elements M(e1) = {1,3} and M(e2) = {0,3} are added to the 316 CBF, bucket 0 and 1 are set to 1 and bucket 3 (the colliding bucket) 317 is set to 2: 319 bucket-0 bucket-1 bucket-2 bucket-3 320 +-------------+-------------+-------------+-------------+ 321 | 1 | 1 | 0 | 2 | 322 +-------------+-------------+-------------+-------------+ 324 Figure 4 326 The counter stored in the bucket is also called the order of the 327 bucket. 329 To remove an element form the CBF the counters of all buckets the 330 element is mapped to are decreased by 1. 332 For example, removing M(e2) = {1,3} from the CBF above results in: 334 bucket-0 bucket-1 bucket-2 bucket-3 335 +-------------+-------------+-------------+-------------+ 336 | 1 | 0 | 0 | 1 | 337 +-------------+-------------+-------------+-------------+ 339 Figure 5 341 In practice, the number of bits available for the counters is often 342 finite. For example, given a 4-bit counter, a CBF bucket would 343 overflow 16 elements are mapped to the same bucket. To handle this 344 case, the maximum value (15 in our example) is considered to 345 represent "infinity". Once the order of a bucket reaches "infinity", 346 it is no longer incremented or decremented. 348 The parameters L and k and the number of bits allocated to the 349 counters SHOULD depend on the set size. A CBF will degenerate when 350 subjected to insert and remove iterations of different elements, and 351 eventually all buckets will reach "infinity". The speed of the 352 degradation will depend on the choice of L and k in relation to the 353 number of elements stored in the IBF. 355 3. Invertible Bloom Filter 357 An Invertible Bloom Filter (IBF) is a further extension of the 358 Counting Bloom Filter. An IBF extends the Counting Bloom Filter with 359 two more operations: decode and set difference. This two extra 360 operations are key to efficiently obtain small differences between 361 large sets. 363 3.1. Structure 365 An IBF consists of an injective mapping function M mapping elements 366 to k out of L buckets. Each of the L buckets stores a signed 367 COUNTER, an IDSUM and an XHASH. An IDSUM is the XOR of various 368 element IDs. An XHASH is the XOR of various hash values. As before, 369 the values used for k, L and the number of bits used for the signed 370 counter and the XHASH depend on the set size and various other trade- 371 offs. 373 If the IBF size is too small or the mapping function does not spread 374 out the elements uniformly, the signed counter can overflow or 375 underflow. As with the CBF, the "maximum" value is thus used to 376 represent "infinite". As there is no need to distinguish between 377 overflow and underflow, the most canonical representation of 378 "infinite" would be the minimum value of the counter in the canonical 379 2-complement interpretation. For example, given a 4-bit counter a 380 value of -8 would be used to represent "infinity". 382 bucket-0 bucket-1 bucket-2 bucket-3 383 +-------------+-------------+-------------+-------------+------- 384 count | COUNTER | COUNTER | COUNTER | COUNTER | C... 385 +-------------+-------------+-------------+-------------+------ 386 idSum | IDSUM | IDSUM | IDSUM | IDSUM | I... 387 +-------------+-------------+-------------+-------------+------ 388 hashSum | HASHSUM | HASHSUM | HASHSUM | HASHSUM | H.. 389 +-------------+-------------+-------------+-------------+------- 391 Figure 6 393 3.2. Salted Element ID Calculation 395 IBFs are a probabilistic data structure, hence it can be necessary to 396 recompute the IBF in case operations fail, and then try again. The 397 recomputed IBF would ideally be statistically independent of the 398 failed IBF. This is achieved by introducing an IBF-salt. Given that 399 with benign peers failures should be rare, and that we need to be 400 able to "invert" the application of the IBF-salt to the element IDs, 401 we use an unsigned 32 bit non-random IBF-salt value of which the 402 lowest 6 bits will be used to rotate bits in the element ID 403 computation. 405 64-bit element IDs are generated from a Section 2 of HKDF 406 construction [RFC5869] with HMAC-SHA512 as XTR and HMAC-SHA256 as PRF 407 with a 16-bit KDF-salt set to a unsigned 16-bit representation of 408 zero. The output of the KDF is then truncated to 64-bit. Finally, 409 salting is done by calculating the IBF-salt modulo 64 (effectively 410 using only the lowest 6-bits of the IBF-salt) and doing a bitwise 411 right rotation of the output of KDF. We note that this operation was 412 chosen as it is easily inverted, allowing applications to easily 413 derive element IDs with one IBF-salt value from element IDs generated 414 with a different IBF-salt value. 416 In case the IBF does not decode, the IBF-salt can be changed to 417 compute different element IDs, which will (likely) be mapped to 418 different buckets, likely allowing the IBF to decode in a subsequent 419 iteration. 421 # INPUTS: 422 # key: Pre calculated and truncated key from id_calculation function 423 # ibf_salt: Salt of the IBF 424 # OUTPUT: 425 # value: salted key 426 FUNCTION salt_key(key,ibf_salt): 427 s = (ibf_salt * 7) modulo 64; 428 /* rotate key */ 429 return (key >> s) | (key << (64 - s)) 430 END FUNCTION 432 # INPUTS: 433 # element: element for which we are to calculate the element ID 434 # ibf_salt: Salt of the IBF 435 # OUTPUT: 436 # value: the ID of the element 437 FUNCTION id_calculation (element,ibf_salt): 438 kdf_salt = 0 // 16 bits 439 XTR=HMAC-SHA256 440 PRF=HMAC-SHA256 441 key = HKDF(XTR, PRF, kdf_salt, element) modulo 2^64 442 return salt_key(key, ibf_salt) 443 END FUNCTION 445 Figure 7 447 3.3. HASH calculation 449 The HASH of an element ID is computed by calculating the CRC32 450 checksum of the 64-bit ID value, which returns a 32-bit value.CRC32 451 is well-known and described in Section 4.1 of the RFC [RFC3385]. 453 3.4. Mapping Function 455 The mapping function M decides which buckets a given ID is mapped to. 456 For an IBF, it is beneficial to use an injective mapping function M. 458 The first index is simply the CRC32 of the ID modulo the IBF size. 459 The second index is calculated by creating a new 64-bit value by 460 shifting the previous 32-bit value left and setting the lower 32 bits 461 to the number of indices already processed. From the resulting 462 64-bit value, another CRC32 checksum is computed. The subsequent 463 index is the modulo of this CRC32 output. The process is repeated 464 until the desired number of indices is generated. In the case the 465 process computes the same index twice, which would mean this bucket 466 could not get pure again, the second hit is just skipped and the next 467 iteration is used instead, creating an injective mapping function. 469 # INPUTS: 470 # key: the ID of the element calculated 471 # k: numbers of buckets per element 472 # L: total number of buckets in the IBF 473 # OUTPUT: 474 # dst: Array with k bucket IDs 475 FUNCTION get_bucket_id (key, k, L) 476 bucket = CRC32(key) 477 i = 0 // unsigned 32-bit index 478 filled = 0 479 WHILE filled < k DO 481 element_already_in_bucket = false 482 j = 0 483 WHILE j < filled DO 484 IF dst[j] == bucket modulo L THEN 485 element_already_in_bucket = true 486 END IF 487 j++ 488 END WHILE 490 IF !element_already_in_bucket THEN 491 dst[filled] = bucket modulo L 492 filled = filled + 1 493 END IF 495 x = (bucket << 32) | i // 64 bit result 496 bucket = CRC32(x) 497 i = i + 1 498 END WHILE 499 return dst 500 END FUNCTION 502 Figure 8 504 3.5. Operations 506 When an IBF is created, all counters and IDSUM and HASHSUM values of 507 all buckets are initialized to zero. 509 3.5.1. Insert Element 511 To add an element to an IBF, the element is mapped to a subset of k 512 buckets using the injective mapping function M as described in 513 section Mapping Function. For the buckets selected by the mapping 514 function, the counter is increased by one and the IDSUM field is set 515 to the XOR of the element ID computed as described in section Salted 516 Element ID Calculation and the previously stored IDSUM. Furthermore, 517 the HASHSUM is set to the XOR of the previously stored HASHSUM and 518 the hash of the element ID computed as described in section HASH 519 calculation. 521 In the following example, the insert operation is illustrated using 522 an element with the ID 0x0102 mapped to {1,3} with a hash of 0x4242, 523 and a second element with the ID 0x0304 mapped to {0,1} and a hash of 524 0x0101. 526 Empty IBF: 528 bucket-0 bucket-1 bucket-2 bucket-3 529 +-------------+-------------+-------------+-------------+ 530 count | 0 | 0 | 0 | 0 | 531 +-------------+-------------+-------------+-------------+ 532 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 533 +-------------+-------------+-------------+-------------+ 534 hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 535 +-------------+-------------+-------------+-------------+ 537 Figure 9 539 Insert first element with ID 0x0102 and hash 0x4242 into {1,3}: 541 bucket-0 bucket-1 bucket-2 bucket-3 542 +-------------+-------------+-------------+-------------+ 543 count | 0 | 1 | 0 | 1 | 544 +-------------+-------------+-------------+-------------+ 545 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 546 +-------------+-------------+-------------+-------------+ 547 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 548 +-------------+-------------+-------------+-------------+ 550 Figure 10 552 Insert second element with ID 0x0304 and hash 0101 into {0,1}: 554 bucket-0 bucket-1 bucket-2 bucket-3 555 +-------------+-------------+-------------+-------------+ 556 count | 1 | 2 | 0 | 1 | 557 +-------------+-------------+-------------+-------------+ 558 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 559 +-------------+-------------+-------------+-------------+ 560 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 561 +-------------+-------------+-------------+-------------+ 563 Figure 11 565 3.5.2. Remove Element 567 To remove an element from the IBF the element is again mapped to a 568 subset of the buckets using M. Then all the counters of the buckets 569 selected by M are reduced by one, the IDSUM is replaced by the XOR of 570 the old IDSUM and the ID of the element being removed, and the 571 HASHSUM is similarly replaced with the XOR of the old HASHSUM and the 572 hash of the ID. 574 In the following example the remove operation is illustrated. 576 IBF with two encoded elements: 578 bucket-0 bucket-1 bucket-2 bucket-3 579 +-------------+-------------+-------------+-------------+ 580 count | 1 | 2 | 0 | 1 | 581 +-------------+-------------+-------------+-------------+ 582 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 583 +-------------+-------------+-------------+-------------+ 584 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 585 +-------------+-------------+-------------+-------------+ 587 Figure 12 589 After removal of element with ID 0x0304 and hash 0x0101 mapped to 590 {0,1} from the IBF: 592 bucket-0 bucket-1 bucket-2 bucket-3 593 +-------------+-------------+-------------+-------------+ 594 count | 0 | 1 | 0 | 1 | 595 +-------------+-------------+-------------+-------------+ 596 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 597 +-------------+-------------+-------------+-------------+ 598 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 599 +-------------+-------------+-------------+-------------+ 601 Figure 13 603 Note that it is possible to "remove" elements from an IBF that were 604 never present in the IBF in the first place. A negative counter 605 value is thus indicative of elements that were removed without having 606 been added. Note that an IBF bucket counter of zero no longer 607 guarantees that an element mapped to that bucket is not present in 608 the set: a bucket with a counter of zero can be the result of one 609 element being added and a different element (mapped to the same 610 bucket) being removed. To check that an element is not present 611 requires a counter of zero and an IDSUM and HASHSUM of zero --- and 612 some certainty that there was no collision due to the limited number 613 of bits in IDSUM and HASHSUM. Thus, IBFs are not suitable to replace 614 BFs or IBFs. 616 Buckets in an IBF with a counter of 1 or -1 are crucial for decoding 617 an IBF, as they MIGHT represent only a single element, with the IDSUM 618 being the ID of that element. Following Eppstein [Eppstein], we will 619 call buckets that only represent a single element _pure buckets_. 620 Note that due to the possibility of multiple insertion and removal 621 operations affecting the same bucket, not all buckets with a counter 622 of 1 or -1 are actually pure buckets. Sometimes a counter can be 1 623 or -1 because N elements mapped to that bucket were added while N-1 624 or N+1 different elements also mapped to that bucket were removed. 626 3.5.3. Extracting elements 628 Extracting elements from an IBF yields IDs of elements from the IBF. 629 Elements are extracted from an IBF by repeatedly performing a decode 630 operation on the IBF. 632 A decode operation requires a pure bucket, that is a bucket to which 633 M only mapped a single element, to succeed. Thus, if there is no 634 bucket with a counter of 1 or -1, decoding fails. However, as a 635 counter of 1 or -1 is not a guarantee that the bucket is pure, there 636 is also a chance that the decoder returns an IDSUM value that is 637 actually the XOR of several IDSUMs. This is primarily detected by 638 checking that the HASHSUM is the hash of the IDSUM. Only if the 639 HASHSUM also matches, the bucket could be pure. Additionally, one 640 MUST check that the IDSUM value actually would be mapped by M to the 641 respective bucket. If not, there was a hash collision and the bucket 642 is also not pure. 644 The very rare case that after all these checks a bucket is still 645 falsely identified as pure MUST be detected (say by determining that 646 extracted element IDs do not match any actual elements), and 647 addressed at a higher level in the protocol. As these failures are 648 probabilistic and depend on element IDs and the IBF construction, 649 they can typically be avoided by retrying with different parameters, 650 such as a different way to assign element IDs to elements (by varying 651 the IBF-salt), using a larger value for L, or a different mapping 652 function M. A more common scenario (especially if L was too small) 653 is that IBF decoding fails because there is no pure bucket. In this 654 case, the higher-level protocol generally MUST also retry using 655 different parameters (except if an attack is detected). 657 Suppose the IBF contains a pure bucket. In this case, the IDSUM in 658 the bucket is the ID of an element. Furthermore, it is then possible 659 to remove that element from the IBF (by inserting it if the counter 660 was negative, and by removing it if the counter was positive). This 661 is likely to cause other buckets to become pure, allowing further 662 elements to be decoded. Eventually, decoding ought to finish with 663 all counters and IDSUM and HASHSUM values reach zero. However, it is 664 also possible that an IBF only partly decodes and then decoding fails 665 due to the lack of pure buckets after extracting some element IDs. 667 In the following example the successful decoding of an IBF containing 668 the two elements previously added in our running example. 670 We begin with an IBF with two elements added: 672 bucket-0 bucket-1 bucket-2 bucket-3 673 +-------------+-------------+-------------+-------------+ 674 count | 1 | 2 | 0 | 1 | 675 +-------------+-------------+-------------+-------------+ 676 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 677 +-------------+-------------+-------------+-------------+ 678 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 679 +-------------+-------------+-------------+-------------+ 681 Figure 14 683 In the IBF are two pure buckets to decode (buckets 0 and 3) we choose 684 to start with decoding bucket 0. This yields the element with the 685 hash ID 0x0304 and hash 1010. This element ID is mapped to buckets 686 {0,1}. Subtracting this element results in bucket 1 becoming pure: 688 bucket-0 bucket-1 bucket-2 bucket-3 689 +-------------+-------------+-------------+-------------+ 690 count | 0 | 1 | 0 | 1 | 691 +-------------+-------------+-------------+-------------+ 692 idSum | 0x0000 | 0x0102 | 0x0000 | 0x0102 | 693 +-------------+-------------+-------------+-------------+ 694 hashSum | 0x0000 | 0x4242 | 0x0000 | 0x4242 | 695 +-------------+-------------+-------------+-------------+ 697 Figure 15 699 We can now decoding bucket 2 and extract the element with the ID 700 0x0102 and hash 0x4242. Now the IBF is empty. Extraction completes 701 with the status that the IBF has been successfully decoded. 703 bucket-0 bucket-1 bucket-2 bucket-3 704 +-------------+-------------+-------------+-------------+ 705 count | 0 | 0 | 0 | 0 | 706 +-------------+-------------+-------------+-------------+ 707 idSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 708 +-------------+-------------+-------------+-------------+ 709 hashSum | 0x0000 | 0x0000 | 0x0000 | 0x0000 | 710 +-------------+-------------+-------------+-------------+ 712 Figure 16 714 3.5.4. Set Difference 716 Given addition and removal as defined above, it is possible to define 717 an operation on IBFs that computes an IBF representing the set 718 difference. Suppose IBF1 represents set A, and IBF2 represents set 719 B. Then this set difference operation will compute IBF3 which 720 represents the set A - B. Note that this computation can be done 721 only on the IBFs, and does not require access to the elements from 722 set A or B. To calculate the IBF representing this set difference, 723 both IBFs MUST have the same length L, the same number of buckets per 724 element k and use the same map M. Naturally, all IDs must have been 725 computed using the same IBF-salt. Given this, one can compute the 726 IBF representing the set difference by taking the XOR of the IDSUM 727 and HASHSUM values of the respective buckets and subtracting the 728 respective counters. Care MUST be taken to handle overflows and 729 underflows by setting the counter to "infinity" as necessary. The 730 result is a new IBF with the same number of buckets representing the 731 set difference. 733 This new IBF can be decoded as described in section 3.5.3. The new 734 IBF can have two types of pure buckets with counter set to 1 or -1. 735 If the counter is set to 1 the element is missing in the secondary 736 set, and if the counter is set to -1 the element is missing in the 737 primary set. 739 To demonstrate the set difference operation we compare IBF-A with 740 IBF-B and generate as described IBF-AB 742 IBF-A contains the elements with ID 0x0304 and hash 0x0101 mapped to 743 {0,1}, and ID 0x0102 and hash 0x4242 mapped to {1,3}: 745 bucket-0 bucket-1 bucket-2 bucket-3 746 +-------------+-------------+-------------+-------------+ 747 count | 1 | 2 | 0 | 1 | 748 +-------------+-------------+-------------+-------------+ 749 idSum | 0x0304 | 0x0206 | 0x0000 | 0x0102 | 750 +-------------+-------------+-------------+-------------+ 751 hashSum | 0x0101 | 0x4343 | 0x0000 | 0x4242 | 752 +-------------+-------------+-------------+-------------+ 754 Figure 17 756 IBF-B also contains the element with ID 0x0102 and and another 757 element with ID 0x1345 and hash 0x5050 mapped to {1,2}. 759 bucket-0 bucket-1 bucket-2 bucket-3 760 +-------------+-------------+-------------+-------------+ 761 count | 0 | 1 | 1 | 1 | 762 +-------------+-------------+-------------+-------------+ 763 idSum | 0x0000 | 0x1447 | 0x1345 | 0x0102 | 764 +-------------+-------------+-------------+-------------+ 765 hashSum | 0x0000 | 0x9292 | 0x5050 | 0x4242 | 766 +-------------+-------------+-------------+-------------+ 768 Figure 18 770 IBF-A minus IBF-B is then: 772 bucket-0 bucket-1 bucket-2 bucket-3 773 +-------------+-------------+-------------+-------------+ 774 count | 1 | 0 | -1 | 0 | 775 +-------------+-------------+-------------+-------------+ 776 idSum | 0x0304 | 0x1049 | 0x1345 | 0x0000 | 777 +-------------+-------------+-------------+-------------+ 778 hashSum | 0x0101 | 0x5151 | 0x5050 | 0x0000 | 779 +-------------+-------------+-------------+-------------+ 781 Figure 19 783 After calculating and decoding the IBF-AB shows clear that in IBF-A 784 the element with the hash 0x5050 is missing (-1 in bucket 2) while in 785 IBF-B the element with the hash 0101 is missing (1 in bucket 0). The 786 element with hash 0x4242 is present in IBF-A and IBF-B and is removed 787 by the set difference operation. Bucket 2 is not empty. 789 3.6. Wire format 791 For the counter field, we use a variable-size encoding to ensure that 792 even for very large sets the counter should never reach "infinity", 793 while also ensuring that the encoding is compact for small sets. 794 Hence, the counter size transmitted over the wire varies between 1 795 and 64 bits, depending on the maximum counter in the IBF. A range of 796 1 to 64 bits should cover most areas of application and can be 797 efficiently implemented on most contemporary CPU architectures and 798 programming languages. The bit length for the transmitted IBF will 799 be communicated in the header of the _IBF_ message in the "IMCS" 800 field as unsigned 8-bit integer. For implementation details see 801 section Variable Counter Size. 803 For the "IDSUM", we always use a 64-bit representation. The IDSUM 804 value MUST have sufficient entropy for the mapping function M to 805 yield reasonably random buckets even for very large values of L. 806 With a 32 bit value the chance that multiple elements may be mapped 807 to the same ID would be quite high, even for moderately large sets. 808 Using more than 64 bits would at best make sense for very large sets, 809 but then it is likely always better to simply afford additional round 810 trips to handle the occasional collision. 64 bits are also a 811 reasonable size for many CPU architectures. 813 For the "HASHSUM", we always use a 32-bit representation. Here, it 814 is most important to avoid collisions, where different elements are 815 mapped to the same hash, possibly resulting in a bucket being falsely 816 classified as pure. While with 32 bits there remains a non- 817 negligible chance of accidental collisions, our protocol is designed 818 to handle occasional collisions. Hence, at 32 bit the chance is 819 believed to be sufficiently small enough for the protocol to handle 820 those cases efficiently. Smaller hash values would safe bandwidth, 821 but also substantially increase the chance of collisions. 32 bits are 822 also again a reasonable size for many CPU architectures. 824 4. Strata Estimator 826 Strata Estimators help estimate the size of the set difference 827 between two sets of elements. This is necessary to efficiently 828 determinate the tuning parameters for an IBF, in particular a good 829 value for L. 831 Basically a Strata Estimator (SE) is a series of IBFs (with a rather 832 small value of L=79) in which increasingly large subsets of the full 833 set of elements are added to each IBF. For the n-th IBF, the 834 function selecting the subset of elements MUST sample to select 835 (probabilistically) 1/(2^n) of all elements. This can be done by 836 counting the number of trailing bits set to "1" in an element ID, and 837 then inserting the element into the IBF identified by that counter. 838 As a result, all elements will be mapped to one IBF, with the n-th 839 IBF being statistically expected to contain 1/(2^n) elements. 841 Given two SEs, the set size difference can be estimated by attempting 842 to decode all of the IBFs. Given that L is set to a fixed and rather 843 small value, IBFs containing large strata will likely fail to decode. 844 For IBFs that failed to decode, one simply extrapolates the number of 845 elements by scaling the numbers obtained from the other IBFs that did 846 decode. If none of the IBFs of the SE decoded (which given a 847 reasonable number of IBFs in the SE should be highly unlikely), one 848 can theoretically retry using a different IBF-salt. 850 When decoding the IBFs in the strata estimator, it is possible to 851 determine on which side which part of the difference is. For this 852 purpose, the pure buckets with counter 1 and -1 must be distinguished 853 and assigned to the respective side when decoding the IBFs. 855 5. Mode of Operation 857 Depending on the state of the two sets the set union protocol uses 858 different modes of operation to efficiently determinate missing 859 elements between the two sets. 861 The simplest mode is the _full synchronisation mode_. If the 862 difference between the sets of the two peers exceeds a certain 863 threshold, the overhead to determine which elements are different 864 would outweigh the overhead of simply sending the complete set. 865 Hence, the protocol may determine that the most efficient method is 866 to exchange the full sets. 868 The second possibility is that the difference between the sets is 869 relatively small compared to the set size. In this case, the 870 _differential synchronisation mode_ is more efficient. Given these 871 two possibilities, the first steps of the protocol are used to 872 determine which mode MUST be used. 874 Thus, the set union protocol always begins with the following 875 operation mode independent steps: 877 The initiating peer begins in the *Initiating Connection* state and 878 the receiving peer in the *Expecting Connection* state. The first 879 step for the initiating peer in the protocol is to send an _Operation 880 Request_ to the receiving peer and transition into the *Expect SE* 881 state. After receiving the _Operation Request_ the receiving peer 882 transitions to the *Expecting IBF* state and answers with the _Strata 883 Estimator_ message. When the initiating peer receives the _Strata 884 Estimator_ message, it decides with some heuristics which operation 885 mode is likely more suitable for the estimated set difference and the 886 application-provided latency-bandwidth tradeoff. The detailed 887 algorithm used to choose between the Full Synchronisation Mode and 888 the Differential Synchronisation Mode is explained in the section 889 Combined Mode below. 891 5.1. Full Synchronisation Mode 893 When the initiating peer decides to use the full synchronisation mode 894 and it is better that the other peer sends his set first, the 895 initiating peer sends a _Request Full_ message, and transitions from 896 *Expecting SE* to the *Full Receiving* state. If it has been 897 determined that it is better that the initiating peer sends his set 898 first, the initiating peer sends a _Send Full_ message followed by 899 all set elements in _Full Element_ messages to the other peer, 900 followed by the _Full Done_ message, and transitions into the *Full 901 Sending* state. 903 A state diagram illustrating the state machine used during full 904 synchronization is provided here 905 (https://git.gnunet.org/lsd0003.git/plain/statemachine/ 906 state_machine_full.png). 908 *The behavior of the participants the different state is described 909 below:* 911 *Expecting IBF:* If a peer in the *Expecting IBF* state receives a 912 _Request Full_ message from the other peer, the peer sends all the 913 elements of his set followed by a _Full Done_ message to the other 914 peer, and transitions to the *Full Sending* state. If the peer 915 receives an _Send Full_ message followed by _Full Element_ 916 messages, the peer processes the element and transitions to the 917 *Full Receiving* state. 919 *Full Sending:* While a peer is in *Full Sending* state the peer 920 expects to continuously receive elements from the other peer. As 921 soon as a the _Full Done_ message is received, the peer 922 transitions into the *Finished* state. 924 *Full Receiving:* While a peer is in the *Full Receiving* state, it 925 expects to continuously receive elements from the other peer. As 926 soon as a the _Full Done_ message is received, it sends the 927 remaining elements (those it did not receive) from his set to the 928 other peer, followed by a _Full Done_. After sending the last 929 message, the peer transitions into the *Finished* state. 931 5.2. Differential Synchronisation Mode 933 The message format used by the protocol limits the maximum message 934 size to 64 kb. Given that L can be large, an IBF will not always fit 935 within that size limit. To deal with this, larger IBFs are split 936 into multiple messages. 938 When the initiating peer in the *Expected SE* state decides to use 939 the differential synchronisation mode, it sends an IBF, which may 940 consist of several _IBF_ messages, to the receiving peer and 941 transitions into the *Passive Decoding* state. 943 The receiving peer in the *Expecting IBF* state receives the first 944 _IBF_ message from the initiating peer, and transitions into the 945 *Expecting IBF Last* state if the IBF was split into multiple _IBF_ 946 messages. If there is just a single _IBF_ message, the receiving 947 peer transitions directly to the *Active Decoding* state. 949 The peer that is in the *Active Decoding*, *Finish Closing* or in the 950 *Expecting IBF Last* state is called the active peer, and the peer 951 that is in either the *Passive Decoding* or the *Finish Waiting* 952 state is called the passive peer. 954 A state diagram illustrating the state machine used during 955 differential synchronization is provided here 956 (https://git.gnunet.org/lsd0003.git/plain/statemachine/ 957 differential_state_machine.png). 959 *The behavior of the participants the different states is described 960 below:* 962 *Passive Decoding:* In the *Passive Decoding* state the passive peer 963 reacts to requests from the active peer. The action the passive 964 peer executes depends on the message the passive peer receives in 965 the *Passive Decoding* state from the active peer and is described 966 below on a per message basis. 968 _Inquiry_ message: The _Inquiry_ message is received if the 969 active peer requests the SHA-512 hash of one or more elements 970 (by sending the 64 bit element ID) that are missing from the 971 active peer's set. In this case the passive peer answers with 972 _Offer_ messages which contain the SHA-512 hash of the 973 requested element. If the passive peer does not have an 974 element with a matching element ID, it MUST ignore the inquiry 975 (in this case, a bucket was falsely classified as pure, 976 decoding the IBF will eventually fail, and roles will be 977 swapped). It should be verified that after an falsely 978 classified pure bucket a role change is made. If multiple 979 elements match the 64 bit element ID, the passive peer MUST 980 send offers for all of the matching elements. 982 _Demand_ message: The _Demand_ message is received if the active 983 peer requests a complete element that is missing in the active 984 peers set in response to an offer. If the requested element is 985 known and has not yet been transmitted the passive peer answers 986 with an _Element_ message which contains the full, application- 987 dependent data of the requested element. If the passive peer 988 receives a demand for a SHA-512 hash for which it has no 989 corresponding element, a protocol violation is detected and the 990 protocol MUST be aborted. Implementations MUST also abort when 991 facing demands without previous matching offers or for which 992 the passive peer previously transmitted the element to the 993 active peer. 995 _Offer_ message: The _Offer_ message is received if the active 996 peer has decoded an element that is present in the active peers 997 set and is likely be missing in the set of the passive peer. 998 If the SHA-512 hash of the offer is indeed not a hash of any of 999 the elements from the set of the passive peer, the passive peer 1000 MUST answer with a _Demand_ message for that SHA-512 hash and 1001 remember that it issued this demand. The demand thus needs to 1002 be added to a list with unsatisfied demands. 1004 _Element_ message: When a new _Element_ message has been received 1005 the peer checks if a corresponding _Demand_ for the element has 1006 been sent and the demand is still unsatisfied. If the element 1007 has been demanded the peer checks the element for validity, 1008 removes it from the list of pending demands and then saves the 1009 element to the set. Otherwise the peer ignores the element. 1011 _IBF_ message: If an _IBF_ message is received, this indicates 1012 that decoding of the IBF on the active site has failed and 1013 roles will be swapped. The receiving passive peer transitions 1014 into the *Expecting IBF Last* state, and waits for more _IBF_ 1015 messages. There, once the final _IBF Last_ message has been 1016 received, it transitions to *Active Decoding*. 1018 _IBF Last_ message: If an _IBF Last_ message is received this 1019 indicates that there is just one IBF slice left and a direct 1020 state and role transition from *Passive Decoding* to *Active 1021 Decoding* is initiated. 1023 _Done_ message: Receiving the _Done_ message signals the passive 1024 peer that all demands of the active peer have been satisfied. 1025 Alas, the active peer will continue to process demands from the 1026 passive peer. Upon receiving this message, the passive peer 1027 transitions into the *Finish Waiting* state. 1029 *Active Decoding:* In the *Active Decoding* state the active peer 1030 decodes the IBFs and evaluates the set difference between the 1031 active and passive peer. Whenever an element ID is obtained by 1032 decoding the IBF, the active peer sends either an offer or an 1033 inquiry to the passive peer, depending on which site the decoded 1034 element is missing. 1036 If the IBF decodes a positive (1) pure bucket, the element is 1037 missing on the passive peers site. Thus, the active peer sends an 1038 _Offer_ to the passive peer. A negative (-1) pure bucket 1039 indicates that an element is missing in the active peers set, so 1040 the active peer sends a _Inquiry_ to the passive peer. 1042 In case the IBF does not successfully decode anymore, the active 1043 peer sends a new IBF computed with a different IBF-salt to the 1044 passive peer and changes into *Passive Decoding* state. This 1045 initiates a role swap. To reduce overhead and prevent double 1046 transmission of offers and elements, the new IBF is created on the 1047 local set after updating it with the all of the elements that have 1048 been successfully demanded. Note that the active peer MUST NOT 1049 wait for all active demands to be satisfied, as demands can fail 1050 if a bucket was falsely classified as pure. 1052 As soon as the active peer successfully finished decoding the IBF, 1053 the active peer sends a _Done_ message to the passive peer. 1055 All other actions taken by the active peer depend on the message 1056 the active peer receives from the passive peer. The actions are 1057 described below on a per message basis: 1059 _Offer_ message: The _Offer_ message indicates that the passive 1060 peer received a _Inquiry_ message from the active peer. If a 1061 inquiry has been sent and the offered element is missing in the 1062 active peers set, the active peer sends a _Demand_ message to 1063 the passive peer. The demand needs to be added to a list of 1064 unsatisfied demands. In case the received offer is for an 1065 element that is already in the set of the peer, the offer MUST 1066 BE ignored. 1068 _Demand_ message: The _Demand_ message indicates that the passive 1069 peer received a _Offer_ from the active peer. The active peer 1070 satisfies the demand of the passive peer by sending an 1071 _Element_ message if a offer request for the element was sent 1072 earlier. Otherwise, the protocol MUST be aborted, as peers 1073 must never send demands for hashes that they have never been 1074 offered. 1076 _Element_ message: If element is received that was not demanded 1077 or for which the application-specific validation logic fails, 1078 the protocol MUST be aborted. Otherwise, the corresponding 1079 demand is marked as satisfied. Note that this applies only to 1080 the differential synchronization mode. In full 1081 synchronization, it is perfectly normal to receive Full Element 1082 messages for elements that were not demanded and that might 1083 even already be known locally. 1085 _Done_ message: Receiving the message _Done_ indicates that all 1086 demands of the passive peer have been satisfied. The active 1087 peer then changes into the *Finish Closing* state. If the IBF 1088 has not finished decoding and the _Done_ is received, the other 1089 peer is not in compliance with the protocol and the protocol 1090 MUST be aborted. 1092 *Expecing IBF Last* In this state the active peer continuously 1093 receives _IBF_ messages from the passive peer. When the last _IBF 1094 Last_ message is received, the peer changes into the *Active 1095 Decoding* state. 1097 *Finish Closing* / *Finish Waiting* In this states the peers are 1098 waiting for all demands to be satisfied and for the 1099 synchronisation to be completed. When all demands are satisfied 1100 the peer changes into *Finished* state. 1102 5.3. Combined Mode 1104 In the _combined mode_ the protocol decides between Full 1105 Synchronisation Mode and the Differential Synchronisation Mode to 1106 minimize resource consumption. Typically, the protocol always runs 1107 in combined mode, but implementations MAY allow applications to force 1108 the use of one of the modes for testing. In this case, applications 1109 MUST ensure that the respective options to force a particular mode 1110 are used by both participants. 1112 The Differential Synchronisation Mode is only efficient on small set 1113 differences or if the byte-size of the elements is large. If the set 1114 difference is estimated to be large the Full Synchronisation Mode is 1115 more efficient. The exact heuristics and parameters on which the 1116 protocol decides which mode MUST be used are described in the 1117 Performance Considerations section of this document. 1119 There are two main cases when a Full Synchronisation Mode is always 1120 used. The first case is when one of the peers announces having an 1121 empty set. This is announced by setting the SETSIZE field in the 1122 _Strata Estimator_ to 0. The second case is if the application 1123 requests full synchronisation explicitly. This is useful for testing 1124 and MUST NOT be used in production. 1126 The state diagram illustrating the combined mode can be found here 1127 (https://git.gnunet.org/lsd0003.git/plain/statemachine/ 1128 full_state_machine.png). 1130 6. Messages 1132 This section describes the various message formats used by the 1133 protocol. 1135 6.1. Operation Request 1137 6.1.1. Description 1139 This message is the first message of the protocol and it is sent to 1140 signal to the receiving peer that the initiating peer wants to 1141 initialize a new connection. 1143 This message is sent in the transition between the *Initiating 1144 Connection* state and the *Expect SE* state. 1146 If a peer receives this message and is willing to run the protocol, 1147 it answers by sending back a _Strata Estimator_ message. Otherwise 1148 it simply closes the connection. 1150 6.1.2. Structure 1152 0 8 16 24 32 40 48 56 1153 +-----+-----+-----+-----+-----+-----+-----+-----+ 1154 | MSG SIZE | MSG TYPE | ELEMENT COUNT | 1155 +-----+-----+-----+-----+-----+-----+-----+-----+ 1156 | APX 1157 +-----+-----+-----+-----+-----+-----+-----+-----+ 1158 / APPLICATION DATA / 1159 / / 1161 Figure 20 1163 where: 1165 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1166 describes the message size in bytes with the header included. 1168 MSG TYPE is the type of SETU_P2P_OPERATION_REQUEST as registered in 1169 GANA Considerations, in network byte order. 1171 ELEMENT COUNT is the number of the elements the requesting party has 1172 in its set, as a 32-bit unsigned integer in network byte order. 1174 APX is a SHA-512 hash that identifies the application. 1176 APPLICATION DATA is optional, variable-size application specific 1177 data that can be used by the application to decide if it would 1178 like to answer the request. 1180 6.2. IBF 1182 6.2.1. Description 1184 The IBF message contains a slice of the IBF. 1186 The _IBF_ message is sent at the start of the protocol from the 1187 initiating peer in the transaction between *Expect SE* -> *Expecting 1188 IBF Last* or when the IBF does not decode and there is a role change 1189 in the transition between *Active Decoding* -> *Expecting IBF Last*. 1190 This message is only sent if there is more than one IBF slice to be 1191 sent. If there is just one slice, then only the IBF Last message is 1192 sent. 1194 6.2.2. Structure 1195 0 8 16 24 32 40 48 56 1196 +-----+-----+-----+-----+-----+-----+-----+-----+ 1197 | MSG SIZE | MSG TYPE | IBF SIZE | 1198 +-----+-----+-----+-----+-----+-----+-----+-----+ 1199 | OFFSET | SALT | IMCS | 1200 +-----+-----+-----+-----+-----+-----+-----+-----+ 1201 | IBF-SLICE 1202 +-----+-----+-----+-----+-----+-----+-----+-----+ 1203 / / 1204 / / 1206 Figure 21 1208 where: 1210 MSG SIZE is a 16-bit unsigned integer in network byte 1211 orderwhichdescribes the message size in bytes with the header 1212 included. 1214 MSG TYPE the type of SETU_P2P_REQUEST_IBF as registered in GANA 1215 Considerations in network byte order. 1217 IBF SIZE is a 32-bit unsigned integer which signals the total number 1218 of buckets in the IBF. The minimal number of buckets is 37. 1220 OFFSET is a 32-bit unsigned integer which signals the offset of the 1221 following IBF slices in the original. 1223 SALT is a 16-bit unsigned integer that contains the IBF-salt which 1224 was used to create the IBF. 1226 IMCS is a 16-bit unsigned integer, which describes the number of 1227 bits that are required to store a single counter. This is used 1228 for the unpacking function as described in the Variable Counter 1229 Size section. 1231 IBF-SLICE are variable numbers of slices in an array. A single 1232 slice contains multiple 64-bit IDSUMS, 32-bit HASHSUMS and (1-64)- 1233 bit COUNTERS of variable size. All values are in the network byte 1234 order. The array of IDSUMS is serialized first, followed by an 1235 array of HASHSUMS. Last comes an array of unsigned COUNTERS 1236 (details of the COUNTERS encoding are described in section 1237 Section 7.2). The length of the array is defined by MIN( SIZE - 1238 OFFSET, MAX_BUCKETS_PER_MESSAGE). MAX_BUCKETS_PER_MESSAGE is 1239 defined as 32768 divided by the BUCKET_SIZE which ranges between 1240 97-bits when counter uses bit 1 (IMCS=1) and 160-bit when counter 1241 size uses 64 bit (IMCS=64). 1243 To get the IDSUM field, all IDs (computed with the IBF-salt) 1244 hitting a bucket under the map M are added up with a binary XOR 1245 operation. See Salted Element ID Calculation details about ID 1246 generation. 1248 The calculation of the HASHSUM field is done accordingly to the 1249 calculation of the IDSUM field: all HASHes are added up with a 1250 binary XOR operation. The HASH value is calculated as described 1251 in detail in section HASH calculation. 1253 The algorithm to find the correct bucket in which the ID and the 1254 HASH have to be added is described in detail in section Mapping 1255 Function. 1257 Test vectors for an implementation can be found in the Test 1258 Vectors section 1260 IBF-SLICE 1261 0 8 16 24 32 40 48 56 1262 +-----+-----+-----+-----+-----+-----+-----+-----+ 1263 | IDSUMS | 1264 +-----+-----+-----+-----+-----+-----+-----+-----+ 1265 | IDSUMS | 1266 +-----+-----+-----+-----+-----+-----+-----+-----+ 1267 | HASHSUMS | HASHSUMS | 1268 +-----+-----+-----+-----+-----+-----+-----+-----+ 1269 | COUNTERS* | COUNTERS* | 1270 +-----+-----+-----+-----+-----+-----+-----+-----+ 1271 / / 1272 / / 1273 * Counter size is variable. In this example the IMCS is 32 (4 bytes). 1275 Figure 22 1277 6.3. IBF Last 1279 6.3.1. Description 1281 This message indicates to the remote peer that this is the last slice 1282 of the Bloom filter. The receiving peer MUST check that the sizes 1283 and offsets of all received IBF slices add up to the total IBF SIZE 1284 that was given. 1286 Receiving this message initiates the state transmissions *Expecting 1287 IBF Last* -> *Active Decoding*, *Expecting IBF* -> *Active Decoding* 1288 and *Passive Decoding* -> *Active Decoding*. This message can 1289 initiate a peer the roll change from *Active Decoding* to *Passive 1290 Decoding*. 1292 6.3.2. Structure 1294 The binary structure is exactly the same as the Structure of the 1295 message IBF with a different "MSG TYPE" which is defined in GANA 1296 Considerations "SETU_P2P_IBF_LAST". 1298 6.4. Element 1300 6.4.1. Description 1302 The _Element_ message contains an element that is synchronized in the 1303 Differential Synchronisation Mode and transmits a full element 1304 between the peers. 1306 This message is sent in the state *Active Decoding* and *Passive 1307 Decoding* as answer to a _Demand_ message from the remote peer. The 1308 _Element_ message can also be received in the *Finish Closing* or 1309 *Finish Waiting* state after receiving a _Done_ message from the 1310 remote peer. In this case the peer changes to the *Finished* state 1311 as soon as all demands for elements have been satisfied. 1313 This message is exclusively used in the Differential Synchronisation 1314 Mode. 1316 6.4.2. Structure 1318 0 8 16 24 32 40 48 56 1319 +-----+-----+-----+-----+-----+-----+-----+-----+ 1320 | MSG SIZE | MSG TYPE | E TYPE | PADDING | 1321 +-----+-----+-----+-----+-----+-----+-----+-----+ 1322 | E SIZE | DATA 1323 +-----+-----+-----+-----+-----+-----+-----+-----+ 1324 / / 1325 / / 1327 Figure 23 1329 where: 1331 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1332 describes the message size in bytes with the header included. 1334 MSG TYPE is SETU_P2P_ELEMENTS as registered in GANA Considerations 1335 in network byte order. 1337 E TYPE is a 16-bit unsigned integer which defines the element type 1338 for the application. 1340 PADDING is 16-bit always set to zero. 1342 E SIZE is a 16-bit unsigned integer that signals the size of the 1343 elements data part. 1345 DATA is a field with variable length that contains the data of the 1346 element. 1348 6.5. Offer 1350 6.5.1. Description 1352 The _Offer_ message is an answer to an _Inquiry_ message and 1353 transmits the full hash of an element that has been requested by the 1354 other peer. This full hash enables the other peer to check if the 1355 element is really missing in his set and eventually sends a _Demand_ 1356 message for that element. 1358 The offer is sent and received only in the *Active Decoding* and in 1359 the *Passive Decoding* state. 1361 This message is exclusively sent in the Differential Synchronisation 1362 Mode. 1364 6.5.2. Structure 1366 0 8 16 24 32 40 48 56 1367 +-----+-----+-----+-----+-----+-----+-----+-----+ 1368 | MSG SIZE | MSG TYPE | HASH 1 1369 +-----+-----+-----+-----+-----+-----+-----+-----+ 1370 ... ... 1371 +-----+-----+-----+-----+-----+-----+-----+-----+ 1372 HASH 1 | HASH 2 1373 +-----+-----+-----+-----+-----+-----+-----+-----+ 1374 ... ... 1375 +-----+-----+-----+-----+-----+-----+-----+-----+ 1376 HASH 2 | HASH n 1377 +-----+-----+-----+-----+-----+-----+-----+-----+ 1378 / / 1379 / / 1381 Figure 24 1383 where: 1385 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1386 describes the message size in bytes header included. 1388 MSG TYPE is SETU_P2P_OFFER as registered in GANA Considerations in 1389 network byte order. 1391 HASH n contains n (one or more) successive SHA 512-bit hashes of the 1392 elements that are being requested with _Inquiry_ messages. 1394 6.6. Inquiry 1396 6.6.1. Description 1398 The _Inquiry_ message is exclusively sent by the active peer in 1399 *Active Decoding* state to request the full hash of an element that 1400 is missing in the active peers set. This is normally answered by the 1401 passive peer with _Offer_ message. 1403 This message is exclusively sent in the Differential Synchronisation 1404 Mode. 1406 6.6.2. Structure 1408 0 8 16 24 32 40 48 56 1409 +-----+-----+-----+-----+-----+-----+-----+-----+ 1410 | MSG SIZE | MSG TYPE | SALT | 1411 +-----+-----+-----+-----+-----+-----+-----+-----+ 1412 | IBF KEY 1 | 1413 +-----+-----+-----+-----+-----+-----+-----+-----+ 1414 | IBF KEY 2 | 1415 +-----+-----+-----+-----+-----+-----+-----+-----+ 1416 ... ... 1417 +-----+-----+-----+-----+-----+-----+-----+-----+ 1418 | IBF KEY n | 1419 +-----+-----+-----+-----+-----+-----+-----+-----+ 1420 / / 1421 / / 1423 Figure 25 1425 where: 1427 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1428 describes the message size in bytes with the header included. 1430 MSG TYPE is SETU_P2P_INQUIRY as registered in GANA Considerations in 1431 network byte order. 1433 IBF KEY contains n (one or more) successive ibf keys (64-bit 1434 unsigned integer) for which the inquiry is sent. 1436 6.7. Demand 1438 6.7.1. Description 1440 The _Demand_ message is sent in the *Active Decoding* and in the 1441 *Passive Decoding* state. It is an answer to a received _Offer_ 1442 message and is sent if the element described in the _Offer_ message 1443 is missing in the peers set. In the normal workflow the answer to 1444 the _Demand_ message is an _Element_ message. 1446 This message is exclusively sent in the Differential Synchronisation 1447 Mode. 1449 6.7.2. Structure 1451 0 8 16 24 32 40 48 56 1452 +-----+-----+-----+-----+-----+-----+-----+-----+ 1453 | MSG SIZE | MSG TYPE | HASH 1 1454 +-----+-----+-----+-----+-----+-----+-----+-----+ 1455 ... ... 1456 +-----+-----+-----+-----+-----+-----+-----+-----+ 1457 HASH 1 | HASH 2 1458 +-----+-----+-----+-----+-----+-----+-----+-----+ 1459 ... ... 1460 +-----+-----+-----+-----+-----+-----+-----+-----+ 1461 HASH 2 | HASH n 1462 +-----+-----+-----+-----+-----+-----+-----+-----+ 1463 / / 1464 / / 1466 Figure 26 1468 where: 1470 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1471 describes the message size in bytes and the header is included. 1473 MSG TYPE the type of SETU_P2P_DEMAND as registered in GANA 1474 Considerations in network byte order. 1476 HASH n contains n (one or more) successive SHA 512-bit hashes of the 1477 elements that are being demanded. 1479 6.8. Done 1480 6.8.1. Description 1482 The _Done_ message is sent when all _Demand_ messages have been 1483 successfully satisfied and from the perspective of the sender the set 1484 is completely synchronized. 1486 This message is exclusively sent in the Differential Synchronisation 1487 Mode. 1489 6.8.2. Structure 1491 0 8 16 24 32 40 48 56 1492 +-----+-----+-----+-----+-----+-----+-----+-----+ 1493 | MSG SIZE | MSG TYPE | FINAL CHECKSUM 1494 +-----+-----+-----+-----+-----+-----+-----+-----+ 1495 / / 1496 / / 1498 Figure 27 1500 where: 1502 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1503 describes the message size in bytes with the header included. The 1504 value is always 4 for this message type. 1506 MSG TYPE is SETU_P2P_DONE as registered in GANA Considerations in 1507 network byte order. 1509 FINAL CHECKSUM a SHA-512 hash XOR sum of the full set after 1510 synchronization. This should ensure that the sets are identical 1511 in the end! 1513 6.9. Full Done 1515 6.9.1. Description 1517 The _Full Done_ message is sent in the Full Synchronisation Mode to 1518 signal that all remaining elements of the set have been sent. The 1519 message is received and sent in the *Full Sending* and in the *Full 1520 Receiving* state. When the _Full Done_ message is received in *Full 1521 Sending* state the peer changes directly into *Finished* state. In 1522 *Full Receiving* state receiving a _Full Done_ message initiates the 1523 sending of the remaining elements that are missing in the set of the 1524 other peer. 1526 6.9.2. Structure 1528 0 8 16 24 32 40 48 56 1529 +-----+-----+-----+-----+-----+-----+-----+-----+ 1530 | MSG SIZE | MSG TYPE | FINAL CHECKSUM 1531 +-----+-----+-----+-----+-----+-----+-----+-----+ 1532 / / 1533 / / 1535 Figure 28 1537 where: 1539 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1540 describes the message size in bytes with the header included. The 1541 value is always 4 for this message type. 1543 MSG TYPE the type of SETU_P2P_FULL_DONE as registered in GANA 1544 Considerations in network byte order. 1546 FINAL CHECKSUM a SHA-512 hash XOR sum of the full set after 1547 synchronization. This should ensure that the sets are identical 1548 in the end! 1550 6.10. Request Full 1552 6.10.1. Description 1554 The _Request Full_ message is sent by the initiating peer in *Expect 1555 SE* state to the receiving peer, if the operation mode "Full 1556 Synchronisation Mode" is determined to be the superior Mode of 1557 Operation and that it is the better choice that the other peer sends 1558 his elements first. The initiating peer changes after sending the 1559 _Request Full_ message into *Full Receiving* state. 1561 The receiving peer receives the _Request Full_ message in the 1562 *Expecting IBF*, afterwards the receiving peer starts sending his 1563 complete set in Full Element messages to the initiating peer. 1565 6.10.2. Structure 1567 0 8 16 24 32 40 48 56 1568 +-----+-----+-----+-----+-----+-----+-----+-----+ 1569 | MSG SIZE | MSG TYPE | REMOTE SET DIFF | 1570 +-----+-----+-----+-----+-----+-----+-----+-----+ 1571 | REMOTE SET SIZE | LOCAL SET DIFF | 1572 +-----+-----+-----+-----+-----+-----+-----+-----+ 1573 Figure 29 1575 where: 1577 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1578 describes the message size in bytes with the header included. The 1579 value is always 16 for this message type. 1581 MSG TYPE is SETU_P2P_REQUEST_FULL as registered in GANA 1582 Considerations in network byte order. 1584 REMOTE SET DIFF is a 32-bit unsigned integer in network byte order, 1585 which represents the remote (from the perspective of the sending 1586 peer) set difference calculated with strata estimator. 1588 REMOTE SET SIZE is a 32-bit unsigned integer in network byte order, 1589 which represents the total remote (from the perspective of the 1590 sending peer) set size. 1592 LOCAL SET DIFF is a 32-bit unsigned integer in network byte order, 1593 which represents the local (from the perspective of the sending 1594 peer) set difference calculated with strata estimator. 1596 6.11. Send Full 1598 6.11.1. Description 1600 The _Send Full_ message is sent by the initiating peer in *Expect SE* 1601 state to the receiving peer if the operation mode "Full 1602 Synchronisation Mode" is determined as superior Mode of Operation and 1603 that it is the better choice that the peer sends his elements first. 1604 The initiating peer changes after sending the _Request Full_ message 1605 into *Full Sending* state. 1607 The receiving peer receives the _Send Full_ message in the *Expecting 1608 IBF* state, afterwards the receiving peer changes into *Full 1609 Receiving* state and expects to receive the set of the remote peer. 1611 6.11.2. Structure 1613 0 8 16 24 32 40 48 56 1614 +-----+-----+-----+-----+-----+-----+-----+-----+ 1615 | MSG SIZE | MSG TYPE | REMOTE SET DIFF | 1616 +-----+-----+-----+-----+-----+-----+-----+-----+ 1617 | REMOTE SET SIZE | LOCAL SET DIFF | 1618 +-----+-----+-----+-----+-----+-----+-----+-----+ 1620 Figure 30 1622 where: 1624 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1625 describes the message size in bytes with the header included. The 1626 value is always 16 for this message type. 1628 MSG TYPE is SETU_P2P_REQUEST_FULL as registered in GANA 1629 Considerations in network byte order. 1631 REMOTE SET DIFF is a 32-bit unsigned integer in network byte order, 1632 which represents the remote (from the perspective of the sending 1633 peer) set difference calculated with strata estimator. 1635 REMOTE SET SIZE is a 32-bit unsigned integer in network byte order, 1636 which represents the total remote (from the perspective of the 1637 sending peer) set size. 1639 LOCAL SET DIFF is a 32-bit unsigned integer in network byte order, 1640 which represents the local (from the perspective of the sending 1641 peer) set difference calculated with strata estimator. 1643 6.12. Strata Estimator 1645 6.12.1. Description 1647 The strata estimator is sent by the receiving peer at the start of 1648 the protocol, right after the Operation Request message has been 1649 received. 1651 The strata estimator is used to estimate the difference between the 1652 two sets as described in section Strata Estimator. 1654 When the initiating peer receives the strata estimator, the peer 1655 decides which Mode of Operation to use for the synchronisation. 1656 Depending on the size of the set difference and the Mode of Operation 1657 the initiating peer changes into *Full Sending*, *Full Receiving* or 1658 *Passive Decoding* state. 1660 The _Strata Estimator_ message can contain one, two, four or eight 1661 strata estimators with different salts, depending on the initial size 1662 of the sets. More details can be found in section Multi Strata 1663 Estimators. 1665 The IBFs in a strata estimator always have 79 buckets. The reason 1666 why can be found in [byzantine_fault_tolerant_set_reconciliation] in 1667 section 3.4.2. 1669 6.12.2. Structure 1671 0 8 16 24 32 40 48 56 1672 +-----+-----+-----+-----+-----+-----+-----+-----+ 1673 | MSG SIZE | MSG TYPE | SEC | SETSIZE 1674 +-----+-----+-----+-----+-----+-----+-----+-----+ 1675 SETSIZE | SE-SLICES 1676 +-----+-----+-----+-----+-----+-----+-----+-----+ 1677 / / 1678 / / 1680 Figure 31 1682 where: 1684 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1685 describes the message size in bytes with the header included. 1687 MSG TYPE is SETU_P2P_SE as registered in GANA Considerations in 1688 network byte order. 1690 SEC is a 8-bit unsigned integer in network byte order, which 1691 indicates how many strata estimators with different salts are 1692 attached to the message. Valid values are 1,2,4 or 8, more 1693 details can be found in the section Multi Strata Estimators. 1695 SETSIZE is a 64-bit unsigned integer that is defined by the size of 1696 the set the SE is 1698 SE-SLICES are variable numbers of slices in an array. A slice can 1699 contain one or more Strata Estimators which contain multiple IBFs 1700 as described in IBF-SLICES in Section 6.2.2. A SE slice can 1701 contain one to eight Strata Estimators which contain 32 (Defined 1702 as Constant SE_STRATA_COUNT) IBFs. Every IBF in a SE contains 79 1703 Buckets. 1705 The different SEs are built as in detail described in Section 7.3. 1706 Simply put, the IBFs in each SE are serialized as described in 1707 Section 6.2.2 starting with the highest stratum. Then the created 1708 SEs are appended one after the other starting with the SE that was 1709 created with a salt of zero. 1711 SE-SLICE 1712 0 8 16 24 32 40 48 56 1713 +-----+-----+-----+-----+-----+-----+-----+-----+ 1714 | SE_1 -> IBF_1 1715 +-----+-----+-----+-----+-----+-----+-----+-----+ 1716 ... ... 1717 +-----+-----+-----+-----+-----+-----+-----+-----+ 1718 | SE_1 -> IBF_30 1719 +-----+-----+-----+-----+-----+-----+-----+-----+ 1720 | SE_2 -> IBF_1 1721 +-----+-----+-----+-----+-----+-----+-----+-----+ 1722 ... ... 1723 / / 1724 / / 1726 Figure 32 1728 6.13. Strata Estimator Compressed 1730 6.13.1. Description 1732 The Strata Estimator can be compressed with gzip as described in 1733 [RFC1951] to improve performance. This can be recognized by the 1734 different message type number from GANA Considerations. 1736 6.13.1.1. Structure 1738 The key difference between the compressed and the uncompressed Strata 1739 Estimator is that the SE slices are compressed with gzip ([RFC1951]) 1740 in the compressed SE. But the header remains uncompressed with both. 1742 Since the content of the message is the same as the uncompressed 1743 Strata Estimator, the details are not repeated here. For details see 1744 section 6.12. 1746 6.14. Full Element 1748 6.14.1. Description 1750 The _Full Element_ message is the equivalent of the Element message 1751 in the Full Synchronisation Mode. It contains a complete element 1752 that is missing in the set of the peer that receives this message. 1754 The _Full Element_ message is exclusively sent in the transitions 1755 *Expecting IBF* -> *Full Receiving* and *Full Receiving* -> 1756 *Finished*. The message is only received in the *Full Sending* and 1757 *Full Receiving* state. 1759 After the last _Full Element_ message has been sent, the _Full Done_ 1760 message is sent to conclude the full synchronisation of the element 1761 sending peer. 1763 6.14.2. Structure 1765 0 8 16 24 32 40 48 56 1766 +-----+-----+-----+-----+-----+-----+-----+-----+ 1767 | MSG SIZE | MSG TYPE | E TYPE | PADDING | 1768 +-----+-----+-----+-----+-----+-----+-----+-----+ 1769 | SIZE | AE TYPE | DATA 1770 +-----+-----+-----+-----+-----+-----+-----+-----+ 1771 / / 1772 / / 1774 Figure 33 1776 where: 1778 MSG SIZE is a 16-bit unsigned integer in network byte order, which 1779 describes the message size in bytes with the header included. 1781 MSG TYPE is SETU_P2P_REQUEST_FULL_ELEMENT as registered in GANA 1782 Considerations in network byte order. 1784 E TYPE is a 16-bit unsigned integer which defines the element type 1785 for the application. 1787 PADDING is 16-bit always set to zero 1789 E SIZE is a 16-bit unsigned integer that signals the size of the 1790 elements data part. 1792 AE TYPE is a 16-bit unsigned integer that is needed to identify the 1793 type of element that is in the data field 1795 DATA is a field with variable length that contains the data of the 1796 element. 1798 7. Performance Considerations 1800 7.1. Formulas 1801 7.1.1. Operation Mode 1803 The decision which Mode of Operation is used is described by the 1804 following code. More detailed explanations motivating the design can 1805 be found in the accompanying thesis in section 1806 4.5.3.[byzantine_fault_tolerant_set_reconciliation] 1808 The function takes as input the average element size, the local set 1809 size, the remote set size, the set differences as estimated from the 1810 strata estimator for both the local and remote sets, and the 1811 bandwidth/roundtrip tradeoff. The function returns the exact Mode of 1812 Operation that is predicted to be best as output: 1813 FULL_SYNC_REMOTE_SENDING_FIRST if it is likely cheapest that the 1814 other peer transmits his elements first, 1815 FULL_SYNC_LOCAL_SENDING_FIRST if it is likely cheapest that the 1816 elements are transmitted to the other peer directly, and 1817 DIFFERENTIAL_SYNC if the differential synchronisation is likely 1818 cheapest. 1820 The constant IBF_BUCKET_NUMBER_FACTOR is always 2 and IBF_MIN_SIZE is 1821 37. The method for deriving this can be found in the IBF parameter 1822 study in [byzantine_fault_tolerant_set_reconciliation] in section 1823 4.5.2. 1825 # CONSTANTS: 1826 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if 1827 # decoding fails 1828 # RTT_MIN_FULL = 2: Minimal round trips used for full Synchronisation 1829 # IBF_MIN_SIZE = 37: The minimal size of an IBF 1830 # MAX_BUCKETS_PER_MESSAGE: Custom value depending on the underlying 1831 # protocol 1832 # INPUTS: 1833 # avg_es: The average element size 1834 # lss: The initial local set size 1835 # rss: The remote set size 1836 # lsd: the estimated local set difference calculated by the SE 1837 # rsd: the estimated remote set difference calculated by the SE 1838 # rtt: the tradeoff between round trips and bandwidth 1839 # OUTPUT: 1840 # FULL_SYNC_REMOTE_SENDING_FIRST, FULL_SYNC_LOCAL_SENDING_FIRST or 1841 # DIFFERENTIAL_SYNC 1843 FUNCTION decide_operation_mode(avg_es, 1844 lss, 1845 rss, 1846 lsd 1847 rsd, 1848 rtt) 1850 # If a set size is zero always do full sync 1851 IF 0 == rss THEN 1852 RETURN FULL_SYNC_LOCAL_SENDING_FIRST 1853 END IF 1854 IF 0 == lss THEN 1855 RETURN FULL_SYNC_REMOTE_SENDING_FIRST 1856 END IF 1858 # Estimate required transferred bytes when doing a full 1859 # synchronisation and transmitting local set first. 1860 semh = sizeof(ELEMENT_MSG_HEADER) 1861 estimated_total_diff = rsd + lsd 1862 total_elements_local_send = rsd + lss 1863 cost_local_full_sync = avg_es * total_elements_local_send 1864 + total_elements_local_send * semh 1865 + sizeof(FULL_DONE_MSG_HEADER) * 2 1866 + RTT_MIN_FULL * rtt 1868 # Estimate required transferred bytes when doing a full 1869 # synchronisation and transmitting remote set first. 1870 total_elements_remote_send = lsd + rss 1871 cost_remote_full_sync = avg_es * total_elements_remote_send 1872 + total_elements_remote_send * semh 1873 + sizeof(FULL_DONE_MSG_HEADER) * 2 1874 + (RTT_MIN_FULL + 0.5) * rtt 1875 + sizeof(REQUEST_FULL_MSG) 1877 # Estimate required transferred bytes when doing a differential 1878 # synchronisation 1880 # Estimate messages required to transfer IBF 1881 ibf_bucket_count = estimated_total_diff * IBF_BUCKET_NUMBER_FACTOR 1882 IF ibf_bucket_count <= IBF_MIN_SIZE THEN 1883 ibf_bucket_count = IBF_MIN_SIZE 1884 END IF 1885 ibf_message_count = ceil (ibf_bucket_count / MAX_BUCKETS_PER_MESSAGE) 1887 # Estimate average counter length with variable counter 1888 estimated_counter_bits = MIN (2 * LOG2(lss / ibf_bucket_count), 1889 LOG2(lss)) 1890 estimated_counter_bytes = estimated_counter_bits / 8 1892 # Sum up all messages required to do differential synchronisation 1893 ibf_bytes = sizeof(IBF_MESSAGE) * ibf_message_count 1894 + ibf_bucket_count * sizeof(IBF_KEY) 1895 + ibf_bucket_count * sizeof(IBF_KEYHASH) 1896 + ibf_bucket_count * estimated_counter_bytes 1897 # Add 20% overhead to cover IBF retries due to decoding failures 1898 total_ibf_bytes = ibf_bytes * 1.2 1900 # Estimate other message sizes to be transfered in diff sync 1901 # Note that we simplify by adding the header each time; 1902 # if the implementation combines multiple INQUIRY/DEMAND/OFFER 1903 # requests in one message, the bandwidth would be lower. 1904 done_size = sizeof(DONE_HEADER) 1905 element_size = (avg_es + sizeof(ELEMENT_MSG_HEADER)) 1906 * estimated_total_diff 1907 inquery_size = (sizeof(IBF_KEY) + sizeof(INQUERY_MSG_HEADER)) 1908 * estimated_total_diff 1909 demand_size = (sizeof(HASHCODE) + sizeof(DEMAND_MSG_HEADER)) 1910 * estimated_total_diff 1911 offer_size = (sizeof(HASHCODE) + sizeof(OFFER_MSG_HEADER)) 1912 * estimated_total_diff 1914 # Estimate total cost 1915 diff_cost = element_size + done_size + inquery_size 1916 + demand_size + offer_size + total_ibf_bytes 1917 + DIFFERENTIAL_RTT_MEAN * rtt 1919 # Decide for a optimal mode of operation 1920 full_cost_min = MIN (cost_local_full_sync, 1921 cost_remote_full_sync) 1922 IF full_cost_min < diff_cost THEN 1923 IF cost_remote_full_sync > cost_local_full_sync THEN 1924 RETURN FULL_SYNC_LOCAL_SENDING_FIRST 1925 ELSE 1926 RETURN FULL_SYNC_REMOTE_SENDING_FIRST 1927 END IF 1928 ELSE 1929 RETURN DIFFERENTIAL_SYNC 1930 END IF 1931 END FUNCTION 1933 Figure 34 1935 7.1.2. IBF Size 1937 The functions, described in this section, calculate a good initial 1938 size (initial_ibf_size) and in case of decoding failure, a good next 1939 IBF size (get_next_ibf_size). 1941 These algorithms are described and justified in more details in 1942 [byzantine_fault_tolerant_set_reconciliation] in the parameter study 1943 in section 3.5.2, the max IBF counter in section 3.10 and the 1944 Improved IBF size in section 3.11. 1946 # CONSTANTS: 1947 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased 1948 if decoding fails 1949 # Inputs: 1950 # sd: Estimated set difference 1951 # Output: 1952 # next_size: Size of the initial IBF 1954 FUNCTION initial_ibf_size(sd) 1955 # We do not go below 37, as 37 buckets should 1956 # basically always be below one MTU, so there is 1957 # little to be gained, while a smaller IBF would 1958 # increase the chance of a decoding failure. 1959 RETURN MAX(37, IBF_BUCKET_NUMBER_FACTOR * sd); 1960 END FUNCTION 1962 # CONSTANTS: 1963 # IBF_BUCKET_NUMBER_FACTOR = 2: The amount the IBF gets increased if 1964 # decoding fails 1965 # Inputs: 1966 # de: Number of elements that have been successfully decoded 1967 # lis: The number of buckets of the last IBF 1968 # Output: 1969 # number of buckets for the next IBF 1971 FUNCTION get_next_ibf_size(de, lis) 1972 next_size = IBF_BUCKET_NUMBER_FACTOR * (lis - de) 1973 # The MAX operation here also ensures that the 1974 # result is positive. 1975 RETURN MAX(37, next_size); 1976 END FUNCTION 1978 Figure 35 1980 7.1.3. Number of Buckets an Element is Hashed into 1982 The number of buckets an element is hashed to is hardcoded to 3. 1983 Reasoning and justification can be found in 1984 [byzantine_fault_tolerant_set_reconciliation] in the IBF parameter 1985 performance study in section 4.5.2. 1987 7.2. Variable Counter Size 1989 The number of bits required to represent the counters of an IBF 1990 varies due to different parameters as described in section 3.2 of 1991 [byzantine_fault_tolerant_set_reconciliation]. Therefore, a packing 1992 algorithm has been implemented. This algorithm encodes the IBF 1993 counters in their optimal bit-width and thus minimizes the bandwidth 1994 needed to transmit the IBF. 1996 A simple algorithm is used for the packing. In a first step it is 1997 determined, which is the largest counter. The the base 2 logarithm 1998 then determines how many bits are needed to store it. In a second 1999 step for every counter of every bucket, the counter is stored using 2000 this many bits. The resulting bit sequence is then simply 2001 concatenated. 2003 Three individual functions are used for this purpose. The first one 2004 is a function that iterates over each bucket of the IBF to get the 2005 maximum counter in the IBF. The second function packs the counters 2006 of the IBF, and the third function that unpacks the counters. 2008 As a plausibly check to prevent the byzantine upper bound checks in 2009 Section 8.1.2 to fail, implementations must ensure that the estimates 2010 of the set size difference added together never exceed the set 2011 byzantine upper bound. This could for example happen in case the 2012 strata estimator overestimates the set difference. 2014 # INPUTS: 2015 # ibf: The IBF 2016 # OUTPUTS: 2017 # returns: Minimal amount of bits required to store the counter 2019 FUNCTION ibf_get_max_counter(ibf) 2020 max_counter=1 # convince static analysis that we never take log2(0) 2021 FOR bucket IN ibf DO 2022 IF bucket.counter > max_counter THEN 2023 max_counter = bucket.counter 2024 END IF 2025 END FOR 2026 # next bigger discrete number of the binary logarithm of the 2027 # max counter 2028 RETURN CEILING( LOG2( max_counter ) ) 2029 END FUNCTION 2031 # INPUTS: 2032 # ibf: The IBF 2033 # offset: The offset which defines the starting point from which bucket 2034 # the pack operation starts 2035 # count: The number of buckets in the array that will be packed 2036 # OUTPUTS: 2037 # returns: A byte array of packed counters to send over the network 2039 # INPUTS: 2040 # ibf: The IBF 2041 # offset: The offset which defines the starting point from which bucket 2042 # the pack operation starts 2043 # count: The number of buckets in the array that will be packed 2044 # OUTPUTS: 2045 # returns: A byte array of packed counters to send over the network 2047 FUNCTION pack_counter(ibf, offset, count) 2048 counter_bytes = ibf_get_max_counter(ibf) 2049 store_bits = 0 2050 store = 0 2051 byte_ctr = 0 2052 buf=[] 2054 FOR bucket IN ibf[offset] TO ibf[count] DO 2055 counter = bucket.counter 2056 byte_len = counter_bytes 2058 WHILE byte_len + store_bits < 8 DO 2059 bit_to_shift = 0 2061 IF store_bits > 0 OR byte_len > 8 THEN 2062 bit_free = 8 - store_bits 2063 bit_to_shift = byte_len - bit_free 2064 store = store << bit_free 2065 END IF 2066 buf[byte_ctr] = (( counter >> bit_to_shift) | store) & 0xFF 2067 byte_ctr = byte_ctr + 1 2068 byte_len -= 8 - store_bits 2069 counter = counter & ((1 << byte_len) - 1) 2070 store = 0 2071 store_bits = 0 2072 END WHILE 2073 store = (store << byte_len) | counter 2074 store_bits = store_bits + byte_len 2075 byte_len = 0 2076 END FOR 2078 # Write the last partial packed byte to the buffer 2079 IF store_bits > 0 THEN 2080 buf[byte_ctr] = store << (8 - store_bits) 2081 byte_ctr = byte_ctr + 1 2082 END IF 2083 RETURN buf 2084 FUNCTION END 2086 # INPUTS: 2087 # ibf: The IBF 2088 # offset: The offset which defines the starting point from which bucket 2089 the packed operation starts 2090 # count: The number of buckets in the array that will be packed 2091 # cbl: The bit length of the counter can be found in the 2092 ibf message in the ibf_counter_bit_length field 2093 # pd: A byte array which contains the data packed with the pack_counter 2094 function 2095 # OUTPUTS: 2096 # returns: Nothing because the unpacked counter is saved directly 2097 into the IBF 2099 FUNCTION unpack_counter(ibf, offset, count, cbl, pd) 2100 ibf_bucket_ctr = 0 2101 store = 0 2102 store_bits = 0 2103 byte_ctr = 0 2105 WHILE TRUE 2106 byte_read = pd[byte_ctr] 2107 bit_to_pack_left = 8 2108 byte_ctr++ 2110 WHILE bit_to_pack_left >= 0 DO 2112 # Prevent packet from reading more than required 2113 IF ibf_bucket_ctr > (count - 1) THEN 2114 RETURN 2115 END IF 2117 IF store_bits + bit_to_pack_left >= cbl THEN 2118 bit_use = cbl - store_bits 2120 IF store_bits > 0 THEN 2121 store = store << bit_use 2122 END IF 2123 bytes_to_shift = bit_to_pack_left - bit_use 2124 counter_partial = byte_read >> bytes_to_shift 2125 store = store | counter_partial 2126 ibf.counter[ibf_bucket_ctr + offset] = store 2127 byte_read = byte_read & (( 1 << bytes_to_shift ) - 1) 2129 bit_to_pack_left -= bit_use 2130 ibf_bucket_ctr++ 2131 store = 0 2132 store_bits = 0 2133 ELSE 2134 store_bits = store_bits + bit_to_pack_left 2136 IF 0 == store_bits THEN 2137 store = byte_read 2138 ELSE 2139 store = store << bit_to_pack_left 2140 store = store | byte_read 2141 END IF 2142 BREAK 2143 END IF 2144 END WHILE 2145 END WHILE 2146 END FUNCTION 2148 Figure 36 2150 7.3. Multi Strata Estimators 2152 In order to improve the precision of the estimates not only one 2153 strata estimator is transmitted for larger sets. One, two, four or 2154 eight strata estimators can be transferred. Transmitting multiple 2155 strata estimators has the disadvantage that additional bandwidth will 2156 be used, so despite the higher precision, it is not always optimal to 2157 transmit eight strata estimators. Therefore, the following rules are 2158 used, which are based on the average element size multiplied by the 2159 number of elements in the set. This value is denoted as "b" in the 2160 table: 2162 SEs Rule 2164 1 b < 68kb 2166 2 b > 68kb 2168 4 b > 269kb 2170 8 b > 1'077kb 2172 When creating multiple strata estimators, it is important to salt the 2173 keys for the IBFs in the strata estimators differently, using the 2174 following bit rotation based salting method: 2176 # Inputs: 2177 # value: Input value to salt (needs to be 64 bit unsigned) 2178 # salt: Salt to salt value with; Should always be ascending and start 2179 # at zero 2180 i.e. SE1 = Salt 0; SE2 = Salt 1 etc. 2181 # Output: 2182 # Returns: Salted value 2184 FUNCTION se_key_salting(value, salt) 2185 s = (salt * 7) modulo 64 2186 RETURN (value >> s) | (value << (64 - s)) 2187 END FUNCTION 2189 Figure 37 2191 Performance study and details about the reasoning for the used 2192 methods can be found in [byzantine_fault_tolerant_set_reconciliation] 2193 in section 3.4.1 under the title "Added Support for Multiple Strata 2194 Estimators". [byzantine_fault_tolerant_set_reconciliation] 2196 8. Security Considerations 2198 The security considerations in this document focus mainly on the 2199 security goal of availability. The primary goal of the protocol is 2200 to prevent an attacker from wasting computing and network resources 2201 of the attacked peer. 2203 To prevent denial of service attacks, it is vital to check that peers 2204 can only reconcile a set once in a predefined time span. This is a 2205 predefined value and needs to be adapted per use basis. To enhance 2206 reliability and to allow for legitimate failures, say due to network 2207 connectivity issues, applications SHOULD define a threshold for the 2208 maximum number of failed reconciliation attempts in a given time 2209 period. 2211 It is important to close and purge connections after a given timeout 2212 to prevent draining attacks. 2214 8.1. General Security Checks 2216 In this section general checks are described which should be applied 2217 to multiple states. 2219 8.1.1. Input validation 2221 The format of all received messages needs to be properly validated. 2222 This is important to prevent many attacks on the code. The 2223 application data MUST be validated by the application using the 2224 protocol not by the implementation of the protocol. In case the 2225 format validation fails the set operation MUST be terminated. 2227 8.1.2. Byzantine Boundaries 2229 To restrict an attacker there should be an upper and lower bound 2230 defined and checked at the beginning of the protocol, based on prior 2231 knowledge, for the number of elements. The lower byzantine bound can 2232 be, for example, the number of elements the other peer had in his set 2233 at the last contact. The upper byzantine bound can be a practical 2234 maximum e.g. the number of e-voting votes, in Switzerland. 2236 # Input: 2237 # rec: Number of elements in remote set 2238 # rsd: Number of elements differ in remote set 2239 # lec: Number of elements in local set 2240 # lsd: Number of elements differ in local set 2241 # UPPER_BOUND: Given byzantine upper bound 2242 # LOWER_BOUND: Given byzantine lower bound 2243 # Output: 2244 # returns TRUE if parameters in byzantine bounds otherwise returns FALSE 2245 FUNCTION check_byzantine_bounds (rec,rsd,lec,lsd) 2246 IF rec + rsd > UPPER_BOUND THEN 2247 RETURN FALSE 2248 END IF 2249 IF lec + lsd > UPPER_BOUND THEN 2250 RETURN FALSE 2251 END IF 2252 IF rec < LOWER_BOUND THEN 2253 RETURN FALSE 2254 END IF 2255 RETURN TRUE 2256 END FUNCTION 2258 Figure 38 2260 8.1.3. Valid State 2262 To harden the protocol against attacks, controls were introduced in 2263 the improved implementation that check for each message whether the 2264 message was received in the correct state. This is central so that 2265 an attacker finds as little attack surface as possible and makes it 2266 more difficult for the attacker to send the protocol into an endless 2267 loop, for example. 2269 8.1.4. Message Flow Control 2271 For most messages received and sent there needs to be a check in 2272 place that checks that a message is not received multiple times. 2273 This is solved with a global store (message) and the following code 2275 The sequence in which messages are received and sent is arranged in a 2276 chain. The messages are dependent on each other. There are 2277 dependencies that are mandatory, e.g. for a sent "Demand" message, an 2278 "Element" message must always be received. But there are also 2279 messages for which a response is not mandatory, e.g. the _Inquiry_ 2280 message is only followed by an "Offer" message, if the corresponding 2281 element is in the set. Due to this fact, checks can be installed to 2282 verify compliance with the following chain. 2284 Chain for 2285 elements +---------+ +---------+ +---------+ +---------+ 2286 NOT in IBF | INQUIRY |--->| OFFER |===>| DEMAND |===>| ELEMENT | 2287 decoding +---------+ +---------+ +---------+ +---------+ 2288 peers set 2290 Chain for 2291 elements +---------+ +---------+ +---------+ 2292 in IBF | OFFER |--->| DEMAND |===>| ELEMENT | 2293 decoding +---------+ +---------+ +---------+ 2294 peers set 2296 --->: Answer not mandatory 2297 ===>: Always answer needed. 2299 Figure 39 2301 In the message control flow its important to ensure that no 2302 duplicated messages are received (Except inquiries where collisions 2303 are possible) and only messages are received which are compliant with 2304 the flow in Figure 39. To link messages the SHA-512 element hashes, 2305 that are part of all messages, except in the _Inquiry_ messages, can 2306 be used. To link an _Inquiry_ message to an _Offer_ message the 2307 SHA-512 hash from the offer has to be salted and converted to the 2308 IBF-Key (as described in Figure 7). The IBF-Key can be matched with 2309 the received _Inquiry_ message. 2311 At the end of the set reconciliation operation after receiving and 2312 sending the _Done_ message, it should be checked that all demands 2313 have been satisfied and all elements have been received. 2315 This is based on [byzantine_fault_tolerant_set_reconciliation], 2316 section 5.3 (Message Control Flow). 2318 8.1.5. Limit Active/Passive Decoding changes 2320 To prevent an attacker from sending a peer into an endless loop 2321 between active and passive decoding, a limitation for active/passive 2322 roll switches is required. Otherwise, an attacker could force the 2323 victim to waste unlimited amount of resources by just transmitting 2324 IBFs that do not decode. This can be implemented by a simple counter 2325 which terminates the operation after a predefined number of switches. 2326 The maximum number of switches needs to be defined in such a way that 2327 it is very improbable that more switches are required in a legitimate 2328 interaction, and hence the malicious behavior of the other peer is 2329 assured. 2331 The question after how many active/passive switches it can be assumed 2332 that the other peer is not honest, depends on the various tuning 2333 parameters of the algorithm. Section 5.4 of 2334 [byzantine_fault_tolerant_set_reconciliation] demonstrates that the 2335 probability of decoding failure is less than 15% for each round. The 2336 probability that there will be n legitimate active/passive changes is 2337 thus less than 0.15^{round number}. Which means that after about 30 2338 active/passive switches it can be said with a certainty of 2^80 that 2339 one of the peers is not following the protocol. Hence, participants 2340 MUST impose a maximum of 30 active/passive changes. 2342 8.1.6. Full Synchronisation Plausibility Check 2344 An attacker can try to use up a peer's bandwidth by pretending that 2345 the peer needs full synchronisation, even if the set difference is 2346 very small and the attacker only has a few (or even zero) elements 2347 that are not already synchronised. In such a case, it would be ideal 2348 if the plausibility could already be checked during full 2349 synchronisation as to whether the other peer was honest or not with 2350 regard to the estimation of the set size difference and thus the 2351 choice of mode of operation. 2353 In order to calculate this plausibility, section 5.5 of 2354 [byzantine_fault_tolerant_set_reconciliation] describes a formula, 2355 which depicts the probability with which one can calculate the 2356 corresponding plausibility based on the number of new and repeated 2357 elements after each received element. 2359 Besides this approach from probability theory, there is an additional 2360 check that can be made. After the entire set has been transferred to 2361 the other peer, no known elements may be returned by the second peer, 2362 since the second peer should only return the elements that are 2363 missing from the initial peer's set. 2365 This two approaches are implemented in the following pseudocode: 2367 # Input: 2368 # SECURITY_LEVEL: The security level used e.g. 2^80 2369 # state: The statemachine state 2370 # rs: Estimated remote set difference 2371 # lis: Number of elements in set 2372 # rd: Number of duplicated elements received 2373 # rf: Number of fresh elements received 2374 # Output: 2375 # Returns TRUE if full synchronisation is plausible and FALSE otherwise 2377 FUNCTION full_sync_plausibility_check (state,rs,lis,rd,rf) 2378 security_level_lb = -1 * SECURITY_LEVEL 2380 # Make sure that no element is received double when 2381 # all elements already are transmitted to the oder side. 2382 IF FULL_SENDING == state AND rd > 0 THEN 2383 RETURN FALSE 2384 END IF 2386 # Probabilistic algorithm to check for plausible 2387 # element distribution 2388 IF FULL_RECEIVING == state THEN 2390 # Prevent division by 0 2391 IF 0 <= rs THEN 2392 rs = 1 2393 END IF 2395 # Formula to verify plausibility 2396 base = 1 - (rs / (lis + rs)) 2397 exponent = rd - rf * lis / rs 2398 value = exponent * (LOG2(base)/LOG2(2)) 2399 IF value < security_level_lb OR value > SECURITY_LEVEL THEN 2400 RETURN FALSE 2401 END IF 2402 END IF 2403 RETURN TRUE 2404 END FUNCTION 2406 Figure 40 2408 8.2. States 2410 In this section the security considerations for each valid message in 2411 all states is described, if any other message is received the peer 2412 MUST terminate the operation. 2414 8.2.1. Expecting IBF 2416 Security considerations for received messages: 2418 Request Full It needs to be checked that the full synchronisation 2419 mode with receiving peer sending first is plausible according to 2420 the algorithm deciding which operation mode is applicable as 2421 described in Section 7.1.1. 2423 IBF It needs to be checked that the differential synchronisation 2424 mode is plausible according to the algorithm deciding which 2425 operation mode is applicable as described in Section 7.1.1. 2427 Send Full It needs to be checked that the full synchronisation mode 2428 with initiating peer sending first is plausible according to the 2429 algorithm deciding which operation mode is applicable as described 2430 in Section 7.1.1. 2432 8.2.2. Full Sending 2434 Security considerations for received messages: 2436 Full Element When receiving full elements there needs to be checked, 2437 that every element is a valid element, that no element has been 2438 received more than once, and that not more elements have been 2439 received than the other peer has committed to at the beginning of 2440 the operation. The plausibility should also be checked with an 2441 algorithm as described in Section 8.1.6. 2443 Full Done When receiving the _Full Done_ message, it is important to 2444 check that not fewer elements have been received than the other 2445 peer has committed to send at the beginning of the operation. If 2446 the sets differ (the FINAL CHECKSUM field in the Full Done message 2447 does not match to the SHA-512 hash XOR sum of the local set), the 2448 operation has failed and the reconciliation MUST be aborted. It 2449 is a strong indicator that something went wrong (eg. some hardware 2450 bug). This should never occur! 2452 8.2.3. Expecting IBF Last 2454 Security considerations for received messages: 2456 IBF The application should check that the overall size of the IBF 2457 that is being transmitted is within its resource bounds, and abort 2458 the protocol if its resource limits are likely to be exceeded, or 2459 if the size is implausible for the given operation. 2461 It needs to be checked that the offset (message field "OFFSET") 2462 for every received _IBF_ message is strictly monotonic increasing 2463 and is a multiple of the MAX_BUCKETS_PER_MESSAGE defined in the 2464 Constants section, otherwise the connection MUST be aborted. 2466 Another sanity check is to ensure that the "OFFSET" message field 2467 never is higher than the "IBF SIZE" field in the _IBF_ message. 2469 IBF Last When all _IBF_ messages have been received an _IBF Last_ 2470 message should conclude the transmission of the IBF and a change 2471 to the *Active Decoding* phase should be ensured. 2473 To verify that all IBFs have been received, a simple validation 2474 can be made. The number of buckets in the _IBF Last_ message 2475 added to the value in the message OFFSET field should always be 2476 equal to the "IBF SIZE". 2478 Further plausibility checks can be made. One is to ensure that 2479 after each active/passive switch the IBF can never be more than 2480 double in size. Another plausibility check is that an IBF 2481 probably never will be larger than the byzantine upperbound 2482 multiplied by two. The third plausibility check is to take 2483 successfully decoded IBF keys (received offers and demands) into 2484 account and to validate the size of the received IBF with the in 2485 Figure 35 described function get_next_ibf_size(). If any of these 2486 three checks fail the operation must be aborted. 2488 8.2.4. Active Decoding 2490 In the *Active Decoding* state it is important to prevent an attacker 2491 from generating and transmitting an unlimited number of IBFs that all 2492 do not decode, or to generate an IBF constructed to send the peers in 2493 an endless loop. To prevent an endless loop in decoding, loop 2494 detection MUST be implemented. A solution to prevent endless loop is 2495 to limit the number of elements decoded from an IBF. This limit is 2496 defined by the number of buckets in the IBF. It is not possible that 2497 more elements are decoded from an IBF than an IBF has buckets. If 2498 more elements than buckets are in an IBF it is not possible to get 2499 pure buckets. An additional check that should be implemented, is to 2500 store all element IDs that were prior decoded. When a new element ID 2501 is decoded from the IBF it should always be checked that no element 2502 ID is repeated. If the same element ID is decoded more than once, 2503 this is a strong indication for an invalid IBF and the operation MUST 2504 be aborted. Notice that the decoded element IDs are salted as 2505 described in Figure 7 so the described bit rotation needs to be 2506 reverted before the decoded element ID is stored and compared to the 2507 previous decoded element IDs. 2509 If the IBF decodes more elements than are plausible, the operation 2510 MUST be terminated. Furthermore, if the IBF decoding successfully 2511 terminates and fewer elements were decoded than plausible, the 2512 operation MUST also be terminated. The upper thresholds for decoded 2513 elements from the IBF is the remote set size the other peer has 2514 committed too (Case if the complete remote set is new). The lower 2515 threshold for decoding element is the absolute value of the 2516 difference between the local and remote set size (Case the set 2517 difference is only in the set of a single peer). The other peer's 2518 committed set sizes is transmitted in the the *Expecting IBF* state. 2520 Security considerations for received messages: 2522 Offer If an offer for an element, that never has been requested by 2523 an inquiry or if an offer is received twice, the operation MUST be 2524 terminated. This requirement can be fulfilled by saving lists 2525 that keep track of the state of all sent inquiries and offers. 2526 When answering offers these lists MUST be checked. The sending 2527 and receiving of Offer messages should always be protected with an 2528 Message Flow Control to secure the protocol against missing, 2529 duplicated, out-of-order or unexpected messages. 2531 Element If an element that never has been requested by a demand or 2532 is received twice, the operation MUST be terminated. The sending 2533 and receiving of Element messages should always be protected with 2534 an Message Flow Control to secure the protocol against missing, 2535 duplicated, out-of-order or unexpected messages. 2537 Demand For every received demand an offer has to be sent in advance. 2538 If a demand for an element is received, that never has been 2539 offered or the offer already has been answered with a demand, the 2540 operation MUST be terminated. It is required to implement a list 2541 which keeps track of the state of all sent offers and received 2542 demands. The sending and receiving of _Demand_ messages should 2543 always be protected with an Message Flow Control to secure the 2544 protocol against missing, duplicated, out-of-order or unexpected 2545 messages. 2547 Done The _Done_ message is only received if the IBF has finished 2548 decoding and all offers have been sent. If the _Done_ message is 2549 received before the decoding of the IBF is finished or all open 2550 demands have been answered, the operation MUST be terminated. If 2551 the sets differ (the FINAL CHECKSUM field in the Done message does 2552 not match to the SHA-512 hash XOR sum of the local set), the 2553 operation has failed and the reconciliation MUST be aborted. It 2554 is a strong indicator that something went wrong (eg. some hardware 2555 bug). This should never occur! 2556 When a _Done_ message is received the 2557 "check_if_synchronisation_is_complete()" function from the Message 2558 Flow Control is required to ensure that all demands have been 2559 satisfied successfully. 2561 8.2.5. Finish Closing 2563 In the *Finish Closing* state the protocol waits for all sent demands 2564 to be fulfilled. 2566 In case not all sent demands have been answered in time, the 2567 operation has failed and MUST be terminated. 2569 Security considerations for received messages: 2571 Element When receiving Element messages it is important to always 2572 check the Message Flow Control to secure the protocol against 2573 missing, duplicated, out-of-order or unexpected messages. 2575 8.2.6. Finished 2577 In this state the connection is terminated, so no security 2578 considerations are needed. 2580 8.2.7. Expect SE 2582 Security considerations for received messages: 2584 Strata Estimator In case the strata estimator does not decode, the 2585 operation MUST be terminated to prevent to get to an unresolvable 2586 state. The set difference calculated from the strata estimator 2587 needs to be plausible, which means within the byzantine boundaries 2588 described in section Byzantine Boundaries. 2590 8.2.8. Full Receiving 2592 Security considerations for received messages: 2594 Full Element When receiving full elements there needs to be checked, 2595 that every element is a valid element, no element has been 2596 received more than once and not more elements are received than 2597 the other peer committed to sending at the beginning of the 2598 operation. The plausibility should also be checked with an 2599 algorithm as described in Section 8.1.6. 2601 Full Done When the _Full Done_ message is received from the remote 2602 peer, it should be checked that the number of elements received 2603 matches the number that the remote peer originally committed to 2604 transmitting, otherwise the operation MUST be terminated. If the 2605 sets differ (the FINAL CHECKSUM field in the Full Done message 2606 does not match to the SHA-512 hash XOR sum of the local set), the 2607 operation has failed and the reconciliation MUST be aborted. It 2608 is a strong indicator that something went wrong (eg. some hardware 2609 bug). This should never occur! 2611 8.2.9. Passive Decoding 2613 Security considerations for received messages: 2615 IBF In case an IBF message is received by the peer a active/passive 2616 role switch is initiated by the active decoding remote peer. A 2617 switch into active decoding mode MUST only be permitted for a 2618 predefined number of times as described in Section 8.1.5 2620 Inquiry A check needs to be in place that prevents receiving an 2621 inquiry for an element multiple times or more inquiries than are 2622 plausible. The upper thresholds for sent/received inquiries is 2623 the remote set size the other peer has committed too (Case if the 2624 complete remote set is new). The lower threshold for for sent/ 2625 received inquiries is the absolute value of the set difference 2626 between the local and remote set size (Case the set difference is 2627 only in the set of a single peer). The other peer's committed set 2628 sizes is transmitted in the the *Expecting IBF* state. Beware 2629 that it is possible to get key collisions and an inquiry for the 2630 same key can be transmitted multiple times, so the threshold 2631 should take this into account. The sending and receiving of 2632 _Inquiry_ messages should always be protected with an Message Flow 2633 Control to secure the protocol against missing, duplicated, out- 2634 of-order or unexpected messages. 2636 Demand Same action as described for _Demand_ message in section 2637 Active Decoding. 2639 Offer Same action as described for _Offer_ message in section Active 2640 Decoding. 2642 Done Same action as described for _Done_ message in section Active 2643 Decoding. 2645 Element Same action as described for _Element_ message in section 2646 Active Decoding. 2648 8.2.10. Finish Waiting 2650 In the *Finish Waiting* state the protocol waits for all transmitted 2651 demands to be fulfilled. 2653 In case not all transmitted demands have been answered at this time, 2654 the operation has failed and the protocol MUST be terminated with an 2655 error. 2657 Security considerations for received messages: 2659 Element When receiving Element messages it is important to always 2660 check the Message Flow Control to secure the protocol against 2661 missing, duplicated, out-of-order or unexpected messages. 2663 9. Constants 2665 The following table contains constants used by the protocol. The 2666 constants marked with a * are validated through experiments in 2667 [byzantine_fault_tolerant_set_reconciliation]. 2669 Name | Value | Description 2670 ----------------------------+------------+------------------------------- 2671 SE_STRATA_COUNT | 32 | Number of IBFs in a strata 2672 estimator. 2673 IBF_HASH_NUM* | 3 | Number of times an element is 2674 hashed to an IBF. 2675 (from section 4.5.2) 2676 IBF_FACTOR* | 2 | The factor by which the size 2677 of the IBF is increased in 2678 case of decoding failure or 2679 initially from the set 2680 difference. 2681 (from section 4.5.2) 2682 MAX_BUCKETS_PER_MESSAGE | 1120 | Maximum bucket of an IBF 2683 that are transmitted in 2684 single message. 2685 IBF_MIN_SIZE* | 37 | Minimal number of buckets 2686 in an IBF. (from section 3.8) 2687 DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is 2688 needed for a differential 2689 synchronisation. 2690 SECURITY_LEVEL* | 2^80 | Security level for 2691 probabilistic security 2692 algorithms. (from section 5.8) 2693 PROBABILITY_FOR_NEW_ROUND* | 0.15 | The probability for a IBF 2694 decoding failure in the 2695 differential synchronisation 2696 mode. (from section 5.4) 2697 DIFFERENTIAL_RTT_MEAN* | 3.65145 | The average RTT that is needed 2698 for a differential 2699 synchronisation. 2700 (from section 4.5.3) 2701 MAX_IBF_SIZE | 1048576 | Maximal number of buckets in 2702 an IBF. 2703 AVG_BYTE_SIZE_SE* | 4221 | Average byte size of a single 2704 strata estimator. 2705 (from section 3.4.3) 2706 VALID_NUMBER_SE* | [1,2,4,8] | Valid number of SE's 2707 (from section 3.4) 2709 Figure 41 2711 10. GANA Considerations 2713 GANA is requested to amend the "GNUnet Message Type" [GANA] registry 2714 as follows: 2716 Type | Name | References | Description 2717 --------+----------------------------+------------+---------------------- 2718 559 | SETU_P2P_REQUEST_FULL | [This.I-D] | Request the full set 2719 of the other peer. 2720 710 | SETU_P2P_SEND_FULL | [This.I-D] | Signals to send the 2721 full set to the other 2722 peer. 2723 560 | SETU_P2P_DEMAND | [This.I-D] | Demand the whole 2724 element from the 2725 otherpeer, given 2726 only the hash code. 2727 561 | SETU_P2P_INQUIRY | [This.I-D] | Tell the other peer 2728 to send a list of 2729 hashes that match 2730 an IBF key. 2731 562 | SETU_P2P_OFFER | [This.I-D] | Tell the other peer 2732 which hashes match 2733 a given IBF key. 2734 563 | SETU_P2P_OPERATION_REQUEST | [This.I-D] | Request a set union 2735 operation from a 2736 remote peer. 2737 564 | SETU_P2P_SE | [This.I-D] | Strata Estimator 2738 uncompressed. 2739 565 | SETU_P2P_IBF | [This.I-D] | Invertible Bloom 2740 Filter slices. 2741 566 | SETU_P2P_ELEMENTS | [This.I-D] | Actual set elements. 2742 567 | SETU_P2P_IBF_LAST | [This.I-D] | Invertible Bloom 2743 Filter Last Slices. 2744 568 | SETU_P2P_DONE | [This.I-D] | Set operation is 2745 done. 2746 569 | SETU_P2P_SEC | [This.I-D] | Strata Estimator 2747 compressed. 2748 570 | SETU_P2P_FULL_DONE | [This.I-D] | All elements in 2749 full synchronisation 2750 mode have been sent 2751 is done. 2752 571 | SETU_P2P_FULL_ELEMENT | [This.I-D] | Send an actual 2753 element in full 2754 synchronisation mode. 2756 Figure 42 2758 11. Contributors 2760 The GNUnet implementation of the byzantine fault tolerant set 2761 reconciliation protocol was originally implemented by Florian Dold. 2763 12. Normative References 2765 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 2766 Key Derivation Function (HKDF)", RFC 5869, 2767 DOI 10.17487/RFC5869, May 2010, 2768 . 2770 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2771 Requirement Levels", BCP 14, RFC 2119, 2772 DOI 10.17487/RFC2119, March 1997, 2773 . 2775 [RFC3385] Sheinwald, D., Satran, J., Thaler, P., and V. Cavanna, 2776 "Internet Protocol Small Computer System Interface (iSCSI) 2777 Cyclic Redundancy Check (CRC)/Checksum Considerations", 2778 RFC 3385, DOI 10.17487/RFC3385, September 2002, 2779 . 2781 [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format Specification 2782 version 1.3", RFC 1951, DOI 10.17487/RFC1951, May 1996, 2783 . 2785 [byzantine_fault_tolerant_set_reconciliation] 2786 Summermatter, E., "Byzantine Fault Tolerant Set 2787 Reconciliation", 2021, . 2791 [GANA] GNUnet e.V., "GNUnet Assigned Numbers Authority (GANA)", 2792 April 2020, . 2794 [CryptographicallySecureVoting] 2795 Dold, F., "Cryptographically Secure, Distributed 2796 Electronic Voting", 2797 . 2800 [ByzantineSetUnionConsensusUsingEfficientSetReconciliation] 2801 Dold, F. and C. Grothoff, "Byzantine set-union consensus 2802 using efficient set reconciliation", 2803 . 2805 [Eppstein] Eppstein, D., Goodrich, M., Uyeda, F., and G. Varghese, 2806 "What's the Difference? Efficient Set Reconciliation 2807 without Prior Context", 2808 . 2810 [GNS] Wachs, M., Schanzenbach, M., and C. Grothoff, "A 2811 Censorship-Resistant, Privacy-Enhancing and Fully 2812 Decentralized Name System", 2014, 2813 . 2815 Appendix A. Test Vectors 2817 A.1. Map Function 2819 INPUTS: 2821 k: 3 2822 ibf_size: 300 2824 key1: 0xFFFFFFFFFFFFFFFF (64-bit) 2825 key2: 0x0000000000000000 (64-bit) 2826 key3: 0x00000000FFFFFFFF (64-bit) 2827 key4: 0xC662B6298512A22D (64-bit) 2828 key5: 0xF20fA7C0AA0585BE (64-bit) 2830 Figure 43 2832 OUTPUT: 2834 key1: ["122","157","192"] 2835 key2: ["85","243","126"] 2836 key3: ["208","101","222"] 2837 key4: ["239","269","56"] 2838 key5: ["150","104","33"] 2840 Figure 44 2842 A.2. ID Calculation Function 2844 INPUTS: 2846 element 1: 0xFFFFFFFFFFFFFFFF (64-bit) 2847 element 2: 0x0000000000000000 (64-bit) 2848 element 3: 0x00000000FFFFFFFF (64-bit) 2849 element 4: 0xC662B6298512A22D (64-bit) 2850 element 5: 0xF20fA7C0AA0585BE (64-bit) 2852 Figure 45 2854 OUTPUT: 2856 element 1: 0x5AFB177B 2857 element 2: 0x64AB557C 2858 element 3: 0xCB5DB740 2859 element 4: 0x8C6A2BB2 2860 element 5: 0x7EC42981 2862 Figure 46 2864 A.3. Counter Compression Function 2866 INPUTS: 2868 counter serie 1: [1,8,10,6,2] (min bytes 4) 2869 counter serie 2: [26,17,19,15,2,8] (min bytes 5) 2870 counter serie 3: [4,2,0,1,3] (min bytes 3) 2872 Figure 47 2874 OUTPUT: 2876 counter serie 1: 0x18A62 2877 counter serie 2: 0x3519BC48 2878 counter serie 3: 0x440B 2880 Figure 48 2882 Authors' Addresses 2884 Elias Summermatter 2885 Seccom GmbH 2886 Brunnmattstrasse 44 2887 CH-3007 Bern 2888 Switzerland 2890 Email: elias.summermatter@seccom.ch 2892 Christian Grothoff 2893 Berner Fachhochschule 2894 Hoeheweg 80 2895 CH-2501 Biel/Bienne 2896 Switzerland 2898 Email: grothoff@gnunet.org