idnits 2.17.1 draft-icnrg-flic-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (June 25, 2017) is 2469 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '8' on line 580 -- Looks like a reference, but probably isn't: '1' on line 607 -- Looks like a reference, but probably isn't: '32' on line 582 Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 ICNRG Working Group C. Tschudin 3 Internet-Draft University of Basel 4 Intended status: Informational C. Wood 5 Expires: December 27, 2017 University of California Irvine 6 June 25, 2017 8 File-Like ICN Collection (FLIC) 9 draft-icnrg-flic-00 11 Abstract 13 This document describes a bare bones "index table"-approach for 14 organizing a set of ICN data objects into a large, File-Like ICN 15 Collection (FLIC). 17 At the core of this collection is a so called manifest which acts as 18 the collection's root node. The manifest contains an index table 19 with pointers, each pointer being a hash value pointing to either a 20 final data block or another index table node. 22 Status of This Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on December 27, 2017. 39 Copyright Notice 41 Copyright (c) 2017 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 Table of Contents 56 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 57 1.1. FLIC as a Distributed Data Structure . . . . . . . . . . 2 58 1.2. Design goals . . . . . . . . . . . . . . . . . . . . . . 3 59 2. File-Like ICN Collection (FLIC) Format . . . . . . . . . . . 4 60 2.1. Use of hash-valued pointers . . . . . . . . . . . . . . . 5 61 2.2. Creating a FLIC data structure . . . . . . . . . . . . . 5 62 2.3. Reconstructing the collection's data . . . . . . . . . . 7 63 2.4. Metadata in HashGroups . . . . . . . . . . . . . . . . . 8 64 2.5. Locating FLIC leaf and manifest nodes . . . . . . . . . . 9 65 3. Advanced uses of FLIC manifests . . . . . . . . . . . . . . . 10 66 3.1. Seeking . . . . . . . . . . . . . . . . . . . . . . . . . 10 67 3.2. Block-level de-duplication . . . . . . . . . . . . . . . 11 68 3.3. Growing ICN collections . . . . . . . . . . . . . . . . . 11 69 3.4. Re-publishing a FLIC under a new name . . . . . . . . . . 12 70 3.5. Data Chunks of variable size . . . . . . . . . . . . . . 12 71 4. Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . 13 72 4.1. Example Encoding for CCNx1.0 . . . . . . . . . . . . . . 13 73 4.2. Example Encoding for NDN . . . . . . . . . . . . . . . . 13 74 5. Security Considerations . . . . . . . . . . . . . . . . . . . 14 75 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 76 6.1. Normative References . . . . . . . . . . . . . . . . . . 14 77 6.2. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 14 78 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 14 80 1. Introduction 82 1.1. FLIC as a Distributed Data Structure 84 One figure 85 root manifest 86 .------------------------------------. 87 | optional name: | 88 | /icn/name/of/this/flic | 89 | | 90 | HashGroup (HG): | 91 | optional metadata: | 92 | overall digest, locator, etc. | .------. 93 | hash-valued data pointer -----------> | data | 94 | ... | `------' sub manifest 95 | hash-valued manifest pointer ------. .------------------. 96 | | `--> | -----> 97 | optional additional HashGroups .. | | -----> 98 | | `------------------' 99 | optional signature | 100 `------------------------------------' 102 Figure 1: A FLIC manifest and its directed acyclic graph 104 1.2. Design goals 106 o Copy the proven UNIX inode concept: 108 * index tables and memory pointers 110 o Adaption to ICN: 112 * hash values instead of block numbers, unique with high 113 probability 115 o Advantages (over non-manifest collections): 117 * single root manifest signature covers all elements of the full 118 collection, including intermediate sub manifests 120 * eliminate reference to chunk numbering schemata (hash values 121 only) 123 * supports block-level de-duplication (can lead to a directed 124 acyclic graph, or DAG, instead of a tree) 126 o Limitations 128 * All data leafs must be present at manifest creation time 129 (otherwise one cannot compute the pointers) 131 o Potential extensions (for study): 133 * Enhance the manifest such that it can serve as a "database 134 cursor" or as a cursor over a time series, e.g. having entries 135 for "previous" and "next" collections. 137 2. File-Like ICN Collection (FLIC) Format 139 We first give the FLIC format in EBN notation: 141 ManifestMsg := Name? HashGroup+ 143 HashGroup := MetaData? (SizeDataPtr | SizeManifestPtr)+ 144 BlockHashGroup := MetaData? SizePerPtr (DataPtr | ManifestPtr)+ 146 DataPtr := HashValue 147 ManifestPtr := HashValue 148 SizeDataPtr := Size HashValue 149 SizeManifestPtr := Size HashValue 151 SizePerPtr := Size 152 HashValue := See {{CCNxMessages}} 153 Size := OCTET[8] 155 MetaData := Property* 156 Property := Locator | OverallByteCount | OverallDataDigest | ... 158 Description: 160 o The core of a manifest is the sequence of "hash groups". 162 o A HashGroup (HG) consists of a sequence of "sized" data or 163 manifest pointers. 165 o A BlockHashGroup (BHG) consists of a sequence of data or manifest 166 pointers and a mandatory field that lists the total size of each 167 pointer. These HashGroups should be used when each pointer 168 (except the last) contains an identical number of application 169 bytes. 171 o Sizes are 64-bit unsigned integers. 173 o Data and manifest pointers are cryptographic HashValues encoded 174 according to the mechanism listed in [CCNxMessages]. 175 Specifically, a HashValue specifies the cryptographic hash 176 algorithm and the actual digest. 178 o A HashGroup can contain a metadata section to help a reader to 179 optimize content retrieval (block size of leaf nodes, total size, 180 overall digest etc). 182 o None of the ICN objects used in FLIC are allowed to be chunked, 183 including the (sub-) manifests. The smallest possible complete 184 manifest contains one HashGroup with one pointer to an ICN object. 186 2.1. Use of hash-valued pointers 188 FLIC's tree data structure is a generalized index table as it is 189 known from file systems. The pointers, which in an OS typically are 190 hard disk block numbers, are replaced by hash values of other ICN 191 objects. These ICN objects contain either other manifest nodes, or 192 leaf nodes. Leafs contain the actual data of the collection. Each 193 pointer explicitly indicates the amount of application data bytes 194 contained by the referred object. For example, the size of a data 195 pointer (to a leaf) represents the size of the leaf's content object 196 payload. Conversely, the size of a manifest pointer represents the 197 total size of all pointers contained in that manifest. 199 FLIC makes use of "nameless ICN object" where the network is tasked 200 with fetching an object based on its digest only. The interest for 201 such an object consists of a routing hint (locator) plus the given 202 digest value. 204 2.2. Creating a FLIC data structure 206 Starting from the original content, the corresponding byte array is 207 sliced into chunks. Each chunk is encoded as a data object, 208 according the ICN suite. For each resulting data object, the hash 209 value is computed. Groups of consecutive objects are formed and the 210 corresponding hash values collected in manifests, which are also 211 encoded. The hash values of the manifest objects replace the hash 212 values of the covered leaf nodes, thus reducing the number of hash 213 values. This process of hash value collection and replacement is 214 repeated until only one (root) manifest is left. 216 data1 <-- h1 - - - - - - - - - - - - \ 217 data2 <-- h2 \ root mfst 218 ... mfst 1 <-- hN+1 \ / 219 dataJ <-- hJ / mfst2 <-- hN+2 220 ... / 221 dataN <-- hN - - - - - - / 223 Of special interest are "skewed trees" where a pointer to a manifest 224 may only appear as last pointer of (sub-) manifests. Such a tree 225 becomes a sequential list of manifests with a maximum of datapointers 226 per manifest packet. Beside the tree shape we also show this data 227 structure in form of packet content where D stands for a data pointer 228 and M is the hash of a manifest packet. 230 data1 <-- h1 - - - - - - - - root mfst 231 ... / 232 dataJ-1 <-- hJ-1 / 233 dataJ <-- hJ - - mfst1 <-- hN+1 / 234 ... / 235 dataN <-- hN - / 237 DDDDDDM--> DDDDDDM--> ....... DDDDDDM--> DDDDDDD 239 A pseudo code description for producing a skewed tree follows below. 241 Input: 242 Application data D of size |D| (bytes) 243 Block size B (in bytes) 244 Output: 245 FLIC root node R 246 Algo: 247 n = number of leaf nodes = ceil(|D| / B) 248 k = number of (encoded) hash values fitting in a block of size B 249 H[1..n] = array of hash values 250 initialized with the data hash values for data chunks 1..n 251 While n > k do 252 a) create manifest M with a HashGroup 253 b) append to the HashGroup in M all hash values H[n-k+1..n] 254 c) n = n - k + 1 255 d) H[n] = manifest hash value of M 256 Create root manifest R with a HashGroup 257 Add to the HashGroup of R all hash values H[1..n] 258 Optionally: add name to R, sign manifest R 259 Output R 261 Obtaining with each manifest a maximum of data pointers is beneficial 262 for keeping the download pipeline filled. On the other hand, this 263 tree doesn't support well random access to arbitrary byte positions: 264 All data pointers coming before that offset have to be fetched before 265 locating the block of interest. For random access, binary trees 266 (where both subtrees of a node cover half of the content bytes) are 267 better suited. This can be combined with the "skewed tree" approach: 268 Manifests of intermediate nodes are filled with data pointers except 269 for the last two slots. The second last slot points to a manifest 270 for the "first half" of the left content, the last slots then points 271 to a manifest for the rest. 273 root manifest= DDDDDMM 274 ____________/ \_____ 275 / \ 276 DDDDDMM DDDDDMM 277 _______/ \ _____/ \ 278 / \ / \ 279 DDDDDDD DDDDDDD DDDDDDD DDDDDDD 281 This can be generalized to k-ary trees by allocating k pointers per 282 manifest instead of 2. 284 2.3. Reconstructing the collection's data 286 To fetch the data associated with a given FLIC (sub-) manifest, the 287 receiver sequentially works through all entries found in the 288 HashGroups and issues corresponding hash-based interests. In case of 289 a data hash pointer, the received content object is appended. In 290 case of a manifest hash pointer, this procedure is called recursively 291 for the received manifest. In other words, the collection data is 292 represented as the concatenation of data leaves from this _pre-order_ 293 depth-first search (DFS) traversal strategy of the manifest tree. 294 (Currently, pre-order DFS is the only supported traversal strategy.) 295 This procedure works regardless of the tree's shape. 297 A pseudo code description for fetching is below. 299 Input: 300 Root manifest R 301 Output: 302 Application data D 303 Algo: 304 global D = [] 305 DFS(R) 306 Output D 308 where: 310 procedure DFS(M) 311 { 312 L: 313 H = sequence of hash valued pointers of M 314 foreach p in H do: 315 if p is a data pointer then 316 data = lookup(p) 317 Append data to D 318 else 319 M = lookup(p) 320 if p is last element in H then 321 goto L; // tail recursion 322 DFS(M) 323 } 325 The above DFS code works for FLIC manifest trees of arbitrary shape. 326 In case of a skewed tree, no recursion is needed and a single 327 instance of the DFS procedure suffices (i.e., one uses tail 328 recursion). 330 2.4. Metadata in HashGroups 332 In FLIC, metadata is linked to HashGroups and permits to inform the 333 FLIC retriever about properties of the data that is covered by this 334 hash group. Examples are overall data bytes or the overall hash 335 digest (this is akin to a Merkle hash). The intent of such metadata 336 is to enable an in-network retriever to optimize its operation - 337 other attributes linked to the collection as a whole (author, 338 copyright, etc.) is out of scope. 340 The list of available metadata is below. 342 * Locator - provides a new routing hint (name prefix) where the 343 chunks of this hash group can be retrieved from. The default is to 344 use the locator of the root manifest. 346 * OverallByteCount - indicates the total number of *application 347 data bytes* contained in a single HashGroup. This does not include 348 bytes consumed by child manifests. This value is equal to the sum of 349 all pointer sizes contained in the HashGroup. 351 * OverallDataDigest - expresses the overall digest of all application 352 data contained in the HashGroup. 354 BlockHashGroups contain a mandatory piece of metadata called the 355 SizePerPtr. This value indicates the total number of application 356 bytes contained within each pointer in the hash group _except for the 357 last pointer._ Normal HashGroups do not require this piece of 358 metadata; Instead, each pointer includes their size explicitly. 360 2.5. Locating FLIC leaf and manifest nodes 362 The optional name of a manifest is a mere decoration and has no 363 locator functionality at all: All objects pointed to by a manifest 364 are retrieved from the location where the manifest itself was 365 obtained from (which is not necessarily its name). Example: 367 Objects: 368 manifest(name=/a/b/c, ptr=h1, ptr=hN) - has hash h0 369 nameless(data1) - has hash h1 370 ... 371 nameless(dataN) - has hash hN 373 Query for the manifest: 374 interest(name=/the/locator/hint, implicitDigest=h0) 376 In this example, the name "/a/b/c" does NOT override "/the/locator/ 377 hint" i.e., after having obtained the manifest, the retriever will 378 issue requests for 380 interest(name=/the/locator/hint, implicitDigest=h1) 381 ... 382 interest(name=/the/locator/hint, implicitDigest=hN) 384 Using the locator metadata entry, this behavior can be changed: 386 Objects: 387 manifest(name=/a/b/c, 388 hashgroup(loc=/x/y/z, ptr=h1) 389 hashgroup(ptr=h2) - has hash h0 390 nameless(data1) - has hash h1 391 nameless(data2) - has hash h2 393 Queries: 394 interest(name=/the/locator/hint, implicitDigest=h0) 395 interest(name=/x/y/z, implicitDigest=h1) 396 interest(name=/the/locator/hint, implicitDigest=h2) 398 3. Advanced uses of FLIC manifests 400 The FLIC mechanics has uses cases beyond keeping together a set of 401 data objects, such as: seeking, block-level de-duplication, re- 402 publishing under a new name, growing ICN collections, and supporting 403 FLICs with different block sizes. 405 3.1. Seeking 407 Fast seeking (without having to sequentially fetch all content) works 408 by skipping over entries for which we know their size. The following 409 expression shows how to compute the byte offset of the data pointed 410 at by pointer P_i, offset_i. In this formula, let P_i represent the 411 Size value of the i-th pointer. 413 offset_i = \sum_{i = 1}^{i - 1} P_i.size 415 With this offset, seeking is done as follows: 417 Input: seek_pos P, a FLIC manifest with a hash group having N entries 418 Output: pointer index i and byte offset o, or out-of-range error 419 Algo: 420 offset = 0 421 for i in 1..N do 422 if (P < P_i.size) 423 return (i, P - offset) 424 offset += P_i.size 425 return out-of-range 427 Seeking in a BlockHashGroup is different since offsets can be quickly 428 computed. This is because the size of each pointer P_i except the 429 last is equal to the SizePerPtr value. For a BlockHashGroup with N 430 pointers, OverallByteCount D, and SizePerPointer L, the size of P_i 431 is equal to the following: 433 D - ((i - 1) * L) 435 In a BlockHashGroup with k pointers, the size of P_k is equal to: 437 D - L * (k - 1) 439 Using these, the seeking algorithm can be thus simplified to the 440 following: 442 Input: seek_pos P, a FLIC manifest with a hash group having 443 OverallByteCount S and SizePerPointer L. 444 Output: pointer index i and byte offset o, or out-of-range error 445 Algo: 446 if (P > S) 447 return out-of-range 448 i = floor(P / L) 449 if (i > N) 450 return out-of-range # bad FLIC encoding 451 o = P mod L 452 return (i, o) 454 Note: In both cases, if the pointer at position i is a manifest 455 pointer, this algorithm has to be called once more, seeking to 456 seek_pos o inside that manifest. 458 3.2. Block-level de-duplication 460 Consider a huge file, e.g. an ISO image of a DVD or program in binary 461 form, that had previously been FLIC-ed but now needs to be patched. 462 In this case, all existing encoded ICN chunks can remain in the 463 repository while only the chunks for the patch itself is added to a 464 new manifest data structure, as is shown in the picture below. For 465 example, the venti [1] archival file system of Plan9 uses this 466 technique. 468 old_mfst - - > h1 --> oldData1 <-- h1 < - - new_mfst 469 \ - > h2 --> oldData2 <-- h2 < - - / 470 \ replace3 <-- h5 < - -/ 471 \- > h3 --> oldData3 / 472 \ > h4 --> oldData4 <-- h4 < - / 474 3.3. Growing ICN collections 476 A log file, for example, grows over time. Instead of having to re- 477 FLIC the grown file it suffices to construct a new manifest with a 478 manifest pointer to the old root manifest plus the sequence of data 479 hash pointers for the new data (or additional sub-manifests if 480 necessary). Note that this tree will not be skewed (anymore). 482 old data < - - - mfst_old <-- h_old - - mfst_new 483 / 484 new data1 <-- h_1 - - - - - - - - -/ 485 new data2 / 486 ... / 487 new dataN <-- h_N - - - - - - - -/ 489 3.4. Re-publishing a FLIC under a new name 491 It can happen that a publisher's namespace is part of a service 492 provider's prefix. When switching provider, the publisher may want 493 to republish the old data under a new name. This can easily be 494 achieved with a single nameless root manifest for the large FLIC plus 495 arbitrarily many per-name manifests (which are signed by whomever 496 wants to publish this data): 498 data < - nameless_mfst() <-- h < - mfst(/com/parc/east/the/flic) 499 < - mfst(/com/parc/west/old/the/flic) 500 < - mfst(/internet/archive/flic234) 502 Note that the hash computation (of h) only requires reading the 503 nameless root manifest, not the entire FLIC. 505 This example points out the problem of HashGroups having locator 506 metadata elements: A retriever would be urged to follow these hints 507 which are "hardcoded" deep inside the FLIC but might have become 508 outdated. We therefore recommend to name FLIC manifests only at the 509 highest level (where these names have no locator function). Child 510 nodes in a FLIC manifest should not be named as these names serve no 511 purpose except retrieving a sub-tree's manifest by name, if would be 512 required. 514 3.5. Data Chunks of variable size 516 If chunks do not have regular (block) sizes, the HashGroup can be 517 used to still convey to a reader the length of the chunks at the 518 manifest level. (This can be computed based on the size of pointers, 519 but the metadata field makes this determination simpler.) Example 520 use cases would be chunks each carrying a single ASCII line as 521 entered by a user or a database with variable length records mapped 522 to chunks. 524 M = (manifest 525 (hashgroup((metadata(SizePerPtr=4096)) (dataptr=h1)) 526 (hashgroup((metadata(SizePerPtr=1500)) (dataptr=h2)) 527 ... 528 ) 530 4. Encoding 532 We express the packet encoding of manifests in a symbolic expression 533 style in order to show the TLV structure and the chosen type values. 534 In this notation, a TLV's type is a combination of "SymbolicName/ 535 Tvalue", Length is not shown and Values are sub-expressions. 536 Moreover, we populate the data structure with all possible entries 537 and omit repetition. 539 4.1. Example Encoding for CCNx1.0 541 [FIXED_HEADER OCTET[8]] 542 (ManifestMsg/T_MANIFEST 543 (Name/T_NAME ...) 544 (HashGroup/T_HASHGROUP 545 (MetaData/T_HASHGROUP_METADATA 546 (HGLocator/T_HASHGROUP_METADATA_LOCATOR (T_NAME ...)) 547 (HGOverallByteCount/T_HASHGROUP_METADATA_BYTECOUNT INT) 548 (HGOverallDataDigest/T_HASHGROUP_METADATA_DATADIGEST OCTET[32]) 549 ) 550 (SizeDataPtr/T_HASHGROUP_SIZEDATAPTR OCTET[8] (T_HASH ...)) 551 (SizeMfstPtr/T_HASHGROUP_SIZEMANIFESTPTR OCTET[8] (T_HASH ...)) 552 ) 553 (BlockHashGroup/T_BLOCKHASHGROUP 554 (MetaData/T_HASHGROUP_METADATA (...)) 555 (DataPtr/T_HASHGROUP_DATAPTR OCTET[32] (T_HASH ...)) 556 (MfstPtr/T_HASHGROUP_MANIFESTPTR OCTET[32] (T_HASH ...)) 557 ) 558 ) 560 Interest: name is locator, use objHashRestriction as selector. 562 4.2. Example Encoding for NDN 564 The assigned NDN content type value for FLIC manifests is 1024 565 (0x400). 567 (Data/0x6 568 (Name/0x7 ...) 569 (MetaInfo/0x14 570 (ContentType/0x18 0x0400) 571 ) 572 (Content/0x15 573 (HashGroup/0xC0 574 (MetaInfo/0x14 575 (LocatorNm/0xC3 (NameComp/0x8 ...)) 576 (OverallDataDigest/0xC4 OCTET[32]) 577 (OverallByteCount/0xC5 INT) 578 ) 579 (DataPtr/0xC1 OCTET[8] OCTET[32]) 580 (MfstPtr/0xC2 OCTET[8] OCTET[32]) 581 (SizeDataPtr/0xC3 OCTET[32]) 582 (SizeMfstPtr/0xC4 OCTET[32]) 583 ) 584 ) 585 (SignatureInfo/0x16 ...) 586 (SignatureValue/0x17 ...) 587 ) 589 Interest: name is locator, use implicitDigest name component as 590 selector. 592 5. Security Considerations 594 None. 596 6. References 598 6.1. Normative References 600 [CCNxMessages] 601 PARC, ., LinkedIn, ., and . PARC, "CCNx Messages in TLV 602 Format", n.d., . 605 6.2. URIs 607 [1] http://plan9.bell-labs.com/sys/doc/venti/venti.pdf 609 Authors' Addresses 611 Christian Tschudin 612 University of Basel 614 Email: christian.tschudin@unibas.ch 615 Christopher A. Wood 616 University of California Irvine 618 Email: woodc1@uci.edu