idnits 2.17.1 draft-irtf-icnrg-flic-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 11 instances of too long lines in the document, the longest one being 24 characters in excess of 72. == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the document. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (November 4, 2019) is 1634 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '
' and
     '' lines.


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  == Missing Reference: 'SecurityCtx' is mentioned on line 428, but not
     defined

  == Missing Reference: 'AuthTag' is mentioned on line 428, but not defined

  == Missing Reference: 'NodeData' is mentioned on line 434, but not defined

  == Missing Reference: 'SubtreeSize' is mentioned on line 459, but not
     defined

  == Missing Reference: 'SubtreeDigest' is mentioned on line 459, but not
     defined

  == Missing Reference: 'Locators' is mentioned on line 435, but not defined

  == Missing Reference: 'GroupData' is mentioned on line 449, but not defined

  == Missing Reference: 'LeafSize' is mentioned on line 459, but not defined

  == Missing Reference: 'LeafDigest' is mentioned on line 459, but not defined

  == Missing Reference: 'NsId' is mentioned on line 459, but not defined

  == Missing Reference: 'NIST 800-38D' is mentioned on line 607, but not
     defined

  == Missing Reference: 'Name' is mentioned on line 761, but not defined

  == Missing Reference: 'ExpiryTime' is mentioned on line 761, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 1166

  == Unused Reference: 'NDNTLV' is defined on line 1384, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3552' is defined on line 1392, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC5226' is defined on line 1397, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-08) exists of
     draft-irtf-icnrg-terminology-07

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)


     Summary: 1 error (**), 0 flaws (~~), 19 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	ICNRG                                                        C. Tschudin
3	Internet-Draft                                       University of Basel
4	Intended status: Experimental                                    C. Wood
5	Expires: May 7, 2020                     University of California Irvine
6	                                                                M. Mosko
7	                                                              PARC, Inc.
8	                                                            D. Oran, Ed.
9	                                       Network Systems Research & Design
10	                                                        November 4, 2019

12	                    File-Like ICN Collections (FLIC)
13	                      draft-irtf-icnrg-flic-02

15	Abstract

17	   This document describes a bare bones "index table"-approach for
18	   organizing a set of ICN data objects into a large, File-Like ICN
19	   Collection (FLIC).  At the core of this collection is a so called
20	   manifest which acts as the collection's root node.  The manifest
21	   contains an index table with pointers, each pointer being a hash
22	   value pointing to either a final data block or another index table
23	   node.

25	Status of This Memo

27	   This Internet-Draft is submitted in full conformance with the
28	   provisions of BCP 78 and BCP 79.

30	   Internet-Drafts are working documents of the Internet Engineering
31	   Task Force (IETF).  Note that other groups may also distribute
32	   working documents as Internet-Drafts.  The list of current Internet-
33	   Drafts is at https://datatracker.ietf.org/drafts/current/.

35	   Internet-Drafts are draft documents valid for a maximum of six months
36	   and may be updated, replaced, or obsoleted by other documents at any
37	   time.  It is inappropriate to use Internet-Drafts as reference
38	   material or to cite them other than as "work in progress."

40	   This Internet-Draft will expire on May 7, 2020.

42	Copyright Notice

44	   Copyright (c) 2019 IETF Trust and the persons identified as the
45	   document authors.  All rights reserved.

47	   This document is subject to BCP 78 and the IETF Trust's Legal
48	   Provisions Relating to IETF Documents
49	   (https://trustee.ietf.org/license-info) in effect on the date of
50	   publication of this document.  Please review these documents
51	   carefully, as they describe your rights and restrictions with respect
52	   to this document.  Code Components extracted from this document must
53	   include Simplified BSD License text as described in Section 4.e of
54	   the Trust Legal Provisions and are provided without warranty as
55	   described in the Simplified BSD License.

57	Table of Contents

59	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
60	     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   5
61	   2.  Design Goals  . . . . . . . . . . . . . . . . . . . . . . . .   5
62	   3.  FLIC Structure  . . . . . . . . . . . . . . . . . . . . . . .   6
63	     3.1.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   6
64	     3.2.  Locators  . . . . . . . . . . . . . . . . . . . . . . . .   7
65	     3.3.  Namespaces  . . . . . . . . . . . . . . . . . . . . . . .   7
66	     3.4.  Manifest Metadata . . . . . . . . . . . . . . . . . . . .   8
67	     3.5.  Pointer Annotations . . . . . . . . . . . . . . . . . . .   8
68	     3.6.  Manifest Grammar (ABNF) . . . . . . . . . . . . . . . . .   9
69	     3.7.  Manifest Trees  . . . . . . . . . . . . . . . . . . . . .  11
70	       3.7.1.  Traversal . . . . . . . . . . . . . . . . . . . . . .  11
71	     3.8.  Manifest Encryption . . . . . . . . . . . . . . . . . . .  13
72	       3.8.1.  Preshared Key Algorithm . . . . . . . . . . . . . . .  13
73	       3.8.2.  RSA Key Encapsulation . . . . . . . . . . . . . . . .  14
74	       3.8.3.  RSA KEM-DEM . . . . . . . . . . . . . . . . . . . . .  16
75	     3.9.  Protocol Encodings  . . . . . . . . . . . . . . . . . . .  16
76	       3.9.1.  CCNx Encoding . . . . . . . . . . . . . . . . . . . .  16
77	         3.9.1.1.  CCNx Hash Naming  . . . . . . . . . . . . . . . .  17
78	         3.9.1.2.  CCNx Single Prefix  . . . . . . . . . . . . . . .  17
79	         3.9.1.3.  CCNx Segmented Prefix . . . . . . . . . . . . . .  18
80	         3.9.1.4.  CCNx Hybrid Schema  . . . . . . . . . . . . . . .  19
81	       3.9.2.  NDN Encoding  . . . . . . . . . . . . . . . . . . . .  19
82	         3.9.2.1.  NDN Hash Naming . . . . . . . . . . . . . . . . .  19
83	         3.9.2.2.  NDN Single Prefix . . . . . . . . . . . . . . . .  20
84	         3.9.2.3.  NDN Segmented Prefix  . . . . . . . . . . . . . .  21
85	         3.9.2.4.  NDN Hybrid Schema . . . . . . . . . . . . . . . .  22
86	     3.10. Example Structures  . . . . . . . . . . . . . . . . . . .  22
87	       3.10.1.  Leaf-only data . . . . . . . . . . . . . . . . . . .  22
88	       3.10.2.  Linear . . . . . . . . . . . . . . . . . . . . . . .  22
89	   4.  Usage Examples  . . . . . . . . . . . . . . . . . . . . . . .  23
90	     4.1.  Locating FLIC leaf and manifest nodes . . . . . . . . . .  23
91	     4.2.  Seeking . . . . . . . . . . . . . . . . . . . . . . . . .  24
92	     4.3.  Block-level de-duplication  . . . . . . . . . . . . . . .  25
93	     4.4.  Growing ICN collections . . . . . . . . . . . . . . . . .  25
94	     4.5.  Re-publishing a FLIC under a new name . . . . . . . . . .  26
95	   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  27
96	   6.  Security Considerations . . . . . . . . . . . . . . . . . . .  27
97	   7.  Appendix A: Building Trees  . . . . . . . . . . . . . . . . .  27
98	   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  29
99	     8.1.  Normative References  . . . . . . . . . . . . . . . . . .  29
100	     8.2.  Informative References  . . . . . . . . . . . . . . . . .  30
101	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  30

103	1.  Introduction

105	   ICN architectures such as Content-Centric Networking (CCN)[RFC8569]
106	   and Named Data Networking [NDN] are well suited for static content
107	   distribution.  Each piece of possibly immutable) static content is
108	   assigned a name by its producer.  Consumers fetch this content using
109	   said name.  Optionally, consumers may specify the full name of
110	   content, which includes its name and a unique (with overwhelming
111	   probability) cryptographic digest of said content.  (See
112	   [I-D.irtf-icnrg-terminology] for a formal definition of "full name".)

114	   To enable requests with full names, consumers need a priori knowledge
115	   of content digests.  Manifests, or catalogs, are data structures
116	   commonly proposed to transport this information.  Typically,
117	   manifests are signed content objects (data) which carry a collection
118	   of hash digests.  However, as content objects, manifests themselves
119	   may be fetched by full name.  Thus, manifests may contain hash
120	   digests of, or pointers to, other manifests or content objects.  A
121	   collection of manifests and content objects represents a large piece
122	   of application data, e.g., one that cannot otherwise fit in a single
123	   content object.

125	   Structurally, this relationship between manifests and content objects
126	   is reminiscent of the UNIX inode concept with index tables and memory
127	   pointers.  In this document, we specify a simple, yet extensible,
128	   manifest data structure called FLIC - File-Like ICN Collection.  FLIC
129	   is suitable for ICNs such as CCN and NDN.  We describe the FLIC
130	   design, grammar, and various use cases, e.g., seeking, de-
131	   duplication, extension, and variable-sized encoding.  We also include
132	   FLIC encoding examples for CCN and NDN.

134	   The purpose of a manifest is to concisely name the constiuent pieces
135	   of a larger object.  A FLIC manifest does this by using a first
136	   manifest to name and cryptographically sign the data structure and
137	   then use more concise lists of hash-based names to indicate the
138	   constituent pieces.  This maintains strong security from a single
139	   signature.  A Manifest entry gives one enough information to create
140	   an Interest for that entry, so it must specify the name, the hash
141	   digest, and if needed the locators (routing hints).

143	   FLIC is a distributed data structure best illustrated by the
144	   following picture.

146	                                            root manifest
147	    .------------------------------------.
148	    | optional name:                     |
149	    |   /icn/name/of/this/flic           |
150	    |                                    |
151	    | HashGroup (HG):                    |
152	    |   optional metadata:               |
153	    |     overall digest, locator, etc.  |    .------.
154	    |   hash-valued data pointer -----------> | data |
155	    |     ...                            |    `------'  sub manifest
156	    |   hash-valued manifest pointer ------.     .------------------.
157	    |                                    |  `--> |                ----->
158	    | optional additional HashGroups ..  |       |                ----->
159	    |                                    |       `------------------'
160	    | optional signature                 |
161	    `------------------------------------'

163	         Figure 1: A FLIC manifest and its directed acyclic graph

165	   A key question is how one names the root manifest, the application
166	   data, and other subsequent manifests.  The question of namespaces is
167	   specific to the names of each Content Object (CCNx) or Data (NDN),
168	   and is separate from the question of Locators.  FLIC allows one to
169	   use a first namespace for the manifests and a second namespace for
170	   the application data.  A given namespace may use one of three
171	   schemas: hash-based naming, single-prefix naming, or segmented
172	   naming.  We describe the allowed methods in Section 3.3.  There are
173	   also particulars of how to encode the name schema in a given ICN
174	   protocol, which we describe in Section 3.9.

176	   Locators are routing hints to find a Content Object / Data.  They
177	   exist in both CCNx and NDN, though the specifics differ.  A FLIC
178	   manifest encodes locators the same for both ICN protocols, though
179	   they are encoded differently in the underlying protocol.  See
180	   Section 3.9 for encoding differences.

182	   We follow the CCNx [RFC8569] terminology where a Content Object is
183	   the data structure that holds application payload.  It is made up of
184	   an optional Name, a PayloadType, a Payload, and an optional
185	   Signature.

187	   FLIC has encodings for CCNx encoding (Section 3.9.1) as per RFC 8609
188	   [RFC8609] and for NDN (Section 3.9.2).

190	   An example implementation in Python may be found at
191	   [FLICImplementation].

193	1.1.  Requirements Language

195	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
196	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
197	   document are to be interpreted as described in RFC 2119 [RFC2119].

199	2.  Design Goals

201	   The preferred FLIC structure copies the proven UNIX inode concept of
202	   direct and indirect pointers, but without the specific structural
203	   forms of direct versus indirect.

205	   In FLIC terms, a direct pointer links to application-level data,
206	   which is a Content Object with application data in the Payload.  An
207	   indirect pointer links to a Content Object with a FLIC Manifest in
208	   the Payload.

210	   Links in FLIC use hash-based naming of Content Objects, rather than
211	   inode block numbers.  Both CCNx and NDN support hash-based naming,
212	   though the details vary.  See Section 3.9.1 and Section 3.9.2.
213	   Another advantage of using hash-based naming is it permits block-
214	   level de-duplication of application data because two blocks with the
215	   same payload will have the same hash name.

217	   Because FLIC uses hash-based naming, FLIC graphs are inherently
218	   acyclic.

220	   The preferred FLIC structure includes a root manifest with a strong
221	   cryptographic signature and then strong hash names to other manifests
222	   (e.g.  SHA256).  The advantage of this structure is the single
223	   signature in the root manifest covers the entire data structure no
224	   matter how many additional manifests are in the data structure.
225	   Another advantage of this structure is it removes the need to use
226	   chunk (CCNx) or segment (NDN) name components for the subordinate
227	   manifests.

229	   FLIC supports manifest encryption separate from application payload
230	   encryption.  It has a flexible encryption envelope to support various
231	   encryption algorithms and key discovery mechanisms.  The byte layout
232	   allows for in-place encryption and decryption.

234	   A limitation of this approach is that one cannot construct a hash-
235	   based name for a child until one knows the payload of that child.  In
236	   practical terms, this means that one must have the complete
237	   application payload available at the time of manifest creation.

239	   FLIC's design allows straightforward applications that just need to
240	   traverse a linear set of related objects to do so simply, but FLIC
241	   has two extensibility mechanisms that allow for more sophisticated
242	   uses: manifest metadata, and pointer annotations.  These are
243	   described in Section 3.4 and Section 3.5 respectively.

245	3.  FLIC Structure

247	3.1.  Terminology

249	   Data Object:  a CCNx nameless Content Object that usually only has
250	      Payload.  It might also have an ExpiryTime to limit the lifetime
251	      of the data.

253	   Direct Pointer:  borrowed from inode terminology, it is a CCNx link
254	      using a content object hash restriction and a locator name to
255	      point to a Data Object.

257	   Indirect Pointer:  borrowed from inode terminology, it is a CCNx link
258	      using a content object hash restriction and a locator name to
259	      point to a manifest content object.

261	   Manifest:  a CCNx ContentObject with PayloadType 'Manifest' and a
262	      Payload of the encoded manifest.  A leaf manifest only has direct
263	      pointers.  An internal manifest has a mixture of direct and
264	      indirect manifests.

266	   Leaf Manifest:  all pointers are direct pointers.

268	   Internal Manifest:  some or all pointers are indirect.  The order and
269	      number of each is up to the manifest builder.  By convention, all
270	      the direct manifests come first, then the indirect.

272	   Manifest Waste:  a metric used to measure the amount of waste in a
273	      manifest tree.  Waste is the number of unused pointers.  For
274	      example, a leaf manifest might be able to hold 40 direct pointers,
275	      but only 30 of them are used, so the waste of this node is 10.
276	      Manifest tree waste is the sum of waste over all manifests in a
277	      tree.

279	   Root Manifest:  A signed, named, manifest that points to nameless
280	      manifest nodes.  This structure means that the internal tree
281	      structure of internal and leaf manifests have no names and thus
282	      may be located anywhere in a namespace, while the root manifest
283	      has a name to fetch it by.

285	   Top Manifest:  A preferred manifest structure is to use a Root
286	      manifest that points to a single Internal manifest called the Top
287	      Manifest.  The Top manifest the begins the structure used to
288	      organize manifests.

290	   Namespace:  The prefix and object name that goes inside a Content
291	      Object.  It may include typed name components specifying a version
292	      and/or chunk/segment number.

294	   Locator:  A routing hint in an Interest used by forwarding to get the
295	      Interest to where it can be matched based on its Namespace-derived
296	      name.

298	3.2.  Locators

300	   Locators are routing hints used by forwarders to get an Interest to a
301	   node in the network that can resolve the Interest's name.  In some
302	   naming conventions, the name might only be a hash-based name so the
303	   Locator is the only available routing information.

305	   A manifest Node may define one or more Locator prefixes that can be
306	   used in the construction of Interests from the pointers in the
307	   manifest.  The Locators are inherited when walking a manifest tree,
308	   so they do not need to be defined everywhere.  It is RECOMMENDED that
309	   only the Root manifest contain Locators so that a single operation
310	   can update the locators.  One usecase when storing application
311	   payloads at different replicas is to replace the Root manifest with a
312	   new one that contains locators for the current replicas.

314	3.3.  Namespaces

316	   A FLIC manifest may define zero or more namespaces.  If none are
317	   defined, FLIC uses the default Hash Naming approach.  If using
318	   namespaces, typically there are two defined: one for the manifest
319	   namespace and one for the application data namespace.  If the two are
320	   the same, they can share a namepace.  There may be more than two
321	   namespaces.

323	   A namespace follows a naming convention.  The naming convention
324	   governs how FLIC creates the ICN Name that goes in an Interest's Name
325	   and must match a Content Object / Data Name.  The conventions are:
326	   (1) Hash Naming, (2) Single Prefix, and (3) Segmented Prefix.  The
327	   default is to use Hash Naming.  Hash Naming does not include anything
328	   besides a hash name in the Interest's name and relies on the Locator
329	   to forward the Interest.  Single Prefix uses the same name
330	   differntiated only by a Content Object's implicit hash.  Segmented
331	   Prefix keeps a counter for the namespace, starting with 0, and
332	   increments it after each use of the namespace.

334	   The namespace definitions may be inherited from the Root manifest or
335	   the Top manifest, or any prior manifest.  It is RECOMMENDED that the
336	   namespace definitions appear in the Root manifest so they can be
337	   updated by a single operation.  Because Segmented Prefix namespaces
338	   use a counter, it is RECOMMENDED to only define them in the Root
339	   manifest or Top manifest and not elsewhere, as it may confuse the
340	   counters.

342	   In the NodeData, there may be zero or more NSDef contains.  Each
343	   NSDef defines a namespace identifier (octet string) and its naming
344	   convention.  For the Hash Naming convention, no further information
345	   is required.  For the Single Prefix and Segmented Prefix conventions,
346	   the NSDef specifies the ICN Name prefix used by the namespace.

348	   A HashGroup may have an NSRef container that indicates which
349	   namespace it is using, and by implication which naming convention and
350	   the corresponding prefix.  If there is no NSRef, the hash group uses
351	   Hash Naming convention.

353	3.4.  Manifest Metadata

355	   The FLIC Manifest may be extended by defining TLVs that apply to the
356	   Manifest as a whole, or alternatively, individually to every data
357	   object pointed to by the Manifest.  This basic specification does not
358	   specify any, but metadata TLVs may be defined through additional RFCs
359	   or via Vendor TLVs.  FLIC uses a Vendor TLV structure similar to
360	   [RFC8609] for vendor-specific annotations that require no
361	   standardization process.

363	   For example, some applications may find it useful to allow
364	   specialized consumers such as _repositories_ (for example
365	   [repository]) or enhanced forwarder caches to pre-place, or
366	   adaptively pre-fetch data in order to improve robustness of delay
367	   performance.  We note in passing that FLICs use of separate
368	   namespaces for the Manifest and the underlying Data allows different
369	   encryption keys to be used, hence giving a element like a cache or
370	   repository access to the Manifest data does not as a side effect
371	   reveal the contents of the application data itself.

373	3.5.  Pointer Annotations

375	   FLIC allows each manifest pointer to be annotated with extra data.
376	   Annotations allow applications to exploit metadata about each Data
377	   Object pointed to without having to first fetch the corresponding
378	   Content Object.  This specification defines one such annotation.  The
379	   _SizeAnnotation_ specifies the number of application layer octets
380	   covered by the pointer.

382	   An annotation may, for example, give hints about a preferred
383	   traversal order for fetching the data, or an importance/precedence
384	   indication to aid applications that do not require every content
385	   object pointed to in the manifest to be fetched.  This can be very
386	   useful for real-time or streaming media applications that can perform
387	   error concealment when rendering the media.

389	   Additional annotations may be defined through additional RFCs or via
390	   Vendor TLVs.  FLIC uses a Vendor TLV structure similar to [RFC8609]
391	   for vendor-specific annotations that require no standardization
392	   process.

394	3.6.  Manifest Grammar (ABNF)

396	   The manifest grammar is mostly independent of the transport ICN
397	   protocol.  The TLV encoding therefore follows the corresponding ICN
398	   protocol, so for CCNx FLIC uses 2 octet length, 2 octet type and for
399	   NDN uses the 1/3/5 octet types and lengths.  There are also some
400	   differences in how one structures and resolves links.  [RFC8569]
401	   defines HashValue and Link for CCNx encodings.  The NDN
402	   ImplicitSha256DigestComponent defines HashValue and NDN Delegation
403	   (from Link Object) defines Link for NDN.  The Section 3.9 section
404	   below specifies these differences.

406	   The basic structure of a FLIC manifest is a security context, a node,
407	   and an authentication tag.  The security context and authentication
408	   tag are not needed if the node is unencrypted.  A node is made up of
409	   a set of metadata, the NodeData, that applies to the entire node, and
410	   one or more HashGroups that contain pointers.

412	   The NodeData element defines the namespaces used by the manifest.
413	   There may be multiple namespaces, depending on how one names
414	   subsequent manifests or data objects.  Each HashGroup may reference a
415	   single namespace to control how one forms Interests from the
416	   HashGroup.  If one is using separate namespaces for manifests and
417	   application data, one needs at least two HashGroups.  For a manifest
418	   structure of "MMMDDD," (where M means manifest (indirect pointer) and
419	   D means data (direct pointer)) for example, one would have a first
420	   hash group for the child manifests with its namespace and a second
421	   HashGroup for the data pointers with the other namespace.  If one
422	   used a structure like "MMMDDDMMM," then one would need three
423	   HashGroups.

425	TYPE = 2OCTET / {1,3,5}OCTET ; As per CCNx or NDN TLV
426	LENGTH = 2OCTET / {1,3,5}OCTET ; As per CCNx or NDN TLV

428	Manifest = TYPE LENGTH [SecurityCtx] (EncryptedNode / Node) [AuthTag]

430	SecurityCtx = TYPE LENGTH AlgorithmCtx
431	AlgorithmCtx = PresharedKeyCtx / RsaKemCtx / RsaKemDemCtx
432	AuthTag = TYPE LENGTH *OCTET ; e.g. AEAD authentication tag
433	EncryptedNode = TYPE LENGTH *OCTET ; Encrypted Node
434	Node = TYPE LENGTH [NodeData] 1*HashGroup
435	NodeData = TYPE LENGTH [SubtreeSize] [SubtreeDigest] [Locators] 0*NSDef
436	SubtreeSize = TYPE LENGTH INTEGER
437	SubtreeDigest = TYPE LENGTH HashValue
438	NSDef = TYPE LENGTH NsId NsSchema
439	NsId = TYPE LENGTH INTEGER
440	NsSchema = HashSchema / SinglePrefixSchema / SegmentedPrefixSchema
441	HashSchema = TYPE 0
442	SinglePrefixSchema = TYPE LENGTH Name
443	SegmentedPrefixSchema = TYPE LENGTH Name

445	Locators = TYPE LENGTH 1*Link
446	HashValue = TYPE LENGTH *OCTET ; As per ICN Protocol
447	Link = TYPE LENGTH *OCTET ; As per ICN protocol

449	HashGroup = TYPE LENGTH [GroupData] (Ptrs / AnnotatedPtrs)
450	Ptrs = TYPE LENGTH *HashValue
451	AnnotatedPtrs = TYPE LENGTH *PointerBlock
452	PointerBlock = TYPE LENGTH *Annotation Ptr
453	Ptr = TYPE LENGTH HashValue

455	Annotation = SizeAnnotation / Vendor
456	SizeAnnotation = TYPE LENGTH Integer
457	Vendor = TYPE LENGTH PEN *OCTET

459	GroupData = TYPE LENGTH [LeafSize] [LeafDigest] [SubtreeSize] [SubtreeDigest] [NsId]
460	LeafSize = TYPE LENGTH INTEGER
461	LeafDigest = TYPE LENGTH HashValue

463	PresharedKeyCtx = TYPE LENGTH PresharedKeyData
464	PresharedKeyData = KeyNum IV Mode
465	KeyNum = TYPE LENGTH INTEGER
466	IV = TYPE LENGTH 1*OCTET
467	Mode = TYPE LENGTH (AES-GCM-128 / AES-GCM-256)

469	RsaKemCtx = 2 LENGTH RsaKemData
470	RsaKemData = KeyId IV Mode WrappedKey LocatorPrefix
471	KeyId = TYPE LENGTH HashValue; ID of Key Encryption Key
472	WrappedKey = TYPE LENGTH 1*OCTET
473	LocatorPrefix = TYPE LENGTH Link

475	RsaKemDemCtx = 3 LENGTH RsaKemDemData
476	RsaKemDemData = KeyId IV Mode WrappedKey LocatorPrefix

478	                          Figure 1: FLIC Grammar

480	   SecurityCtx:  information about how to decrypt an EncryptedNode.  The
481	      structure will depend on the specific encryption algorithm.

483	   AlgorithmId:  The ID of the encryption method (e.g. preshared key, a
484	      broadcast encryption scheme, etc.)

486	   AlgorithmData:  The context for the encryption algorithm.

488	   EncryptedNode:  An opaque octet string with an optional
489	      authentication tag (i.e. for AEAD authentication tag)

491	   Node:  A plain-text manifest node.  The structure allows for in-place
492	      encryption/decryption.

494	   NodeData:  the metadata about the Manifest node

496	   SubtreeSize:  The size of all application data at and below the Node
497	      or Group

499	   SubtreeDigest:  The cryptographic digest of all application data at
500	      and below the Node or Group

502	   Locators:  An array of routing hints to find the manifest components

504	   HashGroup:  A set of child pointers and associated metadata

506	   Ptrs:  A list of one or more Hash Values

508	   GroupData:  Metadata that applies to a HashGroup

510	   LeafSize:  Size of all application data immediately under the Group
511	      (i.e. via direct pointers)

513	   LeafDigest:  Digest of all application data immediately under the
514	      Group

516	   Ptr:  The ContentObjectHash of a child, which may be a data
517	      ContentObject (i.e. with Payload) or another Manifest Node.

519	3.7.  Manifest Trees

521	3.7.1.  Traversal

523	   FLIC manifests use a pre-order traversal.  This means they are read
524	   top to bottom, left to right.  The algorithms in Figure 2 show the
525	   in-order forward traversal code and the reverse-oder traversal code,
526	   which we use below to construct such a tree.  This document does not
527	   mandate how to build trees.  Appendix A provides a detailed example
528	   of building inode-like trees.

530	   If using Annotated Pointers, an annotation could influence the
531	   traversal order.

533	   preorder(node)
534	       if (node = null)
535	           return
536	       visit(node)
537	       for (i = 0, i < node.child.length, i++)
538	           preorder(node.child[i])

540	   reverse_preorder(node)
541	       if (node = null)
542	           return
543	       for (i = node.child.length - 1, i >= 0, i-- )
544	       reverse_preorder(node.child[i])
545	           visit(node)

547	                      Figure 2: Traversal Pseudocode

549	   In terms of the FLIC grammar, one expands a node into its hash
550	   groups, visiting each hash group in order.  In each hash group, one
551	   follows each pointer in order.  Figure Figure 3 shows how hash groups
552	   inside a manifest expand like virtual children in the tree.  The in-
553	   order traversal is M0, HG1, M1, HG3, D0, D1, D2, HG2, D3, D4.

555	   M0 ____
556	   |      \
557	   HG1    HG2
558	   | \    |  \
559	   M1 D2  D3  D4
560	   |
561	   HG3
562	   |  \
563	   D0 D1

565	                         Figure 3: Node Expansion

567	   Using the example manifest tree shown in Figure Figure 9, the in-
568	   order traversal would be: Root, M0, M1, D0, D1, D2, M2, D3, D4, D5,
569	   M3, D6, D7, D8.

571	3.8.  Manifest Encryption

573	   This document specifies three encryption methods.  The first is a
574	   preshared key algorithm, where the parties are assumed to have the
575	   decryption keys already.  This is useful, for example, when using a
576	   key agreement protocol such as CCNxKE.  The second is an RSA key
577	   encapsulation mechanisms (RsaKem).  The third is a standard RSA KEM-
578	   DEM mechanism that uses a shared group key (RsaKemDem).

580	   For group key based encryption, RsaKem and RsaKemDem, this
581	   specification only details the pertinent aspects of the encryption.
582	   It does not specify aspects of a key manager which may or may not be
583	   used as part of key distribution and management, nor does it specify
584	   the protocol between a key manager and a publisher.  In it's
585	   simpliest form, the publisher could be the key manager, so there is
586	   no extra protocol needed between the publisher and key manager.  This
587	   specification does describe how a consumer locates the appropriate
588	   keys.

590	   While the preshared key algorithm is limited in use, the AES
591	   encryption mechanisms described apply to the group key mechanisms
592	   too.  They group key mechanisms simply facilitate distribution of the
593	   shared key without an on-line key agreement protocol like CCNxKE.

595	   A fourth encryption mechanism based on elliptic curve key
596	   distribution is forthcoming.

598	3.8.1.  Preshared Key Algorithm

600	   The KeyNum identifies a key on the receiver.  The key must be of the
601	   correct length of the Mode used.  If the key is longer, use the left
602	   bits.  Many receivers many have the same key with the same KeyId.  A
603	   publisher creates a signed root manifest with a security context.  A
604	   consumer must ensure that the root manifest signer is the expected
605	   publisher for use with the pre-shared key, which may be shared with
606	   many other consumers.  The publisher may use either method 8.2.1
607	   (deterministic IV) or 8.2.2 (RBG-based IV) [NIST 800-38D] for
608	   creating the IV.

610	   Each encrypted manifest node (root manifest or internal manifest) has
611	   a full security context (KeyNum, IV, Mode).  The AES-GCM decryption
612	   is independent for each manifest so Manifest objects can be fetched
613	   and decrypted in any order.  This design also ensures that if a
614	   manifest tree points to the same subtree repeatedly, such as for
615	   deduplication, the decryptions are all idempotent.

617	   The functions for authenticated encryption and authenticated
618	   decryption are as given in Sections 7.1 and 7.2 of NIST 800-38D: GCM-
619	   AE_K(IV, P, A) and GCM-AD_K(IV, C, A, T).

621	EncryptNode(SecurityCtx, Node, K, IV) -> GCM-AE_K(IV, P, A) -> (C, T)
622	    Node: The wire format of the Node (P)
623	    SecurityCtx: The wire format of the SecurityCtx as the Additional Authenticated Data (A)
624	    K: the pre-shared key (128 or 256 bits)
625	    IV: The initialization vector (usually 96 or 128 bits)
626	    C: The cipher text
627	    T: The authentication tag

629	                      Figure 4: Preshared Key Encrypt

631	   The pair (C,T) is the OpaqueNode encoded as a TLV structure:
632	   (OpaqueNode (CipherText C) (AuthTag T)).

634	DecryptNode(SecurityCtx, C, T, K, IV) -> GCM-AD_K (IV, C, A, T) -> (Node, FailFlag)
635	Node: The wire format of the decrypted Node
636	FailFlag: Indicates authenticated decryption failure (true or false)

638	                      Figure 5: Preshared Key Decrypt

640	   If doing in-place decryption, the cipher text C will be enclosed in
641	   an EncryptedNode TLV value.  After decryption, change the TLV type to
642	   Node.  The length should be the same.  After decryption the AuthTag
643	   is no longer needed.  The TLV type should be changed to T_PAD and the
644	   value zeroed.  The SecurityCtx could be changed to T_PAD and zeroed
645	   or left as-is.

647	3.8.2.  RSA Key Encapsulation

649	   See also RFC 5990, NIST SP 800-56B Rev. 2 and
650	   https://lists.w3.org/Archives/Public/public-xmlsec/2009May/att-0032/
651	   Key_Encapsulation.pdf

653	   In this system, a key manager (KM) (which could be the publisher)
654	   creates a symmetric Content Encryption Key (CEK) and a key wrapping
655	   pair with a Key Encryption Key (KEK) and Key Decryption Key (KDK).
656	   Each publisher and consumer has its own public/private key pair, and
657	   the KM knows each publisher's and consumer's identity and its public
658	   key (PK_x).

660	   We do not describe the publisher-key manager protocol to request a
661	   CEK.  The publisher will obtain the (CEK, E_KEK(Z), KeyId, Locator),
662	   where each element is: the content encryption key, the CEK precursor,
663	   Z, encrypted with the KEK (an RSA operation), and the KeyId of the
664	   corresponding KDK, and the Locator is the CCNx name prefix to fetch
665	   the KDK (see below).  The precursor Z is chosen randomly z < n-1,
666	   where n is KEK's public modulus.  Note that CEK = KDF(Z).  Note that
667	   the publisher does not see KEK or Z.

669	   We use HKDF (RFC 5869) for the KDF.  CEK = HKDF-Expand(HKDF-
670	   Extract(0, Z), "CEK", KeyLen), where KenLen is usually 32 bytes (256
671	   bits).

673	   In the ABNF grammar, the RsaKemData includes a KeyId, IV, Mode,
674	   WrappedKey, and LocatorPrefix.  The KeyId is the ID (sha256) of the
675	   KEK.  The IV and Mode are as per preshared key, and describe how the
676	   manifest is encrypted with AES-GCM.  The WrappedKey is the AES key to
677	   decrypt the manifest.  The LocatorPrefix is used to construct an
678	   Interest to fetch the KDK.

680	   To fetch the KDK, a consumer with public key PK_c constructs an
681	   Interest with name /LocatorPrefix/{KeyId}/{PK_c keyid} and a
682	   KeyIdRestriction of the KM's KeyId (from the LocatorPrefix Link).  It
683	   should receive back a signed Content Object with the KDK wrapped for
684	   the consumer, or a NAK from the KM.  The payload of the ContentObject
685	   will be RsaKemWrap(PK, KDK).  The signed ContentObject must have a
686	   KeyLocator to the KM's public key.  The consumer will trust the KM's
687	   public key because the publisher, whom the consumer trusts, relayed
688	   that KeyId inside its own signed Manifest.

690	   The signed Content Object should have an ExpiryTime, which may be
691	   shorter than the Manifest's, but should not be substantially longer
692	   than the Manifest's ExpiryTime.  The KM may decide how to handle the
693	   Recommended Cache Time, or if caching of the response is even
694	   permissible.  The KM may require on-line fetching of the response via
695	   a CCNxKE encrypted transport tunnel.

697	   RsaKemWrap(PK, K, KeyLen = 256):
698	       choose z < n-1, where n is PK's public modulus
699	       encrypt c = z^e mod n
700	       prk = HKDF-Extract(0, Z)
701	       kek = HKDF-Expand(prk, "RsaKemWrap", KeyLen)
702	       WK = E_KEK(K) # [AES-WRAP, RFC 3394]
703	       output (c, WK)

705	                          Figure 6: RSA KEM Wrap

707	   A consumer must verify the signed content object's signature against
708	   the Key Manager's public key.  The consumer then unwraps the KDK from
709	   the Content Object's payload using RsaKemUnwrap().  The KeyLen is
710	   taken from the WrapMode parameter.

712	   RsaKemUnwrap(SK, c, WK, KeyLen = 256):
713	       Using the consumers private key SK, decrypt Z from c.
714	       prk = HKDF-Extract(0, Z)
715	       kek = HKDF-Expand(prk, "RsaKemWrap", KeyLen)
716	       K = D_KEK(WK) # [AES-UNWRAP, RFC 33940]
717	       output K

719	                         Figure 7: RSA KEM Unwrap

721	   The consumer then unwraps the CEK precursor by using the KDK to
722	   decrypt Z.  It then derives CEK as above.  Manifest encryption and
723	   decryption proceed as with PresharedKey, but using the CEK.

725	3.8.3.  RSA KEM-DEM

727	   In this scheme a Key Manager (KM), who could be the publisher,
728	   creates a Key Encryption Key (KEK) and Key Decryption Key (KDK) key
729	   pair.  The publisher obtains the KEK.  The KM distributes the KDK to
730	   each group member by encrypting it under each member's public key.
731	   To encrypt data, the publisher generates a symmetric Content
732	   Encryption Key (CEK), wraps it with the KEK, then encrypts the
733	   manifest with the CEK.  It places the wrapped CEK in the manifest.

735	   The KM communicates the KEK to the publisher through an unspecified
736	   means particular to the KM.

738	   The KM distributes the KDK to each group member.  It uses a name
739	   /{km-prefix}/{publisher-prefix}/{KDK KeyId}/{member KeyId} to publish
740	   the encrypted KDK under a member's public key.  It uses RSA-OAEP for
741	   the encryption.

743	   The publisher creates a random symmetric CEK of an appropriate bit
744	   length.  It uses the KEK to wrap the CEK using RSA-OAEP.  It places
745	   the wrapped key in the manifest's RsaKemDemData along with the KeyId
746	   set to the KDK's KeyId and the KeyLocator prefix /{km-
747	   prefix}/{publisher-prefix}/.  Each member appends the KDK KeyId and
748	   their public key KeyId to the name to attemt to fetch the KDK.  When
749	   forming the Interest to fetch the key, a consumer should also use a
750	   KeyIdRestriction of the KM's KeyId, which it can retrieve from the
751	   KeyLocator.

753	3.9.  Protocol Encodings

755	3.9.1.  CCNx Encoding

757	   In CCNx, all Manifest content objects use a PayloadType of
758	   T_PYLDTYPE_MANIFEST, while all application data content objects use a
759	   PayloadType of T_PYLDTYPE_DATA.

761	ManifestContentObject = TYPE LENGTH [Name] [ExpiryTime] PayloadType Payload
762	Name = TYPE LENGTH *OCTET ; As per RFC8569
763	ExpiryTime = TYPE LENGTH *OCTET ; As per RFC8569
764	PayloadType = TYPE LENGTH T_PYLDTYPE_MANIFEST ; Value TBD
765	Payload : TYPE LENGTH *OCTET ; the serialized Manifest object

767	                     Figure 8: CCNx Embedding Grammar

769	3.9.1.1.  CCNx Hash Naming

771	   The Hash Naming namespace uses CCNx nameless content objects.

773	   It proceeds as follows:

775	   o  The Root Manifest content object has a name used to fetch the
776	      manifest.  It is signed by the publisher.  It has a set of
777	      Locators used to fetch the remainder of the manifest.  It has a
778	      single HashPointer that points to the Top Manifest.  It may also
779	      have cache control directives, such as ExpiryTime.

781	   o  The Root Manifest has an NsDef that specifies HashSchema.  It's
782	      GroupData uses that NsId.  All internal and leaf manifests use the
783	      same GroupData NsId.  A Manifest Tree MAY omit the NsDef and NsId
784	      elements and rely on the default being HashSchema.

786	   o  The Top Manifest is a nameless CCNx content object.  It may have
787	      cache control directies.

789	   o  Internal and Leaf manifests are nameless CCNx content objects,
790	      possibly with cache control directives.

792	   o  The Data content objects are nameless CCNx content objects,
793	      possibly with cache control directives.

795	   o  To form an Interest for a direct or indirect pointer, use a Name
796	      from one of the Locators and put the pointer HashValue into the
797	      ContentObjectHashRestriction.

799	3.9.1.2.  CCNx Single Prefix

801	   The Single Prefix schema uses the same name in all Content Objects
802	   and distinguishes them via their ContentObjectHash.  Note that in
803	   CCNx, using a SinglePrefix name means that we do not use Locators.

805	   It proceeds as follows:

807	   o  The Root Manifest content object has a name used to fetch the
808	      manifest.  It is signed by the publisher.  It has a set of
809	      Locators used to fetch the remainder of the manifest.  It has a
810	      single HashPointer that points to the Top Manifest.  It may also
811	      have cache control directives, such as ExpiryTime.

813	   o  The Root Manifest has an NsDef that specifies SinglePrefix and the
814	      SinglePrefixSchema element specifies the SinglePrefixName.

816	   o  The Top Manifest has the name SinglePrefixName.  It may have cache
817	      control directies.  It's GroupData elements must have an NsId that
818	      references the NsDef.

820	   o  An Internal or Leaf manifest has the name SinglePrefixName,
821	      possibly with cache control directives.  It's GroupData elements
822	      must have an NsId that references the NsDef.

824	   o  The Data content objects have the name SinglePrefixName, possibly
825	      with cache control directives.

827	   o  To form an Interest for a direct or indirect pointer, use
828	      SinglePrefixName as the Name and put the pointer HashValue into
829	      the ContentObjectHashRestriction.

831	3.9.1.3.  CCNx Segmented Prefix

833	   The Segmented Prefix schema uses a different name in all Content
834	   Objects and distinguishes them via their ContentObjectHash.  Note
835	   that in CCNx, using a SegmentedPrefixSchema means that we do not use
836	   Locators.  OPTIONAL: Use AnnotatedPointers to indicate the segment
837	   number of each hash pointer to avoid needing to infer the segment
838	   numbers.

840	   It proceeds as follows:

842	   o  The Root Manifest content object has a name used to fetch the
843	      manifest.  It is signed by the publisher.  It has a set of
844	      Locators used to fetch the remainder of the manifest.  It has a
845	      single HashPointer that points to the Top Manifest.  It may also
846	      have cache control directives, such as ExpiryTime.

848	   o  The Root Manifest has an NsDef that specifies SegmentedPrefix and
849	      the SegmentedPrefixSchema element specifies the
850	      SegmentedPrefixName.

852	   o  The publisher will track the chunk number of each content object
853	      within the NsId.  Objects will be numbered in their traversal
854	      order.  Within each manifest, the name will be constructed from
855	      the SegmentedPrefixName plus a Chunk name component.

857	   o  The Top Manifest has the name SegmentedPrefixName plus chunk
858	      number.  It may have cache control directies.  It's GroupData
859	      elements must have an NsId that references the NsDef.

861	   o  An Internal or Leaf manifest has the name SegmentedPrefixName plus
862	      chunk number, possibly with cache control directives.  It's
863	      GroupData elements must have an NsId that references the NsDef.

865	   o  The Data content objects have the name SegmentedPrefixName plus
866	      chunk number, possibly with cache control directives.

868	   o  To form an Interest for a direct or indirect pointer, use
869	      SegmentedPrefixName plus chunk number as the Name and put the
870	      pointer HashValue into the ContentObjectHashRestriction.  A
871	      consumer must track the chunk number in traversal order for each
872	      SegmentedPrefixSchema NsId.

874	3.9.1.4.  CCNx Hybrid Schema

876	   A manifest may use multiple schemas.  For example, the application
877	   payload in data content objects might use SegmentedPrefix while the
878	   manifest content objects might use HashNaming.

880	   The Root Manifest should specify an NsDef with a first NsId (say 1)
881	   as the HashNaming schema and a second NsDef with a second NsId (say
882	   2) as the SegmentedPrefix schema along with the SegmentedPrefixName.

884	   Each manifest (Top, Internal, Leaf) uses two or more HashGroups,
885	   where eash HashGroup has only Direct (with the second NsId) or
886	   Indirect (with the first NsId).  The number of hash groups will
887	   depend on how the publisher wishes to interleave direct and indirect
888	   pointers.

890	   Manifests and data objects are named as appropriate for their naming
891	   schema.

893	3.9.2.  NDN Encoding

895	   In NDN, all Manifest Data objects use a ContentType of FLIC (1024),
896	   while all application data content objects use a PayloadType of Blob.

898	3.9.2.1.  NDN Hash Naming

900	   In NDN Hash Naming, a Data Object has a 0-length name.  This means
901	   that an Interest will only have an ImplicitDigest name component in
902	   it.  This method relies on using NDN ForwardingHints.

904	   It proceeds as follows:

906	   o  The Root Manifest Data has a name used to fetch the manifest.  It
907	      is signed by the publisher.  It has a set of Locators used to
908	      fetch the remainder of the manifest.  It has a single HashPointer
909	      that points to the Top Manifest.  It may also have cache control
910	      directives.

912	   o  The Root Manifest has an NsDef that specifies HashSchema.  It's
913	      GroupData uses that NsId.  All internal and leaf manifests use the
914	      same GroupData NsId.  A Manifest Tree MAY omit the NsDef and NsId
915	      elements and rely on the default being HashSchema.

917	   o  The Top Manifest has a 0-length Name.  It may have cache control
918	      directies.

920	   o  Internal and Leaf manifests has a 0-length Name, possibly with
921	      cache control directives.

923	   o  The application Data use a 0-length name, possibly with cache
924	      control directives.

926	   o  To form an Interest for a direct or indirect pointer, the name is
927	      only the Implicit Digest name component derived from a pointer's
928	      HashValue.  The ForwardingHints come from the Locators.  In NDN,
929	      one may use one or more locators within a single Interest.

931	3.9.2.2.  NDN Single Prefix

933	   In Single Prefix, the Data name is a common prefix used between all
934	   objects in that namespace, without a Segment or other counter.  They
935	   are distinguished via the Implicit Digest name component.  The FLIC
936	   Locators go in the ForwardingHints.

938	   It proceeds as follows:

940	   o  The Root Manifest Data object has a name used to fetch the
941	      manifest.  It is signed by the publisher.  It has a set of
942	      Locators used to fetch the remainder of the manifest.  It has a
943	      single HashPointer that points to the Top Manifest.  It may also
944	      have cache control directives.

946	   o  The Root Manifest has an NsDef that specifies SinglePrefix and the
947	      SinglePrefixSchema element specifies the SinglePrefixName.

949	   o  The Top Manifest has the name SinglePrefixName.  It may have cache
950	      control directies.  It's GroupData elements must have an NsId that
951	      references the NsDef.

953	   o  An Internal or Leaf manifest has the name SinglePrefixName,
954	      possibly with cache control directives.  It's GroupData elements
955	      must have an NsId that references the NsDef.

957	   o  The Data content objects have the name SinglePrefixName, possibly
958	      with cache control directives.

960	   o  To form an Interest for a direct or indirect pointer, use
961	      SinglePrefixName as the Name and append the pointer's HashValue
962	      into an ImplicitDigest name component.  Set the ForwardingHints
963	      from the FLIC locators.

965	3.9.2.3.  NDN Segmented Prefix

967	   In Segmented Prefix, the Data name is a common prefix plus a segment
968	   number, so each manifest or application data object has a unique full
969	   name before the implicit digest.  This means the consumer must
970	   maintain a counter for each SegmentedPrefix namespace.  OPTIONAL: Use
971	   AnnotatedPointers to indicate the segment number of each hash pointer
972	   to avoid needing to infer the segment numbers.

974	   It proceeds as follows:

976	   o  The Root Manifest Data object has a name used to fetch the
977	      manifest.  It is signed by the publisher.  It has a set of
978	      Locators used to fetch the remainder of the manifest.  It has a
979	      single HashPointer that points to the Top Manifest.  It may also
980	      have cache control directives.

982	   o  The Root Manifest has an NsDef that specifies SegmentedPrefix and
983	      the SegmentedPrefixSchema element specifies the
984	      SegmentedPrefixName.

986	   o  The publisher will track the segment number of each Data object
987	      within a SegmentedPrefix NsId.  Data will be numbered in their
988	      traversal order.  Within each manifest, the name will be
989	      constructed from the SegmentedPrefixName plus a Segment name
990	      component.

992	   o  The Top Manifest has the name SegmentedPrefixName plus segment
993	      number.  It may have cache control directies.  It's GroupData
994	      elements must have an NsId that references the NsDef.

996	   o  An Internal or Leaf manifest has the name SegmentedPrefixName plus
997	      segment number, possibly with cache control directives.  It's
998	      GroupData elements must have an NsId that references the NsDef.

1000	   o  The Data content objects have the name SegmentedPrefixName plus
1001	      chunk number, possibly with cache control directives.

1003	   o  To form an Interest for a direct or indirect pointer, use
1004	      SegmentedPrefixName plus segment number as the Name and put the
1005	      pointer HashValue into the ImplicitDigest name component.  A
1006	      consumer must track the segment number in traversal order for each
1007	      SegmentedPrefixSchema NsId.

1009	3.9.2.4.  NDN Hybrid Schema

1011	   A manifest may use multiple schemas.  For example, the application
1012	   payload in data content objects might use SegmentedPrefix while the
1013	   manifest content objects might use HashNaming.

1015	   The Root Manifest should specify an NsDef with a first NsId (say 1)
1016	   as the HashNaming schema and a second NsDef with a second NsId (say
1017	   2) as the SegmentedPrefix schema along with the SegmentedPrefixName.

1019	   Each manifest (Top, Internal, Leaf) uses two or more HashGroups,
1020	   where eash HashGroup has only Direct (with the second NsId) or
1021	   Indirect (with the first NsId).  The number of hash groups will
1022	   depend on how the publisher wishes to interleave direct and indirect
1023	   pointers.

1025	   Manifests and data objects are named as appropriate for their naming
1026	   schema.

1028	3.10.  Example Structures

1030	3.10.1.  Leaf-only data

1032	              Root
1033	               |
1034	        ______ M0 _____
1035	       /       |       \
1036	      M1       M2       M3
1037	    / | \    / | \    / | \
1038	   D0 D1 D2 D3 D4 D5 D6 D7 D8

1040	                     Figure 9: Leaf-only manifest tree

1042	3.10.2.  Linear

1044	   Of special interest are "skewed trees" where a pointer to a manifest
1045	   may only appear as last pointer of (sub-) manifests.  Such a tree
1046	   becomes a sequential list of manifests with a maximum of datapointers
1047	   per manifest packet.  Beside the tree shape we also show this data
1048	   structure in form of packet content where D stands for a data pointer
1049	   and M is the hash of a manifest packet.

1051	   Root -> M0 ----> M1 ----> ...
1052	   |->DDDD  |->DDDD

1054	4.  Usage Examples

1056	4.1.  Locating FLIC leaf and manifest nodes

1058	   The names of manifest and data objects are often missing or not
1059	   unique, unless using specific naming conventions.  In this example,
1060	   we show how using manifest locators is used to generate Interests.
1061	   Take for example the figure below where the root manifest is named by
1062	   hash h0.  It has nameless children with hashes with hashes h1 ... hN.

1064	   Objects:
1065	   manifest(name=/a/b/c, ptr=h1, ptr=hN)  - has hash h0
1066	   nameless(data1)                        - has hash h1
1067	   ...
1068	   nameless(dataN)                        - has hash hN

1070	   Query for the manifest:
1071	   interest(name=/a/b/c, implicitDigest=h0)

1073	                       Figure 10: Data Organization

1075	   After obtaining the manifest, the client fetches the contents.  In
1076	   this first instance, the manifest does not provide any Locators data
1077	   structure, so the client must continue using the name it used for the
1078	   manifest.

1080	   interest(name=/a/b/c, implicitDigest=h1)
1081	   ...
1082	   interest(name=/a/b/c, implicitDigest=hN)

1084	                         Figure 11: Data Interests

1086	   Using the locator metadata entry, this behavior can be changed:

1088	   Objects:
1089	   manifest(name=/a/b/c,
1090	   hashgroup(loc=/x/y/z, ptr=h1)
1091	   hashgroup(ptr=h2)             - has hash h0
1092	   nameless(data1)               - has hash h1
1093	   nameless(data2)               - has hash h2

1095	   Queries:
1096	   interest(name=/a/b/c, implicitDigest=h0)
1097	   interest(name=/x/y/z, implicitDigest=h1)
1098	   interest(name=/a/b/c, implicitDigest=h2)

1100	                         Figure 12: Using Locators

1102	4.2.  Seeking

1104	   Fast seeking (without having to sequentially fetch all content) works
1105	   by skipping over entries for which we know their size.  The following
1106	   expression shows how to compute the byte offset of the data pointed
1107	   at by pointer P_i, offset_i.  In this formula, let P_i represent the
1108	   Size value of the i-th pointer.

1110	   offset_i = \sum_{i = 1}^{i - 1} P_i.size

1112	   With this offset, seeking is done as follows:

1114	   Input: seek_pos P, a FLIC manifest with a hash group having N entries
1115	   Output: pointer index i and byte offset o, or out-of-range error
1116	   Algo:
1117	       offset = 0
1118	       for i in 1..N do
1119	       if (P < P_i.size)
1120	           return (i, P - offset)
1121	       offset += P_i.size
1122	       return out-of-range

1124	                       Figure 13: Seeking Algorithm

1126	   Seeking in a BlockHashGroup is different since offsets can be quickly
1127	   computed.  This is because the size of each pointer P_i except the
1128	   last is equal to the SizePerPtr value.  For a BlockHashGroup with N
1129	   pointers, OverallByteCount D, and SizePerPointer L, the size of P_i
1130	   is equal to the following:

1132	   D - ((i - 1) * L)

1134	   In a BlockHashGroup with k pointers, the size of P_k is equal to:

1136	   D - L * (k - 1)

1138	   Using these, the seeking algorithm can be thus simplified to the
1139	   following:

1141	   Input: seek_pos P, a FLIC manifest with a hash group having
1142	   OverallByteCount S and SizePerPointer L.
1143	   Output: pointer index i and byte offset o, or out-of-range error
1144	   Algo:
1145	       if (P > S)
1146	           return out-of-range
1147	       i = floor(P / L)
1148	       if (i > N)
1149	           return out-of-range # bad FLIC encoding
1150	       o = P mod L
1151	       return (i, o)

1153	                       Figure 14: Seeking Algorithm

1155	   Note: In both cases, if the pointer at position i is a manifest
1156	   pointer, this algorithm has to be called once more, seeking to
1157	   seek_pos o inside that manifest.

1159	4.3.  Block-level de-duplication

1161	   Consider a huge file, e.g. an ISO image of a DVD or program in binary
1162	   form, that had previously been FLIC-ed but now needs to be patched.
1163	   In this case, all existing encoded ICN chunks can remain in the
1164	   repository while only the chunks for the patch itself is added to a
1165	   new manifest data structure, as is shown in the picture below.  For
1166	   example, the venti [1] archival file system of Plan9 uses this
1167	   technique.

1169	   old_mfst -  - > h1 --> oldData1  <-- h1 < -  -  new_mfst
1170	            \  - > h2 --> oldData2  <-- h2 < -  - /
1171	             \            replace3  <-- h5 < -  -/
1172	              \- > h3 --> oldData3              /
1173	               \ > h4 --> oldData4  <-- h4 < - /

1175	                         Figure 15: De-duplication

1177	4.4.  Growing ICN collections

1179	   A log file, for example, grows over time.  Instead of having to re-
1180	   FLIC the grown file it suffices to construct a new manifest with a
1181	   manifest pointer to the old root manifest plus the sequence of data
1182	   hash pointers for the new data (or additional sub-manifests if
1183	   necessary).  Note that this tree will not be skewed (anymore).

1185	   old data < -  -  -  mfst_old <-- h_old -  - mfst_new
1186	                                               /
1187	   new data1 <-- h_1 -  -  -  -  -  -  -  -  -/
1188	   new data2                                 /
1189	   ...                                      /
1190	   new dataN <-- h_N -  -  -  -  -  -  -  -/

1192	                      Figure 16: Growing A Collection

1194	4.5.  Re-publishing a FLIC under a new name

1196	   There are several usecases for republishing a collection under a new
1197	   namespace, or having one collection exist under several namespaces:

1199	   o  It can happen that a publisher's namespace is part of a service
1200	      provider's prefix.  When switching provider, the publisher may
1201	      want to republish the old data under a new name.

1203	   o  A publishes wishes to distribute its content to several caches and
1204	      would like a local result to appear.  For example, the publisher
1205	      /alpha wishes to place content at /beta and /gamma, but routing to
1206	      /alpha would not send a request to either of those sites.  Each of
1207	      /beta and /gamma could create a locally named and signed version
1208	      of the root manifest with appropriate keys (or delegate that to
1209	      /alpha) so the results are always local without having to change
1210	      the bulk of the maniest tree.

1212	   This can easily be achieved with a single nameless root manifest for
1213	   the large FLIC plus arbitrarily many per-name manifests (which are
1214	   signed by whomever wants to publish this data):

1216	data < - nameless_mfst() <-- h  < - mfst(/com/example/east/the/flic)
1217	                                < - mfst(/com/example/west/old/the/flic)
1218	                                < - mfst(/internet/archive/flic234)

1220	                    Figure 17: Relocating A Collection

1222	   Note that the hash computation (of h) only requires reading the
1223	   nameless root manifest, not the entire FLIC.

1225	   This example points out the problem of HashGroups having locator
1226	   metadata elements: A retriever would be urged to follow these hints
1227	   which are "hardcoded" deep inside the FLIC but might have become
1228	   outdated.  We therefore recommend to name FLIC manifests only at the
1229	   highest level (where these names have no locator function).  Child
1230	   nodes in a FLIC manifest should not be named as these names serve no
1231	   purpose except retrieving a sub-tree's manifest by name, if would be
1232	   required.

1234	5.  IANA Considerations

1236	   TODO Need IANA actions:

1238	   o  Create a registry for Manifest Data and Annotation TLVs

1240	   o  Register the SizeAnnotation TLV

1242	   Also TODO: If this document is submitted as an official RG draft,
1243	   this section must be updated to reflect the IANA registries described
1244	   in [RFC8609]

1246	6.  Security Considerations

1248	   TODO Need a discussion on:

1250	   o  signing and hash chaining security.

1252	   o  republishing under a new namespace.

1254	   o  encryption mechanisms.

1256	   o  encryption key distribution mechanisms.

1258	7.  Appendix A: Building Trees

1260	   This section describes one method to build trees.  It constructs a
1261	   pre-order tree in a single pass of the application data, going from
1262	   the tail to the beginning.  This allows us to work up the right side
1263	   of the tree in a single pass, then work down each left branch until
1264	   we exhaust the data.  Using the reverse-order traversal, we create
1265	   the right-most-child manifest, then its parent, then the indirect
1266	   pointers of that parent, then the parent's direct pointers, then the
1267	   parent of the parent (repeating).  This process uses recursion, as it
1268	   is the clearest way to show the code.  A more optimized approach
1269	   could do it in a true single pass.

1271	   Because we're building from the bottom up, we use the term 'level' to
1272	   be the distance from the right-most child up.  Level 0 is the bottom-
1273	   most level of the tree, such as where node 7 is:

1275	         1
1276	     2      3
1277	   4  5    6  7
1278	   preorder: 1 2 4 5 3 6 7
1279	   reverse:  7 6 3 5 4 2 1

1281	   The Python-like pseudocode build_tree(data, n, k, m) algorithm
1282	   creates a tree of n data objects.  The data[] array is an array of
1283	   Content Objects that hold application payload; the application data
1284	   has already been packetized into n Content Object packets.  An
1285	   interior manifest node has k direct pointers and m indirect pointers.

1287	build_tree(data[0..n-1], n, k, m)
1288	    # data is an array of Content Objects (Data in NDN) with application payload.
1289	    # n is the number of data items
1290	    # k is the number of direct pointers per internal node
1291	    # m is the number of indirect pointers per internal node

1293	    segment = namedtuple('Segment', 'head tail')(0, n)
1294	    level = 0

1296	    # This bootstraps the process by creating the right most child manifest
1297	    # A leaf manifest has no indirect pointers, so k+m are direct pointers
1298	    root = leaf_manifest(data, segment, k + m)

1300	    # Keep building subtrees until we're out of direct pointers
1301	    while not segment.empty():
1302	        level += 1
1303	        root = bottom_up_preorder(data, segment, level, k, m, root)

1305	    return root

1307	bottom_up_preorder(data, segment, level, k, m, right_most_child=None)
1308	    manifest = None
1309	    if level == 0:
1310	        assert right_most_child is None
1311	        # build a leaf manifest with only direct pointers
1312	        manifest = leaf_manifest(data, segment, k + m)
1313	    else:
1314	        # If the number of remaining direct pointers will fit in a leaf node, make one of those.
1315	        # Otherwise, we need to be an interior node
1316	        if right_most_child is None and segment.length() <= k + m:
1317	            manifest = leaf_manifest(data, segment, k+m)
1318	        else:
1319	            manifest = interior_manifest(data, segment, level, k, m, right_most_child)
1320	    return manifest

1322	   leaf_manifest(data, segment, count)
1323	       # At most count items, but never go before the head
1324	       start = max(segment.head(), segment.tail() - count)
1325	       manifest = Manifest(data[start:segment.tail])
1326	       segment.tail -= segment.tail() - start
1327	       return manifest

1329	   interior_manifest(data, segment, level, k, m, right_most_child)
1330	       children = []
1331	       if right_most_child is not None:
1332	           children.append(right_most_child)

1334	       interior_indirect(data, segment, level, k, m, children)
1335	       interior_direct(data, segment, level, k, m, children)

1337	       manifest = Manifest(children)
1338	       return manifest, tail

1340	interior_indirect(data, segment, level, k, m, children)
1341	    # Reserve space at the head of the segment for this node's direct pointers before
1342	    # descending to children.  We want the top of the tree packed.
1343	    reserve_count = min(m, segment.tail - segment.head)
1344	    segment.head += reserve_count

1346	    while len(children) < m and not segment.head == segment.tail:
1347	        child = bottom_up_preorder(data, segment, level - 1, k, m)
1348	        # prepend
1349	        children.insert(0, child)

1351	    # Pull back our reservation and put those pointers in our direct children
1352	    segment.head -= reserve_count

1354	   interior_direct(data, segment, level, k, m, children)
1355	       while len(children) < k+m and not segment.head == segment.tail:
1356	           pointer = data[segment.tail() - 1]
1357	           children.insert(0, pointer)
1358	           segment.tail -= 1

1360	8.  References

1362	8.1.  Normative References

1364	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1365	              Requirement Levels", BCP 14, RFC 2119,
1366	              DOI 10.17487/RFC2119, March 1997,
1367	              .

1369	8.2.  Informative References

1371	   [FLICImplementation]
1372	              Mosko, M., "FLIC Implementation in Python", various,
1373	              .

1375	   [I-D.irtf-icnrg-terminology]
1376	              Wissingh, B., Wood, C., Afanasyev, A., Zhang, L., Oran,
1377	              D., and C. Tschudin, "Information-Centric Networking
1378	              (ICN): CCNx and NDN Terminology", draft-irtf-icnrg-
1379	              terminology-07 (work in progress), November 2019.

1381	   [NDN]      "Named Data Networking", various,
1382	              .

1384	   [NDNTLV]   "NDN Packet Format Specification.", 2016,
1385	              .

1387	   [repository]
1388	              "Repo Protocol Specification", Various,
1389	              .

1392	   [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
1393	              Text on Security Considerations", BCP 72, RFC 3552,
1394	              DOI 10.17487/RFC3552, July 2003,
1395	              .

1397	   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
1398	              IANA Considerations Section in RFCs", RFC 5226,
1399	              DOI 10.17487/RFC5226, May 2008,
1400	              .

1402	   [RFC8569]  Mosko, M., Solis, I., and C. Wood, "Content-Centric
1403	              Networking (CCNx) Semantics", RFC 8569,
1404	              DOI 10.17487/RFC8569, July 2019,
1405	              .

1407	   [RFC8609]  Mosko, M., Solis, I., and C. Wood, "Content-Centric
1408	              Networking (CCNx) Messages in TLV Format", RFC 8609,
1409	              DOI 10.17487/RFC8609, July 2019,
1410	              .

1412	Authors' Addresses
1413	   Christian Tschudin
1414	   University of Basel

1416	   Email: christian.tschudin@unibas.ch

1418	   Christopher A. Wood
1419	   University of California Irvine

1421	   Email: woodc1@uci.edu

1423	   Marc Mosko
1424	   PARC, Inc.

1426	   Email: marc.mosko@parc.com

1428	   David Oran (editor)
1429	   Network Systems Research & Design

1431	   Email: daveoran@orandom.net