Individual Submission                                       C. Dannewitz
Internet-Draft                                   University of Paderborn
Intended status: Informational                                 T. Rautio
Expires: January 6, 2011                VTT Technical Research Centre of
                                                                 Finland
                                                           O. Strandberg
                                                  Nokia Siemens Networks
                                                            July 5, 2010

        Secure naming structure and p2p application interaction
                 draft-dannewitz-ppsp-secure-naming-00

Abstract

   Many P2P applications identify and address data in their own way,
   relying on host-centric addressing.  This limits access to the same
   data at potentially multiple locations by multiple P2P applications.
   There are potential benefits in providing a generic way to identify
   and address data so that multiple P2P systems can use the same data
   regardless of data location.  The proposed secure naming structure
   provides a potential way to address these challenges with a common
   naming structure for all data and for different needs.  An additional
   feature of the proposal is that it secures the way data is addressed,
   so that the receiver can verify that the correct data has been
   received.  The secure naming structure should be beneficial as a
   potential design principle in defining the two protocols identified
   as objectives in the PPSP charter.  This document enumerates a number
   of design considerations intended to inform the design and
   implementation of the tracker-peer signaling and peer-peer streaming
   signaling protocols.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 6, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Naming requirements
   3.  Basic Concepts for an Application-independent P2P Naming Scheme
     3.1.  Overview
     3.2.  ID Structure
     3.3.  Security Metadata Structure
   4.  Application use of secure naming structure
   5.  Conclusion
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Informative References
   Authors' Addresses

1.  Introduction

   Today's dominant naming schemes in the Internet, i.e., IP addresses
   and URLs, are host-centric in that they are bound to a location.
   This kind of naming scheme is not suitable for P2P systems, which are
   based on information-centric thinking, i.e., they put the information
   at the focus, whereas the source of this information is constantly
   changing and might involve more than one source at once.

   Numerous P2P applications use their own data model and protocol for
   keeping track of data and locations.  This makes it difficult to use
   the same information in several applications.  A common naming
   scheme (e.g., a common data model) would be important to enable
   interconnectivity between different P2P systems.  To build a common
   P2P infrastructure that can serve a multitude of applications, a
   common application-independent naming scheme is needed.  With such a
   naming scheme, different applications can use and refer to the same
   information/data objects.

   False data can be introduced into P2P systems and is often detectable
   only when the content is played out in the user application.  Such
   false copies can be identified and discarded if the P2P system can
   verify the data received at the peer against the reference used in
   the tracker protocol.  One option is to secure the naming structure,
   i.e., to make the data reference dependent on the data and related
   metadata.

   For any type of caching solution (network-based or P2P) and for
   network-based storage, e.g., DECADE, a common application-independent
   naming scheme is essential to be able to identify cached copies of
   information/data objects.

   This document explains why a naming structure for information/data
   objects should be part of a protocol specification for PPSP.  The
   main advantage is probably in the definition of a protocol for
   signaling and control between trackers and peers (the PPSP "tracker
   protocol"), but a signaling and control protocol for communication
   among the peers (the PPSP "peer protocol") might also benefit from a
   common and secure naming scheme.

2.  Naming requirements

   In the following, we discuss the requirements that a common naming
   scheme for P2P systems has to fulfill.

   To enable efficient, large scale data dissemination that can make use
   of any available data copy, identifiers (IDs) in P2P systems have to
   be location-independent.  Thereby, identical data can be identified
   by the same ID independently of its storage location and improved
   data dissemination can then benefit from all available copies.  This
   should be possible without compromising trust in data regardless of
   its network source.

   Security in a P2P system needs to be implemented differently than in
   host-centric networks.  In the latter, most security mechanisms are
   based on host authentication and then trusting the data that the host
   delivers.  In a P2P system, host authentication cannot be relied
   upon, or one of the main advantages of a P2P system, i.e., benefiting
   from any available copy, is defeated.  Host authentication of a
   random, untrusted host that happens to have a copy does not establish
   the needed trust.  Instead, the security has to be directly attached
   to the data which can be done via the scheme used to name the data.

   Therefore, _self-certification_ is a main requirement for the naming
   scheme.  Self-certification ensures the integrity of data and
   securely binds this data to its ID.  More precisely, this property
   means that any unauthorized change of data with a given ID is
   detectable without requiring a third party for verification.
   Beforehand, secure retrieval of IDs (e.g., via search, embedded in a
   Web page as link, etc.) is required to ensure that the user has the
   right ID in the first place.  Secure ID retrieval can be achieved by
   using recommendations, past experience, and specialized ID
   authentication services and mechanisms that are out of the scope of
   this discussion.

   Another important requirement is _name persistence_, not only with
   respect to storage location changes as discussed above, but also with
   respect to changes of owner and/or owner's organizational structure,
   and content changes producing a new version of the information.
   Information should always be identifiable with the same ID as long as
   it remains _essentially equivalent_.  The spread of persistent naming
   schemes like the Digital Object Identifier (DOI) [Paskin2010] also
   underlines the need for a persistent naming scheme.  However, name
   persistence and self-certification are partly contradictory, and
   achieving both simultaneously for _dynamic_ content is not trivial.

   From a user's perspective, persistent IDs ensure that links and
   bookmarks remain valid as long as the respective information exists
   somewhere in the network, reducing today's problem of "404 - file not
   found" errors triggered by renamed or moved content.  From a content
   provider's perspective, name persistence simplifies data management
   as content can, e.g., be moved between folders and different servers
   as desired.  Name persistence with respect to content changes makes
   it possible to identify different versions of the same information by
   the same consistent ID.  If it is important to differentiate between
   multiple versions, a dedicated versioning mechanism is required, and
   version numbers may be included as a special part of the ID.

   The requirement of building trust in a P2P system combined with the
   desire for anonymous publication as well as accountability (at least
   for some content) can be translated into two related naming
   requirements.  The first is _owner authentication_, where the owner
   is recognized as the same entity, which repeatedly acts as the object
   owner, but may remain _anonymous_.  The second is _owner
   identification_, where the owner is also identified by a physically
   verifiable identifier, such as a personal name.  This separation is
   important to allow for anonymous publication of content, e.g., to
   support free speech, while at the same time building up trust in a
   (potentially anonymous) owner.

   In general, the naming scheme should be able to adapt to future
   needs.  Therefore, the naming scheme should be extensible, i.e., it
   should be possible to add new information (e.g., a chunk number for
   BitTorrent-like protocols) to the naming scheme.  The need for such
   extensions is underlined by today's variety of naming schemes (e.g.,
   DOI or PermaLink) added on top of the original Internet architecture
   to fulfill specialized needs that cannot be met by the common
   Internet naming schemes, i.e., IP addresses and URLs.

3.  Basic Concepts for an Application-independent P2P Naming Scheme

   In this section, we introduce an exemplary naming scheme that
   illustrates a possible way to fulfill the requirements placed upon an
   application-independent naming scheme for P2P networks.  The naming
   scheme integrates security deeply into the system architecture.
   Trust is based on the data's ID in combination with additional
   _security metadata_.  Section 3.1 gives an overview of the naming
   scheme, Section 3.2 describes the ID structure, and Section 3.3
   describes the security metadata in more detail.

3.1.  Overview

   Building on an identifier/locator split, each data element, e.g.,
   file, is given a unique ID with cryptographic properties.  Together
   with the additional security metadata, the ID can be used to verify
   data integrity, owner authentication, and owner identification.  The
   security metadata contains information needed for the security
   functions of the naming scheme, e.g., public keys, content hashes,
   certificates, and a data signature authenticating the content.  In
   comparison with the security model in today's host-centric networks,
   this approach minimizes the need for trust in the infrastructure,
   especially in the host(s) providing the data.

   In a P2P network, multiple copies of the same data element typically
   exist at different locations.  Thanks to the ID/locator split and the
   application-independent naming scheme, those identical copies have
   the same ID and, hence, each P2P application can benefit from all
   available copies.

   Data elements are manipulated (e.g., generated, modified, registered,
   and retrieved) by physical entities such as nodes (clients or hosts),
   persons, and companies.  Physical entities capable of generating,
   i.e., creating or modifying, data elements are called _owners_ here.
   Several security properties of this naming scheme are based on the
   fact that each ID contains the hash of a public key that is part of a
   public/secret key pair PK/SK.  This PK/SK pair is conceptually bound
   to the data element itself and not directly to the owner as in other
   systems like DONA [Koponen].  If desired, the PK/SK pair can be bound
   to the owner only _indirectly_, via a certificate chain.  This is
   important to note because it enables owner change while keeping
   persistent IDs.  The key pair bound to the _d_ata is thus denoted as
   PK_D/SK_D.

   Making the (hash of the) public key part of the ID enables
   self-certification of _dynamic_ content while keeping persistent IDs.
   Self-certification of _static_ content can be achieved by simply
   including the hash of the content in the ID, but this would obviously
   result in non-persistent IDs for dynamic content.  For dynamic
   content, the public key in the ID can be used to securely bind the
   hash of the content to the ID, by signing it with the corresponding
   secret key, while not making it part of the ID.
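
   The difference between the two approaches can be sketched in a few
   lines of Python.  This is only an illustration: the function names,
   the use of SHA-256, and the placeholder key bytes are assumptions made
   here and are not defined by this document.

```python
import hashlib

def static_id(content: bytes) -> str:
    """ID for static content: derived from the content itself."""
    return hashlib.sha256(content).hexdigest()

def dynamic_id(pk_d: bytes) -> str:
    """ID for dynamic content: derived from the data's public key PK_D."""
    return hashlib.sha256(pk_d).hexdigest()

pk_d = b"placeholder-public-key-bytes"   # hypothetical, not a real key
version_1 = b"report, first version"
version_2 = b"report, second version"

# A content-derived ID changes with every edit, so it is not persistent.
ids_static = (static_id(version_1), static_id(version_2))

# A key-derived ID does not depend on the content at all; the content
# hash is instead bound to the ID by a signature made with SK_D.
ids_dynamic = (dynamic_id(pk_d), dynamic_id(pk_d))
```

   In this sketch, the two entries of ids_static differ, whereas the two
   entries of ids_dynamic are identical, which is the persistence
   property discussed above.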

   The owner's PK as part of the ID inherently provides _owner
   authentication_.  If the public key is bound to the owner's identity
   (i.e., to its real-world name) via a trusted third party certificate,
   this also allows _owner identification_.  Without this additional
   certificate, the owner can remain anonymous.

   To support the potentially diverse requirements of certain groups of
   P2P applications and adapt to future changes, the naming scheme can
   enable flexibility and extensibility by supporting different name
   structures, differentiated via a _Type field_ in the ID.

3.2.  ID Structure

   The naming scheme uses flat IDs to support self-certification and
   name persistence.  In addition, flat IDs are advantageous when it
   comes to mobility, and they can be allocated without an
   administrative authority by relying on statistical uniqueness in a
   large namespace, with the rare case of ID collisions being handled by
   the P2P system.  Although IDs are not hierarchical, they have a
   specified basic structure, ID = (Type field | A = Hash(PK_D) | L),
   which is described below.

   The _Authenticator_ field A=Hash(PK_D) binds the ID to a public key
   PK_D.  The hash function _Hash_ is a cryptographic hash function,
   which is required to be one-way and collision-resistant.  The hash
   function serves only to reduce the bit length of PK_D.  PK_D is
   generated in accordance with a chosen public-key cryptosystem.  The
   corresponding secret key SK_D should only be known to a legitimate
   owner.  In consequence, an owner of the data is defined as any entity
   who (legitimately) knows SK_D.

   The pair (A, L) has to be globally unique.  Hence, the _Label_ field
   L provides global uniqueness if PK_D is repeatedly used for different
   data.

   To build a flexible and extensible naming scheme, e.g., to adapt the
   naming scheme to future changes, different types of IDs are supported
   by the naming scheme and differentiated via a mandatory and globally
   standardized _Type field_ in each ID.  For example, the Type field
   specifies the hash functions used to generate the ID.  If a used hash
   function becomes insecure, the Type field can be exploited by the P2P
   system in order to automatically mark the IDs using this hash
   function as invalid.
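
   As a rough illustration of the structure ID = (Type field | A | L),
   the following Python sketch builds, encodes, and checks such IDs.
   The one-byte Type field, the type code values, the byte layout, and
   all function names are hypothetical choices made for this example;
   this document does not specify an encoding.

```python
import hashlib
from typing import NamedTuple

# Hypothetical Type-field registry: one byte selects the hash function.
HASH_BY_TYPE = {0x01: "sha1", 0x02: "sha256"}
# Types whose hash function is assumed to be considered insecure.
DEPRECATED_TYPES = {0x01}

class ObjectId(NamedTuple):
    type_field: int       # selects the hash function used for A
    authenticator: bytes  # A = Hash(PK_D)
    label: bytes          # L, distinguishes objects sharing one PK_D

def make_id(type_field: int, pk_d: bytes, label: bytes) -> ObjectId:
    digest = hashlib.new(HASH_BY_TYPE[type_field], pk_d).digest()
    return ObjectId(type_field, digest, label)

def encode(oid: ObjectId) -> bytes:
    """Flat form: Type | A | L (the length of A is fixed by the type)."""
    return bytes([oid.type_field]) + oid.authenticator + oid.label

def decode(raw: bytes) -> ObjectId:
    type_field = raw[0]
    size = hashlib.new(HASH_BY_TYPE[type_field]).digest_size
    return ObjectId(type_field, raw[1:1 + size], raw[1 + size:])

def is_acceptable(oid: ObjectId) -> bool:
    """Reject IDs whose Type field names an unknown or deprecated hash."""
    return (oid.type_field in HASH_BY_TYPE
            and oid.type_field not in DEPRECATED_TYPES)

def key_matches(oid: ObjectId, pk_d: bytes) -> bool:
    """Check that a full public key hashes to the Authenticator field."""
    digest = hashlib.new(HASH_BY_TYPE[oid.type_field], pk_d).digest()
    return digest == oid.authenticator

pk = b"placeholder-public-key"          # stands in for a real PK_D
video_1 = make_id(0x02, pk, b"video-1")
video_2 = make_id(0x02, pk, b"video-2")  # same key, different Label
```

   Here video_1 and video_2 share the Authenticator but remain distinct
   IDs because their Labels differ, and an ID created with the
   deprecated type 0x01 would be rejected by is_acceptable().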

3.3.  Security Metadata Structure

   The security metadata is extensible and contains all information
   required to perform the security functions embedded in the naming
   scheme.  The metadata (or selected parts of it) will be signed by
   SK_D corresponding to PK_D.  This securely binds the metadata to the
   ID, i.e., to the Hash(PK_D) which is part of the ID.  For example,
   the security metadata may include:

   o  specification of the hash function _h_ and the algorithm _DSAlg_
      used for the digital signature

   o  complete PK_D (not only Hash(PK_D))

   o  specification of the parts of data that are self-certified, i.e.,
      authenticated via the signature

   o  hash of the self-certified data

   o  signature of the self-certified data signed by SK_D

   o  all data required for owner authentication and identification
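
   The verification that a receiver could perform with such metadata is
   sketched below in Python (3.8 or later).  Several things are
   assumptions made only for this example: the field and function names,
   the encoding of the public key, the choice to sign only the content
   hash, and in particular the signature algorithm, which is a toy
   textbook RSA over two well-known Mersenne primes.  The toy RSA is
   insecure and merely keeps the sketch self-contained; a real
   deployment would use a vetted signature scheme.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple

E = 65537  # public exponent for the toy keys

def toy_keypair(p: int, q: int):
    """Textbook RSA from two known primes. INSECURE, illustration only."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))
    return (n, E), (n, d)              # (PK_D, SK_D)

def _as_int(digest: bytes, n: int) -> int:
    return int.from_bytes(digest, "big") % n

def sign(digest: bytes, sk) -> int:
    n, d = sk
    return pow(_as_int(digest, n), d, n)

def check_signature(digest: bytes, signature: int, pk) -> bool:
    n, e = pk
    return pow(signature, e, n) == _as_int(digest, n)

@dataclass
class SecurityMetadata:
    hash_alg: str            # hash function h
    sig_alg: str             # signature algorithm DSAlg
    pk_d: Tuple[int, int]    # complete PK_D, not only its hash
    content_hash: bytes      # hash of the self-certified data
    signature: int           # signature over content_hash, made with SK_D

def authenticator(pk) -> bytes:
    """A = Hash(PK_D), using a hypothetical 'n:e' text encoding of the key."""
    return hashlib.sha256(f"{pk[0]}:{pk[1]}".encode()).digest()

def publish(data: bytes, pk, sk) -> SecurityMetadata:
    digest = hashlib.sha256(data).digest()
    return SecurityMetadata("sha256", "toy-rsa", pk, digest, sign(digest, sk))

def verify(auth: bytes, data: bytes, meta: SecurityMetadata) -> bool:
    if authenticator(meta.pk_d) != auth:        # key must match the ID
        return False
    if hashlib.new(meta.hash_alg, data).digest() != meta.content_hash:
        return False                            # data must match the hash
    return check_signature(meta.content_hash, meta.signature, meta.pk_d)

pk, sk = toy_keypair(2**127 - 1, 2**89 - 1)
object_auth = authenticator(pk)                 # Authenticator part of the ID
meta = publish(b"version 1", pk, sk)
meta2 = publish(b"version 2", pk, sk)           # new version, same ID

# An attacker with a different key pair cannot produce matching metadata.
pk_x, sk_x = toy_keypair(2**61 - 1, 2**107 - 1)
forged = publish(b"forged", pk_x, sk_x)
```

   With these values, verify(object_auth, b"version 1", meta) and
   verify(object_auth, b"version 2", meta2) succeed, while tampered data
   or the forged metadata fail, without any third party being involved.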

   A detailed description and security analysis of this naming scheme
   and its security properties, especially self-certification, name
   persistence, owner authentication, and owner identification can be
   found in Dannewitz et al. [Dannewitz_10].

4.  Application use of secure naming structure

   From an application perspective, the main advantage of a secure naming
   structure for a P2P infrastructure is that multiple applications can
   have common access to the same data elements.  Another benefit of
   application-independent naming is that locally available and cached
   copies can easily be located.  Secure naming also allows data to be
   verified even if it is received from an untrusted host.

   For example, when an application like BitTorrent [WWWbittorrent] uses
   self-certifying names, the user is guaranteed that the data received
   is actually the data that has been requested, without having to trust
   any servers in the network (e.g., the tracker) or the peers that
   provide the data.

   This means that BitTorrent's validation of the data integrity can be
   improved significantly using the presented secure naming structure.
   Currently, a standard BitTorrent system has no means to verify the
   integrity of the torrent file and consequently of the data.  The
   torrent file contains the SHA1 hashes of the content pieces.
   However, anyone can modify a torrent file to bind different content
   to this file.  If the torrent file gets modified, the user no longer
   has any means to verify the integrity of the data.  If, in addition,
   the tracker delivers forged data (consistent with the forged torrent
   file), a user could effectively be tricked into downloading forged
   content which would falsely be identified as being correct by the
   BitTorrent client.  That is, in the current BitTorrent system, a user
   has no guarantee that the downloaded content actually matches the
   expected/correct content.

   The secure naming structure presented in this draft can provide a
   simple solution for this problem by securely binding the content of
   the torrent file to the name/ID of the torrent file.  This can be
   done by extending the torrent file to include the security metadata
   described above.  In practice, an object owner would sign the hash
   values in the torrent file with the secret key (SK_D) and would store
   this signature, the public key (PK_D), and some additional security
   metadata in the torrent file during torrent file creation.  The
   respective torrent file ID would be generated according to the rules
   described in Section 3.  Consequently, whenever a user knows the ID
   of the content/torrent file and retrieves the torrent file, she/he
   can verify the integrity of the torrent file, download the data
   pieces, and use the included (and secured) hash(es) to verify the
   integrity of the received data.  As a result, the user can be sure
   that the correct content was retrieved.
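
   The last step, checking received pieces against the secured hashes,
   might look as follows in Python.  The sketch assumes that the list of
   piece hashes has already been authenticated against the torrent file
   ID as described above (that check is not repeated here); the function
   names and the tiny piece size are hypothetical.

```python
import hashlib
from typing import List

def piece_hashes(data: bytes, piece_size: int) -> List[bytes]:
    """SHA-1 hash of each fixed-size piece, as listed in a torrent file."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]

def verify_piece(index: int, piece: bytes, trusted: List[bytes]) -> bool:
    """Check one received piece against the authenticated hash list."""
    if not 0 <= index < len(trusted):
        return False
    return hashlib.sha1(piece).digest() == trusted[index]

content = b"example streaming content"
# Assumed to come from a torrent file whose signature was already verified.
trusted = piece_hashes(content, piece_size=8)

genuine_piece = content[8:16]       # piece 1 as sent by an honest peer
forged_piece = b"forged!!"          # piece 1 as sent by a malicious peer

genuine_ok = verify_piece(1, genuine_piece, trusted)
forged_ok = verify_piece(1, forged_piece, trusted)
```

   In this example genuine_ok is True and forged_ok is False, so a forged
   piece is rejected regardless of which peer or tracker supplied it.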

5.  Conclusion

   The secure naming structure is proposed for consideration as a common
   reference ID structure in the PPSP WG.  For P2P streaming applications
   to have fair access to a multitude of data, it is essential to have a
   common naming structure that is suitable for many different needs.
   The benefit of common naming is probably most evident in the tracker
   protocol, whereas the potential benefit for the actual streaming
   protocol still has to be identified.  The secure binding of the
   reference ID to the actual content enables the end-user peer to
   check that the data received matches the ID used.

   The naming structure has been implemented in the 4WARD project
   prototypes and has been released as open source (www.netinf.org).
   The naming structure is also available through a public NetInf
   registration service at www.netinf.org.  Three NetInf-enabled
   applications have also been published, the InFox (Firefox plugin),
   InBird (Thunderbird plugin), and a NetInf Information Object
   Management Tool, all available at the www.netinf.org site.

6.  IANA Considerations

   This document has no requests to IANA.

7.  Security Considerations

   The choice of public/secret key and hash algorithms has to be
   considered carefully when designing the naming structure in a secure
   way.

8.  Acknowledgements

   We would especially like to thank Borje Ohlman for excellent
   discussion and review of the draft.  Thanks also go to all persons
   participating in the Network of Information work package in the EU
   FP7 project 4WARD, the project SAIL, and the Finnish ICT SHOK Future
   Internet 2 project for contributions and feedback to this document.

9.  Informative References

   [Dannewitz_10]
              Dannewitz, C., Golic, J., Ohlman, B., and B. Ahlgren,
              "Secure Naming for a Network of Information", 13th IEEE
              Global Internet Symposium, 2010.

   [Koponen]  Koponen, T., Chawla, M., Chun, B., Ermolinskiy, A., Kim,
              K., Shenker, S., and I. Stoica, "A Data-Oriented (and
              beyond) Network Architecture", Proc. ACM SIGCOMM, 2007.

   [Paskin2010]
              Paskin, N., "Digital Object Identifier (DOI(R)) System",
              Encyclopedia of Library and Information Sciences, 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [WWWbittorrent]
              Cohen, B., "The BitTorrent Protocol Specification",
              http://www.bittorrent.org/beps/bep_0003.html, 2008.

Authors' Addresses

   Christian Dannewitz
   University of Paderborn
   Paderborn
   Germany

   Email: cdannewitz@upb.de

   Teemu Rautio
   VTT Technical Research Centre of Finland
   Oulu
   Finland

   Email: teemu.rautio@vtt.fi

   Ove Strandberg
   Nokia Siemens Networks
   Espoo
   Finland

   Email: ove.strandberg@nsn.com