idnits 2.17.1

draft-dannewitz-ppsp-secure-naming-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 2 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document doesn't use any RFC 2119 keywords, yet seems to have RFC
     2119 boilerplate text.

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (July 5, 2010) is 5044 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

Individual Submission                                       C. Dannewitz
Internet-Draft                                   University of Paderborn
Intended status: Informational                                 T. Rautio
Expires: January 6, 2011                VTT Technical Research Centre of
                                                                 Finland
                                                           O. Strandberg
                                                  Nokia Siemens Networks
                                                            July 5, 2010

        Secure naming structure and p2p application interaction
                 draft-dannewitz-ppsp-secure-naming-00

Abstract

   Many P2P applications identify and address data in their own way, relying on host-centric addressing. This limits the ability of multiple P2P applications to access the same data when it is available at multiple locations. There are potential benefits in providing a generic way to identify and address data so that multiple P2P systems can use the same data regardless of its location. The proposed secure naming structure is one potential way to address these challenges with a common naming structure for all data and for different needs. An additional feature of the proposal is that it secures the way data is addressed, so that the receiver can verify that the correct data has been received. The secure naming structure should be beneficial as a potential design principle in defining the two protocols identified as objectives in the PPSP charter. This document enumerates a number of design considerations intended to inform the design and implementation of the tracker-peer signaling and peer-peer streaming signaling protocols.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).
   Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 6, 2011.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1.  Introduction
   2.  Naming requirements
   3.  Basic Concepts for an Application-independent P2P Naming Scheme
     3.1.  Overview
     3.2.  ID Structure
     3.3.  Security Metadata Structure
   4.  Application use of secure naming structure
   5.  Conclusion
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Informative References
   Authors' Addresses

1.  Introduction

   Today's dominant naming schemes in the Internet, i.e., IP addresses and URLs, are host-centric in that they are bound to a location. This kind of naming scheme is not suitable for P2P systems, which are based on information-centric thinking: the information itself is the focus, whereas the source of this information changes constantly and might involve more than one source at once.

   Numerous P2P applications use their own data model and protocol for keeping track of data and locations. This makes it difficult for several applications to use the same information. A common naming scheme (e.g., a common data model) would be important to enable interconnectivity between different P2P systems. To build a common P2P infrastructure that can serve a multitude of applications, there is a need for a common application-independent naming scheme. With such a naming scheme, different applications can use and refer to the same information/data objects.

   It is possible to introduce false data into P2P systems that is detectable only when the content is played out in the user application. The false data copies can be identified and sorted out if the P2P system can verify the reference used in the tracker protocol against the data received at the peer. One option to address this is to secure the naming structure, i.e., to make the data reference dependent on the data and related metadata.

   For any type of caching solution (network-based or P2P) and for network-based storage, e.g., DECADE, a common application-independent naming scheme is essential to be able to identify cached copies of information/data objects.

   This document enumerates and explains the rationale for why a naming structure for information/data objects should be part of a protocol specification for PPSP. The main advantage probably lies in the definition of a protocol for signaling and control between trackers and peers (the PPSP "tracker protocol"), but a signaling and control protocol for communication among the peers (the PPSP "peer protocol") might also benefit from a common and secure naming scheme.

2.  Naming requirements

   In the following, we discuss the requirements that a common naming scheme for P2P systems has to fulfill.

   To enable efficient, large-scale data dissemination that can make use of any available data copy, identifiers (IDs) in P2P systems have to be location-independent. Identical data can then be identified by the same ID independently of its storage location, and improved data dissemination can benefit from all available copies. This should be possible without compromising trust in data regardless of its network source.

   Security in a P2P system needs to be implemented differently than in host-centric networks.
   In the latter, most security mechanisms are based on host authentication and then trusting the data that the host delivers. In a P2P system, host authentication cannot be relied upon, or one of the main advantages of a P2P system, i.e., benefiting from any available copy, is defeated. Host authentication of a random, untrusted host that happens to have a copy does not establish the needed trust. Instead, the security has to be directly attached to the data, which can be done via the scheme used to name the data.

   Therefore, _self-certification_ is a main requirement for the naming scheme. Self-certification ensures the integrity of data and securely binds this data to its ID. More precisely, this property means that any unauthorized change of data with a given ID is detectable without requiring a third party for verification. Beforehand, secure retrieval of IDs (e.g., via search, embedded in a Web page as a link, etc.) is required to ensure that the user has the right ID in the first place. Secure ID retrieval can be achieved by using recommendations, past experience, and specialized ID authentication services and mechanisms that are out of the scope of this discussion.

   Another important requirement is _name persistence_, not only with respect to storage location changes as discussed above, but also with respect to changes of owner and/or the owner's organizational structure, and content changes producing a new version of the information. Information should always be identifiable with the same ID as long as it remains _essentially equivalent_. The spread of persistent naming schemes such as the Digital Object Identifier (DOI) [Paskin2010] also underlines the need for a persistent naming scheme. However, name persistence and self-certification are partly contradictory, and achieving both simultaneously for _dynamic_ content is not trivial.

   From a user's perspective, persistent IDs ensure that links and bookmarks remain valid as long as the respective information exists somewhere in the network, reducing today's problem of "404 - file not found" errors triggered by renamed or moved content. From a content provider's perspective, name persistence simplifies data management, as content can, e.g., be moved between folders and different servers as desired. Name persistence with respect to content changes makes it possible to identify different versions of the same information by the same consistent ID. If it is important to differentiate between multiple versions, a dedicated versioning mechanism is required, and version numbers may be included as a special part of the ID.

   The requirement of building trust in a P2P system, combined with the desire for anonymous publication as well as accountability (at least for some content), can be translated into two related naming requirements. The first is _owner authentication_, where the owner is recognized as the same entity that repeatedly acts as the object owner, but may remain _anonymous_. The second is _owner identification_, where the owner is also identified by a physically verifiable identifier, such as a personal name. This separation is important to allow for anonymous publication of content, e.g., to support free speech, while at the same time building up trust in a (potentially anonymous) owner.

   In general, the naming scheme should be able to adapt to future needs. Therefore, the naming scheme should be extensible, i.e., it should be possible to add new information (e.g., a chunk number for BitTorrent-like protocols) to the naming scheme.
   The need for such extensions is underlined by today's variety of naming schemes (e.g., DOI or PermaLink) added on top of the original Internet architecture that fulfill specialized needs which cannot be met by the common Internet naming schemes, i.e., IP addresses and URLs.

3.  Basic Concepts for an Application-independent P2P Naming Scheme

   In this section, we introduce an exemplary naming scheme that illustrates a possible way to fulfill the requirements placed on an application-independent naming scheme for P2P networks. The naming scheme integrates security deeply into the system architecture. Trust is based on the data's ID in combination with additional _security metadata_. Section 3.1 gives an overview of the naming scheme, Section 3.2 describes the ID structure, and Section 3.3 describes the security metadata in more detail.

3.1.  Overview

   Building on an identifier/locator split, each data element, e.g., a file, is given a unique ID with cryptographic properties. Together with the additional security metadata, the ID can be used to verify data integrity, owner authentication, and owner identification. The security metadata contains information needed for the security functions of the naming scheme, e.g., public keys, content hashes, certificates, and a data signature authenticating the content. In comparison with the security model in today's host-centric networks, this approach minimizes the need for trust in the infrastructure, especially in the host(s) providing the data.

   In a P2P network, multiple copies of the same data element typically exist at different locations. Thanks to the ID/locator split and the application-independent naming scheme, those identical copies have the same ID and, hence, each P2P application can benefit from all available copies.

   Data elements are manipulated (e.g., generated, modified, registered, and retrieved) by physical entities such as nodes (clients or hosts), persons, and companies. Physical entities capable of generating, i.e., creating or modifying, data elements are called _owners_ here. Several security properties of this naming scheme are based on the fact that each ID contains the hash of a public key that is part of a public/secret key pair PK/SK. This PK/SK pair is conceptually bound to the data element itself and not directly to the owner as in other systems like DONA [Koponen]. If desired, the PK/SK pair can be bound to the owner only _indirectly_, via a certificate chain. This is important to note because it enables a change of owner while keeping persistent IDs. The key pair bound to the _d_ata is thus denoted as PK_D/SK_D.

   Making the (hash of the) public key part of the ID enables self-certification of _dynamic_ content while keeping persistent IDs. Self-certification of _static_ content can be achieved by simply including the hash of the content in the ID, but this would obviously result in non-persistent IDs for dynamic content. For dynamic content, the public key in the ID can be used to securely bind the hash of the content to the ID, by signing it with the corresponding secret key, while not making it part of the ID.

   The owner's PK as part of the ID inherently provides _owner authentication_. If the public key is bound to the owner's identity (i.e., to its real-world name) via a trusted third-party certificate, this also allows _owner identification_. Without this additional certificate, the owner can remain anonymous.
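
   The difference between content-hash IDs for static content and key-based IDs for dynamic content, as described above, can be illustrated with a minimal sketch. It is not part of the proposal: the choice of SHA-256, the function names, and the placeholder key bytes are assumptions made purely for illustration, and the signature that would bind the content hash to the ID is not shown here.

```python
import hashlib


def static_id(content: bytes) -> str:
    """ID for static content: simply the hash of the content itself."""
    return hashlib.sha256(content).hexdigest()


def key_based_id(public_key: bytes) -> str:
    """ID component for dynamic content: the hash of PK_D.

    It does not depend on the content, so it survives content changes.
    """
    return hashlib.sha256(public_key).hexdigest()


# Placeholder bytes standing in for an encoded public key PK_D (hypothetical).
pk_d = b"example-encoded-public-key"

v1 = b"first version of the document"
v2 = b"second version of the document"

# Content-hash IDs change with every edit, so they are not persistent.
static_ids_differ = static_id(v1) != static_id(v2)

# The key-based ID is computed from PK_D only, so both versions share it.
# The current content hash would be bound to this ID by a signature made
# with SK_D and carried in the security metadata.
key_id_v1 = key_based_id(pk_d)
key_id_v2 = key_based_id(pk_d)
```

   In this sketch the static ID differs between the two versions, while the key-based ID is the same for both, which is the persistence property the scheme aims for.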

   To support the potentially diverse requirements of certain groups of P2P applications and to adapt to future changes, the naming scheme can enable flexibility and extensibility by supporting different name structures, differentiated via a _Type field_ in the ID.

3.2.  ID Structure

   The naming scheme uses flat IDs to support self-certification and name persistence. In addition, flat IDs are advantageous when it comes to mobility, and they can be allocated without an administrative authority by relying on statistical uniqueness in a large namespace, with the rare case of ID collisions being handled by the P2P system. Although IDs are not hierarchical, they have a specified basic ID structure. The ID structure, given as ID = (Type field | A = Hash(PK_D) | L), is described below.

   The _Authenticator_ field A = Hash(PK_D) binds the ID to a public key PK_D. The hash function _Hash_ is a cryptographic hash function, which is required to be one-way and collision-resistant. The hash function serves only to reduce the bit length of PK_D. PK_D is generated in accordance with a chosen public-key cryptosystem. The corresponding secret key SK_D should only be known to a legitimate owner. In consequence, an owner of the data is defined as any entity who (legitimately) knows SK_D.

   The pair (A, L) has to be globally unique. Hence, the _Label_ field L provides global uniqueness if PK_D is repeatedly used for different data.

   To build a flexible and extensible naming scheme, e.g., to adapt the naming scheme to future changes, different types of IDs are supported by the naming scheme and differentiated via a mandatory and globally standardized _Type field_ in each ID. For example, the Type field specifies the hash functions used to generate the ID.
   If a hash function in use becomes insecure, the P2P system can use the Type field to automatically mark the IDs that use this hash function as invalid.

3.3.  Security Metadata Structure

   The security metadata is extensible and contains all information required to perform the security functions embedded in the naming scheme. The metadata (or selected parts of it) will be signed by SK_D corresponding to PK_D. This securely binds the metadata to the ID, i.e., to the Hash(PK_D) which is part of the ID. For example, the security metadata may include:

   o  specification of the hash function _h_ and the algorithm _DSAlg_ used for the digital signature

   o  complete PK_D (not only Hash(PK_D))

   o  specification of the parts of data that are self-certified, i.e., authenticated via the signature

   o  hash of the self-certified data

   o  signature of the self-certified data signed by SK_D

   o  all data required for owner authentication and identification

   A detailed description and security analysis of this naming scheme and its security properties, especially self-certification, name persistence, owner authentication, and owner identification, can be found in Dannewitz et al. [Dannewitz_10].

4.  Application use of secure naming structure

   From an application perspective, the main advantage of a secure naming structure for a P2P infrastructure is that multiple applications can have common access to the same data elements. Another benefit of application-independent naming is that locally available and cached copies can easily be located. Secure naming also allows data to be verified even if it is received from an untrusted host.
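
   How a receiver might check data from an untrusted host against an ID of the form (Type field | A = Hash(PK_D) | L) and the security metadata can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the draft does not prescribe a hash function or signature algorithm, so SHA-256 and a Lamport one-time signature are swapped in only because both can be written with the Python standard library. The names ObjectID, publish, and verify, the metadata keys, and the Type value are hypothetical. Lamport keys must sign only one message, so this stand-in would not suit content that is updated and re-signed.

```python
import hashlib
import secrets
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    """Stand-in for the scheme's hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


# --- Stand-in signature scheme: Lamport one-time signatures ---------------
def keygen():
    """Return (SK_D, PK_D): 256 pairs of secrets and their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(h(a), h(b)) for a, b in sk]
    return sk, pk


def bits(digest: bytes):
    """Expand a 32-byte digest into its 256 individual bits."""
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, message: bytes):
    """Reveal one secret per bit of the message hash."""
    return [sk[i][bit] for i, bit in enumerate(bits(h(message)))]


def verify_sig(pk, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(h(signature[i]) == pk[i][bit]
               for i, bit in enumerate(bits(h(message))))


def encode_pk(pk) -> bytes:
    """Serialize PK_D so that it can be hashed into the Authenticator."""
    return b"".join(a + b for a, b in pk)


# --- ID and security metadata ---------------------------------------------
@dataclass(frozen=True)
class ObjectID:
    type_field: str       # names the algorithms in use (hypothetical value)
    authenticator: bytes  # A = Hash(PK_D)
    label: str            # L, distinguishes objects that share PK_D


def publish(sk, pk, label: str, data: bytes):
    """Owner side: build the ID and sign the content hash with SK_D."""
    oid = ObjectID("sha256-lamport", h(encode_pk(pk)), label)
    content_hash = h(data)
    metadata = {"pk_d": pk,
                "content_hash": content_hash,
                "signature": sign(sk, content_hash)}
    return oid, metadata


def verify(oid: ObjectID, metadata: dict, data: bytes) -> bool:
    """Receiver side: trust only the ID, not the host that sent the data."""
    pk = metadata["pk_d"]
    if h(encode_pk(pk)) != oid.authenticator:       # PK_D matches the ID?
        return False
    if not verify_sig(pk, metadata["content_hash"], metadata["signature"]):
        return False                                # hash signed by SK_D?
    return h(data) == metadata["content_hash"]      # data matches the hash?


sk_d, pk_d = keygen()
oid, meta = publish(sk_d, pk_d, "video-42", b"original data")
genuine_ok = verify(oid, meta, b"original data")
tampered_ok = verify(oid, meta, b"forged data")
```

   The three checks mirror the metadata items listed in Section 3.3: the complete PK_D is compared with the Authenticator in the ID, the signature binds the content hash to that key, and the received data is compared with the signed hash. An attacker who replaces both the data and the metadata fails the first check, because a different key does not hash to the Authenticator in the ID.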

   For example, when an application like BitTorrent [WWWbittorrent] uses self-certifying names, the user is guaranteed that the data received is actually the data that has been requested, without having to trust any servers in the network (e.g., the tracker) or the peers that provide the data.

   This means that BitTorrent's validation of data integrity can be improved significantly using the presented secure naming structure. Currently, a standard BitTorrent system has no means to verify the integrity of the torrent file and consequently of the data. The torrent file contains the SHA-1 hashes of the content pieces. However, anyone can modify a torrent file to bind different content to this file. If the torrent file gets modified, the user no longer has any means to verify the integrity of the data. If, in addition, the tracker delivers forged data (consistent with the forged torrent file), a user could effectively be tricked into downloading forged content which would falsely be identified as being correct by the BitTorrent client. That is, in the current BitTorrent system, a user has no guarantee that the downloaded content actually matches the expected/correct content.

   The secure naming structure presented in this draft can provide a simple solution for this problem by securely binding the content of the torrent file to the name/ID of the torrent file. This can be done by extending the torrent file to include the security metadata information described above. In practice, an object owner would sign the hash values in the torrent file with the private key (SK_D) and would store this signature, the public key (PK_D), and some additional security metadata in the torrent file during torrent file creation. The respective torrent file ID would be generated according to the rules described in Section 3.
   Consequently, whenever a user knows the ID of the content/torrent file and retrieves the torrent file, she/he can now verify the integrity of the torrent file, can download the data pieces, and can use the included (and secured) hash(es) to verify the integrity of the received data. As a result, the user can be sure that the correct content was retrieved.

5.  Conclusion

   The secure naming structure is proposed for consideration as a common reference ID structure in the PPSP WG. For P2P streaming applications to have fair and broad access to data, it is essential to have a common naming structure that is suitable for many different needs. The benefit of common naming is probably most apparent in the tracker protocol; potential benefits for the actual streaming protocol still have to be identified. The secure binding of the reference ID to the actual content lets the end-user peer check that the data received is correct with respect to the ID used.

   The naming structure has been implemented in the 4WARD project prototypes and has been released as open source (www.netinf.org). The naming structure is also available through a public NetInf registration service at www.netinf.org. Three NetInf-enabled applications have also been published: InFox (a Firefox plugin), InBird (a Thunderbird plugin), and a NetInf Information Object Management Tool, all available at the www.netinf.org site.

6.  IANA Considerations

   This document has no requests to IANA.

7.  Security Considerations

   There are considerations about which private/public key and hash algorithms to use when designing the naming structure in a secure way.

8.  Acknowledgements

   We would especially like to thank Borje Ohlman for excellent discussion and review of the draft.
   Thanks also go to all persons participating in the Network of Information work package in the EU FP7 project 4WARD, the project SAIL, and the Finnish ICT SHOK Future Internet 2 project for contributions and feedback to this document.

9.  Informative References

   [Dannewitz_10]
              Dannewitz, C., Golic, J., Ohlman, B., and B. Ahlgren, "Secure Naming for a Network of Information", 13th IEEE Global Internet Symposium, 2010.

   [Koponen]  Koponen, T., Chawla, M., Chun, B., Ermolinskiy, A., Kim, K., Shenker, S., and I. Stoica, "A Data-Oriented (and beyond) Network Architecture", Proc. ACM SIGCOMM, 2007.

   [Paskin2010]
              Paskin, N., "Digital Object Identifier (DOI(R)) System", Encyclopedia of Library and Information Sciences, 2010.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [WWWbittorrent]
              Cohen, B., "The BitTorrent Protocol Specification", http://www.bittorrent.org/beps/bep_0003.html, 2008.

Authors' Addresses

   Christian Dannewitz
   University of Paderborn
   Paderborn
   Germany

   Email: cdannewitz@upb.de

   Teemu Rautio
   VTT Technical Research Centre of Finland
   Oulu
   Finland

   Email: teemu.rautio@vtt.fi

   Ove Strandberg
   Nokia Siemens Networks
   Espoo
   Finland

   Email: ove.strandberg@nsn.com