Uniform Resource Names (urnbis)                             J.C. Klensin
Internet-Draft                                             April 7, 2014
Updates: 3986 (if approved)
Intended status: Standards Track
Expires: October 07, 2014

             Names are Not Locators and URNs are Not URIs
              draft-ietf-urnbis-urns-are-not-uris-00.txt

Abstract

   Experience has shown that identifiers associated with persistent
   names are quite different from identifiers associated with the
   locations of objects. This is especially true when such names are
   expected to be stable for a very long time or when they identify
   large and complex entities. In order to allow Uniform Resource Names
   (URNs) to evolve to meet the needs of the Informational Sciences
   community and other users, this specification separates the syntax
   for URNs from the generic syntax for Uniform Resource Identifiers
   (URIs) specified in RFC 3986, updating the latter specification
   accordingly.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 07, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  A perspective on locations and names
   3.  Changes to RFC 3986
   4.  Other Required Actions
   5.  Acknowledgments
   6.  Contributors
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address

1.  Introduction

   The Internet community now has many years of experience with both
   name-type identifiers (notably Uniform Resource Names (URNs)
   [RFC2141] [RFC2141bis]) and location-based identifiers (notably
   Uniform Resource Locators (URLs) [RFC1738]). That experience leads
   to the conclusion that it is impractical to constrain URNs to the
   syntax and high-level semantics of URLs. Generalization from URLs to
   generic Uniform Resource Identifiers (URIs) [RFC3986], especially to
   name-based, high-stability, long-persistence identifiers of the URN
   variety, has failed because the assumed similarities do not exist to
   a sufficient degree. Ultimately, locators, which typically depend on
   particular accessing protocols and a specification relative to some
   physical space or network topology, are simply different creatures
   from long-persistence, location-independent object identifiers. The
   syntax and semantic constraints that are appropriate for locators are
   either irrelevant to or interfere with the needs of resource names as
   a class. That was tolerable as long as the URN system didn't need
   additional capabilities, but experience since RFC 2141 was published
   has shown that they are, in fact, needed.

   This specification updates the Generic URI Syntax specification
   [RFC3986] to exclude URNs from its coverage. Put differently, with
   the publication of this specification, URNs are no longer considered
   a member of the class of URIs to which RFC 3986 applies.

   [[Note in draft: the above leaves it ambiguous as to whether it
   remains appropriate to call URNs "URIs".
   That ambiguity is intentional and, if possible, the question should
   remain in the "someone else's problem" category.]]

   For URLs and such other URIs as may exist or be created in the
   future, this specification does not change the syntax rules and other
   requirements and recommendations of RFC 3986.

2.  A perspective on locations and names

   Content industries (e.g., publishers) and memory organizations (e.g.,
   libraries, archives, and museums) invest substantial resources in
   naming things, and the topics of naming and classification are
   important information science issues. Tens, if not hundreds, of
   millions of persistent identifiers have been assigned during the last
   decade.

   Several identifier systems have been developed for persistent and
   unique identification of resources. When there is a real need to
   preserve something important (such as scientific publications,
   research data, or government publications) for the long term, URNs
   or other persistent identifiers are used; URLs (or other generic
   URIs) are not being used for identification or even linking purposes.

   Naming and locating resources (library resources, for example) are
   both complex activities, and they have different aims.
   Traditionally, naming and locating resources have been separate
   activities, and the rules for the former are much more stringent than
   for the latter. The same principles are being applied to digital
   materials as well as more traditional ones. In a library, any book,
   be it printed or digital, has both a unique and persistent
   International Standard Book Number (ISBN) and location information
   that is non-unique (each copy has its own), short-lived, and not to
   be trusted in the long run. An ISBN never changes, but both shelf
   locations and Web addresses usually do, many times during the book's
   life span.
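
   The distinction drawn above can be sketched as a small data model:
   one persistent name, kept apart from a changeable set of locations
   for the individual copies. This is an illustrative sketch only and
   is not part of any URN specification. The class NamedResource, its
   relocate method, the ISBN, and the example URLs are all hypothetical
   and were chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NamedResource:
    """A persistent name plus the changeable places where copies live."""
    name: str  # hypothetical URN built from an ISBN
    locations: set[str] = field(default_factory=set)

    def relocate(self, old: str, new: str) -> None:
        """Swap one location for another; the name is never touched."""
        self.locations.discard(old)
        self.locations.add(new)


# Made-up ISBN and URLs, used only for illustration.
book = NamedResource("urn:isbn:9780000000002")
book.locations.add("http://library.example/shelf/12/copy-1")
book.locations.add("http://mirror.example/ebooks/4711.pdf")

name_before = book.name

# A copy moves, as Web addresses usually do over a book's life span.
book.relocate("http://mirror.example/ebooks/4711.pdf",
              "http://mirror.example/archive/2014/4711.pdf")
```

   After the relocation the set of locations has changed, while
   book.name still equals name_before. Under the assumptions of this
   sketch, that stability is what the name contributes and what a
   location cannot.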

   Giving location information a role in identification would not only
   force libraries to adopt different policies for printed and digital
   content, it would also undermine the value of existing identifier
   systems. Let us assume that ten people independently upload a copy
   of an electronic book to different locations on the Web. Are all
   ten of these URLs valid identifiers of the book? And what is their
   relation to the ISBN or other identification information of the book
   such as its title?

   From the perspective of the communities who depend on persistent
   identifiers, critical issues include:

   1.  Resource identification has to be a managed process. Assigning
       URIs generally is not. Although it may be possible to introduce
       some level of control to URI assignment, a user cannot determine
       whether some URI is reliable or not.

   2.  Anyone may assign new URIs to resources even if these resources
       already have proper identifiers assigned to them. Claiming that
       these URIs actually identify something undermines the value of
       proper identifiers.

   3.  There is no 1:1 relation between the resource identified and
       URIs. An e-book on the Web may be represented as one or more
       files (URIs), and a single file may contain several books. And
       books are simple; we also need to name very complex objects such
       as research data sets, or component parts within these complex
       data sets.

   4.  One resource such as a scientific article is typically available
       from multiple locations, including (for instance) the publisher's
       document supply service, a university's open repositories and
       other cooperative repository systems, legal deposit collections,
       and the Internet Archive. A resource should have one and only
       one identifier of a given type; URIs do not meet this
       requirement.

   5.  URIs relate to instances (copies) of resources, whereas
       traditionally identification has a much broader scope.
       Identifiers may be assigned to, e.g., an immaterial work (such as
       Hamlet), its expressions (e.g., a Finnish translation of Hamlet),
       and manifestations of works and expressions (e.g., a PDF version
       of a Finnish translation of Hamlet).

   6.  Over time, different resources (or different versions of the same
       resource) may be found at the same non-URN URI. A user has no
       way of knowing whether the resource has changed. One of the
       basic principles for proper identifier systems is that the same
       identifier is never assigned to another resource. In general,
       URIs do not meet this requirement.

   7.  Persistent identification must be available for resources which
       are available only in databases and other environments that are
       often identified today as the "deep web". URIs for these
       resources tend to be very complicated, and it will be difficult
       to keep them alive even with the help of DNS redirection when,
       e.g., the underlying database management system changes.

   8.  The role that URI fragment and query components could or should
       have in identification is unclear, and the statements in RFC 3986
       are definitely problematic from the points of view of existing
       identifier systems and management of naming.

       Does a fragment identify a location or a certain section of a
       resource? In the evolving set of URN Internet standards, the
       fragment will not be a part of the Namespace Specific String.
       A fragment then only indicates a place or segment within the
       identified resource, but does not identify it. If the fragment
       had a role in identification, fragments would extend the scope
       of existing standard identifiers to component parts of
       resources. For instance, anyone could use a URN based on an ISBN
       plus a fragment to identify chapters of electronic books.

       Things get even more complicated with the query, since what an
       identifier plus a query resolves to may not have anything to do
       with the original resource.
       For instance, a URN based on an ISBN plus a query may resolve to
       the metadata record describing the book. These records have
       their own identifiers, which are not based on ISBNs.

   [[Note in draft: Most of the discussion above may belong in 2141bis
   rather than here.]]

   9.  For many organizations, persistence means decades or centuries.
       Anything that is protocol-dependent will eventually fail. URLs
       do not change by themselves, but in the long run it is very
       difficult for people to not change them or the objects to which
       they point.

   The mention of centuries is intentional. Content industries, memory
   organizations (such as national and repository libraries and national
   archives), and universities and other research organizations need
   identifiers that will persist for hundreds of years. Such
   identifiers might even need to outlast the institutions themselves,
   and definitely should be usable even if current technologies such as
   the Web and the Internet cease to exist or are supplanted by
   something new (as unlikely as that might seem today).

   In addition, operations on, or additional specifications about, names
   and the associated objects must be possible, as stable as the names
   themselves, and reasonably efficient. For example, if a URN were
   assigned to an encyclopedia that consisted of many volumes, it should
   be feasible to identify (and locate and retrieve if that were
   desired) a particular volume or even a particular article without
   accessing or retrieving the entire set.

3.  Changes to RFC 3986

   This specification removes URNs from the scope of RFC 3986. It makes
   no changes for URI types that remain within that scope.

4.  Other Required Actions

   The basic URN syntax specification [RFC2141] was published well
   before RFC 3986 and therefore does not depend on it.
   Successors to that specification will need to fully spell out the
   syntax and semantics of URNs without generic or implicit reference to
   any URI specification.

5.  Acknowledgments

   This specification was inspired by a search in the IETF URNBIS WG for
   other alternatives that would both satisfy the needs of persistent
   name-type identifiers and still fully conform to the specifications
   and intent of RFC 3986. That search lasted several years and
   considered many alternatives. Discussions with Leslie Daigle, Juha
   Hakala, Barry Leiba, Keith Moore, Andrew Newton, and Peter
   Saint-Andre during the last quarter of 2013 and the first quarter of
   2014 were particularly helpful in getting to the conclusion that a
   conceptual separation of notions of location-based identifiers (e.g.,
   URLs) and the types of persistent identifiers represented by URNs was
   necessary. Peter Saint-Andre provided significant text in a
   pre-publication review.

6.  Contributors

   Juha Hakala contributed most of the text of Section 2.

   Contact Information:
      Juha Hakala
      The National Library of Finland
      P.O. Box 15, Helsinki University
      Helsinki, MA FIN-00014
      Finland
      Email: juha.hakala@helsinki.fi

7.  IANA Considerations

   [[RFC Editor: Please remove this section before publication.]]

   This memo is not believed to require any action on IANA's part. In
   particular, we note that there is a collection of "Uniform Resource
   Identifier (URI) Schemes" that does not include URNs and a series of
   URN-specific registries that do not rely on the URI specifications.

8.  Security Considerations

   All drafts are required to have a security considerations section.

9.  References

9.1.  Normative References

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R. and L.
              Masinter, "Uniform Resource Identifier (URI): Generic
              Syntax", STD 66, RFC 3986, January 2005.

9.2.  Informative References

   [DeterministicURI]
              Mazahir, O., Thaler, D. and G. Montenegro, "Deterministic
              URI Encoding", February 2014.

   [RFC1738]  Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform
              Resource Locators (URL)", RFC 1738, December 1994.

   [RFC2141bis]
              Saint-Andre, P., "Uniform Resource Name (URN) Syntax",
              January 2014.

Author's Address

   John C Klensin
   1770 Massachusetts Ave, Ste 322
   Cambridge, MA 02140
   USA

   Phone: +1 617 245 1457
   Email: john-ietf@jck.com