Uniform Resource Names (urnbis)                               J. Klensin
Internet-Draft
Updates: 3986 (if approved)                              August 25, 2014
Intended status: Standards Track
Expires: February 26, 2015


                      URN Semantics Clarification
               draft-ietf-urnbis-semantics-clarif-00.txt

Abstract

   Experience has shown that identifiers associated with persistent
   names have properties and requirements that may be somewhat different
   from identifiers associated with the locations of objects.  This is
   especially true when such names are expected to be stable for a very
   long time or when they identify large and complex entities.
   In order to allow Uniform Resource Names (URNs) to evolve to meet the
   needs of the Library, Museum, Publisher, and Information Sciences
   communities and other users, this specification separates URNs from
   the semantic constraints that many people believe are part of the
   specification for Uniform Resource Identifiers (URIs) in RFC 3986,
   updating that document accordingly.  The syntax of URNs is still
   constrained to that of RFC 3986, so generic URI parsers are
   unaffected by this change.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 26, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Pragmatic Goals
   3.  The role of queries and fragments in URNs
   4.  Changes to RFC 3986
   5.  Other Required Actions
   6.  Alternatives and comparison
     6.1.  Terminology and Information Location.
     6.2.  Comparison and "Part of the URN"
     6.3.  Applicability of components.
     6.4.  Internal syntax.
     6.5.  Extended, embedded, base, and derived URNs
   7.  Acknowledgments
   8.  Contributors
   9.  IANA Considerations
   10. Security Considerations
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Background on the URN - URI relationship
   Appendix B.  Three views of locator-identifier separation
     B.1.  A Perspective on Locations and Names
     B.2.  A More Pragmatic Perspective
     B.3.  A more radical (or most conservative) view of URNs and
           their role
   Appendix C.  Change Log
     C.1.  Changes from draft-ietf-urnbis-urns-are-not-uris-00 to -01
     C.2.  Changes from draft-ietf-urnbis-urns-are-not-uris-01 to
           draft-ietf-urnbis-semantics-clarif-00
   Author's Address

1.  Introduction

   The Generic URI Syntax specification [RFC3986] covers both locators
   and names and mixtures of the two (see its Section 1.1.3).  It
   describes Uniform Resource Locators (URLs) -- first documented in the
   IETF in RFC 1738 [RFC1738] -- as an embodiment of the locator concept
   and Uniform Resource Names (URNs), specifically those using the "urn"
   scheme [RFC2141], as an embodiment of names that do not directly
   provide for resource location.  This specification is concerned only
   with URNs of the variety described in RFC 2141 [RFC2141] (i.e.,
   those that use the "urn" scheme).  URLs, other types of names, and
   any URI types that may not fall into one of the above categories are
   out of its scope and unaffected by it.

   Experience with URNs since the publication of RFC 3986 has identified
   several ways in which their inclusion under its scope has hampered
   understanding, adoption, and especially extension in ways that were
   anticipated in RFC 2141.  The need for extensions to the URN concept
   is now being felt in some communities, especially those that include
   libraries, museums, publishers, and other information scientists.

   In particular, the Generic URI Syntax specification goes beyond
   syntax to specify the meaning and interpretation of various fields,
   especially the "query" and "fragment" ones.  This specification
   excludes URNs from those definitions of meaning and interpretation so
   that RFC 3986 applies to their syntax only.  The meaning of those
   fields for URNs, and any more specific syntax rules for them, are now
   defined in a URN-specific document [RFC2141bis].
   [[CREF1: Note forward pointer to a future version of 2141bis.]]
   URNs remain members of the URI family and parsers for generic URI
   syntax are not affected by this specification.
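
   As an informal illustration only, the syntax-only treatment described
   above might be sketched as follows.  The regular expression is the
   one given in Appendix B of RFC 3986.  The helper name "split_urn",
   the sample URN, and the decision to return the query and fragment as
   uninterpreted strings are assumptions made for this sketch; nothing
   here is defined by this specification or by RFC 2141.

```python
import re

# Regular expression from RFC 3986, Appendix B.  It splits any URI
# reference into scheme, authority, path, query, and fragment.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_urn(urn):
    """Split a "urn"-scheme URI using generic URI syntax only.

    Hypothetical helper for illustration.  The query and fragment are
    returned as opaque strings (or None when absent); assigning them a
    meaning is left to URN-specific specifications.
    """
    m = URI_RE.match(urn)  # every group is optional, so this always matches
    scheme, path = m.group(2), m.group(5)
    query, fragment = m.group(7), m.group(9)
    if scheme is None or scheme.lower() != "urn":
        raise ValueError("not a urn-scheme URI")
    # RFC 2141 form: the path is "<NID>:<NSS>".
    nid, sep, nss = path.partition(":")
    if not sep or not nid or not nss:
        raise ValueError("expected urn:<NID>:<NSS>")
    return {"nid": nid, "nss": nss, "query": query, "fragment": fragment}


# Sample input chosen for this sketch; the NSS value is made up.
print(split_urn("urn:example:a123?x=1#sec2"))
```

   The sketch deliberately stops at the component boundaries: a generic
   parser can find where the "?" and "#" strings begin and end without
   knowing what, if anything, they mean for a given URN namespace.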

   Portions of this document were inspired by discussions at the meeting
   of the WG during IETF 90 [IETF90-URNBISWG] and subsequent comments
   and clarifications on the mailing list [URNBIS-MailingList].  In its
   present form, it is intended primarily to focus WG discussions.

   This draft does not discuss issues about DDDS resolution or
   conversion to (and interpretation of) URCs or URN "resolution" more
   generally.  If any of those topics needs to be addressed, that should
   be done in other documents.  Because URCs (as specified in RFC 2483
   [RFC2483] or elsewhere) have not been significantly implemented or
   deployed, discussion of them is probably out of scope for the WG at
   this point.  The document also does not discuss alternatives to URNs,
   either those that might use a different scheme name within the RFC
   3986 URI framework or those that might use a different framework
   entirely.

2.  Pragmatic Goals

   Despite the important background and rationale in the sections that
   follow, the change made by this specification is driven by a desire
   to avoid philosophical debates about terminology or ultimate truths.
   Instead, it is motivated by three very pragmatic principles:

   1.  Try to accommodate all of those who think URNs are necessary,
       i.e., that they can and should be usefully distinguished in
       certain respects from other URIs, at least those that have been
       defined prior to this document.  In particular, provide a
       foundation for extensions to the URN syntax allowed by and
       defined in RFC 2141 to support requirements encountered by some
       of those communities.

   2.  Try to avoid getting bogged down in declarative statements about
       definitions and debates about what is and is not correct in the
       abstract.

   3.  Avoid a fork in the standard that leads to multiple, conflicting
       definitions or criteria for URNs.

   In addition, this document is intended to move past debates about
   whether or not URNs are intended to be parsed at all (i.e., whether a
   "urn"-scheme URI is simply opaque to a URI parser once the scheme
   name is identified and, if not, how much of it is actually expected
   to be understood and broken into identifiable parts by such a
   parser).  The assumption here is that parsing into the components
   identified in RFC 3986 will be performed but that any meanings or
   interpretation assigned to those components (including the
   applicability of the normal English meanings of such terms as
   "query" or "fragment") are a matter for URN-specific specifications.

3.  The role of queries and fragments in URNs

   Part of the concern that led to this document was a desire to
   accommodate URN components that would be analogous to the query and
   fragment components of generic URIs.  For many cases, the analogy
   cannot be exact.  For example, RFC 3986 ties the interpretation of
   fragments to media types.  Since media type is a function of specific
   content, URNs that are never resolved to particular content cannot
   have an associated media type.  Similarly, while the syntax for
   queries (and fragments) may be entirely appropriate for URN use,
   terminology like "Service Request" (see Appendix B to the "URNs are
   not..." draft [ServiceRequests] for additional discussion) may be
   more suitable to the URN context than "query" (if, indeed, the query
   portion of the URN is where those requests belong).

   These issues are discussed as questions facing the WG in Section 6
   below.

4.  Changes to RFC 3986

   This specification removes URN semantics from the scope of RFC 3986.
   It makes no changes to the generic URI syntax.  That syntax still
   applies to URNs as well as to other URI types.
   Even with regard to semantics, it has no practical effect for URNs
   defined in strict conformance to the prior URN specification
   [RFC2141] or the associated registration specification [RFC3406].

   In particular, the generic URI syntax for "queries" (strings starting
   with "?" and continuing to the end of the URI or to a "#") and
   "fragments" (strings starting with "#" and continuing to the end of
   the URI) is unchanged, but the terms "query" and "fragment" become,
   for URNs, terms of convenience that are defined in URN-specific ways.

5.  Other Required Actions

   The basic URN syntax specification [RFC2141] was published well
   before RFC 3986 and therefore does not depend on it.  Successors to
   that specification will need to fully spell out, or reference
   documents that spell out, the semantics and any required within-field
   syntax of URNs, using great care about generic or implicit reference
   to any URI specification.

6.  Alternatives and comparison

   [[Note in draft: temporary section to facilitate WG discussion]]

   If this draft is approved, the WG will then have a number of other
   choices to make.  They include:

6.1.  Terminology and Information Location.

   RFC 3986 syntax appears to allow three components of a URI in which
   we could put information for extending URNs past the "urn:nid:nss"
   syntax of RFC 2141.  The syntax that introduces each of these is
   reserved for future use by RFC 2141 (Section 2.3.2).  They are as
   follows:

   path segment(s).  The NSS string could be extended to allow one or
      more "path segments", introduced by "/" and terminating with the
      next "/", a "?", a "#", or the end of the URI.  These path segment
      elements have been referred to as "facets" on the mailing list.
      If they are to be used, the WG will need to settle on what they
      should be called.

   query.  The URN syntax could be extended by use of what 3986 refers
      to as a "query", represented as a string that starts with "?" and
      extends to the first "#" or the end of the URI.

   fragment.  The URN syntax could be extended by use of what 3986
      refers to as a "fragment", represented as a string that starts
      with "#" and extends to the end of the URI.

   The WG will need to determine which of these fields to use (it could
   allow or require more than one of them, see below) and what to call
   them.  The terms "path segment", "query", and "fragment" have the
   advantage of being traditional and associated in many people's minds
   with the corresponding delimiter.  On the other hand, the normal
   conception of what those terms mean (including any semantics
   associated with them in 3986) may not be a good match for the needs
   of URNs.  In particular, if a string starting with "?" were going to
   be treated as a collection of "Service Requests", calling that a
   "query" may strike some people as odd.

   Allowing more than one of these components will probably require that
   the WG understand and document the semantic relationship among them
   (see below).

6.2.  Comparison and "Part of the URN"

   There has been fairly extensive discussion of what is compared when
   one compares URNs for equality.  There has been a separate, but
   possibly equivalent, discussion about what elements associated with a
   URN identify things.  The discussions have particularly emphasized
   whether any of the path segments, queries, or fragments that are
   allowed participate in such comparisons or identification.  As with
   other topics, some WG participants believe the answers are obvious,
   but don't agree on what they are.  Others make a distinction about
   terminology (e.g., what is "part of the NSS") and assume that it
   answers the questions.
   The WG will need to figure out whether these discussions are the same
   and how to resolve the questions they imply.

6.3.  Applicability of components.

   The WG will need to decide whether whatever components are allowed
   are allowed on a per-NID basis or, at least syntactically, across the
   entire collection of URNs, remembering that, as far as 3986 is
   concerned, some things have traditionally been associated with
   schemes and all URNs are formally part of the same scheme.  As noted
   above, RFC 3986 ties the interpretation of fragments to media types,
   but that is probably not meaningful for URNs, especially URNs that
   are never resolved to objects.  Part of this requires deciding what
   should happen when a component is specified that is not applicable to
   the particular NID-identified namespace.  At least part of the web
   tradition has been to simply ignore such fields but that may not be
   the right answer for URNs, especially if one or more of them
   participates in comparisons (see above).

6.4.  Internal syntax.

   As long as the conditions for terminating substrings are not
   violated, the WG could decide on syntax within the components that
   are to be allowed, possibly including defining syntax for identifying
   keywords and defining or reserving some or all such keywords.  Put
   differently, it may be important to decide whether "a query" is a
   series of related terms or components, possibly to be applied
   serially, or whether it has components that are assumed to be
   independent and unordered.  The latter choice may or may not interact
   with considering query components (or some of them) as "Service
   Requests".

6.5.  Extended, embedded, base, and derived URNs

   There has been discussion on the mailing list of different types of
   URNs or near-URNs using at least the above terms.
   It is not clear whether, once the issues above are resolved, those
   terminology distinctions will be trivial or whether they represent
   additional issues that the WG will need to resolve.

   Note that this may interact with a discussion on the mailing list
   (off-topic for this document) about embedding URNs in HTTP or other
   URLs that locate a particular resolution or information-obtaining
   system.  It may also interact with potential revised registration
   templates for ISSNs, ISBNs, and other existing URN namespaces and
   hence with the transition discussion [URN-transition].

7.  Acknowledgments

   This specification was inspired by a search in the IETF URNBIS WG for
   other alternatives that would both satisfy the needs of persistent
   name-type identifiers and still fully conform to the specifications
   and intent of RFC 3986.  That search lasted several years and
   considered many alternatives.  Discussions with Leslie Daigle, Juha
   Hakala, Barry Leiba, Keith Moore, Andrew Newton, and Peter
   Saint-Andre during the last quarter of 2013 and the first quarter of
   2014 were particularly helpful in getting to the conclusion that a
   conceptual separation of notions of location-based identifiers (e.g.,
   URLs) and the types of persistent identifiers represented by URNs was
   necessary.  As noted below, Juha Hakala provided much of the text on
   which Appendix B.1 was based.  Peter Saint-Andre provided significant
   text in a pre-publication review.  The author also appreciates the
   efforts of several people, notably Tim Berners-Lee, Larry Masinter,
   Keith Moore, Juha Hakala, Julian Reschke, Lars Svensson, Henry S.
   Thompson, and Dale Worley, to challenge text and ideas and demand
   answers to hard questions.  Whether they agree with the results or
   not, their insights have contributed significantly to whatever
   clarity and precision appear in the text.

   The specification was changed considerably and its focus narrowed
   after an extended discussion at the WG meeting during IETF 90 in July
   2014.

8.  Contributors

   Juha Hakala contributed most of the text of Appendix B.1.

   Contact Information:
      Juha Hakala
      The National Library of Finland
      P.O. Box 15, Helsinki University
      Helsinki, MA  FIN-00014
      Finland
      Email: juha.hakala@helsinki.fi

9.  IANA Considerations

   [[CREF2: RFC Editor: Please remove this section before publication.]]

   This memo is not believed to require any action on IANA's part.  In
   particular, we note that there is a collection of "Uniform Resource
   Identifier (URI) Schemes" that does not include URNs and a series of
   URN-specific registries that do not rely on the URI specifications.

10.  Security Considerations

   This specification changes the semantics of URNs to make them
   self-contained (as specified in other documents), relying on the
   generic URI syntax specification for syntax only.  It should have no
   effect on Internet security unless the use of a definition, syntax,
   and semantics that are clearer reduces the potential for confusion
   and consequent vulnerabilities.

11.  References

11.1.  Normative References

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66, RFC
              3986, January 2005.

11.2.  Informative References

   [DeterministicURI]
              Mazahir, O., Thaler, D., and G. Montenegro, "Deterministic
              URI Encoding", February 2014.

   [IETF90-URNBISWG]
              IETF, "URN BIS Working Group Minutes", July 2014.

   [RFC1738]  Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
              Resource Locators (URL)", RFC 1738, December 1994.

   [RFC2141bis]
              Saint-Andre, P., "Uniform Resource Name (URN) Syntax",
              January 2014.

   [RFC2483]  Mealling, M. and R. Daniel, "URI Resolution Services
              Necessary for URN Resolution", RFC 2483, January 1999.

   [RFC3406]  Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
              "Uniform Resource Names (URN) Namespace Definition
              Mechanisms", BCP 66, RFC 3406, October 2002.

   [ServiceRequests]
              Klensin, J., "Names are Not Locators and URNs are Not
              URIs, Appendix B", July 2014.

   [URN-transition]
              Klensin, J. and J. Hakala, "Uniform Resource Name (URN)
              Namespace Registration Transition", August 2014.

   [URNBIS-MailingList]
              IETF, "IETF URN Mailing list", 2014.

Appendix A.  Background on the URN - URI relationship

   The Internet community now has many years of experience with both
   name-type identifiers and location-based identifiers (or "references"
   for those who are sensitive to the term "identifier" -- see
   Appendix B.1).  The primary examples of these two categories are
   Uniform Resource Names (URNs) [RFC2141] [RFC2141bis] and Uniform
   Resource Locators (URLs) [RFC1738].  That experience leads to the
   conclusion that it is impractical to constrain URNs to the high-level
   semantics of URLs.  The generic syntax for URIs [RFC3986] is
   adequately flexible to accommodate the perceived needs of URNs, but
   the specific semantics associated with the URI syntax definition --
   what particular constructions "mean" and how and where they are
   interpreted -- appear not to be.  Generalization from URLs to generic
   Uniform Resource Identifiers (URIs) [RFC3986], especially to
   name-based, high-stability, long-persistence identifiers such as many
   URNs, has failed because the assumed similarities do not adequately
   extend to all forms of URNs.
   Ultimately, locators, which typically depend on particular accessing
   protocols and a specification relative to some physical space or
   network topology, are simply different creatures from
   long-persistence, location-independent object identifiers.  The
   syntax and semantic constraints that are appropriate for locators
   are either irrelevant to or interfere with the needs of resource
   names as a class.  That was tolerable as long as the URN system
   didn't need additional capabilities (over those specified in RFC
   2141) but experience since RFC 2141 was published has shown that
   they are, in fact, needed.

Appendix B.  Three views of locator-identifier separation

   Beginning in the 1990s with the first discussions of generalizing
   HTTP-style URLs to more general, "URI" forms with more or less
   different properties, there have been controversies between people
   and communities with strongly-held views about whether the
   differences between "locators" and "identifiers" are real, whether
   the categories are actually disjoint (RFC 3986 says that they are
   not), and, if real differences exist, how they are manifested and
   what their interests are.  The subsections below are intended to at
   least partially capture different views of those issues.  They are
   included here in the hope that they will assist with focusing
   discussion and reducing the frequency with which arguments are
   repeated.  It is almost certain that the community does not have
   consensus on all of the points made below and that these blocks of
   text should be moved into other documents if they are retained at
   all.

B.1.  A Perspective on Locations and Names

   Content industries (e.g., publishers) and memory organizations (e.g.,
   libraries, archives, and museums) invest a lot of resources in naming
   things, and the topics of naming and classification are important
   information science issues.
   Tens, if not hundreds, of millions of persistent identifiers have
   been assigned during the last decade.

   Several identifier systems have been developed for persistent and
   unique identification of resources.  When there is a real need to
   preserve something important (such as scientific publications,
   research data, government publications, etc.) for the long term, URNs
   or other persistent identifiers are used; URLs (or other generic
   URIs) are not being used for identification or even linking purposes.

   Naming and locating, e.g., for library resources, are both complex
   activities which have different aims.  Traditionally, naming and
   locating resources have been separate activities, and the rules for
   the former are much more stringent than for the latter.  The same
   principles are being applied to digital materials as well as more
   traditional ones.  In a library, any book, be it printed or digital,
   has both a unique and persistent International Standard Book Number
   (ISBN) and non-unique, short-lived location information (each copy
   has its own) which cannot be trusted in the long run.  The ISBN never
   changes, but both shelf locations and Web addresses usually do, many
   times during the book's life span.

   Giving location information a role in identification would not only
   force libraries to adopt different policies for printed and digital
   content, it would also undermine the value of existing identifier
   systems.  Let us assume that ten people independently upload a copy
   of an electronic book to different locations on the Web.  Are all ten
   of these URLs valid identifiers of the book?  And what is their
   relation to the ISBN or other identification information of the book
   such as its title?

   From the perspective of the communities who depend on persistent
   identifiers, critical issues include:

   1.  Resource identification has to be a managed process.  Assigning
       URIs generally is not.  Although it may be possible to introduce
       some level of control to URI assignment, a user cannot determine
       whether some URI is reliable or not.

   2.  Anyone may assign new URIs to resources even if these resources
       already have proper identifiers assigned to them.  Claiming that
       these URIs actually identify something undermines the value of
       proper identifiers.

   3.  There is no 1:1 relation between the resource identified and
       URIs.  An e-book on the Web may be represented as 1-n files
       (URIs), and a single file may contain several books.  And books
       are simple; we also need to name very complex objects such as
       research data sets, or some component parts within these complex
       data sets.

   4.  One resource such as a scientific article is typically available
       from multiple locations, including (for instance) the publisher's
       document supply service, a university's open repositories and
       other cooperative repository systems, legal deposit collections,
       and the Internet Archive.  A resource should have one and only
       one identifier of a given type; URIs do not meet this
       requirement.

   5.  URIs relate to instances (copies) of resources, whereas
       traditionally identification has a much broader scope.
       Identifiers may be assigned to, e.g., an immaterial work (such as
       Hamlet), its expressions (e.g., a Finnish translation of Hamlet),
       and manifestations of works and expressions (e.g., a PDF version
       of a Finnish translation of Hamlet).

   6.  Over time, different resources (or different versions of the same
       resource) may be found at the same non-URN URI.  A user has no
       way of knowing whether the resource has changed.  One of the
       basic principles for proper identifier systems is that the same
       identifier is never assigned to another resource.  In general,
       URIs do not meet this requirement.

   7.  Persistent identification must be available for resources which
       are available only in databases and other environments that are
       often identified today as the "deep web".  URIs for these
       resources tend to be very complicated and it will be difficult to
       keep them alive even with the help of DNS redirection when, e.g.,
       the underlying database management system changes.

   8.  The role that the URI fragment and query could or should have in
       identification is unclear and the statements in RFC 3986 are
       definitely problematic from the points of view of existing
       identifier systems and management of naming.

       Does "fragment" identify a location or a certain section of a
       resource?  In the evolving set of URN Internet standards, the
       fragment will not be a part of the Namespace Specific String.
       The fragment then only indicates a place or segment within the
       identified resource, but does not identify it.  If the fragment
       had a role in identification, fragments would extend the scope
       of existing standard identifiers to component parts of
       resources.  For instance, anyone could use a URN based on ISBN +
       fragment to identify chapters of electronic books.

       Things get even more complicated with "query" since what the
       combination of an identifier and a query resolves to may not
       have anything to do with the original resource.  For instance,
       a URN based on ISBN + query may resolve to the metadata record
       describing the book.  These records have their own identifiers
       which are not based on ISBNs.

   9.  For many organizations, persistence means decades or centuries.
       Anything that is protocol dependent will eventually fail.  URLs
       do not change by themselves, but in the long run it is very
       difficult for people to not change them or the objects to which
       they point.

       The mention of centuries is intentional.
Content industries, 585 memory organizations (such as national and repository 586 libraries and national archives), universities, and other 587 research organizations need identifiers that will persist for 588 hundreds of years. Such identifiers might even need to 589 outlast the institutions themselves, and definitely should be 590 usable even if current technologies such as the Web and the 591 Internet cease to exist or are supplanted by something new (as 592 unlikely as that might seem today). 594 In addition, operations on, or additional specifications 595 about, names and the associated objects must be possible, as 596 stable as the names themselves, and reasonably efficient. For 597 example, if a URN were assigned to an encyclopedia that 598 consisted of many volumes, it should be feasible to identify 599 (and locate and retrieve if that were desired) a particular 600 volume or even a particular article without accessing or 601 retrieving the entire set. 603 B.2. A More Pragmatic Perspective 605 The subsection above provides an explanation of the reasons for this 606 change and actually for a more radical separation of URNs from 607 generic URIs. That explanation is not without controversy, 608 especially from those who make different assumptions about the 609 future, or even interpretations of the present, than many members of 610 the community (and especially members of the communities described in 611 that section). Some of those who do not accept the explanation above 612 simply do not recognize and accept the distinctions on which it, and 613 URNs more generally, are based, including the name-locator 614 distinction. In some cases, opposition to that explanation is quite 615 pronounced, involving fundamental differences in philosophy that move 616 beyond mere differences of opinion.
618 Like most controversies in which one group does not accept the 619 definitions, facts, or logic of another, the differences are unlikely 620 to be resolved by further discussion, no matter how sensible and 621 patient. The material in this appendix is provided for the benefit 622 of those who cannot accept Appendix B.1 or consider the discussion 623 there to be meaningless. 625 Put differently, the issue is ultimately not whether the perspective 626 that Appendix B.1 reflects is, in some universal epistemology, 627 correct or incorrect or even whether the consequences and 628 implications of the introduction of the web and/or digital media 629 render it hopelessly obsolete. If only in their manifestation 630 through national repository libraries and archives and setters of 631 standards for them -- activities that have far more formal authority 632 than the IETF or even W3C -- the community involved is relevant and 633 legitimate. If the IETF wishes to maintain authority over things 634 that are called URNs, then those perceived needs probably need to be 635 accommodated in some reasonable way... where "reasonable" is defined 636 as much or more by those communities as by the IETF one. 638 Independent of the details of the discussion above, in the case of 639 URNs, the IETF is faced with a pair of problems that are ultimately 640 faced sooner or later by all voluntary standards bodies: nothing 641 except quality and broad community consensus prevents a standard from 642 being ignored in the marketplace and nothing prevents another body 643 from creating a competing standard. The effort required to create a 644 competing standard can be increased and its potential for confusion 645 can be reduced somewhat by various measures -- measures the IETF has 646 rarely tried to actually use -- but those measures are rarely 647 effective when the other body is convinced that it has legitimate 648 and significant needs that differ from the original standard.
649 Because of those problems, the key question for the URN effort is 650 ultimately not whether a clear enough distinction exists between 651 names and locator or location-based information, nor whether 652 "persistent" can be defined clearly enough, nor even whether the 653 communities and requirements described in Appendix B.1 are valid or 654 will be judged valid in retrospect in a few decades or centuries. 655 Instead, the question is whether the IETF is willing to evolve and 656 adapt the URN definition to accommodate those perceived needs or 657 whether it prefers to have that work done elsewhere, either by 658 adoption in the broader community and marketplace of a different 659 approach or, potentially, even a competing URN standard. If, in the 660 long run, those other communities and perspectives turn out to be 661 wrong, the additional features will atrophy. But that would be true 662 whether they are specified and standardized in the IETF or elsewhere. 664 B.3. A More Radical (or Most Conservative) View of URNs and Their Role 666 [[CREF3: The text in this subsection was derived from an on-list 667 discussion. I believe it represents an even stronger position than 668 RFC 3986 takes although I think similar positions have come up in 669 other discussions. Because of its origins, the writing style is 670 somewhat different from the rest of this document. Again, this text 671 is provided for convenience and is not expected to survive into RFC 672 publication--JcK ]] 674 The essence of this position is that URNs are "just" names and that, 675 insofar as one can talk about location or resolution services of 676 various types, they are data associated with the URN (or underlying 677 name); not only are they not part of the URN, but they are useful only 678 for constructing locator-type URIs to which the URN (name) is an 679 argument. 681 Suppose we have a URN that looks like 683 urn:isbn:1-4012-9876-1 685 It is really just a name.
Associations with objects are someone 686 else's problem. There is actually no requirement that an object 687 exist, only that the publisher/registrant have sufficient intention 688 to create an object to assign the code. Now a query about metadata 689 associated with that name makes perfect sense although there are 690 questions about how far it should go (see below). For example, one 691 could invoke 693 urn:isbn:1-4012-9876-1?publisher 695 and, modulo some issues about queries being defined by the resource, 696 have a more than reasonable expectation of getting back "DC Comics". 697 But, since that is a name and not an object or the location of an 698 object, I don't know what a fragment is. One could certainly write 700 urn:isbn:1-4012-9876-1#publisher 702 or 704 urn:isbn:1-4012-9876-1#1 706 but, assuming one knows how ISBNs are constructed, the result would 707 presumably be 709 1-4012 711 and not anything useful, since there is no object to retrieve and 712 evaluate with regard to either media type or content. 714 If we are going to maintain a strong name-object distinction, this 715 approach makes a certain amount of sense. 717 An extreme version of the argument that we can't have fragments on 718 URNs because they are just names, not objects, might lead to the 719 claim that the only way one gets "Section 2" of that book is with 720 something like 722 http://school- 723 library.ps1234.k12.ma.example/?urn="isbn:1-4012-9876-1"&Section=2 725 or, in two more general cases: 727 myFavoriteLibraryRetrievalScheme://library- 728 domain.example/?urn="isbn:1-4012-9876-1"&Section=2 730 or maybe 732 http://www.generic- 733 bookseller.example/?urn="isbn:1-4012-9876-1"#Section=2 735 In all three of those cases and some other variations we can think 736 of, the URN is, itself, stable and persistent.
Neither the two 737 schemes nor the domain parts associated with them need be. If the 738 fragment that refers to a section is valid, it is too (that doesn't 739 make it part of the name -- that is a separate question). The 740 retrieval/resolution system is not a property of the URN. Instead, 741 the URN is a name-type argument -- an object identifier -- used as 742 input to the retrieval system. 744 Appendix C. Change Log 746 [[CREF4: RFC Editor: Please remove this appendix before 747 publication.]] 749 C.1. Changes from draft-ietf-urnbis-urns-are-not-uris-00 to -01 751 o Revised Section 1 slightly and added some new material to try to 752 address questions raised on the mailing list. 754 o Added Section 2, reflecting an email exchange. 756 o Added a Security Considerations section, replacing the placeholder 757 in the previous version. 759 o Added Appendix B.2 and inserted a note in the material titled "A 760 Perspective on Locations and Names" pointing to it (that material 761 is in Appendix B.1 in the current version, but was Section 2 and 762 then Section 3 in earlier versions). 764 o Added temporary Appendix B for this version only. 766 o Enhanced and updated the Acknowledgments section. 768 o The usual small clarifications and editorial changes. 770 C.2. Changes from draft-ietf-urnbis-urns-are-not-uris-01 to draft-ietf- 771 urnbis-semantics-clarif-00 773 o Changed title and file name to better reflect changes summarized 774 below. Note that the predecessor of this document was draft-ietf- 775 urnbis-urns-are-not-uris-01. 777 o Revised considerably as discussed on the mailing list and at IETF 778 90. In particular, the document has been narrowed to change 779 semantics only without affecting the relationship to URI syntax 780 and the document title and other details changed to match. 782 o Dropped much of the original Introduction (moving it temporarily 783 to an appendix) and trimmed the abstract to be consistent with the 784 new, more limited scope.
786 o Revised Appendix B.2 to make "perceived requirement" clearer. 788 o Removed the former Appendix B, as promised in the previous draft, 789 moved considerably more text into appendices, and added some new 790 appendix text. Note that the earlier text is temporarily 791 referenced in Appendix B.3 above. If we intend to keep that 792 appendix material, we will have to drag at least part of the text 793 back in from the earlier draft. 795 o Added new Section 6 to discuss the next round of decisions the WG 796 will have to make, assuming the provisions of this specification 797 are approved. 799 Author's Address 800 John C Klensin 801 1770 Massachusetts Ave, Ste 322 802 Cambridge, MA 02140 803 USA 805 Phone: +1 617 245 1457 806 Email: john-ietf@jck.com