idnits 2.17.1 

draft-ietf-urnbis-semantics-clarif-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC3986, but the
     abstract doesn't seem to directly say this.  It does mention RFC3986
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC3986, updated by this document, for
     RFC5378 checks: 2002-11-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 25, 2014) is 3531 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'DeterministicURI' is defined on line 385, but no
     explicit reference was found in the text

  ** Obsolete normative reference: RFC 2141 (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 1738
     (Obsoleted by RFC 4248, RFC 4266)

  -- Duplicate reference: RFC2141, mentioned in 'RFC2141bis', was also
     mentioned in 'RFC2141'.

  -- Obsolete informational reference (is this intentional?): RFC 2141
     (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 3406
     (Obsoleted by RFC 8141)


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Uniform Resource Names (urnbis)                               J. Klensin
3	Internet-Draft
4	Updates: 3986 (if approved)                              August 25, 2014
5	Intended status: Standards Track
6	Expires: February 26, 2015

8	                      URN Semantics Clarification
9	               draft-ietf-urnbis-semantics-clarif-00.txt

11	Abstract

13	   Experience has shown that identifiers associated with persistent
14	   names have properties and requirements that may be somewhat different
15	   from identifiers associated with the locations of objects.  This is
16	   especially true when such names are expected to be stable for a very
17	   long time or when they identify large and complex entities.  In order
18	   to allow Uniform Resource Names (URNs) to evolve to meet the needs of
19	   the Library, Museum, Publisher, and Information Sciences
20	   communities and other users, this specification separates URNs from
21	   the semantic constraints that many people believe are part of the
22	   specification for Uniform Resource Identifiers (URIs) specified in
23	   RFC 3986, updating that document accordingly.  The syntax of URNs is
24	   still constrained to that of RFC 3986, so generic URI parsers are
25	   unaffected by this change.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on February 26, 2015.

44	Copyright Notice

46	   Copyright (c) 2014 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (http://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

61	   1.  Introduction
62	   2.  Pragmatic Goals
63	   3.  The role of queries and fragments in URNs
64	   4.  Changes to RFC 3986
65	   5.  Other Required Actions
66	   6.  Alternatives and comparison
67	     6.1.  Terminology and Information Location.
68	     6.2.  Comparison and "Part of the URN"
69	     6.3.  Applicability of components.
70	     6.4.  Internal syntax.
71	     6.5.  Extended, embedded, base, and derived URNs
72	   7.  Acknowledgments
73	   8.  Contributors
74	   9.  IANA Considerations
75	   10. Security Considerations
76	   11. References
77	     11.1.  Normative References
78	     11.2.  Informative References
79	   Appendix A.  Background on the URN - URI relationship
80	   Appendix B.  Three views of locator-identifier separation
81	     B.1.  A Perspective on Locations and Names
82	     B.2.  A More Pragmatic Perspective
83	     B.3.  A more radical (or most conservative) view of URNs and
84	           their role
85	   Appendix C.  Change Log
86	     C.1.  Changes from draft-ietf-urnbis-urns-are-not-uris-00 to
87	           -01
88	     C.2.  Changes from draft-ietf-urnbis-urns-are-not-uris-01
89	           to draft-ietf-urnbis-semantics-clarif-00
90	   Author's Address

92	1.  Introduction

94	   The Generic URI Syntax specification [RFC3986] covers both locators
95	   and names and mixtures of the two (See its Section 1.1.3) and
96	   describes Uniform Resource Locators (URLs) -- first documented in the
97	   IETF in RFC 1738 [RFC1738] -- as an embodiment of the locator concept
98	   and Uniform Resource Names (URNs), specifically those using the "urn"
99	   scheme [RFC2141], as an embodiment of the names that do not directly
100	   provide for resource location.  This specification is concerned only
101	   about URNs of the variety described in RFC 2141 [RFC2141] (i.e.,
102	   those that use the "urn" scheme).  URLs, other types of names, and
103	   any URI types that may not fall into one of the above categories are
104	   out of its scope and unaffected by it.

106	   Experience with URNs since the publication of RFC 3986 has identified
107	   several ways in which their inclusion under its scope has hampered
108	   understanding, adoption, and especially extension in ways that were
109	   anticipated in RFC 2141.  The need for extensions to the URN concept
110	   is now being felt in some communities, especially those that include
111	   libraries, museums, publishers, and other information scientists.

113	   In particular, the Generic URI Syntax specification goes beyond
114	   syntax to specify the meaning and interpretation of various fields,
115	   especially the "query" and "fragment" ones.  This specification
116	   excludes URNs from those definitions of meaning and interpretation so
117	   that RFC 3986 applies to their syntax only.  The meaning --and any
118	   more specific syntax rules-- for those fields for URNs are now
119	   defined in a URN-specific document [RFC2141bis].
120	   [[CREF1: Note forward pointer to a future version of 2141bis.]]
121	   URNs remain members of the URI family and parsers for generic URI
122	   syntax are not affected by this specification.

124	   Portions of this document were inspired by discussions at the meeting
125	   of the WG during IETF 90 [IETF90-URNBISWG] and subsequent comments
126	   and clarifications on the mailing list [URNBIS-MailingList].  In its
127	   present form, it is intended primarily to focus WG discussions.

129	   This draft does not discuss issues about DDDS resolution or
130	   conversion to (and interpretation of) URCs or URN "resolution" more
131	   generally.  If any of those topics need to be addressed, it should be
132	   in other documents.  Because URCs (as specified in RFC 2483 [RFC2483]
133	   or elsewhere) have not been significantly implemented or deployed,
134	   discussion of them is probably out of scope for the WG at this point.
135	   The document also does not discuss alternatives to URNs, either those
136	   that might use a different scheme name within the RFC 3986 URI
137	   framework or those that might use a different framework entirely.

139	2.  Pragmatic Goals

141	   Despite the important background and rationale in the sections that
142	   follow, the change made by this specification is driven by a desire
143	   to avoid philosophical debates about terminology or ultimate truths.
144	   Instead, it is motivated by three very pragmatic principles:

146	   1.  Try to accommodate all of those who think URNs are necessary,
147	       i.e., that they can and should be usefully distinguished in
148	       certain respects from other URIs, at least those that have been
149	       defined prior to this document.  In particular, provide a
150	       foundation for extensions to the URN syntax allowed by and
151	       defined in RFC 2141 to support requirements encountered by some
152	       of those communities.

154	   2.  Try to avoid getting bogged down in declarative statements about
155	       definitions and debates about what is and is not correct in the
156	       abstract.

158	   3.  Avoid a fork in the standard that leads to multiple, conflicting,
159	       definitions or criteria for URNs.

161	   In addition, this document is intended to move past debates about
162	   whether or not URNs are intended to be parsed at all (i.e., whether a
163	   "urn"-scheme URI is simply opaque to a URI parser once the scheme
164	   name is identified and, if not, how much of it is actually expected
165	   to be understood and broken into identifiable parts by such a parser).
166	   The assumption here is that parsing into the components identified in
167	   RFC 3986 will be performed but that any meanings or interpretation
168	   assigned to those components (including the applicability of the
169	   normal English meanings of such terms as "query" or "fragment") are a
170	   matter for URN-specific specifications.

172	3.  The role of queries and fragments in URNs

174	   Part of the concern that led to this document was a desire to
175	   accommodate URN components that would be analogous to the query and
176	   fragment components of generic URIs.  For many cases, the analogy
177	   cannot be exact.  For example, RFC 3986 ties the interpretation of
178	   fragments to media types.  Since media type is a function of specific
179	   content, URNs that are never resolved to particular content cannot
180	   have an associated media type.  Similarly, while the syntax for
181	   queries (and fragments) may be entirely appropriate for URN use,
182	   terminology like "Service Request" (see Appendix B to the "URNs are
183	   not..." draft [ServiceRequests] for additional discussion) may be
184	   more suitable to the URN context than "query" (if, indeed, the query
185	   portion of the URN is where those requests belong).

187	   These issues are discussed as questions facing the WG in Section 6
188	   below.

190	4.  Changes to RFC 3986

192	   This specification removes URN semantics from the scope of RFC 3986.
193	   It makes no changes to the generic URI syntax.  That syntax still
194	   applies to URNs as well as to other URI types.  Even with regard to
195	   semantics, it has no practical effect for URNs defined in strict
196	   conformance to the prior URN specification [RFC2141] or the
197	   associated registration specification [RFC3406].

199	   In particular, the generic URI syntax for "queries" (strings starting
200	   with "?" and continuing to the end of the URI or to a "#") and
201	   "fragments" (strings starting with "#" and continuing to the end of
202	   the URI) is unchanged, but the terms "query" and "fragment" become,
203	   for URNs, terms of convenience that are defined in URN-specific ways.
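
   As a minimal illustration of the point that generic parsing is
   unchanged, the sketch below runs a general-purpose URI parser
   (Python's urllib.parse.urlsplit) over a URN.  The "example" NID and
   the query and fragment strings are placeholders chosen for this
   illustration, not values defined by this specification.  The parser
   only locates the delimiters; it assigns no URN-specific meaning to
   what it finds.

```python
from urllib.parse import urlsplit

# Hypothetical URN, used only for illustration.
parts = urlsplit("urn:example:a123?x=1#sec2")

print(parts.scheme)    # scheme name, i.e. the text before the first ":"
print(parts.path)      # everything up to the "?" (NID and NSS here)
print(parts.query)     # string after "?" up to the "#"
print(parts.fragment)  # string after "#" to the end of the URI
```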

205	5.  Other Required Actions

207	   The basic URN syntax specification [RFC2141] was published well
208	   before RFC 3986 and therefore does not depend on it.  Successors to
209	   that specification will need to fully spell out, or reference
210	   documents that spell out, the semantics and any required within-field
211	   syntax of URNs, using great care about generic or implicit reference
212	   to any URI specification.

214	6.  Alternatives and comparison

216	   [[Note in draft: temporary section to facilitate WG discussion]]

218	   If this draft is approved, the WG will then have a number of other
219	   choices to make.  They include:

221	6.1.  Terminology and Information Location.

223	   RFC 3986 syntax appears to allow three components of a URI in which
224	   we could put information for extending URNs past the "urn:nid:nss"
225	   syntax of RFC 2141.  The syntax that introduces each of these is
226	   reserved for future use by RFC 2141 (Section 2.3.2).  They are as
227	   follows:

229	   path segment(s).  The NSS string could be extended to allow one or
230	      more "path segments", introduced by "/" and terminating with the
231	      next "/", a "?", a "#", or the end of the URI.  These path segment
232	      elements have been referred to as "facets" on the mailing list.
233	      If they are to be used, the WG will need to settle on what they
234	      should be called.

236	   query.  The URN syntax could be extended by use of what 3986 refers
237	      to as a "query", represented as a string that starts with "?" and
238	      extends to the first "#" or the end of the URI.

240	   fragment.  The URN syntax could be extended by use of what 3986
241	      refers to as a "fragment", represented as a string that starts
242	      with "#" and extends to the end of the URI.
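
   The three candidate locations listed above can be sketched as a
   simple decomposition.  This is only an illustration under stated
   assumptions: the names UrnParts and split_urn are hypothetical, the
   "example" NID is a placeholder, and treating "/" as a path segment
   delimiter inside the NSS is one of the options under discussion,
   not a settled rule.

```python
from typing import List, NamedTuple, Optional


class UrnParts(NamedTuple):
    """Hypothetical container for the pieces of an extended URN."""
    nid: str
    nss: str
    path_segments: List[str]
    query: Optional[str]
    fragment: Optional[str]


def split_urn(urn: str) -> UrnParts:
    # Fragment: from the first "#" to the end of the URI.
    body, has_fragment, fragment = urn.partition("#")
    # Query: from the first "?" to the "#" or the end of the URI.
    body, has_query, query = body.partition("?")
    scheme, _, rest = body.partition(":")
    if scheme.lower() != "urn":
        raise ValueError("not a 'urn'-scheme URI")
    nid, colon, rest = rest.partition(":")
    if not colon or not nid:
        raise ValueError("missing NID or NSS")
    # Assumption: each "/" introduces one path segment after the NSS.
    nss, *segments = rest.split("/")
    if not nss:
        raise ValueError("empty NSS")
    return UrnParts(
        nid,
        nss,
        segments,
        query if has_query else None,
        fragment if has_fragment else None,
    )
```

   For instance, split_urn("urn:example:a123/f1/f2?req=x#part") yields
   the NID "example", the NSS "a123", two path segments, the query
   string "req=x", and the fragment string "part".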

244	   The WG will need to determine which of these fields to use (it could
245	   allow or require more than one of them, see below) and what to call
246	   them.  The terms "path segment", "query", and "fragment" have the
247	   advantage of being traditional and associated in many people's minds
248	   with the corresponding delimiter.  On the other hand, the normal
249	   conception of what those terms mean (including any semantics
250	   associated with them in 3986) may not be a good match for the needs
251	   of URNs.  In particular, if a string starting in "?" were going to be
252	   treated as a collection of "Service Requests", calling that a "query"
253	   may strike some people as odd.

255	   Allowing more than one of these components will probably require that
256	   the WG understand and document the semantic relationship among them
257	   (see below).

259	6.2.  Comparison and "Part of the URN"

261	   There has been fairly extensive discussion of what is compared when
262	   one compares URNs for equality.  There has been a separate, but
263	   possibly equivalent, discussion about what elements associated with a
264	   URN identify things.  The discussions have particularly emphasized
265	   whether any of the path segments, queries, or fragments that are
266	   allowed participate in such comparisons or identification.  As with
267	   other topics, some WG participants believe the answers are obvious,
268	   but don't agree on what they are.  Others make a distinction about
269	   terminology (e.g., what is "part of the NSS") and assume that it
270	   answers the questions.  The WG will need to figure out whether these
271	   discussions are the same and how to resolve the questions they imply.
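
   To make the open question concrete, the sketch below builds a
   comparison key in which the participation of the query and fragment
   is an explicit choice.  The function name comparison_key and its
   parameters are hypothetical.  The normalization shown (the "urn"
   scheme name, the NID, and the hex digits of %-escapes compared
   without regard to case) follows the lexical equivalence rules of
   RFC 2141 as understood here; whether the optional components should
   participate is exactly what the WG has yet to decide.

```python
import re

_PCT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def comparison_key(urn: str, *, with_query: bool, with_fragment: bool) -> str:
    """Return a string such that equal keys mean 'equivalent URNs'."""
    body, _, fragment = urn.partition("#")
    body, _, query = body.partition("?")
    scheme, _, rest = body.partition(":")
    nid, _, nss = rest.partition(":")
    # Case-normalize only the hex digits of %-escapes in the NSS.
    nss = _PCT_ESCAPE.sub(lambda m: m.group(0).lower(), nss)
    key = f"{scheme.lower()}:{nid.lower()}:{nss}"
    # Policy switches: these are the undecided part.
    if with_query:
        key += "?" + query
    if with_fragment:
        key += "#" + fragment
    return key
```

   Under this sketch, two URNs that differ only in their query strings
   compare as equal when with_query is False and as different when it
   is True, which is the practical consequence of the choice.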

273	6.3.  Applicability of components.

275	   The WG will need to decide whether whatever components are allowed
276	   are allowed on a per-NID basis or, at least syntactically, across the
277	   entire collection of URNs, remembering that, as far as 3986 is
278	   concerned, some things have traditionally been associated with
279	   schemes and all URNs are formally part of the same scheme.  As noted
280	   above, RFC 3986 ties the interpretation of fragments to media types,
281	   but that is probably not meaningful for URNs, especially URNs that
282	   are never resolved to objects.  Part of this requires deciding what
283	   should happen when a component is specified that is not applicable to
284	   the particular NID-identified namespace.  At least part of the web
285	   tradition has been to simply ignore such fields but that may not be
286	   the right answer for URNs, especially if one or more of them
287	   participates in comparisons (see above).

289	6.4.  Internal syntax.

291	   As long as the conditions for terminating substrings were not
292	   violated, the WG could decide on syntax within the components that
293	   are to be allowed, possibly including defining syntax for identifying
294	   keywords and defining or reserving some or all such keywords.  Put
295	   differently, it may be important to decide whether "a query" is a
296	   series of related terms or components, possibly to be applied
297	   serially or whether it has components that are assumed to be
298	   independent and unordered.  The latter choice may or may not interact
299	   with considering query components (or some of them) as "Service
300	   Requests".
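
   The difference between the two readings can be shown with a small
   sketch.  The "&"-separated "keyword=value" form used here is purely
   an assumption for illustration (no such internal syntax has been
   chosen), and the function names are hypothetical.

```python
from typing import FrozenSet, List, Tuple


def ordered_terms(query: str) -> List[Tuple[str, str]]:
    """Read the query as a series of terms to be applied in order."""
    terms = []
    for item in query.split("&"):
        if item:
            keyword, _, value = item.partition("=")
            terms.append((keyword, value))
    return terms


def unordered_terms(query: str) -> FrozenSet[Tuple[str, str]]:
    """Read the query as independent terms whose order is irrelevant."""
    return frozenset(ordered_terms(query))
```

   With the ordered reading, "a=1&b=2" and "b=2&a=1" are different
   queries; with the unordered reading they are the same.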

302	6.5.  Extended, embedded, base, and derived URNs

304	   There has been discussion on the mailing list of different types of
305	   URNs or near-URNs using at least the above terms.  It is not clear
306	   whether, once the issues above are resolved, those terminology
307	   distinctions will be trivial or whether they represent additional
308	   issues that the WG will need to resolve.

310	   Note that this may interact with a discussion on the mailing list
311	   (off-topic for this document) about embedding URNs in HTTP or other
312	   URLs that locate a particular resolution or information-obtaining
313	   system.  It may also interact with potential revised registration
314	   templates for ISSNs, ISBNs, and other existing URN namespaces and
315	   hence with the transition discussion [URN-transition].

317	7.  Acknowledgments

319	   This specification was inspired by a search in the IETF URNBIS WG for
320	   other alternatives that would both satisfy the needs of persistent
321	   name-type identifiers and still fully conform to the specifications
322	   and intent of RFC 3986.  That search lasted several years and
323	   considered many alternatives.  Discussions with Leslie Daigle, Juha
324	   Hakala, Barry Leiba, Keith Moore, Andrew Newton, and Peter Saint-
325	   Andre during the last quarter of 2013 and the first quarter of 2014
326	   were particularly helpful in getting to the conclusion that a
327	   conceptual separation of notions of location-based identifiers (e.g.,
328	   URLs) and the types of persistent identifiers represented by URNs was
329	   necessary.  As noted below, Juha Hakala provided much of the text on
330	   which Appendix B.1 was based.  Peter Saint-Andre provided significant
331	   text in a pre-publication review.  The author also appreciates the
332	   efforts of several people, notably Tim Berners-Lee, Larry Masinter,
333	   Keith Moore, Juha Hakala, Julian Reschke, Lars Svensson, Henry S.
334	   Thompson, and Dale Worley, to challenge text and ideas and demand
335	   answers to hard questions.  Whether they agree with the results or
336	   not, their insights have contributed significantly to whatever
337	   clarity and precision appears in the text.

339	   The specification was changed considerably and its focus narrowed
340	   after an extended discussion at the WG meeting during IETF 90 in July
341	   2014.

343	8.  Contributors

345	   Juha Hakala contributed most of the text of Appendix B.1.

347	      Contact Information:
348	      Juha Hakala
349	      The National Library of Finland
350	      P.O.  Box 15, Helsinki University
351	      Helsinki, MA FIN-00014
352	      Finland
353	      Email: juha.hakala@helsinki.fi

355	9.  IANA Considerations

357	   [[CREF2: RFC Editor: Please remove this section before publication.]]

359	   This memo is not believed to require any action on IANA's part.  In
360	   particular, we note that there is a collection of "Uniform Resource
361	   Identifier (URI) Schemes" that does not include URNs and a series of
362	   URN-specific registries that do not rely on the URI specifications.

364	10.  Security Considerations

366	   This specification changes the semantics of URNs to make them self-
367	   contained (as specified in other documents), relying on the generic
368	   URI syntax specification for syntax only.  It should have no effect
369	   on Internet security unless the use of a definition, syntax, and
370	   semantics that are more clear reduces the potential for confusion and
371	   consequent vulnerabilities.

373	11.  References

375	11.1.  Normative References

377	   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

379	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
380	              Resource Identifier (URI): Generic Syntax", STD 66, RFC
381	              3986, January 2005.

383	11.2.  Informative References

385	   [DeterministicURI]
386	              Mazahir, O., Thaler, D., and G. Montenegro, "Deterministic
387	              URI Encoding", February 2014, <http://www.ietf.org/id/
388	              draft-montenegro-httpbis-uri-encoding-00.txt>.

390	   [IETF90-URNBISWG]
391	              IETF, "URN BIS Working Group Minutes", July 2014,
392	              <http://www.ietf.org/proceedings/90/minutes/
393	              minutes-90-urnbis>.

395	   [RFC1738]  Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
396	              Resource Locators (URL)", RFC 1738, December 1994.

398	   [RFC2141bis]
399	              Saint-Andre, P., "Uniform Resource Name (URN) Syntax",
400	              January 2014, <https://datatracker.ietf.org/doc/draft-
401	              ietf-urnbis-rfc2141bis-urn/>.

403	   [RFC2483]  Mealling, M. and R. Daniel, "URI Resolution Services
404	              Necessary for URN Resolution", RFC 2483, January 1999.

406	   [RFC3406]  Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
407	              "Uniform Resource Names (URN) Namespace Definition
408	              Mechanisms", BCP 66, RFC 3406, October 2002.

410	   [ServiceRequests]
411	              Klensin, J., "Names are Not Locators and URNs are Not
412	              URIs, Appendix B", July 2014, <http://www.ietf.org/id/
413	              draft-ietf-urnbis-urns-are-not-uris-01.txt>.

415	   [URN-transition]
416	              Klensin, J. and J. Hakala, "Uniform Resource Name (URN)
417	              Namespace Registration Transition", August 2014,
418	              <https://datatracker.ietf.org/doc/draft-ietf-urnbis-ns-
419	              reg-transition/>.

421	   [URNBIS-MailingList]
422	              IETF, "IETF URN Mailing list", 2014,
423	              <https://www.ietf.org/mailman/listinfo/urn>.

425	Appendix A.  Background on the URN - URI relationship

427	   The Internet community now has many years of experience with both
428	   name-type identifiers and location-based identifiers (or "references"
429	   for those who are sensitive to the term "identifier" -- see
430	   Appendix B.1).  The primary examples of these two categories are
431	   Uniform Resource Names (URNs [RFC2141] [RFC2141bis]) and Uniform
432	   Resource Locators (URLs [RFC1738]).  That experience leads to the
433	   conclusion that it is impractical to constrain URNs to the high-level
434	   semantics of URLs.  The generic syntax for URIs [RFC3986] is
435	   adequately flexible to accommodate the perceived needs of URNs, but
436	   the specific semantics associated with the URI syntax definition --
437	   what particular constructions "mean" and how and where they are
438	   interpreted -- appear to not be.  Generalization from URLs to generic
439	   Uniform Resource Identifiers (URIs) [RFC3986], especially to name-
440	   based, high-stability, long-persistence, identifiers such as many
441	   URNs, has failed because the assumed similarities do not adequately
442	   extend to all forms of URNs.  Ultimately, locators, which typically
443	   depend on particular accessing protocols and a specification relative
444	   to some physical space or network topology, are simply different
445	   creatures from long-persistence, location-independent, object
446	   identifiers.  The syntax and semantic constraints that are
447	   appropriate for locators are either irrelevant to or interfere with
448	   the needs of resource names as a class.  That was tolerable as long
449	   as the URN system didn't need additional capabilities (over those
450	   specified in RFC 2141) but experience since RFC 2141 was published
451	   has shown that they are, in fact, needed.

453	Appendix B.  Three views of locator-identifier separation

455	   Beginning in the 1990s with the first discussions of generalizing
456	   HTTP-style URLs to more general, "URI" forms with more or less
457	   different properties, there have been controversies between people
458	   and communities with strongly-held views about whether the
459	   differences between "locators" and "identifiers" are real, whether
460	   the categories are actually disjoint (RFC 3986 says that they are
461	   not), and, if real differences exist, how they are manifested and
462	   what their interests are.  The subsections below are intended to at
463	   least partially capture different views of those issues.  They are
464	   included here in the hope that they will assist with focusing
465	   discussion and reduce the frequency with which arguments are
466	   repeated.  It is almost certain that the community does not have
467	   consensus on all of the points made below and that these blocks of
468	   text should be moved into other documents if they should be retained
469	   at all.

471	B.1.  A Perspective on Locations and Names

473	   Content industries (e.g., publishers) and memory organizations (e.g.,
474	   libraries, archives, and museums) invest a lot of resources in naming
475	   things and the topics of naming and classification are important
476	   information science issues.  Tens, if not hundreds, of millions of
477	   persistent identifiers have been assigned during the last decade.

479	   Several identifier systems have been developed for persistent and
480	   unique identification of resources.  When there is a real need to
481	   preserve something important (such as scientific publications,
482	   research data, government publications, etc.) for the long term, URNs
483	   or other persistent identifiers are used; URLs (or other generic
484	   URIs) are not being used for identification or even linking purposes.

486	   Naming and locating, e.g., for library resources, are both complex
487	   activities which have different aims.  Traditionally, naming and
488	   locating resources have been separate activities, and the rules for
489	   the former are much more stringent than for the latter.  The same
490	   principles are being applied to digital materials as well as more
491	   traditional ones.  In a library, any book, be it printed or digital,
492	   has both a unique and persistent International Standard Book Number
493	   (ISBN) and non-unique (each copy has its own) and short-lived
494	   location information which cannot be trusted in the long run.  The
495	   ISBN never changes, but both shelf locations and Web addresses
496	   usually do, many times during the book's life span.

498	   Giving location information a role in identification would not only
499	   force libraries to adopt different policies for printed and digital
500	   content, it would also undermine the value of existing identifier
501	   systems.  Let us assume that ten people independently upload a copy
502	   of an electronic book into different locations in the Web. Are all
503	   these ten URLs valid identifiers of the book?  And what is their
504	   relation to the ISBN or other identification information of the book
505	   such as its title?

507	   From the perspective of the communities who depend on persistent
508	   identifiers, critical issues include:

510	   1.  Resource identification has to be a managed process.  Assigning
511	       URIs generally is not.  Although it may be possible to introduce
512	       some level of control to URI assignment, a user cannot determine
513	       whether some URI is reliable or not.

515	   2.  Anyone may assign new URIs to resources even if these resources
516	       already have proper identifiers assigned to them.  Claiming that
517	       these URIs actually identify something undermines the value of
518	       proper identifiers.

520	   3.  There is no 1:1 relation between the resource identified and
521	       URIs.  An e-book in the Web may be represented as 1-n files
522	       (URIs), and a single file may contain several books.  And books
523	       are simple; we need to name very complex objects such as research
524	       data sets, or some component parts within these complex data
525	       sets.

527	   4.  One resource such as a scientific article is typically available
528	       from multiple locations, including (for instance) the publisher's
529	       document supply service, a university's open repositories and
530	       other cooperative repository systems, legal deposit collections
531	       and the Internet archive.  A resource should have one and only
532	       one identifier of a given type; URIs do not meet this
533	       requirement.

535	   5.  URIs relate to instances (copies) of resources, whereas
536	       traditionally identification has much broader scope.  Identifiers
537	       may be assigned to, e.g., an immaterial work (such as Hamlet),
538	       its expressions (e.g.  Finnish translation of Hamlet), and
539	       manifestations of works and expressions (e.g.  PDF version of
540	       Finnish translation of Hamlet).

542	   6.  Over time, different resources (or different versions of the same
543	       resource) may be found at the same non-URN URI.  A user has no
544	       way of knowing whether the resource has changed.  One of the
545	       basic principles for proper identifier systems is that the same
546	       identifier is never assigned to another resource.  In general,
547	       URIs do not meet this requirement.

549	   7.  Persistent identification must be available for resources that
550	       are available only in databases and other environments that are
551	       often described today as the "deep web".  URIs for these
552	       resources tend to be very complicated, and it will be difficult
553	       to keep them alive, even with the help of DNS redirection, when,
554	       e.g., the underlying database management system changes.

556	   8.  The role that the URI fragment and query could or should have
557	       in identification is unclear, and the statements in RFC 3986 are
558	       definitely problematic from the point of view of existing
559	       identifier systems and the management of naming.

561	          Does "fragment" identify a location or a certain section of a
562	          resource?  In the evolving set of URN Internet standards,
563	          fragment will not be a part of the Namespace Specific String.
564	          Then fragment only indicates a place / segment within the
565	          identified resource, but does not identify it.  If fragment
566	          had a role in identification, fragments would extend the scope
567	          of existing standard identifiers to component parts of
568	          resources.  For instance, anyone could use a URN based on ISBN
569	          + fragment to identify chapters of electronic books.

571	          Things get even more complicated with "query" since what the
572	          combination of an identifier and a query resolves to may not
573	          have anything to do with the original resource.  For instance,
574	          a URN based on ISBN + query may resolve to the metadata record
575	          describing the book.  These records have their own identifiers
576	          which are not based on ISBNs.

578	   9.  For many organizations, persistence means decades or centuries.
579	       Anything that is protocol dependent will eventually fail.  URLs
580	       do not change by themselves, but in the long run it is very
581	       difficult for people to not change them or the objects to which
582	       they point.

584	          The mention of centuries is intentional.  Content industries,
585	          memory organizations (such as national and repository
586	          libraries and national archives) and universities and other
587	          research organizations need identifiers that will persist for
588	          hundreds of years.  Such identifiers might even need to
589	          outlast the institutions themselves, and definitely should be
590	          usable even if current technologies such as the Web and the
591	          Internet cease to exist or are supplanted by something new (as
592	          unlikely as that might seem today).

594	          In addition, operations on, or additional specifications
595	          about, names and the associated objects must be possible, as
596	          stable as the names themselves, and reasonably efficient.  For
597	          example, if a URN were assigned to an encyclopedia that
598	          consisted of many volumes, it should be feasible to identify
599	          (and locate and retrieve if that were desired) a particular
600	          volume or even a particular article without accessing or
601	          retrieving the entire set.
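
   As an illustration of item 8 above, the following is a minimal
   sketch, not a normative parser, of how a URN reference might be
   split so that query and fragment stay outside the Namespace
   Specific String.  The names UrnReference and split_urn_reference
   are hypothetical, and the splitting rule (first "#", then first
   "?") is an assumption made for the example rather than the syntax
   of any URN specification.

```python
from typing import NamedTuple


class UrnReference(NamedTuple):
    nid: str        # namespace identifier, e.g. "isbn"
    nss: str        # Namespace Specific String (the name proper)
    query: str      # text after "?", empty if absent
    fragment: str   # text after "#", empty if absent


def split_urn_reference(text: str) -> UrnReference:
    """Split off the fragment, then the query, then the "urn:<nid>:" prefix."""
    rest, _, fragment = text.partition("#")
    rest, _, query = rest.partition("?")
    scheme, _, rest = rest.partition(":")
    nid, sep, nss = rest.partition(":")
    if scheme.lower() != "urn" or not sep or not nid or not nss:
        raise ValueError(f"not a URN reference: {text!r}")
    # Assumption for this sketch: only the NID is case-normalized.
    return UrnReference(nid.lower(), nss, query, fragment)
```

   Under this split, "urn:isbn:1-4012-9876-1#Section2" and
   "urn:isbn:1-4012-9876-1?publisher" carry the same NSS, so neither
   the fragment nor the query changes what is named.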

603	B.2.  A More Pragmatic Perspective

605	   The subsection above provides an explanation of the reasons for this
606	   change and actually for a more radical separation of URNs from
607	   generic URIs.  That explanation is not without controversy,
608	   especially from those who make different assumptions about the
609	   future, or even interpretations of the present, than many members of
610	   the community (and especially members of the communities described in
611	   that section).  Some of those who do not accept the explanation above
612	   simply do not recognize and accept the distinctions on which it, and
613	   URNs more generally, are based, including the name-locator
614	   distinction.  In some cases, opposition to that explanation is quite
615	   pronounced, involving fundamental differences in philosophy that move
616	   beyond mere differences of opinion.

618	   Like most controversies in which one group does not accept the
619	   definitions, facts, or logic of another, the differences are unlikely
620	   to be resolved by further discussion, no matter how sensible and
621	   patient.  The material in this appendix is provided for the benefit
622	   of those who cannot accept Appendix B.1 or consider the discussion
623	   there to be meaningless.

625	   Put differently, the issue is ultimately not whether the perspective
626	   that Appendix B.1 reflects is, in some universal epistemology,
627	   correct or incorrect or even whether the consequences and
628	   implications of the introduction of the web and/or digital media
629	   render it hopelessly obsolete.  If only in their manifestation
630	   through national repository libraries and archives and setters of
631	   standards for them -- activities that have far more formal authority
632	   than the IETF or even W3C -- the community involved is relevant and
633	   legitimate.  If the IETF wishes to maintain authority over things
634	   that are called URNs, then those perceived needs probably need to be
635	   accommodated in some reasonable way... where "reasonable" is defined
636	   as much or more by those communities as by the IETF one.

638	   Independent of the details of the discussion above, in the case of
639	   URNs, the IETF is faced with a pair of problems that are ultimately
640	   faced sooner or later by all voluntary standards bodies: nothing
641	   except quality and broad community consensus prevents a standard from
642	   being ignored in the marketplace and nothing prevents another body
643	   from creating a competing standard.  The effort required to create a
644	   competing standard can be increased and its potential for confusion
645	   can be reduced somewhat by various measures -- measures the IETF has
646	   rarely tried to actually use -- but those measures are rarely
647	   effective when the other body is convinced that they have legitimate
648	   and significant needs that differ from the original standard.
649	   Because of those problems, the key question for the URN effort is
650	   ultimately not whether a clear enough distinction exists between
651	   names and locator or location-based information, nor whether
652	   "persistent" can be defined clearly enough, nor even whether the
653	   communities and requirements described in Appendix B.1 are valid or
654	   will be judged valid in retrospect in a few decades or centuries.
655	   Instead, the question is whether the IETF is willing to evolve and
656	   adapt the URN definition to accommodate those perceived needs or
657	   whether it prefers to have that work done elsewhere, either by
658	   adoption in the broader community and marketplace of a different
659	   approach or, potentially, even a competing URN standard.  If, in the
660	   long run, those other communities and perspectives turn out to be
661	   wrong, the additional features will atrophy.  But that would be true
662	   whether they are specified and standardized in the IETF or elsewhere.

664	B.3.  A More Radical (or Most Conservative) View of URNs and Their Role

666	   [[CREF3: The text in this subsection was derived from an on-list
667	   discussion.  I believe it represents an even stronger position than
668	   RFC 3986 takes although I think similar positions have come up in
669	   other discussions.  Because of its origins, the writing style is
670	   somewhat different from the rest of this document.  Again, this text
671	   is provided for convenience and is not expected to survive into RFC
672	   publication--JcK ]]

674	   The essence of this position is that URNs are "just" names and that,
675	   insofar as one can talk about location or resolution services of
676	   various types, they are data associated with the URN (or underlying
677	   name) and are not only not part of the URN but they are useful only
678	   for constructing locator-type URIs to which the URN (name) is an
679	   argument.

681	   Suppose we have a URN that looks like

683	      urn:isbn:1-4012-9876-1

685	   It is really just a name.  Associations with objects are someone
686	   else's problem.  There is actually no requirement that an object
687	   exist, only that the publisher/registrant have sufficient intention
688	   to create an object to assign the code.  Now a query about metadata
689	   associated with that name makes perfect sense although there are
690	   questions about how far it should go (see below).  For example, one
691	   could invoke

693	      urn:isbn:1-4012-9876-1?publisher

695	   and, modulo some issues about queries being defined by the resource,
696	   have a more than reasonable expectation of getting back "DC Comics".
697	   But, since that is a name and not an object or the location of an
698	   object, I don't know what a fragment is.  One could certainly write

700	      urn:isbn:1-4012-9876-1#publisher

702	   or

704	      urn:isbn:1-4012-9876-1#1

706	   but, assuming one knows how ISBNs are constructed, the result would
707	   presumably be

709	      1-4012

711	   and not anything useful, since there is no object to retrieve and
712	   evaluate with regard to either media type or content.

714	   If we are going to maintain a strong name - object distinction, this
715	   approach makes a certain amount of sense.
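
   The position above can be sketched in a few lines, assuming a
   hypothetical in-memory table (METADATA) that stands in for
   whatever service holds data associated with a name.  The function
   name query_name and the choice to reject fragments outright are
   assumptions made for illustration only; the single table entry is
   the example used in this subsection.

```python
# Hypothetical metadata table keyed by name.  A real deployment would
# sit behind some resolution service that this sketch does not model.
METADATA = {
    "urn:isbn:1-4012-9876-1": {"publisher": "DC Comics"},
}


def query_name(reference: str):
    """Answer a metadata query about a name; refuse fragments."""
    if "#" in reference:
        # Under this view there is no object to select a fragment from.
        raise ValueError("a bare name has no object for a fragment to select")
    name, _, key = reference.partition("?")
    record = METADATA[name]          # KeyError if the name is unknown
    return record[key] if key else record
```

   With this table, a query on "urn:isbn:1-4012-9876-1?publisher"
   returns "DC Comics", while any reference that carries a fragment
   is rejected before any lookup is attempted.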

717	   An extreme version of the argument that we can't have fragments on
718	   URNs because they are just names, not objects, might lead to the
719	   claim that the only way one gets "Section 2" of that book is with
720	   something like

722	      http://school-
723	      library.ps1234.k12.ma.example/?urn="isbn:1-4012-9876-1"&Section=2

725	   or, in two more general cases:

727	      myFavoriteLibraryRetrievalScheme://library-
728	      domain.example/?urn="isbn:1-4012-9876-1"&Section=2

730	   or maybe

732	      http://www.generic-
733	      bookseller.example/?urn="isbn:1-4012-9876-1"#Section=2

735	   In all three of those cases and some other variations we can think
736	   of, the URN is, itself, stable and persistent.  Neither the two
737	   schemes nor the domain parts associated with them need be.  If the
738	   fragment that refers to a section is valid, it is too (that doesn't
739	   make it part of the name -- that is a separate question).  The
740	   retrieval/resolution system is not a property of the URN.  Instead,
741	   the URN is a name-type argument -- an object identifier -- used as
742	   input to the retrieval system.
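
   A minimal sketch of that separation follows, using Python's
   urllib.parse.  The helper names locator_for and name_in are
   hypothetical, and the query parameter names ("urn", "Section")
   are taken from the examples above.  Note that urlencode
   percent-encodes the colon and omits the quotation marks shown in
   those examples, so the locators produced here differ in surface
   form from the ones written out above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

URN = "urn:isbn:1-4012-9876-1"


def locator_for(base: str, urn: str, section: str) -> str:
    """Build a locator-type URI that takes the name as an argument."""
    name = urn[len("urn:"):] if urn.lower().startswith("urn:") else urn
    return f"{base}?{urlencode({'urn': name, 'Section': section})}"


def name_in(locator: str) -> str:
    """Recover the name argument from a locator built by locator_for."""
    return parse_qs(urlsplit(locator).query)["urn"][0]


# Two different retrieval systems, one unchanged name.
school = locator_for("http://school-library.ps1234.k12.ma.example/", URN, "2")
seller = locator_for("http://www.generic-bookseller.example/", URN, "2")
```

   The two locators differ in their domain parts and could differ in
   scheme, yet name_in returns the same "isbn:1-4012-9876-1" for
   both, which is the sense in which the name is an input to the
   retrieval system rather than a property of it.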

744	Appendix C.  Change Log

746	   [[CREF4: RFC Editor: Please remove this appendix before
747	   publication.]]

749	C.1.  Changes from draft-ietf-urnbis-urns-are-not-uris-00 to -01

751	   o  Revised Section 1 slightly and added some new material to try to
752	      address questions raised on the mailing list.

754	   o  Added Section 2, reflecting an email exchange.

756	   o  Added a Security Considerations section, replacing the placeholder
757	      in the previous version.

759	   o  Added Appendix B.2 and inserted a note in the material titled "A
760	      Perspective on Locations and Names" pointing to it (that material
761	      is in Appendix B.1 in the current version, but was Section 2 and
762	      then Section 3 in earlier versions).

764	   o  Added temporary Appendix B for this version only.

766	   o  Enhanced and updated the Acknowledgments section.

768	   o  The usual small clarifications and editorial changes.

770	C.2.  Changes from draft-ietf-urnbis-urns-are-not-uris-01 to draft-ietf-
771	      urnbis-semantics-clarif-00

773	   o  Changed title and file name to better reflect changes summarized
774	      below.  Note that the predecessor of this document was draft-ietf-
775	      urnbis-urns-are-not-uris-01.

777	   o  Revised considerably as discussed on the mailing list and at IETF
778	      90.  In particular, the document has been narrowed to change
779	      semantics only without affecting the relationship to URI syntax
780	      and the document title and other details changed to match.

782	   o  Dropped much of the original Introduction (moving it temporarily
783	      to an appendix) and trimmed the abstract to be consistent with the
784	      new, more limited scope.

786	   o  Revised Appendix B.2 to make "perceived requirement" more clear.

788	   o  Removed the former Appendix B, as promised in the previous draft,
789	      moved considerably more text into appendices, and added some new
790	      appendix text.  Note that the earlier text is temporarily
791	      referenced in Appendix B.3 above.  If we intend to keep that
792	      appendix material, we will have to drag at least part of the text
793	      back in from the earlier draft.

795	   o  Added new Section 6 to discuss the next round of decisions the WG
796	      will have to make, assuming the provisions of this specification
797	      are approved.

799	Author's Address
800	   John C Klensin
801	   1770 Massachusetts Ave, Ste 322
802	   Cambridge, MA  02140
803	   USA

805	   Phone: +1 617 245 1457
806	   Email: john-ietf@jck.com