idnits 2.17.1 

draft-ietf-urnbis-urns-are-not-uris-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 2) being 60 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document updates RFC3986, but the
     abstract doesn't seem to directly say this.  It does mention RFC3986
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 144 has weird spacing: '...nerally  is no...'

     (Using the creation date from RFC3986, updated by this document, for
     RFC5378 checks: 2002-11-01)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 7, 2014) is 3672 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'DeterministicURI' is defined on line 298, but no
     explicit reference was found in the text

  ** Obsolete normative reference: RFC 2141 (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 1738
     (Obsoleted by RFC 4248, RFC 4266)

  -- Duplicate reference: RFC2141, mentioned in 'RFC2141bis', was also
     mentioned in 'RFC2141'.

  -- Obsolete informational reference (is this intentional?): RFC 2141
     (Obsoleted by RFC 8141)


     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Uniform Resource Names (urnbis)                             J.C. Klensin
3	Internet-Draft                                             April 7, 2014
4	Updates: 3986 (if approved)
5	Intended status: Standards Track
6	Expires: October 07, 2014

8	              Names are Not Locators and URNs are Not URIs
9	              draft-ietf-urnbis-urns-are-not-uris-00.txt

11	Abstract

13	   Experience has shown that identifiers associated with persistent
14	   names are quite different from identifiers associated with the
15	   locations of objects.  This is especially true when such names are
16	   expected to be stable for a very long time or when they identify
17	   large and complex entities.  In order to allow Uniform Resource Names
18	   (URNs) to evolve to meet the needs of the Informational Sciences
19	   community and other users, this specification separates the syntax
20	   for URNs from the generic syntax for Uniform Resource Identifiers
21	   (URIs) specified in RFC 3986, updating the latter specification
22	   accordingly.

24	Status of this Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on October 07, 2014.

41	Copyright Notice

43	   Copyright (c) 2014 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents (http://trustee.ietf.org/
48	   license-info) in effect on the date of publication of this document.
49	   Please review these documents carefully, as they describe your rights
50	   and restrictions with respect to this document.  Code Components
51	   extracted from this document must include Simplified BSD License text
52	   as described in Section 4.e of the Trust Legal Provisions and are
53	   provided without warranty as described in the Simplified BSD License.

55	Table of Contents

57	   1.  Introduction
58	   2.  A perspective on locations and names
59	   3.  Changes to RFC 3986
60	   4.  Other Required Actions
61	   5.  Acknowledgments
62	   6.  Contributors
63	   7.  IANA Considerations
64	   8.  Security Considerations
65	   9.  References
66	     9.1.  Normative References
67	     9.2.  Informative References
68	   Author's Address

70	1.  Introduction

72	   The Internet community now has many years of experience with both
73	   name-type identifiers (notably Uniform Resource Names (URNs) [RFC2141]
74	   [RFC2141bis]) and location-based identifiers (notably Uniform
75	   Resource Locators (URLs) [RFC1738]).  That experience leads to the
76	   conclusion that it is impractical to constrain URNs to the syntax and
77	   high-level semantics of URLs.  Generalization from URLs to generic
78	   Uniform Resource Identifiers (URIs) [RFC3986], especially to name-
79	   based, high-stability, long-persistence, identifiers of the URN
80	   variety, has failed because the assumed similarities do not exist to
81	   a sufficient degree.  Ultimately, locators, which typically depend on
82	   particular accessing protocols and a specification relative to some
83	   physical space or network topology, are simply different creatures
84	   from long-persistence, location-independent, object identifiers.  The
85	   syntax and semantic constraints that are appropriate for locators are
86	   either irrelevant to or interfere with the needs of resource names as
87	   a class.  That was tolerable as long as the URN system didn't need
88	   additional capabilities but experience since RFC 2141 was published
89	   has shown that they are, in fact, needed.

91	   This specification updates the Generic URI Syntax specification
92	   [RFC3986] to exclude URNs from its coverage.  Put differently, with
93	   the publication of this specification, URNs are no longer considered
94	   a member of the class of URIs to which RFC 3986 applies.

96	   [[Note in draft: the above leaves it ambiguous as to whether it
97	   remains appropriate to call URNs "URIs".  That ambiguity is
98	   intentional and, if possible, the question should remain in the
99	   "someone else's problem" category.]]

101	   For URLs and such other URIs as may exist or be created in the
102	   future, this specification does not change the syntax rules and other
103	   requirements and recommendations of RFC 3986.

105	2.  A perspective on locations and names

106	   Content industries (e.g., publishers) and memory organizations (e.g.,
107	   libraries, archives, and museums) invest heavily in naming things,
108	   and the topics of naming and classification are important
109	   information science issues.  Tens, if not hundreds, of millions of
110	   persistent identifiers have been assigned during the last decade.

112	   Several identifier systems have been developed for persistent and
113	   unique identification of resources.  When there is a real need to
114	   preserve something important (such as scientific publications,
115	   research data, government publications, etc.) for the long term, URNs
116	   or other persistent identifiers are used; URLs (or other generic
117	   URIs) are not being used for identification or even linking purposes.

119	   Naming and locating resources (e.g., library resources) are both
120	   complex activities with different aims.  Traditionally, naming and
121	   locating resources have been separate activities, and the rules for
122	   the former are much more stringent than for the latter.  The same
123	   principles are being applied to digital materials as well as more
124	   traditional ones.  In a library, any book, be it printed or digital,
125	   has both a unique and persistent International Standard Book Number
126	   (ISBN) and non-unique (each copy has its own location information)
127	   and short-lived location information which cannot be trusted in the
128	   long run.  The ISBN never changes, but both shelf locations and Web
129	   addresses usually do, many times during the book's life span.
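
   As a minimal illustrative sketch (not part of this specification),
   the separation between a persistent name and its short-lived
   locations can be modeled as a table keyed by the name.  The URN, the
   URLs, and the helper name `relocate` below are hypothetical
   placeholders chosen for this example only.

```python
# Hypothetical data: one persistent name, several short-lived locations.
locations = {
    "urn:isbn:9780000000002": [
        "http://library.example/shelf/12/item-7",
        "http://repository.example/books/42.pdf",
    ],
}


def relocate(table, name, old_url, new_url):
    """Replace one location of a named resource; the name is untouched."""
    table[name] = [new_url if url == old_url else url for url in table[name]]


# A copy moves to a new address; the key (the name) stays the same.
relocate(
    locations,
    "urn:isbn:9780000000002",
    "http://repository.example/books/42.pdf",
    "https://archive.example/2014/books/42.pdf",
)
```

   In this model a move only rewrites an entry in the list of
   locations, while anything that cites the name remains valid.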

131	   Giving location information a role in identification would not only
132	   force libraries to adopt different policies for printed and digital
133	   content, it would also undermine the value of existing identifier
134	   systems.  Let us assume that ten people independently upload a copy
135	   of an electronic book into different locations in the Web.  Are all
136	   these ten URLs valid identifiers of the book?  And what is their
137	   relation to the ISBN or other identification information of the book
138	   such as its title?

140	   From the perspective of the communities who depend on persistent
141	   identifiers, critical issues include:

143	   1.  Resource identification has to be a managed process.  Assigning
144	       URIs generally  is not.  Although it may be possible to introduce
145	       some level of control to URI assignment, a user cannot determine
146	       whether some URI is reliable or not.

148	   2.  Anyone may assign new URIs to resources even if these resources
149	       already have proper identifiers assigned to them.  Claiming that
150	       these URIs actually identify something undermines the value of
151	       proper identifiers.

153	   3.  There is no 1:1 relation between the resource identified and
154	       URIs.  An e-book in the Web may be represented as 1-n files
155	       (URIs), and a single file may contain several books.  And books
156	       are the simple case; we also need to name very complex objects
157	       such as research data sets, or some component parts within
158	       these complex data sets.

160	   4.  One resource such as a scientific article is typically available
161	       from multiple locations, including (for instance) the publisher's
162	       document supply service, a university's open repositories and
163	       other cooperative repository systems, legal deposit collections
164	       and the Internet archive.  A resource should have one and only
165	       one identifier of a given type; URIs do not meet this
166	       requirement.

168	   5.  URIs relate to instances (copies) of resources, whereas
169	       traditionally identification has a much broader scope.
170	       Identifiers may be assigned to, e.g., an immaterial work (such as
171	       Hamlet), its expressions (e.g., a Finnish translation of Hamlet),
172	       and manifestations of works and expressions (e.g., a PDF version
173	       of a Finnish translation of Hamlet).

175	   6.  Over time, different resources (or different versions of the same
176	       resource) may be found at the same non-URN URI.  A user has no
177	       way of knowing whether the resource has changed.  One of the
178	       basic principles for proper identifier systems is that the same
179	       identifier is never assigned to another resource.  In general,
180	       URIs do not meet this requirement.

182	   7.  Persistent identification must be available for resources which
183	       are available only in databases and other environments that are
184	       often referred to today as the "deep web".  URIs for these
185	       resources tend to be very complicated and it will be difficult to
186	       keep them alive even with the help of DNS redirection when, e.g.,
187	       the underlying database management system changes.

189	   8.  The role that URI fragment and query components could or should
190	       have in identification is unclear and the statements in RFC 3986
191	       are definitely problematic from the points of view of existing
192	       identifier systems and management of naming.

194	   Does a fragment identify a location or a certain section of a
195	   resource?  In the evolving set of URN Internet standards, a fragment
196	   will not be part of the Namespace Specific String.  A fragment then
197	   only indicates a place / segment within the identified resource, but
198	   does not identify it.  If fragments had a role in identification,
199	   they would extend the scope of existing standard identifiers to
200	   component parts of resources.  For instance, anyone could use a URN
201	   based on an ISBN + fragment to identify chapters of electronic books.

203	   Things get even more complicated with a query, since what an
204	   identifier + query resolves to may not have anything to do with the
205	   original resource.  For instance, a URN based on an ISBN + query may
206	   resolve to the metadata record describing the book.  These records
207	   have their own identifiers which are not based on ISBNs.
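
   The distinction drawn above can be illustrated with a short sketch.
   It assumes, purely for illustration, that a fragment is introduced by
   "#" and a query by "?" (as in RFC 3986) and that neither is part of
   the name proper; the evolving URN specifications may define these
   components differently.  The function name `split_urn_reference` and
   the sample ISBN are hypothetical.

```python
def split_urn_reference(ref):
    """Split a reference into (name, query, fragment).

    Assumption for this sketch: "#" starts a fragment and "?" starts a
    query, and neither belongs to the name that identifies the resource.
    Absent components are returned as None.
    """
    rest, _, fragment = ref.partition("#")
    name, _, query = rest.partition("?")
    return name, query or None, fragment or None


# The name is the identifier proper; the fragment only points to a place
# within the identified resource and identifies nothing by itself.
name, query, fragment = split_urn_reference("urn:isbn:9780000000002#chapter-3")
```

   Under this assumption two references that differ only in their
   fragment or query share the same name, which matches the view that
   those components indicate but do not identify.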

209	   [[Note in draft: Most of the discussion above may belong in 2141bis
210	   rather than here.]]

211	   9.  For many organizations, persistence means decades or centuries.
212	       Anything that is protocol dependent will eventually fail.  URLs
213	       do not change by themselves, but in the long run it is very
214	       difficult for people to not change them or the objects to which
215	       they point.

217	   The mention of centuries is intentional.  Content industries, memory
218	   organizations (such as national and repository libraries and national
219	   archives), universities, and other research organizations need
220	   identifiers that will persist for hundreds of years.  Such
221	   identifiers might even need to outlast the institutions themselves,
222	   and definitely should be usable even if current technologies such as
223	   the Web and the Internet cease to exist or are supplanted by
224	   something new (as unlikely as that might seem today).

226	   In addition, operations on, or additional specifications about, names
227	   and the associated objects must be possible, as stable as the names
228	   themselves, and reasonably efficient.  For example, if a URN were
229	   assigned to an encyclopedia that consisted of many volumes, it should
230	   be feasible to identify (and locate and retrieve if that were
231	   desired) a particular volume or even a particular article without
232	   accessing or retrieving the entire set.

234	3.  Changes to RFC 3986

236	   This specification removes URNs from the scope of RFC 3986.  It makes
237	   no changes for URI types that remain within that scope.

239	4.  Other Required Actions

241	   The basic URN syntax specification [RFC2141] was published well
242	   before RFC 3986 and therefore does not depend on it.  Successors to
243	   that specification will need to fully spell out the syntax and
244	   semantics of URNs without generic or implicit reference to any URI
245	   specification.
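
   To illustrate what a self-contained syntax definition might look
   like, the following sketch checks a string against the basic
   "urn:" NID ":" NSS form of RFC 2141 without using any generic URI
   parser.  It is an approximation written for this example, not the
   syntax that successor specifications will define: the character set
   is a reading of the RFC 2141 grammar that rejects the reserved
   characters "/", "?", and "#" when unescaped, and the names `URN_RE`
   and `parse_urn` are hypothetical.

```python
import re

# Sketch of the RFC 2141 form: "urn:" NID ":" NSS.
# NID: a letter or digit followed by up to 31 letters, digits, or hyphens.
# NSS: one or more letters, digits, listed punctuation, or %XX escapes.
URN_RE = re.compile(
    r"urn:"
    r"(?P<nid>[A-Za-z0-9][A-Za-z0-9-]{0,31}):"
    r"(?P<nss>(?:[A-Za-z0-9()+,\-.:=@;$_!*']|%[0-9A-Fa-f]{2})+)",
    re.IGNORECASE,
)


def parse_urn(text):
    """Return (nid, nss) for a syntactically plausible URN, else None."""
    match = URN_RE.fullmatch(text)
    # RFC 2141 reserves the NID "urn" to avoid confusion with the prefix.
    if match is None or match.group("nid").lower() == "urn":
        return None
    # The leading "urn:" and the NID are case-insensitive; the NSS is kept.
    return match.group("nid").lower(), match.group("nss")


result = parse_urn("URN:ISBN:0-395-36341-1")
```

   Because the pattern is written out in full, nothing in the check
   depends on the generic URI grammar of RFC 3986.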

247	5.  Acknowledgments

249	   This specification was inspired by a search in the IETF URNBIS WG for
250	   other alternatives that would both satisfy the needs of persistent
251	   name-type identifiers and still fully conform to the specifications
252	   and intent of RFC 3986.  That search lasted several years and
253	   considered many alternatives.  Discussions with Leslie Daigle, Juha
254	   Hakala, Barry Leiba, Keith Moore, Andrew Newton, and Peter Saint-
255	   Andre during the last quarter of 2013 and the first quarter of 2014
256	   were particularly helpful in getting to the conclusion that a
257	   conceptual separation of notions of location-based identifiers (e.g.,
258	   URLs) and the types of persistent identifiers represented by URNs was
259	   necessary.  Peter Saint-Andre provided significant text in a pre-
260	   publication review.

262	6.  Contributors

263	   Juha Hakala contributed most of the text of Section 2.

265	      Contact Information:
266	   Juha Hakala
267	   The National Library of Finland
268	   P.O. Box 15, Helsinki University
269	   Helsinki, MA FIN-00014
270	   Finland
271	   Email: juha.hakala@helsinki.fi

273	7.  IANA Considerations

275	   [[RFC Editor: Please remove this section before publication.]]

277	   This memo is not believed to require any action on IANA's part.  In
278	   particular, we note that there is a collection of "Uniform Resource
279	   Identifier (URI) Schemes" that does not include URNs and a series of
280	   URN-specific registries that do not rely on the URI specifications.

282	8.  Security Considerations

284	   All drafts are required to have a security considerations section.

286	9.  References

288	9.1.  Normative References

290	   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

292	   [RFC3986]  Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
293	              Resource Identifier (URI): Generic Syntax", STD 66, RFC
294	              3986, January 2005.

296	9.2.  Informative References

298	   [DeterministicURI]
299	              Mazahir, O., Thaler, D. and G. Montenegro, "Deterministic
300	              URI Encoding", February 2014, <http://www.ietf.org/id/
301	              draft-montenegro-httpbis-uri-encoding-00.txt>.

303	   [RFC1738]  Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform
304	              Resource Locators (URL)", RFC 1738, December 1994.

306	   [RFC2141bis]
307	              Saint-Andre, P., "Uniform Resource Name (URN) Syntax",
308	              January 2014, <https://datatracker.ietf.org/doc/draft-
309	              ietf-urnbis-rfc2141bis-urn/>.

311	Author's Address

312	   John C Klensin
313	   1770 Massachusetts Ave, Ste 322
314	   Cambridge, MA 02140
315	   USA

317	   Phone: +1 617 245 1457
318	   Email: john-ietf@jck.com