idnits 2.17.1
draft-ietf-urnbis-rfc2141bis-urn-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
-- The draft header indicates that this document obsoletes RFC2141, but the
abstract doesn't seem to directly say this. It does mention RFC2141
though, so this could be OK.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== The document seems to lack the recommended RFC 2119 boilerplate, even if
it appears to use RFC 2119 keywords -- however, there's a paragraph with
a matching beginning. Boilerplate error?
(The document does seem to have the reference to RFC 2119 which the
ID-Checklist requires).
== The document seems to contain a disclaimer for pre-RFC5378 work, but was
first submitted on or after 10 November 2008. The disclaimer is usually
necessary only for documents that revise or obsolete older RFCs, and that
take significant amounts of text from those RFCs. If you can contact all
authors of the source material and they are willing to grant the BCP78
rights to the IETF Trust, you can and should remove the disclaimer.
Otherwise, the disclaimer is needed and you can ignore this comment.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- The document date (March 12, 2012) is 4421 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
** Obsolete normative reference: RFC 4395 (Obsoleted by RFC 7595)
== Outdated reference: A later version (-09) exists of
draft-ietf-urnbis-rfc3406bis-urn-ns-reg-02
-- Obsolete informational reference (is this intentional?): RFC 615
(Obsoleted by RFC 645)
-- Obsolete informational reference (is this intentional?): RFC 1738
(Obsoleted by RFC 4248, RFC 4266)
-- Obsolete informational reference (is this intentional?): RFC 1808
(Obsoleted by RFC 3986)
-- Obsolete informational reference (is this intentional?): RFC 2141
(Obsoleted by RFC 8141)
-- Obsolete informational reference (is this intentional?): RFC 2396
(Obsoleted by RFC 3986)
-- Obsolete informational reference (is this intentional?): RFC 2611
(Obsoleted by RFC 3406)
-- Obsolete informational reference (is this intentional?): RFC 2717
(Obsoleted by RFC 4395)
-- Obsolete informational reference (is this intentional?): RFC 2718
(Obsoleted by RFC 4395)
-- Obsolete informational reference (is this intentional?): RFC 3406
(Obsoleted by RFC 8141)
-- Obsolete informational reference (is this intentional?): RFC 5226
(Obsoleted by RFC 8126)
Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 12 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 IETF URNbis WG A. Hoenes, Ed.
3 Internet-Draft TR-Sys
4 Obsoletes: 2141 (if approved) March 12, 2012
5 Intended status: Standards Track
6 Expires: September 13, 2012
8 Uniform Resource Name (URN) Syntax
9 draft-ietf-urnbis-rfc2141bis-urn-02
11 Abstract
13 Uniform Resource Names (URNs) are intended to serve as persistent,
14 location-independent, resource identifiers. This document serves as
15 the foundation of the 'urn' URI Scheme according to RFC 3986 and sets
16 forward the canonical syntax for URNs, which subdivides URNs into
17 "namespaces". A discussion of both existing legacy and new
18 namespaces and requirements for URN presentation and transmission are
19 presented. Finally, there is a discussion of URN equivalence and how
20 to determine it. This document supersedes RFC 2141.
22 The requirements and procedures for URN Namespace registration
23 documents are set forth in BCP 66, for which RFC 3406bis is the
24 companion revised specification document replacing RFC 3406.
26 Discussion
28 Comments are welcome on the urn@ietf.org mailing list (or sent to the
29 document editor). The home page of the URNbis WG is located at
30 .
32 Status of This Memo
34 This Internet-Draft is submitted in full conformance with the
35 provisions of BCP 78 and BCP 79.
37 Internet-Drafts are working documents of the Internet Engineering
38 Task Force (IETF). Note that other groups may also distribute
39 working documents as Internet-Drafts. The list of current Internet-
40 Drafts is at http://datatracker.ietf.org/drafts/current/.
42 Internet-Drafts are draft documents valid for a maximum of six months
43 and may be updated, replaced, or obsoleted by other documents at any
44 time. It is inappropriate to use Internet-Drafts as reference
45 material or to cite them other than as "work in progress."
47 This Internet-Draft will expire on September 13, 2012.
49 Copyright Notice
51 Copyright (c) 2012 IETF Trust and the persons identified as the
52 document authors. All rights reserved.
54 This document is subject to BCP 78 and the IETF Trust's Legal
55 Provisions Relating to IETF Documents
56 (http://trustee.ietf.org/license-info) in effect on the date of
57 publication of this document. Please review these documents
58 carefully, as they describe your rights and restrictions with respect
59 to this document. Code Components extracted from this document must
60 include Simplified BSD License text as described in Section 4.e of
61 the Trust Legal Provisions and are provided without warranty as
62 described in the Simplified BSD License.
64 This document may contain material from IETF Documents or IETF
65 Contributions published or made publicly available before November
66 10, 2008. The person(s) controlling the copyright in some of this
67 material may not have granted the IETF Trust the right to allow
68 modifications of such material outside the IETF Standards Process.
69 Without obtaining an adequate license from the person(s) controlling
70 the copyright in such materials, this document may not be modified
71 outside the IETF Standards Process, and derivative works of it may
72 not be created outside the IETF Standards Process, except to format
73 it for publication as an RFC or to translate it into languages other
74 than English.
76 Table of Contents
78 1. Introduction
79 1.1. Historical Perspective and Motivation
80 1.2. Background on Properties of URNs
81 1.3. Objective of this Memo
82 1.4. Requirement Language
83 2. URN Syntax
84 2.1. Namespace Identifier (NID) Syntax
85 2.2. Namespace Specific String (NSS) Syntax
86 2.3. Special and Reserved Characters
87 2.3.1. Delimiter Characters
88 2.3.2. The Percent Character, Percent-Encoding
89 2.3.3. Other Excluded Characters
90 3. Support of Existing Legacy Naming Systems and New Naming
91 Systems
92 4. URN Presentation and Transport
93 5. Lexical Equivalence of URNs
94 5.1. Examples of Lexical Equivalence
95 6. Functional Equivalence of URNs
96 7. The 'urn' URI Scheme
97 7.1. Registration of URI Scheme 'urn'
98 8. Security Considerations
99 9. IANA Considerations
100 10. Acknowledgements
101 11. References
102 11.1. Normative References
103 11.2. Informative References
104 Appendix A. Handling of URNs by URL Resolvers/Browsers
105 Appendix B. Collected ABNF (Informative)
106 Appendix C. Breakdown of NSS Syntax Evolution since RFC 2141
107 (Informative)
108 Appendix D. Changes since RFC 2141 (Informative)
109 D.1. Essential Changes from RFC 2141
110 D.2. Changes from RFC 2141 to Individual Draft -00
111 D.3. Changes from Individual Draft -00 to -02
112 D.4. Changes from Individual Draft -02 to WG Draft -00
113 D.5. Changes from WG Draft -00 to WG Draft -01
114 D.6. Changes from WG Draft -01 to WG Draft -02
115 Appendix E. How to Locate IETF Documents (Informative)
117 1. Introduction
119 Uniform Resource Names (URNs) are intended to serve as persistent,
120 location-independent, resource identifiers and are designed to make
121 it easy to map other namespaces (that share the properties of URNs)
122 into URI-space. Therefore, the URN syntax provides a means to encode
123 character data in a form that can be sent in existing protocols,
124 transcribed on most keyboards, etc.
126 To this end, URNs are designed as an intrinsic part of the more
127 general framework of Uniform Resource Identifiers (URIs); 'urn' is a
128 particular URI Scheme (according to STD 66, RFC 3986 [RFC3986] and
129 BCP 35, RFC 4395 [RFC4395]) that is dedicated to forming a
130 hierarchical framework for persistent identifiers.
132 The first level of hierarchy is given by the classification of URIs
133 into "URI Schemes", and for URNs, the second level is organized into
134 "URN Namespaces". Henceforth both terms are used in this
135 capitalization to distinguish them from the more general common
136 meaning of "scheme" and "namespace".
138 It is an explicit design goal that pre-existing systems of persistent
139 identifiers are mapped into the URN framework. Ordinarily, each such
140 traditional identifier system (namespace) -- standard or otherwise --
141 will occupy its own URN Namespace. However, shared URN Namespaces
142 are possible (and in fact, already exist), but the identifier-driven
143 mechanisms needed to distinguish the originating namespaces make
144 registration and maintenance of such URN Namespaces more complicated.
146 URN (as a URI Scheme) as such does not have a specific scope. The
147 applicability of the URN system, that is, the totality of the
148 resources that URNs can be assigned to, is the union of all
149 identifier systems that have an associated registered URN Namespace.
150 Ideally every new namespace will thus extend the URN applicability.
152 1.1. Historical Perspective and Motivation
154 Since this RFC will be of particular interest for groups and
155 individuals that are interested in persistent identifiers in general
156 and not in continuous contact with the IETF and the RFC series, this
157 section gives a brief outline of the evolution of the matter over
158 time. Appendix E gives hints on how to obtain RFCs and related
159 information.
161 Attempts to define generally applicable identifiers for network
162 resources go back to the mid-1970s. Among the applicable RFCs is RFC
163 615 [RFC0615], which subsequently has been obsoleted by RFC 645
164 [RFC0645].
166 The seminal document in the RFC series regarding URIs (Uniform
167 Resource Identifiers) for use with the World Wide Web (WWW) was RFC
168 1630 [RFC1630], published in 1994. In the same year, the general
169 concept of Uniform Resource Names was laid down in RFC 1737
170 [RFC1737] and that of Uniform Resource Locators in RFC 1736
171 [RFC1736].
173 The original formal specification of URN Syntax, RFC 2141 [RFC2141]
174 was adopted in 1997. That document was based on the original
175 specification of URLs (Uniform Resource Locators) in RFC 1738
176 [RFC1738] and RFC 1808 [RFC1808], which later on, in 1998, was
177 generalized and consolidated in the Generic URI specification,
178 RFC 2396 [RFC2396]. Most parts of these URI/URL documents were
179 superseded in 2005 by STD 66, RFC 3986 [RFC3986]. Notably, RFC 2141
180 makes (essentially normative) reference to a draft version of
181 RFC 2396.
183 Over time, the terms "URI", "URL", and "URN" have been refined and
184 slightly shifted according to emerging insight and use. This has
185 been clarified in a joint effort of the IETF and the World Wide Web
186 Consortium (W3C), published in 2002 for the IETF in RFC 3305 [RFC3305].
188 The wealth of URI Schemes and URN Namespaces needs to be organized in
189 a persistent way, in order to guide application developers and users
190 to the standardized top level branches and the related
191 specifications. These registries are maintained by the Internet
192 Assigned Numbers Authority (IANA) [IANA] at [IANA-URI] and
193 [IANA-URN], respectively. Registration procedures for URI Schemes
194 originally had been laid down in RFC 2717 [RFC2717] and guidelines
195 for the related specification documents were given in RFC 2718
196 [RFC2718]. These documents have been obsoleted and consolidated into
197 BCP 35, RFC 4395 [RFC4395], which is based on, and aligned with,
198 RFC 3986.
200 Note that RFC 2141 predates RFC 2717 and, although the 'urn' URI
201 scheme traditionally was listed in [IANA-URI] with a pointer to
202 RFC 2141, this registration has never been performed formally.
204 Similarly, the URN Namespace definition and registration mechanisms
205 originally have been specified in RFC 2611 [RFC2611], which has been
206 obsoleted by BCP 66, RFC 3406 [RFC3406]. Guidelines for documents
207 prescribing IANA procedures have been revised as well over the years,
208 and at the time of this writing, BCP 26, RFC 5226 [RFC5226] is the
209 normative document. Neither RFC 4395 nor RFC 3406 conforms to
210 RFC 5226.
212 Early documents specifying URI and URN syntax, including RFC 2141,
213 made use of an ad-hoc variant of the original Backus-Naur Form (BNF)
214 that never has been formally specified.
216 Over the years, the IETF has shifted to the use of a predominant
217 formal language used to define the syntax of textual protocol
218 elements, dubbed "Augmented Backus-Naur Form" (ABNF). The
219 specification of ABNF also has evolved, and now STD 68, RFC 5234
220 [RFC5234] is the normative document for it (that also will be used in
221 this RFC).
223 1.2. Background on Properties of URNs
225 This section aims at quoting requirements as identified in the past;
226 it does not attempt to revise or redefine these requirements, but it
227 gives some hints where more than a decade of experience with URNs has
228 shed a different light on past views. The citations below are given
229 here to make this document self-contained and avoid normative down-
230 references to old work.
232 RFC 1737 [RFC1737] defined the purpose of URNs as follows:
234 o The purpose or function of a URN is to provide a globally unique,
235 persistent identifier used for recognition, for access to
236 characteristics of the resource, or for access to the resource
237 itself.
239 Section 2 of RFC 1737 [RFC1737] listed the functional requirements
240 for URNs (quote slightly edited to reflect the time passed since that
241 RFC was written and the actual definition of the URN scheme that has
242 happened):
244 o Global scope: A URN is a name with global scope which does not
245 imply a location. It has the same meaning everywhere.
247 o Global uniqueness: The same URN will never be assigned to two
248 different resources.
250 o Persistence: It is intended that the lifetime of a URN be
251 permanent. That is, the URN will be globally unique forever, and
252 may well be used as a reference to a resource well beyond the
253 lifetime of the resource it identifies or of any naming authority
254 involved in the assignment of its name.
256 o Scalability: URNs can be assigned to any resource that might
257 conceivably be available on the network, for hundreds of years.
259 o Legacy support: The URN scheme permits the support of existing
260 legacy naming systems, insofar as they satisfy the other
261 requirements described here. [...]
263 o Extensibility: The URN scheme permits future extensions.
265 o Independence: It is solely the responsibility of a name issuing
266 authority to determine the conditions under which it will issue a
267 name.
269 o Resolution: URNs will not impede resolution. [...]
271 The URN syntax described below also accommodates the fundamental
272 "Requirements for URN Encoding" in Section 3 of RFC 1737 [RFC1737],
273 as far as experience gained has not led to relaxing unrealistic
274 detail requirements:
276 o Single encoding: The encoding for presentation for people in clear
277 text, electronic mail and the like is the same as the encoding in
278 other transmissions.
280 o Simple comparison: A comparison algorithm for URNs is simple,
281 local, and deterministic. [...]
283 o Human transcribability: For URNs to be easily transcribable by
284 humans without error, they need to be short, use a minimum of
285 special characters, and be case insensitive. [...]
287 Note:
288 In particular, practice gained with active URN Namespaces has
289 shown that this former goal is rather unrealistic, since
290 usually preference is given to 1:1 usage of existing
291 namespaces, which might not have this property. However, we
292 hold that, at least, the rough kind of resource identified by a
293 URN should be easily recognizable for humans.
295 o Transport friendliness: A URN can be transported unmodified in the
296 common Internet protocols, such as TCP, SMTP, FTP, Telnet, etc.,
297 as well as printed paper.
299 o Machine consumption: A URN can be parsed by a computer.
301 o Text recognition: The encoding of a URN needs to enhance the
302 ability to find and parse URNs in free text.
304 1.3. Objective of this Memo
306 RFC 2141 does not seamlessly match current Internet Standards. The
307 primary objective of this document is the alignment with the URI
308 standard [RFC3986] and URI Scheme guidelines [RFC4395], the ABNF
309 standard [RFC5234] and the current IANA Guidelines [RFC5226] in
310 general.
312 Further, experience from emerging international efforts to establish
313 a general, distributed, stable URN resolution service has been taken
314 into account during the draft stage of this document.
316 For advancing the URN specification on the Internet Standards-Track,
317 it needs to be based on documents of comparable maturity. Therefore,
318 to further advancements of the formal maturity level of this RFC, it
319 deliberately makes normative references only to documents at Full
320 Standard or Best Current Practice level.
322 Thus, this replacement document for RFC 2141 should make it possible
323 to advance the URN framework on the Internet Standard maturity
324 ladder. All other related documents depend on it; therefore this is
325 the first step to undertake.
327 Out of scope for this document is a revision of the URN Namespace
328 Definition Mechanisms document, BCP 66. This is being undertaken in
329 a companion document, RFC 3406bis
330 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].
332 1.4. Requirement Language
334 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
335 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
336 document are to be interpreted as described in BCP 14 [RFC2119].
338 2. URN Syntax
340 This document defines the URI Scheme 'urn'. Hence, URNs are specific
341 URIs as specified in STD 66 [RFC3986]. The formal syntax definitions
342 below are given in ABNF according to STD 68 [RFC5234] and make use of
343 some "Core Rules" specified in Appendix B of that Standard and
344 several generic rules defined in Appendix A of RFC 3986.
346 The syntax definitions below do, and syntax definitions in dependent
347 documents MUST, conform to the URI syntax specified in RFC 3986, in
348 the sense that additional syntax rules must only constrain the
349 general rules from RFC 3986. In other words: a general URI parser
350 based on RFC 3986 MUST be able to parse any legal URN, and specific
351 semantics can be obtained from URN-specific parsing.
353 URNs conform to the <path-rootless> variant of the general URI syntax
354 specified in Section 3 of [RFC3986], reproduced here informally:
356 URI = scheme ":" path-rootless [ "?" query ] [ "#" fragment ]
358 path-rootless = segment-nz *( "/" segment )
359 segment-nz = 1*pchar
360 segment = *pchar
362 pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
364 In the case of URNs, we have:
366 scheme = "urn"
368 and for <path-rootless>, only a single segment is used, but the
369 following additional syntax rule is superimposed on <path-rootless>
370 to establish a level of hierarchy called "Namespace":
372 urn-path = NID ":" NSS
374 Here "urn" is the URI scheme name, <NID> is the Namespace Identifier,
375 and <NSS> is the Namespace Specific String. The colons are REQUIRED
376 separator characters.
378 Note that it is common practice in several existing URN Namespaces
379 (and fully supported by this syntax) to use additional colon(s) as
380 separator character(s) in order to introduce further level(s) of
381 hierarchy into the NSS syntax, where needed. (See also
382 Section 2.3.1 below.)
384 Per RFC 3986, the URN Scheme name (here "urn") is case-insensitive.
386 The Namespace ID (also a case-insensitive string) determines the
387 syntactic structure and the semantic interpretation of the Namespace
388 Specific String. Details on NID syntax can be found below in
389 Section 2.1, and the NSS syntax is elaborated upon in Section 2.2.
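As a minimal, non-normative sketch of the top-level structure described above, the following Python fragment splits a URN into its <NID> and <NSS> parts. The function name split_urn and the sample URNs are illustrative assumptions and do not appear in this document. The sketch checks only the "urn" scheme name and the two REQUIRED colons, not the detailed NID or NSS syntax, and it assumes that any <query> or <fragment> part has already been split off.

```python
from typing import Tuple


def split_urn(urn: str) -> Tuple[str, str]:
    """Split 'urn:<NID>:<NSS>' into (NID, NSS). Hypothetical helper."""
    # The first colon separates the URI scheme name from the rest.
    scheme, colon, rest = urn.partition(":")
    if not colon or scheme.lower() != "urn":  # scheme name is case-insensitive
        raise ValueError("scheme must be 'urn'")
    # The second colon separates NID from NSS; any further colons
    # belong to the NSS and are left untouched.
    nid, colon, nss = rest.partition(":")
    if not colon or not nid or not nss:
        raise ValueError("expected urn:<NID>:<NSS>")
    # The NID is case-insensitive, so it is lowercased for comparison;
    # the NSS is returned unchanged because its rules are namespace-specific.
    return nid.lower(), nss
```

Under these assumptions, split_urn("URN:ISBN:0-395-36341-1") yields ("isbn", "0-395-36341-1"), and additional colons inside the NSS are preserved as part of the NSS.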
391 Each particular URN Namespace is based on a specific document that
392 must normatively describe (among other things) the details of the
393 <NSS> values allowed in conjunction with the respective <NID>. The
394 syntax and semantics of these <NSS> values are ordinarily specified
395 by an existing persistent identifier system (namespace); for
396 instance, in the 'ISBN' URN Namespace, each NSS must be a valid ISBN.
397 Some URN Namespaces may have strict rules for well formed NSSs, while
398 some others may be far more relaxed. There may also be significant
399 differences regarding the identifier assignment process. The overall
400 specification requirements and registration procedures for URN
401 Namespaces are the subject of a dedicated companion document, BCP 66,
402 which has been updated for conformance to BCP 26 and alignment with
403 implementation experience in RFC 3406bis
404 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].
406 Notes:
408 RFC 2141 was published before the URI Generic Syntax was finalized
409 and therefore had to defer the decision on whether <query> and
410 <fragment> components are applicable to URNs. RFC 2141 therefore
411 has reserved the use of bare (unencoded) question mark ("?") and
412 hash ("#") characters in URNs for future usage in conformance with
413 the generic URI syntax.
415 URNs have been in use for more than a decade. Some user
416 communities want to be able to use these components (which are
417 split off by the high-level parsing rules of RFC 3986), or at
418 least the <fragment> component, in the context of their focal
419 URNs. Therefore, this document allows the designers of selected
420 URN Namespaces to specify the use of the <fragment> component with
421 URNs belonging to these Namespaces, whereas the specification of
422 usage of the <query> component is set aside for future
423 standardization efforts for URN resolution. Thus, this draft
424 allows both of these components in the general syntax.
426 ISSUE:
428 Regarding fragment identifiers, Section 3.5, para 1 of RFC 3986,
429 indicates that "The fragment identifier ... allows indirect
430 identification of a secondary resource by reference to a primary
431 resource and additional identifying information. The identified
432 secondary resource may be some portion or subset of the primary
433 resource, some view on representations of the primary resource, or
434 some other resource defined or described by those
435 representations." RFC 3986 continues in specifying that the
436 details of the interpretation of fragment identifiers are specific
437 to the media types returned upon resolution of a URI. The
438 entirety of the purposes mentioned in the above quote obviously
439 can only be achieved fully if the "consumer" of the URI becomes
440 aware of the fragment identifier as part of the requested URI,
441 since, e.g., secondary resources might consist of representations
442 that might only be available in particular media types. However, RFC
443 3986 subsequently (in the penultimate paragraph of Section 3.5)
444 specifies that the evaluation of fragment identifiers be a client-
445 side matter and browsers are to strip them from request URIs sent
446 in information retrieval protocols.
447 Based on this, contemporary web browsers do not communicate
448 fragment identifiers to the web server but perform fragment
449 selection locally on the returned (HTML) resource. To make things
450 even more complicated, the most popular media type (HTML) only
451 allows setting markers (which are anchor points in the
452 serialized media stream and used by browsers to identify a
453 specific position in the content) and does not allow browsers to
454 regularly identify actual, conceptual fragments of the media
455 delivered -- like, e.g., the "proper content" of a web page,
456 excluding navigation bars etc. -- so that in practice users have
457 got accustomed to understanding a "fragment" as actually
458 designating a *position* in the media, not a *part* of it.
460 Therefore, potential usage of <fragment> components in URNs is
461 rather limited and has to be considered very seriously by
462 designers of URN Namespaces that would like to make use of them.
463 URN Namespaces that rely on (unmodified) browser resolution via
464 HTTP/HTML cannot rely on the usage of fragment identifiers to
465 steer the resolution process. Thus, the use of fragment
466 identifiers only seems to be useful for URN Namespaces that are
467 intended to either (a) exclusively make use of resolution systems
468 / clients that can cope with handing off a full-featured URN
469 (including a possible fragment identifier) to the resolution
470 service, or (b) exclusively employ HTML/HTTP based resolution
471 systems / clients, i.e., where the resolution results are returned
472 as HTML such that web browsers can perform the fragment selection,
473 or as some other media type that better supports the
474 identification and actual selection of embedded fragments, even in
475 off-the-shelf web browsers -- perhaps possible for certain
476 variants of XML-based media types.
478 The syntax of <query> and <fragment> is defined in RFC 3986.
479 Question mark and hash sign remain reserved as separator characters
480 for these URI components and therefore MUST NOT appear unencoded in an
481 NSS. This rule guarantees backwards compatibility with existing URN
482 Namespaces and improves the compatibility of URN syntax with general
483 URI parsers.
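Because "?" and "#" cannot appear unencoded in an NSS, a generic parser can split off the <query> and <fragment> parts before any URN-specific processing takes place. The following Python fragment is a minimal sketch of that high-level split, following the order used by RFC 3986 (fragment first, then query). The function name split_urn_reference and the sample values are illustrative assumptions, not part of this document.

```python
def split_urn_reference(ref):
    """Return (urn, query, fragment); absent parts are None. Hypothetical helper."""
    # Everything after the first "#" is the fragment (it may contain "?").
    rest, hash_sign, fragment = ref.partition("#")
    # Within the remainder, everything after the first "?" is the query.
    urn, question_mark, query = rest.partition("?")
    return (
        urn,
        query if question_mark else None,
        fragment if hash_sign else None,
    )
```

With this sketch, an input without "?" or "#" is returned unchanged with both optional parts set to None, so the remaining string can be handed to URN-specific parsing as is.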
485 The <query> part MUST NOT be present in any *assigned* URN. This
486 specification reserves its use for future standardization related to
487 URN services and resolution. A <query> part can only be added to an
488 assigned URN and appear in a URI *reference* [RFC3986] to a URN that
489 is intended to be used with URN resolution services, and, in
490 accordance with the general specification of this part in RFC 3986,
491 its purpose is restricted to indicate the requested URN resolution
492 service and/or particular service aspects of the intended resolution
493 response, e.g., to select the kind of metadata sought about the given
494 object that is identified by the basic, assigned URN.
496 The <fragment> part is not generally allowed in URNs. It is only
497 applicable to URN Namespaces that specifically opt to support its
498 usage. Thus, a URN Namespace registration document MAY specify the
499 usage of <fragment> with URNs of that particular URN Namespace.
500 Absent a registered namespace definition based on this document and
501 RFC 3406bis that explicitly specifies its usage, URNs within a
502 particular URN Namespace MUST NOT contain a fragment identifier.
504 The use of fragment identifiers may be useful if the URN Namespace is
505 based on an existing identifier scheme that designates objects of
506 reasonable complexity such that there is a need to make reference to
507 parts of such resources in typical network access environments
508 without incurring the effort to assign and maintain different
509 (assigned) NSSs in such cases.
511 URN Namespaces will deal with various kinds of fragments. For
512 instance, publications can be divided into smaller parts -- journals
513 consist of volumes, issues and articles, and books may contain
514 chapters. These logical fragments are usually not fragments in the
515 sense of the deliberations in the URI Generic Syntax, and if so,
516 <fragment> MUST NOT be used. However, namespaces MAY have internal
517 means for identification of logical fragments such as journal
518 articles. For instance, the ISBN (International Standard Book
519 Number) system allows assignment of ISBN numbers to book chapters if
520 they are available as separate items. Namespace specific fragment
521 identification practices are beyond the scope of this document, since
522 they do not rely on URI Generic Syntax, and their application is the
523 primary RECOMMENDED way to deal with fragment identification. If a
524 namespace lacks this possibility, a URN Namespace definition SHOULD
525 define syntactical parts of its NSSs that amend the original
526 identifiers of the underlying namespace in a readily parseable way
527 and serve to allow assignment of URNs in that namespace to the
528 intended abstract fragments. A URN Namespace registration MAY forbid
529 all kinds of fragment identification (even if it were possible on the
530 basis of URI Generic Syntax), if the application rules and syntax of
531 the identifier do not allow identification of fragments. ISSN
532 (International Standard Serial Number) is an example of this kind of
533 identifier / namespace.
535 The use of <fragment> as specified in RFC 3986 is possible if and
536 only if (a) the URN Namespace is based on an existing identifier
537 scheme that designates objects of reasonable complexity such that there is
538 a need to make reference to parts of such resources in typical
539 network access environments; and (b) these parts will be identified
540 in the canonical manner of the media type(s) delivered upon URN
541 resolution. Direct resolution to them SHOULD be possible and
542 sustainable.
544 If in a given namespace URNs are never assigned to a particular
545 manifestation of a resource (for instance, a PDF version of a book),
546 but can be transferred from one manifestation to the next or apply to
547 all of them, <fragment> usage is forbidden. This applies also to the
548 situation when identified resources are works (without any references
549 to physical embodiments of the work).
551 The use of <fragment> SHOULD NOT be opted for if the underlying
552 namespace provides for the intrinsic possibility to identify such
553 parts or if there is a readily usable method to construct NSSs by
554 combining the existing identifiers with a component (or components)
555 to identify such parts in an easily discernable manner.
557 Whether the URI Generic Syntax is applied or not, there are various
558 ways in which fragment identifiers can be generated:
560 (a) Fragment identifiers (if any) are assigned individually to the
561 relevant fragments of a larger entity during the URN assignment
562 process. If a URN Namespace opts for this model, its
563 specification SHOULD describe the additional syntax restrictions
564 to be adhered to and the particulars of the (per-URN) assignment
565 process.
567 (b) A specific set of fragment identifiers is generally applicable
568 to all resources targeted by URNs of the specific URN Namespace.
569 In this case, the specification document MUST specify a finite
570 set of <fragment> values, or precise, generic rules for the
571 automated formation of syntactically valid fragment identifiers
572 for the particular URN Namespace. The specification SHOULD
573 indicate the treatment of syntactically valid <fragment> values
574 in case they are not semantically valid for a given base URN.
575 Absent such specification, the default is to ignore such
576 fragment identifiers.
578 URN resolver clients SHOULD pass a given <fragment> part of a URN
579 unchanged to the resolver service. The default URN resolution
580 behavior is to ignore any <fragment> part if either the applicable
581 URN Namespace definition did not specify its use, or if no specific
582 related information was available for the basic resource in case (b)
583 above, or if that basic URN plus fragment identifier has not been
584 assigned in case (a) above.
586 2.1. Namespace Identifier (NID) Syntax
588 The following is the syntax for the Namespace Identifier. To (i) be
589 consistent with all potential resolution schemes and (ii) not put any
590 undue constraints on any potential resolution scheme, Namespace
591 Identifiers are ASCII strings with the syntax:
593 NID = (ALPHA / DIGIT) 0*30(ALPHA / DIGIT / "-") (ALPHA / DIGIT)
595 Note:
596 The above definition is slightly more restrictive than it was in
597 RFC 2141, to better reflect common practice for "handle"-like
598 identifiers in other IETF protocols (a.k.a. "LDH" syntax) and
599 requirements from RFC 3406bis. RFC 3406bis contains further
600 syntax restrictions on NID strings.
602 ISSUE:
603 The above rule still allows NIDs that contain multiple adjacent
604 hyphens or have the form of decimal numbers or decimal number
605 ranges.
607 Should this be further restricted _in this document_ or is it
608 sufficient to defer to the additional (NID kind specific) rules in
609 RFC 3406bis and the common sense of URN Namespace authors and the
610 designated IANA experts?
611 Anyhow, such restrictions would be fully backward compatible -- as
612 is the above tightened rule -- because no NIDs have been defined
613 so far that would violate these restrictions. Hyphens have been
614 used only in the naming pattern for "Informal Namespace IDs" per
615 RFC 3406[bis].
617 The document editor senses the low level of discussion of this
618 issue as an indication that this Issue can be closed.
620 Namespace Identifiers are case-insensitive, so that for instance
621 "ISBN" and "isbn" refer to the same namespace.
623 To avoid confusion with the URI Scheme name "urn", the NID "urn" is
624 permanently reserved by this RFC and MUST NOT be used or registered.
626 Note:
627 This reservation is carried over unchanged from RFC 2141, for
628 historical reasons.
630 ISSUE:
631 Further possible reservations and/or details are out of scope for
632 this document, but might be within the scope of RFC 3406bis.
633 It has been suggested that no additional reservations should be
634 codified and the final decision in any case should be left to the
635 common sense of URN Namespace authors and the designated IANA
636 experts.
638 The document editor senses the low level of discussion of this
639 issue as an indication that this Issue can be closed.
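
The NID rules of this section (the ABNF rule, case-insensitivity, and
the reservation of "urn") can be sketched in Python as follows. This is
a minimal illustration, not part of the specification; the names
is_valid_nid and same_namespace are hypothetical, and any further NID
restrictions from RFC 3406bis are not checked.

```python
import re

# NID = (ALPHA / DIGIT) 0*30(ALPHA / DIGIT / "-") (ALPHA / DIGIT)
# Explicit ASCII character classes are used so that non-ASCII letters
# and digits are rejected.
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]")


def is_valid_nid(nid: str) -> bool:
    """Return True if nid matches the NID rule and is not the reserved "urn"."""
    if NID_RE.fullmatch(nid) is None:
        return False
    # The NID "urn" is permanently reserved; NIDs are case-insensitive.
    return nid.lower() != "urn"


def same_namespace(nid_a: str, nid_b: str) -> bool:
    """Compare two NIDs case-insensitively (hypothetical helper)."""
    return nid_a.lower() == nid_b.lower()
```

As the ISSUE text above notes, this rule still accepts NIDs with
adjacent hyphens: is_valid_nid("a--b") returns True in this sketch.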
641 2.2. Namespace Specific String (NSS) Syntax
643 As already required since RFC 1737, there is a single canonical
644 representation of the NSS portion of a URN.
646 The format of this single canonical form follows:
648 NSS = 1*pchar ; or equivalent: NSS = segment-nz
650 (<pchar> and <segment-nz> are defined in Section 3.3 of RFC 3986.)
652 Note: The informational Appendix C expands on the evolution of the
653 NSS syntax specification since RFC 2141.
655 ISSUE (for the record):
656 In comparison to RFC 2141, essentially now "&" and "~" are allowed
657 in the NSS syntax, in full conformance with the generic URI
658 syntax. On the other hand, the <reserved> characters are no longer
659 part of the formal syntax -- unfortunately (or erroneously) these
660 were included in the formal syntax rules of RFC 2141 and only
661 excluded after the fact in the prose, which in at least one
662 instance has led to a URN Namespace definition document that
663 allows <reserved> in the formal NSS syntax but does _not_ properly
664 exclude their use in the prose. The interpretation of "%" was
665 ambiguous in RFC 2141; it is now only allowed (in the formal
666 syntax and in the prose) in <pct-encoded> constructs.
668 The document editor senses that this change of the NSS syntax has
669 found consensus and that hence this Issue is regarded as closed.
671 Depending on the rules governing a namespace, valid identifiers in a
672 namespace might contain characters that are not members of the URN
673 character repertoire above (<pchar>). In order to achieve
674 conformance with this NSS specification, such strings MUST be
675 translated into canonical NSS format before embedding them into a
676 URN, using them as protocol elements, or otherwise passing them on to
677 other applications. Translation is done by encoding each character
678 outside the URN character repertoire as a sequence of octets using
679 UTF-8 encoding (STD 63 [RFC3629]), and the "percent-encoding" of each
680 of those octets as "%" followed by two <HEXDIG> characters. The
681 latter two characters form the hexadecimal representation of that
682 octet. (See Section 2.3.2 below for more details.)
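
The translation into canonical NSS format described above can be
sketched in Python as follows. This is an illustration under stated
assumptions, not a normative algorithm: the function name
to_canonical_nss is hypothetical, uppercase hexadecimal digits are an
arbitrary choice, and namespace-specific special characters (see
Section 2.3.2) are not handled.

```python
# Characters that may appear literally in an NSS: the <pchar> rule of
# RFC 3986 without <pct-encoded>.
PCHAR_LITERALS = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-._~"          # remainder of unreserved
    "!$&'()*+,;="   # sub-delims
    ":@"            # gen-delims also allowed in pchar
)


def to_canonical_nss(identifier: str) -> str:
    """Percent-encode every character outside the URN character repertoire.

    Each such character is encoded as UTF-8, and every resulting octet
    is written as "%" followed by two hexadecimal digits.
    """
    if not identifier:
        raise ValueError("an NSS must contain at least one character")
    if "\x00" in identifier:
        raise ValueError("the NUL octet must not be used in URNs")
    out = []
    for ch in identifier:
        if ch in PCHAR_LITERALS:
            out.append(ch)
        else:
            # "%" itself is not a literal, so it becomes "%25".
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

For example, to_canonical_nss("a b") yields "a%20b", and a literal
"%" in the underlying identifier is emitted as "%25".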
684 2.3. Special and Reserved Characters
686 The remaining printable characters not included in the <pchar>
687 repertoire comprise the generic delimiters and the reserved
688 characters, which are restricted for special use only. These
689 characters are discussed below, giving the specifics of why each
690 character is special or reserved.
692 2.3.1. Delimiter Characters
694 RFC 3986 [RFC3986] defines the general delimiter characters used in
695 URIs:
697 gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
699 From among the <gen-delims>, ":" and "@" are also included in the
700 <pchar> rule and hence allowed in the path components of URIs.
702 The at-character ("@") in generic URIs only has a specific meaning
703 when contained in the part, which is absent in URNs.
704 Hence, "@" is available in the part of URNs.
706 With URNs, the colon (":") is used as a delimiter character not only
707 between the scheme name ("urn") and the <NID>, but also between the
708 latter and the <NSS>, and many existing URN Namespaces additionally
709 use ":" to further subdivide a single RFC 3986 path segment in the
710 <NSS> in a hierarchical manner.
712 Note: Using ":" as a sub-delimiter in the path in favor of "/" is
713 attractive because it avoids possible complications that could arise
714 from accidental inappropriate use of relative URI references
715 [RFC3986] for URNs.
717 The characters "/", "?", and "#" separate path components and the
718 <query> and <fragment> parts in the generic URI syntax; they are
719 restricted to this role in URNs as well, although the path in URNs
720 only admits a single segment and hence "/" is not allowed.
721 Therefore, these characters MUST NOT appear literally in the <NSS>
722 part of a URN in unencoded form. Namespaces that need these
723 characters MUST employ in their URNs the appropriate percent-encoding
724 for each such character.
726 The square brackets ("[" and "]") also play a particular role when
727 contained in the <authority> part, which is absent in URNs. However,
728 for conformance with the generic URI syntax, they are not allowed
729 literally in the <NSS> component of URNs. If a specific URN
730 Namespace reflects semantics that require these characters, they MUST
731 be percent-encoded in the respective URNs.
733 2.3.2. The Percent Character, Percent-Encoding
735 The percent character ("%") is reserved in the URN syntax for
736 introducing the escape sequence for an octet that is either not a
737 printable ASCII character or reserved for special purposes, as
738 described in this section. A "%" character in a URN
739 MUST always be followed by two <HEXDIG> characters; the three
740 characters together semantically form an abstract
741 octet. Literal use of the "%" character in an underlying namespace
742 MUST therefore be encoded as "%25" in URNs for that namespace.
744 Namespaces MAY designate one or more characters from the URN
745 character repertoire as having special meaning for that namespace.
746 If the namespace also uses that character in a literal sense as well,
747 the character used in a literal sense MUST be encoded with "%"
748 followed by the hexadecimal representation of that octet. Further, a
749 character MUST NOT be percent-encoded if the character is not a
750 reserved character. Therefore, the process of registering a
751 namespace identifier shall include publication of a definition of
752 which characters have a special meaning to that namespace -- cf. RFC
753 3406bis [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].
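
The requirement that every "%" be followed by two hexadecimal digits
can be checked mechanically. The following Python sketch is one
possible way to do so; the function name
has_wellformed_percent_encoding is hypothetical, and the rejection of
"%00" reflects the NUL prohibition stated in Section 2.3.3.

```python
import re

# One percent-encoded octet: "%" HEXDIG HEXDIG (either case accepted).
PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def has_wellformed_percent_encoding(text: str) -> bool:
    """Return True if every "%" in text starts a valid "%XX" triplet."""
    if "%00" in text:
        # The NUL octet must not appear, even in percent-encoded form.
        return False
    # Remove all valid triplets; any "%" left over is malformed.
    return "%" not in PCT_TRIPLET.sub("", text)
```

This check is purely syntactic. It does not know which characters a
given namespace treats as special, so it cannot tell whether a
particular percent-encoding was required or forbidden there.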
755 2.3.3. Other Excluded Characters
757 The following list is included only for the sake of completeness. It
758 includes the characters discussed in Sections 2.3.1 and 2.3.2. Any
759 octets/characters on this list are explicitly NOT part of the URN
760 character repertoire, and if used in a URN, MUST be percent-
761 encoded.
763 excluded = CTL / SP ; control characters and space
764 / DQUOTE ; "
765 / "#" ; from <gen-delims>
766 / "%" ; see above
767 / "/" ; from <gen-delims>
768 / "<" / ">"
769 / "?" ; from <gen-delims>
770 / "[" ; from <gen-delims>
771 / "\"
772 / "]" ; from <gen-delims>
773 / "^"
774 / "`"
775 / "{" / "|" / "}"
776 / %x7F ; DEL (control character)
777 / %x80-FF ; non-ASCII
779 The NUL octet (0 hex) is renowned for a long history of trouble in
780 implementations. It MUST NOT be used in URNs, in either unencoded or
781 percent-encoded form.
783 In a textual context for a URN, the NSS part ends when an octet/
784 character from the excluded character set (<excluded>) is
785 encountered. The character from the excluded character set is NOT
786 part of the NSS.
788 The more general issue of discerning URNs in non-structured text is
789 not specific to URNs, but a general issue for recognizing URIs (by
790 humans or automata), and hence out of scope of this document.
792 3. Support of Existing Legacy Naming Systems and New Naming Systems
794 Any identifier to be used as a URN MUST be expressed in conformance
795 with the URI and URN syntax specifications ([RFC3986], this
796 document). If names from (existing or newly devised) namespaces
797 contain characters other than those defined for the URN character
798 set, they MUST be translated into canonical form as discussed in
799 Section 2.2.
801 On the other hand, every namespace specific string in a given URN
802 Namespace MUST be based on an identifier that conforms to the
803 requirements of the identifier system to which the URN Namespace is
804 assigned; in the simplest form, if the syntactical rules admit, the
805 NSS can be the original identifier. For instance, every legal NSS in
806 the ISBN Namespace must be a valid ISBN.
808 4. URN Presentation and Transport
810 The URN syntax defines the canonical format for URNs and all URN
811 transport and interchanges MUST take place in this format. Further,
812 all URN-aware applications MUST offer the option of displaying URNs
813 in this canonical form to allow for direct transcription (for example
814 by cut-and-paste techniques). Such applications MAY support display
815 of URNs in a more human-friendly form and may use a character set
816 that includes characters that aren't permitted in URN syntax as
817 defined in this RFC (that is, they may replace %-notation by
818 characters in some extended character set in display to humans).
820 Note: Such transformation for the purpose of presentation, if done
821 blindly without NID-specific knowledge of special character usage,
822 might introduce ambiguity, because in the cases described above in
823 the second paragraph of Section 2.3.2, the unescaped and percent-
824 escaped form of the same character might carry different semantics
825 in NSSs of some URN Namespaces.
827 5. Lexical Equivalence of URNs
829 For various purposes such as caching, it is often desirable to
830 determine whether two URNs are the same without resolving them. The
831 general-purpose means of doing so is by testing for "lexical
832 equivalence" as defined below.
834 Two URNs are lexically equivalent if they are octet-by-octet equal
835 after the following preprocessing:
836 1. normalize the case of the leading "urn" scheme name;
837 2. normalize the case of the NID;
838 3. normalize the case of any percent-encoding;
839 4. remove the <query> part of the URI, if present.
841 Note that percent-encoding MUST NOT be removed. It is an
842 implementation detail not affecting interoperability whether a URN
843 comparison function internally prefers normalization (in the above 3
844 steps) to lower or to upper case. Note also that <fragment> MUST NOT
845 be removed, since there is no lexical equivalence between the "base"
846 URN and one which uses <fragment> -- the former identifies the
847 resource as the whole; the latter just a part of it.
849 Some namespaces may define additional lexical equivalences, such as
850 case-insensitivity of the NSS (or parts thereof). Additional lexical
851 equivalences MUST be documented as part of Namespace registration and
852 MUST only have the effect of eliminating some of the false
853 negatives obtained by the procedure above; i.e., they MUST NOT say
854 that two URNs are not equivalent if the procedure above says they are
855 equivalent.
857 5.1. Examples of Lexical Equivalence
859 The following hypothetical URN comparisons highlight the lexical
860 equivalence definitions:
862 1- URN:foo:a123,456
863 2- urn:foo:a123,456
864 3- urn:FOO:a123,456
865 4- urn:foo:A123,456
866 5- urn:foo:a123%2C456
867 6- URN:FOO:a123%2c456
868 7- urn:foo:a123,456?xyz
869 8- urn:foo:a123,456#xyz
871 URNs 1, 2, 3, and 7 are all lexically equivalent. URN 4 is not
872 lexically equivalent to any of the other URNs of the above set. The
873 same holds for URN 8.
874 URNs 5 and 6 are only lexically equivalent to each other.
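
The preprocessing steps of Section 5 can be sketched in Python as
follows. This is an illustrative sketch only: the function names
normalize_urn and lexically_equivalent are hypothetical, the input is
assumed to be a syntactically valid URN, and normalizing to lower case
for the scheme and NID and to upper case for percent-encoding is an
arbitrary implementation choice.

```python
import re

PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn(urn: str) -> str:
    """Apply the four preprocessing steps for lexical equivalence."""
    # Split into scheme, NID, and the remainder; raises ValueError if
    # the input does not contain two ":" delimiters.
    scheme, nid, rest = urn.split(":", 2)
    # Separate the fragment first: a "?" after "#" belongs to the fragment.
    before_fragment, hash_mark, fragment = rest.partition("#")
    # Step 4: remove the query part, if present; the fragment is kept.
    nss = before_fragment.partition("?")[0]
    tail = f"{nss}{hash_mark}{fragment}"
    # Step 3: normalize the case of any percent-encoding (never decode it).
    tail = PCT_TRIPLET.sub(lambda m: m.group(0).upper(), tail)
    # Steps 1 and 2: normalize the case of the scheme name and the NID.
    return f"{scheme.lower()}:{nid.lower()}:{tail}"


def lexically_equivalent(urn_a: str, urn_b: str) -> bool:
    """Return True if both URNs are equal after preprocessing."""
    return normalize_urn(urn_a) == normalize_urn(urn_b)
```

Applied to the eight example URNs above, this sketch reports URNs 1,
2, 3, and 7 as equivalent to one another and URNs 5 and 6 as
equivalent to each other, while URNs 4 and 8 match no other entry.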
876 6. Functional Equivalence of URNs
878 Functional equivalence is determined by practice within a given
879 namespace and managed by resolvers for that namespace. Thus, it is
880 beyond the scope of this document. Namespace registrations must
881 include guidance on how to determine functional equivalence for that
882 URN Namespace, i.e., when two URNs are identical within a namespace.
884 On the other hand, it is permissible to have two different URNs --
885 even from different URN Namespaces -- be assigned to a particular
886 resource. This can only be detected by resolving the URNs and
887 analysis of the resolution responses; hence, this is out of scope for
888 this memo.
890 7. The 'urn' URI Scheme
892 At the time of publication of RFC 2141, no formal registration
893 procedure for URI Schemes had been established yet, and so IANA has
894 only informally registered the 'urn' URI Scheme with a reference to
895 [RFC2141].
897 Section 7.1 below contains the URI scheme registration template for
898 the 'urn' scheme, in accordance with RFC 4395 [RFC4395].
900 Note: In order to be usable as a standalone text (after being
901 extracted from this RFC), the template below does not contain
902 formal anchors to the references listed in Section 11, but instead
903 gives the common document designations in prose. However, for
904 compliance with editorial policy, it needs to be noted here:
906 This registration template refers to RFCs 2169, 2276, 2608, 3401
907 through 3404, 3406bis, 3629 (STD 63), and 3986 (STD 66) ([RFC2169]
908 [RFC2276] [RFC2608] [RFC3401] [RFC3402] [RFC3403] [RFC3404]
909 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg] [RFC3629] [RFC3986]).
911 7.1. Registration of URI Scheme 'urn'
913 [ RFC Editor: Please replace "XXXX" in all instances of "RFC XXXX"
914 below by the RFC number assigned to this document. ]
916 URI scheme name: urn
918 Status: permanent
920 URI scheme syntax:
922 See Section 2 of RFC XXXX.
924 URI scheme semantics:
926 'urn' URIs, known as Uniform Resource Names (URNs), serve as
927 persistent, location-independent, resource identifiers for
928 concrete and abstract objects that have network accessible
929 instances and/or metadata.
931 URNs are structured hierarchically into URN Namespaces, the
932 management of which is delegated to namespace-specific
933 authorities. Each such URN Namespace is founded in an independent
934 specification and registered with IANA, following the guidelines
935 and procedures of BCP 66 (at the time of this registration: RFC
936 3406, an update is in progress as RFC 3406bis
937 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]).
939 Encoding considerations:
941 All URNs are ASCII strings conforming to the general URI syntax
942 from STD 66. As described in Sections 2.2 and 2.3.2 of RFC XXXX,
943 there may be characters allowed by the syntax and semantics of the
944 identifier system underlying the URN Namespace but not contained
945 in the US-ASCII charset. Such characters MUST first be
946 represented in Unicode and encoded in UTF-8 according to STD 63.
947 Any octets outside the allowed character set MUST then be percent-
948 encoded.
950 Note that it is perfectly possible that the syntax and semantics
951 of an underlying identifier system do not admit specific
952 characters allowed by the syntax rules in RFC XXXX.
954 Applications/protocols that use this URI scheme:
956 URNs that serve to identify abstract resources for protocol
957 purposes are expected to be recognized directly by the
958 implementations of these protocols.
960 In general, resolution systems for URNs are specified on a per-
961 namespace basis. If appropriate for the namespace, these systems
962 resolve URNs to (possibly multiple) URIs that allow network
963 access to the identified object or its metadata.
965 "Architectural Principles of Uniform Resource Name Resolution"
966 (RFC 2276) explains the basic concepts. Some resolution systems
967 laid down in IETF specifications are:
969 * Trivial HTTP-based URN Resolution (RFC 2169)
971 * Dynamic Delegation Discovery System (DDDS, RFCs 3401-3404)
973 * Service Location Protocol (SLPv2, RFC 2608)
975 Interoperability Considerations:
977 Persistence and stability of URNs require appropriate resolution
978 systems.
980 Security Considerations:
982 See Section 8 of RFC XXXX.
984 Contact:
986 The IETF URNbis working group.
987 This registration will be discussed on the following IETF lists:
988 urn and uri-review (AT ietf.org).
990 Author / Change controller:
992 The authors of RFC XXXX.
993 Change control is with the IESG.
995 References:
997 RFC XXXX.
999 Procedures for the specification and registration of URN
1000 Namespaces are detailed in BCP 66 (at the time of this writing:
1001 RFC 3406; an update is in progress in the URNbis WG as RFC 3406bis
1002 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]).
1004 8. Security Considerations
1006 This document specifies the syntax and general requirements for URNs,
1007 which are the specific URIs that use the 'urn' URI scheme. As such,
1008 the general security considerations of STD 66 [RFC3986] apply.
1009 However, each URN Namespace will have specific security
1010 considerations, according to the semantics and usage of the
1011 underlying namespace. While some namespaces may assign special
1012 meaning to particular characters generically allowed in the Namespace
1013 Specific String, any security considerations resulting from such
1014 assignment are outside the scope of this document. It is REQUIRED by
1015 BCP 66 (currently [RFC3406], to be replaced by RFC 3406bis
1016 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]) that the process of
1017 registering a namespace identifier include any such considerations.
1019 9. IANA Considerations
1021 IANA is asked to update the existing informal registration of the
1022 'urn' URI Scheme by the template in Section 7.1 above and list this
1023 RFC as the current normative reference in [IANA-URI].
1025 IANA is asked to add a note to [IANA-URN] that 'urn' is a permanently
1026 reserved formal namespace identifier string that cannot be
1027 registered, in order to avoid confusion with the 'urn' URI scheme.
1029 IANA is asked to again make available the URN Namespace Registry
1030 [IANA-URN] in a generic form (i.e. HTML) at the generic URI given in
1031 the Reference, and to make the XML and TXT versions available from
1032 that HTML version. (This state already had been achieved, but
1033 something seems to have been lost in 2011.)
1035 10. Acknowledgements
1037 This document is heavily based on RFC 2141, the author of which has
1038 laid the foundation for this work; that RFC contained the following
1039 Acknowledgements:
1041 Thanks to various members of the URN working group for comments on
1042 earlier drafts of this document. This document is partially
1043 supported by the National Science Foundation, Cooperative
1044 Agreement NCR-9218179.
1046 This document also heavily relies on and acknowledges the work done
1047 for STD 66 [RFC3986] and earlier RFCs that are being quoted
1048 informally, in particular RFC 1737 [RFC1737]. The experiences
1049 gathered during the first (more than a) decade of URN usage were also
1050 helpful, so individuals and organizations which have implemented and
1051 used URNs are also acknowledged.
1053 Many individuals in the URNbis working group have participated in the
1054 detailed discussion of this memo. Particular thanks for detailed
1055 review comments and text suggestions go to Juha Hakala and Mykyta
1056 Yevstifeyev.
1058 11. References
1060 11.1. Normative References
1062 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1063 Requirement Levels", BCP 14, RFC 2119, March 1997.
1065 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
1066 10646", STD 63, RFC 3629, November 2003.
1068 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1069 Resource Identifier (URI): Generic Syntax", STD 66,
1070 RFC 3986, January 2005.
1072 [RFC4395] Hansen, T., Hardie, T., and L. Masinter, "Guidelines and
1073 Registration Procedures for New URI Schemes", BCP 35,
1074 RFC 4395, February 2006.
1076 [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
1077 Specifications: ABNF", STD 68, RFC 5234, January 2008.
1079 11.2. Informative References
1081 [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]
1082 Hoenes, A., "Uniform Resource Name (URN) Namespace
1083 Definition Mechanisms",
1084 draft-ietf-urnbis-rfc3406bis-urn-ns-reg-02 (work in
1085 progress), March 2012.
1087 [IANA] IANA, "The Internet Assigned Numbers Authority",
1088 .
1090 [IANA-URI]
1091 IANA, "URI Schemes Registry",
1092 .
1094 [IANA-URN]
1095 IANA, "URN Namespace Registry",
1096 .
1098 [RFC0615] Crocker, D., "Proposed Network Standard Data Pathname
1099 syntax", RFC 615, March 1974.
1101 [RFC0645] Crocker, D., "Network Standard Data Specification syntax",
1102 RFC 645, June 1974.
1104 [RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
1105 Unifying Syntax for the Expression of Names and Addresses
1106 of Objects on the Network as used in the World-Wide Web",
1107 RFC 1630, June 1994.
1109 [RFC1736] Kunze, J., "Functional Recommendations for Internet
1110 Resource Locators", RFC 1736, February 1995.
1112 [RFC1737] Sollins, K. and L. Masinter, "Functional Requirements for
1113 Uniform Resource Names", RFC 1737, December 1994.
1115 [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
1116 Resource Locators (URL)", RFC 1738, December 1994.
1118 [RFC1808] Fielding, R., "Relative Uniform Resource Locators",
1119 RFC 1808, June 1995.
1121 [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.
1123 [RFC2169] Daniel, R., "A Trivial Convention for using HTTP in URN
1124 Resolution", RFC 2169, June 1997.
1126 [RFC2276] Sollins, K., "Architectural Principles of Uniform Resource
1127 Name Resolution", RFC 2276, January 1998.
1129 [RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1130 Resource Identifiers (URI): Generic Syntax", RFC 2396,
1131 August 1998.
1133 [RFC2608] Guttman, E., Perkins, C., Veizades, J., and M. Day,
1134 "Service Location Protocol, Version 2", RFC 2608,
1135 June 1999.
1137 [RFC2611] Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
1138 "URN Namespace Definition Mechanisms", BCP 33, RFC 2611,
1139 June 1999.
1141 [RFC2717] Petke, R. and I. King, "Registration Procedures for URL
1142 Scheme Names", BCP 35, RFC 2717, November 1999.
1144 [RFC2718] Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke,
1145 "Guidelines for new URL Schemes", RFC 2718, November 1999.
1147 [RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint W3C/
1148 IETF URI Planning Interest Group: Uniform Resource
1149 Identifiers (URIs), URLs, and Uniform Resource Names
1150 (URNs): Clarifications and Recommendations", RFC 3305,
1151 August 2002.
1153 [RFC3401] Mealling, M., "Dynamic Delegation Discovery System (DDDS)
1154 Part One: The Comprehensive DDDS", RFC 3401, October 2002.
1156 [RFC3402] Mealling, M., "Dynamic Delegation Discovery System (DDDS)
1157 Part Two: The Algorithm", RFC 3402, October 2002.
1159 [RFC3403] Mealling, M., "Dynamic Delegation Discovery System (DDDS)
1160 Part Three: The Domain Name System (DNS) Database",
1161 RFC 3403, October 2002.
1163 [RFC3404] Mealling, M., "Dynamic Delegation Discovery System (DDDS)
1164 Part Four: The Uniform Resource Identifiers (URI)",
1165 RFC 3404, October 2002.
1167 [RFC3406] Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
1168 "Uniform Resource Names (URN) Namespace Definition
1169 Mechanisms", BCP 66, RFC 3406, October 2002.
1171 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
1172 IANA Considerations Section in RFCs", BCP 26, RFC 5226,
1173 May 2008.
1175 Appendix A. Handling of URNs by URL Resolvers/Browsers
1177 The URN syntax has been defined so that URNs can be used in places
1178 where URLs are expected. A resolver that conforms to the current URI
1179 syntax specification [RFC3986] will extract a scheme value of "urn"
1180 rather than a scheme value of "urn:".
1182 A URN MUST be considered an opaque URI by URL resolvers and passed
1183 (with the "urn:" tag) to a URN resolver for resolution. The URN
1184 resolver can either be an external resolver that the URL resolver
1185 knows of, or it can be functionality built into the URL resolver.
1187 To avoid confusing users, a URL browser SHOULD display the
1188 complete URN (including the "urn:" tag), so that URN Namespace
1189 identifiers are not mistaken for URI Scheme names.
1191 Appendix B. Collected ABNF (Informative)
1193 As a service to implementers specifically interested in URN syntax,
1194 the complete ABNF for URNs is collected here, including the
1195 referenced rules from [RFC5234] and [RFC3986]. In case of
1196 (unexpected) inconsistencies, these documents remain normative for
1197 the respective productions.
1199 URNs conform to the <path-rootless> variant of the general URI syntax
1200 specified in Section 3 of [RFC3986] :
1202 URI = scheme ":" path-rootless [ "?" query ] [ "#" fragment ]
1204 scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
1205 path-rootless = segment-nz *( "/" segment )
1206 query = *( pchar / "/" / "?" )
1207 fragment = *( pchar / "/" / "?" )
1209 segment-nz = 1*pchar
1210 segment = *pchar
1211 pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
1213 unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
1214 pct-encoded = "%" HEXDIG HEXDIG
1215 sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
1216 / "*" / "+" / "," / ";" / "="
1218 In the case of URNs, the above rules are subject to more specific
1219 restrictions:
1221 scheme = "urn"
1222 ; specific, fixed (assigned) value
1224 urn-path = NID ":" NSS
1225 ; to be superimposed on <path-rootless>
1227 NID = ( ALPHA / DIGIT ) 0*30( ALPHA / DIGIT / "-" ) ( ALPHA / DIGIT )
1228 ; RFC 3406[bis] contains more specific rules
1230 NSS = 1*pchar
1231 ; or equivalent: NSS = segment-nz
1233 The above rules make use of the following "Core Rules" from Appendix
1234 B.1 of [RFC5234] :
1236 ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
1237 DIGIT = %x30-39 ; 0-9
1238 HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
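
As an illustration of how the collected rules fit together, the
following Python sketch assembles them into a single regular
expression. It is a non-normative sketch under stated assumptions:
the name parse_urn is hypothetical, the NID pattern follows the rule
given in Section 2.1, and namespace-specific NSS rules are not
checked.

```python
import re
from typing import Dict, Optional

# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})"

URN_RE = re.compile(
    r"(?i:urn):"                                            # scheme
    r"(?P<nid>[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]):"   # NID
    rf"(?P<nss>{PCHAR}+)"                                   # NSS = 1*pchar
    rf"(?:\?(?P<query>(?:{PCHAR}|[/?])*))?"                 # optional query
    rf"(?:#(?P<fragment>(?:{PCHAR}|[/?])*))?"               # optional fragment
)


def parse_urn(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a URN into NID, NSS, query, and fragment, or return None."""
    match = URN_RE.fullmatch(text)
    if match is None:
        return None
    if match.group("nid").lower() == "urn":
        return None  # the NID "urn" is reserved
    if "%00" in text:
        return None  # NUL must not appear, even percent-encoded
    return match.groupdict()
```

For example, parse_urn("urn:foo:a/b") returns None because "/" is
not part of <pchar>, whereas parse_urn("urn:foo:a%2Fb") succeeds.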
1240 Appendix C. Breakdown of NSS Syntax Evolution since RFC 2141
1241 (Informative)
1243 In order to make visible the detailed migration path from RFC 2141
1244 and the influence of the evolution of URI syntax from RFC 2396 to RFC
1245 3986 on it, this appendix provides a highly annotated and expanded
1246 version of the NSS syntax provided in Section 2.2:
1248 NSS = 1*pchar ; or equivalent: NSS = segment-nz
1250 In particular, the breakdown below serves to provide evidence that
1251 this syntax correctly reflects the addition of "&" and "~" to the
1252 repertoire of characters allowed in the NSS portion of URNs
1253 previously allowed by RFC 2141; it expands on the syntax specified in
1254 RFC 2141 after translation to standard ABNF.
1256 NSS = 1*URN-char
1258 URN-char = trans / pct-encoded
1259 ; Note that <pct-encoded> from RFC 3986 here replaces the
1260 ; explicit, expanded form used in RFC 2141.
1262 trans = ALPHA / DIGIT / u-other
1263 ; Note that RFC 2141's <other> has been disambiguated here
1264 ; into <u-other>.
1265 ; RFC 2141 also said:
1266 ; / reserved
1267 ; This caused an ambiguity in RFC 2141 with respect to "%", which
1268 ; now is resolved here by omission of this dangling alternative.
1269 ;
1270 ; After adoption of the generic URI syntax from RFC 3986, there
1271 ; is no more need to deal here with the higher-level separator
1272 ; characters "/", "?", and "#" contained in <reserved>
1273 ; (beyond "%", which is fully taken care of by <pct-encoded>),
1274 ; which are part of RFC 3986's <gen-delims>, as shown below.
1276 ; From RFC 2141:
1277 ; reserved = '%" / "/" / "?" / "#" ; SIC!
1278 ; ^ ^
1280 u-other = ":" / "@"
1281 ; those <gen-delims> from RFC 3986
1282 ; specifically allowed in <pchar>.
1283 ; From RFC 3986:
1284 ; gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
1286 / "!" / "$" / "'" / "(" / ")"
1287 / "*" / "+" / "," / ";" / "="
1288 ; this is RFC 3986 <sub-delims> except "&".
1289 ; From RFC 3986:
1290 ; sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
1291 ; / "*" / "+" / "," / ";" / "="
1292 ; The URNbis WG arrived at unanimous consensus that "&" can be
1293 ; allowed without harm to backward compatibility for existing
1294 ; URN Namespaces.
1296 / "-" / "." / "_" ; except "~"
1297 ; From RFC 3986:
1298 ; unreserved = ALPHA / DIGIT
1299 ; / "-" / "." / "_" / "~"
1300 ; The URNbis WG arrived at unanimous consensus that "~" can be
1301 ; allowed without harm to backward compatibility for existing
1302 ; URN Namespaces.
1304 ; Since we now allow "&" and "~", <URN-char> becomes <pchar>,
1305 ; greatly simplifying the syntax rules and parsers!
1307 ; From RFC 3986:
1308 ; segment-nz = 1*pchar
1309 ; pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
1311 Appendix D. Changes since RFC 2141 (Informative)
1313 D.1. Essential Changes from RFC 2141
1315 [ RFC Editor: please remove the Appendix D.1 headline and all
1316 subsequent subsections starting with Appendix D.2. ]
1318 T.B.D. (after consolidation of this memo)
1320 D.2. Changes from RFC 2141 to Individual Draft -00
1322 Abstract amended: URI scheme, replacement for 2141, point to 3406.
1323 Use contemporary boilerplate. Added transient "Discussion" section.
1325 s1: added new 1st para (URI scheme) and 3rd para (hierarchy).
1326 s1.1 (Historical Perspective) added for background & motivation.
1327 s1.2 (Objective) added.
1328 s1.3 (2119 keywords) added -- used now throughout normative text.
1330 s2 (URN Syntax): Shifted from BNF to ABNF; explain relationship to
1331 3986 and gaps, how the gaps could be bridged, distinguish between URI
1332 generics and URN specifics; got rid of references to immature
1333 documents (1630, 1737).
1334 s2.1 (NID syntax): Use ABNF and RFC 5234 terminals (core rules);
1335 removed reference to an old draft of 2396; clarified prohibition on
1336 using "urn" as NID.
1337 s2.2 (NSS syntax): Shifted from BNF to ABNF; made ABNF consistent
1338 with subsequent textual description; exposition much expanded,
1339 showing relationship with 3986 and resulting incompatibilities;
1340 proposed how to bridge gaps, to make parsing more uniform among URIs;
1341 updated i18n considerations and pointer to UTF-8 specification.
1342 s2.3, s2.3.*: reworked and much expanded, following the grouping of
1343 delimiter characters from 3986 in new s2.3.1 (including old s2.3.2);
1344 made text fully consistent with ABNF in s2.2; consistent usage of
1345 term "percent-encoded"; old s2.3.1 became s2.3.2; old s3.4 became
1347 s3.3.3, providing complete, annotated list of excluded characters,
1348 ordered by ascending code point; and restating design decisions
1349 needed to be made to close gaps to 3986.
1351 s3 through s6: only minor editorial changes.
1353 s7: formal registration of 'urn' URI scheme added, using 4395
1354 template.
1356 s8: Security Cons. slightly amended.
1358 s9: new: IANA Cons. added wrt s7.1 and prohibition of NID "urn".
1360 s10: Acknowledgments amended.
1362 s11: References split into Normative and Informative; updated refs
1363 and added many; only FS and BCP allowed as Normative Refs to further
1364 promotion of document.
1366 Added Appendices A through D.
1368 D.3. Changes from Individual Draft -00 to -02
1370 Updated "Discussion" on front page to point to dedicated urn list.
1372 Numerous editorial improvements and additions for clarification, in
1373 particular in the Introduction. No technical changes.
1375 More Informative References; missing details supplied in D.2.
1377 D.4. Changes from Individual Draft -02 to WG Draft -00
1379 Added new s1.2 to Introduction, with excerpts from RFC 1737 to
1380 provide background on URN functional and syntax requirements.
1381 Renumbered previous s1.2 and s1.3 to s1.3 and s1.4, respectively.
1383 Supplied text in s2 regarding the envisioned use of query and
1384 fragment parts, based on various discussions -- including a
1385 preliminary evaluation in PersID.
1387 Changed "SHOULD never" to "MUST NOT" for NUL character in NSS.
1389 Various editorial and grammar fixes; corrected STD / BCP numbers.
1391 D.5. Changes from WG Draft -00 to WG Draft -01
1393 Reflect WG consensus on adding "&" and "~" to the set of characters
1394 allowed in the NSS part of URNs, thus aligning URN syntax with
1395 generic URI syntax from RFC 3986.
1397 Moved breakdown of NSS syntax evolution from s2.2 to new Appendix C.
1399 Avoid "[URN] character set" in favor of "character repertoire" to
1400 minimize potential clashes with IETF terminology on charsets.
1402 s2.3.3: URN recognition in text documents is regarded as out of scope.
1404 The previous version was ambiguous on whether possible query and/or
1405 fragment parts were regarded as part of the NSS; after closer
1406 inspection of the syntax, clarification has been added that the syntax is indeed superimposed on the ABNF rule for
1408 URNs, and hence does not cover the trailing higher level parts
1409 (query, fragment) according to the URI syntax.
1411 Filled in Appendix B contents.
1413 Numerous editorial and grammar improvements.
1415 D.6. Changes from WG Draft -01 to WG Draft -02
1417 Added note at the beginning of Section 1.2 highlighting the purpose
1418 of this section. The URNbis charter excludes a revision of RFC 1738,
1419 and hence the changes suggested on the list to alter and update this
1420 section have been dismissed.
1422 Added hint to URN Namespace designers in Section 2 that ":" is
1423 customarily used in URN Namespaces to provide further level(s) of
1424 hierarchical subdivision of NSSs.
1426 Reworked text on fragment identification issues and resulting
1427 specification, mostly based on Juha Hakala's evaluation of the
1428 consensus evolving from the list discussion.
1430 Modified ABNF rule for NIDs to better align it with rules for similar
1431 identifiers used in IETF protocols. The new rule now prohibits a
1432 trailing hyphen, but defers further restricting rules on NID syntax
1433 (based on the kind of NID) to RFC 3406bis.
1435 More clearly documented and marked (still open / already closed)
1436 ISSUES. The related text will be removed in the next draft version,
1437 by which time it should have been transferred into the IETF issue tracking
1438 system.
1440 Text of Section 3 revised, based on Juha's suggestion.
1442 In Section 5, added removal of part (but not part)
1443 to canonicalization steps for the purpose of determining lexical
1444 equivalence of URNs (Juha's comment). Also added examples showing
1445 this.
1447 Elaborated a bit more on Encoding Considerations in the URI Scheme
1448 registration template (Juha's comments).
1450 Numerous editorial corrections and improvements.
1452 Appendix E. How to Locate IETF Documents (Informative)
1454 Requests for Comments (RFCs) are available from the RFC Editor site
1455 using the canonical URIs
1456 or (where 'NNNN' is
1457 the serial number of the RFC), and from numerous mirror sites.
1458 Additional metadata for any RFC, including possible Errata, are
1459 available from (where 'NNNN'
1460 again is the serial number of the RFC). An HTML-ized version and a
1461 PDF facsimile of each RFC are available from the IETF Tools site at
1462 and
1463 , respectively.
1465 Current Internet Draft documents are available via the search engines
1466 at and
1467 ; archival copies of older
1468 IETF documents can be found at .
1470 Author's Address
1472 Alfred Hoenes (editor)
1473 TR-Sys
1474 Gerlinger Str. 12
1475 Ditzingen D-71254
1476 Germany
1478 EMail: ah@TR-Sys.de