idnits 2.17.1

draft-ietf-urnbis-rfc2141bis-urn-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC2141, but the
     abstract doesn't seem to directly say this.  It does mention RFC2141
     though, so this could be OK.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer.
     Otherwise, the disclaimer is needed and you can ignore this comment.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 12, 2012) is 4421 days in the past.  Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4395 (Obsoleted by RFC 7595)

  == Outdated reference: A later version (-09) exists of
     draft-ietf-urnbis-rfc3406bis-urn-ns-reg-02

  -- Obsolete informational reference (is this intentional?): RFC 615
     (Obsoleted by RFC 645)

  -- Obsolete informational reference (is this intentional?): RFC 1738
     (Obsoleted by RFC 4248, RFC 4266)

  -- Obsolete informational reference (is this intentional?): RFC 1808
     (Obsoleted by RFC 3986)

  -- Obsolete informational reference (is this intentional?): RFC 2141
     (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 2396
     (Obsoleted by RFC 3986)

  -- Obsolete informational reference (is this intentional?): RFC 2611
     (Obsoleted by RFC 3406)

  -- Obsolete informational reference (is this intentional?): RFC 2717
     (Obsoleted by RFC 4395)

  -- Obsolete informational reference (is this intentional?): RFC 2718
     (Obsoleted by RFC 4395)

  -- Obsolete informational reference (is this intentional?): RFC 3406
     (Obsoleted by RFC 8141)

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

IETF URNbis WG                                            A. Hoenes, Ed.
Internet-Draft                                                    TR-Sys
Obsoletes: 2141 (if approved)                             March 12, 2012
Intended status: Standards Track
Expires: September 13, 2012

                   Uniform Resource Name (URN) Syntax
                  draft-ietf-urnbis-rfc2141bis-urn-02

Abstract

   Uniform Resource Names (URNs) are intended to serve as persistent,
   location-independent, resource identifiers.
   This document serves as the foundation of the 'urn' URI Scheme
   according to RFC 3986 and sets forth the canonical syntax for URNs,
   which subdivides URNs into "namespaces".  It discusses both existing
   legacy and new namespaces, as well as requirements for URN
   presentation and transmission.  Finally, there is a discussion of URN
   equivalence and how to determine it.  This document supersedes
   RFC 2141.

   The requirements and procedures for URN Namespace registration
   documents are set forth in BCP 66, for which RFC 3406bis is the
   companion revised specification document replacing RFC 3406.

Discussion

   Comments are welcome on the urn@ietf.org mailing list (or sent to the
   document editor).  The home page of the URNbis WG is located at .

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 13, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  Historical Perspective and Motivation
     1.2.  Background on Properties of URNs
     1.3.  Objective of this Memo
     1.4.  Requirement Language
   2.  URN Syntax
     2.1.  Namespace Identifier (NID) Syntax
     2.2.  Namespace Specific String (NSS) Syntax
     2.3.  Special and Reserved Characters
       2.3.1.  Delimiter Characters
       2.3.2.  The Percent Character, Percent-Encoding
       2.3.3.  Other Excluded Characters
   3.  Support of Existing Legacy Naming Systems and New Naming
       Systems
   4.  URN Presentation and Transport
   5.  Lexical Equivalence of URNs
     5.1.  Examples of Lexical Equivalence
   6.  Functional Equivalence of URNs
   7.  The 'urn' URI Scheme
     7.1.  Registration of URI Scheme 'urn'
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Appendix A.  Handling of URNs by URL Resolvers/Browsers
   Appendix B.  Collected ABNF (Informative)
   Appendix C.  Breakdown of NSS Syntax Evolution since RFC 2141
                (Informative)
   Appendix D.  Changes since RFC 2141 (Informative)
     D.1.  Essential Changes from RFC 2141
     D.2.  Changes from RFC 2141 to Individual Draft -00
     D.3.  Changes from Individual Draft -00 to -02
     D.4.  Changes from Individual Draft -02 to WG Draft -00
     D.5.  Changes from WG Draft -00 to WG Draft -01
     D.6.  Changes from WG Draft -01 to WG Draft -02
   Appendix E.  How to Locate IETF Documents (Informative)

1.  Introduction

   Uniform Resource Names (URNs) are intended to serve as persistent,
   location-independent, resource identifiers and are designed to make
   it easy to map other namespaces (that share the properties of URNs)
   into URI-space.
   Therefore, the URN syntax provides a means to encode character data
   in a form that can be sent in existing protocols, transcribed on most
   keyboards, etc.

   To this end, URNs are designed as an intrinsic part of the more
   general framework of Uniform Resource Identifiers (URIs); 'urn' is a
   particular URI Scheme (according to STD 66, RFC 3986 [RFC3986] and
   BCP 35, RFC 4395 [RFC4395]) that is dedicated to forming a
   hierarchical framework for persistent identifiers.

   The first level of hierarchy is given by the classification of URIs
   into "URI Schemes", and for URNs, the second level is organized into
   "URN Namespaces".  Henceforth both terms are used in this
   capitalization to distinguish them from the more general common
   meaning of "scheme" and "namespace".

   It is an explicit design goal that pre-existing systems of persistent
   identifiers are mapped into the URN framework.  Ordinarily, each such
   traditional identifier system (namespace) -- standard or otherwise --
   will occupy its own URN Namespace.  However, shared URN Namespaces
   are possible (and in fact, already exist), but the identifier-driven
   mechanisms needed to distinguish the originating namespaces make
   registration and maintenance of such URN Namespaces more complicated.

   URN (as a URI Scheme) as such does not have a specific scope.  The
   applicability of the URN system, that is, the totality of the
   resources that URNs can be assigned to, is the union of all
   identifier systems that have an associated registered URN Namespace.
   Ideally every new namespace will thus extend the URN applicability.

1.1.  Historical Perspective and Motivation

   Since this RFC will be of particular interest for groups and
   individuals that are interested in persistent identifiers in general
   and not in continuous contact with the IETF and the RFC series, this
   section gives a brief outline of the evolution of the matter over
   time.
   Appendix E gives hints on how to obtain RFCs and related information.

   Attempts to define generally applicable identifiers for network
   resources go back to the mid-1970s.  Among the applicable RFCs is
   RFC 615 [RFC0615], which subsequently has been obsoleted by RFC 645
   [RFC0645].

   The seminal document in the RFC series regarding URIs (Uniform
   Resource Identifiers) for use with the World Wide Web (WWW) was
   RFC 1630 [RFC1630], published in 1994.  In the same year, the general
   concept of Uniform Resource Names was laid down in RFC 1737
   [RFC1737] and that of Uniform Resource Locators in RFC 1736
   [RFC1736].

   The original formal specification of URN Syntax, RFC 2141 [RFC2141],
   was adopted in 1997.  That document was based on the original
   specification of URLs (Uniform Resource Locators) in RFC 1738
   [RFC1738] and RFC 1808 [RFC1808], which later on, in 1998, was
   generalized and consolidated in the Generic URI specification,
   RFC 2396 [RFC2396].  Most parts of these URI/URL documents were
   superseded in 2005 by STD 66, RFC 3986 [RFC3986].  Notably, RFC 2141
   makes (essentially normative) reference to a draft version of
   RFC 2396.

   Over time, the terms "URI", "URL", and "URN" have been refined and
   slightly shifted according to emerging insight and use.  This has
   been clarified in a joint effort of the IETF and the World Wide Web
   Consortium, published in 2002 for the IETF in RFC 3305 [RFC3305].

   The wealth of URI Schemes and URN Namespaces needs to be organized in
   a persistent way, in order to guide application developers and users
   to the standardized top level branches and the related
   specifications.  These registries are maintained by the Internet
   Assigned Numbers Authority (IANA) [IANA] at [IANA-URI] and
   [IANA-URN], respectively.
   Registration procedures for URI Schemes originally had been laid down
   in RFC 2717 [RFC2717] and guidelines for the related specification
   documents were given in RFC 2718 [RFC2718].  These documents have
   been obsoleted and consolidated into BCP 35, RFC 4395 [RFC4395],
   which is based on, and aligned with, RFC 3986.

   Note that RFC 2141 predates RFC 2717 and, although the 'urn' URI
   scheme traditionally was listed in [IANA-URI] with a pointer to
   RFC 2141, this registration has never been performed formally.

   Similarly, the URN Namespace definition and registration mechanisms
   originally were specified in RFC 2611 [RFC2611], which has been
   obsoleted by BCP 66, RFC 3406 [RFC3406].  Guidelines for documents
   prescribing IANA procedures have been revised as well over the years,
   and at the time of this writing, BCP 26, RFC 5226 [RFC5226] is the
   normative document.  Neither RFC 4395 nor RFC 3406 conforms to
   RFC 5226.

   Early documents specifying URI and URN syntax, including RFC 2141,
   made use of an ad-hoc variant of the original Backus-Naur Form (BNF)
   that never has been formally specified.

   Over the years, the IETF has shifted to the use of a predominant
   formal language for defining the syntax of textual protocol elements,
   dubbed "Augmented Backus-Naur Form" (ABNF).  The specification of
   ABNF also has evolved, and now STD 68, RFC 5234 [RFC5234] is the
   normative document for it (and is also used in this RFC).

1.2.  Background on Properties of URNs

   This section aims at quoting requirements as identified in the past;
   it does not attempt to revise or redefine these requirements, but it
   gives some hints where more than a decade of experience with URNs has
   shed a different light on past views.  The citations below are given
   here to make this document self-contained and avoid normative
   down-references to old work.
   RFC 1737 [RFC1737] defined the purpose of URNs as follows:

   o  The purpose or function of a URN is to provide a globally unique,
      persistent identifier used for recognition, for access to
      characteristics of the resource, or for access to the resource
      itself.

   Section 2 of RFC 1737 [RFC1737] listed the functional requirements
   for URNs (quote slightly edited to reflect the time passed since that
   RFC was written and the actual definition of the URN scheme that has
   happened):

   o  Global scope: A URN is a name with global scope which does not
      imply a location.  It has the same meaning everywhere.

   o  Global uniqueness: The same URN will never be assigned to two
      different resources.

   o  Persistence: It is intended that the lifetime of a URN be
      permanent.  That is, the URN will be globally unique forever, and
      may well be used as a reference to a resource well beyond the
      lifetime of the resource it identifies or of any naming authority
      involved in the assignment of its name.

   o  Scalability: URNs can be assigned to any resource that might
      conceivably be available on the network, for hundreds of years.

   o  Legacy support: The URN scheme permits the support of existing
      legacy naming systems, insofar as they satisfy the other
      requirements described here.  [...]

   o  Extensibility: The URN scheme permits future extensions.

   o  Independence: It is solely the responsibility of a name issuing
      authority to determine the conditions under which it will issue a
      name.

   o  Resolution: URNs will not impede resolution.  [...]
   The URN syntax described below also accommodates the fundamental
   "Requirements for URN Encoding" in Section 3 of RFC 1737 [RFC1737],
   except where experience gained has led to relaxing unrealistic
   detail requirements:

   o  Single encoding: The encoding for presentation for people in clear
      text, electronic mail and the like is the same as the encoding in
      other transmissions.

   o  Simple comparison: A comparison algorithm for URNs is simple,
      local, and deterministic.  [...]

   o  Human transcribability: For URNs to be easily transcribable by
      humans without error, they need to be short, use a minimum of
      special characters, and be case insensitive.  [...]

      Note:
         Practical experience gained with active URN Namespaces has
         shown that this former goal is rather unrealistic, since
         preference is usually given to 1:1 usage of existing
         namespaces, which might not have this property.  However, we
         hold that, at least, the rough kind of resource identified by a
         URN should be easily recognizable for humans.

   o  Transport friendliness: A URN can be transported unmodified in the
      common Internet protocols, such as TCP, SMTP, FTP, Telnet, etc.,
      as well as printed paper.

   o  Machine consumption: A URN can be parsed by a computer.

   o  Text recognition: The encoding of a URN needs to enhance the
      ability to find and parse URNs in free text.

1.3.  Objective of this Memo

   RFC 2141 does not seamlessly match current Internet Standards.  The
   primary objective of this document is the alignment with the URI
   standard [RFC3986] and URI Scheme guidelines [RFC4395], the ABNF
   standard [RFC5234] and the current IANA Guidelines [RFC5226] in
   general.

   Further, experience from emerging international efforts to establish
   a general, distributed, stable URN resolution service has been taken
   into account during the draft stage of this document.
   To advance on the Internet Standards Track, the URN specification
   needs to be based on documents of comparable maturity.  Therefore, to
   allow further advancement of the formal maturity level of this RFC,
   it deliberately makes normative references only to documents at Full
   Standard or Best Current Practice level.

   Thus, this replacement document for RFC 2141 should make it possible
   to advance the URN framework on the Internet Standard maturity
   ladder.  All other related documents depend on it; therefore this is
   the first step to undertake.

   Out of scope for this document is a revision of the URN Namespace
   Definition Mechanisms document, BCP 66.  This is being undertaken in
   a companion document, RFC 3406bis
   [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].

1.4.  Requirement Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14 [RFC2119].

2.  URN Syntax

   This document defines the URI Scheme 'urn'.  Hence, URNs are specific
   URIs as specified in STD 66 [RFC3986].  The formal syntax definitions
   below are given in ABNF according to STD 68 [RFC5234] and make use of
   some "Core Rules" specified in Appendix B of that Standard and
   several generic rules defined in Appendix A of RFC 3986.

   The syntax definitions below do, and syntax definitions in dependent
   documents MUST, conform to the URI syntax specified in RFC 3986, in
   the sense that additional syntax rules must only constrain the
   general rules from RFC 3986.  In other words: a general URI parser
   based on RFC 3986 MUST be able to parse any legal URN, and specific
   semantics can be obtained from URN-specific parsing.

   URNs conform to the <path-rootless> variant of the general URI syntax
   specified in Section 3 of [RFC3986], reproduced here informally:

      URI = scheme ":" path-rootless [ "?"
            query ] [ "#" fragment ]

      path-rootless = segment-nz *( "/" segment )
      segment-nz    = 1*pchar
      segment       = *pchar

      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

   In the case of URNs, we have:

      scheme = "urn"

   and for <path-rootless>, only a single segment is used, but the
   following additional syntax rule is superimposed on it to establish a
   level of hierarchy called "Namespace":

      urn-path = NID ":" NSS

   Here "urn" is the URI scheme name, <NID> is the Namespace Identifier,
   and <NSS> is the Namespace Specific String.  The colons are REQUIRED
   separator characters.

   Note that it is common practice in several existing URN Namespaces
   (and fully supported by this syntax) to use additional colon(s) as
   separator character(s) in order to introduce further level(s) of
   hierarchy into the NSS syntax, where needed.  (See also
   Section 2.3.1 below.)

   Per RFC 3986, the URN Scheme name (here "urn") is case-insensitive.

   The Namespace ID (also a case-insensitive string) determines the
   syntactic structure and the semantic interpretation of the Namespace
   Specific String.  Details on NID syntax can be found below in
   Section 2.1, and the NSS syntax is elaborated upon in Section 2.2.

   Each particular URN Namespace is based on a specific document that
   must normatively describe (among other things) the details of the
   <NSS> values allowed in conjunction with the respective <NID>.  The
   syntax and semantics of these values are ordinarily specified by an
   existing persistent identifier system (namespace); for instance, in
   the 'ISBN' URN Namespace, each NSS must be a valid ISBN.  Some URN
   Namespaces may have strict rules for well-formed NSSs, while some
   others may be far more relaxed.  There may also be significant
   differences regarding the identifier assignment process.
   The overall specification requirements and registration procedures
   for URN Namespaces are the subject of a dedicated companion document,
   BCP 66, which has been updated for conformance to BCP 26 and
   alignment with implementation experience in RFC 3406bis
   [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].

   Notes:

      RFC 2141 was published before the URI Generic Syntax was finalized
      and therefore had to defer the decision on whether <query> and
      <fragment> components are applicable to URNs.  RFC 2141 therefore
      has reserved the use of bare (unencoded) question mark ("?") and
      hash ("#") characters in URNs for future usage in conformance with
      the generic URI syntax.

      URNs have been in use for more than a decade.  Some user
      communities want to be able to use these components (which are
      split off by the high-level parsing rules of RFC 3986), or at
      least the <fragment> component, in the context of their focal
      URNs.  Therefore, this document allows the designers of selected
      URN Namespaces to specify the use of the <fragment> component with
      URNs belonging to these Namespaces, whereas the specification of
      usage of the <query> component is left to future standardization
      efforts for URN resolution.  Thus, this draft allows both of these
      components in the general syntax.

   ISSUE:

      Regarding fragment identifiers, Section 3.5, para 1 of RFC 3986,
      indicates that "The fragment identifier ... allows indirect
      identification of a secondary resource by reference to a primary
      resource and additional identifying information.  The identified
      secondary resource may be some portion or subset of the primary
      resource, some view on representations of the primary resource, or
      some other resource defined or described by those
      representations."  RFC 3986 continues in specifying that the
      details of the interpretation of fragment identifiers are specific
      to the media types returned upon resolution of a URI.
      The entirety of the purposes mentioned in the above quote
      obviously can be achieved fully only if the "consumer" of the URI
      becomes aware of the fragment identifier as part of the requested
      URI, since, e.g., secondary resources might consist of
      representations that are only available in particular media
      types.  However, RFC 3986 subsequently (in the penultimate
      paragraph of Section 3.5) specifies that the evaluation of
      fragment identifiers be a client-side matter and browsers are to
      strip them from request URIs sent in information retrieval
      protocols.
      Based on this, contemporary web browsers do not communicate
      fragment identifiers to the web server but perform fragment
      selection locally on the returned (HTML) resource.  To make things
      even more complicated, the most popular media type (HTML) only
      allows markers to be set (which are anchor points in the
      serialized media stream and used by browsers to identify a
      specific position in the content) and does not allow browsers to
      regularly identify actual, conceptual fragments of the media
      delivered -- like, e.g., the "proper content" of a web page,
      excluding navigation bars etc. -- so that in practice users have
      become accustomed to understanding a "fragment" as actually
      designating a *position* in the media, not a *part* of it.

      Therefore, potential usage of <fragment> components in URNs is
      rather limited and has to be considered very seriously by
      designers of URN Namespaces that would like to make use of them.
      URN Namespaces that rely on (unmodified) browser resolution via
      HTTP/HTML cannot rely on the usage of fragment identifiers to
      steer the resolution process.
      Thus, the use of fragment identifiers only seems to be useful for
      URN Namespaces that are intended to either (a) exclusively make
      use of resolution systems / clients that can cope with handing off
      a full-featured URN (including a possible fragment identifier) to
      the resolution service, or (b) exclusively employ HTML/HTTP based
      resolution systems / clients, i.e., where the resolution results
      are returned as HTML such that web browsers can perform the
      fragment selection, or as some other media type that better
      supports the identification and actual selection of embedded
      fragments, even in off-the-shelf web browsers -- perhaps possible
      for certain variants of XML-based media types.

   The syntax of <query> and <fragment> is defined in RFC 3986.
   Question mark and hash sign remain reserved as separator characters
   for these URI components and therefore MUST NOT appear unencoded in
   an NSS.  This rule guarantees backwards compatibility with existing
   URN Namespaces and improves the compatibility of URN syntax with
   general URI parsers.

   The <query> part MUST NOT be present in any *assigned* URN.  This
   specification reserves its use for future standardization related to
   URN services and resolution.  A <query> part can only be added to an
   assigned URN and appear in a URI *reference* [RFC3986] to a URN that
   is intended to be used with URN resolution services, and, in
   accordance with the general specification of this part in RFC 3986,
   its purpose is restricted to indicating the requested URN resolution
   service and/or particular service aspects of the intended resolution
   response, e.g., to select the kind of metadata sought about the given
   object that is identified by the basic, assigned URN.

   The <fragment> part is not generally allowed in URNs.  It is only
   applicable to URN Namespaces that specifically opt to support its
   usage.
   Thus, a URN Namespace registration document MAY specify the usage of
   <fragment> with URNs of that particular URN Namespace.  Absent a
   registered namespace definition based on this document and
   RFC 3406bis that explicitly specifies its usage, URNs within a
   particular URN Namespace MUST NOT contain a fragment identifier.

   The use of fragment identifiers may be useful if the URN Namespace is
   based on an existing identifier scheme that designates objects of
   reasonable complexity such that there is a need to make reference to
   parts of such resources in typical network access environments
   without incurring the effort to assign and maintain different
   (assigned) NSSs in such cases.

   URN Namespaces will deal with various kinds of fragments.  For
   instance, publications can be divided into smaller parts -- journals
   consist of volumes, issues and articles, and books may contain
   chapters.  These logical fragments are usually not fragments in the
   sense of the deliberations in the URI Generic Syntax, and if so,
   <fragment> MUST NOT be used.  However, namespaces MAY have internal
   means for identification of logical fragments such as journal
   articles.  For instance, the ISBN (International Standard Book
   Number) system allows assignment of ISBNs to book chapters if they
   are available as separate items.  Namespace specific fragment
   identification practices are beyond the scope of this document, since
   they do not rely on URI Generic Syntax, and their application is the
   primary RECOMMENDED way to deal with fragment identification.  If a
   namespace lacks this possibility, a URN Namespace definition SHOULD
   define syntactical parts of its NSSs that amend the original
   identifiers of the underlying namespace in a readily parseable way
   and serve to allow assignment of URNs in that namespace to the
   intended abstract fragments.
   A URN Namespace registration MAY forbid all kinds of fragment
   identification (even if it were possible on the basis of URI Generic
   Syntax), if the application rules and syntax of the identifier do not
   allow identification of fragments.  ISSN (International Standard
   Serial Number) is an example of this kind of identifier / namespace.

   The use of <fragment> as specified in RFC 3986 is possible if and
   only if (a) the URN Namespace is based on an existing identifier
   scheme that designates objects of reasonable complexity such that
   there is a need to make reference to parts of such resources in
   typical network access environments; and (b) these parts will be
   identified in the canonical manner of the media type(s) delivered
   upon URN resolution.  Direct resolution to them SHOULD be possible
   and sustainable.

   If in a given namespace URNs are never assigned to a particular
   manifestation of a resource (for instance, a PDF version of a book),
   but can be transferred from one manifestation to the next or apply to
   all of them, <fragment> usage is forbidden.  This applies also to the
   situation when identified resources are works (without any references
   to physical embodiments of the work).

   The use of <fragment> SHOULD NOT be opted for if the underlying
   namespace provides for the intrinsic possibility to identify such
   parts or if there is a readily usable method to construct NSSs by
   combining the existing identifiers with a component (or components)
   to identify such parts in an easily discernible manner.

   Whether the URI Generic Syntax is applied or not, there are various
   ways in which fragment identifiers can be generated:

   (a)  Fragment identifiers (if any) are assigned individually to the
        relevant fragments of a larger entity during the URN assignment
        process.
If a URN Namespace opts for this model, its 563 specification SHOULD describe the additional syntax restrictions 564 to be adhered to and the particulars of the (per-URN) assignment 565 process. 567 (b) A specific set of fragment identifiers is generally applicable 568 to all resources targeted by URNs of the specific URN Namespace. 569 In this case, the specification document MUST specify a finite 570 set of values, or precise, generic rules for the 571 automated formation of syntactically valid fragment identifiers 572 for the particular URN Namespace. The specification SHOULD 573 indicate the treatment of syntactically valid values 574 in case they are not semantically valid for a given base URN. 575 Absent such specification, the default is to ignore such 576 fragment identifiers. 578 URN resolver clients SHOULD pass a given part of a URN 579 unchanged to the resolver service. The default URN resolution 580 behavior is to ignore any part if either the applicable 581 URN Namespace definition did not specify its use, or if no specific 582 related information was available for the basic resource in case (b) 583 above, or if that basic URN plus fragment identifier has not been 584 assigned in case (a) above. 586 2.1. Namespace Identifier (NID) Syntax 588 The following is the syntax for the Namespace Identifier. To (i) be 589 consistent with all potential resolution schemes and (ii) not put any 590 undue constraints on any potential resolution scheme, Namespace 591 Identifiers are ASCII strings with the syntax: 593 NID = (ALPHA / DIGIT) 0*30(ALPHA / DIGIT / "-") (ALPHA / DIGIT) 595 Note: 596 The above definition is slightly more restrictive than it was in 597 RFC 2141, to better reflect common practice for "handle"-like 598 identifiers in other IETF protocols (a.k.a. "LDH" syntax) and 599 requirements from RFC 3406bis. RFC 3406bis contains further 600 syntax restrictions on NID strings. 
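
   As an illustration only, the NID production above can be checked
   with a regular expression. The following is a minimal sketch, not
   part of the specification; the names NID_RE and is_valid_nid are
   hypothetical, and the check covers syntax only (it does not apply
   any further restrictions from RFC 3406bis).

```python
import re

# NID = (ALPHA / DIGIT) 0*30(ALPHA / DIGIT / "-") (ALPHA / DIGIT)
# That is: 2 to 32 ASCII letters, digits, or hyphens, where the first
# and the last character are not a hyphen.
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]")


def is_valid_nid(nid: str) -> bool:
    """Return True if `nid` matches the NID production (syntax only)."""
    return NID_RE.fullmatch(nid) is not None
```

   Under this rule a one-character string is rejected, as is any
   string that starts or ends with a hyphen or is longer than 32
   characters.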

   ISSUE:
      The above rule still allows NIDs that contain multiple adjacent
      hyphens or have the form of decimal numbers or decimal number
      ranges.

      Should this be further restricted _in this document_ or is it
      sufficient to defer to the additional (NID kind specific) rules in
      RFC 3406bis and the common sense of URN Namespace authors and the
      designated IANA experts?
      Anyhow, such restrictions would be fully backward compatible -- as
      is the above tightened rule -- because no NIDs have been defined
      so far that would violate these restrictions. Hyphens have been
      used only in the naming pattern for "Informal Namespace IDs" per
      RFC 3406[bis].

      The document editor senses the low level of discussion of this
      issue as an indication that this Issue can be closed.

   Namespace Identifiers are case-insensitive, so that for instance
   "ISBN" and "isbn" refer to the same namespace.

   To avoid confusion with the URI Scheme name "urn", the NID "urn" is
   permanently reserved by this RFC and MUST NOT be used or registered.

   Note:
      This reservation is carried over unchanged from RFC 2141, for
      historical reasons.

   ISSUE:
      Further possible reservations and/or details are out of scope for
      this document, but might be within the scope of RFC 3406bis.
      It has been suggested that no additional reservations should be
      codified and the final decision in any case should be left to the
      common sense of URN Namespace authors and the designated IANA
      experts.

      The document editor senses the low level of discussion of this
      issue as an indication that this Issue can be closed.

2.2.  Namespace Specific String (NSS) Syntax

   As already required since RFC 1737, there is a single canonical
   representation of the NSS portion of a URN.

   The format of this single canonical form follows:

      NSS = 1*pchar     ; or equivalent:  NSS = segment-nz

   (<pchar> and <segment-nz> are defined in Section 3.3 of RFC 3986.)
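
   The NID and NSS rules can be combined into a simple syntax check.
   The sketch below is illustrative only and rests on stated
   assumptions: the function name parse_urn and the regular expression
   names are hypothetical, the scheme name is matched without regard to
   case (as for URI schemes in RFC 3986), and only the "urn:NID:NSS"
   part is handled, so URNs carrying a query or fragment part are
   rejected by this sketch.

```python
import re

# Character classes taken from the RFC 3986 rules referenced above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|%[0-9A-Fa-f]{{2}})"
# NID = (ALPHA / DIGIT) 0*30(ALPHA / DIGIT / "-") (ALPHA / DIGIT)
NID = r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]"
# "urn" ":" NID ":" NSS, with NSS = 1*pchar
URN_RE = re.compile(rf"(?i:urn):(?P<nid>{NID}):(?P<nss>{PCHAR}+)")


def parse_urn(urn: str):
    """Return (nid, nss) for a syntactically valid URN, else None.

    The NID is returned in lower case because NIDs are
    case-insensitive; the NSS is returned unchanged.
    """
    match = URN_RE.fullmatch(urn)
    if match is None:
        return None
    nid = match.group("nid").lower()
    if nid == "urn":  # the NID "urn" is permanently reserved
        return None
    return nid, match.group("nss")
```

   For example, parse_urn("urn:ISBN:0-395-36341-1") yields the pair
   ("isbn", "0-395-36341-1"), while "urn:foo:a/b" is rejected because
   "/" is not a <pchar>.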

   Note: The informational Appendix C expands on the evolution of the
   NSS syntax specification since RFC 2141.

   ISSUE (for the record):
      In comparison to RFC 2141, essentially now "&" and "~" are allowed
      in the NSS syntax, in full conformance with the generic URI
      syntax. On the other hand, the <reserved> characters are no
      longer part of the formal syntax -- unfortunately (or erroneously)
      these were included in the formal syntax rules of RFC 2141 and
      only excluded after that fact in the prose, which at least in one
      instance has led to a URN Namespace definition document that
      allows <reserved> in the formal NSS syntax but does _not_ properly
      exclude their use in the prose. The interpretation of "%" was
      ambiguous in RFC 2141; it is now only allowed (in the formal
      syntax and in the prose) in <pct-encoded> constructs.

      The document editor senses that this change of the NSS syntax has
      found consensus and that hence this Issue is regarded as closed.

   Depending on the rules governing a namespace, valid identifiers in a
   namespace might contain characters that are not members of the URN
   character repertoire above (<pchar>). In order to achieve
   conformance with this NSS specification, such strings MUST be
   translated into canonical NSS format before embedding them into a
   URN, using them as protocol elements, or otherwise passing them on to
   other applications. Translation is done by encoding each character
   outside the URN character repertoire as a sequence of octets using
   UTF-8 encoding (STD 63 [RFC3629]), and the "percent-encoding" of each
   of those octets as "%" followed by two <HEXDIG> characters. The
   latter two characters form the hexadecimal representation of that
   octet. (See Section 2.3.2 below for more details.)

2.3.  Special and Reserved Characters

   The remaining printable characters not included in the <pchar>
   repertoire comprise the generic delimiters and the reserved
   characters, which are restricted for special use only. These
   characters are discussed below, giving the specifics of why each
   character is special or reserved.

2.3.1.  Delimiter Characters

   RFC 3986 [RFC3986] defines the general delimiter characters used in
   URIs:

      gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

   From among the <gen-delims>, ":" and "@" are also included in the
   <pchar> rule and hence allowed in the path components of URIs.

   The at-character ("@") in generic URIs only has a specific meaning
   when contained in the <authority> part, which is absent in URNs.
   Hence, "@" is available in the <NSS> part of URNs.

   With URNs, the colon (":") is used as a delimiter character not only
   between the scheme name ("urn") and the <NID>, but also between the
   latter and the <NSS>, and many existing URN Namespaces additionally
   use ":" to further subdivide a single RFC 3986 path segment in the
   <NSS> in a hierarchical manner.

   Note: Using ":" as a sub-delimiter in the path in favor of "/" is
   attractive because it avoids possible complications that could arise
   from accidental inappropriate use of relative URI references
   [RFC3986] for URNs.

   The characters "/", "?", and "#" separate path components and the
   <query> and <fragment> parts in the generic URI syntax; they are
   restricted to this role in URNs as well, although the <path> in URNs
   only admits a single <segment-nz> and hence "/" is not allowed.
   Therefore, these characters MUST NOT appear literally in the <NSS>
   part of a URN in unencoded form. Namespaces that need these
   characters MUST employ in their URNs the appropriate percent-encoding
   for each such character.

   The square brackets ("[" and "]") also play a particular role when
   contained in the <authority> part, which is absent in URNs.
   However, for conformance with the generic URI syntax, they are not
   allowed literally in the <NSS> component of URNs. If a specific URN
   Namespace reflects semantics that require these characters, they MUST
   be percent-encoded in the respective URNs.

2.3.2.  The Percent Character, Percent-Encoding

   The percent character ("%") is reserved in the URN syntax for
   introducing the escape sequence for an octet that is either not a
   printable ASCII character or reserved for special purposes, as
   described in this section. A "%" character in a URN MUST always be
   followed by two <HEXDIG> characters; the three characters together
   semantically form an abstract octet. Literal use of the "%"
   character in an underlying namespace MUST therefore be encoded as
   "%25" in URNs for that namespace.

   Namespaces MAY designate one or more characters from the URN
   character repertoire as having special meaning for that namespace.
   If the namespace also uses that character in a literal sense, the
   character used in a literal sense MUST be encoded with "%" followed
   by the hexadecimal representation of that octet. Further, a
   character MUST NOT be percent-encoded if the character is not a
   reserved character. Therefore, the process of registering a
   namespace identifier shall include publication of a definition of
   which characters have a special meaning to that namespace -- cf. RFC
   3406bis [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg].

2.3.3.  Other Excluded Characters

   The following list is included only for the sake of completeness. It
   includes the characters discussed in Sections 2.3.1 and 2.3.2. Any
   octets/characters on this list are explicitly NOT part of the URN
   character repertoire, and if used in a URN, MUST be percent-encoded.

      excluded = CTL / SP     ; control characters and space
               / DQUOTE       ; "
               / "#"          ; from <gen-delims>
               / "%"          ; see above
               / "/"          ; from <gen-delims>
               / "<" / ">"
               / "?"          ; from <gen-delims>
               / "["          ; from <gen-delims>
               / "\"
               / "]"          ; from <gen-delims>
               / "^"
               / "`"
               / "{" / "|" / "}"
               / %x7F         ; DEL (control character)
               / %x80-FF      ; non-ASCII

   The NUL octet (0 hex) is renowned for a long history of trouble in
   implementations. It MUST NOT be used in URNs, in either unencoded or
   percent-encoded form.

   In a textual context for a URN, the NSS part ends when an
   octet/character from the excluded character set (<excluded>) is
   encountered. The character from the excluded character set is NOT
   part of the NSS.

   The more general issue of discerning URNs in non-structured text is
   not specific to URNs, but a general issue for recognizing URIs (by
   humans or automata), and hence out of scope of this document.

3.  Support of Existing Legacy Naming Systems and New Naming Systems

   Any identifier to be used as a URN MUST be expressed in conformance
   with the URI and URN syntax specifications ([RFC3986], this
   document). If names from (existing or newly devised) namespaces
   contain characters other than those defined for the URN character
   set, they MUST be translated into canonical form as discussed in
   Section 2.2.

   On the other hand, every namespace specific string in a given URN
   Namespace MUST be based on an identifier that conforms to the
   requirements of the identifier system to which the URN Namespace is
   assigned; in the simplest form, if the syntactical rules admit, the
   NSS can be the original identifier. For instance, every legal NSS in
   the ISBN Namespace must be a valid ISBN.

4.  URN Presentation and Transport

   The URN syntax defines the canonical format for URNs and all URN
   transport and interchanges MUST take place in this format. Further,
   all URN-aware applications MUST offer the option of displaying URNs
   in this canonical form to allow for direct transcription (for example
   by cut-and-paste techniques).
   Such applications MAY support display of URNs in a more
   human-friendly form and may use a character set that includes
   characters that aren't permitted in URN syntax as defined in this RFC
   (that is, they may replace %-notation by characters in some extended
   character set in display to humans).

   Note: Such transformation for the purpose of presentation, if done
   blindly without NID-specific knowledge of special character usage,
   might introduce ambiguity, because in the cases described above in
   the second paragraph of Section 2.3.2, the unescaped and
   percent-escaped form of the same character might carry different
   semantics in NSSs of some URN Namespaces.

5.  Lexical Equivalence of URNs

   For various purposes such as caching, it is often desirable to
   determine whether two URNs are the same without resolving them. The
   general-purpose means of doing so is by testing for "lexical
   equivalence" as defined below.

   Two URNs are lexically equivalent if they are octet-by-octet equal
   after the following preprocessing:

   1.  normalize the case of the leading "urn" scheme name;
   2.  normalize the case of the NID;
   3.  normalize the case of any percent-encoding;
   4.  remove the <query> part of the URI, if present.

   Note that percent-encoding MUST NOT be removed. It is an
   implementation detail not affecting interoperability whether a URN
   comparison function internally prefers normalization (in steps 1
   through 3 above) to lower or to upper case. Note also that
   <fragment> MUST NOT be removed, since there is no lexical equivalence
   between the "base" URN and one which uses <fragment> -- the former
   identifies the resource as the whole; the latter just a part of it.

   Some namespaces may define additional lexical equivalences, such as
   case-insensitivity of the NSS (or parts thereof).
   Additional lexical equivalences MUST be documented as part of
   Namespace registration, and MUST always only have the effect of
   eliminating some of the false negatives obtained by the procedure
   above, i.e. they MUST NOT say that two URNs are not equivalent if the
   procedure above says they are equivalent.

5.1.  Examples of Lexical Equivalence

   The following hypothetical URN comparisons highlight the lexical
   equivalence definitions:

      1- URN:foo:a123,456
      2- urn:foo:a123,456
      3- urn:FOO:a123,456
      4- urn:foo:A123,456
      5- urn:foo:a123%2C456
      6- URN:FOO:a123%2c456
      7- urn:foo:a123,456?xyz
      8- urn:foo:a123,456#xyz

   URNs 1, 2, 3, and 7 are all lexically equivalent. URN 4 is not
   lexically equivalent to any of the other URNs of the above set. The
   same holds for URN 8.
   URNs 5 and 6 are only lexically equivalent to each other.

6.  Functional Equivalence of URNs

   Functional equivalence is determined by practice within a given
   namespace and managed by resolvers for that namespace. Thus, it is
   beyond the scope of this document. Namespace registrations must
   include guidance on how to determine functional equivalence for that
   URN Namespace, i.e., when two URNs are identical within a namespace.

   On the other hand, it is permissible to have two different URNs --
   even from different URN Namespaces -- be assigned to a particular
   resource. This can only be detected by resolving the URNs and
   analysis of the resolution responses; hence, this is out of scope for
   this memo.

7.  The 'urn' URI Scheme

   At the time of publication of RFC 2141, no formal registration
   procedure for URI Schemes had been established yet, and so IANA only
   informally has registered the 'urn' URI Scheme with a reference to
   [RFC2141].

   Section 7.1 below contains the URI scheme registration template for
   the 'urn' scheme, in accordance with RFC 4395 [RFC4395].
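
   Returning to the lexical equivalence test of Section 5, the four
   preprocessing steps can be sketched as follows. This is an
   illustrative sketch only, not a normative algorithm: the function
   names normalize_urn and lexically_equivalent are hypothetical, the
   input is assumed to be a syntactically valid URN, and the choice of
   lower case for the scheme name and NID and upper case for
   percent-encodings is arbitrary.

```python
import re

PCT_RE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn(urn: str) -> str:
    """Apply the four preprocessing steps of Section 5.

    Assumes `urn` is syntactically valid ("urn:NID:NSS", optionally
    followed by "?query" and/or "#fragment").
    """
    # Split off the fragment first; it MUST NOT be removed.
    body, hash_sep, fragment = urn.partition("#")
    # Step 4: remove the query part, if present.
    body = body.partition("?")[0]
    scheme, nid, nss = body.split(":", 2)
    # Steps 1 and 2: normalize the case of the scheme name and the NID.
    normalized = f"{scheme.lower()}:{nid.lower()}:{nss}{hash_sep}{fragment}"
    # Step 3: normalize the case of any percent-encoding, without
    # removing (decoding) it.
    return PCT_RE.sub(lambda m: m.group(0).upper(), normalized)


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URNs after the preprocessing above."""
    return normalize_urn(a) == normalize_urn(b)
```

   Applied to the examples of Section 5.1, this sketch treats URNs 1,
   2, 3, and 7 as equivalent, URNs 5 and 6 as equivalent only to each
   other, and URNs 4 and 8 as equivalent to none of the others.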

   Note: In order to be usable as a standalone text (after being
   extracted from this RFC), the template below does not contain
   formal anchors to the references listed in Section 11, but instead
   gives the common document designations in prose. However, for
   compliance with editorial policy, it needs to be noted here:

   This registration template refers to RFCs 2169, 2276, 2608, 3401
   through 3404, 3406bis, 3629 (STD 63), and 3986 (STD 66) ([RFC2169]
   [RFC2276] [RFC2608] [RFC3401] [RFC3402] [RFC3403] [RFC3404]
   [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg] [RFC3629] [RFC3986]).

7.1.  Registration of URI Scheme 'urn'

   [ RFC Editor: Please replace "XXXX" in all instances of "RFC XXXX"
   below by the RFC number assigned to this document. ]

   URI scheme name:  urn

   Status:  permanent

   URI scheme syntax:

      See Section 2 of RFC XXXX.

   URI scheme semantics:

      'urn' URIs, known as Uniform Resource Names (URNs), serve as
      persistent, location-independent, resource identifiers for
      concrete and abstract objects that have network accessible
      instances and/or metadata.

      URNs are structured hierarchically into URN Namespaces, the
      management of which is delegated to namespace-specific
      authorities. Each such URN Namespace is founded in an independent
      specification and registered with IANA, following the guidelines
      and procedures of BCP 66 (at the time of this registration: RFC
      3406, an update is in progress as RFC 3406bis
      [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]).

   Encoding considerations:

      All URNs are ASCII strings conforming to the general URI syntax
      from STD 66. As described in Sections 2.2 and 2.3.2 of RFC XXXX,
      there may be characters allowed by the syntax and semantics of the
      identifier system underlying the URN Namespace but not contained
      in the US-ASCII charset. Such characters MUST first be
      represented in Unicode and encoded in UTF-8 according to STD 63.
      Any octets outside the allowed character set MUST then be
      percent-encoded.

      Note that it is perfectly possible that the syntax and semantics
      of an underlying identifier system do not admit specific
      characters allowed by the syntax rules in RFC XXXX.

   Applications/protocols that use this URI scheme:

      URNs that serve to identify abstract resources for protocol
      purposes are expected to be recognized directly by the
      implementations of these protocols.

      In general, resolution systems for URNs are specified on a
      per-namespace basis. If appropriate for the namespace, these
      systems resolve URNs to (possibly multiple) URIs that allow the
      network access to the identified object or metadata on it.

      "Architectural Principles of Uniform Resource Name Resolution"
      (RFC 2276) explains the basic concepts. Some resolution systems
      laid down in IETF specifications are:

      *  Trivial HTTP-based URN Resolution (RFC 2169)

      *  Dynamic Delegation Discovery System (DDDS, RFCs 3401-3404)

      *  Service Location Protocol (SLPv2, RFC 2608)

   Interoperability Considerations:

      Persistence and stability of URNs require appropriate resolution
      systems.

   Security Considerations:

      See Section 8 of RFC XXXX.

   Contact:

      The IETF URNbis working group.
      This registration will be discussed on the following IETF lists:
      urn and uri-review (AT ietf.org).

   Author / Change controller:

      The authors of RFC XXXX.
      Change control is with the IESG.

   References:

      RFC XXXX.

      Procedures for the specification and registration of URN
      Namespaces are detailed in BCP 66 (at the time of this writing:
      RFC 3406; an update is in progress in the URNbis WG as RFC 3406bis
      [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]).

8.  Security Considerations

   This document specifies the syntax and general requirements for URNs,
   which are the specific URIs that use the 'urn' URI scheme.
   As such, the general security considerations of STD 66 [RFC3986]
   apply. However, each URN Namespace will have specific security
   considerations, according to the semantics and usage of the
   underlying namespace. While some namespaces may assign special
   meaning to particular characters generically allowed in the Namespace
   Specific String, any security considerations resulting from such
   assignment are outside the scope of this document. It is REQUIRED by
   BCP 66 (currently [RFC3406], to be replaced by RFC 3406bis
   [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]) that the process of
   registering a namespace identifier include any such considerations.

9.  IANA Considerations

   IANA is asked to update the existing informal registration of the
   'urn' URI Scheme by the template in Section 7.1 above and list this
   RFC as the current normative reference in [IANA-URI].

   IANA is asked to add a note to [IANA-URN] that 'urn' is a permanently
   reserved formal namespace identifier string that cannot be
   registered, in order to avoid confusion with the 'urn' URI scheme.

   IANA is asked to again make available the URN Namespace Registry
   [IANA-URN] in a generic form (i.e. HTML) at the generic URI given in
   the Reference, and to make the XML and TXT versions available from
   that HTML version. (This state already had been achieved, but
   something seems to have been lost in 2011.)

10.  Acknowledgements

   This document is heavily based on RFC 2141, the author of which has
   laid the foundation for this work; that RFC contained the following
   Acknowledgements:

      Thanks to various members of the URN working group for comments on
      earlier drafts of this document. This document is partially
      supported by the National Science Foundation, Cooperative
      Agreement NCR-9218179.

   This document also heavily relies on and acknowledges the work done
   for STD 66 [RFC3986] and earlier RFCs that are being quoted
   informally, in particular RFC 1737 [RFC1737]. The experiences
   gathered during the first (more than a) decade of URN usage were also
   helpful, so individuals and organizations which have implemented and
   used URNs are also acknowledged.

   Many individuals in the URNbis working group have participated in the
   detailed discussion of this memo. Particular thanks for detailed
   review comments and text suggestions go to Juha Hakala and Mykyta
   Yevstifeyev.

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4395]  Hansen, T., Hardie, T., and L. Masinter, "Guidelines and
              Registration Procedures for New URI Schemes", BCP 35,
              RFC 4395, February 2006.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informative References

   [I-D.ietf-urnbis-rfc3406bis-urn-ns-reg]
              Hoenes, A., "Uniform Resource Name (URN) Namespace
              Definition Mechanisms",
              draft-ietf-urnbis-rfc3406bis-urn-ns-reg-02 (work in
              progress), March 2012.

   [IANA]     IANA, "The Internet Assigned Numbers Authority".

   [IANA-URI]
              IANA, "URI Schemes Registry".

   [IANA-URN]
              IANA, "URN Namespace Registry".

   [RFC0615]  Crocker, D., "Proposed Network Standard Data Pathname
              syntax", RFC 615, March 1974.

   [RFC0645]  Crocker, D., "Network Standard Data Specification syntax",
              RFC 645, June 1974.

   [RFC1630]  Berners-Lee, T., "Universal Resource Identifiers in WWW: A
              Unifying Syntax for the Expression of Names and Addresses
              of Objects on the Network as used in the World-Wide Web",
              RFC 1630, June 1994.

   [RFC1736]  Kunze, J., "Functional Recommendations for Internet
              Resource Locators", RFC 1736, February 1995.

   [RFC1737]  Sollins, K. and L. Masinter, "Functional Requirements for
              Uniform Resource Names", RFC 1737, December 1994.

   [RFC1738]  Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
              Resource Locators (URL)", RFC 1738, December 1994.

   [RFC1808]  Fielding, R., "Relative Uniform Resource Locators",
              RFC 1808, June 1995.

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC2169]  Daniel, R., "A Trivial Convention for using HTTP in URN
              Resolution", RFC 2169, June 1997.

   [RFC2276]  Sollins, K., "Architectural Principles of Uniform Resource
              Name Resolution", RFC 2276, January 1998.

   [RFC2396]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifiers (URI): Generic Syntax", RFC 2396,
              August 1998.

   [RFC2608]  Guttman, E., Perkins, C., Veizades, J., and M. Day,
              "Service Location Protocol, Version 2", RFC 2608,
              June 1999.

   [RFC2611]  Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
              "URN Namespace Definition Mechanisms", BCP 33, RFC 2611,
              June 1999.

   [RFC2717]  Petke, R. and I. King, "Registration Procedures for URL
              Scheme Names", BCP 35, RFC 2717, November 1999.

   [RFC2718]  Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke,
              "Guidelines for new URL Schemes", RFC 2718, November 1999.

   [RFC3305]  Mealling, M. and R. Denenberg, "Report from the Joint W3C/
              IETF URI Planning Interest Group: Uniform Resource
              Identifiers (URIs), URLs, and Uniform Resource Names
              (URNs): Clarifications and Recommendations", RFC 3305,
              August 2002.

   [RFC3401]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part One: The Comprehensive DDDS", RFC 3401, October 2002.

   [RFC3402]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Two: The Algorithm", RFC 3402, October 2002.

   [RFC3403]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Three: The Domain Name System (DNS) Database",
              RFC 3403, October 2002.

   [RFC3404]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Four: The Uniform Resource Identifiers (URI)",
              RFC 3404, October 2002.

   [RFC3406]  Daigle, L., van Gulik, D., Iannella, R., and P. Faltstrom,
              "Uniform Resource Names (URN) Namespace Definition
              Mechanisms", BCP 66, RFC 3406, October 2002.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

Appendix A.  Handling of URNs by URL Resolvers/Browsers

   The URN syntax has been defined so that URNs can be used in places
   where URLs are expected. A resolver that conforms to the current URI
   syntax specification [RFC3986] will extract a scheme value of "urn"
   rather than a scheme value of "urn:".

   A URN MUST be considered an opaque URI by URL resolvers and passed
   (with the "urn:" tag) to a URN resolver for resolution. The URN
   resolver can either be an external resolver that the URL resolver
   knows of, or it can be functionality built into the URL resolver.

   To avoid confusion of users, a URL browser SHOULD display the
   complete URN (including the "urn:" tag) to ensure that there is no
   confusion between URN Namespace identifiers and URI Scheme names.

Appendix B.  Collected ABNF (Informative)

   As a service to implementers specifically interested in URN syntax,
   the complete ABNF for URNs is collected here, including the
   referenced rules from [RFC5234] and [RFC3986].
   In case of (unexpected) inconsistencies, these documents remain
   normative for the respective productions.

   URNs conform to the <path-rootless> variant of the general URI syntax
   specified in Section 3 of [RFC3986]:

      URI           = scheme ":" path-rootless [ "?" query ]
                      [ "#" fragment ]

      scheme        = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
      path-rootless = segment-nz *( "/" segment )
      query         = *( pchar / "/" / "?" )
      fragment      = *( pchar / "/" / "?" )

      segment-nz    = 1*pchar
      segment       = *pchar
      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

      unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
      pct-encoded   = "%" HEXDIG HEXDIG
      sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                    / "*" / "+" / "," / ";" / "="

   In the case of URNs, the above rules are subject to more specific
   restrictions:

      scheme        = "urn"
                      ; specific, fixed (assigned) value

      urn-path      = NID ":" NSS
                      ; to be superimposed on <path-rootless>

      NID           = ( ALPHA / DIGIT ) 0*30( ALPHA / DIGIT / "-" )
                      ( ALPHA / DIGIT )
                      ; as in Section 2.1;
                      ; RFC 3406[bis] contains more specific rules

      NSS           = 1*pchar
                      ; or equivalent:  NSS = segment-nz

   The above rules make use of the following "Core Rules" from Appendix
   B.1 of [RFC5234]:

      ALPHA         = %x41-5A / %x61-7A    ; A-Z / a-z
      DIGIT         = %x30-39              ; 0-9
      HEXDIG        = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

Appendix C.  Breakdown of NSS Syntax Evolution since RFC 2141
             (Informative)

   In order to make visible the detailed migration path from RFC 2141
   and the influence of the evolution of URI syntax from RFC 2396 to RFC
   3986 on it, this appendix provides a highly annotated and expanded
   version of the NSS syntax provided in Section 2.2:

      NSS = 1*pchar     ; or equivalent:  NSS = segment-nz

   In particular, the breakdown below serves to provide evidence that
   this syntax correctly reflects the addition of "&" and "~" to the
   repertoire of characters allowed in the NSS portion of URNs
   previously allowed by RFC 2141; it expands on the syntax specified in
   RFC 2141 after translation to standard ABNF.

      NSS      = 1*URN-char

      URN-char = trans / pct-encoded
      ;  Note that <pct-encoded> from RFC 3986 here replaces the
      ;  explicit, expanded form used in RFC 2141.

      trans    = ALPHA / DIGIT / u-other
      ;  Note that RFC 2141's <other> has been disambiguated here
      ;  into <u-other>.
      ;  RFC 2141 also said:
      ;           / reserved
      ;  This caused an ambiguity in RFC 2141 with respect to "%", which
      ;  now is resolved here by omission of this dangling alternative.
      ;
      ;  After adoption of the generic URI syntax from RFC 3986, there
      ;  is no more need to deal here with the higher-level separator
      ;  characters "/", "?", and "#" contained in <reserved>
      ;  (beyond "%", which is fully taken care of by <pct-encoded>),
      ;  which are part of RFC 3986's <gen-delims>, as shown below.

      ;  From RFC 2141:
      ;     reserved = '%" / "/" / "?" / "#"     ; SIC!
      ;                ^ ^

      u-other  = ":" / "@"
               ;  those <gen-delims> from RFC 3986
               ;  specifically allowed in <pchar>.
               ;  From RFC 3986:
               ;     gen-delims = ":" / "/" / "?" / "#" / "[" / "]"
               ;                / "@"

               / "!" / "$" / "'" / "(" / ")"
               / "*" / "+" / "," / ";" / "="
               ;  this is RFC 3986 <sub-delims> except "&".
               ;  From RFC 3986:
               ;     sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
               ;                / "*" / "+" / "," / ";" / "="
               ;  The URNbis WG arrived at unanimous consensus that "&"
               ;  can be allowed without harm to backward compatibility
               ;  for existing URN Namespaces.

               / "-" / "." / "_"    ; <unreserved> except "~"
               ;  From RFC 3986:
               ;     unreserved = ALPHA / DIGIT
               ;                / "-" / "." / "_" / "~"
               ;  The URNbis WG arrived at unanimous consensus that "~"
               ;  can be allowed without harm to backward compatibility
               ;  for existing URN Namespaces.

      ;  Since we now allow "&" and "~", <URN-char> becomes <pchar>,
      ;  greatly simplifying the syntax rules and parsers!

      ;  From RFC 3986:
      ;     segment-nz = 1*pchar
      ;     pchar      = unreserved / pct-encoded / sub-delims
      ;                / ":" / "@"

Appendix D.  Changes since RFC 2141 (Informative)

D.1.  Essential Changes from RFC 2141

   [ RFC Editor: please remove the Appendix D.1 headline and all
   subsequent subsections starting with Appendix D.2. ]

   T.B.D. (after consolidation of this memo)

D.2.  Changes from RFC 2141 to Individual Draft -00

   Abstract amended: URI scheme, replacement for 2141, point to 3406.
   Use contemporary boilerplate. Added transient "Discussion" section.

   s1: added new 1st para (URI scheme) and 3rd para (hierarchy).
   s1.1 (Historical Perspective) added for background & motivation.
   s1.2 (Objective) added.
   s1.3 (2119 keywords) added -- used now throughout normative text.

   s2 (URN Syntax): Shifted from BNF to ABNF; explain relationship to
   3986 and gaps, how the gaps could be bridged, distinguish between URI
   generics and URN specifics; got rid of references to immature
   documents (1630, 1737).
   s2.1 (NID syntax): Use ABNF and RFC 5234 terminals (core rules);
   removed reference to an old draft of 2396; clarified prohibition to
   use "urn" as NID.
   s2.2 (NSS syntax): Shifted from BNF to ABNF; made ABNF consistent
   with subsequent textual description; exposition much expanded,
   showing relationship with 3986 and resulting incompatibilities;
   proposed how to bridge gaps, to make parsing more uniform among URIs;
   updated i18n considerations and pointer to UTF-8 specification.
   s2.3, s2.3.*: reworked and much expanded, along the grouping of
   delimiter characters from 3986 in new s2.3.1 (including old s2.3.2);
   made text fully consistent with ABNF in s2.2; consistent usage of
   term "percent-encoded"; old s2.3.1 became s2.3.2; old s3.4 became
   s3.3.3, providing complete, annotated list of excluded characters,
   ordered by ascending code point; and restating design decisions
   that needed to be made to close gaps to 3986.

   s3 through s6: only minor editorial changes.

   s7: formal registration of 'urn' URI scheme added, using 4395
   template.

   s8: Security Cons. slightly amended.

   s9: new: IANA Cons. added wrt s7.1 and prohibition of NID "urn".

   s10: Acknowledgments amended.

   s11: References split into Normative and Informative; updated refs
   and added many; only FS and BCP allowed as Normative Refs to further
   promotion of document.

   Added Appendices A through D.

D.3.  Changes from Individual Draft -00 to -02

   Updated "Discussion" on front page to point to dedicated urn list.

   Numerous editorial improvements and additions for clarification, in
   particular in the Introduction.  No technical changes.

   More Informative References; missing details supplied in D.2.

D.4.  Changes from Individual Draft -02 to WG Draft -00

   Added new s1.2 to Introduction, with excerpts from RFC 1737 to
   provide background on URN functional and syntax requirements.
   Renumbered previous s1.2 and s1.3 to s1.3 and s1.4, respectively.
   Supplied text in s2 regarding the envisioned use of query and
   fragment parts, based on various discussions -- including a
   preliminary evaluation in PersID.

   Changed "SHOULD never" to "MUST NOT" for NUL character in NSS.

   Various editorial and grammar fixes; corrected STD / BCP numbers.

D.5.  Changes from WG Draft -00 to WG Draft -01

   Reflect WG consensus on adding "&" and "~" to the set of characters
   allowed in the NSS part of URNs, thus aligning URN syntax with
   generic URI syntax from RFC 3986.

   Moved breakdown of NSS syntax evolution from s2.2 to new Appendix C.

   Avoid "[URN] character set" in favor of "character repertoire" to
   minimize potential clashes with IETF terminology on charsets.

   s2.3.3: URN recognition in text documents is regarded as out of
   scope.

   The previous version was ambiguous on whether any query and/or
   fragment parts were regarded as part of the NSS; after closer
   inspection of the syntax, clarification has been added that the
   syntax is indeed superimposed on the ABNF rule for
   URNs, and hence does not cover the trailing higher level parts
   (query, fragment) according to the URI syntax.

   Filled in Appendix B contents.

   Numerous editorial and grammar improvements.

D.6.  Changes from WG Draft -01 to WG Draft -02

   Added note at the beginning of Section 1.2 highlighting the purpose
   of this section.  The URNbis charter excludes a revision of RFC 1738,
   and hence the changes suggested on the list to alter and update this
   section have been dismissed.

   Added hint to URN Namespace designers in Section 2 that ":" is
   customarily used in URN Namespaces to provide further level(s) of
   hierarchical subdivision of NSSs.

   Reworked text on fragment identification issues and resulting
   specification, mostly based on Juha Hakala's evaluation of the
   consensus evolving from the list discussion.
   Modified ABNF rule for NIDs to better align it with rules for similar
   identifiers used in IETF protocols.  The new rule now prohibits a
   trailing hyphen, but defers further restrictions on NID syntax
   (based on the kind of NID) to RFC 3406bis.

   More clearly documented and marked (still open / already closed)
   ISSUES.  The related text will be removed in the next draft version,
   by which time it should have been transferred into the IETF issue
   tracking system.

   Text of Section 3 revised, based on Juha's suggestion.

   In Section 5, added removal of part (but not part)
   to canonicalization steps for the purpose of determining lexical
   equivalence of URNs (Juha's comment).  Also added examples showing
   this.

   Elaborated a bit more on Encoding Consideration in the URI Scheme
   registration template (Juha's comments).

   Numerous editorial corrections and improvements.

Appendix E.  How to Locate IETF Documents (Informative)

   Request For Comments (RFCs) are available from the RFC Editor site
   using the canonical URIs
   or (where 'NNNN' is
   the serial number of the RFC), and from numerous mirror sites.
   Additional metadata for any RFC, including possible Errata, are
   available from (where 'NNNN'
   again is the serial number of the RFC).  An HTML-ized version and a
   PDF facsimile of each RFC are available from the IETF Tools site at
   and
   , respectively.

   Current Internet Draft documents are available via the search engines
   at and
   ; archival copies of older
   IETF documents can be found at .

Author's Address

   Alfred Hoenes (editor)
   TR-Sys
   Gerlinger Str. 12
   Ditzingen  D-71254
   Germany

   EMail: ah@TR-Sys.de