idnits 2.17.1 draft-ah-rfc2141bis-urn-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 21) being 66 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC2141, but the abstract doesn't seem to directly say this. It does mention RFC2141 though, so this could be OK. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (May 31, 2010) is 5079 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 4395 (Obsoleted by RFC 7595) -- Obsolete informational reference (is this intentional?): RFC 615 (Obsoleted by RFC 645) -- Obsolete informational reference (is this intentional?): RFC 1738 (Obsoleted by RFC 4248, RFC 4266) -- Obsolete informational reference (is this intentional?): RFC 1808 (Obsoleted by RFC 3986) -- Obsolete informational reference (is this intentional?): RFC 2141 (Obsoleted by RFC 8141) -- Obsolete informational reference (is this intentional?): RFC 2396 (Obsoleted by RFC 3986) -- Obsolete informational reference (is this intentional?): RFC 2611 (Obsoleted by RFC 3406) -- Obsolete informational reference (is this intentional?): RFC 2717 (Obsoleted by RFC 4395) -- Obsolete informational reference (is this intentional?): RFC 2718 (Obsoleted by RFC 4395) -- Obsolete informational reference (is this intentional?): RFC 3406 (Obsoleted by RFC 8141) -- Obsolete informational reference (is this intentional?): RFC 5226 (Obsoleted by RFC 8126) Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 12 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 IETF URNbis A. Hoenes, Ed. 3 Internet-Draft TR-Sys 4 Obsoletes: 2141 (if approved) May 31, 2010 5 Intended status: Standards Track 6 Expires: December 2, 2010 8 Uniform Resource Name (URN) Syntax 9 draft-ah-rfc2141bis-urn-02 11 Abstract 13 Uniform Resource Names (URNs) are intended to serve as persistent, 14 location-independent, resource identifiers. 
This document serves as 15 the foundation of the 'urn' URI Scheme according to RFC 3986 and sets 16 forward the canonical syntax for URNs, which subdivides URNs into 17 "namespaces". A discussion of both existing legacy and new 18 namespaces and requirements for URN presentation and transmission are 19 presented. Finally, there is a discussion of URN equivalence and how 20 to determine it. This document supersedes RFC 2141. 22 The requirements and procedures for URN Namespace registration 23 documents are currently set forth in RFC 3406, which is also expected 24 to be updated by an independent, revised specification. 26 Discussion 28 This draft version has been obtained by importing the text from RFC 29 2141 into modern tools and making a first round of updating steps. 30 It is intended to serve as one of the starting points for an effort 31 to bring URN RFCs in alignment with STD 63, STD 68, BCP 26, and the 32 requirements from emerging distributed national and international URN 33 resolution systems, and advance them on the IETF Standards Track. 35 Comments are welcome on the urn@ietf.org mailing list (or sent to the 36 document editor). 38 Status of This Memo 40 This Internet-Draft is submitted in full conformance with the 41 provisions of BCP 78 and BCP 79. 43 Internet-Drafts are working documents of the Internet Engineering 44 Task Force (IETF). Note that other groups may also distribute 45 working documents as Internet-Drafts. The list of current Internet- 46 Drafts is at http://datatracker.ietf.org/drafts/current/. 48 Internet-Drafts are draft documents valid for a maximum of six months 49 and may be updated, replaced, or obsoleted by other documents at any 50 time. It is inappropriate to use Internet-Drafts as reference 51 material or to cite them other than as "work in progress." 53 This Internet-Draft will expire on December 2, 2010. 55 Copyright Notice 57 Copyright (c) 2010 IETF Trust and the persons identified as the 58 document authors. 
All rights reserved. 60 This document is subject to BCP 78 and the IETF Trust's Legal 61 Provisions Relating to IETF Documents 62 (http://trustee.ietf.org/license-info) in effect on the date of 63 publication of this document. Please review these documents 64 carefully, as they describe your rights and restrictions with respect 65 to this document. Code Components extracted from this document must 66 include Simplified BSD License text as described in Section 4.e of 67 the Trust Legal Provisions and are provided without warranty as 68 described in the Simplified BSD License. 70 This document may contain material from IETF Documents or IETF 71 Contributions published or made publicly available before November 72 10, 2008. The person(s) controlling the copyright in some of this 73 material may not have granted the IETF Trust the right to allow 74 modifications of such material outside the IETF Standards Process. 75 Without obtaining an adequate license from the person(s) controlling 76 the copyright in such materials, this document may not be modified 77 outside the IETF Standards Process, and derivative works of it may 78 not be created outside the IETF Standards Process, except to format 79 it for publication as an RFC or to translate it into languages other 80 than English. 82 Table of Contents 84 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 85 1.1. Historical Perspective and Motivation . . . . . . . . . . 4 86 1.2. Objective of this RFC . . . . . . . . . . . . . . . . . . 5 87 1.3. Requirement Language . . . . . . . . . . . . . . . . . . . 6 88 2. URN Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . 6 89 2.1. Namespace Identifier Syntax . . . . . . . . . . . . . . . 8 90 2.2. Namespace Specific String Syntax . . . . . . . . . . . . . 8 91 2.3. Special and Reserved Characters . . . . . . . . . . . . . 10 92 2.3.1. Delimiter Characters . . . . . . . . . . . . . . . . . 10 93 2.3.2. The '%' character . . . . . . . . . . . . . . . . . 
. 11 94 2.3.3. Other Excluded Characters . . . . . . . . . . . . . . 11 95 3. Support of Existing Legacy Naming Systems and New Naming 96 Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 97 4. URN Presentation and Transport . . . . . . . . . . . . . . . . 12 98 5. Lexical Equivalence in URNs . . . . . . . . . . . . . . . . . 12 99 5.1. Examples of Lexical Equivalence . . . . . . . . . . . . . 13 100 6. Functional Equivalence in URNs . . . . . . . . . . . . . . . . 13 101 7. The 'urn' URI Scheme . . . . . . . . . . . . . . . . . . . . . 13 102 7.1. Registration of URI Scheme 'urn' . . . . . . . . . . . . . 13 103 8. Security Considerations . . . . . . . . . . . . . . . . . . . 15 104 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 16 105 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 16 106 11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 16 107 11.1. Normative References . . . . . . . . . . . . . . . . . . . 16 108 11.2. Informative References . . . . . . . . . . . . . . . . . . 17 109 Appendix A. How to Locate IETF Documents (Informative) . . . . . 19 110 Appendix B. Handling of URNs by URL Resolvers/Browsers . . . . . 19 111 Appendix C. Collected ABNF (Informative) . . . . . . . . . . . . 19 112 Appendix D. Changes since RFC 2141 (Informative) . . . . . . . . 20 113 D.1. Essential Changes from RFC 2141 . . . . . . . . . . . . . 20 114 D.2. Changes from RFC 2141 to draft -00 . . . . . . . . . . . . 20 115 D.3. Changes from draft-00 to draft -01 . . . . . . . . . . . . 21 117 1. Introduction 119 'urn' is a particular URI Scheme (according to STD 63, RFC 3986 120 [RFC3986] and BCP 35, RFC 4395 [RFC4395]) that is dedicated to 121 forming a hierarchical framework for persistent identifiers. 
123 Uniform Resource Names (URNs) are intended to serve as persistent, 124 location-independent, resource identifiers and are designed to make 125 it easy to map other namespaces (that share the properties of URNs) 126 into URI-space. Therefore, the URN syntax provides a means to encode 127 character data in a form that can be sent in existing protocols, 128 transcribed on most keyboards, etc. 130 The first level of hierarchy is given by the classification of URIs 131 into "URI Schemes", and for URNs, the second level is organized into 132 "URN Namespaces". 134 1.1. Historical Perspective and Motivation 136 For the intended audience of this RFC, which is expected to include 137 groups interested in persistent identifiers in general and not in 138 continuous contact with the IETF and the RFC series, this section 139 gives a brief outline of the evolution of the matter over time. 140 Appendix A gives hints on how to obtain RFCs and related information. 142 Attempts to define generally applicable identifiers for network 143 resources go back to the mid-1970s. Among the applicable RFCs 144 is RFC 615 [RFC0615], which was subsequently obsoleted by 145 RFC 645 [RFC0645]. 147 The seminal document in the RFC series regarding URIs (Uniform 148 Resource Identifiers) for use with the World Wide Web (WWW) was 149 RFC 1630 [RFC1630], published in 1994. In the same year, the general 150 concept of Uniform Resource Names was laid down in RFC 1737 151 [RFC1737], and that of Uniform Resource Locators in RFC 1736 152 [RFC1736]. 154 The original formal specification of URN Syntax, RFC 2141 [RFC2141], 155 was adopted in 1997. That document was based on the original 156 specification of URLs (Uniform Resource Locators) in RFC 1738 157 [RFC1738] and RFC 1808 [RFC1808], which later on, in 1998, were 158 generalized and consolidated in the Generic URI specification, RFC 159 2396 [RFC2396].
Most parts of these URI/URL documents were 160 superseded in 2005 by STD 66, RFC 3986 [RFC3986]. Notably, RFC 2141 161 makes -- essentially normative -- reference to a draft version of RFC 162 2396. 164 Over time, the terms "URI", "URL", and "URN" have been refined and 165 slightly shifted according to emerging insight and use. This has 166 been clarified in a joint effort of the IETF and the World Wide Web 167 Consortium, published in 2002 for the IETF in RFC 3305 [RFC3305]. 169 The wealth of URI Schemes and URN Namespaces needs to be organized in 170 a persistent way, in order to guide application developers and users 171 to the standardized top level branches and the related 172 specifications. These registries are maintained by the Internet 173 Assigned Numbers Authority (IANA) [IANA] at [IANA-URI] and 174 [IANA-URN], respectively. Registration procedures for URI Schemes 175 originally had been laid down in RFC 2717 [RFC2717] and guidelines 176 for the related specification documents were given in RFC 2718 177 [RFC2718]. These documents have been obsoleted and consolidated into 178 BCP 35, RFC 4395 [RFC4395], which is based on, and aligned with, RFC 179 3986. 181 Note that RFC 2141 predates RFC 2717 and, although the 'urn' URI 182 scheme is listed in [IANA-URI] with a pointer to RFC 2141, this 183 registration has never been performed formally. 185 Similarly, the URN Namespace definition and registration mechanisms 186 were originally specified in RFC 2611 [RFC2611], which has been 187 obsoleted by BCP 66, RFC 3406 [RFC3406]. Guidelines for documents 188 prescribing IANA procedures have been revised as well over the years, 189 and at the time of this writing, BCP 26, RFC 5226 [RFC5226] is the 190 normative document. Neither RFC 4395 nor RFC 3406 conforms to RFC 191 5226. 193 Early documents specifying URI and URN syntax, including RFC 2141, 194 made use of an ad-hoc variant of the original Backus-Naur Form (BNF) 195 that has never been formally specified.
197 Over the years, the IETF has shifted to the use of a predominant 198 formal language for defining the syntax of textual protocol 199 elements, dubbed "Augmented Backus-Naur Form" (ABNF). The 200 specification of ABNF also has evolved, and now STD 68, RFC 5234 201 [RFC5234] is the normative document for it (which will also be used in 202 this RFC). 204 1.2. Objective of this RFC 206 RFC 2141 does not seamlessly match current Internet Standards. The 207 primary objective of this document is the alignment with the URI 208 Standard [RFC3986] and guidelines [RFC4395], the ABNF Standard 209 [RFC5234] and the current IANA Guidelines [RFC5226] in general. 211 Further, experience from emerging international efforts to establish 212 a general, distributed, stable URN resolution service is expected to 213 be taken into account during the draft stage of this document. 215 For advancing the URN specification on the Internet Standards-Track, 216 it needs to be based on documents of comparable maturity. Therefore, 217 to facilitate further advancement of the formal maturity level of this RFC, it 218 deliberately makes normative references only to documents at Full 219 Standard or Best Current Practice level. 221 Thus, this replacement document for RFC 2141 should make it possible 222 to advance the URN framework step by step on the Internet Standard 223 maturity ladder. All other related documents depend on it; therefore 224 this is the first step to undertake. 226 Out of scope for this document is a revision of the URN Namespace 227 Definition Mechanisms document, BCP 66 [RFC3406]. 229 1.3. Requirement Language 231 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 232 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 233 document are to be interpreted as described in RFC 2119 [RFC2119]. 235 2. URN Syntax 237 This document defines the URI Scheme 'urn'. Hence, URNs are specific 238 URIs as specified in RFC 3986 [RFC3986].
The formal syntax 239 definitions below are given in ABNF according to RFC 5234 [RFC5234] 240 and make use of some "Core Rules" specified in Appendix B of that 241 Standard and several generic rules defined in Appendix A of RFC 3986. 243 The syntax definitions below do, and syntax definitions in dependent 244 documents MUST, conform to the URI syntax specified in RFC 3986, in 245 the sense that additional syntax rules must only constrain the 246 general rules from RFC 3986. In other words: a general URI parser 247 based on RFC 3986 MUST be able to parse any legal URN, and specific 248 semantics can be obtained from URN-specific parsing. 250 NOTE: The remainder of this section still requires MUCH work! 252 URNs conform to the <path-rootless> variant of the general URI syntax 253 specified in Section 3 of [RFC3986]: 255 URI = scheme ":" path-rootless [ "?" query ] [ "#" fragment ] 257 path-rootless = segment-nz *( "/" segment ) 259 segment-nz = 1*pchar 260 segment = *pchar 262 pchar = unreserved / pct-encoded / sub-delims / ":" / "@" 264 with 266 scheme = "urn" 268 and the following additional syntax rule superimposed on <path-rootless> to establish a level of hierarchy called "Namespace": 271 urn-path = NID ":" NSS 273 Here "urn" is the URI scheme name, <NID> is the Namespace Identifier, 274 and <NSS> is the Namespace Specific String. The colons are REQUIRED 275 separator characters. 277 Per RFC 3986, the URN Scheme name (here "urn") is case-insensitive. 279 The Namespace ID (also a case-insensitive string) determines the 280 syntactic structure and the semantic interpretation of the Namespace 281 Specific String. Generic details on NID syntax can be found below in 282 Section 2.1 and the NSS syntax is elaborated upon in Section 2.2. 284 Each particular namespace is based on a specific document that must 285 normatively describe (among other things) the details of the 286 <NSS> values allowed in conjunction with the respective <NID>.
The 287 specification requirements and registration procedures for URN 288 namespaces are the subject of a dedicated document, currently RFC 289 3406 [RFC3406] -- to be updated for conformance with BCP 26 and 290 alignment with implementation experience. 292 Note (to be discussed): 293 RFC 2141 has deferred the decision on whether <query> and <fragment> 294 components are applicable to URNs and reserved the use 295 of bare (unencoded) question mark ("?") and hash ("#") characters 296 in URNs. 298 There is evidence of desire to be able to use these components 299 (which are split off by the high-level parsing rules of RFC 3986), 300 or at least the component, in URNs belonging to 301 selected namespaces. Thus, this draft version tentatively aims at 302 allowing these components in the general syntax. These components, 303 however, shall be allowed if and only if the specification 304 document for a particular URN namespace specifically says so 305 and discusses the ramifications of this addition. 307 Question mark and hash sign remain reserved for this purpose and 308 cannot appear unencoded in an NSS. This way, backwards 309 compatibility with existing URN namespaces is guaranteed and 310 compatibility with general URI parsers is improved. 312 2.1. Namespace Identifier Syntax 314 The following is the syntax for the Namespace Identifier. To (a) be 315 consistent with all potential resolution schemes and (b) not put any 316 undue constraints on any potential resolution scheme, Namespace 317 Identifiers are ASCII strings with the syntax: 319 NID = ( ALPHA / DIGIT ) 0*31 ( ALPHA / DIGIT / "-" ) 321 Namespace Identifiers are case-insensitive, so that for instance 322 "ISBN" and "isbn" refer to the same namespace. 324 To avoid confusion with the URI Scheme name "urn", the NID "urn" is 325 permanently reserved by this RFC and MUST NOT be used or registered. 327 2.2.
Namespace Specific String Syntax 329 Note: 330 In order to make visible the migration path from RFC 2141 and the 331 influence of the evolution of URI syntax from RFC 2396 to RFC 3986 332 on it, at this draft stage, the subsequent syntax description is 333 highly annotated and expanded. After discussion, a substantial 334 consolidation is expected. 336 As already required by RFC 1737, there is a single canonical 337 representation of the NSS portion of an URN. 339 Note: If the DISCUSSes above and below can be affirmed (allowing 340 optional and components as well as "&" and 341 "~" in the path), the syntax could be simplified very much to: 343 NSS = 1*pchar ; or equivalent: NSS = segment-nz 345 The format of this single canonical form follows: 347 NSS = 1*URN-char 349 URN-char = trans / pct-encoded 351 trans = ALPHA / DIGIT / u-other 352 ; NO? / reserved 353 ; Issue: This lead to ambiguity in RFC 2141 wrt "%". 355 u-other = ":" / "@" 356 ; those from RFC 3986 357 ; specifically allowed in . 358 ; From RFC 3986: 359 ; gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" 361 / "!" / "$" / "'" / "(" / ")" 362 / "*" / "+" / "," / ";" / "=" 363 ; this is RFC 3986 except "&". 364 ; From RFC 3986: 365 ; sub-delims = "!" / "$" / "&" / "'" / "(" / ")" 366 ; / "*" / "+" / "," / ";" / "=" 367 ; Issue: can/should "&" be allowed ? 368 ; If we allow and according to the 369 ; generic URI syntax, there seems to be no more need to exclude "&". 371 / "-" / "." / "_" ; except "~" 372 ; From RFC 3986: 373 ; unreserved = ALPHA / DIGIT 374 ; / "-" / "." / "_" / "~" 375 ; Issue: can/should "~" be allowed as well ? 377 ; If we allow "&" and "~" , becomes , 378 ; greatly simplifying the syntax rules and parsers! 380 ; from RFC 2141: 381 ; reserved = '%" / "/" / "?" / "#" ; SIC! 383 Depending on the rules governing a namespace, valid identifiers in a 384 namespace might contain characters that are not members of the URN 385 character set above (). 
Such strings MUST be translated 386 into canonical NSS format before using them as protocol elements or 387 otherwise passing them on to other applications. Translation is done 388 by encoding each character outside the URN character set as a 389 sequence of octets using UTF-8 encoding [RFC3629], and the "percent- 390 encoding" of each of those octets as "%" followed by two 391 characters. The two characters give the hexadecimal representation 392 of that octet. 394 2.3. Special and Reserved Characters 396 The remaining printable characters left to be discussed above 397 comprise the generic delimiters and the reserved characters, which 398 are restricted for special use only. These characters are discussed 399 below, giving the specifics of why each character is special or 400 reserved. 402 2.3.1. Delimiter Characters 404 RFC 3986 [RFC3986] defines the general delimiter characters used in 405 URIs: 407 gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" 409 From among the , ":" and "@" are also included in 410 and hence allowed in the path components of URIs. 412 The at-character ("@") in generic URIs only has a specific meaning 413 when contained in the part, which is absent in URNs. 414 Hence, "@" is available in the part of URNs. 416 With URNs, the colon (":") is used as a delimiter character not only 417 between the scheme name ("urn") and the , but also between the 418 latter and the , and many existing URN namespaces additionally 419 use ":" to further subdivide a single RFC 3986 path segment in the 420 in a hierarchical manner. 422 Note: Using ":" as a sub-delimiter in the path in favor of "/" is 423 attractive because it avoids possible complications that could arise 424 from the inappropriate use of relative URI references [RFC3986] for 425 URNs. 
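As an illustration of the translation into canonical NSS form described in Section 2.2 (UTF-8 encoding of each character outside the URN character set, followed by percent-encoding of each resulting octet), the following Python sketch may help. It is not part of this specification. The names to_canonical_nss and URN_TRANS are hypothetical. The set mirrors the tentative "trans" rule of Section 2.2 and therefore treats "&" and "~" as excluded, although the draft still lists both as open issues. The input is assumed to be a literal namespace-native string, so a literal "%" is encoded as "%25". Uppercase hexadecimal digits are an arbitrary choice, since Section 5 normalizes the case of percent-encodings before comparison.

```python
import string

# Characters allowed unencoded in an NSS under the draft's tentative grammar
# (ALPHA / DIGIT / u-other). Leaving out "&" and "~" is an assumption that
# follows the draft's current, still-discussed rule set.
URN_TRANS = frozenset(string.ascii_letters + string.digits + ":@!$'()*+,;=-._")


def to_canonical_nss(raw: str) -> str:
    """Percent-encode every character of `raw` outside the URN character set."""
    if not raw:
        raise ValueError("an NSS must contain at least one character")
    out = []
    for ch in raw:
        if ch in URN_TRANS:
            out.append(ch)
        else:
            # Encode the character as UTF-8, then emit "%" plus two
            # hexadecimal digits for each resulting octet.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)
```

Under these assumptions, to_canonical_nss("a/b") gives "a%2Fb", while a string such as "a123,456" that already lies within the URN character set is returned unchanged.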
427 The characters "/", "?", and "#" separate path components and the 428 <query> and <fragment> parts in the generic URI syntax; they are 429 restricted to this role in URNs as well, although the path in URNs 430 only admits a single segment and hence "/" is not allowed. 431 Therefore, these characters MUST NOT appear in the NSS part of a 432 URN in unencoded form. Namespaces that need these characters MUST 433 employ in their URNs the appropriate percent-encoding for each 434 character. 436 The square brackets ("[" and "]") also play a particular role when 437 contained in the <authority> part, which is absent in URNs. However, 438 for conformance with the generic URI syntax, they are not allowed 439 literally in the path component of URNs. If a specific URN 440 namespace reflects semantics that require these characters, they MUST 441 be percent-encoded in the respective URNs. 443 2.3.2. The '%' character 445 The "%" character is reserved in the URN syntax for introducing the 446 escape sequence for an octet that is either not a printable ASCII 447 character or reserved for special purposes, as described in this 448 section. Literal use of the "%" character in an underlying namespace 449 must be encoded as "%25" in URNs for that namespace. A 450 "%" character in a URN MUST always be followed by two hexadecimal 451 characters; the three characters together semantically form an abstract octet. 454 Namespaces MAY designate one or more characters from the URN 455 character set as having special meaning for that namespace. If the 456 namespace also uses that character in a literal sense, the 457 character used in a literal sense MUST be encoded with "%" followed 458 by the hexadecimal representation of that octet. Further, a 459 character MUST NOT be percent-encoded if the character is not a 460 reserved character. Therefore, the process of registering a 461 namespace identifier shall include publication of a definition of 462 which characters have a special meaning to that namespace. 464 2.3.3.
Other Excluded Characters 466 The following list is included only for the sake of completeness. It 467 includes the characters discussed in Sections 2.3.1 and 2.3.2. Any 468 octets/characters on this list are explicitly NOT part of the URN 469 character set, and if used in an URN, MUST be percent-encoded. 471 excluded = CTL / SP ; control characters and space 472 / DQUOTE ; " 473 / "#" ; from 474 / "%" ; see above 475 ; DISCUSS! / "&" ; DISCUSS -- see above! 476 / "/" ; from 477 / "<" / ">" 478 / "?" ; from 479 / "[" ; from 480 / "\" 481 / "]" ; from 482 / "^" 483 / "`" 484 / "{" / "|" / "}" 485 ; DISCUSS! / "~" ; DISCUSS -- see above! 486 / %x7F ; DEL (control character) 487 / %x80-FF ; non-ASCII 489 In addition, the NUL octet (0 hex) SHOULD never be used, in either 490 unencoded or percent-encoded form. 492 In textual context, a URN ends when an octet/character from the 493 excluded character set () is encountered. The character 494 from the excluded character set is NOT part of the URN. 496 [ Does that still make sense? -- it collides with possible question / 497 fragment! ] 499 3. Support of Existing Legacy Naming Systems and New Naming Systems 501 Any namespace (existing or newly devised) that is proposed as a URN 502 namespace and fulfills the criteria of URN namespaces MUST be 503 expressed in this syntax. If names in these namespaces contain 504 characters other than those defined for the URN character set, they 505 MUST be translated into canonical form as discussed in Section 2.2. 507 4. URN Presentation and Transport 509 The URN syntax defines the canonical format for URNs and all URN 510 transport and interchanges MUST take place in this format. Further, 511 all URN-aware applications MUST offer the option of displaying URNs 512 in this canonical form to allow for direct transcription (for example 513 by cut-and-paste techniques). 
Such applications MAY support display 514 of URNs in a more human-friendly form and may use a character set 515 that includes characters that aren't permitted in URN syntax as 516 defined in this RFC (that is, they may replace %-notation by 517 characters in some extended character set in display to humans). 519 5. Lexical Equivalence in URNs 521 For various purposes such as caching, it is often desirable to 522 determine whether two URNs are the same without resolving them. The 523 general purpose means of doing so is by testing for "lexical 524 equivalence" as defined below. 526 Two URNs are lexically equivalent if they are octet-by-octet equal 527 after the following preprocessing: 528 1. normalize the case of the leading "urn" scheme; 529 2. normalize the case of the NID; 530 3. normalize the case of any percent-encoding. 531 Note that percent-encoding MUST NOT be removed. 533 Some namespaces may define additional lexical equivalences, such as 534 case-insensitivity of the NSS (or parts thereof). Additional lexical 535 equivalences MUST be documented as part of namespace registration, 536 MUST always have the effect of eliminating some of the false 537 negatives obtained by the procedure above, and MUST NEVER say that 538 two URNs are not equivalent if the procedure above says they are 539 equivalent. 541 5.1. Examples of Lexical Equivalence 543 The following URN comparisons highlight the lexical equivalence 544 definitions: 546 1- URN:foo:a123,456 547 2- urn:foo:a123,456 548 3- urn:FOO:a123,456 549 4- urn:foo:A123,456 550 5- urn:foo:a123%2C456 551 6- URN:FOO:a123%2c456 553 URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not 554 lexically equivalent to any of the other URNs of the above set. 555 URNs 5 and 6 are only lexically equivalent to each other. 557 6. Functional Equivalence in URNs 559 Functional equivalence is determined by practice within a given 560 namespace and managed by resolvers for that namespace. 
Thus, it is 561 beyond the scope of this document. Namespace registrations must 562 include guidance on how to determine functional equivalence for that 563 namespace, i.e. when two URNs are identical within a namespace. 565 7. The 'urn' URI Scheme 567 At the time of publication of RFC 2141, no formal registration 568 procedure for URI Schemes had been established yet, and so IANA has only 569 informally registered the 'urn' URI Scheme with a reference to 570 [RFC2141]. 572 Section 7.1 below contains the URI scheme registration template for 573 the 'urn' scheme, in accordance with RFC 4395 [RFC4395]. 575 Note: In order to be usable as a standalone text (after being 576 extracted from this RFC), the template below does not contain 577 formal anchors to the references listed in section 11, but instead 578 gives the common RFC designations in prose. However, for 579 compliance with editorial policy, it needs to be noted: 581 This registration template refers to RFCs 2169, 2276, 2608, 3401 582 through 3404, and 3406 [RFC2169] [RFC2276] [RFC2608] [RFC3401] 583 [RFC3402] [RFC3403] [RFC3404] [RFC3406]. 585 7.1. Registration of URI Scheme 'urn' 587 [ RFC Editor: please replace "XXXX" in all instances of "RFC XXXX" 588 below by the RFC number assigned to this document. ] 589 URI scheme name: urn 591 Status: permanent 593 URI scheme syntax: 595 See Section 2 of RFC XXXX. 597 URI scheme semantics: 599 'urn' URIs, known as Uniform Resource Names (URNs), serve as 600 persistent, location-independent, resource identifiers for 601 concrete and abstract objects that have network accessible 602 instances and/or metadata. 604 URNs are structured hierarchically into URN Namespaces, the 605 management of which is delegated to namespace-specific 606 authorities. Each such URN namespace is founded in an independent 607 specification and registered with IANA, following the guidelines 608 and procedures of BCP 66 (at the time of this registration: RFC 609 3406).
611 Encoding considerations: 613 All URNs are ASCII strings conforming to the general URI syntax 614 from STD 66. As described in Sections 2.2 and 2.3.2 of RFC XXXX, 615 characters needed by the URN namespace specific semantics but not 616 contained in the US-ASCII charset MUST be encoded in UTF-8 617 according to STD 63; any octets outside the allowed character set 618 MUST then be percent-encoded. 620 Applications/protocols that use this URI scheme: 622 URNs that serve to identify abstract resources for protocol 623 purposes are expected to be recognized directly by the 624 implementations of these protocols. 626 In general, resolution systems for URNs are specified on a per- 627 namespace basis. If appropriate for the namespace, these systems 628 resolve URNs to (possibly multiple) URIs that allow network 629 access to the identified object or metadata about it. 631 "Architectural Principles of Uniform Resource Name Resolution" 632 (RFC 2276) explains the basic concepts. Some resolution systems 633 laid down in IETF specifications are: 635 * Trivial HTTP-based URN Resolution (RFC 2169) 637 * Dynamic Delegation Discovery System (DDDS, RFCs 3401-3404) 638 * Service Location Protocol (SLPv2, RFC 2608) 640 Interoperability Considerations: 642 Persistence and stability of URNs require appropriate resolution 643 systems. 645 Security Considerations: 647 See Section 8 of RFC XXXX. 649 Contact: 651 Provisionally: the authors of this draft. 652 This registration will be discussed on the following IETF lists: 653 uri-review and urn. 654 It is expected that a "URNbis" WG be formed in the IETF and take 655 over control of this document, and that subsequently the Chairs 656 and mailing list of that WG serve as the primary contact. 658 Author / Change controller: 660 The authors of this draft. 661 Change control is with the IESG. 663 References: 665 RFC XXXX.

      Procedures for the specification and registration of URN
      namespaces are detailed in BCP 66 (at the time of this writing:
      RFC 3406; an rfc3406-bis document is expected as a deliverable of
      the proposed "URNbis" WG).

8.  Security Considerations

   This document specifies the syntax and general requirements for URNs,
   which are the specific URIs that use the 'urn' URI scheme.  As such,
   the general security considerations of STD 66 [RFC3986] apply.
   However, each URN namespace will have specific security
   considerations, according to the semantics and usage of the
   underlying namespace.  While some namespaces may assign special
   meaning to certain characters of the Namespace Specific String, any
   security considerations resulting from such assignment are outside
   the scope of this document.  It is REQUIRED by BCP 66 [RFC3406] that
   the process of registering a namespace identifier include any such
   considerations.

9.  IANA Considerations

   IANA is asked to update the existing informal registration of the
   'urn' URI Scheme with the template in Section 7.1 above and list this
   RFC as the current normative reference in [IANA-URI].

   IANA is asked to add a note to [IANA-URN] that 'urn' is a permanently
   reserved formal namespace identifier string that cannot be
   registered, in order to avoid confusion with the 'urn' URI scheme.

10.  Acknowledgements

   This document is heavily based on RFC 2141, the author of which has
   laid the foundation for this work, and which contained the following
   Acknowledgements:

      Thanks to various members of the URN working group for comments on
      earlier drafts of this document.  This document is partially
      supported by the National Science Foundation, Cooperative
      Agreement NCR-9218179.

   This document also heavily relies on and acknowledges the work done
   for STD 66 [RFC3986].

   Your name could go here ...

11.  References

11.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, November 2003.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005.

   [RFC4395]  Hansen, T., Hardie, T., and L. Masinter, "Guidelines and
              Registration Procedures for New URI Schemes", BCP 35,
              RFC 4395, February 2006.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

11.2.  Informative References

   [IANA]     IANA, "The Internet Assigned Numbers Authority".

   [IANA-URI] IANA, "URI Schemes Registry".

   [IANA-URN] IANA, "URN Namespace Registry".

   [RFC0615]  Crocker, D., "Proposed Network Standard Data Pathname
              syntax", RFC 615, March 1974.

   [RFC0645]  Crocker, D., "Network Standard Data Specification
              syntax", RFC 645, June 1974.

   [RFC1630]  Berners-Lee, T., "Universal Resource Identifiers in WWW:
              A Unifying Syntax for the Expression of Names and
              Addresses of Objects on the Network as used in the World-
              Wide Web", RFC 1630, June 1994.

   [RFC1736]  Kunze, J., "Functional Recommendations for Internet
              Resource Locators", RFC 1736, February 1995.

   [RFC1737]  Sollins, K. and L. Masinter, "Functional Requirements for
              Uniform Resource Names", RFC 1737, December 1994.

   [RFC1738]  Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
              Resource Locators (URL)", RFC 1738, December 1994.

   [RFC1808]  Fielding, R., "Relative Uniform Resource Locators",
              RFC 1808, June 1995.

   [RFC2141]  Moats, R., "URN Syntax", RFC 2141, May 1997.

   [RFC2169]  Daniel, R., "A Trivial Convention for using HTTP in URN
              Resolution", RFC 2169, June 1997.

   [RFC2276]  Sollins, K., "Architectural Principles of Uniform
              Resource Name Resolution", RFC 2276, January 1998.

   [RFC2396]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifiers (URI): Generic Syntax", RFC 2396,
              August 1998.

   [RFC2608]  Guttman, E., Perkins, C., Veizades, J., and M. Day,
              "Service Location Protocol, Version 2", RFC 2608,
              June 1999.

   [RFC2611]  Daigle, L., van Gulik, D., Iannella, R., and P.
              Faltstrom, "URN Namespace Definition Mechanisms", BCP 33,
              RFC 2611, June 1999.

   [RFC2717]  Petke, R. and I. King, "Registration Procedures for URL
              Scheme Names", BCP 35, RFC 2717, November 1999.

   [RFC2718]  Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke,
              "Guidelines for new URL Schemes", RFC 2718,
              November 1999.

   [RFC3305]  Mealling, M. and R. Denenberg, "Report from the Joint
              W3C/IETF URI Planning Interest Group: Uniform Resource
              Identifiers (URIs), URLs, and Uniform Resource Names
              (URNs): Clarifications and Recommendations", RFC 3305,
              August 2002.

   [RFC3401]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part One: The Comprehensive DDDS", RFC 3401,
              October 2002.

   [RFC3402]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Two: The Algorithm", RFC 3402, October 2002.

   [RFC3403]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Three: The Domain Name System (DNS) Database",
              RFC 3403, October 2002.

   [RFC3404]  Mealling, M., "Dynamic Delegation Discovery System (DDDS)
              Part Four: The Uniform Resource Identifiers (URI)",
              RFC 3404, October 2002.

   [RFC3406]  Daigle, L., van Gulik, D., Iannella, R., and P.
              Faltstrom, "Uniform Resource Names (URN) Namespace
              Definition Mechanisms", BCP 66, RFC 3406, October 2002.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

Appendix A.  How to Locate IETF Documents (Informative)

   Request For Comments (RFCs) are available from the RFC Editor site
   using the canonical URIs or (where 'NNNN' is the serial number of the
   RFC), and from numerous mirror sites.

   Additional metadata for any RFC, including possible Errata, are
   available from (where 'NNNN' again is the serial number of the RFC).
   An HTML-ized version and a PDF facsimile of each RFC are available
   from the IETF Tools site at and , respectively.

   Current Internet Draft documents are available via the search engines
   at and ; archival copies of older IETF documents can be found at .

Appendix B.  Handling of URNs by URL Resolvers/Browsers

   The URN syntax has been defined so that URNs can be used in places
   where URLs are expected.  A resolver that conforms to the current URI
   syntax specification [RFC3986] will extract a scheme value of "urn"
   rather than a scheme value of "urn:".

   A URN MUST be considered an opaque URI by URL resolvers and passed
   (with the "urn:" tag) to a URN resolver for resolution.  The URN
   resolver can either be an external resolver that the URL resolver
   knows of, or it can be functionality built into the URL resolver.

   To avoid confusing users, a URL browser SHOULD display the complete
   URN (including the "urn:" tag), so that URN namespace identifiers
   are not mistaken for URI scheme names.

Appendix C.  Collected ABNF (Informative)

   As a service to implementers specifically interested in URN syntax,
   after consolidation of Section 2, the complete ABNF for URNs will be
   collected here, including the referenced rules from [RFC5234] and
   [RFC3986].  In case of (unexpected) inconsistencies, these documents
   remain normative for the respective productions.

   T.B.D.

   ...

Appendix D.  Changes since RFC 2141 (Informative)

D.1.  Essential Changes from RFC 2141

   [ RFC Editor: please remove the Appendix D.1 headline and all
   subsequent subsections starting with Appendix D.2. ]

   T.B.D. (after consolidation of this memo)

D.2.  Changes from RFC 2141 to draft -00

   Abstract amended: URI scheme, replacement for 2141, point to 3406.
   Use contemporary boilerplate.  Added transient "Discussion" section.

   s1: added new 1st para (URI scheme) and 3rd para (hierarchy).
   s1.1 (Historical Perspective) added for background & motivation.
   s1.2 (Objective) added.
   s1.3 (2119 keywords) added -- used now throughout normative text.

   s2 (URN Syntax): Shifted from BNF to ABNF; explain relationship to
   3986 and gaps, how the gaps could be bridged, distinguish between URI
   generics and URN specifics; got rid of references to immature
   documents (1630, 1737).
   s2.1 (NID syntax): Use ABNF and RFC 5234 terminals (core rules);
   removed reference to an old draft of 2396; clarified prohibition to
   use "urn" as NID.
   s2.2 (NSS syntax): Shifted from BNF to ABNF; made ABNF consistent
   with subsequent textual description; exposition much expanded,
   showing relationship with 3986 and resulting incompatibilities;
   proposed how to bridge gaps, to make parsing more uniform among URIs;
   updated i18n considerations and pointer to UTF-8 specification.
   s2.3, s2.3.*: reworked and much expanded, along the grouping of
   delimiter characters from 3986 in new s2.3.1 (including old s2.3.2);
   made text fully consistent with ABNF in s2.2; consistent usage of
   term "percent-encoded"; old s2.3.1 became s2.3.2; old s3.4 became
   s3.3.3, providing complete, annotated list of excluded characters,
   ordered by ascending code point; and restating design decisions
   needed to be made to close gaps to 3986.

   s3 through s6: only minor editorial changes.

   s7: formal registration of 'urn' URI scheme added, using 4395
   template.

   s8: Security Cons. slightly amended.

   s9: new: IANA Cons. added wrt s7.1 and prohibition of NID "urn".

   s10: Acknowledgments amended.

   s11: References split into Normative and Informative; updated refs
   and added many; only FS and BCP allowed as Normative Refs to further
   promotion of document.

   Added Appendices A through D.

D.3.  Changes from draft-00 to draft -02

   Updated "Discussion" on front page to point to dedicated urn list.

   Numerous editorial improvements and additions for clarification, in
   particular in the Introduction.  No technical changes.

   More Informative References; missing details supplied in D.1.

Author's Address

   Alfred Hoenes (editor)
   TR-Sys
   Gerlinger Str. 12
   Ditzingen  D-71254
   Germany

   EMail: ah@TR-Sys.de
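
Non-normative illustration for Appendix B

   The handling rule of Appendix B can be sketched as follows.  This is
   only a minimal illustration and not part of the specification: the
   function names resolve_urn, resolve_url, and dispatch, as well as
   the sample URN, are hypothetical and chosen for this sketch.  It
   relies on a generic URI parser (here Python's urllib.parse.urlsplit)
   reporting the scheme as "urn", without the trailing colon, and it
   hands the complete URN, "urn:" tag included, to the URN resolver.

```python
from urllib.parse import urlsplit


def resolve_urn(urn: str) -> str:
    """Hypothetical stand-in for a URN resolver (external or built in)."""
    # A real resolver would apply namespace-specific resolution here.
    return f"URN resolver received: {urn}"


def resolve_url(url: str) -> str:
    """Hypothetical stand-in for ordinary URL resolution."""
    return f"URL resolver received: {url}"


def dispatch(uri: str) -> str:
    """Route a URI in the manner Appendix B describes."""
    # A generic parser yields the scheme without the trailing colon.
    scheme = urlsplit(uri).scheme
    if scheme == "urn":
        # Treat the URN as opaque and pass it on whole, "urn:" tag included.
        return resolve_urn(uri)
    return resolve_url(uri)


if __name__ == "__main__":
    print(dispatch("urn:isbn:0451450523"))
    print(dispatch("http://example.com/"))
```

   Passing the unmodified string, rather than only the part after
   "urn:", keeps the URL resolver from having to interpret the
   namespace identifier or the Namespace Specific String.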