idnits 2.17.1

draft-ietf-urn-syntax-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 6
     longer pages, the longest (page 2) being 60 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 7 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 96: '... reserved and MUST NOT be used....'
     RFC 2119 keyword, line 119: '...rs>). Such strings MUST be translated...'
     RFC 2119 keyword, line 144: '...racter in an URN MUST be followed by t...'
     RFC 2119 keyword, line 147: '... Namespaces MAY designate one or mor...'
     RFC 2119 keyword, line 150: '... a literal sense MUST be encoded with ...'
     (14 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (March 1997) is 9904 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Informational RFC: RFC 1630 (ref. '2')

  ** Downref: Normative reference to an Informational RFC: RFC 1737 (ref. '3')

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

     Summary: 11 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1 Internet-Draft                                              Ryan Moats
2 draft-ietf-urn-syntax-04.txt                                      AT&T
3 Expires in six months                                       March 1997

5                               URN Syntax
6                  Filename: draft-ietf-urn-syntax-04.txt

8 Status of This Memo

10 This document is an Internet-Draft. Internet-Drafts are working
11 documents of the Internet Engineering Task Force (IETF), its
12 areas, and its working groups. Note that other groups may also
13 distribute working documents as Internet-Drafts.

15 Internet-Drafts are draft documents valid for a maximum of six
16 months and may be updated, replaced, or obsoleted by other
17 documents at any time. It is inappropriate to use
18 Internet-Drafts as reference material or to cite them other than
19 as ``work in progress.''

21 To learn the current status of any Internet-Draft, please check
22 the ``1id-abstracts.txt'' listing contained in the
23 Internet-Drafts Shadow Directories on ftp.is.co.za (Africa),
24 nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net
25 (US East Coast), or ftp.isi.edu (US West Coast).
27 Abstract

29 Uniform Resource Names (URNs) are intended to serve as persistent,
30 location-independent, resource identifiers. This document sets
31 forth the canonical syntax for URNs. A discussion of both existing
32 legacy and new namespaces and requirements for URN presentation and
33 transmission is presented. Finally, there is a discussion of URN
34 equivalence and how to determine it.

36 1. Introduction

38 Uniform Resource Names (URNs) are intended to serve as persistent,
39 location-independent, resource identifiers and are designed to make
40 it easy to map other namespaces (which share the properties of URNs)
41 into URN-space. Therefore, the URN syntax provides a means to encode
42 character data in a form that can be sent in existing protocols,
43 transcribed on most keyboards, etc.

45 2. Syntax

47 All URNs have the following syntax (phrases enclosed in quotes are
48 REQUIRED):

50 <URN> ::= "urn:" <NID> ":" <NSS>

52 where <NID> is the Namespace Identifier, and <NSS> is the Namespace
53 Specific String. The leading "urn:" sequence is case-insensitive.
54 The Namespace ID determines the _syntactic_ interpretation of the
55 Namespace Specific String (as discussed in [1]).

57 RFC 1630 [2] and RFC 1737 [3] each presents additional considerations
58 for URN encoding, which have implications as far as limiting syntax.
59 On the other hand, the requirement to support existing legacy naming
60 systems has the effect of broadening syntax. Thus, we discuss the
61 acceptable syntax for both the Namespace Identifier and the Namespace
62 Specific String separately.

64 2.1 Namespace Identifier Syntax

66 The following is the syntax for the Namespace Identifier. To (a) be
67 consistent with all potential resolution schemes and (b) not put any
68 undue constraints on any potential resolution scheme, the syntax for
69 the Namespace Identifier is:

71 <NID> ::= <let-num> [ 1,31<let-num-hyp> ]

73 <let-num-hyp> ::= <upper> | <lower> | <number> | "-"

75 <let-num> ::= <upper> | <lower> | <number>

77 <upper> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
78             "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
79             "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
80             "Y" | "Z"

82 <lower> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
83             "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
84             "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
85             "y" | "z"

87 <number> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
88              "8" | "9"

90 This is slightly more restrictive than what is stated in [4] (which
91 allows the characters "." and "+"). Further, the Namespace
92 Identifier is case insensitive, so that "ISBN" and "isbn" refer to
93 the same namespace.

95 To avoid confusion with the "urn:" identifier, the NID "urn" is
96 reserved and MUST NOT be used.

98 2.2 Namespace Specific String Syntax

100 As required by RFC 1737, there is a single canonical representation
101 of the NSS portion of an URN. The format of this single canonical
102 form follows:

104 <NSS> ::= 1*<URN chars>

106 <URN chars> ::= <trans> | "%" <hex> <hex>

108 <trans> ::= <upper> | <lower> | <number> | <other> | <reserved>

110 <hex> ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
111           "a" | "b" | "c" | "d" | "e" | "f"

113 <other> ::= "(" | ")" | "+" | "," | "-" | "." |
114             ":" | "=" | "@" | ";" | "$" |
115             "_" | "!" | "*" | "'"

117 Depending on the rules governing a namespace, valid identifiers in a
118 namespace might contain characters that are not members of the URN
119 character set above (<URN chars>). Such strings MUST be translated
120 into canonical NSS format before using them as protocol elements or
121 otherwise passing them on to other applications. Translation is done
122 by encoding each character outside the URN character set as a
123 sequence of one to six octets using UTF-8 encoding [5], and the
124 encoding of each of those octets as "%" followed by two characters
125 from the <hex> character set above. The two characters give the
126 hexadecimal representation of that octet.

128 2.3 Reserved characters

130 The remaining character set left to be discussed is the
131 reserved character set, which contains various characters reserved
132 from normal use. The reserved character set follows, with a
133 discussion on the specifics of why each character is reserved.

135 The reserved character set is:

137 <reserved> ::= "%" | "/" | "?" | "#"

139 2.3.1 The "%" character

141 The "%" character is reserved in the URN syntax for introducing the
142 escape sequence for an octet. Literal use of the "%" character in a
143 namespace must be encoded using "%25" in URNs for that namespace.
144 Each "%" character in an URN MUST be followed by two
145 characters from the <hex> character set.

147 Namespaces MAY designate one or more characters from the URN
148 character set as having special meaning for that namespace. If the
149 namespace also uses that character in a literal sense, the
150 character used in a literal sense MUST be encoded with "%" followed
151 by the hexadecimal representation of that octet. Further, a
152 character MUST NOT be "%"-encoded if the character is not a reserved
153 character. Therefore, the process of registering a namespace
154 identifier shall include publication of a definition of which
155 characters have a special meaning to that namespace.

157 2.3.2 The other reserved characters

159 RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
160 purposes. The URN-WG has not yet debated the applicability and
161 precise semantics of those purposes as applied to URNs. Therefore,
162 these characters are RESERVED for future developments. Namespace
163 developers SHOULD NOT use these characters in unencoded form, but
164 rather use the appropriate %-encoding for each character.

166 2.4 Excluded characters

168 The following list is included only for the sake of completeness.
169 Any octets/characters on this list are explicitly NOT part of the URN
170 character set, and if used in an URN, MUST be %-encoded:

172 <excluded> ::= octets 1-32 (1-20 hex) | "\" | """ | "&" | "<"
173              | ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | "~"
174              | octets 127-255 (7F-FF hex)

176 In addition, octet 0 (0 hex) should NEVER be used, in either
177 unencoded or %-encoded form.

179 An URN ends when an octet/character from the excluded character set
180 (<excluded>) is encountered. The character from the excluded
181 character set is NOT part of the URN.

183 3. Support of existing legacy naming systems and new naming systems

185 Any namespace (existing or newly-devised) that is proposed as an
186 URN-namespace and fulfills the criteria of URN-namespaces MUST be
187 expressed in this syntax. If names in these namespaces contain
188 characters other than those defined for the URN character set, they
189 MUST be translated into canonical form as discussed in section 2.2.

191 4. URN presentation and transport

193 The URN syntax defines the canonical format for URNs and all URN
194 transport and interchanges MUST take place in this format. Further,
195 all URN-aware applications MUST offer the option of displaying URNs
196 in this canonical form to allow for direct transcription (for example
197 by cut and paste techniques). Such applications MAY support display
198 of URNs in a more human-friendly form and may use a character set
199 that includes characters that aren't permitted in URN syntax as
200 defined in this RFC (that is, they may replace %-notation by
201 characters in some extended character set in display to humans).

203 5. Lexical Equivalence in URNs

205 For various purposes such as caching, it's often desirable to
206 determine if two URNs are the same without resolving them. The
207 general purpose means of doing so is by testing for "lexical
208 equivalence" as defined below.
210 Two URNs are lexically equivalent if they are octet-by-octet equal
211 after the following preprocessing:

213 1. normalize the case of the leading "urn:" token
214 2. normalize the case of the NID
215 3. normalize the case of any %-escaping

217 Note that %-escaping MUST NOT be removed.

219 Some namespaces may define additional lexical equivalences, such as
220 case-insensitivity of the NSS (or parts thereof). Additional lexical
221 equivalences MUST be documented as part of namespace registration,
222 MUST always have the effect of eliminating some of the false
223 negatives obtained by the procedure above, and MUST NEVER say that
224 two URNs are not equivalent if the procedure above says they are
225 equivalent.

227 6. Examples of lexical equivalence

229 The following URN comparisons highlight the lexical equivalence
230 definitions:

232 1- URN:foo:a123,456
233 2- urn:foo:a123,456
234 3- urn:FOO:a123,456
235 4- urn:foo:A123,456
236 5- urn:foo:a123%2C456
237 6- URN:FOO:a123%2c456
238 URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
239 lexically equivalent to any of the other URNs of the above set. URNs 5
240 and 6 are only lexically equivalent to each other.

242 7. Functional Equivalence in URNs

244 Functional equivalence is determined by practice within a given
245 namespace and managed by resolvers for that namespace. Thus, it is
246 beyond the scope of this document. Namespace registration must
247 include guidance on how to determine functional equivalence for that
248 namespace, i.e. when two URNs are identical within a namespace.

250 8. Security considerations

252 This document specifies the syntax for URNs. While some namespace
253 resolvers may assign special meaning to certain of the characters of
254 the Namespace Specific String, any security considerations resulting
255 from such assignment are outside the scope of this document. It is
256 strongly recommended that the process of registering a namespace
257 identifier include any such considerations.

259 9. Acknowledgments

261 Thanks to various members of the URN working group for comments on
262 earlier drafts of this document. This document is partially
263 supported by the National Science Foundation, Cooperative Agreement
264 NCR-9218179.

266 10. References

268 Request For Comments (RFC) and Internet Draft documents are available
269 from and numerous mirror sites.

271 [1] K. R. Sollins, "Requirements and a Framework for
272     URN Resolution Systems," Internet Draft (work in
273     progress), November 1996.

275 [2] T. Berners-Lee, "Universal Resource Identifiers in
276     WWW," RFC 1630, June 1994.

278 [3] K. Sollins and L. Masinter, "Functional
279     Requirements for Uniform Resource Names," RFC 1737,
280     December 1994.

282 [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
283     Resource Locators (URL)," Internet Draft (work in
284     progress), December 1996.

286 [5] Appendix A.2 of The Unicode Consortium, "The
287     Unicode Standard, Version 2.0", Addison-Wesley
288     Developers Press, 1996. ISBN 0-201-48345-9.

290 11. Editor's address

292 Ryan Moats
293 AT&T
294 15621 Drexel Circle
295 Omaha, NE 68135-2358
296 USA

298 Phone: +1 402 894-9456
299 EMail: jayhawk@ds.internic.net

301 Appendix A. Handling of URNs by URL resolvers/browsers.

303 The URN syntax has been defined so that URNs can be used in places
304 where URLs are expected. A resolver that conforms to the current URL
305 syntax specification [4] will extract a scheme value of "urn:"
306 rather than a scheme value of "urn:<nid>".

308 An URN MUST be considered an opaque URL by URL resolvers and passed
309 (with the "urn:" tag) to an URN resolver for resolution. The URN
310 resolver can either be an external resolver that the URL resolver
311 knows of, or it can be functionality built into the URL resolver.
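
The hand-off described in the preceding paragraph might look like the
following Python sketch. The names dispatch, resolve_urn, and
resolve_url are hypothetical and do not come from this document; the
two resolver arguments stand in for whatever resolution machinery an
implementation actually has.

```python
def dispatch(identifier, resolve_urn, resolve_url):
    """Route an identifier to a URN resolver or to a URL resolver.

    resolve_urn and resolve_url are caller-supplied callables (an
    assumption of this sketch); each receives the identifier unchanged.
    """
    scheme, colon, _rest = identifier.partition(":")
    # The leading "urn:" sequence is case-insensitive (section 2).
    if colon and scheme.lower() == "urn":
        # Treat the URN as opaque: pass it on whole, "urn:" tag included.
        return resolve_urn(identifier)
    return resolve_url(identifier)
```

Whether resolve_urn is an external service or code inside the URL
resolver makes no difference to the caller, which matches the two
options the paragraph allows.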
313 To avoid confusing users, an URL browser SHOULD display the
314 complete URN (including the "urn:" tag) to ensure that there is no
315 confusion between URN namespace identifiers and URL scheme identifiers.

317 This Internet Draft expires September 30, 1997.
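
As an illustration of the translation in section 2.2 and the
lexical-equivalence test in sections 5 and 6, a minimal Python sketch
follows. The function names encode_nss, normalize_urn, and
lexically_equivalent are assumptions made for this example and are not
defined by this document. The sketch treats every input character
literally, so it ignores any special meaning that a namespace may
assign to characters, and it always %-encodes the reserved characters
"%", "/", "?", and "#".

```python
import re

# Characters written unencoded by this sketch: letters, digits, and the
# punctuation listed in section 2.2. The reserved characters are left
# out on purpose so that they are always %-encoded.
UNENCODED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "()+,-.:=@;$_!*'"
)


def encode_nss(raw):
    """Translate a namespace-native string into canonical NSS form."""
    out = []
    for ch in raw:
        if ch == "\x00":
            # Section 2.4: octet 0 is never used, encoded or not.
            raise ValueError("octet 0 is not allowed in an URN")
        if ch in UNENCODED:
            out.append(ch)
        else:
            # UTF-8 encode the character, then write each octet as "%"
            # plus two hex digits. Python's UTF-8 codec produces one to
            # four octets per character.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)


def normalize_urn(urn):
    """Apply the three preprocessing steps of section 5."""
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    # Step 3: change only the case of %-escapes; never decode them.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), nss)
    # Steps 1 and 2: lower-case the "urn:" token and the NID.
    return "urn:" + nid.lower() + ":" + nss


def lexically_equivalent(a, b):
    """Octet-by-octet comparison after normalization."""
    return normalize_urn(a) == normalize_urn(b)
```

Applied to the six URNs of section 6, lexically_equivalent groups URNs
1, 2, and 3 together, leaves URN 4 on its own, and matches URN 5 only
with URN 6.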