idnits 2.17.1

draft-ietf-urn-syntax-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-18) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 6
     longer pages, the longest (page 2) being 60 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 7 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 88: '... reserved and MUST NOT be used....'

     RFC 2119 keyword, line 110: '...ars). Such strings MUST be translated...'

     RFC 2119 keyword, line 135: '...racter in an URN MUST be followed by t...'

     RFC 2119 keyword, line 138: '... Namespaces MAY designate one or mor...'

     RFC 2119 keyword, line 141: '... a literal sense MUST be encoded with ...'

     (14 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (March 1997) is 9896 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  ** Downref: Normative reference to an Informational RFC: RFC 1630 (ref. '2')

  ** Downref: Normative reference to an Informational RFC: RFC 1737 (ref. '3')

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

     Summary: 11 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Internet-Draft                                              Ryan Moats
3    draft-ietf-urn-syntax-05.txt                                      AT&T
4    Expires in six months                                       March 1997

6                                    URN Syntax
7                     Filename: draft-ietf-urn-syntax-05.txt

9    Status of This Memo

11      This document is an Internet-Draft. Internet-Drafts are working
12      documents of the Internet Engineering Task Force (IETF), its
13      areas, and its working groups. Note that other groups may also
14      distribute working documents as Internet-Drafts.

16      Internet-Drafts are draft documents valid for a maximum of six
17      months and may be updated, replaced, or obsoleted by other
18      documents at any time. It is inappropriate to use Internet-
19      Drafts as reference material or to cite them other than as ``work
20      in progress.''

22      To learn the current status of any Internet-Draft, please check
23      the ``1id-abstracts.txt'' listing contained in the Internet-
24      Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net
25      (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
26      Coast), or ftp.isi.edu (US West Coast).
28   Abstract

30      Uniform Resource Names (URNs) are intended to serve as persistent,
31      location-independent, resource identifiers. This document sets
32      forth the canonical syntax for URNs. A discussion of both existing
33      legacy and new namespaces and requirements for URN presentation and
34      transmission are presented. Finally, there is a discussion of URN
35      equivalence and how to determine it.

37   1. Introduction

39      Uniform Resource Names (URNs) are intended to serve as persistent,
40      location-independent, resource identifiers and are designed to make
41      it easy to map other namespaces (which share the properties of URNs)
42      into URN-space. Therefore, the URN syntax provides a means to encode
43      character data in a form that can be sent in existing protocols,
44      transcribed on most keyboards, etc.

46   2. Syntax

48      All URNs have the following syntax (phrases enclosed in quotes are
49      REQUIRED):

51         URN ::= "urn:" NID ":" NSS

53      where NID is the Namespace Identifier, and NSS is the Namespace
54      Specific String. The leading "urn:" sequence is case-insensitive.
55      The Namespace ID determines the _syntactic_ interpretation of the
56      Namespace Specific String (as discussed in [1]).

58      RFC 1630 [2] and RFC 1737 [3] each present additional considerations
59      for URN encoding, which have implications for limiting the syntax.
60      On the other hand, the requirement to support existing legacy naming
61      systems has the effect of broadening syntax. Thus, we discuss the
62      acceptable syntax for both the Namespace Identifier and the Namespace
63      Specific String separately.

65   2.1 Namespace Identifier Syntax

67      The following is the syntax for the Namespace Identifier.
68      To (a) be consistent with all potential resolution schemes and (b) not put any
69      undue constraints on any potential resolution scheme, the syntax for
70      the Namespace Identifier is:

72         NID ::= let-num [ 1*31let-num-hyp ]

74         let-num-hyp ::= letter / number / "-"

76         let-num ::= letter / number

78         letter ::= %x41..5A / %x61..7A

80         number ::= %x30..39

82      This is slightly more restrictive than what is stated in [4] (which
83      allows the characters "." and "+"). Further, the Namespace
84      Identifier is case insensitive, so that "ISBN" and "isbn" refer to
85      the same namespace.

87      To avoid confusion with the "urn:" identifier, the NID "urn" is
88      reserved and MUST NOT be used.

90   2.2 Namespace Specific String Syntax

92      As required by RFC 1737, there is a single canonical representation
93      of the NSS portion of an URN. The format of this single canonical
94      form follows:

96         NSS ::= 1*URN_chars

98         URN_chars ::= trans / ("%" hex hex)

100        trans ::= letter / number / other / reserved

102        hex ::= number / %x41..46 / %x61..66

104        other ::= "(" / ")" / "+" / "," / "-" / "." /
105                  ":" / "=" / "@" / ";" / "$" / "_" /
106                  "!" / "*" / "'"

108     Depending on the rules governing a namespace, valid identifiers in a
109     namespace might contain characters that are not members of the URN
110     character set above (URN_chars). Such strings MUST be translated
111     into canonical NSS format before using them as protocol elements or
112     otherwise passing them on to other applications. Translation is done
113     by encoding each character outside the URN character set as a
114     sequence of one to six octets using normalized UTF8 [5], and the
115     encoding of each of those octets as "%" followed by two characters
116     from the hex character set above. The two characters give the
117     hexadecimal representation of that octet.
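
The NID grammar of section 2.1 and the NSS translation of section 2.2 can be illustrated with a short Python sketch. This is not part of the draft. The names `is_valid_nid` and `encode_nss` are hypothetical, the input to `encode_nss` is assumed to be a namespace-native string in which every character is meant literally (so "%", "/", "?" and "#" are always %-encoded), hex digits are emitted in upper case by choice, and Python's built-in UTF-8 encoder is used without any additional Unicode normalization step.

```python
import re

# NID ::= let-num [ 1*31let-num-hyp ]  (1 to 32 characters in total)
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")

# Characters this sketch leaves unencoded: letter / number / other.
# The reserved characters ("%", "/", "?", "#") are left out on purpose,
# so a literal occurrence of any of them is always %-encoded.
UNENCODED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "()+,-.:=@;$_!*'"
)


def is_valid_nid(nid: str) -> bool:
    """Return True if nid matches the NID grammar and is not the reserved "urn"."""
    return NID_RE.fullmatch(nid) is not None and nid.lower() != "urn"


def encode_nss(raw: str) -> str:
    """Translate a namespace-native string into %-encoded NSS form."""
    if not raw:
        raise ValueError("an NSS needs at least one character")
    pieces = []
    for ch in raw:
        if ch in UNENCODED:
            pieces.append(ch)
        else:
            # Encode the character as UTF-8, then each octet as "%" hex hex.
            pieces.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)
```

Under these assumptions, `encode_nss("50% off")` yields `50%25%20off`, and `is_valid_nid("URN")` is false because the NID "urn" is reserved regardless of case.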
119  2.3 Reserved characters

121     The remaining character set to be discussed is the
122     reserved character set, which contains various characters reserved
123     from normal use. The reserved character set follows, with a
124     discussion on the specifics of why each character is reserved.

126     The reserved character set is:

128        reserved ::= "%" / "/" / "?" / "#"

130  2.3.1 The "%" character

132     The "%" character is reserved in the URN syntax for introducing the
133     escape sequence for an octet. Literal use of the "%" character in a
134     namespace must be encoded using "%25" in URNs for that namespace.
135     The presence of an "%" character in an URN MUST be followed by two
136     characters from the hex character set.

138     Namespaces MAY designate one or more characters from the URN
139     character set as having special meaning for that namespace. If the
140     namespace also uses that character in a literal sense, the
141     character used in a literal sense MUST be encoded with "%" followed
142     by the hexadecimal representation of that octet. Further, a
143     character MUST NOT be "%"-encoded if the character is not a reserved
144     character. Therefore, the process of registering a namespace
145     identifier shall include publication of a definition of which
146     characters have a special meaning to that namespace.

148  2.3.2 The other reserved characters

150     RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
151     purposes. The URN-WG has not yet debated the applicability and
152     precise semantics of those purposes as applied to URNs. Therefore,
153     these characters are RESERVED for future developments. Namespace
154     developers SHOULD NOT use these characters in unencoded form, but
155     rather use the appropriate %-encoding for each character.

157  2.4 Excluded characters

159     The following list is included only for the sake of completeness.
160     Any octets/characters on this list are explicitly NOT part of the URN
161     character set, and if used in an URN, MUST be %-encoded:

163        excluded ::= octets 1-32 (1-20 hex) / "\" / """ /
164                     "&" / "<" / ">" / "[" / "]" / "^" /
165                     "`" / "{" / "|" / "}" / "~" /
166                     octets 127-255 (7F-FF hex)

168     In addition, octet 0 (0 hex) should NEVER be used, in either
169     unencoded or %-encoded form.

171     An URN ends when an octet/character from the excluded character set
172     (excluded) is encountered. The character from the excluded character
173     set is NOT part of the URN.

175  3. Support of existing legacy naming systems and new naming systems

177     Any namespace (existing or newly-devised) that is proposed as an
178     URN-namespace and fulfills the criteria of URN-namespaces MUST be
179     expressed in this syntax. If names in these namespaces contain
180     characters other than those defined for the URN character set, they
181     MUST be translated into canonical form as discussed in section 2.2.

183  4. URN presentation and transport

185     The URN syntax defines the canonical format for URNs and all URN
186     transport and interchanges MUST take place in this format. Further,
187     all URN-aware applications MUST offer the option of displaying URNs
188     in this canonical form to allow for direct transcription (for example
189     by cut and paste techniques). Such applications MAY support display
190     of URNs in a more human-friendly form and may use a character set
191     that includes characters that aren't permitted in URN syntax as
192     defined in this RFC (that is, they may replace %-notation by
193     characters in some extended character set in display to humans).

195  5. Lexical Equivalence in URNs

197     For various purposes such as caching, it's often desirable to
198     determine if two URNs are the same without resolving them. The
199     general purpose means of doing so is by testing for "lexical
200     equivalence" as defined below.
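
A lexical equivalence test of this kind can be sketched in Python. This is not part of the draft. The sketch assumes the three preprocessing steps that section 5 lists (normalizing the case of the leading "urn:" token, of the NID, and of any %-escaping, without removing escapes), it assumes lower case as the normalized form, which is an arbitrary choice, and the function names `normalize_urn` and `lexically_equivalent` are hypothetical. It does not implement any additional equivalences that a particular namespace may define.

```python
import re

# A %-escape is "%" followed by two hex digits.
_ESCAPE_RE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn(urn: str) -> str:
    """Apply the preprocessing used for lexical comparison."""
    # Split into the "urn" token, the NID, and the NSS; the NSS may
    # itself contain ":" characters, so split at most twice.
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    # Normalize only the case of the escapes; the escapes themselves
    # are kept, and the rest of the NSS is left untouched.
    nss = _ESCAPE_RE.sub(lambda m: m.group(0).lower(), nss)
    return "urn:" + nid.lower() + ":" + nss


def lexically_equivalent(a: str, b: str) -> bool:
    """Return True if the two URNs are equal after preprocessing."""
    return normalize_urn(a) == normalize_urn(b)
```

Because escapes are kept rather than decoded, `urn:foo:a123,456` and `urn:foo:a123%2C456` compare as different under this sketch, which matches the examples in section 6.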
202     Two URNs are lexically equivalent if they are octet-by-octet equal
203     after the following preprocessing:

205        1. normalize the case of the leading "urn:" token
206        2. normalize the case of the NID
207        3. normalize the case of any %-escaping

209     Note that %-escaping MUST NOT be removed.

211     Some namespaces may define additional lexical equivalences, such as
212     case-insensitivity of the NSS (or parts thereof). Additional lexical
213     equivalences MUST be documented as part of namespace registration,
214     MUST always have the effect of eliminating some of the false
215     negatives obtained by the procedure above, and MUST NEVER say that
216     two URNs are not equivalent if the procedure above says they are
217     equivalent.

219  6. Examples of lexical equivalence

221     The following URN comparisons highlight the lexical equivalence
222     definitions:

224        1- URN:foo:a123,456
225        2- urn:foo:a123,456
226        3- urn:FOO:a123,456
227        4- urn:foo:A123,456
228        5- urn:foo:a123%2C456
229        6- URN:FOO:a123%2c456
230     URNs 1, 2, and 3 are all lexically equivalent. URN 4 is not
231     lexically equivalent to any of the other URNs of the above set. URNs 5
232     and 6 are only lexically equivalent to each other.

234  7. Functional Equivalence in URNs

236     Functional equivalence is determined by practice within a given
237     namespace and managed by resolvers for that namespace. Thus, it is
238     beyond the scope of this document. Namespace registration must
239     include guidance on how to determine functional equivalence for that
240     namespace, i.e. when two URNs are identical within a namespace.

242  8. Security considerations

244     This document specifies the syntax for URNs. While some namespace
245     resolvers may assign special meaning to certain of the characters of
246     the Namespace Specific String, any security considerations resulting
247     from such assignment are outside the scope of this document.
248     It is strongly recommended that the process of registering a namespace
249     identifier include any such considerations.

251  9. Acknowledgments

253     Thanks to various members of the URN working group for comments on
254     earlier drafts of this document. This document is partially
255     supported by the National Science Foundation, Cooperative Agreement
256     NCR-9218179.

258  10. References

260     Request For Comments (RFC) and Internet Draft documents are available
261     from and numerous mirror sites.

263     [1] K. R. Sollins, "Requirements and a Framework for
264         URN Resolution Systems," Internet Draft (work in
265         progress), November 1996.

267     [2] T. Berners-Lee, "Universal Resource Identifiers in
268         WWW," RFC 1630, June 1994.

270     [3] K. Sollins and L. Masinter, "Functional Require-
271         ments for Uniform Resource Names," RFC 1737,
272         December 1994.

274     [4] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform
275         Resource Locators (URL)," Internet Draft (work in
276         progress), December 1996.

278     [5] Appendix A.2 of The Unicode Consortium, "The
279         Unicode Standard, Version 2.0", Addison-Wesley
280         Developers Press, 1996. ISBN 0-201-48345-9.

282  11. Editor's address

284     Ryan Moats
285     AT&T
286     15621 Drexel Circle
287     Omaha, NE 68135-2358
288     USA

290     Phone: +1 402 894-9456
291     EMail: jayhawk@ds.internic.net

293  Appendix A. Handling of URNs by URL resolvers/browsers.

295     The URN syntax has been defined so that URNs can be used in places
296     where URLs are expected. A resolver that conforms to the current URL
297     syntax specification [4] will extract a scheme value of "urn:"
298     rather than a scheme value of "urn:<nid>".

300     An URN MUST be considered an opaque URL by URL resolvers and passed
301     (with the "urn:" tag) to an URN resolver for resolution. The URN
302     resolver can either be an external resolver that the URL resolver
303     knows of, or it can be functionality built-in to the URL resolver.
305     To avoid confusion of users, an URL browser SHOULD display the com-
306     plete URN (including the "urn:" tag) to ensure that there is no con-
307     fusion between URN namespace identifiers and URL scheme identifiers.

309  This Internet Draft expires September 30, 1997.