INTERNET-DRAFT                                            Larry Masinter
draft-masinter-dated-uri-00.txt                          August 22, 2001
Expires February 2002

        "duri" and "tdb": URN Namespaces based on dated URIs

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document defines two persistent namespaces of URNs based on
prepending a date to an (encoded) URI. The results are namespaces
in which names are readily assigned but which offer the persistence
of reference that is required by URNs. The first namespace (duri)
is used to refer to URI-identified resources themselves, while the
second namespace (tdb) is used to refer to abstractions that are
not themselves networked resources but are "described by" them.
This idea and similar ones have been discussed for several years,
but recent discussion about the use of URIs and URNs for identifiers
in XML-based constructs has inspired writing this up more completely.

The purpose of this document is to help focus the discussion
of the role of URIs and URNs as names within non-Web applications.
This document is not a product of any working group, but may be
discussed on the mailing list . (Discussion of
related topics has occurred on urn-ietf@lists.netsol.com,
www-rdf-interest@w3.org and w3c-uri-ig@w3.org.)

Table of Contents

   1. Overview
   2. Encoding URIs
      2.1 Characters that must be encoded
      2.2 No need to encode "/"
   3. Dates
   4. Additional considerations
      4.1 URI schemes
      4.2 Date ranges
      4.3 Free assignment
      4.4 Resolution
      4.5 Why Names with Semantics?
      4.5 Avoiding MetaData
      4.6 Avoiding duri and tdb
   5. URN specification templates
      5.1 "duri" specification template
      5.2 "tdb" specification template
   6. IANA considerations
   7. Security Considerations
   8. Acknowledgements
   9. Copyright
   10. Author's address
   11. References

1. Overview

Many people have wondered about how to create globally unique and
persistent identifiers; while there are a number of URI schemes (and
URN namespaces) already registered, many of them lack an adequate
guarantee of both uniqueness and persistence.

In some cases, the guarantee of persistence comes through (a promise
of) good management practice; a promise that "Cool URIs don't change"
[COOL]. However, a promise of good management practice is different
from a design that ensures reliability.

The primary principle of "Uniform" URIs is that they are intended to
mean the same thing, no matter in what context they appear; thus URIs
are a Uniform (in meaning) way to Identify a Resource. However, even
when URIs have Uniform meaning from the point of view of the source of
the reference, they don't implicitly guarantee stability over
time. Despite best efforts and intentions, identifying information
can change in unpredictable ways, be it domain names or the structure
or identity of the name-assigning organization.

It is traditional in conventional references and citations in printed
works to include the date of publication; this practice serves the
important purpose that the context of the naming can be determined.

The "duri" URN namespace takes the form:

      urn:duri:<date>:<encoded-URI>

where <date> is a digit string corresponding to a date (Section 3),
and <encoded-URI> is an absolute URI-reference [RFC 2396] in which
any character excluded from URN syntax has been escaped (Section 2).

The meaning of a duri is "the resource (or fragment) that was
identified by the <encoded-URI> (after hex decoding) at the very first
instant of the date given".

For example, urn:duri:2001:http://www.ietf.org is a persistent
identifier for 'http://www.ietf.org' as of the very first moment of the
year 2001.
A duri may not be a resource locator in a practical sense,
because the time of location has passed. However, it is an acceptable
resource identifier, and fulfills all of the requirements for
URNs [RFC 1737].

The second URN namespace defined is a parallel space which is useful
for describing entities, concepts, abstractions, and other items which
are not themselves network accessible resources, but have been
described by network accessible resources. An increasing number of
uses for URIs are for objects or concepts that don't actually
correspond to networked resources, but for which the URI space is used
as the identifier. To fill some of the need for such identifiers, a
second namespace is defined which designates the "thing described by"
the resource at the given URI at the given date and time. This URN
namespace is designated by 'tdb', e.g.,

      urn:tdb:<date>:<encoded-URI>

with the same syntactic rules as duris.

So "urn:tdb:2001:http://www.ietf.org" can be used to designate the
Internet Engineering Task Force organization, at least as it was
described by or referenced by its home page at the first instant of
2001.

There are various other proposals for URN name spaces for abstract
entities that don't make reference to a concrete networked resource
for the purpose of identification, in much the same way that ASN.1
object identifiers don't contain any particular semantics of the
object identified. The "tdb" URN namespace satisfies a different
set of needs, since the designation of what is actually identified
by the tdb is clear and determinable without reference to the
context of its use.

2. Encoding URIs

Both "duri" and "tdb" URN namespaces require that some characters in
the URI references be encoded.

2.1 Characters that must be encoded

The characters that must be encoded are:

* All characters marked as excluded in [RFC 2141], section 2.4
  These are excluded because they are not allowed in URNs.
      \"&<>[]^`{|}~

* The character "#"
  Note that the <encoded-URI> of a "duri" or "tdb" can include a
  fragment identifier, but the "#" character used to delimit it must
  be encoded.

* The character "%"
  The encoded-URI can itself contain encoded characters, which are
  encoded with the same method. To ensure that decoding happens at the
  right level of processing, the "%" itself must be encoded.
  Unfortunately, this results in a confusing double encoding, but this
  is difficult to avoid.

2.2 No need to encode "/"

The URN recommendation discourages the use of "/" in URNs because, in
general, there is no good interpretation of hierarchy and relative
URIs for assigned names. However, for the particular case of
duris (at least), there seems to be no good reason to avoid
the "/" because it corresponds fairly naturally (in many cases)
to the hierarchy of the original space.

3. Dates

A <date> is a simple expression of a date and an optional time, with
arbitrary precision. The goal is to allow relatively short expressions
of dates with no ambiguity, and with arbitrary precision. (The idea for
this syntax came from [RFC 2550].)

      date = year [ month [ day [ hour [ minute [ second [ fraction ]]]]]]

      year = 4digit
      month = 2digit
      day = 2digit
      hour = 2digit
      minute = 2digit
      second = 2digit
      fraction = *digit

The representation of a date or time refers to the very first instant
of the given date, so that, for example, 1999 and 199901010000 are
equivalent. If necessary, dates can include times and even fractional
times, so that a generator of duris can be arbitrarily precise.
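
As a rough illustration of the encoding rules in Section 2.1 and the
date syntax above, the following Python sketch builds a duri or tdb
from a date string and a URI. It is not part of this specification:
the names encode_uri, normalize_date and make_urn are hypothetical,
the input URI is assumed to be plain ASCII and already syntactically
valid, and only the syntax of the date is checked (a month of "13"
would pass).

```python
import re

# Hypothetical helper names; nothing here is defined by this document.

# Section 2.1: the excluded characters listed there, plus "#" and "%".
MUST_ENCODE = set('\\"&<>[]^`{|}~#%')

# Section 3: date = year [ month [ day [ hour [ minute [ second [ fraction ]]]]]]
DATE_RE = re.compile(
    r"(?P<year>\d{4})"
    r"(?:(?P<month>\d{2})"
    r"(?:(?P<day>\d{2})"
    r"(?:(?P<hour>\d{2})"
    r"(?:(?P<minute>\d{2})"
    r"(?:(?P<second>\d{2})(?P<fraction>\d*)"
    r")?)?)?)?)?"
)


def encode_uri(uri):
    """Escape only the characters that Section 2.1 says must be encoded.

    Assumes the input is already a syntactically valid ASCII URI.
    """
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in MUST_ENCODE else ch for ch in uri
    )


def normalize_date(date):
    """Expand a <date> to its first instant so equivalent dates compare equal.

    Only the syntax is checked; value ranges are not validated.
    """
    match = DATE_RE.fullmatch(date)
    if match is None:
        raise ValueError("not a valid <date>: {!r}".format(date))
    parts = match.groupdict()
    return (
        int(parts["year"]),
        int(parts["month"] or "01"),   # missing month means January
        int(parts["day"] or "01"),     # missing day means the 1st
        int(parts["hour"] or "00"),
        int(parts["minute"] or "00"),
        int(parts["second"] or "00"),
        (parts["fraction"] or "").rstrip("0"),  # trailing zeros add nothing
    )


def make_urn(nid, date, uri):
    """Build urn:<nid>:<date>:<encoded-URI> for nid "duri" or "tdb"."""
    if nid not in ("duri", "tdb"):
        raise ValueError("unknown namespace: {!r}".format(nid))
    normalize_date(date)  # raises ValueError on bad syntax
    return "urn:{}:{}:{}".format(nid, date, encode_uri(uri))


if __name__ == "__main__":
    print(make_urn("tdb", "2001", "data:,The%20US%20president"))
    print(normalize_date("1999") == normalize_date("199901010000"))
```

Comparing the tuples returned by normalize_date is one possible way to
test whether two dates name the same instant; the original date string
is kept unchanged in the URN itself.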

Dates are interpreted relative to International Atomic Time [TAI], so
that there is no ambiguity about time zone.

4. Additional Considerations

4.1 URI schemes

Many URI schemes are appropriate for use inside duris and tdb URNs.

Of course, a common usage would be to use a "http" URI to refer to a
web page or the subject of a web site at a given time. This can be a
way of referring to a web site at some date in the past, or an
organization that has changed or merged.

Local systems that have unique host names can use "file" URIs in
their duris and tdbs; for example,

      urn:tdb:20010814142327:file://this.example.com/c|/temp/test.txt

can uniquely and unambiguously refer to a concept whose description is
contained in a system's local disk. While file URIs are difficult to
use for global resolution because of ambiguities of file system and
access methods, in this case, because the instant is fixed, the naming
mechanism of the host can prevail.

Even the "data" URI scheme might be used with "tdb" to designate
concepts that can be described briefly inline. For example,

      urn:tdb:2001:data:,The%2520US%2520president

names the concept described by the (text/plain) string "The US
president" at the very first instant of 2001. (Note the awkward double
quoting of space as "%20" and then the "%" as "%25".)

Even URNs might appear within a duri in unusual circumstances. For
example, there are circumstances where the assignment of names in a URN
namespace is not in practice permanent, or where one might want to
refer to the assignment as of a given date. In this case, it is
possible to use a "urn" within a "duri", e.g.,

      urn:duri:2000:urn:ietf:std:50

might be used to refer to "the document that was STD 50 that was in
effect as of the first instant of 2000".
[RFC 2648]

4.2 Date ranges

Dates in the future SHOULD NOT be used, because the meaning of the
duri or tdb cannot readily be determined in advance reliably. Dates
far in the past or merely prior to the actual assignment of the
resource to the URI SHOULD NOT be used, because the meaning of the
reference is left in question. For example, using http URIs before a
web service was available at the given URI doesn't make much sense.

However, although these practices are not recommended, there is no
assurance that they have been followed; by itself, a duri/tdb does not
constitute an assertion that the encoded-URI was available or assigned
at the date specified.

Note that the use of the "very first instant" means that a duri/tdb
using only a year must give a year greater than the first year in
which the corresponding URI was published; if a web page is published
in the middle of 2001, then "duri:2001:..." would be inappropriate.

4.3 Free assignment

Because of the many possible schemes that can be used in the
<encoded-URI> portion, there should be no difficulty in almost any
computational process being able to assign duris or tdbs at will. Of
course, it is necessary for there to be some resource which is
available at some point in time, and to have a clock which is
accurate to the granularity of the frequency of assignment.

4.4 Resolution

There are no accurate resolution servers for duri or tdb URNs. A duri
might be "resolvable" in the sense that a resource that was accessed
at a point in time might have the result of that access cached or
archived in an Internet archive service. A "tdb" is only resolvable in
the sense that if the corresponding duri can be resolved, the result
can be accessed and interpreted.

Clients without access to an Internet archive service might take the
decoded <encoded-URI> of a duri and attempt resolution of *that*
identifier.
This will give an approximation whose reliability depends
on the amount of time elapsed since the date indicated.

4.5 Why Names with Semantics?

There are a number of proposals for URN schemes that create otherwise
unbound "names", where the URN scheme only provides for uniqueness.
Neither "duri" nor "tdb" intrinsically has the property that the
names assigned are without any resolution semantics. This is
intentional; it's difficult to create names that carry no semantics
whatsoever about the authority that assigned the name and the
intention of the authority for what the name should designate.

4.5 Avoiding MetaData

One might consider the date in a duri/tdb to be just one piece of
additional metadata about the encoded-URI, and consider adding other
pieces of metadata as annotation.

However, the use of the date in a duri/tdb is intended primarily as a
mechanism for accomplishing uniqueness over time. No other bit of
metadata or description readily fills that purpose. Further, the date
is not descriptive (an assertion about the encoded-URI) but merely
refining.

4.6 Avoiding duri and tdb

Many applications of URIs already provide a context of date. For
example, one could imagine a hypertext system where the URIs contained
within a document were intended to refer to the resources as of the
date of the enclosing document. This would be a reasonable
interpretation of URIs within an Internet archive system, for example.

And some applications of URIs arguably already contain the level of
interpretive indirection that is explicit with "tdb". For example, one
might consider the use of URIs as namespace names within XML [XMLNAME]
as a reference to the "thing described by" the URI used.

The Resource Description Framework [RDF] is an XML-based framework for
describing assertions.
RDF uses URIs to identify the objects being
described and XML-based tags to describe the relationships between
them. The relations in RDF, however, may already provide for the
"thing described by" indirection. For example, Section 3.2.1 of RDF
gives an RDF/XML model for the sentence

      "The students in course 6.001 are Amy, Tim and Mary"

in which the resources listed are web pages (served by HTTP), while
the class and students are the "things described by" those web pages.

Other resource description frameworks may require using "tdb" to
distinguish between assertions about classes or students and the web
pages that describe them.

5. URN Specification Templates

5.1 "duri" Specification Template

Namespace ID:
   "duri" requested.

Registration Information:
   Registration Version: 1
   Registration Date: 2001-08-19

Declared registrant of the namespace:
   Larry Masinter (see Section 10 of this document).

Declaration of syntactic structure:
   Briefly, the syntax is
      urn:duri:<date>:<encoded-URI>
   The syntax is described in Sections 1-3 of this document.

Relevant ancillary documentation:
   (See Section 11, References, of this document)

Identifier uniqueness considerations:
   Uniqueness is guaranteed by the structure of adding
   a designation of a specific instant to a URI. However,
   URIs with ambiguous interpretation at any given
   instant (e.g., "file" URIs without a given host name)
   will not be unique.

Identifier persistence considerations:
   The designation of a dated URI is completely persistent
   for all time.

Process of identifier assignment:
   Any date can be used with any URI independently
   by anyone.

Process of identifier resolution:
   Identifiers can only be resolved approximately. See
   Section 4.4.

Conformance with URN Syntax:
   Note that the use of "/" for hierarchy, while discouraged
   in the URN specification, is allowed in duris.

Rules for Lexical Equivalence:
   For dates, YYYY is equivalent to YYYY01, YYYYMM is equivalent to
   YYYYMM01, while YYYYMMDD is equivalent to YYYYMMDD0... followed
   by any number of 0's.

   In considering equivalence of the encoded URI, if two duris with
   equivalent dates contain lexically equivalent URIs, the duris
   are equivalent.

Validation mechanism:
   Dates should be reasonable and meet the syntactic requirements.
   The URI encoded within should meet the syntactic requirements of
   the URI scheme used.

Scope:
   Global.

5.2 "tdb" Specification Template

Namespace ID:
   "tdb" requested.

Registration Information:
   Registration Version: 1
   Registration Date: 2001-08-19

Declared registrant of the namespace:
   Larry Masinter (see Section 10 of this document).

Declaration of syntactic structure:
   Briefly, the syntax is
      urn:tdb:<date>:<encoded-URI>
   The syntax is described in Sections 1-3 of this document.

Relevant ancillary documentation:
   (See Section 11, References, of this document)

Identifier uniqueness considerations:
   Uniqueness is guaranteed by the structure of adding
   a designation of a specific instant to a URI. However,
   URIs with ambiguous interpretation at any given
   instant (e.g., "file" URIs without a given host name)
   will not be unique.

Identifier persistence considerations:
   The designation of a dated URI is completely persistent
   for all time, although the intent of a resource that
   is no longer available will be hard to discern.

Process of identifier assignment:
   Any date can be used with any URI independently
   by anyone.

Process of identifier resolution:
   Resolution of "tdb" identifiers requires interpreting
   the resource identified by the corresponding "duri".
   See Section 4.4 of this document.

Rules for Lexical Equivalence:
   As with "duri", see section 5.1.

Conformance with URN Syntax:
   As with "duri", see section 5.1.

Validation mechanism:
   As with "duri", see section 5.1.

Scope:
   Global.

6. IANA considerations

This document includes two URN NID registrations (sections 5.1 and
5.2) that should be entered into the IANA registry of URN NIDs.

7. Security Considerations

duris and tdbs are not any more reliable because they are dated.
URIs don't contain enough information to supply the authority for
deciding what was or wasn't at a given URI at a given date.

8. Acknowledgements

Many thanks to the participants in the many discussions on the
relationship of URLs, URNs, URIs and resource identifiers, as well as
similar ideas, that have been floated over the last many years.

9. Copyright

Copyright (C) The Internet Society, 1997. All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published and
distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of developing
Internet standards in which case the procedures for copyrights defined
in the Internet Standards process must be followed, or as required to
translate it into languages other than English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN
WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

10. Author's address

Larry Masinter
Adobe Systems Incorporated
345 Park Ave
San Jose, CA 95110
mailto: LMM@acm.org
http://larry.masinter.net
Tel: +1 408 536-3024

11. References

[RFC 2141] R. Moats, "URN Syntax", RFC 2141, May 1997.

[COOL]     Tim Berners-Lee, "Cool URIs don't change.", 1998.

[RFC 2396] T. Berners-Lee, R. Fielding, L. Masinter, "Uniform Resource
           Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[RFC 1737] K. Sollins, L. Masinter, "Functional Requirements for
           Uniform Resource Names", RFC 1737, December 1994.

[RFC 2550] S. Glassman, M. Manasse, J. Mogul, "Y10K and Beyond", RFC
           2550, April 1, 1999.

[TAI]      "International Atomic Time".

[RFC 2648] R. Moats, "A URN Namespace for IETF Documents", RFC 2648,
           August 1999.

[XMLNAME]  "Namespaces in XML", World Wide Web Consortium
           Recommendation.

[RDF]      "Resource Description Framework (RDF) Model and Syntax
           Specification", World Wide Web Consortium Recommendation.