idnits 2.17.1

draft-masinter-dated-uri-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 601 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking
     for downward references.

  ** There are 3 instances of lines with control characters in the document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 291: '...es in the future SHOULD NOT be used,
     b...'

     RFC 2119 keyword, line 294: '...far in the past) SHOULD NOT be used,
     b...'

     RFC 2119 keyword, line 299: '...ractices are NOT RECOMMENDED, there
     is...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.
     If you have contacted all the original authors and they are all willing
     to grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment.  If not, you may need to add the pre-RFC5378
     disclaimer.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 2002) is 7891 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC 2141' is defined on line 566, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2141 (Obsoleted by RFC 8141)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'COOL'

  ** Downref: Normative reference to an Informational RFC: RFC 1396 (ref.
     'RFC 2396')

  ** Downref: Normative reference to an Informational RFC: RFC 1737

  ** Downref: Normative reference to an Informational RFC: RFC 2550

  -- Possible downref: Non-RFC (?) normative reference: ref. 'TAI'

  ** Downref: Normative reference to an Informational RFC: RFC 2648

  -- Possible downref: Non-RFC (?) normative reference: ref. 'XMLNAME'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'RDF'

     Summary: 10 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    INTERNET-DRAFT                                          Larry Masinter
2    draft-masinter-dated-uri-01.txt                          March 1, 2002
3    Expires September 2002

5    "duri" and "tdb" URN namespaces based on dated URIs

7    Status of this Memo

9    This document is an Internet-Draft and is in full conformance with all
10   provisions of Section 10 of RFC2026.

12   Internet-Drafts are working documents of the Internet Engineering Task
13   Force (IETF), its areas, and its working groups. Note that other
14   groups may also distribute working documents as Internet-Drafts.

16   Internet-Drafts are draft documents valid for a maximum of six months
17   and may be updated, replaced, or obsoleted by other documents at any
18   time. It is inappropriate to use Internet-Drafts as reference
19   material or to cite them other than as "work in progress."

21   The list of current Internet-Drafts can be accessed at
22   http://www.ietf.org/ietf/1id-abstracts.txt.

24   The list of Internet-Draft Shadow Directories can be accessed at
25   http://www.ietf.org/shadow.html.

27   Abstract

29   This document defines two persistent namespaces of URNs based on
30   prepending a date to an (encoded) URI. The results are namespaces in
31   which names are readily assigned but which offer the persistence of
32   reference that is required by URNs. The first namespace (duri) is used
33   to refer to URI-identified resources themselves, while the second
34   namespace (tdb) is used to refer to abstractions that are not
35   themselves networked resources but are "described by" them. One
36   reason for defining these is to help illustrate the boundaries
37   of applicability for URIs as permanent identifiers and for naming
38   abstract non-networked resources. Similar ideas have been discussed
39   in many fora for a number of years.

41   This document is not a product of any working group, but may be
42   discussed on the mailing list .

44   1. Overview and Requirements

46   The URN namespaces defined here solve two separate but related
47   problems, discussed in this section.

49   1.1 Intrinsically Persistent Identifiers

51   Many people have wondered about how to create globally unique and
52   persistent identifiers. There are a number of URI schemes and URN
53   namespaces already registered. However, many of them lack an adequate
54   guarantee of both uniqueness and persistence.

56   In some cases, the guarantee of persistence comes through a promise of
57   good management practice, such as is encouraged in "Cool URIs don't
58   change" [COOL]. However, relying on a promise of good management
59   practice is not the same as having a design that guarantees
60   reliability independent of actual administrative practice.

62   A primary design goal for URIs is that they are intended to mean the
63   same thing, no matter in what context they appear: a "Uniform" way to
64   Identify a Resource. However, even when URIs have Uniform meaning from
65   the point of view of the source of the reference, they don't guarantee
66   stability over time. Despite best efforts and intentions, identifying
67   information can change in unpredictable ways: domain names can
68   disappear or be reassigned, and name-assigning organizations can
69   change structure or responsibility, disappear, merge, or change in
70   other unpredictable ways.

72   1.2 URIs for abstractions

74   The description of URIs [RFC 2396] describes a scope for 'Resource'
75   that is quite broad:

77      A resource can be anything that has identity. Familiar examples
78      include an electronic document, an image, a service (e.g.,
79      "today's weather report for Los Angeles"), and a collection of
80      other resources. Not all resources are network "retrievable";
81      e.g., human beings, corporations, and bound books in a library
82      can also be considered resources.

84   However, most of the URI mechanisms are either quite concrete
85   (including an identification of protocol and protocol parameters for
86   connecting to a network communication endpoint), or else quite vague
87   about the way in which they are connected to the resource they
88   identify.

90   The interpretation of many URNs depends significantly on the
91   concept of "naming authority".
92   The authority is presumably some individual or organization that
93   both ensures uniqueness of assignment and helps with understanding
94   the meaning of the link between the name and the named.

96   However, authorities, whether individuals or organizations, have a
97   lifetime, and must be consulted at some point to understand the
98   bindings. The functioning of names as unique identifiers and holders
99   of meaning depends on having a reliable infrastructure for consulting
100  the authority or the authority's records to determine the thing
101  referenced. The goal, then, of the second URN scheme proposed
102  below is to provide a mechanism which is, at the same time:

104     * permanent (the identity of the resource identified
105       is not subject to reinterpretation over time)
106     * explicitly bound (the mechanism by which the identified
107       resource can be determined is explicitly included in
108       the URI)
109     * able to identify resources outside the network:
110       people, organizations, abstract concepts
111     * independent of reliable administrative processes
112       of authorities for either assignment or interpretation

114  2. Namespace definitions

116  2.1 "duri" namespace

118  It is conventional in references and citations in printed
119  works to include the date of publication; this practice serves the
120  important purpose that the context of the naming can be determined.

122  The "duri" URN namespace takes the form:

124     urn:duri:<date>:<encoded-URI>

126  where <date> is a digit string corresponding to a date (Section 4),
127  and <encoded-URI> is an absolute URI-reference [RFC 2396] in which
128  any character excluded from URN syntax has been escaped (Section 3).

130  The meaning of a duri is "the resource (or fragment) that was
131  identified by the <encoded-URI> (after hex decoding) at the very first
132  instant of the date(time) given".

134  For example, 'urn:duri:2001:http://www.ietf.org' is a persistent
135  identifier to 'http://www.ietf.org' as of the very first moment of the
136  year 2001.
137  A duri may not be a resource locator in a practical sense, because
138  the time of location has passed. However, it is an acceptable resource
139  identifier, and fulfills all of the requirements for URNs [RFC 1737].

141  2.2 "tdb" namespace

143  The second URN namespace defined is a parallel space which is useful
144  for describing entities, concepts, abstractions, and other items which
145  are not themselves network accessible resources, but have been at some
146  point described by network accessible resources.

148  The "tdb" namespace designates the "thing described by" a resource at
149  a given URI at the given time. This URN namespace is identified by
150  'tdb', e.g.,

152     urn:tdb:<date>:<encoded-URI>

154  with the same syntactic rules as 'duri'.

156  The intent is to use the inversion of "is a web resource about". It
157  is common practice to give a reference for a concept by including a
158  pointer to a document, segment, or phrase that defines the concept.
159  "tdb" attempts to capture this practice in URI space.

161  For example, "urn:tdb:2001:http://www.ietf.org" can be used to
162  designate the Internet Engineering Task Force organization, at least
163  as it was described by or referenced by its home page at the first
164  instant of 2001.

166  The "tdb" namespace differs from most other mechanisms for identifying
167  abstractions because the designation of what is actually identified by
168  the tdb doesn't depend on knowing the intention of the "assigner" of
169  the identifier. Unlike many of the alternatives proposed, the
170  identification is not dependent on the context of use.

172  The "tdb" namespace can be thought of as following another level of
173  indirection to URI resolution. While one could imagine using 'tdb'
174  without a date, it would leave the possibility that a reference that
175  is unambiguous at one time might become ambiguous at some other time.

177  3. Encoding URIs

179  Both "duri" and "tdb" URN namespaces require that some characters in
180  the URI references be encoded.

182  3.1 Characters that must be encoded

184  The characters that must be encoded are:

186    * All characters marked in RFC 2141, section 2.4
187      These are excluded because they are not allowed in URNs.
188         \"&<>[]^`{|}~

190    * The character "#"
191      Note that the <encoded-URI> of a "duri" or "tdb" can include a
192      fragment identifier, but the "#" character used to delimit it must
193      be encoded.

195    * The character "%"
196      The encoded-URI can itself contain encoded characters, which are
197      encoded with the same method. To ensure that decoding happens at the
198      right level of processing, the "%" itself must be encoded.

200      Unfortunately, there are many cases where there is a double
201      encoding of characters, first to construct the embedded URI
202      itself and second to then embed the URI within the tdb or
203      duri URN.

205  3.2 No need to encode "/"

207  The URN recommendation discourages the use of "/" in URNs because, in
208  general, there is no good interpretation of hierarchy and relative
209  URIs for assigned names. However, for the particular case of
210  duris (at least), there seems to be no good reason to avoid
211  the "/" because it corresponds fairly naturally (in many cases)
212  to the hierarchy of the original space.

214  4. Dates

216  A <date> is a simple expression of a date and optional time, with
217  arbitrary precision. The goal is to allow relatively short expressions
218  of dates with no ambiguity, and with arbitrary precision. (The idea for
219  this syntax came from [RFC 2550].)

221     date = year [ month [ day [ hour [ minute [ second [ fraction ]]]]]]

223     year     = 4digit
224     month    = 2digit
225     day      = 2digit
226     hour     = 2digit
227     minute   = 2digit
228     second   = 2digit
229     fraction = *digit

231  The representation of a date or time refers to the very first instant
232  of the given date, so that, for example, 1999 and 199901010000 are
233  equivalent.
     If necessary, dates can include times and even fractional
234  times, so that a generator of duris can be arbitrarily precise.

236  Dates are interpreted relative to International Atomic Time [TAI], so
237  that there is no ambiguity about time zone.

239  5. Additional Considerations

241  5.1 Embedded URI schemes

243  Many URI schemes are appropriate for use inside duris and tdb URNs.

245  Of course, a common usage would be to use an "http" URI to refer to a
246  web page or the subject of a web site at a given time. This can be a
247  way of referring to a web site at some date in the past, or an
248  organization that has changed or merged.

250  Local systems that have unique host names can use "file" URIs in
251  their duris, for example,

253     urn:tdb:20010814142327:file://this.example.com/c|/temp/test.txt

255  can uniquely and unambiguously refer to a concept whose description is
256  contained in a system's local disk. While file URIs are difficult to
257  use for global resolution because of ambiguities of file system and
258  access methods, in this case, because the instant is fixed, the naming
259  mechanism of the host can prevail. (Using 'file:' URIs without a host
260  name is not recommended, because the interpretation is not uniform.)

262  Even urns might appear within a duri in unusual circumstances. For
263  example, there are circumstances where the assignment of names in a URN
264  namespace is not in practice permanent, or where one might want to
265  refer to the assignment as of a given date. In this case, it is
266  possible to use a "urn" within a "duri", e.g.,

268     urn:duri:2000:urn:ietf:std:50

270  might be used to refer to "the document that was STD 50 in effect as
271  of the first instant of 2000". [RFC 2648]

273  5.2 Using the "data" URI scheme with tdb

275  It's possible to use "tdb" to designate concepts that can be
276  described uniquely and briefly inline.
     For example,

278     urn:tdb:2001:data:,The%2520US%2520president

280  names the concept described by the (text/plain) string "The US
281  president" at the very first instant of 2001. (Note the awkward double
282  quoting of space as "%20" and then the "%" as "%25".) Of course, this
283  practice is only useful if the referent of the data is (or was at the
284  time designated) well-defined. In the case of 'data', there is no
285  assigning authority at all; the interpretation of the 'tdb' URN depends
286  on the interpreting community. 'urn:tdb:2001:data:,it' would not be
287  useful.

289  5.3 Useful dates

291  Dates in the future SHOULD NOT be used, because the meaning of the
292  duri or tdb cannot be reliably determined in advance. Dates
293  prior to the actual assignment of the resource to the embedded URI
294  (and, certainly, dates far in the past) SHOULD NOT be used, because
295  the meaning of the reference is left in question. For example, using
296  http URIs before a web service was available at the given URI doesn't
297  make much sense.

299  However, although these practices are NOT RECOMMENDED, there is no
300  assurance that they have been avoided; by itself, a duri/tdb does not
301  constitute an assertion that the encoded-URI was available or assigned
302  at the date specified.

304  Note that the use of the "very first instant" means that a duri/tdb
305  using only a year must give a year greater than the first year in
306  which the corresponding URI was published; if a web page is published
307  in the middle of 2001, then "duri:2001:..." would be inappropriate.

309  5.4 Free assignment

311  Because of the many possible schemes that can be used in the
312  <encoded-URI> portion, there should be no difficulty in almost any
313  computational process being able to assign duris or tdbs at will.
314  Of course, it is necessary for there to be some resource which is
315  available at some point in time, and to have a clock which is accurate
316  to the granularity of the frequency of assignment.

318  5.5 Resolution

320  There are no accurate resolution servers for duri or tdb URNs. A duri
321  might be "resolvable" in the sense that a resource that was accessed
322  at a point in time might have the result of that access cached or
323  archived in an Internet archive service. A "tdb" is only resolvable in
324  the sense that if the corresponding duri can be resolved, it may be
325  possible that the result can be accessed and interpreted.

327  Clients without access to an Internet archive service might take the
328  decoded <encoded-URI> of a duri and attempt resolution of *that*
329  identifier. This will give an approximation whose reliability depends
330  on the amount of time elapsed since the date indicated.

332  5.6 Why Names with Semantics?

334  There are a number of proposals for URN schemes that create otherwise
335  unbound "names", where the URN scheme only provides for uniqueness.
336  Neither "duri" nor "tdb" intrinsically has the property that the
337  names assigned are without any resolution semantics. This is
338  intentional; it's difficult to create names that carry no semantics
339  whatsoever about the authority that assigned the name and the
340  intention of the authority for what the name should designate.

342  5.7 Avoiding MetaData

344  One might consider the date in a duri/tdb to be just one piece of
345  additional metadata about the encoded-URI, and consider adding other
346  pieces of metadata as annotation.

348  However, the use of the date in a duri/tdb is intended primarily as a
349  mechanism for achieving uniqueness over time. No other bit of
350  metadata or description readily fills that purpose. Further, the date
351  is not descriptive (an assertion about the encoded-URI) but merely
352  refining.
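
The free assignment of Section 5.4 and the fallback resolution of
Section 5.5 can be sketched in Python as follows. This is only an
illustration under stated assumptions, not part of the namespace
definitions: the helper names (encode_uri, make_urn, split_urn) are
hypothetical, the escaped set is exactly the characters listed in
Section 3.1, and the date check follows the Section 4 syntax without
validating calendar ranges or converting to TAI.

```python
import re
from urllib.parse import unquote

# Characters Section 3.1 says must be escaped: the RFC 2141 section 2.4
# characters listed in the draft, plus "#" and "%". "/" is left alone
# (Section 3.2).
MUST_ENCODE = set('\\"&<>[]^`{|}~#%')

# year [month [day [hour [minute [second [fraction]]]]]] from Section 4.
DATE_RE = re.compile(
    r"\d{4}(?:\d{2}(?:\d{2}(?:\d{2}(?:\d{2}(?:\d{2}\d*)?)?)?)?)?"
)


def encode_uri(uri):
    """Percent-encode only the characters Section 3.1 requires."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in MUST_ENCODE else ch for ch in uri
    )


def make_urn(nid, date, uri):
    """Assign a duri or tdb URN (hypothetical helper, not from the draft)."""
    if nid not in ("duri", "tdb"):
        raise ValueError("namespace must be 'duri' or 'tdb'")
    if not DATE_RE.fullmatch(date):
        raise ValueError("date does not match the Section 4 syntax")
    return "urn:{}:{}:{}".format(nid, date, encode_uri(uri))


def split_urn(urn):
    """Return (nid, date, decoded URI) so a client can try the bare URI."""
    prefix, nid, date, encoded = urn.split(":", 3)
    if prefix != "urn" or nid not in ("duri", "tdb"):
        raise ValueError("not a duri or tdb URN")
    if not DATE_RE.fullmatch(date):
        raise ValueError("date does not match the Section 4 syntax")
    # unquote() removes exactly one level of percent-encoding, which
    # undoes encode_uri() and leaves the embedded URI's own escapes intact.
    return nid, date, unquote(encoded)
```

A client following Section 5.5 without an archive service could call
split_urn() on a duri and attempt ordinary resolution of the returned
URI, accepting that the result is only an approximation of what the
URI identified at the given date.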

354  5.8 Avoiding duri and tdb

356  Many applications of URIs already provide a context of date. For
357  example, one could imagine a hypertext system where the URIs contained
358  within a document were intended to refer to the resources as of the
359  date of the enclosing document. This would be a reasonable
360  interpretation of URIs within an Internet archive system, for example.

362  And some applications of URIs arguably already contain the level of
363  interpretive indirection that is explicit with "tdb". For example, one
364  might consider the use of URIs as namespace names within XML [XMLNAME]
365  as a reference to the "thing described by" the URI used.

367  5.9 tdb and RDF

369  The Resource Description Framework [RDF] is an XML-based framework for
370  describing assertions. RDF uses URIs to identify the objects being
371  described and XML-based tags to describe the relationships between
372  them.

374  The relations in RDF, however, may already provide for the "thing
375  described by" indirection. For example, the example in Section 3.2.1
376  of [RDF] claims the model for the sentence

378     "The students in course 6.001 are Amy, Tim and Mary"

380  would be written in RDF/XML as

382     [RDF/XML example, lines 382-392: markup not preserved]

394  but the resources listed are web pages (served by HTTP) and the class
395  and students are the "things described by" those web pages.

397  6. URN Specification Templates

399  6.1 "duri" Specification Template

401  Namespace ID:
402     "duri" requested.

404  Registration Information:
405     Registration Version: 1
406     Registration Date: 2001-08-19

408  Declared registrant of the namespace:
409     Larry Masinter (see Section 11 of this document.)

411  Declaration of syntactic structure:
412     Briefly, the syntax is
413        urn:duri:<date>:<encoded-URI>
414     The syntax is described in Sections 2-4 of this document.

416  Relevant ancillary documentation:
417     (See Section 12, References, of this document)

419  Identifier uniqueness considerations:
420     Uniqueness is guaranteed by the structure of adding
421     a designation of a specific instant to a URI. However,
422     URIs with ambiguous interpretation at any given
423     instant (e.g., "file" URIs without a given host name)
424     will not be unique.

426  Identifier persistence considerations:
427     The designation of a dated URI is completely persistent
428     for all time.

430  Process of identifier assignment:
431     Any date can be used with any URI independently
432     by anyone.

434  Process of identifier resolution:
435     Identifiers can only be resolved approximately. See
436     Section 5.5.

438  Conformance with URN Syntax:
439     Note that the use of "/" for hierarchy, while discouraged
440     in the URN specification, is allowed in duris.

442  Rules for Lexical Equivalence:
443     For dates, YYYY is equivalent to YYYY01, YYYYMM is equivalent to
444     YYYYMM01, while YYYYMMDD is equivalent to YYYYMMDD0... followed
445     by any number of 0's.

447     In considering equivalence of the encoded URI, if two duris with
448     equivalent dates contain lexically equivalent URIs, the duris
449     are equivalent.

451  Validation mechanism:
452     Dates should be reasonable and meet the syntactic requirements.
453     The URI encoded within should meet the syntactic requirements of
454     the URI scheme used.

456  Scope:
457     Global.

459  6.2 "tdb" Specification Template

461  Namespace ID:
462     "tdb" requested.

464  Registration Information:
465     Registration Version: 1
466     Registration Date: 2001-08-19

468  Declared registrant of the namespace:
469     Larry Masinter (see Section 11 of this document.)

471  Declaration of syntactic structure:
472     Briefly, the syntax is
473        urn:tdb:<date>:<encoded-URI>
474     The syntax is described in Sections 2-4 of this document.

476  Relevant ancillary documentation:
477     (See Section 12, References, of this document)

479  Identifier uniqueness considerations:
480     Uniqueness is guaranteed by the structure of adding
481     a designation of a specific instant to a URI. However,
482     URIs with ambiguous interpretation at any given
483     instant (e.g., "file" URIs without a given host name)
484     will not be unique.

486  Identifier persistence considerations:
487     The designation of a dated URI is completely persistent
488     for all time, although the intent of a resource that
489     is no longer available will be hard to discern.

491  Process of identifier assignment:
492     Any date can be used with any URI independently
493     by anyone.

495  Process of identifier resolution:
496     Resolution of "tdb" identifiers requires interpreting
497     the resource identified by the corresponding "duri".
498     See Section 5.5 of this document.

500  Rules for Lexical Equivalence:
501     As with "duri", see Section 6.1.

503  Conformance with URN Syntax:
504     As with "duri", see Section 6.1.

506  Validation mechanism:
507     As with "duri", see Section 6.1.

509  Scope:
510     Global.

512  7. IANA considerations

514  This document includes two URN NID registrations (Sections 6.1 and
515  6.2) that should be entered into the IANA registry of URN NIDs.

517  8. Security Considerations

519  duris and tdbs are not any more reliable because they are dated.
520  URIs don't contain enough information to supply the authority for
521  deciding what was or wasn't at a given URI at a given date.

523  9. Acknowledgements

525  Many thanks to the participants in the many discussions of the
526  relationship of URLs, URNs, URIs and resource identifiers, and of
527  similar ideas that have been floated over many years.

529  10. Copyright

531  Copyright (C) The Internet Society, 2002. All Rights Reserved.

533  This document and translations of it may be copied and furnished to
534  others, and derivative works that comment on or otherwise explain it
535  or assist in its implementation may be prepared, copied, published and
536  distributed, in whole or in part, without restriction of any kind,
537  provided that the above copyright notice and this paragraph are
538  included on all such copies and derivative works. However, this
539  document itself may not be modified in any way, such as by removing
540  the copyright notice or references to the Internet Society or other
541  Internet organizations, except as needed for the purpose of developing
542  Internet standards in which case the procedures for copyrights defined
543  in the Internet Standards process must be followed, or as required to
544  translate it into languages other than English.

546  The limited permissions granted above are perpetual and will not be
547  revoked by the Internet Society or its successors or assigns.

549  This document and the information contained herein is provided on an
550  "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
551  TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT
552  NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN
553  WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
554  MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

556  11. Author's address

558  Larry Masinter
559  345 Park Ave, #W14
560  San Jose, CA 95110
561  mailto: LMM@acm.org
562  http://larry.masinter.net

564  12. References

566  [RFC 2141] R. Moats, "URN Syntax", RFC 2141, May 1997.

568  [COOL] Tim Berners-Lee, "Cool URIs don't change", 1998.

571  [RFC 2396] R. Fielding, L. Masinter, "Uniform Resource Identifiers
572  (URI): Generic Syntax", RFC 2396, August 1998.

574  [RFC 1737] K. Sollins, L. Masinter, "Functional Requirements for
575  Uniform Resource Names", RFC 1737, December 1994.

577  [RFC 2550] S. Glassman, M. Manasse, J.
578  Mogul, "Y10K and Beyond", RFC 2550, April 1, 1999.

581  [TAI] "International Atomic Time",

584  [RFC 2648] R. Moats, "A URN Namespace for IETF Documents", RFC 2648,
585  August 1999.

587  [XMLNAME] "Namespaces in XML", World Wide Web Consortium
588  Recommendation,

591  [RDF] "Resource Description Framework (RDF) Model and Syntax
592  Specification", World Wide Web Consortium Recommendation,