idnits 2.17.1

draft-ietf-uri-urc-req-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors. All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document. Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document. Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There are 422 instances of weird spacing in the document. Is it really
     formatted ragged-right, rather than justified?

  ** There is 1 instance of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 11 has weird spacing: '...ocument is a...'

  == Line 12 has weird spacing: '...cuments of t...'

  == Line 13 has weird spacing: '...F), its areas...'

  == Line 14 has weird spacing: '... other group...'

  == Line 18 has weird spacing: '...fts may be up...'

  == (417 more instances...)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 21, 1994) is 10749 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

     Summary: 11 errors (**), 0 flaws (~~), 7 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1 Internet Engineering Task Force                          Ron Daniel Jr.
2 INTERNET-DRAFT                           Los Alamos National Laboratory
3 draft-ietf-uri-urc-req-00.txt                          Michael Mealling
4                                         Georgia Institute of Technology
5                                                       November 21, 1994

7 URC Scenarios and Requirements

9 Status of this draft

11 This document is an Internet-Draft. Internet-Drafts are
12 working documents of the Internet Engineering Task Force
13 (IETF), its areas, and its working groups. Note that
14 other groups may also distribute working documents as
15 Internet-Drafts.

17 Internet-Drafts are draft documents valid for a maximum of
18 six months. Internet-Drafts may be updated, replaced, or
19 obsoleted by other documents at any time. It is not
20 appropriate to use Internet-Drafts as reference material or
21 to cite them other than as a ``working draft'' or ``work in
22 progress.''

24 To learn the current status of any Internet-Draft, please
25 check the 1id-abstracts.txt listing contained in the Internet-
26 Drafts Shadow Directories on ds.internic.net, nic.nordu.net,
27 ftp.isi.edu, or munnari.oz.au.

29 This Internet Draft expires May 25, 1995.

31 Abstract

33 This draft describes the place of the Uniform Resource Characteristic
34 (URC) service within the overall context of Uniform Resource
35 Identification on the Internet. It presents several scenarios
36 illustrating how the URC service might be used. From these usage
37 scenarios, we derive a set of requirements that any proposed URC
38 services must meet.

40 Contents

42 1 Introduction

44 2 User Scenarios

46   2.1 URN to URL resolution

48   2.2 Meta-data for its own sake

50   2.3 Ensuring the veracity of the resource

52   2.4 Ensuring the veracity of the URC

54   2.5 Bibliographic Search

56   2.6 Filtering by Seals of Approval

58 3 Provider Scenarios

60   3.1 Publishing a new resource

62   3.2 Publishing a new version of a resource

64   3.3 Providing an additional location for a resource

66     3.3.1 Mirroring of free information

68     3.3.2 Mirroring of information that is for sale

70     3.3.3 Mirroring on a regular basis

72   3.4 Removing a location for a resource

74   3.5 Establishing a new publishing authority

76   3.6 Dealing with the demise of a publisher

78 4 Requirements

80   4.1 Requirements on the URC

82   4.2 Requirements on the URC Service

84 5 Characteristics
85 1 Introduction

87 As Joe Jackson says, ``You can't get what you want 'till you know what
88 you want''. This applies to software development just as much as it
89 applies to affairs of the heart. In order for the URI working group
90 to design an architecture that does what we want, we need to know what
91 we want it to do. This paper presents a wide range of scenarios for
92 how we would like the URC service to operate. From those scenarios,
93 we derive requirements for the functionality and encoding of Uniform
94 Resource Characteristics (URCs) within the overall architecture of
95 Uniform Resource Identification.

97 The URI architecture is concerned with resources and how they will
98 be used in applications. Resources are the objects, services, and
99 information that applications will make use of. For resources to be
100 used, applications must have means for discovering, identifying, and
101 retrieving them. Resources are named by a URN (Uniform Resource
102 Name), and are retrieved by means of a URL (Uniform Resource Locator).
103 Describing the resource for purposes of discovery, as well as making
104 the binding between a resource's name and its location(s), is the role
105 of the URC (Uniform Resource Characteristic). The URI architecture
106 is described in other working drafts of the URI working group of the
107 Internet Engineering Task Force, particularly in [1]. With the URI
108 architecture in mind, we can say that:

110     The purpose or function of a URC is to provide a vehicle or
111     structure for the representation of URIs and their associated
112     meta-information.

114 The next few sections present concrete examples of how we foresee
115 URCs being used. We also describe the operation of the service in
116 which URCs reside - the URC service. From those concrete scenarios,
117 we derive a set of requirements on the functionality and encoding of
118 URCs. Any proposals for the URC service will be expected to show how
119 they meet the requirements set forth in this paper, or to point out
120 the error of our ways.

122 2 User Scenarios

124 This section of the paper presents several scenarios of how users
125 might interact with the URC service. In these scenarios we have
126 attempted to show how the system will be used, without specifying how
127 the system will accomplish its tasks.

129 2.1 URN to URL resolution

131 The fundamental purpose of the URC service is to map URNs to URLs so
132 that a resource can be retrieved if its name is known. We believe
133 that the most frequent operation will be to take a URN and return a
134 (possibly empty) list of URLs where the resource named by the URN can
135 be found. This is the primary use of the service, and its speed and
136 fault-tolerance are paramount.

138 o User provides a URN to the browser by clicking on an anchor or by
139   entering text into a dialog box.

141 o Browser connects to the URC service and gives it the URN.

143 o Service returns a (possibly empty) list of locations to the
144   browser. Each location must contain a URL. It may also contain
145   information on Content-Type, Price, Signatures, Version, etc. The
146   list of locations is unordered. Note that if a location contains
147   information in addition to the URL, ordering may be used to
148   associate the additional information with a particular URL, but no
149   importance should be placed on one URL appearing before another in
150   the list of locations sent back to the browser. The means by
151   which the URC service determines this list is outside the scope of
152   this scenario.

154 o The browser uses user-configurable preferences to order the list.
155   For example, a user might prefer HTML to PostScript to text. One
156   user might prefer locations that carried signature information;
157   another might not care. Most would prefer the cheapest version
158   of a resource, and the most recent version. Estimated network
159   distance is another means for ordering the selections. If
160   multiple locations tie, the browser randomizes them in the list to
161   prevent overload of any one server.

163 o Once the list of locations has been sorted, the browser attempts
164   to retrieve the resource from the first location. If that fails,
165   the next location is tried. This continues until one of the
166   following is true:

168   -- The browser successfully retrieves the resource

170   -- The list is exhausted

172   -- The user tells the browser to cancel the retrieval

174 o The browser displays the resource to the user, perhaps with the
175   aid of an external viewer.

177 Note that the list of termination conditions given above is not
178 complete. The URC service will undoubtedly make extensive use of
179 caching for speed. If the list was obtained from a cache, the
180 possibility exists that the resolution failed because of the cache
181 being stale. A reasonable fix is to allow the user to configure
182 the browser to retry the query, this time getting authoritative
183 information to fill the list of locations.

185 This scenario provides us with several requirements for the URC and
186 the URC service. First, we must be able to locate a URC from a
187 URN. We must be able to transport a URC without errors using normal
188 Internet protocols. The URC must be parseable by a computer. It
189 may have a hierarchical structure, and we should be able to rearrange
190 elements of the URC within the same hierarchical level. The URC must
191 be able to contain a wide variety of information. Furthermore, we
192 must be able to distinguish queries answered over cached information
193 from those answered over authoritative information.

195 2.2 Meta-data for its own sake

197 The scenario above assumed that the user really and truly wanted to
198 retrieve the resource on the other end of the anchor. However,
199 sometimes the user will not be sure if they want to get the resource.
200 This is already the case in WWW browsing where users will sometimes
201 decide, by inferring size and speed from the URL, not to access
202 particular resources. As the WWW starts to encompass resources that
203 will require payment for their access, users will want to know just
204 what they are about to get themselves into.
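
As a reference point for this and the later scenarios, the browser-side steps of Section 2.1 (order the locations by user preference, randomize ties, try each in turn, and retry with authoritative data if a cached answer fails) can be sketched in a few lines. This is only an illustrative sketch: the Location fields, the resolve and fetch callables, and the rule of retrying exactly once with an authoritative query are assumptions made for the example, not part of any proposed URC protocol.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Location:
    """One entry of the list returned for a URN (hypothetical fields)."""
    url: str
    content_type: str = "text/plain"
    price: float = 0.0


def order_locations(locations: Sequence[Location],
                    preferred_types: Sequence[str],
                    rng: Optional[random.Random] = None) -> List[Location]:
    """Order locations by user preference; tied locations end up in random order."""
    rng = rng or random.Random()

    def preference(loc: Location):
        if loc.content_type in preferred_types:
            rank = preferred_types.index(loc.content_type)
        else:
            rank = len(preferred_types)      # unknown types go last
        return (rank, loc.price)             # preferred type first, then cheapest

    shuffled = list(locations)
    rng.shuffle(shuffled)                    # randomize first ...
    return sorted(shuffled, key=preference)  # ... the stable sort keeps ties random


def retrieve(urn: str,
             resolve: Callable[[str, bool], Sequence[Location]],
             fetch: Callable[[str], Optional[bytes]],
             preferred_types: Sequence[str]) -> Optional[bytes]:
    """Try each location in preference order; retry once with authoritative data.

    resolve(urn, authoritative) stands in for the URC service and
    fetch(url) stands in for the retrieval protocol; both are assumptions.
    """
    for authoritative in (False, True):
        for loc in order_locations(resolve(urn, authoritative), preferred_types):
            data = fetch(loc.url)
            if data is not None:
                return data                  # first successful retrieval wins
    return None                              # list exhausted, even after the retry
```

The third termination condition, cancellation by the user, is left out; a real browser would check for it between attempts.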

206 o User is browsing and comes across a moderately interesting link.

208 o User presses the right mouse button over the link, bringing up a
209   pop-up menu.

211 o User selects ``More info...'' from the pop-up and releases the
212   mouse button.

214 o Browser fetches the URC for the resource and displays it in a
215   nicely formatted dialog box.

217 o The user decides that they don't mind paying $1 for the resource,
218   and selects ``OK'' in the dialog.

220 o The browser fetches the resource and displays it to the user.

222 To support this scenario, the service must be able to provide an
223 entire URC, not just the list of locations that it returned in the
224 previous scenario. In addition, URCs must have a printable representation
225 that can be understood and transcribed by humans. This does not
226 mean that all elements must be easily understood, or even that we
227 have to transmit URCs in a readable form. It merely means that, in
228 addition to any other representation, fields must have a printed
229 representation that does not intentionally obfuscate the URC, barring
230 the presence of encryption.

232 2.3 Ensuring the veracity of the resource

234 An important concern voiced over the URI mailing list and in
235 discussions with different communities of users has been how to ensure
236 the veracity of a resource. This concern has been raised on both
237 the user and provider side. Users want to make sure that they
238 are getting the real resource, especially if they are paying for
239 it. Providers want to make sure that they are not haunted by bogus
240 versions of a resource. To ensure the veracity of a resource, the
241 location information provided by the URC service could carry a digital
242 signature of the resource.

244 o The user starts to retrieve a resource according to the first
245   scenario.

247 o As the browser is going through its list of locations, it notes if
248   the current location has signature information. The rest of this
249   scenario assumes that we successfully retrieve a resource which
250   has signature info.

252 o When the browser retrieves the resource, it displays it to the
253   user.

255 o In the background, the browser verifies the signature on the
256   information. To do this it retrieves the appropriate public key
257   of the publisher through a secure, ubiquitous public key service.
258   The public key is used to decrypt the signature from the location
259   object. The result is compared with the MD5 hash of the resource.

261 o If the signature does not check out, the browser alerts the user.

263 o If the user goes on to another resource before the signature
264   computation is complete, the computation is discarded.

266 This assumes that signatures are computed over the contents of a
267 complete file. Some resources, such as search services, cannot be
268 treated in such a fashion. One possibility would be for the URC to
269 contain the signature of a constant header the service provides with
270 its results. The header would contain a public key used to verify a
271 signature of the search results appended to the search results.

273 This scenario imposes the requirement that it be possible to establish
274 an unbroken chain of authentication from a URN through the URC to the
275 resource. Multiple signature schemes should be supported to allow
276 different cost/security tradeoffs to be made.

278 2.4 Ensuring the veracity of the URC

280 Resources are not the only information that can be tampered with. The
281 URC service will provide a tempting target for attack. It needs to
282 be secured against determined attacks and the information it provides
283 needs to be verifiable. However, security does not come for free, and
284 we should not impose that cost on all accesses. Therefore it is not
285 appropriate to make the URC server compute a digital signature for
286 every query response it generates.
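
The signature check described in Section 2.3, which applies equally to signed URC information in this section, can be sketched as follows. The sketch assumes textbook RSA applied to an MD5 digest, with a deliberately small demonstration key built from two Mersenne primes; the key, the function names, and the absence of padding are all assumptions made for illustration, and nothing here is suitable for real use. It does illustrate the cost argument above: signing happens once, when the data changes, while each verification is paid for by the client.

```python
import hashlib

# Toy key for illustration only: two known Mersenne primes, no padding.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # modulus, larger than any 128-bit MD5 digest
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(data: bytes) -> int:
    """MD5 hash of the data, as an integer (MD5 as named in the scenario)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def sign(data: bytes) -> int:
    """Publisher side: transform the digest with the private key."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Browser side: recover the digest with the public key and compare."""
    return pow(signature, E, N) == digest(data)
```

A publisher would call sign() once per resource or URC and store the result; a browser holding the public values (N, E) only ever calls verify().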

288 One approach would be for the server to keep two pre-computed
289 signatures for each of its URCs. The first is a signature over the
290 entire URC; the second is computed only over the location information
291 it would return in response to a standard URN resolution query.

293 o User configures the browser to verify URC information.

295 o The user clicks on a link.

297 o The browser sends a URN resolution request to the URC service.
298   The request has a flag set so that the URC server will provide
299   digital signature information.

301 o The browser receives the list of locations as in the first
302   scenario. In addition it receives a digital signature of that
303   information which has been encrypted with the private key of the
304   URC server.

306 o The browser retrieves the public key of the server, and uses it to
307   verify the URC information.

309 o If there is no problem, the browser continues as before to
310   retrieve the resource. If there is a problem, the browser alerts
311   the user, who should alert the administrator of the URC server.

313 If a general query is issued, the URCs for all matching resources are
314 returned in their entirety. The browser then has to verify each of
315 the URCs in turn. Validating general queries will be an expensive
316 process, but it is the user's machine paying most of the cost.

318 This scenario requires that the URC have a consistent external
319 representation that is suitable for the computation of digital
320 signatures. That representation must be network friendly to the
321 extent that it will be transmitted without any changes over standard
322 Internet protocols. Furthermore, it must be possible to separate the
323 portion of a URC being signed from the portion carrying the signature.

325 2.5 Bibliographic Search

327 In addition to locations, the URC provides a convenient place to
328 store bibliographic information such as author, title, subject, date
329 of publication, etc. Also, since publishers are assumed to be
330 arranged in a hierarchy, it should be possible to find every publisher
331 affiliated with the URC service. Combining these two properties opens
332 up the possibility of bibliographic searches across the whole of the
333 web. Exactly how this should work is not so obvious. A naive
334 approach would be:

336 o User enters author, title, and/or subject information into a form.

338 o Browser passes the query to the URC service.

340 o Within the URC service, each node is consulted with the query, and the
341   results are collected and passed back to the browser.

343 o The browser presents the search results to the user.

345 Of course, the scenario above is unrealistic. If every bibliographic
346 search of every user consults every URC server, the service as a whole
347 will soon grind to a halt. The obvious alternative is for some sites
348 to come forward and carry the burden of these searches, similar to
349 the current situation with Archie. Some sites will do this out of
350 the goodness of their hearts, while others may charge a fee for their
351 services. The usage scenario is now:

353 o User connects to a URC search site.

355 o Browser puts up the form from that site.

357 o User fills it in and hits ``submit''.

359 o The URC search site handles the query over its database and
360   returns the result to the browser.

362 o The browser displays the results to the user.

364 The database for handling the search is updated regularly by
365 harvesting the network:

367 o Search server starts a depth-first search of the tree of
368   publishers.

370 o Search server queries the current URC server for all URCs that are
371   new or have been modified since the last time the search server
372   visited. Those URCs are put into the database of the search
373   service.

375 o Search server asks the URC server for all changes in publication
376   hierarchy since last visit.

378 o Search server continues depth-first search using the new topology.

380 The URC service will grow beyond the capabilities of all but the
381 most dedicated sites (OCLC, Library of Congress, etc.) to keep a
382 comprehensive index. The natural course is for search servers to
383 only keep a portion of the URCs that exist. Exactly how they choose
384 the subset to retain will be a decision that varies from one search
385 service to another.

387 Several requirements for the URC service spring from considering
388 searching. First, we must be able to connect to a variety of
389 URC servers instead of having only one URC server be our gateway
390 to the world. Second, if URN resolution is handled through a
391 query forwarding mechanism, servers will want to distinguish between
392 a simple resolution request (which should be forwarded) and general
393 bibliographic queries (which should not be forwarded to most sites). If
394 URN resolution does not require query forwarding, then this is not a
395 problem. Third, there will need to be a means for determining how
396 to contact the server administrator so that the administrator will
397 add the central search services to the list of entities that can
398 launch certain queries. Fourth, a publisher's server must keep a
399 complete record of all the sub-publishers authorized by that publisher
400 and be able to provide that list in response to a query. Fifth,
401 we must be able to determine the parent publisher of a publisher,
402 either from the URN or by a special request to the URC server. Sixth,
403 the administrator of a URC server should be able to make incremental
404 modifications to the URCs on that server. Seventh, URCs should
405 carry information on their creation and modification dates so that
406 incremental harvesting is possible.

408 2.6 Filtering by Seals of Approval

410 One of the interesting ideas to come out of the Interpedia
411 effort [2] is the concept of SOAPs (Seals Of APproval). SOAPs
412 are capsule reviews of a resource and are implemented using digital
413 signature technology so that they will be extremely difficult to
414 forge. Critics, professional organizations, etc. could use SOAPs
415 to carry quick reviews of resources and to point to more elaborate
416 reviews. For example, the IEEE might receive a request to ``publish''
417 a resource in one of their electronic journals. The editorial board
418 of the journal lines up the requisite number of reviewers and sends
419 them the URL of the resource. Each of the reviewers sends their
420 review back to the editors, who either turn the author down flat,
421 recommend changes, or accept the resource as it is. If the editorial
422 board accepts it, they form a digital signature of the resource, the
423 quick rating, an optional URN of a full review, etc., all encrypted
424 with the private key of the particular journal.

426 Users could use SOAPs to augment bibliographic searches. For example,
427 a new physics grad student might ask to see the abstracts of all
428 the resources dealing with (string theory AND quantum chromodynamics)
429 which had been reviewed by the American Physical Society and received
430 a rating of 9 or above.

432 Such queries do not necessarily need to proceed in the same fashion
433 as the general bibliographic search described in the earlier section.
434 Instead, SOAPs may well become the valuable intellectual property of
435 professional organizations. It may be that if you wish to do searches
436 on things with the SOAP of the APS, you have to connect to their
437 server to do it, presumably paying them for the privilege. Given
438 this money-making potential, it is doubtful that many professional
439 organizations will allow authors to include a SOAP in the default URC
440 for their resource unless the author pays for the privilege.
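
The physics student's query above amounts to a bibliographic match combined with a filter on one reviewer's seal. A minimal sketch of that filter follows, assuming a hypothetical in-memory record layout: the Soap and Urc classes, their field names, and the integer rating scale are invented for illustration, and verification of each SOAP's digital signature is omitted.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Soap:
    """A capsule review (hypothetical fields; signature checking omitted)."""
    reviewer: str
    rating: int


@dataclass
class Urc:
    """Just enough of a URC to run the query (hypothetical layout)."""
    urn: str
    subjects: Set[str]
    soaps: List[Soap] = field(default_factory=list)


def search(urcs: Iterable[Urc], subjects: Set[str],
           reviewer: str, min_rating: int) -> List[str]:
    """Return the URNs whose URC covers every requested subject and carries
    a seal from `reviewer` rated at least `min_rating`."""
    hits = []
    for urc in urcs:
        if not subjects <= urc.subjects:     # AND over the requested subjects
            continue
        if any(s.reviewer == reviewer and s.rating >= min_rating
               for s in urc.soaps):
            hits.append(urc.urn)
    return hits
```

With such records, the example query would read search(catalog, {"string theory", "quantum chromodynamics"}, "APS", 9), where catalog is whatever collection of URCs the reviewing organization's server holds.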
442 o User connects to the server of a reviewing organization, or to a 443 server that has licensed the right to use the SOAPs of particular 444 reviewing organizations. 446 o Browser displays the search form of that organization. This will 447 be a typical bibliographic search form augmented with special 448 features for the SOAPs issued by the organization. 450 o User fills out the form and submits it. 452 o The server does the search and returns the results to the browser. 454 o The browser displays the results of the search to the user. 456 This scenario imposes two requirements on the URC. First, it must be 457 possible to extend the URC by adding arbitrary elements. In the case 458 above, the SOAP is the new element. Second, it must be possible to 459 ignore elements that you do not understand. For this to be possible 460 it must be possible to determine where any particular element ends, 461 even if you know nothing about the structure of information inside 462 the element. Note that there is an interaction between ignorability 463 and having a consistent representation for the purposes of digital 464 signatures. Digital signatures are computed over the external 465 representation, which can include experimental elements. Ignorability 466 is a feature of the conversion from the external to the internal 467 representation, where if we do not understand an element we are free 468 to discard it while we are parsing the URC. 470 3 Provider Scenarios 472 3.1 Publishing a new resource 474 This is one of the fundamental operations for resource providers. 475 Consequently it needs to be as simple and as bulletproof as possible. 477 Consider the processes of preparing and testing a new resource. Any 478 anchors in the resource must be expressed as URNs, not URLs. 
However, 479 the resource will typically undergo considerable change while it is 480 being developed, so it is not appropriate for a community larger 481 than the developers to be able to resolve the URNs to URLs. To 482 meet this need, the authors request "development URNs" from the 483 naming authority. Development URNs will have very minimal URCs. 484 Typically they will contain zero or one URL and some access control 485 information. These URNs are hidden from all but the developers 486 and server administrator. Using the development URNs, the author(s) 487 prepare and test the new resource. Note that many development URNs 488 will never make it to the status of full URNs. 490 When the author(s) are ready to publish the resource to the world, 491 they will modify the access control information in the development URC 492 to allow wider access. They may also augment the URC with author, 493 title, publication date, etc. The amount of information needed in the 494 URC of a published resource will vary from one publisher to another 495 and can be enforced by the publisher's URC software. If the resource 496 is to be verifiable, the signature of the resource will be put into 497 the URC at this time. Once all the material for the URC has been 498 provided, a signature can be computed over it as well. 500 Once the URC information has been put onto the local URC server, it 501 will be propagated to any other servers around the globe that can play 502 the role of default server for that publisher. 504 This first publishing scenario imposes several requirements. A 505 publisher must be able to provide developmental URNs and to shepherd 506 the corresponding URC through the development process. Minimal URCs 507 should be easy to generate by hand, and the URC must be incrementally 508 modifiable. Note that a logical consequence of this is that we will 509 want to maintain versioning information for the URC, not just the 510 resource it describes. 
However, that is not foreseen to be part of 511 a minimal URC. While in development, harvester queries should not get 512 these URCs. Finally, note that we leave open the possibility for 513 multiple URC servers to provide default information for the works of 514 a particular publisher. Any proposed specifications will need to 515 show how they meet requirements for consistency among such cooperating 516 servers. At a minimum, a URC server for a publisher should know all 517 the other URC servers for that publisher and be able to update their 518 records. 520 3.2 Publishing a new version of a resource 522 When it is time to revise a resource the authors request a development 523 version of the URC info. This version will have restricted access 524 so that ordinary users only see the older version while the authors 525 and URC server administrator can see the new version. Once the 526 new version of the resource is ready, the modifications in the URC 527 are made publicly accessible. Locations for the new version are 528 established, and the locations for the old version should gradually go 529 away. 531 This scenario requires that the developers be able to augment their 532 resolution queries with version information so that they will be able 533 to access the new version while the rest of the world continues 534 to receive the old version. It also requires that access control 535 mechanisms have a fine enough granularity in the URC to allow such a 536 discrimination to be made. 538 3.3 Providing an additional location for a resource 540 One of the main benefits we are looking for from the URC service it 541 the ability to have multiple locations for a resource. How are these 542 additional locations to be established? There will be several ways 543 this might happen, the appropriate model will depend on financial 544 considerations more than technical ones. We will consider three 545 cases out of many possible ones. 
The first is simple mirroring of 546 free information. The second is a mirror of a small publisher's 547 information that is sold. The third is a contractual arrangement 548 between sites. 550 3.3.1 Mirroring of free information 552 A researcher in Australia comes across a collection of interesting 553 technical reports on a server in Sweden. He wishes to mirror those 554 reports as a service to the research community in Australia. He 555 contacts the administrator of the archive in Sweden, who also happens 556 to be the author of the reports of interest. She gives her permission 557 for a mirror to be established. He pulls over the reports and sets 558 them up on his HTTP server. Now that they have URLs, he sends 559 a register_new_url message to the URC service. Since the Swedish 560 researcher has provided a digital signature for the URCs of all the 561 reports, a new location can not just be blindly entered. The URC 562 service forwards the request to the Swedish researcher. She checks 563 out the new URLs to make sure that they are faithful versions of her 564 reports, then signs the register_new_url message with her private key 565 before sending it back to the URC service. The service verifies 566 the authentication information, sees that it is good, adds the new 567 location to the URC of each report and recalculates the signature 568 information. Now when users attempt to resolve the URN, they can 569 fetch it from either Australia or Sweden. As a matter of courtesy, 570 the Australian researcher periodically informs the Swedish researcher 571 about how many times her reports have been accessed. 573 This scenario does not impose any requirements on the URC service 574 beyond those already described for modifying the URC information. 576 3.3.2 Mirroring of information that is for sale 578 An experimental film maker in Germany has been selling avant-garde 579 videos over the WWW. 
A film distributor in Canada contacts her to see if they can serve
these up to the North American market in exchange for a cut of the
action. The film maker says "sure". The Canadian distributor puts
copies of the videos onto their video server and attempts to register
the new location with the URC service. The service forwards the
request to the film maker, who authorizes it by signing the request
with her private key. Periodically, the Canadians send the film maker
a check to cover the royalties she is owed for every download from
the Canadian server. As part of the contract between the two sites,
the film maker can access the logs on the distributor's server to
make sure that she is being paid for all the copies it provides.

This scenario does not impose many requirements on the URC service.
Note that the access logs are an obvious point of attack for an
unscrupulous mirror site administrator. It would be nice if there
were a means for ensuring their veracity; however, that is an HTTP
(or equivalent) server issue, not a URC server issue.

3.3.3 Mirroring on a regular basis

Some large sites may set up cooperative mirroring agreements. For
example, Los Alamos National Laboratory might make arrangements with
CERN to provide mirrors of each other's work. When either of these
sites publishes a new resource, it sends a message to the other. The
second site fetches the resource and puts it on its own server. It
then issues a register_new_url message to the URC service. The
message is forwarded to the publisher of the resource, where it is
automatically approved without human intervention.

This scenario requires that the URC server's access controls be
capable of registering multiple users, which is not a big problem.
It also adds the concept of a list of sites to notify when new
material is published.
However, that requirement could be eliminated in favor of a polling
model as already discussed in the information harvesting scenario.

3.4 Removing a location for a resource

One of the other strong motivations for the URC service is to allow
administrators of collections of information to rearrange their
collections without breaking pages across the globe. Moving
resources can be accomplished in two steps: establishing the new
location, then deleting the original location. Deletion is also
necessary when we wish to remove a resource for any reason.

o The administrator of a resource location sends a delete_location
  message to the URC server. This will typically require
  authentication that is provided by means of a digital signature.

o The URC service authenticates the request. If the issuer of the
  request has permission, the URC is searched for the specified
  location. If found, it is removed and a new signature for the URC
  is computed.

o If there are other URC servers providing default information for
  the particular publisher, they are notified as well so that they
  may also modify their databases.

This scenario requires that we be able to select portions of a URC
and delete them. Access control mechanisms should operate on a fine
enough grain that the administrator of one location can delete their
own location from the URC, but cannot delete other locations.

3.5 Establishing a new publishing authority

Publishers are arranged in a hierarchy where new publishers can be
added as children of existing ones.

o Billy Bob Riker, Harley biker, decides to publish his doggerel to
  the world.
  He contacts his friendly neighborhood web publisher who, in
  exchange for a modest amount of cash, establishes "HogDog Press"
  as a publisher by issuing the following request to the URC
  service, signed with the private key of the publisher:

     Register_new_publisher(parent_publisher, name_of_new_publisher,
                            signature_of_request)

o Since Billy Bob's nomadic lifestyle is a little hard on disk
  drives, he contracts with a third party to provide storage, HTTP
  service, and URC service. These are private business dealings
  between the two parties and do not especially concern us.

o Billy Bob's prose is published to the world using the operations
  described earlier.

3.6 Dealing with the demise of a publisher

Poor Billy Bob. The market for Harley doggerel was not enough to
cover his yearly storage fees and his service provider is about to
evict his bits. While the parent publisher will never again register
an entity as "HogDog Press", no one is paying the service provider
for the machine resources, so it is time to remove the URLs from the
URCs for Billy Bob's resources.

Accomplishing this in the presence of digital signatures could be
tricky. Billy Bob will have signed the URC elements with HogDog's
private key, and he is not about to go along willingly with the
eviction proceedings. Of course, he is not the administrator of the
HTTP and URC servers. The administrator of the servers simply
clobbers the old URC and replaces it with one that contains a "no
longer available" element. It can't be signed with Billy Bob's key,
but so what? Well, it leads to a form of denial of service attack.

Once the new URC is in place, the resources are deleted from the HTTP
server.

This scenario requires us to pay considerable attention to who will
sign URCs and URC components, how URCs might be nested in other URCs,
etc.
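
As an illustration of the location-management scenarios in Sections
3.3 and 3.4, the following is a minimal sketch of per-location access
control, not a proposed design. The class name Urc, its fields, the
example URN and URLs, and the administrator names are all hypothetical.
A plain SHA-256 digest stands in for "a new signature for the URC is
computed"; a real service would use a public-key signature. The sketch
also assumes that only the publisher approves new locations and that
only the administrator of a location may delete it.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Urc:
    """Toy URC: a URN plus locations, each owned by one administrator."""

    urn: str
    publisher: str
    locations: dict[str, str] = field(default_factory=dict)  # URL -> admin
    digest: str = ""

    def _reseal(self) -> None:
        # Assumption: a SHA-256 digest stands in for the URC signature.
        payload = self.urn + "\n" + "\n".join(sorted(self.locations))
        self.digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def register_new_url(self, url: str, admin: str, approved_by: str) -> None:
        # The request is forwarded to the publisher, who must approve it.
        if approved_by != self.publisher:
            raise PermissionError("only the publisher can approve a location")
        self.locations[url] = admin
        self._reseal()

    def delete_location(self, url: str, requester: str) -> bool:
        # Search the URC for the location; report False if it is absent.
        owner = self.locations.get(url)
        if owner is None:
            return False
        # Fine-grained control: an administrator may delete only the
        # location that he or she administers.
        if requester != owner:
            raise PermissionError("requester does not administer this location")
        del self.locations[url]
        self._reseal()
        return True


urc = Urc(urn="urn:example:report-1", publisher="sweden-admin")
urc.register_new_url("http://se.example/r1", "sweden-admin", "sweden-admin")
urc.register_new_url("http://au.example/r1", "australia-admin", "sweden-admin")
urc.delete_location("http://au.example/r1", requester="australia-admin")
print(sorted(urc.locations))
```

In this sketch the Australian administrator can withdraw the mirror
but would receive a PermissionError when attempting to delete the
Swedish location. Notification of other default URC servers for the
publisher is left out.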

4 Requirements

In each of the scenarios above we listed any new requirements that
would be placed on the URC and the URC service. This section collects
those requirements into two categories: requirements on the URC, and
requirements on the URC service. Any proposed specifications for the
URC and URC service will need to demonstrate how they meet all the
requirements, or demonstrate how the requirements are unnecessary or
in error.

4.1 Requirements on the URC

Machine Consumption  A URC must be parsable by a computer.

Consistent External Representation  In order for digital signatures
   of the URC information to work, and to simplify the requirement
   for parsability, the URC must have a consistent external
   representation.

Transport Friendliness  Related to the consistency of the external
   representation, it must be possible to transport a URC unmodified
   in the common Internet protocols, such as TCP, SMTP, FTP, Telnet,
   etc.

Human Readability  A URC must have a representation that is suitable
   for printing on paper, as well as suitable for entry by a user
   typing on a keyboard. Digital signature information need not
   apply to such items given the high probability of trivial
   differences.

   Some meta-information items are meant for humans only while others
   are meant only to be machine consumable. One requirement should
   not preclude the other from being encoded.

Simplicity  It must be simple for humans to generate correct minimal
   URCs that do not carry any signature information.

Rearrangeability  It must be possible to reorder elements in a URC
   without changing their semantics. Note that elements in this case
   may mean compound entities. The compound entities (such as
   information related to a particular location) should be
   rearrangeable, while the information inside the entity may need to
   have its order preserved.

Generality  In the most basic sense, a URC must be able to contain
   ANY conceivable type of meta-information or URI. Therefore it must
   be possible to add new types of elements to the URC without
   breaking previous applications. Any restrictions on the
   representational capability of a URC will be the target of intense
   scrutiny.

Structure  To accommodate the encapsulation of objects that are
   currently unforeseen, a URC must have a self-describing structure.
   We interpret this as meaning that elements in a URC must be tagged
   with a descriptive label and it must be possible to determine
   their extent, even in the presence of nesting.

Ignorability  Related to the previous two requirements, an
   application encountering an unknown tag must be able to ignore it
   without error. This implies that it must be easy to tell where an
   element stops, even if you don't know anything about its internal
   structure. Also note that nested unknown elements must still be
   handled correctly.

Searchable  It must be possible to select a URC based on a search of
   its components. It must be possible to select which components
   will be searched and which will not be searched.

Subsettable  It must be possible to form a new URC from some of the
   components of another URC.

Separable  It must be possible to separate a signature in a URC from
   the information it signs.

Incrementally Modifiable  It must be possible to add (or delete)
   elements to (or from) a URC in an incremental fashion. It must be
   possible to provide elements in a URC for tracking the changes in
   a URC.

Versioning  It must be possible for a URC to track version changes in
   the resource(s) it describes. This is related to, but distinct
   from, the requirement that a URC be able to track version changes
   in a URC.

Caching  Caching should be possible for any URC, even if some of its
   specific elements are not cacheable.
   Further, it must be possible to determine if a query has been
   answered from cached or authoritative URC information.

Grandfathering  Current meta-information schemes should be allowed to
   work within the URC structure, where this will not conflict with
   the other requirements.

4.2 Requirements on the URC Service

The previous section discussed requirements on the encoding and
functionality of single URCs. This section presents the requirements
we have derived for collections of URCs sitting on a URC server, and
that server's communication with applications.

Resolution  It must be possible for a URN to be resolved into a URC.

   A URC is meant to be the format that URNs and URLs are transported
   in; therefore a given URN or URL may be resolved into a URC.
   Nothing within a URC should prevent it from being the result of a
   URN or URL resolution.

Query Language  It must be possible for simple resolution queries to
   be augmented with information on the version of a resource
   desired, and an indication of whether signature information should
   be supplied.

Security  Since the URC service will be providing information of
   significant value, it will be a tempting target for attack. It
   must be possible to secure the service without imposing
   performance penalties on unauthenticated access to free,
   unencrypted information.

Authentication Chain  One aspect of the general requirement for
   security is that it must be possible to establish a chain of
   authentication from the URN, through the URC, to the resource
   retrieved through a URL.

Access Control  Another aspect of security is that it must be
   possible to control (at least) read and write access to portions
   of a URC.

Maintenance  The URI architecture is intended to be long-lived. The
   service must not prevent the maintenance of the resources and
   their meta-information.

Synchronization  If multiple servers are equally authoritative for a
   publisher, they must work with each other to keep their URCs in
   sync within a reasonable time delay. This delay is certainly less
   than a week, but can be more than an hour.

Development  URC servers must support the development of new
   resources by issuing and handling "development" URNs and URCs.

Choice  A user must be able to connect to a range of URC servers, and
   not have to do all interaction through one server that is a
   gateway to the world.

Scalability  In order for the system to scale to large numbers of
   users, queries on one URC server should not automatically be
   forwarded in such a fashion that they will hit a large number of
   other servers around the globe.

Administrative Contact  There must be a standard means for contacting
   the administrator(s) of a server.

Hierarchical Operations  A publisher will have the right to register
   sub-publishers.

   The publisher must keep a list of the sub-publishers it has
   created.

   That list should be available as the result of an appropriate
   query to a server that speaks for the publisher.

   It must be possible to determine the parent publisher of a
   sub-publisher.

5 Characteristics

Several important characteristics of URCs come about as a result of
fulfilling the above requirements. Some of these characteristics are
a result of requirements on the URNs and URLs that make up some of
the elements of a URC:

Time To Live  Since a URC may contain transient information such as
   timestamps, access privileges, etc., it cannot be guaranteed to
   have a Time To Live greater than 0. Any subset of a URC is free to
   specify its own TTL, but this still does not affect the whole URC.
   While this does not preclude the user from attempting to trust a
   URC for a longer amount of time, it should not be something to
   depend on.

Character Sets  Since the encapsulation and scalability requirements
   force the inclusion of alternate character sets, some common
   scheme must be found that accommodates all character sets in a way
   that fulfills the transport friendly encoding requirement. This
   precludes any restrictions on allowable character sets.

Data Naming  Fulfilling the grandfathering requirement will make it
   nearly impossible to specify the numerous ways extremely similar
   pieces of information can be represented. Thus one consideration
   should be a central authority that makes suggestions as to the
   consolidation of the names used to identify specific pieces of
   meta-information.

Member Element Control  By allowing any piece of meta-information to
   be included within a URC, the number of globally understood
   elements will be small if not non-existent. Therefore some entity
   must have some control over some set of very concretely specified
   member elements. The specification of that entity should be done
   in an encoding specification and is outside the scope of this list
   of functional requirements.

Multiple Signatures  To accommodate the requirements of insertion and
   deletion in the presence of digital signatures, we anticipate that
   URCs will be signed using the private key of a server. The
   server's public key would be signed by the private key of the
   publisher. Entities with the authority to modify URC elements
   will have to have their keys signed by the server.

References

[1] Sollins, K. and Masinter, L., Functional Requirements for
    Uniform Resource Names,

[2] Rhine, J., Interpedia Homepage,

Glossary

Default URC  The URC that is provided by the publisher of a resource.

Default URC server  The URC server(s) that can provide the default
   URCs for a publisher.

Local URC server  The URC server that a user's browser is configured
   to connect to as a first resort.

Development URN  A URN used while developing a resource. It starts
   with very tight access controls so that only the resource
   developers and the server administrator can see the URC
   information and resolve the URN to a URL. The access controls can
   be eased later.

Value-added URC server  A server that provides more than just the
   default information on a resource. Servers run by professional
   organizations that provide SOAPs are one example; servers that
   keep full-text indices or n-grams of text in order to offer
   greater search capabilities are another.

SOAP  Seal Of APproval - A capsule review of a resource which uses
   cryptographic techniques to provide guarantees on the source of
   the review and its authenticity.

Contact Information

   Ron Daniel Jr.
   MS B287
   Los Alamos National Laboratory
   Los Alamos, NM, USA 87545
   voice: (505) 665-0139
   fax: (505) 665-4939
   rdaniel@lanl.gov

   Michael Mealling
   Office of Information Technology, Network Services
   Georgia Institute of Technology
   Atlanta, GA, USA 30332-0730
   voice: (404) 894-1712
   fax: (404) 894 9548
   michael.mealling@oit.gatech.edu

This Internet Draft expires May 25, 1995.