idnits 2.17.1

draft-hoffman-rfc1738bis-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 537 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 2 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** There are 4 instances of lines with control characters in the document.

  ** The abstract seems to contain references ([RFC1738]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.
     If not, you may need to add the pre-RFC5378 disclaimer.  (See the Legal
     Provisions document at https://trustee.ietf.org/license-info for more
     information.)

  -- The document date (April 19, 2003) is 7671 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'STD3' is defined on line 515, but no explicit
     reference was found in the text

  == Unused Reference: 'STD13' is defined on line 518, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'PROSPERO'

  ** Downref: Normative reference to an Informational RFC: RFC 1436

  ** Downref: Normative reference to an Informational RFC: RFC 1625

  ** Obsolete normative reference: RFC 1738 (Obsoleted by RFC 4248, RFC 4266)

  -- Possible downref: Normative reference to a draft: ref. 'RFC2396BIS'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'WAIS'

     Summary: 10 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Internet Draft                                              Paul Hoffman
draft-hoffman-rfc1738bis-02.txt                           VPN Consortium
April 19, 2003
Expires in six months
Intended status: Standards Track

                    Definitions of Early URI Schemes

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

This document specifies many Uniform Resource Identifier (URI) schemes
that were originally specified in RFC 1738 [RFC1738]. Some of these
schemes are specified more fully in this document. The purpose of
this document is to allow RFC 1738 to be moved to Historic status while
keeping the information about the schemes on standards track.

1. Introduction

URIs are currently defined in RFC 2396, which is being updated by
[RFC2396BIS]. Those documents also specify how to define schemes for
URIs.

The first definition for many URI schemes appeared in RFC 1738. Because
that document may be moved to Historic status, this document copies the
still-needed material from it to allow that material to remain on
standards track. Specifically, this document copies the URI schemes.

Some of the URI scheme definitions have been changed. The following
list gives all of the changes:

- http: was removed because it is specified in RFC 2616

- mailto: was removed because it is specified in RFC 2368

Note that three of the protocols whose schemes are described in this
document (Gopher+, WAIS, and Prospero) were never documented in RFCs,
and the references to them are URLs that may not be long-lasting. In
fact, at least two of those URLs are no longer working at the time of
this writing.

2. Specific Schemes

The mapping for some existing standard and experimental protocols is
outlined in the BNF syntax definition.
Notes on particular protocols follow. The schemes covered are:

   ftp                     File Transfer protocol
   gopher                  The Gopher protocol
   news and nntp           USENET news
   telnet                  Reference to interactive sessions
   wais                    Wide Area Information Servers
   file                    Host-specific file names
   prospero                Prospero Directory Service

2.1. Common Internet Scheme Syntax

The common URL syntax is described in [RFC2396BIS] and is thus
not repeated here.

2.2. FTP

The ftp URL scheme is used to designate files and directories on
Internet hosts accessible using the FTP protocol (RFC 959).

An FTP URL follows the syntax described in Section 2.1. If :<port> is
omitted, the port defaults to 21.

2.2.1. FTP Name and Password

A user name and password may be supplied; they are used in the ftp
"USER" and "PASS" commands after first making the connection to the
FTP server. If no user name or password is supplied and one is
requested by the FTP server, the conventions for "anonymous" FTP are
to be used, as follows:

   The user name "anonymous" is supplied.

   The password is supplied as the Internet e-mail address
   of the end user accessing the resource.

If the URL supplies a user name but no password, and the remote
server requests a password, the program interpreting the FTP URL
should request one from the user.

2.2.2. FTP url-path

The url-path of an FTP URL has the following syntax:

   <cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings
and <typecode> is one of the characters "a", "i", or "d". The part
";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be
empty. The whole url-path may be omitted, including the "/"
delimiting it from the prefix containing user, password, host, and
port.

The url-path is interpreted as a series of FTP commands as follows:

   Each of the <cwd> elements is to be supplied, sequentially, as the
   argument to a CWD (change working directory) command.

   If the typecode is "d", perform an NLST (name list) command with
   <name> as the argument, and interpret the results as a file
   directory listing.

   Otherwise, perform a TYPE command with <typecode> as the argument,
   and then access the file whose name is <name> (for example, using
   the RETR command.)

Within a name or CWD component, the characters "/" and ";" are
reserved and must be encoded. The components are decoded prior to
their use in the FTP protocol. In particular, if the appropriate FTP
sequence to access a particular file requires supplying a string
containing a "/" as an argument to a CWD or RETR command, it is
necessary to encode each "/".

For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
interpreted by FTP-ing to "host.dom", logging in as "myname"
(prompting for a password if it is asked for), and then executing
"CWD /etc" and then "RETR motd". This has a different meaning from
<URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
"RETR motd"; the initial "CWD" might be executed relative to the
default directory for "myname". On the other hand,
<URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
argument, then "CWD etc", and then "RETR motd".

FTP URLs may also be used for other operations; for example, it is
possible to update a file on a remote file server, or infer
information about it from the directory listings. The mechanism for
doing so is not spelled out here.

2.2.3. FTP Typecode is Optional

The entire ;type=<typecode> part of an FTP URL is optional. If it is
omitted, the client program interpreting the URL must guess the
appropriate mode to use. In general, the data content type of a file
can only be guessed from the name, e.g., from the suffix of the name;
the appropriate type code to be used for transfer of the file can
then be deduced from the data content of the file.

2.2.4. Hierarchy

For some file systems, the "/" used to denote the hierarchical
structure of the URL corresponds to the delimiter used to construct a
file name hierarchy, and thus, the filename will look similar to the
URL path. This does NOT mean that the URL is a Unix filename.

2.2.5. Optimization

Clients accessing resources via FTP may employ additional heuristics
to optimize the interaction. For some FTP servers, for example, it
may be reasonable to keep the control connection open while accessing
multiple URLs from the same server. However, there is no common
hierarchical model to the FTP protocol, so if a directory change
command has been given, it is impossible in general to deduce what
sequence should be given to navigate to another directory for a
second retrieval, if the paths are different. The only reliable
algorithm is to disconnect and reestablish the control connection.

2.3. Gopher

The gopher URL scheme is used to designate Internet resources
accessible using the Gopher protocol.

The base Gopher protocol is described in [RFC1436] and supports items
and collections of items (directories). The Gopher+ protocol is a set
of upward compatible extensions to the base Gopher protocol and is
described in [Gopher+]. Gopher+ supports associating arbitrary sets of
attributes and alternate data representations with Gopher items.
Gopher URLs accommodate both Gopher and Gopher+ items and item
attributes.

2.3.1. Gopher URL syntax

A Gopher URL takes the form:

   gopher://<host>:<port>/<gopher-path>

where <gopher-path> is one of

   <gophertype><selector>
   <gophertype><selector>%09<search>
   <gophertype><selector>%09<search>%09<gopher+_string>

If :<port> is omitted, the port defaults to 70. <gophertype> is a
single-character field to denote the Gopher type of the resource to
which the URL refers. The entire <gopher-path> may also be empty, in
which case the delimiting "/" is also optional and the <gophertype>
defaults to "1".

<selector> is the Gopher selector string.
In the Gopher protocol, Gopher selector strings are a sequence of
octets which may contain any octets except 09 hexadecimal (US-ASCII HT
or tab), 0A hexadecimal (US-ASCII character LF), and 0D hexadecimal
(US-ASCII character CR).

Gopher clients specify which item to retrieve by sending the Gopher
selector string to a Gopher server.

Within the <gopher-path>, no characters are reserved.

Note that some Gopher <selector> strings begin with a copy of the
<gophertype> character, in which case that character will occur twice
consecutively. The Gopher selector string may be an empty string;
this is how Gopher clients refer to the top-level directory on a
Gopher server.

2.3.2. Specifying URLs for Gopher Search Engines

If the URL refers to a search to be submitted to a Gopher search
engine, the selector is followed by an encoded tab (%09) and the
search string. To submit a search to a Gopher search engine, the
Gopher client sends the <selector> string (after decoding), a tab,
and the search string to the Gopher server.

2.3.3. URL syntax for Gopher+ items

URLs for Gopher+ items have a second encoded tab (%09) and a Gopher+
string. Note that in this case, the %09<search> string must be
supplied, although the <search> element may be the empty string.

The <gopher+_string> is used to represent information required for
retrieval of the Gopher+ item. Gopher+ items may have alternate
views and arbitrary sets of attributes, and may have electronic forms
associated with them.

To retrieve the data associated with a Gopher+ URL, a client will
connect to the server and send the Gopher selector, followed by a tab
and the search string (which may be empty), followed by a tab and the
Gopher+ commands.

2.3.4. Default Gopher+ data representation

When a Gopher server returns a directory listing to a client, the
Gopher+ items are tagged with either a "+" (denoting Gopher+ items)
or a "?" (denoting Gopher+ items which have a +ASK form associated
with them).
A Gopher URL with a Gopher+ string consisting of only a
"+" refers to the default view (data representation) of the item,
while a Gopher+ string containing only a "?" refers to an item with a
Gopher electronic form associated with it.

2.3.5. Gopher+ items with electronic forms

Gopher+ items which have a +ASK associated with them (i.e., Gopher+
items tagged with a "?") require the client to fetch the item's +ASK
attribute to get the form definition, and then ask the user to fill
out the form and return the user's responses along with the selector
string to retrieve the item. Gopher+ clients know how to do this but
depend on the "?" tag in the Gopher+ item description to know when to
handle this case. The "?" is used in the Gopher+ string to be
consistent with the Gopher+ protocol's use of this symbol.

2.3.6. Gopher+ item attribute collections

To refer to the Gopher+ attributes of an item, the Gopher URL's
Gopher+ string consists of "!" or "$". "!" refers to all of a
Gopher+ item's attributes. "$" refers to all the item attributes for
all items in a Gopher directory.

2.3.7. Referring to specific Gopher+ attributes

To refer to specific attributes, the URL's gopher+_string is
"!<attribute_name>" or "$<attribute_name>". For example, to refer to
the attribute containing the abstract of an item, the gopher+_string
would be "!+ABSTRACT".

To refer to several attributes, the gopher+_string consists of the
attribute names separated by coded spaces. For example,
"!+ABSTRACT%20+SMELL" refers to the +ABSTRACT and +SMELL attributes
of an item.

2.3.8. URL syntax for Gopher+ alternate views

Gopher+ allows for optional alternate data representations (alternate
views) of items. To retrieve a Gopher+ alternate view, a Gopher+
client sends the appropriate view and language identifier (found in
the item's +VIEW attribute).
To refer to a specific Gopher+ alternate
view, the URL's Gopher+ string would be in the form:

   +<view_name>%20<language_name>

For example, a Gopher+ string of "+application/postscript%20Es_ES"
refers to the Spanish language postscript alternate view of a Gopher+
item.

2.3.9. URL syntax for Gopher+ electronic forms

The gopher+_string for a URL that refers to an item referenced by a
Gopher+ electronic form (an ASK block) filled out with specific
values is a coded version of what the client sends to the server.
The gopher+_string is of the form:

   +%091%0D%0A+-1%0D%0A<ask_item1_value>%0D%0A<ask_item2_value>%0D%0A.%0D%0A

To retrieve this item, the Gopher client sends:

   <a_gopher_selector><tab>+<tab>1<cr><lf>
   +-1<cr><lf>
   <ask_item1_value><cr><lf>
   <ask_item2_value><cr><lf>
   .<cr><lf>

to the Gopher server.

2.4. news and nntp

The news and nntp URL schemes are used to refer to either news groups or
individual articles of USENET news, as specified in RFC 1036.

A news URL takes one of two forms:

   newsURL      =  scheme ":" [ news-server ] [ refbygroup | message ]
   scheme       =  "news" | "nntp"
   news-server  =  "//" server "/"
   refbygroup   =  group [ "/" messageno [ "-" messageno ] ]
   messageno    =  local-part "@" domain

A <newsgroup-name> is a period-delimited hierarchical name, such as
"comp.infosystems.www.misc". A <message-id> corresponds to the
Message-ID of section 2.1.5 of RFC 1036, without the enclosing "<"
and ">"; it takes the form <unique>@<full_domain_name>. A message
identifier may be distinguished from a news group name by the
presence of the commercial at "@" character. No additional characters
are reserved within the components of a news URL.

If <newsgroup-name> is "*" (as in <URL:news:*>), it is used to refer
to "all available news groups".

2.5. TELNET

The Telnet URL scheme is used to designate interactive services that
may be accessed by the Telnet protocol.

A telnet URL takes the form:

   telnet://<user>:<password>@<host>:<port>/

as specified in Section 2.1. The final "/" character may be omitted.
If :<port> is omitted, the port defaults to 23. The :<password> can
be omitted, as well as the whole <user>:<password> part.
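
The defaulting rules for telnet URLs given above can be sketched in
Python. This is only an illustrative sketch under stated assumptions,
not part of the scheme definition: the function name parse_telnet_url
and the layout of the returned dictionary are hypothetical, and the
splitting is delegated to the standard urllib.parse module rather than
to a parser written from this document.

```python
from urllib.parse import urlsplit, unquote

TELNET_DEFAULT_PORT = 23  # used when the port part is omitted


def parse_telnet_url(url):
    """Split a telnet URL into host, port, user and password.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "telnet":
        raise ValueError("not a telnet URL: %r" % (url,))
    # Only an empty path or the single final "/" is expected.
    if parts.path not in ("", "/"):
        raise ValueError("unexpected path in telnet URL: %r" % (url,))

    user = parts.username
    password = parts.password
    return {
        "host": parts.hostname,
        "port": parts.port if parts.port is not None else TELNET_DEFAULT_PORT,
        # Both parts are optional; None means "not supplied".
        "user": unquote(user) if user is not None else None,
        "password": unquote(password) if password is not None else None,
    }
```

With this sketch, parse_telnet_url("telnet://host.example") yields port
23 with no user or password, while
parse_telnet_url("telnet://guest@host.example/") yields the user
"guest" and no password.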

This URL does not designate a data object, but rather an interactive
service. Remote interactive services vary widely in the means by
which they allow remote logins; in practice, the <user> and
<password> supplied are advisory only: clients accessing a telnet URL
merely advise the user of the suggested username and password.

2.6. WAIS

The WAIS URL scheme is used to designate WAIS databases, searches, or
individual documents available from a WAIS database. WAIS is
described in [WAIS]. The WAIS protocol is described in RFC 1625
[RFC1625]. Although the WAIS protocol is based on Z39.50-1988, the
WAIS URL scheme is not intended for use with arbitrary Z39.50
services.

A WAIS URL takes one of the following forms:

   wais://<host>:<port>/<database>
   wais://<host>:<port>/<database>?<search>
   wais://<host>:<port>/<database>/<wtype>/<wpath>

where <host> and <port> are as described in Section 2.1. If :<port>
is omitted, the port defaults to 210. The first form designates a
WAIS database that is available for searching. The second form
designates a particular search. <database> is the name of the WAIS
database being queried.

The third form designates a particular document within a WAIS
database to be retrieved. In this form <wtype> is the WAIS
designation of the type of the object. Many WAIS implementations
require that a client know the "type" of an object prior to
retrieval, the type being returned along with the internal object
identifier in the search response. The <wtype> is included in the
URL in order to allow the client interpreting the URL adequate
information to actually retrieve the document.

The <wpath> of a WAIS URL consists of the WAIS document-id. The WAIS
document-id should be treated opaquely; it may only be decomposed by
the server that issued it.

2.7. FILES

The file URL scheme is used to designate files accessible on a
particular host computer. This scheme, unlike most other URL schemes,
does not designate a resource that is universally accessible over the
Internet.

A file URL takes the form:

   file://<host>/<path>

where <host> is the fully qualified domain name of the system on
which the <path> is accessible, and <path> is a hierarchical
directory path of the form <directory>/<directory>/.../<name>.

As a special case, <host> can be the string "localhost" or the empty
string; this is interpreted as "the machine from which the URL is
being interpreted". However, this part of the syntax has been
ignored on many systems. That is, for some systems, the following
are considered equal, while on others they are not:

   file://localhost/path/to/file.txt
   file:///path/to/file.txt

Some systems allow URLs to point to directories. In this case, there
is usually (but not always) a terminating "/" character, such as
in:

   file://usr/local/bin/

On systems running some versions of Microsoft Windows, the local drive
specification is preceded by a "/" character. Thus, for a file called
"example.ini" in the "windows" directory on the "c:" drive, the URL
would be:

   file:///c:/windows/example.ini

For Windows shares, there is an additional "/" prepended to the name.
Thus, the file "example.doc" on the shared directory "department" would
have the URL:

   file:////department/example.doc

The file URL scheme is unusual in that it does not specify an
Internet protocol or access method for such files; as such, its
utility in network protocols between hosts is limited.

2.8. Prospero

The prospero URL scheme is used to designate resources that are
accessed via the Prospero Directory Service. The Prospero protocol is
described elsewhere [PROSPERO].

A prospero URL takes the form:

   prospero://<host>:<port>/<hsoname>;<field>=<value>

where <host> and <port> are as described in Section 2.1. If :<port>
is omitted, the port defaults to 1525. No username or password is
allowed.

The <hsoname> is the host-specific object name in the Prospero
protocol, suitably encoded. This name is opaque and interpreted by
the Prospero server.
The semicolon ";" is reserved and may not
appear without quoting in the <hsoname>.

Prospero URLs are interpreted by contacting a Prospero directory
server on the specified host and port to determine appropriate access
methods for a resource, which might themselves be represented as
different URLs. External Prospero links are represented as URLs of
the underlying access method and are not represented as Prospero
URLs.

Note that a slash "/" may appear in the <hsoname> without quoting and
no significance may be assumed by the application. Though slashes
may indicate hierarchical structure on the server, such structure is
not guaranteed. Note that many <hsoname>s begin with a slash, in
which case the host or port will be followed by a double slash: the
slash from the URL syntax, followed by the initial slash from the
<hsoname>. (E.g., <URL:prospero://host.dom//pros/name> designates a
<hsoname> of "/pros/name".)

In addition, after the <hsoname>, optional fields and values
associated with a Prospero link may be specified as part of the URL.
When present, each field/value pair is separated from each other and
from the rest of the URL by a ";" (semicolon). The name of the field
and its value are separated by a "=" (equal sign). If present, these
fields serve to identify the target of the URL. For example, the
OBJECT-VERSION field can be specified to identify a specific version
of an object.

3. Security Considerations

There are many security considerations for URIs, as described in
[RFC2396BIS].

4. References

[Gopher+] Anklesaria, F., Lindner, P., McCahill, M., Torrey, D.,
Johnson, D., and B. Alberti, "Gopher+: Upward compatible enhancements to
the Internet Gopher protocol", University of Minnesota, July 1993.

[PROSPERO] Neuman, B., and S. Augart, "The Prospero Protocol",
USC/Information Sciences Institute, June 1993.

[RFC1436] Anklesaria, et al., "Internet Gopher Protocol", RFC 1436,
March 1993.

[RFC1625] St. Pierre, et al., "WAIS over Z39.50-1988", RFC 1625, June
1994.

[RFC1738] Berners-Lee, et al., "Uniform Resource Locators (URL)", RFC
1738, December 1994.

[RFC2396BIS] Berners-Lee, et al., "Uniform Resource Identifier (URI):
Generic Syntax", draft-fielding-uri-rfc2396bis.

[STD3] Braden, R., Editor, "Requirements for Internet Hosts --
Application and Support", STD 3, RFC 1123, October 1989.

[STD13] Mockapetris, P., "Domain Names - Concepts and Facilities", STD
13, RFC 1034, November 1987.

[WAIS] Davis, et al., "WAIS Interface Protocol Prototype Functional
Specification", (v1.5), Thinking Machines Corporation, April 1990.

5. Authors' Contact Information

Paul Hoffman
VPN Consortium
127 Segre Place
Santa Cruz, CA 95060 USA
Phone: +1-831-426-9827
EMail: paul.hoffman@vpnc.org
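
As a non-normative illustration of the FTP url-path interpretation in
Section 2.2.2, the following Python sketch maps an ftp URL to the
sequence of FTP commands it implies. It is a sketch under stated
assumptions rather than a definitive implementation: the function name
ftp_url_to_commands is hypothetical, login and connection handling are
left out, and when the typecode is omitted no TYPE command is emitted
because the client is expected to guess the transfer mode itself.

```python
from urllib.parse import urlsplit, unquote


def ftp_url_to_commands(url):
    """Return the FTP commands implied by the url-path of an ftp URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "ftp":
        raise ValueError("not an ftp URL: %r" % (url,))

    # urlsplit leaves %XX escapes and any ";type=..." suffix in the path.
    path = parts.path
    if path.startswith("/"):
        # Drop the single "/" that delimits the url-path from host and port.
        path = path[1:]

    path, sep, typecode = path.partition(";type=")
    if not sep:
        typecode = None  # typecode omitted; the client has to guess a mode

    # Split on literal "/" first, then decode each component, so that an
    # encoded "%2F" ends up inside a single CWD or RETR argument.
    segments = [unquote(s) for s in path.split("/")]
    cwds, name = segments[:-1], segments[-1]

    commands = ["CWD " + cwd for cwd in cwds]
    if typecode == "d":
        commands.append("NLST " + name if name else "NLST")
    elif name:
        if typecode is not None:
            commands.append("TYPE " + typecode)
        commands.append("RETR " + name)
    return commands
```

Applied to the three example URLs of Section 2.2.2, the sketch returns
["CWD /etc", "RETR motd"] for the "%2Fetc/motd" form,
["CWD etc", "RETR motd"] for the "etc/motd" form, and
["CWD ", "CWD etc", "RETR motd"] for the "//etc/motd" form. Splitting
before decoding is the design choice that keeps these cases distinct.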