idnits 2.17.1 

draft-hoffman-rfc1738bis-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 537 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 2 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** There are 4 instances of lines with control characters in the document.

  ** The abstract seems to contain references ([RFC1738]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (April 19, 2003) is 7671 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'STD3' is defined on line 515, but no explicit
     reference was found in the text

  == Unused Reference: 'STD13' is defined on line 518, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'PROSPERO'

  ** Downref: Normative reference to an Informational RFC: RFC 1436

  ** Downref: Normative reference to an Informational RFC: RFC 1625

  ** Obsolete normative reference: RFC 1738 (Obsoleted by RFC 4248, RFC 4266)

  -- Possible downref: Normative reference to a draft: ref. 'RFC2396BIS' 

  -- Possible downref: Non-RFC (?) normative reference: ref. 'WAIS'


     Summary: 10 errors (**), 0 flaws (~~), 3 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Draft                                        Paul Hoffman
2	draft-hoffman-rfc1738bis-02.txt                     VPN Consortium
3	April 19, 2003
4	Expires in six months
5	Intended status: Standards Track

7	                      Definitions of Early URI Schemes

9	Status of this Memo

11	This document is an Internet-Draft and is in full conformance with
12	all provisions of Section 10 of RFC2026.

14	Internet-Drafts are working documents of the Internet Engineering
15	Task Force (IETF), its areas, and its working groups. Note that other
16	groups may also distribute working documents as Internet-Drafts.

18	Internet-Drafts are draft documents valid for a maximum of six months
19	and may be updated, replaced, or obsoleted by other documents at any
20	time. It is inappropriate to use Internet-Drafts as reference
21	material or to cite them other than as "work in progress."

23	The list of current Internet-Drafts can be accessed at
24	http://www.ietf.org/ietf/1id-abstracts.txt.

26	The list of Internet-Draft Shadow Directories can be accessed at
27	http://www.ietf.org/shadow.html.

29	Abstract

31	This document specifies many Uniform Resource Identifier (URI) schemes
32	that were originally specified in RFC 1738 [RFC1738]. Some of these
33	schemes are specified more fully in this document. The purpose of
34	this document is to allow RFC 1738 to be moved to Historic status while
35	keeping the information about the schemes on the standards track.

37	1. Introduction

39	URIs are currently defined in RFC 2396, which is being updated by
40	[RFC2396BIS]. Those documents also specify how to define schemes for
41	URIs.

43	The first definition for many URI schemes appeared in RFC 1738. Because
44	that document may be moved to Historic status, this document copies the
45	still-needed material from it to allow that material to remain on the
46	standards track. Specifically, this document copies the URI schemes.

48	Some of the URI scheme definitions have been changed. The following
49	lists all of the changes:

51	- http: was removed because it is specified in RFC 2616

53	- mailto: was removed because it is specified in RFC 2368

55	It should be noted that three of the schemes for protocols that are
56	described in this document (Gopher+, WAIS, and Prospero) were never
57	documented in RFCs, and the references to them are URLs that may not be
58	long-lasting. In fact, at least two of those URLs are no longer
59	working at the time of this writing.

61	2. Specific Schemes

63	The mapping for some existing standard and experimental protocols is
64	outlined in the BNF syntax definition.  Notes on particular protocols
65	follow. The schemes covered are:

67	ftp                     File Transfer Protocol
68	gopher                  The Gopher protocol
69	news and nntp           USENET news
70	telnet                  Reference to interactive sessions
71	wais                    Wide Area Information Servers
72	file                    Host-specific file names
73	prospero                Prospero Directory Service

75	2.1. Common Internet Scheme Syntax

77	The common URL syntax is described in [RFC2396BIS] and is thus
78	not repeated here.

80	2.2. FTP

82	The ftp URL scheme is used to designate files and directories on
83	Internet hosts accessible using the FTP protocol (RFC959).

85	An FTP URL follows the syntax described in Section 2.1.  If :<port> is
86	omitted, the port defaults to 21.

88	2.2.1. FTP Name and Password

90	A user name and password may be supplied; they are used in the ftp
91	"USER" and "PASS" commands after first making the connection to the
92	FTP server.  If no user name or password is supplied and one is
93	requested by the FTP server, the conventions for "anonymous" FTP are
94	to be used, as follows:

96		The user name "anonymous" is supplied.

98		The password is supplied as the Internet e-mail address
99		of the end user accessing the resource.

101	If the URL supplies a user name but no password, and the remote
102	server requests a password, the program interpreting the FTP URL
103	should request one from the user.
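
The login rules above can be sketched in Python as follows. This is only
an illustration under stated assumptions: the function name ftp_login and
the user_email parameter are hypothetical, and the sketch chooses the
credentials up front rather than waiting for the server to request them.

```python
from urllib.parse import unquote, urlsplit


def ftp_login(url, user_email):
    """Return (user, password, port) to use for an ftp URL.

    password is None when the URL names a user but no password; a client
    would then prompt the user if the server asks for one.
    """
    parts = urlsplit(url)
    port = parts.port or 21  # default FTP port when :<port> is omitted
    if parts.username is None:
        # No user name in the URL: use the "anonymous" FTP convention.
        return "anonymous", user_email, port
    user = unquote(parts.username)
    password = unquote(parts.password) if parts.password is not None else None
    return user, password, port
```

For instance, ftp_login("ftp://myname@host.dom/etc/motd", "me@example.org")
yields the user "myname" with no password, so the client has to prompt.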

105	2.2.2. FTP url-path

107	The url-path of an FTP URL has the following syntax:

109		<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

111	Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings
112	and <typecode> is one of the characters "a", "i", or "d".  The part
113	";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be
114	empty. The whole url-path may be omitted, including the "/"
115	delimiting it from the prefix containing user, password, host, and
116	port.

118	The url-path is interpreted as a series of FTP commands as follows:

120	  Each of the <cwd> elements is to be supplied, sequentially, as the
121	  argument to a CWD (change working directory) command.

123	  If the typecode is "d", perform an NLST (name list) command with
124	  <name> as the argument, and interpret the results as a file
125	  directory listing.

127	  Otherwise, perform a TYPE command with <typecode> as the argument,
128	  and then access the file whose name is <name> (for example, using
129	  the RETR command).

131	Within a name or CWD component, the characters "/" and ";" are
132	reserved and must be encoded. The components are decoded prior to
133	their use in the FTP protocol.  In particular, if the appropriate FTP
134	sequence to access a particular file requires supplying a string
135	containing a "/" as an argument to a CWD or RETR command, it is
136	necessary to encode each "/".

137	For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
138	interpreted by FTP-ing to "host.dom", logging in as "myname"
139	(prompting for a password if it is asked for), and then executing
140	"CWD /etc" and then "RETR motd". This has a different meaning from
141	<URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
142	"RETR motd"; the initial "CWD" might be executed relative to the
143	default directory for "myname". On the other hand,
144	<URL:ftp://myname@host.dom//etc/motd> would "CWD " with a null
145	argument, then "CWD etc", and then "RETR motd".
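
The interpretation of the url-path as FTP commands can be sketched as
below. The function name ftp_commands is hypothetical, and returning an
empty list for an omitted url-path is an assumption of this sketch. The
path is split on "/" before decoding, which is why an encoded "%2F" stays
inside a single CWD argument.

```python
from urllib.parse import unquote, urlsplit


def ftp_commands(url):
    """Translate the url-path of an ftp URL into a list of FTP commands."""
    path = urlsplit(url).path
    if path.startswith("/"):
        path = path[1:]  # drop the "/" that delimits the url-path
    if not path:
        return []  # the whole url-path was omitted
    typecode = None
    if ";type=" in path:
        path, typecode = path.rsplit(";type=", 1)
    # Split first, decode afterwards, so "%2F" does not create new components.
    *cwds, name = path.split("/")
    commands = ["CWD " + unquote(cwd) for cwd in cwds]
    if typecode == "d":
        commands.append("NLST " + unquote(name))
    else:
        if typecode is not None:
            commands.append("TYPE " + typecode)
        commands.append("RETR " + unquote(name))
    return commands
```

Applied to the three URLs in the example above, this yields the command
sequences described there, including the "CWD " with a null argument for
the double-slash form.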

147	FTP URLs may also be used for other operations; for example, it is
148	possible to update a file on a remote file server, or infer
149	information about it from the directory listings. The mechanism for
150	doing so is not spelled out here.

152	2.2.3. FTP Typecode is Optional

154	The entire ;type=<typecode> part of an FTP URL is optional. If it is
155	omitted, the client program interpreting the URL must guess the
156	appropriate mode to use. In general, the data content type of a file
157	can only be guessed from the name, e.g., from the suffix of the name;
158	the appropriate type code to be used for transfer of the file can
159	then be deduced from the data content of the file.
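
One possible guessing heuristic is sketched below. It is an assumption of
this sketch, not something this document prescribes: the suffix is mapped
to a content type with Python's mimetypes module, uncompressed text is
transferred as "a", and everything else as "i". The function name
guess_typecode is hypothetical.

```python
import mimetypes


def guess_typecode(name):
    """Guess an FTP typecode ("a" or "i") from the suffix of a file name."""
    content_type, encoding = mimetypes.guess_type(name)
    # Only plain, unencoded text is safe to transfer in ASCII mode.
    if encoding is None and content_type is not None and content_type.startswith("text/"):
        return "a"
    return "i"
```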

161	2.2.4. Hierarchy

163	For some file systems, the "/" used to denote the hierarchical
164	structure of the URL corresponds to the delimiter used to construct a
165	file name hierarchy, and thus, the filename will look similar to the
166	URL path. This does NOT mean that the URL is a Unix filename.

168	2.2.5. Optimization

170	Clients accessing resources via FTP may employ additional heuristics
171	to optimize the interaction. For some FTP servers, for example, it
172	may be reasonable to keep the control connection open while accessing
173	multiple URLs from the same server. However, there is no common
174	hierarchical model to the FTP protocol, so if a directory change
175	command has been given, it is impossible in general to deduce what
176	sequence should be given to navigate to another directory for a
177	second retrieval, if the paths are different.  The only reliable
178	algorithm is to disconnect and reestablish the control connection.

180	2.3. Gopher

182	The gopher URL scheme is used to designate Internet resources
183	accessible using the Gopher protocol.

185	The base Gopher protocol is described in [RFC1436] and supports items
186	and collections of items (directories). The Gopher+ protocol is a set
187	of upward compatible extensions to the base Gopher protocol and is
188	described in [Gopher+]. Gopher+ supports associating arbitrary sets of
189	attributes and alternate data representations with Gopher items.
190	Gopher URLs accommodate both Gopher and Gopher+ items and item
191	attributes.

193	2.3.1. Gopher URL syntax

195	A Gopher URL takes the form:

197	  gopher://<host>:<port>/<gopher-path>

199	where <gopher-path> is one of

201	   <gophertype><selector>
202	   <gophertype><selector>%09<search>
203	   <gophertype><selector>%09<search>%09<gopher+_string>

205	If :<port> is omitted, the port defaults to 70.  <gophertype> is a
206	single-character field to denote the Gopher type of the resource to
207	which the URL refers. The entire <gopher-path> may also be empty, in
208	which case the delimiting "/" is also optional and the <gophertype>
209	defaults to "1".

211	<selector> is the Gopher selector string.  In the Gopher protocol,
212	Gopher selector strings are a sequence of octets which may contain
213	any octets except 09 hexadecimal (US-ASCII HT or tab), 0A hexadecimal
214	(US-ASCII character LF), and 0D (US-ASCII character CR).

216	Gopher clients specify which item to retrieve by sending the Gopher
217	selector string to a Gopher server.

219	Within the <gopher-path>, no characters are reserved.

221	Note that some Gopher <selector> strings begin with a copy of the
222	<gophertype> character, in which case that character will occur twice
223	consecutively. The Gopher selector string may be an empty string;
224	this is how Gopher clients refer to the top-level directory on a
225	Gopher server.
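
A minimal parsing sketch for the syntax above follows. The function name
parse_gopher_url and the dictionary keys are hypothetical. The sketch
decodes selectors as text for readability, although selectors are octet
sequences, and it splits on the encoded tab before decoding so that the
field boundaries survive.

```python
from urllib.parse import unquote


def parse_gopher_url(url):
    """Split a gopher URL into host, port, gophertype, selector and extras."""
    prefix = "gopher://"
    if not url.lower().startswith(prefix):
        raise ValueError("not a gopher URL")
    hostport, _, gopher_path = url[len(prefix):].partition("/")
    host, _, port = hostport.partition(":")
    result = {
        "host": host,
        "port": int(port) if port else 70,  # default Gopher port
        "gophertype": "1",                  # default when the path is empty
        "selector": "",
        "search": None,
        "gopher_plus": None,
    }
    if gopher_path:
        result["gophertype"] = gopher_path[0]
        fields = gopher_path[1:].split("%09", 2)
        result["selector"] = unquote(fields[0])
        if len(fields) > 1:
            result["search"] = unquote(fields[1])
        if len(fields) > 2:
            result["gopher_plus"] = unquote(fields[2])
    return result
```

For a URL such as gopher://host.dom:7070/00/docs/readme the gophertype is
"0" and the selector is "0/docs/readme", which shows the doubled type
character mentioned above.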

227	2.3.2. Specifying URLs for Gopher Search Engines

229	If the URL refers to a search to be submitted to a Gopher search
230	engine, the selector is followed by an encoded tab (%09) and the
231	search string. To submit a search to a Gopher search engine, the
232	Gopher client sends the <selector> string (after decoding), a tab,
233	and the search string to the Gopher server.

235	2.3.3. URL syntax for Gopher+ items

237	URLs for Gopher+ items have a second encoded tab (%09) and a Gopher+
238	string. Note that in this case, the %09<search> string must be
239	supplied, although the <search> element may be the empty string.

241	The <gopher+_string> is used to represent information required for
242	retrieval of the Gopher+ item. Gopher+ items may have alternate
243	views, arbitrary sets of attributes, and may have electronic forms
244	associated with them.

246	To retrieve the data associated with a Gopher+ URL, a client will
247	connect to the server and send the Gopher selector, followed by a tab
248	and the search string (which may be empty), followed by a tab and the
249	Gopher+ commands.

251	2.3.4. Default Gopher+ data representation

253	When a Gopher server returns a directory listing to a client, the
254	Gopher+ items are tagged with either a "+" (denoting Gopher+ items)
255	or a "?" (denoting Gopher+ items which have a +ASK form associated
256	with them). A Gopher URL with a Gopher+ string consisting of only a
257	"+" refers to the default view (data representation) of the item
258	while a Gopher+ string containing only a "?" refers to an item with a
259	Gopher electronic form associated with it.

261	2.3.5. Gopher+ items with electronic forms

263	Gopher+ items which have a +ASK associated with them (i.e., Gopher+
264	items tagged with a "?") require the client to fetch the item's +ASK
265	attribute to get the form definition, and then ask the user to fill
266	out the form and return the user's responses along with the selector
267	string to retrieve the item.  Gopher+ clients know how to do this but
268	depend on the "?" tag in the Gopher+ item description to know when to
269	handle this case. The "?" is used in the Gopher+ string to be
270	consistent with the Gopher+ protocol's use of this symbol.

272	2.3.6. Gopher+ item attribute collections

274	To refer to the Gopher+ attributes of an item, the Gopher URL's
275	Gopher+ string consists of "!" or "$". "!" refers to all of a
276	Gopher+ item's attributes. "$" refers to all the item attributes for
277	all items in a Gopher directory.

279	2.3.7. Referring to specific Gopher+ attributes

281	To refer to specific attributes, the URL's gopher+_string is
282	"!<attribute_name>" or "$<attribute_name>". For example, to refer to
283	the attribute containing the abstract of an item, the gopher+_string
284	would be "!+ABSTRACT".

286	To refer to several attributes, the gopher+_string consists of the
287	attribute names separated by coded spaces. For example,
288	"!+ABSTRACT%20+SMELL" refers to the +ABSTRACT and +SMELL attributes
289	of an item.

291	2.3.8. URL syntax for Gopher+ alternate views

293	Gopher+ allows for optional alternate data representations (alternate
294	views) of items. To retrieve a Gopher+ alternate view, a Gopher+
295	client sends the appropriate view and language identifier (found in
296	the item's +VIEW attribute). To refer to a specific Gopher+ alternate
297	view, the URL's Gopher+ string would be in the form:

298	   +<view_name>%20<language_name>

299	For example, a Gopher+ string of "+application/postscript%20Es_ES"
300	refers to the Spanish language postscript alternate view of a Gopher+
301	item.

303	2.3.9. URL syntax for Gopher+ electronic forms

305	The gopher+_string for a URL that refers to an item referenced by a
306	Gopher+ electronic form (an ASK block) filled out with specific
307	values is a coded version of what the client sends to the server.
308	The gopher+_string is of the form:

310	+%091%0D%0A+-1%0D%0A<ask_item1_value>%0D%0A<ask_item2_value>%0D%0A.%0D%0A

312	To retrieve this item, the Gopher client sends:

314	   <a_gopher_selector><tab>+<tab>1<cr><lf>
315	   +-1<cr><lf>
316	   <ask_item1_value><cr><lf>
317	   <ask_item2_value><cr><lf>
318	   .<cr><lf>

320	to the Gopher server.
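
The encoding of a filled-out form can be sketched as follows. The
function name ask_gopher_plus_string is hypothetical. The sketch builds
the octets the client would send after the selector and the first tab,
then percent-encodes them, keeping "+" literal as in the form shown
above.

```python
from urllib.parse import quote


def ask_gopher_plus_string(values):
    """Build the gopher+_string for an ASK block filled out with values."""
    wire = "+\t1\r\n+-1\r\n" + "".join(v + "\r\n" for v in values) + ".\r\n"
    # Tab, CR and LF become %09, %0D and %0A; "+" is left unencoded.
    return quote(wire, safe="+")
```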

322	2.4. news and nntp

324	The news and nntp URL schemes are used to refer to either news groups or
325	individual articles of USENET news, as specified in RFC 1036.

327	A news URL takes one of two forms:

329	   newsURL      =  scheme ":" [ news-server ] [ refbygroup | message ]
330	   scheme       =  "news" | "nntp"
331	   news-server  =  "//" server "/"
332	   refbygroup   =  group [ "/" messageno [ "-" messageno ] ]
333	   message      =  local-part "@" domain

335	A <group> is a period-delimited hierarchical name, such as
336	"comp.infosystems.www.misc". A <message> corresponds to the
337	Message-ID of section 2.1.5 of RFC 1036, without the enclosing "<"
338	and ">"; it takes the form <unique>@<full_domain_name>.  A message
339	identifier may be distinguished from a news group name by the
340	presence of the commercial at "@" character. No additional characters
341	are reserved within the components of a news URL.

343	If <group> is "*" (as in <URL:news:*>), it is used to refer
344	to "all available news groups".
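
The "@" rule for telling a message identifier from a news group name can
be sketched as below. The function name classify_news_url and the
returned labels are hypothetical, and an optional "//server/" prefix is
simply skipped.

```python
def classify_news_url(url):
    """Return "all-groups", "message", "group" or "empty" for a news URL."""
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() not in ("news", "nntp"):
        raise ValueError("not a news or nntp URL")
    if rest.startswith("//"):
        # Skip the optional news-server part: "//" server "/"
        _server, _, rest = rest[2:].partition("/")
    if not rest:
        return "empty"
    if rest == "*":
        return "all-groups"
    # A message identifier is recognized by the commercial at character.
    return "message" if "@" in rest else "group"
```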

346	2.5. TELNET

348	The Telnet URL scheme is used to designate interactive services that
349	may be accessed by the Telnet protocol.

351	A telnet URL takes the form:

353	   telnet://<user>:<password>@<host>:<port>/

355	as specified in Section 2.1. The final "/" character may be omitted.
356	If :<port> is omitted, the port defaults to 23.  The :<password> can
357	be omitted, as well as the whole <user>:<password> part.

359	This URL does not designate a data object, but rather an interactive
360	service. Remote interactive services vary widely in the means by
361	which they allow remote logins; in practice, the <user> and
362	<password> supplied are advisory only: clients accessing a telnet URL
363	merely advise the user of the suggested username and password.
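
A sketch of how a client might extract the pieces of a telnet URL is
given below. The function name parse_telnet_url is hypothetical. The
user name and password are returned only so that they can be shown to
the user as suggestions.

```python
from urllib.parse import unquote, urlsplit


def parse_telnet_url(url):
    """Return (host, port, user, password); user and password may be None."""
    parts = urlsplit(url)
    if parts.scheme != "telnet":
        raise ValueError("not a telnet URL")
    user = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return parts.hostname, parts.port or 23, user, password  # 23 is the default
```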

365	2.6. WAIS

367	The WAIS URL scheme is used to designate WAIS databases, searches, or
368	individual documents available from a WAIS database. WAIS is
369	described in [WAIS]. The WAIS protocol is described in RFC 1625 [RFC1625].
370	Although the WAIS protocol is based on Z39.50-1988, the WAIS URL
371	scheme is not intended for use with arbitrary Z39.50 services.

373	A WAIS URL takes one of the following forms:

375	 wais://<host>:<port>/<database>
376	 wais://<host>:<port>/<database>?<search>
377	 wais://<host>:<port>/<database>/<wtype>/<wpath>

379	where <host> and <port> are as described in Section 2.1. If :<port>
380	is omitted, the port defaults to 210.  The first form designates a
381	WAIS database that is available for searching. The second form
382	designates a particular search.  <database> is the name of the WAIS
383	database being queried.

385	The third form designates a particular document within a WAIS
386	database to be retrieved. In this form <wtype> is the WAIS
387	designation of the type of the object. Many WAIS implementations
388	require that a client know the "type" of an object prior to
389	retrieval, the type being returned along with the internal object
390	identifier in the search response.  The <wtype> is included in the
391	URL in order to allow the client interpreting the URL adequate
392	information to actually retrieve the document.

394	The <wpath> of a WAIS URL consists of the WAIS document-id. The WAIS
395	document-id should be treated opaquely; it may only be decomposed by
396	the server that issued it.
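
The three forms can be told apart as sketched below. The function name
parse_wais_url and the dictionary keys are hypothetical. In keeping with
the text above, the <wpath> is passed through without decoding or
decomposition.

```python
from urllib.parse import unquote, urlsplit


def parse_wais_url(url):
    """Classify a wais URL as one of the three forms and return its parts."""
    parts = urlsplit(url)
    if parts.scheme != "wais":
        raise ValueError("not a wais URL")
    result = {"host": parts.hostname, "port": parts.port or 210}
    segments = parts.path[1:].split("/", 2)
    result["database"] = unquote(segments[0])
    if len(segments) == 3:
        # Third form: the document-id is opaque, so it is kept as-is.
        result.update(form="document", wtype=unquote(segments[1]), wpath=segments[2])
    elif len(segments) == 1 and parts.query:
        result.update(form="search", search=unquote(parts.query))
    elif len(segments) == 1:
        result["form"] = "database"
    else:
        raise ValueError("unrecognized wais URL form")
    return result
```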

398	2.7. FILES

400	The file URL scheme is used to designate files accessible on a
401	particular host computer. This scheme, unlike most other URL schemes,
402	does not designate a resource that is universally accessible over the
403	Internet.

405	A file URL takes the form:

407	   file://<host>/<path>

409	where <host> is the fully qualified domain name of the system on
410	which the <path> is accessible, and <path> is a hierarchical
411	directory path of the form <directory>/<directory>/.../<name>.

413	As a special case, <host> can be the string "localhost" or the empty
414	string; this is interpreted as "the machine from which the URL is
415	being interpreted". However, this part of the syntax has been
416	ignored on many systems. That is, for some systems, the following
417	are considered equal, while on others they are not:

419	   file://localhost/path/to/file.txt
420	   file:///path/to/file.txt
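
A system that wants to treat the two forms as equal could normalize them
before comparison, as in the sketch below. The function name
normalize_file_url is hypothetical, and choosing the empty host as the
canonical form is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def normalize_file_url(url):
    """Rewrite a file URL so that "localhost" and the empty host coincide."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL")
    host = parts.hostname or ""
    if host == "localhost":
        host = ""  # "the machine from which the URL is being interpreted"
    return "file://" + host + parts.path
```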

422	Some systems allow URLs to point to directories. In this case, there
423	is usually (but not always) a terminating "/" character, such as
424	in:

426	   file:///usr/local/bin/

428	On systems running some versions of Microsoft Windows, the local drive
429	specification is preceded by a "/" character. Thus, for a file called
430	"example.ini" in the "windows" directory on the "c:" drive, the URL
431	would be:

433	   file:///c:/windows/example.ini
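
The drive-letter convention can be sketched as follows. The function
name windows_file_url is hypothetical, and the sketch only handles paths
that start with a drive specification such as "c:". It does not attempt
to cover shares.

```python
from urllib.parse import quote


def windows_file_url(path):
    """Build a file URL from a Windows path that starts with a drive letter."""
    if len(path) < 2 or not path[0].isalpha() or path[1] != ":":
        raise ValueError("expected a path starting with a drive letter")
    # Backslashes become "/", and the drive is preceded by a "/" character.
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")
```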

435	For Windows shares, there is an additional "/" prepended to the name.
436	Thus, the file "example.doc" on the shared directory "department" would
437	have the URL:

439	   file:////department/example.doc

441	The file URL scheme is unusual in that it does not specify an
442	Internet protocol or access method for such files; as such, its
443	utility in network protocols between hosts is limited.

445	2.8. Prospero

447	The prospero URL scheme is used to designate resources that are
448	accessed via the Prospero Directory Service. The Prospero protocol is
449	described elsewhere [PROSPERO].

451	A prospero URL takes the form:

453	  prospero://<host>:<port>/<hsoname>;<field>=<value>

455	where <host> and <port> are as described in Section 2.1. If :<port>
456	is omitted, the port defaults to 1525. No username or password is
457	allowed.

459	The <hsoname> is the host-specific object name in the Prospero
460	protocol, suitably encoded.  This name is opaque and interpreted by
461	the Prospero server.  The semicolon ";" is reserved and may not
462	appear without quoting in the <hsoname>.

464	Prospero URLs are interpreted by contacting a Prospero directory
465	server on the specified host and port to determine appropriate access
466	methods for a resource, which might themselves be represented as
467	different URLs. External Prospero links are represented as URLs of
468	the underlying access method and are not represented as Prospero
469	URLs.

471	Note that a slash "/" may appear in the <hsoname> without quoting and
472	no significance may be assumed by the application.  Though slashes
473	may indicate hierarchical structure on the server, such structure is
474	not guaranteed. Note that many <hsoname>s begin with a slash, in
475	which case the host or port will be followed by a double slash: the
476	slash from the URL syntax, followed by the initial slash from the
477	<hsoname>. (E.g., <URL:prospero://host.dom//pros/name> designates a
478	<hsoname> of "/pros/name".)

480	In addition, after the <hsoname>, optional fields and values
481	associated with a Prospero link may be specified as part of the URL.
482	When present, each field/value pair is separated from each other and
483	from the rest of the URL by a ";" (semicolon).  The name of the field
484	and its value are separated by a "=" (equal sign). If present, these
485	fields serve to identify the target of the URL.  For example, the
486	OBJECT-VERSION field can be specified to identify a specific version
487	of an object.
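
A parsing sketch for the form above follows. The function name
parse_prospero_url and the dictionary keys are hypothetical. The URL is
split on ";" before decoding, since the semicolon is reserved, and the
slash that follows the host or port is treated as URL syntax rather
than as part of the <hsoname>.

```python
from urllib.parse import unquote


def parse_prospero_url(url):
    """Split a prospero URL into host, port, hsoname and field/value pairs."""
    prefix = "prospero://"
    if not url.lower().startswith(prefix):
        raise ValueError("not a prospero URL")
    hostport, _, rest = url[len(prefix):].partition("/")
    if "@" in hostport:
        raise ValueError("no user name or password is allowed")
    host, _, port = hostport.partition(":")
    hsoname, *pairs = rest.split(";")
    fields = {}
    for pair in pairs:
        name, _, value = pair.partition("=")
        fields[unquote(name)] = unquote(value)
    return {
        "host": host,
        "port": int(port) if port else 1525,  # default Prospero port
        "hsoname": unquote(hsoname),
        "fields": fields,
    }
```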

489	3. Security Considerations

491	There are many security considerations for URIs, as described in
492	[RFC2396BIS].

494	4. References

496	[Gopher+] Anklesaria, F., Lindner, P., McCahill, M., Torrey, D.,
497	Johnson, D., and B. Alberti, "Gopher+: Upward compatible enhancements to
498	the Internet Gopher protocol", University of Minnesota, July 1993.

500	[PROSPERO] Neuman, B., and S. Augart, "The Prospero Protocol",
501	USC/Information Sciences Institute, June 1993.

503	[RFC1436] Anklesaria, et al., "Internet Gopher Protocol", RFC 1436,
504	March 1993.

506	[RFC1625] St. Pierre, et al., "WAIS over Z39.50-1988", RFC 1625, June
507	1994.

509	[RFC1738] Berners-Lee, et al., "Uniform Resource Locators (URL)", RFC
510	1738, December 1994.

512	[RFC2396BIS] Berners-Lee, et al., "Uniform Resource Identifier (URI):
513	Generic Syntax", draft-fielding-uri-rfc2396bis.

515	[STD3] Braden, R., Editor, "Requirements for Internet Hosts --
516	Application and Support", STD 3, RFC 1123, October 1989.

518	[STD13] Mockapetris, P., "Domain Names - Concepts and Facilities", STD
519	13, RFC 1034, November 1987.

521	[WAIS] Davis, et al., "WAIS Interface Protocol Prototype Functional
522	Specification", (v1.5), Thinking Machines Corporation, April 1990.

524	5. Author's Contact Information

526	Paul Hoffman
527	VPN Consortium
528	127 Segre Place
529	Santa Cruz, CA 95060 USA
530	Phone: +1-831-426-9827
531	EMail: paul.hoffman@vpnc.org