idnits 2.17.1 

draft-ietf-urlreg-guide-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 425 lines


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'URL-PROCESS' on line 361

  ** Obsolete normative reference: RFC 2396 (ref. '1') (Obsoleted by RFC 3986)

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  ** Obsolete normative reference: RFC 2279 (ref. '3') (Obsoleted by RFC 3629)


     Summary: 10 errors (**), 0 flaws (~~), 3 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                           Larry Masinter
2	<draft-ietf-urlreg-guide-05.txt>                   Harald T. Alvestrand
3	March 25, 1999                                              Dan Zigmond
4	                                                             Rich Petke

6	                  Guidelines for new URL Schemes

8	Status of this Memo

10	   This document is an Internet-Draft and is in full conformance with
11	   all provisions of Section 10 of RFC 2026.  Internet-Drafts are
12	   working documents of the Internet Engineering Task Force (IETF), its
13	   areas, and its working groups.  Note that other groups may also
14	   distribute working documents as Internet-Drafts.  Internet-Drafts
15	   are draft documents valid for a maximum of six months and may be
16	   updated, replaced, or obsoleted by other documents at any time.  It
17	   is inappropriate to use Internet-Drafts as reference material or to
18	   cite them other than as "work in progress."  The list of current
19	   Internet-Drafts can be accessed at
20	   http://www.ietf.org/ietf/1id-abstracts.txt.  The list of
21	   Internet-Draft Shadow Directories can be accessed at
22	   http://www.ietf.org/shadow.html.

24	   Distribution of this memo is unlimited.

26	   This Internet Draft expires September 25, 1999.

28	Copyright Notice

30	   Copyright (C) The Internet Society (1999).  All Rights Reserved.

32	Abstract

34	   A Uniform Resource Locator (URL) is a compact string representation
35	   of the location for a resource that is available via the Internet.
36	   This document provides guidelines for the definition of new URL
37	   schemes.

39	1. Introduction

41	   A Uniform Resource Locator (URL) is a compact string representation
42	   of the location for a resource that is available via the Internet.
43	   RFC 2396 [1] defines the general syntax and semantics of URIs, and,
44	   by inclusion, URLs.  URLs are designated by including a "<scheme>:"
45	   and then a "<scheme-specific-part>".  Many URL schemes are already
46	   defined.

48	   This document provides guidelines for the definition of new URL
49	   schemes, for consideration by those who are defining and
50	   registering or evaluating those definitions.

52	   The process by which new URL schemes are registered is defined in
53	   RFC [URL-PROCESS] [2].

55	2. Guidelines for new URL schemes

57	   Because new URL schemes potentially complicate client software, new
58	   schemes must have demonstrable utility and operability, as well as
59	   compatibility with existing URL schemes.  This section elaborates
60	   these criteria.

62	2.1 Syntactic compatibility

64	   New URL schemes should follow the same syntactic conventions as
65	   existing schemes when appropriate.  If a URI scheme that has
66	   embedded links in content accessed by that scheme does not share
67	   syntax with a different scheme, the same content cannot be served up
68	   under different schemes without rewriting the content.  This can
69	   already be a problem, and with future digital signature schemes,
70	   rewriting may not even be possible.  Deployment of other schemes in
71	   the future could therefore become extremely difficult.

73	2.1.1 Motivations for syntactic compatibility

75	   Why should new URL schemes share as much of the generic URI syntax
76	   (that makes sense to share) as possible?  Consider the following:

78	   o If fragment syntax isn't shared between two schemes (e.g. "<a
79	     href="#foo">"), you can't move individual, completely
80	     self-referential documents between schemes without rewriting the
81	     embedded references within the document.  On the Web, the fragment
82	     syntax is a property of the media type and is evaluated by the
83	     client.

85	   o If fragment syntax is not shared between different media types of
86	     the same capability (e.g. HTML, XML, Word, or image types such as
87	     GIF, JPEG, PNG), then you can't have a URI reference that can
88	     evolve to superior media types as they become available, or that
89	     is even likely to work properly today with content negotiation.

91	   o If relative syntax (to the extent of understanding the URI is
92	     relative, and what part of the URI string is relative) isn't
93	     shared between two schemes (e.g. "<a href="foo">"), you can't
94	     move sets of documents that are internally self-referential
95	     between schemes without rewriting the embedded URIs.

97	   o If the ".." syntax as a path component in relative URIs isn't
98	     shared between schemes, you can't easily maintain several sets
99	     of documents and refer between them across schemes without
100	     rewriting the embedded references.

102	   o If the "/" syntax (to the extent of understanding that the URI
103	     refers to a path relative to the current naming authority, see
104	     section 2.1.1) isn't shared, you can't have multiple sets of
105	     documents easily be moved up or down in a relative hierarchy of
106	     names and share a common set of documents between them, without
107	     rewriting the content, shared either in that scheme or between
108	     schemes.  The best example is a site with a common set of GIF,
109	     JPEG, and PNG images that you want to reorganize by changing
110	     the depth of a subtree, or by moving it from one directory to
111	     another where the depth isn't the same.

113	   o If naming authority syntax (e.g. what comes after "//" in most URL
114	     schemes, see section 2.1.1) and relative path syntax (to the
115	     extent of understanding that the URI has a naming authority, and
116	     what part of the URI string is the naming authority vs. path)
117	     aren't shared between two schemes, you can't share identical name
118	     spaces and serve them up via different schemes.  (The naming
119	     authority syntax is a property of the scheme.)  The fact that
120	     HTTP and FTP have the same syntax, for example, has often been
121	     exploited by sites transitioning from FTP archive service to HTTP
122	     archive service so that the URLs can be identical between schemes
123	     except for the scheme name; the same content can be served via
124	     two schemes simultaneously.
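
   The points above can be illustrated with a minimal sketch, which is
   not part of the guidelines themselves.  It assumes Python's standard
   urllib.parse module, whose urljoin() follows the later generic
   syntax (RFC 3986) rather than RFC 2396, and it uses hypothetical
   example.org URLs.  Because "http" and "ftp" share relative, "..",
   and "/" syntax, the same embedded references resolve to the same
   paths under either scheme, so the documents need no rewriting.

```python
from urllib.parse import urljoin

# Hypothetical base documents: same authority and path, different scheme.
BASES = [
    "http://example.org/pub/docs/index.html",
    "ftp://example.org/pub/docs/index.html",
]


def resolve_everywhere(reference, bases=BASES):
    """Resolve one embedded reference against every base URL."""
    return [urljoin(base, reference) for base in bases]


if __name__ == "__main__":
    # A sibling document, a ".." reference, and an authority-relative "/" path.
    for ref in ("chapter2.html", "../images/logo.gif", "/icons/up.gif"):
        print(ref, "->", resolve_everywhere(ref))
```

   In each case the two results differ only in the scheme name.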

126	2.1.2 Improper use of "//" following "<scheme>:"

128	   Contrary to some examples set in past years, the use of double
129	   slashes as the first component of the <scheme-specific-part> of a
130	   URL is not simply an artistic indicator that what follows is a URL:
131	   Double slashes are used ONLY when the syntax of the URL's
132	   <scheme-specific-part> contains a hierarchical structure as
133	   described in RFC 2396.  In URLs from such schemes, the use of double
134	   slashes indicates that what follows is the top hierarchical element
135	   for a naming authority.  (See section 3 of RFC 2396 for more
136	   details.)  URL schemes which do not contain a conformant
137	   hierarchical structure in their <scheme-specific-part> should not
138	   use double slashes following the "<scheme>:" string.
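
   As a rough illustration of this distinction, the sketch below
   assumes Python's standard urllib.parse module and hypothetical
   example.org addresses.  A generic parser treats whatever follows
   "//" as the naming authority, while a scheme without a hierarchy,
   such as "mailto", has no authority component at all.

```python
from urllib.parse import urlsplit


def authority_and_rest(url):
    """Return (naming authority, remaining path) as a generic parser sees them."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


if __name__ == "__main__":
    # Hierarchical scheme: "//" introduces the naming authority.
    print(authority_and_rest("http://example.org/docs/a.html"))
    # Non-hierarchical scheme: no "//", so the authority is empty.
    print(authority_and_rest("mailto:user@example.org"))
```

   A scheme that wrote "//" without a hierarchy behind it would lead
   such a parser to report an authority that does not exist.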

140	2.1.3 Compatibility with relative URLs

142	   URL schemes should use the generic URL syntax if they are intended
143	   to be used with relative URLs.  A description of the allowed
144	   relative forms should be included in the scheme's definition.
145	   Many applications use relative URLs extensively.  Specifically,

147	   o Can the scheme be parsed according to RFC 2396 - that is, if the
148	     tokens "//", "/", ";", "?" and "#" are used, do they have the
149	     meaning given in RFC 2396?

151	   o Does it make sense to use the scheme in relative URLs like those
152	     RFC 2396 specifies?

154	   o If the scheme syntax is designed to be broken into pieces, does
155	     the documentation for the scheme's syntax specify what those
156	     pieces are, why it should be broken in this way, and why the
157	     breaks aren't where RFC 2396 says that they usually should be?

159	   o If the scheme has a hierarchy, does it go left-to-right and with
160	     slash separators like RFC 2396?  If not, why not?

162	2.1.4 Compatibility with fragment syntax

164	   Fragment syntax should be shared across URL schemes whenever
165	   possible.  Fragments indicate a location within a particular
166	   document, of a particular media type.  As media types evolve,
167	   and content negotiation becomes deployed, a shared fragment syntax
168	   allows a fragment to point to the correct location within documents
169	   of different media types.  For example, a named fragment (#foo)
170	   should be able to point to the foo label in either an HTML
171	   document or an XML document.  Similarly, for fragments identifying
172	   a location in an image, where the image may evolve from GIF to
173	   JPEG to PNG, the fragment ID should point to the same location.
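
   A small sketch of why a shared fragment syntax matters, assuming
   Python's standard urllib.parse module and hypothetical example.org
   documents: a client can separate the fragment from the rest of the
   URL without knowing the scheme or the media type, and the same
   fragment identifier then applies to whichever representation is
   retrieved.

```python
from urllib.parse import urldefrag


def split_fragment(url):
    """Separate the part sent to the server from the client-side fragment."""
    base, fragment = urldefrag(url)
    return base, fragment


if __name__ == "__main__":
    # The same label "foo" is used for an HTML and an XML representation.
    for url in ("http://example.org/manual.html#foo",
                "ftp://example.org/manual.xml#foo"):
        print(split_fragment(url))
```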

175	2.2 Is the scheme well defined?

177	   It is important that the semantics of the "resource" that a URL
178	   "locates" be well defined.  This might mean different things
179	   depending on the nature of the URL scheme.

181	2.2.1 Clear mapping from other name spaces

183	   In many cases, new URL schemes are defined as ways to translate
184	   other protocols and name spaces into the general framework of
185	   URLs.  The "ftp" URL scheme translates from the FTP protocol, while
186	   the "mid" URL scheme translates from the Message-ID field of
187	   messages.

189	   In either case, the description of the mapping must be complete,
190	   must describe how characters get encoded or not in URLs, must
191	   describe exactly how all legal values of the base standard can be
192	   represented using the URL scheme, and exactly which modifiers,
193	   alternate forms and other artifacts from the base standards are
194	   included or not included.  These requirements are elaborated
195	   below.

197	2.2.2 URL schemes associated with network protocols

199	   Most new URL schemes are associated with network resources that
200	   have one or several network protocols that can access them.  The
201	   'ftp', 'news', and 'http' schemes are of this nature.  For such
202	   schemes, the specification should completely describe how URLs are
203	   translated into protocol actions in sufficient detail to make the
204	   access of the network resource unambiguous.  If an implementation
205	   of the URL scheme requires some configuration, the configuration
206	   elements must be clearly identified.  (For example, the 'news'
207	   scheme, if implemented using NNTP, requires configuration of the
208	   NNTP server.)

210	2.2.3 Definition of non-protocol URL schemes

212	   In some cases, URL schemes do not have particular network protocols
213	   associated with them, because their use is limited to contexts
214	   where the access method is understood.  This is the case, for
215	   example, with the "cid" and "mid" URL schemes.  For these URL
216	   schemes, the specification should describe the notation of the
217	   scheme and a complete mapping of the locator from its source.

219	2.2.4 Definition of URL schemes not associated with data resources

221	   Most URL schemes locate Internet resources that correspond
222	   to data objects that can be retrieved or modified.  This is the
223	   case with "ftp" and "http", for example.  However, some URL schemes
224	   do not; for example, the "mailto" URL scheme corresponds to an
225	   Internet mail address.

227	   If a new URL scheme does not locate resources that are data
228	   objects, the properties of names in the new space must be clearly
229	   defined.

231	2.2.5 Character encoding

233	   When describing URL schemes in which (some of) the elements of
234	   the URL are actually representations of sequences of characters,
235	   care should be taken not to introduce unnecessary variety in the
236	   ways in which characters are encoded into octets and then into
237	   URL characters.  Unless there is some compelling reason for a
238	   particular scheme to do otherwise, translating character sequences
239	   into UTF-8 (RFC 2279) [3] and then subsequently using the %HH
240	   encoding for unsafe octets is recommended.
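
   The recommended two-step encoding can be sketched as follows.  This
   is an illustration only: the function name encode_component is
   hypothetical, and the set of characters left unescaped is assumed
   to be the "unreserved" set of RFC 2396 (alphanumerics plus the mark
   characters), which a particular scheme may widen or narrow.

```python
import string

# Octets left as-is: RFC 2396 "unreserved" characters (an assumption here;
# a scheme definition may choose a different safe set).
UNRESERVED = set((string.ascii_letters + string.digits + "-_.!~*'()").encode("ascii"))


def encode_component(text):
    """Encode characters as UTF-8 octets, then escape unsafe octets as %HH."""
    octets = text.encode("utf-8")                      # step 1: characters -> octets
    return "".join(
        chr(o) if o in UNRESERVED else "%{:02X}".format(o)   # step 2: octets -> URL chars
        for o in octets
    )


if __name__ == "__main__":
    print(encode_component("caf\u00e9 au lait"))
```

   Decoding reverses the two steps: the %HH escapes are turned back
   into octets, and the octets are then interpreted as UTF-8.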

242	2.2.6 Definition of operations

244	   In some contexts (for example, HTML forms) it is possible to
245	   specify any one of a list of operations to be performed on a
246	   specific URL.  (Outside forms, it is generally assumed to be
247	   something you GET.)

249	   The URL scheme definition should describe all well-defined
250	   operations on the URL identifier, and what they are supposed to
251	   do.

253	   Some URL schemes (for example, "telnet") provide location
254	   information for hooking onto bi-directional data streams, and don't
255	   fit the "infoaccess" paradigm of most URLs very well; this should
256	   be documented.

258	   NOTE: It is perfectly valid to say that "no operation apart from
259	   GET is defined for this URL".  It is also valid to say that "there's
260	   only one operation defined for this URL, and it's not very
261	   GET-like".  The important point is that what is defined on this type
262	   is described.

264	2.3 Demonstrated utility

266	   URL schemes should have demonstrated utility.  New URL schemes are
267	   expensive things to support.  Often they require special code in
268	   browsers, proxies, and/or servers.  Having a lot of ways to say the
269	   same thing needlessly complicates these programs without adding value
270	   to the Internet.

272	   The kinds of things that are useful include:

274	   o Things that cannot be referred to in any other way.

276	   o Things where it is much easier to get at them using this scheme
277	     than (for instance) a proxy gateway.

279	2.3.1 Proxy into HTTP/HTML

281	   One way to provide a demonstration of utility is via a gateway
282	   which provides objects in the new scheme for clients using an
283	   existing protocol.  It is much easier to deploy gateways to a new
284	   service than it is to deploy browsers that understand the new URL
285	   object.

287	   Things to look for when thinking about a proxy are:

289	   o Is there a single global resolution mechanism whereby any proxy
290	     can find the referenced object?
291	   o If not, is there a way in which the user can find any object of
292	     this type, and "run his own proxy"?
293	   o Are the operations mappable one-to-one (or possibly using
294	     modifiers) to HTTP operations?
295	   o Is the type of returned objects well defined?
296	      - as MIME content-types?
297	      - as something that can be translated to HTML?
298	   o Is there running code for a proxy?

300	2.4 Are there security considerations?

302	   Above and beyond the security considerations of the base mechanism
303	   a scheme builds upon, one must think of things that can happen in
304	   the normal course of URL usage.

306	   In particular:

308	   o Does the user need to be warned that such a thing is happening
309	     without an explicit request (GET for the source of an IMG tag,
310	     for instance)?  This has implications for the design of a proxy
311	     gateway, of course.

313	   o Is it possible to fake URLs of this type that point to different
314	     things in a dangerous way?

316	   o Are there mechanisms for identifying the requester that can be
317	     used or need to be used with this mechanism (the From: field in a
318	     mailto: URL, or the Kerberos login required for AFS access in the
319	     AFS: URL, for instance)?

321	   o Does the mechanism contain passwords or other security
322	     information that are passed inside the referring document in the
323	     clear (as in the "ftp" URL, for instance)?
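
   The last question can be checked mechanically for schemes that use
   the generic "user:password@host" form.  The sketch below assumes
   Python's standard urllib.parse module; the function name and the
   example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def has_cleartext_password(url):
    """Report whether the URL carries a password in its authority part."""
    return urlsplit(url).password is not None


if __name__ == "__main__":
    print(has_cleartext_password("ftp://user:secret@ftp.example.org/pub/file.txt"))
    print(has_cleartext_password("ftp://ftp.example.org/pub/file.txt"))
```

   Any document embedding the first URL exposes the password to every
   reader of that document.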

325	2.5 Does it start with UR?

327	   Any scheme starting with the letters "U" and "R", in particular if
328	   it attaches any of the meanings "uniform", "universal" or
329	   "unifying" to the first letter, is going to cause intense debate,
330	   and generate much heat (but maybe little light).

332	   Any such proposal should either make sure that there is a large
333	   consensus behind it that it will be the only scheme of its type, or
334	   pick another name.

336	2.6 Non-considerations

338	   Some issues that are often raised but are not relevant to new URL
339	   schemes include the following.

341	2.6.1 Are all objects accessible?

343	   Can all objects in the world that are validly identified by a
344	   scheme be accessed by any UA implementing it?

346	   Sometimes the answer will be yes and sometimes no; often it will
347	   depend on factors (like firewalls or client configuration) not
348	   directly related to the scheme itself.

350	3. Security considerations

352	   New URL schemes are required to address all security considerations
353	   in their definitions.

355	4. References

357	   [1] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform Resource
358	       Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

360	   [2] Petke, R., "Registration Procedures for URL Scheme Names",
361	       RFC [URL-PROCESS], November 1998.

363	   [3] Yergeau, F., "UTF-8, A Transformation Format of Unicode and ISO
364	       10646", RFC 2279, January 1998.

366	5. Authors' Addresses

368	   Larry Masinter
369	   Xerox Corporation
370	   Palo Alto Research Center
371	   3333 Coyote Hill Road
372	   Palo Alto, CA 94304
373	   Fax: +1-415-812-4333
374	   EMail: masinter@parc.xerox.com

376	   Harald Tveit Alvestrand
377	   Maxware, Pirsenteret
378	   N-7005 Trondheim
379	   NORWAY
380	   Voice: +47 73 54 57 00
381	   EMail: harald.alvestrand@maxware.no

383	   Dan Zigmond
384	   WebTV Networks, Inc.
385	   305 Lytton Avenue
386	   Palo Alto, CA 94301
387	   USA
388	   Voice: +1-650-614-6071
389	   EMail: djz@corp.webtv.net

391	   Rich Petke
392	   UUNET Technologies
393	   5000 Britton Road
394	   P. O. Box 5000
395	   Hilliard, OH 43026-5000
396	   Voice: +1-614-723-4157
397	   Fax: +1-614-723-1333
398	   EMail: rpetke@wcom.net