idnits 2.17.1 draft-ietf-urlreg-guide-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Found some kind of copyright notice around line 34 but it does not match any copyright boilerplate known by this tool. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. 
** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 362 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'URI-SYNTAX' on line 296 -- Looks like a reference, but probably isn't: 'URL-PROCESS' on line 299 -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. 
'2' ** Obsolete normative reference: RFC 2044 (ref. '3') (Obsoleted by RFC 2279) Summary: 8 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 INTERNET-DRAFT Larry Masinter 2 Harald T. Alvestrand 3 August 7, 1998 Dan Zigmond 4 Rich Petke 6 Guidelines for new URL Schemes 8 Status of this Memo 10 This document is an Internet-Draft. Internet-Drafts are working 11 documents of the Internet Engineering Task Force (IETF), its 12 areas, and its working groups. Note that other groups may also 13 distribute working documents as Internet-Drafts. 15 Internet-Drafts are draft documents valid for a maximum of six 16 months and may be updated, replaced, or obsoleted by other 17 documents at any time. It is inappropriate to use Internet- 18 Drafts as reference material or to cite them other than as 19 "work in progress." 21 To view the entire list of current Internet-Drafts, please check 22 the "1id-abstracts.txt" listing contained in the Internet-Drafts 23 Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net 24 (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au 25 (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu 26 (US West Coast). 28 Distribution of this memo is unlimited. 30 This Internet Draft expires February 7, 1999. 32 Copyright Notice 34 Copyright (C) The Internet Society (1998). All Rights Reserved. 36 Abstract 38 A Uniform Resource Locator (URL) is a compact string representation 39 of the location for a resource that is available via the Internet. 40 This document provides guidelines for the definition of new URL 41 schemes. 43 1. Introduction 45 A Uniform Resource Locator (URL) is a compact string representation 46 of the location for a resource that is available via the Internet. 
47 RFC [URI-SYNTAX] [1] defines the general syntax and semantics of URIs, 48 and, by inclusion, URLs. URLs are designated by including a 49 "<scheme>:" and then a "<scheme-specific-part>". Many URL schemes 50 are already defined. 52 This document provides guidelines for the definition of new URL 53 schemes, for consideration by those who are defining and 54 registering or evaluating those definitions. 56 The process by which new URL schemes are registered is defined in 57 RFC [URL-PROCESS] [2]. 59 2. Guidelines for new URL schemes 61 Because new URL schemes potentially complicate client software, new 62 schemes must have demonstrable utility and operability, as well as 63 compatibility with existing URL schemes. This section elaborates 64 these criteria. 66 2.1 Syntactic compatibility 68 New URL schemes should follow the same syntactic conventions as 69 existing schemes when appropriate. 71 2.1.1 Improper use of "//" following "<scheme>:" 73 Contrary to some examples set in past years, the use of double 74 slashes as the first component of the <scheme-specific-part> of a 75 URL is not simply an artistic indicator that what follows is a URL: 76 Double slashes are used ONLY when the syntax of the URL's 77 <scheme-specific-part> contains a hierarchical structure as 78 described in RFC [URI-SYNTAX]. In URLs from such schemes, the use 79 of double slashes indicates that what follows is the top 80 hierarchical element for a naming authority. (See section 3 of RFC 81 [URI-SYNTAX] for more details.) URL schemes which do not contain a 82 conformant hierarchical structure in their <scheme-specific-part> 83 should not use double slashes following the "<scheme>:" string. 85 2.1.2 Compatibility with relative URLs 87 URL schemes should use the generic URL syntax if they are intended 88 to be used with relative URLs. A description of the allowed 89 relative forms should be included in the scheme's definition. 90 Many applications use relative URLs extensively. Specifically, 92 o Can the scheme be parsed according to RFC [URI-SYNTAX] - that is, 93 if the tokens "//", "/", ";", "?" 
and "#" are used, do they have 94 the meaning given in RFC [URI-SYNTAX]? 96 o Does it make sense to use the scheme in relative URLs like those 97 RFC [URI-SYNTAX] specifies? 99 o If the scheme syntax is designed to be broken into pieces, does 100 the documentation for the scheme's syntax specify what those 101 pieces are, why it should be broken in this way, and why the 102 breaks aren't where RFC [URI-SYNTAX] says that they usually should 103 be? 105 o If the scheme has a hierarchy, does it go left-to-right and with 106 slash separators like RFC [URI-SYNTAX]? If not, why not? 108 2.2 Is the scheme well defined? 110 It is important that the semantics of the "resource" that a URL 111 "locates" be well defined. This might mean different things 112 depending on the nature of the URL scheme. 114 2.2.1 Clear mapping from other name spaces 116 In many cases, new URL schemes are defined as ways to translate 117 other protocols and name spaces into the general framework of 118 URLs. The "ftp" URL scheme translates from the FTP protocol, while 119 the "mid" URL scheme translates from the Message-ID field of 120 messages. 122 In either case, the description of the mapping must be complete, 123 must describe how character sets get encoded or not in URLs, must 124 describe exactly how all legal values of the base standard can be 125 represented using the URL scheme, and exactly which modifiers, 126 alternate forms and other artifacts from the base standards are 127 included or not included. These requirements are elaborated 128 below. 130 2.2.2 URL schemes associated with network protocols 132 Most new URL schemes are associated with network resources that 133 have one or several network protocols that can access them. The 134 'ftp', 'news', and 'http' schemes are of this nature. For such 135 schemes, the specification should completely describe how URLs are 136 translated into protocol actions in sufficient detail to make the 137 access of the network resource unambiguous. 
If an implementation 138 of the URL scheme requires some configuration, the configuration 139 elements must be clearly identified. (For example, the 'news' 140 scheme, if implemented using NNTP, requires configuration of the 141 NNTP server.) 143 2.2.3 Character encoding 145 When describing URL schemes in which (some of) the elements of 146 the URL are actually representations of sequences of characters, 147 care should be taken not to introduce unnecessary variety in the 148 ways in which characters are encoded into octets and then into 149 URL characters. Unless there is some compelling reason for a 150 particular scheme to do otherwise, translating character sequences 151 into UTF-8 (RFC 2044) [3] and then subsequently using the %HH 152 encoding for unsafe octets is recommended. 154 2.2.4 Definition of non-protocol URL schemes 156 In some cases, URL schemes do not have particular network protocols 157 associated with them, because their use is limited to contexts 158 where the access method is understood. This is the case, for 159 example, with the "cid" and "mid" URL schemes. For these URL 160 schemes, the specification should describe the notation of the 161 scheme and a complete mapping of the locator from its source. 163 2.2.5 Definition of URL schemes not associated with data resources 165 Most URL schemes locate Internet resources that correspond 166 to data objects that can be retrieved or modified. This is the 167 case with "ftp" and "http", for example. However, some URL schemes 168 do not; for example, the "mailto" URL scheme corresponds to an 169 Internet mail address. 171 If a new URL scheme does not locate resources that are data 172 objects, the properties of names in the new space must be clearly 173 defined. 175 2.2.6 Definition of operations 177 In some contexts (for example, HTML forms) it is possible to 178 specify any one of a list of operations to be performed on a 179 specific URL. 
(Outside forms, it is generally assumed to be 180 something you GET.) 182 The URL scheme definition should describe all well-defined 183 operations on the URL identifier, and what they are supposed to 184 do. 186 Some URL schemes (for example, "telnet") provide location 187 information for hooking onto bi-directional data streams, and don't 188 fit the "infoaccess" paradigm of most URLs very well; this should 189 be documented. 191 NOTE: It is perfectly valid to say that "no operation apart from 192 GET is defined for this URL". It is also valid to say that "there's 193 only one operation defined for this URL, and it's not very 194 GET-like". The important point is that what is defined on this type 195 is described. 197 2.3 Demonstrated utility 199 URL schemes should have demonstrated utility. New URL schemes are 200 expensive things to support. Often they require special code in 201 browsers, proxies, and/or servers. Having a lot of ways to say the 202 same thing needlessly complicates these programs without adding value 203 to the Internet. 205 The kinds of things that are useful include: 207 o Things that cannot be referred to in any other way. 209 o Things where it is much easier to get at them using this scheme 210 than (for instance) a proxy gateway. 212 2.3.1 Proxy into HTTP/HTML 214 One way to provide a demonstration of utility is via a gateway 215 which provides objects in the new scheme for clients using an 216 existing protocol. It is much easier to deploy gateways to a new 217 service than it is to deploy browsers that understand the new URL 218 object. 220 Things to look for when thinking about a proxy are: 222 o Is there a single global resolution mechanism whereby any proxy 223 can find the referenced object? 224 o If not, is there a way in which the user can find any object of 225 this type, and "run his own proxy"? 226 o Are the operations mappable one-to-one (or possibly using 227 modifiers) to HTTP operations? 
228 o Is the type of returned objects well defined? 229 - as MIME content-types? 230 - as something that can be translated to HTML? 231 o Is there running code for a proxy? 233 2.4 Are there security considerations? 235 Above and beyond the security considerations of the base mechanism 236 a scheme builds upon, one must think of things that can happen in 237 the normal course of URL usage. 239 In particular: 241 o Does the user need to be warned that such a thing is happening 242 without an explicit request (GET for the source of an IMG tag, 243 for instance)? This has implications for the design of a proxy 244 gateway, of course. 246 o Is it possible to fake URLs of this type that point to different 247 things in a dangerous way? 249 o Are there mechanisms for identifying the requester that can be 250 used or need to be used with this mechanism (the From: field in a 251 mailto: URL, or the Kerberos login required for AFS access in the 252 AFS: URL, for instance)? 254 o Does the mechanism contain passwords or other security 255 information that are passed inside the referring document in the 256 clear (as in the "ftp" URL, for instance)? 258 2.5 Does it start with UR? 260 Any scheme starting with the letters "U" and "R", in particular if 261 it attaches any of the meanings "uniform", "universal" or 262 "unifying" to the first letter, is going to cause intense debate, 263 and generate much heat (but maybe little light). 265 Any such proposal should either make sure that there is a large 266 consensus behind it that it will be the only scheme of its type, or 267 pick another name. 269 2.6 Non-considerations 271 Some issues that are often raised but are not relevant to new URL 272 schemes include the following. 274 2.6.1 Are all objects accessible? 276 Can all objects in the world that are validly identified by a 277 scheme be accessed by any UA implementing it? 
279 Sometimes the answer will be yes and sometimes no; often it will 280 depend on factors (like firewalls or client configuration) not 281 directly related to the scheme itself. 283 3. Security considerations 285 New URL schemes are required to address all security considerations 286 in their definitions. 288 4. IANA considerations 290 The process by which URL scheme names are registered is specified 291 in RFC [URL-PROCESS]. 293 5. References 295 [1] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform Resource 296 Identifiers (URI): Generic Syntax", RFC [URI-SYNTAX], August 1998. 298 [2] Petke, R., "Registration Procedures for URL Scheme Names", 299 RFC [URL-PROCESS], August 1998. 301 [3] Yergeau, F., "UTF-8, A Transformation Format of Unicode and ISO 302 10646", RFC 2044, October 1996. 304 6. Authors' Addresses 306 Larry Masinter 307 Xerox Corporation 308 Palo Alto Research Center 309 3333 Coyote Hill Road 310 Palo Alto, CA 94304 311 Fax: +1-415-812-4333 312 EMail: masinter@parc.xerox.com 314 Harald Tveit Alvestrand 315 Maxware, Pirsenteret 316 N-7005 Trondheim 317 NORWAY 318 Voice: +47 73 54 57 00 319 EMail: harald.alvestrand@maxware.no 321 Dan Zigmond 322 WebTV Networks, Inc. 323 305 Lytton Avenue 324 Palo Alto, CA 94301 325 USA 326 Voice: +1-650-614-6071 327 EMail: djz@corp.webtv.net 329 Rich Petke 330 WorldCom Advanced Networks 331 5000 Britton Road 332 P. O. Box 5000 333 Hilliard, OH 43026-5000 334 Voice: +1-614-723-4157 335 Fax: +1-614-723-1333 336 EMail: rpetke@compuserve.net
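
Illustrative note on section 2.2.3 (not part of the draft's normative text): the recommended two-step encoding, characters to UTF-8 octets and then unsafe octets to %HH, might be sketched as follows. This is a minimal sketch using Python's standard urllib.parse module; the function names encode_component and decode_component are hypothetical, and treating every octet outside the unreserved set as "unsafe" is an assumption, since a real scheme definition would state which characters it leaves unescaped.

```python
from urllib.parse import quote, unquote


def encode_component(text: str) -> str:
    """Hypothetical helper: characters -> UTF-8 octets -> %HH escapes."""
    octets = text.encode("utf-8")  # step 1: characters into octets
    # Step 2: escape every octet outside the unreserved set.
    # safe="" is an assumption; a scheme may allow more characters.
    return quote(octets, safe="")


def decode_component(escaped: str) -> str:
    """Hypothetical inverse: %HH escapes -> octets -> UTF-8 characters."""
    return unquote(escaped, encoding="utf-8", errors="strict")


if __name__ == "__main__":
    encoded = encode_component("Bjørn/notes")
    print(encoded)
    print(decode_component(encoded))
```

Escaping "/" here reflects the choice of safe=""; a scheme that gives "/" hierarchical meaning would leave it unescaped where it acts as a separator.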