Network Working Group M. Nottingham Internet-Draft August20,27, 2019 Obsoletes: 7320 (if approved) Updates: 3986 (if approved) Intended status: Best Current Practice Expires: February21,28, 2020 URI Design and Ownershipdraft-nottingham-rfc7320bis-00draft-nottingham-rfc7320bis-01 Abstract Section 1.1.1 of RFC 3986 defines URI syntax as "a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme." In other words, the structure of a URI is defined by its scheme. While it is common for schemes to further delegate their substructure to the URI's owner, publishing independent standards that mandate particular forms ofURIsubstructure in URIs isinappropriate, because that essentially usurps ownership.often problematic. This documentfurther describes this problematic practice andprovidessome acceptable alternatives for useguidance on the specification of URI substructure in standards. Note to Readers _RFC EDITOR: please remove this section before publication_ This is a proposed revision of RFC7320, aka BCP190. The -00 draft is a copy of the published RFC; subsequent revisions will update it. The issues list for this draft can be found athttps://github.com/mnot/I-D/labels/rfc7320https://github.com/mnot/I-D/labels/rfc7320bis [1]. The most recent (often, unpublished) draft is athttps://mnot.github.io/I-D/rfc7320/https://mnot.github.io/I-D/rfc7320bis/ [2]. Recent changes are listed at https://github.com/mnot/I-D/commits/gh-pages/rfc7320pages/rfc7320bis [3]. See also the draft's current status in the IETF datatracker, athttps://datatracker.ietf.org/doc/draft-nottingham-rfc7320/https://datatracker.ietf.org/doc/draft-nottingham-rfc7320bis/ [4]. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on February21,28, 2020. Copyright Notice Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 1.1. Intended Audience . . . . . . . . . . . . . . . . . . . . 4 1.2. Notational Conventions . . . . . . . . . . . . . . . . . 5 2. Best Current Practices for Standardizing Structured URIs . . 5 2.1. URI Schemes . . . . . . . . . . . . . . . . . . . . . . . 5 2.2. URI Authorities . . . . . . . . . . . . . . . . . . . . . 5 2.3. URI Paths . . . . . . . . . . . . . . . . . . . . . . . .56 2.4. URI Queries . . . . . . . . . . . . . . . . . . . . . . . 6 2.5. URI Fragment Identifiers . . . . . . . . . . . . . . . . 7 3. Alternatives to Specifying Structure in URIs . . . . . . . . 7 4. Security Considerations . . . . . . . . . . . . . . . . . . .78 5. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 6.1. Normative References . . . . . . . . . . . . . . . . . . 8 6.2. Informative References . . . . . . . . . . . . . . . . . 8 6.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Appendix A. Acknowledgments . . . . . . . . . . . . . . . . . . 9 Author's Address . . . . . . . . . . . . . . . . . . . . . . . .910 1. Introduction URIs [RFC3986] very often include structured application data. This might include artifacts from filesystems (often occurring in the path component) and user information (often in the query component). In some cases, there can even be application-specific data in the authority component (e.g., some applications are spread across several hostnames to enable a form of partitioning or dispatch).Furthermore,Implementations can impose further constraints upon the structure ofURIs can be imposed by an implementation;URIs; for example, many Web servers use the filename extension of the last path segment to determine the media type of the response. Likewise, prepackaged applications often have highly structured URIs that can only be changed in limited ways (often, just the hostname and port on which they are deployed). Because the owner of the URI (as defined in [webarch] Section 2.2.2.1) is choosing to use the server or the application, this can be seen as reasonable delegation of authority. However, when such conventions are mandated by a party other than the owner, it can have several potentially detrimental effects: o Collisions - As more ad hoc conventions for URI structure become standardized, it becomes more likely that there will be collisions between them (especially considering that servers, applications, and individual deployments will have their own conventions). o Dilution - When the information added to a URI is ephemeral, this dilutes its utility by reducing its stability (see [webarch] Section 3.5.1), and can cause several alternate forms of the URI to exist (see [webarch] Section 2.3.1). o Rigidity - Fixed URI syntax often interferes with desired deployment patterns. For example, if an authority wishes to offer several applications on a single hostname, it becomes difficult to impossible to do if their URIs do not allow the required flexibility. o Operational Difficulty - Supporting some URI conventions can be difficult in some implementations. For example, specifying that a particular query parameter be used with "HTTP" URIs precludes the use of Web servers that serve the response from a filesystem. Likewise, an application that fixes a base path for its operation (e.g., "/v1") makes it impossible to deploy other applications with the same prefix on the same host. o Client Assumptions - When conventions are standardized, some clients will inevitably assume that the standards are in use when those conventions are seen. This can lead to interoperability problems; for example, if a specification documents that the "sig" URI query parameter indicates that its payload is a cryptographic signature for the URI, it can lead to undesirable behavior. Publishing a standard that constrains an existing URI structure in ways that aren't explicitly allowed by [RFC3986] (usually, by updating the URI scheme definition) isinappropriate,therefore sometimes problematic, both for these reasons, and because the structure of a URI needs to be firmly under the control of itsowner, and the IETF (as well as other organizations) should not usurp it.owner. This document explains some best current practices for establishing URI structures, conventions, and formats in standards. It also offers strategies for specificationsto avoid violating these guidelinesin Section 3. 1.1. Intended Audience This document's guidelines and requirements target the authors of specifications that constrain the syntax or structure of URIs or parts of them. Two classes of such specifications are called out specifically: o Protocol Extensions ("extensions") - specifications that offer new capabilities that could apply to any identifier, or to a large subset of possible identifiers; e.g., a new signature mechanism for 'http' URIs,ormetadata for anyURI.URI, or a new format. o Applications Using URIs ("applications") - specifications that use URIs to meet specific needs; e.g., an HTTP interface to particular information on a host. Requirements that target the generic class"Specifications""specifications" apply to all specifications, including both those enumerated above and others. Note that this specification ought not be interpreted as preventing the allocation of control of URIs by parties that legitimately own them, or have delegated that ownership; for example, a specification might legitimately define the semantics of a URI on IANA's Web site as part of the establishment of a registry. There may be existing IETF specifications that already deviate from the guidance in this document. In these cases, it is up to the relevant communities (i.e., those of the URI scheme as well as that which produced the specification in question) to determine an appropriate outcome; e.g., updating the scheme definition, or changing the specification. 1.2. Notational Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in[RFC2119].BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here. 2. Best Current Practices for Standardizing Structured URIs This section updates [RFC3986] bysetting limitations on howadvising other specificationsmayhow they should define structure and semantics within URIs. Best practices differ depending on the URI component, as described below. 2.1. URI Schemes Applications and extensionsMAYcan require use of specific URI scheme(s); for example, it is perfectly acceptable to require that an application support 'http' and 'https' URIs. However, applicationsSHOULD NOTought not preclude the use of other URI schemes in the future, unless they are clearly only usable with the nominated schemes. A specification that defines substructurewithin a specific URI scheme MUST do so in the defining document for that URI scheme. A specification that defines substructurefor URI schemes overall (e.g., a prefix or suffix for URI scheme names) MUST do so by modifying [BCP115] (an exceptional circumstance). 2.2. URI Authorities Scheme definitions define the presence, format and semantics of an authority component in URIs; all other specifications MUST NOT constrain, or define the structure or the semantics for URI authorities, unless they update the scheme registrationitself.itself, or the structures it relies upon (e.g., DNS name syntax, defined in Section 3.5 of [RFC1034]). For example, an extension or applicationought notcannot say that the "foo" prefix in"foo_app.example.com""http://foo_app.example.com" is meaningful or triggers special handling inURIs. However, applications MAYURIs, unless they update either the HTTP URI scheme, or the DNS hostname syntax. Applications can nominate or constrain the port they use, when applicable. For example, BarApp could run over port nnnn (provided that it is properly registered). 2.3. URI Paths Scheme definitions define the presence, format, and semantics of a path component inURIs; all otherURIs, although these are often delegated to the application(s) in a given deployment. To avoid collisions, rigidity, and erroneous client assumptions, specifications MUST NOTconstrain, ordefinethe structure or the semanticsa fixed prefix forany path component. The onlytheir URI paths; for example, "/myapp", unless allowed by the scheme definition. One such exception to this requirement is registered "well-known" URIs, as specified by[RFC5785].[RFC8615]. See that document for a description of the applicability of that mechanism.For example, an application oughtNote that this does notspecifyapply to applications defining afixed URI path "/myapp", since this usurps the host's controlstructure ofthat space. SpecifyingURIs paths "under" afixed path relative to another (e.g., {whatever}/myapp) is also bad practice (even if "whatever"resource under control of the server. Because the prefix isdiscovered as suggested in Section 3); while doing so might prevent collisions, it does not avoidunder control of thepotential for operational difficulties (forparty deploying the application, collisions and rigidity are avoided, and the risk of erroneous client assumptions is reduced. For example, animplementation that prefersapplication might define "app_root" as a deployment- controlled URI prefix. Application-defined resources might then be assumed touse query processing instead, because of implementation constraints).be present at "{app_root}/foo" and "{app_root}/bar". Extensions MUST NOT define a structure within individual URI components (e.g., a prefix or suffix), again to avoid collisions and erroneous client assumptions. 2.4. URI Queries The presence, format and semantics of the query component of URIs is dependent upon many factors, andMAYcan be constrained by a scheme definition. Often, they are determined by the implementation of a resource itself. ApplicationsMUST NOT directlycan specify the syntax ofqueries, as thisqueries for the resources under their control. However, doing so can cause operational difficulties for deployments that do not support a particular form of a query. For example, a site may wish to support an application using "static" files that do not support query parameters. Extensions MUST NOT constrain the format or semantics ofqueries.queries, to avoid collisions and erroneous client assumptions. For example, an extension that indicates that all query parameters with the name "sig" indicate a cryptographic signature would collide with potentially preexisting query parameters on sites and lead clients to assume that any matching query parameter is a signature. HTML [W3C.REC-html401-19991224] constrains the syntax of query strings used in form submission. New form languagesSHOULD NOT emulate it, but insteadare encouraged to allow creation of a broader variety of URIs (e.g., by allowing the form to create new path components, and so forth).Note that "well-known" URIs (see [RFC5785]) MAY constrain their own query syntax, since these name spaces are effectively delegated to the registering party.2.5. URI Fragment IdentifiersMedia type definitions (as per [RFC6838]) SHOULD specify theSection 3.5 of [RFC3986] specifies fragmentidentifier syntax(es) to be used with them;identiers' syntax and semantics as being dependent upon the media type of a potentially retrieved resource. As a result, other specifications MUST NOT define structure within the fragment identifier, unless they are explicitly defining one for reuse by mediatype definitions. Fortypes in their definitions (for example,anas JSON Pointer [RFC6901] does). An application that defines common fragment identifiers across media types not controlled by it would engender interoperability problems with handlers for those media types (because the new, non-standard syntax is not expected). 3. Alternatives to Specifying Structure in URIs Given the issues described in Section 1, the most successful strategy for applications and extensions that wish to use URIs is to use them in the fashion they were designed: as links that are exchanged as part of the protocol, rather than statically specified syntax. Several existing specifications can aid in this.[RFC5988][RFC8288] specifies relation types for Web links. By providing a framework for linking on the Web, where every link has a relation type, context and target, it allows applications to define a link's semantics and connectivity. [RFC6570] provides a standard syntax for URI Templates that can be used to dynamically insert application-specific variables into a URI to enable such applications while avoiding impinging upon URI owners' control of them.[RFC5785][RFC8615] allows specific paths to be 'reserved' for standard use on URI schemes that opt into that mechanism ('http' and 'https' by default). Note, however, that this is not a general "escape valve" for applications that need structured URIs; see that specification for more information. Specifying more elaborate structures in an attempt to avoid collisions is not an acceptable solution, and does not address the issues in Section 1. For example, prefixing query parameters with "myapp_" does not help, because the prefix itself is subject to the risk of collision (since it is not "reserved"). 4. Security Considerations This document does not introduce new protocol artifacts with security considerations. It prohibits some practices that might lead to vulnerabilities; for example, if a security-sensitive mechanism is introduced by assuming that a URI path component or query string has a particular meaning, false positives might be encountered (due to sites that already use the chosen string). See also [RFC6943]. 5. IANA Considerations There are no direct IANA actions specified in this document. 6. References 6.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>. [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, <https://www.rfc-editor.org/info/rfc3986>.[RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures",[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP13,14, RFC6838,8174, DOI10.17487/RFC6838, January 2013, <https://www.rfc-editor.org/info/rfc6838>.10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>. [webarch] Jacobs, I. and N. Walsh, "Architecture of the World Wide Web, Volume One", December 2004, <http://www.w3.org/TR/2004/REC-webarch-20041215>. 6.2. Informative References [BCP115] Hansen, T., Hardie, T., and L. Masinter, "Guidelines and Registration Procedures for New URI Schemes",RFC 4395,BCP 115, RFC 4395, February 2006, <https://tools.ietf.org/html/bcp115>.[RFC5785] Nottingham, M.[RFC1034] Mockapetris, P., "Domain names - concepts andE. Hammer-Lahav, "Defining Well-Known Uniform Resource Identifiers (URIs)", RFC 5785, DOI 10.17487/RFC5785, April 2010, <https://www.rfc-editor.org/info/rfc5785>. [RFC5988] Nottingham, M., "Web Linking",facilities", STD 13, RFC5988,1034, DOI10.17487/RFC5988, October 2010, <https://www.rfc-editor.org/info/rfc5988>.10.17487/RFC1034, November 1987, <https://www.rfc-editor.org/info/rfc1034>. [RFC6570] Gregorio, J., Fielding, R., Hadley, M., Nottingham, M., and D. Orchard, "URI Template", RFC 6570, DOI 10.17487/RFC6570, March 2012, <https://www.rfc-editor.org/info/rfc6570>. [RFC6901] Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed., "JavaScript Object Notation (JSON) Pointer", RFC 6901, DOI 10.17487/RFC6901, April 2013, <https://www.rfc-editor.org/info/rfc6901>. [RFC6943] Thaler, D., Ed., "Issues in Identifier Comparison for Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May 2013, <https://www.rfc-editor.org/info/rfc6943>. [RFC8288] Nottingham, M., "Web Linking", RFC 8288, DOI 10.17487/RFC8288, October 2017, <https://www.rfc-editor.org/info/rfc8288>. [RFC8615] Nottingham, M., "Well-Known Uniform Resource Identifiers (URIs)", RFC 8615, DOI 10.17487/RFC8615, May 2019, <https://www.rfc-editor.org/info/rfc8615>. [W3C.REC-html401-19991224] Raggett, D., Hors, A., and I. Jacobs, "HTML 4.01 Specification", World Wide Web Consortium Recommendation REC-html401-19991224, December 1999, <http://www.w3.org/TR/1999/REC-html401-19991224>. 6.3. URIs [1]https://github.com/mnot/I-D/labels/rfc7320https://github.com/mnot/I-D/labels/rfc7320bis [2]https://mnot.github.io/I-D/rfc7320/https://mnot.github.io/I-D/rfc7320bis/ [3]https://github.com/mnot/I-D/commits/gh-pages/rfc7320https://github.com/mnot/I-D/commits/gh-pages/rfc7320bis [4]https://datatracker.ietf.org/doc/draft-nottingham-rfc7320/https://datatracker.ietf.org/doc/draft-nottingham-rfc7320bis/ Appendix A. Acknowledgments Thanks to David Booth, Dave Crocker, Tim Bray, Anne van Kesteren, Martin Thomson, Erik Wilde, Dave Thaler and Barry Leiba for their suggestions and feedback. Author's Address Mark Nottingham Email: mnot@mnot.net URI: https://www.mnot.net/