<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../Tools/rfc2629xslt/rfc2629.xslt" ?>
<!DOCTYPE rfc SYSTEM "../Tools/rfc2629xslt/rfc2629.dtd">
<rfc ipr="trust200902" docName="draft-nottingham-uri-get-off-my-lawn-01" category="bcp" updates="3986">
  <?rfc toc="yes"?>
  <?rfc tocindent="yes"?>
  <?rfc sortrefs="yes"?>
  <?rfc symrefs="yes"?>
  <?rfc strict="yes"?>
  <?rfc compact="yes"?>
  <?rfc comments="yes"?>
  <?rfc inline="yes"?>
  <front>
    <title abbrev="URI Structure Policies">Standardising Structure in URIs</title>
    <author initials="M." surname="Nottingham" fullname="Mark Nottingham">
      <organization/>
      <address>
        <email>mnot@mnot.net</email>
        <uri>http://www.mnot.net/</uri>
      </address>
    </author>
    <date year="2013"/>
    <area>General</area>
    <keyword>URI structure</keyword>
    <abstract>
      <t>Sometimes, it is attractive to add features to protocols or applications by specifying a particular
structure for URIs (or parts thereof). This document cautions against this practice in standards
(sometimes called “URI Squatting”).</t>
    </abstract>
  </front>
  <middle>
    <section anchor="introduction" title="Introduction">
      <t>URIs <xref target="RFC3986"/> very often include structure and application data. This might include artefacts
from filesystems (often occuring in the path component), and user information (often in the query
component). In some cases, there can even be application-specific data in the authority component
(e.g., some applications are spread across several hostnames to enable a form of partitioning or
dispatch).</t>
      <t>Furthermore, constraints upon the structure of URIs can be imposed by an implementation; for
example, many Web servers use the filename extension of the last path segment to determine the
media type of the response. Likewise, pre-packaged applications often have highly structured URIs
that can only be changed in limited ways (often, just the hostname and port they are deployed upon).</t>
      <t>Because the owner of the URI is choosing to use the server or the software, this can be seen as
reasonable delegation of authority. When such conventions are mandated by standards, however, it
can have several potentially detrimental effects:</t>
      <t>
        <list style="symbols">
          <t>Collisions - As more conventions for URI structure become standardised, it becomes more likely
that there will be collisions between such conventions (especially considering that servers,
applications and individual deployments will have their own conventions).</t>
          <t>Dilution - Adorning URIs with extra information to support new standard features dilutes their
usefulness as identifiers when that information is ephemeral (as URIs ought to be stable; see
<xref target="webarch"/> Section 3.5.1), or its inclusion causes several alternate forms of the URI to exist
(see <xref target="webarch"/> Section 2.3.1).</t>
          <t>Brittleness - A standard that specifies a static URI cannot change its form in future revisions.</t>
          <t>Operational Difficulty - Supporting some URI conventions can be difficult in some
implementations. For example, specifying that a particular query parameter be used precludes the
use of Web servers that serve the response from a filesystem. Likewise, an application that fixes
a base path for its operation (e.g., “/v1”) makes it impossible to deploy other applications with
the same prefix on the same host.</t>
          <t>Client Assumptions - When conventions are standardised, some clients will inevitably assume that
the standards are in use when those conventions are seen. This can lead to interoperability
problems; for example, if a specification documents that the “sig” URI query parameter indicates
that its payload is a cryptographic signature for the URI, it can lead to false positives.</t>
        </list>
      </t>
      <t>While it is not ideal when a server or a deployed application constrains uri structure (indeed, this
is not recommended practice, but that discussion is out of scope for this document), recommending
standards that mandate URI structure is inappropriate because the structure of a URI needs to be
firmly under the control of its owner, and the IETF (as well as other organisations) should not
usurp this ownership; see <xref target="webarch"/> Section 2.2.2.1.</t>
      <t>This document explains best current practices for establishing URI structures, conventions and
formats in standards. It also offers strategies for specifications to avoid violating these
guidelines in <xref target="alternatives"/>.</t>
      <section anchor="who-this-document-is-for" title="Who This Document Is For">
        <t>These guidelines are IETF Best Current Practice, and are therefore binding upon IETF
standards-track documents, as well as submissions to the RFC Editor on the Independent and IRTF streams. See <xref target="RFC2026"/> and <xref target="RFC4844"/> for more information. </t>
        <t>Other Open Standards organisations (in the sense of <xref target="RFC2026"/>) are encouraged to adopt them.
Questions as to their applicability ought to be handled through the liaison relationship, if
present.</t>
        <t>Ad hoc efforts are also encouraged to adopt them, as this RFC reflects Best Current Practice.</t>
        <t>This document’s requirements specifically targets a few different types of specifications:</t>
        <t>
          <list style="symbols">
            <t>URI Scheme Definitions (“scheme definitions”) - specifications that define and register URI
schemes, as per <xref target="RFC4395"/>.</t>
            <t>Protocol Extensions (“extensions”) - specifications that offer new capabilities to potentially
any identifier, or a large subset; e.g., a new signature mechanism for ‘http’ URIs, or metadata
for any URI.</t>
            <t>Applications Using URIs (“applications”) - specifications that use URIs to meet specific needs;
e.g., a HTTP interface to particular information on a host.</t>
          </list>
        </t>
        <t>Requirements that target the generic class “Specifications” apply to all specifications, including
both those enumerated above above and others.</t>
        <t>Note that this specification ought not be interpreted as preventing the allocation of control of
URIs by parties that legitimately own them, or have delegated that ownership; for example, a
specification might legitimately specify the semantics of a URI on the IANA.ORG Web site as part of
the establishment of a registry.</t>
      </section>
      <section anchor="notational-conventions" title="Notational Conventions">
        <t>The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”,
“RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in
<xref target="RFC2119"/>.</t>
      </section>
    </section>
    <section anchor="best-current-practices-for-standarising-structured-uris" title="Best Current Practices for Standarising Structured URIs">
      <t>Different components of a URI have differing practices recommended.</t>
      <section anchor="uri-schemes" title="URI Schemes">
        <t>Applications and extensions MAY require use of specific URI scheme(s); for example, it is perfectly
acceptable to require that an application support ‘http’ and ‘https’ URIs. However, applications
SHOULD NOT preclude the use of other URI schemes in the future, to promote reuse, unless they are
clearly specific to the nominated schemes.</t>
        <t>Specifications MUST NOT define substructure within URI schemes, unless they do so by modifying
<xref target="RFC4395"/>, or they are the registration document for the URI scheme(s) in question.</t>
      </section>
      <section anchor="uri-authorities" title="URI Authorities">
        <t>Scheme definitions define the presence, format and semantics of an authority component in URIs; all
other specifications MUST NOT constrain, define structure or semantics for them.</t>
      </section>
      <section anchor="uri-paths" title="URI Paths">
        <t>Scheme definitions define the presence, format, and semantics of a path component in URIs; all
other specifications MUST NOT constrain, define structure or semantics for any path component.</t>
        <t>The only exception to this requirement is registered “well-known” URIs, as specified by <xref target="RFC5785"/>.
See that document for a description of the applicability of that mechanism.</t>
      </section>
      <section anchor="uri-queries" title="URI Queries">
        <t>The presence, format and semantics of the query component of URIs is dependent upon many factors,
and MAY be constrained by a scheme definition. Often, they are determined by the implementation of
a resource itself.</t>
        <t>Applications SHOULD NOT directly specify the syntax of queries, as this can cause operational
difficulties for deployments that do not support a particular form of a query.</t>
        <t>Extensions MUST NOT specify the format or semantics of queries. In particular, extensions MUST NOT
assume that all HTTP(S) resources are capable of accepting queries in the format defined by
<xref target="HTML4"/>, Section 17.13.4.</t>
      </section>
      <section anchor="uri-fragment-identifiers" title="URI Fragment Identifiers">
        <t>Media type definitions (as per <xref target="RFC6838"/> SHOULD specify the fragment identifier syntax(es) to be
used with them; other specifications MUST NOT define structure within the fragment identifier,
unless they are explicitly defining one for reuse by media type definitions.</t>
      </section>
    </section>
    <section anchor="alternatives" title="Alternatives to Specifying Static URIs">
      <t>Given the issues above, the most successful strategy for applications and extensions that wish to
use URIs is to use them in the fashion they were designed; as run-time artefacts that are exchanged
as part of the protocol, rather than staticly specified syntax.</t>
      <t>For example, if a specific URI needs to be known to interact with an application, its “shape” can
be determined by interacting with the application’s more general interface (in Web terms, its “home
page”) to learn about that URI.</t>
      <t><xref target="RFC5988"/> describes a framework for identifying the semantics of a link in a “link relation type”
to aid this. <xref target="RFC6570"/> provides a standard syntax for “link templates” that can be used to
dynamically insert application-specific variables into a URI to enable such applications while
avoiding impinging upon URI owners’ control of them.</t>
      <t><xref target="RFC5785"/> allows specific paths to be ‘reserved’ for standard use on URI schemes that opt into
that mechanism (‘http’ and ‘https’ by default). Note, however, that this is not a general “escape
valve” for applications that need structured URIs; see that specification for more information.</t>
      <t>Specifying more elaborate structures in an attempt to avoid collisions is not adequate to conform
to this docuement. For example, prefixing query parameters with “myapp_” does not help.</t>
    </section>
    <section anchor="security-considerations" title="Security Considerations">
      <t>This document does not introduce new protocol artefacts with security considerations. </t>
    </section>
    <section anchor="iana-considerations" title="IANA Considerations">
      <t>This document clarifies appropriate registry policy for new URI schemes, and potentially for the
creation of new URI-related registries, if they attempt to mandate structure within URIs. There are
no direct IANA actions specified in this document.</t>
    </section>
  </middle>
  <back>
    <references title="Normative References">
      <reference anchor="RFC2119">
        <front>
          <title abbrev="RFC Key Words">Key words for use in RFCs to Indicate Requirement Levels</title>
          <author initials="S." surname="Bradner" fullname="Scott Bradner">
            <organization>Harvard University</organization>
            <address>
              <postal>
                <street>1350 Mass. Ave.</street>
                <street>Cambridge</street>
                <street>MA 02138</street>
              </postal>
              <phone>- +1 617 495 3864</phone>
              <email>sob@harvard.edu</email>
            </address>
          </author>
          <date year="1997" month="March"/>
          <area>General</area>
          <keyword>keyword</keyword>
          <abstract>
            <t>
   In many standards track documents several words are used to signify
   the requirements in the specification.  These words are often
   capitalized.  This document defines these words as they should be
   interpreted in IETF documents.  Authors who follow these guidelines
   should incorporate this phrase near the beginning of their document:

<list><t>
      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in
      RFC 2119.
</t></list></t>
            <t>
   Note that the force of these words is modified by the requirement
   level of the document in which they are used.
</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="14"/>
        <seriesInfo name="RFC" value="2119"/>
        <format type="TXT" octets="4723" target="http://www.rfc-editor.org/rfc/rfc2119.txt"/>
        <format type="HTML" octets="17970" target="http://xml.resource.org/public/rfc/html/rfc2119.html"/>
        <format type="XML" octets="5777" target="http://xml.resource.org/public/rfc/xml/rfc2119.xml"/>
      </reference>
      <reference anchor="RFC3986">
        <front>
          <title abbrev="URI Generic Syntax">Uniform Resource Identifier (URI): Generic Syntax</title>
          <author initials="T." surname="Berners-Lee" fullname="Tim Berners-Lee">
            <organization abbrev="W3C/MIT">World Wide Web Consortium</organization>
            <address>
              <postal>
                <street>Massachusetts Institute of Technology</street>
                <street>77 Massachusetts Avenue</street>
                <city>Cambridge</city>
                <region>MA</region>
                <code>02139</code>
                <country>USA</country>
              </postal>
              <phone>+1-617-253-5702</phone>
              <facsimile>+1-617-258-5999</facsimile>
              <email>timbl@w3.org</email>
              <uri>http://www.w3.org/People/Berners-Lee/</uri>
            </address>
          </author>
          <author initials="R." surname="Fielding" fullname="Roy T. Fielding">
            <organization abbrev="Day Software">Day Software</organization>
            <address>
              <postal>
                <street>5251 California Ave., Suite 110</street>
                <city>Irvine</city>
                <region>CA</region>
                <code>92617</code>
                <country>USA</country>
              </postal>
              <phone>+1-949-679-2960</phone>
              <facsimile>+1-949-679-2972</facsimile>
              <email>fielding@gbiv.com</email>
              <uri>http://roy.gbiv.com/</uri>
            </address>
          </author>
          <author initials="L." surname="Masinter" fullname="Larry Masinter">
            <organization abbrev="Adobe Systems">Adobe Systems Incorporated</organization>
            <address>
              <postal>
                <street>345 Park Ave</street>
                <city>San Jose</city>
                <region>CA</region>
                <code>95110</code>
                <country>USA</country>
              </postal>
              <phone>+1-408-536-3024</phone>
              <email>LMM@acm.org</email>
              <uri>http://larry.masinter.net/</uri>
            </address>
          </author>
          <date year="2005" month="January"/>
          <area>Applications</area>
          <keyword>uniform resource identifier</keyword>
          <keyword>URI</keyword>
          <keyword>URL</keyword>
          <keyword>URN</keyword>
          <keyword>WWW</keyword>
          <keyword>resource</keyword>
          <abstract>
            <t>
A Uniform Resource Identifier (URI) is a compact sequence of characters
that identifies an abstract or physical resource.  This specification
defines the generic URI syntax and a process for resolving URI references
that might be in relative form, along with guidelines and security
considerations for the use of URIs on the Internet.
The URI syntax defines a grammar that is a superset of all valid URIs,
allowing an implementation to parse the common components of a URI
reference without knowing the scheme-specific requirements of every
possible identifier.  This specification does not define a generative
grammar for URIs; that task is performed by the individual
specifications of each URI scheme.
</t>
          </abstract>
        </front>
        <seriesInfo name="STD" value="66"/>
        <seriesInfo name="RFC" value="3986"/>
        <format type="TXT" octets="141811" target="http://www.rfc-editor.org/rfc/rfc3986.txt"/>
        <format type="HTML" octets="214067" target="http://xml.resource.org/public/rfc/html/rfc3986.html"/>
        <format type="XML" octets="163534" target="http://xml.resource.org/public/rfc/xml/rfc3986.xml"/>
      </reference>
      <reference anchor="RFC4395">
        <front>
          <title>Guidelines and Registration Procedures for New URI Schemes</title>
          <author initials="T." surname="Hansen" fullname="T. Hansen">
            <organization/>
          </author>
          <author initials="T." surname="Hardie" fullname="T. Hardie">
            <organization/>
          </author>
          <author initials="L." surname="Masinter" fullname="L. Masinter">
            <organization/>
          </author>
          <date year="2006" month="February"/>
          <abstract>
            <t>This document provides guidelines and recommendations for the definition of Uniform Resource Identifier (URI) schemes.  It also updates the process and IANA registry for URI schemes.  It obsoletes both RFC 2717 and RFC 2718.  This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="35"/>
        <seriesInfo name="RFC" value="4395"/>
        <format type="TXT" octets="31933" target="http://www.rfc-editor.org/rfc/rfc4395.txt"/>
      </reference>
      <reference anchor="RFC6838">
        <front>
          <title>Media Type Specifications and Registration Procedures</title>
          <author initials="N." surname="Freed" fullname="N. Freed">
            <organization/>
          </author>
          <author initials="J." surname="Klensin" fullname="J. Klensin">
            <organization/>
          </author>
          <author initials="T." surname="Hansen" fullname="T. Hansen">
            <organization/>
          </author>
          <date year="2013" month="January"/>
          <abstract>
            <t>This document defines procedures for the specification and registration of media types for use in HTTP, MIME, and other Internet protocols.  This memo documents an Internet Best Current Practice.</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="13"/>
        <seriesInfo name="RFC" value="6838"/>
        <format type="TXT" octets="72942" target="http://www.rfc-editor.org/rfc/rfc6838.txt"/>
      </reference>
    </references>
    <references title="Informative References">
      <reference anchor="RFC2026">
        <front>
          <title abbrev="Internet Standards Process">The Internet Standards Process -- Revision 3</title>
          <author initials="S." surname="Bradner" fullname="Scott O. Bradner">
            <organization>Harvard University</organization>
            <address>
              <postal>
                <street>1350 Mass. Ave.</street>
                <city>Cambridge</city>
                <region>MA</region>
                <code>02138</code>
                <country>US</country>
              </postal>
              <phone>+1 617 495 3864</phone>
              <email>sob@harvard.edu</email>
            </address>
          </author>
          <date year="1996" month="October"/>
          <abstract>
            <t>This memo documents the process used by the Internet community for the standardization of protocols and procedures.  It defines the stages in the standardization process, the requirements for moving a document between stages and the types of documents used during this process.  It also addresses the intellectual property rights and copyright issues associated with the standards process.</t>
          </abstract>
        </front>
        <seriesInfo name="BCP" value="9"/>
        <seriesInfo name="RFC" value="2026"/>
        <format type="TXT" octets="86731" target="http://www.rfc-editor.org/rfc/rfc2026.txt"/>
      </reference>
      <reference anchor="RFC4844">
        <front>
          <title>The RFC Series and RFC Editor</title>
          <author initials="L." surname="Daigle" fullname="L. Daigle">
            <organization/>
          </author>
          <author>
            <organization>Internet Architecture Board</organization>
          </author>
          <date year="2007" month="July"/>
          <abstract>
            <t>This document describes the framework for an RFC Series and an RFC Editor function that incorporate the principles of organized community involvement and accountability that has become necessary as the Internet technical community has grown, thereby enabling the RFC Series to continue to fulfill its mandate.  This memo provides information for the Internet community.</t>
          </abstract>
        </front>
        <seriesInfo name="RFC" value="4844"/>
        <format type="TXT" octets="38752" target="http://www.rfc-editor.org/rfc/rfc4844.txt"/>
      </reference>
      <reference anchor="RFC5785">
        <front>
          <title>Defining Well-Known Uniform Resource Identifiers (URIs)</title>
          <author initials="M." surname="Nottingham" fullname="M. Nottingham">
            <organization/>
          </author>
          <author initials="E." surname="Hammer-Lahav" fullname="E. Hammer-Lahav">
            <organization/>
          </author>
          <date year="2010" month="April"/>
          <abstract>
            <t>This memo defines a path prefix for "well-known locations", "/.well-known/", in selected Uniform Resource Identifier (URI) schemes. [STANDARDS-TRACK]</t>
          </abstract>
        </front>
        <seriesInfo name="RFC" value="5785"/>
        <format type="TXT" octets="13779" target="http://www.rfc-editor.org/rfc/rfc5785.txt"/>
      </reference>
      <reference anchor="RFC5988">
        <front>
          <title>Web Linking</title>
          <author initials="M." surname="Nottingham" fullname="M. Nottingham">
            <organization/>
          </author>
          <date year="2010" month="October"/>
          <abstract>
            <t>This document specifies relation types for Web links, and defines a registry for them.  It also defines the use of such links in HTTP headers with the Link header field. [STANDARDS-TRACK]</t>
          </abstract>
        </front>
        <seriesInfo name="RFC" value="5988"/>
        <format type="TXT" octets="46834" target="http://www.rfc-editor.org/rfc/rfc5988.txt"/>
      </reference>
      <reference anchor="RFC6570">
        <front>
          <title>URI Template</title>
          <author initials="J." surname="Gregorio" fullname="J. Gregorio">
            <organization/>
          </author>
          <author initials="R." surname="Fielding" fullname="R. Fielding">
            <organization/>
          </author>
          <author initials="M." surname="Hadley" fullname="M. Hadley">
            <organization/>
          </author>
          <author initials="M." surname="Nottingham" fullname="M. Nottingham">
            <organization/>
          </author>
          <author initials="D." surname="Orchard" fullname="D. Orchard">
            <organization/>
          </author>
          <date year="2012" month="March"/>
          <abstract>
            <t>A URI Template is a compact sequence of characters for describing a range of Uniform Resource Identifiers through variable expansion.  This specification defines the URI Template syntax and the process for expanding a URI Template into a URI reference, along with guidelines for the use of URI Templates on the Internet. [STANDARDS-TRACK]</t>
          </abstract>
        </front>
        <seriesInfo name="RFC" value="6570"/>
        <format type="TXT" octets="79813" target="http://www.rfc-editor.org/rfc/rfc6570.txt"/>
      </reference>
      <reference anchor="webarch" target="http://www.w3.org/TR/2004/REC-webarch-20041215">
        <front>
          <title>Architecture of the World Wide Web, Volume One</title>
          <author initials="I." surname="Jacobs" fullname="Ian Jacobs">
            <organization/>
          </author>
          <author initials="N." surname="Walsh" fullname="Norman Walsh">
            <organization/>
          </author>
          <date year="2004" month="December" day="15"/>
        </front>
      </reference>
      <reference anchor="HTML4" target="http://www.w3.org/TR/REC-html40/">
        <front>
          <title>HTML 4.01 Specification</title>
          <author initials="I." surname="Jacobs" fullname="Ian Jacobs">
            <organization>W3C</organization>
          </author>
          <author initials="A." surname="Le Hors" fullname="Arnaud Le Hors">
            <organization>W3C</organization>
          </author>
          <author initials="D." surname="Raggett" fullname="Dave Raggett">
            <organization/>
          </author>
          <date year="1999" month="December" day="24"/>
        </front>
      </reference>
    </references>
    <section anchor="acknowlegements" title="Acknowlegements">
      <t>Thanks to David Booth, Anne van Kesteren and Erik Wilde for their suggestions
and feedback</t>
    </section>
  </back>
</rfc>
