idnits 2.17.1 

draft-thaler-appsawg-multi-transport-uris-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 255: '...f port numbers is RECOMMENDED whenever...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 5, 2018) is 2234 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC3261' is mentioned on line 111, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 6555
     (Obsoleted by RFC 8305)

  -- Obsolete informational reference (is this intentional?): RFC 7320
     (Obsoleted by RFC 8820)


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          D. Thaler
3	Internet-Draft                                                 Microsoft
4	Intended status: Informational                             March 5, 2018
5	Expires: September 6, 2018

7	                Using URIs With Multiple Protocol Stacks
8	              draft-thaler-appsawg-multi-transport-uris-02

10	Abstract

12	   Many Uniform Resource Identifiers (URIs) today have some mechanism to
13	   resolve them to one or more specific endpoints where that resource is
14	   available.  This document discusses issues that arise when the same
15	   resource can be reached over multiple protocol stacks, and discusses
16	   various approaches that have been used or discussed, and the
17	   tradeoffs between them.  Such issues are important to consider when
18	   defining new URI schemes and resolution mechanisms.

20	Status of This Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at https://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on September 6, 2018.

37	Copyright Notice

39	   Copyright (c) 2018 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (https://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	Table of Contents

54	   1.  Introduction
55	   2.  Problem Statement
56	   3.  Protocol endpoint discovery
57	     3.1.  Specified by the URI scheme specification
58	     3.2.  Passed in one URI
59	     3.3.  Use separate URI for each transport endpoint
60	     3.4.  Use another mechanism for discovery
61	   4.  Transport endpoint selection
62	   5.  IANA Considerations
63	   6.  Security Considerations
64	   7.  Acknowledgements
65	   8.  Informative References
66	   Author's Address

68	1.  Introduction

70	   For Uniform Resource Identifier (URI) schemes that function as
71	   locators (historically called "URLs"), [RFC3986] explains that:

73	      URI "resolution" is the process of determining an access mechanism
74	      and the appropriate parameters necessary to dereference a URI; this
75	      resolution may require several iterations.  To use that access
76	      mechanism to perform an action on the URI's resource is to
77	      "dereference" the URI.

79	   The specific details vary by URI scheme and hence are up to each URI
80	   scheme definition to specify.  Requirements for URI scheme
81	   definitions are covered in [RFC3986], [RFC7320], and [RFC7595].  RFC
82	   7595 section 3.3 states:

84	      For schemes that function as locators, it is important that the
85	      mechanism of resource location be clearly defined.

87	   Closely related to resolving a URI to a resource that can be reached
88	   in multiple ways is the concept of "equivalence".  Section 6.1 of
89	   [RFC3986] states:

91	      Even though it is possible to determine that two URIs are
92	      equivalent, URI comparison is not sufficient to determine whether
93	      two URIs identify different resources.  For example, an owner of
94	      two different domain names could decide to serve the same resource
95	      from both, resulting in two different URIs.  Therefore, comparison
96	      methods are designed to minimize false negatives while strictly
97	      avoiding false positives.

99	   Thus, it is possible that two distinct URIs refer to the same
100	   resource.  The goal, per the RFC 3986 text quoted above, is simply to
101	   "minimize" such cases, but such minimization often comes at a cost.
102	   For example, for many URI schemes, a DNS name can be used in the
103	   authority component rather than using several URIs that differ only
104	   in IP address literal, with the cost being a dependency on DNS name
105	   resolution and the potential latency and traffic involved.

107	   As another example, [RFC5630] section 4.1 states:

109	      SIP and SIPS URIs that are identical except for the scheme itself
110	      (e.g., sip:alice@example.com and sips:alice@example.com) refer to
111	      the same resource.  This requirement is implicit in [RFC3261],
112	      Section 19.1, which states that "any resource described by a SIP
113	      URI can be 'upgraded' to a SIPS URI by just changing the scheme,
114	      if it is desired to communicate with that resource securely".
115	      This does not mean that the SIPS URI will necessarily be
116	      reachable, in particular, if the proxy cannot establish a secure
117	      connection to a client or another proxy.  This does not suggest
118	      either that proxies would arbitrarily "upgrade" SIP URIs to SIPS
119	      URIs when forwarding a request (see Section 5.3).  Rather, it
120	      means that when a resource is addressable with SIP, it will also
121	      be addressable with SIPS.

123	   Similarly, the same resource might be identified using both "http"
124	   and "https", and indeed a commonly followed rule (section 4.1.3 of
125	   [USWP]) is that the URI scheme sets expectations for integrity of
126	   access, such that separate integrity levels result in separate URI
127	   schemes.

129	   Thus, the same resource might be identified by multiple URIs that
130	   differ only in URI scheme, or authority component, or path (e.g.,
131	   using ".." resolution).
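
   As an illustration of such aliasing, the following minimal sketch
   compares URIs after a simplified normalization.  The function names
   and the small default-port table are assumptions made for this
   example only; the dot-segment handling is a simplified form of the
   algorithm in Section 5.2.4 of RFC 3986, not a complete
   implementation.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes used in this illustration (assumed table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path):
    """Collapse "." and ".." segments (simplified form of RFC 3986, 5.2.4)."""
    out = []
    for seg in path.split("/"):
        if seg == "..":
            if len(out) > 1:      # never pop the leading empty segment
                out.pop()
        elif seg != ".":
            out.append(seg)
    if path.endswith(("/.", "/..")):
        out.append("")            # keep the trailing slash
    return "/".join(out)


def normalize(uri):
    """Return a (scheme, host, port, path) tuple usable for comparison."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    host = (parts.hostname or "").lower()
    return (scheme, host, port, remove_dot_segments(parts.path or "/"))


# Aliases that differ only in case, default port, or ".." are detected ...
same = normalize("HTTP://Example.COM:80/a/b/../c") == normalize("http://example.com/a/c")
# ... but aliases that differ in scheme are not, even if the resource is the same.
differs = normalize("http://example.com/a/c") != normalize("https://example.com/a/c")
```

   Comparison of this kind avoids false positives, so it cannot tell
   that the "http" and "https" forms might identify the same resource.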

133	   For URIs used in the World Wide Web, Section 2.3.1 of "Architecture
134	   of the World Wide Web" [AWWW] further discusses such aliasing,
135	   explaining that links to a resource increase the value of that
136	   resource, that multiple URIs for it interfere with such valuation,
137	   and that aliases make it difficult to correlate two sources as
138	   pointing to the same resource.  Thus, to maximize the benefit to the
139	   Web, URI aliases should be minimized.

141	   See "URI Schemes and Web Protocols" [USWP] for additional discussion
142	   on the relationship between URI schemes and protocols in a web
143	   context, although that document has no official standing and there is
144	   a history of difficulty in reaching consensus on the connection
145	   between URI schemes and protocols [Noah].

147	2.  Problem Statement

149	   Besides specifying one or more URI scheme names to be used and the
150	   syntax for each (e.g., what the authority component contains), there
151	   are two issues a URI scheme definer must deal with when multiple
152	   protocol stacks are available for accessing a given resource:

154	   1.  Specifying how the set of protocol endpoint identifiers (e.g.,
155	       TCP and UDP port numbers) for a given URI can be discovered by an
156	       entity wishing to resolve it, and

158	   2.  Specifying how an appropriate protocol endpoint can be selected
159	       for use, from among the discovered set.

161	   At a high level, these issues are equivalent to those arising when
162	   multiple IP addresses are available for the same resource.  However,
163	   in general, there may be multiple layers in a transport stack (e.g.,
164	   some application-layer protocol over WebSockets over TCP), each with
165	   its own identifiers, so the problems are compounded when multiple
166	   choices exist at each of multiple layers below the application-layer
167	   protocol itself.

169	   Thus, when we use the term "protocol stack" in this document, we
170	   typically mean the stack of protocols below the application-layer
171	   protocol associated with the URI scheme, and above the network layer.
172	   However, [USWP] also discusses the possibility ("Approach 2") that
173	   multiple application-layer protocols might share the same URI scheme,
174	   in which case the "protocol stack" also includes the application-
175	   layer protocols to select from.

177	3.  Protocol endpoint discovery

179	   A client wishing to access a resource needs to know, for each layer
180	   in the protocol stack, what protocol(s) can be used, and what
181	   identifier(s) are needed by each such protocol.  There are several
182	   possible approaches to endpoint identifier discovery, which we cover
183	   in the following sections.  For simplicity, we will discuss them as
184	   if the same approach is used for both types of information, but it is
185	   important to remember that a URI scheme could specify discovery of
186	   the set of protocols via one approach, and discovery of the
187	   identifier(s) for each protocol via another approach.

189	3.1.  Specified by the URI scheme specification

191	   In this approach, every resource is assumed to use the exact same set
192	   of transport protocols (i.e., stacks of protocols above the network
193	   layer) and identifiers.  The identifiers can be IANA assigned and
194	   specified as part of the URI scheme or protocol specification.  For
195	   example, TFTP only supports UDP port 69, and so no port number is
196	   permitted in a tftp URI.

198	   If support for a new transport protocol is later added under a
199	   protocol with a given URI scheme, different entities may thus have
200	   different hard-coded assumptions about the set of possible protocols,
201	   which just pushes the rest of the burden to the problem of selection
202	   among the known set (see Section 4).

204	   A disadvantage of this approach for many use cases is that it does
205	   not allow for non-default server configurations such as custom ports.

207	3.2.  Passed in one URI

209	   For single-transport protocols, a common mechanism is to specify a
210	   default port for the URI scheme, and to allow putting a non-default
211	   port number in the URI authority component.
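
   The single-transport case can be sketched as follows.  The table
   and function name are hypothetical and cover only two well-known
   schemes; a port in the authority component overrides the scheme's
   default.

```python
from urllib.parse import urlsplit

# scheme -> (transport, default port); illustrative subset only.
SCHEME_DEFAULTS = {
    "http": ("tcp", 80),
    "https": ("tcp", 443),
}


def endpoint_for(uri):
    """Resolve a URI to a (transport, host, port) tuple."""
    parts = urlsplit(uri)
    transport, default_port = SCHEME_DEFAULTS[parts.scheme]
    # A port in the authority component takes precedence over the default.
    port = parts.port if parts.port is not None else default_port
    return (transport, parts.hostname, port)
```

   Note that the port has a single meaning here only because each
   scheme in the table implies exactly one transport protocol.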

213	   For multi-transport protocols, historically it was sometimes assumed
214	   that multiple transport protocols (e.g., UDP and TCP) would use the
215	   same port number, so specifying a single number would also be
216	   sufficient for multiple transports.  When port numbers appear in
217	   URIs, they are not the default ports that might be IANA-assigned
218	   (since default ports should be omitted from the URI per [RFC3986]
219	   section 3.2.3), but instead are either statically chosen by the
220	   server application, or are ephemeral ports dynamically allocated on
221	   the server hosting the resource.  In most TCP/IP stacks, ephemeral
222	   ports used by UDP endpoints have no relationship to ephemeral ports
223	   used by TCP endpoints in the same application and so it cannot be
224	   guaranteed that the port numbers are the same.  For example, port
225	   51000 might be allocated to one application for UDP, and a different
226	   application for TCP.
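
   The independence of the two port spaces can be observed with a
   short experiment.  This is only a sketch, assuming a host with a
   loopback interface; binding to port 0 asks the operating system to
   pick an ephemeral port, and the two numbers returned are chosen
   independently of each other.

```python
import socket


def ephemeral_ports(host="127.0.0.1"):
    """Bind one TCP and one UDP socket to OS-chosen ports and return both."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        tcp.bind((host, 0))   # port 0: let the OS allocate a TCP port
        udp.bind((host, 0))   # allocated separately from the TCP port
        return tcp.getsockname()[1], udp.getsockname()[1]


tcp_port, udp_port = ephemeral_ports()
# Nothing guarantees tcp_port == udp_port, so one number in a URI
# cannot be assumed to describe both endpoints.
```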

228	   Since 2011, this same issue can also occur with IANA-assigned ports,
229	   especially if support for a given transport protocol is added at a
230	   later time.  [RFC6335] section 7.2 explains:

232	      Effective with the publication of this document, IANA will begin
233	      assigning port numbers for only those transport protocols
234	      explicitly included in an assignment request.  This ends the long-
235	      standing practice of automatically assigning a port number to an
236	      application for both TCP and UDP, even if the request is for only
237	      one of these transport protocols.

239	   Thus, for most URI schemes, a port number appearing in a URI
240	   authority component must be specified as being in a specific
241	   transport-layer protocol's numbering space since its value for a
242	   given resource might differ by transport protocol.  If a URI scheme
243	   wishes for the port number in the URI authority component to be able
244	   to apply to multiple transport protocols, the URI scheme would
245	   typically have to assume static configuration on servers; this may be
246	   acceptable in some circumstances and unacceptable in others.

248	   A common solution in non-URI contexts is to use a service name rather
249	   than a literal port number, and allow the service name to be resolved
250	   to the relevant transport-layer identifier.  Indeed, [RFC6335]
251	   section 3 says:

253	      Because the port number space is finite (and therefore
254	      conservation is an important goal), the alternative of using
255	      service names instead of port numbers is RECOMMENDED whenever
256	      possible.

258	   Unfortunately, it is not possible to follow this recommendation with
259	   the port field in the URI authority component, since the URI syntax
260	   only allows decimal digits in the port field.

262	   For new URI schemes, it may be possible in some cases to place a
263	   service name in the host field, such as "_myservice._tcp.example.org"
264	   as would be used with a DNS SRV record [RFC2782].  That example still
265	   specifies only a single transport protocol stack ("_tcp") however,
266	   rather than a list of supported stacks.
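
   A client of such a scheme would need to recognize the SRV-style
   name in the host field.  A minimal sketch follows; the function
   name is hypothetical, and no validation of the service or protocol
   labels is attempted.

```python
def parse_srv_name(host):
    """Split "_service._proto.domain" into (service, proto, domain).

    Returns None if the host does not start with two underscore labels.
    """
    labels = host.split(".")
    if len(labels) < 3 or not all(label.startswith("_") for label in labels[:2]):
        return None
    service, proto = labels[0][1:], labels[1][1:]
    return (service, proto, ".".join(labels[2:]))
```

   The result carries exactly one protocol label, which makes the
   single-stack limitation visible in the return type itself.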

268	   Another limitation of service names is that they are currently
269	   defined only for TCP, UDP, SCTP, and DCCP, and so cannot be used with
270	   other layers (e.g., WebSockets) or protocols.  Thus, a URI scheme for
271	   a protocol that supports both, say, WebSockets and raw TCP as
272	   possible transports for resource access, cannot use a service name as
273	   a common identifier for transport-layer endpoint resolution.

275	   It is usually also undesirable to put transport-layer endpoint
276	   information (the list of supported transport protocols or the
277	   identifier(s) used with the transport protocols) in the path or query
278	   components for two reasons.  First, those components are typically
279	   passed over the wire to the server when accessing a resource, which
280	   only consumes extra bandwidth with no benefit.  Second, if the
281	   transport-layer identifiers might change over the lifetime of the
282	   resource, then the URI would need to change even if the change did
283	   not affect the actual endpoint chosen by the client.  Such a change
284	   would negatively affect equivalence with the previous URI, e.g.,
285	   resulting in cache misses.

287	   Thus, an advantage of this approach is that it can work without any
288	   dependency on other protocols or deployment of servers needed for
289	   resolution, and a disadvantage is that putting information about
290	   multiple transport-layer endpoints anywhere in the same URI could
291	   make for a very long URI that might have issues with certain
292	   software, or have bandwidth or storage issues.

294	3.3.  Use separate URI for each transport endpoint

296	   In this approach, one must simply accept the fact that multiple URIs
297	   might refer to the same resource, as RFC 3986 already allows.  This is
298	   similar to using a set of URIs that differ only in IP address
299	   literal, for a case when the resource server is not resolvable via a
300	   protocol such as DNS or SIP.

302	   The obvious disadvantage is that there are multiple URIs for the same
303	   resource.  Another potential disadvantage for some more complex use
304	   cases where there are multiple layers of the transport stack, is that
305	   it may be difficult or impossible to express all the identifiers in
306	   an entire stack of protocols in one URI.

308	   For cases where there are multiple transport protocols but only one
309	   such layer, this approach results in needing to identify a single
310	   transport protocol per URI.  As discussed in Section 3.2, this often
311	   cannot be put in the authority component and is undesirable to put in
312	   the path or query component.  As a result, such cases involve
313	   specifying a separate URI scheme per transport.  For example, "sip"
314	   and "sips" do this, as do "http" and "https".  RFC 8323 [RFC8323]
315	   also follows this approach for CoAP with "coap", "coaps", "coap+tcp",
316	   "coaps+tcp", etc.
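
   The scheme-per-transport approach amounts to a lookup table keyed
   by URI scheme.  The sketch below uses the CoAP schemes and default
   ports as an example; the tuple representation of a stack and the
   function name are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# scheme -> (protocol stack from top to bottom, default port)
COAP_SCHEMES = {
    "coap":      (("coap", "udp"), 5683),
    "coaps":     (("coap", "dtls", "udp"), 5684),
    "coap+tcp":  (("coap", "tcp"), 5683),
    "coaps+tcp": (("coap", "tls", "tcp"), 5684),
    "coap+ws":   (("coap", "websocket", "tcp"), 80),
    "coaps+ws":  (("coap", "websocket", "tls", "tcp"), 443),
}


def stack_for(uri):
    """Return (stack, host, port) implied by the scheme of a CoAP URI."""
    parts = urlsplit(uri)
    stack, default_port = COAP_SCHEMES[parts.scheme]
    port = parts.port if parts.port is not None else default_port
    return (stack, parts.hostname, port)
```

   Six table entries for one application protocol show the cost of
   this approach: every new stack adds another scheme, and another URI
   for the same resource.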

318	3.4.  Use another mechanism for discovery

320	   In this approach, a URI scheme definer would specify a mechanism
321	   whereby transport stack identifiers can be resolved for a given URI,
322	   and the identifiers would come in a form that may not be expressed as
323	   a URI.  If multiple layers exist, then such resolution might involve
324	   a resolution step for each layer.

326	   DNS records (e.g., SRV records) provide one potential mechanism that
327	   can be used to discover a set of supported transports and their
328	   associated identifiers.  Other types of directories might be usable
329	   in other cases.  For example, HTTP now provides an "Alt-Svc"
330	   [RFC7838] mechanism that can discover alternate transport endpoints
331	   for the same HTTP URI.  Another example mentioned in [USWP] is where
332	   the protocol to use is identified by a media type value.
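
   As a rough illustration of the Alt-Svc case, the following sketch
   extracts alternative endpoints from a field value.  It is a
   deliberately simplified parser (it assumes no commas or semicolons
   inside quoted strings and ignores parameters such as "ma"), and the
   function name is hypothetical.

```python
def parse_alt_svc(value):
    """Parse a simplified Alt-Svc value into (protocol, host, port) tuples.

    A host of None means "same host as the origin".
    """
    if value.strip() == "clear":
        return []
    alternatives = []
    for entry in value.split(","):
        alternative = entry.split(";")[0].strip()   # drop parameters
        protocol, _, authority = alternative.partition("=")
        host, _, port = authority.strip('"').rpartition(":")
        alternatives.append((protocol, host or None, int(port)))
    return alternatives
```

   The URI itself is unchanged; only the endpoint used to reach the
   resource differs.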

334	   One challenge in many cases is defining a common mechanism that could
335	   discover identifiers for different transport protocols for the same
336	   resource.  For example, WebSockets use URIs and TCP uses port numbers
337	   (and there is currently no URI scheme for TCP itself), and so the
338	   syntax of such identifiers may differ if an application-layer
339	   protocol could use both TCP and WebSockets.

341	   The advantage of requiring a separate resolution mechanism is that
342	   the resource URI itself can be kept short and simple.  The downsides
343	   are extra complexity in both clients and servers, potentially extra
344	   specification work for the URI scheme definer, the possible
345	   additional deployment burden of provisioning and operating extra
346	   protocols or servers to facilitate such resolution, and any
347	   additional bandwidth or latency of doing the resolution.

349	   In some contexts, it might be feasible to discover the additional
350	   identifiers using the same mechanism used to discover the URI itself,
351	   perhaps even in the same message.

353	4.  Transport endpoint selection

355	   The URI scheme should specify the mechanism for choosing among
356	   transport protocol stacks, such as specifying at least one that is
357	   mandatory to implement and an algorithm for trying possible transport
358	   stacks in some order until one works.  The URI scheme might even
359	   leave it up to the client implementation or client configuration
360	   options as suggested in Approach 2 of [USWP].

362	   The endpoint selection problem is similar to that of choosing among
363	   multiple discovered IP addresses for the same transport stack, and
364	   two common solutions are used today in that context.  One category of
365	   algorithm is to sort the choices according to some criteria, and then
366	   to try them in order of preference.  For example, SRV records provide
367	   a priority and weight for each transport endpoint that can be used to
368	   sort them, and [RFC6724] provides an algorithm for sorting
369	   destination IP addresses.
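
   A sort-then-try strategy over SRV-style records might look like the
   sketch below.  The record type and function name are hypothetical.
   Note the simplification: RFC 2782 calls for a weighted random
   choice among records of equal priority, whereas this sketch orders
   them deterministically by descending weight.

```python
from typing import NamedTuple


class SrvTarget(NamedTuple):
    priority: int   # lower values are tried first
    weight: int     # relative preference within one priority
    port: int
    target: str


def order_targets(records):
    """Order candidates by priority, then by descending weight (simplified)."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))


records = [
    SrvTarget(20, 0, 5683, "backup.example.org"),
    SrvTarget(10, 10, 5683, "b.example.org"),
    SrvTarget(10, 60, 5683, "a.example.org"),
]
ordered = [r.target for r in order_targets(records)]
```

   A client would then attempt each target in the resulting order
   until one succeeds.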

371	   Another category of such algorithms is called "Happy Eyeballs"
372	   [RFC6555], where multiple possibilities are attempted in parallel
373	   (possibly with some delay added before starting non-preferred
374	   choices) and the first one that responds successfully is kept.  The
375	   advantage is faster connection when a non-preferred choice is needed,
376	   and the disadvantages are extra complexity in the client, extra
377	   traffic on the network, and extra connections at the server if
378	   multiple parallel attempts succeed.
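
   The parallel strategy can be sketched with staggered attempts, as
   below.  This is not the algorithm specified in RFC 6555; it is a
   minimal illustration under stated assumptions: each attempt is a
   caller-supplied coroutine function that raises OSError on failure,
   and the function name and the fixed stagger interval are chosen for
   this example.

```python
import asyncio


async def race_attempts(attempts, stagger=0.25):
    """Race (label, connect) pairs given in preference order.

    Attempt i starts after i * stagger seconds.  Returns the label of
    the first attempt that succeeds and cancels the rest.
    """
    async def delayed(index, label, connect):
        await asyncio.sleep(index * stagger)   # head start for preferred choices
        await connect()
        return label

    tasks = [asyncio.create_task(delayed(i, label, connect))
             for i, (label, connect) in enumerate(attempts)]
    try:
        for next_done in asyncio.as_completed(tasks):
            try:
                return await next_done
            except OSError:
                continue                       # this attempt failed; keep waiting
        raise OSError("all attempts failed")
    finally:
        for task in tasks:
            task.cancel()                      # no effect on finished tasks
```

   A real implementation would also have to close any extra
   connections that complete before cancellation takes effect, which
   is the server-side cost noted above.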

380	   As noted earlier, when multiple layers exist in the transport stack,
381	   the number of possible permutations might be large in some cases, and
382	   so a selection mechanism needs to bound the combinations it tries.
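
   The growth in candidates is multiplicative across layers, as the
   following sketch shows.  The layer names and the function are
   hypothetical; the optional limit illustrates one simple way of
   bounding the number of stacks a client is willing to consider.

```python
from itertools import islice, product


def candidate_stacks(layers, limit=None):
    """Enumerate one choice per layer, optionally capped at `limit` stacks."""
    return list(islice(product(*layers), limit))


layers = [
    ["websocket", "raw"],   # framing layer choices
    ["tls", "plain"],       # security layer choices
    ["tcp"],                # transport layer choices
]
all_stacks = candidate_stacks(layers)            # 2 * 2 * 1 combinations
first_two = candidate_stacks(layers, limit=2)    # bounded enumeration
```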

384	5.  IANA Considerations

386	   This document has no actions for IANA.

388	6.  Security Considerations

390	   The security considerations in section 3.7 of [RFC7595] and section 7
391	   of [RFC3986] apply.  [RFC6943] also discusses security considerations
392	   with determining equivalence, and section 3.1.4 of that document is
393	   relevant to resolution.  This document does not raise additional
394	   security issues.

396	7.  Acknowledgements

398	   Thanks to Graham Klyne, Alexey Melnikov, and Gabriel Montenegro for
399	   helpful suggestions on this document.

401	8.  Informative References

403	   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
404	              specifying the location of services (DNS SRV)", RFC 2782,
405	              DOI 10.17487/RFC2782, February 2000,
406	              <https://www.rfc-editor.org/info/rfc2782>.

408	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
409	              Resource Identifier (URI): Generic Syntax", STD 66,
410	              RFC 3986, DOI 10.17487/RFC3986, January 2005,
411	              <https://www.rfc-editor.org/info/rfc3986>.

413	   [RFC5630]  Audet, F., "The Use of the SIPS URI Scheme in the Session
414	              Initiation Protocol (SIP)", RFC 5630,
415	              DOI 10.17487/RFC5630, October 2009,
416	              <https://www.rfc-editor.org/info/rfc5630>.

418	   [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
419	              Cheshire, "Internet Assigned Numbers Authority (IANA)
420	              Procedures for the Management of the Service Name and
421	              Transport Protocol Port Number Registry", BCP 165,
422	              RFC 6335, DOI 10.17487/RFC6335, August 2011,
423	              <https://www.rfc-editor.org/info/rfc6335>.

425	   [RFC6555]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
426	              Dual-Stack Hosts", RFC 6555, DOI 10.17487/RFC6555, April
427	              2012, <https://www.rfc-editor.org/info/rfc6555>.

429	   [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
430	              "Default Address Selection for Internet Protocol Version 6
431	              (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012,
432	              <https://www.rfc-editor.org/info/rfc6724>.

434	   [RFC6943]  Thaler, D., Ed., "Issues in Identifier Comparison for
435	              Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May
436	              2013, <https://www.rfc-editor.org/info/rfc6943>.

438	   [RFC7320]  Nottingham, M., "URI Design and Ownership", BCP 190,
439	              RFC 7320, DOI 10.17487/RFC7320, July 2014,
440	              <https://www.rfc-editor.org/info/rfc7320>.

442	   [RFC7595]  Thaler, D., Ed., Hansen, T., and T. Hardie, "Guidelines
443	              and Registration Procedures for URI Schemes", BCP 35,
444	              RFC 7595, DOI 10.17487/RFC7595, June 2015,
445	              <https://www.rfc-editor.org/info/rfc7595>.

447	   [RFC7838]  Nottingham, M., McManus, P., and J. Reschke, "HTTP
448	              Alternative Services", RFC 7838, DOI 10.17487/RFC7838,
449	              April 2016, <https://www.rfc-editor.org/info/rfc7838>.

451	   [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
452	              Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
453	              Application Protocol) over TCP, TLS, and WebSockets",
454	              RFC 8323, DOI 10.17487/RFC8323, February 2018,
455	              <https://www.rfc-editor.org/info/rfc8323>.

457	   [AWWW]     Jacobs, I. and N. Walsh, "Architecture of the World Wide
458	              Web, Volume One", December 2004,
459	              <http://www.w3.org/TR/webarch>.

461	   [USWP]     Mendelsohn, N., "URI Schemes and Web Protocols", November
462	              2005,
463	              <http://www.w3.org/2001/tag/doc/SchemeProtocols.html>.

465	   [Noah]     Mendelsohn, N., "Email from Noah Mendelsohn to the URI-
466	              Review mailing list", July 2017, <https://www.ietf.org/
467	              mail-archive/web/uri-review/current/msg01919.html>.

469	Author's Address
470	   Dave Thaler
471	   Microsoft
472	   One Microsoft Way
473	   Redmond, WA  98052
474	   USA

476	   Email: dthaler@microsoft.com