idnits 2.17.1

draft-thaler-appsawg-multi-transport-uris-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 255: '...f port numbers is RECOMMENDED whenever...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (March 5, 2018) is 2234 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC3261' is mentioned on line 111, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 6555
     (Obsoleted by RFC 8305)

  -- Obsolete informational reference (is this intentional?): RFC 7320
     (Obsoleted by RFC 8820)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   Network Working Group                                          D. Thaler
3   Internet-Draft                                                 Microsoft
4   Intended status: Informational                             March 5, 2018
5   Expires: September 6, 2018

7                   Using URIs With Multiple Protocol Stacks
8                 draft-thaler-appsawg-multi-transport-uris-02

10  Abstract

12     Many Uniform Resource Identifiers (URIs) today have some mechanism to
13     resolve them to one or more specific endpoints where that resource is
14     available. This document discusses issues that arise when the same
15     resource can be reached over multiple protocol stacks, and discusses
16     various approaches that have been used or discussed, and the
17     tradeoffs between them. Such issues are important to consider when
18     defining new URI schemes and resolution mechanisms.

20  Status of This Memo

22     This Internet-Draft is submitted in full conformance with the
23     provisions of BCP 78 and BCP 79.

25     Internet-Drafts are working documents of the Internet Engineering
26     Task Force (IETF). Note that other groups may also distribute
27     working documents as Internet-Drafts. The list of current Internet-
28     Drafts is at https://datatracker.ietf.org/drafts/current/.

30     Internet-Drafts are draft documents valid for a maximum of six months
31     and may be updated, replaced, or obsoleted by other documents at any
32     time. It is inappropriate to use Internet-Drafts as reference
33     material or to cite them other than as "work in progress."

35     This Internet-Draft will expire on September 6, 2018.

37  Copyright Notice

39     Copyright (c) 2018 IETF Trust and the persons identified as the
40     document authors. All rights reserved.

42     This document is subject to BCP 78 and the IETF Trust's Legal
43     Provisions Relating to IETF Documents
44     (https://trustee.ietf.org/license-info) in effect on the date of
45     publication of this document. Please review these documents
46     carefully, as they describe your rights and restrictions with respect
47     to this document. Code Components extracted from this document must
48     include Simplified BSD License text as described in Section 4.e of
49     the Trust Legal Provisions and are provided without warranty as
50     described in the Simplified BSD License.

52  Table of Contents

54     1. Introduction
55     2. Problem Statement
56     3. Protocol endpoint discovery
57       3.1. Specified by the URI scheme specification
58       3.2. Passed in one URI
59       3.3. Use separate URI for each transport endpoint
60       3.4. Use another mechanism for discovery
61     4. Transport endpoint selection
62     5. IANA Considerations
63     6. Security Considerations
64     7. Acknowledgements
65     8. Informative References
66     Author's Address

68  1. Introduction

70     For Uniform Resource Identifier (URI) schemes that function as
71     locators (historically called "URLs"), [RFC3986] explains that:

73        URI "resolution" is the process of determining an access mechanism
74        and the appropriate parameters necessary to dereference a URI; this
75        resolution may require several iterations. To use that access
76        mechanism to perform an action on the URI's resource is to
77        "dereference" the URI.

79     The specific details vary by URI scheme and hence are up to each URI
80     scheme definition to specify. Requirements for URI scheme
81     definitions are covered in [RFC3986], [RFC7320], and [RFC7595]. RFC
82     7595 section 3.3 states:

84        For schemes that function as locators, it is important that the
85        mechanism of resource location be clearly defined.

87     Closely related to the concept of resolving a URI to a resource that
88     may have multiple ways to reach it is the concept of "equivalence".
89     [RFC3986] section 6.1 states:

91        Even though it is possible to determine that two URIs are
92        equivalent, URI comparison is not sufficient to determine whether
93        two URIs identify different resources. For example, an owner of
94        two different domain names could decide to serve the same resource
95        from both, resulting in two different URIs. Therefore, comparison
96        methods are designed to minimize false negatives while strictly
97        avoiding false positives.

99     Thus, it is possible that two distinct URIs refer to the same
100    resource. The goal, as RFC 3986 stated above, is simply to
101    "minimize" such cases, but such minimization often comes at a cost.
102    For example, for many URI schemes, a DNS name can be used in the
103    authority component rather than using several URIs that differ only
104    in IP address literal, with the cost being a dependency on DNS name
105    resolution and the potential latency and traffic involved.

107    As another example, [RFC5630] section 4.1 states:

109       SIP and SIPS URIs that are identical except for the scheme itself
110       (e.g., sip:alice@example.com and sips:alice@example.com) refer to
111       the same resource. This requirement is implicit in [RFC3261],
112       Section 19.1, which states that "any resource described by a SIP
113       URI can be 'upgraded' to a SIPS URI by just changing the scheme,
114       if it is desired to communicate with that resource securely".
115       This does not mean that the SIPS URI will necessarily be
116       reachable, in particular, if the proxy cannot establish a secure
117       connection to a client or another proxy. This does not suggest
118       either that proxies would arbitrarily "upgrade" SIP URIs to SIPS
119       URIs when forwarding a request (see Section 5.3). Rather, it
120       means that when a resource is addressable with SIP, it will also
121       be addressable with SIPS.

123    Similarly, the same resource might be identified using both "http"
124    and "https", and indeed a commonly followed rule (section 4.1.3 of
125    [USWP]) is that the URI scheme sets expectations for integrity of
126    access, such that separate integrity levels result in separate URI
127    schemes.

129    Thus, the same resource might be identified by multiple URIs that
130    differ only in URI scheme, or authority component, or path (e.g.,
131    using ".." resolution).

133    For URIs used in the World Wide Web, Section 2.3.1 of "Architecture
134    of the World Wide Web" [AWWW] further discusses such aliasing,
135    explaining that links to a resource increase the value of that
136    resource, and multiple URIs for it interfere with such valuation and
137    also make it difficult to correlate two sources as pointing to the
138    same resource via differing aliases. Thus, to maximize the benefit to
139    the Web, URI aliases should be minimized.

141    See "URI Schemes and Web Protocols" [USWP] for additional discussion
142    on the relationship between URI schemes and protocols in a web
143    context, although that document has no official standing and there is
144    a history of difficulty in reaching consensus on the connection
145    between URI schemes and protocols [Noah].

147 2. Problem Statement

149    Besides specifying one or more URI scheme names to be used and the
150    syntax for each (e.g., what the authority component contains), there
151    are two issues a URI scheme definer must deal with when multiple
152    protocol stacks are available for accessing a given resource:

154    1. Specifying how the set of protocol endpoint identifiers (e.g.,
155       TCP and UDP port numbers) for a given URI can be discovered by an
156       entity wishing to resolve it, and

158    2. Specifying how an appropriate protocol endpoint can be selected
159       for use, from among the discovered set.

161    At a high level, these issues are equivalent to those arising when
162    multiple IP addresses are available for the same resource. However,
163    in general, there may be multiple layers in a transport stack (e.g.,
164    some application-layer protocol over WebSockets over TCP), each with
165    its own identifiers, so the problems are compounded when multiple
166    choices exist at each of multiple layers below the application-layer
167    protocol itself.

169    Thus, when we use the term "protocol stack" in this document, we
170    typically mean the stack of protocols below the application-layer
171    protocol associated with the URI scheme, and above the network layer.
172    However, [USWP] also discusses the possibility ("Approach 2") that
173    multiple application-layer protocols might share the same URI scheme,
174    in which case the "protocol stack" also includes the application-
175    layer protocols to select from.

177 3. Protocol endpoint discovery

179    A client wishing to access a resource needs to know, for each layer
180    in the protocol stack, what protocol(s) can be used, and what
181    identifier(s) are needed by each such protocol. There are several
182    possible approaches to endpoint identifier discovery, which we cover
183    in the following sections. For simplicity, we will discuss them as
184    if the same approach is used for both types of information, but it is
185    important to remember that a URI scheme could specify discovery of
186    the set of protocols via one approach, and discovery of the
187    identifier(s) for each protocol via another approach.

189 3.1. Specified by the URI scheme specification

191    In this approach, every resource is assumed to use the exact same set
192    of transport protocols (i.e., stacks of protocols above the network
193    layer) and identifiers. The identifiers can be IANA assigned and
194    specified as part of the URI scheme or protocol specification. For
195    example, TFTP only supports UDP port 69, and so no port number is
196    permitted in a tftp URI.
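
The fixed-endpoint approach just described can be illustrated with a minimal sketch. It assumes a hypothetical lookup table, SCHEME_ENDPOINTS, and a hypothetical helper, resolve_fixed; neither name comes from any specification. The client hard-codes the transport and port for each scheme and rejects a URI that tries to override them.

```python
from urllib.parse import urlsplit

# Hypothetical table for illustration only: each scheme maps to the fixed
# (transport, port) pairs that its specification hard-codes.
SCHEME_ENDPOINTS = {
    "tftp": [("udp", 69)],
}


def resolve_fixed(uri):
    """Return [(transport, host, port), ...] for a scheme with fixed endpoints."""
    parts = urlsplit(uri)
    endpoints = SCHEME_ENDPOINTS.get(parts.scheme)
    if endpoints is None:
        raise ValueError("no fixed endpoints known for scheme %r" % parts.scheme)
    if parts.port is not None:
        # The scheme fixes the port, so an explicit port is rejected.
        raise ValueError("port not permitted in %r URIs" % parts.scheme)
    return [(transport, parts.hostname, port) for transport, port in endpoints]


print(resolve_fixed("tftp://example.com/boot.img"))
```

Because the table is compiled into each client, adding a transport later means that deployed clients may disagree about the available set.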

198    If support for a new transport protocol is later added under a
199    protocol with a given URI scheme, different entities may thus have
200    different hard-coded assumptions about the set of possible protocols,
201    which just pushes the rest of the burden to the problem of selection
202    among the known set (see Section 4).

204    A disadvantage of this approach for many use cases is that it does
205    not allow for non-default server configurations such as custom ports.

207 3.2. Passed in one URI

209    For single-transport protocols, a common mechanism is to specify a
210    default port for the URI scheme, and to allow putting a non-default
211    port number in the URI authority component.

213    For multi-transport protocols, historically it was sometimes assumed
214    that multiple transport protocols (e.g., UDP and TCP) would use the
215    same port number, so specifying a single number would also be
216    sufficient for multiple transports. When port numbers appear in
217    URIs, they are not the default ports that might be IANA-assigned
218    (since default ports should be omitted from the URI per [RFC3986]
219    section 3.2.3), but instead are either statically chosen by the
220    server application, or are ephemeral ports dynamically allocated on
221    the server hosting the resource. In most TCP/IP stacks, ephemeral
222    ports used by UDP endpoints have no relationship to ephemeral ports
223    used by TCP endpoints in the same application and so it cannot be
224    guaranteed that the port numbers are the same. For example, port
225    51000 might be allocated to one application for UDP, and a different
226    application for TCP.

228    Since 2011, this same issue can also occur with IANA-assigned ports,
229    especially if support for a given transport protocol is added at a
230    later time. [RFC6335] section 7.2 explains:

232       Effective with the publication of this document, IANA will begin
233       assigning port numbers for only those transport protocols
234       explicitly included in an assignment request. This ends the long-
235       standing practice of automatically assigning a port number to an
236       application for both TCP and UDP, even if the request is for only
237       one of these transport protocols.

239    Thus, for most URI schemes, a port number appearing in a URI
240    authority component must be specified as being in a specific
241    transport-layer protocol's numbering space since its value for a
242    given resource might differ by transport protocol. If a URI scheme
243    wishes for the port number in the URI authority component to be able
244    to apply to multiple transport protocols, the URI scheme would
245    typically have to assume static configuration on servers; this may be
246    acceptable in some circumstances and unacceptable in others.

248    A common solution in non-URI contexts is to use a service name rather
249    than a literal port number, and allow the service name to be resolved
250    to the relevant transport-layer identifier. Indeed, [RFC6335]
251    section 3 says:

253       Because the port number space is finite (and therefore
254       conservation is an important goal), the alternative of using
255       service names instead of port numbers is RECOMMENDED whenever
256       possible.

258    Unfortunately, it is not possible to follow this recommendation with
259    the port field in the URI authority component, since the URI syntax
260    only allows integers in the port field.

262    For new URI schemes, it may be possible in some cases to place a
263    service name in the host field, such as "_myservice._tcp.example.org"
264    as would be used with a DNS SRV record [RFC2782]. That example still
265    specifies only a single transport protocol stack ("_tcp") however,
266    rather than a list of supported stacks.

268    Another limitation of service names is that they are currently
269    limited to TCP, UDP, SCTP, and DCCP, and so cannot be used with
270    other layers (e.g., WebSockets) or protocols. Thus, a URI scheme for
271    a protocol that supports both, say, WebSockets and raw TCP as
272    possible transports for resource access cannot use a service name as
273    a common identifier for transport-layer endpoint resolution.

275    It is usually also undesirable to put transport-layer endpoint
276    information (the list of supported transport protocols or the
277    identifier(s) used with the transport protocols) in the path or query
278    components for two reasons. First, those components are typically
279    passed over the wire to the server when accessing a resource, which
280    only consumes extra bandwidth with no benefit. Second, if the
281    transport-layer identifiers might change over the lifetime of the
282    resource, then the URI would need to change even if the change did
283    not affect the actual endpoint chosen by the client. Such a change
284    would negatively affect equivalence with the previous URI, e.g.,
285    resulting in cache misses.

287    Thus, an advantage of this approach is that it can work without any
288    dependency on other protocols or deployment of servers needed for
289    resolution, and a disadvantage is that putting information about
290    multiple transport-layer endpoints anywhere in the same URI could
291    make for a very long URI that might have issues with certain
292    software, or have bandwidth or storage issues.

294 3.3. Use separate URI for each transport endpoint

296    In this approach, one must simply accept the fact that multiple URIs
297    might refer to the same resource as RFC 3986 already allows. This is
298    similar to using a set of URIs that differ only in IP address
299    literal, for a case when the resource server is not resolvable via a
300    protocol such as DNS or SIP.

302    The obvious disadvantage is that there are multiple URIs for the same
303    resource. Another potential disadvantage for some more complex use
304    cases where there are multiple layers of the transport stack is that
305    it may be difficult or impossible to express all the identifiers in
306    an entire stack of protocols in one URI.

308    For cases where there are multiple transport protocols but only one
309    such layer, this approach results in needing to identify a single
310    transport protocol per URI. As discussed in Section 3.2, this often
311    cannot be put in the authority component and is undesirable to put in
312    the path or query component. As a result, such cases involve
313    specifying a separate URI scheme per transport. For example, "sip"
314    and "sips" do this, as do "http" and "https". RFC 8323 [RFC8323]
315    also follows this approach for CoAP with "coap", "coaps", "coap+tcp",
316    "coaps+tcp", etc.

318 3.4. Use another mechanism for discovery

320    In this approach, a URI scheme definer would specify a mechanism
321    whereby transport stack identifiers can be resolved for a given URI,
322    and the identifiers would come in a form that may not be expressed as
323    a URI. If multiple layers exist, then such resolution might involve
324    a resolution step for each layer.

326    DNS records (e.g., SRV records) provide one potential mechanism that
327    can be used to discover a set of supported transports and their
328    associated identifiers. Other types of directories might be usable
329    in other cases. For example, HTTP now provides an "Alt-Svc"
330    [RFC7838] mechanism that can discover alternate transport endpoints
331    for the same HTTP URI. Another example mentioned in [USWP] is where
332    the protocol to use is identified by a media type value.

334    One challenge in many cases is defining a common mechanism that could
335    discover identifiers for different transport protocols for the same
336    resource. For example, WebSockets use URIs and TCP uses port numbers
337    (and there is currently no URI scheme for TCP itself), and so the
338    syntax of such identifiers may differ if an application-layer
339    protocol could use both TCP and WebSockets.

341    The advantage of requiring a separate resolution mechanism is that
342    the resource URI itself can be kept short and simple. The downsides
343    are extra complexity in both clients and servers, potentially extra
344    specification work for the URI scheme definer, the possible
345    additional deployment burden of provisioning and operating extra
346    protocols or servers to facilitate such resolution, and any
347    additional bandwidth or latency of doing the resolution.

349    In some contexts, it might be feasible to discover the additional
350    identifiers using the same mechanism used to discover the URI itself,
351    perhaps even in the same message.

353 4. Transport endpoint selection

355    The URI scheme should specify the mechanism for choosing among
356    transport protocol stacks, such as specifying at least one that is
357    mandatory to implement and an algorithm for trying possible transport
358    stacks in some order until one works. The URI scheme might even
359    leave it up to the client implementation or client configuration
360    options as suggested in Approach 2 of [USWP].

362    The endpoint selection problem is similar to that of choosing among
363    multiple discovered IP addresses for the same transport stack, and
364    two common solutions are used today in that context. One category of
365    algorithm is to sort the choices according to some criteria, and then
366    to try them in order of preference. For example, SRV records provide
367    a priority and weight for each transport endpoint that can be used to
368    sort them, and [RFC6724] provides an algorithm for sorting
369    destination IP addresses.
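
The sort-then-try category can be sketched as follows. This is a simplified illustration, not the exact procedure of any specification. The Endpoint record and the order_endpoints helper are hypothetical names, and the sketch assumes the client has already merged candidates discovered for several transports into one list. Lower priority values are tried first, and entries that share a priority are ordered by a weighted random draw.

```python
import random
from collections import namedtuple

# Hypothetical record type for illustration; field names mirror SRV usage.
Endpoint = namedtuple("Endpoint", "transport target port priority weight")


def order_endpoints(endpoints, rng=random):
    """Order candidates: lowest priority value first, weighted-random within a priority."""
    ordered = []
    for priority in sorted({e.priority for e in endpoints}):
        group = [e for e in endpoints if e.priority == priority]
        while group:
            total = sum(e.weight for e in group)
            if total == 0:
                # All remaining weights are zero: pick uniformly.
                choice = rng.choice(group)
            else:
                # Pick an entry with probability proportional to its weight.
                point = rng.uniform(0, total)
                running = 0
                choice = group[-1]
                for e in group:
                    running += e.weight
                    if point <= running:
                        choice = e
                        break
            ordered.append(choice)
            group.remove(choice)
    return ordered


candidates = [
    Endpoint("tcp", "a.example.org", 5000, priority=20, weight=0),
    Endpoint("udp", "b.example.org", 5001, priority=10, weight=60),
    Endpoint("tcp", "c.example.org", 5002, priority=10, weight=40),
]
for e in order_endpoints(candidates):
    print(e.priority, e.transport, e.target, e.port)
```

A client would then attempt the endpoints in the returned order until one succeeds. The order of the two priority-10 entries varies from run to run, while the priority-20 entry is always tried last.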

371    Another category of such algorithms is called "Happy Eyeballs"
372    [RFC6555], in which multiple possibilities are attempted in parallel
373    (possibly with some delay added before starting non-preferred
374    choices) and the first one that responds successfully is kept. The
375    advantage is faster connection when a non-preferred choice is needed,
376    and the disadvantages are extra complexity in the client, extra
377    traffic on the network, and extra connections at the server if
378    multiple parallel attempts succeed.

380    As noted earlier, when multiple layers exist in the transport stack,
381    the number of possible permutations might be large in some cases, and
382    so a mechanism must be cognizant of that.

384 5. IANA Considerations

386    This document has no actions for IANA.

388 6. Security Considerations

390    The security considerations in section 3.7 of [RFC7595] and section 7
391    of [RFC3986] apply. [RFC6943] also discusses security considerations
392    with determining equivalence, and section 3.1.4 of that document is
393    relevant to resolution. This document does not raise additional
394    security issues.

396 7. Acknowledgements

398    Thanks to Graham Klyne, Alexey Melnikov, and Gabriel Montenegro for
399    helpful suggestions on this document.

401 8. Informative References

403    [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
404               specifying the location of services (DNS SRV)", RFC 2782,
405               DOI 10.17487/RFC2782, February 2000,
406               .

408    [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
409               Resource Identifier (URI): Generic Syntax", STD 66,
410               RFC 3986, DOI 10.17487/RFC3986, January 2005,
411               .

413    [RFC5630]  Audet, F., "The Use of the SIPS URI Scheme in the Session
414               Initiation Protocol (SIP)", RFC 5630,
415               DOI 10.17487/RFC5630, October 2009,
416               .

418    [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
419               Cheshire, "Internet Assigned Numbers Authority (IANA)
420               Procedures for the Management of the Service Name and
421               Transport Protocol Port Number Registry", BCP 165,
422               RFC 6335, DOI 10.17487/RFC6335, August 2011,
423               .

425    [RFC6555]  Wing, D. and A. Yourtchenko, "Happy Eyeballs: Success with
426               Dual-Stack Hosts", RFC 6555, DOI 10.17487/RFC6555, April
427               2012, .

429    [RFC6724]  Thaler, D., Ed., Draves, R., Matsumoto, A., and T. Chown,
430               "Default Address Selection for Internet Protocol Version 6
431               (IPv6)", RFC 6724, DOI 10.17487/RFC6724, September 2012,
432               .

434    [RFC6943]  Thaler, D., Ed., "Issues in Identifier Comparison for
435               Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May
436               2013, .

438    [RFC7320]  Nottingham, M., "URI Design and Ownership", BCP 190,
439               RFC 7320, DOI 10.17487/RFC7320, July 2014,
440               .

442    [RFC7595]  Thaler, D., Ed., Hansen, T., and T. Hardie, "Guidelines
443               and Registration Procedures for URI Schemes", BCP 35,
444               RFC 7595, DOI 10.17487/RFC7595, June 2015,
445               .

447    [RFC7838]  Nottingham, M., McManus, P., and J. Reschke, "HTTP
448               Alternative Services", RFC 7838, DOI 10.17487/RFC7838,
449               April 2016, .

451    [RFC8323]  Bormann, C., Lemay, S., Tschofenig, H., Hartke, K.,
452               Silverajan, B., and B. Raymor, Ed., "CoAP (Constrained
453               Application Protocol) over TCP, TLS, and WebSockets",
454               RFC 8323, DOI 10.17487/RFC8323, February 2018,
455               .

457    [AWWW]     Jacobs, I. and N. Walsh, "Architecture of the World Wide
458               Web, Volume One", December 2004,
459               .

461    [USWP]     Mendelsohn, N., "URI Schemes and Web Protocols", November
462               2005,
463               .

465    [Noah]     Mendelsohn, N., "Email from Noah Mendelsohn to the URI-
466               Review mailing list", July 2017, .

469 Author's Address
470    Dave Thaler
471    Microsoft
472    One Microsoft Way
473    Redmond, WA 98052
474    USA

476    Email: dthaler@microsoft.com