idnits 2.17.1
draft-hammer-discovery-02.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== Line 468 has weird spacing: '... query frag...'
-- The document date (February 12, 2009) is 5552 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
== Missing Reference: '-' is mentioned on line 946, but not defined
== Missing Reference: 'TM' is mentioned on line 1120, but not defined
== Unused Reference: 'RFC2818' is defined on line 1101, but no explicit
reference was found in the text
== Outdated reference: A later version (-10) exists of
draft-nottingham-http-link-header-03
== Outdated reference: A later version (-05) exists of
draft-nottingham-site-meta-01
** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
RFC 7232, RFC 7233, RFC 7234, RFC 7235)
** Obsolete normative reference: RFC 2818 (Obsoleted by RFC 9110)
== Outdated reference: A later version (-28) exists of
draft-bryan-metalink-05
Summary: 2 errors (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group E. Hammer-Lahav
3 Internet-Draft Yahoo!
4 Intended status: Informational February 12, 2009
5 Expires: August 16, 2009
7 Link-based Resource Descriptor Discovery
8 draft-hammer-discovery-02
10 Status of this Memo
12 This Internet-Draft is submitted to IETF in full conformance with the
13 provisions of BCP 78 and BCP 79.
15 Internet-Drafts are working documents of the Internet Engineering
16 Task Force (IETF), its areas, and its working groups. Note that
17 other groups may also distribute working documents as Internet-
18 Drafts.
20 Internet-Drafts are draft documents valid for a maximum of six months
21 and may be updated, replaced, or obsoleted by other documents at any
22 time. It is inappropriate to use Internet-Drafts as reference
23 material or to cite them other than as "work in progress."
25 The list of current Internet-Drafts can be accessed at
26 http://www.ietf.org/ietf/1id-abstracts.txt.
28 The list of Internet-Draft Shadow Directories can be accessed at
29 http://www.ietf.org/shadow.html.
31 This Internet-Draft will expire on August 16, 2009.
33 Copyright Notice
35 Copyright (c) 2009 IETF Trust and the persons identified as the
36 document authors. All rights reserved.
38 This document is subject to BCP 78 and the IETF Trust's Legal
39 Provisions Relating to IETF Documents
40 (http://trustee.ietf.org/license-info) in effect on the date of
41 publication of this document. Please review these documents
42 carefully, as they describe your rights and restrictions with respect
43 to this document.
45 Abstract
47 This memo describes a process for obtaining information about a
48 resource identified by a URI. The 'information about a resource', a
49 resource descriptor, provides machine-readable information that aims
50 to increase interoperability and enhance the interaction with the
51 resource. This memo only defines the process for locating and
52 obtaining the descriptor, but leaves the descriptor format and its
53 interpretation out of scope.
55 Table of Contents
57 1. Introduction
58 2. Notational Conventions
59 3. The describedby Link Relation
60 4. Identifying Descriptor Location
61 4.1. Method Selection
62 4.2. The <link> Element
63 4.3. The HTTP Link Header
64 4.4. The Host Metadata Document
65 5. Obtaining Resource Descriptor
66 6. The Link-Pattern host-meta Field
67 6.1. Template Syntax
68 7. Security Considerations
69 8. IANA Considerations
70 8.1. The Link-Pattern host-meta Field
71 8.2. The describedby Relation Type
72 Appendix A. Descriptor Discovery vs. Service Discovery
73 Appendix B. Methods Suitability Analysis
74 Appendix B.1. Requirements
75 Appendix B.2. Analysis
76 Appendix C. Acknowledgments
77 Appendix D. Document History
78 9. References
79 9.1. Normative References
80 9.2. Informative References
81 Author's Address
83 1. Introduction
85 This memo defines a process for locating descriptors for resources
86 identified with URIs. Resource descriptors are documents (usually
87 based on well known serialization languages such as XML, RDF, and
88 JSON) which provide machine-readable information about resources
89 (resource metadata) for the purpose of promoting interoperability and
90 assisting interaction with unknown resources that support known
91 interfaces.
93 While many methods provide the ability to link a resource to its
94 metadata, none of these methods fully address the requirements of a
95 uniform and easily implementable process. These requirements include
96 the ability for resources to self-declare the location of their
97 descriptors, the ability to access descriptors directly without
98 interacting with the resource, and support for a wide range of platforms
99 and scale of deployment. They must also be fully compliant with
100 existing web protocols, and support extensibility. These
101 requirements, and the analysis used as the basis for this memo, are
102 explained in detail in Appendix B.
104 For example, a web page about an upcoming meeting can provide in its
105 descriptor document the location of the meeting organizer's free/busy
106 information to potentially negotiate a different time. A social
107 network profile page descriptor can identify the location of the
108 user's address book as well as accounts on other sites. A web
109 service implementing an API with optional components can advertise
110 which of these are supported.
112 This memo describes the first step in the discovery process in which
113 the resource descriptor document is located and retrieved. Other
114 steps, which are outside the scope of this memo, include parsing the
115 descriptor document based on its format (such as POWDER [POWDER], XRD
116 [XRD], and Metalink [I-D.bryan-metalink]) and utilizing it based on
117 the application.
119 Discovery can be performed before, after, or without obtaining a
120 representation of the resource. Performing discovery ahead of
121 accessing a representation allows the client not to rely on
122 assumptions about the properties of the resource. Performing
123 discovery after a representation has been obtained enables further
124 interaction with it.
126 Given the wide range of 'information about a resource', no single
127 descriptor format can adequately accommodate such scope. However,
128 there is great value in making the process of locating the descriptor
129 uniform across formats. While HTTP is the most common protocol used
130 in association with discovery and is explicitly specified in this
131 memo, other protocols MAY be used.
133 Please discuss this draft on the www-talk@w3.org [1] mailing list.
135 2. Notational Conventions
137 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
138 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
139 document are to be interpreted as described in [RFC2119].
141 This document uses the Augmented Backus-Naur Form (ABNF) notation of
142 [RFC2616]. Additionally, the following rules are included from
143 [RFC3986]: reserved and unreserved, and from
144 [I-D.nottingham-http-link-header]: link-param.
146 3. The describedby Link Relation
148 The methods described in this memo express the location of the
149 resource descriptor as a link relation, utilizing the link framework
150 defined by [I-D.nottingham-http-link-header]. The association of a
151 descriptor document with the resource it describes is declared using
152 the "describedby" link relation type.
154 The "describedby" link relation is defined in [POWDER] and registered
155 as:
157 The relationship A "describedby" B asserts that resource B
158 provides a description of resource A. There are no constraints on
159 the format or representation of either A or B, neither are there
160 any further constraints on either resource.
162 Since a single resource can have many descriptors, the "describedby"
163 link relation has a one-to-many structure (the question whether a
164 single descriptor can describe multiple resources is outside the
165 scope of this memo). In the case of multiple "describedby" links
166 obtained from a single method, selecting which link to use is
167 application-specific.
169 To promote interoperability, applications referencing this memo
170 SHOULD clearly define the application-specific criteria used to
171 select between "describedby" links. This MAY be done by:
173 o Supporting a single descriptor format, or defining an order of
174 precedence for multiple descriptor formats. Applications MAY
175 require the presence of the link "type" attribute with the mime-
176 type of the required format.
178 o Using the "describedby" relation type together with another
179 application-specific relation type in the same link. The
180 application-specific relation type can be registered or an
181 extension.
183 o Specifying additional link attributes using link-extensions.
185 Link selection MUST NOT depend on the order in which multiple links
186 are obtained from a single method. Applications MUST NOT impose
187 constraints on the usage of the "describedby" relation type as it is
188 likely to be used by other applications in association with the same
189 resource.
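As an illustration only, the following Python sketch shows one way an application could select among several "describedby" links using an order of precedence over descriptor formats. The function name, the tuple layout, and the example media types and URIs are assumptions made for this sketch and are not defined by this memo. Ties are broken by the link target rather than by arrival order, so the result does not depend on the order in which the links were obtained.

```python
from typing import Iterable, Optional, Sequence, Tuple


def select_descriptor(
    links: Iterable[Tuple[str, Optional[str]]],
    preferred_types: Sequence[str],
) -> Optional[str]:
    """Pick one descriptor URI from (href, media_type) pairs.

    preferred_types lists the formats the application supports,
    most preferred first. Links without a supported type are ignored.
    """
    rank = {media_type: i for i, media_type in enumerate(preferred_types)}
    candidates = [
        (rank[media_type], href)
        for href, media_type in links
        if media_type in rank
    ]
    if not candidates:
        return None
    # min() over (rank, href) is independent of the input order.
    return min(candidates)[1]


# Hypothetical links and application preferences.
links = [
    ("http://example.com/r/1;xrd", "application/xrd+xml"),
    ("http://example.com/r/1;about", "application/powder+xml"),
]
preferred = ["application/powder+xml", "application/xrd+xml"]
print(select_descriptor(links, preferred))
```

An application that requires the "type" attribute, as permitted above, gets that behavior for free here, since links without a recognized type never become candidates.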
191 4. Identifying Descriptor Location
193 The descriptor location (URI) is a function of the resource URI.
194 This section defines three methods which together satisfy the
195 requirements defined in Appendix B. While each method on its own
196 satisfies the requirements partially, together they provide enough
197 flexibility for most use cases. Each of the following three methods
198 is performed by using the resource URI to identify its descriptor
199 URI.
201 In many cases, a request for one URI leads to requesting other URIs,
202 as is the case with HTTP redirections. Because the decision whether
203 to use such URIs is application-specific, discovery is constrained to
204 a single URI identifying the resource. Any other resource URIs
205 received MUST be considered as a separate and discrete input into the
206 discovery function. If a resource URI obtained during the
207 performance of these methods is found to be more relevant to the
208 application, the discovery process MUST be restarted with the new
209 resource URI as its input.
211 For example, an HTTP HEAD request for URI A returns a redirect (307)
212 response with a set of "describedby" links, and identifies the
213 temporary location of the representation at URI B. An HTTP HEAD
214 request for URI B returns a successful (200) response with its own
215 set of "describedby" links. An application MAY choose to define a
216 process in which the two sets of links are obtained, prioritized, and
217 utilized; however, it MUST do so by explicitly instructing the client
218 to perform discovery multiple times, as each is considered a separate
219 and distinct discovery.
221 4.1. Method Selection
223 Each method presents a different set of requirements. The criteria
224 used to determine which methods a server SHOULD support and client
225 SHOULD attempt are based on a combination of factors:
227 o The ability to offer and obtain a representation of the resource
228 by dereferencing its URI.
230 o The availability of a representation supporting markup
231 compatible with [I-D.nottingham-http-link-header].
233 o The availability of an HTTP representation of the resource and the
234 ability to provide and access link information in its response
235 header.
237 The methods are listed based on the restrictiveness of their
238 requirements in descending order, from the most specialized to the
239 most generic. This ordering, however, does not imply the order in
240 which multiple applicable methods should be attempted. Because
241 different methods are more appropriate in different circumstances, it
242 is up to each application to define how they should be used together.
244 To promote interoperability, applications referencing this memo MUST
245 clearly define the relationship between the three methods as either:
247 o equal, all methods MUST produce the same set of resource
248 descriptors and clients MAY attempt any method according to
249 their capabilities, or
251 o with an application-specific order of precedence, where methods
252 MUST be attempted in a specific order.
254 4.2. The <link> Element
256 The <link> element method is limited to resources with an available
257 markup representation that supports typed-relations using the <link>
258 element, such as HTML [W3C.REC-html401-19991224], XHTML
259 [W3C.REC-xhtml1-20020801], and Atom [RFC4287]. Other markup formats
260 are permitted as long as the semantics of their <link> elements are
261 fully compatible with the link framework defined in
262 [I-D.nottingham-http-link-header]. This method requires the
263 retrieval of a resource representation. While HTTP is the most
264 common transport for such documents, this method is transport
265 independent.
267 For example:
269
272 A client trying to obtain the location of the resource's descriptor
273 using this method SHALL:
275 1. Retrieve a representation of the resource using the applicable
276 transport for that resource URI. If the markup document is
277 obtained using HTTP, it MUST only be used by the client if the
278 document is a valid representation of the resource identified by
279 the HTTP request URI, typically in a response with a successful
280 (2xx) or redirection (3xx) status code. If no such valid
281 representation of the request URI is found, the method fails.
283 2. Parse the document as defined by its format specification and
284 look for <link> elements with a "rel" attribute value containing
285 the "describedby" relation. The client MUST obey the document
286 markup schema and ignore any invalid elements (such as <link>
287 elements outside the <head> section of an HTML document). This
288 prevents unintentional markup from other parts of the
289 document from being used for discovery purposes, which can have a
290 significant impact on usability and security.
292 3. Narrow down the selection if more than one "describedby" link is
293 found, following the application-specific criteria. The
294 descriptor location is obtained from the value of the "href"
295 attribute in the selected <link> element.
297 <link> elements MAY include other relation types together with
298 "describedby" in a single "rel" attribute (for example
299 'rel="describedby copyright"'). Clients MUST properly process
300 such multiple-relation "rel" attributes as defined by the format
301 specification.
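The parsing and filtering steps above can be sketched in Python for the HTML case, using the standard library html.parser module. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the sample markup and its URIs are invented for illustration, and the "document schema" check is reduced to accepting only <link> elements that appear inside the <head> section.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class DescribedbyLinkParser(HTMLParser):
    """Collects (href, type) for <link rel="... describedby ..."> in <head>."""

    def __init__(self) -> None:
        super().__init__()
        self._in_head = False
        self.links: List[Tuple[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "body":
            # A <body> start tag ends the head even if </head> is omitted.
            self._in_head = False
        elif tag == "link" and self._in_head:
            attributes = dict(attrs)
            # "rel" holds a space-separated list of relation types.
            relations = (attributes.get("rel") or "").lower().split()
            href = attributes.get("href")
            if "describedby" in relations and href:
                self.links.append((href, attributes.get("type")))

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False


def find_describedby_links(markup: str) -> List[Tuple[str, Optional[str]]]:
    parser = DescribedbyLinkParser()
    parser.feed(markup)
    parser.close()
    return parser.links


# Hypothetical document: the <link> in the body must be ignored.
sample = (
    '<html><head>'
    '<link rel="describedby copyright" href="/r/1;about" '
    'type="application/powder+xml">'
    '</head><body>'
    '<link rel="describedby" href="/untrusted">'
    '</body></html>'
)
print(find_describedby_links(sample))
```

Selecting among the returned links remains application-specific, and a relative "href" value still has to be resolved against the document's base URI before use.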
303 4.3. The HTTP Link Header
305 The HTTP Link header method is limited to resources for which an HTTP
306 GET or HEAD request returns a 2xx, 3xx, or 4xx HTTP response
307 [RFC2616]. This method uses the Link header defined in
308 [I-D.nottingham-http-link-header] and requires the retrieval of a
309 resource representation header.
311 For example:
313 Link: ; rel="describedby";
314 type="application/powder+xml"
316 A client trying to obtain the location of the resource's descriptor
317 using this method SHALL:
319 1. Make an HTTP (or HTTPS as required) GET or HEAD request to the
320 resource URI to obtain a valid response header. If the HTTP
321 response carries a status code other than successful (2xx),
322 redirection (3xx), or client error (4xx), the method fails.
324 2. Parse the HTTP response header and look for Link headers with a
325 "rel" parameter value containing the "describedby" relation.
327 3. Narrow down the selection if more than one "describedby" link is
328 found, following the application-specific criteria. The
329 descriptor location is obtained from the "<>" enclosed URI-
330 reference in the selected Link header.
332 Link headers MAY include other relation types together with
333 "describedby" in a single "rel" parameter (for example
334 'rel="describedby copyright"'). Clients MUST properly process
335 such multiple-relation "rel" parameters as defined by
336 [I-D.nottingham-http-link-header].
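As a rough illustration of steps 2 and 3, the Python sketch below extracts "describedby" targets from Link header field values that have already been obtained from an HTTP response. It is a simplified parser, not a complete implementation of the Link header grammar: it assumes parameter values are either tokens or quoted strings without escaped quotes. The function names and the example header value, including its URIs, are assumptions made for this sketch.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# One link-value: "<target>" followed by any number of ";name=value" params.
LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split one Link header value into (target, params) pairs."""
    links = []
    for match in LINK_RE.finditer(value):
        target, raw_params = match.groups()
        params: Dict[str, str] = {}
        for name, quoted, token in PARAM_RE.findall(raw_params):
            # Keep the first occurrence of a repeated parameter.
            params.setdefault(name.lower(), quoted or token)
        links.append((target, params))
    return links


def describedby_targets(
    header_values: Iterable[str],
) -> List[Tuple[str, Optional[str]]]:
    """Return (target, type) for every link whose rel includes describedby."""
    found = []
    for value in header_values:
        for target, params in parse_link_header(value):
            relations = params.get("rel", "").lower().split()
            if "describedby" in relations:
                found.append((target, params.get("type")))
    return found


# Hypothetical header value carrying two links.
header = (
    '<http://example.com/r/1;about>; rel="describedby copyright"; '
    'type="application/powder+xml", </style.css>; rel=stylesheet'
)
print(describedby_targets([header]))
```

Because the "rel" value is split on whitespace before matching, a value such as 'rel="describedby copyright"' is handled the same way as a lone "describedby".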
338 4.4. The Host Metadata Document
340 The host metadata document method is available for any resource
341 identified by a URI whose authority supports the host-meta document
342 defined in [I-D.nottingham-site-meta]. This method does not require
343 obtaining any representation of the resource, and operates solely
344 using the resource URI.
346 The link relation between the resource URI and the descriptor URI is
347 obtained by using a template contained in the host-meta document. By
348 applying the host-wide template to an individual resource URI, a
349 resource-specific link is produced which can be used to indicate the
350 location of the descriptor document for that resource, bypassing the
351 need to access or provide a representation for it.
353 For example (line breaks are for formatting only, and are not allowed
354 in the document):
356 Link-Pattern: <{uri};about>; rel="describedby";
357 type="application/powder+xml"
359 A client trying to obtain the location of the resource's descriptor
360 using this method SHALL:
362 1. Retrieve the host-meta document for the resource URI's authority as
363 defined by [I-D.nottingham-site-meta] section 4. If the request fails to
364 retrieve a valid host-meta document, the method fails.
366 2. Parse the host-meta document and look for Link-Pattern fields with a
367 "rel" attribute value containing the "describedby" relation.
369 3. Narrow down the selection if more than one "describedby" link is
370 found, following the application-specific criteria. The
371 descriptor location is constructed by applying the template
372 obtained from the selected Link-Pattern field to the resource URI
373 as described by Section 6.1.
375 Link-Pattern MAY include other relation types together with
376 "describedby" in a single "rel" parameter (for example
377 'rel="describedby copyright"'). Clients MUST properly process
378 such multiple-relation "rel" parameters as defined by Section 6.
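Step 2 can be sketched in Python as follows. The sketch assumes the host-meta document has already been retrieved and is a line-oriented text document of "Name: value" fields; that layout, the function name, and the sample document are assumptions made for illustration. It handles a single pattern-value per field rather than the full comma-separated list, and it leaves the template unexpanded.

```python
import re
from typing import Dict, List, Optional, Tuple

FIELD_RE = re.compile(r'^Link-Pattern\s*:\s*(.*)$', re.IGNORECASE)
VALUE_RE = re.compile(r'<([^>]*)>(.*)')
# Tolerant of a missing ";" separator between parameters.
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def describedby_patterns(host_meta_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (template, type) for Link-Pattern fields with rel describedby."""
    patterns = []
    for line in host_meta_text.splitlines():
        field = FIELD_RE.match(line.strip())
        if not field:
            continue  # some other host-meta field, or a blank line
        value = VALUE_RE.match(field.group(1))
        if not value:
            continue  # no "<template>" present; ignore the field
        template, raw_params = value.groups()
        params: Dict[str, str] = {}
        for name, quoted, token in PARAM_RE.findall(raw_params):
            params.setdefault(name.lower(), quoted or token)
        if "describedby" in params.get("rel", "").lower().split():
            patterns.append((template, params.get("type")))
    return patterns


# Hypothetical host-meta document with two Link-Pattern fields.
sample_host_meta = (
    'Link-Pattern: <{uri};about>; rel="describedby"; '
    'type="application/powder+xml"\n'
    'Link-Pattern: <{uri};by>; rel="author"\n'
)
print(describedby_patterns(sample_host_meta))
```

The selected template is then applied to the resource URI as described in Section 6.1 to produce the descriptor location.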
380 5. Obtaining Resource Descriptor
382 Once the desired descriptor URI has been obtained, the descriptor
383 document is retrieved. If the descriptor URI scheme is "http" or
384 "https", the document is obtained via an HTTP (or HTTPS as required)
385 GET request to the identified URI. The client MUST obey HTTP
386 redirections (3xx), and the descriptor document is considered valid
387 only if retrieved with a successful HTTP response status (2xx).
389 6. The Link-Pattern host-meta Field
391 The Link host-meta field [I-D.nottingham-site-meta] conveys a link
392 relation between all resource URIs under the host-meta authority and
393 a common target URI. However, there are cases in which relations of
394 different resources with the same authority do not share the same
395 target URI, but do follow a common pattern in how the target URI is
396 constructed.
398 For example, a news site with multiple authors can provide
399 information about each article's author by appending a suffix (such
400 as ";by") to the URI of each article. Each article has a unique
401 author, but all share the same pattern of where that information is
402 located. The same information can be provided using an HTTP Link
403 header or HTML <link> element on each resource, but a
404 single pattern conveys it more efficiently:
406 Link-Pattern: <{uri};by>; rel="author"
408 The Link-Pattern host-meta field uses a slightly modified syntax of
409 the HTTP Link header [I-D.nottingham-http-link-header] to convey
410 relations whose context is individual resources with the same
411 authority as the host-meta document, and whose target is constructed
412 by applying a template to the context URI. The field is not specific
413 to any relation type and MAY be used to express any relations
414 supported by the Link header [I-D.nottingham-http-link-header].
416 The Link-Pattern host-meta field differs from the HTTP Link header in
417 the following respects:
419 o The "<>" enclosed token is not a valid URI, but instead contains a
420 template as defined in Section 6.1.
422 o Its context URI is defined as the individual resource URI used as
423 input to the template.
425 o If the resulting target URI expressed by the template is relative,
426 its base URI is the root resource of the authority.
428 Link-Pattern = "Link-Pattern" ":" #pattern-value
430 pattern-value = "<" template ">" *( ";" link-param )
432 template = *( uri-char | "{" [ "%" ] var-name "}" )
434 uri-char = ( reserved | unreserved )
436 var-name = "scheme" | "authority" | "path"
437 | "query" | "fragment" | "userinfo"
438 | "host" | "port" | "uri"
440 [[ should this spec define a filter/map parameter that will allow
441 applying link patterns to subsets of the host-meta scope? This can
442 use a regular expression match or something similar to robots.txt.
443 If the spec will end up not directly supporting this feature, I will
444 add a note suggesting that such a feature could be defined elsewhere
445 as an extension. ]]
447 6.1. Template Syntax
449 The template syntax provides a simple format for URI transformation.
450 A template is a string containing brace-enclosed ("{}") variable
451 names marking the parts of the string that are to be substituted by
452 the variable values. A template is transformed into a URI by
453 substituting the variables with their calculated value. If a
454 variable name is prefixed by "%", any character in the variable value
455 other than unreserved MUST be percent-encoded per [RFC3986].
457 To construct a URI using a template, the input URI is parsed into its
458 URI components and each component value assigned to a variable name.
459 The template variable substitution is based on the URI vocabulary
460 defined by [RFC3986] section 3 and includes: "scheme", "authority",
461 "path", "query", "fragment", "userinfo", "host", and "port". In
462 addition, it defines the "uri" variable as the entire input URI
463 excluding the fragment component and the "#" fragment separator.
465   foo://william@example.com:8080/over/there?name=ferret#nose
466   \_/   \______________________/\_________/ \_________/ \__/
467    |               |                 |           |       |
468 scheme         authority           path        query fragment
470   foo://william@example.com:8080/over/there?name=ferret#nose
471         \_____/ \_________/ \__/
472            |         |       |
473        userinfo    host     port
475   foo://william@example.com:8080/over/there?name=ferret#nose
476   \___________________________________________________/
477                             |
478                            uri
480 For example, given the input URI "http://example.com/r/1?f=xml#top",
481 each of the following templates will produce the associated output
482 URI:
484 http://example.org?q={%uri} -->
485 http://example.org?q=http%3A%2F%2Fexample.com%2Fr%2F1%3Ff%3Dxml
487 http://meta.{host}:8080{path}?{query} -->
488 http://meta.example.com:8080/r/1?f=xml
490 https://{authority}/v1{path}#{fragment} -->
491 https://example.com/v1/r/1#top
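The substitution rules above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: the function names are hypothetical, URI parsing is delegated to the standard library's urlsplit, and the host/port split is naive (it does not handle IPv6 literals in the authority). The three templates from the example above are used as input.

```python
import re
from typing import Dict
from urllib.parse import quote, urlsplit

# "{name}" or "{%name}"; the "%" prefix requests percent-encoding.
VAR_RE = re.compile(r"\{(%?)([a-z]+)\}")


def uri_variables(uri: str) -> Dict[str, str]:
    """Map each template variable name to its value for the input URI."""
    parts = urlsplit(uri)
    authority = parts.netloc
    userinfo, _, hostport = authority.rpartition("@")
    host, _, port = hostport.partition(":")  # naive: no IPv6 literal support
    return {
        "scheme": parts.scheme,
        "authority": authority,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
        "userinfo": userinfo,
        "host": host,
        "port": port,
        # Entire input URI without the fragment and its "#" separator.
        "uri": uri.split("#", 1)[0],
    }


def expand_template(template: str, uri: str) -> str:
    """Replace every brace-enclosed variable in template with its value."""
    values = uri_variables(uri)

    def substitute(match: "re.Match[str]") -> str:
        encode, name = match.groups()
        if name not in values:
            raise ValueError("unknown template variable: " + name)
        value = values[name]
        # safe="" leaves only unreserved characters unencoded.
        return quote(value, safe="") if encode else value

    return VAR_RE.sub(substitute, template)


source = "http://example.com/r/1?f=xml#top"
for template in (
    "http://example.org?q={%uri}",
    "http://meta.{host}:8080{path}?{query}",
    "https://{authority}/v1{path}#{fragment}",
):
    print(template, "-->", expand_template(template, source))
```

Rejecting unknown variable names, instead of leaving them in place, is a design choice of this sketch; the memo does not say how a client should treat them.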
493 7. Security Considerations
495 The methods used to perform discovery are not secure, private or
496 integrity-guaranteed, and due caution should be exercised when using
497 them. Applications that perform discovery should consider the attack
498 vectors opened by automatically following, trusting, or otherwise
499 using links gathered from <link> elements, HTTP Link headers, or
500 host-meta documents.
502 8. IANA Considerations
504 8.1. The Link-Pattern host-meta Field
506 This specification registers the Link-Pattern host-meta field in the
507 host-meta Field Registry [I-D.nottingham-site-meta].
509 Field Name: Link-Pattern
511 Change controller: IETF
513 Specification document(s): [[ this document ]]
515 Related information: [I-D.nottingham-http-link-header]
517 8.2. The describedby Relation Type
519 [[ this section will be removed if the "describedby" relation type is
520 registered by the time it is published ]]
522 This specification registers the "describedby" relation type in the
523 Link Relation Type Registry [I-D.nottingham-http-link-header].
525 o Relation Name: describedby
527 o Description: The relationship A "describedby" B asserts that
528 resource B provides a description of resource A. There are no
529 constraints on the format or representation of either A or B,
530 neither are there any further constraints on either resource.
532 o Documentation: [POWDER]
534 Appendix A. Descriptor Discovery vs. Service Discovery
536 Descriptor discovery provides a process for obtaining information
537 about a resource identified with a URI. It allows servers to
538 describe their resources in a machine-readable format, enabling
539 automatic interoperability by user-agents and resource consuming
540 applications. Discovery enables applications to utilize a wide range
541 of web services and resources across multiple providers without the
542 need to know about their capabilities in advance, reducing the need
543 for manual configuration and resource-specific software.
545 When discussing discovery, it is important to differentiate between
546 descriptor discovery and service discovery. Both types attempt to
547 associate capabilities with resources, but they approach it from
548 opposite ends.
550 Service discovery centers on identifying the location of qualified
551 resources, typically finding an endpoint capable of certain protocols
552 and capabilities. In contrast, descriptor discovery begins with a
553 resource, trying to find which capabilities it supports.
555 A simple way to distinguish between the two types of discovery is to
556 define the questions they are each trying to answer:
558 Descriptor-Discovery: Given a resource, what are its attributes:
559 capabilities, characteristics, and relationships to other
560 resources?
562 Service-Discovery: Given a set of attributes, which available
563 resources match the desired set and what is their location?
565 While this memo deals exclusively with descriptor discovery, it is
566 important to note that the two discovery types are closely related
567 and are usually used in tandem. In fact, a typical use case will
568 switch between service discovery and descriptor discovery multiple
569 times in a single workflow, and can start with either one.
571 One reason for this dependency between the two discovery types is
572 that resource descriptors usually contain not only a list of
573 capabilities, but also relationships to other resources. Since those
574 relationships are usually typed, the process in which an application
575 chooses which links to use is in fact service discovery.
577 Applications use descriptor discovery to obtain the list of links,
578 and service discovery to choose the relevant links. In another
579 common example, the application uses service discovery to find a
580 resource with a given capability, then uses descriptor discovery to
581 find out what other capabilities it supports.
583 Appendix B. Methods Suitability Analysis
585 Due to the wide range of use cases requiring resource descriptors,
586 and the desire to reuse as much as possible, no single solution has
587 been found to sufficiently cover the requirements for linking between
588 the resource URI and the descriptor URI. The following analysis
589 attempts to list all the methods proposed for addressing descriptor
590 discovery. It is included here to provide background information as
591 to why certain methods have been selected while others were rejected from
592 the discovery process. It has been updated to match the terms used
593 in this memo and its structure.
595 Appendix B.1. Requirements
597 Getting from a resource URI to its descriptor document can be
598 implemented in many ways. The problem is that none of the current
599 methods address all of the requirements presented by the common use
600 cases. The requirements are simple, but the more we try to address,
601 the less elegant and accessible the process becomes. While working
602 on the now defunct XRDS-Simple specification [XRDS-Simple] and
603 talking to companies and individuals about it, the following
604 requirements emerged for any proposed process:
606 Self Declaration:
608 Allow resources to declare the availability of descriptor
609 information and its location. When a resource is accessed, it
610 needs to have a way to communicate to the client that it
611 supports the discovery protocol and to indicate the location
612 of such descriptor.
614 This is useful when the client is able to interact, or is already
615 interacting, with the resource but can enhance its interaction
616 with additional information. For example, accessing a blog
617 page is enhanced if the client can learn that the page was generated
618 from an Atom feed or Atom entry and that the feed supports Atom authoring.
620 Direct Descriptor Access:
622 Enable direct retrieval of the resource descriptor without
623 interacting with the resource itself. Before a resource is
624 accessed, the client should have a way to obtain the resource
625 descriptor without accessing the resource. This is important
626 for two reasons.
628 First, accessing an unknown resource may have undesirable
629 consequences. After all, the information contained in the
630 descriptor is supposed to inform the client how to interact
631 with the resource. The second is efficiency - removing the
632 need to first obtain the resource in order to get its
633 descriptor (reducing HTTP round-trips, network bandwidth, and
634 application latency).
636 Web Architecture Compliant:
638 Work with well-established web infrastructure. This may sound
639 obvious but it is in fact the most complex requirement.
640 Deploying new extensions to the HTTP protocol is a complicated
641 endeavor. Besides getting applications to support a new header,
642 method, or content negotiation, existing caches and proxies
643 must be enhanced to properly handle these requests, and they
644 must not fail performing their normal duties without such
645 enhancements.
647 For example, a new content negotiation method may cause an
648 existing cache to serve the wrong data to a non-discovery
649 client due to its inability to distinguish the metadata request
650 from the resource representation request.
652 Scale and Technology Agnostic:
654 Support large and small web providers regardless of the size of
655 operations and deployment. Any solution must work for a small
656 hosted web site as well as the world's largest search engine. It
657 must be flexible enough to allow developers with restricted
658 access to the full HTTP protocol (such as limited access to
659 request or response headers) to be able to both provide and
660 consume resource descriptors. Any solution should also support
661 caching as much as possible and allow reuse of source code and
662 data.
664 Extensible:
666 Accommodate future enhancements and unknown descriptor formats.
667 It should support the existing set of descriptor formats such
668 as XRD and POWDER, as well as new descriptor relationships that
669 might emerge in the future. In addition, the solution should
670 not depend on the descriptor format itself and should work equally
671 well with any document format - it should aim to keep the road
672 and destination separate.
674 Appendix B.2. Analysis
676 The following is a list of proposed and implemented methods trying to
677 address descriptor discovery. Each method is reviewed for its
678 compliance with the requirements identified previously. The [-],
679 [+], or [+-] symbols next to each requirement indicate how well the
680 method complies with the requirement.
682 Appendix B.2.1. HTTP Response Header
684 When a resource representation is retrieved using an HTTP GET
685 request, the server includes in the response a header pointing to the
686 location of the descriptor document. For example, POWDER uses the
687 "Link" response header to create an association between the resource
688 and its descriptor. XRDS [XRDS] (based on the Yadis protocol
689 [Yadis]) uses a similar approach, but since the Link header was not
690 available when Yadis was first drafted, it defines a custom header
691 X-XRDS-Location which serves a similar but less generic purpose.
693 [+] Self Declaration - using the Link header, any resource can point
694 to its descriptor documents.
696 [-] Direct Descriptor Access - the header is only accessible when
697 requesting the resource itself via an HTTP GET request. While
698 HTTP GET is meant to be a safe operation, it is still possible for
699 some resource to have side-effects.
701 [+] Web Architecture Compliant - uses the Link header which is an
702 IETF Internet Standard [[ currently a standard-track draft ]], and
703 is consistent with HTTP protocol design.
705 [-] Scale and Technology Agnostic - since discovery accounts for a
706 small percentage of resource requests, the extra Link header is
707 wasteful. For some hosted servers, access to HTTP headers is
708 limited and will prevent implementation.
710 [+] Extensible - the Link header provides built-in extensibility by
711 allowing new link relations, mime-types, and other extensions.
713 Minimum roundtrips to retrieve the resource descriptor: 2
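As an illustration of the HTTP Response Header method, the following
Python sketch extracts descriptor locations from a Link header value.
The "describedby" relation type is the one used in this memo; the
function name, the example URIs, and the simplified parsing are
assumptions made for illustration only. In particular, the sketch does
not handle commas or ">" characters inside URIs or quoted parameters,
so it is not a full implementation of the Link header grammar.

```python
import re

# One link-value: <URI-reference> followed by its parameters.
# Simplification: no '>' inside the URI and no ',' inside parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>([^,]*)')
_PARAM = re.compile(r';\s*([A-Za-z0-9*_-]+)\s*=\s*("[^"]*"|[^;\s]*)')


def find_descriptors(link_header, rel="describedby"):
    """Return the target URIs whose rel parameter contains `rel`."""
    targets = []
    for uri, params in _LINK_VALUE.findall(link_header):
        for name, value in _PARAM.findall(params):
            if name.lower() != "rel":
                continue
            # rel may hold several space-separated relation types.
            rel_types = value.strip('"').lower().split()
            if rel in rel_types:
                targets.append(uri)
    return targets


# Example header value as a server might send it (hypothetical URIs).
header = ('<http://example.com/resource;about>; rel="describedby"; '
          'title="Descriptor", '
          '<http://example.com/style.css>; rel="stylesheet"')
print(find_descriptors(header))  # ['http://example.com/resource;about']
```

The client still needs a second request to fetch the descriptor from
the returned URI, which is why this method needs two roundtrips.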
715 Appendix B.2.2. HTTP Response Header Via HEAD
717 Same as the HTTP Response Header method but used with an HTTP HEAD
718 request. The idea of using the HEAD method is to solve the wasteful
719 overhead of including the Link header in every reply. By limiting
720 the appearance of the Link header only to HEAD responses, typical GET
721 requests are not encumbered by the extra bytes.
723 [+] Self Declaration - Same as the HTTP Response Header method.
725 [-] Direct Descriptor Access - Same as the HTTP Response Header
726 method.
728 [-] Web Architecture Compliant - HTTP HEAD should return the exact
729 same response as HTTP GET with the sole exception that the
730 response body is omitted. By adding headers only to the HEAD
731 response, this solution violates the HTTP protocol and might not
732 work properly with proxies as they can return the headers of a
733 cached GET response.
735 [+] Scale and Technology Agnostic - solves the wasted bandwidth
736 associated with the HTTP Response Header method, but still suffers
737 from the limitation imposed by requiring access to HTTP headers.
739 [+] Extensible - Same as the HTTP Response Header method.
741 Minimum roundtrips to retrieve the resource descriptor: 2
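The proxy concern noted for this method can be sketched with a toy
model in Python. Everything here is hypothetical: the origin function,
the ToyProxy class, and the URIs are illustrative stand-ins, and real
proxies apply far more elaborate caching rules. The sketch only shows
why a Link header that appears solely in HEAD responses can be lost
when an intermediary answers HEAD from a cached GET response.

```python
def origin(method, uri):
    """Hypothetical origin that adds the Link header to HEAD responses only."""
    headers = {"Content-Type": "text/html"}
    if method == "HEAD":
        headers["Link"] = '<http://example.com/resource;about>; rel="describedby"'
    body = "<html>...</html>" if method == "GET" else ""
    return headers, body


class ToyProxy:
    """Toy caching proxy that answers HEAD from a cached GET response."""

    def __init__(self):
        self.cached_get = {}

    def request(self, method, uri):
        if method == "GET":
            self.cached_get[uri] = origin("GET", uri)
            return self.cached_get[uri]
        if method == "HEAD" and uri in self.cached_get:
            headers, _body = self.cached_get[uri]
            return dict(headers), ""  # cached GET headers, body omitted
        return origin(method, uri)


proxy = ToyProxy()
uri = "http://example.com/resource"
proxy.request("GET", uri)                # an ordinary client warms the cache
headers, _ = proxy.request("HEAD", uri)  # a discovery client arrives later
print("Link" in headers)                 # False: the descriptor link is lost
```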
743 Appendix B.2.3. HTTP Content Negotiation
745 Using the HTTP Accept request header or Transparent Content
746 Negotiation as defined in [RFC2295], the client informs the server it
747 is interested in the descriptor and not the resource itself, to which
748 the server responds with the descriptor document or its location. In
749 Yadis, the client sends an HTTP GET (or HEAD) request to the resource
750 URI with an Accept header specifying content-type application/xrds+xml.
751 This informs the server of the client's discovery interest; the server
752 may in turn reply with the descriptor document itself, redirect to it,
753 or return its location via the X-XRDS-Location response header.
755 [-] Self Declaration - does not address this requirement, as it
756 focuses on the client declaring its intentions.
758 [+] Direct Descriptor Access - provides a simple method for directly
759 requesting the descriptor document.
761 [-] Web Architecture Compliant - while it can be argued that the
762 descriptor can be considered another representation of the
763 resource, it is very much external to it. Using the Accept header
764 to request a separate resource (as opposed to a different
765 representation of the same resource) violates web architecture.
766 It also prevents using the discovery content-type as a valid
767 (self-standing) web resource having its own descriptor.
769 [-] Scale and Technology Agnostic - requires access to HTTP request
770 and response headers, as well as the registration of multiple
771 handlers for the same resource URI based on the Accept header. In
772 addition, improper use or implementation of the Vary header in
773 conjunction with the Accept header will cause caches to serve the
774 descriptor document instead of the resource itself - a great
775 concern to large providers with frequently visited front-pages.
777 [-] Extensible - applies an implicit relation type to the descriptor
778 mime-type, limiting descriptor formats to a single purpose. It
779 also prevents existing mime-types from being used as a
780 descriptor format.
782 Minimum roundtrips to retrieve the resource descriptor: 1
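The caching hazard described for this method can be illustrated with a
toy model in Python. The ToyCache class and the origin function are
hypothetical and greatly simplified; the "honor_vary" flag stands in
for a cache that correctly includes the Accept header in its cache key
when the origin sends "Vary: Accept". The sketch shows how a cache
keyed on the URI alone hands the descriptor document to an ordinary
client.

```python
class ToyCache:
    """A toy shared cache; `honor_vary` toggles Accept-aware cache keys."""

    def __init__(self, origin, honor_vary):
        self.origin = origin
        self.honor_vary = honor_vary
        self.store = {}

    def get(self, uri, accept):
        key = (uri, accept) if self.honor_vary else uri
        if key not in self.store:
            self.store[key] = self.origin(uri, accept)
        return self.store[key]


def origin(uri, accept):
    """Hypothetical origin server that negotiates on the Accept header."""
    if accept == "application/xrds+xml":
        return "descriptor document"
    return "resource representation"


for honor_vary in (False, True):
    cache = ToyCache(origin, honor_vary)
    cache.get("http://example.com/", "application/xrds+xml")  # discovery client
    served = cache.get("http://example.com/", "text/html")    # ordinary browser
    print(honor_vary, "->", served)
# False -> descriptor document
# True -> resource representation
```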
784 Appendix B.2.4. HTTP Header Negotiation
786 Similar to the HTTP Content Negotiation method, this solution uses a
787 custom HTTP request header to inform the server of the client's
788 discovery intentions. The server responds by serving the same
789 resource representation (via an HTTP GET or HEAD request) with the
790 relevant Link headers. It attempts to solve the HTTP Response Header
791 waste issue by allowing the client to explicitly request the
792 inclusion of Link headers. One such header can be called "Request-
793 links" to inform the server the client would like it to include
794 certain Link headers of a given "rel" type in its reply.
796 [+] Self Declaration - same as HTTP Response Header with the option
797 of selective inclusion.
799 [-] Direct Descriptor Access - does not address.
801 [-] Web Architecture Compliant - HTTP does not include any mechanism
802 for header negotiation and any custom solution will break existing
803 caches.
805 [+-] Scale and Technology Agnostic - requires advanced access to HTTP
806 headers on both the client and server sides, but solves the
807 bandwidth waste issue of the HTTP Response Header method.
809 [+] Extensible - builds on top of Link header extensibility.
811 Minimum roundtrips to retrieve the resource descriptor: 2
813 Appendix B.2.5. <LINK> Element
815 Embeds the location of the descriptor document within the resource
816 representation by leveraging the HTML header <LINK> element (as
817 opposed to the HTTP header). Applies to HTML resource
818 representations or similar markup-based formats with support for
819 "Link"-like elements such as Atom. POWDER uses the <LINK> element in
820 this manner, while XRDS uses the HTML <META> element with an "http-
821 equiv" attribute equal to X-XRDS-Location (to create an embedded
822 version of the X-XRDS-Location custom header).
824 [+] Self Declaration - similar to HTTP Response Header method but
825 limited to HTML resources.
827 [-] Direct Descriptor Access - the method requires fetching the
828 entire resource representation in order to obtain the descriptor
829 location. In addition, it requires changing the resource HTML
830 representation which makes discovery an intrusive process.
832 [+] Web Architecture Compliant - uses the <LINK> element as
833 designed.
835 [+] Scale and Technology Agnostic - while this solution requires
836 direct retrieval of the resource and manipulation of its content,
837 it is extremely accessible in many platforms.
839 [-] Extensible - extensibility is restricted to HTML representations
840 or similar markup formats with support for a similar element.
842 Minimum roundtrips to retrieve the resource descriptor: 2
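A client using this method has to parse the retrieved representation
before it can locate the descriptor. The Python sketch below uses the
standard library HTML parser to collect both the link element with a
"describedby" relation and the meta element carrying an "http-equiv"
value of X-XRDS-Location. The class name, the sample page, and its
URIs are assumptions made for illustration; the sketch does not
restrict matching to the document head as a stricter client might.

```python
from html.parser import HTMLParser


class DescriptorLinkParser(HTMLParser):
    """Collects descriptor locations declared in an HTML document."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link":
            rel_types = attributes.get("rel", "").lower().split()
            if "describedby" in rel_types and attributes.get("href"):
                self.locations.append(attributes["href"])
        elif tag == "meta":
            if attributes.get("http-equiv", "").lower() == "x-xrds-location":
                if attributes.get("content"):
                    self.locations.append(attributes["content"])


# Hypothetical page carrying both styles of embedded descriptor location.
page = """<html><head>
<title>Example</title>
<link rel="describedby" href="http://example.com/resource;about">
<meta http-equiv="X-XRDS-Location" content="http://example.com/resource.xrds">
</head><body><p>Hello</p></body></html>"""

parser = DescriptorLinkParser()
parser.feed(page)
print(parser.locations)
```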
844 Appendix B.2.6. HTTP OPTIONS Method
846 The HTTP OPTIONS method is used to interact with the HTTP server with
847 regard to its capabilities and communication-related information
848 about its resources. The OPTIONS method, together with an optional
849 request header, can be used to request both the descriptor location
850 and descriptor content itself.
852 [-] Self Declaration - does not address.
854 [+] Direct Descriptor Access - provides a clean mechanism for
855 requesting descriptor information about a resource without
856 interacting with it.
858 [+] Web Architecture Compliant - uses an existing HTTP feature.
860 [-] Scale and Technology Agnostic - requires client and server
861 access to the OPTIONS HTTP method. Also does not support caching
862 which makes this solution inefficient.
864 [+] Extensible - built into the OPTIONS method.
866 Minimum roundtrips to retrieve the resource descriptor: 1
868 Appendix B.2.7. WebDAV PROPFIND Method
870 Similar to the HTTP OPTIONS method, the WebDAV PROPFIND method
871 defined in [RFC4918] can be used to request resource specific
872 properties, one of which can hold the location of the descriptor
873 document. PROPFIND, unlike OPTIONS, cannot return the descriptor
874 itself, unless it is returned in the required PROPFIND schema (a
875 multi-status XML element). Other alternatives include URIQA [URIQA],
876 an HTTP extension which defines a method called MGET, and ARK
877 (Archival Resource Key) [ARK] - a method similar to PROPFIND that
878 allows the retrieval of resource attributes using keys (which
879 describe the resource).
881 [-] Self Declaration - does not address.
883 [+-] Direct Descriptor Access - does not require interaction with
884 the resource, but does require at least two requests to get the
885 descriptor (get location, get document).
887 [+] Web Architecture Compliant - uses an HTTP extension with less
888 support than core HTTP, but still based on published standards.
890 [-] Scale and Technology Agnostic - same as the HTTP OPTIONS Method.
892 [+-] Extensible - uses extensible protocols but at the same time
893 depends on solutions that have already gone beyond the standard
894 HTTP protocol, which makes further extensions more complex and
895 unsupported.
897 Minimum roundtrips to retrieve the resource descriptor: 2
899 Appendix B.2.8. Custom HTTP Method
901 Similar to the HTTP OPTIONS Method, a new method can be defined (such
902 as DISCOVER) to return (or redirect to) the descriptor document. The
903 new method can allow caching.
905 [-] Self Declaration - does not address.
907 [+] Direct Descriptor Access - same as the HTTP OPTIONS Method.
909 [-] Web Architecture Compliant - depends heavily on extending every
910 platform to support the extension. Unlikely to be supported by
911 existing proxy services and caches.
913 [-] Scale and Technology Agnostic - same as HTTP OPTIONS Method with
914 the additional burden on smaller sites requiring access to the new
915 protocol.
917 [+] Extensible - new protocol that can extend as needed.
919 Minimum roundtrips to retrieve the resource descriptor: 1
921 Appendix B.2.9. Static Resource URI Transformation
923 Instead of using HTTP facilities to access the descriptor location,
924 this method defines a template to transform any resource URI to the
925 descriptor document URI. This can be done by adding a prefix or
926 suffix to the resource URI, which turns it into a new resource URI.
927 The new URI points to the descriptor document. For example, to fetch
928 the descriptor document for http://example.com/resource, the client
929 makes an HTTP GET request to http://example.com/resource;about using
930 a static template that adds the ";about" suffix.
932 [-] Self Declaration - does not address.
934 [+] Direct Descriptor Access - creates a unique URI for the
935 descriptor document.
937 [+-] Web Architecture Compliant - uses basic HTTP facilities but
938 intrudes on the domain authority namespace as it defines a static
939 template for URI transformation that is not likely to be
940 compatible with many existing URI naming conventions.
942 [+-] Scale and Technology Agnostic - depends on the static mapping
943 chosen. Some hosted environments will have a problem gaining
944 access to the mapped URI based on the URI format chosen.
946 [-] Extensible - provides a very specific and limited method to map
947 between resources and their descriptor, since each relation type
948 must mint its own static template.
950 Minimum roundtrips to retrieve the resource descriptor: 1
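The static transformation can be sketched in a few lines of Python.
The ";about" suffix comes from the example above; the function name
and the handling of query and fragment components are assumptions,
since the text does not say where the suffix goes when the resource
URI carries them.

```python
from urllib.parse import urlsplit, urlunsplit


def static_descriptor_uri(resource_uri, suffix=";about"):
    """Map a resource URI to its descriptor URI with a fixed suffix."""
    parts = urlsplit(resource_uri)
    # Assumption: the suffix is appended to the path, the query is kept,
    # and the fragment is dropped because it is never sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + suffix, parts.query, "")
    )


print(static_descriptor_uri("http://example.com/resource"))
# http://example.com/resource;about
```

Because the mapping is fixed, the client needs no prior request, but
every authority must accept the same suffix in its own URI namespace.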
952 Appendix B.2.10. Dynamic Resource URI Transformation
954 Same as the Static Resource URI Transformation method but with the
955 ability for each domain authority to specify its own discovery
956 transformation template. This can be done by placing a configuration
957 file at a known location (such as robots.txt) which contains the
958 template needed to perform the URL mapping. The client first obtains
959 the configuration document (which may be cached using normal HTTP
960 facilities), parses it, then uses that information to transform the
961 resource URI and access the descriptor document.
963 [+-] Self Declaration - does not address individual resources, but
964 allows entire domains to declare their support (and how to use
965 it).
967 [+-] Direct Descriptor Access - once the mapping template has been
968 obtained, descriptors can be accessed directly.
970 [+-] Web Architecture Compliant - uses an existing known-location
971 design pattern (such as robots.txt) and standard HTTP facilities.
972 The use of a known-location is not ideal and is considered a
973 violation of web architecture, but if it serves as the last of its
974 kind, it can be tolerated. An alternative to the known-location
975 approach can be using DNS to store either the location of the
976 mapping or the map template itself, but DNS adds a layer of
977 complexity not always available.
979 [+-] Scale and Technology Agnostic - works well at the URI authority
980 level (domain) but is inefficient at the URI path level (resource
981 path) and harder to implement when different paths within the same
982 domain need to use different templates. With the decreasing cost
983 of custom domain and sub-domain hosting, this will not be an
984 issue for most services, but it does require sharing configuration
985 at the domain/sub-domain level.
987 [+-] Extensible - can be, depending on the schema used to format the
988 known-location configuration document.
990 Minimum roundtrips to retrieve the resource descriptor: initially 2,
991 1 after caching
993 Appendix C. Acknowledgments
995 With the exception of the host-meta template extension, very little
996 of this memo is original work. Many communities and individuals have
997 been working on solving discovery for many years and this work is a
998 direct result of their hard and dedicated efforts.
1000 Inspiration for this memo derived from previous work on a descriptor
1001 format called XRDS-Simple, which in turn derived from another
1002 descriptor format, XRDS. Previous discovery workflows include Yadis
1003 which is currently used by the OpenID community. While suffering
1004 from significant shortcomings, Yadis was a breakthrough approach to
1005 performing discovery using extremely restricted hosting environments,
1006 and this memo has strived to preserve as much of that spirit as
1007 possible.
1009 The use of Link elements and headers and the introduction of the
1010 "describedby" relation type in this memo is a direct result of the
1011 dedicated work and contribution of Phil Archer to the W3C POWDER
1012 specification and Jonathan Rees to the W3C review of Uniform Access
1013 to Information About. The host-meta approach was first proposed by
1014 Mark Nottingham as an alternative to attaching links directly to
1015 resource representations.
1017 The author wishes to thank the OASIS XRI community for their
1018 support, encouragement, and enthusiasm for this work. Special thanks
1019 go to Lisa Dusseault, Joseph Holsten, Mark Nottingham, John Panzer,
1020 Drummond Reed, and Jonathan Rees for their invaluable feedback.
1022 The author takes all responsibility for errors and omissions.
1024 Appendix D. Document History
1026 [[ to be removed by the RFC editor before publication as an RFC ]]
1028 -02
1029 o Changed focus from an HTTP-based process to Link-based process.
1031 o Completely revised and restructured document for better clarity.
1033 o Realigned the methods to produce consistent results and changed
1034 the way redirections and client-errors are handled.
1036 o Updated to use newer version of site-meta, now called host-meta,
1037 including a new plaintext-based format to replace the previous XML
1038 format.
1040 o Renamed Link-Template to Link-Pattern to avoid future conflict
1041 with a previously proposed Link-Template HTTP header.
1043 o Removed support for the "scheme" Link-Template parameter.
1045 o Replaced restrictions with interoperability recommendations.
1047 o Added IANA considerations per new host-meta registry requirements.
1049 -01
1051 o Renamed 'resource discovery' to 'descriptor discovery'.
1053 o Added informative reference to Metalink.
1055 o Clarified that the resource descriptor URI can use any URI scheme,
1056 not just "http" or "https".
1058 o Removed comment regarding redirects when using <LINK> Elements.
1060 o Clarified that HTTPS must be used with "https" URIs for both Link
1061 headers and host-meta retrieval.
1063 o Removed DNS verification step for host-meta with schemes other
1064 than "http" and "https". Replaced with a general discussion of
1065 authority and a security consideration comment.
1067 o Organized host-meta section into another sub-section level.
1069 o Enlarged the template vocabulary from a single "uri" variable to
1070 include smaller URI components.
1072 o Added informative reference to RFC 2295 in analysis appendix.
1074 -00
1075 o Initial draft.
1077 9. References
1079 9.1. Normative References
1081 [I-D.nottingham-http-link-header]
1082 Nottingham, M., "Link Relations and HTTP Header Linking",
1083 draft-nottingham-http-link-header-03 (work in progress),
1084 November 2008.
1086 [I-D.nottingham-site-meta]
1087 Nottingham, M. and E. Hammer-Lahav, "Host Metadata for the
1088 Web", draft-nottingham-site-meta-01 (work in progress),
1089 February 2009.
1091 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1092 Requirement Levels", BCP 14, RFC 2119, March 1997.
1094 [RFC2295] Holtman, K. and A. Mutz, "Transparent Content Negotiation
1095 in HTTP", RFC 2295, March 1998.
1097 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
1098 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
1099 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
1101 [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
1103 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1104 Resource Identifier (URI): Generic Syntax", STD 66,
1105 RFC 3986, January 2005.
1107 [RFC4287] Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
1108 Syndication Format", RFC 4287, December 2005.
1110 [RFC4918] Dusseault, L., "HTTP Extensions for Web Distributed
1111 Authoring and Versioning (WebDAV)", RFC 4918, June 2007.
1113 [W3C.REC-html401-19991224]
1114 Raggett, D., Jacobs, I., and A. Hors, "HTML 4.01
1115 Specification", World Wide Web Consortium
1116 Recommendation REC-html401-19991224, December 1999,
1117 .
1119 [W3C.REC-xhtml1-20020801]
1120 Pemberton, S., "XHTML[TM] 1.0 The Extensible HyperText
1121 Markup Language (Second Edition)", World Wide Web
1122 Consortium Recommendation REC-xhtml1-20020801,
1123 August 2002,
1124 .
1126 9.2. Informative References
1128 [ARK] Kunze, J. and R. Rodgers, "The ARK Identifier Scheme",
1129 .
1131 [I-D.bryan-metalink]
1132 Bryan, A., "The Metalink Download Description Format",
1133 draft-bryan-metalink-05 (work in progress), January 2009.
1135 [POWDER] Archer, P., Ed., Smith, K., Ed., and A. Perego, Ed.,
1136 "POWDER: Protocol for Web Description Resources",
1137 .
1139 [URIQA] Nokia, "The URI Query Agent Model",
1140 .
1142 [XRD] Hammer-Lahav, E., Ed., "XRD 1.0 [[ replace with new XRD
1143 specification reference ]]".
1145 [XRDS] Wachob, G., Reed, D., Chasen, L., Tan, W., and S.
1146 Churchill, "Extensible Resource Identifier (XRI)
1147 Resolution V2.0", .
1150 [XRDS-Simple]
1151 Hammer-Lahav, E., "XRDS-Simple 1.0",
1152 .
1154 [Yadis] Miller, J., "Yadis Specification 1.0",
1155 .
1157 URIs
1159 [1]
1161 Author's Address
1163 Eran Hammer-Lahav
1164 Yahoo!
1166 Email: eran@hueniverse.com
1167 URI: http://hueniverse.com