idnits 2.17.1
draft-hammer-discovery-03.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** The document seems to lack a License Notice according IETF Trust
Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
Section 6.b -- however, there's a paragraph with a matching beginning.
Boilerplate error?
(You're using the IETF Trust Provisions' Section 6.b License Notice from
12 Feb 2009 rather than one of the newer Notices. See
https://trustee.ietf.org/license-info/.)
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
== Line 467 has weird spacing: '... query frag...'
-- The document date (March 23, 2009) is 5511 days in the past. Is this
intentional?
Checking references for intended status: Informational
----------------------------------------------------------------------------
== Missing Reference: '-' is mentioned on line 945, but not defined
== Missing Reference: 'TM' is mentioned on line 1126, but not defined
== Unused Reference: 'RFC2818' is defined on line 1107, but no explicit
reference was found in the text
== Outdated reference: A later version (-10) exists of
draft-nottingham-http-link-header-03
== Outdated reference: A later version (-05) exists of
draft-nottingham-site-meta-01
** Obsolete normative reference: RFC 2616 (Obsoleted by RFC 7230, RFC 7231,
RFC 7232, RFC 7233, RFC 7234, RFC 7235)
** Obsolete normative reference: RFC 2818 (Obsoleted by RFC 9110)
== Outdated reference: A later version (-28) exists of
draft-bryan-metalink-05
Summary: 3 errors (**), 0 flaws (~~), 8 warnings (==), 1 comment (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group E. Hammer-Lahav
3 Internet-Draft Yahoo!
4 Intended status: Informational March 23, 2009
5 Expires: September 24, 2009
7 Link-based Resource Descriptor Discovery
8 draft-hammer-discovery-03
10 Status of this Memo
12 This Internet-Draft is submitted to IETF in full conformance with the
13 provisions of BCP 78 and BCP 79.
15 Internet-Drafts are working documents of the Internet Engineering
16 Task Force (IETF), its areas, and its working groups. Note that
17 other groups may also distribute working documents as Internet-
18 Drafts.
20 Internet-Drafts are draft documents valid for a maximum of six months
21 and may be updated, replaced, or obsoleted by other documents at any
22 time. It is inappropriate to use Internet-Drafts as reference
23 material or to cite them other than as "work in progress."
25 The list of current Internet-Drafts can be accessed at
26 http://www.ietf.org/ietf/1id-abstracts.txt.
28 The list of Internet-Draft Shadow Directories can be accessed at
29 http://www.ietf.org/shadow.html.
31 This Internet-Draft will expire on September 24, 2009.
33 Copyright Notice
35 Copyright (c) 2009 IETF Trust and the persons identified as the
36 document authors. All rights reserved.
38 This document is subject to BCP 78 and the IETF Trust's Legal
39 Provisions Relating to IETF Documents in effect on the date of
40 publication of this document (http://trustee.ietf.org/license-info).
41 Please review these documents carefully, as they describe your rights
42 and restrictions with respect to this document.
44 Abstract
46 This memo describes LRDD (pronounced 'lard'), a process for obtaining
47 information about a resource identified by a URI. The 'information
48 about a resource', a resource descriptor, provides machine-readable
49 information that aims to increase interoperability and enhance the
50 interaction with the resource. This memo only defines the process
51 for locating and obtaining the descriptor, but leaves the descriptor
52 format and its interpretation out of scope.
54 Table of Contents
56 1. Introduction
57 2. Notational Conventions
58 3. The describedby Link Relation
59 4. Identifying Descriptor Location
60 4.1. Method Selection
61 4.2. The <Link> Element
62 4.3. The HTTP Link Header
63 4.4. The Host Metadata Document
64 5. Obtaining Resource Descriptor
65 6. The Link-Pattern host-meta Field
66 6.1. Template Syntax
67 7. Security Considerations
68 8. IANA Considerations
69 8.1. The Link-Pattern host-meta Field
70 8.2. The describedby Relation Type
71 Appendix A. Descriptor Discovery vs. Service Discovery
72 Appendix B. Methods Suitability Analysis
73 Appendix B.1. Requirements
74 Appendix B.2. Analysis
75 Appendix C. Acknowledgments
76 Appendix D. Document History
77 9. References
78 9.1. Normative References
79 9.2. Informative References
80 Author's Address
82 1. Introduction
84 This memo defines a process for locating descriptors for resources
85 identified with URIs. Resource descriptors are documents (usually
86 based on well known serialization languages such as XML, RDF, and
87 JSON) which provide machine-readable information about resources
88 (resource metadata) for the purpose of promoting interoperability and
89 assisting in interacting with unknown resources that support known
90 interfaces.
92 While many methods provide the ability to link a resource to its
93 metadata, none of these methods fully address the requirements of a
94 uniform and easily implementable process. These requirements include
95 the ability for resources to self-declare the location of their
96 descriptors, the ability to access descriptors directly without
97 interacting with the resource, and support for a wide range of platforms
98 and scale of deployment. They must also be fully compliant with
99 existing web protocols, and support extensibility. These
100 requirements, and the analysis used as the basis for this memo, are
101 explained in detail in Appendix B.
103 For example, a web page about an upcoming meeting can provide in its
104 descriptor document the location of the meeting organizer's free/busy
105 information to potentially negotiate a different time. A social
106 network profile page descriptor can identify the location of the
107 user's address book as well as accounts on other sites. A web
108 service implementing an API with optional components can advertise
109 which of these are supported.
111 This memo describes the first step in the discovery process in which
112 the resource descriptor document is located and retrieved. Other
113 steps, which are outside the scope of this memo, include parsing the
114 descriptor document based on its format (such as POWDER [POWDER], XRD
115 [XRD], and Metalink [I-D.bryan-metalink]) and utilizing it based on
116 the application.
118 Discovery can be performed before, after, or without obtaining a
119 representation of the resource. Performing discovery ahead of
120 accessing a representation allows the client not to rely on
121 assumptions about the properties of the resource. Performing
122 discovery after a representation has been obtained enables further
123 interaction with it.
125 Given the wide range of 'information about a resource', no single
126 descriptor format can adequately accommodate such scope. However,
127 there is great value in making the process of locating the descriptor
128 uniform across formats. While HTTP is the most common protocol used
129 in association with discovery and is explicitly specified in this
130 memo, other protocols MAY be used.
132 Please discuss this draft on the www-talk@w3.org [1] mailing list.
134 2. Notational Conventions
136 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
137 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
138 document are to be interpreted as described in [RFC2119].
140 This document uses the Augmented Backus-Naur Form (ABNF) notation of
141 [RFC2616]. Additionally, the following rules are included from
142 [RFC3986]: reserved and unreserved, and from
143 [I-D.nottingham-http-link-header]: link-param.
145 3. The describedby Link Relation
147 The methods described in this memo express the location of the
148 resource descriptor as a link relation, utilizing the link framework
149 defined by [I-D.nottingham-http-link-header]. The association of a
150 descriptor document with the resource it describes is declared using
151 the "describedby" link relation type.
153 The "describedby" link relation is defined in [POWDER] and registered
154 as:
156 The relationship A "describedby" B asserts that resource B
157 provides a description of resource A. There are no constraints on
158 the format or representation of either A or B, neither are there
159 any further constraints on either resource.
161 Since a single resource can have many descriptors, the "describedby"
162 link relation has a one-to-many structure (the question whether a
163 single descriptor can describe multiple resources is outside the
164 scope of this memo). In the case of multiple "describedby" links
165 obtained from a single method, selecting which link to use is
166 application-specific.
168 To promote interoperability, applications referencing this memo
169 SHOULD clearly define the application-specific criteria used to
170 select between "describedby" links. This MAY be done by:
172 o Supporting a single descriptor format, or defining an order of
173 precedence for multiple descriptor formats. Applications MAY
174 require the presence of the link "type" attribute with the mime-
175 type of the required format.
177 o Using the "describedby" relation type together with another
178 application-specific relation type in the same link. The
179 application-specific relation type can be registered or an
180 extension.
182 o Specifying additional link attributes using link-extensions.
184 Link selection MUST NOT depend on the order in which multiple links
185 are obtained from a single method. Applications MUST NOT impose
186 constraints on the usage of the "describedby" relation type as it is
187 likely to be used by other applications in association with the same
188 resource.
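The selection rules above can be sketched as follows. This is a minimal illustration, not part of the memo: the Link class, the PREFERRED_TYPES order of precedence, and the choice to break ties by sorting on the target URI are all assumptions made for the example. The only properties taken from the text are that an application may define a format precedence using the link "type" attribute and that the result must not depend on the order in which links were obtained.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

@dataclass(frozen=True)
class Link:
    """Hypothetical in-memory form of one link obtained from a single method."""
    href: str
    rel: FrozenSet[str]
    type: Optional[str] = None

# Assumed application-specific order of precedence, most preferred first.
PREFERRED_TYPES = ("application/xrd+xml", "application/powder+xml")

def select_descriptor_link(links: Iterable[Link],
                           preferred_types=PREFERRED_TYPES) -> Optional[str]:
    """Pick one "describedby" target without relying on input order."""
    candidates = [l for l in links if "describedby" in l.rel]
    for media_type in preferred_types:
        # Sorting makes the choice independent of the order of `links`.
        matches = sorted(l.href for l in candidates if l.type == media_type)
        if matches:
            return matches[0]
    return None  # no link in a supported format
```

Links that carry other relation types alongside "describedby" are still candidates, since the function only checks membership in the relation set.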
190 4. Identifying Descriptor Location
192 The descriptor location (URI) is a function of the resource URI.
193 This section defines three methods which together satisfy the
194 requirements defined in Appendix B. While each method on its own
195 satisfies the requirements partially, together they provide enough
196 flexibility for most use cases. Each of the following three methods
197 is performed by using the resource URI to identify its descriptor
198 URI.
200 In many cases, a request for one URI leads to requesting other URIs,
201 as is the case with HTTP redirections. Because the decision whether
202 to use such URIs is application-specific, discovery is constrained to
203 a single URI identifying the resource. Any other resource URIs
204 received MUST be considered as a separate and discrete input into the
205 discovery function. If a resource URI obtained during the
206 performance of these methods is found to be more relevant to the
207 application, the discovery process MUST be restarted with the new
208 resource URI as its input.
210 For example, an HTTP HEAD request for URI A returns a redirect (307)
211 response with a set of "describedby" links, and identifies the
212 temporary location of the representation at URI B. An HTTP HEAD
213 request for URI B returns a successful (200) response with its own
214 set of "describedby" links. An application MAY choose to define a
215 process in which the two sets of links are obtained, prioritized, and
216 utilized; however, it MUST do so by explicitly instructing the client
217 to perform discovery multiple times, as each is considered a separate
218 and distinct discovery.
220 4.1. Method Selection
222 Each method presents a different set of requirements. The criteria
223 used to determine which methods a server SHOULD support and client
224 SHOULD attempt are based on a combination of factors:
226 o The ability to offer and obtain a representation of the resource
227 by dereferencing its URI.
229 o The availability of a representation supporting markup
230 compatible with [I-D.nottingham-http-link-header].
232 o The availability of an HTTP representation of the resource and the
233 ability to provide and access link information in its response
234 header.
236 The methods are listed based on the restrictiveness of their
237 requirements in descending order, from the most specialized to the
238 most generic. This ordering, however, does not imply the order in
239 which multiple applicable methods should be attempted. Because
240 different methods are more appropriate in different circumstances, it
241 is up to each application to define how they should be used together.
243 To promote interoperability, applications referencing this memo MUST
244 clearly define the relationship between the three methods as either:
246 o equal, where all methods MUST produce the same set of resource
247 descriptors and clients MAY attempt any method according to
248 their capabilities, or
250 o with an application-specific order of precedence, where methods
251 MUST be attempted in a specific order.
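As an illustration of the second option, an application-defined order of precedence might be driven by a small loop like the one below. This is only a sketch under stated assumptions: the function name discover, the convention that each method is a callable returning a list of descriptor URIs (empty when the method fails), and the stub methods in the usage note are all hypothetical.

```python
from typing import Callable, List, Sequence

# Assumed shape of a discovery method: resource URI in, descriptor URIs out.
DiscoveryMethod = Callable[[str], List[str]]

def discover(resource_uri: str, methods: Sequence[DiscoveryMethod]) -> List[str]:
    """Try each method in the application-defined order; stop at the first
    one that yields any descriptor links."""
    for method in methods:
        links = method(resource_uri)
        if links:
            return links
    return []  # every method failed or found nothing
```

For example, an application that prefers host-meta over the Link header would pass its two (hypothetical) method functions as `discover(uri, [via_host_meta, via_link_header])`.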
253 4.2. The <Link> Element
255 The <Link> element method is limited to resources with an available
256 markup representation that supports typed-relations using the <Link>
257 element, such as HTML [W3C.REC-html401-19991224], XHTML
258 [W3C.REC-xhtml1-20020801], and Atom [RFC4287]. Other markup formats
259 are permitted as long as the semantics of their <Link> elements are
260 fully compatible with the link framework defined in
261 [I-D.nottingham-http-link-header]. This method requires the
262 retrieval of a resource representation. While HTTP is the most
263 common transport for such documents, this method is transport
264 independent.
266 For example:
268
271 A client trying to obtain the location of the resource's descriptor
272 using this method SHALL:
274 1. Retrieve a representation of the resource using the applicable
275 transport for that resource URI. If the markup document is
276 obtained using HTTP, it MUST only be used by the client if the
277 document is a valid representation of the resource identified by
278 the HTTP request URI, typically in a response with a successful
279 (2xx) or redirection (3xx) status code. If no such valid
280 representation of the request URI is found, the method fails.
282 2. Parse the document as defined by its format specification and
283 look for <Link> elements with a "rel" attribute value containing
284 the "describedby" relation. The client MUST obey the document
285 markup schema and ignore any invalid elements (such as <Link>
286 elements outside the <head> section of an HTML document). This
287 is done to prevent unintentional markup from other parts of the
288 document from being used for discovery purposes, which can have a
289 significant impact on usability and security.
291 3. Narrow down the selection if more than one "describedby" link is
292 found, following the application-specific criteria. The
293 descriptor location is obtained from the value of the "href"
294 attribute in the selected <Link> element.
296 <Link> elements MAY include other relation types together with
297 "describedby" in a single "rel" attribute (for example
298 'rel="describedby copyright"'). Clients MUST properly process
299 such multiple relation "rel" attributes as defined by the format
300 specification.
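For an HTML representation, step 2 above could look roughly like the following sketch. It uses Python's standard html.parser module. The class and function names and the sample document in the usage note are hypothetical, and the code only covers HTML (not XHTML or Atom). It accepts link elements only while inside the head section and treats "rel" as a space-separated list.

```python
from html.parser import HTMLParser

class DescribedbyLinkParser(HTMLParser):
    """Collect (href, type) pairs from "describedby" links in the HTML head."""

    def __init__(self):
        super().__init__()
        self._in_head = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "head":
            self._in_head = True
        elif tag == "link" and self._in_head:
            attr = dict(attrs)
            rels = (attr.get("rel") or "").lower().split()
            if "describedby" in rels and attr.get("href"):
                self.links.append((attr["href"], attr.get("type")))

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False

def find_describedby_links(html_text):
    parser = DescribedbyLinkParser()
    parser.feed(html_text)
    return parser.links
```

A link element placed in the body is ignored by this sketch, which mirrors the requirement to disregard elements that are invalid under the document's markup schema. Selecting among several results remains application-specific.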
302 4.3. The HTTP Link Header
304 The HTTP Link header method is limited to resources for which an HTTP
305 GET or HEAD request returns a 2xx, 3xx, or 4xx HTTP response
306 [RFC2616]. This method uses the Link header defined in
307 [I-D.nottingham-http-link-header] and requires the retrieval of a
308 resource representation header.
310 For example:
312 Link: ; rel="describedby";
313 type="application/powder+xml"
315 A client trying to obtain the location of the resource's descriptor
316 using this method SHALL:
318 1. Make an HTTP (or HTTPS as required) GET or HEAD request to the
319 resource URI to obtain a valid response header. If the HTTP
320 response carries a status code other than successful (2xx),
321 redirection (3xx), or client error (4xx), the method fails.
323 2. Parse the HTTP response header and look for Link headers with a
324 "rel" parameter value containing the "describedby" relation.
326 3. Narrow down the selection if more than one "describedby" link is
327 found, following the application-specific criteria. The
328 descriptor location is obtained from the "<>" enclosed URI-
329 reference in the selected Link header.
331 Link headers MAY include other relation types together with
332 "describedby" in a single "rel" parameter (for example
333 'rel="describedby copyright"'). Clients MUST properly process
334 such multiple relation "rel" parameters as defined by
335 [I-D.nottingham-http-link-header].
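Steps 2 and 3 above can be approximated with a small parser such as the one below. It is a deliberately simplified sketch: the regular expressions handle the common "<uri>; name=value" form with quoted or token values, but they are not a complete implementation of the Link header grammar, and relative URI-references are returned unresolved. The function names and the sample header value are assumptions for illustration.

```python
import re

# One link-value: "<target>" followed by zero or more ";name[=value]" parameters.
_LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[^\s=;,]+(?:\s*=\s*(?:"[^"]*"|[^\s";,]*))?)*)')
_PARAM_RE = re.compile(
    r';\s*([^\s=;,]+)(?:\s*=\s*(?:"([^"]*)"|([^\s";,]*)))?')

def parse_link_header(value):
    """Return a list of (target, params) tuples from one Link header value."""
    links = []
    for match in _LINK_RE.finditer(value):
        params = {}
        for name, quoted, token in _PARAM_RE.findall(match.group(2)):
            # Keep the first occurrence of a repeated parameter.
            params.setdefault(name.lower(), quoted or token)
        links.append((match.group(1), params))
    return links

def describedby_targets(header_values):
    """Collect (target, type) for every link whose rel contains describedby."""
    found = []
    for value in header_values:
        for target, params in parse_link_header(value):
            if "describedby" in params.get("rel", "").lower().split():
                found.append((target, params.get("type")))
    return found
```

The header values would typically come from an HTTP client's response object. Choosing among multiple results is left to the application-specific criteria.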
337 4.4. The Host Metadata Document
339 The host metadata document method is available for any resource
340 identified by a URI whose authority supports the host-meta document
341 defined in [I-D.nottingham-site-meta]. This method does not require
342 obtaining any representation of the resource, and operates solely
343 using the resource URI.
345 The link relation between the resource URI and the descriptor URI is
346 obtained by using a template contained in the host-meta document. By
347 applying the host-wide template to an individual resource URI, a
348 resource-specific link is produced which can be used to indicate the
349 location of the descriptor document for that resource, bypassing the
350 need to access or provide a representation for it.
352 For example (line breaks are for formatting only, and are not allowed
353 in the document):
355 Link-Pattern: <{uri};about>; rel="describedby";
356 type="application/powder+xml"
358 A client trying to obtain the location of the resource's descriptor
359 using this method SHALL:
361 1. Retrieve the host-meta document for the URI's authority as defined by
362 [I-D.nottingham-site-meta] section 4. If the request fails to
363 retrieve a valid host-meta document, the method fails.
365 2. Parse the host-meta document and look for Link-Pattern fields with a
366 "rel" attribute value containing the "describedby" relation.
368 3. Narrow down the selection if more than one "describedby" link is
369 found, following the application-specific criteria. The
370 descriptor location is constructed by applying the template
371 obtained from the selected Link-Pattern field to the resource URI
372 as described by Section 6.1.
374 Link-Pattern MAY include other relation types together with
375 "describedby" in a single "rel" parameter (for example
376 'rel="describedby copyright"'). Clients MUST properly process
377 such multiple relation "rel" parameters as defined by Section 6.
379 5. Obtaining Resource Descriptor
381 Once the desired descriptor URI has been obtained, the descriptor
382 document is retrieved. If the descriptor URI scheme is "http" or
383 "https", the document is obtained via an HTTP (or HTTPS as required)
384 GET request to the identified URI. The client MUST obey HTTP
385 redirections (3xx), and the descriptor document is considered valid
386 only if retrieved with a successful HTTP response status (2xx).
388 6. The Link-Pattern host-meta Field
390 The Link host-meta field [I-D.nottingham-site-meta] conveys a link
391 relation between all resource URIs under the host-meta authority and
392 a common target URI. However, there are cases in which relations of
393 different resources with the same authority do not share the same
394 target URI, but do follow a common pattern in how the target URI is
395 constructed.
397 For example, a news site with multiple authors can provide
398 information about each article's author by appending a suffix (such
399 as ";by") to the URI of each article. Each article has a unique
400 author, but all share the same pattern of where that information is
401 located. The same information can be provided using an HTTP link
402 header or HTML <Link> element, but in a less efficient manner when a
403 single pattern can provide the same information:
405 Link-Pattern: <{uri};by>; rel="author"
407 The Link-Pattern host-meta field uses a slightly modified syntax of
408 the HTTP Link header [I-D.nottingham-http-link-header] to convey
409 relations whose context is individual resources with the same
410 authority as the host-meta document, and whose target is constructed
411 by applying a template to the context URI. The field is not specific
412 to any relation type and MAY be used to express any relations
413 supported by the Link header [I-D.nottingham-http-link-header].
415 The Link-Pattern host-meta field differs from the HTTP Link header in
416 the following respects:
418 o The "<>" enclosed token is not a valid URI, but instead contains a
419 template as defined in Section 6.1.
421 o Its context URI is defined as the individual resource URI used as
422 input to the template.
424 o If the resulting target URI expressed by the template is relative,
425 its base URI is the root resource of the authority.
427 Link-Pattern = "Link-Pattern" ":" #pattern-value
429 pattern-value = "<" template ">" *( ";" link-param )
431 template = *( uri-char | "{" [ "%" ] var-name "}" )
433 uri-char = ( reserved | unreserved )
435 var-name = "scheme" | "authority" | "path"
436 | "query" | "fragment" | "userinfo"
437 | "host" | "port" | "uri"
439 [[ should this spec define a filter/map parameter that will allow
440 applying link patterns to subsets of the host-meta scope? This can
441 use a regular expression match or something similar to robots.txt.
442 If the spec will end up not directly supporting this feature, I will
443 add a note suggesting that such a feature could be defined elsewhere
444 as an extension. ]]
446 6.1. Template Syntax
448 The template syntax provides a simple format for URI transformation.
449 A template is a string containing brace-enclosed ("{}") variable
450 names marking the parts of the string that are to be substituted by
451 the variable values. A template is transformed into a URI by
452 substituting the variables with their calculated value. If a
453 variable name is prefixed by "%", any character in the variable value
454 other than unreserved MUST be percent-encoded per [RFC3986].
456 To construct a URI using a template, the input URI is parsed into its
457 URI components and each component value assigned to a variable name.
458 The template variable substitution is based on the URI vocabulary
459 defined by [RFC3986] section 3 and includes: "scheme", "authority",
460 "path", "query", "fragment", "userinfo", "host", and "port". In
461 addition, it defines the "uri" variable as the entire input URI
462 excluding the fragment component and the "#" fragment separator.
464    foo://william@example.com:8080/over/there?name=ferret#nose
465    \_/   \______________________/\_________/ \_________/ \__/
466     |               |                 |           |        |
467  scheme         authority            path       query  fragment
469    foo://william@example.com:8080/over/there?name=ferret#nose
470          \_____/ \_________/ \__/
471             |         |       |
472         userinfo    host     port
474    foo://william@example.com:8080/over/there?name=ferret#nose
475    \___________________________________________________/
476                              |
477                             uri
479 For example, given the input URI "http://example.com/r/1?f=xml#top",
480 each of the following templates will produce the associated output
481 URI:
483 http://example.org?q={%uri} -->
484 http://example.org?q=http%3A%2F%2Fexample.com%2Fr%2F1%3Ff%3Dxml
486 http://meta.{host}:8080{path}?{query} -->
487 http://meta.example.com:8080/r/1?f=xml
489 https://{authority}/v1{path}#{fragment} -->
490 https://example.com/v1/r/1#top
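The substitution rules of this section can be sketched in a few lines of Python. This is an illustrative reading of the template syntax, not a reference implementation: the function names are hypothetical, the component split relies on urllib.parse.urlsplit, and the simple host/port split does not handle IPv6 literals. Brace-enclosed names outside the defined variable list are left untouched here, a case the memo does not address.

```python
import re
from urllib.parse import quote, urlsplit

_VAR_RE = re.compile(
    r"\{(%?)(scheme|authority|path|query|fragment|userinfo|host|port|uri)\}")

def template_variables(input_uri):
    """Split the input URI into the variables named in the template syntax."""
    parts = urlsplit(input_uri)
    authority = parts.netloc
    userinfo, _, hostport = authority.rpartition("@")
    host, _, port = hostport.partition(":")  # assumption: no IPv6 literal
    return {
        "scheme": parts.scheme,
        "authority": authority,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
        "userinfo": userinfo,
        "host": host,
        "port": port,
        # Entire input URI without the fragment and its "#" separator.
        "uri": input_uri.split("#", 1)[0],
    }

def expand_template(template, input_uri):
    """Replace each {var} or {%var} in the template with its value."""
    values = template_variables(input_uri)

    def substitute(match):
        value = values[match.group(2)]
        if match.group(1):
            # "%" prefix: percent-encode everything except unreserved characters.
            value = quote(value, safe="")
        return value

    return _VAR_RE.sub(substitute, template)
```

Applied to the input URI used in the examples above, `expand_template("http://meta.{host}:8080{path}?{query}", "http://example.com/r/1?f=xml#top")` yields `http://meta.example.com:8080/r/1?f=xml`.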
492 7. Security Considerations
494 The methods used to perform discovery are not secure, private or
495 integrity-guaranteed, and due caution should be exercised when using
496 them. Applications that perform discovery should consider the attack
497 vectors opened by automatically following, trusting, or otherwise
498 using links gathered from <Link> elements, HTTP Link headers, or
499 host-meta documents.
501 8. IANA Considerations
503 8.1. The Link-Pattern host-meta Field
505 This specification registers the Link-Pattern host-meta field in the
506 host-meta Field Registry [I-D.nottingham-site-meta].
508 Field Name: Link-Pattern
510 Change controller: IETF
512 Specification document(s): [[ this document ]]
514 Related information: [I-D.nottingham-http-link-header]
516 8.2. The describedby Relation Type
518 [[ this section will be removed if the "describedby" relation type is
519 registered by the time it is published ]]
521 This specification registers the "describedby" relation type in the
522 Link Relation Type Registry [I-D.nottingham-http-link-header].
524 o Relation Name: describedby
526 o Description: The relationship A "describedby" B asserts that
527 resource B provides a description of resource A. There are no
528 constraints on the format or representation of either A or B,
529 neither are there any further constraints on either resource.
531 o Documentation: [POWDER]
533 Appendix A. Descriptor Discovery vs. Service Discovery
535 Descriptor discovery provides a process for obtaining information
536 about a resource identified with a URI. It allows servers to
537 describe their resources in a machine-readable format, enabling
538 automatic interoperability by user-agents and resource consuming
539 applications. Discovery enables applications to utilize a wide range
540 of web services and resources across multiple providers without the
541 need to know about their capabilities in advance, reducing the need
542 for manual configuration and resource-specific software.
544 When discussing discovery, it is important to differentiate between
545 descriptor discovery and service discovery. Both types attempt to
546 associate capabilities with resources, but they approach it from
547 opposite ends.
549 Service discovery centers on identifying the location of qualified
550 resources, typically finding an endpoint capable of certain protocols
551 and capabilities. In contrast, descriptor discovery begins with a
552 resource, trying to find which capabilities it supports.
554 A simple way to distinguish between the two types of discovery is to
555 define the questions they are each trying to answer:
557 Descriptor-Discovery: Given a resource, what are its attributes:
558 capabilities, characteristics, and relationships to other
559 resources?
561 Service-Discovery: Given a set of attributes, which available
562 resources match the desired set and what is their location?
564 While this memo deals exclusively with descriptor discovery, it is
565 important to note that the two discovery types are closely related
566 and are usually used in tandem. In fact, a typical use case will
567 switch between service discovery and descriptor discovery multiple
568 times in a single workflow, and can start with either one.
570 One reason for this dependency between the two discovery types is
571 that resource descriptors usually contain not only a list of
572 capabilities, but also relationships to other resources. Since those
573 relationships are usually typed, the process in which an application
574 chooses which links to use is in fact service discovery.
576 Applications use descriptor discovery to obtain the list of links,
577 and service discovery to choose the relevant links. In another
578 common example, the application uses service discovery to find a
579 resource with a given capability, then uses descriptor discovery to
580 find out what other capabilities it supports.
582 Appendix B. Methods Suitability Analysis
584 Due to the wide range of use cases requiring resource descriptors,
585 and the desire to reuse as much as possible, no single solution has
586 been found to sufficiently cover the requirements for linking between
587 the resource URI and the descriptor URI. The following analysis
588 attempts to list all the methods proposed for addressing descriptor
589 discovery. It is included here to provide background information as
590 to why certain methods have been selected while others were rejected from
591 the discovery process. It has been updated to match the terms used
592 in this memo and its structure.
594 Appendix B.1. Requirements
596 Getting from a resource URI to its descriptor document can be
597 implemented in many ways. The problem is that none of the current
598 methods address all of the requirements presented by the common use
599 cases. The requirements are simple, but the more we try to address,
600 the less elegant and accessible the process becomes. While working
601 on the now defunct XRDS-Simple specification [XRDS-Simple] and
602 talking to companies and individuals about it, the following
603 requirements emerged for any proposed process:
605 Self Declaration:
607 Allow resources to declare the availability of descriptor
608 information and its location. When a resource is accessed, it
609 needs to have a way to communicate to the client that it
610 supports the discovery protocol and to indicate the location
611 of such descriptor.
613 This is useful when the client is able to interact, or is already
614 interacting, with the resource but can enhance its interaction
615 with additional information. For example, accessing a blog
616 page can be enhanced if the client learns that it was generated
617 from an Atom feed or Atom entry and that the feed supports Atom authoring.
619 Direct Descriptor Access:
621 Enable direct retrieval of the resource descriptor without
622 interacting with the resource itself. Before a resource is
623 accessed, the client should have a way to obtain the resource
624 descriptor without accessing the resource. This is important
625 for two reasons.
627 First, accessing an unknown resource may have undesirable
628 consequences. After all, the information contained in the
629 descriptor is supposed to inform the client how to interact
630 with the resource. The second is efficiency - removing the
631 need to first obtain the resource in order to get its
632 descriptor (reducing HTTP round-trips, network bandwidth, and
633 application latency).
635 Web Architecture Compliant:
637 Work with well-established web infrastructure. This may sound
638 obvious but it is in fact the most complex requirement.
639 Deploying new extensions to the HTTP protocol is a complicated
640 endeavor. Beside getting applications to support a new header,
641 method, or content negotiation, existing caches and proxies
642 must be enhanced to properly handle these requests, and they
643 must not fail performing their normal duties without such
644 enhancements.
646 For example, a new content negotiation method may cause an
647 existing cache to serve the wrong data to a non-discovery
648 client due to its inability to distinguish the metadata request
649 from the resource representation request.
651 Scale and Technology Agnostic:
653 Support large and small web providers regardless of the size of
654 operations and deployment. Any solution must work for a small
655 hosted web site as well as the world's largest search engine. It
656 must be flexible enough to allow developers with restricted
657 access to the full HTTP protocol (such as limited access to
658 request or response headers) to be able to both provide and
659 consume resource descriptors. Any solution should also support
660 caching as much as possible and allow reuse of source code and
661 data.
663 Extensible:
665 Accommodate future enhancements and unknown descriptor formats.
666 It should support the existing set of descriptor formats such
667 as XRD and POWDER, as well as new descriptor relationships that
668 might emerge in the future. In addition, the solution should
669 not depend on the descriptor format itself and should work
670 equally well with any document format - it should aim to keep the
671 road and destination separate.
673 Appendix B.2. Analysis
675 The following is a list of proposed and implemented methods trying to
676 address descriptor discovery. Each method is reviewed for its
677 compliance with the requirements identified previously. The [-],
678 [+], or [+-] symbols next to each requirement indicate how well the
679 method complies with the requirement.
681 Appendix B.2.1. HTTP Response Header
683 When a resource representation is retrieved using an HTTP GET
684 request, the server includes in the response a header pointing to the
685 location of the descriptor document. For example, POWDER uses the
686 "Link" response header to create an association between the resource
687 and its descriptor. XRDS [XRDS] (based on the Yadis protocol
688 [Yadis]) uses a similar approach, but since the Link header was not
689 available when Yadis was first drafted, it defines a custom header
690 X-XRDS-Location which serves a similar but less generic purpose.
692 [+] Self Declaration - using the Link header, any resource can point
693 to its descriptor documents.
695 [-] Direct Descriptor Access - the header is only accessible when
696 requesting the resource itself via an HTTP GET request. While
697 HTTP GET is meant to be a safe operation, it is still possible for
698 some resources to have side-effects.
700 [+] Web Architecture Compliant - uses the Link header which is an
701 IETF Internet Standard [[ currently a standard-track draft ]], and
702 is consistent with HTTP protocol design.
704 [-] Scale and Technology Agnostic - since discovery accounts for a
705 small percent of resource requests, the extra Link header is
706 wasteful. For some hosted servers, access to HTTP headers is
707 limited and will prevent implementation.
709 [+] Extensible - the Link header provides built-in extensibility by
710 allowing new link relations, mime-types, and other extensions.
712 Minimum roundtrips to retrieve the resource descriptor: 2
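As an illustration of the second step of this method, the following
Python sketch extracts descriptor locations from a Link header value
the client has already received. It is a simplified reading of the
Link header syntax, not a complete parser: it assumes parameter
values do not contain commas or semicolons, and the function name and
example URIs are hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# One parameter, with either a quoted or an unquoted value.
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def find_descriptor_links(link_header, rel="describedby"):
    """Return the target URIs whose rel parameter includes `rel`."""
    targets = []
    for target, params in _LINK_VALUE.findall(link_header):
        for name, quoted, token in _PARAM.findall(params):
            value = quoted or token
            # A rel value may list several space-separated relation types.
            if name.lower() == "rel" and rel in value.lower().split():
                targets.append(target)
                break
    return targets


example = ('<http://example.com/resource;about>; rel="describedby"; '
           'type="application/xrd+xml", '
           '<http://example.com/style.css>; rel="stylesheet"')
descriptor_links = find_descriptor_links(example)
```

The client would then issue a second request to each returned URI,
which accounts for the two roundtrips noted above.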
714 Appendix B.2.2. HTTP Response Header Via HEAD
716 Same as the HTTP Response Header method but used with an HTTP HEAD
717 request. The idea of using the HEAD method is to solve the wasteful
718 overhead of including the Link header in every reply. By limiting
719 the appearance of the Link header only to HEAD responses, typical GET
720 requests are not encumbered by the extra bytes.
722 [+] Self Declaration - Same as the HTTP Response Header method.
724 [-] Direct Descriptor Access - Same as the HTTP Response Header
725 method.
727 [-] Web Architecture Compliant - HTTP HEAD should return the exact
728 same response as HTTP GET with the sole exception that the
729 response body is omitted. By adding headers only to the HEAD
730 response, this solution violates the HTTP protocol and might not
731 work properly with proxies as they can return the headers of the
732 cached GET response.
734 [+] Scale and Technology Agnostic - solves the wasted bandwidth
735 associated with the HTTP Response Header method, but still suffers
736 from the limitation imposed by requiring access to HTTP headers.
738 [+] Extensible - Same as the HTTP Response Header method.
740 Minimum roundtrips to retrieve the resource descriptor: 2
742 Appendix B.2.3. HTTP Content Negotiation
744 Using the HTTP Accept request header or Transparent Content
745 Negotiation as defined in [RFC2295], the client informs the server it
746 is interested in the descriptor and not the resource itself, to which
747 the server responds with the descriptor document or its location. In
748 Yadis, the client sends an HTTP GET (or HEAD) request to the resource
749 URI with an Accept header and content-type application/xrds+xml.
750 This informs the server of the client's discovery interest, which in
751 turn may reply with the descriptor document itself, redirect to it,
752 or return its location via the X-XRDS-Location response header.
754 [-] Self Declaration - does not address this requirement, as it
755 focuses on the client declaring its intentions.
757 [+] Direct Descriptor Access - provides a simple method for directly
758 requesting the descriptor document.
760 [-] Web Architecture Compliant - while it can be argued that the
761 descriptor can be considered another representation of the
762 resource, it is very much external to it. Using the Accept header
763 to request a separate resource (as opposed to a different
764 representation of the same resource) violates web architecture.
765 It also prevents using the discovery content-type as a valid
766 (self-standing) web resource having its own descriptor.
768 [-] Scale and Technology Agnostic - requires access to HTTP request
769 and response headers, as well as the registration of multiple
770 handlers for the same resource URI based on the Accept header. In
771 addition, improper use or implementation of the Vary header in
772 conjunction with the Accept header will cause caches to serve the
773 descriptor document instead of the resource itself - a great
774 concern to large providers with frequently visited front-pages.
776 [-] Extensible - applies an implicit relation type to the descriptor
777 mime-type, limiting descriptor formats to a single purpose. It
778 also prevents existing mime-types from being used as a
779 descriptor format.
781 Minimum roundtrips to retrieve the resource descriptor: 1
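The three possible server replies described for Yadis can be sketched
as a small classification step on the client. The Python below is
illustrative only: the function names, the outcome labels, and the
set of redirect status codes are assumptions, and the response is
passed in as a status code plus a dictionary of lower-cased header
names rather than obtained from a real HTTP library.

```python
XRDS_TYPE = "application/xrds+xml"


def build_discovery_headers():
    """Request headers signalling the client's discovery interest."""
    return {"Accept": XRDS_TYPE}


def interpret_response(status, headers):
    """Classify a reply to a discovery request.

    Returns (outcome, location), where outcome is one of "redirect",
    "descriptor", "location", or "none".
    """
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    # The server redirects the client to the descriptor document.
    if status in (301, 302, 303, 307) and "location" in headers:
        return ("redirect", headers["location"])
    # The response body is the descriptor document itself.
    if status == 200 and content_type == XRDS_TYPE:
        return ("descriptor", None)
    # The server only reports where the descriptor can be found.
    if "x-xrds-location" in headers:
        return ("location", headers["x-xrds-location"])
    return ("none", None)
```

Only the "descriptor" outcome achieves the single roundtrip listed
above; the other two outcomes require a further request.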
783 Appendix B.2.4. HTTP Header Negotiation
785 Similar to the HTTP Content Negotiation method, this solution uses a
786 custom HTTP request header to inform the server of the client's
787 discovery intentions. The server responds by serving the same
788 resource representation (via an HTTP GET or HEAD request) with the
789 relevant Link headers. It attempts to solve the HTTP Response Header
790 waste issue by allowing the client to explicitly request the
791 inclusion of Link headers. One such header can be called "Request-
792 links" to inform the server the client would like it to include
793 certain Link headers of a given "rel" type in its reply.
795 [+] Self Declaration - same as HTTP Response Header with the option
796 of selective inclusion.
798 [-] Direct Descriptor Access - does not address.
800 [-] Web Architecture Compliant - HTTP does not include any mechanism
801 for header negotiation and any custom solution will break existing
802 caches.
804 [+-] Scale and Technology Agnostic - requires advanced access to HTTP
805 headers on both the client and server sides, but solves the
806 bandwidth waste issue of the HTTP Response Header method.
808 [+] Extensible - builds on top of Link header extensibility.
810 Minimum roundtrips to retrieve the resource descriptor: 2
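A server-side sketch of this method might look like the Python below.
This memo names the "Request-links" header only as a possibility and
does not define its syntax, so the comma-separated list of "rel"
types assumed here, the function name, and the data layout are all
hypothetical.

```python
def select_link_headers(request_headers, available_links):
    """Return the Link header values to include in the response.

    request_headers: dict of request header names to values.
    available_links: list of (target_uri, rel) pairs known for the resource.
    """
    requested = request_headers.get("Request-links")
    if requested is None:
        # Ordinary clients receive no Link headers, avoiding the extra bytes.
        return []
    # Assumed syntax: a comma-separated list of relation types.
    wanted = {rel.strip().lower() for rel in requested.split(",") if rel.strip()}
    return ['<%s>; rel="%s"' % (target, rel)
            for target, rel in available_links
            if rel.lower() in wanted]


links = [("http://example.com/resource;about", "describedby"),
         ("http://example.com/feed", "alternate")]
selected = select_link_headers({"Request-links": "describedby"}, links)
```

Because the response now varies on a header that existing caches do
not know about, a cache may store and replay either variant, which is
the compliance concern raised above.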
812 Appendix B.2.5. <link> Element
814 Embeds the location of the descriptor document within the resource
815 representation by leveraging the HTML header <link> element (as
816 opposed to the HTTP Link header). Applies to HTML resource
817 representations or similar markup-based formats with support for
818 "Link"-like elements such as Atom. POWDER uses the <link> element in
819 this manner, while XRDS uses the HTML <meta> element with an "http-
820 equiv" attribute equal to X-XRDS-Location (to create an embedded
821 version of the X-XRDS-Location custom header).
823 [+] Self Declaration - similar to HTTP Response Header method but
824 limited to HTML resources.
826 [-] Direct Descriptor Access - the method requires fetching the
827 entire resource representation in order to obtain the descriptor
828 location. In addition, it requires changing the resource HTML
829 representation which makes discovery an intrusive process.
831 [+] Web Architecture Compliant - uses the <link> element as
832 designed.
834 [+] Scale and Technology Agnostic - while this solution requires
835 direct retrieval of the resource and manipulation of its content,
836 it is extremely accessible in many platforms.
838 [-] Extensible - extensibility is restricted to HTML representations
839 or similar markup formats with support for a similar element.
841 Minimum roundtrips to retrieve the resource descriptor: 2
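The following Python sketch shows how a client might scan an HTML
representation for both forms mentioned in this section: a <link>
element with a "describedby" relation and a <meta> element whose
"http-equiv" attribute is X-XRDS-Location. The class and function
names and the example markup are hypothetical, and the sketch does
not restrict the search to the document head.

```python
from html.parser import HTMLParser


class DescriptorLinkParser(HTMLParser):
    """Collects descriptor locations embedded in an HTML document."""

    def __init__(self):
        super().__init__()
        self.locations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "link":
            rels = attributes.get("rel", "").lower().split()
            if "describedby" in rels and attributes.get("href"):
                self.locations.append(attributes["href"])
        elif tag == "meta":
            if (attributes.get("http-equiv", "").lower() == "x-xrds-location"
                    and attributes.get("content")):
                self.locations.append(attributes["content"])


def find_descriptor_locations(html_text):
    parser = DescriptorLinkParser()
    parser.feed(html_text)
    return parser.locations


page = ('<html><head>'
        '<link rel="describedby" href="http://example.com/desc.xrd">'
        '<meta http-equiv="X-XRDS-Location" '
        'content="http://example.com/yadis.xrds">'
        '</head><body></body></html>')
found = find_descriptor_locations(page)
```

Note that the entire representation has to be retrieved before this
scan can run, which is why the method scores poorly on Direct
Descriptor Access.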
843 Appendix B.2.6. HTTP OPTIONS Method
845 The HTTP OPTIONS method is used to interact with the HTTP server with
846 regard to its capabilities and communication-related information
847 about its resources. The OPTIONS method, together with an optional
848 request header, can be used to request both the descriptor location
849 and descriptor content itself.
851 [-] Self Declaration - does not address.
853 [+] Direct Descriptor Access - provides a clean mechanism for
854 requesting descriptor information about a resource without
855 interacting with it.
857 [+] Web Architecture Compliant - uses an existing HTTP feature.
859 [-] Scale and Technology Agnostic - requires client and server
860 access to the OPTIONS HTTP method. Also does not support caching
861 which makes this solution inefficient.
863 [+] Extensible - built into the OPTIONS method.
865 Minimum roundtrips to retrieve the resource descriptor: 1
867 Appendix B.2.7. WebDAV PROPFIND Method
869 Similar to the HTTP OPTIONS method, the WebDAV PROPFIND method
870 defined in [RFC4918] can be used to request resource specific
871 properties, one of which can hold the location of the descriptor
872 document. PROPFIND, unlike OPTIONS, cannot return the descriptor
873 itself, unless it is returned in the required PROPFIND schema (a
874 multi-status XML element). Other alternatives include URIQA [URIQA],
875 an HTTP extension which defines a method called MGET, and ARK
876 (Archival Resource Key) [ARK] - a method similar to PROPFIND that
877 allows the retrieval of resource attributes using keys (which
878 describe the resource).
880 [-] Self Declaration - does not address.
882 [+-] Direct Descriptor Access - does not require interaction with
883 the resource, but does require at least two requests to get the
884 descriptor (get location, get document).
886 [+] Web Architecture Compliant - uses an HTTP extension with less
887 support than core HTTP, but still based on published standards.
889 [-] Scale and Technology Agnostic - same as the HTTP OPTIONS Method.
891 [+-] Extensible - uses extensible protocols but at the same time
892 depends on solutions that have already gone beyond the standard
893 HTTP protocol, which makes further extensions more complex and
894 unsupported.
896 Minimum roundtrips to retrieve the resource descriptor: 2
898 Appendix B.2.8. Custom HTTP Method
900 Similar to the HTTP OPTIONS Method, a new method can be defined (such
901 as DISCOVER) to return (or redirect to) the descriptor document. The
902 new method can allow caching.
904 [-] Self Declaration - does not address.
906 [+] Direct Descriptor Access - same as the HTTP OPTIONS Method.
908 [-] Web Architecture Compliant - depends heavily on extending every
909 platform to support the extension. Unlikely to be supported by
910 existing proxy services and caches.
912 [-] Scale and Technology Agnostic - same as HTTP OPTIONS Method with
913 the additional burden on smaller sites requiring access to the new
914 protocol.
916 [+] Extensible - new protocol that can extend as needed.
918 Minimum roundtrips to retrieve the resource descriptor: 1
920 Appendix B.2.9. Static Resource URI Transformation
922 Instead of using HTTP facilities to access the descriptor location,
923 this method defines a template to transform any resource URI to the
924 descriptor document URI. This can be done by adding a prefix or
925 suffix to the resource URI, which turns it into a new resource URI.
926 The new URI points to the descriptor document. For example, to fetch
927 the descriptor document for http://example.com/resource, the client
928 makes an HTTP GET request to http://example.com/resource;about using
929 a static template that adds the ";about" suffix.
931 [-] Self Declaration - does not address.
933 [+] Direct Descriptor Access - creates a unique URI for the
934 descriptor document.
936 [+-] Web Architecture Compliant - uses basic HTTP facilities but
937 intrudes on the domain authority namespace as it defines a static
938 template for URI transformation that is not likely to be
939 compatible with many existing URI naming conventions.
941 [+-] Scale and Technology Agnostic - depending on the static mapping
942 chosen. Some hosted environments will have a problem gaining
943 access to the mapped URI based on the URI format chosen.
945 [-] Extensible - provides a very specific and limited method to map
946 between resources and their descriptors, since each relation type
947 must mint its own static template.
949 Minimum roundtrips to retrieve the resource descriptor: 1
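The ";about" example above amounts to a one-line transformation,
sketched here in Python. The function name is hypothetical, and
dropping the fragment before appending the suffix is an assumption
made because fragments are not sent to the server.

```python
from urllib.parse import urldefrag


def static_descriptor_uri(resource_uri, suffix=";about"):
    """Map a resource URI to its descriptor URI using a fixed suffix."""
    # Remove any fragment so the suffix becomes part of the request URI.
    base, _fragment = urldefrag(resource_uri)
    return base + suffix


descriptor = static_descriptor_uri("http://example.com/resource")
```

The simplicity is also the weakness: a site that already uses
";about" in its own URIs has no way to opt out of the mapping.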
951 Appendix B.2.10. Dynamic Resource URI Transformation
953 Same as the Static Resource URI Transformation method but with the
954 ability for each domain authority to specify its own discovery
955 transformation template. This can be done by placing a configuration
956 file at a known location (such as robots.txt) which contains the
957 template needed to perform the URL mapping. The client first obtains
958 the configuration document (which may be cached using normal HTTP
959 facilities), parses it, then uses that information to transform the
960 resource URI and access the descriptor document.
962 [+-] Self Declaration - does not address individual resources, but
963 allows entire domains to declare their support (and how to use
964 it).
966 [+-] Direct Descriptor Access - once the mapping template has been
967 obtained, descriptors can be accessed directly.
969 [+-] Web Architecture Compliant - uses an existing known-location
970 design pattern (such as robots.txt) and standard HTTP facilities.
971 The use of a known-location is not ideal and is considered a
972 violation of web architecture but if it serves as the last of its
973 kind, can be tolerated. An alternative to the known-location
974 approach can be using DNS to store either the location of the
975 mapping or the map template itself, but DNS adds a layer of
976 complexity not always available.
978 [+-] Scale and Technology Agnostic - works well at the URI authority
979 level (domain) but is inefficient at the URI path level (resource
980 path) and harder to implement when different paths within the same
981 domain need to use different templates. With the decreasing cost
982 of custom domain and sub-domain hosting, this will not be an
983 issue for most services, but it does require sharing configuration
984 at the domain/sub-domain level.
986 [+-] Extensible - can be, depending on the schema used to format the
987 known-location configuration document.
989 Minimum roundtrips to retrieve the resource descriptor: initially 2,
990 1 after caching
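The "initially 2, 1 after caching" behavior can be sketched as
follows. This Python example is an illustration under stated
assumptions: the template uses a single "{uri}" variable that is
replaced by the percent-encoded resource URI, the template itself is
hypothetical, and retrieval of the configuration document is
represented by a caller-supplied function instead of a real HTTP
request.

```python
from urllib.parse import quote, urlsplit


class DescriptorLocator:
    """Maps resource URIs to descriptor URIs using per-authority templates."""

    def __init__(self, fetch_template):
        # fetch_template(authority) stands in for fetching and parsing the
        # configuration document at the authority's known location.
        self._fetch_template = fetch_template
        self._templates = {}
        self.config_fetches = 0

    def locate(self, resource_uri):
        authority = urlsplit(resource_uri).netloc
        if authority not in self._templates:
            # First lookup for this authority: one extra roundtrip.
            self._templates[authority] = self._fetch_template(authority)
            self.config_fetches += 1
        template = self._templates[authority]
        return template.replace("{uri}", quote(resource_uri, safe=""))


locator = DescriptorLocator(lambda authority: "http://%s/describe?uri={uri}" % authority)
first = locator.locate("http://example.com/resource")
second = locator.locate("http://example.com/other")
```

After the first lookup, every further resource on the same authority
is resolved without fetching the configuration document again.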
992 Appendix C. Acknowledgments
994 With the exception of the host-meta template extension, very little
995 of this memo is original work. Many communities and individuals have
996 been working on solving discovery for many years and this work is a
997 direct result of their hard and dedicated efforts.
999 Inspiration for this memo derived from previous work on a descriptor
1000 format called XRDS-Simple, which in turn derived from another
1001 descriptor format, XRDS. Previous discovery workflows include Yadis
1002 which is currently used by the OpenID community. While suffering
1003 from significant shortcomings, Yadis was a breakthrough approach to
1004 performing discovery using extremely restricted hosting environments,
1005 and this memo has strived to preserve as much of that spirit as
1006 possible.
1008 The use of Link elements and headers and the introduction of the
1009 "describedby" relation type in this memo is a direct result of the
1010 dedicated work and contribution of Phil Archer to the W3C POWDER
1011 specification and Jonathan Rees to the W3C review of Uniform Access
1012 to Information About. The host-meta approach was first proposed by
1013 Mark Nottingham as an alternative to attaching links directly to
1014 resource representations.
1016 The author wishes to thank the OASIS XRI community for their
1017 support, encouragement, and enthusiasm for this work. Special thanks
1018 go to Lisa Dusseault, Joseph Holsten, Mark Nottingham, John Panzer,
1019 Drummond Reed, and Jonathan Rees for their invaluable feedback.
1021 The author takes all responsibility for errors and omissions.
1023 Appendix D. Document History
1025 [[ to be removed by the RFC editor before publication as an RFC ]]
1027 -03
1028 o Added protocol name LRDD (pronounced 'lard').
1030 o Fixed Link-Pattern examples to include missing semicolons.
1032 -02
1034 o Changed focus from an HTTP-based process to Link-based process.
1036 o Completely revised and restructured document for better clarity.
1038 o Realigned the methods to produce consistent results and changed
1039 the way redirections and client-errors are handled.
1041 o Updated to use newer version of site-meta, now called host-meta,
1042 including a new plaintext-based format to replace the previous XML
1043 format.
1045 o Renamed Link-Template to Link-Pattern to avoid future conflict
1046 with a previously proposed Link-Template HTTP header.
1048 o Removed support for the "scheme" Link-Template parameter.
1050 o Replaced restrictions with interoperability recommendations.
1052 o Added IANA considerations per new host-meta registry requirements.
1054 -01
1056 o Renamed 'resource discovery' to 'descriptor discovery'.
1058 o Added informative reference to Metalink.
1060 o Clarified that the resource descriptor URI can use any URI scheme,
1061 not just "http" or "https".
1063 o Removed comment regarding redirects when using <link> Elements.
1065 o Clarified that HTTPS must be used with "https" URIs for both Link
1066 headers and host-meta retrieval.
1068 o Removed DNS verification step for host-meta with schemes other
1069 than "http" and "https". Replaced with a general discussion of
1070 authority and a security consideration comment.
1072 o Organized host-meta section into another sub-section level.
1074 o Enlarged the template vocabulary from a single "uri" variable to
1075 include smaller URI components.
1077 o Added informative reference to RFC 2295 in analysis appendix.
1079 -00
1081 o Initial draft.
1083 9. References
1085 9.1. Normative References
1087 [I-D.nottingham-http-link-header]
1088 Nottingham, M., "Link Relations and HTTP Header Linking",
1089 draft-nottingham-http-link-header-03 (work in progress),
1090 November 2008.
1092 [I-D.nottingham-site-meta]
1093 Nottingham, M. and E. Hammer-Lahav, "Host Metadata for the
1094 Web", draft-nottingham-site-meta-01 (work in progress),
1095 February 2009.
1097 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1098 Requirement Levels", BCP 14, RFC 2119, March 1997.
1100 [RFC2295] Holtman, K. and A. Mutz, "Transparent Content Negotiation
1101 in HTTP", RFC 2295, March 1998.
1103 [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
1104 Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
1105 Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
1107 [RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
1109 [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1110 Resource Identifier (URI): Generic Syntax", STD 66,
1111 RFC 3986, January 2005.
1113 [RFC4287] Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
1114 Syndication Format", RFC 4287, December 2005.
1116 [RFC4918] Dusseault, L., "HTTP Extensions for Web Distributed
1117 Authoring and Versioning (WebDAV)", RFC 4918, June 2007.
1119 [W3C.REC-html401-19991224]
1120 Raggett, D., Jacobs, I., and A. Hors, "HTML 4.01
1121 Specification", World Wide Web Consortium
1122 Recommendation REC-html401-19991224, December 1999.
1125 [W3C.REC-xhtml1-20020801]
1126 Pemberton, S., "XHTML[TM] 1.0 The Extensible HyperText
1127 Markup Language (Second Edition)", World Wide Web
1128 Consortium Recommendation REC-xhtml1-20020801,
1129 August 2002.
1132 9.2. Informative References
1134 [ARK] Kunze, J. and R. Rodgers, "The ARK Identifier Scheme".
1137 [I-D.bryan-metalink]
1138 Bryan, A., "The Metalink Download Description Format",
1139 draft-bryan-metalink-05 (work in progress), January 2009.
1141 [POWDER] Archer, P., Ed., Smith, K., Ed., and A. Perego, Ed.,
1142 "POWDER: Protocol for Web Description Resources".
1145 [URIQA] Nokia, "The URI Query Agent Model".
1148 [XRD] Hammer-Lahav, E., Ed., "XRD 1.0 [[ replace with new XRD
1149 specification reference ]]".
1151 [XRDS] Wachob, G., Reed, D., Chasen, L., Tan, W., and S.
1152 Churchill, "Extensible Resource Identifier (XRI)
1153 Resolution V2.0".
1156 [XRDS-Simple]
1157 Hammer-Lahav, E., "XRDS-Simple 1.0".
1160 [Yadis] Miller, J., "Yadis Specification 1.0".
1163 URIs
1165 [1]
1167 Author's Address
1169 Eran Hammer-Lahav
1170 Yahoo!
1172 Email: eran@hueniverse.com
1173 URI: http://hueniverse.com