idnits 2.17.1 

draft-ietf-ltru-matching-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1137.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1114.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1121.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1127.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 6, 2006) is 6653 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4234 (Obsoleted by RFC 5234)

  -- Obsolete informational reference (is this intentional?): RFC 1766
     (Obsoleted by RFC 3066, RFC 3282)

  -- Obsolete informational reference (is this intentional?): RFC 2616
     (Obsoleted by RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  -- Obsolete informational reference (is this intentional?): RFC 3066
     (Obsoleted by RFC 4646, RFC 4647)


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   A. Phillips, Ed.
3	Internet-Draft                                                Yahoo! Inc
4	Obsoletes: 3066 (if approved)                              M. Davis, Ed.
5	Expires: August 10, 2006                                          Google
6	                                                        February 6, 2006

8	                       Matching of Language Tags
9	                      draft-ietf-ltru-matching-09

11	Status of this Memo

13	   By submitting this Internet-Draft, each author represents that any
14	   applicable patent or other IPR claims of which he or she is aware
15	   have been or will be disclosed, and any of which he or she becomes
16	   aware will be disclosed, in accordance with Section 6 of BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This Internet-Draft will expire on August 10, 2006.

36	Copyright Notice

38	   Copyright (C) The Internet Society (2006).

40	Abstract

42	   This document describes different mechanisms for comparing, matching,
43	   and evaluating language tags.  Possible algorithms for language
44	   negotiation or content selection, filtering, and lookup are
45	   described.  This document, in combination with RFC 3066bis (replace
46	   "3066bis" with the RFC number assigned to
47	   draft-ietf-ltru-registry-14), replaces RFC 3066, which replaced RFC
48	   1766.

50	Table of Contents

52	   1.  Introduction
53	   2.  The Language Range
54	     2.1.  Basic Language Range
55	     2.2.  Extended Language Range
56	     2.3.  The Language Priority List
57	   3.  Types of Matching
58	     3.1.  Choosing a Type of Matching
59	     3.2.  Filtering
60	       3.2.1.  Filtering with Basic Language Ranges
61	       3.2.2.  Filtering with Extended Language Ranges
62	       3.2.3.  Scored Filtering
63	     3.3.  Lookup
64	   4.  Other Considerations
65	     4.1.  Choosing Language Ranges
66	     4.2.  Meaning of Language Tags and Ranges
67	     4.3.  Considerations for Private Use Subtags
68	     4.4.  Length Considerations in Matching
69	   5.  IANA Considerations
70	   6.  Changes
71	   7.  Security Considerations
72	   8.  Character Set Considerations
73	   9.  References
74	     9.1.  Normative References
75	     9.2.  Informative References
76	   Appendix A.  Acknowledgements
77	   Authors' Addresses
78	   Intellectual Property and Copyright Statements

80	1.  Introduction

82	   Human beings on our planet have, past and present, used a number of
83	   languages.  There are many reasons why one would want to identify the
84	   language used when presenting or requesting information.

86	   Information about a user's language preferences commonly needs to be
87	   identified so that appropriate processing can be applied.  For
88	   example, the user's language preferences in a browser can be used to
89	   select web pages appropriately.  Language preferences can also be
90	   used to select among tools (such as dictionaries) to assist in the
91	   processing or understanding of content in different languages.

93	   Given a set of language identifiers, such as those defined in
94	   [RFC3066bis], various mechanisms can be envisioned for performing
95	   language negotiation and tag matching.

97	   This document defines a syntax (called a language range (Section 2))
98	   for specifying a user's language preferences, as well as several
99	   schemes for selecting or filtering content by comparing language
100	   ranges to the language tags [RFC3066bis] used to identify the natural
101	   language of that content.  Applications, protocols, or specifications
102	   will have varying needs and requirements that affect the choice of a
103	   suitable matching scheme.  Depending on the choice of scheme, there
104	   are various options left to the implementation.  Protocols that
105	   implement a matching scheme either need to specify each particular
106	   choice or indicate the options that are left to the implementation to
107	   decide.

109	   This document is divided into three main sections.  One describes how
110	   to indicate a user's preferences using language ranges.  Then a
111	   section describes various schemes for matching these ranges to a set
112	   of language tags in order to select specific content.  There is also
113	   a section that deals with various practical considerations that apply
114	   to implementing and using these schemes.

116	   This document, in combination with [RFC3066bis] (Ed.: replace
117	   "3066bis" globally in this document with the RFC number assigned to
118	   draft-ietf-ltru-registry-14), replaces [RFC3066], which replaced
119	   [RFC1766].

121	   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
122	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
123	   document are to be interpreted as described in [RFC2119].

125	2.  The Language Range

127	   Language Tags [RFC3066bis] are used to identify the language of some
128	   information item or content.  Applications or protocols that use
129	   language tags are often faced with the problem of identifying sets of
130	   content that share certain language attributes.  For example,
131	   HTTP/1.1 [RFC2616] describes one such mechanism in its discussion of
132	   the Accept-Language header (Section 14.4), which is used when
133	   selecting content from servers based on the language of that content.

135	   When selecting content according to its language, it is useful to
136	   have a mechanism for identifying sets of language tags that share
137	   specific attributes.  This allows users to select or filter content
138	   based on specific requirements.  Such an identifier is called a
139	   "Language Range".

141	   Language ranges are similar in structure and content to language
142	   tags: they consist of alphanumeric "subtags" separated by hyphens,
143	   plus a special subtag consisting of the character "*" (%x2A,
144	   ASTERISK), which is used in ranges as a "wildcard", that is, a value
145	   that matches any subtag.

147	   Language tags and thus language ranges are to be treated as case-
148	   insensitive: there exist conventions for the capitalization of some
149	   of the subtags, but these MUST NOT be taken to carry meaning.
150	   Matching of language tags to language ranges MUST be done in a case-
151	   insensitive manner as well.

153	2.1.  Basic Language Range

155	   A "basic language range" identifies the set of content whose language
156	   tags begin with the same sequence of subtags.  Each range consists of
157	   a sequence of alphanumeric subtags separated by hyphens.  The basic
158	   language range is defined by the following ABNF [RFC4234]:

160	   language-range = language-tag / "*"
161	   language-tag   = 1*8alphanum *("-" 1*8alphanum)
162	   alphanum       = ALPHA / DIGIT

164	   Basic language ranges (originally described by HTTP/1.1 [RFC2616] and
165	   later [RFC3066]) have the same syntax as an [RFC3066] language tag or
166	   are the single character "*".  They differ from the language tags
167	   defined in [RFC3066bis] only in that there is no requirement that
168	   they be "well-formed" or be validated against the IANA Language
169	   Subtag Registry (although such ill-formed ranges will probably not
170	   match anything).
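As a rough illustration of the syntax above, a basic language range
can be checked with a regular expression. This is only a sketch: it
reads the ABNF as "one or more hyphen-separated subtags of one to
eight alphanumerics, or the single character '*'", and the names
BASIC_RANGE and is_basic_language_range are hypothetical helpers, not
part of this specification.

```python
import re

# Sketch of the basic language range syntax: either "*" alone, or
# hyphen-separated subtags of 1 to 8 ASCII letters or digits.
BASIC_RANGE = re.compile(r"\*|[A-Za-z0-9]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_basic_language_range(text):
    """Return True if text has the shape of a basic language range."""
    return BASIC_RANGE.fullmatch(text) is not None
```

Note that a range such as "en-*-US" is rejected by this check; the
wildcard is only permitted inside extended language ranges.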

172	   Use of a basic language range seems to imply that there is a semantic
173	   relationship between language tags that share the same prefix.  While
174	   this is often the case, it is not always true and users should note
175	   that the set of language tags that match a specific language-range
176	   may not be mutually intelligible.

178	2.2.  Extended Language Range

180	   A Basic Language Range does not always provide the most appropriate
181	   way to specify a user's preferences.  Sometimes it is beneficial to
182	   use a more fine-grained matching scheme that takes advantage of the
183	   internal structure of language tags.  This allows the user to
184	   specify, for example, the value of a specific field in a language tag
185	   or to indicate which values are of interest in filtering or selecting
186	   the content.

188	   In an extended language range, the identifier takes the form of a
189	   series of subtags which MUST consist of well-formed subtags or the
190	   special subtag "*".  For example, the language range "en-*-US"
191	   specifies a primary language of 'en', followed by any script subtag,
192	   followed by the region subtag 'US'.

194	   An extended language range can be represented by the following ABNF:

196	   extended-language-range  = range ; a range
197	                 / privateuse       ; a private-use range
198	                 / grandfathered    ; a grandfathered registration

200	   range         = (language
201	                    ["-" script]
202	                    ["-" region]
203	                    *("-" variant)
204	                    *("-" extension)
205	                    ["-" privateuse])

207	   language      = (2*3ALPHA [ extlang ]) ; shortest ISO 639 code
208	                 / 4ALPHA                 ; reserved for future use
209	                 / 5*8ALPHA               ; registered language subtag
210	                 / "*"                    ; or wildcard

212	   extlang       = *2("-" 3ALPHA) ("-" ( 3ALPHA / "*"))
213	                                          ; reserved for future use
214	                                          ; wildcard can only appear
215	                                          ;   at the end

217	   script        = 4ALPHA                 ; ISO 15924 code
218	                 / "*"                    ; or wildcard

220	   region        = 2ALPHA                 ; ISO 3166 code
221	                 / 3DIGIT                 ; UN M.49 code
222	                 / "*"                    ; or wildcard

224	   variant       = 5*8alphanum            ; registered variants
225	                 / (DIGIT 3alphanum)      ;
226	                 / "*"                    ; or wildcard

228	   extension     = singleton *("-" (2*8alphanum)) [ "-*" ]
229	                                          ; extension subtags
230	                                          ; wildcard can only appear
231	                                          ;   at the end

233	   singleton     = %x41-57 / %x59-5A / %x61-77 / %x79-7A / DIGIT
234	                 ; single letters (except for "x") or digits

236	   privateuse    = "x" 1*("-" (1*8alphanum))

238	   grandfathered = 1*3ALPHA 1*2("-" (2*8alphanum))
239	                   ; grandfathered registration
240	                   ; Note: I is the only singleton
241	                   ; that starts a grandfathered tag

243	   alphanum      = (ALPHA / DIGIT)       ; letters and numbers
244	   A field not present in the middle of an extended language range is
245	   treated as if the field contained a "*".  Implementations that
246	   normalize extended language ranges SHOULD expand missing fields to be
247	   "*" so that the semantic meaning of the language range is clear to
248	   the user.  At the same time, multiple wildcards in a row are
249	   redundant and implementations SHOULD collapse these to a single
250	   wildcard when normalizing the range (for brevity).  For example, both
251	   the range "sl-nedis" and the range "sl-*-*-nedis" are equivalent to
252	   and should be normalized as "sl-*-nedis".
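The normalization described above might be sketched as follows. The
sketch makes simplifying assumptions that are not part of this
document: it only distinguishes script, region, and variant fields
(extended language subtags, extensions, and private-use sequences are
not handled), it classifies subtags purely by their shape, and it
lowercases its output since case carries no meaning. The function name
normalize_range is hypothetical.

```python
import re


def normalize_range(language_range):
    """Expand missing middle fields to "*", then collapse wildcard runs."""
    subtags = language_range.lower().split("-")
    expanded = [subtags[0]]
    slot = 1  # next expected field: 1 = script, 2 = region, 3 = variant
    for subtag in subtags[1:]:
        if subtag == "*":
            kind = slot                      # wildcard fills the next field
        elif re.fullmatch(r"[a-z]{4}", subtag):
            kind = 1                         # script
        elif re.fullmatch(r"[a-z]{2}|[0-9]{3}", subtag):
            kind = 2                         # region
        else:
            kind = 3                         # variant (or anything else)
        # Insert a wildcard for every field that was skipped.
        expanded.extend(["*"] * max(0, kind - slot))
        expanded.append(subtag)
        slot = min(max(slot, kind) + 1, 3)
    # Collapse consecutive wildcards into a single one.
    collapsed = []
    for subtag in expanded:
        if subtag == "*" and collapsed and collapsed[-1] == "*":
            continue
        collapsed.append(subtag)
    return "-".join(collapsed)
```

With this sketch, both "sl-nedis" and "sl-*-*-nedis" normalize to
"sl-*-nedis".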

254	2.3.  The Language Priority List

256	   When users specify a language preference they often need to specify a
257	   prioritized list of language ranges in order to best reflect their
258	   language preferences.  This is especially true for speakers of
259	   minority languages.  A speaker of Breton in France, for example, may
260	   specify "br" followed by "fr", meaning that if Breton is available,
261	   it is preferred, but otherwise French is the best alternative.  It
262	   can get more complex: a speaker may wish to fall back from Skolt Sami
263	   to Northern Sami to Finnish.

265	   A "Language Priority List" is a prioritized or weighted list of
266	   language ranges.  One well known example of such a list is the
267	   "Accept-Language" header defined in RFC 2616 [RFC2616] (see Section
268	   14.4) and RFC 3282 [RFC3282].  A simple list of ranges, i.e. one that
269	   contains no weighting information, is considered to be in descending
270	   order of priority.

272	   The various matching operations described in this document include
273	   considerations for using a language priority list.  This document
274	   does not define any syntax for a language priority list; defining
275	   such a syntax is the responsibility of the protocol, application, or
276	   implementation that uses it.  When given as examples in this
277	   document, language priority lists will be shown as a quoted sequence
278	   of ranges separated by semi-colons, like this: "en; fr; zh-Hant"
279	   (which would be read as "English before French before Chinese as
280	   written in the Traditional script").
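Since this document defines no syntax for language priority lists, the
following is only a sketch of how the example notation used here
(ranges separated by semicolons, in descending order of priority)
could be turned into an ordered list. The function name
parse_priority_list is hypothetical; a real protocol would define its
own syntax, possibly with weights.

```python
def parse_priority_list(text):
    """Split the example notation "en; fr; zh-Hant" into ordered ranges."""
    return [item.strip() for item in text.split(";") if item.strip()]
```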

282	3.  Types of Matching

284	   Matching language ranges to language tags can be done in a number of
285	   different ways.  This section describes several different matching
286	   schemes, as well as the considerations for choosing between them.
287	   Protocols and specifications SHOULD clearly indicate the particular
288	   mechanism used in selecting or matching language tags.

290	   There are two basic types of matching scheme: those that produce zero
291	   or more information items (called "filtering") and those that produce
292	   a single information item for a given request (called "lookup").

294	   A key difference between these two types of matching scheme is that,
295	   in filtering, the language ranges in the language priority list
296	   represent the _least_ specific content one will accept as a match,
297	   while in lookup they represent the _most_ specific content.

299	3.1.  Choosing a Type of Matching

301	   Applications, protocols, and specifications are faced with the
302	   decision of what type of matching to use.  Sometimes, different
303	   styles of matching might be suited for different kinds of processing
304	   within a particular application or protocol.

306	   Language tag matching is a tool, and does not by itself specify a
307	   complete procedure for the use of language tags.  Such procedures are
308	   intimately tied to the application protocol in which they occur.
309	   When specifying a protocol operation using matching, the protocol
310	   MUST specify:

312	   o  Which type(s) of language tag matching it uses

314	   o  Whether the operation returns a single result (lookup) or a
315	      possibly empty set of results (filtering)

317	   o  For lookup, what the result is when no matching tag is found.  For
318	      instance, a protocol might result in failure of the operation, an
319	      empty value, returning some protocol defined or implementation
320	      defined default, or returning i-default [RFC2277].

322	   Filtering can be used to produce a set of results (such as a
323	   collection of documents).  For example, if using a search engine, one
324	   might use filtering to limit the results to documents written in
325	   French.  It can also be used when deciding whether to perform a
326	   language-sensitive process on some content.  For example, a process
327	   might cause paragraphs whose language tag matched the language range
328	   "nl" to be displayed in italics within a document.

330	   This document describes four types of matching (three types of
331	   filtering, plus the lookup scheme):

333	   1.  Basic Filtering (Section 3.2.1) is used to match content using
334	       basic language ranges (Section 2.1).

336	   2.  Extended Range Filtering (Section 3.2.2) is used to match content
337	       using extended language ranges (Section 2.2).

339	   3.  Scored Filtering (Section 3.2.3) produces an ordered set of
340	       content using extended language ranges.  It SHOULD be used when
341	       the quality of the match within a specific language range is
342	       important, as when presenting a list of documents resulting from
343	       a search.

345	   4.  Lookup (Section 3.3) is used when each request needs to produce
346	       _exactly_ one piece of content.  For example, if a process were
347	       to insert a human readable error message into a protocol header,
348	       it might select the text based on the user's language preference.
349	       Since it can return only one item, it must choose a single item
350	       and it must return some item, even if no content matches the
351	       language priority list supplied by the user.

353	   Most types of matching in this document are designed so that
354	   implementations are not required to validate or understand any of the
355	   semantics of the subtags supplied and, except for scored filtering,
356	   they do not need access to the IANA Language Subtag Registry (see
357	   Section 3 in [RFC3066bis]).  This simplifies and speeds the
358	   performance of implementations.

360	   Regardless of the matching scheme chosen, protocols and
361	   implementations MAY canonicalize language tags and ranges by mapping
362	   grandfathered and obsolete tags or subtags into modern equivalents.
363	   If an implementation canonicalizes either ranges or tags, then the
364	   implementation will require the IANA Language Subtag Registry
365	   information for that purpose.  Implementations MAY also use semantic
366	   information external to the registry when matching tags.  For
367	   example, the primary language subtags 'nn' (Nynorsk Norwegian) and
368	   'nb' (Bokmal Norwegian) might both be usefully matched to the more
369	   general subtag 'no' (Norwegian).  Or an implementation might infer
370	   that content labeled "zh-CN" is more likely to match the range "zh-
371	   Hans" than equivalent content labeled "zh-TW".

373	3.2.  Filtering

375	   Filtering is used to select the set of content that matches a given
376	   language priority list.  It is called "filtering" because this set of
377	   content may contain no items at all or it may return an arbitrarily
378	   large number of matching items: as many items as match the language
379	   priority list, thus "filtering out" the non-matching items.

381	   In filtering, the language range represents the _least_ specific
382	   (that is, having the fewest subtags) language tag which is an
383	   acceptable match.  That is, all of the language tags in the set of
384	   filtered content will have an equal or greater number of subtags than
385	   the language range.  For example, if the language priority list
386	   consists of the range "de-CH", one might see matching content with
387	   the tag "de-CH-1996" but one will never see a match with the tag
388	   "de".

390	   If the language priority list (see Section 2.3) contains more than
391	   one range, the content returned is typically ordered in descending
392	   level of preference.

394	   Some examples where filtering might be appropriate include:

396	   o  Applying a style to sections of a document in a particular set of
397	      languages.

399	   o  Displaying the set of documents containing a particular set of
400	      keywords written in a specific set of languages.

402	   o  Selecting all email items written in a specific set of languages.

404	   Filtering can produce either an ordered or an unordered set of
405	   results.  For example, applying formatting to a document based on the
406	   language of specific pieces of content does not require the content
407	   to be ordered.  It is sufficient to know whether a specific piece of
408	   content is selected by the language priority list (or not).  A search
409	   application, on the other hand, probably would want to order the
410	   results.

412	   If an ordered set is desired, as described above, then the
413	   application or protocol needs to determine the relative "quality" of
414	   the match between different language tags and the language range.

416	   This measurement is called a "distance metric".  A distance metric
417	   assigns a numeric value to the comparison of a language tag to a
418	   language range that represents the 'distance' between the two.  A
419	   distance of zero means that they are identical, a small distance
420	   indicates that they are very similar, and a large distance indicates
421	   that they are very different.  Using a distance metric,
422	   implementations can, for example, allow users to select a threshold
423	   distance for a match to be "successful" while filtering, or they
424	   might use the numeric values to order the results.

426	3.2.1.  Filtering with Basic Language Ranges

428	   When filtering using basic language ranges, each basic language range
429	   in the language priority list is considered in turn, according to
430	   priority.  A language range matches a particular language tag if it
431	   exactly equals the tag, or if it exactly equals a prefix of the tag
432	   such that the first character following the prefix is "-".  (That is,
433	   the language-range "de-de" matches the language tag "de-DE-1996", but
434	   not the language tag "de-Deva".)

436	   The special range "*" in a language priority list matches any tag.  A
437	   protocol which uses language ranges MAY specify additional rules
438	   about the semantics of "*"; for instance, HTTP/1.1 [RFC2616]
439	   specifies that the range "*" matches only languages not matched by
440	   any other range within an "Accept-Language" header.
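The matching rule of this section can be sketched in a few lines. This
is a minimal illustration, not a normative implementation: the names
basic_match and basic_filter are hypothetical, the comparison is made
case-insensitive by lowercasing both sides, and "*" is treated as
matching any tag (the HTTP/1.1 refinement mentioned above is not
modeled).

```python
def basic_match(language_range, tag):
    """True if the basic language range matches the language tag."""
    rng, tag = language_range.lower(), tag.lower()
    # Match on equality, or on a prefix that is followed by "-".
    return rng == "*" or tag == rng or tag.startswith(rng + "-")


def basic_filter(priority_list, tags):
    """Return the matching tags, ordered by the first range that matched."""
    result = []
    for rng in priority_list:          # ranges in descending priority
        for tag in tags:
            if tag not in result and basic_match(rng, tag):
                result.append(tag)
    return result
```

For example, with the priority list "de-CH; fr" and the tags "fr-CA",
"de", "de-CH-1996", and "en", this sketch selects "de-CH-1996" followed
by "fr-CA".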

442	3.2.2.  Filtering with Extended Language Ranges

444	   When filtering using extended language ranges, each extended language
445	   range in the language priority list is considered in turn, according
446	   to priority.  The subtags in each extended language range are
447	   compared to the corresponding subtags in the language tag being
448	   examined.  The subtag from the range is considered to match if it
449	   exactly matches the corresponding subtag in the tag or the range's
450	   subtag has the value "*" (which matches all subtags, including the
451	   empty subtag).

453	   Subtags not specified, including those at the end of the language
454	   range, are assigned the wildcard value "*".  This makes each range
455	   into a prefix much like that used in basic language range matching.
456	   For example, the extended language range "de-*-DE" matches all of the
457	   following tags, in part because the unspecified variant, extension,
458	   and private-use subtags are expanded to "*":

460	      de-DE

462	      de-Latn-DE

464	      de-Latf-DE

466	      de-DE-x-goethe

468	      de-Latn-DE-1996
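A field-by-field comparison of this kind might be sketched as below.
The sketch rests on assumptions that go beyond the text: subtags are
assigned to the script and region fields purely by their shape,
extended language subtags are not handled, and anything after the
region is compared position by position. The names split_fields and
extended_match are hypothetical.

```python
import re


def split_fields(value):
    """Return [language, script, region, *remaining subtags], lowercased.

    Unspecified script or region fields are represented as "*".
    """
    subtags = value.lower().split("-")
    language, rest = subtags[0], subtags[1:]
    script = region = "*"
    if rest and (rest[0] == "*" or re.fullmatch(r"[a-z]{4}", rest[0])):
        script = rest.pop(0)
    if rest and (rest[0] == "*" or re.fullmatch(r"[a-z]{2}|[0-9]{3}", rest[0])):
        region = rest.pop(0)
    return [language, script, region] + rest


def extended_match(language_range, tag):
    """True if every field of the range is "*" or equals the tag's field."""
    range_fields = split_fields(language_range)
    tag_fields = split_fields(tag)
    # Pad the tag so that fields it lacks compare as "unspecified".
    tag_fields += ["*"] * (len(range_fields) - len(tag_fields))
    return all(r == "*" or r == t for r, t in zip(range_fields, tag_fields))
```

Under these assumptions the range "de-*-DE" matches each of the five
tags listed above, but not "de" or "de-Latn-CH".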

470	3.2.3.  Scored Filtering

472	   Both basic and extended language range filtering produce simple
473	   boolean matches between a language range and a language tag.

475	   Sometimes it may be useful to provide an array of results with
476	   different levels of matching, for example, sorting results based on
477	   the overall "quality" of the match.  Scored (or "distance metric")
478	   filtering provides a way to generate these quality values.

480	   As with the other forms of filtering, the process considers each
481	   language range in the language priority list in order of priority.

483	   Each extended language range and language tag MUST first be
484	   canonicalized by mapping grandfathered and obsolete tags into modern
485	   equivalents.  This requires the information in the IANA Language
486	   Subtag Registry (see Section 3 of [RFC3066bis]).

488	   The language range and each language tag it is to be compared to are
489	   then transformed into a "quintuple" consisting of five "elements" in
490	   the form (language, script, country, variant, extension).

492	   Any extended language subtags are considered part of the language
493	   "element".  For example, the language element for the tag "zh-cmn-
494	   Hans" would be "zh-cmn".

496	   Private-use subtag sequences are considered part of the language
497	   "element" if in the initial position in the tag and part of the
498	   variant "element" if not.  The different handling of private-use
499	   sequences prevents a range such as "x-twain" from matching all
500	   possible tags, while a range such as "en-US-x-twain" would closely
501	   match nearly all tags for English as used in the United States.

503	   Language subtags 'und', 'mul', and the script subtag 'Zyyy' are
504	   converted to "*": these subtag values represent undetermined,
505	   multiple, or private-use values which are consistent with the use of
506	   the wildcard.

508	   For language tags that have no script subtag but whose language
509	   subtag's record in the IANA Language Subtag Registry contains the
510	   field "Suppress-Script", the script element in the quintuple MUST be
511	   set to the script subtag in the Suppress-Script field.  This is
512	   necessary because [RFC3066bis] strongly recommends that users not use
513	   this subtag to form language tags and this document (see Section 4.1)
514	   recommends that users not use them to form ranges.  Languages which
515	   have a "Suppress-Script" field in the registry are predominantly
516	   written in that single script, making the subtag redundant in forming
517	   a language tag or range.  Thus if the script were not expanded in
518	   this manner, a range such as "de-DE" would produce a more-distant
519	   score for content that happened to be labeled "de-Latn-DE" than users
520	   would expect.

522	   Any remaining missing components in the language tag are set to "*";
523	   thus an empty language tag becomes the quintuple ("*", "*", "*", "*",
524	   "*").  Missing components in the language range are handled similarly
525	   to extended range filtering: missing internal subtags are expanded to
526	   "*".  Missing end subtags are expanded as the empty string.  Thus a
527	   pattern "en-US" becomes the quintuple ("en","*","US","","").

529	   Here are some examples of language tags, showing their quintuples as
530	   both language tags and language ranges:

532	   en-US
533	      Tag:   (en, *, US, *, *)
534	      Range: (en, *, US, "", "")

536	   sr-Latn
537	      Tag:   (sr, Latn, *, *, *)
538	      Range: (sr, Latn, "", "", "")

540	   zh-cmn-Hant
541	      Tag:   (zh-cmn, Hant, *, *, *)
542	      Range: (zh-cmn, Hant, "", "", "")

544	   x-foo
545	      Tag:   (x-foo, *, *, *, *)
546	      Range: (x-foo, "", "", "", "")

548	   en-x-foo
549	      Tag:   (en, *, *, x-foo, *)
550	      Range: (en, *, *, x-foo, "")

552	   i-default
553	      Tag:   (i-default, *, *, *, *)
554	      Range: (i-default, "", "", "", "")

556	   sl-Latn-IT-rozaj
557	      Tag:   (sl, Latn, IT, rozaj, *)
558	      Range: (sl, Latn, IT, rozaj, "")

560	   zh-r-wadegile (hypothetical)
561	      Tag:   (zh, *, *, *, r-wadegile)
562	      Range: (zh, *, *, *, r-wadegile)

564	   Figure 3: Examples of Distance Metric Quintuples

566	   Each pair of quintuples being compared is assigned a distance value,
567	   in which small values indicate better matches and large values
568	   indicate worse ones.  The distance between the pair is the sum of the
569	   distances for each of the corresponding elements of the quintuple.
570	   If the elements are identical or one is '*', then the distance value
571	   between them is zero.  Otherwise, it is given by the following table:
572	     256    language mismatch
573	     128    script mismatch
574	      32    region mismatch
575	       4    variant mismatch
576	       1    extension mismatch

578	   A value of 0 is a perfect match; 421 is no match at all.  Different
579	   threshold values might be appropriate for different applications or
580	   protocols.  Implementations will usually allow users to choose the
581	   most appropriate selection value, ranking the matched items based on
582	   score.

584	   Examples of various tags' distances from the range "en-US":

586	   "fr-FR"          288 (language & region mismatch: 256 + 32)
587	   "fr"             256 (language mismatch, region match)
588	   "en-GB"           32 (region mismatch)
589	   "en-Latn-US"       0 (all fields match)
590	   "en-Brai"         32 (region mismatch)
591	   "en-US-x-foo"      4 (variant mismatch: range is the empty string)
592	   "en-US-r-wadegile" 1 (extension mismatch: range is the empty string)
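
   As an informal illustration only (not part of this specification),
   the distance calculation could be sketched in Python as shown here.
   The names WEIGHTS, element_distance, and distance are hypothetical,
   the quintuples are assumed to have been built already, and the
   case-insensitive comparison is an assumption based on Section 4.1.

```python
# Weights from the distance table, in quintuple order:
# (language, script, region, variant, extension).
WEIGHTS = (256, 128, 32, 4, 1)


def element_distance(range_elem, tag_elem, weight):
    """Return 0 if the elements match or either is '*', else the weight."""
    if range_elem == "*" or tag_elem == "*":
        return 0
    # Assumption: subtags are compared case-insensitively.
    if range_elem.lower() == tag_elem.lower():
        return 0
    return weight


def distance(range_quint, tag_quint):
    """Sum the per-element distances of two quintuples."""
    return sum(
        element_distance(r, t, w)
        for r, t, w in zip(range_quint, tag_quint, WEIGHTS)
    )


# The range "en-US" as a range quintuple (missing end subtags are "").
en_us_range = ("en", "*", "US", "", "")

print(distance(en_us_range, ("fr", "*", "*", "*", "*")))            # "fr": 256
print(distance(en_us_range, ("en", "*", "GB", "*", "*")))           # "en-GB": 32
print(distance(en_us_range, ("en", "*", "US", "x-foo", "*")))       # "en-US-x-foo": 4
print(distance(en_us_range, ("en", "*", "US", "*", "r-wadegile")))  # "en-US-r-wadegile": 1
```

   Note that the empty string in a range quintuple is not a wildcard:
   it matches "*" or another empty string, but it mismatches any
   actual subtag, which is why the last two examples are non-zero.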

594	   Where a language priority list follows the syntax of the "Accept-
595	   Language" header defined in [RFC2616] (see Section 14.4) and
596	   [RFC3282], language ranges without a Q value are given values equal
597	   to the value of the previous language range in the list (processing
598	   from first to last).  If the first language range has no Q value, it
599	   is given a value of 1.0.  Language ranges with Q values of zero are
600	   removed.  For example, "fr, en;q=0.5, de, it" becomes
601	   "fr;q=1.0,en;q=0.5,de;q=0.5,it;q=0.5".  The distance values given
602	   above are then divided by the Q values.  For example, if a language
603	   tag has a distance of 384 from a language range with a Q value of
604	   0.8, then the resulting distance is 480 (384 divided by 0.8).
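
   The Q value handling described in the preceding paragraph might be
   sketched as follows.  This is a minimal illustration, not a
   complete "Accept-Language" parser; the function names fill_q_values
   and weighted_distance are hypothetical, and the input is assumed to
   be well-formed.

```python
def fill_q_values(priority_list):
    """Parse a priority list string into (range, q) pairs.

    A range without a Q value inherits the value of the previous range
    (1.0 for the first range).  Ranges whose Q value is zero are removed.
    """
    result = []
    previous_q = 1.0  # used when the first range has no Q value
    for item in priority_list.split(","):
        parts = [p.strip() for p in item.split(";")]
        language_range = parts[0]
        q = previous_q
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        previous_q = q
        if q > 0:
            result.append((language_range, q))
    return result


def weighted_distance(raw_distance, q):
    """Divide a raw distance by the Q value of the range it came from."""
    return raw_distance / q


print(fill_q_values("fr, en;q=0.5, de, it"))
# [('fr', 1.0), ('en', 0.5), ('de', 0.5), ('it', 0.5)]
```

   Because zero-valued ranges are removed before any division takes
   place, weighted_distance is never called with a Q value of zero.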

606	   Implementations or protocols MAY use different weighting systems than
607	   the ones described above, as long as the weightings and weighting
608	   mechanisms are clearly specified.  Thus, for example, an
609	   implementation or protocol could give all language tags with missing
610	   Q values a value of 1.0, or give the distance value 1000 to a
611	   language mismatch.  They MAY also use more sophisticated weights that
612	   depend on the values of the corresponding elements.  For example, an
613	   implementation might assign a small distance to closely related
614	   subtags.  Some examples of closely related subtags might be:

616	   Language:
617	     no (Norwegian)
618	     nb (Bokmal Norwegian)
619	     nn (Nynorsk Norwegian)

621	   Script:
622	     Kata (katakana)
623	     Hira (hiragana)

625	   Region:
626	     US (United States of America)
627	     UM (United States Minor Outlying Islands)

629	   Figure 6: Examples of Closely Related Subtags

631	3.3.  Lookup

633	   Lookup is used to select the single information item that best
634	   matches the language priority list for a given request.  When
635	   performing lookup, each language range in the language priority list
636	   is considered in turn, according to priority.  By contrast with
637	   filtering, each language range represents the _most_ specific tag
638	   that is an acceptable match.  The first information item found with
639	   a matching tag, according to the user's priority, is considered the
640	   closest match and is the item returned.  For example, if the language
641	   range is "de-CH", one might expect to receive an information item
642	   with the tag "de" but never one with the tag "de-CH-1996".  Usually
643	   if no content matches the request, a "default" item is returned.

645	   For example, if an application inserts some dynamic content into a
646	   document, returning an empty string if there is no exact match is not
647	   an option.  Instead, the application "falls back" until it finds a
648	   suitable piece of content to insert.  Other examples of lookup might
649	   include:

651	   o  Selection of a template containing the text for an automated email
652	      response.

654	   o  Selection of an item containing some text for inclusion in a
655	      particular Web page.

657	   o  Selection of a string of text for inclusion in an error log.

659	   In the lookup scheme, the language range is progressively truncated
660	   from the end until a matching piece of content is located.  For
661	   example, starting with the range "zh-Hant-CN-x-private", the lookup
662	   progressively searches for content as shown below:

664	   Range to match: zh-Hant-CN-x-private
665	   1. zh-Hant-CN-x-private
666	   2. zh-Hant-CN
667	   3. zh-Hant
668	   4. zh
669	   5. (default content or the empty tag)

671	   Figure 7: Example of a Lookup Fallback Pattern
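
   The fallback pattern in the figure above could be generated by a
   sketch along these lines.  The names fallback_sequence and lookup
   are hypothetical, and the content store is assumed to be a simple
   mapping from language tag to item.  Dropping a trailing
   single-character subtag (such as "x") together with the subtag that
   follows it is inferred from the figure, which has no step ending in
   "-x".

```python
def fallback_sequence(language_range):
    """Yield the range, then each progressively truncated form."""
    subtags = language_range.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # Do not leave a single-character subtag dangling at the end.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(language_range, content, default=None):
    """Return the first item whose tag equals a step in the fallback."""
    by_tag = {tag.lower(): item for tag, item in content.items()}
    for candidate in fallback_sequence(language_range):
        if candidate.lower() in by_tag:
            return by_tag[candidate.lower()]
    return default


print(list(fallback_sequence("zh-Hant-CN-x-private")))
# ['zh-Hant-CN-x-private', 'zh-Hant-CN', 'zh-Hant', 'zh']
```

   With content tagged "de" and "de-CH-1996", a call such as
   lookup("de-CH", content) returns the "de" item and never the
   "de-CH-1996" item, since lookup only moves toward shorter tags.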

673	   This scheme allows some flexibility in finding content.  For example,
674	   when no data exactly matches the user's request, it gives better
675	   results than immediately returning the default language for the
676	   system or content.  Content is often not available at every level
677	   of tag granularity, or language content might be sparsely
678	   populated, so "falling back" through the subtag sequence provides
679	   more opportunity to find a match between available content and the
680	   user's request.

682	   The default content is implementation defined.  It might be content
683	   with no language tag; might have an empty value (the built-in
684	   attribute xml:lang in [XML10] permits the empty value); might be a
685	   particular language designated for that bit of content; or it might
686	   be content that is labeled with the tag "i-default" (see [RFC2277]).
687	   When performing lookup using a language priority list, the
688	   progressive search MUST proceed to consider each language range in
689	   the list before finding the default content or empty tag.

691	   One common way for an application or implementation to provide for
692	   default content is to allow a specific language range to be set as
693	   the default for a specific type of request.  This language range is
694	   then treated as if it were appended to the end of the language
695	   priority list as a whole, rather than after each item in the language
696	   priority list.

698	   For example, if a particular user's language priority list were
699	   "fr-FR; zh-Hant" and the program doing the matching had a default
700	   language range of "ja-JP", the program would search for content as
701	   follows:
702	   1. fr-FR
703	   2. fr
704	   3. zh-Hant // next language
705	   4. zh
706	   5. (search for the default content)
707	      a. ja-JP
708	      b. ja
709	      c. (implementation defined default)

711	   Figure 8: Lookup Using a Language Priority List
712	   Implementations SHOULD ignore extensions and unrecognized private-use
713	   subtags when performing lookup, since these subtags are usually
714	   orthogonal to the user's request.
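
   As a rough sketch, the search order for a language priority list
   with a default language range could be computed as shown here.  The
   function names truncations and search_order are hypothetical, and
   the priority list is assumed to have been split into individual
   ranges already.

```python
def truncations(language_range):
    """Return the range followed by its progressively shorter forms."""
    subtags = language_range.split("-")
    steps = []
    while subtags:
        steps.append("-".join(subtags))
        subtags.pop()
        # Drop a trailing single-character subtag such as "x".
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return steps


def search_order(priority_list, default_range=None):
    """List every range to try, in order, for a language priority list."""
    order = []
    for language_range in priority_list:
        order.extend(truncations(language_range))
    # The default range is tried once, after the whole list,
    # not after each individual range.
    if default_range is not None:
        order.extend(truncations(default_range))
    return order


print(search_order(["fr-FR", "zh-Hant"], default_range="ja-JP"))
# ['fr-FR', 'fr', 'zh-Hant', 'zh', 'ja-JP', 'ja']
```

   The implementation-defined default content would be consulted only
   after every entry in the returned list has failed to match.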

716	   The special language range "*" matches any language tag.  In the
717	   lookup scheme, this range does not convey enough information by
718	   itself to determine which content is most appropriate, since it
719	   matches everything.  If the language range "*" is the only one in the
720	   language priority list, it matches the default content.  If the
721	   language range "*" is followed by other language ranges, it should be
722	   skipped.

724	   In some cases, the language priority list might contain one or more
725	   extended language ranges (as, for example, when the same language
726	   priority list is used as input for both lookup and filtering
727	   operations).  Wildcard values in an extended language range normally
728	   match any value that occurs in that position in a language tag.
729	   Since only one item can be returned for any given lookup request,
730	   wildcards in a language range have to be processed in a consistent
731	   manner or the same request will produce widely varying results.
732	   Implementations that accept extended language ranges MUST define
733	   which content is returned when more than one item matches the
734	   extended language range.

736	   For example, an implementation could return the matching content that
737	   is first in ASCII-order.  For example, if the language range were
738	   "*-CH" and the set of content included "de-CH", "fr-CH", and "it-CH",
739	   then the content labeled "de-CH" would be returned.

741	   Implementations MAY also map extended language ranges to basic
742	   language ranges: if the first subtag is a "*" then the entire range
743	   is treated as "*" (which matches the default content), otherwise each
744	   wildcard subtag is removed.  For example, if the language range were
745	   "en-*-US", then the range would be mapped to "en-US".
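
   The optional mapping from extended to basic language ranges is
   simple enough to sketch in a few lines.  The function name
   to_basic_range is hypothetical.

```python
def to_basic_range(extended_range):
    """Map an extended language range to a basic language range."""
    subtags = extended_range.split("-")
    # A leading wildcard turns the whole range into "*".
    if subtags[0] == "*":
        return "*"
    # Otherwise each wildcard subtag is simply removed.
    return "-".join(s for s in subtags if s != "*")


print(to_basic_range("en-*-US"))  # en-US
print(to_basic_range("*-CH"))     # *
```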

747	   Where a language priority list contains Q values as in the syntax of
748	   the "Accept-Language" header defined in [RFC2616] (see Section 14.4)
749	   and [RFC3282], language tags without a Q value are given values equal
750	   to the value of the previous language tag (processing from first to
751	   last).  If the first language tag has no Q value, it is given a value
752	   of 1.0.  Then language tags with zero Q values are removed.  For
753	   example, "fr, en;q=0.5, de, it" becomes "fr;q=1.0, en;q=0.5,
754	   de;q=0.5, it;q=0.5".  The language priority list is then sorted from
755	   highest priority to lowest, whereby any two language tags with the
756	   same Q value remain in the same order as in the original
757	   language priority list.  This list is then traversed as described
758	   above in doing lookup.
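
   The ordering step might be sketched as follows, assuming the Q
   values have already been filled in as described above.  The
   function name sort_by_priority is hypothetical.  The sketch relies
   on Python's built-in sorted(), which is a stable sort and keeps the
   original order of equal keys even when reverse=True is given.

```python
def sort_by_priority(ranges_with_q):
    """Order (range, q) pairs from highest Q value to lowest.

    Pairs with a Q value of zero are removed first.  Pairs with equal
    Q values keep their original relative order.
    """
    kept = [(r, q) for r, q in ranges_with_q if q > 0]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


print(sort_by_priority([("en", 0.5), ("fr", 1.0), ("de", 0.5), ("it", 0.0)]))
# [('fr', 1.0), ('en', 0.5), ('de', 0.5)]
```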

760	   Implementations or protocols MAY use different lookup mechanisms
761	   than the ones described above, as long as those mechanisms are
762	   clearly specified.

764	4.  Other Considerations

766	   When working with language ranges and matching schemes, there are
767	   some additional points that may influence the choice of either.

769	4.1.  Choosing Language Ranges

771	   Users indicate their language preferences via the choice of a
772	   language range or the list of language ranges in a language priority
773	   list.  The type of matching affects what the best choice is for a
774	   given user.

776	   Most matching schemes make no attempt to process the semantic meaning
777	   of the subtags.  The language range (or its subtags) is usually
778	   compared in a case-insensitive manner to each language tag being
779	   matched, using basic string processing.

781	   Users SHOULD avoid subtags that add no distinguishing value to a
782	   language range.  Generally, the fewer subtags that appear in the
783	   language range, the more content the range will match.

785	   Most notably, script subtags SHOULD NOT be used to form a language
786	   range in combination with language subtags that have a matching
787	   Suppress-Script field in their registry entry.  Thus the language
788	   range "en-Latn" is probably inappropriate in most cases (because the
789	   vast majority of English documents are written in the Latin script
790	   and thus the 'en' language subtag has a Suppress-Script field for
791	   'Latn' in the registry).

793	   When working with tags and ranges, note that extensions and most
794	   private-use subtags are orthogonal to language tag matching, in that
795	   they specify additional attributes of the text not related to the
796	   goals of most matching schemes.  Users SHOULD avoid using these
797	   subtags in language ranges, since they interfere with the selection
798	   of available content.  When used in language tags (as opposed to
799	   ranges), these subtags normally do not interfere with filtering
800	   (Section 3), since they appear at the end of the tag and will match
801	   all prefixes.

803	   When working with language tags and language ranges note that:

805	   o  Private-use and Extension subtags are normally orthogonal to
806	      language tag fallback.  Implementations or specifications that use
807	      a lookup (Section 3.3) matching scheme often ignore unrecognized
808	      private-use and extension subtags when performing language tag
809	      fallback.  In addition, since these subtags are always at the end
810	      of the sequence of subtags, their use in language tags normally
811	      doesn't interfere with the use of ranges that omit them in the
812	      filtering (Section 3.2) matching schemes described above.
813	      However, they do interfere with filtering when used in language
814	      ranges and SHOULD be avoided in ranges as a result.

816	   o  Applications, specifications, or protocols that choose not to
817	      interpret one or more private-use or extension subtags SHOULD NOT
818	      remove or modify these extensions in content that they are
819	      processing.  When a language tag instance is to be used in a
820	      specific, known protocol, and is not being passed through to other
821	      protocols, language tags MAY be filtered to remove subtags and
822	      extensions that are not supported by that protocol.  Such
823	      filtering SHOULD be avoided, if possible, since it removes
824	      information that might be relevant to services on the other end of
825	      the protocol that would make use of that information.

827	   o  Some applications of language tags might want or need to consider
828	      extensions and private-use subtags when matching tags.  If
829	      extensions and private-use subtags are included in a matching or
830	      filtering process that utilizes one of the schemes described in
831	      this document, then the implementation SHOULD canonicalize the
832	      language tags and/or ranges before performing the matching.  Note
833	      that language tag processors that claim to be "well-formed"
834	      processors as defined in [RFC3066bis] generally fall into this
835	      category.

837	4.2.  Meaning of Language Tags and Ranges

839	   Selecting content using language ranges requires some understanding
840	   by users of what they are selecting.  A language tag or range
841	   identifies a language as spoken (or written, signed or otherwise
842	   signaled) by human beings for communication of information to other
843	   human beings.

845	   If a language tag B contains language tag A as a prefix, then B is
846	   typically "narrower" or "more specific" than A. For example, "zh-
847	   Hant-TW" is more specific than "zh-Hant".

849	   This relationship is not guaranteed in all cases: specifically,
850	   languages that begin with the same sequence of subtags are NOT
851	   guaranteed to be mutually intelligible, although they might be.

853	   For example, the tag "az" shares a prefix with both "az-Latn"
854	   (Azerbaijani written using the Latin script) and "az-Arab"
855	   (Azerbaijani written using the Arabic script).  A person fluent in
856	   one script might not be able to read the other, even though the text
857	   might be otherwise identical.  Content tagged as "az" most probably
858	   is written in just one script and thus might not be intelligible to a
859	   reader familiar with the other script.

861	   Variant subtags in particular seem to represent specific divisions in
862	   mutual understanding, since they often encode dialects or other
863	   idiosyncratic variations within a language.  They also seem to
864	   represent relatively low divisions with a high chance of at least
865	   limited understanding, although this depends on the specific variant
866	   in question.

868	   The relationship between the language tag and the information it
869	   relates to is defined by the standard describing the context in which
870	   it appears.  Accordingly, this section can only give possible
871	   examples of its usage:

873	   o  For a single information object, the associated language tags
874	      might be interpreted as the set of languages that are necessary
875	      for a complete comprehension of the complete object.  Example:
876	      Plain text documents.

878	   o  For an aggregation of information objects, the associated language
879	      tags could be taken as the set of languages used inside components
880	      of that aggregation.  Examples: Document stores and libraries.

882	   o  For information objects whose purpose is to provide alternatives,
883	      the associated language tags could be regarded as a hint that the
884	      content is provided in several languages, and that one has to
885	      inspect each of the alternatives in order to find its language or
886	      languages.  In this case, the presence of multiple tags might not
887	      mean that one needs to be multi-lingual to get complete
888	      understanding of the document.  Example: MIME multipart/
889	      alternative.

891	   o  In markup languages, such as HTML and XML, language information
892	      can be added to each part of the document identified by the markup
893	      structure (including the whole document itself).  For example, one
894	      could write <span lang="FR">C'est la vie.</span> inside a
895	      Norwegian document; the Norwegian-speaking user could then access
896	      a French-Norwegian dictionary to find out what the marked section
897	      meant.  If the user were listening to that document through a
898	      speech synthesis interface, this formation could be used to signal
899	      the synthesizer to appropriately apply French text-to-speech
900	      pronunciation rules to that span of text, instead of misapplying
901	      the Norwegian rules.

903	4.3.  Considerations for Private Use Subtags

905	   Private-use subtags require private agreement between the parties
906	   that intend to use or exchange language tags that use them and great
907	   caution SHOULD be used in employing them in content or protocols
908	   intended for general use.  Private-use subtags are simply useless for
909	   information exchange without prior arrangement.

911	   The value and semantic meaning of private-use tags and of the subtags
912	   used within such a language tag are not defined.  Matching private-
913	   use tags using language ranges or extended language ranges can result
914	   in unpredictable content being returned.

916	4.4.  Length Considerations in Matching

918	   RFC 3066 [RFC3066] did not provide an upper limit on the size of
919	   language tags or ranges.  RFC 3066 did define the semantics of
920	   particular subtags in such a way that most language tags or ranges
921	   consisted of language and region subtags with a combined total length
922	   of up to six characters.  Larger tags and ranges (in terms of both
923	   subtags and characters) did exist, however.

925	   [RFC3066bis] also does not impose a fixed upper limit on the number
926	   of subtags in a language tag or range (and thus an upper bound on the
927	   size of either).  The syntax in that document suggests that,
928	   depending on the specific language or range of languages, more
929	   subtags (and thus characters) are sometimes necessary as a result.
930	   Length considerations and their impact on the selection and
931	   processing of tags are described in Section 2.1.1 of that document.

933	   An application or protocol MAY choose to limit the length of the
934	   language tags or ranges used in matching.  Any such limitation SHOULD
935	   be clearly documented, and such documentation SHOULD include the
936	   disposition of any longer tags or ranges (for example, whether an
937	   error value is generated or the language tag or range is truncated).
938	   If truncation is permitted it MUST NOT permit a subtag to be divided,
939	   since this changes the semantics of the subtag being matched and can
940	   result in false positives or negatives.

942	   Applications or protocols that restrict storage SHOULD consider the
943	   impact of tag or range truncation on the resulting matches.  For
944	   example, removing the "*" from the end of an extended language range
945	   (see Section 2.2) can greatly modify the set of returned matches.  A
946	   protocol that allows tags or ranges to be truncated at an arbitrary
947	   limit, without giving any indication of what that limit is, has the
948	   potential for causing harm by changing the meaning of values in
949	   substantial ways.

951	   In practice, most tags do not require additional subtags or
952	   substantially more characters.  Additional subtags sometimes add
953	   useful distinguishing information, but extraneous subtags interfere
954	   with the meaning, understanding, and especially matching of language
955	   tags.  Since language tags or ranges MAY be truncated by an
956	   application or protocol that limits storage, when choosing language
957	   tags or ranges users and applications SHOULD avoid adding subtags
958	   that add no distinguishing value.  In particular, users and
959	   implementations SHOULD follow the 'Prefix' and 'Suppress-Script'
960	   fields in the registry (defined in Section 3.6 of [RFC3066bis]):
961	   these fields provide guidance on when specific additional subtags
962	   SHOULD (and SHOULD NOT) be used.

964	   Implementations MUST support a limit of at least 33 characters.  This
965	   limit includes at least one subtag of each non-extension, non-private
966	   use type.  When choosing a buffer limit, a length of at least 42
967	   characters is strongly RECOMMENDED.

969	   The practical limit on tags or ranges derived solely from registered
970	   values is 42 characters.  Implementations MUST be able to handle tags
971	   and ranges of this length.  Support for tags and ranges of at least
972	   62 characters in length is RECOMMENDED.  Implementations MAY support
973	   longer values, including matching extensive sets of private-use or
974	   extension subtags.

976	   Applications or protocols which have to truncate a tag MUST do so by
977	   progressively removing subtags along with their preceding "-" from
978	   the right side of the language tag until the tag is short enough for
979	   the given buffer.  If the resulting tag ends with a single-character
980	   subtag, that subtag and its preceding "-" MUST also be removed.  For
981	   example:

983	   Tag to truncate: zh-Latn-CN-variant1-a-extend1-x-wadegile-private1
984	   1. zh-Latn-CN-variant1-a-extend1-x-wadegile
985	   2. zh-Latn-CN-variant1-a-extend1
986	   3. zh-Latn-CN-variant1
987	   4. zh-Latn-CN
988	   5. zh-Latn
989	   6. zh

991	   Figure 9: Example of Tag Truncation
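
   A minimal sketch of the truncation rule, for illustration only,
   follows.  The function name truncate_tag and the max_length
   parameter are hypothetical.  The sketch never divides a subtag; if
   the first subtag alone exceeds the limit, it is returned unchanged,
   which is an assumption, since this document does not say what to do
   in that case.

```python
def truncate_tag(tag, max_length):
    """Remove whole subtags from the right until the tag fits."""
    subtags = tag.split("-")
    while len("-".join(subtags)) > max_length and len(subtags) > 1:
        subtags.pop()
        # A tag must not end in a single-character subtag, so remove
        # it together with its preceding "-".
        while len(subtags) > 1 and len(subtags[-1]) == 1:
            subtags.pop()
    return "-".join(subtags)


tag = "zh-Latn-CN-variant1-a-extend1-x-wadegile-private1"
print(truncate_tag(tag, 42))  # zh-Latn-CN-variant1-a-extend1-x-wadegile
print(truncate_tag(tag, 33))  # zh-Latn-CN-variant1-a-extend1
print(truncate_tag(tag, 20))  # zh-Latn-CN-variant1
```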

993	5.  IANA Considerations

995	   This document presents no new or existing considerations for IANA.

997	6.  Changes

999	   This is the first version of this document.

1001	   The following changes were put into this document since draft-07:

1003	      Added a mention of "*" to the Character Set Considerations section
1004	      (D.Ewell)

1006	7.  Security Considerations

1008	   Language ranges used in content negotiation might be used to infer
1009	   the nationality of the sender, and thus identify potential targets
1010	   for surveillance.  In addition, unique or highly unusual language
1011	   ranges or combinations of language ranges might be used to track a
1012	   specific individual's activities.

1014	   This is a special case of the general problem that anything you send
1015	   is visible to the receiving party.  It is useful to be aware that
1016	   such concerns can exist in some cases.

1018	   The evaluation of the exact magnitude of the threat, and any possible
1019	   countermeasures, is left to each application or protocol.

1021	8.  Character Set Considerations

1023	   Language tags permit only the characters A-Z, a-z, 0-9, and HYPHEN-
1024	   MINUS (%x2D).  Language ranges also use the character ASTERISK
1025	   (%x2A).  These characters are present in most character sets, so
1026	   presentation or exchange of language tags or ranges should not be
1027	   constrained by character set issues.

1029	9.  References

1031	9.1.  Normative References

1033	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1034	              Requirement Levels", BCP 14, RFC 2119, March 1997.

1036	   [RFC2277]  Alvestrand, H., "IETF Policy on Character Sets and
1037	              Languages", BCP 18, RFC 2277, January 1998.

1039	   [RFC3066bis]
1040	              Phillips, A., Ed. and M. Davis, Ed., "Tags for the
1041	              Identification of Languages", October 2005, <http://
1042	              www.ietf.org/internet-drafts/
1043	              draft-ietf-ltru-registry-14.txt>.

1045	   [RFC4234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
1046	              Specifications: ABNF", RFC 4234, October 2005.

1048	9.2.  Informative References

1050	   [RFC1766]  Alvestrand, H., "Tags for the Identification of
1051	              Languages", RFC 1766, March 1995.

1053	   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
1054	              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
1055	              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

1057	   [RFC3066]  Alvestrand, H., "Tags for the Identification of
1058	              Languages", BCP 47, RFC 3066, January 2001.

1060	   [RFC3282]  Alvestrand, H., "Content Language Headers", RFC 3282,
1061	              May 2002.

1063	   [XML10]    Bray, T., et al., "Extensible Markup Language (XML) 1.0",
1064	              February 2004.

1066	Appendix A.  Acknowledgements

1068	   Any list of contributors is bound to be incomplete; please regard the
1069	   following as only a selection from the group of people who have
1070	   contributed to make this document what it is today.

1072	   The contributors to [RFC3066bis], [RFC3066] and [RFC1766], each of
1073	   which is a precursor to this document, made enormous contributions
1074	   directly or indirectly to this document and are generally responsible
1075	   for the success of language tags.

1077	   The following people (in alphabetical order by family name)
1078	   contributed to this document:

1080	   Harald Alvestrand, Jeremy Carroll, John Cowan, Martin Duerst, Frank
1081	   Ellermann, Doug Ewell, Marion Gunn, Kent Karlsson, Ira McDonald, M.
1082	   Patton, Randy Presuhn, Eric van der Poel, Markus Scherer, and many,
1083	   many others.

1085	   Very special thanks must go to Harald Tveit Alvestrand, who
1086	   originated RFCs 1766 and 3066, and without whom this document would
1087	   not have been possible.

1089	   For this particular document, John Cowan originated the scheme
1090	   described in Section 3.2.3.  Mark Davis originated the scheme
1091	   described in Section 3.3.

1093	Authors' Addresses

1095	   Addison Phillips (editor)
1096	   Yahoo! Inc

1098	   Email: addison at inter dash locale dot com

1100	   Mark Davis (editor)
1101	   Google

1103	   Email: mark dot davis at macchiato dot com

1105	Intellectual Property Statement

1107	   The IETF takes no position regarding the validity or scope of any
1108	   Intellectual Property Rights or other rights that might be claimed to
1109	   pertain to the implementation or use of the technology described in
1110	   this document or the extent to which any license under such rights
1111	   might or might not be available; nor does it represent that it has
1112	   made any independent effort to identify any such rights.  Information
1113	   on the procedures with respect to rights in RFC documents can be
1114	   found in BCP 78 and BCP 79.

1116	   Copies of IPR disclosures made to the IETF Secretariat and any
1117	   assurances of licenses to be made available, or the result of an
1118	   attempt made to obtain a general license or permission for the use of
1119	   such proprietary rights by implementers or users of this
1120	   specification can be obtained from the IETF on-line IPR repository at
1121	   http://www.ietf.org/ipr.

1123	   The IETF invites any interested party to bring to its attention any
1124	   copyrights, patents or patent applications, or other proprietary
1125	   rights that may cover technology that may be required to implement
1126	   this standard.  Please address the information to the IETF at
1127	   ietf-ipr@ietf.org.

1129	Disclaimer of Validity

1131	   This document and the information contained herein are provided on an
1132	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1133	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1134	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1135	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1136	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1137	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1139	Copyright Statement

1141	   Copyright (C) The Internet Society (2006).  This document is subject
1142	   to the rights, licenses and restrictions contained in BCP 78, and
1143	   except as set forth therein, the authors retain all their rights.

1145	Acknowledgment

1147	   Funding for the RFC Editor function is currently provided by the
1148	   Internet Society.