idnits 2.17.1 

draft-ietf-ltru-registry-05.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 2733.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2710.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2717.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2723.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The abstract seems to indicate that this document obsoletes RFC3066, but
     the header doesn't have an 'Obsoletes:' line to match this.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 843 has weird spacing: '...logical  line ...'

  == Line 844 has weird spacing: '...prising  a fie...'

  == Line 845 has weird spacing: '...ld-body  porti...'

  == Line 846 has weird spacing: '...   this  conce...'

  == Line 989 has weird spacing: '...ve been  possi...'

  == (10 more instances...)

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The exact meaning of the all-uppercase expression 'NOT REQUIRED' is not
     defined in RFC 2119.  If it is intended as a requirements expression, it
     should be rewritten using one of the combinations defined in RFC 2119;
     otherwise it should not be all-uppercase.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     The tags and their subtags, including private-use and extensions,
     are to be treated as case insensitive: there exist conventions for the
     capitalization of some of the subtags, but these MUST not be taken to
     carry meaning.

  == The expression 'MAY NOT', while looking like RFC 2119 requirements text,
     is not defined in RFC 2119, and should not be used.  Consider using 'MUST
     NOT' instead (if that is what you mean).
     
     Found 'MAY NOT' in this paragraph:
     
     Note that 'Preferred-Value' mappings in records of type 'region'
     MAY NOT represent exactly the same meaning as the original value.  There
     are many reasons for a country code to be changed and the effect this has
     on the formation of language tags will depend on the nature of the change
     in question.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 15, 2005) is 6888 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'ISO 639' on line 206

  -- Looks like a reference, but probably isn't: 'ISO 3166' on line 209

  -- Looks like a reference, but probably isn't: 'ISO 15924' on line 340

  -- Looks like a reference, but probably isn't: 'RFC 2231' on line 274

  -- Looks like a reference, but probably isn't: 'RFC 2047' on line 278

  -- Looks like a reference, but probably isn't: 'ISO 639-1' on line 392

  -- Looks like a reference, but probably isn't: 'ISO 639-2' on line 399

  -- Looks like a reference, but probably isn't: 'RFC 2028' on line 1593

  -- Looks like a reference, but probably isn't: 'RFC 2026' on line 1405

  == Unused Reference: '22' is defined on line 2390, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '1'

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  ** Obsolete normative reference: RFC 2028 (ref. '9') (Obsoleted by RFC 9281)

  ** Obsolete normative reference: RFC 2434 (ref. '12') (Obsoleted by RFC
     5226)

  ** Downref: Normative reference to an Informational RFC: RFC 2781 (ref.
     '13')

  ** Downref: Normative reference to an Informational RFC: RFC 2860 (ref.
     '14')

  -- Obsolete informational reference (is this intentional?): RFC 1766 (ref.
     '22') (Obsoleted by RFC 3066, RFC 3282)

  -- Obsolete informational reference (is this intentional?): RFC 3066 (ref.
     '24') (Obsoleted by RFC 4646, RFC 4647)


     Summary: 7 errors (**), 0 flaws (~~), 12 warnings (==), 26 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                   A. Phillips, Ed.
3	Internet-Draft                                            Quest Software
4	Expires: December 17, 2005                                 M. Davis, Ed.
5	                                                                     IBM
6	                                                           June 15, 2005

8	                     Tags for Identifying Languages
9	                      draft-ietf-ltru-registry-05

11	Status of this Memo

13	   By submitting this Internet-Draft, each author represents that any
14	   applicable patent or other IPR claims of which he or she is aware
15	   have been or will be disclosed, and any of which he or she becomes
16	   aware will be disclosed, in accordance with Section 6 of BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This Internet-Draft will expire on December 17, 2005.

36	Copyright Notice

38	   Copyright (C) The Internet Society (2005).

40	Abstract

42	   This document describes the structure, content, construction, and
43	   semantics of language tags for use in cases where it is desirable to
44	   indicate the language used in an information object.  It also
45	   describes how to register values for use in language tags and the
46	   creation of user defined extensions for private interchange.  This
47	   document obsoletes RFC 3066 (which replaced RFC 1766).

49	Table of Contents

51	   1.  Introduction
52	   2.  The Language Tag
53	     2.1   Syntax
54	       2.1.1   Length Considerations
55	     2.2   Language Subtag Sources and Interpretation
56	       2.2.1   Primary Language Subtag
57	       2.2.2   Extended Language Subtags
58	       2.2.3   Script Subtag
59	       2.2.4   Region Subtag
60	       2.2.5   Variant Subtags
61	       2.2.6   Extension Subtags
62	       2.2.7   Private Use Subtags
63	       2.2.8   Pre-Existing RFC 3066 Registrations
64	       2.2.9   Classes of Conformance
65	   3.  Registry Format and Maintenance
66	     3.1   Format of the IANA Language Subtag Registry
67	     3.2   Maintenance of the Registry
68	     3.3   Stability of IANA Registry Entries
69	     3.4   Registration Procedure for Subtags
70	     3.5   Possibilities for Registration
71	     3.6   Extensions and Extensions Namespace
72	     3.7   Initialization of the Registry
73	   4.  Formation and Processing of Language Tags
74	     4.1   Choice of Language Tag
75	     4.2   Meaning of the Language Tag
76	     4.3   Canonicalization of Language Tags
77	     4.4   Considerations for Private Use Subtags
78	   5.  IANA Considerations
79	   6.  Security Considerations
80	   7.  Character Set Considerations
81	   8.  Changes from RFC 3066
82	   9.  References
83	     9.1   Normative References
84	     9.2   Informative References
85	       Authors' Addresses
86	   A.  Acknowledgements
87	   B.  Examples of Language Tags (Informative)
88	   C.  Example Registry
89	       Intellectual Property and Copyright Statements

91	1.  Introduction

93	   Human beings on our planet have, past and present, used a number of
94	   languages.  There are many reasons why one would want to identify the
95	   language used when presenting or requesting information.

97	   Information about a user's language preferences commonly needs to be
98	   identified so that appropriate processing can be applied.  For
99	   example, the user's language preferences in a browser can be used to
100	   select web pages appropriately.  A choice of language preference can
101	   also be used to select among tools (such as dictionaries) to assist
102	   in the processing or understanding of content in different languages.

104	   In addition, knowledge about the particular language used by some
105	   piece of information content might be useful or even required by some
106	   types of information processing; for example, spell-checking,
107	   computer-synthesized speech, Braille transcription, or high-quality
108	   print renderings.

110	   One means of indicating the language used is by labeling the
111	   information content with a language identifier.  These identifiers
112	   can also be used to specify user preferences when selecting
113	   information content, or for labeling additional attributes of content
114	   and associated resources.

116	   These identifiers can also be used to indicate additional attributes
117	   of content that are closely related to the language.  In particular,
118	   it is often necessary to indicate specific information about the
119	   dialect, writing system, or orthography used in a document or
120	   resource, as these attributes may be important for the user to obtain
121	   information in a form that they can understand, or important in
122	   selecting appropriate processing resources for the given content.

124	   This document specifies an identifier mechanism and a registration
125	   function for values to be used with that identifier mechanism.  It
126	   also defines a mechanism for private use values and future extension.

128	   This document replaces RFC 3066, which replaced RFC 1766.  For a list
129	   of changes in this document, see Section 8.

131	   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
132	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
133	   document are to be interpreted as described in RFC 2119 [11].

135	2.  The Language Tag

137	2.1  Syntax

139	   The language tag is composed of one or more parts: a primary language
140	   subtag and a (possibly empty) series of subsequent subtags.  Subtags
141	   are distinguished by their length, position in the subtag sequence,
142	   and content, so that each type of subtag can be recognized solely by
143	   these features.  This makes it possible to construct a parser that
144	   can extract and assign some semantic information to the subtags, even
145	   if specific subtag values are not recognized.  Thus a parser need not
146	   have an up-to-date copy of the registered subtag values to perform
147	   most searching and matching operations.

149	   The syntax of this tag in ABNF [7] is:

151	   Language-Tag = (lang
152	                   *3("-" extlang)
153	                   ["-" script]
154	                   ["-" region]
155	                   *("-" variant)
156	                   *("-" extension)
157	                   ["-" privateuse])
158	                   / privateuse         ; private-use tag
159	                   / grandfathered      ; grandfathered registrations

161	   lang            = 2*4ALPHA           ; shortest ISO 639 code
162	                   / registered-lang
163	   extlang         = 3ALPHA             ; reserved for future use
164	   script          = 4ALPHA             ; ISO 15924 code
165	   region          = 2ALPHA             ; ISO 3166 code
166	                   / 3DIGIT             ; UN country number
167	   variant         =  5*8alphanum       ; registered variants
168	                   / ( DIGIT 3alphanum )
169	   extension       = singleton 1*("-" (2*8alphanum))
170	   privateuse      = ("x"/"X") 1*("-" (1*8alphanum))
171	   singleton       = %x41-57 / %x59-5A / %x61-77 / %x79-7A / DIGIT
172	                   ; "A"-"W" / "Y"-"Z" / "a"-"w" / "y"-"z" / "0"-"9"
173	                   ; Single letters: x/X is reserved for private use
174	   registered-lang = 4*8ALPHA          ; registered language subtag
175	   grandfathered   = 1*3ALPHA 1*2("-" (2*8alphanum))
176	                                       ; grandfathered registration
177	                                       ; Note: i is the only singleton
178	                                       ; that starts a grandfathered tag
179	   alphanum        = (ALPHA / DIGIT)   ; letters and numbers

181	                        Figure 1: Language Tag ABNF

183	   The character "-" is HYPHEN-MINUS (ABNF: %x2D).  All subtags have a
184	   maximum length of eight characters.  Note that there is a subtlety in
185	   the ABNF for 'variant': variants starting with a digit MAY be four
186	   characters long, while those starting with a letter MUST be at least
187	   five characters long.

189	   Whitespace is not permitted in a language tag.  For examples of
190	   language tags, see Appendix B.
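
   The ABNF in Figure 1 can be approximated with a regular expression.
   The following Python sketch is illustrative only and is not part of
   this specification.  The names LANGTAG and parse_tag are
   hypothetical.  The sketch checks syntax only: it does not consult
   the IANA registry, and it deliberately rejects grandfathered tags
   such as "i-klingon", which would need a separate lookup table.

```python
import re
from typing import Optional

# Approximation of Figure 1 (grandfathered tags are NOT covered).
LANGTAG = re.compile(r"""
    (?:
        (?P<lang>[a-z]{2,8})                    # lang / registered-lang
        (?P<extlang>(?:-[a-z]{3}){0,3})         # up to three extlang
        (?:-(?P<script>[a-z]{4}))?              # script
        (?:-(?P<region>[a-z]{2}|[0-9]{3}))?     # region
        (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)
        (?P<extensions>(?:-[a-wy-z0-9](?:-[a-z0-9]{2,8})+)*)
        (?:-(?P<privateuse>x(?:-[a-z0-9]{1,8})+))?
      |
        (?P<privateonly>x(?:-[a-z0-9]{1,8})+)   # private-use tag
    )
""", re.VERBOSE | re.IGNORECASE)


def parse_tag(tag: str) -> Optional[dict]:
    """Split a tag into its parts, or return None if it is not well-formed."""
    m = LANGTAG.fullmatch(tag)
    if m is None:
        return None
    if m.group("privateonly"):
        return {"privateuse": m.group("privateonly")}
    return {
        "lang": m.group("lang"),
        "extlang": [s for s in m.group("extlang").split("-") if s],
        "script": m.group("script"),
        "region": m.group("region"),
        "variants": [s for s in m.group("variants").split("-") if s],
        "extensions": m.group("extensions").lstrip("-"),
        "privateuse": m.group("privateuse"),
    }
```

   Because each subtag type has a distinct length and position, the
   expression can assign "Cyrl" in "mn-Cyrl-MN" to the script slot and
   "1996" in "de-1996" to the variant slot without knowing either value
   in advance.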

192	   Note that although [7] refers to octets, the language tags described
193	   in this document are sequences of characters from the US-ASCII
194	   repertoire.  Language tags MAY be used in documents and applications
195	   that use other encodings, so long as these encompass the US-ASCII
196	   repertoire.  An example of this would be an XML document that uses
197	   the UTF-16LE [13] encoding of Unicode [21].

199	   The tags and their subtags, including private-use and extensions, are
200	   to be treated as case insensitive: there exist conventions for the
201	   capitalization of some of the subtags, but these MUST not be taken to
202	   carry meaning.

204	   For example:

206	   o  [ISO 639] [1] recommends that language codes be written in lower
207	      case ('mn' Mongolian).

209	   o  [ISO 3166] [4] recommends that country codes be capitalized ('MN'
210	      Mongolia).

212	   o  [ISO 15924] [3] recommends that script codes use lower case with
213	      the initial letter capitalized ('Cyrl' Cyrillic).

215	   However, in the tags defined by this document, the uppercase US-ASCII
216	   letters in the range 'A' through 'Z' are considered equivalent and
217	   mapped directly to their US-ASCII lowercase equivalents in the range
218	   'a' through 'z'.  Thus the tag "mn-Cyrl-MN" is not distinct from
219	   "MN-cYRL-mn" or "mN-cYrL-Mn" (or any other combination) and each of
220	   these variations conveys the same meaning: Mongolian written in the
221	   Cyrillic script as used in Mongolia.
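
   A comparison that respects this rule can be sketched in Python as
   follows.  The function names are hypothetical.  The second function
   applies the capitalization conventions listed above purely for
   display; the shape-based rule it uses (four letters for a script,
   two letters for a region, nothing recased after a single-character
   subtag) is an assumption of this sketch and carries no meaning.

```python
def same_tag(a: str, b: str) -> bool:
    """Compare two tags case-insensitively (tags are US-ASCII only)."""
    return a.lower() == b.lower()


def conventional_case(tag: str) -> str:
    """Return the tag with conventional capitalization, for display only."""
    out = []
    after_singleton = False
    for i, sub in enumerate(tag.split("-")):
        if len(sub) == 1:
            # Extension and private-use sequences are left in lower case.
            after_singleton = True
        if i > 0 and not after_singleton and len(sub) == 4 and sub.isalpha():
            out.append(sub.capitalize())   # script, e.g. 'Cyrl'
        elif i > 0 and not after_singleton and len(sub) == 2 and sub.isalpha():
            out.append(sub.upper())        # region, e.g. 'MN'
        else:
            out.append(sub.lower())        # language and everything else
    return "-".join(out)
```

   With these helpers, same_tag("mn-Cyrl-MN", "mN-cYrL-Mn") is true,
   and conventional_case("MN-cYRL-mn") yields "mn-Cyrl-MN".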

223	2.1.1  Length Considerations

225	   RFC 3066 [24] did not provide an upper limit on the size of language
226	   tags.  While RFC 3066 did define the semantics of particular subtags
227	   in such a way that most language tags consisted of language and
228	   region subtags with a combined total length of up to six characters,
229	   larger registered tags were not only possible but were actually
230	   registered.

232	   Neither this document nor the syntax in the ABNF imposes a fixed
233	   upper limit on the number of subtags in a language tag (and thus an
234	   upper bound on the size of a tag).  The syntax in this document
235	   suggests that, depending on the specific language, more subtags (and
236	   thus characters) are sometimes necessary to form a complete tag; thus
237	   it is possible to envision long or complex subtag sequences.

239	   Some applications and protocols are forced to allocate fixed buffer
240	   sizes or otherwise limit the length of a language tag in a particular
241	   application.  A conformant implementation or specification MAY refuse
242	   to support the storage of language tags which exceed a specified
243	   length.  Any such limitation SHOULD be clearly documented, and such
244	   documentation SHOULD include the disposition of any longer tags (for
245	   example, whether an error value is generated or the language tag is
246	   truncated).

248	   In practice, most tags do not require additional subtags or
249	   substantially more characters.  Additional subtags sometimes add
250	   useful distinguishing information, but extraneous subtags interfere
251	   with the meaning, understanding, and processing of language tags.
252	   Since language tags MAY be truncated by an application or protocol
253	   that limits tag sizes, when choosing language tags users and
254	   applications SHOULD avoid adding subtags that add no distinguishing
255	   value.  In particular, users and implementations SHOULD follow the
256	   'Prefix' and 'Suppress-Script' fields in the registry (defined in
257	   Section 3.1): these fields provide guidance on when specific
258	   additional subtags SHOULD (and SHOULD NOT) be used in a language tag.
259	   (For more information on selecting subtags, see Section 4.1.)

261	   Implementations MUST support a limit of at least 33 characters.  This
262	   limit includes at least one subtag of each non-extension, non-private
263	   use type.  When choosing a buffer limit, a length of at least 42
264	   characters is strongly RECOMMENDED.

266	   If truncation is permitted it MUST NOT permit a subtag to be divided
267	   or the formation of invalid tags (for example, one ending with the
268	   "-" character).  A protocol that allows tags to be truncated at an
269	   arbitrary limit, without giving any indication of what that limit is,
270	   has the potential for causing harm by changing the meaning of tags in
271	   substantial ways.

273	   Some specifications are space constrained but do not have a fixed
274	   length limitation.  For example, see [RFC 2231] [23].  This protocol
275	   has no explicit length limitation: the length of the language tag
276	   is limited by the length of other header components
277	   (such as the charset's name) coupled with the 76 character limit in
278	   [RFC 2047] [10].  Thus the "limit" might be 50 or more characters,
279	   but it could potentially be quite small.  In these cases,
280	   implementations SHOULD use the longest possible language tag.
281	   Warning the user of truncation, if necessary, is RECOMMENDED, as
282	   truncation can change the semantic meaning of the tag.

284	   The following illustration shows how the 42-character recommendation
285	   was derived.  The combination of language and extended language
286	   subtags was chosen for future compatibility.  At up to 15 characters,
287	   this combination is longer than the longest possible language subtag
288	   (8 characters):

290	   language      =  3 (ISO 639-2; ISO 639-1 requires 2)
291	   extlang1      =  4 (each subsequent subtag includes '-')
292	   extlang2      =  4 (unlikely: needs prefix="language-extlang1")
293	   extlang3      =  4 (extremely unlikely)
294	   script        =  5 (if not suppressed: see Section 4.1)
295	   region        =  4 (UN M.49; ISO 3166 requires 3)
296	   variant1      =  9 (MUST have language as a prefix)
297	   variant2      =  9 (MUST have language-variant1 as a prefix)

299	   total         = 42 characters

301	              Figure 2: Derivation of the Limit on Tag Length
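
   The arithmetic in Figure 2 can be checked mechanically.  The Python
   sketch below is illustrative only; the name FIGURE_2 is
   hypothetical.  Reading the 33-character minimum as "Figure 2 with
   only one variant" is an inference from the numbers and is not
   stated in this document.

```python
# Per-subtag lengths from Figure 2; every subtag after the first
# includes its leading '-'.
FIGURE_2 = {
    "language": 3,
    "extlang1": 4,
    "extlang2": 4,
    "extlang3": 4,
    "script": 5,
    "region": 4,
    "variant1": 9,
    "variant2": 9,
}

recommended = sum(FIGURE_2.values())

# Assumption: the MUST-level minimum drops the second variant.
minimum = recommended - FIGURE_2["variant2"]

# Language plus all three extended language subtags.
lang_and_extlang = sum(
    FIGURE_2[k] for k in ("language", "extlang1", "extlang2", "extlang3")
)

print(recommended, minimum, lang_and_extlang)
```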

303	   Applications or protocols which have to truncate a tag MUST do so by
304	   progressively removing subtags along with their preceding "-" from
305	   the right side of the language tag until the tag is short enough for
306	   the given buffer.  If the resulting tag ends with a single-character
307	   subtag, that subtag and its preceding "-" MUST also be removed.  For
308	   example:

310	   Tag to truncate: zh-Hant-CN-variant1-a-extend1-x-wadegile-private1
311	   1. zh-Hant-CN-variant1-a-extend1-x-wadegile
312	   2. zh-Hant-CN-variant1-a-extend1
313	   3. zh-Hant-CN-variant1
314	   4. zh-Hant-CN
315	   5. zh-Hant
316	   6. zh

318	                    Figure 3: Example of Tag Truncation
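
   The truncation rule above can be sketched in Python as follows.
   The function name truncate_tag is hypothetical.  Raising an error
   when not even the first subtag fits is an assumption of this
   sketch; this document leaves that disposition to the implementation
   to decide and document.

```python
def truncate_tag(tag: str, limit: int) -> str:
    """Shorten a tag to at most `limit` characters by dropping whole
    subtags from the right, never leaving a trailing singleton."""
    subtags = tag.split("-")
    while len("-".join(subtags)) > limit:
        # Remove the rightmost subtag (and, implicitly, its "-").
        subtags.pop()
        # A trailing single-character subtag must also be removed.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
        if not subtags:
            raise ValueError("no valid truncation fits within the limit")
    return "-".join(subtags)
```

   Applied to the tag in Figure 3 (49 characters), a limit of 40
   yields step 1, a limit of 39 skips directly to step 2 because the
   'x' singleton cannot be left at the end, and a limit of 2 yields
   "zh".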

320	2.2  Language Subtag Sources and Interpretation

322	   The namespace of language tags and their subtags is administered by
323	   the Internet Assigned Numbers Authority (IANA) [14] according to the
324	   rules in Section 5 of this document.  The registry maintained by IANA
325	   is the source for valid subtags: other standards referenced in this
326	   section provide the source material for that registry.

328	   Terminology in this section:

330	   o  Tag or tags refers to a complete language tag, such as
331	      "fr-Latn-CA".  Examples of tags in this document are enclosed in
332	      double-quotes ("en-US").

334	   o  Subtag refers to a specific section of a tag, delimited by hyphen,
335	      such as the subtag 'Latn' in "fr-Latn-CA".  Examples of subtags in
336	      this document are enclosed in single quotes ('Latn').

338	   o  Code or codes refers to values defined in external standards (and
339	      which are used as subtags in this document).  For example, 'Latn'
340	      is an [ISO 15924] [3] script code which was used to define the
341	      'Latn' script subtag for use in a language tag.  Examples of codes
342	      in this document are enclosed in single quotes ('en', 'Latn').

344	   The definitions in this section apply to the various subtags within
345	   the language tags defined by this document, excepting those
346	   "grandfathered" tags defined in Section 2.2.8.

348	   Language tags are designed so that each subtag type has unique length
349	   and content restrictions.  These make identification of the subtag's
350	   type possible, even if the content of the subtag itself is
351	   unrecognized.  This allows tags to be parsed and processed without
352	   reference to the latest version of the underlying standards or the
353	   IANA registry and makes the associated exception handling when
354	   parsing tags simpler.

356	   Subtags in the IANA registry that do not come from an underlying
357	   standard can only appear in specific positions in a tag.
358	   Specifically, they can only occur as primary language subtags or as
359	   variant subtags.

361	   Note that sequences of private-use and extension subtags MUST occur
362	   at the end of the sequence of subtags and MUST NOT be interspersed
363	   with subtags defined elsewhere in this document.

365	   Single letter and digit subtags are reserved for current or future
366	   use.  These include the following current uses:

368	   o  The single letter subtag 'x' is reserved to introduce a sequence
369	      of private-use subtags.  The interpretation of any private-use
370	      subtags is defined solely by private agreement and is not defined
371	      by the rules in this section or in any standard or registry
372	      defined in this document.

374	   o  All other single letter subtags are reserved to introduce
375	      standardized extension subtag sequences as described in
376	      Section 3.6.

378	   The single letter subtag 'i' is used by some grandfathered tags, such
379	   as "i-enochian", where it always appears in the first position and
380	   cannot be confused with an extension.

382	2.2.1  Primary Language Subtag

384	   The primary language subtag is the first subtag in a language tag
385	   (with the exception of private-use and certain grandfathered tags)
386	   and cannot be omitted.  The following rules apply to the primary
387	   language subtag:

389	   1.  All two character language subtags were defined in the IANA
390	       registry according to the assignments found in the standard ISO
391	       639 Part 1, "ISO 639-1:2002, Codes for the representation of
392	       names of languages -- Part 1: Alpha-2 code" [ISO 639-1] [1], or
393	       using assignments subsequently made by the ISO 639 Part 1
394	       maintenance agency or governing standardization bodies.

396	   2.  All three character language subtags were defined in the IANA
397	       registry according to the assignments found in ISO 639 Part 2,
398	       "ISO 639-2:1998 - Codes for the representation of names of
399	       languages -- Part 2: Alpha-3 code - edition 1" [ISO 639-2] [2],
400	       or assignments subsequently made by the ISO 639 Part 2
401	       maintenance agency or governing standardization bodies.

403	   3.  The subtags in the range 'qaa' through 'qtz' are reserved for
404	       private use in language tags.  These subtags correspond to codes
405	       reserved by ISO 639-2 for private use.  These codes MAY be used
406	       for non-registered primary-language subtags (instead of using
407	       private-use subtags following 'x-').  Please refer to Section 4.4
408	       for more information on private use subtags.

410	   4.  All four character language subtags are reserved for possible
411	       future standardization.

413	   5.  All language subtags of 5 to 8 characters in length in the IANA
414	       registry were defined via the registration process in Section 3.4
415	       and MAY be used to form the primary language subtag.  At the time
416	       this document was created, there were no examples of this kind of
417	       subtag and future registrations of this type will be discouraged:
418	       primary languages are strongly RECOMMENDED for registration with
419	       ISO 639 and proposals rejected by ISO 639/RA will be closely
420	       scrutinized before they are registered with IANA.

422	   6.  The single character subtag 'x' as the primary subtag indicates
423	       that the language tag consists solely of subtags whose meaning is
424	       defined by private agreement.  For example, in the tag "x-fr-CH",
425	       the subtags 'fr' and 'CH' SHOULD NOT be taken to represent the
426	       French language or the country of Switzerland (or any other value
427	       in the IANA registry) unless there is a private agreement in
428	       place to do so.  See Section 4.4.

430	   7.  The single character subtag 'i' is used by some grandfathered
431	       tags (see Section 2.2.8) such as "i-klingon" and "i-bnn".  (Other
432	       grandfathered tags have a primary language subtag in their first
433	       position.)

435	   8.  Other values MUST NOT be assigned to the primary subtag except by
436	       revision or update of this document.
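
   The rules above distinguish primary subtags by length and content
   alone, which can be sketched in Python as follows.  The function
   name and the returned labels are hypothetical, and the sketch does
   not check whether a given value is actually present in the IANA
   registry.

```python
def classify_primary(subtag: str) -> str:
    """Classify a primary subtag by its shape, following rules 1 to 8."""
    s = subtag.lower()
    if s == "x":
        return "private-use"            # rule 6
    if s == "i":
        return "grandfathered"          # rule 7
    if not (s.isascii() and s.isalpha()):
        return "invalid"                # rule 8
    n = len(s)
    if n == 2:
        return "iso-639-1"              # rule 1
    if n == 3:
        if "qaa" <= s <= "qtz":
            return "iso-639-2-private"  # rule 3
        return "iso-639-2"              # rule 2
    if n == 4:
        return "reserved"               # rule 4
    if 5 <= n <= 8:
        return "registered"             # rule 5
    return "invalid"                    # rule 8
```

   For example, 'mn' is classified as an ISO 639-1 code, 'haw' as an
   ISO 639-2 code, 'qaa' as a private-use code, and 'enochian' as a
   value that would have to come from registration.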

438	   Note: For languages that have both an ISO 639-1 two character code
439	   and an ISO 639-2 three character code, only the ISO 639-1 two
440	   character code is defined in the IANA registry.

442	   Note: For languages that have no ISO 639-1 two character code and for
443	   which the ISO 639-2/T (Terminology) code and the ISO 639-2/B
444	   (Bibliographic) codes differ, only the Terminology code is defined in
445	   the IANA registry.  At the time this document was created, all
446	   languages that had both kinds of three character code were also
447	   assigned a two character code; it is not expected that future
448	   assignments of this nature will occur.

450	   Note: To avoid problems with versioning and subtag choice as
451	   experienced during the transition between RFC 1766 and RFC 3066, and
452	   to preserve the canonical nature of subtags defined by this document,
453	   the ISO 639 Registration Authority Joint Advisory Committee
454	   (ISO 639/RA-JAC) has included the following statement in [17]:

456	   "A language code already in ISO 639-2 at the point of freezing ISO
457	   639-1 shall not later be added to ISO 639-1.  This is to ensure
458	   consistency in usage over time, since users are directed in Internet
459	   applications to employ the alpha-3 code when an alpha-2 code for that
460	   language is not available."

462	   In order to avoid instability of the canonical form of tags, if a two
463	   character code is added to ISO 639-1 for a language for which a three
464	   character code was already included in ISO 639-2, the two character
465	   code will not be added as a subtag in the registry.  See Section 3.3.

467	   For example, if some content were tagged with 'haw' (Hawaiian), which
468	   currently has no two character code, the tag would not be invalidated
469	   if ISO 639-1 were to assign a two character code to the Hawaiian
470	   language at a later date.

472	   For example, one of the grandfathered IANA registrations is
473	   "i-enochian".  The subtag 'enochian' could be registered in the IANA
474	   registry as a primary language subtag (assuming that ISO 639 does not
475	   register this language first), making tags such as "enochian-AQ" and
476	   "enochian-Latn" valid.

478	2.2.2  Extended Language Subtags

480	   The following rules apply to the extended language subtags:

482	   1.  Three letter subtags immediately following the primary subtag are
483	       reserved for future standardization, anticipating work that is
484	       currently under way on ISO 639.

486	   2.  Extended language subtags MUST follow the primary subtag and
487	       precede any other subtags.

489	   3.  There MAY be up to three extended language subtags.

491	   4.  Extended language subtags will not be registered except by
492	       revision of this document.

494	   5.  Extended language subtags MUST NOT be used to form language tags
495	       except by revision of this document.

497	   Extended language subtag records, once they appear in the registry,
498	   MUST include exactly one 'Prefix' field indicating an appropriate
499	   language subtag or sequence of subtags that MUST always appear as a
500	   prefix to the extended language subtag.

502	   Example: In a future revision or update of this document, the tag
503	   "zh-gan" (registered under RFC 3066) might become a valid non-
504	   grandfathered (that is, redundant) tag in which the subtag 'gan'
505	   might represent the Chinese dialect 'Gan'.

507	2.2.3  Script Subtag

509	   The following rules apply to the script subtags:

511	   1.  All four character subtags were defined according to ISO 15924
512	       [3]--"Codes for the representation of the names of scripts":
513	       alpha-4 script codes, or subsequently assigned by the ISO 15924
514	       maintenance agency or governing standardization bodies, denoting
515	       the script or writing system used in conjunction with this
516	       language.

518	   2.  Script subtags MUST immediately follow the primary language
519	       subtag and all extended language subtags and MUST occur before
520	       any other type of subtag described below.

522	   3.  The script subtags 'Qaaa' through 'Qabx' are reserved for private
523	       use in language tags.  These subtags correspond to codes reserved
524	       by ISO 15924 for private use.  These codes MAY be used for non-
525	       registered script values.  Please refer to Section 4.4 for more
526	       information on private-use subtags.

528	   4.  Script subtags cannot be registered using the process in
529	       Section 3.4 of this document.  Variant subtags MAY be considered
530	       for registration for that purpose.

532	   5.  There MUST be at most one script subtag in a language tag and the
533	       script subtag SHOULD be omitted when it adds no distinguishing
534	       value to the tag or when the primary language subtag's record
535	       includes a Suppress-Script field listing the applicable script
536	       subtag.

538	   Example: "sr-Latn" represents Serbian written using the Latin script.
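
The private-use range in rule 3 can be checked mechanically. The following is a minimal sketch, not part of the specification; the function name is hypothetical, and it assumes that comparison should ignore the case of the input.

```python
def is_private_use_script(subtag: str) -> bool:
    """Return True if `subtag` lies in the reserved range 'Qaaa'..'Qabx'."""
    if len(subtag) != 4 or not (subtag.isascii() and subtag.isalpha()):
        return False
    # Script subtags are conventionally titlecase; normalize before comparing
    # (assumption: input case is not significant).
    normalized = subtag.title()
    # Equal-length ASCII strings compare in alphabetical order.
    return "Qaaa" <= normalized <= "Qabx"
```

A registered script such as 'Latn' falls outside the range, so the function returns False for it.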

540	2.2.4  Region Subtag

542	   The following rules apply to the region subtags:

544	   1.  The region subtag defines language variations used in a specific
545	       region, geographic, or political area.  Region subtags MUST
546	       follow any language, extended language, or script subtags and
547	       MUST precede all other subtags.

549	   2.  All two character subtags following the primary subtag were
550	       defined in the IANA registry according to the assignments found
551	       in ISO 3166 [4]--"Codes for the representation of names of
552	       countries and their subdivisions - Part 1: Country
553	       codes"--alpha-2 country codes or assignments subsequently made by
554	       the ISO 3166 maintenance agency or governing standardization
555	       bodies.

557	   3.  All three character subtags consisting of digit (numeric)
558	       characters following the primary subtag were defined in the IANA
559	       registry according to the assignments found in UN Standard
560	       Country or Area Codes for Statistical Use [5] (UN M.49) or
561	       assignments subsequently made by the governing standards body.
562	       Note that not all of the UN M.49 codes are defined in the IANA
563	       registry.  The following rules define which codes are entered
564	       into the registry as valid subtags:

566	       A.  UN numeric codes assigned to 'macro-geographical
567	           (continental)' or sub-regions MUST be registered in the
568	           registry.  These codes are not associated with an assigned
569	           ISO 3166 alpha-2 code and represent supra-national areas,
570	           usually covering more than one nation, state, province, or
571	           territory.

573	       B.  UN numeric codes for 'economic groupings' or 'other
574	           groupings' MUST NOT be registered in the IANA registry and
575	           MUST NOT be used to form language tags.

577	       C.  UN numeric codes for countries or areas with ambiguous ISO
578	           3166 alpha-2 codes, when entered into the registry, MUST be
579	           defined according to the rules in Section 3.3 and MUST be
580	           used to form language tags that represent the country or
581	           region for which they are defined.

583	       D.  UN numeric codes for countries or areas for which there is an
584	           associated ISO 3166 alpha-2 code in the registry MUST NOT be
585	           entered into the registry and MUST NOT be used to form
586	           language tags.  Note that the ISO 3166-based subtag in the
587	           registry MUST actually be associated with the UN M.49 code in
588	           question.

590	       E.  All other UN numeric codes for countries or areas which do
591	           not have an associated ISO 3166 alpha-2 code MUST NOT be
592	           entered into the registry and MUST NOT be used to form
593	           language tags.  For more information about these codes, see
594	           Section 3.3.

596	   4.  Note: The alphanumeric codes in Appendix X of the UN document
597	       MUST NOT be entered into the registry and MUST NOT be used to
598	       form language tags.  (At the time this document was created these
599	       values match the ISO 3166 alpha-2 codes.)

601	   5.  There MUST be at most one region subtag in a language tag and the
602	       region subtag MAY be omitted, as when it adds no distinguishing
603	       value to the tag.

605	   6.  The region subtags 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ' are
606	       reserved for private use in language tags.  These subtags
607	       correspond to codes reserved by ISO 3166 for private use.  These
608	       codes MAY be used for private use region subtags (instead of
609	       using a private-use subtag sequence).  Please refer to
610	       Section 4.4 for more information on private use subtags.

612	   "de-CH" represents German ('de') as used in Switzerland ('CH').

614	   "sr-Latn-CS" represents Serbian ('sr') written using Latin script
615	   ('Latn') as used in Serbia and Montenegro ('CS').

617	   "es-419" represents Spanish ('es') as used in the UN-defined Latin
618	   America and Caribbean region ('419').
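
As an illustration of rules 2, 3, and 6, a region-shaped subtag can be classified by its form alone. This is a sketch under stated assumptions: the function name and the returned labels are hypothetical, and no registry lookup is performed, so a result of "iso-3166" or "un-m49" only describes the shape of the subtag, not its validity.

```python
import re

_ALPHA2 = re.compile(r"[A-Za-z]{2}")
_DIGIT3 = re.compile(r"[0-9]{3}")


def region_kind(subtag: str) -> str:
    """Classify a region-shaped subtag by form only (no registry lookup)."""
    if _DIGIT3.fullmatch(subtag):
        return "un-m49"
    if _ALPHA2.fullmatch(subtag):
        code = subtag.upper()
        # Private-use codes: 'AA', 'QM'-'QZ', 'XA'-'XZ', and 'ZZ'.
        if code in ("AA", "ZZ") or "QM" <= code <= "QZ" or "XA" <= code <= "XZ":
            return "private-use"
        return "iso-3166"
    return "not-a-region"
```

Applied to the examples above, 'CH' and 'CS' are reported as ISO 3166 shaped and '419' as UN M.49 shaped.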

620	2.2.5  Variant Subtags

622	   The following rules apply to the variant subtags:

624	   1.  Variant subtags are not associated with any external standard.
625	       Variant subtags and their meanings are defined by the
626	       registration process defined in Section 3.4.

628	   2.  Variant subtags MUST follow all of the other defined subtags, but
629	       precede any extension or private-use subtag sequences.

631	   3.  More than one variant MAY be used to form the language tag.

633	   4.  Variant subtags MUST be registered with IANA according to the
634	       rules in Section 3.4 of this document before being used to form
635	       language tags.  In order to distinguish variants from other types
636	       of subtags, registrations MUST meet the following length and
637	       content restrictions:

639	       1.  Variant subtags that begin with a letter (a-z, A-Z) MUST be
640	           at least five characters long.

642	       2.  Variant subtags that begin with a digit (0-9) MUST be at
643	           least four characters long.

645	   Variant subtag records in the language subtag registry MAY include
646	   one or more 'Prefix' fields, which indicate the language tag or tags
647	   that would make a suitable prefix (with other subtags, as
648	   appropriate) in forming a language tag with the variant.  For
649	   example, the subtag 'scouse' has a Prefix of "en", making it suitable
650	   to form language tags such as "en-scouse" and "en-GB-scouse", but not
651	   suitable for use in a tag such as "zh-scouse" or "it-GB-scouse".

653	   "en-scouse" represents the Scouse dialect of English.

655	   "de-CH-1996" represents German as used in Switzerland and as written
656	   using the spelling reform beginning in the year 1996 C.E.

658	   Most variants that share a prefix are mutually exclusive.  For
659	   example, the German orthographic variations '1996' and '1901' SHOULD
660	   NOT be used in the same tag, as they represent the dates of different
661	   spelling reforms.  A variant that can meaningfully be used in
662	   combination with another variant SHOULD include a 'Prefix' field in
663	   its registry record that lists that other variant.  For example, if
664	   another German variant 'example' were created that made sense to use
665	   with '1996', then 'example' should include two Prefix fields: "de"
666	   and "de-1996".
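
The length restrictions in rule 4 can be expressed as a small shape check. This is a minimal sketch with a hypothetical function name. The upper bound of eight characters is an assumption taken from the general subtag length limit, which is stated outside this section; the check says nothing about whether a variant is actually registered.

```python
def is_variant_shaped(subtag: str) -> bool:
    """Check the length and content restrictions placed on variant subtags."""
    if not (subtag.isascii() and subtag.isalnum()):
        return False
    if len(subtag) > 8:  # assumed general subtag length limit
        return False
    # Variants that begin with a digit need at least four characters;
    # variants that begin with a letter need at least five.
    minimum = 4 if subtag[0].isdigit() else 5
    return len(subtag) >= minimum
```

Under these rules 'scouse' and '1996' are variant shaped, while a four-letter subtag such as 'Latn' is not, which is what keeps variants distinguishable from script subtags.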

668	2.2.6  Extension Subtags

670	   The following rules apply to extensions:

672	   1.   Extension subtags are separated from the other subtags defined
673	        in this document by a single-letter subtag ("singleton").  The
674	        singleton MUST be one allocated to a registration authority via
675	        the mechanism described in Section 3.6 and cannot be the letter
676	        'x', which is reserved for private-use subtag sequences.

678	   2.   Note: Private-use subtag sequences starting with the singleton
679	        subtag 'x' are described below.

681	   3.   An extension MUST follow at least a primary language subtag.
682	        That is, a language tag cannot begin with an extension.
683	        Extensions extend language tags; they do not override or replace
684	        them.  For example, "a-value" is not a well-formed language tag,
685	        while "de-a-value" is.

687	   4.   Each singleton subtag MUST appear at most one time in each tag
688	        (other than as a private-use subtag).  That is, singleton
689	        subtags MUST NOT be repeated.  For example, the tag "en-a-bbb-a-
690	        ccc" is invalid because the subtag 'a' appears twice.  Note that
691	        the tag "en-a-bbb-x-a-ccc" is valid because the second
692	        appearance of the singleton 'a' is in a private use sequence.

694	   5.   Extension subtags MUST meet all of the requirements for the
695	        content and format of subtags defined in this document.

697	   6.   Extension subtags MUST meet whatever requirements are set by the
698	        document that defines their singleton prefix and whatever
699	        requirements are provided by the maintaining authority.

701	   7.   Each extension subtag MUST be from two to eight characters long
702	        and consist solely of letters or digits, with each subtag
703	        separated by a single '-'.

705	   8.   Each singleton MUST be followed by at least one extension
706	        subtag.  For example, the tag "tlh-a-b-foo" is invalid because
707	        the first singleton 'a' is followed immediately by another
708	        singleton 'b'.

710	   9.   Extension subtags MUST follow all language, extended language,
711	        script, region and variant subtags in a tag.

713	   10.  All subtags following the singleton and before another singleton
714	        are part of the extension.  Example: In the tag "fr-a-Latn", the
715	        subtag 'Latn' does not represent the script subtag 'Latn'
716	        defined in the IANA Language Subtag Registry.  Its meaning is
717	        defined by the extension 'a'.

719	   11.  In the event that more than one extension appears in a single
720	        tag, the tag SHOULD be canonicalized as described in
721	        Section 4.3.

723	   For example, if the prefix singleton 'r' and the shown subtags were
724	   defined, then the following tag would be a valid example: "en-Latn-
725	   GB-boont-r-extended-sequence-x-private"
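
Rules 3, 4, 7, and 8 lend themselves to a mechanical check. The sketch below is illustrative only: the function name and the wording of the messages are hypothetical, and it does not verify that a singleton has actually been allocated via Section 3.6.

```python
from typing import List


def check_extensions(tag: str) -> List[str]:
    """Return the problems found with extension singletons (empty if none)."""
    problems = []
    subtags = tag.lower().split("-")
    # Everything from the first 'x' onward is private use and is not checked.
    if "x" in subtags:
        subtags = subtags[: subtags.index("x")]
    if subtags and len(subtags[0]) == 1:
        problems.append("tag begins with a singleton")
    seen = set()
    for i, sub in enumerate(subtags):
        if len(sub) != 1:
            continue
        if sub in seen:
            problems.append(f"singleton '{sub}' is repeated")
        seen.add(sub)
        following = subtags[i + 1] if i + 1 < len(subtags) else ""
        # An extension subtag is two to eight letters or digits.
        if not (2 <= len(following) <= 8
                and following.isascii() and following.isalnum()):
            problems.append(
                f"singleton '{sub}' is not followed by an extension subtag")
    return problems
```

For the tags discussed above, "de-a-value" and "en-a-bbb-x-a-ccc" yield no problems, while "a-value", "en-a-bbb-a-ccc", and "tlh-a-b-foo" each yield one.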

727	2.2.7  Private Use Subtags

729	   The following rules apply to private-use subtags:

731	   1.  Private-use subtags are separated from the other subtags defined
732	       in this document by the reserved single-character subtag 'x'.

734	   2.  Private-use subtags MUST follow all language, extended language,
735	       script, region, variant, and extension subtags in the tag.
736	       Another way of saying this is that all subtags following the
737	       singleton 'x' MUST be considered private use.  Example: The
738	       subtag 'US' in the tag "en-x-US" is a private use subtag.

740	   3.  A tag MAY consist entirely of private-use subtags.

742	   4.  No source is defined for private use subtags.  Use of private use
743	       subtags is by private agreement only.

745	   For example: Users who wished to utilize SIL Ethnologue for
746	   identification might agree to exchange tags such as "az-Arab-x-AZE-
747	   derbend".  This example contains two private-use subtags.  The first
748	   is 'AZE' and the second is 'derbend'.
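
Because every subtag after the singleton 'x' is private use, a processor can separate the two parts of a tag before interpreting anything else. A minimal sketch, with a hypothetical function name:

```python
from typing import List, Tuple


def split_private_use(tag: str) -> Tuple[List[str], List[str]]:
    """Split a tag into (ordinary subtags, private-use subtags)."""
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" not in lowered:
        return subtags, []
    boundary = lowered.index("x")
    # The singleton 'x' itself is a separator and belongs to neither part.
    return subtags[:boundary], subtags[boundary + 1:]
```

For "az-Arab-x-AZE-derbend" this returns the ordinary subtags 'az' and 'Arab' and the private-use subtags 'AZE' and 'derbend'. For "en-x-US" the subtag 'US' lands in the private-use part and is therefore not treated as a region.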

750	2.2.8  Pre-Existing RFC 3066 Registrations

752	   Existing IANA-registered language tags from RFC 1766 and/or RFC 3066
753	   maintain their validity.  IANA will maintain these tags in the
754	   registry under either the "grandfathered" or "redundant" type.  For
755	   more information see Section 3.7.

757	   It is important to note that all language tags formed under the
758	   guidelines in this document were either legal, well-formed tags or
759	   could have been registered under RFC 3066.

761	2.2.9  Classes of Conformance

763	   Implementations sometimes need to describe their capabilities with
764	   regard to the rules and practices described in this document.  There
765	   are two classes of conforming implementations described by this
766	   document: "well-formed" processors and "validating" processors.

768	   Claims of conformance SHOULD explicitly reference one of these
769	   definitions.

771	   An implementation that claims to check for well-formed language tags
772	   MUST:

774	   o  Check that the tag and all of its subtags, including extension and
775	      private-use subtags, conform to the ABNF or that the tag is on the
776	      list of grandfathered tags.

778	   o  Check that singleton subtags that identify extensions do not
779	      repeat.  For example, the tag "en-a-xx-b-yy-a-zz" is not well-
780	      formed.

782	   Well-formed processors are strongly encouraged to implement the
783	   canonicalization rules contained in Section 4.3.

785	   An implementation that claims to be validating MUST:

787	   o  Check that the tag is well-formed.

789	   o  Specify the particular registry date for which the implementation
790	      performs validation of subtags.

792	   o  Check that either the tag is a grandfathered tag, or that all
793	      language, script, region, and variant subtags consist of valid
794	      codes for use in language tags according to the IANA registry as
795	      of the particular date specified by the implementation.

797	   o  Specify which, if any, extension RFCs as defined in Section 3.6
798	      are supported, including version, revision, and date.

800	   o  For any such extensions supported, check that all subtags used in
801	      that extension are valid.

803	   o  For variant and extended language subtags, if the registry
804	      contains one or more 'Prefix' fields for that subtag, check that
805	      the tag matches at least one prefix.  The tag matches if all the
806	      subtags in the 'Prefix' also appear in the tag.  For example, the
807	      prefix "es-CO" matches the tag "es-Latn-CO-x-private" because both
808	      the 'es' language subtag and 'CO' region subtag appear in the tag.
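
The prefix-matching rule in the last item can be sketched as follows. The function names are hypothetical. The code follows the rule as literally stated (every subtag of the 'Prefix' also appears in the tag) and compares subtags without regard to case; it does not consider subtag position or type, which a real implementation might want to do.

```python
from typing import Iterable


def matches_prefix(tag: str, prefix: str) -> bool:
    """True if every subtag of `prefix` also appears in `tag`."""
    tag_subtags = {s.lower() for s in tag.split("-")}
    return all(p.lower() in tag_subtags for p in prefix.split("-"))


def matches_any_prefix(tag: str, prefixes: Iterable[str]) -> bool:
    """True if `tag` matches at least one of the registered prefixes."""
    return any(matches_prefix(tag, prefix) for prefix in prefixes)
```

With these definitions the prefix "es-CO" matches "es-Latn-CO-x-private", and the prefix "en" matches "en-GB-scouse" but not "zh-scouse".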

810	3.  Registry Format and Maintenance

812	   This section defines the Language Subtag Registry and the maintenance
813	   and update procedures associated with it.

815	   The language subtag registry will be maintained so that, except for
816	   extension subtags, it is possible to validate all of the subtags that
817	   appear in a language tag under the provisions of this document or its
818	   revisions or successors.  In addition, the meaning of the various
819	   subtags will be unambiguous and stable over time.  (The meaning of
820	   private-use subtags, of course, is not defined by the IANA registry.)

822	   The registry defined under this document contains a comprehensive
823	   list of all of the subtags valid in language tags.  This allows
824	   implementers a straightforward and reliable way to validate language
825	   tags.

827	3.1  Format of the IANA Language Subtag Registry

829	   The IANA Language Subtag Registry ("the registry") will consist of a
830	   text file that is machine readable in the format described in this
831	   section, plus copies of the registration forms approved by the
832	   Language Subtag Reviewer in accordance with the process described in
833	   Section 3.4.  With the exception of the registration forms for
834	   grandfathered and redundant tags, no registration records will be
835	   maintained for the initial set of subtags.

837	   The registry will be in a modified record-jar format text file [18].
838	   Lines are limited to 72 characters, including all whitespace.

840	   Records are separated by lines containing only the sequence "%%"
841	   (%x25.25).

843	   Each field can be viewed as a single, logical line of ASCII
844	   characters, comprising a field-name and a field-body separated by a
845	   COLON character (%x3A).  For convenience, the field-body portion of
846	   this conceptual entity can be split into a multiple-line
847	   representation; this is called "folding".  The format of the registry
848	   is described by the following ABNF (per [7]):

850	   registry   = record *("%%" CRLF record)
851	   record     = 1*( field-name *SP ":" *SP field-body CRLF )
852	   field-name = *(ALPHA / DIGIT / "-")
853	   field-body = *(ASCCHAR/LWSP)
854	   ASCCHAR    = %x21-25 / %x27-7E / UNICHAR ; Note: AMPERSAND is %x26
855	   UNICHAR    = "&#x" 2*6HEXDIG ";"
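
To make the record structure concrete, here is a minimal parsing sketch. It is not a definitive implementation: the function name is hypothetical, fields are collected as lists because some fields may repeat, and it assumes that a folded continuation line begins with whitespace and is rejoined to the previous line with a single space.

```python
from typing import Dict, List


def parse_registry(text: str) -> List[Dict[str, List[str]]]:
    """Parse record-jar text into a list of {field-name: [field-body, ...]}."""
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    last_name = None
    for line in text.splitlines():
        if line.strip() == "%%":
            # A "%%" line separates records.
            if current:
                records.append(current)
            current, last_name = {}, None
        elif line[:1] in (" ", "\t") and last_name is not None:
            # Folded continuation of the previous field-body (assumed rule).
            current[last_name][-1] += " " + line.strip()
        elif ":" in line:
            name, _, body = line.partition(":")
            last_name = name.strip()
            current.setdefault(last_name, []).append(body.strip())
    if current:
        records.append(current)
    return records


sample = "\n".join([
    "File-Date: 2005-01-02",
    "%%",
    "Type: variant",
    "Subtag: nedis",
    "Description: Natisone dialect",
    "Description: Nadiza dialect",
    "Comments: This is a comment shown",
    "  as an example.",
    "%%",
])
records = parse_registry(sample)
```

Here 'records' holds two entries: the File-Date record, and the 'nedis' record with two 'Description' values and one unfolded 'Comments' value.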

857	   The sequence '..' (%x2E.2E) in a field-body denotes a range of
858	   values.  Such a range represents all subtags of the same length that
859	   are alphabetically within that range, including the values explicitly
860	   mentioned.  For example 'a..c' denotes the values 'a', 'b', and 'c'.
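
One possible way to expand such a range is to treat the endpoints as base-26 numbers. The sketch below makes several assumptions: the names are hypothetical, only ranges of ASCII letters are handled, and the letter case of the first endpoint is reused for every generated value.

```python
from typing import List


def _letters_to_int(value: str) -> int:
    """Read an ASCII-letter string as a base-26 number, ignoring case."""
    number = 0
    for ch in value.lower():
        number = number * 26 + (ord(ch) - ord("a"))
    return number


def _int_to_letters(number: int, template: str) -> str:
    """Inverse of _letters_to_int, copying the letter case of `template`."""
    chars = []
    for pattern_ch in reversed(template):
        number, digit = divmod(number, 26)
        ch = chr(ord("a") + digit)
        chars.append(ch.upper() if pattern_ch.isupper() else ch)
    return "".join(reversed(chars))


def expand_range(field_body: str) -> List[str]:
    """Expand 'start..end' into every value of the same length in between."""
    start, sep, end = field_body.partition("..")
    if not sep:
        return [field_body]  # not a range
    for part in (start, end):
        if not (part.isascii() and part.isalpha()):
            raise ValueError("only ASCII-letter ranges are handled here")
    if len(start) != len(end):
        raise ValueError("range endpoints must have the same length")
    low, high = _letters_to_int(start), _letters_to_int(end)
    return [_int_to_letters(n, start) for n in range(low, high + 1)]
```

Calling expand_range("a..c") gives the three values named in the text, and expand_range("Qaaa..Qabx") gives the fifty private-use script subtags from 'Qaaa' to 'Qabx'.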

862	   Characters from outside the US-ASCII repertoire, as well as the
863	   AMPERSAND character ("&", %x26) when it occurs in a field-body are
864	   represented by a "Numeric Character Reference" using hexadecimal
865	   notation in the style used by XML 1.0 [19] (see
866	   <http://www.w3.org/TR/REC-xml/#dt-charref>).  This consists of the
867	   sequence "&#x" (%x26.23.78) followed by a hexadecimal representation
868	   of the character's code point in ISO/IEC 10646 [6] followed by a
869	   closing semicolon (%x3B).  For example, the EURO SIGN, U+20AC, would
870	   be represented by the sequence "&#x20AC;".  Note that the hexadecimal
871	   notation MAY have between two and six digits.
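
The escaping convention can be illustrated with a pair of helper functions. This is a sketch with hypothetical names; it assumes that every character above %x7E, as well as the ampersand, is escaped on output, and it does not validate that a decoded value is a legal code point.

```python
import re

# "&#x", two to six hexadecimal digits, then ";" (the UNICHAR production).
_NCR = re.compile(r"&#x([0-9A-Fa-f]{2,6});")


def decode_field_body(text: str) -> str:
    """Replace each numeric character reference with its character."""
    return _NCR.sub(lambda m: chr(int(m.group(1), 16)), text)


def encode_field_body(text: str) -> str:
    """Escape '&' and every character outside the printable ASCII range."""
    out = []
    for ch in text:
        if ch == "&" or ord(ch) > 0x7E:
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(ch)
    return "".join(out)
```

Encoding the EURO SIGN yields "&#x20AC;", and encoding then decoding a string returns the original text.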

873	   All fields whose field-body contains a date value use the "full-date"
874	   format specified in RFC 3339 [15].  For example: "2004-06-28"
875	   represents June 28, 2004 in the Gregorian calendar.

877	   The first record in the file contains the single field whose field-
878	   name is "File-Date".  The field-body of this record contains the last
879	   modification date of this copy of the registry, making it possible to
880	   compare different versions of the registry.  The registry on the IANA
881	   website is the most current.  Versions with an older date than that
882	   one are not up-to-date.

884	   File-Date: 2004-06-28
885	   %%
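
Because 'File-Date' uses the RFC 3339 "full-date" format, two copies of the registry can be compared by parsing the field-body as a calendar date. A minimal sketch, with a hypothetical function name:

```python
from datetime import date


def is_newer(file_date_a: str, file_date_b: str) -> bool:
    """True if the copy dated `file_date_a` is more recent than `file_date_b`."""
    # date.fromisoformat accepts the YYYY-MM-DD "full-date" form.
    return date.fromisoformat(file_date_a) > date.fromisoformat(file_date_b)
```

For example, a copy dated "2005-01-02" is newer than one dated "2004-06-28".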

887	   Subsequent records represent subtags in the registry.  Each of the
888	   fields in each record MUST occur no more than once, unless otherwise
889	   noted below.  Each record MUST contain the following fields:

891	   o  'Type'

893	      *  Type's field-value MUST consist of one of the following
894	         strings: "language", "extlang", "script", "region", "variant",
895	         "grandfathered", or "redundant"; it denotes the type of tag or
896	         subtag.

898	   o  Either 'Subtag' or 'Tag'

900	      *  Subtag's field-value contains the subtag being defined.  This
901	         field MUST only appear in records whose Type has one of
902	         these values: "language", "extlang", "script", "region", or
903	         "variant".

905	      *  Tag's field-value contains a complete language tag.  This field
906	         MUST only appear in records whose Type has one of these values:
907	         "grandfathered" or "redundant".

909	   o  Description

911	      *  Description's field-value contains a non-normative description
912	         of the subtag or tag.

914	   o  Added

916	      *  Added's field-value contains the date the record was added to
917	         the registry.

919	   The 'Subtag' or 'Tag' field MUST use lowercase letters to form the
920	   subtag or tag, with two exceptions.  Subtags whose 'Type' field is
921	   'script' (in other words, subtags defined by ISO 15924) MUST use
922	   titlecase.  Subtags whose 'Type' field is 'region' (in other words,
923	   subtags defined by ISO 3166) MUST use uppercase.  These exceptions
924	   mirror the use of case in the underlying standards.
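
The case conventions for the 'Subtag' field can be sketched as a single helper. The function name is hypothetical, and the code covers only 'Subtag' values, not whole tags in a 'Tag' field.

```python
def registry_case(subtag: str, record_type: str) -> str:
    """Return `subtag` in the case used by the registry for `record_type`."""
    if record_type == "script":
        return subtag.title()  # titlecase, as in ISO 15924
    if record_type == "region":
        return subtag.upper()  # uppercase, as in ISO 3166
    return subtag.lower()      # all other subtag types are lowercase
```

For example, registry_case("LATN", "script") gives "Latn", and a numeric region such as "419" is returned unchanged.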

926	   The field 'Description' MAY appear more than one time.  At least one
927	   of the 'Description' fields MUST contain a description of the tag
928	   being registered written or transcribed into the Latin script; the
929	   same or additional fields MAY also include a description in a non-
930	   Latin script.  The 'Description' field is used for identification
931	   purposes and SHOULD NOT be taken to represent the actual native name
932	   of the language or variation or to be in any particular language.
933	   Most descriptions are taken directly from source standards such as
934	   ISO 639 or ISO 3166.

936	   Note: Descriptions in registry entries that correspond to ISO 639,
937	   ISO 15924, ISO 3166 or UN M.49 codes are intended only to indicate
938	   the meaning of that identifier as defined in the source standard at
939	   the time it was added to the registry.  The description does not
940	   replace the content of the source standard itself.  The descriptions
941	   are not intended to be the English localized names for the subtags.
942	   Localization or translation of language tag and subtag descriptions
943	   is out of scope of this document.

945	   Each record MAY also contain the following fields:

947	   o  Preferred-Value

949	      *  For records of type 'language', 'extlang', 'script', 'region',
950	         and 'variant', 'Preferred-Value' contains a subtag of the same
951	         'Type' which is preferred for forming the language tag.

953	      *  For records of type 'grandfathered' and 'redundant', a
954	         canonical mapping to a complete language tag.

956	   o  Deprecated

958	      *  Deprecated's field-value contains the date the record was
959	         deprecated.

961	   o  Prefix

963	      *  Prefix's field-value contains a language tag with which this
964	         subtag MAY be used to form a new language tag, perhaps with
965	         other subtags as well.  This field MUST only appear in records
966	         whose 'Type' field-value is 'variant' or 'extlang'.  For
967	         example, the 'Prefix' for the variant 'scouse' is 'en', meaning
968	         that the tags "en-scouse" and "en-GB-scouse" might be
969	         appropriate while the tag "is-scouse" is not.

971	   o  Comments

973	      *  Comments contains additional information about the subtag, as
974	         deemed appropriate for understanding the registry and
975	         implementing language tags using the subtag or tag.

977	   o  Suppress-Script

979	      *  Suppress-Script contains a script subtag that SHOULD NOT be
980	         used to form language tags with the associated primary language
981	         subtag.  This field MUST only appear in records whose 'Type'
982	         field-value is 'language'.  See Section 4.1.

984	   The field 'Deprecated' MAY be added to any record via the maintenance
985	   process described in Section 3.2 or via the registration process
986	   described in Section 3.4.  Usually the addition of a 'Deprecated'
987	   field is due to the action of one of the standards bodies, such as
988	   ISO 3166, withdrawing a code.  In some historical cases it might not
989	   have been possible to reconstruct the original deprecation date.
990	   For these cases, an approximate date appears in the registry.
991	   Although valid in language tags, subtags and tags with a 'Deprecated'
992	   field are deprecated and validating processors SHOULD NOT generate
993	   these subtags.  Note that a record that contains a 'Deprecated' field
994	   and no corresponding 'Preferred-Value' field has no replacement
995	   mapping.

997	   The field 'Preferred-Value' contains a mapping between the record in
998	   which it appears and a tag or subtag which SHOULD be preferred when
999	   selecting language tags.  These values form three groups:

1001	      ISO 639 language codes which were later withdrawn in favor of
1002	      other codes.  These values are mostly a historical curiosity.

1004	      ISO 3166 region codes which have been withdrawn in favor of a new
1005	      code.  This sometimes happens when a country changes its name or
1006	      administration in such a way that warrants a new region code.

1008	      Tags grandfathered from RFC 3066.  In many cases these tags have
1009	      become obsolete because the values they represent were later
1010	      encoded by ISO 639.

1012	   Records that contain a 'Preferred-Value' field MUST also have a
1013	   'Deprecated' field.  This field contains a date of deprecation.  Thus
1014	   a language tag processor can use the registry to construct the valid,
1015	   non-deprecated set of subtags for a given date.  In addition, for any
1016	   given tag, a processor can construct the set of valid language tags
1017	   that correspond to that tag for all dates up to the date of the
1018	   registry.  The ability to do these mappings MAY be beneficial to
1019	   applications that are matching, selecting, or filtering content
1020	   based on its language tags.
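
Constructing the valid, non-deprecated set of subtags for a given date might look like the sketch below. It rests on several assumptions: the function name is hypothetical, each record is represented as a flat dictionary of single string values, and the sample records use invented dates and private-use codes purely for illustration.

```python
from datetime import date
from typing import Dict, Iterable, Set, Tuple


def active_subtags(records: Iterable[Dict[str, str]],
                   as_of: str) -> Set[Tuple[str, str]]:
    """Return (Type, Subtag) pairs that are added and not deprecated on `as_of`."""
    cutoff = date.fromisoformat(as_of)
    active = set()
    for rec in records:
        if "Subtag" not in rec:
            continue  # 'Tag' records (grandfathered, redundant) are skipped
        if date.fromisoformat(rec["Added"]) > cutoff:
            continue  # not yet in the registry on that date
        deprecated = rec.get("Deprecated")
        if deprecated is not None and date.fromisoformat(deprecated) <= cutoff:
            continue  # already deprecated on that date
        active.add((rec["Type"], rec["Subtag"]))
    return active


# Hypothetical records with invented dates, for illustration only.
sample_records = [
    {"Type": "region", "Subtag": "XA", "Added": "2000-01-01"},
    {"Type": "region", "Subtag": "XB", "Added": "2000-01-01",
     "Deprecated": "2003-06-01", "Preferred-Value": "XA"},
    {"Type": "language", "Subtag": "qaa", "Added": "2004-05-05"},
]
```

With these sample records, a query for "2002-12-31" returns both regions, while a query for "2005-01-01" drops the deprecated 'XB' and includes the later-added 'qaa'.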

1022	   Note that 'Preferred-Value' mappings in records of type 'region'
1023	   might not represent exactly the same meaning as the original value.
1024	   There are many reasons for a country code to be changed and the
1025	   effect this has on the formation of language tags will depend on
1026	   the nature of the change in question.

1028	   In particular, the 'Preferred-Value' field does not imply retagging
1029	   content that uses the affected subtag.

1031	   The field 'Preferred-Value' MUST NOT be modified once created in the
1032	   registry.  The field MAY be added to records of type "grandfathered"
1033	   and "region" according to the rules in Section 3.2.  Otherwise the
1034	   field MUST NOT be added to any record already in the registry.

1036	   The 'Preferred-Value' field in records of type "grandfathered" and
1037	   "redundant" contains whole language tags that are strongly
1038	   RECOMMENDED for use in place of the record's value.  In many cases
1039	   the mappings were created by deprecation of the tags during the
1040	   period before this document was adopted.  For example, the tag "no-
1041	   nyn" was deprecated in favor of the ISO 639-1 defined language code
1042	   'nn'.

1044	   Records of type 'variant' MAY have more than one field of type
1045	   'Prefix'.  Additional fields of this type MAY be added to a 'variant'
1046	   record via the registration process.

1048	   Records of type 'extlang' MUST have _exactly_ one 'Prefix' field.

1050	   The field-value of the 'Prefix' field consists of a language tag
1051	   whose subtags are appropriate to use with this subtag.  For example,
1052	   the variant subtag 'scouse' has a Prefix field of "en".  This means
1053	   that tags starting with the sequence "en-" are most appropriate with
1054	   this subtag, so "en-Latn-scouse" and "en-GB-scouse" are both
1055	   acceptable, while the tag "fr-scouse" is an inappropriate choice.

1057	   The field of type 'Prefix' MUST NOT be removed from any record.  The
1058	   field-value for this type of field MUST NOT be modified.

1060	   The field 'Comments' MAY appear more than once per record.  This
1061	   field MAY be inserted or changed via the registration process and no
1062	   guarantee of stability is provided.  The content of this field is not
1063	   restricted, except by the need to register the information, the
1064	   suitability of the request, and by reasonable practical size
1065	   limitations.  Long screeds about a particular subtag are frowned
1066	   upon.

1068	   The field 'Suppress-Script' MUST only appear in records whose 'Type'
1069	   field-value is 'language'.  This field MAY appear at most one time in
1070	   a record.  This field indicates a script used to write the
1071	   overwhelming majority of documents for the given language and which
1072	   therefore adds no distinguishing information to a language tag.  It
1073	   helps ensure greater compatibility between the language tags
1074	   generated according to the rules in this document and language tags
1075	   and tag processors or consumers based on RFC 3066.  For example,
1076	   virtually all Icelandic documents are written in the Latin script,
1077	   making the subtag 'Latn' redundant in the tag "is-Latn".

1079	   For examples of registry entries and their format, see Appendix C.

1081	3.2  Maintenance of the Registry

1083	   Maintenance of the registry requires that as codes are assigned or
1084	   withdrawn by ISO 639, ISO 15924, ISO 3166, and UN M.49, the Language
1085	   Subtag Reviewer will evaluate each change, determine whether it
1086	   conflicts with existing registry entries, and submit the information
1087	   to IANA for inclusion in the registry.  If a change takes place and
1088	   the Language Subtag Reviewer does not do this in a timely manner,
1089	   then any interested party MAY use the procedure in Section 3.4 to
1090	   register the appropriate update.

1092	   Note: The redundant and grandfathered entries together are the
1093	   complete list of tags registered under RFC 3066 [24].  The redundant
1094	   tags are those that can now be formed using the subtags defined in
1095	   the registry together with the rules of Section 2.2.  The
1096	   grandfathered entries are those that can never be legal under those
1097	   same provisions.

1099	   The set of redundant and grandfathered tags is permanent and stable:
1100	   no new entries will be added and none of the entries will be removed.
1101	   Records of type 'grandfathered' MAY have their type converted to
1102	   'redundant': see Section 3.7 for more information.

1104	   RFC 3066 tags that were deprecated prior to the adoption of this
1105	   document are part of the list of grandfathered tags and their
1106	   component subtags were not included as registered variants (although
1107	   they remain eligible for registration).  For example, the tag "art-
1108	   lojban" was deprecated in favor of the language subtag 'jbo'.

1110	   The Language Subtag Reviewer MUST ensure that new subtags meet the
1111	   requirements in Section 4.1 or submit an appropriate alternate subtag
1112	   as described in that section.  If a change or addition to the
1113	   registry is needed, the Language Subtag Reviewer will prepare the
1114	   complete record, including all fields, and forward it to IANA for
1115	   insertion into the registry.  If this represents a new subtag, then
1116	   the message will indicate that this represents an INSERTION of a
1117	   record.  If this represents a change to an existing subtag, then the
1118	   message MUST indicate that this represents a MODIFICATION, as shown
1119	   in the following example:

1121	   LANGUAGE SUBTAG MODIFICATION
1122	   File-Date: 2005-01-02
1123	   %%
1124	   Type: variant
1125	   Subtag: nedis
1126	   Description: Natisone dialect
1127	   Description: Nadiza dialect
1128	   Added: 2003-10-09
1129	   Prefix: sl
1130	   Comments: This is a comment shown
1131	     as an example.
1132	   %%

1134	                                 Figure 6

1136	   Whenever an entry is created or modified in the registry, the 'File-
1137	   Date' record at the start of the registry is updated to reflect the
1138	   most recent modification date in the RFC 3339 [15] "full-date"
1139	   format.
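
The record layout shown in Figure 6 can be read mechanically.  The
following is a minimal, non-normative sketch in Python, assuming that
records are separated by "%%" lines, that each field is a "name: body"
line, and that a line beginning with whitespace continues the previous
field.  The function name parse_records is hypothetical and the
authoritative syntax is the ABNF in Section 3.1.

```python
def parse_records(text):
    """Split record-jar text into a list of records.

    Each record is a list of (field-name, field-body) pairs, because a
    field such as 'Description' may appear more than once.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "%%":            # record separator
            if current:
                records.append(current)
            current = []
        elif line[:1] in (" ", "\t") and current:
            # Continuation line: fold it into the previous field body.
            name, body = current[-1]
            current[-1] = (name, body + " " + line.strip())
        elif ":" in line:
            name, body = line.split(":", 1)
            current.append((name.strip(), body.strip()))
    if current:
        records.append(current)
    return records


# The fields of the example record from Figure 6.
example = """\
File-Date: 2005-01-02
%%
Type: variant
Subtag: nedis
Description: Natisone dialect
Description: Nadiza dialect
Added: 2003-10-09
Prefix: sl
Comments: This is a comment shown
  as an example.
%%
"""

records = parse_records(example)
for name, body in records[1]:
    print(f"{name}: {body}")
```

Keeping each record as a list of pairs, rather than a dictionary,
preserves repeated fields such as the two 'Description' lines.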

1141	   Values in the 'Subtag' field MUST be lowercase except as provided for
1142	   in Section 3.1.

1144	3.3  Stability of IANA Registry Entries

1146	   The stability of entries and their meaning in the registry is
1147	   critical to the long-term stability of language tags.  The rules in
1148	   this section guarantee that a specific language tag's meaning is
1149	   stable over time and will not change.

1151	   These rules specifically deal with how changes to codes (including
1152	   withdrawal and deprecation of codes) maintained by ISO 639, ISO
1153	   15924, ISO 3166, and UN M.49 are reflected in the IANA Language
1154	   Subtag Registry.  Assignments to the IANA Language Subtag Registry
1155	   MUST follow the following stability rules:

1157	   o  Values in the fields 'Type', 'Subtag', 'Tag', 'Added',
1158	      'Deprecated' and 'Preferred-Value' MUST NOT be changed and are
1159	      guaranteed to be stable over time.

1161	   o  Values in the 'Description' field MUST NOT be changed in a way
1162	      that would invalidate previously-existing tags.  They MAY be
1163	      broadened somewhat in scope, changed to add information, or
1164	      adapted to the most common modern usage.  For example, countries
1165	      occasionally change their official names: an historical example of
1166	      this would be "Upper Volta" changing to "Burkina Faso".

1168	   o  Values in the field 'Prefix' MAY be added to records of type
1169	      'variant' via the registration process.

1171	   o  Values in the field 'Prefix' MAY be modified, so long as the
1172	      modifications broaden the set of prefixes.  That is, a prefix MAY
1173	      be replaced by one of its own prefixes.  For example, the prefix
1174	      "en-US" could be replaced by "en", but not by the prefixes "en-
1175	      Latn", "fr", or "en-US-boont".  If one of those prefixes were
1176	      needed, a new Prefix SHOULD be registered.

1178	   o  Values in the field 'Prefix' MUST NOT be removed.

1180	   o  The field 'Comments' MAY be added, changed, modified, or removed
1181	      via the registration process or any of the processes or
1182	      considerations described in this section.

1184	   o  The field 'Suppress-Script' MAY be added or removed via the
1185	      registration process.

1187	   o  Codes assigned by ISO 639, ISO 15924, and ISO 3166 that do not
1188	      conflict with existing subtags of the associated type and whose
1189	      meaning is not the same as an existing subtag of the same type are
1190	      entered into the IANA registry as new records.

1192	   o  Codes assigned by ISO 639, ISO 15924, or ISO 3166 that are
1193	      withdrawn by their respective maintenance or registration
1194	      authority remain valid in language tags.  A 'Deprecated' field
1195	      containing the date of withdrawal is added to the record.  If a
1196	      new record of the same type is added that represents a replacement
1197	      value, then a 'Preferred-Value' field MAY also be added.  The
1198	      registration process MAY be used to add comments about the
1199	      withdrawal of the code by the respective standard.

1201	      *  The region code 'TL' was assigned to the country 'Timor-Leste',
1202	         replacing the code 'TP' (which was assigned to 'East Timor'
1203	         when it was under administration by Portugal).  The subtag 'TP'
1204	         remains valid in language tags, but its record contains a
1205	         'Preferred-Value' of 'TL' and its field 'Deprecated' contains
1206	         the date the new code was assigned ('2004-07-06').

1208	   o  Codes assigned by ISO 639, ISO 15924, or ISO 3166 that conflict
1209	      with existing subtags of the associated type, including subtags
1210	      that are deprecated, MUST NOT be entered into the registry.  The
1211	      following additional considerations apply to subtag values that
1212	      are reassigned:

1214	      *  For ISO 639 codes, if the newly assigned code's meaning is not
1215	         represented by a subtag in the IANA registry, the Language
1216	         Subtag Reviewer, as described in Section 3.4, SHALL prepare a
1217	         proposal for entering in the IANA registry as soon as practical
1218	         a registered language subtag as an alternate value for the new
1219	         code.  The form of the registered language subtag will be at
1220	         the discretion of the Language Subtag Reviewer and MUST conform
1221	         to other restrictions on language subtags in this document.

1223	      *  For all subtags whose meaning is derived from an external
1224	         standard (i.e., ISO 639, ISO 15924, ISO 3166, or UN M.49), if a
1225	         new meaning is assigned to an existing code and the new meaning
1226	         broadens the meaning of that code, then the meaning for the
1227	         associated subtag MAY be changed to match.  The meaning of a
1228	         subtag MUST NOT be narrowed, however, as this can result in an
1229	         unknown proportion of the existing uses of a subtag becoming
1230	         invalid.  Note: ISO 639 MA/RA has adopted a similar stability
1231	         policy.

1233	      *  For ISO 15924 codes, if the newly assigned code's meaning is
1234	         not represented by a subtag in the IANA registry, the Language
1235	         Subtag Reviewer, as described in Section 3.4, SHALL prepare a
1236	         proposal for entering in the IANA registry as soon as practical
1237	         a registered variant subtag as an alternate value for the new
1238	         code.  The form of the registered variant subtag will be at the
1239	         discretion of the Language Subtag Reviewer and MUST conform to
1240	         other restrictions on variant subtags in this document.

1242	      *  For ISO 3166 codes, if the newly assigned code's meaning is
1243	         associated with the same UN M.49 code as another 'region'
1244	         subtag, then the existing region subtag remains as the
1245	         preferred value for that region and no new entry is created.  A
1246	         comment MAY be added to the existing region subtag indicating
1247	         the relationship to the new ISO 3166 code.

1249	      *  For ISO 3166 codes, if the newly assigned code's meaning is
1250	         associated with a UN M.49 code that is not represented by an
1251	         existing region subtag, then the Language Subtag Reviewer, as
1252	         described in Section 3.4, SHALL prepare a proposal for entering
1253	         the appropriate UN M.49 country code as an entry in the IANA
1254	         registry.

1256	      *  Codes assigned by UN M.49 to countries or areas (as opposed to
1257	         geographical regions and sub-regions) for which there is no
1258	         corresponding ISO 3166 code MUST NOT be registered, except
1259	         under the previous provision.  If it is necessary to identify a
1260	         region for which only a UN M.49 code exists in language tags,
1261	         then the registration authority for ISO 3166 SHOULD be
1262	         petitioned to assign a code, which can then be registered for
1263	         use in language tags.  At the time this document was written,
1264	         there were only four such codes: 830 (Channel Islands), 831
1265	         (Guernsey), 832 (Jersey), and 833 (Isle of Man).  This rule
1266	         exists so that UN M.49 codes remain available as the value of
1267	         last resort in cases where ISO 3166 reassigns a deprecated
1268	         value in the registry.

1270	      *  For ISO 3166 codes, if there is no associated UN numeric code,
1271	         then the Language Subtag Reviewer SHALL petition the UN to
1272	         create one.  If there is no response from the UN within ninety
1273	         days of the request being sent, the Language Subtag Reviewer
1274	         SHALL prepare a proposal for entering in the IANA registry as
1275	         soon as practical a registered variant subtag as an alternate
1276	         value for the new code.  The form of the registered variant
1277	         subtag will be at the discretion of the Language Subtag
1278	         Reviewer and MUST conform to other restrictions on variant
1279	         subtags in this document.  This situation is very unlikely to
1280	         ever occur.

1282	   o  Stability provisions apply to grandfathered tags with this
1283	      exception: should all of the subtags in a grandfathered tag become
1284	      valid subtags in the IANA registry, then the field 'Type' in that
1285	      record is changed from 'grandfathered' to 'redundant'.  Note that
1286	      this will not affect language tags that match the grandfathered
1287	      tag, since these tags will now match valid generative subtag
1288	      sequences.  For example, if the subtag 'gan' in the language tag
1289	      "zh-gan" were to be registered as an extended language subtag,
1290	      then the grandfathered tag "zh-gan" would be deprecated (but
1291	      existing content or implementations that use "zh-gan" would remain
1292	      valid).
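
The rule above that a 'Prefix' value may only be replaced by one of
its own prefixes can be checked subtag by subtag.  The following
non-normative Python sketch assumes that comparison is
case-insensitive and that replacing a prefix with itself does not
count as a modification.  The function names are hypothetical.

```python
def subtags(tag):
    """Split a tag or prefix into lowercase subtags."""
    return tag.lower().split("-")


def is_broadening(old_prefix, new_prefix):
    """True if new_prefix is a proper leading part of old_prefix."""
    old, new = subtags(old_prefix), subtags(new_prefix)
    return len(new) < len(old) and old[:len(new)] == new


# The candidates discussed in the 'Prefix' rule for the prefix "en-US".
for candidate in ("en", "en-Latn", "fr", "en-US-boont"):
    print(candidate, is_broadening("en-US", candidate))
```

Only "en" is accepted.  The other three candidates either name a
different range of tags or narrow the existing one.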

1294	3.4  Registration Procedure for Subtags

1296	   The procedure given here MUST be used by anyone who wants to use a
1297	   subtag not currently in the IANA Language Subtag Registry.

1299	   Only subtags of type 'language' and 'variant' will be considered for
1300	   independent registration of new subtags.  Handling of subtags needed
1301	   for stability and subtags necessary to keep the registry synchronized
1302	   with ISO 639, ISO 15924, ISO 3166, and UN M.49 within the limits
1303	   defined by this document is described in Section 3.2.  Stability
1304	   provisions are described in Section 3.3.

1306	   This procedure MAY also be used to register or alter the information
1307	   for the "Description", "Comments", "Deprecated", or "Prefix" fields
1308	   in a subtag's record as described in Figure 9.  Changes to all other
1309	   fields in the IANA registry are NOT permitted.

1311	   Registering a new subtag or requesting modifications to an existing
1312	   tag or subtag starts with the requester filling out the registration
1313	   form reproduced below.  Note that each response is not limited in
1314	   size so that the request can adequately describe the registration.
1315	   The fields in the "Record Requested" section SHOULD follow the
1316	   requirements in Section 3.1.

1318	   LANGUAGE SUBTAG REGISTRATION FORM
1319	   1. Name of requester:
1320	   2. E-mail address of requester:
1321	   3. Record Requested:

1323	   Type:
1324	   Subtag:
1325	   Description:
1326	   Prefix:
1327	   Preferred-Value:
1328	   Deprecated:
1329	   Suppress-Script:
1330	   Comments:

1332	   4. Intended meaning of the subtag:
1333	   5. Reference to published description
1334	   of the language (book or article):
1335	   6. Any other relevant information:

1337	                                 Figure 7

1339	   The subtag registration form MUST be sent to
1340	   <ietf-languages@iana.org> for a two week review period before it can
1341	   be submitted to IANA.  (This is an open list and can be joined by
1342	   sending a request to <ietf-languages-request@iana.org>.)

1344	   Variant and extlang subtags are always registered for use with a
1345	   particular range of language tags.  For example, the subtag 'scouse'
1346	   is intended for use with language tags that start with the primary
1347	   language subtag "en", since Scouse is a dialect of English.  Thus the
1348	   subtag 'scouse' could be included in tags such as "en-Latn-scouse" or
1349	   "en-GB-scouse".  This information is stored in the "Prefix" field in
1350	   the registry.  Variant registration requests are REQUIRED to include
1351	   at least one "Prefix" field in the registration form.
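
Two of the requirements above lend themselves to a mechanical check
before a form for a new subtag is sent to the list: only 'language'
and 'variant' records are considered, and a variant request needs at
least one "Prefix" field.  The following non-normative Python sketch
assumes the "Record Requested" part of the form has already been read
into (name, value) pairs.  The name check_request is hypothetical.

```python
REGISTRABLE_TYPES = {"language", "variant"}


def check_request(fields):
    """Return a list of problems found in a requested new record.

    `fields` is a list of (field-name, value) pairs, since a form may
    repeat fields such as 'Description' or 'Prefix'.
    """
    problems = []
    types = [value for name, value in fields if name == "Type"]
    record_type = types[0] if len(types) == 1 else None
    if record_type not in REGISTRABLE_TYPES:
        problems.append("Type must be 'language' or 'variant'")
    prefixes = [value for name, value in fields
                if name == "Prefix" and value]
    if record_type == "variant" and not prefixes:
        problems.append("variant requests need at least one Prefix")
    return problems


request = [
    ("Type", "variant"),
    ("Subtag", "scouse"),
    ("Description", "Scouse"),
    ("Prefix", "en"),
]
print(check_request(request))
```

Such a check only catches omissions in the form.  Whether a request is
accepted remains a matter for the review process.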

1353	   The 'Prefix' field for a given registered subtag will be maintained
1354	   in the IANA registry as a guide to usage.  Additional prefixes MAY be
1355	   added by filing an additional registration form.  In that form, the
1356	   "Any other relevant information:" field MUST indicate that it is the
1357	   addition of a prefix.

1359	   Requests to add a prefix to a variant subtag that imply a different
1360	   semantic meaning will probably be rejected.  For example, a request
1361	   to add the prefix "de" to the subtag 'nedis' so that the tag "de-
1362	   nedis" represented some German dialect would be rejected.  The
1363	   'nedis' subtag represents a particular Slovenian dialect and the
1364	   additional registration would change the semantic meaning assigned to
1365	   the subtag.  A separate subtag SHOULD be proposed instead.

1367	   The 'Description' field MUST contain a description of the tag being
1368	   registered written or transcribed into the Latin script; it MAY also
1369	   include a description in a non-Latin script.  Non-ASCII characters
1370	   MUST be escaped using the syntax described in Section 3.1.  The
1371	   'Description' field is used for identification purposes and does not
1372	   necessarily represent the actual native name of the language or
1373	   variation, nor does it need to be in any particular language.

1375	   While the 'Description' field itself is not guaranteed to be stable
1376	   and errata corrections MAY be undertaken from time to time, attempts
1377	   to provide translations or transcriptions of entries in the registry
1378	   itself will probably be frowned upon by the community or rejected
1379	   outright, as changes of this nature have an impact on the provisions
1380	   in Section 3.3.

1382	   The Language Subtag Reviewer is responsible for responding to
1383	   requests for the registration of subtags through the registration
1384	   process and is appointed by the IESG.

1386	   When the two week period has passed, the Language Subtag Reviewer
1387	   either forwards the record to be inserted or modified to
1388	   iana@iana.org according to the procedure described in Section 3.2, or
1389	   rejects the request because of significant objections raised on the
1390	   list or due to problems with constraints in this document (which MUST
1391	   be explicitly cited).  The reviewer MAY also extend the review period
1392	   in two week increments to permit further discussion.  The reviewer
1393	   MUST indicate on the list whether the registration has been accepted,
1394	   rejected, or extended following each two week period.

1396	   Note that the reviewer can raise objections on the list if he or she
1397	   so desires.  The important thing is that the objection MUST be made
1398	   publicly.

1400	   The applicant is free to modify a rejected application with
1401	   additional information and submit it again; this restarts the two
1402	   week comment period.

1404	   Decisions made by the reviewer MAY be appealed to the IESG [RFC 2028]
1405	   [9] under the same rules as other IETF decisions [RFC 2026] [8].

1407	   All approved registration forms are available online in the directory
1408	   http://www.iana.org/numbers.html under "languages".

1410	   Updates or changes to existing records, including previous
1411	   registrations, follow the same procedure as new registrations.  The
1412	   Language Subtag Reviewer decides whether there is consensus to update
1413	   the registration following the two week review period; normally
1414	   objections by the original registrant will carry extra weight in
1415	   forming such a consensus.

1417	   Registrations are permanent and stable.  Once registered, subtags
1418	   will not be removed from the registry and will remain a valid way in
1419	   which to specify a specific language or variant.

1421	   Note: The "Description" field in the registration form is
1422	   intended as an aid to people trying to verify whether a language is
1423	   registered or what language or language variation a particular subtag
1424	   refers to.  In most cases, reference to an authoritative grammar or
1425	   dictionary of that language will be useful; in cases where no such
1426	   work exists, other well known works describing that language or in
1427	   that language MAY be appropriate.  The subtag reviewer decides what
1428	   constitutes "good enough" reference material.  This requirement is
1429	   not intended to exclude particular languages or dialects due to the
1430	   size of the speaker population or lack of a standardized orthography.
1431	   Minority languages will be considered equally on their own merits.

1433	3.5  Possibilities for Registration

1435	   Possibilities for registration of subtags or information about
1436	   subtags include:

1438	   o  Primary language subtags for languages not listed in ISO 639 that
1439	      are not variants of any listed or registered language can be
1440	      registered.  At the time this document was created there were no
1441	      examples of this form of subtag.  Before attempting to register a
1442	      language subtag, there MUST be an attempt to register the language
1443	      with ISO 639.  No language subtags will be registered for codes
1444	      that exist in ISO 639-1 or ISO 639-2, that are under
1445	      consideration by the ISO 639 maintenance or registration
1446	      authorities, or that have never been attempted for registration
1447	      with those authorities.  If ISO 639 has previously rejected a
1448	      language for registration, it is reasonable to assume that there
1449	      must be additional very compelling evidence of need before it will
1450	      be registered in the IANA registry (to the extent that it is very
1451	      unlikely that any subtags will be registered of this type).

1453	   o  Dialect or other divisions or variations within a language, its
1454	      orthography, writing system, regional or historical usage,
1455	      transliteration or other transformation, or distinguishing
1456	      variation MAY be registered as variant subtags.  An example is the
1457	      'scouse' subtag (the Scouse dialect of English).

1459	   o  The addition or maintenance of fields (generally of an
1460	      informational nature) in Tag or Subtag records as described in
1461	      Section 3.1 and subject to the stability provisions in
1462	      Section 3.3.  This includes descriptions; comments; deprecation
1463	      and preferred values for obsolete or withdrawn codes; or the
1464	      addition of script or extlang information to primary language
1465	      subtags.

1467	   o  The addition of records and related field value changes necessary
1468	      to reflect assignments made by ISO 639, ISO 15924, ISO 3166, and
1469	      UN M.49 as described in Section 3.3.

1471	   This document leaves the decision on what subtags or changes to
1472	   subtags are appropriate (or not) to the registration process
1473	   described in Section 3.4.

1475	   Note: four character primary language subtags are reserved to allow
1476	   for the possibility of alpha4 codes in some future addition to the
1477	   ISO 639 family of standards.

1479	   ISO 639 defines a maintenance agency for additions to and changes in
1480	   the list of languages in ISO 639.  This agency is:

1482	   International Information Centre for Terminology (Infoterm)
1483	   Aichholzgasse 6/12, AT-1120
1484	   Wien, Austria
1485	   Phone: +43 1 26 75 35 Ext. 312 Fax: +43 1 216 32 72

1487	   ISO 639-2 defines a maintenance agency for additions to and changes
1488	   in the list of languages in ISO 639-2.  This agency is:

1490	   Library of Congress
1491	   Network Development and MARC Standards Office
1492	   Washington, D.C. 20540 USA
1493	   Phone: +1 202 707 6237  Fax: +1 202 707 0115
1494	   URL: http://www.loc.gov/standards/iso639

1496	   The maintenance agency for ISO 3166 (country codes) is:

1498	   ISO 3166 Maintenance Agency
1499	   c/o International Organization for Standardization
1500	   Case postale 56
1501	   CH-1211 Geneva 20 Switzerland
1502	   Phone: +41 22 749 72 33  Fax: +41 22 749 73 49
1503	   URL: http://www.iso.org/iso/en/prods-services/iso3166ma/index.html

1505	   The registration authority for ISO 15924 (script codes) is:

1507	   Unicode Consortium Box 391476
1508	   Mountain View, CA 94039-1476, USA
1509	   URL: http://www.unicode.org/iso15924

1511	   The Statistics Division of the United Nations Secretariat maintains
1512	   the Standard Country or Area Codes for Statistical Use and can be
1513	   reached at:

1515	   Statistical Services Branch
1516	   Statistics Division
1517	   United Nations, Room DC2-1620
1518	   New York, NY 10017, USA

1520	   Fax: +1-212-963-0623
1521	   E-mail: statistics@un.org
1522	   URL: http://unstats.un.org/unsd/methods/m49/m49alpha.htm

1524	3.6  Extensions and Extensions Namespace

1526	   Extension subtags are those introduced by single-letter subtags other
1527	   than 'x'.  They are reserved for the generation of identifiers which
1528	   contain a language component, and are compatible with applications
1529	   that understand language tags.  For example, they might be used to
1530	   define locale identifiers, which are generally based on language.

1532	   The structure and form of extensions are defined by this document so
1533	   that implementations can be created that are forward compatible with
1534	   applications that might be created using single-letter subtags in the
1535	   future.  In addition, defining a mechanism for maintaining single-
1536	   letter subtags will contribute to the stability of this document by
1537	   reducing the likely need for future revisions or updates.

1539	   Allocation of a single-letter subtag SHALL take the form of an RFC
1540	   defining the name, purpose, processes, and procedures for maintaining
1541	   the subtags.  The maintaining or registering authority, including
1542	   name, contact email, discussion list email, and URL location of the
1543	   registry MUST be indicated clearly in the RFC.  The RFC MUST specify
1544	   or include each of the following:

1546	   o  The specification MUST reference the specific version or revision
1547	      of this document that governs its creation and MUST reference this
1548	      section of this document.

1550	   o  The specification and all subtags defined by the specification
1551	      MUST follow the ABNF and other rules for the formation of tags and
1552	      subtags as defined in this document.  In particular it MUST
1553	      specify that case is not significant and that subtags MUST NOT
1554	      exceed eight characters in length.

1556	   o  The specification MUST specify a canonical representation.

1558	   o  The specification of valid subtags MUST be available over the
1559	      Internet and at no cost.

1561	   o  The specification MUST be in the public domain or available via a
1562	      royalty-free license acceptable to the IETF and specified in the
1563	      RFC.

1565	   o  The specification MUST be versioned and each version of the
1566	      specification MUST be numbered, dated, and stable.

1568	   o  The specification MUST be stable.  That is, extension subtags,
1569	      once defined by a specification, MUST NOT be retracted or change
1570	      in meaning in any substantial way.

1572	   o  The specification MUST include in a separate section the
1573	      registration form reproduced in this section (below) to be used in
1574	      registering the extension upon publication as an RFC.

1576	   o  IANA MUST be informed of changes to the contact information and
1577	      URL for the specification.

1579	   IANA will maintain a registry of allocated single-letter (singleton)
1580	   subtags.  This registry will use the record-jar format described by
1581	   the ABNF in Section 3.1.  Upon publication of an extension as an RFC,
1582	   the maintaining authority defined in the RFC MUST forward this
1583	   registration form to iesg@ietf.org, who will forward the request to
1584	   iana@iana.org.  The maintaining authority of the extension MUST
1585	   maintain the accuracy of the record by sending an updated full copy
1586	   of the record to iana@iana.org with the subject line "LANGUAGE TAG
1587	   EXTENSION UPDATE" whenever content changes.  Only the 'Comments',
1588	   'Contact_Email', 'Mailing_List', and 'URL' fields MAY be modified in
1589	   these updates.

1591	   Failure to maintain this record, the corresponding registry, or meet
1592	   other conditions imposed by this section of this document MAY be
1593	   appealed to the IESG [RFC 2028] [9] under the same rules as other
1594	   IETF decisions (see [8]) and MAY result in the authority to maintain
1595	   the extension being withdrawn or reassigned by the IESG.
1596	   %%
1597	   Identifier:
1598	   Description:
1599	   Comments:
1600	   Added:
1601	   RFC:
1602	   Authority:
1603	   Contact_Email:
1604	   Mailing_List:
1605	   URL:
1606	   %%

1608	    Figure 8: Format of Records in the Language Tag Extensions Registry

1610	   'Identifier' contains the single letter subtag (singleton) assigned
1611	   to the extension.  The Internet-Draft submitted to define the
1612	   extension SHOULD specify which letter to use, although the IESG MAY
1613	   change the assignment when approving the RFC.

1615	   'Description' contains the name and description of the extension.

1617	   'Comments' is an OPTIONAL field and MAY contain a broader description
1618	   of the extension.

1620	   'Added' contains the date the RFC was published in the "full-date"
1621	   format specified in RFC 3339 [15].  For example: 2004-06-28
1622	   represents June 28, 2004, in the Gregorian calendar.

1624	   'RFC' contains the RFC number assigned to the extension.

1626	   'Authority' contains the name of the maintaining authority for the
1627	   extension.

1629	   'Contact_Email' contains the email address used to contact the
1630	   maintaining authority.

1632	   'Mailing_List' contains the URL or subscription email address of the
1633	   mailing list used by the maintaining authority.

1635	   'URL' contains the URL of the registry for this extension.
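
The field descriptions above can be turned into a simple consistency
check on a record in the format of Figure 8.  The following
non-normative Python sketch assumes that every field other than
'Comments' is expected to be present and that a record has been read
into a dictionary.  The function names and all values in the example
record, including the letter 'r' and the RFC number, are hypothetical.

```python
import re
from datetime import date

# Fields that a maintaining authority is allowed to update.
MUTABLE_FIELDS = {"Comments", "Contact_Email", "Mailing_List", "URL"}
# Assumption of this sketch: everything except 'Comments' is expected.
REQUIRED_FIELDS = ("Identifier", "Description", "Added", "RFC",
                   "Authority", "Contact_Email", "Mailing_List", "URL")


def check_extension_record(record):
    """Return a list of problems found in an extension record (a dict)."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not record.get(name)]
    ident = record.get("Identifier", "")
    is_letter = len(ident) == 1 and ident.isascii() and ident.isalpha()
    if not is_letter or ident.lower() == "x":
        problems.append("Identifier must be one letter other than 'x'")
    added = record.get("Added", "")
    try:
        if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", added):
            raise ValueError(added)
        date.fromisoformat(added)   # rejects dates such as 2004-02-30
    except ValueError:
        problems.append("Added must be an RFC 3339 full-date")
    return problems


def illegal_changes(old, new):
    """Names of fields that differ but are not open to modification."""
    names = set(old) | set(new)
    return sorted(name for name in names
                  if old.get(name) != new.get(name)
                  and name not in MUTABLE_FIELDS)


example = {
    "Identifier": "r",
    "Description": "Example extension",
    "Added": "2004-06-28",
    "RFC": "9999",
    "Authority": "Example Authority",
    "Contact_Email": "contact@example.org",
    "Mailing_List": "list@example.org",
    "URL": "https://example.org/registry",
}
print(check_extension_record(example))
print(illegal_changes(example, dict(example, RFC="0000")))
```

The second helper reflects the rule that only the 'Comments',
'Contact_Email', 'Mailing_List', and 'URL' fields are modified in
updates sent to IANA.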

1637	   The determination of whether an Internet-Draft meets the above
1638	   conditions and the decision to grant or withhold such authority rests
1639	   solely with the IESG, and is subject to the normal review and appeals
1640	   process associated with the RFC process.

1642	   Extension authors are strongly cautioned that many (including most
1643	   well-formed) processors will be unaware of any special relationships
1644	   or meaning inherent in the order of extension subtags.  Extension
1645	   authors SHOULD avoid subtag relationships or canonicalization
1646	   mechanisms that interfere with matching or with length restrictions
1647	   that sometimes exist in common protocols where the extension is used.
1648	   In particular, applications MAY truncate the subtags in doing
1649	   matching or in fitting into limited lengths, so it is RECOMMENDED
1650	   that the most significant information be in the most significant
1651	   (left-most) subtags, and that the specification gracefully handle
1652	   truncated subtags.
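
The effect of truncation on an extension can be illustrated with a
short, non-normative Python sketch.  It assumes that a processor
shortens a tag by removing whole subtags from the right until the tag
fits, and that it also removes a single-character subtag left dangling
at the end.  Both are assumptions of this sketch rather than
requirements stated here.  The function name and the singleton 'r' are
hypothetical.

```python
def truncate_tag(tag, max_len):
    """Drop right-most subtags until the tag fits in max_len characters."""
    parts = tag.split("-")
    while parts and len("-".join(parts)) > max_len:
        parts.pop()
        # A single-character subtag only introduces what follows it, so
        # it carries no information at the end of a tag; drop it as well.
        while parts and len(parts[-1]) == 1:
            parts.pop()
    return "-".join(parts)


tag = "en-US-r-extended-sequence"
for limit in (25, 16, 12):
    print(limit, truncate_tag(tag, limit))
```

With a limit of 16 the second extension subtag is lost, and with a
limit of 12 the whole extension disappears.  This is why the most
significant information belongs in the left-most subtags.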

1654	   When a language tag is to be used in a specific, known protocol, it
1655	   is RECOMMENDED that the language tag not contain extensions not
1656	   supported by that protocol.  In addition, note that some protocols
1657	   MAY impose upper limits on the length of the strings used to store or
1658	   transport the language tag.

1660	3.7  Initialization of the Registry

1662	   Upon publication of this document as a BCP, the Language Subtag
1663	   Registry MUST be created and populated with the initial set of
1664	   subtags.  This includes converting the entries from the existing IANA
1665	   language tag registry defined by RFC 3066 to the new format.  This
1666	   section defines the process for defining the new registry and
1667	   performing the conversion of the old registry.

1669	   The impact of this conversion on the IANA maintainers of the registry
1670	   will be a small increase in the frequency of new entries.  The
1671	   initial set of records represents no impact on IANA, since the work
1672	   to create it will be performed externally (as defined in this
1673	   section).  Future work will be limited to inserting or replacing
1674	   whole records preformatted for IANA by the Language Subtag Reviewer.

1676	   The initial registry will be created by the LTRU working group.
1677	   Using the instructions in this document, the working group will
1678	   prepare an Informational RFC by creating a series of Internet-Drafts
1679	   containing the prototype registry according to the rules in Sections
1680	   4.2.2 and 4.2.3 and subject to IESG review as described in Section
1681	   6.1.1 of RFC 2026 [8].

1683	   When the Internet-Draft containing the prototype registry has been
1684	   approved by the IESG for publication as an RFC, the document will be
1685	   forwarded to IANA, which will post the contents of the new registry
1686	   on-line.

1688	   Tags in the RFC 3066 registry that are not deprecated, that consist
1689	   entirely of subtags that are valid under this document, and that have
1690	   the correct form and format for tags defined by this document are
1691	   superseded by this document.  Such tags are placed in records of type
1692	   'redundant' in the registry.  For example, "zh-Hant" is now defined
1693	   by this document.

1695	   All other tags in the RFC 3066 registry that are deprecated will be
1696	   maintained as grandfathered entries.  The record for the
1697	   grandfathered entry will contain a 'Deprecated' field with the most
1698	   appropriate date that can be determined for when the record was
1699	   deprecated.  The 'Comments' field will contain the reason for the
1700	   deprecation.  The 'Preferred-Value' field will contain the tag that
1701	   replaces the value.  For example, the tag "art-lojban" is deprecated
1702	   and will be placed in the grandfathered section.  Its 'Deprecated'
1703	   field will contain the deprecation date (in this case "2003-09-02")
1704	   and the 'Preferred-Value' field the value "jbo".

1706	   Tags that are not deprecated and which contain subtags which are
1707	   consistent with registration under the guidelines in this document
1708	   will not automatically have a new subtag registration created for
1709	   each eligible subtag.  Interested parties MAY use the registration
1710	   process in Section 3.4 to register these subtags.  If all of the
1711	   subtags in the original tag become fully defined by the resulting
1712	   registrations, then the original tag is superseded by this document.
1713	   Such tags will have their record changed from type 'grandfathered' to
1714	   type 'redundant' in the registry.  For example, the subtag 'boont'
1715	   could be registered, resulting in the change of the grandfathered tag
1716	   "en-boont" to type redundant in the registry.

1718	   Tags that contain one or more subtags that do not match the valid
1719	   registration pattern and which are not otherwise defined by this
1720	   document will have records of type 'grandfathered' created in the
1721	   registry.  These records cannot become type 'redundant', but MAY have
1722	   a 'Deprecated' and 'Preferred-Value' field added to them if a subtag
1723	   assignment or combination of assignments renders the tag obsolete.

1725	   There MUST be a reasonable period in which the community can comment
1726	   on the proposed list entries, which SHALL be no less than four weeks
1727	   in length.  At the completion of this period, the chair(s) will
1728	   notify iana@iana.org and the ltru and ietf-languages mail lists that
1729	   the task is complete and forward the necessary materials to IANA for
1730	   publication.

1732	   Registrations that are in process under the rules defined in RFC 3066
1733	   MAY be completed under the former rules, at the discretion of the
1734	   language tag reviewer.  Any new registrations submitted after the
1735	   request for conversion of the registry MUST be rejected.

1737	   All existing RFC 3066 language tag registrations will be maintained
1738	   in perpetuity.

1740	   Users of tags that are grandfathered SHOULD consider registering
1741	   appropriate subtags in the IANA subtag registry (but are NOT REQUIRED
1742	   to).

1744	   UN numeric codes assigned to 'macro-geographical (continental)' MUST
1745	   be defined in the IANA registry and made valid for use in language
1746	   tags.  These codes MUST be added to the initial version of the
1747	   registry.  The UN numeric codes for 'economic groupings' or 'other
1748	   groupings', and the alphanumeric codes in Appendix X of the UN
1749	   document MUST NOT be added to the registry.  The UN numeric codes for
1750	   countries or areas not associated with an assigned ISO 3166 alpha-2
1751	   code MUST NOT be added to the initial version of the registry.  These
1752	   values MAY be registered by individuals using the process defined in
1753	   Section 3.4 and according to the rules in Section 3.3.

1755	   When creating records for ISO 639, ISO 15924, ISO3166, and UN M.49
1756	   codes, the following criteria SHALL be applied to the inclusion,
1757	   preferred value, and deprecation of codes:

1759	   For each standard, the date of the standard referenced in RFC 1766 is
1760	   selected as the starting date.  Codes that were valid on that date in
1761	   the selected standard are added to the registry.  Codes that were
1762	   previously assigned but which were vacated or withdrawn before
1763	   that date are not added to the registry.  For each successive change
1764	   to the standard, any additional assignments are added to the
1765	   registry.  Values that are withdrawn are marked as deprecated, but
1766	   not removed.  Changes in meaning or assignment of a subtag are
1767	   permitted during this process (for example, the ISO 3166 code 'CS'
1768	   was originally assigned to 'Czechoslovakia' and is now assigned to
1769	   'Serbia and Montenegro').  This continues up to the date that this
1770	   document was adopted.  The resulting set of records is added to the
1771	   registry.  Future changes or additions to this portion of the
1772	   registry are governed by the provisions of this document.

1774	4.  Formation and Processing of Language Tags

1776	   This section addresses how to use the registry with the language tag
1777	   format to choose, form and process language tags.

1779	4.1  Choice of Language Tag

1781	   One is sometimes faced with the choice between several possible tags
1782	   for the same body of text.

1784	   Interoperability is best served when all users use the same language
1785	   tag in order to represent the same language.  If an application has
1786	   requirements that make the rules here inapplicable, then that
1787	   application risks damaging interoperability.  It is strongly
1788	   RECOMMENDED that users not define their own rules for language tag
1789	   choice.

1791	   Of particular note, many applications can benefit from the use of
1792	   script subtags in language tags, as long as the use is consistent for
1793	   a given context.  Script subtags were not formally defined in RFC
1794	   3066 and their use can affect matching and subtag identification by
1795	   implementations of RFC 3066, as these subtags appear between the
1796	   primary language and region subtags.  For example, if a user requests
1797	   content in an implementation of Section 2.5 of RFC 3066 [24] using
1798	   the language range "en-US", content labeled "en-Latn-US" will not
1799	   match the request.  Therefore it is important to know when script
1800	   subtags will customarily be used and when they ought not be used.  In
1801	   the registry, the Suppress-Script field helps ensure greater
1802	   compatibility between the language tags generated according to the
1803	   rules in this document and language tags and tag processors or
1804	   consumers based on RFC 3066 by defining when users SHOULD NOT include
1805	   a script subtag with a particular primary language subtag.
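
The mismatch described above can be illustrated with a minimal sketch of
prefix matching in the style of Section 2.5 of RFC 3066.  The function
name rfc3066_matches is hypothetical, and the "*" wildcard range is
deliberately left out; this is an illustration, not a complete matcher.

```python
def rfc3066_matches(language_range: str, tag: str) -> bool:
    """Sketch of RFC 3066-style prefix matching (wildcard "*" omitted)."""
    rng = language_range.lower()
    candidate = tag.lower()
    # A range matches a tag if it equals the tag, or if it is a prefix of
    # the tag such that the first character after the prefix is "-".
    return candidate == rng or candidate.startswith(rng + "-")


print(rfc3066_matches("en-US", "en-US-boont"))  # True
print(rfc3066_matches("en-US", "en-Latn-US"))   # False: script subtag intervenes
print(rfc3066_matches("en", "en-Latn-US"))      # True
```

Because the script subtag sits between the language and region subtags,
the range "en-US" is no longer a prefix of "en-Latn-US", which is why
consistent use (or suppression) of script subtags matters.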

1807	   Extended language subtags (type 'extlang' in the registry, see
1808	   Section 3.1) also appear between the primary language and region
1809	   subtags and are reserved for future standardization.  Applications
1810	   might benefit from their judicious use in forming language tags in
1811	   the future.  Similar recommendations are expected to apply to their
1812	   use as apply to script subtags.

1814	   Standards, protocols and applications that reference this document
1815	   normatively but apply different rules to the ones given in this
1816	   section MUST specify how the procedure varies from the one given
1817	   here.

1819	   The choice of subtags used to form a language tag SHOULD be guided by
1820	   the following rules:

1822	   1.  Use as precise a tag as possible, but no more specific than is
1823	       justified.  Avoid using subtags that are not important for
1824	       distinguishing content in an application.

1826	       *  For example, 'de' might suffice for tagging an email written
1827	          in German, while "de-CH-1996" is probably unnecessarily
1828	          precise for such a task.

1830	   2.  The script subtag SHOULD NOT be used to form language tags unless
1831	       the script adds some distinguishing information to the tag.  The
1832	       field 'Suppress-Script' in the primary language record in the
1833	       registry indicates which script subtags do not add distinguishing
1834	       information for most applications.

1836	       *  For example, the subtag 'Latn' should not be used with the
1837	          primary language 'en' because nearly all English documents are
1838	          written in the Latin script and it adds no distinguishing
1839	          information.  However, if a document were written in English
1840	          mixing Latin script with another script such as Braille
1841	          ('Brai'), then it might be appropriate to choose to indicate
1842	          both scripts to aid in content selection, such as the
1843	          application of a stylesheet.

1845	   3.  If a tag or subtag has a 'Preferred-Value' field in its registry
1846	       entry, then the value of that field SHOULD be used to form the
1847	       language tag in preference to the tag or subtag in which the
1848	       preferred value appears.

1850	       *  For example, use 'he' for Hebrew in preference to 'iw'.

1852	   4.  The 'und' (Undetermined) primary language subtag SHOULD NOT be
1853	       used to label content, even if the language is unknown.  Omitting
1854	       the language tag altogether is preferred to using a tag with a
1855	       primary language subtag of 'und'.  The 'und' subtag MAY be useful
1856	       for protocols that require a language tag to be provided.  The
1857	       'und' subtag MAY also be useful when matching language tags in
1858	       certain situations.

1860	   5.  The 'mul' (Multiple) primary language subtag SHOULD NOT be used
1861	       whenever the protocol allows separate tags for multiple
1862	       languages, as is the case for the Content-Language header in
1863	       HTTP.  The 'mul' subtag conveys little useful information:
1864	       content in multiple languages SHOULD individually tag the
1865	       languages where they appear or otherwise indicate the actual
1866	       language in preference to the 'mul' subtag.

1868	   6.  The same variant subtag SHOULD NOT be used more than once within
1869	       a language tag.

1871	       *  For example, do not use "en-GB-scouse-scouse".

1873	   To ensure consistent backward compatibility, this document contains
1874	   several provisions to account for potential instability in the
1875	   standards used to define the subtags that make up language tags.
1876	   These provisions mean that no language tag created under the rules in
1877	   this document will become obsolete.

1879	4.2  Meaning of the Language Tag

1881	   The language tag always defines a language as spoken (or written,
1882	   signed or otherwise signaled) by human beings for communication of
1883	   information to other human beings.  Computer languages such as
1884	   programming languages are explicitly excluded.

1886	   If a language tag B contains language tag A as a prefix, then B is
1887	   typically "narrower" or "more specific" than A. For example, "zh-
1888	   Hant-TW" is more specific than "zh-Hant".

1890	   This relationship is not guaranteed in all cases: specifically,
1891	   languages that begin with the same sequence of subtags are NOT
1892	   guaranteed to be mutually intelligible, although they might be.  For
1893	   example, the tag "az" shares a prefix with both "az-Latn"
1894	   (Azerbaijani written using the Latin script) and "az-Cyrl"
1895	   (Azerbaijani written using the Cyrillic script).  A person fluent in
1896	   one script might not be able to read the other, even though the text
1897	   might be identical.  Content tagged as "az" most probably is written
1898	   in just one script and thus might not be intelligible to a reader
1899	   familiar with the other script.

1901	   The relationship between the tag and the information it relates to is
1902	   defined by the standard describing the context in which it appears.
1903	   Accordingly, this section can only give possible examples of its
1904	   usage.

1906	   o  For a single information object, the associated language tags
1907	      might be interpreted as the set of languages that is necessary for
1908	      a complete comprehension of the complete object.  Example: Plain
1909	      text documents.

1911	   o  For an aggregation of information objects, the associated language
1912	      tags could be taken as the set of languages used inside components
1913	      of that aggregation.  Examples: Document stores and libraries.

1915	   o  For information objects whose purpose is to provide alternatives,
1916	      the associated language tags could be regarded as a hint that the
1917	      content is provided in several languages, and that one has to
1918	      inspect each of the alternatives in order to find its language or
1919	      languages.  In this case, the presence of multiple tags might not
1920	      mean that one needs to be multi-lingual to get complete
1921	      understanding of the document.  Example: MIME multipart/
1922	      alternative.

1924	   o  In markup languages, such as HTML and XML, language information
1925	      can be added to each part of the document identified by the markup
1926	      structure (including the whole document itself).  For example, one
1927	      could write <span lang="fr">C'est la vie.</span> inside a
1928	      Norwegian document; the Norwegian-speaking user could then access
1929	      a French-Norwegian dictionary to find out what the marked section
1930	      meant.  If the user were listening to that document through a
1931	      speech synthesis interface, this information could be used to
1932	      signal the synthesizer to appropriately apply French
1933	      text-to-speech pronunciation rules to that span of text, instead
1934	      of applying the inappropriate Norwegian rules.

1936	4.3  Canonicalization of Language Tags

1938	   Since a particular language tag is sometimes used by many processes,
1939	   language tags SHOULD always be created or generated in a canonical
1940	   form.

1942	   A language tag is in canonical form when:

1944	   1.  The tag is well-formed according to the rules in Section 2.1 and
1945	       Section 2.2.

1947	   2.  Subtags of type 'Region' that have a Preferred-Value mapping in
1948	       the IANA registry (see Section 3.1) SHOULD be replaced with their
1949	       mapped value.

1951	   3.  Redundant or grandfathered tags that have a Preferred-Value
1952	       mapping in the IANA registry (see Section 3.1) MUST be replaced
1953	       with their mapped value.  These items are either deprecated
1954	       mappings created before the adoption of this document (such as
1955	       the mapping of "no-nyn" to "nn" or "i-klingon" to "tlh") or are
1956	       the result of later registrations or additions to this document
1957	       (for example, "zh-guoyu" might be mapped to a language-extlang
1958	       combination such as "zh-cmn" by some future update of this
1959	       document).

1961	   4.  Other subtags that have a Preferred-Value mapping in the IANA
1962	       registry (see Section 3.1) MUST be replaced with their mapped
1963	       value.  These items consist entirely of clerical corrections to
1964	       ISO 639-1 in which the deprecated subtags have been maintained
1965	       for compatibility purposes.

1967	   5.  If more than one extension subtag sequence exists, the extension
1968	       sequences are ordered into case-insensitive ASCII order by
1969	       singleton subtag.

1971	   Example: The language tag "en-A-aaa-B-ccc-bbb-x-xyz" is in canonical
1972	   form, while "en-B-ccc-bbb-A-aaa-X-xyz" is well-formed but not in
1973	   canonical form.
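
The ordering of extension sequences in the example above might be
sketched as follows.  The function name order_extensions is
hypothetical, and the sketch assumes a well-formed, non-grandfathered
tag in which any private use sequence (introduced by 'x') comes last.
It only reorders sequences; it does not change the case of any subtag.

```python
def order_extensions(tag: str) -> str:
    """Sort extension sequences by singleton, case-insensitively."""
    subtags = tag.split("-")
    head = [subtags[0]]   # language, script, region, variants
    extensions = []       # each entry: [singleton, subtag, ...]
    private = []          # the 'x' singleton and everything after it
    for i in range(1, len(subtags)):
        sub = subtags[i]
        if len(sub) == 1:
            if sub.lower() == "x":
                private = subtags[i:]
                break
            extensions.append([sub])      # start of a new extension
        elif extensions:
            extensions[-1].append(sub)    # belongs to current extension
        else:
            head.append(sub)
    # Order of subtags *within* an extension is left untouched.
    extensions.sort(key=lambda seq: seq[0].lower())
    ordered = head + [s for seq in extensions for s in seq] + private
    return "-".join(ordered)


print(order_extensions("en-B-ccc-bbb-A-aaa-X-xyz"))
# en-A-aaa-B-ccc-bbb-X-xyz
```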

1975	   Example: The language tag "en-NH" (English as used in the New
1976	   Hebrides) is not canonical because the 'NH' subtag has a canonical
1977	   mapping to 'VU' (Vanuatu), although the tag "en-NH" maintains its
1978	   validity.
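
A minimal sketch of Preferred-Value replacement follows.  The three
tables are a hypothetical in-memory excerpt holding only mappings
mentioned in this document; a real processor would load them from the
IANA registry.  The function name apply_preferred_values is likewise an
assumption made for illustration.

```python
# Hypothetical excerpts; keys are lowercased for case-insensitive lookup.
TAG_PREFERRED = {"art-lojban": "jbo", "no-nyn": "nn", "i-klingon": "tlh"}
LANGUAGE_PREFERRED = {"iw": "he"}
REGION_PREFERRED = {"nh": "VU"}


def apply_preferred_values(tag: str) -> str:
    """Replace whole tags and subtags that have a Preferred-Value."""
    # Whole-tag mappings (redundant or grandfathered tags) come first.
    whole = TAG_PREFERRED.get(tag.lower())
    if whole is not None:
        return whole
    subtags = tag.split("-")
    subtags[0] = LANGUAGE_PREFERRED.get(subtags[0].lower(), subtags[0])
    for i in range(1, len(subtags)):
        sub = subtags[i]
        if len(sub) == 1:
            break  # extension or private use: leave the rest alone
        if len(sub) == 2 and sub.isalpha():
            # Assumed here: a non-initial two-letter subtag is a region.
            subtags[i] = REGION_PREFERRED.get(sub.lower(), sub)
    return "-".join(subtags)


print(apply_preferred_values("en-NH"))       # en-VU
print(apply_preferred_values("iw"))          # he
print(apply_preferred_values("art-lojban"))  # jbo
```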

1980	   Canonicalization of language tags does not imply anything about the
1981	   use of upper or lowercase letters when processing or comparing
1982	   subtags (as described in Section 2.1).  All comparisons MUST be
1983	   performed in a case-insensitive manner.

1985	   When performing canonicalization of language tags, processors MAY
1986	   regularize the case of the subtags (that is, this process is
1987	   OPTIONAL), following the case used in the registry.  Note that this
1988	   corresponds to the following casing rules: uppercase all non-initial
1989	   two-letter subtags; titlecase all non-initial four-letter subtags;
1990	   lowercase everything else.
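
The casing rules just stated can be sketched directly.  The function
name regularize_case is hypothetical.  The sketch applies the three
rules literally, so a two-letter subtag inside an extension or private
use sequence would also be uppercased.  Python's str.upper, str.lower
and str.capitalize do not consult the process locale, so ASCII input
yields ASCII output.

```python
def regularize_case(tag: str) -> str:
    """Apply the registry casing conventions to an ASCII language tag."""
    subtags = tag.split("-")
    result = [subtags[0].lower()]            # initial subtag: lowercase
    for sub in subtags[1:]:
        if len(sub) == 2:
            result.append(sub.upper())       # e.g. region: "us" -> "US"
        elif len(sub) == 4:
            result.append(sub.capitalize())  # e.g. script: "hant" -> "Hant"
        else:
            result.append(sub.lower())       # everything else
    return "-".join(result)


print(regularize_case("ZH-hant-tw"))  # zh-Hant-TW
print(regularize_case("de-ch-1996"))  # de-CH-1996
```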

1992	   Note: Case folding of ASCII letters in certain locales, unless
1993	   carefully handled, sometimes produces non-ASCII character values.
1994	   The Unicode Character Database file "SpecialCasing.txt" defines the
1995	   specific cases that are known to cause problems with this.  In
1996	   particular, the letter 'i' (U+0069) in Turkish and Azerbaijani is
1997	   uppercased to U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE).
1998	   Implementers SHOULD specify a locale-neutral casing operation to
1999	   ensure that case folding of subtags does not produce this value,
2000	   which is illegal in language tags.  For example, if one were to
2001	   uppercase the region subtag 'in' using Turkish locale rules, the
2002	   sequence U+0130 U+004E would result instead of the expected 'IN'.

2004	   Note: if the field 'Deprecated' appears in a registry record without
2005	   an accompanying 'Preferred-Value' field, then that tag or subtag is
2006	   deprecated without a replacement.  Validating processors SHOULD NOT
2007	   generate tags that include these values, although the values are
2008	   canonical when they appear in a language tag.

2010	   An extension MUST define any relationships that exist between the
2011	   various subtags in the extension and thus MAY define an alternate
2012	   canonicalization scheme for the extension's subtags.  Extensions MAY
2013	   define how the order of the extension's subtags are interpreted.  For
2014	   example, an extension could define that its subtags are in canonical
2015	   order when the subtags are placed into ASCII order: that is, "en-a-
2016	   aaa-bbb-ccc" instead of "en-a-ccc-bbb-aaa".  Another extension might
2017	   define that the order of the subtags influences their semantic
2018	   meaning (so that "en-b-ccc-bbb-aaa" has a different value from "en-b-
2019	   aaa-bbb-ccc").  However, extension specifications SHOULD be designed
2020	   so that they are tolerant of the typical processes described in
2021	   Section 3.6.

2023	4.4  Considerations for Private Use Subtags

2025	   Private-use subtags require private agreement between the parties
2026	   that intend to use or exchange language tags that use them and great
2027	   caution SHOULD be used in employing them in content or protocols
2028	   intended for general use.  Private-use subtags are simply useless for
2029	   information exchange without prior arrangement.

2031	   The value and semantic meaning of private-use tags and of the subtags
2032	   used within such a language tag are not defined by this document.

2034	   The use of subtags defined in the IANA registry as having a specific
2035	   private use meaning conveys more information than a purely private
2036	   use tag prefixed by the singleton subtag 'x'.  For applications this
2037	   additional information MAY be useful.

2039	   For example, the region subtags 'AA', 'ZZ' and in the ranges
2040	   'QM'-'QZ' and 'XA'-'XZ' (derived from ISO 3166 private use codes) MAY
2041	   be used to form a language tag.  A tag such as "zh-Hans-XQ" conveys a
2042	   great deal of public, interchangeable information about the language
2043	   material (that it is Chinese in the simplified Chinese script and is
2044	   suitable for some geographic region 'XQ').  While the precise
2045	   geographic region is not known outside of private agreement, the tag
2046	   conveys far more information than an opaque tag such as "x-someLang",
2047	   which contains no information about the language subtag or script
2048	   subtag outside of the private agreement.

2050	   However, in some cases content tagged with private use subtags MAY
2051	   interact with other systems in a different and possibly unsuitable
2052	   manner compared to tags that use opaque, privately defined subtags,
2053	   so the choice of the best approach sometimes depends on the
2054	   particular domain in question.

2056	5.  IANA Considerations

2058	   This section deals with the processes and requirements necessary for
2059	   IANA to undertake to maintain the subtag and extension registries as
2060	   defined by this document and in accordance with the requirements of
2061	   RFC 2434 [12].

2063	   The impact on the IANA maintainers of the two registries defined by
2064	   this document will be a small increase in the frequency of new
2065	   entries or updates.

2067	   Upon adoption of this document, the process described in Section 3.7
2068	   will be used to generate the initial Language Subtag Registry.  The
2069	   initial set of records represents no impact on IANA, since the work
2070	   to create it will be performed externally (as defined in that
2071	   section).  The new registry will be listed under "Language Tags" at
2072	   <http://www.iana.org/numbers.html>.  The existing directory of
2073	   registration forms and RFC 3066 registrations will be relabeled as
2074	   "Language Tags (Obsolete)" and maintained (but not added to or
2075	   modified).

2077	   Future work on the Language Subtag Registry will be limited to
2078	   inserting or replacing whole records preformatted for IANA by the
2079	   Language Subtag Reviewer as described in Section 3.2 of this
2080	   document.  Each record will be sent to iana@iana.org with a subject
2081	   line indicating whether the enclosed record is an insertion (of a new
2082	   record) or a replacement of an existing record which has a Type and
2083	   Subtag (or Tag) field that exactly matches the record sent.  Records
2084	   cannot be deleted from the registry.

2086	   The Language Tag Extensions registry will also be generated and sent
2087	   to IANA as described in Section 3.6.  This registry can contain at
2088	   most 35 records and thus changes to this registry are expected to be
2089	   very infrequent.
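
The figure of 35 can be checked with a short calculation, assuming (as
the tag syntax defined elsewhere in this document is understood to
require) that a singleton is one ASCII letter or digit other than 'x'
and that singletons are compared case-insensitively.

```python
import string

# 26 letters + 10 digits, minus 'x', which is reserved for private use.
singletons = [c for c in string.ascii_lowercase + string.digits if c != "x"]

print(len(singletons))  # 35
```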

2091	   Future work by IANA on the Language Tag Extensions Registry is
2092	   limited to two cases.  First, the IESG MAY request that new records
2093	   be inserted into this registry from time to time.  These requests
2094	   will include the record to insert in the exact format described in
2095	   Section 3.6.  In addition, there MAY be occasional requests from the
2096	   maintaining authority for a specific extension to update the contact
2097	   information or URLs in the record.  These requests MUST include the
2098	   complete, updated record.  IANA is not responsible for validating the
2099	   information provided, only for checking that it is properly
2100	   formatted.  The request should reasonably be seen to come from the
2101	   maintaining authority named in the record present in the registry.

2103	6.  Security Considerations

2105	   Language tags used in content negotiation, like any other information
2106	   exchanged on the Internet, might be a source of concern because they
2107	   might be used to infer the nationality of the sender, and thus
2108	   identify potential targets for surveillance.

2110	   This is a special case of the general problem that anything sent is
2111	   visible to the receiving party and possibly to third parties as well.
2112	   It is useful to be aware that such concerns can exist in some cases.

2114	   The evaluation of the exact magnitude of the threat, and any possible
2115	   countermeasures, is left to each application protocol (see BCP 72,
2116	   RFC 3552 [16] for best current practice guidance on security threats
2117	   and defenses).

2119	   The language tag associated with a particular information item is of
2120	   no consequence whatsoever in determining whether that content might
2121	   contain possible homographs.  The fact that a text is tagged as being
2122	   in one language or using a particular script subtag provides no
2123	   assurance whatsoever that it does not contain characters from scripts
2124	   other than the one(s) associated with or specified by that language
2125	   tag.

2127	   Since there is no limit to the number of variant, private use, and
2128	   extension subtags, and consequently no limit on the possible length
2129	   of a tag, implementations need to guard against buffer overflow
2130	   attacks.  See Section 2.1.1 for details on language tag truncation,
2131	   which can occur as a consequence of defenses against buffer overflow.

2133	   Although the specification of valid subtags for an extension (see:
2134	   Section 3.6) MUST be available over the Internet, implementations
2135	   SHOULD NOT mechanically depend on it being always accessible, to
2136	   prevent denial-of-service attacks.

2138	7.  Character Set Considerations

2140	   The syntax in this document requires that language tags use only the
2141	   characters A-Z, a-z, 0-9, and HYPHEN-MINUS, which are present in most
2142	   character sets, so the composition of language tags should not have
2143	   any character set issues.
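
A character repertoire check along these lines might look like the
following sketch.  The function name uses_only_tag_characters is
hypothetical, and the check covers only the permitted characters, not
whether the tag is well-formed.

```python
import re

# Only A-Z, a-z, 0-9 and HYPHEN-MINUS are permitted in a language tag.
_ALLOWED = re.compile(r"[A-Za-z0-9-]+")


def uses_only_tag_characters(tag: str) -> bool:
    """Return True if every character of a non-empty tag is permitted."""
    return _ALLOWED.fullmatch(tag) is not None


print(uses_only_tag_characters("zh-Hans-XQ"))  # True
print(uses_only_tag_characters("en_US"))       # False: underscore
print(uses_only_tag_characters("\u0130N"))     # False: U+0130 is not ASCII
```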

2145	   Rendering of characters based on the content of a language tag is not
2146	   addressed in this memo.  Historically, some languages have relied on
2147	   the use of specific character sets or other information in order to
2148	   infer how a specific character should be rendered (notably this
2149	   applies to language and culture specific variations of Han ideographs
2150	   as used in Japanese, Chinese, and Korean).  When language tags are
2151	   applied to spans of text, rendering engines can use that information
2152	   in deciding which font to use in the absence of other information,
2153	   particularly where languages with distinct writing traditions use the
2154	   same characters.

2156	8.  Changes from RFC 3066

2158	   The main goals for this revision of language tags were the following:

2160	   *Compatibility.* All valid RFC 3066 language tags (including those
2161	   in the IANA registry) remain valid in this specification.  Thus
2162	   there is complete backward compatibility of this specification with
2163	   existing content.  In addition, this document defines language tags
2164	   in such a way as to ensure future compatibility, and processors
2165	   based solely on the RFC 3066 ABNF (such as those described in XML
2166	   Schema version 1.0 [20]) will be able to process tags described by
2167	   this document.

2169	   *Stability.* Because of the changes in underlying ISO standards, a
2170	   valid RFC 3066 language tag may become invalid (or have its meaning
2171	   change) at a later date.  With so much of the world's computing
2172	   infrastructure dependent on language tags, this is simply
2173	   unacceptable: it invalidates content that may have an extensive
2174	   shelf-life.  In this specification, once a language tag is valid, it
2175	   remains valid forever.  Previously, there was no way to determine
2176	   when two tags were equivalent.  This specification provides a stable
2177	   mechanism for doing so, through the use of canonical forms.  These
2178	   are also stable, so that implementations can depend on the use of
2179	   canonical forms to assess equivalency.

2181	   *Validity.*  The structure of language tags defined by this document
2182	   makes it possible to determine if a particular tag is well-formed
2183	   without regard for the actual content or "meaning" of the tag as a
2184	   whole.  This is important because the registry and underlying
2185	   standards change over time.  In addition, it must be possible to
2186	   determine if a tag is valid (or not) for a given point in time in
2187	   order to provide reproducible, testable results.  This process must
2188	   not be error-prone; otherwise even intelligent people will generate
2189	   implementations that give different results.  This specification
2190	   provides for that by having a single data file, with specific
2191	   versioning information, so that the validity of language tags at any
2192	   point in time can be precisely determined (instead of interpolating
2193	   values from many separate sources).

2195	   *Extensibility.* It is important to be able to differentiate between
2196	   written forms of language -- for many implementations this is more
2197	   important than distinguishing between spoken variants of a language.
2198	   Languages are written in a wide variety of different scripts, so this
2199	   document provides for the generative use of ISO 15924 script codes.
2200	   Like the generative use of ISO language and country codes in RFC
2201	   3066, this allows combinations to be produced without resorting to
2202	   the registration process.  The addition of UN codes provides for the
2203	   generation of language tags with regional scope, which is also
2204	   required for information technology.

2206	   The recast of the registry from containing whole language tags to
2207	   subtags is a key part of this.  An important feature of RFC 3066 was
2208	   that it allowed generative use of subtags.  This allows people to
2209	   meaningfully use generated tags, without the delays in registering
2210	   whole tags, and the burden on the registry of having to supply all of
2211	   the combinations that people may find useful.

2213	   Because of the widespread use of language tags, it is potentially
2214	   disruptive to have periodic revisions of the core specification,
2215	   despite demonstrated need.  The extension mechanism provides for a
2216	   way for independent RFCs to define extensions to language tags.
2217	   These extensions have a very constrained, well-defined structure to
2218	   prevent extensions from interfering with implementations of language
2219	   tags defined in this document.  The document also anticipates
2220	   features of ISO 639-3 with the addition of the extended language
2221	   subtags, as well as the possibility of other ISO 639 parts becoming
2222	   useful for the formation of language tags in the future.  The use and
2223	   definition of private use tags has also been modified, to allow
2224	   people to move as much information as possible out of private use
2225	   tags, and into the regular structure.  The goal is to dramatically
2226	   reduce the need to produce a revision of this document in the future.

2228	   The specific changes in this document to meet these goals are:

2230	   o  Defines the ABNF and rules for subtags so that the category of all
2231	      subtags can be determined without reference to the registry.

2233	   o  Adds the concept of well-formed vs. validating processors,
2234	      defining the rules by which an implementation can claim to be one
2235	      or the other.

2237	   o  Replaces the IANA language tag registry with a language subtag
2238	      registry that provides a complete list of valid subtags in the
2239	      IANA registry.  This allows for robust implementation and ease of
2240	      maintenance.  The language subtag registry becomes the canonical
2241	      source for forming language tags.

2243	   o  Provides a process that guarantees stability of language tags, by
2244	      handling reuse of values by ISO 639, ISO 15924, and ISO 3166 in
2245	      the event that they register a previously used value for a new
2246	      purpose.

2248	   o  Allows ISO 15924 script code subtags and allows them to be used
2249	      generatively.  Defines a method for indicating in the registry
2250	      when script subtags are necessary for a given language tag.

2252	   o  Adds the concept of a variant subtag and allows variants to be
2253	      used generatively.

2255	   o  Adds the ability to use a class of UN M.49 tags for
2256	      supra-national regions and to resolve conflicts in the assignment
2257	      of ISO 3166 codes.

2259	   o  Defines the private-use tags in ISO 639, ISO 15924, and ISO 3166
2260	      as the mechanism for creating private-use language, script, and
2261	      region subtags respectively.

2263	   o  Adds a well-defined extension mechanism.

2265	   o  Defines an extended language subtag, possibly for use with certain
2266	      anticipated features of ISO 639-3.

2268	   Ed Note: The following items are provided for the convenience of
2269	   reviewers and will be removed from the final document.

2271	   Changes between draft-ietf-ltru-registry-04 and this version are:

2273	   o  Changes to Section 2.1.1.  Incorporated Frank Ellermann's text
2274	      about RFC 2231 and modified some conformance criteria. (#944)

2276	   o  Changed Section 2.2.4 and added UN M.49 to the list of standards
2277	      monitored for changes in Section 3.4, plus added some additional
2278	      squirms to Section 3.3 to ensure that ISO-3166-less UN M.49 codes
2279	      are not registered automagically but may be registered by
2280	      individuals given inaction on the part of ISO 3166 for 180 days.
2281	      Also made the assignments of UN M.49 codes in Section 2.2.4
2282	      normative (MUST instead of 'are').  Finally, the initial rules
2283	      were modified to reflect the foregoing in Section 3.7. (#1026)
2284	      (D.Ewell, P.Constable, A.Phillips)

2286	   o  Added text to Section 3.5 allowing new entries and other changes
2287	      per the rules in Section 3.3 (A.Phillips)

2289	   o  Added text to Section 2.2.4 and Section 3.3 forbidding the
2290	      registration of UN M.49 country or area codes not assigned an ISO
2291	      3166 code. (#1026) (A.Phillips)

2293	   o  Harmonized the rules pertaining to position and number of script
2294	      and region subtags (basically now they say that they MUST occur
2295	      only once and MAY be omitted) (A.Phillips)

2297	   o  Added the homograph paragraph to Section 6. (#967) (R.Presuhn)

2299	9.  References

2301	9.1  Normative References

2303	   [1]   International Organization for Standardization, "ISO 639-
2304	         1:2002, Codes for the representation of names of languages --
2305	         Part 1: Alpha-2 code", ISO Standard 639, 2002.

2307	   [2]   International Organization for Standardization, "ISO 639-2:1998
2308	         - Codes for the representation of names of languages -- Part 2:
2309	         Alpha-3 code - edition 1", 1998.

2311	   [3]   ISO TC46/WG3, "ISO 15924:2003 (E/F) - Codes for the
2312	         representation of names of scripts", January 2004.

2314	   [4]   International Organization for Standardization, "Codes for the
2315	         representation of names of countries, 3rd edition",
2316	         ISO Standard 3166, August 1988.

2318	   [5]   Statistical Division, United Nations, "Standard Country or Area
2319	         Codes for Statistical Use", UN Standard Country or Area Codes
2320	         for Statistical Use, Revision 4 (United Nations publication,
2321	         Sales No. 98.XVII.9), June 1999.

2323	   [6]   International Organization for Standardization, "ISO/IEC 10646-
2324	         1:2000. Information technology -- Universal Multiple-Octet
2325	         Coded Character Set (UCS) -- Part 1: Architecture and Basic
2326	         Multilingual Plane and ISO/IEC 10646-2:2001. Information
2327	         technology -- Universal Multiple-Octet Coded Character Set
2328	         (UCS) -- Part 2: Supplementary Planes, as, from time to time,
2329	         amended, replaced by a new edition or expanded by the addition
2330	         of new parts", 2000.

2332	   [7]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
2333	         Specifications: ABNF", draft-crocker-abnf-rfc2234bis-00 (work
2334	         in progress), March 2005.

2336	   [8]   Bradner, S., "The Internet Standards Process -- Revision 3",
2337	         BCP 9, RFC 2026, October 1996.

2339	   [9]   Hovey, R. and S. Bradner, "The Organizations Involved in the
2340	         IETF Standards Process", BCP 11, RFC 2028, October 1996.

2342	   [10]  Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part
2343	         Three: Message Header Extensions for Non-ASCII Text", RFC 2047,
2344	         November 1996.

2346	   [11]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
2347	         Levels", BCP 14, RFC 2119, March 1997.

2349	   [12]  Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
2350	         Considerations Section in RFCs", BCP 26, RFC 2434,
2351	         October 1998.

2353	   [13]  Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO 10646",
2354	         RFC 2781, February 2000.

2356	   [14]  Carpenter, B., Baker, F., and M. Roberts, "Memorandum of
2357	         Understanding Concerning the Technical Work of the Internet
2358	         Assigned Numbers Authority", RFC 2860, June 2000.

2360	   [15]  Klyne, G. and C. Newman, "Date and Time on the Internet:
2361	         Timestamps", RFC 3339, July 2002.

2363	   [16]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on
2364	         Security Considerations", BCP 72, RFC 3552, July 2003.

2366	9.2  Informative References

2368	   [17]  ISO 639 Joint Advisory Committee, "ISO 639 Joint Advisory
2369	         Committee:  Working principles for ISO 639 maintenance",
2370	         March 2000,
2371	         <http://www.loc.gov/standards/iso639-2/iso639jac_n3r.html>.

2373	   [18]  Raymond, E., "The Art of Unix Programming", 2003.

2375	   [19]  Bray, T., et al., "Extensible Markup Language (XML) 1.0",
2376	         February 2004.

2378	   [20]  Biron, P., Ed. and A. Malhotra, Ed., "XML Schema Part 2:
2379	         Datatypes Second Edition", October 2004,
2380	         <http://www.w3.org/TR/xmlschema-2/>.

2382	   [21]  Unicode Consortium, "The Unicode Consortium. The Unicode
2383	         Standard, Version 4.1.0, defined by: The Unicode Standard,
2384	         Version 4.0 (Boston, MA, Addison-Wesley, 2003. ISBN 0-321-
2385	         18578-1), as amended by Unicode 4.0.1
2386	         (http://www.unicode.org/versions/Unicode4.0.1) and by Unicode
2387	         4.1.0 (http://www.unicode.org/versions/Unicode4.1.0).",
2388	         March 2005.

2390	   [22]  Alvestrand, H., "Tags for the Identification of Languages",
2391	         RFC 1766, March 1995.

2393	   [23]  Freed, N. and K. Moore, "MIME Parameter Value and Encoded Word
2394	         Extensions: Character Sets, Languages, and Continuations",
2395	         RFC 2231, November 1997.

2397	   [24]  Alvestrand, H., "Tags for the Identification of Languages",
2398	         BCP 47, RFC 3066, January 2001.

2400	Authors' Addresses

2402	   Addison Phillips (editor)
2403	   Quest Software

2405	   Email: addison.phillips@quest.com

2407	   Mark Davis (editor)
2408	   IBM

2410	   Email: mark.davis@us.ibm.com

2412	Appendix A.  Acknowledgements

2414	   Any list of contributors is bound to be incomplete; please regard the
2415	   following as only a selection from the group of people who have
2416	   contributed to make this document what it is today.

2418	   The contributors to RFC 3066 and RFC 1766, the precursors of this
2419	   document, made enormous contributions directly or indirectly to this
2420	   document and are generally responsible for the success of language
2421	   tags.

2423	   The following people (in alphabetical order) contributed to this
2424	   document or to RFCs 1766 and 3066:

2426	   Glenn Adams, Harald Tveit Alvestrand, Tim Berners-Lee, Marc Blanchet,
2427	   Nathaniel Borenstein, Eric Brunner, Sean M. Burke, M.T. Carrasco
2428	   Benitez, Jeremy Carroll, John Clews, Jim Conklin, Peter Constable,
2429	   John Cowan, Mark Crispin, Dave Crocker, Martin Duerst, Frank
2430	   Ellermann, Michael Everson, Doug Ewell, Ned Freed, Tim Goodwin, Dirk-
2431	   Willem van Gulik, Marion Gunn, Joel Halpern, Elliotte Rusty Harold,
2432	   Paul Hoffman, Scott Hollenbeck, Richard Ishida, Olle Jarnefors, Kent
2433	   Karlsson, John Klensin, Alain LaBonte, Eric Mader, Ira McDonald,
2434	   Keith Moore, Chris Newman, Masataka Ohta, Randy Presuhn, George
2435	   Rhoten, Markus Scherer, Keld Jorn Simonsen, Thierry Sourbier, Otto
2436	   Stolz, Tex Texin, Andrea Vine, Rhys Weatherley, Misha Wolf, Francois
2437	   Yergeau and many, many others.

2439	   Very special thanks must go to Harald Tveit Alvestrand, who
2440	   originated RFCs 1766 and 3066, and without whom this document would
2441	   not have been possible.  Special thanks must go to Michael Everson,
2442	   who has served as language tag reviewer for almost the complete
2443	   period since the publication of RFC 1766.  Special thanks to Doug
2444	   Ewell, for his production of the first complete subtag registry, and
2445	   his work in producing a test parser for verifying language tags.

2447	Appendix B.  Examples of Language Tags (Informative)

2449	   Simple language subtag:

2451	      de (German)

2453	      fr (French)

2455	      ja (Japanese)

2457	      i-enochian (example of a grandfathered tag)

2459	   Language subtag plus Script subtag:

2461	      zh-Hant (Chinese written using the Traditional Chinese script)

2463	      zh-Hans (Chinese written using the Simplified Chinese script)

2465	      sr-Cyrl (Serbian written using the Cyrillic script)

2467	      sr-Latn (Serbian written using the Latin script)

2469	   Language-Script-Region:

2471	      zh-Hans-CN (Chinese written using the Simplified script as used in
2472	      mainland China)

2474	      sr-Latn-CS (Serbian written using the Latin script as used in
2475	      Serbia and Montenegro)

2477	   Language-Variant:

2479	      en-boont (Boontling dialect of English)

2481	      en-scouse (Scouse dialect of English)

2483	   Language-Region-Variant:

2485	      en-GB-scouse (Scouse dialect of English as used in the UK)

2487	   Language-Script-Region-Variant:

2489	      sl-Latn-IT-nedis (Nadiza dialect of Slovenian written using the
2490	      Latin script as used in Italy.  Note that this tag is NOT
2491	      RECOMMENDED because subtag 'sl' has a Suppress-Script value of
2492	      'Latn')

2494	   Language-Region:

2496	      de-DE (German for Germany)

2498	      en-US (English as used in the United States)

2500	      es-419 (Spanish for Latin America and Caribbean region using the
2501	      UN region code)

2503	   Private-use subtags:

2505	      de-CH-x-phonebk

2507	      az-Arab-x-AZE-derbend

2509	   Extended language subtags (examples ONLY: extended languages MUST be
2510	   defined by revision or update to this document):

2512	      zh-min

2514	      zh-min-nan-Hant-CN

2516	   Private-use registry values:

2518	      x-whatever (private use using the singleton 'x')

2520	      qaa-Qaaa-QM-x-southern (all private tags)

2522	      de-Qaaa (German, with a private script)

2524	      sr-Latn-QM (Serbian, Latin-script, private region)

2526	      sr-Qaaa-CS (Serbian, private script, for Serbia and Montenegro)

2528	   Tags that use extensions (examples ONLY: extensions MUST be defined
2529	   by revision or update to this document or by RFC):

2531	      en-US-u-islamCal

2533	      zh-CN-a-myExt-x-private

2535	      en-a-myExt-b-another

2537	   Some Invalid Tags:

2539	      de-419-DE (two region subtags)
2540	      a-DE (use of a single character subtag in primary position; note
2541	      that there are a few grandfathered tags that start with "i-" that
2542	      are valid)

2544	      ar-a-aaa-b-bbb-a-ccc (two extensions with same single letter
2545	      prefix)
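
   The examples in this appendix can be exercised with a small well-
   formedness check.  The sketch below is an illustration under stated
   assumptions, not the normative grammar: the subtag shapes are
   inferred from the examples above, the function name parse_tag is
   hypothetical, and grandfathered tags such as "i-enochian" are
   deliberately not handled because recognizing them requires a
   registry lookup.

```python
import re

_ALNUM = re.compile(r"[a-z0-9]+")


def parse_tag(tag):
    """Split a tag into components by position and shape.

    Raises ValueError for tags that do not fit the ASSUMED order:
    language, extlang*, script?, region?, variant*, extension*, privateuse?.
    """
    subtags = tag.lower().split("-")
    if not all(_ALNUM.fullmatch(s) and len(s) <= 8 for s in subtags):
        raise ValueError("subtags must be 1-8 ASCII letters or digits")

    out = {"language": None, "extlang": [], "script": None, "region": None,
           "variants": [], "extensions": {}, "privateuse": []}
    n, i = len(subtags), 0

    if subtags[0] != "x":  # a tag may consist of private use only
        if not re.fullmatch(r"[a-z]{2,3}", subtags[0]):
            raise ValueError("primary subtag must be 2 or 3 letters")
        out["language"] = subtags[0]
        i = 1
        while i < n and re.fullmatch(r"[a-z]{3}", subtags[i]):
            out["extlang"].append(subtags[i])
            i += 1
        if i < n and re.fullmatch(r"[a-z]{4}", subtags[i]):
            out["script"] = subtags[i]
            i += 1
        if i < n and re.fullmatch(r"[a-z]{2}|[0-9]{3}", subtags[i]):
            out["region"] = subtags[i]
            i += 1
        while i < n and re.fullmatch(r"[a-z0-9]{5,8}|[0-9][a-z0-9]{3}", subtags[i]):
            out["variants"].append(subtags[i])
            i += 1

    # Extensions: a singleton other than 'x' followed by 2-8 character subtags.
    while i < n and len(subtags[i]) == 1 and subtags[i] != "x":
        singleton = subtags[i]
        if singleton in out["extensions"]:
            raise ValueError(f"duplicate extension singleton {singleton!r}")
        i += 1
        parts = []
        while i < n and len(subtags[i]) >= 2:
            parts.append(subtags[i])
            i += 1
        if not parts:
            raise ValueError(f"extension {singleton!r} has no subtags")
        out["extensions"][singleton] = parts

    # Private use: 'x' consumes everything that follows it.
    if i < n and subtags[i] == "x":
        if i + 1 == n:
            raise ValueError("private-use singleton 'x' has no subtags")
        out["privateuse"] = subtags[i + 1:]
        i = n

    if i != n:
        raise ValueError(f"unexpected subtag {subtags[i]!r} at position {i}")
    return out


if __name__ == "__main__":
    for example in ("sl-Latn-IT-nedis", "de-419-DE", "a-DE", "ar-a-aaa-b-bbb-a-ccc"):
        try:
            print(example, "->", parse_tag(example))
        except ValueError as err:
            print(example, "-> rejected:", err)
```

   Under these assumptions the three tags listed as invalid above are
   all rejected: the second region subtag, the single-character primary
   subtag, and the repeated singleton each raise ValueError.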

2547	Appendix C.  Example Registry

2549	   Example Registry

2551	   File-Date: 2005-04-18
2552	   %%
2553	   Type: language
2554	   Subtag: aa
2555	   Description: Afar
2556	   Added: 2004-07-06
2557	   %%
2558	   Type: language
2559	   Subtag: ab
2560	   Description: Abkhazian
2561	   Added: 2004-07-06
2562	   %%
2563	   Type: language
2564	   Subtag: ae
2565	   Description: Avestan
2566	   Added: 2004-07-06
2567	   %%
2568	   Type: language
2569	   Subtag: ar
2570	   Description: Arabic
2571	   Added: 2004-07-06
2572	   Suppress-Script: Arab
2573	   Comment: Arabic text is usually written in Arabic script
2574	   %%
2575	   Type: language
2576	   Subtag: qaa..qtz
2577	   Description: PRIVATE USE
2578	   Added: 2004-08-01
2579	   Comment: Use private use codes in preference
2580	     to the x- singleton for primary language
2581	   Comment: This is an example of two comments.
2582	   %%
2583	   Type: script
2584	   Subtag: Arab
2585	   Description: Arabic
2586	   Added: 2004-07-06
2587	   %%
2588	   Type: script
2589	   Subtag: Armn
2590	   Description: Armenian
2591	   Added: 2004-07-06
2592	   %%
2593	   Type: script
2594	   Subtag: Bali
2595	   Description: Balinese
2596	   Added: 2004-07-06
2597	   %%
2598	   Type: script
2599	   Subtag: Batk
2600	   Description: Batak
2601	   Added: 2004-07-06
2602	   %%
2603	   Type: region
2604	   Subtag: AA
2605	   Description: PRIVATE USE
2606	   Added: 2004-08-01
2607	   %%
2608	   Type: region
2609	   Subtag: AD
2610	   Description: Andorra
2611	   Added: 2004-07-06
2612	   %%
2613	   Type: region
2614	   Subtag: AE
2615	   Description: United Arab Emirates
2616	   Added: 2004-07-06
2617	   %%
2618	   Type: region
2619	   Subtag: AX
2620	   Description: &#xC5;land Islands
2621	   Added: 2004-07-06
2622	   Comments: The description shows a Unicode escape
2623	     for the letter A-ring.
2624	   %%
2625	   Type: region
2626	   Subtag: 001
2627	   Description: World
2628	   Added: 2004-07-06
2629	   %%
2630	   Type: region
2631	   Subtag: 002
2632	   Description: Africa
2633	   Added: 2004-07-06
2634	   %%
2635	   Type: region
2636	   Subtag: 003
2637	   Description: North America
2638	   Added: 2004-07-06
2639	   %%
2640	   Type: variant
2641	   Subtag: 1901
2642	   Description: Traditional German
2643	      orthography
2644	   Added: 2004-09-09
2645	   Prefix: de
2646	   Comment: <shows continuation>
2647	   %%
2648	   Type: variant
2649	   Subtag: 1996
2650	   Description: German orthography of 1996
2651	   Added: 2004-09-09
2652	   Prefix: de
2653	   %%
2654	   Type: variant
2655	   Subtag: boont
2656	   Description: Boontling
2657	   Added: 2003-02-14
2658	   Prefix: en
2659	   %%
2660	   Type: variant
2661	   Subtag: gaulish
2662	   Description: Gaulish
2663	   Added: 2001-05-25
2664	   Prefix: cel
2665	   %%
2666	   Type: grandfathered
2667	   Tag: art-lojban
2668	   Description: Lojban
2669	   Added: 2001-11-11
2670	   Canonical: jbo
2671	   Deprecated: 2003-09-02
2672	   %%
2673	   Type: grandfathered
2674	   Tag: en-GB-oed
2675	   Description: English, Oxford English Dictionary spelling
2676	   Added: 2003-07-09
2677	   %%
2678	   Type: grandfathered
2679	   Tag: i-ami
2680	   Description: 'Amis
2681	   Added: 1999-05-25
2682	   %%
2683	   Type: grandfathered
2684	   Tag: i-bnn
2685	   Description: Bunun
2686	   Added: 1999-05-25
2687	   %%
2688	   Type: redundant
2689	   Tag: az-Arab
2690	   Description: Azerbaijani in Arabic script
2691	   Added: 2003-05-30
2692	   %%
2693	   Type: redundant
2694	   Tag: az-Cyrl
2695	   Description: Azerbaijani in Cyrillic script
2696	   Added: 2003-05-30
2697	   %%

2699	                 Figure 9: Example of the Registry Format
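
   The record layout in Figure 9 can be read with a few lines of code.
   The following is a hedged sketch rather than a reference parser.  It
   assumes records are separated by lines containing only "%%", that
   each field is a "Name: value" line, that a line starting with
   whitespace continues the previous field, and that the input does
   not carry the three-space indentation used in this document.  The
   name parse_registry is hypothetical.

```python
def parse_registry(text):
    """Parse registry text into a list of records.

    Each record maps a field name to a list of values, because a field
    (for example Comment) may appear more than once in one record.
    """
    records = []
    current = {}
    last_field = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # ignore blank lines
        if line.strip() == "%%":
            if current:
                records.append(current)
            current, last_field = {}, None
        elif line[0] in " \t":
            # Continuation line: fold it into the most recent field value.
            if last_field is None:
                raise ValueError("continuation line without a field")
            current[last_field][-1] += " " + line.strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a field line: {line!r}")
            last_field = name.strip()
            current.setdefault(last_field, []).append(value.strip())
    if current:
        records.append(current)
    return records


# A short excerpt of the example registry, without document indentation.
SAMPLE = """\
File-Date: 2005-04-18
%%
Type: language
Subtag: qaa..qtz
Description: PRIVATE USE
Added: 2004-08-01
Comment: Use private use codes in preference
  to the x- singleton for primary language
Comment: This is an example of two comments.
%%
Type: variant
Subtag: 1901
Description: Traditional German
   orthography
Added: 2004-09-09
Prefix: de
%%
"""

if __name__ == "__main__":
    for record in parse_registry(SAMPLE):
        print(record)
```

   Values are kept as lists so that the two Comment fields of the
   "qaa..qtz" record survive as separate entries, and the folded
   Description of "1901" is rejoined into a single string.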

2701	Intellectual Property Statement

2703	   The IETF takes no position regarding the validity or scope of any
2704	   Intellectual Property Rights or other rights that might be claimed to
2705	   pertain to the implementation or use of the technology described in
2706	   this document or the extent to which any license under such rights
2707	   might or might not be available; nor does it represent that it has
2708	   made any independent effort to identify any such rights.  Information
2709	   on the procedures with respect to rights in RFC documents can be
2710	   found in BCP 78 and BCP 79.

2712	   Copies of IPR disclosures made to the IETF Secretariat and any
2713	   assurances of licenses to be made available, or the result of an
2714	   attempt made to obtain a general license or permission for the use of
2715	   such proprietary rights by implementers or users of this
2716	   specification can be obtained from the IETF on-line IPR repository at
2717	   http://www.ietf.org/ipr.

2719	   The IETF invites any interested party to bring to its attention any
2720	   copyrights, patents or patent applications, or other proprietary
2721	   rights that may cover technology that may be required to implement
2722	   this standard.  Please address the information to the IETF at
2723	   ietf-ipr@ietf.org.

2725	Disclaimer of Validity

2727	   This document and the information contained herein are provided on an
2728	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2729	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
2730	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
2731	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
2732	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2733	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2735	Copyright Statement

2737	   Copyright (C) The Internet Society (2005).  This document is subject
2738	   to the rights, licenses and restrictions contained in BCP 78, and
2739	   except as set forth therein, the authors retain all their rights.

2741	Acknowledgment

2743	   Funding for the RFC Editor function is currently provided by the
2744	   Internet Society.