technical title for collation

idnits 2.17.1 

draft-newman-i18n-comparator-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 17.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1566.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1543.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1550.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1556.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There is 1 instance of too long lines in the document, the longest one
     being 52 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 821 has weird spacing: '...=accent  e, o,...'

  == Line 822 has weird spacing: '...ch=case    e, ...'

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 23, 2006) is 6545 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'RFC XXXX'

  ** Obsolete normative reference: RFC 4234 (ref. '2') (Obsoleted by RFC 5234)

  ** Obsolete normative reference: RFC 3066 (ref. '5') (Obsoleted by RFC
     4646, RFC 4647)

  ** Obsolete normative reference: RFC 3454 (ref. '6') (Obsoleted by RFC 7564)

  ** Obsolete normative reference: RFC 3491 (ref. '7') (Obsoleted by RFC 5891)

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'

  -- Obsolete informational reference (is this intentional?): RFC 2222 (ref.
     '11') (Obsoleted by RFC 4422, RFC 4752)

  -- Obsolete informational reference (is this intentional?): RFC 2822 (ref.
     '13') (Obsoleted by RFC 5322)

  -- Obsolete informational reference (is this intentional?): RFC 3028 (ref.
     '15') (Obsoleted by RFC 5228, RFC 5429)

  -- Obsolete informational reference (is this intentional?): RFC 3501 (ref.
     '16') (Obsoleted by RFC 9051)

  == Outdated reference: A later version (-20) exists of
     draft-ietf-imapext-sort-17

  == Outdated reference: A later version (-15) exists of
     draft-ietf-imapext-i18n-06


     Summary: 8 errors (**), 0 flaws (~~), 7 warnings (==), 15 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                                          C. Newman
2	Internet-Draft                                          Sun Microsystems
3	Expires: November 24, 2006                                     M. Duerst
4	                                                                     AGU
5	                                                          A. Gulbrandsen
6	                                                                    Oryx
7	                                                            May 23, 2006

9	            Internet Application Protocol Collation Registry
10	                  draft-newman-i18n-comparator-11.txt

12	Status of this Memo

14	   By submitting this Internet-Draft, each author represents that any
15	   applicable patent or other IPR claims of which he or she is aware
16	   have been or will be disclosed, and any of which he or she becomes
17	   aware will be disclosed, in accordance with Section 6 of BCP 79.

19	   Internet-Drafts are working documents of the Internet Engineering
20	   Task Force (IETF), its areas, and its working groups.  Note that
21	   other groups may also distribute working documents as Internet-
22	   Drafts.

24	   Internet-Drafts are draft documents valid for a maximum of six months
25	   and may be updated, replaced, or obsoleted by other documents at any
26	   time.  It is inappropriate to use Internet-Drafts as reference
27	   material or to cite them other than as "work in progress."

29	   The list of current Internet-Drafts can be accessed at
30	   http://www.ietf.org/ietf/1id-abstracts.txt.

32	   The list of Internet-Draft Shadow Directories can be accessed at
33	   http://www.ietf.org/shadow.html.

35	   This Internet-Draft will expire on November 24, 2006.

37	Copyright Notice

39	   Copyright (C) The Internet Society (2006).

41	Abstract

43	   Many Internet application protocols include string-based lookup,
44	   searching, or sorting operations.  However the problem space for
45	   searching and sorting international strings is large, not fully
46	   explored, and is outside the area of expertise for the Internet
47	   Engineering Task Force (IETF).  Rather than attempt to solve such a
48	   large problem, this specification creates an abstraction framework so
49	   that application protocols can precisely identify a comparison
50	   function and the repertoire of comparison functions can be extended
51	   in the future.

53	Table of Contents

55	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
56	     1.1.   Conventions Used in this Document . . . . . . . . . . . .  4
57	   2.  Collation Definition and Purpose . . . . . . . . . . . . . . .  4
58	     2.1.   Definition  . . . . . . . . . . . . . . . . . . . . . . .  4
59	     2.2.   Purpose . . . . . . . . . . . . . . . . . . . . . . . . .  4
60	     2.3.   Some Other Terms Used in this Document  . . . . . . . . .  5
61	     2.4.   Sort Keys . . . . . . . . . . . . . . . . . . . . . . . .  5
62	   3.  Collation Name Syntax  . . . . . . . . . . . . . . . . . . . .  6
63	     3.1.   Basic Syntax  . . . . . . . . . . . . . . . . . . . . . .  6
64	     3.2.   Wildcards . . . . . . . . . . . . . . . . . . . . . . . .  6
65	     3.3.   Ordering Direction  . . . . . . . . . . . . . . . . . . .  6
66	     3.4.   URIs  . . . . . . . . . . . . . . . . . . . . . . . . . .  7
67	     3.5.   Naming Guidelines . . . . . . . . . . . . . . . . . . . .  7
68	   4.  Collation Specification Requirements . . . . . . . . . . . . .  8
69	     4.1.   Collation/Server Interface  . . . . . . . . . . . . . . .  8
70	     4.2.   Operations Supported  . . . . . . . . . . . . . . . . . .  8
71	       4.2.1.  Validity . . . . . . . . . . . . . . . . . . . . . . .  8
72	       4.2.2.  Equality . . . . . . . . . . . . . . . . . . . . . . .  9
73	       4.2.3.  Substring  . . . . . . . . . . . . . . . . . . . . . .  9
74	       4.2.4.  Ordering . . . . . . . . . . . . . . . . . . . . . . . 10
75	     4.3.   Sort Keys . . . . . . . . . . . . . . . . . . . . . . . . 10
76	     4.4.   Use of Lookup Tables  . . . . . . . . . . . . . . . . . . 10
77	   5.  Application Protocol Requirements  . . . . . . . . . . . . . . 10
78	     5.1.   Character Encoding  . . . . . . . . . . . . . . . . . . . 11
79	     5.2.   Operations  . . . . . . . . . . . . . . . . . . . . . . . 11
80	     5.3.   Wildcards . . . . . . . . . . . . . . . . . . . . . . . . 12
81	     5.4.   Canonicalization Function . . . . . . . . . . . . . . . . 12
82	     5.5.   Disconnected Clients  . . . . . . . . . . . . . . . . . . 12
83	     5.6.   Error Codes . . . . . . . . . . . . . . . . . . . . . . . 12
84	     5.7.   Octet Collation . . . . . . . . . . . . . . . . . . . . . 12
85	   6.  Use by Existing Protocols  . . . . . . . . . . . . . . . . . . 13
86	   7.  Collation Registration . . . . . . . . . . . . . . . . . . . . 13
87	     7.1.   Collation Registration Procedure  . . . . . . . . . . . . 13
88	     7.2.   Collation Registration Format . . . . . . . . . . . . . . 14
89	       7.2.1.  Registration Template  . . . . . . . . . . . . . . . . 14
90	       7.2.2.  The collation Element  . . . . . . . . . . . . . . . . 14
91	       7.2.3.  The name Element . . . . . . . . . . . . . . . . . . . 15
92	       7.2.4.  The title Element  . . . . . . . . . . . . . . . . . . 15
93	       7.2.5.  The operations Element . . . . . . . . . . . . . . . . 15
94	       7.2.6.  The specification Element  . . . . . . . . . . . . . . 15
95	       7.2.7.  The submitter Element  . . . . . . . . . . . . . . . . 16
96	       7.2.8.  The owner Element  . . . . . . . . . . . . . . . . . . 16
97	       7.2.9.  The version Element  . . . . . . . . . . . . . . . . . 16
98	       7.2.10. The UnicodeVersion Element . . . . . . . . . . . . . . 16
99	       7.2.11. The UCAVersion Element . . . . . . . . . . . . . . . . 16
100	       7.2.12. The UCAMatchLevel Element  . . . . . . . . . . . . . . 16
101	     7.3.   DTD for Collation Registration  . . . . . . . . . . . . . 17
102	     7.4.   Structure of Collation Registry . . . . . . . . . . . . . 17
103	     7.5.   Example Initial Registry Summary  . . . . . . . . . . . . 18
104	   8.  Guidelines for Expert Reviewer . . . . . . . . . . . . . . . . 18
105	   9.  Initial Collations . . . . . . . . . . . . . . . . . . . . . . 19
106	     9.1.   ASCII Numeric Collation . . . . . . . . . . . . . . . . . 19
107	       9.1.1.  ASCII Numeric Collation Description  . . . . . . . . . 19
108	       9.1.2.  ASCII Numeric Collation Registration . . . . . . . . . 20
109	     9.2.   ASCII Casemap Collation . . . . . . . . . . . . . . . . . 20
110	       9.2.1.  ASCII Casemap Collation Description  . . . . . . . . . 20
111	       9.2.2.  ASCII Casemap Collation Registration . . . . . . . . . 21
112	     9.3.   Nameprep Collation  . . . . . . . . . . . . . . . . . . . 21
113	       9.3.1.  Nameprep Collation Description . . . . . . . . . . . . 21
114	       9.3.2.  Nameprep Collation Registration  . . . . . . . . . . . 22
115	     9.4.   Basic Collation . . . . . . . . . . . . . . . . . . . . . 22
116	       9.4.1.  Basic Collation Description  . . . . . . . . . . . . . 22
117	       9.4.2.  Basic Collation Registration . . . . . . . . . . . . . 24
118	       9.4.3.  Basic Accent Sensitive Match Collation Registration  . 25
119	       9.4.4.  Basic Case Sensitive Match Collation Registration  . . 25
120	     9.5.   Octet Collation . . . . . . . . . . . . . . . . . . . . . 25
121	       9.5.1.  Octet Collation Description  . . . . . . . . . . . . . 25
122	       9.5.2.  Octet Collation Registration . . . . . . . . . . . . . 26
123	   10. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 26
124	   11. Security Considerations  . . . . . . . . . . . . . . . . . . . 27
125	   12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 27
126	   13. Open Issues  . . . . . . . . . . . . . . . . . . . . . . . . . 27
127	   14. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 28
128	     14.1.  Changes From -10  . . . . . . . . . . . . . . . . . . . . 28
129	     14.2.  Changes From -09  . . . . . . . . . . . . . . . . . . . . 28
130	     14.3.  Changes From -08  . . . . . . . . . . . . . . . . . . . . 29
131	     14.4.  Changes From -06  . . . . . . . . . . . . . . . . . . . . 29
132	     14.5.  Changes From -05  . . . . . . . . . . . . . . . . . . . . 30
133	     14.6.  Changes From -04  . . . . . . . . . . . . . . . . . . . . 30
134	     14.7.  Changes From -03  . . . . . . . . . . . . . . . . . . . . 30
135	     14.8.  Changes From -02  . . . . . . . . . . . . . . . . . . . . 30
136	     14.9.  Changes From -01  . . . . . . . . . . . . . . . . . . . . 31
137	     14.10. Changes From -00  . . . . . . . . . . . . . . . . . . . . 31
138	   15. References . . . . . . . . . . . . . . . . . . . . . . . . . . 31
139	     15.1.  Normative References  . . . . . . . . . . . . . . . . . . 31
140	     15.2.  Informative References  . . . . . . . . . . . . . . . . . 32
141	   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 34
142	   Intellectual Property and Copyright Statements . . . . . . . . . . 35

144	1.  Introduction

146	   The ACAP [12] specification introduced the concept of a comparator
147	   (which we call collation in this document), but failed to create an
148	   IANA registry.  With the introduction of stringprep [6] and the
149	   Unicode Collation Algorithm [8], it is now time to create that
150	   registry and populate it with some initial values appropriate for an
151	   international community.  This specification replaces and generalizes
152	   the definition of a comparator in ACAP and creates a collation
153	   registry.

155	1.1.  Conventions Used in this Document

157	   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
158	   in this document are to be interpreted as defined in "Key words for
159	   use in RFCs to Indicate Requirement Levels" [1].

161	   The attribute syntax specifications use the Augmented Backus-Naur
162	   Form (ABNF) [2] notation including the core rules defined in Appendix
163	   A.  This document also inherits ABNF rules from Language Tags [5].

165	2.  Collation Definition and Purpose

167	2.1.  Definition

169	   A collation is a named function which takes two arbitrary length
170	   strings as input and can be used to perform one or more of three
171	   basic comparison operations: equality test, substring match, and
172	   ordering test.

174	2.2.  Purpose

176	   Collations are an abstraction for comparison functions so that these
177	   comparison functions can be used in multiple protocols.  The details
178	   of a particular comparison operation can be specified by someone with
179	   appropriate expertise independent of the application protocols that
180	   use that collation.  This is similar to the way a charset [14]
181	   separates the details of octet to character mapping from a protocol
182	   specification such as MIME [10] or the way SASL [11] separates the
183	   details of an authentication mechanism from a protocol specification
184	   such as ACAP [12].

186	   Here is a small diagram to help illustrate the value of this
187	   abstraction layer:

189	   +-------------------+                         +-----------------+
190	   | IMAP i18n SEARCH  |--+                      | Basic           |
191	   +-------------------+  |                   +--| Collation Spec  |
192	                          |                   |  +-----------------+
193	   +-------------------+  |  +-------------+  |  +-----------------+
194	   | ACAP i18n SEARCH  |--+--| Collation   |--+--| A stringprep    |
195	   +-------------------+  |  | Registry    |  |  | Collation Spec  |
196	                          |  +-------------+  |  +-----------------+
197	   +-------------------+  |                   |  +-----------------+
198	   | ...other protocol |--+                   |  | locale-specific |
199	   +-------------------+                      +--| Collation Spec  |
200	                                                 +-----------------+

202	   Thus IMAP, ACAP and future application protocols with international
203	   search capability simply specify how to interface to the collation
204	   registry instead of each protocol specification having to specify all
205	   the collations it supports.

207	2.3.  Some Other Terms Used in this Document

209	   The terms client, server and protocol are used in somewhat unusual
210	   senses.

212	   Client means a user, or a program acting directly on behalf of a
213	   user.  This may be a mail reader acting as an IMAP client, or it may
214	   be an interactive shell where the user can type protocol commands
215	   directly, or it may be a script or program written by the user.

217	   Server means a program that performs services requested by the
218	   client.  This may be a traditional server such as an HTTP server, or
219	   it may be a Sieve [15] interpreter running a Sieve script written by
220	   a user.  A server needs to use the operations provided by collations
221	   in order to fulfill the client's requests.

223	   The protocol describes how the client tells the server what it wants
224	   done, and (if applicable) how the server tells the client about the
225	   results.  IMAP is a protocol by this definition, and so is the Sieve
226	   language.

228	2.4.  Sort Keys

230	   One component of a collation is a transformation which turns a string
231	   into a sort key, which is then used while sorting.

233	   The transformation can range from an identity mapping (e.g., the
234	   i;octet collation, Section 9.5) to a mapping which makes the string
235	   unreadable to a human (e.g., the basic collation, Section 9.4).

237	   The sort key is an implementation detail of collations or servers.  A
238	   protocol SHOULD NOT expose it, since some collations leave the sort
239	   key's format up to the implementation, and current conformant
240	   implementations are known to use different formats.

242	3.  Collation Name Syntax

244	3.1.  Basic Syntax

246	   The collation name itself is a single US-ASCII string beginning with
247	   a letter and made up of letters, digits, and one of the following 4
248	   symbols: "-", ";", "=" and ".".  The name MUST NOT be longer than 254
249	   characters.

251	     collation-char  =  ALPHA / DIGIT / "-" / ";" / "=" / "."

253	     collation-name  =  ALPHA *253collation-char

255	   The name "default" is reserved.  For protocols which have a default
256	   collation, "default" refers to that collation.  For other protocols,
257	   the name "default" matches no collations, and servers SHOULD treat it
258	   in the same way as they treat names of nonexistent collations.
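
	   The collation-name production above can be checked mechanically.
	   The following is a minimal sketch in Python; the helper name
	   is_collation_name is an assumption made for illustration and is
	   not defined by this specification.

```python
import re

# collation-char = ALPHA / DIGIT / "-" / ";" / "=" / "."
# collation-name = ALPHA *253collation-char
# One leading letter plus at most 253 further characters gives the
# 254-character limit stated in the text.
COLLATION_NAME = re.compile(r"[A-Za-z][A-Za-z0-9;=.\-]{0,253}")


def is_collation_name(name: str) -> bool:
    """Return True if name matches the collation-name production."""
    return COLLATION_NAME.fullmatch(name) is not None
```

	   Note that the reserved name "default" is syntactically valid, so
	   a caller still has to handle it separately, as described above.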

260	3.2.  Wildcards

262	   The string a client uses to select a collation MAY contain one or
263	   more wildcard ("*") characters, each of which matches zero or more
264	   collation-chars.  Wildcard characters MUST NOT be adjacent.  If the
265	   wildcard string matches multiple collations, the server SHOULD
266	   select the collation with the broadest scope (preferably
267	   international scope), the most recent table versions and the
268	   greatest number of supported operations.

270	     collation-wild  =  ("*" / (ALPHA ["*"])) *(collation-char ["*"])
271	                         ; MUST NOT exceed 254 characters total
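
	   As an illustration of the wildcard rules, the sketch below
	   validates a collation-wild string and lists the names it matches.
	   The function names and the example name "en;basic" are
	   hypothetical, names are compared exactly as written (an
	   assumption), and the server's preference among several matches
	   (scope, table versions, operations) is not modelled.

```python
import re

CHAR = r"[A-Za-z0-9;=.\-]"  # collation-char


def is_collation_wild(pattern: str) -> bool:
    """Check a pattern against the collation-wild production."""
    if not pattern or len(pattern) > 254 or "**" in pattern:
        return False  # empty, too long, or adjacent wildcards
    # First symbol is "*" or ALPHA; the rest are collation-chars or "*".
    regex = r"(?:\*|[A-Za-z])(?:" + CHAR + r"|\*)*"
    return re.fullmatch(regex, pattern) is not None


def matching_collations(pattern, available):
    """Return the names in available that the wildcard pattern matches."""
    if not is_collation_wild(pattern):
        raise ValueError("not a valid collation-wild: %r" % pattern)
    # Each "*" stands for zero or more collation-chars.
    parts = [CHAR + "*" if c == "*" else re.escape(c) for c in pattern]
    compiled = re.compile("".join(parts))
    return [name for name in available if compiled.fullmatch(name)]
```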

273	3.3.  Ordering Direction

275	   When used as a protocol element for ordering, the collation name MAY
276	   be prefixed by either "+" or "-" to explicitly specify an ordering
277	   direction. "+" has no effect on the ordering operation, while "-"
278	   inverts the result of the ordering operation.  In general, collation-
279	   order is used when a client requests a collation, and collation-
280	   selected is used when the server informs the client of the selected
281	   collation.

283	     collation-selected =  ["+" / "-"] collation-name

285	     collation-order =  ["+" / "-"] collation-wild
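
	   A receiver might separate the optional direction prefix from the
	   rest of the token as sketched below.  The helper name
	   split_direction is an assumption; validation of the remaining
	   name or wildcard is left to the caller.

```python
from typing import Tuple


def split_direction(token: str) -> Tuple[bool, str]:
    """Split a collation-order or collation-selected token.

    Returns (reverse, rest).  reverse is True only for a "-" prefix;
    "+" and no prefix both mean the collation's own ordering.
    """
    if token[:1] in ("+", "-"):
        return token[0] == "-", token[1:]
    return False, token
```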

287	3.4.  URIs

289	   Some protocols are designed to use URIs [4] to refer to collations
290	   rather than simple tokens.  A special section of the IANA web page is
291	   reserved for such usage.  The "collation-uri" form is used to refer
292	   to a specific IANA registry entry for a specific named collation (the
293	   collation registration may not actually be present if it is
294	   experimental).  The "collation-auri" form is an abstract name for an
295	   ordering, a collation pattern or a vendor private collator.

297	     collation-uri   =  "http://www.iana.org/assignments/collation/"
298	                        collation-name ".xml"

300	     collation-auri  =  ( "http://www.iana.org/assignments/collation/"
301	                        collation-order ".xml" ) / other-uri

303	     other-uri       =  <absoluteURI>
304	                     ;  excluding the IANA collation namespace.
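
	   The collation-uri form is a plain concatenation, which the
	   following sketch makes explicit.  The function names are
	   assumptions, and the sketch does not check that the name is a
	   valid collation-name.

```python
from typing import Optional

IANA_PREFIX = "http://www.iana.org/assignments/collation/"
SUFFIX = ".xml"


def collation_uri(name: str) -> str:
    """Build the collation-uri for a collation name."""
    return IANA_PREFIX + name + SUFFIX


def name_from_uri(uri: str) -> Optional[str]:
    """Recover the name from a collation-uri, or None for an other-uri."""
    if uri.startswith(IANA_PREFIX) and uri.endswith(SUFFIX):
        return uri[len(IANA_PREFIX):-len(SUFFIX)]
    return None
```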

306	3.5.  Naming Guidelines

308	   While this specification makes no absolute requirements on the
309	   structure of collation names, naming consistency is important, so the
310	   following initial guidelines are provided.

312	   Collation names with an international audience typically begin with
313	   "i;".  Collation names intended for a particular language or locale
314	   typically begin with a language tag [5] followed by a ";".  After the
315	   first ";" is normally the name of the general collation algorithm,
316	   followed by a series of algorithm modifications separated by the ";"
317	   delimiter.  Parameterized modifications will use "=" to delimit the
318	   parameter from the value.  The version numbers of any lookup tables
319	   used by the algorithm SHOULD be present as parameterized
320	   modifications.

322	   Collation names of the form *;vnd-domain.com;* are reserved for
323	   vendor-specific collations created by the owner of the domain name
324	   following the "vnd-" prefix (e.g. vnd-example.com for the vendor
325	   example.com).  Registration of such collations (or the name space as
326	   a whole) with intended use of "Vendor" is encouraged when a public
327	   specification or open-source implementation is available, but is not
328	   required.
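
	   Because the guidelines above describe a typical layout rather
	   than a required one, software should not depend on them.  Still,
	   a name that follows them can be taken apart as in this sketch.
	   The function name, the returned dictionary layout and the example
	   names used with it are all hypothetical.

```python
def parse_collation_name(name: str) -> dict:
    """Split a name into scope, algorithm and modifiers.

    Assumes the typical layout: scope ";" algorithm *(";" modifier),
    where a parameterized modifier is written parameter "=" value.
    """
    scope, _, rest = name.partition(";")
    parts = rest.split(";") if rest else []
    modifiers = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        modifiers[key] = value if sep else None
    # A component of the form vnd-<domain> marks a vendor collation.
    vendor = next(
        (p[len("vnd-"):] for p in parts if p.startswith("vnd-")), None
    )
    return {
        "scope": scope,
        "international": scope == "i",
        "algorithm": parts[0] if parts else "",
        "modifiers": modifiers,
        "vendor": vendor,
    }
```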

330	4.  Collation Specification Requirements

332	4.1.  Collation/Server Interface

334	   The collation itself defines what it operates on.  Most collations
335	   are expected to operate on character strings.  The i;octet
336	   (Section 9.5) collation operates on octet strings.  The
337	   i;ascii-numeric (Section 9.1) collation operates on numbers.

339	   This specification defines the collation interface in terms of octet
340	   strings.  However, implementations may choose to use character
341	   strings instead.  Such implementations may not be able to implement
342	   e.g. i;octet.  Since i;octet is not currently mandatory to implement
343	   for any protocol, this should not be a problem.

345	4.2.  Operations Supported

347	   A collation specification MUST state which of the three basic
348	   operations are supported (equality, substring, ordering) and how to
349	   perform each of the supported operations on any two input character
350	   strings including empty strings.  Collations must be deterministic,
351	   i.e. given a collation with a specific name, and any two fixed input
352	   strings, the result MUST be the same for the same operation.

354	   In general, collation operations should behave as their names
355	   suggest.  While a collation may be new, the operations are not, so
356	   the new collation's operations should be similar to those of older
357	   collations.  For example, a date/time collation should not provide a
358	   "substring" operation that would morph IMAP substring SEARCH into
359	   e.g. a date-range search.

361	   A nonobvious consequence of the rules for each collation operation is
362	   that for any single collation, either none or all of the operations
363	   can return "undefined".  For example, it is not possible to have an
364	   equality operation that never returns "undefined" and a substring
365	   operation that occasionally does.

367	4.2.1.  Validity

369	   The validity test takes one string as argument.  It returns valid if
370	   its input string is valid input to the collation's other operations,
371	   and invalid if not.  (In other words, a string is valid if it is
372	   equal to itself according to the collation's equality operation.)

374	   The validity test is provided by all collations.  It MUST NOT be
375	   listed separately in the collation registration.

377	4.2.2.  Equality

379	   The equality test always returns "match" or "no-match" when supplied
380	   valid input, and MAY return "undefined" if one or both input strings
381	   are not valid.

383	   The equality test MUST be reflexive and symmetric.  For valid input,
384	   it MUST be transitive.

386	   If a collation provides either a substring or an ordering test, it
387	   MUST also provide an equality test.  The substring and/or ordering
388	   tests MUST be consistent with the equality test.

390	   In this specification, the return values of the equality test are
391	   called "match", "no-match" and "undefined".  This is not a
392	   specification, merely a choice of phrasing.
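
	   The relationship between validity and equality can be shown with
	   a toy collation.  The sketch below is not any registered
	   collation: it is a made-up, ASCII-only, case-insensitive
	   comparison over octet strings in which any octet above 0x7F makes
	   the input invalid.

```python
def is_valid(s: bytes) -> bool:
    """Validity test of the toy collation: only ASCII octets are valid."""
    return all(octet < 0x80 for octet in s)


def equality(a: bytes, b: bytes) -> str:
    """Return "match", "no-match" or "undefined"."""
    if not (is_valid(a) and is_valid(b)):
        return "undefined"
    # bytes.upper() changes only the ASCII letters a-z.
    return "match" if a.upper() == b.upper() else "no-match"
```

	   With this definition a string is valid exactly when
	   equality(s, s) returns "match", and the result does not depend on
	   the order of the arguments.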

394	4.2.3.  Substring

396	   The substring matching operation determines if the first string is a
397	   substring of the second string, i.e. if one or more substrings of the
398	   second string are equal to the first, as defined by the collation's
399	   equality operation.

401	   A collation which supports substring matching will automatically
402	   support two special cases of substring matching, prefix and suffix
403	   matching, if those special cases are supported by the application
404	   protocol.  The operation returns "match" or "no-match" when supplied
405	   valid input and returns "undefined" when supplied invalid input.

407	   Application protocols MAY return position information for substring
408	   matches.  If this is done, the position information SHOULD include
409	   both the starting offset and the ending offset for each match.  This
410	   is important because more sophisticated collations can match strings
411	   of unequal length (for example, a pre-composed accented character can
412	   match a decomposed accented character).  All matching substrings
413	   should be reported, even overlapping matches (as when "ana" occurs
414	   twice within "banana").

416	   A string is a substring of itself.  The empty string is a substring
417	   of all strings.

419	   Note that the substring operation of some collations can match
420	   strings of unequal length.  For example, a pre-composed accented
421	   character can match a decomposed accented character.  Unicode
422	   Collation Algorithm [8] discusses this in more detail.

424	   In this specification, the return values of the substring operation
425	   are called "match", "no-match" and "undefined".  This is not a
426	   specification, merely a choice of phrasing.
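
	   The following sketch shows why both offsets are needed.  It uses
	   equality under Unicode Normalization Form C as a stand-in
	   equality operation (an assumption; it is not one of the
	   collations in this document), counts offsets in code points
	   (also an assumption), and tests every substring by brute force,
	   which is only suitable for short strings.

```python
import unicodedata
from typing import List, Tuple


def equal(a: str, b: str) -> bool:
    """Stand-in equality: compare after NFC normalization."""
    nfc = unicodedata.normalize
    return nfc("NFC", a) == nfc("NFC", b)


def substring_matches(needle: str, haystack: str) -> List[Tuple[int, int]]:
    """Return every (start, end) with haystack[start:end] equal to needle.

    Overlapping matches are all reported.
    """
    n = len(haystack)
    return [
        (start, end)
        for start in range(n + 1)
        for end in range(start, n + 1)
        if equal(needle, haystack[start:end])
    ]
```

	   Searching for "ana" in "banana" yields (1, 4) and (3, 6).
	   Searching for the single pre-composed character U+00E9 in
	   "cafe" followed by U+0301 yields (3, 5): the match is two code
	   points long although the search string is one.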

428	4.2.4.  Ordering

430	   The ordering operation determines how two strings are ordered.  It
431	   MUST be trichotomous and reflexive.  For valid input, it MUST be
432	   transitive.

434	   Ordering returns "less" if the first string is listed before the
435	   second string according to the collation, "greater" if the second
436	   string is listed before the first string, and "equal" if the two
437	   strings are equal as defined by the collation's equality operation.
438	   If one or both strings are invalid, the result of ordering is
439	   "undefined".

441	   When the collation is used with a "+" prefix, the behavior is the
442	   same as when used with no prefix.  When the collation is used with a
443	   "-" prefix, the result of the ordering operation of the collation
444	   MUST be reversed.

446	   In this specification, the return values of the ordering operation
447	   are called "less", "equal", "greater" and "undefined".  This is not a
448	   specification, merely a choice of phrasing.
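
	   A sketch of the ordering operation for a toy collation follows.
	   The collation (ASCII-only, case-insensitive, over octet strings)
	   is invented for illustration.  The sketch assumes that the "-"
	   prefix swaps "less" and "greater" and leaves "equal" and
	   "undefined" unchanged, which this document does not state
	   explicitly.

```python
def is_valid(s: bytes) -> bool:
    """Toy validity test: only ASCII octets are valid."""
    return all(octet < 0x80 for octet in s)


def ordering(a: bytes, b: bytes, reverse: bool = False) -> str:
    """Return "less", "equal", "greater" or "undefined".

    reverse=True corresponds to a "-" prefix on the collation name.
    """
    if not (is_valid(a) and is_valid(b)):
        return "undefined"
    key_a, key_b = a.upper(), b.upper()
    if key_a == key_b:
        return "equal"
    result = "less" if key_a < key_b else "greater"
    if reverse:
        result = "greater" if result == "less" else "less"
    return result
```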

450	4.3.  Sort Keys

452	   A collation specification SHOULD describe the internal transformation
453	   algorithm to generate sort keys.  This algorithm can be applied to
454	   individual strings and the result can be stored to potentially
455	   optimize future comparison operations.  A collation MAY specify that
456	   the sort key is generated by the identity function.  The sort key may
457	   have no meaning to a human.  The sort key may not be valid input to
458	   the collation.
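
	   The benefit of sort keys is that each string is transformed once
	   and the keys are then compared directly.  A minimal sketch, using
	   an invented ASCII upper-casing transformation as the sort key
	   (any real collation would define its own):

```python
def sort_key(s: bytes) -> bytes:
    """Toy sort key: map ASCII a-z to A-Z, leave other octets alone."""
    return s.upper()


def sort_with_keys(strings):
    """Sort strings by precomputed sort keys.

    Each key is computed once; ties keep their original relative order.
    """
    keyed = [(sort_key(s), s) for s in strings]
    keyed.sort(key=lambda pair: pair[0])
    return [s for _, s in keyed]
```

	   A server could store the keys alongside the strings so that later
	   comparisons do not need to repeat the transformation.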

460	4.4.  Use of Lookup Tables

462	   Some collations use customizable lookup tables, e.g. because the
463	   tables depend on locale and may be modified after shipping the
464	   software.  Collations which use more than one customizable lookup
465	   table in a documented format MUST assign numbers to the tables they
466	   use.  This permits an application protocol command to access the
467	   tables used by a server collation, so that clients and servers use
468	   the same tables.

470	5.  Application Protocol Requirements

471	   This section describes the requirements and issues that an
472	   application protocol needs to consider if it offers searching,
473	   substring matching and/or sorting, and permits the use of characters
474	   outside the US-ASCII charset.

476	5.1.  Character Encoding

478	   The protocol specification has to make sure that it is clear on which
479	   characters (rather than just octets) the collations are used.  This
480	   can be done by specifying the protocol itself in terms of characters
481	   (e.g. in the case of a query language), by specifying a single
482	   character encoding for the protocol (e.g. UTF-8 [3]), or by
483	   carefully describing the relevant issues of character encoding
484	   labeling and conversion.  In the latter case, details to consider
485	   include how to handle unknown charsets, any charsets which are
486	   mandatory-to-implement, any issues with byte-order that might apply,
487	   and any transfer encodings which need to be supported.

489	5.2.  Operations

491	   The protocol must specify which of the operations defined in this
492	   specification (equality matching, substring matching and ordering)
493	   can be invoked in the protocol, and how they are invoked.  There may
494	   be more than one way to invoke an operation.

496	   The protocol MUST provide a mechanism for the client to select the
497	   collation to use with equality matching, substring matching and
498	   ordering.

500	   If a protocol needs a total ordering and the collation chosen does
501	   not provide it because the ordering operation returns "undefined" at
502	   least once, the recommended fallback is to sort all invalid strings
503	   after the valid ones, and use i;octet to order the invalid strings.
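
	   The recommended fallback can be expressed as a composite sort
	   key.  The sketch below reuses an invented ASCII-only,
	   case-insensitive toy collation and assumes that i;octet ordering
	   is a plain octet-by-octet comparison, which is how Python
	   compares bytes objects.

```python
def is_valid(s: bytes) -> bool:
    """Toy validity test: only ASCII octets are valid."""
    return all(octet < 0x80 for octet in s)


def total_order_key(s: bytes):
    """Key giving a total order over valid and invalid strings."""
    if is_valid(s):
        return (0, s.upper())  # valid: ordered by the toy collation
    return (1, s)              # invalid: sorted last, by raw octets


def total_sort(strings):
    return sorted(strings, key=total_order_key)
```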

505	   Although the collation's substring function provides a list of
506	   matches, a protocol need not provide all of them to the client.  It
507	   may provide only the first matching substring, or even just the
508	   information that the substring search matched.

510	   If the protocol provides positional information for the results of a
511	   substring match, that positional information SHOULD fully specify the
512	   substring(s) in the result that matches independent of the length of
513	   the search string.  For example, returning both the starting and
514	   ending offset of the match would suffice, as would the starting
515	   offset and a length.  Returning just the starting offset is not
516	   acceptable.  This rule is necessary because advanced collations can
517	   treat strings of different lengths as equal (for example, pre-
518	   composed and decomposed accented characters).
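
   To illustrate why a starting offset alone is not enough, here is a
   small Python sketch.  It assumes, purely for illustration, that
   equality means "equal after NFC normalization"; a real collation
   would use its own equality operation.  The function name find_spans
   is hypothetical, and offsets are counted in code points.

```python
import unicodedata


def find_spans(needle: str, haystack: str):
    """Return (start, end) offsets of every substring of haystack that
    equals needle after NFC normalization.  Brute force, for clarity."""
    target = unicodedata.normalize("NFC", needle)
    spans = []
    for start in range(len(haystack)):
        for end in range(start + 1, len(haystack) + 1):
            candidate = unicodedata.normalize("NFC", haystack[start:end])
            if candidate == target:
                spans.append((start, end))
    return spans


needle = "caf\u00e9"               # precomposed e-acute, 4 code points
haystack = "cafe\u0301 au lait"    # decomposed e + combining acute
print(find_spans(needle, haystack), len(needle))
```

   The matched span covers five code points although the search string
   has four, so a client given only the starting offset could not
   reconstruct the end of the match.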

520	5.3.  Wildcards

522	   The protocol MUST specify whether it allows the use of wildcards in
523	   collation names or not.  If the protocol allows wildcards, then:
524	      The protocol MUST specify how comparisons behave in the absence of
525	      explicit collation negotiation or when a collation of "*" is
526	      requested.  The protocol MAY specify that the default collation
527	      used in such circumstances is sensitive to server configuration.
528	      The protocol SHOULD provide a way to list available collations
529	      matching a given wildcard pattern or patterns.

531	5.4.  Canonicalization Function

533	   If the protocol uses a canonicalization function for strings, then
534	   use of collations MAY be appropriate for that function.  As an
535	   example, many protocols use case independent strings.  In most cases,
536	   a simple ASCII mapping to upper/lower case works well, as i;ascii-
537	   casemap offers.  However, in some cases another collation may be
538	   better, e.g. to handle Turkish dotted/dotless i.  Protocol designers
539	   should consider in each case whether to use a specifiable collation.

541	5.5.  Disconnected Clients

543	   If the protocol supports disconnected clients, then a mechanism for
544	   the client to precisely replicate the server's collation algorithm is
545	   likely desirable.  Thus the protocol MAY wish to provide a command to
546	   fetch lookup tables used by charset conversions and collations.

548	5.6.  Error Codes

550	   The protocol specification should consider assigning protocol error
551	   codes for the following circumstances:
552	   o  The client requests the use of a collation by name or pattern, but
553	      no implemented collation matches that pattern.
554	   o  The client attempts to use a collation for an operation that is
555	      not supported by that collation.  For example, attempting to use
556	      the "i;ascii-numeric" collation for substring matching.
557	   o  The client uses an equality or substring matching collation and
558	      the result is an error.  It may be appropriate to distinguish
559	      between the two input strings, particularly when one is supplied
560	      by the client and one is stored by the server.  It might also be
561	      appropriate to distinguish the specific case of an invalid UTF-8
562	      string.

564	5.7.  Octet Collation

566	   The i;octet (Section 9.5) collation is only usable with protocols
567	   based on octet-strings.  Clients and servers MUST NOT use i;octet
568	   with other protocols.

570	   If the protocol permits the use of collations with data structures
571	   other than strings, the protocol MUST describe the default behavior
572	   for a collation with those data structures.

574	6.  Use by Existing Protocols

576	   Both ACAP [12] and Sieve [15] are standards track specifications
577	   which used collations prior to the creation of this specification and
578	   registry.  Those standards do not meet all the application protocol
579	   requirements described in Section 5.

581	   These protocols allow the use of the i;octet (Section 9.5) collation
582	   working directly on UTF-8 data as used in these protocols.

584	   In Sieve, all matches are either true or false.  Accordingly, Sieve
585	   servers must treat "undefined" and "no-match" results of the equality
586	   and substring operations as false, and only "match" as true.

588	   IMAP [16] also uses collation, although the use is explicit only when
589	   the COMPARATOR [18] extension is used.  The built-in IMAP substring
590	   operation and the ordering provided by the SORT [17] extension may
591	   not meet the requirements made in this document.

593	   Other protocols may be in a similar position.

595	   In IMAP, the default collation is i;ascii-casemap, because its
596	   operations most closely resemble IMAP's built-in operations.

598	7.  Collation Registration

600	7.1.  Collation Registration Procedure

602	   The IETF will create a mailing list, collation@ietf.org, which can be
603	   used for public discussion of collation proposals prior to
604	   registration.  Use of the mailing list is strongly encouraged.  The
605	   IESG will appoint a designated expert who will monitor the
606	   collation@ietf.org mailing list and review registrations.

608	   The registration procedure begins when a completed registration
609	   template is sent to iana@iana.org and collation@ietf.org.  The
610	   designated expert is expected to tell IANA and the submitter of the
611	   registration within two weeks whether the registration is approved,
612	   approved with minor changes, or rejected with cause.  When a
613	   registration is rejected with cause, it can be re-submitted if the
614	   concerns listed in the cause are addressed.  Decisions made by the
615	   designated expert can be appealed to the IESG Applications Area Director,
616	   then to the IESG.  They follow the normal appeals procedure for IESG
617	   decisions.

619	   Collation registrations in a standards track, BCP or IESG-approved
620	   experimental RFC are owned by the IETF, and changes to the
621	   registration follow normal procedures for updating such documents.
622	   Collation registrations in other RFCs are owned by the RFC author(s).
623	   Other collation registrations are owned by the individual(s) listed
624	   in the contact field of the registration and IANA will preserve this
625	   information.  Changes to a registration MUST be approved by the
626	   owner.  In the event the owner cannot be contacted for a period of
627	   one month and a change is deemed necessary, the IESG MAY re-assign
628	   ownership to an appropriate party.

630	7.2.  Collation Registration Format

632	   Registration of a collation is done by sending a well-formed XML
633	   document that validates with collationreg.dtd (Section 7.3).

635	7.2.1.  Registration Template

637	   Here is a template for the registration:

639	   <?xml version='1.0'?>
640	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
641	   <collation rfc="YYYY" scope="i18n" intendedUse="common">
642	     <name>collation name</name>
643	     <title>technical title for collation</title>
644	     <operations>equality order substring</operations>
645	     <specification>specification reference</specification>
646	     <owner>email address of owner or IETF</owner>
647	     <submitter>email address of submitter</submitter>
648	     <version>1</version>
649	     <UnicodeVersion>3.2</UnicodeVersion>
650	     <UCAVersion>3.1.1</UCAVersion>
651	   </collation>

653	7.2.2.  The collation Element

655	   The root of the registration document MUST be a <collation> element.
656	   The collation element contains the other elements in the
657	   registration, which are described in the following sub-subsections,
658	   in the order given here.

660	   The <collation> element MAY include an "rfc=" attribute if the
661	   specification is in an RFC.  The "rfc=" attribute gives only the
662	   number of the RFC, without any prefix, such as "RFC", or suffix, such
663	   as ".txt".

665	   The <collation> element MUST include a "scope=" attribute, which MUST
666	   have one of the values "i18n", "local" or "other".

668	   The <collation> element MUST include an "intendedUse=" attribute,
669	   which must have one of the values "common", "limited", "vendor", or
670	   "deprecated".  Collation specifications intended for "common" use are
671	   expected to reference standards from standards bodies with
672	   significant experience dealing with the details of international
673	   character sets.

675	   Be aware that future revisions of this specification may add
676	   additional function types, as well as additional XML attributes,
677	   values and elements.  Any system which automatically parses these XML
678	   documents MUST take this into account to preserve future
679	   compatibility.  A DTD for the current definition of the collation
680	   registration template is given in Section 7.3.
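
   A parser that tolerates future additions might look like the
   following Python sketch.  It uses the standard library's
   xml.etree.ElementTree, which checks well-formedness but does not
   validate against the DTD.  The function name parse_registration and
   the shape of the returned dictionary are assumptions made for this
   example.

```python
import xml.etree.ElementTree as ET

# Child elements defined by the current registration format.
KNOWN_ELEMENTS = {
    "name", "title", "operations", "specification", "owner",
    "submitter", "version", "UnicodeVersion", "UCAVersion",
    "UCAMatchLevel",
}


def parse_registration(xml_text: str) -> dict:
    """Parse a registration, collecting unknown elements instead of failing."""
    root = ET.fromstring(xml_text)
    if root.tag != "collation":
        raise ValueError("root element must be <collation>")
    # Attributes are copied as-is, so unknown attributes are kept too.
    reg = {"attributes": dict(root.attrib), "unknown": []}
    for child in root:
        if child.tag in KNOWN_ELEMENTS:
            reg.setdefault(child.tag, []).append((child.text or "").strip())
        else:
            reg["unknown"].append(child.tag)  # tolerated, not an error
    return reg


doc = (
    "<?xml version='1.0'?>\n"
    "<!DOCTYPE collation SYSTEM 'collationreg.dtd'>\n"
    "<collation scope='i18n' intendedUse='common'>\n"
    "  <name>collation name</name>\n"
    "  <operations>equality order substring</operations>\n"
    "  <futureElement>not defined today</futureElement>\n"
    "</collation>\n"
)
reg = parse_registration(doc)
print(reg["operations"][0].split(), reg["unknown"])
```

   Values are stored as lists because some elements, such as
   <specification> and <owner>, may occur more than once.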

682	7.2.3.  The name Element

684	   The <name> element gives the precise name of the collation.  The
685	   <name> element is mandatory.

687	7.2.4.  The title Element

689	   The <title> element gives the title of the collation.  The <title>
690	   element is mandatory.

692	7.2.5.  The operations Element

694	   The <operations> element lists which of the three operations
695	   ("equality", "order" or "substring") the collation provides,
696	   separated by single spaces.  The <operations> element is mandatory.

698	7.2.6.  The specification Element

700	   The <specification> element describes where to find the
701	   specification.  The <specification> element is mandatory.  It MAY
702	   have a URI attribute.  There may be more than one <specification>
703	   element, in which case they together form the specification.

705	   If it is discovered that parts of a collation specification conflict,
706	   a new revision of the collation is necessary, and the
707	   collation@ietf.org mailing list should be notified.

709	7.2.7.  The submitter Element

711	   The <submitter> element provides an RFC 2822 [13] email address for
712	   the person who submitted the registration.  It is optional if the
713	   <owner> element contains an email address.

715	   There may be more than one <submitter> element.

717	7.2.8.  The owner Element

719	   The <owner> element contains either the four letters "IETF" or an
720	   email address of the owner of the registration.  The <owner> element
721	   is mandatory.  There may be more than one <owner> element.  If so,
722	   all owners are equal.  Each owner can speak for all.

724	7.2.9.  The version Element

726	   The <version> element is included when the registration is likely to
727	   be revised or has been revised in such a way that the results change
728	   for certain input strings.  The <version> element is optional.

730	7.2.10.  The UnicodeVersion Element

732	   The <UnicodeVersion> element indicates the version number of the
733	   UnicodeData file on which the collation is based.  The
734	   <UnicodeVersion> element is optional.

736	7.2.11.  The UCAVersion Element

738	   The <UCAVersion> element specifies the version of the Unicode
739	   Collation Algorithm on which the collation is based.  The
740	   <UCAVersion> element is optional.

742	7.2.12.  The UCAMatchLevel Element

744	   The <UCAMatchLevel> element specifies the number of Unicode Collation
745	   Algorithm sort key levels used for the equality and substring
746	   operations.  The <UCAMatchLevel> element is optional.

748	7.3.  DTD for Collation Registration

750	   <!--
751	     DTD for Collation Registration Document

753	     Data types:

755	     entity      description
756	     ======      ===========
757	     NUMBER      [0-9]+
758	     URI         As defined in RFC 3986
759	     CTEXT       printable ASCII text (no line-terminators)
760	     TEXT        character data
761	     -->
762	   <!ENTITY % NUMBER        "CDATA">
763	   <!ENTITY % URI           "CDATA">
764	   <!ENTITY % CTEXT         "#PCDATA">
765	   <!ENTITY % TEXT          "#PCDATA">
766	   <!ELEMENT collation      (name,title,operations,specification+,
767	                             owner+,submitter*,version?,
768	                             UnicodeVersion?,UCAVersion?,
769	                             UCAMatchLevel?)>
770	   <!ATTLIST collation
771	             rfc            %NUMBER;                           "0"
772	             scope          (i18n|local|other)                 #IMPLIED
773	             intendedUse    (common|limited|vendor|deprecated) #IMPLIED>
774	   <!ELEMENT name           (%CTEXT;)>
775	   <!ELEMENT title          (%CTEXT;)>
776	   <!ELEMENT operations     (%CTEXT;)>
777	   <!ELEMENT specification  (%TEXT;)>
778	   <!ATTLIST specification
779	             uri            %URI;                              "">
780	   <!ELEMENT owner          (%CTEXT;)>
781	   <!ELEMENT submitter      (%CTEXT;)>
782	   <!ELEMENT version        (%CTEXT;)>
783	   <!ELEMENT UnicodeVersion (%CTEXT;)>
784	   <!ELEMENT UCAVersion     (%CTEXT;)>
785	   <!ELEMENT UCAMatchLevel  (%CTEXT;)>

787	7.4.  Structure of Collation Registry

789	   Once the registration is approved, IANA will store each XML
790	   registration document in a URL of the form
791	   http://www.iana.org/assignments/collation/collation-name.xml where
792	   collation-name is the contents of the name element in the
793	   registration.  Both the submitter and the designated expert are
794	   responsible for verifying that the XML is well-formed and complies
795	   with the DTD.

797	   IANA will also maintain a text summary of the registry under the name
798	   http://www.iana.org/assignments/collation/summary.txt.  This summary
799	   is divided into four sections.  The first section is for collations
800	   intended for common use.  This section is intended for collation
801	   registrations published in IESG approved RFCs or for locally scoped
802	   collations from the primary standards body for that locale.  The
803	   designated expert is encouraged to reject collation registrations
804	   with an intended use of "common" if the expert believes it should be
805	   "limited", as it is desirable to keep the number of "common"
806	   registrations small and high quality.  The second section is reserved
807	   for limited use collations.  The third section is reserved for
808	   registered vendor specific collations.  The final section is reserved
809	   for deprecated collations.

811	7.5.  Example Initial Registry Summary

813	   The following is an example of how IANA might structure the initial
814	   registry summary.txt file:

816	     Collation                              Functions Scope Reference
817	     ---------                              --------- ----- ---------
818	   Common Use Collations:
819	     i;nameprep;v=1;uv=3.2                  e, o, s   i18n  [RFC XXXX]
820	     i;basic;uca=3.1.1;uv=3.2               e, o, s   i18n  [RFC XXXX]
821	     i;basic;uca=3.1.1;uv=3.2;match=accent  e, o, s   i18n  [RFC XXXX]
822	     i;basic;uca=3.1.1;uv=3.2;match=case    e, o, s   i18n  [RFC XXXX]
823	     i;ascii-casemap                        e, o, s   Local [RFC XXXX]

825	   Limited Use Collations:
826	     i;octet                                e, o, s   Other [RFC XXXX]
827	     i;ascii-numeric                        e, o      Other [RFC XXXX]

829	   Vendor Collations:

831	   Deprecated Collations:
832	     i;ascii-casemap                        e, o, s   Local [RFC XXXX]

834	   References
835	   ----------
836	   [RFC XXXX]  Newman, C., "Internet Application Protocol Collation
837	               Registry", RFC XXXX, Sun Microsystems, October 2003.

839	8.  Guidelines for Expert Reviewer

841	   The expert reviewer appointed by the IESG has fairly broad latitude
842	   for this registry.  While a number of collations are expected
843	   (particularly customizations of the basic collation for localized
844	   use), an explosion of collations (particularly common use collations)
845	   is not desirable for widespread interoperability.  However, it is
846	   important for the expert reviewer to provide cause when rejecting a
847	   registration, and when possible to describe corrective action to
848	   permit the registration to proceed.  The following list includes
849	   some example reasons to reject a registration with cause:
850	   o  The registration is not a well-formed XML document that follows
851	      the DTD.
852	   o  The registration has an intended use of "common", but there is no
853	      evidence the collation will be widely deployed, so it should be
854	      listed as "limited".
855	   o  The registration has an intended use of "common", but it is
856	      redundant with the functionality of a previously registered
857	      "common" collation.
858	   o  The registration has an intended use of "common", but the
859	      specification is not detailed enough to allow interoperable
860	      implementations by others.
861	   o  The collation name fails to precisely identify the version numbers
862	      of relevant tables to use.
863	   o  The registration fails to meet one of the "MUST" requirements in
864	      Section 4.
865	   o  The collation name fails to meet the syntax in Section 3.
866	   o  The collation specification referenced in the registration is
867	      vague or has optional features without a clear behavior specified.
868	   o  The referenced specification does not adequately address security
869	      considerations specific to that collation.
870	   o  The registration's operations are needlessly different from those
871	      of traditional operations.

873	9.  Initial Collations

875	   This section describes an initial set of collations for the collation
876	   registry.

878	9.1.  ASCII Numeric Collation

880	9.1.1.  ASCII Numeric Collation Description

882	   The "i;ascii-numeric" collation is a simple collation intended for
883	   use with arbitrary sized unsigned decimal integer numbers stored as
884	   octet strings.  US-ASCII digits (0x30 to 0x39) represent digits of
885	   the numbers.  Before converting from string to integer, the input
886	   string is truncated at the first non-digit character (so for example,
887	   "4294967298", "04294967298" and "4294967298b" all represent the same
888	   number, 4294967298).

890	   The collation supports equality and ordering, but does not support
891	   the substring operation.

893	   The equality operation returns "match" if the two strings represent
894	   the same number (i.e., leading zeroes are disregarded), "no-match" if
895	   the two strings represent different numbers, and "undefined" if
896	   either string is empty or does not start with a digit.

898	   The ordering operation returns "less" if the first string represents
899	   a smaller number than the second, "equal" if they represent the same
900	   number, and "greater" if the first string represents a larger number
901	   than the second.  If either string is empty or starts with a non-
902	   digit, the ordering operation returns "undefined".
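
   The description above can be sketched in Python as follows.  This is
   an illustration only; the function names are hypothetical.  Python
   integers have arbitrary precision, which suits the "arbitrary sized"
   requirement.

```python
import re

_LEADING_DIGITS = re.compile(rb"[0-9]+")


def _value(s: bytes):
    """Integer formed by the leading US-ASCII digits, or None if the
    string is empty or does not start with a digit."""
    m = _LEADING_DIGITS.match(s)
    return int(m.group().decode("ascii")) if m else None


def numeric_equal(a: bytes, b: bytes) -> str:
    x, y = _value(a), _value(b)
    if x is None or y is None:
        return "undefined"
    return "match" if x == y else "no-match"


def numeric_order(a: bytes, b: bytes) -> str:
    x, y = _value(a), _value(b)
    if x is None or y is None:
        return "undefined"
    if x < y:
        return "less"
    if x > y:
        return "greater"
    return "equal"


# Leading zeroes and trailing non-digits do not affect the value.
print(numeric_equal(b"4294967298", b"04294967298"))
print(numeric_equal(b"4294967298", b"4294967298b"))
print(numeric_order(b"9", b"10"))
print(numeric_order(b"", b"10"))
```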

904	9.1.2.  ASCII Numeric Collation Registration

906	   <?xml version='1.0'?>
907	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
908	   <collation rfc="XXXX" scope="other" intendedUse="limited">
909	     <name>i;ascii-numeric</name>
910	     <title>ASCII Numeric</title>
911	     <operations>equality order</operations>
912	     <specification>RFC XXXX</specification>
913	     <owner>IETF</owner>
914	     <submitter>chris.newman@sun.com</submitter>
915	   </collation>

917	9.2.  ASCII Casemap Collation

919	9.2.1.  ASCII Casemap Collation Description

921	   The "i;ascii-casemap" collation is a simple collation which operates
922	   on octet strings and treats US-ASCII letters case-insensitively.  It
923	   provides equality, substring and ordering operations.  All input is
924	   valid.

926	   Its equality, ordering and substring operations are as for i;octet,
927	   except that first, the lower-case letters (octet values 97-122) in
928	   each input string are changed to upper case (octet values 65-90).

930	   Care should be taken when using OS-supplied functions to implement
931	   this collation as it is not locale sensitive.  Functions such as
932	   strcasecmp and toupper are sometimes locale sensitive and may
933	   inappropriately map lower-case letters other than a-z to upper case.
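
   One way to avoid locale-sensitive library functions is an explicit
   translation table, as in this Python sketch.  The function names are
   hypothetical, and the code is an illustration rather than a
   reference implementation.

```python
# Map only octets 97-122 (a-z) to 65-90 (A-Z); every other octet,
# including the octets of non-ASCII UTF-8 sequences, is left unchanged.
_UPPER = bytes.maketrans(b"abcdefghijklmnopqrstuvwxyz",
                         b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def casemap(s: bytes) -> bytes:
    return s.translate(_UPPER)


def casemap_equal(a: bytes, b: bytes) -> str:
    return "match" if casemap(a) == casemap(b) else "no-match"


def casemap_order(a: bytes, b: bytes) -> str:
    # Python compares bytes by unsigned octet value, as i;octet does.
    x, y = casemap(a), casemap(b)
    if x < y:
        return "less"
    if x > y:
        return "greater"
    return "equal"


def casemap_substring(needle: bytes, haystack: bytes) -> str:
    return "match" if casemap(needle) in casemap(haystack) else "no-match"


print(casemap_equal(b"Hello", b"hELLO"))
print(casemap_order(b"a", b"B"))
print(casemap_substring(b"LAIT", b"cafe au lait"))
```

   Because letters are mapped to upper case before comparison, octets
   between "Z" and "a" (such as "_", 0x5F) sort after all letters.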

935	   The i;ascii-casemap collation is well suited to use with many
936	   internet protocols and computer languages.  Use with natural language
937	   is often inappropriate: even though the collation apparently supports
938	   languages such as Italian and English, in real-world use it tends to
939	   stumble over words such as "naive", names such as "Llwyd", people and
940	   place names containing non-ASCII characters, euro and pound sterling
941	   symbols, quotation marks, dashes/hyphens, etc.

943	9.2.2.  ASCII Casemap Collation Registration

945	   <?xml version='1.0'?>
946	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
947	   <collation rfc="XXXX" scope="local" intendedUse="common">
948	     <name>i;ascii-casemap</name>
949	     <title>ASCII Casemap</title>
950	     <operations>equality order substring</operations>
951	     <specification>RFC XXXX</specification>
952	     <owner>IETF</owner>
953	     <submitter>chris.newman@sun.com</submitter>
954	   </collation>

956	9.3.  Nameprep Collation

958	9.3.1.  Nameprep Collation Description

960	   The "i;nameprep;v=1;uv=3.2" collation is an implementation of the
961	   nameprep [7] specification based on normalization tables from Unicode
962	   version 3.2.  This collation applies the nameprep canonicalization
963	   function to both input strings and then returns the result of the
964	   i;octet collation on the canonicalized strings.  While this collation
965	   offers all three operations, the ordering operation it provides is
966	   inadequate for use by the majority of the world.
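
   The "canonicalize, then compare octets" structure can be sketched in
   Python using the nameprep helper from the standard library's
   encodings.idna module, which implements RFC 3491.  The function
   names below are hypothetical.  Returning "undefined" when nameprep
   rejects an input is an assumption of this sketch, not something
   stated in this section.

```python
from encodings.idna import nameprep  # standard library, RFC 3491


def _canonical_octets(s: str):
    """UTF-8 octets of the nameprepped string, or None if nameprep
    rejects the input (for example, a prohibited character)."""
    try:
        return nameprep(s).encode("utf-8")
    except UnicodeError:
        return None


def nameprep_equal(a: str, b: str) -> str:
    x, y = _canonical_octets(a), _canonical_octets(b)
    if x is None or y is None:
        return "undefined"  # assumption, see the text above
    return "match" if x == y else "no-match"


def nameprep_order(a: str, b: str) -> str:
    x, y = _canonical_octets(a), _canonical_octets(b)
    if x is None or y is None:
        return "undefined"  # assumption, see the text above
    if x < y:
        return "less"
    if x > y:
        return "greater"
    return "equal"


print(nameprep_equal("EXAMPLE", "example"))
print(nameprep_order("Alpha", "beta"))
```

   Ordering here is plain octet order on the canonical form, which is
   why the result is of limited use for natural-language sorting.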

968	   Version number 1 is applied to nameprep as specified in RFC 3491.  If
969	   the nameprep specification is revised without any changes that would
970	   produce different results when given the same pair of input octet
971	   strings, then the version number will remain unchanged.

973	       The table numbers for tables used by nameprep are as follows:

975	                 +--------------+-----------------------+
976	                 | Table Number | Table Name            |
977	                 +--------------+-----------------------+
978	                 |            1 | UnicodeData-3.2.0.txt |
979	                 |            2 | Table B.1             |
980	                 |            3 | Table B.2             |
981	                 |            4 | Table C.1.2           |
982	                 |            5 | Table C.2.2           |
983	                 |            6 | Table C.3             |
984	                 |            7 | Table C.4             |
985	                 |            8 | Table C.5             |
986	                 |            9 | Table C.6             |
987	                 |           10 | Table C.7             |
988	                 |           11 | Table C.8             |
989	                 |           12 | Table C.9             |
990	                 +--------------+-----------------------+

992	9.3.2.  Nameprep Collation Registration

994	   <?xml version='1.0'?>
995	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
996	   <collation rfc="XXXX" scope="i18n" intendedUse="common">
997	     <name>i;nameprep;v=1;uv=3.2</name>
998	     <title>Nameprep</title>
999	     <operations>equality order substring</operations>
1000	     <specification>RFC XXXX</specification>
1001	     <owner>IETF</owner>
1002	     <submitter>chris.newman@sun.com</submitter>
1003	     <version>1</version>
1004	     <UnicodeVersion>3.2</UnicodeVersion>
1005	   </collation>

1007	9.4.  Basic Collation

1009	9.4.1.  Basic Collation Description

1011	   The basic collation is intended to provide tolerable results for a
1012	   number of languages for all three operations (equality, substring and
1013	   ordering) so it is suitable as a mandatory-to-implement collation for
1014	   protocols which include ordering support.  The ordering operation of
1015	   the basic collation is the Unicode Collation Algorithm [8] version 14
1016	   (UCAv14).

1018	   The equality and substring operations are created as described in
1019	   UCAv14 section 8.  While that section is informative to UCAv14, it is
1020	   normative to this collation specification.

1022	   This collation is based on Unicode version 3.2, with the following
1023	   tables relevant:
1024	   1.  For the normalization step,
1025	       <http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.txt>
1026	       is used.  Column 5 is used to determine the canonical
1027	       decomposition, while column 3 contains the canonical combining
1028	       classes necessary to attain canonical order.
1029	   2.  The table of characters which require a logical order exception
1030	       is a subset of the table in
1031	       <http://www.unicode.org/Public/3.2-Update/PropList-3.2.0.txt> and
1032	       is included here:

1034	   0E40..0E44    ; Logical_Order_Exception
1035	   # Lo   [5] THAI CHARACTER SARA E..THAI CHARACTER SARA AI MAIMALAI
1036	   0EC0..0EC4    ; Logical_Order_Exception
1037	   # Lo   [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI

1039	   # Total code points: 10

1041	   3.  The table used to translate normalized code points to a sort key
1042	       is <http://www.unicode.org/reports/tr10/allkeys-3.1.1.txt>.

1044	   UCAv14 includes a number of configurable parameters and steps
1045	   labelled as potentially optional.  The following list summarizes the
1046	   defaults used by this collation:
1047	   o  The logical order exception step is mandatory by default to
1048	      support the largest number of languages.
1049	   o  Steps 2.1.1 to 2.1.3 are mandatory as the repertoire of the basic
1050	      collation is intended to be large.
1051	   o  The second level in the sort key is evaluated forwards by default.
1052	   o  The variable weighting uses the "non-ignorable" option by default.
1053	   o  The semi-stable option is not used by default.
1054	   o  Support for exactly three levels of collation is the default
1055	      behavior.
1056	   o  No preprocessing step is used by the basic collation prior to
1057	      applying the UCAv14 algorithm.  Note that an application protocol
1058	      specification MAY require pre-processing prior to the use of any
1059	      collations.
1060	   o  The equality and substring algorithms exclude differences at level
1061	      2 and 3 by default (thus it is case-insensitive and ignores
1062	      accentual distinctions).
1063	   o  The equality and substring algorithms use the "Whole Characters
1064	      Only" feature described in UCAv14 section 8 by default.

1066	   The exact collation name with these defaults is
1067	   "i;basic;uca=3.1.1;uv=3.2".  When a specification states that the
1068	   basic collation is mandatory-to-implement, only this specific name is
1069	   mandatory-to-implement.

1071	   In order to allow modification of the optional behaviors, the
1072	   following ABNF is used for variations of the basic collation:

1074	     basic-collation  =  ("i" / Language-Tag) ";basic;uca=3.1.1;uv=3.2"
1075	                         [";match=accent" / ";match=case"]
1076	                         [";tailor=" 1*collation-char ]

1078	   If multiple modifiers appear, they MUST appear in the order described
1079	   above.  The modifiers have the following meanings:
1080	   match=accent   Both the first and second levels of the sort keys are
1081	                  considered relevant to the equality and substring
1082	                  operations (rather than the default of first level
1083	                  only).  This makes the matching functions sensitive to
1084	                  accentual distinctions.
1085	   match=case     The first three levels of sort keys are considered
1086	                  relevant to the equality and substring operations.
1087	                  This makes the matching functions sensitive to both
1088	                  case and accentual distinctions.
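
   As an illustration, a basic collation name could be taken apart with
   a regular expression such as the one in this Python sketch.  The
   patterns used for Language-Tag and collation-char are loose
   approximations chosen for the example; their real definitions lie
   outside this section.  The function name parse_basic is hypothetical.
   The returned level corresponds to the UCAMatchLevel values 1, 2
   and 3.

```python
import re

_BASIC = re.compile(
    r"^(?P<lang>[A-Za-z0-9-]+)"           # "i" or an approximated Language-Tag
    r";basic;uca=3\.1\.1;uv=3\.2"
    r"(?:;match=(?P<match>accent|case))?"
    r"(?:;tailor=(?P<tailor>[^\s*]+))?$"  # approximated 1*collation-char
)

_LEVELS = {None: 1, "accent": 2, "case": 3}


def parse_basic(name: str):
    """Return (language, match level, tailoring) or None if the name
    does not fit the basic-collation pattern."""
    m = _BASIC.match(name)
    if m is None:
        return None
    return m.group("lang"), _LEVELS[m.group("match")], m.group("tailor")


print(parse_basic("i;basic;uca=3.1.1;uv=3.2"))
print(parse_basic("i;basic;uca=3.1.1;uv=3.2;match=case"))
print(parse_basic("fr;basic;uca=3.1.1;uv=3.2;match=accent;tailor=xfoo"))
print(parse_basic("i;octet"))
```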

1090	   The default weighting option is "non-ignorable".  The "semi-stable"
1091	   sort key option is not used by default.

1093	   Sort keys are generated as described in section 4.3 of the UCA
1094	   specification.  (Note that the result is not a string of characters.)

1096	   Finally, the UCAv14 algorithm permits the "allkeys" table to be
1097	   tailored to a language.  People who make quality tailorings are
1098	   encouraged to register those tailorings using the collation registry.
1099	   Tailoring names beginning with "x" are reserved for experimental use,
1100	   are treated as "Limited use" and MUST NOT match wildcards if any
1101	   registered collation is available that does match.

1103	9.4.2.  Basic Collation Registration

1105	   <?xml version='1.0'?>
1106	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
1107	   <collation rfc="XXXX" scope="i18n" intendedUse="common">
1108	     <name>i;basic;uca=3.1.1;uv=3.2</name>
1109	     <title>Basic</title>
1110	     <operations>equality order substring</operations>
1111	     <specification>RFC XXXX</specification>
1112	     <owner>IETF</owner>
1113	     <submitter>chris.newman@sun.com</submitter>
1114	     <UnicodeVersion>3.2</UnicodeVersion>
1115	     <UCAVersion>3.1.1</UCAVersion>
1116	     <UCAMatchLevel>1</UCAMatchLevel>
1117	   </collation>

1119	9.4.3.  Basic Accent Sensitive Match Collation Registration

1121	   <?xml version='1.0'?>
1122	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
1123	   <collation rfc="XXXX" scope="i18n" intendedUse="common">
1124	     <name>i;basic;uca=3.1.1;uv=3.2;match=accent</name>
1125	     <title>Basic Accent Sensitive Match</title>
1126	     <operations>equality order substring</operations>
1127	     <specification>RFC XXXX</specification>
1128	     <owner>IETF</owner>
1129	     <submitter>chris.newman@sun.com</submitter>
1130	     <UnicodeVersion>3.2</UnicodeVersion>
1131	     <UCAVersion>3.1.1</UCAVersion>
1132	     <UCAMatchLevel>2</UCAMatchLevel>
1133	   </collation>

1135	9.4.4.  Basic Case Sensitive Match Collation Registration

1137	   <?xml version='1.0'?>
1138	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
1139	   <collation rfc="XXXX" scope="i18n" intendedUse="common">
1140	     <name>i;basic;uca=3.1.1;uv=3.2;match=case</name>
1141	     <title>Basic Case Sensitive Match</title>
1142	     <operations>equality order substring</operations>
1143	     <specification>RFC XXXX</specification>
1144	     <owner>IETF</owner>
1145	     <submitter>chris.newman@sun.com</submitter>
1146	     <UnicodeVersion>3.2</UnicodeVersion>
1147	     <UCAVersion>3.1.1</UCAVersion>
1148	     <UCAMatchLevel>3</UCAMatchLevel>
1149	   </collation>

1151	9.5.  Octet Collation

1153	9.5.1.  Octet Collation Description

1155	   The "i;octet" collation is a simple and fast collation intended for
1156	   use on binary octet strings rather than on character data.  Protocols
1157	   that want to make this collation available have to do so by
1158	   explicitly allowing it.  If not explicitly allowed, it MUST NOT be
1159	   used.  It never returns an "undefined" result.  It provides equality,
1160	   substring and ordering operations.

1162	   The ordering algorithm is as follows:
1163	   1.  If both strings are the empty string, return the result "equal".
1164	   2.  If the first string is empty and the second is not, return the
1165	       result "less".
1167	   3.  If the second string is empty and the first is not, return the
1168	       result "greater".
1169	   4.  If both strings begin with the same octet value, remove the first
1170	       octet from both strings and repeat this algorithm from step 1.
1171	   5.  If the unsigned value (0 to 255) of the first octet of the first
1172	       string is less than the unsigned value of the first octet of the
1173	       second string, then return "less".
1174	   6.  If this step is reached, return "greater".

1176	   This algorithm is roughly equivalent to the C library function memcmp
1177	   with appropriate length checks added.

1179	   The equality operation returns "match" if the ordering algorithm
1180	   would return "equal".  Otherwise it returns "no-match".

1182	   The substring operation returns "match" if the first string is the
1183	   empty string, or if there exists a substring of the second string of
1184	   length equal to the length of the first string which would result in
1185	   a "match" result from the equality operation.  Otherwise the
1186	   substring operation returns "no-match".
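
   The equality and substring rules can be sketched in the same way.
   This is a minimal, unoptimized illustration; the names octet_equal
   and octet_substring are hypothetical, and the first argument of
   octet_substring plays the role of the "first string" in the text.

```python
def octet_equal(a: bytes, b: bytes) -> str:
    """Return "match" when the two octet strings are identical."""
    # Identical octet sequences are exactly those that order as "equal".
    return "match" if a == b else "no-match"


def octet_substring(first: bytes, second: bytes) -> str:
    """Return "match" when `first` occurs somewhere inside `second`."""
    if len(first) == 0:
        return "match"  # the empty string matches everything
    n = len(first)
    # Try every window of `second` whose length equals len(first).
    # The range is empty when `second` is shorter than `first`.
    for start in range(len(second) - n + 1):
        if octet_equal(first, second[start:start + n]) == "match":
            return "match"
    return "no-match"
```

   A production implementation would more likely use a library search
   routine; the explicit loop is kept here to mirror the wording of the
   definition.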

1188	9.5.2.  Octet Collation Registration

1190	   This collation is defined with intendedUse="limited" because it can
1191	   only be used by protocols that explicitly allow it.

1193	   <?xml version='1.0'?>
1194	   <!DOCTYPE collation SYSTEM 'collationreg.dtd'>
1195	   <collation rfc="XXXX" scope="i18n" intendedUse="limited">
1196	     <name>i;octet</name>
1197	     <title>Octet</title>
1198	     <operations>equality order substring</operations>
1199	     <specification>RFC XXXX</specification>
1200	     <owner>IETF</owner>
1201	     <submitter>chris.newman@sun.com</submitter>
1202	   </collation>
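
   A registration document can be checked for well-formedness with any
   XML parser.  The sketch below is one hedged way to do so, using
   Python's standard xml.etree.ElementTree module on a copy of the
   i;octet registration in which the submitter element is closed with
   </submitter>.  The parser does not fetch or validate against
   collationreg.dtd, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Copy of the i;octet registration; element names are taken from the
# registration template, the variable names are assumptions.
REGISTRATION = """<?xml version='1.0'?>
<!DOCTYPE collation SYSTEM 'collationreg.dtd'>
<collation rfc="XXXX" scope="i18n" intendedUse="limited">
  <name>i;octet</name>
  <title>Octet</title>
  <operations>equality order substring</operations>
  <specification>RFC XXXX</specification>
  <owner>IETF</owner>
  <submitter>chris.newman@sun.com</submitter>
</collation>
"""

# fromstring raises xml.etree.ElementTree.ParseError if the document
# is not well-formed, for example when a closing tag is missing.
root = ET.fromstring(REGISTRATION)

name = root.findtext("name")
intended_use = root.get("intendedUse")
operations = root.findtext("operations").split()
```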

1204	10.  IANA Considerations

1206	   Section 7 defines how to register collations with IANA.  Section 9
1207	   defines a list of predefined collations, which should be registered
1208	   when this document is approved and published as an RFC.

1210	   The IANA publishes the DTD itself at URL
1211	   http://www.iana.org/assignments/collation/collationreg.dtd.

1213	11.  Security Considerations

1215	   Collations will normally be used with UTF-8 strings.  Thus the
1216	   security considerations for UTF-8 [3], stringprep [6] and Unicode
1217	   TR-36 [9] also apply and are normative to this specification.

1219	12.  Acknowledgements

1221	   The authors want to thank all who have contributed to this document,
1222	   including at least John Cowan, Dave Cridland, Mark Davis, Lisa
1223	   Dusseault, Frank Ellermann, Philip Guenther, Tony Hansen, Kjetil
1224	   Torgrim Homme, Michael Kay, Alexey Melnikov, Jim Melton and Abhijit
1225	   Menon-Sen.

1227	13.  Open Issues

1229	   When converting this to an RFC, several things must be done: handle
1230	   Martin Duerst's name request, check for unfortunate page breaks, and
1231	   add a note to the RFC editor to possibly replace the 3066 reference.

1233	   Mark Davis writes:

1235	   The sample registry would suffer a combinatorial explosion if
1236	   parameters are not handled differently.  For example, with CLDR
1237	   collations, there can be hundreds of locales, six different strength
1238	   settings; four different case-first settings; three different
1239	   alternate settings, backwards settings, normalization settings, case
1240	   level settings, hiragana settings, and numeric settings; plus a
1241	   variable-top setting which has a string as an operand.  Registering
1242	   the combinations that people are allowed to use would be untenable.
1243	   Maybe the new DTD from Martin Duerst fixes this.

1245	   Dave Cridland suggests that collations never return error.  Have
1246	   asked for rationale.

1248	   Is it appropriate to use just the level 1 for equality checking in
1249	   the basic collation?  Level 1 is case insensitive and disregards
1250	   accents.

1252	   Can "undefined" cover all the cases "error" covered, or do we need an
1253	   "error" case in addition to "undefined"?

1255	   Martin Duerst suggests, sensibly:

1257	   Can we give "the string a client uses to select a collation" a name?
1258	   E.g. "collation request string" or some such?
1259	   Should collation names be called collation identifiers?  Having both
1260	   name and title in the DTD is a bit confusing.  Try it in -12?

1262	   New DTD should permit comments, at least enough to say "The
1263	   specification is a republication of the earlier document (URL HERE)"
1264	   and things like that.

1266	14.  Change Log

1268	14.1.  Changes From -10
1269	   1.  Updated contact details for Martin Duerst.
1270	   2.  Various textual improvements.
1271	   3.  The registration's file name now has a mandatory .xml extension.
1272	   4.  Removed binding MUST for Sieve; it's more appropriate to put that
1273	       in 3028bis.
1274	   5.  Syntax fix in registration example.
1275	   6.  When there are multiple specifications, they now act in concert,
1276	       so it's possible to have e.g. a main specification and multiple
1277	       locale-specific supplements.  It is not possible to name multiple
1278	       locations for the same specification any more.  That'll return as
1279	       a comment feature.
1280	   7.  Hopefully clearer exposition of i;ascii-casemap.
1281	   8.  The ban on registering octet-based collations is lifted.  One
1282	       hopes that the collation mailing list will present a suitable
1283	       threshold - not too high, not too low.
1284	   9.  The DTD is published where IE can see it while looking at the
1285	       registrations.

1287	14.2.  Changes From -09
1288	   1.  Rename "error" to "undefined", as suggested by Mark Davis.  The
1289	       new name makes for nicer prose IMO.
1290	   2.  7b=7 according to i;ascii-numeric.  ACAP/Sieve need it.
1291	   3.  Clarified that even though the collation specification returns a
1292	       list of substrings, the protocol/server need not use all of that
1293	       information.  (As indeed IMAP SEARCH does not.)
1294	   4.  Registrations go directly to the collation list _and_ to the
1295	       IANA, not to the IANA and from there forwarded to designated
1296	       expert.
1297	   5.  Added an acknowledgements list and populated it with a quick grep
1298	       from my mailbox and memory.  Surely incomplete.
1299	   6.  Noted that in sieve, "no-match" and "undefined" must be treated
1300	       in the same way by the engine.
1301	   7.  Finish the rename from canonical to sort key.
1302	   8.  Don't fall back to i;octet from any other collation.  Return
1303	       undefined instead.  Note that protocols may fall back to i;octet
1304	       to provide total ordering, if necessary.

1306	   9.  Call the things operations everywhere, not operators/operations.

1308	14.3.  Changes From -08
1309	   1.   i;ascii-casemap instead of en;ascii-casemap.
1310	   2.   UCA v 14.  Changing to "latest version of UCA" was suggested,
1311	        but rejected since IETF standards reference stable
1312	        specifications, and "latest" is a moving target.
1313	   3.   Removed all text on multi-valued attributes.  Can be added once
1314	        there is a concrete need for it, either in an update to this
1315	        document or in the protocol that needs it.
1316	   4.   "Collations MUST specify the canonicalization".  Well, the UCA
1317	        doesn't, so I changed that to a MAY.
1318	   5.   Add some text explaining why one might want to download tables.
1319	   6.   Changed the remaining instances of "canonicalization" to talk
1320	        about sort keys.  Added a note that a collation's sort key need
1321	        not be valid input to the same collation.
1322	   7.   Reserve the word "default" and use it to name a protocol's
1323	        default collation, provided that protocol has a default
1324	        collation.  In earlier versions of the draft, "*" was used to
1325	        name the default collation, but "*" also was implicitly defined
1326	        as the most general collation available.
1327	   8.   Reinstate the different-length example of substring match.
1328	        Explain what an overlapping match is, by the canonical example.
1329	   9.   Avoid the word "contain" when talking about substring matches.
1330	        Fewer terms is better.
1331	   10.  Until -07, both a collation and equality/substring/sort were
1332	        called functions.  In -07, the trio was renamed as operations.
1333	        Now, the DTD is updated to match.
1334	   11.  Appeals go to the Apps AD before the general AD, as suggested by
1335	        Spencer Dawkins.

1337	14.4.  Changes From -06
1338	   1.  Clarified equality and identity: equality is as defined by a
1339	       collation, identity is stronger.
1340	   2.  Added reference to
1341	       http://www.unicode.org/reports/tr10/#Searching.
1342	   3.  Don't describe sort keys as a canonical representation of the
1343	       string.
1344	   4.  Permit disconnected clients to use wildcards.  (A disconnected
1345	       client has to resolve the wildcard itself, in the same way that a
1346	       server would.)
1347	   5.  Change collation-wild to have the same length limit as collation.
1348	   6.  Change to use "less" instead of "-1", etc., and specify that it's
1349	       just phrasing, not specification.
1350	   7.  Don't describe the equality, substring and ordering operations as
1351	       functions.  The definition of collation uses the word function
1352	       about the collation itself.  A function that has three functions?
1353	       Something has to give.

1355	   8.  Strike a requirement that selecting '*' is the same as not
1356	       selecting any collation.  It restricted the protocol's default
1357	       too much.  Existing code wasn't listening.
1358	   9.  Left out the canonicalization/sort keys.

1360	14.5.  Changes From -05
1361	   1.  Added definitions of client, server and protocol, and prose to
1362	       specify that while the IANA registrations of collations are
1363	       written in terms of octet strings, implementations may do it
1364	       differently.
1365	   2.  Changed the wording for ascii-numeric to treat the numbers as
1366	       numbers, etc.
1367	   3.  Added explicit property requirements for the three functions,
1368	       e.g. that equality be symmetric.  Added requirements that the
1369	       three functions be consistent, and that if any operations are
1370	       present, equality must be (needed for consistency).
1371	   4.  Random editing, e.g. changing 'numbers' for ascii-numeric to
1372	       'integer numbers'.
1373	   5.  Gave IMAP/SORT/COMPARATOR the same grandfather treatment as ACAP
1374	       and SIEVE.

1376	14.6.  Changes From -04

1378	   Grammar and clarity changes only.  One (weak) example added.  No
1379	   substantive changes.

1381	14.7.  Changes From -03

1383	   (This does not include all changes made.)
1384	   1.  Checked and resolved most issues marked 'check whether this is
1385	       true' or similar.
1386	   2.  Resolved nameprep issue: No.
1387	   3.  Removed NULL for compatibility with existing collations (IMAP
1388	       SORT, Sieve).
1389	   4.  There can be multiple owners and submitters.  Say how.
1390	   5.  Added a requirement that common collations must now be
1391	       interoperable.  Insufficiently detailed specs cannot be "common".
1392	   6.  Added a guideline that the operations provided by new collations
1393	       should be reminiscent of similar operations on existing
1394	       collations.

1396	14.8.  Changes From -02

1398	   1.  Changed from data being octet sequences (in UTF-8) to data being
1399	       character sequences (with octet collation as an exception).
1400	   2.  Made XML format description much more structured.

1402	   3.  Changed <submittor> to <submitter>, because this spelling is much
1403	       more common.
1404	   4.  Defined 'protocol' to include query languages.
1405	   5.  Reorganized document, in particular IANA considerations section
1406	       (which is now just a list of pointers).
1407	   6.  Added subsections, and a 'Structure of this Document' section.
1408	   7.  Updated references.
1409	   8.  Created a 'Change Log' chapter, with sections for each draft.
1410	   9.  Reduced 'Open issues' section; open issues are now maintained at
1411	       http://www.w3.org/2004/08/ietf-collation.

1413	14.9.  Changes From -01

1415	   Add IANA comment to open issues.  Otherwise this is just a re-publish
1416	   to keep the document alive.

1418	14.10.  Changes From -00

1420	   1.  Replaced the term comparator with collation.  While comparator is
1421	       somewhat more precise because these abstract functions are used
1422	       for matching as well as ordering, collation is the term used by
1423	       other parts of the industry.  Thus I have changed the name to
1424	       collation for consistency.
1425	   2.  Remove all modifiers to the basic collation except for the
1426	       customization and the match rules.  The other behavior
1427	       modifications can be specified in a customization of the
1428	       collation.
1429	   3.  Use ";" instead of "-" as delimiter between parameters to make
1430	       names more URL-ish.
1431	   4.  Add URL form for comparator reference.
1432	   5.  Switched registration template to use XML document.
1433	   6.  Added a number of useful registration template elements related
1434	       to the Unicode Collation Algorithm.
1435	   7.  Switched language from "custom" to "tailor" to match UCA language
1436	       for tailoring of the collation algorithm.

1438	15.  References

1440	15.1.  Normative References

1442	   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
1443	        Levels", BCP 14, RFC 2119, March 1997.

1445	   [2]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
1446	        Specifications: ABNF", RFC 4234, October 2005.

1448	   [3]  Yergeau, F., "UTF-8, a transformation format of ISO 10646",
1449	        STD 63, RFC 3629, November 2003.

1451	   [4]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1452	        Resource Identifier (URI): Generic Syntax", RFC 3986,
1453	        January 2005.

1455	   [5]  Alvestrand, H., "Tags for the Identification of Languages",
1456	        BCP 47, RFC 3066, January 2001.

1458	   [6]  Hoffman, P. and M. Blanchet, "Preparation of Internationalized
1459	        Strings ("stringprep")", RFC 3454, December 2002.

1461	   [7]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for
1462	        Internationalized Domain Names (IDN)", RFC 3491, March 2003.

1464	   [8]  Davis, M. and K. Whistler, "Unicode Collation Algorithm version
1465	        14", May 2005,
1466	        <http://www.unicode.org/reports/tr10/tr10-14.html>.

1468	   [9]  Davis, M. and M. Suignard, "Unicode Security Considerations",
1469	        February 2006, <http://www.unicode.org/reports/tr36/>.

1471	15.2.  Informative References

1473	   [10]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
1474	         Extensions (MIME) Part One: Format of Internet Message Bodies",
1475	         RFC 2045, November 1996.

1477	   [11]  Myers, J., "Simple Authentication and Security Layer (SASL)",
1478	         RFC 2222, October 1997.

1480	   [12]  Newman, C. and J. Myers, "ACAP -- Application Configuration
1481	         Access Protocol", RFC 2244, November 1997.

1483	   [13]  Resnick, P., "Internet Message Format", RFC 2822, April 2001.

1485	   [14]  Freed, N. and J. Postel, "IANA Charset Registration
1486	         Procedures", BCP 19, RFC 2978, October 2000.

1488	   [15]  Showalter, T., "Sieve: A Mail Filtering Language", RFC 3028,
1489	         January 2001.

1491	   [16]  Crispin, M., "Internet Message Access Protocol - Version
1492	         4rev1", RFC 3501, March 2003.

1494	   [17]  Crispin, M. and K. Murchison, "Internet Message Access Protocol
1495	         - Sort and Thread Extensions", draft-ietf-imapext-sort-17.txt
1496	         (work in progress), May 2004.

1498	   [18]  Newman, C. and A. Gulbrandsen, "Internet Message Access
1499	         Protocol Internationalization", draft-ietf-imapext-i18n-06.txt
1500	         (work in progress), January 2006.

1502	Authors' Addresses

1504	   Chris Newman
1505	   Sun Microsystems
1506	   1050 Lakes Drive
1507	   West Covina, CA  91790
1508	   US

1510	   Email: chris.newman@sun.com

1512	   Martin Duerst (Note: Please write "Duerst" with u-umlaut wherever possible, for example as "D&#252;rst" in XML and HTML.)
1513	   Aoyama Gakuin University
1514	   5-10-1 Fuchinobe
1515	   Sagamihara, Kanagawa  229-8558
1516	   Japan

1518	   Phone: +81 42 759 6329
1519	   Fax:   +81 42 759 6495
1520	   Email: duerst@it.aoyama.ac.jp
1521	   URI:   http://www.sw.it.aoyama.ac.jp/D%C3%BCrst/

1523	   Arnt Gulbrandsen
1524	   Oryx Mail Systems GmbH
1525	   Schweppermannstr. 8
1526	   Munich  81671
1527	   Germany

1529	   Phone: +49 89 4502 9757
1530	   Fax:   +49 89 4502 9758
1531	   Email: arnt@oryx.com
1532	   URI:   http://www.oryx.com/arnt/

1534	Intellectual Property Statement

1536	   The IETF takes no position regarding the validity or scope of any
1537	   Intellectual Property Rights or other rights that might be claimed to
1538	   pertain to the implementation or use of the technology described in
1539	   this document or the extent to which any license under such rights
1540	   might or might not be available; nor does it represent that it has
1541	   made any independent effort to identify any such rights.  Information
1542	   on the procedures with respect to rights in RFC documents can be
1543	   found in BCP 78 and BCP 79.

1545	   Copies of IPR disclosures made to the IETF Secretariat and any
1546	   assurances of licenses to be made available, or the result of an
1547	   attempt made to obtain a general license or permission for the use of
1548	   such proprietary rights by implementers or users of this
1549	   specification can be obtained from the IETF on-line IPR repository at
1550	   http://www.ietf.org/ipr.

1552	   The IETF invites any interested party to bring to its attention any
1553	   copyrights, patents or patent applications, or other proprietary
1554	   rights that may cover technology that may be required to implement
1555	   this standard.  Please address the information to the IETF at
1556	   ietf-ipr@ietf.org.

1558	Disclaimer of Validity

1560	   This document and the information contained herein are provided on an
1561	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1562	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1563	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1564	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1565	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1566	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1568	Copyright Statement

1570	   Copyright (C) The Internet Society (2006).  This document is subject
1571	   to the rights, licenses and restrictions contained in BCP 78, and
1572	   except as set forth therein, the authors retain all their rights.

1574	Acknowledgment

1576	   Funding for the RFC Editor function is currently provided by the
1577	   Internet Society.