idnits 2.17.1 

draft-iab-identifier-comparison-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There is 1 instance of lines with non-RFC2606-compliant FQDNs in the
     document.

  == There are 2 instances of lines with private range IPv4 addresses in the
     document.  If these are generic example addresses, they should be changed
     to use any of the ranges defined in RFC 6890 (or successor): 192.0.2.x,
     198.51.100.x or 203.0.113.x.

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 418: '...dentity of an Internet host, it SHOULD...'
     RFC 2119 keyword, line 420: '...#.#.#.#") form.  The host SHOULD check...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (December 14, 2012) is 4151 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC5890' is mentioned on line 547, but not defined

  == Outdated reference: A later version (-09) exists of
     draft-ietf-precis-problem-statement-08

  -- Obsolete informational reference (is this intentional?): RFC 3490
     (Obsoleted by RFC 5890, RFC 5891)


     Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                     D. Thaler, Ed.
3	Internet-Draft                                                 Microsoft
4	Intended status: Informational                         December 14, 2012
5	Expires: June 17, 2013

7	         Issues in Identifier Comparison for Security Purposes
8	                 draft-iab-identifier-comparison-06.txt

10	Abstract

12	   Identifiers such as hostnames, URIs, and email addresses are often
13	   used in security contexts to identify security principals and
14	   resources.  In such contexts, an identifier supplied via some
15	   protocol is often compared against some policy to make security
16	   decisions such as whether the principal may access the resource, what
17	   level of authentication or encryption is required, etc.  If the
18	   parties involved in a security decision use different algorithms to
19	   compare identifiers, then failure scenarios ranging from denial of
20	   service to elevation of privilege can result.

22	Status of this Memo

24	   This Internet-Draft is submitted in full conformance with the
25	   provisions of BCP 78 and BCP 79.

27	   Internet-Drafts are working documents of the Internet Engineering
28	   Task Force (IETF).  Note that other groups may also distribute
29	   working documents as Internet-Drafts.  The list of current Internet-
30	   Drafts is at http://datatracker.ietf.org/drafts/current/.

32	   Internet-Drafts are draft documents valid for a maximum of six months
33	   and may be updated, replaced, or obsoleted by other documents at any
34	   time.  It is inappropriate to use Internet-Drafts as reference
35	   material or to cite them other than as "work in progress."

37	   This Internet-Draft will expire on June 17, 2013.

39	Copyright Notice

41	   Copyright (c) 2012 IETF Trust and the persons identified as the
42	   document authors.  All rights reserved.

44	   This document is subject to BCP 78 and the IETF Trust's Legal
45	   Provisions Relating to IETF Documents
46	   (http://trustee.ietf.org/license-info) in effect on the date of
47	   publication of this document.  Please review these documents
48	   carefully, as they describe your rights and restrictions with respect
49	   to this document.  Code Components extracted from this document must
50	   include Simplified BSD License text as described in Section 4.e of
51	   the Trust Legal Provisions and are provided without warranty as
52	   described in the Simplified BSD License.

54	Table of Contents

56	   1.  Introduction
57	     1.1.  Canonicalization
58	   2.  Security Uses
59	     2.1.  Types of Identifiers
60	     2.2.  False Positives and Negatives
61	     2.3.  Hypothetical Example
62	   3.  Common Identifiers
63	     3.1.  Hostnames
64	       3.1.1.  IPv4 Literals
65	       3.1.2.  IPv6 Literals
66	       3.1.3.  Internationalization
67	       3.1.4.  Resolution for comparison
68	     3.2.  Ports and Service Names
69	     3.3.  URIs
70	       3.3.1.  Scheme component
71	       3.3.2.  Authority component
72	       3.3.3.  Path component
73	       3.3.4.  Query component
74	       3.3.5.  Fragment component
75	       3.3.6.  Resolution for comparison
76	     3.4.  Email Address-like Identifiers
77	   4.  General Issues
78	     4.1.  Conflation
79	     4.2.  Internationalization
80	     4.3.  Scope
81	     4.4.  Temporality
82	   5.  Security Considerations
83	   6.  Acknowledgements
84	   7.  IANA Considerations
85	   8.  Informative References
86	   Author's Address

88	1.  Introduction

90	   In computing and the Internet, various types of "identifiers" are
91	   used to identify humans, devices, content, etc.  Before discussing
92	   security issues, we first give some background on some typical
93	   processes involving identifiers.

95	   As depicted in Figure 1, there are multiple processes relevant to our
96	   discussion.
97	   1.  An identifier must first be generated.  If the identifier is
98	       intended to be unique, the generation process includes some
99	       mechanism, such as allocation by a central authority, to help
100	       ensure uniqueness.  However, the notion of "unique" involves
101	       determining whether a putative identifier matches any other
102	       already-allocated identifier.  As we will see, for many types of
103	       identifiers, this is not simply an exact binary match.

105	       As a result of generating the identifier, it is often stored in
106	       two locations: with the requester or "holder" of the identifier,
107	       and with some repository of identifiers (e.g., DNS).  For
108	       example, if the identifier was allocated by a central authority,
109	       the repository might be that authority.  If the identifier
110	       identifies a device or content on a device, the repository might
111	       be that device.
112	   2.  The identifier must be distributed, either by the holder of the
113	       identifier or by a repository of identifiers, to others who could
114	       use the identifier.  This distribution might be electronic, but
115	       sometimes it is via other channels such as voice, business card,
116	       billboard, or other form of advertisement.  The identifier itself
117	       might be distributed directly, or it might be used to generate a
118	       portion of another type of identifier that is then distributed.
119	       For example, a URI or email address might include a server name,
120	       and hence distributing the URI or email address also inherently
121	       distributes the server name.
122	   3.  The identifier must be used by some party.  Generally the user
123	       supplies the identifier which is (directly or indirectly) sent to
124	       the repository of identifiers.  For example, using an email
125	       address to send email to the holder of an identifier may result
126	       in the email arriving at the holder's email server which has
127	       access to the mail stores.

129	       The repository of identifiers must then attempt to match the
130	       user-supplied identifier with an identifier in its repository.

132	                            +------------+
133	                            |  Holder of |     1. Generation
134	                            | identifier +<---------+
135	                            +----+-------+          |
136	                                 |                  | Match
137	                                 |                  v/
138	                                 |          +-------+-------+
139	                                 +----------+ Repository of |
140	                                 |          |  identifiers  |
141	                                 |          +-------+-------+
142	                 2. Distribution |                  ^\
143	                                 |                  | Match
144	                                 v                  |
145	                       +---------+-------+          |
146	                       |      User of    |          |
147	                       |    identifier   +----------+
148	                       +-----------------+    3. Use

150	                       Typical Identifier Processes

152	                                 Figure 1

154	   One key aspect is that the identifier values passed in generation,
155	   distribution, and use may all be in different forms.  For example,
156	   generation might be exchanged in printed form, distribution done via
157	   voice, and use done electronically.  As such, the match process can
158	   be complicated.

160	   Furthermore, in many uses, the relationship between holder,
161	   repositories, and users may be more involved.  For example, when a
162	   hierarchy of web caches exists, each cache is itself a repository of a
163	   sort, and the match process is usually intended to be the same as on
164	   the origin server.

166	1.1.  Canonicalization

168	   Perhaps the most common algorithm for comparison involves first
169	   converting each identifier to a canonical form (a process known as
170	   "canonicalization" or "normalization"), and then testing the
171	   resulting canonical representations for bitwise equality.  In so
172	   doing, it is thus critical that all entities involved agree on the
173	   same canonical form and use the same canonicalization algorithm so
174	   that the overall comparison process is also the same.

176	   Note that in some contexts, such as in internationalization, the
177	   terms "canonicalization" and "normalization" have a precise meaning.
178	   In this document, however, we use these terms synonymously in their
179	   more generic form, to mean conversion to some standard form.

181	   While the most common method of comparison includes canonicalization,
182	   comparison can also be done by defining an equivalence algorithm,
183	   where no single form is canonical.  However, in most cases, a
184	   canonical form is useful for other purposes, such as output, and so
185	   in such cases defining a canonical form suffices to define a
186	   comparison method.
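
   The canonicalize-then-compare approach described above can be
   sketched as follows.  This is only an illustration: the helper names
   and the two example canonicalizers are hypothetical, not taken from
   any specification.  The point is that two parties using different
   canonicalizers can disagree about the same pair of identifiers.

```python
from typing import Callable


def equal_by_canonical_form(a: str, b: str,
                            canonicalize: Callable[[str], str]) -> bool:
    """Canonicalize both identifiers, then test for bitwise equality."""
    return canonicalize(a).encode("utf-8") == canonicalize(b).encode("utf-8")


def lowercase_ascii(identifier: str) -> str:
    """Hypothetical canonicalizer: fold only ASCII A-Z to lowercase."""
    return "".join(
        chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in identifier
    )


def identity(identifier: str) -> str:
    """Hypothetical canonicalizer: the identifier is already canonical."""
    return identifier


# Party 1 folds case and Party 2 does not, so they reach different
# answers for the same pair of identifiers.
party1_says_equal = equal_by_canonical_form("HTTP", "http", lowercase_ascii)
party2_says_equal = equal_by_canonical_form("HTTP", "http", identity)
```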

188	2.  Security Uses

190	   Identifiers such as hostnames, URIs, and email addresses are used in
191	   security contexts to identify principals and resources as well as
192	   other security parameters such as types and values of claims.  Those
193	   identifiers are then used to make security decisions based on an
194	   identifier supplied via some protocol.  For example:
195	   o  Authentication: a protocol might match a security principal
196	      identifier to look up expected keying material, and then match
197	      keying material.
198	   o  Authorization: a protocol might match a resource name to look up
199	      an access control list (ACL), and then look up the security
200	      principal identifier (or a surrogate for it) in that ACL.
201	   o  Accounting: a system might create an accounting record for a
202	      security principal identifier or resource name, and then might
203	      later need to match a supplied identifier to (for example) add new
204	      filtering rules based on the records in order to stop an attack.

206	   If the parties involved in a security decision use different matching
207	   algorithms for the same identifiers, then failure scenarios ranging
208	   from denial of service to elevation of privilege can result, as we
209	   will see.

211	   This is especially complicated in cases involving multiple parties
212	   and multiple protocols.  For example, there are many scenarios where
213	   some form of "security token service" is used to grant to a requester
214	   permission to access a resource, where the resource is held by a
215	   third party that relies on the security token service (see Figure 2).
216	   The protocol used to request permission (e.g., Kerberos or OAuth) may
217	   be different from the protocol used to access the resource (e.g.,
218	   HTTP).  Opportunities for security problems arise when two protocols
219	   define different comparison algorithms for the same type of
220	   identifier, or when a protocol is ambiguously specified and two
221	   endpoints (e.g., a security token service and a resource holder)
222	   implement different algorithms within the same protocol.

224	        +----------+
225	        | security |
226	        |  token   |
227	        | service  |
228	        +----------+
229	             ^
230	             | 1. supply credentials and
231	             | get token for resource
232	             |                                             +--------+
233	        +----------+  2. supply token and access resource  |resource|
234	        |requester |=------------------------------------->| holder |
235	        +----------+                                       +--------+

237	                         Simple Security Exchange

239	                                 Figure 2

241	   In many cases the situation is more complex.  With certificates, the
242	   name in a certificate gets compared against names in ACLs or other
243	   things.  In the case of web site security, the name in the
244	   certificate gets compared to a portion of the URI that a user may
245	   have typed into a browser.  The fact that many different people are
246	   doing the typing, on many different types of systems, complicates the
247	   problem.

249	   Add to this the certificate enrollment step and the certificate
250	   issuance step, and two more parties have an opportunity to adjust the
251	   encoding; worse, the software that supports them might make changes
252	   of which the parties are unaware.

254	2.1.  Types of Identifiers

256	   In this document we will refer to the following types of identifiers:

258	   o  Absolute: identifiers that can be compared byte-by-byte for
259	      equality.  Two identifiers that have different bytes are defined
260	      to be different.  For example, binary IP addresses are in this
261	      class.
262	   o  Definite: identifiers that have a well-defined comparison
263	      algorithm on which all parties agree.  For example, URI scheme
264	      names are required to be ASCII and are defined to match in a case-
265	      insensitive way; the comparison is thus definite since all parties
266	      agree on how to do a case-insensitive match among ASCII strings.
267	   o  Indefinite: identifiers that have no single comparison algorithm
268	      on which all parties agree.  For example, human names are in this
269	      class.  Everyone might want the comparison to be tailored for
270	      their locale, for some definition of locale.  In some cases, there
271	      may be limited subsets of parties that might be able to agree
272	      (e.g., ASCII users might all agree on a common comparison
273	      algorithm whereas users of other Latin scripts, such as Turkish,
274	      may not), but identifiers often tend to leak out of such limited
275	      environments.
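
   As a rough illustration of the first two classes, the sketch below
   compares binary IP addresses byte-by-byte (Absolute) and URI scheme
   names with an ASCII case-insensitive match (Definite).  The function
   names are hypothetical.  The final comment notes why case folding
   stops being definite once locale-specific expectations, such as
   Turkish ones, are involved.

```python
import ipaddress


def absolute_equal(a: bytes, b: bytes) -> bool:
    """Absolute identifiers: different bytes mean different identifiers."""
    return a == b


def scheme_equal(a: str, b: str) -> bool:
    """Definite identifiers: ASCII-only, case-insensitive match."""
    if not (a.isascii() and b.isascii()):
        raise ValueError("URI scheme names must be ASCII")
    return a.lower() == b.lower()


binary_a = ipaddress.IPv4Address("192.0.2.1").packed
binary_b = bytes([192, 0, 2, 1])

# Indefinite: Python's default lowercasing maps "I" to "i", whereas a
# Turkish-locale user may expect dotless "\u0131"; no single rule
# satisfies everyone.
default_lower_of_I = "I".lower()
```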

277	2.2.  False Positives and Negatives

279	   It is first worth discussing in more detail the effects of errors in
280	   the comparison algorithm.  A "false positive" results when two
281	   identifiers compare as if they were equal, but in reality refer to
282	   two different objects (e.g., security principals or resources).  When
283	   privilege is granted on a match, a false positive thus results in an
284	   elevation of privilege, for example allowing execution of an
285	   operation that should not have been permitted otherwise.  When
286	   privilege is denied on a match (e.g., matching an entry in a block/
287	   deny list or a revocation list), a false positive instead denies a
288	   permissible operation.  At best, this can cause worse performance
289	   (e.g., a cache miss, or forcing redundant authentication), and at
290	   worst can result in a denial of service.

292	   A "false negative" results when two identifiers that in reality refer
293	   to the same thing compare as if they were different, and the effects
294	   are the reverse of those for false positives.  That is, when
295	   privilege is granted on a match, the result is at best worse
296	   performance and at worst a denial of service; when privilege is
297	   denied on a match, elevation of privilege results.

299	   Figure 3 summarizes these effects.

301	                  | "Grant on match"       | "Deny on match"
302	   ---------------+------------------------+-----------------------
303	   False positive | Elevation of privilege | Denial of service
304	   ---------------+------------------------+-----------------------
305	   False negative | Denial of service      | Elevation of privilege
306	   ---------------+------------------------+-----------------------

308	                Worst Effects of False Positives/Negatives

310	                                 Figure 3

312	   Elevation of privilege is almost always seen as far worse than denial
313	   of service.  Hence, for URIs for example, Section 6.1 of [RFC3986]
314	   states: "comparison methods are designed to minimize false negatives
315	   while strictly avoiding false positives".

317	   Thus URIs were defined with a "grant privilege on match" paradigm in
318	   mind, where it is critical to prevent elevation of privilege while
319	   minimizing denial of service.  Using URIs in a "deny privilege on
320	   match" system can thus be problematic.
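
   To make the "deny privilege on match" problem concrete, the sketch
   below uses a hypothetical deny list checked by exact string
   comparison.  Under [RFC3986] the scheme and host are case-insensitive,
   so the second URI names the same resource as the first, yet the
   simple comparison yields a false negative and the request is not
   blocked.  The names and URIs are illustrative assumptions.

```python
from typing import AbstractSet


def is_blocked(uri: str, deny_list: AbstractSet[str]) -> bool:
    """Deny on match, using exact string comparison only."""
    return uri in deny_list


deny_list = frozenset({"http://example.com/private"})

# Exact form listed in the deny list: correctly blocked.
blocked_exact = is_blocked("http://example.com/private", deny_list)

# Equivalent URI with different case in scheme and host: a false
# negative, so the deny rule is bypassed (elevation of privilege).
blocked_variant = is_blocked("HTTP://EXAMPLE.COM/private", deny_list)
```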

322	2.3.  Hypothetical Example

324	   In this example, both security principals and resources are
325	   identified using URIs.  Foo Corp has paid example.com for access to
326	   the Stuff service.  Foo Corp allows its employees to create accounts
327	   on the Stuff service.  Alice gets the account
328	   "http://example.com/Stuff/FooCorp/alice" and Bob gets
329	   "http://example.com/Stuff/FooCorp/bob".  It turns out, however, that
330	   Foo Corp's URI canonicalizer includes URI fragment components in
331	   comparisons whereas example.com's does not, and Foo Corp does not
332	   disallow the # character in the account name.  So Chuck, who is a
333	   malicious employee of Foo Corp, asks to create an account at
334	   example.com with the name alice#stuff.  Foo Corp's URI logic checks
335	   its records for accounts it has created with Stuff and sees that
336	   there is no account with the name alice#stuff.  Hence, in its
337	   records, it associates the account alice#stuff with Chuck and will
338	   only issue tokens good for use with
339	   "http://example.com/Stuff/FooCorp/alice#stuff" to Chuck.

341	   Chuck, the attacker, goes to a security token service at Foo Corp and
342	   asks for a security token good for
343	   "http://example.com/Stuff/FooCorp/alice#stuff".  Foo Corp issues the
344	   token since Chuck is the legitimate owner (in Foo Corp's view) of the
345	   alice#stuff account.  Chuck then submits the security token in a
346	   request to "http://example.com/Stuff/FooCorp/alice".

348	   But example.com uses a URI canonicalizer that, for the purposes of
349	   checking equality, ignores fragments.  So when example.com looks in
350	   the security token to see if the requester has permission from Foo
351	   Corp to access the given account it successfully matches the URI in
352	   the security token, "http://example.com/Stuff/FooCorp/alice#stuff",
353	   with the requested resource name
354	   "http://example.com/Stuff/FooCorp/alice".

356	   Leveraging the inconsistencies in the canonicalizers used by Foo Corp
357	   and example.com, Chuck is able to successfully launch an elevation of
358	   privilege attack and access Alice's resource.
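
   The mismatch in this example can be reproduced with two small
   comparison functions.  This is a sketch under the assumption that
   Foo Corp compares full URI strings while example.com strips the
   fragment before comparing; the function names are hypothetical, and
   urldefrag() comes from Python's standard urllib.parse module.

```python
from urllib.parse import urldefrag


def foo_corp_equal(a: str, b: str) -> bool:
    """Foo Corp's view: the fragment is part of the identifier."""
    return a == b


def example_com_equal(a: str, b: str) -> bool:
    """example.com's view: fragments are ignored when comparing."""
    return urldefrag(a).url == urldefrag(b).url


alice_account = "http://example.com/Stuff/FooCorp/alice"
chuck_token = "http://example.com/Stuff/FooCorp/alice#stuff"

# Foo Corp sees two distinct accounts, so it issues Chuck a token.
issuer_sees_same_account = foo_corp_equal(chuck_token, alice_account)

# example.com sees the token as matching Alice's account.
resource_holder_accepts = example_com_equal(chuck_token, alice_account)
```

   Because the issuer answers "different" and the resource holder
   answers "same", Chuck's token is accepted for Alice's resource.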

360	   Furthermore, consider an attacker using a similarly named corporation
361	   such as "foocorp" (or any variation containing a non-ASCII character
362	   that some humans might expect to represent the same corporation).  If
363	   the resource holder treats them as different, but the security token
364	   service treats them as the same, then again elevation of privilege
365	   can occur.

367	3.  Common Identifiers

369	   In this section, we walk through a number of common types of
370	   identifiers and discuss various issues related to comparison that may
371	   affect security whenever they are used to identify security
372	   principals or resources.  These examples illustrate common patterns
373	   that may arise with other types of identifiers.

375	3.1.  Hostnames

377	   Hostnames (composed of dot-separated labels) are commonly used either
378	   directly as identifiers, or as components in identifiers such as in
379	   URIs and email addresses.  Another example is in [RFC5280], sections
380	   7.2 and 7.3 (and updated in section 3 of
381	   [I-D.ietf-pkix-rfc5280-clarifications]), which specify use in
382	   certificates.

384	   In this section we discuss a number of issues in comparing strings
385	   that appear to be some form of hostname.

387	   It is first worth pointing out that the term itself is often
388	   ambiguous, and hence it is important that any use clarify which
389	   definition is intended.  Some examples of definitions include:
390	   a.  A Fully-Qualified Domain Name (FQDN),
391	   b.  An FQDN that is associated with address records,
392	   c.  The leftmost label in an FQDN, or
393	   d.  The leftmost label in an FQDN that is associated with address
394	       records.

396	   The use of different definitions in different places results in
397	   questions such as whether "example" and "example.com" are considered
398	   equal or not.

400	   Section 3 of [RFC6055] discusses the differences between a "hostname"
401	   and a "DNS name", where the former is a subset of the latter by using
402	   a restricted set of characters.  If one canonicalizer uses the "DNS
403	   name" definition whereas another uses a "hostname" definition, a name
404	   might be valid in the former but invalid in the latter.  As long as
405	   invalid identifiers are denied privilege, this difference will not
406	   result in elevation of privilege.
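
   The "hostname" versus "DNS name" distinction can be sketched with two
   validators.  This is a simplified illustration: is_hostname() assumes
   the usual letters-digits-hyphen rule for each label, and
   is_dns_name() checks only that each label is 1 to 63 octets long.
   Both function names are hypothetical, and overall name length limits
   are omitted.

```python
import re

# One label under the restricted "hostname" character set.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def _labels(name: str) -> list:
    """Split a name into labels, ignoring one trailing root dot."""
    if name.endswith("."):
        name = name[:-1]
    return name.split(".")


def is_hostname(name: str) -> bool:
    """Valid under the restricted "hostname" definition."""
    return all(_LDH_LABEL.fullmatch(label) for label in _labels(name))


def is_dns_name(name: str) -> bool:
    """Valid under the looser "DNS name" definition (label length only)."""
    return all(1 <= len(label.encode("utf-8")) <= 63
               for label in _labels(name))
```

   A name such as "_sip._tcp.example.com" passes is_dns_name() but fails
   is_hostname(), so the two canonicalizers disagree on its validity.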

408	   [IAB1123] briefly discusses issues with the ambiguity around whether
409	   a label will be "alphabetic", including among other issues, how
410	   "alphabetic" should be interpreted in an internationalized
411	   environment, and whether a hostname can be interpreted as an IP
412	   address.  We explore this last issue in more detail below.

414	3.1.1.  IPv4 Literals

416	   [RFC1123] section 2.1 states:

418	      Whenever a user inputs the identity of an Internet host, it SHOULD
419	      be possible to enter either (1) a host domain name or (2) an IP
420	      address in dotted-decimal ("#.#.#.#") form.  The host SHOULD check
421	      the string syntactically for a dotted-decimal number before
422	      looking it up in the Domain Name System.

424	   and

426	      This last requirement is not intended to specify the complete
427	      syntactic form for entering a dotted-decimal host number; that is
428	      considered to be a user-interface issue.

430	   In specifying the inet_addr() API, the POSIX standard [IEEE-1003.1]
431	   defines "IPv4 dotted decimal notation" as allowing not only strings
432	   of the form "10.0.1.2", but also octal and hexadecimal numbers, and
433	   addresses with fewer than four parts.  For example, "10.0.258",
434	   "0xA000102", and "012.0x102" all represent the same IPv4 address in
435	   standard "IPv4 dotted decimal" notation.  We will refer to this as
436	   the "loose" syntax of an IPv4 address literal.

438	   In section 6.1 of [RFC3493] getaddrinfo() is defined to support the
439	   same (loose) syntax as inet_addr():

441	      If the specified address family is AF_INET or AF_UNSPEC, address
442	      strings using Internet standard dot notation as specified in
443	      inet_addr() are valid.

445	   In contrast, section 6.3 of the same RFC states, specifying
446	   inet_pton():

448	      If the af argument of inet_pton() is AF_INET, the src string shall
449	      be in the standard IPv4 dotted-decimal form: ddd.ddd.ddd.ddd where
450	      "ddd" is a one to three digit decimal number between 0 and 255.
451	      The inet_pton() function does not accept other formats (such as
452	      the octal numbers, hexadecimal numbers, and fewer than four
453	      numbers that inet_addr() accepts).

455	   As shown above, inet_pton() uses what we will refer to as the
456	   "strict" form of an IPv4 address literal.  Some platforms also use
457	   the strict form with getaddrinfo() when the AI_NUMERICHOST flag is
458	   passed to it.
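
   The difference between the two forms can be sketched in pure Python.
   The parsers below are illustrative approximations of the behavior
   described above, not the platform implementations of inet_addr() or
   inet_pton(); the function names are hypothetical.  Each returns the
   address as a 32-bit integer or raises ValueError.

```python
def parse_ipv4_strict(text: str) -> int:
    """Strict form: exactly four decimal parts, each 1-3 digits, 0-255."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("strict form needs exactly four parts")
    value = 0
    for part in parts:
        if not (1 <= len(part) <= 3 and part.isascii() and part.isdigit()):
            raise ValueError("not a 1-3 digit decimal number: %r" % part)
        octet = int(part)
        if octet > 255:
            raise ValueError("part out of range: %r" % part)
        value = (value << 8) | octet
    return value


def _parse_number(part: str) -> int:
    """Parse one part as hexadecimal (0x...), octal (0...), or decimal."""
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif part.startswith("0") and len(part) > 1:
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    allowed = "0123456789abcdef"[:base]
    if not digits or any(c not in allowed for c in digits.lower()):
        raise ValueError("bad number: %r" % part)
    return int(digits, base)


def parse_ipv4_loose(text: str) -> int:
    """Loose form: one to four parts; the last part fills the rest."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("loose form needs one to four parts")
    *leading, last = [_parse_number(p) for p in parts]
    if any(n > 0xFF for n in leading):
        raise ValueError("leading part out of range")
    last_bits = 8 * (4 - len(leading))
    if last >= 1 << last_bits:
        raise ValueError("last part out of range")
    value = 0
    for n in leading:
        value = (value << 8) | n
    return (value << last_bits) | last
```

   Under this sketch, "10.0.258" and "012.0x102" both parse loosely to
   the same value as "10.0.1.2", while the strict parser rejects them.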

460	   Both the strict and loose forms are standard forms, and hence a
461	   protocol specification is still ambiguous if it simply defines a
462	   string to be in the "standard IPv4 dotted decimal form".  And, as a
463	   result of these differences, names such as "10.11.12" are ambiguous
464	   as to whether they are an IP address or a hostname, and even
465	   "10.11.12.13" can be ambiguous because of the "SHOULD" in RFC 1123
466	   above making it optional whether to treat it as an address or a name.

468	   Protocols and data formats that can use addresses in string form for
469	   security purposes need to resolve these ambiguities.  For example,
470	   for the host component of URIs, section 3.2.2 of [RFC3986] resolves
471	   the first ambiguity by only allowing the strict form, and the second
472	   ambiguity by specifying that it is considered an IPv4 address
473	   literal.  New protocols and data formats should similarly consider
474	   using the strict form rather than the loose form in order to better
475	   match user expectations.

477	   A string might be valid under the "loose" definition, but invalid
478	   under the "strict" definition.  As long as invalid identifiers are
479	   denied privilege, this difference will not result in elevation of
480	   privilege.  Some protocols, however, use strings that can be either
481	   an IP address literal or a hostname.  Such strings are at best
482	   Definite identifiers, and often turn out to be Indefinite
483	   identifiers.  (See Section 4.1 for more discussion.)

485	   Furthermore, when strings can contain non-ASCII characters, they can
486	   contain other characters that may look like dots or digits to a human
487	   viewing and/or entering the identifier, especially to one who might
488	   expect digits to appear in his or her native script.

490	3.1.2.  IPv6 Literals

492	   IPv6 addresses similarly have a wide variety of alternate but
493	   semantically identical string representations, as defined in section
494	   2.2 of [RFC4291] and section 2 of [I-D.ietf-6man-uri-zoneid].  As
495	   discussed in section 3.2.5 of [RFC5952], this fact causes problems in
496	   security contexts if comparison (such as in X.509 certificates) is
497	   done between strings rather than between the binary representations
498	   of addresses.

500	   [RFC5952] recently specified a recommended canonical string format as
501	   an attempt to solve this problem, but it may not be ubiquitously
502	   supported at present.  And, when strings can contain non-ASCII
503	   characters, the same issues (and more, since hexadecimal and colons
504	   are allowed) arise as with IPv4 literals.

506	   Whereas (binary) IPv6 addresses are Absolute identifiers, IPv6
507	   address literals are Definite identifiers, since string-to-address
508	   conversion for IPv6 address literals is unambiguous.
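
   A minimal sketch of comparing IPv6 literals by their binary
   representation rather than as strings, using Python's standard
   ipaddress module.  The function name is hypothetical, and zone
   identifiers are not considered here.

```python
import ipaddress


def ipv6_literals_equal(a: str, b: str) -> bool:
    """Compare two IPv6 literals via their 16-byte binary form."""
    return ipaddress.IPv6Address(a).packed == ipaddress.IPv6Address(b).packed


full_form = "2001:DB8:0:0:0:0:0:1"
short_form = "2001:db8::1"

# The strings differ, but they denote the same binary address.
same_as_strings = full_form == short_form
same_as_addresses = ipv6_literals_equal(full_form, short_form)
```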

510	3.1.3.  Internationalization

512	   The IETF policy on character sets and languages [RFC2277] requires
513	   support for UTF-8 in protocols, and as a result many protocols now do
514	   support non-ASCII characters.  When a hostname is sent in a UTF-8
515	   field, there are a number of ways it may be encoded.  For example,
516	   hostname labels might be encoded directly in UTF-8, or might first be
517	   Punycode-encoded [RFC3492] or even percent-encoded from UTF-8.

519	   For example, in URIs, [RFC3986] section 3.2.2 specifically allows for
520	   the use of percent-encoded UTF-8 characters in the hostname, as well
521	   as the use of IDNA encoding [RFC3490] using the Punycode algorithm.

523	   Percent-encoding is unambiguous for hostnames since the percent
524	   character cannot appear in the strict definition of a "hostname",
525	   though it can appear in a DNS name.

527	   Punycode-encoded labels (or "A-labels") on the other hand can be
528	   ambiguous if hosts are actually allowed to be named with a name
529	   starting with "xn--", and false positives can result.  While this may
530	   be extremely unlikely for normal scenarios, it nevertheless provides
531	   a possible vector for an attacker.

533	   A hostname comparator thus needs to decide whether a Punycode-encoded
534	   label should or should not be considered a valid hostname label, and
535	   if so, then whether it should match a label encoded in some other
536	   form such as a percent-encoded Unicode label (U-label).

538	   For example, Section 3 of "Transport Layer Security (TLS) Extensions"
539	   [RFC6066], states:

541	      "HostName" contains the fully qualified DNS hostname of the
542	      server, as understood by the client.  The hostname is represented
543	      as a byte string using ASCII encoding without a trailing dot.
544	      This allows the support of internationalized domain names through
545	      the use of A-labels defined in [RFC5890].  DNS hostnames are case-
546	      insensitive.  The algorithm to compare hostnames is described in
547	      [RFC5890], Section 2.3.2.4.
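
   As a rough illustration of the decision a hostname comparator faces,
   the following Python sketch maps a hostname that may arrive
   percent-encoded, as a U-label, or as an A-label onto one ASCII form
   before comparison.  The helper name is hypothetical, and Python's
   built-in "idna" codec implements the older IDNA rules of [RFC3490],
   so this is a sketch under those assumptions, not a recommended
   comparator.

```python
from urllib.parse import unquote


def to_a_label_form(host: str) -> str:
    """Return a lower-case ASCII (A-label) form of a hostname.

    Hypothetical helper: accepts percent-encoded UTF-8, U-labels, or
    A-labels and maps all three onto one string for comparison.
    """
    # Undo percent-encoding first; malformed UTF-8 raises an error
    # instead of being silently replaced.
    decoded = unquote(host, encoding="utf-8", errors="strict")
    # The "idna" codec applies the RFC 3490 ToASCII step to each label.
    ascii_form = decoded.encode("idna").decode("ascii")
    # All-ASCII labels pass through the codec unchanged, so fold case here.
    return ascii_form.lower()


forms = ["b%C3%BCcher.example", "bücher.example", "XN--BCHER-KVA.example"]
print({to_a_label_form(f) for f in forms})
```

   Note that this sketch always treats a label starting with "xn--" as
   Punycode-encoded, which is exactly the precedence choice discussed
   above, and that percent-decoding can introduce new '.' delimiters, so
   validity checks belong after this step.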

549	   For some additional discussion of security issues that arise with
550	   internationalization, see [TR36].

552	3.1.4.  Resolution for comparison

554	   Some systems (specifically Java URLs [JAVAURL]) use the rule that if
555	   two hostnames resolve to the same IP address(es) then the hostnames
556	   are considered equal.  That is, the canonicalization algorithm
557	   involves name resolution with an IP address being the canonical form.

559	   For example, if resolution was done via DNS, and DNS contained:

561	   example.com.  IN A 10.0.0.6
562	   example.net.  CNAME example.com.
563	   example.org.  IN A 10.0.0.6

565	   then the algorithm might treat all three names as equal, even though
566	   the third name might refer to a different entity.
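
   The false positive can be reproduced without any network access.
   The Python sketch below models the three records above as a static
   table (an assumption made purely for illustration; a real resolver
   would return lists of addresses and could fail or time out) and
   applies the "same address means same name" rule.

```python
# Static stand-in for the DNS data shown above (illustrative only).
RECORDS = {
    "example.com.": ("A", "10.0.0.6"),
    "example.net.": ("CNAME", "example.com."),
    "example.org.": ("A", "10.0.0.6"),
}


def resolve(name: str, max_hops: int = 8) -> str:
    """Follow CNAME records until an address record is reached."""
    for _ in range(max_hops):
        record_type, value = RECORDS[name]
        if record_type == "A":
            return value
        name = value  # CNAME: continue with the target name
    raise RuntimeError("CNAME chain too long")


def equal_by_resolution(a: str, b: str) -> bool:
    """Treat two names as equal when they resolve to the same address."""
    return resolve(a) == resolve(b)


# All three names compare equal, although example.org. is an
# independent record that merely shares a private address.
print(equal_by_resolution("example.com.", "example.org."))
```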

568	   With the introduction of dynamic IP addresses, private IP addresses,
569	   multiple IP addresses per name, multiple address families (e.g., IPv4
570	   vs. IPv6), devices that roam to new locations, commonly deployed DNS
571	   tricks that result in the answer depending on factors such as the
572	   requester's location and the load on the server whose address is
573	   returned, etc., this method of comparison cannot be relied upon.
574	   There is no guarantee that two names for the same host will resolve
575	   to the same IP addresses, nor that the addresses resolved
576	   refer to the same entity such as when the names resolve to private IP
577	   addresses, nor even that the system has connectivity (and the
578	   willingness to wait for the delay) to resolve names at the time the
579	   answer is needed.

581	   In addition, a comparison mechanism that relies on the ability to
582	   resolve identifiers such as hostnames to other identifiers such as IP
583	   addresses leaks information about security decisions to outsiders if
584	   these queries are publicly observable.

586	   Finally, it is worth noting that resolving two identifiers to
587	   determine if they refer to the same entity can be thought of as a use
588	   of such identifiers, as opposed to actually comparing the identifiers
589	   themselves, which is the focus of this document.

591	3.2.  Ports and Service Names

593	   Port numbers and service names are discussed in depth in [RFC6335].
594	   Historically, there were port numbers, service names used in SRV
595	   records, and mnemonic identifiers for assigned port numbers (known as
596	   port "keywords" at [IANA-PORT]).  The latter two are now unified, and
597	   various protocols use one or more of these types in strings.  For
598	   example, the common syntax used by many URI schemes allows port
599	   numbers but not service names.  Some implementations of the
600	   getaddrinfo() API support strings that can be either port numbers or
601	   port keywords (but not service names).

603	   For protocols that use service names that must be resolved, the
604	   issues are the same as those for resolution of addresses in
605	   Section 3.1.4.  In addition, Section 5.1 of [RFC6335] clarifies that
606	   service names/port keywords must contain at least one letter.  This
607	   prevents confusion with port numbers in strings where both are
608	   allowed.
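
   A minimal Python sketch of how the "at least one letter" rule keeps
   the two forms disjoint follows.  The function name and return shape
   are hypothetical; the syntax checks reflect the service name rules in
   Section 5.1 of [RFC6335] as understood here (1 to 15 characters,
   letters, digits, and hyphens, no leading, trailing, or doubled
   hyphen).

```python
import string

_ALNUM = set(string.ascii_letters + string.digits)


def classify_port_token(token: str):
    """Classify a string as a port number or a service name.

    Returns ("port", int), ("service", str), or ("invalid", None).
    """
    # All-digit strings can only be port numbers.
    if token and all(c in string.digits for c in token):
        number = int(token)
        return ("port", number) if number <= 65535 else ("invalid", None)
    chars_ok = all(c in _ALNUM or c == "-" for c in token)
    has_letter = any(c in string.ascii_letters for c in token)
    hyphens_ok = not (
        token.startswith("-") or token.endswith("-") or "--" in token
    )
    if 1 <= len(token) <= 15 and chars_ok and has_letter and hyphens_ok:
        return ("service", token)
    return ("invalid", None)


print(classify_port_token("80"), classify_port_token("http"))
```

   Even with disjoint syntax, a comparator still has to decide whether
   strings such as "80" and "0080" denote the same port; this sketch
   compares the parsed integer.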

610	3.3.  URIs

612	   This section looks at issues related to using URIs for security
613	   purposes.  For example, [RFC5280], section 7.4, specifies comparison
614	   of URIs in certificates.  Examples of URIs in security token-based
615	   access control systems include WS-*, SAML-P and OAuth WRAP.  In such
616	   systems, a variety of participants in the security infrastructure are
617	   identified by URIs.  For example, requesters of security tokens are
618	   sometimes identified with URIs.  The issuers of security tokens and
619	   the relying parties who are intended to consume security tokens are
620	   frequently identified by URIs.  Claims in security tokens often have
621	   their types defined using URIs and the values of the claims can also
622	   be URIs.

624	   Also, when a URI is embedded in plain text (e.g., an email message),
625	   there is an additional concern because there is no termination
626	   criterion for a URI.  For example, consider
627	   http://unicode.org/cldr/utility/list-unicodeset.jsp?a=a&amp;g=gc.
628	   Some applications that detect URIs will stop before the first '.' in
629	   the path, while others go to the last '.', and yet others may stop at the
630	   ';'.  As another point of comparison, Section 2.37 of [EE] (a
631	   standard for history citations) specifies the use of a space after a
632	   URI and before the punctuation.

634	   URIs are defined with multiple components, each of which has its own
635	   rules.  We cover each in turn below.  However, it is also important
636	   to note that there exist multiple comparison algorithms.  [RFC3986]
637	   section 6.2 states:

639	      A variety of methods are used in practice to test URI equivalence.
640	      These methods fall into a range, distinguished by the amount of
641	      processing required and the degree to which the probability of
642	      false negatives is reduced.  As noted above, false negatives
643	      cannot be eliminated.  In practice, their probability can be
644	      reduced, but this reduction requires more processing and is not
645	      cost-effective for all applications.
646	      If this range of comparison practices is considered as a ladder,
647	      the following discussion will climb the ladder, starting with
648	      practices that are cheap but have a relatively higher chance of
649	      producing false negatives, and proceeding to those that have
650	      higher computational cost and lower risk of false negatives.

652	   The ladder approach has both pros and cons.  On the pro side, it
653	   allows some uses to optimize for security, and other uses to optimize
654	   for cost, thus allowing URIs to be applicable to a wide range of
655	   uses.  A disadvantage is that when different approaches are taken by
656	   different components in the same system using the same identifiers,
657	   the inconsistencies can result in security issues.

659	3.3.1.  Scheme component

661	   [RFC3986] defines URI schemes as being case-insensitive ASCII and in
662	   section 6.2.2.1 specifies that scheme names should be normalized to
663	   lower-case characters.

665	   New schemes can be defined over time.  In general, however, two URIs
666	   with an unrecognized scheme cannot be safely compared.  This is
667	   because the canonicalization and comparison rules for the other
668	   components may vary by scheme.  For example, a new URI scheme might
669	   have a default port of X, and without that knowledge, a comparison
670	   algorithm cannot know whether "example.com" and "example.com:X"
671	   should be considered to match in the authority component.  Hence for
672	   security purposes, it is safest for unrecognized schemes to be
673	   treated as invalid identifiers.  However, if the URIs are only used
674	   with a "grant access on match" paradigm then unrecognized schemes can
675	   be supported by doing a generic case-sensitive comparison, at the
676	   expense of some false negatives.
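
   The two options can be sketched in Python as follows.  The table of
   known schemes and both function names are assumptions made for
   illustration; a real comparator would carry per-scheme rules for
   every component, not just the default port.

```python
from urllib.parse import urlsplit

# Assumed table of the schemes this comparator understands.
DEFAULT_PORTS = {"http": 80, "https": 443}


def authority_key(uri: str):
    """Return a (scheme, host, port) key, rejecting unknown schemes."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        # Safest choice: an unrecognized scheme is an invalid identifier.
        raise ValueError(f"unrecognized scheme: {scheme!r}")
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    host = (parts.hostname or "").lower()
    return (scheme, host, port)


def match_for_grant_only(a: str, b: str) -> bool:
    """Generic fallback: exact, case-sensitive string comparison.

    Acceptable only under "grant access on match", since it produces
    false negatives but no false positives.
    """
    return a == b


print(authority_key("http://example.com") == authority_key("HTTP://EXAMPLE.com:80/"))
```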

678	3.3.2.  Authority component

680	   The authority component is scheme-specific, but many schemes follow a
681	   common syntax that allows for userinfo, host, and port.

683	3.3.2.1.  Host

685	   Section 3.1 discussed issues with hostnames in general.  In addition,
686	   [RFC3986] section 3.2.2 allows future changes using the IPvFuture
687	   production.  As with IPv4 and IPv6 literals, IPvFuture formats may
688	   have issues with multiple semantically identical string
689	   representations, and may also be semantically identical to an IPv4 or
690	   IPv6 address.  As such, false negatives may be common if IPvFuture is
691	   used.

693	3.3.2.2.  Port

695	   See discussion in Section 3.2.

697	3.3.2.3.  Userinfo

699	   [RFC3986] defines the userinfo production that allows arbitrary data
700	   about the user of the URI to be placed before '@' signs in URIs.  For
701	   example: "http://alice:bob:chuck@example.com/bar" has the value
702	   "alice:bob:chuck" as its userinfo.  When comparing URIs in a security
703	   context, one must decide whether to treat the userinfo as being
704	   significant or not.  Some URI comparison services for example treat
705	   "http://alice:ick@example.com" and "http://example.com" as being
706	   equal.
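
   A short Python sketch makes the choice explicit by separating the
   userinfo from the rest of the URI, so that a caller can decide
   whether to compare it.  The helper name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def split_userinfo(uri: str):
    """Return (userinfo or None, URI with the userinfo removed)."""
    parts = urlsplit(uri)
    # Everything before the last '@' in the authority is userinfo.
    userinfo, at_sign, hostport = parts.netloc.rpartition("@")
    stripped = urlunsplit(parts._replace(netloc=hostport))
    return (userinfo if at_sign else None), stripped


print(split_userinfo("http://alice:bob:chuck@example.com/bar"))
```

   A comparator that ignores userinfo would compare only the second
   element, and would therefore treat "http://alice:ick@example.com" and
   "http://example.com" as equal.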

708	   When the userinfo is treated as being significant, it has additional
709	   considerations (e.g., whether it is case-sensitive or not) which we
710	   cover in Section 3.4.

712	3.3.3.  Path component

714	   [RFC3986] supports the use of path segment values such as "./" or
715	   "../" for relative URIs.  Strictly speaking, including such path
716	   segment values in a fully qualified URI is syntactically illegal but
717	   [RFC3986] section 5.2.4 nevertheless defines an algorithm to remove
718	   them.
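
   For reference, the dot-segment removal steps given in [RFC3986] can
   be sketched in Python as below.  This is a direct transcription of
   those steps as understood here, offered as an illustration rather
   than a vetted implementation.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a URI path."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../x" becomes "/x"
            if output:
                output.pop()         # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
```

   Two components that disagree on whether this step runs before
   comparison will disagree on whether "/a/b/../g" and "/a/g" match.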

720	   Unless a scheme states otherwise, the path component is defined to be
721	   case-sensitive.  However, if the resource is stored and accessed
722	   using a filesystem with case-insensitive paths, there will be many
723	   paths that refer to the same resource.  As such, false negatives can
724	   be common in this case.

726	3.3.4.  Query component

728	   There is the question as to whether "http://example.com/foo",
729	   "http://example.com/foo?", and "http://example.com/foo?bar" are each
730	   considered equal or different.

732	   Similarly, it is unspecified whether the order of values matters.
733	   For example, should "http://example.com/blah?ick=bick&foo=bar" be
734	   considered equal to "http://example.com/blah?foo=bar&ick=bick"?  And
735	   if a domain name is permitted to appear in a query component (e.g.,
736	   in a reference to another URI), the same issues in Section 3.1 apply.
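
   The difference between the two readings can be shown with a small
   Python sketch.  Both function names are hypothetical, and treating
   the query as a list of "name=value" pairs separated by '&' is itself
   an assumption that not every scheme or application shares.

```python
from urllib.parse import parse_qsl, urlsplit


def queries_equal_exact(a: str, b: str) -> bool:
    """Compare the raw query strings character by character."""
    return urlsplit(a).query == urlsplit(b).query


def queries_equal_unordered(a: str, b: str) -> bool:
    """Compare queries as unordered collections of name/value pairs."""
    pairs_a = parse_qsl(urlsplit(a).query, keep_blank_values=True)
    pairs_b = parse_qsl(urlsplit(b).query, keep_blank_values=True)
    return sorted(pairs_a) == sorted(pairs_b)


first = "http://example.com/blah?ick=bick&foo=bar"
second = "http://example.com/blah?foo=bar&ick=bick"
print(queries_equal_exact(first, second), queries_equal_unordered(first, second))
```

   Note also that this parser reports an empty query for both
   "http://example.com/foo" and "http://example.com/foo?", so the choice
   of parsing library can silently settle the first question as well.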

738	3.3.5.  Fragment component

740	   Some URI formats include fragment identifiers.  These are typically
741	   handles to locations within a resource and are used for local
742	   reference.  A classic example is the use of fragments in HTTP URIs
743	   where a URI of the form "http://example.com/blah.html#ick" means
744	   retrieve the resource "http://example.com/blah.html" and, once it has
745	   arrived locally, find the HTML anchor named ick and display that.

747	   So, for example, when a user clicks on the link
748	   "http://example.com/blah.html#baz" a browser will check its cache by
749	   doing a URI comparison for "http://example.com/blah.html" and, if the
750	   resource is present in the cache, a match is declared.

752	   Hence comparisons for security purposes typically ignore the fragment
753	   component and treat all fragments as equal to the full resource.
754	   However, if one were actually trying to compare the piece of a
755	   resource that was identified by the fragment identifier, ignoring it
756	   would result in potential false positives.
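
   A minimal Python sketch of a fragment-insensitive comparison,
   assuming a hypothetical helper name, is:

```python
from urllib.parse import urldefrag


def same_resource_ignoring_fragment(a: str, b: str) -> bool:
    """Compare two URIs after dropping any fragment component."""
    return urldefrag(a).url == urldefrag(b).url


print(
    same_resource_ignoring_fragment(
        "http://example.com/blah.html#ick",
        "http://example.com/blah.html#baz",
    )
)
```

   A caller that cares about the piece of the resource identified by the
   fragment would instead need to compare the fragment values as well.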

758	3.3.6.  Resolution for comparison

760	   As with Section 3.1.4 for hostnames, it may be tempting to define a
761	   URI comparison algorithm based on whether they resolve to the same
762	   content.  Similar problems exist, however, including content that
763	   dynamically changes over time or based on factors such as the
764	   requester's location, potential lack of external connectivity at the
765	   time/place comparison is done, potentially undesirable delay
766	   introduced, etc.

768	   In addition, as noted in Section 3.1.4, resolution leaks information
769	   about security decisions to outsiders if the queries are publicly
770	   observable.

772	3.4.  Email Address-like Identifiers

774	   Section 3.4.1 of [RFC5322] defines the syntax of an email address-
775	   like identifier, and Section 3.2 of [RFC6532] updates it to support
776	   internationalization.  [RFC5280], section 7.5, further discusses the
777	   use of internationalized email addresses in certificates.

779	   [RFC6532] in turn points to Section 13 of [RFC6530], which
780	   discusses many of the issues resulting from
781	   internationalization.

783	   Email address-like identifiers have a local part and a domain part.
784	   The issues with the domain part are essentially the same as with
785	   hostnames, covered earlier.

787	   The local part is left for each domain to define.  People quite
788	   commonly use email addresses as usernames with web sites such as
789	   banks or shopping sites, but the site doesn't know whether
790	   foo@example.com is the same person as FOO@example.com.  Thus email
791	   address-like identifiers are typically Indefinite identifiers.

793	   To avoid false positives, some security mechanisms (such as
794	   [RFC5280]) compare the local part using an exact match.  Hence, like
795	   URIs, email address-like identifiers are designed for use in grant-
796	   on-match security schemes, not in deny-on-match schemes.
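
   The exact-match approach can be sketched in Python as follows.  The
   function name is hypothetical, and the sketch assumes the domain part
   is already in ASCII form (for example, A-labels), so that a simple
   case fold is sufficient for it.

```python
def email_ids_match(a: str, b: str) -> bool:
    """Compare two email address-like identifiers.

    The local part must match exactly; the domain part is compared
    without regard to ASCII case.
    """
    local_a, at_a, domain_a = a.rpartition("@")
    local_b, at_b, domain_b = b.rpartition("@")
    if not (at_a and at_b):
        raise ValueError("identifier has no '@' delimiter")
    return local_a == local_b and domain_a.lower() == domain_b.lower()


print(email_ids_match("foo@example.com", "foo@EXAMPLE.COM"))
print(email_ids_match("foo@example.com", "FOO@example.com"))
```

   The second comparison fails even if the receiving domain delivers
   both forms to the same mailbox, which is the false negative that
   makes such identifiers unsuitable for deny-on-match schemes.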

798	   Furthermore, if a mailbox is stored and accessed using a filesystem
799	   with case-insensitive paths, there may be many paths that refer to
800	   the same mailbox.  As such, false negatives can be common in this
801	   case.

803	4.  General Issues

805	4.1.  Conflation

807	   There are a number of examples (some in the preceding sections) of
808	   strings that conflate two types of identifiers, using some heuristic
809	   to try to determine which type of identifier is given.  Similarly,
810	   two ways of encoding the same type of identifier might be conflated
811	   within the same string.

813	   Some examples include:
814	   1.  A string that might be an IPv4 address literal or an IPv6 address
815	       literal
816	   2.  A string that might be an IP address literal or a hostname
817	   3.  A string that might be a port number or a service name
818	   4.  A DNS label that might be literal or be Punycode-encoded

820	   Strings that allow such conflation can only be considered Definite if
821	   there exists a well-defined rule to determine which identifier type
822	   is meant.  One way to do so is to ensure that the valid syntax for
823	   the two is disjoint (e.g., distinguishing IPv4 vs. IPv6 address
824	   literals by the use of colons in the latter).  A second way to do so
825	   is to define a precedence rule that results in some identifiers being
826	   inaccessible via a conflated string (e.g., a host literally named
827	   "xn--de-jg4avhby1noc0d" may be inaccessible due to the "xn--" prefix
828	   denoting the use of Punycode encoding).  In some cases, such
829	   inaccessible space may be reserved so that the actual set of
830	   identifiers in use is unambiguous.  For example, Section 2.5.5.2 of
831	   [RFC4291] defines a range of the IPv6 address space for representing
832	   IPv4 addresses.
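
   Both kinds of rule can be seen in the following Python sketch, which
   classifies a conflated host string.  The function name and the
   category labels are hypothetical; the colon test is the disjoint
   syntax rule, and the "xn--" test is the precedence rule.

```python
import ipaddress


def classify_host_string(text: str) -> str:
    """Decide which identifier type a conflated host string carries."""
    # Disjoint syntax: only IPv6 literals may contain a colon.
    if ":" in text:
        try:
            ipaddress.IPv6Address(text)
            return "ipv6-literal"
        except ValueError:
            return "invalid"
    try:
        ipaddress.IPv4Address(text)
        return "ipv4-literal"
    except ValueError:
        pass
    # Precedence rule: any label starting with "xn--" is taken to be
    # Punycode-encoded, so a host literally so named is inaccessible.
    labels = text.lower().split(".")
    if any(label.startswith("xn--") for label in labels):
        return "hostname-with-a-label"
    return "hostname"


for sample in ("192.0.2.1", "2001:db8::1", "example.com", "xn--de-jg4avhby1noc0d"):
    print(sample, classify_host_string(sample))
```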

834	4.2.  Internationalization

836	   In addition to the issues with hostnames discussed in Section 3.1.3,
837	   there are a number of internationalization issues that apply to many
838	   types of Definite and Indefinite identifiers.

840	   First, there is no DNS mechanism for identifying whether non-
841	   identical strings would be seen by a human as being equivalent.
842	   There are problematic examples even with ASCII (Basic Latin) strings
843	   including regional spelling variations such as "color" and "colour"
844	   and many non-English cases including partially-numeric strings in
845	   Arabic script contexts, Chinese strings in Simplified and Traditional
846	   forms, and so on.  Attempts to produce such alternate forms
847	   algorithmically could produce false positives and hence have an
848	   adverse effect on security.

850	   Second, some strings are visually confusable with others, and hence
851	   if a security decision is made by a user based on visual inspection,
852	   many opportunities for false positives exist.  As such, using visual
853	   inspection for security is unreliable.  In addition to the security
854	   issues, visual confusability also adversely affects the usability of
855	   identifiers distributed via visual mediums.  Similar issues can arise
856	   with audible confusability when using audio (e.g., for radio
857	   distribution, accessibility to the blind, etc.) in place of a visual
858	   medium.

860	   Determining whether a string is a valid identifier should typically
861	   be done after, or as part of, canonicalization.  Otherwise an
862	   attacker might use the canonicalization algorithm to inject (e.g.,
863	   via percent encoding, NFKC, or non-shortest-form UTF-8) delimiters
864	   such as '@' in an email address-like identifier, or a '.' in a
865	   hostname.
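
   The ordering problem can be demonstrated with a short Python sketch.
   The function names are hypothetical, and NFKC is used only as one
   example of a canonicalization step that can produce a delimiter.

```python
import unicodedata


def naive_check_passes(raw: str) -> bool:
    """Validity check applied to the raw string, before canonicalization."""
    return "@" not in raw


def canonical_local_part(raw: str) -> str:
    """Canonicalize first, then check for the '@' delimiter."""
    canonical = unicodedata.normalize("NFKC", raw)
    if "@" in canonical:
        raise ValueError("delimiter '@' present after canonicalization")
    return canonical


# U+FF20 FULLWIDTH COMMERCIAL AT becomes an ASCII '@' under NFKC.
suspect = "alice\uff20evil.example"
print(naive_check_passes(suspect))
print(unicodedata.normalize("NFKC", suspect))
```

   The raw string passes the early check, yet its canonical form
   contains a delimiter; checking after canonicalization rejects it.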

867	   Any case-insensitive comparisons need to define how comparison is
868	   done, since such comparisons may vary by locale of the endpoint.  As
869	   such, using case-insensitive comparisons in general often results in
870	   identifiers being either Indefinite or, if the legal character set is
871	   restricted (e.g., to ASCII), then Definite.
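
   As an illustration, the Python sketch below contrasts a comparison
   restricted to ASCII with two general-purpose case mappings that give
   different answers for the same pair of strings.  The function name is
   hypothetical.

```python
def ascii_caseless_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison defined only for ASCII input."""
    if not (a.isascii() and b.isascii()):
        raise ValueError("non-ASCII input is outside the defined set")
    return a.lower() == b.lower()


print(ascii_caseless_equal("Example.COM", "example.com"))

# Outside ASCII, the answer depends on which case mapping is chosen.
upper, mixed = "STRASSE", "straße"
print(upper.lower() == mixed.lower())        # simple lower-casing
print(upper.casefold() == mixed.casefold())  # full Unicode case folding
```

   With the character set restricted, every implementation can reach
   the same answer; without the restriction, two implementations that
   both claim to be case-insensitive may disagree.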

873	   See also [WEBER] for a more visual discussion of many of these
874	   issues.

876	   Finally, the set of permitted characters and the canonical form of
877	   the characters (and hence the canonicalization algorithm) sometimes
878	   varies by protocol today, even when the intent is to use the same
879	   identifier, such as when one protocol passes identifiers to the
880	   other.  See [I-D.ietf-precis-problem-statement] for further
881	   discussion.

883	4.3.  Scope

885	   Another issue arises when an identifier (e.g., "localhost",
886	   "10.11.12.13", etc.) is not globally unique.  [RFC3986] Section 1.1
887	   states:

889	      URIs have a global scope and are interpreted consistently
890	      regardless of context, though the result of that interpretation
891	      may be in relation to the end-user's context.  For example,
892	      "http://localhost/" has the same interpretation for every user of
893	      that reference, even though the network interface corresponding to
894	      "localhost" may be different for each end-user: interpretation is
895	      independent of access.

897	   Whenever a non-globally-unique identifier is passed to another entity
898	   outside of the scope of uniqueness, it will refer to a different
899	   resource, and can result in a false positive.  This problem is often
900	   addressed by using the identifier together with some other unique
901	   identifier of the context.  For example "alice" may uniquely identify
902	   a user within a system, but must be used with "example.com" (as in
903	   "alice@example.com") to uniquely identify the context outside of that
904	   system.

906	   It is also worth noting that non-globally-scoped IPv6 addresses can
907	   be written with, or otherwise associated with, a "zone ID" to
908	   identify the context (see [RFC4007] for more information).  However,
909	   zone IDs are only unique within a host, so they typically narrow,
910	   rather than expand, the scope of uniqueness of the resulting
911	   identifier.

913	4.4.  Temporality

915	   Often identifiers are not unique across all time, but have some
916	   lifetime associated with them after which they may be reassigned to
917	   another entity.  For example, bob@example.com might go to an
918	   employee of the Example company, but if he leaves and another Bob is
919	   later hired, the same identifier might be reused.  As another
920	   example, IP address 203.0.113.0 might be assigned to one subscriber,
921	   and then later reassigned to another subscriber.  If any entity
922	   that stores the identifier (e.g., in an access control list) is not
923	   updated, security issues can arise.  This issue is similar to the
924	   issue of scope discussed in Section 4.3, except that the scope of
925	   uniqueness is temporal rather than topological.

927	5.  Security Considerations

929	   This entire document is about security considerations.

931	   To minimize elevation of privilege issues, any system that requires
932	   the ability to use both deny and allow operations within the same
933	   identifier space should avoid the use of Indefinite identifiers in
934	   security comparisons.

936	   To minimize future security risks, any new identifiers being designed
937	   should specify an Absolute or Definite comparison algorithm, and if
938	   extensibility is allowed (e.g., as new schemes in URIs allow) then
939	   the comparison algorithm should remain invariant so that unrecognized
940	   extensions can be compared.  That is, security risks can be reduced
941	   by specifying the comparison algorithm, making sure to resolve any
942	   ambiguities pointed out in this document (e.g., "standard dotted
943	   decimal").

945	   Some issues (such as unrecognized extensions) can be mitigated by
946	   treating such identifiers as invalid.  Validity checking of
947	   identifiers is further discussed in [RFC3696].

949	   Perhaps the hardest issues arise when multiple protocols are used
950	   together, such as in the figure in Section 2, where the two protocols
951	   are defined or implemented using different comparison algorithms.
952	   When constructing an architecture that uses multiple such protocols,
953	   designers should pay attention to any differences in comparison
954	   algorithms among the protocols, in order to fully understand the
955	   security risks.  An area for future work is how to deal with such
956	   security risks in current systems.

958	6.  Acknowledgements

960	   Yaron Goland contributed to the discussion on URIs.  Patrik Faltstrom
961	   contributed to the background on identifiers.  John Klensin
962	   contributed text in a number of different sections.  Additional
963	   helpful feedback and suggestions came from Bernard Aboba, Leslie
964	   Daigle, Mark Davis, Russ Housley, Christian Huitema, Magnus Nystrom,
965	   and Chris Weber.

967	7.  IANA Considerations

969	   This document requires no actions by the IANA.

971	8.  Informative References

973	   [EE]       Mills, E., "Evidence Explained: Citing History Sources
974	              from Artifacts to Cyberspace", 2007.

976	   [I-D.ietf-6man-uri-zoneid]
977	              Carpenter, B., Cheshire, S., and R. Hinden, "Representing
978	              IPv6 Zone Identifiers in Address Literals and Uniform
979	              Resource Identifiers", draft-ietf-6man-uri-zoneid-06 (work
980	              in progress), December 2012.

982	   [I-D.ietf-pkix-rfc5280-clarifications]
983	              Yee, P., "Updates to the Internet X.509 Public Key
984	              Infrastructure Certificate and Certificate Revocation List
985	              (CRL) Profile", draft-ietf-pkix-rfc5280-clarifications-11
986	              (work in progress), November 2012.

988	   [I-D.ietf-precis-problem-statement]
989	              Blanchet, M. and A. Sullivan, "Stringprep Revision and
990	              PRECIS Problem Statement",
991	              draft-ietf-precis-problem-statement-08 (work in progress),
992	              September 2012.

994	   [IAB1123]  IAB, "The interpretation of rules in the ICANN gTLD
995	              Applicant Guidebook", February 2012, <http://www.iab.org/
996	              documents/correspondence-reports-documents/2012-2/
997	              iab-statement-the-interpretation-of-rules-in-the-icann-
998	              gtld-applicant-guidebook>.

1000	   [IANA-PORT]
1001	              IANA, "PORT NUMBERS", June 2011,
1002	              <http://www.iana.org/assignments/port-numbers>.

1004	   [IEEE-1003.1]
1005	              IEEE and The Open Group, "The Open Group Base
1006	              Specifications, Issue 6 IEEE Std 1003.1, 2004 Edition",
1007	              IEEE Std 1003.1, 2004.

1009	   [JAVAURL]  Oracle, "Class URL, Java(TM) Platform, Standard Ed. 7",
1010	              2011, <http://docs.oracle.com/javase/7/docs/api/java/net/
1011	              URL.html>.

1013	   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application
1014	              and Support", STD 3, RFC 1123, October 1989.

1016	   [RFC2277]  Alvestrand, H., "IETF Policy on Character Sets and
1017	              Languages", BCP 18, RFC 2277, January 1998.

1019	   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
1020	              "Internationalizing Domain Names in Applications (IDNA)",
1021	              RFC 3490, March 2003.

1023	   [RFC3492]  Costello, A., "Punycode: A Bootstring encoding of Unicode
1024	              for Internationalized Domain Names in Applications
1025	              (IDNA)", RFC 3492, March 2003.

1027	   [RFC3493]  Gilligan, R., Thomson, S., Bound, J., McCann, J., and W.
1028	              Stevens, "Basic Socket Interface Extensions for IPv6",
1029	              RFC 3493, February 2003.

1031	   [RFC3696]  Klensin, J., "Application Techniques for Checking and
1032	              Transformation of Names", RFC 3696, February 2004.

1034	   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
1035	              Resource Identifier (URI): Generic Syntax", STD 66,
1036	              RFC 3986, January 2005.

1038	   [RFC4007]  Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and
1039	              B. Zill, "IPv6 Scoped Address Architecture", RFC 4007,
1040	              March 2005.

1042	   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
1043	              Architecture", RFC 4291, February 2006.

1045	   [RFC5280]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
1046	              Housley, R., and W. Polk, "Internet X.509 Public Key
1047	              Infrastructure Certificate and Certificate Revocation List
1048	              (CRL) Profile", RFC 5280, May 2008.

1050	   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
1051	              October 2008.

1053	   [RFC5952]  Kawamura, S. and M. Kawashima, "A Recommendation for IPv6
1054	              Address Text Representation", RFC 5952, August 2010.

1056	   [RFC6055]  Thaler, D., Klensin, J., and S. Cheshire, "IAB Thoughts on
1057	              Encodings for Internationalized Domain Names", RFC 6055,
1058	              February 2011.

1060	   [RFC6066]  Eastlake, D., "Transport Layer Security (TLS) Extensions:
1061	              Extension Definitions", RFC 6066, January 2011.

1063	   [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
1064	              Cheshire, "Internet Assigned Numbers Authority (IANA)
1065	              Procedures for the Management of the Service Name and
1066	              Transport Protocol Port Number Registry", BCP 165,
1067	              RFC 6335, August 2011.

1069	   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for
1070	              Internationalized Email", RFC 6530, February 2012.

1072	   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized
1073	              Email Headers", RFC 6532, February 2012.

1075	   [TR36]     Unicode Consortium, "Unicode Security Considerations",
1076	              Unicode Technical Report 36, August 2004.

1078	   [WEBER]    Weber, C., "Attacking Software Globalization", March 2010,
1079	              <http://www.lookout.net/files/
1080	              Chris_Weber_Character%20Transformations%20v1.7_IUC33.pdf>.

1082	Author's Address

1084	   Dave Thaler (editor)
1085	   Microsoft Corporation
1086	   One Microsoft Way
1087	   Redmond, WA  98052
1088	   USA

1090	   Phone: +1 425 703 8835
1091	   Email: dthaler@microsoft.com