< draft-ietf-idnabis-rationale-11.txt   draft-ietf-idnabis-rationale-12.txt >
Network Working Group J. Klensin Network Working Group J. Klensin
Internet-Draft August 13, 2009 Internet-Draft September 10, 2009
Intended status: Informational Intended status: Informational
Expires: February 14, 2010 Expires: March 14, 2010
Internationalized Domain Names for Applications (IDNA): Background, Internationalized Domain Names for Applications (IDNA): Background,
Explanation, and Rationale Explanation, and Rationale
draft-ietf-idnabis-rationale-11.txt draft-ietf-idnabis-rationale-12.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. This document may contain material provisions of BCP 78 and BCP 79. This document may contain material
from IETF Documents or IETF Contributions published or made publicly from IETF Documents or IETF Contributions published or made publicly
available before November 10, 2008. The person(s) controlling the available before November 10, 2008. The person(s) controlling the
copyright in some of this material may not have granted the IETF copyright in some of this material may not have granted the IETF
Trust the right to allow modifications of such material outside the Trust the right to allow modifications of such material outside the
IETF Standards Process. Without obtaining an adequate license from IETF Standards Process. Without obtaining an adequate license from
skipping to change at page 1, line 43 skipping to change at page 1, line 43
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on February 14, 2010. This Internet-Draft will expire on March 14, 2010.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info). publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
skipping to change at page 3, line 9 skipping to change at page 3, line 9
Alternate Character Forms . . . . . . . . . . . . . . . . 18 Alternate Character Forms . . . . . . . . . . . . . . . . 18
4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20
4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21
5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21
6. Front-end and User Interface Processing for Lookup . . . . . . 22 6. Front-end and User Interface Processing for Lookup . . . . . . 22
7. Migration from IDNA2003 and Unicode Version Synchronization . 24 7. Migration from IDNA2003 and Unicode Version Synchronization . 24
7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 24 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 24
7.1.1. Summary and Discussion of IDNA Validity Criteria . . . 25 7.1.1. Summary and Discussion of IDNA Validity Criteria . . . 25
7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 25 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 25
7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 26 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 26
7.2. Changes in Character Interpretations . . . . . . . . . . . 27 7.2. Changes in Character Interpretations . . . . . . . . . . . 28
7.3. Character Mapping . . . . . . . . . . . . . . . . . . . . 29 7.3. Character Mapping . . . . . . . . . . . . . . . . . . . . 29
7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 29 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 29
7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 29 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 29
7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 30 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 30
7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 30 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 30
7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 31 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 31
7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 31 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 32
7.7. Migration Between Unicode Versions: Unassigned Code 7.7. Migration Between Unicode Versions: Unassigned Code
Points . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Points . . . . . . . . . . . . . . . . . . . . . . . . . . 33
7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 34 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 35
8. Name Server Considerations . . . . . . . . . . . . . . . . . . 35 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 35
8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 35 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 35
8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 35 8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 36
8.3. Root and other DNS Server Considerations . . . . . . . . . 36 8.3. Root and other DNS Server Considerations . . . . . . . . . 36
9. Internationalization Considerations . . . . . . . . . . . . . 36 9. Internationalization Considerations . . . . . . . . . . . . . 36
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 36 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37
10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 37 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 37
10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 37 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 37
10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 37 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 37
11. Security Considerations . . . . . . . . . . . . . . . . . . . 37 11. Security Considerations . . . . . . . . . . . . . . . . . . . 38
11.1. General Security Issues with IDNA . . . . . . . . . . . . 37 11.1. General Security Issues with IDNA . . . . . . . . . . . . 38
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 38 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 38
13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 38 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 39
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39
14.1. Normative References . . . . . . . . . . . . . . . . . . . 39 14.1. Normative References . . . . . . . . . . . . . . . . . . . 39
14.2. Informative References . . . . . . . . . . . . . . . . . . 40 14.2. Informative References . . . . . . . . . . . . . . . . . . 40
Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 42 Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 42
A.1. Changes between Version -00 and Version -01 of A.1. Changes between Version -00 and Version -01 of
draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 42 draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 43
A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 43 A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 43
A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 43 A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 43
A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 44 A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 44
A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 44 A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 44
A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 44 A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 45
A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 45 A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 45
A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 45 A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 45
A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 45 A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 46
A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 46 A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 46
A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 46 A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 46
A.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 47
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47
1. Introduction 1. Introduction
1.1. Context and Overview 1.1. Context and Overview
Internationalized Domain Names in Applications (IDNA) is a collection Internationalized Domain Names in Applications (IDNA) is a collection
of standards that allow client applications to convert some Unicode of standards that allow client applications to convert some Unicode
mnemonics to an ASCII-compatible encoding form ("ACE") which is a mnemonics to an ASCII-compatible encoding form ("ACE") which is a
valid DNS label containing only letters, digits, and hyphens. The valid DNS label containing only letters, digits, and hyphens. The
skipping to change at page 5, line 6 skipping to change at page 5, line 6
characters that are valid in A-labels are identified from rules characters that are valid in A-labels are identified from rules
listed in the Tables document [IDNA2008-Tables], but validity can be listed in the Tables document [IDNA2008-Tables], but validity can be
derived from the Unicode properties of those characters with a very derived from the Unicode properties of those characters with a very
few exceptions. few exceptions.
Traditionally, DNS labels are matched case-insensitively Traditionally, DNS labels are matched case-insensitively
[RFC1034][RFC1035]. That convention was preserved in IDNA2003 by a [RFC1034][RFC1035]. That convention was preserved in IDNA2003 by a
case-folding operation that generally maps capital letters into case-folding operation that generally maps capital letters into
lower-case ones. However, if case rules are enforced from one lower-case ones. However, if case rules are enforced from one
language, another language sometimes loses the ability to treat two language, another language sometimes loses the ability to treat two
characters separately. Case-sensitivity is treated slightly characters separately. Case-insensitivity is treated slightly
differently in IDNA2008. differently in IDNA2008.
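As a rough illustration of the case-folding issue described above, the following Python sketch uses the standard library's str.casefold(). This is an approximation of Unicode full case folding, not the IDNA2003 Stringprep tables themselves, and the helper name fold_label is purely illustrative. It shows how folding rules can erase a distinction that some languages consider significant.

```python
def fold_label(label: str) -> str:
    """Apply Unicode full case folding to a label (illustrative only)."""
    return label.casefold()


# Ordinary case-insensitive matching: capitals fold to lower case.
print(fold_label("EXAMPLE") == fold_label("example"))      # True

# Folding also merges characters that some languages treat as distinct:
# LATIN SMALL LETTER SHARP S (U+00DF) folds to "ss", and GREEK SMALL
# LETTER FINAL SIGMA (U+03C2) folds to the non-final sigma (U+03C3).
print(fold_label("stra\u00dfe") == fold_label("strasse"))  # True
print(fold_label("\u03c2") == "\u03c3")                    # True
```

Once two such strings fold to the same value, a zone can no longer register them as separate labels, which is the loss of distinction the paragraph above refers to.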
IDNA2003 used Unicode version 3.2 only. In order to keep up with new IDNA2003 used Unicode version 3.2 only. In order to keep up with new
characters added in new versions of Unicode, IDNA2008 decouples its characters added in new versions of Unicode, IDNA2008 decouples its
rules from any particular version of Unicode. Instead, the rules from any particular version of Unicode. Instead, the
attributes of new characters in Unicode, supplemented by a small attributes of new characters in Unicode, supplemented by a small
number of exception cases, determine how and whether the characters number of exception cases, determine how and whether the characters
can be used in IDNA labels. can be used in IDNA labels.
This document provides informational context for IDNA2008, including This document provides informational context for IDNA2008, including
skipping to change at page 11, line 44 skipping to change at page 11, line 44
joining and ones similar to them (known as "CONTEXT-JOINER" or joining and ones similar to them (known as "CONTEXT-JOINER" or
"CONTEXTJ") and other characters requiring contextual treatment "CONTEXTJ") and other characters requiring contextual treatment
("CONTEXT-OTHER" or "CONTEXTO"). Only the former require full ("CONTEXT-OTHER" or "CONTEXTO"). Only the former require full
testing at lookup time. testing at lookup time.
It is important to note that these contextual rules cannot prevent It is important to note that these contextual rules cannot prevent
all uses of the relevant characters that might be confusing or all uses of the relevant characters that might be confusing or
problematic. What they are expected to do is to confine applicability problematic. What they are expected to do is to confine applicability
of the characters to scripts (and narrower contexts) where zone of the characters to scripts (and narrower contexts) where zone
administrators are knowledgeable enough about the use of those administrators are knowledgeable enough about the use of those
characters to be prepared to deal with them appropriately. For characters to be prepared to deal with them appropriately.
example, a registry dealing with an Indic script that requires ZWJ
and/or ZWNJ as part of the writing system is expected to understand For example, a registry dealing with an Indic script that requires
where the characters have visible effect and where they do not and to ZWJ and/or ZWNJ as part of the writing system is expected to
make registration rules accordingly. By contrast, a registry dealing understand where the characters have visible effect and where they do
primarily with Latin or Cyrillic script might not be actively aware not and to make registration rules accordingly. By contrast, a
that the characters exist, much less about the consequences of registry dealing primarily with Latin or Cyrillic script might not be
embedding them in labels drawn from those scripts. actively aware that the characters exist, much less about the
consequences of embedding them in labels drawn from those scripts and
therefore should avoid accepting registrations containing those
characters, at least in Latin or Cyrillic-script labels.
3.1.2.2. Rules and Their Application 3.1.2.2. Rules and Their Application
Rules have descriptions such as "Must follow a character from Script Rules have descriptions such as "Must follow a character from Script
XYZ", "Must occur only if the entire label is in Script ABC", or XYZ", "Must occur only if the entire label is in Script ABC", or
"Must occur only if the previous and subsequent characters have the "Must occur only if the previous and subsequent characters have the
DFG property". The actual rules may be DEFINED or NULL. If present, DFG property". The actual rules may be DEFINED or NULL. If present,
they may have values of "True" (character may be used in any position they may have values of "True" (character may be used in any position
in any label), "False" (character may not be used in any label), or in any label), "False" (character may not be used in any label), or
may be a set of procedural rules that specify the context in which may be a set of procedural rules that specify the context in which
skipping to change at page 14, line 43 skipping to change at page 14, line 43
exactly what is being registered as well as encouraging use of those exactly what is being registered as well as encouraging use of those
canonical forms. That provision should not be interpreted as canonical forms. That provision should not be interpreted as
requiring that registrants provide characters in a particular requiring that registrants provide characters in a particular
code sequence. Registrant input conventions and management are part code sequence. Registrant input conventions and management are part
of registrant-registrar interactions and relationships between of registrant-registrar interactions and relationships between
registries and registrars and are outside the scope of these registries and registrars and are outside the scope of these
standards. standards.
It is worth stressing that these principles of policy development and It is worth stressing that these principles of policy development and
application apply at all levels of the DNS, not only, e.g., TLD or application apply at all levels of the DNS, not only, e.g., TLD or
SLD registrations and that even a trivial, "anything permitted that SLD registrations and that even a trivial, "anything is permitted
is valid under the protocol" policy is helpful in that it helps users that is valid under the protocol" policy is helpful in that it helps
and application developers know what to expect. users and application developers know what to expect.
3.3. Layered Restrictions: Tables, Context, Registration, Applications 3.3. Layered Restrictions: Tables, Context, Registration, Applications
The character rules in IDNA2008 are based on the realization that The character rules in IDNA2008 are based on the realization that
there is no single magic bullet for any of the security, there is no single magic bullet for any of the security,
confusability, or other issues associated with IDNs. Instead, the confusability, or other issues associated with IDNs. Instead, the
specifications define a variety of approaches. The character tables specifications define a variety of approaches. The character tables
are the first mechanism, protocol rules about how those characters are the first mechanism, protocol rules about how those characters
are applied or restricted in context are the second, and those two in are applied or restricted in context are the second, and those two in
combination constitute the limits of what can be done in the combination constitute the limits of what can be done in the
skipping to change at page 25, line 7 skipping to change at page 25, line 7
o to permit incrementally adding new characters, character groups, o to permit incrementally adding new characters, character groups,
scripts, and other character collections as they are incorporated scripts, and other character collections as they are incorporated
into Unicode, doing so without disruption and, in the long term, into Unicode, doing so without disruption and, in the long term,
without "heavy" processes (an IETF consensus process is required without "heavy" processes (an IETF consensus process is required
by the IDNA2008 specifications and is expected to be required and by the IDNA2008 specifications and is expected to be required and
used until significant experience accumulates with IDNA operations used until significant experience accumulates with IDNA operations
and new versions of Unicode). and new versions of Unicode).
7.1.1. Summary and Discussion of IDNA Validity Criteria 7.1.1. Summary and Discussion of IDNA Validity Criteria
The general criteria for a label to be considered IDNA-valid are (the The general criteria for a label to be considered valid under IDNA
actual rules are rigorously defined in the "Protocol" and "Tables" are (the actual rules are rigorously defined in the "Protocol" and
documents): "Tables" documents):
o The characters are "letters", marks needed to form letters, o The characters are "letters", marks needed to form letters,
numerals, or other code points used to write words in some numerals, or other code points used to write words in some
language. Symbols, drawing characters, and various notational language. Symbols, drawing characters, and various notational
characters are intended to be permanently excluded. There is no characters are intended to be permanently excluded. There is no
evidence that they are important enough to Internet operations or evidence that they are important enough to Internet operations or
internationalization to justify expansion of domain names beyond internationalization to justify expansion of domain names beyond
the general principle of "letters, digits, and hyphen". the general principle of "letters, digits, and hyphen".
(Additional discussion and rationale for the symbol decision (Additional discussion and rationale for the symbol decision
appears in Section 7.6). appears in Section 7.6).
skipping to change at page 25, line 41 skipping to change at page 25, line 41
application are not permitted, even on lookup. The issues application are not permitted, even on lookup. The issues
involved in this decision are discussed in Section 7.7. involved in this decision are discussed in Section 7.7.
o Any character that is mapped to another character by a current o Any character that is mapped to another character by a current
version of NFKC is prohibited as input to IDNA (for either version of NFKC is prohibited as input to IDNA (for either
registration or lookup). With a few exceptions, this principle registration or lookup). With a few exceptions, this principle
excludes any character mapped to another by Nameprep [RFC3491]. excludes any character mapped to another by Nameprep [RFC3491].
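The NFKC criterion in the last bullet above can be sketched with Python's unicodedata module. This is a simplified reading of the principle as stated here, using whatever Unicode version the local Python build ships. The exact derivation is the one specified in [IDNA2008-Tables], and the helper name is_nfkc_stable is an assumption made for illustration.

```python
import unicodedata


def is_nfkc_stable(ch: str) -> bool:
    """Return True if NFKC leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch


# "a" and U+00E9 (e with acute) are unchanged by NFKC; U+FF21
# (FULLWIDTH LATIN CAPITAL LETTER A) and U+FB01 (LATIN SMALL LIGATURE
# FI) are mapped to other characters and so would be excluded as input.
for ch in ("a", "\u00e9", "\uff21", "\ufb01"):
    print(f"U+{ord(ch):04X}", is_nfkc_stable(ch))
```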
The principles above drive the design of rules that are specified The principles above drive the design of rules that are specified
exactly in [IDNA2008-Tables]. Those rules identify the characters exactly in [IDNA2008-Tables]. Those rules identify the characters
that are IDNA-valid. The rules themselves are normative, and the that are valid under IDNA. The rules themselves are normative, and
tables are derived from them, rather than vice versa. the tables are derived from them, rather than vice versa.
7.1.2. Labels in Registration 7.1.2. Labels in Registration
Any label registered in a DNS zone must be validated -- i.e., the Any label registered in a DNS zone must be validated -- i.e., the
criteria for that label must be met -- in order for applications to criteria for that label must be met -- in order for applications to
work as intended. This principle is not new. For example, since the work as intended. This principle is not new. For example, since the
DNS was first deployed, zone administrators have been expected to DNS was first deployed, zone administrators have been expected to
verify that names meet "hostname" requirements [RFC0952] where those verify that names meet "hostname" requirements [RFC0952] where those
requirements are imposed by the expected applications. Other requirements are imposed by the expected applications. Other
application contexts, such as the later addition of special service application contexts, such as the later addition of special service
location formats [RFC2782], imposed new requirements on zone location formats [RFC2782], imposed new requirements on zone
administrators. For zones that will contain IDNs, support for administrators. For zones that will contain IDNs, support for
Unicode version-independence requires restrictions on all strings Unicode version-independence requires restrictions on all strings
placed in the zone. In particular, for such zones: placed in the zone. In particular, for such zones:
o Any label that appears to be an A-label, i.e., any label that o Any label that appears to be an A-label, i.e., any label that
starts in "xn--", must be IDNA-valid, i.e., they must be valid starts in "xn--", must be valid under IDNA, i.e., they must be
A-labels, as discussed in Section 2 above. valid A-labels, as discussed in Section 2 above.
o The Unicode tables (i.e., tables of code points, character o The Unicode tables (i.e., tables of code points, character
classes, and properties) and IDNA tables (i.e., tables of classes, and properties) and IDNA tables (i.e., tables of
contextual rules such as those that appear in the Tables contextual rules such as those that appear in the Tables
document), must be consistent on the systems performing or document), must be consistent on the systems performing or
validating labels to be registered. Note that this does not validating labels to be registered. Note that this does not
require that tables reflect the latest version of Unicode, only require that tables reflect the latest version of Unicode, only
that all tables used on a given system are consistent with each that all tables used on a given system are consistent with each
other. other.
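The first requirement in the list above (anything that starts with "xn--" must be a valid A-label) implies at least a Punycode round-trip check. The sketch below shows only that necessary condition, using Python's built-in "punycode" codec. It does not apply the character rules from the Tables document, and the function names are hypothetical.

```python
ACE_PREFIX = "xn--"


def looks_like_a_label(label: str) -> bool:
    """True if the label carries the ACE prefix (compared case-insensitively)."""
    return label.lower().startswith(ACE_PREFIX)


def punycode_round_trips(label: str) -> bool:
    """Necessary (not sufficient) check for an apparent A-label.

    Decodes the part after the prefix and re-encodes it, requiring the
    result to match the original payload.
    """
    if not looks_like_a_label(label):
        return False
    payload = label[len(ACE_PREFIX):].lower()
    try:
        u_label = payload.encode("ascii").decode("punycode")
        return u_label.encode("punycode").decode("ascii") == payload
    except UnicodeError:
        # Non-ASCII payload or malformed Punycode.
        return False


print(punycode_round_trips("xn--bcher-kva"))  # True
print(punycode_round_trips("example"))        # False (no ACE prefix)
```

A registry would follow a check like this with the full validity tests on the decoded string, using Unicode and IDNA tables that are consistent with each other as the second item requires.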
Under this model, registry tables will need to be updated (both the Under this model, registry tables will need to be updated (both the
Unicode-associated tables and the tables of permitted IDN characters) Unicode-associated tables and the tables of permitted IDN characters)
to enable a new script or other set of new characters. The registry to enable a new script or other set of new characters. The registry
will not be affected by newer versions of Unicode, or newly- will not be affected by newer versions of Unicode, or newly-
authorized characters, until and unless it wishes to support them. authorized characters, until and unless it wishes to support them.
The zone administrator is responsible for verifying IDNA-validity as The zone administrator is responsible for verifying validity for IDNA
well as its local policies -- a more extensive set of checks than are as well as its local policies -- a more extensive set of checks than
required for looking up the labels. Systems looking up or resolving are required for looking up the labels. Systems looking up or
DNS labels, especially IDN DNS labels, must be able to assume that resolving DNS labels, especially IDN DNS labels, must be able to
applicable registration rules were followed for names entered into assume that applicable registration rules were followed for names
the DNS. entered into the DNS.
7.1.3. Labels in Lookup 7.1.3. Labels in Lookup
Anyone looking up a label in a DNS zone is required to Anyone looking up a label in a DNS zone is required to
o Maintain IDNA and Unicode tables that are consistent with regard o Maintain IDNA and Unicode tables that are consistent with regard
to versions, i.e., unless the application actually executes the to versions, i.e., unless the application actually executes the
classification rules in [IDNA2008-Tables], its IDNA tables must be classification rules in [IDNA2008-Tables], its IDNA tables must be
derived from the version of Unicode that is supported more derived from the version of Unicode that is supported more
generally on the system. As with registration, the tables need generally on the system. As with registration, the tables need
skipping to change at page 27, line 23 skipping to change at page 27, line 23
* any required contextual rules are available, and * any required contextual rules are available, and
* any contextual rules that are associated with Joiner Controls * any contextual rules that are associated with Joiner Controls
(and "CONTEXTJ" characters more generally) are tested. (and "CONTEXTJ" characters more generally) are tested.
o Do not reject labels based on other contextual rules about o Do not reject labels based on other contextual rules about
characters, including mixed-script label prohibitions. Such rules characters, including mixed-script label prohibitions. Such rules
may be used to influence presentation decisions in the user may be used to influence presentation decisions in the user
interface, but not to avoid looking up domain names. interface, but not to avoid looking up domain names.
Lookup applications that following these rules, rather than having To further clarify the rules about handling characters that require
their own criteria for rejecting lookup attempts, are not sensitive contextual rules, note that one can have a context-required character
to version incompatibilities with the particular zone registry (i.e., one that requires a rule), but no rule. In that case, the
character is treated the same way DISALLOWED characters are treated,
until and unless a rule is supplied. That state is more or less
equivalent to "the idea of permitting this character is accepted in
principle, but it won't be permitted in practice until consensus is
reached on a safe way to use it".
The ability to add a rule more or less exempts these characters from
the prohibition against reclassifying characters from DISALLOWED to
PVALID.
And, obviously, "no rule" is different from "have a rule, but the
test either succeeds or fails".
Lookup applications that follow these rules, rather than having their
own criteria for rejecting lookup attempts, are not sensitive to
version incompatibilities with the particular zone registry
associated with the domain name except for labels containing associated with the domain name except for labels containing
characters recently added to Unicode. characters recently added to Unicode.
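The lookup-time handling of contextual characters described above can be sketched as follows. This is a minimal sketch under stated assumptions: toy_category is a hypothetical, grossly simplified stand-in for the derived properties in [IDNA2008-Tables], and zwnj_after_virama is a simplified joiner rule, not the normative one. What it aims to show is the distinction between "no rule" (treated like DISALLOWED) and "have a rule, but the test succeeds or fails", and that only CONTEXTJ rules are actually tested on lookup.

```python
import unicodedata
from typing import Callable, Dict

ZWNJ = "\u200c"   # ZERO WIDTH NON-JOINER
VIRAMA_CCC = 9    # canonical combining class shared by virama characters

# A contextual rule receives the label and the position of the character.
Rule = Callable[[str, int], bool]


def toy_category(ch: str) -> str:
    """Hypothetical stand-in for the derived-property lookup."""
    if ch == ZWNJ:
        return "CONTEXTJ"
    general = unicodedata.category(ch)
    if general == "Cn":
        return "UNASSIGNED"
    if general[0] in "LM" or general == "Nd" or ch == "-":
        return "PVALID"
    return "DISALLOWED"


def zwnj_after_virama(label: str, pos: int) -> bool:
    """Simplified joiner rule: ZWNJ must directly follow a virama."""
    return pos > 0 and unicodedata.combining(label[pos - 1]) == VIRAMA_CCC


def acceptable_for_lookup(label: str, rules: Dict[str, Rule]) -> bool:
    for pos, ch in enumerate(label):
        category = toy_category(ch)
        if category in ("DISALLOWED", "UNASSIGNED"):
            return False
        if category in ("CONTEXTJ", "CONTEXTO"):
            rule = rules.get(ch)
            if rule is None:
                # Context-required character with no rule available:
                # handled the same way as a DISALLOWED character.
                return False
            if category == "CONTEXTJ" and not rule(label, pos):
                # Only joiner (CONTEXTJ) rules are tested at lookup time.
                return False
    return True


rules = {ZWNJ: zwnj_after_virama}
print(acceptable_for_lookup("\u0915\u094d\u200c\u0937", rules))  # rule passes
print(acceptable_for_lookup("a\u200cb", rules))                  # rule fails
print(acceptable_for_lookup("\u0915\u094d\u200c\u0937", {}))     # no rule
```

Mixed-script checks and similar policies are deliberately absent from the sketch, since the list above says lookup must not reject labels on that basis.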
An application or client that processes names according to this An application or client that processes names according to this
protocol and then resolves them in the DNS will be able to locate any protocol and then resolves them in the DNS will be able to locate any
name that is registered, as long as those registrations are IDNA- name that is registered, as long as those registrations are valid
valid and its version of the IDNA tables is sufficiently up-to-date under IDNA and its version of the IDNA tables is sufficiently up-to-
to interpret all of the characters in the label. Messages to users date to interpret all of the characters in the label. Messages to
should distinguish between "label contains an unallocated code point" users should distinguish between "label contains an unallocated code
and other types of lookup failures. A failure on the basis of an old point" and other types of lookup failures. A failure on the basis of
version of Unicode may lead the user to a desire to upgrade to a an old version of Unicode may lead the user to a desire to upgrade to
newer version, but will have no other ill effects (this is consistent a newer version, but will have no other ill effects (this is
with behavior in the transition to the DNS when some hosts could not consistent with behavior in the transition to the DNS when some hosts
yet handle some forms of names or record types). could not yet handle some forms of names or record types).
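One way an application might separate the "unallocated code point" case from other lookup failures is sketched below. It relies on the Unicode data bundled with the local Python build (general category "Cn", which also covers noncharacters). The function name and message wording are assumptions made for illustration.

```python
import unicodedata


def lookup_failure_message(label: str) -> str:
    """Build a user-facing message that singles out unassigned code points."""
    for ch in label:
        if unicodedata.category(ch) == "Cn":
            return (
                f"label contains an unallocated code point U+{ord(ch):04X} "
                f"(local tables are based on Unicode "
                f"{unicodedata.unidata_version}); newer software may "
                f"be able to look it up"
            )
    return "label could not be looked up"


print(lookup_failure_message("a\u0378"))  # U+0378 is unassigned
print(lookup_failure_message("abc"))
```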
7.2. Changes in Character Interpretations 7.2. Changes in Character Interpretations
In those scripts that make case distinctions, there are a few In those scripts that make case distinctions, there are a few
characters for which an obvious and unique upper case character has characters for which an obvious and unique upper case character has
not historically been available to match a lower case one or vice not historically been available to match a lower case one or vice
versa. For those characters, the mappings used in constructing the versa. For those characters, the mappings used in constructing the
Stringprep tables for IDNA2003, performed using the Unicode CaseFold Stringprep tables for IDNA2003, performed using the Unicode CaseFold
operation (See Section 5.8 of the Unicode Standard [Unicode51]), operation (See Section 5.8 of the Unicode Standard [Unicode51]),
generate different characters or sets of characters. Those generate different characters or sets of characters. Those
skipping to change at page 32, line 42 skipping to change at page 33, line 12
the hypothetical label doesn't know whether to read it -- try to the hypothetical label doesn't know whether to read it -- try to
transmit it to a colleague by voice -- as "heart", as "love", as transmit it to a colleague by voice -- as "heart", as "love", as
"black heart", or as any of the other examples below. "black heart", or as any of the other examples below.
5. The actual situation is even worse than this. There is no 5. The actual situation is even worse than this. There is no
possible way for a normal, casual, user to tell the difference possible way for a normal, casual, user to tell the difference
between the hearts of U+2665 and U+2765 and the stars of U+2606 between the hearts of U+2665 and U+2765 and the stars of U+2606
and U+2729 without somehow knowing to look for a and U+2729 without somehow knowing to look for a
distinction. We have a white heart (U+2661) and few black distinction. We have a white heart (U+2661) and few black
hearts. Consequently, describing a label as containing a heart hearts. Consequently, describing a label as containing a heart
hopelessly ambiguous: we can only know that it contains one of is hopelessly ambiguous: we can only know that it contains one of
several characters that look like hearts or have "heart" in their several characters that look like hearts or have "heart" in their
names. In cities where "Square" is a popular part of a location names. In cities where "Square" is a popular part of a location
name, one might well want to use a square symbol in a label as name, one might well want to use a square symbol in a label as
well and there are far more squares of various flavors in Unicode well and there are far more squares of various flavors in Unicode
than there are hearts or stars. than there are hearts or stars.
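The ambiguity described in item 5 can be made concrete by listing the formal names of the code points mentioned there. The sketch below relies only on Python's standard unicodedata module. The helper names and the choice to scan just the Miscellaneous Symbols and Dingbats blocks (U+2600 through U+27BF) are assumptions made for illustration.

```python
import unicodedata

# Code points cited in the text: two black hearts, a white heart, two stars.
CITED = [0x2665, 0x2765, 0x2661, 0x2606, 0x2729]


def describe(cp):
    """Return (U+XXXX label, formal Unicode name, general category)."""
    ch = chr(cp)
    return f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch)


def names_containing(word, start=0x2600, end=0x27BF):
    """List code points in [start, end] whose formal name contains `word`.

    The default range (an assumption for this sketch) covers only the
    Miscellaneous Symbols and Dingbats blocks.
    """
    found = []
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), "")
        if word in name.split():
            found.append((f"U+{cp:04X}", name))
    return found


if __name__ == "__main__":
    for cp in CITED:
        print(*describe(cp))
    print("Characters named as hearts in U+2600..U+27BF:")
    for label, name in names_containing("HEART"):
        print(" ", label, name)
```

Even within these two blocks the scan turns up several distinct characters with "HEART" in their names, which is why a spoken description such as "a heart" cannot identify a single code point.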
The consequence of these ambiguities is that symbols are a very poor The consequence of these ambiguities is that symbols are a very poor
basis for reliable communication. Consistent with this conclusion, basis for reliable communication. Consistent with this conclusion,
the Unicode standard recommends that strings used in identifiers not the Unicode standard recommends that strings used in identifiers not
contain symbols or punctuation [Unicode-UAX31]. Of course, these contain symbols or punctuation [Unicode-UAX31]. Of course, these
skipping to change at page 38, line 10 skipping to change at page 38, line 28
consequently introduces no new security issues. It would, of course, consequently introduces no new security issues. It would, of course,
be a poor idea for someone to try to implement from it; such an be a poor idea for someone to try to implement from it; such an
attempt would almost certainly lead to interoperability problems and attempt would almost certainly lead to interoperability problems and
might lead to security ones. A discussion of security issues with might lead to security ones. A discussion of security issues with
IDNA, including some relevant history, appears in [IDNA2008-Defs]. IDNA, including some relevant history, appears in [IDNA2008-Defs].
12. Acknowledgments 12. Acknowledgments
The editor and contributors would like to express their thanks to The editor and contributors would like to express their thanks to
those who contributed significant early (pre-WG) review comments, those who contributed significant early (pre-WG) review comments,
sometimes accompanied by text, especially Mark Davis, Paul Hoffman, sometimes accompanied by text, Paul Hoffman, Simon Josefsson, and Sam
Simon Josefsson, and Sam Weiler. In addition, some specific ideas Weiler. In addition, some specific ideas were incorporated from
were incorporated from suggestions, text, or comments about sections suggestions, text, or comments about sections that were unclear
that were unclear supplied by Vint Cerf, Frank Ellerman, Michael supplied by Vint Cerf, Frank Ellerman, Michael Everson, Asmus
Everson, Asmus Freytag, Erik van der Poel, Michel Suignard, and Ken Freytag, Erik van der Poel, Michel Suignard, and Ken Whistler.
Whistler. Thanks are also due to Vint Cerf, Lisa Dusseault, Debbie Thanks are also due to Vint Cerf, Lisa Dusseault, Debbie Garside, and
Garside, and Jefsey Morfin for conversations that led to considerable Jefsey Morfin for conversations that led to considerable improvements
improvements in the content of this document. in the content of this document.
A meeting was held on 30 January 2008 to attempt to reconcile A meeting was held on 30 January 2008 to attempt to reconcile
differences in perspective and terminology about this set of differences in perspective and terminology about this set of
specifications between the design team and members of the Unicode specifications between the design team and members of the Unicode
Technical Consortium. The discussions at and subsequent to that Technical Consortium. The discussions at and subsequent to that
meeting were very helpful in focusing the issues and in refining the meeting were very helpful in focusing the issues and in refining the
specifications. The active participants at that meeting were (in specifications. The active participants at that meeting were (in
alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam, alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam,
Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary
Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel, Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel,
Michel Suignard, and Ken Whistler. We express our thanks to Google Michel Suignard, and Ken Whistler. We express our thanks to Google
for support of that meeting and to the participants for their for support of that meeting and to the participants for their
contributions. contributions.
Useful comments and text on the WG versions of the draft were Useful comments and text on the WG versions of the draft were
received from many participants in the IETF "IDNABIS" WG and a number received from many participants in the IETF "IDNABIS" WG and a number
of document changes resulted from mailing list discussions made by of document changes resulted from mailing list discussions made by
that group. Marcos Sanz provided specific analysis and suggestions that group. Marcos Sanz provided specific analysis and suggestions
that were exceptionally helpful in refining the text, as did Vint that were exceptionally helpful in refining the text, as did Vint
Cerf, Mark Davis, Martin Duerst, Andrew Sullivan, and Ken Whistler. Cerf, Martin Duerst, Andrew Sullivan, and Ken Whistler. Lisa
Lisa Dusseault provided extensive editorial suggestions during the Dusseault provided extensive editorial suggestions during the spring
spring of 2009, most of which were incorporated. of 2009, most of which were incorporated.
As is usual with IETF specifications, while the document represents
rough consensus, it should not be assumed that all participants and
contributors agree with all provisions.
13. Contributors 13. Contributors
While the listed editor held the pen, the core of this document and While the listed editor held the pen, the core of this document and
the initial WG version represents the joint work and conclusions of the initial WG version represents the joint work and conclusions of
an ad hoc design team consisting of the editor and, in alphabetic an ad hoc design team consisting of the editor and, in alphabetic
order, Harald Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp. order, Harald Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp.
Considerable material describing mapping principles has been Considerable material describing mapping principles has been
incorporated from a draft of [IDNA2008-Mapping] by Pete Resnick and incorporated from a draft of [IDNA2008-Mapping] by Pete Resnick and
Paul Hoffman. In addition, there were many specific contributions Paul Hoffman. In addition, there were many specific contributions
skipping to change at page 47, line 5 skipping to change at page 47, line 22
o Eliminated the former Section 7.3 ("More Flexibility in User o Eliminated the former Section 7.3 ("More Flexibility in User
Agents"), moving its material into Section 4.2. The replacement Agents"), moving its material into Section 4.2. The replacement
section is basically a placeholder to retain the mapping issues as section is basically a placeholder to retain the mapping issues as
one of the migration topics. Note that this item and the previous one of the migration topics. Note that this item and the previous
one involve considerable text, so people should check things one involve considerable text, so people should check things
carefully. carefully.
o Corrected several typographical and editorial errors that don't o Corrected several typographical and editorial errors that don't
fall into any of the above categories. fall into any of the above categories.
A.12. Version -12
o Got rid of the term "IDNA-valid". It no longer appears in
Definitions and we didn't really need the extra term. Where the
concept was needed, the text now says "valid under IDNA" or
equivalent.
o Adjusted Acknowledgments to remove Mark Davis's name, per his
request and advice from IETF Trust Counsel.
o Incorporated other changes from WG Last Call.
o Small typographical and editorial corrections.
Author's Address Author's Address
John C Klensin John C Klensin
1770 Massachusetts Ave, Ste 322 1770 Massachusetts Ave, Ste 322
Cambridge, MA 02140 Cambridge, MA 02140
USA USA
Phone: +1 617 245 1457 Phone: +1 617 245 1457
Email: john+ietf@jck.com Email: john+ietf@jck.com
 End of changes. 28 change blocks. 
68 lines changed or deleted 98 lines changed or added
