< draft-ietf-idnabis-rationale-07.txt   draft-ietf-idnabis-rationale-08.txt >
Network Working Group J. Klensin Network Working Group J. Klensin
Internet-Draft February 24, 2009 Internet-Draft March 6, 2009
Intended status: Informational Intended status: Informational
Expires: August 28, 2009 Expires: September 7, 2009
Internationalized Domain Names for Applications (IDNA): Background, Internationalized Domain Names for Applications (IDNA): Background,
Explanation, and Rationale Explanation, and Rationale
draft-ietf-idnabis-rationale-07.txt draft-ietf-idnabis-rationale-08.txt
Status of this Memo Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79. This document may contain material
from IETF Documents or IETF Contributions published or made publicly
available before November 10, 2008. The person(s) controlling the
copyright in some of this material may not have granted the IETF
Trust the right to allow modifications of such material outside the
IETF Standards Process. Without obtaining an adequate license from
the person(s) controlling the copyright in such materials, this
document may not be modified outside the IETF Standards Process, and
derivative works of it may not be created outside the IETF Standards
Process, except to format it for publication as an RFC or to
translate it into languages other than English.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet- other groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 28, 2009. This Internet-Draft will expire on September 7, 2009.
Copyright Notice Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info). publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. and restrictions with respect to this document.
This document may contain material from IETF Documents or IETF
Contributions published or made publicly available before November
10, 2008. The person(s) controlling the copyright in some of this
material may not have granted the IETF Trust the right to allow
modifications of such material outside the IETF Standards Process.
Without obtaining an adequate license from the person(s) controlling
the copyright in such materials, this document may not be modified
outside the IETF Standards Process, and derivative works of it may
not be created outside the IETF Standards Process, except to format
it for publication as an RFC or to translate it into languages other
than English.
Abstract Abstract
Several years have passed since the original protocol for Several years have passed since the original protocol for
Internationalized Domain Names (IDNs) was completed and deployed. Internationalized Domain Names (IDNs) was completed and deployed.
During that time, a number of issues have arisen, including the need During that time, a number of issues have arisen, including the need
to update the system to deal with newer versions of Unicode. Some of to update the system to deal with newer versions of Unicode. Some of
these issues require tuning of the existing protocols and the tables these issues require tuning of the existing protocols and the tables
on which they depend. This document provides an overview of a on which they depend. This document provides an overview of a
revised system and provides explanatory material for its components. revised system and provides explanatory material for its components.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 5 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 4
1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 5 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 4
1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1. Documents and Standards . . . . . . . . . . . . . . . 6 1.3.1. Documents and Standards . . . . . . . . . . . . . . . 5
1.3.2. DNS "Name" Terminology . . . . . . . . . . . . . . . . 6 1.3.2. DNS "Name" Terminology . . . . . . . . . . . . . . . . 5
1.3.3. New Terminology and Restrictions . . . . . . . . . . . 7 1.3.3. New Terminology and Restrictions . . . . . . . . . . . 6
1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 7 1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5. Applicability and Function of IDNA . . . . . . . . . . . . 8 1.5. Applicability and Function of IDNA . . . . . . . . . . . . 7
1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 9 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 8
2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 10 2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 9
3. Permitted Characters: An Inclusion List . . . . . . . . . . . 11 3. Permitted Characters: An Inclusion List . . . . . . . . . . . 10
3.1. A Tiered Model of Permitted Characters and Labels . . . . 11 3.1. A Tiered Model of Permitted Characters and Labels . . . . 10
3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 12 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 11
3.1.1.1. Contextual Rules . . . . . . . . . . . . . . . . . 12 3.1.2. Characters Valid Only in Context With Others . . . . . 11
3.1.1.2. Rules and Their Application . . . . . . . . . . . 13 3.1.2.2. Rules and Their Application . . . . . . . . . . . 12
3.1.2. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 13 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 12
3.1.3. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 14 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 13
3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14
3.3. Layered Restrictions: Tables, Context, Registration, 3.3. Layered Restrictions: Tables, Context, Registration,
Applications . . . . . . . . . . . . . . . . . . . . . . . 15 Applications . . . . . . . . . . . . . . . . . . . . . . . 14
4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15
4.1. Display and Network Order . . . . . . . . . . . . . . . . 16 4.1. Display and Network Order . . . . . . . . . . . . . . . . 15
4.2. Entry and Display in Applications . . . . . . . . . . . . 17 4.2. Entry and Display in Applications . . . . . . . . . . . . 16
4.3. Linguistic Expectations: Ligatures, Digraphs, and 4.3. Linguistic Expectations: Ligatures, Digraphs, and
Alternate Character Forms . . . . . . . . . . . . . . . . 18 Alternate Character Forms . . . . . . . . . . . . . . . . 17
4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20
4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 20
5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 22 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21
6. Front-end and User Interface Processing for Lookup . . . . . . 23 6. Front-end and User Interface Processing for Lookup . . . . . . 22
7. Migration from IDNA2003 and Unicode Version Synchronization . 26 7. Migration from IDNA2003 and Unicode Version Synchronization . 25
7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 26 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 25
7.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 26 7.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 25
7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 27 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 26
7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 28 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 27
7.2. Changes in Character Interpretations . . . . . . . . . . . 29 7.2. Changes in Character Interpretations . . . . . . . . . . . 28
7.3. More Flexibility in User Agents . . . . . . . . . . . . . 31 7.3. More Flexibility in User Agents . . . . . . . . . . . . . 30
7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 32 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 31
7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 32 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 32
7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 33 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 32
7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 33 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 33
7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 34 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 33
7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 34 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 34
7.7. Migration Between Unicode Versions: Unassigned Code 7.7. Migration Between Unicode Versions: Unassigned Code
Points . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Points . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 37 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 37
8. Name Server Considerations . . . . . . . . . . . . . . . . . . 38 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 37
8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 38 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 37
8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 38 8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 38
8.3. Root and other DNS Server Considerations . . . . . . . . . 39 8.3. Root and other DNS Server Considerations . . . . . . . . . 38
9. Internationalization Considerations . . . . . . . . . . . . . 39 9. Internationalization Considerations . . . . . . . . . . . . . 39
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39
10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 40 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 39
10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 40 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 39
10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 40 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 39
11. Security Considerations . . . . . . . . . . . . . . . . . . . 40 11. Security Considerations . . . . . . . . . . . . . . . . . . . 40
11.1. General Security Issues with IDNA . . . . . . . . . . . . 40 11.1. General Security Issues with IDNA . . . . . . . . . . . . 40
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 41 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 40
13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41
14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 41
14.1. Normative References . . . . . . . . . . . . . . . . . . . 42 14.1. Normative References . . . . . . . . . . . . . . . . . . . 41
14.2. Informative References . . . . . . . . . . . . . . . . . . 43 14.2. Informative References . . . . . . . . . . . . . . . . . . 42
Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 45 Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 44
A.1. Changes between Version -00 and Version -01 of A.1. Changes between Version -00 and Version -01 of
draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 45 draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 44
A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 45 A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 45
A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 46 A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 45
A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 46 A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 46
A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 47 A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 46
A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 47 A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 46
A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 47 A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 47
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 48 A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 47
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47
1. Introduction 1. Introduction
1.1. Context and Overview 1.1. Context and Overview
The original standards for Internationalized Domain Names (IDNs) were The original standards for Internationalized Domain Names (IDNs) were
completed and deployed starting in 2003. Those standards are known completed and deployed starting in 2003. Those standards are known
as Internationalized Domain Names in Applications (IDNA), taken from as Internationalized Domain Names in Applications (IDNA), taken from
the name of the highest level standard within the group, RFC 3490 the name of the highest level standard within the group, RFC 3490
[RFC3490]. After those standards were deployed, a number of issues [RFC3490]. After those standards were deployed, a number of issues
skipping to change at page 12, line 28 skipping to change at page 11, line 28
in the category. Registries are still expected to apply judgment in the category. Registries are still expected to apply judgment
about labels they will accept and to maintain rules consistent with about labels they will accept and to maintain rules consistent with
those judgments (see [IDNA2008-Protocol] and Section 3.3). those judgments (see [IDNA2008-Protocol] and Section 3.3).
Characters that are placed in the "PROTOCOL-VALID" category are Characters that are placed in the "PROTOCOL-VALID" category are
expected to never be removed from it or reclassified. While expected to never be removed from it or reclassified. While
theoretically characters could be removed from Unicode, such removal theoretically characters could be removed from Unicode, such removal
would be inconsistent with the Unicode stability principles (see would be inconsistent with the Unicode stability principles (see
[Unicode51], Appendix F) and hence should never occur. [Unicode51], Appendix F) and hence should never occur.
3.1.1.1. Contextual Rules 3.1.2. Characters Valid Only in Context With Others
Some characters may be unsuitable for general use in IDNs but Some characters may be unsuitable for general use in IDNs but
necessary for the plausible support of some scripts. The two most necessary for the plausible support of some scripts. The two most
commonly-cited examples are the zero-width joiner and non-joiner commonly-cited examples are the zero-width joiner and non-joiner
characters (ZWJ, U+200D and ZWNJ, U+200C), but provisions for characters (ZWJ, U+200D and ZWNJ, U+200C), but provisions for
unambiguous labels may require that other characters be restricted to unambiguous labels may require that other characters be restricted to
particular contexts. For example, the ASCII hyphen is not permitted particular contexts. For example, the ASCII hyphen is not permitted
to start or end a label, whether that label contains non-ASCII to start or end a label, whether that label contains non-ASCII
characters or not. characters or not.
3.1.2.1. Contextual Restrictions
These characters must not appear in IDNs without additional These characters must not appear in IDNs without additional
restrictions, typically because they have no visible consequences in restrictions, typically because they have no visible consequences in
most scripts but affect format or presentation in a few others or most scripts but affect format or presentation in a few others or
because they are combining characters that are safe for use only in because they are combining characters that are safe for use only in
conjunction with particular characters or scripts. In order to conjunction with particular characters or scripts. In order to
permit them to be used at all, they are specially identified as permit them to be used at all, they are specially identified as
"CONTEXTUAL RULE REQUIRED" and, when adequately understood, "CONTEXTUAL RULE REQUIRED" and, when adequately understood,
associated with a rule. In addition, the rule will define whether it associated with a rule. In addition, the rule will define whether it
is to be applied on lookup as well as registration. A distinction is is to be applied on lookup as well as registration. A distinction is
made between characters that indicate or prohibit joining (known as made between characters that indicate or prohibit joining (known as
"CONTEXT-JOINER" or "CONTEXTJ") and other characters requiring "CONTEXT-JOINER" or "CONTEXTJ") and other characters requiring
contextual treatment ("CONTEXT-OTHER" or "CONTEXTO"). Only the contextual treatment ("CONTEXT-OTHER" or "CONTEXTO"). Only the
former require full testing at lookup time. former require full testing at lookup time.
It is important to note that these contextual rules cannot prevent
all uses of the relevant characters that might be confusing or
problematic. What they are expected to do is confine applicability
of the characters to scripts (and narrower contexts) where zone
administrators are knowledgeable enough about the use of those
characters to be prepared to deal with them appropriately. For
example, a registry dealing with an Indic script that requires ZWJ
and/or ZWNJ as part of the writing system is expected to understand
where the characters have visible effect and where they do not and to
make registration rules accordingly. By contrast, a registry dealing
with Latin or Cyrillic script might not be actively aware that the
characters exist, much less of the consequences of embedding them
in labels drawn from those scripts.
3.1.1.2. Rules and Their Application 3.1.2.2. Rules and Their Application
The actual rules may be DEFINED or NULL. If present, they may have The actual rules may be DEFINED or NULL. If present, they may have
values of "True" (character may be used in any position in any values of "True" (character may be used in any position in any
label), "False" (character may not be used in any label), or may be a label), "False" (character may not be used in any label), or may be a
set of procedural rules that specify the context in which the set of procedural rules that specify the context in which the
character is permitted. character is permitted.
Examples of descriptions of typical rules, stated informally and in Examples of descriptions of typical rules, stated informally and in
English, include "Must follow a character from Script XYZ", "Must English, include "Must follow a character from Script XYZ", "Must
occur only if the entire label is in Script ABC", "Must occur only if occur only if the entire label is in Script ABC", "Must occur only if
skipping to change at page 13, line 27 skipping to change at page 12, line 42
Because it is easier to identify these characters than to know that Because it is easier to identify these characters than to know that
they are actually needed in IDNs or how to establish exactly the they are actually needed in IDNs or how to establish exactly the
right rules for each one, a rule may have a null value in a given right rules for each one, a rule may have a null value in a given
version of the tables. Characters associated with null rules are not version of the tables. Characters associated with null rules are not
permitted to appear in putative labels for either registration or permitted to appear in putative labels for either registration or
lookup. Of course, a later version of the tables might contain a lookup. Of course, a later version of the tables might contain a
non-null rule. non-null rule.
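The rule values described above (True, False, a procedural rule, or a null rule) can be sketched as a small lookup table. This is only an illustration under stated assumptions: the table contents, the names `CONTEXT_RULES`, `follows_virama`, and `context_ok`, and the assignment of a null rule to U+00B7 are all hypothetical. The authoritative rules and their syntax are those in [IDNA2008-Tables].

```python
import unicodedata
from typing import Callable, Dict, Union

# A rule is True (allowed anywhere), False (never allowed), a procedure
# taking (label, index), or None for a NULL rule. Names are illustrative.
Rule = Union[bool, Callable[[str, int], bool], None]


def follows_virama(label: str, i: int) -> bool:
    """Hypothetical procedural rule: the preceding character is a virama
    (canonical combining class 9)."""
    return i > 0 and unicodedata.combining(label[i - 1]) == 9


# Hypothetical table; the real entries live in [IDNA2008-Tables].
CONTEXT_RULES: Dict[str, Rule] = {
    "\u200d": follows_virama,  # ZWJ
    "\u200c": follows_virama,  # ZWNJ
    "\u00b7": None,            # NULL rule, chosen purely for illustration
}


def context_ok(label: str) -> bool:
    """Return True if every rule-bearing character satisfies its rule."""
    for i, ch in enumerate(label):
        if ch not in CONTEXT_RULES:
            continue  # character needs no contextual rule
        rule = CONTEXT_RULES[ch]
        if rule is None:
            return False  # NULL rule: not permitted in any label
        if isinstance(rule, bool):
            if not rule:
                return False
            continue
        if not rule(label, i):
            return False
    return True
```

A character with a null rule is rejected for both registration and lookup, which is why `None` is handled before any other rule value.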
The description of the syntax of the rules, and the rules themselves, The description of the syntax of the rules, and the rules themselves,
appears in [IDNA2008-Tables]. appears in [IDNA2008-Tables]. [[anchor11: ??? Section number would
be good here.]]
3.1.2. DISALLOWED 3.1.3. DISALLOWED
Some characters are inappropriate for use in IDNs and are thus Some characters are inappropriate for use in IDNs and are thus
excluded for both registration and lookup (i.e., IDNA-conforming excluded for both registration and lookup (i.e., IDNA-conforming
applications performing name lookup should verify that these applications performing name lookup should verify that these
characters are absent; if they are present, the label strings should characters are absent; if they are present, the label strings should
be rejected rather than converted to A-labels and looked up). Some of be rejected rather than converted to A-labels and looked up). Some of
these characters are problematic for use in IDNs (such as the these characters are problematic for use in IDNs (such as the
FRACTION SLASH character, U+2044), while some of them (such as the FRACTION SLASH character, U+2044), while some of them (such as the
various HEART symbols, e.g., U+2665, U+2661, and U+2765, see various HEART symbols, e.g., U+2665, U+2661, and U+2765, see
Section 7.6) simply fall outside the conventions for typical Section 7.6) simply fall outside the conventions for typical
skipping to change at page 14, line 21 skipping to change at page 13, line 38
normalization method NFKC to the character yields some other normalization method NFKC to the character yields some other
character. character.
o The character is an upper-case form or some other form that is o The character is an upper-case form or some other form that is
mapped to another character by Unicode casefolding. mapped to another character by Unicode casefolding.
o The character is a symbol or punctuation form or, more generally, o The character is a symbol or punctuation form or, more generally,
something that is not a letter, digit, or a mark that is used to something that is not a letter, digit, or a mark that is used to
form a letter or digit. form a letter or digit.
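The three criteria listed above can be approximated with Python's standard `unicodedata` module. This is a rough sketch, not the derivation used to build the tables: the function name `disallowed_reasons` is hypothetical, the general-category test is only an approximation of "letter, digit, or mark", and the results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata
from typing import List


def disallowed_reasons(ch: str) -> List[str]:
    """Return the informal reasons, if any, that a single character
    would fall under the criteria listed in the text."""
    reasons = []
    # Criterion 1: NFKC normalization yields some other character.
    if unicodedata.normalize("NFKC", ch) != ch:
        reasons.append("changed by NFKC")
    # Criterion 2: Unicode case folding maps it to something else.
    if ch.casefold() != ch:
        reasons.append("changed by case folding")
    # Criterion 3: not a letter (L*), number (N*), or mark (M*).
    if unicodedata.category(ch)[0] not in ("L", "N", "M"):
        reasons.append("not a letter, digit, or mark")
    return reasons
```

For example, an upper-case ASCII letter is reported only under the case-folding criterion, while the HEART symbols mentioned earlier are reported only under the third.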
3.1.3. UNASSIGNED 3.1.4. UNASSIGNED
For convenience in processing and table-building, code points that do For convenience in processing and table-building, code points that do
not have assigned values in a given version of Unicode are treated as not have assigned values in a given version of Unicode are treated as
belonging to a special UNASSIGNED category. Such code points are belonging to a special UNASSIGNED category. Such code points are
prohibited in labels to be registered or looked up. The category prohibited in labels to be registered or looked up. The category
differs from DISALLOWED in that code points are moved out of it by differs from DISALLOWED in that code points are moved out of it by
the simple expedient of being assigned in a later version of Unicode the simple expedient of being assigned in a later version of Unicode
(at which point, they are classified into one of the other categories (at which point, they are classified into one of the other categories
as appropriate). as appropriate).
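As a minimal sketch of how an implementation might detect this category, the standard `unicodedata` module reports general category "Cn" for code points with no assignment in the Unicode version it was built with. The helper name `is_unassigned` is hypothetical, and the test is only approximate: noncharacters also report "Cn", although the tables would be expected to treat them differently.

```python
import unicodedata


def is_unassigned(ch: str) -> bool:
    """True if the local Unicode tables have no assignment for ch.

    The answer depends on unicodedata.unidata_version, which mirrors the
    point made in the text: a later Unicode version may assign the code
    point and so move it out of this category.
    """
    return unicodedata.category(ch) == "Cn"
```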
skipping to change at page 18, line 23 skipping to change at page 17, line 38
In any place where a protocol or document format allows transmission In any place where a protocol or document format allows transmission
of the characters in internationalized labels, labels should be of the characters in internationalized labels, labels should be
transmitted using whatever character encoding and escape mechanism transmitted using whatever character encoding and escape mechanism
the protocol or document format uses at that place. This provision the protocol or document format uses at that place. This provision
is intended to prevent situations in which, e.g., UTF-8 domain names is intended to prevent situations in which, e.g., UTF-8 domain names
appear embedded in text that is otherwise in some other character appear embedded in text that is otherwise in some other character
coding. coding.
All protocols that use domain name slots (See Section 2.3.1.6 All protocols that use domain name slots (See Section 2.3.1.6
[[anchor12: ?? Verify this]] in [IDNA2008-Defs]) already have the [[anchor14: ?? Verify this]] in [IDNA2008-Defs]) already have the
capacity for handling domain names in the ASCII charset. Thus, capacity for handling domain names in the ASCII charset. Thus,
A-labels can inherently be handled by those protocols. A-labels can inherently be handled by those protocols.
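To make the last point concrete, the following sketch produces an ASCII-only form of a label using Python's built-in "punycode" codec and the "xn--" prefix. It is an illustration only: the function name `to_a_label` is hypothetical, and the sketch deliberately omits every validity check (permitted characters, contextual rules, bidi conditions, length limits) that the protocol requires before a string may be treated as an A-label.

```python
def to_a_label(u_label: str) -> str:
    """Encode a single label into an ASCII-compatible form.

    No IDNA validation is performed here; this only shows why the result
    fits in any protocol slot that already carries ASCII domain names.
    """
    if u_label.isascii():
        return u_label  # already representable in the ASCII charset
    return "xn--" + u_label.encode("punycode").decode("ascii")


print(to_a_label("b\u00fccher"))  # prints "xn--bcher-kva"
```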
4.3. Linguistic Expectations: Ligatures, Digraphs, and Alternate 4.3. Linguistic Expectations: Ligatures, Digraphs, and Alternate
Character Forms Character Forms
[[anchor13: There is some internal redundancy and repetition in the [[anchor15: There is some internal redundancy and repetition in the
material in this section. Specific suggestions about how to reduce or material in this section. Specific suggestions about how to reduce or
eliminate redundant text would be appreciated. If no such eliminate redundant text would be appreciated. If no such
suggestions are received before -07 is posted, this note will be suggestions are received before -07 is posted, this note will be
removed.]] removed.]]
Users often have expectations about character matching or equivalence Users often have expectations about character matching or equivalence
that are based on their own languages and the orthography of those that are based on their own languages and the orthography of those
languages. These expectations may not be consistent with forms or languages. These expectations may not be consistent with forms or
actions that can be naturally accommodated in a character coding actions that can be naturally accommodated in a character coding
system, especially if multiple languages are written using the same system, especially if multiple languages are written using the same
skipping to change at page 20, line 38 skipping to change at page 20, line 6
provides a prime example of a situation in which a registry that is provides a prime example of a situation in which a registry that is
aware of the language context in which labels are to be registered, aware of the language context in which labels are to be registered,
and where that language sometimes (or always) treats the two- and where that language sometimes (or always) treats the two-
character sequences as equivalent to the combined form, should give character sequences as equivalent to the combined form, should give
serious consideration to applying a "variant" model [RFC3743] serious consideration to applying a "variant" model [RFC3743]
[RFC4290], or to prohibiting registration of one of the forms entirely, [RFC4290], or to prohibiting registration of one of the forms entirely,
to reduce the opportunities for user confusion and fraud that would to reduce the opportunities for user confusion and fraud that would
result from the related strings being registered to different result from the related strings being registered to different
parties. parties.
[[anchor14: Placeholder: A discussion of the Arabic digit issue [[anchor16: Placeholder: A discussion of the Arabic digit issue
should go here once it is resolved in some appropriate way.]] should go here once it is resolved in some appropriate way.]]
4.4. Case Mapping and Related Issues 4.4. Case Mapping and Related Issues
In the DNS, ASCII letters are stored with their case preserved. In the DNS, ASCII letters are stored with their case preserved.
Matching during the query process is case-independent, but none of Matching during the query process is case-independent, but none of
the information that might be represented by choices of case has been the information that might be represented by choices of case has been
lost. That model has been accidentally helpful because, as people lost. That model has been accidentally helpful because, as people
have created DNS labels by catenating words (or parts of words) to have created DNS labels by catenating words (or parts of words) to
form labels, case has often been used to distinguish among components form labels, case has often been used to distinguish among components
skipping to change at page 28, line 47 skipping to change at page 28, line 17
o Validate the label itself for conformance with a small number of o Validate the label itself for conformance with a small number of
whole-label rules, notably verifying that there are no leading whole-label rules, notably verifying that there are no leading
combining marks, that the "bidi" conditions are met if right to combining marks, that the "bidi" conditions are met if right to
left characters appear, that any required contextual rules are left characters appear, that any required contextual rules are
available and that, if such rules are associated with Joiner available and that, if such rules are associated with Joiner
Controls, they are tested. Controls, they are tested.
o Avoid validating other contextual rules about characters, o Avoid validating other contextual rules about characters,
including mixed-script label prohibitions, although such rules may including mixed-script label prohibitions, although such rules may
be used to influence presentation decisions in the user interface. be used to influence presentation decisions in the user interface.
[[anchor18: Check this, and all similar statements, against [[anchor20: Check this, and all similar statements, against
Protocol when that is finished.]] Protocol when that is finished.]]
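The whole-label checks named in the bullets above might be triaged along the following lines. This is a sketch under stated assumptions, not the protocol's procedure: the function name `whole_label_checks` and the result keys are hypothetical, the bidi test only detects that right to left characters are present (it does not apply the bidi conditions themselves), and the joiner test only flags that a contextual rule would have to be found and tested.

```python
import unicodedata
from typing import Dict

JOINER_CONTROLS = {"\u200c", "\u200d"}  # ZWNJ, ZWJ


def whole_label_checks(label: str) -> Dict[str, bool]:
    """Report which whole-label conditions apply to a putative label."""
    return {
        # A label must not begin with a combining mark (category M*).
        "leading_combining_mark": bool(label)
        and unicodedata.category(label[0]).startswith("M"),
        # Right to left characters present: bidi conditions must be met.
        "needs_bidi_check": any(
            unicodedata.bidirectional(c) in ("R", "AL") for c in label
        ),
        # Joiner Controls present: their contextual rules must be tested.
        "needs_joiner_rule": any(c in JOINER_CONTROLS for c in label),
    }
```

Consistent with the second bullet, nothing here looks at mixed-script labels or other contextual rules; those are left to registration and to the user interface.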
By avoiding applying its own interpretation of which labels are valid By avoiding applying its own interpretation of which labels are valid
as a means of rejecting lookup attempts, the lookup application as a means of rejecting lookup attempts, the lookup application
becomes less sensitive to version incompatibilities with the becomes less sensitive to version incompatibilities with the
particular zone registry associated with the domain name. particular zone registry associated with the domain name.
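The two simplest whole-label checks named in the list above (no leading combining mark, and detecting whether right-to-left characters appear so that the "bidi" conditions need to be tested) can be sketched as follows. This is only an illustration, not the procedure defined in Protocol: the function names are hypothetical, the Unicode data is whatever version ships with the local Python runtime, and treating bidirectional classes R and AL as "right to left characters" is an assumption made for the example.

```python
import unicodedata


def has_leading_combining_mark(label: str) -> bool:
    """Return True if the first code point of the label is a combining mark.

    Combining marks are approximated here as the Unicode general
    categories Mn, Mc and Me (an assumption of this sketch).
    """
    return bool(label) and unicodedata.category(label[0]).startswith("M")


def contains_rtl_character(label: str) -> bool:
    """Return True if any code point has bidirectional class R or AL.

    A True result only signals that the "bidi" conditions must be
    checked; it does not perform those checks.
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in label)


def passes_basic_lookup_checks(label: str) -> bool:
    """Hypothetical helper: reject only labels with a leading combining mark."""
    return not has_leading_combining_mark(label)
```

Note that the sketch deliberately does not reject mixed-script labels or apply registry-specific rules, in line with the lookup behavior described above.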
An application or client that processes names according to this An application or client that processes names according to this
protocol and then resolves them in the DNS will be able to locate any protocol and then resolves them in the DNS will be able to locate any
name that is validly registered, as long as its version of the name that is validly registered, as long as its version of the
skipping to change at page 29, line 20 skipping to change at page 28, line 39
of the characters in the label. Messages to users should distinguish of the characters in the label. Messages to users should distinguish
between "label contains an unallocated code point" and other types of between "label contains an unallocated code point" and other types of
lookup failures. A failure on the basis of an old version of Unicode lookup failures. A failure on the basis of an old version of Unicode
may prompt the user to upgrade to a newer version, but will may prompt the user to upgrade to a newer version, but will
have no other ill effects (this is consistent with behavior in the have no other ill effects (this is consistent with behavior in the
transition to the DNS when some hosts could not yet handle some forms transition to the DNS when some hosts could not yet handle some forms
of names or record types). of names or record types).
7.2. Changes in Character Interpretations 7.2. Changes in Character Interpretations
[[anchor19: Note in Draft: This subsection is completely new in [[anchor21: Note in Draft: This subsection is completely new in
version -04 and has been further tuned in -05 and -06 of this version -04 and has been further tuned in -05 and -06 of this
document. It could almost certainly use improvement, although this document. It could almost certainly use improvement, although this
note will be removed if there are not significant suggestions about note will be removed if there are not significant suggestions about
the -06 version. It also contains some material that is redundant the -06 version. It also contains some material that is redundant
with material in other sections. I have not tried to remove that with material in other sections. I have not tried to remove that
material and will not do so until the WG concludes that this section material and will not do so until the WG concludes that this section
is relatively stable, but would appreciate help in identifying what is relatively stable, but would appreciate help in identifying what
should be removed or how this might be enhanced to contain more of should be removed or how this might be enhanced to contain more of
that other material. --JcK]] that other material. --JcK]]
skipping to change at page 35, line 26 skipping to change at page 34, line 44
there are no uniform conventions for naming; variations such as there are no uniform conventions for naming; variations such as
outline, solid, and shaded forms may or may not exist; and so on. outline, solid, and shaded forms may or may not exist; and so on.
As just one example, consider a "heart" symbol as it might appear As just one example, consider a "heart" symbol as it might appear
in a logo that might be read as "I love...". While the user might in a logo that might be read as "I love...". While the user might
read such a logo as "I love..." or "I heart...", considerable read such a logo as "I love..." or "I heart...", considerable
knowledge of the coding distinctions made in Unicode is needed to knowledge of the coding distinctions made in Unicode is needed to
know that there are more than one "heart" character (e.g., U+2665, know that there are more than one "heart" character (e.g., U+2665,
U+2661, and U+2765) and how to describe it. These issues are of U+2661, and U+2765) and how to describe it. These issues are of
particular importance if strings are expected to be understood or particular importance if strings are expected to be understood or
transcribed by the listener after being read out loud. transcribed by the listener after being read out loud.
[[anchor20: The above paragraph remains controversial as to [[anchor22: The above paragraph remains controversial as to
whether it is valid. The WG will need to make a decision if this whether it is valid. The WG will need to make a decision if this
section is not dropped entirely.]] section is not dropped entirely.]]
o Consider the case of a screen reader used by blind Internet users o Consider the case of a screen reader used by blind Internet users
who must listen to renderings of IDN domain names and possibly who must listen to renderings of IDN domain names and possibly
reproduce them on the keyboard. reproduce them on the keyboard.
o As a simplified example of this, assume one wanted to use a o As a simplified example of this, assume one wanted to use a
"heart" or "star" symbol in a label. This is problematic because "heart" or "star" symbol in a label. This is problematic because
those names are ambiguous in the Unicode system of naming (the those names are ambiguous in the Unicode system of naming (the
skipping to change at page 40, line 20 skipping to change at page 39, line 43
rules that are integral elements of [IDNA2008-Tables]. Convenience rules that are integral elements of [IDNA2008-Tables]. Convenience
in programming and validation requires a registry of characters and in programming and validation requires a registry of characters and
scripts and their categories, updated for each new version of Unicode scripts and their categories, updated for each new version of Unicode
and the characters it contains. The details of this registry are and the characters it contains. The details of this registry are
specified in [IDNA2008-Tables]. specified in [IDNA2008-Tables].
10.2. IDNA Context Registry 10.2. IDNA Context Registry
For characters that are defined in the IDNA Character Registry list For characters that are defined in the IDNA Character Registry list
as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of
rule described in Section 3.1.1.1), IANA will create and maintain a rule described in Section 3.1.2), IANA will create and maintain a
list of approved contextual rules. The details for those rules list of approved contextual rules. The details for those rules
appear in [IDNA2008-Tables]. appear in [IDNA2008-Tables].
10.3. IANA Repository of IDN Practices of TLDs 10.3. IANA Repository of IDN Practices of TLDs
This registry, historically described as the "IANA Language Character This registry, historically described as the "IANA Language Character
Set Registry" or "IANA Script Registry" (both somewhat misleading Set Registry" or "IANA Script Registry" (both somewhat misleading
terms), is maintained by IANA at the request of ICANN. It is used to terms), is maintained by IANA at the request of ICANN. It is used to
provide a central documentation repository of the IDN policies used provide a central documentation repository of the IDN policies used
by top level domain (TLD) registries who volunteer to contribute to by top level domain (TLD) registries who volunteer to contribute to
skipping to change at page 48, line 11 skipping to change at page 47, line 36
o Moved the "name server considerations" material to this document o Moved the "name server considerations" material to this document
from Protocol because it is non-normative and not part of the from Protocol because it is non-normative and not part of the
protocol itself. protocol itself.
o To improve clarity, redid discussion of the reasons why looking up o To improve clarity, redid discussion of the reasons why looking up
unassigned code points is prohibited. unassigned code points is prohibited.
o Editorial and other non-substantive corrections to reflect earlier o Editorial and other non-substantive corrections to reflect earlier
errors as well as new definitions and terminology. errors as well as new definitions and terminology.
A.8. Version -08
o Slight revision to "contextual" discussion (Section 3.1.2) and
moving it to a separate subsection, rather than under "PVALID",
for better parallelism with Tables. Also reflected Mark's
comments about the limitations of the approach.
o Added placeholder notes as reminders of where references to the
other documents need Section numbers. More of these will be added
as needed (feel free to identify relevant places), but the actual
section numbers will not be inserted until the documents are
completely stable, i.e., on their way to the RFC Editor.
Author's Address Author's Address
John C Klensin John C Klensin
1770 Massachusetts Ave, Ste 322 1770 Massachusetts Ave, Ste 322
Cambridge, MA 02140 Cambridge, MA 02140
USA USA
Phone: +1 617 245 1457 Phone: +1 617 245 1457
Email: john+ietf@jck.com Email: john+ietf@jck.com
 End of changes. 37 change blocks. 
82 lines changed or deleted 110 lines changed or added
