| < draft-ietf-idnabis-rationale-07.txt | draft-ietf-idnabis-rationale-08.txt > | |||
|---|---|---|---|---|
| Network Working Group J. Klensin | Network Working Group J. Klensin | |||
| Internet-Draft February 24, 2009 | Internet-Draft March 6, 2009 | |||
| Intended status: Informational | Intended status: Informational | |||
| Expires: August 28, 2009 | Expires: September 7, 2009 | |||
| Internationalized Domain Names for Applications (IDNA): Background, | Internationalized Domain Names for Applications (IDNA): Background, | |||
| Explanation, and Rationale | Explanation, and Rationale | |||
| draft-ietf-idnabis-rationale-07.txt | draft-ietf-idnabis-rationale-08.txt | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. This document may contain material | |||
| from IETF Documents or IETF Contributions published or made publicly | ||||
| available before November 10, 2008. The person(s) controlling the | ||||
| copyright in some of this material may not have granted the IETF | ||||
| Trust the right to allow modifications of such material outside the | ||||
| IETF Standards Process. Without obtaining an adequate license from | ||||
| the person(s) controlling the copyright in such materials, this | ||||
| document may not be modified outside the IETF Standards Process, and | ||||
| derivative works of it may not be created outside the IETF Standards | ||||
| Process, except to format it for publication as an RFC or to | ||||
| translate it into languages other than English. | ||||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on August 28, 2009. | This Internet-Draft will expire on September 7, 2009. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2009 IETF Trust and the persons identified as the | Copyright (c) 2009 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents in effect on the date of | Provisions Relating to IETF Documents in effect on the date of | |||
| publication of this document (http://trustee.ietf.org/license-info). | publication of this document (http://trustee.ietf.org/license-info). | |||
| Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
| and restrictions with respect to this document. | and restrictions with respect to this document. | |||
| This document may contain material from IETF Documents or IETF | ||||
| Contributions published or made publicly available before November | ||||
| 10, 2008. The person(s) controlling the copyright in some of this | ||||
| material may not have granted the IETF Trust the right to allow | ||||
| modifications of such material outside the IETF Standards Process. | ||||
| Without obtaining an adequate license from the person(s) controlling | ||||
| the copyright in such materials, this document may not be modified | ||||
| outside the IETF Standards Process, and derivative works of it may | ||||
| not be created outside the IETF Standards Process, except to format | ||||
| it for publication as an RFC or to translate it into languages other | ||||
| than English. | ||||
| Abstract | Abstract | |||
| Several years have passed since the original protocol for | Several years have passed since the original protocol for | |||
| Internationalized Domain Names (IDNs) was completed and deployed. | Internationalized Domain Names (IDNs) was completed and deployed. | |||
| During that time, a number of issues have arisen, including the need | During that time, a number of issues have arisen, including the need | |||
| to update the system to deal with newer versions of Unicode. Some of | to update the system to deal with newer versions of Unicode. Some of | |||
| these issues require tuning of the existing protocols and the tables | these issues require tuning of the existing protocols and the tables | |||
| on which they depend. This document provides an overview of a | on which they depend. This document provides an overview of a | |||
| revised system and provides explanatory material for its components. | revised system and provides explanatory material for its components. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 5 | 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 5 | 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.3.1. Documents and Standards . . . . . . . . . . . . . . . 6 | 1.3.1. Documents and Standards . . . . . . . . . . . . . . . 5 | |||
| 1.3.2. DNS "Name" Terminology . . . . . . . . . . . . . . . . 6 | 1.3.2. DNS "Name" Terminology . . . . . . . . . . . . . . . . 5 | |||
| 1.3.3. New Terminology and Restrictions . . . . . . . . . . . 7 | 1.3.3. New Terminology and Restrictions . . . . . . . . . . . 6 | |||
| 1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 7 | 1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.5. Applicability and Function of IDNA . . . . . . . . . . . . 8 | 1.5. Applicability and Function of IDNA . . . . . . . . . . . . 7 | |||
| 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 9 | 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 8 | |||
| 2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 10 | 2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3. Permitted Characters: An Inclusion List . . . . . . . . . . . 11 | 3. Permitted Characters: An Inclusion List . . . . . . . . . . . 10 | |||
| 3.1. A Tiered Model of Permitted Characters and Labels . . . . 11 | 3.1. A Tiered Model of Permitted Characters and Labels . . . . 10 | |||
| 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 12 | 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.1.1.1. Contextual Rules . . . . . . . . . . . . . . . . . 12 | 3.1.2. Characters Valid Only in Context With Others . . . . . 11 | |||
| 3.1.1.2. Rules and Their Application . . . . . . . . . . . 13 | 3.1.2.2. Rules and Their Application . . . . . . . . . . . 12 | |||
| 3.1.2. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 13 | 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.1.3. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 14 | 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14 | 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14 | |||
| 3.3. Layered Restrictions: Tables, Context, Registration, | 3.3. Layered Restrictions: Tables, Context, Registration, | |||
| Applications . . . . . . . . . . . . . . . . . . . . . . . 15 | Applications . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15 | 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15 | |||
| 4.1. Display and Network Order . . . . . . . . . . . . . . . . 16 | 4.1. Display and Network Order . . . . . . . . . . . . . . . . 15 | |||
| 4.2. Entry and Display in Applications . . . . . . . . . . . . 17 | 4.2. Entry and Display in Applications . . . . . . . . . . . . 16 | |||
| 4.3. Linguistic Expectations: Ligatures, Digraphs, and | 4.3. Linguistic Expectations: Ligatures, Digraphs, and | |||
| Alternate Character Forms . . . . . . . . . . . . . . . . 18 | Alternate Character Forms . . . . . . . . . . . . . . . . 17 | |||
| 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 | 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 | |||
| 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 | 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 20 | |||
| 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 22 | 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21 | |||
| 6. Front-end and User Interface Processing for Lookup . . . . . . 23 | 6. Front-end and User Interface Processing for Lookup . . . . . . 22 | |||
| 7. Migration from IDNA2003 and Unicode Version Synchronization . 26 | 7. Migration from IDNA2003 and Unicode Version Synchronization . 25 | |||
| 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 26 | 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 25 | |||
| 7.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 26 | 7.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 25 | |||
| 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 27 | 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 26 | |||
| 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 28 | 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 27 | |||
| 7.2. Changes in Character Interpretations . . . . . . . . . . . 29 | 7.2. Changes in Character Interpretations . . . . . . . . . . . 28 | |||
| 7.3. More Flexibility in User Agents . . . . . . . . . . . . . 31 | 7.3. More Flexibility in User Agents . . . . . . . . . . . . . 30 | |||
| 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 32 | 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 31 | |||
| 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 32 | 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 32 | |||
| 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 33 | 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 32 | |||
| 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 33 | 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 33 | |||
| 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 34 | 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 33 | |||
| 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 34 | 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 34 | |||
| 7.7. Migration Between Unicode Versions: Unassigned Code | 7.7. Migration Between Unicode Versions: Unassigned Code | |||
| Points . . . . . . . . . . . . . . . . . . . . . . . . . . 36 | Points . . . . . . . . . . . . . . . . . . . . . . . . . . 35 | |||
| 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 37 | 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 37 | |||
| 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 38 | 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 37 | |||
| 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 38 | 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 37 | |||
| 8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 38 | 8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 38 | |||
| 8.3. Root and other DNS Server Considerations . . . . . . . . . 39 | 8.3. Root and other DNS Server Considerations . . . . . . . . . 38 | |||
| 9. Internationalization Considerations . . . . . . . . . . . . . 39 | 9. Internationalization Considerations . . . . . . . . . . . . . 39 | |||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 40 | 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 39 | |||
| 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 40 | 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 39 | |||
| 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 40 | 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 39 | |||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 40 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 40 | |||
| 11.1. General Security Issues with IDNA . . . . . . . . . . . . 40 | 11.1. General Security Issues with IDNA . . . . . . . . . . . . 40 | |||
| 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 41 | 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
| 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41 | 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
| 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42 | 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 41 | |||
| 14.1. Normative References . . . . . . . . . . . . . . . . . . . 42 | 14.1. Normative References . . . . . . . . . . . . . . . . . . . 41 | |||
| 14.2. Informative References . . . . . . . . . . . . . . . . . . 43 | 14.2. Informative References . . . . . . . . . . . . . . . . . . 42 | |||
| Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 45 | Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 44 | |||
| A.1. Changes between Version -00 and Version -01 of | A.1. Changes between Version -00 and Version -01 of | |||
| draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 45 | draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 44 | |||
| A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 45 | A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
| A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
| A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 47 | ||||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 48 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Context and Overview | 1.1. Context and Overview | |||
| The original standards for Internationalized Domain Names (IDNs) were | The original standards for Internationalized Domain Names (IDNs) were | |||
| completed and deployed starting in 2003. Those standards are known | completed and deployed starting in 2003. Those standards are known | |||
| as Internationalized Domain Names in Applications (IDNA), taken from | as Internationalized Domain Names in Applications (IDNA), taken from | |||
| the name of the highest level standard within the group, RFC 3490 | the name of the highest level standard within the group, RFC 3490 | |||
| [RFC3490]. After those standards were deployed, a number of issues | [RFC3490]. After those standards were deployed, a number of issues | |||
| skipping to change at page 12, line 28 | skipping to change at page 11, line 28 | |||
| in the category. Registries are still expected to apply judgment | in the category. Registries are still expected to apply judgment | |||
| about labels they will accept and to maintain rules consistent with | about labels they will accept and to maintain rules consistent with | |||
| those judgments (see [IDNA2008-Protocol] and Section 3.3). | those judgments (see [IDNA2008-Protocol] and Section 3.3). | |||
| Characters that are placed in the "PROTOCOL-VALID" category are | Characters that are placed in the "PROTOCOL-VALID" category are | |||
| expected to never be removed from it or reclassified. While | expected to never be removed from it or reclassified. While | |||
| theoretically characters could be removed from Unicode, such removal | theoretically characters could be removed from Unicode, such removal | |||
| would be inconsistent with the Unicode stability principles (see | would be inconsistent with the Unicode stability principles (see | |||
| [Unicode51], Appendix F) and hence should never occur. | [Unicode51], Appendix F) and hence should never occur. | |||
| 3.1.1.1. Contextual Rules | 3.1.2. Characters Valid Only in Context With Others | |||
| Some characters may be unsuitable for general use in IDNs but | Some characters may be unsuitable for general use in IDNs but | |||
| necessary for the plausible support of some scripts. The two most | necessary for the plausible support of some scripts. The two most | |||
| commonly-cited examples are the zero-width joiner and non-joiner | commonly-cited examples are the zero-width joiner and non-joiner | |||
| characters (ZWJ, U+200D and ZWNJ, U+200C), but provisions for | characters (ZWJ, U+200D and ZWNJ, U+200C), but provisions for | |||
| unambiguous labels may require that other characters be restricted to | unambiguous labels may require that other characters be restricted to | |||
| particular contexts. For example, the ASCII hyphen is not permitted | particular contexts. For example, the ASCII hyphen is not permitted | |||
| to start or end a label, whether that label contains non-ASCII | to start or end a label, whether that label contains non-ASCII | |||
| characters or not. | characters or not. | |||
| 3.1.2.1. Contextual Restrictions | ||||
| These characters must not appear in IDNs without additional | These characters must not appear in IDNs without additional | |||
| restrictions, typically because they have no visible consequences in | restrictions, typically because they have no visible consequences in | |||
| most scripts but affect format or presentation in a few others or | most scripts but affect format or presentation in a few others or | |||
| because they are combining characters that are safe for use only in | because they are combining characters that are safe for use only in | |||
| conjunction with particular characters or scripts. In order to | conjunction with particular characters or scripts. In order to | |||
| permit them to be used at all, they are specially identified as | permit them to be used at all, they are specially identified as | |||
| "CONTEXTUAL RULE REQUIRED" and, when adequately understood, | "CONTEXTUAL RULE REQUIRED" and, when adequately understood, | |||
| associated with a rule. In addition, the rule will define whether it | associated with a rule. In addition, the rule will define whether it | |||
| is to be applied on lookup as well as registration. A distinction is | is to be applied on lookup as well as registration. A distinction is | |||
| made between characters that indicate or prohibit joining (known as | made between characters that indicate or prohibit joining (known as | |||
| "CONTEXT-JOINER" or "CONTEXTJ") and other characters requiring | "CONTEXT-JOINER" or "CONTEXTJ") and other characters requiring | |||
| contextual treatment ("CONTEXT-OTHER" or "CONTEXTO"). Only the | contextual treatment ("CONTEXT-OTHER" or "CONTEXTO"). Only the | |||
| former require full testing at lookup time. | former require full testing at lookup time. | |||
| It is important to note that these contextual rules cannot prevent | ||||
| all uses of the relevant characters that might be confusing or | ||||
| problematic. What they are expected to do is to confine applicability | ||||
| of the characters to scripts (and narrower contexts) where zone | ||||
| administrators are knowledgeable enough about the use of those | ||||
| characters to be prepared to deal with them appropriately. For | ||||
| example, a registry dealing with an Indic script that requires ZWJ | ||||
| and/or ZWNJ as part of the writing system is expected to understand | ||||
| where the characters have visible effect and where they do not and to | ||||
| make registration rules accordingly. By contrast, a registry dealing | ||||
| with Latin or Cyrillic script might not be actively aware that the | ||||
| characters exist, much less of the consequences of embedding them | ||||
| in labels drawn from those scripts. | ||||
| 3.1.1.2. Rules and Their Application | 3.1.2.2. Rules and Their Application | |||
| The actual rules may be DEFINED or NULL. If present, they may have | The actual rules may be DEFINED or NULL. If present, they may have | |||
| values of "True" (character may be used in any position in any | values of "True" (character may be used in any position in any | |||
| label), "False" (character may not be used in any label), or may be a | label), "False" (character may not be used in any label), or may be a | |||
| set of procedural rules that specify the context in which the | set of procedural rules that specify the context in which the | |||
| character is permitted. | character is permitted. | |||
| Examples of descriptions of typical rules, stated informally and in | Examples of descriptions of typical rules, stated informally and in | |||
| English, include "Must follow a character from Script XYZ", "Must | English, include "Must follow a character from Script XYZ", "Must | |||
| occur only if the entire label is in Script ABC", "Must occur only if | occur only if the entire label is in Script ABC", "Must occur only if | |||
| skipping to change at page 13, line 27 | skipping to change at page 12, line 42 | |||
| Because it is easier to identify these characters than to know that | Because it is easier to identify these characters than to know that | |||
| they are actually needed in IDNs or how to establish exactly the | they are actually needed in IDNs or how to establish exactly the | |||
| right rules for each one, a rule may have a null value in a given | right rules for each one, a rule may have a null value in a given | |||
| version of the tables. Characters associated with null rules are not | version of the tables. Characters associated with null rules are not | |||
| permitted to appear in putative labels for either registration or | permitted to appear in putative labels for either registration or | |||
| lookup. Of course, a later version of the tables might contain a | lookup. Of course, a later version of the tables might contain a | |||
| non-null rule. | non-null rule. | |||
| The description of the syntax of the rules, and the rules themselves, | The description of the syntax of the rules, and the rules themselves, | |||
| appear in [IDNA2008-Tables]. | appear in [IDNA2008-Tables]. [[anchor11: ??? Section number would | |||
| be good here.]] | ||||
| 3.1.2. DISALLOWED | 3.1.3. DISALLOWED | |||
| Some characters are inappropriate for use in IDNs and are thus | Some characters are inappropriate for use in IDNs and are thus | |||
| excluded for both registration and lookup (i.e., IDNA-conforming | excluded for both registration and lookup (i.e., IDNA-conforming | |||
| applications performing name lookup should verify that these | applications performing name lookup should verify that these | |||
| characters are absent; if they are present, the label strings should | characters are absent; if they are present, the label strings should | |||
| be rejected rather than converted to A-labels and looked up). Some of | be rejected rather than converted to A-labels and looked up). Some of | |||
| these characters are problematic for use in IDNs (such as the | these characters are problematic for use in IDNs (such as the | |||
| FRACTION SLASH character, U+2044), while some of them (such as the | FRACTION SLASH character, U+2044), while some of them (such as the | |||
| various HEART symbols, e.g., U+2665, U+2661, and U+2765, see | various HEART symbols, e.g., U+2665, U+2661, and U+2765, see | |||
| Section 7.6) simply fall outside the conventions for typical | Section 7.6) simply fall outside the conventions for typical | |||
| skipping to change at page 14, line 21 | skipping to change at page 13, line 38 | |||
| normalization method NFKC to the character yields some other | normalization method NFKC to the character yields some other | |||
| character. | character. | |||
| o The character is an upper-case form or some other form that is | o The character is an upper-case form or some other form that is | |||
| mapped to another character by Unicode casefolding. | mapped to another character by Unicode casefolding. | |||
| o The character is a symbol or punctuation form or, more generally, | o The character is a symbol or punctuation form or, more generally, | |||
| something that is not a letter, digit, or a mark that is used to | something that is not a letter, digit, or a mark that is used to | |||
| form a letter or digit. | form a letter or digit. | |||
| 3.1.3. UNASSIGNED | 3.1.4. UNASSIGNED | |||
| For convenience in processing and table-building, code points that do | For convenience in processing and table-building, code points that do | |||
| not have assigned values in a given version of Unicode are treated as | not have assigned values in a given version of Unicode are treated as | |||
| belonging to a special UNASSIGNED category. Such code points are | belonging to a special UNASSIGNED category. Such code points are | |||
| prohibited in labels to be registered or looked up. The category | prohibited in labels to be registered or looked up. The category | |||
| differs from DISALLOWED in that code points are moved out of it by | differs from DISALLOWED in that code points are moved out of it by | |||
| the simple expedient of being assigned in a later version of Unicode | the simple expedient of being assigned in a later version of Unicode | |||
| (at which point, they are classified into one of the other categories | (at which point, they are classified into one of the other categories | |||
| as appropriate). | as appropriate). | |||
| skipping to change at page 18, line 23 | skipping to change at page 17, line 38 | |||
| In any place where a protocol or document format allows transmission | In any place where a protocol or document format allows transmission | |||
| of the characters in internationalized labels, labels should be | of the characters in internationalized labels, labels should be | |||
| transmitted using whatever character encoding and escape mechanism | transmitted using whatever character encoding and escape mechanism | |||
| the protocol or document format uses at that place. This provision | the protocol or document format uses at that place. This provision | |||
| is intended to prevent situations in which, e.g., UTF-8 domain names | is intended to prevent situations in which, e.g., UTF-8 domain names | |||
| appear embedded in text that is otherwise in some other character | appear embedded in text that is otherwise in some other character | |||
| coding. | coding. | |||
| All protocols that use domain name slots (See Section 2.3.1.6 | All protocols that use domain name slots (See Section 2.3.1.6 | |||
| [[anchor12: ?? Verify this]] in [IDNA2008-Defs]) already have the | [[anchor14: ?? Verify this]] in [IDNA2008-Defs]) already have the | |||
| capacity for handling domain names in the ASCII charset. Thus, | capacity for handling domain names in the ASCII charset. Thus, | |||
| A-labels can inherently be handled by those protocols. | A-labels can inherently be handled by those protocols. | |||
| 4.3. Linguistic Expectations: Ligatures, Digraphs, and Alternate | 4.3. Linguistic Expectations: Ligatures, Digraphs, and Alternate | |||
| Character Forms | Character Forms | |||
| [[anchor13: There is some internal redundancy and repetition in the | [[anchor15: There is some internal redundancy and repetition in the | |||
| material in this section. Specific suggestions about how to reduce or | material in this section. Specific suggestions about how to reduce or | |||
| eliminate redundant text would be appreciated. If no such | eliminate redundant text would be appreciated. If no such | |||
| suggestions are received before -07 is posted, this note will be | suggestions are received before -07 is posted, this note will be | |||
| removed.]] | removed.]] | |||
| Users often have expectations about character matching or equivalence | Users often have expectations about character matching or equivalence | |||
| that are based on their own languages and the orthography of those | that are based on their own languages and the orthography of those | |||
| languages. These expectations may not be consistent with forms or | languages. These expectations may not be consistent with forms or | |||
| actions that can be naturally accommodated in a character coding | actions that can be naturally accommodated in a character coding | |||
| system, especially if multiple languages are written using the same | system, especially if multiple languages are written using the same | |||
| skipping to change at page 20, line 38 | skipping to change at page 20, line 6 | |||
| provides a prime example of a situation in which a registry that is | provides a prime example of a situation in which a registry that is | |||
| aware of the language context in which labels are to be registered, | aware of the language context in which labels are to be registered, | |||
| and where that language sometimes (or always) treats the two- | and where that language sometimes (or always) treats the two- | |||
| character sequences as equivalent to the combined form, should give | character sequences as equivalent to the combined form, should give | |||
| serious consideration to applying a "variant" model [RFC3743] | serious consideration to applying a "variant" model [RFC3743] | |||
| [RFC4290], or to prohibiting registration of one of the forms entirely, | [RFC4290], or to prohibiting registration of one of the forms entirely, | |||
| to reduce the opportunities for user confusion and fraud that would | to reduce the opportunities for user confusion and fraud that would | |||
| result from the related strings being registered to different | result from the related strings being registered to different | |||
| parties. | parties. | |||
| [[anchor14: Placeholder: A discussion of the Arabic digit issue | [[anchor16: Placeholder: A discussion of the Arabic digit issue | |||
| should go here once it is resolved in some appropriate way.]] | should go here once it is resolved in some appropriate way.]] | |||
| 4.4. Case Mapping and Related Issues | 4.4. Case Mapping and Related Issues | |||
| In the DNS, ASCII letters are stored with their case preserved. | In the DNS, ASCII letters are stored with their case preserved. | |||
| Matching during the query process is case-independent, but none of | Matching during the query process is case-independent, but none of | |||
| the information that might be represented by choices of case has been | the information that might be represented by choices of case has been | |||
| lost. That model has been accidentally helpful because, as people | lost. That model has been accidentally helpful because, as people | |||
| have created DNS labels by catenating words (or parts of words) to | have created DNS labels by catenating words (or parts of words) to | |||
| form labels, case has often been used to distinguish among components | form labels, case has often been used to distinguish among components | |||
| skipping to change at page 28, line 47 ¶ | skipping to change at page 28, line 17 ¶ | |||
| o Validate the label itself for conformance with a small number of | o Validate the label itself for conformance with a small number of | |||
| whole-label rules, notably verifying that there are no leading | whole-label rules, notably verifying that there are no leading | |||
| combining marks, that the "bidi" conditions are met if right to | combining marks, that the "bidi" conditions are met if right to | |||
| left characters appear, that any required contextual rules are | left characters appear, that any required contextual rules are | |||
| available and that, if such rules are associated with Joiner | available and that, if such rules are associated with Joiner | |||
| Controls, they are tested. | Controls, they are tested. | |||
| o Avoid validating other contextual rules about characters, | o Avoid validating other contextual rules about characters, | |||
| including mixed-script label prohibitions, although such rules may | including mixed-script label prohibitions, although such rules may | |||
| be used to influence presentation decisions in the user interface. | be used to influence presentation decisions in the user interface. | |||
| [[anchor18: Check this, and all similar statements, against | [[anchor20: Check this, and all similar statements, against | |||
| Protocol when that is finished.]] | Protocol when that is finished.]] | |||
| By avoiding applying its own interpretation of which labels are valid | By avoiding applying its own interpretation of which labels are valid | |||
| as a means of rejecting lookup attempts, the lookup application | as a means of rejecting lookup attempts, the lookup application | |||
| becomes less sensitive to version incompatibilities with the | becomes less sensitive to version incompatibilities with the | |||
| particular zone registry associated with the domain name. | particular zone registry associated with the domain name. | |||
| An application or client that processes names according to this | An application or client that processes names according to this | |||
| protocol and then resolves them in the DNS will be able to locate any | protocol and then resolves them in the DNS will be able to locate any | |||
| name that is validly registered, as long as its version of the | name that is validly registered, as long as its version of the | |||
| skipping to change at page 29, line 20 ¶ | skipping to change at page 28, line 39 ¶ | |||
| of the characters in the label. Messages to users should distinguish | of the characters in the label. Messages to users should distinguish | |||
| between "label contains an unallocated code point" and other types of | between "label contains an unallocated code point" and other types of | |||
| lookup failures. A failure on the basis of an old version of Unicode | lookup failures. A failure on the basis of an old version of Unicode | |||
| may lead the user to a desire to upgrade to a newer version, but will | may lead the user to a desire to upgrade to a newer version, but will | |||
| have no other ill effects (this is consistent with behavior in the | have no other ill effects (this is consistent with behavior in the | |||
| transition to the DNS when some hosts could not yet handle some forms | transition to the DNS when some hosts could not yet handle some forms | |||
| of names or record types). | of names or record types). | |||
| 7.2. Changes in Character Interpretations | 7.2. Changes in Character Interpretations | |||
| [[anchor19: Note in Draft: This subsection is completely new in | [[anchor21: Note in Draft: This subsection is completely new in | |||
| version -04 and has been further tuned in -05 and -06 of this | version -04 and has been further tuned in -05 and -06 of this | |||
| document. It could almost certainly use improvement, although this | document. It could almost certainly use improvement, although this | |||
| note will be removed if there are not significant suggestions about | note will be removed if there are not significant suggestions about | |||
| the -06 version. It also contains some material that is redundant | the -06 version. It also contains some material that is redundant | |||
| with material in other sections. I have not tried to remove that | with material in other sections. I have not tried to remove that | |||
| material and will not do so until the WG concludes that this section | material and will not do so until the WG concludes that this section | |||
| is relatively stable, but would appreciate help in identifying what | is relatively stable, but would appreciate help in identifying what | |||
| should be removed or how this might be enhanced to contain more of | should be removed or how this might be enhanced to contain more of | |||
| that other material. --JcK]] | that other material. --JcK]] | |||
| skipping to change at page 35, line 26 ¶ | skipping to change at page 34, line 44 ¶ | |||
| there are no uniform conventions for naming; variations such as | there are no uniform conventions for naming; variations such as | |||
| outline, solid, and shaded forms may or may not exist; and so on. | outline, solid, and shaded forms may or may not exist; and so on. | |||
| As just one example, consider a "heart" symbol as it might appear | As just one example, consider a "heart" symbol as it might appear | |||
| in a logo that might be read as "I love...". While the user might | in a logo that might be read as "I love...". While the user might | |||
| read such a logo as "I love..." or "I heart...", considerable | read such a logo as "I love..." or "I heart...", considerable | |||
| knowledge of the coding distinctions made in Unicode is needed to | knowledge of the coding distinctions made in Unicode is needed to | |||
| know that there is more than one "heart" character (e.g., U+2665, | know that there is more than one "heart" character (e.g., U+2665, | |||
| U+2661, and U+2765) and how to describe it. These issues are of | U+2661, and U+2765) and how to describe it. These issues are of | |||
| particular importance if strings are expected to be understood or | particular importance if strings are expected to be understood or | |||
| transcribed by the listener after being read out loud. | transcribed by the listener after being read out loud. | |||
| [[anchor20: The above paragraph remains controversial as to | [[anchor22: The above paragraph remains controversial as to | |||
| whether it is valid. The WG will need to make a decision if this | whether it is valid. The WG will need to make a decision if this | |||
| section is not dropped entirely.]] | section is not dropped entirely.]] | |||
| o Consider the case of a screen reader used by blind Internet users | o Consider the case of a screen reader used by blind Internet users | |||
| who must listen to renderings of IDN domain names and possibly | who must listen to renderings of IDN domain names and possibly | |||
| reproduce them on the keyboard. | reproduce them on the keyboard. | |||
| o As a simplified example of this, assume one wanted to use a | o As a simplified example of this, assume one wanted to use a | |||
| "heart" or "star" symbol in a label. This is problematic because | "heart" or "star" symbol in a label. This is problematic because | |||
| those names are ambiguous in the Unicode system of naming (the | those names are ambiguous in the Unicode system of naming (the | |||
| skipping to change at page 40, line 20 ¶ | skipping to change at page 39, line 43 ¶ | |||
| rules that are integral elements of [IDNA2008-Tables]. Convenience | rules that are integral elements of [IDNA2008-Tables]. Convenience | |||
| in programming and validation requires a registry of characters and | in programming and validation requires a registry of characters and | |||
| scripts and their categories, updated for each new version of Unicode | scripts and their categories, updated for each new version of Unicode | |||
| and the characters it contains. The details of this registry are | and the characters it contains. The details of this registry are | |||
| specified in [IDNA2008-Tables]. | specified in [IDNA2008-Tables]. | |||
| 10.2. IDNA Context Registry | 10.2. IDNA Context Registry | |||
| For characters that are defined in the IDNA Character Registry list | For characters that are defined in the IDNA Character Registry list | |||
| as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of | as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of | |||
| rule described in Section 3.1.1.1), IANA will create and maintain a | rule described in Section 3.1.2), IANA will create and maintain a | |||
| list of approved contextual rules. The details for those rules | list of approved contextual rules. The details for those rules | |||
| appear in [IDNA2008-Tables]. | appear in [IDNA2008-Tables]. | |||
| 10.3. IANA Repository of IDN Practices of TLDs | 10.3. IANA Repository of IDN Practices of TLDs | |||
| This registry, historically described as the "IANA Language Character | This registry, historically described as the "IANA Language Character | |||
| Set Registry" or "IANA Script Registry" (both somewhat misleading | Set Registry" or "IANA Script Registry" (both somewhat misleading | |||
| terms), is maintained by IANA at the request of ICANN. It is used to | terms), is maintained by IANA at the request of ICANN. It is used to | |||
| provide a central documentation repository of the IDN policies used | provide a central documentation repository of the IDN policies used | |||
| by top level domain (TLD) registries who volunteer to contribute to | by top level domain (TLD) registries who volunteer to contribute to | |||
| skipping to change at page 48, line 11 ¶ | skipping to change at page 47, line 36 ¶ | |||
| o Moved the "name server considerations" material to this document | o Moved the "name server considerations" material to this document | |||
| from Protocol because it is non-normative and not part of the | from Protocol because it is non-normative and not part of the | |||
| protocol itself. | protocol itself. | |||
| o To improve clarity, redid discussion of the reasons why looking up | o To improve clarity, redid discussion of the reasons why looking up | |||
| unassigned code points is prohibited. | unassigned code points is prohibited. | |||
| o Editorial and other non-substantive corrections to reflect earlier | o Editorial and other non-substantive corrections to reflect earlier | |||
| errors as well as new definitions and terminology. | errors as well as new definitions and terminology. | |||
| A.8. Version -08 | ||||
| o Slight revision to "contextual" discussion (Section 3.1.2) and | ||||
| moving it to a separate subsection, rather than under "PVALID", | ||||
| for better parallelism with Tables. Also reflected Mark's | ||||
| comments about the limitations of the approach. | ||||
| o Added placeholder notes as reminders of where references to the | ||||
| other documents need Section numbers. More of these will be added | ||||
| as needed (feel free to identify relevant places), but the actual | ||||
| section numbers will not be inserted until the documents are | ||||
| completely stable, i.e., on their way to the RFC Editor. | ||||
| Author's Address | Author's Address | |||
| John C Klensin | John C Klensin | |||
| 1770 Massachusetts Ave, Ste 322 | 1770 Massachusetts Ave, Ste 322 | |||
| Cambridge, MA 02140 | Cambridge, MA 02140 | |||
| USA | USA | |||
| Phone: +1 617 245 1457 | Phone: +1 617 245 1457 | |||
| Email: john+ietf@jck.com | Email: john+ietf@jck.com | |||
| End of changes. 37 change blocks. | ||||
| 82 lines changed or deleted | 110 lines changed or added | |||
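
As an illustration of the "heart" symbol discussion in the section on symbols (the paragraph citing U+2665, U+2661, and U+2765), the following is a minimal sketch, not part of either draft, that looks up the formal Unicode names of those three code points. It assumes Python's standard unicodedata module; the names HEART_CODE_POINTS and describe are hypothetical and exist only for this example.

```python
import unicodedata

# Code points cited in the text as distinct "heart" characters.
HEART_CODE_POINTS = [0x2665, 0x2661, 0x2765]


def describe(code_points):
    """Return (U+XXXX, formal Unicode name) pairs for the given code points."""
    return [(f"U+{cp:04X}", unicodedata.name(chr(cp))) for cp in code_points]


if __name__ == "__main__":
    # Print one line per code point: the U+ notation and the formal name.
    for label, name in describe(HEART_CODE_POINTS):
        print(label, name)
```

Each of the three code points has a different formal name, and none of them is simply "heart", which is the ambiguity the text describes for a listener who hears the symbol read aloud and must reproduce it.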