| < draft-ietf-idnabis-rationale-13.txt | draft-ietf-idnabis-rationale-14.txt > | |||
|---|---|---|---|---|
| Network Working Group J. Klensin | Network Working Group J. Klensin | |||
| Internet-Draft September 13, 2009 | Internet-Draft October 25, 2009 | |||
| Intended status: Informational | Intended status: Informational | |||
| Expires: March 17, 2010 | Expires: April 28, 2010 | |||
| Internationalized Domain Names for Applications (IDNA): Background, | Internationalized Domain Names for Applications (IDNA): Background, | |||
| Explanation, and Rationale | Explanation, and Rationale | |||
| draft-ietf-idnabis-rationale-13.txt | draft-ietf-idnabis-rationale-14.txt | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. This document may contain material | provisions of BCP 78 and BCP 79. This document may contain material | |||
| from IETF Documents or IETF Contributions published or made publicly | from IETF Documents or IETF Contributions published or made publicly | |||
| available before November 10, 2008. The person(s) controlling the | available before November 10, 2008. The person(s) controlling the | |||
| copyright in some of this material may not have granted the IETF | copyright in some of this material may not have granted the IETF | |||
| Trust the right to allow modifications of such material outside the | Trust the right to allow modifications of such material outside the | |||
| IETF Standards Process. Without obtaining an adequate license from | IETF Standards Process. Without obtaining an adequate license from | |||
| skipping to change at page 1, line 43 ¶ | skipping to change at page 1, line 43 ¶ | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on March 17, 2010. | This Internet-Draft will expire on April 28, 2010. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2009 IETF Trust and the persons identified as the | Copyright (c) 2009 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents in effect on the date of | Provisions Relating to IETF Documents in effect on the date of | |||
| publication of this document (http://trustee.ietf.org/license-info). | publication of this document (http://trustee.ietf.org/license-info). | |||
| Please review these documents carefully, as they describe your rights | Please review these documents carefully, as they describe your rights | |||
| skipping to change at page 2, line 23 ¶ | skipping to change at page 2, line 23 ¶ | |||
| Several years have passed since the original protocol for | Several years have passed since the original protocol for | |||
| Internationalized Domain Names (IDNs) was completed and deployed. | Internationalized Domain Names (IDNs) was completed and deployed. | |||
| During that time, a number of issues have arisen, including the need | During that time, a number of issues have arisen, including the need | |||
| to update the system to deal with newer versions of Unicode. Some of | to update the system to deal with newer versions of Unicode. Some of | |||
| these issues require tuning of the existing protocols and the tables | these issues require tuning of the existing protocols and the tables | |||
| on which they depend. This document provides an overview of a | on which they depend. This document provides an overview of a | |||
| revised system and provides explanatory material for its components. | revised system and provides explanatory material for its components. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 4 | 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 5 | |||
| 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 5 | 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.3.1. DNS "Name" Terminology . . . . . . . . . . . . . . . . 5 | 1.3.1. DNS "Name" Terminology . . . . . . . . . . . . . . . . 6 | |||
| 1.3.2. New Terminology and Restrictions . . . . . . . . . . . 6 | 1.3.2. New Terminology and Restrictions . . . . . . . . . . . 7 | |||
| 1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.4. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 1.5. Applicability and Function of IDNA . . . . . . . . . . . . 7 | 1.5. Applicability and Function of IDNA . . . . . . . . . . . . 8 | |||
| 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 8 | 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 9 | |||
| 2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 9 | 2. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 10 | |||
| 3. Permitted Characters: An Inclusion List . . . . . . . . . . . 9 | 3. Permitted Characters: An Inclusion List . . . . . . . . . . . 10 | |||
| 3.1. A Tiered Model of Permitted Characters and Labels . . . . 10 | 3.1. A Tiered Model of Permitted Characters and Labels . . . . 11 | |||
| 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 10 | 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.1.2. CONTEXTUAL RULE REQUIRED . . . . . . . . . . . . . . . 11 | 3.1.2. CONTEXTUAL RULE REQUIRED . . . . . . . . . . . . . . . 12 | |||
| 3.1.2.2. Rules and Their Application . . . . . . . . . . . 12 | 3.1.2.1. Contextual Restrictions . . . . . . . . . . . . . 12 | |||
| 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 12 | 3.1.2.2. Rules and Their Application . . . . . . . . . . . 13 | |||
| 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 13 | 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| | 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14 | 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 14 | |||
| 3.3. Layered Restrictions: Tables, Context, Registration, | 3.3. Layered Restrictions: Tables, Context, Registration, | |||
| Applications . . . . . . . . . . . . . . . . . . . . . . . 14 | Applications . . . . . . . . . . . . . . . . . . . . . . . 15 | |||
| 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15 | 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 16 | |||
| 4.1. Display and Network Order . . . . . . . . . . . . . . . . 15 | 4.1. Display and Network Order . . . . . . . . . . . . . . . . 16 | |||
| 4.2. Entry and Display in Applications . . . . . . . . . . . . 16 | 4.2. Entry and Display in Applications . . . . . . . . . . . . 17 | |||
| 4.3. Linguistic Expectations: Ligatures, Digraphs, and | 4.3. Linguistic Expectations: Ligatures, Digraphs, and | |||
| Alternate Character Forms . . . . . . . . . . . . . . . . 18 | Alternate Character Forms . . . . . . . . . . . . . . . . 19 | |||
| 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 | 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 21 | |||
| 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 | 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 22 | |||
| 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21 | 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 22 | |||
| 6. Front-end and User Interface Processing for Lookup . . . . . . 22 | 6. Front-end and User Interface Processing for Lookup . . . . . . 23 | |||
| 7. Migration from IDNA2003 and Unicode Version Synchronization . 24 | 7. Migration from IDNA2003 and Unicode Version Synchronization . 25 | |||
| 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 24 | 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 25 | |||
| 7.1.1. Summary and Discussion of IDNA Validity Criteria . . . 25 | 7.1.1. Summary and Discussion of IDNA Validity Criteria . . . 25 | |||
| 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 25 | 7.1.2. Labels in Registration . . . . . . . . . . . . . . . . 26 | |||
| 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 26 | 7.1.3. Labels in Lookup . . . . . . . . . . . . . . . . . . . 27 | |||
| 7.2. Changes in Character Interpretations . . . . . . . . . . . 28 | 7.2. Changes in Character Interpretations . . . . . . . . . . . 29 | |||
| 7.3. Character Mapping . . . . . . . . . . . . . . . . . . . . 29 | 7.3. Character Mapping . . . . . . . . . . . . . . . . . . . . 30 | |||
| 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 29 | 7.4. The Question of Prefix Changes . . . . . . . . . . . . . . 30 | |||
| 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 29 | 7.4.1. Conditions Requiring a Prefix Change . . . . . . . . . 30 | |||
| 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 30 | 7.4.2. Conditions Not Requiring a Prefix Change . . . . . . . 31 | |||
| 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 30 | 7.4.3. Implications of Prefix Changes . . . . . . . . . . . . 31 | |||
| 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 31 | 7.5. Stringprep Changes and Compatibility . . . . . . . . . . . 32 | |||
| 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 32 | 7.6. The Symbol Question . . . . . . . . . . . . . . . . . . . 32 | |||
| 7.7. Migration Between Unicode Versions: Unassigned Code | 7.7. Migration Between Unicode Versions: Unassigned Code | |||
| Points . . . . . . . . . . . . . . . . . . . . . . . . . . 33 | Points . . . . . . . . . . . . . . . . . . . . . . . . . . 34 | |||
| 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 35 | 7.8. Other Compatibility Issues . . . . . . . . . . . . . . . . 36 | |||
| 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 35 | 8. Name Server Considerations . . . . . . . . . . . . . . . . . . 36 | |||
| 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 35 | 8.1. Processing Non-ASCII Strings . . . . . . . . . . . . . . . 36 | |||
| 8.2. DNSSEC Authentication of IDN Domain Names . . . . . . . . 36 | 8.2. Root and other DNS Server Considerations . . . . . . . . . 37 | |||
| 8.3. Root and other DNS Server Considerations . . . . . . . . . 36 | 9. Internationalization Considerations . . . . . . . . . . . . . 37 | |||
| 9. Internationalization Considerations . . . . . . . . . . . . . 36 | ||||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 | |||
| 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 37 | 10.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 37 | |||
| 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 37 | 10.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 38 | |||
| 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 37 | 10.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 38 | |||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 38 | 11. Security Considerations . . . . . . . . . . . . . . . . . . . 38 | |||
| 11.1. General Security Issues with IDNA . . . . . . . . . . . . 38 | 11.1. General Security Issues with IDNA . . . . . . . . . . . . 38 | |||
| 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 38 | 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 38 | |||
| 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 40 | |||
| 14.1. Normative References . . . . . . . . . . . . . . . . . . . 39 | 14.1. Normative References . . . . . . . . . . . . . . . . . . . 40 | |||
| 14.2. Informative References . . . . . . . . . . . . . . . . . . 40 | 14.2. Informative References . . . . . . . . . . . . . . . . . . 41 | |||
| Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 42 | Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 43 | |||
| A.1. Changes between Version -00 and Version -01 of | A.1. Changes between Version -00 and Version -01 of | |||
| draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 43 | draft-ietf-idnabis-rationale . . . . . . . . . . . . . . . 43 | |||
| A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 43 | A.2. Version -02 . . . . . . . . . . . . . . . . . . . . . . . 44 | |||
| A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 43 | A.3. Version -03 . . . . . . . . . . . . . . . . . . . . . . . 44 | |||
| A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 44 | A.4. Version -04 . . . . . . . . . . . . . . . . . . . . . . . 44 | |||
| A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 44 | A.5. Version -05 . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
| A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 45 | A.6. Version -06 . . . . . . . . . . . . . . . . . . . . . . . 45 | |||
| A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 45 | A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 45 | A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47 | A.14. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| | A.15. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Context and Overview | 1.1. Context and Overview | |||
| Internationalized Domain Names in Applications (IDNA) is a collection | Internationalized Domain Names in Applications (IDNA) is a collection | |||
| of standards that allow client applications to convert some Unicode | of standards that allow client applications to convert some Unicode | |||
| mnemonics to an ASCII-compatible encoding form ("ACE") which is a | mnemonics to an ASCII-compatible encoding form ("ACE") which is a | |||
| valid DNS label containing only letters, digits, and hyphens. The | valid DNS label containing only letters, digits, and hyphens. The | |||
| specific form of ACE label used by IDNA is called an "A-label". A | specific form of ACE label used by IDNA is called an "A-label". A | |||
| skipping to change at page 6, line 24 ¶ | skipping to change at page 7, line 24 ¶ | |||
| label that may appear to meet certain definitional constraints but | label that may appear to meet certain definitional constraints but | |||
| has not yet been sufficiently tested for validity. | has not yet been sufficiently tested for validity. | |||
| These definitions are also illustrated in Figure 1 of the Definitions | These definitions are also illustrated in Figure 1 of the Definitions | |||
| Document [IDNA2008-Defs]. R-LDH-labels contain "--" in the third and | Document [IDNA2008-Defs]. R-LDH-labels contain "--" in the third and | |||
| fourth character from the beginning of the label. In IDNA-aware | fourth character from the beginning of the label. In IDNA-aware | |||
| applications, only a subset of these reserved labels is permitted to | applications, only a subset of these reserved labels is permitted to | |||
| be used, namely the A-label subset. A-labels are a subset of the | be used, namely the A-label subset. A-labels are a subset of the | |||
| R-LDH-labels that begin with the case-insensitive string "xn--". | R-LDH-labels that begin with the case-insensitive string "xn--". | |||
| Labels that bear this prefix but which are not otherwise valid fall | Labels that bear this prefix but which are not otherwise valid fall | |||
| into the "Fake-A-label" category. The non-reserved labels (NR-LDH- | into the "Fake A-label" category. The non-reserved labels (NR-LDH- | |||
| labels) are implicitly valid since they do not trigger any | labels) are implicitly valid since they do not bear any resemblance | |||
| resemblance to IDNA-landr NR-LDH-labels. | to the labels specified by IDNA. | |||
| The creation of the Reserved-LDH category is required for three | The creation of the Reserved-LDH category is required for three | |||
| reasons: | reasons: | |||
| o to prevent confusion with pre-IDNA coding forms; | o to prevent confusion with pre-IDNA coding forms; | |||
| o to permit future extensions that would require changing the | o to permit future extensions that would require changing the | |||
| prefix, no matter how unlikely those might be (see Section 7.4); | prefix, no matter how unlikely those might be (see Section 7.4); | |||
| and | and | |||
| skipping to change at page 7, line 24 ¶ | skipping to change at page 8, line 24 ¶ | |||
| algorithms. | algorithms. | |||
| 1.5. Applicability and Function of IDNA | 1.5. Applicability and Function of IDNA | |||
| The IDNA specification solves the problem of extending the repertoire | The IDNA specification solves the problem of extending the repertoire | |||
| of characters that can be used in domain names to include a large | of characters that can be used in domain names to include a large | |||
| subset of the Unicode repertoire. | subset of the Unicode repertoire. | |||
| IDNA does not extend DNS. Instead, the applications (and, by | IDNA does not extend DNS. Instead, the applications (and, by | |||
| implication, the users) continue to see an exact-match lookup | implication, the users) continue to see an exact-match lookup | |||
| service. Either there is a single exactly-matching (subject to the | service. Either there is a single exactly-matching name (subject to | |||
| base DNS requirement of case-insensitive ASCII matching) name or | the base DNS requirement of case-insensitive ASCII matching) or there | |||
| there is no match. This model has served the existing applications | is no match. This model has served the existing applications well, | |||
| well, but it requires, with or without internationalized domain | but it requires, with or without internationalized domain names, that | |||
| names, that users know the exact spelling of the domain names that | users know the exact spelling of the domain names that are to be | |||
| are to be typed into applications such as web browsers and mail user | typed into applications such as web browsers and mail user agents. | |||
| agents. The introduction of the larger repertoire of characters | The introduction of the larger repertoire of characters potentially | |||
| potentially makes the set of misspellings larger, especially given | makes the set of misspellings larger, especially given that in some | |||
| that in some cases the same appearance, for example on a business | cases the same appearance, for example on a business card, might | |||
| card, might visually match several Unicode code points or several | visually match several Unicode code points or several sequences of | |||
| sequences of code points. | code points. | |||
| The IDNA standard does not require any applications to conform to it, | The IDNA standard does not require any applications to conform to it, | |||
| nor does it retroactively change those applications. An application | nor does it retroactively change those applications. An application | |||
| can elect to use IDNA in order to support IDN while maintaining | can elect to use IDNA in order to support IDN while maintaining | |||
| interoperability with existing infrastructure. If an application | interoperability with existing infrastructure. If an application | |||
| wants to use non-ASCII characters in public DNS domain names, IDNA is | wants to use non-ASCII characters in public DNS domain names, IDNA is | |||
| the only currently-defined option. Adding IDNA support to an | the only currently-defined option. Adding IDNA support to an | |||
| existing application entails changes to the application only, and | existing application entails changes to the application only, and | |||
| leaves room for flexibility in front-end processing and more | leaves room for flexibility in front-end processing and more | |||
| specifically in the user interface (see Section 6). | specifically in the user interface (see Section 6). | |||
| A great deal of the discussion of IDN solutions has focused on | A great deal of the discussion of IDN solutions has focused on | |||
| transition issues and how IDNs will work in a world where not all of | transition issues and how IDNs will work in a world where not all of | |||
| the components have been updated. Proposals that were not chosen by | the components have been updated. Proposals that were not chosen by | |||
| the original IDN Working Group would have depended on updating of | the original IDN Working Group would have depended on updating of | |||
| user applications, DNS resolvers, and DNS servers in order for a user | user applications, DNS resolvers, and DNS servers in order for a user | |||
| to apply an internationalized domain name in any form or coding | to apply an internationalized domain name in any form or coding | |||
| acceptable under that method. While processing must be performed | acceptable under that method. While processing must be performed | |||
| prior to or after access to the DNS, IDNA requires no changes to the | prior to or after access to the DNS, IDNA requires no changes to the | |||
| DNS protocol or any DNS servers or the resolvers on user's computers. | DNS protocol, any DNS servers, or the resolvers on users' computers. | |||
| IDNA allows the graceful introduction of IDNs not only by avoiding | IDNA allows the graceful introduction of IDNs not only by avoiding | |||
| upgrades to existing infrastructure (such as DNS servers and mail | upgrades to existing infrastructure (such as DNS servers and mail | |||
| transport agents), but also by allowing some limited use of IDNs in | transport agents), but also by allowing some limited use of IDNs in | |||
| applications by using the ASCII-encoded representation of the labels | applications by using the ASCII-encoded representation of the labels | |||
| containing non-ASCII characters. While such names are user- | containing non-ASCII characters. While such names are user- | |||
| unfriendly to read and type, and hence not optimal for user input, | unfriendly to read and type, and hence not optimal for user input, | |||
| they can be used as a last resort to allow rudimentary IDN usage. | they can be used as a last resort to allow rudimentary IDN usage. | |||
| For example, they might be the best choice for display if it were | For example, they might be the best choice for display if it were | |||
| known that relevant fonts were not available on the user's computer. | known that relevant fonts were not available on the user's computer. | |||
| skipping to change at page 11, line 41 ¶ | skipping to change at page 12, line 41 ¶ | |||
| rule itself is to be applied on lookup as well as registration. | rule itself is to be applied on lookup as well as registration. | |||
| A distinction is made between characters that indicate or prohibit | A distinction is made between characters that indicate or prohibit | |||
| joining and ones similar to them (known as "CONTEXT-JOINER" or | joining and ones similar to them (known as "CONTEXT-JOINER" or | |||
| "CONTEXTJ") and other characters requiring contextual treatment | "CONTEXTJ") and other characters requiring contextual treatment | |||
| ("CONTEXT-OTHER" or "CONTEXTO"). Only the former require full | ("CONTEXT-OTHER" or "CONTEXTO"). Only the former require full | |||
| testing at lookup time. | testing at lookup time. | |||
| It is important to note that these contextual rules cannot prevent | It is important to note that these contextual rules cannot prevent | |||
| all uses of the relevant characters that might be confusing or | all uses of the relevant characters that might be confusing or | |||
| problematic. What they are expected do is to confine applicability | problematic. What they are expected to do is to confine | |||
| of the characters to scripts (and narrower contexts) where zone | applicability of the characters to scripts (and narrower contexts) | |||
| administrators are knowledgeable enough about the use of those | where zone administrators are knowledgeable enough about the use of | |||
| characters to be prepared to deal with them appropriately. | those characters to be prepared to deal with them appropriately. | |||
| For example, a registry dealing with an Indic script that requires | For example, a registry dealing with an Indic script that requires | |||
| ZWJ and/or ZWNJ as part of the writing system is expected to | ZWJ and/or ZWNJ as part of the writing system is expected to | |||
| understand where the characters have visible effect and where they do | understand where the characters have visible effect and where they do | |||
| not and to make registration rules accordingly. By contrast, a | not and to make registration rules accordingly. By contrast, a | |||
| registry dealing primarily with Latin or Cyrillic script might not be | registry dealing primarily with Latin or Cyrillic script might not be | |||
| actively aware that the characters exist, much less about the | actively aware that the characters exist, much less about the | |||
| consequences of embedding them in labels drawn from those scripts and | consequences of embedding them in labels drawn from those scripts and | |||
| therefore should avoid accepting registrations containing those | therefore should avoid accepting registrations containing those | |||
| characters, at least in Latin or Cyrillic-script labels. | characters, at least in Latin or Cyrillic-script labels. | |||
| skipping to change at page 12, line 19 ¶ | skipping to change at page 13, line 19 ¶ | |||
| Rules have descriptions such as "Must follow a character from Script | Rules have descriptions such as "Must follow a character from Script | |||
| XYZ", "Must occur only if the entire label is in Script ABC", or | XYZ", "Must occur only if the entire label is in Script ABC", or | |||
| "Must occur only if the previous and subsequent characters have the | "Must occur only if the previous and subsequent characters have the | |||
| DFG property". The actual rules may be DEFINED or NULL. If present, | DFG property". The actual rules may be DEFINED or NULL. If present, | |||
| they may have values of "True" (character may be used in any position | they may have values of "True" (character may be used in any position | |||
| in any label), "False" (character may not be used in any label), or | in any label), "False" (character may not be used in any label), or | |||
| may be a set of procedural rules that specify the context in which | may be a set of procedural rules that specify the context in which | |||
| the character is permitted. | the character is permitted. | |||
| Examples of descriptions of typical rules, stated informally and in | ||||
| English, include "Must follow a character from Script XYZ", "Must | ||||
| occur only if the entire label is in Script ABC", "Must occur only if | ||||
| the previous and subsequent characters have the DFG property". | ||||
| Because it is easier to identify these characters than to know that | Because it is easier to identify these characters than to know that | |||
| they are actually needed in IDNs or how to establish exactly the | they are actually needed in IDNs or how to establish exactly the | |||
| right rules for each one, a rule may have a null value in a given | right rules for each one, a rule may have a null value in a given | |||
| version of the tables. Characters associated with null rules are not | version of the tables. Characters associated with null rules are not | |||
| permitted to appear in putative labels for either registration or | permitted to appear in putative labels for either registration or | |||
| lookup. Of course, a later version of the tables might contain a | lookup. Of course, a later version of the tables might contain a | |||
| non-null rule. | non-null rule. | |||
| The actual rules and their descriptions are in Sections 2 and 3 of | The actual rules and their descriptions are in Sections 2 and 3 of | |||
| [IDNA2008-Tables]. That document also specifies the creation of a | [IDNA2008-Tables]. That document also specifies the creation of a | |||
| skipping to change at page 13, line 43 ¶ | skipping to change at page 14, line 37 ¶ | |||
| not have assigned values in a given version of Unicode are treated as | not have assigned values in a given version of Unicode are treated as | |||
| belonging to a special UNASSIGNED category. Such code points are | belonging to a special UNASSIGNED category. Such code points are | |||
| prohibited in labels to be registered or looked up. The category | prohibited in labels to be registered or looked up. The category | |||
| differs from DISALLOWED in that code points are moved out of it by | differs from DISALLOWED in that code points are moved out of it by | |||
| the simple expedient of being assigned in a later version of Unicode | the simple expedient of being assigned in a later version of Unicode | |||
| (at which point, they are classified into one of the other categories | (at which point, they are classified into one of the other categories | |||
| as appropriate). | as appropriate). | |||
| The rationale for restricting the processing of UNASSIGNED characters | The rationale for restricting the processing of UNASSIGNED characters | |||
| is simply that the properties of such code points cannot be | is simply that the properties of such code points cannot be | |||
| completely known until actual characters are assigned to them. If, | completely known until actual characters are assigned to them. For | |||
| for example, such a code point was permitted to be included in a | example, assume that an UNASSIGNED code point were included in a | |||
| label to be looked up, and the code point was later to be assigned to | label to be looked up. Assume that the code point was later assigned | |||
| a character that required some set of contextual rules, un-updated | to a character that required some set of contextual rules. With that | |||
| instances of IDNA-aware software might permit lookup of labels | combination, un-updated instances of IDNA-aware software might permit | |||
| containing the previously-unassigned characters while updated | lookup of labels containing the previously-unassigned characters | |||
| versions of IDNA-aware software might restrict their use in lookup, | while updated versions of the software might restrict use of the same | |||
| depending on the contextual rules. It should be clear that under no | label in lookup, depending on the contextual rules. It should be | |||
| circumstance should an UNASSIGNED character be permitted in a label | clear that under no circumstance should an UNASSIGNED character be | |||
| to be registered as part of a domain name. | permitted in a label to be registered as part of a domain name. | |||
| 3.2. Registration Policy | 3.2. Registration Policy | |||
| While these recommendations cannot and should not define registry | While these recommendations cannot and should not define registry | |||
| policies, registries should develop and apply additional restrictions | policies, registries should develop and apply additional restrictions | |||
| as needed to reduce confusion and other problems. For example, it is | as needed to reduce confusion and other problems. For example, it is | |||
| generally believed that labels containing characters from more than | generally believed that labels containing characters from more than | |||
| one script are a bad practice although there may be some important | one script are a bad practice although there may be some important | |||
| exceptions to that principle. Some registries may choose to restrict | exceptions to that principle. Some registries may choose to restrict | |||
| registrations to characters drawn from a very small number of | registrations to characters drawn from a very small number of | |||
| scripts. For many scripts, the use of variant techniques such as | scripts. For many scripts, the use of variant techniques such as | |||
| those described in RFC 3843 [RFC3743] and RFC 4290 [RFC4290], and | those described in RFC 3743 [RFC3743] and RFC 4290 [RFC4290], and | |||
| illustrated for Chinese by the tables described in RFC 4713 [RFC4713] | illustrated for Chinese by the tables described in RFC 4713 [RFC4713] | |||
| may be helpful in reducing problems that might be perceived by users. | may be helpful in reducing problems that might be perceived by users. | |||
| In general, users will benefit if registries only permit characters | In general, users will benefit if registries only permit characters | |||
| from scripts that are well-understood by the registry or its | from scripts that are well-understood by the registry or its | |||
| advisers. If a registry decides to reduce opportunities for | advisers. If a registry decides to reduce opportunities for | |||
| confusion by constructing policies that disallow characters used in | confusion by constructing policies that disallow characters used in | |||
| historic writing systems or characters whose use is restricted to | historic writing systems or characters whose use is restricted to | |||
| specialized, highly technical contexts, some relevant information may | specialized, highly technical contexts, some relevant information may | |||
| be found in Section 2.4 "Specific Character Adjustments", Table 4 | be found in Section 2.4 "Specific Character Adjustments", Table 4 | |||
| "Candidate Characters for Exclusion from Identifiers" of | "Candidate Characters for Exclusion from Identifiers" of | |||
| [Unicode-UAX31] and Section 3.1. "General Security Profile for | [Unicode-UAX31] and Section 3.1. "General Security Profile for | |||
| Identifiers" in [Unicode-Security]. | Identifiers" in [Unicode-Security]. | |||
| The requirement (in Section 4.1 of [IDNA2008-Protocol]) that | The requirement (in Section 4.1 of [IDNA2008-Protocol]) that | |||
| registration procedures use only U-labels and/or A-labels is intended | registration procedures use only U-labels and/or A-labels is intended | |||
| to ensure that registrants are fully aware of exactly what is being | to ensure that registrants are fully aware of exactly what is being | |||
| registered as well as encouraging use of those canonical forms. That | registered as well as encouraging use of those canonical forms. That | |||
| provision should not be interpreted as requiring that registrant need | provision should not be interpreted as requiring that registrants | |||
| to provide characters in a particular code sequence. Registrant | need to provide characters in a particular code sequence. Registrant | |||
| input conventions and management are part of registrant-registrar | input conventions and management are part of registrant-registrar | |||
| interactions and relationships between registries and registrars and | interactions and relationships between registries and registrars and | |||
| are outside the scope of these standards. | are outside the scope of these standards. | |||
| It is worth stressing that these principles of policy development and | It is worth stressing that these principles of policy development and | |||
| application apply at all levels of the DNS, not only, e.g., TLD or | application apply at all levels of the DNS, not only, e.g., TLD or | |||
| SLD registrations. Even a trivial, "anything is permitted that is | SLD registrations. Even a trivial, "anything is permitted that is | |||
| valid under the protocol" policy is helpful in that it helps users | valid under the protocol" policy is helpful in that it helps users | |||
| and application developers know what to expect. | and application developers know what to expect. | |||
| skipping to change at page 16, line 47 ¶ | skipping to change at page 17, line 42 ¶ | |||
| A-labels or U-labels, the application may reasonably have an option | A-labels or U-labels, the application may reasonably have an option | |||
| for the user to select the preferred method of display. Rendering | for the user to select the preferred method of display. Rendering | |||
| the U-label should normally be the default. | the U-label should normally be the default. | |||
| Domain names are often stored and transported in many places. For | Domain names are often stored and transported in many places. For | |||
| example, they are part of documents such as mail messages and web | example, they are part of documents such as mail messages and web | |||
| pages. They are transported in many parts of many protocols, such as | pages. They are transported in many parts of many protocols, such as | |||
| both the control commands of SMTP and associated message body parts, | both the control commands of SMTP and associated message body parts, | |||
| and in the headers and the body content in HTTP. It is important to | and in the headers and the body content in HTTP. It is important to | |||
| remember that domain names appear both in domain name slots and in | remember that domain names appear both in domain name slots and in | |||
| the content that is passed over protocols. | the content that is passed over protocols and it would be helpful if | |||
| protocols explicitly define what their domain name slots are. | ||||
| In protocols and document formats that define how to handle | In protocols and document formats that define how to handle | |||
| specification or negotiation of charsets, labels can be encoded in | specification or negotiation of charsets, labels can be encoded in | |||
| any charset allowed by the protocol or document format. If a | any charset allowed by the protocol or document format. If a | |||
| protocol or document format only allows one charset, the labels must | protocol or document format only allows one charset, the labels must | |||
| be given in that charset. Of course, not all charsets can properly | be given in that charset. Of course, not all charsets can properly | |||
| represent all labels. If a U-label cannot be displayed in its | represent all labels. If a U-label cannot be displayed in its | |||
| entirety, the only choice (without loss of information) may be to | entirety, the only choice (without loss of information) may be to | |||
| display the A-label. | display the A-label. | |||
| skipping to change at page 21, line 6 ¶ | skipping to change at page 21, line 50 ¶ | |||
| Unicode case folding operation maps Greek Final Form Sigma (U+03C2) | Unicode case folding operation maps Greek Final Form Sigma (U+03C2) | |||
| to the medial form (U+03C3) and maps Eszett (German Sharp S, U+00DF) | to the medial form (U+03C3) and maps Eszett (German Sharp S, U+00DF) | |||
| to "ss". Neither of these mappings is reversible because the upper | to "ss". Neither of these mappings is reversible because the upper | |||
| case of U+03C3 is the Upper Case Sigma (U+03A3) and "ss" is an ASCII | case of U+03C3 is the Upper Case Sigma (U+03A3) and "ss" is an ASCII | |||
| string. IDNA2008 permits, at the risk of some incompatibility, | string. IDNA2008 permits, at the risk of some incompatibility, | |||
| slightly more flexibility in this area by avoiding case folding and | slightly more flexibility in this area by avoiding case folding and | |||
| treating these characters as themselves. Approaches to handling one- | treating these characters as themselves. Approaches to handling one- | |||
| way mappings are discussed in Section 7.2. | way mappings are discussed in Section 7.2. | |||
| Because IDNA2003 maps Final Sigma and Eszett to other characters, and | Because IDNA2003 maps Final Sigma and Eszett to other characters, and | |||
| the reverse mapping is never possible, that in some sense means that | the reverse mapping is never possible, neither Final Sigma nor Eszett | |||
| neither Final Sigma nor Eszett can be represented in a IDNA2003 IDN. | can be represented in the ACE form of IDNA2003 IDN nor in the native | |||
| With IDNA2008, both characters can be used in an IDN and so the | character (U-label) form derived from it. With IDNA2008, both | |||
| A-label used for lookup for any U-label containing those characters | characters can be used in an IDN and so the A-label used for lookup | |||
| is now different. See Section 7.1 for a discussion of what kinds of | for any U-label containing those characters is now different. See | |||
| changes might require the IDNA prefix to change; after extended | Section 7.1 for a discussion of what kinds of changes might require | |||
| discussions, the WG came to consensus that the change for these | the IDNA prefix to change; after extended discussions, the WG came to | |||
| characters did not justify a prefix change. | consensus that the change for these characters did not justify a | |||
| prefix change. | ||||
| 4.5. Right to Left Text | 4.5. Right to Left Text | |||
| In order to be sure that the directionality of right to left text is | In order to be sure that the directionality of right to left text is | |||
| unambiguous, IDNA2003 required that any label in which right to left | unambiguous, IDNA2003 required that any label in which right to left | |||
| characters appear both starts and ends with them and that it not | characters appear both starts and ends with them and that it not | |||
| include any characters with strong left to right properties (that | include any characters with strong left to right properties (that | |||
| excludes other alphabetic characters but permits European digits). | excludes other alphabetic characters but permits European digits). | |||
| Any other string that contains a right to left character and does not | Any other string that contains a right to left character and does not | |||
| meet those requirements is rejected. This is one of the few places | meet those requirements is rejected. This is one of the few places | |||
| skipping to change at page 25, line 8 ¶ | skipping to change at page 25, line 49 ¶ | |||
| scripts, and other character collections as they are incorporated | scripts, and other character collections as they are incorporated | |||
| into Unicode, doing so without disruption and, in the long term, | into Unicode, doing so without disruption and, in the long term, | |||
| without "heavy" processes (an IETF consensus process is required | without "heavy" processes (an IETF consensus process is required | |||
| by the IDNA2008 specifications and is expected to be required and | by the IDNA2008 specifications and is expected to be required and | |||
| used until significant experience accumulates with IDNA operations | used until significant experience accumulates with IDNA operations | |||
| and new versions of Unicode). | and new versions of Unicode). | |||
| 7.1.1. Summary and Discussion of IDNA Validity Criteria | 7.1.1. Summary and Discussion of IDNA Validity Criteria | |||
| The general criteria for a label to be considered valid under IDNA | The general criteria for a label to be considered valid under IDNA | |||
| are (the actual rules are rigorously defined in the "Protocol" and | are (the actual rules are rigorously defined in [IDNA2008-Protocol] | |||
| "Tables" documents): | and [IDNA2008-Tables]): | |||
| o The characters are "letters", marks needed to form letters, | o The characters are "letters", marks needed to form letters, | |||
| numerals, or other code points used to write words in some | numerals, or other code points used to write words in some | |||
| language. Symbols, drawing characters, and various notational | language. Symbols, drawing characters, and various notational | |||
| characters are intended to be permanently excluded. There is no | characters are intended to be permanently excluded. There is no | |||
| evidence that they are important enough to Internet operations or | evidence that they are important enough to Internet operations or | |||
| internationalization to justify expansion of domain names beyond | internationalization to justify expansion of domain names beyond | |||
| the general principle of "letters, digits, and hyphen". | the general principle of "letters, digits, and hyphen". | |||
| (Additional discussion and rationale for the symbol decision | (Additional discussion and rationale for the symbol decision | |||
| appears in Section 7.6). | appears in Section 7.6). | |||
| skipping to change at page 29, line 37 ¶ | skipping to change at page 30, line 32 ¶ | |||
| mappings no longer exist as requirements in IDNA2008. These | mappings no longer exist as requirements in IDNA2008. These | |||
| specifications strongly prefer that only A-labels or U-labels be used | specifications strongly prefer that only A-labels or U-labels be used | |||
| in protocol contexts and as much as practical more generally. | in protocol contexts and as much as practical more generally. | |||
| IDNA2008 does anticipate situations in which some mapping at the time | IDNA2008 does anticipate situations in which some mapping at the time | |||
| of user input into lookup applications is appropriate and desirable. | of user input into lookup applications is appropriate and desirable. | |||
| The issues are discussed in Section 6 and specific recommendations | The issues are discussed in Section 6 and specific recommendations | |||
| are made in [IDNA2008-Mapping]. | are made in [IDNA2008-Mapping]. | |||
| 7.4. The Question of Prefix Changes | 7.4. The Question of Prefix Changes | |||
| The conditions that would require a change in the IDNA ACE prefix | The conditions that would have required a change in the IDNA ACE | |||
| ("xn--" for the version of IDNA specified in [RFC3490]) have been a | prefix ("xn--" for the version of IDNA specified in [RFC3490]) were | |||
| great concern to the community. A prefix change would clearly be | of great concern to the community. A prefix change would have | |||
| necessary if the algorithms were modified in a manner that would | clearly been necessary if the algorithms were modified in a manner | |||
| create serious ambiguities during subsequent transition in | that would have created serious ambiguities during subsequent | |||
| registrations. This section summarizes our conclusions about the | transition in registrations. This section summarizes the working | |||
| conditions under which changes in prefix would be necessary and the | group's conclusions about the conditions under which a change in the | |||
| implications of such a change. | prefix would have been necessary and the implications of such a | |||
| change. | ||||
| 7.4.1. Conditions Requiring a Prefix Change | 7.4.1. Conditions Requiring a Prefix Change | |||
| An IDN prefix change is needed if a given string would be looked up | An IDN prefix change would have been needed if a given string would | |||
| or otherwise interpreted differently depending on the version of the | be looked up or otherwise interpreted differently depending on the | |||
| protocol or tables being used. An IDNA upgrade would require a | version of the protocol or tables being used. This IDNA upgrade | |||
| prefix change if, and only if, one of the following four conditions | would have required a prefix change if, and only if, one of the | |||
| were met: | following four conditions were met: | |||
| 1. The conversion of an A-label to Unicode (i.e., a U-label) yields | 1. The conversion of an A-label to Unicode (i.e., a U-label) would | |||
| one string under IDNA2003 (RFC3490) and a different string under | have yielded one string under IDNA2003 (RFC3490) and a different | |||
| IDNA2008. | string under IDNA2008. | |||
| 2. In a significant number of cases, an input string that is valid | 2. In a significant number of cases, an input string that was valid | |||
| under IDNA2003 and also valid under IDNA2008 yields two different | under IDNA2003 and also valid under IDNA2008 would have yielded | |||
| A-labels with the different versions. This condition is believed | two different A-labels with the different versions. This | |||
| to be essentially equivalent to the one above except for a very | condition is believed to be essentially equivalent to the one | |||
| small number of edge cases which may not justify a prefix change | above except for a very small number of edge cases that were not | |||
| (See Section 7.2). | found to justify a prefix change (See Section 7.2). | |||
| Note that if the input string is valid under one version and not | Note that if the input string was valid under one version and not | |||
| valid under the other, this condition does not apply. See the | valid under the other, this condition would not apply. See the | |||
| first item in Section 7.4.2, below. | first item in Section 7.4.2, below. | |||
| 3. A fundamental change is made to the semantics of the string that | 3. A fundamental change was made to the semantics of the string that | |||
| is inserted in the DNS, e.g., if a decision were made to try to | would be inserted in the DNS, e.g., if a decision were made to | |||
| include language or script information in the encoding in | try to include language or script information in the encoding in | |||
| addition to the string itself. | addition to the string itself. | |||
| 4. A sufficiently large number of characters is added to Unicode so | 4. A sufficiently large number of characters were added to Unicode | |||
| that the Punycode mechanism for block offsets can no longer | so that the Punycode mechanism for block offsets would no longer | |||
| reference the higher-numbered planes and blocks. This condition | reference the higher-numbered planes and blocks. This condition | |||
| is unlikely even in the long term and certain not to arise in the | is unlikely even in the long term and certain not to arise in the | |||
| next several years. | next several years. | |||
| 7.4.2. Conditions Not Requiring a Prefix Change | 7.4.2. Conditions Not Requiring a Prefix Change | |||
| As a result of the principles described above, none of the following | As a result of the principles described above, none of the following | |||
| changes require a new prefix: | changes required a new prefix: | |||
| 1. Prohibition of some characters as input to IDNA. This may make | 1. Prohibition of some characters as input to IDNA. Such a | |||
| names that are now registered inaccessible, but does not change | prohibition might make names that were previously registered | |||
| those names. | inaccessible, but did not change those names. | |||
| 2. Adjustments in IDNA tables or actions, including normalization | 2. Adjustments in IDNA tables or actions, including normalization | |||
| definitions, that affect characters that were already invalid | definitions, that affected characters that were already invalid | |||
| under IDNA2003. | under IDNA2003. | |||
| 3. Changes in the style of the IDNA definition that does not alter | 3. Changes in the style of the IDNA definition that did not alter | |||
| the actions performed by IDNA. | the actions performed by IDNA. | |||
| 7.4.3. Implications of Prefix Changes | 7.4.3. Implications of Prefix Changes | |||
| While it might be possible to make a prefix change, the costs of such | While it might have been possible to make a prefix change, the costs | |||
| a change are considerable. Registries could not convert all IDNA2003 | of such a change are considerable. Registries could not have | |||
| ("xn--") registrations to a new form at the same time and synchronize | converted all IDNA2003 ("xn--") registrations to a new form at the | |||
| that change with applications supporting lookup. Unless all existing | same time and synchronized that change with applications supporting | |||
| registrations were simply to be declared invalid (and perhaps even | lookup. Unless all existing registrations were simply to be declared | |||
| then) systems that needed to support both labels with old prefixes | invalid (and perhaps even then) systems that needed to support both | |||
| and labels with new ones would first process a putative label under | labels with old prefixes and labels with new ones would be required | |||
| the IDNA2008 rules and try to look it up and then, if it were not | to first process a putative label under the IDNA2008 rules and try to | |||
| found, would process the label under IDNA2003 rules and look it up | look it up and then, if it were not found, would be required to | |||
| again. That process could significantly slow down all processing | process the label under IDNA2003 rules and look it up again. That | |||
| process would probably have significantly slowed down all processing | ||||
| that involved IDNs in the DNS especially since a fully-qualified name | that involved IDNs in the DNS especially since a fully-qualified name | |||
| might contain a mixture of labels that were registered with the old | might contain a mixture of labels that were registered with the old | |||
| and new prefixes. That would make DNS caching very difficult. In | and new prefixes. That would have made DNS caching very difficult. | |||
| addition, looking up the same input string as two separate A-labels | In addition, looking up the same input string as two separate | |||
| creates some potential for confusion and attacks, since the labels | A-labels would have created some potential for confusion and attacks, | |||
| could map to different targets and then resolve to different entries | since the labels could map to different targets and then resolve to | |||
| in the DNS. | different entries in the DNS. | |||
| Consequently, a prefix change is to be avoided if at all possible, | Consequently, a prefix change should have been, and was, avoided if | |||
| even if it means accepting some IDNA2003 decisions about character | at all possible, even if it meant accepting some IDNA2003 decisions | |||
| distinctions as irreversible and/or giving special treatment to edge | about character distinctions as irreversible and/or giving special | |||
| cases. | treatment to edge cases. | |||
| 7.5. Stringprep Changes and Compatibility | 7.5. Stringprep Changes and Compatibility | |||
| The Nameprep [RFC3491] specification, a key part of IDNA2003, is a | The Nameprep [RFC3491] specification, a key part of IDNA2003, is a | |||
| profile of Stringprep [RFC3454]. While Nameprep is a Stringprep | profile of Stringprep [RFC3454]. While Nameprep is a Stringprep | |||
| profile specific to IDNA, Stringprep is used by a number of other | profile specific to IDNA, Stringprep is used by a number of other | |||
| protocols. Were Stringprep to be modified by IDNA2008, those changes | protocols. Were Stringprep to have been modified by IDNA2008, those | |||
| to improve the handling of IDNs could cause problems for non-DNS | changes to improve the handling of IDNs could cause problems for non- | |||
| uses, most notably if they affected identification and authentication | DNS uses, most notably if they affected identification and | |||
| protocols. Several elements of IDNA2008 give interpretations to | authentication protocols. Several elements of IDNA2008 give | |||
| strings prohibited under IDNA2003 or prohibit strings that IDNA2003 | interpretations to strings prohibited under IDNA2003 or prohibit | |||
| permitted. Those elements include the proposed new inclusion tables | strings that IDNA2003 permitted. Those elements include the proposed | |||
| [IDNA2008-Tables], the reduction in the number of characters | new inclusion tables [IDNA2008-Tables], the reduction in the number | |||
| permitted as input for registration or lookup (Section 3), and even | of characters permitted as input for registration or lookup | |||
| the proposed changes in handling of right to left strings | (Section 3), and even the proposed changes in handling of right to | |||
| [IDNA2008-Bidi]. IDNA2008 does not use Nameprep or Stringprep at | left strings [IDNA2008-Bidi]. IDNA2008 does not use Nameprep or | |||
| all, so there are no side-effect changes to other protocols. | Stringprep at all, so there are no side-effect changes to other | |||
| protocols. | ||||
| It is particularly important to keep IDNA processing separate from | It is particularly important to keep IDNA processing separate from | |||
| processing for various security protocols because some of the | processing for various security protocols because some of the | |||
| constraints that are necessary for smooth and comprehensible use of | constraints that are necessary for smooth and comprehensible use of | |||
| IDNs may be unwanted or undesirable in other contexts. For example, | IDNs may be unwanted or undesirable in other contexts. For example, | |||
| the criteria for good passwords or passphrases are very different | the criteria for good passwords or passphrases are very different | |||
| from those for desirable IDNs: passwords should be hard to guess, | from those for desirable IDNs: passwords should be hard to guess, | |||
| while domain names should normally be easily memorable. Similarly, | while domain names should normally be easily memorable. Similarly, | |||
| internationalized SCSI identifiers and other protocol components are | internationalized SCSI identifiers and other protocol components are | |||
| likely to have different requirements than IDNs. | likely to have different requirements than IDNs. | |||
| skipping to change at page 32, line 32 ¶ | skipping to change at page 33, line 27 ¶ | |||
| than an ASCII base. | than an ASCII base. | |||
| 2. Symbol names are more problematic than letters because there may | 2. Symbol names are more problematic than letters because there may | |||
| be no general agreement on whether a particular glyph matches a | be no general agreement on whether a particular glyph matches a | |||
| symbol; there are no uniform conventions for naming; variations | symbol; there are no uniform conventions for naming; variations | |||
| such as outline, solid, and shaded forms may or may not exist; | such as outline, solid, and shaded forms may or may not exist; | |||
| and so on. As just one example, consider a "heart" symbol as it | and so on. As just one example, consider a "heart" symbol as it | |||
| might appear in a logo that might be read as "I love...". While | might appear in a logo that might be read as "I love...". While | |||
| the user might read such a logo as "I love..." or "I heart...", | the user might read such a logo as "I love..." or "I heart...", | |||
| considerable knowledge of the coding distinctions made in Unicode | considerable knowledge of the coding distinctions made in Unicode | |||
| is needed to know that there more than one "heart" character | is needed to know that there is more than one "heart" character | |||
| (e.g., U+2665, U+2661, and U+2765) and how to describe it. These | (e.g., U+2665, U+2661, and U+2765) and how to describe it. These | |||
| issues are of particular importance if strings are expected to be | issues are of particular importance if strings are expected to be | |||
| understood or transcribed by the listener after being read out | understood or transcribed by the listener after being read out | |||
| loud. | loud. | |||
| 3. Design of a screen reader used by blind Internet users who must | 3. Design of a screen reader used by blind Internet users who must | |||
| listen to renderings of IDN domain names and possibly reproduce | listen to renderings of IDN domain names and possibly reproduce | |||
| them on the keyboard becomes considerably more complicated when | them on the keyboard becomes considerably more complicated when | |||
| the names of characters are not obvious and intuitive to anyone | the names of characters are not obvious and intuitive to anyone | |||
| familiar with the language in question. | familiar with the language in question. | |||
| skipping to change at page 33, line 9 ¶ | skipping to change at page 34, line 5 ¶ | |||
| would-be registrant has no way to know -- absent careful study of | would-be registrant has no way to know -- absent careful study of | |||
| the code tables -- whether it is ambiguous (e.g., where there are | the code tables -- whether it is ambiguous (e.g., where there are | |||
| multiple "heart" characters) or not. Conversely, the user seeing | multiple "heart" characters) or not. Conversely, the user seeing | |||
| the hypothetical label doesn't know whether to read it -- try to | the hypothetical label doesn't know whether to read it -- try to | |||
| transmit it to a colleague by voice -- as "heart", as "love", as | transmit it to a colleague by voice -- as "heart", as "love", as | |||
| "black heart", or as any of the other examples below. | "black heart", or as any of the other examples below. | |||
| 5. The actual situation is even worse than this. There is no | 5. The actual situation is even worse than this. There is no | |||
| possible way for a normal, casual, user to tell the difference | possible way for a normal, casual, user to tell the difference | |||
| between the hearts of U+2665 and U+2765 and the stars of U+2606 | between the hearts of U+2665 and U+2765 and the stars of U+2606 | |||
| and U+2729 or the without somehow knowing to look for a | and U+2729 without somehow knowing to look for a distinction. We | |||
| distinction. We have a white heart (U+2661) and few black | have a white heart (U+2661) and few black hearts. Consequently, | |||
| hearts. Consequently, describing a label as containing a heart | describing a label as containing a heart is hopelessly ambiguous: | |||
| is hopelessly ambiguous: we can only know that it contains one of | we can only know that it contains one of several characters that | |||
| several characters that look like hearts or have "heart" in their | look like hearts or have "heart" in their names. In cities where | |||
| names. In cities where "Square" is a popular part of a location | "Square" is a popular part of a location name, one might well | |||
| name, one might well want to use a square symbol in a label as | want to use a square symbol in a label as well and there are far | |||
| well and there are far more squares of various flavors in Unicode | more squares of various flavors in Unicode than there are hearts | |||
| than there are hearts or stars. | or stars. | |||
| The consequence of these ambiguities is that symbols are a very poor | The consequence of these ambiguities is that symbols are a very poor | |||
| basis for reliable communication. Consistent with this conclusion, | basis for reliable communication. Consistent with this conclusion, | |||
| the Unicode standard recommends that strings used in identifiers not | the Unicode standard recommends that strings used in identifiers not | |||
| contain symbols or punctuation [Unicode-UAX31]. Of course, these | contain symbols or punctuation [Unicode-UAX31]. Of course, these | |||
| difficulties with symbols do not arise with actual pictographic | difficulties with symbols do not arise with actual pictographic | |||
| languages and scripts which would be treated like any other language | languages and scripts which would be treated like any other language | |||
| characters; the two should not be confused. | characters; the two should not be confused. | |||
| 7.7. Migration Between Unicode Versions: Unassigned Code Points | 7.7. Migration Between Unicode Versions: Unassigned Code Points | |||
| skipping to change at page 34, line 41 ¶ | skipping to change at page 35, line 37 ¶ | |||
| Unicode. The reality is that a script that is obscure to much of the | Unicode. The reality is that a script that is obscure to much of the | |||
| world may still be very important to those who use it. Cultural and | world may still be very important to those who use it. Cultural and | |||
| linguistic preservation principles make it inappropriate to declare | linguistic preservation principles make it inappropriate to declare | |||
| the script of no importance in IDNs. Second, we already have | the script of no importance in IDNs. Second, we already have | |||
| counterexamples in, e.g., the relationships associated with new Han | counterexamples in, e.g., the relationships associated with new Han | |||
| characters being added (whether in the BMP or in Unicode Plane 2). | characters being added (whether in the BMP or in Unicode Plane 2). | |||
| Independent of the technical transition issues identified above, it | Independent of the technical transition issues identified above, it | |||
| can be observed that any addition of characters to an existing script | can be observed that any addition of characters to an existing script | |||
| to make it easier to use or to better accommodate particular | to make it easier to use or to better accommodate particular | |||
| languages may lead to transition issues. Such changes may change the | languages may lead to transition issues. Such additions may change | |||
| preferred form for writing a particular string, changes that may be | the preferred form for writing a particular string, changes that may | |||
| reflected, e.g., in keyboard transition modules that would | be reflected, e.g., in keyboard transition modules that would | |||
| necessarily be different from those for earlier versions of Unicode | necessarily be different from those for earlier versions of Unicode | |||
| where the newer characters may not exist. This creates an inherent | where the newer characters may not exist. This creates an inherent | |||
| transition problem because attempts to access labels may use either | transition problem because attempts to access labels may use either | |||
| the old or the new conventions, requiring registry action whether the | the old or the new conventions, requiring registry action whether the | |||
| older conventions were used in labels or not. The need to consider | older conventions were used in labels or not. The need to consider | |||
| transition mechanisms is inherent to evolution of Unicode to better | transition mechanisms is inherent to evolution of Unicode to better | |||
| accommodate writing systems and is independent of how IDNs are | accommodate writing systems and is independent of how IDNs are | |||
| represented in the DNS or how transitions among versions of those | represented in the DNS or how transitions among versions of those | |||
| mechanisms occur. The requirement for transitions of this type is | mechanisms occur. The requirement for transitions of this type is | |||
| illustrated by the addition of Malayalam Chillu in Unicode 5.1.0. | illustrated by the addition of Malayalam Chillu in Unicode 5.1.0. | |||
| skipping to change at page 35, line 42 ¶ | skipping to change at page 36, line 40 ¶ | |||
| All existing channels through which names can enter a DNS server | All existing channels through which names can enter a DNS server | |||
| database (for example, master files (as described in RFC 1034) and | database (for example, master files (as described in RFC 1034) and | |||
| DNS update messages [RFC2136]) are IDN-unaware because they predate | DNS update messages [RFC2136]) are IDN-unaware because they predate | |||
| IDNA. Other sections of this document provide the needed shielding | IDNA. Other sections of this document provide the needed shielding | |||
| by ensuring that internationalized domain names entering DNS server | by ensuring that internationalized domain names entering DNS server | |||
| databases through such channels have already been converted to their | databases through such channels have already been converted to their | |||
| equivalent ASCII A-label forms. | equivalent ASCII A-label forms. | |||
| Because of the distinction made between the algorithms for | Because of the distinction made between the algorithms for | |||
| Registration and Lookup in [IDNA2008-Protocol] (a domain name | Registration and Lookup in [IDNA2008-Protocol] (a domain name | |||
| containing only ASCII codepoints can not be converted to an A-label), | containing only ASCII codepoints cannot be converted to an A-label), | |||
| there can not be more than one A-label form for any given U-label. | there cannot be more than one A-label form for any given U-label. | |||
| As specified in RFC 2181 [RFC2181], the DNS protocol explicitly | As specified in RFC 2181 [RFC2181], the DNS protocol explicitly | |||
| allows domain labels to contain octets beyond the ASCII range | allows domain labels to contain octets beyond the ASCII range | |||
| (0000..007F), and this document does not change that. However, | (0000..007F), and this document does not change that. However, | |||
| although the interpretation of octets 0080..00FF is well-defined in | although the interpretation of octets 0080..00FF is well-defined in | |||
| the DNS, many application protocols support only ASCII labels and | the DNS, many application protocols support only ASCII labels and | |||
| there is no defined interpretation of these non-ASCII octets as | there is no defined interpretation of these non-ASCII octets as | |||
| characters and, in particular, no interpretation of case-independent | characters and, in particular, no interpretation of case-independent | |||
| matching for them (see, e.g., [RFC4343]). If labels containing these | matching for them (see, e.g., [RFC4343]). If labels containing these | |||
| octets are returned to applications, unpredictable behavior could | octets are returned to applications, unpredictable behavior could | |||
| result. The A-label form, which cannot contain those characters, is | result. The A-label form, which cannot contain those characters, is | |||
| the only standard representation for internationalized labels in the | the only standard representation for internationalized labels in the | |||
| DNS protocol. | DNS protocol. | |||
| 8.2. DNSSEC Authentication of IDN Domain Names | 8.2. Root and other DNS Server Considerations | |||
| DNS Security (DNSSEC) [RFC2535] is a method for supplying | ||||
| cryptographic verification information along with DNS messages. | ||||
| Public Key Cryptography is used in conjunction with digital | ||||
| signatures to provide a means for a requester of domain information | ||||
| to authenticate the source of the data. This ensures that it can be | ||||
| traced back to a trusted source, either directly or via a chain of | ||||
| trust linking the source of the information to the top of the DNS | ||||
| hierarchy. | ||||
| IDNA specifies that all internationalized domain names served by DNS | ||||
| servers that cannot be represented directly in ASCII MUST use the | ||||
| A-label form. Conversion to A-labels MUST be performed prior to a | ||||
| zone being signed by the private key for that zone. Because of this | ||||
| ordering, it is important to recognize that DNSSEC authenticates a | ||||
| domain name containing A-labels or conventional LDH-labels, not | ||||
| U-labels. In the presence of DNSSEC, no form of a zone file or query | ||||
| response that contains a U-label may be signed or the signature | ||||
| validated. | ||||
| One consequence of this for sites deploying IDNA in the presence of | ||||
| DNSSEC is that any special purpose proxies or forwarders used to | ||||
| transform user input into IDNs must be earlier in the lookup flow | ||||
| than DNSSEC authenticating nameservers for DNSSEC to work. | ||||
| 8.3. Root and other DNS Server Considerations | ||||
| IDNs in A-label form will generally be somewhat longer than current | IDNs in A-label form will generally be somewhat longer than current | |||
| domain names, so the bandwidth needed by the root servers is likely | domain names, so the bandwidth needed by the root servers is likely | |||
| to go up by a small amount. Also, queries and responses for IDNs | to go up by a small amount. Also, queries and responses for IDNs | |||
| will probably be somewhat longer than typical queries historically, | will probably be somewhat longer than typical queries historically, | |||
| so EDNS0 [RFC2671] support may be more important (otherwise, queries | so EDNS0 [RFC2671] support may be more important (otherwise, queries | |||
| and responses may be forced to go to TCP instead of UDP). | and responses may be forced to go to TCP instead of UDP). | |||
| 9. Internationalization Considerations | 9. Internationalization Considerations | |||
| skipping to change at page 38, line 35 ¶ | skipping to change at page 39, line 8 ¶ | |||
| The editor and contributors would like to express their thanks to | The editor and contributors would like to express their thanks to | |||
| those who contributed significant early (pre-WG) review comments, | those who contributed significant early (pre-WG) review comments, | |||
| sometimes accompanied by text, Paul Hoffman, Simon Josefsson, and Sam | sometimes accompanied by text, Paul Hoffman, Simon Josefsson, and Sam | |||
| Weiler. In addition, some specific ideas were incorporated from | Weiler. In addition, some specific ideas were incorporated from | |||
| suggestions, text, or comments about sections that were unclear | suggestions, text, or comments about sections that were unclear | |||
| supplied by Vint Cerf, Frank Ellerman, Michael Everson, Asmus | supplied by Vint Cerf, Frank Ellerman, Michael Everson, Asmus | |||
| Freytag, Erik van der Poel, Michel Suignard, and Ken Whistler. | Freytag, Erik van der Poel, Michel Suignard, and Ken Whistler. | |||
| Thanks are also due to Vint Cerf, Lisa Dusseault, Debbie Garside, and | Thanks are also due to Vint Cerf, Lisa Dusseault, Debbie Garside, and | |||
| Jefsey Morfin for conversations that led to considerable improvements | Jefsey Morfin for conversations that led to considerable improvements | |||
| in the content of this document. | in the content of this document and to several others, including Ben | |||
| Campbell, Martin Duerst, Subramanian Moonesamy, Peter Saint-Andre, | ||||
| and Dan Winship, for catching specific errors and recommending | ||||
| corrections. | ||||
| A meeting was held on 30 January 2008 to attempt to reconcile | A meeting was held on 30 January 2008 to attempt to reconcile | |||
| differences in perspective and terminology about this set of | differences in perspective and terminology about this set of | |||
| specifications between the design team and members of the Unicode | specifications between the design team and members of the Unicode | |||
| Technical Consortium. The discussions at and subsequent to that | Technical Consortium. The discussions at and subsequent to that | |||
| meeting were very helpful in focusing the issues and in refining the | meeting were very helpful in focusing the issues and in refining the | |||
| specifications. The active participants at that meeting were (in | specifications. The active participants at that meeting were (in | |||
| alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam, | alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam, | |||
| Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary | Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary | |||
| Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel, | Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel, | |||
| skipping to change at page 41, line 37 ¶ | skipping to change at page 42, line 11 ¶ | |||
| [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, | [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound, | |||
| "Dynamic Updates in the Domain Name System (DNS UPDATE)", | "Dynamic Updates in the Domain Name System (DNS UPDATE)", | |||
| RFC 2136, April 1997. | RFC 2136, April 1997. | |||
| [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS | [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS | |||
| Specification", RFC 2181, July 1997. | Specification", RFC 2181, July 1997. | |||
| [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and | [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and | |||
| Languages", BCP 18, RFC 2277, January 1998. | Languages", BCP 18, RFC 2277, January 1998. | |||
| [RFC2535] Eastlake, D., "Domain Name System Security Extensions", | ||||
| RFC 2535, March 1999. | ||||
| [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", | [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", | |||
| RFC 2671, August 1999. | RFC 2671, August 1999. | |||
| [RFC2673] Crawford, M., "Binary Labels in the Domain Name System", | [RFC2673] Crawford, M., "Binary Labels in the Domain Name System", | |||
| RFC 2673, August 1999. | RFC 2673, August 1999. | |||
| [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for | [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for | |||
| specifying the location of services (DNS SRV)", RFC 2782, | specifying the location of services (DNS SRV)", RFC 2782, | |||
| February 2000. | February 2000. | |||
| skipping to change at page 47, line 41 ¶ | skipping to change at page 48, line 17 ¶ | |||
| o Incorporated other changes from WG Last Call. | o Incorporated other changes from WG Last Call. | |||
| o Small typographical and editorial corrections. | o Small typographical and editorial corrections. | |||
| A.13. Version -13 | A.13. Version -13 | |||
| o Substituted in Section numbers to references to other IDNA2008 | o Substituted in Section numbers to references to other IDNA2008 | |||
| documents. | documents. | |||
| A.14. Version -14 | ||||
| A.15. Version -14 | ||||
| This is the version of the document produced to reflect comments on | ||||
| IETF Last Call. For the convenience of those who made comments and | ||||
| of the IESG in evaluating them, this section therefore identifies | ||||
| non-editorial changes made in response to Last Call comments in | ||||
| somewhat more detail than may be usual. | ||||
| o Removed the discussion of DNSSEC after extensive discussion on the | ||||
| IETF and IDNABIS lists. | ||||
| o Modified the discussion of prefix changes to make it clear that | ||||
| the decisions have been made, rather than still representing open | ||||
| issues. (Dan Winship review, 20091013) | ||||
| o Suggested explicit identification of domain name slots in | ||||
| protocols that use IDNA. Peter Saint-Andre, 20091019. | ||||
| o Several other clarifications as suggested by Peter Saint-Andre, | ||||
| 20091019. | ||||
| o Several minor editorial corrections per suggestions in Ben | ||||
| Campbell's Gen-ART review 20091013. | ||||
| o Typo corrections. | ||||
| Author's Address | Author's Address | |||
| John C Klensin | John C Klensin | |||
| 1770 Massachusetts Ave, Ste 322 | 1770 Massachusetts Ave, Ste 322 | |||
| Cambridge, MA 02140 | Cambridge, MA 02140 | |||
| USA | USA | |||
| Phone: +1 617 245 1457 | Phone: +1 617 245 1457 | |||
| Email: john+ietf@jck.com | Email: john+ietf@jck.com | |||
| End of changes. 50 change blocks. | ||||
| 222 lines changed or deleted | 226 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/
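
The point made in item 5 of the symbols discussion (that a casual user cannot tell the hearts of U+2665 and U+2765, or the stars of U+2606 and U+2729, apart) can be checked against the Unicode character database. The following is a minimal sketch, not part of the draft: it uses Python's standard `unicodedata` module, and the helper name `describe` and the list `SYMBOLS` are illustrative assumptions. It looks up the formal name and general category of each code point the text mentions.

```python
import unicodedata

# Code points named in the draft's discussion of hearts and stars.
SYMBOLS = [0x2665, 0x2765, 0x2661, 0x2606, 0x2729]


def describe(cp):
    """Return (U+XXXX, formal Unicode name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


rows = [describe(cp) for cp in SYMBOLS]
for code, name, cat in rows:
    print(code, cat, name)

# Group by the word a person would likely say aloud when describing the label.
hearts = [code for code, name, _ in rows if "HEART" in name]
stars = [code for code, name, _ in rows if "STAR" in name]
print("described as 'heart':", hearts)
print("described as 'star':", stars)
```

Several distinct code points fall under the same spoken description, and all of them carry the general category `So` (Symbol, other), which is the kind of character the draft argues is a poor basis for identifiers.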