| draft-ietf-idnabis-rationale-16.txt | draft-ietf-idnabis-rationale-17.txt | |||
|---|---|---|---|---|
| Network Working Group J. Klensin | Network Working Group J. Klensin | |||
| Internet-Draft January 7, 2010 | Internet-Draft January 11, 2010 | |||
| Intended status: Informational | Intended status: Informational | |||
| Expires: July 11, 2010 | Expires: July 15, 2010 | |||
| Internationalized Domain Names for Applications (IDNA): Background, | Internationalized Domain Names for Applications (IDNA): Background, | |||
| Explanation, and Rationale | Explanation, and Rationale | |||
| draft-ietf-idnabis-rationale-16.txt | draft-ietf-idnabis-rationale-17.txt | |||
| Abstract | Abstract | |||
| Several years have passed since the original protocol for | Several years have passed since the original protocol for | |||
| Internationalized Domain Names (IDNs) was completed and deployed. | Internationalized Domain Names (IDNs) was completed and deployed. | |||
| During that time, a number of issues have arisen, including the need | During that time, a number of issues have arisen, including the need | |||
| to update the system to deal with newer versions of Unicode. Some of | to update the system to deal with newer versions of Unicode. Some of | |||
| these issues require tuning of the existing protocols and the tables | these issues require tuning of the existing protocols and the tables | |||
| on which they depend. This document provides an overview of a | on which they depend. This document provides an overview of a | |||
| revised system and provides explanatory material for its components. | revised system and provides explanatory material for its components. | |||
| skipping to change at page 1, line 43 | skipping to change at page 1, line 43 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on July 11, 2010. | This Internet-Draft will expire on July 15, 2010. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 3, line 28 | skipping to change at page 3, line 28 | |||
| 3.1. A Tiered Model of Permitted Characters and Labels . . . . 10 | 3.1. A Tiered Model of Permitted Characters and Labels . . . . 10 | |||
| 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 10 | 3.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 10 | |||
| 3.1.2. CONTEXTUAL RULE REQUIRED . . . . . . . . . . . . . . . 11 | 3.1.2. CONTEXTUAL RULE REQUIRED . . . . . . . . . . . . . . . 11 | |||
| 3.1.2.1. Contextual Restrictions . . . . . . . . . . . . . 11 | 3.1.2.1. Contextual Restrictions . . . . . . . . . . . . . 11 | |||
| 3.1.2.2. Rules and Their Application . . . . . . . . . . . 12 | 3.1.2.2. Rules and Their Application . . . . . . . . . . . 12 | |||
| 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 12 | 3.1.3. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 13 | 3.1.4. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 13 | 3.2. Registration Policy . . . . . . . . . . . . . . . . . . . 13 | |||
| 3.3. Layered Restrictions: Tables, Context, Registration, | 3.3. Layered Restrictions: Tables, Context, Registration, | |||
| Applications . . . . . . . . . . . . . . . . . . . . . . . 14 | Applications . . . . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4. Issues that Constrain Possible Solutions . . . . . . . . . . . 15 | 4. Application-Related Issues . . . . . . . . . . . . . . . . . . 15 | |||
| 4.1. Display and Network Order . . . . . . . . . . . . . . . . 15 | 4.1. Display and Network Order . . . . . . . . . . . . . . . . 15 | |||
| 4.2. Entry and Display in Applications . . . . . . . . . . . . 16 | 4.2. Entry and Display in Applications . . . . . . . . . . . . 16 | |||
| 4.3. Linguistic Expectations: Ligatures, Digraphs, and | 4.3. Linguistic Expectations: Ligatures, Digraphs, and | |||
| Alternate Character Forms . . . . . . . . . . . . . . . . 18 | Alternate Character Forms . . . . . . . . . . . . . . . . 18 | |||
| 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 | 4.4. Case Mapping and Related Issues . . . . . . . . . . . . . 20 | |||
| 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 | 4.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 21 | |||
| 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21 | 5. IDNs and the Robustness Principle . . . . . . . . . . . . . . 21 | |||
| 6. Front-end and User Interface Processing for Lookup . . . . . . 22 | 6. Front-end and User Interface Processing for Lookup . . . . . . 22 | |||
| 7. Migration from IDNA2003 and Unicode Version Synchronization . 24 | 7. Migration from IDNA2003 and Unicode Version Synchronization . 24 | |||
| 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 24 | 7.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 24 | |||
| skipping to change at page 4, line 41 | skipping to change at page 4, line 41 | |||
| A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.7. Version -07 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.8. Version -08 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 46 | A.9. Version -09 . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.10. Version -10 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 47 | A.11. Version -11 . . . . . . . . . . . . . . . . . . . . . . . 47 | |||
| A.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 48 | A.12. Version -12 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| A.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 48 | A.13. Version -13 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| A.14. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 48 | A.14. Version -14 . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| A.15. Version -15 . . . . . . . . . . . . . . . . . . . . . . . 49 | A.15. Version -15 . . . . . . . . . . . . . . . . . . . . . . . 49 | |||
| A.16. Version -16 . . . . . . . . . . . . . . . . . . . . . . . 49 | A.16. Version -16 . . . . . . . . . . . . . . . . . . . . . . . 49 | |||
| | A.17. Version -17 . . . . . . . . . . . . . . . . . . . . . . . 49 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 49 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 49 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Context and Overview | 1.1. Context and Overview | |||
| Internationalized Domain Names in Applications (IDNA) is a collection | Internationalized Domain Names in Applications (IDNA) is a collection | |||
| of standards that allow client applications to convert some Unicode | of standards that allow client applications to convert some mnemonic | |||
| mnemonics to an ASCII-compatible encoding form ("ACE") which is a | strings expressed in Unicode to an ASCII-compatible encoding form | |||
| valid DNS label containing only letters, digits, and hyphens. The | ("ACE") which is a valid DNS label containing only letters, digits, | |||
| specific form of ACE label used by IDNA is called an "A-label". A | and hyphens. The specific form of ACE label used by IDNA is called | |||
| client can look up an exact A-label in the existing DNS, so A-labels | an "A-label". A client can look up an exact A-label in the existing | |||
| do not require any extensions to DNS, upgrades of DNS servers or | DNS, so A-labels do not require any extensions to DNS, upgrades of | |||
| updates to low-level client libraries. An A-label is recognizable | DNS servers or updates to low-level client libraries. An A-label is | |||
| from the prefix "xn--" before the characters produced by the Punycode | recognizable from the prefix "xn--" before the characters produced by | |||
| algorithm [RFC3492], thus a user application can identify an A-label | the Punycode algorithm [RFC3492], thus a user application can | |||
| and convert it into Unicode (or some local coded character set) for | identify an A-label and convert it into Unicode (or some local coded | |||
| display. | character set) for display. | |||
| On the registry side, IDNA allows a registry to offer | On the registry side, IDNA allows a registry to offer | |||
| Internationalized Domain Names (IDNs) for registration as A-labels. | Internationalized Domain Names (IDNs) for registration as A-labels. | |||
| A registry may offer any subset of valid IDNs, and may apply any | A registry may offer any subset of valid IDNs, and may apply any | |||
| restrictions or bundling (grouping of similar labels together in one | restrictions or bundling (grouping of similar labels together in one | |||
| registration) appropriate for the context of that registry. | registration) appropriate for the context of that registry. | |||
| Registration of labels is sometimes discussed separately from lookup, | Registration of labels is sometimes discussed separately from lookup, | |||
| and is subject to a few specific requirements that do not apply to | and is subject to a few specific requirements that do not apply to | |||
| lookup. | lookup. | |||
| skipping to change at page 7, line 50 | skipping to change at page 7, line 50 | |||
| As with other documents in the IDNA2008 set, this document uses the | As with other documents in the IDNA2008 set, this document uses the | |||
| term "registry" to describe any zone in the DNS. That term, and the | term "registry" to describe any zone in the DNS. That term, and the | |||
| terms "zone" or "zone administration", are interchangeable. | terms "zone" or "zone administration", are interchangeable. | |||
| 1.4. Objectives | 1.4. Objectives | |||
| These are the main objectives in revising IDNA. | These are the main objectives in revising IDNA. | |||
| o Use a more recent version of Unicode, and allow IDNA to be | o Use a more recent version of Unicode, and allow IDNA to be | |||
| independent of Unicode versions, so that IDNA2008 need not be | independent of Unicode versions, so that IDNA2008 need not be | |||
| updated for implementations to adopt codepoints from new Unicode | updated for implementations to adopt code points from new Unicode | |||
| versions. | versions. | |||
| o Fix a very small number of code-point categorizations that have | o Fix a very small number of code point categorizations that have | |||
| turned out to cause problems in the communities that use those | turned out to cause problems in the communities that use those | |||
| code-points. | code points. | |||
| o Reduce the dependency on mapping, in order that the pre-mapped | o Reduce the dependency on mapping, in order that the pre-mapped | |||
| forms (which are not valid IDNA labels) tend to appear less often | forms (which are not valid IDNA labels) tend to appear less often | |||
| in various contexts, in favor of valid A-labels. | in various contexts, in favor of valid A-labels. | |||
| o Fix some details in the bidirectional codepoint handling | o Fix some details in the bidirectional code point handling | |||
| algorithms. | algorithms. | |||
| 1.5. Applicability and Function of IDNA | 1.5. Applicability and Function of IDNA | |||
| The IDNA specification solves the problem of extending the repertoire | The IDNA specification solves the problem of extending the repertoire | |||
| of characters that can be used in domain names to include a large | of characters that can be used in domain names to include a large | |||
| subset of the Unicode repertoire. | subset of the Unicode repertoire. | |||
| IDNA does not extend DNS. Instead, the applications (and, by | IDNA does not extend DNS. Instead, the applications (and, by | |||
| implication, the users) continue to see an exact-match lookup | implication, the users) continue to see an exact-match lookup | |||
| skipping to change at page 10, line 43 | skipping to change at page 10, line 43 | |||
| at registration time but not during lookup. Another significant | at registration time but not during lookup. Another significant | |||
| benefit is that separation facilitates incremental addition of | benefit is that separation facilitates incremental addition of | |||
| permitted character groups to avoid freezing on one particular | permitted character groups to avoid freezing on one particular | |||
| version of Unicode. | version of Unicode. | |||
| The actual registration and lookup protocols for IDNA2008 are | The actual registration and lookup protocols for IDNA2008 are | |||
| specified in [IDNA2008-Protocol]. | specified in [IDNA2008-Protocol]. | |||
| 3. Permitted Characters: An Inclusion List | 3. Permitted Characters: An Inclusion List | |||
| IDNA2008 adopts the inclusion model. A code-point is assumed to be | IDNA2008 adopts the inclusion model. A code point is assumed to be | |||
| invalid for IDN use unless it is included as part of a Unicode | invalid for IDN use unless it is included as part of a Unicode | |||
| property-based rule or, in rare cases, included individually by an | property-based rule or, in rare cases, included individually by an | |||
| exception. When an implementation moves to a new version of Unicode, | exception. When an implementation moves to a new version of Unicode, | |||
| the rules may indicate new valid code-points. | the rules may indicate new valid code points. | |||
| This section provides an overview of the model used to establish the | This section provides an overview of the model used to establish the | |||
| algorithm and character lists of [IDNA2008-Tables] and describes the | algorithm and character lists of [IDNA2008-Tables] and describes the | |||
| names and applicability of the categories used there. Note that the | names and applicability of the categories used there. Note that the | |||
| inclusion of a character in the first category group (Section 3.1.1) | inclusion of a character in the first category group (Section 3.1.1) | |||
| does not imply that it can be used indiscriminately; some characters | does not imply that it can be used indiscriminately; some characters | |||
| are associated with contextual rules that must be applied as well. | are associated with contextual rules that must be applied as well. | |||
| The information given in this section is provided to make the rules, | The information given in this section is provided to make the rules, | |||
| tables, and protocol easier to understand. The normative generating | tables, and protocol easier to understand. The normative generating | |||
| skipping to change at page 11, line 34 | skipping to change at page 11, line 34 | |||
| no visible effect in others. IDNA2003 prohibited those types of | no visible effect in others. IDNA2003 prohibited those types of | |||
| characters entirely by discarding them. We now have a consensus that | characters entirely by discarding them. We now have a consensus that | |||
| under some conditions, these "joiner" characters are legitimately | under some conditions, these "joiner" characters are legitimately | |||
| needed to allow useful mnemonics for some languages and scripts. In | needed to allow useful mnemonics for some languages and scripts. In | |||
| general, context-dependent rules help deal with characters (generally | general, context-dependent rules help deal with characters (generally | |||
| characters that would otherwise be prohibited entirely) that are used | characters that would otherwise be prohibited entirely) that are used | |||
| differently or perceived differently across different scripts, and | differently or perceived differently across different scripts, and | |||
| allow the standard to be applied more appropriately in cases where a | allow the standard to be applied more appropriately in cases where a | |||
| string is not universally handled the same way. | string is not universally handled the same way. | |||
| IDNA2008 divides all possible Unicode code-points into four | IDNA2008 divides all possible Unicode code points into four | |||
| categories: PROTOCOL-VALID, CONTEXTUAL RULE REQUIRED, DISALLOWED and | categories: PROTOCOL-VALID, CONTEXTUAL RULE REQUIRED, DISALLOWED and | |||
| UNASSIGNED. | UNASSIGNED. | |||
| 3.1.1. PROTOCOL-VALID | 3.1.1. PROTOCOL-VALID | |||
| Characters identified as "PROTOCOL-VALID" (often abbreviated | Characters identified as "PROTOCOL-VALID" (often abbreviated | |||
| "PVALID") are permitted in IDNs. Their use may be restricted by | "PVALID") are permitted in IDNs. Their use may be restricted by | |||
| rules about the context in which they appear or by other rules that | rules about the context in which they appear or by other rules that | |||
| apply to the entire label in which they are to be embedded. For | apply to the entire label in which they are to be embedded. For | |||
| example, any label that contains a character in this category that | example, any label that contains a character in this category that | |||
| skipping to change at page 16, line 13 | skipping to change at page 16, line 13 | |||
| registries are expected to restrict what they permit to be | registries are expected to restrict what they permit to be | |||
| registered, devising and using rules that are designed to optimize | registered, devising and using rules that are designed to optimize | |||
| the balance between confusion and risk on the one hand and maximum | the balance between confusion and risk on the one hand and maximum | |||
| expressiveness in mnemonics on the other. | expressiveness in mnemonics on the other. | |||
| In addition, there is an important role for user agents in warning | In addition, there is an important role for user agents in warning | |||
| against label forms that appear problematic given their knowledge of | against label forms that appear problematic given their knowledge of | |||
| local contexts and conventions. Of course, no approach based on | local contexts and conventions. Of course, no approach based on | |||
| naming or identifiers alone can protect against all threats. | naming or identifiers alone can protect against all threats. | |||
| 4. Issues that Constrain Possible Solutions | 4. Application-Related Issues | |||
| 4.1. Display and Network Order | 4.1. Display and Network Order | |||
| Domain names are always transmitted in network order (the order in | Domain names are always transmitted in network order (the order in | |||
| which the code points are sent in protocols), but may have a | which the code points are sent in protocols), but may have a | |||
| different display order (the order in which the code points are | different display order (the order in which the code points are | |||
| displayed on a screen or paper). When a domain name contains | displayed on a screen or paper). When a domain name contains | |||
| characters that are normally written right to left, display order may | characters that are normally written right to left, display order may | |||
| be affected although network order is not. It gets even more | be affected although network order is not. It gets even more | |||
| complicated if left to right and right to left labels are adjacent to | complicated if left to right and right to left labels are adjacent to | |||
| skipping to change at page 38, line 5 | skipping to change at page 38, line 5 | |||
| All existing channels through which names can enter a DNS server | All existing channels through which names can enter a DNS server | |||
| database (for example, master files (as described in RFC 1034) and | database (for example, master files (as described in RFC 1034) and | |||
| DNS update messages [RFC2136]) are IDN-unaware because they predate | DNS update messages [RFC2136]) are IDN-unaware because they predate | |||
| IDNA. Other sections of this document provide the needed shielding | IDNA. Other sections of this document provide the needed shielding | |||
| by ensuring that internationalized domain names entering DNS server | by ensuring that internationalized domain names entering DNS server | |||
| databases through such channels have already been converted to their | databases through such channels have already been converted to their | |||
| equivalent ASCII A-label forms. | equivalent ASCII A-label forms. | |||
| Because of the distinction made between the algorithms for | Because of the distinction made between the algorithms for | |||
| Registration and Lookup in [IDNA2008-Protocol] (a domain name | Registration and Lookup in [IDNA2008-Protocol] (a domain name | |||
| containing only ASCII codepoints cannot be converted to an A-label), | containing only ASCII code points cannot be converted to an A-label), | |||
| there cannot be more than one A-label form for any given U-label. | there cannot be more than one A-label form for any given U-label. | |||
| As specified in RFC 2181 [RFC2181], the DNS protocol explicitly | As specified in RFC 2181 [RFC2181], the DNS protocol explicitly | |||
| allows domain labels to contain octets beyond the ASCII range | allows domain labels to contain octets beyond the ASCII range | |||
| (0000..007F), and this document does not change that. However, | (0000..007F), and this document does not change that. However, | |||
| although the interpretation of octets 0080..00FF is well-defined in | although the interpretation of octets 0080..00FF is well-defined in | |||
| the DNS, many application protocols support only ASCII labels and | the DNS, many application protocols support only ASCII labels and | |||
| there is no defined interpretation of these non-ASCII octets as | there is no defined interpretation of these non-ASCII octets as | |||
| characters and, in particular, no interpretation of case-independent | characters and, in particular, no interpretation of case-independent | |||
| matching for them (see, e.g., [RFC4343]). If labels containing these | matching for them (see, e.g., [RFC4343]). If labels containing these | |||
| skipping to change at page 50, line 24 | skipping to change at page 50, line 24 | |||
| I-D version. | I-D version. | |||
| o Altered use of "these documents" and "these specifications" back | o Altered use of "these documents" and "these specifications" back | |||
| to "IDNA2008", undoing the change made in Appendix A.6. The | to "IDNA2008", undoing the change made in Appendix A.6. The | |||
| convolutions became ambiguous in places. | convolutions became ambiguous in places. | |||
| o Added a sentence to the Introduction to make the non-normative | o Added a sentence to the Introduction to make the non-normative | |||
| status of this document even more clear and added references to | status of this document even more clear and added references to | |||
| 7.1.2 and 7.1.3 to point to the more formal definitions. | 7.1.2 and 7.1.3 to point to the more formal definitions. | |||
| | A.17. Version -17 | |||
| | o Final IESG comments picked up and included. A few more editorial/ | |||
| | typographic errors caught and fixed. | |||
| | o Section 4 title adjusted to better match its content. | |||
| Author's Address | Author's Address | |||
| John C Klensin | John C Klensin | |||
| 1770 Massachusetts Ave, Ste 322 | 1770 Massachusetts Ave, Ste 322 | |||
| Cambridge, MA 02140 | Cambridge, MA 02140 | |||
| USA | USA | |||
| Phone: +1 617 245 1457 | Phone: +1 617 245 1457 | |||
| Email: john+ietf@jck.com | Email: john+ietf@jck.com | |||
| End of changes. 17 change blocks. | ||||
| 25 lines changed or deleted | 33 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
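
Illustrative note on Section 1.1 (not part of either draft version): the relationship between a Unicode label and its A-label, as described in the Introduction, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the IDNA2008 registration or lookup procedure. The function names `to_a_label` and `to_u_label` are hypothetical; the sketch relies on Python's built-in `punycode` codec for the RFC 3492 algorithm and performs none of the validity, contextual-rule, or bidirectional checks that [IDNA2008-Protocol] requires. It assumes the input is a single label, not a full domain name.

```python
ACE_PREFIX = "xn--"  # prefix that makes an A-label recognizable


def to_a_label(label: str) -> str:
    """Encode one label as an A-label (hypothetical helper).

    Assumes the label is already a valid U-label; no IDNA2008
    validation is performed here.
    """
    if label.isascii():
        # A label containing only ASCII code points is not converted
        # to an A-label; it is returned unchanged.
        return label
    # The built-in "punycode" codec implements the RFC 3492 algorithm.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_u_label(label: str) -> str:
    """Decode an A-label back to Unicode for display (hypothetical helper)."""
    if label[:4].lower() != ACE_PREFIX:
        # Not an A-label; nothing to decode.
        return label
    return label[4:].encode("ascii").decode("punycode")


if __name__ == "__main__":
    print(to_a_label("bücher"))         # xn--bcher-kva
    print(to_u_label("xn--bcher-kva"))  # bücher
```

The sketch uses the `punycode` codec directly rather than Python's built-in `idna` codec, because the latter follows the earlier IDNA2003 rules, including the mapping step that the revision aims to reduce.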