| < draft-resman-idna2008-mappings-00.txt | draft-resman-idna2008-mappings-01.txt > | |||
|---|---|---|---|---|
| Network Working Group P. Resnick | Network Working Group P. Resnick | |||
| Internet-Draft Qualcomm Incorporated | Internet-Draft Qualcomm Incorporated | |||
| Intended status: Informational P. Hoffman | Intended status: Informational P. Hoffman | |||
| Expires: October 15, 2010 VPN Consortium | Expires: October 21, 2010 VPN Consortium | |||
| April 13, 2010 | April 19, 2010 | |||
| Mapping Characters in IDNA2008 | Mapping Characters in IDNA2008 | |||
| draft-resman-idna2008-mappings-00 | draft-resman-idna2008-mappings-01 | |||
| Abstract | Abstract | |||
| In the original version of the Internationalized Domain Names in | In the original version of the Internationalized Domain Names in | |||
| Applications (IDNA) protocol, any Unicode code points taken from user | Applications (IDNA) protocol, any Unicode code points taken from user | |||
| input were mapped into a set of Unicode code points that "made | input were mapped into a set of Unicode code points that "made | |||
| sense", and then encoded and passed to the domain name system (DNS). | sense", and then encoded and passed to the domain name system (DNS). | |||
| The IDNA2008 protocol presumes that the input to the protocol comes | The IDNA2008 protocol presumes that the input to the protocol comes | |||
| from a set of "permitted" code points, which it then encodes and | from a set of "permitted" code points, which it then encodes and | |||
| passes to the DNS, but does not specify what to do with the result of | passes to the DNS, but does not specify what to do with the result of | |||
| skipping to change at line 39 ¶ | skipping to change at page 1, line 40 ¶ | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on October 15, 2010. | This Internet-Draft will expire on October 21, 2010. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at line 116 ¶ | skipping to change at page 3, line 24 ¶ | |||
| context-free mapping without considering the user interface | context-free mapping without considering the user interface | |||
| properties has the potential of doing exactly the wrong thing for the | properties has the potential of doing exactly the wrong thing for the | |||
| user. | user. | |||
| The original version of IDNA conflated user interface processing and | The original version of IDNA conflated user interface processing and | |||
| protocol. It took whatever characters the user produced in whatever | protocol. It took whatever characters the user produced in whatever | |||
| encoding the application used, assumed some conversion to Unicode | encoding the application used, assumed some conversion to Unicode | |||
| code points, and then without regard to context, locale, or anything | code points, and then without regard to context, locale, or anything | |||
| about the user's intentions, mapped them into a particular set of | about the user's intentions, mapped them into a particular set of | |||
| other characters, and then re-encoded them in Punycode, in order to have | other characters, and then re-encoded them in Punycode, in order to have | |||
| the entire operation be contained within the protocol. This made for | the entire operation be contained within the protocol. Ignoring | |||
| a much simpler implementation, making it it significantly less | context, locale, and user preference in the IDNA protocol made life | |||
| complicated for the application developer, but at the expense of | significantly less complicated for the application developer, but at | |||
| minimizing "user surprise" for consumers and producers of domain | the expense of violating the principle of "least user surprise" for | |||
| names. | consumers and producers of domain names. | |||
| In IDNA2008, the dividing line between "user interface" and | In IDNA2008, the dividing line between "user interface" and | |||
| "protocol" is clear. The IDNA2008 specification defines the protocol | "protocol" is clear. The IDNA2008 specification defines the protocol | |||
| part of IDNA: it explicitly does not deal with the user interface. | part of IDNA: it explicitly does not deal with the user interface. | |||
| Mappings such as the one described in this document explicitly deal | Mappings such as the one described in this document explicitly deal | |||
| with the user interface and not the protocol. That is, a mapping is | with the user interface and not the protocol. That is, a mapping is | |||
| only to be applied before a string of characters is treated as a | only to be applied before a string of characters is treated as a | |||
| domain name (in the "user interface") and is never to be applied | domain name (in the "user interface") and is never to be applied | |||
| during domain name processing (in the "protocol"). | during domain name processing (in the "protocol"). | |||
| skipping to change at line 150 ¶ | skipping to change at page 4, line 10 ¶ | |||
| for quite large populations of people. | for quite large populations of people. | |||
| A good mapping in the real world might use the "sensible and friendly | A good mapping in the real world might use the "sensible and friendly | |||
| and mostly obvious" design goal but come up with a different | and mostly obvious" design goal but come up with a different | |||
| algorithm. Many algorithms will have results that are close to what | algorithm. Many algorithms will have results that are close to what | |||
| is described here, but will differ in assumptions about the users' | is described here, but will differ in assumptions about the users' | |||
| way of thinking or typing. Having said that, it is likely that some | way of thinking or typing. Having said that, it is likely that some | |||
| mappings will be significantly different. For example, a mapping | mappings will be significantly different. For example, a mapping | |||
| might apply to a spoken user interface instead of a typed one. | might apply to a spoken user interface instead of a typed one. | |||
| Another example is that a mapping might be different for users typing | Another example is that a mapping might be different for users typing | |||
| than for users using copy-and-paste from different applications. | than for users using copy-and-paste from different applications. Yet | |||
| another example is that a user interface that allows typed input that | ||||
| is transliterated from Latin characters could have very different | ||||
| mappings than one that applies to typing in other character sets; | ||||
| this would be typical in a Pinyin input method for Chinese | ||||
| characters. | ||||
| 2. The General Procedure | 2. The General Procedure | |||
| This section defines a general algorithm that applications ought to | This section defines a general algorithm that applications ought to | |||
| implement in order to produce Unicode code points that will be valid | implement in order to produce Unicode code points that will be valid | |||
| under the IDNA protocol. An application might implement the full | under the IDNA protocol. An application might implement the full | |||
| mapping as described below, or can choose a different mapping. This | mapping as described below, or can choose a different mapping. This | |||
| mapping is very general and was designed to be very acceptable to the | mapping is very general and was designed to be very acceptable to the | |||
| widest user community, but as stated above, it does not take into | widest user community, but as stated above, it does not take into | |||
| account any particular context, culture, or locale. | account any particular context, culture, or locale. | |||
| skipping to change at line 175 ¶ | skipping to change at page 4, line 40 ¶ | |||
| 1. Upper case characters are mapped to their lower case equivalents | 1. Upper case characters are mapped to their lower case equivalents | |||
| by using the algorithm for mapping case in Unicode characters. | by using the algorithm for mapping case in Unicode characters. | |||
| This step was chosen because the output will behave more like | This step was chosen because the output will behave more like | |||
| ASCII host names behave. | ASCII host names behave. | |||
| 2. Full-width and half-width characters (those defined with | 2. Full-width and half-width characters (those defined with | |||
| Decomposition Types <wide> and <narrow>) are mapped to their | Decomposition Types <wide> and <narrow>) are mapped to their | |||
| decomposition mappings as shown in the Unicode character | decomposition mappings as shown in the Unicode character | |||
| database. This step was chosen because many input mechanisms, | database. This step was chosen because many input mechanisms, | |||
| particularly in Asia, do no allow you to easily enter characters | particularly in Asia, do not allow you to easily enter characters | |||
| in the form used by IDNA2008. Even if they do allow the correct | in the form used by IDNA2008. Even if they do allow the correct | |||
| character form, the user might not know which form they are | character form, the user might not know which form they are | |||
| entering. | entering. | |||
| 3. All characters are mapped using Unicode Normalization Form C | 3. All characters are mapped using Unicode Normalization Form C | |||
| (NFC). This step was chosen because it maps combinations of | (NFC). This step was chosen because it maps combinations of | |||
| combining characters into canonical composed form. As with the | combining characters into canonical composed form. As with the | |||
| full-width/half-width mapping, users are not generally aware of | full-width/half-width mapping, users are not generally aware of | |||
| the particular form of characters that they are entering, and | the particular form of characters that they are entering, and | |||
| IDNA2008 requires that only the canonical composed forms from NFC | IDNA2008 requires that only the canonical composed forms from NFC | |||
| End of changes. 6 change blocks. | ||||
| 11 lines changed or deleted | 16 lines changed or added | |||
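
The three mapping steps visible in section 2 of the diff above (lowercasing, wide/narrow decomposition, and NFC) can be sketched in Python using only the standard `unicodedata` module. This is a minimal illustration, not the draft's reference implementation: the function name `map_for_idna2008` is hypothetical, `str.lower()` is assumed to be an acceptable stand-in for "the algorithm for mapping case in Unicode characters", and any steps of the procedure that fall outside the changed regions shown here are not covered.

```python
import unicodedata


def map_for_idna2008(text: str) -> str:
    """Hypothetical helper: apply the three mapping steps shown in the diff."""
    # Step 1: map upper case characters to lower case.
    # Assumption: Python's str.lower() stands in for the Unicode case mapping.
    lowered = text.lower()

    # Step 2: replace characters whose decomposition type is <wide> or
    # <narrow> with their decomposition mapping from the Unicode database.
    pieces = []
    for ch in lowered:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0065" or ""
        if decomp.startswith(("<wide>", "<narrow>")):
            code_points = decomp.split()[1:]  # drop the type tag
            pieces.append("".join(chr(int(cp, 16)) for cp in code_points))
        else:
            pieces.append(ch)

    # Step 3: normalize the whole string to Normalization Form C.
    return unicodedata.normalize("NFC", "".join(pieces))


if __name__ == "__main__":
    # Full-width upper case Latin letters.
    print(map_for_idna2008("\uff25\uff38\uff21\uff2d\uff30\uff2c\uff25"))
    # "Cafe" followed by a combining acute accent.
    print(map_for_idna2008("Cafe\u0301"))
```

As the draft stresses, a mapping like this belongs in the user interface, before a string is treated as a domain name, and an application may reasonably choose a different mapping for its own users and locale.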