| draft-ietf-idnabis-rationale-00.txt | draft-ietf-idnabis-rationale-01.txt | |||
|---|---|---|---|---|
| Network Working Group J. Klensin | Network Working Group J. Klensin | |||
| Internet-Draft May 10, 2008 | Internet-Draft July 12, 2008 | |||
| Intended status: Standards Track | Intended status: Standards Track | |||
| Expires: November 11, 2008 | Expires: January 13, 2009 | |||
| Internationalizing Domain Names for Applications (IDNA): Definitions, | Internationalized Domain Names for Applications (IDNA): Definitions, | |||
| Background and Rationale | Background and Rationale | |||
| draft-ietf-idnabis-rationale-00.txt | draft-ietf-idnabis-rationale-01.txt | |||
| Status of this Memo | Status of this Memo | |||
| By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
| applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
| have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
| aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 35 | skipping to change at page 1, line 35 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on November 11, 2008. | This Internet-Draft will expire on January 13, 2009. | |||
| Abstract | Abstract | |||
| Several years have passed since the original protocol for | Several years have passed since the original protocol for | |||
| Internationalized Domain Names (IDNs) was completed and deployed. | Internationalized Domain Names (IDNs) was completed and deployed. | |||
| During that time, a number of issues have arisen, including the need | During that time, a number of issues have arisen, including the need | |||
| to update the system to deal with newer versions of Unicode. Some of | to update the system to deal with newer versions of Unicode. Some of | |||
| these issues require tuning of the existing protocols and the tables | these issues require tuning of the existing protocols and the tables | |||
| on which they depend. This document provides an overview of a | on which they depend. This document provides an overview of a | |||
| revised system and provides explanatory material for its components. | revised system and provides explanatory material for its components. | |||
| skipping to change at page 2, line 17 | skipping to change at page 2, line 17 | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 4 | 1.1. Context and Overview . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 4 | 1.2. Discussion Forum . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.3. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1.3. Objectives . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.4. Applicability and Function of IDNA . . . . . . . . . . . . 5 | 1.4. Applicability and Function of IDNA . . . . . . . . . . . . 5 | |||
| 1.5. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | 1.5. Terminology . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 1.5.1. Documents and Standards . . . . . . . . . . . . . . . 6 | 1.5.1. Documents and Standards . . . . . . . . . . . . . . . 6 | |||
| 1.5.2. Terminology about Characters and Character Sets . . . 6 | 1.5.2. Terminology about Characters and Character Sets . . . 6 | |||
| 1.5.3. DNS-related Terminology . . . . . . . . . . . . . . . 7 | 1.5.3. DNS-related Terminology . . . . . . . . . . . . . . . 7 | |||
| 1.5.4. Terminology Specific to IDNA . . . . . . . . . . . . . 7 | 1.5.4. Terminology Specific to IDNA . . . . . . . . . . . . . 7 | |||
| 1.5.5. Punycode is an Algorithm, not a Name . . . . . . . . . 11 | 1.5.5. Punycode is an Algorithm, not a Name . . . . . . . . . 10 | |||
| 1.5.6. Other Terminology Issues . . . . . . . . . . . . . . . 11 | 1.5.6. Other Terminology Issues . . . . . . . . . . . . . . . 11 | |||
| 1.5.7. Comprehensibility of IDNA Mechanisms and Processing . 12 | 1.6. Comprehensibility of IDNA Mechanisms and Processing . . . 12 | |||
| 2. Summary of Major Changes from IDNA2003 . . . . . . . . . . . . 13 | 2. Summary of Major Changes from IDNA2003 . . . . . . . . . . . . 13 | |||
| 3. The Revised IDNA Model . . . . . . . . . . . . . . . . . . . . 14 | 3. The Revised IDNA Model . . . . . . . . . . . . . . . . . . . . 14 | |||
| 4. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 14 | 4. Processing in IDNA2008 . . . . . . . . . . . . . . . . . . . . 14 | |||
| 5. IDNA2008 Document List . . . . . . . . . . . . . . . . . . . . 15 | 5. IDNA2008 Document List . . . . . . . . . . . . . . . . . . . . 14 | |||
| 6. Permitted Characters: An Inclusion List . . . . . . . . . . . 15 | 6. Permitted Characters: An Inclusion List . . . . . . . . . . . 15 | |||
| 6.1. A Tiered Model of Permitted Characters and Labels . . . . 16 | 6.1. A Tiered Model of Permitted Characters and Labels . . . . 15 | |||
| 6.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 16 | 6.1.1. PROTOCOL-VALID . . . . . . . . . . . . . . . . . . . . 16 | |||
| 6.1.2. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 18 | 6.1.2. DISALLOWED . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 6.1.3. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 19 | 6.1.3. UNASSIGNED . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| 6.2. Registration Policy . . . . . . . . . . . . . . . . . . . 19 | 6.2. Registration Policy . . . . . . . . . . . . . . . . . . . 19 | |||
| 6.3. Layered Restrictions: Tables, Context, Registration, | 6.3. Layered Restrictions: Tables, Context, Registration, | |||
| Applications . . . . . . . . . . . . . . . . . . . . . . . 19 | Applications . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 7. Issues that Constrain Possible Solutions . . . . . . . . . . . 20 | 7. Issues that Constrain Possible Solutions . . . . . . . . . . . 19 | |||
| 7.1. Display and Network Order . . . . . . . . . . . . . . . . 20 | 7.1. Display and Network Order . . . . . . . . . . . . . . . . 19 | |||
| 7.2. Entry and Display in Applications . . . . . . . . . . . . 21 | 7.2. Entry and Display in Applications . . . . . . . . . . . . 21 | |||
| 7.3. Linguistic Expectations: Ligatures, Digraphs, and | 7.3. Linguistic Expectations: Ligatures, Digraphs, and | |||
| Alternate Character Forms . . . . . . . . . . . . . . . . 22 | Alternate Character Forms . . . . . . . . . . . . . . . . 22 | |||
| 7.4. Case Mapping and Related Issues . . . . . . . . . . . . . 24 | 7.4. Case Mapping and Related Issues . . . . . . . . . . . . . 24 | |||
| 7.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 25 | 7.5. Right to Left Text . . . . . . . . . . . . . . . . . . . . 25 | |||
| 8. IDNs and the Robustness Principle . . . . . . . . . . . . . . 26 | 8. IDNs and the Robustness Principle . . . . . . . . . . . . . . 25 | |||
| 9. Front-end and User Interface Processing . . . . . . . . . . . 26 | 9. Front-end and User Interface Processing . . . . . . . . . . . 26 | |||
| 10. Migration and Version Synchronization . . . . . . . . . . . . 28 | 10. Migration and Version Synchronization . . . . . . . . . . . . 29 | |||
| 10.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 28 | 10.1. Design Criteria . . . . . . . . . . . . . . . . . . . . . 29 | |||
| 10.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 29 | 10.1.1. General IDNA Validity Criteria . . . . . . . . . . . . 29 | |||
| 10.1.2. Labels in Registration . . . . . . . . . . . . . . . . 30 | 10.1.2. Labels in Registration . . . . . . . . . . . . . . . . 30 | |||
| 10.1.3. Labels in Resolution (Lookup) . . . . . . . . . . . . 31 | 10.1.3. Labels in Resolution (Lookup) . . . . . . . . . . . . 31 | |||
| 10.2. More Flexibility in User Agents . . . . . . . . . . . . . 31 | 10.2. More Flexibility in User Agents . . . . . . . . . . . . . 32 | |||
| 10.3. The Question of Prefix Changes . . . . . . . . . . . . . . 33 | 10.3. The Question of Prefix Changes . . . . . . . . . . . . . . 33 | |||
| 10.3.1. Conditions Requiring a Prefix Change . . . . . . . . . 33 | 10.3.1. Conditions Requiring a Prefix Change . . . . . . . . . 33 | |||
| 10.3.2. Conditions Not Requiring a Prefix Change . . . . . . . 34 | 10.3.2. Conditions Not Requiring a Prefix Change . . . . . . . 34 | |||
| 10.3.3. Implications of Prefix Changes . . . . . . . . . . . . 34 | 10.3.3. Implications of Prefix Changes . . . . . . . . . . . . 35 | |||
| 10.4. Stringprep Changes and Compatibility . . . . . . . . . . . 35 | 10.4. Stringprep Changes and Compatibility . . . . . . . . . . . 35 | |||
| 10.5. The Symbol Question . . . . . . . . . . . . . . . . . . . 35 | 10.5. The Symbol Question . . . . . . . . . . . . . . . . . . . 36 | |||
| 10.6. Migration Between Unicode Versions: Unassigned Code | 10.6. Migration Between Unicode Versions: Unassigned Code | |||
| Points . . . . . . . . . . . . . . . . . . . . . . . . . . 37 | Points . . . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
| 10.7. Other Compatibility Issues . . . . . . . . . . . . . . . . 38 | 10.7. Other Compatibility Issues . . . . . . . . . . . . . . . . 38 | |||
| 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 38 | 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 39 | 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 39 | |||
| 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 39 | 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 40 | |||
| 13.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 39 | 13.1. IDNA Character Registry . . . . . . . . . . . . . . . . . 40 | |||
| 13.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 39 | 13.2. IDNA Context Registry . . . . . . . . . . . . . . . . . . 40 | |||
| 13.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 40 | 13.3. IANA Repository of IDN Practices of TLDs . . . . . . . . . 40 | |||
| 14. Security Considerations . . . . . . . . . . . . . . . . . . . 40 | 14. Security Considerations . . . . . . . . . . . . . . . . . . . 41 | |||
| 15. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 41 | 15. Change Log . . . . . . . . . . . . . . . . . . . . . . . . . . 42 | |||
| 15.1. Version -01 of draft-klensin-idnabis-issues . . . . . . . 42 | 15.1. Version -01 of draft-klensin-idnabis-issues . . . . . . . 42 | |||
| 15.2. Version -02 of draft-klensin-idnabis-issues . . . . . . . 42 | 15.2. Version -02 of draft-klensin-idnabis-issues . . . . . . . 42 | |||
| 15.3. Version -03 of draft-klensin-idnabis-issues . . . . . . . 42 | 15.3. Version -03 of draft-klensin-idnabis-issues . . . . . . . 43 | |||
| 15.4. Version -04 of draft-klensin-idnabis-issues . . . . . . . 42 | 15.4. Version -04 of draft-klensin-idnabis-issues . . . . . . . 43 | |||
| 15.5. Version -05 of draft-klensin-idnabis-issues . . . . . . . 43 | 15.5. Version -05 of draft-klensin-idnabis-issues . . . . . . . 43 | |||
| 15.6. Version -06 of draft-klensin-idnabis-issues . . . . . . . 43 | 15.6. Version -06 of draft-klensin-idnabis-issues . . . . . . . 43 | |||
| 15.7. Version -07 of draft-klensin-idnabis-issues . . . . . . . 43 | 15.7. Version -07 of draft-klensin-idnabis-issues . . . . . . . 44 | |||
| 15.8. Version -00 of draft-ietf-idnabis-rationale . . . . . . . 44 | 15.8. Version -00 of draft-ietf-idnabis-rationale . . . . . . . 44 | |||
| 16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 44 | 15.9. Version -01 of draft-ietf-idnabis-rationale . . . . . . . 45 | |||
| 16.1. Normative References . . . . . . . . . . . . . . . . . . . 44 | 16. References . . . . . . . . . . . . . . . . . . . . . . . . . . 46 | |||
| 16.2. Informative References . . . . . . . . . . . . . . . . . . 46 | 16.1. Normative References . . . . . . . . . . . . . . . . . . . 46 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 47 | 16.2. Informative References . . . . . . . . . . . . . . . . . . 47 | |||
| | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 48 | |||
| Intellectual Property and Copyright Statements . . . . . . . . . . 49 | Intellectual Property and Copyright Statements . . . . . . . . . . 49 | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Context and Overview | 1.1. Context and Overview | |||
| Several years have passed since the original protocol for | Several years have passed since the original protocol for | |||
| Internationalized Domain Names (IDNs) was completed and deployed. | Internationalized Domain Names (IDNs) was completed and deployed. | |||
| During that time, a number of issues have arisen, including a subset | During that time, a number of issues have arisen, including a subset | |||
| of those described in a recent IAB report [RFC4690] and the need to | of those described in a recent IAB report [RFC4690] and the need to | |||
| skipping to change at page 6, line 11 | skipping to change at page 6, line 11 | |||
| in applications by using the ASCII representation of the non-ASCII | in applications by using the ASCII representation of the non-ASCII | |||
| name labels. While such names are user-unfriendly to read and type, | name labels. While such names are user-unfriendly to read and type, | |||
| and hence not optimal for user input, they allow (for instance) | and hence not optimal for user input, they allow (for instance) | |||
| replying to email and clicking on URLs even though the domain name | replying to email and clicking on URLs even though the domain name | |||
| displayed is incomprehensible to the user. In order to allow user- | displayed is incomprehensible to the user. In order to allow user- | |||
| friendly input and output of the IDNs and acceptance of some | friendly input and output of the IDNs and acceptance of some | |||
| characters as equivalent to those to be processed according to the | characters as equivalent to those to be processed according to the | |||
| protocol, the applications need to be modified to conform to this | protocol, the applications need to be modified to conform to this | |||
| specification. | specification. | |||
| IDNA uses the Unicode character repertoire, which avoids the | IDNA uses the Unicode character repertoire, for continuity with | |||
| significant delays that would be inherent in waiting for a different | IDNA2003. | |||
| and specific character sets to be defined for IDN purposes, | ||||
| presumably by some other standards developing organization. | ||||
| 1.5. Terminology | 1.5. Terminology | |||
| 1.5.1. Documents and Standards | 1.5.1. Documents and Standards | |||
| This document uses the term "IDNA2003" to refer to the set of | This document uses the term "IDNA2003" to refer to the set of | |||
| standards that make up and support the version of IDNA published in | standards that make up and support the version of IDNA published in | |||
| 2003, i.e., those commonly known as the IDNA base specification | 2003, i.e., those commonly known as the IDNA base specification | |||
| [RFC3490], Nameprep [RFC3491], Punycode [RFC3492], and Stringprep | [RFC3490], Nameprep [RFC3491], Punycode [RFC3492], and Stringprep | |||
| [RFC3454]. In this document, those names are used to refer, | [RFC3454]. In this document, those names are used to refer, | |||
| skipping to change at page 8, line 20 | skipping to change at page 8, line 14 | |||
| subsection. In the next, it defines a historical one to be slightly | subsection. In the next, it defines a historical one to be slightly | |||
| more precise for IDNA contexts. | more precise for IDNA contexts. | |||
| o A string is "IDNA-valid" if it meets all of the requirements of | o A string is "IDNA-valid" if it meets all of the requirements of | |||
| these specifications for an IDNA label. IDNA-valid strings may | these specifications for an IDNA label. IDNA-valid strings may | |||
| appear in either of two forms, defined immediately below. It is | appear in either of two forms, defined immediately below. It is | |||
| expected that specific reference will be made to the form | expected that specific reference will be made to the form | |||
| appropriate to any context in which the distinction is important. | appropriate to any context in which the distinction is important. | |||
| o An "A-label" is the ASCII-Compatible Encoding (ACE, see | o An "A-label" is the ASCII-Compatible Encoding (ACE, see | |||
| Section 1.5.4.3) form of an IDNA-valid string. It must be a | Section 1.5.4.4) form of an IDNA-valid string. It must be a | |||
| complete label: IDNA is defined for labels, not for parts of them | complete label: IDNA is defined for labels, not for parts of them | |||
| and not for complete domain names. This means, by definition, | and not for complete domain names. This means, by definition, | |||
| that every A-label will begin with the IDNA ACE prefix, "xn--", | that every A-label will begin with the IDNA ACE prefix, "xn--", | |||
| followed by a string that is a valid output of the Punycode | followed by a string that is a valid output of the Punycode | |||
| algorithm and hence a maximum of 59 ASCII characters in length. | algorithm and hence a maximum of 59 ASCII characters in length. | |||
| The prefix and string together must conform to all requirements | The prefix and string together must conform to all requirements | |||
| for a label that can be stored in the DNS including conformance to | for a label that can be stored in the DNS including conformance to | |||
| the LDH ("host name") rule described in RFC 1034, RFC 1123 and | the LDH ("host name") rule described in RFC 1034, RFC 1123 and | |||
| elsewhere. | elsewhere. | |||
| o A "U-label" is an IDNA-valid string of Unicode characters, | o A "U-label" is an IDNA-valid string of Unicode characters, | |||
| expressed in a standard Unicode Encoding Form, normally UTF-8 in | including at least one non-ASCII character, expressed in a | |||
| an Internet transmission context, and subject to the constraint | standard Unicode Encoding Form, normally UTF-8 in an Internet | |||
| below. Conversions between valid U-labels and valid A-labels is | transmission context, and subject to the constraint below. | |||
| performed according to the specification in [RFC3492], adding or | Conversions between valid U-labels and valid A-labels is performed | |||
| removing the ACE prefix (see Section 1.5.4.3) as needed. | according to the specification in [RFC3492], adding or removing | |||
| | the ACE prefix (see Section 1.5.4.4) as needed. | |||
| To be valid, U-labels and A-labels must obey an important symmetry | To be valid, U-labels and A-labels must obey an important symmetry | |||
| constraint. While that constraint may be tested in any of several | constraint. While that constraint may be tested in any of several | |||
| ways, an A-label must be capable of being produced by conversion from | ways, an A-label must be capable of being produced by conversion from | |||
| a U-label and a U-label must be capable of being produced by | a U-label and a U-label must be capable of being produced by | |||
| conversion from an A-label. Among other things, this implies that | conversion from an A-label. Among other things, this implies that | |||
| both U-labels and A-labels must represent strings in normalized form. | both U-labels and A-labels must represent strings in normalized form. | |||
| These strings MUST contain only characters specified elsewhere in | These strings MUST contain only characters specified elsewhere in | |||
| this document and its companion documents, and only in the contexts | this document and its companion documents, and only in the contexts | |||
| indicated as appropriate. | indicated as appropriate. | |||
| skipping to change at page 9, line 17 | skipping to change at page 9, line 14 | |||
| A different way to look at these terms, which may be more clear to | A different way to look at these terms, which may be more clear to | |||
| some readers, is that U-labels, A-labels, and LDH-labels (see the | some readers, is that U-labels, A-labels, and LDH-labels (see the | |||
| next subsection) are disjoint categories that, together, make up the | next subsection) are disjoint categories that, together, make up the | |||
| forms of legitimate strings for use in domain names that describe | forms of legitimate strings for use in domain names that describe | |||
| hosts. Of the three, only A-labels and LDH-labels can actually | hosts. Of the three, only A-labels and LDH-labels can actually | |||
| appear in DNS zone files or queries; U-labels can appear, along with | appear in DNS zone files or queries; U-labels can appear, along with | |||
| the other two, in presentation and user interface forms and in | the other two, in presentation and user interface forms and in | |||
| selected protocols other than those of the DNS itself. Strings that | selected protocols other than those of the DNS itself. Strings that | |||
| do not conform to the rules for one of these three categories and, in | do not conform to the rules for one of these three categories and, in | |||
| particular, strings that contain "-" in the third or fourth character | particular, strings that contain "--" in the third and fourth | |||
| position but are: | character position but are: | |||
| o not A-labels or | o not A-labels or | |||
| o cannot be processed as U-labels or A-labels as described in these | o cannot be processed as U-labels or A-labels as described in these | |||
| specifications, | specifications, | |||
| are invalid as labels in domain names that identify Internet hosts or | are invalid in IDNA-conformant applications as labels in domain names | |||
| similar resources. This restriction on strings containing "--" is | that identify Internet hosts or similar resources. This restriction | |||
| required for three reasons: | on strings containing "--" is required for three reasons: | |||
| o to prevent confusion with pre-IDNA coding forms; | o to prevent confusion with pre-IDNA coding forms; | |||
| o to permit future extensions that would require changing the | o to permit future extensions that would require changing the | |||
| prefix, no matter how unlikely those might be (see Section 10.3); | prefix, no matter how unlikely those might be (see Section 10.3); | |||
| and | and | |||
| o to reduce the opportunities for attacks on the encoding system. | o to reduce the opportunities for attacks via the encoding system. | |||
| 1.5.4.1.2. LDH-label and Internationalized Label | 1.5.4.2. LDH-label and Internationalized Label | |||
| In the hope of further clarifying discussions about IDNs, these | In the hope of further clarifying discussions about IDNs, these | |||
| specifications use the term "LDH-label" strictly to refer to an all- | specifications use the term "LDH-label" strictly to refer to an all- | |||
| ASCII label that obeys the "hostname" (LDH) conventions and that is | ASCII label that obeys the "hostname" (LDH) conventions and that is | |||
| not an IDN. In other words, only "U-label" and "A-label" refer to | not an IDN. In other words, only "U-label" and "A-label" refer to | |||
| IDNs and LDH-labels are not IDNs. "Internationalized label" is used | IDNs; LDH-labels are not IDNs. "Internationalized label" is used | |||
| when a term is needed to refer to any of the three categories. There | when a term is needed to refer to any of the three categories. There | |||
| are some standardized DNS label formats, such as those for service | are some standardized DNS label formats, such as those for service | |||
| location (SRV) records [RFC2782] that do not fall into any of the | location (SRV) records [RFC2782] that do not fall into any of the | |||
| three categories and hence are not internationalized labels. | three categories and hence are not internationalized labels. | |||
| 1.5.4.2. Equivalence | 1.5.4.3. Equivalence | |||
| In IDNA, equivalence of labels is defined in terms of the A-labels. | In IDNA, equivalence of labels is defined in terms of the A-labels. | |||
| If the A-labels are equal in a case-independent comparison, then the | If the A-labels are equal in a case-independent comparison, then the | |||
| labels are considered equivalent, no matter how they are represented. | labels are considered equivalent, no matter how they are represented. | |||
| Traditional LDH labels already have a notion of equivalence: within | Traditional LDH labels already have a notion of equivalence: within | |||
| that list of characters, upper case and lower case are considered | that list of characters, upper case and lower case are considered | |||
| equivalent. The IDNA notion of equivalence is an extension of that | equivalent. The IDNA notion of equivalence is an extension of that | |||
| older notion. Equivalent labels in IDNA are treated as alternate | older notion. Equivalent labels in IDNA are treated as alternate | |||
| forms of the same label, just as "foo" and "Foo" are treated as | forms of the same label, just as "foo" and "Foo" are treated as | |||
| alternate forms of the same label. | alternate forms of the same label. | |||
| 1.5.4.3. ACE Prefix | 1.5.4.4. ACE Prefix | |||
| The "ACE prefix" is defined in this document to be a string of ASCII | The "ACE prefix" is defined in this document to be a string of ASCII | |||
| characters "xn--" that appears at the beginning of every A-label. | characters "xn--" that appears at the beginning of every A-label. | |||
| "ACE" stands for "ASCII-Compatible Encoding". | "ACE" stands for "ASCII-Compatible Encoding". | |||
| 1.5.4.4. Domain Name Slot | 1.5.4.5. Domain Name Slot | |||
| A "domain name slot" is defined in this document to be a protocol | A "domain name slot" is defined in this document to be a protocol | |||
| element or a function argument or a return value (and so on) | element or a function argument or a return value (and so on) | |||
| explicitly designated for carrying a domain name. Examples of domain | explicitly designated for carrying a domain name. Examples of domain | |||
| name slots include: the QNAME field of a DNS query; the name argument | name slots include: the QNAME field of a DNS query; the name argument | |||
| of the gethostbyname() or getaddrinfo() standard C library functions; | of the gethostbyname() or getaddrinfo() standard C library functions; | |||
| the part of an email address following the at-sign (@) in the | the part of an email address following the at-sign (@) in the | |||
| parameter to the SMTP MAIL or RCPT commands or the "From:" field of | parameter to the SMTP MAIL or RCPT commands or the "From:" field of | |||
| an email message header; and the host portion of the URI in the src | an email message header; and the host portion of the URI in the src | |||
| attribute of an HTML <IMG> tag. General text that just happens to | attribute of an HTML <IMG> tag. General text that just happens to | |||
| skipping to change at page 11, line 41 | skipping to change at page 11, line 31 | |||
| because they are mnemonics, they need not obey the orthographic | because they are mnemonics, they need not obey the orthographic | |||
| conventions of any language: it is not a requirement that it be | conventions of any language: it is not a requirement that it be | |||
| possible for them to be "words". | possible for them to be "words". | |||
| This distinction is important because the reasonable goal of an IDN | This distinction is important because the reasonable goal of an IDN | |||
| effort is not to be able to write the great Klingon (or language of | effort is not to be able to write the great Klingon (or language of | |||
| one's choice) novel in DNS labels but to be able to form a usefully | one's choice) novel in DNS labels but to be able to form a usefully | |||
| broad range of mnemonics in ways that are as natural as possible in a | broad range of mnemonics in ways that are as natural as possible in a | |||
| very broad range of scripts. | very broad range of scripts. | |||
| "The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | ||||
| An "internationalized domain name" (IDN) is a domain name that may | An "internationalized domain name" (IDN) is a domain name that may | |||
| contain any mixture of LDH-labels, A-labels, or U-labels. This | contain any mixture of LDH-labels, A-labels, or U-labels. This | |||
| implies that every conventional domain name is an IDN (which implies | implies that every conventional domain name is an IDN (which implies | |||
| that it is possible for a domain name to be an IDN without it | that it is possible for a domain name to be an IDN without it | |||
| containing any non-ASCII characters). Just as has been the case with | containing any non-ASCII characters). Just as has been the case with | |||
| ASCII names, some DNS zone administrators may impose restrictions, | ASCII names, some DNS zone administrators may impose restrictions, | |||
| beyond those imposed by DNS or IDNA, on the characters or strings | beyond those imposed by DNS or IDNA, on the characters or strings | |||
| that may be registered as labels in their zones. Because of the | that may be registered as labels in their zones. Because of the | |||
| diversity of characters that can be used in a U-label and the | diversity of characters that can be used in a U-label and the | |||
| confusion they might cause, such restrictions are mandatory for IDN | confusion they might cause, such restrictions are mandatory for IDN | |||
| registries and zones even though the particular restrictions are not | registries and zones even though the particular restrictions are not | |||
| part of these specifications. Because these restrictions, commonly | part of these specifications. Because these restrictions, commonly | |||
| known as "registry restrictions", only affect what can be registered | known as "registry restrictions", only affect what can be registered | |||
| and not resolution processing, they have no effect on the syntax or | and not resolution processing, they have no effect on the syntax or | |||
| semantics of DNS protocol messages; a query for a name that matches | semantics of DNS protocol messages; a query for a name that matches | |||
| no records will yield the same response regardless of the reason why | no records will yield the same response regardless of the reason why | |||
| it is not in the zone. Clients issuing queries or interpreting | it is not in the zone. Clients issuing queries or interpreting | |||
| responses cannot be assumed to have any knowledge of zone-specific | responses cannot be assumed to have any knowledge of zone-specific | |||
| restrictions or conventions. See Section 6.2. | restrictions or conventions. See Section 6.2. | |||
| 1.5.7. Comprehensibility of IDNA Mechanisms and Processing | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| | 1.6. Comprehensibility of IDNA Mechanisms and Processing | |||
| One of the major goals of this work is to improve the general | One of the major goals of this work is to improve the general | |||
| understanding of how IDNA works and what characters are permitted and | understanding of how IDNA works and what characters are permitted and | |||
| what happens to them. Comprehensibility and predictability to users | what happens to them. Comprehensibility and predictability to users | |||
| and registrants are themselves important motivations and design goals | and registrants are themselves important motivations and design goals | |||
| for this effort. The effort includes some new terminology and a | for this effort. The effort includes some new terminology and a | |||
| revised and extended model, both covered in this section, and some | revised and extended model, both covered in this section, and some | |||
| more specific protocol, processing, and table modifications. Details | more specific protocol, processing, and table modifications. Details | |||
| of the latter appear in other documents (see Section 5). | of the latter appear in other documents (see Section 5). | |||
| skipping to change at page 14, line 19 ¶ | skipping to change at page 14, line 10 ¶ | |||
| specific contexts. | specific contexts. | |||
| 7. Allow typical words and names in languages such as Dhivehi and | 7. Allow typical words and names in languages such as Dhivehi and | |||
| Yiddish to be expressed. | Yiddish to be expressed. | |||
| 8. Make bidirectional domain names (delimited strings of labels, | 8. Make bidirectional domain names (delimited strings of labels, | |||
| not just labels standing on their own) display in a non- | not just labels standing on their own) display in a non- | |||
| surprising fashion. | surprising fashion. | |||
| 9. Make bidirectional domain names in a paragraph display in a non- | 9. Make bidirectional domain names in a paragraph display in a non- | |||
| surprising fashion. | surprising fashion. [[anchor17: Is this statement necessary or is | |||
| | it redundant with the previous one?]] | |||
| 10. Remove the dot separator from the mandatory part of the | 10. Remove the dot separator from the mandatory part of the | |||
| protocol. | protocol. | |||
| 11. Make some currently-valid labels that are not actually IDNA | 11. Make some currently-valid labels that are not actually IDNA | |||
| labels invalid. | labels invalid. | |||
| 3. The Revised IDNA Model | 3. The Revised IDNA Model | |||
| IDNA is a client-side protocol, i.e., almost all of the processing is | IDNA is a client-side protocol, i.e., almost all of the processing is | |||
| skipping to change at page 16, line 14 ¶ | skipping to change at page 16, line 5 ¶ | |||
| are in [IDNA2008-Protocol]. | are in [IDNA2008-Protocol]. | |||
| 6.1. A Tiered Model of Permitted Characters and Labels | 6.1. A Tiered Model of Permitted Characters and Labels | |||
| Moving to an inclusion model requires respecifying the list of | Moving to an inclusion model requires respecifying the list of | |||
| characters that are permitted in IDNs. In IDNA2003, the role and | characters that are permitted in IDNs. In IDNA2003, the role and | |||
| utility of characters are independent of context and fixed forever | utility of characters are independent of context and fixed forever | |||
| (or until the standard is replaced). Making completely context- | (or until the standard is replaced). Making completely context- | |||
| independent rules globally has proven impractical because some | independent rules globally has proven impractical because some | |||
| characters, especially those that are called "Join_Controls" in | characters, especially those that are called "Join_Controls" in | |||
| Unicode, are needed to make reasonable use of some scripts but become | Unicode, are needed to make reasonable use of some scripts but have | |||
| invisible characters in others. Of necessity, IDNA2003 prohibited | no visible effect(s) in others. Of necessity, IDNA2003 prohibited | |||
| those types of characters entirely. But the restrictions were much | those types of characters entirely. But the restrictions were much | |||
| too severe to permit an adequate range of mnemonics for terminology | too severe to permit an adequate range of mnemonics for terminology | |||
| based on some languages. The requirement to support those characters | based on some languages. The requirement to support those characters | |||
| but limit their use to very specific contexts was reinforced by the | but limit their use to very specific contexts was reinforced by the | |||
| observation that handling of particular characters across the | observation that handling of particular characters across the | |||
| languages that use a script, or the use of similar or identical- | languages that use a script, or the use of similar or identical- | |||
| looking characters in different scripts, is less well understood than | looking characters in different scripts, is less well understood than | |||
| many people believed it was several years ago. | many people believed it was several years ago. | |||
| Independently of the characters chosen (see next subsection), the | Independently of the characters chosen (see next subsection), the | |||
| skipping to change at page 17, line 10 ¶ | skipping to change at page 16, line 48 ¶ | |||
| removed from it unless the code points themselves are removed from | removed from it unless the code points themselves are removed from | |||
| Unicode (such removal would be inconsistent with the Unicode | Unicode (such removal would be inconsistent with the Unicode | |||
| stability principles (see [Unicode51], Appendix F) and hence should | stability principles (see [Unicode51], Appendix F) and hence should | |||
| never occur). | never occur). | |||
| [[anchor21: Placeholder: Does this topic or comment need additional | [[anchor21: Placeholder: Does this topic or comment need additional | |||
| discussion or explanation?]] | discussion or explanation?]] | |||
| 6.1.1.1. Contextual Rules | 6.1.1.1. Contextual Rules | |||
| Characters in the PROTOCOL-VALID category may actually be unsuitable | Some characters may be unsuitable for general use in IDNs but | |||
| for general use in IDNs but necessary for the plausible support of | necessary for the plausible support of some scripts. The two most | |||
| some scripts. The two most commonly-cited examples are the zero- | commonly-cited examples are the zero-width joiner and non-joiner | |||
| width joiner and non-joiner characters (ZWNJ, U+200C, and ZWJ, | characters (ZWNJ, U+200C, and ZWJ, U+200D), but provisions for | |||
| U+200D), but provisions for unambiguous labels may require that other | unambiguous labels may require that other characters be restricted to | |||
| characters be restricted to particular contexts. For example, the | particular contexts. For example, the ASCII hyphen is not permitted | |||
| ASCII hyphen is not permitted to start or end a label, whether that | to start or end a label, whether that label contains non-ASCII | |||
| label contains non-ASCII characters or not. | characters or not. | |||
| These characters must not appear in IDNs without additional | These characters must not appear in IDNs without additional | |||
| restrictions, typically because they are invisible in most scripts | restrictions, typically because they have no visible consequences in | |||
| but affect format or presentation in a few others or because they are | most scripts but affect format or presentation in a few others or | |||
| combining characters that are safe for use only in conjunction with | because they are combining characters that are safe for use only in | |||
| particular characters or scripts. In order to permit them to be used | conjunction with particular characters or scripts. In order to | |||
| at all, they are specially identified as "CONTEXTUAL RULE REQUIRED" | permit them to be used at all, they are specially identified as | |||
| and, when adequately understood, associated with a rule. In | "CONTEXTUAL RULE REQUIRED" and, when adequately understood, | |||
| addition, the rule will define whether it is to be applied on lookup | associated with a rule. In addition, the rule will define whether it | |||
| as well as registration. A distinction is made between characters | is to be applied on lookup as well as registration. A distinction is | |||
| that indicate or prohibit joining (known as "CONTEXT-JOINER" or | made between characters that indicate or prohibit joining (known as | |||
| "CONTEXTJ") and other characters requiring contextual treatment | "CONTEXT-JOINER" or "CONTEXTJ") and other characters requiring | |||
| ("CONTEXT-OTHER" or "CONTEXTO"). Only the former are fully tested at | contextual treatment ("CONTEXT-OTHER" or "CONTEXTO"). Only the | |||
| lookup time. | former are fully tested at lookup time. | |||
| 6.1.1.2. Rules and Their Application | 6.1.1.2. Rules and Their Application | |||
| The actual rules may be present or absent. If present, they may have | The actual rules may be present or absent. If present, they may have | |||
| values of "True" (character may be used in any position in any | values of "True" (character may be used in any position in any | |||
| label), "False" (character may not be used in any label), or may be | label), "False" (character may not be used in any label), or may be | |||
| an extended regular expression that specifies the context in which | an extended regular expression that specifies the context in which | |||
| the character is permitted. | the character is permitted. | |||
| Examples of descriptions of typical rules, stated informally and in | Examples of descriptions of typical rules, stated informally and in | |||
| skipping to change at page 18, line 5 ¶ | skipping to change at page 17, line 43 ¶ | |||
| occur only if the entire label is in Script ABC", "MUST occur only if | occur only if the entire label is in Script ABC", "MUST occur only if | |||
| the previous and subsequent characters have the DFG property". | the previous and subsequent characters have the DFG property". | |||
| Because it is easier to identify these characters than to know that | Because it is easier to identify these characters than to know that | |||
| they are actually needed in IDNs or how to establish exactly the | they are actually needed in IDNs or how to establish exactly the | |||
| right rules for each one, a rule may have a null value in a given | right rules for each one, a rule may have a null value in a given | |||
| version of the tables. Characters associated with null rules MUST | version of the tables. Characters associated with null rules MUST | |||
| NOT appear in putative labels for either registration or lookup. Of | NOT appear in putative labels for either registration or lookup. Of | |||
| course, a later version of the tables might contain a non-null rule. | course, a later version of the tables might contain a non-null rule. | |||
| [[anchor23: Definition of regular expression language to be | [[anchor23: Definition of regular expression language to be supplied | |||
| supplied]] | or replaced with a description of the definitional technique. It may | |||
| | be useful to move more of this material to Tables as part of moving | |||
| | the rules from Protocol to Tables.]] | |||
| 6.1.2. DISALLOWED | 6.1.2. DISALLOWED | |||
| Some characters are sufficiently problematic for use in IDNs that | Some characters are sufficiently problematic for use in IDNs that | |||
| they should be excluded for both registration and lookup (i.e., | they should be excluded for both registration and lookup (i.e., | |||
| conforming applications performing name resolution should verify that | conforming applications performing name resolution should verify that | |||
| these characters are absent; if they are present, the label strings | these characters are absent; if they are present, the label strings | |||
| should be rejected rather than converted to A-labels and looked up. | should be rejected rather than converted to A-labels and looked up. | |||
| Of course, this category would include code points that had been | Of course, this category would include code points that had been | |||
| skipping to change at page 23, line 37 ¶ | skipping to change at page 23, line 25 ¶ | |||
| (as in the above example with different spelling conventions). This | (as in the above example with different spelling conventions). This | |||
| can be illustrated by many words in the Norwegian language, where the | can be illustrated by many words in the Norwegian language, where the | |||
| "ae" ligature is the 27th letter of a 29-letter extended Latin | "ae" ligature is the 27th letter of a 29-letter extended Latin | |||
| alphabet. It is equivalent to the 28th letter of the Swedish | alphabet. It is equivalent to the 28th letter of the Swedish | |||
| alphabet (also containing 29 letters), U+00E4 LATIN SMALL LETTER A | alphabet (also containing 29 letters), U+00E4 LATIN SMALL LETTER A | |||
| WITH DIAERESIS, for which an "ae" cannot be substituted according to | WITH DIAERESIS, for which an "ae" cannot be substituted according to | |||
| current orthographic standards. | current orthographic standards. | |||
| That character (U+00E4) is also part of the German alphabet where, | That character (U+00E4) is also part of the German alphabet where, | |||
| unlike in the Nordic languages, the two-character sequence "ae" is | unlike in the Nordic languages, the two-character sequence "ae" is | |||
| usually treated as a fully acceptable alternate orthography. The | usually treated as a fully acceptable alternate orthography for the | |||
| inverse is however not true, and those two characters cannot | "umlauted a" character. The inverse is however not true, and those | |||
| necessarily be combined into an "umlauted a". This also applies to | two characters cannot necessarily be combined into an "umlauted a". | |||
| another German character, the "umlauted o" (U+00F6 LATIN SMALL LETTER | This also applies to another German character, the "umlauted o" | |||
| O WITH DIAERESIS) which, for example, cannot be used for writing the | (U+00F6 LATIN SMALL LETTER O WITH DIAERESIS) which, for example, | |||
| name of the author "Goethe". It is also a letter in the Swedish | cannot be used for writing the name of the author "Goethe". It is | |||
| alphabet where, in parallel to the "umlauted a", it cannot be | also a letter in the Swedish alphabet where, in parallel to the | |||
| correctly represented as "oe" and in the Norwegian alphabet, where it | "umlauted a", it cannot be correctly represented as "oe" and in the | |||
| is represented, not as "umlauted o", but as "slashed o", U+00F8. | Norwegian alphabet, where it is represented, not as "umlauted o", but | |||
| | as "slashed o", U+00F8. | |||
| Some of the ligatures that have explicit code points in Unicode were | Some of the ligatures that have explicit code points in Unicode were | |||
| given special handling in IDNA2003 and now pose additional problems | given special handling in IDNA2003 and now pose additional problems | |||
| as people argue that they should have been treated differently to | as people argue that they should have been treated differently to | |||
| preserve important information. For example, the German character | preserve important information. For example, the German character | |||
| Eszett (Sharp S, U+00DF) is retained as itself by NFKC but case- | Eszett (Sharp S, U+00DF) is retained as itself by NFKC but case- | |||
| folded by Stringprep to "ss", but the closely-related, but less | folded by Stringprep to "ss", but the closely-related, but less | |||
| frequently seen, character "Long S T" (U+FB05) is a compatibility | frequently seen, character "Long S T" (U+FB05) is a compatibility | |||
| character that is mapped out by NFKC. Unless exceptions are made, | character that is mapped out by NFKC. Unless exceptions are made, | |||
| both will be treated as DISALLOWED by IDNA2008. But there is | both will be treated as DISALLOWED by IDNA2008. But there is | |||
| skipping to change at page 27, line 4 ¶ | skipping to change at page 26, line 40 ¶ | |||
| resolver that decides a string that is valid under the protocol is | resolver that decides a string that is valid under the protocol is | |||
| dangerous and refuses to look it up is in violation of the protocols; | dangerous and refuses to look it up is in violation of the protocols; | |||
| one that is willing to look something up, but warns against it, is | one that is willing to look something up, but warns against it, is | |||
| exercising a local choice. | exercising a local choice. | |||
| 9. Front-end and User Interface Processing | 9. Front-end and User Interface Processing | |||
| Domain names may be identified and processed in many contexts. They | Domain names may be identified and processed in many contexts. They | |||
| may be typed in by users either by themselves or as part of URIs or | may be typed in by users either by themselves or as part of URIs or | |||
| IRIs. They may occur in running text or be processed by one system | IRIs. They may occur in running text or be processed by one system | |||
| after being provided in another. They may wish to try to normalize | after being provided in another. Systems may wish to try to | |||
| URLs so as to determine (or guess) whether a reference is valid or | normalize URLs so as to determine (or guess) whether a reference is | |||
| two references point to the same object without actually looking the | valid or two references point to the same object without actually | |||
| objects up and comparing them. Some of these goals may be more | looking the objects up and comparing them. Some of these goals may | |||
| easily and reliably satisfied than others. While there are strong | be more easily and reliably satisfied than others. While there are | |||
| arguments for any domain name that is placed "on the wire" -- | strong arguments for any domain name that is placed "on the wire" -- | |||
| transmitted between systems -- to be in the minimum-ambiguity forms | transmitted between systems -- to be in the minimum-ambiguity forms | |||
| of A-labels, U-labels, or LDH-labels, it is inevitable that programs | of A-labels, U-labels, or LDH-labels, it is inevitable that programs | |||
| that process domain names will encounter variant forms. One source | that process domain names will encounter variant forms. One source | |||
| of such forms will be labels created under IDNA2003. Because of the | of such forms will be labels created under IDNA2003. Because of the | |||
| way that protocol was specified, there are a significant number of | way that protocol was specified, there are a significant number of | |||
| domain names in files on the Internet that use characters that cannot | domain names in files on the Internet that use characters that cannot | |||
| be represented directly in domain names but for which interpretations | be represented directly in domain names but for which interpretations | |||
| are provided. There are two major categories of such characters, | are provided. There are two major categories of such characters, | |||
| those that are removed by NFKC normalization and those upper-case | those that are removed by NFKC normalization and those upper-case | |||
| characters that are mapped to lower-case (there are also a few | characters that are mapped to lower-case (there are also a few | |||
| characters that are given special-case mapping treatment in | characters that are given special-case mapping treatment in | |||
| Stringprep). | Stringprep). [[anchor29: The text above is too obscure, but was | |||
| | intended to address the mapping differences between IDNA2003 and the | |||
| | current proposal. Patrik suggests the following, which will need | |||
| | some tuning before it can be inserted: One source of such forms will | |||
| | be labels created under IDNA2003 as some allowed labels were | |||
| | transformed before they were turned into its ascii (xn--) form so | |||
| | that ToUnicode(ToASCII(label)) != label. This is why IDNA2008 | |||
| | explicitly define A-label and U-label being a form of the label that | |||
| | is stable when converting between A-label and U-label, without | |||
| | mappings. A different way of explaining this is that there could be | |||
| | already today domain names in files on the Internet that use | |||
| | characters that cannot be represented directly in domain names but | |||
| | for which interpretations are provided. There are two major | |||
| | categories of such characters, those that are removed by NFKC | |||
| | normalization and those upper-case characters that are mapped to | |||
| | lower-case (there are also a few characters that are given special- | |||
| | case mapping treatment in Stringprep).]] | |||
| Other issues in domain name identification and processing arise | Other issues in domain name identification and processing arise | |||
| because IDNA2003 specified that several other characters be treated | because IDNA2003 specified that several other characters be treated | |||
| as equivalent to the ASCII period (dot, full stop) character used as | as equivalent to the ASCII period (dot, full stop) character used as | |||
| a label separator. If a domain name appears in an arbitrary context | a label separator. If a domain name appears in an arbitrary context | |||
| (such as running text), one may be faced with the requirement to know | (such as running text), it is difficult, even with only ASCII | |||
| that a string is a domain name in order to adjust for the different | characters, to know whether a domain name (or a protocol parameter | |||
| forms of dots but also to have traditional dots to recognize that a | like a URI) is present and where it starts and ends. When using | |||
| string is a domain name -- an obvious contradiction. | Unicode this gets even more difficult if treatment of certain special | |||
| | characters (like the dot that separates labels in a domain name) | |||
| | depends on context. That problem occurs if the dot is part of a | |||
| | domain name or not, which would mean that, contrary to common | |||
| | practice today, the primary heuristic for identifying a domain name | |||
| | depends on dots separating strings with no intervening spaces. | |||
| | [[anchor30: Above text is a substitute for an earlier (pre -01) | |||
| | version and is hoped to be more clear. Comments and improvements | |||
| | welcome.]] | |||
| As discussed elsewhere in this document, the IDNA2008 model removes | As discussed elsewhere in this document, the IDNA2008 model removes | |||
| all of these mappings and interpretations, including the equivalence | all of these mappings and interpretations, including the equivalence | |||
| of different forms of dots, from the protocol, leaving such mappings | of different forms of dots, from the protocol, leaving such mappings | |||
| to local processing. This should not be taken to imply that local | to local processing. This should not be taken to imply that local | |||
| processing is optional or can be avoided entirely. Instead, unless | processing is optional or can be avoided entirely. Instead, unless | |||
| the program context is such that it is known that any IDNs that | the program context is such that it is known that any IDNs that | |||
| appear will be either U-labels or A-labels, some local processing of | appear will be either U-labels or A-labels, some local processing of | |||
| apparent domain name strings will be required, both to maintain | apparent domain name strings will be required, both to maintain | |||
| compatibility with IDNA2003 and to prevent user astonishment. Such | compatibility with IDNA2003 and to prevent user astonishment. Such | |||
| skipping to change at page 28, line 36 ¶ | skipping to change at page 28, line 48 ¶ | |||
| version 7 (IE7), utterly refuse to handle "strange" characters at | version 7 (IE7), utterly refuse to handle "strange" characters at | |||
| all if they appear in U-label form. None of those local decisions | all if they appear in U-label form. None of those local decisions | |||
| are a threat to interoperability as long as (i) only U-labels and | are a threat to interoperability as long as (i) only U-labels and | |||
| A-labels are used in interchange with systems outside the local | A-labels are used in interchange with systems outside the local | |||
| environment, (ii) no character that would be valid in a U-label as | environment, (ii) no character that would be valid in a U-label as | |||
| itself is mapped to something else, (iii) any local mappings are | itself is mapped to something else, (iii) any local mappings are | |||
| applied as a preprocessing step (or, for conversions from U-labels | applied as a preprocessing step (or, for conversions from U-labels | |||
| or A-labels to presentation forms, postprocessing), not as part of | or A-labels to presentation forms, postprocessing), not as part of | |||
| IDNA processing proper, and (iv) appropriate consideration is | IDNA processing proper, and (iv) appropriate consideration is | |||
| given to labels that might have entered the environment in | given to labels that might have entered the environment in | |||
| conformance to IDNA2003. | conformance to IDNA2003. [[anchor31: Placeholder: there have been | |||
| | suggestions that this text be removed entirely. Comments (or | |||
| | improved text) welcome.]] | |||
| 10. Migration and Version Synchronization | 10. Migration and Version Synchronization | |||
| 10.1. Design Criteria | 10.1. Design Criteria | |||
| As mentioned above and in RFC 4690, two key goals of this work are to | As mentioned above and in RFC 4690, two key goals of this work are to | |||
| enable applications to be agnostic about whether they are being run | enable applications to be agnostic about whether they are being run | |||
| in environments supporting any Unicode version from 3.2 onward and to | in environments supporting any Unicode version from 3.2 onward and to | |||
| permit incrementally adding permitted scripts and other character | permit incrementally adding permitted scripts and other character | |||
| collections without disruption or, subsequent to this version, | collections without disruption or, subsequent to this version, | |||
| skipping to change at page 29, line 28 ¶ | skipping to change at page 29, line 42 ¶ | |||
| discussion and rationale for the symbol decision appears in | discussion and rationale for the symbol decision appears in | |||
| Section 10.5). | Section 10.5). | |||
| o Other than in very exceptional cases, e.g., where they are needed | o Other than in very exceptional cases, e.g., where they are needed | |||
| to write substantially any word of a given language, punctuation | to write substantially any word of a given language, punctuation | |||
| characters are excluded as well. The fact that a word exists is | characters are excluded as well. The fact that a word exists is | |||
| not proof that it should be usable in a DNS label and DNS labels | not proof that it should be usable in a DNS label and DNS labels | |||
| are not expected to be usable for multiple-word phrases (although | are not expected to be usable for multiple-word phrases (although | |||
| they are certainly not prohibited if the conventions and | they are certainly not prohibited if the conventions and | |||
| orthography of a particular language cause that to be possible). | orthography of a particular language cause that to be possible). | |||
| | Even for English, very common constructions -- contractions like | |||
| | "don't" or "it's", names that are written with apostrophes such as | |||
| | "O'Reilly" or characters for which apostrophes are common | |||
| | substitutes, and words whose usually-preferred spellings retain | |||
| | diacritical marks from earlier forms -- cannot be represented in | |||
| | DNS labels. | |||
| o Characters that are unassigned (have no character assignment at | o Characters that are unassigned (have no character assignment at | |||
| all) in the version of Unicode being used by the registry or | all) in the version of Unicode being used by the registry or | |||
| application are not permitted, even on resolution (lookup). There | application are not permitted, even on resolution (lookup). There | |||
| are at least two reasons for this. Tests involving the context of | are at least two reasons for this. Tests involving the context of | |||
| characters (e.g., some characters being permitted only adjacent to | characters (e.g., some characters being permitted only adjacent to | |||
| ones of specific types but otherwise invisible or very problematic | ones of specific types but otherwise invisible or very problematic | |||
| for other reasons) and integrity tests on complete labels are | for other reasons) and integrity tests on complete labels are | |||
| needed. Unassigned code points cannot be permitted because one | needed. Unassigned code points cannot be permitted because one | |||
| cannot determine whether particular code points will require | cannot determine whether particular code points will require | |||
| contextual rules (and what those rules should be)7 before | contextual rules (and what those rules should be) before | |||
| characters are assigned to them and the properties of those | characters are assigned to them and the properties of those | |||
| characters fully understood. Second, Unicode specifies that an | characters fully understood. Second, Unicode specifies that an | |||
| unassigned code point normalizes and case folds to itself. If the | unassigned code point normalizes and case folds to itself. If the | |||
| code point is later assigned to a character, and particularly if | code point is later assigned to a character, and particularly if | |||
| the newly-assigned code point has a combining class that | the newly-assigned code point has a combining class that | |||
| determines its placement relative to other combining characters, | determines its placement relative to other combining characters, | |||
| it could normalize to some other code point or sequence, creating | it could normalize to some other code point or sequence, creating | |||
| confusion and/or violating other rules listed here. | confusion and/or violating other rules listed here. | |||
| o Any character that is mapped to another character by Nameprep2003 | o Any character that is mapped to another character by Nameprep2003 | |||
| skipping to change at page 30, line 14 ¶ | skipping to change at page 30, line 34 ¶ | |||
| environments, context, or users. | environments, context, or users. | |||
| Tables used to identify the characters that are IDNA-valid are | Tables used to identify the characters that are IDNA-valid are | |||
| expected to be driven by the principles above (described in more | expected to be driven by the principles above (described in more | |||
| precise form in [IDNA2008-Tables]). The principles are not just an | precise form in [IDNA2008-Tables]). The principles are not just an | |||
| interpretation of the tables. | interpretation of the tables. | |||
| 10.1.2. Labels in Registration | 10.1.2. Labels in Registration | |||
| Anyone entering a label into a DNS zone must properly validate that | Anyone entering a label into a DNS zone must properly validate that | |||
| label -- i.e., be sure that the criteria for an A-label are met -- in | label -- i.e., be sure that the criteria for that label are met -- in | |||
| order for Unicode version-independence to be possible. In | order for applications to work as intended. This principle is not | |||
| particular: | new: for example, zone administrators are expected to verify that | |||
| names meet "hostname" [RFC0952] or special service location formats | ||||
| [RFC2782] where necessary for the expected applications. For zones | ||||
| that will contain IDNs, support for Unicode version-independence | ||||
| requires restrictions on all strings placed in the zone. In | ||||
| particular, for such zones: | ||||
| o Any label that contains hyphens as its third and fourth characters | o Any label that appears to be an A-label, i.e., any label that | |||
| MUST be IDNA-valid. This implies that, (i) if the third and | starts in "xn--", MUST be IDNA-valid, i.e., it MUST be a | |||
| fourth characters are hyphens, the first and second ones MUST be | valid A-label, as discussed in Section 3 above. | |||
| "xn" until and unless this specification is updated to permit | ||||
| other prefixes and (ii) labels starting in "xn--" MUST be valid | ||||
| A-labels, as discussed in Section 3 above. | ||||
| o The Unicode tables (i.e., tables of code points, character | o The Unicode tables (i.e., tables of code points, character | |||
| classes, and properties) and IDNA tables (i.e., tables of | classes, and properties) and IDNA tables (i.e., tables of | |||
| contextual rules such as those described above), MUST be | contextual rules such as those described above), MUST be | |||
| consistent on the systems performing or validating labels to be | consistent on the systems performing or validating labels to be | |||
| registered. Note that this does not require that tables reflect | registered. Note that this does not require that tables reflect | |||
| the latest version of Unicode, only that all tables used on a | the latest version of Unicode, only that all tables used on a | |||
| given system are consistent with each other. | given system are consistent with each other. | |||
| [[anchor33: Note in draft: the above text was changed significantly | ||||
| between -00 and -01 to clearly restrict its scope to zones supporting | ||||
| IDNA and to eliminate comments about labels containing "--" in the | ||||
| third and fourth positions but with different prefixes. There appears | ||||
| to be consensus that more extensive rules belong in a "best | ||||
| practices" document about appropriate DNS labels, but that document | ||||
| is not in-scope for the IDNABIS WG.]] | ||||
| Under this model, a registry (or entity communicating with a registry | Under this model, a registry (or entity communicating with a registry | |||
| to accomplish name registrations) will need to update its tables -- | to accomplish name registrations) will need to update its tables -- | |||
| both the Unicode-associated tables and the tables of permitted IDN | both the Unicode-associated tables and the tables of permitted IDN | |||
| characters -- to enable a new script or other set of new characters. | characters -- to enable a new script or other set of new characters. | |||
| It will not be affected by newer versions of Unicode, or newly- | It will not be affected by newer versions of Unicode, or newly- | |||
| authorized characters, until and unless it wishes to make those | authorized characters, until and unless it wishes to make those | |||
| registrations. The registration side is also responsible --under the | registrations. The registration side is also responsible --under the | |||
| protocol and to registrants and users-- for much more careful | protocol and to registrants and users-- for much more careful | |||
| checking than is expected of applications systems that look names up, | checking than is expected of applications systems that look names up, | |||
| both checking as required by the protocol and checking required by | both checking as required by the protocol and checking required by | |||
| whatever policies it develops for minimizing risks due to confusable | whatever policies it develops for minimizing risks due to confusable | |||
| characters and sequences and preserving language or script integrity. | characters and sequences and preserving language or script integrity. | |||
| Systems looking up or resolving DNS labels MUST be able to assume | Systems looking up or resolving DNS labels, especially IDN DNS | |||
| that applicable registration rules were followed for names entered | labels, MUST be able to assume that applicable registration rules | |||
| into the DNS. | were followed for names entered into the DNS. | |||
| 10.1.3. Labels in Resolution (Lookup) | 10.1.3. Labels in Resolution (Lookup) | |||
| Anyone looking up a label in a DNS zone | Anyone looking up a label in a DNS zone | |||
| o MUST maintain a consistent set of tables, as discussed above. As | o MUST maintain a consistent set of tables, as discussed above. As | |||
| with registration, the tables need not reflect the latest version | with registration, the tables need not reflect the latest version | |||
| of Unicode but they MUST be consistent. | of Unicode but they MUST be consistent. | |||
| o MUST validate the characters in labels to be looked up only to the | o MUST validate the characters in labels to be looked up only to the | |||
| skipping to change at page 36, line 22 ¶ | skipping to change at page 36, line 47 ¶ | |||
| there are no uniform conventions for naming; variations such as | there are no uniform conventions for naming; variations such as | |||
| outline, solid, and shaded forms may or may not exist; and so on. | outline, solid, and shaded forms may or may not exist; and so on. | |||
| As just one example, consider a "heart" symbol as it might appear | As just one example, consider a "heart" symbol as it might appear | |||
| in a logo that might be read as "I love...". While the user might | in a logo that might be read as "I love...". While the user might | |||
| read such a logo as "I love..." or "I heart...", considerable | read such a logo as "I love..." or "I heart...", considerable | |||
| knowledge of the coding distinctions made in Unicode is needed to | knowledge of the coding distinctions made in Unicode is needed to | |||
| know that there is more than one "heart" character (e.g., U+2665, | know that there is more than one "heart" character (e.g., U+2665, | |||
| U+2661, and U+2765) and how to describe it. These issues are of | U+2661, and U+2765) and how to describe it. These issues are of | |||
| particular importance if strings are expected to be understood or | particular importance if strings are expected to be understood or | |||
| transcribed by the listener after being read out loud. | transcribed by the listener after being read out loud. | |||
| [[anchor35: The above paragraph remains controversial as to | ||||
| whether it is valid. The WG will need to make a decision if this | ||||
| section is not dropped entirely.]] | ||||
| o As a simplified example of this, assume one wanted to use a | o As a simplified example of this, assume one wanted to use a | |||
| "heart" or "star" symbol in a label. This is problematic because | "heart" or "star" symbol in a label. This is problematic because | |||
| those names are ambiguous in the Unicode system of naming (the | those names are ambiguous in the Unicode system of naming (the | |||
| actual Unicode names require far more qualification). A user or | actual Unicode names require far more qualification). A user or | |||
| would-be registrant has no way to know --absent careful study of | would-be registrant has no way to know --absent careful study of | |||
| the code tables-- whether it is ambiguous (e.g., where there are | the code tables-- whether it is ambiguous (e.g., where there are | |||
| multiple "heart" characters) or not. Conversely, the user seeing | multiple "heart" characters) or not. Conversely, the user seeing | |||
| the hypothetical label doesn't know whether to read it --try to | the hypothetical label doesn't know whether to read it --try to | |||
| transmit it to a colleague by voice-- as "heart", as "love", as | transmit it to a colleague by voice-- as "heart", as "love", as | |||
| skipping to change at page 37, line 5 ¶ | skipping to change at page 37, line 35 ¶ | |||
| Unicode than there are hearts or stars. | Unicode than there are hearts or stars. | |||
| o The consequence of these ambiguities of description and | o The consequence of these ambiguities of description and | |||
| dependencies on distinctions that were, or were not, made in | dependencies on distinctions that were, or were not, made in | |||
| Unicode codings, is that symbols are a very poor basis for | Unicode codings, is that symbols are a very poor basis for | |||
| reliable communication. Of course, these difficulties with | reliable communication. Of course, these difficulties with | |||
| symbols do not arise with actual pictographic languages and | symbols do not arise with actual pictographic languages and | |||
| scripts which would be treated like any other language characters; | scripts which would be treated like any other language characters; | |||
| the two should not be confused. | the two should not be confused. | |||
| [[anchor32: Note in Draft: Should the above section be significantly | [[anchor36: Note in Draft: Should the above section be significantly | |||
| trimmed or eliminated?]] | trimmed or eliminated?]] | |||
| 10.6. Migration Between Unicode Versions: Unassigned Code Points | 10.6. Migration Between Unicode Versions: Unassigned Code Points | |||
| In IDNA2003, labels containing unassigned code points are resolved on | In IDNA2003, labels containing unassigned code points are resolved on | |||
| the theory that, if they appear in labels and can be resolved, the | the theory that, if they appear in labels and can be resolved, the | |||
| relevant standards must have changed and the registry has properly | relevant standards must have changed and the registry has properly | |||
| allocated only assigned values. | allocated only assigned values. | |||
| In this specification, strings containing unassigned code points MUST | In this specification, strings containing unassigned code points MUST | |||
| skipping to change at page 38, line 27 ¶ | skipping to change at page 39, line 8 ¶ | |||
| does not get an xn-- prefix, but the string that can be displayed to | does not get an xn-- prefix, but the string that can be displayed to | |||
| a user appears to be an IDN. The proposed IDNA2008 eliminates this | a user appears to be an IDN. The proposed IDNA2008 eliminates this | |||
| artifact. A character is either permitted as itself or it is | artifact. A character is either permitted as itself or it is | |||
| prohibited; special cases that make sense only in a particular | prohibited; special cases that make sense only in a particular | |||
| linguistic or cultural context can be dealt with as localization | linguistic or cultural context can be dealt with as localization | |||
| matters where appropriate. | matters where appropriate. | |||
| 11. Acknowledgments | 11. Acknowledgments | |||
| The editor and contributors would like to express their thanks to | The editor and contributors would like to express their thanks to | |||
| those who contributed significant early review comments, sometimes | those who contributed significant early (pre-WG) review comments, | |||
| accompanied by text, especially Mark Davis, Paul Hoffman, Simon | sometimes accompanied by text, especially Mark Davis, Paul Hoffman, | |||
| Josefsson, and Sam Weiler. In addition, some specific ideas were | Simon Josefsson, and Sam Weiler. In addition, some specific ideas | |||
| incorporated from suggestions, text, or comments about sections that | were incorporated from suggestions, text, or comments about sections | |||
| were unclear supplied by Frank Ellerman, Michael Everson, Asmus | that were unclear supplied by Frank Ellerman, Michael Everson, Asmus | |||
| Freytag, Erik van der Poel, Michel Suignard, and Ken Whistler, | Freytag, Erik van der Poel, Michel Suignard, and Ken Whistler, | |||
| although, as usual, they bear little or no responsibility for the | although, as usual, they bear little or no responsibility for the | |||
| conclusions the editor and contributors reached after receiving their | conclusions the editor and contributors reached after receiving their | |||
| suggestions. Thanks are also due to Vint Cerf, Debbie Garside, and | suggestions. Thanks are also due to Vint Cerf, Debbie Garside, and | |||
| Jefsey Morphin for conversations that led to considerable | Jefsey Morphin for conversations that led to considerable | |||
| improvements in the content of this document. | improvements in the content of this document. | |||
| A meeting was held on 30 January 2008 to attempt to reconcile | A meeting was held on 30 January 2008 to attempt to reconcile | |||
| differences in perspective and terminology about this set of | differences in perspective and terminology about this set of | |||
| specifications between the design team and members of the Unicode | specifications between the design team and members of the Unicode | |||
| skipping to change at page 39, line 8 ¶ | skipping to change at page 39, line 36 ¶ | |||
| alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam, | alphabetic order as usual) Harald Alvestrand, Vint Cerf, Tina Dam, | |||
| Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary | Mark Davis, Lisa Dusseault, Patrik Faltstrom (by telephone), Cary | |||
| Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel, | Karp, John Klensin, Warren Kumari, Lisa Moore, Erik van der Poel, | |||
| Michel Suignard, and Ken Whistler. We express our thanks to Google | Michel Suignard, and Ken Whistler. We express our thanks to Google | |||
| for support of that meeting and to the participants for their | for support of that meeting and to the participants for their | |||
| contributions. | contributions. | |||
| Special thanks are due to Paul Hoffman for permission to extract | Special thanks are due to Paul Hoffman for permission to extract | |||
| material from his Internet-Draft to form the basis for Section 2. | material from his Internet-Draft to form the basis for Section 2. | |||
| Useful comments and text on the WG versions of the draft were | ||||
| received from many participants in the IETF "IDNABIS" WG and a number | ||||
| of document changes resulted from mailing list discussions made by | ||||
| that group. | ||||
| 12. Contributors | 12. Contributors | |||
| While the listed editor held the pen, the core of this document and | While the listed editor held the pen, the core of this document and | |||
| the initial WG version represent the joint work and conclusions of | the initial WG version represent the joint work and conclusions of | |||
| an ad hoc design team consisting of the editor and, in alphabetic | an ad hoc design team consisting of the editor and, in alphabetic | |||
| order, Harald Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp. | order, Harald Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp. | |||
| In addition, there were many specific contributions and helpful | In addition, there were many specific contributions and helpful | |||
| comments from those listed in the Acknowledgments section and others | comments from those listed in the Acknowledgments section and others | |||
| who have contributed to the development and use of the IDNA | who have contributed to the development and use of the IDNA | |||
| protocols. | protocols. | |||
| skipping to change at page 39, line 39 ¶ | skipping to change at page 40, line 25 ¶ | |||
| in programming and validation requires a registry of characters and | in programming and validation requires a registry of characters and | |||
| scripts and their categories, updated for each new version of Unicode | scripts and their categories, updated for each new version of Unicode | |||
| and the characters it contains. The details of this registry are | and the characters it contains. The details of this registry are | |||
| specified in [IDNA2008-Tables]. | specified in [IDNA2008-Tables]. | |||
| 13.2. IDNA Context Registry | 13.2. IDNA Context Registry | |||
| For characters that are defined in the IDNA Character Registry list | For characters that are defined in the IDNA Character Registry list | |||
| as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of | as PROTOCOL-VALID but requiring a contextual rule (i.e., the types of | |||
| rule described in Section 6.1.1.1), IANA will create and maintain a | rule described in Section 6.1.1.1), IANA will create and maintain a | |||
| list of approved contextual rules, using the "expert reviewer" | list of approved contextual rules. Additions or changes to these | |||
| model. Unlike usual practice, we recommend that the "expert | rules require IETF Review, as described in [RFC5226]. | |||
| reviewer" be a committee that reflects expertise on the relevant | [[anchor41: Note in Draft: This section was changed between -00 and | |||
| scripts, and encourage IANA, the IESG, and IAB to establish liaisons | -01 based on list discussion. Consensus needs to be verified for | |||
| and work together with other relevant standards bodies to populate | that decision.]] | |||
| that committee and its procedures over the long term. [[anchor37: | ||||
| Note in Draft: This section requires careful review by the WG, since | ||||
| "expert review" may not be appropriate but other mechanisms may be | ||||
| excessively burdensome.]] | ||||
| A table from which that registry can be initialized, and some further | A table from which that registry can be initialized, and some further | |||
| discussion, appears in [RulesInit]. | discussion, appears in [RulesInit]. | |||
| [[anchor42: This subsection should probably be moved to Tables along | ||||
| with the Contextual rules themselves (from Protocol) when the move is | ||||
| made.]] | ||||
| 13.3. IANA Repository of IDN Practices of TLDs | 13.3. IANA Repository of IDN Practices of TLDs | |||
| This registry, historically described as the "IANA Language Character | This registry, historically described as the "IANA Language Character | |||
| Set Registry" or "IANA Script Registry" (both somewhat misleading | Set Registry" or "IANA Script Registry" (both somewhat misleading | |||
| terms) is maintained by IANA at the request of ICANN. It is used to | terms) is maintained by IANA at the request of ICANN. It is used to | |||
| provide a central documentation repository of the IDN policies used | provide a central documentation repository of the IDN policies used | |||
| by top level domain (TLD) registries who volunteer to contribute to | by top level domain (TLD) registries who volunteer to contribute to | |||
| it and is used in conjunction with ICANN Guidelines for IDN use. | it and is used in conjunction with ICANN Guidelines for IDN use. | |||
| skipping to change at page 41, line 46 ¶ | skipping to change at page 42, line 31 ¶ | |||
| synchronization dependency between IDNA changes and possible upgrades | synchronization dependency between IDNA changes and possible upgrades | |||
| to security protocols or conventions. | to security protocols or conventions. | |||
| No mechanism involving names or identifiers alone can protect against a | No mechanism involving names or identifiers alone can protect against a | |||
| wide variety of security threats and attacks that are largely independent | wide variety of security threats and attacks that are largely independent | |||
| of them including spoofed pages, DNS query trapping and diversion, | of them including spoofed pages, DNS query trapping and diversion, | |||
| and so on. | and so on. | |||
| 15. Change Log | 15. Change Log | |||
| [[anchor40: RFC Editor: Please remove this section.]] | [[anchor45: RFC Editor: Please remove this section.]] | |||
| For version 00 of draft-ietf-idnabis-rational, this list contains a | For version 00 of draft-ietf-idnabis-rationale, this list contains a | |||
| complete trace going back through the earlier, design team, drafts. | complete trace going back through the earlier, design team, drafts. | |||
| That earlier material will be removed in subsequent drafts. | Material earlier than that described in Section 15.9 will be removed | |||
| in WG draft -02. | ||||
| 15.1. Version -01 of draft-klensin-idnabis-issues | 15.1. Version -01 of draft-klensin-idnabis-issues | |||
| Version -01 of this document is a considerable rewrite from -00. | Version -01 of this document is a considerable rewrite from -00. | |||
| Many sections have been clarified or extended and several new | Many sections have been clarified or extended and several new | |||
| sections have been added to reflect discussions in a number of | sections have been added to reflect discussions in a number of | |||
| contexts since -00 was issued. | contexts since -00 was issued. | |||
| 15.2. Version -02 of draft-klensin-idnabis-issues | 15.2. Version -02 of draft-klensin-idnabis-issues | |||
| skipping to change at page 44, line 44 ¶ | skipping to change at page 45, line 29 ¶ | |||
| e.g., whether Section 7.3, Section 7.4, or Section 10.3 provide | e.g., whether Section 7.3, Section 7.4, or Section 10.3 provide | |||
| enough value to be worth retaining? Can Section 10.4 be trimmed | enough value to be worth retaining? Can Section 10.4 be trimmed | |||
| without loss of useful information and, if so, how? Section 10.7 | without loss of useful information and, if so, how? Section 10.7 | |||
| appears critical of IDNA2003 in undesirable ways: should it be | appears critical of IDNA2003 in undesirable ways: should it be | |||
| dropped or do people have suggestions about how to improve it? | dropped or do people have suggestions about how to improve it? | |||
| Strong opinions have been expressed that Section 10.5 should be | Strong opinions have been expressed that Section 10.5 should be | |||
| trimmed significantly or removed entirely. The WG will need to | trimmed significantly or removed entirely. The WG will need to | |||
| discuss that too. Are there other materials that should be trimmed | discuss that too. Are there other materials that should be trimmed | |||
| out? | out? | |||
| 15.9. Version -01 of draft-ietf-idnabis-rationale | ||||
| o Clarified the U-label definition to note that U-labels must | ||||
| contain at least one non-ASCII character. Also clarified the | ||||
| relationship among label types. | ||||
| o Rewrote the discussion of Labels in Registration (Section 10.1.2) | ||||
| and related text in Section 1.5.4.1.1 to narrow its focus and | ||||
| remove more general restrictions. Added a temporary note in line | ||||
| to explain the situation. | ||||
| o Changed the "IDNA uses Unicode" statement to focus on | ||||
| compatibility with IDNA2003 and avoid more general or | ||||
| controversial assertions. | ||||
| o Added a discussion of examples to Section 10.1 | ||||
| o Made a number of other small editorial changes and corrections | ||||
| suggested by Mark Davis. | ||||
| o Added several more discussion anchors and notes and expanded or | ||||
| updated some existing ones. | ||||
| 16. References | 16. References | |||
| 16.1. Normative References | 16.1. Normative References | |||
| [ASCII] American National Standards Institute (formerly United | [ASCII] American National Standards Institute (formerly United | |||
| States of America Standards Institute), "USA Code for | States of America Standards Institute), "USA Code for | |||
| Information Interchange", ANSI X3.4-1968, 1968. | Information Interchange", ANSI X3.4-1968, 1968. | |||
| ANSI X3.4-1968 has been replaced by newer versions with | ANSI X3.4-1968 has been replaced by newer versions with | |||
| slight modifications, but the 1968 version remains | slight modifications, but the 1968 version remains | |||
| definitive for the Internet. | definitive for the Internet. | |||
| [IDNA2008-Bidi] | [IDNA2008-Bidi] | |||
| Alvestrand, H. and C. Karp, "An updated IDNA criterion for | Alvestrand, H. and C. Karp, "An updated IDNA criterion for | |||
| right to left scripts", February 2008, <http:// | right to left scripts", July 2008, <http://www.ietf.org/ | |||
| www.ietf.org/internet-drafts/ | internet-drafts/draft-ietf-idnabs-bidi-01.txt>. | |||
| draft-alvestrand-idna-bidi-04.txt>. | ||||
| New version of this document pending as | ||||
| draft-ietf-idnabis-bidi-00. | ||||
| [IDNA2008-Protocol] | [IDNA2008-Protocol] | |||
| Klensin, J., "Internationalizing Domain Names in | Klensin, J., "Internationalized Domain Names in | |||
| Applications (IDNA): Protocol", May 2008, <http:// | Applications (IDNA): Protocol", July 2008, <http:// | |||
| www.ietf.org/internet-drafts/ | www.ietf.org/internet-drafts/ | |||
| draft-ietf-idnabis-protocol-00.txt>. | draft-ietf-idnabis-protocol-02.txt>. | |||
| [IDNA2008-Tables] | [IDNA2008-Tables] | |||
| Faltstrom, P., "The Unicode Code Points and IDNA", | Faltstrom, P., "The Unicode Code Points and IDNA", | |||
| April 2008, <http://www.ietf.org/internet-drafts/ | May 2008, <http://www.ietf.org/internet-drafts/ | |||
| draft-ietf-idnabis-tables-00.txt>. | draft-ietf-idnabis-tables-01.txt>. | |||
| A version of this document is available in HTML format at | A version of this document is available in HTML format at | |||
| http://stupid.domain.name/idnabis/ | http://stupid.domain.name/idnabis/ | |||
| draft-ietf-idnabis-tables-00.html | draft-ietf-idnabis-tables-01.html | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of | [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of | |||
| Internationalized Strings ("stringprep")", RFC 3454, | Internationalized Strings ("stringprep")", RFC 3454, | |||
| December 2002. | December 2002. | |||
| [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, | [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, | |||
| "Internationalizing Domain Names in Applications (IDNA)", | "Internationalizing Domain Names in Applications (IDNA)", | |||
| RFC 3490, March 2003. | RFC 3490, March 2003. | |||
| [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep | |||
| Profile for Internationalized Domain Names (IDN)", | Profile for Internationalized Domain Names (IDN)", | |||
| RFC 3491, March 2003. | RFC 3491, March 2003. | |||
| [RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode | [RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode | |||
| for Internationalized Domain Names in Applications | for Internationalized Domain Names in Applications | |||
| (IDNA)", RFC 3492, March 2003. | (IDNA)", RFC 3492, March 2003. | |||
| [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an | ||||
| IANA Considerations Section in RFCs", BCP 26, RFC 5226, | ||||
| May 2008. | ||||
| [RulesInit] | [RulesInit] | |||
| Klensin, J., "Internationalizing Domain Names in | Klensin, J., "Internationalizing Domain Names in | |||
| Applications (IDNA): Protocol, Appendix A Contextual Rules | Applications (IDNA): Protocol, Appendix A Contextual Rules | |||
| Table", May 2008, <http://www.ietf.org/internet-drafts/ | Table", July 2008, <http://www.ietf.org/internet-drafts/ | |||
| draft-ietf-idnabis-protocol-00.txt>. | draft-ietf-idnabis-protocol-02.txt>. | |||
| Forthcoming. | ||||
| [Unicode-PropertyValueAliases] | ||||
| The Unicode Consortium, "Unicode Character Database: | ||||
| PropertyValueAliases", March 2008, <http:// | ||||
| www.unicode.org/Public/UNIDATA/PropertyValueAliases.txt>. | ||||
| [Unicode-RegEx] | ||||
| The Unicode Consortium, "Unicode Technical Standard #18: | ||||
| Unicode Regular Expressions", May 2005, | ||||
| <http://www.unicode.org/reports/tr18/>. | ||||
| [Unicode-Scripts] | ||||
| The Unicode Consortium, "Unicode Standard Annex #24: | ||||
| Unicode Script Property", February 2008, | ||||
| <http://www.unicode.org/reports/tr24/>. | ||||
| [Unicode51] | [Unicode51] | |||
| The Unicode Consortium, "The Unicode Standard, Version | The Unicode Consortium, "The Unicode Standard, Version | |||
| 5.1.0", 2008. | 5.1.0", 2008. | |||
| defined by: The Unicode Standard, Version 5.0, Boston, MA, | defined by: The Unicode Standard, Version 5.0, Boston, MA, | |||
| Addison-Wesley, 2007, ISBN 0-321-48091-0, as amended by | Addison-Wesley, 2007, ISBN 0-321-48091-0, as amended by | |||
| Unicode 5.1.0 | Unicode 5.1.0 | |||
| (http://www.unicode.org/versions/Unicode5.1.0/). | (http://www.unicode.org/versions/Unicode5.1.0/). | |||
| skipping to change at page 47, line 6 ¶ | skipping to change at page 47, line 45 ¶ | |||
| O'Reilly & Associates, 1999 | O'Reilly & Associates, 1999 | |||
| [GB18030] "Chinese National Standard GB 18030-2000: Information | [GB18030] "Chinese National Standard GB 18030-2000: Information | |||
| Technology -- Chinese ideograms coded character set for | Technology -- Chinese ideograms coded character set for | |||
| information interchange -- Extension for the basic set.", | information interchange -- Extension for the basic set.", | |||
| 2000. | 2000. | |||
| [RFC0810] Feinler, E., Harrenstien, K., Su, Z., and V. White, "DoD | [RFC0810] Feinler, E., Harrenstien, K., Su, Z., and V. White, "DoD | |||
| Internet host table specification", RFC 810, March 1982. | Internet host table specification", RFC 810, March 1982. | |||
| [RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet | ||||
| host table specification", RFC 952, October 1985. | ||||
| [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", | [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", | |||
| STD 13, RFC 1034, November 1987. | STD 13, RFC 1034, November 1987. | |||
| [RFC1035] Mockapetris, P., "Domain names - implementation and | [RFC1035] Mockapetris, P., "Domain names - implementation and | |||
| specification", STD 13, RFC 1035, November 1987. | specification", STD 13, RFC 1035, November 1987. | |||
| [RFC1123] Braden, R., "Requirements for Internet Hosts - Application | [RFC1123] Braden, R., "Requirements for Internet Hosts - Application | |||
| and Support", STD 3, RFC 1123, October 1989. | and Support", STD 3, RFC 1123, October 1989. | |||
| [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for | [RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for | |||
| End of changes. 69 change blocks. | ||||
| 166 lines changed or deleted | 229 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
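
The registration rule in Section 10.1.2 (any label that starts in "xn--" MUST be a valid A-label) and the prohibition on unassigned code points can be illustrated with a minimal sketch. It is not the check defined by the drafts. It assumes Python's standard "punycode" codec and the Unicode tables bundled with the local Python build stand in for the consistent set of tables the text requires. The function names `has_unassigned` and `xn_label_precheck` are hypothetical. The sketch does not consult the IDNA2008 tables or contextual rules, so passing it is necessary but not sufficient for a label to be IDNA-valid.

```python
import unicodedata

ACE_PREFIX = "xn--"


def has_unassigned(text: str) -> bool:
    """Return True if any code point is unassigned (general category Cn)
    according to the Unicode tables bundled with this Python build."""
    return any(unicodedata.category(ch) == "Cn" for ch in text)


def xn_label_precheck(label: str) -> bool:
    """Partial, hypothetical registration-side check for one label.

    Labels that do not start with "xn--" are outside the rule and pass.
    Labels that do must decode as Punycode, contain at least one
    non-ASCII character, contain no unassigned code points, and
    re-encode to the same string. Full IDNA validity is NOT checked.
    """
    if not label.lower().startswith(ACE_PREFIX):
        return True  # rule applies only to labels that look like A-labels
    encoded = label[len(ACE_PREFIX):].lower()
    try:
        decoded = encoded.encode("ascii").decode("punycode")
    except UnicodeError:
        return False  # non-ASCII input or malformed Punycode
    if all(ord(ch) < 128 for ch in decoded):
        return False  # a U-label must contain a non-ASCII character
    if has_unassigned(decoded):
        return False  # unassigned in the local tables: reject
    # Round trip guards against non-canonical encodings.
    return decoded.encode("punycode").decode("ascii") == encoded


if __name__ == "__main__":
    for candidate in ("xn--mnchen-3ya", "xn--abc-", "example"):
        print(candidate, xn_label_precheck(candidate))
```

Because `has_unassigned` depends on whichever Unicode version the interpreter ships, two systems with different tables can disagree about the same label. That is the situation the consistency requirement in Section 10.1.2 is meant to confine to a single system's tables.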