| < draft-faltstrom-unicode11-02.txt | draft-faltstrom-unicode11-03.txt > | |||
|---|---|---|---|---|
| Network Working Group P. Faltstrom | Network Working Group P. Faltstrom | |||
| Internet-Draft Netnod | Internet-Draft Netnod | |||
| Intended status: Informational September 25, 2018 | Intended status: Informational October 02, 2018 | |||
| Expires: March 29, 2019 | Expires: April 5, 2019 | |||
| IDNA2008 and Unicode 11.0.0 | IDNA2008 and Unicode 11.0.0 | |||
| draft-faltstrom-unicode11-02 | draft-faltstrom-unicode11-03 | |||
| Abstract | Abstract | |||
| This document describes changes between Unicode 6.3.0 and Unicode | This document describes changes between Unicode 6.3.0 and Unicode | |||
| 11.0.0 in the context of IDNA2008. It further suggests for the IETF | 11.0.0 in the context of IDNA2008. It further suggests for the IETF | |||
| a path forward regarding ensuring IDNA2008 follows the evolution of | a path forward regarding ensuring IDNA2008 follows the evolution of | |||
| the Unicode Standard. | the Unicode Standard. | |||
| In a few cases changes have been made in the Unicode Standard related | In a few cases changes have been made in the Unicode Standard related | |||
| to the algorithm IDNA2008 specifies. IDNA2008 do give the ability to | to the algorithm IDNA2008 specifies. IDNA2008 do give the ability to | |||
| skipping to change at page 1, line 38 | skipping to change at page 1, line 38 | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on March 29, 2019. | This Internet-Draft will expire on April 5, 2019. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 24 | skipping to change at page 2, line 24 | |||
| 3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 3 | 3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 3 | |||
| 3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 5 | 3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 4. Notable changes between Unicode 6.3.0 and 11.0.0 . . . . . . 6 | 4. Notable changes between Unicode 6.3.0 and 11.0.0 . . . . . . 6 | |||
| 4.1. Changes to Unicode 7.0.0 . . . . . . . . . . . . . . . . 6 | 4.1. Changes to Unicode 7.0.0 . . . . . . . . . . . . . . . . 6 | |||
| 4.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 6 | 4.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 6 | |||
| 4.3. Changes to Unicode 11.0.0 . . . . . . . . . . . . . . . . 6 | 4.3. Changes to Unicode 11.0.0 . . . . . . . . . . . . . . . . 6 | |||
| 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | |||
| 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 8 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 9 | |||
| 9.2. Non-normative references . . . . . . . . . . . . . . . . 9 | 9.2. Non-normative references . . . . . . . . . . . . . . . . 10 | |||
| Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 12 | Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 12 | |||
| Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 15 | Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 15 | |||
| Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 16 | Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 16 | |||
| Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 17 | Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 17 | |||
| Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 18 | Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 18 | |||
| Appendix F. Code points in Unicode Character Database (UCD) | Appendix F. Code points in Unicode Character Database (UCD) | |||
| format for Unicode 11.0.0 . . . . . . . . . . . . . 20 | format for Unicode 11.0.0 . . . . . . . . . . . . . 20 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 79 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 79 | |||
| 1. Introduction | 1. Introduction | |||
| skipping to change at page 5, line 38 | skipping to change at page 5, line 38 | |||
| o A mix between IDNA2003 and IDNA2008 according to local | o A mix between IDNA2003 and IDNA2008 according to local | |||
| interpretation of the Unicode Technical Standard #46 [UTS-46]. | interpretation of the Unicode Technical Standard #46 [UTS-46]. | |||
| The issue is further complicated by having a very diverse | The issue is further complicated by having a very diverse | |||
| implementations of the requirements in RFC 5894 [RFC5894] that | implementations of the requirements in RFC 5894 [RFC5894] that | |||
| registry operators to based on the IDNA2008 specification create | registry operators to based on the IDNA2008 specification create | |||
| additional rules for what code points are allowed to be used for | additional rules for what code points are allowed to be used for | |||
| registration. | registration. | |||
| In practice, the Unicode Consortium set a maximum set of code points | In practice, the Unicode Consortium creates a maximum set of code | |||
| by assigning code points in the Unicode Standard. The IDNA2008 rules | points by assigning code points in the Unicode Standard. The | |||
| based on the Unicode Standard create a subset of these by assigning | IDNA2008 rules based on the Unicode Standard create a subset of these | |||
| the PVALID derived property value to them. Registries (and others | by assigning the PVALID derived property value to them. Registries | |||
| dealing with Internationalized Domain Names) are supposed to create | (and others dealing with Internationalized Domain Names) are supposed | |||
| an even smaller subset that ultimately is the set of code points that | to create an even smaller subset that ultimately is the set of code | |||
| can be used in a particular registry. | points that can be used in a particular registry. | |||
| There is further recommendation to be conservative when these subsets | There is further recommendation to be conservative when these subsets | |||
| are calculated and to use the inclusion principle; this is explained | are calculated and to use the inclusion principle; this is explained | |||
| in SAC-084 [SAC-084] and RFC 6912 [RFC6912]. | in SAC-084 [SAC-084] and RFC 6912 [RFC6912]. | |||
| The complicated situation with deployment of IDNA2008 is discussed | The complicated situation with deployment of IDNA2008 is discussed | |||
| further in draft-klensin-idna-rfc5891bis | further in draft-klensin-idna-rfc5891bis | |||
| [I-D.klensin-idna-rfc5891bis] and draft-freytag-troublesome- | [I-D.klensin-idna-rfc5891bis] and draft-freytag-troublesome- | |||
| characters [I-D.freytag-troublesome-characters]. | characters [I-D.freytag-troublesome-characters]. | |||
| 4. Notable changes between Unicode 6.3.0 and 11.0.0 | 4. Notable changes between Unicode 6.3.0 and 11.0.0 | |||
| 4.1. Changes to Unicode 7.0.0 | 4.1. Changes to Unicode 7.0.0 | |||
| The character ARABIC LETTER BEH WITH HAMZA ABOVE U+08A1 was | The character ARABIC LETTER BEH WITH HAMZA ABOVE U+08A1 was | |||
| introduced in Unicode 7.0.0. This was discussed in the IETF | introduced in Unicode 7.0.0. This was discussed in the IETF | |||
| extensively and IAB in their statement [IAB] requesting the IETF to | extensively and by IAB in their statement [IAB] requesting the IETF | |||
| investigate the issue and specifically IAB stated: | to investigate the issue. Specifically IAB stated: | |||
| On the same precautionary principle, the IAB recommends that the | On the same precautionary principle, the IAB recommends that the | |||
| Internationalized Domain Names for Applications (IDNA) Parameters | Internationalized Domain Names for Applications (IDNA) Parameters | |||
| registry (http://www.iana.org/assignments/idna-tables/) not be | registry (http://www.iana.org/assignments/idna-tables/) not be | |||
| updated to Unicode 7.0.0 until the IETF has consensus on a | updated to Unicode 7.0.0 until the IETF has consensus on a | |||
| solution to this problem. | solution to this problem. | |||
| The discussion in the IETF concluded that although it is possible to | The discussion in the IETF concluded that although it is possible to | |||
| create "the same" character in multiple ways, the issue with U+08A1 | create "the same" character in multiple ways, the issue with U+08A1 | |||
| is not unique. In the case of U+08A1, it can be represented with the | is not unique. In the case of U+08A1, it can be represented with the | |||
| skipping to change at page 6, line 43 | skipping to change at page 6, line 43 | |||
| As U+08A1 is discussed in draft-freytag-troublesome-characters | As U+08A1 is discussed in draft-freytag-troublesome-characters | |||
| [I-D.freytag-troublesome-characters] and elsewhere. Regardless of | [I-D.freytag-troublesome-characters] and elsewhere. Regardless of | |||
| whether those discussions ends in recommending including the code | whether those discussions ends in recommending including the code | |||
| point in the repertoire of characters permissable for registration or | point in the repertoire of characters permissable for registration or | |||
| not, it is acceptable to allow the code point to have a derived | not, it is acceptable to allow the code point to have a derived | |||
| property value of PVALID. | property value of PVALID. | |||
| 4.2. Changes between Unicode 7.0.0 and 10.0.0 | 4.2. Changes between Unicode 7.0.0 and 10.0.0 | |||
| There are no changes made to Unicode between version 7.0.0 and 10.0.0 | There are no changes made to Unicode between version 7.0.0 and 10.0.0 | |||
| that impacts IDNA2008 calculation of the derived property value. | that impact IDNA2008 calculation of the derived property value. | |||
| 4.3. Changes to Unicode 11.0.0 | 4.3. Changes to Unicode 11.0.0 | |||
| The Unicode Standard Version 11.0.0 [Unicode-11.0.0] have included a | The Unicode Standard Version 11.0.0 [Unicode-11.0.0] has included a | |||
| number of changes [Changes-11.0.0] from version 10.0.0, specifically | number of changes [Changes-11.0.0] from version 10.0.0, specifically | |||
| to UnicodeData.txt: | to UnicodeData.txt: | |||
| o Entries were added for the 684 new characters, including letters, | o Entries were added for the 684 new characters, including letters, | |||
| combining marks, digits, symbols, and punctuation marks. | combining marks, digits, symbols, and punctuation marks. | |||
| o Georgian letters in the ranges U+10D0..U+10FA, U+10FD..U+10FF were | o Georgian letters in the ranges U+10D0..U+10FA, U+10FD..U+10FF were | |||
| changed from Lo to Ll, to reflect their status as the lowercase of | changed from Lo to Ll, to reflect their status as the lowercase of | |||
| new Georgian case pairs. Case mappings were also added. | new Georgian case pairs. Case mappings were also added. | |||
| skipping to change at page 8, line 15 | skipping to change at page 8, line 15 | |||
| 5. Conclusion | 5. Conclusion | |||
| As described in Section 4 changes have been made to Unicode between | As described in Section 4 changes have been made to Unicode between | |||
| version 6.3.0 and 11.0.0. Some changes to specific characters | version 6.3.0 and 11.0.0. Some changes to specific characters | |||
| changed their derived property value. Others did not. Given the | changed their derived property value. Others did not. Given the | |||
| diverse deployment described in Section 3.2 and the changes | diverse deployment described in Section 3.2 and the changes | |||
| described, including implications to normalization, the conclusion is | described, including implications to normalization, the conclusion is | |||
| to not add any exception rules to IDNA2008. | to not add any exception rules to IDNA2008. | |||
| To increase overall harmonization in the use of internationalized | To increase overall harmonization in the use of internationalized | |||
| domain names, the the author recommends that the derived property | domain names, the author recommends that the derived property values | |||
| values MUST be calculated according to the IDNA2008 specification for | MUST be calculated according to the IDNA2008 specification for | |||
| Unicode Version 11.0.0 [Unicode-11.0.0]. | Unicode Version 11.0.0 [Unicode-11.0.0]. | |||
| All registries (and others) SHOULD calculate a repertoir, for example | All registries (and others) SHOULD calculate a repertoire, for | |||
| as explained in draft-freytag-troublesome-characters | example as explained in draft-freytag-troublesome-characters | |||
| [I-D.freytag-troublesome-characters] and draft-klensin-idna- | [I-D.freytag-troublesome-characters] and draft-klensin-idna- | |||
| rfc5891bis [I-D.klensin-idna-rfc5891bis] using the conservatism and | rfc5891bis [I-D.klensin-idna-rfc5891bis] using the conservatism and | |||
| inclusive principles as laid out in SAC-084 [SAC-084]. | inclusive principles as laid out in SAC-084 [SAC-084]. | |||
| 6. IANA Considerations | 6. IANA Considerations | |||
| IANA is requested to update the registry of derived property values | IANA is requested to update the registry of derived property values | |||
| after validation with the Appointed Expert that the derived property | after validation with the Appointed Expert that the derived property | |||
| values are calculated correctly. | values are calculated correctly. | |||
| 7. Security Considerations | 7. Security Considerations | |||
| Not following the recommendations regarding use of the IDNA2008 | This document makes recommendations regarding the use of the IDNA2008 | |||
| algorithm for calculation of the derived property value and/or | algorithm for calculation of derived property values, based on the | |||
| explicitly deciding what subset of the by IDNA2008 algorith applied | current Unicode version. It also recommends that registries (and | |||
| to current Unicode version should be permissable can lead to various | others dealing with Internationalized Domain Names) explicitly select | |||
| security issues related to specifically confusability, and that way | appropriate subsets of characters with the derived value of PVALID. | |||
| | Not following these recommendations can lead to various security | |||
| | issues. Specifically, allowing confusable characters may lead to | |||
| various phishing attacks. | various phishing attacks. | |||
| 8. Acknowledgements | 8. Acknowledgements | |||
| Thanks to John Klensin, Asmus Freytag, Andrew Sullivan, Ted Hardie, | Thanks to Martin Durst, Asmus Freytag, Ted Hardie, John Klensin, | |||
| Suzanne Woolf and Michel Suignard for input to this document. | Michel Suignard, Andrew Sullivan and Suzanne Woolf for input to this | |||
| | document. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [IAB] Internet Architecture Board, "IAB Statement on Identifiers | [IAB] Internet Architecture Board, "IAB Statement on Identifiers | |||
| and Unicode 7.0.0", IAB Statement on Identifiers and | and Unicode 7.0.0", IAB Statement on Identifiers and | |||
| Unicode 7.0.0 | Unicode 7.0.0 | |||
| https://www.iab.org/documents/correspondence-reports- | https://www.iab.org/documents/correspondence-reports- | |||
| documents/2015-2/iab-statement-on-identifiers-and-unicode- | documents/2015-2/iab-statement-on-identifiers-and-unicode- | |||
| End of changes. 12 change blocks. | ||||
| 29 lines changed or deleted | 33 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||