| < draft-faltstrom-unicode11-01.txt | draft-faltstrom-unicode11-02.txt > | |||
|---|---|---|---|---|
| Network Working Group P. Faltstrom | Network Working Group P. Faltstrom | |||
| Internet-Draft Netnod | Internet-Draft Netnod | |||
| Intended status: Informational July 2, 2018 | Intended status: Informational September 25, 2018 | |||
| Expires: January 3, 2019 | Expires: March 29, 2019 | |||
| IDNA2008 and Unicode 11.0.0 | IDNA2008 and Unicode 11.0.0 | |||
| draft-faltstrom-unicode11-01 | draft-faltstrom-unicode11-02 | |||
| Abstract | Abstract | |||
| This document describes changes between Unicode 6.3.0 and Unicode | This document describes changes between Unicode 6.3.0 and Unicode | |||
| 11.0.0 in the context of IDNA2008. It further suggests for the IETF | 11.0.0 in the context of IDNA2008. It further suggests for the IETF | |||
| a path forward regarding ensuring IDNA2008 follows the evolution of | a path forward regarding ensuring IDNA2008 follows the evolution of | |||
| the Unicode Standard. | the Unicode Standard. | |||
| In a few cases changes have been made in the Unicode Standard related | ||||
| to the algorithm IDNA2008 specifies. IDNA2008 does give the ability to | ||||
| add exceptions for backward compatibility to the algorithm but the | ||||
| conclusions provided in this document suggest no such changes. | ||||
| Status of This Memo | Status of This Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on January 3, 2019. | This Internet-Draft will expire on March 29, 2019. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2018 IETF Trust and the persons identified as the | Copyright (c) 2018 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 2, line 11 ¶ | skipping to change at page 2, line 15 ¶ | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
| 2. Keywords for Requirement Levels . . . . . . . . . . . . . . . 3 | 2. Keywords for Requirement Levels . . . . . . . . . . . . . . . 3 | |||
| 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 3 | 3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 3 | |||
| 3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 4 | 3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 4. Notable changes between Unicode 6.3.0 and 11.0.0 . . . . . . 5 | 4. Notable changes between Unicode 6.3.0 and 11.0.0 . . . . . . 6 | |||
| 4.1. Changes to Unicode 7.0.0 . . . . . . . . . . . . . . . . 5 | 4.1. Changes to Unicode 7.0.0 . . . . . . . . . . . . . . . . 6 | |||
| 4.2. Changes to Unicode 11.0.0 . . . . . . . . . . . . . . . . 6 | 4.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 6 | |||
| 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 4.3. Changes to Unicode 11.0.0 . . . . . . . . . . . . . . . . 6 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7 | 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 7 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | ||||
| 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 8 | |||
| 9.1. Normative References . . . . . . . . . . . . . . . . . . 8 | 9.1. Normative References . . . . . . . . . . . . . . . . . . 8 | |||
| 9.2. Non-normative references . . . . . . . . . . . . . . . . 9 | 9.2. Non-normative references . . . . . . . . . . . . . . . . 9 | |||
| Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 11 | Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 12 | |||
| Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 14 | Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 15 | |||
| Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 15 | Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 16 | |||
| Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 16 | Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 17 | |||
| Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 17 | Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 18 | |||
| Appendix F. Code points in Unicode Character Database (UCD) | Appendix F. Code points in Unicode Character Database (UCD) | |||
| format for Unicode 11.0.0 . . . . . . . . . . . . . 19 | format for Unicode 11.0.0 . . . . . . . . . . . . . 20 | |||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 78 | Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 79 | |||
| 1. Introduction | 1. Introduction | |||
| The current version of Internationalized Domain Names for | The current version of Internationalized Domain Names for | |||
| Applications (IDNA) was largely completed in 2008, known within the | Applications (IDNA) was largely completed in 2008, known within the | |||
| series and elsewhere as "IDNA2008" and is specified in a series of | series and elsewhere as "IDNA2008" and is specified in a series of | |||
| documents (see Section 3.1). The standard includes an | documents (see Section 3.1). The standard includes an | |||
| algorithm by which a derived property value is calculated based on | algorithm by which a derived property value is calculated based on | |||
| the properties defined in the Unicode Standard. | the properties defined in the Unicode Standard. | |||
| skipping to change at page 3, line 5 ¶ | skipping to change at page 3, line 11 ¶ | |||
| Assigning code points might create problems if the newly assigned | Assigning code points might create problems if the newly assigned | |||
| code points are compositions of code points so that it either changes | code points are compositions of code points so that it either changes | |||
| or would have changed the normalization functions. This is because it | or would have changed the normalization functions. This is because it | |||
| changes the matching algorithms used which in turn might create | changes the matching algorithms used which in turn might create | |||
| problems looking up already stored strings in, for example, DNS. | problems looking up already stored strings in, for example, DNS. | |||
| Changing properties to already assigned code points might create | Changing properties to already assigned code points might create | |||
| problems if the change results in a different derived property value. | problems if the change results in a different derived property value. | |||
| This might make an earlier allowed code point (derived | This might make an earlier allowed code point (derived | |||
| property value PVALID) not be allowed anymore (derived property value | property value PVALID) not be allowed anymore (derived property value | |||
| DISALLOWED). | DISALLOWED). Or, the other way around, a code point that was not | |||
| allowed (and because of that blocked in some situations) might suddenly end | ||||
| up being allowed. | ||||
| Historically the IETF has accepted all implications of changes in the | Historically the IETF has accepted all implications of changes in the | |||
| Unicode Standard even though the changes have resulted in problematic | Unicode Standard even though the changes have resulted in problematic | |||
| changes in the derived property value. The primary reason for that | changes in the derived property value. The primary reason for that | |||
| is that staying with the Unicode Standard has been viewed as | is that staying with the Unicode Standard has been viewed as | |||
| important given the diversity in implementations already existing in | important given the diversity in implementations already existing in | |||
| the wild. | the wild. | |||
| The Internet Architecture Board did issue a statement [IAB] which | As described in Section 4, a few changes have been made regarding | |||
| requested IETF to resolve the issues related to the code point ARABIC | certain attributes to code points in Unicode between version 6.3.0 | |||
| LETTER BEH WITH HAMZA ABOVE (U+08A1), introduced in Unicode 7.0.0 | and 11.0.0. Such changes could result in either a change in the | |||
| [Unicode-7.0.0]. This document resolves this issue and suggests | derived property value for the code point in question or no such | |||
| IDNA2008 standard is to follow the Unicode Standard and not update | change. In turn, if the result is a change, it can be between any of | |||
| RFC 5892 [RFC5892] or any other IDNA2008 RFCs. | the derived property values except DISALLOWED. Also in this case, | |||
| when moving from version 6.3.0 to 11.0.0, this document concludes | ||||
| that no exceptions are to be added to IDNA2008 even if changes in the | ||||
| derived property value are a result of the changes made in Unicode. | ||||
| Specifically, the Internet Architecture Board did issue a statement | ||||
| [IAB] which requested IETF to resolve the issues related to the code | ||||
| point ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1), introduced in | ||||
| Unicode 7.0.0 [Unicode-7.0.0]. This document resolves this issue and | ||||
| suggests IDNA2008 standard is to follow the Unicode Standard and not | ||||
| update RFC 5892 [RFC5892] or any other IDNA2008 RFCs. | ||||
| 2. Keywords for Requirement Levels | 2. Keywords for Requirement Levels | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| 3. Background | 3. Background | |||
| 3.1. IDNA2008 Documents | 3.1. IDNA2008 Documents | |||
| skipping to change at page 6, line 13 ¶ | skipping to change at page 6, line 34 ¶ | |||
| create "the same" character in multiple ways, the issue with U+08A1 | create "the same" character in multiple ways, the issue with U+08A1 | |||
| is not unique. In the case of U+08A1, it can be represented with the | is not unique. In the case of U+08A1, it can be represented with the | |||
| sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654). | sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654). | |||
| Just like LATIN SMALL LETTER A WITH DIAERESIS (U+00E4) can be | Just like LATIN SMALL LETTER A WITH DIAERESIS (U+00E4) can be | |||
| represented via the sequence LATIN SMALL LETTER A (U+0061), and | represented via the sequence LATIN SMALL LETTER A (U+0061), and | |||
| COMBINING DIAERESIS (U+0308). One difference between these sequences | COMBINING DIAERESIS (U+0308). One difference between these sequences | |||
| is how they are treated in the normalization forms specified by the | is how they are treated in the normalization forms specified by the | |||
| Unicode Consortium. | Unicode Consortium. | |||
| As U+08A1 is discussed in draft-freytag-troublesome-characters | As U+08A1 is discussed in draft-freytag-troublesome-characters | |||
| [I-D.freytag-troublesome-characters] and elsewhere, regardless of | [I-D.freytag-troublesome-characters] and elsewhere. Regardless of | |||
| whether that discussion ends in recommending including the code point | whether those discussions end in recommending including the code | |||
| in the repertoire of characters permissible for registration or not, | point in the repertoire of characters permissible for registration or | |||
| it is acceptable to allow the code point to have a derived property | not, it is acceptable to allow the code point to have a derived | |||
| value of PVALID. | property value of PVALID. | |||
| 4.2. Changes to Unicode 11.0.0 | 4.2. Changes between Unicode 7.0.0 and 10.0.0 | |||
| There are no changes made to Unicode between version 7.0.0 and 10.0.0 | ||||
| that impact the IDNA2008 calculation of the derived property value. | ||||
| 4.3. Changes to Unicode 11.0.0 | ||||
| The Unicode Standard Version 11.0.0 [Unicode-11.0.0] has included a | The Unicode Standard Version 11.0.0 [Unicode-11.0.0] has included a | |||
| number of changes [Changes-11.0.0] from version 10.0.0, specifically | number of changes [Changes-11.0.0] from version 10.0.0, specifically | |||
| to UnicodeData.txt: | to UnicodeData.txt: | |||
| o Entries were added for the 684 new characters, including letters, | o Entries were added for the 684 new characters, including letters, | |||
| combining marks, digits, symbols, and punctuation marks. | combining marks, digits, symbols, and punctuation marks. | |||
| o Georgian letters in the ranges U+10D0..U+10FA, U+10FD..U+10FF were | o Georgian letters in the ranges U+10D0..U+10FA, U+10FD..U+10FF were | |||
| changed from Lo to Ll, to reflect their status as the lowercase of | changed from Lo to Ll, to reflect their status as the lowercase of | |||
| skipping to change at page 7, line 27 ¶ | skipping to change at page 8, line 7 ¶ | |||
| changed. | changed. | |||
| o U+29A1 SPHERICAL ANGLE OPENING UP has existed since before | o U+29A1 SPHERICAL ANGLE OPENING UP has existed since before | |||
| IDNA2008 was created. Applying the IDNA2008 algorithm to the code | IDNA2008 was created. Applying the IDNA2008 algorithm to the code | |||
| point did assign the derived property value PVALID and that value | point did assign the derived property value PVALID and that value | |||
| is unchanged even if the underlying Unicode properties have | is unchanged even if the underlying Unicode properties have | |||
| changed. | changed. | |||
| 5. Conclusion | 5. Conclusion | |||
| Given the changes laid out in Section 4 the derived property values | As described in Section 4 changes have been made to Unicode between | |||
| MUST be calculated according to the IDNA2008 specification for | version 6.3.0 and 11.0.0. Some changes to specific characters | |||
| Unicode Version 11.0.0 [Unicode-11.0.0]. The changes in code points, | changed their derived property value. Others did not. Given the | |||
| implications to normalization and changes in derived property values | diverse deployment described in Section 3.2 and the changes | |||
| are acceptable. | described, including implications to normalization, the conclusion is | |||
| to not add any exception rules to IDNA2008. | ||||
| All registries and others SHOULD calculate a repertoire as explained | To increase overall harmonization in the use of internationalized | |||
| in draft-freytag-troublesome-characters | domain names, the author recommends that the derived property | |||
| values MUST be calculated according to the IDNA2008 specification for | ||||
| Unicode Version 11.0.0 [Unicode-11.0.0]. | ||||
| All registries (and others) SHOULD calculate a repertoire, for example | ||||
| as explained in draft-freytag-troublesome-characters | ||||
| [I-D.freytag-troublesome-characters] and draft-klensin-idna- | [I-D.freytag-troublesome-characters] and draft-klensin-idna- | |||
| rfc5891bis [I-D.klensin-idna-rfc5891bis] using the conservatism and | rfc5891bis [I-D.klensin-idna-rfc5891bis] using the conservatism and | |||
| inclusive principles as laid out in SAC-084 [SAC-084]. | inclusive principles as laid out in SAC-084 [SAC-084]. | |||
| 6. IANA Considerations | 6. IANA Considerations | |||
| IANA is requested to update the registry of derived property values | IANA is requested to update the registry of derived property values | |||
| after validation with the Appointed Expert that the derived values | after validation with the Appointed Expert that the derived property | |||
| are calculated correctly. | values are calculated correctly. | |||
| 7. Security Considerations | 7. Security Considerations | |||
| Not following the recommendations regarding explicitly deciding what | Not following the recommendations regarding use of the IDNA2008 | |||
| subset of the code points allowed by the IDNA2008 algorithm, as applied to the current Unicode version, | algorithm for calculation of the derived property value and/or | |||
| should be permissible can lead to various security issues related | explicitly deciding what subset of the code points allowed by the IDNA2008 algorithm, as applied | |||
| specifically to confusability, and thereby to various phishing attacks. | to the current Unicode version, should be permissible can lead to various | |||
| security issues related specifically to confusability, and thereby to | ||||
| various phishing attacks. | ||||
| 8. Acknowledgements | 8. Acknowledgements | |||
| Thanks to John Klensin, Asmus Freytag, Andrew Sullivan and Michel | Thanks to John Klensin, Asmus Freytag, Andrew Sullivan, Ted Hardie, | |||
| Suignard for input to this document. | Suzanne Woolf and Michel Suignard for input to this document. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [IAB] Internet Architecture Board, "IAB Statement on Identifiers | [IAB] Internet Architecture Board, "IAB Statement on Identifiers | |||
| and Unicode 7.0.0", IAB Statement on Identifiers and | and Unicode 7.0.0", IAB Statement on Identifiers and | |||
| Unicode 7.0.0 | Unicode 7.0.0 | |||
| https://www.iab.org/documents/correspondence-reports- | https://www.iab.org/documents/correspondence-reports- | |||
| documents/2015-2/iab-statement-on-identifiers-and-unicode- | documents/2015-2/iab-statement-on-identifiers-and-unicode- | |||
| End of changes. 16 change blocks. | ||||
| 46 lines changed or deleted | 77 lines changed or added | |||
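
The difference in normalization treatment mentioned in Section 4.1 can be observed directly. The following is a minimal sketch, not part of either draft, using Python's standard unicodedata module. The helper name nfc_code_points is hypothetical, and the results rely on the Unicode data bundled with the Python interpreter in use (see unicodedata.unidata_version). It illustrates that the Latin sequence composes under NFC, while U+08A1 and the sequence U+0628 U+0654 are left as they are, so the two Arabic forms do not normalize to the same string.

```python
import unicodedata


def nfc_code_points(s):
    """Return the NFC form of s as a list of U+XXXX labels (hypothetical helper)."""
    return ["U+%04X" % ord(ch) for ch in unicodedata.normalize("NFC", s)]


# LATIN SMALL LETTER A followed by COMBINING DIAERESIS: composes under NFC.
latin_sequence = "\u0061\u0308"

# ARABIC LETTER BEH followed by ARABIC HAMZA ABOVE: no canonical composition exists.
arabic_sequence = "\u0628\u0654"

# ARABIC LETTER BEH WITH HAMZA ABOVE: no canonical decomposition exists.
arabic_precomposed = "\u08A1"

for label, text in [
    ("Latin sequence", latin_sequence),
    ("Arabic sequence", arabic_sequence),
    ("Arabic precomposed", arabic_precomposed),
]:
    print(label, nfc_code_points(text))
```

Because normalization does not unify the two Arabic forms, any matching between them would have to be handled outside the normalization step, for example by registry policy when a repertoire is chosen.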