< draft-faltstrom-unicode11-06.txt   draft-faltstrom-unicode11-07.txt >
Network Working Group P. Faltstrom Network Working Group P. Faltstrom
Internet-Draft Netnod Internet-Draft Netnod
Intended status: Standards Track December 09, 2018 Intended status: Standards Track January 07, 2019
Expires: June 12, 2019 Expires: July 11, 2019
IDNA2008 and Unicode 11.0.0 IDNA2008 and Unicode 11.0.0
draft-faltstrom-unicode11-06 draft-faltstrom-unicode11-07
Abstract Abstract
This document describes the changes between Unicode 6.3.0 and Unicode This document describes the changes between Unicode 6.3.0 and Unicode
11.0.0 in the context of IDNA2008. It further suggests a path 11.0.0 in the context of IDNA2008. It further suggests a path
forward for the IETF to ensure IDNA2008 follows the evolution of the forward for the IETF to ensure IDNA2008 follows the evolution of the
Unicode Standard. Unicode Standard.
Some changes have been made in the Unicode Standard related to the Some changes have been made in the Unicode Standard related to the
algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions to algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions to
skipping to change at page 1, line 48 skipping to change at page 1, line 48
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 12, 2019. This Internet-Draft will expire on July 11, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
skipping to change at page 6, line 7 skipping to change at page 6, line 7
In practice, the Unicode Consortium creates a maximum set of code In practice, the Unicode Consortium creates a maximum set of code
points by assigning code points in the Unicode Standard. The points by assigning code points in the Unicode Standard. The
IDNA2008 rules based on the Unicode Standard create a subset of these IDNA2008 rules based on the Unicode Standard create a subset of these
by assigning the PVALID derived property value to them. DNS by assigning the PVALID derived property value to them. DNS
registries and other organizations that deal with IDNs are supposed registries and other organizations that deal with IDNs are supposed
to create their own subsets from IDNA2008 for use by those registries to create their own subsets from IDNA2008 for use by those registries
and organizations. and organizations.
SAC-084 [SAC-084] and RFC 6912 [RFC6912] recommend to DNS registries SAC-084 [SAC-084] and RFC 6912 [RFC6912] recommend to DNS registries
and other organizations to be conservative when creating their and other organizations to be conservative when creating their
subsets are calculated, and to use the principle of creating subsets subsets, and to use the principle of creating subsets by inclusion.
by inclusion.
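The layering described above (assigned code points, then PVALID, then a registry's own subset built by inclusion) can be sketched as follows. This is a minimal illustration only: TOY_PVALID is a hypothetical stand-in for the real derived property table, and the helper names build_registry_repertoire and is_label_permitted are assumptions made for this example, not part of IDNA2008 or of any registry's tooling.

```python
# Hypothetical stand-in for the IDNA2008 derived property table: a real
# implementation would load the full set of PVALID code points instead.
TOY_PVALID = frozenset(range(0x61, 0x7B)) | {0x00E4, 0x08A1}


def build_registry_repertoire(pvalid, included):
    """Build a repertoire by inclusion.

    Only code points the registry lists explicitly are candidates, and a
    listed code point is kept only if it is also PVALID.
    """
    return frozenset(included) & frozenset(pvalid)


def is_label_permitted(label, repertoire):
    """Return True if the label is non-empty and uses only permitted code points."""
    return bool(label) and all(ord(ch) in repertoire for ch in label)


# A conservative registry that includes only a-z, although more is PVALID.
repertoire = build_registry_repertoire(TOY_PVALID, range(0x61, 0x7B))
print(is_label_permitted("example", repertoire))
print(is_label_permitted("b\u00e4r", repertoire))
```

Building by inclusion means a code point that becomes PVALID in a later Unicode version stays excluded until the registry lists it explicitly.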
4. Notable Changes Between Unicode 6.3.0 and 11.0.0 4. Notable Changes Between Unicode 6.3.0 and 11.0.0
4.1. Changes in Unicode 7.0.0 4.1. Changes in Unicode 7.0.0
The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was
introduced in Unicode 7.0.0. This was discussed extensively in the introduced in Unicode 7.0.0. This was discussed extensively in the
IETF, and by the IAB in their statement [IAB] requesting the IETF to IETF, and by the IAB in their statement [IAB] requesting the IETF to
investigate the issue. Specifically, the IAB stated: investigate the issue. Specifically, the IAB stated:
skipping to change at page 6, line 36 skipping to change at page 6, line 35
create "the same" character in multiple ways, the issue with U+08A1 create "the same" character in multiple ways, the issue with U+08A1
is not unique. The character U+08A1 can be represented with the is not unique. The character U+08A1 can be represented with the
sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654). sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654).
This is identical to LATIN SMALL LETTER A WITH DIAERESIS (U+00E4), that This is identical to LATIN SMALL LETTER A WITH DIAERESIS (U+00E4), that
can be represented with the sequence LATIN SMALL LETTER A (U+0061) can be represented with the sequence LATIN SMALL LETTER A (U+0061)
followed by COMBINING DIAERESIS (U+0308). One difference between followed by COMBINING DIAERESIS (U+0308). One difference between
these two sequences is how they are treated in the normalization these two sequences is how they are treated in the normalization
forms specified by the Unicode Consortium. forms specified by the Unicode Consortium.
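The difference in normalization treatment can be observed with a short sketch. It assumes a Python interpreter whose unicodedata tables cover Unicode 7.0.0 or later; the helper name nfc_equivalent and the constant names are introduced here for illustration only. It relies on the fact that U+00E4 has a canonical decomposition to U+0061 U+0308, while U+08A1 has no canonical decomposition.

```python
import unicodedata


def nfc_equivalent(first: str, second: str) -> bool:
    """Return True if both strings are identical after NFC normalization."""
    return unicodedata.normalize("NFC", first) == unicodedata.normalize("NFC", second)


A_DIAERESIS = "\u00E4"         # LATIN SMALL LETTER A WITH DIAERESIS
A_SEQUENCE = "\u0061\u0308"    # LATIN SMALL LETTER A + COMBINING DIAERESIS
BEH_HAMZA = "\u08A1"           # ARABIC LETTER BEH WITH HAMZA ABOVE
BEH_SEQUENCE = "\u0628\u0654"  # ARABIC LETTER BEH + ARABIC HAMZA ABOVE

# The Latin pair is unified by normalization; the Arabic pair is not.
print(nfc_equivalent(A_DIAERESIS, A_SEQUENCE))
print(nfc_equivalent(BEH_HAMZA, BEH_SEQUENCE))
```

Because normalization does not unify the Arabic pair, a comparison that relies on NFC alone treats U+08A1 and the sequence U+0628 U+0654 as distinct strings.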
U+08A1 is discussed in draft-freytag-troublesome-characters U+08A1 is discussed in draft-freytag-troublesome-characters
[I-D.freytag-troublesome-characters] and other Internet Drafts. [I-D.freytag-troublesome-characters] and other Internet-Drafts.
Regardless of whether the discussion of those drafts ends in Regardless of whether the discussion of those drafts ends in
recommendations to include the code point in the repertoire of recommendations to include the code point in the repertoire of
characters permissible for registration or not, it is still characters permissible for registration or not, it is still
acceptable to allow the code point to have a derived property value acceptable to allow the code point to have a derived property value
of PVALID. of PVALID.
4.2. Changes between Unicode 7.0.0 and 10.0.0 4.2. Changes between Unicode 7.0.0 and 10.0.0
There are no changes made to Unicode between version 7.0.0 and 10.0.0 There are no changes made to Unicode between version 7.0.0 and 10.0.0
that impact IDNA2008 calculation of the derived property value. that impact IDNA2008 calculation of the derived property value.
skipping to change at page 8, line 44 skipping to change at page 8, line 44
others dealing with Internationalized Domain Names) explicitly select others dealing with Internationalized Domain Names) explicitly select
appropriate subsets of characters with the derived value of PVALID. appropriate subsets of characters with the derived value of PVALID.
Not following these recommendations can lead to various security Not following these recommendations can lead to various security
issues. Specifically, allowing confusable characters may lead to issues. Specifically, allowing confusable characters may lead to
various phishing attacks, as described in the Security Consideration various phishing attacks, as described in the Security Consideration
Sections in the documents listed in Section 3.1. Sections in the documents listed in Section 3.1.
8. Acknowledgements 8. Acknowledgements
Thanks to Martin Durst, Asmus Freytag, Ted Hardie, John Klensin, Erik Thanks to Martin Duerst, Asmus Freytag, Ted Hardie, John Klensin,
Nordmark, Michel Suignard, Andrew Sullivan and Suzanne Woolf for Erik Nordmark, Michel Suignard, Andrew Sullivan and Suzanne Woolf for
input to this document. input to this document.
9. References 9. References
9.1. Normative References 9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
 End of changes. 7 change blocks. 
10 lines changed or deleted 9 lines changed or added
