< draft-faltstrom-unicode11-05.txt   draft-faltstrom-unicode11-06.txt >
Network Working Group P. Faltstrom Network Working Group P. Faltstrom
Internet-Draft Netnod Internet-Draft Netnod
Intended status: Informational December 01, 2018 Intended status: Standards Track December 09, 2018
Expires: June 4, 2019 Expires: June 12, 2019
IDNA2008 and Unicode 11.0.0 IDNA2008 and Unicode 11.0.0
draft-faltstrom-unicode11-05 draft-faltstrom-unicode11-06
Abstract Abstract
This document describes changes between Unicode 6.3.0 and Unicode This document describes the changes between Unicode 6.3.0 and Unicode
11.0.0 in the context of IDNA2008. It further suggests for the IETF 11.0.0 in the context of IDNA2008. It further suggests a path
a path forward regarding ensuring IDNA2008 follows the evolution of forward for the IETF to ensure IDNA2008 follows the evolution of the
the Unicode Standard. Unicode Standard.
In a few cases changes have been made in the Unicode Standard related Some changes have been made in the Unicode Standard related to the
to the algorithm IDNA2008 specifies. IDNA2008 do give the ability to algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions to
add exceptions for backward compatibility to the algorithm but the the algorithm for backward compatibility; however, this document
conclusions provided in this document suggests no such changes. makes no such changes. Thus this document requests that IANA update
the tables to Unicode 11.
Thus this document requests that IANA update the tables to Unicode The document also recommends that all DNS registries continue the
11. practice of calculating a repertoire using conservatism and inclusion
principles.
In addition, all registries should continue the practice of TO BE REMOVED AT TIME OF PUBLICATION AS AN RFC:
calculating a repertoire using conservatism and inclusion principles.
This document is discussed on the i18nrp@ietf.org mailing list of the
IETF.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/. Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on June 4, 2019. This Internet-Draft will expire on June 12, 2019.
Copyright Notice Copyright Notice
Copyright (c) 2018 IETF Trust and the persons identified as the Copyright (c) 2018 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of (https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 27 skipping to change at page 2, line 27
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
2. Keywords for Requirement Levels . . . . . . . . . . . . . . . 4 2. Keywords for Requirement Levels . . . . . . . . . . . . . . . 4
3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4
3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 4 3.1. IDNA2008 Documents . . . . . . . . . . . . . . . . . . . 4
3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 5 3.2. Deployment . . . . . . . . . . . . . . . . . . . . . . . 5
4. Notable changes between Unicode 6.3.0 and 11.0.0 . . . . . . 6 4. Notable Changes Between Unicode 6.3.0 and 11.0.0 . . . . . . 6
4.1. Changes to Unicode 7.0.0 . . . . . . . . . . . . . . . . 6 4.1. Changes in Unicode 7.0.0 . . . . . . . . . . . . . . . . 6
4.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 7 4.2. Changes between Unicode 7.0.0 and 10.0.0 . . . . . . . . 6
4.3. Changes to Unicode 11.0.0 . . . . . . . . . . . . . . . . 7 4.3. Changes in Unicode 11.0.0 . . . . . . . . . . . . . . . . 6
5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 8 5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 8
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8
7. Security Considerations . . . . . . . . . . . . . . . . . . . 8 7. Security Considerations . . . . . . . . . . . . . . . . . . . 8
8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 9 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 8
9. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 8
9.1. Normative References . . . . . . . . . . . . . . . . . . 9 9.1. Normative References . . . . . . . . . . . . . . . . . . 9
9.2. Non-normative references . . . . . . . . . . . . . . . . 10 9.2. Non-normative references . . . . . . . . . . . . . . . . 9
Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 12 Appendix A. Changes from Unicode 6.3.0 to Unicode 7.0.0 . . . . 12
Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 15 Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0 . . . . 15
Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 16 Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0 . . . . 16
Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 17 Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0 . . . . 17
Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 18 Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0 . . . 18
Appendix F. Code points in Unicode Character Database (UCD) Appendix F. Code points in Unicode Character Database (UCD)
format for Unicode 11.0.0 . . . . . . . . . . . . . 20 format for Unicode 11.0.0 . . . . . . . . . . . . . 20
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 79 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 79
1. Introduction 1. Introduction
The current version of Internationalized Domain Names for The current version of Internationalized Domain Names for
Applications (IDNA) was largely completed in 2008, known within the Applications (IDNA) was largely completed in 2008, and is thus known
series and elsewhere as "IDNA2008" and is specified in a series of within as "IDNA2008". It is specified in a series of documents
documents (see Section Section 3.1). The standard include an listed in Section 3.1. The IDNA2008 standard includes an algorithm
algorithm by which a derived property value is calculated based on by which a derived property value is calculated based on the
the properties defined in the Unicode Standard. properties defined from the Unicode Standard.
When the Unicode Standard is updated code points are assigned and When the Unicode Standard is updated, new code points are assigned
property values might be changed for already assigned code points. and already-assigned code points can have their property values
changed.
Assigning code points might create problems if the newly assigned o Assigning code points can create problems if the newly-assigned
code points are compositions of code points so that it either changes code points are compositions of code points changes (or would have
or would have changed the normalization functions. This is because changed) the normalization functions. These problems can arise if
it changes the matching algorithms used which in turn might create the new code points change the matching algorithms used and this
problems looking up already stored strings in for example DNS. in turn creates problems looking up already stored strings.
Changing properties for already assigned code points might create o Changing properties for already-assigned code points can create
problems if the change results in the derived property value changes. problems if the property change results in changes to the derived
This might make an earlier allowed code point (derived property value property value. This might make an earlier allowed code point
PVALID) not be allowed anymore (derived property value DISALLOWED). whose derived property value is PVALID to then not be allowed
Or the other way around, a code point that was not allowed (and anymore if its derived property value changes to DISALLOWED. The
because of that blocked in some situations) suddenly end up being problem can also happen the other way around: a code point that
allowed. was not allowed (and thus is blocked in some situations) to
suddenly end up being allowed.
Historically the IETF has accepted all implications of changes in the Historically, the IETF has accepted all implications of changes in
Unicode Standard even though the changes have resulted in problematic the Unicode Standard even though the changes have resulted in
changes in the derived property value. The primary reason for that problematic changes in the derived property value. The primary
is that staying with the Unicode Standard has been viewed as reason for that choice is that staying with the Unicode Standard has
important given the diversity in implementations already existing in been viewed as important because of the diversity of implementations
the wild. already existing in the wild.
As described in Section 4, a few changes have been made regarding As described in Section 4, a few changes have been made regarding
certain attributes to code points in Unicode between version 6.3.0 certain attributes to code points in Unicode between version 6.3.0
and 11.0.0. Such changes could result in either a change in the and 11.0.0. Such changes could result in a change in the derived
derived property value for the code point in question or no such property value for the code point in question. If a change occurs,
change. In turn, if the result is a change, it can be between any of and it is between any of the derived property values except
the derived property values except DISALLOWED. Also in this case, DISALLOWED, there is not a problem. This document concludes that no
when moving from version 6.3.0 to 11.0.0, this document concludes exceptions are to be added to IDNA2008 even if changes in the derived
that no exceptions are to be added to IDNA2008 even if changes in the property value is a result of the changes made in Unicode between
derived property value is a result of the changes made in Unicode. version 6.3.0 and 11.0.0.
Specifically, the Internet Architecture Board did issue a statement In 2015, the Internet Architecture Board (IAB) issued a statement
[IAB] which requested IETF to resolve the issues related to the code [IAB] which requested the IETF to resolve the issues related to the
point ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1), introduced in code point ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) that was
Unicode 7.0.0 [Unicode-7.0.0]. This document resolves this issue and introduced in Unicode 7.0.0 [Unicode-7.0.0]. The current document
suggests IDNA2008 standard is to follow the Unicode Standard and not resolves this issue and suggests that the IDNA2008 standard follows
update RFC 5892 [RFC5892] or any other IDNA2008 RFCs. the Unicode Standard and not update RFC 5892 [RFC5892] or any other
IDNA2008 RFCs.
2. Keywords for Requirement Levels 2. Keywords for Requirement Levels
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in BCP
14 RFC2119 [RFC2119] RFC8174 [RFC8174] when, and only when, they 14 RFC2119 [RFC2119] RFC8174 [RFC8174] when, and only when, they
appear in all capitals, as shown here. appear in all capitals, as shown here.
3. Background 3. Background
3.1. IDNA2008 Documents 3.1. IDNA2008 Documents
IDNA2008 consists of the following documents: IDNA2008 consists of the following documents. The documents in the
set have informal names.
o A document, RFC 5890 [RFC5890], containing definitions and other o RFC 5890 [RFC5890], informally called "Defs" or "Definitions",
material that are needed for understanding other documents in the contains definitions and other material that are needed for
set. It is referred to informally in other documents in the set understanding other documents in the set.
as "Defs" or "Definitions".
o A document, RFC 5891 [RFC5891], that describes the core IDNA2008 o RFC 5891 [RFC5891], informally called "Protocol", describes the
protocol and its operations. It is to be interpreted in core IDNA2008 protocol and its operations. It needs to be
combination with the Bidi document, described immediately below. interpreted in combination with the Bidi document (described
It is referred to informally in other documents in the set as below).
"Protocol".
o A specification, RFC 5892 [RFC5892], of the categories and rules o RFC 5892 [RFC5892], informally called "Tables", lists the
that identify the code points allowed in a label written in native categories and rules that identify the code points allowed in a
character form (defined more specifically as a "U-label"), based label written in native character form (called a "U-label"), and is
originally on Unicode 5.2.0 [Unicode-5.2.0] code point assignments based originally on Unicode 5.2.0 [Unicode-5.2.0] code point
and additional rules unique to IDNA2008. The Unicode-based rules assignments and additional rules unique to IDNA2008. The Unicode-
are expected to be stable across Unicode updates and hence based rules in RFC 5892 are expected to be stable across Unicode
independent of Unicode versions. That specification obsoletes RFC updates and hence independent of Unicode versions. RFC 5892
3491 [RFC3491] and IDN use of the tables to which it refers. It obsoletes RFC 3491 [RFC3491], and in particular the use of the
is referred to informally in other documents in the set as tables to which it refers.
"Tables".
o A document, RFC 5893 [RFC5893], that specifies special rules o RFC 5893 [RFC5893], informally called "Bidi", specifies special
(Bidi) for labels that contain characters that are written from rules for labels that contain characters that are written from
right to left. right to left.
o A document, RFC 5894 [RFC5894], that provides an overview of the o RFC 5894 [RFC5894], informally called "Rationale", provides an
protocol and associated tables together with explanatory material overview of the protocol and associated tables, and gives
and some rationale for the decisions that led to IDNA2008. That explanatory material and some rationale for the decisions that led
document also contains advice for registry operations and those to IDNA2008. It also contains advice for DNS registry operators
who use Internationalized Domain Names (IDNs). It is referred to and others who use Internationalized Domain Names (IDNs).
informally in other documents in the set as "Rationale".
o A document, RFC 5895 [RFC5895], that discusses the issue of o RFC 5895 [RFC5895], informally called "Mapping", discusses the
mapping characters into other characters and that provides issue of mapping characters into other characters and that
guidance for doing so when that is appropriate. That document, provides guidance for doing so when that is appropriate. RFC 5895
referred to informally as "Mapping", provides advice; it is not a provides advice and is not a required part of IDNA.
required part of IDNA.
o A document, RFC 6452 [RFC6452], that looks at some changes made to o RFC 6452 [RFC6452] describes some changes made to Unicode 6.0.0
Unicode 6.0.0 [Unicode-6.0.0] that resulted in the derived [Unicode-6.0.0] that resulted in the derived property value change
property value change for the code points U+0CF1, U+0CF2 and for the code points U+0CF1, U+0CF2 and U+19DA. U+0CF1 and U+0CF2
U+19DA. The first two changed from DISALLOWED to PVALID, the last changed from DISALLOWED to PVALID, while U+19DA changed from
from PVALID to DISALLOWED. IETF came to the conclusion that no PVALID to DISALLOWED. The IETF concluded that no update to RFC
update is needed to RFC 5892 [RFC5892] based on the changes made 5892 [RFC5892] was needed based on the changes made in Unicode
in Unicode 6.0.0 [Unicode-6.0.0]. As a result, the derived 6.0.0 [Unicode-6.0.0]. As a result, the derived property value
property value remained aligned with the Unicode Standard. remained aligned with the Unicode Standard.
3.2. Deployment 3.2. Deployment
The deployment of IDNA2008 is unfortunately quite diverse. The The level of deployment of IDNA2008 is unfortunately quite diverse.
following lists some of the strategies that existing implementations The following lists some of the strategies that existing
are known to implement: implementations are known to use:
o IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491 [RFC3491] o IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491 [RFC3491]
which implies using a table within which it is said whether code which implies using a table within which it is said whether code
points are allowed to be used or not, after doing the points are allowed to be used or not, after doing the
normalization specified in IDNA2003. normalization specified in IDNA2003.
o A mix between IDNA2003 and IDNA2008 where code points assigned to o A mix between IDNA2003 and IDNA2008 where code points assigned to
Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property
value calculated according to the algorithm specified in IDNA2008. value calculated according to the algorithm specified in IDNA2008.
o Strict IDNA2008 following IANA which implies staying at Unicode o Strict IDNA2008 following the current IANA tables, which implies
6.3.0 [Unicode-6.3.0] and treating later assigned code points as staying at Unicode 6.3.0 [Unicode-6.3.0] and treating later
UNASSIGNED. assigned code points as UNASSIGNED.
o The IDNA2008 algorithm applied to whatever version of Unicode o The IDNA2008 algorithm applied to whatever version of Unicode
Standard exists in the operating system and/or libraries used, Standard exists in the operating system and/or libraries used,
regardless of whether the version is later than Unicode version regardless of whether the version is later than Unicode version
6.3.0 or not. 6.3.0.
o A mix between IDNA2003 and IDNA2008 according to local o A mix between IDNA2003 and IDNA2008 according to local
interpretation of the Unicode Technical Standard #46 [UTS-46]. interpretation of the Unicode Technical Standard #46 [UTS-46].
The issue is further complicated by having a very diverse The issue is further complicated by having diverse implementations of
implementations of the requirements in RFC 5894 [RFC5894] by registry the requirements in RFC 5894 [RFC5894] by DNS registry operators,
operators based on the IDNA2008 specification to create additional based on the IDNA2008 specification, but with additional rules for
rules for what code points are allowed to be used for registration. the specific code points that are allowed for registration.
In practice, the Unicode Consortium creates a maximum set of code In practice, the Unicode Consortium creates a maximum set of code
points by assigning code points in the Unicode Standard. The points by assigning code points in the Unicode Standard. The
IDNA2008 rules based on the Unicode Standard create a subset of these IDNA2008 rules based on the Unicode Standard create a subset of these
by assigning the PVALID derived property value to them. Registries by assigning the PVALID derived property value to them. DNS
(and others dealing with Internationalized Domain Names) are supposed registries and other organizations that deal with IDNs are supposed
to create an even smaller subset that ultimately is the set of code to create their own subsets from IDNA2008 for use by those registries
points that can be used in a particular registry. and organizations.
There is further recommendation to be conservative when these subsets SAC-084 [SAC-084] and RFC 6912 [RFC6912] recommend to DNS registries
are calculated and to use the inclusion principle; this is explained and other organizations to be conservative when creating their
in SAC-084 [SAC-084] and RFC 6912 [RFC6912]. subsets are calculated, and to use the principle of creating subsets
by inclusion.
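The nesting of subsets described above (assigned code points, then PVALID code points, then a registry's own repertoire) can be sketched as follows. This is a minimal illustration and not part of either draft: the `DERIVED` table is a tiny hand-written stand-in for the IANA tables, and `REGISTRY_CHOICES`, `build_repertoire`, and `label_allowed` are hypothetical names. The U+08A1 entry follows the PVALID value this document discusses for that code point.

```python
# Hypothetical stand-in for the IDNA2008 derived property table.
DERIVED = {
    0x0061: "PVALID",      # LATIN SMALL LETTER A
    0x00E4: "PVALID",      # LATIN SMALL LETTER A WITH DIAERESIS
    0x0041: "DISALLOWED",  # LATIN CAPITAL LETTER A
    0x08A1: "PVALID",      # ARABIC LETTER BEH WITH HAMZA ABOVE
}


def build_repertoire(included, derived):
    """Inclusion principle: start from nothing and keep only code points
    that were explicitly chosen AND have the derived value PVALID."""
    return {cp for cp in included if derived.get(cp) == "PVALID"}


def label_allowed(label, repertoire):
    """A label is acceptable only if every code point is in the repertoire."""
    return all(ord(ch) in repertoire for ch in label)


# Hypothetical registry policy: the registry listed three code points,
# one of which is not PVALID and is therefore dropped.
REGISTRY_CHOICES = {0x0061, 0x00E4, 0x0041}
REPERTOIRE = build_repertoire(REGISTRY_CHOICES, DERIVED)
```

Under these assumptions U+08A1 is PVALID but still rejected by this registry, because nothing enters the repertoire unless it was explicitly included.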
4. Notable changes between Unicode 6.3.0 and 11.0.0 4. Notable Changes Between Unicode 6.3.0 and 11.0.0
4.1. Changes to Unicode 7.0.0 4.1. Changes in Unicode 7.0.0
The character ARABIC LETTER BEH WITH HAMZA ABOVE U+08A1 was The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was
introduced in Unicode 7.0.0. This was discussed in the IETF introduced in Unicode 7.0.0. This was discussed extensively in the
extensively and by IAB in their statement [IAB] requesting the IETF IETF, and by the IAB in their statement [IAB] requesting the IETF to
to investigate the issue. Specifically IAB stated: investigate the issue. Specifically, the IAB stated:
On the same precautionary principle, the IAB recommends that the On the same precautionary principle, the IAB recommends that the
Internationalized Domain Names for Applications (IDNA) Parameters Internationalized Domain Names for Applications (IDNA) Parameters
registry (http://www.iana.org/assignments/idna-tables/) not be registry (http://www.iana.org/assignments/idna-tables/) not be
updated to Unicode 7.0.0 until the IETF has consensus on a updated to Unicode 7.0.0 until the IETF has consensus on a
solution to this problem. solution to this problem.
The discussion in the IETF concluded that although it is possible to The discussion in the IETF concluded that although it is possible to
create "the same" character in multiple ways, the issue with U+08A1 create "the same" character in multiple ways, the issue with U+08A1
is not unique. In the case of U+08A1, it can be represented with the is not unique. The character U+08A1 can be represented with the
sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654). sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654).
Just like LATIN SMALL LETTER A WITH DIAERESIS (U+00E4) can be This is identical to LATIN SMALL LETTER A WITH DIAERESIS (U+00E4), that
represented via the sequence LATIN SMALL LETTER A (U+0061), and can be represented with the sequence LATIN SMALL LETTER A (U+0061)
COMBINING DIAERESIS (U+0308). One difference between these sequences followed by COMBINING DIAERESIS (U+0308). One difference between
is how they are treated in the normalization forms specified by the these two sequences is how they are treated in the normalization
Unicode Consortium. forms specified by the Unicode Consortium.
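The normalization difference mentioned above can be observed with Python's standard `unicodedata` module. This is a minimal sketch and not part of either draft; the constant and helper names are illustrative. It relies on U+08A1 having no canonical decomposition, so NFC does not compose U+0628 U+0654 into it.

```python
import unicodedata


def nfc(s):
    """Return the NFC normalization of s."""
    return unicodedata.normalize("NFC", s)


A_DIAERESIS = "\u00e4"         # LATIN SMALL LETTER A WITH DIAERESIS
A_SEQUENCE = "\u0061\u0308"    # LATIN SMALL LETTER A + COMBINING DIAERESIS
BEH_HAMZA = "\u08a1"           # ARABIC LETTER BEH WITH HAMZA ABOVE
BEH_SEQUENCE = "\u0628\u0654"  # ARABIC LETTER BEH + ARABIC HAMZA ABOVE


def same_after_nfc(a, b):
    """True if the two strings compare equal once both are in NFC."""
    return nfc(a) == nfc(b)


# The Latin pair becomes identical under NFC; the Arabic pair does not.
latin_match = same_after_nfc(A_DIAERESIS, A_SEQUENCE)
arabic_match = same_after_nfc(BEH_HAMZA, BEH_SEQUENCE)
```

Here `latin_match` is true and `arabic_match` is false, so two labels that may render alike remain distinct strings for matching after normalization.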
As U+08A1 is discussed in draft-freytag-troublesome-characters U+08A1 is discussed in draft-freytag-troublesome-characters
[I-D.freytag-troublesome-characters] and elsewhere. Regardless of [I-D.freytag-troublesome-characters] and other Internet Drafts.
whether those discussions ends in recommending including the code Regardless of whether the discussion of those drafts ends in
point in the repertoire of characters permissible for registration or recommendations to include the code point in the repertoire of
not, it is acceptable to allow the code point to have a derived characters permissible for registration or not, it is still
property value of PVALID. acceptable to allow the code point to have a derived property value
of PVALID.
4.2. Changes between Unicode 7.0.0 and 10.0.0 4.2. Changes between Unicode 7.0.0 and 10.0.0
There are no changes made to Unicode between version 7.0.0 and 10.0.0 There are no changes made to Unicode between version 7.0.0 and 10.0.0
that impact IDNA2008 calculation of the derived property value. that impact IDNA2008 calculation of the derived property value.
4.3. Changes to Unicode 11.0.0 4.3. Changes in Unicode 11.0.0
The Unicode Standard Version 11.0.0 [Unicode-11.0.0] has included a The Unicode Standard Version 11.0.0 [Unicode-11.0.0] has included a
number of changes [Changes-11.0.0] from version 10.0.0, specifically number of changes [Changes-11.0.0] from version 10.0.0.
to UnicodeData.txt:
o Entries were added for the 684 new characters, including letters, o 684 new characters were added, including letters, combining marks,
combining marks, digits, symbols, and punctuation marks. digits, symbols, and punctuation marks.
o Georgian letters in the ranges U+10D0..U+10FA, U+10FD..U+10FF were o Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF
changed from Lo to Ll, to reflect their status as the lowercase of had their General Properties changed from Lo to Ll, to reflect
new Georgian case pairs. Case mappings were also added. their status as the lowercase of new Georgian case pairs. Case
mappings were also added.
o U+111C9 SHARADA SANDHI MARK was changed from Po to Mn, and from o SHARADA SANDHI MARK (U+111C9) was changed from Po to Mn, and from
bc=L to bc=NSM. bc=L to bc=NSM.
o U+11A07 ZANABAZAR SQUARE VOWEL SIGN AI and U+11A08 ZANABAZAR SQUARE o The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
VOWEL SIGN AU were corrected from Mc to Mn. ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to
Mn.
o U+29A1 SPHERICAL ANGLE OPENING UP was changed to Bidi_M=N. o SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi_M=N.
These changes to the Unicode Standard have the following implications These changes to the Unicode Standard have the following implications
for these code points: for these code points:
o The newly assigned 684 characters are to have a derived property o The newly assigned 684 characters are to have a derived property
value as a result of applying the IDNA2008 algorithm. value as a result of applying the IDNA2008 algorithm.
o The Georgian letters in the ranges U+10D0..U+10FA and o The Georgian letters in the ranges U+10D0..U+10FA and
U+10FD..U+10FF have existed since before IDNA2008 was created. U+10FD..U+10FF existed before IDNA2008 was created. Applying the
Applying the IDNA2008 algorithm to the code points did assign the IDNA2008 algorithm to the code points assigned the derived
derived property value PVALID and that value is unchanged even if property value PVALID, and that value is unchanged even if the
the underlying Unicode properties have changed. underlying Unicode properties have changed.
o The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0 o The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0
[Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code [Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code
point did assign the derived property value DISALLOWED. The point assigned the derived property value DISALLOWED. The changes
changes in the underlying properties in the Unicode Standard in the underlying properties in the Unicode Standard Version
Version 11.0.0 [Unicode-11.0.0] make the derived property value 11.0.0 [Unicode-11.0.0] caused the derived property value to
change to PVALID which is an acceptable change. change to PVALID, which is an acceptable change.
o The characters U+11A07 ZANABAZAR SQUARE VOWEL SIGN AI and U+11A08 o The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
ZANABAZAR SQUARE VOWEL SIGN AU were added to Unicode 10.0.0 ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode
[Unicode-10.0.0]. Applying the IDNA2008 algorithm to the code 10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the
points did assign the derived property value PVALID and that value code points assigned the derived property value PVALID, and that
is unchanged even if the underlying Unicode properties have value is unchanged even if the underlying Unicode properties have
changed. changed.
o U+29A1 SPHERICAL ANGLE OPENING UP have existed since before o SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was
IDNA2008 was created. Applying the IDNA2008 algorithm to the code created. Applying the IDNA2008 algorithm to the code point
point did assign the derived property value PVALID and that value assigned the derived property value PVALID, and that value is
is unchanged even if the underlying Unicode properties have unchanged even if the underlying Unicode properties have changed.
changed.
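The following sketch shows why some General_Category changes listed above alter the derived property value while others do not. It is a deliberately reduced model and not the full algorithm: it applies only the LetterDigits rule from RFC 5892 (General_Category in Ll, Lu, Lo, Nd, Lm, Mn, Mc) and ignores every other rule, such as Exceptions, Unstable, and the contextual rules. The names `simplified_value` and `CHANGES` are hypothetical.

```python
# General_Category values that make up the LetterDigits set in RFC 5892.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def simplified_value(general_category):
    """Reduced model: PVALID if the category is in LetterDigits, else
    DISALLOWED. The real algorithm applies several further rules."""
    return "PVALID" if general_category in LETTER_DIGITS else "DISALLOWED"


# (description, category before Unicode 11.0.0, category in Unicode 11.0.0)
CHANGES = [
    ("Georgian U+10D0..U+10FA, U+10FD..U+10FF", "Lo", "Ll"),
    ("U+111C9 SHARADA SANDHI MARK", "Po", "Mn"),
    ("U+11A07, U+11A08 ZANABAZAR SQUARE VOWEL SIGNS", "Mc", "Mn"),
]

# Map each description to its (old value, new value) pair.
RESULTS = {
    name: (simplified_value(old), simplified_value(new))
    for name, old, new in CHANGES
}
```

In this model only the Po to Mn change crosses the LetterDigits boundary, which matches the DISALLOWED to PVALID change described for U+111C9. The Lo to Ll and Mc to Mn changes stay inside the set, so the value remains PVALID.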
5. Conclusion 5. Conclusion
As described in Section 4 changes have been made to Unicode between As described in Section 4, changes have been made to Unicode between
version 6.3.0 and 11.0.0. Some changes to specific characters version 6.3.0 and 11.0.0. Some changes to specific characters
changed their derived property value. Others did not. Given the changed their derived property value, while other changes did not.
diverse deployment described in Section 3.2 and the changes Given the diverse deployment described in Section 3.2 and the changes
described, including implications to normalization, the conclusion is described, including implications to normalization, the conclusion of
to not add any exception rules to IDNA2008. this document is to not add any exception rules to IDNA2008.
To increase overall harmonization in the use of internationalized To increase overall harmonization in the use of IDNs, this document
domain names, the author recommends that the derived property values recommends that the derived property values MUST be calculated as
MUST be calculated as specified in the documents listed in section specified in the documents listed in section Section 3.1 and with the
Section 3.1 also with code points in Unicode Version 11.0.0 code points in Unicode Version 11.0.0 [Unicode-11.0.0].
[Unicode-11.0.0].
All registries (and others) SHOULD calculate a repertoire using the All DNS registries (and other organizations) SHOULD calculate a
conservatism and inclusion principles as laid out for example in repertoire using the conservatism and inclusion principles, as
SAC-084 [SAC-084]. described in SAC-084 [SAC-084] and similar documents.
6. IANA Considerations 6. IANA Considerations
IANA is requested to update the registry of derived property values IANA is requested to update the IDNA Parameters registry of derived
after validation with the Appointed Expert that the derived property property values, after the expert reviewer validates that the derived
values are calculated correctly. property values are calculated correctly.
7. Security Considerations 7. Security Considerations
This document makes recommendations regarding the use of the IDNA2008 This document makes recommendations regarding the use of the IDNA2008
algorithm for calculation of derived property values, based on the algorithm for calculation of derived property values, based on the
current Unicode version. It also recommends that registries (and current Unicode version. It also recommends that DNS registries (and
others dealing with Internationalized Domain Names) explicitly select others dealing with Internationalized Domain Names) explicitly select
appropriate subsets of characters with the derived value of PVALID. appropriate subsets of characters with the derived value of PVALID.
Not following these recommendations can lead to various security Not following these recommendations can lead to various security
issues. Specifically, allowing confusable characters may lead to issues. Specifically, allowing confusable characters may lead to
various phishing attacks. See Security Consideration Sections in the various phishing attacks, as described in the Security Consideration
documents listed in section Section 3.1. Sections in the documents listed in section Section 3.1.
8. Acknowledgements 8. Acknowledgements
Thanks to Martin Durst, Asmus Freytag, Ted Hardie, John Klensin, Erik Thanks to Martin Durst, Asmus Freytag, Ted Hardie, John Klensin, Erik
Nordmark, Michel Suignard, Andrew Sullivan and Suzanne Woolf for Nordmark, Michel Suignard, Andrew Sullivan and Suzanne Woolf for
input to this document. input to this document.
9. References 9. References
9.1. Normative References 9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
Profile for Internationalized Domain Names (IDN)", Profile for Internationalized Domain Names (IDN)",
RFC 3491, DOI 10.17487/RFC3491, March 2003, RFC 3491, DOI 10.17487/RFC3491, March 2003,
 End of changes. 56 change blocks. 
188 lines changed or deleted 191 lines changed or added
