idnits 2.17.1
draft-faltstrom-unicode12-07.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
No issues found here.
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
No issues found here.
Miscellaneous warnings:
----------------------------------------------------------------------------
== The copyright year in the IETF Trust and authors Copyright Line does not
match the current year
-- The document date (February 13, 2022) is 802 days in the past. Is this
intentional?
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
** Obsolete normative reference: RFC 3491 (Obsoleted by RFC 5891)
-- Obsolete informational reference (is this intentional?): RFC 3454
(Obsoleted by RFC 7564)
-- Obsolete informational reference (is this intentional?): RFC 3490
(Obsoleted by RFC 5890, RFC 5891)
Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 3 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
2 Network Working Group P. Faltstrom
3 Internet-Draft Netnod
4 Intended status: Standards Track February 13, 2022
5 Expires: August 17, 2022
7 IDNA2008 and Unicode 12.0.0
8 draft-faltstrom-unicode12-07
10 Abstract
12 This document describes the changes between Unicode 6.0.0 and Unicode
13 12.0.0 in the context of IDNA2008. Some additions and changes have
14 been made in the Unicode Standard that affect the values produced by
15 the algorithm IDNA2008 specifies. IDNA2008 allows adding exceptions
16 to the algorithm for backward compatibility; however, this document
17 does not add any such exceptions. This document provides the
18 necessary tables to IANA to make its database consistent with Unicode
19 12.0.0.
21 To improve understanding, this document describes systems that are
22 being used as alternatives to those that conform to IDNA2008.
24 TO BE REMOVED AT TIME OF PUBLICATION AS AN RFC:
26 This document is discussed on the i18n-discuss@ietf.org mailing list
27 of the IETF.
29 Status of This Memo
31 This Internet-Draft is submitted in full conformance with the
32 provisions of BCP 78 and BCP 79.
34 Internet-Drafts are working documents of the Internet Engineering
35 Task Force (IETF). Note that other groups may also distribute
36 working documents as Internet-Drafts. The list of current Internet-
37 Drafts is at https://datatracker.ietf.org/drafts/current/.
39 Internet-Drafts are draft documents valid for a maximum of six months
40 and may be updated, replaced, or obsoleted by other documents at any
41 time. It is inappropriate to use Internet-Drafts as reference
42 material or to cite them other than as "work in progress."
44 This Internet-Draft will expire on August 17, 2022.
46 Copyright Notice
48 Copyright (c) 2022 IETF Trust and the persons identified as the
49 document authors. All rights reserved.
51 This document is subject to BCP 78 and the IETF Trust's Legal
52 Provisions Relating to IETF Documents
53 (https://trustee.ietf.org/license-info) in effect on the date of
54 publication of this document. Please review these documents
55 carefully, as they describe your rights and restrictions with respect
56 to this document. Code Components extracted from this document must
57 include Simplified BSD License text as described in Section 4.e of
58 the Trust Legal Provisions and are provided without warranty as
59 described in the Simplified BSD License.
61 Table of Contents
63 1. Introduction
64 2. Background
65 2.1. IDNA2008 Documents
66 2.2. Additional important IDNA2008-related documents
67 2.3. Deployment
68 3. Notable Changes Between Unicode 6.0.0 and 12.0.0
69 3.1. Changes between Unicode 6.0.0 and 7.0.0
70 3.2. Changes between Unicode 7.0.0 and 10.0.0
71 3.3. Changes between Unicode 10.0.0 and 11.0.0
72 3.4. Changes between Unicode 11.0.0 and 12.0.0
73 4. U+111C9 SHARADA SANDHI MARK
74 5. Conclusion
75 6. IANA Considerations
76 7. Security Considerations
77 8. Acknowledgements
78 9. References
79 9.1. Normative References
80 9.2. Non-normative references
81 Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
82 Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0
83 Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0
84 Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0
85 Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0
86 Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0
87 Author's Address
89 1. Introduction
91 The current version of Internationalized Domain Names for
92 Applications (IDNA) was initiated in 2008, and despite not being
93 completed until 2010, is widely known as "IDNA2008". It is specified
94 in the series of documents listed in Section 2.1. The IDNA2008
95 standard includes an algorithm by which a derived property value is
96 calculated based on the properties defined from the Unicode Standard.
98 The derived property values that can be calculated are defined in RFC
99 5892 [RFC5892]. Below is a summary to aid in the reading of this
100 document. For definition of the terms, please see RFC 5892
101 [RFC5892].
103 o PROTOCOL VALID: Those that are allowed to be used in IDNs. Code
104 points with this property value are permitted for general use in
105 IDNs. However, the fact that a label consists only of code points that
106 have this property value does not imply that the label can be used
107 in DNS. The abbreviated term PVALID is used to refer to this
108 value.
110 o CONTEXTUAL RULE REQUIRED: Some characteristics of the character,
111 such as it being invisible in certain contexts or problematic in
112 others, require that it not be used in labels unless specific
113 other characters or properties are present. The abbreviated term
114 CONTEXT is used to refer to this value. As explained in RFC 5892
115 [RFC5892] CONTEXT is in turn divided into CONTEXTJ and CONTEXTO.
117 o DISALLOWED: Those that should clearly not be included in IDNs.
118 Code points with this property value are not permitted in IDNs.
120 o UNASSIGNED: Those code points that are not designated (i.e., are
121 unassigned) in the Unicode Standard.
123 When the Unicode Standard is updated, new code points are assigned
124 and already-assigned code points can have their property values
125 changed.
127 o Assigning code points can create problems if the newly assigned
128 code points are compositions of existing code points and the
129 normalization relationships associated with those code points
130 should, as a result, have been changed.
132 o Changing properties for already-assigned code points can create
133 problems if the property change results in changes to the derived
134 property value. A code point that was previously allowed because
135 its derived property value was PVALID is no longer allowed
136 if its derived property value changes to DISALLOWED. The
137 problem can also happen the other way around: a code point that
138 was not allowed (and thus is prohibited) can suddenly end up being
139 allowed.
141 o Problems can also be created if the properties assigned to those
142 code points are inconsistent with IDNA2008 assumptions about how
143 properties are assigned and/or about how code points with those
144 properties are used or behave.
146 There were three incompatible changes in the Unicode standard between
147 Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they
148 are described in RFC 6452 [RFC6452]. The code points U+0CF1 and
149 U+0CF2 had a derived property value change from DISALLOWED to PVALID,
150 and the code point U+19DA had a change in derived property value from
151 PVALID to DISALLOWED. These changes were examined in great detail,
152 but the IETF concluded that these changes to the Unicode standard did
153 not warrant an update to RFC 5892 [RFC5892].
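
The distinction drawn above, between newly assigned code points and
already-assigned code points whose derived property value changes, can
be sketched in a few lines of Python. This is an illustration only: the
function name and the dictionary layout are assumptions made for this
example and are not part of IDNA2008. The values are the ones stated
above for U+0CF1, U+0CF2 and U+19DA.

```python
# Illustrative sketch; function and variable names are hypothetical.

def incompatible_changes(old, new):
    """Return changes whose starting value is not UNASSIGNED.

    `old` and `new` map code points (ints) to derived property values
    (strings). A code point missing from `old` is treated as UNASSIGNED,
    so a new assignment is never reported.
    """
    result = {}
    for cp, new_value in sorted(new.items()):
        old_value = old.get(cp, "UNASSIGNED")
        if old_value not in ("UNASSIGNED", new_value):
            result[cp] = (old_value, new_value)
    return result


# Derived property values as stated in the text for the two versions.
unicode_5_2 = {0x0CF1: "DISALLOWED", 0x0CF2: "DISALLOWED", 0x19DA: "PVALID"}
unicode_6_0 = {0x0CF1: "PVALID", 0x0CF2: "PVALID", 0x19DA: "DISALLOWED"}

for cp, (before, after) in incompatible_changes(unicode_5_2, unicode_6_0).items():
    print(f"U+{cp:04X}: {before} -> {after}")
```

All three code points are reported, whereas a code point that moves from
UNASSIGNED to PVALID or DISALLOWED would not be.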
155 As described in Section 3, more incompatible changes have been made
156 to code points between Unicode 6.0.0 and Unicode 12.0.0
157 [Unicode-12.0.0]; however, the changes in the derived property values
158 do not result in exceptions (as defined in section 2.6 of RFC 5892
159 [RFC5892]) being added to RFC 5892 [RFC5892].
161 Further, in 2015, the Internet Architecture Board (IAB) issued a
162 statement [IAB2005-1] that advised the community to avoid using any
163 of the potentially problematic code points and asked the IETF to
164 resolve the issues related to the code point ARABIC LETTER BEH WITH
165 HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0
166 [Unicode-7.0.0]. In February of that year, the statement was revised
167 [IAB2005-2] to focus on the latter request. More details about the
168 problem of code point sequences not normalizing as one might expect
169 appear in a draft that was part of the discussion [IDNA7].
171 The result of the work in the IETF was that no exception was added to
172 RFC 5892 [RFC5892]; however, it should be noted that the review of
173 the issues around U+08A1 indicated that this code point is not an
174 isolated case and that a number of long-standing PVALID code points
175 may have similar issues. While the affected code points remain
176 PVALID in this document, identification of the problem resulted in a
177 clarification of the review process for new Unicode versions. That
178 clarification, which reinforces the original review plan to capture
179 issues like these, was published as RFC 8753 [RFC8753]. Any review
180 of Unicode versions after 12.0.0 should be made according to RFC 8753
181 [RFC8753]; an objective of this document is to ensure that a proper
182 review of such versions after version 12.0.0 can be made.
184 2. Background
185 2.1. IDNA2008 Documents
187 IDNA2008 consists of the following documents. The documents in the
188 set have informal names.
190 o Internationalized Domain Names for Applications (IDNA):
191 Definitions and Document Framework [RFC5890], informally called
192 "Defs" or "Definitions", contains definitions and other material
193 that are needed for understanding other documents in the set.
195 o Internationalized Domain Names in Applications (IDNA): Protocol
196 [RFC5891], informally called "Protocol", describes the core
197 IDNA2008 protocol and its operations. It needs to be interpreted
198 in combination with the Bidi document (described below).
200 o The Unicode Code Points and Internationalized Domain Names for
201 Applications (IDNA) [RFC5892], informally called "Tables", lists
202 the categories and rules that identify the code points allowed in
203 a label written in native character form (called a "U-label"), and
204 is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments
205 and additional rules unique to IDNA2008. The Unicode-based rules
206 in RFC 5892 are expected to be stable across Unicode updates and
207 hence independent of Unicode versions. RFC 5892 [RFC5892]
208 obsoletes RFC 3491 [RFC3491], and in particular the use of the
209 tables to which RFC 3491 [RFC3491] refers.
211 o Right-to-Left Scripts for Internationalized Domain Names for
212 Applications (IDNA) [RFC5893], informally called "Bidi", specifies
213 special rules for labels that contain characters that are written
214 from right to left.
216 o Internationalized Domain Names for Applications (IDNA):
217 Background, Explanation, and Rationale [RFC5894], informally
218 called "Rationale", provides an overview of the protocol and
219 associated tables, and gives explanatory material and some
220 rationale for the decisions that led to IDNA2008. It also
221 contains advice for DNS registry operators and others who use
222 Internationalized Domain Names (IDNs).
224 o Mapping Characters for Internationalized Domain Names in
225 Applications (IDNA) 2008 [RFC5895], informally called "Mapping",
226 discusses the issue of mapping characters into other characters
227 and provides guidance for doing so when that is appropriate. RFC
228 5895 provides advice only and is not a required part of IDNA.
230 2.2. Additional important IDNA2008-related documents
232 There are other documents important for the understanding and
233 functioning of IDNA2008, for example the following:
235 o The Unicode Code Points and Internationalized Domain Names for
236 Applications (IDNA) - Unicode 6.0 [RFC6452] describes some changes
237 made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in derived
238 property value change for the code points U+0CF1, U+0CF2 and
239 U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID,
240 while U+19DA changed from PVALID to DISALLOWED. The IETF
241 concluded that no update to RFC 5892 [RFC5892] was needed based on
242 the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result,
243 the derived property value remained aligned with the Unicode
244 Standard. Specifically, no exception was added.
246 2.3. Deployment
248 There are many variations on the general IDNA model in use in the
249 various parts of the community. The following lists some of the
250 strategies that implementations that claim to be IDNA compliant are
251 known to use, but it should be noted the list is not complete:
253 o IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491
254 [RFC3491]. Those specifications are dependent on case folding and
255 NFKC normalization and on tables that specify for each code point
256 whether it is allowed to be used or not, with a distinction made
257 between use for "stored strings" and "query strings". The tables
258 themselves are dependent on Unicode 3.2 [Unicode-3.2.0].
260 o A number of variations on IDNA2003, sometimes presented as
261 "updated IDNA2003" or the like, which follow the principles of
262 IDNA2003 as understood by the implementers but that use tables
263 that represent how the implementers believe Stringprep [RFC3454]
264 and Nameprep [RFC3491] would have evolved had the IETF not moved
265 in the direction of IDNA2008 instead.
267 o A mix between IDNA2003 and IDNA2008 where code points assigned to
268 Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property
269 value calculated according to the algorithm specified in IDNA2008.
271 o A mix between IDNA2003 and IDNA2008 according to the Unicode
272 Technical Standard #46 [UTS-46]. Because that document specifies
273 different profiles, there are several variations that leave users
274 with no guarantee that two applications claiming conformance to
275 UTS#46 will interoperate well with each other much less with
276 conforming IDNA2008 implementations. UTS#46 is ultimately based
277 on a normative table very much like the one used by Stringprep
278 [RFC3454] but updated for each new version of Unicode.
280 o The (normative) IDNA2008 algorithm applied to whatever version of
281 Unicode Standard exists in the operating system and/or libraries
282 used, independent of whatever version of tables appears in the
283 (non-normative) IANA database.
285 In practice, the Unicode Consortium creates a maximum set of code
286 points by assigning code points in the Unicode Standard. The
287 IDNA2008 rules use the Unicode Standard to create a further subset of
288 code points and context that are permitted in DNS labels associated
289 with its PVALID, and CONTEXT (CONTEXTJ or CONTEXTO) derived property
290 values. DNS registries and other organizations that deal with IDNs
291 are supposed to create their own subsets from IDNA2008 for use by
292 those registries and organizations.
294 This progressive subsetting and narrowing of the repertoire of code
295 points that can be used in labels is an implementation of the
296 principles of being conservative when deciding what code points to
297 include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912]
298 recommend that DNS registries and other organizations be
299 conservative when creating their subsets, and that they use the
300 principle of creating subsets by inclusion.
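
As a rough illustration of "subsets by inclusion", the sketch below
treats a registry repertoire as an explicit list of code points that
must itself lie inside the PVALID set. The four PVALID code points are
taken from Appendix A; the registry repertoire and the function name are
hypothetical. A real registry check involves more than repertoire
membership (for example the Bidi and contextual rules), so this shows
only the subsetting step.

```python
# PVALID entries copied from Appendix A (Cyrillic small letters added in 7.0.0).
PVALID_SAMPLE = frozenset({0x0529, 0x052B, 0x052D, 0x052F})

# Hypothetical registry policy: an explicit inclusion list, nothing implied.
REGISTRY_REPERTOIRE = frozenset({0x0529, 0x052B})


def outside_repertoire(label, repertoire):
    """List the code points of `label` that the repertoire does not include."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) not in repertoire]


# The registry subset must be contained in what IDNA2008 permits.
assert REGISTRY_REPERTOIRE <= PVALID_SAMPLE

print(outside_repertoire("\u0529\u052B", REGISTRY_REPERTOIRE))
print(outside_repertoire("\u0529\u052D", REGISTRY_REPERTOIRE))
```

The first label uses only included code points; the second contains
U+052D, which is PVALID but was not included by this hypothetical
registry and is therefore rejected.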
302 See also the Security Considerations section in this document.
304 3. Notable Changes Between Unicode 6.0.0 and 12.0.0
306 Among the changes between the Unicode versions, most code points that
307 change derived property value change from UNASSIGNED to PVALID or
308 from UNASSIGNED to DISALLOWED. The notable changes are those that
309 affect already-assigned code points. All changes between the major
310 versions of Unicode can be found in Appendix A (6.0.0-7.0.0),
311 Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D
312 (9.0.0-10.0.0), Appendix E (10.0.0-11.0.0) and Appendix F
313 (11.0.0-12.0.0).
315 3.1. Changes between Unicode 6.0.0 and 7.0.0
317 Change in number of characters in each category:
319 PVALID changed from 97418 to 99867 (+2449)
321 UNASSIGNED changed from 865081 to 861509 (-3572)
323 CONTEXTJ did not change, at 2
324 CONTEXTO did not change, at 25
326 DISALLOWED changed from 151586 to 152709 (+1123)
328 TOTAL did not change, at 1114112
330 There are no changes made to Unicode between version 6.0.0 and
331 7.0.0 that impact IDNA2008 calculation of the derived property
332 values.
334 The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL
335 INHERENT AA both changed the general category from Cf (Format) to Mn
336 (Nonspacing_Mark), but that did not impact the calculation of the
337 derived property value which stayed at DISALLOWED.
339 The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was
340 introduced in Unicode 7.0.0. This was discussed extensively in the
341 IETF, and by the IAB in their statement [IAB2005-1] requesting the
342 IETF to investigate the issue. Specifically, the IAB stated:
344 On the same precautionary principle, the IAB recommends that the
345 Internationalized Domain Names for Applications (IDNA) Parameters
346 registry not be
347 updated to Unicode 7.0.0 until the IETF has consensus on a
348 solution to this problem.
350 The discussion in the IETF concluded that although it is possible to
351 create "the same" character in multiple ways, the issue with U+08A1
352 is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA
353 ABOVE) can be represented with the sequence ARABIC LETTER BEH
354 (U+0628) and ARABIC HAMZA ABOVE (U+0654). This is identical to the case of LATIN
355 SMALL LETTER O WITH STROKE (U+00F8), which can be represented with
356 the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING
357 SHORT SOLIDUS OVERLAY (U+0337).
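
The behaviour described above can be observed with Python's standard
unicodedata module. The sketch below is illustrative: the helper name is
an assumption, and the third sequence (ARABIC LETTER ALEF followed by
ARABIC HAMZA ABOVE) does not come from this document but is added as a
contrasting case in which NFC does compose the sequence into a single
code point.

```python
import unicodedata


def nfc_code_points(text):
    """Return the code points of `text` after NFC normalization."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


beh_hamza = "\u0628\u0654"   # ARABIC LETTER BEH + ARABIC HAMZA ABOVE
o_overlay = "\u006F\u0337"   # LATIN SMALL LETTER O + COMBINING SHORT SOLIDUS OVERLAY
alef_hamza = "\u0627\u0654"  # contrast case, not taken from this document

# Neither sequence from the text composes to the single code point.
print(nfc_code_points(beh_hamza))
print(nfc_code_points(o_overlay))

# U+0623 has a canonical decomposition, so this sequence does compose.
print(nfc_code_points(alef_hamza))
```

Because U+08A1 and U+00F8 have no canonical decomposition, normalization
does not make the single code point and the two-code-point sequence
compare equal, although they may look the same when rendered.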
359 Although the discussion about this specific code point resulted in
360 acceptance of the derived property value of PVALID, the underlying
361 problem with combining sequences is not understood fully. Therefore,
362 it cannot be claimed that this case can be extrapolated to other
363 situations and other code points.
365 3.2. Changes between Unicode 7.0.0 and 10.0.0
367 Change in number of characters in each category:
369 Code points that changed derived property value: 0
371 PVALID changed from 99867 to 122411 (+22544)
372 UNASSIGNED changed from 861509 to 837775 (-23734)
374 CONTEXTJ did not change, at 2
376 CONTEXTO did not change, at 25
378 DISALLOWED changed from 152709 to 153899 (+1190)
380 TOTAL did not change, at 1114112
382 There are no changes made to Unicode between version 7.0.0 and
383 10.0.0 that impact IDNA2008 calculation of the derived property
384 values.
386 3.3. Changes between Unicode 10.0.0 and 11.0.0
388 Change in number of characters in each category:
390 Code points that changed derived property value: 1
392 PVALID changed from 122411 to 122734 (+323)
394 UNASSIGNED changed from 837775 to 837091 (-684)
396 CONTEXTJ did not change, at 2
398 CONTEXTO did not change, at 25
400 DISALLOWED changed from 153899 to 154260 (+361)
402 TOTAL did not change, at 1114112
404 Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF
405 had their General Category changed from Lo to Ll, to reflect
406 their status as the lowercase of new Georgian case pairs. Case
407 mappings were also added.
409 SHARADA SANDHI MARK (U+111C9) was changed from Po to Mn, and from
410 bc=L to bc=NSM.
412 The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
413 ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to
414 Mn.
416 SPHERICAL ANGLE OPENING UP (U+29A1) was changed to Bidi_M=N.
418 These changes to the Unicode Standard have the following implications
419 for these code points:
421 o The 684 newly assigned characters are assigned a derived property
422 value as a result of applying the IDNA2008 algorithm.
424 o The Georgian letters in the ranges U+10D0..U+10FA and
425 U+10FD..U+10FF existed before IDNA2008 was created. Applying the
426 IDNA2008 algorithm to the code points assigned the derived
427 property value PVALID, and that value is unchanged even if the
428 underlying Unicode properties have changed. The newly encoded
429 Mtavruli letters have general category "Lu" and are therefore
430 DISALLOWED.
432 o The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0
433 [Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code
434 point assigned the derived property value DISALLOWED. The changes
435 in the underlying properties in the Unicode Standard Version
436 11.0.0 [Unicode-11.0.0] caused the derived property value to
437 change to PVALID.
439 o The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
440 ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode
441 10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the
442 code points assigned the derived property value PVALID, and that
443 value is unchanged even if the underlying Unicode properties have
444 changed.
446 o SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was
447 created. Applying the IDNA2008 algorithm to the code point
448 assigned the derived property value DISALLOWED, and that value is
449 unchanged even if the underlying Unicode properties have changed.
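
Why only U+111C9 changed its derived property value can be illustrated
with the LetterDigits category of RFC 5892, Section 2.1, which contains
the General_Category values Ll, Lu, Lo, Nd, Lm, Mn and Mc. The sketch
below is a deliberate simplification: it looks at LetterDigits membership
only and assumes that no earlier rule in the RFC 5892 ordering
(Exceptions, Unstable, and so on) applies. The names are hypothetical,
and the category changes are the ones listed above.

```python
# General_Category values in the LetterDigits category (RFC 5892, Section 2.1).
LETTER_DIGITS = frozenset({"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"})

# (old, new) General_Category for the changes described in the text.
CATEGORY_CHANGES = {
    "U+10D0..U+10FA, U+10FD..U+10FF (Georgian)": ("Lo", "Ll"),
    "U+111C9 SHARADA SANDHI MARK": ("Po", "Mn"),
    "U+11A07..U+11A08 (Zanabazar Square)": ("Mc", "Mn"),
}


def membership_flips(changes):
    """Return the entries whose LetterDigits membership differs before and after."""
    flips = {}
    for name, (old_gc, new_gc) in changes.items():
        before = old_gc in LETTER_DIGITS
        after = new_gc in LETTER_DIGITS
        if before != after:
            flips[name] = (before, after)
    return flips


for name, (before, after) in membership_flips(CATEGORY_CHANGES).items():
    print(f"{name}: LetterDigits {before} -> {after}")
```

Only the change from Po to Mn moves a code point into LetterDigits, which
is consistent with U+111C9 going from DISALLOWED to PVALID while the
Georgian and Zanabazar Square code points remain PVALID.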
451 3.4. Changes between Unicode 11.0.0 and 12.0.0
453 Change in number of characters in each category:
455 Code points that changed derived property value: 0
457 PVALID changed from 122734 to 123006 (+272)
459 UNASSIGNED changed from 837091 to 836537 (-554)
461 CONTEXTJ did not change, at 2
463 CONTEXTO did not change, at 25
465 DISALLOWED changed from 154260 to 154542 (+282)
467 TOTAL did not change, at 1114112
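
The per-category counts in Sections 3.1 through 3.4 can be cross-checked
with simple arithmetic: for every version the five categories must sum
to the size of the code space (1114112), and the changes between two
versions must therefore sum to zero. The following sketch does this with
the numbers given above; the variable and function names are
assumptions made for this example.

```python
TOTAL_CODE_POINTS = 0x110000  # 1114112, the TOTAL given in each subsection

CATEGORIES = ("PVALID", "UNASSIGNED", "CONTEXTJ", "CONTEXTO", "DISALLOWED")

# Counts per Unicode version, in the order of CATEGORIES.
COUNTS = {
    "6.0.0": (97418, 865081, 2, 25, 151586),
    "7.0.0": (99867, 861509, 2, 25, 152709),
    "10.0.0": (122411, 837775, 2, 25, 153899),
    "11.0.0": (122734, 837091, 2, 25, 154260),
    "12.0.0": (123006, 836537, 2, 25, 154542),
}


def deltas(old_version, new_version):
    """Return the per-category change between two versions."""
    old, new = COUNTS[old_version], COUNTS[new_version]
    return {cat: n - o for cat, o, n in zip(CATEGORIES, old, new)}


# Every version must account for the whole code space.
for version, counts in COUNTS.items():
    assert sum(counts) == TOTAL_CODE_POINTS, version

versions = list(COUNTS)
for old_version, new_version in zip(versions, versions[1:]):
    change = deltas(old_version, new_version)
    assert sum(change.values()) == 0
    print(old_version, "->", new_version, change)
```

The decrease in UNASSIGNED for each step is the number of newly assigned
code points, for example 684 between Unicode 10.0.0 and 11.0.0.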
469 4. U+111C9 SHARADA SANDHI MARK
471 As one can see in Section 3, an incompatible property change was made
472 between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9.
473 Its derived property value thus changed from DISALLOWED to PVALID.
474 In situations like these, IDNA2008 allows for the addition of rules to
475 Section 2.7 of RFC 5892 [RFC5892]. If the code point is accepted, it might
476 still be rejected if validated by software based on older versions of
477 Unicode than 12.0.0. As the character is rarely used outside the
478 group of Sharada specialists, and used in some records for indicating
479 sandhi breaks, the conclusion is that it could either be added as an
480 exception or allowed to change its property value, as the use of the
481 code point is limited outside a special community. As including an
482 exception would require implementation changes in deployed
483 implementations of IDNA2008, the IETF has decided not to add a
484 BackwardCompatible rule to IDNA2008 (i.e., Section 2.7 of RFC 5892
485 [RFC5892]) for this code point. This also ensures that all sandhi marks
486 are treated in the same way.
488 5. Conclusion
490 As described in Section 3 and Section 4, changes have been made to
491 Unicode between version 6.0.0 and 12.0.0. Some changes to specific
492 characters changed their derived property value, whereas other
493 changes did not. Given the deployment considerations described in
494 Section 2.3 and changes in the Unicode Standard described in
495 Section 3 and Section 4, including implications to normalization, the
496 conclusion is to not add any exception rules to IDNA2008.
498 This document addresses only changes to Unicode between version 6.0.0
499 and version 12.0.0. Changes in future Unicode versions might result
500 in the conclusion that exception rules need to be added to IDNA2008
501 after the review process explained in RFC 8753 [RFC8753]. Separately
502 from any changes in Unicode, the IETF might conclude that updates to
503 RFC 5892 [RFC5892] or other IDNA2008 documents might become
504 necessary; such updates might include changes to the algorithm
505 specified in IDNA2008 as well as additional rules, categories, or
506 other forms of tuning, like the clarifications in RFC 8753 [RFC8753].
508 6. IANA Considerations
510 IANA is requested to update the IDNA Parameters registry [IANA-IDNA]
511 of derived property values, after the expert reviewer validates that
512 the derived property values are calculated correctly.
514 7. Security Considerations
516 This document makes recommendations regarding the use of the IDNA2008
517 algorithm for calculation of derived property values, based on
518 Unicode version 12.0.0. These recommendations do not address
519 what should be recommended for future versions of the Unicode
520 Standard.
522 Not following these recommendations can lead to various security
523 issues. Specifically, allowing confusable characters may lead to
524 various phishing attacks, as described in the Security Considerations
525 sections of the documents listed in Section 2.1.
527 8. Acknowledgements
529 Thanks to Harald Alvestrand, Marc Blanchet, Martin Duerst, Asmus
530 Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter
531 Saint-Andre, Michel Suignard, Andrew Sullivan and Suzanne Woolf for
532 input to this document.
534 9. References
536 9.1. Normative References
538 [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
539 Profile for Internationalized Domain Names (IDN)",
540 RFC 3491, DOI 10.17487/RFC3491, March 2003.
543 [RFC5890] Klensin, J., "Internationalized Domain Names for
544 Applications (IDNA): Definitions and Document Framework",
545 RFC 5890, DOI 10.17487/RFC5890, August 2010.
548 [RFC5891] Klensin, J., "Internationalized Domain Names in
549 Applications (IDNA): Protocol", RFC 5891,
550 DOI 10.17487/RFC5891, August 2010.
553 [RFC5892] Faltstrom, P., Ed., "The Unicode Code Points and
554 Internationalized Domain Names for Applications (IDNA)",
555 RFC 5892, DOI 10.17487/RFC5892, August 2010.
558 [RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts
559 for Internationalized Domain Names for Applications
560 (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010.
563 [RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code
564 Points and Internationalized Domain Names for Applications
565 (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
566 November 2011.
568 9.2. Non-normative references
570 [IAB2005-1]
571 Internet Architecture Board, "IAB Statement on Identifiers
572 and Unicode 7.0.0", January 2015.
578 [IAB2005-2]
579 Internet Architecture Board, "IAB Statement on Identifiers
580 and Unicode 7.0.0", February 2015.
586 [IANA-IDNA]
587 IANA, "IDNA Rules and Derived Property Values", April 2020.
592 [IDNA7] Klensin, J. and P. Faltstrom, "IDNA Update for Unicode 7.0
593 and Later Versions", draft-klensin-idna-5892upd-unicode70,
594 October 2017.
597 [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
598 Internationalized Strings ("stringprep")", RFC 3454,
599 DOI 10.17487/RFC3454, December 2002.
602 [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
603 "Internationalizing Domain Names in Applications (IDNA)",
604 RFC 3490, DOI 10.17487/RFC3490, March 2003.
607 [RFC5894] Klensin, J., "Internationalized Domain Names for
608 Applications (IDNA): Background, Explanation, and
609 Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010.
612 [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
613 Internationalized Domain Names in Applications (IDNA)
614 2008", RFC 5895, DOI 10.17487/RFC5895, September 2010.
617 [RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman,
618 "Principles for Unicode Code Point Inclusion in Labels in
619 the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013.
622 [RFC8753] Klensin, J. and P. Faeltstroem, "Internationalized Domain
623 Names for Applications (IDNA) Review for New Unicode
624 Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020.
627 [SAC-084] The Security and Stability Advisory Committee, "SAC084",
628 SSAC Comments on Guidelines for the Extended Process
629 Similarity Review Panel for the IDN ccTLD Fast Track
630 Process, August 2016.
633 [Unicode-3.2.0]
634 The Unicode Consortium, "The Unicode Standard, Version
635 3.2.0", ISBN 0-201-61633-5, March 2002.
638 [Unicode-5.2.0]
639 The Unicode Consortium, "The Unicode Standard, Version
640 5.2.0", ISBN 978-1-936213-00-9, October 2009.
643 [Unicode-6.0.0]
644 The Unicode Consortium, "The Unicode Standard, Version
645 6.0.0", ISBN 978-1-936213-01-6, October 2011.
648 [Unicode-7.0.0]
649 The Unicode Consortium, "The Unicode Standard, Version
650 7.0.0", ISBN 978-1-936213-09-2, June 2014.
653 [Unicode-8.0.0]
654 The Unicode Consortium, "The Unicode Standard, Version
655 8.0.0", ISBN 978-1-936213-10-8, June 2015.
658 [Unicode-10.0.0]
659 The Unicode Consortium, "The Unicode Standard, Version
660 10.0.0", ISBN 978-1-936213-16-0, June 2017.
663 [Unicode-11.0.0]
664 The Unicode Consortium, "The Unicode Standard, Version
665 11.0.0", ISBN 978-1-936213-19-1, June 2018.
668 [Unicode-12.0.0]
669 The Unicode Consortium, "The Unicode Standard, Version
670 12.0.0", ISBN 978-1-936213-22-1, March 2019.
673 [UTS-46] The Unicode Consortium, "Unicode Technical Standard #46,
674 Version 12.0.0", UNICODE IDNA COMPATIBILITY PROCESSING,
675 March 2019.
678 Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
680 Changes from derived property value UNASSIGNED to either PVALID or
681 DISALLOWED.
683 037F ; DISALLOWED # GREEK CAPITAL LETTER YOT
684 0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
685 0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
686 052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE
687 052B ; PVALID # CYRILLIC SMALL LETTER DZZHE
688 052C ; DISALLOWED # CYRILLIC CAPITAL LETTER DCHE
689 052D ; PVALID # CYRILLIC SMALL LETTER DCHE
690 052E ; DISALLOWED # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
691 052F ; PVALID # CYRILLIC SMALL LETTER EL WITH DESCENDER
692 058D..058F ; DISALLOWED # RIGHT-FACING ARMENIAN ETERNITY SIGN..ARMENIAN
693 0604..0605 ; DISALLOWED # ARABIC SIGN SAMVAT..ARABIC NUMBER MARK ABOVE
694 061C ; DISALLOWED # ARABIC LETTER MARK
695 08A0..08B2 ; PVALID # ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC
696 08E4..08FF ; PVALID # ARABIC CURLY FATHA..ARABIC MARK SIDEWAYS NOON
697 0978 ; PVALID # DEVANAGARI LETTER MARWARI DDA
698 0980 ; PVALID # BENGALI ANJI
699 0AF0 ; DISALLOWED # GUJARATI ABBREVIATION SIGN
700 0C00 ; PVALID # TELUGU SIGN COMBINING CANDRABINDU ABOVE
701 0C34 ; PVALID # TELUGU LETTER LLLA
702 0C81 ; PVALID # KANNADA SIGN CANDRABINDU
703 0D01 ; PVALID # MALAYALAM SIGN CANDRABINDU
704 0DE6..0DEF ; PVALID # SINHALA LITH DIGIT ZERO..SINHALA LITH DIGIT N
705 0EDE..0EDF ; PVALID # LAO LETTER KHMU GO..LAO LETTER KHMU NYO
706 10C7 ; DISALLOWED # GEORGIAN CAPITAL LETTER YN
707 10CD ; DISALLOWED # GEORGIAN CAPITAL LETTER AEN
708 10FD..10FF ; PVALID # GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL S
709 16F1..16F8 ; PVALID # RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AE
710 17B4..17B5 ; DISALLOWED # KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT
711 191D..191E ; PVALID # LIMBU LETTER GYAN..LIMBU LETTER TRA
712 1AB0..1ABD ; PVALID # COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBININ
713 1ABE ; DISALLOWED # COMBINING PARENTHESES OVERLAY
714 1BAB..1BAD ; PVALID # SUNDANESE SIGN VIRAMA..SUNDANESE CONSONANT SI
715 1BBA..1BBF ; PVALID # SUNDANESE AVAGRAHA..SUNDANESE LETTER FINAL M
716 1CC0..1CC7 ; DISALLOWED # SUNDANESE PUNCTUATION BINDU SURYA..SUNDANESE
717 1CF3..1CF6 ; PVALID # VEDIC SIGN ROTATED ARDHAVISARGA..VEDIC SIGN U
718 1CF8..1CF9 ; PVALID # VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING
719 1DE7..1DF5 ; PVALID # COMBINING LATIN SMALL LETTER ALPHA..COMBINING
720 2066..2069 ; DISALLOWED # LEFT-TO-RIGHT ISOLATE..POP DIRECTIONAL ISOLAT
721 20BA..20BD ; DISALLOWED # TURKISH LIRA SIGN..RUBLE SIGN
722 23F4..23FA ; DISALLOWED # BLACK MEDIUM LEFT-POINTING TRIANGLE..BLACK CI
723 2700 ; DISALLOWED # BLACK SAFETY SCISSORS
724 27CB ; DISALLOWED # MATHEMATICAL RISING DIAGONAL
725 27CD ; DISALLOWED # MATHEMATICAL FALLING DIAGONAL
726 2B4D..2B4F ; DISALLOWED # DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..SHORT
727 2B5A..2B73 ; DISALLOWED # SLANTED NORTH ARROW WITH HOOKED HEAD..DOWNWAR
728 2B76..2B95 ; DISALLOWED # NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGH
729 2B98..2BB9 ; DISALLOWED # THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARR
730 2BBD..2BC8 ; DISALLOWED # BALLOT BOX WITH LIGHT X..BLACK MEDIUM RIGHT-P
731 2BCA..2BD1 ; DISALLOWED # TOP HALF BLACK CIRCLE..UNCERTAINTY SIGN
732 2CF2 ; DISALLOWED # COPTIC CAPITAL LETTER BOHAIRIC KHEI
733 2CF3 ; PVALID # COPTIC SMALL LETTER BOHAIRIC KHEI
734 2D27 ; PVALID # GEORGIAN SMALL LETTER YN
735 2D2D ; PVALID # GEORGIAN SMALL LETTER AEN
736 2D66..2D67 ; PVALID # TIFINAGH LETTER YE..TIFINAGH LETTER YO
737 2E32..2E42 ; DISALLOWED # TURNED COMMA..DOUBLE LOW-REVERSED-9 QUOTATION
738 9FCC ; PVALID # <CJK Ideograph>
739 A674..A67B ; PVALID # COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBI
740 A698 ; DISALLOWED # CYRILLIC CAPITAL LETTER DOUBLE O
741 A699 ; PVALID # CYRILLIC SMALL LETTER DOUBLE O
742 A69A ; DISALLOWED # CYRILLIC CAPITAL LETTER CROSSED O
743 A69B ; PVALID # CYRILLIC SMALL LETTER CROSSED O
744 A69C..A69D ; DISALLOWED # MODIFIER LETTER CYRILLIC HARD SIGN..MODIFIER
745 A69F ; PVALID # COMBINING CYRILLIC LETTER IOTIFIED E
746 A792 ; DISALLOWED # LATIN CAPITAL LETTER C WITH BAR
747 A793..A795 ; PVALID # LATIN SMALL LETTER C WITH BAR..LATIN SMALL LE
748 A796 ; DISALLOWED # LATIN CAPITAL LETTER B WITH FLOURISH
749 A797 ; PVALID # LATIN SMALL LETTER B WITH FLOURISH
750 A798 ; DISALLOWED # LATIN CAPITAL LETTER F WITH STROKE
751 A799 ; PVALID # LATIN SMALL LETTER F WITH STROKE
752 A79A ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK AE
753 A79B ; PVALID # LATIN SMALL LETTER VOLAPUK AE
754 A79C ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK OE
755 A79D ; PVALID # LATIN SMALL LETTER VOLAPUK OE
756 A79E ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK UE
757 A79F ; PVALID # LATIN SMALL LETTER VOLAPUK UE
758 A7AA..A7AD ; DISALLOWED # LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPIT
759 A7B0..A7B1 ; DISALLOWED # LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL
760 A7F7 ; PVALID # LATIN EPIGRAPHIC LETTER SIDEWAYS I
761 A7F8..A7F9 ; DISALLOWED # MODIFIER LETTER CAPITAL H WITH STROKE..MODIFI
762 A9E0..A9FE ; PVALID # MYANMAR LETTER SHAN GHA..MYANMAR LETTER TAI L
763 AA7C..AA7F ; PVALID # MYANMAR SIGN TAI LAING TONE-2..MYANMAR LETTER
764 AAE0..AAEF ; PVALID # MEETEI MAYEK LETTER E..MEETEI MAYEK VOWEL SIG
765 AAF0..AAF1 ; DISALLOWED # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK AHANG KHU
766 AAF2..AAF6 ; PVALID # MEETEI MAYEK ANJI..MEETEI MAYEK VIRAMA
767 AB30..AB5A ; PVALID # LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL
768 AB5B..AB5F ; DISALLOWED # MODIFIER BREVE WITH INVERTED BREVE..MODIFIER
769 AB64..AB65 ; PVALID # LATIN SMALL LETTER INVERTED ALPHA..GREEK LETT
770 FA2E..FA2F ; DISALLOWED # CJK COMPATIBILITY IDEOGRAPH-FA2E..CJK COMPATI
771 FE27..FE2D ; PVALID # COMBINING LIGATURE LEFT HALF BELOW..COMBINING
772 1018B..1018C; DISALLOWED # GREEK ONE QUARTER SIGN..GREEK SINUSOID SIGN
773 101A0 ; DISALLOWED # GREEK SYMBOL TAU RHO
774 102E0 ; PVALID # COPTIC EPACT THOUSANDS MARK
775 102E1..102FB; DISALLOWED # COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER N
776 1031F ; PVALID # OLD ITALIC LETTER ESS
777 10350..1037A; PVALID # OLD PERMIC LETTER AN..COMBINING OLD PERMIC LE
778 10500..10527; PVALID # ELBASAN LETTER A..ELBASAN LETTER KHE
779 10530..10563; PVALID # CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBA
780 1056F ; DISALLOWED # CAUCASIAN ALBANIAN CITATION MARK
781 10600..10736; PVALID # LINEAR A SIGN AB001..LINEAR A SIGN A664
782 10740..10755; PVALID # LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE
783 10760..10767; PVALID # LINEAR A SIGN A800..LINEAR A SIGN A807
784 10860..10876; PVALID # PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW
785 10877..1087F; DISALLOWED # PALMYRENE LEFT-POINTING FLEURON..PALMYRENE NU
786 10880..1089E; PVALID # NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTE
787 108A7..108AF; DISALLOWED # NABATAEAN NUMBER ONE..NABATAEAN NUMBER ONE HU
788 10980..109B7; PVALID # MEROITIC HIEROGLYPHIC LETTER A..MEROITIC CURS
789 109BE..109BF; PVALID # MEROITIC CURSIVE LOGOGRAM RMT..MEROITIC CURSI
790 10A80..10A9C; PVALID # OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABI
791 10A9D..10A9F; DISALLOWED # OLD NORTH ARABIAN NUMBER ONE..OLD NORTH ARABI
792 10AC0..10AC7; PVALID # MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WA
793 10AC8 ; DISALLOWED # MANICHAEAN SIGN UD
794 10AC9..10AE6; PVALID # MANICHAEAN LETTER ZAYIN..MANICHAEAN ABBREVIAT
795 10AEB..10AF6; DISALLOWED # MANICHAEAN NUMBER ONE..MANICHAEAN PUNCTUATION
796 10B80..10B91; PVALID # PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI
797 10B99..10B9C; DISALLOWED # PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI
798 10BA9..10BAF; DISALLOWED # PSALTER PAHLAVI NUMBER ONE..PSALTER PAHLAVI N
799 1107F ; PVALID # BRAHMI NUMBER JOINER
800 110D0..110E8; PVALID # SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER
801 110F0..110F9; PVALID # SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT N
802 11100..11134; PVALID # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYYAA
803 11136..1113F; PVALID # CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
804 11140..11143; DISALLOWED # CHAKMA SECTION MARK..CHAKMA QUESTION MARK
805 11150..11173; PVALID # MAHAJANI LETTER A..MAHAJANI SIGN NUKTA
806 11174..11175; DISALLOWED # MAHAJANI ABBREVIATION SIGN..MAHAJANI SECTION
807 11176 ; PVALID # MAHAJANI LIGATURE SHRI
808 11180..111C4; PVALID # SHARADA SIGN CANDRABINDU..SHARADA OM
809 111C5..111C8; DISALLOWED # SHARADA DANDA..SHARADA SEPARATOR
810 111CD ; DISALLOWED # SHARADA SUTRA MARK
811 111D0..111DA; PVALID # SHARADA DIGIT ZERO..SHARADA EKAM
812 111E1..111F4; DISALLOWED # SINHALA ARCHAIC DIGIT ONE..SINHALA ARCHAIC NU
813 11200..11211; PVALID # KHOJKI LETTER A..KHOJKI LETTER JJA
814 11213..11237; PVALID # KHOJKI LETTER NYA..KHOJKI SIGN SHADDA
815 11238..1123D; DISALLOWED # KHOJKI DANDA..KHOJKI ABBREVIATION SIGN
816 112B0..112EA; PVALID # KHUDAWADI LETTER A..KHUDAWADI SIGN VIRAMA
817 112F0..112F9; PVALID # KHUDAWADI DIGIT ZERO..KHUDAWADI DIGIT NINE
818 11301..11303; PVALID # GRANTHA SIGN CANDRABINDU..GRANTHA SIGN VISARG
819 11305..1130C; PVALID # GRANTHA LETTER A..GRANTHA LETTER VOCALIC L
820 1130F..11310; PVALID # GRANTHA LETTER EE..GRANTHA LETTER AI
821 11313..11328; PVALID # GRANTHA LETTER OO..GRANTHA LETTER NA
822 1132A..11330; PVALID # GRANTHA LETTER PA..GRANTHA LETTER RA
823 11332..11333; PVALID # GRANTHA LETTER LA..GRANTHA LETTER LLA
824 11335..11339; PVALID # GRANTHA LETTER VA..GRANTHA LETTER HA
825 1133C..11344; PVALID # GRANTHA SIGN NUKTA..GRANTHA VOWEL SIGN VOCALI
826 11347..11348; PVALID # GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI
827 1134B..1134D; PVALID # GRANTHA VOWEL SIGN OO..GRANTHA SIGN VIRAMA
828 11357 ; PVALID # GRANTHA AU LENGTH MARK
829 1135D..11363; PVALID # GRANTHA SIGN PLUTA..GRANTHA VOWEL SIGN VOCALI
830 11366..1136C; PVALID # COMBINING GRANTHA DIGIT ZERO..COMBINING GRANT
831 11370..11374; PVALID # COMBINING GRANTHA LETTER A..COMBINING GRANTHA
832 11480..114C5; PVALID # TIRHUTA ANJI..TIRHUTA GVANG
833 114C6 ; DISALLOWED # TIRHUTA ABBREVIATION SIGN
834 114C7 ; PVALID # TIRHUTA OM
835 114D0..114D9; PVALID # TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE
836 11580..115B5; PVALID # SIDDHAM LETTER A..SIDDHAM VOWEL SIGN VOCALIC
837 115B8..115C0; PVALID # SIDDHAM VOWEL SIGN E..SIDDHAM SIGN NUKTA
838 115C1..115C9; DISALLOWED # SIDDHAM SIGN SIDDHAM..SIDDHAM END OF TEXT MAR
839 11600..11640; PVALID # MODI LETTER A..MODI SIGN ARDHACANDRA
840 11641..11643; DISALLOWED # MODI DANDA..MODI ABBREVIATION SIGN
841 11644 ; PVALID # MODI SIGN HUVA
842 11650..11659; PVALID # MODI DIGIT ZERO..MODI DIGIT NINE
843 11680..116B7; PVALID # TAKRI LETTER A..TAKRI SIGN NUKTA
844 116C0..116C9; PVALID # TAKRI DIGIT ZERO..TAKRI DIGIT NINE
845 118A0..118BF; DISALLOWED # WARANG CITI CAPITAL LETTER NGAA..WARANG CITI
846 118C0..118E9; PVALID # WARANG CITI SMALL LETTER NGAA..WARANG CITI DI
847 118EA..118F2; DISALLOWED # WARANG CITI NUMBER TEN..WARANG CITI NUMBER NI
848 118FF ; PVALID # WARANG CITI OM
849 11AC0..11AF8; PVALID # PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL ST
850 1236F..12398; PVALID # CUNEIFORM SIGN KAP ELAMITE..CUNEIFORM SIGN UM
851 12463..1246E; DISALLOWED # CUNEIFORM NUMERIC SIGN ONE QUARTER GUR..CUNEI
852 12474 ; DISALLOWED # CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
853 16A40..16A5E; PVALID # MRO LETTER TA..MRO LETTER TEK
854 16A60..16A69; PVALID # MRO DIGIT ZERO..MRO DIGIT NINE
855 16A6E..16A6F; DISALLOWED # MRO DANDA..MRO DOUBLE DANDA
856 16AD0..16AED; PVALID # BASSA VAH LETTER ENNI..BASSA VAH LETTER I
857 16AF0..16AF4; PVALID # BASSA VAH COMBINING HIGH TONE..BASSA VAH COMB
858 16AF5 ; DISALLOWED # BASSA VAH FULL STOP
859 16B00..16B36; PVALID # PAHAWH HMONG VOWEL KEEB..PAHAWH HMONG MARK CI
860 16B37..16B3F; DISALLOWED # PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN
861 16B40..16B43; PVALID # PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN
862 16B44..16B45; DISALLOWED # PAHAWH HMONG SIGN XAUS..PAHAWH HMONG SIGN CIM
863 16B50..16B59; PVALID # PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT N
864 16B5B..16B61; DISALLOWED # PAHAWH HMONG NUMBER TENS..PAHAWH HMONG NUMBER
865 16B63..16B77; PVALID # PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN
866 16B7D..16B8F; PVALID # PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG C
867 16F00..16F44; PVALID # MIAO LETTER PA..MIAO LETTER HHA
868 16F50..16F7E; PVALID # MIAO LETTER NASALIZATION..MIAO VOWEL SIGN NG
869 16F8F..16F9F; PVALID # MIAO TONE RIGHT..MIAO LETTER REFORMED TONE-8
870 1BC00..1BC6A; PVALID # DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
871 1BC70..1BC7C; PVALID # DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOY
872 1BC80..1BC88; PVALID # DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIG
873 1BC90..1BC99; PVALID # DUPLOYAN AFFIX LOW ACUTE..DUPLOYAN AFFIX LOW
874 1BC9C ; DISALLOWED # DUPLOYAN SIGN O WITH CROSS
875 1BC9D..1BC9E; PVALID # DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUB
876 1BC9F..1BCA3; DISALLOWED # DUPLOYAN PUNCTUATION CHINOOK FULL STOP..SHORT
877 1E800..1E8C4; PVALID # MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI
878 1E8C7..1E8CF; DISALLOWED # MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT
879 1E8D0..1E8D6; PVALID # MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE K
880 1EE00..1EE03; DISALLOWED # ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL
881 1EE05..1EE1F; DISALLOWED # ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL
882 1EE21..1EE22; DISALLOWED # ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHE
883 1EE24 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL HEH
884 1EE27 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL HAH
885 1EE29..1EE32; DISALLOWED # ARABIC MATHEMATICAL INITIAL YEH..ARABIC MATHE
886 1EE34..1EE37; DISALLOWED # ARABIC MATHEMATICAL INITIAL SHEEN..ARABIC MAT
887 1EE39 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL DAD
888 1EE3B ; DISALLOWED # ARABIC MATHEMATICAL INITIAL GHAIN
889 1EE42 ; DISALLOWED # ARABIC MATHEMATICAL TAILED JEEM
890 1EE47 ; DISALLOWED # ARABIC MATHEMATICAL TAILED HAH
891 1EE49 ; DISALLOWED # ARABIC MATHEMATICAL TAILED YEH
892 1EE4B ; DISALLOWED # ARABIC MATHEMATICAL TAILED LAM
893 1EE4D..1EE4F; DISALLOWED # ARABIC MATHEMATICAL TAILED NOON..ARABIC MATHE
894 1EE51..1EE52; DISALLOWED # ARABIC MATHEMATICAL TAILED SAD..ARABIC MATHEM
895 1EE54 ; DISALLOWED # ARABIC MATHEMATICAL TAILED SHEEN
896 1EE57 ; DISALLOWED # ARABIC MATHEMATICAL TAILED KHAH
897 1EE59 ; DISALLOWED # ARABIC MATHEMATICAL TAILED DAD
898 1EE5B ; DISALLOWED # ARABIC MATHEMATICAL TAILED GHAIN
899 1EE5D ; DISALLOWED # ARABIC MATHEMATICAL TAILED DOTLESS NOON
900 1EE5F ; DISALLOWED # ARABIC MATHEMATICAL TAILED DOTLESS QAF
901 1EE61..1EE62; DISALLOWED # ARABIC MATHEMATICAL STRETCHED BEH..ARABIC MAT
902 1EE64 ; DISALLOWED # ARABIC MATHEMATICAL STRETCHED HEH
903 1EE67..1EE6A; DISALLOWED # ARABIC MATHEMATICAL STRETCHED HAH..ARABIC MAT
904 1EE6C..1EE72; DISALLOWED # ARABIC MATHEMATICAL STRETCHED MEEM..ARABIC MA
905 1EE74..1EE77; DISALLOWED # ARABIC MATHEMATICAL STRETCHED SHEEN..ARABIC M
906 1EE79..1EE7C; DISALLOWED # ARABIC MATHEMATICAL STRETCHED DAD..ARABIC MAT
907 1EE7E ; DISALLOWED # ARABIC MATHEMATICAL STRETCHED DOTLESS FEH
908 1EE80..1EE89; DISALLOWED # ARABIC MATHEMATICAL LOOPED ALEF..ARABIC MATHE
909 1EE8B..1EE9B; DISALLOWED # ARABIC MATHEMATICAL LOOPED LAM..ARABIC MATHEM
910 1EEA1..1EEA3; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK BEH..ARABIC
911 1EEA5..1EEA9; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK WAW..ARABIC
912 1EEAB..1EEBB; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC
913 1EEF0..1EEF1; DISALLOWED # ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WI
914 1F0BF ; DISALLOWED # PLAYING CARD RED JOKER
915 1F0E0..1F0F5; DISALLOWED # PLAYING CARD FOOL..PLAYING CARD TRUMP-21
916 1F10B..1F10C; DISALLOWED # DINGBAT CIRCLED SANS-SERIF DIGIT ZERO..DINGBA
917 1F16A..1F16B; DISALLOWED # RAISED MC SIGN..RAISED MD SIGN
918 1F321..1F32C; DISALLOWED # THERMOMETER..WIND BLOWING FACE
919 1F336 ; DISALLOWED # HOT PEPPER
920 1F37D ; DISALLOWED # FORK AND KNIFE WITH PLATE
921 1F394..1F39F; DISALLOWED # HEART WITH TIP ON THE LEFT..ADMISSION TICKETS
922 1F3C5 ; DISALLOWED # SPORTS MEDAL
923 1F3CB..1F3CE; DISALLOWED # WEIGHT LIFTER..RACING CAR
924 1F3D4..1F3DF; DISALLOWED # SNOW CAPPED MOUNTAIN..STADIUM
925 1F3F1..1F3F7; DISALLOWED # WHITE PENNANT..LABEL
926 1F43F ; DISALLOWED # CHIPMUNK
927 1F441 ; DISALLOWED # EYE
928 1F4F8 ; DISALLOWED # CAMERA WITH FLASH
929 1F4FD..1F4FE; DISALLOWED # FILM PROJECTOR..PORTABLE STEREO
930 1F53E..1F54A; DISALLOWED # LOWER RIGHT SHADOWED WHITE CIRCLE..DOVE OF PE
931 1F568..1F579; DISALLOWED # RIGHT SPEAKER..JOYSTICK
932 1F57B..1F5A3; DISALLOWED # LEFT HAND TELEPHONE RECEIVER..BLACK DOWN POIN
933 1F5A5..1F5FA; DISALLOWED # DESKTOP COMPUTER..WORLD MAP
934 1F600 ; DISALLOWED # GRINNING FACE
935 1F611 ; DISALLOWED # EXPRESSIONLESS FACE
936 1F615 ; DISALLOWED # CONFUSED FACE
937 1F617 ; DISALLOWED # KISSING FACE
938 1F619 ; DISALLOWED # KISSING FACE WITH SMILING EYES
939 1F61B ; DISALLOWED # FACE WITH STUCK-OUT TONGUE
940 1F61F ; DISALLOWED # WORRIED FACE
941 1F626..1F627; DISALLOWED # FROWNING FACE WITH OPEN MOUTH..ANGUISHED FACE
942 1F62C ; DISALLOWED # GRIMACING FACE
943 1F62E..1F62F; DISALLOWED # FACE WITH OPEN MOUTH..HUSHED FACE
944 1F634 ; DISALLOWED # SLEEPING FACE
945 1F641..1F642; DISALLOWED # SLIGHTLY FROWNING FACE..SLIGHTLY SMILING FACE
946 1F650..1F67F; DISALLOWED # NORTH WEST POINTING LEAF..REVERSE CHECKER BOA
947 1F6C6..1F6CF; DISALLOWED # TRIANGLE WITH ROUNDED CORNERS..BED
948 1F6E0..1F6EC; DISALLOWED # HAMMER AND WRENCH..AIRPLANE ARRIVING
949 1F6F0..1F6F3; DISALLOWED # SATELLITE..PASSENGER SHIP
950 1F780..1F7D4; DISALLOWED # BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE.
951 1F800..1F80B; DISALLOWED # LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD
952 1F810..1F847; DISALLOWED # LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWH
953 1F850..1F859; DISALLOWED # LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERI
954 1F860..1F887; DISALLOWED # WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-
955 1F890..1F8AD; DISALLOWED # LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHA
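The rows of the table above share a fixed layout: a single code point or an inclusive range written as START..END, a semicolon, the derived property value, and a "#" comment carrying the (possibly truncated) character names. The following Python sketch is not part of the specification; it shows one way such rows could be parsed, assuming any leading line-number column has already been removed. The names TableRow, ROW_RE, and parse_row are hypothetical and chosen for illustration only.

```python
import re
from typing import NamedTuple, Optional


class TableRow(NamedTuple):
    """One row of a derived-property table (hypothetical helper type)."""
    first: int      # first code point of the range
    last: int       # last code point of the range (equal to first for a single code point)
    value: str      # derived property value, e.g. "PVALID" or "DISALLOWED"
    comment: str    # character name(s) as printed, possibly truncated


# START[..END] ; VALUE # comment, with code points as 4 to 6 uppercase hex digits.
ROW_RE = re.compile(
    r"^\s*([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)\s*#\s?(.*)$"
)


def parse_row(line: str) -> Optional[TableRow]:
    """Parse one table row; return None for headings and other non-row lines."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    first = int(m.group(1), 16)
    last = int(m.group(2), 16) if m.group(2) else first
    if last < first:
        raise ValueError(f"range end precedes range start: {line!r}")
    return TableRow(first, last, m.group(3), m.group(4).strip())


if __name__ == "__main__":
    for text in (
        "037F ; DISALLOWED # GREEK CAPITAL LETTER YOT",
        "1018B..1018C; DISALLOWED # GREEK ONE QUARTER SIGN..GREEK SINUSOID SIGN",
        "Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0",
    ):
        print(parse_row(text))
```

Returning None for non-matching lines lets a caller feed an entire appendix to parse_row and keep only the rows.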
957 Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0
959 Changes from derived property value UNASSIGNED to either PVALID or
960 DISALLOWED.
962 08B3..08B4 ; PVALID # ARABIC LETTER AIN WITH THREE DOTS BELOW..ARAB
963 08E3 ; PVALID # ARABIC TURNED DAMMA BELOW
964 0AF9 ; PVALID # GUJARATI LETTER ZHA
965 0C5A ; PVALID # TELUGU LETTER RRRA
966 0D5F ; PVALID # MALAYALAM LETTER ARCHAIC II
967 13F5 ; PVALID # CHEROKEE LETTER MV
968 13F8..13FD ; DISALLOWED # CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETT
969 20BE ; DISALLOWED # LARI SIGN
970 218A..218B ; DISALLOWED # TURNED DIGIT TWO..TURNED DIGIT THREE
971 2BEC..2BEF ; DISALLOWED # LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARRO
972 9FCD..9FD5 ; PVALID # <CJK Ideograph>..<CJK Ideograph>
973 A69E ; PVALID # COMBINING CYRILLIC LETTER EF
974 A78F ; PVALID # LATIN LETTER SINOLOGICAL DOT
975 A7B2..A7B4 ; DISALLOWED # LATIN CAPITAL LETTER J WITH CROSSED-TAIL..LAT
976 A7B5 ; PVALID # LATIN SMALL LETTER BETA
977 A7B6 ; DISALLOWED # LATIN CAPITAL LETTER OMEGA
978 A7B7 ; PVALID # LATIN SMALL LETTER OMEGA
979 A8FC ; DISALLOWED # DEVANAGARI SIGN SIDDHAM
980 A8FD ; PVALID # DEVANAGARI JAIN OM
981 AB60..AB63 ; PVALID # LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LET
982 AB70..ABBF ; DISALLOWED # CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTE
983 FE2E..FE2F ; PVALID # COMBINING CYRILLIC TITLO LEFT HALF..COMBINING
984 108E0..108F2; PVALID # HATRAN LETTER ALEPH..HATRAN LETTER QOPH
985 108F4..108F5; PVALID # HATRAN LETTER SHIN..HATRAN LETTER TAW
986 108FB..108FF; DISALLOWED # HATRAN NUMBER ONE..HATRAN NUMBER ONE HUNDRED
987 109BC..109BD; DISALLOWED # MEROITIC CURSIVE FRACTION ELEVEN TWELFTHS..ME
988 109C0..109CF; DISALLOWED # MEROITIC CURSIVE NUMBER ONE..MEROITIC CURSIVE
989 109D2..109FF; DISALLOWED # MEROITIC CURSIVE NUMBER ONE HUNDRED..MEROITIC
990 10C80..10CB2; DISALLOWED # OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN
991 10CC0..10CF2; PVALID # OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN S
992 10CFA..10CFF; DISALLOWED # OLD HUNGARIAN NUMBER ONE..OLD HUNGARIAN NUMBE
993 111C9 ; DISALLOWED # SHARADA SANDHI MARK
994 111CA..111CC; PVALID # SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL
995 111DB ; DISALLOWED # SHARADA SIGN SIDDHAM
996 111DC ; PVALID # SHARADA HEADSTROKE
997 111DD..111DF; DISALLOWED # SHARADA CONTINUATION SIGN..SHARADA SECTION MA
998 11280..11286; PVALID # MULTANI LETTER A..MULTANI LETTER GA
999 11288 ; PVALID # MULTANI LETTER GHA
1000 1128A..1128D; PVALID # MULTANI LETTER CA..MULTANI LETTER JJA
1001 1128F..1129D; PVALID # MULTANI LETTER NYA..MULTANI LETTER BA
1002 1129F..112A8; PVALID # MULTANI LETTER BHA..MULTANI LETTER RHA
1003 112A9 ; DISALLOWED # MULTANI SECTION MARK
1004 11300 ; PVALID # GRANTHA SIGN COMBINING ANUSVARA ABOVE
1005 11350 ; PVALID # GRANTHA OM
1006 115CA..115D7; DISALLOWED # SIDDHAM SECTION MARK WITH TRIDENT AND U-SHAPE
1007 115D8..115DD; PVALID # SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDD
1008 11700..11719; PVALID # AHOM LETTER KA..AHOM LETTER JHA
1009 1171D..1172B; PVALID # AHOM CONSONANT SIGN MEDIAL LA..AHOM SIGN KILL
1010 11730..11739; PVALID # AHOM DIGIT ZERO..AHOM DIGIT NINE
1011 1173A..1173F; DISALLOWED # AHOM NUMBER TEN..AHOM SYMBOL VI
1012 12399 ; PVALID # CUNEIFORM SIGN U U
1013 12480..12543; PVALID # CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM S
1014 14400..14646; PVALID # ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLY
1015 1D1DE..1D1E8; DISALLOWED # MUSICAL SYMBOL KIEVAN C CLEF..MUSICAL SYMBOL
1016 1D800..1D9FF; DISALLOWED # SIGNWRITING HAND-FIST INDEX..SIGNWRITING HEAD
1017 1DA00..1DA36; PVALID # SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING
1018 1DA37..1DA3A; DISALLOWED # SIGNWRITING AIR BLOW SMALL ROTATIONS..SIGNWRI
1019 1DA3B..1DA6C; PVALID # SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING
1020 1DA6D..1DA74; DISALLOWED # SIGNWRITING SHOULDER HIP SPINE..SIGNWRITING T
1021 1DA75 ; PVALID # SIGNWRITING UPPER BODY TILTING FROM HIP JOINT
1022 1DA76..1DA83; DISALLOWED # SIGNWRITING LIMB COMBINATION..SIGNWRITING LOC
1023 1DA84 ; PVALID # SIGNWRITING LOCATION HEAD NECK
1024 1DA85..1DA8B; DISALLOWED # SIGNWRITING LOCATION TORSO..SIGNWRITING PAREN
1025 1DA9B..1DA9F; PVALID # SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL
1026 1DAA1..1DAAF; PVALID # SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING
1027 1F32D..1F32F; DISALLOWED # HOT DOG..BURRITO
1028 1F37E..1F37F; DISALLOWED # BOTTLE WITH POPPING CORK..POPCORN
1029 1F3CF..1F3D3; DISALLOWED # CRICKET BAT AND BALL..TABLE TENNIS PADDLE AND
1030 1F3F8..1F3FF; DISALLOWED # BADMINTON RACQUET AND SHUTTLECOCK..EMOJI MODI
1031 1F4FF ; DISALLOWED # PRAYER BEADS
1032 1F54B..1F54F; DISALLOWED # KAABA..BOWL OF HYGIEIA
1033 1F643..1F644; DISALLOWED # UPSIDE-DOWN FACE..FACE WITH ROLLING EYES
1034 1F6D0 ; DISALLOWED # PLACE OF WORSHIP
1035 1F910..1F918; DISALLOWED # ZIPPER-MOUTH FACE..SIGN OF THE HORNS
1036 1F980..1F984; DISALLOWED # CRAB..UNICORN FACE
1037 1F9C0 ; DISALLOWED # CHEESE WEDGE
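1038 2B820..2CEA1; PVALID # <CJK Ideograph Extension E>..<CJK Ideograph E
1040 Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0
1042 Changes from derived property value UNASSIGNED to either PVALID or
1043 DISALLOWED.
1079 17000..187EC; PVALID # <Tangut Ideograph>..<Tangut Ideograph>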
1080 18800..18AF2; PVALID # TANGUT COMPONENT-001..TANGUT COMPONENT-755
1081 1E000..1E006; PVALID # COMBINING GLAGOLITIC LETTER AZU..COMBINING GL
1082 1E008..1E018; PVALID # COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING
1083 1E01B..1E021; PVALID # COMBINING GLAGOLITIC LETTER SHTA..COMBINING G
1084 1E023..1E024; PVALID # COMBINING GLAGOLITIC LETTER YU..COMBINING GLA
1085 1E026..1E02A; PVALID # COMBINING GLAGOLITIC LETTER YO..COMBINING GLA
1086 1E900..1E921; DISALLOWED # ADLAM CAPITAL LETTER ALIF..ADLAM CAPITAL LETT
1087 1E922..1E94A; PVALID # ADLAM SMALL LETTER ALIF..ADLAM NUKTA
1088 1E950..1E959; PVALID # ADLAM DIGIT ZERO..ADLAM DIGIT NINE
1089 1E95E..1E95F; DISALLOWED # ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL
1090 1F19B..1F1AC; DISALLOWED # SQUARED THREE D..SQUARED VOD
1091 1F23B ; DISALLOWED # SQUARED CJK UNIFIED IDEOGRAPH-914D
1092 1F57A ; DISALLOWED # MAN DANCING
1093 1F5A4 ; DISALLOWED # BLACK HEART
1094 1F6D1..1F6D2; DISALLOWED # OCTAGONAL SIGN..SHOPPING TROLLEY
1095 1F6F4..1F6F6; DISALLOWED # SCOOTER..CANOE
1096 1F919..1F91E; DISALLOWED # CALL ME HAND..HAND WITH INDEX AND MIDDLE FING
1097 1F920..1F927; DISALLOWED # FACE WITH COWBOY HAT..SNEEZING FACE
1098 1F930 ; DISALLOWED # PREGNANT WOMAN
1099 1F933..1F93E; DISALLOWED # SELFIE..HANDBALL
1100 1F940..1F94B; DISALLOWED # WILTED FLOWER..MARTIAL ARTS UNIFORM
1101 1F950..1F95E; DISALLOWED # CROISSANT..PANCAKES
1102 1F985..1F991; DISALLOWED # EAGLE..SQUID
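A recurring pattern in the table above is that capital letters (for example 1E900..1E921) are DISALLOWED while the matching small letters (1E922..1E94A) are PVALID. This is consistent with the stability test of RFC 5892, under which a code point that does not survive NFKC normalization and case folding unchanged is treated as unstable. The following Python sketch illustrates that one test only and is not the full derivation algorithm. The helper name is_unstable is hypothetical, and the result depends on the Unicode version bundled with the Python interpreter, assumed here to be 9.0.0 or later.

```python
import unicodedata


def is_unstable(ch: str) -> bool:
    """Return True if ch differs from NFKC(casefold(NFKC(ch))).

    Sketch of the RFC 5892 stability test for a single code point.
    """
    nfkc = unicodedata.normalize("NFKC", ch)
    folded = nfkc.casefold()
    return unicodedata.normalize("NFKC", folded) != ch


if __name__ == "__main__":
    # ADLAM CAPITAL LETTER ALIF and ADLAM SMALL LETTER ALIF from the table above.
    for cp in (0x1E900, 0x1E922):
        print(f"{cp:04X} unstable={is_unstable(chr(cp))}")
```

A code point that fails this test cannot be PVALID, which matches the DISALLOWED value of the capital letters in the table.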
1104 Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0
1106 Changes from derived property value UNASSIGNED to either PVALID or
1107 DISALLOWED.
1109 0860..086A ; PVALID # SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MA
1110 09FC ; PVALID # BENGALI LETTER VEDIC ANUSVARA
1111 09FD ; DISALLOWED # BENGALI ABBREVIATION SIGN
1112 0AFA..0AFF ; PVALID # GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE
1113 0D00 ; PVALID # MALAYALAM SIGN COMBINING ANUSVARA ABOVE
1114 0D3B..0D3C ; PVALID # MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM
1115 1CF7 ; PVALID # VEDIC SIGN ATIKRAMA
1116 1DF6..1DF9 ; PVALID # COMBINING KAVYKA ABOVE RIGHT..COMBINING WIDE
1117 20BF ; DISALLOWED # BITCOIN SIGN
1118 23FF ; DISALLOWED # OBSERVER EYE SYMBOL
1119 2BD2 ; DISALLOWED # GROUP MARK
1120 2E45..2E49 ; DISALLOWED # INVERTED LOW KAVYKA..DOUBLE STACKED COMMA
1121 312E ; PVALID # BOPOMOFO LETTER O WITH DOT ABOVE
1122 9FD6..9FEA ; PVALID # <CJK Ideograph>..<CJK Ideograph>
1123 1032D..1032F; PVALID # OLD ITALIC LETTER YE..OLD ITALIC LETTER SOUTH
1124 11A00..11A3E; PVALID # ZANABAZAR SQUARE LETTER A..ZANABAZAR SQUARE C
1125 11A3F..11A46; DISALLOWED # ZANABAZAR SQUARE INITIAL HEAD MARK..ZANABAZAR
1126 11A47 ; PVALID # ZANABAZAR SQUARE SUBJOINER
1127 11A50..11A83; PVALID # SOYOMBO LETTER A..SOYOMBO LETTER KSSA
1128 11A86..11A99; PVALID # SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO SU
1129 11A9A..11A9C; DISALLOWED # SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
1130 11A9E..11AA2; DISALLOWED # SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPL
1131 11D00..11D06; PVALID # MASARAM GONDI LETTER A..MASARAM GONDI LETTER
1132 11D08..11D09; PVALID # MASARAM GONDI LETTER AI..MASARAM GONDI LETTER
1133 11D0B..11D36; PVALID # MASARAM GONDI LETTER AU..MASARAM GONDI VOWEL
1134 11D3A ; PVALID # MASARAM GONDI VOWEL SIGN E
1135 11D3C..11D3D; PVALID # MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VO
1136 11D3F..11D47; PVALID # MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI RA
1137 11D50..11D59; PVALID # MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT
1138 16FE1 ; PVALID # NUSHU ITERATION MARK
1139 1B002..1B11E; PVALID # HENTAIGANA LETTER A-1..HENTAIGANA LETTER N-MU
1140 1B170..1B2FB; PVALID # NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
1141 1F260..1F265; DISALLOWED # ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
1142 1F6D3..1F6D4; DISALLOWED # STUPA..PAGODA
1143 1F6F7..1F6F8; DISALLOWED # SLED..FLYING SAUCER
1144 1F900..1F90B; DISALLOWED # CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD
1145 1F91F ; DISALLOWED # I LOVE YOU HAND SIGN
1146 1F928..1F92F; DISALLOWED # FACE WITH ONE EYEBROW RAISED..SHOCKED FACE WI
1147 1F931..1F932; DISALLOWED # BREAST-FEEDING..PALMS UP TOGETHER
1148 1F94C ; DISALLOWED # CURLING STONE
1149 1F95F..1F96B; DISALLOWED # DUMPLING..CANNED FOOD
1150 1F992..1F997; DISALLOWED # GIRAFFE FACE..CRICKET
1151 1F9D0..1F9E6; DISALLOWED # FACE WITH MONOCLE..SOCKS
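1152 2CEB0..2EBE0; PVALID # <CJK Ideograph Extension F>..<CJK Ideograph E
1154 Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0
1156 Changes from derived property value UNASSIGNED to either PVALID or
1157 DISALLOWED.
1180 9FEB..9FEF ; PVALID # <CJK Ideograph>..<CJK Ideograph>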
1181 A7AF ; PVALID # LATIN LETTER SMALL CAPITAL Q
1182 A7B8 ; DISALLOWED # LATIN CAPITAL LETTER U WITH STROKE
1183 A7B9 ; PVALID # LATIN SMALL LETTER U WITH STROKE
1184 A8FE..A8FF ; PVALID # DEVANAGARI LETTER AY..DEVANAGARI VOWEL SIGN A
1185 10A34..10A35; PVALID # KHAROSHTHI LETTER TTTA..KHAROSHTHI LETTER VHA
1186 10A48 ; DISALLOWED # KHAROSHTHI FRACTION ONE HALF
1187 10D00..10D27; PVALID # HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA SIG
1188 10D30..10D39; PVALID # HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA D
1189 10F00..10F1C; PVALID # OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER
1190 10F1D..10F26; DISALLOWED # OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION
1191 10F27 ; PVALID # OLD SOGDIAN LIGATURE AYIN-DALETH
1192 10F30..10F50; PVALID # SOGDIAN LETTER ALEPH..SOGDIAN COMBINING STROK
1193 10F51..10F59; DISALLOWED # SOGDIAN NUMBER ONE..SOGDIAN PUNCTUATION HALF
1194 110CD ; DISALLOWED # KAITHI NUMBER SIGN ABOVE
1195 11144..11146; PVALID # CHAKMA LETTER LHAA..CHAKMA VOWEL SIGN EI
1196 1133B ; PVALID # COMBINING BINDU BELOW
1197 1145E ; PVALID # NEWA SANDHI MARK
1198 1171A ; PVALID # AHOM LETTER ALTERNATE BA
1199 11800..1183A; PVALID # DOGRA LETTER A..DOGRA SIGN NUKTA
1200 1183B ; DISALLOWED # DOGRA ABBREVIATION SIGN
1201 11A9D ; PVALID # SOYOMBO MARK PLUTA
1202 11D60..11D65; PVALID # GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER
1203 11D67..11D68; PVALID # GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER
1204 11D6A..11D8E; PVALID # GUNJALA GONDI LETTER OO..GUNJALA GONDI VOWEL
1205 11D90..11D91; PVALID # GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VO
1206 11D93..11D98; PVALID # GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI OM
1207 11DA0..11DA9; PVALID # GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT
1208 11EE0..11EF6; PVALID # MAKASAR LETTER KA..MAKASAR VOWEL SIGN O
1209 11EF7..11EF8; DISALLOWED # MAKASAR PASSIMBANG..MAKASAR END OF SECTION
1210 16E40..16E5F; DISALLOWED # MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAP
1211 16E60..16E7F; PVALID # MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL
1212 16E80..16E9A; DISALLOWED # MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN EXCLAMATI
1213 187ED..187F1; PVALID # <Tangut Ideograph>..<Tangut Ideograph>
1214 1D2E0..1D2F3; DISALLOWED # MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
1215 1D372..1D378; DISALLOWED # IDEOGRAPHIC TALLY MARK ONE..TALLY MARK FIVE
1216 1EC71..1ECB4; DISALLOWED # INDIC SIYAQ NUMBER ONE..INDIC SIYAQ ALTERNATE
1217 1F12F ; DISALLOWED # COPYLEFT SYMBOL
1218 1F6F9 ; DISALLOWED # SKATEBOARD
1219 1F7D5..1F7D8; DISALLOWED # CIRCLED TRIANGLE..NEGATIVE CIRCLED SQUARE
1220 1F94D..1F94F; DISALLOWED # LACROSSE STICK AND BALL..FLYING DISC
1221 1F96C..1F970; DISALLOWED # LEAFY GREEN..SMILING FACE WITH SMILING EYES A
1222 1F973..1F976; DISALLOWED # FACE WITH PARTY HORN AND PARTY HAT..FREEZING
1223 1F97A ; DISALLOWED # FACE WITH PLEADING EYES
1224 1F97C..1F97F; DISALLOWED # LAB COAT..FLAT SHOE
1225 1F998..1F9A2; DISALLOWED # KANGAROO..SWAN
1226 1F9B0..1F9B9; DISALLOWED # EMOJI COMPONENT RED HAIR..SUPERVILLAIN
1227 1F9C1..1F9C2; DISALLOWED # CUPCAKE..SALT SHAKER
1228 1F9E7..1F9FF; DISALLOWED # RED GIFT ENVELOPE..NAZAR AMULET
1229 1FA60..1FA6D; DISALLOWED # XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
1231 Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0
1233 Changes from derived property value UNASSIGNED to either PVALID or
1234 DISALLOWED.
1236 0C77 ; DISALLOWED # TELUGU SIGN SIDDHAM
1237 0E86 ; PVALID # LAO LETTER PALI GHA
1238 0E89 ; PVALID # LAO LETTER PALI CHA
1239 0E8C ; PVALID # LAO LETTER PALI JHA
1240 0E8E..0E93 ; PVALID # LAO LETTER PALI NYA..LAO LETTER PALI NNA
1241 0E98 ; PVALID # LAO LETTER PALI DHA
1242 0EA0 ; PVALID # LAO LETTER PALI BHA
1243 0EA8..0EA9 ; PVALID # LAO LETTER SANSKRIT SHA..LAO LETTER SANSKRIT
1244 0EAC ; PVALID # LAO LETTER PALI LLA
1245 0EBA ; PVALID # LAO SIGN PALI VIRAMA
1246 1CFA ; PVALID # VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
1247 2BC9 ; DISALLOWED # NEPTUNE FORM TWO
1248 2BFF ; DISALLOWED # HELLSCHREIBER PAUSE SYMBOL
1249 2E4F ; DISALLOWED # CORNISH VERSE DIVIDER
1250 A7BA ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL A
1251 A7BB ; PVALID # LATIN SMALL LETTER GLOTTAL A
1252 A7BC ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL I
1253 A7BD ; PVALID # LATIN SMALL LETTER GLOTTAL I
1254 A7BE ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL U
1255 A7BF ; PVALID # LATIN SMALL LETTER GLOTTAL U
1256 A7C2 ; DISALLOWED # LATIN CAPITAL LETTER ANGLICANA W
1257 A7C3 ; PVALID # LATIN SMALL LETTER ANGLICANA W
1258 A7C4..A7C6 ; DISALLOWED # LATIN CAPITAL LETTER C WITH PALATAL HOOK..LAT
1259 AB66..AB67 ; PVALID # LATIN SMALL LETTER DZ DIGRAPH WITH RETROFLEX
1260 10FE0..10FF6; PVALID # ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-
1261 1145F ; PVALID # NEWA LETTER VEDIC ANUSVARA
1262 116B8 ; PVALID # TAKRI LETTER ARCHAIC KHA
1263 119A0..119A7; PVALID # NANDINAGARI LETTER A..NANDINAGARI LETTER VOCA
1264 119AA..119D7; PVALID # NANDINAGARI LETTER E..NANDINAGARI VOWEL SIGN
1265 119DA..119E1; PVALID # NANDINAGARI VOWEL SIGN E..NANDINAGARI SIGN AV
1266 119E2 ; DISALLOWED # NANDINAGARI SIGN SIDDHAM
1267 119E3..119E4; PVALID # NANDINAGARI HEADSTROKE..NANDINAGARI VOWEL SIG
1268 11A84..11A85; PVALID # SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHM
1269 11FC0..11FF1; DISALLOWED # TAMIL FRACTION ONE THREE-HUNDRED-AND-TWENTIET
1270 11FFF ; DISALLOWED # TAMIL PUNCTUATION END OF TEXT
1271 13430..13438; DISALLOWED # EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN
1272 16F45..16F4A; PVALID # MIAO LETTER BRI..MIAO LETTER RTE
1273 16F4F ; PVALID # MIAO SIGN CONSONANT MODIFIER BAR
1274 16F7F..16F87; PVALID # MIAO VOWEL SIGN UOG..MIAO VOWEL SIGN UI
1275 16FE2 ; DISALLOWED # OLD CHINESE HOOK MARK
1276 16FE3 ; PVALID # OLD CHINESE ITERATION MARK
1277 187F2..187F7; PVALID # <Tangut Ideograph>..<Tangut Ideograph>
1278 1B150..1B152; PVALID # HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMA
1279 1B164..1B167; PVALID # KATAKANA LETTER SMALL WI..KATAKANA LETTER SMA
1280 1E100..1E12C; PVALID # NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PU
1281 1E130..1E13D; PVALID # NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACH
1282 1E140..1E149; PVALID # NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG P
1283 1E14E ; PVALID # NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ
1284 1E14F ; DISALLOWED # NYIAKENG PUACHUE HMONG CIRCLED CA
1285 1E2C0..1E2F9; PVALID # WANCHO LETTER AA..WANCHO DIGIT NINE
1286 1E2FF ; DISALLOWED # WANCHO NGUN SIGN
1287 1E94B ; PVALID # ADLAM NASALIZATION MARK
1288 1ED01..1ED3D; DISALLOWED # OTTOMAN SIYAQ NUMBER ONE..OTTOMAN SIYAQ FRACT
1289 1F16C ; DISALLOWED # RAISED MR SIGN
1290 1F6D5 ; DISALLOWED # HINDU TEMPLE
1291 1F6FA ; DISALLOWED # AUTO RICKSHAW
1292 1F7E0..1F7EB; DISALLOWED # LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
1293 1F90D..1F90F; DISALLOWED # WHITE HEART..PINCHING HAND
1294 1F93F ; DISALLOWED # DIVING MASK
1295 1F971 ; DISALLOWED # YAWNING FACE
1296 1F97B ; DISALLOWED # SARI
1297 1F9A5..1F9AA; DISALLOWED # SLOTH..OYSTER
1298 1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE
1299 1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG
1300 1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE
1301 1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON
1302 1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP
1303 1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS
1304 1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE
1305 1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE
1306 1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO
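Because the rows of each table are sorted by code point and the ranges do not overlap, a binary search is one reasonable way to look up a code point. The following Python sketch is not part of the specification. It uses a handful of rows copied from the Unicode 12.0.0 table above, and the names RANGES, STARTS, and changed_value are hypothetical. The function returns None for code points not listed, since the appendix records only changes and says nothing about other code points.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (first, last, derived property value), sorted by first code point.
# Sample rows copied from the Unicode 11.0.0 to 12.0.0 table above.
RANGES: List[Tuple[int, int, str]] = [
    (0x0C77, 0x0C77, "DISALLOWED"),    # TELUGU SIGN SIDDHAM
    (0x0E8E, 0x0E93, "PVALID"),        # LAO LETTER PALI NYA..LAO LETTER PALI NNA
    (0xA7BA, 0xA7BA, "DISALLOWED"),    # LATIN CAPITAL LETTER GLOTTAL A
    (0xA7BB, 0xA7BB, "PVALID"),        # LATIN SMALL LETTER GLOTTAL A
    (0x1E2C0, 0x1E2F9, "PVALID"),      # WANCHO LETTER AA..WANCHO DIGIT NINE
    (0x1FA00, 0x1FA53, "DISALLOWED"),  # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP
]
STARTS = [first for first, _, _ in RANGES]


def changed_value(cp: int) -> Optional[str]:
    """Return the listed value for cp, or None if cp is not in the sample rows."""
    # Index of the last range whose first code point is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i >= 0:
        first, last, value = RANGES[i]
        if first <= cp <= last:
            return value
    return None


if __name__ == "__main__":
    for cp in (0x0E90, 0xA7BA, 0x1FA54):
        print(f"{cp:04X} -> {changed_value(cp)}")
```

Keeping the start points in a separate sorted list lets bisect_right compare plain integers, so the lookup takes logarithmic time in the number of rows.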
1308 Author's Address
1310 Patrik Faltstrom
1311 Netnod
1313 Email: paf@netnod.se