Network Working Group                                         P. Hoffman
Internet-Draft                                             March 4, 2009
Updates: RFC 3454, 3490, 3491
(if approved)
Intended status: Standards Track
Expires: September 5, 2009

    Internationalizing Domain Names in Applications (IDNA) version 2
                       draft-hoffman-idna2-02.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.  This document may contain material
   from IETF Documents or IETF Contributions published or made publicly
   available before November 10, 2008.
   The person(s) controlling the
   copyright in some of this material may not have granted the IETF
   Trust the right to allow modifications of such material outside the
   IETF Standards Process.  Without obtaining an adequate license from
   the person(s) controlling the copyright in such materials, this
   document may not be modified outside the IETF Standards Process, and
   derivative works of it may not be created outside the IETF Standards
   Process, except to format it for publication as an RFC or to
   translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 5, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   IDNA has been a world-wide success since it was introduced over five
   years ago.
   However, it has some notable deficiencies, including
   being tied to an old version of the Unicode standard and needless
   restrictions that prevented some languages from being used.  This
   document describes IDNA version 2, which rectifies those problems
   while making the fewest changes necessary to the original protocol.

Table of Contents

   1.  Introduction
     1.1.  Acknowledgements
     1.2.  Conventions Used In This Document
   2.  Changes to RFC 3490 (IDNA v.1)
   3.  Changes to RFC 3454 (Stringprep)
   4.  Changes to RFC 3491 (Nameprep)
   5.  Changes to RFC 3492 (Punycode)
   6.  Suggestions for Registries
   7.  IANA Considerations
   8.  Security Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Work Still to be Done
   Appendix B.  Changes between versions
     B.1.  Changes between the -00 and -01 drafts
     B.2.  Changes between the -01 and -02 drafts
   Author's Address

1.  Introduction

   This document describes Internationalizing Domain Names in
   Applications (IDNA) version 2 (hereafter called "IDNAv2"), a direct
   update to IDNA (hereafter called "IDNAv1").  IDNAv1 consists of four
   RFCs:
   o  [RFC3490], "Internationalizing Domain Names in Applications
      (IDNA)", is the main definition of IDNAv1.
      This defines the
      processing rules for IDNA and gives the background for how IDNA
      works.
   o  [RFC3454], "Preparation of Internationalized Strings
      ("stringprep")", defines the general framework for processing
      non-ASCII strings that are used in IDNA.
   o  [RFC3491], "Nameprep: A Stringprep Profile for Internationalized
      Domain Names (IDN)", is a short profile of the rules from the
      stringprep framework.
   o  [RFC3492], "Punycode: A Bootstring encoding of Unicode for
      Internationalized Domain Names in Applications (IDNA)", defines
      the encoding used in IDNAv1 labels.

   IDNAv2 is backwards-compatible with IDNAv1, meaning that any DNS label
   that was legal in IDNAv1 has exactly the same representation in
   IDNAv2.  New labels are allowed in IDNAv2 that were not allowed in
   IDNAv1.

   IDNA needs to be updated for many reasons, some of which are covered
   in [RFC4690].  If for no other reason, many characters that could
   appear in domain names have been added since Unicode version 3.2
   [UNICODE32], which is the version of the Unicode Standard on which
   IDNAv1 is based.

   One explicit goal of this update is to allow labels with characters
   that have been added since Unicode version 3.2 to be used in IDNA.
   To that end, IDNAv2 is based on Unicode 5.1 [UNICODE51].  The tables
   in stringprep and Nameprep are updated to reflect this change.

   Another explicit goal of this update is to not change the encoding of
   any label that is legal in IDNAv1.  If an internationalized label in
   IDNAv1 produces an ACE label, IDNAv2 must produce the same ACE label.
   If an internationalized label in IDNAv1 produces an ASCII label,
   IDNAv2 must produce the same ASCII label.

   A third explicit goal is to update the bidirectional ("bidi")
   algorithm used by IDNAv1 to cover more languages such as Dhivehi and
   Yiddish.  This is done to cover an oversight in IDNAv1 that was
   discovered after the work was finished.
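
   As an illustration only (not part of the protocol), the goal of not
   changing the encoding of any label that is legal in IDNAv1 can be
   checked mechanically.  The sketch below uses the "idna" codec from
   Python's standard library, which follows [RFC3490], as the IDNAv1
   reference.  The to_ascii_v2 argument is hypothetical: it stands in
   for whatever IDNAv2 implementation is under test.  The helper names
   are assumptions made for this example.

```python
from typing import Callable, Iterable, List, Tuple


def to_ascii_v1(label: str) -> bytes:
    """IDNAv1 reference: Python's built-in "idna" codec follows RFC 3490."""
    return label.encode("idna")


def find_incompatible(
    labels: Iterable[str],
    to_ascii_v2: Callable[[str], bytes],  # hypothetical IDNAv2 implementation
) -> List[Tuple[str, bytes, bytes]]:
    """Return (label, v1_result, v2_result) for labels whose encoding changed."""
    mismatches = []
    for label in labels:
        try:
            expected = to_ascii_v1(label)
        except UnicodeError:
            # Rejected by the IDNAv1 reference, so the goal does not apply.
            continue
        actual = to_ascii_v2(label)
        if actual != expected:
            mismatches.append((label, expected, actual))
    return mismatches
```

   Passing to_ascii_v1 itself as to_ascii_v2 yields an empty list; any
   non-empty result identifies labels whose representation would change.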

   This document updates IDNAv1 to reflect Unicode version 5.1.  Of
   course, the Unicode Consortium will not stop at Unicode version 5.1.
   Because of that, IDNAv2 will probably later need to be updated to
   reflect newer versions of Unicode.

1.1.  Acknowledgements

   The first serious work on updating IDNAv1 was undertaken by John
   Klensin, Patrik Faltstrom, Harald Alvestrand, and Cary Karp.  It led
   to the formation of the IDNAbis Working Group in the IETF, and they
   produced many revisions of their documents in that WG.  Some of the
   ideas in this IDNAv2 document (most notably, the update to the bidi
   algorithm) are derived from their efforts.

   Many, many people worked on IDNAv1.  In addition to the authors of
   the standards (Marc Blanchet, Adam Costello, Patrik Faltstrom, and
   me), there were literally dozens of active participants in the
   original IDN Working Group in the IETF that began in 2000.  Their
   tireless effort led to IDNAv1.

1.2.  Conventions Used In This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   In sections of this document where changes are made to RFCs, those
   changes are shown with a vertical line character ("|") in the first
   column.

2.  Changes to RFC 3490 (IDNA v.1)

   All references to the Unicode Standard are updated to refer to
   [UNICODE51].

   All references to Nameprep are updated to refer to the Nameprep in
   this document.  Similarly, all references to stringprep are updated
   to refer to the stringprep in this document.

   In section 3.1, the first bullet point ("1) Whenever dots are
   used...") is changed to add the following at the end of the sentence:
   "U+2CFE (Coptic full stop)".

3.  Changes to RFC 3454 (Stringprep)

   [[[ ============================================================
   NOTE FOR EARLY VERSIONS OF THIS DRAFT

   This section is intentionally incomplete.  The tables in Stringprep
   need to be added to based on the characters added to the repertoire
   after Unicode 3.2 up to and including Unicode 5.1.

   Probably the best way for this to be done is for a few dedicated
   individuals to go through the new characters one-by-one, and also to
   go through them programmatically, and see which tables need to be
   added to.  I have done a first pass of doing this one-by-one, but I
   felt that publishing my results in the first draft would cause others
   to get lazy about this important task.  Future versions of this
   document will reflect the results of that work.

   The character review will be similar to what we did in IDNAv1, except
   that we don't have to create any new buckets.  Basically, we have to
   see whether a particular new character should be mapped to nothing,
   or whether it should be prohibited for one of the reasons already
   listed in RFC 3454.  In my not-careful first pass, I found very few
   characters that will need to be added to sections 3 or 5.  The
   case-mapping will happen algorithmically, with a check that the new
   map does not change any value in the old map.

   ============================================================ ]]]

   This document is significantly revised to reflect the use of Unicode
   version 5.1.  All the substantive changes are additions.  There
   has been no effort to "correct" perceived mistakes in RFC 3454.  (One
   can argue that the extending of the bidi rules in section 6 to allow
   more languages to be expressed is such a correction; however, the
   change allows more strings, and doesn't cause any string that was
   allowed in RFC 3454 to be disallowed in the new version.)
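
   The programmatic pass mentioned in the note above could start from
   something like the following sketch.  It relies on Python's
   unicodedata module, which ships a Unicode 3.2 database as
   unicodedata.ucd_3_2_0.  One assumption to keep in mind: the "new"
   database here is whatever Unicode version the running interpreter
   provides, which is generally later than 5.1, so the output is a
   superset of the characters this document is concerned with.  The
   function names are hypothetical.

```python
import sys
import unicodedata

# Unicode 3.2 database bundled with Python (the version IDNAv1 is based on).
old = unicodedata.ucd_3_2_0


def added_since_3_2(start: int = 0, stop: int = sys.maxunicode + 1):
    """Yield (code point, category) for characters that are unassigned in
    Unicode 3.2 but assigned in the interpreter's current database."""
    for cp in range(start, stop):
        ch = chr(cp)
        if old.category(ch) == "Cn" and unicodedata.category(ch) != "Cn":
            yield cp, unicodedata.category(ch)


def extends_without_change(old_map: dict, new_map: dict) -> bool:
    """True if every entry of old_map appears unchanged in new_map, that is,
    the new case map only adds rows and never alters an existing value."""
    return all(
        key in new_map and new_map[key] == value
        for key, value in old_map.items()
    )
```

   Each code point reported by added_since_3_2 would still need a human
   decision about which stringprep table, if any, it belongs in.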

   Most of the changes to RFC 3454 are to add characters to the tables
   in the document.  These characters come from Unicode version 5.1.
   Thus, the tables become valid for Unicode version 5.1.  However, the
   same tables are still valid for Unicode version 3.2 because a profile
   that is still using version 3.2 will not ever use the added rows in
   the updated tables.

   In all places other than Appendix A, references to "[Unicode3.2]" are
   updated to refer to [UNICODE51].  Similarly, all text references to
   "Unicode version 3.2" are updated to "Unicode version 5.1".

   Characters will be added to the tables in section 3.1 to reflect the
   differences between Unicode 3.2 and Unicode 5.1.  For example,
   U+E0100 to U+E01EF will be added to the second list in the section.

   In section 3.2, change "CaseFolding-3.txt" to "CaseFolding.txt".

   Characters will be added to the tables in subsections of section 5.
   An example is that U+2064 will be added to the list in section 5.2.

   In section 6, at the end of the fourth paragraph (which currently
   ends with "have bidirectional category "EN"."), the following
   sentence is added: "The Unicode Standard also defines a bidirectional
   category "NSM" for "non-spacing marks"."

   In section 6, the third requirement is changed to read:

|  3) If a string contains any RandALCat character, the first
|     character MUST be a RandALCat character, and the last
|     characters of the string must be either a RandALCat
|     character or a RandALCat character followed by one or
|     more NSM characters.

   In the references, update the reference for UAX15, and add a
   reference for [UNICODE51].

   Appendix A is changed to read:

|  The following are the only repertoires covered in this document:
|
|  - Unicode 3.2, as defined in [UNICODE32]
|
|  - Unicode 5.1, as defined in [UNICODE51]

   A new appendix, "A.2 Unassigned code points in Unicode 5.1", will be
   added.
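
   The revised third requirement for section 6 can be sketched as a
   small check.  This is an illustration under stated assumptions, not
   a conforming implementation: it takes bidirectional categories from
   Python's unicodedata module (whatever Unicode version the interpreter
   ships) rather than from the stringprep tables, and the function name
   is hypothetical.

```python
import unicodedata

# Bidirectional categories that make a character "RandALCat".
RANDALCAT = {"R", "AL"}


def satisfies_requirement_3(label: str) -> bool:
    """Check the revised third bidi requirement for a single string."""
    classes = [unicodedata.bidirectional(ch) for ch in label]

    # The requirement only applies if some RandALCat character is present.
    if not any(c in RANDALCAT for c in classes):
        return True

    # The first character must be RandALCat.
    if classes[0] not in RANDALCAT:
        return False

    # Skip any run of trailing NSM characters, then require RandALCat.
    end = len(classes)
    while end > 0 and classes[end - 1] == "NSM":
        end -= 1
    return end > 0 and classes[end - 1] in RANDALCAT
```

   With this rule, a Dhivehi string that ends in a combining vowel sign
   is accepted, while the RFC 3454 rule required the very last character
   to be RandALCat.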

   The tables in appendixes B, C, and D will be added to.

4.  Changes to RFC 3491 (Nameprep)

   All references to IDNA and stringprep are updated to refer to the
   IDNA and stringprep in this document.

   In sections 1 and 2, "Unicode 3.2" is changed to "Unicode 5.1".

   In section 10, change the last table entry to "This is the second
   version of Nameprep."

5.  Changes to RFC 3492 (Punycode)

   IDNAv2 does not change RFC 3492.

6.  Suggestions for Registries

   This is a placeholder for a short section that covers new advice for
   registries that was not included in IDNAv1.  It will include ideas
   about multi-script labels and possibly other advice.

7.  IANA Considerations

   IANA is requested to add the following to the stringprep profile
   registry (www.iana.org/assignments/stringprep-profiles).

   Name of this profile: Nameprep

   RFC in which the profile is defined: This document.

   Indicator whether or not this is the newest version of the profile:
   This is the second version of Nameprep.

8.  Security Considerations

   The security considerations from RFCs 3454, 3490, 3491, and 3492 all
   apply to this document.  The changes between IDNAv1 and IDNAv2 are
   not believed to add any new security considerations.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of
              Internationalized Strings ("stringprep")", RFC 3454,
              December 2002.

   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

   [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
              Profile for Internationalized Domain Names (IDN)",
              RFC 3491, March 2003.

   [RFC3492]  Costello, A., "Punycode: A Bootstring encoding of Unicode
              for Internationalized Domain Names in Applications
              (IDNA)", RFC 3492, March 2003.

   [UNICODE32]
              The Unicode Consortium, "The Unicode Standard, Version
              3.2", The Unicode Standard version 3.2.

   [UNICODE51]
              The Unicode Consortium, "The Unicode Standard, Version
              5.1", The Unicode Standard version 5.1.

9.2.  Informative References

   [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

Appendix A.  Work Still to be Done

   Figure out exactly how we want the reference to Unicode 3.2 and
   Unicode 5.1 to look in the references section, then figure out how to
   wrestle xml2rfc into producing that.

   Fill in all the tables for the updates to stringprep.

   Decide if this entire document should be about Unicode 5.2, which is
   expected out by mid-2009.

Appendix B.  Changes between versions

   (This section is to be removed by the RFC Editor.)

B.1.  Changes between the -00 and -01 drafts

   In section 1, changed the target for backwards-compatibility to be
   for strings that have only visible characters.

   In section 3, removed the first paragraph.

   In section 3 (about Stringprep section 3.1), added the text about
   removing U+200C and U+200D from the mapped-to-nothing list.

   In section 3 (about Stringprep section 6), replaced:

|  3) If a string contains any RandALCat character, a RandALCat
|     character MUST be the first character of the string, and
|     either a RandALCat character or NSM character MUST be the
|     last character of the string.

   with

|  3) If a string contains any RandALCat character, the first
|     character MUST be a RandALCat character, and the last
|     characters of the string must be either a RandALCat
|     character or a RandALCat character followed by one or
|     more NSM characters.

   Added new placeholder section 6 on advice to registries.

   In Appendix A, added the thought about targeting Unicode 5.2 instead
   of Unicode 5.1.

B.2.  Changes between the -01 and -02 drafts

   Reversed the changes made in -01 with respect to U+200C and U+200D.

   Added paragraph at the end of section 1 acknowledging that IDNAv2
   will eventually need to be updated as well.

Author's Address

   Paul Hoffman

   Email: phoffman@imc.org