Internet Draft                                              Paul Hoffman
draft-ietf-idn-compare-00.txt                                 IMC & VPNC
June 14, 2000
Expires in six months

         Comparison of Internationalized Domain Name Proposals

Status of this memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC 2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

The IDN Working Group is working on proposals for internationalized
domain names that might become a standard in the IETF. Before a single
full proposal can be made, competing proposals must be compared on a
wide range of requirements and desired features. This document compares
the many parts of a comprehensive protocol that have been proposed. It
is the companion document to "Requirements of Internationalized Domain
Names" [IDN-REQ], which lays out the requirements for the
internationalized domain name protocol.

1. Introduction

As the IDN Working Group has discussed the requirements for IDN,
suggestions have been made for various candidate protocols that might
meet the requirements. These proposals have been somewhat helpful in
bringing up real-world needs for the requirements.

It became clear that no single proposal had wide agreement from the
working group. In fact, the authors of various proposals found
themselves taking some features from other proposals as they revised
their drafts. At the same time, working group participants were making
suggestions for incremental changes that might affect more than one
proposal.

Because of this mixing and matching, it was decided that this IDN
comparisons document should compare features that might end up in the
final protocol, not full protocol suggestions themselves. The features
that had been discussed in the working group were divided by function,
and appear in this document in separate sections. For each function,
there are multiple suggestions for protocol elements that might meet the
requirements that are described in [IDN-REQ].

This document is being discussed on the "idn" mailing list. To join the
list, send a message to with the words
"subscribe idn" in the body of the message. Archives of the mailing list
can also be found at ftp://ops.ietf.org/pub/lists/idn*.

1.1 Format of this document

Each section covers one feature that has been discussed as being part of
the final IDN solution. Within each section, alternate proposals are
listed with the major perceived pros and cons of the proposal. Also,
each proposal is given a label to make discussion of this document (and
of the proposals themselves) easier.

References to the numbered requirements in [IDN-REQ] are from version
-02 of that document. These numbers are expected to change as the
requirements document evolves. In this draft, the requirements are shown
as "[#n-02]", where "n" is the requirement number from draft -02 of
[IDN-REQ]. This document only lists where particular proposals don't
meet particular requirements from [IDN-REQ], not the ones that they
fulfill.

Note that this document is supposed to reflect the discussion of all
proposed alternatives, not just the ones that fully match the
requirements in [IDN-REQ]. It will serve as a summary of the discussion
in the IDN WG for readers in the future who may want to know why certain
alternatives were not chosen for the eventual protocol.

The proposal drafts covered in this document are:

[DUERST] Character Normalization in IETF Protocols,
draft-duerst-i18n-norm-03

[HOFFMAN] Compatible Internationalized Domain Names Using Compression,
draft-hoffman-idn-cidnuc-03

[KWAN] Using the UTF-8 Character Set in the Domain Name System,
draft-skwan-utf8-dns-03

[OSCARSSON] Internationalisation of the Domain Name Service,
draft-oscarsson-i18ndns-00

[SENG] UTF-5, a transformation format of Unicode and ISO 10646,
draft-jseng-utf5-01

1.2 Editor's note for the -00 draft

This first draft is probably incomplete in many aspects. There may be
proposals that appeared on the mailing list or in conversation that I
did not include here, and there are likely to be pro and con arguments
that I have left out. Any such omission is not an indication of the lack
of merit, but instead an error on the editor's part. Also, if there are
proposals here that do not fulfill some of the requirements from
[IDN-REQ] but that fact is not reflected here, that is an omission that
should be corrected. In any case, please actively send comments and
corrections to this document to the IDN working group.

2. Architecture

One of the biggest questions raised early in the IDN discussion was what
the format of internationalized name parts would be on the wire, that
is, between the user's computer and the DNS resolvers.
It was agreed that the DNS protocols certainly allow non-ASCII octets in
domain name parts and resource records, but there was also
acknowledgement that many protocols that rely on the DNS could not
handle non-ASCII names due to the design of those protocols. Section 3.1
of this document describes the proposed encodings for the non-ASCII name
parts.

Because of requirement [#2-02], there were proposals for
ASCII-compatible encodings (ACEs) of non-ASCII characters. Different
ACEs were proposed (and are discussed in Section 4 of this document),
but they all have the same goal: to allow non-ASCII characters to be
represented in host names that conform to RFC 1034 [RFC1034].

2.1 arch-1: Just send binary

[KWAN] proposes beginning to send characters outside the range allowed
in RFC 1034.

Pro: Easiest to describe. Only changes host name syntax, not any of the
related DNS protocols.

Con: Doesn't work with many existing protocols that rely on DNS.
Violates requirement [#9-02].

2.2 arch-2: Send binary or ACE

[OSCARSSON] proposes using both binary and ACE formats on the wire.

Pro: Allows protocols that can handle binary name parts to use them
directly, while allowing protocols that cannot use binary name parts to
also handle names without conversion. Allows domain names in free text
to be displayed in binary even in systems that require ACE-formatted
names on the wire.

Con: Requires all software that uses domain names to handle both
formats. Requires processing time for conversion of ACE formats into the
format most likely used internally by the software.

2.3 arch-3: Just send ACE

[HOFFMAN] and [SENG] propose that host naming rules remain the same and
that all internationalized domain names be sent in ACE format.

Pro: No changes at all to current DNS protocols.

Con: Requires all software to recognize ACE domain names and convert
them to human-readable form for display. This is true not only of domain
names used on the wire but also of domain names used in free text.

3. Names in binary

Both arch-1 and arch-2 include domain name parts that are represented on
the wire in a binary format. This section describes some of the features
of such names.

3.1 bin-1: Format

There are many different charsets and encodings for the scripts of the
world. The WG has discussed which binary encoding should be used on the
wire.

3.1.1 bin-1.1: UTF-8

The IETF policy on character sets [RFC2277] states that UTF-8 [RFC2279]
is the preferred charset for IETF protocols. UTF-8 encodes all
characters in the ISO 10646 repertoire.

Pro: Well-supported in other IETF protocols. Compact for most scripts.
Wide implementation in programming languages. US-ASCII characters have
the same encoding in UTF-8 as they do in US-ASCII. Because it is based
on ISO 10646, expansion of the repertoire comes from respected
international standards bodies.

Con: Asian scripts require three octets per character.

3.1.2 bin-1.2: Labelled charsets

Mailing list discussion mentioned using multiple charsets for the binary
representation. Each name part would be labelled with the charset used.

Pro: Allows users to specify names in the charsets they are most
familiar with.

Con: All resolvers would have to know all charsets. Thus, the number of
charsets would probably have to be limited and never expand. Mapping of
characters between charsets would have to be exact and not change over
time.

3.2 bin-2: Distinguishing binary from current format

Software built for current domain names might give unexpected results
when dealing with non-ASCII characters in domain names.
For example, it was reported on the mailing list that some software
crashes when a non-ASCII domain name is returned for in-addr.arpa
requests. Thus, there may be a need for IDN to prevent software that is
not binary-aware from receiving domain names with binary parts. This
would only apply to an IDN that used arch-2, not arch-1.

3.2.1 bin-2.1: Don't mark binary

[KWAN] does not specify any way of changing requests to prevent binary
name parts from being transmitted.

Pro: No changes to current DNS requests and responses.

Con: Likely to cause disruption in software that is not binary-aware.
Likely to cause systems to misread names and possibly (and incorrectly)
convert them to ASCII names by stripping off the high bit in octets;
this in turn would lead to security problems due to mistaken identities.
Returning binary host names to DNS queries is known to break some
current software.

3.2.2 bin-2.2: Mark binary with IN bit

[OSCARSSON] describes using a bit from the header of DNS queries to mark
the query as possibly containing a binary name part and indicating that
the response to the query can contain binary name parts.

Pro: This bit is currently unused and must be set to zero, so current
software won't use it accidentally. No changes to any other part of the
query or RRs.

Con: It's the last unused bit in the header and DNS folks have indicated
that they are very hesitant to give it up.

3.2.3 bin-2.3: Mark binary with new QTYPEs

Off-list discussion has mentioned using new QTYPEs to mark the query as
possibly containing a binary name part and indicating that the response
to the query can contain binary name parts. QTYPEs are two octets long,
and no QTYPEs to date use more than the lower eight bits, so one of the
bits from the upper octet could be used to indicate binary names.

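As a rough illustration of bin-2.3, the Python sketch below sets one bit
in the upper octet of the two-octet QTYPE field to flag a query as
binary-aware. The choice of the topmost bit (0x8000) and the names
BINARY_NAME_FLAG, mark_binary, split_qtype, and encode_qtype are
assumptions made only for this example; the discussion summarized here
did not settle on a particular bit.

```python
import struct

# Assumption for illustration only: which upper-octet bit would be used
# was not specified, so the top bit is picked arbitrarily here.
BINARY_NAME_FLAG = 0x8000

QTYPE_A = 1    # host address
QTYPE_MX = 15  # mail exchange


def mark_binary(qtype):
    """Return the QTYPE with the hypothetical binary-name bit set."""
    if not 0 <= qtype <= 0xFFFF:
        raise ValueError("QTYPE must fit in two octets")
    return qtype | BINARY_NAME_FLAG


def split_qtype(wire_qtype):
    """Split a QTYPE value into (base type, binary-name flag)."""
    base = wire_qtype & ~BINARY_NAME_FLAG & 0xFFFF
    return base, bool(wire_qtype & BINARY_NAME_FLAG)


def encode_qtype(qtype):
    """Encode a QTYPE as two octets in network byte order."""
    return struct.pack("!H", qtype)


if __name__ == "__main__":
    marked = mark_binary(QTYPE_MX)
    print(encode_qtype(marked).hex())  # 800f
    print(split_qtype(marked))         # (15, True)
    print(split_qtype(QTYPE_A))        # (1, False)
```

Masking off the flag recovers the original QTYPE, so the rest of the
query is left unchanged in this sketch.
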
Pro: These bits are currently unused and must be set to zero, so current
software won't use them accidentally. No changes to any other part of
the query or RRs. Uses a bit that isn't as prized as the IN bit.

Con: Software must pay more attention to the QTYPEs than it might have
previously.

3.2.4 bin-2.4: Mark binary with EDNS0

Off-list discussion has mentioned using EDNS0 [RFC2671] to mark the
query as possibly containing a binary name part and indicating that the
response to the query can contain binary name parts.

Pro: There is little use of EDNS0 at this point, so it is very unlikely
to have bad interactions with old software.

Con: There is little use of EDNS0 and this might make implementation
harder.

4. Names in ASCII-compatible encoding (ACE)

Both arch-2 and arch-3 include domain name parts that are represented on
the wire in an ASCII-compatible encoding (ACE). This section describes
some of the features of such names.

4.1 ace-1: Format

A variety of formats for ACE have been proposed. Each proposal has
different features, such as how many characters can be encoded within
the 63 octet limit for each name part. The length descriptions in this
section assume that ACE names are not distinguished from current names;
this is not a likely outcome of the WG work.

The descriptions of lengths are based on script block names from
[BLOCK-NAMES].

4.1.1 ace-1.1: UTF-5

[SENG] describes UTF-5, which is a fairly direct encoding of ISO 10646
characters using a system similar to UTF-8. Characters from Basic Latin
and Latin-1 Supplement take 2 octets; Latin Extended-A through Tibetan
take 3 octets; Myanmar through the end of BMP take 4 octets; non-BMP
characters take 5 octets. This means that names consisting entirely of
characters from Myanmar through the end of BMP are limited to 15
characters.

Pro: Extremely simple.

Con: Poor compression, particularly for Asian scripts.

4.1.2 ace-1.2: CIDNUC

[HOFFMAN] describes CIDNUC, which is a two-step algorithm that first
compresses the name part, then converts the compressed string into an
ACE. Name parts in all scripts other than Han, Yi, Hangul syllables, and
non-BMP take up ceil(1.6*(n+1)) octets; name parts in those scripts and
any name that mixes characters from different rows in ISO 10646 take up
ceil(3.2*(n+1)) octets. This means that names using Han, Yi, or Hangul
syllables are limited to 18 characters.

Pro: Best compression for most scripts, and similar compression for the
scripts where it is not the best.

Con: More complicated than UTF-5. Not well optimized for names that have
mixed scripts, such as non-Latin names that use hyphen or ASCII digits.

4.1.3 ace-1.3: Hex of UTF-8

[OSCARSSON] describes "hex of UTF-8", which is a straightforward
hexadecimal encoding of UTF-8. Characters in Basic Latin (other than
non-US-ASCII and hyphen) take 3 octets; Latin Extended-A through Tibetan
take 5 octets; Myanmar through end of BMP take 7 octets; non-BMP
characters take 9 octets. This means that names consisting entirely of
characters from Myanmar through the end of BMP are limited to 9
characters.

Pros: Very simple to describe.

Cons: Very poor compression for all scripts.

4.1.4 ace-1.5: SACE

A message on the mailing list pointed to code for SACE, an ASCII
encoding that purports to compact to about the same size as UTF-8.

Pros: Similar compression to UTF-8.

Cons: No description of how the algorithm works.

4.2 ace-2: Distinguishing ACE from current names

Software that finds ACE name parts in free text probably should
display the name part using the actual characters, not the ACE
equivalent. Thus, software must be able to identify which ASCII name
parts are ACE and which are non-ACE ASCII parts (such as current names).
This would only apply to an IDN proposal that used arch-2, not arch-3.

4.2.1 ace-2.1: Currently legal names

Name parts that are currently legal in RFC 1034 can be tagged to
indicate the part is encoded with ACE.

4.2.1.1 ace-2.1.1: Add hopefully-unique legal tag

[HOFFMAN] proposes adding a hopefully-unique legal tag to the beginning
of the name. The proposal would also work with such a tag at the end of
the name part, but it is easier for most people to recognize at the
beginning of name parts.

Pros: Easy for software (and humans) to recognize.

Cons: There is no way to prevent people from beginning non-ACE names
with the tag. Unless the tag is very unlikely to appear in any name in
any human language, non-ACE names that begin with the tag will display
oddly or be rejected by some systems.

4.2.1.2 ace-2.1.2: Add a checksum

Off-list discussion has mentioned the possibility of creating a checksum
mechanism where the checksum would be added to the beginning (or end) of
ACE name parts.

4.2.2 ace-2.2: Currently illegal names

Instead of creating names that are currently legal, another proposal is
to create names that use the current ASCII characters but are illegal.

4.2.2.1 ace-2.2.1: Add trailing hyphen

[OSCARSSON] describes using a trailing hyphen as a signifier of an ACE
name.

Pros: It is surmised that most current software does not reject names
that are illegal in this fashion. Thus, there would be little disruption
to current systems. This mechanism takes up fewer characters than any
proposed in ace-2.1.

Cons: Some current software will probably break with this mechanism.
It goes against some current protocols that match the rules in RFC 1034.

5. Prohibited characters

There was a short but active discussion on the mailing list about which
characters from the ISO 10646 character set should never appear in host
names.
To date, there are no Internet Drafts on the subject. This section
summarizes some of the suggestions.

5.1 prohib-1: Identical and near-identical characters

Some characters are visually identical or extremely similar to other
characters, thus making it impossible to accurately enter host names
that are seen in print.

5.2 prohib-2: Separators

Horizontal and vertical spacing characters would make it unclear where a
host name begins and ends. Allowing periods and period-like characters
as characters within a name part would cause similar confusion.

5.3 prohib-3: Non-displaying and non-spacing characters

The ISO 10646 character set contains many characters that cannot be
seen. These include control characters, non-breaking spaces, formatting
characters, and tagging characters. These characters would certainly
cause confusion if allowed in host names.

5.4 prohib-4: Private use characters

Private use characters from ISO 10646 inherently have no specified
visual form (and in fact can be used for non-displaying characters).
Thus, there could be no visual interoperability for characters in the
private use areas.

5.5 prohib-5: Punctuation

Some punctuation characters are disallowed in URLs because they are used
in URL syntax.

5.6 prohib-6: Symbols

Some mailing list discussion stated that characters that do not normally
appear in human or company names should not be allowed in host names.
This includes symbols and non-name punctuation.

6. Canonicalization

The working group had a spirited discussion on the need for
canonicalization. [IDN-REQ] describes many requirements for when and what
type of canonicalization might be performed.

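To make the need for canonicalization concrete, the Python sketch below
compares two spellings of the same visible name part that differ in
their ISO 10646 code points. It uses Normalization Form C from Python's
standard unicodedata module purely as an example; which form, if any,
IDN would adopt is the open question this section covers, and the helper
name same_name_part is hypothetical.

```python
import unicodedata

# Two ways to write "cafe" with an acute accent on the final letter:
# a precomposed e-acute (U+00E9), or a plain "e" followed by a
# combining acute accent (U+0301).
precomposed = "caf\u00e9"
decomposed = "cafe\u0301"


def same_name_part(a, b, form="NFC"):
    """Compare two name parts after applying a normalization form.

    The default form is an assumption for this example, not a decision
    taken by the working group.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    print(precomposed == decomposed)                # False
    print(same_name_part(precomposed, decomposed))  # True
```

Without a canonicalization step, the two strings are different sequences
of code points and would be treated as different names even though they
display identically.
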
6.1 canon-1: Type of canonicalization

The Unicode Consortium's recommendations and definitions of
canonicalization [UTR15] describe many forms of canonicalization that
can be performed on character strings. [DUERST] covers much of the same
ground but makes more focused requirements for canonicalization on the
Internet.

6.1.1 canon-1.1: Normalization Form C

Both [UTR15] and [DUERST] recommend Normalization Form C, as described
in [UTR15]. This form is a canonical decomposition, followed by
canonical composition.

6.1.2 canon-1.2: Normalization Form KC

Discussion on the mailing list recommended Normalization Form KC. This
form is a compatibility decomposition, followed by canonical
composition. Compatibility decomposition makes characters that have
compatibility equivalence the same after decomposing.

6.2 canon-2: Other canonicalization

Host names may have special canonicalization needs that can be added to
those given in canon-1.

6.2.1 canon-2.1: Case folding in ASCII

RFC 1034 specifies that there is no difference between host names that
have the same letters in different case. Thus, the name part "example"
is considered the same as "Example" and "EXamPLe". Neither uppercase nor
lowercase is specified as being canonical.

6.2.2 canon-2.2: Case folding in non-ASCII

Discussion on the mailing list has raised the issue of whether or not
non-ASCII Latin characters should have the same case-folding rules as
ASCII. Such rules would match the expectations of native speakers of
some languages, but would go counter to the expectations of native
speakers of other languages.

6.2.3 canon-2.3: Han folding

Discussion on the mailing list has raised the issue of equivalences in
some languages' use of Han characters. For example, in Chinese, there
are many traditional characters that have equivalent simplified
characters.
Similarly, there are some Han ideographs for which there are multiple
representations in ISO 10646. There are no well-established rules for
such folding, and some of the proposed folding would be locale-specific.

7. Transitions

Early in the working group discussion, there was active debate about how
the transition from the current host name rules to IDN would be handled.
Given requirement [#1-02], this transition is quite important to
deciding which proposals might be feasible.

7.1 trans-1: Always do current plus new architecture

In this proposal, IDN will be used at the same time as the current DNS
forever. That is, IDN will be in addition to the current DNS.

7.2 trans-2: Transition period

In this proposal, IDN will be used at the same time as the current DNS
for a specified period of time, after which only IDN will exist. That
is, IDN will replace the current DNS.

8. Root server considerations

DNS root servers receive all requests for top-level domains that are not
in the local DNS cache. They are critical to the Internet. Care must be
taken to ensure that root servers will not be affected by new mechanisms
introduced.

Any IDN proposal that includes a binary encoding will have an impact on
the root servers. The binary requests will affect the root servers
because the current root server software is designed to handle current
host names. Further, the root zone files which contain ccTLDs and gTLDs
would have to support binary domain names and possibly binary host names
for NS records. Because all the root servers are equivalent, they would
have to be synchronized to support the binary domain names at the same
time.

Proposals that only use ACE and use tagging with currently-legal names
would, by definition, not affect the root servers.

9. Security considerations

All security considerations listed in [IDN-REQ] apply to this document.
Further, all security considerations listed in each of the IDN proposals
must be considered when comparing the proposals.

Some proposals described in this document may create new security
considerations. However, these considerations will have to be addressed
in the eventual protocol document. All the proposals described here are
still incomplete and security considerations may be added to them as
they are revised. All the proposals listed in this document use the ISO
10646 character set, so the proposals inherit any security
characteristics of that character set.

Many protocols and applications rely on domain names to identify the
parties involved in a network transaction. For example, a user who
connects to a web site by entering or selecting a URL expects that their
software will select the web site named in the URL. The uniqueness of
domain names is crucial to ensuring identification of Internet entities.

To make round-trip translation between local charsets and ISO 10646
possible, the ISO 10646 specification has assigned multiple code points
to individual glyphs. Moreover, some glyphs might look similar to some
users, but look clearly different to other users. This means that it
would be simple for an attacker to mimic a domain name by using
similar-looking but different glyphs and guessing that some users will
not see the difference in their user interface.

Some IDN protocols may be open to denial of service attacks, such as by
using non-identified characters, exception characters, or
under-specified behavior in using some special characters.

10. IANA considerations

This document does not create any new IANA registries. However, it is
possible that a character property registry may need to be set up when
the IDN protocol is created in order to list prohibited characters
(section 5) and canonicalization mappings (section 6).

11. Acknowledgements

James Seng and Marc Blanchet gave many helpful suggestions on the
pre-release versions of this document.

12. References

[BLOCK-NAMES] Unicode Consortium,
.

[DUERST] Character Normalization in IETF Protocols,
draft-duerst-i18n-norm-03

[HOFFMAN] Compatible Internationalized Domain Names Using Compression,
draft-hoffman-idn-cidnuc-03

[IDN-REQ] Requirements of Internationalized Domain Names,
draft-ietf-idn-requirements-02

[KWAN] Using the UTF-8 Character Set in the Domain Name System,
draft-skwan-utf8-dns-03

[OSCARSSON] Internationalisation of the Domain Name Service,
draft-oscarsson-i18ndns-00

[RFC1034] Domain Names - Concepts and Facilities, RFC 1034

[RFC2277] IETF Policy on Character Sets and Languages, RFC 2277

[RFC2279] UTF-8, a transformation format of ISO 10646, RFC 2279

[RFC2671] Extension Mechanisms for DNS (EDNS0), RFC 2671

[SENG] UTF-5, a transformation format of Unicode and ISO 10646,
draft-jseng-utf5-01

[UTR15] Unicode Normalization Forms, Unicode Technical Report #15

A. Author Contact

Paul Hoffman
IMC & VPNC
127 Segre Place
Santa Cruz, CA 95060
phoffman@imc.org or paul.hoffman@vpnc.org