Internet Draft                                          Patrik Faltstrom
draft-ietf-idn-idna-07.txt                                         Cisco
February 24, 2002                                           Paul Hoffman
Expires in six months                                         IMC & VPNC
                                                        Adam M. Costello
                                                             UC Berkeley

Internationalizing Domain Names in Applications (IDNA)

Status of this Memo

This document is an Internet-Draft and is in full conformance with all
provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task
Force (IETF), its areas, and its working groups. Note that other groups
may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Abstract

Until now, there has been no standard method for domain names to use
characters outside the ASCII repertoire. This document defines
internationalized domain names (IDNs) and a mechanism called IDNA for
handling them in a standard fashion. IDNs use characters drawn from a
large repertoire (Unicode), but IDNA allows the non-ASCII characters to
be represented using the same octets used in so-called host names
today. IDNA is only meant for processing domain names, not free
text.

1. Introduction

IDNA works by allowing applications to use certain ASCII name labels
(beginning with a special prefix) to represent non-ASCII name labels.
Lower-layer protocols need not be aware of this; therefore IDNA does not
require changes to any infrastructure. In particular, IDNA does not
require any changes to DNS servers, resolvers, or protocol elements,
because the ASCII name service provided by the existing DNS is entirely
sufficient.

This document does not require any applications to conform to IDNA,
but applications can elect to use IDNA in order to support IDN while
maintaining interoperability with existing infrastructure. Adding IDNA
support to an existing application entails changes to the application
only, and leaves room for flexibility in the user interface.

A great deal of the discussion of IDN solutions has focused on
transition issues and how IDN will work in a world where not all of the
components have been updated. Other proposals would require that user
applications, resolvers, and DNS servers be updated in order for a user
to use an internationalized domain name. Rather than require widespread
updating of all components, IDNA requires only user applications to be
updated; no changes are needed to the DNS protocol or any DNS servers or
the resolvers on users' computers.

1.1 Interaction of protocol parts

IDNA requires that implementations process input strings with Nameprep
[NAMEPREP], which is a profile of Stringprep [STRINGPREP], and then with
Punycode [PUNYCODE]. Implementations of IDNA MUST fully implement
Nameprep and Punycode; neither Nameprep nor Punycode is optional.

2. Terminology

The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
"MAY" in this document are to be interpreted as described in RFC 2119
[RFC2119].

A code point is an integral value associated with a character in a coded
character set.

Unicode [UNICODE] is a coded character set containing tens of thousands
of characters. A single Unicode code point is denoted by "U+" followed
by four to six hexadecimal digits, while a range of Unicode code points
is denoted by two hexadecimal numbers separated by "..", with no
prefixes.

ASCII means US-ASCII, a coded character set containing 128 characters
associated with code points in the range 0..7F.
Unicode is an extension of ASCII: it includes all the ASCII characters
and associates them with the same code points.

The term "LDH code points" is defined in this document to mean the code
points associated with ASCII letters, digits, and the hyphen-minus; that
is, U+002D, 30..39, 41..5A, and 61..7A. "LDH" is an abbreviation for
"letters, digits, hyphen".

[STD13] talks about "domain names" and "host names", but many people use
the terms interchangeably. Further, because [STD13] was not terribly
clear, many people who are sure they know the exact definitions of each
of these terms disagree on the definitions.

A label is an individual part of a domain name. Labels are usually shown
separated by dots; for example, the domain name "www.example.com" is
composed of three labels: "www", "example", and "com". (The zero-length
root label that is implied in domain names, as described in [STD13], is
not considered a label in this specification.) Throughout this document
the term "label" is shorthand for "text label", and "every label" means
"every text label". In IDNA, not all text strings can be labels.

An "internationalized domain name" (IDN) is a domain name for which the
ToASCII operation (see section 4) can be applied to each label without
failing. This document does not attempt to define an "internationalized
host name". It is expected that protocols and name-handling bodies will
want to limit the characters allowed in IDNs further than what is
specified in this document, such as to prohibit additional characters
that they feel are unneeded or harmful in registered domain names.

An "internationalized label" is a label composed of characters from the
Unicode character set; note, however, that not every string of Unicode
characters can be an internationalized label.
To allow internationalized labels to be handled by existing
applications, IDNA uses an "ACE label" (ACE stands for ASCII Compatible
Encoding), which can be represented using only ASCII characters but is
equivalent to a label containing non-ASCII characters. More rigorously,
an ACE label is defined to be any label that the ToUnicode operation
would alter (see section 4.2). For every internationalized label that
cannot be directly represented in ASCII, there is an equivalent ACE
label. The conversion of labels to and from the ACE form is specified in
section 4.

The "ACE prefix" is defined in this document to be a string of ASCII
characters that appears at the beginning of every ACE label. It is
specified in section 5.

A "domain name slot" is defined in this document to be a protocol
element or a function argument or a return value (and so on) explicitly
designated for carrying a domain name. Examples of domain name slots
include: the QNAME field of a DNS query; the name argument of the
gethostbyname() library function; the part of an email address following
the at-sign (@) in the From: field of an email message header; and the
host portion of the URI in the src attribute of an HTML tag. General
text that just happens to contain a domain name is not a domain name
slot; for example, a domain name appearing in the plain text body of an
email message is not occupying a domain name slot.

An "internationalized domain name slot" is defined in this document to
be a domain name slot explicitly designated for carrying an
internationalized domain name as defined in this document. The
designation may be static (for example, in the specification of the
protocol or interface) or dynamic (for example, as a result of
negotiation in an interactive session).

A "generic domain name slot" is defined in this document to be any
domain name slot that is not an internationalized domain name slot.
Obviously, this includes any domain name slot whose specification
predates IDNA.

3. Requirements

IDNA conformance means adherence to the following three requirements:

1) Whenever a domain name is put into a generic domain name slot (see
section 2), every label MUST contain only ASCII characters. Given an
internationalized domain name (IDN), an equivalent domain name
satisfying this requirement can be obtained by applying the ToASCII
operation (see section 4) to each label.

2) ACE labels obtained from domain name slots SHOULD be hidden from
users except when the use of the non-ASCII form would cause problems or
when the ACE form is explicitly requested. Given an internationalized
domain name, an equivalent domain name containing no ACE labels can be
obtained by applying the ToUnicode operation (see section 4) to each
label. When requirements 1 and 2 both apply, requirement 1 takes
precedence.

3) Whenever two labels are compared, they MUST be considered to
match if and only if their ASCII forms (obtained by applying ToASCII)
match using a case-insensitive ASCII comparison.

4. Conversion operations

This section specifies the ToASCII and ToUnicode operations. Each one
operates on a sequence of Unicode code points (but remember that all
ASCII code points are also Unicode code points). When domain names are
represented using character sets other than Unicode and ASCII, they will
need to first be transcoded to Unicode before these operations can be
applied, and might need to be transcoded back afterwards.

4.1 ToASCII

The ToASCII operation takes a sequence of Unicode code points and
transforms it into a sequence of code points in the ASCII range (0..7F).
The original sequence and the resulting sequence are equivalent labels.
(If the original is an internationalized label that cannot be directly
represented in ASCII, the result will be the equivalent ACE label.)

ToASCII fails if any step of it fails. If any step fails, the original
sequence MUST NOT be used as a label in an IDN.

The inputs to ToASCII are a sequence of code points; a flag indicating
whether to prohibit unassigned code points (see [STRINGPREP]); and a
flag indicating whether to apply the host name syntax rules. The output
of ToASCII is either a sequence of ASCII code points or a failure
condition.

ToASCII never alters a sequence of code points that are all in the ASCII
range to begin with (although it could fail).

ToASCII consists of the following steps:

    1. If all code points in the sequence are in the ASCII range (0..7F)
       then skip to step 3.

    2. Perform the steps specified in [NAMEPREP] and fail if there is
       an error.

    3. If the label is part of a host name (or is subject to the host
       name syntax rules) then perform these checks:

         (a) Verify the absence of non-LDH ASCII code points; that is,
             the absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.

         (b) Verify the absence of leading and trailing hyphen-minus;
             that is, the absence of U+002D at the beginning and end of
             the sequence.

    4. If all code points in the sequence are in the ASCII range (0..7F),
       then skip to step 8.

    5. Verify that the sequence does NOT begin with the ACE prefix.

    6. Encode the sequence using the encoding algorithm in [PUNYCODE].

    7. Prepend the ACE prefix.

    8. Verify that the number of code points is in the range 1 to 63
       inclusive.

4.2 ToUnicode

The ToUnicode operation takes a sequence of Unicode code points and
returns a sequence of Unicode code points.
If the input sequence is a label in ACE form, then the result is an
equivalent internationalized label that is not in ACE form; otherwise
the original sequence is returned unaltered.

ToUnicode never fails. If any step fails, then the original input
sequence is returned immediately in that step.

The inputs to ToUnicode are a sequence of code points; a flag indicating
whether to prohibit unassigned code points (see [STRINGPREP]); and a
flag indicating whether to apply the host name syntax rules. The output
of ToUnicode is always a sequence of Unicode code points.

ToUnicode consists of the following steps:

    1. If all code points in the sequence are in the ASCII range (0..7F)
       then skip to step 3.

    2. Perform the steps specified in [NAMEPREP] and fail if there is an
       error. (If step 3 of ToASCII is also performed here, it will not
       affect the overall behavior of ToUnicode, but it is not
       necessary.)

    3. Verify that the sequence begins with the ACE prefix, and save a
       copy of the sequence.

    4. Remove the ACE prefix.

    5. Decode the sequence using the decoding algorithm in [PUNYCODE].
       Save a copy of the result of this step.

    6. Apply ToASCII.

    7. Verify that the sequence matches the saved copy from step 3, using
       a case-insensitive ASCII comparison.

    8. Return the saved copy from step 5.

5. ACE prefix

[[ Note to the IESG and Internet Draft readers: The two uses of the
string "IESG--" below are to be changed at time of publication to a
prefix which fulfills the requirements in the first paragraph. ]]

The ACE prefix, used in the conversion operations (section 4), is two
alphanumeric ASCII characters followed by two hyphen-minuses. It cannot
be any of the prefixes already used in earlier documents, which includes
the following: "bl--", "bq--", "dq--", "lq--", "mq--", "ra--", "wq--"
and "zq--". The ToASCII and ToUnicode operations MUST recognize the ACE
prefix in a case-insensitive manner.
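
As a non-normative illustration, the ToASCII and ToUnicode steps of
section 4, the case-insensitive prefix check described above, and the
label comparison of requirement 3 might be sketched in Python roughly as
follows. This is only a sketch under stated assumptions: the function
names are hypothetical, the nameprep() function from Python's
encodings.idna module and Python's "punycode" codec are assumed to be
adequate stand-ins for [NAMEPREP] and [PUNYCODE], the flag for
prohibiting unassigned code points is omitted, and the prefix is the
placeholder string used in this draft.

```python
import encodings.idna  # stdlib; nameprep() is assumed to stand in for [NAMEPREP]

# Placeholder prefix from this draft; the final value is not assigned here.
ACE_PREFIX = "IESG--"


def is_ascii(label: str) -> bool:
    return all(ord(c) <= 0x7F for c in label)


def has_ace_prefix(label: str, prefix: str = ACE_PREFIX) -> bool:
    # The ACE prefix is recognized case-insensitively (section 5).
    return label[:len(prefix)].lower() == prefix.lower()


def to_ascii(label: str, host_name_rules: bool = True,
             prefix: str = ACE_PREFIX) -> str:
    """Sketch of ToASCII; raises ValueError if any step fails.

    The "prohibit unassigned code points" flag is not modeled here.
    """
    if not is_ascii(label):                                 # step 1
        label = encodings.idna.nameprep(label)              # step 2
    if host_name_rules:                                     # step 3
        if any(ord(c) <= 0x7F and not (c.isalnum() or c == "-")
               for c in label):
            raise ValueError("non-LDH ASCII code point")    # step 3(a)
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("leading or trailing hyphen")  # step 3(b)
    if not is_ascii(label):                                 # step 4
        if has_ace_prefix(label, prefix):                   # step 5
            raise ValueError("label begins with the ACE prefix")
        encoded = label.encode("punycode").decode("ascii")  # step 6
        label = prefix + encoded                            # step 7
    if not 1 <= len(label) <= 63:                           # step 8
        raise ValueError("label length not in 1..63")
    return label


def to_unicode(label: str, host_name_rules: bool = True,
               prefix: str = ACE_PREFIX) -> str:
    """Sketch of ToUnicode; returns the original input if any step fails."""
    original = label
    try:
        if not is_ascii(label):                             # step 1
            label = encodings.idna.nameprep(label)          # step 2
        if not has_ace_prefix(label, prefix):               # step 3
            return original
        saved_ace = label
        body = label[len(prefix):]                          # step 4
        decoded = body.encode("ascii").decode("punycode")   # step 5
        round_trip = to_ascii(decoded, host_name_rules, prefix)  # step 6
        if round_trip.lower() != saved_ace.lower():         # step 7
            return original
        return decoded                                      # step 8
    except ValueError:  # UnicodeError is a subclass of ValueError
        return original


def labels_match(a: str, b: str) -> bool:
    # Requirement 3: compare the ASCII forms case-insensitively.
    return to_ascii(a).lower() == to_ascii(b).lower()
```

Under these assumptions, to_ascii("bücher") yields the placeholder
prefix followed by "bcher-kva", and to_unicode() applied to that result
returns "bücher". A label that does not begin with the prefix, such as
"example", is returned by to_unicode() unchanged.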

The ACE prefix for IDNA is "IESG--".

This means that an ACE label might be "IESG--de-jg4avhby1noc0d", where
"de-jg4avhby1noc0d" is the part of the ACE label that is generated by
the encoding steps in [PUNYCODE].

6. Implications for typical applications using DNS

In IDNA, applications perform the processing needed to input
internationalized domain names from users, display internationalized
domain names to users, and process the inputs and outputs from DNS and
other protocols that carry domain names.

The components and interfaces between them can be represented
pictorially as:

                 +------+
                 | User |
                 +------+
                    ^
                    | Input and display: local interface methods
                    | (pen, keyboard, glowing phosphorus, ...)
+-------------------|-------------------------------+
|                   v                               |
|          +-----------------------------+          |
|          |         Application         |          |
|          |  (conversion between local  |          |
|          |  character set and Unicode  |          |
|          |        is done here)        |          |
|          +-----------------------------+          |
|                   ^        ^                      | End system
|                   |        |                      |
| Call to resolver: |        | Application-specific |
|               ACE |        | protocol:            |
|                   v        | predefined by the    |
|              +----------+  | protocol or defaults |
|              | Resolver |  | to ACE               |
|              +----------+  |                      |
|                   ^        |                      |
+-------------------|--------|----------------------+
      DNS protocol: |        |
                ACE |        |
                    v        v
       +-------------+   +---------------------+
       | DNS servers |   | Application servers |
       +-------------+   +---------------------+

6.1 Entry and display in applications

Applications can accept domain names using any character set or sets
desired by the application developer, and can display domain names in any
charset. That is, the IDNA protocol does not affect the interface
between users and applications.
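
As a small, non-normative illustration of the conversion between a local
character set and Unicode that the application performs, the following
Python sketch decodes a domain name received in some local charset and
splits it into labels, each of which could then be handed to ToASCII.
The function name and the example charsets are assumptions made for
this illustration only.

```python
def labels_from_local(raw: bytes, charset: str) -> list[str]:
    """Transcode a domain name from a local charset and split it into labels."""
    name = raw.decode(charset)   # local character set -> Unicode
    return name.split(".")       # labels are shown separated by dots


# Example: the same kind of call works whatever charset the input used.
latin1_labels = labels_from_local("www.bücher.example".encode("iso-8859-1"),
                                  "iso-8859-1")
sjis_labels = labels_from_local("日本語.example".encode("shift_jis"),
                                "shift_jis")
```

Transcoding happens before any IDNA processing, so the rest of the
application only ever deals with sequences of Unicode code points.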

An IDNA-aware application can accept and display internationalized
domain names in two formats: the internationalized character set(s)
supported by the application, and as an ACE label. ACE labels that are
displayed or input MUST always include the ACE prefix. Applications MAY
allow input and display of ACE labels, but are not encouraged to do so
except as an interface for special purposes, possibly for debugging. ACE
encoding is opaque and ugly, and should thus only be exposed to users
who absolutely need it. The optional use, especially during a transition
period, of ACE encodings in the user interface is described in section
6.4. Because name labels encoded as ACE name labels can be rendered
either as the encoded ASCII characters or the proper decoded characters,
the application MAY have an option for the user to select the preferred
method of display; if it does, rendering the ACE SHOULD NOT be the
default.

Domain names are often stored and transported in many places. For
example, they are part of documents such as mail messages and web pages.
They are transported in many parts of many protocols, such as both the
control commands and the RFC 2822 body parts of SMTP, and the headers
and the body content in HTTP. It is important to remember that domain
names appear both in domain name slots and in the content that is passed
over protocols.

In protocols and document formats that define how to handle
specification or negotiation of charsets, labels can be encoded in any
charset allowed by the protocol or document format. If a protocol or
document format only allows one charset, the labels MUST be given in
that charset.

In any place where a protocol or document format allows transmission of
the characters in internationalized labels, internationalized labels
SHOULD be transmitted using whatever character encoding and escape
mechanism that the protocol or document format uses at that place.

All protocols that use domain name slots already have the capacity for
handling domain names in the ASCII charset. Thus, ACE labels
(internationalized labels that have been processed with the ToASCII
operation) can inherently be handled by those protocols.

6.2 Applications and resolver libraries

Applications normally use functions in the operating system when they
resolve DNS queries. Those functions in the operating system are often
called "the resolver library", and the applications communicate with the
resolver libraries through a programming interface (API).

Because these resolver libraries today expect only domain names in
ASCII, applications MUST prepare labels that are passed to the resolver
library using the ToASCII operation. Labels received from the resolver
library contain only ASCII characters; internationalized labels that
cannot be represented directly in ASCII use the ACE form. ACE labels
always include the ACE prefix.

IDNA-aware applications MUST be able to work with both
non-internationalized labels (those that conform to [STD13]
and [STD3]) and internationalized labels.

It is expected that new versions of the resolver libraries in the future
will be able to accept domain names in formats other than ASCII, and
application developers might one day pass domain names not only in
Unicode, but also in local script, to a new API for the resolver
libraries in the operating system.

6.3 DNS servers

An operating system might have a set of libraries for performing the
ToASCII operation.
The input to such a library might be in one or more charsets that are
used in applications (UTF-8 and UTF-16 are likely candidates for almost
any operating system, and script-specific charsets are likely for
localized operating systems).

For internationalized labels that cannot be represented directly in
ASCII, DNS servers MUST use the ACE form produced by the ToASCII
operation. All IDNs served by DNS servers MUST contain only ASCII
characters.

If a signalling system which makes negotiation possible between old and
new DNS clients and servers is standardized in the future, the encoding
of the query in the DNS protocol itself can be changed from ACE to
something else, such as UTF-8. The question of whether this should be
used is, however, a separate problem and is not discussed in this memo.

6.4 Avoiding exposing users to the raw ACE encoding

All applications that might show the user a domain name obtained from a
domain name slot, such as from gethostbyaddr or part of a mail header,
SHOULD be updated as soon as possible in order to prevent users from
seeing the ACE.

If an application decodes an ACE name using ToUnicode but cannot show
all of the characters in the decoded name, such as if the name contains
characters that the output system cannot display, the application SHOULD
show the name in ACE format (which always includes the ACE prefix)
instead of displaying the name with the replacement character (U+FFFD).
This is to make it easier for the user to transfer the name correctly to
other programs. Programs that by default show the ACE form when they
cannot show all the characters in a name label SHOULD also have a
mechanism to show the name that is produced by the ToUnicode operation
with as many characters as possible and replacement characters in the
positions where characters cannot be displayed.
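
The display rule in the preceding paragraph might be sketched in Python
as follows. This is a non-normative sketch: the function name, the
can_display predicate (standing in for whatever the output system can
actually render), and the show_partial switch are hypothetical names
introduced for illustration, and the ACE label uses the placeholder
prefix from this draft.

```python
from typing import Callable


def choose_display_form(ace_label: str, unicode_label: str,
                        can_display: Callable[[str], bool],
                        show_partial: bool = False) -> str:
    """Pick the form of a label to show to the user.

    ace_label     -- the label as obtained from the domain name slot
    unicode_label -- the result of applying ToUnicode to ace_label
    can_display   -- hypothetical predicate: can this character be shown?
    show_partial  -- user explicitly asked for the decoded form anyway
    """
    if all(can_display(c) for c in unicode_label):
        return unicode_label                 # normal case: hide the ACE
    if show_partial:
        # Decoded name with U+FFFD where a character cannot be displayed.
        return "".join(c if can_display(c) else "\uFFFD"
                       for c in unicode_label)
    return ace_label                         # default fallback: show the ACE


# Example with an output system limited to ASCII.
ascii_only = lambda c: ord(c) < 0x80
shown = choose_display_form("IESG--bcher-kva", "bücher", ascii_only)
partial = choose_display_form("IESG--bcher-kva", "bücher", ascii_only,
                              show_partial=True)
```

Falling back to the ACE form by default keeps the name transferable to
other programs, while the optional partial form serves users who would
rather see as much of the decoded name as possible.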

The ToUnicode operation does not alter labels that are not valid ACE
labels, even if they begin with the ACE prefix. After ToUnicode has been
applied, if a label still begins with the ACE prefix, then it is not a
valid ACE label, and is not equivalent to any of the intermediate
Unicode strings constructed by ToUnicode.

6.5 Bidirectional text in domain names

The display of domain names that contain bidirectional text is not
covered in this document. It may be covered in a future version of this
document, or may be covered in a different document.

For developers interested in displaying domain names that have
bidirectional text, the Unicode standard has an extensive discussion of
how to reorder glyphs for display when dealing with bidirectional text
such as Arabic or Hebrew. See [UAX9] for more information. In
particular, all Unicode text is stored in logical order.

6.6 DNSSEC authentication of IDN domain names

DNS Security [DNSSEC] is a method for supplying cryptographic
verification information along with DNS messages. Public Key
Cryptography is used in conjunction with digital signatures to provide a
means for a requester of domain information to authenticate the source
of the data. This ensures that it can be traced back to a trusted
source, either directly, or via a chain of trust linking the source of
the information to the top of the DNS hierarchy.

IDNA specifies that all internationalized domain names served by DNS
servers that cannot be represented directly in ASCII must use the ACE
form produced by the ToASCII operation. This operation must be performed
prior to a zone being signed by the private key for that zone. Because
of this ordering, it is important to recognize that DNSSEC authenticates
the ASCII domain name, not the Unicode form or the mapping between the
Unicode form and the ASCII form.
In other words, the output of ToASCII is the canonical name. In the
presence of DNSSEC, this is the name that MUST be signed in the zone and
MUST be validated against. It also SHOULD be used for other name
comparisons, such as when a browser wants to indicate that a URL has
been previously visited.

One consequence of this for sites deploying IDNA in the presence of
DNSSEC is that any special purpose proxies or forwarders used to
transform user input into IDNs must be earlier in the resolution flow
than DNSSEC authenticating nameservers for DNSSEC to work.

6.7 Limitations of IDNA

The IDNA protocol does not solve all linguistic issues with users
inputting names in different scripts. Many important language-based and
script-based mappings are not covered in IDNA and must be handled
outside the protocol. For example, names that are entered in a mix of
traditional and simplified Chinese characters will not be mapped to a
single canonical name. Another example is that Scandinavian names
entered with U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS) will not be
mapped to U+00F8 (LATIN SMALL LETTER O WITH STROKE).

7. Name Server Considerations

Internationalized domain name data in zone files (as specified by
section 5 of RFC 1035) MUST be processed with ToASCII before it is
entered in the zone files.

It is imperative that there be only one ASCII encoding for a particular
domain name. ACE is an encoding for domain name labels that use
non-ASCII characters. Thus, a primary master name server MUST NOT
contain an ACE-encoded label that decodes to an ASCII label. The ToASCII
operation assures that no such names are ever output from the operation.

Name servers MUST NOT serve records with domain names that contain
non-ASCII characters; such names MUST be converted to ACE form by the
ToASCII operation in order to be served.
Passing names that have not been processed by ToASCII to an application
will result in unpredictable behavior. Note that [STRINGPREP] describes
how to handle versioning of unallocated codepoints.

8. Root Server Considerations

IDNs are likely to be somewhat longer than current host names, so the
bandwidth needed by the root servers should go up by a small amount.
Also, queries and responses for IDNs will probably be somewhat longer
than typical queries today, so more queries and responses may be forced
to go to TCP instead of UDP.

9. Security Considerations

Security on the Internet partly relies on the DNS. Thus, any
change to the characteristics of the DNS can change the security of much
of the Internet.

This memo describes an algorithm which encodes characters that are not
valid according to STD3 and STD13 into octet values that are valid. No
security issues such as string length increases or new allowed values
are introduced by the encoding process or the use of these encoded
values, apart from those introduced by the ACE encoding itself.

Domain names are used by users to connect to Internet servers. The
security of the Internet would be compromised if a user entering a
single internationalized name could be connected to different servers
based on different interpretations of the internationalized domain name.

Because this document normatively refers to [NAMEPREP], it includes the
security considerations from that document as well.

A. References

[PUNYCODE] Adam Costello, "Punycode", draft-ietf-idn-punycode.

[DNSSEC] Don Eastlake, "Domain Name System Security Extensions", RFC
2535, March 1999.

[NAMEPREP] Paul Hoffman and Marc Blanchet, "Preparation of
Internationalized Domain Names", draft-ietf-idn-nameprep.

[RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, March 1997.

[STD3] Bob Braden, "Requirements for Internet Hosts -- Communication
Layers" (RFC 1122) and "Requirements for Internet Hosts -- Application
and Support" (RFC 1123), STD 3, October 1989.

[STD13] Paul Mockapetris, "Domain names - concepts and facilities" (RFC
1034) and "Domain names - implementation and specification" (RFC 1035),
STD 13, November 1987.

[STRINGPREP] Paul Hoffman and Marc Blanchet, "Preparation of
Internationalized Strings ("stringprep")", draft-hoffman-stringprep,
work in progress.

[UAX9] Unicode Standard Annex #9, The Bidirectional Algorithm.

[UNICODE] The Unicode Standard, Version 3.1.0: The Unicode Consortium.
The Unicode Standard, Version 3.0. Reading, MA, Addison-Wesley
Developers Press, 2000. ISBN 0-201-61633-5, as amended by: Unicode
Standard Annex #27: Unicode 3.1.

B. Authors' Addresses

Patrik Faltstrom
Cisco Systems
Arstaangsvagen 31 J
S-117 43 Stockholm Sweden
paf@cisco.com

Paul Hoffman
Internet Mail Consortium and VPN Consortium
127 Segre Place
Santa Cruz, CA 95060 USA
phoffman@imc.org

Adam M. Costello
University of California, Berkeley
idna-spec.amc @ nicemice.net