IETF IDN Working Group                 Editors Zita Wenzel, James Seng
Internet Draft                      draft-ietf-idn-requirements-09.txt
21 November 2001                                   Expires 21 May 2002

            Requirements of Internationalized Domain Names

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC 2026 [8].

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or made obsolete by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Intended Scope

The intended scope of this document is to explore requirements for the
internationalization of domain names on the Internet. It is not
intended to document user requirements.
It is recommended that solutions not be assumed to reside within the
DNS itself; a solution could instead be a layer interposed between the
application and the DNS. Proposals SHOULD fulfill most, if not all, of
the requirements. This document MAY be updated based on actual trials.

Abstract

This document describes the requirements for encoding international
characters into DNS names and records. It is guidance for developing
protocols for internationalized domain names.

1. Introduction

At present, the encoding of Internet domain names is restricted to a
subset of 7-bit ASCII (ISO/IEC 646). HTML, XML, IMAP, FTP, and many
other text-based protocols on the Internet have already been at least
partially internationalized. It is important for domain names to be
similarly internationalized or for an equivalent solution to be found.
This document assumes that the most effective solution involves putting
non-ASCII names inside some parts of the overall DNS, although this
assumption may not be the consensus of the IETF community. However,
several sections of this document, including "Definitions and
Conventions", should be useful in any case. A reasonable familiarity
with DNS terminology is assumed.

This document is being discussed on the "idn" mailing list. To join the
list, send a message to with the words "subscribe idn" in the body of
the message. Archives of the mailing list can also be found at
ftp://ops.ietf.org/pub/lists/idn*.

1.1 Definitions and Conventions

A language is a way that humans interact. In computerized form, a text
in a written language can be expressed as a string of characters.
The same set of characters can often be used for many written languages,
and many written languages can be expressed using different scripts.
The same characters are often shown with somewhat different glyphs
(shapes) for display of a text depending on the font used, the
automatic shaping applied, or the automatic formation of ligatures. In
addition, the same characters can be shown with somewhat different
glyphs depending on the language being used, even within the same font
or through automatic font change.

Character: A character is a member of a set of elements used for
organization, control, or representation of textual data.

Graphic character: A graphic character is a character, other than a
control function, that has a visual representation normally
handwritten, printed, or displayed.

Characters mentioned in this document are identified by their position
in the Unicode character set. This character set is also known as the
UCS (ISO 10646) [19]. The notation U+12AB, for example, indicates the
character at position 12AB (hexadecimal) in the Unicode character set.
Note that the use of this notation is not an indication of a
requirement to use Unicode.

Examples quoted in this document are intended only to further explain
the meanings and principles adopted by the document. The protocol is
not required to satisfy the examples.

Unicode Technical Report #17 [24] defines a character encoding
model in several levels (much of the text below is quoted from
Unicode Technical Report #17).

[N.B. Sections 1-6 below to be unpacked and reworded to be
independent of the Unicode Technical Report #17.]

1. An abstract character repertoire (ACR) is defined as the set of
   abstract characters to be encoded, normally a familiar alphabet
   or symbol set. The word abstract just means that these objects
   are defined by convention (such as the 26 letters of the English
   alphabet, uppercase and lowercase forms).
   Examples: the ASCII repertoire, the Latin 9 repertoire, the
   JIS X 0208 repertoire, the UCS repertoire (of a particular version).

2. A coded character set (CCS) is defined to be a mapping from a
   set of abstract characters to the set of non-negative integers.
   This range of integers need not be contiguous. An abstract
   character is defined to be in a coded character set if the coded
   character set maps from it to an integer. That integer is said
   to be the code point for the abstract character. That abstract
   character is then an encoded character. Examples: ASCII, Latin-15,
   JIS X 0208, the UCS.

3. A character encoding form (CEF) is a mapping from the set of
   integers used in a CCS to the set of sequences of code units. A
   code unit is an integer occupying a specified binary width in a
   computer architecture, such as a septet, an octet, or a 16-bit
   unit. The encoding form enables character representation as actual
   data in a computer. The sequences of code units do not necessarily
   have the same length. Examples: ASCII, Latin-15, Shift-JIS, UTF-16,
   UTF-8.

4. A character encoding scheme (CES) is a mapping of code units into
   serialized octet sequences. Character encoding schemes are relevant
   to the issue of cross-platform persistent data involving code units
   wider than a byte, where byte-swapping may be required to put data
   into the byte polarity canonical for a particular platform.

   The CES may involve two or more CCS's, and may include code units
   (e.g., single shifts, SI/SO, or escape sequences) that are not part
   of the CCS per se, but which are defined by the character encoding
   architecture and which may require an external registry of
   particular values (as for the ISO 2022 escape sequences). In such a
   case, the CES is called a compound CES. (A CES that only involves a
   single CCS is called a simple CES.)
   Examples: ASCII, Latin-15, Shift-JIS, UTF-16BE, UTF-16LE, UTF-8.

5. The mapping from an abstract character repertoire (ACR) to a
   serialized sequence of octets is called a Character Map (CM). A
   simple character map thus implicitly includes a CCS, a CEF, and a
   CES, mapping from abstract characters to code units to octets. A
   compound character map includes a compound CES, and thus includes
   more than one CCS and CEF. In that case, the abstract character
   repertoire for the character map is the union of the repertoires
   covered by the coded character sets involved.

   A sequence of encoded characters must be unambiguously
   mapped onto a sequence of octets by the charset. The charset must be
   specified in all instances, as in Internet protocols, where textual
   content is treated as an ordered sequence of octets, and where the
   textual content must be reconstructible from that sequence of
   octets. Charset names are registered by the IANA according to
   procedures documented in RFC 2278 [12]. In many cases, the same
   name is used for both a character map and for a character encoding
   scheme, such as UTF-16BE. Typically this is done for simple
   character maps when such usage is clear from context.

6. A transfer encoding syntax (TES) is a reversible transform of
   encoded data which may (or may not) include textual data represented
   in one or more character encoding schemes. Examples: 8bit,
   Quoted-Printable, BASE64, UTF-7 (defunct), UTF-5, and RACE.

1.2 Description of the Domain Name System

The Domain Name System is defined by RFC 1034 [4] and RFC 1035 [5], with
clarifications, extensions and modifications given in RFC 1123 [6],
RFC 1996 [7], RFC 2181 [10], and others. Of special importance here are
the security extensions described in RFC 2535 [14] and related RFCs.
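
Before turning to DNS terminology, the encoding model levels of
Section 1.1 can be made concrete with a short sketch. The Python below
is illustrative only and is not part of any requirement. The function
name encoding_levels and the sample character U+00E9 are assumptions
made for this example. Note that Python's codecs combine the CEF and
CES steps, so the intermediate code units are not shown separately.

```python
import base64


def encoding_levels(ch):
    """Return one abstract character as seen at each level of the model."""
    code_point = ord(ch)  # CCS: abstract character -> non-negative integer
    return {
        "code_point": "U+%04X" % code_point,
        # CEF and CES combined: code point -> code units -> octets.
        "utf-8": ch.encode("utf-8"),
        "utf-16be": ch.encode("utf-16-be"),
        "utf-16le": ch.encode("utf-16-le"),
        # TES: a reversible transform applied to already-encoded octets.
        "base64(utf-8)": base64.b64encode(ch.encode("utf-8")),
    }


# U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is an arbitrary sample.
levels = encoding_levels("\u00e9")
for name, value in levels.items():
    print(name, value)
```

The two UTF-16 results carry the same 16-bit code unit and differ only
in octet order, which is the distinction the CES level exists to
capture.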

Over the years, many different words have been used to describe the
components of resource naming on the Internet (e.g., URI, URN). To make
certain that the terms used in this document are well-defined and
unambiguous, their definitions are given here.

Master server: A master server for a zone holds the main copy of that
zone. This copy is sometimes stored in a zone file. A slave server for
a zone holds a complete copy of the records for that zone. Slave
servers MAY be either authorized by the zone owner (secondary servers)
or unauthorized (sometimes called "stealth secondaries"). Master and
authorized slave servers are listed in the NS records for the zone,
and are termed "authoritative" servers. In many contexts outside this
document, the term "primary" is used interchangeably with "master" and
"secondary" is used interchangeably with "slave".

Caching server: A caching server holds temporary copies of DNS
records; it uses these records to answer queries about domain names.
Further explanation of these terms can be found in RFC 1034 [4] and
RFC 1996 [7].

DNS names can be represented in multiple forms, with different
properties for internationalization. The most important ones are:

- Domain name: The binary representation of a name used internally in
  the DNS protocol. This consists of a series of components of 1-63
  octets, with an overall length limited to 255 octets (including the
  length fields).

- Master file format domain name: This is a representation of the name
  as a sequence of characters in some character set; the common
  convention (derived from RFC 1035 [5] section 5.1) is to represent
  the octets of the name as ASCII characters where the octet is in the
  set corresponding to the ASCII values for [a-z,A-Z,0-9,-], using an
  escape mechanism (\x or \NNN) where not, and separating the
  components of the name by the dot character (".").
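
The two forms above can be sketched in a few lines of Python. This is
a minimal illustration, not a reference implementation: the function
names to_wire and to_master_file are hypothetical, the escape used is
the three-digit decimal form, and the root name is represented by a
terminating zero-length component.

```python
import string

MAX_LABEL = 63   # octets per component
MAX_NAME = 255   # octets overall, including the length fields

# Octets shown literally in master file form; all others are escaped.
LITERAL = set((string.ascii_letters + string.digits + "-").encode("ascii"))


def to_wire(labels):
    """Encode a list of components (bytes) as a binary domain name."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL:
            raise ValueError("component must be 1-63 octets")
        out.append(len(label))   # one length octet ...
        out += label             # ... followed by the component octets
    out.append(0)                # zero-length component ends the name
    if len(out) > MAX_NAME:
        raise ValueError("name exceeds 255 octets")
    return bytes(out)


def to_master_file(labels):
    """Render components as text, escaping other octets as \\NNN."""
    def render(label):
        return "".join(
            chr(o) if o in LITERAL else "\\%03d" % o for o in label
        )
    return ".".join(render(label) for label in labels)


print(to_wire([b"www", b"example", b"com"]))
print(to_master_file([b"a.b", b"\xc3\xa9"]))
```

The second call shows why the distinction matters for
internationalization: the binary form carries the octets C3 A9
unchanged, while the text form can only show them through escapes.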

The form specified for most protocols using the DNS is a limited form of
the master file format domain name. This limited form is defined in
RFC 1034 [4] Section 3.5 and RFC 1123 [6]. In most implementations of
applications today, domain names in the Internet have been limited to
the much more restricted forms used, e.g., in email, which defines its
own rules. Those names are limited to the upper- and lower-case
letters a-z (interpreted in a case-independent fashion), the digits,
and the hyphen-minus, all in ASCII.

1.3 Definition of "hostname" and "Internationalized Domain Name"

Hostname:

In the DNS protocols, a name is referred to as a sequence of octets.
However, when discussing requirements for internationalized domain
names, what we are looking for are ways to represent characters that
are meaningful for humans.

Internationalized Domain Name:

In this document, this representation is referred to as a
"hostname". While this term has been used for many different purposes
over the years, it is used here in the sense of a sequence of
characters (not octets) representing a domain name conforming to the
limited hostname syntax specified in RFC 952 [3]. This document
attempts to define the requirements for an "Internationalized Domain
Name" (IDN). An IDN is defined as a sequence of characters that can be
used in the context of functions where a hostname is used today, but
contains one or more characters that are outside the set of characters
specified as legal characters for host names in RFC 1123 [6].

1.4 A multilayer model of the DNS function

The DNS can be seen as a multilayer function:

- The bottom layer is where the packets are passed across the Internet
  in a DNS query and a DNS response. At this level, what matters is
  the format and meaning of bits and octets in a DNS packet.

- Above that is the "DNS service", created by an infrastructure of DNS
  servers and the NS records that point to those DNS servers, which
  are in turn pointed to by the root servers (listed in the "root
  cache file" on each DNS server, often called "named.cache"). It is
  at this level that the statement "the DNS has a single root"
  (RFC 2826 [17]) makes sense, but still, what is being transferred
  are octets, not characters.

- Interfacing to the user is a service layer, often called "the
  resolver library". It is often embedded in the operating system or
  system libraries of the client machines. It is at the top of this
  layer that the API calls commonly known as "gethostbyname" and
  "gethostbyaddr" reside. These calls are modified to support IPv6
  (RFC 2553 [15]). A conceptually similar layer exists in
  authoritative DNS servers, comprising the parts that generate
  "meaningful" strings in DNS files. Due to the popularity of the
  "master file" format, this layer often exists only in the
  administrative routines of the service maintainers.

- The users of this layer (resolver library) are the application
  programs that use the DNS, such as mailers, mail servers, Web
  clients, Web servers, Web caches, IRC clients, FTP clients,
  distributed file systems, distributed databases, and almost all
  other applications on TCP/IP.

Graphically, one can illustrate it like this:

+---------------+                            +---------------------+
| Application   |                            | (Base data)         |
+---------------+                            +---------------------+
        |  Application service interface                |
        |  For ex. GethostbyXXXX interface              | (no standard)
+---------------+                            +---------------------+
| Resolver      |                            | Auth DNS server     |
+---------------+                            +---------------------+
        |      <----- DNS service interface ----->      |
+------------------------------------------------------------------+
| DNS service                                                      |
|  +-----------------------+        +--------------------+         |
|  | Forwarding DNS server |        | Caching DNS server |         |
|  +-----------------------+        +--------------------+         |
|                                                                  |
|           +-------------------------+                            |
|           | Parent-zone DNS servers |                            |
|           +-------------------------+                            |
|                                                                  |
|           +-------------------------+                            |
|           | Root DNS servers        |                            |
|           +-------------------------+                            |
|                                                                  |
+------------------------------------------------------------------+

1.5 Service model of the DNS

The Domain Name Service is used for multiple purposes, each of which is
characterized by what it puts into the system (the query) and what it
expects as a result (the reply).

The most used ones in the current DNS are:

- Hostname-to-address service (A, AAAA, A6): Enter a hostname, and get
  back an IPv4 or IPv6 address.

- Hostname-to-mail server service (MX): As above, but the expected
  return value is a hostname and a priority for SMTP servers.

- Address-to-hostname service (PTR): Enter an IPv4 or IPv6 address (in
  in-addr.arpa. or ip6.arpa form respectively) and get back a hostname.

- Domain delegation service (NS): Enter a domain name and get back
  nameserver records (designated hosts which provide authoritative
  nameservice) for the domain.

New services are being defined, either as entirely new services (IPv6 to
hostname mapping using binary labels) or as embellishments to other
services (such as DNS Security (DNSSEC) [14], returning information
about whether a given DNS service is performed securely or not).
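
As an illustration of the address-to-hostname service listed above,
the following Python sketch builds the name that would be queried for
a PTR record. It only constructs the query name and sends nothing to
the DNS. The helper name ptr_query_name and the sample addresses
(taken from documentation ranges) are assumptions made for this
example; reverse_pointer is an attribute of the standard ipaddress
module.

```python
import ipaddress


def ptr_query_name(address):
    """Return the domain name to query for the PTR record of an address."""
    # IPv4 octets, or IPv6 nibbles, are reversed and placed under
    # in-addr.arpa or ip6.arpa respectively.
    return ipaddress.ip_address(address).reverse_pointer


print(ptr_query_name("192.0.2.1"))
print(ptr_query_name("2001:db8::1"))
```

Note that the query name is itself an ordinary ASCII domain name; only
the hostname returned in the reply is affected by the introduction of
IDN.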

These services exist, conceptually, at the Application/Resolver
interface, NOT at the DNS-service interface. This document attempts to
set requirements for an equivalent of the "used services" given above,
where "hostname" is replaced by "Internationalized Domain Name". This
does not alter the expectation that IDN should work with any kind of
DNS query. IDN is a new service. Since existing protocols like SMTP or
HTTP use the old service, it is a matter of great concern how the new
and old services work together, and how other protocols can take
advantage of the new service.

2. General Requirements

These requirements address two concerns: the service offered to the
users (the application service), and the protocol extensions, if needed,
added to support this service.

In the requirements, we attempt to use the term "service" whenever a
requirement concerns the service, and "protocol" whenever a requirement
is believed to constrain the possible implementation.

2.1 Compatibility and Interoperability

[1] The DNS is essential to the entire Internet. Therefore, the service
MUST NOT damage present DNS protocol interoperability. It MUST make the
minimum number of changes to existing protocols on all layers of the
stack. It MUST continue to allow any system anywhere that implements
the IDN specification to resolve any internationalized domain name.

[2] The service MUST preserve the basic concept and facilities of domain
names as described in RFC 1034 [4]. It MUST maintain a single, global,
universal, and consistent hierarchical namespace.

[3] The DNS protocol (the packet formats that go on the wire) MUST
NOT limit the codepoints that can be used. A service defined on top of
the DNS, for instance the IDN-to-address function, MAY limit the
codepoints that can be used. The service descriptions MUST describe
what limitations are imposed.

[4] The protocol MUST work for all features of DNS, IPv4, and
IPv6. The protocol MUST NOT allow an IDN to be returned to a requestor
that requests the IP-to-(old)-domain-name mapping service.

[5] The same name resolution request MUST generate the same response,
regardless of the location or localization settings in the resolver, in
the master server, and in any slave servers involved in the resolution
process.

[6] The protocol MUST NOT require that the current DNS cache
servers be modified to support IDN. If a cache server can have
additional functionality to support IDN better, this additional
functionality MUST NOT cause problems for resolving correctly
functioning current domain names.

[7] A caching server MUST NOT return data in response to a query that
would not have been returned if the same query had been presented to an
authoritative server. This applies fully for the cases when:

- The caching server does not know about IDN
- The caching server implements the whole specification
- The caching server implements a valid subset of the specification

[8] The service MAY modify the DNS protocol (RFC 1035 [5]) and other
related work undertaken by the DNS Extensions (DNSEXT) [2] working
group. However, these changes SHOULD be as small as possible and any
changes SHOULD be coordinated with the DNSEXT working group.

[9] The protocol supporting the service SHOULD be as simple as possible
from the user's perspective. Ideally, users SHOULD NOT realize that IDN
was added on to the existing DNS.

[10] The best solution is one that maintains maximum feasible
compatibility with current DNS standards as long as it meets the other
requirements in this document.

[11] The protocol should handle with care new revisions of the CCS.
Undefined codepoints should not be allowed unless a new revision of
the protocol can handle them.
Protocol revisions should be tagged.

2.2 Internationalization

[12] Internationalized characters MUST be allowed to be represented and
used in DNS names and records. The protocol MUST specify what charset is
used when resolving domain names and how characters are encoded in DNS
records.

[13] Codepoints SHOULD be from the Universal Set as defined in
ISO-10646 or Unicode. The specifics of versions MUST be defined in the
proposed solution. If multiple charsets are allowed, each charset MUST
be tagged and conform to RFC 2277 [11].

[14] The protocol MUST NOT reject any non-IDN characters (to be
defined) in any DNS queries or responses.

[15] The protocol SHOULD NOT invent a new CCS for the purpose of IDN
only and SHOULD use an existing CES. The charset(s) chosen SHOULD also
be non-ambiguous.

[16] The protocol SHOULD NOT make any assumptions about the location
in a domain name where internationalization might appear. In other
words, it SHOULD NOT differentiate between any part of a domain name
because this MAY impose restrictions on future internationalization
efforts. For example, the Top-Level Domains (TLDs) can be
internationalized.

[17] The protocol also SHOULD NOT make any localized restrictions in the
protocol. For example, an IDN implementation which only allows domain
names to use a single local script would immediately restrict
multinational organizations.

[18] While there is a wide range of devices that use the DNS and a wide
range of characteristics of international scripts and methods of
domain name input and display, IDN is only concerned with the
protocol. Therefore, there MUST be a single way of encoding an
internationalized domain name within the DNS.

2.3 Canonicalization

Matching is a complicated process for IDN. Canonicalization
of characters MUST follow precise and predictable rules to ensure
consistency. "Requirements for String Identity Matching and String
Indexing" [1] is RECOMMENDED as a guide on canonicalization.

The DNS has to match a host name in a request with a host name held
in one or more zones. It also needs to sort names into order. It is
expected that some sort of canonicalization algorithm will be used as
the first step of this process. This section discusses some of the
properties which will be REQUIRED of that algorithm.

[19] To achieve interoperability, canonicalization MUST be done at a
single well-defined place in the DNS resolution process. The protocol
MUST specify canonicalization; it MUST specify exactly where in the
DNS that canonicalization happens and does not happen; it MUST specify
how additions to ISO 10646 will affect the stability of the DNS and
the amount of work done on the root DNS servers.

[20] The canonicalization algorithm MAY specify operations for case,
ligature, and punctuation folding.

[21] In order to retain backward compatibility with the current DNS,
the service MUST retain the case-insensitive comparison for US-ASCII
as specified in RFC 1035 [5]. For example, Latin capital letter A
(U+0041) MUST match Latin small letter a (U+0061). Unicode Technical
Report #21 [25] describes some of the issues with case
mapping. Case-insensitivity for non-US-ASCII MUST be discussed in the
protocol proposal.

[22] Case folding MUST be locale independent. If it were
locale-dependent, then different clients would get different results.
For example, Latin capital letter I (U+0049) case folded to lower case
in the Turkish context will become Latin small letter dotless i
(U+0131). But in the English context, it will become Latin small
letter i (U+0069).

[23] If other canonicalization is done, it MUST be done before the
domain name is resolved.
Further, the canonicalization MUST be easily
upgradable as new languages and writing systems are added.

[24] Any conversion (case, ligature folding, punctuation folding, etc.)
from what the user enters into a client to what the client asks for
resolution MUST be done identically on any request from any client.

[25] If the charset can be normalized, then it SHOULD be normalized
before it is used in IDN. Normalization SHOULD follow Unicode
Technical Report #15 [23].

[26] The protocol SHOULD avoid inventing a new normalization form
provided a technically sufficient one is available.

2.4 Operational Issues

[27] Zone files SHOULD remain easily editable.

[28] An IDN-capable resolver or server SHALL NOT generate more traffic
than a non-IDN-capable resolver or server would when resolving an
ASCII-only domain name. The amount of traffic generated when resolving
an IDN SHALL be similar to that generated when resolving an ASCII-only
name.

[29] The service SHOULD NOT add new centralized administration for the
DNS. A domain administrator SHOULD be able to create internationalized
names as easily as adding current domain names.

[30] The protocol MUST work with DNSSEC. The protocol MAY break
language sort order.

3. Security Considerations

Any solution that meets the requirements in this document MUST NOT be
less secure than the current DNS. Specifically, the mapping of
internationalized host names to and from IP addresses MUST have the
same characteristics as the mapping of today's host names.

Specifying requirements for internationalized domain names does not
itself raise any new security issues. However, any change to the DNS MAY
affect the security of any protocol that relies on the DNS or on
DNS names. A thorough evaluation of those protocols for security
concerns will be needed when they are developed.
In particular, IDNs
MUST be compatible with DNSSEC and, if multiple charsets or
representation forms are permitted, the implications for name spoofing
MUST be thoroughly understood.

4. References

[1] World Wide Web Consortium, "Requirements for String Identity
Matching and String Indexing", http://www.w3.org/TR/WD-charreq, July
1998.

[2] Olafur Gudmundsson, Randy Bush, "IETF DNS Extensions Working Group"
(DNSEXT), namedroppers@ops.ietf.org.

[3] K. Harrenstien, M.K. Stahl, E.J. Feinler, "DoD Internet Host Table
Specification", RFC 952, October 1985.

[4] P. Mockapetris, "Domain Names - Concepts and Facilities",
RFC 1034, November 1987.

[5] P. Mockapetris, "Domain Names - Implementation and
Specification", RFC 1035, November 1987.

[6] R. Braden, "Requirements for Internet Hosts -- Application and
Support", RFC 1123, October 1989.

[7] P. Vixie, "A Mechanism for Prompt Notification of Zone Changes
(DNS NOTIFY)", RFC 1996, August 1996.

[8] S. Bradner, "The Internet Standards Process -- Revision 3", RFC
2026, October 1996.

[9] S. Bradner, "Key words for use in RFCs to Indicate Requirement
Levels", RFC 2119, March 1997.

[10] R. Elz, R. Bush, "Clarifications to the DNS Specification",
RFC 2181, July 1997.

[11] H. Alvestrand, "IETF Policy on Character Sets and Languages", RFC
2277, January 1998.

[12] N. Freed and J. Postel, "IANA Charset Registration Procedures",
RFC 2278, January 1998.

[13] F. Yergeau, "UTF-8, a transformation format of ISO 10646", RFC
2279, January 1998.

[14] D. Eastlake, "Domain Name System Security Extensions", RFC 2535,
March 1999.

[15] R. Gilligan et al, "Basic Socket Interface Extensions for IPv6",
RFC 2553, March 1999.

[16] L. Daigle et al, "A Tangled Web: Issues of I18N, Domain Names,
and the Other Internet protocols", RFC 2825, May 2000.

[17] Internet Architecture Board, "IAB Technical Comment on the Unique
DNS Root", RFC 2826, May 2000.

[18] P. Hoffman, "Comparison of Internationalized Domain Name
Proposals", draft-ietf-idn-compare-00.txt, June 2000.

[19] ISO/IEC 10646-1:2000 (note that an amendment 1 is in
preparation), ISO/IEC 10646-2 (in preparation), plus corrigenda and
amendments to these standards.

[20] The Unicode Consortium, "The Unicode Standard". Described at
http://www.unicode.org/unicode/standard/versions/.

[21] The Unicode Consortium, "The Unicode Standard -- Version 3.0",
ISBN 0-201-61633-5. Same repertoire as ISO/IEC 10646-1:2000. Described
at http://www.unicode.org/unicode/standard/versions/Unicode3.0.html.

[22] Coded Character Set -- 7-bit American Standard Code for
Information Interchange, ANSI X3.4-1986; also: ISO/IEC 646 (IRV).

[23] M. Davis and M. Duerst, Unicode Consortium, "Unicode
Normalization Forms", Unicode Standard Annex #15,
http://www.unicode.org/unicode/reports/tr15/, 2000-08-31.

[24] K. Whistler and M. Davis, Unicode Consortium, "Character Encoding
Model", Unicode Technical Report #17,
http://www.unicode.org/unicode/reports/tr17/, 2000-08-31.

[25] M. Davis, Unicode Consortium, "Case Mappings", Unicode Technical
Report #21, http://www.unicode.org/unicode/reports/tr21/, 2000-09-12.

5. Editors' Contact

Zita Wenzel, Ph.D.
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA
90292 USA
Tel: +1 310 448 8462
Fax: +1 310 823 6714
zita@isi.edu

James Seng
i-DNS.net International Pte Ltd.
8 Temasek Boulevard
#24-02 Suntec Tower 3
Singapore 038988
Tel: +65 248 6208
Fax: +65 248 6198
Email: jseng@pobox.org.sg

6. Acknowledgements

The editors gratefully acknowledge the contributions of:

Harald Tveit Alvestrand
Mark Andrews
RJ Atkinson
Alan Barret
Marc Blanchet
Randy Bush
Andrew Draper
Martin Duerst
Patrik Faltstrom
Ned Freed
Olafur Gudmundsson
Paul Hoffman
Simon Josefsson
Kent Karlsson
John Klensin
Tan Juay Kwang
Dongman Lee
Bill Manning
Dan Oscarsson
J. William Semich
Yoshiro Yoneda