idnits 2.17.1
draft-ietf-idn-idna-07.txt:
Checking boilerplate required by RFC 5378 and the IETF Trust (see
https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------
** Looks like you're using RFC 2026 boilerplate. This must be updated to
follow RFC 3978/3979, as updated by RFC 4748.
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------
** Missing expiration date. The document expiration date should appear on
the first and last page.
== No 'Intended status' indicated for this document; assuming Proposed
Standard
== The page length should not exceed 58 lines per page, but there was 1
longer page, the longest (page 1) being 613 lines
Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------
** The document seems to lack an IANA Considerations section. (See Section
2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
when there are no actions for IANA.)
** There are 10 instances of too long lines in the document, the longest
one being 4 characters in excess of 72.
** The document seems to lack a both a reference to RFC 2119 and the
recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
keywords.
RFC 2119 keyword, line 70: '...Punycode [PUNYCODE]. Implementations of IDNA MUST fully implement...'
RFC 2119 keyword, line 163: '... 2), every label MUST contain only ASC...'
RFC 2119 keyword, line 168: '...omain name slots SHOULD be hidden from...'
RFC 2119 keyword, line 176: '...e compared, they MUST be considered to...'
RFC 2119 keyword, line 198: '...sequence MUST NOT be used as a label i...'
(16 more instances...)
Miscellaneous warnings:
----------------------------------------------------------------------------
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may
have content which was first submitted before 10 November 2008. If you
have contacted all the original authors and they are all willing to grant
the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
this comment. If not, you may need to add the pre-RFC5378 disclaimer.
(See the Legal Provisions document at
https://trustee.ietf.org/license-info for more information.)
-- Couldn't find a document date in the document -- date freshness check
skipped.
Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------
(See RFCs 3967 and 4897 for information about using normative references
to lower-maturity documents in RFCs)
-- Missing reference section? 'NAMEPREP' on line 557 looks like a reference
-- Missing reference section? 'STRINGPREP' on line 571 looks like a
reference
-- Missing reference section? 'PUNYCODE' on line 552 looks like a reference
-- Missing reference section? 'RFC2119' on line 560 looks like a reference
-- Missing reference section? 'UNICODE' on line 578 looks like a reference
-- Missing reference section? 'STD13' on line 567 looks like a reference
-- Missing reference section? 'STD3' on line 563 looks like a reference
-- Missing reference section? 'UAX9' on line 575 looks like a reference
-- Missing reference section? 'DNSSEC' on line 554 looks like a reference
Summary: 5 errors (**), 0 flaws (~~), 2 warnings (==), 11 comments (--).
Run idnits with the --verbose option for more detailed information about
the items above.
--------------------------------------------------------------------------------
1 Internet Draft Patrik Faltstrom
2 draft-ietf-idn-idna-07.txt Cisco
3 February 24, 2002 Paul Hoffman
4 Expires in six months IMC & VPNC
5 Adam M. Costello
6 UC Berkeley
8 Internationalizing Domain Names in Applications (IDNA)
10 Status of this Memo
12 This document is an Internet-Draft and is in full conformance with all
13 provisions of Section 10 of RFC2026.
15 Internet-Drafts are working documents of the Internet Engineering Task
16 Force (IETF), its areas, and its working groups. Note that other groups
17 may also distribute working documents as Internet-Drafts.
19 Internet-Drafts are draft documents valid for a maximum of six months
20 and may be updated, replaced, or obsoleted by other documents at any
21 time. It is inappropriate to use Internet-Drafts as reference material
22 or to cite them other than as "work in progress."
24 The list of current Internet-Drafts can be accessed at
25 http://www.ietf.org/ietf/1id-abstracts.txt
27 The list of Internet-Draft Shadow Directories can be accessed at
28 http://www.ietf.org/shadow.html.
30 Abstract
32 Until now, there has been no standard method for domain names to use
33 characters outside the ASCII repertoire. This document defines
34 internationalized domain names (IDNs) and a mechanism called IDNA for
35 handling them in a standard fashion. IDNs use characters drawn from a
36 large repertoire (Unicode), but IDNA allows the non-ASCII characters to
37 be represented using the same octets used in so-called host names
38 today. IDNA is only meant for processing domain names, not free
39 text.
41 1. Introduction
43 IDNA works by allowing applications to use certain ASCII name labels
44 (beginning with a special prefix) to represent non-ASCII name labels.
45 Lower-layer protocols need not be aware of this; therefore IDNA does not
46 require changes to any infrastructure. In particular, IDNA does not
47 require any changes to DNS servers, resolvers, or protocol elements,
48 because the ASCII name service provided by the existing DNS is entirely
49 sufficient.
51 This document does not require any applications to conform to IDNA,
52 but applications can elect to use IDNA in order to support IDN while
53 maintaining interoperability with existing infrastructure. Adding IDNA
54 support to an existing application entails changes to the application
55 only, and leaves room for flexibility in the user interface.
57 A great deal of the discussion of IDN solutions has focused on
58 transition issues and how IDN will work in a world where not all of the
59 components have been updated. Other proposals would require that user
60 applications, resolvers, and DNS servers be updated in order for a user
61 to use an internationalized domain name. Rather than require widespread
62 updating of all components, IDNA requires only user applications to be
63 updated; no changes are needed to the DNS protocol or any DNS servers or
64 the resolvers on users' computers.
66 1.1 Interaction of protocol parts
68 IDNA requires that implementations process input strings with Nameprep
69 [NAMEPREP], which is a profile of Stringprep [STRINGPREP], and then with
70 Punycode [PUNYCODE]. Implementations of IDNA MUST fully implement
71 Nameprep and Punycode; neither Nameprep nor Punycode is optional.
73 2. Terminology
75 The key words "MUST", "SHALL", "REQUIRED", "SHOULD", "RECOMMENDED", and
76 "MAY" in this document are to be interpreted as described in RFC 2119
77 [RFC2119].
79 A code point is an integral value associated with a character in a coded
80 character set.
82 Unicode [UNICODE] is a coded character set containing tens of thousands
83 of characters. A single Unicode code point is denoted by "U+" followed
84 by four to six hexadecimal digits, while a range of Unicode code points
85 is denoted by two hexadecimal numbers separated by "..", with no
86 prefixes.
88 ASCII means US-ASCII, a coded character set containing 128 characters
89 associated with code points in the range 0..7F. Unicode is an extension
90 of ASCII: it includes all the ASCII characters and associates them with
91 the same code points.
93 The term "LDH code points" is defined in this document to mean the code
94 points associated with ASCII letters, digits, and the hyphen-minus; that
95 is, U+002D, 30..39, 41..5A, and 61..7A. "LDH" is an abbreviation for
96 "letters, digits, hyphen".
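The LDH definition above can be expressed as a small membership test. This is only an illustrative sketch in Python; the names LDH_RANGES, is_ldh, and is_ldh_label are hypothetical and do not appear in this document.

```python
# Inclusive code point ranges from the definition above:
# hyphen-minus, digits, uppercase letters, lowercase letters.
LDH_RANGES = ((0x2D, 0x2D), (0x30, 0x39), (0x41, 0x5A), (0x61, 0x7A))


def is_ldh(code_point: int) -> bool:
    """Return True if code_point is one of the LDH code points."""
    return any(low <= code_point <= high for low, high in LDH_RANGES)


def is_ldh_label(label: str) -> bool:
    """Return True if every code point in label is an LDH code point."""
    return all(is_ldh(ord(ch)) for ch in label)
```

For example, is_ldh_label("www-1") holds, while is_ldh_label("a_b") does not, because U+005F falls outside all four ranges.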
98 [STD13] talks about "domain names" and "host names", but many people use
99 the terms interchangeably. Further, because [STD13] was not terribly
100 clear, many people who are sure they know the exact definitions of each
101 of these terms disagree on the definitions.
103 A label is an individual part of a domain name. Labels are usually shown
104 separated by dots; for example, the domain name "www.example.com" is
105 composed of three labels: "www", "example", and "com". (The zero-length
106 root label that is implied in domain names, as described in [STD13], is
107 not considered a label in this specification.) Throughout this document
108 the term "label" is shorthand for "text label", and "every label" means
109 "every text label". In IDNA, not all text strings can be labels.
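As a rough illustration of the label terminology above, the following Python sketch splits a dotted domain name into its text labels. The function name split_labels is hypothetical, and the handling of a trailing dot is an assumption about how the implied root label would be written in practice.

```python
def split_labels(domain_name: str) -> list[str]:
    """Split a dotted domain name into its text labels.

    A single trailing dot stands for the zero-length root label, which is
    not counted as a label in this specification, so it is dropped.
    """
    if domain_name.endswith("."):
        domain_name = domain_name[:-1]
    return domain_name.split(".") if domain_name else []
```

With this sketch, both "www.example.com" and "www.example.com." yield the three labels "www", "example", and "com".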
111 An "internationalized domain name" (IDN) is a domain name for which the
112 ToASCII operation (see section 4) can be applied to each label without
113 failing. This document does not attempt to define an "internationalized
114 host name". It is expected that protocols and name-handling bodies will
115 want to limit the characters allowed in IDNs further than what is
116 specified in this document, such as to prohibit additional characters
117 that they feel are unneeded or harmful in registered domain names.
119 An "internationalized label" is a label composed of characters from the
120 Unicode character set; note, however, that not every string of Unicode
121 characters can be an internationalized label. To allow internationalized
122 labels to be handled by existing applications, IDNA uses an "ACE label"
123 (ACE stands for ASCII Compatible Encoding), which can be represented
124 using only ASCII characters but is equivalent to a label containing
125 non-ASCII characters. More rigorously, an ACE label is defined to be any
126 label that the ToUnicode operation would alter (see section 4.2). For
127 every internationalized label that cannot be directly represented in
128 ASCII, there is an equivalent ACE label. The conversion of labels to and
129 from the ACE form is specified in section 4.
131 The "ACE prefix" is defined in this document to be a string of ASCII
132 characters that appears at the beginning of every ACE label. It is
133 specified in section 5.
135 A "domain name slot" is defined in this document to be a protocol element
136 or a function argument or a return value (and so on) explicitly
137 designated for carrying a domain name. Examples of domain name slots
138 include: the QNAME field of a DNS query; the name argument of the
139 gethostbyname() library function; the part of an email address following
140 the at-sign (@) in the From: field of an email message header; and the host
141 portion of the URI in the src attribute of an HTML tag.
142 General text that just happens to contain a domain name is not a domain name
143 slot; for example, a domain name appearing in the plain text body of an
144 email message is not occupying a domain name slot.
146 An "internationalized domain name slot" is defined in this document to
147 be a domain name slot explicitly designated for carrying an
148 internationalized domain name as defined in this document. The
149 designation may be static (for example, in the specification of the
150 protocol or interface) or dynamic (for example, as a result of
151 negotiation in an interactive session).
153 A "generic domain name slot" is defined in this document to be any
154 domain name slot that is not an internationalized domain name slot.
155 Obviously, this includes any domain name slot whose specification
156 predates IDNA.
158 3. Requirements
160 IDNA conformance means adherence to the following three requirements:
162 1) Whenever a domain name is put into a generic domain name slot (see
163 section 2), every label MUST contain only ASCII characters. Given an
164 internationalized domain name (IDN), an equivalent domain name
165 satisfying this requirement can be obtained by applying the ToASCII
166 operation (see section 4) to each label.
168 2) ACE labels obtained from domain name slots SHOULD be hidden from
169 users except when the use of the non-ASCII form would cause problems or
170 when the ACE form is explicitly requested. Given an internationalized
171 domain name, an equivalent domain name containing no ACE labels can be
172 obtained by applying the ToUnicode operation (see section 4) to each
173 label. When requirements 1 and 2 both apply, requirement 1 takes
174 precedence.
176 3) Whenever two labels are compared, they MUST be considered to
177 match if and only if their ASCII forms (obtained by applying ToASCII)
178 match using a case-insensitive ASCII comparison.
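Requirements 1 and 3 can be sketched with an existing ToASCII implementation. The example below assumes Python's standard encodings.idna module, which implements the published form of these operations and therefore uses the ACE prefix "xn--" rather than the placeholder in section 5. The helper names domain_to_ascii and labels_match are hypothetical. The outcome of the comparison does not depend on which prefix is used.

```python
from encodings.idna import ToASCII  # standard library; returns bytes, prefix "xn--"


def domain_to_ascii(domain_name: str) -> str:
    """Requirement 1: apply ToASCII to each label of a dotted domain name."""
    return ".".join(
        ToASCII(label).decode("ascii") for label in domain_name.split(".")
    )


def labels_match(first: str, second: str) -> bool:
    """Requirement 3: compare the ASCII forms case-insensitively."""
    return ToASCII(first).lower() == ToASCII(second).lower()
```

Under these assumptions, labels_match("EXAMPLE", "example") is true, and a label written with uppercase non-ASCII letters matches its lowercase form because Nameprep folds case before encoding.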
180 4. Conversion operations
182 This section specifies the ToASCII and ToUnicode operations. Each one
183 operates on a sequence of Unicode code points (but remember that all
184 ASCII code points are also Unicode code points). When domain names are
185 represented using character sets other than Unicode and ASCII, they will
186 need to first be transcoded to Unicode before these operations can be
187 applied, and might need to be transcoded back afterwards.
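The transcoding step mentioned above might look like the following Python sketch, in which a label arrives in a local charset and is converted to a sequence of Unicode code points before any IDNA operation is applied. The function names and the choice of ISO-8859-1 in the usage note are illustrative assumptions only.

```python
def label_from_local_charset(raw: bytes, charset: str) -> str:
    """Transcode a label from a local charset to Unicode code points.

    Raises UnicodeDecodeError if raw is not valid in the given charset.
    """
    return raw.decode(charset)


def label_to_local_charset(label: str, charset: str) -> bytes:
    """Transcode a Unicode label back to a local charset.

    Raises UnicodeEncodeError if the charset cannot represent the label.
    """
    return label.encode(charset)
```

For instance, the ISO-8859-1 octets b"b\xfccher" transcode to a six-code-point label whose second code point is U+00FC.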
189 4.1 ToASCII
191 The ToASCII operation takes a sequence of Unicode code points and
192 transforms it into a sequence of code points in the ASCII range (0..7F).
193 The original sequence and the resulting sequence are equivalent labels.
194 (If the original is an internationalized label that cannot be directly
195 represented in ASCII, the result will be the equivalent ACE label.)
197 ToASCII fails if any step of it fails. If any step fails, the original
198 sequence MUST NOT be used as a label in an IDN.
200 The inputs to ToASCII are a sequence of code points; a flag indicating
201 whether to prohibit unassigned code points (see [STRINGPREP]); and a
202 flag indicating whether to apply the host name syntax rules. The output
203 of ToASCII is either a sequence of ASCII code points or a failure
204 condition.
206 ToASCII never alters a sequence of code points that are all in the ASCII
207 range to begin with (although it could fail).
209 ToASCII consists of the following steps:
211 1. If all code points in the sequence are in the ASCII range (0..7F)
212 then skip to step 3.
214 2. Perform the steps specified in [NAMEPREP] and fail if there is
215 an error.
217 3. If the label is part of a host name (or is subject to the host
218 name syntax rules) then perform these checks:
220 (a) Verify the absence of non-LDH ASCII code points; that is,
221 the absence of 0..2C, 2E..2F, 3A..40, 5B..60, and 7B..7F.
223 (b) Verify the absence of leading and trailing hyphen-minus;
224 that is, the absence of U+002D at the beginning and end of
225 the sequence.
227 4. If all code points in the sequence are in the ASCII range (0..7F),
228 then skip to step 8.
230 5. Verify that the sequence does NOT begin with the ACE prefix.
232 6. Encode the sequence using the encoding algorithm in [PUNYCODE].
234 7. Prepend the ACE prefix.
236 8. Verify that the number of code points is in the range 1 to 63
237 inclusive.
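The eight steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the nameprep function from Python's standard encodings.idna module as the Nameprep step and Python's built-in "punycode" codec as the Punycode encoder. It uses the placeholder ACE prefix from section 5 and omits the flag for prohibiting unassigned code points. The names to_ascii, host_name_rules, and ace_prefix are hypothetical.

```python
from encodings.idna import nameprep  # CPython's Nameprep implementation

ACE_PREFIX = "IESG--"  # placeholder from section 5, to be replaced at publication
LDH = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-")


def is_ascii(seq: str) -> bool:
    return all(ord(ch) <= 0x7F for ch in seq)


def to_ascii(label: str, host_name_rules: bool = True,
             ace_prefix: str = ACE_PREFIX) -> str:
    """Sketch of ToASCII; raises ValueError if any step fails."""
    seq = label
    # Steps 1-2: run Nameprep only if a non-ASCII code point is present.
    if not is_ascii(seq):
        seq = nameprep(seq)  # raises UnicodeError, a subclass of ValueError
    # Step 3: host name syntax checks.
    if host_name_rules:
        if any(ord(ch) <= 0x7F and ch not in LDH for ch in seq):
            raise ValueError("non-LDH ASCII code point in label")
        if seq.startswith("-") or seq.endswith("-"):
            raise ValueError("leading or trailing hyphen-minus")
    # Step 4: an all-ASCII sequence skips the encoding steps.
    if not is_ascii(seq):
        # Step 5: the sequence must not already begin with the ACE prefix.
        if seq[:len(ace_prefix)].lower() == ace_prefix.lower():
            raise ValueError("label begins with the ACE prefix")
        # Steps 6-7: Punycode-encode, then prepend the ACE prefix.
        seq = ace_prefix + seq.encode("punycode").decode("ascii")
    # Step 8: the result must contain 1 to 63 code points.
    if not 1 <= len(seq) <= 63:
        raise ValueError("label length outside the range 1..63")
    return seq
```

In this sketch an all-ASCII label such as "example" is returned unchanged, while a label containing U+00FC is passed through Nameprep and Punycode and then prefixed.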
239 4.2 ToUnicode
241 The ToUnicode operation takes a sequence of Unicode code points and
242 returns a sequence of Unicode code points. If the input sequence is a
243 label in ACE form, then the result is an equivalent internationalized
244 label that is not in ACE form, otherwise the original sequence is
245 returned unaltered.
247 ToUnicode never fails. If any step fails, then the original input
248 sequence is returned immediately in that step.
250 The inputs to ToUnicode are a sequence of code points; a flag indicating
251 whether to prohibit unassigned code points (see [STRINGPREP]); and a
252 flag indicating whether to apply the host name syntax rules. The output
253 of ToUnicode is always a sequence of Unicode code points.
255 1. If all code points in the sequence are in the ASCII range (0..7F)
256 then skip to step 3.
258 2. Perform the steps specified in [NAMEPREP] and fail if there is an
259 error. (If step 3 of ToASCII is also performed here, it will not
260 affect the overall behavior of ToUnicode, but it is not
261 necessary.)
263 3. Verify that the sequence begins with the ACE prefix, and save a
264 copy of the sequence.
266 4. Remove the ACE prefix.
268 5. Decode the sequence using the decoding algorithm in [PUNYCODE]. Save
269 a copy of the result of this step.
271 6. Apply ToASCII.
273 7. Verify that the sequence matches the saved copy from step 3, using
274 a case-insensitive ASCII comparison.
276 8. Return the saved copy from step 5.
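The ToUnicode steps above can be sketched in Python as follows. The sketch assumes the nameprep function from Python's standard encodings.idna module and Python's built-in "punycode" codec. It uses the placeholder ACE prefix from section 5 and includes a compact ToASCII helper (without the host name checks, which is an assumption about the flags used in step 6) so that the block is self-contained. All function names are hypothetical.

```python
from encodings.idna import nameprep  # CPython's Nameprep implementation

ACE_PREFIX = "IESG--"  # placeholder from section 5, to be replaced at publication


def _is_ascii(seq: str) -> bool:
    return all(ord(ch) <= 0x7F for ch in seq)


def _has_prefix(seq: str, prefix: str) -> bool:
    return seq[:len(prefix)].lower() == prefix.lower()


def _to_ascii(label: str, prefix: str) -> str:
    """Compact ToASCII (section 4.1) without the host name checks."""
    seq = label if _is_ascii(label) else nameprep(label)
    if not _is_ascii(seq):
        if _has_prefix(seq, prefix):
            raise ValueError("label begins with the ACE prefix")
        seq = prefix + seq.encode("punycode").decode("ascii")
    if not 1 <= len(seq) <= 63:
        raise ValueError("label length outside the range 1..63")
    return seq


def to_unicode(label: str, prefix: str = ACE_PREFIX) -> str:
    """Sketch of ToUnicode; returns the input unaltered if any step fails."""
    try:
        # Steps 1-2: run Nameprep only if a non-ASCII code point is present.
        seq = label if _is_ascii(label) else nameprep(label)
        # Step 3: verify the ACE prefix and save a copy.
        if not _has_prefix(seq, prefix):
            return label
        saved_ace = seq
        # Steps 4-5: remove the prefix and Punycode-decode the remainder.
        decoded = seq[len(prefix):].encode("ascii").decode("punycode")
        # Steps 6-7: re-encode and compare case-insensitively with the copy.
        if _to_ascii(decoded, prefix).lower() != saved_ace.lower():
            return label
        # Step 8: return the decoded sequence saved in step 5.
        return decoded
    except ValueError:  # UnicodeError is a subclass of ValueError
        return label
```

Because every failure path returns the original input, a caller can apply this function to any label obtained from a domain name slot without special-casing labels that are not in ACE form.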
278 5. ACE prefix
280 [[ Note to the IESG and Internet Draft readers: The two uses of the
281 string "IESG--" below are to be changed at time of publication to a
282 prefix which fulfills the requirements in the first paragraph. ]]
284 The ACE prefix, used in the conversion operations (section 4), is two
285 alphanumeric ASCII characters followed by two hyphen-minuses. It cannot
286 be any of the prefixes already used in earlier documents, which include
287 the following: "bl--", "bq--", "dq--", "lq--", "mq--", "ra--", "wq--"
288 and "zq--". The ToASCII and ToUnicode operations MUST recognize the ACE
289 prefix in a case-insensitive manner.
291 The ACE prefix for IDNA is "IESG--".
293 This means that an ACE label might be "IESG--de-jg4avhby1noc0d", where
294 "de-jg4avhby1noc0d" is the part of the ACE label that is generated by
295 the encoding steps in [PUNYCODE].
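Case-insensitive recognition of the ACE prefix might be sketched in Python as follows. The constant uses the placeholder string from this section, which is to be replaced at publication. The names has_ace_prefix and strip_ace_prefix are hypothetical.

```python
ACE_PREFIX = "IESG--"  # placeholder from this section, not a final value


def has_ace_prefix(label: str, ace_prefix: str = ACE_PREFIX) -> bool:
    """Return True if label begins with the ACE prefix, ignoring case."""
    return label[:len(ace_prefix)].lower() == ace_prefix.lower()


def strip_ace_prefix(label: str, ace_prefix: str = ACE_PREFIX) -> str:
    """Return the part of an ACE label that follows the prefix."""
    if not has_ace_prefix(label, ace_prefix):
        raise ValueError("label does not begin with the ACE prefix")
    return label[len(ace_prefix):]
```

Applied to the example label above, strip_ace_prefix returns "de-jg4avhby1noc0d", and the same result is obtained if the prefix is written in lowercase.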
297 6. Implications for typical applications using DNS
299 In IDNA, applications perform the processing needed to input
300 internationalized domain names from users, display internationalized
301 domain names to users, and process the inputs and outputs from DNS and
302 other protocols that carry domain names.
304 The components and interfaces between them can be represented
305 pictorially as:
307 +------+
308 | User |
309 +------+
310 ^
311 | Input and display: local interface methods
312 | (pen, keyboard, glowing phosphorus, ...)
313 +-------------------|-------------------------------+
314 | v |
315 | +-----------------------------+ |
316 | | Application | |
317 | | (conversion between local | |
318 | | character set and Unicode | |
319 | | is done here) | |
320 | +-----------------------------+ |
321 | ^ ^ | End system
322 | | | |
323 | Call to resolver: | | Application-specific |
324 | ACE | | protocol: |
325 | v | predefined by the |
326 | +----------+ | protocol or defaults |
327 | | Resolver | | to ACE |
328 | +----------+ | |
329 | ^ | |
330 +-----------------|----------|----------------------+
331 DNS protocol: | |
332 ACE | |
333 v v
334 +-------------+ +---------------------+
335 | DNS servers | | Application servers |
336 +-------------+ +---------------------+
338 6.1 Entry and display in applications
340 Applications can accept domain names using any character set or sets
341 desired by the application developer, and can display domain names in any
342 charset. That is, the IDNA protocol does not affect the interface
343 between users and applications.
345 An IDNA-aware application can accept and display internationalized
346 domain names in two formats: the internationalized character set(s)
347 supported by the application, and as an ACE label. ACE labels that are
348 displayed or input MUST always include the ACE prefix. Applications MAY
349 allow input and display of ACE labels, but are not encouraged to do so
350 except as an interface for special purposes, possibly for debugging. ACE
351 encoding is opaque and ugly, and should thus only be exposed to users
352 who absolutely need it. The optional use, especially during a transition
353 period, of ACE encodings in the user interface is described in section
354 6.4. Because name labels encoded as ACE name labels can be rendered
355 either as the encoded ASCII characters or the proper decoded characters,
356 the application MAY have an option for the user to select the preferred
357 method of display; if it does, rendering the ACE SHOULD NOT be the
358 default.
360 Domain names are often stored and transported in many places. For example,
361 they are part of documents such as mail messages and web pages. They are
362 transported in many parts of many protocols, such as both the
363 control commands and the RFC 2822 body parts of SMTP, and the headers
364 and the body content in HTTP. It is important to remember that domain
365 names appear both in domain name slots and in the content that is passed
366 over protocols.
368 In protocols and document formats that define how to handle
369 specification or negotiation of charsets, labels can be encoded in any
370 charset allowed by the protocol or document format. If a protocol or
371 document format only allows one charset, the labels MUST be given in
372 that charset.
374 In any place where a protocol or document format allows transmission of
375 the characters in internationalized labels, internationalized labels
376 SHOULD be transmitted using whatever character encoding and escape
377 mechanism the protocol or document format uses at that place.
379 All protocols that use domain name slots already have the capacity for
380 handling domain names in the ASCII charset. Thus, ACE labels
381 (internationalized labels that have been processed with the ToASCII
382 operation) can inherently be handled by those protocols.
384 6.2 Applications and resolver libraries
386 Applications normally use functions in the operating system when they
387 resolve DNS queries. Those functions in the operating system are often
388 called "the resolver library", and the applications communicate with the
389 resolver libraries through a programming interface (API).
391 Because these resolver libraries today expect only domain names in
392 ASCII, applications MUST prepare labels that are passed to the resolver
393 library using the ToASCII operation. Labels received from the resolver
394 library contain only ASCII characters; internationalized labels that
395 cannot be represented directly in ASCII use the ACE form. ACE labels
396 always include the ACE prefix.
398 IDNA-aware applications MUST be able to work with both
399 non-internationalized labels (those that conform to [STD13]
400 and [STD3]) and internationalized labels.
402 It is expected that new versions of the resolver libraries in the future
403 will be able to accept domain names in other formats than ASCII, and
404 application developers might one day pass not only domain names in
405 Unicode, but also in local script to a new API for the resolver
406 libraries in the operating system.
408 6.3 DNS servers
410 An operating system might have a set of libraries for performing the
411 ToASCII operation. The input to such a library might be in one or more
412 charsets that are used in applications (UTF-8 and UTF-16 are likely
413 candidates for almost any operating system, and script-specific charsets
414 are likely for localized operating systems).
416 For internationalized labels that cannot be represented directly in
417 ASCII, DNS servers MUST use the ACE form produced by the ToASCII
418 operation. All IDNs served by DNS servers MUST contain only ASCII
419 characters.
421 If a signalling system which makes negotiation possible between old and
422 new DNS clients and servers is standardized in the future, the encoding
423 of the query in the DNS protocol itself can be changed from ACE to
424 something else, such as UTF-8. The question whether or not this should
425 be used is, however, a separate problem and is not discussed in this
426 memo.
428 6.4 Avoiding exposing users to the raw ACE encoding
430 All applications that might show the user a domain name obtained from a
431 domain name slot, such as from gethostbyaddr or part of a mail header,
432 SHOULD be updated as soon as possible in order to prevent users from
433 seeing the ACE.
435 If an application decodes an ACE name using ToUnicode but cannot show
436 all of the characters in the decoded name, such as if the name contains
437 characters that the output system cannot display, the application SHOULD
438 show the name in ACE format (which always includes the ACE prefix)
439 instead of displaying the name with the replacement character (U+FFFD).
440 This is to make it easier for the user to transfer the name correctly to
441 other programs. Programs that by default show the ACE form when they
442 cannot show all the characters in a name label SHOULD also have a
443 mechanism to show the name that is produced by the ToUnicode operation
444 with as many characters as possible and replacement characters in the
445 positions where characters cannot be displayed.
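The display fallback described above could be sketched in Python as follows. The can_render callback stands in for whatever test the output system provides for displayable characters; it and the function names are hypothetical.

```python
from typing import Callable

REPLACEMENT_CHARACTER = "\ufffd"  # U+FFFD


def choose_display(ace_label: str, decoded_label: str,
                   can_render: Callable[[str], bool]) -> str:
    """Show the decoded label only if every character can be displayed."""
    if all(can_render(ch) for ch in decoded_label):
        return decoded_label
    return ace_label  # fall back to the ACE form, prefix included


def best_effort_display(decoded_label: str,
                        can_render: Callable[[str], bool]) -> str:
    """Optional alternative view: substitute U+FFFD where needed."""
    return "".join(
        ch if can_render(ch) else REPLACEMENT_CHARACTER for ch in decoded_label
    )
```

Returning the ACE form by default keeps the name copyable into other programs, while the second function covers the optional mechanism for users who prefer a partially rendered name.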
447 The ToUnicode operation does not alter labels that are not valid ACE
448 labels, even if they begin with the ACE prefix. After ToUnicode has been
449 applied, if a label still begins with the ACE prefix, then it is not a
450 valid ACE label, and is not equivalent to any of the intermediate
451 Unicode strings constructed by ToUnicode.
453 6.5 Bidirectional text in domain names
455 The display of domain names that contain bidirectional text is not covered
456 in this document. It may be covered in a future version of this
457 document, or may be covered in a different document.
459 For developers interested in displaying domain names that have
460 bidirectional text, the Unicode standard has an extensive discussion of
461 how to reorder glyphs for display when dealing with
462 bidirectional text such as Arabic or Hebrew. See [UAX9] for more
463 information. In particular, all Unicode text is stored in logical order.
465 6.6 DNSSEC authentication of IDN domain names
467 DNS Security [DNSSEC] is a method for supplying cryptographic
468 verification information along with DNS messages. Public Key
469 Cryptography is used in conjunction with digital signatures to provide a
470 means for a requester of domain information to authenticate the source
471 of the data. This ensures that it can be traced back to a trusted
472 source, either directly, or via a chain of trust linking the source of
473 the information to the top of the DNS hierarchy.
475 IDNA specifies that all internationalized domain names served by DNS
476 servers that cannot be represented directly in ASCII must use the ACE
477 form produced by the ToASCII operation. This operation must be performed
478 prior to a zone being signed by the private key for that zone. Because
479 of this ordering, it is important to recognize that DNSSEC authenticates
480 the ASCII domain name, not the Unicode form or the mapping between the
481 Unicode form and the ASCII form. In other words, the output of ToASCII
482 is the canonical name. In the presence of DNSSEC, this is the name that
483 MUST be signed in the zone and MUST be validated against. It also SHOULD
484 be used for other name comparisons, such as when a browser wants to
485 indicate that a URL has been previously visited.
487 One consequence of this for sites deploying IDNA in the presence of
488 DNSSEC is that any special purpose proxies or forwarders used to
489 transform user input into IDNs must be earlier in the resolution flow
490 than DNSSEC authenticating nameservers for DNSSEC to work.
492 6.7 Limitations of IDNA
494 The IDNA protocol does not solve all linguistic issues with users
495 inputting names in different scripts. Many important language-based and
496 script-based mappings are not covered in IDNA and must be handled
497 outside the protocol. For example, names that are entered in a mix of
498 traditional and simplified Chinese characters will not be mapped to a
499 single canonical name. Another example is that Scandinavian names
500 entered with U+00F6 (LATIN SMALL LETTER O WITH DIAERESIS) will not be
501 mapped to U+00F8 (LATIN SMALL LETTER O WITH STROKE).
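The limitation above can be observed with an existing Nameprep implementation. The sketch below assumes the nameprep function from Python's standard encodings.idna module. The helper name same_after_nameprep is hypothetical, and the pair U+570B and U+56FD is an illustrative choice of traditional and simplified forms not taken from this document.

```python
from encodings.idna import nameprep  # CPython's Nameprep implementation


def same_after_nameprep(first: str, second: str) -> bool:
    """Return True if Nameprep maps both strings to the same sequence."""
    return nameprep(first) == nameprep(second)


# Case differences are folded by Nameprep ...
case_folded = same_after_nameprep("\u00d6", "\u00f6")
# ... but language-specific equivalences are not.
scandinavian = same_after_nameprep("\u00f6", "\u00f8")
chinese = same_after_nameprep("\u570b", "\u56fd")
```

Here case_folded is true, whereas scandinavian and chinese are both false, so any such mapping has to be handled outside the protocol.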
503 7. Name Server Considerations
505 Internationalized domain name data in zone files (as specified by section
506 5 of RFC 1035) MUST be processed with ToASCII before it is entered in
507 the zone files.
509 It is imperative that there be only one ASCII encoding for a particular
510 domain name. ACE is an encoding for domain name labels that use non-ASCII
511 characters. Thus, a primary master name server MUST NOT contain an
512 ACE-encoded label that decodes to an ASCII label. The ToASCII operation
513 ensures that it never outputs such a name.
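A zone-loading tool could check for the forbidden case above with something like the following Python sketch. It assumes Python's built-in "punycode" codec and the placeholder ACE prefix from section 5. The function name ace_decodes_to_ascii is hypothetical.

```python
ACE_PREFIX = "IESG--"  # placeholder from section 5, to be replaced at publication


def ace_decodes_to_ascii(label: str, ace_prefix: str = ACE_PREFIX) -> bool:
    """Return True if label has the ACE prefix but decodes to pure ASCII.

    Such a label would give an ASCII name a second ASCII encoding, so it
    must not appear in a zone.
    """
    if label[:len(ace_prefix)].lower() != ace_prefix.lower():
        return False
    try:
        decoded = label[len(ace_prefix):].encode("ascii").decode("punycode")
    except ValueError:  # not decodable; a separate validity check applies
        return False
    return all(ord(ch) <= 0x7F for ch in decoded)
```

For example, "IESG--abc-" is flagged because its Punycode part decodes to the ASCII string "abc", which ToASCII would have left as "abc".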
515 Name servers MUST NOT serve records with domain names that contain
516 non-ASCII characters; such names MUST be converted to ACE form by the
517 ToASCII operation in order to be served. If names that are not processed
518 by ToASCII are passed to an application, it will result in unpredictable
519 behavior. Note that [STRINGPREP] describes how to handle versioning of
520 unallocated codepoints.
522 8. Root Server Considerations
524 IDNs are likely to be somewhat longer than current host names, so the
525 bandwidth needed by the root servers should go up by a small amount.
526 Also, queries and responses for IDNs will probably be somewhat longer
527 than typical queries today, so more queries and responses may be forced
528 to go to TCP instead of UDP.
530 9. Security Considerations
532 Security on the Internet partly relies on the DNS. Thus, any
533 change to the characteristics of the DNS can change the security of much
534 of the Internet.
536 This memo describes an algorithm which encodes characters that are not
537 valid according to STD3 and STD13 into octet values that are valid. No
538 security issues such as string length increases or new allowed values
539 are introduced by the encoding process or the use of these encoded
540 values, apart from those introduced by the ACE encoding itself.
542 Domain names are used by users to connect to Internet servers. The
543 security of the Internet would be compromised if a user entering a
544 single internationalized name could be connected to different servers
545 based on different interpretations of the internationalized domain name.
547 Because this document normatively refers to [NAMEPREP], it includes the
548 security considerations from that document as well.
550 A. References
552 [PUNYCODE] Adam Costello, "Punycode", draft-ietf-idn-punycode.
554 [DNSSEC] Don Eastlake, "Domain Name System Security Extensions", RFC
555 2535, March 1999.
557 [NAMEPREP] Paul Hoffman and Marc Blanchet, "Preparation of
558 Internationalized Domain Names", draft-ietf-idn-nameprep.
560 [RFC2119] Scott Bradner, "Key words for use in RFCs to Indicate
561 Requirement Levels", RFC 2119, March 1997.
563 [STD3] Bob Braden, "Requirements for Internet Hosts -- Communication
564 Layers" (RFC 1122) and "Requirements for Internet Hosts -- Application
565 and Support" (RFC 1123), STD 3, October 1989.
567 [STD13] Paul Mockapetris, "Domain names - concepts and facilities" (RFC
568 1034) and "Domain names - implementation and specification" (RFC 1035),
569 STD 13, November 1987.
571 [STRINGPREP] Paul Hoffman and Marc Blanchet, "Preparation of
572 Internationalized Strings ("stringprep")", draft-hoffman-stringprep,
573 work in progress.
575 [UAX9] Unicode Standard Annex #9, The Bidirectional Algorithm.
578 [UNICODE] The Unicode Standard, Version 3.1.0: The Unicode Consortium.
579 The Unicode Standard, Version 3.0. Reading, MA, Addison-Wesley
580 Developers Press, 2000. ISBN 0-201-61633-5, as amended by: Unicode
581 Standard Annex #27: Unicode 3.1.
584 B. Authors' Addresses
586 Patrik Faltstrom
587 Cisco Systems
588 Arstaangsvagen 31 J
589 S-117 43 Stockholm Sweden
590 paf@cisco.com
592 Paul Hoffman
593 Internet Mail Consortium and VPN Consortium
594 127 Segre Place
595 Santa Cruz, CA 95060 USA
596 phoffman@imc.org
598 Adam M. Costello
599 University of California, Berkeley
600 idna-spec.amc @ nicemice.net