idnits 2.17.1

draft-crocker-idn-idn-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents -- however, there's a paragraph
     with a matching beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 312 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (23 June 2002) is 7978 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Unicode' is mentioned on line 181, but not defined

  == Unused Reference: 'DNSSEC' is defined on line 226, but no explicit
     reference was found in the text

  == Unused Reference: 'UAX9' is defined on line 232, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 2535 (ref.
     'DNSSEC') (Obsoleted by RFC 4033, RFC 4034, RFC 4035)

     Summary: 3 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internet Draft                                         D. Crocker
2    draft-crocker-idn-idn-00.txt          Brandenburg InternetWorking
3    Expires in six months                                23 June 2002

5               Internationalized Domain Names (IDN)

7    Status of this Memo

9       This document is an Internet-Draft and is in full
10      conformance with all provisions of Section 10 of
11      RFC2026.

13      Internet-Drafts are working documents of the Internet
14      Engineering Task Force (IETF), its areas and its working
15      groups. Note that other groups may also distribute
16      working documents as Internet-Drafts.

18      Internet-Drafts are draft documents valid for a maximum
19      of six months and may be updated, replaced, or obsoleted
20      by other documents at any time. It is inappropriate to
21      use Internet-Drafts as reference material or to cite
22      them other than as "work in progress."

24      The list of current Internet-Drafts can be accessed at
25      http://www.ietf.org/ietf/1id-abstracts.txt

27      The list of Internet-Draft Shadow Directories can be
28      accessed at http://www.ietf.org/shadow.html.

30   Abstract

32      Globalization of the Internet requires that domain names
33      be able to use characters outside the ASCII repertoire.
34      This document specifies internationalized domain names
35      (IDNs) and defines initial domain name constructs in
36      which IDNs can be used. IDNs use characters drawn from
37      a large repertoire (Unicode).

39   0. Document Change Notes --

41      This is a revision to draft-ietf-idn-idna-09.txt. It is
42      being distributed independently to facilitate
43      discussion.

45      The goal is to gain consensus about revisions to the IDN
46      working group document, specifically for the following
47      changes:

49      a. Split the document into two, one for defining
50      Internationalized Domain Names (IDN) and the other for
51      defining an encoding method of IDNs, namely IDNA using ACE.

53      b. Distinguish general IDN from its specific use for host
54      names (IDN-Host). Use for host names is specified more
55      precisely, in terms of a specific syntax BNF rule from the
56      relevant existing DNS specification, so that IDN-Host will
57      apply precisely to all DNS record fields and protocol units
58      conforming to that BNF.

60   1. Introduction

62      Until now, there has been no standard method for domain
63      names to use characters outside the ASCII repertoire.
64      This document defines enhancements to the definition of
65      domain names, to support internationalized domain names
66      (IDN). The details for doing protocol encoding of IDNs
67      are specified separately.

69   2. Terminology

71      The key words "MUST", "SHALL", "REQUIRED", "SHOULD",
72      "RECOMMENDED", and "MAY" in this document are to be
73      interpreted as described in RFC 2119 [RFC2119].

75      "ASCII"

77         means US-ASCII [USASCII], a coded character set
78         containing 128 characters associated with code
79         points in the range 0..7F. Unicode is an extension
80         of ASCII: it includes all the ASCII characters and
81         associates them with the same code points.

83      Code point

85         refers to an integral value associated with a
86         character in a coded character set.

88      Domain name

90         is used as a general term for strings conforming to
91         [STD13]. [STD13] talks about "domain names" and
92         "host names", but many people use the terms
93         interchangeably. Further, because [STD13] was not
94         terribly clear, many people who are sure they know
95         the exact definitions of each of these terms
96         disagree on the definitions. This document uses the
97         terms separately.

99      Domain name slot

101        refers to a protocol element or a function argument
102        or a return value (and so on) explicitly designated
103        for carrying a domain name. Examples of domain name
104        slots include: the QNAME field of a DNS query; the
105        name argument of the gethostbyname() library
106        function; the part of an email address following
107        the at-sign (@) in the From: field of an email
108        message header; and the host portion of the URI in
109        the src attribute of an HTML tag. General
110        text that just happens to contain a domain name is
111        not a domain name slot; for example, a domain name
112        appearing in the plain text body of an email
113        message is not occupying a domain name slot.

115     Host name

117        is a domain name conforming to STD13, with the
118        naming character set limited to LDH.

120     Internationalized host name (IDN-Host)

122        is an IDN conforming to STD13, except that it
123        also supports non-ASCII characters from Unicode.

125     Internationalized domain name (IDN)

127        is a domain name that has characters drawn from the
128        restricted set of Unicode defined in <>

130     Internationalized label

132        is a label composed of characters from the Unicode
133        character set; note, however, that not every string
134        of Unicode characters can be an internationalized
135        label.

137     IDN-native

139        is a domain name slot specified to hold an
140        internationalized domain name. The designation may
141        be static (for example, in the specification of the
142        protocol or interface) or dynamic (for example, as
143        a result of negotiation in an interactive session).

145     Label

147        is an individual part of a domain name. Labels are
148        usually shown separated by dots; for example, the
149        domain name "www.example.com" is composed of three
150        labels: "www", "example", and "com". (The zero-
151        length root label described in [STD13], which can
152        be explicit as in "www.example.com." or implicit as
153        in "www.example.com", is not considered a label in
154        this specification.) Throughout this document the
155        term "label" is shorthand for "text label", and
156        "every label" means "every text label". In IDNA,
157        not all text strings can be labels.

159     LDH code points

161        is defined to mean the code points associated with
162        ASCII letters, digits, and the hyphen-minus; that
163        is, U+002D, 30..39, 41..5A, and 61..7A. "LDH" is an
164        abbreviation for "letters, digits, hyphen".

166     Unicode

168        is a coded character set [UNICODE] containing tens
169        of thousands of characters. A single Unicode code
170        point is denoted by "U+" followed by four to six
171        hexadecimal digits, while a range of Unicode code
172        points is denoted by two hexadecimal numbers
173        separated by "..", with no prefixes.

175  3. Internationalized Domain Names (IDN)

177  3.1. Data representation

179     This specification enhances the set of values for valid
180     domain name labels from the restricted ASCII specified
181     in [STD3], to include [Unicode].

183     Mechanisms for encoding Unicode values in Domain Names
184     are specified separately. Hence this specification
185     provides no detail for IDNs in "native" binary form (IDN-
186     Native) or for "encoded" Unicode-based IDNs.

188  3.2. Dot as label separator

190     For systems supporting IDN, wherever dot is permitted as
191     a label separator, the following characters MUST be
192     recognized as dots: U+002E (full stop), U+3002
193     (ideographic full stop), U+FF0E (fullwidth full stop),
194     U+FF61 (halfwidth ideographic full stop).

196     << // Are there also multiple Unicode characters
197        permitted for at-sign? What about for slash ("/")?

199        If not, then why is the domain name lexical
200        analyzer now required to look for 4 characters
201        rather than only one?

203        This appears to be a case of putting into the
204        protocol something that is, in fact, entirely a
205        user-interface issue. That some user interfaces
206        will choose to map U+3002 to ASCII dot does not
207        mean that it needs to be in the protocol. // /Dave
208     >>

210  4. References

212  4.1. Normative references

214     [STD3] Bob Braden, "Requirements for Internet Hosts --
215     Communication Layers" (RFC 1122) and "Requirements for
216     Internet Hosts -- Application and Support" (RFC 1123),
217     STD 3, October 1989.

219     [STD13] Paul Mockapetris, "Domain names - concepts and
220     facilities" (RFC 1034) and "Domain names -
221     implementation and specification" (RFC 1035), STD 13,
222     November 1987.

224  4.2. Informative references

226     [DNSSEC] Don Eastlake, "Domain Name System Security
227     Extensions", RFC 2535, March 1999.

229     [RFC2119] Scott Bradner, "Key words for use in RFCs to
230     Indicate Requirement Levels", March 1997, RFC 2119.

232     [UAX9] Unicode Standard Annex #9, The Bidirectional
233     Algorithm,
234     .

236     [UNICODE] The Unicode Standard, Version 3.1.0: The
237     Unicode Consortium. The Unicode Standard, Version 3.0.
238     Reading, MA, Addison-Wesley Developers Press, 2000. ISBN
239     0-201-61633-5, as amended by: Unicode Standard Annex
240     #27: Unicode 3.1,
241     .

244     [USASCII] Vint Cerf, "ASCII format for Network
245     Interchange", October 1969, RFC 20.

247  5. Security Considerations

249     Security on the Internet partly relies on the DNS. Thus,
250     any change to the characteristics of the DNS can change
251     the security of much of the Internet.

253     This memo describes an algorithm that encodes characters
254     that are not valid according to STD3 and STD13 into
255     octet values that are valid. No security issues such as
256     string length increases or new allowed values are
257     introduced by the encoding process or the use of these
258     encoded values, apart from those introduced by the ACE
259     encoding itself.

261     Domain names are used by users to connect to Internet
262     servers. The security of the Internet would be
263     compromised if a user entering a single
264     internationalized name could be connected to different
265     servers based on different interpretations of the
266     internationalized domain name.

268  6. Authors' Addresses

270     Patrik Faltstrom
271     Cisco Systems
272     Arstaangsvagen 31 J
273     S-117 43 Stockholm Sweden
274     paf@cisco.com

276     Paul Hoffman
277     Internet Mail Consortium and VPN Consortium
278     127 Segre Place
279     Santa Cruz, CA 95060 USA
280     phoffman@imc.org

282     Adam M. Costello
283     University of California, Berkeley
284     idna-spec.amc @ nicemice.net
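
Illustrative sketch (not part of the draft): the label separator rule in
Section 3.2 and the "LDH code points" definition in Section 2 can be
sketched in Python as follows. The names LABEL_SEPARATORS, split_labels,
and is_ldh are hypothetical and chosen only for this example. The sketch
assumes the input is already a Unicode string and performs no encoding,
normalization, or length checks.

```python
# Hypothetical helper names; only the code point values come from the draft.

# The four characters that Section 3.2 says MUST be recognized as dots.
LABEL_SEPARATORS = "\u002E\u3002\uFF0E\uFF61"


def split_labels(name):
    """Split a domain name into labels on any of the four dot characters."""
    labels = []
    current = []
    for ch in name:
        if ch in LABEL_SEPARATORS:
            labels.append("".join(current))
            current = []
        else:
            current.append(ch)
    labels.append("".join(current))
    # A trailing separator marks the explicit zero-length root label,
    # which the draft does not count as a label, so drop it.
    if len(labels) > 1 and labels[-1] == "":
        labels.pop()
    return labels


def is_ldh(label):
    """True if the label is non-empty and uses only LDH code points:
    U+002D, 30..39, 41..5A, and 61..7A."""
    return len(label) > 0 and all(
        cp == 0x2D
        or 0x30 <= cp <= 0x39
        or 0x41 <= cp <= 0x5A
        or 0x61 <= cp <= 0x7A
        for cp in map(ord, label)
    )
```

For example, split_labels("www\u3002example\uFF0Ecom") yields the same
three labels as split_labels("www.example.com"), and is_ldh() can then be
applied to each label to tell a plain host name label from one that
contains non-ASCII characters.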