idnits 2.17.1

draft-skwan-utf8-dns-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-19) according to
     https://trustee.ietf.org/license-info :

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

          This Internet-Draft is submitted in full conformance with the
          provisions of BCP 78 and BCP 79.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

          Copyright (c) 2024 IETF Trust and the persons identified as the
          document authors.  All rights reserved.

       IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

          This document is subject to BCP 78 and the IETF Trust's Legal
          Provisions Relating to IETF Documents
          (https://trustee.ietf.org/license-info) in effect on the date of
          publication of this document.  Please review these documents
          carefully, as they describe your rights and restrictions with respect
          to this document.  Code Components extracted from this document must
          include Simplified BSD License text as described in Section 4.e of
          the Trust Legal Provisions and are provided without warranty as
          described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 4 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (September 1998) is 9348 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2044 (Obsoleted by RFC 2279)

  ** Downref: Normative reference to an Informational RFC: RFC 1958

  ** Downref: Normative reference to an Informational RFC: RFC 2130


     Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

INTERNET-DRAFT                                                Stuart Kwan
                                                             James Gilroy
                                                          Microsoft Corp.
                                                               March 1998
Expires September 1998

        Using the UTF-8 Character Set in the Domain Name System

Status of this Memo

This document is an Internet-Draft.  Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its
areas, and its working groups.  Note that other groups may also
distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time.  It is inappropriate to use
Internet-Drafts as reference material or to cite them other than
as "work in progress."

To view the entire list of current Internet-Drafts, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
(Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
Coast), or ftp.isi.edu (US West Coast).

Abstract

The Domain Name System standard specifies that names are represented
using the ASCII character encoding.  This document expands that
specification to allow the use of the UTF-8 character encoding, a
superset of ASCII and a translation of the UCS-2 character encoding.

1. Introduction

The Domain Name System standard [RFC1035] specifies that names are
represented using the ASCII character encoding.  This document expands
that specification to allow the use of the UTF-8 character encoding
[RFC2044], a superset of ASCII and a translation of the UCS-2
character encoding.

Interpreting names as ASCII-only limits the utility of DNS in an
international setting.  The UTF-8 character set includes characters
from most of the world's written languages, allowing a far greater
range of possible names and allowing names to use characters that are
relevant to a particular locality.  UTF-8 is the recommended character
set for protocols that are evolving beyond ASCII [RFC2130].

This document defines the technology for a richer character set in
DNS.  This document specifically does not define policy for the
characters allowed in a name when used in a particular application.
For example, some protocols place restrictions on the characters
allowed in a name.  In addition, names that are intended to be
globally visible [RFC1958] should contain ASCII-only characters
per [RFC1123].

2. Protocol Description

A UTF-8-aware DNS server is a DNS server that can load and store DNS
names that contain UTF-8 characters.  Names are encoded in logical
order as opposed to visual order (see [UNICODE 2.0]).

Uniform downcasing permits UTF-8-aware DNS implementations to
interoperate with non-UTF-8-aware DNS implementations.  Any binary
string can be used in a DNS name [RFC2181], but names must be
compared with case-insensitivity [RFC1035].  A non-UTF-8-aware DNS
implementation is unable to perform a case-insensitive comparison
on a name containing UTF-8 characters.  However, if UTF-8 names are
downcased before transmission, then binary comparisons will provide
the desired result on non-UTF-8-aware servers without violating the
case-insensitivity requirement.

The DNS protocol standard states that original case should be
preserved when possible as data is entered into the system.  This
requirement is modified as follows: a UTF-8-aware DNS server must
downcase all names containing UTF-8 characters in both record names
and record data before transmitting those names in any message.
A UTF-8-aware DNS client/resolver must downcase all names containing
UTF-8 characters before transmitting those names in any message.

For consistency, UTF-8-aware DNS servers must compare names that
contain UTF-8 characters byte-for-byte, as opposed to using Unicode
equivalency rules.

Applications should take care when allowing uppercase UTF-8 characters
to be passed to the resolver, and DNS servers should take care when
allowing uppercase UTF-8 characters to be entered in zone data.
Downcasing in UTF-8 is locale-sensitive and the result may vary
according to the locale of the code execution.  The desired result will
always be obtained if the application and server only accept lowercase
characters.

Names encoded in UTF-8 must not exceed the size limits clarified in
[RFC2181].  Character count is insufficient to determine size, since
some UTF-8 characters exceed one octet in length.

3. Interoperability Considerations

The UTF-8 character encoding is ideal for use with existing protocol
implementations that expect US-ASCII characters.  The representation
of US-ASCII characters in UTF-8 is byte-for-byte identical to the
US-ASCII representation.  Non-UTF-8-aware DNS clients always encode
names in ASCII format and those names will always be correctly
interpreted by a UTF-8-aware DNS server.

DNS server authors may wish to provide a configuration switch on the
DNS server to allow/disallow the use of UTF-8 characters on a
per-server or per-zone basis.

A non-UTF-8-aware DNS server may accept a zone transfer of a zone
containing UTF-8 names, but it may not be able to write back those
names to a zone file or reload those names from a zone file.
Administrators should exercise caution when transferring a zone
containing UTF-8 names to a non-UTF-8-aware DNS server.

4. Security Considerations

The choice of character encoding for names does not impact the
security of the DNS protocol.

5. Acknowledgements

The authors of this document would like to thank the following people
for their contribution to this specification: John McConnell,
Cliff Van Dyke and Bjorn Rettig.

6. References

[RFC1035]      P.V. Mockapetris, "Domain Names - Implementation and
               Specification," RFC 1035, ISI, Nov 1987.

[RFC2044]      F. Yergeau, "UTF-8, a transformation format of Unicode
               and ISO 10646," RFC 2044, Alis Technologies, Oct 1996.

[RFC1958]      B. Carpenter, "Architectural Principles of the
               Internet," RFC 1958, IAB, June 1996.

[RFC1123]      R. Braden, "Requirements for Internet Hosts -
               Application and Support," STD 3, RFC 1123, January 1989.

[RFC2130]      C. Weider et al., "The Report of the IAB Character
               Set Workshop held 29 February - 1 March 1996,"
               RFC 2130, Apr 1997.

[RFC2181]      R. Elz and R. Bush, "Clarifications to the DNS
               Specification," RFC 2181, University of Melbourne and
               RGnet Inc, July 1997.

[UNICODE 2.0]  The Unicode Consortium, "The Unicode Standard, Version
               2.0," Addison-Wesley, 1996.  ISBN 0-201-48345-9.

7. Authors' Addresses

Stuart Kwan                     James Gilroy
Microsoft Corporation           Microsoft Corporation
One Microsoft Way               One Microsoft Way
Redmond, WA 98052               Redmond, WA 98052
USA                             USA
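
Illustrative sketch of the Section 2 rules

The Protocol Description requires that UTF-8 names be downcased before
transmission, compared byte-for-byte, and measured in octets rather
than characters.  The Python sketch below shows one way these three
rules might fit together.  It is a minimal example under stated
assumptions, not a reference implementation: the function names
(wire_labels, check_size, same_name) are hypothetical, the 63-octet
label and 255-octet name limits are the ones restated in [RFC2181],
and Python's locale-independent str.lower() stands in for the
downcasing step, which the draft notes is locale-sensitive.

```python
from typing import List

MAX_LABEL_OCTETS = 63   # per-label limit restated in RFC 2181
MAX_NAME_OCTETS = 255   # whole-name limit, counting length octets


def wire_labels(name: str) -> List[bytes]:
    """Downcase a dotted name and return its labels as UTF-8 octets."""
    if name.endswith("."):
        name = name[:-1]          # drop the trailing root dot, if any
    if name == "":
        return []                 # the root name has no labels
    # Assumption: str.lower() approximates the draft's downcasing step.
    return [label.lower().encode("utf-8") for label in name.split(".")]


def check_size(name: str) -> None:
    """Raise ValueError if the name breaks the octet limits."""
    labels = wire_labels(name)
    for raw in labels:
        if len(raw) == 0:
            raise ValueError("empty label")
        if len(raw) > MAX_LABEL_OCTETS:
            raise ValueError("label exceeds 63 octets")
    # One length octet per label, plus the zero octet of the root label.
    total = sum(1 + len(raw) for raw in labels) + 1
    if total > MAX_NAME_OCTETS:
        raise ValueError("name exceeds 255 octets")


def same_name(a: str, b: str) -> bool:
    """Compare two names byte-for-byte after downcasing."""
    return wire_labels(a) == wire_labels(b)
```

In this sketch, a label of 40 two-octet characters fails check_size
even though its character count is below 63, and two names that differ
only in Unicode composition (a precomposed letter versus a base letter
plus combining mark) do not compare equal under same_name, because no
Unicode equivalency rules are applied.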