draft-skwan-utf8-dns-00.txt
INTERNET-DRAFT                                               Stuart Kwan
                                                            James Gilroy
                                                         Microsoft Corp.
                                                           November 1997
                                                        Expires May 1998

        Using the UTF-8 Character Set in the Domain Name System

Status of this Memo

   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than
   as "work in progress."

   To view the entire list of current Internet-Drafts, please check
   the "1id-abstracts.txt" listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
   (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East
   Coast), or ftp.isi.edu (US West Coast).

Abstract

   The Domain Name System standard specifies that names are represented
   using the ASCII character encoding. This document expands that
   specification to allow the use of the UTF-8 character encoding, a
   superset of ASCII and a transformation format of the UCS-2 character
   encoding.

1. Introduction

   The Domain Name System standard [RFC1035] specifies that names are
   represented using the ASCII character encoding. This document expands
   that specification to allow the use of the UTF-8 character encoding
   [RFC2044], a superset of ASCII and a transformation format of the
   UCS-2 character encoding.

   Interpreting names as ASCII-only limits the utility of DNS in an
   international setting. UTF-8 can represent characters from most of
   the world's written languages, allowing a far greater range of
   possible names and allowing names to use characters that are
   relevant to a particular locality. UTF-8 is the recommended
   character encoding for protocols that are evolving beyond ASCII
   [RFC2130].

   This document defines the technology for a richer character set in
   DNS. It does not define the policy for the characters allowed in a
   name when used by a particular protocol. Protocol authors are
   encouraged to place no restrictions on characters allowed in a name.

2. Protocol Description

   A UTF-8-aware DNS server is a DNS server that can load and store DNS
   names that contain UTF-8 characters. Names are encoded in logical
   order as opposed to visual order (see [UNICODE 2.0]).

   Uniform downcasing permits UTF-8-aware DNS implementations to
   interoperate with non-UTF-8-aware DNS implementations. Any binary
   string can be used in a DNS name [RFC2181], but names must be
   compared with case-insensitivity [RFC1035]. A non-UTF-8-aware DNS
   implementation is unable to perform a case-insensitive comparison
   on a name containing UTF-8 characters. However, if UTF-8 names are
   downcased before transmission, then binary comparisons will provide
   the desired result on non-UTF-8-aware servers without violating the
   case-insensitivity requirement.

   The DNS protocol standard states that original case should be
   preserved when possible as data is entered into the system. This
   requirement is modified as follows: a UTF-8-aware DNS server must
   downcase all names containing UTF-8 characters in both record names
   and record data before transmitting those names in any message.
   A UTF-8-aware DNS client/resolver must downcase all names containing
   UTF-8 characters before transmitting those names in any message.

   For consistency, UTF-8-aware DNS servers must compare names that
   contain UTF-8 characters byte-for-byte, as opposed to using Unicode
   equivalency rules.

   Applications should take care when allowing uppercase UTF-8 characters
   to be passed to the resolver, and DNS servers should take care when
   allowing uppercase UTF-8 characters to be entered in zone data.
   Downcasing in UTF-8 is locale-sensitive and the result may vary
   according to the locale in which the code executes. The desired
   result will always be obtained if the application and server only
   accept lowercase characters.

   Names encoded in UTF-8 must not exceed the size limits clarified in
   [RFC2181]: a maximum of 63 octets per label and 255 octets per name.
   Character count is insufficient to determine size, since some UTF-8
   characters exceed one octet in length.

3. Interoperability Considerations

   The UTF-8 character encoding is ideal for use with existing protocol
   implementations that expect US-ASCII characters. The representation
   of US-ASCII characters in UTF-8 is byte for byte identical to the
   US-ASCII representation. Non-UTF-8-aware DNS clients always encode
   names in ASCII format and those names will always be correctly
   interpreted by a UTF-8-aware DNS server.

   DNS server authors may wish to provide a configuration switch on the
   DNS server to allow or disallow the use of UTF-8 characters on a
   per-server or per-zone basis.

   A non-UTF-8-aware DNS server may accept a zone transfer of a zone
   containing UTF-8 names, but it may not be able to write back those
   names to a zone file or reload those names from a zone file.
   Administrators should exercise caution when transferring a zone
   containing UTF-8 names to a non-UTF-8-aware DNS server.

4. Security Considerations

   The choice of character encoding for names does not impact the
   security of the DNS protocol.

5. Acknowledgements

   The authors of this document would like to thank the following people
   for their contribution to this specification: John McConnell,
   Cliff Van Dyke and Bjorn Rettig.

6. References

   [RFC1035]      P.V. Mockapetris, "Domain Names - Implementation and
                  Specification," RFC 1035, ISI, Nov 1987.

   [RFC2044]      F. Yergeau, "UTF-8, a transformation format of Unicode
                  and ISO 10646," RFC 2044, Alis Technologies, Oct 1996.

   [RFC2130]      C. Weider et al., "The Report of the IAB Character
                  Set Workshop held 29 February - 1 March 1996,"
                  RFC 2130, Apr 1997.

   [RFC2181]      R. Elz and R. Bush, "Clarifications to the DNS
                  Specification," RFC 2181, University of Melbourne and
                  RGnet Inc, July 1997.

   [UNICODE 2.0]  The Unicode Consortium, "The Unicode Standard, Version
                  2.0," Addison-Wesley, 1996. ISBN 0-201-48345-9.

7. Authors' Addresses

   Stuart Kwan                     James Gilroy
   Microsoft Corporation           Microsoft Corporation
   One Microsoft Way               One Microsoft Way
   Redmond, WA 98052               Redmond, WA 98052
   USA                             USA
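
   As a non-normative illustration of the rules in Section 2, the
   following Python sketch downcases a name, checks its size in octets
   rather than characters, and compares names byte-for-byte. All
   function and constant names are hypothetical. Python's built-in
   str.lower() stands in for the downcasing step, which this document
   does not define precisely. The 63-octet label limit is the figure
   given in RFC 2181, and the name length is assumed to be counted as
   in the RFC 1035 wire format (one length octet per label plus the
   terminating root octet).

```python
from typing import List

MAX_LABEL_OCTETS = 63   # per-label limit given in RFC 2181
MAX_NAME_OCTETS = 255   # per-name limit given in RFC 2181


def downcase_label(label: str) -> bytes:
    """Downcase one label and return its UTF-8 octets.

    Assumption: str.lower() is used as the downcasing function.
    """
    return label.lower().encode("utf-8")


def encode_name(name: str) -> List[bytes]:
    """Split a dotted name into downcased UTF-8 labels, enforcing size limits."""
    stripped = name.rstrip(".")
    if not stripped:
        return []  # the root name has no labels
    labels = [downcase_label(part) for part in stripped.split(".")]
    for label in labels:
        # Size is measured in octets, not characters.
        if not 1 <= len(label) <= MAX_LABEL_OCTETS:
            raise ValueError("label must be 1 to 63 octets long")
    # Assumed accounting: one length octet per label plus the root octet.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    if wire_length > MAX_NAME_OCTETS:
        raise ValueError("name exceeds 255 octets")
    return labels


def names_equal(a: str, b: str) -> bool:
    """Compare two names byte-for-byte after downcasing.

    No Unicode equivalence (normalization) is applied.
    """
    return encode_name(a) == encode_name(b)
```

   Under these assumptions, names_equal("BÜCHER.example",
   "bücher.example") is true, while a precomposed and a decomposed
   spelling of the same accented letter compare as different names
   because their octets differ. A label of 32 "ü" characters is
   rejected: it is only 32 characters long but occupies 64 octets.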