Network Working Group                                           S. Woolf
Internet-Draft                         Internet Systems Consortium, Inc.
Intended status: Informational                                    X. Lee
Expires: September 15, 2011                                       J. Yao
                                                                   CNNIC
                                                          March 14, 2011


           Problem Statement: DNS Resolution of Aliased Names
             draft-ietf-dnsext-aliasing-requirements-01.txt

Abstract

   This document attempts to describe a set of issues that arises from
   the desire to treat a set or group of names as "aliases" of each
   other, "bundled," "variants," or "the same," which is problematic in
   terms of corresponding behavior for DNS labels and FQDNs.

   With the emergence of internationalized domain names, among other
   potential use cases, two or more names that users will regard as
   having identical meaning may sometimes require corresponding behavior
   in the underlying infrastructure, possibly in the DNS itself.  It's
   not clear how to accommodate this required behavior of such names in
   DNS resolution; in particular, it's not clear when they are best
   accommodated in registry practices for generating names for lookup in
   the DNS, existing DNS protocol elements and behavior, existing
   application-layer mechanisms and practices, or some set of protocol
   elements or behavior not yet defined.  This document attempts to
   describe some of these cases and the behavior of some of the possible
   solutions discussed to date.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 15, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
     1.1.  What this document does
     1.2.  What this document does not do
     1.3.  Terminology
   2.  Problem Statement
     2.1.  Registration of Domain Name Variants
     2.2.  Identical DNS Resolution for Bundled DNS Names
     2.3.  Character Variants
       2.3.1.  An example: Simplified and Traditional Chinese
       2.3.2.  An example: Greek
       2.3.3.  An Example: Arabic
     2.4.  Use of Variants
   3.  Operational Considerations
     3.1.  Zone Provisioning and Authority Servers
       3.1.1.  Provisioning of 'aliases' in the registry
       3.1.2.  Impact of special mechanisms
     3.2.  Recursive Resolvers
     3.3.  Applications
   4.  Proposed Requirements
   5.  Possible Solutions
     5.1.  Mapping or Redirection of Domain Names
       5.1.1.  Mapping itself (CNAME)
       5.1.2.  Mapping its descendants
       5.1.3.  Mapping itself and its descendants
     5.2.  Zone Clone
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Change History
     9.1.  draft-yao-dnsext-identical-resolution: Version 00
     9.2.  draft-yao-dnsext-identical-resolution: Version 01
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   As the Internet and the DNS have evolved beyond their original realms
   of use, a set of needs and expectations has appeared about how DNS
   labels behave that is informed significantly by common human
   assumptions about how "names" or "words" work.  One aspect of this is
   the notion or expectation that multiple sets of names may appear
   similar to a human user and be expected to behave "the same" or as
   "aliases" of one another, across multiple services and interactions.
   The DNS was designed with the implicit expectation that names would
   be based on ASCII characters, and the "similarity" or "sameness"
   property doesn't seem to arise terribly often in the names people
   originally wanted to use in the DNS; thus the requirement of
   identical resolution of "aliased" or "bundled" names hasn't figured
   prominently as an attribute that needed to be accommodated in the
   generation or lookup of DNS names.  However, with the standardization
   of internationalized domain name protocols (ref: IDNA and IDNAbis),
   more and more internationalized domain name labels [RFC3490] are
   appearing in DNS zones.  In some cases, these labels [RFC3743] are
   accompanied by the expectation that they are "equivalent" or should
   behave "the same," often because these labels are derived from names
   or strings that users consider "the same" in some languages.
   Accordingly, Internet users hope for such labels to behave in DNS
   contexts as they expect the corresponding human constructs to behave,
   regardless of the specific service (smtp, http, etc.) involved.

   The general issues of what "the same" means, or of defining
   "variants" in human scripts as codified in Unicode (or anywhere else)
   are well outside the scope of the DNS or the expertise of most of the
   people who work on it.  They are matters for philosophers and
   linguists, and for applications developers, respectively.  However,
   to the extent that these issues can be specified as involving the
   resolution of names in the DNS, it's reasonable to describe those
   expectations and attempt to accommodate them.

   There is some existing technology defined in the DNS for behavior
   that can be described as one name behaving "the same" as another.
   For a single node in the DNS tree, CNAME can be used to map one name
   as an "alias" to another, "canonical" name.  If there is a need to
   map a subtree of the DNS-- a zone, or a domain and its subdomains--
   to another domain, DNAME has been defined to allow this behavior.
   However, there is no way currently defined to do both, as CNAME is
   required to be the sole record at its node in the tree.  Behavior
   that combines the characteristics of CNAME and DNAME is not currently
   defined in the DNS.

   If the existing protocol does not meet the zone administrator's need
   to be able to treat one label, name, or zone as "the same" as
   another, there are also administrative mechanisms available for
   manipulating databases underlying the generation and resolution of
   DNS names.  Registry operators have many mechanisms for working
   around the DNS protocol in order to get the behavior they want for
   names in DNS zones, and management of "aliases" is no exception.
   However, it is not clear how much of the user and operator
   requirements for "aliases" can be met by mechanisms for provisioning
   DNS zones, at acceptable cost.
   Concerns have been raised about this approach, particularly at
   large scales, where there is a need both to provision possibly
   exponential numbers of domains and then to audit them for compliance
   with parent registry policy.

1.1.  What this document does

   Attempts to think about "aliases" or similar concepts as applied to
   the DNS have been difficult, both because use cases have been unclear
   and because terminology for describing and distinguishing them has
   not been readily available.  This document attempts to provide both
   brief descriptions of identified use cases, and a rough organization
   for how to think about behavior in the DNS that might correspond to
   the requirements derived from them, as a way of evaluating proposed
   solutions.  This includes existing and additional possible solutions,
   from the perspective of both DNS (authoritative server, resolver, and
   client) and application needs.

   As a departure point, we attempt to be rigorous about distinguishing
   DNS "labels" from "words" (a human construct) and "strings" (which we
   use here as machine-readable constructs that nonetheless may not
   conform to DNS label constraints, such as IDNA U-labels).  The
   distinctions among what humans type or see, what applications use,
   and what DNS stores and resolves are sometimes subtle but
   particularly important.

   A list of broad requirements is proposed for any DNS protocol changes
   that might be undertaken.

   We also review existing technologies (CNAME, DNAME) and proposed new
   ones ("BNAME," "zone clones") against the proposed requirements.

1.2.  What this document does not do

   This document makes no attempt to solve or even describe
   "translation" of one name into another in the DNS, which is likely to
   be impossible.  "Translation" in general, or even the particular
   problem of determining when or why two DNS labels (or even FQDNs)
   should be considered "the same", is simply not in scope for the DNS
   protocol.  We presuppose those decisions are made elsewhere and that
   the DNS needs to deliver behavior in conformance with that external
   decision.  In particular, we're talking about creating a property or
   association among a set of DNS names as "sameness" or "alias-of", but
   the correspondence between that set and any set of human- or
   application-visible strings is created outside of the DNS database
   and protocol.  Language Variant Tables [RFC3743] serve as guides for
   registration policy, but the associations they create are basically
   expressed only as policy about what can be registered, not visible to
   DNS resolvers, applications, or users.

   Accordingly, this document makes no comment on policy regarding when
   two names are "the same," what restrictions should be placed on their
   generation or use outside those imposed by the DNS protocol, or the
   ability of one approach over another to instantiate what a given user
   regards as "the same" for a language, script, culture, community,
   application, encoding, or purpose.

1.3.  Terminology

   All the basic terms used in this specification are defined in the
   documents [RFC1034], [RFC1035], [RFC2672] and [RFC3490].

   We also note that there is a wide variety of terminology in use to
   describe the issues we attempt to treat in this document, and no
   consensus on which apply under what conditions.  Terms for "a set of
   domain names that somehow need to be treated as similar" include
   "bundle," "variants," or "clones".  As uniformity of terms is one of
   the goals of any work on this topic in the DNS, we try not to add to
   the confusion in the problem statement but can't claim to have
   finalized a recommendation in early versions of this document.

2.  Problem Statement

   From the point of view of the DNS, a number of attributes suggest
   themselves as important dimensions for evaluating what "the same"
   might mean.

   One question is exactly what is to be defined as "the same".  Are
   the end results to be identical, and if so from what perspective:
   that of the recursive resolver?  The application?  The human
   consumer of content?  Is it enough that lookups on the FQDN portion
   of an email address result in the same A or AAAA records, or does
   some intermediate mapping need to be maintained between MX records
   in the resolution chain?  What about the FQDN portion of a URL handed
   back to an application, or in resolution processes that include
   multiple lookups of records that may include FQDNs?  Do there need to
   be general rules specified for the handling of FQDNs in RDATA of
   present and future RRtypes?

   Another question is the behavior of multiple names with respect to
   one another: is it enough to define one as "canonical" or
   "preferred," with the others considered as "variants" that are
   transformed to the "preferred" form?  Or is there a real need for
   multiple names to be "equivalent", interchangeable, with none
   considered "preferred" over the others?  (We note here that no
   requirement for complete interchangeability or identity has been
   articulated, except anecdotally, and such equivalence would be
   extremely difficult to define in the DNS.)

   In addition, the tree structure of the DNS requires that we consider
   the behavior of "identical" names across multiple zones in the
   hierarchy.  Are mappings to be maintained in names more than a level,
   or two, deep?  If so, with what characteristics, and what
   characteristics are required for scalability?

   A further question arises with respect to how applications should
   interact with alias-specific DNS behavior.
   A basic requirement would seem to be "First, do no harm," or in
   other words, any extensions to DNS protocol in support of the desired
   "alias" behavior should not interfere with applications that expect
   to do such interpretation on their own.  This concern is based in the
   expectation that DNS is simple and predictable, operating strictly as
   infrastructure under the process of creating "the user experience,"
   not as part of it.

   A key point in evaluating these questions is that DNS is a lookup-
   based protocol.  A DNS name either resolves, or not.  There's no
   search function in the DNS, no fuzzy match.  It provides lookup on a
   specified name.  This is critically important because it means that
   any work to define any kind of "sameness" that can't be expressed as
   a lookup, such as selection among a set of candidate names for which
   to return results, must be done elsewhere than in the DNS.

2.1.  Registration of Domain Name Variants

   To some degree, issues of "sameness" or creating an association among
   a set of names have existed around the use of domain names from the
   beginning.  Points where the behavior of DNS labels has collided
   with expectations around the behavior of words have included DNS
   handling of case sensitivity, the kind of transformations a human
   expects to "just work" around "try-this.example" vs.
   "trythis.example", and continuing frustration that "confusingly
   similar" names can be delegated to different parties by DNS
   registries.  However, the introduction of IDN has provided a forcing
   function in that it has added visibility for a wider variety of
   issues along these lines, and possibly the urgency of dealing with
   them for large numbers of users.

   A need has been identified in connection with the introduction of IDN
   for defining how "variants" might behave as DNS names.  Specifically
Specifically 307 defining "variant" is a matter for experts, but it's generally 308 conceded that recognition and careful management of cases where 309 multiple names are associated together as "variants" in the 310 expectation or preference of users are important; without such 311 management of grouped domain names, security risks may be increased 312 and the quality of user experience may be compromised. [RFC3743] 313 developed by JET (Joint Engineering Team) gives one possible solution 314 of how to manage registration of a domain name, intended to be 315 applied to the script and usage common across Chinese, Japanese, and 316 Korean users. [RFC3743] proposed an algorithm which will allocate a 317 group of names, consisting of a domain name and its variants, to the 318 same domain holder. It means that the domain holder will get control 319 of the domain name and its variants. [RFC4290] suggests the practice 320 in [RFC3743] to be used in registrations of internationalized domain 321 names. But [RFC3743] and [RFC4290] do not define how, exactly, these 322 bundles of names are to be treated by the registry or the DNS in 323 order to obtain the desired "identical" behavior. [RFC4690] said 324 that the "variant" model introduced in [RFC3743] and [RFC4290] can be 325 used by a registry to prevent the most negative consequences of 326 possible confusion, by ensuring either that both names are registered 327 to the same party in a given domain or that one of them is completely 328 prohibited. The principles described in [RFC3743], [RFC4290] and 329 [RFC4690] have been accepted by many registries. But the technical 330 details of how to guarantee that a bundle of domain names are 331 "identical" in the DNS remain unspecified. 333 2.2. 
Identical DNS Resolution for Bundled DNS Names 335 To some extent, the desired behavior can be described: "identical DNS 336 resolution" means that the process of resolving two domain names will 337 end with the same result, in most cases the same IP address. In the 338 history of DNS protocol development, there have been two attempts to 339 specify such "identical resolution" behavior:CNAME[RFC1034] which 340 maps or redirects itself, and DNAME[RFC2672] which maps or redirects 341 its descendants. In the case of bundles or groups of names, however, 342 some operators have asserted they need identical DNS resolution at 343 all levels' domain names, including the domain name itself and its 344 descendants. As alluded to above, registries are left with ad hoc 345 provisioning and database management mechanisms for managing variant 346 names, with some help from existing DNS protocol mechanisms for 347 mapping labels or FQDNs to each other. However, some are finding the 348 existing mechanisms to have unsatisfactory limitations; they are 349 seeking more guidance on the use of existing mechanisms, and perhaps 350 the addition of new ones in the DNS protocol. 352 2.3. Character Variants 354 Many defined scripts as used in many different languages have 355 "character variants" included. There is no uniform definition of 356 variants, and in fact their characteristics differ widely, but it's 357 possible to define some. For example, the definition of variant 358 characters in the JET Guidelines [RFC3743], intended for use with the 359 CJK language/script communities, is roughly this: One conceptual 360 character can be identified with several different code points in 361 character sets for computer use. In UNICODE definitions of some 362 scripts, including Han (chinese), some characters can be identified 363 as "compatibility variants" of another character, which usually 364 implies that the first can be remapped to the second without the loss 365 of any meaning. 
   In this document, variant characters are two or more
   characters that may be similar in appearance or identical in meaning
   (similarity in appearance is not required by the definition but often
   occurs).

   With the introduction of IDNs in the DNS, perhaps most prominently in
   the root zone, decisions about how to deal with IDN variants are a
   significant challenge ahead of us.  We describe here a couple of
   examples, Chinese and Greek; comparable situations exist in Arabic,
   Cyrillic, and others.

2.3.1.  An example: Simplified and Traditional Chinese

   For example, the IDN TLD "China" (U+4E2D U+56FD) and its variant
   (U+4E2D U+570B) are in the root today.  The first one (U+4E2D U+56FD)
   can be considered the "original" IDN TLD and the second one (U+4E2D
   U+570B) can be considered the IDN TLD "variant".  Ideally, it should
   be possible to treat the original IDN TLD and its IDN TLD variant as
   "identical" for purposes of DNS resolution, in a way similar to the
   case mapping most DNS users take for granted.  However, this analogy
   is a bit perilous, and turns out to be hard to use as a guide to what
   behavior is actually desirable, not least because it's not fully
   consistent even within the DNS.

   At this writing, four Han script IDN TLDs are in the root, including
   two pairs comprising a Traditional Chinese name and its Simplified
   counterpart.  These operators will, in an ideal world, be able to
   share some operational experience around implementation of registry
   policy regarding managing multiple DNS trees as "the same".

2.3.2.  An example: Greek

   In Greek, almost every word has the "tonos" accent sign, but where it
   is placed (on which character) can vary.  Further, some words end in
   a final sigma, which is represented differently from a sigma
   appearing elsewhere in the word.
   If a registry wishes to be able to enforce
   the association among all of the domain names that correspond to a
   "word" in Greek, with all its possible Unicode strings, some
   mechanism must be used to enumerate the "variant" names and tie them
   together.  This makes sense from the human factors perspective:
   depending on how the user types something, results may include a
   different domain from what was expected, although the user may have
   the firm belief that "the same word" was input in multiple cases.

   As an example, the domain names "xn--0xadhj4a.gr" and
   "xn--0xaafjl.gr" appear to a native speaker/reader of Greek to
   represent "the same word," in a sense very much like the case
   insensitivity that native users of Latin script take for granted in
   the DNS.

2.3.3.  An Example: Arabic

   [STW: to be added]

2.4.  Use of Variants

   It's reasonable to pose the question at this point, without
   necessarily being able to answer it yet: what is the ideal or
   intended impact on applications and end users of solving the issues
   identified so far for registries?  Ultimately, simplifying the
   provisioning side may result in the same semantics as we have today
   for zones maintained in parallel, but for less work.  However, we
   later assert a proposed requirement that synthesizing the same record
   as a query would have obtained from an enumerated parallel tree isn't
   enough-- that the property of association or "sameness" we're
   creating with specific mechanisms needs to be useful in some specific
   way to the consumer of the data.

   The trigger for raising the questions discussed here is based in user
   expectations that one name, in certain circumstances to be determined
   and somehow encoded by humans, can be treated as interchangeable with
   another with regard to a particular context or activity, with
   resolution of domain names as part of the context or activity.
   However, it's useful to note here that satisfying that set of user
   expectations may or may not reasonably be done in the DNS, wholly or
   in part.

   There are two arguments for placing functionality that links one name
   to another as "the same" in the DNS.  Their validity is not yet
   determined, but they amount to:

   1.  The expectation that two or more names be "the same" is often
       expressed as a desire to register the associated names as a
       "bundle" or otherwise link them as domain names.  This is because
       the domain name in a URL or email address is often presented to
       the user as semantically meaningful, based on strings used to
       derive DNS labels-- proper names, "words," etc.  This brings such
       concerns to the attention of providers of domain names, and
       suggests at least exploring how to answer the question near where
       it's asked-- i.e., the registry.

   2.  The desire for names on the Internet to act like words is often
       service-independent; users want to be able to use identical
       strings in the course of invoking multiple services that seem to
       be related, such as going to a webpage and then sending email to
       an address in "the same" domain (probably an FQDN).  It's been
       noted that people are very comfortable with a certain amount of
       fuzziness about "alternative spellings" and assorted other
       variations within the notion of "sameness", but they nonetheless
       often want such an association to exist.  In cases where a set of
       variant strings is parseable in the application, has
       corresponding A-labels that can be looked up in the DNS, etc., but
       only a subset can be typed on the user's input device or rendered
       on the user's screen, the association may be necessary to the
       successful completion of the activity the user is attempting.
       This argues in turn for some mechanism that is not dependent on
       the specific service, protocol, or application involved, since
       leaving it up to service-specific mechanisms, or conventions in
       the use of DNS records or other mechanisms not really intended
       for the purpose, leads to confusion and inconsistency.

3.  Operational Considerations

   Any change to a mature infrastructure protocol such as DNS needs to
   be informed by consideration of the tradeoffs among providing the
   associated service, using the service, and possibly conflicting means
   of offering comparable functionality.  In the case of DNS, this
   requires that we look at provisioning (populating zones and the
   mechanics of authority servers responding to queries), service by
   intermediate and client resolvers, and related capabilities provided
   with existing facilities in the DNS and in applications.

3.1.  Zone Provisioning and Authority Servers

   The initial motivation for discussion of support for aliases in the
   DNS was provided by operators of top-level domain (TLD) registries.
   The problem facing them lies in scaling the provision of "bundles" of
   names that users expect to be treated "the same", as in the examples
   previously described.

3.1.1.  Provisioning of 'aliases' in the registry

   The most obvious way to provision multiple names as "the same" is to
   delegate each separately, and then maintain the contents of the
   delegated zones together, from the same backend database or by some
   similar mechanism.  This has the advantage of requiring no new
   technology; it can be done, and is done today, entirely with
   provisioning logic and registry policy.

   However, it doubles the work and the number of records required.  If
   provisioning isn't done carefully, errors can arise, leaving
   inconsistencies.
   Provisioning multiple trees also does nothing to link the resulting
   names directly; there is no property of "association" or isomorphism
   created in the DNS that corresponds to user or application
   expectations for "sameness". There is no way to tell, from resolving
   a name in one tree, that it's part of a set or bundle of related
   names.

   Separate provisioning also poses a limitation for some registry
   operators in that there is no way to verify that the trees are being
   maintained in parallel without exhaustively walking the zones, which
   may be large or nested to multiple levels. In the case, for example,
   of a zone A.B.C.example.com, in which each of A, B, and C is derived
   from a string with a single character variant, eight zones (2 x 2 x 2
   combinations) must be maintained in parallel and possibly made
   available for audit by the authority over example.com, depending on
   its delegation policy.

3.1.2.  Impact of special mechanisms

   Once we begin to consider mechanisms for maintaining parallel zones
   or "aliases", we need to look at how the "alias" or association
   property is created and where the burden of maintaining it lies.
   Proposed mechanisms are described below.

   Existing mechanisms besides the simple, straightforward provisioning
   of zones that are identical except for the ownernames of
   corresponding records include wildcards, CNAME, and DNAME. These are
   discussed below; here we note that they require special processing by
   the authority server in order to synthesize responses that are
   supposed to be the equivalent of simply providing the parallel zones
   by one-to-one enumeration.

3.2.  Recursive Resolvers

   Another area where it's necessary to review requirements and impacts
   of changes to the DNS is in resolver expectations and behavior, given
   that recursive resolvers do most of the work of getting the data out
   of DNS that provisioning activities put into it.
   In practice, much of the work of special processing falls to
   resolvers. In particular, any scheme that can result in multiple
   queries and some need to chain the answers or disambiguate multiple
   answers is going to make more work for resolvers, and is going to
   need to specify that work in careful detail. Any ambiguity or lack
   of precision in specifying the use of "aliases" will propagate back
   to applications, and quite possibly leave application writers and
   users worse off than they were without DNS mechanisms intended to
   "help".

   Ideally, any new RRtypes defined to support "aliases" would be
   provisioned on the authority server and require no special
   processing, which would make them transparent to intermediate
   resolvers. However, depending on how much such RRs and their
   processing need to be visible to the application to be effective,
   this may not be possible.

3.3.  Applications

   The most complex part of the analysis of costs and benefits of
   defining new technology for support of "aliases" by DNS is in
   determining what applications would do with such new mechanisms and
   how it would help to have them. In particular, it is critically
   important not to simply provide additional complexity, even in the
   name of making provisioning on the server side easier, unless there's
   some clear benefit to it for the ultimate client of the DNS as well:
   the user who is trying to "do something" on the Internet.

   Such a clear benefit could come from the ability, alluded to above,
   to provide a facility that was anchored in the DNS and so did not
   have to be re-invented anew for each application or protocol that
   wished to have user-transparent access to the ability to reduce
   "aliases" to a canonical domain name without necessarily being aware,
   a priori, that the name was part of a set that could be deemed "the
   same".
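
   As a purely illustrative sketch of what reducing "aliases" to a
   canonical domain name might look like from an application's point of
   view, the following Python fragment follows alias links in a toy
   in-memory table. The table ALIASES, the function canonicalize(), the
   bound MAX_CHAIN, and the example names are all assumptions made for
   this sketch; none corresponds to an existing or proposed DNS
   mechanism or API, and no DNS queries are performed.

```python
# Toy illustration only: ALIASES is a hypothetical in-memory table that
# stands in for whatever "alias" data a real mechanism would provide.
ALIASES = {
    "colour.example.": "color.example.",
    "couleur.example.": "colour.example.",
}

MAX_CHAIN = 8  # arbitrary bound on alias links, chosen for this sketch


def canonicalize(name, aliases=ALIASES, max_chain=MAX_CHAIN):
    """Follow alias links until a name with no alias entry is reached."""
    current = name.lower()  # comparison here is ASCII case-insensitive
    seen = {current}
    # max_chain links may be followed; one extra pass detects the end.
    for _ in range(max_chain + 1):
        target = aliases.get(current)
        if target is None:
            return current  # no further alias: treat as canonical
        target = target.lower()
        if target in seen:
            raise ValueError("alias loop at " + target)
        seen.add(target)
        current = target
    raise ValueError("alias chain longer than %d links" % max_chain)


if __name__ == "__main__":
    print(canonicalize("couleur.example."))
```

   Even this toy version has to bound the chain length and detect
   loops, which hints at the kind of behavior a real mechanism would
   need to specify precisely.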
   An example used more than once in discussion is provided by SSL, as a
   protocol that uses domain names without necessarily using the DNS
   protocol per se. SSL certificates are tied to one domain name. It
   would be helpful to applications to have a non-protocol-specific way
   to identify securely cases where multiple domain names can be
   canonicalized to the domain name used for an SSL certificate.
   Currently HTTP has such an ability, but it's considered awkward to
   use and does not help writers or users of other application
   protocols.

   An important characteristic of such a solution for applications,
   however, is that the writer and user be able to tell when such a
   mechanism was invoked in the DNS, to avoid interference among
   multiple possible ways to find "aliases" and compare them. This in
   turn implies a fair amount of complexity to be inflicted not only on
   the DNS protocol but also on API/library writers seeking to use such
   new facilities, particularly given the caution above that DNS is a
   lookup protocol and must be given precise sequences of bits to look
   up.

   Another characteristic of an "aliases" mechanism of interest to
   application writers is the difficulty, and therefore the likely speed
   and breadth of deployment, of such a DNS-based mechanism for
   canonicalizing aliased names. DNS is notorious, as an aging
   infrastructure protocol, for the long tail of deployment of
   significant protocol features. Again, a feature can be designed to
   be fairly easy to deploy, but without an incentive such as faster
   application development or more secure applications, it still may not
   see wide uptake even after it's present in current code bases.
   An additional assumption often made needs to be examined here, as
   well: that applications, and application writers, have
   straightforward, well-defined ways of interacting with the DNS, into
   which new functionality can obviously and straightforwardly be added.
   This can be true, as in the case of a query for a specific RRtype at
   an unambiguously determined name by an application designed to find
   and take advantage of the data represented in records of that RRtype.
   However, it does not have to be, and often isn't: there's no standard
   DNS API, and no standard abstraction for applications to interact
   with the DNS. There is confusion about how to use DNS effectively,
   and over the actual behavior of existing "alias" mechanisms such as
   CNAME. Adding new technology is likely to be accompanied by
   challenges not only in getting it deployed to the installed base, but
   in getting its uses clearly documented for application writers.

4.  Proposed Requirements

   These observations and examples, along with general discussion to
   date, lead to the following tentative set of actual requirements.

   1.  Any mechanism proposed in the DNS to support "aliases" or
       multiple names as "the same" MUST be workable for DNSSEC-signed
       zones.
   2.  Any mechanism proposed in the DNS to support "aliases" or
       multiple names as "the same" MUST be "backwards compatible," in
       that it MUST NOT change the established behavior of existing
       RRtypes and query processing.
   3.  Any mechanism proposed SHOULD NOT require more overhead of
       registries, authoritative servers, or clients than existing
       mechanisms for approximating the desired behavior, such as
       provisioning of multiple parallel trees or CNAME processing. If
       a new solution is more work than existing mechanisms, imperfect
       as they may be, it's not clear where the incentives would lie to
       deploy it.
       This is particularly a concern for implementors and application
       developers.
   4.  Any mechanism proposed MAY require new RRtypes and special
       processing for them.
   5.  Any mechanism proposed MUST NOT only reduce costs of generating
       and providing authoritative service for DNS zones. It would be
       too easy to reduce costs on the authority server provider while
       adding costs elsewhere, particularly in terms of complexity.
       Given the central importance of DNS service to Internet
       operations, any change undertaken to lower the cost to providers
       may be useful, but should not simply shift costs to DNS users,
       whether applications or end users.

5.  Possible Solutions

   Currently, there are several possible mechanisms to support identical
   DNS resolution of "bundled" or "variant" names as "aliases" in the
   DNS. Existing mechanisms in the DNS include CNAME and DNAME. In
   addition, as described briefly above, registry operators have a great
   many techniques for applying policy to what names can be registered,
   and provisioning technology to how they are instantiated in the DNS,
   in support of keeping "variant" names behaving similarly to each
   other, or in preventing the use of such variants as might be
   considered confusing or dangerous.

   In addition, there are new proposals for DNS protocol to support
   "aliases" in the DNS as part of the desired behavior of "variant"
   names: Names direction [BNAME], and "Zone clone".

   All of the solutions have their advantages and disadvantages. In
   particular, there are a couple of limitations they all share. Every
   mechanism existing or proposed to support "aliases" in the DNS
   requires that one name be designated as the "canonical" name
   ("preferred" in the terminology of the JET variant mechanism) and any
   others bundled with it are to be considered "variants" or "aliases".
   The only known way to enforce a symmetrical or equivalent association
   is via careful registry provisioning within and across domains. In
   addition, the different "alias" mechanisms differ in subtle ways that
   have to be carefully reviewed against the desired behavior of the DNS
   in support of different types of "variants".

5.1.  Mapping or Redirection of Domain Names

5.1.1.  Mapping itself (CNAME)

   It was recognized as part of the original specification of the DNS
   that a host can have many names; in fact this expectation predates
   the DNS, referring to the earlier specification of host names. In
   the simplest case for "aliases", Internet users need these multiple
   names to be resolved to the same IP address by a DNS server. The
   CNAME record [RFC1034], where "CNAME" is an abbreviation for
   "Canonical Name", is a way to designate aliases of the "real" or
   canonical name of a host. In some cases, CNAME can be used to
   produce the necessary association among a bundle of variant domain
   names. But the CNAME only maps itself, not its descendants; in fact
   it is defined to not have descendants, as it is the only name at a
   node in the DNS tree and can't exist at the same name as a
   delegation. In the case of IDN variants, however, it is often
   desirable that the name map both itself and its descendants.

   In terms of deployment and availability, however, it's useful to
   note that CNAME is already part of the installed base of DNS
   authority servers and intermediate mode resolvers. Using it for this
   purpose requires description of how to do it and how it behaves, but
   that already is available. There are no issues of uptake or
   backwards compatibility or new code or new documentation.

5.1.2.  Mapping its descendants

   In order to maintain the address-to-name mappings in a context of
   network renumbering, a DNAME record or Delegation Name record defined
   by [RFC2672] was invented to create an alias for all its subdomains.
   In contrast, the CNAME record creates an alias only of a single name
   (and not of its subdomains). As with the CNAME record, the DNS
   lookup will continue by retrying the lookup with the new name. If a
   DNS resolver sends a query without EDNS [EDNS0], or with EDNS version
   0, then a name server synthesizes a CNAME record to simulate the
   semantics of the DNAME record. A DNAME record is very much like the
   CNAME record, but while the CNAME record only applies for one name,
   with a DNAME record one can create aliases for all the records for
   its subdomain.

   DNAME can be considered slightly less widely deployed than CNAME
   for the EDNS0 compatibility reason described above, but it's been
   defined in the DNS for quite some time, and includes a backwards
   compatibility mechanism in the CNAME synthesis just described, so use
   of DNAME does not rely on deployment of resolver code capable of
   special processing for DNAME; it relies entirely on authority server
   capability.

5.1.3.  Mapping itself and its descendants

   Bundling of "variant" strings or names as domain names, possibly
   along with other use cases not yet identified, requires the ability
   to map a whole tree of the domain space to another domain. The
   current DNS protocols do not support this function. A new DNS
   resource record [BNAME] has been proposed to deal with this problem.

   The advantage of BNAME is that it would enable a class of "aliasing"
   behavior that some operators find desirable, particularly in
   preference to some of the provisioning overhead they describe having
   to deploy to support potentially large numbers of "bundles" of
   variants at multiple levels of the DNS tree.
   The disadvantage is that it may not provide the behavior people
   really want while requiring the time and resources to code and deploy
   any new DNS facility.

   Alternatively, a proposal has been made that would leave CNAME as
   already specified, but eliminate the constraint that a CNAME must
   be alone at a node in the DNS tree. This would avoid any coding and
   deployment overhead associated with new RRtypes, while obtaining the
   desired behavior. Concerns expressed about it, however, include the
   possible (but not yet specified) effort required for backwards
   compatibility to avoid harm to implementations that expect, and use,
   the old behavior.

   Both of these mechanisms would require both authority server and
   resolver changes to enable the new capability.

5.2.  Zone Clone

   The proposal of "zone clone" or "dns shadow" is an alternative
   solution for a higher level of support than the DNS currently
   provides for "alias" behavior across zones. In this scheme, a new
   RRtype, SHADOW, is specified; it can exist at a zone apex and can be
   used to define "clones" or "shadows" of the zone content so that
   records in the zone are reachable via lookups from multiple
   delegations. This mechanism varies fundamentally from
   CNAME/DNAME/BNAME in that it creates a local copy on each cooperating
   authoritative server that has the original zone, reachable by the
   names specified in the SHADOW RR. Its scope, then, is the zone as
   maintained by an authoritative server rather than a single RRset
   (even one corresponding to a delegation).

   This scheme has the advantage that it allows a SHADOW zone to be used
   in all the same contexts as the canonical or underlying zone,
   including contexts where a CNAME or DNAME (or, presumably, a BNAME)
   cannot appear, such as in the RDATA of certain RRtypes.
   Of the proposed DNS protocol mechanisms, it probably comes closest to
   the behavior some have requested as "equivalence," where none of the
   bundled or SHADOW names is canonical or preferred over the others.
   It does entail an unknown level of effort to implement and support.

6.  IANA Considerations

   There are no obvious IANA considerations in this memo; we reiterate
   that the determination of which names are to be considered "the same"
   is explicitly out of scope.

7.  Security Considerations

   [STW: Looking for examples for this section.]

   Unsolved issues that will have to be considered in the definition of
   what "the same" means for the DNS include the implications for
   DNSSEC, and whether "identical" resolution includes DNSSEC validation
   in the expected "identical" behavior.

   Another area of possible peril includes SSL certificates, "Host"
   headers as seen by web servers, and other security-relevant data
   often associated with domain names. It will have to be considered
   whether, and how, the "sameness" property maps into the expected
   behavior of security-related protocols that use domain names,
   particularly given that it's unlikely that all operators will ever
   use the same set of constructs (whether in the DNS or elsewhere) to
   signal whether different "names" are "the same" for purposes of the
   function of a particular application or protocol.

   In addition, there is a large cluster of security risks at the user
   and application levels that motivate significant portions of the
   interest in what it means to treat a set of names as "aliases" of
   each other. One set of issues is around the expectation that two
   strings are seen as "different" by the user in some obvious way (such
   as visually) but need to be treated as "the same".
   The potential for user confusion and subversion is not hard to
   imagine in cases where two visually distinct strings are nonetheless
   likely to be expected by the user to behave "the same" in some
   functional way. This is the case we have attempted to address here.

   There is a separate but complementary set of issues that arise around
   cases where strings that look "the same" should nonetheless be
   treated as different: the so-called "confusing visual similarity"
   problem. The easy example is substituting the Unicode codepoint for
   a character in one script, or a string of them, for the Unicode
   codepoints for similar-looking characters in an altogether different
   script. This has a different set of potential risks to users, and
   has not been discussed here. It's often closely related to the
   "alias" issue we have attempted to deal with, however, which poses
   risks of its own to analysis of either subject.

8.  Acknowledgements

   Most of the ideas here and much of the text are taken from
   discussions on the DNSEXT and DNSOP WG mailing lists. Particular
   help is acknowledged from the authors of the proposed solutions
   drafts, and from the many contributors to the IDNAbis work and its
   underpinnings. Special thanks at the intersection of DNS and IDNAbis
   are owed to Patrik Faltstrom, Cary Karp, John Klensin, Vaggelis
   Segredakis, and Andrew Sullivan for their patient explanations.

9.  Change History

   [[anchor28: RFC Editor: Please remove this section.]]

9.1.  draft-yao-dnsext-identical-resolution: Version 00

   o  Domain Name Identical Resolution Problem Statement (initial
      attempt)

9.2.  draft-yao-dnsext-identical-resolution: Version 01

   o  Expanded introduction
   o  Added Greek example
   o  Added some detail to descriptions of proposed solutions

10.  References

10.1.  Normative References

   [ASCII]    American National Standards Institute (formerly United
              States of America Standards Institute), "USA Code for
              Information Interchange", ANSI X3.4-1968, 1968.

   [EDNS0]    Vixie, P., "Extension Mechanisms for DNS (EDNS0)",
              RFC 2671, August 1999.

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2136]  Vixie, P., Thomson, S., Rekhter, Y., and J. Bound,
              "Dynamic Updates in the Domain Name System (DNS UPDATE)",
              RFC 2136, April 1997.

   [RFC2672]  Crawford, M., "Non-Terminal DNS Name Redirection",
              RFC 2672, August 1999.

   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
              "Internationalizing Domain Names in Applications (IDNA)",
              RFC 3490, March 2003.

   [RFC3597]  Gustafsson, A., "Handling of Unknown DNS Resource Record
              (RR) Types", RFC 3597, September 2003.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", RFC 3629, November 2003.

   [RFC3743]  Konishi, K., Huang, K., Qian, H., and Y. Ko, "Joint
              Engineering Team (JET) Guidelines for Internationalized
              Domain Names (IDN) Registration and Administration for
              Chinese, Japanese, and Korean", RFC 3743, April 2004.

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements",
              RFC 4033, March 2005.

   [RFC4034]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "Resource Records for the DNS Security Extensions",
              RFC 4034, March 2005.

   [RFC4035]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "Protocol Modifications for the DNS Security
              Extensions", RFC 4035, March 2005.

   [RFC4290]  Klensin, J., "Suggested Practices for Registration of
              Internationalized Domain Names (IDN)", RFC 4290,
              December 2005.

   [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and
              Recommendations for Internationalized Domain Names
              (IDNs)", RFC 4690, September 2006.

10.2.  Informative References

   [BNAME]    Yao, J., Lee, X., and P. Vixie, "Bundle DNS Name
              Redirection", draft-yao-dnsext-bname-01.txt (work in
              progress), December 2009.

   [CNAME-DNAME]
              Sury, O., "CNAME+DNAME Name Redirection",
              draft-sury-dnsext-cname-dname-00.txt (work in progress),
              April 2010.

   [IDN-TLD-Variants]
              Yao, J. and X. Lee, "IDN TLD Variants Implementation
              Guideline", draft-yao-dnsop-idntld-implementation-01.txt
              (work in progress), November 2009.

   [RFC2672bis]
              Rose, S. and W. Wijngaards, "Update to DNAME Redirection
              in the DNS", Internet-Draft
              ietf-dnsext-rfc2672bis-dname-17.txt, June 2009.

   [SHADOW]   Vixie, P., "Use of DNS to Carry Configuration Metadata
              Concerning Automatic Replication of Zones",
              draft-vixie-dnsext-dnsshadow-00.txt (work in progress),
              February 2010.

Authors' Addresses

   Suzanne Woolf
   Internet Systems Consortium, Inc.
   950 Charter St.
   Redwood City, CA 94063

   Phone: +1 650 423 1333
   Email: woolf@isc.org

   Xiaodong LEE
   CNNIC
   No.4 South 4th Street, Zhongguancun
   Beijing

   Phone: +86 10 58813020
   Email: lee@cnnic.cn

   Jiankang YAO
   CNNIC
   No.4 South 4th Street, Zhongguancun
   Beijing

   Phone: +86 10 58813007
   Email: yaojk@cnnic.cn