Network Working Group                                          G. Huston
Internet-Draft                                                     APNIC
Intended status: Experimental Protocol                        O. Kolkman
Expires: May 20, 2014                                         NLnet Labs
                                                             A. Sullivan
                                                               Dyn, Inc.
                                                               W. Kumari
                                                            Google, Inc.
                                                       November 18, 2013


   Using Test Delegations from the Root Prior to Full Allocation and
                               Delegation
                 draft-kolkman-root-test-delegation-02

Abstract

   The delegation of certain strings as generic Top Level Domains
   (gTLDs) may cause stability and security issues if such strings have
   been used in private environments prior to their delegation.  Test
   delegations can be used to enable empirical research on the extent of
   the potential for name collision.  This document describes one such
   approach to an empirical testing framework for name collision, and
   considers the applicability of this approach to detect other forms of
   name collision.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on May 20, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction and Motivation
     1.1.  Scire est mensurare
   2.  Terms and Conventions Used in this Memo
   3.  Principle of Operation
     3.1.  Measurements Servers and Zones
     3.2.  Query Generation
     3.3.  Sampling
   4.  Evaluation
   5.  Name Resolution Considerations
   6.  Security Considerations
   7.  References
   Appendix A.  Acknowledgements
   Authors' Addresses

1.  Introduction and Motivation

   [[The authors are aware that this version of the document is not
   fully consistent.  However, they would value feedback on whether the
   idea is worth further study.  A mailing list to discuss this draft is
   collisions@lists.dns-oarc.net.]]

   While certain names have been reserved for internal or private use
   [RFC6761], there is evidence [SAC45] that various sites connected to
   the Internet have used other names for internal purposes.  In fact,
   the Multicast DNS specification [RFC6762] advises not to use .local
   for private use and observes: "the following top-level domains have
   been used on private internal networks without the problems caused by
   trying to reuse ".local." for this purpose:

      .intranet.
      .internal.
      .private.
      .corp.
      .home.
      .lan."

   In the event such names are delegated for use in the public DNS,
   there will be inevitable consequences for sites that have used those
   names.  Some of those consequences may have security implications,
   with the potential for leakage of credentials and HTTP cookies
   ([RFC6265]).  Responsible administration of the public namespace
   therefore requires careful consideration in permitting public
   delegation of any name when there are grounds to believe it is in
   widespread use as a private namespace, even though such private
   namespaces are (from the point of view of the DNS) irregular, even if
   common.

   One form of name collision involves network domains that use selected
   names as local-use top level domains, as noted in [RFC6762].  In the
   case where the same label is delegated in the global DNS as a gTLD,
   hosts in the local domain will be unable to resolve domain names in
   the context of the gTLD.  This state of name occlusion is further
   compounded by a number of scenarios where the resolution of a name is
   performed across multiple name scope domains.  This may happen with a
   mobile host (for example, when the host uses a statically defined
   "home page" in its local browser that is defined within a particular
   local scope), or even with applications such as mail delivery (in the
   case where multiple MTAs that are listed as mail servers for a domain
   reside in different name scope domains, some of which have this name
   collision between the domain and locally defined pseudo-TLDs).

   Name collision opens up the potential for misdirection, where the
   named remote point being contacted by the application may not
   necessarily be the intended service point for the transaction.
   When a host leaves the intranet environment, the host's applications
   may anticipate that the DNS names associated with a label return an
   RCODE 3 (NXDOMAIN) response, but may encounter an unanticipated
   response when the gTLD is deployed with a colliding name.  Similarly,
   a host that has an association with a named service point within the
   gTLD may encounter unanticipated responses when the host is placed
   into an intranet environment where the same name exists as a
   locally-scoped pseudo-TLD.

   There is a subtle form of interaction of names when the same name is
   placed on a local name search list.  Certain name resolver libraries
   first query the original name, and if the query returns an NXDOMAIN,
   then they apply the local search list to the original name.  When
   this process occurs in the context of a visible gTLD name colliding
   with the local name, there is the possibility of the name resolving
   in the context of the gTLD, which then bypasses the application of
   the local search list.

1.1.  Scire est mensurare

   The local use of undelegated top-level domain names is troublesome
   because it may produce different user experiences depending on the
   locally used name, the names placed in a local search list, the
   location of a given host, and the host's name resolution behaviour.

   Prudent operation of the root zone requires that deployment of new
   names in the root should not necessarily cause widespread untoward
   effects for users of the DNS, particularly when those users are
   relying on name resolution outcomes that have always been part of the
   name resolution behaviour up until this point.

   What is useful in this context is a mechanism to test whether a
   particular delegation from the root zone presents a conflict with
   widespread local use.  This memo presents a methodology for making
   such a determination.

   The methodology considered here depends on temporary delegation of
   the top-level domains in question, and the use of a domain under an
   existing TLD in order to capture and compare queries generated by a
   large number of querying sources under the control of the experiment.

2.  Terms and Conventions Used in this Memo

   The mechanism outlined here is intended to complement the analysis
   already performed in "Name Collision in the DNS" [namecollision].  We
   therefore use the terms defined in section 1.1 of [namecollision]
   whenever appropriate.

   Note that the evaluation methodology outlined here is intended to be
   complementary input to a risk analysis, e.g., as found in
   [namecollision]; risk trade-offs are likely to include factors other
   than the effects measured here.

3.  Principle of Operation

   The goal of the experiment is to assess whether there is significant
   existing use of a given candidate string ("CandidateTLD").

   We propose the use of a software test that is executed by a large
   number of end hosts drawn from across the entire Internet.  The
   execution of this test will cause the end host to attempt to retrieve
   a small set of URLs.  This will trigger a set of DNS queries to
   resolve the domain name part of each URL, and subsequent HTTP queries
   to retrieve the object in the case that the DNS name is successfully
   resolved to an IP address.  Both the DNS queries and the HTTP
   requests are answered by dedicated servers that analyse the queries
   and requests they receive and match them to the original set of
   queries that were used by the end host.  This will allow us to infer
   whether the host is located in a context where there is name
   collision with the CandidateTLD.  In this section we describe the
   query generation, data collection, and analysis.

   This methodology is based on earlier work by APNIC [Method].

3.1.
Measurements Servers and Zones

   In addition to the use of CandidateTLD, the methodology uses an
   additional name, delegated from a 'common' existing TLD
   ("TestName.ExistingTLD"), to the experiment's server.

   The experiment's name server is authoritative for CandidateTLD and
   TestName.ExistingTLD.  The name server will respond to A and AAAA
   queries for any name within "TestName.CandidateTLD" with the IPv4 or
   IPv6 address of the experiment's HTTP server.  The name server will
   respond to queries for any other name within CandidateTLD with RCODE
   3 (Name Error or NXDOMAIN).  The name server will respond to A and
   AAAA queries in TestName.ExistingTLD with the IPv4 or IPv6 address of
   the experiment's HTTP server.

   The experiment's HTTP server will respond with a "200 OK" for a
   request for the object "1x1.png" in TestName.CandidateTLD and in
   TestName.ExistingTLD.  The server will respond with "404 Not Found"
   for any other object name.

3.2.  Query Generation

   The TestName is a synthetic name with no intentional semantic
   meaning, generated in such a way as to reduce the likelihood of
   collision with any existing delegated name.  It is suggested that it
   be generated by using the hex encoding of a randomly selected integer
   value between 1,000,000,000 and 2,000,000,000.  The name must not be
   already delegated from the root or in the ExistingTLD.

   Each query set constitutes one "measurement".  A "measurement" is
   identified by a measurement identifier (, syntactically a
   valid hostname) that is uniquely generated for each instance of a
   measurement.  This ensures that when the domain name is resolved, and
   when the named object is retrieved, there is no occlusion of the
   interaction with the experiment's services because of local name or
   web object caches.  The set uses the following URLs:

   A:  http://-a.TestName.CandidateTLD/1x1.png?
       -a

   B:  http://-a.TestName.ExistingTLD/1x1.png?
       -b

   C:  http://results.TestName.ExistingTLD/1x1.png?
       ?za=&zb=

   The A URL is intended to test if CandidateTLD is a locally used name;
   in other words, whether local use of CandidateTLD occludes visibility
   of CandidateTLD as a gTLD.  The DNS query for the A Fully Qualified
   Domain Name (FQDN) will only be received by the authoritative name
   server for this name if there is no local name resolution function
   that uses the CandidateTLD name as a locally defined pseudo-top level
   domain.

   The B URL is intended to function as the control test for the
   experiment, and the use of ExistingTLD in B is intended to operate as
   a name that does not collide with a local use context.

   As the experiment uses the absence of a fetch of the A URL to infer
   the name resolution behaviour of the location where the measurement
   is being performed, it is necessary to ensure that the measurement
   code has run to completion.  The measurement code starts a timer at
   the start of its execution.  Upon expiration of the timer, or when
   both the A and B objects have been successfully retrieved, the code
   will schedule the retrieval of the C URL.  The arguments to the C URL
   include the client-side measurement of the elapsed time to retrieve
   the A and B URLs.

3.3.  Sampling

   One way to perform this measurement is to embed the measurement in
   web content, using a scripting language.  When the web content is
   loaded the script is activated, and the measurement sequence is
   performed.

   One way to distribute this content to clients to perform the test is
   via an online (ad) campaign.  If the measurement script is enclosed
   within the ad itself, then there is no reason for the campaign
   actually to cause users to click through in order to perform the
   test.  Behaviour of this sort is trivially achievable with a number
   of available online advertising systems.
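
   As an illustration only, the query set of Section 3.2 and its
   completion rule might be modelled as in the sketch below.  A real
   measurement would run as a script inside the browser; this Python
   version only models the logic.  The function names, the measurement
   identifier format, the use of milliseconds, and the "null" marker for
   an object that was not retrieved are all assumptions made for the
   example and are not specified by this memo.

```python
import secrets
import uuid
from urllib.parse import urlencode


def new_test_name() -> str:
    """Hex encoding of a random integer in 1,000,000,000..2,000,000,000."""
    value = 1_000_000_000 + secrets.randbelow(1_000_000_001)
    return format(value, "x")


def new_measurement_id() -> str:
    """Unique label usable in a hostname (hypothetical format)."""
    return "m" + uuid.uuid4().hex


def build_url_set(measurement_id, test_name, candidate_tld, existing_tld):
    """A and B URLs; label suffixes follow the URL list as written."""
    host = f"{measurement_id}-a.{test_name}"
    return {
        "A": f"http://{host}.{candidate_tld}/1x1.png?{measurement_id}-a",
        "B": f"http://{host}.{existing_tld}/1x1.png?{measurement_id}-b",
    }


def should_report(elapsed_s, timeout_s, a_done, b_done) -> bool:
    """Fetch the C URL when the timer expires or both A and B are done."""
    return elapsed_s >= timeout_s or (a_done and b_done)


def build_report_url(test_name, existing_tld, elapsed_a_ms, elapsed_b_ms):
    """C URL with client-side elapsed times (assumed encoding)."""
    def fmt(value):
        # Assumption: "null" marks an object that was never retrieved.
        return "null" if value is None else str(value)

    query = urlencode({"za": fmt(elapsed_a_ms), "zb": fmt(elapsed_b_ms)})
    return f"http://results.{test_name}.{existing_tld}/1x1.png?{query}"
```

   Because the measurement identifier is freshly generated for every
   run, neither the DNS names nor the object URLs can be answered from a
   local cache, which is the property the query set relies on.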

   It is also necessary to spread the delivery of the ad to a very broad
   spectrum of clients, so the ad should be presented across all time
   zones, across all language bases, and across all geographic regions.

4.  Evaluation

   To evaluate the results, we take those measurements that return the C
   URL.  The use of the C URL ensures that we use measurement results
   where the ExistingTLD name is not being locally occluded.  We count
   the number of experiments for each of the possible combinations of
   retrieving the A and B URLs.  These combinations are:

   Not A and Not B:  This result contributes to experimental
      uncertainty.  (We know that ExistingTLD is not locally occluded,
      so the failure to retrieve B is due to other factors that are not
      being examined in the context of this measurement.)

   A and Not B:  This result indicates that the client is able to
      resolve names in the CandidateTLD in the context of the global
      DNS, but the inability to retrieve the B URL contributes to
      experimental uncertainty.  (The same reasoning about the
      ExistingTLD and local occlusion applies to this case.)

   Not A and B:  This result is an indicator that the client's use of
      CandidateTLD is probably being occluded by some form of local use.

   A and B:  This result indicates that the client is able to resolve
      names in the CandidateTLD in the context of the global DNS.

   If the CandidateTLD is in widespread private use, then we would see
   the count of "Not A and B" far in excess of the level of experimental
   uncertainty, and we can then conclude that there are locales where
   the CandidateTLD is being used in a local context.
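
   The counting step described above could be sketched as follows.  The
   record layout (the keys "got_a", "got_b" and "got_c") and the
   function names are hypothetical.  Treating the two outcomes that lack
   B as the measure of experimental uncertainty is an assumption of this
   sketch; the memo does not fix a formula or a threshold.

```python
from collections import Counter


def classify(got_a: bool, got_b: bool) -> str:
    """Map one measurement to one of the four outcome labels."""
    if got_a and got_b:
        return "A and B"
    if got_a:
        return "A and Not B"
    if got_b:
        return "Not A and B"
    return "Not A and Not B"


def tally(measurements) -> Counter:
    """Count outcomes, using only measurements that returned the C URL."""
    counts = Counter()
    for record in measurements:
        if not record["got_c"]:
            continue  # measurement did not run to completion
        counts[classify(record["got_a"], record["got_b"])] += 1
    return counts


def shares(counts: Counter) -> dict:
    """Fractions of completed measurements (assumed summary form)."""
    total = sum(counts.values())
    if total == 0:
        return {"occluded": 0.0, "uncertain": 0.0}
    uncertain = counts["Not A and Not B"] + counts["A and Not B"]
    return {
        "occluded": counts["Not A and B"] / total,
        "uncertain": uncertain / total,
    }
```

   Comparing the "occluded" share with the "uncertain" share gives a
   rough view of whether the "Not A and B" count stands out from the
   noise; where to draw the line is left to the analyst.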
   Analysis of the source IP addresses of the clients that produce the
   "Not A and B" result, and the BGP Origin AS of these addresses and
   their geolocation, may indicate if such local use is clustered in a
   particular network or group of networks, or clustered in a particular
   geography or language region.

5.  Name Resolution Considerations

   Earlier versions of this memo proposed to use this experimental
   technique to detect name search list considerations.  This section
   describes the name search list collision considerations, and
   describes some further investigation that has led to the conclusion
   that this technique would not necessarily be applicable in that
   context.

   The basic algorithm used in name resolution when search lists are
   present appears to be consistent across a number of implementations:
   various permutations of using the base name and appending individual
   values from the name search list are used as DNS queries in order to
   find a name that can be resolved by the local DNS resolver.  The
   search process stops when the DNS query returns other than an
   NXDOMAIN response.

   However, the exact order of generating these candidate names has been
   observed to vary across implementations.  To describe these
   observations it is first necessary to introduce some basic
   terminology.  There are four generic ways that name resolution
   libraries apply a search list to a "base name" in order to construct
   a set of FQDNs that are used in DNS queries:

   none  the search list is not applied to the base name.

   pre  the search list is applied to the base name, then the base name
      alone is used.

   post  the base name alone is used, then the search list is applied to
      the base name.

   always  the search list is applied to the base name, and the base
      name alone is not used.
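
   The four modes listed above can be made concrete with a small sketch.
   The function names are hypothetical, and the lookup callable is a
   simplification: it returns None to stand for NXDOMAIN and any other
   value to stand for a response that ends the search.

```python
def candidate_names(base_name, search_list, mode):
    """Ordered list of names to query for the given search-list mode."""
    suffixed = [f"{base_name}.{suffix}" for suffix in search_list]
    if mode == "none":
        return [base_name]
    if mode == "pre":
        return suffixed + [base_name]
    if mode == "post":
        return [base_name] + suffixed
    if mode == "always":
        return suffixed
    raise ValueError(f"unknown mode: {mode}")


def resolve(base_name, search_list, mode, lookup):
    """Query candidates in order; stop at the first non-NXDOMAIN answer."""
    for name in candidate_names(base_name, search_list, mode):
        answer = lookup(name)
        if answer is not None:  # None models an NXDOMAIN response
            return name, answer
    return None


# Example: a site uses "corp" locally via the search list "example.com".
local_only = {"www.corp.example.com": "192.0.2.1"}
# The same site after a colliding name becomes resolvable globally.
after_delegation = {**local_only, "www.corp": "198.51.100.7"}

before = resolve("www.corp", ["example.com"], "post", local_only.get)
after = resolve("www.corp", ["example.com"], "post", after_delegation.get)
```

   In this example "before" resolves through the search list, while
   "after" resolves the base name directly and never consults the search
   list.  With the "pre" or "always" modes the outcome is the same in
   both cases.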

   The form of name collision with search lists, as described in the
   introduction section of this memo, occurs in the "post" case, where
   the unexpected resolution of the base name causes the search list not
   to be applied to the base name, and the global name context is
   applied to the base name, rather than a local name context as defined
   by the search list.

   Table 1 provides a summary of the behaviour of various operating
   systems and their local name resolver library behaviour when
   resolving base names that contain a single label, and names that
   contain two labels.  As can be seen, only Windows XP and Unix-based
   libraries perform the "post" form of search name application that
   would be susceptible to this form of name collision.

   +---------------+--------------+-------------+
   | System        | Single Label | Multi-Label |
   +---------------+--------------+-------------+
   | Mac OS X 10.9 | always       | never       |
   | Windows XP    | always       | post        |
   | Windows Vista | always       | never       |
   | Windows 7     | always       | never       |
   | Windows 8.1   | always       | never       |
   | FreeBSD 9.1   | pre          | post        |
   | Ubuntu 13.04  | pre          | post        |
   +---------------+--------------+-------------+

   The experimental approach described here does not necessarily use the
   operating system's name resolution libraries.  The experimental
   technique forms a name query within the browser, so it is more
   relevant to examine the behaviour of the browsers when given single
   and multi-label names to look up.  Table 2 shows the behaviour of a
   number of browsers on two operating system platforms.  (It should be
   noted that these results in Table 2 were obtained by using JavaScript
   to feed names to the browser.
   The interactive data entry procedures in current browsers serve a
   dual purpose as URL and search engine term entry, and the variation
   in behaviour between browsers in the way entered data is interpreted
   is due more to differences in the browser's input parser than to any
   differences in the browser's name resolution library.)

   +---------------------------+--------------+-------------+
   | System                    | Single Label | Multi-Label |
   +---------------------------+--------------+-------------+
   | Mac OS X 10.9             |              |             |
   | Chrome (31.0.1650.39)     | always       | post        |
   | Opera (12.16)             | always       | never       |
   | Firefox (25.0)            | always       | never       |
   | Safari (7.0 9537.71)      | always       | never       |
   | Windows 8.1               |              |             |
   | Chrome (30.0.1599.101)    | always       | never       |
   | Opera (17.0)              | always       | never       |
   | Firefox (25.0)            | always       | never       |
   | Safari (5.1.7 7534.57.2)  | always       | never       |
   | Explorer (11.0.900.16384) | always       | never       |
   +---------------------------+--------------+-------------+

   Only one browser / operating system combination tested shows the
   "post" form of search name use, namely Chrome on the Mac OS X
   platform.  In all other cases a single label name always has the
   local search list appended, and a multi-label name never applies the
   local search list.

6.  Security Considerations

   The delegation of the proposed TLD (CandidateTLD) comes with some
   risk of interference with existing deployments.  In the case where a
   local system queries a name, and that query returns an NXDOMAIN
   response, the local system then queries further name forms where each
   entry on a local name search list is appended to the original name in
   turn, searching for a name response that is not NXDOMAIN.  The
   delegation of CandidateTLD for this experiment may interfere with
   this behaviour.

   However, two observations mitigate this concern.
   The first is that this situation of potential collision arises in the
   case where the local system is querying for the CandidateTLD name as
   a "dotless" name (as the only delegated subdomain in the CandidateTLD
   zone is TestName, which is intended to have no semantic meaning in
   any language).  The second observation is that for such "dotless"
   names, the currently widely deployed name resolver libraries do not
   initially query the "dotless" domain name and then apply the search
   list if the first query results in an RCODE 3 response.  Many name
   resolver libraries do not query for "dotless" domain names at all,
   while those libraries that have been observed to perform such queries
   (Windows XP, Linux, FreeBSD) perform them after using the local
   search name list, rather than before.

7.  References

   [Method]   APNIC, "APNIC Labs IPv6 Measurement System", May 2013.

   [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
              April 2011.

   [RFC6761]  Cheshire, S. and M. Krochmal, "Special-Use Domain Names",
              RFC 6761, February 2013.

   [RFC6762]  Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
              February 2013.

   [SAC45]    ICANN Security and Stability Advisory Committee, "Invalid
              Top Level Domain Queries at the Root Level of the Domain
              Name System", November 2010.

   [namecollision]
              Interisle Consulting Group, "Name Collision in the DNS",
              August 2013.

Appendix A.  Acknowledgements

   This draft is a follow-up of, and borrows heavily from, our earlier
   (abandoned) work on "A Procedure for Cautious Delegation of a DNS
   Name".  Discussion of that document in various hallways led to
   inspiration for this document and we want to thank those that gave us
   feedback.

   The idea of using different names to trigger events in a DNS server
   is due to Geoff Huston and George Michaelson.

   The approach described here of using code embedded in ads delivered
   by online advertisement networks to generate a large volume of
   URL-based experiments performed by end users' browsers was developed
   by George Michaelson, Byron Ellacot and Geoff Huston.

Authors' Addresses

   Geoff Huston
   APNIC
   6 Cordelia St
   South Brisbane, QLD  4101
   Australia

   Email: gih@apnic.net


   Olaf Kolkman
   NLnet Labs
   Science Park 400
   Amsterdam,   1098 XH
   The Netherlands

   Email: olaf@NLnetLabs.nl


   Andrew Sullivan
   Dyn, Inc.
   150 Dow St
   Manchester, NH  03101
   U.S.A.

   Email: asullivan@dyn.com


   Warren Kumari
   Google, Inc.
   1600 Amphitheatre Pkwy
   Mountain View, CA  94043
   U.S.A.

   Email: warren@kumari.net