idnits 2.17.1 

draft-kolkman-root-test-delegation-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 18, 2013) is 3805 days in the past.  Is this
     intentional?


  Checking references for intended status: None
  ----------------------------------------------------------------------------

     No issues found here.

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          G. Huston
3	Internet-Draft                                                     APNIC
4	Intended status: Experimental Protocol                        O. Kolkman
5	Expires: May 20, 2014                                         NLnet Labs
6	                                                             A. Sullivan
7	                                                               Dyn, Inc.
8	                                                               W. Kumari
9	                                                            Google, Inc.
10	                                                       November 18, 2013

12	   Using Test Delegations from the Root Prior to Full Allocation and
13	                               Delegation
14	                 draft-kolkman-root-test-delegation-02

16	Abstract

18	   The delegation of certain strings as generic Top Level Domains
19	   (gTLDs) may cause stability and security issues if such strings have
20	   been used in private environments prior to their delegation.  Test
21	   delegations can be used to enable empirical research on the extent of
22	   the potential for name collision.  This document describes one such
23	   approach to an empirical testing framework for name collision, and
24	   considers the applicability of this approach to detect other forms of
25	   name collision.

27	Status of this Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on May 20, 2014.

44	Copyright Notice

46	   Copyright (c) 2013 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents (http://trustee.ietf.org/
51	   license-info) in effect on the date of publication of this document.
52	   Please review these documents carefully, as they describe your rights
53	   and restrictions with respect to this document.  Code Components
54	   extracted from this document must include Simplified BSD License text
55	   as described in Section 4.e of the Trust Legal Provisions and are
56	   provided without warranty as described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction and Motivation
61	     1.1.  Scire est mensurare
62	   2.  Terms and Conventions Used in this Memo
63	   3.  Principle of Operation
64	     3.1.  Measurements Servers and Zones
65	     3.2.  Query Generation
66	     3.3.  Sampling
67	   4.  Evaluation
68	   5.  Name Resolution Considerations
69	   6.  Security Considerations
70	   7.  References
71	   Appendix A.  Acknowledgements
72	   Authors' Addresses

74	1.  Introduction and Motivation

76	   [[The authors are aware that this version of the document is not
77	   fully consistent.  However they would value feedback on whether the
78	   idea is worth further study.  A mail list to discuss this draft is
79	   collisions@lists.dns-oarc.net.]]

81	   While certain names have been reserved for internal or private use
82	   [RFC6761], there is evidence [SAC45] that various sites connected to
83	   the Internet have used other names for internal purposes.  In fact,
84	   the Multicast DNS specification [RFC6762] advises not to use .local
85	   for private use and observes: "the following top-level domains have
86	   been used on private internal networks without the problems caused by
87	   trying to reuse ".local." for this purpose:
88	   .intranet.
89	   .internal.
90	   .private.
91	   .corp.
92	   .home.
93	   .lan."

95	   In the event such names are delegated for use in the public DNS,
96	   there will be inevitable consequences for sites that have used those
97	   names.  Some of those consequences may have security implications,
98	   with the potential for leakage of credentials and HTTP cookies
99	   ([RFC6265]). Responsible administration of the public namespace
100	   therefore requires careful consideration in permitting public
101	   delegation of any name when there are grounds to believe it is in
102	   widespread use as a private namespace, even though such private
103	   namespaces are (from the point of view of the DNS) irregular, even if
104	   common.

106	   One form of name collision involves network domains that use selected
107	   names as local-use top level domains, as noted in [RFC6762].  In the
108	   case where the same label is delegated in the global DNS as a gTLD,
109	   then hosts in the local domain will be unable to resolve domain names
110	   in the context of the gTLD. This state of name occlusion is further
111	   compounded by a number of scenarios where the resolution of a name is
112	   performed across multiple name scope domains.  This may happen with a
113	   mobile host (for example, when the host's browser uses a statically
114	   defined "home page" that is defined within a particular local
115	   scope), or even with applications such as mail delivery (for
116	   example, where multiple MTAs that are listed as mail servers for a
117	   domain reside in different name scope domains, some of which have
118	   this name collision between the domain and locally defined
119	   pseudo-TLDs).

121	   Name collision opens up the potential for misdirection, where the
122	   named remote point being contacted by the application may not
123	   necessarily be the intended service point for the transaction.  When
124	   a host leaves the intranet environment, the host's applications may
125	   anticipate that the DNS names associated with a label return an RCODE
126	   3 (NXDOMAIN) response, but may encounter an unanticipated response
127	   when the gTLD is deployed with a colliding name.  Similarly, a host
128	   that has an association with a named service point within the gTLD
129	   may encounter unanticipated responses when the host is placed into an
130	   intranet environment where the same name exists as a locally-scoped
131	   pseudo-TLD.

133	   There is a subtle form of interaction of names when the same name is
134	   placed on a local name search list.  Certain name resolver libraries
135	   first query the original name, and if the query returns an NXDOMAIN,
136	   then they apply the local search list to the original name.  When
137	   this process occurs in the context of a visible gTLD name colliding
138	   with the local name there is the possibility of the name resolving in
139	   the context of the gTLD, which then bypasses the application of the
140	   local search list.

142	1.1.  Scire est mensurare
143	   The local use of undelegated top-level domain names is troublesome
144	   because it may produce different user experiences depending on the
145	   locally used name, the names placed in a local search list, the
146	   location of a given host, and the host's name resolution behaviour.

148	   Prudent operation of the root zone requires that deployment of new
149	   names in the root should not necessarily cause widespread untoward
150	   effects for users of the DNS, particularly when those users are
151	   relying on name resolution outcomes that have always been part of the
152	   name resolution behaviour up to this point.

154	   What is useful in this context is a mechanism to test whether a
155	   particular delegation from the root zone presents a conflict with
156	   widespread local use.  This memo presents a methodology for making
157	   such a determination.

159	   The methodology considered here depends on temporary delegation of
160	   the top-level domains in question, and the use of a domain under an
161	   existing TLD in order to capture and compare queries generated by a
162	   large number of querying sources under the control of the experiment.

164	2.  Terms and Conventions Used in this Memo

166	   The mechanism outlined here is intended to complement the analysis
167	   already performed in "Name Collision in the DNS" [namecollision].  We
168	   therefore use the terms defined in section 1.1 of [namecollision]
169	   whenever appropriate.

171	   Note that the evaluation methodology outlined here is intended to be
172	   complementary input to a risk analysis, e.g., as found in
173	   [namecollision]; risk trade-offs are likely to include other factors
174	   than the effects measured herewith.

176	3.  Principle of Operation

178	   The goal of the experiment is to assess whether there is significant
179	   existing use of a given candidate string ("CandidateTLD").

181	   We propose the use of a software test that is executed by a large
182	   number of end hosts drawn from across the entire Internet.  The
183	   execution of this test will cause the end host to attempt to retrieve
184	   a small set of URLs.  This will trigger a set of DNS queries to
185	   resolve the domain name part of each URL, and subsequent HTTP queries
186	   to retrieve the object in the case that the DNS name is successfully
187	   resolved to an IP address.  Both the DNS queries and the HTTP
188	   requests are answered by dedicated servers that analyse the received
189	   responses and match them to the original set of queries that were
190	   used by the end host.  This will allow us to infer whether the host
191	   is located in a context where there is name collision with the
192	   CandidateTLD.  In this section we describe the query generation, data
193	   collection, and analysis.

195	   This methodology is based on earlier work by APNIC [Method].

197	3.1.  Measurements Servers and Zones

199	   In addition to the use of CandidateTLD, the methodology uses an
200	   additional name, delegated from a 'common' existing TLD,
201	   ("TestName.ExistingTLD") to the experiment's server.

203	   The experiment's name server is authoritative for CandidateTLD and
204	   TestName.ExistingTLD. The name server will respond to an A or AAAA
205	   query for any name within "TestName.CandidateTLD" with the IPv4 or
206	   IPv6 address of the experiment's HTTP server.  The name server will
207	   respond to queries for any other name within CandidateTLD with RCODE
208	   3 (Name Error or NXDOMAIN). The name server will respond to A and
209	   AAAA queries in TestName.ExistingTLD with the IPv4 or IPv6 address of
210	   the experiment's HTTP server.
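
   The response rules above can be sketched as a small decision
   function.  This is only an illustration under stated assumptions,
   not the experiment's actual name server: the zone names, the
   documentation-prefix addresses, and the function names are all
   hypothetical stand-ins chosen for this example.

```python
# Hypothetical stand-ins for the zones and the HTTP server addresses.
CANDIDATE_TLD = "candidatetld"
TEST_ZONES = ("testname.candidatetld", "testname.existingtld")
HTTP_SERVER = {"A": "192.0.2.1", "AAAA": "2001:db8::1"}  # documentation prefixes

NOERROR, NXDOMAIN = 0, 3  # DNS RCODE values


def in_zone(name, zone):
    """True if name equals zone or is a subdomain of it."""
    return name == zone or name.endswith("." + zone)


def answer(qname, qtype):
    """Return (rcode, address or None) following the rules in the text."""
    name = qname.lower().rstrip(".")
    if any(in_zone(name, zone) for zone in TEST_ZONES):
        # A and AAAA queries get the HTTP server's address.  Other query
        # types are not specified in the text, so this sketch returns an
        # empty NOERROR answer for them.
        return NOERROR, HTTP_SERVER.get(qtype)
    if in_zone(name, CANDIDATE_TLD):
        # Any other name within CandidateTLD: RCODE 3.  (The zone apex
        # is treated the same way here, which is a simplification.)
        return NXDOMAIN, None
    raise ValueError("server is not authoritative for " + qname)
```

   Matching is done on whole labels, so a name that merely ends in the
   same characters as a zone name is not treated as being inside it.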

212	   The experiment's HTTP server will respond with a "200 OK" for a
213	   request for the object "1x1.png" in TestName.CandidateTLD and in
214	   TestName.ExistingTLD. The server will respond with "404 Not Found"
215	   for any other object name.
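
   The HTTP server's behaviour can be illustrated in the same way.  The
   sketch below assumes that the query string is ignored when matching
   the object name (the measurement URLs carry an identifier after the
   "?"); the zone names and the function name are hypothetical.

```python
def http_status(host, path):
    """Return the status line for a request, per the rules in the text."""
    # Hypothetical zone names standing in for TestName.CandidateTLD and
    # TestName.ExistingTLD.
    zones = ("testname.candidatetld", "testname.existingtld")
    host = host.lower().rstrip(".")
    # Ignore any query string: the object name is the path component only.
    object_name = path.split("?", 1)[0].lstrip("/")
    served = any(host == z or host.endswith("." + z) for z in zones)
    if served and object_name == "1x1.png":
        return "200 OK"
    return "404 Not Found"
```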

217	3.2.  Query Generation

219	   The TestName is a synthetic name with no intentional semantic
220	   meaning that is generated in such a way as to reduce the likelihood of
221	   collision with any existing delegated name.  It is suggested that it
222	   be generated by using the hex encoding of a randomly selected integer
223	   value between 1,000,000,000 and 2,000,000,000. The name must not be
224	   already delegated from the root or in the ExistingTLD.
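
   A minimal sketch of the suggested generation step follows.  The
   function name is hypothetical, and the sketch does not perform the
   required check that the resulting name is not already delegated from
   the root or in the ExistingTLD; that check would have to be made
   separately.

```python
import secrets

LOW, HIGH = 1_000_000_000, 2_000_000_000


def generate_test_name():
    """Hex-encode a random integer in the range [LOW, HIGH]."""
    value = LOW + secrets.randbelow(HIGH - LOW + 1)
    return format(value, "x")
```

   Every integer in this range encodes to exactly eight hexadecimal
   digits, so the resulting label has a fixed length.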

226	   Each query set constitutes one "measurement".  A "measurement" is
227	   identified by a measurement identifier (<uniqueid>, syntactically a
228	   valid hostname) that is uniquely generated for each instance of a
229	   measurement.  This ensures that when the domain name is resolved, and
230	   when the named object is retrieved there is no occlusion of the
231	   interaction with the experiment's services because of local name or
232	   web object caches.  The set uses the following URLs:

234	      A: http://<uniqueid>-a.TestName.CandidateTLD/1x1.png?
235	         <uniqueid>-a

237	      B: http://<uniqueid>-a.TestName.ExistingTLD/1x1.png?
238	         <uniqueid>-b

240	      C: http://results.TestName.ExistingTLD/1x1.png?
241	         <uniqueid>?za=<a_result>&zb=<b_result>
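
   The construction of one query set can be sketched as follows.  The
   URL templates are reproduced as they are written above; the use of a
   random UUID as the measurement identifier, the function name, and
   the parameter names are assumptions made for this example.

```python
import uuid


def build_urls(test_name, candidate_tld, existing_tld, a_result="", b_result=""):
    """Build the A, B and C URLs for one measurement."""
    # Assumption: any fresh, hostname-safe label serves as the identifier.
    uniqueid = uuid.uuid4().hex
    a = f"http://{uniqueid}-a.{test_name}.{candidate_tld}/1x1.png?{uniqueid}-a"
    b = f"http://{uniqueid}-a.{test_name}.{existing_tld}/1x1.png?{uniqueid}-b"
    c = (f"http://results.{test_name}.{existing_tld}/1x1.png?"
         f"{uniqueid}?za={a_result}&zb={b_result}")
    return uniqueid, {"A": a, "B": b, "C": c}
```

   Because a new identifier is drawn on every call, neither the DNS
   names nor the object URLs of one measurement can be answered from a
   cache populated by another.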

243	   The A URL is intended to test if CandidateTLD is a locally used name,
244	   in other words, whether local use of CandidateTLD occludes visibility
245	   of CandidateTLD as a gTLD.  The DNS query for the A Fully Qualified
246	   Domain Name (FQDN) will only be received by the authoritative name
247	   server for this name if there is no local name resolution function
248	   that uses the CandidateTLD name as a locally defined pseudo-top level
249	   domain.

251	   The B URL is intended to function as the control test for the
252	   experiment, and the use of ExistingTLD in B is intended to operate as
253	   a name that does not collide with a local use context.

255	   As the experiment uses the absence of a fetch of the A URL to infer
256	   the name resolution behaviour of the location where the measurement
257	   is being performed, it is necessary to ensure that the measurement
258	   code has run to completion.  The measurement code starts a timer at
259	   the start of its execution.  Upon expiration of the timer, or when
260	   both the A and B objects have been successfully retrieved, the code
261	   will schedule the retrieval of the C URL. The arguments to the C URL
262	   include the client-side measurement of the elapsed time to retrieve
263	   the A and B URLs.
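
   The timer logic can be sketched as below.  In practice the
   measurement would run as a script inside a browser; this sketch only
   models the control flow.  The caller-supplied "fetch" function, the
   ten second default timer, and the use of "null" to mark an object
   that was not retrieved are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def timed_fetch(fetch, url):
    """Fetch url and return the elapsed time in milliseconds."""
    start = time.monotonic()
    fetch(url)
    return round((time.monotonic() - start) * 1000)


def run_measurement(fetch, urls, timeout=10.0):
    """Fetch A and B in parallel, then report via C on completion or timeout.

    fetch is a caller-supplied function that retrieves one URL and raises
    on failure; urls maps "A", "B" and "C" to URL strings, where the C URL
    ends just before the za/zb arguments.
    """
    pool = ThreadPoolExecutor(max_workers=2)
    futures = {key: pool.submit(timed_fetch, fetch, urls[key]) for key in "AB"}
    # Block until both fetches finish or the timer expires.
    wait(futures.values(), timeout=timeout)
    results = {}
    for key, future in futures.items():
        ok = future.done() and future.exception() is None
        results[key] = future.result() if ok else "null"
    pool.shutdown(wait=False)  # do not wait for fetches still in flight
    report = f"{urls['C']}za={results['A']}&zb={results['B']}"
    fetch(report)
    return report
```

   The C URL is fetched in every case, which is what allows the absence
   of an A fetch to be distinguished from a measurement that never ran
   to completion.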

265	3.3.  Sampling

267	   One way to perform this measurement is to embed the measurement in
268	   web content, using a scripting language.  When the web content is
269	   loaded the script is activated, and the measurement sequence is
270	   performed.

272	   One way to distribute this content to clients to perform the test is
273	   via an online (ad) campaign.  If the measurement script is enclosed
274	   within the ad itself, then there is no reason for the campaign
275	   actually to cause users to click through in order to perform the test.
276	   Behavior of this sort is trivially achievable with a number of
277	   available online advertising systems.

279	   It is also necessary to spread the delivery of the ad to a very broad
280	   spectrum of clients, so the ad should be presented across all time
281	   zones, across all language bases, and across all geographic regions.

283	4.  Evaluation

285	   To evaluate the results, we take those measurements that return the C
286	   URL. The use of the C URL ensures that we use measurement results
287	   where the ExistingTLD name is not being locally occluded.  We count
288	   the number of experiments of each of the possible combinations of
289	   retrieving the A and B URLs.  These combinations are:

291	      Not A and Not B: This result contributes to experimental
292	         uncertainty.  (We know that ExistingTLD is not locally
293	         occluded, so the failure to retrieve B is due to other factors
294	         that are not being examined in the context of this
295	         measurement.)

297	      A and Not B: This result indicates that the client is able to
298	         resolve names in the CandidateTLD in the context of the global
299	         DNS, but the inability to retrieve the B URL contributes to
300	         experimental uncertainty.  (The same reasoning about the
301	         ExistingTLD and local occlusion applies to this case).

303	      Not A and B: This result is an indicator that the client's use of
304	         CandidateTLD is probably being occluded by some form of local
305	         use.

307	      A and B: This result indicates that the client is able to resolve
308	         names in the CandidateTLD in the context of the global DNS.

310	   If the CandidateTLD is in widespread private use, then we would
311	   expect the count of "Not A and B" to be far in excess of the level of
312	   experimental uncertainty.  When this is observed, we can conclude
313	   that there are locales where the CandidateTLD is being used in a
314	   local context.  Analysis of the source IP addresses of the clients
315	   that produce the "Not A and B" result, the BGP Origin AS of these
316	   addresses, and their geolocation may indicate if such local use is
317	   clustered in a particular network or group of networks, or clustered
318	   in a particular geography or language region.
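
   The counting step can be sketched as follows.  The record layout (a
   dictionary with boolean keys "a", "b" and "c", plus an "asn" key for
   the BGP Origin AS) and the function names are hypothetical.  The
   sketch does not attempt to define the level of experimental
   uncertainty; it only produces the counts that such a comparison
   would use.

```python
from collections import Counter


def classify(got_a, got_b):
    """Map the retrieval outcome of one measurement to its combination."""
    return f"{'A' if got_a else 'Not A'} and {'B' if got_b else 'Not B'}"


def tally(measurements):
    """Count combinations over measurements that returned the C URL."""
    return Counter(classify(m["a"], m["b"]) for m in measurements if m["c"])


def occluded_by_origin_as(measurements):
    """Count "Not A and B" results per origin AS to look for clustering."""
    return Counter(
        m["asn"]
        for m in measurements
        if m["c"] and classify(m["a"], m["b"]) == "Not A and B"
    )
```

   Measurements that never returned the C URL are excluded from both
   counts, in line with the first step of the evaluation.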

320	5.  Name Resolution Considerations

322	   Earlier versions of this memo proposed to use this experimental
323	   technique to detect name search list considerations.  This section
324	   describes the name search list collision considerations, and
325	   describes some further investigation that has led to the conclusion
326	   that this technique would not necessarily be applicable in that
327	   context.

329	   The basic algorithm used in name resolution when search lists are
330	   present appears to be consistent across a number of implementations:
331	   various permutations of using the base name and appending individual
332	   values from the name search list are used as DNS queries in order to
333	   find a name that can be resolved by the local DNS resolver.  The
334	   search process stops when the DNS query returns other than an
335	   NXDOMAIN response.

337	   However, the exact order of generating these candidate names has been
338	   observed to vary across implementations.  To describe these
339	   observations it is first necessary to introduce some basic
340	   terminology.  There are four generic ways that name resolution
341	   libraries apply a search list to a "base name" in order to construct
342	   a set of FQDNs that are used in DNS queries:

344	      never: the search list is not applied to the base name.

346	      pre: the search list is applied to the base name, then the base
347	         name alone is used.

349	      post: the base name alone is used, then the search list is applied
350	         to the base name.

352	      always: the search list is applied to the base name, and the base
353	         name alone is not used.

355	   The form of name collision with search lists, as described in the
356	   introduction section of this memo, occurs in the "post" case, where
357	   the unexpected resolution of the base name causes the search list not
358	   to be applied to the base name, and the global name context is
359	   applied to the base name, rather than applying a local name context,
360	   as defined by the search list.
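
   The four ways of applying a search list, and the collision in the
   "post" case, can be illustrated with a small model.  The function
   names are hypothetical, and the DNS is reduced to a set of names
   that resolve; any name outside that set stands for an NXDOMAIN
   response.  The labels follow those used in the tables of this
   section.

```python
def candidate_names(base, search_list, mode):
    """Return the FQDNs to query, in order, for the given mode."""
    expanded = [f"{base}.{suffix}" for suffix in search_list]
    if mode in ("never", "none"):
        return [base]
    if mode == "pre":
        return expanded + [base]
    if mode == "post":
        return [base] + expanded
    if mode == "always":
        return expanded
    raise ValueError(f"unknown mode: {mode}")


def resolve(base, search_list, mode, existing_names):
    """Return the first candidate that does not yield NXDOMAIN, else None."""
    for name in candidate_names(base, search_list, mode):
        if name in existing_names:
            return name
    return None
```

   With a "post" resolver and a search list of "example.com", the base
   name "www.corp" resolves to "www.corp.example.com" as long as
   "www.corp" does not exist in the global DNS.  Once "www.corp"
   resolves globally, the same lookup stops at the base name and the
   search list is never applied.  A "pre" resolver returns the local
   name in both situations.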

362	   Table 1 provides a summary of the behaviour of various operating
363	   systems and their local name resolver library behaviour when
364	   resolving base names that contain a single label, and names that
365	   contain two labels.  As can be seen, only Windows XP and Unix-based
366	   libraries perform the "post" form of search name application that
367	   would be susceptible to this form of name collision.

369	             +---------------+--------------+-------------+
370	             | System        | Single Label | Multi-Label |
371	             +---------------+--------------+-------------+
372	             | Mac OS X 10.9 |    always    |    never    |
373	             | Windows XP    |    always    |     post    |
374	             | Windows Vista |    always    |    never    |
375	             | Windows 7     |    always    |    never    |
376	             | Windows 8.1   |    always    |    never    |
377	             | FreeBSD 9.1   |     pre      |     post    |
378	             | Ubuntu 13.04  |     pre      |     post    |
379	             +---------------+--------------+-------------+

381	   The experimental approach described here does not necessarily use the
382	   operating system's name resolution libraries.  The experimental
383	   technique forms a name query within the browser, so it is more
384	   relevant to examine the behaviour of the browsers when given single
385	   and multi-label names to look up.  Table 2 shows the behaviour of a
386	   number of browsers on two operating system platforms.  (It should be
387	   noted that the results in Table 2 were obtained by using JavaScript
388	   to feed names to the browser.  The interactive data entry procedures
389	   in current browsers are a dual-purpose URL and search engine term
390	   data entry, and the variations in behaviour between browsers in the
391	   way in which entered data is interpreted are due more to the
392	   differences in the browsers' input parsers than to any differences
393	   in the browsers' name resolution libraries.)

394	       +---------------------------+--------------+-------------+
395	       | System                    | Single Label | Multi-Label |
396	       +---------------------------+--------------+-------------+
397	       | Mac OS X 10.9             |              |             |
398	       | Chrome (31.0.1650.39)     |    always    |     post    |
399	       | Opera (12.16)             |    always    |    never    |
400	       | Firefox (25.0)            |    always    |    never    |
401	       | Safari (7.0 9537.71)      |    always    |    never    |
402	       | Windows 8.1               |              |             |
403	       | Chrome (30.0.1599.101)    |    always    |    never    |
404	       | Opera (17.0)              |    always    |    never    |
405	       | Firefox (25.0)            |    always    |    never    |
406	       | Safari (5.1.7 7534.57.2)  |    always    |    never    |
407	       | Explorer (11.0.900.16384) |    always    |    never    |
408	       +---------------------------+--------------+-------------+

410	   Only one browser / operating system combination tested shows the
411	   "post" form of search name use, namely Chrome on the Mac OS X
412	   platform.  In all other cases a single label name always has the
413	   local search list appended, and a multi-label name never applies the
414	   local search list.

416	6.  Security Considerations

418	   The delegation of the Proposed TLD (CandidateTLD) comes with some
419	   risk of interference with existing deployments.  In the case where a
420	   local system queries a name, and that query returns an NXDOMAIN
421	   response, the local system then queries further name forms where
422	   each entry on a local name search list is appended to the original
423	   name in turn, searching for a name response that is not NXDOMAIN.
424	   The delegation of CandidateTLD for this experiment may interfere with
425	   this behaviour.

427	   However, two observations mitigate this concern.  The first is that
428	   this situation of potential collision arises in the case where the
429	   local system is querying for the CandidateTLD name as a "dotless"
430	   name (as the only delegated subdomain in the CandidateTLD zone is
431	   TestName, which is intended to have no semantic meaning in any
432	   language).  The second observation is that for such "dotless" names,
433	   the currently widely deployed name resolver libraries do not
434	   initially query the "dotless" domain name and then apply the search
435	   list if the first query results in an RCODE 3 response.  Many name
436	   resolver libraries do not query for "dotless" domain names at all,
437	   while those libraries that have been observed to perform such queries
438	   (Windows XP, Linux, FreeBSD) perform them after using the local
439	   search name list, rather than before.

441	7.  References

443	   [Method]   APNIC, "APNIC Labs IPv6 Measurement System", May 2013.

445	   [RFC6265]  Barth, A., "HTTP State Management Mechanism", RFC 6265,
446	              April 2011.

448	   [RFC6761]  Cheshire, S. and M. Krochmal, "Special-Use Domain Names",
449	              RFC 6761, February 2013.

451	   [RFC6762]  Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
452	              February 2013.

454	   [SAC45]    ICANN Security and Stability Advisory Committee, "Invalid
455	              Top Level Domain Queries at the Root Level of the Domain
456	              Name System", November 2010, <http://www.icann.org/en/
457	              groups/ssac/documents/sac-045-en.pdf>.

459	   [namecollision]
460	              Interisle Consulting Group, "Name Collision in the DNS",
461	              August 2013.

463	Appendix A.  Acknowledgements

465	   This draft is a follow-up to, and borrows heavily from, our earlier
466	   (abandoned) work on "A Procedure for Cautious Delegation of a DNS
467	   Name".  Discussion of that document in various hallways led to
468	   inspiration for this document and we want to thank those that gave us
469	   feedback.

471	   The idea of using different names to trigger events in a DNS server
472	   is due to Geoff Huston and George Michaelson.

474	   The approach described here of using code embedded in ads delivered
475	   by online advertisement networks to generate a large volume of URL-
476	   based experiments performed by end users' browsers was developed by
477	   George Michaelson, Byron Ellacot and Geoff Huston.

479	Authors' Addresses

481	   Geoff Huston
482	   APNIC
483	   6 Cordelia St
484	   South Brisbane, QLD 4101
485	   Australia

487	   Email: gih@apnic.net

489	   Olaf Kolkman
490	   NLnet Labs
491	   Science Park 400
492	   Amsterdam, 1098 XH
493	   The Netherlands

495	   Email: olaf@NLnetLabs.nl

496	   Andrew Sullivan
497	   Dyn, Inc.
498	   150 Dow St
499	   Manchester, NH 03101
500	   U.S.A.

502	   Email: asullivan@dyn.com

504	   Warren Kumari
505	   Google, Inc.
506	   1600 Amphitheatre Pkwy
507	   Mountain View, CA 94043
508	   U.S.A.

510	   Email: warren@kumari.net