idnits 2.17.1 draft-arkko-abcd-distributed-resolver-selection-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (November 04, 2019) is 1625 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '1' on line 556 -- Looks like a reference, but probably isn't: '2' on line 558 -- Looks like a reference, but probably isn't: '3' on line 561 == Unused Reference: 'MSCVUS' is defined on line 550, but no explicit reference was found in the text == Outdated reference: A later version (-07) exists of draft-levine-dbound-dns-03 == Outdated reference: A later version (-02) exists of draft-schinazi-httpbis-doh-preference-hints-00 Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group J. Arkko 3 Internet-Draft Ericsson 4 Intended status: Informational M. Thomson 5 Expires: May 7, 2020 Mozilla 6 T. Hardie 7 Google 8 November 04, 2019 10 Selecting Resolvers from a Set of Distributed DNS Resolvers 11 draft-arkko-abcd-distributed-resolver-selection-00 13 Abstract 15 This memo discusses the use of a set of different DNS resolvers to 16 reduce privacy problems related to resolvers learning the Internet 17 usage patterns of their clients. 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at https://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on May 7, 2020. 36 Copyright Notice 38 Copyright (c) 2019 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (https://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License. 51 Table of Contents 53 1. 
Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 54 2. Operational Context . . . . . . . . . . . . . . . . . . . . . 4 55 3. Goals and Constraints . . . . . . . . . . . . . . . . . . . . 4 56 4. Query distribution strategies . . . . . . . . . . . . . . . . 5 57 4.1. Client-based . . . . . . . . . . . . . . . . . . . . . . 6 58 4.1.1. Analysis of client-based selection . . . . . . . . . 6 59 4.1.2. Enhancements to client-based selection . . . . . . . 7 60 4.2. Name-based . . . . . . . . . . . . . . . . . . . . . . . 7 61 4.2.1. Name reduction . . . . . . . . . . . . . . . . . . . 8 62 5. Early conclusions . . . . . . . . . . . . . . . . . . . . . . 9 63 5.1. Analysis conclusions . . . . . . . . . . . . . . . . . . 9 64 5.2. Recommendations . . . . . . . . . . . . . . . . . . . . . 9 65 5.3. Poor distribution strategies . . . . . . . . . . . . . . 9 66 6. Effects of query distribution . . . . . . . . . . . . . . . . 10 67 6.1. Caching considerations . . . . . . . . . . . . . . . . . 10 68 6.2. Consistency considerations . . . . . . . . . . . . . . . 10 69 6.3. Resolver load distribution and failover . . . . . . . . . 10 70 6.4. Query performance . . . . . . . . . . . . . . . . . . . . 11 71 6.5. Debugging . . . . . . . . . . . . . . . . . . . . . . . . 11 72 7. Further work . . . . . . . . . . . . . . . . . . . . . . . . 11 73 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 11 74 8.1. Normative References . . . . . . . . . . . . . . . . . . 11 75 8.2. Informative References . . . . . . . . . . . . . . . . . 12 76 8.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 12 77 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 12 78 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 13 80 1. Introduction 82 The DNS [DNS] is a complex system with many different security 83 issues, challenges, deployment models and usage patterns. This 84 document focuses on one narrow aspect within DNS and its security. 
86 Traditionally, systems are configured with a single DNS recursive 87 resolver, or a set of primary and alternate recursive resolvers. 88 Recursive resolver services are offered by organisations such as 89 enterprises, ISPs, and global providers. Even when clients use 90 alternate recursive resolvers, they are typically all provided by the 91 same organisation. 93 The resolvers will learn the Internet usage patterns of their 94 clients. A client might decide to trust a particular recursive 95 resolver with information about DNS queries. However, it is 96 difficult or impossible to provide any guarantees about data handling 97 practices in the general case. And even if a service can be trusted 98 to respect privacy with respect to handling of query data, legal and 99 commercial pressures or surveillance activity could result in misuse 100 of data. Similarly, outside attacks may be directed at any DNS 101 service. For a service with many clients, these risks are 102 particularly undesirable. 104 This memo discusses whether DNS clients can improve their privacy 105 through the potential use of a set of multiple recursive resolver 106 services. The goal is indeed an improvement only. There is no 107 expectation that it would be possible to have no part of the DNS 108 infrastructure aware of what queries are being made, but perhaps 109 there are mitigations that would make collecting information 110 from the DNS infrastructure harder. 112 It should be understood that this is a narrow aspect within a bigger 113 set of topics even within privacy issues around DNS, let alone other 114 security issues, deployment models, or the many protocol questions 115 within DNS. Some of these other topics include detecting 116 tampering with DNS query responses [DNSSEC], encrypting DNS queries [DOT] 117 [DOH], application-specific DNS resolution mechanisms, or centralised 118 deployment models. 
Those other topics are not covered in this memo 119 and need to be dealt with elsewhere. 121 Specifically, the scope of this memo is not limited to DNS-over-TLS 122 (DOT) or DNS-over-HTTPS (DOH) deployments, nor does it take a stand on 123 operating system vs. application or local vs. centralized DNS 124 deployment models. This memo is intended to provide useful 125 information for those who wish to consider the trustworthiness of their 126 recursive resolvers as a part of their privacy analysis. 128 Naturally, there are some interactions between different topics. For 129 instance, privacy is affected both by what happens to data in transit 130 and at the endpoints, so where privacy is a concern, one would expect 131 to consider both aspects, and neglecting either one probably 132 leads to issues that dwarf the problems that this memo can address. 133 Both issues are also important aspects when considering defense against 134 pervasive monitoring efforts [PMON]. 136 The rest of this memo is organized as follows. Section 2 discusses 137 the operational context that we imagine the multiple recursive 138 resolver arrangement might be applied in. Section 3 specifies 139 security goals for a system that employs multiple recursive 140 resolvers. 142 One key aspect of a system using multiple resolvers would be how to 143 select which particular recursive resolver to use in a particular 144 situation. This is discussed in Section 4. That section covers a 145 number of possible strategies and further considerations for this 146 selection, along with an analysis of the implications of choosing a 147 particular strategy. There are technical issues in the use of 148 multiple recursive resolvers, and there are both technical and non- 149 technical questions in deciding on what recursive resolvers should 150 even be in the set. 152 Some early recommendations are provided in Section 5. Section 6 153 discusses operational and other implications of the distributed 154 approach. 
Finally, Section 7 discusses potential further work in 155 this area. 157 2. Operational Context 159 Our perspective is that of a client, choosing to either distribute or 160 not distribute its queries to a set of different resolvers. And if 161 the client decides to use distribution, it can choose exactly how it 162 does that. 164 There are obviously additional operational aspects of this - such as 165 central configuration mechanisms, resolver selection application 166 choices, and so on. But these are not covered in this memo. 168 It should also be observed that the practices suggested in this memo 169 are currently not widely used. Operational and other issues may be 170 discovered, such as those outlined in Section 6. 172 Many of these issues need further work, but this memo aims to discuss 173 the concept and analyse its impacts before delving into the 174 technical arrangements for configuring and using this particular 175 approach. 177 3. Goals and Constraints 179 This document aims to reduce the concentration of information about 180 client activity by distributing DNS queries across different resolver 181 services, for all DNS queries in the aggregate and for DNS queries 182 made by individual clients. By distributing queries in this way, the 183 goal is to reduce the amount of information that any given DNS 184 resolver service can acquire about client activity. As such, this creates 185 a benefit for the client, but also makes these resolvers less 186 valuable targets for attacks relating to that client's activity. 188 Any method for distributing queries from a single client needs to 189 weigh these benefits against the following constraints: 191 o A careful selection of the set of trusted resolvers must be the 192 first priority. It does not make sense to add less trustworthy 193 resolvers merely for the sake of distribution. 
For instance, 194 there is no reason to mix resolvers with good reliability or a high 195 degree of privacy regulation with other resolvers: just include 196 the best resolvers in the set. 198 o As the goal is to reduce the amount of information given to any 199 given resolver, a strategy that tells the same information to all 200 resolvers is a poor one. A design that results in replicating the 201 same query toward multiple services would thus be a net privacy 202 loss. This happens quite easily over a long period of time, 203 unless the distribution method is carefully designed. 205 More subtle leaks arise as a result of distributing queries for 206 sub-domains and even domains that are superficially unrelated, 207 because these could share a commonality that might be exploited to 208 link them. For instance, some web sites use names that appear 209 unrelated to their primary name for hosting some kinds of content, 210 like static images or videos. If queries for these unrelated 211 names were sent to different services, that would effectively allow 212 multiple resolvers to learn that the client accessed the web site. 214 A distribution scheme also needs to consider stability of query 215 routing over time. A resolver can observe the absence of queries and 216 infer things about the state of a client cache, which can reveal that 217 queries were made to other resolvers. 219 In effect, there are two goals in tension: 221 o to split queries between as many different resolvers as possible; 222 and 224 o to reduce the spread of information about related queries across 225 multiple resolvers. 227 The need to limit replication of private information about queries 228 eliminates naive distribution schemes, such as those discussed in 229 Section 5.3. The designs described in Section 4 all attempt to 230 balance these different goals using different properties from the 231 context of a query (Section 4.1) or the query name itself 232 (Section 4.2). 234 4. 
Query distribution strategies 236 This section introduces and analyzes several potential strategies for 237 distributing queries to different resolvers. Each strategy is 238 formulated as an algorithm for choosing a resolver Ri from a set of n 239 resolvers R1, R2, ..., Rn. 241 The designs presented in Section 4 assume that the stub resolver 242 performing distribution of queries has varying degrees of contextual 243 information. In general, more contextual information allows for 244 finer-grained distribution of information between resolvers. 246 4.1. Client-based 248 The simplest strategy is to distribute each different client to a 249 different resolver. This reduces the number of users any particular 250 service will know about. However, this does little to protect an 251 individual user from the aggregation of information about queries at 252 the selected resolver. 254 In this design clients select and consistently use the same resolver. 255 This might be achieved by randomly selecting and remembering a 256 resolver. Alternatively, a resolver might be selected using 257 consistent hashing that takes some conception of client identity as 258 input: 260 i = h(client identity) % n 262 For the purposes of this determination, a client might be an entire 263 device, with the selection being made at the operating system level, 264 or it could be a selection made by individual applications. In the 265 extreme, an individual application might be able to partition its 266 activities in a way that allows it to direct queries to multiple 267 resolvers. 269 4.1.1. Analysis of client-based selection 271 This is a simple and effective strategy, but while it provides 272 distribution of DNS queries in the aggregate, it does little to 273 divide information about a particular client between resolvers. It 274 effectively only reduces the number of clients that each resolver can 275 acquire information about. 
This provides a systemic benefit, but does 276 not provide individual clients with any significant advantage as 277 there is still some resolver service that has a complete view of the 278 user's DNS usage patterns. 280 In addition, there are specific issues where this selection method is 281 used in particular deployment modes. Where different applications 282 make independent resolver selections, activities that involve 283 multiple applications can result in information about those 284 activities being exposed to multiple resolvers. For instance, an 285 application could open another application for the purposes of 286 handling a specific file type or to load a URL. This could expose 287 queries related to the activity as a whole to multiple resolvers. 289 Making different selections at the level of a device resolves this 290 issue. But of course it is still possible that an individual who 291 uses multiple devices might perform similar activities on those 292 devices, but have DNS queries distributed to different resolvers, 293 resulting in replicating that individual's information to multiple 294 resolvers. The individual may or may not be identifiable through 295 fingerprinting of the specific set of queries being made from the 296 devices. 298 4.1.2. Enhancements to client-based selection 300 Clients can break continuity of records by occasionally resetting 301 state so that a different resolver is selected. A client might 302 choose to do this when it moves to a new network location, and may 303 otherwise appear as a new client to its current resolver. But it is 304 unclear if there's a sufficient advantage to breaking continuity, as 305 the potential benefits are offset by the client's information being 306 disclosed to several resolvers as part of performing a series of 307 resets. And it is possible that a particular individual's usage 308 patterns can be identified across network locations and periods of 309 using other resolvers. 
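As a minimal sketch of the client-based formula from Section 4.1, i = h(client identity) % n, the following Python fragment maps a client identity to one resolver. The use of SHA-256 as h, the function name select_resolver, and the resolver and device names are assumptions made for illustration only; the memo does not prescribe any of them.

```python
import hashlib


def select_resolver(client_identity: str, resolvers: list[str]) -> str:
    """Pick one resolver for a client: i = h(client identity) % n.

    SHA-256 stands in for the unspecified hash h (an assumption).
    """
    if not resolvers:
        raise ValueError("resolver set must not be empty")
    digest = hashlib.sha256(client_identity.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(resolvers)
    return resolvers[index]


# Hypothetical resolver names and client identity, for illustration only.
RESOLVERS = ["resolver-a.example", "resolver-b.example", "resolver-c.example"]
choice = select_resolver("device-1234", RESOLVERS)

# The same identity always maps to the same resolver.
assert choice == select_resolver("device-1234", RESOLVERS)
```

Resetting state as described in Section 4.1.2 would correspond to changing the identity string passed in. Note also that with a plain modulo, changing the size of the resolver set can remap an existing client to a different resolver.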
311 Breaking continuity is less effective if any state, in particular 312 cached results, is retained across the change. If activities that 313 depend on DNS querying are continued across the change, then it might 314 be possible for the old resolver to make inferences about the 315 activity on the new resolver, or the new resolver to make similar 316 guesses about past activity. As many modern applications provide 317 session continuity features across shutdowns and crashes, it 318 can be difficult to find an appropriate point in time to perform a 319 switch. 321 4.2. Name-based 323 Clients might additionally attempt to distribute queries based 324 on the name being queried. This results in different names going to 325 different resolvers. 327 A naive algorithm for name distribution uses the target name as input 328 to a fixed hash: 330 i = h(queried name) % n 332 However, this simplistic approach fails to prevent related queries 333 from being distributed to different resolvers in several ways. For 334 instance, queries that are executed after receiving a CNAME record in 335 a response will leak the same information as the original query that 336 resulted in the CNAME record. Services that use related domain names 337 - such as where "example.com" uses "static.example.com" or 338 "cdn.example.net" - might reveal the use of the combined service to a 339 resolver that receives a query for any associated name. In both 340 cases, sensitive information is effectively replicated across 341 multiple resolvers. 343 4.2.1. Name reduction 345 In order to reduce the effect of distributing similar names to 346 different servers, a grouping mechanism might be used. Leading 347 labels in names might be erased before being input to the hashing 348 algorithm. This requires that the part of the suffix that is shared 349 between different services can be identified. 
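The name-based formula and the erasure of leading labels described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: SHA-256 stands in for h, TOY_SUFFIXES is a hypothetical hand-written stand-in for a real list of shared suffixes, and the function names are invented for this example.

```python
import hashlib

# Hypothetical stand-in for a real list of shared suffixes (assumption).
TOY_SUFFIXES = {"com", "net", "co.uk"}


def pick_index(key: str, n: int) -> int:
    """i = h(key) % n, with SHA-256 standing in for h (assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def reduce_name(name: str) -> str:
    """Erase leading labels, keeping the shared suffix plus one label."""
    labels = name.lower().rstrip(".").split(".")
    # Candidates run from longest to shortest, so the first hit is the
    # longest matching suffix.
    for start in range(len(labels)):
        if ".".join(labels[start:]) in TOY_SUFFIXES:
            # Keep one label to the left of the matched suffix, if any.
            return ".".join(labels[max(start - 1, 0):])
    return ".".join(labels)  # no known suffix: leave the name unreduced


def select_by_name(name: str, resolvers: list[str]) -> str:
    """Choose a resolver from the reduced form of the queried name."""
    return resolvers[pick_index(reduce_name(name), len(resolvers))]


RESOLVERS = ["resolver-a.example", "resolver-b.example", "resolver-c.example"]

# Related names under the same reduced name reach the same resolver.
assert select_by_name("static.example.com", RESOLVERS) == \
    select_by_name("example.com", RESOLVERS)
```

In this sketch "static.example.com" and "example.com" are grouped, but a name under a different suffix, such as "cdn.example.net", reduces to a different key and may still be sent to another resolver.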
For the purposes of 350 ensuring that queries are consistently routed to the same resolver, a 351 weak signal is likely sufficient. 353 Several options for grouping domain names into equivalence sets might 354 be used: 356 o The public suffix list [1] provides a manually curated list of 357 shared domain suffixes. Names can be reduced to include one label 358 more than the list allows, referred to as effective top-level 359 domain plus one (eTLD+1). This reduces the number of cases where 360 queries for domains under the same administrative control are sent 361 to different resolvers. 363 o Services often rely on multiple domain names across different 364 eTLD+1 domains. Developing equivalence sets might be needed to 365 avoid broadcasting queries to servers. Mozilla maintains a 366 manually curated equivalence list [2] for web sites that aims to 367 map the complete set of unrelated names used by services to a 368 single service name. 370 o Other technologies, such as the proposed first party sets [3] or 371 the abandoned DBOUND [DBOUND], provide domain owners a means to 372 declare some form of equivalence for different names. 374 Each of these techniques is imperfect in different ways. They may 375 also skew the distribution of queries in ways that might concentrate 376 information on particular resolvers. 378 5. Early conclusions 380 5.1. Analysis conclusions 382 Both the client-based and more advanced name-based strategies provide 383 benefits. The former provides primarily a systemic benefit, while 384 the latter also provides some privacy benefits to each individual 385 client. However, neither strategy is perfect, and both can leak the same 386 information to multiple resolvers in some cases. 388 5.2. Recommendations 390 Both strategies are, however, likely generally beneficial in the 391 common cases, and can improve the overall privacy situation. 
And 392 they are certainly a considerable privacy improvement over a 393 situation where a large number of clients use a single resolver. 395 Their use may also reduce any pressures against specific resolvers, 396 as information available in these specific resolvers does not 397 constitute all information about all clients. As such, the use of 398 one of these distribution strategies is tentatively recommended, 399 subject to further testing, discussion, and resolving any remaining 400 operational issues. 402 The naive name-based strategy is, however, not recommended, and 403 neither are other, even simpler strategies listed in Section 5.3. It 404 should be noted that no technique presented in this memo can defend 405 against a situation where an actor such as a surveillance agency has 406 access to information from all resolvers. 408 5.3. Poor distribution strategies 410 Random allocation to a resolver might be implemented: 412 i = rand() % n 414 Similar drawbacks can be seen where clients iterate over available 415 resolvers: 417 i = counter++ % n 419 Whether or not this choice is made on a per-query basis, these two methods 420 eventually provide information about all queries to all resolvers 421 over time. Domain names are often queried many times over long 422 periods, so queries for the same domain name will eventually be 423 distributed to all resolvers. Only one-off queries will avoid being 424 distributed. 426 Implementing either method at a much slower cadence might be 427 effective, subject to the constraints in Section 4.1.2. However, this only 428 slows the distribution of information about repeated queries to all 429 resolvers. 431 6. Effects of query distribution 433 Choosing to use more than one DNS resolver has broader implications 434 than just the effect on privacy. Using multiple resolvers is a 435 significant change from the assumed model where stub resolvers send 436 all queries to a single resolver. 438 6.1. 
Caching considerations 440 Using a common cache for multiple resolvers introduces the 441 possibility that a resolver could learn about queries that were 442 originally directed to another resolver by observing the absence of 443 queries. Clients can address this by keeping a per-resolver cache 444 and only using the cache for the selected resolver, though this can 445 reduce caching performance. 447 6.2. Consistency considerations 449 Making the same query to multiple resolvers can result in different 450 answers. For instance, DNS-based load balancing can lead to 451 different answers being produced over time or for different query 452 origins. Or, different resolvers might have different policies with 453 respect to blocking or filtering of queries that lead to clients 454 receiving inconsistent answers. 456 In the extreme, an application might encounter errors as a result of 457 receiving incompatible answers, particularly if a server operator 458 (incorrectly) assumes that different DNS queries for the same client 459 always originate from the same source address. This is most likely 460 to occur if name-based selection is used, as queries could be related 461 based on information that the client does not consider. 463 6.3. Resolver load distribution and failover 465 Any selection of resolvers that is based on random inputs will need 466 to account for available capacity on resolvers. Otherwise, resolvers 467 with less available query-processing capacity will receive too high a 468 proportion of all queries. Clients only need to be informed of 469 relative available capacity in order to make an appropriate 470 selection. How relative capacities of resolvers are determined is 471 not in scope for this document. 473 The choice of different resolvers would also need to work well with 474 whatever mechanisms exist for failover to alternate resolvers when 475 one is not responsive. 
The same is true of IPv4/IPv6 connectivity, 476 the availability of communications to specific ports, etc. And the 477 dynamic situation should obviously not lead to extensive leakage to 478 different resolvers, either. 480 6.4. Query performance 482 Distribution of queries between resolvers also means that clients are 483 exposed to greater variations in performance. 485 6.5. Debugging 487 The use of multiple resolvers may complicate debugging. 489 7. Further work 491 Should there be interest in the deployment of ideas laid out in this 492 memo, further work is needed. There would have to be ways to 493 configure systems to use multiple resolvers, including for instance: 495 o Central configuration mechanisms to enable the use of multiple 496 resolvers, perhaps through usual network configuration mechanisms 497 or choices made by applications using resolver services directly. 498 It may also be necessary to employ discovery mechanisms, such as 499 [I-D.schinazi-httpbis-doh-preference-hints] (but see 500 Section 3). 502 o Mechanisms to allow both failover to working resolvers when a 503 resolver is unreachable, 505 o Additional testing for potential operational issues discussed in 506 Section 2 would be beneficial. 508 Finally, more work is needed to determine factors other than privacy 509 that could motivate having queries routed to the same resolver. The 510 choice between different approaches often depends on a combination of several 511 factors, and privacy is only one of those factors. 513 8. References 515 8.1. Normative References 517 [DNS] Mockapetris, P., "Domain names - implementation and 518 specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, 519 November 1987. 521 [DNSSEC] Arends, R., Austein, R., Larson, M., Massey, D., and S. 522 Rose, "DNS Security Introduction and Requirements", 523 RFC 4033, DOI 10.17487/RFC4033, March 2005. 526 [DOH] Hoffman, P. and P. 
McManus, "DNS Queries over HTTPS 527 (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018. 530 [DOT] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., 531 and P. Hoffman, "Specification for DNS over Transport 532 Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 533 2016. 535 [PMON] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an 536 Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May 537 2014. 539 8.2. Informative References 541 [DBOUND] Levine, J., "Publishing Organization Boundaries in the 542 DNS", draft-levine-dbound-dns-03 (work in progress), April 543 2019. 545 [I-D.schinazi-httpbis-doh-preference-hints] 546 Schinazi, D., Sullivan, N., and J. Kipp, "DoH Preference 547 Hints for HTTP", draft-schinazi-httpbis-doh-preference- 548 hints-00 (work in progress), July 2019. 550 [MSCVUS] Wikipedia, "Microsoft Corp. v. United States", 551 https://en.wikipedia.org/wiki/ 552 Microsoft_Corp._v._United_States , n.d. 554 8.3. URIs 556 [1] https://publicsuffix.org/ 558 [2] https://github.com/mozilla-services/shavar-prod- 559 lists/blob/master/disconnect-entitylist.json 561 [3] https://github.com/krgovind/first-party-sets 563 Appendix A. Acknowledgements 565 The authors would like to thank Christian Huitema, Ari Keraenen, Mark 566 Nottingham, Stephen Farrell, Gonzalo Camarillo, Mirja Kuehlewind, 567 David Allan, Daniel Migault, Goran AP Eriksson, and many others for 568 interesting discussions in this problem space. 570 Authors' Addresses 572 Jari Arkko 573 Ericsson 575 Email: jari.arkko@piuha.net 577 Martin Thomson 578 Mozilla 580 Email: martin.thomson@gmail.com 582 Ted Hardie 583 Google 585 Email: ted.ietf@gmail.com