DNS PRIVate Exchange (dprive) Working Group                S. Bortzmeyer
Internet-Draft                                                     AFNIC
Intended status: Informational                             June 15, 2015
Expires: December 17, 2015

                       DNS privacy considerations
                 draft-ietf-dprive-problem-statement-06

Abstract

   This document describes the privacy issues associated with the use of
   the DNS by Internet users. It is intended to be an analysis of the
   present situation and does not prescribe solutions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 17, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Risks
     2.1. The alleged public nature of DNS data
     2.2. Data in the DNS request
     2.3. Cache snooping
     2.4. On the wire
     2.5. In the servers
       2.5.1. In the recursive resolvers
       2.5.2. In the authoritative name servers
       2.5.3. Rogue servers
     2.6. Re-identification and other inferences
   3. Actual "attacks"
   4. Legalities
   5. Security considerations
   6. Acknowledgments
   7. IANA considerations
   8. References
     8.1. Normative References
     8.2. Informative References
     8.3. URIs
   Author's Address

1. Introduction

   This document is an analysis of the DNS privacy issues, in the spirit
   of section 8 of [RFC6973].

   The Domain Name System is specified in [RFC1034] and [RFC1035] and
   many later RFCs, which have never been consolidated. It is one of
   the most important infrastructure components of the Internet and
   often ignored or misunderstood by Internet users (and even by many
   professionals). Almost every activity on the Internet starts with a
   DNS query (and often several). Its use has many privacy implications
   and this is an attempt at a comprehensive and accurate list.

   Let us begin with a simplified reminder of how the DNS works.
   (See also [I-D.ietf-dnsop-dns-terminology].) A client, the stub
   resolver, issues a DNS query to a server, called the recursive
   resolver (also called caching resolver or full resolver or recursive
   name server). Let's use the query "What are the AAAA records for
   www.example.com?" as an example. AAAA is the QTYPE (Query Type), and
   www.example.com is the QNAME (Query Name). (The description which
   follows assumes a cold cache, for instance because the server just
   started.) The recursive resolver will first query the root
   nameservers. In most cases, the root nameservers will send a
   referral. In this example, the referral will be to the .com
   nameservers. The resolver repeats the query to one of the .com
   nameservers. The .com nameservers, in turn, will refer to the
   example.com nameservers. The example.com nameserver will then return
   the answer. The root name servers, the name servers of .com and the
   name servers of example.com are called authoritative name servers.
   It is important, when analyzing the privacy issues, to remember that
   the question asked of all these name servers is always the original
   question, not a derived question. The question sent to the root name
   servers is "What are the AAAA records for www.example.com?", not
   "What are the name servers of .com?". By repeating the full
   question, instead of just the relevant part of the question to the
   next in line, the DNS provides more information than necessary to
   the nameserver.

   Because DNS relies on caching heavily, the algorithm described just
   above is actually a bit more complicated, and not all questions are
   sent to the authoritative name servers.
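   The cold-cache walk described above can be sketched as a toy model.
   The Python below is illustrative only: the delegation table and the
   walk() helper are assumptions made for this sketch, not part of any
   DNS implementation, and no network traffic is generated. It shows
   that each authoritative server on the path receives the same full
   question.

```python
# Illustrative only: a toy model of iterative resolution with a cold
# cache. The delegation table below is an assumption made for this
# sketch; nothing is sent on the network.

# Zone -> the child zones it delegates to.
DELEGATIONS = {
    ".": ["com."],
    "com.": ["example.com."],
    "example.com.": [],
}


def walk(qname, qtype):
    """Return the (server, question) pairs seen during a cold-cache walk."""
    seen = []
    zone = "."
    while True:
        # Every server on the path receives the full, original question.
        seen.append((zone, f"{qname} {qtype}"))
        # Follow the referral to the child zone that contains qname.
        child = next(
            (z for z in DELEGATIONS[zone]
             if qname.endswith("." + z) or qname == z),
            None,
        )
        if child is None:
            return seen  # no further referral: this zone answers
        zone = child


observed = walk("www.example.com.", "AAAA")
for server, question in observed:
    print(f"{server:<13} sees: {question}")
```

   Running the sketch prints one line per server contacted (root, .com
   and example.com); all three lines carry the question for
   www.example.com.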
   If a few seconds later the stub resolver asks the recursive
   resolver, "What are the SRV records of
   _xmpp-server._tcp.example.com?", the recursive resolver will
   remember that it knows the name servers of example.com and will just
   query them, bypassing the root and .com. Because there is typically
   no caching in the stub resolver, the recursive resolver, unlike the
   authoritative servers, sees all the DNS traffic. (Applications,
   like Web browsers, may have some form of caching which does not
   follow DNS rules, for instance because it may ignore the TTL. So,
   the recursive resolver does not see all the name resolution
   activity.)

   It should be noted that DNS recursive resolvers sometimes forward
   requests to other recursive resolvers, typically bigger machines,
   with a larger and more shared cache (and the query hierarchy can be
   even deeper, with more than two levels of recursive resolvers). From
   the point of view of privacy, these forwarders are like resolvers,
   except that they do not see all of the requests being made (due to
   caching in the first resolver).

   Almost all this DNS traffic is currently sent in clear (unencrypted).
   There are a few cases where there is some channel encryption, for
   instance in an IPsec VPN, at least between the stub resolver and the
   resolver.

   Today, almost all DNS queries are sent over UDP [thomas-ditl-tcp].
   This has practical consequences when considering encryption of the
   traffic as a possible privacy technique. Some encryption solutions
   are only designed for TCP, not UDP.

   Another important point to keep in mind when analyzing the privacy
   issues of DNS is the fact that DNS requests received by a server are
   triggered for different reasons. Let's assume an eavesdropper wants
   to know which Web page is viewed by a user.
   For a typical Web page, there are three sorts of DNS requests being
   issued:

      Primary request: this is the domain name in the URL that the user
      typed, selected from a bookmark, or chose by clicking on a
      hyperlink. Presumably, this is what is of interest for the
      eavesdropper.

      Secondary requests: these are the additional requests performed by
      the user agent (here, the Web browser) without any direct
      involvement or knowledge of the user. For the Web, they are
      triggered by embedded content, CSS sheets, JavaScript code,
      embedded images, etc. In some cases, there can be dozens of
      domain names in different contexts on a single Web page.

      Tertiary requests: these are the additional requests performed by
      the DNS system itself. For instance, if the answer to a query is
      a referral to a set of name servers, and the glue records are not
      returned, the resolver will have to do additional requests to turn
      name servers' names into IP addresses. Similarly, even if glue
      records are returned, a careful recursive server will do tertiary
      requests to verify the IP addresses of those records.

   It can be noted also that, in the case of a typical Web browser, more
   DNS requests than strictly necessary are sent, for instance to
   prefetch resources that the user may query later, or when
   autocompleting the URL in the address bar. Both are a big privacy
   concern since they may leak information even about non-explicit
   actions. For instance, just reading a local HTML page, even without
   selecting the hyperlinks, may trigger DNS requests.

   For privacy-related terms, we will use here the terminology of
   [RFC6973].

2. Risks

   This document focuses mostly on the study of privacy risks for the
   end-user (the one performing DNS requests). We consider the risks of
   pervasive surveillance ([RFC7258]) as well as risks coming from a
   more focused surveillance.
   Privacy risks for the holder of a zone (the risk that someone gets
   the data) are discussed in [RFC5936] and [RFC5155]. Non-privacy
   risks (such as cache poisoning) are out of scope.

2.1. The alleged public nature of DNS data

   It has long been claimed that "the data in the DNS is public". While
   this sentence makes sense for an Internet-wide lookup system, there
   are multiple facets to the data and metadata involved that deserve a
   more detailed look. First, access control lists and private
   namespaces notwithstanding, the DNS operates under the assumption
   that public-facing authoritative name servers will respond to "usual"
   DNS queries for any zone they are authoritative for without further
   authentication or authorization of the client (resolver). Due to the
   lack of search capabilities, only a given QNAME will reveal the
   resource records associated with that name (or that name's
   non-existence). In other words: one needs to know what to ask for,
   in order to receive a response. The zone transfer QTYPE [RFC5936] is
   often blocked or restricted to authenticated/authorized access to
   enforce this difference (and maybe for other reasons).

   Another differentiation to be considered is between the DNS data
   itself and a particular transaction (i.e., a DNS name lookup). DNS
   data and the results of a DNS query are public, within the boundaries
   described above, and may not have any confidentiality requirements.
   However, the same is not true of a single transaction or sequence of
   transactions; that transaction is not/should not be public. A
   typical example from outside the DNS world is: the Web site of
   Alcoholics Anonymous is public; the fact that you visit it should not
   be.

2.2. Data in the DNS request

   The DNS request includes many fields but two of them seem
   particularly relevant for the privacy issues: the QNAME and the
   source IP address. "source IP address" is used in a loose sense of
   "source IP address + maybe source port", because the port is also in
   the request and can be used to differentiate between several users
   sharing an IP address (behind a CGN for instance [RFC6269]).

   The QNAME is the full name sent by the user. It gives information
   about what the user does ("What are the MX records of example.net?"
   means the user probably wants to send email to someone at
   example.net, which may be a domain used by only a few persons and
   therefore very revealing about communication relationships). Some
   QNAMEs are more sensitive than others. For instance, querying the A
   record of a well-known Web statistics domain reveals very little
   (everybody visits Web sites which use this analytics service) but
   querying the A record of www.verybad.example, where verybad.example
   is the domain of an organization that some people find offensive or
   objectionable, may create more problems for the user. Also,
   sometimes, the QNAME reveals the software one uses, which could be a
   privacy issue. For instance,
   _ldap._tcp.Default-First-Site-Name._sites.gc._msdcs.example.org.
   There are also some BitTorrent clients that query a SRV record for
   _bittorrent-tracker._tcp.domain.example.

   Another important aspect of the privacy of the QNAME is future uses.
   Today, the lack of privacy is an obstacle to putting
   potentially sensitive or personally identifiable data in the DNS. At
   the moment your DNS traffic might reveal that you are doing email but
   not with whom. If your MUA starts looking up PGP keys in the DNS
   [I-D.wouters-dane-openpgp] then privacy becomes a lot more important.
   And email is just an example; there would be other really interesting
   uses for a more privacy-friendly DNS.

   For the communication between the stub resolver and the recursive
   resolver, the source IP address is the address of the user's machine.
   Therefore, all the issues and warnings about collection of IP
   addresses apply here. For the communication between the recursive
   resolver and the authoritative name servers, the source IP address
   has a different meaning; it does not have the same status as the
   source address in an HTTP connection. It is now the IP address of
   the recursive resolver which, in a way, "hides" the real user.
   However, hiding does not always work. Sometimes
   [I-D.ietf-dnsop-edns-client-subnet] is used (see its privacy analysis
   in [denis-edns-client-subnet]). Sometimes the end user has a
   personal recursive resolver on her machine. In both cases, the IP
   address is as sensitive as it is for HTTP [sidn-entrada].

   A note about IP addresses: there is currently no IETF document which
   describes in detail all the privacy issues around IP addressing. In
   the meantime, the discussion here is intended to include both IPv4
   and IPv6 source addresses. For a number of reasons their assignment
   and utilization characteristics are different, which may have
   implications for details of information leakage associated with the
   collection of source addresses. (For example, a specific IPv6 source
   address seen on the public Internet is less likely than an IPv4
   address to originate behind a CGN or other NAT.) However, for both
   IPv4 and IPv6 addresses, it's important to note that source addresses
   are propagated with queries and comprise metadata about the host,
   user, or application that originated them.

2.3. Cache snooping

   The content of recursive resolvers' caches can reveal data about the
   clients using them (the privacy risks depend on the number of
   clients). This information can sometimes be examined by sending DNS
   queries with RD=0 to inspect cache content, particularly looking at
   the DNS TTLs [grangeia.snooping].
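   As a minimal sketch of what "RD=0" means on the wire, the Python
   below builds two queries in the message format of [RFC1035] that
   differ only in the RD (recursion desired) header bit. The
   build_query() helper and the query ID are assumptions of this sketch;
   it only constructs the bytes and sends nothing. A resolver that
   honours RD=0 typically does not recurse for such a query, so whether
   an answer comes back (and with which remaining TTL) reflects what is
   already in its cache.

```python
import struct

RD_BIT = 0x0100  # "recursion desired" flag in the DNS header (RFC 1035)
AAAA = 28        # QTYPE value for AAAA records


def build_query(qname, qtype, recursion_desired, query_id=0x1234):
    """Encode a single-question DNS query in RFC 1035 wire format."""
    flags = RD_BIT if recursion_desired else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero octet.
    labels = qname.rstrip(".").split(".")
    name = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in labels
    ) + b"\x00"
    # QTYPE, then QCLASS (1 = IN).
    return header + name + struct.pack("!HH", qtype, 1)


# The snooping probe and an ordinary query differ only in the flags field.
probe = build_query("www.example.com", AAAA, recursion_desired=False)
normal = build_query("www.example.com", AAAA, recursion_desired=True)
```

   The two messages are identical except for the flags field, which
   underlines that the probe looks like any other query to an on-path
   observer.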
   Since this is also a reconnaissance technique for subsequent cache
   poisoning attacks, some countermeasures have already been developed
   and deployed.

2.4. On the wire

   DNS traffic can be seen by an eavesdropper like any other traffic.
   It is typically not encrypted. (DNSSEC, specified in [RFC4033],
   explicitly excludes confidentiality from its goals.) So, if an
   initiator starts an HTTPS communication with a recipient, while the
   HTTP traffic will be encrypted, the DNS exchange prior to it will not
   be. As other protocols become more and more privacy-aware and
   secured against surveillance, the DNS may become "the weakest link"
   in privacy.

   An important characteristic of DNS traffic is that it may take a
   different path than the communication between the initiator and the
   recipient. For instance, an eavesdropper may be unable to tap the
   wire between the initiator and the recipient but may have access to
   the wire going to the recursive resolver, or to the authoritative
   name servers.

   The best place to tap, from an eavesdropper's point of view, is
   clearly between the stub resolvers and the recursive resolvers,
   because traffic is not limited by DNS caching.

   The attack surface between the stub resolver and the rest of the
   world can vary widely depending upon how the end user's computer is
   configured. In order of increasing attack surface:

      The recursive resolver can be on the end user's computer. In
      (currently) a small number of cases, individuals may choose to
      operate their own DNS resolver on their local machine. In this
      case the attack surface for the connection between the stub
      resolver and the caching resolver is limited to that single
      machine.

      The recursive resolver may be at the local network edge. For
      many/most enterprise networks and for some residential users the
      caching resolver may exist on a server at the edge of the local
      network.
      In this case the attack surface is the local network.
      Note that in large enterprise networks the DNS resolver may not be
      located at the edge of the local network but rather at the edge of
      the overall enterprise network. In this case the enterprise
      network could be thought of as similar to the IAP (Internet Access
      Provider) network referenced below.

      The recursive resolver can be in the IAP (Internet Access
      Provider) premises. For most residential users and potentially
      other networks the typical case is for the end user's computer to
      be configured (typically automatically through DHCP) with the
      addresses of the DNS recursive resolvers at the IAP. The attack
      surface for on-the-wire attacks is therefore from the end user
      system across the local network and across the IAP network to the
      IAP's recursive resolvers.

      The recursive resolver can be a public DNS service. Some machines
      may be configured to use public DNS resolvers such as those
      operated today by Google Public DNS or OpenDNS. The end user may
      have configured their machine to use these DNS recursive resolvers
      themselves - or their IAP may have chosen to use the public DNS
      resolvers rather than operating their own resolvers. In this case
      the attack surface is the entire public Internet between the end
      user's connection and the public DNS service.

2.5. In the servers

   Using the terminology of [RFC6973], the DNS servers (recursive
   resolvers and authoritative servers) are enablers: they facilitate
   communication between an initiator and a recipient without being
   directly in the communications path. As a result, they are often
   forgotten in risk analysis. But, to quote again [RFC6973],
   "Although [...]
   enablers may not generally be considered as attackers, they may
   all pose privacy threats (depending on the context) because they are
   able to observe, collect, process, and transfer privacy-relevant
   data." In [RFC6973] parlance, enablers become observers when they
   start collecting data.

   Many programs exist to collect and analyze DNS data at the servers.
   They range from the "query log" of some programs like BIND, to
   tcpdump, and to more sophisticated programs like PacketQ [packetq]
   and DNSmezzo [dnsmezzo]. The organization managing the DNS server
   can use these data itself or it can be part of a surveillance program
   like PRISM [prism] and pass data to an outside observer.

   Sometimes, these data are kept for a long time and/or distributed to
   third parties, for research purposes [ditl] [day-at-root], for
   security analysis, or for surveillance tasks. These uses are
   sometimes under some sort of contract, with various limitations, for
   instance on redistribution, given the sensitive nature of the data.
   Also, there are observation points in the network which gather DNS
   data and then make it accessible to third parties for research or
   security purposes ("passive DNS" [passive-dns]).

2.5.1. In the recursive resolvers

   Recursive resolvers see all the traffic since there is typically no
   caching before them. To summarize: your recursive resolver knows a
   lot about you. The resolver of a large IAP, or a large public
   resolver, can collect data from many users. You may get an idea of
   the data collected by reading the privacy policy of a big public
   resolver [1].

2.5.2. In the authoritative name servers

   Unlike what happens for recursive resolvers, observation capabilities
   of authoritative name servers are limited by caching; they see only
   the requests for which the answer was not in the cache.
   For aggregated statistics ("What is the percentage of LOC queries?"),
   this is sufficient; but it prevents an observer from seeing
   everything. Still, the authoritative name servers see a part of the
   traffic, and this subset may be sufficient to violate some privacy
   expectations.

   Also, the end user typically has some legal/contractual link with the
   recursive resolver (he has chosen the IAP, or he has chosen to use a
   given public resolver), while having no control and perhaps no
   awareness of the role of the authoritative name servers and their
   observation abilities.

   As noted before, using a local resolver or a resolver close to the
   machine decreases the attack surface for an on-the-wire eavesdropper.
   But it may decrease privacy against an observer located on an
   authoritative name server. This authoritative name server will see
   the IP address of the end client, instead of the address of a big
   recursive resolver shared by many users.

   This "protection", when using a large resolver with many clients, is
   no longer present if [I-D.ietf-dnsop-edns-client-subnet] is used
   because, in this case, the authoritative name server sees the
   original IP address (or prefix, depending on the setup).

   As of today, all the instances of one root name server, L-root,
   receive together around 50,000 queries per second. While most of it
   is "junk" (errors on the TLD name), it gives an idea of the amount of
   big data which pours into name servers. (And even "junk" can leak
   information, for instance if there is a typing error in the TLD, the
   user will send data to a TLD which is not the usual one.)

   Many domains, including TLDs, are partially hosted by third-party
   servers, sometimes in a different country. The contracts between the
   domain manager and these servers may or may not take privacy into
   account.
   Whatever the contract, the third-party hoster may be honest
   or not but, in any case, it will have to follow its local laws. So,
   requests to a given ccTLD may go to servers managed by organizations
   outside of the ccTLD's country. End-users may not anticipate that,
   when doing a security analysis.

   Also, it seems [aeris-dns] that there is a strong concentration of
   authoritative name servers among "popular" domains (such as the Alexa
   Top N list). For instance, among the Alexa Top 100k, one DNS
   provider hosts today 10% of the domains. The ten most important DNS
   providers host together one third of the domains. With the control
   (or the ability to sniff the traffic) of a few name servers, you can
   gather a lot of information.

2.5.3. Rogue servers

   The previous paragraphs discussed DNS privacy, assuming that all the
   traffic was directed to the intended servers, and that the potential
   attacker was purely passive. But, in reality, we can have active
   attackers, redirecting the traffic, not for changing it but just to
   observe it.

   For instance, a rogue DHCP server, or a trusted DHCP server that has
   had its configuration altered by malicious parties, can direct you to
   a rogue recursive resolver. Most of the time, it seems to be done to
   divert traffic, by providing lies for some domain names. But it
   could be used just to capture the traffic and gather information
   about you. Other attacks, besides using DHCP, are possible. The
   traffic from a DNS client to a DNS server can be intercepted along
   its way from originator to intended destination; for instance by
   transparent DNS proxies in the network that will divert the traffic
   intended for a legitimate DNS server. This rogue server can
   masquerade as the intended server and respond with data to the
   client. (Rogue servers that inject malicious data are possible, but
   that is a separate problem not relevant to privacy.)
   A rogue server may respond correctly for a long period of time,
   thereby avoiding detection. This may be done for what could be
   claimed to be good reasons, such as optimization or caching, but it
   leads to a reduction of privacy compared to the situation where no
   attacker is present. Also, malware like DNSchanger [dnschanger] can
   change the recursive resolver in the machine's configuration, or the
   routing itself can be subverted (for instance [turkey-googledns]).

   A practical consequence of this section is that solutions for DNS
   privacy may have to address authentication of the server, not just
   passive sniffing.

2.6. Re-identification and other inferences

   An observer has access not only to the data he/she directly collects
   but also to the results of various inferences about these data.

   For instance, a user can be re-identified via DNS queries. If the
   adversary knows a user's identity and can watch their DNS queries for
   a period, then that same adversary may be able to re-identify the
   user solely based on their pattern of DNS queries later on regardless
   of the location from which the user makes those queries. For
   example, one study [herrmann-reidentification] found that such
   re-identification is possible so that "73.1% of all day-to-day links
   were correctly established, i.e. user u was either re-identified
   unambiguously (1) or the classifier correctly reported that u was not
   present on day t+1 any more (2)". While that study related to web
   browsing behaviour, equally characteristic patterns may be produced
   even in machine-to-machine communications or without a user taking
   specific actions, e.g. at reboot time if a characteristic set of
   services are accessed by the device.

   For instance, one could imagine an intelligence agency identifying
   people going to a site by putting in a very long DNS name and
   looking for queries of a specific length.
   Such traffic analysis could weaken some privacy solutions.

   The IAB privacy and security programme also has a work in progress
   [I-D.iab-privsec-confidentiality-threat] that considers such
   inference-based attacks in a more general framework.

3. Actual "attacks"

   A very quick examination of DNS traffic may lead to the false
   conclusion that extracting the needle from the haystack is difficult.
   "Interesting" primary DNS requests are mixed with useless (for the
   eavesdropper) secondary and tertiary requests (see the terminology in
   Section 1). But, in this time of "big data" processing, powerful
   techniques now exist to get from the raw data to what the
   eavesdropper is actually interested in.

   Many research papers about malware detection use DNS traffic to
   detect "abnormal" behaviour that can be traced back to the activity
   of malware on infected machines. Yes, this research was done for the
   good; but, technically, it is a privacy attack and it demonstrates
   the power of the observation of DNS traffic. See [dns-footprint],
   [dagon-malware] and [darkreading-dns].

   Passive DNS systems [passive-dns] allow reconstruction of the data of
   sometimes an entire zone. They are used for many reasons, some good,
   some bad. Well-known passive DNS systems keep only the DNS
   responses, and not the source IP address of the client, precisely for
   privacy reasons. Other passive DNS systems may not be so careful.
   And there are still potential problems with revealing QNAMEs.

   The revelations (from the Edward Snowden documents, leaked from the
   NSA) of the MORECOWBELL surveillance program [morecowbell], which
   uses the DNS, both passively and actively, to surreptitiously gather
   information about the users, are another good example showing that
   the lack of privacy protections in the DNS is actively exploited.

4. Legalities

   To our knowledge, there are no specific privacy laws for DNS data, in
   any country. Interpreting general privacy laws like
   [data-protection-directive] (European Union) in the context of DNS
   traffic data is not an easy task and we do not know of a court
   precedent here. An interesting analysis is [sidn-entrada].

5. Security considerations

   This document is entirely about security, more precisely privacy. It
   just lays out the problem; it does not try to set requirements (with
   the choices and compromises they imply), much less to define
   solutions. Possible solutions to the issues described here are
   discussed in other documents (currently too many to all be
   mentioned), see for instance [I-D.ietf-dnsop-qname-minimisation] for
   the minimisation of data, or [I-D.ietf-dprive-start-tls-for-dns]
   about encryption.

6. Acknowledgments

   Thanks to Nathalie Boulvard and to the CENTR members for the original
   work which led to this document. Thanks to Ondrej Sury for the
   interesting discussions. Thanks to Mohsen Souissi and John Heidemann
   for proofreading, to Paul Hoffman, Matthijs Mekking, Marcos Sanz, Tim
   Wicinski, Francis Dupont, Allison Mankin and Warren Kumari for
   proofreading, technical remarks, and many readability improvements.
   Thanks to Dan York, Suzanne Woolf, Tony Finch, Stephen Farrell, Peter
   Koch, Simon Josefsson and Frank Denis for good written contributions.
   And thanks to the IESG members for the last remarks.

7. IANA considerations

   This document has no actions for IANA.

8. References

8.1. Normative References

   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities",
              STD 13, RFC 1034, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987.

   [RFC6973]  Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
              Morris, J., Hansen, M., and R.
              Smith, "Privacy Considerations for Internet Protocols",
              RFC 6973, July 2013.

   [RFC7258]  Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
              Attack", BCP 188, RFC 7258, May 2014.

8.2.  Informative References

   [RFC4033]  Arends, R., Austein, R., Larson, M., Massey, D., and S.
              Rose, "DNS Security Introduction and Requirements", RFC
              4033, March 2005.

   [RFC5155]  Laurie, B., Sisson, G., Arends, R., and D. Blacka, "DNS
              Security (DNSSEC) Hashed Authenticated Denial of
              Existence", RFC 5155, March 2008.

   [RFC5936]  Lewis, E. and A. Hoenes, "DNS Zone Transfer Protocol
              (AXFR)", RFC 5936, June 2010.

   [RFC6269]  Ford, M., Boucadair, M., Durand, A., Levis, P., and P.
              Roberts, "Issues with IP Address Sharing", RFC 6269, June
              2011.

   [I-D.ietf-dnsop-edns-client-subnet]
              Contavalli, C., Gaast, W., Lawrence, D., and W. Kumari,
              "Client Subnet in DNS Querys",
              draft-ietf-dnsop-edns-client-subnet-01 (work in progress),
              May 2015.

   [I-D.iab-privsec-confidentiality-threat]
              Barnes, R., Schneier, B., Jennings, C., Hardie, T.,
              Trammell, B., Huitema, C., and D. Borkmann,
              "Confidentiality in the Face of Pervasive Surveillance: A
              Threat Model and Problem Statement",
              draft-iab-privsec-confidentiality-threat-07 (work in
              progress), May 2015.

   [I-D.wouters-dane-openpgp]
              Wouters, P., "Using DANE to Associate OpenPGP public keys
              with email addresses", draft-wouters-dane-openpgp-02 (work
              in progress), February 2014.

   [I-D.ietf-dprive-start-tls-for-dns]
              Zi, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D.,
              and P. Hoffman, "TLS for DNS: Initiation and Performance
              Considerations", draft-ietf-dprive-start-tls-for-dns-00
              (work in progress), May 2015.

   [I-D.ietf-dnsop-qname-minimisation]
              Bortzmeyer, S., "DNS query name minimisation to improve
              privacy", draft-ietf-dnsop-qname-minimisation-03 (work in
              progress), June 2015.

   [I-D.ietf-dnsop-dns-terminology]
              Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", draft-ietf-dnsop-dns-terminology-02 (work in
              progress), May 2015.

   [denis-edns-client-subnet]
              Denis, F., "Security and privacy issues of
              edns-client-subnet", August 2013, .

   [dagon-malware]
              Dagon, D., "Corrupted DNS Resolution Paths: The Rise of a
              Malicious Resolution Authority", 2007, .

   [dns-footprint]
              Stoner, E., "DNS footprint of malware", October 2010, .

   [morecowbell]
              Grothoff, C., Wachs, M., Ermert, M., and J. Appelbaum,
              "NSA's MORECOWBELL: Knell for DNS", January 2015, .

   [darkreading-dns]
              Lemos, R., "Got Malware? Three Signs Revealed In DNS
              Traffic", May 2013, .

   [dnschanger]
              Wikipedia, , "DNSchanger", November 2011, .

   [packetq]  Dot SE, , "PacketQ, a simple tool to make SQL-queries
              against PCAP-files", 2011, .

   [dnsmezzo]
              Bortzmeyer, S., "DNSmezzo", 2009, .

   [prism]    NSA, , "PRISM", 2007, .

   [grangeia.snooping]
              Grangeia, L., "DNS Cache Snooping or Snooping the Cache
              for Fun and Profit", 2004, .

   [ditl]     CAIDA, , "A Day in the Life of the Internet (DITL)", 2002,
              .

   [day-at-root]
              Castro, S., Wessels, D., Fomenkov, M., and K. Claffy, "A
              Day at the Root of the Internet", 2008, .

   [turkey-googledns]
              Bortzmeyer, S., "Hijacking of public DNS servers in
              Turkey, through routing", 2014, .

   [data-protection-directive]
              Europe, , "European directive 95/46/EC on the protection
              of individuals with regard to the processing of personal
              data and on the free movement of such data", November
              1995, .

   [passive-dns]
              Weimer, F., "Passive DNS Replication", April 2005, .

   [tor-leak]
              Tor, , "DNS leaks in Tor", 2013, .

   [yanbin-tsudik]
              Yanbin, L. and G. Tsudik, "Towards Plugging Privacy Leaks
              in the Domain Name System", 2009, .

   [castillo-garcia]
              Castillo-Perez, S. and J. Garcia-Alfaro, "Anonymous
              Resolution of DNS Queries", 2008, .

   [fangming-hori-sakurai]
              Fangming, , Hori, Y., and K. Sakurai, "Analysis of Privacy
              Disclosure in DNS Query", 2007, .

   [thomas-ditl-tcp]
              Thomas, M. and D. Wessels, "An Analysis of TCP Traffic in
              Root Server DITL Data", 2014, .

   [federrath-fuchs-herrmann-piosecny]
              Federrath, H., Fuchs, K., Herrmann, D., and C. Piosecny,
              "Privacy-Preserving DNS: Analysis of Broadcast, Range
              Queries and Mix-Based Protection Methods", 2011, .

   [aeris-dns]
              Vinot, N., "[In French] Vie privee : et le DNS alors ?",
              2015, .

   [herrmann-reidentification]
              Herrmann, D., Gerber, C., Banse, C., and H. Federrath,
              "Analyzing characteristic host access patterns for
              re-identification of web user sessions", 2012, .

   [sidn-entrada]
              Hesselman, C., Jansen, J., Wullink, M., Vink, K., and M.
              Simon, "A privacy framework for 'DNS big data'
              applications", 2014, .

8.3.  URIs

   [1] https://developers.google.com/speed/public-dns/privacy

Author's Address

   Stephane Bortzmeyer
   AFNIC
   1, rue Stephenson
   Montigny-le-Bretonneux  78180
   France

   Phone: +33 1 39 30 83 46
   Email: bortzmeyer+ietf@nic.fr
   URI:   http://www.afnic.fr/