Network Working Group                                         C. Huitema
Internet-Draft                                                 D. Thaler
Intended status: Informational                                 Microsoft
Expires: November 12, 2016                                     R. Winter
                                 University of Applied Sciences Augsburg
                                                            May 11, 2016


              Current Hostname Practice Considered Harmful
              draft-ietf-intarea-hostname-practice-02.txt

Abstract

   Giving a hostname to your computer and publishing it as you roam from
   one network to another is the Internet equivalent of walking around
   with a name tag affixed to your lapel.
   This current practice can
   significantly compromise your privacy, and something should change in
   order to mitigate these privacy threats.

   There are several possible remedies, such as fixing a variety of
   protocols or avoiding disclosing a hostname at all.  This document
   describes some of the protocols that reveal hostnames today and
   sketches another possible remedy, which is to replace static
   hostnames by frequently changing randomized values.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 12, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Naming Practices
   3.  Partial Identifiers
   4.  Protocols that leak Hostnames
     4.1.  DHCP
     4.2.  DNS Address to Name Resolution
     4.3.  Multicast DNS
     4.4.  Link-local Multicast Name Resolution
     4.5.  DNS-Based Service Discovery
   5.  Randomized Hostnames as Remedy
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Informative References
   Authors' Addresses

1.  Introduction

   There is a long established practice of giving names to computers.
   In the Internet protocols, these names are referred to as "hostnames"
   [RFC7719].  Hostnames are normally used in conjunction with a domain
   name suffix to build the "Fully Qualified Domain Name" (FQDN) of a
   host.  However, it is common practice to use the hostname without
   further qualification in a variety of applications from file sharing
   to network management.  Hostnames are typically published as part of
   domain names, and can be obtained through a variety of name lookup
   and discovery protocols.

   Hostnames have to be unique within the domain in which they are
   created and used.  They do not have to be globally unique
   identifiers, but they will always be at least partial identifiers, as
   discussed in Section 3.

   The disclosure of information through hostnames creates a problem for
   mobile devices.
   Adversaries that monitor a remote network such as a
   Wi-Fi hot spot can obtain the hostname through passive monitoring or
   active probing of a variety of Internet protocols, such as DHCP or
   Multicast DNS (mDNS).  They can correlate the hostname with various
   other information extracted from traffic analysis and other
   information sources, and can potentially identify the device, device
   properties and its user [TRAC2016].

2.  Naming Practices

   There are many reasons to give names to computers.  This is
   particularly true when computers operate on a network.  Operating
   systems like Microsoft Windows or Unix assume that computers have a
   "hostname."  This enables users and administrators to do things such
   as ping a computer, add its name to an access control list, remotely
   mount a computer disk, or connect to the computer through tools such
   as telnet or remote desktop.  Other operating systems maintain
   multiple hostnames for different purposes, e.g., for use with certain
   protocols such as mDNS.

   In most consumer networks, naming is pretty much left to the fancy of
   the user.  Some will pick names of planets or stars, others names of
   fruits or flowers, and others will pick whatever suits their mood
   when they unwrap the device.  As long as users are careful not to
   pick a name already in use on the same network, anything goes.  Very
   often, however, the operating system suggests a hostname at install
   time, which can contain the user name, the login name and information
   learned from the device itself such as the brand, model or maker of
   the device [TRAC2016].

   In large organizations, collisions are more likely and a more
   structured approach is necessary.  In theory, organizations could use
   multiple DNS subdomains to ease the pressure on uniqueness, but in
   practice many don't and insist on unique flat names, if only to
   simplify network management.
   To ensure unique names, organizations
   will set naming guidelines and enforce some kind of structured
   naming.  For example, within the Microsoft corporate network,
   computer names are derived from the login name of the main user,
   leading to names like "huitema-test2" for a machine that one of the
   authors uses to test software.

   There is less pressure to assign names to small devices, including
   for example smart phones, as these devices typically do not enable
   sharing of their disks or remote login.  As a consequence, these
   devices often have manufacturer-assigned names, which vary from very
   generic like "Windows Phone" to completely unique like
   "BrandX-123456-7890-abcdef" and often contain the name of the device
   owner, the device's brand name and often also a hint as to which
   language the device owner speaks [TRAC2016].

3.  Partial Identifiers

   Suppose an adversary wants to track the people connecting to a
   specific Wi-Fi hot spot, for example in a railroad station.  Assume
   that the adversary is able to retrieve the hostname used by a
   specific laptop.  That, in itself, might not be enough to identify
   the laptop's owner.  Suppose however that the adversary observes that
   the laptop name is "huitema-laptop" and that the laptop has
   established a VPN connection to the Microsoft corporate network.  The
   two pieces of information, put together, firmly point to Christian
   Huitema, employed by Microsoft.  The identification is successful.

   In the example, we saw a login name inside the hostname, and that
   certainly helped identification.  But generic names like "jupiter" or
   "rosebud" also provide partial identification, especially if the
   adversary is capable of maintaining a database recording, among other
   information, the hostnames of devices used by specific users.
   Generic names are picked from vocabularies that include thousands of
   potential choices.
   Finding the name reduces the scope of the search
   significantly.  Other information such as the visited sites will
   quickly complement that data and can lead to user identification.

   Also, the special circumstances of the network can play a role.
   Experiments on operational networks such as the IETF meeting network
   have shown that, with the help of external data such as the publicly
   available IETF attendees list or other data sources such as LDAP
   servers on the network [TRAC2016], the identification of the device
   owner can become trivial given only partial identifiers in a
   hostname.

   Unique names assigned by manufacturers do not directly encode a user
   identifier, but they have the property of being stable and unique to
   the device in a large context.  A unique name like
   "BrandX-123456-7890-abcdef" allows efficient tracking across multiple
   domains.  In theory, this only allows tracking of the device but not
   of the user.  However, an adversary could correlate the device to the
   user through other means, for example the one-time capture of some
   clear text traffic.  Adversaries could then maintain databases
   linking unique hostnames to user identities.  This will allow
   efficient tracking of both the user and the device.

4.  Protocols that leak Hostnames

   Many IETF protocols can leak the "hostname" of a computer.  A
   non-exhaustive list includes DHCP, DNS address to name resolution,
   Multicast DNS, Link-local Multicast Name Resolution, and DNS service
   discovery.

4.1.  DHCP

   Shortly after connecting to a new network, a host can use DHCP
   [RFC2131] to acquire an IPv4 address and other parameters [RFC2132].
   A DHCP query can disclose the "hostname."  DHCP traffic is sent to
   the broadcast address and can be easily monitored, enabling
   adversaries to discover the hostname associated with a computer
   visiting a particular network.  DHCPv6 [RFC3315] shares similar
   issues.
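
   As a rough illustration of what a passive observer can do with
   broadcast DHCP traffic, the following Python sketch walks the
   options field of a captured DHCP message and extracts the Host Name
   option (code 12 in [RFC2132]).  It assumes that the caller has
   already stripped the fixed BOOTP header and the magic cookie.  The
   function name extract_hostname and the sample bytes are hypothetical
   and are not taken from any specification or implementation.

```python
from typing import Optional

PAD, HOST_NAME, END = 0, 12, 255  # option codes from RFC 2132


def extract_hostname(options: bytes) -> Optional[str]:
    """Return the Host Name option value from a DHCP options field.

    Assumption: `options` starts right after the magic cookie.
    """
    i = 0
    while i < len(options):
        code = options[i]
        if code == PAD:          # single-byte padding, no length field
            i += 1
            continue
        if code == END or i + 1 >= len(options):
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == HOST_NAME:
            return value.decode("ascii", errors="replace")
        i += 2 + length          # skip code, length and value
    return None


# Hypothetical capture: message type DISCOVER followed by a Host Name.
name = b"huitema-laptop"
sample = bytes([53, 1, 1]) + bytes([HOST_NAME, len(name)]) + name + bytes([END])
print(extract_hostname(sample))  # prints: huitema-laptop
```

   The observer needs no credentials and sends no traffic of its own,
   which is why the disclosure is hard for the visiting host to detect.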

   The problems with the hostname and FQDN parameters in DHCP are
   analyzed in [I-D.ietf-dhc-dhcp-privacy] and
   [I-D.ietf-dhc-dhcpv6-privacy].  Possible mitigations are described in
   [I-D.ietf-dhc-anonymity-profile].

4.2.  DNS Address to Name Resolution

   The domain name service design [RFC1035] includes the specification
   of the special domain "in-addr.arpa" for resolving the name of the
   computer using a particular IPv4 address, using the PTR format
   defined in [RFC1033].  A similar domain, "ip6.arpa", is defined in
   [RFC3596] for finding the name of a computer using a specific IPv6
   address.

   Adversaries who observe a particular address in use on a specific
   network can try to retrieve the PTR record associated with that
   address, and thus the hostname of the computer, or even the fully
   qualified domain name of that computer.  The retrieval may not be
   useful in many IPv4 networks due to the prevalence of NAT, but it
   could work in IPv6 networks.

4.3.  Multicast DNS

   Multicast DNS (mDNS) is defined in [RFC6762].  It enables hosts to
   send DNS queries over multicast, and to elicit responses from hosts
   participating in the service.

   If an adversary suspects that a particular host is present on a
   network, the adversary can send mDNS requests to find, for example,
   the A or AAAA records associated with the hostname in the ".local"
   domain.  A positive reply will confirm the presence of the host.

   When a new responder starts, it must send a set of multicast queries
   to verify that the name that it advertises is unique on the network,
   and also to populate the caches of other mDNS hosts.  Adversaries can
   monitor this traffic and discover the hostname of computers as they
   join the monitored network.

4.4.  Link-local Multicast Name Resolution

   Link-local Multicast Name Resolution (LLMNR) is defined in [RFC4795].
   The specification did not achieve consensus as an IETF standard, but
   it is widely deployed.  Like mDNS, it enables hosts to send DNS
   queries over multicast, and to elicit responses from computers
   implementing the LLMNR service.

   Like mDNS, LLMNR can be used by adversaries to confirm the presence
   of a specific host on a network, by issuing a multicast request to
   find the A or AAAA records associated with the hostname in the
   ".local" domain.

   When an LLMNR responder starts, it sends a set of multicast queries
   to verify that the name that it advertises is unique on the network.
   Adversaries can monitor this traffic and discover the hostname of
   computers as they join the monitored network.

4.5.  DNS-Based Service Discovery

   DNS-Based Service Discovery (DNS-SD) is described in [RFC6763].  It
   enables participating hosts to retrieve the location of services
   proposed by other hosts.  It can be used with DNS servers, or in
   conjunction with mDNS in a server-less environment.

   Participating hosts publish a service described by an "instance
   name," typically chosen by the user responsible for the publication.
   While this is obviously an active disclosure of information, privacy
   aspects can be mitigated by user control.  Services should be
   published only when the user decides to do so, and the information
   disclosed in the service name should be well under the control of the
   device's owner.

   In theory there should not be any privacy issue, but in practice the
   publication of a service also forces the publication of the hostname,
   due to a chain of dependencies.  The service name is used to publish
   a PTR record announcing the service.  The PTR record typically points
   to the service name in the local domain.  The service names, in turn,
   are used to publish TXT records describing service parameters, and
   SRV records describing the service location.
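
   The chain of dependencies can be sketched with a small Python model.
   This is only an illustration under stated assumptions: the Ptr and
   Srv classes, the leaked_hostnames function, and the sample record
   values are hypothetical, and no packets are parsed.  The sketch
   follows each PTR record to the service instance it names and then
   reads the target field of the matching SRV record, which carries a
   hostname [RFC2782].

```python
from dataclasses import dataclass
from typing import Iterable, Set, Union


@dataclass(frozen=True)
class Ptr:
    name: str    # service type, e.g. "_ipp._tcp.local"
    target: str  # service instance name chosen by the user


@dataclass(frozen=True)
class Srv:
    name: str    # service instance name
    priority: int
    weight: int
    port: int
    target: str  # hostname of the device offering the service


def leaked_hostnames(records: Iterable[Union[Ptr, Srv]]) -> Set[str]:
    """Follow PTR -> SRV and collect the hostnames an observer learns."""
    records = list(records)
    instances = {r.target for r in records if isinstance(r, Ptr)}
    return {r.target for r in records
            if isinstance(r, Srv) and r.name in instances}


# Hypothetical observation: the instance name reveals nothing personal,
# but the SRV target does.
observed = [
    Ptr("_ipp._tcp.local", "Office Printer._ipp._tcp.local"),
    Srv("Office Printer._ipp._tcp.local", 0, 0, 631, "huitema-laptop.local"),
]
print(leaked_hostnames(observed))
```

   In this model the user controls only the instance name; the hostname
   travels along in the SRV record regardless of that choice.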

   SRV records are described in [RFC2782].  Each record contains 4
   parameters: priority, weight, port number and hostname.  While the
   service name published in the PTR record is chosen by the user, the
   "hostname" in the SRV record is indeed the hostname of the device.

   Adversaries can monitor the mDNS traffic associated with DNS-SD and
   retrieve the hostname of computers advertising any service with
   DNS-SD.

5.  Randomized Hostnames as Remedy

   There are several ways to remedy the hostname practices.  We could
   instruct people to just turn off any protocol that leaks hostnames,
   at least when they visit some "insecure" place.  We could also
   examine each particular standard that publishes hostnames, and
   somehow fix the corresponding protocols.  Or, we could attempt to
   revise the way devices manage the hostname parameter.

   There is a lot of merit in "turning off unneeded protocols when
   visiting insecure places."  This amounts to attack surface reduction,
   and is clearly beneficial -- this is an advantage of the stealth mode
   defined in [RFC7288].  However, there are two issues with this
   advice.  First, it relies on recognizing which networks are secure or
   insecure.  This is hard to automate, and relying on end-user judgment
   may not always provide good results.  Second, some protocols such as
   DHCP cannot be turned off without losing connectivity, which limits
   the value of this option.  Also, services that rely on
   hostname-leaking protocols such as mDNS will not be available when
   those protocols are switched off.  In addition, hostname-leaking
   protocols are not always well known, as they might be proprietary and
   come with an installed application instead of being provided by the
   operating system.

   It may be possible in many cases to examine a protocol and prevent it
   from leaking hostnames.  This is for example what is attempted for
   DHCP in [I-D.ietf-dhc-anonymity-profile].
   However, it is unclear
   that we can identify, revisit and fix all the protocols that publish
   hostnames.  In particular, this is impossible for proprietary
   protocols.

   We may be able to mitigate most of the effects of hostname leakage by
   revisiting the way platforms handle hostnames.  This is in a way
   similar to the approach of MAC address randomization described in
   [I-D.ietf-dhc-anonymity-profile].  Let's assume that the operating
   system, at the time of connecting to a new network, picks a random
   hostname and starts publicizing that random name in protocols such as
   DHCP or mDNS, instead of the static value.  This will render
   monitoring and identification of users by adversaries much more
   difficult, without preventing protocols such as DNS-SD from operating
   as expected.  This of course has implications for the applications
   making use of such protocols, e.g., when the hostname is displayed to
   users of the application.  Users will not be able to identify network
   shares or services as easily based on the hostname carried in the
   underlying protocols.  Also, the generation of new hostnames should
   be synchronized with the change of other tokens used in network
   protocols, such as the MAC or IP address, to prevent correlation of
   this information.  For example, if the IP address changes but the
   hostname stays the same, the new IP address can be correlated to the
   same device based on the leaked hostname.

   Some operating systems, including Windows, support "per network"
   hostnames, but some other operating systems only support "global"
   hostnames.  In that case, changing the hostname may be difficult if
   the host is multi-homed, as the same name will be used on several
   networks.
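
   The randomization idea described earlier in this section could look
   roughly like the following Python sketch.  It is an assumption-laden
   illustration, not a specified mechanism: this document does not
   define a hostname format, so the "host-" prefix, the 8 random bytes,
   and the function names are all hypothetical.  The sketch draws a new
   hostname and a new locally administered MAC address in one step, so
   that the two identifiers change together.

```python
import secrets


def random_hostname(prefix: str = "host", nbytes: int = 8) -> str:
    """Return a fresh random label; prefix and length are assumptions."""
    return f"{prefix}-{secrets.token_hex(nbytes)}"


def random_mac() -> str:
    """Return a random unicast, locally administered MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the locally administered bit, clear the multicast bit.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{o:02x}" for o in octets)


def new_network_identity() -> dict:
    """Generate hostname and MAC together when joining a new network."""
    return {"hostname": random_hostname(), "mac": random_mac()}


identity = new_network_identity()
print(identity["hostname"], identity["mac"])
```

   Generating both values in a single call is one way to avoid the case
   where one identifier changes while the other stays stable and links
   the old and new values.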
   Other operating systems already use potentially different
   hostnames for different purposes, which might be a good model to
   combine both static hostnames and randomized hostnames based on their
   potential use and threat to a user's privacy.  Obviously, further
   studies are required before the idea of randomized hostnames can be
   implemented.

6.  Security Considerations

   This draft does not introduce any new protocol.  It does point to
   potential privacy issues in a set of existing protocols.

7.  IANA Considerations

   This draft does not require any IANA action.

8.  Acknowledgments

   Thanks to the members of the INTAREA Working Group for discussions
   and reviews.

9.  Informative References

   [I-D.ietf-dhc-anonymity-profile]
              Huitema, C., Mrugalski, T., and S. Krishnan, "Anonymity
              profile for DHCP clients",
              draft-ietf-dhc-anonymity-profile-08 (work in progress),
              February 2016.

   [I-D.ietf-dhc-dhcp-privacy]
              Krishnan, S., Mrugalski, T., and S. Jiang, "Privacy
              considerations for DHCP", draft-ietf-dhc-dhcp-privacy-05
              (work in progress), February 2016.

   [I-D.ietf-dhc-dhcpv6-privacy]
              Krishnan, S., Mrugalski, T., and S. Jiang, "Privacy
              considerations for DHCPv6",
              draft-ietf-dhc-dhcpv6-privacy-05 (work in progress),
              February 2016.

   [RFC1033]  Lottor, M., "Domain Administrators Operations Guide",
              RFC 1033, DOI 10.17487/RFC1033, November 1987.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
              November 1987.

   [RFC2131]  Droms, R., "Dynamic Host Configuration Protocol",
              RFC 2131, DOI 10.17487/RFC2131, March 1997.

   [RFC2132]  Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor
              Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997.

   [RFC2782]  Gulbrandsen, A., Vixie, P., and L.
              Esibov, "A DNS RR for specifying the location of services
              (DNS SRV)", RFC 2782, DOI 10.17487/RFC2782, February 2000.

   [RFC3315]  Droms, R., Ed., Bound, J., Volz, B., Lemon, T., Perkins,
              C., and M. Carney, "Dynamic Host Configuration Protocol
              for IPv6 (DHCPv6)", RFC 3315, DOI 10.17487/RFC3315,
              July 2003.

   [RFC3596]  Thomson, S., Huitema, C., Ksinant, V., and M. Souissi,
              "DNS Extensions to Support IP Version 6", RFC 3596,
              DOI 10.17487/RFC3596, October 2003.

   [RFC4795]  Aboba, B., Thaler, D., and L. Esibov, "Link-local
              Multicast Name Resolution (LLMNR)", RFC 4795,
              DOI 10.17487/RFC4795, January 2007.

   [RFC6762]  Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
              DOI 10.17487/RFC6762, February 2013.

   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
              Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013.

   [RFC7288]  Thaler, D., "Reflections on Host Firewalls", RFC 7288,
              DOI 10.17487/RFC7288, June 2014.

   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
              Terminology", RFC 7719, DOI 10.17487/RFC7719,
              December 2015.

   [TRAC2016]
              Faath, M., Weisshaar, F., and R. Winter, "How Broadcast
              Data Reveals Your Identity and Social Graph", 7th
              International Workshop on TRaffic Analysis and
              Characterization IEEE TRAC 2016, September 2016.

Authors' Addresses

   Christian Huitema
   Microsoft
   Redmond, WA 98052
   U.S.A.

   Email: huitema@microsoft.com


   Dave Thaler
   Microsoft
   Redmond, WA 98052
   U.S.A.

   Email: dthaler@microsoft.com


   Rolf Winter
   University of Applied Sciences Augsburg
   Augsburg
   DE

   Email: rolf.winter@hs-augsburg.de