2	Network Working Group                                         C. Huitema
3	Internet-Draft                                      Private Octopus Inc.
4	Intended status: Informational                                 D. Thaler
5	Expires: July 27, 2017                                         Microsoft
6	                                                               R. Winter
7	                                 University of Applied Sciences Augsburg
8	                                                        January 23, 2017

10	              Current Hostname Practice Considered Harmful
11	              draft-ietf-intarea-hostname-practice-04.txt

13	Abstract

15	   Giving a hostname to your computer and publishing it as you roam from
16	   one network to another is the Internet equivalent of walking around
17	   with a name tag affixed to your lapel.  This current practice can
18	   significantly compromise your privacy, and something should change in
19	   order to mitigate these privacy threats.

21	   There are several possible remedies, such as fixing a variety of
22	   protocols or avoiding disclosing a hostname at all.  This document
23	   describes some of the protocols that reveal hostnames today and
24	   sketches another possible remedy, which is to replace static
25	   hostnames by frequently changing randomized values.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at http://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on July 27, 2017.

44	Copyright Notice

46	   Copyright (c) 2017 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents
51	   (http://trustee.ietf.org/license-info) in effect on the date of
52	   publication of this document.  Please review these documents
53	   carefully, as they describe your rights and restrictions with respect
54	   to this document.  Code Components extracted from this document must
55	   include Simplified BSD License text as described in Section 4.e of
56	   the Trust Legal Provisions and are provided without warranty as
57	   described in the Simplified BSD License.

59	Table of Contents

   1.  Introduction
   2.  Naming Practices
   3.  Partial Identifiers
   4.  Protocols that leak Hostnames
     4.1.  DHCP
     4.2.  DNS Address to Name Resolution
     4.3.  Multicast DNS
     4.4.  Link-local Multicast Name Resolution
     4.5.  DNS-Based Service Discovery
     4.6.  NetBIOS-over-TCP
   5.  Randomized Hostnames as Remedy
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Informative References
   Authors' Addresses

78	1.  Introduction

   There is a long-established practice of giving names to computers.
   In the Internet protocols, these names are referred to as "hostnames"
   [RFC7719].  Hostnames are normally used in conjunction with a domain
   name suffix to build the "Fully Qualified Domain Name" (FQDN) of a
   host.  However, it is common practice to use the hostname without
   further qualification in a variety of applications from file sharing
   to network management.  Hostnames are typically published as part of
   domain names, and can be obtained through a variety of name lookup
   and discovery protocols.

90	   Hostnames have to be unique within the domain in which they are
91	   created and used.  They do not have to be globally unique
92	   identifiers, but they will always be at least partial identifiers, as
93	   discussed in Section 3.

   The disclosure of information through hostnames creates a problem for
   mobile devices.  Adversaries that monitor a remote network such as a
   Wi-Fi hot spot can obtain the hostname through passive monitoring or
   active probing of a variety of Internet protocols, such as DHCP or
   multicast DNS (mDNS).  They can correlate the hostname with
   information extracted from traffic analysis and other sources, and
   can potentially identify the device, its properties, and its user
   [TRAC2016].

104	2.  Naming Practices

106	   There are many reasons to give names to computers.  This is
107	   particularly true when computers operate on a network.  Operating
108	   systems like Microsoft Windows or Unix assume that computers have a
109	   "hostname."  This enables users and administrators to do things such
110	   as ping a computer, add its name to an access control list, remotely
111	   mount a computer disk, or connect to the computer through tools such
112	   as telnet or remote desktop.  Other operating systems maintain
113	   multiple hostnames for different purposes, e.g. for use with certain
114	   protocols such as mDNS.

   In most consumer networks, naming is left largely to the fancy of
   the user.  Some will pick names of planets or stars, others names of
   fruits or flowers, and others will pick whatever suits their mood
   when they unwrap the device.  As long as users are careful not to
   pick a name already in use on the same network, anything goes.  Very
   often, however, the operating system suggests a hostname at install
   time, which can contain the user name, the login name, and
   information learned from the device itself such as the brand, model,
   or maker of the device [TRAC2016].

126	   In large organizations, collisions are more likely and a more
127	   structured approach is necessary.  In theory, organizations could use
128	   multiple DNS subdomains to ease the pressure on uniqueness, but in
129	   practice many don't and insist on unique flat names, if only to
130	   simplify network management.  To ensure unique names, organizations
131	   will set naming guidelines and enforce some kind of structured
132	   naming.  For example, within the Microsoft corporate network,
133	   computer names are derived from the login name of the main user,
134	   leading to names like "huitema-test2" for a machine that one of the
135	   authors used to test software.

   There is less pressure to assign names to small devices, including
   for example smart phones, as these devices typically do not enable
   sharing of their disks or remote login.  As a consequence, these
   devices often have manufacturer-assigned names, which vary from very
   generic like "Windows Phone" to completely unique like
   "BrandX-123456-7890-abcdef".  Such names often contain the name of
   the device owner, the device's brand name, and a hint as to which
   language the device owner speaks [TRAC2016].

146	3.  Partial Identifiers

148	   Suppose an adversary wants to track the people connecting to a
149	   specific Wi-Fi hot spot, for example in a railroad station.  Assume
150	   that the adversary is able to retrieve the hostname used by a
151	   specific laptop.  That, in itself, might not be enough to identify
152	   the laptop's owner.  Suppose however that the adversary observes that
153	   the laptop name is "huitema-laptop" and that the laptop has
154	   established a VPN connection to the Microsoft corporate network.  The
155	   two pieces of information, put together, firmly point to Christian
156	   Huitema, employed by Microsoft.  The identification is successful.

158	   In the example, we saw a login name inside the hostname, and that
159	   certainly helped identification.  But generic names like "jupiter" or
160	   "rosebud" also provide partial identification, especially if the
161	   adversary is capable of maintaining a database recording, among other
162	   information, the hostnames of devices used by specific users.
163	   Generic names are picked from vocabularies that include thousands of
164	   potential choices.  Finding the name reduces the scope of the search
165	   significantly.  Other information such as the visited sites will
166	   quickly complement that data and can lead to user identification.
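
   The narrowing effect of a generic name can be estimated with a
   back-of-the-envelope calculation.  The Python sketch below assumes,
   purely for illustration, that names are drawn uniformly from the
   vocabulary (real choices are likely skewed) and uses hypothetical
   population and vocabulary sizes; the function names are not taken
   from any existing tool.

```python
import math


def bits_revealed(vocabulary_size: int) -> float:
    """Identifying bits leaked by one name, assuming uniform choice."""
    return math.log2(vocabulary_size)


def expected_candidates(population: int, vocabulary_size: int) -> float:
    """Average number of users sharing any given name (uniform model)."""
    return population / vocabulary_size


# Hypothetical figures: one million users, a vocabulary of 4096 names.
bits = bits_revealed(4096)
remaining = expected_candidates(1_000_000, 4096)
```

   Under these assumptions, a single observed name shrinks the search
   from a million candidates to a few hundred, which additional signals
   such as visited sites can then narrow further.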

   The special circumstances of the network can also play a role.
   Experiments on operational networks such as the IETF meeting network
   have shown that, with the help of external data such as the publicly
   available IETF attendees list or other data sources such as LDAP
   servers on the network [TRAC2016], the identification of the device
   owner can become trivial given only partial identifiers in a
   hostname.

   Unique names assigned by manufacturers do not directly encode a user
   identifier, but they have the property of being stable and unique to
   the device in a large context.  A unique name like
   "BrandX-123456-7890-abcdef" allows efficient tracking across
   multiple domains.  In theory, this only allows tracking of the
   device but not of the user.  However, an adversary could correlate
   the device to the user through other means, for example the one-time
   capture of some clear text traffic.  Adversaries could then maintain
   databases linking unique hostnames to user identities.  This would
   allow efficient tracking of both the user and the device.

187	4.  Protocols that leak Hostnames

   Many IETF protocols can leak the "hostname" of a computer.  A
   non-exhaustive list includes DHCP, DNS address to name resolution,
   Multicast DNS, Link-local Multicast Name Resolution, DNS-Based
   Service Discovery, and NetBIOS-over-TCP.

194	4.1.  DHCP

196	   Shortly after connecting to a new network, a host can use DHCP
197	   [RFC2131] to acquire an IPv4 address and other parameters [RFC2132].
198	   A DHCP query can disclose the "hostname."  DHCP traffic is sent to
199	   the broadcast address and can be easily monitored, enabling
200	   adversaries to discover the hostname associated with a computer
201	   visiting a particular network.  DHCPv6 [RFC3315] shares similar
202	   issues.

204	   The problems with the hostname and FQDN parameters in DHCP are
205	   analyzed in [RFC7819] and [RFC7824].  Possible mitigations are
206	   described in [RFC7844].
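
   To illustrate how little effort the disclosure requires of an
   observer, the following Python sketch pulls the Host Name option
   (option code 12 in [RFC2132]) out of the options field of a DHCP
   message.  It is a minimal sketch, not a complete DHCP parser: the
   function name and the sample bytes are hypothetical, and capturing
   the broadcast packet itself is assumed to have happened elsewhere.

```python
from typing import Optional

HOST_NAME_OPTION = 12   # Host Name option code, per RFC 2132
PAD_OPTION = 0          # one-byte filler, carries no length field
END_OPTION = 255        # marks the end of the options field


def extract_dhcp_hostname(options: bytes) -> Optional[str]:
    """Return the Host Name option value, or None if it is absent.

    `options` is assumed to be the DHCP options field that follows
    the 4-byte magic cookie.
    """
    i = 0
    while i < len(options):
        code = options[i]
        if code == END_OPTION:
            break
        if code == PAD_OPTION:
            i += 1
            continue
        if i + 1 >= len(options):      # truncated option header
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == HOST_NAME_OPTION:
            return value.decode("ascii", errors="replace")
        i += 2 + length                # skip to the next option
    return None


# Hypothetical options field: message type (53) = DISCOVER, then a
# 14-byte Host Name option, then the end marker.
sample = bytes([53, 1, 1, 12, 14]) + b"huitema-laptop" + bytes([255])
leaked = extract_dhcp_hostname(sample)
```

   Any passive listener on the link could run equivalent logic, which
   is why the mitigations in [RFC7844] focus on what the client sends
   rather than on who may receive it.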

208	4.2.  DNS Address to Name Resolution

210	   The domain name service design [RFC1035] includes the specification
211	   of the special domain "in-addr.arpa" for resolving the name of the
212	   computer using a particular IPv4 address, using the PTR format
213	   defined in [RFC1033].  A similar domain, "ip6.arpa", is defined in
214	   [RFC3596] for finding the name of a computer using a specific IPv6
215	   address.

217	   Adversaries who observe a particular address in use on a specific
218	   network can try to retrieve the PTR record associated with that
219	   address, and thus the hostname of the computer, or even the fully
220	   qualified domain name of that computer.  The retrieval may not be
221	   useful in many IPv4 networks due to the prevalence of NAT, but it
222	   could work in IPv6 networks.  Other name lookup mechanisms, such as
223	   [RFC4620], share similar issues.
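
   The query name for such a lookup follows mechanically from the
   observed address.  The Python sketch below uses the standard
   library's "ipaddress" module to derive it; the helper name and the
   documentation-range addresses are illustrative assumptions, and the
   actual PTR query (for example via socket.gethostbyaddr) is left out
   because it requires network access.

```python
import ipaddress


def reverse_lookup_name(address: str) -> str:
    """Build the in-addr.arpa or ip6.arpa name for an IP address."""
    return ipaddress.ip_address(address).reverse_pointer


# Addresses from the documentation ranges, used only as examples.
v4_name = reverse_lookup_name("192.0.2.10")
v6_name = reverse_lookup_name("2001:db8::1")
```

   For the IPv4 example the octets are reversed under "in-addr.arpa";
   for IPv6 each of the 32 hexadecimal nibbles becomes its own label
   under "ip6.arpa".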

225	4.3.  Multicast DNS

227	   Multicast DNS (mDNS) is defined in [RFC6762].  It enables hosts to
228	   send DNS queries over multicast, and to elicit responses from hosts
229	   participating in the service.

231	   If an adversary suspects that a particular host is present on a
232	   network, the adversary can send mDNS requests to find, for example,
233	   the A or AAAA records associated with the hostname in the ".local"
234	   domain.  A positive reply will confirm the presence of the host.

236	   When a new responder starts, it must send a set of multicast queries
237	   to verify that the name that it advertises is unique on the network,
238	   and also to populate the caches of other mDNS hosts.  Adversaries can
239	   monitor this traffic and discover the hostname of computers as they
240	   join the monitored network.

   mDNS further allows queries to be sent via unicast to port 5353.  An
   adversary might decide to use unicast instead of multicast in order
   to hide from, e.g., intrusion detection systems.
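
   The presence probe described in this section amounts to a single
   small DNS message.  The Python sketch below builds a query for the
   A record of a suspected hostname in the ".local" domain.  It is an
   illustration only: the helper names are hypothetical, label lengths
   are not validated, and the message is constructed but not sent.

```python
import struct

MDNS_GROUP = "224.0.0.251"   # IPv4 mDNS multicast address (RFC 6762)
MDNS_PORT = 5353
TYPE_A = 1
CLASS_IN = 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"          # root label terminates the name


def build_mdns_query(hostname: str, qtype: int = TYPE_A) -> bytes:
    """Build a one-question query for <hostname>.local."""
    # Header: ID 0, flags 0, one question, no other records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(hostname + ".local")
    question += struct.pack("!2H", qtype, CLASS_IN)
    return header + question


query = build_mdns_query("huitema-laptop")
# Sending is omitted here; it would be a UDP datagram to
# (MDNS_GROUP, MDNS_PORT), or to the target's unicast address.
```

   The same bytes can be addressed to the multicast group or to a
   single host on port 5353, which is what makes the unicast variant
   attractive to an adversary trying to stay quiet.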

246	4.4.  Link-local Multicast Name Resolution

248	   Link-local Multicast Name Resolution (LLMNR) is defined in [RFC4795].
249	   The specification did not achieve consensus as an IETF standard, but
250	   it is widely deployed.  Like mDNS, it enables hosts to send DNS
251	   queries over multicast, and to elicit responses from computers
252	   implementing the LLMNR service.

254	   Like mDNS, LLMNR can be used by adversaries to confirm the presence
255	   of a specific host on a network, by issuing a multicast request to
256	   find the A or AAAA records associated with the hostname in the
257	   ".local" domain.

259	   When an LLMNR responder starts, it sends a set of multicast queries
260	   to verify that the name that it advertises is unique on the network.
261	   Adversaries can monitor this traffic and discover the hostname of
262	   computers as they join the monitored network.

264	4.5.  DNS-Based Service Discovery

266	   DNS-Based Service Discovery (DNS-SD) is described in [RFC6763].  It
267	   enables participating hosts to retrieve the location of services
268	   proposed by other hosts.  It can be used with DNS servers, or in
269	   conjunction with mDNS in a server-less environment.

   Participating hosts publish a service described by an "instance
   name," typically chosen by the user responsible for the publication.
   While this is obviously an active disclosure of information, privacy
   aspects can be mitigated by user control.  Services should only be
   published when the user decides to do so, and the information
   disclosed in the service name should be well under the control of
   the device's owner.

279	   In theory there should not be any privacy issue, but in practice the
280	   publication of a service also forces the publication of the hostname,
281	   due to a chain of dependencies.  The service name is used to publish
282	   a PTR record announcing the service.  The PTR record typically points
283	   to the service name in the local domain.  The service names, in turn,
284	   are used to publish TXT records describing service parameters, and
285	   SRV records describing the service location.

   SRV records are described in [RFC2782].  Each record contains four
   parameters: priority, weight, port number, and hostname.  While the
   service name published in the PTR record is chosen by the user, the
   "hostname" in the SRV record is the actual hostname of the device.

292	   Adversaries can monitor the mDNS traffic associated with DNS-SD and
293	   retrieve the hostname of computers advertising any service with DNS-
294	   SD.
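
   The chain of dependencies can be made concrete with a toy model.
   The Python sketch below represents the PTR and SRV records as plain
   dictionaries and follows them from a service type to the hostnames
   behind it.  All record contents, the "SrvRecord" type, and the
   function name are hypothetical; a real observer would fill these
   tables from captured mDNS traffic.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str   # hostname of the device offering the service


# Hypothetical records as an observer might collect them.
ptr_records: Dict[str, List[str]] = {
    "_ipp._tcp.local.": ["Office Printer._ipp._tcp.local."],
}
srv_records: Dict[str, SrvRecord] = {
    "Office Printer._ipp._tcp.local.":
        SrvRecord(0, 0, 631, "huitema-laptop.local."),
}


def hostnames_for_service(service_type: str) -> List[str]:
    """Follow PTR -> SRV and collect the hostnames that are exposed."""
    hosts = []
    for instance in ptr_records.get(service_type, []):
        srv = srv_records.get(instance)
        if srv is not None:
            hosts.append(srv.target)
    return hosts


exposed = hostnames_for_service("_ipp._tcp.local.")
```

   The instance name "Office Printer" reveals little by itself, yet
   the SRV target still carries the device's hostname.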

296	4.6.  NetBIOS-over-TCP

298	   Amongst other things, NetBIOS-over-TCP ([RFC1002]) implements a name
299	   registration and resolution mechanism called the NetBIOS Name
300	   Service.  In practice, NetBIOS resource names are often based on
301	   hostnames.

   NetBIOS allows an application to register resource names and to
   resolve such names to IP addresses.  In environments without a
   NetBIOS Name Server, the protocol makes extensive use of broadcasts
   from which resource names can be easily extracted.  NetBIOS also
   allows querying a node directly for the names it has registered
   (node status).
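
   Names in these broadcasts are only lightly encoded.  The Python
   sketch below implements the "first-level encoding" specified in
   RFC 1001, the companion document to [RFC1002], in which each byte
   of the 16-byte name is split into two nibbles and each nibble is
   added to the letter "A".  The function names are hypothetical, and
   the treatment of the 16th byte as a suffix reflects common practice
   rather than anything stated in this document.

```python
from typing import Tuple


def encode_netbios_name(name: str, suffix: int = 0x20) -> str:
    """Encode a name (padded to 15 bytes) plus a suffix byte."""
    raw = name.upper().encode("ascii")[:15].ljust(15, b" ")
    raw += bytes([suffix])
    return "".join(
        chr(ord("A") + (b >> 4)) + chr(ord("A") + (b & 0x0F))
        for b in raw
    )


def decode_netbios_name(encoded: str) -> Tuple[str, int]:
    """Recover (name, suffix byte) from a 32-character encoded name."""
    if len(encoded) != 32:
        raise ValueError("encoded NetBIOS names are 32 characters long")
    raw = bytearray()
    for i in range(0, 32, 2):
        high = ord(encoded[i]) - ord("A")
        low = ord(encoded[i + 1]) - ord("A")
        if not (0 <= high <= 15 and 0 <= low <= 15):
            raise ValueError("character outside the range A to P")
        raw.append((high << 4) | low)
    name = raw[:15].decode("ascii", errors="replace").rstrip(" ")
    return name, raw[15]


encoded = encode_netbios_name("huitema-test2", suffix=0x00)
decoded = decode_netbios_name(encoded)
```

   Because the transformation is fixed and keyless, anyone who can
   hear the broadcast can reverse it.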

310	5.  Randomized Hostnames as Remedy

312	   There are several ways to remedy the hostname practices.  We could
313	   instruct people to just turn off any protocol that leaks hostnames,
314	   at least when they visit some "insecure" place.  We could also
315	   examine each particular standard that publishes hostnames, and
316	   somehow fix the corresponding protocols.  Or, we could attempt to
317	   revise the way devices manage the hostname parameter.

   There is a lot of merit in "turning off unneeded protocols when
   visiting insecure places."  This amounts to attack surface reduction,
   and is clearly beneficial; this is an advantage of the stealth mode
   defined in [RFC7288].  However, there are several issues with this
   advice.  First, it relies on recognizing which networks are secure or
   insecure.  This is hard to automate, and relying on end-user judgment
   may not always provide good results.  Second, some protocols such as
   DHCP cannot be turned off without losing connectivity, which limits
   the value of this option.  Third, services that rely on
   hostname-leaking protocols such as mDNS will not be available when
   those protocols are switched off.  Finally, hostname-leaking
   protocols are not always well known, as they might be proprietary
   and come with an installed application instead of being provided by
   the operating system.

333	   It may be possible in many cases to examine a protocol and prevent it
334	   from leaking hostnames.  This is for example what is attempted for
335	   DHCP in [RFC7844].  However, it is unclear that we can identify,
336	   revisit and fix all the protocols that publish hostnames.  In
337	   particular, this is impossible for proprietary protocols.

   We may be able to mitigate most of the effects of hostname leakage by
   revisiting the way platforms handle hostnames.  This is in a way
   similar to the approach of MAC address randomization described in
   [RFC7844].  Let's assume that the operating system, at the time of
   connecting to a new network, picks a random hostname and starts
   publicizing that random name in protocols such as DHCP or mDNS,
   instead of the static value.  This will render monitoring and
   identification of users by adversaries much more difficult, without
   preventing protocols such as DNS-SD from operating as expected.  This
   of course has implications for the applications making use of such
   protocols, e.g., when the hostname is displayed to users of the
   application.  Users will not be able to identify network shares or
   services as easily based on the hostname carried in the underlying
   protocols.  Also, the generation of new hostnames should be
   synchronized with the change of other tokens used in network
   protocols, such as the MAC or IP address, to prevent correlation of
   this information.  For example, if the IP address changes but the
   hostname stays the same, the new IP address can be linked to the
   same device through the leaked hostname.

   Some operating systems, including Windows, support "per network"
   hostnames, but other operating systems only support "global"
   hostnames.  With a global hostname, changing the name may be
   difficult if the host is multi-homed, as the same name will be used
   on several networks.  Other operating systems already use potentially
   different hostnames for different purposes, which might be a good
   model for combining static and randomized hostnames based on their
   potential use and threat to a user's privacy.  Further study is
   required before the idea of randomized hostnames can be implemented.

370	6.  Security Considerations

372	   This draft does not introduce any new protocol.  It does point to
373	   potential privacy issues in a set of existing protocols.

375	7.  IANA Considerations

377	   This draft does not require any IANA action.

379	8.  Acknowledgments

381	   Thanks to the members of the INTAREA Working Group for discussions
382	   and reviews.

384	9.  Informative References

386	   [RFC1002]  NetBIOS Working Group in the Defense Advanced Research
387	              Projects Agency, Internet Activities Board, and End-to-End
388	              Services Task Force, "Protocol standard for a NetBIOS
389	              service on a TCP/UDP transport: Detailed specifications",
390	              STD 19, RFC 1002, DOI 10.17487/RFC1002, March 1987,
391	              <http://www.rfc-editor.org/info/rfc1002>.

393	   [RFC1033]  Lottor, M., "Domain Administrators Operations Guide",
394	              RFC 1033, DOI 10.17487/RFC1033, November 1987,
395	              <http://www.rfc-editor.org/info/rfc1033>.

397	   [RFC1035]  Mockapetris, P., "Domain names - implementation and
398	              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
399	              November 1987, <http://www.rfc-editor.org/info/rfc1035>.

401	   [RFC2131]  Droms, R., "Dynamic Host Configuration Protocol",
402	              RFC 2131, DOI 10.17487/RFC2131, March 1997,
403	              <http://www.rfc-editor.org/info/rfc2131>.

405	   [RFC2132]  Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor
406	              Extensions", RFC 2132, DOI 10.17487/RFC2132, March 1997,
407	              <http://www.rfc-editor.org/info/rfc2132>.

409	   [RFC2782]  Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
410	              specifying the location of services (DNS SRV)", RFC 2782,
411	              DOI 10.17487/RFC2782, February 2000,
412	              <http://www.rfc-editor.org/info/rfc2782>.

414	   [RFC3315]  Droms, R., Ed., Bound, J., Volz, B., Lemon, T., Perkins,
415	              C., and M. Carney, "Dynamic Host Configuration Protocol
416	              for IPv6 (DHCPv6)", RFC 3315, DOI 10.17487/RFC3315, July
417	              2003, <http://www.rfc-editor.org/info/rfc3315>.

419	   [RFC3596]  Thomson, S., Huitema, C., Ksinant, V., and M. Souissi,
420	              "DNS Extensions to Support IP Version 6", RFC 3596,
421	              DOI 10.17487/RFC3596, October 2003,
422	              <http://www.rfc-editor.org/info/rfc3596>.

424	   [RFC4620]  Crawford, M. and B. Haberman, Ed., "IPv6 Node Information
425	              Queries", RFC 4620, DOI 10.17487/RFC4620, August 2006,
426	              <http://www.rfc-editor.org/info/rfc4620>.

428	   [RFC4795]  Aboba, B., Thaler, D., and L. Esibov, "Link-local
429	              Multicast Name Resolution (LLMNR)", RFC 4795,
430	              DOI 10.17487/RFC4795, January 2007,
431	              <http://www.rfc-editor.org/info/rfc4795>.

433	   [RFC6762]  Cheshire, S. and M. Krochmal, "Multicast DNS", RFC 6762,
434	              DOI 10.17487/RFC6762, February 2013,
435	              <http://www.rfc-editor.org/info/rfc6762>.

437	   [RFC6763]  Cheshire, S. and M. Krochmal, "DNS-Based Service
438	              Discovery", RFC 6763, DOI 10.17487/RFC6763, February 2013,
439	              <http://www.rfc-editor.org/info/rfc6763>.

441	   [RFC7288]  Thaler, D., "Reflections on Host Firewalls", RFC 7288,
442	              DOI 10.17487/RFC7288, June 2014,
443	              <http://www.rfc-editor.org/info/rfc7288>.

445	   [RFC7719]  Hoffman, P., Sullivan, A., and K. Fujiwara, "DNS
446	              Terminology", RFC 7719, DOI 10.17487/RFC7719, December
447	              2015, <http://www.rfc-editor.org/info/rfc7719>.

449	   [RFC7819]  Jiang, S., Krishnan, S., and T. Mrugalski, "Privacy
450	              Considerations for DHCP", RFC 7819, DOI 10.17487/RFC7819,
451	              April 2016, <http://www.rfc-editor.org/info/rfc7819>.

453	   [RFC7824]  Krishnan, S., Mrugalski, T., and S. Jiang, "Privacy
454	              Considerations for DHCPv6", RFC 7824,
455	              DOI 10.17487/RFC7824, May 2016,
456	              <http://www.rfc-editor.org/info/rfc7824>.

458	   [RFC7844]  Huitema, C., Mrugalski, T., and S. Krishnan, "Anonymity
459	              Profiles for DHCP Clients", RFC 7844,
460	              DOI 10.17487/RFC7844, May 2016,
461	              <http://www.rfc-editor.org/info/rfc7844>.

463	   [TRAC2016]
464	              Faath, M., Weisshaar, F., and R. Winter, "How Broadcast
465	              Data Reveals Your Identity and Social Graph", 7th
466	              International Workshop on TRaffic Analysis and
467	              Characterization IEEE TRAC 2016, September 2016.

469	Authors' Addresses

   Christian Huitema
   Private Octopus Inc.
   Friday Harbor, WA  98250
   U.S.A.

   Email: huitema@huitema.net

   Dave Thaler
   Microsoft
   Redmond, WA  98052
   U.S.A.

   Email: dthaler@microsoft.com

   Rolf Winter
   University of Applied Sciences Augsburg
   Augsburg
   DE

   Email: rolf.winter@hs-augsburg.de