General                                                     L. Johansson
Internet-Draft                                                     SUNET
Intended status: Informational                               L. Nordberg
Expires: July 25, 2014                                          NORDUnet
                                                        January 21, 2014

                     Linkability Considered Harmful
                   draft-johansson-linkability-bad-01

Abstract

The current debate on pervasive monitoring often focuses on passive
attacks on the protocol and transport layers. Even if these issues
were eliminated through the judicious use of encryption, roughly the
same information would still be available to an attacker who is able
to (legally or otherwise) obtain access to the linked data sets
maintained by large content and service providers.

Status of This Memo

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current
Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

This Internet-Draft will expire on July 25, 2014.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

1. Introduction

This I-D is submitted as a position paper for the joint IAB/W3C
STRINT workshop 2014. The authors wish to call attention to the
fact that linked data sets are a source of information that is
sometimes every bit as useful as anything that can be gleaned from
passive monitoring of Internet traffic. Such data sets are routinely
generated and maintained by service and content providers and are
often a source of secondary (or even primary) income for those who
own and generate them.

In the current discussion on pervasive monitoring we often overlook
the fact that even as more encryption is used, making passive attacks
harder, focus may simply shift to attacks on owners of linked data
sets. We should strike at the root of this problem by making it less
appealing to maintain these data sets and by offering users a measure
of control over how their information is used and shared.

Linkability is by no means a new concept, and the authors do not
propose to (re)define the concept in this draft. Instead, our intent
is to show, using some simple examples, how linkability occurs in
practice and what effect linkability has on privacy on the Internet.

2. A Simple Example

Service providers (we use this term in a general sense, and not with
a view to any particular protocols, etc.) typically manage users and
billing records. This leads to a data set being created for every
user of that service. Most services employ a simple pattern for user
enrollment which relies on an email address as a means of
(supposedly) uniquely identifying the user. The email address has
become the de facto user identifier on the Internet.

When a user pays for the service, a pair of linked data sets is
created: the user data at the service provider is associated (via the
credit card information) with the user data held by the credit card
company. The value of the linked data, as well as the risk to the
user, is higher than the value/risk involved in the two data sets
taken as separate entities. For instance, the linked data says
something about the buying habits of the user (based on the use of
the particular credit card), which in itself is valuable information.

Linking increases the risk to the user as well. With every service
that stores the user's credit card, the risk of exposure to active
attacks increases, as events in recent years have made painfully
clear.

If this example seems overly simplified or even naive to bring up,
consider the simple observation that when we visit a store in the
physical world we have the ability to "browse", i.e., to view and
select among the offered goods without having to identify ourselves
or prove our ability to pay for any of the goods in the store. This
aspect of the real world has not been translated into the online
world, where prospective customers are routinely fingerprinted and
their behaviour tracked even when they have shown no intention of
engaging in a business transaction with the store owner.

Naturally there must be ways to "conduct business on the Internet",
but there are ways to enable business without the need for linkable
attributes. In fact, there are ways to enable business using
non-linkable attributes in such a way that the risk to business
owners is reduced.

3. Avoiding Linkable Attributes

The way to avoid linking is simple (and yet so hard in practice):
avoid the use of linkable attributes. In our e-commerce example
above, the credit card number is a linkable attribute. However, in
this case the credit card number is, strictly speaking, not needed at
the service provider. When the user provides her credit card
information to the service provider, she is actually providing an
authorization that gives the service provider the right to obtain
payment from the credit card company.

Instead of using the credit card number as an implicit grant (of a
right to obtain payment), a token that isn't linkable across
identifier domains could be used to represent an explicit grant
issued on behalf of the user by the credit card company to the
service provider. This is a simple example of a general pattern:
instead of using a linkable user identifier, provide access to an
attribute representing some property of the user that is used to
grant specific access.
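
As an illustration of the explicit grant described above, the
following Python sketch shows one way a card issuer might mint a token
that is scoped to a single service provider and carries no card
number. It is a minimal sketch under stated assumptions, not a
description of any deployed payment protocol: the names ISSUER_KEY,
issue_grant and redeem_grant, the token layout and the in-memory table
are all hypothetical.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical issuer-side state; none of it ever leaves the card issuer.
ISSUER_KEY = secrets.token_bytes(32)
_CARD_BY_GRANT = {}  # grant_id -> card number


def _tag(body: dict) -> str:
    """Authenticate a grant body with the issuer's secret key."""
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


def issue_grant(card_number: str, service_id: str, amount_cents: int) -> dict:
    """Issue a one-off payment grant scoped to one service provider."""
    body = {
        # Fresh random value per grant, so two grants for the same card
        # share no attribute a service provider could link on.
        "grant_id": secrets.token_hex(16),
        "service_id": service_id,
        "amount_cents": amount_cents,
    }
    _CARD_BY_GRANT[body["grant_id"]] = card_number
    return {"body": body, "tag": _tag(body)}


def redeem_grant(grant: dict, service_id: str) -> bool:
    """Issuer-side check when a service provider presents a grant."""
    body = grant["body"]
    if not hmac.compare_digest(_tag(body), grant["tag"]):
        return False  # forged or modified grant
    if body["service_id"] != service_id:
        return False  # grant was issued to a different service
    # Single use: redeeming removes the grant from the issuer's table.
    return _CARD_BY_GRANT.pop(body["grant_id"], None) is not None
```

The service provider only ever sees the grant, so a breach at the
service provider exposes no attribute that can be matched against
records held elsewhere.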

Some credit card companies have actually taken first steps towards
this by involving the user directly in a second-factor authentication
(typically to reduce the risk of fraud). This practice follows a
model for third-party authentication services (also known as identity
providers) commonly used in the enterprise and research and education
(R&E) communities. Experience from the R&E identity federation
community shows that access control using identity providers and
non-linkable pseudonymous identifiers is by no means problem free,
but can be made to work in many situations.
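
The non-linkable pseudonymous identifiers mentioned above can be
sketched as follows. An identity provider derives a different but
stable identifier for each (user, service) pair from a secret only it
holds, so two services cannot match their records by comparing
identifiers. This is an illustration only; the function name
pairwise_id, the choice of HMAC-SHA-256 and the input encoding are
assumptions and are not taken from any particular federation's
specification.

```python
import hashlib
import hmac


def pairwise_id(idp_secret: bytes, user_id: str, service_id: str) -> str:
    """Derive a stable per-service pseudonym for a user."""
    # Length-prefix the user ID so ("ab", "c") and ("a", "bc") cannot
    # produce the same input string.
    message = f"{len(user_id)}:{user_id}{service_id}".encode()
    return hmac.new(idp_secret, message, hashlib.sha256).hexdigest()


# Hypothetical values for demonstration.
secret = b"known-only-to-the-identity-provider"
at_shop = pairwise_id(secret, "alice", "shop.example")
at_news = pairwise_id(secret, "alice", "news.example")
```

Each service sees the same value on every visit by the same user, but
the values seen by different services are unrelated to anyone who does
not hold the identity provider's secret.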

4. Linkability and Probability

Data sets do not have to be uniquely linkable in order to blur the
line between private and public actions on the Internet. Linkability
is really about probability, and absolute certainty (as demonstrated
in the example above) is often not needed for an adversary to
conclude that two actions were likely performed by the same client.

Web browsers are typically fingerprintable even when the user tries
to avoid sticking out when compared to other users of a given web
site. Regardless of encryption, the operator of a web site can more
often than not tell one user from another by looking at information
sent by the browser without the user's knowledge.

Some examples of sources of fingerprintability in web browsers are
information about browser window and desktop resolution, browser
toolbar presence, title bar font size and window manager settings
[TORBUTTON-DESIGN]. This has been confirmed by empirical studies
such as the [PANOPTICLICK] study done by the EFF.
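
The probabilistic nature of fingerprinting can be made concrete with
a small calculation. The Python sketch below measures how many bits
of identifying information each observed browser attribute
contributes and how many users out of a population would be expected
to share the combined fingerprint. The fractions are invented for
illustration and are not measurements from either reference, and the
attributes are treated as independent, which real attributes usually
are not, so the result is only a rough estimate.

```python
import math

# Hypothetical fraction of visitors who share each observed value.
OBSERVED = {
    "screen_resolution": 0.05,
    "toolbar_present": 0.50,
    "title_bar_font_size": 0.10,
}


def surprisal_bits(fraction: float) -> float:
    """Bits of identifying information carried by one attribute value."""
    return -math.log2(fraction)


def combined_bits(fractions) -> float:
    """Total bits, assuming the attributes are independent."""
    return sum(surprisal_bits(f) for f in fractions)


def expected_matches(population: int, bits: float) -> float:
    """Expected number of users in the population sharing the fingerprint."""
    return population / 2 ** bits


bits = combined_bits(OBSERVED.values())
matches = expected_matches(1_000_000, bits)
```

With these invented fractions the three attributes together carry
roughly 8.6 bits, which narrows a million visitors down to about
2,500 candidates. No single attribute identifies the user, yet each
additional one shrinks the set further.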

5. Incentives for Collecting Data

There are strong incentives for service providers to enrich the value
of their data sets using attribute linking. The value of an
attribute naturally increases with the inverse of the size of the set
of users who share that attribute: the more specific the attribute,
the more valuable it is, because it can be used to profile a user
with a higher degree of certainty.
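
The relationship between specificity and set size can be illustrated
with a toy data set. In the Python sketch below, the records, the
attribute names and the use of 1/size as a stand-in for "value" are
all assumptions made for illustration; this is not a model of how any
provider actually prices data.

```python
from collections import Counter

# Hypothetical user records held by a service provider.
USERS = [
    {"country": "SE", "postcode": "111 22"},
    {"country": "SE", "postcode": "413 01"},
    {"country": "SE", "postcode": "752 36"},
    {"country": "NO", "postcode": "0150"},
]


def sharing_set_size(records, attribute, value) -> int:
    """Number of users whose record has this value for the attribute."""
    return Counter(r[attribute] for r in records)[value]


def specificity(records, attribute, value) -> float:
    """Inverse of the sharing-set size; 1.0 means the value is unique."""
    size = sharing_set_size(records, attribute, value)
    return 1.0 / size if size else 0.0
```

Here the country value "SE" is shared by three users while every
postcode is unique, so under this proxy the postcode is both the more
valuable and the more linkable attribute.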

Unfortunately, there seem to be few costs associated with keeping
large linked data sets around: the theft of tens of thousands of user
credentials rarely results in more than a brief notice in the news
anymore. To date the IETF community has focused on how to avoid the
use of long-term credentials (passwords) to reduce the effects of
such attacks. We need to broaden our scope to find ways to
disincentivize the (over)use of linkable attributes.

6. The Least Common Denominator of Privacy

Out of all the transactions that a given user performs over the time
her traffic is being observed by an adversary, the most linkable one
will define her level of privacy towards that adversary. In other
words, linkability is the least common denominator of privacy and
must be treated as the privacy killer that it is. When we allow for
too much linkability in protocols, we must acknowledge the fact that
we're building something that can't provide privacy.

Part of the Internet economy seems to be based on linked data sets
and linkable attributes. Changing this will require creating
negative incentives for service providers, making it less attractive
to keep data around, as well as establishing technical mechanisms
that allow service providers access to the attributes they do need in
order to conduct their business without having to rely on linkable
attributes. Success will depend on carefully engineering the
negative incentives to match the technical mechanisms in order to
promote good behaviour.

The authors believe that the IETF community should attempt to design
technical controls into existing and future protocols that make it
possible for users of Internet technology to choose when to provide
linkable data to services and eavesdroppers and when not to.

7. Acknowledgements

Many thanks to Nick Mathewson for important contributions on the
topic of linkability. Many thanks also to Lucy Lynch who is the
source of much wisdom, the "I'm just browsing" response to
identification on the web in particular.

8. Informative References

[TORBUTTON-DESIGN]
           Perry, M., "Torbutton Design Documentation", n.d.

[PANOPTICLICK]
           Eckersley, P., "How Unique Is Your Web Browser?", n.d.

Authors' Addresses

Leif Johansson
SUNET

Email: leifj@sunet.se

Linus Nordberg
NORDUnet

Email: linus@nordu.net