2 Network Working Group A. Cooper 3 Internet-Draft CDT 4 Intended status: Informational H. Tschofenig 5 Expires: January 17, 2013 Nokia Siemens Networks 6 B. Aboba 7 Microsoft Corporation 8 J. Peterson 9 NeuStar, Inc. 10 J. Morris 12 M. Hansen 13 ULD Kiel 14 R. Smith 15 JANET(UK) 16 July 16, 2012 18 Privacy Considerations for Internet Protocols 19 draft-iab-privacy-considerations-03.txt 21 Abstract 23 This document offers guidance for developing privacy considerations 24 for inclusion in IETF documents and aims to make protocol designers 25 aware of privacy-related design choices. 27 Discussion of this document is taking place on the IETF Privacy 28 Discussion mailing list (see 29 https://www.ietf.org/mailman/listinfo/ietf-privacy). 31 Status of this Memo 33 This Internet-Draft is submitted in full conformance with the 34 provisions of BCP 78 and BCP 79. 36 Internet-Drafts are working documents of the Internet Engineering 37 Task Force (IETF). Note that other groups may also distribute 38 working documents as Internet-Drafts. The list of current Internet- 39 Drafts is at http://datatracker.ietf.org/drafts/current/. 41 Internet-Drafts are draft documents valid for a maximum of six months 42 and may be updated, replaced, or obsoleted by other documents at any 43 time.
It is inappropriate to use Internet-Drafts as reference 44 material or to cite them other than as "work in progress." 46 This Internet-Draft will expire on January 17, 2013. 48 Copyright Notice 49 Copyright (c) 2012 IETF Trust and the persons identified as the 50 document authors. All rights reserved. 52 This document is subject to BCP 78 and the IETF Trust's Legal 53 Provisions Relating to IETF Documents 54 (http://trustee.ietf.org/license-info) in effect on the date of 55 publication of this document. Please review these documents 56 carefully, as they describe your rights and restrictions with respect 57 to this document. Code Components extracted from this document must 58 include Simplified BSD License text as described in Section 4.e of 59 the Trust Legal Provisions and are provided without warranty as 60 described in the Simplified BSD License. 62 Table of Contents 64 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 65 2. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 66 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 6 67 3.1. Entities . . . . . . . . . . . . . . . . . . . . . . . . . 6 68 3.2. Data and Analysis . . . . . . . . . . . . . . . . . . . . 6 69 3.3. Identifiability . . . . . . . . . . . . . . . . . . . . . 7 70 4. Internet Privacy Threat Model . . . . . . . . . . . . . . . . 9 71 4.1. Communications Model . . . . . . . . . . . . . . . . . . . 9 72 4.2. Privacy Threats . . . . . . . . . . . . . . . . . . . . . 10 73 4.2.1. Combined Security-Privacy Threats . . . . . . . . . . 11 74 4.2.2. Privacy-Specific Threats . . . . . . . . . . . . . . . 12 75 5. Threat Mitigations . . . . . . . . . . . . . . . . . . . . . . 16 76 5.1. Data Minimization . . . . . . . . . . . . . . . . . . . . 16 77 5.1.1. Anonymity . . . . . . . . . . . . . . . . . . . . . . 16 78 5.1.2. Pseudonymity . . . . . . . . . . . . . . . . . . . . . 17 79 5.1.3. Identity Confidentiality . . . . . . . . . . . . . . . 18 80 5.1.4. Data Minimization within Identity Management . . . . . 18 81 5.2. User Participation . . . . . . . . . . . . . . . . . . . . 19 82 5.3. Security . . . . . . . . . . . . . . . . . . . . . . . . . 19 83 6. Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . 21 84 6.1. General . . . . . . . . . . . . . . . . . . . . . . . . . 21 85 6.2. Data Minimization . . . . . . . . . . . . . . . . . . . . 21 86 6.3. User Participation . . . . . . . . . . . . . . . . . . . . 22 87 6.4. Security . . . . . . . . . . . . . . . . . . . . . . . . . 23 88 7. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 89 8. Security Considerations . . . . . . . . . . . . . . . . . . . 29 90 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 91 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 31 92 11. Informative References . . . . . . . . . . . . . . . . . . . . 32 93 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 35 95 1. Introduction 97 [RFC3552] provides detailed guidance to protocol designers about both 98 how to consider security as part of protocol design and how to inform 99 readers of IETF documents about security issues. This document 100 intends to provide a similar set of guidance for considering privacy 101 in protocol design. 103 Privacy is a complicated concept with a rich history that spans many 104 disciplines. With regard to data, often it is a concept applied to 105 "personal data," information relating to an identified or 106 identifiable individual. 
Many sets of privacy principles and privacy 107 design frameworks have been developed in different forums over the 108 years. These include the Fair Information Practices (FIPs), a 109 baseline set of privacy protections pertaining to the collection and 110 use of personal data (often based on the principles established in 111 [OECD], for example), and the Privacy by Design concept, which 112 provides high-level privacy guidance for systems design (see [PbD] 113 for one example). The guidance provided in this document is inspired 114 by this prior work, but it aims to be more concrete, pointing 115 protocol designers to specific engineering choices that can impact 116 the privacy of the individuals that make use of Internet protocols. 118 Privacy as a legal concept is understood differently in different 119 jurisdictions. The guidance provided in this document is generic and 120 can be used to inform the design of any protocol to be used anywhere 121 in the world, without reference to specific legal frameworks. 123 Whether any individual document will require a specific privacy 124 considerations section will depend on the document's content. 125 Documents whose entire focus is privacy may not merit a separate 126 section (for example, [RFC3325]). For certain specifications, 127 privacy considerations are a subset of security considerations and 128 can be discussed explicitly in the security considerations section. 129 The guidance provided here can and should be used to assess the 130 privacy considerations of protocol, architectural, and operational 131 specifications and to decide whether those considerations are to be 132 documented in a stand-alone section, within the security 133 considerations section, or throughout the document. 135 This document is organized as follows. Section 2 describes the 136 extent to which the guidance offered is applicable within the IETF. 137 Section 3 explains the terminology used in this document. Section 4 138 discusses threats to privacy as they apply to Internet protocols. 139 Section 5 outlines privacy goals. Section 6 provides the guidelines 140 for analyzing and documenting privacy considerations within IETF 141 specifications. Section 7 examines the privacy characteristics of an 142 IETF protocol to demonstrate the use of the guidance framework. 144 2. Scope 146 The core function of IETF activity is building protocols. Internet 147 protocols are often built flexibly, making them useful in a variety 148 of architectures, contexts, and deployment scenarios without 149 requiring significant interdependency between disparately designed 150 components. Although protocol designers often have a particular 151 target architecture or set of architectures in mind at design time, 152 it is not uncommon for architectural frameworks to develop later, 153 after implementations exist and have been deployed in combination 154 with other protocols or components to form complete systems. 156 As a consequence, the extent to which protocol designers can foresee 157 all of the privacy implications of a particular protocol at design 158 time is significantly limited. An individual protocol may be 159 relatively benign on its own, but when deployed within a larger 160 system or used in a way not envisioned at design time, its use may 161 create new privacy risks. Protocols are often implemented and 162 deployed long after design time by different people than those who 163 did the protocol design. 
The guidelines in Section 6 ask protocol 164 designers to consider how their protocols are expected to interact 165 with systems and information that exist outside the protocol bounds, 166 but not to imagine every possible deployment scenario. 168 Furthermore, in many cases the privacy properties of a system are 169 dependent upon the complete system design where various protocols are 170 combined together to form a product solution; the implementation, 171 which includes the user interface design; and operational deployment 172 practices, including default privacy settings and security processes 173 within the company doing the deployment. These details are specific 174 to particular instantiations and generally outside the scope of the 175 work conducted in the IETF. The guidance provided here may be useful 176 in making choices about these details, but its primary aim is to 177 assist with the design, implementation, and operation of protocols. 179 Transparency of data collection and use -- often effectuated through 180 user interface design -- is normally a key factor in determining the 181 privacy impact of a system. Although most IETF activities do not 182 involve standardizing user interfaces or user-facing communications, 183 in some cases understanding expected user interactions can be 184 important for protocol design. Unexpected user behavior may have an 185 adverse impact on security and/or privacy. 187 In sum, privacy issues, even those related to protocol development, 188 go beyond the technical guidance discussed herein. As an example, 189 consider HTTP [RFC2616], which was designed to allow the exchange of 190 arbitrary data. A complete analysis of the privacy considerations 191 for uses of HTTP might include what type of data is exchanged, how 192 this data is stored, and how it is processed. Hence the analysis for 193 an individual's static personal web page would be different from that 194 for the use of HTTP to exchange health records. A protocol designer 195 working on HTTP extensions (such as WebDAV [RFC4918]) is not expected 196 to describe the privacy risks derived from all possible usage 197 scenarios, but rather the privacy properties specific to the 198 extensions and any particular uses of the extensions that are 199 expected and foreseen at design time. 201 3. Terminology 203 This section defines basic terms used in this document, with 204 references to pre-existing definitions as appropriate. 206 3.1. Entities 208 Several of these terms are further elaborated in Section 4.1. 210 $ Attacker: An entity that intentionally works against some 211 protection goal. 213 $ Eavesdropper: An entity that passively observes an initiator's 214 communications without the initiator's knowledge or authorization. 215 See [RFC4949]. 217 $ Enabler: A protocol entity that facilitates communication between 218 an initiator and a recipient without being directly in the 219 communications path. 221 $ Individual: A natural person. 223 $ Initiator: A protocol entity that initiates communications with a 224 recipient. 226 $ Intermediary: A protocol entity that sits between the initiator 227 and the recipient and is necessary for the initiator and recipient 228 to communicate. Unlike an eavesdropper, an intermediary is an 229 entity that is part of the communication architecture. For 230 example, a SIP proxy is an intermediary in the SIP architecture.
232 $ Observer: An entity that is authorized to receive and handle data 233 from an initiator and thereby is able to observe and collect 234 information, potentially posing privacy threats depending on the 235 context. As defined in this document, recipients, intermediaries, 236 and enablers can all be observers. 238 $ Recipient: A protocol entity that receives communications from an 239 initiator. 241 3.2. Data and Analysis 243 $ Correlation: The combination of various pieces of information 244 relating to an individual. 246 $ Fingerprint: A set of information elements that identifies a 247 device or application instance. 249 $ Fingerprinting: The process of an observer or attacker uniquely 250 identifying (with a sufficiently high probability) a device or 251 application instance based on multiple information elements 252 communicated to the observer or attacker. See [EFF]. 254 $ Item of Interest (IOI): Any data item that an observer or 255 attacker might be interested in. This includes attributes, 256 identifiers, identities, or communications interactions (such as 257 the sending or receiving of a communication). 259 $ Personal Data: Any information relating to an identified 260 individual or an individual who can be identified, directly or 261 indirectly. 263 $ (Protocol) Interaction: A unit of communication within a 264 particular protocol. A single interaction may be comprised of a 265 single message between an initiator and recipient or multiple 266 messages, depending on the protocol. 268 $ Traffic Analysis: The inference of information from observation 269 of traffic flows (presence, absence, amount, direction, and 270 frequency). See [RFC4949]. 272 $ Undetectability: The inability of an observer or attacker to 273 sufficiently distinguish whether an item of interest exists or 274 not. 276 $ Unlinkability: Within a particular set of information, the 277 inability of an observer or attacker to distinguish whether two 278 items of interest are related or not (with a high enough degree of 279 probability to be useful to the observer or attacker). 281 3.3. Identifiability 283 $ Anonymity: The state of being anonymous. 285 $ Anonymous: A property of an individual in which an observer or 286 attacker cannot identify the individual within a set of other 287 individuals (the anonymity set). 289 $ Attribute: A property of an individual or initiator. 291 $ Identifiable: A state in which an individual's identity is capable 292 of being known. 294 $ Identifiability: The extent to which an individual is 295 identifiable. 297 $ Identified: A state in which an individual's identity is known. 299 $ Identifier: A data object that represents a specific identity of 300 a protocol entity or individual. See [RFC4949]. 302 $ Identification: The linking of information to a particular 303 individual to infer the individual's identity or that allows the 304 inference of the individual's identity. 306 $ Identity: Any subset of an individual's attributes that 307 identifies the individual within a given context. Individuals 308 usually have multiple identities for use in different contexts. 310 $ Identity confidentiality: A property of an individual wherein any 311 party other than the recipient cannot sufficiently identify the 312 individual within a set of other individuals (the anonymity set). 314 $ Identity provider: An entity (usually an organization) that is 315 responsible for establishing, maintaining, and securing the 316 identity associated with individuals.
318 $ Pseudonym: An identifier of an individual other than the 319 individual's real name. 321 $ Pseudonymity: The state of being pseudonymous. 323 $ Pseudonymous: A property of an individual in which the individual 324 is identified by a pseudonym. 326 $ Real name: The opposite of a pseudonym. An individual's real 327 name typically comprises his or her given names and a family name. 328 An individual may have multiple real names over a lifetime, 329 including legal names. From a technological perspective it cannot 330 always be determined whether an identifier of an individual is a 331 pseudonym or a real name. 333 $ Relying party: An entity that manages access to some resource. 334 Security mechanisms allow the relying party to delegate aspects of 335 identity management to an identity provider. This delegation 336 requires protocol exchanges, trust, and a common understanding of 337 semantics of information exchanged between the relying party and 338 the identity provider. 340 4. Internet Privacy Threat Model 342 Privacy harms come in a number of forms, including harms to financial 343 standing, reputation, solitude, autonomy, and safety. A victim of 344 identity theft or blackmail, for example, may suffer a financial loss 345 as a result. Reputational harm can occur when disclosure of 346 information about an individual, whether true or false, subjects that 347 individual to stigma, embarrassment, or loss of personal dignity. 348 Intrusion or interruption of an individual's life or activities can 349 harm the individual's ability to be left alone. When individuals or 350 their activities are monitored, exposed, or at risk of exposure, 351 those individuals may be stifled from expressing themselves, 352 associating with others, and generally conducting their lives freely. 353 They may also feel a general sense of unease, in that it is "creepy" 354 to be monitored or to have data collected about them. In cases where 355 such monitoring is for the purpose of stalking or violence, it can 356 put individuals in physical danger. 358 This section lists common privacy threats (drawing liberally from 359 [Solove], as well as [CoE]), showing how each of them may cause 360 individuals to incur privacy harms and providing examples of how 361 these threats can exist on the Internet. 363 4.1. Communications Model 365 To understand attacks in the privacy-harm sense, it is helpful to 366 consider the overall communication architecture and different actors' 367 roles within it. Consider a protocol element that initiates 368 communication with some recipient (an "initiator"). Privacy analysis 369 is most relevant for protocols with use cases in which the initiator 370 acts on behalf of a natural person (or different people at different 371 times). It is this individual whose privacy is potentially 372 threatened. 374 Communications may be direct between the initiator and the recipient, 375 or they may involve an application-layer intermediary (such as a 376 proxy or cache) that is necessary for the two parties to communicate. 377 In some cases this intermediary stays in the communication path for 378 the entire duration of the communication and sometimes it is only 379 used for communication establishment, for either inbound or outbound 380 communication. In rare cases there may be a series of intermediaries 381 that are traversed. At lower layers, additional entities are 382 involved in packet forwarding that may interfere with privacy 383 protection goals as well. 
385 Some communications tasks require multiple protocol interactions with 386 different entities. For example, a request to an HTTP server may be 387 preceded by an interaction between the initiator and an 388 Authentication, Authorization, and Accounting (AAA) server for 389 network access and with a DNS server for name resolution. In this 390 case, the HTTP server is the recipient and the other entities are 391 enablers of the initiator-to-recipient communication. Similarly, a 392 single communication with the recipient may generate further protocol 393 interactions between either the initiator or the recipient and other 394 entities. For example, an HTTP request might trigger interactions 395 with an authentication server or with other resource servers. 397 As a general matter, recipients, intermediaries, and enablers are 398 usually assumed to be authorized to receive and handle data from 399 initiators. As [RFC3552] explains, "we assume that the end-systems 400 engaging in a protocol exchange have not themselves been 401 compromised." 403 Although recipients, intermediaries, and enablers may not generally 404 be considered as attackers, they may all pose privacy threats 405 (depending on the context) because they are able to observe and 406 collect privacy-relevant data. These entities are collectively 407 described below as "observers" to distinguish them from traditional 408 attackers. From a privacy perspective, one important type of 409 attacker is an eavesdropper: an entity that passively observes the 410 initiator's communications without the initiator's knowledge or 411 authorization. 413 The threat descriptions in the next section explain how observers and 414 attackers might act to harm individuals' privacy. Different kinds of 415 attacks may be feasible at different points in the communications 416 path. For example, an observer could mount surveillance or 417 identification attacks between the initiator and intermediary, or 418 instead could surveil an enabler (e.g., by observing DNS queries from 419 the initiator). 421 4.2. Privacy Threats 423 Some privacy threats are already considered in IETF protocols as a 424 matter of routine security analysis. Others are more pure privacy 425 threats that existing security considerations do not usually address. 426 The threats described here are divided into those that may also be 427 considered security threats and those that are primarily privacy 428 threats. 430 Note that an individual's awareness of and consent to the practices 431 described below can greatly affect the extent to which they threaten 432 privacy. If an individual authorizes surveillance of his own 433 activities, for example, the harms associated with it may be 434 mitigated, or the individual may accept the risk of harm. 436 4.2.1. Combined Security-Privacy Threats 438 4.2.1.1. Surveillance 440 Surveillance is the observation or monitoring of an individual's 441 communications or activities. The effects of surveillance on the 442 individual can range from anxiety and discomfort to behavioral 443 changes such as inhibition and self-censorship to the perpetration of 444 violence against the individual. The individual need not be aware of 445 the surveillance for it to impact privacy -- the possibility of 446 surveillance may be enough to harm individual autonomy. 448 Surveillance can be conducted by observers or eavesdroppers at any 449 point along the communications path.
Confidentiality protections (as 450 discussed in [RFC3552] Section 3) are necessary to prevent 451 surveillance of the content of communications. To prevent traffic 452 analysis or other surveillance of communications patterns, other 453 measures may be necessary, such as [Tor]. 455 4.2.1.2. Stored Data Compromise 457 End systems that do not take adequate measures to secure stored data 458 from unauthorized or inappropriate access expose individuals to 459 potential financial, reputational, or physical harm. 461 Protecting against stored data compromise is typically outside the 462 scope of IETF protocols. However, a number of common protocol 463 functions -- key management, access control, or operational logging, 464 for example -- require the storage of data about initiators of 465 communications. When requiring or recommending that information 466 about initiators or their communications be stored or logged by end 467 systems (see, e.g., RFC 6302), it is important to recognize the 468 potential for that information to be compromised and for that 469 potential to be weighed against the benefits of data storage. Any 470 recipient, intermediary, or enabler that stores data may be 471 vulnerable to compromise. 473 4.2.1.3. Intrusion 475 Intrusion consists of invasive acts that disturb or interrupt one's 476 life or activities. Intrusion can thwart individuals' desires to be 477 let alone, sap their time or attention, or interrupt their 478 activities. 480 Unsolicited messages and denial-of-service attacks are the most 481 common types of intrusion on the Internet. Intrusion can be 482 perpetrated by any attacker that is capable of sending unwanted 483 traffic to the initiator. 485 4.2.1.4. Misattribution 487 Misattribution occurs when data or communications related to one 488 individual are attributed to another. Misattribution can result in 489 adverse reputational, financial, or other consequences for 490 individuals that are misidentified. 492 Misattribution in the protocol context comes as a result of using 493 inadequate or insecure forms of identity or authentication. For 494 example, as [RFC6269] notes, abuse mitigation is often conducted on 495 the basis of source IP address, such that connections from individual 496 IP addresses may be prevented or temporarily blacklisted if abusive 497 activity is determined to be sourced from those addresses. However, 498 in the case where a single IP address is shared by multiple 499 individuals, those penalties may be suffered by all individuals 500 sharing the address, even if they were not involved in the abuse. 501 This threat can be mitigated by using identity management mechanisms 502 with proper forms of authentication (ideally with cryptographic 503 properties) so that actions can be attributed uniquely to an 504 individual to provide the basis for accountability without generating 505 false-positives. 507 4.2.2. Privacy-Specific Threats 509 4.2.2.1. Correlation 511 Correlation is the combination of various pieces of information 512 related to an individual. Correlation can defy people's expectations 513 of the limits of what others know about them. It can increase the 514 power that those doing the correlating have over individuals as well 515 as correlators' ability to pass judgment, threatening individual 516 autonomy and reputation. 518 Correlation is closely related to identification. Internet protocols 519 can facilitate correlation by allowing individuals' activities to be 520 tracked and combined over time. 
The use of persistent or 521 infrequently replaced identifiers at any layer of the stack can 522 facilitate correlation. For example, an initiator's persistent use 523 of the same device ID, certificate, or email address across multiple 524 interactions could allow recipients to correlate all of the 525 initiator's communications over time. 527 As an example, consider Transport Layer Security (TLS) session 528 resumption [RFC5246] or TLS session resumption without server side 529 state [RFC5077]. In RFC 5246 [RFC5246] a server provides the client 530 with a session_id in the ServerHello message and caches the 531 master_secret for later exchanges. When the client initiates a new 532 connection with the server it re-uses the previously obtained 533 session_id in its ClientHello message. The server agrees to resume 534 the session by using the same session_id and the previously stored 535 master_secret for the generation of the TLS Record Layer security 536 association. RFC 5077 [RFC5077] borrows from the session resumption 537 design idea but the server encapsulates all state information into a 538 ticket instead of caching it. An attacker who is able to observe the 539 protocol exchanges between the TLS client and the TLS server is able 540 to link the initial exchange to subsequently resumed TLS sessions 541 when the session_id and the ticket are exchanged in the clear (which 542 is the case with data exchanged in the initial handshake messages). 544 In theory any observer or attacker that receives an initiator's 545 communications can engage in correlation. The extent of the 546 potential for correlation will depend on what data the entity 547 receives from the initiator and has access to otherwise. Often, 548 intermediaries only require a small amount of information for message 549 routing and/or security. In theory, protocol mechanisms could ensure 550 that end-to-end information is not made accessible to these entities, 551 but in practice the difficulty of deploying end-to-end security 552 procedures, additional messaging or computational overhead, and other 553 business or legal requirements often slow or prevent the deployment 554 of end-to-end security mechanisms, giving intermediaries greater 555 exposure to initiators' data than is strictly necessary from a 556 technical point of view. 558 4.2.2.2. Identification 560 Identification is the linking of information to a particular 561 individual. In some contexts it is perfectly legitimate to identify 562 individuals, whereas in others identification may potentially stifle 563 individuals' activities or expression by inhibiting their ability to 564 be anonymous or pseudonymous. Identification also makes it easier 565 for individuals to be explicitly controlled by others (e.g., 566 governments) and to be treated differentially compared to other 567 individuals. 569 Many protocols provide functionality to convey the idea that some 570 means has been provided to guarantee that entities are who they claim 571 to be. Often, this is accomplished with cryptographic 572 authentication. Furthermore, many protocol identifiers, such as 573 those used in SIP or XMPP, may allow for the direct identification of 574 individuals. Protocol identifiers may also contribute indirectly to 575 identification via correlation. For example, a web site that does 576 not directly authenticate users may be able to match its HTTP header 577 logs with logs from another site that does authenticate users, 578 rendering users on the first site identifiable. 
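The log-matching risk described above can be illustrated with a short sketch. The log fields, records, and helper function below are hypothetical and are not drawn from any IETF specification; real server logs and matching heuristics vary, but the underlying join on a shared quasi-identifier is the same.

   # Illustrative sketch (Python): an observer holding HTTP logs from
   # two sites joins them on a crude quasi-identifier (client IP address
   # plus User-Agent string).  Site A does not authenticate users; site
   # B does.  The join attaches site B's authenticated usernames to site
   # A's otherwise anonymous requests.  All names and values are
   # hypothetical.

   from collections import defaultdict

   site_a_log = [  # unauthenticated site: no username recorded
       {"ip": "192.0.2.10", "ua": "ExampleBrowser/1.0",
        "path": "/forum/thread/42"},
       {"ip": "192.0.2.77", "ua": "OtherBrowser/9.2",
        "path": "/forum/thread/7"},
   ]

   site_b_log = [  # authenticating site: username recorded per request
       {"ip": "192.0.2.10", "ua": "ExampleBrowser/1.0",
        "user": "alice@example.com"},
       {"ip": "192.0.2.77", "ua": "OtherBrowser/9.2",
        "user": "bob@example.net"},
   ]

   def quasi_identifier(entry):
       # A real fingerprint might combine many more header fields.
       return (entry["ip"], entry["ua"])

   users_by_fingerprint = defaultdict(set)
   for entry in site_b_log:
       users_by_fingerprint[quasi_identifier(entry)].add(entry["user"])

   for entry in site_a_log:
       for user in users_by_fingerprint.get(quasi_identifier(entry), ()):
           print(entry["path"], "on site A is likely attributable to",
                 user)

Even this naive join attributes site A's requests to named users; weakening it requires reducing the stability of the identifiers exposed to both sites, as discussed in Section 5.1.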
580 As with correlation, any observer or attacker may be able to engage 581 in identification depending on the information about the initiator 582 that is available via the protocol mechanism or other channels. 584 4.2.2.3. Secondary Use 586 Secondary use is the use of collected information without the 587 individual's consent for a purpose different from that for which the 588 information was collected. Secondary use may violate people's 589 expectations or desires. The potential for secondary use can 590 generate uncertainty over how one's information will be used in the 591 future, potentially discouraging information exchange in the first 592 place. 594 One example of secondary use would be a network access server that 595 uses an initiator's access requests to track the initiator's 596 location. Any observer or attacker could potentially make unwanted 597 secondary uses of initiators' data. Protecting against secondary use 598 is typically outside the scope of IETF protocols. 600 4.2.2.4. Disclosure 602 Disclosure is the revelation of information about an individual that 603 affects the way others judge the individual. Disclosure can violate 604 individuals' expectations of the confidentiality of the data they 605 share. The threat of disclosure may deter people from engaging in 606 certain activities for fear of reputational harm, or simply because 607 they do not wish to be observed. 609 Any observer or attacker that receives data about an initiator may 610 engage in disclosure. Sometimes disclosure is unintentional because 611 system designers do not realize that information being exchanged 612 relates to individuals. The most common way for protocols to limit 613 disclosure is by providing access control mechanisms (discussed in 614 the next section). A further example is provided by the IETF 615 geolocation privacy architecture [RFC6280], which supports a way for 616 users to express a preference that their location information not be 617 disclosed beyond the intended recipient. 619 4.2.2.5. Exclusion 621 Exclusion is the failure to allow individuals to know about the data 622 that others have about them and to participate in its handling and 623 use. Exclusion reduces accountability on the part of entities that 624 maintain information about people and creates a sense of 625 vulnerability about individuals' ability to control how information 626 about them is collected and used. 628 The most common way for Internet protocols to be involved in limiting 629 exclusion is through access control mechanisms. The presence 630 architecture developed in the IETF is a good example where 631 individuals are included in the control of information about them. 632 Using a rules expression language (e.g., Presence Authorization Rules 633 [RFC5025]), presence clients can authorize the specific conditions 634 under which their presence information may be shared. 636 Exclusion is primarily considered problematic when the recipient 637 fails to involve the initiator in decisions about data collection, 638 handling, and use. Eavesdroppers engage in exclusion by their very 639 nature since their data collection and handling practices are covert. 641 5. Threat Mitigations 643 Privacy is notoriously difficult to measure and quantify. The extent 644 to which a particular protocol, system, or architecture "protects" or 645 "enhances" privacy is dependent on a large number of factors relating 646 to its design, use, and potential misuse. 
However, there are certain 647 widely recognized classes of mitigations against the threats 648 discussed in Section 4.2. This section describes three categories of 649 relevant mitigations: (1) data minimization, (2) user participation, 650 and (3) security. 652 5.1. Data Minimization 654 Data minimization refers to collecting, using, disclosing, and 655 storing the minimal data necessary to perform a task. The less data 656 about individuals that gets exchanged in the first place, the lower 657 the chances of that data being misused or leaked. 659 Data minimization can be effectuated in a number of different ways, 660 including by limiting collection, use, disclosure, retention, 661 identifiability, sensitivity, and access to personal data. Limiting 662 the data collected by protocol elements only to what is necessary 663 (collection limitation) is the most straightforward way to ensure 664 that use of the data does not incur privacy harm. In some cases, 665 protocol designers may also be able to recommend limits to the use or 666 retention of data, although protocols themselves are not often 667 capable of controlling these properties. 669 However, the most direct application of data minimization to protocol 670 design is limiting identifiability. Reducing the identifiability of 671 data by using pseudonymous or anonymous identifiers helps to weaken 672 the link between an individual and his or her communications. 673 Allowing for the periodic creation of new identifiers reduces the 674 possibility that multiple protocol interactions or communications can 675 be correlated back to the same individual. The following sections 676 explore a number of different properties related to identifiability 677 that protocol designers may seek to achieve. 679 (Threats mitigated: surveillance, stored data compromise, 680 correlation, identification, secondary use, disclosure) 682 5.1.1. Anonymity 684 To enable anonymity of an individual, there must exist a set of 685 individuals with potentially the same attributes. To the attacker or 686 the observer these individuals must appear indistinguishable from 687 each other. The set of all such individuals is known as the 688 anonymity set and membership of this set may vary over time. 690 The composition of the anonymity set depends on the knowledge of the 691 observer or attacker. Thus anonymity is relative with respect to the 692 observer or attacker. An initiator may be anonymous only within a 693 set of potential initiators -- its initiator anonymity set -- which 694 itself may be a subset of all individuals that may initiate 695 communications. Conversely, a recipient may be anonymous only within 696 a set of potential recipients -- its recipient anonymity set. Both 697 anonymity sets may be disjoint, may overlap, or may be the same. 699 As an example, consider RFC 3325 (P-Asserted-Identity, PAI) 700 [RFC3325], an extension for the Session Initiation Protocol (SIP) 701 that allows an individual, such as a VoIP caller, to instruct an 702 intermediary that he or she trusts not to populate the SIP From 703 header field with the individual's authenticated and verified 704 identity. The recipient of the call, as well as any other entity 705 outside of the individual's trust domain, would therefore only learn 706 that the SIP message (typically a SIP INVITE) was sent with a header 707 field 'From: "Anonymous" <sip:anonymous@anonymous.invalid>' rather 708 than the individual's address-of-record, which is typically thought 709 of as the "public address" of the user.
When PAI is used, the 710 individual becomes anonymous within the initiator anonymity set that 711 is populated by every individual making use of that specific 712 intermediary. 714 Note that this example ignores the fact that other personal data may 715 be inferred from the other SIP protocol payloads. This caveat makes 716 the analysis of the specific protocol extension easier but cannot be 717 assumed when conducting analysis of an entire architecture. 719 5.1.2. Pseudonymity 721 In the context of IETF protocols, almost all identifiers are 722 pseudonyms since there is typically no requirement to use real names 723 in protocols. However, in certain scenarios it is reasonable to 724 assume that real names will be used (with vCard [RFC6350], for 725 example). 727 Pseudonymity is strengthened when less personal data can be linked to 728 the pseudonym; when the same pseudonym is used less often and across 729 fewer contexts; and when independently chosen pseudonyms are more 730 frequently used for new actions (making them, from an observer's or 731 attacker's perspective, unlinkable). 733 For Internet protocols it is important to consider whether protocols 734 allow pseudonyms to be changed without human interaction, the default 735 length of pseudonym lifetimes, to whom pseudonyms are exposed, how 736 individuals are able to control disclosure, how often pseudonyms can 737 be changed, and the consequences of changing them. 739 5.1.3. Identity Confidentiality 741 An initiator has identity confidentiality when any party other than 742 the recipient cannot sufficiently identify the initiator within the 743 anonymity set. In comparison to anonymity and pseudonymity, identity 744 confidentiality is concerned with eavesdroppers and intermediaries. 746 As an example, consider the network access authentication procedures 747 utilizing the Extensible Authentication Protocol (EAP) [RFC3748]. 748 EAP includes an identity exchange where the Identity Response is 749 primarily used for routing purposes and selecting which EAP method to 750 use. Since EAP Identity Requests and Responses are sent in 751 cleartext, eavesdroppers and intermediaries along the communication 752 path between the EAP peer and the EAP server can snoop on the 753 identity. To address this threat, as discussed in RFC 4282 [RFC4282], 754 the user's identity can be hidden from these eavesdroppers and 755 intermediaries with the cryptographic support provided by EAP methods. 756 Identity confidentiality has become a recommended design criterion for 757 EAP (see [RFC4017]). EAP-AKA [RFC4187], for example, protects the 758 EAP peer's identity against passive adversaries by utilizing temporary 759 identities. EAP-IKEv2 [RFC5106] is an example of an EAP method that 760 offers protection against active attackers with regard to the 761 individual's identity. 763 5.1.4. Data Minimization within Identity Management 765 Modern systems are increasingly relying on multi-party transactions 766 to authenticate individuals. Many of these systems make use of an 767 identity provider that is responsible for providing authentication 768 and authorization information to entities (relying parties) that 769 require authentication or authorization of individuals in order to 770 process transactions or grant access. To facilitate the provision of 771 authentication and authorization, an identity provider will usually 772 go through a process of verifying the individual's identity and 773 issuing the individual a set of credentials.
When an individual 774 seeks to make use of a service provided by the relying party, the 775 relying party relies on the authentication and authorization 776 assertions provided by its identity provider. 778 Such systems have the ability to support a number of properties that 779 minimize data collection in different ways: 781 Relying parties can be prevented from knowing the real or 782 pseudonymous identity of an individual, since the identity 783 provider is the only entity involved in verifying identity. 785 Relying parties that collude can be prevented from using an 786 individual's credentials to track the individual. That is, two 787 different relying parties can be prevented from determining that 788 the same individual has authenticated to both of them. This 789 requires that each relying party use a different means of 790 identifying individuals. 792 The identity provider can be prevented from knowing which relying 793 parties an individual interacted with. This requires avoiding 794 direct communication between the identity provider and the relying 795 party at the time when access to a resource by the initiator is 796 made. 798 5.2. User Participation 800 As explained in Section 4.2.2.5, data collection and use that happens 801 "in secret," without the individual's knowledge, is apt to violate 802 the individual's expectation of privacy and may create incentives for 803 misuse of data. As a result, privacy regimes tend to include 804 provisions to support informing individuals about data collection and 805 use and involving them in decisions about the treatment of their 806 data. In an engineering context, supporting the goal of user 807 participation usually means providing ways for users to control the 808 data that is shared about them. It may also mean providing ways for 809 users to signal how they expect their data to be used and shared. 810 (Threats mitigated: surveillance, secondary use, disclosure, 811 exclusion) 813 5.3. Security 815 Keeping data secure at rest and in transit is another important 816 component of privacy protection. As they are described in [RFC3552] 817 Section 2, a number of security goals also serve to enhance privacy: 819 o Confidentiality: Keeping data secret from unintended listeners. 821 o Peer entity authentication: Ensuring that the endpoint of a 822 communication is the one that is intended (in support of 823 maintaining confidentiality). 825 o Unauthorized usage: Limiting data access to only those users who 826 are authorized. (Note that this goal also falls within data 827 minimization.) 829 o Inappropriate usage: Limiting how authorized users can use data. 830 (Note that this goal also falls within data minimization.) 832 Note that even when these goals are achieved, the existence of items 833 of interest -- attributes, identifiers, identities, communications, 834 actions (such as the sending or receiving of a communication), or 835 anything else an attacker or observer might be interested in -- may 836 still be detectable, even if they are not readable. Thus 837 undetectability, in which an observer or attacker cannot sufficiently 838 distinguish whether an item of interest exists or not, may be 839 considered as a further security goal (albeit one that can be 840 extremely difficult to accomplish). 842 (Threats mitigated: surveillance, stored data compromise, 843 misattribution, secondary use, disclosure, intrusion) 845 6. 
Guidelines 847 This section provides guidance for document authors in the form of a 848 questionnaire about a protocol being designed. The questionnaire may 849 be useful at any point in the design process, particularly after 850 document authors have developed a high-level protocol model as 851 described in [RFC4101]. 853 Note that the guidance does not recommend specific practices. The 854 range of protocols developed in the IETF is too broad to make 855 recommendations about particular uses of data or how privacy might be 856 balanced against other design goals. However, by carefully 857 considering the answers to each question, document authors should be 858 able to produce a comprehensive analysis that can serve as the basis 859 for discussion of whether the protocol adequately protects against 860 privacy threats. 862 The framework is divided into four sections that address each of the 863 mitigation classes from Section 5, plus a general section. Security 864 is not fully elaborated since substantial guidance already exists in 865 [RFC3552]. 867 6.1. General 869 a. Trade-offs. Does the protocol make trade-offs between privacy 870 and usability, privacy and efficiency, privacy and 871 implementability, or privacy and other design goals? Describe the 872 trade-offs and the rationale for the design chosen. 874 6.2. Data Minimization 876 a. Identifiers. What identifiers does the protocol use for 877 distinguishing initiators of communications? Does the protocol 878 use identifiers that allow different protocol interactions to be 879 correlated? 881 b. Data. What information does the protocol expose about 882 individuals, their devices, and/or their device usage (other than 883 the identifiers discussed in (a))? To what extent is this 884 information linked to the identities of the individuals? How does 885 the protocol combine personal data with the identifiers discussed 886 in (a)? 888 c. Observers. Which information discussed in (a) and (b) is 889 exposed to each other protocol entity (i.e., recipients, 890 intermediaries, and enablers)? Are there ways for protocol 891 implementers to choose to limit the information shared with each 892 entity? Are there operational controls available to limit the 893 information shared with each entity? 895 d. Fingerprinting. In many cases the specific ordering and/or 896 occurrences of information elements in a protocol allow users, 897 devices, or software using the protocol to be fingerprinted. Is 898 this protocol vulnerable to fingerprinting? If so, how? 900 e. Persistence of identifiers. What assumptions are made in the 901 protocol design about the lifetime of the identifiers discussed in 902 (a)? Does the protocol allow implementers or users to delete or 903 replace identifiers? How often does the specification recommend 904 to delete or replace identifiers by default? 906 f. Correlation. Does the protocol allow for correlation of 907 identifiers? Are there expected ways that information exposed by 908 the protocol will be combined or correlated with information 909 obtained outside the protocol? How will such combination or 910 correlation facilitate fingerprinting of a user, device, or 911 application? Are there expected combinations or correlations with 912 outside data that will make users of the protocol more 913 identifiable? 915 g. Retention. Do the protocol or its anticipated uses require 916 that the information discussed in (a) or (b) be retained by 917 recipients, intermediaries, or enablers? 
Is the retention 918 expected to be persistent or temporary? 920 6.3. User Participation 922 a. User control. What controls or consent mechanisms does the 923 protocol define or require before personal data or identifiers are 924 shared or exposed via the protocol? If no such mechanisms are 925 specified, is it expected that control and consent will be handled 926 outside of the protocol? 928 b. Control over sharing with individual recipients. Does the 929 protocol provide ways for initiators to share different 930 information with different recipients? If not, are there 931 mechanisms that exist outside of the protocol to provide 932 initiators with such control? 934 c. Control over sharing with intermediaries. Does the protocol 935 provide ways for initiators to limit which information is shared 936 with intermediaries? If not, are there mechanisms that exist 937 outside of the protocol to provide users with such control? Is it 938 expected that users will have relationships (contractual or 939 otherwise) with intermediaries that govern the use of the 940 information? 941 d. Preference expression. Does the protocol provide ways for 942 initiators to express individuals' preferences to recipients or 943 intermediaries with regard to the collection, use, or disclosure 944 of their personal data? 946 6.4. Security 948 a. Surveillance. How do the protocol's security considerations 949 prevent surveillance, including eavesdropping and traffic 950 analysis? 952 b. Stored data compromise. How do the protocol's security 953 considerations prevent or mitigate stored data compromise? 955 c. Intrusion. How do the protocol's security considerations 956 prevent or mitigate intrusion, including denial-of-service attacks 957 and unsolicited communications more generally? 959 d. Misattribution. How do the protocol's mechanisms for 960 identifying and/or authenticating individuals prevent 961 misattribution? 963 7. Example 965 The following section gives an example of the threat analysis and 966 threat mitigation recommended by this document. It covers a 967 particularly difficult application protocol, presence, to try to 968 demonstrate these principles on an architecture that is vulnerable to 969 many of the threats described above. This text is not intended as an 970 example of a Privacy Considerations section that might appear in an 971 IETF specification, but rather as an example of the thinking that 972 should go into the design of a protocol when considering privacy as a 973 first principle. 975 A presence service, as defined in the abstract in [RFC2778], allows 976 users of a communications service to monitor one another's 977 availability and disposition in order to make decisions about 978 communicating. Presence information is highly dynamic, and generally 979 characterizes whether a user is online or offline, busy or idle, away 980 from communications devices or nearby, and the like. Necessarily, 981 this information has certain privacy implications, and from the start 982 the IETF approached this work with the aim to provide users with the 983 controls to determine how their presence information would be shared. 984 The Common Profile for Presence (CPP) [RFC3859] defines a set of 985 logical operations for delivery of presence information. This 986 abstract model is applicable to multiple presence systems. 
The SIP- 987 based SIMPLE presence system [RFC3261] uses CPP as its baseline 988 architecture, and the presence operations in the Extensible Messaging 989 and Presence Protocol (XMPP) have also been mapped to CPP [RFC3922]. 991 The fundamental architecture defined in RFC 2778 and RFC 3859 is a 992 mediated one. Clients (presentities in RFC 2778 terms) publish their 993 presence information to presence servers, which in turn distribute 994 information to authorized watchers. Presence servers thus retain 995 presence information for an interval of time, until it either changes 996 or expires, so that it can be revealed to authorized watchers upon 997 request. This architecture mirrors existing pre-standard deployment 998 models. The integration of an explicit authorization mechanism into 999 the presence architecture has been widely successful in involving the 1000 end users in the decision-making process before sharing information. 1001 Nearly all presence systems deployed today provide such a mechanism, 1002 typically through a reciprocal authorization system by which a pair 1003 of users, when they agree to be "buddies," consent to divulge their 1004 presence information to one another. Buddylists are managed by 1005 servers but controlled by end users. Users can also explicitly block 1006 one another through a similar interface, and in some deployments it 1007 is desirable to provide "polite blocking" of various kinds. 1009 From a perspective of privacy design, however, the classical presence 1010 architecture represents nearly a worst-case scenario. In terms of 1011 data minimization, presentities share their sensitive information 1012 with presence services, and while services only share this presence 1013 information with watchers authorized by the user, no technical 1014 mechanism constrains those watchers from relaying presence to further 1015 third parties. Any of these entities could conceivably log or retain 1016 presence information indefinitely. The sensitivity cannot be 1017 mitigated by rendering the user anonymous, as it is indeed the 1018 purpose of the system to facilitate communications between users who 1019 know one another. The identifiers employed by users are long-lived 1020 and often contain personal information, including real names and the 1021 domains of service providers. While users do participate in the 1022 construction of buddylists and blacklists, they do so with little 1023 prospect for accountability: the user effectively throws their 1024 presence information over the wall to a presence server that in turn 1025 distributes the information to watchers. Users typically have no way 1026 to verify that presence is being distributed only to authorized 1027 watchers, especially as it is the server that authenticates watchers, 1028 not the end user. Connections between the server and all publishers 1029 and consumers of presence data are moreover an attractive target for 1030 eavesdroppers, and require strong confidentiality mechanisms, though 1031 again the end user has no way to verify what mechanisms are in place 1032 between the presence server and a watcher. 1034 Moreover, the sensitivity of presence information is not limited to 1035 the disposition and capability to communicate. Capability can reveal 1036 the type of device that a user employs, for example, and since 1037 multiple devices can publish the same user's presence, there are 1038 significant risks of allowing attackers to correlate user devices.
1039 An important extension to presence was developed to enable the 1040 support for location sharing. The effort to standardize protocols 1041 for systems sharing geolocation was started in the GEOPRIV working 1042 group. During the initial requirements and privacy threat analysis 1043 in the process of chartering the working group, it became clear that 1044 the system would require an underlying communication mechanism 1045 supporting user consent to share location information. The 1046 resemblance of these requirements to the presence framework was 1047 quickly recognized, and this design decision was documented in 1048 [RFC4079]. Location information thus mingles with other presence 1049 information available through the system to intermediaries and to 1050 authorized watchers. 1052 Privacy concerns about presence information largely arise due to the 1053 built-in mediation of the presence architecture. The need for a 1054 presence server is motivated by two primary design requirements of 1055 presence: in the first place, the server can respond with an 1056 "offline" indication when the user is not online; in the second 1057 place, the server can compose presence information published by 1058 different devices under the user's control. Additionally, to 1059 preserve the use of URIs as identifiers for entities, some service 1060 must operate a host with the domain name appearing in a presence URI, 1061 and in practical terms no commercial presence architecture would 1062 force end users to own and operate their own domain names. Many end 1063 users of applications like presence are behind NATs or firewalls, and 1064 effectively cannot receive direct connections from the Internet - the 1065 persistent bidirectional channel these clients open and maintain with 1066 a presence server is essential to the operation of the protocol. 1068 One must first ask if the trade-off of mediation is worth it, for 1069 presence. Does a server need to be in the middle of all publications 1070 of presence information? It might seem that end-to-end encryption of 1071 the presence information could solve many of these problems. A 1072 presentity could encrypt the presence information with the public key 1073 of a watcher, and only then send the presence information through the 1074 server. The IETF defined an object format for presence information 1075 called the Presence Information Data Format (PIDF), which for the 1076 purposes of conveying location information was extended to the PIDF 1077 Location Object (PIDF-LO) - these XML objects were designed to 1078 accommodate an encrypted wrapper. Encrypting this data would have 1079 the added benefit of preventing stored cleartext presence information 1080 from being seized by an attacker who manages to compromise a presence 1081 server. This proposal, however, quickly runs into usability 1082 problems. Discovering the public keys of watchers is the first 1083 difficulty, one that few Internet protocols have addressed 1084 successfully. This solution would then require the presentity to 1085 publish one encrypted copy of its presence information per authorized 1086 watcher to the presence service, regardless of whether or not a 1087 watcher is actively seeking presence information - for a presentity 1088 with many watchers, this may place an unacceptable burden on the 1089 presence server, especially given the dynamism of presence 1090 information. 
Finally, it prevents the server from composing presence information reported by multiple devices under the same user's control. On the whole, these difficulties render object encryption of presence information a doubtful prospect.

Some protocols that provide presence information, such as SIP, can operate intermediaries in a redirecting mode rather than a publishing or proxying mode. Instead of sending presence information through the server, in other words, these protocols can merely redirect watchers to the presentity, so that presence information can pass directly and securely from the presentity to the watcher. In that case, the presentity can decide exactly what information it would like to share with the watcher in question, it can authenticate the watcher itself with whatever strength of credential it chooses, and with end-to-end encryption it can reduce the likelihood of any eavesdropping. In a redirection architecture, a presence server could still provide the necessary "offline" indication without observing and forwarding all information itself. This mechanism is more promising than encryption but also suffers from significant difficulties. It too does not provide for composition of presence information from multiple devices - in fact, it forces the watcher to perform this composition itself, which may lead to unexpected results. The largest single impediment to this approach, however, is the difficulty of creating end-to-end connections between the presentity's device(s) and a watcher, as some or all of these endpoints may be behind NATs or firewalls that prevent peer-to-peer connections. While there are potential solutions to this problem, such as STUN and TURN, they add complexity to the overall system.

Consequently, mediation is a difficult feature of the presence architecture to remove, and due especially to the requirement for composition, it is hard to minimize the data shared with intermediaries. Control over sharing with intermediaries must therefore come from some other explicit component of the architecture. As such, the presence work in the IETF focused on improving user participation in the activities of the presence server. This work began in the GEOPRIV working group, with controls on location privacy, as the location of users is perceived as especially sensitive. To meet the privacy requirements defined in [RFC2779], a set of usage indications - such as whether retransmission is allowed or when the retention period expires - has been added to PIDF-LO; these indications always travel with the location information itself. These privacy preferences apply not only to the intermediaries that store and forward presence information, but also to the watchers who consume it.

This approach very much follows the spirit of Creative Commons [1], namely the use of a limited number of conditions (such as 'Share Alike' [2]). Unlike Creative Commons, however, the GEOPRIV working group did not initiate work to produce legal language or to design graphical icons, since this would fall outside the scope of the IETF.
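To illustrate how these usage indications travel with the data, the following is a minimal, purely illustrative sketch of the GEOPRIV portion of a PIDF-LO tuple; the coarse civic location and the expiry time are invented, and only the retransmission and retention preferences are shown:

   <gp:geopriv xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
               xmlns:cl="urn:ietf:params:xml:ns:pidf:geopriv10:civicAddr">
     <gp:location-info>
       <cl:civicAddress>
         <cl:country>US</cl:country>
         <cl:A1>CA</cl:A1>
         <cl:A3>Concord</cl:A3>
       </cl:civicAddress>
     </gp:location-info>
     <gp:usage-rules>
       <!-- Recipients are asked not to pass the location on ... -->
       <gp:retransmission-allowed>no</gp:retransmission-allowed>
       <!-- ... and to discard it after this time. -->
       <gp:retention-expiry>2012-07-17T09:00:00Z</gp:retention-expiry>
     </gp:usage-rules>
   </gp:geopriv>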
In particular, the GEOPRIV rules state a preference on the retention and retransmission of location information; while GEOPRIV cannot force any entity receiving a PIDF-LO object to abide by those preferences, if users lack the ability to express them at all, their preferences certainly will not be honored.

The retention and retransmission elements were envisioned as only the first and most essential examples of preference expression in sharing presence. The PIDF object was designed for extensibility, and the rulesets created for PIDF-LO can also be extended to provide new expressions of user preference. Not all user preference information should be bound into a particular PIDF object, however - many forms of access control policy assumed by the presence architecture need to be provisioned in the presence server through some interface with the user. This requirement eventually triggered the standardization of a general access control policy language called the Common Policy framework (defined in [RFC4745]). This language allows one to express ways to control the distribution of information as simple rules - conditions, actions, and transformations - expressed in an XML format. Common Policy itself is an abstract format that needs to be instantiated; two examples are the Presence Authorization Rules [RFC5025] and the Geolocation Policy [I-D.ietf-geopriv-policy]. The former provides additional expressiveness for presence-based systems, while the latter defines the syntax and semantics of location-based conditions and transformations.

Ultimately, the privacy work on presence represents a compromise between privacy principles and the needs of the architecture and marketplace. While it was not feasible to remove intermediaries from the architecture entirely, nor to prevent their access to presence information, the IETF did provide a way for users to express their preferences and provision their controls at the presence service. By documenting and acknowledging the limitations of these mechanisms, the designers were able to provide implementers, and end users, with an informed perspective on the privacy properties of the IETF's presence protocols.

8. Security Considerations

This document describes privacy aspects that protocol designers should consider in addition to regular security analysis.

9. IANA Considerations

This document does not require actions by IANA.

10. Acknowledgements

We would like to thank Christine Runnegar for her extensive and helpful review comments.

We would also like to thank Scott Brim, Kasey Chappelle, Marc Linsner, Bryan McLaughlin, Nick Mathewson, Eric Rescorla, Scott Bradner, Nat Sakimura, Bjoern Hoehrmann, David Singer, Dean Willis, Lucy Lynch, Trent Adams, Mark Lizar, Martin Thomson, Josh Howlett, Mischa Tuffield, S. Moonesamy, Ted Hardie, Zhou Sujing, Claudia Diaz, Leif Johansson, and Klaas Wierenga.

Finally, we would like to thank the participants of the December 2010 Internet Privacy workshop, co-organized by MIT, ISOC, W3C, and the IAB, for the feedback they provided.

11. Informative References

[Chaum]  Chaum, D., "Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms", Communications of the ACM, 24/2, 84-88, 1981.
[CoE]  Council of Europe, "Recommendation CM/Rec(2010)13 of the Committee of Ministers to member states on the protection of individuals with regard to automatic processing of personal data in the context of profiling", available at https://wcd.coe.int/ViewDoc.jsp?Ref=CM/Rec%282010%2913 (November 2010), 2010.

[EFF]  Electronic Frontier Foundation, "Panopticlick", 2011.

[I-D.iab-identifier-comparison]  Thaler, D., "Issues in Identifier Comparison for Security Purposes", draft-iab-identifier-comparison-02 (work in progress), May 2012.

[I-D.ietf-geopriv-policy]  Schulzrinne, H., Tschofenig, H., Cuellar, J., Polk, J., Morris, J., and M. Thomson, "Geolocation Policy: A Document Format for Expressing Privacy Preferences for Location Information", draft-ietf-geopriv-policy-26 (work in progress), June 2012.

[OECD]  Organization for Economic Co-operation and Development, "OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data", available at http://www.oecd.org/EN/document/0,,EN-document-0-nodirectorate-no-24-10255-0,00.html (September 2010), 1980.

[PbD]  Office of the Information and Privacy Commissioner, Ontario, Canada, "Privacy by Design", 2011.

[RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

[RFC2778]  Day, M., Rosenberg, J., and H. Sugano, "A Model for Presence and Instant Messaging", RFC 2778, February 2000.

[RFC2779]  Day, M., Aggarwal, S., Mohr, G., and J. Vincent, "Instant Messaging / Presence Protocol Requirements", RFC 2779, February 2000.

[RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3325]  Jennings, C., Peterson, J., and M. Watson, "Private Extensions to the Session Initiation Protocol (SIP) for Asserted Identity within Trusted Networks", RFC 3325, November 2002.

[RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, July 2003.

[RFC3748]  Aboba, B., Blunk, L., Vollbrecht, J., Carlson, J., and H. Levkowetz, "Extensible Authentication Protocol (EAP)", RFC 3748, June 2004.

[RFC3859]  Peterson, J., "Common Profile for Presence (CPP)", RFC 3859, August 2004.

[RFC3922]  Saint-Andre, P., "Mapping the Extensible Messaging and Presence Protocol (XMPP) to Common Presence and Instant Messaging (CPIM)", RFC 3922, October 2004.

[RFC4017]  Stanley, D., Walker, J., and B. Aboba, "Extensible Authentication Protocol (EAP) Method Requirements for Wireless LANs", RFC 4017, March 2005.

[RFC4079]  Peterson, J., "A Presence Architecture for the Distribution of GEOPRIV Location Objects", RFC 4079, July 2005.

[RFC4101]  Rescorla, E. and IAB, "Writing Protocol Models", RFC 4101, June 2005.

[RFC4187]  Arkko, J. and H. Haverinen, "Extensible Authentication Protocol Method for 3rd Generation Authentication and Key Agreement (EAP-AKA)", RFC 4187, January 2006.

[RFC4282]  Aboba, B., Beadles, M., Arkko, J., and P. Eronen, "The Network Access Identifier", RFC 4282, December 2005.
[RFC4745]  Schulzrinne, H., Tschofenig, H., Morris, J., Cuellar, J., Polk, J., and J. Rosenberg, "Common Policy: A Document Format for Expressing Privacy Preferences", RFC 4745, February 2007.

[RFC4918]  Dusseault, L., "HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV)", RFC 4918, June 2007.

[RFC4949]  Shirey, R., "Internet Security Glossary, Version 2", RFC 4949, August 2007.

[RFC5025]  Rosenberg, J., "Presence Authorization Rules", RFC 5025, December 2007.

[RFC5077]  Salowey, J., Zhou, H., Eronen, P., and H. Tschofenig, "Transport Layer Security (TLS) Session Resumption without Server-Side State", RFC 5077, January 2008.

[RFC5106]  Tschofenig, H., Kroeselberg, D., Pashalidis, A., Ohba, Y., and F. Bersani, "The Extensible Authentication Protocol-Internet Key Exchange Protocol version 2 (EAP-IKEv2) Method", RFC 5106, February 2008.

[RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2", RFC 5246, August 2008.

[RFC6269]  Ford, M., Boucadair, M., Durand, A., Levis, P., and P. Roberts, "Issues with IP Address Sharing", RFC 6269, June 2011.

[RFC6280]  Barnes, R., Lepinski, M., Cooper, A., Morris, J., Tschofenig, H., and H. Schulzrinne, "An Architecture for Location and Location Privacy in Internet Applications", BCP 160, RFC 6280, July 2011.

[RFC6350]  Perreault, S., "vCard Format Specification", RFC 6350, August 2011.

[Solove]  Solove, D., "Understanding Privacy", 2010.

[Tor]  The Tor Project, Inc., "Tor", 2011.

[1]

[2]

Authors' Addresses

Alissa Cooper
CDT
1634 Eye St. NW, Suite 1100
Washington, DC 20006
US

Phone: +1-202-637-9800
Email: acooper@cdt.org
URI: http://www.cdt.org/

Hannes Tschofenig
Nokia Siemens Networks
Linnoitustie 6
Espoo 02600
Finland

Phone: +358 (50) 4871445
Email: Hannes.Tschofenig@gmx.net
URI: http://www.tschofenig.priv.at

Bernard Aboba
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052
US

Email: bernarda@microsoft.com

Jon Peterson
NeuStar, Inc.
1800 Sutter St Suite 570
Concord, CA 94520
US

Email: jon.peterson@neustar.biz

John B. Morris, Jr.

Email: ietf@jmorris.org

Marit Hansen
ULD Kiel

Email: marit.hansen@datenschutzzentrum.de

Rhys Smith
JANET(UK)

Email: rhys.smith@ja.net