perpass non-WG                                               B. Trammell
Internet-Draft                                                ETH Zurich
Intended status: Informational                               D. Borkmann
Expires: May 17, 2014                                            Red Hat
                                                              C. Huitema
                                                   Microsoft Corporation
                                                       November 13, 2013


          A Threat Model for Pervasive Passive Surveillance
                   draft-trammell-perpass-ppa-01.txt

Abstract

This document elaborates a threat model for pervasive surveillance.  We assume an adversary with an interest in indiscriminate eavesdropping that can passively observe network traffic at every layer, at every point in the network, between the endpoints.  It is intended to demonstrate to protocol designers and implementors the observability and inferability of information and metainformation transported over their respective protocols, to assist in evaluating the performance of these protocols and the effectiveness of their protection mechanisms under pervasive passive surveillance.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).  Note that other groups may also distribute working documents as Internet-Drafts.  The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.
It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on May 17, 2014.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document.  Please review these documents carefully, as they describe your rights and restrictions with respect to this document.  Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Pervasive Passive Adversary
   4.  Threat analysis
     4.1.  Information subject to direct observation
     4.2.  Information useful for inference
     4.3.  On the Non-Anonymity of IP Addresses
       4.3.1.  Analysis of IP headers
       4.3.2.  Correlation of IP addresses to user identities
       4.3.3.  Monitoring messaging clients for IP address correlation
       4.3.4.  Retrieving IP addresses from mail headers
       4.3.5.  Tracking address use with web cookies
       4.3.6.  Tracking address use with network graphs
   5.  Evaluating protocols for PPA resistance
   6.  General protocol design recommendations for PPA resistance
     6.1.  Encrypt everything you can
     6.2.  Design and implement for simplicity and auditability
     6.3.  Allow for fingerprinting resistance in protocol designs
     6.4.  Do not rely on static IP addresses
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgments
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

Surveillance is defined in [RFC6973], Section 5.1.1, as "the observation or monitoring of an individual's communications or activities".  Pervasive passive surveillance in the Internet is the practice of surveillance at widespread observation points, without any particular target in mind at the time of surveillance, and without any modification or injection of network traffic.  Pervasive passive surveillance allows subsequent analysis and inference to be applied to the collected data to achieve surveillance aims on a target to be identified later, or to analyze general communications patterns and/or behaviors without a specified target individual or group.
This differentiates privacy in the face of pervasive surveillance from privacy as addressed in the literature, in that threats to privacy are generally (as in [RFC6973]) taken to have as a specific goal revealing the identity and/or associations of a specified individual; defeating pervasive surveillance of a large population is therefore more difficult than protecting the privacy of a single individual within a larger population.

In this document, we take as given that communications systems should aim to provide privacy guarantees to their users, and that susceptibility to pervasive surveillance should be avoided to the extent possible as a goal in protocol design.  From these assumptions we take the very act of pervasive surveillance to be adversarial by definition.

This document outlines a threat model for an entity performing pervasive passive surveillance, termed the Pervasive Passive Adversary (PPA), and explores how to apply this model to the evaluation of protocols.  As the primary threat posed by pervasive surveillance is a threat to the privacy of the parties to a given communication, this document is heavily based on [RFC6973].

2.  Terminology

[EDITOR'S NOTE: Check to see whether we actually use these...]

The terms Anonymity, Anonymity Set, Anonymous, Attacker, Eavesdropper, Fingerprint, Fingerprinting, Identifier, Identity, Individual, Initiator, Intermediary, Observer, Pseudonym, Pseudonymity, Pseudonymous, Recipient, and Traffic Analysis are used in this document as defined by Section 3, Terminology, of [RFC6973].  In addition, this document defines the following terms:

Observation:  Information collected directly from communications by an eavesdropper or observer.  For example, the knowledge that <alice@example.com> sent a message to <bob@example.com> via SMTP, taken from the headers of an observed SMTP message, would be an observation.

Inference:  Information extracted from analysis of information collected directly from communications by an eavesdropper or observer.  For example, the knowledge that a given web page was accessed by a given IP address, derived by comparing the size in octets of measured network flow records to fingerprints derived from known sizes of linked resources on the web servers involved, would be an inference.

3.  The Pervasive Passive Adversary

The pervasive passive adversary (PPA) is an indiscriminate eavesdropper on a computer network that can:

o  observe every packet of all communications at any or every hop in any network path between an initiator and a recipient; and can

o  observe data at rest in intermediate systems between the endpoints controlled by the initiator and recipient; but

o  takes no other action with respect to these communications (i.e., blocking, modification, injection, etc.).

We note that a threat model that limits the adversary to being completely passive may under-represent the threat to communications privacy posed especially by well-resourced adversaries, but submit that it represents the maximum capability of a single entity interested in remaining undetectable.

The techniques available to the PPA are direct observation and inference.  Direct observation involves taking information directly from eavesdropped communications - e.g., URLs identifying content or email addresses identifying individuals from application-layer headers.
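As a minimal illustration of direct observation, the following Python sketch (hypothetical; it assumes a cleartext application payload has already been captured and is available as bytes) extracts email addresses and an HTTP Host header with simple pattern matching:

   import re

   def observe_identifiers(payload: bytes) -> dict:
       """Extract directly observable identifiers from a cleartext payload."""
       text = payload.decode("utf-8", errors="replace")
       emails = re.findall(r"[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}", text)
       hosts = re.findall(r"^Host:\s*(\S+)", text, flags=re.MULTILINE)
       return {"email_addresses": set(emails), "http_hosts": set(hosts)}

   # Example: a cleartext HTTP request observed on the wire
   sample = (b"GET /inbox HTTP/1.1\r\n"
             b"Host: mail.example.com\r\n"
             b"From: alice@example.com\r\n\r\n")
   print(observe_identifiers(sample))

Everything the sketch reports is read directly off the wire; no analysis beyond pattern matching is involved, which is what distinguishes observation from inference.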
Inference, on the other hand, involves analyzing eavesdropped information to derive new information from it; e.g., searching for application or behavioral fingerprints in observed traffic to derive information about the observed individual from them, in the absence of directly-observed sources of the same information.

We would like to assume that the PPA does not have the ability to observe communications on trusted systems at either the initiator or a recipient of a communication, as there would seem to be little that a protocol designer could do in the case of compromised endpoints.  However, given the state of vulnerability of many endpoints to various security exploits, we would encourage protocol designers to consider the protections their protocols afford to the privacy of their users even in the face of partially compromised endpoints.

The PPA may additionally have privileged information allowing the reversal of strong encryption -- e.g. compromised key material or knowledge of weaknesses in the design or implementation of cryptographic algorithms or random number generators at the initiator, recipient, and/or intermediaries.  However, we consider the evaluation and improvement of cryptographic protections, while important to improving the security of the Internet in the face of pervasive surveillance, to be out of scope for this work: here, we will assume that a given cryptographic protection for a protocol works as advertised.

4.  Threat analysis

On initial examination, the PPA would appear to be trivially impossible to defend against.  If the PPA has access to every byte of every packet of a communication, then the full application payload and content is available for applications which do not provide encryption.

Guidance to protocol designers [RFC3365] to provide cryptographic protection of confidentiality in their protocols improves this situation a great deal.  The use of TLS [RFC5246] reduces the information available for correlation to the network and transport layer headers (e.g. source and destination IP addresses and ports) on each hop, but leaves any data at rest used by a protocol on intermediate systems vulnerable to intermediate system compromise.

End-to-end approaches (e.g. S/MIME [RFC3851]) help defend against this threat.  However, protocols that route messages based on recipient identifier or pseudonym, such as SMTP [RFC2821] and XMPP [RFC6120], still require intermediate systems to handle these in the clear, and may leak additional metadata as well (e.g., in the S/MIME example, the SMTP headers), making this available to the PPA if it has compromised these intermediate systems.

We can assume that the PPA does not have unlimited resources, i.e., that it will attempt to eavesdrop at the most efficient observation point(s) available to it, and will collect as little raw data as necessary to support its aims.  This allows us to back away from this worst-case scenario.  Storing full packet information for a fully-loaded 10 Gigabit Ethernet link will fill one 4TB hard disk (the largest commodity hard disk available as of this writing) in less than an hour; storing network flow data from the same link, e.g. as IPFIX Files [RFC5655], requires on the order of 1/1000 the storage (i.e., 4GB an hour).
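A back-of-the-envelope sketch of this estimate (the 10 Gb/s link rate, 4 TB disk size, and 1/1000 flow-to-packet ratio are taken from the text above; framing overhead is ignored):

   LINK_RATE_BITS_PER_S = 10e9        # fully loaded 10 Gigabit Ethernet link
   DISK_BYTES = 4e12                  # one 4 TB commodity hard disk
   FLOW_TO_PACKET_RATIO = 1.0 / 1000  # flow records vs. full packet capture

   bytes_per_hour = LINK_RATE_BITS_PER_S / 8 * 3600            # 4.5e12 B, i.e. 4.5 TB/hour
   hours_to_fill_disk = DISK_BYTES / bytes_per_hour             # ~0.9 hours: less than an hour
   flow_bytes_per_hour = bytes_per_hour * FLOW_TO_PACKET_RATIO  # ~4.5 GB/hour

   print(f"full capture: {bytes_per_hour / 1e12:.1f} TB/hour, "
         f"disk full in {hours_to_fill_disk:.2f} hours")
   print(f"flow records: {flow_bytes_per_hour / 1e9:.1f} GB/hour")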
Metadata-based surveillance approaches are therefore more scalable for pervasive surveillance, so it is worthwhile to analyze information which can be inferred from various network traffic capture and analysis techniques other than full packet observation.

In the remainder of this analysis, we categorize the ways that information radiates off of protocols on the Internet.  First, we list kinds of information that can be directly observed; this may seem somewhat obvious, but is included for completeness.  We then explore the types of information which may be useful for drawing inferences about user behavior, then go into practical detail on inference attacks against just the information available in the IP header, to better illustrate the extent of the problem.

4.1.  Information subject to direct observation

Protocols which do not encrypt their payload make the entire content of the communication available to a PPA along their path.  Following the advice in [RFC3365], most such protocols have a secure variant which encrypts payload for confidentiality, and these secure variants are seeing ever-wider deployment.  A noteworthy exception is DNS [RFC1035], as DNSSEC [RFC4033] does not have confidentiality as a requirement.  This implies that all DNS queries and answers generated by the activities of any protocol are available to a PPA.

Protocols which encrypt their payload using an application- or transport-layer encryption scheme (e.g. TLS [RFC5246]) still expose all the information in their network and transport layer headers to a PPA, including source and destination addresses and ports.  IPsec ESP [RFC4303] further encrypts the transport-layer headers, but still leaves IP address information unencrypted; in tunnel mode, these addresses correspond to the tunnel endpoints.  Cryptographic protocols themselves may also leak information that can be used for correlation and inference, e.g. via the TLS session identifier.  While this information is much less semantically rich than the application payload, it can still be useful for inferring an individual's activities.

Protocols which imply the storage of some data at rest in intermediaries leave this data subject to observation by a PPA that has compromised these intermediaries, unless the data is encrypted end-to-end by the application layer protocol, or the implementation uses an encrypted store for this data.

4.2.  Information useful for inference

Inference is information extracted from later analysis of an observed communication, and/or correlation of observed information with information available from other sources.  Indeed, most useful inference performed by a PPA falls under the rubric of correlation.  The simplest example is the observation of DNS queries and answers from and to a source: correlating these with the IP addresses with which that source subsequently communicates can give access to information otherwise not available from encrypted application payloads (e.g., the Host header of an HTTP/1.1 request when HTTP is used with TLS).
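A sketch of this correlation (hypothetical; it assumes observed DNS answers and flow records are already available as simple tuples) labels otherwise opaque encrypted flows with the name the client most recently resolved for the destination address:

   def label_flows(dns_answers, flows):
       """dns_answers: (time, client_ip, name, resolved_ip) tuples, in time order.
       flows: (time, client_ip, server_ip) tuples.  Returns flows labeled with a
       hostname inferred from the client's preceding DNS traffic."""
       last_name = {}  # (client_ip, resolved_ip) -> most recently resolved name
       answers = iter(dns_answers)
       pending = next(answers, None)
       labeled = []
       for flow_time, client, server in sorted(flows):
           # replay DNS answers observed before this flow started
           while pending is not None and pending[0] <= flow_time:
               _, c, name, addr = pending
               last_name[(c, addr)] = name
               pending = next(answers, None)
           labeled.append((client, server, last_name.get((client, server), "unknown")))
       return labeled

Even though the payload of each flow may be encrypted, the name the client looked up immediately beforehand is not, and is usually enough to identify the service being accessed.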
Inference can also leverage information obtained from sources other than direct traffic observation.  Geolocation databases, for example, have been developed to map IP addresses to locations, in order to provide location-aware services such as targeted advertising.  This location information is often of sufficient resolution that it can be used to draw further inferences toward identifying or profiling an individual.

Social media provide another source of more or less publicly accessible information.  This information can be extremely semantically rich, including information about an individual's location, associations with other individuals and groups, and activities.  Further, this information is generally contributed and curated voluntarily by the individuals themselves: it represents information which the individuals are not necessarily interested in protecting for privacy reasons.  However, correlation of this social networking data with information available from direct observation of network traffic allows the creation of a much richer picture of an individual's activities than either alone.  We note with some alarm that there is little that can be done from the protocol design side to limit such correlation by a PPA, and that the existence of such data sources in many cases greatly complicates the problem of protecting privacy by hardening protocols alone.

4.3.  On the Non-Anonymity of IP Addresses

In this section, we explore the non-anonymity of even encrypted IP traffic by examining some inference techniques for associating a set of addresses with an individual, in order to illustrate the difficulty of defending communications against a PPA.  Here, the basic problem is that information radiated even by protocols which have no obvious connection with personal data can be correlated with other information to paint a very rich behavioral picture; it takes only one unprotected link in the chain to associate that picture with an identity.

4.3.1.  Analysis of IP headers

Internet traffic can be monitored by tapping Internet links, or by installing monitoring tools in Internet routers.  Of course, a single link or a single router only provides access to a fraction of the global Internet traffic.  However, monitoring a number of high-capacity links or a set of routers placed at strategic locations provides access to a good sampling of Internet traffic.

Tools like IPFIX [RFC7011] allow administrators to acquire statistics about sequences of packets with some common properties that pass through a network device.  The most common set of properties used in flow measurement is the "five-tuple" of source and destination addresses, protocol type, and source and destination ports.  These statistics are commonly used for network engineering, but could certainly be used for other purposes.

Let's assume for a moment that IP addresses can be correlated to specific services or specific users.  Analysis of the sequences of packets will quickly reveal which users use what services, and also which users engage in peer-to-peer connections with other users.  Analysis of traffic variations over time can be used to detect increased activity by particular users, or, in the case of peer-to-peer connections, increased activity within groups of users.
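A sketch of this kind of flow analysis (hypothetical; it assumes five-tuple flow records, e.g. exported via IPFIX, have already been decoded into plain tuples, and that addresses in a given prefix are known to belong to clients):

   from collections import defaultdict

   def profile_clients(five_tuples, client_prefix="192.0.2."):
       """five_tuples: (src_ip, dst_ip, proto, src_port, dst_port) records.
       Returns, per client address, the set of (server, port) services
       contacted and the set of peer addresses contacted directly."""
       services = defaultdict(set)
       peers = defaultdict(set)
       for src, dst, proto, sport, dport in five_tuples:
           if src.startswith(client_prefix) and dst.startswith(client_prefix):
               peers[src].add(dst)              # client-to-client: peer-to-peer traffic
           elif src.startswith(client_prefix):
               services[src].add((dst, dport))  # client talking to a service
       return services, peers

Repeating this analysis over successive time windows yields the per-user and per-group activity variations described above.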
4.3.2.  Correlation of IP addresses to user identities

In Section 4.3.1, we have assumed that IP addresses can be correlated with specific user identities.  This can be done in various ways.

Tools like reverse DNS lookup can be used to retrieve the DNS names of servers.  Since the addresses of servers tend to be quite stable, and since servers are relatively less numerous than users, a PPA could easily maintain its own copy of the DNS for well-known or popular servers, to accelerate such lookups.

On the other hand, the reverse lookup of IP addresses of users is generally less informative.  For example, a lookup of the address currently used by one author's home network returns a name of the form "c-192-000-002-033.hsd1.wa.comcast.net".  This particular type of reverse DNS lookup generally reveals only coarse-grained location or provider information.

In many jurisdictions, Internet Service Providers (ISPs) are required to provide identification on a case-by-case basis of the "owner" of a specific IP address for law enforcement purposes.  This is a reasonably expedient process for targeted investigations, but pervasive surveillance requires something more efficient.  A PPA that could secure the cooperation of the ISP could correlate IP addresses and user identities automatically.

Even if the ISP does not cooperate, identity can often be obtained via inference.  We will discuss in the following sections how SMTP and HTTP can leak information that links the IP address to the identity of the user.

4.3.3.  Monitoring messaging clients for IP address correlation

POP3 [RFC1939] and IMAP [RFC3501] are used to retrieve mail from mail servers, while a variant of SMTP [RFC5321] is used to submit messages through mail servers.  IMAP connections originate from the client, and typically start with an authentication exchange in which the client proves its identity by answering a password challenge.

If the protocol is executed in clear text, monitoring services can "tap" the links to the mail server, retrieve the user name provided by the client, and associate it with the IP address used to establish the connection.

The same attack can be executed against the SIP [RFC3261] protocol, if the connection between the SIP UA and the SIP server operates in clear text.

In addition, there are many instant messaging services operating over the Internet using proprietary protocols.  If any of these proprietary protocols includes clear-text transmission of the user identity, these can be observed to provide an association between the user identity and the IP address.

4.3.4.  Retrieving IP addresses from mail headers

SMTP [RFC5321] requires that each successive SMTP relay add a "Received" header to the mail headers.  The purpose of these headers is to enable audit of mail transmission, and perhaps to distinguish between regular mail and spam.  Here is an extract from the headers of a message recently received from the "perpass" mailing list:

   Received: from 192-000-002-044.zone13.example.org (HELO ?192.168.1.100?)
     (xxx.xxx.xxx.xxx)
     by lvps192-000-002-219.example.net with ESMTPSA
     (DHE-RSA-AES256-SHA encrypted, authenticated);
     27 Oct 2013 21:47:14 +0100
   Message-ID: <526D7BD2.7070908@example.org>
   Date: Sun, 27 Oct 2013 20:47:14 +0000
   From: Some One

This is the first "Received" header attached to the message by the first SMTP relay.  For privacy reasons, the field values have been anonymized.  We learn here that the message was submitted by "Some One" on October 27, from a host behind a NAT (192.168.1.100) [RFC1918] that used the IP address 192.0.2.44.  The information remained in the message, and is accessible by all recipients of the "perpass" mailing list, or indeed by any PPA that sees at least one copy of the message.
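A sketch of how a PPA might mine such headers (hypothetical; it uses only Python's standard-library email parser and a deliberately rough pattern for IPv4 addresses and private prefixes) to associate a sender with the public IP address from which a message was submitted:

   import re
   import email
   from email.policy import default

   def sender_and_submitting_ip(raw_message: bytes):
       """Return (From header, first public IPv4 address found in the
       earliest Received header), a rough sender-to-address mapping."""
       msg = email.message_from_bytes(raw_message, policy=default)
       received = msg.get_all("Received") or []
       if not received:
           return msg.get("From"), None
       earliest = received[-1]  # headers are prepended, so the first relay's is last
       addrs = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", earliest)
       public = [a for a in addrs if not a.startswith(("10.", "192.168.", "127."))]
       return msg.get("From"), public[0] if public else None

Applied to every message a PPA can collect, for example from public mailing list archives, this yields a continuously refreshed mapping from email identities to the addresses their owners use.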
A PPA that can observe sufficient email traffic can regularly update the mapping between public IP addresses and individual email identities.  Even if the SMTP traffic was encrypted on submission and relaying, the PPA can still receive a copy of public mailing lists like "perpass".

Similar information is available in the SIP headers [RFC3261].

4.3.5.  Tracking address use with web cookies

Many web sites only encrypt a small fraction of their transactions.  A popular pattern was to use HTTPS for the login information, and then use a "cookie" to associate subsequent clear-text transactions with the user's identity.  Cookies are also used by various advertisement services to quickly identify users and serve them with "personalized" advertisements.  Such cookies are particularly useful if the advertisement services want to keep tracking the user across multiple sessions that may use different IP addresses.

As cookies are sent in clear text, a PPA can build a database that associates cookies to IP addresses for non-HTTPS traffic.  If the IP address is already identified, the cookie can be linked to the user identity.  After that, if the same cookie appears on a new IP address, the new IP address can be immediately associated with the pre-determined identity.

4.3.6.  Tracking address use with network graphs

A PPA can track traffic from an IP address not yet associated with an individual to various public services (e.g. websites, mail servers, game servers), and exploit patterns in the observed traffic to correlate this address with other addresses that show similar patterns.  For example, any two addresses that show connections to the same IMAP or webmail services, the same set of favorite websites, and the same game servers at similar times of day may be associated with the same individual.  Correlated addresses can then be tied to an individual through one of the techniques above, walking the "network graph" to expand the set of attributable traffic.
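A sketch of this correlation step (hypothetical; it assumes per-address sets of contacted services have already been extracted from flow data, for example with the earlier profiling sketch) scores pairs of addresses by the overlap of the services they contact:

   from itertools import combinations

   def correlated_addresses(services_by_addr, threshold=0.8):
       """services_by_addr: {address: set of (server, port)}.  Returns pairs
       of addresses whose contacted-service sets have Jaccard similarity at
       or above the threshold, and which may therefore belong to the same
       individual."""
       pairs = []
       for a, b in combinations(services_by_addr, 2):
           sa, sb = services_by_addr[a], services_by_addr[b]
           if not sa or not sb:
               continue
           similarity = len(sa & sb) / len(sa | sb)
           if similarity >= threshold:
               pairs.append((a, b, round(similarity, 2)))
       return pairs

Each correlated pair extends the network graph: once any address in a connected group is tied to an identity, the whole group is.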
5.  Evaluating protocols for PPA resistance

Though inference by a PPA makes the problem of guaranteeing privacy in the face of passive surveillance difficult, it is possible to strengthen each link in the chain in order to increase its resistance.  PPA-resistant protocols have the following properties:

o  The confidentiality of all information not absolutely required for the operation of the protocol at intermediate systems is cryptographically protected.

o  The confidentiality of all identifiers which can be associated with specific individuals through observation or inference is cryptographically protected on a hop-by-hop basis, even if they are required for the operation of the protocol at intermediate systems.

o  Identifiers required for the operation of the protocol are non-persistent and non-specific to individuals to the extent possible.

o  The protocol radiates as little information as possible which can be used to fingerprint specific instances of the protocol.

Clearly, the messaging protocols examined in Section 4.3 are, by these criteria, not particularly resistant to a PPA.  In evaluating a protocol for PPA resistance, tradeoffs in efficiency, latency, manageability, and other application requirements will need to be evaluated as well.  More detailed information on privacy considerations for protocol design is given in [I-D.cooper-ietf-privacy-requirements].

6.  General protocol design recommendations for PPA resistance

The following general recommendations are intended to guide discussions about improving the resistance of IETF protocols to a PPA; specific recommendations are the subject of a separate specification.

6.1.  Encrypt everything you can

Though IETF protocols have long been moving in the direction of more and better cryptographic protection [RFC3365], there is continued room for improvement.  Approaches such as opportunistic encryption, while not providing identity guarantees, may have benefits in confidentiality that reduce the information radiated from protocols, increasing the costs for pervasive surveillance.  To some extent encryption is a deployment problem rather than a protocol design and implementation problem; improvements in usability may be useful here.

The design and deployment of end-to-end encryption for a protocol, especially for messaging applications, can reduce the ability of a PPA to observe application-layer information and identifiers at a compromised intermediate system.

6.2.  Design and implement for simplicity and auditability

This would seem to be common sense, but in practice it is not really the case that protocol design processes naturally have simplicity as a goal.  Simplicity of a design is directly related to the auditability of the design and implementations thereof.  Privacy and security features designed into a protocol which are too complex to understand will suffer from limited implementation and deployment.  A good example of such a case is IPsec, where the primary complaints are related to its complexity [Ferguson03].

The auditability of a protocol is directly related to the ability to measure and reason about the information it radiates that could be used for inference by a PPA.  Audits of designs and implementations can also reduce the risk of hidden side channels which could carry additional information useful to a PPA.  One approach for improving auditability is the release of implementations as open source.

6.3.  Allow for fingerprinting resistance in protocol designs

Fingerprinting provides a source of information for inference, and can rely on packet and flow size and timing information.  The inclusion of null information in packets, or the grouping of information into more or fewer packets, can reduce this risk.  Since protocols tend to be optimized for minimum bandwidth usage and minimum latency, the only way to go is up, so this resistance comes at the expense of usable bandwidth and increased latency.  While not necessarily applicable in the general case, protocol designs can make it possible to do this.
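A simple illustration of this bandwidth-for-resistance tradeoff (a sketch only, not tied to any particular protocol; the bucket sizes are arbitrary examples) pads each message up to the next of a small set of fixed sizes, so that an observer sees only the bucket, not the exact length:

   BUCKETS = (256, 1024, 4096, 16384)  # example sizes, chosen arbitrarily

   def pad_to_bucket(message: bytes) -> bytes:
       """Pad a message to the next bucket size.  A real protocol would
       also need to encode the original length so the receiver can strip
       the padding, and would pad within the encrypted payload."""
       for size in BUCKETS:
           if len(message) <= size:
               return message + b"\x00" * (size - len(message))
       return message  # larger than the biggest bucket: sent as-is

The padding overhead is exactly the cost in usable bandwidth that the preceding paragraph describes.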
6.4.  Do not rely on static IP addresses

Always-on broadband connections may or may not provide subscribers with static IP addresses.  Some users pay extra for the convenience of a stable address.  Of course, stable addresses greatly facilitate IP header monitoring.

In contrast, we could imagine that the broadband modem is re-provisioned at regular intervals with a new IPv4 address, or with a new IPv6 address prefix.  Some convenience will be lost, and TCP connections active before the renumbering will have to be reestablished.  However, the renumbering will significantly complicate the task of IP header monitoring.

Similarly, the Privacy Extensions for Stateless Address Autoconfiguration in IPv6 [RFC4941] allow users to configure temporary IPv6 addresses out of a global prefix.  Privacy addresses are meant to be used for a short time, typically no more than a day, and are specifically designed to render monitoring based on IPv6 addresses harder.
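A sketch of temporary address generation in the spirit of [RFC4941] (simplified and hypothetical: it draws a random 64-bit interface identifier with no duplicate address detection and without adjusting the universal/local bit, which a real implementation would handle):

   import secrets
   import ipaddress

   def temporary_address(prefix: str) -> ipaddress.IPv6Address:
       """Pick a random interface identifier within a /64 prefix,
       e.g. temporary_address("2001:db8:1:2::/64")."""
       net = ipaddress.IPv6Network(prefix)
       iid = secrets.randbits(64)  # random 64-bit interface identifier
       return net[iid]

   # A new address per day makes address-based tracking across days harder.
   print(temporary_address("2001:db8:1:2::/64"))

Because the identifier changes regularly, long-term monitoring keyed on a stable IPv6 address no longer links a host's traffic across address lifetimes.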
Rescorla, "The Transport Layer Security 653 (TLS) Protocol Version 1.2", RFC 5246, August 2008. 655 [RFC5321] Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, 656 October 2008. 658 [RFC5655] Trammell, B., Boschi, E., Mark, L., Zseby, T., and A. 659 Wagner, "Specification of the IP Flow Information Export 660 (IPFIX) File Format", RFC 5655, October 2009. 662 [RFC6120] Saint-Andre, P., "Extensible Messaging and Presence 663 Protocol (XMPP): Core", RFC 6120, March 2011. 665 [RFC7011] Claise, B., Trammell, B., and P. Aitken, "Specification of 666 the IP Flow Information Export (IPFIX) Protocol for the 667 Exchange of Flow Information", STD 77, RFC 7011, September 668 2013. 670 [Ferguson03] 671 Ferguson, D. and B. Schneier, "A Cryptographic Evaluation 672 of IPsec (https://www.schneier.com/paper-ipsec.pdf)", 673 December 2003. 675 Authors' Addresses 677 Brian Trammell 678 Swiss Federal Institute of Technology Zurich 679 Gloriastrasse 35 680 8092 Zurich 681 Switzerland 683 Phone: +41 44 632 70 13 684 Email: trammell@tik.ee.ethz.ch 686 Daniel Borkmann 687 Red Hat 688 Seefeldstrasse 69 689 8008 Zurich 690 Switzerland 692 Email: dborkman@redhat.com 694 Christian Huitema 695 Microsoft Corporation 696 One Microsoft Way 697 Redmond, WA 98052-6399 698 U.S.A. 700 Email: huitema@huitema.net