idnits 2.17.1 

draft-irtf-pearg-website-fingerprinting-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (September 8, 2020) is 1320 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-19) exists of
     draft-ietf-ipsecme-iptfs-01

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-transport-29

  == Outdated reference: A later version (-18) exists of
     draft-ietf-tls-esni-07

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)


     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	pearg                                                        I. Goldberg
3	Internet-Draft                                    University of Waterloo
4	Intended status: Informational                                   T. Wang
5	Expires: March 12, 2021          HK University of Science and Technology
6	                                                               C.A. Wood
7	                                                             Apple, Inc.
8	                                                       September 8, 2020

10	                  Network-Based Website Fingerprinting
11	               draft-irtf-pearg-website-fingerprinting-01

13	Abstract

15	   The IETF is well on its way to protecting connection metadata with
16	   protocols such as DNS-over-TLS and DNS-over-HTTPS, and work-in-
17	   progress towards encrypting the TLS SNI.  However, more work is
18	   needed to protect traffic metadata, especially in the context of web
19	   traffic.  In this document, we survey Website Fingerprinting attacks,
20	   which are a class of attacks that use machine learning techniques to
21	   attack web privacy, and highlight metadata leaks used by said
22	   attacks.  We also survey proposed mitigations for such leakage and
23	   discuss their applicability to IETF protocols such as TLS, QUIC, and
24	   HTTP.  We endeavor to show that Website Fingerprinting attacks are a
25	   serious problem that affects all Internet users, and we pose open
26	   problems and directions for future research in this area.

28	Note to Readers

30	   Source for this draft and an issue tracker can be found at
31	   https://github.com/chris-wood/ietf-fingerprinting.

34	Status of This Memo

36	   This Internet-Draft is submitted in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at https://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on March 12, 2021.

50	Copyright Notice

52	   Copyright (c) 2020 IETF Trust and the persons identified as the
53	   document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
57	   license-info) in effect on the date of publication of this document.
58	   Please review these documents carefully, as they describe your rights
59	   and restrictions with respect to this document.  Code Components
60	   extracted from this document must include Simplified BSD License text
61	   as described in Section 4.e of the Trust Legal Provisions and are
62	   provided without warranty as described in the Simplified BSD License.

64	Table of Contents

66	   1.  Introduction
67	   2.  Background
68	   3.  Website Fingerprinting
69	   4.  Attacks
70	   5.  Base Rate Fallacy
71	   6.  Defenses
72	     6.1.  Traffic Morphing
73	     6.2.  Traffic Splitting
74	   7.  Open Problems and Directions
75	   8.  Protocol Design Considerations
76	   9.  Security Considerations
77	   10. IANA Considerations
78	   11. Informative References
79	   Appendix A.  Acknowledgements
80	   Authors' Addresses

82	1.  Introduction

84	   Internet protocols such as TLS 1.3 [RFC8446] and QUIC
85	   [I-D.ietf-quic-transport] bring substantial improvements to end-
86	   users.  The IETF engineered these with security and privacy in mind
87	   by encrypting more protocol messages using modern cryptographic
88	   primitives and algorithms, and engineering against flaws found in
89	   previous protocols, yielding several desirable security properties,
90	   including: forward-secure session key secrecy, downgrade protection,
91	   key compromise impersonation resistance, and protection of endpoint
92	   identities.  Combined, these two protocols are set to protect a
93	   significant amount of Internet data.  However, significant metadata
94	   leaks still exist for users of these protocols.  Examples include
95	   plaintext TLS SNI and application-specific extensions (ALPN), as well
96	   as DNS queries.  This information can be used by a passive attacker
97	   to learn information about the contents of an otherwise encrypted
98	   network connection.  Recently, such information has also been studied
99	   as a means of building unique user profiles [li2018can].  It has also
100	   been used to build flow classifiers that aid network management
101	   [foremski2014dns].

103	   In the context of Tor, a popular low-latency anonymity network, a
104	   common class of attacks that use metadata for such inference is
105	   called Website Fingerprinting (WF).  These attacks use machine
106	   learning techniques built with features extracted from metadata such
107	   as traffic patterns to attack web (browsing) privacy.  Miller et al.
108	   [miller2014know] show how these attacks can be applied to web
109	   browsing traffic protected with HTTPS to reveal private information
110	   about users.  Pironti et al. [pironti2012identifying] use similar
111	   attacks based on data sizes to identify individual social media
112	   clients using encrypted connections.  Fingerprinting attacks using
113	   encrypted traffic analysis are also applicable to encrypted media
114	   streams, such as Netflix videos.  (See work from Reed et al.
115	   [reed2017identifying] and Schuster et al. [schuster2017beauty] for
116	   examples of these attacks.)  WF attacks have also been applied to
117	   other IETF protocols such as encrypted DNS, including dnscrypt, DNS-
118	   over-TLS, and DNS-over-HTTPS [siby2018dns][shulman2014pretty].  In
119	   the past, they have also been conducted remotely
120	   [gong2010fingerprinting], using buffer-based side channels in a
121	   victim's home router.

123	   Protocols such as DNS-over-TLS and DNS-over-HTTPS [RFC8484], and
124	   work-in-progress towards encrypting the TLS SNI extension
125	   [I-D.ietf-tls-esni], help minimize metadata sent in cleartext on the
126	   wire.  However, regardless of protocol and even network-layer
127	   fingerprinting mitigations, application layer specifics, e.g., web
128	   page sizes and client request patterns, reveal a noticeable amount of
129	   information to attackers.  We argue that much more work is needed to
130	   protect encrypted connection metadata, especially in the context of
131	   web traffic.

133	   In this document, we describe WF attacks in the context of IETF
134	   protocols such as TLS and QUIC.  We survey WF attacks and highlight
135	   metadata features and classification techniques used to conduct said
136	   attacks.  We also describe proposed mitigations for these attacks and
137	   discuss their applicability to IETF protocols.  We conclude with a
138	   discussion of open problems and directions for future research and
139	   advocate for more work in this area.

141	2.  Background

143	   In this section we review how most secure Internet connections are
144	   made today.  We omit custom configurations such as those using VPNs
145	   and proxies since they do not represent the common case for most
146	   Internet users.  The following steps briefly describe the sequence of
147	   events that normally occur when a web client, e.g., browser, curl,
148	   etc., connects to a website and obtains some resource.  First an
149	   unencrypted DNS query is sent to an untrusted DNS recursive resolver
150	   to resolve a name to an IP address.  Upon receipt, clients then open
151	   a TCP and TLS connection to the destination address.  During this
152	   stage, metadata such as the TLS SNI and ALPN values are sent in
153	   cleartext.  The SNI is used to denote the destination application or
154	   endpoint to which clients want to connect.  Servers use this for
155	   several purposes, including selecting an appropriate certificate (one
156	   with the SNI name in the SubjectAlternativeName list) or routing to a
157	   different backend terminator.  ALPN values are used to negotiate
158	   which application-layer protocol will be used on top of the TLS
159	   connection.  Common values include "http/1.1", "h2", and (soon) "h3".
160	   Upon connection, clients then send HTTP messages to obtain the
161	   desired resource.

163	   Connections look different (on the wire) with TLS 1.3, encrypted DNS
164	   via DNS-over-TLS or DNS-over-HTTPS, and encrypted SNI.  DNS queries
165	   are encrypted to a (trusted) recursive resolver and TLS metadata such
166	   as SNI are encrypted in transit to the terminator.  Despite the
167	   reduction in cleartext metadata sent over the wire, there still
168	   remain several sources of information that an adversary may use for
169	   malicious purposes, including: size and timing of DNS queries and
170	   responses, size and timing of application traffic, and connection
171	   attempts induced while loading a web resource, e.g., JavaScript
172	   files.  So while technologies such as Encrypted SNI, DoT, and DoH
173	   help protect some metadata, they are not complete solutions to the
174	   larger problem.  In the following section, we discuss this
175	   overarching problem in detail.

177	3.  Website Fingerprinting

179	   Website Fingerprinting (WF) is a class of attacks that exploit
180	   metadata leakage to attack end-user privacy on the Internet.  In the
181	   WF threat model, Adv is assumed to be a passive and local attacker.
182	   Local means that Adv can associate traffic with a given client.
183	   Examples include proxies to which clients directly connect.  Passive
184	   means that Adv can only view traffic in transit.  It cannot add,
185	   drop, or otherwise modify packets between the victim client and
186	   server(s).  Use of reliable and encrypted transport protocols such as
187	   TLS limits on-path attackers to eavesdropping on encrypted packets.
188	   (In QUIC, however, reordering packets is possible.)
189	   Traffic features used for classification include properties such as
190	   packet size, timing, direction, interarrival times, and burstiness,
191	   among many others [wang2016website].  Normally, features are
192	   restricted to those which are extractable as a passive eavesdropper,
193	   and not those which are viewable by modifying client or server
194	   behavior.  Specifically, this means that attacks such as CRIME
195	   [CRIME] and TIME [TIME], which rely on an attacker abusing TLS-layer
196	   compression to leak contents of an encrypted connection, are out of
197	   scope.
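
   As a rough illustration of the feature classes listed above, the
   following Python sketch derives a few of them from a packet trace.
   The trace representation (a list of (timestamp, signed size) pairs,
   with positive sizes for client-to-server packets) and the function
   name extract_features are assumptions made for this example; they
   are not taken from any of the cited attacks.

```python
from statistics import mean


def extract_features(trace):
    """Derive simple passive features from a packet trace.

    trace: list of (timestamp_seconds, signed_size) tuples.
    Assumption for this sketch: positive size = egress (client to
    server), negative size = ingress (server to client).
    """
    sizes = [abs(size) for _, size in trace]
    directions = [1 if size > 0 else -1 for _, size in trace]
    times = [t for t, _ in trace]
    # Interarrival times between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    # A burst is a maximal run of packets in the same direction.
    bursts = []
    for d in directions:
        if bursts and bursts[-1][0] == d:
            bursts[-1][1] += 1
        else:
            bursts.append([d, 1])

    return {
        "packet_count": len(trace),
        "egress_count": directions.count(1),
        "ingress_count": directions.count(-1),
        "total_bytes": sum(sizes),
        "unique_sizes": len(set(sizes)),
        "duration": times[-1] - times[0] if times else 0.0,
        "mean_interarrival": mean(gaps) if gaps else 0.0,
        "burst_lengths": [count for _, count in bursts],
    }
```

   All of these values can be computed by an observer who sees only
   encrypted packets, which is why they fit the passive threat model
   described in this section.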

199	   Website Fingerprinting attacks have evolved over the years through
200	   three phases: (1) Closed-world WF on SSL/TLS, (2) Closed-world WF on
201	   Tor, and (3) Open-world WF on Tor.

203	   1.  In the closed-world model, clients are assumed to only visit a
204	       small set of pages monitored by Adv. This is less realistic but
205	       easier to analyze than the open-world model discussed below, and
206	       so the earliest results achieved success on SSL/TLS in this
207	       model.  (For a realistic attack, Adv would need to monitor every
208	       possible page of interest to each client, which is impractical.)
209	       Attacks against proxy-based privacy technologies such as VPNs and
210	       SSH tunneling, which have almost no effect on the network, fall
211	       under this category as well.

213	   2.  Tor, an anonymity network built on onion routing, is harder to
214	       attack than SSL for several reasons; successful results on Tor
215	       thus came later.  First, Tor pads all cells (Tor's application-
216	       layer datagrams) to the same constant size, removing unique
217	       packet lengths as a powerful feature for the attacker.  Second,
218	       Tor imposes random network conditions upon the client due to
219	       random selection of proxies, so packet sequences are less likely
220	       to be consistent.

222	   3.  In the open-world model, Adv wishes to learn whenever a victim
223	       client visits one of a select number of monitored pages
224	       [wang2016website].  Adversaries train classifiers in this model
225	       using monitored and non-monitored websites of their choosing.  By
226	       definition, Adv cannot train using client-chosen pages.  Clients
227	       then visit pages at will and Adv attempts to learn whenever a
228	       monitored page is visited, if any are at all.  This is a
229	       realistic model capturing the fact that the set of pages any
230	       attacker would be interested in must necessarily be a small
231	       subset of the set of all pages.  As this is a harder model to
232	       attack, successful results on this model came later.

234	4.  Attacks

235	   1.  Closed-world WF on TLS: WF attacks date back to applications on
236	       SSL first inspired by Wagner and Schneier [wagner1996analysis],
237	       in which the authors observed that packet lengths reveal
238	       information about the underlying data.  Subsequent attacks
239	       carried out by Cheng et al. [cheng1998traffic], Sun et al.
240	       [sun2002statistical], and Hintz [hintz2002fingerprinting]
241	       continued to show success.  These attacks assume Adv has
242	       knowledge of the target resource length(s), which is not always
243	       possible when techniques such as padding are used.

245	   Bissias et al. [bissias2005privacy] use cross correlation of inter-
246	   packet times in one second time windows as a WF attack.  Danezis
247	   [danezis2009traffic] models websites using a Hidden Markov Model (HMM)
248	   and uses it, along with TLS traffic traces revealing only approximate
249	   lengths, to identify requested resources on a page.  The evaluation
250	   varies the amount of information available to an adversary when
251	   building the HMM.  Even in cases where resource popularity is
252	   omitted, which reflects the case where an adversary scrapes static
253	   websites, resource recall was high (86%).  Liberatore and Levine
254	   [liberatore2006inferring] proposed two WF attacks using the Jaccard
255	   coefficient and the Naive Bayes classifier.  Herrmann et al.
256	   [herrmann2009website] extended the work of Liberatore and Levine with
257	   a multinomial Naive Bayes classifier computed using three input
258	   frequency transformations.  Results yielded higher accuracy than that
259	   of Liberatore and Levine.  Herrmann's attack is the best in this
260	   category, but the authors assume packets which do not fill an MTU
261	   represent packet trailers.  Therefore, uniqueness is only accurate
262	   modulo the MTU.  Efficacy is limited if endpoints pad packets to the
263	   MTU or another fixed length.  Modern protocols such as HTTP/2, QUIC,
264	   and TLS 1.3 all provide some form of application-controlled padding.
265	   (Note: These attacks are not successful on Tor.)

267	   2.  Closed-world WF on Tor: Shmatikov and Wang [shmatikov2006timing]
268	       presented a WF attack that exploits cross correlation of arrival
269	       packet counts in one second time windows.  Lu et al.
270	       [lu2010website] developed a classifier based on the Levenshtein
271	       distance between ingress and egress packet lengths extracted from
272	       packet sequences.  Distance is computed between strings of
273	       ingress and egress packet lengths.  The training packet sequence
274	       with the closest distance to the testing packet sequence is
275	       deemed the match.  Dyer et al. [dyer2012peek] used a Naive Bayes
276	       classifier trained with a reduced set of features, including
277	       total response transmission time, length of packets (in each
278	       direction), and burst lengths.  (Wang [wang2016website] notes
279	       that measuring burst lengths in Tor is difficult given the
280	       presence of SENDME cells for flow control.)  This approach did
281	       not yield any measurable improvements over the SVM classifier
282	       from Panchenko et al. [panchenko2011website].  Cai et al.
283	       [cai2012touching] extend the work of Lu et al. by adding
284	       transpositions to the Levenshtein distance computation and
285	       normalizing the result, yielding what the authors refer to as the
286	       Optimal String Alignment Distance (OSAD).  Before feature
287	       extraction, the authors round TCP packet lengths to the nearest
288	       multiple of 600 bytes as an estimate of the number of Tor cells.

290	   Wang et al. [wang2013improved] tuned the OSAD-based attack to improve
291	   its accuracy.  Specific changes include use of Tor cells instead of
292	   TCP packets for packet and burst lengths, as well as heuristics to
293	   remove SENDME cells (those not carrying application data) from flows
294	   to recover true burst lengths.  The authors also modified the
295	   distance computation by removing substitutions, increasing the weight
296	   for egress packets, and varying the transposition cost across the
297	   packet sequence (large weights at the beginning of a trace, and
298	   smaller weights near the end, where variations are expected across
299	   repeated page loads).  Wang et al. also developed an alternate
300	   classifier with lower accuracy but better running time (linear
301	   rather than quadratic).  It works by minimizing the sum of two
302	   costs: sequence transpositions and sequence deletions or insertions.
303	   These two costs are computed separately, in contrast to the first
304	   approach which computes them simultaneously.
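
   The basic distance that these attacks build on can be sketched as
   follows.  This is the textbook optimal string alignment distance
   with unit costs, applied to sequences of signed packet lengths; it
   does not include the tuned weights described above.  Normalizing by
   the longer sequence length and the nearest-match helper are
   assumptions made for illustration, as are all function names.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two sequences.

    Counts insertions, deletions, substitutions, and transpositions
    of adjacent elements, each with unit cost.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # transposition of two adjacent elements
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]


def normalized_osad(a, b):
    """Distance scaled to [0, 1]; dividing by the longer length is an
    assumption of this sketch, not necessarily the published rule."""
    longest = max(len(a), len(b))
    return osa_distance(a, b) / longest if longest else 0.0


def closest_site(test_trace, training):
    """Return the label whose training trace is nearest to test_trace.

    training: dict mapping a site label to one packet sequence.
    """
    return min(training,
               key=lambda s: normalized_osad(test_trace, training[s]))
```

   The table-filling loop takes time proportional to the product of
   the two sequence lengths, which is the quadratic cost that the
   alternate classifier avoids.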

306	   Hayes et al. [hayes2016k] developed an attack called
307	   k-fingerprinting, which uses a k-NN classifier with features ranked
308	   by random decision forests.  Their feature set includes timing
309	   information, e.g., statistics on packets per second, among the higher
310	   ranked features.  (Higher ranked features have more weight in the
311	   classification phase.)  Yan et al. [yan2018feature] used similar
312	   (manually curated) features with a CNN-based classifier.  Time-based
313	   features were among the more effective features identified.  Rahman
314	   et al. [rahman2019tik] improved time-based features by focusing on
315	   bursts, e.g., burst length, variance, inter-burst delay, etc., rather
316	   than more granular per-packet statistics.  (The latter tend to be
317	   inconsistent across packet traces for the same website.)  This
318	   improved accuracy of existing Deep Learning attacks from Sirinam et
319	   al. [sirinam2018deep], especially when coupled with packet direction
320	   information.

322	   3.  Open-world WF on Tor and TLS: Panchenko et al.
323	       [panchenko2011website] were the first to use a support vector
324	       machine (SVM) classifier trained with web domain-specific
325	       features, such as HTML document sizes, as well as packet lengths.
326	       Wang et al. [wang2014effective] also developed an attack using a
327	       k-Nearest Neighbors (k-NN) classifier, which is a supervised
328	       machine learning algorithm, targeting the open world setting.
329	       The classifier extracts a large number of features from packet
330	       sequences, including raw (ingress and egress) packet counts,
331	       unique packet lengths, direction, burst lengths, and inter-packet
332	       times, among others.  (There are 4226 features in total.)  The
333	       k-NN distance metric is computed as the sum of weighted feature
334	       differences.
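
   A minimal sketch of a k-NN classifier over such a weighted distance
   is shown below.  The feature vectors and weights are assumed to be
   given; how the weights are learned is not shown.  The majority-vote
   rule and the function names are assumptions made for illustration
   and may differ from the decision rule used in the cited attack.

```python
from collections import Counter


def weighted_distance(x, y, weights):
    """Sum of weighted absolute differences between feature vectors."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


def knn_classify(sample, training, weights, k=3):
    """Label a sample by majority vote among its k nearest neighbors.

    training: list of (feature_vector, label) pairs.
    """
    ranked = sorted(
        training,
        key=lambda item: weighted_distance(sample, item[0], weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

   A feature with weight zero has no influence on the distance, so
   weighting lets the classifier ignore features that do not help
   separate pages.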

336	   Abe and Goto [abe2016fingerprinting] were the first to use Deep
337	   Learning (DL) methods based on Stacked Denoising Autoencoders for WF
338	   attacks.  (Autoencoders reduce feature input dimensions when
339	   stacked.)  Abe and Goto form input vectors from Tor cell directions
340	   (+1 or -1).  They use no other features.  Using a (small) data set
341	   from Wang [wang2016website], the classifier achieves an 86% true
342	   positive rate and 2% false positive rate in the open world model.
343	   Rimmer et al. [rimmer2018automated] applied DL for automated feature
344	   generation and classifier construction.  Trained with 2,500 traces
345	   per website, their system achieves 96.3% accuracy in the open world
346	   model.  Recently, Bhat et al. [bhat2018var], Oh et al. [oh2017pfp],
347	   and Sirinam et al. [sirinam2018deep] used Convolutional Neural
348	   Networks (CNNs) and Deep Neural Networks (DNNs) for WF attacks.
349	   Sirinam et al. report the best results: 98% on Tor without recent
350	   defenses (see Section 6), while performing favorably when select
351	   defenses are used for both open and closed world models.

354	   Yan et al. [yan2018feature] studied manual high-information feature
355	   extraction from packet traces.  They "exhaustively" examined
356	   different levels of features, including packet, burst, TCP, port, and
357	   IP address, summing to 35,683 in total, and distilled them into a
358	   diverse set of uncorrelated features for eight different
359	   communication scenarios.  Rahman [rahman2018using] studied the
360	   utility of features derived from packet interarrival times,
361	   including: median interarrival time (per burst), burst packet arrival
362	   time variance, cross-burst interarrival median differences, and
363	   others.  Using a CNN, results show that these features yield a non-
364	   negligible increase in WF attack accuracy.

366	5.  Base Rate Fallacy

368	   For all WF attacks, one limitation worth highlighting is the base
369	   rate fallacy.  This can be summarized as follows: highly accurate
370	   classifiers with a reliable false positive rate (FPR) decrease in
371	   efficacy as the world size increases.  Juarez et al.
372	   [juarez2014critical] studied its impact by measuring the Bayesian
373	   detection rate (BDR) in comparison to the FPR as a function of world
374	   size.  As the world size increases, the BDR approaches 0 while the
375	   FPR remains stable, meaning that the probability of incorrect
376	   classifier results increases as well.  Juarez et al. partially address
377	   the base rate fallacy problem by adding a confirmation step to their
378	   classifier.  Another problem is that web content is (increasingly)
379	   dynamic.  Most WF attacks, especially those in closed world models,
380	   assume that traces are static.  However, Juarez et al.
381	   [juarez2014critical] show this is not the case even for "simple"
382	   pages such as google.com.  Thus, due to the base rate fallacy and
383	   dynamic nature of content, classifiers require continual retraining
384	   in order to ensure accuracy.
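
   The effect can be made concrete with Bayes' rule.  In the sketch
   below, the BDR is the probability that a page really is monitored
   given that the classifier flagged it.  The true positive rate
   (tpr) and the prior probability that a visit is to a monitored
   page are symbols introduced for this example, and the numbers in
   the usage note are illustrative, not measured results.

```python
def bayesian_detection_rate(tpr, fpr, prior):
    """P(monitored | classifier says monitored) via Bayes' rule.

    tpr:   probability a monitored page is flagged
    fpr:   probability a non-monitored page is flagged
    prior: probability that a visited page is monitored
    """
    true_alarms = tpr * prior
    false_alarms = fpr * (1.0 - prior)
    total = true_alarms + false_alarms
    # If nothing is ever flagged, define the rate as 0.
    return true_alarms / total if total else 0.0
```

   With tpr = 0.9 and fpr = 0.01, a prior of 0.5 gives a BDR above
   0.98, while a prior of 0.001 (a much larger world for the same
   monitored set) gives a BDR below 0.1, even though the FPR is
   unchanged.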

386	6.  Defenses

388	   There are at least two types of WF defenses: traffic shaping or
389	   morphing algorithms, and traffic splitting algorithms.  This section
390	   describes and illustrates examples of both.

392	6.1.  Traffic Morphing

394	   WF defenses are deterministic or randomized algorithms that take as
395	   input application data or packet sequences and return modified
396	   application data or packet sequences.  Viable defenses seek to
397	   minimize the transformation cost and maximum (theoretical and
398	   perfect) attacker accuracy.  Naive defenses such as sending a
399	   constant stream of (possibly random) bytes between client and server
400	   may be effective though clearly not viable from a cost perspective.
401	   Relevant cost metrics include bandwidth overhead, added time or
402	   latency (and its impact on related metrics such as page load time),
403	   and even CPU cost, though the latter is often ignored in favor of the
404	   former two.  Wang [wang2016website] describes defenses as either
405	   limited or general.  A limited defense is one which only helps
406	   mitigate specific WF attacks by transforming packets in a way to
407	   obviate a particular (set of) feature(s) used by said attacks.  In
408	   contrast, general defenses help mitigate a variety of attacks.

410	   Several general defenses have been proposed, including BuFLO
411	   [dyer2012peek], which pads packets to a fixed length of 1500B (the
412	   normal MTU) and schedules packets for transmission at fixed
413	   intervals (and sends fake data if nothing is yet available).  Tamaraw
414	   [wang2014comparing] is an improvement over BuFLO that uses two
415	   different fixed lengths for packet transmission, rather than one, to
416	   save on bandwidth overhead.  Tamaraw also uses two different
417	   scheduling rates for ingress and egress packets.  The authors chose
418	   to make the ingress packet period smaller than the egress packet
419	   period since HTTP responses are often larger in size and count (if
420	   HTTP Push is used) than requests.  While provably correct, both
421	   BuFLO and Tamaraw limit the rate at which clients send traffic and
422	   require all clients to send at a uniform rate.  Both requirements
423	   therefore make them difficult to apply as generic defenses in IETF
424	   protocols.
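
   A simplified sketch of a BuFLO-style sender is shown below.  It
   emits one fixed-length packet per interval, filling unused space
   with padding, and keeps sending dummy packets until a minimum
   duration has elapsed.  The parameter names, the minimum-duration
   default, and the assumption that all application data is available
   at time zero are simplifications made for this example.

```python
def buflo_schedule(total_bytes, packet_len=1500, interval=0.02,
                   min_duration=1.0):
    """Return a list of (send_time, real_bytes, padding_bytes).

    Every packet is exactly packet_len bytes and packets are spaced
    interval seconds apart. Sending continues until all data is sent
    and at least min_duration seconds have been covered.
    """
    schedule = []
    pending = total_bytes
    tick = 0
    while pending > 0 or tick * interval < min_duration:
        real = min(pending, packet_len)
        # Pad up to packet_len; a packet with real == 0 is a dummy.
        schedule.append((tick * interval, real, packet_len - real))
        pending -= real
        tick += 1
    return schedule
```

   For example, with packet_len=1500, interval=0.5, and
   min_duration=2.0, transfers of 100 bytes and of 4000 bytes both
   produce four 1500-byte packets at the same send times, so an
   observer cannot tell them apart.  The cost is visible as well: the
   100-byte transfer sends 5900 bytes of padding.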

426	   Wang et al. also developed Supersequence [wang2016website], which
427	   attempts to approximate a bandwidth-optimal deterministic defense.
428	   This is done by casting the padding and flow control problem as the
429	   shortest common supersequence (SCS) of the transformed packet trace.
430	   Supersequence approximates the solution by learning the optimal
431	   packet scheduling rate; it uses the same padding scheme as Tamaraw.

433	   Walkie-Talkie [wang2015walkie] is a collection of mechanisms for WF
434	   defense.  It includes running the client (browser) in half-duplex
435	   mode to batch requests and responses together, as well as randomly
436	   padding traffic so as to mimic traffic of benign websites.  It
437	   assumes knowledge of traffic patterns for benign websites, which can
438	   be information learned over time or provided by a cooperating peer.
439	   Goldberg and Wang also propose a "randomized" variant that pads real
440	   bursts of requests and generates random request bursts according to a
441	   uniform distribution.  The half-duplex mode could be implemented as
442	   an extension to a protocol such as HTTP/2, QUIC, or TLS.

444	   Many limited defenses have also been proposed.  We list prominent
445	   works below.

447	   *  Shmatikov and Wang [shmatikov2006timing] developed adaptive
448	      padding which adds packets to mask inter-packet times.  (This
449	      mechanism does not ever delay application data being sent, in
450	      contrast to other padding mechanisms such as BuFLO; see above.)
451	      Juarez et al. [juarez2015wtf][juarez2016toward] also created a WF
452	      defense based on adaptive padding called WTF-PAD.  This variant
453	      uses application data and "gap" distribution to generate padding
454	      for delays.  Specifically, when not sending application data,
455	      senders use the gap distribution to drive fake packet
456	      transmission.  WTF-PAD can be run by a single endpoint, though it
457	      is assumed that both client and server participate.  As mentioned
458	      above, protocols such as HTTP/2, QUIC, and TLS 1.3 offer a
459	      mechanism by which applications can send padding.  WTF-PAD could
460	      therefore be implemented as an extension to any of these
461	      protocols, either by applications supplying padding distributions
462	      or the system learning them over time.

464	   *  In the context of HTTP, Danezis [danezis2009traffic] proposed
465	      padding URLs, content, and even HTML document structures to mask
466	      application data lengths.

468	   *  Wright et al. [wright2009traffic] developed traffic morphing,
469	      which pads packets in such a way so as to make the sequence from
470	      one page have characteristics of another (non-monitored or benign)
471	      page.  This technique requires application-specific knowledge
472	      about benign pages and is therefore best implemented outside of
473	      the transport layer.

475	   *  Nithyanand et al. [nithyanand2014glove] developed a mechanism
476	      called Glove, wherein traces were first clustered and then morphed
477	      (via dummy insertion, packet merging, splitting, and delaying) to
478	      look indistinguishable within clusters.  When used to protect the
479	      Alexa top 500 domains, Glove performs well with respect to
480	      bandwidth overhead when compared to BuFLO and CS-BuFLO.  Varying
481	      the cluster size can tune Glove's bandwidth overhead.

483	   *  Pironti et al. [pironti2012identifying] developed a TLS-based
484	      fragmentation and padding scheme designed to hide the length of
485	      application data within a certain range with record padding.  The
486	      mechanism works by iteratively splitting application data into
487	      variable sized segments.  Applications can guide the range of
488	      viable lengths provided such information is available.

490	   *  Luo et al. [luo2011httpos] created HTTPS with Obfuscation
491	      (HTTPOS), which is a client-side mechanism for obfuscating HTTP
492	      traffic.  It uses the HTTP Range header to receive resources in
493	      chunks, TCP MSS to limit the size of individual chunks, and
494	      advertised window size to control the flow of chunks in
495	      transmission.

497	   *  Panchenko et al. [panchenko2011website] developed Decoy, which is
498	      a simple mechanism that loads a benign page alongside a real page.
499	      This seeks to mask the real page load by properties of the "decoy"
500	      page.  As with morphing, this defense requires application-
501	      specific knowledge about benign pages and is best implemented
502	      outside of the transport layer.

504	   *  The Tor project implemented HTTP pipelining
505	      [perry2011experimental], which bundles egress HTTP/1.1 requests
506	      into batches of varying sizes with random orders.  Batching
507	      requests to mask request and response sizes could be made easier
508	      with HTTP/2 [RFC7540], HTTP/3, and QUIC, since these protocols
509	      naturally support multiplexing.  However, pipelining and batching
510	      may introduce latency that negatively impacts the user
511	      experience.

513	   *  Cherubin et al. [cherubin2017website] design two application-layer
514	      defenses called Application Layer Padding Concerns Adversaries
515	      (ALPaCA) and Lightweight application-Layer Masquerading Add-on
516	      (LLaMA).  ALPaCA is a server-side defense that pads first-party
517	      content (deterministically or probabilistically) according to a
518	      known distribution.  (Deterministic padding similar to Tamaraw
519	      performs worse than probabilistic padding.)  LLaMA is similar to
520	      randomized pipelining, yet differs in that requests are also
521	      delayed (if necessary) and spurious requests are generated
522	      according to some probability distribution.  Comparatively, ALPaCA
523	      yields a greater reduction in WF attack accuracy than LLaMA.

525	   *  Lu et al. [lu2018dynaflow] designed DynaFlow, which is a defense
526	      that dynamically adjusts traffic flows using a combination of
527	      burst pattern morphing, constant traffic flow with flexible
528	      intervals, and burst padding.  DynaFlow overhead is 40% less than
529	      that of Tamaraw and was shown to have similar benefits.

531	   *  Rahman [rahman18gan] uses generative adversarial networks (GANs)
532	      to modify candidate burst properties of packet traces, i.e., by
533	      inserting dummy packets, such that they appear indistinguishable
534	      from other traces.  Normally, the generator component in a GAN
535	      uses random noise to produce information that matches a target
536	      data distribution as classified by the discriminator component.
537	      Rahman uses a modified GAN architecture wherein the generator
538	      produces padding (dummy packets) for input data such that the
539	      discriminator cannot distinguish it from noise, or a desired burst
540	      packet sequence.  Preliminary results with the GAN trained and
541	      tested on defended traffic, i.e., traffic already subject to some
542	      form of WF defense, show a 9% increase in bandwidth and 15%
543	      decrease in attack accuracy (from 98% to 85% in a closed world
544	      setting).

546	   *  Imani et al. [imanimockingbird] developed Mockingbird, a defense
547	      built on using generated adversarial examples, i.e., dummy traffic
548	      designed to disrupt classifier behavior, aimed towards model
549	      misclassification.  When run against classifiers trained without
550	      adversarial examples, Mockingbird reduced the accuracy of the
551	      state-of-the-art DF and CUMUL [panchenko2016website] attacks from
552	      98% to 3% and from 92% to 31%, respectively.  Against classifiers
553	      trained and hardened with adversarial examples, Mockingbird is
554	      less effective, reducing attack accuracy from 95% to between 25%
555	      and 57%.  Classification results for half-duplex traces, i.e.,
556	      those in which traffic flows in half-duplex mode, are lower.
557	      Mockingbird's bandwidth overhead is tunable based on parameters
558	      that control the internal traffic shaping algorithm.

560	   *  Gong et al. [gong2020zero] developed GLUE, a lightweight defense
561	      that does not delay any packets, minimizing its effect on user
562	      experience.  Instead of adding packets during a packet trace in
563	      order to obfuscate which page it came from, GLUE adds packets
564	      between packet traces (during user think time/downtime) to merge
565	      them together, creating a seamless sequence of packets covering
566	      multiple page loads.  Attackers are unable to train classifiers
567	      for multiple contiguous traces and also unable to identify
568	      individual page traces from the sequence.  This is in part because
569	      the GLUE used is itself a real packet trace, thwarting attacker
570	      classification.  GLUE also adds extra noise packets ("FRONT") in
571	      the first trace as it is vulnerable otherwise.
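
   As an illustration of the randomized request batching described in
   the list above for the Tor pipelining defense, the following is a
   minimal sketch and not the Tor implementation.  The function name
   randomized_batches and the max_batch parameter are hypothetical, and
   requests are modelled as plain strings rather than real HTTP
   messages.  The sketch shuffles the pending requests and then cuts
   them into batches of random size.

```python
import random
from typing import List, Optional, Sequence


def randomized_batches(
    requests: Sequence[str],
    max_batch: int = 4,
    rng: Optional[random.Random] = None,
) -> List[List[str]]:
    """Shuffle pending requests, then cut them into random-sized batches.

    Hypothetical helper for illustration only; max_batch is an assumed
    tuning knob, not a value taken from any deployed defense.
    """
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    rng = rng or random.Random()
    pending = list(requests)
    rng.shuffle(pending)  # random order hides the page's natural fetch order
    batches: List[List[str]] = []
    while pending:
        # Each batch takes between 1 and max_batch of the remaining requests.
        size = rng.randint(1, min(max_batch, len(pending)))
        batches.append(pending[:size])
        pending = pending[size:]
    return batches


if __name__ == "__main__":
    for batch in randomized_batches(["/index.html", "/a.css", "/b.js", "/c.png"]):
        print(batch)
```

   Every request appears in exactly one batch, so the defense changes
   only the order and grouping of requests, not the set of resources
   fetched.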

573	6.2.  Traffic Splitting

575	   Traffic splitting is a type of defense wherein application data is
576	   sent over multiple, disjoint network paths.  Multipath TCP (MPTCP) is
577	   one type of "traffic splitting" protocol, wherein an endpoint may
578	   send TCP segments for a single connection over multiple interfaces.
579	   This is commonly done for multi-homed devices, such as mobile devices
580	   with cellular and WiFi or wired network connections.  Traffic
581	   splitting assumes that guided traffic distribution reduces
582	   information available to an adversary, and thereby decreases the
583	   success probability of WF attacks.  Traffic splitting defenses differ
584	   in the algorithm used for traffic distribution.

586	   Henri et al. [henri2020protecting] studied several traffic splitting
587	   algorithms, including: weighted and non-weighted round-robin path
588	   splitting, incoming and outgoing path split, fixed-probability
589	   splitting, and variants of per-connection uniform probability
590	   splitting.  The best results came from a per-connection path
591	   splitting variant where the maximum number of packets sent on any
592	   given path was limited by a random variable chosen from a geometric
593	   distribution.  (Once this limit was reached, a new path was selected
594	   uniformly at random.)  De la Cadena et al. [de2019poster] also study
595	   path splitting algorithms.  They conclude that a weighted random
596	   path selection algorithm works best.  (The authors do not give
597	   specifics of path weight probability derivation.)
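
   The geometric-limit variant described above can be sketched as
   follows.  This is an illustrative sketch under stated assumptions,
   not the code evaluated in [henri2020protecting]: the names geometric
   and assign_paths, the success probability p, and the choice to allow
   the same path to be re-selected are all assumptions made here.

```python
import random
from typing import List, Optional


def geometric(p: float, rng: random.Random) -> int:
    """Sample k >= 1 with P(k) = (1 - p)**(k - 1) * p."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def assign_paths(
    num_packets: int,
    num_paths: int,
    p: float = 0.1,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return the path index used for each packet of one connection.

    A path carries packets until its geometrically distributed budget is
    spent; then a new path is drawn uniformly at random (assumption: the
    same path may be drawn again).
    """
    if num_paths < 1 or not 0.0 < p <= 1.0:
        raise ValueError("need num_paths >= 1 and 0 < p <= 1")
    rng = rng or random.Random()
    assignment: List[int] = []
    path = rng.randrange(num_paths)
    budget = geometric(p, rng)
    for _ in range(num_packets):
        if budget == 0:
            path = rng.randrange(num_paths)
            budget = geometric(p, rng)
        assignment.append(path)
        budget -= 1
    return assignment


if __name__ == "__main__":
    print(assign_paths(num_packets=20, num_paths=3, p=0.25))
```

   A smaller p yields longer runs on each path, so an observer on any
   single path sees fewer but longer fragments of the trace.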

599	7.  Open Problems and Directions

601	   To date, WF attacks target clients running over Tor or some other
602	   anonymizing service, meaning that WF attacks are likely more accurate
603	   on normal TLS-protected connections.  Moreover, attacks normally
604	   assume clients use HTTP/1.1 with parallel connections for parallel
605	   resource fetches.  More recent protocols such as SPDY, HTTP/2, and
606	   QUIC, however, offer built-in padding support and multiplexed
607	   stream-based connections, which should make existing attacks more
608	   difficult to carry out.  That said, it is unclear exactly how these
609	   protocol design trends will affect WF attacks.  A non-exhaustive
610	   list of questions that warrant further research follows:

612	   1.  How do connection coalescing and consolidation affect WF
613	       attacks?  Technologies such as DNS-over-HTTPS and ESNI favor
614	       architectures wherein a single network or connection can serve
615	       multiple origins or resources.  With connection coalescing,
616	       traffic for multiple resources is sent on the same connection,
617	       thereby adding effects similar to those of the Decoy defense
618	       mechanism described in Section 6.

620	   2.  To what extent does protocol multiplexing increase WF attack
621	       difficulty?  Using a single connection with multiple streams to
622	       avoid HoL blocking saves on connection startup and bandwidth
623	       costs while simultaneously mixing information from multiple
624	       requests and resources on the same connection.

626	   3.  How can protocol features such as HTTP Push be used to improve WF
627	       defense efficacy?  Defenses without cooperative peer support
628	       often induce suboptimal bandwidth or latency costs.  If both
629	       endpoints of a connection participate in the defense, even
630	       proactively with Push, perhaps this could be improved.

632	   4.  Can connection bootstrapping techniques such as those used by
633	       ESNI be used to distribute WF defense information?  One possible
634	       approach is to distribute client padding profiles derived from
635	       CDN knowledge of serviced resources.

637	   5.  How can clients build, use, and possibly share WF defense
638	       information to benefit others?

640	   6.  How can applications package websites and subresources in such a
641	       way that limits unique information?  For example, websites link
642	       to third party resources in an ad-hoc fashion, causing the
643	       subsequent trace of browser fetches to possibly uniquely identify
644	       the website.

646	   Research into the above questions will help the IETF community better
647	   understand the extent to which WF attacks are a problem for Internet
648	   users in general.

650	   It is worth mentioning that traffic-based WF attacks may not be
651	   required to achieve the desired goal of learning a connection's
652	   destination.  Network connections by nature reveal information about
653	   endpoint behavior.  The relationship between network addresses and
654	   domains, especially when stable and unique, is a strong signal for
655	   website fingerprinting.  Trevisan et al. [trevisan2016towards]
656	   explored use of this signal as a reliable mechanism for website
657	   fingerprinting.  They find that most major services (domains) have
658	   clearly associated IP address(es), though these addresses may change
659	   over time.  Jiang et al. [jiang2007lightweight] and Tammaro et al.
660	   [tammaro2012exploiting] also previously came to the same conclusion.
661	   Address-based website fingerprinting was also explored by Patil and
662	   Borisov [patil2019can], wherein they showed that addresses,
663	   especially when grouped together as part of a single web page load,
664	   leak a substantial amount of information about the corresponding
665	   domain.  Thus, in general, classifiers that rely solely on network
666	   addresses may be used to aid website fingerprinting attacks.
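
   To make the address-based signal concrete, the sketch below ranks
   candidate sites by how closely the set of addresses observed during
   one page load matches a stored per-site address set.  It is a
   hypothetical illustration, not the classifier used in
   [trevisan2016towards] or [patil2019can].  The Jaccard similarity
   score, the function names, and the example profiles (which use
   documentation-only addresses and example domains) are assumptions
   made here.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sites(
    observed: Set[str],
    profiles: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Rank known sites by similarity to the observed address set."""
    scores = [(site, jaccard(observed, addrs)) for site, addrs in profiles.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical profiles: addresses contacted when loading each site.
PROFILES: Dict[str, Set[str]] = {
    "news.example": {"192.0.2.10", "192.0.2.11", "198.51.100.7"},
    "shop.example": {"203.0.113.5", "198.51.100.7"},
}

if __name__ == "__main__":
    seen = {"192.0.2.10", "198.51.100.7"}
    for site, score in rank_sites(seen, PROFILES):
        print(site, round(score, 3))
```

   Grouping all addresses from one page load into a set reflects the
   observation above that addresses leak more when considered together
   than when considered individually.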

668	8.  Protocol Design Considerations

670	   New protocols such as TLS 1.3 and QUIC are designed with privacy
671	   protections in mind.  TLS 1.3, for example, supports record-layer
672	   padding [RFC8446], although it is not used widely in practice.
673	   Despite this, TLS connections still leak metadata, including
674	   negotiated ciphersuites.  (See [fordTLSMetadata] for a discussion of
675	   this issue.)  QUIC is more aggressive in its use of encryption as
676	   both a mitigation for middlebox ossification and privacy
677	   enhancement.  IPsec Traffic Flow Confidentiality [RFC4303] and
678	   Traffic Flow Security [I-D.ietf-ipsecme-iptfs] are two mechanisms by
679	   which endpoints can pad ESP datagrams to mask size metadata.
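
   As a sketch of how TLS 1.3 record-layer padding hides lengths, the
   code below builds the inner plaintext defined in [RFC8446] (content,
   then a one-octet content type, then zero octets) before encryption.
   The policy of rounding up to a multiple of a fixed bucket size is an
   assumption made here; [RFC8446] leaves the choice of padding policy
   to the implementation.  The function names are hypothetical, and
   record encryption is omitted.

```python
from typing import Tuple

APPLICATION_DATA = 23      # TLS ContentType value for application_data
MAX_INNER_LEN = 2**14 + 1  # content (at most 2^14) + type octet + padding


def pad_inner_plaintext(content: bytes, bucket: int = 512) -> bytes:
    """Build content || type || zeros, rounded up to a multiple of bucket.

    The bucket policy is an illustrative assumption, not part of TLS 1.3.
    """
    if bucket < 1:
        raise ValueError("bucket must be at least 1")
    unpadded = len(content) + 1  # content plus the content type octet
    if unpadded > MAX_INNER_LEN:
        raise ValueError("content too long for one record")
    # Round up to the next bucket boundary, capped at the record limit.
    target = min(-(-unpadded // bucket) * bucket, MAX_INNER_LEN)
    return content + bytes([APPLICATION_DATA]) + bytes(target - unpadded)


def unpad_inner_plaintext(inner: bytes) -> Tuple[bytes, int]:
    """Strip trailing zeros; the last non-zero octet is the content type."""
    stripped = inner.rstrip(b"\x00")
    if not stripped:
        raise ValueError("no content type found")
    return stripped[:-1], stripped[-1]


if __name__ == "__main__":
    inner = pad_inner_plaintext(b"GET / HTTP/1.1\r\n\r\n", bucket=64)
    print(len(inner), unpad_inner_plaintext(inner))
```

   With this policy an observer learns only which bucket a record falls
   into, at the cost of up to one bucket of extra bytes per record.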

681	   Future protocols should follow these trends when possible to remove
682	   unnecessary metadata from the network.

684	9.  Security Considerations

686	   This document surveys security and privacy attacks and defenses on
687	   encrypted TLS connections.  It does not introduce, specify, or
688	   recommend any particular mitigation to the aforementioned attacks.

690	10.  IANA Considerations

692	   This document makes no IANA requests.

694	11.  Informative References

696	   [abe2016fingerprinting]
697	              "Fingerprinting attack on tor anonymity using deep
698	              learning", Asia-Pacific Advanced Network, 2016 , n.d..

700	   [backes2013preventing]
701	              "Preventing Side-Channel Leaks in Web Traffic -- A Formal
702	              Approach", NDSS, 2013 , n.d..

704	   [bhat2018var]
705	              "Var-CNN and DynaFlow -- Improved Attacks and Defenses for
706	              Website Fingerprinting", arXiv preprint arXiv:1802.10215 ,
707	              n.d..

709	   [bissias2005privacy]
710	              "Privacy vulnerabilities in encrypted HTTP streams",
711	              International Workshop on Privacy Enhancing Technologies,
712	              2005 , n.d..

714	   [cai2012touching]
715	              "Touching from a distance -- Website fingerprinting
716	              attacks and defenses", ACM conference on Computer and
717	              communications security, 2012 , n.d..

719	   [cheng1998traffic]
720	              "Traffic analysis of SSL encrypted web browsing", n.d..

722	   [cherubin2017website]
723	              "Website fingerprinting defenses at the application
724	              layer", Privacy Enhancing Technologies, 2017 , n.d..

726	   [coull2007web]
727	              "On Web Browsing Privacy in Anonymized NetFlows", USENIX
728	              Security Symposium , n.d..

730	   [CRIME]    "The CRIME Attack", n.d.,
731	              <https://www.ekoparty.org/archive/2012/
732	              CRIME_ekoparty2012.pdf>.

734	   [danezis2009traffic]
735	              "Traffic Analysis of the HTTP Protocol over TLS", 2009 ,
736	              n.d..

738	   [de2019poster]
739	              "Traffic Splitting to Counter Website Fingerprinting",
740	              2019, <https://dl.acm.org/doi/10.1145/3319535.3363249>.

742	   [dyer2012peek]
743	              "Peek-a-boo, i still see you -- Why efficient traffic
744	              analysis countermeasures fail", IEEE Symposium on Security
745	              and Privacy, 2012 , n.d..

747	   [fordTLSMetadata]
748	              "Metadata Protection Considerations for TLS Present and
749	              Future", n.d., <http://bford.info/pub/net/tlsmeta.pdf>.

751	   [foremski2014dns]
752	              "DNS-Class -- immediate classification of IP flows using
753	              DNS", International Journal of Network Management , n.d..

755	   [gong2010fingerprinting]
756	              "Fingerprinting websites using remote traffic analysis",
757	              Proceedings of the 17th ACM conference on Computer and
758	              communications security , n.d..

760	   [gong2020zero]
761	              "Zero-delay Lightweight Defenses against Website
762	              Fingerprinting", n.d.,
763	              <https://www.usenix.org/system/files/
764	              sec20summer_gong_prepub.pdf>.

766	   [hayes2016k]
767	              "k-fingerprinting -- A Robust Scalable Website
768	              Fingerprinting Technique", USENIX Security Symposium,
769	              2016 , n.d..

771	   [henri2020protecting]
772	              "Protecting against Website Fingerprinting with
773	              Multihoming", 2020,
774	              <https://petsymposium.org/2020/files/papers/issue2/popets-
775	              2020-0019.pdf>.

777	   [herrmann2009website]
778	              "Website fingerprinting -- attacking popular privacy
779	              enhancing technologies with the multinomial naive-bayes
780	              classifier", ACM workshop on Cloud computing security,
781	              2009 , n.d..

783	   [hintz2002fingerprinting]
784	              "Fingerprinting websites using traffic analysis",
785	              International Workshop on Privacy Enhancing Technologies,
786	              2002 , n.d..

788	   [I-D.ietf-ipsecme-iptfs]
789	              Hopps, C., "IP Traffic Flow Security", Work in Progress,
790	              Internet-Draft, draft-ietf-ipsecme-iptfs-01, March 2,
791	              2020, <http://www.ietf.org/internet-drafts/draft-ietf-
792	              ipsecme-iptfs-01.txt>.

794	   [I-D.ietf-quic-transport]
795	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
796	              and Secure Transport", Work in Progress, Internet-Draft,
797	              draft-ietf-quic-transport-29, June 9, 2020,
798	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
799	              transport-29.txt>.

801	   [I-D.ietf-tls-esni]
802	              Rescorla, E., Oku, K., Sullivan, N., and C. Wood, "TLS
803	              Encrypted Client Hello", Work in Progress, Internet-Draft,
804	              draft-ietf-tls-esni-07, June 1, 2020,
805	              <http://www.ietf.org/internet-drafts/draft-ietf-tls-esni-
806	              07.txt>.

808	   [imanimockingbird]
809	              "Mockingbird -- Defending Against Deep-Learning-Based
810	              Website Fingerprinting Attacks with Adversarial Traces",
811	              n.d., <https://arxiv.org/pdf/1902.06626.pdf>.

813	   [jiang2007lightweight]
814	              "Lightweight application classification for network
815	              management", SIGCOMM workshop on Internet network
816	              management, 2007 , n.d..

818	   [juarez2014critical]
819	              "A critical evaluation of website fingerprinting attacks",
820	              ACM SIGSAC Conference on Computer and Communications
821	              Security, 2014 , n.d..

823	   [juarez2015wtf]
824	              "WTF-PAD -- toward an efficient website fingerprinting
825	              defense for tor", CoRR, abs/1512.00524 , n.d., <https://pd
826	              fs.semanticscholar.org/0f54/4d0845cb9f317722759dc49e1493ef
827	              30d83d.pdf>.

829	   [juarez2016toward]
830	              "Toward an efficient website fingerprinting defense",
831	              European Symposium on Research in Computer Security,
832	              2016 , n.d..

834	   [li2018can]
835	              "Can We Learn What People Are Doing from Raw DNS
836	              Queries?", IEEE INFOCOM 2018-IEEE Conference on Computer
837	              Communications , n.d..

839	   [liberatore2006inferring]
840	              "Inferring the source of encrypted HTTP connections", ACM
841	              Conference on Computer and Communications Security, 2006 ,
842	              n.d..

844	   [lu2010website]
845	              "Website fingerprinting and identification using ordered
846	              feature sequences", European Symposium on Research in
847	              Computer Security, 2010 , n.d..

849	   [lu2018dynaflow]
850	              "DynaFlow -- An Efficient Website Fingerprinting Defense
851	              Based on Dynamically-Adjusting Flows", Workshop on Privacy
852	              in the Electronic Society, 2018 , n.d..

854	   [luo2011httpos]
855	              "HTTPOS -- Sealing Information Leaks with Browser-side
856	              Obfuscation of Encrypted Flows", NDSS, 2011 , n.d..

858	   [miller2014know]
859	              "I know why you went to the clinic -- Risks and
860	              realization of https traffic analysis", International
861	              Symposium on Privacy Enhancing Technologies Symposium,
862	              2014 , n.d..

864	   [nithyanand2014glove]
865	              "Glove -- A bespoke website fingerprinting defense",
866	              Proceedings of the 13th Workshop on Privacy in the
867	              Electronic Society , n.d..

869	   [oh2017pfp]
870	              "p-FP -- Extraction, Classification, and Prediction of
871	              Website Fingerprints with Deep Learning", n.d..

873	   [panchenko2011website]
874	              "Website fingerprinting in onion routing based
875	              anonymization networks", ACM workshop on Privacy in the
876	              electronic society, 2011 , n.d..

878	   [panchenko2016website]
879	              "Website Fingerprinting at Internet Scale", n.d.,
880	              <https://www.freehaven.net/anonbib/cache/fingerprinting-
881	              ndss2016.pdf>.

883	   [patil2019can]
884	              "What can you learn from an IP?", n.d.,
885	              <https://irtf.org/anrw/2019/
886	              anrw2019-final44-acmpaginated.pdf>.

888	   [perry2011experimental]
889	              "Experimental defense for website traffic fingerprinting",
890	              n.d., <https://blog.torproject.org/experimental-defense-
891	              website-traffic-fingerprinting>.

893	   [pironti2012identifying]
894	              "Identifying website users by TLS traffic analysis -- New
895	              attacks and effective countermeasures", n.d..

897	   [rahman18gan]
898	              "Generating Adversarial Packets for Website Fingerprinting
899	              Defense", n.d., <https://www.rahmanmsaidur.com/projects/
900	              Fall_18_Generating_Adversarial_Packets.pdf>.

902	   [rahman2018using]
903	              "Using Packet Timing Information in Website
904	              Fingerprinting", n.d..

906	   [rahman2019tik]
907	              "Tik-Tok -- The Utility of Packet Timing in Website
908	              Fingerprinting Attacks", n.d.,
909	              <https://arxiv.org/pdf/1902.06421.pdf>.

911	   [reed2017identifying]
912	              "Identifying https-protected netflix videos in real-time",
913	              ACM on Conference on Data and Application Security and
914	              Privacy, 2017 , n.d..

916	   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
917	              RFC 4303, DOI 10.17487/RFC4303, December 2005,
918	              <https://www.rfc-editor.org/info/rfc4303>.

920	   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
921	              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
922	              DOI 10.17487/RFC7540, May 2015,
923	              <https://www.rfc-editor.org/info/rfc7540>.

925	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
926	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
927	              <https://www.rfc-editor.org/info/rfc8446>.

929	   [RFC8484]  Hoffman, P. and P. McManus, "DNS Queries over HTTPS
930	              (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018,
931	              <https://www.rfc-editor.org/info/rfc8484>.

933	   [rimmer2018automated]
934	              "Automated website fingerprinting through deep learning",
935	              Network & Distributed System Security Symposium (NDSS),
936	              2018 , n.d..

938	   [schuster2017beauty]
939	              "Beauty and the burst -- Remote identification of
940	              encrypted video streams", USENIX Security, 2017 , n.d..

942	   [shmatikov2006timing]
943	              "Timing analysis in low-latency mix networks -- Attacks
944	              and defenses", European Symposium on Research in Computer
945	              Security, 2006 , n.d..

947	   [shulman2014pretty]
948	              "Pretty bad privacy -- Pitfalls of DNS encryption",
949	              Workshop on Privacy in the Electronic Society, 2014 ,
950	              n.d..

952	   [siby2018dns]
953	              "DNS Privacy not so private -- the traffic analysis
954	              perspective", n.d..

956	   [sirinam2018deep]
957	              "Deep fingerprinting -- Undermining website fingerprinting
958	              defenses with deep learning", arXiv preprint
959	              arXiv:1801.02265 , n.d..

961	   [sun2002statistical]
962	              "Statistical identification of encrypted web browsing
963	              traffic", IEEE, 2002 , n.d..

965	   [tammaro2012exploiting]
966	              "Exploiting packet-sampling measurements for traffic
967	              characterization and classification", International
968	              Journal of Network Management, 2012 , n.d..

970	   [TIME]     "A Perfect CRIME? Only TIME Will Tell", Black Hat Europe
971	              2013 , n.d..

973	   [trevisan2016towards]
974	              "Towards web service classification using addresses and
975	              DNS", Wireless Communications and Mobile Computing
976	              Conference (IWCMC), 2016 International. IEEE, 2016 , n.d..

978	   [wagner1996analysis]
979	              "Analysis of the SSL 3.0 protocol", USENIX Workshop on
980	              Electronic Commerce Proceedings, 1996 , n.d..

982	   [wang2013improved]
983	              "Improved website fingerprinting on tor", Workshop on
984	              privacy in the electronic society, 2013 , n.d..

986	   [wang2014comparing]
987	              "Comparing website fingerprinting attacks and defenses",
988	              Technical Report 2013-30, CACR, 2013. , n.d..

990	   [wang2014effective]
991	              "Effective Attacks and Provable Defenses for Website
992	              Fingerprinting", USENIX Security Symposium, 2014 , n.d..

994	   [wang2015walkie]
995	              "Walkie-talkie -- An effective and efficient defense
996	              against website fingerprinting", n.d..

998	   [wang2016website]
999	              "Website fingerprinting -- Attacks and defenses",
1000	              University of Waterloo , n.d..

1002	   [wright2009traffic]
1003	              "Traffic Morphing -- An Efficient Defense Against
1004	              Statistical Traffic Analysis", NDSS, 2009 , n.d..

1006	   [yan2018feature]
1007	              "Feature selection for website fingerprinting",
1008	              Proceedings on Privacy Enhancing Technologies, 2018 ,
1009	              n.d..

1011	Appendix A.  Acknowledgements

1013	   The authors would like to thank Frederic Jacobs and Tim Taubert for
1014	   feedback on earlier versions of this document.

1016	Authors' Addresses

1018	   Ian Goldberg
1019	   University of Waterloo

1021	   Email: iang@uwaterloo.ca

1023	   Tao Wang
1024	   HK University of Science and Technology

1026	   Email: taow@cse.ust.hk

1027	   Christopher A. Wood
1028	   Apple, Inc.

1030	   Email: cawood@apple.com