Network Working Group                                           J. Hall
Internet-Draft                                                      CDT
Intended status: Informational                                 M. Aaron
Expires: January 9, 2017                                      CU Boulder
                                                                B. Jones
                                                             N. Feamster
                                                               Princeton
                                                           July 08, 2016

             A Survey of Worldwide Censorship Techniques
                    draft-hall-censorship-tech-04

Abstract

   This document describes the technical mechanisms used by censorship
   regimes around the world to block or impair Internet traffic.  It
   aims to make designers, implementers, and users of Internet
   protocols aware of the properties being exploited and the mechanisms
   used to censor end-user access to information.  This document makes
   no suggestions on individual protocol considerations; it is purely
   informational and intended to be a reference.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Technical Prescription
   3.  Technical Identification
     3.1.  Points of Control
     3.2.  Application Layer
       3.2.1.  HTTP Request Header Identification
       3.2.2.  HTTP Response Header Identification
       3.2.3.  Instrumenting Content Providers
       3.2.4.  Deep Packet Inspection (DPI) Identification
       3.2.5.  Server Name Indication
     3.3.  Transport Layer
       3.3.1.  TCP/IP Header Identification
       3.3.2.  Protocol Identification
   4.  Technical Interference
     4.1.  Performance Degradation
     4.2.  Packet Dropping
     4.3.  RST Packet Injection
     4.4.  DNS Interference
     4.5.  Distributed Denial of Service (DDoS)
     4.6.  Network Disconnection or Adversarial Route Announcement
   5.  Non-Technical Prescription
   6.  Non-Technical Interference
     6.1.  Self Censorship
     6.2.  Domain Name Reallocation
     6.3.  Server Takedown
     6.4.  Notice and Takedown
   7.  Contributors
   8.  Informative References
   Authors' Addresses

1.  Introduction

   Censorship occurs when an entity in a position of power - such as a
   government, organization, or individual - suppresses communication
   that it considers objectionable, harmful, sensitive, politically
   incorrect, or inconvenient.  (Although censors may enforce
   censorship through legal, military, or other non-technical means,
   this document focuses largely on the technical mechanisms used to
   achieve network censorship.)

   This document describes the technical mechanisms that censorship
   regimes around the world use to block or degrade Internet traffic
   (see [RFC7754] for a discussion of Internet blocking and filtering
   in terms of implications for Internet architecture, rather than
   end-user access to content and services).

   We describe three elements of Internet censorship: prescription,
   identification, and interference.  Prescription is the process by
   which censors determine what types of material they should block,
   e.g., deciding to block a list of pornographic websites.
   Identification is the process by which censors classify specific
   traffic to be blocked or impaired, e.g., blocking or impairing all
   web pages containing "sex" in the title or all traffic to sex.com.
   Interference is the process by which the censor intercedes in
   communication and prevents access to censored materials by blocking
   access or impairing the connection.
2.  Technical Prescription

   Prescription is the process of determining what censors would like
   to block [Glanville-2008].  Generally, censors aggregate "to block"
   information in blacklists or use real-time heuristic assessment of
   content [Ding-1999].

   There are typically three types of blacklists: keyword, domain
   name, or IP address.  Keyword and domain name blocking take place
   at the application level (e.g., HTTP), whereas IP blocking tends to
   take place using routing data in TCP/IP headers.  The mechanisms
   for building up these blacklists vary.  Censors can purchase
   "content control" software from private industry, such as
   SmartFilter, which allows filtering of broad categories that they
   would like to block, such as gambling or pornography.  In these
   cases, these private services attempt to categorize every semi-
   questionable website so as to allow for metatag blocking
   (similarly, they tune real-time content heuristic systems to map
   their assessments onto categories of objectionable content).

   Countries that are more interested in retaining specific political
   control, a desire which requires swift and decisive action, often
   have ministries or organizations that maintain their own
   blacklists, such as the Ministry of Industry and Information
   Technology in China or the Ministry of Culture and Islamic Guidance
   in Iran.

3.  Technical Identification

3.1.  Points of Control

   Internet censorship, necessarily, takes place over a network.
   Network design gives censors a number of different points of
   control where they can identify the content they are interested in
   filtering.  An important aspect of pervasive technical interception
   is the necessity to rely on software or hardware to intercept the
   content the censor is interested in.  This requirement, the need to
   have the interception mechanism located somewhere, logically or
   physically, implicates various general points of control:

   o  Internet Backbone: If a censor controls the gateways into a
      region, they can filter undesirable traffic that is traveling
      into and out of the region by sniffing and mirroring at the
      relevant exchange points.  Censorship at this point of control
      is most effective at controlling the flow of information between
      a region and the rest of the Internet, but is ineffective at
      identifying content traveling between users within a region.

   o  Internet Service Providers: Internet Service Providers are
      perhaps the most natural point of control.  They have the
      benefit of being easily enumerable by a censor, paired with the
      ability to identify the regional and international traffic of
      all their users.  The censor's filtration mechanisms can be
      placed on an ISP via governmental mandates, ownership, or
      voluntary/coercive influence.

   o  Institutions: Private institutions such as corporations,
      schools, and cyber cafes can put filtration mechanisms in place.
      These mechanisms are occasionally put in place at the request of
      a censor, but are more often implemented to help achieve
      institutional goals, such as preventing the viewing of
      pornography on school computers.

   o  Personal Devices: Censors can mandate that censorship software
      be installed at the device level.  This has many disadvantages
      in terms of scalability, ease of circumvention, and operating
      system requirements.  The emergence of mobile devices
      exacerbates these feasibility problems.

   o  Services: Application service providers can be pressured,
      coerced, or legally required to censor specific content or flows
      of data.  Service providers naturally face incentives to
      maximize their potential customer base, and potential service
      shutdowns or legal liability due to censorship efforts may seem
      much less attractive than potentially excluding content, users,
      or uses of their service.

   o  Certificate Authorities: Authorities that issue
      cryptographically secured resources can be a significant point
      of control.  Certificate Authorities that issue certificates to
      domain holders for TLS/HTTPS, or Regional/Local Internet
      Registries that issue Route Origination Authorizations to BGP
      operators, can be forced to issue rogue certificates that may
      allow compromises in confidentiality guarantees - allowing
      censorship software to engage in identification and interference
      where not possible before - or in integrity guarantees -
      allowing, for example, adversarial routing of traffic.

   o  Content Distribution Networks (CDNs): CDNs seek to collapse
      network topology in order to locate content closer to a
      service's users and thereby improve quality of service.  These
      can be powerful points of control for censors, especially if the
      location of a CDN results in easier interference.

   At all levels of the network hierarchy, the filtration mechanisms
   used to detect undesirable traffic are essentially the same: a
   censor sniffs transmitted packets and identifies undesirable
   content, and then uses a blocking or shaping mechanism to prevent
   or impair access.  Identification of undesirable traffic can occur
   at the application, transport, or network layer of the IP stack.
   Censors are almost always concerned with web traffic, so the
   relevant protocols tend to be filtered in predictable ways.  For
   example, a subversive image would always make it past a keyword
   filter, but the IP address of the site serving the image may be
   blacklisted when identified as a provider of undesirable content.

3.2.  Application Layer

3.2.1.  HTTP Request Header Identification

   An HTTP header contains a lot of useful information for traffic
   identification; although Host is the only required field in an HTTP
   request header (for HTTP/1.1 and later), an HTTP method field is
   necessary to do anything useful.  As such, the method and Host
   fields are the two fields used most often for ubiquitous
   censorship.  A censor can sniff traffic and identify a specific
   domain name (Host) and usually a page name (GET /page) as well.
   This identification technique is usually paired with TCP/IP header
   identification (see Section 3.3.1) for a more robust method.
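
   As a rough illustration of how little machinery such identification
   requires, the following Python sketch (purely illustrative, with a
   hypothetical blacklist; it is not drawn from any deployed product)
   classifies a sniffed HTTP request by its method and Host fields:

      # Hypothetical sketch: classify a raw HTTP request by method and
      # Host header against a blacklist of censored domains.
      BLACKLIST = {"example.com"}            # hypothetical entries

      def is_censored(raw_request: bytes) -> bool:
          head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
          lines = head.split("\r\n")
          method = lines[0].split(" ", 1)[0]        # e.g., "GET"
          host = ""
          for line in lines[1:]:
              if line.lower().startswith("host:"):
                  # Strip the header name and any port suffix.
                  host = line.split(":", 1)[1].strip().lower()
                  host = host.split(":")[0]
          return method in ("GET", "POST") and host in BLACKLIST

   A real deployment would apply such a check to mirrored traffic and
   hand matches to one of the interference mechanisms in Section 4.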
   Trade-offs: Request Identification is a technically
   straightforward identification method that can be easily
   implemented at the backbone or ISP level.  The hardware needed for
   this sort of identification is cheap and easy to acquire, making it
   desirable when budget and scope are a concern.  HTTPS encrypts the
   relevant request and response fields, so pairing with TCP/IP
   identification (see Section 3.3.1) is necessary to filter HTTPS.
   However, some countermeasures, such as URL obfuscation [RSF-2005],
   can trivially defeat simple forms of HTTP Request Header
   Identification.

   Empirical Examples: Studies exploring censorship mechanisms have
   found evidence of HTTP header/URL filtering in many countries,
   including Bangladesh, Bahrain, China, India, Iran, Malaysia,
   Pakistan, Russia, Saudi Arabia, South Korea, Thailand, and Turkey
   [Verkamp-2012] [Nabi-2013] [Aryan-2012].  Commercial technologies
   such as McAfee SmartFilter and NetSweeper are often purchased by
   censors [Dalek-2013].  These commercial technologies use a
   combination of HTTP Request Identification and TCP/IP Header
   Identification to filter specific URLs.  Dalek et al. and Jones et
   al. identified the use of these products in the wild [Dalek-2013]
   [Jones-2014].

3.2.2.  HTTP Response Header Identification

   While HTTP Request Header Identification relies on the information
   contained in the HTTP request from client to server, response
   identification uses information sent in the response by the server
   to the client to identify undesirable content.

   Trade-offs: As with HTTP Request Header Identification, the
   techniques used to identify HTTP traffic are well-known, cheap, and
   relatively easy to implement, but they are rendered useless by
   HTTPS because the response in HTTPS is encrypted, including
   headers.

   The response fields are also less helpful for identifying content
   than the request fields, as the Server field could easily be
   identified using HTTP Request Header Identification, and the Via
   field is rarely relevant.  HTTP response censorship mechanisms
   normally let the first n packets through while the mirrored traffic
   is being processed; this may allow some content through, and the
   user may be able to detect that the censor is actively interfering
   with undesirable content.

   Empirical Examples: In 2009, Jong Park et al. at the University of
   New Mexico demonstrated that the Great Firewall of China (GFW) used
   this technique [Crandall-2010].  However, Jong Park et al. found
   that the GFW discontinued this practice during the course of the
   study.  Due to the overlap between HTTP response filtering and
   keyword filtering (see Section 3.2.3), it is likely that most
   censors rely on keyword filtering over TCP streams instead of HTTP
   response filtering.

3.2.3.  Instrumenting Content Providers

   In addition to censorship by the state, many governments pressure
   content providers to censor themselves.  Due to the extensive reach
   of government censorship, we define a content provider as any
   service that provides utility to users, including everything from
   web sites to locally installed programs.  The defining factor of
   keyword identification by content providers is the choice of
   content providers to detect restricted terms on their platform.
   The terms to look for may be provided by the government, or the
   content provider may be expected to come up with its own list.
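
   As a minimal sketch of what provider-side keyword identification
   amounts to, the following illustrative Python fragment (the term
   list is hypothetical) flags text containing restricted terms:

      # Hypothetical sketch: flag content containing restricted terms.
      # The list may be supplied by a government or compiled by the
      # provider itself, as described above.
      RESTRICTED_TERMS = ["restricted term 1", "restricted term 2"]

      def flag_content(text: str) -> list:
          lowered = text.lower()
          return [t for t in RESTRICTED_TERMS if t in lowered]

   A provider might refuse, log, or silently drop a post or search
   query for which flag_content() returns a non-empty list.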
   Trade-offs: By instrumenting content providers to identify
   restricted content, the censor can gain new information at the cost
   of political capital with the companies it forces or encourages to
   participate in censorship.  For example, the censor can gain
   insight about the content of encrypted traffic by coercing web
   sites to identify restricted content, but this may drive away
   potential investment.  Coercing content providers may encourage
   self-censorship, an additional advantage for censors.  The
   trade-offs for instrumenting content providers are highly dependent
   on the content provider and the requested assistance.

   Empirical Examples: Researchers have discovered keyword
   identification by content providers on platforms ranging from
   instant messaging applications [Senft-2013] to search engines
   [Rushe-2015] [Cheng-2010] [Whittaker-2013] [BBC-2013]
   [Condliffe-2013].  To demonstrate the prevalence of this type of
   keyword identification, we look to search engine censorship.

   Search engine censorship demonstrates keyword identification by
   content providers and can be regional or worldwide.  Implementation
   is occasionally voluntary, but is normally based on the laws and
   regulations of the country a search engine is operating in.  The
   keyword blacklists are most likely maintained by the search engine
   provider.  China requires search engine providers to "voluntarily"
   maintain search term blacklists to acquire and keep an Internet
   content provider (ICP) license [Cheng-2010].  It is clear that
   these blacklists are maintained by each search engine provider,
   based on the slight variations in the intercepted searches
   [Zhu-2011] [Whittaker-2013].  The United Kingdom has been pushing
   search engines to self-censor with the threat of litigation if they
   don't do it themselves: Google and Microsoft have agreed to block
   more than 100,000 queries in the U.K. to help combat abuse
   [BBC-2013] [Condliffe-2013].

   Depending on the output, search engine keyword identification may
   be difficult or easy to detect.  In some cases specialized or blank
   results provide a trivial enumeration mechanism, but more subtle
   censorship can be difficult to detect.  In February 2015,
   Microsoft's search engine, Bing, was accused of censoring Chinese
   content outside of China [Rushe-2015] because Bing returned
   different results for censored terms in Chinese and English.
   However, it is possible that censorship of the largest base of
   Chinese search users, China, biased Bing's results so that the more
   popular results in China (the uncensored results) were also more
   popular for Chinese speakers outside of China.

3.2.4.  Deep Packet Inspection (DPI) Identification

   Deep Packet Inspection (DPI) has become computationally feasible as
   a censorship mechanism in recent years [Wagner-2009].  Unlike other
   techniques, DPI reassembles network flows to examine the
   application "data" section, as opposed to only the header, and is
   therefore often used for keyword identification.  DPI also differs
   from other identification technologies in that it can leverage
   additional packet and flow characteristics, e.g., packet sizes and
   timings, to identify content.  To prevent substantial quality of
   service (QoS) impacts, DPI normally analyzes a copy of data while
   the original packets continue to be routed.  Typically, the traffic
   is split using either a mirror switch or fiber splitter, and
   analyzed on a cluster of machines running Intrusion Detection
   Systems (IDS) configured for censorship.
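
   The core of keyword-oriented DPI is flow reassembly followed by
   pattern matching.  The following Python sketch is a toy version of
   that core, assuming a capture front end that delivers (flow_id,
   seq, payload) tuples from mirrored traffic (all names are
   hypothetical; real systems do this at line rate in optimized code):

      # Toy sketch of DPI keyword matching over reassembled TCP flows.
      from collections import defaultdict

      KEYWORDS = [b"keyword1", b"keyword2"]   # hypothetical terms
      flows = defaultdict(dict)               # flow_id -> {seq: payload}

      def on_packet(flow_id, seq, payload):
          flows[flow_id][seq] = payload
          # Reassemble the flow in sequence order and scan it.
          stream = b"".join(p for _, p in sorted(flows[flow_id].items()))
          return any(k in stream for k in KEYWORDS)  # True -> interfere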
   Trade-offs: DPI is one of the most expensive identification
   mechanisms and can have a large QoS impact [Porter-2010].  When
   used as a keyword filter for TCP flows, DPI systems can also cause
   major overblocking problems.  Like other techniques, DPI is less
   useful against encrypted data, though DPI can leverage unencrypted
   elements of an encrypted data flow (e.g., the Server Name
   Indication (SNI) sent in the clear for TLS) or statistical
   information about an encrypted flow (e.g., video takes more
   bandwidth than audio or textual forms of communication) to identify
   traffic.  (TODO: talk about content inference through things like
   TLS fingerprinting?)

   Despite these problems, DPI is the most powerful identification
   method and is widely used in practice.  The Great Firewall of China
   (GFW), the largest censorship system in the world, uses DPI to
   identify restricted content over HTTP and DNS and inject TCP RSTs
   and bad DNS responses, respectively, into connections
   [Crandall-2010] [Clayton-2006] [Anonymous-2014].

   Empirical Evidence: Several studies have found evidence of DPI
   being used to censor content and tools.  Clayton et al., Crandall
   et al., Anonymous, and Khattak et al. all explored the GFW, and
   Khattak et al. even probed the firewall to discover implementation
   details such as how much state it stores [Crandall-2010]
   [Clayton-2006] [Anonymous-2014] [Khattak-2013].  The Tor project
   claims that China, Iran, Ethiopia, and others must have used DPI to
   block the obfs2 protocol [Wilde-2012].  Malaysia has been accused
   of using targeted DPI, paired with DDoS, to identify and
   subsequently knock out pro-opposition material [Wagstaff-2013].  It
   also seems likely that organizations not so worried about blocking
   content in real time could use DPI to sort and categorically search
   gathered traffic using technologies such as NarusInsight
   [Hepting-2011].

3.2.5.  Server Name Indication

   In encrypted connections using Transport Layer Security (TLS),
   there may be servers that host multiple "virtual servers" at a
   given network address, and the client will need to specify in the
   (unencrypted) Client Hello message which domain name it seeks to
   connect to (so that the server can respond with the appropriate TLS
   certificate) using the Server Name Indication (SNI) TLS extension
   [RFC6066].  Since SNI is sent in the clear, censors and filtering
   software can use it as a basis for blocking, filtering, or
   impairment by dropping connections to domains that match prohibited
   content (e.g., bad.foo.com may be censored while good.foo.com is
   not) [Shbair-2015].  (TODO: talk about domain fronting in CDNs
   where SNI does not match the Host field inside the encrypted HTTPS
   envelope?)
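
   Because the Client Hello is unencrypted, extracting the SNI value
   requires only a small amount of fixed-offset parsing.  The
   following Python sketch (assuming a single, well-formed Client
   Hello record laid out as in [RFC6066]; a real filter must handle
   malformed and fragmented records) returns the server_name a censor
   would match against a blocklist:

      # Sketch: pull the server_name out of a TLS Client Hello.
      import struct

      def sni_from_client_hello(data: bytes) -> str:
          i = 5 + 4                  # skip record and handshake headers
          i += 2 + 32                # skip client_version and random
          i += 1 + data[i]           # skip session_id
          i += 2 + struct.unpack("!H", data[i:i+2])[0]  # cipher_suites
          i += 1 + data[i]           # skip compression_methods
          ext_end = i + 2 + struct.unpack("!H", data[i:i+2])[0]
          i += 2
          while i + 4 <= ext_end:
              ext_type, ext_len = struct.unpack("!HH", data[i:i+4])
              i += 4
              if ext_type == 0:      # server_name extension
                  # Skip list length (2 bytes) and name type (1 byte).
                  name_len = struct.unpack("!H", data[i+3:i+5])[0]
                  return data[i+5:i+5+name_len].decode("ascii")
              i += ext_len
          return ""                  # no SNI present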
   Trade-offs: Some clients do not send the SNI extension (e.g.,
   clients that only support versions of SSL and not TLS) or will fall
   back to SSL if a TLS connection fails, rendering this method
   ineffective.  In addition, this technique requires deep packet
   inspection techniques that can be computationally and
   infrastructurally expensive, and improper configuration of an
   SNI-based block can result in significant overblocking, e.g., when
   a second-level domain like google.com is inadvertently blocked.

   Empirical Evidence: While there are many examples of security firms
   that offer SNI-based filtering [Trustwave-2015] [Sophos-2015]
   [Shbair-2015], the authors currently know of no specific examples
   or reports of SNI-based filtering observed in the field used for
   censorship purposes.

3.3.  Transport Layer

3.3.1.  TCP/IP Header Identification

   TCP/IP Header Identification is the most pervasive, reliable, and
   predictable type of identification.  TCP/IP headers contain a few
   invaluable pieces of information that must be transparent for
   traffic to be successfully routed: the destination and source IP
   addresses and ports.  The destination and source IP addresses are
   doubly useful, as they not only allow a censor to block undesirable
   content via IP blacklisting, but also allow a censor to identify
   the IP address of the user making the request.  The port is useful
   for whitelisting certain applications.

   Trade-offs: TCP/IP identification is popular due to its simplicity,
   availability, and robustness.

   TCP/IP identification is trivial to implement, but it is difficult
   to implement in backbone or ISP routers at scale, and is therefore
   typically implemented with DPI.  Blacklisting an IP is equivalent
   to installing a /32 route on a router, and due to limited flow
   table space, this cannot scale beyond a few thousand IPs at most.
   IP blocking is also relatively crude, leading to overblocking, and
   cannot deal with some services, like Content Distribution Networks
   (CDNs), that host content at hundreds or thousands of IP addresses.
   Despite these limitations, IP blocking is extremely effective,
   because the user needs to proxy their traffic through another
   destination to circumvent this type of identification.

   Port blocking is generally not useful because many types of content
   share the same port and it is possible for censored applications to
   change their port.  For example, most HTTP traffic goes over port
   80, so the censor cannot differentiate between restricted and
   allowed content solely on the basis of port.  Port whitelisting,
   where a censor limits communication to approved ports such as 80
   for HTTP traffic, is occasionally used, and is most effective when
   used in conjunction with other identification mechanisms.  For
   example, a censor could block the default HTTPS port, port 443,
   thereby forcing most users to fall back to HTTP.

3.3.2.  Protocol Identification

   Censors sometimes identify entire protocols to be blocked using a
   variety of traffic characteristics.  For example, Iran impairs the
   performance of HTTPS traffic, a protocol that prevents further
   analysis, to encourage users to switch to HTTP, a protocol that
   they can analyze [Aryan-2012].  A simple protocol identification
   approach would be to recognize all TCP traffic over port 443 as
   HTTPS, but more sophisticated analysis of the statistical
   properties of payload data and flow behavior would be more
   effective, even when port 443 is not used [Hjelmvik-2010]
   [Sandvine-2014].
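
   As a deliberately crude illustration of the difference between the
   two approaches, the following Python sketch combines a port
   heuristic with one statistical property of the payload (byte
   entropy); production classifiers use far richer flow features and
   trained models:

      # Crude sketch: guess a protocol from port plus payload entropy.
      import math
      from collections import Counter

      def entropy(payload: bytes) -> float:
          if not payload:
              return 0.0
          n = len(payload)
          counts = Counter(payload)
          return -sum(c / n * math.log2(c / n) for c in counts.values())

      def guess_protocol(dst_port: int, payload: bytes) -> str:
          if dst_port == 443 and entropy(payload) > 7.0:
              return "tls-like"      # high-entropy traffic on port 443
          if dst_port == 80 and payload[:4] in (b"GET ", b"POST"):
              return "http"
          return "unknown"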
   If censors can detect circumvention tools, they can block them, so
   censors like China are extremely interested in identifying the
   protocols used by censorship circumvention tools.  In recent years,
   this has devolved into an arms race between censors and
   circumvention tool developers.  As part of this arms race, China
   developed an extremely effective protocol identification technique
   that researchers call active probing or active scanning.

   In active probing, the censor determines whether hosts are running
   a circumvention protocol by trying to initiate communication using
   the circumvention protocol.  If the host and the censor
   successfully negotiate a connection, then the censor knows
   conclusively that the host is running a circumvention tool.  China
   has used active scanning to great effect to block Tor
   [Winter-2012].

   Trade-offs: Protocol Identification necessarily only provides
   insight into the way information is traveling, and not the
   information itself.

   Protocol Identification is useful for detecting and blocking
   circumvention tools, like Tor, or traffic that is difficult to
   analyze, like VoIP or SSL, because the censor can assume that this
   traffic should be blocked.  However, this can lead to overblocking
   problems when used with popular protocols.  These methods are
   expensive, both computationally and financially, due to the use of
   statistical analysis, and can be ineffective due to their imprecise
   nature.

   Empirical Examples: Protocol Identification can be easy to detect
   if it is conducted in real time and only a particular protocol is
   blocked, but some types of protocol identification, like active
   scanning, are much more difficult to detect.  Protocol
   Identification has been used by Iran to identify and throttle SSH
   traffic to make it unusable [Anonymous-2007] and by China to
   identify and block Tor relays [Winter-2012].  Protocol
   Identification has also been used for traffic management, such as
   the 2007 case in which Comcast in the United States used RST
   injection to interrupt BitTorrent traffic [Schoen-2007].

4.  Technical Interference

   (TODO: organize this section into layers just like identification
   above.  Alternatively, the whole document can be organized in a
   layer structure and do identification and interference at the same
   time for each layer?  That seems wise.)

4.1.  Performance Degradation

   While the other interference techniques outlined in this section
   mostly focus on blocking or preventing access to content, in some
   cases it can be an effective censorship strategy not to entirely
   block access to a given destination or service, but instead to
   degrade the performance of the relevant network connection.  The
   resulting user experience for a site or service under performance
   degradation can be so bad that users opt to use a different site,
   service, or method of communication, or may not engage in
   communication at all if there are no alternatives.  Traffic shaping
   techniques that rate-limit the bandwidth available to certain types
   of traffic are one example of performance degradation.

   Trade-offs: While implementing a performance degradation will not
   always eliminate people's ability to access a desired resource, it
   may force them to use other means of communication where censorship
   (or surveillance) is more easily accomplished.
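
   A classic mechanism behind the rate-limiting mentioned above is the
   token bucket.  The following Python sketch shows the idea in
   application code for clarity; deployed shapers implement the same
   logic in routers or kernel queueing disciplines, and the rates
   shown are hypothetical:

      # Sketch: token-bucket shaping of a censored traffic class.
      import time

      class TokenBucket:
          def __init__(self, rate_bytes_per_s: float, burst: float):
              self.rate = rate_bytes_per_s
              self.capacity = burst
              self.tokens = burst
              self.last = time.monotonic()

          def allow(self, packet_len: int) -> bool:
              now = time.monotonic()
              self.tokens = min(self.capacity,
                                self.tokens + (now - self.last) * self.rate)
              self.last = now
              if self.tokens >= packet_len:
                  self.tokens -= packet_len
                  return True
              return False   # queue or drop the packet to shape the flow

      shaper = TokenBucket(rate_bytes_per_s=16000, burst=32000)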
   Empirical Examples: Iran is known to shape the bandwidth available
   to HTTPS traffic to encourage unencrypted HTTP traffic
   [Aryan-2012].

4.2.  Packet Dropping

   Packet Dropping is a simple mechanism to prevent undesirable
   traffic.  The censor identifies undesirable traffic and, instead of
   following its normal routing behavior, chooses not to forward any
   packets it sees associated with that traffic.  This can be paired
   with any of the previously described identification mechanisms so
   long as the censor knows the user must route traffic through a
   controlled router.

   Trade-offs: Packet Dropping is most successful when every
   traversing packet has transparent information linked to undesirable
   content, such as a destination IP address.  One downside Packet
   Dropping suffers from is the necessity of overblocking all content
   from otherwise allowable IPs based on a single subversive
   subdomain; blogging services and GitHub repositories are good
   examples.  China famously dropped all GitHub packets for three days
   based on a single repository hosting undesirable content
   [Anonymous-2013].  The need to inspect every traversing packet in
   close to real time also makes Packet Dropping somewhat challenging
   from a QoS perspective.

   Empirical Examples: Packet Dropping is a very common form of
   technical interference and lends itself to accurate detection given
   the unique nature of the timed-out requests it leaves in its wake.
   The Great Firewall of China uses Packet Dropping as one of its
   primary mechanisms of technical censorship [Ensafi-2013].  Iran
   also uses Packet Dropping as the mechanism for throttling SSH
   [Aryan-2012].  These are but two examples of a ubiquitous
   censorship practice.

4.3.  RST Packet Injection

   Packet injection, generally, refers to a man-in-the-middle (MITM)
   network interference technique that spoofs packets in an
   established traffic stream.  RST packets are normally used to let
   one side of a TCP connection know the other side has stopped
   sending information, and thus that the receiver should close the
   connection.  RST Packet Injection is a specific type of packet
   injection attack that is used to interrupt an established stream by
   sending RST packets to both sides of a TCP connection; as each
   receiver thinks the other has dropped the connection, the session
   is terminated.

   Trade-offs: RST Packet Injection has a few advantages that make it
   extremely popular as a censorship technique.  RST Packet Injection
   is an out-of-band interference mechanism, allowing the avoidance of
   the QoS bottleneck one can encounter with inline techniques such as
   Packet Dropping.  This out-of-band property allows a censor to
   inspect a copy of the information, usually mirrored by an optical
   splitter, making it an ideal pairing for DPI and Protocol
   Identification [Weaver-2009] (this asynchronous version of a MITM
   is often called a man-on-the-side (MOTS) attack).  RST Packet
   Injection also has the advantage of only requiring one of the two
   endpoints to accept the spoofed packet for the connection to be
   interrupted.

   The difficult part of RST Packet Injection is spoofing "enough"
   correct information to ensure one endpoint accepts an RST packet as
   legitimate; this generally implies a correct IP address, port, and
   (TCP) sequence number.
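
   The following Python sketch, using the scapy packet library
   (assumed to be available), illustrates the fields a censor must
   forge; all addresses are placeholders, and choosing the seq value
   is the hard part, as discussed next:

      # Sketch: forge a TCP RST for an observed connection (scapy).
      from scapy.all import IP, TCP, send

      def inject_rst(src, dst, sport, dport, seq):
          # The spoofed packet claims to come from one endpoint of the
          # connection and must carry an acceptable sequence number.
          pkt = IP(src=src, dst=dst) / TCP(sport=sport, dport=dport,
                                           flags="R", seq=seq)
          send(pkt, verbose=False)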
   The sequence number is the hardest to get correct, as [RFC0793]
   specifies that an RST packet should be in-sequence to be accepted,
   although the RFC also recommends allowing in-window packets as
   "good enough".  This in-window recommendation is important: if it
   is implemented, it allows for successful Blind RST Injection
   attacks [Netsec-2011].  In a blind injection, the censor does not
   know any sensitive sequencing information about the TCP stream it
   is injecting into; when in-window sequencing is allowed, the censor
   can simply enumerate the ~70000 possible windows, which is
   particularly useful for interrupting encrypted/obfuscated protocols
   such as SSH or Tor.  RST Packet Injection relies on a stateful
   network, making it useless against UDP connections.  RST Packet
   Injection is among the most popular censorship techniques used
   today given its versatile nature and effectiveness against all
   types of TCP traffic.

   Empirical Examples: RST Packet Injection, as mentioned above, is
   most often paired with identification techniques that require
   splitting, such as DPI or Protocol Identification.  In 2007,
   Comcast was accused of using RST Packet Injection to interrupt
   traffic it identified as BitTorrent [Schoen-2007]; this later led
   to a U.S. Federal Communications Commission ruling against Comcast
   [VonLohmann-2008].  China has also been known to use RST Packet
   Injection for censorship purposes.  This interference is especially
   evident in the interruption of encrypted/obfuscated protocols, such
   as those used by Tor [Winter-2012].

4.4.  DNS Interference

   There are a variety of mechanisms that censors can use to block or
   filter access to content by altering responses from the DNS
   [AFNIC-2013] [ICANN-SSAC-2012], including blocking the response,
   replying with an error message, or responding with an incorrect
   address (potentially the address of a server that can communicate
   to the end user a reason for blocking access to that resource, for
   example using HTTP status code 451 [RFC7725]).

   "DNS mangling" is a network-level technique whereby an incorrect IP
   address is returned in response to a DNS query for a censored
   destination.  An example of this is what the Chinese network does
   (we are not aware of any other wide-scale uses of mangling).  On
   the Chinese network, every DNS request in transit is examined
   (presumably by network inspection technologies such as DPI) and, if
   it matches a censored domain, a false response is injected.  End
   users can see this technique in action by simply sending DNS
   requests to any unused IP address in China (see the example below).
   If the name is not censored, there will be no response.  If it is
   censored, an erroneous response will be returned.
   For example, using the command-line dig utility to query an unused
   IP address in China (113.113.113.113) for the name "www.ietf.org"
   (uncensored at the time of writing) compared with
   "www.facebook.com" (censored at the time of writing), we get an
   erroneous IP address "37.61.54.158" as a response:

      % dig +short +nodnssec @113.113.113.113 A www.ietf.org
      ;; connection timed out; no servers could be reached

      % dig +short +nodnssec @113.113.113.113 A www.facebook.com
      37.61.54.158

   There are also cases of what is colloquially called "DNS lying",
   where a censor mandates that the DNS responses provided - by an
   operator of a recursive resolver such as an Internet access
   provider - be different from what authoritative name servers would
   provide [Bortzmayer-2015].

   DNS cache poisoning refers to a mechanism where a censor interferes
   with the response sent by an authoritative name server to a
   recursive resolver by responding more quickly than the
   authoritative server can respond with an alternative IP address
   [ViewDNS-2011].  (TODO: Stephane says this cite misuses "cache
   poisoning" and that we haven't seen much of this performed
   systematically.)  Cache poisoning occurs after the requested site's
   name servers resolve the request and attempt to forward the true IP
   address back to the requesting device; on the return route, the
   resolved IP address is recursively cached by each DNS server that
   initially forwarded the request.  During this caching process, if
   an undesirable keyword is recognized, the resolved IP address is
   "poisoned" and an alternative IP address (or an NXDOMAIN error) is
   returned more quickly than the upstream resolver can respond,
   causing an erroneous IP address to be cached (and potentially
   recursively so).  The alternative IP addresses usually direct to a
   nonsense domain or a warning page.  Alternatively, Iranian
   censorship appears to prevent the communication en route,
   preventing a response from ever being sent [Aryan-2012].

   Trade-offs: These forms of DNS interference require the censor to
   force a user to traverse a controlled DNS hierarchy (or an
   intervening network on which the censor serves as an Active
   Pervasive Attacker [RFC7624] to rewrite DNS responses) for the
   mechanism to be effective.  They can be circumvented by a
   technically savvy user who opts to use alternative DNS resolvers
   (such as the public DNS resolvers provided by Google, OpenDNS,
   Telecomix, or FDN) or Virtual Private Network technology.  DNS
   mangling and cache poisoning also imply returning an incorrect IP
   address to those attempting to resolve a domain name, but in some
   cases the destination may still be technically accessible; over
   HTTP, for example, the user may have another method of obtaining
   the IP address of the desired site and may be able to access it if
   the site is configured to be the default server listening at that
   IP address.  Blocking overflow has also been a problem, as
   occasionally users outside of the censor's region will be directed
   through DNS servers or DNS-rewriting network equipment controlled
   by a censor, causing the request to fail.  The ease of
   circumvention paired with the large risk of overblocking and
   blocking overflow makes DNS interference a partial, difficult, and
   less than ideal censorship mechanism.
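
   As a hypothetical detection sketch, the following Python fragment
   uses the dnspython library (assumed to be installed) to compare the
   answers that a local resolver and a reference resolver give for the
   same name; a mismatch or timeout suggests, though does not prove,
   interference:

      # Sketch: compare A records from two resolvers for one name.
      import dns.resolver

      def compare_answers(name, local_ip, reference_ip):
          results = {}
          for label, server in (("local", local_ip),
                                ("reference", reference_ip)):
              r = dns.resolver.Resolver(configure=False)
              r.nameservers = [server]
              r.lifetime = 5.0
              try:
                  answers = r.resolve(name, "A")
                  results[label] = sorted(a.address for a in answers)
              except Exception as exc:
                  results[label] = repr(exc)
          return results   # differing A records suggest mangling/lying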
   Empirical Evidence: DNS interference, when properly implemented, is
   easy to identify based on the shortcomings identified above.
   Turkey relied on DNS interference for its country-wide block of
   websites such as Twitter and YouTube for almost a week in March
   2014, but the ease of circumvention resulted in an increase in the
   popularity of Twitter until Turkish ISPs implemented an IP
   blacklist to achieve the governmental mandate [Zmijewki-2014].
   Ultimately, Turkish ISPs started hijacking all requests to Google's
   and Level 3's international DNS resolvers [Zmijewki-2014].  DNS
   interference, when incorrectly implemented, has resulted in some of
   the largest "censorship disasters".  In January 2014, China started
   directing all requests passing through the Great Firewall to a
   single domain, dongtaiwang.com, due to an improperly configured DNS
   poisoning attempt; this incident is thought to be the largest
   Internet-service outage in history [AFP-2014] [Anon-SIGCOMM12].
   Countries such as China, Iran, Turkey, and the United States have
   discussed blocking entire TLDs as well, but only Iran has acted, by
   blocking all Israeli (.il) domains [Albert-2011].

4.5.  Distributed Denial of Service (DDoS)

   Distributed Denial of Service (DDoS) attacks are a common attack
   mechanism used by "hacktivists" and malicious hackers, but censors
   have also used DDoS in the past for a variety of reasons.  There is
   a huge variety of DDoS attacks [Wikip-DoS], but at a high level two
   possible impacts tend to occur: a flood attack results in the
   service being unusable while resources are being spent to flood the
   service, while a crash attack aims to crash the service so
   resources can be reallocated elsewhere without "releasing" the
   service.

   Trade-offs: DDoS is an appealing mechanism when a censor would like
   to prevent all access to undesirable content, instead of only
   access in their region for a limited period of time, but this is
   really the only uniquely beneficial feature of DDoS as a censorship
   technique.  The resources required to carry out a successful DDoS
   against major targets are substantial, usually requiring the rental
   or ownership of a malicious distributed platform such as a botnet,
   and the technique is imprecise.  DDoS is an incredibly crude
   censorship technique and appears to be used largely as a timely,
   easy-to-access mechanism for blocking undesirable content for a
   limited period of time.

   Empirical Examples: In 2012, the U.K.'s GCHQ used DDoS to
   temporarily shut down IRC chat rooms frequented by members of
   Anonymous using a SYN flood, which exploits the handshake used by
   TCP to overload the victim server with so many requests that
   legitimate traffic becomes slow or impossible [Schone-2014]
   [CERT-2000].  Dissenting opinion websites are frequently victims of
   DDoS around politically sensitive events in Burma
   [Villeneuve-2011].  Controlling parties in Russia [Kravtsova-2012],
   Zimbabwe [Orion-2013], and Malaysia [Muncaster-2013] have been
   accused of using DDoS to interrupt opposition support and access
   during elections.  In 2015, China launched a DDoS attack using a
   true MITM system colocated with the Great Firewall, dubbed the
   "Great Cannon", that was able to inject JavaScript code into web
   visits to a Chinese search engine and commandeered those user
   agents to send DDoS traffic to various sites [Marczak-2015].
4.6.  Network Disconnection or Adversarial Route Announcement

   While it is perhaps the crudest of all censorship techniques, there
   is no more effective way of making sure undesirable information
   cannot propagate on the web than by shutting off the network.  The
   network can be logically cut off in a region when a censoring body
   withdraws all of the Border Gateway Protocol (BGP) prefixes routing
   through the censor's country.

   Trade-offs: The impact of a network disconnection in a region is
   huge and absolute; the censor pays for absolute control over
   digital information by giving up all the benefits the Internet
   brings.  This is never a long-term solution for any rational censor
   and is normally only used as a last resort in times of substantial
   unrest.

   Empirical Examples: Network disconnections tend to only happen in
   times of substantial unrest, largely due to the huge social,
   political, and economic impact such a move has.  One of the first
   highly covered occurrences was the junta in Myanmar employing
   network disconnection to help junta forces quash a rebellion in
   2007 [Dobie-2007].  China disconnected the network in the Xinjiang
   region during unrest in 2009 in an effort to prevent the protests
   from spreading to other regions [Heacock-2009].  The Arab Spring
   saw the most frequent usage of network disconnection, with events
   in Egypt and Libya in 2011 [Cowie-2011] [Cowie-2011b] and Syria in
   2012 [Thomson-2012].

5.  Non-Technical Prescription

   As the name implies, sometimes manpower is the easiest way to
   figure out which content to block.  Manual filtering differs from
   the common tactic of building up blacklists in that it doesn't
   necessarily target a specific IP address or domain name, but
   instead removes or flags content.  Given the imprecise nature of
   automatic filtering, manually sorting through content and flagging
   dissenting websites, blogs, articles, and other media for
   filtration can be an effective technique.  This filtration can
   occur at the backbone/ISP level - China's army of monitors is a
   good example [BBC-2013b] - but more commonly manual filtering
   occurs at the institutional level.  Internet Content Providers
   (ICPs), such as Google or Weibo, require a business license to
   operate in China.  One of the prerequisites for a business license
   is an agreement to sign a "voluntary pledge" known as the "Public
   Pledge on Self-discipline for the Chinese Internet Industry".  The
   failure to "energetically uphold" the pledged values can lead to
   ICPs being held liable for the offending content by the Chinese
   government [BBC-2013b].

6.  Non-Technical Interference

6.1.  Self Censorship

   Self censorship is one of the most interesting and effective types
   of censorship; it is a mix of Bentham's Panopticon, cultural
   manipulation, intelligence gathering, and meatspace enforcement.
   Simply put, self censorship is when a censor creates an atmosphere
   in which users censor themselves.  This can be achieved through
   controlling information, intimidating would-be dissidents, swaying
   public thought, and creating apathy.  Self censorship is difficult
   to document, as when it is implemented effectively the only
   noticeable trace is a lack of undesirable content; instead, one
   must look at the tools and techniques used by censors to encourage
   self-censorship.
   Controlling information relies on traditional censorship
   techniques, or on forcing all users to connect through an intranet,
   such as in North Korea.  Intimidation is often achieved by allowing
   Internet users to post "whatever they want" but arresting those who
   post about dissenting views; this technique is incredibly common
   [Calamur-2013] [AP-2012] [Hopkins-2011] [Guardian-2014]
   [Johnson-2010].  A good example of swaying public thought is
   China's "50-Cent Party", composed of somewhere between 20,000
   [Bristow-2013] and 300,000 [Fareed-2008] contributors who are paid
   to "guide public thought" on local and regional issues as directed
   by the Ministry of Culture.  Creating apathy can be a side effect
   of successfully controlling information over time and is ideal for
   a censorship regime [Gao-2014].

6.2.  Domain Name Reallocation

   As domain names are resolved recursively, if a TLD registry
   deregisters a domain, all other DNS servers will be unable to
   properly forward and cache the site.  Domain name reallocation is
   only really a risk where undesirable content is hosted on a TLD
   controlled by the censoring country, such as .cn or .ru
   [Anderson-2011], or where legal processes in countries like the
   United States result in domain name seizures and/or DNS redirection
   by the government [Kopel-2013].

6.3.  Server Takedown

   Servers must have a physical location somewhere in the world.  If
   undesirable content is hosted in the censoring country, the servers
   can be physically seized or the hosting provider can be required to
   prevent access [Anderson-2011].

6.4.  Notice and Takedown

   In some countries, legal mechanisms exist where an individual can
   issue a legal request to a content host that requires the host to
   take down content.  Examples include the voluntary systems employed
   by companies like Google to comply with "Right to be Forgotten"
   policies in the European Union [Google-RTBF] and the copyright-
   oriented notice and takedown regime of the United States Digital
   Millennium Copyright Act (DMCA) Section 512 [DMLP-512].

7.  Contributors

   This document benefited from discussions with Stephane Bortzmeyer,
   Nick Feamster, and Martin Nilsson.

8.  Informative References

   [AFNIC-2013]  AFNIC, "Report of the AFNIC Scientific Council:
                 Consequences of DNS-based Internet filtering", 2013.

   [AFP-2014]    AFP, "China Has Massive Internet Breakdown Reportedly
                 Caused By Their Own Censoring Tools", 2014.

   [Albert-2011] Albert, K., "DNS Tampering and the new ICANN gTLD
                 Rules", 2011.

   [Anderson-2011]
                 Anderson, R. and S. Murdoch, "Access Denied: Tools
                 and Technology of Internet Filtering", 2011.

   [Anon-SIGCOMM12]
                 Anonymous, "The Collateral Damage of Internet
                 Censorship by DNS Injection", 2012.

   [Anonymous-2007]
                 Anonymous, "How to Bypass Comcast's Bittorrent
                 Throttling", 2012.

   [Anonymous-2013]
                 Anonymous, "GitHub blocked in China - how it
                 happened, how to get around it, and where it will
                 take us", 2013.

   [Anonymous-2014]
                 Anonymous, "Towards a Comprehensive Picture of the
                 Great Firewall's DNS Censorship", 2014.

   [AP-2012]     Associated Press, "Sattar Beheshit, Iranian Blogger,
                 Was Beaten In Prison According To Prosecutor", 2012.

   [Aryan-2012]  Aryan, S., Aryan, H., and J. Halderman, "Internet
                 Censorship in Iran: A First Look", 2012.
   [BBC-2013]    BBC News, "Google and Microsoft agree steps to block
                 abuse images", 2013.

   [BBC-2013b]   BBC, "China employs two million microblog monitors
                 state media say", 2013.

   [Bortzmayer-2015]
                 Bortzmeyer, S., "DNS Censorship (DNS Lies) As Seen By
                 RIPE Atlas", 2015.

   [Bristow-2013]
                 Bristow, M., "China's internet 'spin doctors'", 2013.

   [Calamur-2013]
                 Calamur, K., "Prominent Egyptian Blogger Arrested",
                 2013.

   [CERT-2000]   CERT, "TCP SYN Flooding and IP Spoofing Attacks",
                 2000.

   [Cheng-2010]  Cheng, J., "Google stops Hong Kong auto-redirect as
                 China plays hardball", 2010.

   [Clayton-2006]
                 Clayton, R., "Ignoring the Great Firewall of China",
                 2006.

   [Condliffe-2013]
                 Condliffe, J., "Google Announces Massive New
                 Restrictions on Child Abuse Search Terms", 2013.

   [Cowie-2011]  Cowie, J., "Egypt Leaves the Internet", 2011.

   [Cowie-2011b] Cowie, J., "Libyan Disconnect", 2011.

   [Crandall-2010]
                 Crandall, J., "Empirical Study of a National-Scale
                 Distributed Intrusion Detection System: Backbone-
                 Level Filtering of HTML Responses in China", 2010.

   [Dalek-2013]  Dalek, J., "A Method for Identifying and Confirming
                 the Use of URL Filtering Products for Censorship",
                 2013.

   [Ding-1999]   Ding, C., Chi, C., Deng, J., and C. Dong,
                 "Centralized Content-Based Web Filtering and
                 Blocking: How Far Can It Go?", 1999.

   [DMLP-512]    Digital Media Law Project, "Protecting Yourself
                 Against Copyright Claims Based on User Content",
                 2012.

   [Dobie-2007]  Dobie, M., "Junta tightens media screw", 2007.

   [Ensafi-2013] Ensafi, R., "Detecting Intentional Packet Drops on
                 the Internet via TCP/IP Side Channels", 2013.

   [Fareed-2008] Fareed, M., "China joins a turf war", 2008.

   [Gao-2014]    Gao, H., "Tiananmen, Forgotten", 2014.

   [Glanville-2008]
                 Glanville, J., "The Big Business of Net Censorship",
                 2008.

   [Google-RTBF] Google, Inc., "Search removal request under data
                 protection law in Europe", 2015.

   [Guardian-2014]
                 The Guardian, "Chinese blogger jailed under crackdown
                 on 'internet rumours'", 2014.

   [Heacock-2009]
                 Heacock, R., "China Shuts Down Internet in Xinjiang
                 Region After Riots", 2009.

   [Hepting-2011]
                 Electronic Frontier Foundation, "Hepting vs. AT&T",
                 2011.

   [Hjelmvik-2010]
                 Hjelmvik, E., "Breaking and Improving Protocol
                 Obfuscation", 2010.

   [Hopkins-2011]
                 Hopkins, C., "Communications Blocked in Libya, Qatari
                 Blogger Arrested: This Week in Online Tyranny", 2011.

   [ICANN-SSAC-2012]
                 ICANN Security and Stability Advisory Committee
                 (SSAC), "SAC 056: SSAC Advisory on Impacts of Content
                 Blocking via the Domain Name System", 2012.

   [Johnson-2010]
                 Johnson, L., "Torture feared in arrest of Iraqi
                 blogger", 2011.

   [Jones-2014]  Jones, B., "Automated Detection and Fingerprinting of
                 Censorship Block Pages", 2014.

   [Khattak-2013]
                 Khattak, S., "Towards Illuminating a Censorship
                 Monitor's Model to Facilitate Evasion", 2013.

   [Kopel-2013]  Kopel, K., "Operation Seizing Our Sites: How the
                 Federal Government is Taking Domain Names Without
                 Prior Notice", 2013.
   [Kravtsova-2012]
                 Kravtsova, Y., "Cyberattacks Disrupt Opposition's
                 Election", 2012.

   [Marczak-2015]
                 Marczak, B., Weaver, N., Dalek, J., Ensafi, R.,
                 Fifield, D., McKune, S., Rey, A., Scott-Railton, J.,
                 Deibert, R., and V. Paxson, "An Analysis of China's
                 "Great Cannon"", 2015.

   [Muncaster-2013]
                 Muncaster, P., "Malaysian election sparks web
                 blocking/DDoS claims", 2013.

   [Nabi-2013]   Nabi, Z., "The Anatomy of Web Censorship in
                 Pakistan", 2013.

   [Netsec-2011] n3t2.3c, "TCP-RST Injection", 2011.

   [Orion-2013]  Orion, E., "Zimbabwe election hit by hacking and DDoS
                 attacks", 2013.

   [Porter-2010] Porter, T., "The Perils of Deep Packet Inspection",
                 2010.

   [RFC0793]     Postel, J., "Transmission Control Protocol", STD 7,
                 RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC6066]     Eastlake 3rd, D., "Transport Layer Security (TLS)
                 Extensions: Extension Definitions", RFC 6066,
                 DOI 10.17487/RFC6066, January 2011.

   [RFC7624]     Barnes, R., Schneier, B., Jennings, C., Hardie, T.,
                 Trammell, B., Huitema, C., and D. Borkmann,
                 "Confidentiality in the Face of Pervasive
                 Surveillance: A Threat Model and Problem Statement",
                 RFC 7624, DOI 10.17487/RFC7624, August 2015.

   [RFC7725]     Bray, T., "An HTTP Status Code to Report Legal
                 Obstacles", RFC 7725, DOI 10.17487/RFC7725, February
                 2016.

   [RFC7754]     Barnes, R., Cooper, A., Kolkman, O., Thaler, D., and
                 E. Nordmark, "Technical Considerations for Internet
                 Service Blocking and Filtering", RFC 7754,
                 DOI 10.17487/RFC7754, March 2016.

   [RSF-2005]    Reporters Sans Frontieres, "Technical ways to get
                 around censorship", 2005.

   [Rushe-2015]  Rushe, D., "Bing censoring Chinese language search
                 results for users in the US", 2013.

   [Sandvine-2014]
                 Sandvine, "Technology Showcase on Traffic
                 Classification: Why Measurements and Freeform Policy
                 Matter", 2014.

   [Schoen-2007] Schoen, S., "EFF tests agree with AP: Comcast is
                 forging packets to interfere with user traffic",
                 2007.

   [Schone-2014] Schone, M., Esposito, R., Cole, M., and G. Greenwald,
                 "Snowden Docs Show UK Spies Attacked Anonymous,
                 Hackers", 2014.

   [Senft-2013]  Senft, A., "Asia Chats: Analyzing Information
                 Controls and Privacy in Asian Messaging
                 Applications", 2013.

   [Shbair-2015] Shbair, W., Cholez, T., Goichot, A., and I.
                 Chrisment, "Efficiently Bypassing SNI-based HTTPS
                 Filtering", 2015.

   [Sophos-2015] Sophos, "Understanding Sophos Web Filtering", 2015.

   [Thomson-2012]
                 Thomson, I., "Syria Cuts off Internet and Mobile
                 Communication", 2012.

   [Trustwave-2015]
                 Trustwave, "Filter: SNI extension feature and HTTPS
                 blocking", 2015.

   [Verkamp-2012]
                 Verkamp, J. and M. Gupta, "Inferring Mechanics of Web
                 Censorship Around the World", 2012.

   [ViewDNS-2011]
                 ViewDNS.info, "DNS Cache Poisoning in the People's
                 Republic of China", 2011.

   [Villeneuve-2011]
                 Villeneuve, N., "Open Access: Chapter 8, Control and
                 Resistance, Attacks on Burmese Opposition Media",
                 2011.

   [VonLohmann-2008]
                 VonLohmann, F., "FCC Rules Against Comcast for
                 BitTorrent Blocking", 2008.
   [Wagner-2009] Wagner, B., "Deep Packet Inspection and Internet
                 Censorship: International Convergence on an
                 'Integrated Technology of Control'", 2009.

   [Wagstaff-2013]
                 Wagstaff, J., "In Malaysia, online election battles
                 take a nasty turn", 2013.

   [Weaver-2009] Weaver, N., Sommer, R., and V. Paxson, "Detecting
                 Forged TCP Packets", 2009.

   [Whittaker-2013]
                 Whittaker, Z., "1,168 keywords Skype uses to censor,
                 monitor its Chinese users", 2013.

   [Wikip-DoS]   Wikipedia, "Denial of Service Attacks", 2016.

   [Wilde-2012]  Wilde, T., "Knock Knock Knockin' on Bridges Doors",
                 2012.

   [Winter-2012] Winter, P., "How China is Blocking Tor", 2012.

   [Zhu-2011]    Zhu, T., "An Analysis of Chinese Search Engine
                 Filtering", 2011.

   [Zmijewki-2014]
                 Zmijewski, E., "Turkish Internet Censorship Takes a
                 New Turn", 2014.

Authors' Addresses

   Joseph Lorenzo Hall
   CDT

   Email: joe@cdt.org

   Michael D. Aaron
   CU Boulder

   Email: michael.aaron@colorado.edu

   Ben Jones
   Princeton

   Email: bj6@cs.princeton.edu

   Nick Feamster
   Princeton

   Email: feamster@cs.princeton.edu