2 Network Working Group J. Hall 3 Internet-Draft CDT 4 Intended status: Informational M. Aaron 5 Expires: April 25, 2019 CU Boulder 6 S. Adams 7 CDT 8 B. Jones 9 N. Feamster 10 Princeton 11 October 22, 2018 13 A Survey of Worldwide Censorship Techniques 14 draft-hall-censorship-tech-06 16 Abstract 18 This document describes the technical mechanisms used by censorship 19 regimes around the world to block or impair Internet traffic. It 20 aims to make designers, implementers, and users of Internet protocols 21 aware of the properties being exploited and mechanisms used to censor 22 end-user access to information. This document makes no suggestions 23 on individual protocol considerations, and is purely informational, 24 intended to be a reference. 26 Status of This Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at https://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on April 25, 2019. 43 Copyright Notice 45 Copyright (c) 2018 IETF Trust and the persons identified as the 46 document authors. All rights reserved.
48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (https://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 61 2. Technical Prescription . . . . . . . . . . . . . . . . . . . 3 62 3. Technical Identification . . . . . . . . . . . . . . . . . . 4 63 3.1. Points of Control . . . . . . . . . . . . . . . . . . . . 4 64 3.2. Application Layer . . . . . . . . . . . . . . . . . . . . 5 65 3.2.1. HTTP Request Header Identification . . . . . . . . . 5 66 3.2.2. HTTP Response Header Identification . . . . . . . . . 6 67 3.2.3. Instrumenting Content Providers . . . . . . . . . . . 7 68 3.2.4. Deep Packet Inspection (DPI) Identification . . . . . 8 69 3.2.5. Server Name Indication . . . . . . . . . . . . . . . 9 70 3.3. Transport Layer . . . . . . . . . . . . . . . . . . . . . 10 71 3.3.1. TCP/IP Header Identification . . . . . . . . . . . . 10 72 3.3.2. Protocol Identification . . . . . . . . . . . . . . . 11 73 4. Technical Interference . . . . . . . . . . . . . . . . . . . 12 74 4.1. Application Layer . . . . . . . . . . . . . . . . . . . . 12 75 4.1.1. DNS Interference . . . . . . . . . . . . . . . . . . 12 76 4.2. Transport Layer . . . . . . . . . . . . . . . . . . . . . 14 77 4.2.1. Performance Degradation . . . . . . . . . . . . . . . 14 78 4.2.2. Packet Dropping . . . . . . . . . . . . . . . . . . . 14 79 4.2.3. RST Packet Injection . . . . . . . . . . . . . . . . 15 80 4.3. Multi-layer and Non-layer . . . . . . . . . . . . . . . . 16 81 4.3.1. Distributed Denial of Service (DDoS) . . . . . . . . 16 82 4.3.2. Network Disconnection or Adversarial Route 83 Announcement . . . . . . . . . . . . . . . . . . . . 17 84 5. Non-Technical Prescription . . . . . . . . . . . . . . . . . 18 85 6. Non-Technical Interference . . . . . . . . . . . . . . . . . 18 86 6.1. Self-Censorship . . . . . . . . . . . . . . . . . . . . . 18 87 6.2. Domain Name Reallocation . . . . . . . . . . . . . . . . 19 88 6.3. Server Takedown . . . . . . . . . . . . . . . . . . . . . 19 89 6.4. Notice and Takedown . . . . . . . . . . . . . . . . . . . 19 90 7. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 19 91 8. Informative References . . . . . . . . . . . . . . . . . . . 19 92 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29 94 1. Introduction 96 Censorship occurs when an entity in a position of power - such as a 97 government, organization, or individual - suppresses communication 98 that it considers objectionable, harmful, sensitive, politically 99 incorrect, or inconvenient. (Although censors must ultimately impose 100 censorship through legal, military, or other means, this 101 document focuses largely on the technical mechanisms used to achieve 102 network censorship.)
104 This document describes the technical mechanisms that censorship 105 regimes around the world use to block or degrade Internet traffic 106 (see [RFC7754] for a discussion of Internet blocking and filtering in 107 terms of implications for Internet architecture, rather than end-user 108 access to content and services). 110 We describe three elements of Internet censorship: prescription, 111 identification, and interference. Prescription is the process by 112 which censors determine what types of material they should block, 113 e.g., deciding to block a list of pornographic websites. 114 Identification is the process by which censors classify specific 115 traffic to be blocked or impaired, e.g., blocking or impairing 116 all webpages containing "sex" in the title or traffic to sex.com. 117 Interference is the process by which the censor intercedes in 118 communication and prevents access to censored materials by blocking 119 access or impairing the connection. 121 2. Technical Prescription 123 Prescription is the process of figuring out what censors would like 124 to block [Glanville-2008]. Generally, censors aggregate the information 125 "to block" in blacklists or rely on real-time heuristic assessment of 126 content [Ding-1999]. There are indications that online censors are 127 starting to use machine learning techniques as well [Tang-2016]. 129 There are typically three types of blacklists: Keyword, Domain Name, 130 or Internet Protocol (IP) address. Keyword and Domain Name blocking 131 take place at the application level (e.g. HTTP), whereas IP blocking 132 tends to take place using routing data in TCP/IP headers. The 133 mechanisms for building up these blacklists are varied. Censors can 134 purchase "content control" software from private industry, such as 135 SmartFilter, which allows them to filter broad categories of content 136 they would like to block, such as gambling or pornography. In these 137 cases, these private services attempt to categorize every semi- 138 questionable website so as to allow for meta-tag blocking (similarly, 139 they tune real-time content heuristic systems to map their 140 assessments onto categories of objectionable content). 142 Countries that are more interested in retaining specific political 143 control, a desire which requires swift and decisive action, often 144 have ministries or organizations, such as the Ministry of Industry 145 and Information Technology in China or the Ministry of Culture and 146 Islamic Guidance in Iran, which maintain their own blacklists. 148 3. Technical Identification 150 3.1. Points of Control 152 Internet censorship, necessarily, takes place over a network. 153 Network design gives censors a number of different points of control 154 where they can identify the content they are interested in filtering. 155 An important aspect of pervasive technical interception is the 156 necessity to rely on software or hardware to intercept the content 157 the censor is interested in. This requirement, the need to have the 158 interception mechanism located somewhere, logically or physically, 159 implicates various general points of control: 161 o *Internet Backbone:* If a censor controls the gateways into a 162 region, they can filter undesirable traffic that is traveling into 163 and out of the region by packet sniffing and port mirroring at the 164 relevant exchange points.
Censorship at this point of control is 165 most effective at controlling the flow of information between a 166 region and the rest of the Internet, but is ineffective at 167 identifying content traveling between users within a region. 169 o *Internet Service Providers:* Internet Service Providers are 170 perhaps the most natural point of control. They have the benefit of 171 being easily enumerable by a censor, paired with the ability to 172 identify the regional and international traffic of all their 173 users. The censor's filtration mechanisms can be placed on an ISP 174 via governmental mandates, ownership, or voluntary/coercive 175 influence. 177 o *Institutions:* Private institutions such as corporations, 178 schools, and cyber cafes can put filtration mechanisms in place. 179 These mechanisms are occasionally at the request of a censor, but 180 are more often implemented to help achieve institutional goals, 181 such as preventing the viewing of pornography on school computers. 183 o *Personal Devices:* Censors can mandate that censorship software be 184 installed at the device level. This has many disadvantages in 185 terms of scalability, ease of circumvention, and operating system 186 requirements. The emergence of mobile devices exacerbates these 187 feasibility problems. 189 o *Services:* Application service providers can be pressured, 190 coerced, or legally required to censor specific content or flows 191 of data. Service providers naturally face incentives to maximize 192 their potential customer base, and the prospect of service shutdowns or 193 legal liability due to censorship efforts may seem much less 194 attractive than excluding particular content, users, or uses of 195 their service. 197 o *Certificate Authorities:* Authorities that issue 198 cryptographically secured resources can be a significant point of 199 control. Certificate Authorities that issue certificates to 200 domain holders for TLS/HTTPS or Regional/Local Internet Registries 201 that issue Route Origination Authorizations to BGP operators can 202 be forced to issue rogue certificates that may allow compromises 203 in confidentiality guarantees - allowing censorship software to 204 engage in identification and interference where not possible 205 before - or integrity guarantees - allowing, for example, 206 adversarial routing of traffic. 208 o *Content Distribution Networks (CDNs):* CDNs seek to collapse 209 network topology by locating content closer to the 210 service's users in order to improve quality of service. CDNs can 211 be powerful points of control for censors, especially if the 212 location of a CDN results in easier interference. 214 At all levels of the network hierarchy, the filtration mechanisms 215 used to detect undesirable traffic are essentially the same: a censor 216 sniffs transmitting packets and identifies undesirable content, and 217 then uses a blocking or shaping mechanism to prevent or impair 218 access. Identification of undesirable traffic can occur at the 219 application, transport, or network layer of the IP stack. Censors 220 are almost always concerned with web traffic, so the relevant 221 protocols tend to be filtered in predictable ways. For example, a 222 subversive image would make it past a keyword filter, but the 223 IP address of the site serving the image may be blacklisted when 224 identified as a provider of undesirable content. 226 3.2. Application Layer 228 3.2.1.
HTTP Request Header Identification 230 An HTTP header contains a lot of useful information for traffic 231 identification; although "Host" is the only required field in an HTTP 232 request header (for HTTP/1.1 and later), an HTTP method field is 233 necessary to do anything useful. As such, "method" and "Host" are 234 the two fields used most often for ubiquitous censorship. A censor 235 can sniff traffic and identify a specific domain name (host) and 236 usually a page name (GET /page) as well. This identification 237 technique is usually paired with TCP/IP header identification (see 238 Section 3.3.1) for a more robust method. 240 *Tradeoffs:* Request Identification is a technically straightforward 241 identification method that can be easily implemented at the backbone 242 or ISP level. The hardware needed for this sort of identification is 243 cheap and easy to acquire, making it desirable when budget and scope 244 are a concern. HTTPS will encrypt the relevant request and response 245 fields, so pairing with TCP/IP identification (see Section 3.3.1) is 246 necessary for filtering of HTTPS. However, some countermeasures such 247 as URL obfuscation [RSF-2005] can trivially defeat simple forms of 248 HTTP Request Header Identification. 250 *Empirical Examples:* Studies exploring censorship mechanisms have 251 found evidence of HTTP header/URL filtering in many countries, 252 including Bangladesh, Bahrain, China, India, Iran, Malaysia, 253 Pakistan, Russia, Saudi Arabia, South Korea, Thailand, and Turkey 254 [Verkamp-2012] [Nabi-2013] [Aryan-2012]. Commercial technologies 255 such as McAfee SmartFilter and NetSweeper are often purchased by 256 censors [Dalek-2013]. These commercial technologies use a 257 combination of HTTP Request Identification and TCP/IP Header 258 Identification to filter specific URLs. Dalek et al. and Jones et 259 al. identified the use of these products in the wild [Dalek-2013] 260 [Jones-2014]. 262 3.2.2. HTTP Response Header Identification 264 While HTTP Request Header Identification relies on the information 265 contained in the HTTP request from client to server, response 266 identification uses information sent by the server to the 267 client to identify undesirable content. 269 *Tradeoffs:* As with HTTP Request Header Identification, the 270 techniques used to identify HTTP traffic are well-known, cheap, and 271 relatively easy to implement, but are made useless by HTTPS, because 272 the response in HTTPS is encrypted, including headers. 274 The response fields are also less helpful for identifying content 275 than request fields, as "Server" could easily be identified using 276 HTTP Request Header identification, and "Via" is rarely relevant. 277 HTTP Response censorship mechanisms normally let the first n packets 278 through while the mirrored traffic is being processed; this may allow 279 some content through and the user may be able to detect that the 280 censor is actively interfering with undesirable content. 282 *Empirical Examples:* In 2009, Jong Park et al. at the University of 283 New Mexico demonstrated that the Great Firewall of China (GFW) used 284 this technique [Crandall-2010]. However, Jong Park et al. found that 285 the GFW discontinued this practice during the course of the study. 286 Due to the overlap in HTTP response filtering and keyword filtering 287 (see Section 3.2.3), it is likely that most censors rely on keyword 288 filtering over TCP streams instead of HTTP response filtering.
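Identification of the kind described in the two sections above amounts to parsing the request line and header fields of a reassembled client-to-server stream and matching them against a blacklist. The following is a minimal sketch in Python; the blacklist contents and the assumption that the payload holds a complete request head are illustrative, not drawn from any deployed product.

   # Sketch of HTTP request header identification (Section 3.2.1).
   # BLOCKED_HOSTS is a hypothetical blacklist; payload is assumed
   # to hold the start of a reassembled client-to-server stream.

   BLOCKED_HOSTS = {"blocked.example", "bad.example"}

   def censor_http_request(payload: bytes) -> bool:
       """Return True if the request matches the blacklist."""
       head = payload.split(b"\r\n\r\n", 1)[0].decode("ascii", "replace")
       lines = head.split("\r\n")
       method = lines[0].split(" ", 1)[0]          # e.g. "GET"
       host = ""
       for line in lines[1:]:
           name, _, value = line.partition(":")
           if name.strip().lower() == "host":
               host = value.strip().lower()
       return method in ("GET", "POST") and host in BLOCKED_HOSTS

A real system would pair this host check with TCP/IP header identification, as noted above, and would also have to handle pipelined requests and streams observed mid-connection.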
290 3.2.3. Instrumenting Content Providers 292 In addition to censorship by the state, many governments pressure 293 content providers to censor themselves. Due to the extensive reach 294 of government censorship, we define a content provider as any 295 service that provides utility to users, including everything from web 296 sites to locally installed programs. The defining factor of keyword 297 identification by content providers is that the content provider 298 itself detects restricted terms on its platform. The terms to 299 look for may be provided by the government, or the content provider 300 may be expected to compile its own list. 302 *Tradeoffs:* By instrumenting content providers to identify 303 restricted content, the censor can gain new information at the cost 304 of political capital with the companies it forces or encourages to 305 participate in censorship. For example, the censor can gain insight 306 about the content of encrypted traffic by coercing web sites to 307 identify restricted content, but this may drive away potential 308 investment. Coercing content providers may encourage self- 309 censorship, an additional advantage for censors. The tradeoffs for 310 instrumenting content providers are highly dependent on the content 311 provider and the requested assistance. 313 *Empirical Examples:* Researchers have discovered keyword 314 identification by content providers on platforms ranging from instant 315 messaging applications [Senft-2013] to search engines [Rushe-2015] 316 [Cheng-2010] [Whittaker-2013] [BBC-2013] [Condliffe-2013]. To 317 demonstrate the prevalence of this type of keyword identification, we 318 look to search engine censorship. 320 Search engine censorship demonstrates keyword identification by 321 content providers and can be regional or worldwide. Implementation 322 is occasionally voluntary, but is normally based on the laws and 323 regulations of the country in which a search engine operates. The 324 keyword blacklists are most likely maintained by the search engine 325 provider. China requires search engine providers to "voluntarily" 326 maintain search term blacklists to acquire/keep an Internet content 327 provider (ICP) license [Cheng-2010]. It is clear these blacklists 328 are maintained by each search engine provider based on the slight 329 variations in the intercepted searches [Zhu-2011] [Whittaker-2013]. 330 The United Kingdom has been pushing search engines to self-censor 331 with the threat of litigation if they don't do it themselves: Google 332 and Microsoft have agreed to block more than 100,000 queries in the U.K. 333 to help combat abuse [BBC-2013] [Condliffe-2013]. 335 Depending on the output, search engine keyword identification may be 336 difficult or easy to detect. In some cases specialized or blank 337 results provide a trivial enumeration mechanism, but more subtle 338 censorship can be difficult to detect. In February 2015, Microsoft's 339 search engine, Bing, was accused of censoring Chinese content outside 340 of China [Rushe-2015] because Bing returned different results for 341 censored terms in Chinese and English. However, it is possible that 342 censorship of the largest base of Chinese search users, China, biased 343 Bing's results so that the more popular results in China (the 344 uncensored results) were also more popular for Chinese speakers 345 outside of China.
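At its simplest, the keyword identification described above is a server-side check of user-submitted content against a term list. The sketch below, in Python, is illustrative only; the term list and screening function are hypothetical stand-ins for whatever a given provider actually deploys.

   # Sketch of keyword identification by a content provider
   # (Section 3.2.3).  RESTRICTED_TERMS stands in for a
   # government-supplied or provider-maintained term list.

   RESTRICTED_TERMS = ["restricted-term-1", "restricted-term-2"]

   def screen_content(text: str) -> bool:
       """Return True if a post or query should be withheld or flagged."""
       lowered = text.lower()
       return any(term in lowered for term in RESTRICTED_TERMS)

The slight variations in intercepted searches observed across Chinese search engines [Zhu-2011] are consistent with each provider maintaining its own such list.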
347 3.2.4. Deep Packet Inspection (DPI) Identification 349 Deep Packet Inspection has become computationally feasible as a 350 censorship mechanism in recent years [Wagner-2009]. Unlike other 351 techniques, DPI reassembles network flows to examine the application 352 "data" section, as opposed to only the header, and is therefore often 353 used for keyword identification. DPI also differs from other 354 identification technologies because it can leverage additional packet 355 and flow characteristics, e.g. packet sizes and timings, to identify 356 content. To prevent substantial quality of service (QoS) impacts, 357 DPI normally analyzes a copy of data while the original packets 358 continue to be routed. Typically, the traffic is split using either 359 a mirror switch or fiber splitter, and analyzed on a cluster of 360 machines running Intrusion Detection Systems (IDS) configured for 361 censorship. 363 *Tradeoffs:* DPI is one of the most expensive identification 364 mechanisms and can have a large QoS impact [Porter-2010]. When used 365 as a keyword filter for TCP flows, DPI systems can also cause major 366 overblocking problems. Like other techniques, DPI is less useful 367 against encrypted data, though DPI can leverage unencrypted elements 368 of an encrypted data flow (e.g., the Server Name Indication (SNI) sent 369 in the clear for TLS) or statistical information about an encrypted 370 flow (e.g., video takes more bandwidth than audio or textual forms of 371 communication) to identify traffic. 373 Other kinds of information can be inferred by comparing certain 374 unencrypted elements exchanged during TLS handshakes to similar data 375 points from known sources. This practice, called TLS fingerprinting, 376 allows a probabilistic identification of a party's operating system, 377 browser, or application based on a comparison of the specific 378 combinations of TLS version, ciphersuites, compression options, etc. 380 sent in the ClientHello message to similar signatures found in 381 unencrypted traffic [Husak-2016]. 383 Despite these problems, DPI is the most powerful identification 384 method and is widely used in practice. The Great Firewall of China 385 (GFW), the largest censorship system in the world, uses DPI to 386 identify restricted content over HTTP and DNS and inject TCP RSTs and 387 bad DNS responses, respectively, into connections [Crandall-2010] 388 [Clayton-2006] [Anonymous-2014]. 390 *Empirical Examples:* Several studies have found evidence of DPI 391 being used to censor content and tools. Clayton et al., Crandall et 392 al., Anonymous, and Khattak et al. all explored the GFW, and Khattak 393 et al. even probed the firewall to discover implementation details 394 like how much state it stores [Crandall-2010] [Clayton-2006] 395 [Anonymous-2014] [Khattak-2013]. The Tor project claims that China, 396 Iran, Ethiopia, and others must have used DPI to block the obfs2 397 protocol [Wilde-2012]. Malaysia has been accused of using targeted 398 DPI, paired with DDoS, to identify and subsequently knock out pro- 399 opposition material [Wagstaff-2013]. It also seems likely that 400 organizations not so worried about blocking content in real-time 401 could use DPI to sort and categorically search gathered traffic using 402 technologies such as NarusInsight [Hepting-2011].
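Stripped of flow reassembly and the IDS cluster, the core of keyword-based DPI is examining application payloads on a mirrored copy of the traffic. The sketch below uses the scapy Python library and, for brevity, inspects packets individually where a real deployment would reassemble TCP flows first; the interface name and keyword list are hypothetical.

   # Sketch of keyword DPI over mirrored traffic (Section 3.2.4).
   # "mirror0" stands in for an interface fed by a mirror switch
   # or fiber splitter; inspection happens out-of-band, so the
   # original packets continue to be routed.

   from scapy.all import sniff, TCP, Raw

   KEYWORDS = [b"restricted-term-1", b"restricted-term-2"]

   def inspect(pkt):
       if pkt.haslayer(TCP) and pkt.haslayer(Raw):
           data = bytes(pkt[Raw].load)
           if any(k in data for k in KEYWORDS):
               # Hand off to a blocking mechanism (Section 4),
               # e.g. an RST injector keyed on this flow.
               print("match:", pkt.sprintf("%IP.src% -> %IP.dst%"))

   sniff(iface="mirror0", prn=inspect, store=False)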
404 3.2.5. Server Name Indication 406 In encrypted connections using Transport Layer Security (TLS), there 407 may be servers that host multiple "virtual servers" at a given network 408 address, and the client will need to specify in the (unencrypted) 409 ClientHello message which domain name it seeks to connect to (so 410 that the server can respond with the appropriate TLS certificate) 411 using the Server Name Indication (SNI) TLS extension [RFC6066]. 412 Since SNI is sent in the clear, censors and filtering software can 413 use it as a basis for blocking, filtering, or impairment by dropping 414 connections to domains that match prohibited content (e.g., 415 bad.foo.com may be censored while good.foo.com is not) [Shbair-2015]. 417 Domain fronting has been one popular way to avoid identification by 418 censors [Fifield-2015]. Applications using domain fronting put a 419 different domain name in the 420 SNI extension than the one encrypted by HTTPS. The visible SNI would 421 indicate an unblocked domain, while the blocked domain remains hidden 422 in the encrypted application header. Some encrypted messaging 423 services relied on domain fronting to enable their provision in 424 countries employing SNI-based filtering. These services hid their 425 true domain names behind domains for which blocking at the 426 domain level would be undesirable. However, the 427 companies holding the most popular domains have since reconfigured 428 their software to prevent this practice. It may be possible to 429 achieve similar results using potential future options to encrypt SNI 430 in TLS 1.3. 432 *Tradeoffs:* Some clients do not send the SNI extension (e.g., 433 clients that only support versions of SSL and not TLS) or will fall 434 back to SSL if a TLS connection fails, rendering this method 435 ineffective. In addition, this technique requires deep packet 436 inspection techniques that can be computationally and 437 infrastructurally expensive, and improper configuration of an SNI- 438 based block can result in significant overblocking, e.g., when a 439 second-level domain like google.com is inadvertently blocked. In the 440 case of encrypted SNI, pressure to censor may transfer to other 441 points of intervention, such as content and application providers. 443 *Empirical Examples:* While there are many examples of security firms 444 that offer SNI-based filtering [Trustwave-2015] [Sophos-2015] 445 [Shbair-2015], the authors currently know of no specific examples or 446 reports of SNI-based filtering observed in the field used for 447 censorship purposes. 449 3.3. Transport Layer 451 3.3.1. TCP/IP Header Identification 453 TCP/IP Header Identification is the most pervasive, reliable, and 454 predictable type of identification. TCP/IP headers contain a few 455 invaluable pieces of information that must be transparent for traffic 456 to be successfully routed: the destination and source IP address and 457 port. The destination and source IP addresses are doubly useful: they 458 not only allow a censor to block undesirable content via IP blacklisting 459 but also allow a censor to identify the IP address of the user making the 460 request. Port is useful for whitelisting certain applications. 462 *Trade-offs:* TCP/IP identification is popular due to its simplicity, 463 availability, and robustness. 465 TCP/IP identification is trivial to implement, but difficult to 466 implement in backbone or ISP routers at scale, and is therefore 467 typically implemented with DPI.
Blacklisting an IP is equivalent to 468 installing a /32 route on a router, and due to limited flow table 469 space, this cannot scale beyond a few thousand IPs at most. IP 470 blocking is also relatively crude, leading to overblocking, and 471 cannot deal with some services, like Content Distribution Networks 472 (CDNs), that host content at hundreds or thousands of IP addresses. 473 Despite these limitations, IP blocking is extremely effective because 474 the user needs to proxy their traffic through another destination to 475 circumvent this type of identification. 477 Port blocking is generally not useful because many types of content 478 share the same port, and it is possible for censored applications to 479 change their port. For example, most HTTP traffic goes over port 80, 480 so the censor cannot differentiate between restricted and allowed 481 content solely on the basis of port. Port whitelisting is 482 occasionally used, where a censor limits communication to approved 483 ports, such as 80 for HTTP traffic, and is most effective when used in 484 conjunction with other identification mechanisms. For example, a 485 censor could block the default HTTPS port, port 443, thereby forcing 486 most users to fall back to HTTP. 488 3.3.2. Protocol Identification 490 Censors sometimes identify entire protocols to be blocked using a 491 variety of traffic characteristics. For example, Iran impairs the 492 performance of HTTPS traffic, a protocol that prevents further 493 analysis, to encourage users to switch to HTTP, a protocol that they 494 can analyze [Aryan-2012]. A simple protocol identification would be 495 to recognize all TCP traffic over port 443 as HTTPS, but more 496 sophisticated analysis of the statistical properties of payload data 497 and flow behavior would be more effective, even when port 443 is not 498 used [Hjelmvik-2010] [Sandvine-2014]. 500 If censors can detect circumvention tools, they can block them, so 501 censors like China are extremely interested in identifying the 502 protocols for censorship circumvention tools. In recent years, this 503 has devolved into an arms race between censors and circumvention tool 504 developers. As part of this arms race, China developed an extremely 505 effective protocol identification technique that researchers call 506 active probing or active scanning. 508 In active probing, the censor determines whether hosts are running a 509 circumvention protocol by trying to initiate communication using the 510 circumvention protocol. If the host and the censor successfully 511 negotiate a connection, then the censor conclusively knows that the host 512 is running a circumvention tool. China has used active scanning to 513 great effect to block Tor [Winter-2012]. 515 *Trade-offs:* Protocol Identification necessarily only provides 516 insight into the way information is traveling, and not the 517 information itself. 519 Protocol identification is useful for detecting and blocking 520 circumvention tools, like Tor, or traffic that is difficult to 521 analyze, like VoIP or SSL, because the censor can assume that this 522 traffic should be blocked. However, this can lead to over-blocking 523 problems when used with popular protocols. These methods are 524 expensive, both computationally and financially, due to the use of 525 statistical analysis, and can be ineffective due to their imprecise 526 nature.
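To make the distinction between naive and statistical protocol identification concrete, the following Python sketch combines a port check with crude flow statistics. The thresholds are illustrative, not taken from the literature; a deployed classifier would use many more features (timing, entropy, handshake byte patterns).

   # Sketch of simple protocol identification (Section 3.3.2):
   # first a naive port-based rule, then a crude statistical
   # fallback over observed payload sizes for a flow.

   from statistics import mean

   def looks_like_https(dst_port: int, payload_sizes: list) -> bool:
       """Heuristically label a flow as HTTPS/TLS."""
       if dst_port == 443:               # naive port-based rule
           return True
       # Fallback: TLS flows tend toward larger, variable-sized
       # records than plaintext chat or DNS.  Thresholds here are
       # illustrative only.
       return len(payload_sizes) >= 5 and mean(payload_sizes) > 500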
528 *Empirical Examples:* Protocol identification can be easy to detect 529 if it is conducted in real time and only a particular protocol is 530 blocked, but some types of protocol identification, like active 531 scanning, are much more difficult to detect. Protocol identification 532 has been used by Iran to identify and throttle SSH traffic to make it 533 unusable [Aryan-2012] and by China to identify and block Tor 534 relays [Winter-2012]. Protocol identification has also been used for 535 traffic management, such as the 2007 case where Comcast in the United 536 States used RST injection to interrupt BitTorrent traffic 537 [Anonymous-2007]. 539 4. Technical Interference 541 4.1. Application Layer 543 4.1.1. DNS Interference 545 There are a variety of mechanisms that censors can use to block or 546 filter access to content by altering responses from the DNS 547 [AFNIC-2013] [ICANN-SSAC-2012], including blocking the response, 548 replying with an error message, or responding with an incorrect 549 address. 551 "DNS mangling" is a network-level technique where an incorrect IP 552 address is returned in response to a DNS query to a censored 553 destination. An example of this is what the Chinese network does (we 554 are not aware of any other wide-scale uses of mangling). On the 555 Chinese network, every DNS request in transit is examined (presumably 556 by network inspection technologies such as DPI) and, if it matches a 557 censored domain, a false response is injected. End users can see 558 this technique in action by simply sending DNS requests to any unused 559 IP address in China (see example below). If it is not a censored 560 name, there will be no response. If it is censored, an erroneous 561 response will be returned. For example, using the command-line dig 562 utility to query an unused IP address in China of 113.113.113.113 for 563 the name "www.ietf.org" (uncensored at the time of writing) compared 564 with "www.facebook.com" (censored at the time of writing), we get an 565 erroneous IP address "37.61.54.158" as a response: 567 % dig +short +nodnssec @113.113.113.113 A www.ietf.org 568 ;; connection timed out; no servers could be reached 570 % dig +short +nodnssec @113.113.113.113 A www.facebook.com 571 37.61.54.158 572 There are also cases of what is colloquially called "DNS lying", 573 where a censor mandates that the DNS responses provided - by an 574 operator of a recursive resolver such as an Internet access provider 575 - be different than what authoritative name servers would provide 576 [Bortzmayer-2015]. 578 DNS cache poisoning refers to a mechanism where a censor interferes 579 with the response sent by an authoritative DNS server to a 580 recursive resolver by responding with an alternative IP address more 581 quickly than the authoritative server can respond [Halley-2008]. 582 Cache poisoning occurs after the requested site's name servers 583 resolve the request and attempt to forward the true IP back to the 584 requesting device; on the return route the resolved IP is recursively 585 cached by each DNS server that initially forwarded the request. 586 During this caching process, if an undesirable keyword is recognized, 587 the resolved IP is "poisoned" and an alternative IP (or NXDOMAIN 588 error) is returned more quickly than the upstream resolver can 589 respond, causing an erroneous IP address to be cached (and 590 potentially recursively so). The alternative IPs usually direct to a 591 nonsense domain or a warning page.
In Iran, by contrast, censorship 592 appears to block the communication en route, so that a response is 593 never sent [Aryan-2012]. 595 *Trade-offs:* These forms of DNS interference require the censor to 596 force a user to traverse a controlled DNS hierarchy (or an intervening 597 network on which the censor serves as an Active Pervasive Attacker 598 [RFC7624] that rewrites DNS responses) for the mechanism to be 599 effective. It can be circumvented by technically savvy users who 600 opt to use alternative DNS resolvers (such as the public DNS 601 resolvers provided by Google, OpenDNS, Telecomix, or FDN) or Virtual 602 Private Network technology. DNS mangling and cache poisoning also 603 imply returning an incorrect IP to those attempting to resolve a 604 domain name, but in some cases the destination may be technically 605 accessible; over HTTP, for example, the user may have another method 606 of obtaining the IP address of the desired site and may be able to 607 access it if the site is configured to be the default server 608 listening at this IP address. Target blocking has also been a 609 problem, as occasionally users outside of the censor's region will be 610 directed through DNS servers or DNS-rewriting network equipment 611 controlled by a censor, causing the request to fail. The ease of 612 circumvention paired with the large risk of content blocking and 613 target blocking makes DNS interference a partial, difficult, and less- 614 than-ideal censorship mechanism. 616 *Empirical Examples:* DNS interference, when properly implemented, is 617 easy to identify based on the shortcomings identified above. Turkey 618 relied on DNS interference for its country-wide block of websites 619 such as Twitter and YouTube for almost a week in March of 2014, but the 620 ease of circumvention resulted in an increase in the popularity of 621 Twitter until Turkish ISPs implemented an IP blacklist to achieve 622 the governmental mandate [Zmijewki-2014]. Ultimately, Turkish ISPs 623 started hijacking all requests to Google and Level 3's international 624 DNS resolvers [Zmijewki-2014]. DNS interference, when incorrectly 625 implemented, has resulted in some of the largest "censorship 626 disasters". In January 2014 China started directing all requests 627 passing through the Great Firewall to a single domain, 628 dongtaiwang.com, due to an improperly configured DNS poisoning 629 attempt; this incident is thought to be the largest Internet-service 630 outage in history [AFP-2014] [Anon-SIGCOMM12]. Countries such as 631 China, Iran, Turkey, and the United States have discussed blocking 632 entire TLDs as well, but only Iran has acted, blocking all Israeli 633 (.il) domains [Albert-2011]. 635 4.2. Transport Layer 637 4.2.1. Performance Degradation 639 While other interference techniques outlined in this section mostly 640 focus on blocking or preventing access to content, it can be an 641 effective censorship strategy in some cases not to block 642 access to a given destination or service entirely, but instead to degrade 643 the performance of the relevant network connection. The resulting user 644 experience for a site or service under performance degradation can be 645 so bad that users opt to use a different site, service, or method of 646 communication, or may not engage in communication at all if there are 647 no alternatives. Traffic shaping techniques that rate-limit the 648 bandwidth available to certain types of traffic are one example of 649 performance degradation; a minimal sketch follows.
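The token-bucket shaper below is a minimal Python sketch of such rate-limiting; the rate and burst parameters are illustrative, and a real shaper would sit in the forwarding path of the targeted traffic class.

   # Sketch of performance degradation by rate-limiting
   # (Section 4.2.1).  Packets of the targeted traffic class are
   # forwarded only while tokens remain; otherwise they are
   # delayed or dropped, degrading throughput.

   import time

   class TokenBucket:
       def __init__(self, rate_bps: float, burst: float):
           self.rate, self.capacity = rate_bps, burst
           self.tokens, self.last = burst, time.monotonic()

       def allow(self, nbytes: int) -> bool:
           now = time.monotonic()
           self.tokens = min(self.capacity,
                             self.tokens + (now - self.last) * self.rate)
           self.last = now
           if self.tokens >= nbytes:
               self.tokens -= nbytes
               return True
           return False        # delay or drop to degrade the flow

   # e.g. cap a targeted flow at roughly 16 kB/s:
   shaper = TokenBucket(rate_bps=16_000, burst=4_000)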
651 *Trade-offs:* While implementing a performance degradation will not 652 always eliminate the ability of people to access a desired resource, 653 it may force them to use other means of communication where 654 censorship (or surveillance) is more easily accomplished. 656 *Empirical Examples:* Iran is known to shape the bandwidth available 657 to HTTPS traffic to encourage unencrypted HTTP traffic [Aryan-2012]. 659 4.2.2. Packet Dropping 661 Packet dropping is a simple mechanism to prevent undesirable traffic. 662 The censor identifies undesirable traffic and chooses not to properly 663 forward any packets associated with that traffic, 664 rather than following its normal routing behavior. 665 This can be paired with any of the previously described mechanisms so 666 long as the censor knows the user must route traffic through a 667 controlled router. 669 *Trade-offs:* Packet Dropping is most successful when every 670 traversing packet has transparent information linked to undesirable 671 content, such as a destination IP. One downside Packet Dropping 672 suffers from is the necessity of blocking all content from otherwise 673 allowable IPs based on a single subversive sub-domain; blogging 674 services and GitHub repositories are good examples. China famously 675 dropped all GitHub packets for three days based on a single 676 repository hosting undesirable content [Anonymous-2013]. The need to 677 inspect every traversing packet in close to real time also makes 678 Packet Dropping somewhat challenging from a QoS perspective. 680 *Empirical Examples:* Packet Dropping is a very common form of 681 technical interference and lends itself to accurate detection given 682 the unique nature of the timed-out requests it leaves in its wake. 683 The Great Firewall of China uses packet dropping as one of its 684 primary mechanisms of technical censorship [Ensafi-2013]. Iran also 685 uses Packet Dropping as the mechanism for throttling SSH 686 [Aryan-2012]. These are but two examples of a ubiquitous censorship 687 practice. 689 4.2.3. RST Packet Injection 691 Packet injection, generally, refers to a man-in-the-middle (MITM) 692 network interference technique that spoofs packets in an established 693 traffic stream. RST packets are normally used to let one side of a TCP 694 connection know the other side has stopped sending information, and 695 thus the receiver should close the connection. RST Packet Injection 696 is a specific type of packet injection attack that is used to 697 interrupt an established stream by sending RST packets to both sides 698 of a TCP connection; as each receiver thinks the other has dropped 699 the connection, the session is terminated. 701 *Trade-offs:* RST Packet Injection has a few advantages that make it 702 extremely popular as a censorship technique. RST Packet Injection is 703 an out-of-band interference mechanism, allowing the censor to avoid 704 the QoS bottleneck one can encounter with inline techniques such as 705 Packet Dropping. This out-of-band property allows a censor to 706 inspect a copy of the information, usually mirrored by an optical 707 splitter, making it an ideal pairing for DPI and Protocol 708 Identification [Weaver-2009] (this asynchronous version of a MITM is 709 often called a Man-on-the-Side (MOTS)). RST Packet Injection also 710 has the advantage of only requiring one of the two endpoints to 711 accept the spoofed packet for the connection to be interrupted.
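The spoofed RST segment itself is easy to construct; the sketch below uses the scapy Python library with placeholder values drawn from the documentation address ranges, and the paragraphs that follow discuss why choosing an acceptable sequence number is the hard part for a real injector.

   # Sketch of RST Packet Injection (Section 4.2.3) using scapy.
   # All values are placeholders; a censor would take them from a
   # mirrored copy of the flow it wants to tear down.

   from scapy.all import IP, TCP, send

   def inject_rst(src, dst, sport, dport, seq):
       rst = IP(src=src, dst=dst) / TCP(sport=sport, dport=dport,
                                        flags="R", seq=seq)
       send(rst, verbose=False)

   # e.g. spoof an RST "from" the server toward the client:
   inject_rst("198.51.100.10", "203.0.113.20", 443, 51724, seq=0)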
713 The difficult part of RST Packet Injection is spoofing "enough" 714 correct information to ensure one endpoint accepts an RST packet as 715 legitimate; this generally implies a correct IP, port, and (TCP) 716 sequence number. The sequence number is the hardest to get correct, as 718 [RFC0793] specifies that an RST packet should be in-sequence to be 719 accepted, although the RFC also recommends allowing in-window packets 720 as "good enough". This in-window recommendation is important: if 721 it is implemented, it allows for successful Blind RST Injection 722 attacks [Netsec-2011]. When in-window sequencing is allowed, it is 723 trivial to conduct a Blind RST Injection; a blind injection implies that 724 the censor doesn't know any sensitive (encrypted) sequencing 725 information about the TCP stream it is injecting into, so it 726 simply enumerates the ~70000 possible windows. This is particularly 727 useful for interrupting encrypted/obfuscated protocols such as SSH or 728 Tor. RST Packet Injection relies on a stateful transport, making it 729 useless against UDP connections. RST Packet Injection is among the 730 most popular censorship techniques used today given its versatile 731 nature and effectiveness against all types of TCP traffic. 733 *Empirical Examples:* RST Packet Injection, as mentioned above, is 734 most often paired with identification techniques that require 735 splitting, such as DPI or Protocol Identification. In 2007 Comcast 736 was accused of using RST Packet Injection to interrupt traffic it 737 identified as BitTorrent [Schoen-2007]; this later led to a US 738 Federal Communications Commission ruling against Comcast 739 [VonLohmann-2008]. China has also been known to use RST Packet 740 Injection for censorship purposes. This interference is especially 741 evident in the interruption of encrypted/obfuscated protocols, such 742 as those used by Tor [Winter-2012]. 744 4.3. Multi-layer and Non-layer 746 4.3.1. Distributed Denial of Service (DDoS) 748 Distributed Denial of Service attacks are a common attack mechanism 749 used by "hacktivists" and malicious hackers, but censors have used 750 DDoS in the past for a variety of reasons. There is a huge variety 751 of DDoS attacks [Wikip-DoS], but at a high level two possible impacts 752 tend to occur: a flood attack results in the service being unusable 753 while resources are being spent to handle the flood, while a crash attack 754 aims to crash the service so resources can be reallocated elsewhere 755 without "releasing" the service. 757 *Trade-offs:* DDoS is an appealing mechanism when a censor would like 758 to prevent all access to undesirable content, rather than only access 759 within their region, albeit only for a limited period of time; this is 760 really the only uniquely beneficial feature of DDoS as a censorship 761 technique. The resources required to carry out a successful DDoS against 762 major targets are substantial, usually requiring the rental or 763 ownership of a malicious distributed platform such as a botnet, and the 764 technique is imprecise. DDoS is an incredibly crude censorship technique, 765 and appears to largely be used as a timely, easy-to-access mechanism for 766 blocking undesirable content for a limited period of time.
768 *Empirical Examples:* In 2012 the U.K.'s GCHQ used DDoS to 769 temporarily shut down IRC chat rooms frequented by members of 770 Anonymous using the SYN flood DDoS method; a SYN flood exploits the 771 handshake used by TCP to overload the victim server with so many 772 requests that legitimate traffic becomes slow or impossible 773 [Schone-2014] [CERT-2000]. Dissenting opinion websites are 774 frequently victims of DDoS around politically sensitive events in 775 Burma [Villeneuve-2011]. Controlling parties in Russia 776 [Kravtsova-2012], Zimbabwe [Orion-2013], and Malaysia 777 [Muncaster-2013] have been accused of using DDoS to interrupt 778 opposition support and access during elections. In 2015, China 779 launched a DDoS attack using a true MITM system collocated with the 780 Great Firewall, dubbed the "Great Cannon", that was able to inject 781 JavaScript code into web visits to a Chinese search engine and 782 commandeer those user agents to send DDoS traffic to various sites 783 [Marczak-2015]. 785 4.3.2. Network Disconnection or Adversarial Route Announcement 787 While it is perhaps the crudest of all censorship techniques, there 788 is no more effective way of making sure undesirable information isn't 789 allowed to propagate on the web than by shutting off the network. 790 The network can be logically cut off in a region when a censoring 791 body withdraws all of the Border Gateway Protocol (BGP) prefixes 792 routing through the censor's country. 794 *Trade-offs:* The impact of a network disconnection in a region is 795 huge and absolute; the censor pays for absolute control over digital 796 information with all the benefits the Internet brings. This is never 797 a long-term solution for any rational censor and is normally only 798 used as a last resort in times of substantial unrest. 800 *Empirical Examples:* Network disconnections tend to only happen in 801 times of substantial unrest, largely due to the huge social, 802 political, and economic impact such a move has. One of the first 803 highly covered occurrences was the junta in Myanmar employing 804 network disconnection to help junta forces quash a rebellion in 2007 805 [Dobie-2007]. China disconnected the network in the Xinjiang region 806 during unrest in 2009 in an effort to prevent the protests from 807 spreading to other regions [Heacock-2009]. The Arab Spring saw 808 the most frequent usage of network disconnection, with events in 809 Egypt and Libya in 2011 [Cowie-2011] [Cowie-2011b], and Syria in 2012 810 [Thomson-2012]. 812 5. Non-Technical Prescription 814 As the name implies, sometimes manpower is the easiest way to figure 815 out which content to block. Manual filtering differs from the common 816 tactic of building up blacklists in that it doesn't necessarily 817 target a specific IP address or domain name, but instead removes or flags 818 content. Given the imprecise nature of automatic filtering, manually 819 sorting through content and flagging dissenting websites, blogs, articles 820 and other media for filtration can be an effective technique. This 821 filtration can occur at the backbone/ISP level - China's army of 822 monitors is a good example [BBC-2013b] - but more commonly manual 823 filtering occurs at an institutional level. Internet Content 824 Providers, such as Google or Weibo, require a business license to 825 operate in China.
One of the prerequisites for a business license is 826 an agreement to sign a "voluntary pledge" known as the "Public Pledge 827 on Self-discipline for the Chinese Internet Industry". The failure 828 to "energetically uphold" the pledged values can lead to the ICPs 829 being held liable for the offending content by the Chinese government 830 [BBC-2013b]. 832 6. Non-Technical Interference 834 6.1. Self-Censorship 836 Self-censorship is one of the most interesting and effective types of 837 censorship; it is a mix of Bentham's Panopticon, cultural manipulation, 838 intelligence gathering, and meatspace enforcement. Simply put, self- 839 censorship is when a censor creates an atmosphere where users censor 840 themselves. This can be achieved through controlling information, 841 intimidating would-be dissidents, swaying public thought, and 842 creating apathy. Self-censorship is difficult to document, as when 843 it is implemented effectively the only noticeable trace is a lack 844 of undesirable content; instead one must look at the tools and 845 techniques used by censors to encourage self-censorship. Controlling 846 information relies on traditional censorship techniques, or on 847 forcing all users to connect through an intranet, as in North 848 Korea. Intimidation is often achieved by allowing Internet 849 users to post "whatever they want" but arresting those who post 850 about dissenting views; this technique is incredibly common 851 [Calamur-2013] [AP-2012] [Hopkins-2011] [Guardian-2014] 852 [Johnson-2010]. A good example of swaying public thought is China's 853 "50-Cent Party," composed of somewhere between 20,000 [Bristow-2013] 854 and 300,000 [Fareed-2008] contributors who are paid to "guide public 855 thought" on local and regional issues as directed by the Ministry of 856 Culture. Creating apathy can be a side-effect of successfully 857 controlling information over time and is ideal for a censorship 858 regime [Gao-2014]. 860 6.2. Domain Name Reallocation 862 As domain names are resolved recursively, if a TLD registry deregisters a 863 domain, all other DNS servers will be unable to properly resolve and 864 cache the site. Domain name reallocation is only really a risk where 865 undesirable content is hosted on a TLD controlled by the censoring 866 country, such as .cn or .ru [Anderson-2011], or where legal processes 867 in countries like the United States result in domain name seizures 868 and/or DNS redirection by the government [Kopel-2013]. 870 6.3. Server Takedown 872 Servers must have a physical location somewhere in the world. If 873 undesirable content is hosted in the censoring country, the servers 874 can be physically seized or the hosting provider can be required to 875 prevent access [Anderson-2011]. 877 6.4. Notice and Takedown 879 In some countries, legal mechanisms exist where an individual can 880 issue a legal request to a content host that requires the host to 881 take down content. Examples include the voluntary systems employed 882 by companies like Google to comply with "Right to be Forgotten" 883 policies in the European Union [Google-RTBF] and the copyright- 884 oriented notice and takedown regime of the United States Digital 885 Millennium Copyright Act (DMCA) Section 512 [DMLP-512]. 887 7. Contributors 889 This document benefited from discussions with Stephane Bortzmeyer, 890 Nick Feamster, and Martin Nilsson. 892 8.
Informative References 894 [AFNIC-2013] 895 AFNIC, "Report of the AFNIC Scientific Council: 896 Consequences of DNS-based Internet filtering", 2013, 897 . 900 [AFP-2014] 901 AFP, "China Has Massive Internet Breakdown Reportedly 902 Caused By Their Own Censoring Tools", 2014, 903 . 906 [Albert-2011] 907 Albert, K., "DNS Tampering and the new ICANN gTLD Rules", 908 2011, . 911 [Anderson-2011] 912 Anderson, R. and S. Murdoch, "Access Denied: Tools and 913 Technology of Internet Filtering", 2011, 914 . 917 [Anon-SIGCOMM12] 918 Anonymous, "The Collateral Damage of Internet Censorship 919 by DNS Injection", 2012, 920 . 923 [Anonymous-2007] 924 Anonymous, "How to Bypass Comcast's Bittorrent 925 Throttling", 2012, . 928 [Anonymous-2013] 929 Anonymous, "GitHub blocked in China - how it happened, how 930 to get around it, and where it will take us", 2013, 931 . 935 [Anonymous-2014] 936 Anonymous, "Towards a Comprehensive Picture of the Great 937 Firewall's DNS Censorship", 2014, 938 . 941 [AP-2012] Associated Press, "Sattar Beheshit, Iranian Blogger, Was 942 Beaten In Prison According To Prosecutor", 2012, 943 . 946 [Aryan-2012] 947 Aryan, S., Aryan, H., and J. Halderman, "Internet 948 Censorship in Iran: A First Look", 2012, 949 . 951 [BBC-2013] 952 BBC News, "Google and Microsoft agree steps to block abuse 953 images", 2013, . 955 [BBC-2013b] 956 BBC, "China employs two million microblog monitors state 957 media say", 2013, 958 . 960 [Bortzmayer-2015] 961 Bortzmayer, S., "DNS Censorship (DNS Lies) As Seen By RIPE 962 Atlas", 2015, 963 . 966 [Bristow-2013] 967 Bristow, M., "China's internet 'spin doctors'", 2013, 968 . 970 [Calamur-2013] 971 Calamur, K., "Prominent Egyptian Blogger Arrested", 2013, 972 . 975 [CERT-2000] 976 CERT, "TCP SYN Flooding and IP Spoofing Attacks", 2000, 977 . 980 [Cheng-2010] 981 Cheng, J., "Google stops Hong Kong auto-redirect as China 982 plays hardball", 2010, . 986 [Clayton-2006] 987 Clayton, R., "Ignoring the Great Firewall of China", 2006, 988 . 990 [Condliffe-2013] 991 Condliffe, J., "Google Announces Massive New Restrictions 992 on Child Abuse Search Terms", 2013, . 996 [Cowie-2011] 997 Cowie, J., "Egypt Leaves the Internet", 2011, 998 . 1001 [Cowie-2011b] 1002 Cowie, J., "Libyan Disconnect", 2011, 1003 . 1005 [Crandall-2010] 1006 Crandall, J., "Empirical Study of a National-Scale 1007 Distributed Intrusion Detection System: Backbone-Level 1008 Filtering of HTML Responses in China", 2010, 1009 . 1011 [Dalek-2013] 1012 Dalek, J., "A Method for Identifying and Confirming the 1013 Use of URL Filtering Products for Censorship", 2013, 1014 . 1017 [Ding-1999] 1018 Ding, C., Chi, C., Deng, J., and C. Dong, "Centralized 1019 Content-Based Web Filtering and Blocking: How Far Can It 1020 Go?", 1999, . 1023 [DMLP-512] 1024 Digital Media Law Project, "Protecting Yourself Against 1025 Copyright Claims Based on User Content", 2012, 1026 . 1029 [Dobie-2007] 1030 Dobie, M., "Junta tightens media screw", 2007, 1031 . 1033 [Ensafi-2013] 1034 Ensafi, R., "Detecting Intentional Packet Drops on the 1035 Internet via TCP/IP Side Channels", 2013, 1036 . 1038 [Fareed-2008] 1039 Fareed, M., "China joins a turf war", 2008, 1040 . 1043 [Fifield-2015] 1044 Fifield, D., Lan, C., Hynes, R., Wegmann, P., and V. 1045 Paxson, "Blocking-resistant communication through domain 1046 fronting", 2015, 1047 . 1049 [Gao-2014] 1050 Gao, H., "Tiananmen, Forgotten", 2014, 1051 . 1054 [Glanville-2008] 1055 Glanville, J., "The Big Business of Net Censorship", 2008, 1056 . 
1059 [Google-RTBF] 1060 Google, Inc., "Search removal request under data 1061 protection law in Europe", 2015, 1062 . 1065 [Guardian-2014] 1066 The Gaurdian, "Chinese blogger jailed under crackdown on 1067 'internet rumours'", 2014, 1068 . 1071 [Halley-2008] 1072 Halley, B., "How DNS cache poisoning works", 2014, 1073 . 1076 [Heacock-2009] 1077 Heacock, R., "China Shuts Down Internet in Xinjiang Region 1078 After Riots", 2009, . 1081 [Hepting-2011] 1082 Electronic Frontier Foundation, "Hepting vs. AT&T", 2011, 1083 . 1085 [Hjelmvik-2010] 1086 Hjelmvik, E., "Breaking and Improving Protocol 1087 Obfuscation", 2010, 1088 . 1090 [Hopkins-2011] 1091 Hopkins, C., "Communications Blocked in Libya, Qatari 1092 Blogger Arrested: This Week in Online Tyranny", 2011, 1093 . 1096 [Husak-2016] 1097 Husak, M., Cermak, M., Jirsik, T., and P. Celeda, "HTTPS 1098 traffic analysis and client identification using passive 1099 SSL/TLS fingerprinting", 2016, 1100 . 1103 [ICANN-SSAC-2012] 1104 ICANN Security and Stability Advisory Committee (SSAC), 1105 "SAC 056: SSAC Advisory on Impacts of Content Blocking via 1106 the Domain Name System", 2012, 1107 . 1110 [Johnson-2010] 1111 Johnson, L., "Torture feared in arrest of Iraqi blogger", 1112 2011, . 1115 [Jones-2014] 1116 Jones, B., "Automated Detection and Fingerprinting of 1117 Censorship Block Pages", 2014, 1118 . 1121 [Khattak-2013] 1122 Khattak, S., "Towards Illuminating a Censorship Monitor's 1123 Model to Facilitate Evasion", 2013, . 1127 [Kopel-2013] 1128 Kopel, K., "Operation Seizing Our Sites: How the Federal 1129 Government is Taking Domain Names Without Prior Notice", 1130 2013, . 1132 [Kravtsova-2012] 1133 Kravtsova, Y., "Cyberattacks Disrupt Opposition's 1134 Election", 2012, 1135 . 1138 [Marczak-2015] 1139 Marczak, B., Weaver, N., Dalek, J., Ensafi, R., Fifield, 1140 D., McKune, S., Rey, A., Scott-Railton, J., Deibert, R., 1141 and V. Paxson, "An Analysis of China's "Great Cannon"", 1142 2015, 1143 . 1146 [Muncaster-2013] 1147 Muncaster, P., "Malaysian election sparks web blocking/ 1148 DDoS claims", 2013, 1149 . 1152 [Nabi-2013] 1153 Nabi, Z., "The Anatomy of Web Censorship in Pakistan", 1154 2013, . 1157 [Netsec-2011] 1158 n3t2.3c, "TCP-RST Injection", 2011, 1159 . 1161 [Orion-2013] 1162 Orion, E., "Zimbabwe election hit by hacking and DDoS 1163 attacks", 2013, 1164 . 1167 [Porter-2010] 1168 Porter, T., "The Perils of Deep Packet Inspection", 2010, 1169 . 1172 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1173 RFC 793, DOI 10.17487/RFC0793, September 1981, 1174 . 1176 [RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) 1177 Extensions: Extension Definitions", RFC 6066, 1178 DOI 10.17487/RFC6066, January 2011, 1179 . 1181 [RFC7624] Barnes, R., Schneier, B., Jennings, C., Hardie, T., 1182 Trammell, B., Huitema, C., and D. Borkmann, 1183 "Confidentiality in the Face of Pervasive Surveillance: A 1184 Threat Model and Problem Statement", RFC 7624, 1185 DOI 10.17487/RFC7624, August 2015, 1186 . 1188 [RFC7754] Barnes, R., Cooper, A., Kolkman, O., Thaler, D., and E. 1189 Nordmark, "Technical Considerations for Internet Service 1190 Blocking and Filtering", RFC 7754, DOI 10.17487/RFC7754, 1191 March 2016, . 1193 [RSF-2005] 1194 Reporters Sans Frontieres, "Technical ways to get around 1195 censorship", 2005, . 1198 [Rushe-2015] 1199 Rushe, D., "Bing censoring Chinese language search results 1200 for users in the US", 2013, 1201 . 
1204 [Sandvine-2014] 1205 Sandvine, "Technology Showcase on Traffic Classification: 1206 Why Measurements and Freeform Policy Matter", 2014, 1207 . 1211 [Schoen-2007] 1212 Schoen, S., "EFF tests agree with AP: Comcast is forging 1213 packets to interfere with user traffic", 2007, 1214 . 1217 [Schone-2014] 1218 Schone, M., Esposito, R., Cole, M., and G. Greenwald, 1219 "Snowden Docs Show UK Spies Attacked Anonymous, Hackers", 1220 2014, . 1224 [Senft-2013] 1225 Senft, A., "Asia Chats: Analyzing Information Controls and 1226 Privacy in Asian Messaging Applications", 2013, 1227 . 1231 [Shbair-2015] 1232 Shbair, W., Cholez, T., Goichot, A., and I. Chrisment, 1233 "Efficiently Bypassing SNI-based HTTPS Filtering", 2015, 1234 . 1236 [Sophos-2015] 1237 Sophos, "Understanding Sophos Web Filtering", 2015, 1238 . 1241 [Tang-2016] 1242 Tang, C., "In-depth analysis of the Great Firewall of 1243 China", 2016, 1244 . 1247 [Thomson-2012] 1248 Thomson, I., "Syria Cuts off Internet and Mobile 1249 Communication", 2012, 1250 . 1253 [Trustwave-2015] 1254 Trustwave, "Filter: SNI extension feature and HTTPS 1255 blocking", 2015, 1256 . 1259 [Verkamp-2012] 1260 Verkamp, J. and M. Gupta, "Inferring Mechanics of Web 1261 Censorship Around the World", 2012, 1262 . 1265 [Villeneuve-2011] 1266 Villeneuve, N., "Open Access: Chapter 8, Control and 1267 Resistance, Attacks on Burmese Opposition Media", 2011, 1268 . 1271 [VonLohmann-2008] 1272 VonLohmann, F., "FCC Rules Against Comcast for BitTorrent 1273 Blocking", 2008, . 1276 [Wagner-2009] 1277 Wagner, B., "Deep Packet Inspection and Internet 1278 Censorship: International Convergence on an 'Integrated 1279 Technology of Control'", 2009, 1280 . 1284 [Wagstaff-2013] 1285 Wagstaff, J., "In Malaysia, online election battles take a 1286 nasty turn", 2013, 1287 . 1290 [Weaver-2009] 1291 Weaver, N., Sommer, R., and V. Paxson, "Detecting Forged 1292 TCP Packets", 2009, . 1295 [Whittaker-2013] 1296 Whittaker, Z., "1,168 keywords Skype uses to censor, 1297 monitor its Chinese users", 2013, 1298 . 1301 [Wikip-DoS] 1302 Wikipedia, "Denial of Service Attacks", 2016, 1303 . 1306 [Wilde-2012] 1307 Wilde, T., "Knock Knock Knockin' on Bridges Doors", 2012, 1308 . 1311 [Winter-2012] 1312 Winter, P., "How China is Blocking Tor", 2012, 1313 . 1315 [Zhu-2011] 1316 Zhu, T., "An Analysis of Chinese Search Engine Filtering", 1317 2011, 1318 . 1320 [Zmijewki-2014] 1321 Zmijewki, E., "Turkish Internet Censorship Takes a New 1322 Turn", 2014, . 1325 Authors' Addresses 1327 Joseph Lorenzo Hall 1328 CDT 1330 Email: joe@cdt.org 1332 Michael D. Aaron 1333 CU Boulder 1335 Email: michael.aaron@colorado.edu 1337 Stan Adams 1338 CDT 1340 Email: sadams@cdt.org 1342 Ben Jones 1343 Princeton 1345 Email: bj6@cs.princeton.edu 1347 Nick Feamster 1348 Princeton 1350 Email: feamster@cs.princeton.edu