idnits 2.17.1 draft-irtf-pearg-censorship-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) -- The document has examples using IPv4 documentation addresses according to RFC6890, but does not use any IPv6 documentation addresses. Maybe there should be IPv6 examples, too? Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (July 13, 2020) is 1383 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Unused Reference: 'AP-2012' is defined on line 1065, but no explicit reference was found in the text == Unused Reference: 'Bentham-1791' is defined on line 1084, but no explicit reference was found in the text == Unused Reference: 'Bristow-2013' is defined on line 1101, but no explicit reference was found in the text == Unused Reference: 'Calamur-2013' is defined on line 1105, but no explicit reference was found in the text == Unused Reference: 'Ellul-1973' is defined on line 1228, but no explicit reference was found in the text == Unused Reference: 'Fareed-2008' is defined on line 1244, but no explicit reference was found in the text == Unused Reference: 'Gao-2014' is defined on line 1255, but no explicit reference was found in the text == Unused Reference: 'Guardian-2014' is defined on line 1284, but no explicit reference was found in the text == Unused Reference: 'Hopkins-2011' is defined on line 1320, but no explicit reference was found in the text == Unused Reference: 'Johnson-2010' is defined on line 1361, but no explicit reference was found in the text == Unused Reference: 'Kopel-2013' is defined on line 1383, but no explicit reference was found in the text == Unused Reference: 'RSF-2005' is defined on line 1504, but no explicit reference was found in the text == Outdated reference: A later version (-34) exists of draft-ietf-quic-transport-29 == Outdated reference: A later version (-18) exists of draft-ietf-tls-esni-07 -- Obsolete informational reference (is this intentional?): RFC 793 (Obsoleted by RFC 9293) Summary: 2 errors (**), 0 flaws (~~), 15 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 pearg J. 
Hall 3 Internet-Draft Internet Society 4 Intended status: Informational M. Aaron 5 Expires: January 14, 2021 CU Boulder 6 S. Adams 7 CDT 8 A. Andersdotter 10 B. Jones 11 Princeton 12 N. Feamster 13 U Chicago 14 July 13, 2020 16 A Survey of Worldwide Censorship Techniques 17 draft-irtf-pearg-censorship-04 19 Abstract 21 This document describes technical mechanisms censorship regimes 22 around the world use for blocking or impairing Internet traffic. It 23 aims to make designers, implementers, and users of Internet protocols 24 aware of the properties exploited and mechanisms used for censoring 25 end-user access to information. This document makes no suggestions 26 on individual protocol considerations, and is purely informational, 27 intended as a reference. 29 Status of This Memo 31 This Internet-Draft is submitted in full conformance with the 32 provisions of BCP 78 and BCP 79. 34 Internet-Drafts are working documents of the Internet Engineering 35 Task Force (IETF). Note that other groups may also distribute 36 working documents as Internet-Drafts. The list of current Internet- 37 Drafts is at https://datatracker.ietf.org/drafts/current/. 39 Internet-Drafts are draft documents valid for a maximum of six months 40 and may be updated, replaced, or obsoleted by other documents at any 41 time. It is inappropriate to use Internet-Drafts as reference 42 material or to cite them other than as "work in progress." 44 This Internet-Draft will expire on January 14, 2021. 46 Copyright Notice 48 Copyright (c) 2020 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (https://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with respect 56 to this document. 
Code Components extracted from this document must 57 include Simplified BSD License text as described in Section 4.e of 58 the Trust Legal Provisions and are provided without warranty as 59 described in the Simplified BSD License.
61 Table of Contents
63 1. Introduction
64 2. Terminology
65 3. Technical Prescription
66 4. Technical Identification
67    4.1. Points of Control
68    4.2. Application Layer
69       4.2.1. HTTP Request Header Identification
70       4.2.2. HTTP Response Header Identification
71       4.2.3. Instrumenting Content Distributors
72       4.2.4. Deep Packet Inspection (DPI) Identification
73    4.3. Transport Layer
74       4.3.1. Shallow Packet Inspection and Transport Header Identification
76       4.3.2. Protocol Identification
77 5. Technical Interference
78    5.1. Application Layer
79       5.1.1. DNS Interference
80    5.2. Transport Layer
81       5.2.1. Performance Degradation
82       5.2.2. Packet Dropping
83       5.2.3. RST Packet Injection
84    5.3. Multi-layer and Non-layer
85       5.3.1. Distributed Denial of Service (DDoS)
86       5.3.2. Network Disconnection or Adversarial Route Announcement
88 6. Non-Technical Interference
89    6.1. Manual Filtering
90    6.2. Self-Censorship
91    6.3. Server Takedown
92    6.4. Notice and Takedown
93    6.5. Domain-Name Seizures
95 7. Contributors
96 8. Informative References
97 Authors' Addresses
99 1. Introduction 101 Censorship is where an entity in a position of power - such as a 102 government, organization, or individual - suppresses communication 103 that it considers objectionable, harmful, sensitive, politically 104 incorrect or inconvenient [WP-Def-2020]. Although censors that 105 engage in censorship must do so through legal, military, or other 106 means, this document focuses largely on technical mechanisms used to 107 achieve network censorship. 109 This document describes technical mechanisms that censorship regimes 110 around the world use for blocking or impairing Internet traffic. See 111 [RFC7754] for a discussion of Internet blocking and filtering in 112 terms of implications for Internet architecture, rather than end-user 113 access to content and services. There is also a growing field of 114 academic study of censorship circumvention (see the review article of 115 [Tschantz-2016]), results from which we seek to make relevant here 116 for protocol designers and implementers. 118 2. Terminology 120 We describe three elements of Internet censorship: prescription, 121 identification, and interference. The document contains three major 122 sections, each corresponding to one of these elements. Prescription 123 is the process by which censors determine what types of material they 124 should censor, e.g., classifying pornographic websites as 125 undesirable.
Identification is the process by which censors classify 126 specific traffic or traffic identifiers to be blocked or impaired, 127 e.g., deciding that webpages containing "sex" in an HTTP header or 128 that accept traffic through the URL www.sex.example are likely to be 129 undesirable. Interference is the process by which censors intercede 130 in communication and prevent access to censored materials by 131 blocking access or impairing the connection, e.g., implementing a 132 technical solution capable of identifying HTTP headers or URLs and 133 ensuring they are rendered wholly or partially inaccessible. 135 3. Technical Prescription 137 Prescription is the process of figuring out what censors would like 138 to block [Glanville-2008]. Generally, censors aggregate information 139 "to block" in blocklists or use real-time heuristic assessment of 140 content [Ding-1999]. Some national networks are designed to more 141 naturally serve as points of control [Leyba-2019]. There are also 142 indications that online censors use probabilistic machine learning 143 techniques [Tang-2016]. Indeed, web crawling and machine learning 144 techniques are an active research area in the effort to identify 145 content deemed as morally or commercially harmful to companies or 146 consumers in some jurisdictions [SIDN2020]. 148 There are typically three types of blocklist elements: keyword, 149 domain name, or Internet Protocol (IP) address. Keyword and domain 150 name blocking take place at the application level, e.g., HTTP, 151 whereas IP blocking tends to take place using IP addresses in IPv4/ 152 IPv6 headers. The mechanisms for building up these blocklists vary. 153 Censors can purchase from private industry "content control" 154 software, such as SmartFilter, which lets censors filter traffic from 155 broad categories they would like to block, such as gambling or 156 pornography [Knight-2005].
In these cases, these private services 157 attempt to categorize every semi-questionable website so as to allow for 158 meta-tag blocking. Similarly, they tune real-time content heuristic 159 systems to map their assessments onto categories of objectionable 160 content. 162 Countries that are more interested in retaining specific political 163 control typically have ministries or organizations that maintain 164 blocklists. Examples include the Ministry of Industry and 165 Information Technology in China, the Ministry of Culture and Islamic 166 Guidance in Iran, and organizations specific to copyright in France [HADOPI-2020] 167 and to consumer protection law across the EU [Reda-2017]. 169 4. Technical Identification 171 4.1. Points of Control 173 Internet censorship takes place in all parts of the network topology. 174 It may be implemented in the network itself (e.g., local loop or 175 backhaul), on the services side of communication (e.g., web hosts, 176 cloud providers or content delivery networks), in the ancillary 177 services ecosystem (e.g., domain name system or certificate 178 authorities) or on the end-client side (e.g., in an end-user device 179 such as a smartphone, laptop or desktop or software executed on such 180 devices). An important aspect of pervasive technical interception is 181 the necessity to rely on software or hardware to intercept the 182 content the censor is interested in. There are various logical and 183 physical points of control censors may use for interception 184 mechanisms, including, though not limited to, the following. 186 o Internet Backbone: If a censor controls the gateways into a 187 region, they can filter undesirable traffic that is traveling into 188 and out of the region by packet sniffing and port mirroring at the 189 relevant exchange points.
Censorship at this point of control is 190 most effective at controlling the flow of information between a 191 region and the rest of the Internet, but is ineffective at 192 identifying content traveling between the users within a region. 193 Some national network designs naturally serve as more effective 194 chokepoints and points of control [Leyba-2019]. 196 o Internet Service Providers: Internet Service Providers are 197 frequently exploited points of control. They have the benefit of 198 being easily enumerable by a censor - often falling under the 199 jurisdictional or operational control of a censor in an 200 indisputable way - with the additional feature that an ISP can 201 identify the regional and international traffic of all their 202 users. The censor's filtration mechanisms can be placed on an ISP 203 via governmental mandates, ownership, or voluntary/coercive 204 influence. 206 o Institutions: Private institutions such as corporations, schools, 207 and Internet cafes can use filtration mechanisms. These 208 mechanisms are occasionally at the request of a government censor, 209 but can also be implemented to help achieve institutional goals, 210 such as fostering a particular moral outlook on life by school- 211 children, independent of broader society or government goals. 213 o Content Distribution Networks (CDNs): CDNs seek to collapse 214 network topology in order to better locate content closer to the 215 service's users. This reduces content transmission latency and 216 improves quality of service. The CDN service's content servers, 217 located "close" to the user in a network-sense, can be powerful 218 points of control for censors, especially if the location of CDN 219 content repositories allow for easier interference. 221 o Certificate Authorities (CAs) for Public-Key Infrastructures 222 (PKIs): Authorities that issue cryptographically secured resources 223 can be a significant point of control. 
CAs that issue 224 certificates to domain holders for TLS/HTTPS (the Web PKI) or 225 Regional/Local Internet Registries (RIRs) that issue Route 226 Origination Authorizations (ROAs) to BGP operators can be forced 227 to issue rogue certificates that may allow compromise, i.e., by 228 allowing censorship software to engage in identification and 229 interference where not possible before. CAs may also be forced to 230 revoke certificates. This may lead to adversarial traffic routing 231 or TLS interception being allowed, or an otherwise rightful origin 232 or destination point of traffic flows being unable to communicate 233 in a secure way. 235 o Services: Application service providers can be pressured, coerced, 236 or legally required to censor specific content or data flows. 237 Service providers naturally face incentives to maximize their 238 potential customer base, and potential service shutdowns or legal 239 liability due to censorship efforts may seem much less attractive 240 than potentially excluding content, users, or uses of their 241 service. Services have increasingly become focal points of 242 censorship discussions, as well as the focus of discussions of 243 moral imperatives to use censorship tools. 245 o Personal Devices: Censors can mandate censorship software be 246 installed at the device level. This has many disadvantages in 247 terms of scalability, ease-of-circumvention, and operating system 248 requirements. (Of course, if a personal device is treated with 249 censorship software before sale and this software is difficult to 250 reconfigure, this may work in favor of those seeking to control 251 information, say for children, students, customers, or employees.) 252 The emergence of mobile devices exacerbates these feasibility 253 problems. This software can also be mandated by institutional 254 actors acting on non-governmentally mandated moral imperatives.
256 At all levels of the network hierarchy, the filtration mechanisms 257 used to censor undesirable traffic are essentially the same: a censor 258 either directly identifies undesirable content using the identifiers 259 described below and then uses a blocking or shaping mechanism such as 260 the ones exemplified below to prevent or impair access, or requests 261 that an actor ancillary to the censor, such as a private entity, 262 perform these functions. Identification of undesirable traffic can 263 occur at the application, transport, or network layer of the IP 264 stack. Censors often focus on web traffic, so the relevant protocols 265 tend to be filtered in predictable ways (see Section 4.2.1 and 266 Section 4.2.2). For example, a subversive image might make it past a 267 keyword filter. However, if the image is later deemed undesirable, a 268 censor may then blocklist the provider site's IP address. 270 4.2. Application Layer 272 The following subsections describe properties and tradeoffs of common 273 ways in which censors filter using application-layer information. 274 Each subsection includes empirical examples describing these common 275 behaviors for further reference. 277 4.2.1. HTTP Request Header Identification 279 An HTTP header contains a lot of useful information for traffic 280 identification. Although "host" is the only required field in an 281 HTTP request header (for HTTP/1.1 and later), an HTTP method field is 282 necessary to do anything useful. As such, "method" and "host" are 283 the two fields used most often for ubiquitous censorship. A censor 284 can sniff traffic and identify a specific domain name (host) and 285 usually a page name (GET /page) as well. This identification 286 technique is usually paired with transport header identification (see 287 Section 4.3.1) for a more robust method.
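The matching step described above can be illustrated with a minimal sketch in Python. It is not taken from any deployed system: the blocklist contents and the function names parse_request_head and is_flagged are hypothetical, and it assumes the observer sees a complete cleartext HTTP/1.x request head in a single buffer.

```python
# Hypothetical blocklists; every entry is illustrative only.
BLOCKED_HOSTS = {"bad.foo.example"}
BLOCKED_URLS = {("www.foo.example", "/page")}


def parse_request_head(raw: bytes):
    """Return (method, target, host) for a cleartext HTTP/1.x request, or None."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    # A request line has exactly three parts: method, target, version.
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    method, target, _version = parts
    host = None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            host = value.strip().lower()
            break
    return method, target, host


def is_flagged(raw: bytes) -> bool:
    """True if the Host value, or the (host, target) pair, is on a blocklist."""
    parsed = parse_request_head(raw)
    if parsed is None or parsed[2] is None:
        return False
    _method, target, host = parsed
    # Drop an optional ":port" suffix (this sketch ignores IPv6 literals).
    hostname = host.partition(":")[0]
    return hostname in BLOCKED_HOSTS or (hostname, target) in BLOCKED_URLS
```

Because the sketch only reads cleartext bytes, it returns no match for an HTTPS connection, which is why this technique is paired with transport header identification.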
289 Tradeoffs: Request Identification is a technically straightforward 290 identification method that can be easily implemented at the Backbone 291 or ISP level. The hardware needed for this sort of identification is 292 cheap and easy to acquire, making it desirable when budget and scope 293 are a concern. HTTPS will encrypt the relevant request and response 294 fields, so pairing with transport identification (see Section 4.3.1) 295 is necessary for HTTPS filtering. However, some countermeasures can 296 trivially defeat simple forms of HTTP Request Header Identification. 297 For example, two cooperating endpoints - an instrumented web server 298 and client - could encrypt or otherwise obfuscate the "host" header 299 in a request, potentially thwarting techniques that match against 300 "host" header values. 302 Empirical Examples: Studies exploring censorship mechanisms have 303 found evidence of HTTP header/URL filtering in many countries, 304 including Bangladesh, Bahrain, China, India, Iran, Malaysia, 305 Pakistan, Russia, Saudi Arabia, South Korea, Thailand, and Turkey 306 [Verkamp-2012] [Nabi-2013] [Aryan-2012]. Commercial technologies 307 such as the McAfee SmartFilter and NetSweeper are often purchased by 308 censors [Dalek-2013]. These commercial technologies use a 309 combination of HTTP Request Identification and Transport Header 310 Identification to filter specific URLs. Dalek et al. and Jones et 311 al. identified the use of these products in the wild [Dalek-2013] 312 [Jones-2014]. 314 4.2.2. HTTP Response Header Identification 316 While HTTP Request Header Identification relies on the information 317 contained in the HTTP request from client to server, response 318 identification uses information sent in response by the server to 319 client to identify undesirable content. 321 Tradeoffs: As with HTTP Request Header Identification, the techniques 322 used to identify HTTP traffic are well-known, cheap, and relatively 323 easy to implement.
However, they are made useless by HTTPS because 324 HTTPS encrypts the response and its headers. 326 The response fields are also less helpful for identifying content 327 than request fields, as "Server" could easily be identified using 328 HTTP Request Header identification, and "Via" is rarely relevant. 329 HTTP Response censorship mechanisms normally let the first n packets 330 through while the mirrored traffic is being processed; this may allow 331 some content through and the user may be able to detect that the 332 censor is actively interfering with undesirable content. 334 Empirical Examples: In 2009, Jong Park et al. at the University of 335 New Mexico demonstrated that the Great Firewall of China (GFW) has 336 used this technique [Crandall-2010]. However, Jong Park et al. found 337 that the GFW discontinued this practice during the course of the 338 study. Due to the overlap in HTTP response filtering and keyword 339 filtering (see Section 4.2.3), it is likely that most censors rely on 340 keyword filtering over TCP streams instead of HTTP response 341 filtering. 343 4.2.3. Instrumenting Content Distributors 345 Many governments pressure content providers to censor themselves, or 346 provide the legal framework within which content distributors are 347 incentivized to follow the content restriction preferences of agents 348 external to the content distributor [Boyle-1997]. Due to the 349 extensive reach of such censorship, we define content distributor as 350 any service that provides utility to users, including everything from 351 web sites to locally installed programs. A commonly used method of 352 instrumenting content distributors consists of keyword identification 353 to detect restricted terms on their platform. Governments may 354 provide the terms on such keyword lists. Alternatively, the content 355 provider may be expected to come up with their own list. 
A different 356 method of instrumenting content distributors consists of requiring a 357 distributor to disassociate with some categories of users. See also 358 Section 6.4. 360 Tradeoffs: By instrumenting content distributors to identify 361 restricted content or content providers, the censor can gain new 362 information at the cost of political capital with the companies it 363 forces or encourages to participate in censorship. For example, the 364 censor can gain insight about the content of encrypted traffic by 365 coercing web sites to identify restricted content. Coercing content 366 distributors to regulate users, categories of users, content and 367 content providers may encourage users and content providers to 368 exhibit self-censorship, an additional advantage for censors (see 369 Section 6.2). The tradeoffs for instrumenting content distributors 370 are highly dependent on the content provider and the requested 371 assistance. A typical concern is that the targeted keywords or 372 categories of users are too broad, risk being too broadly applied, or 373 are not subjected to a sufficiently robust legal process prior to 374 their mandatory application (see p. 8 of [EC-2012]). 376 Empirical Examples: Researchers discovered keyword identification by 377 content providers on platforms ranging from instant messaging 378 applications [Senft-2013] to search engines [Rushe-2015] [Cheng-2010] 379 [Whittaker-2013] [BBC-2013] [Condliffe-2013]. To demonstrate the 380 prevalence of this type of keyword identification, we look to search 381 engine censorship. 383 Search engine censorship demonstrates keyword identification by 384 content providers and can be regional or worldwide. Implementation 385 is occasionally voluntary, but normally it is based on laws and 386 regulations of the country a search engine is operating in. The 387 keyword blocklists are most likely maintained by the search engine 388 provider. 
China is known to require search engine providers to 389 "voluntarily" maintain search term blocklists to acquire and keep an 390 Internet content provider (ICP) license [Cheng-2010]. It is clear 391 these blocklists are maintained by each search engine provider based 392 on the slight variations in the intercepted searches [Zhu-2011] 393 [Whittaker-2013]. The United Kingdom has been pushing search engines 394 to self-censor with the threat of litigation if they do not do it 395 themselves: Google and Microsoft have agreed to block more than 396 100,000 queries in the U.K. to help combat abuse [BBC-2013] 397 [Condliffe-2013]. European Union law, as well as US law, requires 398 modification of search engine results in response to 399 copyright, trademark, data protection or defamation concerns 400 [EC-2012]. 402 Depending on the output, search engine keyword identification may be 403 difficult or easy to detect. In some cases specialized or blank 404 results provide a trivial enumeration mechanism, but more subtle 405 censorship can be difficult to detect. In February 2015, Microsoft's 406 search engine, Bing, was accused of censoring Chinese content outside 407 of China [Rushe-2015] because Bing returned different results for 408 censored terms in Chinese and English. However, it is possible that 409 censorship of the largest base of Chinese search users, China, biased 410 Bing's results so that the more popular results in China (the 411 uncensored results) were also more popular for Chinese speakers 412 outside of China. 414 Disassociation by content distributors from certain categories of 415 users has happened, for instance, in Spain, as a result of the conflict 416 between the Catalan independence movement and the Spanish legal 417 presumption of a unitary state [Lomas-2019]. E-sport event 418 organizers have also disassociated themselves from top players who 419 expressed political opinions in relation to the 2019 Hong Kong 420 protests [Victor-2019].
See also Section 5.3.2. 422 4.2.4. Deep Packet Inspection (DPI) Identification 424 Deep packet inspection (DPI) is technically any kind of packet 425 analysis beyond IP address and port number, and it has become 426 computationally feasible as a component of censorship mechanisms in 427 recent years [Wagner-2009]. Unlike other techniques, DPI reassembles 428 network flows to examine the application "data" section, as opposed 429 to only headers, and is therefore often used for keyword 430 identification. DPI also differs from other identification 431 technologies because it can leverage additional packet and flow 432 characteristics, e.g., packet sizes and timings, when identifying 433 content. To prevent substantial quality of service (QoS) impacts, 434 DPI normally analyzes a copy of data while the original packets 435 continue to be routed. Typically, the traffic is split using either 436 a mirror switch or fiber splitter, and analyzed on a cluster of 437 machines running Intrusion Detection Systems (IDS) configured for 438 censorship. 440 Tradeoffs: DPI is one of the most expensive identification mechanisms 441 and can have a large QoS impact [Porter-2010]. When used as a 442 keyword filter for TCP flows, DPI systems can also cause major 443 overblocking problems. Like other techniques, DPI is less useful 444 against encrypted data, though DPI can leverage unencrypted elements 445 of an encrypted data flow, e.g., the Server Name Indication (SNI) 446 sent in the clear for TLS, or metadata about an encrypted flow, e.g., 447 packet sizes, which differ across video and textual flows, to 448 identify traffic. See Section 4.2.4.1 for more information about 449 SNI-based filtration mechanisms. 451 Other kinds of information can be inferred by comparing certain 452 unencrypted elements exchanged during TLS handshakes to similar data 453 points from known sources.
This practice, called TLS fingerprinting, 454 allows a probabilistic identification of a party's operating system, 455 browser, or application based on a comparison of the specific 456 combinations of TLS version, ciphersuites, compression options, etc. 457 sent in the ClientHello message to similar signatures found in 458 unencrypted traffic [Husak-2016]. 460 Despite these problems, DPI is the most powerful identification 461 method and is widely used in practice. The Great Firewall of China 462 (GFW), the largest censorship system in the world, uses DPI to 463 identify restricted content over HTTP and DNS and inject TCP RSTs and 464 bad DNS responses, respectively, into connections [Crandall-2010] 465 [Clayton-2006] [Anonymous-2014]. 467 Empirical Examples: Several studies have found evidence of censors 468 using DPI for censoring content and tools. Clayton et al., Crandall 469 et al., Anonymous, and Khattak et al. all explored the GFW 470 [Crandall-2010] [Clayton-2006] [Anonymous-2014]. Khattak et al. even 471 probed the firewall to discover implementation details like how much 472 state it stores [Khattak-2013]. The Tor Project claims that China, 473 Iran, Ethiopia, and others must have used DPI to block the obfs2 474 protocol [Wilde-2012]. Malaysia has been accused of using targeted 475 DPI, paired with DDoS, to identify and subsequently attack pro- 476 opposition material [Wagstaff-2013]. It also seems likely that 477 organizations not so worried about blocking content in real time 478 could use DPI to sort and categorically search gathered traffic using 479 technologies such as NarusInsight [Hepting-2011]. 481 4.2.4.1.
Server Name Indication 483 In encrypted connections using Transport Layer Security (TLS), there 484 may be servers that host multiple "virtual servers" at a given 485 network address, and the client will need to specify in the 486 (unencrypted) ClientHello message which domain name it seeks to 487 connect to (so that the server can respond with the appropriate TLS 488 certificate) using the Server Name Indication (SNI) TLS extension 489 [RFC6066]. Since SNI is often sent in the clear (as are the cert 490 fields sent in response), censors and filtering software can use it 491 (and response cert fields) as a basis for blocking, filtering, or 492 impairment by dropping connections to domains that match prohibited 493 content (e.g., bad.foo.example may be censored while good.foo.example 494 is not) [Shbair-2015]. There are ongoing standardization efforts 495 in the TLS Working Group to encrypt SNI [I-D.ietf-tls-sni-encryption] 496 [I-D.ietf-tls-esni], and recent research shows promising results in 497 the use of encrypted SNI in the face of SNI-based filtering 498 [Chai-2019]. 500 Domain fronting has been one popular way to avoid identification by 501 censors [Fifield-2015]. 502 Applications using domain fronting put a different domain name in the 503 SNI extension than in the Host: header, which is protected by HTTPS. 504 The visible SNI would indicate an unblocked domain, while the blocked 505 domain remains hidden in the encrypted application header. Some 506 encrypted messaging services relied on domain fronting to enable 507 their provision in countries employing SNI-based filtering. These 508 services used the cover provided by domains for which blocking at the 509 domain level would be undesirable to hide their true domain names. 510 However, the companies holding the most popular domains have since 511 reconfigured their software to prevent this practice.
It may be 512 possible to achieve similar results using potential future options to 513 encrypt SNI. 515 Tradeoffs: Some clients do not send the SNI extension (e.g., clients 516 that only support versions of SSL and not TLS), rendering this method 517 ineffective. In addition, this technique requires deep packet 518 inspection techniques that can be computationally and 519 infrastructurally expensive, and improper configuration of an SNI- 520 based block can result in significant overblocking, e.g., when a 521 second-level domain like populardomain.example is inadvertently 522 blocked. In the case of encrypted SNI, pressure to censor may 523 transfer to other points of intervention, such as content and 524 application providers. 526 Empirical Examples: There are many examples of security firms that 527 offer SNI-based filtering products [Trustwave-2015] [Sophos-2015] 528 [Shbair-2015], and the governments of China, Egypt, Iran, Qatar, 529 South Korea, Turkey, Turkmenistan, and the UAE all perform widespread SNI 530 filtering or blocking [OONI-2018] [OONI-2019] [NA-SK-2019] 531 [CitizenLab-2018] [Gatlan-2019] [Chai-2019] [Grover-2019] 532 [Singh-2019]. 534 4.3. Transport Layer 536 4.3.1. Shallow Packet Inspection and Transport Header Identification 538 Of the various shallow packet inspection methods, Transport Header 539 Identification is the most pervasive, reliable, and predictable type 540 of identification. Transport headers contain a few invaluable pieces 541 of information that must be transparent for traffic to be 542 successfully routed: destination and source IP address and port. 543 Destination and source IP addresses are doubly useful: not only do they 544 allow a censor to block undesirable content via IP blocklisting, but 545 they also allow a censor to identify the IP of the user making the 546 request and the IP address of the destination being visited, which in 547 most cases can be used to infer the domain being visited 548 [Patil-2019].
Port is useful for allowlisting certain applications. 550 Trade-offs: header identification is popular due to its simplicity, 551 availability, and robustness. 553 Header identification is trivial to implement, but is difficult to 554 implement in backbone or ISP routers at scale, and is therefore 555 typically implemented with DPI. Blocklisting an IP is equivalent to 556 installing a specific route on a router (such as a /32 route for IPv4 557 addresses and a /128 route for IPv6 addresses). However, due to 558 limited flow table space, this cannot scale beyond a few thousand IPs 559 at most. IP blocking is also relatively crude. It often leads to 560 overblocking and cannot deal with some services like Content 561 Distribution Networks (CDN) that host content at hundreds or 562 thousands of IP addresses. Despite these limitations, IP blocking is 563 extremely effective because the user needs to proxy their traffic 564 through another destination to circumvent this type of 565 identification. 567 Port-blocking is generally not useful because many types of content 568 share the same port and it is possible for censored applications to 569 change their port. For example, most HTTP traffic goes over port 80, 570 so the censor cannot differentiate between restricted and allowed web 571 content solely on the basis of port. HTTPS goes over port 443, with 572 similar consequences for the censor except only partial metadata may 573 now be available to the censor. Port allowlisting is occasionally 574 used, where a censor limits communication to approved ports, such as 575 80 for HTTP traffic and is most effective when used in conjunction 576 with other identification mechanisms. For example, a censor could 577 block the default HTTPS port, port 443, thereby forcing most users to 578 fall back to HTTP. 
A counter-example is that port 25 (SMTP) has long 579 been blocked on residential ISPs' networks to reduce the risk for 580 email spam, but in doing so also prohibits residential ISP customers 581 from running their own email servers. 583 4.3.2. Protocol Identification 585 Censors sometimes identify entire protocols to be blocked using a 586 variety of traffic characteristics. For example, Iran impairs the 587 performance of HTTPS traffic, a protocol that prevents further 588 analysis, to encourage users to switch to HTTP, a protocol that they 589 can analyze [Aryan-2012]. A simple protocol identification would be 590 to recognize all TCP traffic over port 443 as HTTPS, but more 591 sophisticated analysis of the statistical properties of payload data 592 and flow behavior, would be more effective, even when port 443 is not 593 used [Hjelmvik-2010] [Sandvine-2014]. 595 If censors can detect circumvention tools, they can block them, so 596 censors like China are extremely interested in identifying the 597 protocols for censorship circumvention tools. In recent years, this 598 has devolved into an arms race between censors and circumvention tool 599 developers. As part of this arms race, China developed an extremely 600 effective protocol identification technique that researchers call 601 active probing or active scanning. 603 In active probing, the censor determines whether hosts are running a 604 circumvention protocol by trying to initiate communication using the 605 circumvention protocol. If the host and the censor successfully 606 negotiate a connection, then the censor conclusively knows that host 607 is running a circumvention tool. China has used active scanning to 608 great effect to block Tor [Winter-2012]. 610 Trade-offs: Protocol identification necessarily only provides insight 611 into the way information is traveling, and not the information 612 itself. 
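As a rough illustration of the difference between port-based and payload-based identification described above, the following Python sketch labels a flow first by destination port and then by the first bytes of its payload. It is a deliberately simplified signature match, not the statistical analysis cited above and not a description of any deployed system; the function names and the small signature table are assumptions made for this example.

```python
# Hypothetical, simplified classifier: names and signatures are illustrative only.

WELL_KNOWN_PORTS = {22: "ssh", 80: "http", 443: "https"}
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def classify_by_port(dst_port: int) -> str:
    """Guess the protocol from the destination port alone."""
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


def classify_by_payload(payload: bytes) -> str:
    """Guess the protocol from the first bytes sent by the client."""
    # A TLS record starts with a content type (0x16 = handshake) followed by
    # a two-byte legacy version whose major byte is 0x03.
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    # SSH peers open with an identification string beginning "SSH-".
    if payload.startswith(b"SSH-"):
        return "ssh"
    # Plain HTTP requests begin with a method token and a space.
    if payload.startswith(HTTP_METHODS):
        return "http"
    return "unknown"
```

Under these assumptions, a TLS handshake moved to port 8443 is labelled "unknown" by classify_by_port(8443) but "tls" by classify_by_payload, which suggests why a port rule alone is easy to evade while payload inspection is not.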
614 Protocol identification is useful for detecting and blocking 615 circumvention tools, like Tor, or traffic that is difficult to 616 analyze, like VoIP or SSL, because the censor can assume that this 617 traffic should be blocked. However, this can lead to over-blocking 618 problems when used with popular protocols. These methods are 619 expensive, both computationally and financially, due to the use of 620 statistical analysis, and can be ineffective due to their imprecise 621 nature. Moreover, censorship circumvention groups like the Tor 622 Project have developed "pluggable transports" which seek to make the 623 traffic of censorship circumvention tools appear indistinguishable 624 from other kinds of traffic [Tor-2020]. 626 Empirical Examples: Protocol identification can be easy to detect if 627 it is conducted in real time and only a particular protocol is 628 blocked, but some types of protocol identification, like active 629 scanning, are much more difficult to detect. Protocol identification 630 has been used by Iran to identify and throttle SSH traffic to make it 631 unusable [Anonymous-2007] and by China to identify and block Tor 632 relays [Winter-2012]. Protocol identification has also been used for 633 traffic management, such as the 2007 case where Comcast in the United 634 States used RST injection to interrupt BitTorrent Traffic 635 [Winter-2012]. 637 5. Technical Interference 639 5.1. Application Layer 641 5.1.1. DNS Interference 643 There are a variety of mechanisms that censors can use to block or 644 filter access to content by altering responses from the DNS 645 [AFNIC-2013] [ICANN-SSAC-2012], including blocking the response, 646 replying with an error message, or responding with an incorrect 647 address. Note that there are now encrypted transports for DNS 648 queries in DNS-over-HTTPS [RFC8484] and DNS-over-TLS [RFC7858] that 649 can mitigate interference with DNS queries between the stub and the 650 resolver. 
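Why cleartext DNS lends itself to interference can be seen from the wire format: the queried name travels unencrypted in the question section, so any on-path device can read it and compare it against a blocklist. The Python sketch below builds a minimal A query and then extracts the name as an observer would. The function names, the transaction ID, and the blocklist contents are assumptions for this example; it does not handle name compression, which queries normally do not use.

```python
import struct


def build_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (illustrative only)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def extract_qname(packet: bytes) -> str:
    """Read the queried name from a cleartext query, as an observer could."""
    labels = []
    pos = 12  # the fixed DNS header is 12 bytes long
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


def matches_blocklist(packet: bytes, blocklist: set) -> bool:
    """Hypothetical censor-side check of the queried name."""
    return extract_qname(packet).lower() in blocklist
```

With DNS-over-TLS or DNS-over-HTTPS, the same bytes are carried inside an encrypted channel between the stub and the resolver, so an on-path observer cannot perform the extract_qname step.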
652 "DNS mangling" is a network-level technique where an incorrect IP 653 address is returned in response to a DNS query to a censored 654 destination. An example of this is what some Chinese networks do (we 655 are not aware of any other wide-scale uses of mangling). On those 656 Chinese networks, every DNS request in transit is examined 657 (presumably by network inspection technologies such as DPI) and, if 658 it matches a censored domain, a false response is injected. End 659 users can see this technique in action by simply sending DNS requests 660 to any unused IP address in China (see example below). If it is not 661 a censored name, there will be no response. If it is censored, a 662 forged response will be returned. For example, using the command- 663 line dig utility to query an unused IP address in China of 192.0.2.2 664 for the name "www.uncensored.example" compared with 665 "www.censored.example" (censored at the time of writing), we get a 666 forged IP address "198.51.100.0" as a response:

   668 % dig +short +nodnssec @192.0.2.2 A www.uncensored.example
   669 ;; connection timed out; no servers could be reached

   671 % dig +short +nodnssec @192.0.2.2 A www.censored.example
   672 198.51.100.0

674 There are also cases of what is colloquially called "DNS lying", 675 where a censor mandates that the DNS responses provided - by an 676 operator of a recursive resolver such as an Internet access provider 677 - be different from what authoritative servers would provide 678 [Bortzmayer-2015].

680 DNS cache poisoning refers to a mechanism where a censor interferes 681 with the response sent by an authoritative DNS server to a 682 recursive resolver by responding with an alternative IP address more quickly than the authoritative 683 server can respond [Halley-2008].
684 Cache poisoning occurs after the requested site's name servers 685 resolve the request and attempt to forward the true IP back to the 686 requesting device; on the return route the resolved IP is recursively 687 cached by each DNS server that initially forwarded the request. 688 During this caching process, if an undesirable keyword is recognized, 689 the resolved IP is "poisoned" and an alternative IP (or NXDOMAIN 690 error) is returned more quickly than the upstream resolver can 691 respond, causing a forged IP address to be cached (and potentially 692 recursively so). The alternative IPs usually direct to a nonsense 693 domain or a warning page. Alternatively, Iranian censorship appears 694 to prevent the communication en route, preventing a response from 695 ever being sent [Aryan-2012]. 697 Trade-offs: These forms of DNS interference require the censor to 698 force a user to traverse a controlled DNS hierarchy (or intervening 699 network on which the censor serves as an Active Pervasive Attacker 700 [RFC7624] to rewrite DNS responses) for the mechanism to be 701 effective. These mechanisms can be circumvented by using alternative DNS resolvers 702 (such as any of the public DNS resolvers) that may fall outside of 703 the jurisdictional control of the censor, or Virtual Private Network 704 (VPN) technology. DNS mangling and cache poisoning also imply 705 returning an incorrect IP to those attempting to resolve a domain 706 name, but in some cases the destination may be technically 707 accessible; over HTTP, for example, the user may have another method 708 of obtaining the IP address of the desired site and may be able to 709 access it if the site is configured to be the default server 710 listening at this IP address. Target blocking has also been a 711 problem, as occasionally users outside of the censor's region will be 712 directed through DNS servers or DNS-rewriting network equipment 713 controlled by a censor, causing the request to fail.
The ease of 714 circumvention paired with the large risk of content blocking and 715 target blocking makes DNS interference a partial, difficult, and less 716 than ideal censorship mechanism. 718 Additionally, the above mechanisms rely on DNSSEC not being deployed 719 or DNSSEC validation not being active on the client or recursive 720 resolver (neither of which is hard to imagine given limited 721 deployment of DNSSEC and limited client support for DNSSEC 722 validation). Note that an adversary seeking to merely block 723 resolution can serve a DNSSEC record that doesn't validate correctly, 724 assuming of course that the client/recursive resolver validates. 726 Previously, techniques used for, e.g., censorship relied on 727 DNS requests being passed in cleartext over port 53 [SSAC-109-2020]. 728 With the deployment of encrypted DNS (e.g., DNS-over-HTTPS [RFC8484]) 729 these requests are now increasingly passed on port 443 with other 730 HTTPS traffic, or in the case of DNS-over-TLS [RFC7858] no longer 731 passed in the clear (see also Section 4.3.1). 733 Empirical Examples: DNS interference, when properly implemented, is 734 easy to identify based on the shortcomings identified above. Turkey 735 relied on DNS interference for its country-wide block of websites 736 such as Twitter and YouTube for almost a week in March of 2014, but the 737 ease of circumvention resulted in an increase in the popularity of 738 Twitter until Turkish ISPs implemented an IP blocklist to achieve 739 the governmental mandate [Zmijewski-2014]. Ultimately, Turkish ISPs 740 started hijacking all requests to Google and Level 3's international 741 DNS resolvers [Zmijewski-2014]. DNS interference, when incorrectly 742 implemented, has resulted in some of the largest "censorship 743 disasters".
In January 2014, China started directing all requests 744 passing through the Great Firewall to a single domain, 745 dongtaiwang.com, due to an improperly configured DNS poisoning 746 attempt; this incident is thought to be the largest Internet-service 747 outage in history [AFP-2014] [Anon-SIGCOMM12]. Countries such as 748 China, Iran, Turkey, and the United States have discussed blocking 749 entire TLDs as well, but only Iran has acted by blocking all Israeli 750 (.il) domains [Albert-2011]. DNS-blocking is commonly deployed in 751 European countries to deal with undesirable content, such as child 752 abuse content (Norway, United Kingdom, Belgium, Denmark, Finland, 753 France, Germany, Ireland, Italy, Malta, the Netherlands, Poland, 754 Spain and Sweden [Wright-2013] [Eneman-2010]), online gambling 755 (Belgium, Bulgaria, Czech Republic, Cyprus, Denmark, Estonia, France, 756 Greece, Hungary, Italy, Latvia, Lithuania, Poland, Portugal, Romania, 757 Slovakia, Slovenia, and Spain (see Section 6.3.2 of [EC-gambling-2012], 758 [EC-gambling-2019])), copyright infringement (all European Economic 759 Area countries), hate speech and extremism (France [Hertel-2015]) and 760 terrorism content (France [Hertel-2015]). 762 5.2. Transport Layer 764 5.2.1. Performance Degradation 766 While other interference techniques outlined in this section mostly 767 focus on blocking or preventing access to content, it can be an 768 effective censorship strategy in some cases to not entirely block 769 access to a given destination or service, but instead degrade the 770 performance of the relevant network connection. The resulting user 771 experience for a site or service under performance degradation can be 772 so bad that users opt to use a different site, service, or method of 773 communication, or may not engage in communication at all if there are 774 no alternatives.
Traffic shaping techniques that rate-limit the 775 bandwidth available to certain types of traffic are one example of 776 performance degradation. 778 Trade-offs: While implementing a performance degradation will not 779 always eliminate the ability of people to access a desired resource, 780 it may force them to use other means of communication where 781 censorship (or surveillance) is more easily accomplished. 783 Empirical Examples: Iran has been known to shape the bandwidth 784 available to HTTPS traffic to encourage unencrypted HTTP traffic 785 [Aryan-2012]. 787 5.2.2. Packet Dropping 789 Packet dropping is a simple mechanism to prevent undesirable traffic. 790 The censor identifies undesirable traffic and chooses to not properly 791 forward any packets it sees associated with the traversing 792 undesirable traffic instead of following a normal routing protocol. 793 This can be paired with any of the previously described mechanisms so 794 long as the censor knows the user must route traffic through a 795 controlled router. 797 Trade-offs: Packet Dropping is most successful when every traversing 798 packet has transparent information linked to undesirable content, 799 such as a Destination IP. One downside of Packet Dropping 800 is the necessity of blocking all content from otherwise allowable IPs 801 based on a single subversive sub-domain; blogging services and GitHub 802 repositories are good examples. China famously dropped all GitHub 803 packets for three days based on a single repository hosting 804 undesirable content [Anonymous-2013]. The need to inspect every 805 traversing packet in close to real time also makes Packet Dropping 806 somewhat challenging from a QoS perspective. 808 Empirical Examples: Packet Dropping is a very common form of 809 technical interference and lends itself to accurate detection given 810 the unique nature of the time-out requests it leaves in its wake.
811 The Great Firewall of China has been observed using packet dropping 812 as one of its primary mechanisms of technical censorship 813 [Ensafi-2013]. Iran has also used Packet Dropping as the mechanism 814 for throttling SSH [Aryan-2012]. These are but two examples of a 815 ubiquitous censorship practice. 817 5.2.3. RST Packet Injection 819 Packet injection, generally, refers to a man-in-the-middle (MITM) 820 network interference technique that spoofs packets in an established 821 traffic stream. RST packets are normally used to let one side of a TCP 822 connection know the other side has stopped sending information, and 823 thus the receiver should close the connection. RST Packet Injection 824 is a specific type of packet injection attack that is used to 825 interrupt an established stream by sending RST packets to both sides 826 of a TCP connection; as each receiver thinks the other has dropped 827 the connection, the session is terminated. QUIC is not vulnerable to 828 these types of injection attacks once the connection has been set up, 829 but is vulnerable during setup (see [I-D.ietf-quic-transport] for 830 more details). 832 Trade-offs: Although ineffective against non-TCP protocols (QUIC, 833 IPsec), RST Packet Injection has a few advantages that make it 834 extremely popular as a censorship technique. RST Packet Injection is 835 an out-of-band interference mechanism, allowing the avoidance of the 836 QoS bottleneck one can encounter with inline techniques such as 837 Packet Dropping. This out-of-band property allows a censor to 838 inspect a copy of the information, usually mirrored by an optical 839 splitter, making it an ideal pairing for DPI and protocol 840 identification [Weaver-2009] (this asynchronous version of a MITM is 841 often called a Man-on-the-Side (MOTS)). RST Packet Injection also 842 has the advantage of only requiring one of the two endpoints to 843 accept the spoofed packet for the connection to be interrupted.
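Whether an endpoint accepts a spoofed reset depends largely on the sequence number the reset carries. The Python sketch below assumes a simplified receiver that accepts any RST whose sequence number falls inside its current receive window, and estimates how many resets an injector that cannot observe the stream would have to send to be sure of landing one in the window. The function names are hypothetical, and many modern TCP stacks apply stricter checks than this.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide


def rst_in_window(seg_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seg_seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    return (seg_seq - rcv_nxt) % SEQ_SPACE < rcv_wnd


def blind_rst_guesses(rcv_wnd: int) -> int:
    """Resets needed, spaced rcv_wnd apart, to cover the whole sequence space."""
    return -(-SEQ_SPACE // rcv_wnd)  # ceiling division
```

Under these assumptions, blind_rst_guesses(65535) returns 65538: with a 65,535-byte window, a few tens of thousands of small packets are enough to cover every possible window position.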
845 The difficult part of RST Packet Injection is spoofing "enough" 846 correct information to ensure one end-point accepts a RST packet as 847 legitimate; this generally implies a correct IP, port, and TCP 848 sequence number. Sequence number is the hardest to get correct, as 849 [RFC0793] specifies an RST Packet should be in-sequence to be 850 accepted, although the RFC also recommends allowing in-window packets 851 as "good enough". This in-window recommendation is important, as if 852 it is implemented it allows for successful Blind RST Injection 853 attacks [Netsec-2011]. When in-window sequencing is allowed, it is 854 trivial to conduct a Blind RST Injection: while the term "blind" 855 injection implies the censor doesn't know any sensitive (encrypted) 856 sequencing information about the TCP stream they are injecting into, 857 they can simply enumerate all ~70000 possible windows; this is 858 particularly useful for interrupting encrypted/obfuscated protocols 859 such as SSH or Tor. RST Packet Injection relies on a stateful 860 network, making it useless against UDP connections. RST Packet 861 Injection is among the most popular censorship techniques used today 862 given its versatile nature and effectiveness against all types of TCP 863 traffic. Recent research shows that a TCP RST packet injection 864 attack can even work in the case of an off-path attacker [Cao-2016]. 866 Empirical Examples: RST Packet Injection, as mentioned above, is most 867 often paired with identification techniques that require splitting, 868 such as DPI or protocol identification. In 2007, Comcast was accused 869 of using RST Packet Injection to interrupt traffic it identified as 870 BitTorrent [Schoen-2007], this later led to a US Federal 871 Communications Commission ruling against Comcast [VonLohmann-2008]. 872 China has also been known to use RST Packet Injection for censorship 873 purposes. 
This interference is especially evident in the 874 interruption of encrypted/obfuscated protocols, such as those used by 875 Tor [Winter-2012]. 877 5.3. Multi-layer and Non-layer 879 5.3.1. Distributed Denial of Service (DDoS) 881 Distributed Denial of Service attacks are a common attack mechanism 882 used by "hacktivists" and malicious hackers, but censors have used 883 DDoS in the past for a variety of reasons. There is a huge variety 884 of DDoS attacks [Wikip-DoS], but at a high level two possible impacts 885 tend to occur; a flood attack results in the service being unusable 886 while resources are being spent to flood the service, a crash attack 887 aims to crash the service so resources can be reallocated elsewhere 888 without "releasing" the service. 890 Trade-offs: DDoS is an appealing mechanism when a censor would like 891 to prevent all access to undesirable content, instead of only access 892 in their region for a limited period of time, but this is really the 893 only uniquely beneficial feature for DDoS as a censorship technique. 894 The resources required to carry out a successful DDoS against major 895 targets are computationally expensive, usually requiring renting or 896 owning a malicious distributed platform such as a botnet, and 897 imprecise. DDoS is an incredibly crude censorship technique, and 898 appears to largely be used as a timely, easy-to-access mechanism for 899 blocking undesirable content for a limited period of time. 901 Empirical Examples: In 2012 the U.K.'s GCHQ used DDoS to temporarily 902 shutdown IRC chat rooms frequented by members of Anonymous using the 903 Syn Flood DDoS method; Syn Flood exploits the handshake used by TCP 904 to overload the victim server with so many requests that legitimate 905 traffic becomes slow or impossible [Schone-2014] [CERT-2000]. 907 Dissenting opinion websites are frequently victims of DDoS around 908 politically sensitive events in Burma [Villeneuve-2011]. 
Controlling 909 parties in Russia [Kravtsova-2012], Zimbabwe [Orion-2013], and 910 Malaysia [Muncaster-2013] have been accused of using DDoS to 911 interrupt opposition support and access during elections. In 2015, 912 China launched a DDoS attack using a true MITM system collocated with 913 the Great Firewall, dubbed "Great Cannon", that was able to inject 914 JavaScript code into web visits to a Chinese search engine that 915 commandeered those user agents to send DDoS traffic to various sites 916 [Marczak-2015]. 918 5.3.2. Network Disconnection or Adversarial Route Announcement 920 While it is perhaps the crudest of all censorship techniques, there 921 is no more effective way of making sure undesirable information isn't 922 allowed to propagate on the web than shutting off the network. 923 The network can be logically cut off in a region when a censoring 924 body withdraws all of the Border Gateway Protocol (BGP) prefixes 925 routing through the censor's country. 927 Trade-offs: The impact of a network disconnection in a region is huge 928 and absolute; the censor pays for absolute control over digital 929 information by losing all the benefits the Internet brings; this 930 is rarely a long-term solution for any censor and is normally only used 931 as a last resort in times of substantial unrest. 933 Empirical Examples: Network Disconnections tend to only happen in 934 times of substantial unrest, largely due to the huge social, 935 political, and economic impact such a move has. One of the first, 936 highly covered occurrences was with the Junta in Myanmar employing 937 Network Disconnection to help Junta forces quash a rebellion in 2007 938 [Dobie-2007]. China disconnected the network in the Xinjiang region 939 during unrest in 2009 in an effort to prevent the protests from 940 spreading to other regions [Heacock-2009].
The Arab Spring saw the 941 most frequent usage of Network Disconnection, with events in 942 Egypt and Libya in 2011 [Cowie-2011] [Cowie-2011b], and Syria in 2012 943 [Thomson-2012]. Russia has indicated that it will attempt to 944 disconnect all Russian networks from the global internet in April 945 2019 as part of a test of the nation's network independence. Reports 946 also indicate that, as part of the test disconnect, Russian 947 telecommunications firms must now route all traffic to state-operated 948 monitoring points [Cimpanu-2019]. India was the country that saw the 949 largest number of internet shutdowns per year in 2016 and 2017 950 [Dada-2017]. 952 6. Non-Technical Interference 954 6.1. Manual Filtering 956 As the name implies, sometimes manpower is the easiest way to figure 957 out which content to block. Manual Filtering differs from the common 958 tactic of building up blocklists in that it doesn't necessarily 959 target a specific IP address or DNS name, but instead removes or flags content. 960 Given the imprecise nature of automatic filtering, manually sorting 961 through content and flagging dissenting websites, blogs, articles and 962 other media for filtration can be an effective technique. This 963 filtration can occur on the Backbone/ISP level - China's army of 964 monitors is a good example [BBC-2013b] - but more commonly manual 965 filtering occurs on an institutional level. Internet Content 966 Providers (ICPs) such as Google or Weibo require a business license to 967 operate in China. One of the prerequisites for a business license is 968 an agreement to sign a "voluntary pledge" known as the "Public Pledge 969 on Self-discipline for the Chinese Internet Industry". The failure 970 to "energetically uphold" the pledged values can lead to the ICPs 971 being held liable for the offending content by the Chinese government 972 [BBC-2013b]. 974 6.2.
Self-Censorship 976 Self-censorship is difficult to document, as it manifests primarily 977 through a lack of undesirable content. Tools which encourage self- 978 censorship are those which may lead a prospective speaker to believe 979 that speaking increases the risk of unfavourable outcomes for the 980 speaker (technical monitoring, identification requirements, etc.). 981 Reporters Without Borders exemplify methods of imposing self- 982 censorship in their annual World Press Freedom Index reports 983 [RWB2020]. 985 6.3. Server Takedown 987 As mentioned in passing by [Murdoch-2011], servers must have a 988 physical location somewhere in the world. If undesirable content is 989 hosted in the censoring country the servers can be physically seized 990 or - in cases where a server is virtualized in a cloud infrastructure 991 where it may not necessarily have a fixed physical location - the 992 hosting provider can be required to prevent access. 994 6.4. Notice and Takedown 996 In many countries, legal mechanisms exist where an individual or 997 other content provider can issue a legal request to a content host 998 that requires the host to take down content. Examples include the 999 systems employed by companies like Google to comply with "Right to be 1000 Forgotten" policies in the European Union [Google-RTBF], intermediary 1001 liability rules for electronic platform providers [EC-2012], or the 1002 copyright-oriented notice and takedown regime of the United States 1003 Digital Millennium Copyright Act (DMCA) Section 512 [DMLP-512]. 1005 6.5. Domain-Name Seizures 1007 Domain names are catalogued in so-called name-servers operated by 1008 legal entities called registries. These registries can be made to 1009 cede control over a domain name to someone other than the entity 1010 which registered the domain name through a legal procedure grounded 1011 in either private contracts or public law. 
Domain name seizures are 1012 increasingly used by both public authorities and private entities to 1013 deal with undesired content dissemination [ICANN2012] [EFF2017]. 1015 7. Contributors 1017 This document benefited from discussions with and input from David 1018 Belson, Stephane Bortzmeyer, Vinicius Fortuna, Gurshabad Grover, 1019 Andrew McConachie, Martin Nilsson, Michael Richardson, Patrick Vacek 1020 and Chris Wood. 1022 8. Informative References 1024 [AFNIC-2013] 1025 AFNIC, "Report of the AFNIC Scientific Council: 1026 Consequences of DNS-based Internet filtering", 2013, 1027 . 1030 [AFP-2014] 1031 AFP, "China Has Massive Internet Breakdown Reportedly 1032 Caused By Their Own Censoring Tools", 2014, 1033 . 1036 [Albert-2011] 1037 Albert, K., "DNS Tampering and the new ICANN gTLD Rules", 1038 2011, . 1041 [Anon-SIGCOMM12] 1042 Anonymous, "The Collateral Damage of Internet Censorship 1043 by DNS Injection", 2012, 1044 . 1047 [Anonymous-2007] 1048 Anonymous, "How to Bypass Comcast's Bittorrent 1049 Throttling", 2012, . 1052 [Anonymous-2013] 1053 Anonymous, "GitHub blocked in China - how it happened, how 1054 to get around it, and where it will take us", 2013, 1055 . 1059 [Anonymous-2014] 1060 Anonymous, "Towards a Comprehensive Picture of the Great 1061 Firewall's DNS Censorship", 2014, 1062 . 1065 [AP-2012] Associated Press, "Sattar Beheshit, Iranian Blogger, Was 1066 Beaten In Prison According To Prosecutor", 2012, 1067 . 1070 [Aryan-2012] 1071 Aryan, S., Aryan, H., and J. Halderman, "Internet 1072 Censorship in Iran: A First Look", 2012, 1073 . 1075 [BBC-2013] 1076 BBC News, "Google and Microsoft agree steps to block abuse 1077 images", 2013, . 1079 [BBC-2013b] 1080 BBC, "China employs two million microblog monitors state 1081 media say", 2013, 1082 . 1084 [Bentham-1791] 1085 Bentham, J., "Panopticon Or the Inspection House", 1791, 1086 . 1089 [Bortzmayer-2015] 1090 Bortzmayer, S., "DNS Censorship (DNS Lies) As Seen By RIPE 1091 Atlas", 2015, 1092 .
1095 [Boyle-1997] 1096 Boyle, J., "Foucault in Cyberspace: Surveillance, 1097 Sovereignty, and Hardwired Censors", 1997, 1098 . 1101 [Bristow-2013] 1102 Bristow, M., "China's internet 'spin doctors'", 2013, 1103 . 1105 [Calamur-2013] 1106 Calamur, K., "Prominent Egyptian Blogger Arrested", 2013, 1107 . 1110 [Cao-2016] 1111 Cao, Y., Qian, Z., Wang, Z., Dao, T., Krishnamurthy, S., 1112 and L. Marvel, "Off-Path TCP Exploits: Global Rate Limit 1113 Considered Dangerous", 2016, 1114 . 1117 [CERT-2000] 1118 CERT, "TCP SYN Flooding and IP Spoofing Attacks", 2000, 1119 . 1122 [Chai-2019] 1123 Chai, Z., Ghafari, A., and A. Houmansadr, "On the 1124 Importance of Encrypted-SNI (ESNI) to Censorship 1125 Circumvention", 2019, 1126 . 1129 [Cheng-2010] 1130 Cheng, J., "Google stops Hong Kong auto-redirect as China 1131 plays hardball", 2010, . 1135 [Cimpanu-2019] 1136 Cimpanu, C., "Russia to disconnect from the internet as 1137 part of a planned test", 2019, 1138 . 1141 [CitizenLab-2018] 1142 Marczak, B., Dalek, J., McKune, S., Senft, A., Scott- 1143 Railton, J., and R. Deibert, "Bad Traffic: Sandvine's 1144 PacketLogic Devices Used to Deploy Government Spyware in 1145 Turkey and Redirect Egyptian Users to Affiliate Ads?", 1146 2018, . 1150 [Clayton-2006] 1151 Clayton, R., "Ignoring the Great Firewall of China", 2006, 1152 . 1154 [Condliffe-2013] 1155 Condliffe, J., "Google Announces Massive New Restrictions 1156 on Child Abuse Search Terms", 2013, . 1160 [Cowie-2011] 1161 Cowie, J., "Egypt Leaves the Internet", 2011, 1162 . 1165 [Cowie-2011b] 1166 Cowie, J., "Libyan Disconnect", 2011, 1167 . 1169 [Crandall-2010] 1170 Crandall, J., "Empirical Study of a National-Scale 1171 Distributed Intrusion Detection System: Backbone-Level 1172 Filtering of HTML Responses in China", 2010, 1173 . 1175 [Dada-2017] 1176 Dada, T. and P. Micek, "Launching STOP: the #KeepItOn 1177 internet shutdown tracker", 2017, 1178 . 
1180 [Dalek-2013] 1181 Dalek, J., "A Method for Identifying and Confirming the 1182 Use of URL Filtering Products for Censorship", 2013, 1183 . 1186 [Ding-1999] 1187 Ding, C., Chi, C., Deng, J., and C. Dong, "Centralized 1188 Content-Based Web Filtering and Blocking: How Far Can It 1189 Go?", 1999, . 1192 [DMLP-512] 1193 Digital Media Law Project, "Protecting Yourself Against 1194 Copyright Claims Based on User Content", 2012, 1195 . 1198 [Dobie-2007] 1199 Dobie, M., "Junta tightens media screw", 2007, 1200 . 1202 [EC-2012] European Commission, "Summary of the results of the Public 1203 Consultation on the future of electronic commerce in the 1204 Internal Market and the implementation of the Directive on 1205 electronic commerce (2000/31/EC)", 2012, 1206 . 1210 [EC-gambling-2012] 1211 European Commission, "Online gambling in the Internal 1212 Market", 2012, . 1215 [EC-gambling-2019] 1216 European Commission, "Evaluation of regulatory tools for 1217 enforcing online gambling rules and channeling demand 1218 towards controlled offers", 2019, 1219 . 1223 [EFF2017] Malcom, J., Stoltz, M., Rossi, G., and V. Paxson, "Which 1224 Internet registries offer the best protection for domain 1225 owners?", 2017, . 1228 [Ellul-1973] 1229 Ellul, J., "Propaganda: The Formation of Men's Attitudes", 1230 1973, . 1233 [Eneman-2010] 1234 Eneman, M., "ISPs filtering of child abusive material: A 1235 critical reflection of its effectiveness", 2010, 1236 . 1239 [Ensafi-2013] 1240 Ensafi, R., "Detecting Intentional Packet Drops on the 1241 Internet via TCP/IP Side Channels", 2013, 1242 . 1244 [Fareed-2008] 1245 Fareed, M., "China joins a turf war", 2008, 1246 . 1249 [Fifield-2015] 1250 Fifield, D., Lan, C., Hynes, R., Wegmann, P., and V. 1251 Paxson, "Blocking-resistant communication through domain 1252 fronting", 2015, 1253 . 1255 [Gao-2014] 1256 Gao, H., "Tiananmen, Forgotten", 2014, 1257 . 
1260 [Gatlan-2019] 1261 Gatlan, S., "South Korea is Censoring the Internet by 1262 Snooping on SNI Traffic", 2019, 1263 . 1267 [Glanville-2008] 1268 Glanville, J., "The Big Business of Net Censorship", 2008, 1269 . 1272 [Google-RTBF] 1273 Google, Inc., "Search removal request under data 1274 protection law in Europe", 2015, 1275 . 1278 [Grover-2019] 1279 Grover, G., Singh, K., and E. Hickok, "Reliance Jio is 1280 using SNI inspection to block websites", 2019, 1281 . 1284 [Guardian-2014] 1285 The Guardian, "Chinese blogger jailed under crackdown on 1286 'internet rumours'", 2014, 1287 . 1290 [HADOPI-2020] 1291 Haute Autorite pour la Diffusion des oeuvres et la 1292 Protection des Droits sur Internet, "Presentation", 2020, 1293 . 1295 [Halley-2008] 1296 Halley, B., "How DNS cache poisoning works", 2014, 1297 . 1300 [Heacock-2009] 1301 Heacock, R., "China Shuts Down Internet in Xinjiang Region 1302 After Riots", 2009, . 1305 [Hepting-2011] 1306 Electronic Frontier Foundation, "Hepting vs. AT&T", 2011, 1307 . 1309 [Hertel-2015] 1310 Hertel, O., "Comment les autorites peuvent bloquer un site 1311 Internet", 2015, . 1315 [Hjelmvik-2010] 1316 Hjelmvik, E., "Breaking and Improving Protocol 1317 Obfuscation", 2010, 1318 . 1320 [Hopkins-2011] 1321 Hopkins, C., "Communications Blocked in Libya, Qatari 1322 Blogger Arrested: This Week in Online Tyranny", 2011, 1323 . 1326 [Husak-2016] 1327 Husak, M., Cermak, M., Jirsik, T., and P. Celeda, "HTTPS 1328 traffic analysis and client identification using passive 1329 SSL/TLS fingerprinting", 2016, 1330 . 1333 [I-D.ietf-quic-transport] 1334 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1335 and Secure Transport", draft-ietf-quic-transport-29 (work 1336 in progress), June 2020. 1338 [I-D.ietf-tls-esni] 1339 Rescorla, E., Oku, K., Sullivan, N., and C. Wood, "TLS 1340 Encrypted Client Hello", draft-ietf-tls-esni-07 (work in 1341 progress), June 2020. 1343 [I-D.ietf-tls-sni-encryption] 1344 Huitema, C. and E.
Rescorla, "Issues and Requirements for 1345 SNI Encryption in TLS", draft-ietf-tls-sni-encryption-09 1346 (work in progress), October 2019. 1348 [ICANN-SSAC-2012] 1349 ICANN Security and Stability Advisory Committee (SSAC), 1350 "SAC 056: SSAC Advisory on Impacts of Content Blocking via 1351 the Domain Name System", 2012, 1352 . 1355 [ICANN2012] 1356 ICANN Security and Stability Advisory Committee, "Guidance 1357 for Preparing Domain Name Orders, Seizures & Takedowns", 1358 2012, . 1361 [Johnson-2010] 1362 Johnson, L., "Torture feared in arrest of Iraqi blogger", 1363 2011, . 1366 [Jones-2014] 1367 Jones, B., "Automated Detection and Fingerprinting of 1368 Censorship Block Pages", 2014, 1369 . 1372 [Khattak-2013] 1373 Khattak, S., "Towards Illuminating a Censorship Monitor's 1374 Model to Facilitate Evasion", 2013, . 1378 [Knight-2005] 1379 Knight, W., "Iranian net censorship powered by US 1380 technology", 2005, . 1383 [Kopel-2013] 1384 Kopel, K., "Operation Seizing Our Sites: How the Federal 1385 Government is Taking Domain Names Without Prior Notice", 1386 2013, . 1388 [Kravtsova-2012] 1389 Kravtsova, Y., "Cyberattacks Disrupt Opposition's 1390 Election", 2012, 1391 . 1394 [Leyba-2019] 1395 Leyba, K., Edwards, B., Freeman, C., Crandall, J., and S. 1396 Forrest, "Borders and Gateways: Measuring and Analyzing 1397 National AS Chokepoints", 2019, 1398 . 1401 [Lomas-2019] 1402 Lomas, N., "Github removes Tsunami Democratic's APK after 1403 a takedown order from Spain", 2019, 1404 . 1407 [Marczak-2015] 1408 Marczak, B., Weaver, N., Dalek, J., Ensafi, R., Fifield, 1409 D., McKune, S., Rey, A., Scott-Railton, J., Deibert, R., 1410 and V. Paxson, "An Analysis of China's "Great Cannon"", 1411 2015, 1412 . 1415 [Muncaster-2013] 1416 Muncaster, P., "Malaysian election sparks web blocking/ 1417 DDoS claims", 2013, 1418 . 1421 [Murdoch-2011] 1422 Murdoch, S. and R. Anderson, "Access Denied: Tools and 1423 Technology of Internet Filtering", 2011, 1424 . 
1427 [NA-SK-2019] 1428 Morgus, R., Sherman, J., and S. Nam, "Analysis: South 1429 Korea's New Tool for Filtering Illegal Internet Content", 1430 2019, . 1434 [Nabi-2013] 1435 Nabi, Z., "The Anatomy of Web Censorship in Pakistan", 1436 2013, . 1439 [Netsec-2011] 1440 n3t2.3c, "TCP-RST Injection", 2011, 1441 . 1443 [OONI-2018] 1444 Evdokimov, L., "Iran Protests: DPI blocking of Instagram 1445 (Part 2)", 2018, 1446 . 1448 [OONI-2019] 1449 Singh, S., Filasto, A., and M. Xynou, "China is now 1450 blocking all language editions of Wikipedia", 2019, 1451 . 1453 [Orion-2013] 1454 Orion, E., "Zimbabwe election hit by hacking and DDoS 1455 attacks", 2013, 1456 . 1459 [Patil-2019] 1460 Patil, S. and N. Borisov, "What Can You Learn from an 1461 IP?", 2019, . 1464 [Porter-2010] 1465 Porter, T., "The Perils of Deep Packet Inspection", 2010, 1466 . 1469 [Reda-2017] 1470 Reda, J., "New EU law prescribes website blocking in the 1471 name of 'consumer protection'", 2017, 1472 . 1474 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1475 RFC 793, DOI 10.17487/RFC0793, September 1981, 1476 . 1478 [RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) 1479 Extensions: Extension Definitions", RFC 6066, 1480 DOI 10.17487/RFC6066, January 2011, 1481 . 1483 [RFC7624] Barnes, R., Schneier, B., Jennings, C., Hardie, T., 1484 Trammell, B., Huitema, C., and D. Borkmann, 1485 "Confidentiality in the Face of Pervasive Surveillance: A 1486 Threat Model and Problem Statement", RFC 7624, 1487 DOI 10.17487/RFC7624, August 2015, 1488 . 1490 [RFC7754] Barnes, R., Cooper, A., Kolkman, O., Thaler, D., and E. 1491 Nordmark, "Technical Considerations for Internet Service 1492 Blocking and Filtering", RFC 7754, DOI 10.17487/RFC7754, 1493 March 2016, . 1495 [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., 1496 and P. Hoffman, "Specification for DNS over Transport 1497 Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 1498 2016, . 1500 [RFC8484] Hoffman, P. and P. 
McManus, "DNS Queries over HTTPS 1501 (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018, 1502 . 1504 [RSF-2005] 1505 Reporters Sans Frontieres, "Technical ways to get around 1506 censorship", 2005, . 1509 [Rushe-2015] 1510 Rushe, D., "Bing censoring Chinese language search results 1511 for users in the US", 2013, 1512 . 1515 [RWB2020] Reporters Without Borders, "2020 World Press Freedom 1516 Index: Entering a decisive decade for journalism, 1517 exacerbated by coronavirus", 2020, . 1521 [Sandvine-2014] 1522 Sandvine, "Technology Showcase on Traffic Classification: 1523 Why Measurements and Freeform Policy Matter", 2014, 1524 . 1528 [Schoen-2007] 1529 Schoen, S., "EFF tests agree with AP: Comcast is forging 1530 packets to interfere with user traffic", 2007, 1531 . 1534 [Schone-2014] 1535 Schone, M., Esposito, R., Cole, M., and G. Greenwald, 1536 "Snowden Docs Show UK Spies Attacked Anonymous, Hackers", 1537 2014, . 1541 [Senft-2013] 1542 Senft, A., "Asia Chats: Analyzing Information Controls and 1543 Privacy in Asian Messaging Applications", 2013, 1544 . 1548 [Shbair-2015] 1549 Shbair, W., Cholez, T., Goichot, A., and I. Chrisment, 1550 "Efficiently Bypassing SNI-based HTTPS Filtering", 2015, 1551 . 1553 [SIDN2020] 1554 Moura, G., "Detecting and Taking Down Fraudulent Webshops 1555 at the .nl ccTLD", 2020, 1556 . 1559 [Singh-2019] 1560 Singh, K., Grover, G., and V. Bansal, "How India Censors 1561 the Web", 2019, . 1563 [Sophos-2015] 1564 Sophos, "Understanding Sophos Web Filtering", 2015, 1565 . 1568 [SSAC-109-2020] 1569 ICANN Security and Stability Advisory Committee, "SAC109: 1570 The Implications of DNS over HTTPS and DNS over TLS", 1571 2020, . 1574 [Tang-2016] 1575 Tang, C., "In-depth analysis of the Great Firewall of 1576 China", 2016, 1577 . 1580 [Thomson-2012] 1581 Thomson, I., "Syria Cuts off Internet and Mobile 1582 Communication", 2012, 1583 . 1586 [Tor-2020] 1587 The Tor Project, "Tor: Pluggable Transports", 2020, 1588 . 
1591 [Trustwave-2015] 1592 Trustwave, "Filter: SNI extension feature and HTTPS 1593 blocking", 2015, 1594 . 1597 [Tschantz-2016] 1598 Tschantz, M., Afroz, S., Anonymous, A., and V. Paxson, 1599 "SoK: Towards Grounding Censorship Circumvention in 1600 Empiricism", 2016, 1601 . 1603 [Verkamp-2012] 1604 Verkamp, J. and M. Gupta, "Inferring Mechanics of Web 1605 Censorship Around the World", 2012, 1606 . 1609 [Victor-2019] 1610 Victor, D., "Blizzard Sets Off Backlash for Penalizing 1611 Hearthstone Gamer in Hong Kong", 2019, 1612 . 1615 [Villeneuve-2011] 1616 Villeneuve, N., "Open Access: Chapter 8, Control and 1617 Resistance, Attacks on Burmese Opposition Media", 2011, 1618 . 1621 [VonLohmann-2008] 1622 VonLohmann, F., "FCC Rules Against Comcast for BitTorrent 1623 Blocking", 2008, . 1626 [Wagner-2009] 1627 Wagner, B., "Deep Packet Inspection and Internet 1628 Censorship: International Convergence on an 'Integrated 1629 Technology of Control'", 2009, 1630 . 1634 [Wagstaff-2013] 1635 Wagstaff, J., "In Malaysia, online election battles take a 1636 nasty turn", 2013, 1637 . 1640 [Weaver-2009] 1641 Weaver, N., Sommer, R., and V. Paxson, "Detecting Forged 1642 TCP Packets", 2009, . 1645 [Whittaker-2013] 1646 Whittaker, Z., "1,168 keywords Skype uses to censor, 1647 monitor its Chinese users", 2013, 1648 . 1651 [Wikip-DoS] 1652 Wikipedia, "Denial of Service Attacks", 2016, 1653 . 1656 [Wilde-2012] 1657 Wilde, T., "Knock Knock Knockin' on Bridges Doors", 2012, 1658 . 1661 [Winter-2012] 1662 Winter, P., "How China is Blocking Tor", 2012, 1663 . 1665 [WP-Def-2020] 1666 Wikipedia contributors, "Censorship", 2020, 1667 . 1670 [Wright-2013] 1671 Wright, J. and Y. Breindl, "Internet filtering trends in 1672 liberal democracies: French and German regulatory 1673 debates", 2013, 1674 . 1678 [Zhu-2011] 1679 Zhu, T., "An Analysis of Chinese Search Engine Filtering", 1680 2011, 1681 . 
1683 [Zmijewski-2014] 1684 Zmijewski, E., "Turkish Internet Censorship Takes a New 1685 Turn", 2014, . 1688 Authors' Addresses 1690 Joseph Lorenzo Hall 1691 Internet Society 1693 Email: hall@isoc.org 1695 Michael D. Aaron 1696 CU Boulder 1698 Email: michael.drew.aaron@gmail.com 1699 Stan Adams 1700 CDT 1702 Email: sadams@cdt.org 1704 Amelia Andersdotter 1706 Email: amelia.ietf@andersdotter.cc 1708 Ben Jones 1709 Princeton 1711 Email: bj6@cs.princeton.edu 1713 Nick Feamster 1714 U Chicago 1716 Email: feamster@uchicago.edu