pearg                                                          J.L. Hall
Internet-Draft                                          Internet Society
Intended status: Informational                                M.D. Aaron
Expires: 26 November 2022                                     CU Boulder
                                                         A. Andersdotter

                                                                B. Jones
                                                               Princeton
                                                             N. Feamster
                                                               U Chicago
                                                               M. Knodel
                                    Center for Democracy & Technology
                                                             25 May 2022


             A Survey of Worldwide Censorship Techniques
                    draft-irtf-pearg-censorship-06

Abstract

This document describes technical mechanisms employed in network censorship that regimes around the world use for blocking or impairing Internet traffic. It aims to make designers, implementers, and users of Internet protocols aware of the properties exploited and mechanisms used for censoring end-user access to information. This document makes no suggestions on individual protocol considerations, and is purely informational, intended as a reference.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 26 November 2022.

Copyright Notice

Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1.  Introduction
2.  Terminology
3.  Technical Prescription
4.  Technical Identification
   4.1.  Points of Control
   4.2.  Application Layer
      4.2.1.  HTTP Request Header Identification
      4.2.2.  HTTP Response Header Identification
      4.2.3.  Transport Layer Security (TLS)
      4.2.4.  Instrumenting Content Distributors
      4.2.5.  Deep Packet Inspection (DPI) Identification
   4.3.  Transport Layer
      4.3.1.  Shallow Packet Inspection and Transport Header Identification
      4.3.2.  Protocol Identification
   4.4.  Residual Censorship
5.  Technical Interference
   5.1.  Application Layer
      5.1.1.  DNS Interference
   5.2.  Transport Layer
      5.2.1.  Performance Degradation
      5.2.2.  Packet Dropping
      5.2.3.  RST Packet Injection
   5.3.  Routing Layer
      5.3.1.  Network Disconnection
      5.3.2.  Adversarial Route Announcement
   5.4.  Multi-layer and Non-layer
      5.4.1.  Distributed Denial of Service (DDoS)
      5.4.2.  Censorship in Depth
6.  Non-Technical Interference
   6.1.  Manual Filtering
   6.2.  Self-Censorship
   6.3.  Server Takedown
   6.4.  Notice and Takedown
   6.5.  Domain-Name Seizures
7.  Future work
8.  Contributors
9.  Informative References
Authors' Addresses

1. Introduction

Censorship is where an entity in a position of power - such as a government, organization, or individual - suppresses communication that it considers objectionable, harmful, sensitive, politically incorrect, or inconvenient [WP-Def-2020]. Although censors that engage in censorship must do so through legal, military, or other means, this document focuses largely on technical mechanisms used to achieve network censorship.

This document describes technical mechanisms that censorship regimes around the world use for blocking or impairing Internet traffic. See [RFC7754] for a discussion of Internet blocking and filtering in terms of implications for Internet architecture, rather than end-user access to content and services. There is also a growing field of academic study of censorship circumvention (see the review article of [Tschantz-2016]), results from which we seek to make relevant here for protocol designers and implementers.
Censorship circumvention also impacts the cost of implementing a censorship measure, and we mention tradeoffs in relation to such costs in conjunction with each technical method identified below.

2. Terminology

We describe three elements of Internet censorship: prescription, identification, and interference. The document contains three major sections, each corresponding to one of these elements. Prescription is the process by which censors determine what types of material they should censor, e.g., classifying pornographic websites as undesirable. Identification is the process by which censors classify specific traffic or traffic identifiers to be blocked or impaired, e.g., deciding that webpages containing "sex" in an HTTP header or that accept traffic through the URL www.sex.example are likely to be undesirable. Interference is the process by which censors intercede in communication and prevent access to censored materials by blocking access or impairing the connection, e.g., implementing a technical solution capable of identifying HTTP headers or URLs and ensuring they are rendered wholly or partially inaccessible.

3. Technical Prescription

Prescription is the process of figuring out what censors would like to block [Glanville-2008]. Generally, censors aggregate information "to block" in blocklists or use real-time heuristic assessment of content [Ding-1999]. Some national networks are designed to more naturally serve as points of control [Leyba-2019]. There are also indications that online censors use probabilistic machine learning techniques [Tang-2016]. Indeed, web crawling and machine learning techniques are an active research area in the effort to identify content deemed morally or commercially harmful to companies or consumers in some jurisdictions [SIDN2020].
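At enforcement time, a blocklist of this kind reduces to set lookups against the identifiers a censor can observe. A minimal sketch in Python (all list contents and names here are hypothetical, chosen only to illustrate the lookup):

```python
# Sketch of blocklist-based prescription: a censor aggregates
# identifiers "to block" and later checks observed traffic against
# them. All entries below are hypothetical examples.
BLOCKED_KEYWORDS = {"forbidden-topic", "banned-term"}
BLOCKED_DOMAINS = {"bad.foo.example", "blocked.example"}
BLOCKED_IPS = {"192.0.2.1", "198.51.100.7"}

def domain_is_blocked(host):
    """A domain matches if it or any parent domain is listed."""
    labels = host.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in BLOCKED_DOMAINS
               for i in range(len(labels)))

def keyword_hit(text):
    """Real-time heuristic systems are far more involved; this is
    the simplest possible keyword check."""
    return any(kw in text.lower() for kw in BLOCKED_KEYWORDS)
```

The suffix walk in domain_is_blocked reflects the common choice of blocking a domain together with all of its subdomains.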
There are typically a few types of blocklist elements: keyword, domain name, protocol, or Internet Protocol (IP) address. Keyword and domain name blocking take place at the application level, e.g., HTTP; protocol blocking often occurs using Deep Packet Inspection to identify a forbidden protocol; IP blocking tends to take place using IP addresses in IPv4/IPv6 headers. Some censors also use the presence of certain keywords to enable more aggressive blocklists [Rambert-2021] or to be more permissive with content [Knockel-2021].

The mechanisms for building up these blocklists vary. Censors can purchase "content control" software from private industry, such as SmartFilter, which lets censors filter traffic from broad categories they would like to block, such as gambling or pornography [Knight-2005]. In these cases, these private services attempt to categorize every semi-questionable website so as to allow for meta-tag blocking. Similarly, they tune real-time content heuristic systems to map their assessments onto categories of objectionable content.

Countries that are more interested in retaining specific political control typically have ministries or organizations that maintain blocklists. Examples include the Ministry of Industry and Information Technology in China and the Ministry of Culture and Islamic Guidance in Iran; blocklists specific to copyright exist in France [HADOPI-2020], and across the EU for consumer protection law [Reda-2017].

4. Technical Identification

4.1. Points of Control

Internet censorship takes place in all parts of the network topology. It may be implemented in the network itself (e.g., local loop or backhaul), on the services side of communication (e.g., web hosts, cloud providers, or content delivery networks), in the ancillary services ecosystem (e.g., the domain name system or certificate authorities), or on the end-client side (e.g., in an end-user device such as a smartphone, laptop, or desktop, or software executed on such devices). An important aspect of pervasive technical interception is the necessity to rely on software or hardware to intercept the content the censor is interested in. There are various logical and physical points of control censors may use for interception mechanisms, including, though not limited to, the following.

* Internet Backbone: If a censor controls the gateways into a region, they can filter undesirable traffic that is traveling into and out of the region by packet sniffing and port mirroring at the relevant exchange points. Censorship at this point of control is most effective at controlling the flow of information between a region and the rest of the Internet, but is ineffective at identifying content traveling between users within a region. Some national network designs naturally serve as more effective chokepoints and points of control [Leyba-2019].

* Internet Service Providers (ISPs): ISPs are frequently exploited points of control. They have the benefit of being easily enumerable by a censor - often falling under the jurisdictional or operational control of a censor in an indisputable way - with the additional feature that an ISP can identify the regional and international traffic of all its users. The censor's filtration mechanisms can be placed on an ISP via governmental mandates, ownership, or voluntary/coercive influence.

* Institutions: Private institutions such as corporations, schools, and Internet cafes can use filtration mechanisms. These mechanisms are occasionally deployed at the request of a government censor, but can also be implemented to help achieve institutional goals, such as fostering a particular moral outlook among schoolchildren, independent of broader society or government goals.
* Content Distribution Networks (CDNs): CDNs seek to collapse network topology in order to locate content closer to the service's users. This reduces content transmission latency and improves quality of service. The CDN service's content servers, located "close" to the user in a network sense, can be powerful points of control for censors, especially if the location of CDN content repositories allows for easier interference.

* Certificate Authorities (CAs) for Public-Key Infrastructures (PKIs): Authorities that issue cryptographically secured resources can be a significant point of control. CAs that issue certificates to domain holders for TLS/HTTPS (the Web PKI), or Regional/Local Internet Registries (RIRs) that issue Route Origination Authorizations (ROAs) to BGP operators, can be forced to issue rogue certificates that may allow compromise, i.e., by allowing censorship software to engage in identification and interference where not possible before. CAs may also be forced to revoke certificates. This may lead to adversarial traffic routing or TLS interception being allowed, or to an otherwise rightful origin or destination point of traffic flows being unable to communicate in a secure way.

* Services: Application service providers can be pressured, coerced, or legally required to censor specific content or data flows. Service providers naturally face incentives to maximize their potential customer base, and potential service shutdowns or legal liability due to censorship efforts may seem much less attractive than potentially excluding content, users, or uses of their service. Services have increasingly become focal points of censorship discussions, as well as the focus of discussions of moral imperatives to use censorship tools.
* Content sites: On the services side of communications lie many platforms publishing user-generated content that require terms-of-service compliance from all content and user accounts in order to avoid intermediary liability for the web hosts. In aggregate, these policies, actions, and remedies are known as content moderation. Content moderation happens above the services or application layer, but these mechanisms are built to filter, sort, and block content and users, thus making them available to censors through direct pressure on the private entity.

* Personal Devices: Censors can mandate that censorship software be installed at the device level. This has many disadvantages in terms of scalability, ease of circumvention, and operating system requirements. (Of course, if a personal device is treated with censorship software before sale and this software is difficult to reconfigure, this may work in favor of those seeking to control information, say for children, students, customers, or employees.) The emergence of mobile devices exacerbates these feasibility problems. This software can also be mandated by institutional actors acting on non-governmentally mandated moral imperatives.

At all levels of the network hierarchy, the filtration mechanisms used to censor undesirable traffic are essentially the same: a censor either directly identifies undesirable content using the identifiers described below and then uses a blocking or shaping mechanism such as the ones exemplified below to prevent or impair access, or requests that an actor ancillary to the censor, such as a private entity, perform these functions. Identification of undesirable traffic can occur at the application, transport, or network layer of the IP stack. Censors often focus on web traffic, so the relevant protocols tend to be filtered in predictable ways (see Section 4.2.1 and Section 4.2.2). For example, a subversive image might make it past a keyword filter. However, if the image is later deemed undesirable, a censor may then blocklist the provider site's IP address.

4.2. Application Layer

The following subsections describe properties and tradeoffs of common ways in which censors filter using application-layer information. Each subsection includes empirical examples describing these common behaviors for further reference.

4.2.1. HTTP Request Header Identification

An HTTP header contains a lot of useful information for traffic identification. Although "Host" is the only required field in an HTTP request header (for HTTP/1.1 and later), an HTTP method field is necessary to do anything useful. As such, "method" and "Host" are the two fields used most often for ubiquitous censorship. A censor can sniff traffic and identify a specific domain name (host) and usually a page name (GET /page) as well. This identification technique is usually paired with transport header identification (see Section 4.3.1) for a more robust method.

Tradeoffs: Request Identification is a technically straightforward identification method that can be easily implemented at the backbone or ISP level. The hardware needed for this sort of identification is cheap and easy to acquire, making it desirable when budget and scope are a concern. HTTPS will encrypt the relevant request and response fields, so pairing with transport identification (see Section 4.3.1) is necessary for HTTPS filtering. However, some countermeasures can trivially defeat simple forms of HTTP Request Header Identification. For example, two cooperating endpoints - an instrumented web server and client - could encrypt or otherwise obfuscate the "Host" header in a request, potentially thwarting techniques that match against "Host" header values.
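In its simplest form, the matching described above parses the request line and the Host header out of a sniffed cleartext request and checks them against a blocklist. A minimal sketch (the blocked domain is a hypothetical example):

```python
# Sketch of HTTP Request Header Identification: extract the method and
# Host field from a cleartext HTTP/1.1 request and match against a
# blocklist. The blocked domain below is a hypothetical example.
BLOCKED_HOSTS = {"bad.foo.example"}

def identify_request(raw):
    """Return (method, host, blocked?) for a sniffed HTTP request."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii", "replace")
    lines = head.split("\r\n")
    method = lines[0].split(" ", 1)[0]     # e.g. "GET /page HTTP/1.1"
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            break
    return method, host, host in BLOCKED_HOSTS
```

As the tradeoffs above note, cooperating endpoints that obfuscate the "Host" header defeat exactly this kind of matcher, which is why it is usually paired with transport header identification.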
Empirical Examples: Studies exploring censorship mechanisms have found evidence of HTTP header/URL filtering in many countries, including Bangladesh, Bahrain, China, India, Iran, Malaysia, Pakistan, Russia, Saudi Arabia, South Korea, Thailand, and Turkey [Verkamp-2012] [Nabi-2013] [Aryan-2012]. Commercial technologies such as McAfee SmartFilter and NetSweeper are often purchased by censors [Dalek-2013]. These commercial technologies use a combination of HTTP Request Identification and Transport Header Identification to filter specific URLs. Dalek et al. and Jones et al. identified the use of these products in the wild [Dalek-2013] [Jones-2014].

4.2.2. HTTP Response Header Identification

While HTTP Request Header Identification relies on the information contained in the HTTP request from client to server, response identification uses information sent in response by the server to the client to identify undesirable content.

Tradeoffs: As with HTTP Request Header Identification, the techniques used to identify HTTP traffic are well known, cheap, and relatively easy to implement. However, they are made useless by HTTPS because HTTPS encrypts the response and its headers.

The response fields are also less helpful for identifying content than request fields, as "Server" could easily be identified using HTTP Request Header Identification, and "Via" is rarely relevant. HTTP response censorship mechanisms normally let the first n packets through while the mirrored traffic is being processed; this may allow some content through, and the user may be able to detect that the censor is actively interfering with undesirable content.

Empirical Examples: In 2009, Jong Park et al. at the University of New Mexico demonstrated that the Great Firewall of China (GFW) has used this technique [Crandall-2010]. However, Jong Park et al. found that the GFW discontinued this practice during the course of the study. Due to the overlap between HTTP response filtering and keyword filtering (see Section 4.2.4), it is likely that most censors rely on keyword filtering over TCP streams instead of HTTP response filtering.

4.2.3. Transport Layer Security (TLS)

Similar to HTTP, censors have deployed a variety of techniques towards censoring Transport Layer Security (TLS) (and, by extension, HTTPS). Most of these techniques relate to the Server Name Indication (SNI) field, including censoring SNI, Encrypted SNI, or omitted SNI. Censors can also censor HTTPS content via server certificates. Note that TLS 1.3 acts as the security component of QUIC.

4.2.3.1. Server Name Indication (SNI)

In encrypted connections using TLS, there may be servers that host multiple "virtual servers" at a given network address, and the client will need to specify in the Client Hello message which domain name it seeks to connect to (so that the server can respond with the appropriate TLS certificate) using the Server Name Indication (SNI) TLS extension [RFC6066]. The Client Hello message is unencrypted for TCP-based TLS. When using QUIC, the Client Hello message is encrypted, but its confidentiality is not effectively protected because the initial encryption keys are derived using a value that is visible on the wire. Since SNI is often sent in the clear (as are the certificate fields sent in response), censors and filtering software can use it (and response certificate fields) as a basis for blocking, filtering, or impairment by dropping connections to domains that match prohibited content (e.g., bad.foo.example may be censored while good.foo.example is not) [Shbair-2015].
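SNI-based matching amounts to parsing a cleartext Client Hello just far enough to read the server_name extension [RFC6066] and comparing it to a blocklist. A minimal, non-interoperable sketch (the builder exists only to exercise the parser, and the blocked domain reuses the hypothetical example above):

```python
import struct

# Sketch of SNI-based identification: walk a cleartext TLS record
# carrying a ClientHello and pull out the server_name extension
# (extension type 0) so it can be matched against a blocklist.
BLOCKED_SNI = {"bad.foo.example"}   # hypothetical blocklist

def extract_sni(record):
    """Return the SNI hostname from a TLS record, or None."""
    if record[0] != 0x16:                # not a Handshake record
        return None
    hs = record[5:]                      # skip 5-byte record header
    if hs[0] != 0x01:                    # not a ClientHello
        return None
    p = 4 + 2 + 32                       # handshake hdr, version, random
    p += 1 + hs[p]                       # session_id
    p += 2 + struct.unpack("!H", hs[p:p+2])[0]   # cipher_suites
    p += 1 + hs[p]                       # compression_methods
    ext_end = p + 2 + struct.unpack("!H", hs[p:p+2])[0]
    p += 2
    while p + 4 <= ext_end:
        ext_type, ext_len = struct.unpack("!HH", hs[p:p+4])
        p += 4
        if ext_type == 0:                # server_name extension
            # list length (2), name_type (1), name length (2), hostname
            name_len = struct.unpack("!H", hs[p+3:p+5])[0]
            return hs[p+5:p+5+name_len].decode("ascii")
        p += ext_len
    return None

def build_client_hello(host):
    """Build a minimal (not interoperable) ClientHello carrying an SNI."""
    name = host.encode("ascii")
    sni_data = struct.pack("!HBH", len(name) + 3, 0, len(name)) + name
    ext = struct.pack("!HH", 0, len(sni_data)) + sni_data
    exts = struct.pack("!H", len(ext)) + ext
    body = (b"\x03\x03" + bytes(32)               # version, random
            + b"\x00"                             # empty session_id
            + struct.pack("!H", 2) + b"\x13\x01"  # one cipher suite
            + b"\x01\x00"                         # null compression only
            + exts)
    hs = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(hs)) + hs
```

A filter then simply checks `extract_sni(record) in BLOCKED_SNI` and drops matching connections; note that this exact parse is what becomes expensive for QUIC, where the Initial packet must first be decrypted.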
There are ongoing standardization efforts in the TLS Working Group to encrypt SNI [I-D.ietf-tls-sni-encryption] [I-D.ietf-tls-esni], and recent research shows promising results for the use of encrypted SNI in the face of SNI-based filtering [Chai-2019] in some countries.

Domain fronting has been one popular way to avoid identification by censors [Fifield-2015]. To avoid identification by censors, applications using domain fronting put a different domain name in the SNI extension than in the Host: header, which is protected by HTTPS. The visible SNI would indicate an unblocked domain, while the blocked domain remains hidden in the encrypted application header. Some encrypted messaging services relied on domain fronting to enable their provision in countries employing SNI-based filtering. These services used the cover provided by domains for which blocking at the domain level would be undesirable to hide their true domain names. However, the companies holding the most popular domains have since reconfigured their software to prevent this practice. It may be possible to achieve similar results using potential future options to encrypt SNI.

Tradeoffs: Some clients do not send the SNI extension (e.g., clients that only support versions of SSL and not TLS), rendering this method ineffective (see Section 4.2.3.3). In addition, this technique requires deep packet inspection techniques that can be computationally and infrastructurally expensive, especially when applied to QUIC, where deep packet inspection requires key extraction and decryption of the Client Hello in order to read the SNI. Improper configuration of an SNI-based block can result in significant overblocking, e.g., when a second-level domain like populardomain.example is inadvertently blocked. In the case of encrypted SNI, pressure to censor may transfer to other points of intervention, such as content and application providers.

Empirical Examples: There are many examples of security firms that offer SNI-based filtering products [Trustwave-2015] [Sophos-2015] [Shbair-2015], and the governments of China, Egypt, Iran, Qatar, South Korea, Turkey, Turkmenistan, and the UAE all do widespread SNI filtering or blocking [OONI-2018] [OONI-2019] [NA-SK-2019] [CitizenLab-2018] [Gatlan-2019] [Chai-2019] [Grover-2019] [Singh-2019]. SNI blocking against QUIC traffic was first observed in Russia in March 2022 [Elmenhorst-2022].

4.2.3.2. Encrypted SNI (ESNI)

Given the data leakage present with the SNI field, a natural response is to encrypt it, which is forthcoming in TLS 1.3 with Encrypted Client Hello (ECH). Prior to ECH, the Encrypted SNI (ESNI) extension was available to prevent the data leakage caused by SNI; it encrypts only the SNI field. Unfortunately, censors can target connections that use the ESNI extension specifically for censorship. This guarantees overblocking for the censor, but can be worth the cost if ESNI is not yet widely deployed within the country. Encrypted Client Hello (ECH) is the emerging standard for protecting the entire TLS Client Hello, but it is not yet widely deployed.

Tradeoffs: The cost of censoring Encrypted SNI (ESNI) is significantly higher to a censor than censoring SNI, as the censor can no longer target censorship to specific domains and guarantees overblocking. In these cases, the censor uses the overblocking to discourage the use of ESNI entirely.

Empirical Examples: In 2020, China began censoring all uses of Encrypted SNI (ESNI) [Bock-2020b], even for innocuous connections.
455 The censorship mechanism for China's ESNI censorship differs from how 456 China censors SNI-based connections, suggesting that new middleboxes 457 were deployed specifically to target ESNI connections. 459 4.2.3.3. Omitted-SNI 461 Researchers have observed that some clients omit the SNI extension 462 entirely. This omitted-SNI approach limits the information available 463 to a censor. Like with ESNI, censors can choose to block connections 464 that omit the SNI, though this too risks over-blocking. 466 Tradeoffs: The approach of censoring all connections that omit the 467 SNI field is guaranteed to over-block, though connections that omit 468 the SNI field should be relatively rare in the wild. 470 Empirical Examples: In the past, researchers have observed censors in 471 Russia blocking connections that omit the SNI field [Bock-2020b]. 473 4.2.3.4. Server Response Certificate 475 During the TLS handshake after the TLS Client Hello, the server will 476 respond with the TLS certificate. This certificate also contains the 477 domain the client is trying to access, creating another avenue that 478 censors can use to perform censorship. This technique will not work 479 in TLS 1.3, as the certificate will be encrypted. 481 Tradeoffs: Censoring based on the server certificate requires deep 482 packet inspection techniques that can be more computationally 483 expensive compared to other methods. Additionally, the certificate 484 is sent later in the TLS Handshake compared to the SNI field, forcing 485 the censor to track the connection for longer. 487 Empirical Examples: Researchers have observed the Reliance Jio ISP in 488 India using certificate response fields to censor connections 489 [Satija-2021]. 491 4.2.4. 
Instrumenting Content Distributors 493 Many governments pressure content providers to censor themselves, or 494 provide the legal framework within which content distributors are 495 incentivized to follow the content restriction preferences of agents 496 external to the content distributor [Boyle-1997]. Due to the 497 extensive reach of such censorship, we define content distributor as 498 any service that provides utility to users, including everything from 499 web sites to locally installed programs. A commonly used method of 500 instrumenting content distributors consists of keyword identification 501 to detect restricted terms on their platform. Governments may 502 provide the terms on such keyword lists. Alternatively, the content 503 provider may be expected to come up with their own list. A different 504 method of instrumenting content distributors consists of requiring a 505 distributor to disassociate with some categories of users. See also 506 Section 6.4. 508 Tradeoffs: By instrumenting content distributors to identify 509 restricted content or content providers, the censor can gain new 510 information at the cost of political capital with the companies it 511 forces or encourages to participate in censorship. For example, the 512 censor can gain insight about the content of encrypted traffic by 513 coercing web sites to identify restricted content. Coercing content 514 distributors to regulate users, categories of users, content and 515 content providers may encourage users and content providers to 516 exhibit self-censorship, an additional advantage for censors (see 517 Section 6.2). The tradeoffs for instrumenting content distributors 518 are highly dependent on the content provider and the requested 519 assistance. A typical concern is that the targeted keywords or 520 categories of users are too broad, risk being too broadly applied, or 521 are not subjected to a sufficiently robust legal process prior to 522 their mandatory application (see p. 
8 of [EC-2012]). 524 Empirical Examples: Researchers discovered keyword identification by 525 content providers on platforms ranging from instant messaging 526 applications [Senft-2013] to search engines [Rushe-2015] [Cheng-2010] 527 [Whittaker-2013] [BBC-2013] [Condliffe-2013]. To demonstrate the 528 prevalence of this type of keyword identification, we look to search 529 engine censorship. 531 Search engine censorship demonstrates keyword identification by 532 content providers and can be regional or worldwide. Implementation 533 is occasionally voluntary, but normally it is based on laws and 534 regulations of the country a search engine is operating in. The 535 keyword blocklists are most likely maintained by the search engine 536 provider. China is known to require search engine providers to 537 "voluntarily" maintain search term blocklists to acquire and keep an 538 Internet content provider (ICP) license [Cheng-2010]. It is clear 539 these blocklists are maintained by each search engine provider based 540 on the slight variations in the intercepted searches [Zhu-2011] 541 [Whittaker-2013]. The United Kingdom has been pushing search engines 542 to self-censor with the threat of litigation if they do not do it 543 themselves: Google and Microsoft have agreed to block more than 544 100,000 queries in the U.K. to help combat abuse [BBC-2013] 545 [Condliffe-2013]. European Union law, as well as US law, requires 546 modification of search engine results in response to either 547 copyright, trademark, data protection or defamation concerns 548 [EC-2012]. 550 Depending on the output, search engine keyword identification may be 551 difficult or easy to detect. In some cases specialized or blank 552 results provide a trivial enumeration mechanism, but more subtle 553 censorship can be difficult to detect.
In February 2015, Microsoft's 554 search engine, Bing, was accused of censoring Chinese content outside 555 of China [Rushe-2015] because Bing returned different results for 556 censored terms in Chinese and English. However, it is possible that 557 censorship of the largest base of Chinese search users, China, biased 558 Bing's results so that the more popular results in China (the 559 uncensored results) were also more popular for Chinese speakers 560 outside of China. 562 Disassociation by content distributors from certain categories of 563 users has happened, for instance, in Spain, as a result of the conflict 564 between the Catalan independence movement and the Spanish legal 565 presumption of a unitary state [Lomas-2019]. E-sport event 566 organizers have also disassociated themselves from top players who 567 expressed political opinions in relation to the 2019 Hong Kong 568 protests [Victor-2019]. See also Section 5.3.1. 570 4.2.5. Deep Packet Inspection (DPI) Identification 572 DPI (deep packet inspection) is technically any kind of packet 573 analysis beyond IP address and port number, and it has become 574 computationally feasible as a component of censorship mechanisms in 575 recent years [Wagner-2009]. Unlike other techniques, DPI reassembles 576 network flows to examine the application "data" section, as opposed 577 to only headers, and is therefore often used for keyword 578 identification. DPI also differs from other identification 579 technologies because it can leverage additional packet and flow 580 characteristics, e.g., packet sizes and timings, when identifying 581 content. To prevent substantial quality of service (QoS) impacts, 582 DPI normally analyzes a copy of data while the original packets 583 continue to be routed. Typically, the traffic is split using either 584 a mirror switch or fiber splitter, and analyzed on a cluster of 585 machines running Intrusion Detection Systems (IDS) configured for 586 censorship.
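The keyword-identification stage of such a DPI system can be sketched as a pure function over a reassembled flow. A minimal sketch, assuming segments have already been captured from a mirrored copy of the traffic; the keyword list and payloads are hypothetical placeholders, not any real blocklist:

```python
# Sketch of the keyword-identification stage of a DPI censorship system.
# Assumes segments were captured from a mirrored copy of the traffic;
# the keyword list below is a hypothetical placeholder.

RESTRICTED_KEYWORDS = [b"forbidden-term", b"blocked-site.example"]

def reassemble(segments):
    """Join captured TCP segments in sequence-number order."""
    return b"".join(payload for _, payload in sorted(segments))

def flags_restricted(flow_payload):
    """True if the application data contains any restricted keyword."""
    return any(kw in flow_payload for kw in RESTRICTED_KEYWORDS)

# Segments can arrive out of order; reassembly restores the request.
segments = [(200, b"Host: blocked-site.example\r\n"),
            (100, b"GET / HTTP/1.1\r\n")]
print(flags_restricted(reassemble(segments)))  # True
```

A real deployment would run such checks on the IDS cluster fed by a mirror switch or fiber splitter, as described above, so that matching does not delay the forwarded packets.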
588 Tradeoffs: DPI is one of the most expensive identification mechanisms 589 and can have a large QoS impact [Porter-2010]. When used as a 590 keyword filter for TCP flows, DPI systems can also cause major 591 overblocking problems. Like other techniques, DPI is less useful 592 against encrypted data, though DPI can leverage unencrypted elements 593 of an encrypted data flow, e.g., the Server Name Indication (SNI) 594 sent in the clear for TLS, or metadata about an encrypted flow, e.g., 595 packet sizes, which differ across video and textual flows, to 596 identify traffic. See Section 4.2.3.1 for more information about 597 SNI-based filtration mechanisms. 599 Other kinds of information can be inferred by comparing certain 600 unencrypted elements exchanged during TLS handshakes to similar data 601 points from known sources. This practice, called TLS fingerprinting, 602 allows a probabilistic identification of a party's operating system, 603 browser, or application based on a comparison of the specific 604 combinations of TLS version, ciphersuites, compression options, etc. 605 sent in the ClientHello message to similar signatures found in 606 unencrypted traffic [Husak-2016]. 608 Despite these problems, DPI is the most powerful identification 609 method and is widely used in practice. The Great Firewall of China 610 (GFW), the largest censorship system in the world, uses DPI to 611 identify restricted content over HTTP and DNS and inject TCP RSTs and 612 bad DNS responses, respectively, into connections [Crandall-2010] 613 [Clayton-2006] [Anonymous-2014]. 615 Empirical Examples: Several studies have found evidence of censors 616 using DPI for censoring content and tools. Clayton et al., Crandall 617 et al., Anonymous, and Khattak et al. all explored the GFW 618 [Crandall-2010] [Clayton-2006] [Anonymous-2014]. Khattak et al. even 619 probed the firewall to discover implementation details like how much 620 state it stores [Khattak-2013].
The Tor project claims that China, 621 Iran, Ethiopia, and others must have used DPI to block the obfs2 622 protocol [Wilde-2012]. Malaysia has been accused of using targeted 623 DPI, paired with DDoS, to identify and subsequently attack pro- 624 opposition material [Wagstaff-2013]. It also seems likely that 625 organizations not so worried about blocking content in real-time 626 could use DPI to sort and categorically search gathered traffic using 627 technologies such as NarusInsight [Hepting-2011]. 629 4.3. Transport Layer 631 4.3.1. Shallow Packet Inspection and Transport Header Identification 633 Of the various shallow packet inspection methods, Transport Header 634 Identification is the most pervasive, reliable, and predictable type 635 of identification. Transport headers contain a few invaluable pieces 636 of information that must be transparent for traffic to be 637 successfully routed: destination and source IP address and port. 638 Destination and source IP addresses are doubly useful, as they not only 639 allow a censor to block undesirable content via IP blocklisting, but 640 also allow a censor to identify the IP of the user making the 641 request and the IP address of the destination being visited, which in 642 most cases can be used to infer the domain being visited 643 [Patil-2019]. Port is useful for allowlisting certain applications. 645 Combining IP address, port and protocol information found in the 646 transport header, shallow packet inspection can be used by a censor 647 to identify specific TCP or UDP endpoints. UDP endpoint blocking has 648 been observed in the context of QUIC blocking [Elmenhorst-2021]. 650 Trade-offs: Header identification is popular due to its simplicity, 651 availability, and robustness. 653 Header identification is trivial to implement, but is difficult to 654 implement in backbone or ISP routers at scale, and is therefore 655 typically implemented with DPI.
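The endpoint-matching step itself reduces to a set lookup on header fields. A minimal sketch, with documentation-range addresses standing in for a real blocklist:

```python
# Sketch of shallow packet inspection on transport headers: drop any
# packet whose (destination IP, destination port, protocol) endpoint
# is on a blocklist. The entries below are illustrative only.

BLOCKED_ENDPOINTS = {
    ("198.51.100.7", 443, "TCP"),  # hypothetical blocked HTTPS server
    ("203.0.113.9", 443, "UDP"),   # hypothetical blocked QUIC server
}

def should_drop(dst_ip, dst_port, proto):
    """Decide from headers alone, without touching the payload."""
    return (dst_ip, dst_port, proto) in BLOCKED_ENDPOINTS

print(should_drop("198.51.100.7", 443, "TCP"))  # True
print(should_drop("198.51.100.7", 80, "TCP"))   # False
```

Including the protocol in the tuple lets the same table cover both TCP and UDP endpoints, matching the QUIC (UDP) endpoint blocking observed in the wild.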
Blocklisting an IP is equivalent to 656 installing a specific route on a router (such as a /32 route for IPv4 657 addresses and a /128 route for IPv6 addresses). However, due to 658 limited flow table space, this cannot scale beyond a few thousand IPs 659 at most. IP blocking is also relatively crude. It often leads to 660 overblocking and cannot deal with some services like Content 661 Distribution Networks (CDN) that host content at hundreds or 662 thousands of IP addresses. Despite these limitations, IP blocking is 663 extremely effective because the user needs to proxy their traffic 664 through another destination to circumvent this type of 665 identification. In addition, IP blocking is effective against all 666 protocols above IP, e.g. TCP and QUIC. 668 Port-blocking is generally not useful because many types of content 669 share the same port and it is possible for censored applications to 670 change their port. For example, most HTTP traffic goes over port 80, 671 so the censor cannot differentiate between restricted and allowed web 672 content solely on the basis of port. HTTPS goes over port 443, with 673 similar consequences for the censor, except that only partial metadata 674 is available to the censor. Port allowlisting is occasionally 675 used, where a censor limits communication to approved ports, such as 676 80 for HTTP traffic, and is most effective when used in conjunction 677 with other identification mechanisms. For example, a censor could 678 block the default HTTPS port, port 443, thereby forcing most users to 679 fall back to HTTP. A counter-example is that port 25 (SMTP) has long 680 been blocked on residential ISPs' networks to reduce the risk of 681 email spam, but doing so also prevents residential ISP customers 682 from running their own email servers. 684 4.3.2. Protocol Identification 686 Censors sometimes identify entire protocols to be blocked using a 687 variety of traffic characteristics.
For example, Iran impairs the 688 performance of HTTPS traffic, a protocol that prevents further 689 analysis, to encourage users to switch to HTTP, a protocol that they 690 can analyze [Aryan-2012]. A simple protocol identification approach 691 would be to recognize all TCP traffic over port 443 as HTTPS, but more 692 sophisticated analysis of the statistical properties of payload data 693 and flow behavior would be more effective, even when port 443 is not 694 used [Hjelmvik-2010] [Sandvine-2014]. 696 If censors can detect circumvention tools, they can block them, so 697 censors like China are extremely interested in identifying the 698 protocols for censorship circumvention tools. In recent years, this 699 has devolved into an arms race between censors and circumvention tool 700 developers. As part of this arms race, China developed an extremely 701 effective protocol identification technique that researchers call 702 active probing or active scanning. 704 In active probing, the censor determines whether hosts are running a 705 circumvention protocol by trying to initiate communication using the 706 circumvention protocol. If the host and the censor successfully 707 negotiate a connection, then the censor conclusively knows that the host 708 is running a circumvention tool. China has used active scanning to 709 great effect to block Tor [Winter-2012]. 711 Trade-offs: Protocol identification necessarily only provides insight 712 into the way information is traveling, and not the information 713 itself. 715 Protocol identification is useful for detecting and blocking 716 circumvention tools, like Tor, or traffic that is difficult to 717 analyze, like VoIP or SSL, because the censor can assume that this 718 traffic should be blocked. However, this can lead to over-blocking 719 problems when used with popular protocols.
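The simple port-plus-payload recognition described above can be sketched as a small classifier; the rules below are illustrative only, and real systems add statistical flow features such as packet sizes and timings:

```python
# Crude protocol-identification sketch. A TLS record begins with a
# handshake content type (0x16) and a 0x03 version byte; HTTP begins
# with a method name. Anything unrecognized is what an allowlisting
# censor would block. Heuristics are illustrative, not exhaustive.

def classify(dst_port, first_bytes):
    if first_bytes[:2] == b"\x16\x03":
        return "TLS"
    if first_bytes.split(b" ")[0] in (b"GET", b"POST", b"HEAD"):
        return "HTTP"
    if dst_port == 22 and first_bytes.startswith(b"SSH-"):
        return "SSH"
    return "UNKNOWN"

print(classify(443, b"\x16\x03\x01\x00\xc8"))  # TLS
print(classify(80, b"GET / HTTP/1.1"))         # HTTP
print(classify(9999, b"\x7fobfuscated"))       # UNKNOWN
```

Note that a circumvention protocol deliberately designed to look like random bytes falls into the "UNKNOWN" bucket, which is precisely what allowlisting censors exploit.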
These methods are 720 expensive, both computationally and financially, due to the use of 721 statistical analysis, and can be ineffective due to their imprecise 722 nature. 724 Censors have also used protocol identification in the past in an 725 'allowlist' filtering capacity, such as by only allowing specific, 726 pre-vetted protocols to be used and blocking any unrecognized 727 protocols [Bock-2020]. These protocol filtering approaches can also 728 lead to over-blocking if the allowed list of protocols is too small 729 or incomplete, but can be cheap to implement, as many standard 730 'allowed' protocols are simple to identify (such as HTTP). 732 Empirical Examples: Protocol identification can be easy to detect if 733 it is conducted in real time and only a particular protocol is 734 blocked, but some types of protocol identification, like active 735 scanning, are much more difficult to detect. Protocol identification 736 has been used by Iran to identify and throttle SSH traffic to make it 737 unusable [Anonymous-2007] and by China to identify and block Tor 738 relays [Winter-2012]. Protocol identification has also been used for 739 traffic management, such as the 2007 case where Comcast in the United 740 States used RST injection to interrupt BitTorrent traffic 741 [Winter-2012]. In 2022, Russia seemed to have used protocol 742 identification to block most HTTP/3 connections [Elmenhorst-2022]. 744 4.4. Residual Censorship 746 Another feature of some modern censorship systems is residual 747 censorship, a punitive form of censorship whereby after a censor 748 disrupts a forbidden connection, the censor continues to target 749 subsequent connections, even if they are innocuous [Bock-2021].
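The state such a censor keeps can be sketched as a set of recorded tuples; a minimal sketch of the 3-tuple (client IP, server IP, server port) variant, with documentation-range addresses as placeholders:

```python
# Sketch of residual censorship state. After a forbidden connection is
# disrupted, the censor records a tuple; later flows matching it are
# disrupted too, even if their contents are innocuous.

residual = set()

def record_violation(client_ip, server_ip, server_port):
    residual.add((client_ip, server_ip, server_port))  # 3-tuple variant

def is_residually_censored(client_ip, server_ip, server_port):
    return (client_ip, server_ip, server_port) in residual

record_violation("192.0.2.10", "198.51.100.7", 443)
# A later, innocuous connection to the same server is still disrupted,
# regardless of which ephemeral client port it uses:
print(is_residually_censored("192.0.2.10", "198.51.100.7", 443))  # True
```

A 4-tuple variant would also key on the client port, so a client that reconnects from a fresh ephemeral port would escape it; a 2-tuple variant would drop the server port and block the client from the entire server IP.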
753 Residual censorship can take many forms and often relies on the 754 methods of technical interference described in the next section. 756 An important facet of residual censorship is precisely what the 757 censor continues to block after censorship is initially triggered. 758 There are three common options available to an adversary: 2-tuple 759 (client IP, server IP), 3-tuple (client IP, server IP+port), or 760 4-tuple (client IP+port, server IP+port). Future connections that 761 match the tuple of information the censor records will be disrupted 762 [Bock-2021]. 764 Residual censorship can sometimes be difficult to identify and can 765 often complicate censorship measurement. 767 Trade-offs: The impact of residual censorship is to provide users 768 with further discouragement from trying to access forbidden content, 769 though it is not clear how successful it is at accomplishing this. 771 Empirical Examples: China has used 3-tuple residual censorship in 772 conjunction with their HTTP censorship for years and researchers have 773 reported seeing similar residual censorship for HTTPS. China seems 774 to use a mix of 3-tuple and 4-tuple residual censorship for their 775 censorship of HTTPS with ESNI. Some censors that perform censorship 776 via packet dropping often accidentally implement 4-tuple residual 777 censorship, including Iran and Kazakhstan [Bock-2021]. 779 5. Technical Interference 781 5.1. Application Layer 783 5.1.1. DNS Interference 785 There are a variety of mechanisms that censors can use to block or 786 filter access to content by altering responses from the DNS 787 [AFNIC-2013] [ICANN-SSAC-2012], including blocking the response, 788 replying with an error message, or responding with an incorrect 789 address. Note that there are now encrypted transports for DNS 790 queries in DNS-over-HTTPS [RFC8484] and DNS-over-TLS [RFC7858] that 791 can mitigate interference with DNS queries between the stub and the 792 resolver. 
794 Responding to a DNS query with an incorrect address can be achieved 795 with on-path interception, off-path cache poisoning, and lying by the 796 nameserver. 798 "DNS mangling" is a network-level technique of on-path interception 799 where an incorrect IP address is returned in response to a DNS query 800 to a censored destination. An example of this is what some Chinese 801 networks do (we are not aware of any other wide-scale uses of 802 mangling). On those Chinese networks, every DNS request in transit 803 is examined (presumably by network inspection technologies such as 804 DPI) and, if it matches a censored domain, a false response is 805 injected. End users can see this technique in action by simply 806 sending DNS requests to any unused IP address in China (see example 807 below). If it is not a censored name, there will be no response. If 808 it is censored, a forged response will be returned. For example, 809 using the command-line dig utility to query an unused IP address in 810 China of 192.0.2.2 for the name "www.uncensored.example" compared 811 with "www.censored.example" (censored at the time of writing), we get 812 a forged IP address "198.51.100.0" as a response: 814 % dig +short +nodnssec @192.0.2.2 A www.uncensored.example 815 ;; connection timed out; no servers could be reached 817 % dig +short +nodnssec @192.0.2.2 A www.censored.example 818 198.51.100.0 820 DNS cache poisoning happens off-path and refers to a mechanism where 821 a censor interferes with the response sent by an authoritative DNS 822 name server to a recursive resolver by responding more quickly than 823 the authoritative name server can respond with an alternative IP 824 address [Halley-2008]. Cache poisoning occurs after the requested 825 site's name servers resolve the request and attempt to forward the 826 true IP back to the requesting device; on the return route the 827 resolved IP is recursively cached by each DNS server that initially 828 forwarded the request. 
During this caching process, if an undesirable 829 keyword is recognized, the resolved IP is "poisoned" and an 830 alternative IP (or NXDOMAIN error) is returned more quickly than the 831 upstream resolver can respond, causing a forged IP address to be 832 cached (and potentially recursively so). The alternative IPs usually 833 direct to a nonsense domain or a warning page. Alternatively, 834 Iranian censorship appears to prevent the communication en-route, 835 preventing a response from ever being sent [Aryan-2012]. 837 There are also cases of what is colloquially called "DNS lying", 838 where a censor mandates that the DNS responses provided - by an 839 operator of a recursive resolver such as an Internet access provider 840 - be different from what the authoritative name server would provide 841 [Bortzmeyer-2015]. 843 Trade-offs: These forms of DNS interference require the censor to 844 force a user to traverse a controlled DNS hierarchy (or intervening 845 network on which the censor serves as an Active Pervasive Attacker 846 [RFC7624] to rewrite DNS responses) for the mechanism to be 847 effective. It can be circumvented by using alternative DNS resolvers 848 (such as any of the public DNS resolvers) that may fall outside of 849 the jurisdictional control of the censor, or Virtual Private Network 850 (VPN) technology. DNS mangling and cache poisoning also imply 851 returning an incorrect IP to those attempting to resolve a domain 852 name, but in some cases the destination may be technically 853 accessible; over HTTP, for example, the user may have another method 854 of obtaining the IP address of the desired site and may be able to 855 access it if the site is configured to be the default server 856 listening at this IP address. Target blocking has also been a 857 problem, as occasionally users outside of the censor's region will be 858 directed through DNS servers or DNS-rewriting network equipment 859 controlled by a censor, causing the request to fail.
The ease of 860 circumvention paired with the large risk of content blocking and 861 target blocking make DNS interference a partial, difficult, and less 862 than ideal censorship mechanism. 864 Additionally, the above mechanisms rely on DNSSEC not being deployed 865 or DNSSEC validation not being active on the client or recursive 866 resolver (neither of which are hard to imagine given limited 867 deployment of DNSSEC and limited client support for DNSSEC 868 validation). Note that an adversary seeking to merely block 869 resolution can serve a DNSSEC record that doesn't validate correctly, 870 assuming of course that the client/recursive resolver validates. 872 Previously, censorship techniques relied on DNS requests being 873 passed in cleartext over port 53 [SSAC-109-2020]. 874 With the deployment of encrypted DNS (e.g., DNS-over-HTTPS [RFC8484]) 875 these requests are now increasingly passed on port 443 with other 876 HTTPS traffic, or in the case of DNS-over-TLS [RFC7858] no longer 877 passed in the clear (see also Section 4.3.1). 879 Empirical Examples: DNS interference, when properly implemented, is 880 easy to identify based on the shortcomings identified above. Turkey 881 relied on DNS interference for its country-wide block of websites 882 such as Twitter and YouTube for almost a week in March of 2014, but the 883 ease of circumvention resulted in an increase in the popularity of 884 Twitter until Turkish ISPs implemented an IP blocklist to achieve 885 the governmental mandate [Zmijewski-2014]. Ultimately, Turkish ISPs 886 started hijacking all requests to Google and Level 3's international 887 DNS resolvers [Zmijewski-2014]. DNS interference, when incorrectly 888 implemented, has resulted in some of the largest "censorship 889 disasters".
In January 2014, China started directing all requests 890 passing through the Great Firewall to a single domain, 891 dongtaiwang.com, due to an improperly configured DNS poisoning 892 attempt; this incident is thought to be the largest Internet-service 893 outage in history [AFP-2014] [Anon-SIGCOMM12]. Countries such as 894 China, Iran, Turkey, and the United States have discussed blocking 895 entire TLDs as well, but only Iran has acted by blocking all Israeli 896 (.il) domains [Albert-2011]. DNS-blocking is commonly deployed in 897 European countries to deal with undesirable content, such as child 898 abuse content (Norway, United Kingdom, Belgium, Denmark, Finland, 899 France, Germany, Ireland, Italy, Malta, the Netherlands, Poland, 900 Spain and Sweden [Wright-2013] [Eneman-2010]), online gambling 901 (Belgium, Bulgaria, Czech Republic, Cyprus, Denmark, Estonia, France, 902 Greece, Hungary, Italy, Latvia, Lithuania, Poland, Portugal, Romania, 903 Slovakia, Slovenia, Spain (see Section 6.3.2 of: [EC-gambling-2012], 904 [EC-gambling-2019])), copyright infringement (all European Economic 905 Area countries), hate-speech and extremism (France [Hertel-2015]) and 906 terrorism content (France [Hertel-2015]). 908 5.2. Transport Layer 910 5.2.1. Performance Degradation 912 While other interference techniques outlined in this section mostly 913 focus on blocking or preventing access to content, it can be an 914 effective censorship strategy in some cases to not entirely block 915 access to a given destination or service, but instead to degrade the 916 performance of the relevant network connection. The resulting user 917 experience for a site or service under performance degradation can be 918 so bad that users opt to use a different site, service, or method of 919 communication, or may not engage in communication at all if there are 920 no alternatives.
Traffic shaping techniques that rate-limit the 921 bandwidth available to certain types of traffic are one example of 922 performance degradation. 924 Trade-offs: While implementing a performance degradation will not 925 always eliminate the ability of people to access a desired resource, 926 it may force them to use other means of communication where 927 censorship (or surveillance) is more easily accomplished. 929 Empirical Examples: Iran has been known to shape the bandwidth 930 available to HTTPS traffic to encourage unencrypted HTTP traffic 931 [Aryan-2012]. 933 5.2.2. Packet Dropping 935 Packet dropping is a simple mechanism to prevent undesirable traffic. 936 The censor identifies undesirable traffic and chooses to not properly 937 forward any associated packets, instead of following a 938 normal routing protocol. 939 This can be paired with any of the previously described mechanisms so 940 long as the censor knows the user must route traffic through a 941 controlled router. 943 Trade-offs: Packet Dropping is most successful when every traversing 944 packet has transparent information linked to undesirable content, 945 such as a Destination IP. One downside Packet Dropping suffers from 946 is the necessity of blocking all content from otherwise allowable IPs 947 based on a single subversive sub-domain; blogging services and GitHub 948 repositories are good examples. China famously dropped all GitHub 949 packets for three days based on a single repository hosting 950 undesirable content [Anonymous-2013]. The need to inspect every 951 traversing packet in close to real time also makes Packet Dropping 952 somewhat challenging from a QoS perspective. 954 Empirical Examples: Packet Dropping is a very common form of 955 technical interference and lends itself to accurate detection given 956 the unique nature of the time-out requests it leaves in its wake.
957 The Great Firewall of China has been observed using packet dropping 958 as one of its primary mechanisms of technical censorship 959 [Ensafi-2013]. Iran has also used Packet Dropping as the mechanism 960 for throttling SSH [Aryan-2012]. These are but two examples of a 961 ubiquitous censorship practice. Notably, packet dropping during the 962 handshake or working connection is the only interference technique 963 observed for QUIC traffic so far, e.g. in India, Iran, Russia and 964 Uganda [Elmenhorst-2021] [Elmenhorst-2022]. 966 5.2.3. RST Packet Injection 968 Packet injection, generally, refers to a man-in-the-middle (MITM) 969 network interference technique that spoofs packets in an established 970 traffic stream. RST packets are normally used to let one side of a TCP 971 connection know the other side has stopped sending information, and 972 thus the receiver should close the connection. RST Packet Injection 973 is a specific type of packet injection attack that is used to 974 interrupt an established stream by sending RST packets to both sides 975 of a TCP connection; as each receiver thinks the other has dropped 976 the connection, the session is terminated. 978 QUIC is not vulnerable to these types of injection attacks once the 979 connection has been set up. While QUIC implements a stateless reset 980 mechanism, such a reset is only accepted by a peer if the packet ends 981 in a previously issued stateless reset token, which is hard to guess. 982 During the handshake, QUIC only provides effective protection against 983 off-path attackers but is vulnerable to injection attacks by 984 attackers that have parsed prior packets. (See 985 [I-D.ietf-quic-transport] for more details.) 987 Trade-offs: Although ineffective against non-TCP protocols (QUIC, 988 IPSec), RST Packet Injection has a few advantages that make it 989 extremely popular as a technique employed for censorship.
RST Packet 990 Injection is an out-of-band interference mechanism, allowing the 991 avoidance of the QoS bottleneck one can encounter with inline 992 techniques such as Packet Dropping. This out-of-band property allows 993 a censor to inspect a copy of the information, usually mirrored by an 994 optical splitter, making it an ideal pairing for DPI and protocol 995 identification [Weaver-2009] (this asynchronous version of a MITM is 996 often called a Man-on-the-Side (MOTS)). RST Packet Injection also 997 has the advantage of only requiring one of the two endpoints to 998 accept the spoofed packet for the connection to be interrupted. 1000 The difficult part of RST Packet Injection is spoofing "enough" 1001 correct information to ensure one end-point accepts an RST packet as 1002 legitimate; this generally implies a correct IP, port, and TCP 1003 sequence number. Sequence number is the hardest to get correct, as 1004 [RFC0793] specifies an RST Packet should be in-sequence to be 1005 accepted, although the RFC also recommends allowing in-window packets 1006 as "good enough". This in-window recommendation is important, as if 1007 it is implemented it allows for successful Blind RST Injection 1008 attacks [Netsec-2011]. When in-window sequencing is allowed, it is 1009 trivial to conduct a Blind RST Injection: while the term "blind" 1010 injection implies the censor doesn't know any sensitive sequencing 1011 information about the TCP stream they are injecting into, they can 1012 simply enumerate all ~70000 possible windows; this is particularly 1013 useful for interrupting encrypted/obfuscated protocols such as SSH or 1014 Tor [Gilad]. Some censorship evasion systems work by trying to 1015 confuse the censor into tracking incorrect information, rendering 1016 their RST Packet Injection useless [Khattak-2013], [Wang-2017], 1017 [Li-2017], [Bock-2019], [Wang-2020]. 1019 RST Packet Injection relies on a stateful network, making it useless 1020 against UDP connections.
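The "~70000 possible windows" figure above follows from dividing the 32-bit TCP sequence space by the window size; a quick check, assuming an unscaled 64 KiB receive window:

```python
# Why Blind RST Injection is feasible when in-window RSTs are accepted:
# one guess per window-sized slice covers the whole sequence space.
import math

SEQ_SPACE = 2 ** 32     # TCP sequence numbers are 32 bits
WINDOW = 65535          # assumed unscaled maximum receive window

guesses = math.ceil(SEQ_SPACE / WINDOW)
print(guesses)          # 65538 -- on the order of the ~70000 cited above
```

Actual window sizes vary per connection (and window scaling changes the arithmetic), so the figure is an order-of-magnitude estimate, not an exact bound.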
RST Packet Injection is among the most 1021 popular censorship techniques used today given its versatile nature 1022 and effectiveness against all types of TCP traffic. Recent research 1023 shows that a TCP RST packet injection attack can even work in the 1024 case of an off-path attacker [Cao-2016]. 1026 Empirical Examples: RST Packet Injection, as mentioned above, is most 1027 often paired with identification techniques that require splitting, 1028 such as DPI or protocol identification. In 2007, Comcast was accused 1029 of using RST Packet Injection to interrupt traffic it identified as 1030 BitTorrent [Schoen-2007]; this later led to a US Federal 1031 Communications Commission ruling against Comcast [VonLohmann-2008]. 1032 China has also been known to use RST Packet Injection for censorship 1033 purposes. This interference is especially evident in the 1034 interruption of encrypted/obfuscated protocols, such as those used by 1035 Tor [Winter-2012]. 1037 5.3. Routing Layer 1039 5.3.1. Network Disconnection 1041 While it is perhaps the crudest of all techniques employed for 1042 censorship, there is no more effective way of making sure undesirable 1043 information isn't allowed to propagate on the web than by shutting 1044 off the network. The network can be logically cut off in a region 1045 when a censoring body withdraws all of the Border Gateway Protocol 1046 (BGP) prefixes routing through the censor's country. 1048 Trade-offs: The impact of a network disconnection in a region is huge 1049 and absolute; the censor pays for absolute control over digital 1050 information by losing all the benefits the Internet brings; this is 1051 rarely a long-term solution for any censor and is normally only used 1052 as a last resort in times of substantial unrest. 1054 Empirical Examples: Network Disconnections tend to only happen in 1055 times of substantial unrest, largely due to the huge social, 1056 political, and economic impact such a move has.
One of the first, 1057 highly covered occurrences was with the Junta in Myanmar employing 1058 Network Disconnection to help Junta forces quash a rebellion in 2007 1059 [Dobie-2007]. China disconnected the network in the Xinjiang region 1060 during unrest in 2009 in an effort to prevent the protests from 1061 spreading to other regions [Heacock-2009]. The Arab Spring saw the 1062 most frequent usage of Network Disconnection, with events in 1063 Egypt and Libya in 2011 [Cowie-2011], and Syria in 2012 1064 [Thomson-2012]. Russia indicated that it would attempt to disconnect 1065 all Russian networks from the global internet in April 2019 as part 1066 of a test of the nation's network independence. Reports also 1067 indicate that, as part of the test disconnect, Russian 1068 telecommunications firms must now route all traffic to state-operated 1069 monitoring points [Cimpanu-2019]. India was the country that saw the 1070 largest number of internet shutdowns per year in 2016 and 2017 1071 [Dada-2017]. 1073 5.3.2. Adversarial Route Announcement 1075 More fine-grained and potentially widespread censorship can be 1076 achieved with BGP hijacking, which adversarially re-routes BGP IP 1077 prefixes incorrectly within a region and beyond. This restricts and 1078 effectively censors the flow of information 1079 into or out of a jurisdiction and, as the adversarial route 1080 announcement propagates, will similarly prevent people outside 1081 the jurisdiction from viewing content generated inside it. 1082 The first can be achieved by an adversarial BGP announcement of 1083 incorrect routes that are not intended to leak beyond a jurisdiction, 1084 while the latter attacks traffic by deliberately introducing bogus 1085 BGP announcements that reach the global internet. 1086 1087 Trade-offs: A global leak of a misrouted website can overwhelm an ISP 1088 if the website gets a lot of traffic.
It is not a permanent solution 1089 because incorrect BGP routes that leak globally can be fixed, though 1090 within a jurisdiction only the ISP/IXP is in a position to correct 1091 them for local users. 1093 Empirical examples: In 2008 Pakistan Telecom censored YouTube at the 1094 request of the Pakistan government by changing its BGP routes for the 1095 website. The new routes were announced to the ISP's upstream 1096 providers and beyond. The entire Internet began directing YouTube 1097 routes to Pakistan Telecom and continued doing so for many hours. In 1098 2018, nearly all Google services, along with Google Cloud customers 1099 such as Spotify, lost more than an hour of service after Google lost 1100 control of several million of its IP addresses. Those IP prefixes were being 1101 misdirected to China Telecom, a Chinese government-owned ISP 1102 [Google-2018], in a manner similar to the BGP hijacking of US 1103 government and military websites by China Telecom in 2010. ISPs in 1104 both Russia (2022) and Myanmar (2021) have tried to hijack the same 1105 Twitter prefix more than once [MANRS]. 1107 5.4. Multi-layer and Non-layer 1109 5.4.1. Distributed Denial of Service (DDoS) 1111 Distributed Denial of Service attacks are a common attack mechanism 1112 used by "hacktivists" and malicious hackers, but censors have used 1113 DDoS in the past for a variety of reasons. There is a huge variety 1114 of DDoS attacks [Wikip-DoS], but at a high level two possible impacts 1115 tend to occur: a flood attack renders the service unusable 1116 for as long as resources are spent flooding it, while a crash attack 1117 aims to crash the service so resources can be reallocated elsewhere 1118 without "releasing" the service.
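The flood-style resource exhaustion described above, of which the SYN flood cited in the empirical examples is one variant, can be sketched with a toy model of a TCP listener's half-open connection queue. This is a simplified illustration, not a description of any real TCP stack; the backlog size and client labels are invented:

```python
from collections import deque

BACKLOG = 128  # hypothetical limit on half-open (SYN-received) connections

class ToyServer:
    """Toy model of a TCP listener's half-open connection queue."""
    def __init__(self, backlog=BACKLOG):
        self.backlog = backlog
        self.half_open = deque()

    def syn(self, client):
        """A SYN arrives; it occupies a backlog slot until the handshake completes."""
        if len(self.half_open) >= self.backlog:
            return False  # queue full: the SYN is dropped, service appears down
        self.half_open.append(client)
        return True

    def complete_handshake(self, client):
        """Legitimate clients finish the handshake and free their slot;
        flooders never do, so their slots are held open."""
        self.half_open.remove(client)

server = ToyServer()

# An attacker sends spoofed SYNs and never completes the handshake.
for i in range(BACKLOG):
    server.syn(f"spoofed-{i}")

# A legitimate client now finds the backlog exhausted.
print(server.syn("legitimate"))  # -> False
```

The model shows why a flood only works while resources are being spent: as soon as the spoofed entries time out or are cleared, legitimate SYNs succeed again, which is the "limited period of time" trade-off discussed below.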
1120 Trade-offs: DDoS is an appealing mechanism when a censor would like 1121 to prevent all access to undesirable content, instead of only access 1122 in their region for a limited period of time, but this is really the 1123 only uniquely beneficial feature of DDoS as a technique employed for 1124 censorship. The resources required to carry out a successful DDoS 1125 against major targets are computationally expensive, usually 1126 requiring renting or owning a malicious distributed platform such as 1127 a botnet, and the technique is imprecise. DDoS is an incredibly crude censorship 1128 technique, and appears to largely be used as a timely, easy-to-access 1129 mechanism for blocking undesirable content for a limited period of 1130 time. 1132 Empirical Examples: In 2012 the U.K.'s GCHQ used DDoS to temporarily 1133 shut down IRC chat rooms frequented by members of Anonymous using the 1134 SYN flood DDoS method; a SYN flood exploits the handshake used by TCP 1135 to overload the victim server with so many requests that legitimate 1136 traffic becomes slow or impossible [Schone-2014] [CERT-2000]. 1137 Dissenting opinion websites are frequently victims of DDoS around 1138 politically sensitive events in Burma [Villeneuve-2011]. Controlling 1139 parties in Russia [Kravtsova-2012], Zimbabwe [Orion-2013], and 1140 Malaysia [Muncaster-2013] have been accused of using DDoS to 1141 interrupt opposition support and access during elections. In 2015, 1142 China launched a DDoS attack using a true MITM system collocated with 1143 the Great Firewall, dubbed "Great Cannon", that was able to inject 1144 JavaScript code into web visits to a Chinese search engine that 1145 commandeered those user agents to send DDoS traffic to various sites 1146 [Marczak-2015]. 1148 5.4.2. Censorship in Depth 1150 Often, censors implement multiple techniques in tandem, creating 1151 "censorship in depth".
Censorship in depth can take many forms; some 1152 censors block the same content through multiple techniques (such as 1153 blocking a domain by DNS, IP blocking, and HTTP simultaneously), some 1154 deploy parallel systems to improve censorship reliability (such as 1155 deploying multiple different censorship systems to block the same 1156 domain), and others can use complementary systems to limit evasion 1157 (such as by blocking unwanted protocols entirely, forcing users to 1158 use other filtered protocols). 1160 Trade-offs: Censorship in depth can be attractive for censors to 1161 deploy, as it offers additional guarantees about censorship: even if 1162 someone evades one type of censorship, they may still be blocked by 1163 another. The main drawback to this approach is the cost of initial 1164 deployment, as it requires the censor to deploy multiple censorship 1165 systems in tandem. 1167 Empirical Examples: Censorship in depth is present in many large 1168 censoring nation states today. Researchers have observed that China has 1169 deployed significant censorship in depth, often censoring the same 1170 resource across multiple protocols [Chai-2019], [Bock-2020b] or 1171 deploying additional censorship systems to censor the same content 1172 and protocol [Bock-2021b]. Iran also has deployed a complementary 1173 protocol filter to limit which protocols can be used on certain 1174 ports, forcing users to rely on protocols their censorship system can 1175 filter [Bock-2020]. 1177 6. Non-Technical Interference 1179 6.1. Manual Filtering 1181 As the name implies, sometimes manpower is the easiest way to figure 1182 out which content to block. Manual Filtering differs from the common 1183 tactic of building up blocklists in that it doesn't necessarily 1184 target a specific IP address or domain name, but instead removes or flags content.
1185 Given the imprecise nature of automatic filtering, manually sorting 1186 through content and flagging dissenting websites, blogs, articles and 1187 other media for filtration can be an effective technique. This 1188 filtration can occur on the Backbone/ISP level - China's army of 1189 monitors is a good example [BBC-2013b] - but more commonly manual 1190 filtering occurs on an institutional level. Internet Content 1191 Providers (ICPs) such as Google or Weibo require a business license to 1192 operate in China. One of the prerequisites for a business license is 1193 an agreement to sign a "voluntary pledge" known as the "Public Pledge 1194 on Self-discipline for the Chinese Internet Industry". The failure 1195 to "energetically uphold" the pledged values can lead to the ICPs 1196 being held liable for the offending content by the Chinese government 1197 [BBC-2013b]. 1199 6.2. Self-Censorship 1201 Self-censorship is difficult to document, as it manifests primarily 1202 through a lack of undesirable content. Tools which encourage self- 1203 censorship are those which may lead a prospective speaker to believe 1204 that speaking increases the risk of unfavourable outcomes for the 1205 speaker (technical monitoring, identification requirements, etc.). 1206 Reporters Without Borders documents methods of imposing self- 1207 censorship in its annual World Press Freedom Index reports 1208 [RWB2020]. 1210 6.3. Server Takedown 1212 As mentioned in passing by [Murdoch-2011], servers must have a 1213 physical location somewhere in the world. If undesirable content is 1214 hosted in the censoring country, the servers can be physically seized 1215 or - in cases where a server is virtualized in a cloud infrastructure 1216 and may not necessarily have a fixed physical location - the 1217 hosting provider can be required to prevent access. 1219 6.4.
Notice and Takedown 1221 In many countries, legal mechanisms exist where an individual or 1222 other content provider can issue a legal request to a content host 1223 that requires the host to take down content. Examples include the 1224 systems employed by companies like Google to comply with "Right to be 1225 Forgotten" policies in the European Union [Google-RTBF], intermediary 1226 liability rules for electronic platform providers [EC-2012], or the 1227 copyright-oriented notice and takedown regime of the United States 1228 Digital Millennium Copyright Act (DMCA) Section 512 [DMLP-512]. 1230 6.5. Domain-Name Seizures 1232 Domain names are catalogued in name servers operated by 1233 legal entities called registries. These registries can be made to 1234 cede control over a domain name to someone other than the entity 1235 which registered the domain name through a legal procedure grounded 1236 in either private contracts or public law. Domain-name seizure is 1237 increasingly used by both public authorities and private entities to 1238 deal with undesired content dissemination [ICANN2012] [EFF2017]. 1240 7. Future work 1242 In addition to establishing a thorough resource for describing 1243 censorship techniques, this document highlights critical areas for 1244 future work. 1246 Taken as a whole, the apparent costs of implementation of censorship 1247 techniques indicate a need for better classification of censorship 1248 regimes as they evolve and mature, and for specifying censorship 1249 circumvention techniques themselves. Censor maturity refers to the 1250 technical maturity required of the censor to perform the specific 1251 censorship technique. Future work might classify techniques by 1252 essentially how hard a censor must work, including what 1253 infrastructure is required, in order to successfully censor content, 1254 users or services.
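As one illustrative sketch of what such a classification might look like (every technique name, dimension, and cost value below is invented for illustration rather than drawn from this document's sources):

```python
from dataclasses import dataclass

@dataclass
class CensorshipTechnique:
    """Illustrative record for comparing censor cost and maturity.
    The fields and values are hypothetical, not empirical findings."""
    name: str
    infrastructure: str  # what the censor must control to use the technique
    relative_cost: int   # 1 (cheap) .. 5 (expensive), an invented scale

techniques = [
    CensorshipTechnique("RST packet injection", "on-path middlebox", 3),
    CensorshipTechnique("Network disconnection", "national BGP control", 5),
    CensorshipTechnique("DDoS", "botnet or equivalent", 4),
]

# Rank techniques by how hard the censor must work.
for t in sorted(techniques, key=lambda t: t.relative_cost):
    print(f"{t.relative_cost}: {t.name} ({t.infrastructure})")
```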
1256 On circumvention, the increase in protocols leveraging encryption is 1257 an effective counter-measure against some forms of censorship 1258 described in this document, but thorough research on 1259 circumvention and encryption is left for another document. Moreover, 1260 the censorship circumvention community has developed an area of 1261 research on "pluggable transports," which collects, documents, and 1262 makes agile the methods for obfuscating the on-path traffic of censorship 1263 circumvention tools such that it appears indistinguishable from other 1264 kinds of traffic [Tor-2020]. Those methods would benefit from future 1265 work in the internet standards community, too. 1267 Lastly, the empirical examples demonstrate that censorship techniques 1268 can evolve quickly, and experience shows that this document can only 1269 be a point-in-time statement. Future work might extend this document 1270 with updates and new techniques described using a comparable 1271 methodology. 1273 8. Contributors 1275 This document benefited from discussions with and input from David 1276 Belson, Stephane Bortzmeyer, Vinicius Fortuna, Gurshabad Grover, 1277 Andrew McConachie, Martin Nilsson, Michael Richardson, Patrick Vacek 1278 and Chris Wood. 1280 9. Informative References 1282 [AFNIC-2013] 1283 AFNIC, "Report of the AFNIC Scientific Council: 1284 Consequences of DNS-based Internet filtering", 2013, 1285 . 1288 [AFP-2014] AFP, "China Has Massive Internet Breakdown Reportedly 1289 Caused By Their Own Censoring Tools", 2014, 1290 . 1293 [Albert-2011] 1294 Albert, K., "DNS Tampering and the new ICANN gTLD Rules", 1295 2011, . 1298 [Anon-SIGCOMM12] 1299 Anonymous, "The Collateral Damage of Internet Censorship 1300 by DNS Injection", 2012, 1301 . 1304 [Anonymous-2007] 1305 Anonymous, "How to Bypass Comcast's Bittorrent 1306 Throttling", 2012, .
1309 [Anonymous-2013] 1310 Anonymous, "GitHub blocked in China - how it happened, how 1311 to get around it, and where it will take us", 2013, 1312 . 1316 [Anonymous-2014] 1317 Anonymous, "Towards a Comprehensive Picture of the Great 1318 Firewall's DNS Censorship", 2014, 1319 . 1322 [AP-2012] Associated Press, "Sattar Beheshit, Iranian Blogger, Was 1323 Beaten In Prison According To Prosecutor", 2012, 1324 . 1327 [Aryan-2012] 1328 Aryan, S., Aryan, H., and J.A. Halderman, "Internet 1329 Censorship in Iran: A First Look", 2012, 1330 . 1332 [BBC-2013] BBC News, "Google and Microsoft agree steps to block abuse 1333 images", 2013, . 1335 [BBC-2013b] 1336 BBC, "China employs two million microblog monitors state 1337 media say", 2013, 1338 . 1340 [Bentham-1791] 1341 Bentham, J., "Panopticon Or the Inspection House", 1791, 1342 . 1345 [Bock-2019] 1346 Bock, K., Hughey, G., Qiang, X., and D. Levin, "Geneva: 1347 Evolving Censorship Evasion Strategies", 2019, 1348 . 1350 [Bock-2020] 1351 Bock, K., Fax, Y., Reese, K., Singh, J., and D. Levin, 1352 "Detecting and Evading Censorship-in-Depth: A Case Study 1353 of Iran’s Protocol Filter", 2020, 1354 . 1357 [Bock-2020b] 1358 Bock, K., iyouport, ., Anonymous, ., Merino, L., Fifield, 1359 D., Houmansadr, A., and D. Levin, "Exposing and 1360 Circumventing China's Censorship of ESNI", 2020, 1361 . 1364 [Bock-2021] 1365 Bock, K., Bharadwaj, P., Singh, J., and D. Levin, "Your 1366 Censor is My Censor: Weaponizing Censorship Infrastructure 1367 for Availability Attacks", 2021, 1368 . 1371 [Bock-2021b] 1372 Bock, K., Naval, G., Reese, K., and D. Levin, "Even 1373 Censors Have a Backup: Examining China’s Double HTTPS 1374 Censorship Middleboxes", 2021, 1375 . 1377 [Bortzmeyer-2015] 1378 Bortzmeyer, S., "DNS Censorship (DNS Lies) As Seen By RIPE 1379 Atlas", 2015, 1380 . 1383 [Boyle-1997] 1384 Boyle, J., "Foucault in Cyberspace: Surveillance, 1385 Sovereignty, and Hardwired Censors", 1997, 1386 . 
1389 [Bristow-2013] 1390 Bristow, M., "China's internet 'spin doctors‘", 2013, 1391 . 1393 [Calamur-2013] 1394 Calamur, K., "Prominent Egyptian Blogger Arrested", 2013, 1395 . 1398 [Cao-2016] Cao, Y., Qian, Z., Wang, Z., Dao, T., Krishnamurthy, S., 1399 and L. Marvel, "Off-Path TCP Exploits: Global Rate Limit 1400 Considered Dangerous", 2016, 1401 . 1404 [CERT-2000] 1405 CERT, "TCP SYN Flooding and IP Spoofing Attacks", 2000, 1406 . 1409 [Chai-2019] 1410 Chai, Z., Ghafari, A., and A. Houmansadr, "On the 1411 Importance of Encrypted-SNI (ESNI) to Censorship 1412 Circumvention", 2019, 1413 . 1416 [Cheng-2010] 1417 Cheng, J., "Google stops Hong Kong auto-redirect as China 1418 plays hardball", 2010, . 1422 [Cimpanu-2019] 1423 Cimpanu, C., "Russia to disconnect from the internet as 1424 part of a planned test", 2019, 1425 . 1428 [CitizenLab-2018] 1429 Marczak, B., Dalek, J., McKune, S., Senft, A., Scott- 1430 Railton, J., and R. Deibert, "Bad Traffic: Sandvine’s 1431 PacketLogic Devices Used to Deploy Government Spyware in 1432 Turkey and Redirect Egyptian Users to Affiliate Ads?", 1433 2018, . 1437 [Clayton-2006] 1438 Clayton, R., "Ignoring the Great Firewall of China", 2006, 1439 . 1441 [Condliffe-2013] 1442 Condliffe, J., "Google Announces Massive New Restrictions 1443 on Child Abuse Search Terms", 2013, . 1447 [Cowie-2011] 1448 Cowie, J., "Egypt Leaves the Internet", 2011, 1449 . 1452 [Crandall-2010] 1453 Crandall, J., "Empirical Study of a National-Scale 1454 Distributed Intrusion Detection System: Backbone-Level 1455 Filtering of HTML Responses in China", 2010, 1456 . 1458 [Dada-2017] 1459 Dada, T. and P. Micek, "Launching STOP: the #KeepItOn 1460 internet shutdown tracker", 2017, 1461 . 1463 [Dalek-2013] 1464 Dalek, J., "A Method for Identifying and Confirming the 1465 Use of URL Filtering Products for Censorship", 2013, 1466 . 1469 [Ding-1999] 1470 Ding, C., Chi, C.H., Deng, J., and C.L. 
Dong, "Centralized 1471 Content-Based Web Filtering and Blocking: How Far Can It 1472 Go?", 1999, . 1475 [DMLP-512] Digital Media Law Project, "Protecting Yourself Against 1476 Copyright Claims Based on User Content", 2012, 1477 . 1480 [Dobie-2007] 1481 Dobie, M., "Junta tightens media screw", 2007, 1482 . 1484 [EC-2012] European Commission, "Summary of the results of the Public 1485 Consultation on the future of electronic commerce in the 1486 Internal Market and the implementation of the Directive on 1487 electronic commerce (2000/31/EC)", 2012, 1488 . 1492 [EC-gambling-2012] 1493 European Commission, "Online gambling in the Internal 1494 Market", 2012, . 1497 [EC-gambling-2019] 1498 European Commission, "Evaluation of regulatory tools for 1499 enforcing online gambling rules and channeling demand 1500 towards controlled offers", 2019, 1501 . 1505 [EFF2017] Malcom, J., Stoltz, M., Rossi, G., and V. Paxson, "Which 1506 Internet registries offer the best protection for domain 1507 owners?", 2017, . 1510 [Ellul-1973] 1511 Ellul, J., "Propaganda: The Formation of Men's Attitudes", 1512 1973, . 1515 [Elmenhorst-2021] 1516 Elmenhorst, K., Schuetz, B., Basso, S., and N. 1517 Aschenbruck, "Web Censorship Measurements of HTTP/3 over 1518 QUIC", 2021, 1519 . 1521 [Elmenhorst-2022] 1522 Elmenhorst, K., "A Quick Look at QUIC Censorship", 2022, 1523 . 1525 [Eneman-2010] 1526 Eneman, M., "ISPs filtering of child abusive material: A 1527 critical reflection of its effectiveness", 2010, 1528 . 1531 [Ensafi-2013] 1532 Ensafi, R., "Detecting Intentional Packet Drops on the 1533 Internet via TCP/IP Side Channels", 2013, 1534 . 1536 [Fareed-2008] 1537 Fareed, M., "China joins a turf war", 2008, 1538 . 1541 [Fifield-2015] 1542 Fifield, D., Lan, C., Hynes, R., Wegmann, P., and V. 1543 Paxson, "Blocking-resistant communication through domain 1544 fronting", 2015, 1545 . 1547 [Gao-2014] Gao, H., "Tiananmen, Forgotten", 2014, 1548 . 
1551 [Gatlan-2019] 1552 Gatlan, S., "South Korea is Censoring the Internet by 1553 Snooping on SNI Traffic", 2019, 1554 . 1558 [Gilad] Gilad, Y. and A. Herzberg, "Off-Path TCP Injection 1559 Attacks", 2014, . 1561 [Glanville-2008] 1562 Glanville, J., "The Big Business of Net Censorship", 2008, 1563 . 1566 [Google-2018] 1567 "Google Cloud Networking Incident #18018", 2018, 1568 . 1571 [Google-RTBF] 1572 Google, Inc., "Search removal request under data 1573 protection law in Europe", 2015, 1574 . 1577 [Grover-2019] 1578 Grover, G., Singh, K., and E. Hickok, "Reliance Jio is 1579 using SNI inspection to block websites", 2019, 1580 . 1583 [Guardian-2014] 1584 The Guardian, "Chinese blogger jailed under crackdown on 1585 'internet rumours'", 2014, 1586 . 1589 [HADOPI-2020] 1590 Haute Autorité pour la Diffusion des oeuvres et la 1591 Protection des Droits sur Internet, "Présentation", 2020, 1592 . 1594 [Halley-2008] 1595 Halley, B., "How DNS cache poisoning works", 2014, 1596 . 1599 [Heacock-2009] 1600 Heacock, R., "China Shuts Down Internet in Xinjiang Region 1601 After Riots", 2009, . 1604 [Hepting-2011] 1605 Electronic Frontier Foundation, "Hepting vs. AT&T", 2011, 1606 . 1608 [Hertel-2015] 1609 Hertel, O., "Comment les autorités peuvent bloquer un site 1610 Internet", 2015, . 1614 [Hjelmvik-2010] 1615 Hjelmvik, E., "Breaking and Improving Protocol 1616 Obfuscation", 2010, 1617 . 1619 [Hopkins-2011] 1620 Hopkins, C., "Communications Blocked in Libya, Qatari 1621 Blogger Arrested: This Week in Online Tyranny", 2011, 1622 . 1625 [Husak-2016] 1626 Husak, M., Cermak, M., Jirsik, T., and P. Celeda, "HTTPS 1627 traffic analysis and client identification using passive 1628 SSL/TLS fingerprinting", 2016, 1629 . 1632 [I-D.ietf-quic-transport] 1633 Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed 1634 and Secure Transport", Work in Progress, Internet-Draft, 1635 draft-ietf-quic-transport-34, 14 January 2021, 1636 .
1639 [I-D.ietf-tls-esni] 1640 Rescorla, E., Oku, K., Sullivan, N., and C. A. Wood, "TLS 1641 Encrypted Client Hello", Work in Progress, Internet-Draft, 1642 draft-ietf-tls-esni-14, 13 February 2022, 1643 . 1646 [I-D.ietf-tls-sni-encryption] 1647 Huitema, C. and E. Rescorla, "Issues and Requirements for 1648 Server Name Identification (SNI) Encryption in TLS", Work 1649 in Progress, Internet-Draft, draft-ietf-tls-sni- 1650 encryption-09, 28 October 2019, 1651 . 1654 [ICANN-SSAC-2012] 1655 ICANN Security and Stability Advisory Committee (SSAC), 1656 "SAC 056: SSAC Advisory on Impacts of Content Blocking via 1657 the Domain Name System", 2012, 1658 . 1661 [ICANN2012] 1662 ICANN Security and Stability Advisory Committee, "Guidance 1663 for Preparing Domain Name Orders, Seizures & Takedowns", 1664 2012, . 1667 [Jones-2014] 1668 Jones, B., "Automated Detection and Fingerprinting of 1669 Censorship Block Pages", 2014, 1670 . 1673 [Khattak-2013] 1674 Khattak, S., "Towards Illuminating a Censorship Monitor's 1675 Model to Facilitate Evasion", 2013, . 1679 [Knight-2005] 1680 Knight, W., "Iranian net censorship powered by US 1681 technology", 2005, . 1684 [Knockel-2021] 1685 Knockel, J. and L. Ruan, "Measuring QQMail's automated 1686 email censorship in China", 2021, 1687 . 1689 [Kopel-2013] 1690 Kopel, K., "Operation Seizing Our Sites: How the Federal 1691 Government is Taking Domain Names Without Prior Notice", 1692 2013, . 1694 [Kravtsova-2012] 1695 Kravtsova, Y., "Cyberattacks Disrupt Opposition's 1696 Election", 2012, 1697 . 1700 [Leyba-2019] 1701 Leyba, K., Edwards, B., Freeman, C., Crandall, J., and S. 1702 Forrest, "Borders and Gateways: Measuring and Analyzing 1703 National AS Chokepoints", 2019, 1704 . 1707 [Li-2017] Li, F., Razaghpanah, A., Kakhki, A., Niaki, A., Choffnes, 1708 D., Gill, P., and A. Mislove, "lib•erate, (n) : A library 1709 for exposing (traffic-classification) rules and avoiding 1710 them efficiently", 2017, 1711 . 
1713 [Lomas-2019] 1714 Lomas, N., "Github removes Tsunami Democràtic’s APK after 1715 a takedown order from Spain", 2019, 1716 . 1719 [MANRS] Siddiqui, A., "Lesson Learned: Twitter Shored Up Its 1720 Routing Security", 2022, . 1723 [Marczak-2015] 1724 Marczak, B., Weaver, N., Dalek, J., Ensafi, R., Fifield, 1725 D., McKune, S., Rey, A., Scott-Railton, J., Deibert, R., 1726 and V. Paxson, "An Analysis of China’s “Great Cannon”", 1727 2015, 1728 . 1731 [Muncaster-2013] 1732 Muncaster, P., "Malaysian election sparks web blocking/ 1733 DDoS claims", 2013, 1734 . 1737 [Murdoch-2011] 1738 Murdoch, S.J. and R. Anderson, "Access Denied: Tools and 1739 Technology of Internet Filtering", 2011, 1740 . 1743 [NA-SK-2019] 1744 Morgus, R., Sherman, J., and S. Nam, "Analysis: South 1745 Korea's New Tool for Filtering Illegal Internet Content", 1746 2019, . 1750 [Nabi-2013] 1751 Nabi, Z., "The Anatomy of Web Censorship in Pakistan", 1752 2013, . 1755 [Netsec-2011] 1756 n3t2.3c, "TCP-RST Injection", 2011, 1757 . 1759 [OONI-2018] 1760 Evdokimov, L., "Iran Protests: DPI blocking of Instagram 1761 (Part 2)", 2018, 1762 . 1764 [OONI-2019] 1765 Singh, S., Filastò, A., and M. Xynou, "China is now 1766 blocking all language editions of Wikipedia", 2019, 1767 . 1769 [Orion-2013] 1770 Orion, E., "Zimbabwe election hit by hacking and DDoS 1771 attacks", 2013, 1772 . 1775 [Patil-2019] 1776 Patil, S. and N. Borisov, "What Can You Learn from an 1777 IP?", 2019, . 1780 [Porter-2010] 1781 Porter, T., "The Perils of Deep Packet Inspection", 2010, 1782 . 1785 [Rambert-2021] 1786 Rambert, R., Weinberg, Z., Barradas, D., and N. Christin, 1787 "Chinese Wall or Swiss Cheese? Keyword filtering in the 1788 Great Firewall of China", 2021, 1789 . 1792 [Reda-2017] 1793 Reda, J., "New EU law prescribes website blocking in the 1794 name of 'consumer protection'", 2017, 1795 . 1797 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 1798 RFC 793, DOI 10.17487/RFC0793, September 1981, 1799 .
1801 [RFC6066] Eastlake 3rd, D., "Transport Layer Security (TLS) 1802 Extensions: Extension Definitions", RFC 6066, 1803 DOI 10.17487/RFC6066, January 2011, 1804 . 1806 [RFC7624] Barnes, R., Schneier, B., Jennings, C., Hardie, T., 1807 Trammell, B., Huitema, C., and D. Borkmann, 1808 "Confidentiality in the Face of Pervasive Surveillance: A 1809 Threat Model and Problem Statement", RFC 7624, 1810 DOI 10.17487/RFC7624, August 2015, 1811 . 1813 [RFC7754] Barnes, R., Cooper, A., Kolkman, O., Thaler, D., and E. 1814 Nordmark, "Technical Considerations for Internet Service 1815 Blocking and Filtering", RFC 7754, DOI 10.17487/RFC7754, 1816 March 2016, . 1818 [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., 1819 and P. Hoffman, "Specification for DNS over Transport 1820 Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 1821 2016, . 1823 [RFC8484] Hoffman, P. and P. McManus, "DNS Queries over HTTPS 1824 (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018, 1825 . 1827 [RSF-2005] Reporters Sans Frontieres, "Technical ways to get around 1828 censorship", 2005, . 1831 [Rushe-2015] 1832 Rushe, D., "Bing censoring Chinese language search results 1833 for users in the US", 2013, 1834 . 1837 [RWB2020] Reporters Without Borders, "2020 World Press Freedom 1838 Index: Entering a decisive decade for journalism, 1839 exacerbated by coronavirus", 2020, . 1843 [Sandvine-2014] 1844 Sandvine, "Technology Showcase on Traffic Classification: 1845 Why Measurements and Freeform Policy Matter", 2014, 1846 . 1850 [Satija-2021] 1851 Satija, S. and R. Chatterjee, "BlindTLS: Circumventing 1852 TLS-based HTTPS censorship", 2021, 1853 . 1855 [Schoen-2007] 1856 Schoen, S., "EFF tests agree with AP: Comcast is forging 1857 packets to interfere with user traffic", 2007, 1858 . 1861 [Schone-2014] 1862 Schone, M., Esposito, R., Cole, M., and G. Greenwald, 1863 "Snowden Docs Show UK Spies Attacked Anonymous, Hackers", 1864 2014, . 
1868 [Senft-2013] 1869 Senft, A., "Asia Chats: Analyzing Information Controls and 1870 Privacy in Asian Messaging Applications", 2013, 1871 . 1875 [Shbair-2015] 1876 Shbair, W.M., Cholez, T., Goichot, A., and I. Chrisment, 1877 "Efficiently Bypassing SNI-based HTTPS Filtering", 2015, 1878 . 1880 [SIDN2020] Moura, G., "Detecting and Taking Down Fraudulent Webshops 1881 at the .nl ccTLD", 2020, 1882 . 1885 [Singh-2019] 1886 Singh, K., Grover, G., and V. Bansal, "How India Censors 1887 the Web", 2019, . 1889 [Sophos-2015] 1890 Sophos, "Understanding Sophos Web Filtering", 2015, 1891 . 1894 [SSAC-109-2020] 1895 ICANN Security and Stability Advisory Committee, "SAC109: 1896 The Implications of DNS over HTTPS and DNS over TLS", 1897 2020, . 1900 [Tang-2016] 1901 Tang, C., "In-depth analysis of the Great Firewall of 1902 China", 2016, 1903 . 1906 [Thomson-2012] 1907 Thomson, I., "Syria Cuts off Internet and Mobile 1908 Communication", 2012, 1909 . 1912 [Tor-2020] The Tor Project, "Tor: Pluggable Transports", 2020, 1913 . 1916 [Trustwave-2015] 1917 Trustwave, "Filter: SNI extension feature and HTTPS 1918 blocking", 2015, 1919 . 1922 [Tschantz-2016] 1923 Tschantz, M., Afroz, S., Anonymous, A., and V. Paxson, 1924 "SoK: Towards Grounding Censorship Circumvention in 1925 Empiricism", 2016, 1926 . 1928 [Verkamp-2012] 1929 Verkamp, J.P. and M. Gupta, "Inferring Mechanics of Web 1930 Censorship Around the World", 2012, 1931 . 1934 [Victor-2019] 1935 Victor, D., "Blizzard Sets Off Backlash for Penalizing 1936 Hearthstone Gamer in Hong Kong", 2019, 1937 . 1940 [Villeneuve-2011] 1941 Villeneuve, N., "Open Access: Chapter 8, Control and 1942 Resistance, Attacks on Burmese Opposition Media", 2011, 1943 . 1946 [VonLohmann-2008] 1947 VonLohmann, F., "FCC Rules Against Comcast for BitTorrent 1948 Blocking", 2008, . 
1951 [Wagner-2009] 1952 Wagner, B., "Deep Packet Inspection and Internet 1953 Censorship: International Convergence on an ‘Integrated 1954 Technology of Control'", 2009, 1955 . 1959 [Wagstaff-2013] 1960 Wagstaff, J., "In Malaysia, online election battles take a 1961 nasty turn", 2013, 1962 . 1965 [Wang-2017] 1966 Wang, Z., Cao, Y., Qian, Z., Song, C., and S. 1967 Krishnamurthy, "Your State is Not Mine: A Closer Look at 1968 Evading Stateful Internet Censorship", 2017, 1969 . 1972 [Wang-2020] 1973 Wang, Z., Zhu, S., Cao, Y., Qian, Z., Song, C., 1974 Krishnamurthy, S., Chan, K., and T. Braun, "SYMTCP: 1975 Eluding Stateful Deep Packet Inspection with Automated 1976 Discrepancy Discovery", 2020, 1977 . 1979 [Weaver-2009] 1980 Weaver, N., Sommer, R., and V. Paxson, "Detecting Forged 1981 TCP Packets", 2009, . 1984 [Whittaker-2013] 1985 Whittaker, Z., "1,168 keywords Skype uses to censor, 1986 monitor its Chinese users", 2013, 1987 . 1990 [Wikip-DoS] 1991 Wikipedia, "Denial of Service Attacks", 2016, 1992 . 1995 [Wilde-2012] 1996 Wilde, T., "Knock Knock Knockin' on Bridges Doors", 2012, 1997 . 2000 [Winter-2012] 2001 Winter, P., "How China is Blocking Tor", 2012, 2002 . 2004 [WP-Def-2020] 2005 Wikipedia contributors, "Censorship", 2020, 2006 . 2009 [Wright-2013] 2010 Wright, J. and Y. Breindl, "Internet filtering trends in 2011 liberal democracies: French and German regulatory 2012 debates", 2013, 2013 . 2017 [Zhu-2011] Zhu, T., "An Analysis of Chinese Search Engine Filtering", 2018 2011, 2019 . 2021 [Zmijewski-2014] 2022 Zmijewski, E., "Turkish Internet Censorship Takes a New 2023 Turn", 2014, 2024 . 2027 Authors' Addresses 2029 Joseph Lorenzo Hall 2030 Internet Society 2031 Email: hall@isoc.org 2033 Michael D. 
Aaron 2034 CU Boulder 2035 Email: michael.drew.aaron@gmail.com 2037 Amelia Andersdotter 2038 Email: amelia.ietf@andersdotter.cc 2039 Ben Jones 2040 Princeton 2041 Email: bj6@cs.princeton.edu 2043 Nick Feamster 2044 U Chicago 2045 Email: feamster@uchicago.edu 2047 Mallory Knodel 2048 Center for Democracy & Technology 2049 Email: mknodel@cdt.org