idnits 2.17.1 draft-ietf-trans-gossip-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 39 instances of too long lines in the document, the longest one being 60 characters in excess of 72. == There are 4 instances of lines with non-RFC2606-compliant FQDNs in the document. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 360: '...tension. The client MUST discard SCTs...' RFC 2119 keyword, line 361: '...own to the client and SHOULD store the...' RFC 2119 keyword, line 367: '...ed on the client MUST be keyed by the ...' RFC 2119 keyword, line 368: '...contacted. They MUST NOT be sent to a...' RFC 2119 keyword, line 371: '...mple.com.) They MUST NOT be sent to a...' (67 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 1800 has weird spacing: '... bool has_...' == Line 1801 has weird spacing: '... bool proo...' == Line 1893 has weird spacing: '...h later num...' == Line 1901 has weird spacing: '...h later num...' == Line 1903 has weird spacing: '...h later num...' == (9 more instances...) -- The document date (July 08, 2016) is 2849 days in the past. Is this intentional? 
-- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Experimental ---------------------------------------------------------------------------- -- Looks like a reference, but probably isn't: '1' on line 293 -- Looks like a reference, but probably isn't: '2' on line 295 -- Looks like a reference, but probably isn't: '3' on line 297 ** Obsolete normative reference: RFC 6962 (Obsoleted by RFC 9162) ** Obsolete normative reference: RFC 7159 (Obsoleted by RFC 8259) Summary: 4 errors (**), 0 flaws (~~), 8 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 TRANS L. Nordberg 3 Internet-Draft NORDUnet 4 Intended status: Experimental D. Gillmor 5 Expires: January 9, 2017 ACLU 6 T. Ritter 8 July 08, 2016 10 Gossiping in CT 11 draft-ietf-trans-gossip-03 13 Abstract 15 The logs in Certificate Transparency are untrusted in the sense that 16 the users of the system don't have to trust that they behave 17 correctly since the behavior of a log can be verified to be correct. 19 This document tries to solve the problem with logs presenting a 20 "split view" of their operations. It describes three gossiping 21 mechanisms for Certificate Transparency: SCT Feedback, STH 22 Pollination and Trusted Auditor Relationship. 24 Status of This Memo 26 This Internet-Draft is submitted in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF). Note that other groups may also distribute 31 working documents as Internet-Drafts. The list of current Internet- 32 Drafts is at http://datatracker.ietf.org/drafts/current/.
34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 This Internet-Draft will expire on January 9, 2017. 41 Copyright Notice 43 Copyright (c) 2016 IETF Trust and the persons identified as the 44 document authors. All rights reserved. 46 This document is subject to BCP 78 and the IETF Trust's Legal 47 Provisions Relating to IETF Documents 48 (http://trustee.ietf.org/license-info) in effect on the date of 49 publication of this document. Please review these documents 50 carefully, as they describe your rights and restrictions with respect 51 to this document. Code Components extracted from this document must 52 include Simplified BSD License text as described in Section 4.e of 53 the Trust Legal Provisions and are provided without warranty as 54 described in the Simplified BSD License. 56 Table of Contents 58 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 59 2. Defining the problem . . . . . . . . . . . . . . . . . . . . 4 60 3. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 4 61 4. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 62 4.1. Pre-Loaded vs Locally Added Anchors . . . . . . . . . . . 5 63 5. Who gossips with whom . . . . . . . . . . . . . . . . . . . . 5 64 6. What to gossip about and how . . . . . . . . . . . . . . . . 6 65 7. Data flow . . . . . . . . . . . . . . . . . . . . . . . . . . 6 66 8. Gossip Mechanisms . . . . . . . . . . . . . . . . . . . . . . 7 67 8.1. SCT Feedback . . . . . . . . . . . . . . . . . . . . . . 7 68 8.1.1. SCT Feedback data format . . . . . . . . . . . . . . 8 69 8.1.2. HTTPS client to server . . . . . . . . . . . . . . . 8 70 8.1.3. HTTPS server operation . . . . . . . . . . . . . . . 11 71 8.1.4. HTTPS server to auditors . . . . . . . . . . . . . . 13 72 8.2. 
STH pollination . . . . . . . . . . . . . . . . . . . . . 14 73 8.2.1. HTTPS Clients and Proof Fetching . . . . . . . . . . 15 74 8.2.2. STH Pollination without Proof Fetching . . . . . . . 17 75 8.2.3. Auditor Action . . . . . . . . . . . . . . . . . . . 17 76 8.2.4. STH Pollination data format . . . . . . . . . . . . . 17 77 8.3. Trusted Auditor Stream . . . . . . . . . . . . . . . . . 17 78 8.3.1. Trusted Auditor data format . . . . . . . . . . . . . 18 79 9. 3-Method Ecosystem . . . . . . . . . . . . . . . . . . . . . 19 80 9.1. SCT Feedback . . . . . . . . . . . . . . . . . . . . . . 19 81 9.2. STH Pollination . . . . . . . . . . . . . . . . . . . . . 20 82 9.3. Trusted Auditor Relationship . . . . . . . . . . . . . . 21 83 9.4. Interaction . . . . . . . . . . . . . . . . . . . . . . . 22 84 10. Security considerations . . . . . . . . . . . . . . . . . . . 22 85 10.1. Attacks by actively malicious logs . . . . . . . . . . . 22 86 10.2. Dual-CA Compromise . . . . . . . . . . . . . . . . . . . 23 87 10.3. Censorship/Blocking considerations . . . . . . . . . . . 24 88 10.4. Flushing Attacks . . . . . . . . . . . . . . . . . . . . 25 89 10.4.1. STHs . . . . . . . . . . . . . . . . . . . . . . . . 25 90 10.4.2. SCTs & Certificate Chains on HTTPS Servers . . . . . 26 91 10.4.3. SCTs & Certificate Chains on HTTPS Clients . . . . . 26 92 10.5. Privacy considerations . . . . . . . . . . . . . . . . . 27 93 10.5.1. Privacy and SCTs . . . . . . . . . . . . . . . . . . 27 94 10.5.2. Privacy in SCT Feedback . . . . . . . . . . . . . . 27 95 10.5.3. Privacy for HTTPS clients performing STH Proof 96 Fetching . . . . . . . . . . . . . . . . . . . . . . 28 98 10.5.4. Privacy in STH Pollination . . . . . . . . . . . . . 28 99 10.5.5. Privacy in STH Interaction . . . . . . . . . . . . . 29 100 10.5.6. Trusted Auditors for HTTPS Clients . . . . . . . . . 29 101 10.5.7. HTTPS Clients as Auditors . . . . . . . . . . . . . 30 102 11. Policy Recommendations . . . . . . . . . . . . . 
. . . . . . 30 103 11.1. Blocking Recommendations . . . . . . . . . . . . . . . . 31 104 11.1.1. Frustrating blocking . . . . . . . . . . . . . . . . 31 105 11.1.2. Responding to possible blocking . . . . . . . . . . 31 106 11.2. Proof Fetching Recommendations . . . . . . . . . . . . . 32 107 11.3. Record Distribution Recommendations . . . . . . . . . . 33 108 11.3.1. Mixing Algorithm . . . . . . . . . . . . . . . . . . 34 109 11.3.2. The Deletion Algorithm . . . . . . . . . . . . . . . 35 110 11.4. Concrete Recommendations . . . . . . . . . . . . . . . . 36 111 11.4.1. STH Pollination . . . . . . . . . . . . . . . . . . 36 112 11.4.2. SCT Feedback . . . . . . . . . . . . . . . . . . . . 39 113 12. IANA considerations . . . . . . . . . . . . . . . . . . . . . 53 114 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 53 115 14. ChangeLog . . . . . . . . . . . . . . . . . . . . . . . . . . 53 116 14.1. Changes between ietf-02 and ietf-03 . . . . . . . . . . 53 117 14.2. Changes between ietf-01 and ietf-02 . . . . . . . . . . 54 118 14.3. Changes between ietf-00 and ietf-01 . . . . . . . . . . 54 119 14.4. Changes between -01 and -02 . . . . . . . . . . . . . . 54 120 14.5. Changes between -00 and -01 . . . . . . . . . . . . . . 55 121 15. References . . . . . . . . . . . . . . . . . . . . . . . . . 55 122 15.1. Normative References . . . . . . . . . . . . . . . . . . 55 123 15.2. Informative References . . . . . . . . . . . . . . . . . 55 124 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 56 126 1. Introduction 128 The purpose of the protocols in this document, collectively referred 129 to as CT Gossip, is to detect certain misbehavior by CT logs. In 130 particular, CT Gossip aims to detect logs that are providing 131 inconsistent views to different log clients, and logs failing to 132 include submitted certificates within the time period stipulated by 133 MMD. 
135 One of the major challenges of any gossip protocol is limiting damage 136 to user privacy. The goal of CT gossip is to publish and distribute 137 information about the logs and their operations, but not to expose 138 any additional information about the operation of any of the other 139 participants. Privacy of consumers of log information (in 140 particular, of web browsers and other TLS clients) should not be 141 undermined by gossip. 143 This document presents three different, complementary mechanisms for 144 non-log elements of the CT ecosystem to exchange information about 145 logs in a manner that preserves the privacy of HTTPS clients. They 146 should provide protective benefits for the system as a whole even if 147 their adoption is not universal. 149 2. Defining the problem 151 When a log provides different views of the log to different clients 152 this is described as a partitioning attack. Each client would be 153 able to verify the append-only nature of the log but, in the extreme 154 case, each client might see a unique view of the log. 156 The CT logs are public, append-only and untrusted and thus have to be 157 audited for consistency, i.e., they should never rewrite history. 158 Additionally, auditors and other log clients need to exchange 159 information about logs in order to be able to detect a partitioning 160 attack (as described above). 162 Gossiping about log behavior helps address the problem of detecting 163 malicious or compromised logs with respect to a partitioning attack. 164 We want some side of the partitioned tree, and ideally both sides, to 165 see the other side. 167 Disseminating information about a log poses a potential threat to the 168 privacy of end users. Some data of interest (e.g. SCTs) is linkable 169 to specific log entries and thereby to specific websites, which makes 170 sharing them with others a privacy concern. 
Gossiping about this 171 data has to take privacy considerations into account in order not to 172 expose associations between users of the log (e.g., web browsers) and 173 certificate holders (e.g., web sites). Even sharing STHs (which do 174 not link to specific log entries) can be problematic - user tracking 175 by fingerprinting through rare STHs is one potential attack (see 176 Section 8.2). 178 3. Overview 180 This document presents three gossiping mechanisms: SCT Feedback, STH 181 Pollination, and a Trusted Auditor Relationship. 183 SCT Feedback enables HTTPS clients to share Signed Certificate 184 Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS-09]) with CT auditors 185 in a privacy-preserving manner by sending SCTs to originating HTTPS 186 servers, who in turn share them with CT auditors. 188 In STH Pollination, HTTPS clients use HTTPS servers as pools to share 189 Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS-09]) with 190 other connecting clients in the hope that STHs will find their way to 191 CT auditors. 193 HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs 194 with trusted CT auditors directly, with expectations of privacy 195 sensitive data being handled according to whatever privacy policy is 196 agreed on between client and trusted party. 198 Despite the privacy risks with sharing SCTs there is no loss in 199 privacy if a client sends SCTs for a given site to the site 200 corresponding to the SCT. This is because the site's logs would 201 already indicate that the client is accessing that site. In this way 202 a site can accumulate records of SCTs that have been issued by 203 various logs for that site, providing a consolidated repository of 204 SCTs that could be shared with auditors. Auditors can use this 205 information to detect a misbehaving log that fails to include a 206 certificate within the time period stipulated by its MMD metadata. 
208 Sharing an STH is considered reasonably safe from a privacy 209 perspective as long as the same STH is shared by a large number of 210 other log clients. This safety in numbers can be achieved by only 211 allowing gossiping of STHs issued in a certain window of time, while 212 also refusing to gossip about STHs from logs with too high an STH 213 issuance frequency (see Section 8.2). 215 4. Terminology 217 This document relies on terminology and data structures defined in 218 [RFC-6962-BIS-09], including MMD, STH, SCT, Version, LogID, SCT 219 timestamp, CtExtensions, SCT signature, Merkle Tree Hash. 221 This document relies on terminology defined in 222 [draft-ietf-trans-threat-analysis-03], including Auditing. 224 4.1. Pre-Loaded vs Locally Added Anchors 226 Throughout the document, we refer to both Trust Anchors (Certificate 227 Authorities) and Logs. Both Logs and Trust Anchors may be locally 228 added by an administrator. Unless otherwise clarified, in both cases 229 we refer to the set of Trust Anchors and Logs that come pre-loaded 230 and pre-trusted in a piece of client software. 232 5. Who gossips with whom 234 o HTTPS clients and servers (SCT Feedback and STH Pollination) 236 o HTTPS servers and CT auditors (SCT Feedback and STH Pollination) 238 o CT auditors (Trusted Auditor Relationship) 239 Additionally, some HTTPS clients may engage with an auditor who they 240 trust with their privacy: 242 o HTTPS clients and CT auditors (Trusted Auditor Relationship) 244 6. What to gossip about and how 246 There are three separate gossip streams: 248 o SCT Feedback - transporting SCTs and certificate chains from HTTPS 249 clients to CT auditors via HTTPS servers. 251 o STH Pollination - HTTPS clients and CT auditors using HTTPS 252 servers as STH pools for exchanging STHs. 254 o Trusted Auditor Stream - HTTPS clients communicating directly with 255 trusted CT auditors sharing SCTs, certificate chains and STHs.
257 It is worthwhile to note that when an HTTPS client or CT auditor 258 interacts with a log, they may equivalently interact with a log 259 mirror or cache that replicates the log. 261 7. Data flow 263 The following picture shows how certificates, SCTs and STHs flow 264 through a CT system with SCT Feedback and STH Pollination. It does 265 not show what goes in the Trusted Auditor Relationship stream. 267 +- Cert ---- +----------+ 268 | | CA | ----------+ 269 | + SCT -> +----------+ | 270 v | Cert [& SCT] 271 +----------+ | 272 | Log | ---------- SCT -----------+ 273 +----------+ v 274 | ^ +----------+ 275 | | SCTs & Certs --- | Website | 276 | |[1] | +----------+ 277 | |[2] STHs ^ | 278 | |[3] v | | 279 | | +----------+ | | 280 | +--------> | Auditor | | HTTPS traffic 281 | +----------+ | | 282 STH | SCT & Cert 283 | SCTs & Certs | 284 Log entries | | 285 | STHs STHs 286 v | | 287 +----------+ | v 288 | Monitor | +----------+ 289 +----------+ | Browser | 290 +----------+ 292 # Auditor Log 293 [1] |--- get-sth ------------------->| 294 |<-- STH ------------------------| 295 [2] |--- leaf hash + tree size ----->| 296 |<-- index + inclusion proof ----| 297 [3] |--- tree size 1 + tree size 2 ->| 298 |<-- consistency proof ----------| 300 8. Gossip Mechanisms 302 8.1. SCT Feedback 304 The goal of SCT Feedback is for clients to share SCTs and certificate 305 chains with CT auditors while still preserving the privacy of the end 306 user. The sharing of SCTs contributes to the overall goal of 307 detecting misbehaving logs by providing auditors with SCTs from many 308 vantage points, making it more likely to catch a violation of a log's 309 MMD or a log presenting inconsistent views. The sharing of 310 certificate chains is beneficial to HTTPS server operators interested 311 in direct feedback from clients for detecting bogus certificates 312 issued in their name and therefore incentivizes server operators to 313 take part in SCT Feedback.
315 SCT Feedback is the most privacy-preserving gossip mechanism, as it 316 does not directly expose any links between an end user and the sites 317 they've visited to any third party. 319 HTTPS clients store SCTs and certificate chains they see, and later 320 send them to the originating HTTPS server by posting them to a well- 321 known URL (associated with that server), as described in 322 Section 8.1.2. Note that clients will send the same SCTs and chains 323 to a server multiple times with the assumption that any man-in-the- 324 middle attack eventually will cease, and an honest server will 325 eventually receive collected malicious SCTs and certificate chains. 327 HTTPS servers store SCTs and certificate chains received from 328 clients, as described in Section 8.1.3. They later share them with 329 CT auditors by either posting them to auditors or making them 330 available via a well-known URL. This is described in Section 8.1.4. 332 8.1.1. SCT Feedback data format 334 The data shared between HTTPS clients and servers, as well as between 335 HTTPS servers and CT auditors, is a JSON array [RFC7159]. Each item 336 in the array is a JSON object with the following content: 338 o x509_chain: An array of PEM-encoded X.509 certificates. The first 339 element is the end-entity certificate, the second certifies the 340 first and so on. 342 o sct_data: An array of objects consisting of the base64 343 representation of the binary SCT data as defined in 344 [RFC-6962-BIS-09] Section 3.3. 346 We will refer to this object as 'sct_feedback'. 348 The x509_chain element always contains a full chain from a leaf 349 certificate to a self-signed trust anchor. 351 See Section 8.1.2 for details on what the sct_data element contains 352 as well as more details about the x509_chain element. 354 8.1.2. HTTPS client to server 356 When an HTTPS client connects to an HTTPS server, the client receives 357 a set of SCTs as part of the TLS handshake. 
SCTs are included in the 358 TLS handshake using one or more of the three mechanisms described in 359 [RFC-6962-BIS-09] section 3.4 - in the server certificate, in a TLS 360 extension, or in an OCSP extension. The client MUST discard SCTs 361 that are not signed by a log known to the client and SHOULD store the 362 remaining SCTs together with a locally constructed certificate chain 363 which is trusted (i.e. terminated in a pre-loaded or locally 364 installed Trust Anchor) in an sct_feedback object or equivalent data 365 structure for later use in SCT Feedback. 367 The SCTs stored on the client MUST be keyed by the exact domain name 368 the client contacted. They MUST NOT be sent to any domain not 369 matching the original domain (e.g. if the original domain is 370 sub.example.com they must not be sent to sub.sub.example.com or to 371 example.com.) They MUST NOT be sent to any Subject Alternative Names 372 specified in the certificate. In the case of certificates that 373 validate multiple domain names, the same SCT is expected to be stored 374 multiple times. 376 Not following these constraints would increase the risk of two types 377 of privacy breaches. First, the HTTPS server receiving the SCT would 378 learn about other sites visited by the HTTPS client. Second, 379 auditors receiving SCTs from the HTTPS server would learn information 380 about other HTTPS servers visited by its clients. 382 If the client later again connects to the same HTTPS server, it again 383 receives a set of SCTs and calculates a certificate chain, and again 384 creates an sct_feedback or similar object. If this object does not 385 exactly match an existing object in the store, then the client MUST 386 add this new object to the store, associated with the exact domain 387 name contacted, as described above. An exact comparison is needed to 388 ensure that attacks involving alternate chains are detected. An 389 example of such an attack is described in 390 [dual-ca-compromise-attack].
However, at least one optimization is 391 safe and MAY be performed: If the certificate chain exactly matches 392 an existing certificate chain, the client MAY store the union of the 393 SCTs from the two objects in the first (existing) object. 395 If the client does connect to the same HTTPS server a subsequent 396 time, it MUST send to the server sct_feedback objects in the store 397 that are associated with that domain name. However, it is not 398 necessary to send an sct_feedback object constructed from the current 399 TLS session, and if the client does so, it MUST NOT be marked as sent 400 in any internal tracking done by the client. 402 Refer to Section 11.3 for recommendations for implementation. 404 Because SCTs can be used as a tracking mechanism (see 405 Section 10.5.2), they deserve special treatment when they are 406 received from (and provided to) domains that are loaded as 407 subresources from an origin domain. Such domains are commonly called 408 'third party domains'. An HTTPS client SHOULD store SCT Feedback 409 using a 'double-keying' approach, which isolates third party domains 410 by the first party domain. This is described in [double-keying]. 412 Gossip would be performed normally for third party domains only when 413 the user revisits the first party domain. In lieu of 'double- 414 keying', an HTTPS client MAY treat SCT Feedback in the same manner it 415 treats other security mechanisms that can enable tracking (such as 416 HSTS and HPKP.) 418 If the HTTPS client has configuration options for not sending cookies 419 to third parties, SCTs of third parties MUST be treated as cookies 420 with respect to this setting. This prevents third party tracking 421 through the use of SCTs/certificates, which would bypass the cookie 422 policy. 
For domains that are only loaded as third party domains, the 423 client may never perform SCT Feedback; however the client may perform 424 STH Pollination after fetching an inclusion proof, as specified in 425 Section 8.2. 427 SCTs and corresponding certificates are POSTed to the originating 428 HTTPS server at the well-known URL: 430 https://<domain>/.well-known/ct-gossip/v1/sct-feedback 432 The data sent in the POST is defined in Section 8.1.1. This data 433 SHOULD be sent in an already-established TLS session. This makes it 434 hard for an attacker to disrupt SCT Feedback without also disturbing 435 ordinary secure browsing (https://). This is discussed more in 436 Section 11.1.1. 438 The HTTPS server SHOULD respond with an HTTP 200 response code and an 439 empty body if it was able to process the request. An HTTPS client 440 that receives any other response SHOULD consider it an error. 442 Some clients have trust anchors or logs that are locally added (e.g. 443 by an administrator or by the user themselves). These additions are 444 potentially privacy-sensitive because they can carry information 445 about the specific configuration, computer, or user. 447 Certificates validated by locally added trust anchors will commonly 448 have no SCTs associated with them, so in this case no action is 449 needed with respect to CT Gossip. SCTs issued by locally added logs 450 MUST NOT be reported via SCT Feedback. 452 If a certificate is validated by SCTs that are issued by publicly 453 trusted logs, but chains to a local trust anchor, the client MAY 454 perform SCT Feedback for this SCT and certificate chain bundle. If 455 it does so, the client MUST include the full chain of certificates 456 chaining to the local trust anchor in the x509_chain array. 457 Performing SCT Feedback in this scenario may be advantageous for the 458 broader internet and CT ecosystem, but may also disclose information 459 about the client.
If the client elects to omit SCT Feedback, it can 460 choose to perform STH Pollination after fetching an inclusion proof, 461 as specified in Section 8.2. 463 We require the client to send the full chain (or nothing at all) for 464 two reasons. Firstly, it simplifies the operation on the server if 465 there are not two code paths. Secondly, omitting the chain does not 466 actually preserve user privacy. The Issuer field in the certificate 467 describes the signing certificate. And if the certificate is being 468 submitted at all, it means the certificate is logged, and has SCTs. 469 This means that the Issuer can be queried and obtained from the log, 470 so omitting the signing certificate from the client's submission does 471 not actually help user privacy. 473 8.1.3. HTTPS server operation 475 HTTPS servers can be configured (or omit configuration), resulting 476 in, broadly, two modes of operation. In the simpler mode, the server 477 will only track leaf certificates and SCTs applicable to those leaf 478 certificates. In the more complex mode, the server will confirm the 479 client's chain validation and store the certificate chain. The 480 latter mode requires more configuration, but is necessary to prevent 481 denial of service (DoS) attacks on the server's storage space. 483 In the simple mode of operation, upon receiving a submission at the 484 sct-feedback well-known URL, an HTTPS server will perform a set of 485 operations, checking on each sct_feedback object before storing it: 487 1. the HTTPS server MAY modify the sct_feedback object, and discard 488 all items in the x509_chain array except the first item (which is 489 the end-entity certificate) 491 2. if a bit-wise compare of the sct_feedback object matches one 492 already in the store, this sct_feedback object SHOULD be 493 discarded 495 3. if the leaf cert is not for a domain for which the server is 496 authoritative, the SCT MUST be discarded 498 4. 
if an SCT in the sct_data array can't be verified to be a valid 499 SCT for the accompanying leaf cert, and issued by a known log, 500 the individual SCT SHOULD be discarded 502 The modification in step number 1 is necessary to prevent a malicious 503 client from exhausting the server's storage space. A client can 504 generate their own issuing certificate authorities, and create an 505 arbitrary number of chains that terminate in an end-entity 506 certificate with an existing SCT. By discarding all but the end- 507 entity certificate, we prevent a simple HTTPS server from storing 508 this data. Note that operation in this mode will not prevent the 509 attack described in [dual-ca-compromise-attack]. Skipping this step 510 requires additional configuration as described below. 512 The check in step 2 is for detecting duplicates and minimizing 513 processing and storage by the server. As on the client, an exact 514 comparison is needed to ensure that attacks involving alternate 515 chains are detected. Again, at least one optimization is safe and 516 MAY be performed. If the certificate chain exactly matches an 517 existing certificate chain, the server MAY store the union of the 518 SCTs from the two objects in the first (existing) object. If the 519 validity check on any of the SCTs fails, the server SHOULD NOT store 520 the union of the SCTs. 522 The check in step 3 is to help prevent malfunctioning clients from exposing 523 which sites they visit. It additionally helps prevent DoS attacks on 524 the server. 526 [ Note: Thinking about building this, how does the SCT Feedback app 527 know which sites it's authoritative for? It will need that amount of 528 configuration at least. ] 530 The check in step 4 is to prevent DoS attacks where an adversary 531 fills up the store prior to attacking a client (thus preventing the 532 client's feedback from being recorded), or an attack where an 533 adversary simply attempts to fill up the server's storage space.
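The four simple-mode steps can be illustrated with a minimal Python sketch. It is not part of the specification and rests on several assumptions: the class name SimpleFeedbackStore and the callbacks is_authoritative and sct_is_valid are hypothetical hooks that an operator would have to supply, the "bit-wise compare" of step 2 is approximated by comparing a canonical JSON encoding, duplicate detection is applied to the object as it would be stored (after steps 1 and 4), and an object left with no valid SCTs is dropped.

```python
import json
from typing import Callable, Dict, List


class SimpleFeedbackStore:
    """Simple-mode store: keeps leaf certificates and their SCTs only."""

    def __init__(self,
                 is_authoritative: Callable[[str], bool],
                 sct_is_valid: Callable[[str, str], bool]) -> None:
        # Both callbacks are hypothetical hooks supplied by the operator.
        self._is_authoritative = is_authoritative
        self._sct_is_valid = sct_is_valid
        self._objects: Dict[str, dict] = {}

    def submit(self, feedback: dict) -> bool:
        """Return True if a new object was stored, False if it was dropped."""
        chain: List[str] = feedback.get("x509_chain") or []
        if not chain:
            return False
        # Step 1: keep only the end-entity certificate.
        leaf = chain[0]
        # Step 3: drop feedback for domains this server does not serve.
        if not self._is_authoritative(leaf):
            return False
        # Step 4: drop individual SCTs that fail verification.
        scts = [s for s in feedback.get("sct_data") or []
                if self._sct_is_valid(leaf, s)]
        if not scts:
            # Assumption: an object with no valid SCTs is not worth storing.
            return False
        stored = {"x509_chain": [leaf], "sct_data": scts}
        # Step 2: exact-match duplicate detection on a canonical encoding.
        key = json.dumps(stored, sort_keys=True)
        if key in self._objects:
            return False
        self._objects[key] = stored
        return True

    def stored(self) -> List[dict]:
        """Return the stored sct_feedback objects."""
        return list(self._objects.values())
```

Because step 1 truncates the chain before the duplicate check, two submissions that differ only in their issuer certificates collapse into one stored object. This bounds storage, and it is also why this mode cannot detect the attack in [dual-ca-compromise-attack].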
535 The above describes the simpler mode of operation. In the more 536 advanced server mode, the server will detect the attack described in 537 [dual-ca-compromise-attack]. In this configuration the server will 538 not modify the sct_feedback object prior to performing checks 2, 3, 539 and 4. 541 To prevent a malicious client from filling the server's data store, 542 the HTTPS server SHOULD perform an additional check in the more 543 advanced mode: 545 o if the x509_chain consists of an invalid certificate chain, or the 546 culminating trust anchor is not recognized by the server, the 547 server SHOULD modify the sct_feedback object, discarding all items 548 in the x509_chain array except the first item 550 The HTTPS server MAY choose to omit checks 4 or 5 (check 5 being the additional chain check above). This will place 551 the server at risk of having its data store filled up by invalid 552 data, but can also allow a server to identify interesting certificates 553 or certificate chains that omit valid SCTs, or do not chain to a 554 trusted root. This information may enable an HTTPS server operator 555 to detect attacks or unusual behavior of Certificate Authorities even 556 outside the Certificate Transparency ecosystem. 558 8.1.4. HTTPS server to auditors 560 HTTPS servers receiving SCTs from clients SHOULD share SCTs and 561 certificate chains with CT auditors by either serving them on the 562 well-known URL: 564 https://<domain>/.well-known/ct-gossip/v1/collected-sct-feedback 566 or by HTTPS POSTing them to a set of preconfigured auditors. This 567 allows an HTTPS server to choose between an active push model and a 568 passive pull model. 570 The data received in a GET of the well-known URL or sent in the POST 571 is defined in Section 8.1.1 with the following difference: The 572 x509_chain element may contain only the end-entity certificate, as 573 described below. 575 HTTPS servers SHOULD share all sct_feedback objects they see that 576 pass the checks in Section 8.1.3.
If this is an infeasible amount of 577 data, the server MAY choose to expire submissions according to an 578 undefined policy. Suggestions for such a policy can be found in 579 Section 11.3. 581 HTTPS servers MUST NOT share any other data that they may learn from 582 the submission of SCT Feedback by HTTPS clients, like the HTTPS 583 client IP address or the time of submission. 585 As described above, HTTPS servers can be configured (or omit 586 configuration), resulting in two modes of operation. In one mode, 587 the x509_chain array will contain a full certificate chain. This 588 chain may terminate in a trust anchor the auditor may recognize, or 589 it may not. (One scenario where this could occur is if the client 590 submitted a chain terminating in a locally added trust anchor, and 591 the server kept this chain.) In the other mode, the x509_chain array 592 will consist of only a single element, which is the end-entity 593 certificate. 595 Auditors SHOULD provide the following URL accepting HTTPS POSTing of 596 SCT feedback data: 598 https:///ct-gossip/v1/sct-feedback 600 Auditors SHOULD regularly poll HTTPS servers at the well-known 601 collected-sct-feedback URL. The frequency of the polling and how to 602 determine which domains to poll is outside the scope of this 603 document. However, the selection MUST NOT be influenced by potential 604 HTTPS clients connecting directly to the auditor. For example, if a 605 poll to example.com occurs directly after a client submits an SCT for 606 example.com, an adversary observing the auditor can trivially 607 conclude the activity of the client. 609 8.2. STH pollination 611 The goal of sharing Signed Tree Heads (STHs) through pollination is 612 to share STHs between HTTPS clients and CT auditors while still 613 preserving the privacy of the end user. 
The sharing of STHs contributes to the overall goal of detecting
misbehaving logs by providing CT auditors with STHs from many vantage
points, making it possible to detect logs that are presenting
inconsistent views.

HTTPS servers supporting the protocol act as STH pools. HTTPS
clients and CT auditors in possession of STHs can pollinate STH
pools by sending STHs to them, and retrieving new STHs to send to
other STH pools. CT auditors can improve the value of their auditing
by retrieving STHs from pools.

HTTPS clients send STHs to HTTPS servers by POSTing them to the
well-known URL:

   https:///.well-known/ct-gossip/v1/sth-pollination

The data sent in the POST is defined in Section 8.2.4. This data
SHOULD be sent in an already established TLS session. This makes it
hard for an attacker to disrupt STH gossiping without also disturbing
ordinary secure browsing (https://). This is discussed more in
Section 11.1.1.

On a successful connection to an HTTPS server implementing STH
Pollination, the response code will be 200, and the response body is
application/json, containing zero or more STHs in the same format, as
described in Section 8.2.4.

An HTTPS client may acquire STHs by several methods:

o  in replies to pollination POSTs;

o  asking logs that it recognizes for the current STH, either
   directly (v2/get-sth) or indirectly (for example over DNS);

o  resolving an SCT and certificate to an STH via an inclusion proof;

o  resolving one STH to another via a consistency proof.

HTTPS clients (that have STHs) and CT auditors SHOULD pollinate STH
pools with STHs. Which STHs to send and how often pollination should
happen is regarded as undefined policy with the exception of privacy
concerns explained below. Suggestions for the policy can be found in
Section 11.3.

An HTTPS client could be tracked by giving it a unique or rare STH.
To address this concern, we place restrictions on different
components of the system to ensure an STH will not be rare.

o  HTTPS clients silently ignore STHs from logs with an STH issuance
   frequency of more than one STH per hour. Logs use the STH
   Frequency Count metadata to express this ([RFC-6962-BIS-09]
   sections 3.6 and 5.1).

o  HTTPS clients silently ignore STHs which are not fresh.

An STH is considered fresh iff its timestamp is less than 14 days in
the past. Given a maximum STH issuance rate of one per hour, an
attacker has 336 (14 days times 24 hours) unique STHs per log for
tracking. Clients MUST ignore STHs older than 14 days. We consider
STHs within this validity window not to be personally identifiable
data, and STHs outside this window to be personally identifiable.

When multiplied by the number of logs from which a client accepts
STHs, this number of unique STHs grows and the negative privacy
implications grow with it. It's important that this is taken into
account when logs are chosen for default settings in HTTPS clients.
This concern is discussed in Section 10.5.5.

A log may cease operation, in which case there will soon be no STH
within the validity window. Clients SHOULD perform all three methods
of gossip about a log that has ceased operation since it is possible
the log was still compromised and gossip can detect that. STH
Pollination is the one mechanism where a client must know about a log
shutdown. A client that does not know about a log shutdown MUST NOT
attempt any heuristic to detect a shutdown. Instead the client MUST
be informed about the shutdown from a verifiable source (e.g. a
software update). The client SHOULD be provided the final STH issued
by the log and SHOULD resolve SCTs and STHs to this final STH.
If an SCT or STH cannot be resolved to the final STH, clients SHOULD
follow the requirements and recommendations set forth in
Section 11.1.2.

8.2.1. HTTPS Clients and Proof Fetching

There are two types of proofs a client may retrieve: inclusion proofs
and consistency proofs.

An HTTPS client will retrieve SCTs together with certificate chains
from an HTTPS server. Using the timestamp in the SCT together with
the end-entity certificate and the issuer key hash, it can obtain an
inclusion proof to an STH in order to verify the promise made by the
SCT.

An HTTPS client will have STHs from performing STH Pollination, and
may obtain a consistency proof to a more recent STH.

An HTTPS client may also receive an SCT bundled with an inclusion
proof to a historical STH via an unspecified future mechanism.
Because this historical STH is considered personally identifiable
information per above, the client needs to obtain a consistency proof
to a more recent STH.

A client SHOULD perform proof fetching. A client MUST NOT perform
proof fetching for any SCTs or STHs issued by a locally added log. A
client MAY fetch an inclusion proof for an SCT (issued by a
pre-loaded log) that validates a certificate chaining to a locally
added trust anchor.

If a client requested either proof directly from a log or auditor, it
would reveal the client's browsing habits to a third party. To
mitigate this risk, an HTTPS client MUST retrieve the proof in a
manner that disguises the client.

Depending on the client's DNS provider, DNS may provide an
appropriate intermediate layer that obfuscates the linkability
between the user of the client and the request for inclusion (while
at the same time providing a caching layer for oft-requested
inclusion proofs). See [draft-ct-over-dns] for an example of how
this can be done.
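
Whichever transport is used to fetch an inclusion proof, the client
still has to check the proof against the root hash carried in the
STH. The following Python sketch illustrates that check using the
Merkle audit path procedure familiar from the CT specifications
(SHA-256, with a 0x00 prefix for leaves and a 0x01 prefix for
interior nodes). The function and parameter names are illustrative
assumptions and do not come from this document; how the leaf input is
built from an SCT and certificate, and how the proof is transported,
are left out.

```python
import hashlib
from typing import List


def leaf_hash(entry: bytes) -> bytes:
    """Merkle leaf hash: SHA-256(0x00 || entry)."""
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Merkle interior node hash: SHA-256(0x01 || left || right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, leaf_index: int, tree_size: int,
                     audit_path: List[bytes], root_hash: bytes) -> bool:
    """Check that audit_path links the leaf hash `leaf` at position
    leaf_index to root_hash in a tree with tree_size leaves."""
    if not 0 <= leaf_index < tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf
    for p in audit_path:
        if sn == 0:
            return False  # more path elements than tree levels
        if fn % 2 == 1 or fn == sn:
            # Current node is a right child: sibling goes on the left.
            r = node_hash(p, r)
            # Skip levels where the node was promoted without a sibling.
            while fn % 2 == 0 and fn != 0:
                fn >>= 1
                sn >>= 1
        else:
            # Current node is a left child: sibling goes on the right.
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    # All levels must be consumed and the computed root must match.
    return sn == 0 and r == root_hash
```

For a log containing a single entry, the audit path is empty and the
check reduces to comparing the leaf hash with the root hash. A
failed check is not necessarily a transient error and would be
handled according to the client's error policy.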
Anonymity networks such as Tor also present a mechanism for a client
to anonymously retrieve a proof from an auditor or log.

Even when using a privacy-preserving layer between the client and the
log, certain observations may be made about an anonymous client or
general user behavior depending on how proofs are fetched. For
example, if a client fetched all outstanding proofs at once, a log
would know that SCTs or STHs received around the same time are more
likely to come from a particular client. This could potentially go
so far as correlation of activity at different times to a single
client. In aggregate the data could reveal what sites are commonly
visited together. HTTPS clients SHOULD use a strategy of proof
fetching that attempts to obfuscate these patterns. A suggestion of
such a policy can be found in Section 11.2.

Resolving either SCTs or STHs may result in errors. These errors
may be routine downtime or other transient errors, or they may be
indicative of an attack. Clients SHOULD follow the requirements and
recommendations set forth in Section 11.1.2 when handling these
errors in order to give the CT ecosystem the greatest chance of
detecting and responding to a compromise.

8.2.2. STH Pollination without Proof Fetching

An HTTPS client MAY participate in STH Pollination without fetching
proofs. In this situation, the client receives STHs from a server,
applies the same validation logic to them (signed by a known log,
within the validity window) and will later pass them to another HTTPS
server.

When operating in this fashion, the HTTPS client is promoting gossip
for Certificate Transparency, but derives no direct benefit itself.
In comparison, a client that resolves SCTs or historical STHs to
recent STHs and pollinates them is assured that if it was attacked,
there is a probability that the ecosystem will detect and respond to
the attack (by distrusting the log).

8.2.3. Auditor Action

CT auditors participate in STH pollination by retrieving STHs from
HTTPS servers. They verify that the STH is valid by checking the
signature, and requesting a consistency proof from the STH to the
most recent STH.

After retrieving the consistency proof to the most recent STH, they
SHOULD pollinate this new STH among participating HTTPS servers. In
this way, as STHs "age out" and are no longer fresh, their "lineage"
continues to be tracked in the system.

8.2.4. STH Pollination data format

The data sent from HTTPS clients and CT auditors to HTTPS servers is
a JSON object [RFC7159] with the following content:

o  sths - an array of 0 or more fresh SignedTreeHeads as defined in
   [RFC-6962-BIS-09] Section 3.6.1.

8.3. Trusted Auditor Stream

HTTPS clients MAY send SCTs and cert chains, as well as STHs,
directly to auditors. If sent, this data MAY include data that
reflects locally added logs or trust anchors. Note that there are
privacy implications in doing so; these are outlined in
Section 10.5.1 and Section 10.5.6.

The most natural trusted auditor arrangement arguably is a web
browser that is "logged in to" a provider of various internet
services. Another equivalent arrangement is a trusted party like a
corporation to which an employee is connected through a VPN or by
other similar means. A third might be individuals or smaller groups
of people running their own services. In such a setting, retrieving
proofs from that third party could be considered reasonable from a
privacy perspective.
The HTTPS client may also do its own auditing
and might additionally share SCTs and STHs with the trusted party to
contribute to herd immunity. Here, the ordinary [RFC-6962-BIS-09]
protocol is sufficient for the client to do the auditing while SCT
Feedback and STH Pollination can be used in whole or in part for the
gossip part.

Another well established trusted party arrangement on the internet
today is the relation between internet users and their providers of
DNS resolver services. DNS resolvers are typically provided by the
internet service provider (ISP) used, who by the nature of name
resolution already know a great deal about which sites their users
visit. As mentioned in Section 8.2.1, in order for HTTPS clients to
be able to retrieve proofs in a privacy preserving manner, logs could
expose a DNS interface in addition to the ordinary HTTPS interface.
A specification of such a protocol can be found in
[draft-ct-over-dns].

8.3.1. Trusted Auditor data format

Trusted Auditors expose a REST API at the fixed URI:

   https:///ct-gossip/v1/trusted-auditor

Submissions are made by sending an HTTPS POST request, with the body
of the POST in a JSON object. Upon successful receipt the Trusted
Auditor returns 200 OK.

The JSON object consists of two top-level keys: 'sct_feedback' and
'sths'. The 'sct_feedback' value is an array of JSON objects as
defined in Section 8.1.1. The 'sths' value is an array of STHs as
defined in Section 8.2.4.

Example:

   {
     "sct_feedback" :
       [
         {
           "x509_chain" :
             [
               "-----BEGIN CERTIFICATE-----\nAAA...",
               "-----BEGIN CERTIFICATE-----\nAAA...",
               ...
             ],
           "sct_data" :
             [
               "AAA...",
               "AAA...",
               ...
             ]
         }, ...
       ],
     "sths" :
       [
         "AAA...",
         "AAA...",
         ...
       ]
   }

9. 3-Method Ecosystem

The use of three distinct methods for auditing logs may seem
excessive, but each represents a needed component in the CT
ecosystem. To understand why, the drawbacks of each component must
be outlined. In this discussion we assume that an attacker knows
which mechanisms an HTTPS client and HTTPS server implement.

9.1. SCT Feedback

SCT Feedback requires the cooperation of HTTPS clients and more
importantly HTTPS servers. Although SCT Feedback does require a
significant amount of server-side logic to respond to the
corresponding APIs, this functionality does not require
customization, so it may be pre-provided and work out of the box.
However, to take full advantage of the system, an HTTPS server would
wish to perform some configuration to optimize its operation:

o  Minimize its disk commitment by maintaining a list of known SCTs
   and certificate chains (or hashes thereof)

o  Maximize its chance of detecting a misissued certificate by
   configuring a trust store of CAs

o  Establish a "push" mechanism for POSTing SCTs to CT auditors

These configuration needs, and the simple fact that it would require
some deployment of software, mean that some percentage of HTTPS
servers will not deploy SCT Feedback.

It is worthwhile to note that an attacker may be able to prevent
detection of an attack on a webserver (in all cases) if SCT Feedback
is not implemented. This attack is detailed in Section 10.1.

If SCT Feedback were the only mechanism in the ecosystem, any server
that did not implement the feature would open itself and its users to
attack without any possibility of detection.

If SCT Feedback is not deployed by a webserver, malicious logs will
be able to attack all users of the webserver (who do not have a
Trusted Auditor relationship) with impunity.
Additionally, users who wish to have the strongest measure of privacy
protection (by disabling STH Pollination Proof Fetching and forgoing
a Trusted Auditor) could be attacked without risk of detection.

9.2. STH Pollination

STH Pollination requires the cooperation of HTTPS clients, HTTPS
servers, and logs.

For a client to fully participate in STH Pollination, and have this
mechanism detect attacks against it, the client must have a way to
safely perform Proof Fetching in a privacy preserving manner. (The
client may pollinate STHs it receives without performing Proof
Fetching, but we do not consider this option in this section.)

HTTPS servers must deploy software (although, as in the case with SCT
Feedback, this logic can be pre-provided) and commit some
configurable amount of disk space to the endeavor.

Logs (or a third party mirroring the logs) must provide access to
clients to query proofs in a privacy preserving manner, most likely
through DNS.

Unlike SCT Feedback, the STH Pollination mechanism is not hampered if
only a minority of HTTPS servers deploy it. However, it makes an
assumption that an HTTPS client performs Proof Fetching (such as the
DNS mechanism discussed). Unfortunately, any manner that is
anonymous for some (such as clients who use shared DNS services such
as a large ISP) may not be anonymous for others.

For instance, DNS requests expose a considerable amount of sensitive
information (including what data is already present in the cache) in
plaintext over the network. For this reason, some percentage of
HTTPS clients may choose to not enable the Proof Fetching component
of STH Pollination. (Although they can still request and send STHs
among participating HTTPS servers, even when this affords them no
direct benefit.)
If STH Pollination were the only mechanism deployed, users that
disable it would be able to be attacked without risk of detection.

If STH Pollination were not deployed, HTTPS clients visiting HTTPS
servers that did not deploy SCT Feedback could be attacked without
risk of detection.

9.3. Trusted Auditor Relationship

The Trusted Auditor Relationship is expected to be the rarest gossip
mechanism, as an HTTPS client is providing an unadulterated report of
its browsing history to a third party. While there are valid and
common reasons for doing so, there is no appropriate way to enter
into this relationship without obtaining informed consent from the
user.

However, the Trusted Auditor Relationship mechanism still provides
value to a class of HTTPS clients. For example, web crawlers have no
concept of a "user" and no expectation of privacy. Organizations
already performing network auditing for anomalies or attacks can run
their own Trusted Auditor for the same purpose with marginal increase
in privacy concerns.

The ability to change one's Trusted Auditor is a form of Trust
Agility that allows a user to choose who to trust, and be able to
revise that decision later without consequence. A Trusted Auditor
connection can be made more confidential than DNS (through the use of
TLS), and can even be made (somewhat) anonymous through the use of
anonymity services such as Tor. (Note that this does ignore the
de-anonymization possibilities available from viewing a user's
browsing history.)

If the Trusted Auditor relationship were the only mechanism deployed,
users who do not enable it (the majority) would be able to be
attacked without risk of detection.

If the Trusted Auditor relationship were not deployed, crawlers and
organizations would build it themselves for their own needs.
By standardizing it, users who wish to opt in (for instance those
unwilling to participate fully in STH Pollination) can have an
interoperable standard they can use to choose and change their
trusted auditor.

9.4. Interaction

The interactions of the mechanisms are outlined as follows:

HTTPS clients can be attacked without risk of detection if they do
not participate in any of the three mechanisms.

HTTPS clients are afforded the greatest chance of detecting an attack
when they either participate in both SCT Feedback and STH Pollination
with Proof Fetching or if they have a Trusted Auditor relationship.
(Participating in SCT Feedback is required to prevent a malicious log
from refusing to ever resolve an SCT to an STH, as put forward in
Section 10.1.) Additionally, participating in SCT Feedback enables
an HTTPS client to assist in detecting the exact target of an attack.

HTTPS servers that omit SCT Feedback enable malicious logs to carry
out attacks without risk of detection. If these servers are targeted
specifically, even if the attack is detected, without SCT Feedback
they may never learn that they were specifically targeted. HTTPS
servers without SCT Feedback do gain some measure of herd immunity,
but only because their clients participate in STH Pollination (with
Proof Fetching) or have a Trusted Auditor Relationship.

When HTTPS servers omit SCT feedback, it allows their users to be
attacked without detection by a malicious log; the vulnerable users
are those who do not have a Trusted Auditor relationship.

10. Security considerations

10.1. Attacks by actively malicious logs

One of the most powerful attacks possible in the CT ecosystem is a
trusted log that has actively decided to be malicious. It can carry
out an attack in two ways:

In the first attack, the log can present a split view of the log for
all time.
The only way to detect this attack is to resolve each view
of the log to the two most recent STHs and then force the log to
present a consistency proof. (Which it cannot.) This attack can be
detected by CT auditors participating in STH Pollination, as long as
they are explicitly built to handle the situation of a log
continuously presenting a split view.

In the second attack, the log can sign an SCT, and refuse to ever
include the certificate that the SCT refers to in the tree.
(Alternatively, it can include it in a branch of the tree and issue
an STH, but then abandon that branch.) Whenever someone requests an
inclusion proof for that SCT (or a consistency proof from that STH),
the log would respond with an error, and a client may simply regard
the response as a transient error. This attack can be detected using
SCT Feedback, or an Auditor of Last Resort, as presented in
Section 11.1.2.

10.2. Dual-CA Compromise

[dual-ca-compromise-attack] describes an attack possible by an
adversary who compromises two Certificate Authorities and a Log. This
attack is difficult to defend against in the CT ecosystem, and
[dual-ca-compromise-attack] describes a few approaches to doing so.
We note that Gossip is not intended to defend against this attack,
but can in certain modes.

Defending against the Dual-CA Compromise attack requires SCT
Feedback, and explicitly requires the server to save full certificate
chains (described in Section 8.1.3 as the 'complex' configuration).
After CT auditors receive the full certificate chains from servers,
they MAY compare the chain built by clients to the chain supplied by
the log. If the chains differ significantly, the auditor SHOULD
raise a concern.
A method of determining if chains differ significantly is by
asserting that one chain is not a subset of the other and that the
roots of the chains are different.

[Note: Justification for this algorithm:

Cross-Signatures could result in a different org being treated as the
'root', but in this case, one chain would be a subset of the other.

Intermediate swapping (e.g. different signature algorithms) could
result in different chains, but the root would be the same.

(Hitting both those cases at once would cause a false positive
though, but this would likely be rare.)

Are there other cases that could occur? (Left for the purposes of
reading during pre-Last Call, to be removed by Editor)]

10.3. Censorship/Blocking considerations

We assume a network attacker who is able to fully control the
client's internet connection for some period of time, including
selectively blocking requests to certain hosts and truncating TLS
connections based on information observed or guessed about client
behavior. In order to successfully detect log misbehavior, the
gossip mechanisms must still work even in these conditions.

There are several gossip connections that can be blocked:

1. Clients sending SCTs to servers in SCT Feedback

2. Servers sending SCTs to auditors in SCT Feedback (server push
   mechanism)

3. Servers making SCTs available to auditors (auditor pull
   mechanism)

4. Clients fetching proofs in STH Pollination

5. Clients sending STHs to servers in STH Pollination

6. Servers sending STHs to clients in STH Pollination

7. Clients sending SCTs to Trusted Auditors

If a party cannot connect to another party, it can be assured that
the connection did not succeed. While it may not have been
maliciously blocked, it knows the transaction did not succeed.
Mechanisms which result in a positive affirmation from the recipient
that the transaction succeeded allow confirmation that a connection
was not blocked. In this situation, the party can factor this into
strategies suggested in Section 11.3 and in Section 11.1.2.

The connections that allow positive affirmation are 1, 2, 4, 5, and
7.

More insidious is blocking the connections that do not allow positive
confirmation: 3 and 6. An attacker may truncate or drop a response
from a server to a client, such that the server believes it has
shared data with the recipient, when it has not. However, in both
scenarios (3 and 6), the server cannot distinguish the client as a
cooperating member of the CT ecosystem or as an attacker performing a
Sybil attack, aiming to flush the server's data store. Therefore the
fact that these connections can be undetectably blocked does not
actually alter the threat model of servers responding to these
requests. The choice of algorithm to release data is crucial to
protect against these attacks; strategies are suggested in
Section 11.3.

Handling censorship and network blocking (which is indistinguishable
from network error) is relegated to the implementation policy chosen
by clients. Suggestions for client behavior are specified in
Section 11.1.

10.4. Flushing Attacks

A flushing attack is an attempt by an adversary to flush a particular
piece of data from a pool. In the CT Gossip ecosystem, an attacker
may have performed an attack and left evidence of a compromised log
on a client or server. They would be interested in flushing that
data, i.e. tricking the target into gossiping or pollinating the
incriminating evidence with only attacker-controlled clients or
servers with the hope they trick the target into deleting it.
Flushing attacks may be defended against differently depending on the
entity (HTTPS client or HTTPS server) and record (STHs or SCTs with
Certificate Chains).

10.4.1. STHs

For both HTTPS clients and HTTPS servers, STHs within the validity
window SHOULD NOT be deleted. An attacker cannot flush an item from
the cache if it is never removed so flushing attacks are completely
mitigated.

The required disk space for all STHs within the validity window is
336 STHs per log that is trusted. If 20 logs are trusted, and each
STH takes 1 Kilobyte, this is 6.56 Megabytes.

Note that it is important that implementors do not calculate the
exact size of cache expected - if an attack does occur, a small
number of additional STHs will enter into the cache. These STHs will
be in addition to the expected set, and will be evidence of the
attack.

If an HTTPS client or HTTPS server is operating in a constrained
environment and cannot devote enough storage space to hold all STHs
within the validity window, it is recommended to use the Deletion
Algorithm in Section 11.3.2 to make it more difficult for the
attacker to perform a flushing attack.

10.4.2. SCTs & Certificate Chains on HTTPS Servers

An HTTPS server will only accept SCTs and Certificate Chains for
domains it is authoritative for. Therefore the storage space needed
is bounded by the number of logs it accepts, multiplied by the number
of domains it is authoritative for, multiplied by the number of
certificates issued for those domains.

Imagine a server authoritative for 10,000 domains, and each domain
has 3 certificate chains, and 10 SCTs. A certificate chain is 5
Kilobytes in size and an SCT 1 Kilobyte. This yields 732 Megabytes.

This data can be large, but it is calculable.
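
As a rough illustration of how such estimates can be computed, the
following Python sketch works through the arithmetic. It assumes
that a Megabyte is 1024 Kilobytes, which reproduces the 6.56 Megabyte
figure for STHs. For SCTs and chains the result depends on how
chains and SCTs are counted per domain, so the per-domain footprint
is left as a parameter; the "per-chain" reading shown (10 SCTs kept
with each of the 3 chains) is only one possible assumption, and all
function names are hypothetical.

```python
KILOBYTES_PER_MEGABYTE = 1024  # assumption: binary megabytes


def to_megabytes(kilobytes: float) -> float:
    """Convert a size in Kilobytes to Megabytes."""
    return kilobytes / KILOBYTES_PER_MEGABYTE


def sth_cache_kilobytes(logs: int, sths_per_log: int = 336,
                        sth_kilobytes: float = 1.0) -> float:
    """Space for every STH in the validity window across all logs."""
    return logs * sths_per_log * sth_kilobytes


def sct_store_kilobytes(domains: int, per_domain_kilobytes: float) -> float:
    """Space for SCTs and chains, given a per-domain footprint."""
    return domains * per_domain_kilobytes


# 20 trusted logs, 336 STHs each, 1 Kilobyte per STH.
sth_megabytes = to_megabytes(sth_cache_kilobytes(logs=20))

# One possible reading: each of 3 chains (5 KB) is stored with 10 SCTs
# (1 KB each), giving 45 KB per domain.
per_chain_reading = 3 * (5 + 10 * 1)
sct_megabytes = to_megabytes(sct_store_kilobytes(10_000, per_chain_reading))

# A footprint of 75 KB per domain reproduces a figure of about 732 MB.
sct_megabytes_75 = to_megabytes(sct_store_kilobytes(10_000, 75))

print(round(sth_megabytes, 2), round(sct_megabytes, 2),
      round(sct_megabytes_75, 2))
```

Whatever counting is chosen, the estimate remains a simple product of
known quantities, so an operator can size the data store in advance.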
Web properties with more certificates and domains are more likely to
be able to handle the increased storage need, while small web
properties will not see an undue burden. Therefore HTTPS servers
SHOULD NOT delete SCTs or Certificate Chains. This completely
mitigates flushing attacks.

Again, note that it is important that implementors do not calculate
the exact size of cache expected - if an attack does occur, the new
SCT(s) and Certificate Chain(s) will enter into the cache. This data
will be in addition to the expected set, and will be evidence of the
attack.

If an HTTPS server is operating in a constrained environment and
cannot devote enough storage space to hold all SCTs and Certificate
Chains it is authoritative for, it is recommended to configure the
SCT Feedback mechanism to allow only certain certificates that are
known to be valid. These chains and SCTs can then be discarded
without being stored or subsequently provided to any clients or
auditors. If the allowlist is not sufficient, the Deletion Algorithm
in Section 11.3.2 is recommended to make it more difficult for the
attacker to perform a flushing attack.

10.4.3. SCTs & Certificate Chains on HTTPS Clients

HTTPS clients will accumulate SCTs and Certificate Chains without
bound. It is expected they will choose a particular cache size and
delete entries when the cache size reaches its limit. This does not
mitigate flushing attacks, and such an attack is documented in
[gossip-mixing].

The Deletion Algorithm in Section 11.3.2 is recommended to make it
more difficult for the attacker to perform a flushing attack.

10.5. Privacy considerations

CT Gossip deals with HTTPS clients which are trying to share
indicators that correspond to their browsing history.
The most sensitive relationships in the CT ecosystem are the
relationships between HTTPS clients and HTTPS servers. Client-server
relationships can be aggregated into a network graph with potentially
serious implications for correlative de-anonymization of clients and
relationship-mapping or clustering of servers or of clients.

There are, however, certain clients that do not require privacy
protection. Examples of these clients are web crawlers or robots.
But even in this case, the method by which these clients crawl the
web may in fact be considered sensitive information. In general, it
is better to err on the side of safety, and not assume a client is
okay with giving up its privacy.

10.5.1. Privacy and SCTs

An SCT contains information that links it to a particular web site.
Because the client-server relationship is sensitive, gossip between
clients and servers about unrelated SCTs is risky. Therefore, a
client with an SCT for a given server SHOULD NOT transmit that
information over any channel other than the following two: to the
server associated with the SCT itself; or to a Trusted Auditor, if
one exists.

10.5.2. Privacy in SCT Feedback

SCTs introduce yet another mechanism for HTTPS servers to store state
on an HTTPS client, and potentially track users. HTTPS clients which
allow users to clear history or cookies associated with an origin
MUST clear stored SCTs and certificate chains associated with the
origin as well.

Auditors should treat all SCTs as sensitive data. SCTs received
directly from an HTTPS client are especially sensitive, because the
auditor is trusted by the client to not reveal their associations
with servers.
Auditors MUST NOT share such SCTs in any way, including sending them
to an external log, without first mixing them with multiple other
SCTs learned through submissions from multiple other clients.
Suggestions for mixing SCTs are presented in Section 11.3.

There is a possible fingerprinting attack where a log issues a unique
SCT for targeted log client(s). A colluding log and HTTPS server
operator could therefore be a threat to the privacy of an HTTPS
client. Given all the other opportunities for HTTPS servers to
fingerprint clients - TLS session tickets, HPKP and HSTS headers,
HTTP Cookies, etc. - this is considered acceptable.

The fingerprinting attack described above would be mitigated by a
requirement that logs must use a deterministic signature scheme when
signing SCTs ([RFC-6962-BIS-09] Section 2.1.4). A log signing using
RSA is not required to use a deterministic signature scheme.

Since logs are allowed to issue a new SCT for a certificate already
present in the log, mandating deterministic signatures does not stop
this fingerprinting attack altogether. It does make the attack
harder to pull off without being detected though.

There is another similar fingerprinting attack where an HTTPS server
tracks a client by using a unique certificate or a variation of cert
chains. The risk for this attack is accepted on the same grounds as
the unique SCT attack described above.

10.5.3. Privacy for HTTPS clients performing STH Proof Fetching

An HTTPS client performing Proof Fetching SHOULD NOT request proofs
from a CT log that it doesn't accept SCTs from. An HTTPS client
SHOULD regularly request an STH from all logs it is willing to
accept, even if it has seen no SCTs from that log.
1285 The time between two polls for new STHs SHOULD NOT be significantly 1286 shorter than the MMD of the polled log divided by its STH Frequency 1287 Count ([RFC-6962-BIS-09] Section 5.1). 1289 The actual mechanism by which Proof Fetching is done carries 1290 considerable privacy concerns. Although out of scope for this 1291 document, DNS is a mechanism currently discussed. DNS exposes data 1292 in plaintext over the network (including what sites the user is 1293 visiting and what sites they have previously visited) and may not be 1294 suitable for some clients. 1296 10.5.4. Privacy in STH Pollination 1298 An STH linked to an HTTPS client may indicate the following about 1299 that client: 1301 o that the client gossips; 1303 o that the client has been using CT at least until the time that the 1304 timestamp and the tree size indicate; 1306 o that the client is talking, possibly indirectly, to the log 1307 indicated by the tree hash; 1309 o which software and software version are being used. 1311 There is a possible fingerprinting attack where a log issues a unique 1312 STH for a targeted HTTPS client. This is similar to the 1313 fingerprinting attack described in Section 10.5.2, but can operate 1314 cross-origin. If a log (or HTTPS server cooperating with a log) 1315 provides a unique STH to a client, the targeted client will be the 1316 only client pollinating that STH cross-origin. 1318 This attack is partially mitigated because the log is limited in the number of 1319 STHs it can issue. It must 'save' one of its STHs each MMD to 1320 perform the attack. 1322 10.5.5. Privacy in STH Interaction 1324 An HTTPS client may pollinate any STH within the last 14 days. An 1325 HTTPS client may also pollinate an STH for any log that it knows 1326 about. When a client pollinates STHs to a server, it will release 1327 more than one STH at a time. It is unclear whether a server could 'prime' a 1328 client and then reliably detect that client at a later time.
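As a rough illustration of the 14-day window described above, the following Python sketch filters a client's stored STHs down to those still inside the window and releases a random subset of them. The names StoredSTH and pollination_candidates, and the max_to_send parameter, are hypothetical and are not defined by this document. The sketch shows one possible release policy only; it makes no claim about how well that policy resists the tracking concern discussed in this section.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# The 14-day pollination window described in the text above.
VALIDITY_WINDOW = timedelta(days=14)


@dataclass(frozen=True)
class StoredSTH:
    """Hypothetical record for an STH held by a client."""
    log_id: str          # assumed identifier for the issuing log
    timestamp: datetime  # STH timestamp (timezone-aware)
    data: bytes          # opaque serialized STH


def pollination_candidates(store, now, max_to_send):
    """Return up to max_to_send STHs from store that are inside the window.

    Selection uses the operating system's CSPRNG so that the order and
    choice of released STHs is not predictable to an observer.
    """
    fresh = [s for s in store if now - s.timestamp <= VALIDITY_WINDOW]
    if len(fresh) <= max_to_send:
        return fresh
    return secrets.SystemRandom().sample(fresh, max_to_send)
```

A caller would pass its own clock value and an upper bound of its choosing; STHs older than the window are never returned, whatever the bound.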
1330 It's clear that a single site can track a user any way they wish, but 1331 this attack works cross-origin and is therefore more concerning. Two 1332 independent sites A and B want to collaborate to track a user cross- 1333 origin. A feeds a client Carol some N specific STHs from the M logs 1334 Carol trusts, chosen to be older and less common, but still in the 1335 validity window. Carol visits B and chooses to release some of the 1336 STHs she has stored, according to some policy. 1338 Modeling a representation for how common older STHs are in the pools 1339 of clients, and examining that with a given policy of how to choose 1340 which of those STHs to send to B, it should be possible to calculate 1341 statistics about how unique Carol looks when talking to B and how 1342 useful/accurate such a tracking mechanism is. 1344 Building such a model is likely impossible without some real world 1345 data, and requires a given implementation of a policy. To combat 1346 this attack, suggestions are provided in Section 11.3 to attempt to 1347 minimize it, but follow-up testing with real world deployment to 1348 improve the policy will be required. 1350 10.5.6. Trusted Auditors for HTTPS Clients 1352 Some HTTPS clients may choose to use a trusted auditor. This trust 1353 relationship exposes a large amount of information about the client 1354 to the auditor. In particular, it will identify the web sites that 1355 the client has visited to the auditor. Some clients may already 1356 share this information to a third party, for example, when using a 1357 server to synchronize browser history across devices in a server- 1358 visible way, or when doing DNS lookups through a trusted DNS 1359 resolver. For clients with such a relationship already established, 1360 sending SCTs to a trusted auditor run by the same organization does 1361 not appear to expose any additional information to the trusted third 1362 party. 
1364 Clients who wish to contact a CT auditor without associating their 1365 identities with their SCTs may wish to use an anonymizing network 1366 like Tor to submit SCT Feedback to the auditor. Auditors SHOULD 1367 accept SCT Feedback that arrives over such anonymizing networks. 1369 Clients sending feedback to an auditor may prefer to reduce the 1370 temporal granularity of the history exposure to the auditor by 1371 caching and delaying their SCT Feedback reports. This is elaborated 1372 upon in Section 11.3. This strategy is only as effective as the 1373 granularity of the timestamps embedded in the SCTs and STHs. 1375 10.5.7. HTTPS Clients as Auditors 1377 Some HTTPS clients may choose to act as CT auditors themselves. A 1378 Client taking on this role needs to consider the following: 1380 o an Auditing HTTPS client potentially exposes its history to the 1381 logs that they query. Querying the log through a cache or a proxy 1382 with many other users may avoid this exposure, but may expose 1383 information to the cache or proxy, in the same way that a non- 1384 Auditing HTTPS Client exposes information to a Trusted Auditor. 1386 o an effective CT auditor needs a strategy about what to do in the 1387 event that it discovers misbehavior from a log. Misbehavior from 1388 a log involves the log being unable to provide either (a) a 1389 consistency proof between two valid STHs or (b) an inclusion proof 1390 for a certificate to an STH any time after the log's MMD has 1391 elapsed from the issuance of the SCT. The log's inability to 1392 provide either proof will not be externally cryptographically- 1393 verifiable, as it may be indistinguishable from a network error. 1395 11. Policy Recommendations 1397 This section is intended as suggestions to implementors of HTTPS 1398 Clients, HTTPS servers, and CT auditors. It is not a requirement for 1399 technique of implementation, so long as privacy considerations 1400 established above are obeyed. 1402 11.1. 
Blocking Recommendations 1404 11.1.1. Frustrating blocking 1406 When making gossip connections to HTTPS servers or Trusted Auditors, 1407 it is desirable to minimize the plaintext metadata in the connection 1408 that can be used to identify the connection as a gossip connection 1409 and therefore be of interest to block. Additionally, introducing 1410 some randomness into client behavior may be important. We assume 1411 that the adversary is able to inspect the behavior of the HTTPS 1412 client and understand how it makes gossip connections. 1414 As an example, if a client, after establishing a TLS connection (and 1415 receiving an SCT, but not making its own HTTP request yet), 1416 immediately opens a second TLS connection for the purpose of gossip, 1417 the adversary can reliably block this second connection to block 1418 gossip without affecting normal browsing. For this reason it is 1419 recommended to run the gossip protocols over an existing connection 1420 to the server, making use of connection multiplexing such as HTTP 1421 Keep-Alive or SPDY. 1423 Truncation is also a concern. If a client always establishes a TLS 1424 connection, makes a request, receives a response, and then always 1425 attempts a gossip communication immediately following the first 1426 response, truncation will allow an attacker to block gossip reliably. 1428 For these reasons, we recommend that, if at all possible, clients 1429 SHOULD send gossip data in an already established TLS session. This 1430 can be done through the use of HTTP Pipelining, SPDY, or HTTP/2. 1432 11.1.2. Responding to possible blocking 1434 In some circumstances a client may have a piece of data that they 1435 have attempted to share (via SCT Feedback or STH Pollination), but 1436 have been unable to do so: with every attempt they receive an error. 1437 These situations are: 1439 1. The client has an SCT and a certificate, and attempts to retrieve 1440 an inclusion proof - but receives an error on every attempt. 
1442 2. The client has an STH, and attempts to resolve it to a newer STH 1443 via a consistency proof - but receives an error on every attempt. 1445 3. The client has attempted to share an SCT and constructed 1446 certificate via SCT Feedback - but receives an error on every 1447 attempt. 1449 4. The client has attempted to share an STH via STH Pollination - 1450 but receives an error on every attempt. 1452 5. The client has attempted to share a specific piece of data with a 1453 Trusted Auditor - but receives an error on every attempt. 1455 In the case of 1 or 2, it is conceivable that the reason for the 1456 errors is that the log acted improperly, either through malicious 1457 actions or compromise. It may be impossible to fetch a proof because 1458 it does not exist (and only errors or timeouts occur). One such 1459 situation may arise because of an actively malicious log, as 1460 presented in Section 10.1. This data is especially important to 1461 share with the broader internet to detect this situation. 1463 If the client has attempted multiple times to resolve an SCT to an STH via an inclusion 1464 proof, and every attempt has failed, this SCT might very 1465 well be a compromising proof of an attack. However, the client MUST 1466 NOT share the data with any other third party (excepting a Trusted 1467 Auditor should one exist). 1469 If the client has attempted multiple times to resolve an STH to a newer STH via a 1470 consistency proof, and every attempt has failed, a client 1471 MAY share the STH with an "Auditor of Last Resort" even if the STH in 1472 question is no longer within the validity window. This auditor may 1473 be pre-configured in the client, but the client SHOULD permit a user 1474 to disable the functionality or change to whom data is sent.
The 1475 Auditor of Last Resort itself represents a point of failure and 1476 privacy concerns, so if implemented, it SHOULD connect using public 1477 key pinning and not consider an item delivered until it receives a 1478 confirmation. 1480 In the cases 3, 4, and 5, we assume that the webserver(s) or trusted 1481 auditor in question is either experiencing an operational failure, or 1482 being attacked. In both cases, a client SHOULD retain the data for 1483 later submission (subject to Private Browsing or other history- 1484 clearing actions taken by the user.) This is elaborated upon more in 1485 Section 11.3. 1487 11.2. Proof Fetching Recommendations 1489 Proof fetching (both inclusion proofs and consistency proofs) SHOULD 1490 be performed at random time intervals. If proof fetching occurred 1491 all at once, in a flurry of activity, a log would know that SCTs or 1492 STHs received around the same time are more likely to come from a 1493 particular client. While proof fetching is required to be done in a 1494 manner that attempts to be anonymous from the perspective of the log, 1495 the correlation of activity to a single client would still reveal 1496 patterns of user behavior we wish to keep confidential. These 1497 patterns could be recognizable as a single user, or could reveal what 1498 sites are commonly visited together in the aggregate. 1500 11.3. Record Distribution Recommendations 1502 In several components of the CT Gossip ecosystem, the recommendation 1503 is made that data from multiple sources be ingested, mixed, stored 1504 for an indeterminate period of time, provided (multiple times) to a 1505 third party, and eventually deleted. 
The instances of these 1506 recommendations in this draft are: 1508 o When a client receives SCTs during SCT Feedback, it should store 1509 the SCTs and Certificate Chain for some amount of time, provide 1510 some of them back to the server at some point, and may eventually 1511 remove them from its store 1513 o When a client receives STHs during STH Pollination, it should 1514 store them for some amount of time, mix them with other STHs, 1515 release some of them to various servers at some point, 1516 resolve some of them to new STHs, and eventually remove them from 1517 its store 1519 o When a server receives SCTs during SCT Feedback, it should store 1520 them for some period of time, provide them to auditors some number 1521 of times, and may eventually remove them 1523 o When a server receives STHs during STH Pollination, it should 1524 store them for some period of time, mix them with other STHs, 1525 provide some of them to connecting clients, may resolve them to 1526 new STHs via Proof Fetching, and eventually remove them from its 1527 store 1529 o When a Trusted Auditor receives SCTs or historical STHs from 1530 clients, it should store them for some period of time, mix them 1531 with SCTs received from other clients, and act upon them at some 1532 point in time 1534 Each of these instances has specific requirements for user privacy, 1535 and each has options that may not be invoked. As one example, an 1536 HTTPS client should not mix SCTs from server A with SCTs from server 1537 B and release server B's SCTs to server A. As another example, an 1538 HTTPS server may choose to resolve STHs to a single more current STH 1539 via proof fetching, but it is under no obligation to do so. 1541 These requirements should be met, but the general problem of 1542 aggregating multiple pieces of data, choosing when and how many to 1543 release, and when to remove them is shared.
This problem has 1544 previously been considered in the case of Mix Networks and Remailers, 1545 including papers such as [trickle]. 1547 There are several concerns to be addressed in this area, outlined 1548 below. 1550 11.3.1. Mixing Algorithm 1552 When SCTs or STHs are recorded by a participant in CT Gossip and 1553 later used, it is important that they are selected from the datastore 1554 in a non-deterministic fashion. 1556 This is most important for servers, as they can be queried for SCTs 1557 and STHs anonymously. If the server used a predictable ordering 1558 algorithm, an attacker could exploit the predictability to learn 1559 information about a client. One such method would be by observing 1560 the (encrypted) traffic to a server. When a client of interest 1561 connects, the attacker makes a note. The attacker then observes more clients 1562 connecting, predicts at what point the client-of-interest's data 1563 will be disclosed, and queries the server at that 1564 point. 1566 Although most important for servers, random ordering is still 1567 strongly recommended for clients and Trusted Auditors. The above 1568 attack can still occur for these entities, although the circumstances 1569 are less straightforward. For clients, an attacker could observe 1570 their behavior, note when they receive an STH from a server, and use 1571 JavaScript to cause a network connection at the correct time to force 1572 a client to disclose the specific STH. Trusted Auditors are stewards 1573 of sensitive client data. If an attacker had the ability to observe 1574 the activities of a Trusted Auditor (perhaps by being a log, or 1575 another auditor), they could perform the same attack - noting the 1576 disclosure of data from a client to the Trusted Auditor, and then 1577 correlating a later disclosure from the Trusted Auditor as coming 1578 from that client. 1580 Random ordering can be ensured by several mechanisms.
A datastore 1581 can be shuffled, using a secure shuffling algorithm such as Fisher- 1582 Yates. Alternatively, a series of random indexes into the datastore 1583 can be selected (if a collision occurs, a new index is selected). A 1584 cryptographically secure random number generator must be used in 1585 either case. If shuffling is performed, the datastore must be marked 1586 'dirty' upon item insertion, and at least one shuffle operation 1587 must occur on a dirty datastore before data is retrieved from it for use. 1589 11.3.2. The Deletion Algorithm 1591 No entity in CT Gossip is required to delete records at any time, 1592 except to respect users' wishes such as private browsing mode or 1593 clearing history. However, it is likely that over time the 1594 accumulated storage will grow in size and need to be pruned. 1596 While deletion of data will occur, proof fetching can ensure that any 1597 misbehavior from a log will still be detected, even after the direct 1598 evidence from the attack is deleted. Proof fetching ensures that if 1599 a log presents a split view for a client, it must maintain that 1600 split view in perpetuity. An inclusion proof from an SCT to an STH 1601 does not erase the evidence - the new STH is evidence itself. A 1602 consistency proof from that STH to a new one likewise - the new STH 1603 is every bit as incriminating as the first. (Client behavior in the 1604 situation where an SCT or STH cannot be resolved is suggested in 1605 Section 11.1.2.) Because of this property, we recommend that a 1606 client performing proof fetching make every effort to 1607 not delete data until it has been successfully resolved to a new STH 1608 via a proof. 1610 When it is time to delete a record, it can be done in a way that 1611 makes it more difficult for a flushing attack to 1612 succeed. 1614 1. When the record cache has reached a certain size that is still 1615 under the limit, aggressively perform proof fetching.
This 1616 should resolve records to a small set of STHs that can be 1617 retained. Once a proof has been fetched, the record is safer to 1618 delete. 1620 2. If proof fetching has failed, or is disabled, begin by deleting 1621 SCTs and Certificate Chains that have been successfully reported. 1622 Deletion from this set of SCTs should be done at random. For a 1623 client, a submission is not counted as being reported unless it 1624 is sent over a connection using a different SCT, so the attacker 1625 is faced with a recursive problem. (For a server, this step does 1626 not apply.) 1628 3. Attempt to save any submissions that have failed proof fetching 1629 repeatedly, as these are the most likely to be indicative of an 1630 attack. 1632 4. Finally, if the above steps have been followed and have not 1633 succeeded in reducing the size sufficiently, records may be 1634 deleted at random. 1636 Note that if proof fetching is disabled (which is expected although 1637 not required for servers) - the algorithm collapses down to 'delete 1638 at random'. 1640 The decision to delete records at random is intentional. Introducing 1641 non-determinism in the decision is absolutely necessary to make it 1642 more difficult for an adversary to know with certainty or high 1643 confidence that the record has been successfully flushed from a 1644 target. 1646 11.4. Concrete Recommendations 1648 We present the following pseudocode as a concrete outline of our 1649 policy recommendations. 1651 Both suggestions presented are applicable to both clients and 1652 servers. Servers may not perform proof fetching, in which case large 1653 portions of the pseudocode are not applicable. But it should work in 1654 either case. 1656 11.4.1. STH Pollination 1658 The STH class contains data pertaining specifically to the STH 1659 itself. 
1661 class STH 1662 { 1663 uint16 proof_attempts 1664 uint16 proof_failure_count 1665 uint32 num_reports_to_thirdparty 1666 datetime timestamp 1667 byte[] data 1668 } 1670 The broader STH store itself would contain all the STHs known by an 1671 entity participating in STH Pollination (either client or server). 1672 This simplistic view of the class does not take into account the 1673 complicated locking that would likely be required for a data 1674 structure being accessed by multiple threads. Something to note 1675 about this pseudocode is that it does not remove STHs once they have 1676 been resolved to a newer STH. Doing so might make older STHs within 1677 the validity window rarer and thus enable tracking. 1679 class STHStore 1680 { 1681 STH[] sth_list 1683 // This function is run after receiving a set of STHs from 1684 // a third party in response to a pollination submission 1685 def insert(STH[] new_sths) { 1686 foreach(new in new_sths) { 1687 if(this.sth_list.contains(new)) 1688 continue 1689 this.sth_list.insert(new) 1690 } 1691 } 1693 // This function is called to delete the given STH 1694 // from the data store 1695 def delete_now(STH s) { 1696 this.sth_list.remove(s) 1697 } 1699 // When it is time to perform STH Pollination, the HTTPS client 1700 // calls this function to get a selection of STHs to send as 1701 // feedback 1702 def get_pollination_selection() { 1703 if(len(this.sth_list) < MAX_STH_TO_GOSSIP) 1704 return this.sth_list 1705 else { 1706 indexes = set() 1707 modulus = len(this.sth_list) 1708 while(len(indexes) < MAX_STH_TO_GOSSIP) { 1709 r = randomInt() % modulus 1710 // Ignore STHs that are past the validity window but not 1711 // yet removed. 
1712 if(r not in indexes 1713 && now() - this.sth_list[r].timestamp < TWO_WEEKS) 1714 indexes.insert(r) 1715 } 1717 return_selection = [] 1718 foreach(i in indexes) { 1719 return_selection.insert(this.sth_list[i]) 1720 } 1721 return return_selection 1722 } 1723 } 1724 } 1725 We also suggest a function that will be called periodically in the 1726 background, iterating through the STH store, performing a cleaning 1727 operation and queuing consistency proofs. This function can live as 1728 a member function of the STHStore class. 1730 //Just a suggestion: 1731 #define MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS 3 1733 def clean_list() { 1734 foreach(sth in this.sth_list) { 1736 if(now() - sth.timestamp > TWO_WEEKS) { 1737 //STH is too old, we must remove it 1738 if(proof_fetching_enabled 1739 && auditor_of_last_resort_enabled 1740 && sth.proof_failure_count 1741 > MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) { 1742 queue_for_auditor_of_last_resort(sth, 1743 auditor_of_last_resort_callback) 1744 } else { 1745 delete_now(sth) 1746 } 1747 } 1749 else if(proof_fetching_enabled 1750 && now() - sth.timestamp > LOG_MMD 1751 && sth.proof_attempts != UINT16_MAX 1752 // Only fetch a proof if we have never received a proof 1753 // before. (This also avoids submitting something 1754 // already in the queue.) 1755 && sth.proof_attempts == sth.proof_failure_count) { 1756 sth.proof_attempts++ 1757 queue_consistency_proof(sth, consistency_proof_callback) 1758 } 1759 } 1760 } 1762 These functions also exist in the STHStore class. 1764 // This function is called after successfully pollinating STHs 1765 // to a third party. It is passed the STHs sent to the third 1766 // party, which is the output of get_pollination_selection(), as well 1767 // as the STHs received in the response.
1768 def successful_thirdparty_submission_callback(STH[] submitted_sth_list, 1769 STH[] new_sths) 1770 { 1771 foreach(sth in submitted_sth_list) { 1772 sth.num_reports_to_thirdparty++ 1773 } 1775 this.insert(new_sths); 1776 } 1778 // Attempt auditor of last resort submissions until it succeeds 1779 def auditor_of_last_resort_callback(original_sth, error) { 1780 if(!error) { 1781 delete_now(original_sth) 1782 } 1783 } 1785 def consistency_proof_callback(consistency_proof, original_sth, error) { 1786 if(!error) { 1787 insert(consistency_proof.current_sth) 1788 } else { 1789 original_sth.proof_failure_count++ 1790 } 1791 } 1793 11.4.2. SCT Feedback 1795 The SCT class contains data pertaining specifically to an SCT itself. 1797 class SCT 1798 { 1799 uint16 proof_failure_count 1800 bool has_been_resolved_to_sth 1801 bool proof_outstanding 1802 byte[] data 1803 } 1805 The SCT bundle will contain the trusted certificate chain the HTTPS 1806 client built (chaining to a trusted root certificate.) It also 1807 contains the list of associated SCTs, the exact domain it is 1808 applicable to, and metadata pertaining to how often it has been 1809 reported to the third party. 
1811 class SCTBundle 1812 { 1813 X509[] certificate_chain 1814 SCT[] sct_list 1815 string domain 1816 uint32 num_reports_to_thirdparty 1818 def equals(sct_bundle) { 1819 if(sct_bundle.domain != this.domain) 1820 return false 1821 if(sct_bundle.certificate_chain != this.certificate_chain) 1822 return false 1823 if(sct_bundle.sct_list != this.sct_list) 1824 return false 1826 return true 1827 } 1828 def approx_equals(sct_bundle) { 1829 if(sct_bundle.domain != this.domain) 1830 return false 1831 if(sct_bundle.certificate_chain != this.certificate_chain) 1832 return false 1834 return true 1835 } 1837 def insert_scts(sct[] sct_list) { 1838 this.sct_list.union(sct_list) 1839 this.num_reports_to_thirdparty = 0 1840 } 1842 def has_been_fully_resolved_to_sths() { 1843 foreach(s in this.sct_list) { 1844 if(!s.has_been_resolved_to_sth && !s.proof_outstanding) 1845 return false 1846 } 1847 return true 1848 } 1850 def max_proof_failures() { 1851 uint max = 0 1852 foreach(sct in this.sct_list) { 1853 if(sct.proof_failure_count > max) 1854 max = sct.proof_failure_count 1855 } 1856 return max 1857 } 1858 } 1859 For each domain, we store a SCTDomainEntry that holds the SCTBundles 1860 seen for that domain, as well as encapsulating some logic relating to 1861 SCT Feedback for that particular domain. In particular, this data 1862 structure also contains the logic that handles domains not supporting 1863 SCT Feedback. Its behavior is: 1865 1. When a user visits a domain, SCT Feedback is attempted for it. 1866 If it fails, it will retry after a month (configurable). If it 1867 succeeds, excellent. SCT Feedback data is still collected and 1868 stored even if SCT Feedback failed. 1870 2. After 3 month-long waits between failures, the domain will be 1871 marked as failing long-term. No SCT Feedback data will be stored 1872 beyond meta-data, but SCT Feedback will still be attempted after 1873 month-long waits 1875 3. 
If at any point in time, SCT Feedback succeeds, all failure 1876 counters are reset 1878 4. If a domain succeeds, but then begins failing, it must fail more 1879 than 90% of the time (configurable) and then the process begins 1880 at (2). 1882 If a domain is visited infrequently (say, once every 7 months) then 1883 it will be evicted from the cache and start all over again (according 1884 to the suggestion values in the below pseudocode). 1886 [ Note: To be certain the logic is correct I give the following test 1887 cases which illustrate the intended behavior. Hopefully the code 1888 matches! 1890 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1891 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1892 ... wait a month ... 1893 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1894 ... wait a month ... 1895 Succeed 1 month later num_submissions_attempted=13 num_submissions_succeeded=2 num_feedback_loop_failures=0(r) indicates (Reset) 1896 -> Feedback is attempted regularly. 1898 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1899 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1900 ... wait a month ... 1901 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1902 ... wait a month ... 1903 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1904 ... wait a month ... 1905 Succeed 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=0(r) 1906 -> Feedback is attempted regularly. 
1908 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1909 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1910 ... wait a month ... 1911 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1912 ... wait a month ... 1913 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1914 ... wait a month ... 1915 Fail 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=3 1916 ... clear_old_data() is run every hour ... 1917 num_submissions_attempted=0 num_submissions_succeeded=0 num_feedback_loop_failures=3 1918 sct_feedback_failing_longterm=True 1919 Fail 1 month later num_submissions_attempted=1 num_submissions_succeeded=0 num_feedback_loop_failures=4 1920 sct_feedback_failing_longterm=True 1921 ... clear_old_data() is run every hour ... 1922 num_submissions_attempted=0(r) num_submissions_succeeded=0 num_feedback_loop_failures=3 1923 sct_feedback_failing_longterm=True 1924 Succeed 1 month later num_submissions_attempted=2 num_submissions_succeeded=1 num_feedback_loop_failures=0(r) 1925 sct_feedback_failing_longterm=False 1926 -> Feedback is attempted regularly. 1928 Note above that the second run of clear_old_data() will reset num_submissions_attempted from 1 to 0. This is 1929 CRITICAL. Otherwise, we would have the below bug (where after 10 months of failures, a success would not hit 1930 the required ratio to keep going) 1932 //The below represents a bug. 1933 Succeed 1 Time num_submissions_attempted=1 num_submissions_succeeded=1 num_feedback_loop_failures=0 1934 Fail 10 Times num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0 1935 ... wait a month ... 1936 Fail 1 month later num_submissions_attempted=12 num_submissions_succeeded=1 num_feedback_loop_failures=1 1937 ... wait a month ... 
1938 Fail 1 month later num_submissions_attempted=13 num_submissions_succeeded=1 num_feedback_loop_failures=2 1939 ... wait a month ... 1940 Fail 1 month later num_submissions_attempted=14 num_submissions_succeeded=2 num_feedback_loop_failures=3 1941 ... clear_old_data() is run every hour ... 1942 num_submissions_attempted=0 num_submissions_succeeded=0 num_feedback_loop_failures=3 1943 sct_feedback_failing_longterm=True 1944 Fail 1 month later num_submissions_attempted=1 num_submissions_succeeded=0 num_feedback_loop_failures=4 1945 sct_feedback_failing_longterm=True 1946 Fail 9 times for 9 months 1947 num_submissions_attempted=10 num_submissions_succeeded=0 num_feedback_loop_failures=13 1948 sct_feedback_failing_longterm=True 1949 Succeed 1 month later num_submissions_attempted=11 num_submissions_succeeded=1 num_feedback_loop_failures=0(r) 1950 sct_feedback_failing_longterm=False 1951 -> Feedback is NOT attempted regularly. ] 1953 //Suggestions: 1954 // After concluding a domain doesn't support feedback, we try again 1955 // after WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time to see if 1956 // they added support 1957 #define WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS 1 month 1959 // If we've waited MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE 1960 // multiplied by WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, we 1961 // still attempt SCT Feedback, but no longer bother storing any data 1962 // until the domain supports SCT Feedback 1963 #define MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE 3 1965 // If this percentage of SCT Feedback attempts previously succeeded, 1966 // we consider the domain as supporting feedback and just having 1967 // transient errors 1968 #define MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING .10 1970 class SCTDomainEntry 1971 { 1972 // This is the primary key of the object, the exact domain name it 1973 // is valid for 1974 string domain 1976 // This is the last time the domain was contacted.
For client 1977 // operations it is updated whenever the client makes any request 1978 // (not just feedback) to the domain. For server operations, it is 1979 // updated whenever any client contacts the domain. Responsibility 1980 // for updating lies OUTSIDE of the class 1981 public datetime last_contact_for_domain 1983 // This is the last time SCT Feedback was attempted for the domain. 1984 // It is updated whenever feedback is attempted - responsibility for 1985 // updating lies OUTSIDE of the class 1986 // This is not used when this algorithm runs on servers 1987 public datetime last_sct_feedback_attempt 1989 // This is the number of times we have waited a 1990 // WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, and still failed 1991 // e.g. 10 months of failures 1992 // This is not used when this algorithm runs on servers 1993 private uint16 num_feedback_loop_failures 1995 // This is whether or not SCT Feedback has failed enough times that we 1996 // should not bother storing data for it anymore. It is a small function 1997 // used for illustrative purposes 1998 // This is not used when this algorithm runs on servers 1999 private bool sct_feedback_failing_longterm() 2000 { return num_feedback_loop_failures >= MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE } 2002 // This is the number of SCT Feedback submissions attempted. 2004 // Responsibility for incrementing lies OUTSIDE of the class 2005 // (And watch for integer overflows) 2006 // This is not used when this algorithm runs on servers 2007 public uint16 num_submissions_attempted 2009 // This is the number of successful SCT Feedback submissions. This 2010 // variable is updated by the class.
     // This is not used when this algorithm runs on servers
     private uint16 num_submissions_succeeded

     // This contains all the bundles of SCT data we have observed for
     // this domain
     SCTBundle[] observed_records

     // This function can be called to determine if we should attempt
     // SCT Feedback for this domain.
     def should_attempt_feedback() {
       // Servers always perform feedback!
       if(operator_is_server)
         return true

       // If we have not tried in a month, try again
       if(now() - last_sct_feedback_attempt >
          WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS)
         return true

       // If we have tried recently, and it seems to be working, go for it!
       // (The counters are reset to zero by clear_old_data(), so guard
       // against dividing by zero.)
       if(num_submissions_attempted > 0 &&
          (num_submissions_succeeded / num_submissions_attempted) >
          MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING)
         return true

       // Otherwise don't try
       return false
     }

     // For Clients, this function is called after a successful
     // connection to an HTTPS server, with a single SCTBundle
     // constructed from that connection's certificate chain and SCTs.
     // For Servers, this is called after receiving SCT Feedback with
     // all the bundles sent in the feedback.
     def insert(SCTBundle[] bundles) {
       // Do not store data for long-failing domains
       if(sct_feedback_failing_longterm()) {
         return
       }

       foreach(b in bundles) {
         if(operator_is_server) {
           if(!passes_validity_checks(b))
             return
         }

         bool have_inserted = false
         foreach(e in this.observed_records) {
           if(e.equals(b))
             return
           else if(e.approx_equals(b)) {
             have_inserted = true
             e.insert_scts(b.sct_list)
           }
         }
         if(!have_inserted)
           this.observed_records.insert(b)
       }
       SCTStoreManager.update_cache_percentage()
     }

     // When it is time to perform SCT Feedback, the HTTPS client
     // calls this function to get a selection of SCTBundles to send
     // as feedback
     def get_gossip_selection() {
       if(len(observed_records) > MAX_SCT_RECORDS_TO_GOSSIP) {
         indexes = set()
         modulus = len(observed_records)
         while(len(indexes) < MAX_SCT_RECORDS_TO_GOSSIP) {
           r = randomInt() % modulus
           if(r not in indexes)
             indexes.insert(r)
         }

         return_selection = []
         foreach(i in indexes) {
           return_selection.insert(this.observed_records[i])
         }

         return return_selection
       }
       else
         return this.observed_records
     }

     def passes_validity_checks(SCTBundle b) {
       // This function performs the validity checks specified in
       // {{feedback-srvop}}
     }
   }

   The SCTDomainEntry is responsible for handling the outcome of a
   submission report for that domain using its member function:

   // This function is called after providing SCT Feedback
   // to a server.  It is passed the feedback sent to the other party,
   // which is the output of get_gossip_selection(), and also the
   // SCTBundle representing the connection the data was sent on.
   // (When this code runs on the server, connectionBundle is NULL)
   // If the Feedback was not sent successfully, error is True
   def after_submit_to_thirdparty(error, SCTBundle[] submittedBundles,
                                  SCTBundle connectionBundle)
   {
     // Server operation in this instance is exceedingly simple
     if(operator_is_server) {
       if(error)
         return
       foreach(bundle in submittedBundles)
         bundle.num_reports_to_thirdparty++
       return
     }

     // Client behavior is much more complicated
     if(error) {
       if(sct_feedback_failing_longterm()) {
         num_feedback_loop_failures++
       }
       else if((num_submissions_succeeded / num_submissions_attempted)
               > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING) {
         // Do nothing.  num_submissions_succeeded will not be
         // incremented.  After enough of these failures, the ratio
         // will fall below the acceptable threshold
       } else {
         // The domain has begun its three-month grace period.  We will
         // attempt submissions once a month
         num_feedback_loop_failures++
       }
       return
     }
     // We succeeded, so reset all of our failure states
     // Note, there is a race condition here if clear_old_data() is
     // called while this callback is outstanding.
     num_feedback_loop_failures = 0
     if(num_submissions_succeeded != UINT16_MAX)
       num_submissions_succeeded++

     foreach(bundle in submittedBundles)
     {
       // Compare Certificate Chains; if they do not match, it counts
       // as a submission.
       if(!connectionBundle.approx_equals(bundle))
         bundle.num_reports_to_thirdparty++
       else {
         // This check ensures that a SCT Bundle is not considered reported
         // if it is submitted over a connection with the same SCTs.
         // This satisfies the constraint in Paragraph 5 of
         // {{feedback-clisrv}}
         // Consider three submission scenarios:
         // Submitted SCTs  Connection SCTs  Considered Submitted
         // A, B            A, B             No - no new information
         // A               A, B             Yes - B is a new SCT
         // A, B            A                No - no new information
         if(connectionBundle.sct_list is NOT a subset of bundle.sct_list)
           bundle.num_reports_to_thirdparty++
       }
     }
   }

   Instances of the SCTDomainEntry class are stored as part of a larger
   class that manages the entire SCT Cache, storing them in a hashmap
   keyed by domain.  This class also tracks the current size of the
   cache, and will trigger cache eviction.

   // Suggestions:
   #define CACHE_PRESSURE_SAFE .50
   #define CACHE_PRESSURE_IMMINENT .70
   #define CACHE_PRESSURE_ALMOST_FULL .85
   #define CACHE_PRESSURE_FULL .95
   #define WAIT_BETWEEN_IMMINENT_CACHE_EVICTION 5 minutes

   class SCTStoreManager
   {
     // Keyed by domain; the functions of this class refer to it as
     // all_sct_stores
     hashmap all_sct_stores
     uint32 current_cache_size
     datetime imminent_cache_pressure_check_performed

     float current_cache_percentage() {
       return current_cache_size / MAX_CACHE_SIZE;
     }

     static def update_cache_percentage() {
       // This function calculates the current size of the cache
       // and updates current_cache_size
       /* ... perform calculations ...
       */
       current_cache_size = /* new calculated value */

       // Perform locking to prevent multiple instances of these
       // functions from being called concurrently or unnecessarily
       if(current_cache_percentage() > CACHE_PRESSURE_FULL) {
         cache_is_full()
       }

       else if(current_cache_percentage() > CACHE_PRESSURE_ALMOST_FULL) {
         cache_pressure_almost_full()
       }

       else if(current_cache_percentage() > CACHE_PRESSURE_IMMINENT) {
         // Do not repeatedly perform the imminent cache pressure operation
         if(now() - imminent_cache_pressure_check_performed >
            WAIT_BETWEEN_IMMINENT_CACHE_EVICTION) {
           // Record the time of this check so that the wait above
           // takes effect on later calls
           imminent_cache_pressure_check_performed = now()
           cache_pressure_is_imminent()
         }
       }
     }
   }

   The SCTStoreManager contains a function that will be called
   periodically in the background, iterating through all SCTDomainEntry
   objects and performing maintenance tasks.  It removes data for
   domains we have not contacted in a long time.  This function is not
   intended to clear data if the cache is getting full; separate
   functions are used for that.
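
   As a rough illustration of the age-based part of this maintenance
   pass, the following Python sketch removes entries for domains that
   have not been contacted recently.  It is not part of the draft's
   pseudocode: the names DomainEntry, clear_old_entries,
   SUBMITTED_MAX_AGE and UNSUBMITTED_MAX_AGE are hypothetical, the
   "3 months" and "6 months" suggestions are approximated as 90 and 180
   days, and proof fetching and counter resetting are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumed approximations of the suggested "3 months" and "6 months".
SUBMITTED_MAX_AGE = timedelta(days=90)
UNSUBMITTED_MAX_AGE = timedelta(days=180)


@dataclass
class DomainEntry:
    """Hypothetical, simplified stand-in for SCTDomainEntry."""
    last_contact_for_domain: datetime
    num_submissions_succeeded: int = 0
    observed_records: list = field(default_factory=list)


def clear_old_entries(entries, now):
    """Drop entries for domains not contacted recently.

    Entries with at least one successful submission expire after the
    shorter period; all others expire after the longer one.  Returns
    the list of removed domain names.
    """
    removed = []
    # Iterate over a snapshot so deleting from the dict is safe.
    for domain, entry in list(entries.items()):
        age = now - entry.last_contact_for_domain
        if entry.num_submissions_succeeded > 0:
            limit = SUBMITTED_MAX_AGE
        else:
            limit = UNSUBMITTED_MAX_AGE
        if age > limit:
            del entries[domain]
            removed.append(domain)
    return removed


# Usage: three domains with different ages and submission histories.
now = datetime(2016, 7, 8)
store = {
    "a.example": DomainEntry(now - timedelta(days=100),
                             num_submissions_succeeded=1),
    "b.example": DomainEntry(now - timedelta(days=100)),
    "c.example": DomainEntry(now - timedelta(days=200)),
}
removed = clear_old_entries(store, now)
```

   In this example a.example is removed because its data was submitted
   successfully and it is older than the shorter limit, c.example is
   removed because it exceeds the longer limit, and b.example is kept.
   Picking one limit per entry gives the same result as the two
   separate checks in clear_old_data(), since any entry older than the
   longer limit is also older than the shorter one.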

   // Suggestions:
   #define TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED 3 months
   #define TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED 6 months

   def clear_old_data()
   {
     foreach(domainEntry in all_sct_stores)
     {
       // Queue proof fetches
       if(proof_fetching_enabled) {
         foreach(sctBundle in domainEntry.observed_records) {
           if(!sctBundle.has_been_fully_resolved_to_sths()) {
             foreach(sct in sctBundle.sct_list) {
               if(!sct.has_been_resolved_to_sth
                  && !sct.proof_outstanding) {
                 sct.proof_outstanding = True
                 queue_inclusion_proof(sct, inclusion_proof_callback)
               }
             }
           }
         }
       }

       // Do not store data for domains that do not support SCT Feedback
       if(!operator_is_server
          && domainEntry.sct_feedback_failing_longterm())
       {
         // Note that resetting these variables every single time is
         // necessary to avoid a bug
         all_sct_stores[domainEntry].num_submissions_attempted = 0
         all_sct_stores[domainEntry].num_submissions_succeeded = 0
         delete all_sct_stores[domainEntry].observed_records
         all_sct_stores[domainEntry].observed_records = NULL
       }

       // This check removes successfully submitted data for
       // old domains we have not dealt with in a long time
       if(domainEntry.num_submissions_succeeded > 0
          && now() - domainEntry.last_contact_for_domain
             > TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED)
       {
         all_sct_stores.remove(domainEntry)
       }

       // This check removes unsuccessfully submitted data for
       // old domains we have not dealt with in a very long time
       if(now() - domainEntry.last_contact_for_domain
          > TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED)
       {
         all_sct_stores.remove(domainEntry)
       }
     }

     SCTStoreManager.update_cache_percentage()
   }

   Inclusion Proof Fetching is handled fairly independently:

   // This function is a callback invoked after an inclusion proof
   // has been retrieved.
   // It can exist on the SCT class or independently,
   // so long as it can modify the SCT class' members
   def inclusion_proof_callback(inclusion_proof, original_sct, error)
   {
     // Unlike the STH code, this counter must be incremented on the
     // callback as there is a race condition on using this counter in
     // the cache_* functions.
     original_sct.proof_attempts++
     original_sct.proof_outstanding = False
     if(!error) {
       original_sct.has_been_resolved_to_sth = True
       insert_to_sth_datastore(inclusion_proof.new_sth)
     } else {
       original_sct.proof_failure_count++
     }
   }

   If the cache is getting full, these three member functions of the
   SCTStoreManager class will be used.

   // -----------------------------------------------------------------
   // This function is called when the cache is not yet full, but is
   // nearing it.  It prioritizes deleting data that should be safe
   // to delete (because it has been shared with the site or resolved
   // to a STH)
   def cache_pressure_is_imminent()
   {
     bundlesToDelete = []
     foreach(domainEntry in all_sct_stores) {
       foreach(sctBundle in domainEntry.observed_records) {

         if(proof_fetching_enabled) {
           // First, queue proofs for anything not already queued.
           if(!sctBundle.has_been_fully_resolved_to_sths()) {
             foreach(sct in sctBundle.sct_list) {
               if(!sct.has_been_resolved_to_sth
                  && !sct.proof_outstanding) {
                 sct.proof_outstanding = True
                 queue_inclusion_proof(sct, inclusion_proof_callback)
               }
             }
           }

           // Second, consider deleting entries that have been fully
           // resolved.
           else {
             bundlesToDelete.append( Struct(domainEntry, sctBundle) )
           }
         }

         // Third, consider deleting entries that have been successfully
         // reported
         if(sctBundle.num_reports_to_thirdparty > 0) {
           bundlesToDelete.append( Struct(domainEntry, sctBundle) )
         }
       }
     }

     // Fourth, delete the eligible entries at random until the cache is
     // at a safe level
     uint recalculateIndex = 0
     #define RECALCULATE_EVERY_N_OPERATIONS 50

     while(bundlesToDelete.length > 0 &&
           current_cache_percentage() > CACHE_PRESSURE_SAFE) {
       uint rndIndex = rand() % bundlesToDelete.length
       bundlesToDelete[rndIndex].domainEntry.observed_records.remove(
         bundlesToDelete[rndIndex].sctBundle)
       bundlesToDelete.removeAt(rndIndex)

       recalculateIndex++
       if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
         update_cache_percentage()
       }
     }

     // Finally, tell the proof fetching engine to go faster
     if(proof_fetching_enabled) {
       // This function would speed up proof fetching until an
       // arbitrary time has passed.  Perhaps until it has fetched
       // proofs for the number of items currently in its queue?  Or
       // a percentage of them?
       proof_fetch_faster_please()
     }
     update_cache_percentage();
   }

   // -----------------------------------------------------------------
   // This function is called when the cache is almost full.
   // It will evict entries at random, while attempting to save entries
   // that appear to have proof fetching failures
   def cache_pressure_almost_full()
   {
     uint recalculateIndex = 0
     uint savedRecords = 0
     #define RECALCULATE_EVERY_N_OPERATIONS 50

     while(all_sct_stores.length > savedRecords &&
           current_cache_percentage() > CACHE_PRESSURE_SAFE) {
       uint rndIndex1 = rand() % all_sct_stores.length
       uint rndIndex2 = rand() %
         all_sct_stores[rndIndex1].observed_records.length

       if(proof_fetching_enabled) {
         if(all_sct_stores[rndIndex1].observed_records[rndIndex2]
              .max_proof_failures() >
            MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) {
           savedRecords++
           continue
         }
       }

       // If proof fetching is not enabled we need some other logic
       else {
         if(all_sct_stores[rndIndex1].observed_records[rndIndex2]
              .num_reports_to_thirdparty == 0) {
           savedRecords++
           continue
         }
       }

       all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
       if(all_sct_stores[rndIndex1].observed_records.length == 0) {
         all_sct_stores.removeAt(rndIndex1)
       }

       recalculateIndex++
       if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
         update_cache_percentage()
       }
     }

     update_cache_percentage();
   }

   // -----------------------------------------------------------------
   // This function is called when the cache is full, and will evict
   // cache entries at random
   def cache_is_full()
   {
     uint recalculateIndex = 0
     #define RECALCULATE_EVERY_N_OPERATIONS 50

     while(all_sct_stores.length > 0 &&
           current_cache_percentage() > CACHE_PRESSURE_SAFE) {
       uint rndIndex1 = rand() % all_sct_stores.length
       uint rndIndex2 = rand() %
         all_sct_stores[rndIndex1].observed_records.length

       all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
       if(all_sct_stores[rndIndex1].observed_records.length == 0) {
         all_sct_stores.removeAt(rndIndex1)
       }

       recalculateIndex++
       if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
         update_cache_percentage()
       }
     }

     update_cache_percentage();
   }

12.  IANA considerations

   [ TBD ]

13.  Contributors

   The authors would like to thank the following contributors for
   valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Josef
   Gustafsson, Karen Seo, Magnus Ahltorp, Stephen Kent, Yan Zhu.

14.  ChangeLog

14.1.  Changes between ietf-02 and ietf-03

   o  TBD's resolved.

   o  References added.

   o  Pseudocode changed to work for both clients and servers.

14.2.  Changes between ietf-01 and ietf-02

   o  Requiring full certificate chain in SCT Feedback.

   o  Clarifications on what clients store for and send in SCT Feedback
      added.

   o  SCT Feedback server operation updated to protect against DoS
      attacks on servers.

   o  Pre-Loaded vs Locally Added Anchors explained.

   o  Base for well-known URLs changed.

   o  Remove all mentions of monitors - gossip deals with auditors.

   o  New sections added: Trusted Auditor protocol, attacks by actively
      malicious log, the Dual-CA compromise attack, policy
      recommendations.

14.3.  Changes between ietf-00 and ietf-01

   o  Improve language and readability based on feedback from Stephen
      Kent.

   o  STH Pollination Proof Fetching defined and indicated as optional.

   o  3-Method Ecosystem section added.

   o  Cases with Logs ceasing operation handled.

   o  Text on tracking via STH Interaction added.

   o  Section with some early recommendations for mixing added.

   o  Section detailing blocking connections, frustrating it, and the
      implications added.

14.4.  Changes between -01 and -02

   o  STH Pollination defined.

   o  Trusted Auditor Relationship defined.

   o  Overview section rewritten.

   o  Data flow picture added.

   o  Section on privacy considerations expanded.

14.5.  Changes between -00 and -01

   o  Add the SCT feedback mechanism: Clients send SCTs to originating
      web server which shares them with auditors.

   o  Stop assuming that clients see STHs.

   o  Don't use HTTP headers but instead .well-known URLs - avoid that
      battle.

   o  Stop referring to trans-gossip and trans-gossip-transport-https -
      too complicated.

   o  Remove all protocols but HTTPS in order to simplify - let's come
      back and add more later.

   o  Add more reasoning about privacy.

   o  Do specify data formats.

15.  References

15.1.  Normative References

   [RFC-6962-BIS-09]
              Laurie, B., Langley, A., Kasper, E., Messeri, E., and R.
              Stradling, "Certificate Transparency", October 2015.

   [RFC7159]  Bray, T., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 7159, March 2014.

15.2.  Informative References

   [double-keying]
              Perry, M., Clark, E., and S. Murdoch, "Cross-Origin
              Identifier Unlinkability", May 2015.

   [draft-ct-over-dns]
              Laurie, B., Phaneuf, P., and A. Eijdenberg, "Certificate
              Transparency over DNS", February 2016.

   [draft-ietf-trans-threat-analysis-03]
              Kent, S., "Attack Model and Threat for Certificate
              Transparency", October 2015.

   [dual-ca-compromise-attack]
              Gillmor, D., "can CT defend against dual CA compromise?",
              n.d.

   [gossip-mixing]
              Ritter, T., "A Bit on Certificate Transparency Gossip",
              June 2016.

   [trickle]  Serjantov, A., Dingledine, R., and P. Syverson, "From
              a Trickle to a Flood: Active Attacks on Several Mix
              Types", October 2002.

Authors' Addresses

   Linus Nordberg
   NORDUnet

   Email: linus@nordu.net


   Daniel Kahn Gillmor
   ACLU

   Email: dkg@fifthhorseman.net


   Tom Ritter

   Email: tom@ritter.vg