TRANS                                                        L. Nordberg
Internet-Draft                                                  NORDUnet
Intended status: Experimental                                 D. Gillmor
Expires: September 22, 2016                                         ACLU
                                                               T. Ritter

                                                          March 21, 2016


                            Gossiping in CT
                       draft-ietf-trans-gossip-02

Abstract

   The logs in Certificate Transparency are untrusted in the sense that
   the users of the system don't have to trust that they behave
   correctly since the behaviour of a log can be verified to be correct.

   This document tries to solve the problem with logs presenting a
   "split view" of their operations. It describes three gossiping
   mechanisms for Certificate Transparency: SCT Feedback, STH
   Pollination and Trusted Auditor Relationship.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 22, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Defining the problem
   3. Overview
   4. Terminology
     4.1. Pre-Loaded vs Locally Added Anchors
   5. Who gossips with whom
   6. What to gossip about and how
   7. Data flow
   8. Gossip Mechanisms
     8.1. SCT Feedback
       8.1.1. SCT Feedback data format
       8.1.2. HTTPS client to server
       8.1.3. HTTPS server operation
       8.1.4. HTTPS server to auditors
     8.2. STH pollination
       8.2.1. HTTPS Clients and Proof Fetching
       8.2.2. STH Pollination without Proof Fetching
       8.2.3. Auditor Action
       8.2.4. STH Pollination data format
     8.3. Trusted Auditor Stream
       8.3.1. Trusted Auditor data format
   9. 3-Method Ecosystem
     9.1. SCT Feedback
     9.2. STH Pollination
     9.3. Trusted Auditor Relationship
     9.4. Interaction
   10. Security considerations
     10.1. Attacks by actively malicious logs
     10.2. Dual-CA Compromise
     10.3. Censorship/Blocking considerations
     10.4. Privacy considerations
       10.4.1. Privacy and SCTs
       10.4.2. Privacy in SCT Feedback
       10.4.3. Privacy for HTTPS clients performing STH Proof Fetching
       10.4.4. Privacy in STH Pollination
       10.4.5. Privacy in STH Interaction
       10.4.6. Trusted Auditors for HTTPS Clients
       10.4.7. HTTPS Clients as Auditors
   11. Policy Recommendations
     11.1. Blocking Recommendations
       11.1.1. Frustrating blocking
       11.1.2. Responding to possible blocking
     11.2. Proof Fetching Recommendations
     11.3. Record Distribution Recommendations
       11.3.1. Mixing Algorithm
       11.3.2. Flushing Attacks
       11.3.3. The Deletion Algorithm
   12. IANA considerations
   13. Contributors
   14. ChangeLog
     14.1. Changes between ietf-01 and ietf-02
     14.2. Changes between ietf-00 and ietf-01
     14.3. Changes between -01 and -02
     14.4. Changes between -00 and -01
   15. References
     15.1. Normative References
     15.2. Informative References
   Authors' Addresses

1. Introduction

   The purpose of the protocols in this document, collectively referred
   to as CT Gossip, is to detect certain misbehavior by CT logs. In
   particular, CT Gossip aims to detect logs that are providing
   inconsistent views to different log clients, and logs failing to
   include submitted certificates within the time period stipulated by
   MMD.

   [ TODO: enumerate the interfaces used for detecting misbehaviour? ]

   One of the major challenges of any gossip protocol is limiting damage
   to user privacy. The goal of CT gossip is to publish and distribute
   information about the logs and their operations, but not to expose
   any additional information about the operation of any of the other
   participants. Privacy of consumers of log information (in
   particular, of web browsers and other TLS clients) should not be
   undermined by gossip.

   This document presents three different, complementary mechanisms for
   non-log elements of the CT ecosystem to exchange information about
   logs in a manner that preserves the privacy of HTTPS clients. They
   should provide protective benefits for the system as a whole even if
   their adoption is not universal.

2. Defining the problem

   When a log provides different views of the log to different clients
   this is described as a partitioning attack. Each client would be
   able to verify the append-only nature of the log but, in the extreme
   case, each client might see a unique view of the log.

   The CT logs are public, append-only and untrusted and thus have to be
   audited for consistency, i.e., they should never rewrite history.
   Additionally, auditors and other log clients need to exchange
   information about logs in order to be able to detect a partitioning
   attack (as described above).

   Gossiping about log behaviour helps address the problem of detecting
   malicious or compromised logs with respect to a partitioning attack.
   We want some side of the partitioned tree, and ideally both sides, to
   see the other side.

   Disseminating information about a log poses a potential threat to the
   privacy of end users. Some data of interest (e.g. SCTs) is linkable
   to specific log entries and thereby to specific websites, which makes
   sharing them with others a privacy concern. Gossiping about this
   data has to take privacy considerations into account in order not to
   expose associations between users of the log (e.g., web browsers) and
   certificate holders (e.g., web sites). Even sharing STHs (which do
   not link to specific log entries) can be problematic - user tracking
   by fingerprinting through rare STHs is one potential attack (see
   Section 8.2).

3. Overview

   SCT Feedback enables HTTPS clients to share Signed Certificate
   Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS-09]) with CT auditors
   in a privacy-preserving manner by sending SCTs to originating HTTPS
   servers, who in turn share them with CT auditors.

   In STH Pollination, HTTPS clients use HTTPS servers as pools to share
   Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS-09]) with
   other connecting clients in the hope that STHs will find their way to
   CT auditors.

   HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs
   with trusted CT auditors directly, with expectations of privacy
   sensitive data being handled according to whatever privacy policy is
   agreed on between client and trusted party.

   Despite the privacy risks with sharing SCTs there is no loss in
   privacy if a client sends SCTs for a given site to the site
   corresponding to the SCT. This is because the site's logs would
   already indicate that the client is accessing that site. In this way
   a site can accumulate records of SCTs that have been issued by
   various logs for that site, providing a consolidated repository of
   SCTs that could be shared with auditors. Auditors can use this
   information to detect logs that misbehave by not including
   certificates within the time period stipulated by the MMD metadata.

   Sharing an STH is considered reasonably safe from a privacy
   perspective as long as the same STH is shared by a large number of
   other log clients. This safety in numbers can be achieved by only
   allowing gossiping of STHs issued in a certain window of time, while
   also refusing to gossip about STHs from logs with too high an STH
   issuance frequency (see Section 8.2).

4. Terminology

   This document relies on terminology and data structures defined in
   [RFC-6962-BIS-09], including STH, SCT, Version, LogID, SCT timestamp,
   CtExtensions, SCT signature, Merkle Tree Hash.

   This document relies on terminology defined in
   [draft-ietf-trans-threat-analysis-03], including Auditing.

4.1. Pre-Loaded vs Locally Added Anchors

   Throughout the document, we refer to both Trust Anchors (Certificate
   Authorities) and Logs. Both Logs and Trust Anchors may be locally
   added by an administrator. Unless otherwise clarified, in both cases
   we refer to the set of Trust Anchors and Logs that come pre-loaded
   and pre-trusted in a piece of client software.

5. Who gossips with whom

   o HTTPS clients and servers (SCT Feedback and STH Pollination)

   o HTTPS servers and CT auditors (SCT Feedback and STH Pollination)

   o CT auditors (Trusted Auditor Relationship)

   Additionally, some HTTPS clients may engage with an auditor who they
   trust with their privacy:

   o HTTPS clients and CT auditors (Trusted Auditor Relationship)

6. What to gossip about and how

   There are three separate gossip streams:

   o SCT Feedback - transporting SCTs and certificate chains from HTTPS
     clients to CT auditors via HTTPS servers.

   o STH Pollination - HTTPS clients and CT auditors using HTTPS
     servers as STH pools for exchanging STHs.

   o Trusted Auditor Stream - HTTPS clients communicating directly with
     trusted CT auditors sharing SCTs, certificate chains and STHs.

   It is worthwhile to note that when an HTTPS Client or CT auditor
   interacts with a log, they may equivalently interact with a log
   mirror or cache that replicates the log.

7. Data flow

   The following picture shows how certificates, SCTs and STHs flow
   through a CT system with SCT Feedback and STH Pollination. It does
   not show what goes in the Trusted Auditor Relationship stream.

      +- Cert ---- +----------+
      |            |    CA    | ----------+
      |   + SCT -> +----------+           |
      v   |                         Cert [& SCT]
   +----------+                           |
   |   Log    | ---------- SCT -----------+
   +----------+                           v
     |  ^                           +----------+
     |  |          SCT & Certs  --- | Website  |
     |  |[1]            |           +----------+
     |  |[2]           STH            ^     |
     |  |[3]            v             |     |
     |  |          +----------+       |     |
     |  +--------> | Auditor  |       |  HTTPS traffic
     |             +----------+       |     |
    STH                               |    SCT
     |                           SCT & Certs |
   Log entries                        |     |
     |                               STH   STH
     v                                |     |
   +----------+                       |     v
   | Monitor  |                     +----------+
   +----------+                     | Browser  |
                                    +----------+

   #   Auditor                        Log
   [1] |--- get-sth ------------------->|
       |<-- STH ------------------------|
   [2] |--- leaf hash + tree size ----->|
       |<-- index + inclusion proof ----|
   [3] |--- tree size 1 + tree size 2 ->|
       |<-- consistency proof ----------|

8. Gossip Mechanisms

8.1. SCT Feedback

   The goal of SCT Feedback is for clients to share SCTs and certificate
   chains with CT auditors while still preserving the privacy of the end
   user. The sharing of SCTs contributes to the overall goal of
   detecting misbehaving logs by providing auditors with SCTs from many
   vantage points, making it more likely to catch a violation of a log's
   MMD or a log presenting inconsistent views. The sharing of
   certificate chains is beneficial to HTTPS server operators interested
   in direct feedback from clients for detecting bogus certificates
   issued in their name and therefore incentivises server operators to
   take part in SCT Feedback.

   SCT Feedback is the most privacy-preserving gossip mechanism, as it
   does not directly expose any links between an end user and the sites
   they've visited to any third party.

   HTTPS clients store SCTs and certificate chains they see, and later
   send them to the originating HTTPS server by posting them to a well-
   known URL (associated with that server), as described in
   Section 8.1.2.
   Note that clients will send the same SCTs and chains to a server
   multiple times with the assumption that any man-in-the-middle attack
   eventually will cease, and an honest server will eventually receive
   collected malicious SCTs and certificate chains.

   HTTPS servers store SCTs and certificate chains received from
   clients, as described in Section 8.1.3. They later share them with
   CT auditors by either posting them to auditors or making them
   available via a well-known URL. This is described in Section 8.1.4.

8.1.1. SCT Feedback data format

   The data shared between HTTPS clients and servers, as well as between
   HTTPS servers and CT auditors, is a JSON array [RFC7159]. Each item
   in the array is a JSON object with the following content:

   o x509_chain: An array of base64-encoded X.509 certificates. The
     first element is the end-entity certificate, the second certifies
     the first and so on.

   o sct_data: An array of objects consisting of the base64
     representation of the binary SCT data as defined in
     [RFC-6962-BIS-09] Section 3.3.

   We will refer to this object as 'sct_feedback'.

   The x509_chain element always contains at least one element. It also
   always contains a full chain from a leaf certificate to a self-signed
   trust anchor.

   [ TBD: Be strict about what sct_data may contain or is this
   sufficiently implied by previous sections? ]

8.1.2. HTTPS client to server

   When an HTTPS client connects to an HTTPS server, the client receives
   a set of SCTs as part of the TLS handshake. SCTs are included in the
   TLS handshake using one or more of the three mechanisms described in
   [RFC-6962-BIS-09] section 3.4 - in the server certificate, in a TLS
   extension, or in an OCSP extension. The client MUST discard SCTs
   that are not signed by a log known to the client and SHOULD store the
   remaining SCTs together with a locally constructed certificate chain
   which is trusted (i.e.
   terminated in a pre-loaded or locally installed Trust Anchor) in an
   sct_feedback object or equivalent data structure for later use in SCT
   Feedback.

   The SCTs stored on the client MUST be keyed by the exact domain name
   the client contacted. They MUST NOT be sent to any domain not
   matching the original domain (e.g. if the original domain is
   sub.example.com they must not be sent to sub.sub.example.com or to
   example.com.) They MUST NOT be sent to any Subject Alternative Names
   specified in the certificate. In the case of certificates that
   validate multiple domain names, the same SCT is expected to be stored
   multiple times.

   Not following these constraints would increase the risk for two types
   of privacy breaches. First, the HTTPS server receiving the SCT would
   learn about other sites visited by the HTTPS client. Second,
   auditors receiving SCTs from the HTTPS server would learn information
   about other HTTPS servers visited by its clients.

   If the client later again connects to the same HTTPS server, it again
   receives a set of SCTs and calculates a certificate chain, and again
   creates an sct_feedback or similar object. If this object does not
   exactly match an existing object in the store, then the client MUST
   add this new object to the store, associated with the exact domain
   name contacted, as described above. An exact comparison is needed to
   ensure that attacks involving alternate chains are detected. An
   example of such an attack is described in [TODO double-CA-compromise
   attack]. However, at least one optimization is safe and MAY be
   performed: If the certificate chain exactly matches an existing
   certificate chain, the client may store the union of the SCTs from
   the two objects in the first (existing) object.
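
   As a non-normative illustration of the storage rules above, the
   following Python sketch keys stored objects by the exact domain name
   contacted, keeps objects with differing chains separate, and applies
   the one safe optimization (merging SCTs when the chain matches
   exactly). The names SctFeedback, ClientSctStore, record and
   post_body are hypothetical and not defined by this document, and
   sct_data is assumed here to be serialized as an array of base64
   strings.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SctFeedback:
    """One sct_feedback object as described in Section 8.1.1."""
    x509_chain: tuple                           # base64 certs, end-entity first
    sct_data: set = field(default_factory=set)  # base64 SCTs (assumed encoding)

    def to_dict(self):
        return {"x509_chain": list(self.x509_chain),
                "sct_data": sorted(self.sct_data)}


class ClientSctStore:
    """Hypothetical client-side store keyed by the exact domain contacted."""

    def __init__(self):
        self._by_domain = {}

    def record(self, domain, chain, scts):
        """Store what was observed in one TLS session with `domain`."""
        entries = self._by_domain.setdefault(domain, [])
        chain = tuple(chain)
        for entry in entries:
            if entry.x509_chain == chain:
                # Safe optimization: identical chain, so merge the SCT sets.
                entry.sct_data |= set(scts)
                return
        # Any difference in the chain is kept as a separate object so that
        # alternate chains remain visible to the server.
        entries.append(SctFeedback(chain, set(scts)))

    def post_body(self, domain):
        """JSON array for `domain` only; other domains are never included."""
        entries = self._by_domain.get(domain, [])
        return json.dumps([e.to_dict() for e in entries])
```

   In this sketch, objects recorded for sub.example.com are returned
   only by post_body("sub.example.com"); a lookup for example.com or
   sub.sub.example.com yields an empty array.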

   If the client does connect to the same HTTPS server a subsequent
   time, it MUST send to the server sct_feedback objects in the store
   that are associated with that domain name. It is not necessary to
   send an sct_feedback object constructed from the current TLS session.

   The client MUST NOT send the same set of SCTs to the same server more
   often than TBD.

   [ TODO: expand on rate/resource limiting motivation ]

   Refer to Section 11.3 for recommendations about strategies.

   Because SCTs can be used as a tracking mechanism (see
   Section 10.4.2), they deserve special treatment when they are
   received from (and provided to) domains that are loaded as
   subresources from an origin domain. Such domains are commonly called
   'third party domains'. An HTTPS Client SHOULD store SCT Feedback
   using a 'double-keying' approach, which isolates third party domains
   by the first party domain. This is described in XXX. Gossip would
   be performed normally for third party domains only when the user
   revisits the first party domain. In lieu of 'double-keying', an
   HTTPS Client MAY treat SCT Feedback in the same manner it treats
   other security mechanisms that can enable tracking (such as HSTS and
   HPKP.)

   [ XXX is currently
   https://www.torproject.org/projects/torbrowser/design/#identifier-linkability
   How should it be referenced? Do we need to copy this out into
   another document? An appendix? ]

   If the HTTPS client has configuration options for not sending cookies
   to third parties, SCTs of third parties MUST be treated as cookies
   with respect to this setting. This prevents third party tracking
   through the use of SCTs/certificates, which would bypass the cookie
   policy.

   SCTs and corresponding certificates are POSTed to the originating
   HTTPS server at the well-known URL:

   https:///.well-known/ct-gossip/v1/sct-feedback

   The data sent in the POST is defined in Section 8.1.1.
   This data SHOULD be sent in an already established TLS session. This
   makes it hard for an attacker to disrupt SCT Feedback without also
   disturbing ordinary secure browsing (https://). This is discussed
   more in Section 11.1.1.

   Some clients have trust anchors or logs that are locally added (e.g.
   by an administrator or by the user themselves). These additions are
   potentially privacy-sensitive because they can carry information
   about the specific configuration, computer, or user.

   Certificates validated by locally added trust anchors will commonly
   have no SCTs associated with them, so in this case no action is
   needed with respect to CT Gossip. SCTs issued by locally added logs
   MUST NOT be reported via SCT Feedback.

   If a certificate is validated by SCTs that are issued by publicly
   trusted logs, but chains to a local trust anchor, the client MAY
   perform SCT Feedback for this SCT and certificate chain bundle. If it
   does so, the client MUST include the full chain of certificates
   chaining to the local trust anchor in the x509_chain array.
   Performing SCT Feedback in this scenario may be advantageous for the
   broader internet and CT ecosystem, but may also disclose information
   about the client. If the client elects to omit SCT Feedback, it can
   still choose to perform STH Pollination after fetching an inclusion
   proof, as specified in Section 8.2.

   We require the client to send the full chain (or nothing at all) for
   two reasons. Firstly, it simplifies the operation on the server if
   there are not two code paths. Secondly, omitting the chain does not
   actually preserve user privacy. The Issuer field in the certificate
   describes the signing certificate. And if the certificate is being
   submitted at all, it means the certificate is logged, and has SCTs.
   This means that the Issuer can be queried and obtained from the log
   so omitting the parent from the client's submission does not actually
   help user privacy.

8.1.3. HTTPS server operation

   HTTPS servers can be configured (or omit configuration), resulting
   in, broadly, two modes of operation. In the simpler mode, the server
   will only track leaf certificates and SCTs applicable to those leaf
   certificates. In the more complex mode, the server will confirm the
   client's chain validation and store the certificate chain. The
   latter mode requires more configuration, but is necessary to prevent
   denial of service (DoS) attacks on the server's storage space.

   In the simple mode of operation, upon receiving a submission at the
   sct-feedback well-known URL, an HTTPS server will perform a set of
   operations, checking on each sct_feedback object before storing it:

   1. the HTTPS server MAY modify the sct_feedback object, and discard
      all items in the x509_chain array except the first item (which is
      the end-entity certificate)

   2. if a bit-wise compare of the sct_feedback object matches one
      already in the store, this sct_feedback object SHOULD be
      discarded

   3. if the leaf cert is not for a domain for which the server is
      authoritative, the SCT MUST be discarded

   4. if an SCT in the sct_data array can't be verified to be a valid
      SCT for the accompanying leaf cert, and issued by a known log,
      the individual SCT SHOULD be discarded

   The modification in step number 1 is necessary to prevent a malicious
   client from exhausting the server's storage space. A client can
   generate their own issuing certificate authorities, and create an
   arbitrary number of chains that terminate in an end-entity
   certificate with an existing SCT. By discarding all but the end-
   entity certificate, we prevent a simple HTTPS server from storing
   this data.
   Note that operation in this mode will not prevent the attack
   described in Section 10.2. Skipping this step requires additional
   configuration as described below.

   The check in step 2 is for detecting duplicates and minimizing
   processing and storage by the server. As on the client, an exact
   comparison is needed to ensure that attacks involving alternate
   chains are detected. Again, at least one optimization is safe and
   MAY be performed. If the certificate chain exactly matches an
   existing certificate chain, the server may store the union of the
   SCTs from the two objects in the first (existing) object. It should
   do this after completing the validity check on the SCTs.

   The check in step 3 is to help prevent malfunctioning clients from
   exposing which sites they visit. It additionally helps prevent DoS
   attacks on the server.

   [ TBD: Thinking about building this, how does the SCT Feedback app
   know which sites it's authoritative for? ]

   The check in step 4 is to prevent DoS attacks where an adversary
   fills up the store prior to attacking a client (thus preventing the
   client's feedback from being recorded), or an attack where an
   adversary simply attempts to fill up the server's storage space.

   The more advanced server configuration will detect the [TODO double-
   CA-compromise] attack. In this configuration the server will not
   modify the sct_feedback object prior to performing checks 2, 3, and
   4.

   To prevent a malicious client from filling the server's data store,
   the HTTPS Server SHOULD perform an additional check:

   5. if the x509_chain consists of an invalid certificate chain, or
      the culminating trust anchor is not recognized by the server, the
      server SHOULD modify the sct_feedback object, discarding all
      items in the x509_chain array except the first item

   The HTTPS server may choose to omit checks 4 or 5.
   This will place the server at risk of having its data store filled up
   by invalid data, but can also allow a server to identify interesting
   certificates or certificate chains that omit valid SCTs, or do not
   chain to a trusted root. This information may enable an HTTPS server
   operator to detect attacks or unusual behavior of Certificate
   Authorities even outside the Certificate Transparency ecosystem.

8.1.4. HTTPS server to auditors

   HTTPS servers receiving SCTs from clients SHOULD share SCTs and
   certificate chains with CT auditors by either serving them on the
   well-known URL:

   https:///.well-known/ct-gossip/v1/collected-sct-feedback

   or by HTTPS POSTing them to a set of preconfigured auditors. This
   allows an HTTPS server to choose between an active push model or a
   passive pull model.

   The data received in a GET of the well-known URL or sent in the POST
   is defined in Section 8.1.1.

   HTTPS servers SHOULD share all sct_feedback objects they see that
   pass the checks in Section 8.1.3. If this is an infeasible amount of
   data, the server may choose to expire submissions according to an
   undefined policy. Suggestions for such a policy can be found in
   Section 11.3.

   HTTPS servers MUST NOT share any other data that they may learn from
   the submission of SCT Feedback by HTTPS clients, like the HTTPS
   client IP address or the time of submission.

   As described above, HTTPS servers can be configured (or omit
   configuration), resulting in two modes of operation. In one mode,
   the x509_chain array will contain a full certificate chain. This
   chain may terminate in a trust anchor the auditor may recognize, or
   it may not. (One scenario where this could occur is if the client
   submitted a chain terminating in a locally added trust anchor, and
   the server kept this chain.)
   In the other mode, the x509_chain array will consist of only a single
   element, which is the end-entity certificate.

   Auditors SHOULD provide the following URL accepting HTTPS POSTing of
   SCT feedback data:

   https:///ct-gossip/v1/sct-feedback

   [ TBD: Should that be .well-known? Depends on whether auditors will
   operate in their own URL name space or not. ]

   Auditors SHOULD regularly poll HTTPS servers at the well-known
   collected-sct-feedback URL. The frequency of the polling and how to
   determine which domains to poll is outside the scope of this
   document. However, the selection MUST NOT be influenced by potential
   HTTPS clients connecting directly to the auditor. For example, if a
   poll to example.com occurs directly after a client submits an SCT for
   example.com, an adversary observing the auditor can trivially
   conclude the activity of the client.

8.2. STH pollination

   The goal of sharing Signed Tree Heads (STHs) through pollination is
   to share STHs between HTTPS clients and CT auditors while still
   preserving the privacy of the end user. The sharing of STHs
   contributes to the overall goal of detecting misbehaving logs by
   providing CT auditors with STHs from many vantage points, making it
   possible to detect logs that are presenting inconsistent views.

   HTTPS servers supporting the protocol act as STH pools. HTTPS
   clients and CT auditors in the possession of STHs can pollinate STH
   pools by sending STHs to them, and retrieving new STHs to send to
   other STH pools. CT auditors can improve the value of their auditing
   by retrieving STHs from pools.

   HTTPS clients send STHs to HTTPS servers by POSTing them to the well-
   known URL:

   https:///.well-known/ct-gossip/v1/sth-pollination

   The data sent in the POST is defined in Section 8.2.4. This data
   SHOULD be sent in an already established TLS session.
This makes it 621 hard for an attacker to disrupt STH gossiping without also disturbing 622 ordinary secure browsing (https://). This is discussed more in 623 Section 11.1.1. 625 The response contains zero or more STHs in the same format, described 626 in Section 8.2.4. 628 An HTTPS client may acquire STHs by several methods: 630 o in replies to pollination POSTs; 632 o asking logs that it recognises for the current STH, either 633 directly (v2/get-sth) or indirectly (for example over DNS) 635 o resolving an SCT and certificate to an STH via an inclusion proof 637 o resolving one STH to another via a consistency proof 639 HTTPS clients (that have STHs) and CT auditors SHOULD pollinate STH 640 pools with STHs. Which STHs to send and how often pollination should 641 happen is regarded as undefined policy with the exception of privacy 642 concerns explained below. Suggestions for the policy may be found in 643 Section 11.3. 645 An HTTPS client could be tracked by giving it a unique or rare STH. 646 To address this concern, we place restrictions on different 647 components of the system to ensure an STH will not be rare. 649 o HTTPS clients silently ignore STHs from logs with an STH issuance 650 frequency of more than one STH per hour. Logs use the STH 651 Frequency Count metadata to express this ([RFC-6962-BIS-09] 652 sections 3.6 and 5.1). 654 o HTTPS clients silently ignore STHs which are not fresh. 656 An STH is considered fresh iff its timestamp is less than 14 days in 657 the past. Given a maximum STH issuance rate of one per hour, an 658 attacker has 336 unique STHs per log for tracking. Clients MUST 659 ignore STHs older than 14 days. We consider STHs within this 660 validity window not to be personally identifiable data, and STHs 661 outside this window to be personally identifiable. 663 When multiplied by the number of logs from which a client accepts 664 STHs, this number of unique STHs grows and the negative privacy 665 implications grow with it.
It's important that this is taken into 666 account when logs are chosen for default settings in HTTPS clients. 667 This concern is discussed in Section 10.4.5. 669 A log may cease operation, in which case there will soon be no STH 670 within the validity window. Clients SHOULD perform all three methods 671 of gossip about a log that has ceased operation since it is possible 672 the log was still compromised and gossip can detect that. STH 673 Pollination is the one mechanism where a client must know about a log 674 shutdown. A client who does not know about a log shutdown MUST NOT 675 attempt any heuristic to detect a shutdown. Instead the client MUST 676 be informed about the shutdown from a verifiable source (e.g. a 677 software update). The client SHOULD be provided the final STH issued 678 by the log and SHOULD resolve SCTs and STHs to this final STH. If an 679 SCT or STH cannot be resolved to the final STH, clients should follow 680 the requirements and recommendations set forth in Section 11.1.2. 682 8.2.1. HTTPS Clients and Proof Fetching 684 There are two types of proofs a client may retrieve: inclusion proofs 685 and consistency proofs. 687 An HTTPS client will retrieve SCTs from an HTTPS server, and must 688 obtain an inclusion proof to an STH in order to verify the promise 689 made by the SCT. 691 An HTTPS client may also receive an SCT bundled with an inclusion 692 proof to a historical STH via an unspecified future mechanism. 693 Because this historical STH is considered personally identifiable 694 information per above, the client must obtain a consistency proof to 695 a more recent STH. 697 A client SHOULD perform proof fetching. A client MUST NOT perform 698 proof fetching for any SCTs or STHs issued by a locally added log. A 699 client MAY fetch an inclusion proof for an SCT (issued by a pre- 700 loaded log) that validates a certificate chaining to a locally added 701 trust anchor.
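As an illustration only, the freshness window from Section 8.2 and the locally-added-log rule above can be sketched in Python. The function names are hypothetical, and the sketch assumes STH timestamps are expressed in milliseconds since the Unix epoch; it does not address timestamps that lie in the future.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FRESHNESS_WINDOW = timedelta(days=14)   # validity window from Section 8.2
MAX_STHS_PER_HOUR = 1                   # issuance ceiling from Section 8.2


def is_fresh(sth_timestamp_ms: int, now: datetime) -> bool:
    """Return True iff the STH timestamp is less than 14 days in the past.

    Assumption: the timestamp is in milliseconds since the Unix epoch.
    """
    issued = EPOCH + timedelta(milliseconds=sth_timestamp_ms)
    return now - issued < FRESHNESS_WINDOW


def max_trackable_sths_per_log() -> int:
    """Upper bound on distinct fresh STHs one log can have outstanding."""
    hours_in_window = FRESHNESS_WINDOW // timedelta(hours=1)
    return hours_in_window * MAX_STHS_PER_HOUR


def may_fetch_proof(log_locally_added: bool) -> bool:
    """Proof fetching MUST NOT happen for SCTs/STHs from a locally added log."""
    return not log_locally_added
```

With one STH per hour over 14 days, max_trackable_sths_per_log() reproduces the figure of 336 quoted in Section 8.2.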
703 [ TBD: Linus doesn't like this because we're mandating behavior that 704 is not necessarily safe. Is it unsafe? Not sure.] 706 If a client requested either proof directly from a log or auditor, it 707 would reveal the client's browsing habits to a third party. To 708 mitigate this risk, an HTTPS client MUST retrieve the proof in a 709 manner that disguises the client. 711 Depending on the client's DNS provider, DNS may provide an 712 appropriate intermediate layer that obfuscates the linkability 713 between the user of the client and the request for inclusion (while 714 at the same time providing a caching layer for oft-requested 715 inclusion proofs.) 717 [ TODO: Add a reference to Google's DNS mechanism more proper than 718 http://www.certificate-transparency.org/august-2015-newsletter ] 720 Anonymity networks such as Tor also present a mechanism for a client 721 to anonymously retrieve a proof from an auditor or log. 723 Even when using a privacy-preserving layer between the client and the 724 log, certain observations may be made about an anonymous client or 725 general user behavior depending on how proofs are fetched. For 726 example, if a client fetched all outstanding proofs at once, a log 727 would know that SCTs or STHs received around the same time are more 728 likely to come from a particular client. This could potentially go 729 so far as correlation of activity at different times to a single 730 client. In aggregate the data could reveal what sites are commonly 731 visited together. HTTPS clients SHOULD use a strategy of proof 732 fetching that attempts to obfuscate these patterns. A suggestion of 733 such a policy can be found in Section 11.2. 735 Resolving either SCTs or STHs may result in errors. These errors 736 may be routine downtime or other transient errors, or they may be 737 indicative of an attack.
Clients should follow the requirements and 738 recommendations set forth in Section 11.1.2 when handling these 739 errors in order to give the CT ecosystem the greatest chance of 740 detecting and responding to a compromise. 742 8.2.2. STH Pollination without Proof Fetching 744 An HTTPS client MAY participate in STH Pollination without fetching 745 proofs. In this situation, the client receives STHs from a server, 746 applies the same validation logic to them (signed by a known log, 747 within the validity window) and will later pass them to an HTTPS 748 server. 750 When operating in this fashion, the HTTPS client is promoting gossip 751 for Certificate Transparency, but derives no direct benefit itself. 752 In comparison, a client who resolves SCTs or historical STHs to 753 recent STHs and pollinates them is assured that if it was attacked, 754 there is a probability that the ecosystem will detect and respond to 755 the attack (by distrusting the log). 757 8.2.3. Auditor Action 759 CT auditors participate in STH pollination by retrieving STHs from 760 HTTPS servers. They verify that the STH is valid by checking the 761 signature, and requesting a consistency proof from the STH to the 762 most recent STH. 764 After retrieving the consistency proof to the most recent STH, they 765 SHOULD pollinate this new STH among participating HTTPS Servers. In 766 this way, as STHs "age out" and are no longer fresh, their "lineage" 767 continues to be tracked in the system. 769 8.2.4. STH Pollination data format 771 The data sent from HTTPS clients and CT auditors to HTTPS servers is 772 a JSON object [RFC7159] with the following content: 774 o sths - an array of 0 or more fresh SignedTreeHead's as defined in 775 [RFC-6962-BIS-09] Section 3.6.1. 777 8.3. Trusted Auditor Stream 779 HTTPS clients MAY send SCTs and cert chains, as well as STHs, 780 directly to auditors. If sent, this data MAY include data that 781 reflects locally added logs or trust anchors. 
Note that there are 782 privacy implications in doing so; these are outlined in 783 Section 10.4.1 and Section 10.4.6. 785 The most natural trusted auditor arrangement arguably is a web 786 browser that is "logged in to" a provider of various internet 787 services. Another equivalent arrangement is a trusted party like a 788 corporation to which an employee is connected through a VPN or by 789 other similar means. A third might be individuals or smaller groups 790 of people running their own services. In such a setting, retrieving 791 proofs from that third party could be considered reasonable from a 792 privacy perspective. The HTTPS client may also do its own auditing 793 and might additionally share SCTs and STHs with the trusted party to 794 contribute to herd immunity. Here, the ordinary [RFC-6962-BIS-09] 795 protocol is sufficient for the client to do the auditing while SCT 796 Feedback and STH Pollination can be used in whole or in parts for the 797 gossip part. 799 Another well established trusted party arrangement on the internet 800 today is the relation between internet users and their providers of 801 DNS resolver services. DNS resolvers are typically provided by the 802 internet service provider (ISP) used, which by the nature of name 803 resolving already know a great deal about which sites their users 804 visit. As mentioned in Section 8.2.1, in order for HTTPS clients to 805 be able to retrieve proofs in a privacy preserving manner, logs could 806 expose a DNS interface in addition to the ordinary HTTPS interface. 807 An informal writeup of such a protocol can be found at XXX. 809 8.3.1. Trusted Auditor data format 811 Trusted Auditors expose a REST API at the fixed URI: 813 https:///ct-gossip/v1/trusted-auditor 815 Submissions are made by sending an HTTPS POST request, with the body 816 of the POST in a JSON object. Upon successful receipt the Trusted 817 Auditor returns 200 OK.
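A minimal sketch of how a client might serialize such a POST body follows. The helper name build_trusted_auditor_body is hypothetical, and the entries are treated as opaque strings; the two top-level keys are the ones described in the next paragraph. The sketch only builds the body and does not perform the HTTPS request.

```python
import json


def build_trusted_auditor_body(sct_feedback, sths):
    """Serialize a Trusted Auditor submission as a JSON object.

    sct_feedback: iterable of dicts with 'x509_chain' and 'sct_data' lists
    sths:         iterable of encoded STHs (treated here as opaque strings)
    """
    body = {
        "sct_feedback": [
            {
                "x509_chain": list(entry["x509_chain"]),
                "sct_data": list(entry["sct_data"]),
            }
            for entry in sct_feedback
        ],
        "sths": list(sths),
    }
    # json.dumps emits double-quoted strings, as JSON [RFC7159] requires.
    return json.dumps(body)


# Placeholder values only; real entries would carry encoded certificates,
# SCTs and STHs.
example = build_trusted_auditor_body(
    [{"x509_chain": ["AAA...", "AAA..."], "sct_data": ["AAA..."]}],
    ["AAA..."],
)
```

The resulting string would be sent as the POST body; a 200 OK response indicates successful receipt.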
819 The JSON object consists of two top-level keys: 'sct_feedback' and 820 'sths'. The 'sct_feedback' value is an array of JSON objects as 821 defined in Section 8.1.1. The 'sths' value is an array of STHs as 822 defined in Section 8.2.4. 824 Example: 826 { 827 'sct_feedback' : 828 [ 829 { 830 'x509_chain' : 831 [ 832 '----BEGIN CERTIFICATE---\n 833 AAA...', 834 '----BEGIN CERTIFICATE---\n 835 AAA...', 836 ... 837 ], 838 'sct_data' : 839 [ 840 'AAA...', 841 'AAA...', 842 ... 843 ] 844 }, ... 845 ], 846 'sths' : 847 [ 848 'AAA...', 849 'AAA...', 850 ... 851 ] 852 } 854 9. 3-Method Ecosystem 856 The use of three distinct methods for auditing logs may seem 857 excessive, but each represents a needed component in the CT 858 ecosystem. To understand why, the drawbacks of each component must 859 be outlined. In this discussion we assume that an attacker knows 860 which mechanisms an HTTPS client and HTTPS server implement. 862 9.1. SCT Feedback 864 SCT Feedback requires the cooperation of HTTPS clients and more 865 importantly HTTPS servers. Although SCT Feedback does require a 866 significant amount of server-side logic to respond to the 867 corresponding APIs, this functionality does not require 868 customization, so it may be pre-provided and work out of the box. 869 However, to take full advantage of the system, an HTTPS server would 870 wish to perform some configuration to optimize its operation: 872 o Minimize its disk commitment by maintaining a list of known SCTs 873 and certificate chains (or hashes thereof) 875 o Maximize its chance of detecting a misissued certificate by 876 configuring a trust store of CAs 878 o Establish a "push" mechanism for POSTing SCTs to CT auditors 880 These configuration needs, and the simple fact that it would require 881 some deployment of software, means that some percentage of HTTPS 882 servers will not deploy SCT Feedback. 
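To illustrate the first configuration item above (keeping a list of known SCTs and certificate chains, or hashes thereof), here is a minimal Python sketch. The class name FeedbackStore and its interface are hypothetical, and inputs are assumed to be encoded strings. It keeps only SHA-256 digests for de-duplication; a real server would still have to retain whatever data it intends to share with auditors.

```python
import hashlib


class FeedbackStore:
    """De-duplicate SCT Feedback by remembering digests of (chain, SCT) pairs."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def _digest(x509_chain, sct):
        h = hashlib.sha256()
        for item in (*x509_chain, sct):
            data = item.encode("utf-8")
            # Length-prefix each element so that different splits of the
            # same bytes cannot collide.
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
        return h.digest()

    def add(self, x509_chain, sct_data):
        """Record each (chain, SCT) pair and return the SCTs not seen before."""
        new_scts = []
        for sct in sct_data:
            digest = self._digest(x509_chain, sct)
            if digest not in self._seen:
                self._seen.add(digest)
                new_scts.append(sct)
        return new_scts
```

Each stored pair costs 32 bytes regardless of chain length, which bounds the disk commitment per distinct submission.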
884 It is worthwhile to note that an attacker may be able to prevent 885 detection of an attack on a webserver (in all cases) if SCT Feedback 886 is not implemented. This attack is detailed in Section 10.1. 888 If SCT Feedback was the only mechanism in the ecosystem, any server 889 that did not implement the feature would open itself and its users to 890 attack without any possibility of detection. 892 If SCT Feedback is not deployed by a webserver, malicious logs will 893 be able to attack all users of the webserver (who do not have a 894 Trusted Auditor relationship) with impunity. Additionally, users who 895 wish to have the strongest measure of privacy protection (by 896 disabling STH Pollination Proof Fetching and forgoing a Trusted 897 Auditor) could be attacked without risk of detection. 899 9.2. STH Pollination 901 STH Pollination requires the cooperation of HTTPS clients, HTTPS 902 servers, and logs. 904 For a client to fully participate in STH Pollination, and have this 905 mechanism detect attacks against it, the client must have a way to 906 safely perform Proof Fetching in a privacy preserving manner. (The 907 client may pollinate STHs it receives without performing Proof 908 Fetching, but we do not consider this option in this section.) 910 HTTPS Servers must deploy software (although, as in the case with SCT 911 Feedback this logic can be pre-provided) and commit some configurable 912 amount of disk space to the endeavor. 914 Logs (or a third party) must provide access to clients to query 915 proofs in a privacy preserving manner, most likely through DNS. 917 Unlike SCT Feedback, the STH Pollination mechanism is not hampered if 918 only a minority of HTTPS servers deploy it. However, it makes an 919 assumption that an HTTPS client performs Proof Fetching (such as the 920 DNS mechanism discussed).
Unfortunately, any manner that is 921 anonymous for some (such as clients who use shared DNS services such 922 as a large ISP), may not be anonymous for others. 924 For instance, DNS requests expose a considerable amount of sensitive 925 information (including what data is already present in the cache) in 926 plaintext over the network. For this reason, some percentage of 927 HTTPS clients may choose to not enable the Proof Fetching component 928 of STH Pollination. (Although they can still request and send STHs 929 among participating HTTPS servers, even when this affords them no 930 direct benefit.) 932 If STH Pollination was the only mechanism deployed, users that 933 disable it would be able to be attacked without risk of detection. 935 If STH Pollination was not deployed, HTTPS Clients visiting HTTPS 936 Servers who did not deploy SCT Feedback could be attacked without 937 risk of detection. 939 9.3. Trusted Auditor Relationship 941 The Trusted Auditor Relationship is expected to be the rarest gossip 942 mechanism, as an HTTPS Client is providing an unadulterated report of 943 its browsing history to a third party. While there are valid and 944 common reasons for doing so, there is no appropriate way to enter 945 into this relationship without retrieving informed consent from the 946 user. 948 However, the Trusted Auditor Relationship mechanism still provides 949 value to a class of HTTPS Clients. For example, web crawlers have no 950 concept of a "user" and no expectation of privacy. Organizations 951 already performing network auditing for anomalies or attacks can run 952 their own Trusted Auditor for the same purpose with marginal increase 953 in privacy concerns. 955 The ability to change one's Trusted Auditor is a form of Trust 956 Agility that allows a user to choose who to trust, and be able to 957 revise that decision later without consequence. 
A Trusted Auditor 958 connection can be made more confidential than DNS (through the use of 959 TLS), and can even be made (somewhat) anonymous through the use of 960 anonymity services such as Tor. (Note that this does ignore the de- 961 anonymization possibilities available from viewing a user's browsing 962 history.) 964 If the Trusted Auditor relationship was the only mechanism deployed, 965 users who do not enable it (the majority) would be able to be 966 attacked without risk of detection. 968 If the Trusted Auditor relationship was not deployed, crawlers and 969 organizations would build it themselves for their own needs. By 970 standardizing it, users who wish to opt-in (for instance those 971 unwilling to participate fully in STH Pollination) can have an 972 interoperable standard they can use to choose and change their 973 trusted auditor. 975 9.4. Interaction 977 The interactions of the mechanisms are outlined as follows: 979 HTTPS Clients can be attacked without risk of detection if they do 980 not participate in any of the three mechanisms. 982 HTTPS Clients are afforded the greatest chance of detecting an attack 983 when they either participate in both SCT Feedback and STH Pollination 984 with Proof Fetching or if they have a Trusted Auditor relationship. 985 (Participating in SCT Feedback is required to prevent a malicious log 986 from refusing to ever resolve an SCT to an STH, as put forward in 987 Section 10.1). Additionally, participating in SCT Feedback enables 988 an HTTPS Client to assist in detecting the exact target of an attack. 990 HTTPS Servers that omit SCT Feedback enable malicious logs to carry 991 out attacks without risk of detection. If these servers are targeted 992 specifically, even if the attack is detected, without SCT Feedback 993 they may never learn that they were specifically targeted.
HTTPS 994 servers without SCT Feedback do gain some measure of herd immunity, 995 but only because their clients participate in STH Pollination (with 996 Proof Fetching) or have a Trusted Auditor Relationship. 998 When HTTPS Servers omit SCT feedback, it allows their users to be 999 attacked without detection by a malicious log; the vulnerable users 1000 are those who do not have a Trusted Auditor relationship. 1002 10. Security considerations 1004 10.1. Attacks by actively malicious logs 1006 One of the most powerful attacks possible in the CT ecosystem is a 1007 trusted log that has actively decided to be malicious. It can carry 1008 out an attack in two ways: 1010 In the first attack, the log can present a split view of the log for 1011 all time. The only way to detect this attack is to resolve each view 1012 of the log to the two most recent STHs and then force the log to 1013 present a consistency proof. (Which it cannot.) This attack can be 1014 detected by CT auditors participating in STH Pollination, as long as 1015 they are explicitly built to handle the situation of a log 1016 continuously presenting a split view. 1018 In the second attack, the log can sign an SCT, and refuse to ever 1019 include the certificate that the SCT refers to in the tree. 1021 (Alternately, it can include it in a branch of the tree and issue an 1022 STH, but then abandon that branch.) Whenever someone requests an 1023 inclusion proof for that SCT (or a consistency proof from that STH), 1024 the log would respond with an error, and a client may simply regard 1025 the response as a transient error. This attack can be detected using 1026 SCT Feedback, or an Auditor of Last Resort, as presented in 1027 Section 11.1.2. 1029 10.2. Dual-CA Compromise 1031 XXX describes an attack possible by an adversary who compromises two 1032 Certificate Authorities and a Log. This attack is difficult to defend 1033 against in the CT ecosystem, and XXX describes a few approaches to 1034 doing so.
We note that Gossip is not intended to defend against this 1035 attack, but can in certain modes. 1037 Defending against the Dual-CA Compromise attack requires SCT 1038 Feedback, and explicitly requires the server to save full certificate 1039 chains (described in Section 8.1.3 as the 'complex' configuration.) 1040 After CT auditors receive the full certificate chains from servers, 1041 they must compare the chain built by clients to the chain supplied by 1042 the log. If the chains differ significantly, the auditor can raise a 1043 concern. 1045 [ What does 'differ significantly' mean? We should provide guidance. 1046 I _think_ the correct algorithm to raise a concern is: 1048 If one chain is not a subset of the other AND If the root 1049 certificates of the chains are different THEN It's suspicious. 1051 Justification: - Cross-Signatures could result in a different org 1052 being treated as the 'root', but in this case, one chain would be a 1053 subset of the other. - Intermediate swapping (e.g. different 1054 signature algorithms) could result in different chains, but the root 1055 would be the same. 1057 (Hitting both those cases at once would cause a false positive 1058 though.) 1060 What did I miss? ] 1062 10.3. Censorship/Blocking considerations 1064 We assume a network attacker who is able to fully control the 1065 client's internet connection for some period of time, including 1066 selectively blocking requests to certain hosts and truncating TLS 1067 connections based on information observed or guessed about client 1068 behavior. In order to successfully detect log misbehavior, the 1069 gossip mechanisms must still work even in these conditions. 1071 There are several gossip connections that can be blocked: 1073 1. Clients sending SCTs to servers in SCT Feedback 1075 2. Servers sending SCTs to auditors in SCT Feedback (server push 1076 mechanism) 1078 3. Servers making SCTs available to auditors (auditor pull 1079 mechanism) 1081 4. 
Clients fetching proofs in STH Pollination 1083 5. Clients sending STHs to servers in STH Pollination 1085 6. Servers sending STHs to clients in STH Pollination 1087 7. Clients sending SCTs to Trusted Auditors 1089 If a party cannot connect to another party, it can be assured that 1090 the connection did not succeed. While it may not have been 1091 maliciously blocked, it knows the transaction did not succeed. 1092 Mechanisms which result in a positive affirmation from the recipient 1093 that the transaction succeeded allow confirmation that a connection 1094 was not blocked. In this situation, the party can factor this into 1095 strategies suggested in Section 11.3 and in Section 11.1.2. 1097 The connections that allow positive affirmation are 1, 2, 4, 5, and 1098 7. 1100 More insidious is blocking the connections that do not allow positive 1101 confirmation: 3 and 6. An attacker may truncate or drop a response 1102 from a server to a client, such that the server believes it has 1103 shared data with the recipient, when it has not. However, in both 1104 scenarios (3 and 6), the server cannot distinguish the client as a 1105 cooperating member of the CT ecosystem or as an attacker performing a 1106 Sybil attack, aiming to flush the server's data store. Therefore the 1107 fact that these connections can be undetectably blocked does not 1108 actually alter the threat model of servers responding to these 1109 requests. The choice of algorithm to release data is crucial to 1110 protect against these attacks; strategies are suggested in 1111 Section 11.3. 1113 Handling censorship and network blocking (which is indistinguishable 1114 from network error) is relegated to the implementation policy chosen 1115 by clients. Suggestions for client behavior are specified in 1116 Section 11.1. 1118 10.4. Privacy considerations 1120 CT Gossip deals with HTTPS Clients which are trying to share 1121 indicators that correspond to their browsing history.
The most 1122 sensitive relationships in the CT ecosystem are the relationships 1123 between HTTPS clients and HTTPS servers. Client-server relationships 1124 can be aggregated into a network graph with potentially serious 1125 implications for correlative de-anonymisation of clients and 1126 relationship-mapping or clustering of servers or of clients. 1128 There are, however, certain clients that do not require privacy 1129 protection. Examples of these clients are web crawlers or robots. 1130 But even in this case, the method by which these clients crawl the 1131 web may in fact be considered sensitive information. In general, it 1132 is better to err on the side of safety, and not assume a client is 1133 okay with giving up its privacy. 1135 10.4.1. Privacy and SCTs 1137 An SCT contains information that links it to a particular web site. 1138 Because the client-server relationship is sensitive, gossip between 1139 clients and servers about unrelated SCTs is risky. Therefore, a 1140 client with an SCT for a given server should transmit that 1141 information in only two channels: to the server associated with the 1142 SCT itself; and to a Trusted Auditor, if one exists. 1144 10.4.2. Privacy in SCT Feedback 1146 SCTs introduce yet another mechanism for HTTPS servers to store state 1147 on an HTTPS client, and potentially track users. HTTPS clients which 1148 allow users to clear history or cookies associated with an origin 1149 MUST clear stored SCTs and certificate chains associated with the 1150 origin as well. 1152 Auditors should treat all SCTs as sensitive data. SCTs received 1153 directly from an HTTPS client are especially sensitive, because the 1154 auditor is trusted by the client not to reveal its associations 1155 with servers. Auditors MUST NOT share such SCTs in any way, 1156 including sending them to an external log, without first mixing them 1157 with multiple other SCTs learned through submissions from multiple 1158 other clients.
Suggestions for mixing SCTs are presented in 1159 Section 11.3. 1161 There is a possible fingerprinting attack where a log issues a unique 1162 SCT for targeted log client(s). A colluding log and HTTPS server 1163 operator could therefore be a threat to the privacy of an HTTPS 1164 client. Given all the other opportunities for HTTPS servers to 1165 fingerprint clients - TLS session tickets, HPKP and HSTS headers, 1166 HTTP Cookies, etc. - this is considered acceptable. 1168 The fingerprinting attack described above would be mitigated by a 1169 requirement that logs MUST use a deterministic signature scheme when 1170 signing SCTs ([RFC-6962-BIS-09] Section 2.1.4). A log signing using 1171 RSA is not required to use a deterministic signature scheme. 1173 Since logs are allowed to issue a new SCT for a certificate already 1174 present in the log, mandating deterministic signatures does not stop 1175 this fingerprinting attack altogether. It does make the attack 1176 harder to pull off without being detected though. 1178 There is another similar fingerprinting attack where an HTTPS server 1179 tracks a client by using a unique certificate or a variation of cert 1180 chains. The risk for this attack is accepted on the same grounds as 1181 the unique SCT attack described above. [XXX any mitigations possible 1182 here?] 1184 10.4.3. Privacy for HTTPS clients performing STH Proof Fetching 1186 An HTTPS client performing Proof Fetching should only request proofs 1187 from a CT log that it accepts SCTs from. An HTTPS client MAY [TBD 1188 SHOULD?] regularly request an STH from all logs it is willing to 1189 accept, even if it has seen no SCTs from that log. 1191 [ TBD how regularly? This has operational implications for log 1192 operators ] 1194 The actual mechanism by which Proof Fetching is done carries 1195 considerable privacy concerns. Although out of scope for the 1196 document, DNS is a mechanism currently discussed.
DNS exposes data 1197 in plaintext over the network (including what sites the user is 1198 visiting and what sites they have previously visited) and may not be 1199 suitable for some clients. 1201 10.4.4. Privacy in STH Pollination 1203 An STH linked to an HTTPS client may indicate the following about 1204 that client: 1206 o that the client gossips; 1208 o that the client has been using CT at least until the time that the 1209 timestamp and the tree size indicate; 1211 o that the client is talking, possibly indirectly, to the log 1212 indicated by the tree hash; 1214 o which software and software version is being used. 1216 There is a possible fingerprinting attack where a log issues a unique 1217 STH for a targeted HTTPS client. This is similar to the 1218 fingerprinting attack described in Section 10.4.2, but can operate 1219 cross-origin. If a log (or HTTPS Server cooperating with a log) 1220 provides a unique STH to a client, the targeted client will be the 1221 only client pollinating that STH cross-origin. 1223 It is mitigated partially because the log is limited in the number of 1224 STHs it can issue. It must 'save' one of its STHs each MMD to 1225 perform the attack. 1227 10.4.5. Privacy in STH Interaction 1229 An HTTPS client may pollinate any STH within the last 14 days. An 1230 HTTPS Client may also pollinate an STH for any log that it knows 1231 about. When a client pollinates STHs to a server, it will release 1232 more than one STH at a time. It is unclear if a server may 'prime' a 1233 client and be able to reliably detect the client at a later time. 1235 It's clear that a single site can track a user any way they wish, but 1236 this attack works cross-origin and is therefore more concerning. Two 1237 independent sites A and B want to collaborate to track a user cross- 1238 origin. A feeds a client Carol some N specific STHs from the M logs 1239 Carol trusts, chosen to be older and less common, but still in the 1240 validity window.
Carol visits B and chooses to release some of the 1241 STHs she has stored, according to some policy. 1243 Modeling a representation for how common older STHs are in the pools 1244 of clients, and examining that with a given policy of how to choose 1245 which of those STHs to send to B, it should be possible to calculate 1246 statistics about how unique Carol looks when talking to B and how 1247 useful/accurate such a tracking mechanism is. 1249 Building such a model is likely impossible without some real world 1250 data, and requires a given implementation of a policy. To combat 1251 this attack, suggestions are provided in Section 11.3 to attempt to 1252 minimize it, but follow-up testing with real world deployment to 1253 improve the policy will be required. 1255 10.4.6. Trusted Auditors for HTTPS Clients 1257 Some HTTPS clients may choose to use a trusted auditor. This trust 1258 relationship exposes a large amount of information about the client 1259 to the auditor. In particular, it will identify the web sites that 1260 the client has visited to the auditor. Some clients may already 1261 share this information to a third party, for example, when using a 1262 server to synchronize browser history across devices in a server- 1263 visible way, or when doing DNS lookups through a trusted DNS 1264 resolver. For clients with such a relationship already established, 1265 sending SCTs to a trusted auditor run by the same organization does 1266 not appear to expose any additional information to the trusted third 1267 party. 1269 Clients who wish to contact a CT auditor without associating their 1270 identities with their SCTs may wish to use an anonymizing network 1271 like Tor to submit SCT Feedback to the auditor. Auditors SHOULD 1272 accept SCT Feedback that arrives over such anonymizing networks. 
1274 Clients sending feedback to an auditor may prefer to reduce the 1275 temporal granularity of the history exposure to the auditor by 1276 caching and delaying their SCT Feedback reports. This is elaborated 1277 upon in Section 11.3. This strategy is only as effective as the 1278 granularity of the timestamps embedded in the SCTs and STHs. 1280 10.4.7. HTTPS Clients as Auditors 1282 Some HTTPS Clients may choose to act as CT auditors themselves. A 1283 Client taking on this role needs to consider the following: 1285 o an Auditing HTTPS Client potentially exposes its history to the 1286 logs that it queries. Querying the log through a cache or a proxy 1287 with many other users may avoid this exposure, but may expose 1288 information to the cache or proxy, in the same way that a non- 1289 Auditing HTTPS Client exposes information to a Trusted Auditor. 1291 o an effective CT auditor needs a strategy about what to do in the 1292 event that it discovers misbehavior from a log. Misbehavior from 1293 a log involves the log being unable to provide either (a) a 1294 consistency proof between two valid STHs or (b) an inclusion proof 1295 for a certificate to an STH any time after the log's MMD has 1296 elapsed from the issuance of the SCT. The log's inability to 1297 provide either proof will not be externally cryptographically- 1298 verifiable, as it may be indistinguishable from a network error. 1300 11. Policy Recommendations 1302 This section offers suggestions to implementors of HTTPS 1303 Clients, HTTPS Servers, and CT auditors. It does not mandate any 1304 particular implementation technique, so long as the privacy considerations 1305 established above are obeyed. 1307 11.1. Blocking Recommendations 1309 11.1.1.
Frustrating blocking 1311 When making gossip connections to HTTPS Servers or Trusted Auditors, 1312 it is desirable to minimize the plaintext metadata in the connection 1313 that can be used to identify the connection as a gossip connection 1314 and therefore be of interest to block. Additionally, introducing 1315 some randomness into client behavior may be important. We assume 1316 that the adversary is able to inspect the behavior of the HTTPS 1317 client and understand how it makes gossip connections. 1319 As an example, if a client, after establishing a TLS connection (and 1320 receiving an SCT, but not making its own HTTP request yet), 1321 immediately opens a second TLS connection for the purpose of gossip, 1322 the adversary can reliably block this second connection to block 1323 gossip without affecting normal browsing. For this reason it is 1324 recommended to run the gossip protocols over an existing connection 1325 to the server, making use of connection multiplexing such as HTTP 1326 Keep-Alives or SPDY. 1328 Truncation is also a concern. If a client always establishes a TLS 1329 connection, makes a request, receives a response, and then always 1330 attempts a gossip communication immediately following the first 1331 response, truncation will allow an attacker to block gossip reliably. 1333 For these reasons, we recommend that, if at all possible, clients 1334 SHOULD send gossip data in an already established TLS session. This 1335 can be done through the use of HTTP Pipelining, SPDY, or HTTP/2. 1337 11.1.2. Responding to possible blocking 1339 In some cirsumstances a client may have a piece of data that they 1340 have attempted to share (via SCT Feedback or STH Pollination), but 1341 have been unable to do so: with every attempt they recieve an error. 1342 These situations are: 1344 1. The client has an SCT and a certificate, and attempts to retrieve 1345 an inclusion proof - but recieves an error on every attempt. 1347 2. 
The client has an STH, and attempts to resolve it to a newer STH 1348 via a consistency proof - but recieves an error on every attempt. 1350 3. The client has attempted to share an SCT and constructed 1351 certificate via SCT Feedback - but recieves an error on every 1352 attempt. 1354 4. The client has attempted to share an STH via STH Pollination - 1355 but recieves an error on every attempt. 1357 5. The client has attempted to share a specific piece of data with a 1358 Trusted Auditor - but recieves an error on every attempt. 1360 In the case of 1 or 2, it is conceivable that the reason for the 1361 errors is that the log acted improperly, either through malicious 1362 actions or compromise. A proof may not be able to be fetched because 1363 it does not exist (and only errors or timeouts occur). One such 1364 situation may arise because of an actively malicious log, as 1365 presented in Section 10.1. This data is especially important to 1366 share with the broader internet to detect this situation. 1368 If an SCT has attempted to be resolved to an STH via an inclusion 1369 proof multiple times, and each time has failed, a client SHOULD make 1370 every effort to send this SCT via SCT Feedback. However the client 1371 MUST NOT share the data with any other third party (excepting a 1372 Trusted Auditor should one exist). 1374 If an STH has attempted to be resolved to a newer STH via a 1375 consistency proof multiple times, and each time has failed, a client 1376 MAY share the STH with an "Auditor of Last Resort" even if the STH in 1377 question is no longer within the validity window. This auditor may 1378 be pre-configured in the client, but the client SHOULD permit a user 1379 to disable the functionality or change whom data is sent to. The 1380 Auditor of Last Resort itself represents a point of failure, so if 1381 implemented, it should connect using public key pinning and not 1382 considered an item delivered until it recieves a confirmation. 
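
   The requirement that an item is not considered delivered to the
   Auditor of Last Resort until a confirmation arrives can be sketched
   in Python as follows. This is only an illustration under stated
   assumptions: the class name LastResortQueue and the send hook are
   hypothetical, and the hook is assumed to return True only after the
   auditor has confirmed receipt over a pinned connection.

```python
from collections import deque
from typing import Callable, Deque, List


class LastResortQueue:
    """Holds STHs for an Auditor of Last Resort until delivery is confirmed."""

    def __init__(self, send: Callable[[bytes], bool]) -> None:
        # `send` is a hypothetical transport hook (an assumption, not part of
        # the draft): it returns True only when the auditor confirms receipt.
        self._send = send
        self._pending: Deque[bytes] = deque()

    def enqueue(self, sth: bytes) -> None:
        # Avoid queuing the same STH twice.
        if sth not in self._pending:
            self._pending.append(sth)

    def flush(self) -> int:
        """Try each pending item once; keep anything that is not confirmed."""
        delivered = 0
        for _ in range(len(self._pending)):
            sth = self._pending.popleft()
            try:
                confirmed = self._send(sth)
            except OSError:
                # Network failures are treated the same as a missing
                # confirmation.
                confirmed = False
            if confirmed:
                delivered += 1
            else:
                self._pending.append(sth)  # retain for a later attempt
        return delivered

    def pending(self) -> List[bytes]:
        return list(self._pending)
```

   A caller would invoke flush() periodically. Items leave the queue
   only on confirmation, so repeated errors leave the queue unchanged.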

   In cases 3, 4, and 5, we assume that the webserver(s) or trusted
   auditor in question is either experiencing an operational failure or
   being attacked. In both cases, a client SHOULD retain the data for
   later submission (subject to Private Browsing or other history-
   clearing actions taken by the user). This is elaborated upon in
   Section 11.3.

11.2. Proof Fetching Recommendations

   Proof fetching (both inclusion proofs and consistency proofs) should
   be performed at random time intervals. If proof fetching occurred
   all at once, in a flurry of activity, a log would know that SCTs or
   STHs received around the same time are more likely to come from a
   particular client. While proof fetching is required to be done in a
   manner that attempts to be anonymous from the perspective of the log,
   the correlation of activity to a single client would still reveal
   patterns of user behavior we wish to keep confidential. These
   patterns could be recognizable as a single user, or could reveal what
   sites are commonly visited together in the aggregate.

   [ TBD: What other recommendations do we want to make here? We can
   talk more about the inadequacies of DNS... The first paragraph is 80%
   identical between here and above ]

11.3. Record Distribution Recommendations

   In several components of the CT Gossip ecosystem, the recommendation
   is made that data from multiple sources be ingested, mixed, stored
   for an indeterminate period of time, provided (multiple times) to a
   third party, and eventually deleted. The instances of these
   recommendations in this draft are:

   o  When a client receives SCTs during SCT Feedback, it should store
      the SCTs and Certificate Chain for some amount of time, provide
      some of them back to the server at some point, and may eventually
      remove them from its store

   o  When a client receives STHs during STH Pollination, it should
      store them for some amount of time, mix them with other STHs,
      release some of them to various servers at some point, resolve
      some of them to new STHs, and eventually remove them from its
      store

   o  When a server receives SCTs during SCT Feedback, it should store
      them for some period of time, provide them to auditors some number
      of times, and may eventually remove them

   o  When a server receives STHs during STH Pollination, it should
      store them for some period of time, mix them with other STHs,
      provide some of them to connecting clients, may resolve them to
      new STHs via Proof Fetching, and eventually remove them from its
      store

   o  When a Trusted Auditor receives SCTs or historical STHs from
      clients, it should store them for some period of time, mix them
      with SCTs received from other clients, and act upon them at some
      period of time

   Each of these instances has specific requirements for user privacy,
   and each has options that may not be invoked. As one example, an
   HTTPS client should not mix SCTs from server A with SCTs from server
   B and release server B's SCTs to server A. As another example, an
   HTTPS server may choose to resolve STHs to a single more current STH
   via proof fetching, but it is under no obligation to do so.

   These requirements should be met, but the general problem of
   aggregating multiple pieces of data, choosing when and how many to
   release, and when to remove them is shared across all of these
   instances. This problem has previously been considered in the case
   of Mix Networks and Remailers, including papers such as "From a
   Trickle to a Flood: Active Attacks on Several Mix Types", [Y], and
   [Z].

   There are several concerns to be addressed in this area, outlined
   below.

11.3.1. Mixing Algorithm

   When SCTs or STHs are recorded by a participant in CT Gossip and
   later used, it is important that they are selected from the datastore
   in a non-deterministic fashion.

   This is most important for servers, as they can be queried for SCTs
   and STHs anonymously. If the server used a predictable ordering
   algorithm, an attacker could exploit the predictability to learn
   information about a client. One such method would be by observing
   the (encrypted) traffic to a server. When a client of interest
   connects, the attacker makes a note. The attacker observes more
   clients connecting, predicts at what point the client-of-interest's
   data will be disclosed, and ensures that they query the server at
   that point.

   Although most important for servers, random ordering is still
   strongly recommended for clients and Trusted Auditors. The above
   attack can still occur for these entities, although the circumstances
   are less straightforward. For clients, an attacker could observe
   their behavior, note when they receive an STH from a server, and use
   JavaScript to cause a network connection at the correct time to force
   a client to disclose the specific STH. Trusted Auditors are stewards
   of sensitive client data. If an attacker had the ability to observe
   the activities of a Trusted Auditor (perhaps by being a log, or
   another auditor), they could perform the same attack - noting the
   disclosure of data from a client to the Trusted Auditor, and then
   correlating a later disclosure from the Trusted Auditor as coming
   from that client.
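
   As a hedged illustration of non-deterministic selection, the
   following Python sketch picks up to a fixed number of records using
   a partial Fisher-Yates shuffle driven by the operating system's
   cryptographically secure generator. The function name
   select_for_gossip and its parameters are assumptions made for this
   example and are not defined by this document.

```python
import secrets
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_for_gossip(records: Sequence[T], limit: int) -> List[T]:
    """Return up to `limit` distinct records in unpredictable order.

    Hypothetical helper: a partial Fisher-Yates shuffle that uses
    secrets.randbelow() so that every index is drawn without modulo bias.
    """
    pool = list(records)  # copy, so the datastore's own order is untouched
    count = max(0, min(limit, len(pool)))
    for i in range(count):
        # Pick uniformly from the not-yet-chosen tail [i, len(pool)).
        j = i + secrets.randbelow(len(pool) - i)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:count]
```

   Because only the first few positions are shuffled, the cost is
   proportional to the number of records released rather than to the
   size of the datastore.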

   Random ordering can be ensured by several mechanisms. A datastore
   can be shuffled, using an unbiased shuffling algorithm such as
   Fisher-Yates. Alternately, a series of random indexes into the data
   store can be selected (if a collision occurs, a new index is
   selected). A cryptographically secure random number generator must
   be used in either case. If shuffling is performed, the datastore
   must be marked 'dirty' upon item insertion, and at least one shuffle
   operation must occur on a dirty datastore before data is retrieved
   from it for use.

11.3.2. Flushing Attacks

   A flushing attack is an attempt by an adversary to flush a particular
   piece of data from a pool. In the CT Gossip ecosystem, an attacker
   may have performed an attack and left evidence of a compromised log
   on a client or server. They would be interested in flushing that
   data, i.e. tricking the target into gossiping or pollinating the
   incriminating evidence with only attacker-controlled clients or
   servers, in the hope that the target then deletes it.

   Servers are most vulnerable to flushing attacks, as they release
   records to anonymous connections. An attacker can perform a Sybil
   attack - connecting to the server hundreds or thousands of times in
   an attempt to trigger repeated release of a record, and then
   deletion. For this reason, servers must be especially aggressive
   about retaining data for a longer period of time.

   Clients are vulnerable to flushing attacks targeting STHs, as these
   can be given to any cooperating server and an attacker can generally
   induce connections to random servers using JavaScript. It would be
   more difficult to perform a flushing attack against SCTs, as the
   target server must be authenticated (and an attacker impersonating an
   authentic server presents a recursive problem for the attacker).
   Nonetheless, flushing SCTs should not be ruled impossible. A Trusted
   Auditor may also be vulnerable to flushing attacks if it does not
   perform auditing operations itself.

   Flushing attacks are defended against using non-determinism and dummy
   messages. The goal is to ensure that an adversary does not know for
   certain if the data in question has been released or not, and if it
   has been deleted or not.

   [ TBD: At present, we do not have any support for dummy messages. Do
   we want to define a dummy message that clients and servers alike know
   to ignore? Will HTTP Compression leak the presence of >1 dummy
   messages?
   Is it sufficient to define a dummy message as _anything_ with an
   invalid signature? This would negatively impact SCT Feedback servers
   that log all things just in case they're interesting. ]

11.3.3. The Deletion Algorithm

   No entity in CT Gossip is required to delete SCTs or STHs at any
   time, except to respect users' wishes such as private browsing mode
   or clearing history. However, requiring infinite storage space is
   not a desirable characteristic in a protocol, so deletion is
   expected.

   While deletion of SCTs and STHs will occur, proof fetching can ensure
   that any misbehavior from a log will still be detected, even after
   the direct evidence from the attack is deleted. Proof fetching
   ensures that if a log presents a split view for a client, it must
   maintain that split view in perpetuity. An inclusion proof from an
   SCT to an STH does not erase the evidence - the new STH is evidence
   itself. A consistency proof from that STH to a new one likewise -
   the new STH is every bit as incriminating as the first. (Client
   behavior in the situation where an SCT or STH cannot be resolved is
   suggested in Section 11.1.2.) Because of this property, we recommend
   that a client performing proof fetching make every effort not to
   delete an SCT or STH until it has been successfully resolved to a
   new STH via a proof.

   When it is time to delete a record, it is important that the decision
   to do so not be made deterministically. Introducing non-determinism
   in the decision is absolutely necessary to prevent an adversary from
   knowing with certainty that the record has been successfully flushed
   from a target. Therefore, we speak of making a record 'eligible for
   deletion' and then being processed by the 'deletion algorithm'.
   Making a record eligible for deletion simply means that the deletion
   algorithm will be run on it. The deletion algorithm will use a
   probability based system and a secure random number generator to
   determine if the record will be deleted.

   Although the deletion algorithm is specifically designed to be non-
   deterministic, if the record has been resolved via proof to a new STH
   the record may be safely deleted, as long as the new STH is retained.

   The actual deletion algorithm may be [STATISTICS HERE]. [ Something
   as simple as 'Pick an integer securely between 1 and 10. If it's
   greater than 7, delete the record.' Or something more complicated. ]

   [ TODO Enumerating the problems of different types of mixes vs
   Cottrell Mix ]

11.3.3.1. Experimental Algorithms

   More complex algorithms could be inserted at any step. Three
   examples are illustrated:

   SCTs are not eligible to be submitted to an Auditor of Last Resort.
   Therefore, it is more important that they be resolved to STHs and
   reported via SCT Feedback. If fetching an inclusion proof regularly
   fails for a particular SCT, one can require it be reported more times
   than normal via SCT Feedback before becoming eligible for deletion.
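
   The rule just described, combined with the simple draw mentioned in
   Section 11.3.3 ("pick an integer securely between 1 and 10; if it's
   greater than 7, delete the record"), might look like the following
   Python sketch. The function names and the value of MIN_REPORTS are
   assumptions chosen for illustration; the factor of 1.5 mirrors the
   pseudocode in Section 11.3.3.2.4 and is not a tuned recommendation.

```python
import secrets
from typing import Callable

MIN_REPORTS = 3  # assumed threshold; this document does not fix a value


def eligible_for_deletion(num_reports: int, proof_failures: int) -> bool:
    """An SCT whose inclusion proof keeps failing must be reported more often.

    The more proof failures recorded, the more SCT Feedback reports are
    required before the record becomes eligible for deletion.
    """
    return num_reports > MIN_REPORTS and num_reports * 1.5 > proof_failures


def delete_maybe(randbelow: Callable[[int], int] = secrets.randbelow) -> bool:
    """Run the probabilistic deletion draw on an eligible record.

    Draws an integer in 1..10 from a cryptographically secure source and
    returns True (delete) only when the draw is greater than 7.
    """
    return randbelow(10) + 1 > 7
```

   With these numbers an eligible record is deleted on roughly three
   draws out of ten, so an observer cannot know after any single
   release whether the record is gone.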

   Before an item is made eligible for deletion by a client, the client
   could aim to make it difficult for a point-in-time attacker to flush
   the pool by not making an item eligible for deletion until the client
   has moved networks (as seen by either the local IP address, or a
   report-back providing the client with its observed public IP
   address). The HTTPS client could also require reporting over a
   timespan, e.g. it must be reported at least N times, M weeks apart.
   This strategy could be employed always, or only when the client has
   disabled proof fetching and the Auditor of Last Resort, as those two
   mechanisms (when used together) will enable a client to report most
   attacks.

11.3.3.2. Concrete Recommendations

   The recommendations for behavior are:

   o  If proof fetching is enabled, do not delete an SCT until it has
      had a proof resolving it to an STH.

   o  If proof fetching continually fails for an SCT, do not make the
      SCT eligible for deletion until it has been released, multiple
      times, via SCT Feedback.

   o  If proof fetching continually fails for an STH, do not make the
      item eligible for deletion until it has been queued for release to
      an Auditor of Last Resort.

   o  Do not dequeue entries to an Auditor of Last Resort if reporting
      fails. Instead keep the items queued until they have been
      successfully sent.

   o  Use a probability based system, with a cryptographically secure
      random number generator, to determine if an item should be
      deleted.

   o  Select items from the datastores by selecting random indexes into
      the datastore. Use a cryptographically secure random number
      generator.

   [ TBD: More? ]

   We present the following pseudocode as a concrete outline of our
   suggestion.

11.3.3.2.1. STH Data Structures

   The STH class contains data pertaining specifically to the STH
   itself.

   class STH
   {
     uint32   proof_attempts
     uint32   proof_failure_count
     uint32   num_reports_to_thirdparty
     datetime timestamp
     byte[]   data
   }

   The broader STH store itself would contain all the STHs known by an
   entity participating in STH Pollination (either client or server).
   This simplistic view of the class does not take into account the
   complicated locking that would likely be required for a data
   structure being accessed by multiple threads. One thing to note
   about this pseudocode is that it aggressively removes STHs once they
   have been resolved to a newer STH (if proof fetching is configured).
   The only STHs in the store are ones that have never been resolved to
   a newer STH, either because proof fetching does not occur, has
   failed, or because the STH is considered too new to request a proof
   for. It seems less likely that servers will perform proof fetching.
   Therefore, for servers, it is recommended that the various constants
   in use be increased considerably to ensure STHs are pollinated more
   aggressively.

   class STHStore
   {
     STH[] sth_list

     // This function is run after receiving a set of STHs from
     // a third party in response to a pollination submission
     def insert(STH[] new_sths) {
       foreach(new in new_sths) {
         if(this.sth_list.contains(new))
           continue
         this.sth_list.insert(new)
       }
     }

     // This function is called to possibly delete the given STH
     // from the data store
     def delete_maybe(STH s) {
       //Perform statistical test and see if I should delete this STH
     }

     // This function is called to (certainly) delete the given STH
     // from the data store
     def delete_now(STH s) {
       this.sth_list.remove(s)
     }

     // When it is time to perform STH Pollination, the HTTPS Client
     // calls this function to get a selection of STHs to send as
     // feedback
     def get_pollination_selection() {
       // Only STHs inside the validity window may be pollinated.
       // Filtering first also guarantees the loop below terminates.
       fresh = []
       foreach(s in this.sth_list) {
         if(now() - s.timestamp < ONE_WEEK)
           fresh.insert(s)
       }
       if(len(fresh) <= MAX_STH_TO_GOSSIP)
         return fresh

       // randomInt() must be cryptographically secure
       indexes = set()
       modulus = len(fresh)
       while(len(indexes) < MAX_STH_TO_GOSSIP) {
         r = randomInt() % modulus
         if(r not in indexes)
           indexes.insert(r)
       }

       return_selection = []
       foreach(i in indexes) {
         return_selection.insert(fresh[i])
       }
       return return_selection
     }
   }

   We also suggest a function that can be called periodically in the
   background, iterating through the STH store, performing a cleaning
   operation and queuing consistency proofs. This function can live as
   a member function of the STHStore class.

   def clean_list() {
     foreach(sth in this.sth_list) {

       if(now() - sth.timestamp > ONE_WEEK) {
         //STH is too old, we must remove it
         if(proof_fetching_enabled
            && auditor_of_last_resort_enabled
            && sth.proof_attempts > 0
            && (sth.proof_failure_count / sth.proof_attempts)
               > MIN_PROOF_FAILURE_RATIO_CONSIDERED_SUSPICIOUS) {
           queue_sth_for_auditor_of_last_resort(sth)
           delete_maybe(sth)
         } else {
           delete_now(sth)
         }
       }

       else if(proof_fetching_enabled
               && now() - sth.timestamp > TWO_DAYS
               && now() - sth.timestamp > LOG_MMD) {
         sth.proof_attempts++
         queue_consistency_proof(sth, consistency_proof_callback)
       }
     }
   }

11.3.3.2.2. STH Deletion Procedure

   The STH Deletion Procedure is run after successfully submitting a
   list of STHs to a third party during pollination. The following
   pseudocode would be included in the STHStore class, and called with
   the result of get_pollination_selection(), after the STHs have been
   (successfully) sent to the third party.

   // This function is called after successfully pollinating STHs
   // to a third party. It is passed the STHs sent to the third
   // party, which is the output of get_pollination_selection()
   def after_submit_to_thirdparty(STH[] sth_list)
   {
     foreach(sth in sth_list)
     {
       sth.num_reports_to_thirdparty++

       if(proof_fetching_enabled) {
         if(now() - sth.timestamp > LOG_MMD) {
           sth.proof_attempts++
           queue_consistency_proof(sth, consistency_proof_callback)
         }

         // The ratio is a fractional (not integer) division
         if(auditor_of_last_resort_enabled
            && sth.proof_failure_count >
               MIN_PROOF_ATTEMPTS_CONSIDERED_SUSPICIOUS
            && (sth.proof_failure_count / sth.proof_attempts) >
               MIN_PROOF_FAILURE_RATIO_CONSIDERED_SUSPICIOUS) {
           queue_sth_for_auditor_of_last_resort(sth)
         }
       }
       else { //proof fetching not enabled
         if(sth.num_reports_to_thirdparty
            > MIN_STH_REPORTS_TO_THIRDPARTY) {
           delete_maybe(sth)
         }
       }
     }
   }

   def consistency_proof_callback(consistency_proof,
                                  original_sth,
                                  error) {
     if(!error) {
       // Keep the newer STH; the original is no longer needed
       insert([consistency_proof.current_sth])
       delete_now(original_sth)
     } else {
       original_sth.proof_failure_count++
     }
   }

11.3.3.2.3. SCT Data Structures

   [ TBD: This section is not well abstracted to be used for both
   servers and clients. ]

   The SCT class contains data pertaining specifically to the SCT
   itself.

   class SCT
   {
     uint32 proof_attempts
     uint32 proof_failure_count
     bool   has_been_resolved_to_sth
     byte[] data
   }

   The SCT bundle will contain the trusted certificate chain the HTTPS
   client built (chaining to a trusted root certificate). It also
   contains the list of associated SCTs, the exact domain it is
   applicable to, and metadata pertaining to how often it has been
   reported to the third party.

   class SCTBundle
   {
     X509[] certificate_chain
     SCT[]  sct_list
     string domain
     uint32 num_reports_to_thirdparty

     def equals(sct_bundle) {
       if(sct_bundle.domain != this.domain)
         return false
       if(sct_bundle.certificate_chain != this.certificate_chain)
         return false
       if(sct_bundle.sct_list != this.sct_list)
         return false

       return true
     }

     // Same domain and chain, but possibly a different set of SCTs
     def approx_equals(sct_bundle) {
       if(sct_bundle.domain != this.domain)
         return false
       if(sct_bundle.certificate_chain != this.certificate_chain)
         return false

       return true
     }

     // Merge in newly seen SCTs and restart the report counter
     def insert_scts(SCT[] sct_list) {
       this.sct_list = this.sct_list.union(sct_list)
       this.num_reports_to_thirdparty = 0
     }

     def has_been_fully_resolved_to_sths() {
       foreach(s in this.sct_list) {
         if(!s.has_been_resolved_to_sth)
           return false
       }
       return true
     }

     def max_proof_failure_count() {
       uint32 max = 0
       foreach(s in this.sct_list) {
         if(s.proof_failure_count > max)
           max = s.proof_failure_count
       }
       return max
     }
   }

   We suppose a large data structure is used, such as a hashmap, indexed
   by the domain name. For each domain, the structure will contain a
   data structure that holds the SCTBundles seen for that domain, as
   well as encapsulating some logic relating to SCT Feedback for that
   particular domain.

   class SCTStore
   {
     string      domain
     datetime    last_contact_for_domain
     uint32      num_submissions_attempted
     uint32      num_submissions_succeeded
     SCTBundle[] observed_records

     // This function is called after receiving an SCTBundle.
     // For Clients, this is after a successful connection to a
     // HTTPS Server, calling this function with an SCTBundle
     // constructed from that certificate chain and SCTs
     // For Servers, this is after receiving SCT Feedback
     def insert(SCTBundle b) {
       if(operator_is_server) {
         if(!passes_validity_checks(b))
           return
       }
       foreach(e in this.observed_records) {
         if(e.equals(b))
           return
         else if(e.approx_equals(b)) {
           e.insert_scts(b.sct_list)
           return
         }
       }
       this.observed_records.insert(b)
     }

     // When it is time to perform SCT Feedback, the HTTPS Client
     // calls this function to get a selection of SCTBundles to send
     // as feedback
     def get_gossip_selection() {
       if(len(this.observed_records) > MAX_SCT_RECORDS_TO_GOSSIP) {
         // randomInt() must be cryptographically secure
         indexes = set()
         modulus = len(this.observed_records)
         while(len(indexes) < MAX_SCT_RECORDS_TO_GOSSIP) {
           r = randomInt() % modulus
           if(r not in indexes)
             indexes.insert(r)
         }
         return_selection = []
         foreach(i in indexes) {
           return_selection.insert(this.observed_records[i])
         }

         return return_selection
       }
       else
         return this.observed_records
     }

     def delete_maybe(SCTBundle b) {
       //Perform statistical test and see if I should delete this bundle
     }

     def delete_now(SCTBundle b) {
       this.observed_records.remove(b)
     }

     def passes_validity_checks(SCTBundle b) {
       // This function performs the validity checks specified in
       // {{feedback-srvop}}
     }
   }

   We also suggest a function that can be called periodically in the
   background, iterating through all SCTStore objects in the large
   hashmap (here called 'all_sct_stores') and removing old data.

   def clear_old_data()
   {
     foreach(storeEntry in all_sct_stores)
     {
       // The domain has never accepted feedback: stop trying
       if(storeEntry.num_submissions_succeeded == 0
          && storeEntry.num_submissions_attempted
             > MIN_SCT_ATTEMPTS_FOR_DOMAIN_TO_BE_IGNORED)
       {
         all_sct_stores.remove(storeEntry)
       }
       // The domain has not been contacted for a long time
       else if(storeEntry.num_submissions_succeeded > 0
               && now() - storeEntry.last_contact_for_domain
                  > TIME_UNTIL_OLD_SCTDATA_ERASED)
       {
         all_sct_stores.remove(storeEntry)
       }
     }
   }

11.3.3.2.4. SCT Deletion Procedure

   The SCT Deletion procedure is more complicated than the respective
   STH procedure. This is because servers may elect not to participate
   in SCT Feedback, and this must be accounted for by being more
   conservative in sending SCT reports to them.

   The following pseudocode would be included in the SCTStore class, and
   called with the result of get_gossip_selection() after the SCT
   Feedback has been sent (successfully) to the server. We also note
   that the first experimental algorithm from above is included in the
   pseudocode as an illustration.

   // This function is called after successfully providing SCT Feedback
   // to a server. It is passed the feedback sent to the server, which
   // is the output of get_gossip_selection()
   def after_submit_to_thirdparty(SCTBundle[] submittedBundles)
   {
     foreach(bundle in submittedBundles)
     {
       bundle.num_reports_to_thirdparty++

       if(proof_fetching_enabled) {
         if(!bundle.has_been_fully_resolved_to_sths()) {
           foreach(s in bundle.sct_list) {
             if(!s.has_been_resolved_to_sth) {
               s.proof_attempts++
               queue_inclusion_proof(s, inclusion_proof_callback)
             }
           }
         }
         else {
           if(run_ct_gossip_experiment_one) {
             if(bundle.num_reports_to_thirdparty
                > MIN_SCT_REPORTS_TO_THIRDPARTY
                && bundle.num_reports_to_thirdparty * 1.5
                   > bundle.max_proof_failure_count()) {
               delete_maybe(bundle)
             }
           }
           else { //Do not run experiment
             if(bundle.num_reports_to_thirdparty
                > MIN_SCT_REPORTS_TO_THIRDPARTY) {
               delete_maybe(bundle)
             }
           }
         }
       }
       else { //proof fetching not enabled
         if(bundle.num_reports_to_thirdparty
            > (MIN_SCT_REPORTS_TO_THIRDPARTY
               * NO_PROOF_FETCHING_REPORT_INCREASE_FACTOR)) {
           delete_maybe(bundle)
         }
       }
     }
   }

   // This function is a callback invoked after an inclusion proof
   // has been retrieved
   def inclusion_proof_callback(inclusion_proof, original_sct, error)
   {
     if(!error) {
       original_sct.has_been_resolved_to_sth = true
       insert_to_sth_datastore(inclusion_proof.new_sth)
     } else {
       original_sct.proof_failure_count++
     }
   }

12. IANA considerations

   [ TBD ]

13. Contributors

   The authors would like to thank the following contributors for
   valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Josef
   Gustafsson, Karen Seo, Magnus Ahltorp, Stephen Kent, Yan Zhu.

14. ChangeLog

14.1. Changes between ietf-01 and ietf-02

   o  Requiring full certificate chain in SCT Feedback.

   o  Clarifications on what clients store for and send in SCT Feedback
      added.

   o  SCT Feedback server operation updated to protect against DoS
      attacks on servers.

   o  Pre-Loaded vs Locally Added Anchors explained.

   o  Base for well-known URLs changed.

   o  Remove all mentions of monitors - gossip deals with auditors.

   o  New sections added: Trusted Auditor protocol, attacks by actively
      malicious log, the Dual-CA compromise attack, policy
      recommendations.

14.2. Changes between ietf-00 and ietf-01

   o  Improve language and readability based on feedback from Stephen
      Kent.

   o  STH Pollination Proof Fetching defined and indicated as optional.

   o  3-Method Ecosystem section added.

   o  Cases with Logs ceasing operation handled.

   o  Text on tracking via STH Interaction added.

   o  Section with some early recommendations for mixing added.

   o  Section detailing blocking connections, frustrating it, and the
      implications added.

14.3. Changes between -01 and -02

   o  STH Pollination defined.

   o  Trusted Auditor Relationship defined.

   o  Overview section rewritten.

   o  Data flow picture added.

   o  Section on privacy considerations expanded.

14.4. Changes between -00 and -01

   o  Add the SCT feedback mechanism: Clients send SCTs to originating
      web server which shares them with auditors.

   o  Stop assuming that clients see STHs.

   o  Don't use HTTP headers but instead .well-known URLs - avoid that
      battle.

   o  Stop referring to trans-gossip and trans-gossip-transport-https -
      too complicated.

   o  Remove all protocols but HTTPS in order to simplify - let's come
      back and add more later.

   o  Add more reasoning about privacy.

   o  Do specify data formats.

15. References

15.1. Normative References

   [RFC-6962-BIS-09]
              Laurie, B., Langley, A., Kasper, E., Messeri, E., and R.
              Stradling, "Certificate Transparency", October 2015.

   [RFC7159]  Bray, T., "The JavaScript Object Notation (JSON) Data
              Interchange Format", RFC 7159, March 2014.

15.2. Informative References

   [draft-ietf-trans-threat-analysis-03]
              Kent, S., "Attack Model and Threat for Certificate
              Transparency", October 2015.

Authors' Addresses

   Linus Nordberg
   NORDUnet

   Email: linus@nordu.net

   Daniel Kahn Gillmor
   ACLU

   Email: dkg@fifthhorseman.net

   Tom Ritter

   Email: tom@ritter.vg