idnits 2.17.1 

draft-ietf-trans-gossip-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 39 instances of too long lines in the document, the longest
     one being 60 characters in excess of 72.

  == There are 4 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 360: '...tension.  The client MUST discard SCTs...'
     RFC 2119 keyword, line 361: '...own to the client and SHOULD store the...'
     RFC 2119 keyword, line 367: '...ed on the client MUST be keyed by the ...'
     RFC 2119 keyword, line 368: '...contacted.  They MUST NOT be sent to a...'
     RFC 2119 keyword, line 371: '...mple.com.)  They MUST NOT be sent to a...'
     (67 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1800 has weird spacing: '...   bool   has_...'

  == Line 1801 has weird spacing: '...   bool   proo...'

  == Line 1893 has weird spacing: '...h later    num...'

  == Line 1901 has weird spacing: '...h later    num...'

  == Line 1903 has weird spacing: '...h later    num...'

  == (9 more instances...)

  -- The document date (July 08, 2016) is 2849 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 293

  -- Looks like a reference, but probably isn't: '2' on line 295

  -- Looks like a reference, but probably isn't: '3' on line 297

  ** Obsolete normative reference: RFC 6962 (Obsoleted by RFC 9162)

  ** Obsolete normative reference: RFC 7159 (Obsoleted by RFC 8259)


     Summary: 4 errors (**), 0 flaws (~~), 8 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	TRANS                                                        L. Nordberg
3	Internet-Draft                                                  NORDUnet
4	Intended status: Experimental                                 D. Gillmor
5	Expires: January 9, 2017                                            ACLU
6	                                                               T. Ritter

8	                                                           July 08, 2016

10	                            Gossiping in CT
11	                       draft-ietf-trans-gossip-03

13	Abstract

15	   The logs in Certificate Transparency are untrusted in the sense that
16	   the users of the system don't have to trust that they behave
17	   correctly since the behavior of a log can be verified to be correct.

19	   This document tries to solve the problem with logs presenting a
20	   "split view" of their operations.  It describes three gossiping
21	   mechanisms for Certificate Transparency: SCT Feedback, STH
22	   Pollination and Trusted Auditor Relationship.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on January 9, 2017.

41	Copyright Notice

43	   Copyright (c) 2016 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction
59	   2.  Defining the problem
60	   3.  Overview
61	   4.  Terminology
62	     4.1.  Pre-Loaded vs Locally Added Anchors
63	   5.  Who gossips with whom
64	   6.  What to gossip about and how
65	   7.  Data flow
66	   8.  Gossip Mechanisms
67	     8.1.  SCT Feedback
68	       8.1.1.  SCT Feedback data format
69	       8.1.2.  HTTPS client to server
70	       8.1.3.  HTTPS server operation
71	       8.1.4.  HTTPS server to auditors
72	     8.2.  STH pollination
73	       8.2.1.  HTTPS Clients and Proof Fetching
74	       8.2.2.  STH Pollination without Proof Fetching
75	       8.2.3.  Auditor Action
76	       8.2.4.  STH Pollination data format
77	     8.3.  Trusted Auditor Stream
78	       8.3.1.  Trusted Auditor data format
79	   9.  3-Method Ecosystem
80	     9.1.  SCT Feedback
81	     9.2.  STH Pollination
82	     9.3.  Trusted Auditor Relationship
83	     9.4.  Interaction
84	   10. Security considerations
85	     10.1.  Attacks by actively malicious logs
86	     10.2.  Dual-CA Compromise
87	     10.3.  Censorship/Blocking considerations
88	     10.4.  Flushing Attacks
89	       10.4.1.  STHs
90	       10.4.2.  SCTs & Certificate Chains on HTTPS Servers
91	       10.4.3.  SCTs & Certificate Chains on HTTPS Clients
92	     10.5.  Privacy considerations
93	       10.5.1.  Privacy and SCTs
94	       10.5.2.  Privacy in SCT Feedback
95	       10.5.3.  Privacy for HTTPS clients performing STH Proof
96	                Fetching

98	       10.5.4.  Privacy in STH Pollination
99	       10.5.5.  Privacy in STH Interaction
100	       10.5.6.  Trusted Auditors for HTTPS Clients
101	       10.5.7.  HTTPS Clients as Auditors
102	   11. Policy Recommendations
103	     11.1.  Blocking Recommendations
104	       11.1.1.  Frustrating blocking
105	       11.1.2.  Responding to possible blocking
106	     11.2.  Proof Fetching Recommendations
107	     11.3.  Record Distribution Recommendations
108	       11.3.1.  Mixing Algorithm
109	       11.3.2.  The Deletion Algorithm
110	     11.4.  Concrete Recommendations
111	       11.4.1.  STH Pollination
112	       11.4.2.  SCT Feedback
113	   12. IANA considerations
114	   13. Contributors
115	   14. ChangeLog
116	     14.1.  Changes between ietf-02 and ietf-03
117	     14.2.  Changes between ietf-01 and ietf-02
118	     14.3.  Changes between ietf-00 and ietf-01
119	     14.4.  Changes between -01 and -02
120	     14.5.  Changes between -00 and -01
121	   15. References
122	     15.1.  Normative References
123	     15.2.  Informative References
124	   Authors' Addresses

126	1.  Introduction

128	   The purpose of the protocols in this document, collectively referred
129	   to as CT Gossip, is to detect certain misbehavior by CT logs.  In
130	   particular, CT Gossip aims to detect logs that are providing
131	   inconsistent views to different log clients, and logs failing to
132	   include submitted certificates within the time period stipulated by
133	   MMD.

135	   One of the major challenges of any gossip protocol is limiting damage
136	   to user privacy.  The goal of CT gossip is to publish and distribute
137	   information about the logs and their operations, but not to expose
138	   any additional information about the operation of any of the other
139	   participants.  Privacy of consumers of log information (in
140	   particular, of web browsers and other TLS clients) should not be
141	   undermined by gossip.

143	   This document presents three different, complementary mechanisms for
144	   non-log elements of the CT ecosystem to exchange information about
145	   logs in a manner that preserves the privacy of HTTPS clients.  They
146	   should provide protective benefits for the system as a whole even if
147	   their adoption is not universal.

149	2.  Defining the problem

151	   When a log provides different views of the log to different clients
152	   this is described as a partitioning attack.  Each client would be
153	   able to verify the append-only nature of the log but, in the extreme
154	   case, each client might see a unique view of the log.

156	   The CT logs are public, append-only and untrusted and thus have to be
157	   audited for consistency, i.e., they should never rewrite history.
158	   Additionally, auditors and other log clients need to exchange
159	   information about logs in order to be able to detect a partitioning
160	   attack (as described above).

162	   Gossiping about log behavior helps address the problem of detecting
163	   malicious or compromised logs with respect to a partitioning attack.
164	   We want some side of the partitioned tree, and ideally both sides, to
165	   see the other side.

167	   Disseminating information about a log poses a potential threat to the
168	   privacy of end users.  Some data of interest (e.g., SCTs) is linkable
169	   to specific log entries and thereby to specific websites, which makes
170	   sharing them with others a privacy concern.  Gossiping about this
171	   data has to take privacy considerations into account in order not to
172	   expose associations between users of the log (e.g., web browsers) and
173	   certificate holders (e.g., web sites).  Even sharing STHs (which do
174	   not link to specific log entries) can be problematic - user tracking
175	   by fingerprinting through rare STHs is one potential attack (see
176	   Section 8.2).

178	3.  Overview

180	   This document presents three gossiping mechanisms: SCT Feedback, STH
181	   Pollination, and a Trusted Auditor Relationship.

183	   SCT Feedback enables HTTPS clients to share Signed Certificate
184	   Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS-09]) with CT auditors
185	   in a privacy-preserving manner by sending SCTs to originating HTTPS
186	   servers, who in turn share them with CT auditors.

188	   In STH Pollination, HTTPS clients use HTTPS servers as pools to share
189	   Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS-09]) with
190	   other connecting clients in the hope that STHs will find their way to
191	   CT auditors.

193	   HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs
194	   with trusted CT auditors directly, with expectations of privacy
195	   sensitive data being handled according to whatever privacy policy is
196	   agreed on between client and trusted party.

198	   Despite the privacy risks with sharing SCTs there is no loss in
199	   privacy if a client sends SCTs for a given site to the site
200	   corresponding to the SCT.  This is because the site's logs would
201	   already indicate that the client is accessing that site.  In this way
202	   a site can accumulate records of SCTs that have been issued by
203	   various logs for that site, providing a consolidated repository of
204	   SCTs that could be shared with auditors.  Auditors can use this
205	   information to detect a misbehaving log that fails to include a
206	   certificate within the time period stipulated by its MMD metadata.

208	   Sharing an STH is considered reasonably safe from a privacy
209	   perspective as long as the same STH is shared by a large number of
210	   other log clients.  This safety in numbers can be achieved by only
211	   allowing gossiping of STHs issued in a certain window of time, while
212	   also refusing to gossip about STHs from logs with too high an STH
213	   issuance frequency (see Section 8.2).

215	4.  Terminology

217	   This document relies on terminology and data structures defined in
218	   [RFC-6962-BIS-09], including MMD, STH, SCT, Version, LogID, SCT
219	   timestamp, CtExtensions, SCT signature, Merkle Tree Hash.

221	   This document relies on terminology defined in
222	   [draft-ietf-trans-threat-analysis-03], including Auditing.

224	4.1.  Pre-Loaded vs Locally Added Anchors

226	   Throughout the document, we refer to both Trust Anchors (Certificate
227	   Authorities) and Logs.  Both Logs and Trust Anchors may be locally
228	   added by an administrator.  Unless otherwise clarified, in both cases
229	   we refer to the set of Trust Anchors and Logs that come pre-loaded
230	   and pre-trusted in a piece of client software.

232	5.  Who gossips with whom

234	   o  HTTPS clients and servers (SCT Feedback and STH Pollination)

236	   o  HTTPS servers and CT auditors (SCT Feedback and STH Pollination)

238	   o  CT auditors (Trusted Auditor Relationship)
239	   Additionally, some HTTPS clients may engage with an auditor whom they
240	   trust with their privacy:

242	   o  HTTPS clients and CT auditors (Trusted Auditor Relationship)

244	6.  What to gossip about and how

246	   There are three separate gossip streams:

248	   o  SCT Feedback - transporting SCTs and certificate chains from HTTPS
249	      clients to CT auditors via HTTPS servers.

251	   o  STH Pollination - HTTPS clients and CT auditors using HTTPS
252	      servers as STH pools for exchanging STHs.

254	   o  Trusted Auditor Stream - HTTPS clients communicating directly with
255	      trusted CT auditors sharing SCTs, certificate chains and STHs.

257	   It is worthwhile to note that when an HTTPS client or CT auditor
258	   interacts with a log, they may equivalently interact with a log
259	   mirror or cache that replicates the log.

261	7.  Data flow

263	   The following picture shows how certificates, SCTs and STHs flow
264	   through a CT system with SCT Feedback and STH Pollination.  It does
265	   not show what goes in the Trusted Auditor Relationship stream.

267	      +- Cert ---- +----------+
268	      |            |    CA    | ----------+
269	      |   + SCT -> +----------+           |
270	      v   |                           Cert [& SCT]
271	   +----------+                           |
272	   |   Log    | ---------- SCT -----------+
273	   +----------+                           v
274	     |  ^                          +----------+
275	     |  |         SCTs & Certs --- | Website  |
276	     |  |[1]           |           +----------+
277	     |  |[2]         STHs            ^     |
278	     |  |[3]           v             |     |
279	     |  |          +----------+      |     |
280	     |  +--------> | Auditor  |      |  HTTPS traffic
281	     |             +----------+      |     |
282	    STH                              |  SCT & Cert
283	     |                        SCTs & Certs |
284	   Log entries                       |     |
285	     |                             STHs   STHs
286	     v                               |     |
287	   +----------+                      |     v
288	   | Monitor  |                    +----------+
289	   +----------+                    | Browser  |
290	                                   +----------+

292	   #   Auditor                        Log
293	   [1] |--- get-sth ------------------->|
294	       |<-- STH ------------------------|
295	   [2] |--- leaf hash + tree size ----->|
296	       |<-- index + inclusion proof --->|
297	   [3] |--- tree size 1 + tree size 2 ->|
298	       |<-- consistency proof ----------|

300	8.  Gossip Mechanisms

302	8.1.  SCT Feedback

304	   The goal of SCT Feedback is for clients to share SCTs and certificate
305	   chains with CT auditors while still preserving the privacy of the end
306	   user.  The sharing of SCTs contributes to the overall goal of
307	   detecting misbehaving logs by providing auditors with SCTs from many
308	   vantage points, making it more likely to catch a violation of a log's
309	   MMD or a log presenting inconsistent views.  The sharing of
310	   certificate chains is beneficial to HTTPS server operators interested
311	   in direct feedback from clients for detecting bogus certificates
312	   issued in their name and therefore incentivizes server operators to
313	   take part in SCT Feedback.

315	   SCT Feedback is the most privacy-preserving gossip mechanism, as it
316	   does not directly expose any links between an end user and the sites
317	   they've visited to any third party.

319	   HTTPS clients store SCTs and certificate chains they see, and later
320	   send them to the originating HTTPS server by posting them to a well-
321	   known URL (associated with that server), as described in
322	   Section 8.1.2.  Note that clients will send the same SCTs and chains
323	   to a server multiple times with the assumption that any man-in-the-
324	   middle attack eventually will cease, and an honest server will
325	   eventually receive collected malicious SCTs and certificate chains.

327	   HTTPS servers store SCTs and certificate chains received from
328	   clients, as described in Section 8.1.3.  They later share them with
329	   CT auditors by either posting them to auditors or making them
330	   available via a well-known URL.  This is described in Section 8.1.4.

332	8.1.1.  SCT Feedback data format

334	   The data shared between HTTPS clients and servers, as well as between
335	   HTTPS servers and CT auditors, is a JSON array [RFC7159].  Each item
336	   in the array is a JSON object with the following content:

338	   o  x509_chain: An array of PEM-encoded X.509 certificates.  The first
339	      element is the end-entity certificate, the second certifies the
340	      first and so on.

342	   o  sct_data: An array of objects consisting of the base64
343	      representation of the binary SCT data as defined in
344	      [RFC-6962-BIS-09] Section 3.3.

346	   We will refer to this object as 'sct_feedback'.

348	   When sent by an HTTPS client, the x509_chain element always contains
349	   a full chain from a leaf certificate to a self-signed trust anchor.

351	   See Section 8.1.2 for details on what the sct_data element contains
352	   as well as more details about the x509_chain element.
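
   As a non-normative illustration, the following Python sketch builds
   an sct_feedback object and serializes a list of such objects as a
   JSON array.  The helper names (make_sct_feedback, encode_feedback)
   and the placeholder certificate and SCT values are assumptions made
   for this example.  In particular, representing each sct_data item as
   a plain base64 string is one reading of the description above, not
   something this document fixes.

```python
import base64
import json


def make_sct_feedback(pem_chain, raw_scts):
    """Build one sct_feedback object (hypothetical helper).

    pem_chain: list of PEM strings, end-entity certificate first.
    raw_scts:  list of bytes objects, each holding one binary SCT.
    """
    return {
        "x509_chain": list(pem_chain),
        # Assumption: each sct_data item is the base64 text of one SCT.
        "sct_data": [base64.b64encode(s).decode("ascii") for s in raw_scts],
    }


def encode_feedback(objects):
    """Serialize sct_feedback objects as the JSON array that is exchanged."""
    return json.dumps(list(objects))


# Placeholder values only; these are not real certificates or SCTs.
example_chain = [
    "-----BEGIN CERTIFICATE-----\n(end-entity)\n-----END CERTIFICATE-----",
    "-----BEGIN CERTIFICATE-----\n(issuer)\n-----END CERTIFICATE-----",
]
example_body = encode_feedback([make_sct_feedback(example_chain, [b"\x00\x01"])])
```

   The resulting example_body is a JSON array with a single object that
   carries the two keys x509_chain and sct_data.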

354	8.1.2.  HTTPS client to server

356	   When an HTTPS client connects to an HTTPS server, the client receives
357	   a set of SCTs as part of the TLS handshake.  SCTs are included in the
358	   TLS handshake using one or more of the three mechanisms described in
359	   [RFC-6962-BIS-09] section 3.4 - in the server certificate, in a TLS
360	   extension, or in an OCSP extension.  The client MUST discard SCTs
361	   that are not signed by a log known to the client and SHOULD store the
362	   remaining SCTs together with a locally constructed certificate chain
363	   which is trusted (i.e., terminating in a pre-loaded or locally
364	   installed Trust Anchor) in an sct_feedback object or equivalent data
365	   structure for later use in SCT Feedback.

367	   The SCTs stored on the client MUST be keyed by the exact domain name
368	   the client contacted.  They MUST NOT be sent to any domain not
369	   matching the original domain (e.g. if the original domain is
370	   sub.example.com they must not be sent to sub.sub.example.com or to
371	   example.com.)  They MUST NOT be sent to other domains listed as
372	   Subject Alternative Names in the certificate.  In the case of
373	   certificates that validate multiple domain names, the same SCT is
374	   expected to be stored once under each domain name contacted.

376	   Not following these constraints would increase the risk for two types
377	   of privacy breaches.  First, the HTTPS server receiving the SCT would
378	   learn about other sites visited by the HTTPS client.  Second,
379	   auditors receiving SCTs from the HTTPS server would learn information
380	   about other HTTPS servers visited by its clients.

382	   If the client later again connects to the same HTTPS server, it again
383	   receives a set of SCTs and calculates a certificate chain, and again
384	   creates an sct_feedback or similar object.  If this object does not
385	   exactly match an existing object in the store, then the client MUST
386	   add this new object to the store, associated with the exact domain
387	   name contacted, as described above.  An exact comparison is needed to
388	   ensure that attacks involving alternate chains are detected.  An
389	   example of such an attack is described in
390	   [dual-ca-compromise-attack].  However, at least one optimization is
391	   safe and MAY be performed: If the certificate chain exactly matches
392	   an existing certificate chain, the client MAY store the union of the
393	   SCTs from the two objects in the first (existing) object.
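
   The storage rules above can be sketched as follows.  This is a
   minimal, non-normative Python illustration: the class and method
   names are hypothetical, certificates and SCTs are treated as opaque
   strings, and the sketch always applies the optional union-of-SCTs
   optimization when the chain matches exactly.

```python
class SctFeedbackStore:
    """Client-side store of sct_feedback objects, keyed by exact domain.

    Hypothetical illustration; not an API defined by this document.
    """

    def __init__(self):
        self._by_domain = {}  # exact domain name -> list of feedback dicts

    def add(self, domain, x509_chain, sct_data):
        entries = self._by_domain.setdefault(domain, [])
        for entry in entries:
            if entry["x509_chain"] == list(x509_chain):
                # Exactly the same chain: merge any new SCTs into the
                # existing object (the optimization the text permits).
                for sct in sct_data:
                    if sct not in entry["sct_data"]:
                        entry["sct_data"].append(sct)
                return entry
        # A different chain is always stored as a separate object so that
        # alternate chains remain visible.
        entry = {"x509_chain": list(x509_chain), "sct_data": list(sct_data)}
        entries.append(entry)
        return entry

    def pending_for(self, domain):
        """Return feedback stored for exactly this domain name only."""
        return list(self._by_domain.get(domain, []))
```

   Looking up by the exact domain name means that feedback stored for
   sub.example.com is never returned for example.com or for
   sub.sub.example.com.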

395	   If the client does connect to the same HTTPS server a subsequent
396	   time, it MUST send to the server sct_feedback objects in the store
397	   that are associated with that domain name.  However, it is not
398	   necessary to send an sct_feedback object constructed from the current
399	   TLS session, and if the client does so, it MUST NOT be marked as sent
400	   in any internal tracking done by the client.

402	   Refer to Section 11.3 for recommendations for implementation.

404	   Because SCTs can be used as a tracking mechanism (see
405	   Section 10.5.2), they deserve special treatment when they are
406	   received from (and provided to) domains that are loaded as
407	   subresources from an origin domain.  Such domains are commonly called
408	   'third party domains'.  An HTTPS client SHOULD store SCT Feedback
409	   using a 'double-keying' approach, which isolates third party domains
410	   by the first party domain.  This is described in [double-keying].

412	   Gossip would be performed normally for third party domains only when
413	   the user revisits the first party domain.  In lieu of 'double-
414	   keying', an HTTPS client MAY treat SCT Feedback in the same manner it
415	   treats other security mechanisms that can enable tracking (such as
416	   HSTS and HPKP).
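
   One possible reading of the 'double-keying' approach is sketched
   below in Python.  It is a non-normative illustration under the
   assumption that feedback is indexed by the pair (first party domain,
   contacted domain); the class and method names are hypothetical and
   [double-keying] remains the authoritative description.

```python
class DoubleKeyedStore:
    """SCT Feedback storage isolated by first party domain (hypothetical)."""

    def __init__(self):
        # (first_party, contacted_domain) -> list of feedback objects
        self._entries = {}

    def record(self, first_party, contacted_domain, feedback):
        key = (first_party, contacted_domain)
        self._entries.setdefault(key, []).append(feedback)

    def pending_for(self, first_party, contacted_domain):
        # Feedback gathered under one first party is never released while
        # the user is visiting a different first party.
        return list(self._entries.get((first_party, contacted_domain), []))
```

   With this layout, SCTs seen for a third party domain while visiting
   one site are only offered back to that third party when the user
   revisits the same first party domain.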

418	   If the HTTPS client has configuration options for not sending cookies
419	   to third parties, SCTs of third parties MUST be treated as cookies
420	   with respect to this setting.  This prevents third party tracking
421	   through the use of SCTs/certificates, which would bypass the cookie
422	   policy.  For domains that are only loaded as third party domains, the
423	   client may never perform SCT Feedback; however the client may perform
424	   STH Pollination after fetching an inclusion proof, as specified in
425	   Section 8.2.

427	   SCTs and corresponding certificates are POSTed to the originating
428	   HTTPS server at the well-known URL:

430	   https://<domain>/.well-known/ct-gossip/v1/sct-feedback

432	   The data sent in the POST is defined in Section 8.1.1.  This data
433	   SHOULD be sent in an already-established TLS session.  This makes it
434	   hard for an attacker to disrupt SCT Feedback without also disturbing
435	   ordinary secure browsing (https://).  This is discussed more in
436	   Section 11.1.1.

438	   The HTTPS server SHOULD respond with an HTTP 200 response code and an
439	   empty body if it was able to process the request.  An HTTPS client
440	   that receives any other response SHOULD consider it an error.
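
   The submission step can be sketched as follows.  This non-normative
   Python example only constructs the POST request and interprets the
   response code; it does not send anything.  The function names and
   the Content-Type header are assumptions of this example, and reuse
   of the already-established TLS session is left to the client's HTTP
   stack.

```python
import json
import urllib.request


def feedback_url(domain):
    """Well-known SCT Feedback URL for the exact domain contacted."""
    return "https://" + domain + "/.well-known/ct-gossip/v1/sct-feedback"


def build_feedback_request(domain, feedback_objects):
    """Build (but do not send) the POST carrying sct_feedback objects."""
    body = json.dumps(list(feedback_objects)).encode("utf-8")
    return urllib.request.Request(
        feedback_url(domain),
        data=body,
        method="POST",
        # Assumption: the text does not name a media type for the body.
        headers={"Content-Type": "application/json"},
    )


def feedback_accepted(status_code):
    """Only HTTP 200 counts as success; anything else is an error."""
    return status_code == 200
```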

442	   Some clients have trust anchors or logs that are locally added (e.g.
443	   by an administrator or by the user themselves).  These additions are
444	   potentially privacy-sensitive because they can carry information
445	   about the specific configuration, computer, or user.

447	   Certificates validated by locally added trust anchors will commonly
448	   have no SCTs associated with them, so in this case no action is
449	   needed with respect to CT Gossip.  SCTs issued by locally added logs
450	   MUST NOT be reported via SCT Feedback.

452	   If a certificate is validated by SCTs that are issued by publicly
453	   trusted logs, but chains to a local trust anchor, the client MAY
454	   perform SCT Feedback for this SCT and certificate chain bundle.  If
455	   it does so, the client MUST include the full chain of certificates
456	   chaining to the local trust anchor in the x509_chain array.
457	   Performing SCT Feedback in this scenario may be advantageous for the
458	   broader internet and CT ecosystem, but may also disclose information
459	   about the client.  If the client elects to omit SCT Feedback, it can
460	   choose to perform STH Pollination after fetching an inclusion proof,
461	   as specified in Section 8.2.

463	   We require the client to send the full chain (or nothing at all) for
464	   two reasons.  Firstly, it simplifies the operation on the server if
465	   there are not two code paths.  Secondly, omitting the chain does not
466	   actually preserve user privacy.  The Issuer field in the certificate
467	   describes the signing certificate.  And if the certificate is being
468	   submitted at all, it means the certificate is logged, and has SCTs.
469	   This means that the Issuer can be queried and obtained from the log,
470	   so omitting the signing certificate from the client's submission does
471	   not actually help user privacy.

473	8.1.3.  HTTPS server operation

475	   An HTTPS server operates in one of, broadly, two modes, depending on
476	   how it is configured.  In the simpler mode, the server will only
477	   track leaf certificates and SCTs applicable to those leaf
478	   certificates.  In the more complex mode, the server will confirm the
479	   client's chain validation and store the certificate chain.  The
480	   latter mode requires more configuration, but is necessary to prevent
481	   denial of service (DoS) attacks on the server's storage space.

483	   In the simple mode of operation, upon receiving a submission at the
484	   sct-feedback well-known URL, an HTTPS server will perform a set of
485	   operations, checking on each sct_feedback object before storing it:

487	   1.  the HTTPS server MAY modify the sct_feedback object, and discard
488	       all items in the x509_chain array except the first item (which is
489	       the end-entity certificate)

491	   2.  if a bit-wise compare of the sct_feedback object matches one
492	       already in the store, this sct_feedback object SHOULD be
493	       discarded

495	   3.  if the leaf cert is not for a domain for which the server is
496	       authoritative, the SCT MUST be discarded

498	   4.  if an SCT in the sct_data array can't be verified to be a valid
499	       SCT for the accompanying leaf cert, and issued by a known log,
500	       the individual SCT SHOULD be discarded
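
   The four steps above can be sketched as follows.  This is a
   non-normative Python illustration under several assumptions: the
   store is a plain list, object equality stands in for the bit-wise
   comparison, and the two callables is_authoritative_leaf and
   sct_is_valid are hypothetical hooks for logic this document does not
   define.  Dropping an object whose SCTs all fail validation, and the
   second duplicate check, are choices of this sketch.

```python
def process_simple(feedback, store, is_authoritative_leaf, sct_is_valid):
    """Apply the simple-mode checks to one sct_feedback object.

    Returns the stored object, or None if the submission was discarded.
    """
    if not feedback.get("x509_chain"):
        return None  # nothing to examine

    # Step 1: keep only the end-entity certificate.
    leaf = feedback["x509_chain"][0]
    candidate = {"x509_chain": [leaf], "sct_data": list(feedback["sct_data"])}

    # Step 2: discard exact duplicates of an object already in the store.
    if candidate in store:
        return None

    # Step 3: discard if the server is not authoritative for the leaf.
    if not is_authoritative_leaf(leaf):
        return None

    # Step 4: discard individual SCTs that cannot be verified.
    candidate["sct_data"] = [
        sct for sct in candidate["sct_data"] if sct_is_valid(leaf, sct)
    ]
    if not candidate["sct_data"]:
        return None  # assumption: an object with no valid SCTs is dropped

    # Extra guard (not in the text): filtering may have produced a duplicate.
    if candidate in store:
        return None

    store.append(candidate)
    return candidate
```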

502	   The modification in step number 1 is necessary to prevent a malicious
503	   client from exhausting the server's storage space.  A client can
504	   generate their own issuing certificate authorities, and create an
505	   arbitrary number of chains that terminate in an end-entity
506	   certificate with an existing SCT.  By discarding all but the end-
507	   entity certificate, we prevent a simple HTTPS server from storing
508	   this data.  Note that operation in this mode will not prevent the
509	   attack described in [dual-ca-compromise-attack].  Skipping this step
510	   requires additional configuration as described below.

512	   The check in step 2 is for detecting duplicates and minimizing
513	   processing and storage by the server.  As on the client, an exact
514	   comparison is needed to ensure that attacks involving alternate
515	   chains are detected.  Again, at least one optimization is safe and
516	   MAY be performed.  If the certificate chain exactly matches an
517	   existing certificate chain, the server MAY store the union of the
518	   SCTs from the two objects in the first (existing) object.  If the
519	   validity check on any of the SCTs fails, the server SHOULD NOT store
520	   the union of the SCTs.

522	   The check in step 3 is to help prevent malfunctioning clients from
523	   exposing which sites they visit.  It additionally helps prevent DoS
524	   attacks on the server.

526	   [ Note: Thinking about building this, how does the SCT Feedback app
527	   know which sites it's authoritative for?  It will need that amount of
528	   configuration at least. ]

530	   The check in step 4 is to prevent DoS attacks where an adversary
531	   fills up the store prior to attacking a client (thus preventing the
532	   client's feedback from being recorded), or an attack where an
533	   adversary simply attempts to fill up the server's storage space.

535	   The above describes the simpler mode of operation.  In the more
536	   advanced server mode, the server will detect the attack described in
537	   [dual-ca-compromise-attack].  In this configuration the server will
538	   not modify the sct_feedback object prior to performing checks 2, 3,
539	   and 4.

541	   To prevent a malicious client from filling the server's data store,
542	   the HTTPS server SHOULD perform an additional check in the more
543	   advanced mode:

545	   o  if the x509_chain consists of an invalid certificate chain, or the
546	      culminating trust anchor is not recognized by the server, the
547	      server SHOULD modify the sct_feedback object, discarding all items
548	      in the x509_chain array except the first item
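
   A minimal sketch of this additional check is shown below.  The
   callback chain_is_trusted is a hypothetical stand-in for X.509 path
   validation against the server's own trust store; it is not an API
   defined by this document.

```python
from typing import Callable, List


def apply_chain_check(x509_chain: List[str],
                      chain_is_trusted: Callable[[List[str]], bool]) -> List[str]:
    """Return the chain to keep in the sct_feedback object.

    If the chain is invalid or ends in an unrecognized trust anchor,
    everything except the first item (the end-entity certificate) is
    discarded.  Otherwise the full chain is kept.
    """
    if x509_chain and not chain_is_trusted(x509_chain):
        return x509_chain[:1]
    return list(x509_chain)
```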

550	   The HTTPS server MAY choose to omit checks 4 or 5.  This will place
551	   the server at risk of having its data store filled up by invalid
552	   data, but can also allow a server to identify interesting certificates
553	   or certificate chains that omit valid SCTs, or do not chain to a
554	   trusted root.  This information may enable an HTTPS server operator
555	   to detect attacks or unusual behavior of Certificate Authorities even
556	   outside the Certificate Transparency ecosystem.

558	8.1.4.  HTTPS server to auditors

560	   HTTPS servers receiving SCTs from clients SHOULD share SCTs and
561	   certificate chains with CT auditors by either serving them on the
562	   well-known URL:

564	   https://<domain>/.well-known/ct-gossip/v1/collected-sct-feedback

566	   or by HTTPS POSTing them to a set of preconfigured auditors.  This
567	   allows an HTTPS server to choose between an active push model or a
568	   passive pull model.

570	   The data received in a GET of the well-known URL or sent in the POST
571	   is defined in Section 8.1.1 with the following difference: The
572	   x509_chain element may contain only the end-entity certificate, as
573	   described below.

575	   HTTPS servers SHOULD share all sct_feedback objects they see that
576	   pass the checks in Section 8.1.3.  If this is an infeasible amount of
577	   data, the server MAY choose to expire submissions according to an
578	   undefined policy.  Suggestions for such a policy can be found in
579	   Section 11.3.

581	   HTTPS servers MUST NOT share any other data that they may learn from
582	   the submission of SCT Feedback by HTTPS clients, like the HTTPS
583	   client IP address or the time of submission.

585	   As described above, HTTPS servers can be configured (or omit
586	   configuration), resulting in two modes of operation.  In one mode,
587	   the x509_chain array will contain a full certificate chain.  This
588	   chain may terminate in a trust anchor the auditor may recognize, or
589	   it may not.  (One scenario where this could occur is if the client
590	   submitted a chain terminating in a locally added trust anchor, and
591	   the server kept this chain.)  In the other mode, the x509_chain array
592	   will consist of only a single element, which is the end-entity
593	   certificate.

595	   Auditors SHOULD provide the following URL accepting HTTPS POSTing of
596	   SCT feedback data:

598	   https://<auditor>/ct-gossip/v1/sct-feedback

600	   Auditors SHOULD regularly poll HTTPS servers at the well-known
601	   collected-sct-feedback URL.  The frequency of the polling and how to
602	   determine which domains to poll is outside the scope of this
603	   document.  However, the selection MUST NOT be influenced by potential
604	   HTTPS clients connecting directly to the auditor.  For example, if a
605	   poll to example.com occurs directly after a client submits an SCT for
606	   example.com, an adversary observing the auditor can trivially
607	   conclude the activity of the client.

609	8.2.  STH pollination

611	   The goal of sharing Signed Tree Heads (STHs) through pollination is
612	   to share STHs between HTTPS clients and CT auditors while still
613	   preserving the privacy of the end user.  The sharing of STHs
614	   contributes to the overall goal of detecting misbehaving logs by
615	   providing CT auditors with STHs from many vantage points, making it
616	   possible to detect logs that are presenting inconsistent views.

618	   HTTPS servers supporting the protocol act as STH pools.  HTTPS
619	   clients and CT auditors in the possession of STHs can pollinate STH
620	   pools by sending STHs to them, and retrieving new STHs to send to
621	   other STH pools.  CT auditors can improve the value of their auditing
622	   by retrieving STHs from pools.

624	   HTTPS clients send STHs to HTTPS servers by POSTing them to the well-
625	   known URL:

627	   https://<domain>/.well-known/ct-gossip/v1/sth-pollination

629	   The data sent in the POST is defined in Section 8.2.4.  This data
630	   SHOULD be sent in an already established TLS session.  This makes it
631	   hard for an attacker to disrupt STH gossiping without also disturbing
632	   ordinary secure browsing (https://).  This is discussed more in
633	   Section 11.1.1.

635	   On a successful connection to an HTTPS server implementing STH
636	   Pollination, the response code will be 200, and the response body is
637	   application/json, containing zero or more STHs in the same format, as
638	   described in Section 8.2.4.

640	   An HTTPS client may acquire STHs by several methods:

642	   o  in replies to pollination POSTs;

644	   o  asking logs that it recognizes for the current STH, either
645	      directly (v2/get-sth) or indirectly (for example over DNS)

647	   o  resolving an SCT and certificate to an STH via an inclusion proof

649	   o  resolving one STH to another via a consistency proof

650	   HTTPS clients (that have STHs) and CT auditors SHOULD pollinate STH
651	   pools with STHs.  Which STHs to send and how often pollination should
652	   happen is regarded as undefined policy with the exception of privacy
653	   concerns explained below.  Suggestions for the policy can be found in
654	   Section 11.3.

656	   An HTTPS client could be tracked by giving it a unique or rare STH.
657	   To address this concern, we place restrictions on different
658	   components of the system to ensure an STH will not be rare.

660	   o  HTTPS clients silently ignore STHs from logs with an STH issuance
661	      frequency of more than one STH per hour.  Logs use the STH
662	      Frequency Count metadata to express this ([RFC-6962-BIS-09]
663	      sections 3.6 and 5.1).

665	   o  HTTPS clients silently ignore STHs which are not fresh.

667	   An STH is considered fresh iff its timestamp is less than 14 days in
668	   the past.  Given a maximum STH issuance rate of one per hour, an
669	   attacker has 336 unique STHs per log for tracking.  Clients MUST
670	   ignore STHs older than 14 days.  We consider STHs within this
671	   validity window not to be personally identifiable data, and STHs
672	   outside this window to be personally identifiable.
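
   The freshness rule and the resulting bound can be written down as a
   short sketch.  It assumes STH timestamps are expressed in
   milliseconds since the epoch, and the names used are illustrative
   only.

```python
# Validity window of 14 days, in milliseconds (assumed timestamp unit).
FRESHNESS_WINDOW_MS = 14 * 24 * 60 * 60 * 1000

# At most one STH per hour for 14 days: 14 * 24 = 336 STHs per log.
MAX_STHS_PER_LOG = 14 * 24


def is_fresh(sth_timestamp_ms: int, now_ms: int) -> bool:
    """Return True iff the STH timestamp is less than 14 days in the past.

    Timestamps in the future are not addressed by the text above and are
    not rejected by this sketch.
    """
    return now_ms - sth_timestamp_ms < FRESHNESS_WINDOW_MS
```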

674	   When multiplied by the number of logs from which a client accepts
675	   STHs, this number of unique STHs grows and the negative privacy
676	   implications grow with it.  It's important that this is taken into
677	   account when logs are chosen for default settings in HTTPS clients.
678	   This concern is discussed in Section 10.5.5.

680	   A log may cease operation, in which case there will soon be no STH
681	   within the validity window.  Clients SHOULD perform all three methods
682	   of gossip about a log that has ceased operation since it is possible
683	   the log was still compromised and gossip can detect that.  STH
684	   Pollination is the one mechanism where a client must know about a log
685	   shutdown.  A client who does not know about a log shutdown MUST NOT
686	   attempt any heuristic to detect a shutdown.  Instead the client MUST
687	   be informed about the shutdown from a verifiable source (e.g. a
688	   software update).  The client SHOULD be provided the final STH issued
689	   by the log and SHOULD resolve SCTs and STHs to this final STH.  If an
690	   SCT or STH cannot be resolved to the final STH, clients SHOULD follow
691	   the requirements and recommendations set forth in Section 11.1.2.

693	8.2.1.  HTTPS Clients and Proof Fetching

695	   There are two types of proofs a client may retrieve: inclusion proofs
696	   and consistency proofs.

698	   An HTTPS client will retrieve SCTs together with certificate chains
699	   from an HTTPS server.  Using the timestamp in the SCT together with
700	   the end-entity certificate and the issuer key hash, it can obtain an
701	   inclusion proof to an STH in order to verify the promise made by the
702	   SCT.

704	   An HTTPS client will have STHs from performing STH Pollination, and
705	   may obtain a consistency proof to a more recent STH.

707	   An HTTPS client may also receive an SCT bundled with an inclusion
708	   proof to a historical STH via an unspecified future mechanism.
709	   Because this historical STH is considered personally identifiable
710	   information per above, the client needs to obtain a consistency proof
711	   to a more recent STH.

713	   A client SHOULD perform proof fetching.  A client MUST NOT perform
714	   proof fetching for any SCTs or STHs issued by a locally added log.  A
715	   client MAY fetch an inclusion proof for an SCT (issued by a pre-
716	   loaded log) that validates a certificate chaining to a locally added
717	   trust anchor.

719	   If a client requested either proof directly from a log or auditor, it
720	   would reveal the client's browsing habits to a third party.  To
721	   mitigate this risk, an HTTPS client MUST retrieve the proof in a
722	   manner that disguises the client.

724	   Depending on the client's DNS provider, DNS may provide an
725	   appropriate intermediate layer that obfuscates the linkability
726	   between the user of the client and the request for inclusion (while
727	   at the same time providing a caching layer for oft-requested
728	   inclusion proofs).  See [draft-ct-over-dns] for an example of how
729	   this can be done.

731	   Anonymity networks such as Tor also present a mechanism for a client
732	   to anonymously retrieve a proof from an auditor or log.

734	   Even when using a privacy-preserving layer between the client and the
735	   log, certain observations may be made about an anonymous client or
736	   general user behavior depending on how proofs are fetched.  For
737	   example, if a client fetched all outstanding proofs at once, a log
738	   would know that SCTs or STHs received around the same time are more
739	   likely to come from a particular client.  This could potentially go
740	   so far as correlation of activity at different times to a single
741	   client.  In aggregate the data could reveal what sites are commonly
742	   visited together.  HTTPS clients SHOULD use a strategy of proof
743	   fetching that attempts to obfuscate these patterns.  A suggestion of
744	   such a policy can be found in Section 11.2.

746	   Resolving either SCTs or STHs may result in errors.  These errors
747	   may be routine downtime or other transient errors, or they may be
748	   indicative of an attack.  Clients SHOULD follow the requirements and
749	   recommendations set forth in Section 11.1.2 when handling these
750	   errors in order to give the CT ecosystem the greatest chance of
751	   detecting and responding to a compromise.

753	8.2.2.  STH Pollination without Proof Fetching

755	   An HTTPS client MAY participate in STH Pollination without fetching
756	   proofs.  In this situation, the client receives STHs from a server,
757	   applies the same validation logic to them (signed by a known log,
758	   within the validity window) and will later pass them to another HTTPS
759	   server.

761	   When operating in this fashion, the HTTPS client is promoting gossip
762	   for Certificate Transparency, but derives no direct benefit itself.
763	   In comparison, a client who resolves SCTs or historical STHs to
764	   recent STHs and pollinates them is assured that if it was attacked,
765	   there is a probability that the ecosystem will detect and respond to
766	   the attack (by distrusting the log).

768	8.2.3.  Auditor Action

770	   CT auditors participate in STH pollination by retrieving STHs from
771	   HTTPS servers.  They verify that the STH is valid by checking the
772	   signature, and requesting a consistency proof from the STH to the
773	   most recent STH.

775	   After retrieving the consistency proof to the most recent STH, they
776	   SHOULD pollinate this new STH among participating HTTPS servers.  In
777	   this way, as STHs "age out" and are no longer fresh, their "lineage"
778	   continues to be tracked in the system.

780	8.2.4.  STH Pollination data format

782	   The data sent from HTTPS clients and CT auditors to HTTPS servers is
783	   a JSON object [RFC7159] with the following content:

785	   o  sths - an array of 0 or more fresh SignedTreeHeads as defined in
786	      [RFC-6962-BIS-09] Section 3.6.1.
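
   As an illustration, a client might assemble the pollination request
   and read the reply roughly as follows.  The function names are
   hypothetical, and each STH is treated as an opaque JSON value whose
   actual encoding is defined in [RFC-6962-BIS-09].

```python
import json
from typing import Any, List, Tuple


def build_sth_pollination_request(domain: str, sths: List[Any]) -> Tuple[str, str]:
    """Return the well-known URL and the JSON body for an STH pollination POST."""
    url = f"https://{domain}/.well-known/ct-gossip/v1/sth-pollination"
    body = json.dumps({"sths": sths})
    return url, body


def parse_sth_pollination_response(body: str) -> List[Any]:
    """Extract the (possibly empty) array of STHs from a response body."""
    obj = json.loads(body)
    if not isinstance(obj, dict) or not isinstance(obj.get("sths", []), list):
        raise ValueError("response is not an object with an 'sths' array")
    return obj.get("sths", [])
```

   The sketch does not send anything over the network; per Section 8.2
   the POST would normally reuse an already established TLS session.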

788	8.3.  Trusted Auditor Stream

790	   HTTPS clients MAY send SCTs and cert chains, as well as STHs,
791	   directly to auditors.  If sent, this data MAY include data that
792	   reflects locally added logs or trust anchors.  Note that there are
793	   privacy implications in doing so; these are outlined in
794	   Section 10.5.1 and Section 10.5.6.

796	   The most natural trusted auditor arrangement arguably is a web
797	   browser that is "logged in to" a provider of various internet
798	   services.  Another equivalent arrangement is a trusted party like a
799	   corporation to which an employee is connected through a VPN or by
800	   other similar means.  A third might be individuals or smaller groups
801	   of people running their own services.  In such a setting, retrieving
802	   proofs from that third party could be considered reasonable from a
803	   privacy perspective.  The HTTPS client may also do its own auditing
804	   and might additionally share SCTs and STHs with the trusted party to
805	   contribute to herd immunity.  Here, the ordinary [RFC-6962-BIS-09]
806	   protocol is sufficient for the client to do the auditing while SCT
807	   Feedback and STH Pollination can be used in whole or in parts for the
808	   gossip part.

810	   Another well established trusted party arrangement on the internet
811	   today is the relation between internet users and their providers of
812	   DNS resolver services.  DNS resolvers are typically provided by the
813	   internet service provider (ISP) used, which by the nature of name
814	   resolving already know a great deal about which sites their users
815	   visit.  As mentioned in Section 8.2.1, in order for HTTPS clients to
816	   be able to retrieve proofs in a privacy preserving manner, logs could
817	   expose a DNS interface in addition to the ordinary HTTPS interface.
818	   A specification of such a protocol can be found in
819	   [draft-ct-over-dns].

821	8.3.1.  Trusted Auditor data format

823	   Trusted Auditors expose a REST API at the fixed URI:

825	   https://<auditor>/ct-gossip/v1/trusted-auditor

827	   Submissions are made by sending an HTTPS POST request, with the body
828	   of the POST in a JSON object.  Upon successful receipt the Trusted
829	   Auditor returns 200 OK.

831	   The JSON object consists of two top-level keys: 'sct_feedback' and
832	   'sths'.  The 'sct_feedback' value is an array of JSON objects as
833	   defined in Section 8.1.1.  The 'sths' value is an array of STHs as
834	   defined in Section 8.2.4.

836	   Example:

838	   {
839	     "sct_feedback" :
840	       [
841	         {
842	           "x509_chain" :
843	             [
844	               "-----BEGIN CERTIFICATE-----\n
845	                AAA...",
846	               "-----BEGIN CERTIFICATE-----\n
847	                AAA...",
848	                ...
849	             ],
850	           "sct_data" :
851	             [
852	               "AAA...",
853	               "AAA...",
854	               ...
855	             ]
856	         }, ...
857	       ],
858	     "sths" :
859	       [
860	         "AAA...",
861	         "AAA...",
862	         ...
863	       ]
864	   }
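
   A submission body with this shape could be produced as in the
   following sketch.  The helper names are hypothetical, and the
   certificate, SCT, and STH values are placeholders rather than real
   encodings.

```python
import json
from typing import Any, Dict, List


def feedback_entry(x509_chain: List[str], sct_data: List[str]) -> Dict[str, List[str]]:
    """Build one element of the 'sct_feedback' array."""
    return {"x509_chain": x509_chain, "sct_data": sct_data}


def build_trusted_auditor_submission(sct_feedback: List[Dict[str, List[str]]],
                                     sths: List[Any]) -> str:
    """Serialize the two top-level keys into the JSON body of the POST."""
    return json.dumps({"sct_feedback": sct_feedback, "sths": sths})


def trusted_auditor_url(auditor: str) -> str:
    """Return the fixed Trusted Auditor URI for the given auditor host."""
    return f"https://{auditor}/ct-gossip/v1/trusted-auditor"
```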

866	9.  3-Method Ecosystem

868	   The use of three distinct methods for auditing logs may seem
869	   excessive, but each represents a needed component in the CT
870	   ecosystem.  To understand why, the drawbacks of each component must
871	   be outlined.  In this discussion we assume that an attacker knows
872	   which mechanisms an HTTPS client and HTTPS server implement.

874	9.1.  SCT Feedback

876	   SCT Feedback requires the cooperation of HTTPS clients and more
877	   importantly HTTPS servers.  Although SCT Feedback does require a
878	   significant amount of server-side logic to respond to the
879	   corresponding APIs, this functionality does not require
880	   customization, so it may be pre-provided and work out of the box.
881	   However, to take full advantage of the system, an HTTPS server would
882	   wish to perform some configuration to optimize its operation:

884	   o  Minimize its disk commitment by maintaining a list of known SCTs
885	      and certificate chains (or hashes thereof)

887	   o  Maximize its chance of detecting a misissued certificate by
888	      configuring a trust store of CAs

890	   o  Establish a "push" mechanism for POSTing SCTs to CT auditors

892	   These configuration needs, and the simple fact that it would require
893	   some deployment of software, means that some percentage of HTTPS
894	   servers will not deploy SCT Feedback.

896	   It is worthwhile to note that an attacker may be able to prevent
897	   detection of an attack on a webserver (in all cases) if SCT Feedback
898	   is not implemented.  This attack is detailed in Section 10.1.

900	   If SCT Feedback was the only mechanism in the ecosystem, any server
901	   that did not implement the feature would open itself and its users to
902	   attack without any possibility of detection.

904	   If SCT Feedback is not deployed by a webserver, malicious logs will
905	   be able to attack all users of the webserver (who do not have a
906	   Trusted Auditor relationship) with impunity.  Additionally, users who
907	   wish to have the strongest measure of privacy protection (by
908	   disabling STH Pollination Proof Fetching and forgoing a Trusted
909	   Auditor) could be attacked without risk of detection.

911	9.2.  STH Pollination

913	   STH Pollination requires the cooperation of HTTPS clients, HTTPS
914	   servers, and logs.

916	   For a client to fully participate in STH Pollination, and have this
917	   mechanism detect attacks against it, the client must have a way to
918	   safely perform Proof Fetching in a privacy preserving manner.  (The
919	   client may pollinate STHs it receives without performing Proof
920	   Fetching, but we do not consider this option in this section.)

922	   HTTPS servers must deploy software (although, as in the case with SCT
923	   Feedback this logic can be pre-provided) and commit some configurable
924	   amount of disk space to the endeavor.

926	   Logs (or a third party mirroring the logs) must provide access to
927	   clients to query proofs in a privacy preserving manner, most likely
928	   through DNS.

930	   Unlike SCT Feedback, the STH Pollination mechanism is not hampered if
931	   only a minority of HTTPS servers deploy it.  However, it makes an
932	   assumption that an HTTPS client performs Proof Fetching (such as the
933	   DNS mechanism discussed).  Unfortunately, any manner that is
934	   anonymous for some (such as clients who use shared DNS services such
935	   as a large ISP), may not be anonymous for others.

937	   For instance, DNS requests expose a considerable amount of sensitive
938	   information (including what data is already present in the cache) in
939	   plaintext over the network.  For this reason, some percentage of
940	   HTTPS clients may choose to not enable the Proof Fetching component
941	   of STH Pollination.  (Although they can still request and send STHs
942	   among participating HTTPS servers, even when this affords them no
943	   direct benefit.)

945	   If STH Pollination was the only mechanism deployed, users that
946	   disable it would be able to be attacked without risk of detection.

948	   If STH Pollination was not deployed, HTTPS clients visiting HTTPS
949	   Servers who did not deploy SCT Feedback could be attacked without
950	   risk of detection.

952	9.3.  Trusted Auditor Relationship

954	   The Trusted Auditor Relationship is expected to be the rarest gossip
955	   mechanism, as an HTTPS client is providing an unadulterated report of
956	   its browsing history to a third party.  While there are valid and
957	   common reasons for doing so, there is no appropriate way to enter
958	   into this relationship without retrieving informed consent from the
959	   user.

961	   However, the Trusted Auditor Relationship mechanism still provides
962	   value to a class of HTTPS clients.  For example, web crawlers have no
963	   concept of a "user" and no expectation of privacy.  Organizations
964	   already performing network auditing for anomalies or attacks can run
965	   their own Trusted Auditor for the same purpose with marginal increase
966	   in privacy concerns.

968	   The ability to change one's Trusted Auditor is a form of Trust
969	   Agility that allows a user to choose who to trust, and be able to
970	   revise that decision later without consequence.  A Trusted Auditor
971	   connection can be made more confidential than DNS (through the use of
972	   TLS), and can even be made (somewhat) anonymous through the use of
973	   anonymity services such as Tor. (Note that this does ignore the de-
974	   anonymization possibilities available from viewing a user's browsing
975	   history.)

977	   If the Trusted Auditor relationship was the only mechanism deployed,
978	   users who do not enable it (the majority) would be able to be
979	   attacked without risk of detection.

981	   If the Trusted Auditor relationship was not deployed, crawlers and
982	   organizations would build it themselves for their own needs.  By
983	   standardizing it, users who wish to opt-in (for instance those
984	   unwilling to participate fully in STH Pollination) can have an
985	   interoperable standard they can use to choose and change their
986	   trusted auditor.

988	9.4.  Interaction

990	   The interactions of the mechanisms are outlined as follows:

992	   HTTPS clients can be attacked without risk of detection if they do
993	   not participate in any of the three mechanisms.

995	   HTTPS clients are afforded the greatest chance of detecting an attack
996	   when they either participate in both SCT Feedback and STH Pollination
997	   with Proof Fetching or if they have a Trusted Auditor relationship.
998	   (Participating in SCT Feedback is required to prevent a malicious log
999	   from refusing to ever resolve an SCT to an STH, as put forward in
1000	   Section 10.1).  Additionally, participating in SCT Feedback enables
1001	   an HTTPS client to assist in detecting the exact target of an attack.

1003	   HTTPS servers that omit SCT Feedback enable malicious logs to carry
1004	   out attacks without risk of detection.  If these servers are targeted
1005	   specifically, even if the attack is detected, without SCT Feedback
1006	   they may never learn that they were specifically targeted.  HTTPS
1007	   servers without SCT Feedback do gain some measure of herd immunity,
1008	   but only because their clients participate in STH Pollination (with
1009	   Proof Fetching) or have a Trusted Auditor Relationship.

1011	   When HTTPS servers omit SCT feedback, it allows their users to be
1012	   attacked without detection by a malicious log; the vulnerable users
1013	   are those who do not have a Trusted Auditor relationship.

1015	10.  Security considerations

1017	10.1.  Attacks by actively malicious logs

1019	   One of the most powerful attacks possible in the CT ecosystem is a
1020	   trusted log that has actively decided to be malicious.  It can carry
1021	   out an attack in two ways:

1023	   In the first attack, the log can present a split view of the log for
1024	   all time.  The only way to detect this attack is to resolve each view
1025	   of the log to the two most recent STHs and then force the log to
1026	   present a consistency proof.  (Which it cannot.)  This attack can be
1027	   detected by CT auditors participating in STH Pollination, as long as
1028	   they are explicitly built to handle the situation of a log
1029	   continuously presenting a split view.

1031	   In the second attack, the log can sign an SCT, and refuse to ever
1032	   include the certificate that the SCT refers to in the tree.
1033	   (Alternately, it can include it in a branch of the tree and issue an
1034	   STH, but then abandon that branch.)  Whenever someone requests an
1035	   inclusion proof for that SCT (or a consistency proof from that STH),
1036	   the log would respond with an error, and a client may simply regard
1037	   the response as a transient error.  This attack can be detected using
1038	   SCT Feedback, or an Auditor of Last Resort, as presented in
1039	   Section 11.1.2.

1041	10.2.  Dual-CA Compromise

1043	   [dual-ca-compromise-attack] describes an attack possible by an
1044	   adversary who compromises two Certificate Authorities and a Log. This
1045	   attack is difficult to defend against in the CT ecosystem, and
1046	   [dual-ca-compromise-attack] describes a few approaches to doing so.
1047	   We note that Gossip is not intended to defend against this attack,
1048	   but can in certain modes.

1050	   Defending against the Dual-CA Compromise attack requires SCT
1051	   Feedback, and explicitly requires the server to save full certificate
1052	   chains (described in Section 8.1.3 as the more advanced mode).
1053	   After CT auditors receive the full certificate chains from servers,
1054	   they MAY compare the chain built by clients to the chain supplied by
1055	   the log.  If the chains differ significantly, the auditor SHOULD
1056	   raise a concern.  A method of determining if chains differ
1057	   significantly is by asserting that one chain is not a subset of the
1058	   other and that the roots of the chains are different.
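
   The comparison suggested above might be sketched as follows.  The
   sketch assumes that both chains are non-empty, ordered from the end-
   entity certificate to the root, and represented by opaque
   identifiers such as certificate fingerprints.  It reads "one chain
   is not a subset of the other" as "neither chain is a subset of the
   other", which is an interpretation rather than something this
   document states.

```python
from typing import List


def chains_differ_significantly(client_chain: List[str],
                                log_chain: List[str]) -> bool:
    """Return True if neither chain is a subset of the other and the roots differ."""
    if not client_chain or not log_chain:
        raise ValueError("both chains must be non-empty")
    a, b = set(client_chain), set(log_chain)
    neither_is_subset = not a <= b and not b <= a
    roots_differ = client_chain[-1] != log_chain[-1]  # last element is the root
    return neither_is_subset and roots_differ
```

   With this reading, a cross-signed chain (one chain contained in the
   other) and an intermediate swap under the same root both compare as
   not significantly different.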

1060	   [Note: Justification for this algorithm:

1062	   Cross-Signatures could result in a different org being treated as the
1063	   'root', but in this case, one chain would be a subset of the other.

1065	   Intermediate swapping (e.g. different signature algorithms) could
1066	   result in different chains, but the root would be the same.

1068	   (Hitting both those cases at once would cause a false positive
1069	   though, but this would likely be rare.)

1071	   Are there other cases that could occur?  (Left for the purposes of
1072	   reading during pre-Last Call, to be removed by Editor)]

1074	10.3.  Censorship/Blocking considerations

1076	   We assume a network attacker who is able to fully control the
1077	   client's internet connection for some period of time, including
1078	   selectively blocking requests to certain hosts and truncating TLS
1079	   connections based on information observed or guessed about client
1080	   behavior.  In order to successfully detect log misbehavior, the
1081	   gossip mechanisms must still work even in these conditions.

1083	   There are several gossip connections that can be blocked:

1085	   1.  Clients sending SCTs to servers in SCT Feedback

1087	   2.  Servers sending SCTs to auditors in SCT Feedback (server push
1088	       mechanism)

1090	   3.  Servers making SCTs available to auditors (auditor pull
1091	       mechanism)

1093	   4.  Clients fetching proofs in STH Pollination

1095	   5.  Clients sending STHs to servers in STH Pollination

1097	   6.  Servers sending STHs to clients in STH Pollination

1099	   7.  Clients sending SCTs to Trusted Auditors

1101	   If a party cannot connect to another party, it can be assured that
1102	   the connection did not succeed.  While it may not have been
1103	   maliciously blocked, it knows the transaction did not succeed.
1104	   Mechanisms which result in a positive affirmation from the recipient
1105	   that the transaction succeeded allow confirmation that a connection
1106	   was not blocked.  In this situation, the party can factor this into
1107	   strategies suggested in Section 11.3 and in Section 11.1.2.

1109	   The connections that allow positive affirmation are 1, 2, 4, 5, and
1110	   7.

1112	   More insidious is blocking the connections that do not allow positive
1113	   confirmation: 3 and 6.  An attacker may truncate or drop a response
1114	   from a server to a client, such that the server believes it has
1115	   shared data with the recipient, when it has not.  However, in both
1116	   scenarios (3 and 6), the server cannot tell whether the client is a
1117	   cooperating member of the CT ecosystem or an attacker performing a
1118	   Sybil attack, aiming to flush the server's data store.  Therefore the
1119	   fact that these connections can be undetectably blocked does not
1120	   actually alter the threat model of servers responding to these
1121	   requests.  The choice of algorithm to release data is crucial to
1122	   protect against these attacks; strategies are suggested in
1123	   Section 11.3.

1125	   Handling censorship and network blocking (which is indistinguishable
1126	   from network error) is relegated to the implementation policy chosen
1127	   by clients.  Suggestions for client behavior are specified in
1128	   Section 11.1.

1130	10.4.  Flushing Attacks

1132	   A flushing attack is an attempt by an adversary to flush a particular
1133	   piece of data from a pool.  In the CT Gossip ecosystem, an attacker
1134	   may have performed an attack and left evidence of a compromised log
1135	   on a client or server.  They would be interested in flushing that
1136	   data, i.e., tricking the target into gossiping or pollinating the
1137	   incriminating evidence with only attacker-controlled clients or
1138	   servers with the hope they trick the target into deleting it.

1140	   Flushing attacks may be defended against differently depending on the
1141	   entity (HTTPS client or HTTPS server) and record (STHs or SCTs with
1142	   Certificate Chains).

1144	10.4.1.  STHs

1146	   For both HTTPS clients and HTTPS servers, STHs within the validity
1147	   window SHOULD NOT be deleted.  An attacker cannot flush an item from
1148	   the cache if it is never removed so flushing attacks are completely
1149	   mitigated.

1151	   The required disk space for all STHs within the validity window is
1152	   336 STHs per log that is trusted.  If 20 logs are trusted, and each
1153	   STH takes 1 Kilobyte, this is 6.56 Megabytes.
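
   The arithmetic behind this estimate can be reproduced with a short
   sketch.  It assumes 1 Megabyte = 1024 Kilobytes, which is the
   convention that yields the figure above; the function name and
   parameters are illustrative.

```python
def sth_cache_megabytes(trusted_logs: int, sth_kilobytes: float = 1.0,
                        window_days: int = 14, sths_per_hour: int = 1) -> float:
    """Estimate the space needed to hold every STH in the validity window."""
    sths_per_log = window_days * 24 * sths_per_hour  # 336 with the defaults
    total_kilobytes = sths_per_log * trusted_logs * sth_kilobytes
    return total_kilobytes / 1024  # assumed: 1 Megabyte = 1024 Kilobytes


# 20 trusted logs at 1 Kilobyte per STH: 336 * 20 = 6720 Kilobytes.
estimate = sth_cache_megabytes(20)
```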

1155	   Note that it is important that implementors do not calculate the
1156	   exact size of cache expected - if an attack does occur, a small
1157	   number of additional STHs will enter into the cache.  These STHs will
1158	   be in addition to the expected set, and will be evidence of the
1159	   attack.

1161	   If an HTTPS client or HTTPS server is operating in a constrained
1162	   environment and cannot devote enough storage space to hold all STHs
1163	   within the validity window, it is recommended to use the Deletion
1164	   Algorithm in Section 11.3.2 to make it more difficult for the
1165	   attacker to perform a flushing attack.

1167	10.4.2.  SCTs & Certificate Chains on HTTPS Servers

1169	   An HTTPS server will only accept SCTs and Certificate Chains for
1170	   domains it is authoritative for.  Therefore the storage space needed
1171	   is bounded by the number of logs it accepts, multiplied by the number
1172	   of domains it is authoritative for, multiplied by the number of
1173	   certificates issued for those domains.

1175	   Imagine a server authoritative for 10,000 domains, and each domain
1176	   has 3 certificate chains, and 10 SCTs.  A certificate chain is 5
1177	   Kilobytes in size and an SCT 1 Kilobyte.  This yields 732 Megabytes.

1179	   This data can be large, but it is calculable.  Web properties with
1180	   more certificates and domains are more likely to be able to handle
1181	   the increased storage need, while small web properties will not see
1182	   an undue burden.  Therefore HTTPS servers SHOULD NOT delete SCTs or
1183	   Certificate Chains.  This completely mitigates flushing attacks.

1185	   Again, note that it is important that implementors do not size the
1186	   cache to exactly the expected amount - if an attack does occur, the
1187	   new SCT(s) and Certificate Chain(s) will enter the cache.  This data
1188	   will be in addition to the expected set, and will be evidence of the
1189	   attack.

1191	   If an HTTPS server is operating in a constrained environment and
1192	   cannot devote enough storage space to hold all SCTs and Certificate
1193	   Chains it is authoritative for, it is recommended to configure the
1194	   SCT Feedback mechanism with an allowlist of certificates that are
1195	   known to be valid.  Chains and SCTs matching the allowlist can then
1196	   be discarded without being stored or subsequently provided to any
1197	   clients or auditors.  If the allowlist is not sufficient, the
1198	   Deletion Algorithm (Section 11.3.2) is recommended to make it more
1199	   difficult for the attacker to perform a flushing attack.

1201	10.4.3.  SCTs & Certificate Chains on HTTPS Clients

1203	   HTTPS clients will accumulate SCTs and Certificate Chains without
1204	   bound.  It is expected they will choose a particular cache size and
1205	   delete entries when the cache size meets its limit.  This does not
1206	   mitigate flushing attacks, and such an attack is documented in
1207	   [gossip-mixing].

1209	   The Deletion Algorithm (Section 11.3.2) is recommended to make it
1210	   more difficult for the attacker to perform a flushing attack.

1212	10.5.  Privacy considerations

1214	   CT Gossip deals with HTTPS clients which are trying to share
1215	   indicators that correspond to their browsing history.  The most
1216	   sensitive relationships in the CT ecosystem are the relationships
1217	   between HTTPS clients and HTTPS servers.  Client-server relationships
1218	   can be aggregated into a network graph with potentially serious
1219	   implications for correlative de-anonymization of clients and
1220	   relationship-mapping or clustering of servers or of clients.

1222	   There are, however, certain clients that do not require privacy
1223	   protection.  Examples of these clients are web crawlers or robots.
1224	   But even in this case, the method by which these clients crawl the
1225	   web may in fact be considered sensitive information.  In general, it
1226	   is better to err on the side of safety, and not assume a client is
1227	   okay with giving up its privacy.

1229	10.5.1.  Privacy and SCTs

1231	   An SCT contains information that links it to a particular web site.
1232	   Because the client-server relationship is sensitive, gossip between
1233	   clients and servers about unrelated SCTs is risky.  Therefore, a
1234	   client with an SCT for a given server SHOULD NOT transmit that
1235	   information in any other than the following two channels: to the
1236	   server associated with the SCT itself; or to a Trusted Auditor, if
1237	   one exists.

1239	10.5.2.  Privacy in SCT Feedback

1241	   SCTs introduce yet another mechanism for HTTPS servers to store state
1242	   on an HTTPS client, and potentially track users.  HTTPS clients which
1243	   allow users to clear history or cookies associated with an origin
1244	   MUST clear stored SCTs and certificate chains associated with the
1245	   origin as well.
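
   One way to satisfy this requirement is to key stored SCT Feedback
   data by origin, so that it can be dropped from the same code path
   that clears cookies or history.  The following is a minimal sketch
   under that assumption; the class and method names are hypothetical
   and do not come from any existing client.

```python
from collections import defaultdict


class OriginKeyedSCTStore:
    """Hypothetical store of SCTs and certificate chains, keyed by origin."""

    def __init__(self):
        self._records = defaultdict(list)

    def add(self, origin, sct, certificate_chain):
        """Remember an SCT and the chain it was observed with."""
        self._records[origin].append((sct, certificate_chain))

    def records_for(self, origin):
        """Return a copy of the records stored for one origin."""
        return list(self._records.get(origin, []))

    def clear_origin(self, origin):
        """Drop everything for an origin; call this whenever the user
        clears history or cookies associated with that origin."""
        self._records.pop(origin, None)
```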

1247	   Auditors should treat all SCTs as sensitive data.  SCTs received
1248	   directly from an HTTPS client are especially sensitive, because the
1249	   auditor is trusted by the client to not reveal their associations
1250	   with servers.  Auditors MUST NOT share such SCTs in any way,
1251	   including sending them to an external log, without first mixing them
1252	   with multiple other SCTs learned through submissions from multiple
1253	   other clients.  Suggestions for mixing SCTs are presented in
1254	   Section 11.3.

1256	   There is a possible fingerprinting attack where a log issues a unique
1257	   SCT for targeted log client(s).  A colluding log and HTTPS server
1258	   operator could therefore be a threat to the privacy of an HTTPS
1259	   client.  Given all the other opportunities for HTTPS servers to
1260	   fingerprint clients - TLS session tickets, HPKP and HSTS headers,
1261	   HTTP Cookies, etc. - this is considered acceptable.

1263	   The fingerprinting attack described above would be mitigated by a
1264	   requirement that logs must use a deterministic signature scheme when
1265	   signing SCTs ([RFC-6962-BIS-09] Section 2.1.4).  A log signing using
1266	   RSA is not required to use a deterministic signature scheme.

1268	   Since logs are allowed to issue a new SCT for a certificate already
1269	   present in the log, mandating deterministic signatures does not stop
1270	   this fingerprinting attack altogether.  It does make the attack
1271	   harder to pull off without being detected though.

1273	   There is another similar fingerprinting attack where an HTTPS server
1274	   tracks a client by using a unique certificate or a variation of cert
1275	   chains.  The risk for this attack is accepted on the same grounds as
1276	   the unique SCT attack described above.

1278	10.5.3.  Privacy for HTTPS clients performing STH Proof Fetching

1280	   An HTTPS client performing Proof Fetching SHOULD NOT request proofs
1281	   from a CT log that it doesn't accept SCTs from.  An HTTPS client
1282	   SHOULD regularly request an STH from all logs it is willing to
1283	   accept, even if it has seen no SCTs from that log.

1285	   The time between two polls for new STHs SHOULD NOT be significantly
1286	   shorter than the MMD of the polled log divided by its STH Frequency
1287	   Count ([RFC-6962-BIS-09] Section 5.1).
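
   The lower bound on the polling interval can be computed directly.
   The sketch below is illustrative; the function name is hypothetical,
   and the 24-hour MMD and frequency count of 24 in the example are
   assumed values, not properties of any particular log.

```python
from datetime import timedelta


def min_sth_poll_interval(mmd, sth_frequency_count):
    """Shortest recommended time between two STH polls of one log."""
    if sth_frequency_count < 1:
        raise ValueError("STH Frequency Count must be at least 1")
    return mmd / sth_frequency_count


# Assumed example: MMD of 24 hours, STH Frequency Count of 24
print(min_sth_poll_interval(timedelta(hours=24), 24))
```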

1289	   The actual mechanism by which Proof Fetching is done carries
1290	   considerable privacy concerns.  Although out of scope for this
1291	   document, DNS is a mechanism currently discussed.  DNS exposes data
1292	   in plaintext over the network (including what sites the user is
1293	   visiting and what sites they have previously visited) and may not be
1294	   suitable for some.

1296	10.5.4.  Privacy in STH Pollination

1298	   An STH linked to an HTTPS client may indicate the following about
1299	   that client:

1301	   o  that the client gossips;

1303	   o  that the client has been using CT at least until the time that the
1304	      timestamp and the tree size indicate;

1306	   o  that the client is talking, possibly indirectly, to the log
1307	      indicated by the tree hash;

1309	   o  which software and software version is being used.

1311	   There is a possible fingerprinting attack where a log issues a unique
1312	   STH for a targeted HTTPS client.  This is similar to the
1313	   fingerprinting attack described in Section 10.5.2, but can operate
1314	   cross-origin.  If a log (or HTTPS server cooperating with a log)
1315	   provides a unique STH to a client, the targeted client will be the
1316	   only client pollinating that STH cross-origin.

1318	   It is mitigated partially because the log is limited in the number of
1319	   STHs it can issue.  It must 'save' one of its STHs each MMD to
1320	   perform the attack.

1322	10.5.5.  Privacy in STH Interaction

1324	   An HTTPS client may pollinate any STH within the last 14 days.  An
1325	   HTTPS client may also pollinate an STH for any log that it knows
1326	   about.  When a client pollinates STHs to a server, it will release
1327	   more than one STH at a time.  It is unclear if a server may 'prime' a
1328	   client and be able to reliably detect the client at a later time.

1330	   It's clear that a single site can track a user any way they wish, but
1331	   this attack works cross-origin and is therefore more concerning.  Two
1332	   independent sites A and B want to collaborate to track a user cross-
1333	   origin.  A feeds a client Carol some N specific STHs from the M logs
1334	   Carol trusts, chosen to be older and less common, but still in the
1335	   validity window.  Carol visits B and chooses to release some of the
1336	   STHs she has stored, according to some policy.

1338	   Modeling a representation for how common older STHs are in the pools
1339	   of clients, and examining that with a given policy of how to choose
1340	   which of those STHs to send to B, it should be possible to calculate
1341	   statistics about how unique Carol looks when talking to B and how
1342	   useful/accurate such a tracking mechanism is.

1344	   Building such a model is likely impossible without some real world
1345	   data, and requires a given implementation of a policy.  To combat
1346	   this attack, suggestions are provided in Section 11.3 to attempt to
1347	   minimize it, but follow-up testing with real world deployment to
1348	   improve the policy will be required.

1350	10.5.6.  Trusted Auditors for HTTPS Clients

1352	   Some HTTPS clients may choose to use a trusted auditor.  This trust
1353	   relationship exposes a large amount of information about the client
1354	   to the auditor.  In particular, it will identify the web sites that
1355	   the client has visited to the auditor.  Some clients may already
1356	   share this information with a third party, for example, when using a
1357	   server to synchronize browser history across devices in a server-
1358	   visible way, or when doing DNS lookups through a trusted DNS
1359	   resolver.  For clients with such a relationship already established,
1360	   sending SCTs to a trusted auditor run by the same organization does
1361	   not appear to expose any additional information to the trusted third
1362	   party.

1364	   Clients who wish to contact a CT auditor without associating their
1365	   identities with their SCTs may wish to use an anonymizing network
1366	   like Tor to submit SCT Feedback to the auditor.  Auditors SHOULD
1367	   accept SCT Feedback that arrives over such anonymizing networks.

1369	   Clients sending feedback to an auditor may prefer to reduce the
1370	   temporal granularity of the history exposure to the auditor by
1371	   caching and delaying their SCT Feedback reports.  This is elaborated
1372	   upon in Section 11.3.  This strategy is only as effective as the
1373	   granularity of the timestamps embedded in the SCTs and STHs.

1375	10.5.7.  HTTPS Clients as Auditors

1377	   Some HTTPS clients may choose to act as CT auditors themselves.  A
1378	   Client taking on this role needs to consider the following:

1380	   o  an Auditing HTTPS client potentially exposes its history to the
1381	      logs that it queries.  Querying the log through a cache or a proxy
1382	      with many other users may avoid this exposure, but may expose
1383	      information to the cache or proxy, in the same way that a non-
1384	      Auditing HTTPS Client exposes information to a Trusted Auditor.

1386	   o  an effective CT auditor needs a strategy about what to do in the
1387	      event that it discovers misbehavior from a log.  Misbehavior from
1388	      a log involves the log being unable to provide either (a) a
1389	      consistency proof between two valid STHs or (b) an inclusion proof
1390	      for a certificate to an STH any time after the log's MMD has
1391	      elapsed from the issuance of the SCT.  The log's inability to
1392	      provide either proof will not be externally cryptographically-
1393	      verifiable, as it may be indistinguishable from a network error.

1395	11.  Policy Recommendations

1397	   This section is intended as suggestions to implementors of HTTPS
1398	   Clients, HTTPS servers, and CT auditors.  It does not mandate any
1399	   particular implementation technique, so long as the privacy
1400	   considerations established above are obeyed.

1402	11.1.  Blocking Recommendations

1404	11.1.1.  Frustrating blocking

1406	   When making gossip connections to HTTPS servers or Trusted Auditors,
1407	   it is desirable to minimize the plaintext metadata in the connection
1408	   that can be used to identify the connection as a gossip connection
1409	   and therefore be of interest to block.  Additionally, introducing
1410	   some randomness into client behavior may be important.  We assume
1411	   that the adversary is able to inspect the behavior of the HTTPS
1412	   client and understand how it makes gossip connections.

1414	   As an example, if a client, after establishing a TLS connection (and
1415	   receiving an SCT, but not making its own HTTP request yet),
1416	   immediately opens a second TLS connection for the purpose of gossip,
1417	   the adversary can reliably block this second connection to block
1418	   gossip without affecting normal browsing.  For this reason it is
1419	   recommended to run the gossip protocols over an existing connection
1420	   to the server, making use of connection multiplexing such as HTTP
1421	   Keep-Alive or SPDY.

1423	   Truncation is also a concern.  If a client always establishes a TLS
1424	   connection, makes a request, receives a response, and then always
1425	   attempts a gossip communication immediately following the first
1426	   response, truncation will allow an attacker to block gossip reliably.

1428	   For these reasons, we recommend that, if at all possible, clients
1429	   SHOULD send gossip data in an already established TLS session.  This
1430	   can be done through the use of HTTP Pipelining, SPDY, or HTTP/2.

1432	11.1.2.  Responding to possible blocking

1434	   In some circumstances a client may have a piece of data that they
1435	   have attempted to share (via SCT Feedback or STH Pollination), but
1436	   have been unable to do so: with every attempt they receive an error.
1437	   These situations are:

1439	   1.  The client has an SCT and a certificate, and attempts to retrieve
1440	       an inclusion proof - but receives an error on every attempt.

1442	   2.  The client has an STH, and attempts to resolve it to a newer STH
1443	       via a consistency proof - but receives an error on every attempt.

1445	   3.  The client has attempted to share an SCT and constructed
1446	       certificate via SCT Feedback - but receives an error on every
1447	       attempt.

1449	   4.  The client has attempted to share an STH via STH Pollination -
1450	       but receives an error on every attempt.

1452	   5.  The client has attempted to share a specific piece of data with a
1453	       Trusted Auditor - but receives an error on every attempt.

1455	   In the case of 1 or 2, it is conceivable that the reason for the
1456	   errors is that the log acted improperly, either through malicious
1457	   actions or compromise.  A proof may not be able to be fetched because
1458	   it does not exist (and only errors or timeouts occur).  One such
1459	   situation may arise because of an actively malicious log, as
1460	   presented in Section 10.1.  This data is especially important to
1461	   share with the broader internet to detect this situation.

1463	   If a client has attempted multiple times to resolve an SCT to an STH
1464	   via an inclusion proof, and each attempt has failed, this SCT might
1465	   very well be compromising proof of an attack.  However, the client
1466	   MUST NOT share the data with any third party (excepting a Trusted
1467	   Auditor should one exist).

1469	   If a client has attempted multiple times to resolve an STH to a
1470	   newer STH via a consistency proof, and each attempt has failed, the
1471	   client MAY share the STH with an "Auditor of Last Resort" even if the
1472	   STH in question is no longer within the validity window.  This
1473	   auditor may be pre-configured in the client, but the client SHOULD
1474	   permit a user to disable the functionality or change whom data is
1475	   sent to.  The Auditor of Last Resort itself represents a point of
1476	   failure and a privacy concern, so if it is implemented, the client
1477	   SHOULD connect to it using public key pinning and not consider an
1478	   item delivered until it receives a confirmation.

1480	   In cases 3, 4, and 5, we assume that the webserver(s) or trusted
1481	   auditor in question is either experiencing an operational failure, or
1482	   being attacked.  In both cases, a client SHOULD retain the data for
1483	   later submission (subject to Private Browsing or other history-
1484	   clearing actions taken by the user.)  This is elaborated upon more in
1485	   Section 11.3.
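
   The handling of the five situations can be summarized as a small
   decision function.  This is a sketch of one reading of the text
   above, not a normative rule set: the enum, the function name, and the
   string labels are all hypothetical, and an empty result is used here
   to mean "retain the data and retry later".

```python
from enum import Enum


class Situation(Enum):
    INCLUSION_PROOF_FAILS = 1    # situation 1: SCT cannot be resolved
    CONSISTENCY_PROOF_FAILS = 2  # situation 2: STH cannot be resolved
    SCT_FEEDBACK_FAILS = 3       # situation 3
    STH_POLLINATION_FAILS = 4    # situation 4
    TRUSTED_AUDITOR_FAILS = 5    # situation 5


def permitted_recipients(situation, has_trusted_auditor,
                         last_resort_enabled):
    """Return who may receive the data now; [] means retain and retry."""
    if situation is Situation.INCLUSION_PROOF_FAILS:
        # The SCT must not go to any third party except a Trusted Auditor.
        return ["trusted_auditor"] if has_trusted_auditor else []
    if situation is Situation.CONSISTENCY_PROOF_FAILS:
        # The STH may go to an Auditor of Last Resort, if the user
        # has left that functionality enabled.
        return ["auditor_of_last_resort"] if last_resort_enabled else []
    # Situations 3 to 5: assume an outage or an attack and keep the data.
    return []
```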

1487	11.2.  Proof Fetching Recommendations

1489	   Proof fetching (both inclusion proofs and consistency proofs) SHOULD
1490	   be performed at random time intervals.  If proof fetching occurred
1491	   all at once, in a flurry of activity, a log would know that SCTs or
1492	   STHs received around the same time are more likely to come from a
1493	   particular client.  While proof fetching is required to be done in a
1494	   manner that attempts to be anonymous from the perspective of the log,
1495	   the correlation of activity to a single client would still reveal
1496	   patterns of user behavior we wish to keep confidential.  These
1497	   patterns could be recognizable as a single user, or could reveal what
1498	   sites are commonly visited together in the aggregate.
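
   A simple way to avoid a flurry of activity is to draw the delay
   before each proof fetch independently from a random distribution.
   The sketch below assumes exponentially distributed delays, which is
   an illustrative choice and not something this document prescribes;
   the function name and the mean value are hypothetical.

```python
import random

# OS-backed randomness, so delays are not predictable from a seed
_rng = random.SystemRandom()


def next_proof_fetch_delay(mean_seconds):
    """Draw a random wait, in seconds, before the next proof fetch."""
    if mean_seconds <= 0:
        raise ValueError("mean_seconds must be positive")
    return _rng.expovariate(1.0 / mean_seconds)


# Assumed example: waits that average one hour
delays = [next_proof_fetch_delay(3600) for _ in range(5)]
print(delays)
```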

1500	11.3.  Record Distribution Recommendations

1502	   In several components of the CT Gossip ecosystem, the recommendation
1503	   is made that data from multiple sources be ingested, mixed, stored
1504	   for an indeterminate period of time, provided (multiple times) to a
1505	   third party, and eventually deleted.  The instances of these
1506	   recommendations in this draft are:

1508	   o  When a client receives SCTs during SCT Feedback, it should store
1509	      the SCTs and Certificate Chain for some amount of time, provide
1510	      some of them back to the server at some point, and may eventually
1511	      remove them from its store

1513	   o  When a client receives STHs during STH Pollination, it should
1514	      store them for some amount of time, mix them with other STHs,
1515	      release some of them to various servers at some point,
1516	      resolve some of them to new STHs, and eventually remove them from
1517	      its store

1519	   o  When a server receives SCTs during SCT Feedback, it should store
1520	      them for some period of time, provide them to auditors some number
1521	      of times, and may eventually remove them

1523	   o  When a server receives STHs during STH Pollination, it should
1524	      store them for some period of time, mix them with other STHs,
1525	      provide some of them to connecting clients, may resolve them to
1526	      new STHs via Proof Fetching, and eventually remove them from its
1527	      store

1529	   o  When a Trusted Auditor receives SCTs or historical STHs from
1530	      clients, it should store them for some period of time, mix them
1531	      with SCTs received from other clients, and act upon them at some
1532	      period of time

1534	   Each of these instances has specific requirements for user privacy,
1535	   and each has options that may not be invoked.  As one example, an
1536	   HTTPS client should not mix SCTs from server A with SCTs from server
1537	   B and release server B's SCTs to Server A.  As another example, an
1538	   HTTPS server may choose to resolve STHs to a single more current STH
1539	   via proof fetching, but it is under no obligation to do so.

1541	   These requirements should be met, but the general problem of
1542	   aggregating multiple pieces of data, choosing when and how many to
1543	   release, and when to remove them is shared.  This problem has
1544	   previously been considered in the case of Mix Networks and Remailers,
1545	   including papers such as [trickle].

1547	   There are several concerns to be addressed in this area, outlined
1548	   below.

1550	11.3.1.  Mixing Algorithm

1552	   When SCTs or STHs are recorded by a participant in CT Gossip and
1553	   later used, it is important that they are selected from the datastore
1554	   in a non-deterministic fashion.

1556	   This is most important for servers, as they can be queried for SCTs
1557	   and STHs anonymously.  If the server used a predictable ordering
1558	   algorithm, an attacker could exploit the predictability to learn
1559	   information about a client.  One such method would be by observing
1560	   the (encrypted) traffic to a server.  When a client of interest
1561	   connects, the attacker makes a note.  They observe more clients
1562	   connecting, predict at what point the client-of-interest's data
1563	   will be disclosed, and ensure that they query the server at that
1564	   point.

1566	   Although most important for servers, random ordering is still
1567	   strongly recommended for clients and Trusted Auditors.  The above
1568	   attack can still occur for these entities, although the circumstances
1569	   are less straightforward.  For clients, an attacker could observe
1570	   their behavior, note when they receive an STH from a server, and use
1571	   JavaScript to cause a network connection at the correct time to force
1572	   a client to disclose the specific STH.  Trusted Auditors are stewards
1573	   of sensitive client data.  If an attacker had the ability to observe
1574	   the activities of a Trusted Auditor (perhaps by being a log, or
1575	   another auditor), they could perform the same attack - noting the
1576	   disclosure of data from a client to the Trusted Auditor, and then
1577	   correlating a later disclosure from the Trusted Auditor as coming
1578	   from that client.

1580	   Random ordering can be ensured by several mechanisms.  A datastore
1581	   can be shuffled, using a secure shuffling algorithm such as Fisher-
1582	   Yates.  Alternately, a series of random indexes into the data store
1583	   can be selected (if a collision occurs, a new index is selected.)  A
1584	   cryptographically secure random number generator must be used in
1585	   either case.  If shuffling is performed, the datastore must be marked
1586	   'dirty' upon item insertion, and at least one shuffle operation must
1587	   occur on a dirty datastore before data is retrieved from it for use.
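
   The shuffle-on-dirty approach can be sketched as follows.  This is a
   minimal illustration, assuming an in-memory list as the datastore;
   the class and function names are hypothetical.  Python's "secrets"
   module is used as the cryptographically secure random number
   generator.

```python
import secrets


def secure_shuffle(items):
    """In-place Fisher-Yates shuffle driven by a CSPRNG."""
    for i in range(len(items) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform over 0..i inclusive
        items[i], items[j] = items[j], items[i]


class MixedStore:
    """Hypothetical datastore that reshuffles itself after inserts."""

    def __init__(self):
        self._items = []
        self.dirty = False

    def insert(self, item):
        self._items.append(item)
        self.dirty = True  # order now leaks insertion time

    def take(self, count):
        """Return up to `count` items, shuffling first if dirty."""
        if self.dirty:
            secure_shuffle(self._items)
            self.dirty = False
        return self._items[:count]
```

   Because every insertion marks the store dirty, newly received records
   never sit at a position that an observer could infer from their
   arrival time.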

1589	11.3.2.  The Deletion Algorithm

1591	   No entity in CT Gossip is required to delete records at any time,
1592	   except to respect a user's wishes such as private browsing mode or
1593	   clearing history.  However, it is likely that over time the
1594	   accumulated storage will grow in size and need to be pruned.

1596	   While deletion of data will occur, proof fetching can ensure that any
1597	   misbehavior from a log will still be detected, even after the direct
1598	   evidence from the attack is deleted.  Proof fetching ensures that if
1599	   a log presents a split view for a client, they must maintain that
1600	   split view in perpetuity.  An inclusion proof from an SCT to an STH
1601	   does not erase the evidence - the new STH is evidence itself.  A
1602	   consistency proof from that STH to a new one likewise - the new STH
1603	   is every bit as incriminating as the first.  (Client behavior in the
1604	   situation where an SCT or STH cannot be resolved is suggested in
1605	   Section 11.1.2.)  Because of this property, we recommend that if a
1606	   client is performing proof fetching, that they make every effort to
1607	   not delete data until it has been successfully resolved to a new STH
1608	   via a proof.

1610	   When it is time to delete a record, it can be done in a way that
1611	   makes it more difficult for a successful flushing attack to be
1612	   performed.

1614	   1.  When the record cache has reached a certain size that is yet
1615	       under the limit, aggressively perform proof fetching.  This
1616	       should resolve records to a small set of STHs that can be
1617	       retained.  Once a proof has been fetched, the record is safer to
1618	       delete.

1620	   2.  If proof fetching has failed, or is disabled, begin by deleting
1621	       SCTs and Certificate Chains that have been successfully reported.
1622	       Deletion from this set of SCTs should be done at random.  For a
1623	       client, a submission is not counted as being reported unless it
1624	       is sent over a connection using a different SCT, so the attacker
1625	       is faced with a recursive problem.  (For a server, this step does
1626	       not apply.)

1628	   3.  Attempt to save any submissions that have failed proof fetching
1629	       repeatedly, as these are the most likely to be indicative of an
1630	       attack.

1632	   4.  Finally, if the above steps have been followed and have not
1633	       succeeded in reducing the size sufficiently, records may be
1634	       deleted at random.

1636	   Note that if proof fetching is disabled (which is expected although
1637	   not required for servers), the algorithm collapses down to 'delete
1638	   at random'.

1640	   The decision to delete records at random is intentional.  Introducing
1641	   non-determinism in the decision is absolutely necessary to make it
1642	   more difficult for an adversary to know with certainty or high
1643	   confidence that the record has been successfully flushed from a
1644	   target.
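
   Steps 2 to 4 of the algorithm can be sketched as a tiered random
   selection.  This is one possible reading, not a definitive
   implementation: step 1 (aggressive proof fetching) is omitted, the
   Record type and the threshold of 3 failures are hypothetical, and
   keeping repeatedly failing records until last is an assumption made
   here to honor step 3.

```python
import secrets
from dataclasses import dataclass

SUSPICIOUS_FAILURES = 3  # hypothetical threshold


@dataclass
class Record:
    name: str
    reported: bool = False
    proof_failures: int = 0


def choose_deletions(records, excess):
    """Pick up to `excess` records to delete, at random within tiers."""
    def suspicious(r):
        return r.proof_failures >= SUSPICIOUS_FAILURES

    tiers = [
        # Step 2: successfully reported records go first
        [r for r in records if r.reported and not suspicious(r)],
        # Step 4: then any other record
        [r for r in records if not r.reported and not suspicious(r)],
        # Step 3: records that keep failing proofs are kept longest
        [r for r in records if suspicious(r)],
    ]
    victims = []
    for pool in tiers:
        while pool and len(victims) < excess:
            # Random choice so an attacker cannot predict the victim
            victims.append(pool.pop(secrets.randbelow(len(pool))))
    return victims
```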

1646	11.4.  Concrete Recommendations

1648	   We present the following pseudocode as a concrete outline of our
1649	   policy recommendations.

1651	   Both suggestions presented are applicable to both clients and
1652	   servers.  Servers may not perform proof fetching, in which case large
1653	   portions of the pseudocode are not applicable, but the pseudocode is
1654	   intended to work in either case.

1656	11.4.1.  STH Pollination

1658	   The STH class contains data pertaining specifically to the STH
1659	   itself.

1661	   class STH
1662	   {
1663	     uint16   proof_attempts
1664	     uint16   proof_failure_count
1665	     uint32   num_reports_to_thirdparty
1666	     datetime timestamp
1667	     byte[]   data
1668	   }

1670	   The broader STH store itself would contain all the STHs known by an
1671	   entity participating in STH Pollination (either client or server).
1672	   This simplistic view of the class does not take into account the
1673	   complicated locking that would likely be required for a data
1674	   structure being accessed by multiple threads.  Something to note
1675	   about this pseudocode is that it does not remove STHs once they have
1676	   been resolved to a newer STH.  Doing so might make older STHs within
1677	   the validity window rarer and thus enable tracking.

1679	   class STHStore
1680	   {
1681	     STH[] sth_list

1683	     //  This function is run after receiving a set of STHs from
1684	     //  a third party in response to a pollination submission
1685	     def insert(STH[] new_sths) {
1686	       foreach(new in new_sths) {
1687	         if(this.sth_list.contains(new))
1688	           continue
1689	         this.sth_list.insert(new)
1690	       }
1691	     }

1693	     //  This function is called to delete the given STH
1694	     //  from the data store
1695	     def delete_now(STH s) {
1696	       this.sth_list.remove(s)
1697	     }

1699	     //  When it is time to perform STH Pollination, the HTTPS client
1700	     //  calls this function to get a selection of STHs to send as
1701	     //  feedback
1702	     def get_pollination_selection() {
1703	       if(len(this.sth_list) < MAX_STH_TO_GOSSIP)
1704	         return this.sth_list
1705	       else {
1706	         indexes = set()
1707	         modulus = len(this.sth_list)
1708	         while(len(indexes) < MAX_STH_TO_GOSSIP) {
1709	           r = randomInt() % modulus
1710	           // Ignore STHs that are past the validity window but not
1711	           // yet removed.
1712	           if(r not in indexes
1713	              && now() - this.sth_list[r].timestamp < TWO_WEEKS)
1714	             indexes.insert(r)
1715	         }

1717	         return_selection = []
1718	         foreach(i in indexes) {
1719	           return_selection.insert(this.sth_list[i])
1720	         }
1721	         return return_selection
1722	       }
1723	     }
1724	   }
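
   The selection loop in the pseudocode keeps drawing indexes until it
   has MAX_STH_TO_GOSSIP in-window STHs, so it may not terminate if the
   store holds fewer in-window STHs than that.  A variant that filters
   first and then samples avoids this.  The sketch below is illustrative
   only: STHs are modeled as (timestamp, data) tuples, the limit of 10
   is a hypothetical value, and unlike the pseudocode it also drops
   expired STHs when the store is small.

```python
import secrets
from datetime import datetime, timedelta

VALIDITY_WINDOW = timedelta(days=14)
MAX_STH_TO_GOSSIP = 10  # hypothetical limit

_rng = secrets.SystemRandom()  # CSPRNG-backed random source


def pollination_selection(sths, now, limit=MAX_STH_TO_GOSSIP):
    """Pick up to `limit` in-window STHs uniformly at random.

    `sths` is a list of (timestamp, data) tuples.
    """
    fresh = [s for s in sths if now - s[0] < VALIDITY_WINDOW]
    if len(fresh) <= limit:
        return fresh
    return _rng.sample(fresh, limit)
```
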
1725	   We also suggest a function that will be called periodically in the
1726	   background, iterating through the STH store, performing a cleaning
1727	   operation and queuing consistency proofs.  This function can live as
1728	   a member function of the STHStore class.

1730	//Just a suggestion:
1731	#define MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS 3

1733	def clean_list() {
1734	  foreach(sth in this.sth_list) {

1736	    if(now() - sth.timestamp > TWO_WEEKS) {
1737	      //STH is too old, we must remove it
1738	      if(proof_fetching_enabled
1739	         && auditor_of_last_resort_enabled
1740	         && sth.proof_failure_count
1741	            > MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) {
1742	        queue_for_auditor_of_last_resort(sth,
1743	                                        auditor_of_last_resort_callback)
1744	      } else {
1745	        delete_now(sth)
1746	      }
1747	    }

1749	    else if(proof_fetching_enabled
1750	            && now() - sth.timestamp > LOG_MMD
1751	            && sth.proof_attempts != UINT16_MAX
1752	            // Only fetch a proof if we have never received a proof
1753	            // before. (This also avoids submitting something
1754	            // already in the queue.)
1755	            && sth.proof_attempts == sth.proof_failure_count) {
1756	      sth.proof_attempts++
1757	      queue_consistency_proof(sth, consistency_proof_callback)
1758	    }
1759	  }
1760	}

1762	   These functions also exist in the STHStore class.

1764	//  This function is called after successfully pollinating STHs
1765	//  to a third party. It is passed the STHs sent to the third
1766	//  party, which is the output of get_pollination_selection(), as
1767	//  well as the STHs received in the response.
1768	def successful_thirdparty_submission_callback(STH[] submitted_sth_list,
1769	                                              STH[] new_sths)
1770	{
1771	  foreach(sth in submitted_sth_list) {
1772	    sth.num_reports_to_thirdparty++
1773	  }

1775	  this.insert(new_sths);
1776	}

1778	//  Attempt auditor of last resort submissions until it succeeds
1779	def auditor_of_last_resort_callback(original_sth, error) {
1780	  if(!error) {
1781	    delete_now(original_sth)
1782	  }
1783	}

1785	def consistency_proof_callback(consistency_proof, original_sth, error) {
1786	  if(!error) {
1787	    insert([consistency_proof.current_sth])
1788	  } else {
1789	    original_sth.proof_failure_count++
1790	  }
1791	}

1793	11.4.2.  SCT Feedback

1795	   The SCT class contains data pertaining specifically to an SCT itself.

1797	   class SCT
1798	   {
1799	     uint16 proof_failure_count
1800	     bool   has_been_resolved_to_sth
1801	     bool   proof_outstanding
1802	     byte[] data
1803	   }

1805	   The SCT bundle will contain the trusted certificate chain the HTTPS
1806	   client built (chaining to a trusted root certificate.)  It also
1807	   contains the list of associated SCTs, the exact domain it is
1808	   applicable to, and metadata pertaining to how often it has been
1809	   reported to the third party.

1811	   class SCTBundle
1812	   {
1813	     X509[] certificate_chain
1814	     SCT[]  sct_list
1815	     string domain
1816	     uint32 num_reports_to_thirdparty

1818	     def equals(sct_bundle) {
1819	       if(sct_bundle.domain != this.domain)
1820	         return false
1821	       if(sct_bundle.certificate_chain != this.certificate_chain)
1822	         return false
1823	       if(sct_bundle.sct_list != this.sct_list)
1824	         return false

1826	       return true
1827	     }
1828	     def approx_equals(sct_bundle) {
1829	       if(sct_bundle.domain != this.domain)
1830	         return false
1831	       if(sct_bundle.certificate_chain != this.certificate_chain)
1832	         return false

1834	       return true
1835	     }

1837	     def insert_scts(sct[] sct_list) {
1838	       this.sct_list.union(sct_list)
1839	       this.num_reports_to_thirdparty = 0
1840	     }

1842	     def has_been_fully_resolved_to_sths() {
1843	       foreach(s in this.sct_list) {
1844	         if(!s.has_been_resolved_to_sth && !s.proof_outstanding)
1845	           return false
1846	       }
1847	       return true
1848	     }

1850	     def max_proof_failures() {
1851	       uint max = 0
1852	       foreach(sct in this.sct_list) {
1853	         if(sct.proof_failure_count > max)
1854	           max = sct.proof_failure_count
1855	       }
1856	       return max
1857	     }
1858	   }
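
   As a non-normative illustration, the comparison rules of SCTBundle
   can be sketched in Python.  The sketch assumes that each SCT is an
   opaque byte string and that sct_list behaves as a set, so ordering
   is ignored; the field types are assumptions made for this example
   and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class SCTBundle:
    domain: str
    certificate_chain: tuple      # assumed: tuple of DER blobs, leaf first
    sct_list: set = field(default_factory=set)
    num_reports_to_thirdparty: int = 0

    def approx_equals(self, other):
        # Same domain and same chain; the SCT sets may differ.
        return (self.domain == other.domain
                and self.certificate_chain == other.certificate_chain)

    def equals(self, other):
        # Exact match: approx_equals plus identical SCT sets.
        return self.approx_equals(other) and self.sct_list == other.sct_list

    def insert_scts(self, scts):
        # Merging SCTs resets the report counter, as in the pseudocode.
        self.sct_list |= set(scts)
        self.num_reports_to_thirdparty = 0
```

   Two bundles for the same domain and chain that differ only in
   their SCTs are approx_equals() but not equals(); merging them with
   insert_scts() makes them equal and marks the result as unreported.
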
1859	   For each domain, we store a SCTDomainEntry that holds the SCTBundles
1860	   seen for that domain, as well as encapsulating some logic relating to
1861	   SCT Feedback for that particular domain.  In particular, this data
1862	   structure also contains the logic that handles domains not supporting
1863	   SCT Feedback.  Its behavior is:

1865	   1.  When a user visits a domain, SCT Feedback is attempted for it.
1866	       If it fails, it will retry after a month (configurable).  If it
1867	       succeeds, excellent.  SCT Feedback data is still collected and
1868	       stored even if SCT Feedback failed.

1870	   2.  After 3 month-long waits between failures, the domain will be
1871	       marked as failing long-term.  No SCT Feedback data will be stored
1872	       beyond meta-data, but SCT Feedback will still be attempted after
1873	       month-long waits.

1875	   3.  If at any point in time, SCT Feedback succeeds, all failure
1876	       counters are reset.

1878	   4.  If a domain succeeds, but then begins failing, it must fail more
1879	       than 90% of the time (configurable) and then the process begins
1880	       at (2).

1882	   If a domain is visited infrequently (say, once every 7 months) then
1883	   it will be evicted from the cache and start all over again (according
1884	   to the suggested values in the pseudocode below).

1886	   [ Note: To be certain the logic is correct I give the following test
1887	   cases which illustrate the intended behavior.  Hopefully the code
1888	   matches!

1890	 Succeed 1 Time        num_submissions_attempted=1    num_submissions_succeeded=1  num_feedback_loop_failures=0
1891	 Fail 10 Times         num_submissions_attempted=11   num_submissions_succeeded=1  num_feedback_loop_failures=0
1892	 ... wait a month ...
1893	 Fail 1 month later    num_submissions_attempted=12   num_submissions_succeeded=1  num_feedback_loop_failures=1
1894	 ... wait a month ...
1895	 Succeed 1 month later num_submissions_attempted=13   num_submissions_succeeded=2  num_feedback_loop_failures=0(r)
1896	 -> Feedback is attempted regularly.  Here and below, (r) marks a counter that was reset.

1898	 Succeed 1 Time        num_submissions_attempted=1    num_submissions_succeeded=1  num_feedback_loop_failures=0
1899	 Fail 10 Times         num_submissions_attempted=11   num_submissions_succeeded=1  num_feedback_loop_failures=0
1900	 ... wait a month ...
1901	 Fail 1 month later    num_submissions_attempted=12   num_submissions_succeeded=1  num_feedback_loop_failures=1
1902	 ... wait a month ...
1903	 Fail 1 month later    num_submissions_attempted=13   num_submissions_succeeded=1  num_feedback_loop_failures=2
1904	 ... wait a month ...
1905	 Succeed 1 month later num_submissions_attempted=14   num_submissions_succeeded=2  num_feedback_loop_failures=0(r)
1906	 -> Feedback is attempted regularly.

1908	 Succeed 1 Time        num_submissions_attempted=1    num_submissions_succeeded=1  num_feedback_loop_failures=0
1909	 Fail 10 Times         num_submissions_attempted=11   num_submissions_succeeded=1  num_feedback_loop_failures=0
1910	 ... wait a month ...
1911	 Fail 1 month later    num_submissions_attempted=12   num_submissions_succeeded=1  num_feedback_loop_failures=1
1912	 ... wait a month ...
1913	 Fail 1 month later    num_submissions_attempted=13   num_submissions_succeeded=1  num_feedback_loop_failures=2
1914	 ... wait a month ...
1915	 Fail 1 month later    num_submissions_attempted=14   num_submissions_succeeded=1  num_feedback_loop_failures=3
1916	 ... clear_old_data() is run every hour ...
1917	                       num_submissions_attempted=0    num_submissions_succeeded=0  num_feedback_loop_failures=3
1918	                       sct_feedback_failing_longterm=True
1919	 Fail 1 month later    num_submissions_attempted=1    num_submissions_succeeded=0  num_feedback_loop_failures=4
1920	                       sct_feedback_failing_longterm=True
1921	 ... clear_old_data() is run every hour ...
1922	                       num_submissions_attempted=0(r) num_submissions_succeeded=0  num_feedback_loop_failures=4
1923	                       sct_feedback_failing_longterm=True
1924	 Succeed 1 month later num_submissions_attempted=1    num_submissions_succeeded=1  num_feedback_loop_failures=0(r)
1925	                       sct_feedback_failing_longterm=False
1926	 -> Feedback is attempted regularly.

1928	 Note above that the second run of clear_old_data() will reset num_submissions_attempted from 1 to 0.  This is
1929	 CRITICAL. Otherwise, we would have the below bug (where after 10 months of failures, a success would not hit
1930	 the required ratio to keep going)

1932	 //The below represents a bug.
1933	 Succeed 1 Time        num_submissions_attempted=1   num_submissions_succeeded=1  num_feedback_loop_failures=0
1934	 Fail 10 Times         num_submissions_attempted=11  num_submissions_succeeded=1  num_feedback_loop_failures=0
1935	 ... wait a month ...
1936	 Fail 1 month later    num_submissions_attempted=12  num_submissions_succeeded=1  num_feedback_loop_failures=1
1937	 ... wait a month ...
1938	 Fail 1 month later    num_submissions_attempted=13  num_submissions_succeeded=1  num_feedback_loop_failures=2
1939	 ... wait a month ...
1940	 Fail 1 month later    num_submissions_attempted=14  num_submissions_succeeded=1  num_feedback_loop_failures=3
1941	 ... clear_old_data() is run every hour ...
1942	                       num_submissions_attempted=0   num_submissions_succeeded=0  num_feedback_loop_failures=3
1943	                       sct_feedback_failing_longterm=True
1944	 Fail 1 month later    num_submissions_attempted=1   num_submissions_succeeded=0  num_feedback_loop_failures=4
1945	                       sct_feedback_failing_longterm=True
1946	 Fail 9 times for 9 months
1947	                       num_submissions_attempted=10  num_submissions_succeeded=0  num_feedback_loop_failures=13
1948	                       sct_feedback_failing_longterm=True
1949	 Succeed 1 month later num_submissions_attempted=11  num_submissions_succeeded=1  num_feedback_loop_failures=0(r)
1950	                       sct_feedback_failing_longterm=False
1951	 -> Feedback is NOT attempted regularly. ]
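
   The arithmetic behind the note above can be checked with a short,
   non-normative Python sketch.  ratio_after_recovery() is a
   hypothetical helper introduced only for this example; it models
   just the two submission counters of a domain that is already
   failing long-term, and assumes one failed attempt per month
   followed by a single success.

```python
MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING = 0.10


def ratio_after_recovery(monthly_failures, reset_each_run):
    """Success ratio after `monthly_failures` failed attempts and then
    one success, starting from freshly cleared counters."""
    attempted = succeeded = 0
    for _ in range(monthly_failures):
        attempted += 1                 # one failed monthly attempt
        if reset_each_run:             # clear_old_data() resets every run
            attempted = succeeded = 0
    attempted += 1                     # the eventual successful attempt
    succeeded += 1
    return succeeded / attempted


good = ratio_after_recovery(10, reset_each_run=True)
bad = ratio_after_recovery(10, reset_each_run=False)
print(good > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING)
print(bad > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING)
```

   With the reset, the single success gives a ratio of 1/1 and
   feedback resumes.  Without it, the ratio is 1/11, which is below
   .10, so feedback would not be attempted regularly.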

1953	//Suggestions:
1954	//  After concluding a domain doesn't support feedback, we try again
1955	//  after WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time to see if
1956	//  they added support
1957	#define WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS                     1 month

1959	//  If we've waited MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE
1960	//  multiplied by WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, we
1961	//  still attempt SCT Feedback, but no longer bother storing any data
1962	//  until the domain supports SCT Feedback
1963	#define MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE      3

1965	//  If more than this fraction of SCT Feedback attempts previously
1966	//  succeeded, we consider the domain to support feedback and to be
1967	//  merely experiencing transient errors
1968	#define MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING               .10

1970	class SCTDomainEntry
1971	{
1972	  //  This is the primary key of the object, the exact domain name it
1973	  //  is valid for
1974	  string   domain

1976	  //  This is the last time the domain was contacted. For client
1977	  //  operations it is updated whenever the client makes any request
1978	  //  (not just feedback) to the domain. For server operations, it is
1979	  //  updated whenever any client contacts the domain. Responsibility
1980	  //  for updating lies OUTSIDE of the class
1981	  public datetime last_contact_for_domain

1983	  //  This is the last time SCT Feedback was attempted for the domain.
1984	  //  It is updated whenever feedback is attempted - responsibility for
1985	  //  updating lies OUTSIDE of the class
1986	  //  This is not used when this algorithm runs on servers
1987	  public datetime last_sct_feedback_attempt

1989	  //  This is the number of times we have waited a
1990	  //  WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS amount of time, and still failed
1991	  //  e.g. 10 months of failures
1992	  //  This is not used when this algorithm runs on servers
1993	  private uint16   num_feedback_loop_failures

1995	  //  This is whether or not SCT Feedback has failed enough times that we
1996	  //  should not bother storing data for it anymore. It is a small function
1997	  //  used for illustrative purposes
1998	  //  This is not used when this algorithm runs on servers
1999	  private bool     sct_feedback_failing_longterm()
2000	    { return num_feedback_loop_failures >= MIN_SCT_FEEDBACK_ATTEMPTS_BEFORE_OMITTING_STORAGE }

2002	  //  This is the number of SCT Feedback submissions attempted.

2004	  //  Responsibility for incrementing lies OUTSIDE of the class
2005	  //  (And watch for integer overflows)
2006	  //  This is not used when this algorithm runs on servers
2007	  public uint16    num_submissions_attempted

2009	  //  This is the number of successful SCT Feedback submissions. This
2010	  //  variable is updated by the class.
2011	  //  This is not used when this algorithm runs on servers
2012	  private uint16   num_submissions_succeeded

2014	  //  This contains all the bundles of SCT data we have observed for
2015	  //  this domain
2016	  SCTBundle[] observed_records

2018	  //  This function can be called to determine if we should attempt
2019	  //  SCT Feedback for this domain.
2020	  def should_attempt_feedback() {
2021	    // Servers always perform feedback!
2022	    if(operator_is_server)
2023	      return true

2025	    // If we have not tried in a month, try again
2026	    if(now() - last_sct_feedback_attempt > WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS)
2027	      return true

2029	    // If we have tried recently, and it seems to be working, go for it!
2030	    if(num_submissions_attempted > 0 && ((float)num_submissions_succeeded /
2031	       num_submissions_attempted) > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING)
2032	      return true

2034	    // Otherwise don't try
2035	    return false
2036	  }

2038	  //  For Clients, this function is called after a successful
2039	  //  connection to an HTTPS server, with a single SCTBundle
2040	  //  constructed from that connection's certificate chain and SCTs.
2041	  //  For Servers, this is called after receiving SCT Feedback with
2042	  //  all the bundles sent in the feedback.
2043	  def insert(SCTBundle[] bundles) {
2044	    // Do not store data for long-failing domains
2045	    if(sct_feedback_failing_longterm()) {
2046	      return
2047	    }

2049	    foreach(b in bundles) {
2050	      if(operator_is_server) {
2051	        if(!passes_validity_checks(b))
2052	          return
2053	      }

2055	      bool have_inserted = false
2056	      foreach(e in this.observed_records) {
2057	        if(e.equals(b))
2058	          have_inserted = true  // exact duplicate; go on to the next bundle
2059	        else if(e.approx_equals(b)) {
2060	          have_inserted = true
2061	          e.insert_scts(b.sct_list)
2062	        }
2063	      }
2064	      if(!have_inserted)
2065	        this.observed_records.insert(b)
2066	    }
2067	    SCTStoreManager.update_cache_percentage()
2068	  }

2070	  //  When it is time to perform SCT Feedback, the HTTPS client
2071	  //  calls this function to get a selection of SCTBundles to send
2072	  //  as feedback
2073	  def get_gossip_selection() {
2074	    if(len(observed_records) > MAX_SCT_RECORDS_TO_GOSSIP) {
2075	      indexes = set()
2076	      modulus = len(observed_records)
2077	      while(len(indexes) < MAX_SCT_RECORDS_TO_GOSSIP) {
2078	        r = randomInt() % modulus
2079	        if(r not in indexes)
2080	          indexes.insert(r)
2081	      }

2083	      return_selection = []
2084	      foreach(i in indexes) {
2085	        return_selection.insert(this.observed_records[i])
2086	      }

2088	      return return_selection
2089	    }
2090	    else
2091	      return this.observed_records
2092	  }

2094	  def passes_validity_checks(SCTBundle b) {
2095	    //  This function performs the validity checks specified in
2096	    //  {{feedback-srvop}}
2097	  }
2098	}
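
   The two decision functions of SCTDomainEntry can be sketched in
   Python as follows.  This is a non-normative illustration under
   stated assumptions: "1 month" is approximated as 30 days, the
   value of MAX_SCT_RECORDS_TO_GOSSIP is chosen only for the example,
   the ratio test is guarded against a zero attempt count, and
   random.sample() stands in for the rejection loop in the
   pseudocode.

```python
import random
from datetime import datetime, timedelta

WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS = timedelta(days=30)  # "1 month", assumed
MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING = 0.10
MAX_SCT_RECORDS_TO_GOSSIP = 3                            # illustrative value


def should_attempt_feedback(now, last_attempt, succeeded, attempted,
                            operator_is_server=False):
    if operator_is_server:
        return True
    # Never tried, or not tried within the waiting period: try again.
    if (last_attempt is None
            or now - last_attempt > WAIT_BETWEEN_SCT_FEEDBACK_ATTEMPTS):
        return True
    # Tried recently and it seems to be working.
    if (attempted > 0
            and succeeded / attempted
            > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING):
        return True
    return False


def get_gossip_selection(observed_records, rng=None):
    rng = rng or random.SystemRandom()
    if len(observed_records) <= MAX_SCT_RECORDS_TO_GOSSIP:
        return list(observed_records)
    # Pick distinct records uniformly at random.
    return rng.sample(observed_records, MAX_SCT_RECORDS_TO_GOSSIP)
```

   Sampling without replacement gives the same distribution as the
   index-collecting loop, and avoids the small bias that a plain
   modulo reduction of a random integer can introduce.
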
2099	   The SCTDomainEntry is responsible for handling the outcome of a
2100	   submission report for that domain using its member function:

2102	//  This function is called after providing SCT Feedback
2103	//  to a server. It is passed the feedback sent to the other party, which
2104	//  is the output of get_gossip_selection(), and also the SCTBundle
2105	//  representing the connection the data was sent on.
2106	//  (When this code runs on the server, connectionBundle is NULL)
2107	//  If the Feedback was not sent successfully, error is True
2108	def after_submit_to_thirdparty(error, SCTBundle[] submittedBundles,
2109	                               SCTBundle connectionBundle)
2110	{
2111	  // Server operation in this instance is exceedingly simple
2112	  if(operator_is_server) {
2113	    if(error)
2114	      return
2115	    foreach(bundle in submittedBundles)
2116	      bundle.num_reports_to_thirdparty++
2117	    return
2118	  }

2120	  // Client behavior is much more complicated
2121	  if(error) {
2122	    if(sct_feedback_failing_longterm()) {
2123	      num_feedback_loop_failures++
2124	    }
2125	    else if(num_submissions_attempted > 0 && ((float)num_submissions_succeeded /
2126	            num_submissions_attempted) > MIN_RATIO_FOR_SCT_FEEDBACK_TO_BE_WORKING) {
2127	      // Do nothing. num_submissions_succeeded will not be incremented.
2128	      // After enough of these failures, the ratio will fall below the
2129	      // acceptable threshold
2130	    } else {
2131	      // The domain has begun its three-month grace period. We will
2132	      // attempt submissions once a month
2133	      num_feedback_loop_failures++
2134	    }
2135	    return
2136	  }
2137	  // We succeeded, so reset all of our failure states
2138	  // Note, there is a race condition here if clear_old_data() is called
2139	  // while this callback is outstanding.
2140	  num_feedback_loop_failures     = 0
2141	  if(num_submissions_succeeded != UINT16_MAX )
2142	    num_submissions_succeeded++

2144	  foreach(bundle in submittedBundles)
2145	  {
2146	    // Compare Certificate Chains, if they do not match, it counts as a
2147	    // submission.
2148	    if(!connectionBundle.approx_equals(bundle))
2149	      bundle.num_reports_to_thirdparty++
2150	    else {
2151	      // This check ensures that a SCT Bundle is not considered reported
2152	      // if it is submitted over a connection with the same SCTs. This
2153	      // satisfies the constraint in Paragraph 5 of {{feedback-clisrv}}
2154	      // Consider three submission scenarios:
2155	      // Submitted SCTs      Connection SCTs    Considered Submitted
2156	      // A, B                A, B               No - no new information
2157	      // A                   A, B               Yes - B is a new SCT
2158	      // A, B                A                  No - no new information
2159	      if(connectionBundle.sct_list is NOT a subset of bundle.sct_list)
2160	        bundle.num_reports_to_thirdparty++
2161	    }
2162	  }
2163	}
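
   The rule used above to decide whether a submitted bundle counts as
   reported can be restated as a small, non-normative Python sketch.
   counts_as_reported() is a hypothetical helper; SCTs are modeled as
   plain strings for the example.

```python
def counts_as_reported(submitted_scts, connection_scts, same_chain=True):
    """Mirror of the check in after_submit_to_thirdparty()."""
    if not same_chain:
        # Bundle for a different certificate chain: always counts.
        return True
    # Same chain: counts only if the connection's SCT set is NOT a
    # subset of the submitted bundle's SCT set.
    return not set(connection_scts) <= set(submitted_scts)
```

   Applied to the three scenarios in the comment table, this yields
   No, Yes and No respectively.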

2165	   Instances of the SCTDomainEntry class are stored as part of a larger
2166	   class that manages the entire SCT Cache, storing them in a hashmap
2167	   keyed by domain.  This class also tracks the current size of the
2168	   cache, and will trigger cache eviction.

2170	//Suggestions:
2171	#define CACHE_PRESSURE_SAFE                   .50
2172	#define CACHE_PRESSURE_IMMINENT               .70
2173	#define CACHE_PRESSURE_ALMOST_FULL            .85
2174	#define CACHE_PRESSURE_FULL                   .95
2175	#define WAIT_BETWEEN_IMMINENT_CACHE_EVICTION  5 minutes

2177	class SCTStoreManager
2178	{
2179	  hashmap<String, SCTDomainEntry> all_sct_stores
2180	  uint32                         current_cache_size
2181	  datetime                       imminent_cache_pressure_check_performed

2183	  float current_cache_percentage() {
2184	    return (float)current_cache_size / MAX_CACHE_SIZE;
2185	  }

2187	  static def update_cache_percentage() {
2188	    // This function calculates the current size of the cache
2189	    // and updates current_cache_size
2190	    /* ... perform calculations ... */
2191	    current_cache_size = /* new calculated value */

2193	    // Perform locking to prevent multiple of these functions being
2194	    // called concurrently or unnecessarily
2195	    if(current_cache_percentage() > CACHE_PRESSURE_FULL) {
2196	        cache_is_full()
2197	    }

2199	    else if(current_cache_percentage() > CACHE_PRESSURE_ALMOST_FULL) {
2200	      cache_pressure_almost_full()
2201	    }

2203	    else if(current_cache_percentage() > CACHE_PRESSURE_IMMINENT) {
2204	      // Do not repeatedly perform the imminent cache pressure operation
2205	      if(now() - imminent_cache_pressure_check_performed >
2206	          WAIT_BETWEEN_IMMINENT_CACHE_EVICTION) {
2207	        cache_pressure_is_imminent()
2208	      }
2209	    }
2210	  }
2211	}
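
   The threshold dispatch in update_cache_percentage() can be
   sketched in Python as below.  This is a non-normative
   illustration: pressure_action() is a hypothetical helper that
   returns the name of the routine to run instead of calling it, and
   it assumes the caller records the time whenever the imminent pass
   is run.

```python
from datetime import datetime, timedelta

CACHE_PRESSURE_IMMINENT = 0.70
CACHE_PRESSURE_ALMOST_FULL = 0.85
CACHE_PRESSURE_FULL = 0.95
WAIT_BETWEEN_IMMINENT_CACHE_EVICTION = timedelta(minutes=5)


def pressure_action(current_size, max_size, now, last_imminent_check):
    """Return the name of the eviction routine to run, or None."""
    fraction = current_size / max_size   # true division in Python 3
    if fraction > CACHE_PRESSURE_FULL:
        return "cache_is_full"
    if fraction > CACHE_PRESSURE_ALMOST_FULL:
        return "cache_pressure_almost_full"
    if fraction > CACHE_PRESSURE_IMMINENT:
        # Do not repeat the imminent pass more often than the wait.
        if now - last_imminent_check > WAIT_BETWEEN_IMMINENT_CACHE_EVICTION:
            return "cache_pressure_is_imminent"
    return None
```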

2213	   The SCTStoreManager contains a function that will be called
2214	   periodically in the background, iterating through all SCTDomainEntry
2215	   objects and performing maintenance tasks.  It removes data for
2216	   domains we have not contacted in a long time.  This function is not
2217	   intended to clear data if the cache is getting full, separate
2218	   functions are used for that.

2220	 // Suggestions:
2221	 #define TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED     3 months
2222	 #define TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED   6 months

2224	 def clear_old_data()
2225	 {
2226	   foreach(domainEntry in all_sct_stores)
2227	   {
2228	     // Queue proof fetches
2229	     if(proof_fetching_enabled) {
2230	       foreach(sctBundle in domainEntry.observed_records) {
2231	         if(!sctBundle.has_been_fully_resolved_to_sths()) {
2232	           foreach(sct in sctBundle.sct_list) {
2233	             if(!sct.has_been_resolved_to_sth && !sct.proof_outstanding) {
2234	               sct.proof_outstanding = True
2235	               queue_inclusion_proof(sct, inclusion_proof_callback)
2236	             }
2237	           }
2238	         }
2239	       }
2240	     }

2242	     // Do not store data for domains that do not support SCT Feedback
2243	     if(!operator_is_server
2244	        && domainEntry.sct_feedback_failing_longterm())
2245	     {
2246	       // Note that resetting these variables every single time is
2247	       // necessary to avoid a bug
2248	       all_sct_stores[domainEntry].num_submissions_attempted      = 0
2249	       all_sct_stores[domainEntry].num_submissions_succeeded      = 0
2250	       delete all_sct_stores[domainEntry].observed_records
2251	       all_sct_stores[domainEntry].observed_records               = NULL
2252	     }

2254	     // This check removes successfully submitted data for
2255	     // old domains we have not dealt with in a long time
2256	     if(domainEntry.num_submissions_succeeded > 0
2257	        && now() - domainEntry.last_contact_for_domain
2258	           > TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED)
2259	     {
2260	       all_sct_stores.remove(domainEntry)
2261	     }

2263	     // This check removes unsuccessfully submitted data for
2264	     // old domains we have not dealt with in a very long time
2265	     if(now() - domainEntry.last_contact_for_domain
2266	        > TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED)
2267	     {
2268	       all_sct_stores.remove(domainEntry)
2269	     }
2270	   }
2271	   SCTStoreManager.update_cache_percentage()
2272	 }
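
   The two age-based checks in clear_old_data() can be sketched in
   Python as follows.  This is a non-normative illustration: months
   are approximated as 30 days, and each domain entry is modeled as a
   dictionary with hypothetical keys "last_contact" and "succeeded".

```python
from datetime import datetime, timedelta

TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED = timedelta(days=90)     # "3 months"
TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED = timedelta(days=180)  # "6 months"


def is_expired(last_contact, num_submissions_succeeded, now):
    age = now - last_contact
    # Successfully submitted data may be dropped sooner.
    if (num_submissions_succeeded > 0
            and age > TIME_UNTIL_OLD_SUBMITTED_SCTDATA_ERASED):
        return True
    # Everything is dropped after the longer limit.
    return age > TIME_UNTIL_OLD_UNSUBMITTED_SCTDATA_ERASED


def expire(entries, now):
    # Build a new mapping instead of deleting while iterating.
    return {domain: e for domain, e in entries.items()
            if not is_expired(e["last_contact"], e["succeeded"], now)}
```

   Returning a new mapping sidesteps removing entries from a
   container while it is being iterated, which the pseudocode leaves
   to the implementation.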

2274	   Inclusion Proof Fetching is handled fairly independently:

2276	 // This function is a callback invoked after an inclusion proof
2277	 // has been retrieved. It can exist on the SCT class or independently,
2278	 // so long as it can modify the SCT class' members
2279	 def inclusion_proof_callback(inclusion_proof, original_sct, error)
2280	 {
2281	   // Unlike the STH code, this counter must be incremented on the
2282	   // callback as there is a race condition on using this counter in the
2283	   // cache_* functions.
2284	   original_sct.proof_attempts++
2285	   original_sct.proof_outstanding = False
2286	   if(!error) {
2287	     original_sct.has_been_resolved_to_sth = True
2288	     insert_to_sth_datastore(inclusion_proof.new_sth)
2289	   } else {
2290	     original_sct.proof_failure_count++
2291	   }
2292	 }

2294	   If the cache is getting full, these three member functions of the
2295	   SCTStoreManager class will be used.

2297	// -----------------------------------------------------------------
2298	// This function is called when the cache is not yet full, but is
2299	// nearing it. It prioritizes deleting data that should be safe
2300	// to delete (because it has been shared with the site or resolved
2301	// to a STH)
2302	def cache_pressure_is_imminent()
2303	{
2304	  bundlesToDelete = []
2305	  foreach(domainEntry in all_sct_stores) {
2306	    foreach(sctBundle in domainEntry.observed_records) {

2308	      if(proof_fetching_enabled) {
2309	        // First, queue proofs for anything not already queued.
2310	        if(!sctBundle.has_been_fully_resolved_to_sths()) {
2311	          foreach(sct in sctBundle.sct_list) {
2312	            if(!sct.has_been_resolved_to_sth
2313	               && !sct.proof_outstanding) {
2314	              sct.proof_outstanding = True
2315	              queue_inclusion_proof(sct, inclusion_proof_callback)
2316	            }
2317	          }
2318	        }

2320	        // Second, consider deleting entries that have been fully
2321	        // resolved.
2322	        else {
2323	          bundlesToDelete.append( Struct(domainEntry, sctBundle) )
2324	        }
2325	      }

2327	      // Third, consider deleting entries that have been successfully
2328	      // reported
2329	      if(sctBundle.num_reports_to_thirdparty > 0) {
2330	        bundlesToDelete.append( Struct(domainEntry, sctBundle) )
2331	      }
2332	    }
2333	  }

2335	  // Fourth, delete the eligible entries at random until the cache is
2336	  // at a safe level
2337	  uint recalculateIndex                = 0
2338	  #define RECALCULATE_EVERY_N_OPERATIONS 50

2340	  while(bundlesToDelete.length > 0 &&
2341	        current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2342	    uint rndIndex = rand() % bundlesToDelete.length
2343	    bundlesToDelete[rndIndex].domainEntry.observed_records.remove(bundlesToDelete[rndIndex].sctBundle)
2344	    bundlesToDelete.removeAt(rndIndex)

2346	    recalculateIndex++
2347	    if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2348	      update_cache_percentage()
2349	    }
2350	  }

2352	  // Finally, tell the proof fetching engine to go faster
2353	  if(proof_fetching_enabled) {
2354	    // This function would speed up proof fetching until an
2355	    // arbitrary time has passed. Perhaps until it has fetched
2356	    // proofs for the number of items currently in its queue? Or
2357	    // a percentage of them?
2358	    proof_fetch_faster_please()
2359	  }
2360	  update_cache_percentage();
2361	}

2363	// -----------------------------------------------------------------
2364	// This function is called when the cache is almost full. It will
2365	// evict entries at random, while attempting to save entries that
2366	// appear to have proof fetching failures
2367	def cache_pressure_almost_full()
2368	{
2369	  uint recalculateIndex                = 0
2370	  uint savedRecords                    = 0
2371	  #define RECALCULATE_EVERY_N_OPERATIONS 50

2373	  while(all_sct_stores.length > savedRecords &&
2374	        current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2375	    uint rndIndex1 = rand() % all_sct_stores.length
2376	    uint rndIndex2 = rand() % all_sct_stores[rndIndex1].observed_records.length

2378	    if(proof_fetching_enabled) {
2379	      if(all_sct_stores[rndIndex1].observed_records[rndIndex2].max_proof_failures() >
2380	         MIN_PROOF_FAILURES_CONSIDERED_SUSPICIOUS) {
2381	        savedRecords++
2382	        continue
2383	      }
2384	    }

2386	    // If proof fetching is not enabled we need some other logic
2387	    else {
2388	      if(all_sct_stores[rndIndex1].observed_records[rndIndex2].num_reports_to_thirdparty == 0) {
2389	        savedRecords++
2390	        continue
2391	      }
2392	    }

2394	    all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
2395	    if(all_sct_stores[rndIndex1].observed_records.length == 0) {
2396	      all_sct_stores.removeAt(rndIndex1)
2397	    }

2399	    recalculateIndex++
2400	    if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2401	      update_cache_percentage()
2402	    }
2403	  }

2405	  update_cache_percentage();
2406	}
2407	// -----------------------------------------------------------------
2408	// This function is called when the cache is full, and will evict
2409	// cache entries at random
2410	def cache_is_full()
2411	{
2412	  uint recalculateIndex                = 0
2413	  #define RECALCULATE_EVERY_N_OPERATIONS 50

2415	  while(all_sct_stores.length > 0 &&
2416	        current_cache_percentage() > CACHE_PRESSURE_SAFE) {
2417	    uint rndIndex1 = rand() % all_sct_stores.length
2418	    uint rndIndex2 = rand() % all_sct_stores[rndIndex1].observed_records.length

2420	    all_sct_stores[rndIndex1].observed_records.removeAt(rndIndex2)
2421	    if(all_sct_stores[rndIndex1].observed_records.length == 0) {
2422	      all_sct_stores.removeAt(rndIndex1)
2423	    }

2425	    recalculateIndex++
2426	    if(recalculateIndex % RECALCULATE_EVERY_N_OPERATIONS == 0) {
2427	      update_cache_percentage()
2428	    }
2429	  }

2431	  update_cache_percentage();
2432	}
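
   The random eviction performed by cache_is_full() can be sketched
   in Python as below.  This is a non-normative illustration under
   stated assumptions: the store is a dictionary from domain to a
   list of bundles, each bundle is a byte string whose length stands
   in for its size, and the cache size is recomputed on every
   iteration rather than every RECALCULATE_EVERY_N_OPERATIONS.

```python
import random

CACHE_PRESSURE_SAFE = 0.50


def cache_size(stores):
    return sum(len(bundle)
               for records in stores.values() for bundle in records)


def evict_until_safe(stores, max_cache_size, rng=None):
    """Evict random bundles in place; return how many were removed."""
    rng = rng or random.Random()
    # Drop empty entries first so every remaining domain has a record.
    for domain in [d for d, records in stores.items() if not records]:
        del stores[domain]
    evicted = 0
    while (stores
           and cache_size(stores) / max_cache_size > CACHE_PRESSURE_SAFE):
        domain = rng.choice(list(stores))
        records = stores[domain]
        records.pop(rng.randrange(len(records)))
        if not records:
            del stores[domain]
        evicted += 1
    return evicted
```

   Removing empty entries before the loop avoids taking a random
   index into an empty record list, a case the pseudocode does not
   address.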

2434	12.  IANA considerations

2436	   [ TBD ]

2438	13.  Contributors

2440	   The authors would like to thank the following contributors for
2441	   valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Josef
2442	   Gustafsson, Karen Seo, Magnus Ahltorp, Steven Kent, Yan Zhu.

2444	14.  ChangeLog

2446	14.1.  Changes between ietf-02 and ietf-03

2448	   o  TBDs resolved.

2450	   o  References added.

2452	   o  Pseudocode changed to work for both clients and servers.

2454	14.2.  Changes between ietf-01 and ietf-02

2456	   o  Requiring full certificate chain in SCT Feedback.

2458	   o  Clarifications on what clients store for and send in SCT Feedback
2459	      added.

2461	   o  SCT Feedback server operation updated to protect against DoS
2462	      attacks on servers.

2464	   o  Pre-Loaded vs Locally Added Anchors explained.

2466	   o  Base for well-known URLs changed.

2468	   o  Remove all mentions of monitors - gossip deals with auditors.

2470	   o  New sections added: Trusted Auditor protocol, attacks by actively
2471	      malicious log, the Dual-CA compromise attack, and policy
2472	      recommendations.

2474	14.3.  Changes between ietf-00 and ietf-01

2476	   o  Improve language and readability based on feedback from Stephen
2477	      Kent.

2479	   o  STH Pollination Proof Fetching defined and indicated as optional.

2481	   o  3-Method Ecosystem section added.

2483	   o  Cases with Logs ceasing operation handled.

2485	   o  Text on tracking via STH Interaction added.

2487	   o  Section with some early recommendations for mixing added.

2489	   o  Section detailing blocking connections, frustrating it, and the
2490	      implications added.

2492	14.4.  Changes between -01 and -02

2494	   o  STH Pollination defined.

2496	   o  Trusted Auditor Relationship defined.

2498	   o  Overview section rewritten.

2500	   o  Data flow picture added.

2502	   o  Section on privacy considerations expanded.

2504	14.5.  Changes between -00 and -01

2506	   o  Add the SCT feedback mechanism: Clients send SCTs to originating
2507	      web server which shares them with auditors.

2509	   o  Stop assuming that clients see STHs.

2511	   o  Don't use HTTP headers but instead .well-known URLs - avoid that
2512	      battle.

2514	   o  Stop referring to trans-gossip and trans-gossip-transport-https -
2515	      too complicated.

2517	   o  Remove all protocols but HTTPS in order to simplify - let's come
2518	      back and add more later.

2520	   o  Add more reasoning about privacy.

2522	   o  Do specify data formats.

2524	15.  References

2526	15.1.  Normative References

2528	   [RFC-6962-BIS-09]
2529	              Laurie, B., Langley, A., Kasper, E., Messeri, E., and R.
2530	              Stradling, "Certificate Transparency", October 2015,
2531	              <https://datatracker.ietf.org/doc/draft-ietf-trans-
2532	              rfc6962-bis/>.

2534	   [RFC7159]  Bray, T., "The JavaScript Object Notation (JSON) Data
2535	              Interchange Format", RFC 7159, March 2014.

2537	15.2.  Informative References

2539	   [double-keying]
2540	              Perry, M., Clark, E., and S. Murdoch, "Cross-Origin
2541	              Identifier Unlinkability", May 2015,
2542	              <https://www.torproject.org/projects/torbrowser/
2543	              design/#identifier-linkability>.

2545	   [draft-ct-over-dns]
2546	              Laurie, B., Phaneuf, P., and A. Eijdenberg, "Certificate
2547	              Transparency over DNS", February 2016,
2548	              <https://github.com/google/certificate-transparency-
2549	              rfcs/blob/master/dns/draft-ct-over-dns.md>.

2551	   [draft-ietf-trans-threat-analysis-03]
2552	              Kent, S., "Attack Model and Threat for Certificate
2553	              Transparency", October 2015,
2554	              <https://datatracker.ietf.org/doc/draft-ietf-trans-threat-
2555	              analysis/>.

2557	   [dual-ca-compromise-attack]
2558	              Gillmor, D., "can CT defend against dual CA compromise?",
2559	              n.d., <https://www.ietf.org/mail-
2560	              archive/web/trans/current/msg01984.html>.

2562	   [gossip-mixing]
2563	              Ritter, T., "A Bit on Certificate Transparency Gossip",
2564	              June 2016, <https://ritter.vg/blog-
2565	              a_bit_on_certificate_transparency_gossip.html>.

2567	   [trickle]  Serjantov, A., Dingledine, R., and P. Syverson, "From
2568	              a Trickle to a Flood: Active Attacks on Several Mix
2569	              Types", October 2002,
2570	              <http://freehaven.net/doc/batching-taxonomy/taxonomy.pdf>.

2572	Authors' Addresses

2574	   Linus Nordberg
2575	   NORDUnet

2577	   Email: linus@nordu.net

2579	   Daniel Kahn Gillmor
2580	   ACLU

2582	   Email: dkg@fifthhorseman.net

2584	   Tom Ritter

2586	   Email: tom@ritter.vg