idnits 2.17.1 

draft-ietf-trans-gossip-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 5 instances of lines with non-RFC2606-compliant FQDNs in the
     document.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 354: '...tension.  The client MUST discard SCTs...'
     RFC 2119 keyword, line 355: '...own to the client and SHOULD store the...'
     RFC 2119 keyword, line 361: '...ed on the client MUST be keyed by the ...'
     RFC 2119 keyword, line 362: '...contacted.  They MUST NOT be sent to a...'
     RFC 2119 keyword, line 365: '...mple.com.)  They MUST NOT be sent to a...'
     (51 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 1798 has weird spacing: '...   bool   has_...'

  == Line 1864 has weird spacing: '... string   doma...'

  -- The document date (March 21, 2016) is 2959 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.


  Checking references for intended status: Experimental
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '1' on line 286

  -- Looks like a reference, but probably isn't: '2' on line 288

  -- Looks like a reference, but probably isn't: '3' on line 290

  == Missing Reference: 'Y' is mentioned on line 1454, but not defined

  == Missing Reference: 'Z' is mentioned on line 1454, but not defined

  == Missing Reference: 'STATISTICS HERE' is mentioned on line 1575, but not
     defined

  ** Obsolete normative reference: RFC 6962 (Obsoleted by RFC 9162)

  ** Obsolete normative reference: RFC 7159 (Obsoleted by RFC 8259)


     Summary: 3 errors (**), 0 flaws (~~), 7 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	TRANS                                                        L. Nordberg
3	Internet-Draft                                                  NORDUnet
4	Intended status: Experimental                                 D. Gillmor
5	Expires: September 22, 2016                                         ACLU
6	                                                               T. Ritter

8	                                                          March 21, 2016

10	                            Gossiping in CT
11	                       draft-ietf-trans-gossip-02

13	Abstract

15	   The logs in Certificate Transparency are untrusted in the sense that
16	   the users of the system don't have to trust that they behave
17	   correctly since the behaviour of a log can be verified to be correct.

19	   This document tries to solve the problem with logs presenting a
20	   "split view" of their operations.  It describes three gossiping
21	   mechanisms for Certificate Transparency: SCT Feedback, STH
22	   Pollination and Trusted Auditor Relationship.

24	Status of This Memo

26	   This Internet-Draft is submitted in full conformance with the
27	   provisions of BCP 78 and BCP 79.

29	   Internet-Drafts are working documents of the Internet Engineering
30	   Task Force (IETF).  Note that other groups may also distribute
31	   working documents as Internet-Drafts.  The list of current Internet-
32	   Drafts is at http://datatracker.ietf.org/drafts/current/.

34	   Internet-Drafts are draft documents valid for a maximum of six months
35	   and may be updated, replaced, or obsoleted by other documents at any
36	   time.  It is inappropriate to use Internet-Drafts as reference
37	   material or to cite them other than as "work in progress."

39	   This Internet-Draft will expire on September 22, 2016.

41	Copyright Notice

43	   Copyright (c) 2016 IETF Trust and the persons identified as the
44	   document authors.  All rights reserved.

46	   This document is subject to BCP 78 and the IETF Trust's Legal
47	   Provisions Relating to IETF Documents
48	   (http://trustee.ietf.org/license-info) in effect on the date of
49	   publication of this document.  Please review these documents
50	   carefully, as they describe your rights and restrictions with respect
51	   to this document.  Code Components extracted from this document must
52	   include Simplified BSD License text as described in Section 4.e of
53	   the Trust Legal Provisions and are provided without warranty as
54	   described in the Simplified BSD License.

56	Table of Contents

58	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
59	   2.  Defining the problem  . . . . . . . . . . . . . . . . . . . .   4
60	   3.  Overview  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
61	   4.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   5
62	     4.1.  Pre-Loaded vs Locally Added Anchors . . . . . . . . . . .   5
63	   5.  Who gossips with whom . . . . . . . . . . . . . . . . . . . .   5
64	   6.  What to gossip about and how  . . . . . . . . . . . . . . . .   6
65	   7.  Data flow . . . . . . . . . . . . . . . . . . . . . . . . . .   6
66	   8.  Gossip Mechanisms . . . . . . . . . . . . . . . . . . . . . .   7
67	     8.1.  SCT Feedback  . . . . . . . . . . . . . . . . . . . . . .   7
68	       8.1.1.  SCT Feedback data format  . . . . . . . . . . . . . .   8
69	       8.1.2.  HTTPS client to server  . . . . . . . . . . . . . . .   8
70	       8.1.3.  HTTPS server operation  . . . . . . . . . . . . . . .  11
71	       8.1.4.  HTTPS server to auditors  . . . . . . . . . . . . . .  13
72	     8.2.  STH pollination . . . . . . . . . . . . . . . . . . . . .  14
73	       8.2.1.  HTTPS Clients and Proof Fetching  . . . . . . . . . .  15
74	       8.2.2.  STH Pollination without Proof Fetching  . . . . . . .  17
75	       8.2.3.  Auditor Action  . . . . . . . . . . . . . . . . . . .  17
76	       8.2.4.  STH Pollination data format . . . . . . . . . . . . .  17
77	     8.3.  Trusted Auditor Stream  . . . . . . . . . . . . . . . . .  17
78	       8.3.1.  Trusted Auditor data format . . . . . . . . . . . . .  18
79	   9.  3-Method Ecosystem  . . . . . . . . . . . . . . . . . . . . .  19
80	     9.1.  SCT Feedback  . . . . . . . . . . . . . . . . . . . . . .  19
81	     9.2.  STH Pollination . . . . . . . . . . . . . . . . . . . . .  20
82	     9.3.  Trusted Auditor Relationship  . . . . . . . . . . . . . .  21
83	     9.4.  Interaction . . . . . . . . . . . . . . . . . . . . . . .  22
84	   10. Security considerations . . . . . . . . . . . . . . . . . . .  22
85	     10.1.  Attacks by actively malicious logs . . . . . . . . . . .  22
86	     10.2.  Dual-CA Compromise . . . . . . . . . . . . . . . . . . .  23
87	     10.3.  Censorship/Blocking considerations . . . . . . . . . . .  23
88	     10.4.  Privacy considerations . . . . . . . . . . . . . . . . .  25
89	       10.4.1.  Privacy and SCTs . . . . . . . . . . . . . . . . . .  25
90	       10.4.2.  Privacy in SCT Feedback  . . . . . . . . . . . . . .  25
91	       10.4.3.  Privacy for HTTPS clients performing STH Proof
92	                Fetching . . . . . . . . . . . . . . . . . . . . . .  26
93	       10.4.4.  Privacy in STH Pollination . . . . . . . . . . . . .  26
94	       10.4.5.  Privacy in STH Interaction . . . . . . . . . . . . .  27
95	       10.4.6.  Trusted Auditors for HTTPS Clients . . . . . . . . .  28
96	       10.4.7.  HTTPS Clients as Auditors  . . . . . . . . . . . . .  28

98	   11. Policy Recommendations  . . . . . . . . . . . . . . . . . . .  29
99	     11.1.  Blocking Recommendations . . . . . . . . . . . . . . . .  29
100	       11.1.1.  Frustrating blocking . . . . . . . . . . . . . . . .  29
101	       11.1.2.  Responding to possible blocking  . . . . . . . . . .  29
102	     11.2.  Proof Fetching Recommendations . . . . . . . . . . . . .  31
103	     11.3.  Record Distribution Recommendations  . . . . . . . . . .  31
104	       11.3.1.  Mixing Algorithm . . . . . . . . . . . . . . . . . .  32
105	       11.3.2.  Flushing Attacks . . . . . . . . . . . . . . . . . .  33
106	       11.3.3.  The Deletion Algorithm . . . . . . . . . . . . . . .  34
107	   12. IANA considerations . . . . . . . . . . . . . . . . . . . . .  45
108	   13. Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  45
109	   14. ChangeLog . . . . . . . . . . . . . . . . . . . . . . . . . .  45
110	     14.1.  Changes between ietf-01 and ietf-02  . . . . . . . . . .  45
111	     14.2.  Changes between ietf-00 and ietf-01  . . . . . . . . . .  46
112	     14.3.  Changes between -01 and -02  . . . . . . . . . . . . . .  46
113	     14.4.  Changes between -00 and -01  . . . . . . . . . . . . . .  46
114	   15. References  . . . . . . . . . . . . . . . . . . . . . . . . .  47
115	     15.1.  Normative References . . . . . . . . . . . . . . . . . .  47
116	     15.2.  Informative References . . . . . . . . . . . . . . . . .  47
117	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  47

119	1.  Introduction

121	   The purpose of the protocols in this document, collectively referred
122	   to as CT Gossip, is to detect certain misbehavior by CT logs.  In
123	   particular, CT Gossip aims to detect logs that are providing
124	   inconsistent views to different log clients, and logs failing to
125	   include submitted certificates within the time period stipulated by
126	   MMD.

128	   [ TODO: enumerate the interfaces used for detecting misbehaviour? ]

130	   One of the major challenges of any gossip protocol is limiting damage
131	   to user privacy.  The goal of CT gossip is to publish and distribute
132	   information about the logs and their operations, but not to expose
133	   any additional information about the operation of any of the other
134	   participants.  Privacy of consumers of log information (in
135	   particular, of web browsers and other TLS clients) should not be
136	   undermined by gossip.

138	   This document presents three different, complementary mechanisms for
139	   non-log elements of the CT ecosystem to exchange information about
140	   logs in a manner that preserves the privacy of HTTPS clients.  They
141	   should provide protective benefits for the system as a whole even if
142	   their adoption is not universal.

144	2.  Defining the problem

146	   When a log provides different views of the log to different clients
147	   this is described as a partitioning attack.  Each client would be
148	   able to verify the append-only nature of the log but, in the extreme
149	   case, each client might see a unique view of the log.

151	   The CT logs are public, append-only and untrusted and thus have to be
152	   audited for consistency, i.e., they should never rewrite history.
153	   Additionally, auditors and other log clients need to exchange
154	   information about logs in order to be able to detect a partitioning
155	   attack (as described above).

157	   Gossiping about log behaviour helps address the problem of detecting
158	   malicious or compromised logs with respect to a partitioning attack.
159	   We want some side of the partitioned tree, and ideally both sides, to
160	   see the other side.

162	   Disseminating information about a log poses a potential threat to the
163	   privacy of end users.  Some data of interest (e.g., SCTs) is linkable
164	   to specific log entries and thereby to specific websites, which makes
165	   sharing them with others a privacy concern.  Gossiping about this
166	   data has to take privacy considerations into account in order not to
167	   expose associations between users of the log (e.g., web browsers) and
168	   certificate holders (e.g., web sites).  Even sharing STHs (which do
169	   not link to specific log entries) can be problematic - user tracking
170	   by fingerprinting through rare STHs is one potential attack (see
171	   Section 8.2).

173	3.  Overview

175	   SCT Feedback enables HTTPS clients to share Signed Certificate
176	   Timestamps (SCTs) (Section 3.3 of [RFC-6962-BIS-09]) with CT auditors
177	   in a privacy-preserving manner by sending SCTs to originating HTTPS
178	   servers, who in turn share them with CT auditors.

180	   In STH Pollination, HTTPS clients use HTTPS servers as pools to share
181	   Signed Tree Heads (STHs) (Section 3.6 of [RFC-6962-BIS-09]) with
182	   other connecting clients in the hope that STHs will find their way to
183	   CT auditors.

185	   HTTPS clients in a Trusted Auditor Relationship share SCTs and STHs
186	   with trusted CT auditors directly, with expectations of privacy
187	   sensitive data being handled according to whatever privacy policy is
188	   agreed on between client and trusted party.

190	   Despite the privacy risks with sharing SCTs there is no loss in
191	   privacy if a client sends SCTs for a given site to the site
192	   corresponding to the SCT.  This is because the site's logs would
193	   already indicate that the client is accessing that site.  In this way
194	   a site can accumulate records of SCTs that have been issued by
195	   various logs for that site, providing a consolidated repository of
196	   SCTs that could be shared with auditors.  Auditors can use this
197	   information to detect logs that misbehave by not including
198	   certificates within the time period stipulated by the MMD metadata.

200	   Sharing an STH is considered reasonably safe from a privacy
201	   perspective as long as the same STH is shared by a large number of
202	   other log clients.  This safety in numbers can be achieved by only
203	   allowing gossiping of STHs issued in a certain window of time, while
204	   also refusing to gossip about STHs from logs with too high an STH
205	   issuance frequency (see Section 8.2).

207	4.  Terminology

209	   This document relies on terminology and data structures defined in
210	   [RFC-6962-BIS-09], including STH, SCT, Version, LogID, SCT timestamp,
211	   CtExtensions, SCT signature, Merkle Tree Hash.

213	   This document relies on terminology defined in
214	   [draft-ietf-trans-threat-analysis-03], including Auditing.

216	4.1.  Pre-Loaded vs Locally Added Anchors

218	   Throughout the document, we refer to both Trust Anchors (Certificate
219	   Authorities) and Logs.  Both Logs and Trust Anchors may be locally
220	   added by an administrator.  Unless otherwise clarified, in both cases
221	   we refer to the set of Trust Anchors and Logs that come pre-loaded
222	   and pre-trusted in a piece of client software.

224	5.  Who gossips with whom

226	   o  HTTPS clients and servers (SCT Feedback and STH Pollination)

228	   o  HTTPS servers and CT auditors (SCT Feedback and STH Pollination)

230	   o  CT auditors (Trusted Auditor Relationship)

232	   Additionally, some HTTPS clients may engage with an auditor who they
233	   trust with their privacy:

235	   o  HTTPS clients and CT auditors (Trusted Auditor Relationship)

237	6.  What to gossip about and how

239	   There are three separate gossip streams:

241	   o  SCT Feedback - transporting SCTs and certificate chains from HTTPS
242	      clients to CT auditors via HTTPS servers.

244	   o  STH Pollination - HTTPS clients and CT auditors using HTTPS
245	      servers as STH pools for exchanging STHs.

247	   o  Trusted Auditor Stream - HTTPS clients communicating directly with
248	      trusted CT auditors sharing SCTs, certificate chains and STHs.

250	   It is worthwhile to note that when an HTTPS Client or CT auditor
251	   interacts with a log, it may equivalently interact with a log mirror
252	   or cache that replicates the log.

254	7.  Data flow

256	   The following picture shows how certificates, SCTs and STHs flow
257	   through a CT system with SCT Feedback and STH Pollination.  It does
258	   not show what goes in the Trusted Auditor Relationship stream.

260	      +- Cert ---- +----------+
261	      |            |    CA    | ----------+
262	      |   + SCT -> +----------+           |
263	      v   |                           Cert [& SCT]
264	   +----------+                           |
265	   |   Log    | ---------- SCT -----------+
266	   +----------+                           v
267	     |  ^                          +----------+
268	     |  |          SCT & Certs --- | Website  |
269	     |  |[1]           |           +----------+
270	     |  |[2]          STH            ^     |
271	     |  |[3]           v             |     |
272	     |  |          +----------+      |     |
273	     |  +--------> | Auditor  |      |  HTTPS traffic
274	     |             +----------+      |     |
275	    STH                              |    SCT
276	     |                         SCT & Certs |
277	   Log entries                       |     |
278	     |                              STH   STH
279	     v                               |     |
280	   +----------+                      |     v
281	   | Monitor  |                    +----------+
282	   +----------+                    | Browser  |
283	                                   +----------+

285	   #   Auditor                        Log
286	   [1] |--- get-sth ------------------->|
287	       |<-- STH ------------------------|
288	   [2] |--- leaf hash + tree size ----->|
289	       |<-- index + inclusion proof ----|
290	   [3] |--- tree size 1 + tree size 2 ->|
291	       |<-- consistency proof ----------|

293	8.  Gossip Mechanisms

295	8.1.  SCT Feedback

297	   The goal of SCT Feedback is for clients to share SCTs and certificate
298	   chains with CT auditors while still preserving the privacy of the end
299	   user.  The sharing of SCTs contributes to the overall goal of
300	   detecting misbehaving logs by providing auditors with SCTs from many
301	   vantage points, making it more likely to catch a violation of a log's
302	   MMD or a log presenting inconsistent views.  The sharing of
303	   certificate chains is beneficial to HTTPS server operators interested
304	   in direct feedback from clients for detecting bogus certificates
305	   issued in their name and therefore incentivises server operators to
306	   take part in SCT Feedback.

308	   SCT Feedback is the most privacy-preserving gossip mechanism, as it
309	   does not directly expose any links between an end user and the sites
310	   they've visited to any third party.

312	   HTTPS clients store SCTs and certificate chains they see, and later
313	   send them to the originating HTTPS server by posting them to a well-
314	   known URL (associated with that server), as described in
315	   Section 8.1.2.  Note that clients will send the same SCTs and chains
316	   to a server multiple times with the assumption that any man-in-the-
317	   middle attack eventually will cease, and an honest server will
318	   eventually receive collected malicious SCTs and certificate chains.

320	   HTTPS servers store SCTs and certificate chains received from
321	   clients, as described in Section 8.1.3.  They later share them with
322	   CT auditors by either posting them to auditors or making them
323	   available via a well-known URL.  This is described in Section 8.1.4.

325	8.1.1.  SCT Feedback data format

327	   The data shared between HTTPS clients and servers, as well as between
328	   HTTPS servers and CT auditors, is a JSON array [RFC7159].  Each item
329	   in the array is a JSON object with the following content:

331	   o  x509_chain: An array of base64-encoded X.509 certificates.  The
332	      first element is the end-entity certificate, the second certifies
333	      the first and so on.

335	   o  sct_data: An array of objects consisting of the base64
336	      representation of the binary SCT data as defined in
337	      [RFC-6962-BIS-09] Section 3.3.

339	   We will refer to this object as 'sct_feedback'.

341	   The x509_chain element always contains at least one element.  It also
342	   always contains a full chain from a leaf certificate to a self-signed
343	   trust anchor.
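
   As an illustration of the format above, the following is a minimal
   Python sketch that builds one sct_feedback item and serializes a
   JSON array of such items.  The function names are hypothetical, and
   the sketch assumes that each sct_data entry is a plain base64 string,
   which is one possible reading of the description above.  The byte
   strings are placeholders, not real certificates or SCTs.

```python
import base64
import json


def make_sct_feedback(chain_der, scts_binary):
    """Build one sct_feedback item from DER certificates and binary SCTs."""
    if not chain_der:
        raise ValueError("x509_chain must contain at least one certificate")
    return {
        # First element is the end-entity certificate, followed by its issuers.
        "x509_chain": [base64.b64encode(c).decode("ascii") for c in chain_der],
        # Assumption: each entry is the base64 form of one binary SCT.
        "sct_data": [base64.b64encode(s).decode("ascii") for s in scts_binary],
    }


def encode_feedback(items):
    """Serialize a list of sct_feedback items as the JSON array exchanged."""
    return json.dumps(items)


# Placeholder bytes stand in for real DER certificates and SCT structures.
example = make_sct_feedback([b"leaf-cert", b"issuer-cert"], [b"sct-1"])
payload = encode_feedback([example])
```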

345	   [ TBD: Be strict about what sct_data may contain or is this
346	   sufficiently implied by previous sections? ]

348	8.1.2.  HTTPS client to server

350	   When an HTTPS client connects to an HTTPS server, the client receives
351	   a set of SCTs as part of the TLS handshake.  SCTs are included in the
352	   TLS handshake using one or more of the three mechanisms described in
353	   [RFC-6962-BIS-09] section 3.4 - in the server certificate, in a TLS
354	   extension, or in an OCSP extension.  The client MUST discard SCTs
355	   that are not signed by a log known to the client and SHOULD store the
356	   remaining SCTs together with a locally constructed certificate chain
357	   which is trusted (i.e. terminated in a pre-loaded or locally
358	   installed Trust Anchor) in an sct_feedback object or equivalent data
359	   structure for later use in SCT Feedback.

361	   The SCTs stored on the client MUST be keyed by the exact domain name
362	   the client contacted.  They MUST NOT be sent to any domain not
363	   matching the original domain (e.g. if the original domain is
364	   sub.example.com they must not be sent to sub.sub.example.com or to
365	   example.com.)  They MUST NOT be sent to any Subject Alternative Names
366	   specified in the certificate.  In the case of certificates that
367	   validate multiple domain names, the same SCT is expected to be stored
368	   multiple times.

370	   Not following these constraints would increase the risk for two types
371	   of privacy breaches.  First, the HTTPS server receiving the SCT would
372	   learn about other sites visited by the HTTPS client.  Second,
373	   auditors receiving SCTs from the HTTPS server would learn information
374	   about other HTTPS servers visited by its clients.

376	   If the client later again connects to the same HTTPS server, it again
377	   receives a set of SCTs and calculates a certificate chain, and again
378	   creates an sct_feedback or similar object.  If this object does not
379	   exactly match an existing object in the store, then the client MUST
380	   add this new object to the store, associated with the exact domain
381	   name contacted, as described above.  An exact comparison is needed to
382	   ensure that attacks involving alternate chains are detected.  An
383	   example of such an attack is described in [TODO double-CA-compromise
384	   attack].  However, at least one optimization is safe and MAY be
385	   performed: If the certificate chain exactly matches an existing
386	   certificate chain, the client may store the union of the SCTs from
387	   the two objects in the first (existing) object.
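
   The storage rules above could be sketched as follows.  This is an
   illustrative in-memory store only; the class and method names are
   hypothetical, and certificates and SCTs are represented as opaque
   byte strings.  Objects are keyed by the exact domain name contacted,
   a differing chain always yields a new object, and the optional union
   optimization is applied when the chain matches exactly.

```python
from collections import defaultdict


class FeedbackStore:
    """In-memory SCT Feedback store keyed by the exact domain name contacted."""

    def __init__(self):
        # domain -> list of (chain, set of SCTs); chain is a tuple of cert bytes
        self._by_domain = defaultdict(list)

    def add(self, domain, chain, scts):
        chain = tuple(chain)
        scts = set(scts)
        for stored_chain, stored_scts in self._by_domain[domain]:
            if stored_chain == chain:
                # Optional optimization: identical chain, keep the union of SCTs.
                stored_scts |= scts
                return
        # Any difference in the chain results in a separate stored object.
        self._by_domain[domain].append((chain, scts))

    def for_domain(self, domain):
        # Exact match only: no parent domains, subdomains, or SAN entries.
        return list(self._by_domain.get(domain, []))
```

   A lookup for example.com returns nothing when feedback was stored
   under sub.example.com, which mirrors the MUST NOT requirements above.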

389	   If the client does connect to the same HTTPS server a subsequent
390	   time, it MUST send to the server sct_feedback objects in the store
391	   that are associated with that domain name.  It is not necessary to
392	   send an sct_feedback object constructed from the current TLS session.

394	   The client MUST NOT send the same set of SCTs to the same server more
395	   often than TBD.

397	   [ TODO: expand on rate/resource limiting motivation ]

399	   Refer to Section 11.3 for recommendations about strategies.

401	   Because SCTs can be used as a tracking mechanism (see
402	   Section 10.4.2), they deserve special treatment when they are
403	   received from (and provided to) domains that are loaded as
404	   subresources from an origin domain.  Such domains are commonly called
405	   'third party domains'.  An HTTPS Client SHOULD store SCT Feedback
406	   using a 'double-keying' approach, which isolates third party domains
407	   by the first party domain.  This is described in XXX.  Gossip would
408	   be performed normally for third party domains only when the user
409	   revisits the first party domain.  In lieu of 'double-keying', an
410	   HTTPS Client MAY treat SCT Feedback in the same manner it treats
411	   other security mechanisms that can enable tracking (such as HSTS and
412	   HPKP.)
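
   One way to picture the 'double-keying' approach is a store whose key
   is the pair (first party domain, contacted domain).  The sketch below
   is an assumption about how a client might organize this; the class
   and method names are hypothetical and are not defined by this
   document.

```python
class DoubleKeyedStore:
    """SCT Feedback isolated by the first party domain it was seen under."""

    def __init__(self):
        # (first_party, contacted_domain) -> list of feedback objects
        self._items = {}

    def add(self, first_party, contacted_domain, feedback):
        key = (first_party, contacted_domain)
        self._items.setdefault(key, []).append(feedback)

    def to_send(self, first_party, contacted_domain):
        # Feedback for a third party domain is only released while the user
        # is again visiting the same first party domain.
        return list(self._items.get((first_party, contacted_domain), []))
```

   With this layout, feedback collected for cdn.example.net while
   visiting a.example is not sent to cdn.example.net when it is loaded
   from b.example.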

414	   [ XXX is currently https://www.torproject.org/projects/torbrowser/
415	   design/#identifier-linkability How should it be referenced?  Do we
416	   need to copy this out into another document?  An appendix? ]

418	   If the HTTPS client has configuration options for not sending cookies
419	   to third parties, SCTs of third parties MUST be treated as cookies
420	   with respect to this setting.  This prevents third party tracking
421	   through the use of SCTs/certificates, which would bypass the cookie
422	   policy.

424	   SCTs and corresponding certificates are POSTed to the originating
425	   HTTPS server at the well-known URL:

427	   https://<domain>/.well-known/ct-gossip/v1/sct-feedback

429	   The data sent in the POST is defined in Section 8.1.1.  This data
430	   SHOULD be sent in an already established TLS session.  This makes it
431	   hard for an attacker to disrupt SCT Feedback without also disturbing
432	   ordinary secure browsing (https://).  This is discussed more in
433	   Section 11.1.1.
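
   A minimal sketch of constructing the POST follows.  It only builds
   the request object and does not send it.  The function name and the
   Content-Type header are assumptions, since this document does not
   specify a media type.  A real client would also send the request
   over the already established TLS session rather than opening a new
   connection, which this sketch does not attempt to show.

```python
import json
import urllib.request


def build_feedback_request(domain, feedback_items):
    """Build (but do not send) the SCT Feedback POST for the exact domain."""
    url = f"https://{domain}/.well-known/ct-gossip/v1/sct-feedback"
    body = json.dumps(feedback_items).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the media type is not defined by this document.
        headers={"Content-Type": "application/json"},
    )
```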

435	   Some clients have trust anchors or logs that are locally added (e.g.
436	   by an administrator or by the user themselves).  These additions are
437	   potentially privacy-sensitive because they can carry information
438	   about the specific configuration, computer, or user.

440	   Certificates validated by locally added trust anchors will commonly
441	   have no SCTs associated with them, so in this case no action is
442	   needed with respect to CT Gossip.  SCTs issued by locally added logs
443	   MUST NOT be reported via SCT Feedback.

445	   If a certificate is validated by SCTs that are issued by publicly
446	   trusted logs, but chains to a local trust anchor, the client MAY
447	   perform SCT Feedback for this SCT and certificate chain bundle.  If it
448	   does so, the client MUST include the full chain of certificates
449	   chaining to the local trust anchor in the x509_chain array.
450	   Performing SCT Feedback in this scenario may be advantageous for the
451	   broader internet and CT ecosystem, but may also disclose information
452	   about the client.  If the client elects to omit SCT Feedback, it can
453	   still choose to perform STH Pollination after fetching an inclusion
454	   proof, as specified in Section 8.2.

456	   We require the client to send the full chain (or nothing at all) for
457	   two reasons.  Firstly, it simplifies the operation on the server if
458	   there are not two code paths.  Secondly, omitting the chain does not
459	   actually preserve user privacy.  The Issuer field in the certificate
460	   describes the signing certificate.  And if the certificate is being
461	   submitted at all, it means the certificate is logged, and has SCTs.
462	   This means that the Issuer can be queried and obtained from the log
463	   so omitting the parent from the client's submission does not actually
464	   help user privacy.

466	8.1.3.  HTTPS server operation

468	   HTTPS servers can be configured (or omit configuration), resulting
469	   in, broadly, two modes of operation.  In the simpler mode, the server
470	   will only track leaf certificates and SCTs applicable to those leaf
471	   certificates.  In the more complex mode, the server will confirm the
472	   client's chain validation and store the certificate chain.  The
473	   latter mode requires more configuration, but is necessary to prevent
474	   denial of service (DoS) attacks on the server's storage space.

476	   In the simple mode of operation, upon receiving a submission at the
477	   sct-feedback well-known URL, an HTTPS server will perform a set of
478	   operations, checking on each sct_feedback object before storing it:

480	   1.  the HTTPS server MAY modify the sct_feedback object, and discard
481	       all items in the x509_chain array except the first item (which is
482	       the end-entity certificate)

484	   2.  if a bit-wise compare of the sct_feedback object matches one
485	       already in the store, this sct_feedback object SHOULD be
486	       discarded

488	   3.  if the leaf cert is not for a domain for which the server is
489	       authoritative, the SCT MUST be discarded

491	   4.  if an SCT in the sct_data array can't be verified to be a valid
492	       SCT for the accompanying leaf cert, and issued by a known log,
493	       the individual SCT SHOULD be discarded
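
   The four steps above can be sketched as a single function.  This is
   an illustration under several assumptions: the callbacks
   is_authoritative and sct_is_valid are hypothetical and supplied by
   the caller, the store is a Python set so that membership serves as
   the duplicate check, the duplicate check runs last rather than
   second, and items left with no valid SCTs are dropped.  None of
   these choices are mandated by this document.

```python
def process_submission(items, store, is_authoritative, sct_is_valid):
    """Apply the simple-mode checks to each submitted sct_feedback item.

    store is a set of (chain, scts) tuples.  is_authoritative(leaf) and
    sct_is_valid(sct, leaf) are hypothetical caller-supplied callbacks.
    """
    for item in items:
        # Step 1: keep only the end-entity certificate.
        chain = tuple(item["x509_chain"][:1])
        if not chain:
            continue
        leaf = chain[0]
        # Step 3: discard feedback for domains this server does not serve.
        if not is_authoritative(leaf):
            continue
        # Step 4: drop individual SCTs that do not verify for this leaf.
        scts = frozenset(s for s in item["sct_data"] if sct_is_valid(s, leaf))
        if not scts:
            # Assumption: an item with no remaining SCTs is not stored.
            continue
        # Step 2: adding to a set silently discards exact duplicates.
        store.add((chain, scts))
    return store
```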

495	   The modification in step number 1 is necessary to prevent a malicious
496	   client from exhausting the server's storage space.  A client can
497	   generate their own issuing certificate authorities, and create an
498	   arbitrary number of chains that terminate in an end-entity
499	   certificate with an existing SCT.  By discarding all but the end-
500	   entity certificate, we prevent a simple HTTPS server from storing
501	   this data.  Note that operation in this mode will not prevent the
502	   attack described in Section 10.2.  Skipping this step requires
503	   additional configuration as described below.

505	   The check in step 2 is for detecting duplicates and minimizing
506	   processing and storage by the server.  As on the client, an exact
507	   comparison is needed to ensure that attacks involving alternate
508	   chains are detected.  Again, at least one optimization is safe and
509	   MAY be performed.  If the certificate chain exactly matches an
510	   existing certificate chain, the server may store the union of the
511	   SCTs from the two objects in the first (existing) object.  It should
512	   do this after completing the validity check on the SCTs.

514	   The check in step 3 is to help prevent malfunctioning clients from exposing
515	   which sites they visit.  It additionally helps prevent DoS attacks on
516	   the server.

518	   [ TBD: Thinking about building this, how does the SCT Feedback app
519	   know which sites it's authoritative for? ]

521	   The check in step 4 is to prevent DoS attacks where an adversary
522	   fills up the store prior to attacking a client (thus preventing the
523	   client's feedback from being recorded), or an attack where an
524	   adversary simply attempts to fill up server's storage space.

526	   The more advanced server configuration will detect the [TODO double-
527	   CA-compromise] attack.  In this configuration the server will not
528	   modify the sct_feedback object prior to performing checks 2, 3, and
529	   4.

531	   To prevent a malicious client from filling the server's data store,
532	   the HTTPS Server SHOULD perform an additional check:

534	   5.  if the x509_chain consists of an invalid certificate chain, or
535	       the culminating trust anchor is not recognized by the server, the
536	       server SHOULD modify the sct_feedback object, discarding all
537	       items in the x509_chain array except the first item

539	   The HTTPS server may choose to omit checks 4 or 5.  This will place
540	   the server at risk of having its data store filled up by invalid
541	   data, but can also allow a server to identify interesting certificates
542	   or certificate chains that omit valid SCTs, or do not chain to a
543	   trusted root.  This information may enable an HTTPS server operator
544	   to detect attacks or unusual behavior of Certificate Authorities even
545	   outside the Certificate Transparency ecosystem.

547	8.1.4.  HTTPS server to auditors

549	   HTTPS servers receiving SCTs from clients SHOULD share SCTs and
550	   certificate chains with CT auditors by either serving them on the
551	   well-known URL:

553	   https://<domain>/.well-known/ct-gossip/v1/collected-sct-feedback

555	   or by HTTPS POSTing them to a set of preconfigured auditors.  This
556	   allows an HTTPS server to choose between an active push model or a
557	   passive pull model.

559	   The data received in a GET of the well-known URL or sent in the POST
560	   is defined in Section 8.1.1.

562	   HTTPS servers SHOULD share all sct_feedback objects they see that
563	   pass the checks in Section 8.1.3.  If this is an infeasible amount of
564	   data, the server may choose to expire submissions according to an
565	   undefined policy.  Suggestions for such a policy can be found in
566	   Section 11.3.

568	   HTTPS servers MUST NOT share any other data that they may learn from
569	   the submission of SCT Feedback by HTTPS clients, like the HTTPS
570	   client IP address or the time of submission.
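
   One way to respect this restriction is to copy only the permitted
   fields when preparing data for auditors, as in the Python sketch
   below.  The record layout, including the extra 'client_ip' and
   'received_at' keys used in the usage example, is hypothetical.  How
   the resulting list is wrapped for transport is governed by Section
   8.1.1 and is not shown here.

```python
def export_for_auditors(stored_records):
    """Return copies of stored records holding only the shareable fields.

    Anything else a record may carry (for example a client IP address or
    a submission time) is dropped by construction, because only the two
    named keys are ever copied.
    """
    shared = []
    for record in stored_records:
        shared.append({
            "x509_chain": list(record["x509_chain"]),
            "sct_data": list(record["sct_data"]),
        })
    return shared
```

   Building the output from an allow-list of keys, rather than
   deleting known-sensitive keys, avoids leaking fields added later.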

572	   As described above, HTTPS servers can be configured (or omit
573	   configuration), resulting in two modes of operation.  In one mode,
574	   the x509_chain array will contain a full certificate chain.  This
575	   chain may terminate in a trust anchor the auditor may recognize, or
576	   it may not.  (One scenario where this could occur is if the client
577	   submitted a chain terminating in a locally added trust anchor, and
578	   the server kept this chain.)  In the other mode, the x509_chain array
579	   will consist of only a single element, which is the end-entity
580	   certificate.

582	   Auditors SHOULD provide the following URL accepting HTTPS POSTing of
583	   SCT feedback data:

585	   https://<auditor>/ct-gossip/v1/sct-feedback

587	   [ TBD: Should that be .well-known?  Depends on whether auditors will
588	   operate in their own URL name space or not. ]

590	   Auditors SHOULD regularly poll HTTPS servers at the well-known
591	   collected-sct-feedback URL.  The frequency of the polling and how to
592	   determine which domains to poll is outside the scope of this
593	   document.  However, the selection MUST NOT be influenced by potential
594	   HTTPS clients connecting directly to the auditor.  For example, if a
595	   poll to example.com occurs directly after a client submits an SCT for
596	   example.com, an adversary observing the auditor can trivially
597	   conclude the activity of the client.

599	8.2.  STH pollination

601	   The goal of sharing Signed Tree Heads (STHs) through pollination is
602	   to share STHs between HTTPS clients and CT auditors while still
603	   preserving the privacy of the end user.  The sharing of STHs
604	   contributes to the overall goal of detecting misbehaving logs by
605	   providing CT auditors with STHs from many vantage points, making it
606	   possible to detect logs that are presenting inconsistent views.

608	   HTTPS servers supporting the protocol act as STH pools.  HTTPS
609	   clients and CT auditors in the possession of STHs can pollinate STH
610	   pools by sending STHs to them, and retrieving new STHs to send to
611	   other STH pools.  CT auditors can improve the value of their auditing
612	   by retrieving STHs from pools.

614	   HTTPS clients send STHs to HTTPS servers by POSTing them to the
615	   well-known URL:

617	   https://<domain>/.well-known/ct-gossip/v1/sth-pollination

619	   The data sent in the POST is defined in Section 8.2.4.  This data
620	   SHOULD be sent in an already established TLS session.  This makes it
621	   hard for an attacker to disrupt STH gossiping without also disturbing
622	   ordinary secure browsing (https://).  This is discussed more in
623	   Section 11.1.1.

625	   The response contains zero or more STHs in the same format, described
626	   in Section 8.2.4.

628	   An HTTPS client may acquire STHs by several methods:

630	   o  in replies to pollination POSTs;

632	   o  asking logs that it recognises for the current STH, either
633	      directly (v2/get-sth) or indirectly (for example over DNS)

635	   o  resolving an SCT and certificate to an STH via an inclusion proof

637	   o  resolving one STH to another via a consistency proof

639	   HTTPS clients (that have STHs) and CT auditors SHOULD pollinate STH
640	   pools with STHs.  Which STHs to send and how often pollination should
641	   happen is regarded as undefined policy with the exception of privacy
642	   concerns explained below.  Suggestions for the policy may be found in
643	   Section 11.3.

645	   An HTTPS client could be tracked by giving it a unique or rare STH.
646	   To address this concern, we place restrictions on different
647	   components of the system to ensure an STH will not be rare.

649	   o  HTTPS clients silently ignore STHs from logs with an STH issuance
650	      frequency of more than one STH per hour.  Logs use the STH
651	      Frequency Count metadata to express this ([RFC-6962-BIS-09]
652	      sections 3.6 and 5.1).

654	   o  HTTPS clients silently ignore STHs which are not fresh.

656	   An STH is considered fresh iff its timestamp is less than 14 days in
657	   the past.  Given a maximum STH issuance rate of one per hour, an
658	   attacker has 336 unique STHs per log for tracking.  Clients MUST
659	   ignore STHs older than 14 days.  We consider STHs within this
660	   validity window not to be personally identifiable data, and STHs
661	   outside this window to be personally identifiable.
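
   The freshness rule and the figure of 336 can be checked with a
   short Python sketch.  It assumes the STH timestamp is expressed in
   milliseconds since the epoch and that the caller supplies the
   current time in the same unit.  The constant and function names are
   illustrative only.  Timestamps in the future are not treated
   specially here.

```python
MAX_STH_AGE_DAYS = 14      # validity window from the text above
MAX_STHS_PER_HOUR = 1      # maximum accepted issuance rate per log
MS_PER_DAY = 24 * 60 * 60 * 1000


def is_fresh(sth_timestamp_ms, now_ms):
    """True iff the STH timestamp is less than 14 days in the past."""
    return now_ms - sth_timestamp_ms < MAX_STH_AGE_DAYS * MS_PER_DAY


def max_trackable_sths_per_log():
    """Upper bound on distinct fresh STHs one log can have outstanding."""
    return MAX_STH_AGE_DAYS * 24 * MAX_STHS_PER_HOUR
```

   With 14 days of 24 hours each and one STH per hour, the bound is
   14 * 24 = 336 per log.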

663	   When multiplied by the number of logs from which a client accepts
664	   STHs, this number of unique STHs grows and the negative privacy
665	   implications grow with it.  It's important that this is taken into
666	   account when logs are chosen for default settings in HTTPS clients.
667	   This concern is discussed in Section 10.4.5.

669	   A log may cease operation, in which case there will soon be no STH
670	   within the validity window.  Clients SHOULD perform all three methods
671	   of gossip about a log that has ceased operation since it is possible
672	   the log was still compromised and gossip can detect that.  STH
673	   Pollination is the one mechanism where a client must know about a log
674	   shutdown.  A client who does not know about a log shutdown MUST NOT
675	   attempt any heuristic to detect a shutdown.  Instead the client MUST
676	   be informed about the shutdown from a verifiable source (e.g. a
677	   software update).  The client SHOULD be provided the final STH issued
678	   by the log and SHOULD resolve SCTs and STHs to this final STH.  If an
679	   SCT or STH cannot be resolved to the final STH, clients should follow
680	   the requirements and recommendations set forth in Section 11.1.2.

682	8.2.1.  HTTPS Clients and Proof Fetching

684	   There are two types of proofs a client may retrieve: inclusion proofs
685	   and consistency proofs.

687	   An HTTPS client will retrieve SCTs from an HTTPS server, and must
688	   obtain an inclusion proof to an STH in order to verify the promise
689	   made by the SCT.

691	   An HTTPS client may also receive an SCT bundled with an inclusion
692	   proof to a historical STH via an unspecified future mechanism.
693	   Because this historical STH is considered personally identifiable
694	   information per above, the client must obtain a consistency proof to
695	   a more recent STH.

697	   A client SHOULD perform proof fetching.  A client MUST NOT perform
698	   proof fetching for any SCTs or STHs issued by a locally added log.  A
699	   client MAY fetch an inclusion proof for an SCT (issued by a pre-
700	   loaded log) that validates a certificate chaining to a locally added
701	   trust anchor.

703	   [ TBD: Linus doesn't like this because we're mandating behavior that
704	   is not necessarily safe.  Is it unsafe?  Not sure.]

706	   If a client requested either proof directly from a log or auditor, it
707	   would reveal the client's browsing habits to a third party.  To
708	   mitigate this risk, an HTTPS client MUST retrieve the proof in a
709	   manner that disguises the client.

711	   Depending on the client's DNS provider, DNS may provide an
712	   appropriate intermediate layer that obfuscates the linkability
713	   between the user of the client and the request for inclusion (while
714	   at the same time providing a caching layer for oft-requested
715	   inclusion proofs.)

717	   [ TODO: Add a reference to Google's DNS mechanism more proper than
718	   http://www.certificate-transparency.org/august-2015-newsletter ]

720	   Anonymity networks such as Tor also present a mechanism for a client
721	   to anonymously retrieve a proof from an auditor or log.

723	   Even when using a privacy-preserving layer between the client and the
724	   log, certain observations may be made about an anonymous client or
725	   general user behavior depending on how proofs are fetched.  For
726	   example, if a client fetched all outstanding proofs at once, a log
727	   would know that SCTs or STHs received around the same time are more
728	   likely to come from a particular client.  This could potentially go
729	   so far as correlation of activity at different times to a single
730	   client.  In aggregate the data could reveal what sites are commonly
731	   visited together.  HTTPS clients SHOULD use a strategy of proof
732	   fetching that attempts to obfuscate these patterns.  A suggestion of
733	   such a policy can be found in Section 11.2.

735	   Resolving either SCTs or STHs may result in errors.  These errors
736	   may be routine downtime or other transient errors, or they may be
737	   indicative of an attack.  Clients should follow the requirements and
738	   recommendations set forth in Section 11.1.2 when handling these
739	   errors in order to give the CT ecosystem the greatest chance of
740	   detecting and responding to a compromise.

742	8.2.2.  STH Pollination without Proof Fetching

744	   An HTTPS client MAY participate in STH Pollination without fetching
745	   proofs.  In this situation, the client receives STHs from a server,
746	   applies the same validation logic to them (signed by a known log,
747	   within the validity window) and will later pass them to an HTTPS
748	   server.

750	   When operating in this fashion, the HTTPS client is promoting gossip
751	   for Certificate Transparency, but derives no direct benefit itself.
752	   In comparison, a client who resolves SCTs or historical STHs to
753	   recent STHs and pollinates them is assured that if it was attacked,
754	   there is a probability that the ecosystem will detect and respond to
755	   the attack (by distrusting the log).

757	8.2.3.  Auditor Action

759	   CT auditors participate in STH pollination by retrieving STHs from
760	   HTTPS servers.  They verify that the STH is valid by checking the
761	   signature, and requesting a consistency proof from the STH to the
762	   most recent STH.

764	   After retrieving the consistency proof to the most recent STH, they
765	   SHOULD pollinate this new STH among participating HTTPS Servers.  In
766	   this way, as STHs "age out" and are no longer fresh, their "lineage"
767	   continues to be tracked in the system.

769	8.2.4.  STH Pollination data format

771	   The data sent from HTTPS clients and CT auditors to HTTPS servers is
772	   a JSON object [RFC7159] with the following content:

774	   o  sths - an array of 0 or more fresh SignedTreeHead structures as
775	      defined in [RFC-6962-BIS-09] Section 3.6.1.
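
   A minimal Python sketch of producing and reading this object is
   shown below.  It assumes each STH is already encoded as a string
   and has already been checked for freshness by the caller.  The
   function names are illustrative, and tolerating a missing 'sths'
   key in a response is a choice of this sketch only.

```python
import json


def build_pollination_body(encoded_sths):
    """Serialize fresh, already-encoded STHs into the POST body."""
    return json.dumps({"sths": list(encoded_sths)})


def parse_pollination_response(body):
    """Return the list of STHs carried in a pollination response.

    A missing 'sths' key is treated as an empty list, since the array may
    contain zero STHs.
    """
    data = json.loads(body)
    sths = data.get("sths", [])
    if not isinstance(sths, list):
        raise ValueError("'sths' must be an array")
    return sths
```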

777	8.3.  Trusted Auditor Stream

779	   HTTPS clients MAY send SCTs and cert chains, as well as STHs,
780	   directly to auditors.  If sent, this data MAY include data that
781	   reflects locally added logs or trust anchors.  Note that there are
782	   privacy implications in doing so; these are outlined in
783	   Section 10.4.1 and Section 10.4.6.

785	   The most natural trusted auditor arrangement arguably is a web
786	   browser that is "logged in to" a provider of various internet
787	   services.  Another equivalent arrangement is a trusted party like a
788	   corporation to which an employee is connected through a VPN or by
789	   other similar means.  A third might be individuals or smaller groups
790	   of people running their own services.  In such a setting, retrieving
791	   proofs from that third party could be considered reasonable from a
792	   privacy perspective.  The HTTPS client may also do its own auditing
793	   and might additionally share SCTs and STHs with the trusted party to
794	   contribute to herd immunity.  Here, the ordinary [RFC-6962-BIS-09]
795	   protocol is sufficient for the client to do the auditing while SCT
796	   Feedback and STH Pollination can be used in whole or in parts for the
797	   gossip part.

799	   Another well established trusted party arrangement on the internet
800	   today is the relation between internet users and their providers of
801	   DNS resolver services.  DNS resolvers are typically provided by the
802	   internet service provider (ISP) used, which by the nature of name
803	   resolving already know a great deal about which sites their users
804	   visit.  As mentioned in Section 8.2.1, in order for HTTPS clients to
805	   be able to retrieve proofs in a privacy preserving manner, logs could
806	   expose a DNS interface in addition to the ordinary HTTPS interface.
807	   An informal writeup of such a protocol can be found at XXX.

809	8.3.1.  Trusted Auditor data format

811	   Trusted Auditors expose a REST API at the fixed URI:

813	   https://<auditor>/ct-gossip/v1/trusted-auditor

815	   Submissions are made by sending an HTTPS POST request, with the body
816	   of the POST in a JSON object.  Upon successful receipt the Trusted
817	   Auditor returns 200 OK.

819	   The JSON object consists of two top-level keys: 'sct_feedback' and
820	   'sths'.  The 'sct_feedback' value is an array of JSON objects as
821	   defined in Section 8.1.1.  The 'sths' value is an array of STHs as
822	   defined in Section 8.2.4.

824	   Example:

826	   {
827	     "sct_feedback" :
828	       [
829	         {
830	           "x509_chain" :
831	             [
832	               "-----BEGIN CERTIFICATE-----\n
833	                AAA...",
834	               "-----BEGIN CERTIFICATE-----\n
835	                AAA...",
836	                ...
837	             ],
838	           "sct_data" :
839	             [
840	               "AAA...",
841	               "AAA...",
842	               ...
843	             ]
844	         }, ...
845	       ],
846	     "sths" :
847	       [
848	         "AAA...",
849	         "AAA...",
850	         ...
851	       ]
852	   }
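
   On the receiving side, a Trusted Auditor might perform a structural
   check of a submission before processing it.  The Python sketch
   below is one possible shape for such a check.  It assumes both
   top-level keys are required, which the text above suggests but does
   not state outright, and it does not validate the certificates, SCTs
   or STHs themselves.

```python
import json


def is_well_formed_submission(body):
    """Check that body is a JSON object with the two expected arrays."""
    try:
        data = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    if not isinstance(data, dict):
        return False
    feedback = data.get("sct_feedback")
    sths = data.get("sths")
    if not isinstance(feedback, list) or not isinstance(sths, list):
        return False
    for item in feedback:
        if not isinstance(item, dict):
            return False
        if not isinstance(item.get("x509_chain"), list):
            return False
        if not isinstance(item.get("sct_data"), list):
            return False
    return True
```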

854	9.  3-Method Ecosystem

856	   The use of three distinct methods for auditing logs may seem
857	   excessive, but each represents a needed component in the CT
858	   ecosystem.  To understand why, the drawbacks of each component must
859	   be outlined.  In this discussion we assume that an attacker knows
860	   which mechanisms an HTTPS client and HTTPS server implement.

862	9.1.  SCT Feedback

864	   SCT Feedback requires the cooperation of HTTPS clients and more
865	   importantly HTTPS servers.  Although SCT Feedback does require a
866	   significant amount of server-side logic to respond to the
867	   corresponding APIs, this functionality does not require
868	   customization, so it may be pre-provided and work out of the box.
869	   However, to take full advantage of the system, an HTTPS server would
870	   wish to perform some configuration to optimize its operation:

872	   o  Minimize its disk commitment by maintaining a list of known SCTs
873	      and certificate chains (or hashes thereof)

875	   o  Maximize its chance of detecting a misissued certificate by
876	      configuring a trust store of CAs

878	   o  Establish a "push" mechanism for POSTing SCTs to CT auditors

880	   These configuration needs, and the simple fact that it would require
881	   some deployment of software, means that some percentage of HTTPS
882	   servers will not deploy SCT Feedback.

884	   It is worthwhile to note that an attacker may be able to prevent
885	   detection of an attack on a webserver (in all cases) if SCT Feedback
886	   is not implemented.  This attack is detailed in Section 10.1.

888	   If SCT Feedback was the only mechanism in the ecosystem, any server
889	   that did not implement the feature would open itself and its users to
890	   attack without any possibility of detection.

892	   If SCT Feedback is not deployed by a webserver, malicious logs will
893	   be able to attack all users of the webserver (who do not have a
894	   Trusted Auditor relationship) with impunity.  Additionally, users who
895	   wish to have the strongest measure of privacy protection (by
896	   disabling STH Pollination Proof Fetching and forgoing a Trusted
897	   Auditor) could be attacked without risk of detection.

899	9.2.  STH Pollination

901	   STH Pollination requires the cooperation of HTTPS clients, HTTPS
902	   servers, and logs.

904	   For a client to fully participate in STH Pollination, and have this
905	   mechanism detect attacks against it, the client must have a way to
906	   safely perform Proof Fetching in a privacy preserving manner.  (The
907	   client may pollinate STHs it receives without performing Proof
908	   Fetching, but we do not consider this option in this section.)

910	   HTTPS Servers must deploy software (although, as in the case with SCT
911	   Feedback this logic can be pre-provided) and commit some configurable
912	   amount of disk space to the endeavor.

914	   Logs (or a third party) must provide access to clients to query
915	   proofs in a privacy preserving manner, most likely through DNS.

917	   Unlike SCT Feedback, the STH Pollination mechanism is not hampered if
918	   only a minority of HTTPS servers deploy it.  However, it makes an
919	   assumption that an HTTPS client performs Proof Fetching (such as the
920	   DNS mechanism discussed).  Unfortunately, any manner that is
921	   anonymous for some (such as clients who use shared DNS services such
922	   as a large ISP), may not be anonymous for others.

924	   For instance, DNS requests expose a considerable amount of sensitive
925	   information (including what data is already present in the cache) in
926	   plaintext over the network.  For this reason, some percentage of
927	   HTTPS clients may choose to not enable the Proof Fetching component
928	   of STH Pollination.  (Although they can still request and send STHs
929	   among participating HTTPS servers, even when this affords them no
930	   direct benefit.)

932	   If STH Pollination was the only mechanism deployed, users that
933	   disable it could be attacked without risk of detection.

935	   If STH Pollination was not deployed, HTTPS Clients visiting HTTPS
936	   Servers who did not deploy SCT Feedback could be attacked without
937	   risk of detection.

939	9.3.  Trusted Auditor Relationship

941	   The Trusted Auditor Relationship is expected to be the rarest gossip
942	   mechanism, as an HTTPS Client is providing an unadulterated report of
943	   its browsing history to a third party.  While there are valid and
944	   common reasons for doing so, there is no appropriate way to enter
945	   into this relationship without retrieving informed consent from the
946	   user.

948	   However, the Trusted Auditor Relationship mechanism still provides
949	   value to a class of HTTPS Clients.  For example, web crawlers have no
950	   concept of a "user" and no expectation of privacy.  Organizations
951	   already performing network auditing for anomalies or attacks can run
952	   their own Trusted Auditor for the same purpose with marginal increase
953	   in privacy concerns.

955	   The ability to change one's Trusted Auditor is a form of Trust
956	   Agility that allows a user to choose who to trust, and be able to
957	   revise that decision later without consequence.  A Trusted Auditor
958	   connection can be made more confidential than DNS (through the use of
959	   TLS), and can even be made (somewhat) anonymous through the use of
960	   anonymity services such as Tor. (Note that this does ignore the de-
961	   anonymization possibilities available from viewing a user's browsing
962	   history.)

964	   If the Trusted Auditor relationship was the only mechanism deployed,
965	   users who do not enable it (the majority) could be attacked without
966	   risk of detection.

968	   If the Trusted Auditor relationship was not deployed, crawlers and
969	   organizations would build it themselves for their own needs.  By
970	   standardizing it, users who wish to opt-in (for instance those
971	   unwilling to participate fully in STH Pollination) can have an
972	   interoperable standard they can use to choose and change their
973	   trusted auditor.

975	9.4.  Interaction

977	   The interactions of the mechanisms are outlined as follows:

979	   HTTPS Clients can be attacked without risk of detection if they do
980	   not participate in any of the three mechanisms.

982	   HTTPS Clients are afforded the greatest chance of detecting an attack
983	   when they either participate in both SCT Feedback and STH Pollination
984	   with Proof Fetching or if they have a Trusted Auditor relationship.
985	   (Participating in SCT Feedback is required to prevent a malicious log
986	   from refusing to ever resolve an SCT to an STH, as put forward in
987	   Section 10.1).  Additionally, participating in SCT Feedback enables
988	   an HTTPS Client to assist in detecting the exact target of an attack.

990	   HTTPS Servers that omit SCT Feedback enable malicious logs to carry
991	   out attacks without risk of detection.  If these servers are targeted
992	   specifically, even if the attack is detected, without SCT Feedback
993	   they may never learn that they were specifically targeted.  HTTPS
994	   servers without SCT Feedback do gain some measure of herd immunity,
995	   but only because their clients participate in STH Pollination (with
996	   Proof Fetching) or have a Trusted Auditor Relationship.

998	   When HTTPS Servers omit SCT feedback, it allows their users to be
999	   attacked without detection by a malicious log; the vulnerable users
1000	   are those who do not have a Trusted Auditor relationship.

1002	10.  Security considerations

1004	10.1.  Attacks by actively malicious logs

1006	   One of the most powerful attacks possible in the CT ecosystem is a
1007	   trusted log that has actively decided to be malicious.  It can carry
1008	   out an attack in two ways:

1010	   In the first attack, the log can present a split view of the log for
1011	   all time.  The only way to detect this attack is to resolve each view
1012	   of the log to the two most recent STHs and then force the log to
1013	   present a consistency proof.  (Which it cannot.)  This attack can be
1014	   detected by CT auditors participating in STH Pollination, as long as
1015	   they are explicitly built to handle the situation of a log
1016	   continuously presenting a split view.

1018	   In the second attack, the log can sign an SCT, and refuse to ever
1019	   include the certificate that the SCT refers to in the tree.

1021	   (Alternately, it can include it in a branch of the tree and issue an
1022	   STH, but then abandon that branch.)  Whenever someone requests an
1023	   inclusion proof for that SCT (or a consistency proof from that STH),
1024	   the log would respond with an error, and a client may simply regard
1025	   the response as a transient error.  This attack can be detected using
1026	   SCT Feedback, or an Auditor of Last Resort, as presented in
1027	   Section 11.1.2.

1029	10.2.  Dual-CA Compromise

1031	   XXX describes an attack possible by an adversary who compromises two
1032	   Certificate Authorities and a Log. This attack is difficult to defend
1033	   against in the CT ecosystem, and XXX describes a few approaches to
1034	   doing so.  We note that Gossip is not intended to defend against this
1035	   attack, but can in certain modes.

1037	   Defending against the Dual-CA Compromise attack requires SCT
1038	   Feedback, and explicitly requires the server to save full certificate
1039	   chains (described in Section 8.1.3 as the 'complex' configuration.)
1040	   After CT auditors receive the full certificate chains from servers,
1041	   they must compare the chain built by clients to the chain supplied by
1042	   the log.  If the chains differ significantly, the auditor can raise a
1043	   concern.

1045	   [ What does 'differ significantly' mean?  We should provide guidance.
1046	   I _think_ the correct algorithm to raise a concern is:

1048	   If one chain is not a subset of the other AND If the root
1049	   certificates of the chains are different THEN It's suspicious.

1051	   Justification: - Cross-Signatures could result in a different org
1052	   being treated as the 'root', but in this case, one chain would be a
1053	   subset of the other.  - Intermediate swapping (e.g. different
1054	   signature algorithms) could result in different chains, but the root
1055	   would be the same.

1057	   (Hitting both those cases at once would cause a false positive
1058	   though.)

1060	   What did I miss? ]
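
   The tentative rule in the note above can be written as a short
   Python sketch.  It is only a rendering of that open proposal, not
   settled guidance.  It assumes each chain is a non-empty list of
   comparable certificate encodings ordered from end-entity to root,
   so that the last element is the root.  The function name is
   illustrative.

```python
def chains_look_suspicious(client_chain, log_chain):
    """Apply the tentative 'differ significantly' heuristic.

    Flags the pair only when neither chain is a subset of the other AND
    the root certificates (assumed to be the last elements) differ.
    """
    a, b = set(client_chain), set(log_chain)
    neither_is_subset = not (a <= b or b <= a)
    roots_differ = client_chain[-1] != log_chain[-1]
    return neither_is_subset and roots_differ
```

   Under this rule a cross-signed chain (one chain contained in the
   other) and an intermediate swap under the same root are both
   treated as benign, matching the justification in the note.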

1062	10.3.  Censorship/Blocking considerations

1064	   We assume a network attacker who is able to fully control the
1065	   client's internet connection for some period of time, including
1066	   selectively blocking requests to certain hosts and truncating TLS
1067	   connections based on information observed or guessed about client
1068	   behavior.  In order to successfully detect log misbehavior, the
1069	   gossip mechanisms must still work even in these conditions.

1071	   There are several gossip connections that can be blocked:

1073	   1.  Clients sending SCTs to servers in SCT Feedback

1075	   2.  Servers sending SCTs to auditors in SCT Feedback (server push
1076	       mechanism)

1078	   3.  Servers making SCTs available to auditors (auditor pull
1079	       mechanism)

1081	   4.  Clients fetching proofs in STH Pollination

1083	   5.  Clients sending STHs to servers in STH Pollination

1085	   6.  Servers sending STHs to clients in STH Pollination

1087	   7.  Clients sending SCTs to Trusted Auditors

1089	   If a party cannot connect to another party, it can be assured that
1090	   the connection did not succeed.  While it may not have been
1091	   maliciously blocked, it knows the transaction did not succeed.
1092	   Mechanisms which result in a positive affirmation from the recipient
1093	   that the transaction succeeded allow confirmation that a connection
1094	   was not blocked.  In this situation, the party can factor this into
1095	   strategies suggested in Section 11.3 and in Section 11.1.2.

1097	   The connections that allow positive affirmation are 1, 2, 4, 5, and
1098	   7.

1100	   More insidious is blocking the connections that do not allow positive
1101	   confirmation: 3 and 6.  An attacker may truncate or drop a response
1102	   from a server to a client, such that the server believes it has
1103	   shared data with the recipient, when it has not.  However, in both
1104	   scenarios (3 and 6), the server cannot distinguish the client as a
1105	   cooperating member of the CT ecosystem or as an attacker performing a
1106	   sybil attack, aiming to flush the server's data store.  Therefore the
1107	   fact that these connections can be undetectably blocked does not
1108	   actually alter the threat model of servers responding to these
1109	   requests.  The choice of algorithm to release data is crucial to
1110	   protect against these attacks; strategies are suggested in
1111	   Section 11.3.

1113	   Handling censorship and network blocking (which is indistinguishable
1114	   from network error) is relegated to the implementation policy chosen
1115	   by clients.  Suggestions for client behavior are specified in
1116	   Section 11.1.

1118	10.4.  Privacy considerations

1120	   CT Gossip deals with HTTPS Clients which are trying to share
1121	   indicators that correspond to their browsing history.  The most
1122	   sensitive relationships in the CT ecosystem are the relationships
1123	   between HTTPS clients and HTTPS servers.  Client-server relationships
1124	   can be aggregated into a network graph with potentially serious
1125	   implications for correlative de-anonymisation of clients and
1126	   relationship-mapping or clustering of servers or of clients.

1128	   There are, however, certain clients that do not require privacy
1129	   protection.  Examples of these clients are web crawlers or robots.
1130	   But even in this case, the method by which these clients crawl the
1131	   web may in fact be considered sensitive information.  In general, it
1132	   is better to err on the side of safety, and not assume a client is
1133	   okay with giving up its privacy.

1135	10.4.1.  Privacy and SCTs

1137	   An SCT contains information that links it to a particular web site.
1138	   Because the client-server relationship is sensitive, gossip between
1139	   clients and servers about unrelated SCTs is risky.  Therefore, a
1140	   client with an SCT for a given server should transmit that
1141	   information in only two channels: to the server associated with the
1142	   SCT itself; and to a Trusted Auditor, if one exists.

1144	10.4.2.  Privacy in SCT Feedback

1146	   SCTs introduce yet another mechanism for HTTPS servers to store state
1147	   on an HTTPS client, and potentially track users.  HTTPS clients which
1148	   allow users to clear history or cookies associated with an origin
1149	   MUST clear stored SCTs and certificate chains associated with the
1150	   origin as well.

1152	   Auditors should treat all SCTs as sensitive data.  SCTs received
1153	   directly from an HTTPS client are especially sensitive, because the
1154	   auditor is trusted by the client to not reveal their associations
1155	   with servers.  Auditors MUST NOT share such SCTs in any way,
1156	   including sending them to an external log, without first mixing them
1157	   with multiple other SCTs learned through submissions from multiple
1158	   other clients.  Suggestions for mixing SCTs are presented in
1159	   Section 11.3.

1161	   There is a possible fingerprinting attack where a log issues a unique
1162	   SCT for targeted log client(s).  A colluding log and HTTPS server
1163	   operator could therefore be a threat to the privacy of an HTTPS
1164	   client.  Given all the other opportunities for HTTPS servers to
1165	   fingerprint clients - TLS session tickets, HPKP and HSTS headers,
1166	   HTTP Cookies, etc. - this is considered acceptable.

1168	   The fingerprinting attack described above would be mitigated by a
1169	   requirement that logs MUST use a deterministic signature scheme when
1170	   signing SCTs ([RFC-6962-BIS-09] Section 2.1.4).  A log signing using
1171	   RSA is not required to use a deterministic signature scheme.

1173	   Since logs are allowed to issue a new SCT for a certificate already
1174	   present in the log, mandating deterministic signatures does not stop
1175	   this fingerprinting attack altogether.  It does make the attack
1176	   harder to pull off without being detected though.

1178	   There is another similar fingerprinting attack where an HTTPS server
1179	   tracks a client by using a unique certificate or a variation of cert
1180	   chains.  The risk for this attack is accepted on the same grounds as
1181	   the unique SCT attack described above.  [XXX any mitigations possible
1182	   here?]

1184	10.4.3.  Privacy for HTTPS clients performing STH Proof Fetching

1186	   An HTTPS client performing Proof Fetching should only request proofs
1187	   from a CT log that it accepts SCTs from.  An HTTPS client MAY [TBD
1188	   SHOULD?] regularly request an STH from all logs it is willing to
1189	   accept, even if it has seen no SCTs from that log.

1191	   [ TBD how regularly?  This has operational implications for log
1192	   operators ]

1194	   The actual mechanism by which Proof Fetching is done carries
1195	   considerable privacy concerns.  Although out of scope for the
1196	   document, DNS is a mechanism currently discussed.  DNS exposes data
1197	   in plaintext over the network (including what sites the user is
1198	   visiting and what sites they have previously visited) and may not be
1199	   suitable for some.

1201	10.4.4.  Privacy in STH Pollination

1203	   An STH linked to an HTTPS client may indicate the following about
1204	   that client:

1206	   o  that the client gossips;

1208	   o  that the client has been using CT at least until the time that the
1209	      timestamp and the tree size indicate;

1211	   o  that the client is talking, possibly indirectly, to the log
1212	      indicated by the tree hash;

1214	   o  which software and software version is being used.

1216	   There is a possible fingerprinting attack where a log issues a unique
1217	   STH for a targeted HTTPS client.  This is similar to the
1218	   fingerprinting attack described in Section 10.4.2, but can operate
1219	   cross-origin.  If a log (or HTTPS Server cooperating with a log)
1220	   provides a unique STH to a client, the targeted client will be the
1221	   only client pollinating that STH cross-origin.

1223	   This attack is partially mitigated because the log is limited in the
1224	   number of STHs it can issue.  It must 'save' one of its STHs each MMD
1225	   to perform the attack.

1227	10.4.5.  Privacy in STH Interaction

1229	   An HTTPS client may pollinate any STH within the last 14 days.  An
1230	   HTTPS Client may also pollinate an STH for any log that it knows
1231	   about.  When a client pollinates STHs to a server, it will release
1232	   more than one STH at a time.  It is unclear if a server may 'prime' a
1233	   client and be able to reliably detect the client at a later time.

1235	   It's clear that a single site can track a user any way they wish, but
1236	   this attack works cross-origin and is therefore more concerning.  Two
1237	   independent sites A and B want to collaborate to track a user cross-
1238	   origin.  A feeds a client Carol some N specific STHs from the M logs
1239	   Carol trusts, chosen to be older and less common, but still in the
1240	   validity window.  Carol visits B and chooses to release some of the
1241	   STHs she has stored, according to some policy.

1243	   Modeling a representation for how common older STHs are in the pools
1244	   of clients, and examining that with a given policy of how to choose
1245	   which of those STHs to send to B, it should be possible to calculate
1246	   statistics about how unique Carol looks when talking to B and how
1247	   useful/accurate such a tracking mechanism is.

1249	   Building such a model is likely impossible without some real world
1250	   data, and requires a given implementation of a policy.  To combat
1251	   this attack, suggestions are provided in Section 11.3 to attempt to
1252	   minimize it, but follow-up testing with real world deployment to
1253	   improve the policy will be required.

1255	10.4.6.  Trusted Auditors for HTTPS Clients

1257	   Some HTTPS clients may choose to use a trusted auditor.  This trust
1258	   relationship exposes a large amount of information about the client
1259	   to the auditor.  In particular, it will identify the web sites that
1260	   the client has visited to the auditor.  Some clients may already
1261	   share this information to a third party, for example, when using a
1262	   server to synchronize browser history across devices in a server-
1263	   visible way, or when doing DNS lookups through a trusted DNS
1264	   resolver.  For clients with such a relationship already established,
1265	   sending SCTs to a trusted auditor run by the same organization does
1266	   not appear to expose any additional information to the trusted third
1267	   party.

1269	   Clients who wish to contact a CT auditor without associating their
1270	   identities with their SCTs may wish to use an anonymizing network
1271	   like Tor to submit SCT Feedback to the auditor.  Auditors SHOULD
1272	   accept SCT Feedback that arrives over such anonymizing networks.

1274	   Clients sending feedback to an auditor may prefer to reduce the
1275	   temporal granularity of the history exposure to the auditor by
1276	   caching and delaying their SCT Feedback reports.  This is elaborated
1277	   upon in Section 11.3.  This strategy is only as effective as the
1278	   granularity of the timestamps embedded in the SCTs and STHs.

1280	10.4.7.  HTTPS Clients as Auditors

1282	   Some HTTPS Clients may choose to act as CT auditors themselves.  A
1283	   Client taking on this role needs to consider the following:

1285	   o  an Auditing HTTPS Client potentially exposes its history to the
1286	      logs that they query.  Querying the log through a cache or a proxy
1287	      with many other users may avoid this exposure, but may expose
1288	      information to the cache or proxy, in the same way that a non-
1289	      Auditing HTTPS Client exposes information to a Trusted Auditor.

1291	   o  an effective CT auditor needs a strategy about what to do in the
1292	      event that it discovers misbehavior from a log.  Misbehavior from
1293	      a log involves the log being unable to provide either (a) a
1294	      consistency proof between two valid STHs or (b) an inclusion proof
1295	      for a certificate to an STH any time after the log's MMD has
1296	      elapsed from the issuance of the SCT.  The log's inability to
1297	      provide either proof will not be externally cryptographically-
1298	      verifiable, as it may be indistinguishable from a network error.

1300	11.  Policy Recommendations

1302	   This section is intended as suggestions to implementors of HTTPS
1303	   Clients, HTTPS Servers, and CT auditors.  It does not mandate any
1304	   particular implementation technique, so long as the privacy
1305	   considerations established above are obeyed.

1307	11.1.  Blocking Recommendations

1309	11.1.1.  Frustrating blocking

1311	   When making gossip connections to HTTPS Servers or Trusted Auditors,
1312	   it is desirable to minimize the plaintext metadata in the connection
1313	   that can be used to identify the connection as a gossip connection
1314	   and therefore be of interest to block.  Additionally, introducing
1315	   some randomness into client behavior may be important.  We assume
1316	   that the adversary is able to inspect the behavior of the HTTPS
1317	   client and understand how it makes gossip connections.

1319	   As an example, if a client, after establishing a TLS connection (and
1320	   receiving an SCT, but not making its own HTTP request yet),
1321	   immediately opens a second TLS connection for the purpose of gossip,
1322	   the adversary can reliably block this second connection to block
1323	   gossip without affecting normal browsing.  For this reason it is
1324	   recommended to run the gossip protocols over an existing connection
1325	   to the server, making use of connection multiplexing such as HTTP
1326	   Keep-Alives or SPDY.

1328	   Truncation is also a concern.  If a client always establishes a TLS
1329	   connection, makes a request, receives a response, and then always
1330	   attempts a gossip communication immediately following the first
1331	   response, truncation will allow an attacker to block gossip reliably.

1333	   For these reasons, we recommend that, if at all possible, clients
1334	   SHOULD send gossip data in an already established TLS session.  This
1335	   can be done through the use of HTTP Pipelining, SPDY, or HTTP/2.

1337	11.1.2.  Responding to possible blocking

1339	   In some circumstances a client may have a piece of data that they
1340	   have attempted to share (via SCT Feedback or STH Pollination), but
1341	   have been unable to do so: with every attempt they receive an error.
1342	   These situations are:

1344	   1.  The client has an SCT and a certificate, and attempts to retrieve
1345	       an inclusion proof - but receives an error on every attempt.

1347	   2.  The client has an STH, and attempts to resolve it to a newer STH
1348	       via a consistency proof - but receives an error on every attempt.

1350	   3.  The client has attempted to share an SCT and constructed
1351	       certificate via SCT Feedback - but receives an error on every
1352	       attempt.

1354	   4.  The client has attempted to share an STH via STH Pollination -
1355	       but receives an error on every attempt.

1357	   5.  The client has attempted to share a specific piece of data with a
1358	       Trusted Auditor - but receives an error on every attempt.

1360	   In the case of 1 or 2, it is conceivable that the reason for the
1361	   errors is that the log acted improperly, either through malicious
1362	   actions or compromise.  A proof may not be able to be fetched because
1363	   it does not exist (and only errors or timeouts occur).  One such
1364	   situation may arise because of an actively malicious log, as
1365	   presented in Section 10.1.  This data is especially important to
1366	   share with the broader internet to detect this situation.

1368	   If a client has repeatedly attempted to resolve an SCT to an STH via
1369	   an inclusion proof, and every attempt has failed, the client SHOULD
1370	   make every effort to send this SCT via SCT Feedback.  However, the
1371	   client MUST NOT share the data with any other third party (excepting
1372	   a Trusted Auditor should one exist).

1374	   If a client has repeatedly attempted to resolve an STH to a newer STH
1375	   via a consistency proof, and every attempt has failed, the client
1376	   MAY share the STH with an "Auditor of Last Resort" even if the STH in
1377	   question is no longer within the validity window.  This auditor may
1378	   be pre-configured in the client, but the client SHOULD permit a user
1379	   to disable the functionality or change whom data is sent to.  The
1380	   Auditor of Last Resort itself represents a point of failure, so if
1381	   implemented, the client should connect to it using public key pinning
1382	   and not consider an item delivered until it receives a confirmation.
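
   The "keep it until delivery is confirmed" behavior described above
   can be sketched in Python as follows.  This is only an illustration
   under stated assumptions: the class name LastResortQueue and the
   `send` callable are hypothetical, and `send` is assumed to perform
   the pinned connection itself and to return True only when the
   auditor has confirmed receipt.

```python
from collections import deque
from typing import Callable, Deque


class LastResortQueue:
    """Holds STHs for an Auditor of Last Resort until delivery is confirmed.

    Hypothetical sketch: `send` stands in for a pinned TLS submission and
    must return True only when the auditor confirmed receipt.
    """

    def __init__(self, send: Callable[[bytes], bool]) -> None:
        self._send = send
        self._pending: Deque[bytes] = deque()

    def enqueue(self, sth: bytes) -> None:
        # Avoid queuing the same STH twice.
        if sth not in self._pending:
            self._pending.append(sth)

    def flush(self) -> int:
        """Try each pending item once; return how many were confirmed."""
        delivered = 0
        for _ in range(len(self._pending)):
            sth = self._pending.popleft()
            try:
                confirmed = self._send(sth)
            except OSError:
                confirmed = False  # network errors count as "not delivered"
            if confirmed:
                delivered += 1
            else:
                self._pending.append(sth)  # keep it for a later attempt
        return delivered

    def __len__(self) -> int:
        return len(self._pending)
```

   An item leaves the queue only on a confirmed send, so an attacker
   who merely blocks the connection cannot cause the record to be
   dropped.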

1384	   In cases 3, 4, and 5, we assume that the webserver(s) or trusted
1385	   auditor in question is either experiencing an operational failure, or
1386	   being attacked.  In both cases, a client SHOULD retain the data for
1387	   later submission (subject to Private Browsing or other history-
1388	   clearing actions taken by the user.)  This is elaborated upon more in
1389	   Section 11.3.

1391	11.2.  Proof Fetching Recommendations

1393	   Proof fetching (both inclusion proofs and consistency proofs) should
1394	   be performed at random time intervals.  If proof fetching occurred
1395	   all at once, in a flurry of activity, a log would know that SCTs or
1396	   STHs received around the same time are more likely to come from a
1397	   particular client.  While proof fetching is required to be done in a
1398	   manner that attempts to be anonymous from the perspective of the log,
1399	   the correlation of activity to a single client would still reveal
1400	   patterns of user behavior we wish to keep confidential.  These
1401	   patterns could be recognizable as a single user, or could reveal what
1402	   sites are commonly visited together in the aggregate.
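
   One way to spread proof fetches out is to give every queued proof
   its own independently drawn delay.  The Python sketch below is a
   minimal illustration, not a recommended schedule: the function name
   and the delay bounds are hypothetical, and the draft does not specify
   a distribution.

```python
import secrets
from typing import List

_rng = secrets.SystemRandom()  # OS-backed CSPRNG from the standard library


def proof_fetch_delays(count: int, min_s: float, max_s: float) -> List[float]:
    """Return `count` independent delays in seconds, each in [min_s, max_s].

    Hypothetical helper: the bounds are chosen by the caller.
    """
    if count < 0 or min_s < 0 or max_s < min_s:
        raise ValueError("invalid delay parameters")
    return [_rng.uniform(min_s, max_s) for _ in range(count)]
```

   For example, proof_fetch_delays(3, 60.0, 3600.0) yields three
   separate waits between one minute and one hour, so three proofs
   queued together are not requested back to back.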

1404	   [ TBD: What other recommendations do we want to make here?  We can
1405	   talk more about the inadequacies of DNS... The first paragraph is 80%
1406	   identical between here and above ]

1408	11.3.  Record Distribution Recommendations

1410	   In several components of the CT Gossip ecosystem, the recommendation
1411	   is made that data from multiple sources be ingested, mixed, stored
1412	   for an indeterminate period of time, provided (multiple times) to a
1413	   third party, and eventually deleted.  The instances of these
1414	   recommendations in this draft are:

1416	   o  When a client receives SCTs during SCT Feedback, it should store
1417	      the SCTs and Certificate Chain for some amount of time, provide
1418	      some of them back to the server at some point, and may eventually
1419	      remove them from its store

1421	   o  When a client receives STHs during STH Pollination, it should
1422	      store them for some amount of time, mix them with other STHs,
1423	      release some of them to various servers at some point,
1424	      resolve some of them to new STHs, and eventually remove them from
1425	      its store

1427	   o  When a server receives SCTs during SCT Feedback, it should store
1428	      them for some period of time, provide them to auditors some number
1429	      of times, and may eventually remove them

1431	   o  When a server receives STHs during STH Pollination, it should
1432	      store them for some period of time, mix them with other STHs,
1433	      provide some of them to connecting clients, may resolve them to
1434	      new STHs via Proof Fetching, and eventually remove them from its
1435	      store

1437	   o  When a Trusted Auditor receives SCTs or historical STHs from
1438	      clients, it should store them for some period of time, mix them
1439	      with SCTs received from other clients, and act upon them at some
1440	      point in time

1442	   Each of these instances has specific requirements for user privacy,
1443	   and each has options that may not be invoked.  As one example, an
1444	   HTTPS client should not mix SCTs from server A with SCTs from server
1445	   B and release server B's SCTs to Server A.  As another example, an
1446	   HTTPS server may choose to resolve STHs to a single more current STH
1447	   via proof fetching, but it is under no obligation to do so.
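
   The first example above, keeping SCTs partitioned by the server that
   supplied them, could look roughly like the following Python sketch.
   The class and method names are hypothetical, and lower-casing the
   domain is our assumption about how origins are compared.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class PerOriginSCTs:
    """Keeps SCTs keyed by the domain they were observed on (hypothetical)."""

    def __init__(self) -> None:
        self._by_domain: DefaultDict[str, Set[bytes]] = defaultdict(set)

    def record(self, domain: str, sct: bytes) -> None:
        # Assumption: domains are compared case-insensitively.
        self._by_domain[domain.lower()].add(sct)

    def feedback_for(self, domain: str) -> List[bytes]:
        # Only SCTs observed from this exact domain are ever returned,
        # so server B's SCTs can never be released to server A.
        return list(self._by_domain.get(domain.lower(), set()))
```

   Because lookups never cross keys, the mixing and release policy
   discussed below operates within one origin's records at a time.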

1449	   These requirements should be met, but the general problem of
1450	   aggregating multiple pieces of data, choosing when and how many to
1451	   release, and when to remove them is shared.  This problem has
1452	   previously been considered in the case of Mix Networks and Remailers,
1453	   including papers such as "From a Trickle to a Flood: Active Attacks
1454	   on Several Mix Types", [Y], and [Z].

1456	   There are several concerns to be addressed in this area, outlined
1457	   below.

1459	11.3.1.  Mixing Algorithm

1461	   When SCTs or STHs are recorded by a participant in CT Gossip and
1462	   later used, it is important that they are selected from the datastore
1463	   in a non-deterministic fashion.

1465	   This is most important for servers, as they can be queried for SCTs
1466	   and STHs anonymously.  If the server used a predictable ordering
1467	   algorithm, an attacker could exploit the predictability to learn
1468	   information about a client.  One such method would be by observing
1469	   the (encrypted) traffic to a server.  When a client of interest
1470	   connects, the attacker makes a note.  They observe more clients
1471	   connecting, predict at what point the client-of-interest's data
1472	   will be disclosed, and ensure that they query the server at that
1473	   point.

1475	   Although most important for servers, random ordering is still
1476	   strongly recommended for clients and Trusted Auditors.  The above
1477	   attack can still occur for these entities, although the circumstances
1478	   are less straightforward.  For clients, an attacker could observe
1479	   their behavior, note when they receive an STH from a server, and use
1480	   javascript to cause a network connection at the correct time to force
1481	   a client to disclose the specific STH.  Trusted Auditors are stewards
1482	   of sensitive client data.  If an attacker had the ability to observe
1483	   the activities of a Trusted Auditor (perhaps by being a log, or
1484	   another auditor), they could perform the same attack - noting the
1485	   disclosure of data from a client to the Trusted Auditor, and then
1486	   correlating a later disclosure from the Trusted Auditor as coming
1487	   from that client.

1489	   Random ordering can be ensured by several mechanisms.  A datastore
1490	   can be shuffled, using a secure shuffling algorithm such as Fisher-
1491	   Yates.  Alternately, a series of random indexes into the data store
1492	   can be selected (if a collision occurs, a new index is selected.)  A
1493	   cryptographically secure random number generator must be used in
1494	   either case.  If shuffling is performed, the datastore must be marked
1495	   'dirty' upon item insertion, and at least one shuffle operation must
1496	   occur on a dirty datastore before data is retrieved from it for use.
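
   The shuffle-with-dirty-flag approach described above might be
   sketched in Python as follows.  The class ShuffledStore and its
   method names are hypothetical; the sketch assumes items are opaque
   byte strings and uses the standard library's `secrets` module as the
   cryptographically secure random number generator.

```python
import secrets
from typing import List


class ShuffledStore:
    """Datastore that is reshuffled before use whenever it has changed."""

    def __init__(self) -> None:
        self._items: List[bytes] = []
        self._dirty = False

    def insert(self, item: bytes) -> None:
        if item not in self._items:
            self._items.append(item)
            self._dirty = True  # insertion order is now predictable

    def _shuffle(self) -> None:
        # Fisher-Yates: walk from the end, swapping each slot with a
        # uniformly chosen slot at or before it.
        for i in range(len(self._items) - 1, 0, -1):
            j = secrets.randbelow(i + 1)  # unbiased, CSPRNG-backed
            self._items[i], self._items[j] = self._items[j], self._items[i]
        self._dirty = False

    def take(self, n: int) -> List[bytes]:
        """Return up to n items, shuffling first if the store is dirty."""
        if self._dirty:
            self._shuffle()
        return list(self._items[:n])
```

   This sketch shuffles only when the store is dirty, which is the
   minimum stated above; an implementation may prefer to reshuffle
   before every retrieval so that repeated reads do not return the same
   prefix.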

1498	11.3.2.  Flushing Attacks

1500	   A flushing attack is an attempt by an adversary to flush a particular
1501	   piece of data from a pool.  In the CT Gossip ecosystem, an attacker
1502	   may have performed an attack and left evidence of a compromised log
1503	   on a client or server.  They would be interested in flushing that
1504	   data, i.e., tricking the target into gossiping or pollinating the
1505	   incriminating evidence with only attacker-controlled clients or
1506	   servers with the hope they trick the target into deleting it.

1508	   Servers are most vulnerable to flushing attacks, as they release
1509	   records to anonymous connections.  An attacker can perform a Sybil
1510	   attack - connecting to the server hundreds or thousands of times in
1511	   an attempt to trigger repeated release of a record, and then
1512	   deletion.  For this reason, servers must be especially aggressive
1513	   about retaining data for a longer period of time.

1515	   Clients are vulnerable to flushing attacks targeting STHs, as these
1516	   can be given to any cooperating server and an attacker can generally
1517	   induce connections to random servers using javascript.  It would be
1518	   more difficult to perform a flushing attack against SCTs, as the
1519	   target server must be authenticated (and an attacker impersonating an
1520	   authentic server presents a recursive problem for the attacker).
1521	   Nonetheless, flushing SCTs should not be ruled impossible.  A Trusted
1522	   Auditor may also be vulnerable to flushing attacks if it does not
1523	   perform auditing operations itself.

1525	   Flushing attacks are defended against using non-determinism and dummy
1526	   messages.  The goal is to ensure that an adversary does not know for
1527	   certain if the data in question has been released or not, and if it
1528	   has been deleted or not.

1530	   [ TBD: At present, we do not have any support for dummy messages.  Do
1531	   we want to define a dummy message that clients and servers alike know
1532	   to ignore?  Will HTTP Compression leak the presence of >1 dummy
1533	   messages?
1534	   Is it sufficient to define a dummy message as _anything_ with an
1535	   invalid signature?  This would negatively impact SCT Feedback servers
1536	   that log all things just in case they're interesting. ]

1538	11.3.3.  The Deletion Algorithm

1540	   No entity in CT Gossip is required to delete SCTs or STHs at any
1541	   time, except to respect user's wishes such as private browsing mode
1542	   or clearing history.  However, requiring infinite storage space is
1543	   not a desirable characteristic in a protocol, so deletion is
1544	   expected.

1546	   While deletion of SCTs and STHs will occur, proof fetching can ensure
1547	   that any misbehavior from a log will still be detected, even after
1548	   the direct evidence from the attack is deleted.  Proof fetching
1549	   ensures that if a log presents a split view for a client, they must
1550	   maintain that split view in perpetuity.  An inclusion proof from an
1551	   SCT to an STH does not erase the evidence - the new STH is evidence
1552	   itself.  A consistency proof from that STH to a new one likewise -
1553	   the new STH is every bit as incriminating as the first.  (Client
1554	   behavior in the situation where an SCT or STH cannot be resolved is
1555	   suggested in Section 11.1.2.)  Because of this property, we recommend
1556	   that a client performing proof fetching make every effort to not
1557	   delete an SCT or STH until it has been successfully resolved to a
1558	   new STH via a proof.

1560	   When it is time to delete a record, it is important that the decision
1561	   to do so not be deterministic.  Introducing non-determinism in
1562	   the decision is absolutely necessary to prevent an adversary from
1563	   knowing with certainty that the record has been successfully flushed
1564	   from a target.  Therefore, we speak of making a record 'eligible for
1565	   deletion' and then being processed by the 'deletion algorithm'.
1566	   Making a record eligible for deletion simply means that it will have
1567	   the deletion algorithm run.  The deletion algorithm will use a
1568	   probability based system and a secure random number generator to
1569	   determine if the record will be deleted.

1571	   Although the deletion algorithm is specifically designed to be non-
1572	   deterministic, if the record has been resolved via proof to a new STH
1573	   the record may be safely deleted, as long as the new STH is retained.

1575	   The actual deletion algorithm may be [STATISTICS HERE].  [ Something
1576	   as simple as 'Pick an integer securely between 1 and 10.  If it's
1577	   greater than 7, delete the record.'  Or something more complicated. ]
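
   The simple rule quoted above (pick an integer between 1 and 10 and
   delete if it is greater than 7) could be written in Python as shown
   below.  This is only a sketch of that example, not a proposed final
   algorithm; the function name and its parameters are hypothetical.

```python
import secrets


def should_delete(threshold: int = 7, sides: int = 10) -> bool:
    """Draw an integer uniformly from 1..sides; delete if above threshold.

    Hypothetical helper: the defaults mirror the "1 to 10, greater
    than 7" example and give a 3-in-10 chance of deletion.
    """
    if not 0 <= threshold <= sides:
        raise ValueError("threshold must lie between 0 and sides")
    roll = secrets.randbelow(sides) + 1  # uniform in 1..sides, CSPRNG-backed
    return roll > threshold
```

   A delete_maybe() implementation would call such a function each
   time a record is eligible for deletion and remove the record only
   when it returns true.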

1579	   [ TODO Enumerating the problems of different types of mixes vs
1580	   Cottrell Mix ]

1582	11.3.3.1.  Experimental Algorithms

1584	   More complex algorithms could be inserted at any step.  Three
1585	   examples are illustrated:

1587	   SCTs are not eligible to be submitted to an Auditor of Last Resort.
1588	   Therefore, it is more important that they be resolved to STHs and
1589	   reported via SCT feedback.  If fetching an inclusion proof regularly
1590	   fails for a particular SCT, one can require it be reported more times
1591	   than normal via SCT Feedback before becoming eligible for deletion.

1593	   Before an item is made eligible for deletion by a client, the client
1594	   could aim to make it difficult for a point-in-time attacker to flush
1595	   the pool by not making an item eligible for deletion until the client
1596	   has moved networks (as seen by either the local IP address, or a
1597	   report-back providing the client with its observed public IP
1598	   address).  The HTTPS client could also require reporting over a
1599	   timespan, e.g. it must be reported at least N times, M weeks apart.
1600	   This strategy could be employed always, or only when the client has
1601	   disabled proof fetching and the Auditor of Last Resort, as those two
1602	   mechanisms (when used together) will enable a client to report most
1603	   attacks.
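
   The "at least N times, M weeks apart" condition can be checked with
   a short greedy pass over the report times.  The Python sketch below
   is one possible reading of that rule (reports must be pairwise
   separated by the minimum gap); the function and parameter names are
   hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def eligible_for_deletion(report_times: Iterable[datetime],
                          min_reports: int,
                          min_gap: timedelta) -> bool:
    """True once `min_reports` reports exist that are each at least
    `min_gap` apart from one another."""
    spaced = 0
    last_counted: Optional[datetime] = None
    for t in sorted(report_times):
        # Count a report only if it is far enough from the last counted
        # one; extra reports inside the gap are ignored.
        if last_counted is None or t - last_counted >= min_gap:
            spaced += 1
            last_counted = t
    return spaced >= min_reports
```

   With a two week gap and N = 3, reports on days 0, 1, 14, 15 and 28
   count as three spaced reports (days 0, 14 and 28), so the item
   becomes eligible only after the day 28 report.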

1605	11.3.3.2.  Concrete Recommendations

1607	   The recommendations for behavior are:
1608	   o  If proof fetching is enabled, do not delete an SCT until it has
1609	      had a proof resolving it to an STH.
1610	   o  If proof fetching continually fails for an SCT, do not make the
1611	      SCT eligible for deletion until it has been released, multiple
1612	      times, via SCT Feedback.
1613	   o  If proof fetching continually fails for an STH, do not make the
1614	      STH eligible for deletion until it has been queued for release
1615	      to an Auditor of Last Resort.
1616	   o  Do not dequeue entries to an Auditor of Last Resort if reporting
1617	      fails; keep the items queued until they are successfully sent.
1618	   o  Use a cryptographically secure random number generator both to
1619	      decide, by a probability based system, if an item is deleted and
1620	      to select items from the datastores by random index.

1622	   [ TBD: More? ]

1624	   We present the following pseudocode as a concrete outline of our
1625	   suggestion.

1627	11.3.3.2.1.  STH Data Structures

1629	   The STH class contains data pertaining specifically to the STH
1630	   itself.

1632	   class STH
1633	   {
1634	     uint32   proof_attempts
1635	     uint32   proof_failure_count
1636	     uint32   num_reports_to_thirdparty
1637	     datetime timestamp
1638	     byte[]   data
1639	   }

1641	   The broader STH store itself would contain all the STHs known by an
1642	   entity participating in STH Pollination (either client or server).
1643	   This simplistic view of the class does not take into account the
1644	   complicated locking that would likely be required for a data
1645	   structure being accessed by multiple threads.  One thing to note
1646	   about this pseudocode is that it aggressively removes STHs once they
1647	   have been resolved to a newer STH (if proof fetching is configured).
1648	   The only STHs in the store are ones that have never been resolved to
1649	   a newer STH, either because proof fetching does not occur, has
1650	   failed, or because the STH is considered too new to request a proof
1651	   for.  It seems less likely that servers will perform proof fetching.
1652	   Therefore it would be recommended that the various constants in use
1653	   be increased considerably to ensure STHs are pollinated more
1654	   aggressively.

1656	   class STHStore
1657	   {
1658	     STH[] sth_list

1660	     //  This function is run after receiving a set of STHs from
1661	     //  a third party in response to a pollination submission
1662	     def insert(STH[] new_sths) {
1663	       foreach(new in new_sths) {
1664	         if(this.sth_list.contains(new))
1665	           continue
1666	         this.sth_list.insert(new)
1667	       }
1668	     }

1670	     //  This function is called to possibly delete the given STH
1671	     //  from the data store
1672	     def delete_maybe(STH s) {
1673	       //Perform statistical test and see if I should delete this STH
1674	     }
1675	     //  This function is called to (certainly) delete the given STH
1676	     //  from the data store
1677	     def delete_now(STH s) {
1678	       this.sth_list.remove(s)
1679	     }

1681	     //  When it is time to perform STH Pollination, the HTTPS Client
1682	     //  calls this function to get a selection of STHs to send as
1683	     //  feedback
1684	     def get_pollination_selection() {
1685	       if(len(this.sth_list) < MAX_STH_TO_GOSSIP)
1686	         return this.sth_list
1687	       else {
1688	         indexes = set()
1689	         modulus = len(this.sth_list)
1690	         while(len(indexes) < MAX_STH_TO_GOSSIP) {
1691	           r = randomInt() % modulus
1692	           if(r not in indexes
1693	              && now() - this.sth_list[r].timestamp < ONE_WEEK)
1694	             indexes.insert(r)
1695	         }

1697	         return_selection = []
1698	         foreach(i in indexes) {
1699	           return_selection.insert(this.sth_list[i])
1700	         }
1701	         return return_selection
1702	       }
1703	     }
1704	   }
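
   For readers who prefer something runnable, the selection step of
   get_pollination_selection() can be approximated in Python as below.
   This is a sketch under assumptions, not a translation of the
   pseudocode: StoredSTH and pollination_selection are hypothetical
   names, the value of MAX_STH_TO_GOSSIP is illustrative, and the age
   filter is applied before sampling (in both branches) so that the
   random draw always terminates.

```python
import secrets
import time
from typing import List, NamedTuple, Optional

ONE_WEEK = 7 * 24 * 3600   # seconds; mirrors the constant in the pseudocode
MAX_STH_TO_GOSSIP = 10     # illustrative value, not taken from the draft


class StoredSTH(NamedTuple):
    timestamp: float  # seconds since the epoch
    data: bytes


def pollination_selection(sths: List[StoredSTH],
                          now: Optional[float] = None) -> List[StoredSTH]:
    """Pick up to MAX_STH_TO_GOSSIP STHs younger than ONE_WEEK, at random."""
    if now is None:
        now = time.time()
    # Filter first, so the random draw below always terminates even when
    # few stored STHs are still fresh.
    fresh = [s for s in sths if now - s.timestamp < ONE_WEEK]
    k = min(MAX_STH_TO_GOSSIP, len(fresh))
    # SystemRandom draws from the operating system's CSPRNG.
    return secrets.SystemRandom().sample(fresh, k)
```

   Sampling without replacement from the filtered list gives the same
   "random distinct indexes" effect as the loop above, without retrying
   on collisions or on stale entries.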

1706	   We also suggest a function that can be called periodically in the
1707	   background, iterating through the STH store, performing a cleaning
1708	   operation and queuing consistency proofs.  This function can live as
1709	   a member function of the STHStore class.

1711	   def clean_list() {
1712	     foreach(sth in this.sth_list) {

1714	       if(now() - sth.timestamp > ONE_WEEK) {
1715	         //STH is too old, we must remove it
1716	         if(proof_fetching_enabled
1717	            && auditor_of_last_resort_enabled
1718	            && (sth.proof_failure_count / sth.proof_attempts)
1719	               > MIN_PROOF_FAILURE_RATIO_CONSIDERED_SUSPICIOUS) {
1720	           queue_sth_for_auditor_of_last_resort(sth)
1721	           delete_maybe(sth)
1722	         } else {
1723	           delete_now(sth)
1724	         }
1725	       }

1727	       else if(proof_fetching_enabled
1728	               && now() - sth.timestamp > TWO_DAYS
1729	               && now() - sth.timestamp > LOG_MMD) {
1730	         sth.proof_attempts++
1731	         queue_consistency_proof(sth, consistency_proof_callback)
1732	       }
1733	     }
1734	   }

1736	11.3.3.2.2.  STH Deletion Procedure

1738	   The STH Deletion Procedure is run after successfully submitting a
1739	   list of STHs to a third party during pollination.  The following
1740	   pseudocode would be included in the STHStore class, and called with
1741	   the result of get_pollination_selection(), after the STHs have been
1742	   (successfully) sent to the third party.

1744	   //  This function is called after successfully pollinating STHs
1745	   //  to a third party. It is passed the STHs sent to the third
1746	   //  party, which is the output of get_pollination_selection()
1747	   def after_submit_to_thirdparty(STH[] sth_list)
1748	   {
1749	     foreach(sth in sth_list)
1750	     {
1751	       sth.num_reports_to_thirdparty++

1753	       if(proof_fetching_enabled) {
1754	         if(now() - sth.timestamp > LOG_MMD) {
1755	           sth.proof_attempts++
1756	           queue_consistency_proof(sth, consistency_proof_callback)
1757	         }

1759	         if(auditor_of_last_resort_enabled
1760	            && sth.proof_failure_count >
1761	               MIN_PROOF_ATTEMPTS_CONSIDERED_SUSPICIOUS
1762	            && (sth.proof_failure_count / sth.proof_attempts) >
1763	               MIN_PROOF_FAILURE_RATIO_CONSIDERED_SUSPICIOUS) {
1764	             queue_sth_for_auditor_of_last_resort(sth)
1765	         }
1766	       }
1767	       else { //proof fetching not enabled
1768	         if(sth.num_reports_to_thirdparty
1769	            > MIN_STH_REPORTS_TO_THIRDPARTY) {
1770	           delete_maybe(sth)
1771	         }
1772	       }
1773	     }
1774	   }

1776	   def consistency_proof_callback(consistency_proof,
1777	                                  original_sth,
1778	                                  error) {
1779	     if(!error) {
1780	       insert([consistency_proof.current_sth])
1781	       delete_now(original_sth)
1782	     } else {
1783	       original_sth.proof_failure_count++
1784	     }
1785	   }

1787	11.3.3.2.3.  SCT Data Structures

1789	   [ TBD: This section is not well abstracted to be used for both
1790	   servers and clients. ]
1791	   The SCT class contains data pertaining specifically to the SCT
1792	   itself.

1794	   class SCT
1795	   {
1796	     uint32 proof_attempts
1797	     uint32 proof_failure_count
1798	     bool   has_been_resolved_to_sth
1799	     byte[] data
1800	   }

1802	   The SCT bundle will contain the trusted certificate chain the HTTPS
1803	   client built (chaining to a trusted root certificate.)  It also
1804	   contains the list of associated SCTs, the exact domain it is
1805	   applicable to, and metadata pertaining to how often it has been
1806	   reported to the third party.

1808	   class SCTBundle
1809	   {
1810	     X509[] certificate_chain
1811	     SCT[]  sct_list
1812	     string domain
1813	     uint32 num_reports_to_thirdparty

1815	     def equals(sct_bundle) {
1816	       if(sct_bundle.domain != this.domain)
1817	         return false
1818	       if(sct_bundle.certificate_chain != this.certificate_chain)
1819	         return false
1820	       if(sct_bundle.sct_list != this.sct_list)
1821	         return false

1823	       return true
1824	     }
1825	     def approx_equals(sct_bundle) {
1826	       if(sct_bundle.domain != this.domain)
1827	         return false
1828	       if(sct_bundle.certificate_chain != this.certificate_chain)
1829	         return false

1831	       return true
1832	     }

1834	     def insert_scts(SCT[] sct_list) {
1835	       this.sct_list.union(sct_list)
1836	       this.num_reports_to_thirdparty = 0
1837	     }

1839	     def has_been_fully_resolved_to_sths() {
1840	       foreach(s in this.sct_list) {
1841	         if(!s.has_been_resolved_to_sth)
1842	           return false
1843	       }
1844	       return true
1845	     }

1847	     def max_proof_failure_count() {
1848	       uint32 max = 0
1849	       foreach(s in this.sct_list) {
1850	         if(s.proof_failure_count > max)
1851	           max = s.proof_failure_count
1852	       }
1853	       return max
1854	     }
1855	   }
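
   The comparison and merge rules of the SCTBundle class above can be
   sketched in Python as follows.  This is a minimal illustration, not
   part of the protocol: representing the certificate chain as a tuple
   of encoded certificates and the SCT list as a set of byte strings are
   assumptions made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SCTBundle:
    """Illustrative stand-in for the SCTBundle pseudocode."""
    certificate_chain: tuple   # encoded certificates, leaf first (assumed)
    domain: str
    sct_list: set = field(default_factory=set)   # raw SCT bytes (assumed)
    num_reports_to_thirdparty: int = 0

    def approx_equals(self, other):
        # Same domain and same chain; the SCT lists may differ.
        return (self.domain == other.domain
                and self.certificate_chain == other.certificate_chain)

    def equals(self, other):
        # Same domain, same chain, and the same set of SCTs.
        return self.approx_equals(other) and self.sct_list == other.sct_list

    def insert_scts(self, scts):
        # Merge the new SCTs and restart the report counter.
        self.sct_list |= set(scts)
        self.num_reports_to_thirdparty = 0
```

   Two bundles for the same domain and chain that carry different SCTs
   are approximately equal but not equal; merging them yields one
   bundle with the union of the SCTs and a report counter of zero.
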
1856	   We suppose a large data structure is used, such as a hashmap, indexed
1857	   by the domain name.  For each domain, the structure will contain a
1858	   data structure that holds the SCTBundles seen for that domain, as
1859	   well as encapsulating some logic relating to SCT Feedback for that
1860	   particular domain.

1862	   class SCTStore
1863	   {
1864	     string   domain
1865	     datetime last_contact_for_domain
1866	     uint32   num_submissions_attempted
1867	     uint32   num_submissions_succeeded
1868	     SCTBundle[] observed_records

1870	     //  This function is called after receiving an SCTBundle.
1871	     //  For Clients, this is after a successful connection to an
1872	     //  HTTPS Server, calling this function with an SCTBundle
1873	     //  constructed from that certificate chain and SCTs.
1874	     //  For Servers, this is after receiving SCT Feedback
1875	     def insert(SCTBundle b) {
1876	       if(operator_is_server) {
1877	         if(!passes_validity_checks(b))
1878	           return
1879	       }
1880	       foreach(e in this.observed_records) {
1881	         if(e.equals(b))
1882	           return
1883	         else if(e.approx_equals(b)) {
1884	           e.insert_scts(b.sct_list)
1885	           return
1886	         }
1887	       }
1888	       this.observed_records.insert(b)
1889	     }

1891	     //  When it is time to perform SCT Feedback, the HTTPS Client
1892	     //  calls this function to get a selection of SCTBundles to send
1893	     //  as feedback
1894	     def get_gossip_selection() {
1895	       if(len(observed_records) > MAX_SCT_RECORDS_TO_GOSSIP) {
1896	         indexes = set()
1897	         modulus = len(observed_records)
1898	         while(len(indexes) < MAX_SCT_RECORDS_TO_GOSSIP) {
1899	           r = randomInt() % modulus
1900	           if(r not in indexes)
1901	             indexes.insert(r)
1902	         }
1903	         return_selection = []
1904	         foreach(i in indexes) {
1905	           return_selection.insert(this.observed_records[i])
1906	         }

1908	         return return_selection
1909	       }
1910	       else
1911	         return this.observed_records
1912	     }

1914	     def delete_maybe(SCTBundle b) {
1915	       //Perform statistical test and see if I should delete this bundle
1916	     }

1918	     def delete_now(SCTBundle b) {
1919	       this.observed_records.remove(b)
1920	     }

1922	     def passes_validity_checks(SCTBundle b) {
1923	       //  This function performs the validity checks specified in
1924	       //  {{feedback-srvop}}
1925	     }
1926	   }
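
   The selection step of get_gossip_selection() can be sketched in
   Python as follows.  This is an illustration under assumptions: the
   value of MAX_SCT_RECORDS_TO_GOSSIP is a placeholder, and
   random.sample() is used in place of the explicit retry loop in the
   pseudocode because it already draws distinct elements.  The draft
   does not say which random number generator to use; the rng parameter
   is a hypothetical hook for substituting one.

```python
import random

MAX_SCT_RECORDS_TO_GOSSIP = 3   # placeholder; the draft fixes no value


def get_gossip_selection(observed_records, rng=random):
    """Return at most MAX_SCT_RECORDS_TO_GOSSIP distinct records."""
    if len(observed_records) > MAX_SCT_RECORDS_TO_GOSSIP:
        # sample() picks without replacement, so no retry loop is needed.
        return rng.sample(observed_records, MAX_SCT_RECORDS_TO_GOSSIP)
    # Few enough records: report all of them.
    return list(observed_records)
```

   Passing rng=random.SystemRandom() would draw from the operating
   system's entropy source instead of the default generator.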

1928	   We also suggest a function that can be called periodically in the
1929	   background, iterating through all SCTStore objects in the large
1930	   hashmap (here called 'all_sct_stores') and removing old data.

1932	   def clear_old_data()
1933	   {
1934	     foreach(storeEntry in all_sct_stores)
1935	     {
1936	       if(storeEntry.num_submissions_succeeded == 0
1937	          && storeEntry.num_submissions_attempted
1938	             > MIN_SCT_ATTEMPTS_FOR_DOMAIN_TO_BE_IGNORED)
1939	       {
1940	         all_sct_stores.remove(storeEntry)
1941	       }
1942	       else if(storeEntry.num_submissions_succeeded > 0
1943	               && now() - storeEntry.last_contact_for_domain
1944	                  > TIME_UNTIL_OLD_SCTDATA_ERASED)
1945	       {
1946	         all_sct_stores.remove(storeEntry)
1947	       }
1948	     }
1949	   }
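
   A Python sketch of the same pruning pass follows.  It is an
   illustration only: both constants are placeholder values, the
   SCTStore shown here carries just the fields the pass reads, and the
   current time is passed in as a parameter to keep the function easy
   to exercise.  Stale entries are collected first and removed after
   the scan, because a Python dict must not change size while it is
   being iterated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Placeholder values; the draft leaves both constants unspecified.
MIN_SCT_ATTEMPTS_FOR_DOMAIN_TO_BE_IGNORED = 5
TIME_UNTIL_OLD_SCTDATA_ERASED = timedelta(days=30)


@dataclass
class SCTStore:
    domain: str
    last_contact_for_domain: datetime
    num_submissions_attempted: int = 0
    num_submissions_succeeded: int = 0


def clear_old_data(all_sct_stores, now):
    """Remove per-domain stores according to the two rules above."""
    stale = []
    for domain, store in all_sct_stores.items():
        # Rule 1: many attempts, none ever accepted by the server.
        never_accepted = (
            store.num_submissions_succeeded == 0
            and store.num_submissions_attempted
            > MIN_SCT_ATTEMPTS_FOR_DOMAIN_TO_BE_IGNORED)
        # Rule 2: feedback once worked, but the domain has gone quiet.
        gone_quiet = (
            store.num_submissions_succeeded > 0
            and now - store.last_contact_for_domain
            > TIME_UNTIL_OLD_SCTDATA_ERASED)
        if never_accepted or gone_quiet:
            stale.append(domain)
    # Delete after the scan so the dict is not mutated mid-iteration.
    for domain in stale:
        del all_sct_stores[domain]
```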

1951	11.3.3.2.4.  SCT Deletion Procedure

1953	   The SCT Deletion procedure is more complicated than the respective
1954	   STH procedure.  This is because servers may elect not to participate
1955	   in SCT Feedback, and this must be accounted for by being more
1956	   conservative in sending SCT reports to them.

1958	   The following pseudocode would be included in the SCTStore class, and
1959	   called with the result of get_gossip_selection() after the SCT
1960	   Feedback has been sent (successfully) to the server.  We also note
1961	   that the first experimental algorithm from above is included in the
1962	   pseudocode as an illustration.

1964	   //  This function is called after successfully providing SCT Feedback
1965	   //  to a server. It is passed the feedback sent to the server, which
1966	   //  is the output of get_gossip_selection()
1967	   def after_submit_to_thirdparty(SCTBundle[] submittedBundles)
1968	   {
1969	     foreach(bundle in submittedBundles)
1970	     {
1971	       bundle.num_reports_to_thirdparty++

1973	       if(proof_fetching_enabled) {
1974	         if(!bundle.has_been_fully_resolved_to_sths()) {
1975	           foreach(s in bundle.sct_list) {
1976	             if(!s.has_been_resolved_to_sth) {
1977	               s.proof_attempts++
1978	               queue_inclusion_proof(s, inclusion_proof_callback)
1979	             }
1980	           }
1981	         }
1982	         else {
1983	           if(run_ct_gossip_experiment_one) {
1984	             if(bundle.num_reports_to_thirdparty
1985	                > MIN_SCT_REPORTS_TO_THIRDPARTY
1986	                && bundle.num_reports_to_thirdparty * 1.5
1987	                   > bundle.max_proof_failure_count()) {
1988	               delete_maybe(bundle)
1989	             }
1990	           }
1991	           else { //Do not run experiment
1992	             if(bundle.num_reports_to_thirdparty
1993	                > MIN_SCT_REPORTS_TO_THIRDPARTY) {
1994	               delete_maybe(bundle)
1995	             }
1996	           }
1997	         }
1998	       }
1999	       else {//proof fetching not enabled
2000	         if(bundle.num_reports_to_thirdparty
2001	            > (MIN_SCT_REPORTS_TO_THIRDPARTY
2002	               * NO_PROOF_FETCHING_REPORT_INCREASE_FACTOR)) {
2003	           delete_maybe(bundle)
2004	         }
2005	       }
2006	     }
2007	   }

2009	   // This function is a callback invoked after an inclusion proof
2010	   // has been retrieved
2011	   def inclusion_proof_callback(inclusion_proof, original_sct, error)
2012	   {
2013	     if(!error) {
2014	       original_sct.has_been_resolved_to_sth = true
2015	       insert_to_sth_datastore(inclusion_proof.new_sth)
2016	     } else {
2017	       original_sct.proof_failure_count++
2018	     }
2019	   }
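
   The branching in after_submit_to_thirdparty() can be condensed into
   a single predicate that says whether a bundle would be handed to the
   probabilistic deletion step.  The Python sketch below is an
   illustration under assumptions: the function name and its parameters
   are hypothetical, and both constants are placeholder values.  It
   covers only the decision, not the queuing of inclusion proofs.

```python
MIN_SCT_REPORTS_TO_THIRDPARTY = 3              # placeholder value
NO_PROOF_FETCHING_REPORT_INCREASE_FACTOR = 2   # placeholder value


def eligible_for_deletion(num_reports, fully_resolved, max_proof_failures,
                          proof_fetching_enabled, run_experiment_one=False):
    """True when the bundle would reach the deletion step."""
    if not proof_fetching_enabled:
        # Without proofs, require proportionally more reports.
        return num_reports > (MIN_SCT_REPORTS_TO_THIRDPARTY
                              * NO_PROOF_FETCHING_REPORT_INCREASE_FACTOR)
    if not fully_resolved:
        # Unresolved SCTs lead to further proof attempts instead.
        return False
    if num_reports <= MIN_SCT_REPORTS_TO_THIRDPARTY:
        return False
    if run_experiment_one:
        # Experiment one also weighs reports against proof failures.
        return num_reports * 1.5 > max_proof_failures
    return True
```

   With these placeholder values, a fully resolved bundle reported four
   times is eligible, while the same bundle under experiment one is not
   eligible once its worst SCT has failed six proof attempts.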

2021	12.  IANA considerations

2023	   [ TBD ]

2025	13.  Contributors

2027	   The authors would like to thank the following contributors for
2028	   valuable suggestions: Al Cutter, Ben Laurie, Benjamin Kaduk, Josef
2029	   Gustafsson, Karen Seo, Magnus Ahltorp, Stephen Kent, Yan Zhu.

2031	14.  ChangeLog

2033	14.1.  Changes between ietf-01 and ietf-02

2035	   o  Requiring full certificate chain in SCT Feedback.

2037	   o  Clarifications on what clients store for and send in SCT Feedback
2038	      added.

2040	   o  SCT Feedback server operation updated to protect against DoS
2041	      attacks on servers.

2043	   o  Pre-Loaded vs Locally Added Anchors explained.

2045	   o  Base for well-known URLs changed.

2047	   o  Remove all mentions of monitors - gossip deals with auditors.

2049	   o  New sections added: Trusted Auditor protocol, attacks by actively
2050	      malicious log, the Dual-CA compromise attack, policy
2051	      recommendations.

2053	14.2.  Changes between ietf-00 and ietf-01

2055	   o  Improve language and readability based on feedback from Stephen
2056	      Kent.

2058	   o  STH Pollination Proof Fetching defined and indicated as optional.

2060	   o  3-Method Ecosystem section added.

2062	   o  Cases with Logs ceasing operation handled.

2064	   o  Text on tracking via STH Interaction added.

2066	   o  Section with some early recommendations for mixing added.

2068	   o  Section detailing blocking connections, frustrating it, and the
2069	      implications added.

2071	14.3.  Changes between -01 and -02

2073	   o  STH Pollination defined.

2075	   o  Trusted Auditor Relationship defined.

2077	   o  Overview section rewritten.

2079	   o  Data flow picture added.

2081	   o  Section on privacy considerations expanded.

2083	14.4.  Changes between -00 and -01

2085	   o  Add the SCT feedback mechanism: Clients send SCTs to originating
2086	      web server which shares them with auditors.

2088	   o  Stop assuming that clients see STHs.

2090	   o  Don't use HTTP headers but instead .well-known URLs - avoid that
2091	      battle.

2093	   o  Stop referring to trans-gossip and trans-gossip-transport-https -
2094	      too complicated.

2096	   o  Remove all protocols but HTTPS in order to simplify - let's come
2097	      back and add more later.

2099	   o  Add more reasoning about privacy.

2101	   o  Do specify data formats.

2103	15.  References

2105	15.1.  Normative References

2107	   [RFC-6962-BIS-09]
2108	              Laurie, B., Langley, A., Kasper, E., Messeri, E., and R.
2109	              Stradling, "Certificate Transparency", October 2015,
2110	              <https://datatracker.ietf.org/doc/draft-ietf-trans-
2111	              rfc6962-bis/>.

2113	   [RFC7159]  Bray, T., "The JavaScript Object Notation (JSON) Data
2114	              Interchange Format", RFC 7159, March 2014.

2116	15.2.  Informative References

2118	   [draft-ietf-trans-threat-analysis-03]
2119	              Kent, S., "Attack Model and Threat for Certificate
2120	              Transparency", October 2015,
2121	              <https://datatracker.ietf.org/doc/draft-ietf-trans-threat-
2122	              analysis/>.

2124	Authors' Addresses

2126	   Linus Nordberg
2127	   NORDUnet

2129	   Email: linus@nordu.net

2131	   Daniel Kahn Gillmor
2132	   ACLU

2134	   Email: dkg@fifthhorseman.net

2136	   Tom Ritter

2138	   Email: tom@ritter.vg