Re: [TLS] Consensus Call on draft-ietf-tls-dnssec-chain-extension

Viktor Dukhovni <ietf-dane@dukhovni.org> Wed, 04 April 2018 23:27 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/5yiDdsw1GZ6PWgcR0P-C6KP2VKA>


> On Apr 4, 2018, at 1:50 PM, Joseph Salowey <joe@salowey.net> wrote:
> 
> - Recommendation of adding denial of existence proofs in the chain provided by the extension
> - Adding signaling to require the use of this extension for a period of time (Pinning with TTL)

These are indeed the immediate proposed fixes, and ultimately the consensus call will be about
whether the fixes should be formalized and integrated or the document advanced as-is, but before
we consider them in a vacuum, I'd like to explain in some detail *why* they are needed.  This
requires firstly an examination of why the extension is needed in the first place, and what
use-cases it should address.  This will be detailed below, but first:

> This is a consensus call on how to progress this document.  Please answer the following questions:
> 
> 1) Do you support publication of the document as is, leaving these two issues to potentially be addressed in follow-up work?
> 
> If the answer to 1) is no then please indicate if you think the working group should work on the document to include 
> 
> A) Recommendation of adding denial of existence proofs in the chain provided by the extension
> B) Adding signaling to require the use of this extension for a period of time (Pinning with TTL)
> C) Both

For the record, (C) both.  And now to explain why:

The first thing to understand is why this extension is needed in the first place.  A partial
answer can be seen in the introduction to the draft.  Please pardon burying the lede: below
I quote from the text of the introduction in order of presentation, but the main point is closer
to the end:

  -----------------
  This draft describes a new TLS [RFC5246] [TLS13] extension for
  transport of a DNS record set serialized with the DNSSEC signatures
  [RFC4034] needed to authenticate that record set.
  -----------------

We should note that an empty record set (NXDOMAIN or NODATA) is still
a record set, and that DNSSEC *authenticates* denial of existence.  The
extension must also support denial of existence in order to provide
downgrade-resistance against "TLSA-stripping".  The reason is that this
extension is not being introduced into a brand new Internet in which the
CA/B forum WebPKI does not exist and DANE will be the *only* way for
TLS clients to authenticate servers.  For this extension to coexist
with the WebPKI, clients will need to be able to authenticate *some*
servers via DANE and others (those that don't have DANE TLSA records)
via the existing WebPKI.

Suppose that, in the absence of authenticated denial of existence and
of any way for clients to rely on continued support for the extension:

   * As will be the case for most existing applications, the client
     is willing to use the WebPKI when DANE TLSA records don't appear
     to be available.

   * An attacker is able to obtain a fraudulent WebPKI certificate.

Then the attacker can impersonate the target server by leaving out
the extension and presenting just the fraudulent WebPKI certificate,
and the client will have to accept this even when the real server
previously employed the extension and provided TLSA records.
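
To make the downgrade concrete, here is a minimal sketch of such a
client policy in Python.  The `Handshake` type and the `accepts`
function are hypothetical names used only for illustration; neither
comes from the draft, and the sketch assumes the "restrictive" usages
where a TLSA match constrains a WebPKI-validated chain.  The attacker
simply omits the extension, which this client cannot distinguish from
a server that has no TLSA records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Handshake:
    """Hypothetical summary of what the client learned from one handshake."""
    webpki_ok: bool              # certificate chains to a trusted WebPKI CA
    tlsa_match: Optional[bool]   # None when the extension was not sent at all


def accepts(hs: Handshake) -> bool:
    """Client policy with no denial of existence and no pin (assumed)."""
    if hs.tlsa_match is not None:
        # Extension present: the TLSA records constrain the WebPKI result.
        return hs.webpki_ok and hs.tlsa_match
    # Extension absent: "no TLSA records" and "TLSA records stripped"
    # look identical, so the client falls back to the WebPKI alone.
    return hs.webpki_ok


genuine = Handshake(webpki_ok=True, tlsa_match=True)
# Fraudulent but CA-issued certificate, extension left out by the attacker.
attacker = Handshake(webpki_ok=True, tlsa_match=None)
# The same fraudulent certificate would fail if the extension were required.
mismatch = Handshake(webpki_ok=True, tlsa_match=False)
```

Under this policy both `genuine` and `attacker` are accepted, while
`mismatch` is rejected, so the attacker's best move is always to omit
the extension.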

What the above establishes is that in mixed environments, i.e.
almost all use-cases for this specification, TLSA records provide
no additional protection: if the WebPKI fails, security is compromised
even with DANE.

Some have argued that this is OK, that even though the above
"restrictive" use-case is unavailable we still get an "additive"
use case, where servers can be authenticated with DANE when
that happens to be present and otherwise WebPKI.  Sadly, that
use-case is largely a mirage.  Firstly, given the existing
dominant client base that only supports WebPKI, any server
would in any case still need WebPKI certificates (say from
Let's Encrypt), and that will suffice for all clients both
those that support DANE and those that don't.  Thus enabling
DANE on the server only adds a way to impersonate the server
if DANE is compromised, and does not defend against WebPKI
compromise.  So the server gets more complexity, and
reduced security, and maybe even increased risk of failure
if the TLSA records are managed carelessly and fail to match.

No rational server operator will employ the "additive" mode.
Some DANE idealists might do it to signal their support for
the technology, but this will be futile; they'll get no real
benefit from doing so.

Continuing on with the introduction:

  -----------------
  The intent of this
  proposal is to allow TLS clients to perform DANE Authentication
  [RFC6698] [RFC7671] of a TLS server without performing additional DNS
  record lookups and incurring the associated latency penalty.  It also
  provides the ability to avoid potential problems with TLS clients
  being unable to look up DANE records because of an interfering or
  broken middlebox on the path between the client and a DNS server
  [HAMPERING].  And lastly, it allows a TLS client to validate the
  server's DANE (TLSA) records itself without needing access to a
  validating DNS resolver to which it has a secure connection.
  -----------------

Here we see the promise of being able to get by without having
access to a validating resolver, but validating resolvers deliver
both signed data when present and signed proof of non-existence when
the requested records are absent.  But most importantly:

  -----------------
  This mechanism is useful for TLS applications that need to address
  the problems described above, typically web browsers or SIP/VoIP
  [RFC3261] and XMPP [RFC7590].
  -----------------

This plainly shows that browser HTTPS was a motivating use-case
for the extension.  Indeed, the earlier mention of latency concerns
also reflects the reasons why browsers don't (want to) do DANE even
when direct DNSSEC lookups are possible.

Not just browsers: no existing application can adopt the specification
incrementally.  Even if the extension is required in the application,
it is unrealistic to expect that all server domains will rapidly
adopt DNSSEC.  At the moment there are ~6 million DNSSEC-signed
domains, substantially concentrated in northern Europe, with
fairly thin adoption generally (only ~800K domains out of ~120
million in .com).  An application that expects the extension
should expect to not find TLSA records for many servers it
connects to, indeed many server domains will be unsigned, and
the server would only be able to present denial of existence
for the DS records of a parent domain, proving its domain
"insecure" (in DNSSEC terms).

The modifications in (A) and (B) are most similar to STS (not HPKP)
and harden DANE-authenticated HTTPS against downgrade to just DV,
and make going to all the trouble of supporting the extension at
the server a tangible security benefit.  Absent such a benefit,
no reasonable server operator will implement the extension. Just
deploying a WebPKI certificate, which is necessary in any case,
provides the same protection with less effort.

The pin-time in (B) also allows servers to signal a zero pin time,
and thus indicate an incomplete deployment, reducing breakage when
clients find the extension partially deployed, but decide to expect
it anyway for security reasons.

The pin (pinning just the boolean presence of the extension, like
STS pins availability of TLS) would not pin any of the TLSA data;
that's subject to the normal, and typically short, DNS TTLs should
the client choose to cache the presented DNS records.  Its lifetime
would be measured in hours (not seconds) and I'd go with a 16-bit
value that represents any time from 0 hours (partial deployment)
to 65535 hours, which is close to 7.5 years.
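
The range of such a field is easy to check.  The sketch below is only
an illustration: the two-byte network-order encoding and the function
names are assumptions, not something the draft or this message
specifies.

```python
import struct

MAX_PIN_HOURS = 0xFFFF  # largest value of an unsigned 16-bit field (65535)


def encode_pin_hours(hours: int) -> bytes:
    """Pack a pin lifetime in hours as 2 bytes, big-endian (assumed format)."""
    if not 0 <= hours <= MAX_PIN_HOURS:
        raise ValueError("pin lifetime must fit in 16 bits")
    return struct.pack("!H", hours)


def decode_pin_hours(wire: bytes) -> int:
    """Inverse of encode_pin_hours."""
    (hours,) = struct.unpack("!H", wire)
    return hours


# Longest expressible pin, in years of 365.25 days: just under 7.5.
max_years = MAX_PIN_HOURS / (24 * 365.25)
```

A zero value round-trips like any other, so "partial deployment" needs
no special encoding.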

The pin is set (or reset to a new value) when the server performs
a handshake in which the extension is present with a non-zero TTL,
and either provides matching TLSA records or a valid denial of
existence (and is authenticated via WebPKI).
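
The rule above can be sketched as a small client-side pin store.  All
names here (`update_pin`, `extension_required`, the `pins` mapping) are
hypothetical.  Whether a zero TTL should also clear an existing pin is
not stated above, so this sketch leaves any existing pin untouched in
that case; treat that as an assumption.

```python
from typing import Dict

SECONDS_PER_HOUR = 3600


def update_pin(pins: Dict[str, float], host: str, now: float,
               extension_present: bool, pin_hours: int,
               tlsa_matched: bool, denial_valid: bool,
               webpki_ok: bool) -> None:
    """Set or reset the pin for host after a handshake (illustrative only)."""
    # Either matching TLSA records, or a valid denial of existence
    # combined with successful WebPKI authentication.
    authenticated = tlsa_matched or (denial_valid and webpki_ok)
    if extension_present and pin_hours > 0 and authenticated:
        pins[host] = now + pin_hours * SECONDS_PER_HOUR
    # Otherwise the store is left unchanged (assumption, see lead-in).


def extension_required(pins: Dict[str, float], host: str, now: float) -> bool:
    """True while an unexpired pin obliges the server to send the extension."""
    return pins.get(host, 0.0) > now


pins: Dict[str, float] = {}
update_pin(pins, "example.com", now=1000.0, extension_present=True,
           pin_hours=24, tlsa_matched=True, denial_valid=False,
           webpki_ok=True)
```

After this handshake, a later connection to example.com that omits the
extension within the next 24 hours would be refused rather than
silently falling back to the WebPKI alone.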

For browser HTTPS (in a separate specification), it might make
sense to limit DANE support to just the "restrictive" case,
PKIX-TA(0) and PKIX-EE(1).  There are many in the TLS and
browser communities who don't trust the security of DNSSEC,
and would be reluctant to authenticate sensitive browser
connections based on DANE alone.  That's fine.  Just as
DANE for SMTP limits the scope to DANE-TA(2) and DANE-EE(3)
for reasons explained in section 1.3 of RFC7672, the converse
can reasonably be argued for browsers.  This suggests even
more strongly that DANE stripping matters (if "restrictive"
turns out to be the main purpose of DANE in that space) and
that this specification needs to harden DANE against downgrades.

I should also mention that there is a user community that expects
the IETF to make DANE support work in browsers and which would be
once again left frustrated with the specification failing to meet
their expectations.  For a published example:

  https://service.networking4all.com/hc/en-us/articles/115001319589-Wat-is-DANE-en-DNSSEC-

but I've certainly been party to various conversations with users
where they want DANE to work for their Web servers and browsers, with
the finer points to be worked out by this WG and browser/web server
developers.

And yet, as it stands, the deployment cost-benefit analysis for this
extension in existing applications plays out as follows:

COSTS:

  * You still manage WebPKI certificates to support the majority of existing clients.
  * If the WebPKI is compromised, you're compromised.
  * If DNSSEC is compromised, you're compromised.
  * You pay the complexity cost of also supporting the extension.
  * You might present incorrect TLSA records and the connection might fail even when
    it would otherwise succeed with WebPKI.

BENEFITS:

  * Nothing other than bragging rights that you're cool enough to deploy a shiny new technology

I think that users deserve better than that, and that the extension should have tangible
security benefits not only in greenfield applications where DANE is the mandatory authentication
mechanism, but also in legacy applications where deployment would necessarily be incremental and
initially slow.

Yes, there is not today an immediate backlog of implementations waiting to adopt the
specification into legacy spaces.  The same was true for DANE in RFC6698 when the
specification was completed in Aug 2012.  The first substantive implementation work
did not begin until Feb/Mar of 2013 and its later adoption was substantially driven
by the unanticipated news in June of that year.

We don't know what the future will bring; the WebPKI is certainly looking much more
fragile lately than it did in the past with multiple failures at various CAs, and
the health of the ecosystem threatened by a new economic reality.  It is prudent
to have alternatives "in the wings", even if there is not an immediate rush to
adopt.

This specification should be sufficiently robust to support not only the immediate
narrow DPRIV use-case, but also other use-cases.  What's more, adding support for
(A) and (B) in no way burdens the immediate adopters; all they suffer is a short
delay in the publication of the final document.  They can ignore the pin TTL,
and given mandatory TLSA records are certainly not compelled to generate denial
of existence given that they're planning to have TLSA records.  It may however
turn out that even there a capability to vend denial of existence could prove
useful.

In summary, the present specification fails to address its motivating use-cases,
provides no incremental adoption path, and is only narrowly fit for purpose with
some new applications, but provides no alternative security mechanisms in the
existing application spaces where direct use of DNSSEC by the client (bypassing
this extension) is not an option.

This is why my "hum" is for both (A) and (B), i.e. (C).

-- 
	Viktor.