[TLS] 0-RTT and Anti-Replay

Eric Rescorla <ekr@rtfm.com> Sun, 22 March 2015 21:50 UTC

From: Eric Rescorla <ekr@rtfm.com>
Date: Sun, 22 Mar 2015 14:49:28 -0700
Message-ID: <CABcZeBP9LaGhDVETsJeecnAtSPUj=Kv37rb_2esDi3YaGk9b4w@mail.gmail.com>
To: "tls@ietf.org" <tls@ietf.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/tls/gDzOxgKQADVfItfC4NyW3ylr7yc>
Subject: [TLS] 0-RTT and Anti-Replay

At the interim in Seattle, we had an extensive discussion of 0-RTT
anti-replay in which DKG observed that all the proposed anti-replay
mechanisms provide limited protection. The underlying problem is the
desire to present a uniform interface in which the calling application
can count on reliable delivery of the data it provides in the first
flight, thus requiring the TLS stack to retransmit it automatically.

This message describes the problem and should set us up for a
discussion about the topic in Dallas. Apologies in advance for
the length of this, but I'd like to be clear.


BACKGROUND: ANTI-REPLAY
As you'll recall, TLS ordinarily uses a server-provided random value
mixed into the keys for anti-replay. This obviously doesn't work for
0-RTT. For reference, here's a simplified description of one way to do
anti-replay for 0-RTT (which is also how both QUIC and snap-start do it).

- The server keeps a list of every ClientRandom it has received
  within a given time window keyed by a server context ("orbit")
  value that loosely can be thought of as indicating the data center.

- When the server receives a new ClientHello, it checks whether it
  has seen that ClientRandom before and, if so, rejects the connection.

Now consider what happens when the server reboots and loses its
state. You don't want to reject every client 0-RTT connection so
instead what you do is you ignore the client's initial data sent in
the first flight. Since the ServerHello provides a new Random value,
any data the client sends in subsequent flights has the usual
anti-replay protection.
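
The mechanism above can be sketched roughly as follows, in Python. This
is only an illustration under stated assumptions: the names
(ReplayCache, check_and_record, reboot), the explicit "now" argument,
and the rule that a server treats every 0-RTT attempt as unverifiable
for one full window after it (re)starts are all made up for this sketch
and are not taken from QUIC, snap-start, or any TLS stack. A real design
would also need some way to refuse ClientHellos older than the window,
which is omitted here.

```python
import enum


class Decision(enum.Enum):
    ACCEPT_0RTT = "accept 0-RTT data"
    REJECT_CONNECTION = "reject connection (replay detected)"
    IGNORE_EARLY_DATA = "ignore 0-RTT data, continue handshake"


class ReplayCache:
    """Remembers (orbit, ClientRandom) pairs seen within a time window."""

    def __init__(self, window_seconds, now):
        self.window = window_seconds
        self.started_at = now
        self.seen = {}  # (orbit, client_random) -> time first seen

    def reboot(self, now):
        # A reboot loses all anti-replay state.
        self.seen.clear()
        self.started_at = now

    def check_and_record(self, orbit, client_random, now):
        # Forget entries that have aged out of the window.
        self.seen = {k: t for k, t in self.seen.items()
                     if now - t < self.window}
        # Assumption: for one full window after (re)start the server cannot
        # know whether it saw this ClientRandom before losing its state, so
        # it ignores the first-flight data instead of rejecting the client.
        if now - self.started_at < self.window:
            return Decision.IGNORE_EARLY_DATA
        key = (orbit, client_random)
        if key in self.seen:
            return Decision.REJECT_CONNECTION
        self.seen[key] = now
        return Decision.ACCEPT_0RTT
```

The point of the third outcome is that a server which has lost its state
still completes the handshake; it just declines to act on the first
flight.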


A SOMEWHAT CONTRIVED ATTACK
This is all fine so far, but the abstraction you probably want TLS to
supply to the application is that the data it sends in the first
flight isn't lost even in this case. In order to accommodate this, the
natural design is for the TLS stack to retransmit the plaintext in the
client's first flight as the first data it sends under the new keys.

An attacker can make use of this as follows. When the client sends
the initial first flight he captures the TCP connection and makes his
own TCP connection to the server, forwarding the first flight.

Client                        Attacker                          Server

ClientHello [+0-RTT] ------------>
"POST /buy-something" ----------->

                              ClientHello [+0-RTT] --------------->
                              "POST /buy-something" -------------->

                                                   [Processes purchase]
                                 <---------- ServerHello [accept 0-RTT]
                                                  (+ rest of handshake)


The attacker just discards the server's response and then
somehow forces the server to reboot (this is the hard part).

Once the server has rebooted, the attacker re-sends the client's
initial flight then just forwards data back and forth to the client.

Client                        Attacker                          Server
                              ClientHello [+0-RTT] --------------->
                              "POST /buy-something" -------------->

   <--------------------------------------  ServerHello [reject 0-RTT]
                                                 (+ rest of handshake)

Finished --------------------------------------------------------->
"POST /buy-something" -------------------------------------------->
                                                  [Processes purchase]

At this point, you've mounted a successful replay attack. Obviously
this is not that great an attack because it requires you to be able to
force a reboot and also get in under the client's timeout window.


A MORE REALISTIC ATTACK
During the interim meeting, DKG observed that you could produce the
same "I've forgotten my state" situation if you have a distributed
server. Say that the server operates in two loosely-synchronized
data centers. In that case, you can get the following situation
(shown without the attacker's intervention because of ASCII art
limitations):

Client              Attacker            Server 1             Server 2

ClientHello [+0-RTT] -->
"POST /buy-something" ->

                    ClientHello [+0-RTT] -->
                    "POST /buy-something" ->

                                        [Processes
                                         purchase]



                    ClientHello [+0-RTT] ----------------------->
                    "POST /buy-something" ---------------------->


   <--------------------------------------  ServerHello [reject 0-RTT]
                                                 (+ rest of handshake)

Finished --------------------------------------------------------->
"POST /buy-something" -------------------------------------------->
                                                  [Processes purchase]

Again, the attacker will need to mess with the TCP channel, but their
ability to do so is part of our basic threat model [RFC 3552].
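
To make the two-data-center flow concrete, here is a toy simulation in
Python. Everything in it is an assumption made for illustration: the
DataCenter class, the per-site "orbit" strings, the rule that a site
rejects 0-RTT for a ClientHello naming another site's orbit, and the
shared purchase ledger. It only models who processes the request, not
any cryptography.

```python
class DataCenter:
    """One site with its own, unshared anti-replay state."""

    def __init__(self, name, orbit, ledger):
        self.name = name
        self.orbit = orbit
        self.seen = set()      # ClientRandoms this site has accepted
        self.ledger = ledger   # application back end shared by both sites

    def first_flight(self, hello, early_data):
        """Return True if the 0-RTT data was accepted and processed."""
        unknown_orbit = hello["orbit"] != self.orbit
        if unknown_orbit or hello["random"] in self.seen:
            return False  # reject 0-RTT, continue with a full handshake
        self.seen.add(hello["random"])
        self.ledger.append((self.name, early_data))
        return True

    def post_handshake(self, data):
        # Data sent under the new keys has the usual anti-replay protection.
        self.ledger.append((self.name, data))


def run_attack(auto_retransmit):
    ledger = []
    server1 = DataCenter("Server 1", "orbit-1", ledger)
    server2 = DataCenter("Server 2", "orbit-2", ledger)
    hello = {"random": b"client-random", "orbit": "orbit-1"}
    request = "POST /buy-something"

    # Attacker forwards the captured first flight to Server 1 and
    # discards the response.
    server1.first_flight(hello, request)
    # Attacker re-sends the same flight to Server 2, then relays traffic
    # between the client and Server 2.
    accepted = server2.first_flight(hello, request)
    # The client finishes the handshake; a stack that guarantees delivery
    # retransmits the first-flight data under the new keys.
    if not accepted and auto_retransmit:
        server2.post_handshake(request)
    return ledger
```

With auto_retransmit set, the ledger ends up with two purchases, one per
site; without it, only Server 1's copy is processed and the client side
learns that its first flight was not accepted.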

It's important to understand that this is a generic issue, not an
issue with TLS in particular, so it's not like there's some other
0-RTT model we can lift and put into TLS that would solve the problem.


APPROACHES
One way to address this would be to simply not provide a 0-RTT mode.
This is a charter item, so that would be sad, but obviously that's not
out of the question if we really think this is bad. The more serious
problem is that there's an enormous demand for this feature (see, for
instance, the discussion at the SEMI workshop), and so if we refuse to
do a 0-RTT mode, we're giving applications the annoying choice of
accepting an extra RTT or inventing their own 0-RTT method.


There are a number of basic ways to address this issue, but I think
the main plausible[0] ones are:

1. Keep the server state globally consistent and also temporally
   consistent so that replays can always be detected.

2. Remove the TLS anti-replay guarantee for the data sent in the first
   flight and tell applications to only send data there that can
   tolerate being replayed.

3. Remove the TLS reliable delivery guarantee for the data sent in
   the first flight, so that the stack doesn't automatically retransmit it.

The first of these options (global state) is possible, but only in
some limited circumstances, namely very sophisticated operators and/or
situations where there's really only one server which has good state
management. An example of the latter is WebRTC, where the server can
have a different anti-replay context for each connection.

The other two options clearly require a separate API to handle this
special first-flight data and would require applications to handle it
separately. So, for instance, in option 2, you would have something
like:

    c = new TLSConnection(...)
    c.setReplayable0RTTData("GET /....")
    c.connect()

And in the case of option 3 you would have something like:

    c = new TLSConnection(...)
    c.setUnreliable0RTTData("GET /....")
    c.connect()
    if (c.delivered0RTTData()) {
       // Things are cool
    } else {
       // Try to figure out whether to replay or not
    }

So in the former case, the choice of replay is in the TLS
stack's hands but in the latter in the application's hands.

I would expect them to have relatively similar impacts on the wire,
namely applications would self-designate certain data as replay-safe
(e.g., HTTP GETs) and would send it in the first flight and then
either let the stack retransmit (option 2) or retransmit themselves
(option 3). This isn't that odd: as AGL observes, browsers already
routinely retry some HTTP requests that appear to fail (i.e., no HTTP
response was received), even over ordinary TLS. In those cases they
have already circumvented the anti-replay guarantees supplied by TLS,
though of course that's different from having TLS give up those
guarantees.
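
As a rough sketch of what the application side of option 3 might look
like, here is a small Python example. FakeConnection is a hypothetical
stand-in that only records what would be sent; it is not a real TLS API.
The choice of GET as the only replay-safe method is likewise an
assumption, following the example above.

```python
# Assumption: only GET is self-designated as replay-safe here.
REPLAY_SAFE_METHODS = {"GET"}


def is_replay_safe(request_line):
    method = request_line.split(" ", 1)[0].upper()
    return method in REPLAY_SAFE_METHODS


class FakeConnection:
    """Hypothetical stand-in for a TLS connection; records sent data."""

    def __init__(self, server_accepts_0rtt):
        self.server_accepts_0rtt = server_accepts_0rtt
        self.sent = []  # list of (phase, data) pairs

    def connect(self, early_data=None):
        """Handshake; return True if first-flight data was delivered."""
        if early_data is None:
            return False
        self.sent.append(("0-RTT", early_data))
        return self.server_accepts_0rtt

    def send(self, data):
        self.sent.append(("1-RTT", data))


def send_request(conn, request_line):
    """Put only replay-safe requests in the first flight (option 3)."""
    if not is_replay_safe(request_line):
        # Not safe to replay: wait for the handshake to finish.
        conn.connect()
        conn.send(request_line)
        return
    delivered = conn.connect(early_data=request_line)
    if not delivered:
        # The stack did not retransmit; the application decides to,
        # because it has marked this request as replay-safe.
        conn.send(request_line)
```

Under option 2 the same safe/unsafe decision would still be made by the
application, but the retransmission in the last branch would happen
inside the stack.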


RECOMMENDATION
As I said, I think we do need to supply some sort of limited 0-RTT
functionality. Given the practical difficulty of making global state
work, I think it's a mistake to build a system which relies on
this. The remaining two options require that we relax one of TLS's
guarantees and of these, relaxing reliable delivery seems safer,
and it also makes the implementation rather easier.

Accordingly, I propose that we continue to develop 0-RTT but with the
server just dropping 0-RTT first-flight data when 0-RTT is not
possible (and of course telling the client it is doing so).  The
client application can then retransmit if it believes it is safe.
In the very limited cases where application domains can in fact
safely keep state, they can profile the use of TLS to that end
and require the endpoints to behave appropriately.

Thanks,
-Ekr


[0] One implausible approach is to treat any attempt to 0-RTT when
you have lost state as a hard-fail, but this will create a lot of
random failures, so it's not really practical.