[Cfrg] E-521 vs. numsp512t1

"D. J. Bernstein" <djb@cr.yp.to> Wed, 22 October 2014 21:35 UTC

Return-Path: <djb-dsn2-1406711340.7506@cr.yp.to>
X-Original-To: cfrg@ietfa.amsl.com
Delivered-To: cfrg@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 57F1F1A6F1D for <cfrg@ietfa.amsl.com>; Wed, 22 Oct 2014 14:35:03 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 2.61
X-Spam-Level: **
X-Spam-Status: No, score=2.61 tagged_above=-999 required=5 tests=[BAYES_50=0.8, RCVD_IN_DNSWL_LOW=-0.7, TO_NO_BRKTS_PCNT=2.499, T_TVD_FUZZY_SECURITIES=0.01, UNPARSEABLE_RELAY=0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id upfq6mzPTGg2 for <cfrg@ietfa.amsl.com>; Wed, 22 Oct 2014 14:35:01 -0700 (PDT)
Received: from mace.cs.uic.edu (mace.cs.uic.edu [131.193.32.224]) by ietfa.amsl.com (Postfix) with SMTP id 9EFE21A6F04 for <cfrg@irtf.org>; Wed, 22 Oct 2014 14:35:01 -0700 (PDT)
Received: (qmail 6121 invoked by uid 1011); 22 Oct 2014 21:34:55 -0000
Received: from unknown (unknown) by unknown with QMTP; 22 Oct 2014 21:34:55 -0000
Received: (qmail 20220 invoked by uid 1001); 22 Oct 2014 21:34:47 -0000
Date: Wed, 22 Oct 2014 21:34:47 -0000
Message-ID: <20141022213447.20218.qmail@cr.yp.to>
From: "D. J. Bernstein" <djb@cr.yp.to>
To: cfrg@irtf.org
Mail-Followup-To: cfrg@irtf.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Archived-At: http://mailarchive.ietf.org/arch/msg/cfrg/D4U_v3K-KxYeybk6zAGwZ6ySatQ
Subject: [Cfrg] E-521 vs. numsp512t1
X-BeenThere: cfrg@irtf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Crypto Forum Research Group <cfrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/cfrg>, <mailto:cfrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/cfrg/>
List-Post: <mailto:cfrg@irtf.org>
List-Help: <mailto:cfrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/cfrg>, <mailto:cfrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Wed, 22 Oct 2014 21:35:03 -0000

Rob Granger and Mike Scott have posted a new paper "Faster ECC over
\F_{2^521-1}" (https://eprint.iacr.org/2014/852) reporting ECC speeds
mod 2^521-1, and in particular the first (as far as I know) serious
implementation of E-521.

Granger and Scott haven't contributed the software to SUPERCOP yet, and
they report only Haswell speeds, while Microsoft's paper reports only
Sandy Bridge speeds, so comparison isn't easy. I downloaded the E-521
software and tried it on Sandy Bridge:

   * E-521: slightly under 1.12 million cycles.

   * numsp512t1 (ed-512-mers): 1.293 million cycles, according to the
     Microsoft paper.

To summarize, on the microarchitecture selected by Microsoft for
benchmarks, Microsoft's claimed numsp512t1 speeds are about 15% slower
than these higher-security E-521 speeds.

At a lower level, the new paper reports 155 Haswell cycles for field
multiplication, a noticeable improvement over the 167 Haswell cycles
mentioned in Mike Hamburg's message

   https://www.ietf.org/mail-archive/web/cfrg/current/msg04984.html

(using a different technique). Mike had already estimated that his
approach to E-521 would beat the Microsoft results for numsp512t1.

For comparison, Microsoft claimed in July to "show that ... moving from
512 bits to 521 bits ... degrades performance by a factor 1.2". This is
yet another example of the dangers of measuring _some_ implementation
rather than measuring the _best_ implementation. There's clearly a big
difference in performance between

   * a 2^521-1 implementation from someone who never had an incentive to
     use state-of-the-art techniques and
   * a 2^521-1 implementation from someone who wants it to run fast.

The new E-521 code is also surprisingly short; we're not talking about
some difficult tradeoff between simplicity and performance.

The code actually works mod 2^(58*9)-c where c=2, and relies heavily on
c being very small. This appears to add even more evidence for the idea
that "make everything as fast as possible" creates rigidity: there are
many primes that provide tolerable speeds but there are only a few
primes that provide truly top speeds. (When I say "truly top" I mean
something even better than Pareto-optimal; see my message

   https://www.ietf.org/mail-archive/web/cfrg/current/msg04894.html

for details.)

If further analysis continues to support the statement that 2^521-1 is
exceptionally fast, how many people will argue that downgrading to
2^512-569---sacrificing security and sacrificing speed---is a good idea?

I don't understand how the "512 is a power of 2" statement is supposed
to be relevant. I also don't understand the "match 2^256" argument. As
Adam Langley wrote:

   Above the 120ish bit level one is no longer fighting stronger
   attackers, but hedging against a breakthrough (but not too much of
   one). Matching 192 or 256 bit security levels is a *misunderstanding*
   in the wider community, and is thus of the lowest priority.

I realize that the TLS WG requested "128-bit" and "256-bit" and maybe
"192-bit" but all indications are that this was a knee-jerk list---not
something CFRG should take seriously, and certainly not something that
should override CFRG's security judgment. If there's doubt about this
then surely the TLS WG can provide more clarification as to its needs.

I don't mean to suggest that it's obvious that CFRG should pick _any_
curve at this stratospheric >500-bit size. I simply don't see the
argument for picking 2^512-569 rather than 2^521-1.

---Dan