Hi Hugo and Pasi,

I have some comments and questions on draft-krawczyk-hkdf-00 and "On Extract-then-Expand Key Derivation Functions and an HMAC-based KDF". First, thanks for taking on this work; it makes strong contributions in an important area.

The most important question is: what is the precise security statement for HKDF? What assumptions does one need to make about the hash function used in HKDF in order for the security analysis to apply? The paper says that "it is shown in [23] (see Section 8) that using HMAC with a truncated output as an extractor allows to prove security under considerably weaker assumptions on the underlying hash function." However, both of the Lemmas in that paper (and the implication in Section 8) make random oracle assumptions.

A recommended instantiation of HKDF from the paper uses HMAC-SHA-512 (with output truncated to 256 bits) in the extract stage and HMAC-SHA-256 in the expand stage. I understand from [23] that "if we are interested in an output of L close-to-uniform bits then the key to the underlying compression function needs to be sufficiently larger than L," which motivates the use of SHA-512 in the extraction stage. But I don't see any exact security statement for this instantiation.

What is the impact of the salt (and its omission) on the security properties?

The draft is somewhat incomplete as a normative specification, in that it does not require the implementation of any particular hash (would it make sense to include the recommended instantiation from the paper?). Also, it does not use any of the RFC 2119 keywords. I would expect that, to be used in standards, it would be necessary to at least provide some recommended instantiations and the security levels that they target.

The conspicuous unmet need, in crypto standards at least, is for a computational extractor, since "extraction" is an important goal, and there is an efficiency gap for statistical extractors.
On the other hand, there are many definitions and implementations of PRFs that implement the "expand" stage perfectly well. One option we have is to define an extractor function, rather than a general-purpose KDF (or in addition to it).

It is worth considering the different use cases. Here is a categorization of the different applications mentioned for a KDF:

1. Generate one key from another (IKEv2 Section 2.15 and TLS "Key Calculation", for example). In this case, no extraction stage is needed.

2. Generate a key from a DH shared secret (IKEv2 SKEYSEED computation, Sections 2.14 and 2.18, and TLS "Computing the Master Secret", for example). In this case, computational extraction seems to be needed, because statistical extraction would be inefficient.

3. Generate key material from a non-uniform random source. In this case, either statistical or computational extraction could be used.

4. Generate a key from a passphrase. It is not exactly clear if this is in scope, because the draft discusses modifications of HKDF to better address this application. But it could be in scope, and if it is, computational extraction should be used to ensure that the output keys are unpredictable whenever the passwords used as input have sufficient min-entropy.

I wonder if we really want to target all of these disparate cases with a single KDF function. Applications using just case #1 may reject adopting HKDF on the reasonable grounds that they don't need the extraction stage, and that adopting HKDF with the suggestion of Section 3.3 to skip that stage would amount to replacing a perfectly OK PRF with another PRF that offers no security advantage. IKE and TLS both use a PRF for applications #1 and #2, but are there other protocols that use a single PRF in multiple applications? There is a listing of some IETF references to KDFs at http://www.mindspring.com/~dmcgrew/ic/internet-crypto.html#prfs and, besides IKE, these uses seem to belong to just a single case.
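
To make the case #1 versus case #2 distinction concrete, here is a rough sketch of an extract-then-expand KDF built on HMAC, with the two stages kept as separate functions. It is only an illustration, not the draft's normative definition: the names `extract` and `expand`, the all-zero default salt, the field ordering inside the expand loop, and the single-byte counter are assumptions on my part and may not match the draft's exact encoding. The input values are placeholders.

```python
import hashlib
import hmac


def extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """Extract stage: condense non-uniform input keying material into a PRK."""
    if not salt:
        # Assumption: an omitted salt is replaced by hash-length zero bytes.
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, ikm, hash_name).digest()


def expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Expand stage: feedback-mode PRF that produces `length` output bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for a one-byte counter")
    okm = b""
    block = b""  # previous output block, fed back into the next PRF call
    counter = 1
    while len(okm) < length:
        # Assumption: input is previous block || info || one-byte counter.
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


# Case #1: the input is already a uniform key, so only the expand stage runs.
existing_key = bytes(range(32))  # hypothetical placeholder key
session_key = expand(existing_key, b"case-1 example", 16)

# Case #2: a DH shared secret is not uniform, so it is extracted first.
dh_shared_secret = b"\x0b" * 22  # hypothetical placeholder secret
prk = extract(b"example salt", dh_shared_secret)
keys = expand(prk, b"case-2 example", 64)
```

In this shape, an application in case #1 uses nothing but the expand function, which is just a PRF in feedback mode; that is the sense in which it gains nothing over a PRF it already has.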
There are also some other protocols with built-in single-purpose KDFs that are not externally documented, such as SRTP's AES-CTR based KDF and EAP TLS key derivation.

If IKE and TLS are the main intended applications, I suggest adding a discussion of these applications to the draft. It would be valuable to compare HKDF to the existing KDF functions used in those protocols.

A minor point: The HKDF analysis asserts that OFB mode is better than CTR, because the attacker has less knowledge of the inputs to the PRF, and because successive values of a counter differ in very few bits. These are valid points, but are they strong enough to justify the implementation of a new algorithm for an "expand" stage?

It would be a valuable contribution to theory to isolate the security requirements of the hash function that are needed to build a good computational extractor, and provide these requirements to the NIST hash function competition.

I was unsatisfied with my results in trying to track down references on a computational extractor. It would be valuable to have a precise and concise statement of the security goal (reference [35] in the paper doesn't mention min-entropy - should I be looking somewhere else?). Also, the citation "Randomness Extraction and Key Derivation Using the CBC, Cascade and HMAC Modes" includes the statement "Computational Security: Our second approach to analyzing NMAC is similar to the analysis of the padded cascade from Lemma 5. We will present it in the full version." Is the full version available online? Or is there another reference that you could recommend?

Regards,

David