[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Cfrg] (no subject) (Randomness recommendations)
John Kelsey wrote:
...
As has been said before, a simple measure of a bit stream calculating
Shannon entropy is not good. Min-entropy is difficult to compute for a
bit stream. (What block size do you use? And if you pick one large
enough to find many problems that might occur, it requires too much
space and time.)
I don't see why it's any harder to compute than Shannon entropy. Entropy
as we're discussing is defined for a probability distribution. For
Shannon entropy, you need to know the probabilities of all the symbols to
compute it; for Min-entropy, you need only know the highest probability.
You are correct: min-entropy is no harder than shannon entropy. Both
have the same problems. You have to choose a block size. If want to
catch correlations you need a block size big enough to encompass the
correlated bits. Then you have to choose whether to use a sequence of
adjacent blocks in the data stream, or a sliding window. The sequence
of adjacent blocks will miss correlations that cross block boundaries.
The sliding window will count the same pattern multiple times as
different values. Thus both will overestimate the entropy but for
different reasons.
You can avoid choosing a window by using an LZ compression scheme as in
NIST 800-22 test 10 (Lempel-Ziv Compression Test). But
(a) NIST 800-22 test 10 has met with disfavor. (See Larry Bassam's
slides at csrc.nist.gov/CryptoToolkit/RNG/Workshop/ValTestandSTS.pdf.
By the way, what is the reason for the disfavor?)
(b) The test works by counting the number of nodes in the internal
dictionary, and specifies an empirically determined mean and standard
deviation that can be used to compute a P-value. However, the size of
the tree does not relate directly to entropy, so getting an entropy
measurment out of it is a bit of a research project.
Now, you *do* need to take into account any correlation between outputs,
but that's true for both Shannon entropy and min-entropy. (The usual
estimate of bits of Shannon entropy per letter in English text is based on
conditional probabilities given previous letters seen, not just on the
letter frequencies!)
And the purpose of min-entropy does not apply to a bit
stream that is going to be "mixed" or "accumulated".
What do you mean?
Consider a hyptothetical raw HW generator that outputs two bits each cycle
code probability
00 1/2
10 1/4
11 1/4
10 0
Now, I'm not going to get into ascii art, but lets assume our mixer is
an SHA1 hasher and a register (initially zero) with the register output
fed back to the SHA1 input, with two of the bits XOR'd with the HW
generator output.
Now the naive question is how many cycles of input do I need to "fill"
the mixer to where all 2^160 states are reachable? A bit more
accurately, it is probably not known that all 2^160 outputs are in fact
reachable, and there will be multiple ways of reaching the same output
code. So just for fun, lets consider the number of cycles so the most
probable state occurs with probability 1/(2^(160-k)) for small k.
In the beginning there is exactly one state "reachable". After one
cycle we are in one of 3 states, two with probabilty 1/4 and one with
probabilty 1/2. After two cycles we one in one of 9 states. The number
of reachable states grows exponentially a first, but pretty soon the
probabilty of a collison becomes significant, so the growth in the
number of reachable states slows, and the probabilities of the states
equalizes.
I don't know how to solve this analytically, so I wrote a little
simulation but with just 16 bits. I had an array of 65536
probabilities. At each stage, I just chose the 3 successor states for
each state with a random number generator. One got half the probability
and the other 2 got 1/4 of the probability. In reality the successor
states are determined by the design of SHA1, and, of course, are the
same on each round rather than chosen anew on each round as I did.
Here is the output of one run of my little simulation.
round min-entropy Shannon entropy
0 0.000000 0.000000
1 1.000000 1.500000
2 2.000000 3.000000
3 3.000000 4.500000
4 4.000000 6.000000
5 5.000000 7.500000
6 6.000000 8.993274
7 7.000000 10.467430
8 8.000000 11.896266
9 9.000000 13.218532
10 9.966577 14.284023
11 10.966577 14.964906
12 11.966577 15.309874
13 12.522495 15.456957
14 13.017274 15.521733
15 13.203137 15.545512
16 13.227718 15.554354
17 13.320790 15.562737
18 13.297072 15.560971
19 13.053438 15.563896
20 13.387674 15.563441
Yikes! I'm wrong. The min-entropy is always less than the sum of the
min-entropies of the stirred in stuff. (The min-entropy of the
generator is 1 bit per round, and the Shannon entropy is 1.5 bits.) I
was thinking that the "equalization" due to collisions would equalize
the probabilities, reducing the difference between means and maxima.
Look, what we really want to know is "how hard is it for the attacker to
attack our PRNG by guessing our seed?" So, if we know the most probable
value for the seed, that gives us a nice worst-case estimate of how hard
it is: if the most likely seed has probability 2^{-100}, then we know that
there's no way for an attacker to try N guesses and have better than a
probability of N*2^{-100} of success.
Suppose you send in a sequence of 256-bit blocks to be accumulated, where
each block has a maximum probability, conditioned on all previous blocks,
of P[i]. After we send in 100 such blocks, what's the maximum probability
of the whole input sequence? p[0]*p[1]*p[2]*...*p[99]. Since the
min-entropy of block i is ME[i] = -lg(p[i]), we can add those logs to
multiply the probabilities, so we get ME[whole sequence] =
ME[0]+ME[1]+...+ME[99].
Now, why is that important? Because assuming the accumulator isn't losing
any entropy (it will start losing a lot, once the amount of entropy gets
close to the size of the memory of the accumulator), that tells us the
maximum probability of the accumulated state.
Let's make this concrete. Suppose I have a ring oscillator source which
buffers 32 bits at a time. Based on a lot of statistical analysis, I am
convinced that conditioned on all previous bits of output, the next 32-bit
block of output never has probability higher than 1/8=2^{-3}, so it has 3
bits of min-entropy. Now, I need to accumulate enough to make a 128 bit
AES key for my PRNG. So I know that I need to accumulate ceiling(128/3) =
43 samples. When I've done that, I know that my sequence of samples has
maximum probability of less than 2^{-128}, meaning that an attacker who's
trying to attack my seed by guessing what samples I got from my ring
oscillator source is worse off than if he just tried to guess my AES key
from the start.
...
-- David Jacobson
--John Kelsey, NIST, September 2004
In summary, my main point was that there are enough difficulties that
the document should contain a reference or two to literature on
estimating entropy.
And, thank you, John, for a good educational experience.
-- David Jacobson
_______________________________________________
Cfrg mailing list
Cfrg at ietf.org
https://www1.ietf.org/mailman/listinfo/cfrg