idnits 2.17.1 

draft-ietf-security-randomness-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-25) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  ** Bad filename characters: the document name given in the document,
     'draft-ietf-security-randomness-01.txt,', contains other characters than
     digits, lowercase letters and dash.

  ** Missing revision: the document name given in the document,
     'draft-ietf-security-randomness-01.txt,', does not give the document
     revision number

  ~~ Missing draftname component: the document name given in the document,
     'draft-ietf-security-randomness-01.txt,', does not seem to contain all
     the document name components required ('draft' prefix, document source,
     document name, and revision) -- see
     https://www.ietf.org/id-info/guidelines#naming for more information.

  == Mismatching filename: the document gives the document name as
     'draft-ietf-security-randomness-01.txt,', but the file name used is
     'draft-ietf-security-randomness-01'

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 4 instances of too long lines in the document, the longest one
     being 7 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The "Author's Address" (or "Authors' Addresses") section title is
     misspelled.

  == Line 351 has weird spacing: '...   The  amount...'

  == Couldn't figure out when the document was first submitted -- there may
     comments or warnings related to the use of a disclaimer for pre-RFC5378
     work that could not be issued because of this.  Please check the Legal
     Provisions document at https://trustee.ietf.org/license-info to determine
     if you need the pre-RFC5378 disclaimer.

  -- The document date (31 March 1994) is 10983 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'DEE' is mentioned on line 1110, but not defined

  == Missing Reference: 'SDC1' is mentioned on line 1119, but not defined

  == Missing Reference: 'JIS' is mentioned on line 1128, but not defined

  == Unused Reference: 'ASYMMETRIC' is defined on line 1043, but no explicit
     reference was found in the text

  == Unused Reference: 'CRYPTO1' is defined on line 1050, but no explicit
     reference was found in the text

  == Unused Reference: 'SHIFT1' is defined on line 1095, but no explicit
     reference was found in the text

  == Unused Reference: 'SHIFT2' is defined on line 1098, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ASYMMETRIC'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'CRC'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'CRYPTO1'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'CRYPTO2'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DES'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DES MODES'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'D-H'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'DoD'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'KNUTH'

  ** Obsolete normative reference: RFC 1319 (ref. 'MD2') (Obsoleted by RFC
     6149)

  ** Obsolete normative reference: RFC 1320 (ref. 'MD4') (Obsoleted by RFC
     6150)

  ** Downref: Normative reference to an Informational RFC: RFC 1321 (ref.
     'MD5')

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SHANNON'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SHIFT1'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SHIFT2'


     Summary: 14 errors (**), 1 flaw (~~), 12 warnings (==), 13 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                      Randomness Requirements for Security
2	                                                         01 October 1993
3	                                                   Expires 31 March 1994

5	                  Randomness Requirements for Security
6	                  ---------- ------------ --- --------
7	   Donald E. Eastlake 3rd, Stephen D. Crocker, & Jeffrey I. Schiller

9	Status of This Document

11	   This draft, file name draft-ietf-security-randomness-01.txt, is
12	   intended to be submitted to the RFC editor as an Informational RFC.
13	   Distribution of this document is unlimited.

15	   This document is an Internet Draft.  Internet Drafts are working
16	   documents of the Internet Engineering Task Force (IETF), its Areas,
17	   its Working Groups, and other organizations or individuals.

19	   Internet Drafts are draft documents valid for a maximum of six
20	   months.  Internet Drafts may be updated, replaced, or obsoleted by
21	   other documents at any time.  It is not appropriate to use Internet
22	   Drafts as reference material or to cite them other than as a
23	   ``working draft'' or ``work in progress.'' Please check the 1id-
24	   abstracts.txt listing contained in the internet-drafts Shadow
25	   Directories on ds.internic.net, nic.nordu.net, ftp.nisc.sri.com, or
26	   munnari.oz.au to learn the current status of any Internet Draft.

28	Abstract

30	   Security systems today are built on increasingly strong cryptographic
31	   algorithms that foil pattern analysis attempts. However, the security
32	   of these systems is dependent on generating secret quantities for
33	   passwords, cryptographic keys, and similar quantities.  The use of
34	   pseudo-random processes to generate secret quantities can result in
35	   pseudo-security.  The sophisticated attacker of these security
36	   systems will often find it easier to reproduce the environment that
37	   produced the secret quantities, searching the resulting small set of
38	   possibilities, than to locate the quantities in the whole of the
39	   number space.

41	   Choosing random quantities to foil a resourceful and motivated
42	   attacker is surprisingly difficult.  This paper points out many
43	   pitfalls in using traditional pseudo-random number generation
44	   techniques for choosing such quantities, recommends the use of truly
45	   random hardware techniques, provides suggestions to ameliorate the
46	   problem when a hardware solution is not available, and gives examples
47	   of how large such quantities need to be for some particular
48	   applications.

50	Acknowledgements

52	   Useful comments on this document that have been incorporated were
53	   received from (in alphabetic order) the following:
54	        David M. Balenson (TIS)
55	        Carl Ellison (Stratus)
56	        Marc Horowitz (MIT)
57	        Charlie Kaufman (DEC)
58	        Steve Kent (BBN)
59	        Hal Murray (DEC)
60	        Neil Haller (Bellcore)
61	        Richard Pitkin (DEC)
62	        Tim Redmond (TIS)
63	        Doug Tygar (CMU)

65	Table of Contents

67	      Status of This Document....................................1
68	      Abstract...................................................1
69	      Acknowledgements...........................................2
70	      Table of Contents..........................................3
71	      1. Introduction............................................4
72	      2. Requirements............................................5
73	      3. Traditional Pseudo-Random Sequences.....................7
74	      4. Unpredictability........................................9
75	      4.1 Problems with Clocks and Serial Numbers................9
76	      4.2 Timing and Content of External Events.................10
77	      4.3 The Fallacy of Complex Manipulation...................10
78	      4.4 The Fallacy of Selection from a Large Database........11
79	      5. Hardware for Randomness................................12
80	      5.1 Volume Required.......................................12
81	      5.2 Sensitivity to Skew...................................12
82	      5.2.1 Using Stream Parity to De-Skew......................13
83	      5.2.2 Using Transition Mappings to De-Skew................14
84	      5.2.3 Using Compression to De-Skew........................15
85	      5.3 Using Sound/Video Input...............................15
86	      6. Recommended Non-Hardware Strategy......................17
87	      6.1 Mixing Functions......................................17
88	      6.1.1 A Trivial Mixing Function...........................17
89	      6.1.2 Stronger Mixing Functions...........................18
90	      6.1.3 Using a Mixing Function to Stretch Random Bits......19
91	      6.1.4 Other Factors in Choosing a Mixing Function.........20
92	      6.2 Non-Hardware Sources of Randomness....................20
93	      6.3 Cryptographically Strong Sequences....................21
94	      7. US DoD Recommendations for Password Generation.........23
95	      8. Examples of Randomness Required........................24
96	      8.1 Password Generation...................................24
97	      8.2 A Very High Security Cryptographic Key................24
98	      8.2.1 Effort per Key Trial................................25
99	      8.2.2 Meet in the Middle Attacks..........................25
100	      8.2.3 Other Considerations................................26
101	      9. Security Considerations................................27
102	      References................................................27
103	      Authors Addresses.........................................29
104	      Expiration and File Name..................................29

106	1. Introduction

108	   Software cryptography is coming into wider use.  Systems like
109	   Kerberos, PEM, PGP, etc. are maturing and becoming a part of the
110	   network landscape.  These systems provide substantial protection
111	   against snooping and spoofing.  However, there is a potential flaw.
112	   At the heart of all cryptographic systems is the generation of random
113	   numbers.

115	   For the present, the lack of generally available facilities for
116	   generating unpredictable numbers is an open wound in the design of
117	   cryptographic software.  For the software developer who wants to
118	   build a key or password generation procedure that runs on a wide
119	   range of hardware, the only safe strategy so far has been to force
120	   the local installation to supply a suitable routine to generate
121	   unpredictable numbers.  To say the least, this is an awkward, error-
122	   prone and unpalatable solution.

124	   It is important to keep in mind that the requirement is for data that
125	   an adversary has a very low probability of guessing.  This will fail
126	   if pseudo-random data, which only meets traditional statistical tests
127	   for randomness or which is based on guessable range sources, such as
128	   clocks, is used.  Frequently such random quantities are guessable by
129	   an adversary searching through an embarrassingly small space of
130	   possibilities.

132	   This informational document suggests techniques for producing random
133	   quantities that will be resistant to such attack.  It recommends that
134	   future systems include hardware random number generation, suggests
135	   methods for use if such hardware is not available, and gives some
136	   estimates of the number of random bits required for sample
137	   applications.

139	2. Requirements

141	   Probably the most commonly encountered randomness requirement is the
142	   typical user password character string.  Obviously, if a password can
143	   be guessed, it does not provide security.  (For this particular
144	   application it is desirable that users be able to remember the
145	   password.  This may make it advisable to use pronounceable character
146	   strings or phrases composed of ordinary words.  But this only affects
147	   the format of the password information, not the requirement that the
148	   password be very hard to guess.)

150	   Many other requirements come from the cryptographic arena.
151	   Cryptographic techniques can be used to provide a variety of services
152	   including confidentiality and authentication.  Such services are
153	   based on quantities, traditionally called "keys", that are unknown to
154	   and unguessable by an adversary.

156	   In some cases, such as the use of symmetric encryption with one
157	   time pads [CRYPTO*] or the US Data Encryption Standard [DES], the
158	   parties who wish to communicate confidentially and/or with
159	   authentication must all know the same secret key.  In other cases,
160	   using what are called asymmetric or "public key" cryptographic
161	   techniques, keys come in pairs.  One key of the pair is private and
162	   must be kept secret by one party, the other is public and can be
163	   published to the world.  It is computationally infeasible to
164	   determine the private key from the public key.  [ASYMMETRIC, CRYPTO*]

166	   The frequency and volume of the requirement for random quantities
167	   differs greatly for different cryptographic systems.  Using RSA
168	   [CRYPTO*], random quantities are required when the key pair is
169	   generated, but thereafter any number of messages can be signed
170	   without any further need for randomness.  The public key Digital
171	   Signature Algorithm that has been proposed by the US National
172	   Institute of Standards and Technology (NIST) requires good random
173	   numbers for each signature.  And encrypting with a one time pad, in
174	   principle the strongest possible encryption technique, requires a
175	   volume of randomness equal to all the messages to be processed.

177	   In all of these cases, an adversary may try to determine the "secret"
178	   key by trial and error.  (This is possible as long as the key is
179	   enough smaller than the message that the correct key can be uniquely
180	   identified.)  The probability of an adversary succeeding at this must
181	   be made acceptably low, depending on the particular application.  The
182	   size of the space the adversary must search is related to the amount
183	   of key "information" present in the information theoretic sense
184	   [SHANNON].  This depends on the number of different secret values
185	   possible and the probability of each value as follows:

187	                      -----
188	                       \
189	        Bits-of-info =  \  - p   * log  ( p  )
190	                        /     i       2    i
191	                       /
192	                      -----

194	   where i varies from 1 to the number of possible secret values and p
195	   sub i is the probability of the value numbered i.  (Since p sub i is
196	   less than one, the log will be negative so each term in the sum will
197	   be non-negative.)

199	   If there are 2^n different values of equal probability, then n bits
200	   of information are present and an adversary would, on the average,
201	   have to try half of the values, or 2^(n-1) , before guessing the
202	   secret quantity.  If the probability of different values is unequal,
203	   then there is less information present and fewer guesses will, on
204	   average, be required by an adversary.  In particular, any values that
205	   the adversary can know are impossible, or are of low probability, can
206	   be ignored by an adversary, who will search through the more probable
207	   values first.
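
   The sum above can be evaluated directly.  The following is a minimal
   sketch, not part of the original analysis; the function name and the
   two sample distributions are assumptions chosen for illustration.
   It shows that 2^8 equally likely values carry 8 bits, while a skewed
   distribution over the same 256 values carries noticeably fewer.

```python
import math

def bits_of_info(probabilities):
    """Shannon information, in bits, of a discrete distribution."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (the limit of p*log2(p) is 0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Assumed example distributions over 256 possible secret values.
uniform = [1 / 256] * 256            # every value equally likely
skewed = [0.5] + [0.5 / 255] * 255   # one value is chosen half the time

uniform_bits = bits_of_info(uniform)
skewed_bits = bits_of_info(skewed)
```

   With the skewed distribution an adversary would simply try the
   favored value first, which is why the information content drops.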

209	   For example, consider a cryptographic system that uses 56 bit keys.
210	   If these 56 bit keys are derived by using a pseudo-random number
211	   generator that is seeded with an 8 bit seed, then an attacker needs
212	   to search through only 256 keys (by running the pseudo-random number
213	   generator with every possible seed), not the 2^56 keys that may at
214	   first appear to be the case. Only 8 bits of "information" are in
215	   these 56 bit keys.
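
   The 8 bit seed example can be sketched as follows.  This is an
   illustration only: derive_key and recover_seed are hypothetical
   names, and Python's general purpose generator stands in for whatever
   pseudo-random number generator a real system might use.

```python
import random

def derive_key(seed):
    """Hypothetical key generator: 56 bits from a PRNG seeded with 8 bits."""
    return random.Random(seed).getrandbits(56)

def recover_seed(observed_key):
    """Attacker's search: try all 2^8 seeds instead of all 2^56 keys."""
    for seed in range(256):
        if derive_key(seed) == observed_key:
            return seed
    return None

victim_key = derive_key(0xA7)  # 0xA7 is an arbitrary example seed
candidate_keys = {derive_key(seed) for seed in range(256)}
recovered = recover_seed(victim_key)
```

   However wide the key is, the set of candidate keys never holds more
   than 256 entries, so the search ends almost immediately.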

217	3. Traditional Pseudo-Random Sequences

219	   Most traditional sources of random numbers use deterministic sources
220	   of "pseudo-random" numbers.  These typically start with a "seed"
221	   quantity and use numeric or logical operations to produce a sequence
222	   of values.

224	   [KNUTH] has a general exposition on pseudo-random numbers.
225	   Applications he mentions are simulation of natural phenomena,
226	   sampling, numerical analysis, testing computer programs, decision
227	   making, and games.  None of these have the same characteristics as
228	   the sort of security uses we are talking about.  Only in the last two
229	   could there be an adversary trying to find the random quantity.
230	   However, in these cases, the adversary normally has only a single
231	   chance to use a guessed value.  In guessing passwords or attempting
232	   to break an encryption scheme, the adversary normally has many,
233	   perhaps unlimited, chances at guessing the correct value and should
234	   be assumed to be aided by a computer.

236	   For testing the "randomness" of numbers, Knuth suggests a variety of
237	   measures including statistical and spectral.  These tests check
238	   things like autocorrelation between different parts of a "random"
239	   sequence or distribution of its values.  They could be met by a
240	   constant stored random sequence, such as the "random" sequence
241	   printed in the CRC Standard Mathematical Tables [CRC].

243	   A typical pseudo-random number generation technique, known as a
244	   linear congruence pseudo-random number generator, is modular
245	   arithmetic where the N+1th value is calculated from the Nth value by

247	        V    = ( V  * a + b )(Mod c)
248	         N+1      N

250	   The above technique has a strong relationship to linear shift
251	   register pseudo-random number generators, which are well understood
252	   cryptographically [SHIFT*].  In such generators bits are introduced
253	   at one end of a shift register as the Exclusive Or (binary sum
254	   without carry) of bits from selected fixed taps into the register.
255	   For example:

257	      +----+     +----+     +----+                      +----+
258	      | B  | <-- | B  | <-- | B  | <--  . . . . . . <-- | B  | <-+
259	      |  0 |     |  1 |     |  2 |                      |  n |   |
260	      +----+     +----+     +----+                      +----+   |
261	        |                     |            |                     |
262	        |                     |            V                  +-----+
263	        |                     V            +----------------> |     |
264	        V                     +-----------------------------> | XOR |
265	        +---------------------------------------------------> |     |
266	                                                              +-----+

268	       V    = ( ( V  * 2 ) + B .xor. B ...B )(Mod 2^n)
269	        N+1         N         0       1    n

271	   The quality of traditional pseudo-random number generator algorithms
272	   is measured by statistical tests on such sequences.  Carefully chosen
273	   values of the initial V and a, b, and c or the placement of shift
274	   register taps in the above simple processes can produce excellent
275	   statistics.
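
   A shift register generator of the kind pictured above can be
   sketched in a few lines.  The 4 bit width and the taps at bit
   positions 3 and 2 are assumptions chosen so that the behavior is
   easy to check by hand; they do not come from the text.

```python
def lfsr_step(state, taps, width):
    """Shift left by one; the new low bit is the XOR of the tapped bits."""
    feedback = 0
    for tap in taps:
        feedback ^= (state >> tap) & 1
    return ((state << 1) | feedback) & ((1 << width) - 1)

def lfsr_period(seed, taps, width):
    """Count steps until the register returns to its starting state.

    Assumes the top bit (width - 1) is among the taps, which makes each
    step invertible and so guarantees the loop terminates.
    """
    state, steps = lfsr_step(seed, taps, width), 1
    while state != seed:
        state = lfsr_step(state, taps, width)
        steps += 1
    return steps

period = lfsr_period(0b0001, taps=(3, 2), width=4)
```

   With these assumed taps the register visits all 15 non-zero states
   before repeating, the longest cycle a 4 bit register of this kind
   can have.  Good statistics of this sort say nothing about how hard
   the sequence is to predict.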

277	   These sequences may be adequate in simulations (Monte Carlo
278	   experiments) as long as the sequence is orthogonal to the structure
279	   of the space being explored.  Even there, subtle patterns may cause
280	   problems.  [ref to come - Marsaglia] However, such sequences are
281	   clearly bad for use in security applications.  They are fully
282	   predictable if the initial state is known.  Depending on the form of
283	   the pseudo-random number generator, the sequence may be determinable
284	   from observation of a short portion of the sequence.  For example,
285	   with the generators above, one can determine V(n+1) given knowledge
286	   of V(n).  In fact, it has been shown that with them even if only one
287	   bit of the pseudo-random values is released, the seed can be
288	   determined from short sequences.  [ref to come - Frieze, Hastad,
289	   Kannan, Lagaris, & Shamir]
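
   The predictability of the linear congruence generator is easy to
   demonstrate.  In this sketch the constants a, b, and c and the seed
   are illustrative assumptions, and the adversary is assumed to know
   the constants and to have observed a single output value.

```python
def lcg_next(v, a, b, c):
    """One step of V[N+1] = (V[N] * a + b) mod c."""
    return (v * a + b) % c

# Illustrative parameters only (assumed, not taken from the text).
A, B, C = 1103515245, 12345, 2**31

def generate(seed, count):
    """Return the next `count` values that follow `seed`."""
    values, v = [], seed
    for _ in range(count):
        v = lcg_next(v, A, B, C)
        values.append(v)
    return values

victim_stream = generate(seed=12345, count=5)
# An observer who sees one output and knows a, b, c predicts the rest.
predicted = generate(seed=victim_stream[0], count=4)
```

   Every later value follows from the one observed value, whatever the
   original seed was.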

291	4. Unpredictability

293	   Randomness in the traditional sense described in the previous section
294	   is NOT the same as the unpredictability required for security use.

296	   For example, use of a widely available constant sequence, such as
297	   that from the CRC tables, is very weak against an adversary. Once
298	   they learn of or guess it, they can easily break all security, future
299	   and past, based on the sequence. [CRC]

301	4.1 Problems with Clocks and Serial Numbers

303	   Computer clocks, or similar operating system or hardware values,
304	   provide significantly fewer real bits of unpredictability than might
305	   appear from their specifications.

307	   Tests have been done on clocks on numerous systems and it was found
308	   that their behavior can vary widely and in unexpected ways.  One
309	   version of an operating system running on one set of hardware may
310	   actually provide, say, microsecond resolution in a clock while a
311	   different configuration of the "same" system may always provide the
312	   same lower bits and only count in the upper bits at much lower
313	   resolution.  This means that successive reads on the clock may
314	   produce identical values even if enough time has passed that the
315	   value "should" change based on the nominal clock resolution. There
316	   are also cases where frequently reading a clock can produce
317	   artificial sequential values because of extra code that checks for
318	   the clock being unchanged between two reads and increases it by one!
319	   Designing portable application code to generate unpredictable numbers
320	   based on such system clocks is particularly challenging because the
321	   system designer does not always know the properties of the system
322	   clocks that the code will execute on.
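
   One way to probe how much a clock really varies is to sample it
   repeatedly and count the distinct values seen in its low order
   bits.  The sketch below is an assumption-laden illustration: the
   function names are hypothetical, and the simulated coarse clock
   stands in for a system that only counts in its upper bits.

```python
import time

def distinct_low_bits(read_clock, samples, bits=8):
    """Count distinct values seen in the low `bits` bits of a clock."""
    mask = (1 << bits) - 1
    return len({read_clock() & mask for _ in range(samples)})

def coarse_clock_factory(step):
    """Hypothetical clock that only ever advances in multiples of `step`."""
    ticks = iter(range(0, 10**9, step))
    return lambda: next(ticks)

# A clock whose low 8 bits never change versus one that counts by one.
stuck = distinct_low_bits(coarse_clock_factory(256), 1000)
counting = distinct_low_bits(coarse_clock_factory(1), 1000)

# On a real system one might probe, for example:
#     distinct_low_bits(time.time_ns, 10000)
```

   A result far below 2^bits on a real clock suggests that its low
   order bits contribute less unpredictability than the nominal
   resolution implies.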

324	   Use of a hardware serial number such as an Ethernet address may also
325	   provide fewer bits of uniqueness than one would guess.  Such
326	   quantities are usually heavily structured and subfields may have only
327	   a limited range of possible values or values easily guessable based
328	   on approximate date of manufacture or other data.  For example, it is
329	   likely that most of the Ethernet cards installed on Digital Equipment
330	   Corporation (DEC) hardware within DEC were manufactured by DEC
331	   itself, which significantly limits the range of possible serial
332	   numbers.

334	   Problems such as those described above related to clocks and serial
335	   numbers make code to produce unpredictable quantities difficult if
336	   the code is to be ported across a variety of computer platforms and
337	   systems.

339	4.2 Timing and Content of External Events

341	   It is possible to measure the timing of mouse movement, key strokes,
342	   and the like.  This is a reasonable source of unguessable data with
343	   two exceptions.  On some machines, inputs such as key strokes are
344	   buffered.  Even though the user's inter-keystroke timing may have
345	   sufficient variation and unpredictability, there might not be an easy
346	   way to access that variation.  The other problem is that no standard
347	   method exists to sample timing details.  This makes it hard to build
348	   standard software intended for distribution to a large range of
349	   machines based on this technique.

351	   The  amount of mouse movement or keys actually hit are usually easier
352	   to access than timings but may yield less unpredictability as the
353	   user may provide highly repetitive input.

355	4.3 The Fallacy of Complex Manipulation

357	   One strategy which may give a misleading appearance of strength is to
358	   take a very complex algorithm (or an excellent traditional pseudo-
359	   random number generator with good statistical properties) and
360	   calculate a cryptographic key by starting with the current value of a
361	   computer system clock as the seed.  An adversary who knew roughly
362	   when the generator was started would have a relatively small number
363	   of seed values to test as they would know likely values of the system
364	   clock.  Large numbers of pseudo-random bits could be generated but
365	   the search space an adversary would need to check could be quite
366	   small.

368	   Thus very strong and/or complex manipulation of data will not help if
369	   the adversary can learn what the manipulation is and there is not
370	   enough unpredictability in the starting seed value.
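
   The clock seeding fallacy can be sketched as follows.  Here a
   cryptographic hash plays the part of the "very complex algorithm";
   the function names, the microsecond clock, and the 100,000
   microsecond window are all assumptions made for illustration.

```python
import hashlib

def key_from_clock(timestamp_us):
    """Hypothetical generator: hash the clock reading to get a 128 bit key."""
    digest = hashlib.sha256(timestamp_us.to_bytes(8, "big")).digest()
    return digest[:16]

def search_window(observed_key, earliest_us, latest_us):
    """Try every clock value the attacker considers plausible."""
    for candidate in range(earliest_us, latest_us + 1):
        if key_from_clock(candidate) == observed_key:
            return candidate
    return None

start_us = 1_700_000_000_000_000   # assumed clock reading, in microseconds
victim_key = key_from_clock(start_us + 31_337)
found = search_window(victim_key, start_us, start_us + 100_000)
```

   The key is 128 bits long, yet the adversary needs at most about
   100,000 trials, or roughly 17 bits of work, because that is all the
   uncertainty the seed contained.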

372	   Another serious strategy error is to assume that a very complex
373	   pseudo-random number generation algorithm will produce strong random
374	   numbers when there has been no theory behind or analysis of the
375	   algorithm.  There is an excellent example of this fallacy right near
376	   the beginning of [KNUTH] where the author describes a complex
377	   algorithm.  It was intended that the machine language program
378	   corresponding to the algorithm would be so complicated that a person
379	   trying to read the code without comments wouldn't know what the
380	   program was doing.  Unfortunately, actual use of this algorithm
381	   showed that it almost immediately converged to a single repeated
382	   value in one case and a small cycle of values in another case.

384	   Not only does complex manipulation not help you if you have a limited
385	   range of seeds but blindly chosen complex manipulation can destroy
386	   the randomness in a good seed!

388	4.4 The Fallacy of Selection from a Large Database

390	   Another strategy that can give a misleading appearance of strength is
391	   selection of a quantity randomly from a database and the assumption
392	   that its strength is related to the total number of bits in the
393	   database.  For example, typical NNTP servers as of this date process
394	   over 30 megabytes of information per day.  Assume a random quantity
395	   was selected by fetching 32 bytes of data from a random starting
396	   point in this data.  This does not yield 32*8 = 256 bits worth of
397	   unguessability.  Even after allowing that much of the data is human
398	   language and probably has more like 3 bits of information per byte,
399	   it doesn't yield 32*3 = 96 bits of unguessability.  For an adversary
400	   with access to the same 30 megabytes the unguessability rests only on
401	   the starting point of the selection.  That is, at best, about 25 bits
402	   of unguessability in this case.
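
   The arithmetic behind the figure of about 25 bits is shown below.
   This is a sketch that assumes a megabyte of 10^6 bytes and that any
   byte offset in the data is an equally likely starting point; the
   function name is hypothetical.

```python
import math

def starting_point_bits(database_bytes):
    """Upper bound on unguessability: log2 of the number of start offsets."""
    return math.log2(database_bytes)

naive_bits = 32 * 8       # 256: counts every fetched bit as unguessable
language_bits = 32 * 3    # 96: allows about 3 bits per byte of text
actual_bits = starting_point_bits(30 * 10**6)   # 30 megabytes of news
```

   The result is a little under 25 bits, far below either of the other
   two estimates.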

404	   The same argument applies to selecting sequences from the data on a
405	   CD ROM or Audio CD recording or any other large public database.  If
406	   the adversary has access to the same database, this "selection from a
407	   large volume of data" step buys very little.  However, if a selection
408	   can be made from data to which the adversary has no access, such as
409	   active system buffers on an active multi-user system, it may be of
410	   some help.

412	5. Hardware for Randomness

414	   Is there any hope for strong portable randomness in the future?
415	   There might be.  All that's needed is a physical source of
416	   unpredictable numbers.

418	   A thermal noise or radioactive decay source and a fast, free-running
419	   oscillator would do the trick.  This is a trivial amount of hardware,
420	   and could easily be included as a standard part of a computer
421	   system's architecture.  All that's needed is the common perception
422	   among computer vendors that this small addition is necessary and
423	   useful.

425	5.1 Volume Required

427	   How much unpredictability is needed?  Is it possible to quantify the
428	   requirement in, say, number of random bits per second?

430	   The answer is that not very much is needed.  For DES, the key is 56
431	   bits and, as we show in an example in Section 8, even the highest
432	   security system is unlikely to require keying material of over 200
433	   bits.  Even if a series of keys is needed, they can be generated from
434	   a strong random seed using a cryptographically strong sequence as
435	   explained in Section 6.3.  A few hundred random bits generated once a
436	   day would be enough using such techniques.  Even if the random bits
437	   are generated as slowly as one per second and it is not possible to
438	   overlap the generation process, it should be tolerable in high
439	   security applications to wait 200 seconds occasionally.

441	   These numbers are trivial to achieve.  It could be done by a person
442	   repeatedly tossing a coin.  Almost any hardware process is likely to
443	   be much faster.

445	5.2 Sensitivity to Skew

447	   Is there any specific requirement on the shape of the distribution of
448	   the random numbers?  The good news is the distribution need not be
449	   uniform.  All that is needed is a conservative estimate of how non-
450	   uniform it is to bound performance.  Two simple techniques to de-skew
451	   the bit stream are given below and stronger techniques are mentioned
452	   in Section 6.1.2 below.

454	5.2.1 Using Stream Parity to De-Skew

456	   Consider taking a sufficiently long string of bits and mapping it
457	   to "zero" or "one".  The mapping will not yield a perfectly uniform
458	   distribution, but it can be as close as desired.  One mapping that
459	   serves the purpose is to take the parity of the string.  This has the
460	   advantages that it is robust across all degrees of skew up to the
461	   estimated maximum skew and is absolutely trivial to implement in
462	   hardware.

464	   The following analysis gives the number of bits that must be sampled:

466	   Suppose the ratio of ones to zeros is 0.5 + e : 0.5 - e, where e is
467	   between 0 and 0.5 and is a measure of the "eccentricity" of the
468	   distribution.  Consider the distribution of the parity function of N
469	   bit samples.  The probabilities that the parity will be one or zero
470	   will be the sum of the odd or even terms in the binomial expansion of
471	   (p + q)^N, where p = 0.5 + e, the probability of a one, and q = 0.5 -
472	   e, the probability of a zero.

474	   These sums can be computed easily as

476	        1/2 * ( ( p + q )^N + ( p - q )^N )
477	   and
478	        1/2 * ( ( p + q )^N - ( p - q )^N ).

480	   (Which one corresponds to the probability the parity will be 1
481	   depends on whether N is odd or even.)

483	   Since p + q = 1 and p - q = 2e, these expressions reduce to

485	        1/2 * [1 + (2e)^N]
486	   and
487	        1/2 * [1 - (2e)^N].

489	   Neither of these will ever be exactly 0.5 unless e is zero, but we
490	   can bring them arbitrarily close to 0.5.  If we want the
491	   probabilities to be within some delta d of 0.5, then we need

493	        ( 0.5 + ( 0.5 * (2e)^N ) ) < 0.5 + d.

495	   Solving for N yields N > log(2d)/log(2e).  (Note that 2e is less than
496	   1, so its log is negative.  Division by a negative number reverses
497	   the sense of an inequality.)

499	   The following table gives the length of the string which must be
500	   sampled for various degrees of skew in order to come within 0.001 of
501	   a 50/50 distribution.

503	                       +---------+--------+-------+
504	                       | Prob(1) |    e   |    N  |
505	                       +---------+--------+-------+
506	                       |   0.5   |  0.00  |    1  |
507	                       |   0.6   |  0.10  |    4  |
508	                       |   0.7   |  0.20  |    7  |
509	                       |   0.8   |  0.30  |   13  |
510	                       |   0.9   |  0.40  |   28  |
511	                       |   0.95  |  0.45  |   59  |
512	                       |   0.99  |  0.49  |  308  |
513	                       +---------+--------+-------+

515	   The last entry shows that even if the distribution is skewed 99% in
516	   favor of ones, the parity of a string of 308 samples will be within
517	   0.001 of a 50/50 distribution.
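
	   As a rough check on the table above, the bound
	   N > log(2d)/log(2e) can be evaluated directly.  The Python
	   sketch below is not part of the original analysis; the
	   function names are illustrative, and it assumes independent
	   samples and a known upper bound on e.

```python
import math


def parity_sample_length(e, d):
    """Smallest N for which the parity bias 0.5 * (2e)^N is below d."""
    if not 0 <= e < 0.5:
        raise ValueError("e must satisfy 0 <= e < 0.5")
    if e == 0:
        return 1  # an unskewed source needs only one sample
    return max(1, math.ceil(math.log(2 * d) / math.log(2 * e)))


def parity_bias(e, n):
    """Distance from 0.5 of the parity of n samples with eccentricity e."""
    return 0.5 * (2 * e) ** n


for e in (0.0, 0.10, 0.20, 0.30, 0.40, 0.45, 0.49):
    n = parity_sample_length(e, 0.001)
    print(f"{0.5 + e:5.2f}  {e:5.2f}  {n:4d}  {parity_bias(e, n):.6f}")
```

	   For d = 0.001 the loop prints the same N column as the table,
	   together with the residual bias at that N.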

519	5.2.2 Using Transition Mappings to De-Skew

521	   Another possible technique is to examine a bit stream as a sequence
522	   of non-overlapping pairs. You could then discard any 00 or 11 pairs
523	   found and interpret 01 as a 0 and 10 as a 1.  Assume the probability
524	   of a 1 is 0.5+e and the probability of a 0 is 0.5-e where e is the
525	   eccentricity of the source as described in the previous section.
526	   Then the probability of each pair is as follows:

528	            +------+-----------------------------------------+
529	            | pair |            probability                  |
530	            +------+-----------------------------------------+
531	            |  00  | (0.5 - e)^2          =  0.25 - e + e^2  |
532	            |  01  | (0.5 - e)*(0.5 + e)  =  0.25     - e^2  |
533	            |  10  | (0.5 + e)*(0.5 - e)  =  0.25     - e^2  |
534	            |  11  | (0.5 + e)^2          =  0.25 + e + e^2  |
535	            +------+-----------------------------------------+

537	   This technique will completely eliminate any bias but at the expense
538	   of taking an indeterminate number of input bits for any particular
539	   desired number of output bits.  The probability of any particular
540	   pair being discarded is 0.5 + 2e^2 so the expected number of input
541	   bits to produce X output bits is X/(0.25 - e^2).

543	   This technique assumes that the bits are from a stream where each bit
544	   has the same probability of being a 0 or 1 as any other bit in the
545	   stream and that bits are not correlated, i.e., that the bits are
546	   independent and identically distributed.  If alternate bits were
547	   from two different sources, for example, the analysis breaks down.
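
	   The pair mapping described above can be sketched in a few
	   lines of Python.  This is only an illustration: the function
	   names are hypothetical, and Python's random module stands in
	   for a skewed physical source purely so that the sketch can be
	   run.  It is not itself a suitable source of randomness for
	   security purposes.

```python
import random


def transition_deskew(bits):
    """Map non-overlapping pairs: 01 -> 0, 10 -> 1; drop 00 and 11."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)  # the first bit of 01 is 0, of 10 is 1
    return out


def expected_input_bits(x, e):
    """Expected input bits for x output bits, X/(0.25 - e^2)."""
    return x / (0.25 - e * e)


rng = random.Random(1)  # fixed seed only to make the demonstration repeatable
e = 0.3
skewed = [1 if rng.random() < 0.5 + e else 0 for _ in range(200_000)]
out = transition_deskew(skewed)
print("fraction of ones in input: ", sum(skewed) / len(skewed))
print("fraction of ones in output:", sum(out) / len(out))
print("input bits per output bit: ", len(skewed) / len(out))
print("predicted by the formula:  ", expected_input_bits(1, e))
```

	   With e = 0.3 the formula predicts 6.25 input bits per output
	   bit, and the measured ratio should be close to that figure.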

549	   The above technique also provides another illustration of how a
550	   simple statistical analysis can mislead if one is not always on the
551	   lookout for patterns that could be exploited by an adversary.  If the
552	   algorithm were mis-read slightly so that overlapping successive bit
553	   pairs were used instead of non-overlapping pairs, the statistical
554	   analysis given is the same; however, instead of providing an unbiased
555	   uncorrelated series of random 1's and 0's, it would instead produce a
556	   totally predictable sequence of exactly alternating 1's and 0's.
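
	   The failure mode just described is easy to reproduce.  The
	   sketch below applies the 01 -> 0, 10 -> 1 mapping to
	   overlapping pairs.  The function name is illustrative, and
	   the random module is used only to simulate an input stream.

```python
import random


def overlapping_transition_map(bits):
    """Mis-read mapping: every successive pair, overlapping by one bit."""
    out = []
    for first, second in zip(bits, bits[1:]):
        if first != second:
            out.append(first)
    return out


rng = random.Random(7)  # simulated input stream, not a security source
bits = [rng.getrandbits(1) for _ in range(64)]
out = overlapping_transition_map(bits)
print("".join(str(b) for b in out))

# Every 01 transition must be followed by a 10 transition and vice
# versa, so adjacent output bits always differ.
print(all(a != b for a, b in zip(out, out[1:])))
```

	   Whatever the input, the output is fully determined by its
	   first bit, so it carries at most one bit of unpredictability.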

558	5.2.3 Using Compression to De-Skew

560	   Reversible compression techniques also provide a crude method of de-
561	   skewing a skewed bit stream.  This follows directly from the
562	   definition of reversible compression and the formula in Section 2
563	   above for the amount of information in a sequence.  Since the
564	   compression is reversible, the same amount of information must be
565	   present in the shorter output than was present in the longer input.
566	   By the Shannon information equation, this is only possible if, on
567	   average, the probabilities of the different shorter sequences are
568	   more uniformly distributed than were the probabilities of the longer
569	   sequences.  Thus the shorter sequences are de-skewed relative to the
570	   input.

572	   However, many compression techniques add a somewhat predictable
573	   preface to their output stream and may insert such a sequence again
574	   periodically in their output or otherwise introduce subtle patterns
575	   of their own.  They should be considered only a rough technique
576	   compared with those described above or in Section 6.1.2.  At a
577	   minimum, the beginning of the compressed sequence should be skipped
578	   and only later bits used for applications requiring random bits.
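
	   A minimal sketch of this idea follows.  It makes several
	   assumptions that are not in the text above: Python's zlib
	   module (DEFLATE) stands in for whatever compressor is
	   available, the skewed input is simulated with the random
	   module, and the choice to skip 16 leading bytes is arbitrary.
	   The last 4 bytes are dropped as well because the zlib format
	   ends with a checksum of the input.

```python
import random
import zlib


def skewed_bytes(n_bytes, prob_one, rng):
    """Pack simulated skewed bits into bytes (stand-in for a real source)."""
    data = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | (1 if rng.random() < prob_one else 0)
        data.append(byte)
    return bytes(data)


def ones_fraction(data):
    """Fraction of one bits in a byte string."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))


rng = random.Random(3)  # fixed seed for a repeatable demonstration
raw = skewed_bytes(20_000, 0.9, rng)
compressed = zlib.compress(raw, 9)

# Assumption: skip an arbitrary 16 bytes at the start (format header
# and coding tables) and the 4-byte checksum at the end.
body = compressed[16:-4]

print(len(raw), len(compressed))
print(ones_fraction(raw), ones_fraction(body))
```

	   The body should show a fraction of ones much nearer to 0.5
	   than the input does, but as noted above this remains a rough
	   technique and the output should not be used without further
	   mixing.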

580	5.3 Using Sound/Video Input

582	   Increasingly computers are being built with inputs that digitize some
583	   real world analog source, such as sound from a microphone or video
584	   input from a camera.  Under appropriate circumstances, such input can
585	   provide reasonably high quality random bits.  The "input" from a
586	   sound digitizer with no source plugged in or a camera with the lens
587	   cap on, if the system is high enough gain to detect anything, is
588	   essentially thermal noise.

590	   For example, on a SPARCstation, one can read from the /dev/audio
591	   device with nothing plugged into the microphone jack.  Such data is
592	   essentially random noise although it should not be trusted without
593	   some checking in case of hardware failure.  It will, in any case,
594	   need to be de-skewed as described elsewhere.

596	   Thus, combining this with compression to de-skew, one can in UNIXese
597	   generate a huge amount of relatively random data by doing

599	        cat /dev/audio | compress - >random-bits-file

601	6. Recommended Non-Hardware Strategy

603	   What is the best overall strategy for meeting the requirement for
604	   unguessable random numbers in the absence of a reliable hardware
605	   source?  It is to obtain random input from a large number of
606	   uncorrelated sources and to mix them with a strong mixing function.
607	   Such a function will preserve the randomness present in any of the
608	   sources even if other quantities being combined are fixed or easily
609	   guessable.  This may be advisable even with a good hardware source as
610	   hardware can also fail, though this should be weighed against any
611	   increase in the chance of overall failure due to added software
612	   complexity.

614	6.1 Mixing Functions

616	   A strong mixing function is one which combines two or more inputs and
617	   produces an output where each output bit is a different complex non-
618	   linear function of all the input bits.  On average, changing any
619	   input bit will change about half the output bits.  But because the
620	   relationship is complex and non-linear, no particular output bit is
621	   guaranteed to change when any particular input bit is changed.
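
	   The "about half the output bits" property can be observed
	   experimentally.  The sketch below is an illustration only.
	   It uses SHA-256 from Python's hashlib as a stand-in strong
	   mixing function (it is not one of the functions discussed in
	   this document), and the message and helper names are
	   arbitrary.

```python
import hashlib


def flip_bit(data, index):
    """Return a copy of data with one bit inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def changed_output_bits(data, index):
    """Count digest bits that differ after flipping input bit `index`."""
    before = int.from_bytes(hashlib.sha256(data).digest(), "big")
    after = int.from_bytes(hashlib.sha256(flip_bit(data, index)).digest(), "big")
    return bin(before ^ after).count("1")


message = b"example input to a mixing function"
counts = [changed_output_bits(message, i) for i in range(8 * len(message))]

# Minimum, mean, and maximum number of changed bits out of 256.
print(min(counts), sum(counts) / len(counts), max(counts))
```

	   The mean should be close to 128 of the 256 output bits, with
	   individual counts scattered around that value.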

623	   Note that the problem of converting a stream of bits that is skewed
624	   towards 0 or 1 to a shorter stream which is more random, as discussed
625	   in Section 5.2 above, is simply another case where a strong mixing
626	   function is desired.  The technique given in Section 5.2.1 of using
627	   the parity of a number of bits is simply the result of successively
628	   Exclusive Or'ing them which is examined as a trivial mixing function
629	   immediately below.  Use of stronger mixing functions to extract more
630	   of the randomness in a stream of skewed bits is examined in Section
631	   6.1.2.

633	6.1.1 A Trivial Mixing Function

635	   A trivial example for single bit inputs is the Exclusive Or function,
636	   which is equivalent to addition without carry, as shown in the table
637	   below.  This is a degenerate case in which the one output bit always
638	   changes for a change in either input bit but it will still provide a
639	   useful illustration.

641	                   +-----------+-----------+----------+
642	                   |  input 1  |  input 2  |  output  |
643	                   +-----------+-----------+----------+
644	                   |     0     |     0     |     0    |
645	                   |     0     |     1     |     1    |
646	                   |     1     |     0     |     1    |
647	                   |     1     |     1     |     0    |
648	                   +-----------+-----------+----------+

650	   If inputs 1 and 2 are uncorrelated and combined in this fashion then
651	   the output will be an even better (less skewed) random bit than the
652	   inputs.  If we assume an "eccentricity" e as defined in Section 5.2
653	   above, then the output eccentricity relates to the input eccentricity
654	   as follows:

656	        e       = 2 * e        * e
657	         output        input 1    input 2

659	   Since e is never greater than 1/2, the eccentricity is always
660	   improved except in the case where at least one input is a totally
661	   skewed constant.  This is illustrated in the following table where
662	   the top and left side values are the two input eccentricities and the
663	   entries are the output eccentricity:

665	     +--------+--------+--------+--------+--------+--------+--------+
666	     |    e   |  0.00  |  0.10  |  0.20  |  0.30  |  0.40  |  0.50  |
667	     +--------+--------+--------+--------+--------+--------+--------+
668	     |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |  0.00  |
669	     |  0.10  |  0.00  |  0.02  |  0.04  |  0.06  |  0.08  |  0.10  |
670	     |  0.20  |  0.00  |  0.04  |  0.08  |  0.12  |  0.16  |  0.20  |
671	     |  0.30  |  0.00  |  0.06  |  0.12  |  0.18  |  0.24  |  0.30  |
672	     |  0.40  |  0.00  |  0.08  |  0.16  |  0.24  |  0.32  |  0.40  |
673	     |  0.50  |  0.00  |  0.10  |  0.20  |  0.30  |  0.40  |  0.50  |
674	     +--------+--------+--------+--------+--------+--------+--------+
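
	   The relation between input and output eccentricity can be
	   checked from first principles.  The sketch below (function
	   name illustrative, inputs assumed independent) computes the
	   probability that the Exclusive Or of two bits is one and
	   compares its distance from 0.5 with 2 * e1 * e2.

```python
def xor_eccentricity(e1, e2):
    """Eccentricity of a XOR b for independent bits with P(1) = 0.5 + e."""
    p1, p2 = 0.5 + e1, 0.5 + e2
    # The output is one exactly when the two inputs differ.
    prob_one = p1 * (1 - p2) + (1 - p1) * p2
    return abs(prob_one - 0.5)


levels = (0.0, 0.1, 0.2, 0.3, 0.4, 0.5)
for e1 in levels:
    print("  ".join(f"{xor_eccentricity(e1, e2):.2f}" for e2 in levels))

# Agreement with the closed form 2 * e1 * e2 for every table entry.
print(all(abs(xor_eccentricity(a, b) - 2 * a * b) < 1e-12
          for a in levels for b in levels))
```

	   The printed rows match the body of the table above.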

676	   However, keep in mind that the above calculations assume that the
677	   inputs are not correlated.  If the inputs were, say, the parity of
678	   the number of minutes from midnight on two clocks accurate to a few
679	   seconds, then each might appear random if sampled at random intervals
680	   much longer than a minute.  Yet if they were both sampled and
681	   combined with xor, the result would usually be a constant zero.

683	6.1.2 Stronger Mixing Functions

685	   The US Government Data Encryption Standard [DES] is a good example of
686	   a strong mixing function for multiple bit quantities.  It takes up to
687	   120 bits of input (64 bits of "data" and 56 bits of "key") and
688	   produces 64 bits of output each of which is dependent on a complex
689	   non-linear function of all input bits.  Another good family of mixing
690	   functions is the "message digest" or hashing functions such as MD2,
691	   MD4, or MD5 that take an arbitrary amount of input and produce an
692	   output, frequently 128 bits, mixing all the input bits. [MD2, MD4,
693	   MD5]

695	   Although message digest functions like MD5 are designed for variable
696	   amounts of input, DES can also be used to combine any number of
697	   inputs.  If 64 bits of output is adequate, the inputs can be packed
698	   into a 64 bit data quantity and successive 56 bit keys, padding with
699	   zeros if needed, which are then used to successively encrypt using
700	   DES in Electronic Codebook Mode [DES MODES].  If more than 64 bits of
701	   output are needed, use more complex mixing.  For example, if inputs
702	   are packed into three quantities, A, B, and C, use DES to encrypt A
703	   with B as a key and then with C as a key to produce the 1st part of
704	   the output, then encrypt B with C and then A for more output and, if
705	   necessary, encrypt C with A and then B for yet more output.  Still
706	   more output can be produced by reversing the order of the keys given
707	   above to stretch things, but keep in mind that it is impossible to
708	   get more bits of "randomness" out than are put in.

710	   An example of using a strong mixing function would be to reconsider
711	   the case of a string of 308 bits each of which is biased 99% towards
712	   zero.  The parity technique given in Section 5.2.1 above reduced this
713	   to one bit with only a 1/1000 deviance from being equally likely a
714	   zero or one.  But, applying the equation for information given in
715	   Section 2, this 308 bit sequence has 5 bits of information in it.
716	   Thus hashing it with MD5 and taking the bottom 5 bits of the result
717	   would yield 5 unbiased random bits as opposed to the single bit given
718	   by calculating the parity of the string.
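
	   In outline, the extraction step might look as follows.  This
	   is a sketch under stated assumptions: SHA-256 from Python's
	   hashlib is substituted for MD5 simply because it is widely
	   available, the sample values are made up, and the function
	   name is hypothetical.

```python
import hashlib


def extract_bits(samples, n_bits):
    """Hash a sequence of 0/1 samples and keep the low n_bits."""
    packed = bytes(samples)  # one byte per sample, values 0 or 1
    digest = hashlib.sha256(packed).digest()
    return int.from_bytes(digest, "big") & ((1 << n_bits) - 1)


# Hypothetical 308-sample reading from a source biased toward zero.
samples = [0] * 308
samples[17] = samples[140] = samples[291] = 1

value = extract_bits(samples, 5)
print(f"{value:05b}")
```

	   The number of bits kept should never exceed a conservative
	   estimate of the information actually present in the input.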

720	   Other functions besides DES and the MD* family should serve well as
721	   mixing functions.  This is an advantage of Diffie-Hellman exponential
722	   key exchange.  Diffie-Hellman yields a shared secret between two
723	   parties that is a mixture of initial random quantities generated by
724	   each of them [D-H, ref to come - Odlyzko].

726	6.1.3 Using a Mixing Function to Stretch Random Bits

728	   While it is not necessary for a mixing function to produce the same
729	   or fewer bits than its inputs, mixing bits cannot "stretch" the
730	   amount of random unpredictability present in the inputs.  Thus four
731	   inputs of 32 bits each where there is 12 bits worth of
732	   unpredictability (such as 4,096 equally probable values) in each
733	   input cannot produce more than 48 bits worth of unpredictable output.
734	   The output can be expanded to hundreds or thousands of bits by, for
735	   example, mixing with successive integers, but the clever adversary's
736	   search space is still 2^48 possibilities.  Furthermore, mixing to
737	   fewer bits than are input will tend to strengthen the randomness of
738	   the output the way using Exclusive Or to produce one bit from two did
739	   above.

741	   The last table in Section 6.1.1 shows that mixing a random bit with a
742	   constant bit with Exclusive Or will produce a random bit.  While this
743	   is true, it does not provide a way to "stretch" one random bit into
744	   more than one.  If, for example, a random bit is mixed with a 0 and
745	   then with a 1, this produces a two bit sequence but it will always be
746	   either 01 or 10.  Since there are only two possible values, there is
747	   still only the one bit of original randomness.
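
	   The two-bit example can be enumerated exhaustively.  The
	   sketch below (illustrative function name) lists every output
	   that mixing one bit with the constants 0 and 1 can produce.

```python
import math


def mix_with_constants(random_bit):
    """XOR one random bit with the constants 0 and 1."""
    return (random_bit ^ 0, random_bit ^ 1)


outcomes = {mix_with_constants(bit) for bit in (0, 1)}
print(sorted(outcomes))          # [(0, 1), (1, 0)]
print(math.log2(len(outcomes)))  # 1.0
```

	   Two equally likely outcomes correspond to one bit, however
	   many output bits are written down.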

749	6.1.4 Other Factors in Choosing a Mixing Function

751	   For local use, DES has the advantages that it has been widely tested
752	   for flaws, is widely documented, and is widely implemented with
753	   hardware and software implementations available all over the world
754	   including source code available by anonymous FTP.  The MD* family are
755	   younger algorithms which have been less tested but there is no
756	   particular reason to believe they are flawed.  They also have source
757	   code available by anonymous FTP [MD2, MD4, MD5].  DES, MD4, and MD5
758	   are royalty free for all purposes but MD2 has been freely licensed
759	   only for non-profit use in connection with Privacy Enhanced Mail.
760	   Some people believe that, as with Goldilocks and the Three Bears, MD2
761	   is strong but too slow, MD4 is fast but too weak, and MD5 is just
762	   right.

764	   Another advantage of the MD* or similar hashing algorithms is that
765	   they are not subject to the regulations imposed by the US Government
766	   prohibiting the export or import of encryption/decryption software
767	   (or hardware).  The same should be true of DES rigged to produce an
768	   irreversible hash code but most DES packages are oriented to
769	   reversible encryption.

771	6.2 Non-Hardware Sources of Randomness

773	   The best source of input for mixing would be a hardware random number
774	   generator based on some fundamentally random physical process such as
775	   thermal noise or radioactive decay.  However, if that is not
776	   available, other possibilities include system clocks, system or
777	   input/output buffers, user/system/hardware/network serial numbers
778	   and/or addresses, user input, and timings of input/output operations.
779	   Unfortunately, any of these sources can produce limited or
780	   predictable values under some circumstances.

782	   Some of the sources listed above would be quite strong on multi-user
783	   systems where, in essence, each user of the system is a source of
784	   randomness.  However, on a small single user system, such as a
785	   typical IBM PC or Apple Macintosh, it might be possible for an
786	   adversary to assemble a similar configuration.  This could give the
787	   adversary inputs to the mixing process that were sufficiently
788	   correlated to those used originally as to make exhaustive search
789	   practical.

791	   The use of multiple random inputs with a strong mixing function is
792	   recommended and can overcome weakness in any particular input.  For
793	   example, the timing and content of requested "random" user keystrokes
794	   can yield hundreds of random bits but conservative assumptions need
795	   to be made.  For example, one might assume a few bits of randomness
796	   if the inter-keystroke interval is unique in the sequence up to that
797	   point, and similarly if the key hit is unique, but assume that no
798	   bits of randomness are present in the initial key value or if the
799	   timing or key value duplicate previous values.  The results of mixing
800	   these timings and characters typed could be further combined with
801	   clock values and other inputs.

803	   This strategy may make practical portable code to produce good random
804	   numbers for security even if some of the inputs are very weak on some
805	   of the target systems.  However, it may fail against a high grade
806	   attack on small single user systems, especially if the adversary has
807	   ever been able to observe the generation process in the past.  A
808	   hardware random source is still preferable.

810	6.3 Cryptographically Strong Sequences

812	   In cases where a series of random quantities must be generated, an
813	   adversary may learn some values in the sequence.  In general, they
814	   should not be able to predict other values from the ones that they
815	   know.

817	   The correct technique is to start with a strong random seed, take
818	   cryptographically strong steps from that seed [CRYPTO2], and do not
819	   reveal the complete state of the generator in the sequence elements.
820	   If each value in the sequence can be calculated in a fixed way from
821	   the previous value, then when any value is compromised, all future
822	   values can be determined.  This would be the case, for example, if
823	   each value were a constant function of the previous values, even if
824	   the function were a very strong, non-invertible message digest
825	   function.

827	   A good way to achieve a strong sequence is to have the values be
828	   produced by hashing the quantities produced by concatenating the seed
829	   with successive integers or the like and then mask the values
830	   obtained so as to limit the amount of generator state available to
831	   the adversary.  It may also be possible to use an encryption
832	   algorithm with a random key and seed value to encrypt and feedback
833	   some of the output encrypted value into the value to be encrypted for
834	   the next iteration.  Appropriate feedback techniques will usually be
835	   recommended with the encryption algorithm.  An example is shown below
836	   where shifting and masking are used to combine the cypher output
837	   feedback.  This type of feedback is recommended in connection with
838	   DES [DES MODES].

840	         +---------------+
841	         |       V       |
842	         |  |     n      |
843	         +--+------------+
844	               |      |           +---------+
845	               |      +---------> |         |      +-----+
846	            +--+                  | Encrypt | <--- | Key |
847	            |           +-------- |         |      +-----+
848	            |           |         +---------+
849	            V           V
850	         +------------+--+
851	         |      V     |  |
852	         |       n+1     |
853	         +---------------+

855	   Note that if a shift of one is used, this is the same as the shift
856	   register technique described in Section 3 above but with the all
857	   important difference that the feedback is determined by a complex
858	   non-linear function of all bits rather than a simple linear
859	   combination of output from a few bit position taps.

861	   To predict values of a sequence from others when the sequence was
862	   generated by these techniques is equivalent to breaking the
863	   cryptosystem or inverting the "non-invertible" hashing involved with
864	   only partial information available.  The less information revealed
865	   each iteration, the harder it will be for an adversary to predict the
866	   sequence.  Thus it is best to use only one bit from each value.  It
867	   has been shown that in some cases this makes it impossible to break a
868	   system even when the cryptographic system is invertible and can be
869	   broken if all of the generated values were revealed.
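
	   The hash-based variant described in this section might be
	   sketched as follows.  The details are assumptions made for
	   illustration: SHA-256 stands in for the message digest
	   function, os.urandom stands in for the strong random seed,
	   the counter is encoded as 8 bytes, and only the first 8 bytes
	   of each digest are released as the "mask".

```python
import hashlib
import os


def strong_sequence(seed, count, out_bytes=8):
    """Yield `count` values, each a truncated hash of seed || counter."""
    for counter in range(count):
        block = seed + counter.to_bytes(8, "big")
        digest = hashlib.sha256(block).digest()
        # Mask the value: release only part of each digest so that the
        # output does not expose the full hash result.
        yield digest[:out_bytes]


seed = os.urandom(32)  # assumption: the platform source is a strong seed
values = list(strong_sequence(seed, 4))
for value in values:
    print(value.hex())
```

	   Because each value depends on the secret seed rather than on
	   the previous value, learning one element does not by itself
	   give the adversary the next one.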

871	7. US DoD Recommendations for Password Generation

873	   The United States Department of Defense has specific recommendations
874	   for password generation [DoD]. They suggest using the US Data
875	   Encryption Standard [DES] in Output Feedback Mode [DES MODES] as
876	   follows:

878	        use an initialization vector determined from
879	             the system clock,
880	             system ID,
881	             user ID, and
882	             date and time;
883	        use a key determined from
884	             system interrupt registers,
885	             system status registers, and
886	             system counters; and,
887	        as plain text, use an external randomly generated 64 bit
888	        quantity such as 8 characters typed in by a system
889	        administrator.

891	   The password can then be calculated from the 64 bit "cipher text"
892	   generated in 64-bit Output Feedback Mode.  As many bits as are needed
893	   can be taken from these 64 bits and expanded into a pronounceable
894	   word, phrase, or other format.

896	8. Examples of Randomness Required

898	   Below are two examples showing rough calculations of needed
899	   randomness for security.

901	8.1 Password Generation

903	   Assume that user passwords change once a year and a probability of
904	   less than one in a thousand that an adversary could guess the
905	   password for a particular account is desired.  The key question is
906	   how often they can try possibilities.  Assume that delays have been
907	   introduced into a system so that, at most, an adversary can make one
908	   password try every six seconds.  That's 600 per hour or about 15,000
909	   per day or about 5,000,000 tries in a year.  Assuming any sort of
910	   monitoring, it is unlikely someone could actually try continuously
911	   for a year.  In fact, even if log files are only checked monthly,
912	   500,000 tries is more plausible before the attack is noticed and
913	   steps taken to change passwords and make it harder to try more
914	   passwords.  (All this assumes that sending a password to the system
915	   is the only way to try a password.)

917	   To have a one in a thousand chance of guessing the password in
918	   500,000 tries implies a universe of at least 500,000,000 passwords or
919	   about 2^29.  Thus 29 bits of randomness are needed. This can probably
920	   be achieved using the US DoD recommended inputs for password
921	   generation as it has 8 inputs which probably average over 5 bits of
922	   randomness each.  Using a list of 1000 words, the password could be
923	   expressed as a three word phrase (1,000,000,000 possibilities) or,
924	   using case insensitive letters and digits, six would suffice
925	   ((26+10)^6 = 2,176,782,336 possibilities).
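
	   The arithmetic above can be reproduced with a short
	   calculation.  The function name and its parameters are
	   illustrative and not part of any recommendation.

```python
import math


def bits_needed(tries, one_in):
    """Bits so that `tries` guesses succeed with probability <= 1/one_in."""
    universe = tries * one_in
    return math.ceil(math.log2(universe))


tries_per_year = (3600 // 6) * 24 * 365  # one try every six seconds
print(tries_per_year)                    # 5256000
print(bits_needed(500_000, 1000))        # 29
print(1000 ** 3, 36 ** 6)                # 1000000000 2176782336
```

	   Both the three word phrase and the six character password
	   exceed the 500,000,000 possibilities required.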

927	   For a higher security password, the number of bits required goes up.
928	   To decrease the probability by 1,000 requires increasing the universe
929	   of passwords by the same factor which adds about 10 bits.  Thus to
930	   have only a one in a million chance of a password being guessed under
931	   the above scenario would require 39 bits of randomness and a password
932	   that was a four word phrase from a 1000 word list or eight
933	   letters/digits.  To go to a one in 10^9 chance, 49 bits of randomness
934	   are needed implying a five word phrase or ten letter/digit password.

936	8.2 A Very High Security Cryptographic Key

938	   Assume that a very high security key is needed for symmetric
939	   encryption / decryption between two parties.  Assume an adversary can
940	   observe communications and knows the algorithm being used.  Within
941	   the field of random possibilities, the adversary can exhaustively try
942	   key values.  Assume further that there is no systematic weakness in
943	   the cryptographic system so that brute force trial of keys is the
944	   best the adversary can do.

946	8.2.1 Effort per Key Trial

948	   How much effort will it take to try each key?  For very high security
949	   applications it is best to assume a low value of effort.  Even if it
950	   would clearly take tens of thousands of computer cycles or more to
951	   try a single key, there may be some pattern that enables huge blocks
952	   of key values to be tested with much less effort per key.  Thus it is
953	   probably best to assume no more than a hundred cycles per key.
954	   (There is no clear lower bound on this as computers operate in
955	   parallel on a number of bits and a poor encryption algorithm could
956	   allow many keys or even groups of keys to be tested in parallel.
957	   However, we need to assume some value and can hope that a reasonably
958	   strong algorithm has been chosen for our hypothetical high security
959	   task.)

961	   If the adversary can command a highly parallel processor or a large
962	   network of work stations, 10^10 cycles per second is probably a
963	   minimum assumption for availability today.  Looking forward just a
964	   few years, there should be at least an order of magnitude
965	   improvement.  Thus assuming 10^9 keys could be checked per second or
966	   3.6*10^12 per hour or 6*10^14 per week or 2.4*10^15 per month is
967	   reasonable.  This implies a need for a minimum of 50 bits of
968	   randomness in keys to be sure they cannot be found in a week.  Even
969	   then it is possible that, a few years from now, a highly determined
970	   and resourceful adversary could break the key in 2 weeks (on average
971	   they need try only half the keys).

973	8.2.2 Meet in the Middle Attacks

975	   If chosen or known plain text and the resulting encrypted text are
976	   available, a "meet in the middle" attack is possible if the structure
977	   of the encryption algorithm allows it.  (In a known plain text
978	   attack, the adversary knows all or part of the messages being
979	   encrypted, possibly some standard header or trailer fields.  In a
980	   chosen plain text attack, the adversary can force some known plain
981	   text to be encrypted, possibly by "leaking" an exciting text that
982	   would then be sent by the adversary over an encrypted channel.)

984	   An oversimplified explanation of the meet in the middle attack
985	   is as follows: the adversary can half-encrypt the known or chosen
986	   plain text with all possible first half-keys, sort these, then half-
987	   decrypt the encoded text with all the second half-keys.  If a match
988	   is found, the full key can be assembled from the halves and used to
989	   decrypt other parts of the message or other messages.  At its best,
990	   this type of attack can halve the exponent of the work required by
991	   the adversary while adding a moderate constant factor.  To be assured
992	   of safety against this, a doubling of the amount of randomness in the
993	   key to a minimum of about 100 bits is required.
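
	   The attack can be demonstrated on a toy scale.  Everything in
	   the sketch below is hypothetical: the 16-bit "cipher" with an
	   8-bit key is invented for this illustration and is
	   deliberately weak, and a dictionary lookup takes the place of
	   the sorting step mentioned above.  Two such encryptions with
	   independent keys give a 16-bit key, yet the search needs only
	   about 2 * 2^8 half-operations rather than 2^16 full trials.

```python
MASK = 0xFFFF  # toy cipher: 16-bit blocks, 8-bit keys


def rotl16(x, r):
    return ((x << r) | (x >> (16 - r))) & MASK


def rotr16(x, r):
    return ((x >> r) | (x << (16 - r))) & MASK


def toy_encrypt(block, key):
    """Hypothetical, deliberately weak cipher; for illustration only."""
    x = block
    for rnd in range(4):
        x = (x + key * 0x0101 + rnd) & MASK
        x = rotl16(x, 5) ^ (key * 0x3B)
    return x


def toy_decrypt(block, key):
    """Inverse of toy_encrypt: undo the rounds in reverse order."""
    x = block
    for rnd in reversed(range(4)):
        x = rotr16(x ^ (key * 0x3B), 5)
        x = (x - key * 0x0101 - rnd) & MASK
    return x


def double_encrypt(block, k1, k2):
    return toy_encrypt(toy_encrypt(block, k1), k2)


def meet_in_the_middle(pairs):
    """Return all (k1, k2) consistent with the known (plain, cipher) pairs."""
    (p0, c0), rest = pairs[0], pairs[1:]
    halfway = {}
    for k1 in range(256):  # half-encrypt with every first key
        halfway.setdefault(toy_encrypt(p0, k1), []).append(k1)
    found = []
    for k2 in range(256):  # half-decrypt with every second key
        for k1 in halfway.get(toy_decrypt(c0, k2), []):
            # Confirm each candidate against the remaining known pairs.
            if all(double_encrypt(p, k1, k2) == c for p, c in rest):
                found.append((k1, k2))
    return found


secret = (0x5A, 0xC3)
pairs = [(p, double_encrypt(p, *secret)) for p in (0x1234, 0xBEEF, 0x0F0F)]
print(secret in meet_in_the_middle(pairs))  # True
```

	   The extra known pairs are used only to weed out false matches
	   in the middle, which become more likely as the block size
	   shrinks relative to the key size.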

995	   The meet in the middle attack assumes that the cryptographic
996	   algorithm can be decomposed in this way but we can not rule that out
997	   without a deep knowledge of the algorithm.  Even if a basic algorithm
998	   is not subject to a meet in the middle attack, an attempt to produce
999	   a stronger algorithm by applying the basic algorithm twice with
1000	   different keys may gain much less than would be expected.  Such a
1001	   composite algorithm would be subject to this type of attack.

1003	   Enormous resources may be required to mount a meet in the middle
1004	   attack but they are probably within the range of the national
1005	   security services of a major nation.  Almost all nations spy on other
1006	   nations' government traffic and some nations are known to spy on
1007	   commercial traffic and give the information to their domestic
1008	   companies to assist them against foreign competition.

1010	8.2.3 Other Considerations

1012	   Since we have not even considered the possibilities of special
1013	   purpose code breaking hardware or just how much of a safety margin we
1014	   want beyond our assumptions above, probably a good minimum for a very
1015	   high security cryptographic key is 128 bits of randomness which
1016	   implies a minimum key length of 128 bits.  If the two parties agree
1017	   on a key by Diffie-Hellman exchange [D-H], then in principle only
1018	   half of this randomness would have to be supplied by each party.
1019	   However, there is probably some correlation between their random
1020	   inputs so it is probably best to assume that each party needs to
1021	   provide at least 96 bits worth of randomness for very high security.

1023	   This amount of randomness probably exceeds what is available from
1024	   the inputs recommended by the US DoD for password generation [DoD]
1025	   and could require user typing timing, hardware random number
1026	   generation, or other sources.
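
   On a system that already provides a cryptographically strong
   generator, drawing the 128 bits suggested above might look like the
   following sketch.  It assumes Python's standard secrets module and
   assumes that the operating system source behind it has been
   adequately seeded; neither assumption comes from this draft, and
   the name new_key_128 is hypothetical.

```python
import secrets

KEY_BITS = 128  # minimum suggested above for a very high security key


def new_key_128():
    """Return 16 bytes (128 bits) drawn from the OS-backed generator."""
    return secrets.token_bytes(KEY_BITS // 8)


if __name__ == "__main__":
    # Print the key as hexadecimal; the value differs on every run.
    print(new_key_128().hex())
```

   The sketch only fixes the quantity of key material requested.  The
   quality of that material still depends on the underlying source,
   which is the subject of the rest of this document.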

1028	   It should be noted that key length calculations such as those above
1029	   are controversial and depend on various assumptions about the
1030	   cryptographic algorithms in use.  In some cases, a professional with
1031	   a deep knowledge of code-breaking techniques and of the strength of
1032	   the algorithm in use could be satisfied with less than half of the
1033	   key size derived above.

1035	9. Security Considerations

1037	   The entirety of this draft concerns techniques and recommendations
1038	   for generating "random" quantities for use as passwords,
1039	   cryptographic keys, and similar security uses.

1041	References

1043	   [ASYMMETRIC] - Secure Communications and Asymmetric Cryptosystems,
1044	   edited by Gustavus J. Simmons, AAAS Selected Symposium 69, Westview
1045	   Press, Inc.

1047	   [CRC] - C.R.C. Standard Mathematical Tables, Chemical Rubber
1048	   Publishing Company.

1050	   [CRYPTO1] - Cryptography:  A Primer, A Wiley-Interscience
1051	   Publication, John Wiley & Sons, 1981, Alan G. Konheim.

1053	   [CRYPTO2] - Cryptography:  A New Dimension in Computer Data Security,
1054	   A Wiley-Interscience Publication, John Wiley & Sons, 1982, Carl H.
1055	   Meyer & Stephen M. Matyas.

1057	   [DES] -  Data Encryption Standard, United States of America,
1058	   Department of Commerce, National Institute of Standards and
1059	   Technology, Federal Information Processing Standard (FIPS) 46-1.
1060	   - Data Encryption Algorithm, American National Standards Institute,
1061	   ANSI X3.92-1981.
1062	   (See also FIPS 112, Password Usage, which includes FORTRAN code for
1063	   performing DES.)

1065	   [DES MODES] - DES Modes of Operation, United States of America,
1066	   Department of Commerce, National Institute of Standards and
1067	   Technology, Federal Information Processing Standard (FIPS) 81.
1068	   - Data Encryption Algorithm - Modes of Operation, American National
1069	   Standards Institute, ANSI X3.106-1983.

1071	   [D-H] - New Directions in Cryptography, IEEE Transactions on
1072	   Information Theory, November, 1976, Whitfield Diffie and Martin
1073	   E. Hellman.

1075	   [DoD] - Password Management Guideline, United States of America,
1076	   Department of Defense, Computer Security Center, CSC-STD-002-85.
1077	   (See also FIPS 112, Password Usage, which incorporates CSC-STD-002-85
1078	   as one of its appendices.)

1080	   [KNUTH] - The Art of Computer Programming, Volume 2: Seminumerical
1081	   Algorithms, Chapter 3: Random Numbers. Addison Wesley Publishing
1082	   Company, Second Edition 1982, Donald E. Knuth.

1084	   [MD2] - The MD2 Message-Digest Algorithm, RFC 1319, April 1992, B.
1085	   Kaliski.
1086	   [MD4] - The MD4 Message-Digest Algorithm, RFC 1320, April 1992, R.
1087	   Rivest.
1088	   [MD5] - The MD5 Message-Digest Algorithm, RFC 1321, April 1992, R.
1089	   Rivest.

1091	   [SHANNON] - The Mathematical Theory of Communication, University of
1092	   Illinois Press, 1963, Claude E. Shannon.  (originally from:  Bell
1093	   System Technical Journal, July and October 1948)

1095	   [SHIFT1] - Shift Register Sequences, Aegean Park Press, Revised
1096	   Edition 1982, Solomon W. Golomb.

1098	   [SHIFT2] - Cryptanalysis of Shift-Register Generated Stream Cypher
1099	   Systems, Aegean Park Press, 1984, Wayne G. Barker.

1101	Authors' Addresses

1103	   Donald E. Eastlake 3rd
1104	   Digital Equipment Corporation
1105	   550 King Street, LKG2-1/BB3
1106	   Littleton, MA 01460

1108	   Telephone:   +1 508 486 6577(w)  +1 508 287 4877(h)
1109	   EMail:       dee@skidrow.lkg.dec.com
1110	   NIC Handle:  [DEE]

1112	   Stephen D. Crocker
1113	   Trusted Information Systems, Inc.
1114	   3060 Washington Road
1115	   Glenwood, MD 21738

1117	   Telephone:   +1 301 854 6889
1118	   EMail:       crocker@tis.com
1119	   NIC Handle:  [SDC1]

1121	   Jeffrey I. Schiller
1122	   Massachusetts Institute of Technology
1123	   77 Massachusetts Avenue
1124	   Cambridge, MA 02139

1126	   Telephone:   +1 617 253 0161
1127	   EMail:       jis@mit.edu
1128	   NIC Handle:  [JIS]

1130	Expiration and File Name

1132	   This draft expires 31 March 1994.

1134	   Its file name is draft-ietf-security-randomness-01.txt.
