Network Working Group W A Simpson [DayDreamer] Internet Draft expires in six months July 1998 ESP with Cipher Block CheckSums (CBCS) draft-simpson-cbcs-01.txt Status of this Memo This document is an Internet-Draft. Internet Drafts are working doc- uments of the Internet Engineering Task Force (IETF), its Areas, and its Working Groups. Note that other groups may also distribute work- ing documents as Internet Drafts. Internet Drafts are draft documents valid for a maximum of six months, and may be updated, replaced, or obsoleted by other documents at any time. It is not appropriate to use Internet Drafts as refer- ence material, or to cite them other than as a ``working draft'' or ``work in progress.'' To learn the current status of any Internet-Draft, please check the ``1id-abstracts.txt'' listing contained in the internet-drafts Shadow Directories on: ftp.is.co.za (Africa) nic.nordu.net (Northern Europe) ftp.nis.garr.it (Southern Europe) ftp.ietf.org (Eastern USA) ftp.isi.edu (Western USA) munnari.oz.au (Pacific Rim) Distribution of this memo is unlimited. Copyright Notice Copyright (C) William Allen Simpson (1994-1995, 1997-1998). All Rights Reserved. Abstract This document describes the Cipher Block Chaining mode with addi- tional single or double parallel CheckSums for integrity, used with a number of Internet Encapsulating Security Payload (ESP) transforms. This version is described in terms of 64-bit block ciphers, but can be expanded to other block sizes. Simpson expires in six months [Page i] DRAFT CBCS mode July 1998 1. Introduction The Encapsulating Security Payload (ESP) [RFC-1827x] provides confi- dentiality for IP datagrams by encrypting the payload data to be pro- tected. This specification describes the ESP use of the Cipher Block Chaining (CBC) mode, with an added checksum for integrity. CBC is used to mask patterns of identical blocks within the same datagram. Together with an Initialization Variable (IV) that is dif- ferent for every datagram, identical plaintext payloads will each encrypt to different ciphertext payloads. As an added benefit, when the cipher output is effectively random in appearance (a characteris- tic of a good cipher), masking the plaintext with previous ciphertext will easily strengthen the entropy of the next input to the cipher. A pair of checksums are added to detect changes to the ciphertext, providing a modest degree of integrity in a single pass over the data, rather than using a separate pass with a stronger algorithm. This combination is intended to be reasonably simple to implement, either serially in software or in parallel hardware for performance. Although a simple checksum over multiple blocks would not normally be considered cryptographically secure, the use of a cipher in the mid- dle provides a background of highly random internal chaining. These random blocks are used as a masking function to inhibit determination of the internal state of the checksums. CBC was first defined for DES in [FIPS-81], and generalized by [ISO-8732] and [ISO/IEC-10116]. For a technical exposition on CBC, see [MOV97]. For more explanation and implementation information for CBC, and a useful comparison with other modes of operation, see [Schneier95]. Although this algorithm is described in terms of 64-bit (8 byte) blocks with 128-bit (16 byte) internal chaining values, this could be expanded to other block sizes. 1.1. Design Goals Unlike the CBC stream orientation, CBCS is intended from its incep- tion to be used with the Internet Protocol (IP) datagram network. This environment provides specific limitations on the scope of vul- nerability. Each IP datagram has a length limitation of 576 bytes, unless a larger size is explicitly indicated by the transport layer. Future IP versions extend this length to 1500 bytes. The CBCS integrity is Simpson expires in six months [Page 1] DRAFT CBCS mode July 1998 designed to reliably detect any multiple bit change within the 11,680-bit payload, with an order of at least 1/2**16 probability of failure. This is considerably stronger than the 1/2**7 probability provided by the IP, TCP and UDP checksums. Given that ESP provides a Sequence Number range of 2**32-1, and the window of acceptable values is in the range 2**5 to 2**8, CBCS is intended to be computationally infeasible to predictably and unde- tectably modify a datagram in transit during the time provided by the communication bandwidth-delay product. Software processing cost should be less than half as much as crypto- graphic hashing algorithms (such as MD5 or SHA1) on a little-endian machine, much better on a 32-bit big-endian machine, and nearly neg- ligible on a 64-bit big-endian machine. Hardware performance will benefit from the removal of multiple algo- rithm passes over the same data. The hardware resources required are considerably less than required for MD5 or SHA1. 1.2. Primary CheckSum The primary checksum involves feed-forward through the encryption function, in the same fashion as CBC, through XOR with the plaintext. Any single or multiple bit change in a cipher block (or the datagram IV) will result in a large unpredicatable change to all following cipher blocks. The primary checksum may be used alone. Its strength is based on an integrity secret-key, the related blocksize of the cipher, and is dependent on the strength of the cipher itself. 1.3. Secondary CheckSum The secondary checksum involves only the checksum of successive ciphertext blocks. Any single bit change in a cipher block (or the datagram IV) will result in a small unpredicatable change to all fol- lowing cipher blocks. Multiple bit changes in carefully chosen pairs can be used to reduce the unpredictability. The secondary checksum is probably too weak to be used alone. Its strength is based on an integrity secret-key, the related blocksize of the cipher, and the unpredictability of the cipher output. However, it can be combined with the primary checksum to double the amount of internal chaining for the integrity function. Simpson expires in six months [Page 2] DRAFT CBCS mode July 1998 2. Description IVa P1 P2 Pi | | | | +-----+ v +-----+ v +-----+ v Sa->| S |->(X)->| S |->->(X)->| S |->->(X)-> +-----+ | +-----+ | +-----+ | v ^ v ^ v +-----+ ^ +-----+ ^ +-----+ k->| E | ^ k->| E | ^ k->| E | +-----+ ^ +-----+ ^ +-----+ | ^ | ^ | IVb +->->->+ +->->->+ +->->-> | | | | +-----+ v +-----+ v +-----+ v Sb->| S |->(X)->| S |->->(X)->| S |->->(X)-> +-----+ | +-----+ | +-----+ | v ^ v ^ v +->->->+ +->->->+ +->->-> | | | C1 C2 Ci The Cipher Block CheckSums (CBCS) mode is a simple variant of the CBC mode [RFC-wwww]. For each datagram, the checksums are initialized with separate seeds (Sa, Sb). An Initialization Variable (IV) is split into two parts (IVa, IVb) which are separately summed into the primary and secondary checksums respectively. The primary checksum output XOR masks (X) the first plaintext block (P1). The keyed encryption function (Ek) is followed by another XOR mask (X) with the secondary checksum output, which generates the ciphertext (C1) for the block. For successive blocks, the previous output of the encryption function (before masking) is summed into the primary checksum, and the previ- ous ciphertext block (after masking) is summed into the secondary checksum. The primary checksum output XOR masks (X) the current plaintext (Pi). The keyed encryption function (Ek) is followed by another XOR mask (X) with the secondary checksum output, which gener- ates the ciphertext (Ci) for that block. Simpson expires in six months [Page 3] DRAFT CBCS mode July 1998 C1 C2 Ci | | | IVb +->->->+ +->->->+ +->->-> | | v | v | +-----+ v +-----+ v +-----+ v Sb->| S |->(X)->| S |->->(X)->| S |->->(X)-> +-----+ | +-----+ | +-----+ | v v v +->->->+ +->->->+ +->->-> v v v v v +-----+ v +-----+ v +-----+ k->| D | v k->| D | v k->| D | +-----+ v +-----+ v +-----+ IVa v v v v v | | v | v | +-----+ v +-----+ v +-----+ v Sa->| S |->(X)->| S |->->(X)->| S |->->(X)-> +-----+ | +-----+ | +-----+ | P1 P2 Pi To decrypt, the order of the manipulations is reversed (as shown). Design Notes: The (X) is a crude symbol indicating XOR of the two blocks, but the checksum block is passed unaltered to the next checksum stage. The checksums could be seeded with the encryption key. However, additional keying material is relatively easy to generate, and should provide an increase in the security of the integrity check. Commonly, only one IV of the block size is available. The check- sums could both use the same IV, but separate IVs should provide an increase in the security. Using the plaintext was considered as the addend to the secondary checksum, but this was rejected in fear of revealing some bits of the plaintext, or a cipher relationship between the plaintext and the ciphertext. 3. Initialization Variable CBCS requires an Initialization Variable (IV). The IV conceals ini- tial blocks that repeat in multiple datagrams. For ESP, each datagram generates its IV from material carried in the datagram. This ensures that decryption of the received datagram can Simpson expires in six months [Page 4] DRAFT CBCS mode July 1998 be performed, even when some datagrams are lost, duplicated, or re- ordered in transit. For 64-bit block ciphers: The primary 64-bit IVa is generated from the 32-bit Security Parameters Index (SPI) field followed by (concatenated with) the 32-bit Sequence Number (SN) field. Then, the bit-wise complement of the 32-bit Sequence Number (SN) value is XOR'd with the first 32-bits (SPI): (SPI ^ -SN) || SN The secondary 64-bit IVb is the complement of IVa: (-SPI ^ SN) || -SN The SPI and SN are processed in big-endian order. That is, the most significant byte is the first byte in network transmission order. Security Notes: Each IV is intended to be unique over the lifetime of the ESP cipher session-key(s). A counter is most commonly used to gener- ate the IV, providing an easy method to prevent repetition. However, cryptanalysis might be aided by the rare serendipitous occurrence when the counter repeatedly changes in exactly the same fashion as corresponding bit positions in the first block. Design of specific IV generation techniques must take this into account. Ideally, the IV would be based on explicit fields carried in each datagram, but generated pseudo-randomly and protected from disclo- sure [VK83]. This completely protects the first block from unde- tectable modification. Incorporating the ESP Security Parameters Index (SPI) and the anti-replay ESP Sequence Number (SN) together can provide both uniqueness and mutual protection between the first block and the ESP header. Modification of the SPI to alter the decryption key(s) will prevent correct decryption of the first block. Modi- fication of the SN to avoid anti-replay measures will also prevent correct decryption of the first block. The first block is most likely to contain datagram headers required for delivery. Attempts to modify the IV to deliberately redirect transport head- ers will also likely be detected by the transport checksums. Simpson expires in six months [Page 5] DRAFT CBCS mode July 1998 Inclusion of the bit-wise complement of SN ensures that bit changes are reflected twice in the IV. The checksum step with an undisclosed seed is intended to generate unpredictable initial values. 4. Integrity Unlike CBC alone, CBCS provides integrity for the datagram. A single ciphertext bit change will affect the current block, and every fol- lowing block. The final checksum will give a highly probable indica- tion of the alteration. 4.1. CheckSum Steps The checksum summation functions are calculated in three steps: 1) addition modulo 2**N with end around carry; 2) count of the number of 1-bits in the sum; 3) left circular rotation of the sum by the bit count. This technique can be generalized to both 32-bit and 64-bit regis- ters, over block sizes of 64-bits, 96-bits, 128-bits, and more. All checksum operations are processed in big-endian order. That is, the first byte of a block in transmission order is treated as the most significant byte for addition and left circular rotation. Design Notes: End around carry is a common feature of protocol checksums. For example, for 2**8, 254 + 1 -> 255, 255 + 1 -> 1. 4.2. Single 32-bit CheckSum (CBCS1-32) For ESP with only a primary 64-bit checksum, an optional 32-bit Authenticator (when provided) is calculated in six final steps: a) add and rotate a 64-bit count of the number of processed bits, including the IV, as described for the checksum; b) split the checksum into two 32-bit halves; Simpson expires in six months [Page 6] DRAFT CBCS mode July 1998 c) multiply the two 32-bit halves, yielding a 64-bit product; d) rotate the product by its bit count, as described for the check- sum; e) split the product again into two 32-bit halves; f) XOR the 32-bit halves, yielding a final 32-bit Authenticator value. The Authenticator is deposited in big-endian order. That is, the most significant byte is the first byte in network transmission order. Design Notes: The multiply is intended to confuse and diffuse the semi- independent halves, while the rotations and XOR prevent determina- tion of the final internal state of the checksum. 4.3. Double 32-bit CheckSum (CBCS2-32) For ESP with both primary and secondary 64-bit checksums, an optional 32-bit Authenticator (when provided) is calculated in seven final steps: a) to each checksum, add and rotate a 64-bit count of the number of processed bits, including the IV, as described for the checksum; b) add and rotate the two checksums, as described for the checksum; c) split the 64-bit sum into two 32-bit halves; d) multiply the two 32-bit halves, yielding a 64-bit product; e) rotate the product by its bit count, as described for the check- sum; f) split the product again into two 32-bit halves; g) XOR the 32-bit halves, yielding a final 32-bit Authenticator value. The Authenticator is deposited in big-endian order. That is, the most significant byte is the first byte in network transmission order. Simpson expires in six months [Page 7] DRAFT CBCS mode July 1998 Design Notes: The addition and multiply are intended to confuse and diffuse the independent checksums, in much the same fashion as the Single 32-bit CheckSum. 4.4. Double 64-bit CheckSum (CBCS2-64) For ESP with both primary and secondary 64-bit checksums, an optional 64-bit Authenticator (when provided) is calculated in five final steps: a) to each checksum, add and rotate a 64-bit count of the number of processed bits, including the IV, as described for the checksum; b) multiply the two checksums, yielding a 128-bit product; c) split the product into two 64-bit halves; d) rotate each half of the product by its bit count, as described for the checksum; e) add the halves modulo 2**64 with end around carry, yielding a final 64-bit Authenticator value. The Authenticator is deposited in big-endian order. That is, the most significant byte is the first byte in network transmission order. Design Notes: The larger multiply is intended to more effectively diffuse the checksums, while the rotations and addition prevent determination of the final internal state of the checksums. 5. Collisions This checksum technique is hoped to prevent linear cryptanalysis by incorporating a non-linear rotation function. In case this hope proves illusive, the analysis in [RFC-wwww] should be considered. Simpson expires in six months [Page 8] DRAFT CBCS mode July 1998 A. Rotation Probability Assuming that the cipher produces a uniformly random distribution, the probability for rotation in 32-bits (4,294,967,296) is: 0, 32: 1 1, 31: 32 2, 30: 496 3, 29: 4,960 4, 28: 35,960 5, 27: 201,376 6, 26: 906,192 7, 25: 3,365,856 8, 24: 10,518,300 9, 23: 28,048,800 10, 22: 64,512,240 11, 21: 129,024,480 12, 20: 225,792,840 13, 19: 347,373,600 14, 18: 471,435,600 15, 17: 565,722,720 16: 601,080,390 B. Code Samples Here are a few extracts of code to illustrate basic concepts, based on examples graciously provided by Niels Provos. B.1. Bit Count When a population count is not natively available in the machine instructions, adding the counts per byte can be significantly faster than testing each bit. Machines with byte extract and translate instructions will improve performance. Simpson expires in six months [Page 9] DRAFT CBCS mode July 1998 uint8 bit8ones[] = { /* 1 2 3 4 5 6 7 8 9 a b c d e f */ 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8, }; static int bit32count( uint32 n ) { int result = bit8ones[ n & 0xff ]; n >>= 8; result += bit8ones[ n & 0xff ]; n >>= 8; result += bit8ones[ n & 0xff ]; n >>= 8; return (result + bit8ones[ n & 0xff ]); } B.2. Variable Rotation When a rotation is not natively available in the machine instruc- tions, it can be emulated with an appropriate combination of shifts. Where a fixed shift is faster than a variable shift, 65 switch-cases can improve performance. Simpson expires in six months [Page 10] DRAFT CBCS mode July 1998 static void bit64rotation32( uint32 *cs ) { uint32 csy = cs[1]; uint32 csz = cs[0]; int bcv = bit32count(csz) + bit32count(csy); if ( bcv < 32 ) { /* 0..31 */ cs[0] = (csz << bcv) | (csy >> (32-bcv)); cs[1] = (csy << bcv) | (csz >> (32-bcv)); } else if ( bcv == 32 ) { /* fast swap */ cs[0] = csy; cs[1] = csz; } else { /* 33..64 swap */ cs[0] = (csy << (bcv-32)) | (csz >> (64-bcv)); cs[1] = (csz << (bcv-32)) | (csy >> (64-bcv)); } } B.3. Addition with End Around Carry Addition with carry is usually natively available in the machine instructions. Simpson expires in six months [Page 11] DRAFT CBCS mode July 1998 static void bit64addition32( uint32 *cs, uint32 *addend ) { uint32 topy = cs[1] | addend[1]; uint32 topz = cs[0] | addend[0]; cs[0] += addend[0]; if ( cs[0] < topz ) { cs[1]++; } cs[1] += addend[1]; if ( cs[1] < topy ) { cs[0]++; } } Operational Considerations The specification provides only a few manually configurable parame- ters: Primary Seed A primary seed is configured in order prior to any keys. The value is the same size as the block size of the cipher algorithm. Secondary Seed An optional secondary seed is configured in order following any keys. The value is the same size as the block size of the cipher algorithm. The configured seed values are interpreted in big-endian order. That is, the most significant byte of the seed is the leading byte. Simpson expires in six months [Page 12] DRAFT CBCS mode July 1998 Security Considerations Specific security limitations are described as notes in the relevant sections. Acknowledgements Some of the text of this specification was derived from earlier work by William Allen Simpson and Perry Metzger in multiple Request for Comments. The checksum technique is based upon a common use of the CDC 6000 series 60-bit registers for "normal" protocols. Fortuitously, those machines featured both fast variable rotate and a population count instruction. Niels Provos and David Wagner provided useful critiques of earlier versions of this document. References [ISO-8732] "Banking -- Key management (wholesale)", International Organization for Standardization, 1988. [ISO/IEC-10116] "Information Processing -- Modes of Operation for an n- bit block cipher algorithm", International Organization for Standardization, 1991. [FIPS-81] US National Bureau of Standards, "DES Modes of Opera- tion", Federal Information Processing Standard Publica- tion 81, December 1980. [MOV97] Menezes, A.J., van Oorschot, P., and Vanstone, S., "Hand- book of Applied Cryptography", CRC Press, 1997. [RFC-1827x] Simpson, W., "IP Encapsulating Security Protocol (ESP) for implementors", work in progress. [RFC-wwww] Simpson, W.A, "ESP with Cipher Block Chaining (CBC)", work in progress. [Schneier95] Schneier, B., "Applied Cryptography Second Edition", John Wiley & Sons, New York, NY, 1995. ISBN 0-471-12845-7. Simpson expires in six months [Page 13] DRAFT CBCS mode July 1998 Contacts Comments about this document should be discussed on the ipsec@tis.com mailing list. Questions about this document can also be directed to: William Allen Simpson DayDreamer Computer Systems Consulting Services 1384 Fontaine Madison Heights, Michigan 48071 wsimpson@UMich.edu wsimpson@GreenDragon.com (preferred) Full Copyright Statement Copyright (C) William Allen Simpson (1994-1995, 1997-1998). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this doc- ument itself may not be modified in any way, except as required to translate it into languages other than English. This document and the information contained herein is provided on an "AS IS" basis and the author(s) DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING (BUT NOT LIMITED TO) ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Simpson expires in six months [Page 14]