Network Working Group                           W A Simpson [DayDreamer]
Internet Draft
expires in six months                                          July 1998


                 ESP with Cipher Block CheckSums (CBCS)
                       draft-simpson-cbcs-01.txt


Status of this Memo

   This document is an Internet-Draft.  Internet Drafts are working doc-
   uments of the Internet Engineering Task Force (IETF), its Areas, and
   its Working Groups.  Note that other groups may also distribute work-
   ing documents as Internet Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months, and may be updated, replaced, or obsoleted by other documents
   at any time.  It is not appropriate to use Internet Drafts as refer-
   ence material, or to cite them other than as a ``working draft'' or
   ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the internet-drafts Shadow
   Directories on:

      ftp.is.co.za (Africa)
      nic.nordu.net (Northern Europe)
      ftp.nis.garr.it (Southern Europe)
      ftp.ietf.org (Eastern USA)
      ftp.isi.edu (Western USA)
      munnari.oz.au (Pacific Rim)

   Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) William Allen Simpson (1994-1995, 1997-1998).  All
   Rights Reserved.

Abstract

   This document describes the Cipher Block Chaining mode with addi-
   tional single or double parallel CheckSums for integrity, used with a
   number of Internet Encapsulating Security Payload (ESP) transforms.
   This version is described in terms of 64-bit block ciphers, but can
   be expanded to other block sizes.


Simpson                   expires in six months                 [Page i]
DRAFT                           CBCS mode                      July 1998


1.  Introduction

   The Encapsulating Security Payload (ESP) [RFC-1827x] provides confi-
   dentiality for IP datagrams by encrypting the payload data to be pro-
   tected.  This specification describes the ESP use of the Cipher Block
   Chaining (CBC) mode, with an added checksum for integrity.

   CBC is used to mask patterns of identical blocks within the same
   datagram.  Together with an Initialization Variable (IV) that is dif-
   ferent for every datagram, identical plaintext payloads will each
   encrypt to different ciphertext payloads.  As an added benefit, when
   the cipher output is effectively random in appearance (a characteris-
   tic of a good cipher), masking the plaintext with previous ciphertext
   will easily strengthen the entropy of the next input to the cipher.

   A pair of checksums are added to detect changes to the ciphertext,
   providing a modest degree of integrity in a single pass over the
   data, rather than using a separate pass with a stronger algorithm.
   This combination is intended to be reasonably simple to implement,
   either serially in software or in parallel hardware for performance.

   Although a simple checksum over multiple blocks would not normally be
   considered cryptographically secure, the use of a cipher in the mid-
   dle provides a background of highly random internal chaining.  These
   random blocks are used as a masking function to inhibit determination
   of the internal state of the checksums.

   CBC was first defined for DES in [FIPS-81], and generalized by
   [ISO-8732] and [ISO/IEC-10116].  For a technical exposition on CBC,
   see [MOV97].  For more explanation and implementation information for
   CBC, and a useful comparison with other modes of operation, see
   [Schneier95].

   Although this algorithm is described in terms of 64-bit (8 byte)
   blocks with 128-bit (16 byte) internal chaining values, this could be
   expanded to other block sizes.


1.1.  Design Goals

   Unlike the CBC stream orientation, CBCS is intended from its incep-
   tion to be used with the Internet Protocol (IP) datagram network.
   This environment provides specific limitations on the scope of vul-
   nerability.

   Each IP datagram has a length limitation of 576 bytes, unless a
   larger size is explicitly indicated by the transport layer.  Future
   IP versions extend this length to 1500 bytes.  The CBCS integrity is


Simpson                   expires in six months                 [Page 1]
DRAFT                           CBCS mode                      July 1998


   designed to reliably detect any multiple bit change within the
   11,680-bit payload, with an order of at least 1/2**16 probability of
   failure.  This is considerably stronger than the 1/2**7 probability
   provided by the IP, TCP and UDP checksums.

   Given that ESP provides a Sequence Number range of 2**32-1, and the
   window of acceptable values is in the range 2**5 to 2**8, CBCS is
   intended to be computationally infeasible to predictably and unde-
   tectably modify a datagram in transit during the time provided by the
   communication bandwidth-delay product.

   Software processing cost should be less than half as much as crypto-
   graphic hashing algorithms (such as MD5 or SHA1) on a little-endian
   machine, much better on a 32-bit big-endian machine, and nearly neg-
   ligible on a 64-bit big-endian machine.

   Hardware performance will benefit from the removal of multiple algo-
   rithm passes over the same data.  The hardware resources required are
   considerably less than required for MD5 or SHA1.


1.2.  Primary CheckSum

   The primary checksum involves feed-forward through the encryption
   function, in the same fashion as CBC, through XOR with the plaintext.
   Any single or multiple bit change in a cipher block (or the datagram
   IV) will result in a large unpredicatable change to all following
   cipher blocks.

   The primary checksum may be used alone.  Its strength is based on an
   integrity secret-key, the related blocksize of the cipher, and is
   dependent on the strength of the cipher itself.


1.3.  Secondary CheckSum

   The secondary checksum involves only the checksum of successive
   ciphertext blocks.  Any single bit change in a cipher block (or the
   datagram IV) will result in a small unpredicatable change to all fol-
   lowing cipher blocks.  Multiple bit changes in carefully chosen pairs
   can be used to reduce the unpredictability.

   The secondary checksum is probably too weak to be used alone.  Its
   strength is based on an integrity secret-key, the related blocksize
   of the cipher, and the unpredictability of the cipher output.

   However, it can be combined with the primary checksum to double the
   amount of internal chaining for the integrity function.


Simpson                   expires in six months                 [Page 2]
DRAFT                           CBCS mode                      July 1998


2.  Description

              IVa     P1              P2              Pi
               |      |               |               |
            +-----+   v   +-----+     v   +-----+     v
        Sa->|  S  |->(X)->|  S  |->->(X)->|  S  |->->(X)->
            +-----+   |   +-----+     |   +-----+     |
                      v      ^        v      ^        v
                   +-----+   ^     +-----+   ^     +-----+
                k->|  E  |   ^  k->|  E  |   ^  k->|  E  |
                   +-----+   ^     +-----+   ^     +-----+
                      |      ^        |      ^        |
              IVb     +->->->+        +->->->+        +->->->
               |      |               |               |
            +-----+   v   +-----+     v   +-----+     v
        Sb->|  S  |->(X)->|  S  |->->(X)->|  S  |->->(X)->
            +-----+   |   +-----+     |   +-----+     |
                      v      ^        v      ^        v
                      +->->->+        +->->->+        +->->->
                      |               |               |
                      C1              C2              Ci

   The Cipher Block CheckSums (CBCS) mode is a simple variant of the CBC
   mode [RFC-wwww].

   For each datagram, the checksums are initialized with separate seeds
   (Sa, Sb).  An Initialization Variable (IV) is split into two parts
   (IVa, IVb) which are separately summed into the primary and secondary
   checksums respectively.

   The primary checksum output XOR masks (X) the first plaintext block
   (P1).  The keyed encryption function (Ek) is followed by another XOR
   mask (X) with the secondary checksum output, which generates the
   ciphertext (C1) for the block.

   For successive blocks, the previous output of the encryption function
   (before masking) is summed into the primary checksum, and the previ-
   ous ciphertext block (after masking) is summed into the secondary
   checksum.  The primary checksum output XOR masks (X) the current
   plaintext (Pi).  The keyed encryption function (Ek) is followed by
   another XOR mask (X) with the secondary checksum output, which gener-
   ates the ciphertext (Ci) for that block.


Simpson                   expires in six months                 [Page 3]
DRAFT                           CBCS mode                      July 1998


                      C1              C2              Ci
                      |               |               |
              IVb     +->->->+        +->->->+        +->->->
               |      |      v        |      v        |
            +-----+   v   +-----+     v   +-----+     v
        Sb->|  S  |->(X)->|  S  |->->(X)->|  S  |->->(X)->
            +-----+   |   +-----+     |   +-----+     |
                      v               v               v
                      +->->->+        +->->->+        +->->->
                      v      v        v      v        v
                   +-----+   v     +-----+   v     +-----+
                k->|  D  |   v  k->|  D  |   v  k->|  D  |
                   +-----+   v     +-----+   v     +-----+
              IVa     v      v        v      v        v
               |      |      v        |      v        |
            +-----+   v   +-----+     v   +-----+     v
        Sa->|  S  |->(X)->|  S  |->->(X)->|  S  |->->(X)->
            +-----+   |   +-----+     |   +-----+     |
                      P1              P2              Pi

   To decrypt, the order of the manipulations is reversed (as shown).

   Design Notes:

      The (X) is a crude symbol indicating XOR of the two blocks, but
      the checksum block is passed unaltered to the next checksum stage.

      The checksums could be seeded with the encryption key.  However,
      additional keying material is relatively easy to generate, and
      should provide an increase in the security of the integrity check.

      Commonly, only one IV of the block size is available.  The check-
      sums could both use the same IV, but separate IVs should provide
      an increase in the security.

      Using the plaintext was considered as the addend to the secondary
      checksum, but this was rejected in fear of revealing some bits of
      the plaintext, or a cipher relationship between the plaintext and
      the ciphertext.


3.  Initialization Variable

   CBCS requires an Initialization Variable (IV).  The IV conceals ini-
   tial blocks that repeat in multiple datagrams.

   For ESP, each datagram generates its IV from material carried in the
   datagram.  This ensures that decryption of the received datagram can


Simpson                   expires in six months                 [Page 4]
DRAFT                           CBCS mode                      July 1998


   be performed, even when some datagrams are lost, duplicated, or re-
   ordered in transit.

   For 64-bit block ciphers:

      The primary 64-bit IVa is generated from the 32-bit Security
      Parameters Index (SPI) field followed by (concatenated with) the
      32-bit Sequence Number (SN) field.  Then, the bit-wise complement
      of the 32-bit Sequence Number (SN) value is XOR'd with the first
      32-bits (SPI):

         (SPI ^ -SN) || SN

      The secondary 64-bit IVb is the complement of IVa:

         (-SPI ^ SN) || -SN

      The SPI and SN are processed in big-endian order.  That is, the
      most significant byte is the first byte in network transmission
      order.

   Security Notes:

      Each IV is intended to be unique over the lifetime of the ESP
      cipher session-key(s).  A counter is most commonly used to gener-
      ate the IV, providing an easy method to prevent repetition.

      However, cryptanalysis might be aided by the rare serendipitous
      occurrence when the counter repeatedly changes in exactly the same
      fashion as corresponding bit positions in the first block.  Design
      of specific IV generation techniques must take this into account.

      Ideally, the IV would be based on explicit fields carried in each
      datagram, but generated pseudo-randomly and protected from disclo-
      sure [VK83].  This completely protects the first block from unde-
      tectable modification.

      Incorporating the ESP Security Parameters Index (SPI) and the
      anti-replay ESP Sequence Number (SN) together can provide both
      uniqueness and mutual protection between the first block and the
      ESP header.  Modification of the SPI to alter the decryption
      key(s) will prevent correct decryption of the first block.  Modi-
      fication of the SN to avoid anti-replay measures will also prevent
      correct decryption of the first block.  The first block is most
      likely to contain datagram headers required for delivery.
      Attempts to modify the IV to deliberately redirect transport head-
      ers will also likely be detected by the transport checksums.


Simpson                   expires in six months                 [Page 5]
DRAFT                           CBCS mode                      July 1998


      Inclusion of the bit-wise complement of SN ensures that bit
      changes are reflected twice in the IV.

      The checksum step with an undisclosed seed is intended to generate
      unpredictable initial values.


4.  Integrity

   Unlike CBC alone, CBCS provides integrity for the datagram.  A single
   ciphertext bit change will affect the current block, and every fol-
   lowing block.  The final checksum will give a highly probable indica-
   tion of the alteration.


4.1.  CheckSum Steps

   The checksum summation functions are calculated in three steps:

   1) addition modulo 2**N with end around carry;

   2) count of the number of 1-bits in the sum;

   3) left circular rotation of the sum by the bit count.

   This technique can be generalized to both 32-bit and 64-bit regis-
   ters, over block sizes of 64-bits, 96-bits, 128-bits, and more.

   All checksum operations are processed in big-endian order.  That is,
   the first byte of a block in transmission order is treated as the
   most significant byte for addition and left circular rotation.

   Design Notes:

      End around carry is a common feature of protocol checksums.  For
      example, for 2**8, 254 + 1 -> 255, 255 + 1 -> 1.


4.2.  Single 32-bit CheckSum (CBCS1-32)

   For ESP with only a primary 64-bit checksum, an optional 32-bit
   Authenticator (when provided) is calculated in six final steps:

   a) add and rotate a 64-bit count of the number of processed bits,
      including the IV, as described for the checksum;

   b) split the checksum into two 32-bit halves;


Simpson                   expires in six months                 [Page 6]
DRAFT                           CBCS mode                      July 1998


   c) multiply the two 32-bit halves, yielding a 64-bit product;

   d) rotate the product by its bit count, as described for the check-
      sum;

   e) split the product again into two 32-bit halves;

   f) XOR the 32-bit halves, yielding a final 32-bit Authenticator
      value.

   The Authenticator is deposited in big-endian order.  That is, the
   most significant byte is the first byte in network transmission
   order.

   Design Notes:

      The multiply is intended to confuse and diffuse the semi-
      independent halves, while the rotations and XOR prevent determina-
      tion of the final internal state of the checksum.


4.3.  Double 32-bit CheckSum (CBCS2-32)

   For ESP with both primary and secondary 64-bit checksums, an optional
   32-bit Authenticator (when provided) is calculated in seven final
   steps:

   a) to each checksum, add and rotate a 64-bit count of the number of
      processed bits, including the IV, as described for the checksum;

   b) add and rotate the two checksums, as described for the checksum;

   c) split the 64-bit sum into two 32-bit halves;

   d) multiply the two 32-bit halves, yielding a 64-bit product;

   e) rotate the product by its bit count, as described for the check-
      sum;

   f) split the product again into two 32-bit halves;

   g) XOR the 32-bit halves, yielding a final 32-bit Authenticator
      value.

   The Authenticator is deposited in big-endian order.  That is, the
   most significant byte is the first byte in network transmission
   order.


Simpson                   expires in six months                 [Page 7]
DRAFT                           CBCS mode                      July 1998


   Design Notes:

      The addition and multiply are intended to confuse and diffuse the
      independent checksums, in much the same fashion as the Single
      32-bit CheckSum.


4.4.  Double 64-bit CheckSum (CBCS2-64)

   For ESP with both primary and secondary 64-bit checksums, an optional
   64-bit Authenticator (when provided) is calculated in five final
   steps:

   a) to each checksum, add and rotate a 64-bit count of the number of
      processed bits, including the IV, as described for the checksum;

   b) multiply the two checksums, yielding a 128-bit product;

   c) split the product into two 64-bit halves;

   d) rotate each half of the product by its bit count, as described for
      the checksum;

   e) add the halves modulo 2**64 with end around carry, yielding a
      final 64-bit Authenticator value.

   The Authenticator is deposited in big-endian order.  That is, the
   most significant byte is the first byte in network transmission
   order.

   Design Notes:

      The larger multiply is intended to more effectively diffuse the
      checksums, while the rotations and addition prevent determination
      of the final internal state of the checksums.


5.  Collisions

   This checksum technique is hoped to prevent linear cryptanalysis by
   incorporating a non-linear rotation function.  In case this hope
   proves illusive, the analysis in [RFC-wwww] should be considered.


Simpson                   expires in six months                 [Page 8]
DRAFT                           CBCS mode                      July 1998


A.  Rotation Probability

   Assuming that the cipher produces a uniformly random distribution,
   the probability for rotation in 32-bits (4,294,967,296) is:

       0, 32:           1
       1, 31:          32
       2, 30:         496
       3, 29:       4,960
       4, 28:      35,960
       5, 27:     201,376
       6, 26:     906,192
       7, 25:   3,365,856
       8, 24:  10,518,300
       9, 23:  28,048,800
      10, 22:  64,512,240
      11, 21: 129,024,480
      12, 20: 225,792,840
      13, 19: 347,373,600
      14, 18: 471,435,600
      15, 17: 565,722,720
          16: 601,080,390


B.  Code Samples

   Here are a few extracts of code to illustrate basic concepts, based
   on examples graciously provided by Niels Provos.


B.1.  Bit Count

   When a population count is not natively available in the machine
   instructions, adding the counts per byte can be significantly faster
   than testing each bit.  Machines with byte extract and translate
   instructions will improve performance.


Simpson                   expires in six months                 [Page 9]
DRAFT                           CBCS mode                      July 1998


      uint8 bit8ones[] =
      {   /*  1   2   3   4   5   6   7   8   9   a   b   c   d   e   f */
          0,  1,  1,  2,  1,  2,  2,  3,  1,  2,  2,  3,  2,  3,  3,  4,
          1,  2,  2,  3,  2,  3,  3,  4,  2,  3,  3,  4,  3,  4,  4,  5,
          1,  2,  2,  3,  2,  3,  3,  4,  2,  3,  3,  4,  3,  4,  4,  5,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          1,  2,  2,  3,  2,  3,  3,  4,  2,  3,  3,  4,  3,  4,  4,  5,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          3,  4,  4,  5,  4,  5,  5,  6,  4,  5,  5,  6,  5,  6,  6,  7,

          1,  2,  2,  3,  2,  3,  3,  4,  2,  3,  3,  4,  3,  4,  4,  5,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          3,  4,  4,  5,  4,  5,  5,  6,  4,  5,  5,  6,  5,  6,  6,  7,
          2,  3,  3,  4,  3,  4,  4,  5,  3,  4,  4,  5,  4,  5,  5,  6,
          3,  4,  4,  5,  4,  5,  5,  6,  4,  5,  5,  6,  5,  6,  6,  7,
          3,  4,  4,  5,  4,  5,  5,  6,  4,  5,  5,  6,  5,  6,  6,  7,
          4,  5,  5,  6,  5,  6,  6,  7,  5,  6,  6,  7,  6,  7,  7,  8,
      };

      static int
      bit32count( uint32 n )
      {
          int result = bit8ones[ n & 0xff ];
          n >>= 8;
          result += bit8ones[ n & 0xff ];
          n >>= 8;
          result += bit8ones[ n & 0xff ];
          n >>= 8;
          return (result + bit8ones[ n & 0xff ]);
      }


B.2.  Variable Rotation

   When a rotation is not natively available in the machine instruc-
   tions, it can be emulated with an appropriate combination of shifts.
   Where a fixed shift is faster than a variable shift, 65 switch-cases
   can improve performance.


Simpson                   expires in six months                [Page 10]
DRAFT                           CBCS mode                      July 1998


      static void
      bit64rotation32( uint32 *cs )
      {
          uint32 csy = cs[1];
          uint32 csz = cs[0];
          int bcv = bit32count(csz) + bit32count(csy);

          if ( bcv < 32 )
          {
              /* 0..31 */
              cs[0] = (csz << bcv) | (csy >> (32-bcv));
              cs[1] = (csy << bcv) | (csz >> (32-bcv));
          }
          else if ( bcv == 32 )
          {
              /* fast swap */
              cs[0] = csy;
              cs[1] = csz;
          }
          else
          {
              /* 33..64 swap */
              cs[0] = (csy << (bcv-32)) | (csz >> (64-bcv));
              cs[1] = (csz << (bcv-32)) | (csy >> (64-bcv));
          }
      }


B.3.  Addition with End Around Carry

   Addition with carry is usually natively available in the machine
   instructions.


Simpson                   expires in six months                [Page 11]
DRAFT                           CBCS mode                      July 1998


      static void
      bit64addition32( uint32 *cs, uint32 *addend )
      {
          uint32 topy = cs[1] | addend[1];
          uint32 topz = cs[0] | addend[0];

          cs[0] += addend[0];
          if ( cs[0] < topz )
          {
              cs[1]++;
          }
          cs[1] += addend[1];
          if ( cs[1] < topy )
          {
              cs[0]++;
          }
      }


Operational Considerations

   The specification provides only a few manually configurable parame-
   ters:

   Primary Seed
      A primary seed is configured in order prior to any keys.  The
      value is the same size as the block size of the cipher algorithm.

   Secondary Seed
      An optional secondary seed is configured in order following any
      keys.  The value is the same size as the block size of the cipher
      algorithm.

   The configured seed values are interpreted in big-endian order.  That
   is, the most significant byte of the seed is the leading byte.


Simpson                   expires in six months                [Page 12]
DRAFT                           CBCS mode                      July 1998


Security Considerations

   Specific security limitations are described as notes in the relevant
   sections.


Acknowledgements

   Some of the text of this specification was derived from earlier work
   by William Allen Simpson and Perry Metzger in multiple Request for
   Comments.

   The checksum technique is based upon a common use of the CDC 6000
   series 60-bit registers for "normal" protocols.  Fortuitously, those
   machines featured both fast variable rotate and a population count
   instruction.

   Niels Provos and David Wagner provided useful critiques of earlier
   versions of this document.


References

   [ISO-8732]  "Banking -- Key management (wholesale)", International
               Organization for Standardization, 1988.

   [ISO/IEC-10116]
               "Information Processing -- Modes of Operation for an n-
               bit block cipher algorithm", International Organization
               for Standardization, 1991.

   [FIPS-81]   US National Bureau of Standards, "DES Modes of Opera-
               tion", Federal Information Processing Standard Publica-
               tion 81, December 1980.

   [MOV97]     Menezes, A.J., van Oorschot, P., and Vanstone, S., "Hand-
               book of Applied Cryptography", CRC Press, 1997.

   [RFC-1827x] Simpson, W., "IP Encapsulating Security Protocol (ESP)
               for implementors", work in progress.

   [RFC-wwww]  Simpson, W.A, "ESP with Cipher Block Chaining (CBC)",
               work in progress.

   [Schneier95]
               Schneier, B., "Applied Cryptography Second Edition", John
               Wiley & Sons, New York, NY, 1995.  ISBN 0-471-12845-7.


Simpson                   expires in six months                [Page 13]
DRAFT                           CBCS mode                      July 1998


Contacts

   Comments about this document should be discussed on the ipsec@tis.com
   mailing list.

   Questions about this document can also be directed to:

      William Allen Simpson
      DayDreamer
      Computer Systems Consulting Services
      1384 Fontaine
      Madison Heights, Michigan  48071

          wsimpson@UMich.edu
          wsimpson@GreenDragon.com (preferred)


Full Copyright Statement

   Copyright (C) William Allen Simpson (1994-1995, 1997-1998).  All
   Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this doc-
   ument itself may not be modified in any way, except as required to
   translate it into languages other than English.

   This document and the information contained herein is provided on an
   "AS IS" basis and the author(s) DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING (BUT NOT LIMITED TO) ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Simpson                   expires in six months                [Page 14]