idnits 2.17.1 

draft-schaad-smime-aed-rant-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 107: '...with the statement "The recipient MUST...'
     RFC 2119 keyword, line 132: '...             originatorInfo [0] IMPLICIT OriginatorInfo OPTIONAL,...'
     RFC 2119 keyword, line 135: '...             authAttrs [1] IMPLICIT AuthAttributes OPTIONAL,...'
     RFC 2119 keyword, line 137: '...             unauthAttrs [2] IMPLICIT UnauthAttributes OPTIONAL }...'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 177 has weird spacing: '...ge Data  is th...'

  == Line 182 has weird spacing: '...ed Data  is th...'

  == Line 186 has weird spacing: '...ion Tag  is a ...'

  == Line 193 has weird spacing: '...g Model  is a ...'

  -- The document date (April 6, 2011) is 4766 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '0' on line 132

  -- Looks like a reference, but probably isn't: '1' on line 135

  -- Looks like a reference, but probably isn't: '2' on line 137


     Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                          J. Schaad
3	Internet-Draft                                   Soaring Hawk Consulting
4	Intended status: Informational                             April 6, 2011
5	Expires: October 8, 2011

7	  Commentary on the Design of the Authenticated-Enveloped-Data Content
8	                                  Type
9	                     draft-schaad-smime-aed-rant-02

11	Abstract

13	   The Authenticated-Enveloped-Data Content Type allows for the use of
14	   Authenticated-Enveloped modes with block cipher algorithms.  At the
15	   time of the original design there was discussion about the relative
16	   location of the authenticated attributes and the encrypted content in
17	   the ASN.1 structure.  With the benefits of implementation experience
18	   I revisit the discussion made at the time and re-evaluate the
19	   decision made.

21	Status of this Memo

23	   This Internet-Draft is submitted in full conformance with the
24	   provisions of BCP 78 and BCP 79.

26	   Internet-Drafts are working documents of the Internet Engineering
27	   Task Force (IETF).  Note that other groups may also distribute
28	   working documents as Internet-Drafts.  The list of current Internet-
29	   Drafts is at http://datatracker.ietf.org/drafts/current/.

31	   Internet-Drafts are draft documents valid for a maximum of six months
32	   and may be updated, replaced, or obsoleted by other documents at any
33	   time.  It is inappropriate to use Internet-Drafts as reference
34	   material or to cite them other than as "work in progress."

36	   This Internet-Draft will expire on October 8, 2011.

38	Copyright Notice

40	   Copyright (c) 2011 IETF Trust and the persons identified as the
41	   document authors.  All rights reserved.

43	   This document is subject to BCP 78 and the IETF Trust's Legal
44	   Provisions Relating to IETF Documents
45	   (http://trustee.ietf.org/license-info) in effect on the date of
46	   publication of this document.  Please review these documents
47	   carefully, as they describe your rights and restrictions with respect
48	   to this document.  Code Components extracted from this document must
49	   include Simplified BSD License text as described in Section 4.e of
50	   the Trust Legal Provisions and are provided without warranty as
51	   described in the Simplified BSD License.

53	Table of Contents

55	   1.  Introduction
56	     1.1.  Terminology
57	   2.  Historic Arguments
58	   3.  Algorithm Taxonomy
59	     3.1.  CCM: Counter with CBC-MAC
60	     3.2.  CS: Cipher-State
61	     3.3.  CWC: Carter Wegman with Counter
62	     3.4.  EAX: A Conventional Authenticated-Encryption Mode
63	     3.5.  GCM: Galois/Counter Mode
64	     3.6.  IACBC: Integrity Aware Cipher Block Chaining
65	     3.7.  IAPM: Integrity Aware Parallelizable Mode
66	     3.8.  OCB: Offset Codebook
67	     3.9.  PCFB: Propagating Cipher Feedback
68	     3.10. SIV: Synthetic IV
69	     3.11. XCBC: eXtended Cipher Block Chaining Encryption
70	     3.12. MAC-Authenticated Encryption
71	   4.  My Assumptions
72	   5.  Conclusions
73	   6.  Responses
74	   7.  Security Considerations
75	   8.  IANA Considerations
76	   9.  References
77	     9.1.  Normative References
78	     9.2.  Informative References
79	   Author's Address

81	1.  Introduction

83	   When the Cryptographic Message Syntax (CMS) [CMS] Authenticated-
84	   Enveloped-Data content type (defined in RFC 5083 [CMS-AED]) was being
85	   discussed, the S/MIME working group had no actual implementation
86	   experience to guide it in some of the decisions that were being made
87	   at the time.  In this document I am revisiting one of these decisions
88	   based on the implementation experience that I have since garnered.

90	   Issues that were discussed at the time included:

92	      What should the order of the authenticated attributes, the
93	      encrypted data and the authentication code be in the ASN.1
94	      structure?  There was uniform agreement that the authentication
95	      code should be last; however, the placement of the other two
96	      fields was hotly disputed.  This is the issue that we further
97	      address below.

99	      Should we change from using a SET to a SEQUENCE for the attribute
100	      list?  Doing so would have simplified the encoding processing for
101	      hashing.  There was no support for doing this, as a common routine
102	      already existed that worked for the signed and authenticated data
103	      structures.

105	      What are the security issues that deal with the timing of release
106	      of the encrypted content vs. the validation step?  This issue was
107	      addressed in section 2 with the statement "The recipient MUST
108	      verify the integrity of the received content before releasing any
109	      information, especially the plaintext of the content."

111	      Step 5 in section 2 says that padding needs to be done to the
112	      block length; however, there was some concern that the issue of
113	      how padding should be done is better left to the algorithm
114	      description rather than being specified here.  No changes were
115	      made to address the issue.

117	   The major focus of the discussions centered on the relative placement
118	   of the encrypted data blob (contained in the authEncryptedContentInfo
119	   field) and the authenticated attributes (contained in the authAttrs
120	   field).  There were three different camps that emerged.  These were:
121	   1) The attributes should be before the encrypted data, 2) The
122	   attributes should be after the encrypted data, and 3) There should be
123	   the ability to place the attributes both before and after the
124	   encrypted data and the encoder would choose which to use.  As can be
125	   seen from the ASN.1 in Figure 1 the final decision was to place the
126	   authenticated attributes after the encrypted content.  This was
127	   counter to the arguments that I made at the time to place the
128	   authenticated attributes before the encrypted content.

130	           AuthEnvelopedData ::= SEQUENCE {
131	             version CMSVersion,
132	             originatorInfo [0] IMPLICIT OriginatorInfo OPTIONAL,
133	             recipientInfos RecipientInfos,
134	             authEncryptedContentInfo EncryptedContentInfo,
135	             authAttrs [1] IMPLICIT AuthAttributes OPTIONAL,
136	             mac MessageAuthenticationCode,
137	             unauthAttrs [2] IMPLICIT UnauthAttributes OPTIONAL }

139	                 Figure 1: AuthEnvelopedData ASN.1 Extract

141	   This document is organized as follows:

143	   o  Section 2 contains a review of the arguments presented at the
144	      time.

146	   o  Section 3 has a taxonomy of a number of authenticated encryption
147	      algorithms.

149	   o  Section 4 presents a set of criteria to be used.

151	   o  Section 5 contains my personal conclusions on the issue.

153	   o  Section 6 contains rebuttals (or maybe not).

155	   The major part of my discussion focuses on the desirability to use a
156	   streaming model for processing the ASN.1 structure and the data
157	   contained within it.  If one does not want to use streaming in doing
158	   the processing, then much of the discussion here is moot.  If one is
159	   willing to buffer up all of the input to the encryption algorithm
160	   before applying it, the order in which the inputs are presented is
161	   immaterial.  This will be further detailed in Section 4.
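
   As a minimal sketch of what "streaming" means for a recipient here,
   the following Python fragment walks the field order from Figure 1 and
   reports which inputs arrive too late if the algorithm needs them
   before the first byte of encrypted content.  The function name and
   the idea of passing in a list of required fields are assumptions made
   for illustration only; they do not come from any CMS implementation.

```python
# Field order taken from the AuthEnvelopedData extract in Figure 1.
FIELD_ORDER = [
    "version",
    "originatorInfo",
    "recipientInfos",
    "authEncryptedContentInfo",
    "authAttrs",
    "mac",
    "unauthAttrs",
]


def late_inputs(needed_before_content):
    """Return the fields that are needed before the encrypted content can
    be processed but are encoded after it in the structure.

    A non-empty result means a streaming recipient has to buffer the
    encrypted content until those fields have been decoded.
    (Hypothetical helper, for illustration only.)
    """
    content_pos = FIELD_ORDER.index("authEncryptedContentInfo")
    return [
        field
        for field in needed_before_content
        if FIELD_ORDER.index(field) > content_pos
    ]


# An algorithm that must see the authenticated attributes first cannot
# be streamed by the recipient with the current field order.
print(late_inputs(["authAttrs"]))       # ['authAttrs']
print(late_inputs(["recipientInfos"]))  # []
```

   If the result is empty, the ASN.1 decoding and the cryptographic
   processing can be interleaved; otherwise the content has to be
   cached first.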

163	1.1.  Terminology

165	   The following is a list of standardized terms used in the document:

167	   AE is an abbreviation for Authenticated Encryption.  This is a block
168	      cipher mode of operation which simultaneously provides
169	      confidentiality and integrity assurances on the data.

171	   AEAD  is an abbreviation for Authenticated Encryption with Auxiliary
172	      Data.  This is a block cipher mode of operation which
173	      simultaneously provides confidentiality and integrity assurances
174	      on the message data as well as integrity assurances on an
175	      additional set of data.

177	   Message Data  is the section of the input data that is to be
178	      authenticated and encrypted by the AE or AEAD algorithm mode.  For
179	      CMS, the encrypted message data is placed in the encryptedContent
180	      field of the authEncryptedContentInfo sequence.

182	   Authenticated Data  is the section of input data that is to be
183	      authenticated but not encrypted.  For CMS, the authenticated data
184	      is the sequence in the authAttrs field.

186	   Authentication Tag  is a value that is generated by the mode which is
187	      used to validate the integrity of the data.  The Authentication
188	      Tag is sometimes implicit and does not exist as an independent
189	      value.  For CMS, it is assumed that the use of the algorithm will
190	      define an explicit tag and the tag will be placed in the mac
191	      field.

193	   Streaming Model  is a method of doing the processing such that the
194	      ASN.1 processing and the cryptographic processing can be
195	      interleaved with each other.

197	2.  Historic Arguments

199	   I have gone through the archived mailing list from the time to find
200	   the arguments that were being advanced.  The arguments are laid out
201	   with the PRO side being for the attributes being placed after the
202	   data, except for the last item in the list.

204	   1.  Consistency with the existing CMS data types:

206	          PRO: We have working implementations of both AuthenticatedData
207	          and SignedData.  In both of these cases the data structures
208	          are ordered such that the message data precedes the
209	          authenticated data.  Keeping the order consistent makes coding
210	          easier and leads to fewer mistakes.

212	          CON: Being consistent is nice; however, if it does not work
213	          correctly that does not matter.

215	   2.  Authenticated attributes may be derived from the message content:

217	          PRO: It should be possible to create authenticated attributes
218	          based on the content of the encrypted data and have these
219	          attributes authenticated.  Placing the attribute before the
220	          message content means that one must buffer the message content
221	          to do this.  The example of this presented on the mailing list
222	          was the ability for a sender to run the body of the message
223	          through a virus checker on the fly and publish the result of
224	          the virus checking as an authenticated attribute.  This is the
225	          same thing that happens today for both SignedData and
226	          AuthenticatedData, where the hash of the message data is
227	          computed on the fly and then placed in the signed/
228	          authenticated attributes, which are then processed to compute
229	          the signature or MAC values.

231	          CON: Placing this information after the message data means
232	          that the recipient cannot know to perform matching processing,
233	          if necessary, in order to check the value presented by the
234	          sender.  The analogous step for the SignedData structure is
235	          the need for the recipient to hash the message data during
236	          processing in order to correctly validate the signed attribute
237	          fields.

239	   3.  The decision should be dictated by Algorithm Characteristics:

241	          PRO: The preference for placing the attributes before the
242	          message data was dictated by a specific choice of algorithms
243	          (CCM and GCM); other authenticated encryption algorithms
244	          (specifically CWC) would naturally place the attributes
245	          second.

247	          CON: No detailed analysis of algorithms was done.  However,
248	          the attribute data should be expected to be much smaller than
249	          the message data and thus it makes more sense to cache the
250	          attributes for later processing than to cache the message data
251	          for later processing.

253	   4.  Resource requirements for the sender and recipient:

255	          What happens with resource constrained devices that are acting
256	          as senders or recipients?  The initial argument dealt with the
257	          question of resource limited senders that would not be able to
258	          store intermediate data, but the same question applies to
259	          resource limited recipients.  We know that this was intended
260	          to be used with firmware upgrades as one option, but it could
261	          equally be used by a device sending out reports to a central
262	          server.  This is a case where a close analysis would need to
263	          be done on the algorithm being used and how it will affect the
264	          resources needed.

266	   5.  Relative frequency of processing:

268	          There was a certain amount of discussion of the question of
269	          the relative frequency of processing between the sender and
270	          the recipient of a message.  This would have bearing on the
271	          question of which entity the decisions should be optimized
272	          for.  One set of people argued that recipients process
273	          messages more frequently than senders.  Another set of people
274	          argued that there exist applications where the sender may
275	          create messages that are never verified.

277	   6.  Attributes should be placed in both locations:

279	          There were a couple of people who attempted to argue that the
280	          decision should be made by the sender of the message rather
281	          than by the object designers.  In this case we should have two
282	          different locations where the authenticated attributes could
283	          be placed, either before or after the data, but only one of
284	          the two could be used.  The message creator would then select
285	          one or the other based on characteristics of their choosing.
286	          Recipients would then be required to deal with the attributes
287	          occurring in either location.  It was generally felt that the
288	          additional complexity on the recipient side was not worth the
289	          added flexibility.

291	3.  Algorithm Taxonomy

293	   In item 3 in the previous section, one of the issues was what a
294	   rigorous analysis of the AEAD algorithms would lead us to believe
295	   about how the fields should be laid out.  At the time we were using
296	   only hearsay about what would make for a good choice.  In this
297	   section, I define a set of criteria that I will use to analyze the
298	   set of algorithms and then describe how each algorithm fits the
299	   criteria.

301	   NIST has been gathering information on Authenticated Encryption Modes
302	   over the last decade.  Information on these modes can be found at
303	   <http://csrc.nist.gov/groups/ST/toolkit/BCM/modes_development.html>.
304	   For simplicity I used this as the set of algorithms to look at in
305	   order to characterize the requirements for the purposes of comparison
306	   with the characteristics required by the Authenticated Encryption
307	   data structure.

309	   In this section we will look at 11 AE algorithms from the NIST
310	   submissions along with an algorithm being developed by Peter Gutmann.
311	   Since we are interested in how to set up a streaming model, the
312	   criteria we are looking at are chosen with that in mind.  The major
313	   characteristics we are going to be looking at are:

315	   1.  What are the parameters used for the algorithm?  This contains a
316	       list of the elements that are needed for processing exclusive of
317	       the key value.  These are the items that would need to be encoded
318	       in the ASN.1 parameters field of an AlgorithmIdentifier
319	       structure.

321	   2.  What information is directly authenticated?  This is a list of
322	       the data which is directly authenticated in the order of
323	       authentication.  (It is possible that this list may change
324	       depending on the parameters.  Thus if HMAC-SHA1 is used, the
325	       length of the data is directly authenticated but it would not be
326	       if MAC-AES-128-CCBC was used.)

328	   3.  What information is required before the first byte of message
329	       data can be processed?  Assuming that the first byte of message
330	       data is to be processed upon it being decoded from the ASN.1 (or
331	       encoded to ASN.1), what items of information are needed by the
332	       encryption/decryption algorithm prior to it being processed?

334	   4.  What information is required before the first byte of
335	       authenticated data can be processed?  Assuming that the first
336	       byte of authenticated data is to be processed upon it being
337	       decoded from the ASN.1 (or encoded to ASN.1), what items of
338	       information are needed by the encryption/decryption algorithm
339	       prior to it being processed?
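
   The four criteria above can be captured as a small record per
   algorithm.  The following is only a sketch: the class name, field
   names and the helper method are my own inventions for illustration,
   and the two sample entries simply restate the lists given for CCM
   and EAX in Sections 3.1 and 3.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeProfile:
    """One row of the taxonomy (hypothetical structure, for illustration)."""
    name: str
    parameters: tuple        # criterion 1: algorithm parameters
    authenticated: tuple     # criterion 2: what is authenticated, in order
    before_message: tuple    # criterion 3: needed before first message byte
    before_auth_data: tuple  # criterion 4: needed before first auth-data byte

    def attrs_may_follow_content(self) -> bool:
        """True if message data can be processed before the authenticated
        data has been seen, i.e. the attributes could be encoded after
        the encrypted content without forcing a buffer."""
        return "authenticated data" not in self.before_message


CCM = ModeProfile(
    name="CCM",
    parameters=("nonce", "tag length"),
    authenticated=("nonce", "tag length", "message length",
                   "authenticated data length", "authenticated data",
                   "message data"),
    before_message=("nonce", "tag length", "message length",
                    "authenticated data length", "authenticated data"),
    before_auth_data=("nonce", "tag length", "message length",
                      "authenticated data length"),
)

EAX = ModeProfile(
    name="EAX",
    parameters=("nonce",),
    authenticated=("nonce", "authenticated data", "encrypted message"),
    before_message=("nonce",),
    before_auth_data=(),
)

for mode in (CCM, EAX):
    print(mode.name, mode.attrs_may_follow_content())
```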

341	   NIST is currently in the middle of doing a review and selection
342	   process for new modes to adopt as US security standards.  For
343	   simplicity the set of algorithms that I will be looking at come from
344	   the current set of candidate algorithms that are being reviewed for
345	   this purpose.  One additional algorithm added to this is a simple
346	   hash and encrypt algorithm that has been proposed by Peter Gutmann.

348	3.1.  CCM: Counter with CBC-MAC

350	   The Counter with CBC-MAC (CCM) mode was designed and documented by
351	   Doug Whiting, Russ Housley and Niels Ferguson.  A full description of
352	   the mode can be found in RFC 3610 [RFC3610] and on the NIST website.
353	   CCM is one of the standardized NIST modes (see [NIST-800-38C]) and is
354	   one of the two modes that are currently documented for use with the
355	   CMS Authenticated-Enveloped structures.

357	   The characteristics of the algorithm are:

359	   1.  The parameters of the algorithm are the nonce (IV) and the length
360	       of the tag to be generated.

362	   2.  The data authenticated is:

364	       A.  The nonce value,

366	       B.  The length of authentication tag,

368	       C.  The length of message data,

370	       D.  The length of authenticated data,

372	       E.  The authenticated data,

374	       F.  The message data

376	   3.  Before the first byte of message data can be processed, you must
377	       know:

379	       A.  The nonce value

381	       B.  The length of the authentication tag

383	       C.  The length of the message

385	       D.  The length of authenticated data,
386	       E.  The authenticated data

388	   4.  Before the first byte of the authenticated data can be processed,
389	       you must know:

391	       A.  The nonce value,

393	       B.  The length of the authentication tag

395	       C.  The length of the message

397	       D.  The length of authenticated data,

399	   This algorithm mode provides major problems for a sender to process
400	   in a streaming model.  The lengths of the message data and the
401	   authenticated data are both required to be known before any bytes of
402	   the message data or authenticated data can be processed.  Except in
403	   cases where fixed length messages will be generated, it is required
404	   that the message data be cached prior to encrypting.

406	   This algorithm provides some problems for recipients in processing,
407	   but under the correct circumstances can be processed under a
408	   streaming model.  The length of the message data must be presented to
409	   the recipient before the message data is given.  The authenticated
410	   data must be presented before the message data is presented.  Optimal
411	   use of this algorithm would require that 1) the authenticated data be
412	   moved before the message data bytes and 2) a requirement be
413	   established that either the message data be DER encoded or the
414	   message data length be published as part of the authenticated data.
415	   Given that this algorithm uses counter mode for encryption, the
416	   length of the message is already known so publishing it as part of
417	   the authenticated data would not leak any additional information.
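
   The reason the sender needs the message length up front is visible in
   the very first block fed to the CBC-MAC.  The sketch below builds
   that block (called B_0 in RFC 3610) from the nonce, the tag length
   and the message length.  The function name and argument names are
   mine; only the block layout is taken from RFC 3610, and no
   encryption is performed here.

```python
def ccm_b0(nonce: bytes, tag_len: int, msg_len: int,
           has_auth_data: bool) -> bytes:
    """Build the first CBC-MAC input block (B_0) laid out as in RFC 3610.

    Layout: one flags octet, the nonce, then the message length encoded
    in the remaining 15 - len(nonce) octets.
    """
    length_field_size = 15 - len(nonce)          # "L" in RFC 3610
    if not 2 <= length_field_size <= 8:
        raise ValueError("nonce must be 7 to 13 octets long")
    if tag_len not in (4, 6, 8, 10, 12, 14, 16):
        raise ValueError("tag length must be an even value from 4 to 16")
    if msg_len >= 1 << (8 * length_field_size):
        raise ValueError("message too long for this nonce size")

    flags = (
        (64 if has_auth_data else 0)             # Adata bit
        | (((tag_len - 2) // 2) << 3)            # encoded tag length
        | (length_field_size - 1)                # encoded length-field size
    )
    return bytes([flags]) + nonce + msg_len.to_bytes(length_field_size, "big")


# The message length (23 octets here) must be known before a single
# octet of message data or authenticated data can be authenticated.
nonce = bytes.fromhex("00000003020100a0a1a2a3a4a5")
print(ccm_b0(nonce, tag_len=8, msg_len=23, has_auth_data=True).hex())
```

   A sender that does not know msg_len when it starts cannot even
   produce this first block, which is why the message data has to be
   cached unless its length is fixed in advance.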

419	3.2.  CS: Cipher-State

421	   Cipher-State is an algorithm that supports an AE mode of operation,
422	   but not an AEAD mode of operation.  As such it does not matter where
423	   the authenticated parameters would be placed as they are not
424	   supported by the mode.  This mode is therefore not of interest to
425	   this discussion.

427	3.3.  CWC: Carter Wegman with Counter

429	   The Carter Wegman with Counter Authenticated Encryption mode was
430	   designed by Tadayoshi Kohno, John Viega and Doug Whiting.  A full
431	   description of the mode can be found in [CWC] and on the NIST
432	   website.

434	   The characteristics of the algorithm are:

436	   1.  The only parameter of the algorithm is a nonce.

438	   2.  The data actually authenticated is:

440	       A.  The nonce,

442	       B.  The authenticated data,

444	       C.  The encrypted message data

446	   3.  Before the first byte of data can be processed, you must know:

448	       A.  The nonce value,

450	       B.  The authenticated data

452	   4.  Before the first byte of authenticated data can be processed, you
453	       must know:

455	       A.  The nonce value

457	   It should be noted that the analysis above is for a simplistic
458	   implementation of the algorithm such as would normally be done in
459	   software.  Because the algorithm is designed so that it can be
460	   performed in parallel, it would be possible for message data bytes
461	   to be fully processed before the authenticated data bytes are
462	   processed.  The full details of this approach are not spelled out
463	   in the referenced documents.

465	   This algorithm can be easily streamed for the sender provided that
466	   the authenticated data are generated prior to the message data being
467	   generated.

469	   This algorithm can be easily streamed for the recipient provided that
470	   the authenticated data is presented prior to the message data being
471	   presented.

473	3.4.  EAX: A Conventional Authenticated-Encryption Mode

475	   A Conventional Authenticated-Encryption Mode was designed and
476	   documented by M. Bellare, P. Rogaway and D. Wagner.  A full
477	   description of the algorithm can be found at [EAX] and on the NIST
478	   website.

480	   The characteristics of the algorithm are:

482	   1.  The only parameter of the algorithm is a nonce.

484	   2.  The data actually authenticated is:

486	       A.  The nonce,

488	       B.  The authenticated attributes,

490	       C.  The encrypted message.

492	   3.  Before the first byte of data can be processed, you must know:

494	       A.  The nonce value.

501	   4.  Before the first byte of authenticated data can be processed, you
502	       must know: nothing.

504	   This mode computes the authentication value on the authenticated data
505	   and on the encrypted message separately - so they can be computed in
506	   any order - and combines the results together after the entire
507	   message has been processed.

509	   This algorithm can easily be streamed for the sender.  The order of
510	   generating the authenticated data and message data is immaterial.

512	   This algorithm can easily be streamed for the recipient.  The order
513	   of presenting the authenticated data and the message data is
514	   immaterial.
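
   The order independence can be illustrated with a short sketch.  EAX
   itself uses a tweaked OMAC over a block cipher; the code below
   substitutes HMAC-SHA-256 with a one-octet tweak purely as a stand-in
   so that the example runs with the Python standard library.  It is
   not EAX and the function names are hypothetical; it only shows that
   three independently computed values combined by XOR give the same
   tag whichever input is processed first.

```python
import hashlib
import hmac


def tweaked_mac(key: bytes, tweak: int, chunks) -> bytes:
    """Stand-in for EAX's tweaked OMAC: HMAC-SHA-256 over tweak || data."""
    mac = hmac.new(key, bytes([tweak]), hashlib.sha256)
    for chunk in chunks:
        mac.update(chunk)            # data can be fed in as it streams by
    return mac.digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def eax_style_tag(key, nonce, auth_data_chunks, ciphertext_chunks,
                  auth_data_first=True):
    """Combine three independent MACs; the processing order is a free choice."""
    nonce_mac = tweaked_mac(key, 0, [nonce])
    if auth_data_first:
        header_mac = tweaked_mac(key, 1, auth_data_chunks)
        body_mac = tweaked_mac(key, 2, ciphertext_chunks)
    else:
        body_mac = tweaked_mac(key, 2, ciphertext_chunks)
        header_mac = tweaked_mac(key, 1, auth_data_chunks)
    return xor_bytes(xor_bytes(nonce_mac, header_mac), body_mac)


key, nonce = b"k" * 16, b"n" * 16
attrs, body = [b"attr-1", b"attr-2"], [b"cipher", b"text"]
assert (eax_style_tag(key, nonce, attrs, body, True)
        == eax_style_tag(key, nonce, attrs, body, False))
```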

516	3.5.  GCM: Galois/Counter Mode

518	   The Galois/Counter Mode of Operation (GCM) was designed and
519	   documented by David McGrew and John Viega.  A full description of the
520	   algorithm can be found on the NIST website.  GCM is one of the
521	   standardized NIST modes (see [NIST-800-38D]) and is one of the two
522	   modes that are currently documented for use with the CMS
523	   Authenticated-Enveloped structures.

525	   The characteristics of the algorithm are:

527	   1.  The parameters of the algorithm are a nonce and the length of the
528	       tag to be generated.

530	   2.  The data actually authenticated is:

532	       A.  The authenticated data,

534	       B.  The encrypted message data,

536	       C.  The length of the authenticated data,

538	       D.  The length of the message data.

540	   3.  Before the first byte of message data can be processed, you must
541	       know:

543	       A.  The nonce value.

545	       B.  The authenticated data.

547	   4.  Before the first byte of authenticated data can be processed you
548	       must know: nothing.

550	   This mode can easily be used in a stream model for senders provided
551	   the authenticated data is generated prior to the message data.

553	   This mode can easily be used in a stream model for recipients
554	   provided that the authenticated data is presented prior to the
555	   message data.
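
   The ordering constraint follows from how the input to GCM's
   authentication function is laid out.  The sketch below only
   assembles that byte string as described in [NIST-800-38D]: padded
   authenticated data, then padded ciphertext, then the two lengths.
   The function names are mine, and no field multiplication or
   encryption is performed.

```python
import struct


def pad16(data: bytes) -> bytes:
    """Zero-pad data to a multiple of the 16-octet block size."""
    return data + b"\x00" * (-len(data) % 16)


def ghash_input(auth_data: bytes, ciphertext: bytes) -> bytes:
    """Assemble the byte string that GCM authenticates.

    The authenticated data comes first and the two bit lengths come
    last, so no length is needed before processing starts, but the
    authenticated data has to be consumed before any ciphertext.
    """
    lengths = struct.pack(">QQ", 8 * len(auth_data), 8 * len(ciphertext))
    return pad16(auth_data) + pad16(ciphertext) + lengths


blocks = ghash_input(b"A" * 20, b"C" * 5)
print(len(blocks))          # 64: two blocks, one block, one length block
print(blocks[-16:].hex())   # bit lengths 160 and 40 as 64-bit integers
```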

557	3.6.  IACBC: Integrity Aware Cipher Block Chaining

559	   Integrity Aware Cipher Block Chaining is an algorithm that supports
560	   an AE mode of operation, but not an AEAD mode of operation.  As such
561	   it does not matter where the authenticated parameters would be placed
562	   as they are not supported by the mode.  This mode is therefore not of
563	   interest to this discussion.

565	3.7.  IAPM: Integrity Aware Parallelizable Mode

567	   Integrity Aware Parallelizable Mode is an algorithm that supports an
568	   AE mode of operation, but not an AEAD mode of operation.  As such it
569	   does not matter where the authenticated parameters would be placed as
570	   they are not supported by the mode.  This mode is therefore not of
571	   interest to this discussion.

573	3.8.  OCB: Offset Codebook

575	   Offset Codebook mode is an algorithm that supports an AE mode of
576	   operation, but not an AEAD mode of operation.  As such it does not
577	   matter where the authenticated parameters would be placed as they are
578	   not supported by the mode.  This mode is therefore not of interest to
579	   this discussion.

581	   However, an addendum to the original mode submission described a
582	   method of adding the AEAD capability to any AE algorithm.  This was
583	   described by Phillip Rogaway in [OCB-AD1] as section 5 and designated
584	   as Ciphertext Translation.

586	   The characteristics of this algorithm are:

588	   1.  This mode adds no additional parameters to the underlying AE
589	       algorithm parameters.

591	   2.  The data actually authenticated is:

593	       A.  The message data

595	       B.  The authenticated data

597	   3.  Before the first byte of data can be processed, you must know:
598	       the same information as for the AE mode by itself.

600	   4.  Before the first byte of authenticated data can be processed you
601	       must know: nothing.

603	   It needs to be noted that before one can process the last t bytes of
604	   the message (for either encryption or decryption) the authenticated
605	   data must be known.  The value t is equal to the length of the output
606	   function for the authenticated data processor.  This does mean that
607	   an indication that one is in the last t bytes of processing the data
608	   is needed for both encryption and decryption modes.

610	   The sender can operate using a streaming model as long as it buffers
611	   the last t bytes of message data so that it can be correctly tagged
612	   and sent to the cryptographic code as needing special processing.
613	   The authenticated data must be computed prior to the last t bytes of
614	   the encryption stream being produced.  One possible way of dealing
615	   with this is to make the last t bytes the authentication tag as there
616	   is no explicit authentication tag created.

618	   The recipient can operate using a streaming model as long as it
619	   buffers the last t bytes of encrypted data so that it can be
620	   correctly tagged when sent to the cryptographic code.  As no separate
621	   authentication tag is created by the algorithm, the authenticated
622	   attributes must be presented prior to the last bytes of the encrypted
623	   data stream being decrypted.
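
   Holding back the last t bytes of a stream is straightforward.  The
   following generator is a minimal sketch of that buffering step only;
   the name split_tail and the "body"/"tail" labels are hypothetical,
   and the cryptographic handling of the tail is left out.

```python
def split_tail(chunks, t):
    """Pass stream data through while always retaining the final t bytes.

    Yields ("body", data) for bytes that can go straight to the
    cryptographic code, and finally ("tail", data) for the last t bytes
    (or fewer, if the stream is shorter), which need the special
    processing described above.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    pending = b""
    for chunk in chunks:
        pending += chunk
        if len(pending) > t:
            yield "body", pending[:-t]
            pending = pending[-t:]
    yield "tail", pending


for kind, data in split_tail([b"abc", b"defgh", b"ij"], t=4):
    print(kind, data)
# body b'abcd'
# body b'ef'
# tail b'ghij'
```

   The authenticated data only has to be available by the time the
   "tail" item is produced, not before the first "body" item.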

625	3.9.  PCFB: Propagating Cipher Feedback

627	   Propagating Cipher Feedback is an algorithm that supports an AE mode
628	   of operation, but not an AEAD mode of operation.  As such it does not
629	   matter where the authenticated parameters would be placed as they are
630	   not supported by the mode.  This mode is therefore not of interest to
631	   this discussion.

633	3.10.  SIV: Synthetic IV

635	   The Synthetic IV (SIV) mode was designed and documented by Phillip
636	   Rogaway and Thomas Shrimpton.  A full description of the algorithm
637	   can be found on the NIST website at [SIV].

639	   The characteristics of the algorithm are:

641	   1.  The parameters of the algorithm are:

643	       A.  None for the sender of the message

645	       B.  An IV value for the recipient of the message.  (The IV value
646	           acts as the authentication tag.)

648	   2.  The data actually authenticated is:

650	       A.  The authenticated data

652	       B.  The message data

654	   3.  Before the first byte of data can be processed, you must know:

656	       A.  The authenticated attributes.

658	   4.  Before the first byte of authenticated data can be processed, you
659	       must know: nothing.

661	   The algorithm does not use a nonce value; instead, the IV used for
662	   the counter mode is computed from the authenticated data and message
663	   data.  The IV is then emitted as the authentication tag.  Note that
664	   this also means that the message data must be processed twice by the
665	   cryptographic code: once to do the authentication computation and
666	   produce the IV, and once to do the counter mode encryption.

668	   This algorithm cannot be streamed by the sender.  Since the IV used
669	   for the counter mode encryption of the message data depends on all of
670	   the message data, the message data must actually be processed twice
671	   by the encryption algorithm.

673	   The algorithm can easily be streamed by the recipient.  The
674	   requirement is that the authenticated attributes and the IV be
675	   presented to the recipient before the message data is presented.  The
676	   authentication check is then done by comparing the IV passed in with
677	   the IV computed.
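
	   The two passes on the sending side, and the IV comparison on the
	   receiving side, can be illustrated with a toy construction.  This
	   is NOT the algorithm specified in [SIV]: real SIV uses a CMAC-based
	   function and AES in counter mode, whereas this sketch assumes
	   HMAC-SHA-256 as a stand-in for both so that it runs with only the
	   Python standard library.  All function names are hypothetical, and
	   the sketch must not be used to protect real data.

```python
import hashlib
import hmac


def _synthetic_iv(mac_key: bytes, aad: bytes, message: bytes) -> bytes:
    # Pass 1: a keyed function over the authenticated data and the whole
    # message.  HMAC-SHA-256 is an assumed stand-in for SIV's own function.
    mac = hmac.new(mac_key, digestmod=hashlib.sha256)
    mac.update(len(aad).to_bytes(8, "big"))  # length prefix separates inputs
    mac.update(aad)
    mac.update(message)
    return mac.digest()[:16]


def _keystream_xor(enc_key: bytes, iv: bytes, data: bytes) -> bytes:
    # Counter-mode stand-in: keystream block i is HMAC(enc_key, iv || i).
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(enc_key, iv + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def toy_siv_encrypt(mac_key: bytes, enc_key: bytes, aad: bytes, message: bytes):
    iv = _synthetic_iv(mac_key, aad, message)        # pass 1 over the message
    return iv, _keystream_xor(enc_key, iv, message)  # pass 2 over the message


def toy_siv_decrypt(mac_key: bytes, enc_key: bytes, aad: bytes,
                    iv: bytes, ciphertext: bytes) -> bytes:
    # The recipient already holds the IV, so decryption needs one pass.
    message = _keystream_xor(enc_key, iv, ciphertext)
    # The check compares the IV passed in with the IV computed.
    if not hmac.compare_digest(iv, _synthetic_iv(mac_key, aad, message)):
        raise ValueError("authentication failed")
    return message
```

	   The sender cannot emit any ciphertext until the whole message has
	   been seen once, which is why it cannot stream.  The recipient is
	   shown here in one call for brevity, but both of its steps consume
	   the data in order and could be driven chunk by chunk.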

679	3.11.  XCBC: eXtended Cipher Block Chaining Encryption

681	   eXtended Cipher Block Chaining Encryption is an algorithm that
682	   supports an AE mode of operation, but not an AEAD mode of operation.
683	   As such it does not matter where the authenticated parameters would
684	   be placed as they are not supported by the mode.  This mode is
685	   therefore not of interest to this discussion.

687	3.12.  MAC-Authenticated Encryption

689	   The MAC-Authenticated Encryption mode has been documented by Peter
690	   Gutmann.  This mode is documented in [GUTMANN].

692	   The characteristics of the algorithm are:

694	   1.  The parameters of the algorithm are:

696	       A.  A key derivation algorithm,

698	       B.  A keyed MAC algorithm,

700	       C.  An encryption algorithm

702	   2.  The data actually authenticated is:

704	       A.  The encrypted message,

706	       B.  The authenticated attributes.

708	   3.  Before the first byte of the message data can be processed, you
709	       must know: nothing.

711	   4.  Before the first byte of the authenticated data can be processed,
712	       you must know:

714	       A.  The encrypted message data.

716	   This algorithm can easily be used in a streaming model by the sender.

718	   This algorithm can easily be used in a streaming model by the
719	   recipient.

721	   Note: In the series of messages that I exchanged with Peter during
722	   the design of this algorithm, one of the things he noted was that to
723	   make streaming easier he should put the authenticated attributes
724	   after the message data.  Thus the algorithm was designed to make sure
725	   that streaming worked well with the current encoding.
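
	   The ordering listed above (encrypted message first, authenticated
	   attributes second) can be sketched as an incremental MAC
	   computation.  This is only an illustration under assumptions:
	   HMAC-SHA-256 is an arbitrary choice of keyed MAC, the function name
	   is hypothetical, and key derivation, encryption and the exact
	   encoding of the attributes defined in [GUTMANN] are not reproduced.

```python
import hashlib
import hmac
from typing import Iterable


def mac_after_encryption(mac_key: bytes,
                         encrypted_chunks: Iterable[bytes],
                         auth_attrs: bytes) -> bytes:
    """Hypothetical sketch: MAC the ciphertext as it streams past,
    then the encoded authenticated attributes."""
    mac = hmac.new(mac_key, digestmod=hashlib.sha256)
    for chunk in encrypted_chunks:
        # Each chunk is assumed to be encrypted already; nothing is kept.
        mac.update(chunk)
    # The attributes are only needed once the message data is finished.
    mac.update(auth_attrs)
    return mac.digest()
```

	   Because the attributes enter the computation last, neither side
	   has to hold the message data, which matches a layout where the
	   attributes follow the encrypted content.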

727	4.  My Assumptions

729	   This section will list the set of criteria that I am using in making
730	   my conclusions.  Again, the most important thing in my mind is the
731	   ability to implement a streaming model for encode and decode
732	   operations.

734	   1.  We want to implement using a single pass streaming model to
735	       encode and decode the structures.  There are many reasons to do
736	       so:

738	       1.  The amount of resources used is minimized by not buffering
739	           the entirety of the message at each level of wrapping.

741	       2.  The fact that not all messages are DER encoded means that
742	           there is no single buffer in the original message that can be
743	           treated as a single input buffer.

745	       3.  The message may be fed to the encoder/decoder in chunks due
746	           to the way things are read from files, the fact that nodes in
747	           trees are emitted serially, or the fact that removal of MIME
748	           content transfer encoding is normally done on small buffers.

750	          There is one argument that says one should buffer up the
751	          entire encrypted buffer, decrypt in one chunk and then pass on
752	          the data in one piece.  Since the name of the algorithm class
753	          is encrypted and authenticated, one should perhaps actually
754	          authenticate that the data is correct prior to releasing the
755	          data for additional processing.

757	          I believe that it is sufficient to check that the encrypted
758	          buffer has been authenticated prior to acting on the data
759	          contained in the encrypted buffer.  Thus I believe it makes
760	          sense to continue doing the decode and to propagate a failure
761	          up either when the decode itself fails or when the
762	          authentication check is actually made.  In this way it is no
763	          different than the processing of a signed message where the
764	          signature may be checked long after the message has been
765	          fully decoded.  In fact this is the normal case for an S/MIME
766	          client where the content is often viewable with some
767	          indication that the validation of the signature failed for
768	          some reason.

770	   2.  The relative lengths of the data to be encrypted and the
771	       attributes to be protected are such that the encrypted data is
772	       generally much larger than the attributes.  Thus if one has to
773	       cache one in a streaming mode, it is preferable to cache the
774	       attributes.

776	5.  Conclusions

778	   I now look again at the arguments presented in Section 2 and review
779	   them.  All of the opinions in this section are mine and may or may
780	   not represent those of any other people.  Section 6 contains the
781	   opinions of other people.

783	   1.  Consistency with the existing CMS data types:

785	          This criterion should only be used as a tie breaker in the
786	          event that all other criteria come out equal.  When looking at
787	          this argument I am reminded of the following:

789	          A foolish consistency is the hobgoblin of little minds,
790	          adored by little statesmen and philosophers and divines.
791	                                    (Ralph Waldo Emerson 1841)

793	   2.  Authenticated attributes that are derived from the message
794	       content:

796	          This argument is slightly more believable than it was before I
797	          began this document as I now have an attribute which is
798	          derived from the message content.  However, this attribute is
799	          the length of the message data, and in order to be useful it
800	          needs to be placed before the message data is consumed.  (See
801	          Section 3.1.)

803	          I found this argument to be difficult to believe at the time
804	          it was presented, and I have not changed my mind since then.
805	          The argument that this means the authenticated attributes
806	          come second would mean that this is an attribute that is
807	          attested to by the sender, but is not verified in any way by
808	          the recipient.  If the recipient needed to do any processing
809	          then it would be much more desirable to have the attribute
810	          occur before the message data so that the recipient can set up
811	          to do the necessary processing prior to processing the message
812	          data.

814	          In the process of writing [XOR-HASH] I have become convinced
815	          that there is a fundamental problem which is going to be
816	          coming in the future with the signed data structure.  Since
817	          the recipient does not "know" the correct set of hash
818	          algorithms to be used when processing a message the vast
819	          majority look at the list presented and then augment it with a
820	          number of different algorithms.  This often means that one is
821	          computing four or five different hash functions on the content
822	          just on the off-chance that they may be needed.  Many systems
823	          will not attempt a recovery if they find a signer info
824	          structure which uses a hash algorithm they did not realize
825	          that they needed even if it is known to the system because of
826	          the work involved in doing a restart after having parsed in
827	          all of the data.  This means that similar behavior should be
828	          expected for any attributes that need to be validated by the
829	          recipient after having been generated by the sender.  The
830	          problem is worse since there is no field, similar to the set
831	          of digest algorithms at the beginning of a signed data object,
832	          that can be filled in.
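
	          The "just in case" hashing described above can be sketched
	          as follows.  This is only an illustration: the function
	          name and the candidate list are hypothetical and no CMS
	          parsing is shown.  Every candidate digest is fed from the
	          same single pass over the content, so the cost grows with
	          the number of guesses, and any algorithm missing from the
	          list would force the content to be read again.

```python
import hashlib
from typing import Dict, Iterable


def digest_candidates(chunks: Iterable[bytes],
                      algorithms: Iterable[str]) -> Dict[str, bytes]:
    """Hypothetical sketch: compute several digests in one pass."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    for chunk in chunks:
        # Each chunk is hashed once per candidate algorithm.
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.digest() for name, hasher in hashers.items()}
```

	          For example, calling digest_candidates(chunks, ["sha1",
	          "sha256", "sha512"]) does three times the hashing work even
	          if only one of the results is ever needed.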

834	          I believe that this criterion was misapplied.  The issue of
835	          how a recipient was supposed to deal with these types of
836	          attributes was completely ignored in the decision process and
837	          it should have had paramount importance.

839	   3.  The decision should be dictated by Algorithm Characteristics:

841	          Looking at the taxonomy of algorithms that is presented in
842	          Section 3 we come up with the following results:

844	             The algorithms which cannot be easily streamed are: CCM,
845	             SIV (for sender)

847	             The algorithms which need attributes before the message
848	             body are: CWC (serialized implementation), GCM, SIV (for
849	             recipient), CCM (for recipient in special circumstances)

851	             The algorithms which need the message body before the
852	             attributes are: MAC-Authenticated

854	             The algorithms which can have either the body or the
855	             attributes first are: CWC (parallelized implementation),
856	             EAX, OCB

858	          We can see that CCM and SIV will never be easily streamed for
859	          the sender.  It is unfortunate for people wanting to stream
860	          that CCM is one of the two algorithms we have standardized
861	          on.  It should be noted that both of these algorithms can be
862	          streamed for the recipient of the message, but CCM requires an
863	          additional restriction to be applied.  If either of these
864	          algorithms is used then the entire question discussed above
865	          about a sender processing the content on sending would be
866	          academic as the message data needs to be buffered anyway.

868	          We have only one algorithm where the attributes are logically
869	          placed after the message data, that being MAC-Authenticated,
870	          which was explicitly designed to be that way so that it could
871	          be streamed using the current data layout.

873	          Additionally there are two algorithms that are agnostic of the
874	          order of attributes and data plus one that can be implemented
875	          to be agnostic.

877	          For recipients, only the MAC-Authenticated algorithm
878	          necessitates that the attributes be cached until the message
879	          data has been processed.  All of the other algorithms can be
880	          made to work with the attributes preceding the message data
881	          without any problems.

883	          In current practice, and in part because of NIST
884	          standardization, the only two modes that have significant use
885	          are the CCM and GCM modes.  It is possible that the MAC-
886	          Authenticated mode will also get traction since it is easy for
887	          people to understand and implement.  This should also be taken
888	          into consideration when looking at the algorithm
889	          characteristics.

891	          If we had done this analysis at the time the decision was made
892	          then we should have made the decision to place the attributes
893	          first.

895	   4.  Resource requirements for the sender and recipient:

897	          It is no more likely that the sender of a message is resource
898	          constrained than it is for the recipient of the message to be
899	          resource constrained.  This means that it is better for a set
900	          of algorithms and layout to be chosen that will work well in a
901	          streaming model under normal circumstances than to optimize
902	          for either the sender or the recipient.

904	   5.  Relative frequency of processing:

906	          In my opinion, most of the time messages that are created
907	          using an authenticated encryption algorithm will be decrypted
908	          by at least one recipient.  Messages which are not decrypted
909	          will exist, either from being lost in the ether or from being
910	          cached until needed, but these will be the smallest part of
911	          the set.  Messages which need to be decrypted multiple times
912	          by a single recipient will generally be a small number as
913	          well, unless it becomes part of the S/MIME standard.  However
914	          I believe that a significant number of messages will be
915	          created that will have multiple recipients.  This may be done
916	          by creating multiple lock boxes up front, or by creating the
917	          lock boxes on demand in cases where it does not matter that
918	          traffic analysis can reveal that multiple recipients have
919	          gotten the same message.  (An example of this might be sending
920	          a firmware upgrade to multiple devices, where the message is
921	          transferred on demand and it does not matter that an observer
922	          can see that the same set of firmware is being installed on
923	          multiple machines.  This would be something that could
924	          probably be assumed anyway.)

926	          I therefore think that overall more messages will be decoded
927	          and decrypted than encrypted and encoded.  This would mean
928	          that a bias should be placed toward the recipients of
929	          messages, not the senders, in making decisions.

931	   Based on the above, I would say that we should modify the order of
932	   these fields in the event that the document is updated.

934	6.  Responses

936	   An opportunity was provided to Russ Housley as the author of
937	   [CMS-AED] and to others that were involved on the mailing list to
938	   provide a formal response.  Nobody took advantage of the offer.

940	7.  Security Considerations

942	   This document discusses a security related document; however, it
943	   makes no changes to that document.  As such there are no actual
944	   security implications for this document.

946	8.  IANA Considerations

948	   No action by IANA is required for this document.

950	9.  References

952	9.1.  Normative References

954	   [RFC3610]  Whiting, D., Housley, R., and N. Ferguson, "Counter with
955	              CBC-MAC (CCM)", RFC 3610, September 2003.

957	   [CMS]      Housley, R., "Cryptographic Message Syntax (CMS)",
958	              RFC 5652, September 2009.

960	   [CMS-AED]  Housley, R., "Cryptographic Message Syntax (CMS)
961	              Authenticated-Enveloped-Data Content Type", RFC 5083,
962	              November 2007.

964	   [GUTMANN]  Gutmann, P., "Using MAC-authenticated Encryption in the
965	              Cryptographic Message Syntax (CMS)".

967	   [NIST-800-38C]
968	              Dworkin, M., "Recommendation for Block Cipher Modes of
969	              Operation: The CCM Mode for Authentication and
970	              Confidentiality", NIST Special Publication 800-38C,
971	              May 2004.

973	   [NIST-800-38D]
974	              Dworkin, M., "Recommendation for Block Cipher Modes of
975	              Operation: Galois/Counter Mode (GCM) and GMAC", NIST
976	              Special Publication 800-38D, November 2007.

978	   [CWC]      Kohno, T., Viega, J., and D. Whiting, "The CWC
979	              authenticated encryption (associated data) mode",
980	              May 2003.

982	   [EAX]      Bellare, M., Rogaway, P., and D. Wagner, "EAX: A
983	              Conventional Authenticated-Encryption Mode", 2003.

985	   [OCB-AD1]  Rogaway, P., "The Associated-Data Problem", November 2001.

987	   [SIV]      Rogaway, P. and T. Shrimpton, "The SIV Mode of Operation
988	              for Deterministic Authenticated-Encryption (Key Wrap) and
989	              Misuse-Resistant Nonce-Based Authenticated-Encryption",
990	              August 2007.

992	9.2.  Informative References

994	   [XOR-HASH]
995	              Schaad, J., "Experiment: Hash functions with parameters in
996	              CMS and S/MIME", draft-schaad-smime-hash-experiment-06
997	              (work in progress), January 2011.

999	Author's Address

1001	   Jim Schaad
1002	   Soaring Hawk Consulting

1004	   Email: jimsch@augustcellars.com