Network Working Group                                          J. Schaad
Internet-Draft                                   Soaring Hawk Consulting
Intended status: Informational                             April 6, 2011
Expires: October 8, 2011


  Commentary on the Design of the Authenticated-Enveloped-Data Content
                                  Type
                     draft-schaad-smime-aed-rant-02

Abstract

   The Authenticated-Enveloped-Data Content Type allows for the use of
   Authenticated-Enveloped modes with block cipher algorithms. At the
   time of the original design there was discussion about the relative
   location of the authenticated attributes and the encrypted content in
   the ASN.1 structure. With the benefit of implementation experience,
   I revisit the discussion held at the time and re-evaluate the
   decision made.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 8, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Historic Arguments
   3.  Algorithm Taxonomy
     3.1.  CCM: Counter with CBC-MAC
     3.2.  CS: Cipher-State
     3.3.  CWC: Carter Wegman with Counter
     3.4.  EAX: A Conventional Authenticated-Encryption Mode
     3.5.  GCM: Galois/Counter Mode
     3.6.  IACBC: Integrity Aware Cipher Block Chaining
     3.7.  IAPM: Integrity Aware Parallelizable Mode
     3.8.  OCB: Offset Codebook
     3.9.  PCFB: Propagating Cipher Feedback
     3.10. SIV: Synthetic IV
     3.11. XCBC: eXtended Cipher Block Chaining Encryption
     3.12. MAC-Authenticated Encryption
   4.  My Assumptions
   5.  Conclusions
   6.  Responses
   7.  Security Considerations
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Author's Address

1. Introduction

   When the Cryptographic Message Syntax (CMS) [CMS] Authenticated-
   Enveloped-Data content type (defined in RFC 5083 [CMS-AED]) was being
   discussed, the S/MIME working group had no actual implementation
   experience to guide it in some of the decisions that were being made
   at the time. In this document I revisit one of these decisions
   based on the implementation experience that I have since garnered.

   Issues that were discussed at the time included:

      What should the order of the authenticated attributes, the
      encrypted data and the authentication code be in the ASN.1
      structure? There was uniform agreement that the authentication
      code should be last; however, the placement of the other two
      fields was hotly disputed. This is the issue that is addressed
      further below.

      Should we change from using a SET to a SEQUENCE for the attribute
      list? Doing so would have simplified the encoding processing for
      hashing. There was no support for doing this, as a common routine
      already existed that worked for the signed and authenticated data
      structures.

      What are the security issues that deal with the timing of release
      of the encrypted content vs. the validation step? This issue was
      addressed in Section 2 of [CMS-AED] with the statement "The
      recipient MUST verify the integrity of the received content before
      releasing any information, especially the plaintext of the
      content."

      Step 5 in Section 2 of [CMS-AED] says that padding needs to be
      done to the block length; however, there was some concern that the
      issue of how padding should be done is better left to the
      algorithm description rather than being specified there. No
      changes were made to address the issue.

   The major focus of the discussions centered on the relative placement
   of the encrypted data blob (contained in the authEncryptedContentInfo
   field) and the authenticated attributes (contained in the authAttrs
   field). Three different camps emerged. These were: 1) the
   attributes should be before the encrypted data, 2) the attributes
   should be after the encrypted data, and 3) there should be the
   ability to place the attributes either before or after the encrypted
   data, and the encoder would choose which to use. As can be seen from
   the ASN.1 in Figure 1, the final decision was to place the
   authenticated attributes after the encrypted content. This was
   counter to the arguments that I made at the time to place the
   authenticated attributes before the encrypted content.

   AuthEnvelopedData ::= SEQUENCE {
     version CMSVersion,
     originatorInfo [0] IMPLICIT OriginatorInfo OPTIONAL,
     recipientInfos RecipientInfos,
     authEncryptedContentInfo EncryptedContentInfo,
     authAttrs [1] IMPLICIT AuthAttributes OPTIONAL,
     mac MessageAuthenticationCode,
     unauthAttrs [2] IMPLICIT UnauthAttributes OPTIONAL }

               Figure 1: AuthEnvelopedData ASN.1 Extract

   This document is organized as follows:

   o  Section 2 contains a review of the arguments presented at the
      time.

   o  Section 3 has a taxonomy of a number of authenticated encryption
      algorithms.

   o  Section 4 presents a set of criteria to be used.

   o  Section 5 contains my personal conclusions on the issue.

   o  Section 6 contains rebuttals (or maybe not).

   The major part of my discussion focuses on the desirability of using
   a streaming model for processing the ASN.1 structure and the data
   contained within it. If one does not want to use streaming in doing
   the processing, then much of the discussion here is moot.
   If one is
   willing to buffer up all of the input to the encryption algorithm
   before applying it, the order in which the inputs are presented is
   immaterial. This will be further detailed in Section 4.

1.1. Terminology

   The following is a list of standardized terms used in the document:

   AE  is an abbreviation for Authenticated Encryption. This is a block
      cipher mode of operation which simultaneously provides
      confidentiality and integrity assurances on the data.

   AEAD  is an abbreviation for Authenticated Encryption with Associated
      Data. This is a block cipher mode of operation which
      simultaneously provides confidentiality and integrity assurances
      on the message data as well as integrity assurances on an
      additional set of data.

   Message Data  is the section of the input data that is to be
      authenticated and encrypted by the AE or AEAD algorithm mode. For
      CMS, the encrypted message data is placed in the encryptedContent
      field of the authEncryptedContentInfo sequence.

   Authenticated Data  is the section of input data that is to be
      authenticated but not encrypted. For CMS, the authenticated data
      is the sequence in the authAttrs field.

   Authentication Tag  is a value generated by the mode which is used to
      validate the integrity of the data. The Authentication Tag is
      sometimes implicit and does not exist as an independent value.
      For CMS, it is assumed that the use of the algorithm will define
      an explicit tag and that the tag will be placed in the mac field.

   Streaming Model  is a method of doing the processing such that the
      ASN.1 processing and the cryptographic processing can be
      interleaved with each other.

2. Historic Arguments

   I have gone through the archived mailing list from the time to find
   the arguments that were being advanced.
   The arguments are laid out
   with the PRO side being in favor of placing the attributes after the
   data, except for the last item in the list.

   1.  Consistency with the existing CMS data types:

       PRO: We have working implementations of both AuthenticatedData
       and SignedData. In both of these cases the data structures are
       ordered such that the message data precedes the authenticated
       data. Keeping the order consistent makes coding easier and
       leads to fewer mistakes.

       CON: Being consistent is nice; however, if the result does not
       work correctly, consistency does not matter.

   2.  Authenticated attributes may be derived from the message content:

       PRO: It should be possible to create authenticated attributes
       based on the content of the encrypted data and have these
       attributes authenticated. Placing the attributes before the
       message content means that one must buffer the message content
       to do this. The example presented on the mailing list was the
       ability for a sender to run the body of the message through a
       virus checker on the fly and publish the result of the virus
       check as an authenticated attribute. This is the same thing
       that happens today for both SignedData and AuthenticatedData,
       where the hash of the message data is computed on the fly and
       then placed in the signed/authenticated attributes, which are
       then processed to compute the signature or MAC values.

       CON: Placing this information after the message data means that
       the recipient cannot know to perform matching processing, if
       necessary, in order to check the value presented by the sender.
       The analogous step for the SignedData structure is the need for
       the recipient to hash the message data during processing in
       order to correctly validate the signed attribute fields.

   3.  The decision should be dictated by Algorithm Characteristics:

       PRO: The argument for placing the attributes before the message
       data was dictated by a specific choice of algorithms (CCM and
       GCM); other authenticated encryption algorithms (specifically
       CWC) would naturally place the attributes second.

       CON: No detailed analysis of algorithms was done. However, the
       attribute data should be expected to be much smaller than the
       message data, and thus it makes more sense to cache the
       attributes for later processing than to cache the message data
       for later processing.

   4.  Resource requirements for the sender and recipient:

       What happens with resource constrained devices that are acting
       as senders or recipients? The initial argument dealt with the
       question of resource limited senders that would not be able to
       store intermediate data, but the same question applies to
       resource limited recipients. We know that this content type was
       intended to be used with firmware upgrades as one option, but it
       could equally be used by a device sending out reports to a
       central server. This is a case where a close analysis would
       need to be done on the algorithm being used and how it will
       affect the resources needed.

   5.  Relative frequency of processing:

       There was a certain amount of discussion of the relative
       frequency of processing between the sender and the recipient of
       a message. This has a bearing on the question of which entity
       the decisions should be optimized for. One set of people argued
       that recipients process messages more frequently than senders.
       Another set of people argued that there exist applications where
       the sender may create messages that are never verified.

   6.  Attributes should be placed in both locations.

       There were a couple of people who attempted to argue that the
       decision should be made by the sender of the message rather
       than by the object designers. In this case we would have two
       different locations where the authenticated attributes could
       be placed, either before or after the data, but only one of the
       two could be used. The message creator would then select one
       or the other based on characteristics of their choosing.
       Recipients would then be required to deal with the attributes
       occurring in either location. It was generally felt that the
       additional complexity on the recipient side was not worth the
       added flexibility.

3. Algorithm Taxonomy

   Item 3 in the previous section raised the question of what a
   rigorous analysis of the AEAD algorithms would lead us to believe
   about how the structure should be laid out. At the time we were
   using only hearsay about what would make for a good choice. In this
   section, I define a set of criteria that I will use to analyze the
   set of algorithms and then describe how each algorithm fits the
   criteria.

   NIST has been gathering information on Authenticated Encryption Modes
   over the last decade. Information on these modes can be found on the
   NIST website. For simplicity I used this as the set of algorithms to
   look at in order to characterize the requirements for the purposes of
   comparison with the characteristics required by the Authenticated
   Encryption data structure.

   In this section we will look at 11 AE algorithms from the NIST
   submissions along with an algorithm being developed by Peter Gutmann.
   Since we are interested in how to set up a streaming model, the
   criteria we are looking at are chosen with that in mind. The major
   characteristics we are going to be looking at are:

   1.  What are the parameters used for the algorithm?
       This contains a
       list of the elements that are needed for processing exclusive of
       the key value. These are the items that would need to be encoded
       in the ASN.1 parameters field of an AlgorithmIdentifier
       structure.

   2.  What information is directly authenticated? This is a list of
       the data which is directly authenticated, in the order of
       authentication. (It is possible that this list may change
       depending on the parameters. Thus if HMAC-SHA1 is used, the
       length of the data is directly authenticated, but it would not be
       if MAC-AES-128-CCBC were used.)

   3.  What information is required before the first byte of message
       data can be processed? Assuming that the first byte of message
       data is to be processed upon it being decoded from the ASN.1 (or
       encoded to ASN.1), what items of information are needed by the
       encryption/decryption algorithm prior to it being processed?

   4.  What information is required before the first byte of
       authenticated data can be processed? Assuming that the first
       byte of authenticated data is to be processed upon it being
       decoded from the ASN.1 (or encoded to ASN.1), what items of
       information are needed by the encryption/decryption algorithm
       prior to it being processed?

   NIST is currently in the middle of a review and selection process
   for new modes to adopt as US security standards. For simplicity the
   set of algorithms that I will be looking at comes from the current
   set of candidate algorithms that are being reviewed for this
   purpose. One additional algorithm added to this set is a simple
   hash and encrypt algorithm that has been proposed by Peter Gutmann.

3.1. CCM: Counter with CBC-MAC

   The Counter with CBC-MAC (CCM) mode was designed and documented by
   Doug Whiting, Russ Housley and Niels Ferguson. A full description of
   the mode can be found in RFC 3610 [RFC3610] and on the NIST website.
   CCM is one of the standardized NIST modes (see [NIST-800-38C]) and is
   one of the two modes that are currently documented for use with the
   CMS Authenticated-Enveloped structures.

   The characteristics of the algorithm are:

   1.  The parameters of the algorithm are the nonce (IV) and the length
       of the tag to be generated.

   2.  The data authenticated is:

       A.  The nonce value,

       B.  The length of the authentication tag,

       C.  The length of the message data,

       D.  The length of the authenticated data,

       E.  The authenticated data,

       F.  The message data.

   3.  Before the first byte of message data can be processed, you must
       know:

       A.  The nonce value,

       B.  The length of the authentication tag,

       C.  The length of the message,

       D.  The length of the authenticated data,

       E.  The authenticated data.

   4.  Before the first byte of the authenticated data can be processed,
       you must know:

       A.  The nonce value,

       B.  The length of the authentication tag,

       C.  The length of the message,

       D.  The length of the authenticated data.

   This algorithm mode presents major problems for a sender processing
   in a streaming model. The lengths of the message data and the
   authenticated data are both required to be known before any bytes of
   the message data or authenticated data can be processed. Except in
   cases where fixed length messages will be generated, it is required
   that the message data be cached prior to encrypting.

   This algorithm presents some problems for recipients in processing,
   but under the correct circumstances can be processed under a
   streaming model. The length of the message data must be presented to
   the recipient before the message data is given. The authenticated
   data must be presented before the message data is presented.
   Optimal
   use of this algorithm would require that 1) the authenticated data be
   moved before the message data bytes and 2) a requirement be
   established that either the message data be DER encoded or the
   message data length be published as part of the authenticated data.
   Given that this algorithm uses counter mode for encryption, the
   length of the message is already known, so publishing it as part of
   the authenticated data would not leak any additional information.

3.2. CS: Cipher-State

   Cipher-State is an algorithm that supports an AE mode of operation,
   but not an AEAD mode of operation. As such it does not matter where
   the authenticated parameters would be placed, as they are not
   supported by the mode. This mode is therefore not of interest to
   this discussion.

3.3. CWC: Carter Wegman with Counter

   The Carter Wegman with Counter Authenticated Encryption mode was
   designed by Tadayoshi Kohno, John Viega and Doug Whiting. A full
   description of the mode can be found in [CWC] and on the NIST
   website.

   The characteristics of the algorithm are:

   1.  The only parameter of the algorithm is a nonce.

   2.  The data actually authenticated is:

       A.  The nonce,

       B.  The authenticated data,

       C.  The encrypted message data.

   3.  Before the first byte of message data can be processed, you must
       know:

       A.  The nonce value,

       B.  The authenticated data.

   4.  Before the first byte of authenticated data can be processed, you
       must know:

       A.  The nonce value.

   It should be noted that the analysis above is for a simplistic
   implementation of the algorithm such as would normally be done in
   software. Because the algorithm is designed so that it can be
   performed in parallel, it would be possible for message data bytes
   to be fully processed before the authenticated data bytes are
   processed. The full details of this approach are not spelled out in
   the referenced documents.
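
   The ordering constraint in items 3 and 4 above can be illustrated
   with a small Python sketch. It is a toy model and not CWC itself:
   HMAC-SHA-256 stands in for the authenticator, and the function name
   consume_fields and the field labels "authAttrs" and "content" are
   hypothetical names chosen for this illustration. The sketch assumes
   the simplistic, non-parallel implementation, in which the
   authenticator must absorb the authenticated data before any message
   data, and it counts how many message bytes have to be held back when
   the fields arrive in a given order.

```python
import hashlib
import hmac


def consume_fields(key, nonce, fields):
    """Feed (label, bytes) pairs in wire order to a toy authenticator.

    Returns (tag, buffered), where buffered is the number of message
    bytes that had to be held back because the authenticated data had
    not yet been seen. HMAC-SHA-256 is only a stand-in for the real
    authenticator of an AEAD mode.
    """
    mac = hmac.new(key, nonce, hashlib.sha256)
    pending = []        # message chunks waiting for the attributes
    attrs_seen = False
    buffered = 0
    for label, value in fields:
        if label == "authAttrs":
            mac.update(value)
            attrs_seen = True
            for chunk in pending:   # release everything held back
                mac.update(chunk)
            pending.clear()
        elif label == "content":
            if attrs_seen:
                mac.update(value)   # can be streamed straight through
            else:
                pending.append(value)
                buffered += len(value)
        else:
            raise ValueError("unknown field label: %r" % (label,))
    for chunk in pending:           # no attributes were present at all
        mac.update(chunk)
    return mac.digest()[:16], buffered


attrs_first = [("authAttrs", b"attrs"),
               ("content", b"chunk-1"), ("content", b"chunk-2")]
content_first = [("content", b"chunk-1"), ("content", b"chunk-2"),
                 ("authAttrs", b"attrs")]
tag_a, held_a = consume_fields(b"key", b"nonce", attrs_first)
tag_b, held_b = consume_fields(b"key", b"nonce", content_first)
```

   Both orders yield the same tag, but only the attributes-first order
   lets every content chunk pass through without being held back; with
   the content first, the whole content is buffered until the
   attributes arrive.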

   This algorithm can be easily streamed for the sender provided that
   the authenticated data is generated prior to the message data being
   generated.

   This algorithm can be easily streamed for the recipient provided that
   the authenticated data is presented prior to the message data being
   presented.

3.4. EAX: A Conventional Authenticated-Encryption Mode

   The EAX mode ("A Conventional Authenticated-Encryption Mode") was
   designed and documented by M. Bellare, P. Rogaway and D. Wagner. A
   full description of the algorithm can be found in [EAX] and on the
   NIST website.

   The characteristics of the algorithm are:

   1.  The only parameter of the algorithm is a nonce.

   2.  The data actually authenticated is:

       A.  The nonce,

       B.  The authenticated attributes,

       C.  The encrypted message.

   3.  Before the first byte of message data can be processed, you must
       know:

       A.  The nonce value.

   4.  Before the first byte of authenticated data can be processed, you
       must know: nothing.

   This mode computes the authentication value on the authenticated data
   and on the encrypted message separately - so they can be computed in
   any order - and combines the results together after the entire
   message has been processed.

   This algorithm can easily be streamed for the sender. The order of
   generating the authenticated data and message data is immaterial.

   This algorithm can easily be streamed for the recipient. The order
   of presenting the authenticated data and the message data is
   immaterial.

3.5. GCM: Galois/Counter Mode

   The Galois/Counter Mode of Operation (GCM) was designed and
   documented by David McGrew and John Viega. A full description of the
   algorithm can be found on the NIST website.
   GCM is one of the
   standardized NIST modes (see [NIST-800-38D]) and is one of the two
   modes that are currently documented for use with the CMS
   Authenticated-Enveloped structures.

   The characteristics of the algorithm are:

   1.  The parameters of the algorithm are a nonce and the length of the
       tag to be generated.

   2.  The data actually authenticated is:

       A.  The authenticated data,

       B.  The encrypted message data,

       C.  The length of the authenticated data,

       D.  The length of the message data.

   3.  Before the first byte of message data can be processed, you must
       know:

       A.  The nonce value,

       B.  The authenticated data.

   4.  Before the first byte of authenticated data can be processed, you
       must know: nothing.

   This mode can easily be used in a streaming model for senders
   provided the authenticated data is generated prior to the message
   data.

   This mode can easily be used in a streaming model for recipients
   provided that the authenticated data is presented prior to the
   message data.

3.6. IACBC: Integrity Aware Cipher Block Chaining

   Integrity Aware Cipher Block Chaining is an algorithm that supports
   an AE mode of operation, but not an AEAD mode of operation. As such
   it does not matter where the authenticated parameters would be
   placed, as they are not supported by the mode. This mode is
   therefore not of interest to this discussion.

3.7. IAPM: Integrity Aware Parallelizable Mode

   Integrity Aware Parallelizable Mode is an algorithm that supports an
   AE mode of operation, but not an AEAD mode of operation. As such it
   does not matter where the authenticated parameters would be placed,
   as they are not supported by the mode. This mode is therefore not of
   interest to this discussion.

3.8. OCB: Offset Codebook

   Offset Codebook mode is an algorithm that supports an AE mode of
   operation, but not an AEAD mode of operation.
   As such it does not
   matter where the authenticated parameters would be placed, as they
   are not supported by the mode. The mode by itself is therefore not
   of interest to this discussion.

   However, an addendum to the original mode submission described a
   method of adding the AEAD capability to any AE algorithm. This was
   described by Phillip Rogaway in Section 5 of [OCB-AD1] and designated
   as Ciphertext Translation.

   The characteristics of this algorithm are:

   1.  This mode adds no additional parameters to the underlying AE
       algorithm parameters.

   2.  The data actually authenticated is:

       A.  The message data,

       B.  The authenticated data.

   3.  Before the first byte of message data can be processed, you must
       know: the same information as for the AE mode by itself.

   4.  Before the first byte of authenticated data can be processed, you
       must know: nothing.

   It needs to be noted that before one can process the last t bytes of
   the message (for either encryption or decryption) the authenticated
   data must be known. The value t is equal to the length of the output
   of the function used to process the authenticated data. This does
   mean that an indication that one is in the last t bytes of the data
   is needed for both encryption and decryption.

   The sender can operate using a streaming model as long as it buffers
   the last t bytes of message data so that they can be correctly tagged
   and sent to the cryptographic code as needing special processing.
   The authenticated data must be computed prior to the last t bytes of
   the encryption stream being produced. One possible way of dealing
   with this is to make the last t bytes the authentication tag, as
   there is no explicit authentication tag created.

   The recipient can operate using a streaming model as long as it
   buffers the last t bytes of encrypted data so that they can be
   correctly tagged when sent to the cryptographic code.
   As no separate
   authentication tag is created by the algorithm, the authenticated
   attributes must be presented prior to the last bytes of the encrypted
   data stream being decrypted.

3.9. PCFB: Propagating Cipher Feedback

   Propagating Cipher Feedback is an algorithm that supports an AE mode
   of operation, but not an AEAD mode of operation. As such it does not
   matter where the authenticated parameters would be placed, as they
   are not supported by the mode. This mode is therefore not of
   interest to this discussion.

3.10. SIV: Synthetic IV

   The Synthetic IV (SIV) mode was designed and documented by Phillip
   Rogaway and Thomas Shrimpton. A full description of the algorithm
   can be found on the NIST website at [SIV].

   The characteristics of the algorithm are:

   1.  The parameters of the algorithm are:

       A.  None for the sender of the message,

       B.  An IV value for the recipient of the message. (The IV value
           acts as the authentication tag.)

   2.  The data actually authenticated is:

       A.  The authenticated data,

       B.  The message data.

   3.  Before the first byte of message data can be processed, you must
       know:

       A.  The authenticated attributes.

   4.  Before the first byte of authenticated data can be processed, you
       must know: nothing.

   The algorithm does not use a nonce value; instead the IV used for the
   counter mode is computed from the authenticated data and message
   data. The IV is then emitted as the authentication tag. Note that
   this also means that the message data must be processed twice by the
   cryptographic code: once to do the authentication computation and
   produce the IV, and once to do the counter mode encryption.

   This algorithm cannot be streamed by the sender. Since the IV used
   for the counter mode encryption of the message data depends on all of
   the message data, the message data must actually be processed twice
   by the encryption algorithm.
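
   The asymmetry between the sender and the recipient can be sketched
   in Python. This is a toy stand-in and not the real SIV
   construction: HMAC-SHA-256 replaces the authentication pass and a
   SHA-256 based keystream replaces the counter mode encryption, so it
   must not be used for anything but illustration. All function names
   (toy_siv_encrypt, toy_siv_decrypt and the helpers) are hypothetical.
   What the sketch tries to show is only the data flow: the sender
   cannot emit a single byte of ciphertext until every byte of message
   data has been authenticated, while the recipient, given the IV up
   front, can decrypt chunk by chunk in a single pass.

```python
import hashlib
import hmac


def _keystream(enc_key, iv):
    """Toy counter-mode keystream from SHA-256 (stand-in for AES-CTR)."""
    counter = 0
    while True:
        block = hashlib.sha256(
            enc_key + iv + counter.to_bytes(8, "big")).digest()
        yield from block
        counter += 1


def _start_mac(mac_key, auth_data):
    """Begin authentication with the length-prefixed authenticated data."""
    mac = hmac.new(mac_key, digestmod=hashlib.sha256)
    mac.update(len(auth_data).to_bytes(8, "big"))
    mac.update(auth_data)
    return mac


def toy_siv_encrypt(mac_key, enc_key, auth_data, message_chunks):
    """Sender: two passes over the message data."""
    mac = _start_mac(mac_key, auth_data)
    held = []
    for chunk in message_chunks:    # pass 1: authenticate only
        mac.update(chunk)
        held.append(chunk)          # nothing can be emitted yet
    iv = mac.digest()[:16]          # the IV doubles as the tag
    stream = _keystream(enc_key, iv)
    ciphertext = b"".join(          # pass 2: encrypt the held data
        bytes(b ^ k for b, k in zip(chunk, stream)) for chunk in held)
    return iv, ciphertext


def toy_siv_decrypt(mac_key, enc_key, auth_data, iv, cipher_chunks):
    """Recipient: a single pass, with the check made at the very end."""
    mac = _start_mac(mac_key, auth_data)
    stream = _keystream(enc_key, iv)
    plaintext = bytearray()
    for chunk in cipher_chunks:
        plain = bytes(b ^ k for b, k in zip(chunk, stream))
        mac.update(plain)
        plaintext += plain
    valid = hmac.compare_digest(mac.digest()[:16], iv)
    return bytes(plaintext), valid


iv, ct = toy_siv_encrypt(b"mac-key", b"enc-key", b"attrs",
                         [b"hello ", b"world"])
pt, valid = toy_siv_decrypt(b"mac-key", b"enc-key", b"attrs", iv,
                            [ct[:4], ct[4:]])
```

   The list named held in toy_siv_encrypt is the buffering that a
   streaming sender would like to avoid; the recipient needs no such
   list as long as the authenticated attributes and the IV arrive
   before the ciphertext.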

   The algorithm can easily be streamed by the recipient. The
   requirement is that the authenticated attributes and the IV be
   presented to the recipient before the message data is presented. The
   authentication check is then done by comparing the IV passed in with
   the IV computed.

3.11. XCBC: eXtended Cipher Block Chaining Encryption

   eXtended Cipher Block Chaining Encryption is an algorithm that
   supports an AE mode of operation, but not an AEAD mode of operation.
   As such it does not matter where the authenticated parameters would
   be placed, as they are not supported by the mode. This mode is
   therefore not of interest to this discussion.

3.12. MAC-Authenticated Encryption

   The MAC-Authenticated Encryption mode has been documented by Peter
   Gutmann. This mode is documented in [GUTMANN].

   The characteristics of the algorithm are:

   1.  The parameters of the algorithm are:

       A.  A key derivation algorithm,

       B.  A keyed MAC algorithm,

       C.  An encryption algorithm.

   2.  The data actually authenticated is:

       A.  The encrypted message,

       B.  The authenticated attributes.

   3.  Before the first byte of the message data can be processed, you
       must know: nothing.

   4.  Before the first byte of the authenticated data can be processed,
       you must know:

       A.  The encrypted message data.

   This algorithm can easily be used in a streaming model by the sender.

   This algorithm can easily be used in a streaming model by the
   recipient.

   Note: In the series of messages that I exchanged with Peter during
   the design of this algorithm, one of the things he noted was that to
   make streaming easier he should put the authenticated attributes
   after the message data. Thus the algorithm was designed to make sure
   that streaming worked well with the current encoding.

4. My Assumptions

   This section will list the set of criteria that I am using in making
   my conclusions.
Again, the most important thing in my mind is the 731 ability to implement a streaming model for encode and decode 732 operations. 734 1. We want to implement using a single pass streaming module to 735 encode and decode the structures. There are many reasons to do 736 so: 738 1. The amount of resources used is minimized by not buffering 739 the entirety of the message at each level of wrapping. 741 2. The fact that not all messages are DER encode means that 742 there is no single buffer in the original message that can be 743 treated as a single input buffer. 745 3. The message may be feed to the encoder/decode in chunks due 746 to the way things are read from files, the fact that nodes in 747 trees are emitted serially or the fact that removal of MIME 748 content transfer encoding is normally done on small buffers. 750 There is one argument that says one should buffer up the 751 entire encrypted buffer, decrypt in one chunk and then pass on 752 the data in one piece. Since the name of the algorithm class 753 is encrypted and authenticated, one should perhaps actually 754 authenticate that the data is correct prior to releasing the 755 data for additional processing. 757 I believe that it is sufficient to check that the encrypted 758 buffer has been authenticated prior to acting on the data 759 contained in the encrypted buffer. Thus I believe it makes 760 sense to continue doing the decode and either fail on the 761 decode operation and propagate a failure up either when the 762 decode itself fails or when the authentication check is 763 actually made. In this way it is no different than the 764 processing of a signed message where the signature may be 765 checked long after the message has been fully decoded. In 766 fact this is the normal case for an S/MIME client where the 767 content is often viewable with some indication that the 768 validation of the signature failed for some reason. 770 2. 
The relative lengths of the data to be encrypted and the 771 attributes to be protected are such that the encrypted data is 772 generally much larger than the attributes. Thus if one has to 773 cache one in a streaming mode, it is preferable to cache the 774 attributes. 776 5. Conclusions 778 I now look again at the arguments presented in Section 2 and review 779 the arguments presented. All of the opinions in this section are 780 mine and may or may not be represent those of any other people. 781 Section 6 contains the opinions of other people. 783 1. Consistency with the existing CMS data types: 785 This criteria should only be used a tie breaker in the event 786 that all other criteria come out equal. When looking at this 787 argument I am reminded of the following: 789 A foolish consistency is the hobgoblin of little minds, 790 adored by little statesmen and philosophers and divines. 791 (Ralph Waldo Emerson 1841) 793 2. Authenticated attributes that are derived from the message 794 content: 796 This argument is slightly more believable than it was before I 797 began this document as I now have an attribute which is 798 derived from the message content, however this attribute is 799 the length of the message data and in order to be useful it 800 needs to be placed before the message data is consumed. (See 801 Section 3.1.) 803 I found this argument to be difficult to believe at the time 804 it was presented, and I have not changed my mind since then. 805 The argument that this means the authenticated attributes 806 comes second would mean that this is an attribute that is 807 attested to by the sender, but is not verified in any way by 808 the recipient. If the recipient needed to do any processing 809 then it would be much more desirable to have the attribute 810 occur before the message data so that the recipient can setup 811 to do the necessary processing prior to processing the message 812 data. 
      In the process of writing [XOR-HASH] I have become convinced that
      there is a fundamental problem coming in the future with the
      signed data structure. Since the recipient does not "know" the
      correct set of hash algorithms to be used when processing a
      message, the vast majority of recipients look at the list
      presented and then augment it with a number of different
      algorithms. This often means that one is computing four or five
      different hash functions on the content just on the off-chance
      that they may be needed. Many systems will not attempt a recovery
      if they find a signer info structure which uses a hash algorithm
      they did not realize that they needed, even if it is known to the
      system, because of the work involved in doing a restart after
      having parsed in all of the data. This means that similar behavior
      should be expected for any attributes that need to be validated by
      the recipient after having been generated by the sender. The
      problem is worse here, since there is no field similar to the set
      of digest algorithms that can be filled in at the beginning of a
      signed data object.

      I believe that this criterion was misapplied. The issue of how a
      recipient was supposed to deal with these types of attributes was
      completely ignored in the decision process, and it should have had
      paramount importance.

   3. The decision should be dictated by Algorithm Characteristics:

      Looking at the taxonomy of algorithms that is presented in
      Section 3, we come up with the following results:

         The algorithms which cannot be easily streamed are: CCM, SIV
         (for sender)

         The algorithms which need attributes before the message body
         are: CWC (serialized implementation), GCM, SIV (for recipient),
         CCM (for recipient in special circumstances)

         The algorithms which need the message body before the
         attributes are: MAC-Authenticated

         The algorithms which can have either the body or the attributes
         first are: CWC (parallelized implementation), EAX, OCB

      We can see that CCM and SIV will never be easily streamed for the
      sender. It is unfortunate for people wanting to stream that CCM is
      one of the two algorithms that we have standardized on. It should
      be noted that both of these algorithms can be set up to be
      streamed for the recipient of the message, but CCM requires an
      additional restriction to be applied. If either of these
      algorithms is used, then the entire question discussed above about
      a sender processing the content on sending would be academic, as
      the message data needs to be buffered anyway.

      We have only one algorithm where the attributes are logically
      placed after the message data, that being MAC-Authenticated, which
      was explicitly designed to be that way so that it could be
      streamed using the current data layout.

      Additionally, there are two algorithms that are agnostic of the
      order of attributes and data, plus one that can be implemented to
      be agnostic.

      For recipients, only the MAC-Authenticated algorithm necessitates
      that the attributes be cached until the message data has been
      processed. All of the other algorithms can be made to work with
      the attributes preceding the message data without any problems.
      In current practice, and in part because of NIST standardization,
      the only two modes that have significant use are the CCM and GCM
      modes. It is possible that the MAC-Authenticated mode will also
      get traction, since it is easy for people to understand and
      implement. This should also be taken into consideration when
      looking at the algorithm characteristics.

      If we had done this analysis at the time the decision was made,
      then we should have made the decision to place the attributes
      first.

   4. Resource requirements for the sender and recipient:

      It is no more likely that the sender of a message is resource
      constrained than that the recipient of the message is resource
      constrained. This means that it is better to choose a set of
      algorithms and a layout that will work well in a streaming model
      under normal circumstances than to optimize for either the sender
      or the recipient.

   5. Relative frequency of processing:

      In my opinion, most of the time messages that are created using an
      authenticated encryption algorithm will be decrypted by at least
      one recipient. Messages which are not decrypted will exist, either
      from being lost in the ether or from being cached until needed,
      but these will be the smallest part of the set. Messages which
      need to be decrypted multiple times by a single recipient will
      generally be a small number as well, unless it becomes part of the
      S/MIME standard. However, I believe that a significant number of
      messages will be created that will have multiple recipients. This
      may be done by creating multiple lock boxes up front, or by
      creating the lock boxes on demand in cases where it does not
      matter that traffic analysis can reveal that multiple recipients
      have gotten the same message. (An example of this might be sending
      a firmware upgrade to multiple devices, where the message is
      transferred on demand and it does not matter that an observer can
      see that the same set of firmware is being installed on multiple
      machines. This would be something that could probably be assumed
      anyway.)

      I therefore think that overall more messages will be decoded and
      decrypted than encrypted and encoded. This would mean that
      decisions should be biased toward the recipients of messages
      rather than the sender of messages.

   Based on the above, I would say that we should modify the order of
   these fields in the event that the document is updated.

6. Responses

   An opportunity was provided to Russ Housley, as the author of
   [CMS-AED], and to others that were involved on the mailing list to
   provide a formal response. Nobody took advantage of the offer.

7. Security Considerations

   This document discusses a security-related document; however, it
   makes no changes to that document. As such, there are no actual
   security implications for this document.

8. IANA Considerations

   No action by IANA is required for this document.

9. References

9.1. Normative References

   [RFC3610]  Whiting, D., Housley, R., and N. Ferguson, "Counter with
              CBC-MAC (CCM)", RFC 3610, September 2003.

   [CMS]      Housley, R., "Cryptographic Message Syntax (CMS)",
              RFC 5652, September 2009.

   [CMS-AED]  Housley, R., "Cryptographic Message Syntax (CMS)
              Authenticated-Enveloped-Data Content Type", RFC 5083,
              November 2007.

   [GUTMANN]  Gutmann, P., "Using MAC-authenticated Encryption in the
              Cryptographic Message Syntax (CMS)".

   [NIST-800-38C]
              Dworkin, M., "Recommendation for Block Cipher Modes of
              Operation: The CCM Mode for Authentication and
              Confidentiality", NIST Special Publication 800-38C,
              May 2004.

   [NIST-800-38D]
              Dworkin, M., "Recommendation for Block Cipher Modes of
              Operation: Galois/Counter Mode (GCM) and GMAC", NIST
              Special Publication 800-38D, November 2007.

   [CWC]      Kohno, T., Viega, J., and D. Whiting, "The CWC
              authenticated encryption (associated data) mode",
              May 2003.

   [EAX]      Bellare, M., Rogaway, P., and D. Wagner, "EAX: A
              Conventional Authenticated-Encryption Mode", 2003.

   [OCB-AD1]  Rogaway, P., "The Associated-Data Problem", November 2001.

   [SIV]      Rogaway, P. and T. Shrimpton, "The SIV Mode of Operation
              for Deterministic Authenticated-Encryption (Key Wrap) and
              Misuse-Resistant Nonce-Based Authenticated-Encryption",
              August 2007.

9.2. Informative References

   [XOR-HASH]
              Schaad, J., "Experiment: Hash functions with parameters in
              CMS and S/MIME", draft-schaad-smime-hash-experiment-06
              (work in progress), January 2011.

Author's Address

   Jim Schaad
   Soaring Hawk Consulting

   Email: jimsch@augustcellars.com