idnits 2.17.1

draft-ford-openpgp-format-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 112: '...n AEAD algorithm MUST use the AEAD Pro...'
     RFC 2119 keyword, line 114: '...n AEAD algorithm MUST NOT use the AEAD...'
     RFC 2119 keyword, line 145: '... schemes MAY expand the size of the ...'
     RFC 2119 keyword, line 146: '...r padding), but if so, MUST enable the...'
     RFC 2119 keyword, line 153: '... Packet (Tag 19) MUST NOT be used. (D...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 19, 2015) is 3105 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'SHA3' is mentioned on line 196, but not defined

  == Missing Reference: 'CAESAR' is mentioned on line 211, but not defined

  -- Possible downref: Non-RFC (?) normative reference: ref. 'GCM'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'MONKEY'

  ** Obsolete normative reference: RFC 7539 (Obsoleted by RFC 8439)

     Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  OpenPGP Working Group                                            B. Ford
3  Internet-Draft                                                      EPFL
4  Intended status: Standards Track                        October 19, 2015
5  Expires: April 21, 2016

7                 Modernizing the OpenPGP Message Format
8                      draft-ford-openpgp-format-00

10  Abstract

12     This draft proposes and solicits discussion on methods of modernizing
13     OpenPGP's encrypted message format to support more state-of-the-art
14     authenticated encryption schemes, and optionally to protect format
15     metadata as well as data via metadata encryption and judicious
16     padding.

18  Status of This Memo

20     This Internet-Draft is submitted in full conformance with the
21     provisions of BCP 78 and BCP 79.

23     Internet-Drafts are working documents of the Internet Engineering
24     Task Force (IETF). Note that other groups may also distribute
25     working documents as Internet-Drafts. The list of current Internet-
26     Drafts is at http://datatracker.ietf.org/drafts/current/.

28     Internet-Drafts are draft documents valid for a maximum of six months
29     and may be updated, replaced, or obsoleted by other documents at any
30     time. It is inappropriate to use Internet-Drafts as reference
31     material or to cite them other than as "work in progress."

33     This Internet-Draft will expire on April 21, 2016.

35  Copyright Notice

37     Copyright (c) 2015 IETF Trust and the persons identified as the
38     document authors. All rights reserved.

40     This document is subject to BCP 78 and the IETF Trust's Legal
41     Provisions Relating to IETF Documents
42     (http://trustee.ietf.org/license-info) in effect on the date of
43     publication of this document. Please review these documents
44     carefully, as they describe your rights and restrictions with respect
45     to this document. Code Components extracted from this document must
46     include Simplified BSD License text as described in Section 4.e of
47     the Trust Legal Provisions and are provided without warranty as
48     described in the Simplified BSD License.

50  Table of Contents

52     1. Overview and Rationale
53     2. Adopting Authenticated Encryption Schemes
54       2.1. AEAD Protected Data Packet
55       2.2. Concrete AEAD Schemes
56         2.2.1. AES-GCM
57         2.2.2. ChaCha20-Poly1305
58         2.2.3. Keccak-based sponge scheme
59         2.2.4. Future: CAESAR competition winner
60       2.3. Metadata Leakage-Hardening the OpenPGP Format
61         2.3.1. Encrypting file format metadata
62         2.3.2. Intelligent padding to minimize size-based leakage
63     3. Security Considerations
64     4. References
65       4.1. Normative References
66       4.2. Informative References
67     Author's Address

69  1. Overview and Rationale

71     The current OpenPGP message format [RFC4880] has evolved periodically
72     to support new cryptographic algorithms, but its structure embodies
73     assumptions and imposes limitations that are overdue for
74     reconsideration and modernization based on today's cryptographic
75     best-practices and evolved threat models. This draft proposes, as a
76     starting point for discussion, several of these issues and potential
77     approaches to modernizing the OpenPGP format to address them.

79  2. Adopting Authenticated Encryption Schemes

81     The OpenPGP format currently handles symmetric-key encryption and
82     integrity services as separate, orthogonal mechanisms. It is now
83     widely accepted in the cryptographic community, however, that it is
84     often advantageous to both performance and security to roll
85     symmetric-key encryption and integrity protection together into a
86     single cryptographic abstraction, now commonly known as Authenticated
87     Encryption with Associated Data or AEAD. An increasingly rich body
88     of AEAD schemes is now available that considerably reduce computation
89     cost with respect to the traditional approach of applying encryption
90     and hash-based integrity protection as separate, orthogonal steps.

92     Enhancing the OpenPGP format to support AEAD schemes will involve two
93     main updates to the OpenPGP format specification: (1) defining a
94     suitable AEAD-based alternative encoding for the current
95     Symmetrically Encrypted Integrity Protected Data packet (Tag 18,
96     section 5.13 of [RFC4880]); and (2) defining at least one concrete
97     AEAD scheme usable in this new AEAD Protected Data packet format.

99     The sections below propose a first-cut AEAD Protected Data packet
100     format, then briefly point out several possible AEAD algorithms for
101     consideration.

103  2.1. AEAD Protected Data Packet

105     The proposed AEAD Protected Data packet (tentatively Tag 20) contains
106     data that is both encrypted and integrity-protected by a single AEAD
107     algorithm, defined by the selected symmetric-key cipher. Symmetric-
108     key AEAD algorithms occupy the same identifier space as traditional
109     symmetric ciphers such as IDEA and Twofish (listed in section 9.2 of
110     [RFC4880]), but require the use of AEAD Protected Data packets
111     exclusively. That is, an OpenPGP message whose selected symmetric-
112     key algorithm is an AEAD algorithm MUST use the AEAD Protected Data
113     packet, while a message whose selected symmetric-key algorithm is not
114     an AEAD algorithm MUST NOT use the AEAD Protected Data packet.

116     AEAD algorithms in general take as inputs: (1) a symmetric secret
117     key, (2) a nonce or IV whose presence and size depend on the
118     algorithm, (3) a variable-length body to be encrypted and integrity
119     protected, and (4) an "additional data" (or "associated data") field
120     to be integrity protected but not encrypted (often used for header
121     and/or trailer metadata). The AEAD algorithm produces: (1) a
122     ciphertext containing the encrypted content of the variable-length
123     body, and (2) a fixed-length authenticator protecting the integrity
124     of both the encrypted body and the additional data.

126     In OpenPGP's specific use of an AEAD algorithm, the symmetric secret
127     key input is defined by the OpenPGP Session Key as conveyed in the
128     message's public-key and/or symmetric-key ESK packets.

130     In the context of OpenPGP, there is no clear need for the "additional
131     data" feature of AEAD schemes (in contrast with the uses of AEAD
132     schemes to encrypt packets or datagrams), so we tentatively propose
133     that the "additional data" field always be considered to be empty (0
134     bytes) in the context of OpenPGP. (DISCUSS: are we missing potential
135     uses of this that might warrant inclusion of some field or extension
136     allowing the AD part to be nonempty?)
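
As a rough illustration of the interface described above, the sketch below models the four inputs and two outputs in Python, with the additional data left empty by default. The construction itself (a SHAKE-256 keystream combined with HMAC-SHA-256) and the names toy_seal, toy_open, NONCE_LEN and TAG_LEN are assumptions made only so that the example runs with the standard library. It is a toy stand-in, not one of the schemes considered in this draft, and it should not be used to protect real data.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # assumption: nonce size chosen arbitrarily for this toy
TAG_LEN = 32    # HMAC-SHA-256 yields a fixed 32-byte authenticator


def _subkey(key: bytes, label: bytes) -> bytes:
    """Derive separate encryption and MAC subkeys from the session key."""
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand (key, nonce) into `length` pseudorandom bytes with SHAKE-256."""
    return hashlib.shake_256(_subkey(key, b"enc") + nonce).digest(length)


def _tag(key: bytes, nonce: bytes, ciphertext: bytes, ad: bytes) -> bytes:
    # Length-prefix the additional data so its boundary is unambiguous.
    message = len(ad).to_bytes(8, "big") + ad + nonce + ciphertext
    return hmac.new(_subkey(key, b"mac"), message, hashlib.sha256).digest()


def toy_seal(key: bytes, nonce: bytes, body: bytes, additional_data: bytes = b""):
    """Inputs: key, nonce, body, additional data. Outputs: ciphertext, tag."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be NONCE_LEN bytes")
    stream = _keystream(key, nonce, len(body))
    ciphertext = bytes(b ^ s for b, s in zip(body, stream))
    return ciphertext, _tag(key, nonce, ciphertext, additional_data)


def toy_open(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes,
             additional_data: bytes = b"") -> bytes:
    """Verify the authenticator first, then decrypt; raise on any mismatch."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be NONCE_LEN bytes")
    expected = _tag(key, nonce, ciphertext, additional_data)
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


if __name__ == "__main__":
    session_key = os.urandom(32)   # stands in for the OpenPGP Session Key
    nonce = os.urandom(NONCE_LEN)
    ct, tag = toy_seal(session_key, nonce, b"hello OpenPGP")  # empty AD
    assert toy_open(session_key, nonce, ct, tag) == b"hello OpenPGP"
```

The authenticator here covers the nonce, the ciphertext and the (empty) additional data, so flipping any bit of the ciphertext makes toy_open raise instead of returning altered plaintext.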

138     The AEAD Protected Data packet in an OpenPGP message contains the
139     following octet sequences, directly concatenated: (1) the nonce or IV
140     required by the AEAD algorithm, if any, encoded as a fixed-length
141     header whose size is determined by the symmetric-key scheme; (2) the
142     variable-length ciphertext representing the AEAD-encrypted body; and
143     (3) the authenticator, as a fixed-length trailer whose size is
144     determined by the symmetric-key scheme. Note that symmetric-key AEAD
145     schemes MAY expand the size of the body during encryption (e.g., due
146     to internal metadata and/or padding), but if so, MUST enable the
147     decrypting side to determine the true size of the original variable-
148     length cleartext (e.g., by including any necessary metadata within
149     the encrypted ciphertext to indicate how much padding was added to
150     the plaintext before encryption).

152     When an AEAD symmetric-key cipher is used, the Modification Detection
153     Code Packet (Tag 19) MUST NOT be used. (DISCUSS: can anyone identify
154     any benefit to placing the authenticator in a separate packet? The
155     justification for the current proposal is that in all the AEAD
156     schemes I'm aware of the authenticator is fixed-size and thus has no
157     need for additional size metadata.)

159     DISCUSS: in nonce-based AEAD schemes, the nonce is technically not
160     needed (or can be taken to be all 0's) if the symmetric key is used
161     only once, which is likely to be at least the common case for OpenPGP
162     where the AEAD symmetric key is the one-time session key. Thus, we
163     could save the size of the nonce provided there is only ever at most
164     one encrypted data packet. The downside is the risk of a security
165     disaster if any implementation ever (incorrectly) produces multiple
166     AEAD Protected Data packets using the same key.

168  2.2. Concrete AEAD Schemes

170     The proposed AEAD enhancement will require the definition of at least
171     one and perhaps multiple concrete AEAD schemes to be specified for
172     use with OpenPGP. We propose the following choices as starting
173     points for discussion, deferring for now the instantiation details of
174     each:

176  2.2.1. AES-GCM

178     The AES cipher operated in Galois/Counter Mode (GCM) [GCM] has become
179     a well-accepted AEAD scheme used in other Internet standards
180     [RFC5288][RFC4106] and has no known serious cryptographic weaknesses.
181     Thus, AES-GCM is likely to be a reasonable choice for inclusion in an
182     AEAD extension to OpenPGP, even if it does not necessarily
183     represent the current state-of-the-art in performance or security.

185  2.2.2. ChaCha20-Poly1305

187     The ChaCha20 stream cipher used with the Poly1305 authenticator
188     [RFC7539] has gained considerable traction as a practical alternative
189     to AES-GCM. It provides high performance, especially in tuned software
190     implementations, is believed to offer security comparable to or better
191     than AES-GCM, and is based on contrasting cryptographic foundations.

193  2.2.3. Keccak-based sponge scheme

195     The Keccak sponge function forms the cryptographic core of the
196     recently-standardized SHA-3 family of hash algorithms [SHA3]. As a
197     sponge construction, Keccak offers an attractive basis for AEAD
198     schemes because the sponge construction can concurrently "absorb" bits
199     for integrity protection and "produce" pseudorandom bits for
200     encryption, while adding no significant overhead above the cost of
201     one or the other. As SHA-3 has received substantial public attention
202     and cryptanalysis, it represents a safe choice from a security
203     perspective, and is based on substantially different cryptographic
204     foundations from either of the above choices, offering further
205     diversity. A particular Keccak-based AEAD construction would need to
206     be selected, such as the well-known MonkeyDuplex [MONKEY] among other
207     reasonable choices.

209  2.2.4. Future: CAESAR competition winner

211     The CAESAR competition [CAESAR] is in the process of selecting a new
212     AEAD scheme for public recognition. The winner will not
213     automatically become a formal standard per se but may become a "de
214     facto" standard due to the extensive public cryptanalysis all the
215     competitors are currently undergoing, and as such will represent an
216     obvious potential choice for future standardization in an AEAD-
217     enhanced OpenPGP message format.

219  2.3. Metadata Leakage-Hardening the OpenPGP Format

221     The current OpenPGP format encodes a considerable amount of metadata
222     about an OpenPGP-encrypted file "in the clear": for
223     example, (1) the fact that it is an OpenPGP-encrypted file, (2)
224     exactly which public-key and/or symmetric-key algorithms the file is
225     encrypted with, (3) whether or not the file can be decrypted with a
226     passphrase, (4) whether or not the file can be decrypted with a
227     public/private keypair, and if so how many distinct keypairs can be
228     used to decrypt the file, and (5) the length of the encrypted
229     message. See [METADATA] for an illustration of this metadata.

231     While this unencrypted metadata was not thought to be privacy-
232     sensitive when the OpenPGP format was first designed, the evolution
233     of today's threats has called this assumption into question. For
234     example, the very existence on a hard drive of a file that is readily
235     identifiable as OpenPGP-encrypted can arouse suspicion and has been
236     known to lead airport, border-control, and other authorities of some
237     countries to demand passwords or decryption keys under threat of
238     incarceration even if it is not clear that the holder of the device
239     is in possession of the necessary decryption keys. Furthermore, as
240     the state-of-the-art in cryptanalysis and brute-force attacks
241     gradually overtakes the security of older cryptographic schemes, the
242     existence of a cryptographic scheme identifier in cleartext
243     effectively acts as a "crack me!" flag, making it unnecessarily easy
244     for an attacker to invest computational resources selectively into
245     cracking ciphertexts known to use weak cryptographic schemes, while
246     avoiding wasting compute resources attempting to crack ciphertexts
247     encrypted under stronger schemes. The number of distinct public keys
248     that can decrypt a file can serve to identify the group of people for
249     which the file was encrypted. Finally, even the file's length can
250     represent sensitive, possibly incriminating information especially in
251     known-plaintext situations, e.g., when an attacker suspects but
252     cannot otherwise prove that an OpenPGP file on a suspected
253     whistleblower's or dissident's hard disk is an encryption of a
254     particular document.

256     We therefore suggest for discussion two possible measures for the
257     further evolution of the OpenPGP format to reduce this metadata
258     leakage: encrypted metadata, and optional length padding.

260  2.3.1. Encrypting file format metadata

262     OpenPGP's current format makes the decryption process "easy" in the
263     sense that it is immediately clear to the decryptor which
264     cryptographic algorithms should be used to decrypt the file, at the
265     cost of the metadata leaks above. It is readily feasible to define a
266     new OpenPGP format in which no metadata is left unencrypted, leaving
267     the encrypted file's contents appearing to be a "Uniformly Random
268     Blob" or URB.

270     The obvious challenge such a format change presents is that the
271     decryptor will not know a priori which encryption scheme(s) were used
272     to encrypt a particular file, and hence would simply have to try in
273     turn each of the schemes it supports. For files protected only by a
274     single passphrase, implementing full metadata protection in this
275     fashion is straightforward. While it may seem likely to incur
276     significant cost, note that the decryptor need not attempt to decrypt
277     the entire file using each scheme, but only a short header portion,
278     before either successfully identifying the scheme in use or giving
279     up (if the passphrase is wrong and/or the scheme is unsupported).

281     A fully-encrypted-metadata format is more challenging in the general
282     case of files encrypted using a combination of one or more
283     passphrases and/or one or more public keypairs, but it is still
284     readily feasible, and can be designed so that the decryptor performs
285     only one expensive "trial" public-key operation per scheme (not per
286     key) on a file encrypted with any number of symmetric and/or public
287     keys. Details will be expanded on if the WG decides this is a
288     direction potentially worth pursuing.

290  2.3.2. Intelligent padding to minimize size-based leakage

292     Even if the directly-encoded metadata of an OpenPGP file is encrypted
293     as discussed above, the file's mere length can still represent
294     significant leakage, likely immediately revealing the existence of a
295     known plaintext on a hard drive for example. The only "perfect"
296     solution from a security perspective - padding all encrypted files
297     to a common length - is obviously impractical from an efficiency
298     perspective.

300     A second, more practical but still costly choice would be to pad
301     files to (for example) the next power-of-two in size. This reduces
302     the maximum possible information leakage from an N-byte file from
303     O(log N) to O(log log N) bits, but the up-to-100% expansion factor
304     (50% expansion on average) is significant and likely to be a
305     considerable deterrent against use.
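
The next-power-of-two rule can be sketched in a few lines of Python. The function names are hypothetical, and the example only computes padded lengths; it does not define how padding bytes would be encoded in a packet. Counting the distinct padded sizes shows where the O(log log N) figure comes from: files of up to N bytes map to only about log2(N) possible sizes, so the observed length reveals about log2(log2(N)) bits.

```python
import math


def pad_to_power_of_two(length: int) -> int:
    """Return the smallest power of two that is >= length (length >= 1)."""
    if length < 1:
        raise ValueError("length must be positive")
    return 1 << (length - 1).bit_length()


def distinct_padded_sizes(max_length: int) -> int:
    """Count the padded sizes an observer could see for 1..max_length bytes."""
    return len({pad_to_power_of_two(n) for n in range(1, max_length + 1)})


if __name__ == "__main__":
    for n in (513, 700, 1024):
        padded = pad_to_power_of_two(n)
        overhead = (padded - n) / n
        print(f"{n} bytes -> {padded} bytes ({overhead:.1%} expansion)")

    limit = 1024
    sizes = distinct_padded_sizes(limit)
    # Unpadded, the exact length can carry up to log2(limit) bits; padded,
    # an observer only learns which of `sizes` buckets the file fell into.
    print(f"unpadded: {math.log2(limit):.1f} bits, "
          f"padded: {math.log2(sizes):.1f} bits")
```

A file one byte larger than a power of two is the worst case: it nearly doubles in size, which is the up-to-100% expansion noted above.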

307     A better choice would be to use a slightly more sophisticated padding
308     scheme, which pads any encrypted file into "size buckets" chosen to
309     limit maximum information leakage to O(log log N) - asymptotically
310     equivalent to the simple next-power-of-two scheme - while ensuring
311     that no file incurs more than about a 10% expansion and large files
312     incur progressively smaller expansion factors (e.g., no more than 3%
313     for files 1MB or larger). Details of this scheme will be expanded if
314     the WG deems this direction potentially worth pursuing.

316     In combination with the above encrypted-metadata techniques, the
317     resulting benefit is that (new) OpenPGP-encrypted messages or files
318     would be substantially more "anonymous" than they are now, at least
319     within the set of plaintexts whose ciphertext lengths fall into one
320     of these padded "size buckets." Furthermore, since the padding
321     scheme need not be specific to OpenPGP, metadata-protected,
322     encrypted files produced by any application designed to use the
323     same padding scheme would be cryptographically indistinguishable
324     from others in the same "size bucket" across every application
325     supporting a compatible padding scheme.
326     Thus, the resulting "Padded Uniform Random Blobs" or PURBs
327     could eventually provide metadata protection and some level of
328     "encrypted file anonymity" not only within the context of one
329     application (e.g., OpenPGP) but across different applications that
330     produce PURBs in quite different ways.

332  3. Security Considerations

334     No new security considerations (beyond those that already apply to
335     OpenPGP's existing message format) have been identified so far, but
336     likely will be.

338  4. References

340  4.1. Normative References

342     [GCM]      Dworkin, M., "Recommendation for Block Cipher Modes of
343                Operation: Galois/Counter Mode (GCM) and GMAC", NIST
344                Special Publication 800-38D, November 2007.

348     [MONKEY]   Bertoni, G., Daemen, J., Peeters, M., and G. Van Assche,
349                "Permutation-based encryption, authentication and
350                authenticated encryption", Directions in Authenticated
351                Ciphers 2012, August 2015.

354     [RFC4106]  Viega, J. and D. McGrew, "The Use of Galois/Counter Mode
355                (GCM) in IPsec Encapsulating Security Payload (ESP)", RFC
356                4106, DOI 10.17487/RFC4106, June 2005.

359     [RFC4880]  Callas, J., Donnerhacke, L., Finney, H., Shaw, D., and R.
360                Thayer, "OpenPGP Message Format", RFC 4880, DOI 10.17487/
361                RFC4880, November 2007.

364     [RFC5288]  Salowey, J., Choudhury, A., and D. McGrew, "AES Galois
365                Counter Mode (GCM) Cipher Suites for TLS", RFC 5288, DOI
366                10.17487/RFC5288, August 2008.

369     [RFC7539]  Nir, Y. and A. Langley, "ChaCha20 and Poly1305 for IETF
370                Protocols", RFC 7539, DOI 10.17487/RFC7539, May 2015.

373  4.2. Informative References

375     [METADATA]
376                Underwood, M., "The information leaked from a gpg
377                encrypted file.", October 2015.

381  Author's Address

382     Bryan Ford
383     EPFL
384     BC 210, Station 14
385     Lausanne CH-1015
386     Switzerland

388     Phone: +41 21 693 28 73
389     Email: bryan.ford@epfl.ch