Network Working Group                                             Y. Nir
Internet-Draft                                               Check Point
Intended status: Informational                          January 27, 2014
Expires: July 31, 2014


                ChaCha20 and Poly1305 for IETF protocols
                  draft-nir-cfrg-chacha20-poly1305-00

Abstract

   This document defines the ChaCha20 stream cipher, as well as the use
   of the Poly1305 authenticator, both as stand-alone algorithms, and as
   a "combined mode", or Authenticated Encryption with Additional Data
   (AEAD) algorithm.

   This document does not introduce any new crypto, but is meant to
   serve as a stable reference and an implementation guide.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 31, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Table of Contents

   1.  Introduction
     1.1.  Conventions Used in This Document
   2.  The Algorithms
     2.1.  The ChaCha Quarter Round
       2.1.1.  Test Vector for the ChaCha Quarter Round
     2.2.  A Quarter Round on the ChaCha State
       2.2.1.  Test Vector for the Quarter Round on the ChaCha state
     2.3.  The ChaCha20 block Function
       2.3.1.  Test Vector for the ChaCha20 Block Function
     2.4.  The ChaCha20 encryption algorithm
       2.4.1.  Example and Test Vector for the ChaCha20 Cipher
     2.5.  The Poly1305 algorithm
       2.5.1.  Poly1305 Example and Test Vector
     2.6.  Generating the Poly1305 key using ChaCha20
     2.7.  AEAD Construction
       2.7.1.  Example and Test Vector for AEAD_CHACHA20-POLY1305
   3.  Implementation Advice
   4.  Security Considerations
   5.  IANA Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   The Advanced Encryption Standard (AES - [FIPS-197]) has become the
   gold standard in encryption.  Its efficient design, wide
   implementation, and hardware support allow for high performance in
   many areas.  On most modern platforms, AES is anywhere from 4x to 10x
   as fast as the previous most-used cipher, 3-key Data Encryption
   Standard (3DES - [FIPS-46]), which makes it not only the best choice,
   but the only choice.

   The problem is that if future advances in cryptanalysis reveal a
   weakness in AES, users will be in an unenviable position.  With the
   only other widely supported cipher being the much slower 3DES, it is
   not feasible to re-configure implementations to use 3DES.
   [standby-cipher] describes this issue and the need for a standby
   cipher in greater detail.

   This document defines such a standby cipher.  We use ChaCha20
   ([chacha]) with or without the Poly1305 ([poly1305]) authenticator.
   These algorithms are not just fast and secure.  They are fast even in
   software-only C-language implementations, allowing for much quicker
   deployment when compared with algorithms such as AES that are
   significantly accelerated by hardware implementations.

   This document does not introduce these algorithms.  They have been
   defined in scientific papers by D. J. Bernstein, which are
   referenced by this document.
   The purpose of this document is to serve as a stable reference for
   IETF documents making use of these algorithms.

1.1.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The description of the ChaCha algorithm will at various times refer
   to the ChaCha state as a "vector" or as a "matrix".  This follows the
   use of these terms in DJB's paper.  The matrix notation is more
   visually convenient, and gives a better notion as to why some rounds
   are called "column rounds" while others are called "diagonal rounds".
   Here's a diagram of how matrices relate to vectors (using the C
   language convention of zero being the index origin).

       0   1   2   3
       4   5   6   7
       8   9  10  11
      12  13  14  15

   The elements in this vector or matrix are 32-bit unsigned integers.

   The algorithm name is "ChaCha".  "ChaCha20" is a specific instance
   where 20 "rounds" (or 80 quarter rounds - see Section 2.1) are used.
   Other variations are defined, with 8 or 12 rounds, but in this
   document we only describe the 20-round ChaCha, so the names "ChaCha"
   and "ChaCha20" will be used interchangeably.

2.  The Algorithms

   The subsections below describe the algorithms used and the AEAD
   construction.

2.1.  The ChaCha Quarter Round

   The basic operation of the ChaCha algorithm is the quarter round.  It
   operates on four 32-bit unsigned integers, denoted a, b, c, and d.
   The operation is as follows (in C-like notation):
   o  a += b; d ^= a; d <<<= 16;
   o  c += d; b ^= c; b <<<= 12;
   o  a += b; d ^= a; d <<<= 8;
   o  c += d; b ^= c; b <<<= 7;
   Where "+" denotes integer addition modulo 2^32 (the carry out of the
   top bit is discarded), "^" denotes a bitwise XOR, and "<<< n" denotes
   an n-bit left rotation (towards the high bits).
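
   The quarter round above can be sketched in C as follows.  This is a
   minimal illustration, not a reference implementation: the function
   names rotl32 and chacha_quarter_round are hypothetical and do not
   come from the referenced papers, and the sketch assumes that
   uint32_t arithmetic wraps modulo 2^32, which standard C guarantees
   for unsigned types.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/*
 * One ChaCha quarter round on the four words a, b, c and d,
 * which are updated in place.  (Hypothetical helper name.)
 */
void chacha_quarter_round(uint32_t *a, uint32_t *b,
                          uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}
```

   Passing pointers lets the same helper be applied either to four
   stand-alone words or to four elements of a 16-word state array.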

   For example, let's see the add, XOR and rotation operations from the
   first line with sample numbers:
   o  b = 0x01020304
   o  a = 0x11111111
   o  d = 0x01234567
   o  a = a + b = 0x11111111 + 0x01020304 = 0x12131415
   o  d = d ^ a = 0x01234567 ^ 0x12131415 = 0x13305172
   o  d = d<<<16 = 0x51721330

2.1.1.  Test Vector for the ChaCha Quarter Round

   For a test vector, we will use the same numbers as in the example,
   adding something random for c.
   o  a = 0x11111111
   o  b = 0x01020304
   o  c = 0x9b8d6f43
   o  d = 0x01234567

   After running a Quarter Round on these 4 numbers, we get these:

   o  a = 0xea2a92f4
   o  b = 0xcb1cf8ce
   o  c = 0x4581472e
   o  d = 0x5881c4bb

2.2.  A Quarter Round on the ChaCha State

   The ChaCha state does not have 4 integer numbers, but 16.  So the
   quarter round operation works on only 4 of them - hence the name.
   Each quarter round operates on 4 pre-determined numbers in the ChaCha
   state.  We will denote by QUARTERROUND(x,y,z,w) a quarter-round
   operation on the numbers at indexes x, y, z, and w of the ChaCha
   state when viewed as a vector.  For example, if we apply
   QUARTERROUND(1,5,9,13) to a state, this means running the quarter
   round operation on the elements marked with an asterisk, while
   leaving the others alone:

       0  *a   2   3
       4  *b   6   7
       8  *c  10  11
      12  *d  14  15

   Note that this run of quarter round is part of what is called a
   "column round".

2.2.1.  Test Vector for the Quarter Round on the ChaCha state

   For a test vector, we will use a ChaCha state that was generated
   randomly:

   Sample ChaCha State

       879531e0  c5ecf37d  516461b1  c9a62f8a
       44c20ef3  3390af7f  d9fc690b  2a5f714c
       53372767  b00a5631  974c541a  359e9963
       5c971061  3d631689  2098d9d6  91dbd320

   We will apply the QUARTERROUND(2,7,8,13) operation to this state.
   For obvious reasons, this one is part of what is called a "diagonal
   round":

   After applying QUARTERROUND(2,7,8,13)

       879531e0  c5ecf37d  bdb886dc  c9a62f8a
       44c20ef3  3390af7f  d9fc690b  cfacafd2
       e46bea80  b00a5631  974c541a  359e9963
       5c971061  ccc07c79  2098d9d6  91dbd320

   Note that only the numbers in positions 2, 7, 8, and 13 changed.

2.3.  The ChaCha20 block Function

   The ChaCha block function transforms a ChaCha state by running
   multiple quarter rounds.

   The inputs to ChaCha20 are:
   o  A 256-bit key, treated as a concatenation of 8 32-bit little-
      endian integers.
   o  A 32-bit sender ID, treated as a little-endian integer.
   o  A 64-bit nonce, treated as a concatenation of 2 32-bit little-
      endian integers.
   o  A 32-bit block count parameter, treated as a 32-bit little-endian
      integer.

   The output is 64 random-looking bytes.

   The ChaCha algorithm described here uses a 256-bit key.  The original
   algorithm also specified 128-bit keys and 8- and 12-round variants,
   but these are out of scope for this document.  In this section we
   describe the ChaCha block function.

   The ChaCha20 state is initialized as follows:
   o  The first 4 words (0-3) are constants: 0x61707865, 0x3320646e,
      0x79622d32, 0x6b206574.
   o  The next 8 words (4-11) are taken from the 256-bit key by reading
      the bytes in little-endian order, in 4-byte chunks.
   o  Word 12 is a block counter.  Since each block is 64 bytes long, a
      32-bit word is enough for 256 Gigabytes of data.
   o  Word 13 is called Sender ID, or SID.  Note that in the original
      ChaCha20 word 13 is also part of the block count, in case the
      encrypted data exceeds 256 gigabytes.  That is not necessary for
      protocols such as TLS, IPsec, or SSH.
   o  Words 14-15 are a nonce, which should not be repeated for the same
      combination of key and sender ID.
      Word 14 is the least significant 32 bits of the input nonce
      (nonce & 0xffffffff), while word 15 is the most significant 32
      bits (nonce >> 32).

   ChaCha20 runs 20 rounds, alternating between "column" and "diagonal"
   rounds.  Each round consists of 4 quarter rounds, and they are run as
   follows.  Quarter rounds 1-4 make up a "column" round, while quarter
   rounds 5-8 make up a "diagonal" round:
   1.  QUARTERROUND ( 0, 4, 8,12)
   2.  QUARTERROUND ( 1, 5, 9,13)
   3.  QUARTERROUND ( 2, 6,10,14)
   4.  QUARTERROUND ( 3, 7,11,15)
   5.  QUARTERROUND ( 0, 5,10,15)
   6.  QUARTERROUND ( 1, 6,11,12)
   7.  QUARTERROUND ( 2, 7, 8,13)
   8.  QUARTERROUND ( 3, 4, 9,14)

   At the end of 20 rounds, the original input words are added to the
   output words, and the result is serialized by sequencing the words
   one-by-one in little-endian order.

2.3.1.  Test Vector for the ChaCha20 Block Function

   For a test vector, we will use the following inputs to the ChaCha20
   block function:
   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.  The key is a sequence of
      octets with no particular structure before we copy it into the
      ChaCha state.
   o  Nonce = 74 (00:00:00:00:00:00:00:4a)
   o  Sender ID = 00:00:00:09.  Usually this will be all zeros, but
      we've set it to non-zero here so it will be visually conspicuous.
   o  Block Count = 1.

   After setting up the ChaCha state, it looks like this:

   ChaCha State with the key set up.

       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  09000000  4a000000  00000000

   After running 20 rounds (10 column rounds interleaved with 10
   diagonal rounds), the ChaCha state looks like this:

   ChaCha State after 20 rounds

       837778ab  e238d763  a67ae21e  5950bb2f
       c4f2d0c7  fc62bb2f  8fa018fc  3f5ec7b7
       335271c2  f29489f3  eabda8fc  82e46ebd
       d19c12b4  b04e16de  9e83d0cb  4e3c50a2

   Finally we add the original state to the result (simple vector or
   matrix addition), giving this:

   ChaCha State at the end of the ChaCha20 operation

       e4e7f110  15593bd1  1fdd0f50  c47120a3
       c7f4d1c7  0368c033  9aaa2204  4e6cd4c3
       466482d2  09aa9f07  05d7c214  a2028bd9
       d19c12b5  b94e16de  e883d0cb  4e3c50a2

2.4.  The ChaCha20 encryption algorithm

   ChaCha20 is a stream cipher designed by D. J. Bernstein.  It is a
   refinement of the Salsa20 algorithm, and uses a 256-bit key.

   ChaCha20 successively calls the ChaCha20 block function, with the
   same key, sender ID and nonce, and with successively increasing block
   counter parameters.  The resulting state is then serialized by
   writing the numbers in little-endian order.  Concatenating the
   results from the successive blocks forms a key stream, which is then
   XOR-ed with the plaintext.  There is no requirement for the plaintext
   to be an integral multiple of 512 bits.  If there is extra keystream
   from the last block, it is discarded.  Specific protocols MAY require
   that the plaintext and ciphertext have a certain length.  Such
   protocols need to specify how the plaintext is padded, and how much
   padding it receives.

   The inputs to ChaCha20 are:
   o  A 256-bit key
   o  A 32-bit sender ID.  Ordinarily, this will be zero.  However, in
      protocols that allow multiple senders for the same keys, different
      sender IDs MAY be used.
   o  A 32-bit initial counter.
      This can be set to any number, but will usually be zero or one.
      It makes sense to use 1 if we use the zero block for something
      else, such as generating a one-time authenticator key as part of
      an AEAD algorithm.
   o  A 64-bit nonce.  In some protocols, this is known as the
      Initialization Vector.
   o  An arbitrary-length plaintext

   The output is an encrypted message of the same length.

2.4.1.  Example and Test Vector for the ChaCha20 Cipher

   For a test vector, we will use the following inputs to the ChaCha20
   cipher:
   o  Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
      14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.
   o  Nonce = 74 (00:00:00:00:00:00:00:4a).
   o  Sender ID = zero (0).
   o  Initial Counter = 1.

   We use the following for the plaintext.  It was chosen to be long
   enough to require more than one block, but not so long that it would
   make this example cumbersome (so, less than 3 blocks):

  Plaintext Sunscreen:
  000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c|Ladies and Gentl
  016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73|emen of the clas
  032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63|s of '99: If I c
  048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f|ould offer you o
  064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20|nly one tip for
  080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73|the future, suns
  096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69|creen would be i
  112  74 2e                                          |t.

   The following figure shows 4 ChaCha state matrices:
   1.  First block as it is set up.
   2.  Second block as it is set up.  Note that these blocks are only
       two bits apart - only the counter in position 12 is different.
   3.  Third is the first block after the ChaCha20 block operation.
   4.  Last is the second block after the ChaCha20 block operation was
       applied.
   After that, we show the keystream.
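
   Before looking at the figures, the block function of Section 2.3 and
   the encryption loop of Section 2.4 can be sketched in C as follows.
   This is an illustrative sketch under stated assumptions, not a
   reference implementation.  The names chacha20_block and chacha20_xor
   are hypothetical.  Words 13, 14 and 15 of the state (sender ID and
   nonce) are passed in as already-assembled 32-bit values, so the
   sketch takes no position on how the sender ID and nonce octets are
   mapped onto those words.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL32(x, n) (((x) << (n)) | ((x) >> (32 - (n))))
#define QR(s, a, b, c, d)                                    \
    do {                                                     \
        s[a] += s[b]; s[d] ^= s[a]; s[d] = ROTL32(s[d], 16); \
        s[c] += s[d]; s[b] ^= s[c]; s[b] = ROTL32(s[b], 12); \
        s[a] += s[b]; s[d] ^= s[a]; s[d] = ROTL32(s[d], 8);  \
        s[c] += s[d]; s[b] ^= s[c]; s[b] = ROTL32(s[b], 7);  \
    } while (0)

/* Produce one 64-byte block.  w13..w15 are state words 13..15
 * (an assumption of this sketch: they arrive as 32-bit values). */
void chacha20_block(uint8_t out[64], const uint8_t key[32],
                    uint32_t counter, uint32_t w13,
                    uint32_t w14, uint32_t w15)
{
    uint32_t init[16], s[16];
    int i;

    init[0] = 0x61707865; init[1] = 0x3320646e;
    init[2] = 0x79622d32; init[3] = 0x6b206574;
    for (i = 0; i < 8; i++)          /* key words, little-endian */
        init[4 + i] = (uint32_t)key[4 * i]
                    | ((uint32_t)key[4 * i + 1] << 8)
                    | ((uint32_t)key[4 * i + 2] << 16)
                    | ((uint32_t)key[4 * i + 3] << 24);
    init[12] = counter; init[13] = w13;
    init[14] = w14;     init[15] = w15;

    for (i = 0; i < 16; i++)
        s[i] = init[i];
    for (i = 0; i < 10; i++) {       /* 10 column + 10 diagonal rounds */
        QR(s, 0, 4,  8, 12); QR(s, 1, 5,  9, 13);
        QR(s, 2, 6, 10, 14); QR(s, 3, 7, 11, 15);
        QR(s, 0, 5, 10, 15); QR(s, 1, 6, 11, 12);
        QR(s, 2, 7,  8, 13); QR(s, 3, 4,  9, 14);
    }
    for (i = 0; i < 16; i++) {       /* add input, serialize LE */
        uint32_t v = s[i] + init[i];
        out[4 * i]     = (uint8_t)(v & 0xff);
        out[4 * i + 1] = (uint8_t)((v >> 8) & 0xff);
        out[4 * i + 2] = (uint8_t)((v >> 16) & 0xff);
        out[4 * i + 3] = (uint8_t)((v >> 24) & 0xff);
    }
}

/* XOR len bytes of input with the keystream, starting at block
 * number counter; unused keystream in the last block is discarded. */
void chacha20_xor(uint8_t *out, const uint8_t *in, size_t len,
                  const uint8_t key[32], uint32_t counter,
                  uint32_t w13, uint32_t w14, uint32_t w15)
{
    uint8_t ks[64];
    size_t i, n;

    while (len > 0) {
        chacha20_block(ks, key, counter++, w13, w14, w15);
        n = len < 64 ? len : 64;
        for (i = 0; i < n; i++)
            out[i] = in[i] ^ ks[i];
        out += n; in += n; len -= n;
    }
}
```

   Working on a copy of the initial state keeps the input words
   available for the final addition.  Since encryption is a plain XOR
   with the keystream, calling chacha20_xor on a ciphertext with the
   same parameters returns the plaintext.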

   First block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000001  00000000  4a000000  00000000

   Second block setup:
       61707865  3320646e  79622d32  6b206574
       03020100  07060504  0b0a0908  0f0e0d0c
       13121110  17161514  1b1a1918  1f1e1d1c
       00000002  00000000  4a000000  00000000

   First block after block operation:
       f3514f22  e1d91b40  6f27de2f  ed1d63b8
       821f138c  e2062c3d  ecca4f7e  78cff39e
       a30a3b8a  920a6072  cd7479b5  34932bed
       40ba4c79  cd343ec6  4c2c21ea  b7417df0

   Second block after block operation:
       9f74a669  410f633f  28feca22  7ec44dec
       6d34d426  738cb970  3ac5e9f3  45590cc4
       da6e8b39  892c831a  cdea67c1  2b7e1d90
       037463f3  a11a2073  e8bcfb88  edc49139

   Keystream:
   22:4f:51:f3:40:1b:d9:e1:2f:de:27:6f:b8:63:1d:ed:8c:13:1f:82:3d:2c:06
   e2:7e:4f:ca:ec:9e:f3:cf:78:8a:3b:0a:a3:72:60:0a:92:b5:79:74:cd:ed:2b
   93:34:79:4c:ba:40:c6:3e:34:cd:ea:21:2c:4c:f0:7d:41:b7:69:a6:74:9f:3f
   63:0f:41:22:ca:fe:28:ec:4d:c4:7e:26:d4:34:6d:70:b9:8c:73:f3:e9:c5:3a
   c4:0c:59:45:39:8b:6e:da:1a:83:2c:89:c1:67:ea:cd:90:1d:7e:2b:f3:63

   Finally, we XOR the Keystream with the plaintext, yielding the
   Ciphertext:

  Ciphertext Sunscreen:
  000  6e 2e 35 9a 25 68 f9 80 41 ba 07 28 dd 0d 69 81|n.5.%h..A..(..i.
  016  e9 7e 7a ec 1d 43 60 c2 0a 27 af cc fd 9f ae 0b|.~z..C`..'......
  032  f9 1b 65 c5 52 47 33 ab 8f 59 3d ab cd 62 b3 57|..e.RG3..Y=..b.W
  048  16 39 d6 24 e6 51 52 ab 8f 53 0c 35 9f 08 61 d8|.9.$.QR..S.5..a.
  064  07 ca 0d bf 50 0d 6a 61 56 a3 8e 08 8a 22 b6 5e|....P.jaV....".^
  080  52 bc 51 4d 16 cc f8 06 81 8c e9 1a b7 79 37 36|R.QM.........y76
  096  5a f9 0b bf 74 a3 5b e6 b4 0b 8e ed f2 78 5e 42|Z...t.[......x^B
  112  87 4d                                          |.M

2.5.  The Poly1305 algorithm

   Poly1305 is a one-time authenticator designed by D. J. Bernstein.
   Poly1305 takes a 32-byte one-time key and a message and produces a
   16-byte tag.

   The original article ([poly1305]) is entitled "The Poly1305-AES
   message-authentication code", and the MAC function there requires a
   128-bit AES key, a 128-bit "additional key", and a 128-bit (non-
   secret) nonce.  AES is used there for encrypting the nonce, so as to
   get a unique (and secret) 128-bit string, but as the paper states,
   "There is nothing special about AES here.  One can replace AES with
   an arbitrary keyed function from an arbitrary set of nonces to
   16-byte strings."

   Regardless of how the key is generated, the key is partitioned into
   two parts, called "r" and "k".  "k" MUST be unique and secret for
   each invocation (that is why it was originally obtained by encrypting
   a nonce), while "r" MAY be constant, but needs to be modified as
   follows before being used ("r" is treated as a 16-octet little-endian
   number):

   o  r[3], r[7], r[11], and r[15] are required to have their top four
      bits clear (be smaller than 16)
   o  r[4], r[8], and r[12] are required to have their bottom two bits
      clear (be divisible by 4)

   The following sample code clamps "r" to be appropriate.  Note that
   the parameter is a 32-byte key, and "r" is considered to be at offset
   16, so it's the 2nd half of the key:

   /*
   poly1305aes_test_clamp.c version 20050207
   D. J. Bernstein
   Public domain.
   */

   #include "poly1305aes_test.h"

   void poly1305aes_test_clamp(unsigned char kr[32])
   {
   #define r (kr + 16)
     r[3] &= 15;
     r[7] &= 15;
     r[11] &= 15;
     r[15] &= 15;
     r[4] &= 252;
     r[8] &= 252;
     r[12] &= 252;
   }

   The "k" portion MUST NOT be re-used, but it is perfectly acceptable
   to generate both "k" and "r" uniquely each time.  Because each of
   them is 128-bit, randomly generating them (see Section 2.6) is also
   acceptable.

   The inputs to Poly1305 are:
   o  A 256-bit one-time key
   o  An arbitrary length message

   The output is a 128-bit tag.
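
   The Poly1305 computation specified in this section (clamp "r",
   accumulate the message in 16-byte blocks modulo 2^130-5, then add
   "k") can be sketched in C as follows.  This is an illustration under
   stated assumptions, not a reference implementation.  It holds all
   numbers as 17 little-endian limbs of 8 bits each, the approach used
   by the public-domain TweetNaCl library, which is simple rather than
   fast.  It assumes that "r" is the first 16 octets of the one-time
   key and "k" the last 16, which is the layout of the key material in
   the example of Section 2.5.1.  The name poly1305_tag is
   hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* h += c, both held as 17 little-endian limbs of 8 bits each. */
static void add1305(uint32_t h[17], const uint32_t c[17])
{
    uint32_t u = 0;
    int j;

    for (j = 0; j < 17; j++) {
        u += h[j] + c[j];
        h[j] = u & 255;
        u >>= 8;
    }
}

/* Assumption of this sketch: r = key[0..15], k = key[16..31]. */
void poly1305_tag(uint8_t tag[16], const uint8_t *m, size_t n,
                  const uint8_t key[32])
{
    /* 2^136 - (2^130 - 5), used to subtract P without branching */
    static const uint32_t minusp[17] = {
        5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 252
    };
    uint32_t r[17], h[17], c[17], x[17], g[17], u, mask;
    size_t i, j;

    for (j = 0; j < 17; j++) r[j] = h[j] = 0;
    for (j = 0; j < 16; j++) r[j] = key[j];
    r[3] &= 15;  r[7] &= 15;  r[11] &= 15;  r[15] &= 15;   /* clamp */
    r[4] &= 252; r[8] &= 252; r[12] &= 252;

    while (n > 0) {
        /* read up to 16 octets and append the 0x01 octet */
        for (j = 0; j < 17; j++) c[j] = 0;
        for (j = 0; j < 16 && j < n; j++) c[j] = m[j];
        c[j] = 1;
        m += j; n -= j;
        add1305(h, c);

        /* x = h * r, folding limbs above 2^136 (2^136 = 320 mod P) */
        for (i = 0; i < 17; i++) {
            x[i] = 0;
            for (j = 0; j < 17; j++)
                x[i] += h[j] * ((j <= i) ? r[i - j]
                                         : 320 * r[i + 17 - j]);
        }

        /* carry, then fold bits above 2^130 (2^130 = 5 mod P) */
        u = 0;
        for (j = 0; j < 16; j++) {
            u += x[j]; h[j] = u & 255; u >>= 8;
        }
        u += x[16]; h[16] = u & 3;
        u = 5 * (u >> 2);
        for (j = 0; j < 16; j++) {
            u += h[j]; h[j] = u & 255; u >>= 8;
        }
        u += h[16]; h[16] = u;
    }

    /* final reduction: keep h - P unless that went negative */
    for (j = 0; j < 17; j++) g[j] = h[j];
    add1305(h, minusp);
    mask = 0 - (h[16] >> 7);
    for (j = 0; j < 17; j++) h[j] ^= mask & (g[j] ^ h[j]);

    /* add k and emit the low 128 bits in little-endian order */
    for (j = 0; j < 16; j++) c[j] = key[16 + j];
    c[16] = 0;
    add1305(h, c);
    for (j = 0; j < 16; j++) tag[j] = (uint8_t)h[j];
}
```

   The loop bounds and table indexes depend only on the message length,
   never on the key or the accumulator, and the final reduction uses a
   mask instead of a branch.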

   First, the "r" value should be clamped.

   Next, set the constant "P" (for "polynomial") to be 2^130-5:
   3fffffffffffffffffffffffffffffffb.  Also set a variable "accumulator"
   to zero.

   Next, divide the message into 16-byte blocks.  The last one might be
   shorter.  For each block:

   o  Read the block as a little-endian number.
   o  Add one bit beyond the number of octets.  For a 16-byte block this
      is equivalent to adding 2^128 to the number.  For the shorter
      block it can be 2^120, 2^112, or any power of two whose exponent
      is evenly divisible by 8, all the way down to 2^8.
   o  If the block is not 17 bytes long (the last block), pad it with
      zeros.  This is meaningless if you're treating the blocks as
      numbers.
   o  Add this number to the accumulator.
   o  Multiply by "r".
   o  Set the accumulator to the result modulo P.  To summarize: Acc =
      ((Acc+block)*r) % P.

   Finally, the value of the secret key "k" is added to the accumulator,
   and the 128 least significant bits are serialized in little-endian
   order to form the tag.

2.5.1.  Poly1305 Example and Test Vector

   For our example, we will dispense with generating the one-time key
   using AES, and assume that we got the following keying material:
   o  Key Material: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:01:
      03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
   o  k as an octet string: 01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:
      f5:1b
   o  k as a 128-bit number: 1BF54941AFF6BF4AFDB20DFB8A800301
   o  r before clamping: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8
   o  Clamped r as a number: 806D5400E52447C036D555408BED685.

   For our message, we'll use a short text:

  Message to be Authenticated:
  000  43 72 79 70 74 6f 67 72 61 70 68 69 63 20 46 6f|Cryptographic Fo
  016  72 75 6d 20 52 65 73 65 61 72 63 68 20 47 72 6f|rum Research Gro
  032  75 70                                          |up

   Since Poly1305 works in 16-byte chunks, the 34-byte message divides
   into 3 blocks.
   In the following calculation, "Acc" denotes the accumulator and
   "Block" the current block:

   Block #1

   Acc = 00
   Block = 6f4620636968706172676f7470797243
   Block with 0x01 byte = 016f4620636968706172676f7470797243
   Acc + block = 016f4620636968706172676f7470797243
   (Acc+Block) * r =
        B83FE991CA66800489155DCD69E8426BA2779453994AC90ED284034DA565ECF
   Acc = ((Acc+Block)*r) % P = 2C88C77849D64AE9147DDEB88E69C83FC

   Block #2

   Acc = 2C88C77849D64AE9147DDEB88E69C83FC
   Block = 6f7247206863726165736552206d7572
   Block with 0x01 byte = 016f7247206863726165736552206d7572
   Acc + block = 437FEBEA505C820F2AD5150DB0709F96E
   (Acc+Block) * r =
        21DCC992D0C659BA4036F65BB7F88562AE59B32C2B3B8F7EFC8B00F78E548A26
   Acc = ((Acc+Block)*r) % P = 2D8ADAF23B0337FA7CCCFB4EA344B30DE

   Last Block

   Acc = 2D8ADAF23B0337FA7CCCFB4EA344B30DE
   Block = 7075
   Block with 0x01 byte = 017075
   Acc + block = 2D8ADAF23B0337FA7CCCFB4EA344CA153
   (Acc + Block) * r =
        16D8E08A0F3FE1DE4FE4A15486ACA7A270A29F1E6C849221E4A6798B8E45321F
   ((Acc + Block) * r) % P = 28D31B7CAFF946C77C8844335369D03A7

   Adding k we get this number, and serialize it to get the tag:

   Acc + k = 2A927010CAF8B2BC2C6365130C11D06A8

   Tag: a8:06:1d:c1:30:51:36:c6:c2:2b:8b:af:0c:01:27:a9

2.6.  Generating the Poly1305 key using ChaCha20

   As said in Section 2.5, it is acceptable to generate the one-time
   Poly1305 key pseudo-randomly.  This section proposes such a method.

   To generate such a key pair (k,r), we will use the ChaCha20 block
   function described in Section 2.3.  This assumes that we have a
   256-bit session key for the MAC function, such as SK_ai and SK_ar in
   IKEv2, the integrity key in ESP and AH, or the client_write_MAC_key
   and server_write_MAC_key in TLS.  Any document that specifies the use
   of Poly1305 as a MAC algorithm for some protocol must specify that
   256 bits are allocated for the integrity key.

   The method is to call the block function with the following
   parameters:
   o  The 256-bit session integrity key is used as the ChaCha20 key.
   o  The specific protocol specifies the sender ID, but for any sender,
      all calls are made with the same sender ID.  Usually, this will be
      set to zero.  For the AEAD protocol specified in Section 2.7, the
      sender ID is set to the same value as for the encryption.
   o  The block counter is set to zero.
   o  The protocol will specify a 64-bit nonce.  This MUST be unique per
      invocation with the same key, so it MUST NOT be randomly
      generated.  A counter is a good way to implement this, but other
      methods, such as an LFSR, are also acceptable.

   After running the block function, we have a 512-bit state.  We take
   the first 256 bits of the serialized state, and use those as the
   one-time Poly1305 key.  The other 256 bits are discarded.

   Note that while many protocols have provisions for a nonce for
   encryption algorithms (often called Initialization Vectors, or IVs),
   they usually don't have such a provision for the MAC function.  In
   that case the per-invocation nonce will have to come from somewhere
   else, such as a message counter.

2.7.  AEAD Construction

   Note: Much of the content of this document, including this AEAD
   construction, is taken from Adam Langley's draft ([agl-draft]) for
   the use of these algorithms in TLS.  The AEAD construction described
   here is called AEAD_CHACHA20-POLY1305.

   AEAD_CHACHA20-POLY1305 is an authenticated encryption with additional
   data algorithm.  The inputs to AEAD_CHACHA20-POLY1305 are:
   o  A 256-bit key
   o  A 32-bit sender ID
   o  A 64-bit nonce - different for each invocation with the same
      combination of key and sender ID.
   o  An arbitrary length plaintext
   o  Arbitrary length additional data

   The ChaCha20 and Poly1305 primitives are combined into an AEAD that
   takes a 256-bit key and 64-bit IV as follows:
   o  First, a Poly1305 one-time key is generated from the 256-bit key,
      sender ID and nonce using the procedure described in Section 2.6.
   o  The ChaCha20 encryption function is called to encrypt the
      plaintext, using the same key, sender ID and nonce, and with the
      initial counter set to 1.
   o  The Poly1305 function is called with the Poly1305 key calculated
      above, and a message constructed as a concatenation of the
      following:
      *  The additional data
      *  The length of the additional data in octets (as a 64-bit
         little-endian integer).  TBD: bit count rather than octets?
         network order?
      *  The ciphertext
      *  The length of the ciphertext in octets (as a 64-bit little-
         endian integer).  TBD: bit count rather than octets?  network
         order?

   Decryption is pretty much the same thing.

   The output from the AEAD is twofold:
   o  A ciphertext of the same length as the plaintext.
   o  A 128-bit tag (the result of the Poly1305 function).

   A few notes about this design:
   1.  The amount of encrypted data possible in a single invocation is
       2^32-1 blocks of 64 bytes each, for a total of 274,877,906,880
       bytes, or nearly 256 GB.  This should be enough for traffic
       protocols such as IPsec and TLS, but may be too small for file
       and/or disk encryption.  For such uses, we can return to the
       original design, eliminate the Sender ID parameter, and use the
       integer at position 13 as the top 32 bits of a 64-bit block
       counter, increasing the total message size to over a million
       petabytes (1,180,591,620,717,411,303,360 bytes to be exact).
   2.  Despite the previous item, the ciphertext length field in the
       construction of the buffer on which Poly1305 runs limits the
       ciphertext (and hence, the plaintext) size to 2^64 bytes, or
       sixteen thousand petabytes (18,446,744,073,709,551,616 bytes to
       be exact).
   3.  TBD: perhaps we allocate only 6 bits for sender ID, leaving 58
       bits for block counter, which means the maximum number of blocks
       is just enough to fill the 64-bit octet counter.

2.7.1.  Example and Test Vector for AEAD_CHACHA20-POLY1305

   For a test vector, we will use the following inputs to the
   AEAD_CHACHA20-POLY1305 function:

  Plaintext:
  000  4c 61 64 69 65 73 20 61 6e 64 20 47 65 6e 74 6c|Ladies and Gentl
  016  65 6d 65 6e 20 6f 66 20 74 68 65 20 63 6c 61 73|emen of the clas
  032  73 20 6f 66 20 27 39 39 3a 20 49 66 20 49 20 63|s of '99: If I c
  048  6f 75 6c 64 20 6f 66 66 65 72 20 79 6f 75 20 6f|ould offer you o
  064  6e 6c 79 20 6f 6e 65 20 74 69 70 20 66 6f 72 20|nly one tip for
  080  74 68 65 20 66 75 74 75 72 65 2c 20 73 75 6e 73|the future, suns
  096  63 72 65 65 6e 20 77 6f 75 6c 64 20 62 65 20 69|creen would be i
  112  74 2e                                          |t.

  Key:
  000  80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f|................
  016  90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f|................

  IV:
  000  40 41 42 43 44 45 46 47                        |@ABCDEFG

   Set up for generating poly1305 one-time key (sender id=7):
       61707865  3320646e  79622d32  6b206574
       83828180  87868584  8b8a8988  8f8e8d8c
       93929190  97969594  9b9a9998  9f9e9d9c
       00000000  00000007  43424140  47464544

   After generating Poly1305 one-time key:
       252bac7b  af47b42d  557ab609  8455e9a4
       73d6e10a  ebd97510  7875932a  ff53d53e
       decc7ea2  b44ddbad  e49c17d1  d8430bc9
       8c94b7bc  8b7d4b4b  3927f67d  1669a432

  Poly1305 Key:
  000  7b ac 2b 25 2d b4 47 af 09 b6 7a 55 a4 e9 55 84|{.+%-.G...zU..U.
  016  0a e1 d6 73 10 75 d9 eb 2a 93 75 78 3e d5 53 ff|...s.u..*.ux>.S.

   Poly1305 r = 455E9A4057AB6080F47B42C052BAC7B
   Poly1305 K = FF53D53E7875932AEBD9751073D6E10A

   Keystream bytes:
   9f:7b:e9:5d:01:fd:40:ba:15:e2:8f:fb:36:81:0a:ae:
   c1:c0:88:3f:09:01:6e:de:dd:8a:d0:87:55:82:03:a5:
   4e:9e:cb:38:ac:8e:5e:2b:b8:da:b2:0f:fa:db:52:e8:
   75:04:b2:6e:be:69:6d:4f:60:a4:85:cf:11:b8:1b:59:
   fc:b1:c4:5f:42:19:ee:ac:ec:6a:de:c3:4e:66:69:78:
   8e:db:41:c4:9c:a3:01:e1:27:e0:ac:ab:3b:44:b9:cf:
   5c:86:bb:95:e0:6b:0d:f2:90:1a:b6:45:e4:ab:e6:22:
   15:38

   Ciphertext:
   000  d3 1a 8d 34 64 8e 60 db 7b 86 af bc 53 ef 7e c2|...4d.`.{...S.~.
   016  a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7 36 ee 62 d6|...Q)n......6.b.
   032  3d be a4 5e 8c a9 67 12 82 fa fb 69 da 92 72 8b|=..^..g....i..r.
   048  1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6 7e cd 3b 36|.q.....)....~.;6
   064  92 dd bd 7f 2d 77 8b 8c 98 03 ae e3 28 09 1b 58|....-w......(..X
   080  fa b3 24 e4 fa d6 75 94 55 85 80 8b 48 31 d7 bc|..$...u.U...H1..
   096  3f f4 de f0 8e 4b 7a 9d e5 76 d2 65 86 ce c6 4b|?....Kz..v.e...K
   112  61 16                                          |a.

   AEAD Construction for Poly1305:
   000  50 51 52 53 0c 00 00 00 00 00 00 00 0c 00 00 00|PQRS............
   016  00 00 00 00 d3 1a 8d 34 64 8e 60 db 7b 86 af bc|.......4d.`.{...
   032  53 ef 7e c2 a4 ad ed 51 29 6e 08 fe a9 e2 b5 a7|S.~....Q)n......
   048  36 ee 62 d6 3d be a4 5e 8c a9 67 12 82 fa fb 69|6.b.=..^..g....i
   064  da 92 72 8b 1a 71 de 0a 9e 06 0b 29 05 d6 a5 b6|..r..q.....)....
   080  7e cd 3b 36 92 dd bd 7f 2d 77 8b 8c 98 03 ae e3|~.;6....-w......
   096  28 09 1b 58 fa b3 24 e4 fa d6 75 94 55 85 80 8b|(..X..$...u.U...
   112  48 31 d7 bc 3f f4 de f0 8e 4b 7a 9d e5 76 d2 65|H1..?....Kz..v.e
   128  86 ce c6 4b 61 16 72 00 00 00 00 00 00 00      |...Ka.r.......

   Tag:
   58:7f:5c:12:9f:27:f8:e9:2e:a7:e3:2d:9f:2a:76:f2

3.  Implementation Advice

   Each block of ChaCha20 involves 16 move operations and one increment
   operation for loading the state, 80 each of XOR, addition and Roll
   operations for the rounds, 16 more add operations and 16 XOR
   operations for protecting the plaintext.  Section 2.3 describes the
   ChaCha block function as "adding the original input words".  This
   implies that before starting the rounds on the ChaCha state, it is
   copied aside only to be added in later.  This would be correct, but
   it saves a few operations to instead copy the state and do the work
   on the copy.  This way, for the next block you don't need to recreate
   the state, but only to increment the block counter.  This saves
   approximately 5.5% of the cycles.

   It is NOT RECOMMENDED to use a generic big number library such as the
   one in OpenSSL for the arithmetic operations in Poly1305.  Such
   libraries use dynamic allocation to be able to handle any-sized
   integer, but that flexibility comes at the expense of performance as
   well as side-channel security.  More efficient implementations that
   run in constant time are easy to find, and one is even available in
   the [poly1305] paper (although DJB describes its performance as
   "intolerable").

4.  Security Considerations

   The ChaCha20 cipher is designed to provide 256-bit security.

   The Poly1305 authenticator is designed to ensure that forged messages
   are rejected with a probability of 1-(n/(2^102)) for a 16n-byte
   message, even after sending 2^64 legitimate messages, so it is
   SUF-CMA in the terminology of [AE].

   Proving the security of either of these is beyond the scope of this
   document.  Such proofs are available in the referenced academic
   papers.

   The most important security consideration in implementing this draft
   is the uniqueness of the nonce used in ChaCha20.
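
   As a minimal sketch of the uniqueness requirement, the following
   code derives each 64-bit nonce from a counter that is never reused
   under the same key.  The class name CounterNonce, its method
   next_nonce, and the little-endian encoding are assumptions made for
   this example and are not defined by this document.

```python
# Illustrative sketch only; CounterNonce is a hypothetical helper.
import struct


class CounterNonce:
    """Hand out 64-bit nonces from a counter, one per message, per key."""

    def __init__(self, start: int = 0):
        self._next = start

    def next_nonce(self) -> bytes:
        # Refuse to wrap around: a repeated nonce under the same key
        # would break the uniqueness requirement, so the caller must rekey.
        if self._next >= 2**64:
            raise OverflowError("nonce space exhausted; rekey before reuse")
        nonce = struct.pack("<Q", self._next)  # 8 bytes, little-endian (assumed)
        self._next += 1
        return nonce


gen = CounterNonce()
first = gen.next_nonce()
second = gen.next_nonce()
print(first.hex(), second.hex())
```

   One such counter is needed per key; sharing a counter value between
   two senders that use the same key would defeat its purpose.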
   Counters and LFSRs are both acceptable ways of generating unique
   nonces, as is encrypting a counter using a 64-bit cipher such as
   DES.  Note that it is not acceptable to use a truncation of a
   counter encrypted with a 128-bit or 256-bit cipher, because such a
   truncation may repeat after a short time.

   The same kind of collision risk as in the above paragraph also exists
   in the selection of the Poly1305 key in the AEAD construction
   (Section 2.7).  ChaCha20 is used to generate a 512-bit value, which
   is unique to each packet, but this uniqueness is not guaranteed for
   its 256-bit truncation, which serves as the actual key for Poly1305.
   While uniqueness is not guaranteed, it is still very likely.
   Birthday paradox calculations show that generating Poly1305 keys
   needs to happen over 2^101 times (far more than the 2^64 allowed by
   IPsec or TLS) to drive the chances of collision up to
   0.00000000000001%.  This is why we may safely ignore the possibility
   of collision.

   The algorithms presented here were designed to be easy to implement
   in constant time to avoid side-channel vulnerabilities.  The
   operations used in ChaCha20 are all additions, XORs, and fixed
   rotations.  All of these can and should be implemented in constant
   time.  Access to offsets into the ChaCha state and the number of
   operations do not depend on any property of the key, eliminating the
   chance of information about the key leaking through the timing of
   cache misses.

   For Poly1305, the operations are addition, multiplication and
   modulus, all on numbers larger than 128 bits.  This can be done in
   constant time, but a naive implementation (such as using some
   generic big number library) will not be constant time.  For example,
   if the multiplication is performed as a separate operation from the
   modulus, the result will sometimes be under 2^256 and sometimes be
   above 2^256.
   Implementers should guard against timing side-channels in Poly1305
   by using an implementation of these operations that runs in
   constant time.

5.  IANA Considerations

   There are no IANA considerations for this document.

6.  Acknowledgements

   None of the algorithms here are my own.  ChaCha20 and Poly1305 were
   invented by Daniel J. Bernstein, and the AEAD construction was
   invented by Adam Langley.

   Much of the text in this document was inspired by Adam Langley's
   draft for TLS, and by "inspired" I mean "shamelessly copied".  The
   author would also like to thank Adam, Watson Ladd and Dave McGrew for
   allaying his fears about writing a document describing crypto.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [chacha]   Bernstein, D., "ChaCha, a variant of Salsa20", Jan 2008.

   [poly1305]
              Bernstein, D., "The Poly1305-AES message-authentication
              code", Mar 2005.

7.2.  Informative References

   [AE]       Bellare, M. and C. Namprempre, "Authenticated Encryption:
              Relations among notions and analysis of the generic
              composition paradigm".

   [FIPS-197]
              National Institute of Standards and Technology, "Advanced
              Encryption Standard (AES)", FIPS PUB 197, November 2001.

   [FIPS-46]  National Institute of Standards and Technology, "Data
              Encryption Standard", FIPS PUB 46-2, December 1993.

   [agl-draft]
              Langley, A. and W. Chang, "ChaCha20 and Poly1305 based
              Cipher Suites for TLS", draft-agl-tls-chacha20poly1305-04
              (work in progress), November 2013.

   [standby-cipher]
              McGrew, D., Grieco, A., and Y. Sheffer, "Selection of
              Future Cryptographic Standards",
              draft-mcgrew-standby-cipher (work in progress).

Author's Address

   Yoav Nir
   Check Point Software Technologies Ltd.
   5 Hasolelim st.
   Tel Aviv 6789735
   Israel

   Email: synp71@live.com