idnits 2.17.1 draft-eastlake-fnv-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 184 has weird spacing: '...ed char hash...' -- The document date (September 6, 2011) is 4615 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'N' is mentioned on line 184, but not defined Summary: 0 errors (**), 0 flaws (~~), 3 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Glenn Fowler 2 INTERNET-DRAFT AT&T Labs Research 3 Intended Status: Informational Landon Curt Noll 4 Cisco Systems 5 Kiem-Phong Vo 6 AT&T Labs Research 7 Donald Eastlake 8 Huawei Technologies 9 Expires: March 5, 2011 September 6, 2011 11 The FNV Non-Cryptographic Hash Algorithm 12 14 Abstract 16 FNV (Fowler/Noll/Vo) is a fast, non-cryptographic hash algorithm with 17 good dispersion. The purpose of this document is to make information 18 on FNV and open source code performing FNV conveniently available to 19 the Internet community. 21 Status of This Memo 23 This Internet-Draft is submitted to IETF in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Distribution of this document is unlimited. Comments should be sent 27 to the authors. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as Internet- 32 Drafts. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/1id-abstracts.html 42 The list of Internet-Draft Shadow Directories can be accessed at 43 http://www.ietf.org/shadow.html 45 Table of Contents 47 1. Introduction............................................3 49 2. FNV Basics..............................................4 50 2.1 FNV Primes.............................................4 51 2.2 FNV offset_basis.......................................5 52 2.3 FNV Endianism..........................................5 54 3. Other Hash Sizes and XOR Folding........................6 55 4. FNV Constants...........................................7 57 5. The Source Code.........................................9 58 5.1 FNV C Header...........................................9 59 5.2 FNV C Code.............................................9 60 5.3 FNV Test Code..........................................9 62 6. Security Considerations................................10 63 6.1 Why is FNV Non-Cryptographic?.........................10 65 7. IANA Considerations....................................11 67 8. References.............................................12 68 8.1 Normative References..................................12 69 8.2 Informative References................................12 71 Appendix A: Work Comparison with SHA-1....................13 72 Appendix B: Previous IETF Reference to FNV................14 73 Appendix C: A Few Test Vectors............................15 75 1. Introduction 77 The FNV hash algorithm is based on an idea sent as reviewer comments 78 to the IEEE POSIX P1003.2 committee by Glenn Fowler and Phong Vo in 79 1991. In a subsequent ballot round Landon Curt Noll suggested an 80 improvement on their algorithm. Some people tried this hash and found 81 that it worked rather well. In an EMail message to Landon, they named 82 it the "Fowler/Noll/Vo" or FNV hash. [FNV] 84 FNV hashes are designed to be fast while maintaining a low collision 85 rate. Their speed allows one to quickly hash lots of data while 86 maintaining a reasonably low collision rate. The high dispersion of 87 the FNV hashes makes them well suited for hashing nearly identical 88 strings such as URLs, hostnames, filenames, text, IP addresses, etc. 89 However, they are generally not suitable for cryptographic use. (See 90 Section 6.1.) 92 The FNV hash is widely used, for example in DNS servers, database 93 indexing hashes, major web search / indexing engines, netnews history 94 file Message-ID lookup functions, anti-spam filters, a spellchecker 95 programmed in Ada 95, flatassembler's open source x86 assembler - 96 user-defined symbol hashtree, non-cryptographic file fingerprints, 97 computing Unique IDs in DASM (DTN Applications for Symbian Mobile- 98 phones), Microsoft's hash_map implementation for VC++ 2005, the 99 realpath cache in PHP 5.x (php-5.2.3/TSRM/tsrm_virtual_cwd.c), and 100 many other uses. 102 FNV hash algorithms and source code have been released into the 103 public domain. The authors of the FNV algorithm took deliberate steps 104 to disclose the algorithm in a public forum soon after it was 105 invented. More than a year passed after this public disclosure and 106 the authors deliberately took no steps to patent the FNV algorithm. 107 Therefore, it is safe to say that the FNV authors have no patent 108 claims on the FNV algorithm as published. 110 If you use an FNV function in an application, you are kindly 111 requested to send an EMail about it to: fnv-mail@asthe.com 113 2. FNV Basics 115 This document focuses on the FNV-1a function whose pseudo-code is as 116 follows: 118 hash = offset_basis 119 for each octet_of_data to be hashed 120 hash = hash xor octet_of_data 121 hash = hash * FNV_Prime 122 return hash 124 In the pseudo-code above, hash is a power-of-two number of bits (32, 125 64, ... 1024) and offset_basis and FNV_Prime depend on the size of 126 hash. 128 The FNV-1 algorithm is the same, including the values of offset_basis 129 and FNV_Prime, except that the order of the two lines with the "xor" 130 and multiply operations are reversed. Operational experience 131 indicates better hash dispersion for small amounts of data with 132 FNV-1a. FNV-0 is the same as FNV-1 but with offset_basis set to zero. 133 FNV-1a is suggested for general use. 135 2.1 FNV Primes 137 The theory behind FNV_Prime's is beyond the scope of this document 138 but the basic property to look for is how an FNV_Prime would impact 139 dispersion. Now, consider any n-bit FNV hash where n is >= 32 and 140 also a power of 2. For each such an n-bit FNV hash, an FNV_Prime p is 141 defined as: 143 The smallest prime of the form p = 2**t + 2**8 + b where: 144 - t is an integer such that: 145 If n == 32, then t == int((3/4)*n) == 24, or 146 If n >= 64, then t == 8*int((n+5)/12). 147 - b is an integer such that: 148 0 < b < 2**8, and 149 The number of one-bits in b is 4 or 5 151 Experimentally, FNV_Primes matching the above constraints tend to 152 have better dispersion properties. They improve the polynomial 153 feedback characteristic when an FNV_Prime multiplies an intermediate 154 hash value. As such, the hash values produced are more scattered 155 throughout the n-bit hash space. 157 Per the above constraints, an FNV_Prime should have only 6 or 7 one- 158 bits in it. Therefore, some compilers may seek to improve the 159 performance of a multiplication with an FNV_Prime by replacing the 160 multiplication with shifts and adds. However, note that the 161 performance of this substitution is highly hardware-dependent and 162 should be done with care. FNV_Primes were selected primarily for the 163 quality of resulting hash function, not for compiler optimization. 165 2.2 FNV offset_basis 167 The offset_basis values for the n-bit FNV-1a algorithms are computed 168 by applying the n-bit FNV-0 algorithm to the following 32 octets: 170 chongo /\../\ 172 The \'s in the above string are not C-style escape characters. In C- 173 string notation, these 32 octets are: 175 "chongo /\\../\\" 177 2.3 FNV Endianism 179 For persistent storage or interoperability between different hardware 180 platforms, an FNV hash shall be represented in the little endian 181 format. That is, the FNV hash will be stored in an array hash[N] with 182 N bytes such that its integer value can be retrieved as follows: 184 unsigned char hash[N]; 185 for ( i = N-1, value = 0; i >= 0; --i ) 186 value = value << 8 + hash[i]; 188 Of course, when FNV hashes are used in a single process or a group of 189 processes sharing memory on processors with compatible endian-ness, 190 the natural endianness of those processors can be used regardless of 191 its type, little, big, or some other exotic form. 193 3. Other Hash Sizes and XOR Folding 195 Many hash uses require a hash that is not one of the FNV sizes for 196 which constants are provided in Section 4. If a larger hash size is 197 needed, please contact the authors of this document. 199 Most hash applications make use of a hash that is a fixed size binary 200 field. Assume that k bits of hash are desired and k is less than 1024 201 but not one of the sizes for which constants are provided in Section 202 4. The recommended technique is to take the smallest FNV hash of size 203 S, where S is larger than k, and calculate the desired hash using xor 204 folding as shown below. The final bit masking operation is logically 205 unnecessarily if the size of hash is exactly the number of desired 206 bits. 208 temp = FNV_S ( data-to-be-hashed ) 209 hash = ( temp xor temp>>k ) bitwise-and ( 2**k - 1 ) 211 Hash functions are a trade-off between speed and strength. For 212 example, a somewhat stronger hash may be obtained for exact FNV sizes 213 by calculating an FNV twice as long as the desired output ( S = 2*k ) 214 and performing such data folding using a k equal to the size of the 215 desired output. However, if a much stronger hash, for example one 216 suitable for cryptographic applications, is wanted, algorithms 217 designed for that purpose, such as those in [RFC6234], should be 218 used. 220 If it is desired to obtain a hash result that is a value between 0 221 and max, where max is a not a power of two, simply choose an FNV hash 222 size S such that 2**S > max. Then calculate the following: 224 FNV_S mod ( max+1 ) 226 The resulting remainder will be in the range desired but will suffer 227 from a bias against large values with the bias being larger if 2**S 228 is only a little bigger than max. If this bias is acceptable, no 229 further processing is needed. If this bias is unacceptable, it can be 230 avoided by retrying for certain high values of hash, as follows, 231 before applying the mod operation above: 233 X = ( int( ( 2**S - 1 ) / ( max+1 ) ) ) * ( max+1 ) 234 while ( hash >= X ) 235 hash = ( hash * FNV_Prime ) + offset_basis 237 4. FNV Constants 239 The FNV Primes are as follows: 241 32 bit FNV_Prime = 2**24 + 2**8 + 0x93 = 16,777,619 242 = 0x01000193 244 64 bit FNV_Prime = 2**40 + 2**8 + 0xB3 = 1,099,511,628,211 245 = 0x00000100 000001B3 247 128 bit FNV_Prime = 2**88 + 2**8 + 0x3B = 248 309,485,009,821,345,068,724,781,371 249 = 0x00000000 01000000 00000000 0000013B 251 256 bit FNV_Prime = 2**168 + 2**8 + 0x63 = 252 374,144,419,156,711,147,060,143,317,175,368,453,031,918,731,002,211 = 253 0x0000000000000000 0000010000000000 0000000000000000 0000000000000163 255 512 bit FNV_Prime = 2**344 + 2**8 + 0x57 = 35, 256 835,915,874,844,867,368,919,076,489,095,108,449,946,327,955,754,392, 257 558,399,825,615,420,669,938,882,575,126,094,039,892,345,713,852,759 = 258 0x0000000000000000 0000000000000000 0000000001000000 0000000000000000 259 0000000000000000 0000000000000000 0000000000000000 0000000000000157 261 1024 bit FNV_Prime = 2**680 + 2**8 + 0x8D = 5, 262 016,456,510,113,118,655,434,598,811,035,278,955,030,765,345,404,790, 263 744,303,017,523,831,112,055,108,147,451,509,157,692,220,295,382,716, 264 162,651,878,526,895,249,385,292,291,816,524,375,083,746,691,371,804, 265 094,271,873,160,484,737,966,720,260,389,217,684,476,157,468,082,573 = 266 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000 267 0000000000000000 0000010000000000 0000000000000000 0000000000000000 268 0000000000000000 0000000000000000 0000000000000000 0000000000000000 269 0000000000000000 0000000000000000 0000000000000000 000000000000018D 271 The FNV offset_basis values are as follows: 273 32 bit offset_basis = 2,166,136,261 = 0x811C9DC5 275 64 bit offset_basis = 14695981039346656037 = 0xCBF29CE4 84222325 277 128 bit offset_basis = 144066263297769815596495629667062367629 = 278 0x6C62272E 07BB0142 62B82175 6295C58D 280 256 bit offset_basis = 100,029,257,958,052,580,907,070,968, 281 620,625,704,837,092,796,014,241,193,945,225,284,501,741,471,925,557 = 282 0xDD268DBCAAC55036 2D98C384C4E576CC C8B1536847B6BBB3 1023B4C8CAEE0535 283 512 bit offset_basis = 9, 284 659,303,129,496,669,498,009,435,400,716,310,466,090,418,745,672,637, 285 896,108,374,329,434,462,657,994,582,932,197,716,438,449,813,051,892, 286 206,539,805,784,495,328,239,340,083,876,191,928,701,583,869,517,785 = 287 0xB86DB0B1171F4416 DCA1E50F309990AC AC87D059C9000000 0000000000000D21 288 E948F68A34C192F6 2EA79BC942DBE7CE 182036415F56E34B AC982AAC4AFE9FD9 290 1024 bit offset_basis = 14,197,795,064,947,621,068,722,070,641,403, 291 218,320,880,622,795,441,933,960,878,474,914,617,582,723,252,296,732, 292 303,717,722,150,864,096,521,202,355,549,365,628,174,669,108,571,814, 293 760,471,015,076,148,029,755,969,804,077,320,157,692,458,563,003,215, 294 304,957,150,157,403,644,460,363,550,505,412,711,285,966,361,610,267, 295 868,082,893,823,963,790,439,336,411,086,884,584,107,735,010,676,915 = 296 0x0000000000000000 005F7A76758ECC4D 32E56D5A591028B7 4B29FC4223FDADA1 297 6C3BF34EDA3674DA 9A21D90000000000 0000000000000000 0000000000000000 298 0000000000000000 0000000000000000 0000000000000000 000000000004C6D7 299 EB6E73802734510A 555F256CC005AE55 6BDE8CC9C6A93B21 AFF4B16C71EE90B3 301 5. The Source Code 303 The following sub-sections are intended, in later versions, to 304 include reference C source code and a test driver for FNV-1a. 306 5.1 FNV C Header 308 TBD 310 5.2 FNV C Code 312 TBD 314 5.3 FNV Test Code 316 TBD 318 6. Security Considerations 320 This document is intended to provide convenient open source access by 321 the Internet community to the FNV non-cryptographic hash. No 322 assertion of suitability for cryptographic applications is made for 323 the FNV hash algorithms. 325 6.1 Why is FNV Non-Cryptographic? 327 A full discussion of cryptographic hash requirements and strength is 328 beyond the scope of this document. However, here are three 329 characteristics of FNV that would generally be considered to make it 330 non-cryptographic: 332 1. Work Factor - To make brute force inversion hard, a cryptographic 333 hash should be computationally expensive, especially for a general 334 purpose processor. But FNV is designed to be very inexpensive on a 335 general purpose processor. (See Appendix A.) 337 2. Sticky State - A cryptographic hash should not have a state in 338 which it can stick for a plausible input pattern. But, in the 339 unlikely event that the FNV hash variable becomes zero and the 340 input is a sequence of zeros, the hash variable will remain at 341 zero until there is a non-zero input byte and the final hash value 342 will be unaffected by the length of the sequence of zero input 343 bytes. Of course, for the common case of fixed length input, this 344 would not be significant because the number of non-zero bytes 345 would vary inversely with the number of zero bytes and for some 346 types of input runs of zeros do not occur. Furthermore, the 347 inclusion of even a little unpredictable input may be sufficient 348 to stop an adversary from inducing a zero hash variable. 350 3. Diffusion - Every output bit of a cryptographic hash should be an 351 equally complex function of every input bit. But it is easy to see 352 that the least significant bit of a direct FNV hash is the XOR of 353 the least significant bits of every input byte and does not depend 354 on any other input bit. If this is considered a problem, it can be 355 easily fixed by folding (see Section 3). 357 Nevertheless, none of the above have proven to be a problem in actual 358 practice for the many applications of FNV. 360 7. IANA Considerations 362 This document requires no IANA Actions. RFC Editor Note: please 363 delete this section before publication. 365 8. References 367 Below are the normative and informative references for this document. 369 8.1 Normative References 371 None. 373 8.2 Informative References 375 [FNV] - FNV web site: 376 http://www.isthe.com/chongo/tech/comp/fnv/index.html 378 [RFC3174] - Eastlake 3rd, D. and P. Jones, "US Secure Hash Algorithm 379 1 (SHA1)", RFC 3174, September 2001. 381 [RFC6194] - Polk, T., Chen, L., Turner, S., and P. Hoffman, "Security 382 Considerations for the SHA-0 and SHA-1 Message-Digest 383 Algorithms", RFC 6194, March 2011. 385 [RFC6234] - Eastlake 3rd, D. and T. Hansen, "US Secure Hash 386 Algorithms (SHA and SHA-based HMAC and HKDF)", RFC 6234, May 387 2011. 389 Appendix A: Work Comparison with SHA-1 391 This section provides a simplistic rough comparison of the level of 392 effort required per input byte to compute FNV-1a and SHA-1 [RFC3174]. 394 Ignoring transfer of control and conditional tests and equating all 395 logical and arithmetic operations, FNV requires 2 operations per 396 byte, an XOR and a multiply. 398 SHA-1 is a relatively weak cryptographic hash producing a 160-bit 399 hash. It that has been partially broken [RFC6194]. It is actually 400 designed to accept a bit vector input although almost all computer 401 uses apply it to an integer number of bytes. It processes blocks of 402 512 bits (64 bytes) and we estimate the effort involved in SHA-1 403 processing a full block. Ignoring SHA-1 initial set up, transfer of 404 control, and conditional tests, but counting all logical and 405 arithmetic operations, including counting indexing as an addition, 406 SHA-1 requires 1,744 operations per 64 bytes block or 27.25 407 operations per byte. So by this rough measure, it is a little over 13 408 times the effort of FNV for very large amounts of data. However, FNV 409 is commonly used for small inputs. Using the above method, for inputs 410 of N bytes, where N is less than 55 so SHA-1 will take one block 411 (SHA-1 includes padding and an 8 byte length at the end of the data 412 in the last block), the ratio of the effort for SHA-1 to the effort 413 for FNV will be 872/N. For example, with an 8 byte input, SHA-1 will 414 take 109 times as much effort as FNV. 416 Stronger cryptographic functions than SHA-1 generally have an even 417 high work factor. 419 Appendix B: Previous IETF Reference to FNV 421 FNV-1a was referenced in draft-ietf-tls-cached-info-08.txt that has 422 since expired. It was later decided that it would be better to use a 423 cryptographic hash for that use. 425 Below is the Jave code for FNV64 from that TLS draft include by the 426 kind permission of the author: 428 /** 429 * Java code sample, implementing 64 bit FNV-1a 430 * By Stefan Santesson 431 */ 433 import java.math.BigInteger; 435 public class FNV { 437 static public BigInteger getFNV1aToByte(byte[] inp) { 439 BigInteger m = new BigInteger("2").pow(64); 440 BigInteger fnvPrime = new BigInteger("1099511628211"); 441 BigInteger fnvOffsetBasis = 442 new BigInteger("14695981039346656037"); 444 BigInteger digest = fnvOffsetBasis; 446 for (byte b : inp) { 447 digest = digest.xor(BigInteger.valueOf((int) b & 255)); 448 digest = digest.multiply(fnvPrime).mod(m); 449 } 450 return digest; 452 } 453 } 455 Appendix C: A Few Test Vectors 457 Below are a few test vectors in the form of ASCII strings and their 458 FNV32 and FNV64 hashes using the FNV-1a algorithm. 460 Strings without null (zero byte) termination: 462 String FNV32 FNV64 463 "" 0x811c9dc5 0xcbf29ce484222325 464 "a" 0xe40c292c 0xaf63dc4c8601ec8c 465 "foobar" 0xbf9cf968 0x85944171f73967e8 467 Strings including null (zero byte) termination: 469 String FNV32 FNV64 470 "" 0x050c5d1f 0xaf63bd4c8601b7df 471 "a" 0x2b24d044 0x089be207b544f1e4 472 "foobar" 0x0c1c9eb8 0x34531ca7168b8f38 474 Author's Address 476 Glenn Fowler 477 AT&T Labs Research 478 180 Park Avenue 479 Florham Park, NJ 07932 USA 481 Email: gsf@research.att.com 482 URL: http://www.research.att.com/~gsf/ 484 Landon Curt Noll 485 Cisco Systems 486 170 West Tasman Drive 487 San Jose, CA 95134 USA 489 Telephone: +1-408-424-1102 490 Email: fnv-rfc-mail@asthe.com 491 URL: http://www.isthe.com/chongo/index.html 493 Kiem-Phong Vo 494 AT&T Labs Research 495 180 Park Avenue 496 Florham Park, NJ 07932 USA 498 Email: kpv@research.att.com 499 URL: http://www.research.att.com/info/kpv/ 501 Donald Eastlake 502 Huawei Technologies 503 155 Beaver Street 504 Milford, MA 01757 USA 506 Telephone: +1-508-333-2270 507 EMail: d3e3e3@gmail.com 509 Copyright, Disclaimer, and Additional IPR Provisions 511 Copyright (c) 2011 IETF Trust and the persons identified as the 512 document authors. All rights reserved. 514 This document is subject to BCP 78 and the IETF Trust's Legal 515 Provisions Relating to IETF Documents 516 (http://trustee.ietf.org/license-info) in effect on the date of 517 publication of this document. Please review these documents 518 carefully, as they describe your rights and restrictions with respect 519 to this document. Code Components extracted from this document must 520 include Simplified BSD License text as described in Section 4.e of 521 the Trust Legal Provisions and are provided without warranty as 522 described in the Simplified BSD License. This Internet-Draft is 523 submitted to IETF in full conformance with the provisions of BCP 78 524 and BCP 79.