idnits 2.17.1 draft-eastlake-fnv-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** You're using the IETF Trust Provisions' Section 6.b License Notice from 12 Sep 2009 rather than the newer Notice from 28 Dec 2009. (See https://trustee.ietf.org/license-info/) Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 178 has weird spacing: '...ed char hash...' -- The document date (March 7, 2011) is 4797 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'N' is mentioned on line 178, but not defined Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Glenn Fowler 2 INTERNET-DRAFT AT&T Labs Research 3 Intended Status: Informational Landon Curt Noll 4 Cisco Systems 5 Kiem-Phong Vo 6 AT&T Labs Research 7 Donald Eastlake 8 Huawei Technologies 9 Expires: September 6, 2011 March 7, 2011 11 The FNV Non-Cryptographic Hash Algorithm 12 14 Abstract 16 FNV (Fowler/Noll/Vo) is a fast, non-cryptographic hash algorithm with 17 good dispersion. The purpose of this document is to make information 18 on FNV and open source code performing FNV conveniently available to 19 the Internet community. 21 Status of This Memo 23 This Internet-Draft is submitted to IETF in full conformance with the 24 provisions of BCP 78 and BCP 79. 26 Distribution of this document is unlimited. Comments should be sent 27 to the authors. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as Internet- 32 Drafts. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/1id-abstracts.html 42 The list of Internet-Draft Shadow Directories can be accessed at 43 http://www.ietf.org/shadow.html 45 Table of Contents 47 1. Introduction............................................3 49 2. FNV Basics..............................................4 50 2.1 FNV Primes.............................................4 51 2.2 FNV offset_basis.......................................5 52 2.3 FNV Endianism..........................................5 54 3. Other Hash Sizes and XOR Folding........................6 55 4. FNV Constants...........................................7 57 5. The Source Code.........................................9 58 5.1 FNV C Header...........................................9 59 5.2 FNV C Code.............................................9 60 5.3 FNV Test Code..........................................9 62 6. Security Considerations................................10 63 7. IANA Considerations....................................10 65 8. References.............................................11 66 8.1 Normative References..................................11 67 8.2 Informative References................................11 69 1. Introduction 71 The FNV hash algorithm is based on an idea sent as reviewer comments 72 to the IEEE POSIX P1003.2 committee by Glenn Fowler and Phong Vo in 73 1991. In a subsequent ballot round Landon Curt Noll suggested an 74 improvement on their algorithm. Some people tried this hash and found 75 that it worked rather well. In an EMail message to Landon, they named 76 it the "Fowler/Noll/Vo" or FNV hash. [FNV] 78 FNV hashes are designed to be fast while maintaining a low collision 79 rate. Their speed allows one to quickly hash lots of data while 80 maintaining a reasonably low collision rate. The high dispersion of 81 the FNV hashes makes them well suited for hashing nearly identical 82 strings such as URLs, hostnames, filenames, text, IP addresses, etc. 83 However, they are not suitable for cryptographic use. (For some hash 84 algorithms more suitable for cryptographic use see [RFCsha].) 86 The FNV hash is widely used, for example in DNS servers, database 87 indexing hashes, major web search / indexing engines, netnews history 88 file Message-ID lookup functions, anti-spam filters, a spellchecker 89 programmed in Ada 95, flatassembler's open source x86 assembler - 90 user-defined symbol hashtree, non-cryptographic file fingerprints, 91 computing Unique IDs in DASM (DTN Applications for Symbian Mobile- 92 phones), Microsoft's hash_map implementation for VC++ 2005, the 93 realpath cache in PHP 5.x (php-5.2.3/TSRM/tsrm_virtual_cwd.c), and 94 many other uses. 96 FNV hash algorithms and source code have been released into the 97 public domain. The authors of the FNV algorithm took deliberate steps 98 to disclose the algorithm in a public forum soon after it was 99 invented. More than a year passed after this public disclosure and 100 the authors deliberatley took no steps to patent the FNV algorithm. 101 Therefore, it is safe to say that the FNV authors have no patent 102 claims on the FNV algorithm as published. 104 If you use an FNV function in an application, you are kindly 105 requested to send an EMail about it to: fnv-mail@asthe.com 107 2. FNV Basics 109 This document focuses on the FNV-1a function whose pseudo-code is as 110 follows: 112 hash = offset_basis 113 for each octet_of_data to be hashed 114 hash = hash xor octet_of_data 115 hash = hash * FNV_Prime 116 return hash 118 In the pseudo-code above, hash is a power-of-two number of bits (32, 119 64, ... 1024) and offset_basis and FNV_Prime depend on the size of 120 hash. 122 The FNV-1 algorithm is the same, including the values of offset_basis 123 and FNV_Prime, except that the order of the two lines with the "xor" 124 and multiply operations are reversed. Operational experience 125 indicates better hash dispersion for small amounts of data with 126 FNV-1a. FNV-0 is the same as FNV-1 but with offset_basis set to zero. 127 FNV-1a is suggested for general use. 129 2.1 FNV Primes 131 The theory behind FNV_Prime's is beyond the scope of this document 132 but the basic property to look for is how an FNV_Prime would impact 133 dispersion. Now, consider any n-bit FNV hash where n is >= 32 and 134 also a power of 2. For each such an n-bit FNV hash, an FNV_Prime p is 135 defined as: 137 The smallest prime of the form p = 2**t + 2**8 + b where: 138 - t is an integer such that: 139 If n == 32, then t == int((3/4)*n) == 24, or 140 If n >= 64, then t == 8*int((n+5)/12). 141 - b is an integer such that: 142 0 < b < 2**8, and 143 The number of one-bits in b is 4 or 5 145 Experimentally, FNV_Primes matching the above constraints tend to 146 have better dispersion properties. They improve the polynomial 147 feedback characteristic when an FNV_Prime multiplies an intermediate 148 hash value. As such, the hash values produced are more scattered 149 throughout the n-bit hash space. 151 Per the above constraints, an FNV_Prime should have only 6 or 7 one- 152 bits in it. Therefore, some compilers may seek to improve the 153 performance of a multiplication with an FNV_Prime by replacing the 154 multiplication with shifts and adds. However, note that the 155 performance of this substitution is highly hardware-dependent and 156 should be done with care. FNV_Primes were selected primarily for the 157 quality of resulting hash function, not for compiler optimization. 159 2.2 FNV offset_basis 161 The offset_basis values for the n-bit FNV-1a algorithms are computed 162 by applying the n-bit FNV-0 algorithm to the following 32 octets: 164 chongo /\../\ 166 The \'s in the above string are not C-style escape characters. In C- 167 string notation, these 32 octets are: 169 "chongo /\\../\\" 171 2.3 FNV Endianism 173 For persistent storage or interoperability between different hardware 174 platforms, an FNV hash shall be represented in the little endian 175 format. That is, the FNV hash will be stored in an array hash[N] with 176 N bytes such that its integer value can be retrieved as follows: 178 unsigned char hash[N]; 179 for ( i = N-1, value = 0; i >= 0; --i ) 180 value = value << 8 + hash[i]; 182 Of course, when FNV hashes are used in a single process or a group of 183 processes sharing memory on processors with compatible endian-ness, 184 the natural endianness of those processors can be used regardless of 185 its type, little, big, or some other exotic form. 187 3. Other Hash Sizes and XOR Folding 189 Many hash uses require a hash that is not one of the FNV sizes for 190 which constants are provided in Section 4. If a larger hash size is 191 needed, please contact the authors of this document. 193 Most hash applications make use of a hash that is a fixed size binary 194 field. Assume that k bits of hash are desired and k is less than 1024 195 but not one of the sizes for which constants are provided in Section 196 4. The recommended technique is to take the smallest FNV hash of size 197 S, where S is larger than k, and calculate the desired hash using xor 198 folding as shown below. The final bit masking operation is logically 199 unnecessarily if the size of hash is exactly the number of desired 200 bits. 202 temp = FNV_S ( data-to-be-hashed ) 203 hash = ( temp xor temp>>k ) bitwise-and ( 2**k - 1 ) 205 Hash functions are a trade-off between speed and strength. For 206 example, a somewhat stronger hash may be obtained for exact FNV sizes 207 by calculating an FNV twice as long as the desired output ( S = 2*k ) 208 and performing such data folding using a k equal to the size of the 209 desired output. However, if a much stronger hash, for example one 210 suitable for cryptographic applications, is wanted, algorithms 211 designed for that purpose, such as those in [RFCsha] should be used. 213 If it is desired to obtain a hash result that is a value between 0 214 and max, where max is a not a power of two, simply choose an FNV hash 215 size S such that 2**S > max. Then calculate the following: 217 FNV_S mod ( max+1 ) 219 The resulting remainder will be in the range desired but will suffer 220 from a bias against large values with the bias being larger if 2**S 221 is only a little bigger than max. If this bias is acceptable, no 222 further processing is needed. If this bias is unacceptable, it can be 223 avoided by retrying for certain high values of hash, as follows, 224 before applying the mod operation above: 226 X = ( int( ( 2**S - 1 ) / ( max+1 ) ) ) * ( max+1 ) 227 while ( hash >= X ) 228 hash = ( hash * FNV_Prime ) + offset_basis 230 4. FNV Constants 232 The FNV Primes are as follows: 234 32 bit FNV_Prime = 2**24 + 2**8 + 0x93 = 16,777,619 235 = 0x01000193 237 64 bit FNV_Prime = 2**40 + 2**8 + 0xB3 = 1,099,511,628,211 238 = 0x00000100 000001B3 240 128 bit FNV_Prime = 2**88 + 2**8 + 0x3B = 241 309,485,009,821,345,068,724,781,371 242 = 0x00000000 01000000 00000000 0000013B 244 256 bit FNV_Prime = 2**168 + 2**8 + 0x63 = 245 374,144,419,156,711,147,060,143,317,175,368,453,031,918,731,002,211 = 246 0x0000000000000000 0000010000000000 0000000000000000 0000000000000163 248 512 bit FNV_Prime = 2**344 + 2**8 + 0x57 = 35, 249 835,915,874,844,867,368,919,076,489,095,108,449,946,327,955,754,392, 250 558,399,825,615,420,669,938,882,575,126,094,039,892,345,713,852,759 = 251 0x0000000000000000 0000000000000000 0000000001000000 0000000000000000 252 0000000000000000 0000000000000000 0000000000000000 0000000000000157 254 1024 bit FNV_Prime = 2**680 + 2**8 + 0x8D = 5, 255 016,456,510,113,118,655,434,598,811,035,278,955,030,765,345,404,790, 256 744,303,017,523,831,112,055,108,147,451,509,157,692,220,295,382,716, 257 162,651,878,526,895,249,385,292,291,816,524,375,083,746,691,371,804, 258 094,271,873,160,484,737,966,720,260,389,217,684,476,157,468,082,573 = 259 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000 260 0000000000000000 0000010000000000 0000000000000000 0000000000000000 261 0000000000000000 0000000000000000 0000000000000000 0000000000000000 262 0000000000000000 0000000000000000 0000000000000000 000000000000018D 264 The FNV offset_basis values are as follows: 266 32 bit offset_basis = 2,166,136,261 = 0x811C9DC5 268 64 bit offset_basis = 14695981039346656037 = 0xCBF29CE4 84222325 270 128 bit offset_basis = 144066263297769815596495629667062367629 = 271 0x6C62272E 07BB0142 62B82175 6295C58D 273 256 bit offset_basis = 100,029,257,958,052,580,907,070,968, 274 620,625,704,837,092,796,014,241,193,945,225,284,501,741,471,925,557 = 275 0xDD268DBCAAC55036 2D98C384C4E576CC C8B1536847B6BBB3 1023B4C8CAEE0535 276 512 bit offset_basis = 9, 277 659,303,129,496,669,498,009,435,400,716,310,466,090,418,745,672,637, 278 896,108,374,329,434,462,657,994,582,932,197,716,438,449,813,051,892, 279 206,539,805,784,495,328,239,340,083,876,191,928,701,583,869,517,785 = 280 0xB86DB0B1171F4416 DCA1E50F309990AC AC87D059C9000000 0000000000000D21 281 E948F68A34C192F6 2EA79BC942DBE7CE 182036415F56E34B AC982AAC4AFE9FD9 283 1024 bit offset_basis = 14,197,795,064,947,621,068,722,070,641,403, 284 218,320,880,622,795,441,933,960,878,474,914,617,582,723,252,296,732, 285 303,717,722,150,864,096,521,202,355,549,365,628,174,669,108,571,814, 286 760,471,015,076,148,029,755,969,804,077,320,157,692,458,563,003,215, 287 304,957,150,157,403,644,460,363,550,505,412,711,285,966,361,610,267, 288 868,082,893,823,963,790,439,336,411,086,884,584,107,735,010,676,915 = 289 0x0000000000000000 005F7A76758ECC4D 32E56D5A591028B7 4B29FC4223FDADA1 290 6C3BF34EDA3674DA 9A21D90000000000 0000000000000000 0000000000000000 291 0000000000000000 0000000000000000 0000000000000000 000000000004C6D7 292 EB6E73802734510A 555F256CC005AE55 6BDE8CC9C6A93B21 AFF4B16C71EE90B3 294 5. The Source Code 296 The following sub-sections are intended, in later versions, to 297 include reference C source code and a test driver for FNV-1a. 299 5.1 FNV C Header 301 TBD 303 5.2 FNV C Code 305 TBD 307 5.3 FNV Test Code 309 TBD 311 6. Security Considerations 313 This document is intended to provide convenient open source access by 314 the Internet community to the FNV non-cryptographic hash. No 315 assertion of suitability for cryptographic applications is made for 316 the FNV hash algorithms. 318 7. IANA Considerations 320 This document requires no IANA Actions. The RFC Editor should delete 321 this section before publication. 323 8. References 325 Below are the normative and informative references for this document. 327 8.1 Normative References 329 None. 331 8.2 Informative References 333 [FNV] - FNV web site: 334 http://www.isthe.com/chongo/tech/comp/fnv/index.html 336 [RFCsha] - D. Eastlake, T. Hansen, "US Secure Hash Algorithms (SHA 337 and SHA based HMAC and HKDF)", draft-eastlake-sha2b-07.txt, in 338 RFC Editor queue. 340 Appendix: Test Vectors 342 Below are a few test vectors in the form of ASCII strings and their 343 FNV32 and FNV64 hashes using the FNV-1a algorithm. 345 Strings without null (zero byte) termination: 347 String FNV32 FNV64 348 "" 0x811c9dc5 0xcbf29ce484222325 349 "a" 0xe40c292c 0xaf63dc4c8601ec8c 350 "foobar" 0xbf9cf968 0x85944171f73967e8 352 Strings including null (zero byte) termination: 354 String FNV32 FNV64 355 "" 0x050c5d1f 0xaf63bd4c8601b7df 356 "a" 0x2b24d044 0x089be207b544f1e4 357 "foobar" 0x0c1c9eb8 0x34531ca7168b8f38 359 Author's Address 361 Glenn Fowler 362 AT&T Labs Research 363 180 Park Avenue 364 Florham Park, NJ 07932 USA 366 Email: gsf@research.att.com 367 URL: http://www.research.att.com/~gsf/ 369 Landon Curt Noll 370 Cisco Systems 371 170 West Tasman Drive 372 San Jose, CA 95134 USA 374 Telephone: +1-408-424-1102 375 Email: fnv-rfc-mail@asthe.com 376 URL: http://www.isthe.com/chongo/index.html 378 Kiem-Phong Vo 379 AT&T Labs Research 380 180 Park Avenue 381 Florham Park, NJ 07932 USA 383 Email: kpv@research.att.com 384 URL: http://www.research.att.com/info/kpv/ 386 Donald Eastlake 387 Huawei Technologies 388 155 Beaver Street 389 Milford, MA 01757 USA 391 Telephone: +1-508-333-2270 392 EMail: d3e3e3@gmail.com 394 Copyright, Disclaimer, and Additional IPR Provisions 396 Copyright (c) 2011 IETF Trust and the persons identified as the 397 document authors. All rights reserved. 399 This document is subject to BCP 78 and the IETF Trust's Legal 400 Provisions Relating to IETF Documents 401 (http://trustee.ietf.org/license-info) in effect on the date of 402 publication of this document. Please review these documents 403 carefully, as they describe your rights and restrictions with respect 404 to this document. Code Components extracted from this document must 405 include Simplified BSD License text as described in Section 4.e of 406 the Trust Legal Provisions and are provided without warranty as 407 described in the BSD License. This Internet-Draft is submitted to 408 IETF in full conformance with the provisions of BCP 78 and BCP 79.