CityHash
CityHash is a family of non-cryptographic hash functions, designed for fast hashing of strings. It has 32-, 64-, 128-, and 256-bit variants. CityHash has been referenced widely in academic papers.
Google developed the algorithm in-house starting in 2010.[1] The C++ source code for the reference implementation of the algorithm was released in 2011 under an MIT license, with credit to Geoff Pike and Jyrki Alakuijala.[2] The authors expect the algorithm to outperform previous work by a factor of 1.05 to 2.5, depending on the CPU and mix of string lengths being hashed.[3] CityHash is influenced by and partly based on MurmurHash.[4]
Some particularly fast CityHash functions depend on the CRC32 instructions that are present in SSE4.2. Most CityHash functions, however, are designed to be portable, although they run best on little-endian 32-bit or 64-bit CPUs.[3]
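Byte order matters to hash functions of this kind because they typically consume input as multi-byte words. The following Python sketch is an illustration only and is not taken from the CityHash source; the helper names `fetch64_le` and `fetch64_native` are hypothetical. It shows that the same eight bytes decode to different 64-bit integers depending on byte order, so a portable implementation has to fix one order and convert on hosts that use the other.

```python
import struct
import sys


def fetch64_le(data: bytes, offset: int = 0) -> int:
    """Read 8 bytes as an unsigned 64-bit integer, always little-endian."""
    return struct.unpack_from("<Q", data, offset)[0]


def fetch64_native(data: bytes, offset: int = 0) -> int:
    """Read 8 bytes as an unsigned 64-bit integer in the host's byte order."""
    return struct.unpack_from("=Q", data, offset)[0]


sample = b"CityHash"
le = fetch64_le(sample)                    # same value on every host
be = struct.unpack_from(">Q", sample)[0]   # what a big-endian word load sees

# On a little-endian host the native read already matches the fixed
# little-endian read; on a big-endian host a byte swap would be needed.
print(sys.byteorder, hex(le), hex(be), fetch64_native(sample) == le)
```

Pinning the byte order, as `fetch64_le` does, keeps results identical across platforms at the cost of extra work on big-endian hosts.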
Google has announced FarmHash as the successor to CityHash.
Concerns
CityHash releases do not maintain backward compatibility with previous versions.[5] Users should therefore either avoid storing CityHash values persistently or pin the CityHash version they use and not upgrade it.
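One way to guard against this when hash values must be stored is to record which hash version produced them. The sketch below is a minimal illustration under stated assumptions: `stand_in_hash` uses `zlib.crc32` purely as a placeholder and is not CityHash, and the label `HASH_VERSION` and the record layout are hypothetical.

```python
import zlib

# Hypothetical label naming the hash function and release that produced
# the stored value; change it whenever the underlying hash changes.
HASH_VERSION = "standin-crc32-v1"


def stand_in_hash(data: bytes) -> int:
    """Placeholder for a CityHash call. zlib.crc32 is NOT CityHash."""
    return zlib.crc32(data)


def make_record(data: bytes) -> dict:
    """Build a storable record that carries the hash and its version label."""
    return {"hash_version": HASH_VERSION, "hash": stand_in_hash(data)}


def verify(record: dict, data: bytes) -> bool:
    """Compare a stored hash with a fresh one, refusing mismatched versions."""
    if record["hash_version"] != HASH_VERSION:
        # The stored value came from a different hash release, so comparing
        # it with a freshly computed value would be meaningless.
        raise ValueError("hash version mismatch: recompute the stored hash")
    return record["hash"] == stand_in_hash(data)
```

With this layout, a stored value produced by an older release is detected and rejected explicitly instead of silently failing to match.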
The README warns that CityHash has not been tested much on big-endian platforms.[3]
References
External links
- Official site (redirects to an export on GitHub)
- Introducing CityHash, announcement by Google
- Slides from Geoff Pike's talk at Stanford University