Repo created
This commit is contained in:
parent
81b91f4139
commit
f8c34fa5ee
22732 changed files with 4815320 additions and 2 deletions
196
TMessagesProj/jni/voip/webrtc/base/third_party/cityhash/README
vendored
Normal file
196
TMessagesProj/jni/voip/webrtc/base/third_party/cityhash/README
vendored
Normal file
|
|
@ -0,0 +1,196 @@
|
|||
CityHash, a family of hash functions for strings.
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
CityHash provides hash functions for strings. The functions mix the
|
||||
input bits thoroughly but are not suitable for cryptography. See
|
||||
"Hash Quality," below, for details on how CityHash was tested and so on.
|
||||
|
||||
We provide reference implementations in C++, with a friendly MIT license.
|
||||
|
||||
CityHash32() returns a 32-bit hash.
|
||||
|
||||
CityHash64() and similar return a 64-bit hash.
|
||||
|
||||
CityHash128() and similar return a 128-bit hash and are tuned for
|
||||
strings of at least a few hundred bytes. Depending on your compiler
|
||||
and hardware, it's likely faster than CityHash64() on sufficiently long
|
||||
strings. It's slower than necessary on shorter strings, but we expect
|
||||
that case to be relatively unimportant.
|
||||
|
||||
CityHashCrc128() and similar are variants of CityHash128() that depend
|
||||
on _mm_crc32_u64(), an intrinsic that compiles to a CRC32 instruction
|
||||
on some CPUs. However, none of the functions we provide are CRCs.
|
||||
|
||||
CityHashCrc256() is a variant of CityHashCrc128() that also depends
|
||||
on _mm_crc32_u64(). It returns a 256-bit hash.
|
||||
|
||||
All members of the CityHash family were designed with heavy reliance
|
||||
on previous work by Austin Appleby, Bob Jenkins, and others.
|
||||
For example, CityHash32 has many similarities with Murmur3a.
|
||||
|
||||
Performance on long strings: 64-bit CPUs
|
||||
========================================
|
||||
|
||||
We are most excited by the performance of CityHash64() and its variants on
|
||||
short strings, but long strings are interesting as well.
|
||||
|
||||
CityHash is intended to be fast, under the constraint that it hash very
|
||||
well. For CPUs with the CRC32 instruction, CRC is speedy, but CRC wasn't
|
||||
designed as a hash function and shouldn't be used as one. CityHashCrc128()
|
||||
is not a CRC, but it uses the CRC32 machinery.
|
||||
|
||||
On a single core of a 2.67GHz Intel Xeon X5550, CityHashCrc256 peaks at about
|
||||
5 to 5.5 bytes/cycle. The other CityHashCrc functions are wrappers around
|
||||
CityHashCrc256 and should have similar performance on long strings.
|
||||
(CityHashCrc256 in v1.0.3 was even faster, but we decided it wasn't as thorough
|
||||
as it should be.) CityHash128 peaks at about 4.3 bytes/cycle. The fastest
|
||||
Murmur variant on that hardware, Murmur3F, peaks at about 2.4 bytes/cycle.
|
||||
We expect the peak speed of CityHash128 to dominate CityHash64, which is
|
||||
aimed more toward short strings or use in hash tables.
|
||||
|
||||
For long strings, a new function by Bob Jenkins, SpookyHash, is just
|
||||
slightly slower than CityHash128 on Intel x86-64 CPUs, but noticeably
|
||||
faster on AMD x86-64 CPUs. For hashing long strings on AMD CPUs
|
||||
and/or CPUs without the CRC instruction, SpookyHash may be just as
|
||||
good or better than any of the CityHash variants.
|
||||
|
||||
Performance on short strings: 64-bit CPUs
|
||||
=========================================
|
||||
|
||||
For short strings, e.g., most hash table keys, CityHash64 is faster than
|
||||
CityHash128, and probably faster than all the aforementioned functions,
|
||||
depending on the mix of string lengths. Here are a few results from that
|
||||
same hardware, where we (unrealistically) tested a single string length over
|
||||
and over again:
|
||||
|
||||
Hash Results
|
||||
------------------------------------------------------------------------------
|
||||
CityHash64 v1.0.3 7ns for 1 byte, or 6ns for 8 bytes, or 9ns for 64 bytes
|
||||
Murmur2 (64-bit) 6ns for 1 byte, or 6ns for 8 bytes, or 15ns for 64 bytes
|
||||
Murmur3F 14ns for 1 byte, or 15ns for 8 bytes, or 23ns for 64 bytes
|
||||
|
||||
We don't have CityHash64 benchmarks results for v1.1, but we expect the
|
||||
numbers to be similar.
|
||||
|
||||
Performance: 32-bit CPUs
|
||||
========================
|
||||
|
||||
CityHash32 is the newest variant of CityHash. It is intended for
|
||||
32-bit hardware in general but has been mostly tested on x86. Our benchmarks
|
||||
suggest that Murmur3 is the nearest competitor to CityHash32 on x86.
|
||||
We don't know of anything faster that has comparable quality. The speed rankings
|
||||
in our testing: CityHash32 > Murmur3f > Murmur3a (for long strings), and
|
||||
CityHash32 > Murmur3a > Murmur3f (for short strings).
|
||||
|
||||
Installation
|
||||
============
|
||||
|
||||
We provide reference implementations of several CityHash functions, written
|
||||
in C++. The build system is based on autoconf. It defaults the C++
|
||||
compiler flags to "-g -O2", which is probably slower than -O3 if you are
|
||||
using gcc. YMMV.
|
||||
|
||||
On systems with gcc, we generally recommend:
|
||||
|
||||
./configure
|
||||
make all check CXXFLAGS="-g -O3"
|
||||
sudo make install
|
||||
|
||||
Or, if your system has the CRC32 instruction, and you want to build everything:
|
||||
|
||||
./configure --enable-sse4.2
|
||||
make all check CXXFLAGS="-g -O3 -msse4.2"
|
||||
sudo make install
|
||||
|
||||
Note that our build system doesn't try to determine the appropriate compiler
|
||||
flag for enabling SSE4.2. For gcc it is "-msse4.2". The --enable-sse4.2
|
||||
flag to the configure script controls whether citycrc.h is installed when
|
||||
you "make install." In general, picking the right compiler flags can be
|
||||
tricky, and may depend on your compiler, your hardware, and even how you
|
||||
plan to use the library.
|
||||
|
||||
For generic information about how to configure this software, please try:
|
||||
|
||||
./configure --help
|
||||
|
||||
Failing that, please work from city.cc and city*.h, as they contain all the
|
||||
necessary code.
|
||||
|
||||
|
||||
Usage
|
||||
=====
|
||||
|
||||
The above installation instructions will produce a single library. It will
|
||||
contain CityHash32(), CityHash64(), and CityHash128(), and their variants,
|
||||
and possibly CityHashCrc128(), CityHashCrc128WithSeed(), and
|
||||
CityHashCrc256(). The functions with Crc in the name are declared in
|
||||
citycrc.h; the rest are declared in city.h.
|
||||
|
||||
|
||||
Limitations
|
||||
===========
|
||||
|
||||
1) CityHash32 is intended for little-endian 32-bit code, and everything else in
|
||||
the current version of CityHash is intended for little-endian 64-bit CPUs.
|
||||
|
||||
All functions that don't use the CRC32 instruction should work in
|
||||
little-endian 32-bit or 64-bit code. CityHash should work on big-endian CPUs
|
||||
as well, but we haven't tested that very thoroughly yet.
|
||||
|
||||
2) CityHash is fairly complex. As a result of its complexity, it may not
|
||||
perform as expected on some compilers. For example, preliminary reports
|
||||
suggest that some Microsoft compilers compile CityHash to assembly that's
|
||||
10-20% slower than it could be.
|
||||
|
||||
|
||||
Hash Quality
|
||||
============
|
||||
|
||||
We like to test hash functions with SMHasher, among other things.
|
||||
SMHasher isn't perfect, but it seems to find almost any significant flaw.
|
||||
SMHasher is available at http://code.google.com/p/smhasher/
|
||||
|
||||
SMHasher is designed to pass a 32-bit seed to the hash functions it tests.
|
||||
No CityHash function is designed to work that way, so we adapt as follows:
|
||||
For our functions that accept a seed, we use the given seed directly (padded
|
||||
with zeroes); for our functions that don't accept a seed, we hash the
|
||||
concatenation of the given seed and the input string.
|
||||
|
||||
The CityHash functions have the following flaws according to SMHasher:
|
||||
|
||||
(1) CityHash64: none
|
||||
|
||||
(2) CityHash64WithSeed: none
|
||||
|
||||
(3) CityHash64WithSeeds: did not test
|
||||
|
||||
(4) CityHash128: none
|
||||
|
||||
(5) CityHash128WithSeed: none
|
||||
|
||||
(6) CityHashCrc128: none
|
||||
|
||||
(7) CityHashCrc128WithSeed: none
|
||||
|
||||
(8) CityHashCrc256: none
|
||||
|
||||
(9) CityHash32: none
|
||||
|
||||
Some minor flaws in 32-bit and 64-bit functions are harmless, as we
|
||||
expect the primary use of these functions will be in hash tables. We
|
||||
may have gone slightly overboard in trying to please SMHasher and other
|
||||
similar tests, but we don't want anyone to choose a different hash function
|
||||
because of some minor issue reported by a quality test.
|
||||
|
||||
|
||||
For more information
|
||||
====================
|
||||
|
||||
http://code.google.com/p/cityhash/
|
||||
|
||||
cityhash-discuss@googlegroups.com
|
||||
|
||||
Please feel free to send us comments, questions, bug reports, or patches.
|
||||
Loading…
Add table
Add a link
Reference in a new issue