-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Indexing 1G vectors
For those datasets, compression becomes mandatory (we are talking here about 50M-1G per server). The main compression method used in Faiss is PQ (product quantizer) compression, with a pre-selection based on a coarse quantizer (see previous section).
The Bigann dataset is a classical benchmark used in computer vision. It contains 1 billion SIFT descriptors. The plot below shows that it is relatively easy to index:
These results were obtained with bench_polysemous_1bn.py:
python bench_polysemous_1bn.py SIFT1000M OPQ8_64,IMI2x14,PQ8 autotuneMT
Another research dataset that was introduced in this CVPR'16 paper. It contains 1Bn 96D descriptors.
A recent CVPR'16 paper has a GPU implementation for the search. We experiment with relatively low-precision operating points (8 bytes per code) to allow for a direct comparison with published papers. Note however that for high quality neighbors, more bytes would be required (see above).
Method | Hardware | R@10 | query time (ms) / vector |
---|---|---|---|
Wieschollek et al. CVPR'16 | Titan X | 0.35 | 0.15 |
OPQ8_64,IMI2x13,PQ8x8 | CPU (1 thread) | 0.349 | 0.4852 |
" | CPU (20 threads) | 0.349 | 0.035 |
OPQ8_32,IVF262144,PQ8 | Titan X | 0.376 | 0.0340 |
" | " | 0.448 | 0.1214 |
(methods are described with their index_factory
string)
The operating point we are interested in is one that takes ~25 GB of RAM, which corresponds to 20-byte PQ codes. The first row is the best operating point we are aware of at the time we made the comparison. The other rows correspond to different operating points achieved by CPU- and GPU-Faiss algorithms.
Method | Hardware | R@1 | query time (ms) / vector |
---|---|---|---|
Babenko & al. CVPR'16 | CPU (1 thread) | 0.45 | 20 |
OPQ20_80,IMI2x14,PQ20 | CPU (1 thread) | 0.4561 | 3.66 |
OPQ20_80,IVF262144,PQ20 | 4*K40 | 0.488 | 0.18 |
" | 4*K40 | 0.493 | 1.1 |
OPQ32,IVF262144,PQ32 | 8*TitanX | 0.671 | 0.2328 |
OPQ64_128,IVF262144,PQ64 (float16 mode) | 8*TitanX | 0.856 | 0. 3207 |
Faiss building blocks: clustering, PCA, quantization
Index IO, cloning and hyper parameter tuning
Threads and asynchronous calls
Inverted list objects and scanners
Indexes that do not fit in RAM
Brute force search without an index
Fast accumulation of PQ and AQ codes (FastScan)
Setting search parameters for one query
Binary hashing index benchmark