FastIVF
Efficient implementation of IVF + IVFPQ Index with numpy and numba
Installation
- Install numpy and numba from conda to use Intel MKL libraries for linear algebra operations
- To install the package from the source code, run `pip install .`
- To install from pip, run `pip install fast-ivf`
- You may need to install `tensorflow>=2.13`, see `CompressedFastIVF` for details
- Code tested with `python==3.11`
- See the notebook test-index for Index usage examples
- See the notebook test-kmeans for K-means usage examples
Features / limitations
- This is experimental code that relies heavily on numba and numpy and may contain bugs
- IVF centroids are estimated with a custom mini-batch k-means implementation:
  - `MiniBatchKMeans` estimates the centroids of the standard Inverted Index
  - `SubvectorsMiniBatchKMeans` estimates the centroids of the Product Quantization Index
- K-means implementations support only l2 or cosine distances
- All indices currently support only cosine distance
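Since the indices support only cosine distance, embeddings are normalized to unit length (as in the examples below). On unit vectors, cosine similarity reduces to a plain inner product, which is why an inner-product scan can serve cosine search — a quick numpy check:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(64).astype(np.float32)
b = rng.random(64).astype(np.float32)

# cosine similarity on the raw vectors
cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# inner product after normalizing both vectors to unit length
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
ip = np.dot(a_n, b_n)

assert np.isclose(cos, ip)
```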
Results on custom benchmark data
- Resources restricted to `OMP_NUM_THREADS=12`, `MKL_NUM_THREADS=12`, `OPENBLAS_NUM_THREADS=12`, which in our case was enough for both fast-ivf and faiss to consume 100% of the allotted CPU
- Train vectors: internal dataset of ~900k vectors with dim=1024, normalized to unit length
- Test vectors: same dataset, 40k vectors
- Hyperparams: `nprobe=10`, `ratio_threshold=0.5`, no re-scoring for the approximated indices (for mini-batch k-means we use repository defaults); for `CompressedFastIVF` we use `compression_ndim=128`, which gives an 8x compression ratio
- We measure recall@10 as a function that checks whether `exact_i` is in `top_indices[:10]` for each test query; the results are then averaged over all test vectors
- For faiss we used similar parameters for `nlist`, `m`, `nbits`, etc.
- Reported time is the average of 5 runs divided by 40k, i.e. the time per single query
- As we use numba internally, each Fast-Index is initialized with a warmup call to compile the code
- Note: `CompressedFastIVF` requires training a small neural network to compress embeddings to a lower dimensionality, which increases the index build time
- For both libraries, each `search()` call processed all 40k test vectors at once to fully utilize vectorization
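The recall@10 metric described above can be sketched as follows; the helper name and variables (`recall_at_10`, `exact_top1`, `approx_topk`) are illustrative, not part of the library:

```python
import numpy as np

def recall_at_10(exact_top1, approx_topk):
    # for each query, check whether the exact nearest neighbor appears
    # among the approximate top-10, then average over all queries
    hits = [int(e in topk[:10]) for e, topk in zip(exact_top1, approx_topk)]
    return float(np.mean(hits))

exact_top1 = np.array([3, 7, 1])
approx_topk = np.array([
    [3, 5, 9, 0, 2, 4, 6, 8, 10, 11],   # hit: 3 is present
    [1, 2, 3, 4, 5, 6, 8, 9, 10, 11],   # miss: 7 is absent
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],     # hit: 1 is present
])
print(recall_at_10(exact_top1, approx_topk))  # 2 of 3 queries hit
```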
| Index | Recall@10 | Query Time (ms) | Params |
|---|---|---|---|
| FastIVF | 0.964 | 0.100 | nlist=1024, nprobe=10, ratio_threshold=0.5 |
| Faiss IVF | 0.968 | 1.000 | nlist=1024, nprobe=10 |
| FastIVFPQ | 0.802 | 0.100 | nlist=1024, nprobe=10, ratio_threshold=0.5, pq_num_subvectors=32, pq_num_centroids=128 |
| Faiss IVFPQ | 0.864 | 0.220 | nlist=1024, nprobe=10, m=32, nbits=7 |
| CompressedFastIVF | 0.933 | 0.050 | nlist=1024, nprobe=10, ratio_threshold=0.5, compression_ndim=128 |
| CompressedFastIVF | 0.889 | 0.040 | nlist=1024, nprobe=10, ratio_threshold=0.5, compression_ndim=64 |
Custom mini batch k-means implementation
Efficient mini-batch k-means implementation with numba and numpy:
```python
from fast_ivf.kmeans import MiniBatchKMeans
import numpy as np

kmeans = MiniBatchKMeans(num_centroids=16, batch_size=32, metric="l2")
data = np.random.rand(5000, 64)
kmeans.train(data)
kmeans.add(data)
labels = kmeans.predict(data)
```
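For reference, the mini-batch k-means update rule (assign a small batch to its nearest centroids, then move each centroid toward its assigned points with a per-centroid learning rate that decays as the centroid absorbs more points) can be sketched in plain numpy. This is illustrative only, not the library's numba implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((5000, 64)).astype(np.float32)

num_centroids, batch_size = 16, 32
# initialize centroids from random data points
centroids = data[rng.choice(len(data), num_centroids, replace=False)].copy()
counts = np.zeros(num_centroids)

for _ in range(200):
    batch = data[rng.choice(len(data), batch_size, replace=False)]
    # squared l2 distances, shape (batch_size, num_centroids)
    d2 = ((batch[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(1)
    for c in np.unique(assign):
        members = batch[assign == c]
        counts[c] += len(members)
        # decaying learning rate: early batches move centroids a lot,
        # later batches refine them
        lr = len(members) / counts[c]
        centroids[c] += lr * (members.mean(0) - centroids[c])
```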
Efficient mini-batch k-means implementation to train product quantization centroids:
```python
from fast_ivf.kmeans import SubvectorsMiniBatchKMeans
import numpy as np

kmeans = SubvectorsMiniBatchKMeans(num_centroids=16, num_subvectors=8, batch_size=32, metric="l2")
data = np.random.rand(5000, 64)
kmeans.train(data)
kmeans.add(data)
labels = kmeans.predict(data)
```
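The core idea of the subvectors variant, as used for product quantization, is to split each vector into equal-sized chunks and cluster each chunk position independently, yielding one codebook per subspace. A sketch of the splitting step (illustrative, not the library's code):

```python
import numpy as np

num_subvectors = 8
data = np.random.rand(5000, 64).astype(np.float32)
subdim = data.shape[1] // num_subvectors            # 8 dims per subvector
chunks = data.reshape(len(data), num_subvectors, subdim)

# each chunks[:, j, :] would be clustered by its own k-means,
# producing one codebook per subspace
print(chunks[:, 0, :].shape)  # (5000, 8)
```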
FastIVF
Similar to `faiss.IndexIVFFlat(faiss.IndexFlatIP(d), d, nlist, faiss.METRIC_INNER_PRODUCT)`
```python
from fast_ivf import FastIVF
from fast_ivf.core import normalize
import numpy as np

nlist = 1024
train_embeddings = normalize(np.random.rand(10000, 512).astype(np.float32))

index = FastIVF(512, nlist=nlist)
index.train(train_embeddings)
index.nprobe = 10
# greedily skip Voronoi cells whose score is smaller than 0.5 of the largest score
# higher values lead to faster search but lower accuracy
index.ratio_threshold = 0.5

test_embeddings = normalize(np.random.rand(100, 512).astype(np.float32))
distances, indices = index.search(test_embeddings, k=100)
```
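As a rough illustration of what `nprobe` and `ratio_threshold` do, here is a minimal sketch of the cell-selection step: score the query against all centroids, keep the `nprobe` best cells, then drop probed cells whose centroid score falls below `ratio_threshold` of the best one. This is illustrative only, not the library's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
centroids = rng.standard_normal((32, 64)).astype(np.float32)
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
query = rng.standard_normal(64).astype(np.float32)
query /= np.linalg.norm(query)

nprobe, ratio_threshold = 4, 0.5
scores = centroids @ query                  # cosine scores on unit vectors
probed = np.argsort(-scores)[:nprobe]       # nprobe best cells
best = scores[probed[0]]
# greedy pruning: keep only cells close enough to the best one;
# only the vectors stored in the surviving cells would be scanned
probed = probed[scores[probed] >= ratio_threshold * best]
print(len(probed))
```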
FastIVFPQ
Similar to `faiss.IndexIVFPQ(faiss.IndexFlatIP(d), d, nlist, m, n_bits)`
```python
from fast_ivf import FastIVFPQ

nlist = 1024
# pq_num_centroids = 2 ** n_bits
# pq_num_subvectors = m
index = FastIVFPQ(512, nlist=nlist, pq_num_centroids=64, pq_num_subvectors=32)
index.train(train_embeddings)
index.nprobe = 10
index.ratio_threshold = 0.5
distances, indices = index.search(test_embeddings, k=100)

# compute exact scores for the top 100 results: slower but more accurate
distances, indices = index.search(test_embeddings, k=100, rescore=True)

# calibrate scores by fitting a linear regression model to N=20 exact scores;
# if -1, all scores are computed exactly
index.rescore_num_samples = 20
distances, indices = index.search(test_embeddings, k=100, rescore=True)
```
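The score-calibration idea behind `rescore_num_samples` can be sketched as follows: exactly re-score only the first N candidates, fit a linear map from approximate to exact scores on those N pairs, and apply the map to the remaining approximate scores. This is an illustrative stand-in with synthetic scores, not the library's code:

```python
import numpy as np

rng = np.random.default_rng(0)
exact = rng.random(100).astype(np.float64)
# synthetic approximate scores: a noisy linear distortion of the exact ones
approx = 0.8 * exact + 0.05 + 0.01 * rng.standard_normal(100)

N = 20
# fit slope and intercept on the N exactly re-scored candidates
slope, intercept = np.polyfit(approx[:N], exact[:N], 1)
calibrated = slope * approx + intercept

# calibrated scores track the exact ones much more closely
err_before = np.abs(approx - exact).mean()
err_after = np.abs(calibrated - exact).mean()
print(err_after < err_before)
```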
CompressedFastIVF
Trains a Keras autoencoder to compress embeddings to a lower dimensionality:
```python
from fast_ivf import CompressedFastIVF

nlist = 1024
index = CompressedFastIVF(512, nlist=nlist, compression_ndim=128)
index.train(train_embeddings)
index.nprobe = 10
index.ratio_threshold = 0.5
distances, indices = index.search(test_embeddings, k=100)

# compute exact scores for the top 100 results: slower but more accurate
distances, indices = index.search(test_embeddings, k=100, rescore=True)
```
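The library's compression is a trained Keras autoencoder; as a rough stand-in, this sketch uses a PCA projection to show the same pattern, projecting 512-dim vectors down to `compression_ndim` dimensions for the search stage, with the original vectors kept around for optional exact re-scoring. The shapes and the PCA substitute are illustrative assumptions, not the library's method:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.standard_normal((2000, 512)).astype(np.float32)

compression_ndim = 128
mean = train.mean(0)
# top principal directions via SVD of the centered data
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
proj = vt[:compression_ndim].T                      # 512 -> 128 linear map

compressed = (train - mean) @ proj
print(compressed.shape)  # (2000, 128): 4x smaller than the 512-dim input
```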