MonaVec packs vector search into 27 MB using deterministic 4-bit quantization.

MonaVec quantizes vectors to 4 bits using a randomized Hadamard transform and precomputed Lloyd-Max tables, reaching 0.960 Recall@10 on AG News with no training pass.

Tags: monavec, vector search, edge ai, rust, quantization, faiss

MonaVec hits 0.960 Recall@10 on a 45K x 1024-dim BGE-M3 embedding set using only 27 MB of index storage, with zero training and zero server dependencies.
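As a rough sanity check on the storage figure, the raw code payload can be estimated from the stated shape. The sketch below assumes "45K" means exactly 45,000 vectors and counts only the quantized components; it does not model headers, norms, or other per-vector metadata, whose size the article does not break down.

```python
# Back-of-the-envelope payload sizes (assumption: exactly 45,000 vectors).
N_VECTORS = 45_000
DIM = 1024


def payload_bytes(n_vectors: int, dim: int, bits_per_component: int) -> int:
    """Bytes needed for the component codes alone, with no metadata."""
    total_bits = n_vectors * dim * bits_per_component
    return total_bits // 8


float32_bytes = payload_bytes(N_VECTORS, DIM, 32)
four_bit_bytes = payload_bytes(N_VECTORS, DIM, 4)

print(f"float32 payload: {float32_bytes / 1e6:.2f} MB")
print(f"4-bit payload:   {four_bit_bytes / 1e6:.2f} MB")
print(f"reduction:       {float32_bytes // four_bit_bytes}x")
```

Under these assumptions the 4-bit codes account for about 23 MB of the reported 27 MB, against roughly 184 MB for the same vectors in float32.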

That number matters because the established vector search options (FAISS, usearch, ScaNN) typically assume at least one of three things: a persistent server, gigabytes of RAM, or a training pass over the corpus. MonaVec targets the edge instead: one file, one function call, runs anywhere. Its quantization core is data-oblivious and training-free by default.

Training-Free Quantization via Randomized Hadamard Transform

MonaVec applies a Randomized Hadamard Transform (RHDH) to push any input distribution toward N(0,1). Once the components are approximately standard normal, precomputed Lloyd-Max tables quantize each one to 4 bits, an 8x reduction from float32, with no learned codebook and no pass over the data. For magnitude-sensitive L2 data, a single-pass global standardization (fit()) extends the same pipeline without breaking the training-free guarantee.
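The pipeline described above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not MonaVec's code: it uses one common construction (random sign flips followed by a fast Walsh-Hadamard transform), which may differ from the paper's exact RHDH; it derives the 16-level Lloyd-Max table from the N(0,1) density at startup instead of shipping a precomputed one; and it uses Python's `random.Random` as a stand-in for the ChaCha20 generator. All function names are hypothetical.

```python
import math
import random
from bisect import bisect_right


def gaussian_lloyd_max(n_levels=16, iterations=500):
    """Lloyd-Max levels and decision bounds for N(0,1), from the density only."""
    def pdf(t):
        return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

    def cdf(t):
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    levels = [-3.0 + 6.0 * (i + 0.5) / n_levels for i in range(n_levels)]
    for _ in range(iterations):
        inner = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        edges = [-math.inf] + inner + [math.inf]
        # Each level becomes the centroid of N(0,1) restricted to its cell.
        levels = [(pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))
                  for lo, hi in zip(edges, edges[1:])]
    bounds = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
    return levels, bounds


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of 2)."""
    out = list(values)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def random_signs(dim, seed):
    """Seeded +1/-1 diagonal (stand-in for a ChaCha20-derived rotation)."""
    rng = random.Random(seed)
    return [1.0 if rng.getrandbits(1) else -1.0 for _ in range(dim)]


def encode(vector, signs, bounds):
    """Normalize, rotate, and quantize to 4-bit codes packed two per byte."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    # For a unit vector, the unnormalized transform has roughly unit-variance outputs.
    rotated = fwht([s * v / norm for s, v in zip(signs, vector)])
    codes = [bisect_right(bounds, r) for r in rotated]  # each in 0..15
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, norm


def decode(packed, norm, signs, levels):
    """Invert the pipeline: look up levels, undo the transform and sign flips."""
    codes = [c for byte in packed for c in (byte >> 4, byte & 0x0F)]
    dim = len(codes)
    unrotated = fwht([levels[c] for c in codes])  # H applied twice equals dim * I
    return [s * v * norm / dim for s, v in zip(signs, unrotated)]


DIM = 256
LEVELS, BOUNDS = gaussian_lloyd_max()
SIGNS = random_signs(DIM, seed=42)

rng = random.Random(0)
vector = [rng.expovariate(1.0) for _ in range(DIM)]  # deliberately non-Gaussian input
packed, norm = encode(vector, SIGNS, BOUNDS)
restored = decode(packed, norm, SIGNS, LEVELS)

dot = sum(a * b for a, b in zip(vector, restored))
cosine = dot / (math.sqrt(sum(a * a for a in vector)) *
                math.sqrt(sum(b * b for b in restored)))
print(f"{len(packed)} bytes for {DIM} dims, cosine to original: {cosine:.4f}")
```

The input here is exponentially distributed on purpose: the sign flips and transform spread it into roughly Gaussian components, which is why a table built for N(0,1) alone can be reused without looking at the corpus.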

A pure Rust implementation with Python bindings and runtime SIMD dispatch (AVX-512/AVX2/NEON/scalar) means it runs on anything from an x86 server to a Raspberry Pi. The index persists as a single .mvec file; a ChaCha20 rotation seed baked into that file ensures byte-identical results across architectures and builds.
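To illustrate why storing the rotation seed inside the index file helps reproducibility, here is a toy container sketch. The layout (magic bytes, dimension, seed, then packed codes) is entirely hypothetical and is not the .mvec format, which the article does not specify; `random.Random` again stands in for ChaCha20. The point is only that a fixed byte order plus a stored seed lets any reader rebuild the same rotation.

```python
import io
import random
import struct

# Hypothetical header: 4-byte magic, uint32 dimension, uint64 seed.
# "<" fixes little-endian byte order with no padding, independent of the host CPU.
HEADER = struct.Struct("<4sIQ")
MAGIC = b"TOY1"


def signs_from_seed(dim, seed):
    """Rebuild the +1/-1 rotation diagonal from the stored seed."""
    rng = random.Random(seed)
    return [1 if rng.getrandbits(1) else -1 for _ in range(dim)]


def write_index(stream, dim, seed, packed_rows):
    """Write the header followed by each vector's packed 4-bit codes."""
    stream.write(HEADER.pack(MAGIC, dim, seed))
    for row in packed_rows:
        stream.write(row)


def read_index(stream):
    """Read the header, then split the payload into dim/2-byte rows."""
    magic, dim, seed = HEADER.unpack(stream.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a toy index file")
    row_bytes = dim // 2  # two 4-bit codes per byte
    payload = stream.read()
    rows = [payload[i:i + row_bytes] for i in range(0, len(payload), row_bytes)]
    return dim, seed, rows


buffer = io.BytesIO()
write_index(buffer, dim=8, seed=2024, packed_rows=[bytes([0x12, 0x34, 0x56, 0x78])])
image = buffer.getvalue()

buffer.seek(0)
dim, seed, rows = read_index(buffer)
signs = signs_from_seed(dim, seed)
print(dim, seed, rows, signs)
```

Because the seed travels with the data, a reader never has to guess which rotation produced the codes, and rewriting the same contents yields the same bytes.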

Deterministic Indexing Beats Graph Libraries at Their Own Game

MonaVec 4-bit BruteForce leads float32 FAISS-IVF and 8-bit usearch on recall for the AG News benchmark while consuming a fraction of the memory. It trades peak throughput for something those libraries do not offer by default: byte-identical determinism. Parallel-built HNSW graphs and IVF indices can differ from run to run; MonaVec gives you the same answer every time, everywhere.
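For readers unfamiliar with the metric, Recall@10 is the fraction of the true 10 nearest neighbours that the approximate search also returns. The sketch below shows one way to compute it for a single query with a brute-force scan; the coarse rounding is only a crude stand-in for quantization, the data is synthetic, and the helper names are hypothetical. Benchmarks normally average this value over many queries.

```python
import random


def top_k(query, vectors, k):
    """Brute-force top-k by inner product; ties are broken by index for determinism."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    order = sorted(range(len(vectors)), key=lambda i: (-scores[i], i))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Share of the exact top-k that also appears in the approximate top-k."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rng = random.Random(7)
corpus = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
# Crude stand-in for a quantized index: round every component to the nearest 0.5.
coarse = [[round(v * 2.0) / 2.0 for v in vec] for vec in corpus]
query = [rng.gauss(0.0, 1.0) for _ in range(16)]

exact = top_k(query, corpus, 10)
approx = top_k(query, coarse, 10)
recall = recall_at_k(approx, exact)
print(f"Recall@10 for this query: {recall:.2f}")
```

The explicit tie-break in `top_k` is a small example of the kind of choice that keeps a scan reproducible: the same inputs always produce the same ranked list.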

Optional IvfFlat and HNSW backends handle million-vector corpora, but the headline number - 0.960 Recall@10 in 27 MB, no training - is what makes it immediately useful for offline agents and on-device RAG.

One File, One Call: The SQLite of Vector Search

MonaVec targets the deployment profile SQLite owns for relational data: embedded, no server, no training, no network. For an edge AI engineer who needs deterministic retrieval of semantic embeddings on a phone or microcontroller, MonaVec removes the infrastructure tax that made vector search a cloud-only affair.

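Whether you're building an offline coding assistant or a sensor that answers queries without phoning home, MonaVec ships the index as a file and the search as a function call, with a footprint said to fit in the L2 cache of a Cortex-A72. That can only describe the kernel's hot working set: the 27 MB index itself is larger than any L2 configuration of that core.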


Source: MonaVec: A Training-Free Embedded Vector Search Kernel for Edge and Offline AI Systems
Domain: arxiv.org
