Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search
- URL: http://arxiv.org/abs/2409.09913v1
- Date: Mon, 16 Sep 2024 01:06:23 GMT
- Title: Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search
- Authors: Jianyang Gao, Yutong Gou, Yuexuan Xu, Yongyi Yang, Cheng Long, Raymond Chi-Wing Wong,
- Abstract summary: We introduce a new quantization method called RaBitQ that inherits the theoretical guarantees of RaBitQ and achieves optimality in terms of the trade-off between space and error bounds.
Our method consistently outperforms the state-of-the-art baselines in both accuracy and efficiency when using the same amount of memory.
- Score: 30.003470912691096
- License:
- Abstract: Approximate nearest neighbor (ANN) query in high-dimensional Euclidean space is a key operator in database systems. For this query, quantization is a popular family of methods developed for compressing vectors and reducing memory consumption. Recently, a method called RaBitQ achieves the state-of-the-art performance among these methods. It produces better empirical performance in both accuracy and efficiency when using the same compression rate and provides rigorous theoretical guarantees. However, the method is only designed for compressing vectors at high compression rates (32x) and lacks support for achieving higher accuracy by using more space. In this paper, we introduce a new quantization method to address this limitation by extending RaBitQ. The new method inherits the theoretical guarantees of RaBitQ and achieves the asymptotic optimality in terms of the trade-off between space and error bounds as to be proven in this study. Additionally, we present efficient implementations of the method, enabling its application to ANN queries to reduce both space and time consumption. Extensive experiments on real-world datasets confirm that our method consistently outperforms the state-of-the-art baselines in both accuracy and efficiency when using the same amount of memory.
Related papers
- Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment [81.84950252537618]
This paper reveals a unified game-theoretic connection between iterative BOND and self-play alignment.
We establish a novel framework, WIN rate Dominance (WIND), with a series of efficient algorithms for regularized win rate dominance optimization.
arXiv Detail & Related papers (2024-10-28T04:47:39Z) - LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search [4.194768796374315]
We propose a new supervised score computation method based on the observation that inner product approximation is a multi-output regression problem.
Our experiments show that the proposed reduced-rank regression (RRR) method is superior to PQ in both query latency and memory usage.
We also introduce LoRANN, a clustering-based ANN library that leverages the proposed score computation method.
arXiv Detail & Related papers (2024-10-24T17:13:39Z) - Sparsity-Constraint Optimization via Splicing Iteration [1.3622424109977902]
We develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEration (SCOPE)
SCOPE converges effectively without tuning parameters.
We apply SCOPE to solve quadratic optimization, learn sparse classifiers, and recover sparse Markov networks for binary variables.
Our open-source Python package skscope based on C++ implementation is publicly available on GitHub.
arXiv Detail & Related papers (2024-06-17T18:34:51Z) - RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search [16.389851096504277]
We propose a new randomized quantization method named RaBitQ, which quantizes $D$-dimensional vectors into $D$-bit strings.
RaBitQ guarantees a sharp theoretical error bound and provides good empirical accuracy at the same time.
In addition, we introduce efficient implementations of RaBitQ, supporting to estimate the distances with bitwise operations or SIMD-based operations.
arXiv Detail & Related papers (2024-05-21T04:55:04Z) - AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs [57.12929098407975]
We show that by efficiently parallelizing existing causal discovery methods, we can scale them to thousands of dimensions.
Specifically, we focus on the causal ordering subprocedure in DirectLiNGAM and implement GPU kernels to accelerate it.
This allows us to apply DirectLiNGAM to causal inference on large-scale gene expression data with genetic interventions yielding competitive results.
arXiv Detail & Related papers (2024-03-06T15:06:11Z) - A Computationally Efficient Sparsified Online Newton Method [48.78646010774149]
Sparsified Online Newton (SONew) is a memory-efficient second-order algorithm that yields a sparsified yet effective preconditioner.
We achieve up to 30% faster convergence, 3.4% relative improvement in validation, and 80% relative improvement in training loss.
arXiv Detail & Related papers (2023-11-16T18:44:22Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Similarity search in the blink of an eye with compressed indices [3.39271933237479]
Graph-based indices are currently the best performing techniques for billion-scale similarity search.
We present new techniques and systems for creating faster and smaller graph-based indices.
arXiv Detail & Related papers (2023-04-07T23:10:39Z) - Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful method to solve large-scale multimedia retrieval issue.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z) - ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the above approximation.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z) - One-Bit Compressed Sensing via One-Shot Hard Thresholding [7.594050968868919]
A problem of 1-bit compressed sensing is to estimate a sparse signal from a few binary measurements.
We present a novel and concise analysis that moves away from the widely used non-constrained notion of width.
arXiv Detail & Related papers (2020-07-07T17:28:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.