Related papers: RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search

RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search

URL: http://arxiv.org/abs/2405.12497v1
Date: Tue, 21 May 2024 04:55:04 GMT
Title: RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search
Authors: Jianyang Gao, Cheng Long,
Abstract summary: We propose a new randomized quantization method named RaBitQ, which quantizes $D$-dimensional vectors into $D$-bit strings. RaBitQ guarantees a sharp theoretical error bound and provides good empirical accuracy at the same time. In addition, we introduce efficient implementations of RaBitQ, supporting to estimate the distances with bitwise operations or SIMD-based operations.
Score: 16.389851096504277
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Searching for approximate nearest neighbors (ANN) in the high-dimensional Euclidean space is a pivotal problem. Recently, with the help of fast SIMD-based implementations, Product Quantization (PQ) and its variants can often efficiently and accurately estimate the distances between the vectors and have achieved great success in the in-memory ANN search. Despite their empirical success, we note that these methods do not have a theoretical error bound and are observed to fail disastrously on some real-world datasets. Motivated by this, we propose a new randomized quantization method named RaBitQ, which quantizes $D$-dimensional vectors into $D$-bit strings. RaBitQ guarantees a sharp theoretical error bound and provides good empirical accuracy at the same time. In addition, we introduce efficient implementations of RaBitQ, supporting to estimate the distances with bitwise operations or SIMD-based operations. Extensive experiments on real-world datasets confirm that (1) our method outperforms PQ and its variants in terms of accuracy-efficiency trade-off by a clear margin and (2) its empirical performance is well-aligned with our theoretical analysis.

Related papers

Asymptotically Optimal Linear Best Feasible Arm Identification with Fixed Budget [55.938644481736446]
We introduce a novel algorithm for best feasible arm identification that guarantees an exponential decay in the error probability.<n>We validate our algorithm through comprehensive empirical evaluations across various problem instances with different levels of complexity.
arXiv Detail & Related papers (2025-06-03T02:56:26Z)
Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling.<n>We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
RaanA: A Fast, Flexible, and Data-Efficient Post-Training Quantization Algorithm [13.768298349218927]
Post-training Quantization (PTQ) has become a widely used technique for improving inference efficiency of large language models (LLMs) Existing PTQ methods generally suffer from crucial limitations such as heavy calibration data requirements and inflexible choice of target number of bits. We propose RaanA, a unified PTQ framework introducing two novel components: 1) RaBitQ-H, a variant of a randomized vector quantization method RaBitQ, designed for fast, accurate, and highly efficient quantization; and 2) AllocateBits, an algorithm that optimally allocates bit-widths across layers based on their quantization
arXiv Detail & Related papers (2025-03-29T05:03:12Z)
HAVER: Instance-Dependent Error Bounds for Maximum Mean Estimation and Applications to Q-Learning [11.026588768210601]
We study the problem of estimating the emphvalue of the largest mean among $K$ distributions via samples from them. We propose a novel algorithm called HAVER and analyze its mean squared error.
arXiv Detail & Related papers (2024-11-01T07:05:11Z)
Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search [30.003470912691096]
We introduce a new quantization method called RaBitQ that inherits the theoretical guarantees of RaBitQ and achieves optimality in terms of the trade-off between space and error bounds. Our method consistently outperforms the state-of-the-art baselines in both accuracy and efficiency when using the same amount of memory.
arXiv Detail & Related papers (2024-09-16T01:06:23Z)
A Specialized Semismooth Newton Method for Kernel-Based Optimal Transport [92.96250725599958]
Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples. We show that our SSN method achieves a global convergence rate of $O (1/sqrtk)$, and a local quadratic convergence rate under standard regularity conditions.
arXiv Detail & Related papers (2023-10-21T18:48:45Z)
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated non functional constraints and use a large number of data points. Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
Quantum Sparse Coding [5.130440339897477]
We develop a quantum-inspired algorithm for sparse coding. The emergence of quantum computers and Ising machines can potentially lead to more accurate estimations. We conduct numerical experiments with simulated data on Lightr's quantum-inspired digital platform.
arXiv Detail & Related papers (2022-09-08T13:00:30Z)
DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs [54.08445874064361]
We propose to solve a regularized distributionally robust learning problem in the decentralized setting. By adding a Kullback-Liebler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust problem. We show that our proposed algorithm can improve the worst distribution test accuracy by up to $10%$.
arXiv Detail & Related papers (2022-08-29T18:01:42Z)
Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF. We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation. Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
One-Bit Compressed Sensing via One-Shot Hard Thresholding [7.594050968868919]
A problem of 1-bit compressed sensing is to estimate a sparse signal from a few binary measurements. We present a novel and concise analysis that moves away from the widely used non-constrained notion of width.
arXiv Detail & Related papers (2020-07-07T17:28:03Z)
Quasi-Newton Solver for Robust Non-Rigid Registration [35.66014845211251]
We propose a formulation for robust non-rigid registration based on a globally smooth robust estimator for data fitting and regularization. We apply the majorization-minimization algorithm to the problem, which reduces each iteration to solving a simple least-squares problem with L-BFGS.
arXiv Detail & Related papers (2020-04-09T01:45:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.