LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search
- URL: http://arxiv.org/abs/2503.23409v1
- Date: Sun, 30 Mar 2025 12:03:57 GMT
- Title: LIRA: A Learning-based Query-aware Partition Framework for Large-scale ANN Search
- Authors: Ximu Zeng, Liwei Deng, Penghao Chen, Xu Chen, Han Su, Kai Zheng
- Abstract summary: In the query phase, a common strategy is to probe partitions based on the distance ranks of a query to the partition centroids. In the partition construction phase, all partition-based methods face the boundary problem, which separates a query's nearest neighbors into multiple partitions. We propose LIRA, a LearnIng-based queRy-aware pArtition framework.
- Score: 14.354312183838642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approximate nearest neighbor search is fundamental in information retrieval. Previous partition-based methods enhance search efficiency by probing only a subset of partitions, yet they face two common issues. In the query phase, a common strategy is to probe partitions based on the distance ranks of a query to the partition centroids, which inevitably probes irrelevant partitions because it ignores the data distribution. In the partition construction phase, all partition-based methods face the boundary problem, which separates a query's nearest neighbors into multiple partitions, resulting in a long-tailed kNN distribution and degrading the optimal nprobe (i.e., the number of partitions to probe). To address this gap, we propose LIRA, a LearnIng-based queRy-aware pArtition framework. Specifically, we propose a probing model that directly probes the partitions containing the kNN of a query, which reduces probing waste and allows query-aware probing with a per-query nprobe. Moreover, we incorporate the probing model into a learning-based redundancy strategy to mitigate the adverse impact of the long-tailed kNN distribution on search efficiency. Extensive experiments on real-world vector datasets demonstrate the superiority of LIRA in the trade-off among accuracy, latency, and query fan-out. The code is available at https://github.com/SimoneZeng/LIRA-ANN-search.
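To make the contrast in the abstract concrete, the sketch below places the conventional centroid-rank probing that the paper argues against next to a query-aware alternative in which a learned model decides, per query, which partitions to probe. This is a minimal illustration only: the toy k-means partitioner, the `probing_model` callable, and the probability threshold are assumptions made for exposition, not LIRA's actual components.

```python
import numpy as np

def build_partitions(data, n_parts=16, n_iters=20, seed=0):
    """Toy k-means partitioning used only to set the scene (not LIRA's partitioner)."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), n_parts, replace=False)].copy()
    for _ in range(n_iters):
        # assign each vector to its nearest centroid, then recompute centroids
        assign = np.argmin(((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        for c in range(n_parts):
            if (assign == c).any():
                centroids[c] = data[assign == c].mean(axis=0)
    return centroids, assign

def probe_by_centroid_rank(query, centroids, nprobe=4):
    """Baseline strategy criticized in the abstract: probe the nprobe partitions
    whose centroids are closest to the query, regardless of where the kNN live."""
    dists = ((centroids - query) ** 2).sum(-1)
    return np.argsort(dists)[:nprobe]

def probe_query_aware(query, probing_model, threshold=0.5):
    """Query-aware probing in the spirit of LIRA: a learned model scores every
    partition and only partitions predicted to contain the query's kNN are probed,
    so the effective nprobe varies per query. `probing_model` is a hypothetical
    callable returning one probability per partition."""
    scores = probing_model(query)              # shape: (n_parts,)
    return np.nonzero(scores >= threshold)[0]
```

Under this reading, the fixed-nprobe baseline wastes probes whenever a query's neighbors are concentrated in fewer partitions than nprobe, while the thresholded model can expand or shrink the probed set per query; the paper's redundancy strategy then targets the boundary vectors responsible for the long-tailed kNN distribution.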
Related papers
- A Query-Driven Approach to Space-Efficient Range Searching [12.760453906939446]
We show that a near-linear sample of queries allows the construction of a partition tree with a near-optimal expected number of nodes visited during querying. We enhance this approach by treating node processing as a classification problem, leveraging fast classifiers such as shallow neural networks to obtain experimentally efficient query times. Our algorithm, based on a sample of queries, builds a balanced tree with nodes associated with separators that minimize query stabs in expectation.
arXiv Detail & Related papers (2025-02-19T12:01:00Z) - pEBR: A Probabilistic Approach to Embedding Based Retrieval [4.8338111302871525]
Embedding retrieval aims to learn a shared semantic representation space for both queries and items.
In current industrial practice, retrieval systems typically retrieve a fixed number of items for different queries.
arXiv Detail & Related papers (2024-10-25T07:14:12Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We present an algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain an approach that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation [27.94898516315886]
We search for an optimal network structure that can run in real time for this problem.
A novel Solution Space Regularization (SSR) loss is first proposed to effectively encourage the supernet to converge to its discrete one.
A new Hierarchical and Progressive Solution Space Shrinking method is presented to further achieve high efficiency of searching.
arXiv Detail & Related papers (2022-08-10T11:07:33Z) - Unsupervised Space Partitioning for Nearest Neighbor Search [6.516813715425121]
We propose an end-to-end learning framework that couples the partitioning and learning-to-search steps using a custom loss function.
A key advantage of our proposed solution is that it does not require any expensive pre-processing of the dataset.
We show that our method beats the state-of-the-art space partitioning method and the ubiquitous K-means clustering method.
arXiv Detail & Related papers (2022-06-16T11:17:03Z) - Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching-based splitting criterion significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - BR-NS: an Archive-less Approach to Novelty Search [70.13948372218849]
We discuss an alternative approach to novelty estimation, dubbed Behavior Recognition based Novelty Search (BR-NS).
BR-NS does not require an archive, makes no assumption on the metrics that can be defined in the behavior space and does not rely on nearest neighbours search.
We conduct experiments to gain insight into its feasibility and dynamics as well as potential advantages over archive-based NS in terms of time complexity.
arXiv Detail & Related papers (2021-04-08T17:31:34Z) - Partition-Guided GANs [63.980473635585234]
We design a partitioner that breaks the space into smaller regions, each having a simpler distribution, and train a different generator for each partition.
This is done in an unsupervised manner without requiring any labels.
Experimental results on various standard benchmarks show that the proposed unsupervised model outperforms several recent methods.
arXiv Detail & Related papers (2021-04-02T00:06:53Z) - IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Methods have to trade off between obtaining high accuracy and maintaining load balance and scalability in distributed settings.
We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data.
We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z) - Best-First Beam Search [78.71330480725668]
We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance.
arXiv Detail & Related papers (2020-07-08T05:56:01Z)