Efficient k-NN Search with Cross-Encoders using Adaptive Multi-Round CUR
Decomposition
- URL: http://arxiv.org/abs/2305.02996v2
- Date: Mon, 23 Oct 2023 17:48:34 GMT
- Authors: Nishant Yadav, Nicholas Monath, Manzil Zaheer, Andrew McCallum
- Abstract summary: Cross-encoder models are prohibitively expensive for direct k-nearest neighbor (k-NN) search.
We propose ADACUR, a method that adaptively, iteratively, and efficiently minimizes the approximation error for the practically important top-k neighbors.
- Score: 77.4863142882136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-encoder models, which jointly encode and score a query-item pair, are
prohibitively expensive for direct k-nearest neighbor (k-NN) search.
Consequently, k-NN search typically employs a fast approximate retrieval (e.g.
using BM25 or dual-encoder vectors), followed by reranking with a
cross-encoder; however, the retrieval approximation often has detrimental
recall regret. This problem is tackled by ANNCUR (Yadav et al., 2022), a recent
work that employs a cross-encoder only, making search efficient using a
relatively small number of anchor items, and a CUR matrix factorization. While
ANNCUR's one-time selection of anchors tends to approximate the cross-encoder
distances on average, doing so forfeits the capacity to accurately estimate
distances to items near the query, leading to regret in the crucial end-task:
recall of top-k items. In this paper, we propose ADACUR, a method that
adaptively, iteratively, and efficiently minimizes the approximation error for
the practically important top-k neighbors. It does so by iteratively performing
k-NN search using the anchors available so far, then adding these retrieved
nearest neighbors to the anchor set for the next round. Empirically, on
multiple datasets, in comparison to previous traditional and state-of-the-art
methods such as ANNCUR and dual-encoder-based retrieve-and-rerank, our proposed
approach ADACUR consistently reduces recall error (by up to 70% on the important
k = 1 setting) while using no more compute than its competitors.
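The multi-round procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `adacur_search`, `score_fn`, and `anchor_scores` are hypothetical names; the offline matrix of cross-encoder scores between a few anchor queries and all items is assumed to be available; the first round is assumed to pick anchor items at random; and the CUR-style estimate is computed with a least-squares solve, which is equivalent to a pseudo-inverse formulation.

```python
import numpy as np


def adacur_search(score_fn, anchor_scores, k, n_rounds, budget, seed=0):
    """Multi-round anchor refinement (illustrative sketch only).

    score_fn(item_ids) -> exact cross-encoder scores of the test query
                          for the given items (hypothetical callable).
    anchor_scores      -> (n_anchor_queries, n_items) matrix of scores
                          precomputed offline (assumed to exist).
    budget             -> total number of exact cross-encoder calls.
    """
    rng = np.random.default_rng(seed)
    n_items = anchor_scores.shape[1]
    per_round = budget // n_rounds

    # Round 1 (assumption): start from randomly chosen anchor items.
    anchors = rng.choice(n_items, size=per_round, replace=False)
    exact = np.asarray(score_fn(anchors), dtype=float)

    for _ in range(n_rounds - 1):
        # Find weights w such that w @ anchor_scores[:, anchors] ~= exact,
        # then extend the estimate to every item.
        w, *_ = np.linalg.lstsq(anchor_scores[:, anchors].T, exact, rcond=None)
        approx = w @ anchor_scores

        # Retrieve the best-looking items that are not anchors yet and
        # add them, with their exact scores, to the anchor set.
        approx[anchors] = -np.inf
        new = np.argsort(-approx)[:per_round]
        anchors = np.concatenate([anchors, new])
        exact = np.concatenate([exact, np.asarray(score_fn(new), dtype=float)])

    # Every anchor has an exact score, so rank anchors by that score.
    top = np.argsort(-exact)[:k]
    return anchors[top], exact[top]


# Synthetic usage: scores are inner products of low-dimensional vectors,
# standing in for a real cross-encoder.
rng = np.random.default_rng(1)
items = rng.normal(size=(200, 4))
anchor_queries = rng.normal(size=(4, 4))
test_query = rng.normal(size=4)
anchor_scores = anchor_queries @ items.T
true_scores = items @ test_query

found_ids, found_scores = adacur_search(
    lambda ids: true_scores[ids], anchor_scores, k=1, n_rounds=2, budget=10
)
```

In this toy setting the score matrix is exactly low rank, so the estimate becomes accurate after one round; real cross-encoder scores are not low rank, which is where spending later rounds on items near the query is expected to matter.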
Related papers
- Early Exit Strategies for Approximate k-NN Search in Dense Retrieval [10.48678957367324]
We build upon the state of the art in early-exit A-kNN and propose an unsupervised method based on the notion of patience.
We show that our techniques improve A-kNN efficiency, with speedups of up to 5x, while incurring negligible effectiveness losses.
arXiv Detail & Related papers (2024-08-09T10:17:07Z)
- Adaptive Retrieval and Scalable Indexing for k-NN Search with Cross-Encoders [77.84801537608651]
Cross-encoder (CE) models, which compute similarity by jointly encoding a query-item pair, perform better than embedding-based models (dual-encoders) at estimating query-item relevance.
We propose a sparse-matrix factorization based method that efficiently computes latent query and item embeddings to approximate CE scores and performs k-NN search with the approximate CE similarity.
arXiv Detail & Related papers (2024-05-06T17:14:34Z)
- Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining [61.09807522366773]
We introduce an algorithm that approximates the softmax with provable bounds and that dynamically maintains the tree.
In our study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining.
arXiv Detail & Related papers (2023-03-27T15:18:32Z)
- A Token-Wise Beam Search Algorithm for RNN-T [3.682821163882332]
We present a decoding beam search algorithm that batches the joint network calls across a segment of time steps.
In addition, aggregating emission probabilities over a segment may be seen as a better approximation to finding the most likely model output.
arXiv Detail & Related papers (2023-02-28T07:20:49Z)
- Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization [60.91600465922932]
We present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder.
Our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods.
arXiv Detail & Related papers (2022-10-23T00:32:04Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
- Improving Novelty Detection using the Reconstructions of Nearest Neighbours [0.0]
We show that using nearest neighbours in the latent space of autoencoders (AE) significantly improves performance of semi-supervised novelty detection.
Our method harnesses a combination of the reconstructions of the nearest neighbours and the latent-neighbour distances of a given input's latent representation.
arXiv Detail & Related papers (2021-11-11T11:09:44Z)
- Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied in economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings because they require either intolerable running time or excessive computational resources.
We propose NeuSearcher, which leverages the knowledge learned from previous instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)