Are We There Yet? A Decision Framework for Replacing Term Based
Retrieval with Dense Retrieval Systems
- URL: http://arxiv.org/abs/2206.12993v1
- Date: Sun, 26 Jun 2022 23:16:05 GMT
- Authors: Sebastian Hofstätter, Nick Craswell, Bhaskar Mitra, Hamed Zamani,
Allan Hanbury
- Abstract summary: Several dense retrieval (DR) models have demonstrated performance competitive with term-based retrieval.
DR projects queries and documents into a dense vector space and retrieves results via (approximate) nearest neighbor search.
It is impossible to predict whether DR will become ubiquitous, but if it does, one route is through repeated applications of decision processes such as the one proposed in this paper.
- Score: 35.77217529138364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, several dense retrieval (DR) models have demonstrated performance
competitive with the term-based retrieval methods that are ubiquitous in search systems. In
contrast to term-based matching, DR projects queries and documents into a dense
vector space and retrieves results via (approximate) nearest neighbor search.
Deploying a new system, such as DR, inevitably involves tradeoffs in aspects of
its performance. Established retrieval systems running at scale are usually
well understood in terms of effectiveness and costs, such as query latency,
indexing throughput, or storage requirements. In this work, we propose a
framework with a set of criteria that go beyond simple effectiveness measures
to thoroughly compare two retrieval systems with the explicit goal of assessing
the readiness of one system to replace the other. This includes careful
tradeoff considerations between effectiveness and various cost factors.
Furthermore, we describe guardrail criteria, since even a system that is better
on average may have systematic failures on a minority of queries. The
guardrails check for failures on certain query characteristics and novel
failure types that are only possible in dense retrieval systems. We demonstrate
our decision framework on a Web ranking scenario. In that scenario,
state-of-the-art DR models have surprisingly strong results, not only on
average performance but also in passing an extensive set of guardrail tests, showing
robustness on different query characteristics, lexical matching,
generalization, and number of regressions. It is impossible to predict whether
DR will become ubiquitous in the future, but one way this is possible is
through repeated applications of decision processes such as the one presented
here.
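
The retrieval step described in the abstract (projecting queries and documents into a dense vector space and retrieving by nearest neighbor search) can be illustrated with a minimal sketch. The embeddings below are hypothetical toy vectors, the function name `top_k` is an assumption made for illustration, and the search is exact (brute force) with dot-product scoring; a production system would typically obtain vectors from a trained encoder and use an approximate nearest neighbor index instead.

```python
import numpy as np


def top_k(query_vec, doc_matrix, k=2):
    """Return the indices of the k highest-scoring documents and all scores.

    Scoring is a plain dot product between the query vector and each
    document vector (exact search, no approximation).
    """
    scores = doc_matrix @ query_vec               # one score per document
    order = np.argsort(-scores, kind="stable")    # indices by descending score
    return order[:k].tolist(), scores


# Hypothetical 3-dimensional embeddings, one row per document.
doc_matrix = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
])

# Hypothetical query embedding in the same vector space.
query_vec = np.array([0.9, 0.1, 0.0])

ranked, scores = top_k(query_vec, doc_matrix, k=2)
print(ranked)  # document indices, best match first
```

Brute-force scoring touches every document per query, which is why query latency and index storage appear among the cost factors the framework weighs against effectiveness.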
Related papers
- pEBR: A Probabilistic Approach to Embedding Based Retrieval [4.8338111302871525]
Embedding retrieval aims to learn a shared semantic representation space for both queries and items.
In current industrial practice, retrieval systems typically retrieve a fixed number of items for different queries.
arXiv Detail & Related papers (2024-10-25T07:14:12Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
- Unified Active Retrieval for Retrieval Augmented Generation [69.63003043712696]
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful and applying it to every instruction is sub-optimal.
Existing active retrieval methods face two challenges:
1. They usually rely on a single criterion, which struggles with handling various types of instructions.
2. They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated.
arXiv Detail & Related papers (2024-06-18T12:09:02Z)
- tieval: An Evaluation Framework for Temporal Information Extraction Systems [2.3035364984111495]
Temporal information extraction has attracted a great deal of interest over the last two decades.
Having access to a large volume of corpora makes it difficult to benchmark TIE systems.
tieval is a Python library that provides a concise interface for importing different corpora and facilitates system evaluation.
arXiv Detail & Related papers (2023-01-11T18:55:22Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
- Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose a novel algorithm called SAROS that takes into account both kinds of feedback for learning over the sequence of interactions.
The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for fault detection in power grids.
arXiv Detail & Related papers (2022-05-13T21:09:41Z)
- What are the best systems? New perspectives on NLP Benchmarking [10.27421161397197]
We propose a new procedure to rank systems based on their performance across different tasks.
Motivated by social choice theory, the final system ordering is obtained by aggregating the rankings induced by each task.
We show that our method yields different conclusions on state-of-the-art systems than the mean-aggregation procedure.
arXiv Detail & Related papers (2022-02-08T11:44:20Z)
- An Investigation of Replay-based Approaches for Continual Learning [79.0660895390689]
Continual learning (CL) is a major challenge of machine learning (ML) and describes the ability to learn several tasks sequentially without catastrophic forgetting (CF).
Several solution classes have been proposed, of which so-called replay-based approaches seem very promising due to their simplicity and robustness.
We empirically investigate replay-based approaches of continual learning and assess their potential for applications.
arXiv Detail & Related papers (2021-08-15T15:05:02Z)
- A Convolutional Baseline for Person Re-Identification Using Vision and Language Descriptions [24.794592610444514]
In real-world surveillance scenarios, frequently no visual information will be available about the queried person.
A two-stream deep convolutional neural network framework supervised by cross-entropy loss is presented.
The learnt visual representations are more robust and perform 22% better during retrieval as compared to a single modality system.
arXiv Detail & Related papers (2020-02-20T10:12:02Z)