Surprise: Result List Truncation via Extreme Value Theory
- URL: http://arxiv.org/abs/2010.09797v1
- Date: Mon, 19 Oct 2020 19:15:50 GMT
- Title: Surprise: Result List Truncation via Extreme Value Theory
- Authors: Dara Bahri, Che Zheng, Yi Tay, Donald Metzler, Andrew Tomkins
- Abstract summary: We propose a statistical method that produces interpretable and calibrated relevance scores at query time using nothing more than the ranked scores.
We demonstrate its effectiveness on the result list truncation task across image, text, and IR datasets.
- Score: 92.5817701697342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Work in information retrieval has largely been centered around ranking and
relevance: given a query, return some number of results ordered by relevance to
the user. The problem of result list truncation, or where to truncate the
ranked list of results, however, has received less attention despite being
crucial in a variety of applications. Such truncation is a balancing act
between the overall relevance, or usefulness, of the results and the user cost
of processing more results. Result list truncation can be challenging because
relevance scores are often not well-calibrated. This is particularly true in
large-scale IR systems where documents and queries are embedded in the same
metric space and a query's nearest document neighbors are returned during
inference. Here, relevance is inversely proportional to the distance between
the query and candidate document, but what distance constitutes relevance
varies from query to query and changes dynamically as more documents are added
to the index. In this work, we propose Surprise scoring, a statistical method
that leverages the Generalized Pareto distribution that arises in extreme value
theory to produce interpretable and calibrated relevance scores at query time
using nothing more than the ranked scores. We demonstrate its effectiveness on
the result list truncation task across image, text, and IR datasets and compare
it to both classical and recent baselines. We draw connections to hypothesis
testing and $p$-values.
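The abstract describes the method only at a high level. As a rough illustration of how such a peaks-over-threshold scheme could look in practice, the sketch below fits a Generalized Pareto distribution to the tail of the ranked scores and converts each score into an exceedance probability; the function names, the threshold rule, and the truncation cutoff are all hypothetical choices for illustration, not taken from the paper.

```python
# A minimal sketch of a peaks-over-threshold "surprise" score, assuming the
# scipy.stats.genpareto API. The threshold rule (tail_frac), the GPD fit, and
# the cutoff (alpha) are illustrative assumptions, not the paper's procedure.
import numpy as np
from scipy.stats import genpareto

def surprise_scores(scores, tail_frac=0.2):
    """Map raw retrieval scores (higher = more similar) to p-value-like
    exceedance probabilities under a Generalized Pareto tail model."""
    scores = np.asarray(scores, dtype=float)
    # Set the threshold u at the (1 - tail_frac) quantile of the score list;
    # scores above u are treated as the extreme tail.
    u = np.quantile(scores, 1.0 - tail_frac)
    tail = scores > u
    # Fit GPD shape and scale to the excesses over u (location pinned at 0).
    # Assumes enough tail points for a stable maximum-likelihood fit.
    shape, _, scale = genpareto.fit(scores[tail] - u, floc=0.0)
    # Surprise of score s: P(S > s) ~= tail_frac * GPD_sf(s - u). Small values
    # mean the score is extreme relative to the bulk, i.e. likely relevant.
    p = np.ones_like(scores)
    p[tail] = tail_frac * genpareto.sf(scores[tail] - u, shape,
                                       loc=0.0, scale=scale)
    return p

def truncate_length(scores, alpha=0.05):
    """Truncate a descending-sorted result list where surprise exceeds alpha."""
    return int((surprise_scores(scores) < alpha).sum())
```

Read this way, truncation is a sequence of hypothesis tests: a result is kept while its surprise score, interpreted like a $p$-value, stays below the significance level $\alpha$, which mirrors the connection to hypothesis testing and $p$-values drawn in the abstract.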
Related papers
- pEBR: A Probabilistic Approach to Embedding Based Retrieval [4.8338111302871525]
Embedding retrieval aims to learn a shared semantic representation space for both queries and items.
In current industrial practice, retrieval systems typically retrieve a fixed number of items for different queries.
arXiv Detail & Related papers (2024-10-25T07:14:12Z)
- Relevance Filtering for Embedding-based Retrieval [46.851594313019895]
In embedding-based retrieval, Approximate Nearest Neighbor (ANN) search enables efficient retrieval of similar items from large-scale datasets, but it offers no guarantee that the retrieved items are relevant to the query.
This paper introduces a novel relevance filtering component (called "Cosine Adapter") for embedding-based retrieval to address this challenge.
We are able to significantly increase the precision of the retrieved set, at the expense of a small loss of recall.
arXiv Detail & Related papers (2024-08-09T06:21:20Z)
- Optimization of Retrieval-Augmented Generation Context with Outlier Detection [0.0]
We focus on methods to reduce the size and improve the quality of the prompt context required for question-answering systems.
Our goal is to select the most semantically relevant documents, treating the discarded ones as outliers.
The greatest improvements were observed as the complexity of the questions and answers increased.
arXiv Detail & Related papers (2024-07-01T15:53:29Z)
- List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation [80.12531449946655]
We propose a Reranking-Truncation joint model (GenRT) that can perform the two tasks concurrently.
GenRT integrates reranking and truncation via a generative paradigm based on an encoder-decoder architecture.
Our method achieves SOTA performance on both reranking and truncation tasks for web search and retrieval-augmented LLMs.
arXiv Detail & Related papers (2024-02-05T06:52:53Z)
- Integrating Rankings into Quantized Scores in Peer Review [61.27794774537103]
In peer review, reviewers are usually asked to provide scores for the papers.
To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed.
There is no standard procedure for using this ranking information, and Area Chairs may use it in different ways.
We take a principled approach to integrate the ranking information into the scores.
arXiv Detail & Related papers (2022-04-05T19:39:13Z)
- Online Learning of Optimally Diverse Rankings [63.62764375279861]
We propose an algorithm that efficiently learns the optimal list based on users' feedback only.
We show that after $T$ queries, the regret of LDR scales as $O((N-L)\log(T))$, where $N$ is the total number of items.
arXiv Detail & Related papers (2021-09-13T12:13:20Z)
- Leveraging semantically similar queries for ranking via combining representations [20.79800117378761]
In data-scarce settings, the limited amount of labeled data available for a particular query can lead to a highly variable and ineffective ranking function.
One way to mitigate the effect of the small amount of data is to leverage information from semantically similar queries.
We describe and explore this phenomenon in the context of the bias-variance trade-off and apply it to the data-scarce settings of a Bing navigational graph and the Drosophila larva connectome.
arXiv Detail & Related papers (2021-06-23T18:36:20Z)
- Choppy: Cut Transformer For Ranked List Truncation [92.58177016973421]
Choppy is an assumption-free model based on the widely successful Transformer architecture.
We show Choppy improves upon recent state-of-the-art methods.
arXiv Detail & Related papers (2020-04-26T00:52:49Z)