Leveraging semantically similar queries for ranking via combining representations
- URL: http://arxiv.org/abs/2106.12621v1
- Date: Wed, 23 Jun 2021 18:36:20 GMT
- Title: Leveraging semantically similar queries for ranking via combining representations
- Authors: Hayden S. Helm and Marah Abdin and Benjamin D. Pedigo and Shweti Mahajan and Vince Lyzinski and Youngser Park and Amitabh Basu and Piali Choudhury and Christopher M. White and Weiwei Yang and Carey E. Priebe
- Abstract summary: In data-scarce settings, the scarcity of labeled data for a particular query can yield a highly variable and ineffective ranking function.
One way to mitigate the effect of the small amount of data is to leverage information from semantically similar queries.
We describe and explore this phenomenon in the context of the bias-variance trade-off and apply it to the data-scarce settings of a Bing navigational graph and the Drosophila larva connectome.
- Score: 20.79800117378761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern ranking problems, different and disparate representations of the
items to be ranked are often available. It is sensible, then, to try to combine
these representations to improve ranking. Indeed, learning to rank via
combining representations is both principled and practical for learning a
ranking function for a particular query. In extremely data-scarce settings,
however, the amount of labeled data available for a particular query can lead
to a highly variable and ineffective ranking function. One way to mitigate the
effect of the small amount of data is to leverage information from semantically
similar queries. As we demonstrate in simulation settings and on real data, when semantically similar queries are available it is possible to use them gainfully when ranking with respect to a particular query. We describe
and explore this phenomenon in the context of the bias-variance trade-off and
apply it to the data-scarce settings of a Bing navigational graph and the
Drosophila larva connectome.
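A minimal sketch of the bias-variance idea (illustrative names, not the paper's estimator): blend the score function fit on the target query's scarce labels with the average over semantically similar queries; more weight on the neighbors lowers variance at the price of bias.

```python
import numpy as np

def combined_scores(score_by_query, query, similar_queries, alpha=0.5):
    """score_by_query: dict mapping a query id to a length-n array of item
    scores learned from that query's scarce labels (hypothetical setup).
    alpha=0 uses only the target query (low bias, high variance);
    alpha=1 uses only the neighbors (higher bias, lower variance)."""
    own = np.asarray(score_by_query[query])
    neighbors = np.mean([score_by_query[q] for q in similar_queries], axis=0)
    return (1.0 - alpha) * own + alpha * neighbors

# Items are then ranked by descending combined score:
# ranking = np.argsort(-combined_scores(scores, "q0", ["q1", "q2"]))
```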
Related papers
- HyQE: Ranking Contexts with Hypothetical Query Embeddings [9.23634055123276]
In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query.
Large language models (LLMs) have been used for ranking contexts.
We introduce a scalable ranking framework that combines embedding similarity and LLM capabilities without requiring LLM fine-tuning.
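One way to read the framework (a hedged sketch; the names and the max-similarity aggregation are assumptions): have an LLM generate, offline, hypothetical queries each context could answer, embed them once, and rank contexts by their best match to the user query embedding, so no LLM fine-tuning is needed.

```python
import numpy as np

def rank_contexts(query_emb, hypo_embs_per_context):
    """query_emb: (d,) embedding of the user query.
    hypo_embs_per_context: list over retrieved contexts; each entry is an
    (m, d) array of embeddings of LLM-generated hypothetical queries the
    context could answer. Returns indices sorted by best cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    scores = []
    for H in hypo_embs_per_context:
        H = H / np.linalg.norm(H, axis=1, keepdims=True)
        scores.append(float((H @ q).max()))  # best-matching hypothetical query
    return np.argsort(scores)[::-1], scores
```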
arXiv Detail & Related papers (2024-10-20T03:15:01Z) - Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and improve session modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
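A rough sketch of the idea, with simple term-level edits as hypothetical stand-ins for the paper's alteration strategies of varying difficulty:

```python
import random

def augment_current_query(session_history, current_query, rng=random.Random(0)):
    """Build supplemental training pairs by perturbing the current query,
    the most informative part of the session context. The three edits
    below are illustrative stand-ins of increasing difficulty."""
    terms = current_query.split()
    variants = []
    if len(terms) > 1:
        variants.append(" ".join(terms[:-1]))               # drop a term (easy)
        shuffled = terms[:]
        rng.shuffle(shuffled)
        variants.append(" ".join(shuffled))                 # reorder terms (harder)
    variants.append(" ".join(terms + [rng.choice(terms)]))  # inject a noise term
    # Each altered context keeps the original query as its positive label.
    return [(session_history + [v], current_query) for v in variants]
```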
arXiv Detail & Related papers (2024-07-04T08:08:33Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
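A minimal sketch under a hypothetical schema; DAQu's actual handling of relational metadata features is more involved than this flat serialization:

```python
def augment_query(query, metadata_rows):
    """metadata_rows: rows fetched from tables linked to entities in the
    query (hypothetical schema). Serializing them into the query gives
    the retriever a richer representation of the information need."""
    facts = "; ".join(
        f"{key}={value}" for row in metadata_rows for key, value in row.items()
    )
    return f"{query} [metadata: {facts}]" if facts else query

# e.g. augment_query("best sellers by author X",
#                    [{"author": "X", "genre": "sci-fi"}])
```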
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - The Surprising Effectiveness of Rankers Trained on Expanded Queries [4.874071145951159]
We improve the ranking performance of hard queries without compromising performance on other queries.
We combine relevance scores from the specialized ranker and the base ranker, along with a query performance score estimated for each query.
In our experiments on the DL-Hard dataset, we find that a principled query-performance-based scoring method offers a significant improvement of up to 25% on the passage ranking task.
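One plausible instantiation of the described combination (the linear interpolation and the [0, 1] performance score are assumptions, not the paper's exact formula):

```python
def blended_relevance(base_score, specialist_score, query_perf):
    """query_perf in [0, 1]: estimated base-ranker performance on this query,
    e.g. from a query performance predictor. Easy queries keep the base
    ranker's view; hard queries lean on the ranker trained on expanded
    queries."""
    return query_perf * base_score + (1.0 - query_perf) * specialist_score
```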
arXiv Detail & Related papers (2024-04-03T09:12:22Z) - Learning List-Level Domain-Invariant Representations for Ranking [59.3544317373004]
We propose list-level alignment -- learning domain-invariant representations at the higher level of lists.
This yields the first domain adaptation generalization bound for ranking, providing theoretical support for the proposed method.
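A generic sketch of the list-level idea, using mean pooling and a linear mean discrepancy; the paper's architecture and divergence measure may differ:

```python
import numpy as np

def list_representation(item_embs):
    """Pool item embeddings into one vector per ranked list
    (mean pooling is one simple choice)."""
    return np.asarray(item_embs).mean(axis=0)

def domain_gap(source_lists, target_lists):
    """Squared distance between domain means of list-level representations.
    Adding this as a penalty to the ranking loss pushes the model toward
    list-level (rather than item-level) domain invariance."""
    mu_s = np.mean([list_representation(L) for L in source_lists], axis=0)
    mu_t = np.mean([list_representation(L) for L in target_lists], axis=0)
    return float(np.sum((mu_s - mu_t) ** 2))
```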
arXiv Detail & Related papers (2022-12-21T04:49:55Z) - Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries and conduct an empirical analysis focusing on practical aspects of approximate EQI.
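A hedged sketch of the reversed retrieval task (reciprocal rank as the exposure measure and the cutoff are assumptions):

```python
def exposing_queries(doc_id, rankings, cutoff=10):
    """rankings: dict mapping each candidate query to the system's ranked
    list of doc ids. Reversing the usual roles, we rank *queries* by how
    prominently they expose the given document; the cutoff approximates
    'retrievable in practice'."""
    exposure = {
        q: 1.0 / (docs.index(doc_id) + 1)
        for q, docs in rankings.items()
        if doc_id in docs[:cutoff]
    }
    return sorted(exposure.items(), key=lambda kv: kv[1], reverse=True)
```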
arXiv Detail & Related papers (2021-10-14T20:19:27Z) - Connecting Images through Time and Sources: Introducing Low-data,
Heterogeneous Instance Retrieval [3.6526118822907594]
We show that it is not trivial to pick features that respond well to a panel of variations and semantic content.
Introducing a new enhanced version of the Alegoria benchmark, we compare descriptors using the detailed annotations.
arXiv Detail & Related papers (2021-03-19T10:54:51Z) - PiRank: Learning To Rank via Differentiable Sorting [85.28916333414145]
We propose PiRank, a new class of differentiable surrogates for ranking.
We show that PiRank exactly recovers the desired metrics in the limit of zero temperature.
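PiRank builds on continuous relaxations of sorting; below is a NeuralSort-style soft permutation sketch showing the temperature mechanism (a generic relaxation of this kind, not PiRank's full surrogate):

```python
import numpy as np

def soft_permutation(scores, tau=1.0):
    """NeuralSort-style continuous relaxation of the permutation matrix
    that sorts `scores` in descending order. Each row is a softmax, so
    the result is differentiable in `scores`; as tau -> 0 the rows
    approach one-hot vectors, recovering the hard sort and hence the
    exact ranking metric."""
    s = np.asarray(scores, dtype=float)
    n = s.shape[0]
    pairwise = np.abs(s[:, None] - s[None, :]).sum(axis=1)  # (A_s @ 1)_j
    rows = np.arange(1, n + 1)[:, None]                     # 1-based row index i
    logits = ((n + 1 - 2 * rows) * s[None, :] - pairwise[None, :]) / tau
    logits -= logits.max(axis=1, keepdims=True)             # stable softmax
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

# P @ s approximates the sorted scores; plugging P into a metric such as
# NDCG gives a differentiable surrogate to optimize.
```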
arXiv Detail & Related papers (2020-12-12T05:07:36Z) - Surprise: Result List Truncation via Extreme Value Theory [92.5817701697342]
We propose a statistical method that produces interpretable and calibrated relevance scores at query time using nothing more than the ranked scores.
We demonstrate its effectiveness on the result list truncation task across image, text, and IR datasets.
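A generic extreme value theory recipe in the spirit of the abstract (the tail fraction and the GPD survival scoring are assumptions, not necessarily the paper's exact estimator):

```python
import numpy as np
from scipy.stats import genpareto

def tail_probabilities(ranked_scores, tail_frac=0.1):
    """Fit a Generalized Pareto Distribution to the excesses of the top
    scores over a high threshold, then report each score's tail (survival)
    probability. Scores with a tiny tail probability are 'surprising'
    relative to the bulk; truncate the result list where the scores stop
    being surprising."""
    s = np.sort(np.asarray(ranked_scores, dtype=float))[::-1]  # descending
    k = max(int(len(s) * tail_frac), 10)                       # tail sample size
    u = s[k]                                                   # high threshold
    shape, loc, scale = genpareto.fit(s[:k] - u, floc=0.0)
    return genpareto.sf(np.clip(s - u, 0.0, None), shape, loc=loc, scale=scale)
```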
arXiv Detail & Related papers (2020-10-19T19:15:50Z) - Distance-based Positive and Unlabeled Learning for Ranking [13.339237388350043]
Learning to rank is a problem of general interest.
We show that learning to rank via combining representations using an integer linear program is effective when the supervision is as light as "these few items are similar to your item of interest".
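A simplified sketch: the linear objective below is a stand-in for the integer linear program the abstract mentions (this formulation is an assumption, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog

def combination_weights(dist_mats, anchor, seed_items):
    """dist_mats: list of J (n, n) distance matrices, one per representation.
    Learn simplex weights that make the few known-similar `seed_items`
    close to the `anchor` item under the combined distance."""
    costs = np.array([sum(D[anchor, s] for s in seed_items) for D in dist_mats])
    J = len(dist_mats)
    res = linprog(costs, A_eq=np.ones((1, J)), b_eq=[1.0],
                  bounds=[(0.0, 1.0)] * J)
    return res.x  # rank all items by sum_j w_j * D_j[anchor, :]
```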
arXiv Detail & Related papers (2020-05-20T01:53:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.