Context Aware Query Rewriting for Text Rankers using LLM
- URL: http://arxiv.org/abs/2308.16753v1
- Date: Thu, 31 Aug 2023 14:19:50 GMT
- Title: Context Aware Query Rewriting for Text Rankers using LLM
- Authors: Abhijit Anand, Venktesh V, Vinay Setty, Avishek Anand
- Abstract summary: We analyze the utility of large-language models for improved query rewriting for text ranking tasks.
We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR).
We find that fine-tuning a ranker using re-written queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task.
- Score: 5.164642900490078
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Query rewriting refers to an established family of approaches that are
applied to underspecified and ambiguous queries to overcome the vocabulary
mismatch problem in document ranking. Queries are typically rewritten at query
processing time to improve query modelling for the downstream ranker.
With the advent of large-language models (LLMs), there have been initial
investigations into using generative approaches to generate pseudo documents to
tackle this inherent vocabulary gap. In this work, we analyze the utility of
LLMs for improved query rewriting for text ranking tasks. We find that there
are two inherent limitations of using LLMs as query re-writers -- concept drift
when using only queries as prompts and large inference costs during query
processing. We adopt a simple, yet surprisingly effective, approach called
context aware query rewriting (CAR) to leverage the benefits of LLMs for query
understanding. Firstly, we rewrite ambiguous training queries by context-aware
prompting of LLMs, where we use only relevant documents as context. Unlike
existing approaches, we use LLM-based query rewriting only during the training
phase. Eventually, a ranker is fine-tuned on the rewritten queries instead of
the original queries during training. In our extensive experiments, we find
that fine-tuning a ranker using re-written queries offers a significant
improvement of up to 33% on the passage ranking task and up to 28% on the
document ranking task when compared to the baseline performance of using
original queries.
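
As a rough illustration of the CAR recipe described above, the sketch below shows context-aware rewriting of training queries followed by offline dataset preparation. The prompt wording, field names, and the `llm_generate` callable are placeholders for illustration, not the authors' implementation.

```python
from typing import Callable, Dict, Iterable, List

def build_car_prompt(query: str, relevant_docs: Iterable[str], max_docs: int = 3) -> str:
    """Context-aware prompt: the ambiguous query plus a few relevant documents."""
    context = "\n\n".join(list(relevant_docs)[:max_docs])
    return (
        "Rewrite the following ambiguous search query so that it is specific and "
        "unambiguous, using the documents below as context.\n\n"
        f"Documents:\n{context}\n\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )

def rewrite_training_queries(
    training_examples: Iterable[Dict],
    llm_generate: Callable[[str], str],  # placeholder for any LLM completion call
) -> List[Dict]:
    """Rewrite queries offline, at training time only; the LLM is not called at query time."""
    rewritten = []
    for ex in training_examples:
        prompt = build_car_prompt(ex["query"], ex["relevant_docs"])
        new_query = llm_generate(prompt).strip()
        rewritten.append({**ex, "query": new_query})
    return rewritten

# The ranker (e.g., a cross-encoder) would then be fine-tuned on `rewritten`
# instead of the original queries; inference still uses the user's original query.
```

The design point emphasized in the abstract is that the LLM cost is paid once, offline, during training-data preparation: grounding the prompt in relevant documents limits concept drift, and no LLM inference is needed at query-processing time.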
Related papers
- HyQE: Ranking Contexts with Hypothetical Query Embeddings [9.23634055123276]
In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query.
Large language models (LLMs) have been used for ranking contexts.
We introduce a scalable ranking framework that combines embedding similarity and LLM capabilities without requiring LLM fine-tuning.
arXiv Detail & Related papers (2024-10-20T03:15:01Z)
- CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search [67.6104548484555]
We introduce CHIQ, a two-step method that leverages the capabilities of open-source large language models (LLMs) to resolve ambiguities in the conversation history before query rewriting.
We demonstrate on five well-established benchmarks that CHIQ leads to state-of-the-art results across most settings.
arXiv Detail & Related papers (2024-06-07T15:23:53Z)
- LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency [65.01402723259098]
We propose a novel query rewrite method named LLM-R2, which adopts a large language model (LLM) to propose possible rewrite rules for a database rewrite system.
Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods.
arXiv Detail & Related papers (2024-04-19T13:17:07Z)
- The Surprising Effectiveness of Rankers Trained on Expanded Queries [4.874071145951159]
We improve the ranking performance of hard queries without compromising the performance of other queries.
We combine relevance scores from the specialized ranker and the base ranker, along with a query performance score estimated for each query.
In our experiments on the DL-Hard dataset, we find that a principled query performance based scoring method offers a significant improvement of up to 25% on the passage ranking task.
arXiv Detail & Related papers (2024-04-03T09:12:22Z)
- Optimizing LLM Queries in Relational Workloads [58.254894049950366]
We show how to optimize Large Language Model (LLM) inference for analytical workloads that invoke LLMs within relational queries.
We implement these optimizations in Apache Spark, with vLLM as the model serving backend.
We achieve up to 4.4x improvement in end-to-end latency on a benchmark of diverse LLM-based queries on real datasets.
arXiv Detail & Related papers (2024-03-09T07:01:44Z)
- Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting [42.35788605017555]
We propose utilizing large language models (LLMs) as query rewriters.
We define four essential properties for well-formed rewrites and incorporate all of them into the instruction.
We introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process.
arXiv Detail & Related papers (2023-10-15T03:04:17Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- Large Language Models are Strong Zero-Shot Retriever [89.16756291653371]
We propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.
Our method, the large language model as Retriever (LameR), is built upon no other neural models but an LLM.
arXiv Detail & Related papers (2023-04-27T14:45:55Z)