Context Aware Query Rewriting for Text Rankers using LLM
- URL: http://arxiv.org/abs/2308.16753v1
- Date: Thu, 31 Aug 2023 14:19:50 GMT
- Title: Context Aware Query Rewriting for Text Rankers using LLM
- Authors: Abhijit Anand, Venktesh V, Vinay Setty, Avishek Anand
- Abstract summary: We analyze the utility of large-language models for improved query rewriting for text ranking tasks.
We adopt a simple, yet surprisingly effective, approach called context aware query rewriting (CAR).
We find that fine-tuning a ranker using re-written queries offers a significant improvement of up to 33% on the passage ranking task and up to 28% on the document ranking task.
- Score: 5.164642900490078
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Query rewriting refers to an established family of approaches that are
applied to underspecified and ambiguous queries to overcome the vocabulary
mismatch problem in document ranking. Queries are typically rewritten at query
processing time to improve query modelling for the downstream ranker.
With the advent of large-language models (LLMs), there have been initial
investigations into using generative approaches to generate pseudo documents to
tackle this inherent vocabulary gap. In this work, we analyze the utility of
LLMs for improved query rewriting for text ranking tasks. We find that there
are two inherent limitations of using LLMs as query re-writers -- concept drift
when using only queries as prompts and large inference costs during query
processing. We adopt a simple, yet surprisingly effective, approach called
context aware query rewriting (CAR) to leverage the benefits of LLMs for query
understanding. Firstly, we rewrite ambiguous training queries by context-aware
prompting of LLMs, where we use only relevant documents as context. Unlike
existing approaches, we use LLM-based query rewriting only during the training
phase. Eventually, a ranker is fine-tuned on the rewritten queries instead of
the original queries during training. In our extensive experiments, we find
that fine-tuning a ranker using re-written queries offers a significant
improvement of up to 33% on the passage ranking task and up to 28% on the
document ranking task when compared to the baseline performance of using
original queries.
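
As a rough illustration of the CAR recipe described above, the sketch below shows context-aware rewriting of training queries followed by offline dataset preparation. The prompt wording, field names, and the `llm_generate` callable are placeholders for illustration, not the authors' implementation.

```python
from typing import Callable, Dict, Iterable, List

def build_car_prompt(query: str, relevant_docs: Iterable[str], max_docs: int = 3) -> str:
    """Context-aware prompt: the ambiguous query plus a few relevant documents."""
    context = "\n\n".join(list(relevant_docs)[:max_docs])
    return (
        "Rewrite the following ambiguous search query so that it is specific and "
        "unambiguous, using the documents below as context.\n\n"
        f"Documents:\n{context}\n\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )

def rewrite_training_queries(
    training_examples: Iterable[Dict],
    llm_generate: Callable[[str], str],  # placeholder for any LLM completion call
) -> List[Dict]:
    """Rewrite queries offline, at training time only; the LLM is not called at query time."""
    rewritten = []
    for ex in training_examples:
        prompt = build_car_prompt(ex["query"], ex["relevant_docs"])
        new_query = llm_generate(prompt).strip()
        rewritten.append({**ex, "query": new_query})
    return rewritten

# The ranker (e.g., a cross-encoder) would then be fine-tuned on `rewritten`
# instead of the original queries; inference still uses the user's original query.
```

The design point emphasized in the abstract is that the LLM cost is paid once, offline, during training-data preparation: grounding the prompt in relevant documents limits concept drift, and no LLM inference is needed at query-processing time.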
Related papers
- HyQE: Ranking Contexts with Hypothetical Query Embeddings [9.23634055123276]
In retrieval-augmented systems, context ranking techniques are commonly employed to reorder the retrieved contexts based on their relevance to a user query.
Large language models (LLMs) have been used for ranking contexts.
We introduce a scalable ranking framework that combines embedding similarity and LLM capabilities without requiring LLM fine-tuning.
arXiv Detail & Related papers (2024-10-20T03:15:01Z)
- CHIQ: Contextual History Enhancement for Improving Query Rewriting in Conversational Search [67.6104548484555]
We introduce CHIQ, a two-step method that leverages the capabilities of open-source large language models (LLMs) to resolve ambiguities in the conversation history before query rewriting.
We demonstrate on five well-established benchmarks that CHIQ leads to state-of-the-art results across most settings.
arXiv Detail & Related papers (2024-06-07T15:23:53Z)
- LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency [65.01402723259098]
We propose a novel query rewrite method named LLM-R2, which adopts a large language model (LLM) to propose possible rewrite rules for a database rewrite system.
Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods.
arXiv Detail & Related papers (2024-04-19T13:17:07Z)
- The Surprising Effectiveness of Rankers Trained on Expanded Queries [4.874071145951159]
We improve the ranking performance of hard queries without compromising the performance of other queries.
We combine relevance scores from the specialized ranker and the base ranker, along with a query performance score estimated for each query.
In our experiments on the DL-Hard dataset, we find that a principled query performance based scoring method offers a significant improvement of up to 25% on the passage ranking task.
arXiv Detail & Related papers (2024-04-03T09:12:22Z)
- Optimizing LLM Queries in Relational Workloads [58.254894049950366]
We show how to optimize Large Language Model (LLM) inference for analytical workloads that invoke LLMs within relational queries.
We implement these optimizations in Apache Spark, with vLLM as the model serving backend.
We achieve up to 4.4x improvement in end-to-end latency on a benchmark of diverse LLM-based queries on real datasets.
arXiv Detail & Related papers (2024-03-09T07:01:44Z)
- Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting [42.35788605017555]
We propose utilizing large language models (LLMs) as query rewriters.
We define four essential properties for well-formed rewrites and incorporate all of them into the instruction.
We introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process.
arXiv Detail & Related papers (2023-10-15T03:04:17Z)
- Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES.
Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query.
By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- Large Language Models are Strong Zero-Shot Retriever [89.16756291653371]
We propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios.
Our method, the large language model as Retriever (LameR), is built upon no other neural models but an LLM.
arXiv Detail & Related papers (2023-04-27T14:45:55Z)