Context Tuning for Retrieval Augmented Generation
- URL: http://arxiv.org/abs/2312.05708v1
- Date: Sat, 9 Dec 2023 23:33:16 GMT
- Title: Context Tuning for Retrieval Augmented Generation
- Authors: Raviteja Anantha, Tharun Bethi, Danil Vodianik, Srinivas Chappidi
- Abstract summary: We propose Context Tuning for RAG, which employs a smart context retrieval system to fetch relevant information.
Our empirical results demonstrate that context tuning significantly enhances semantic search.
- We also show that our proposed lightweight model using Reciprocal Rank Fusion (RRF) with LambdaMART outperforms GPT-4 based retrieval.
- Score: 1.201626478128059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have the remarkable ability to solve new tasks
with just a few examples, but they need access to the right tools. Retrieval
Augmented Generation (RAG) addresses this problem by retrieving a list of
relevant tools for a given task. However, RAG's tool retrieval step requires
all the necessary information to be explicitly present in the query. This is a
limitation, as semantic search, the widely adopted tool retrieval method, can
fail when the query is incomplete or lacks context. To address this limitation,
we propose Context Tuning for RAG, which employs a smart context retrieval
system to fetch relevant information that improves both tool retrieval and plan
generation. Our lightweight context retrieval model uses numerical,
categorical, and habitual usage signals to retrieve and rank context items. Our
empirical results demonstrate that context tuning significantly enhances
semantic search, achieving a 3.5-fold and 1.5-fold improvement in Recall@K for
context retrieval and tool retrieval tasks respectively, and resulting in an
11.6% increase in LLM-based planner accuracy. Additionally, we show that our
proposed lightweight model using Reciprocal Rank Fusion (RRF) with LambdaMART
outperforms GPT-4 based retrieval. Moreover, we observe that context augmentation at
plan generation, even after tool retrieval, reduces hallucination.
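As a rough, illustrative sketch (not the authors' implementation), the snippet below shows how Reciprocal Rank Fusion can combine candidate lists from two rankers, e.g. a semantic-search ranker and a learned LambdaMART-style ranker, and how Recall@K can be computed on the fused ranking. The tool names, ranked lists, and the RRF constant k=60 are assumptions made for demonstration.

```python
# Illustrative sketch of Reciprocal Rank Fusion (RRF) and Recall@K.
# Not the paper's implementation; ranked lists and k=60 are assumed.
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked candidate lists (each ordered best-first).

    An item's RRF score is the sum over lists of 1 / (k + rank), with
    1-based ranks; items absent from a list contribute nothing for it.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top-k retrieved items."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


if __name__ == "__main__":
    # Hypothetical outputs of a semantic-search ranker and a learned
    # (LambdaMART-style) ranker over candidate tools.
    semantic = ["calendar", "email", "weather", "maps"]
    learned = ["email", "calendar", "notes", "weather"]

    fused = reciprocal_rank_fusion([semantic, learned])
    print("fused ranking:", fused)
    print("Recall@3:", recall_at_k(fused, relevant=["email", "notes"], k=3))
```

RRF is convenient for this kind of fusion because it uses only ranks rather than raw scores, so heterogeneous rankers can be combined without score calibration.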
Related papers
- Sufficient Context: A New Lens on Retrieval Augmented Generation Systems [19.238772793096473]
Augmenting LLMs with context leads to improved performance across many applications.
We develop a new notion of sufficient context, along with a way to classify instances that have enough information to answer the query.
We find that proprietary LLMs excel at answering queries when the context is sufficient, but often output incorrect answers instead of abstaining when the context is not.
arXiv Detail & Related papers (2024-11-09T02:13:14Z)
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- Data-Efficient Massive Tool Retrieval: A Reinforcement Learning Approach for Query-Tool Alignment with Language Models [28.67532617021655]
Large language models (LLMs) integrated with external tools and APIs have successfully addressed complex tasks by using in-context learning or fine-tuning.
Despite this progress, the vast scale of tool retrieval remains challenging due to stringent input length constraints.
We propose a pre-retrieval strategy from an extensive repository, effectively framing the problem as the massive tool retrieval (MTR) task.
arXiv Detail & Related papers (2024-10-04T07:58:05Z)
- Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval [47.81307125613145]
Re-Invoke is an unsupervised tool retrieval method designed to scale effectively to large toolsets without training.
We employ a novel multi-view similarity ranking strategy based on intents to pinpoint the most relevant tools for each query.
Our evaluation demonstrates that Re-Invoke significantly outperforms state-of-the-art alternatives in both single-tool and multi-tool scenarios.
arXiv Detail & Related papers (2024-08-03T22:49:27Z)
- Optimizing Query Generation for Enhanced Document Retrieval in RAG [53.10369742545479]
Large Language Models (LLMs) excel in various language tasks but they often generate incorrect information.
Retrieval-Augmented Generation (RAG) aims to mitigate this by using document retrieval for accurate responses.
arXiv Detail & Related papers (2024-07-17T05:50:32Z)
- Optimization of Retrieval-Augmented Generation Context with Outlier Detection [0.0]
We focus on methods to reduce the size and improve the quality of the prompt context required for question-answering systems.
Our goal is to select the most semantically relevant documents, treating the discarded ones as outliers.
It was found that the greatest improvements were achieved with increasing complexity of the questions and answers.
arXiv Detail & Related papers (2024-07-01T15:53:29Z)
- RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation [42.82192656794179]
Large Language Models (LLMs) exhibit remarkable capabilities but are prone to generating inaccurate or hallucinatory responses.
This limitation stems from their reliance on vast pretraining datasets, making them susceptible to errors in unseen scenarios.
Retrieval-Augmented Generation (RAG) addresses this by incorporating external, relevant documents into the response generation process.
arXiv Detail & Related papers (2024-03-31T08:58:54Z)
- Corrective Retrieval Augmented Generation [36.04062963574603]
Retrieval-augmented generation (RAG) relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong.
We propose Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation.
CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches.
arXiv Detail & Related papers (2024-01-29T04:36:39Z)
- Dense X Retrieval: What Retrieval Granularity Should We Use? [56.90827473115201]
An often-overlooked design choice is the retrieval unit in which the corpus is indexed, e.g., document, passage, or sentence.
We introduce a novel retrieval unit, proposition, for dense retrieval.
Experiments reveal that indexing a corpus by fine-grained units such as propositions significantly outperforms passage-level units in retrieval tasks.
arXiv Detail & Related papers (2023-12-11T18:57:35Z)
- Generation-Augmented Retrieval for Open-domain Question Answering [134.27768711201202]
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions.
We show that generating diverse contexts for a query is beneficial as fusing their results consistently yields better retrieval accuracy.
GAR achieves state-of-the-art performance on Natural Questions and TriviaQA datasets under the extractive QA setup when equipped with an extractive reader.
arXiv Detail & Related papers (2020-09-17T23:08:01Z)
- Query Understanding via Intent Description Generation [75.64800976586771]
We propose a novel Query-to-Intent-Description (Q2ID) task for query understanding.
Unlike existing ranking tasks which leverage the query and its description to compute the relevance of documents, Q2ID is a reverse task which aims to generate a natural language intent description.
We demonstrate the effectiveness of our model by comparing with several state-of-the-art generation models on the Q2ID task.
arXiv Detail & Related papers (2020-08-25T08:56:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.