Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
- URL: http://arxiv.org/abs/2406.05085v2
- Date: Tue, 19 Nov 2024 08:46:34 GMT
- Title: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
- Authors: Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michał Podstawski, Torsten Hoefler,
- Abstract summary: Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs)
Existing RAG solutions do not focus on queries that may require fetching multiple documents with substantially different contents.
This paper introduces Multi-Head RAG (MRAG), a novel scheme designed to address this gap with a simple yet powerful idea.
- Score: 13.638439488923671
- License:
- Abstract: Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs) by enabling the retrieval of documents into the LLM context to provide more accurate and relevant responses. Existing RAG solutions do not focus on queries that may require fetching multiple documents with substantially different contents. Such queries occur frequently, but are challenging because the embeddings of these documents may be distant in the embedding space, making it hard to retrieve them all. This paper introduces Multi-Head RAG (MRAG), a novel scheme designed to address this gap with a simple yet powerful idea: leveraging activations of Transformer's multi-head attention layer, instead of the decoder layer, as keys for fetching multi-aspect documents. The driving motivation is that different attention heads can learn to capture different data aspects. Harnessing the corresponding activations results in embeddings that represent various facets of data items and queries, improving the retrieval accuracy for complex queries. We provide an evaluation methodology and metrics, multi-aspect datasets that we release online, and real-world use cases to demonstrate MRAG's effectiveness, showing improvements of up to 20% in relevance over standard RAG baselines. MRAG can be seamlessly integrated with existing RAG frameworks and benchmarking tools like RAGAS as well as different classes of data stores.
Related papers
- mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA [78.45521005703958]
multimodal Retrieval-Augmented Generation (mRAG) is naturally introduced to provide MLLMs with comprehensive and up-to-date knowledge.
We propose a novel framework called textbfRetrieval-textbfReftextbfAugmented textbfGeneration (mR$2$AG) which achieves adaptive retrieval and useful information localization.
mR$2$AG significantly outperforms state-of-the-art MLLMs on INFOSEEK and Encyclopedic-VQA
arXiv Detail & Related papers (2024-11-22T16:15:50Z) - DMQR-RAG: Diverse Multi-Query Rewriting for RAG [26.518517678671376]
Large language models often encounter challenges with static knowledge and hallucinations, which undermine their reliability.
We introduce DMQR-RAG, a Diverse Multi-Query Rewriting framework to improve the performance of both document retrieval and final responses in RAG.
arXiv Detail & Related papers (2024-11-20T09:43:30Z) - CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval [103.116634967815]
We introduce CodeXEmbed, a family of large-scale code embedding models ranging from 400M to 7B parameters.
Our novel training pipeline unifies multiple programming languages and transforms various code-related tasks into a common retrieval framework.
Our 7B model sets a new state-of-the-art (SOTA) in code retrieval, outperforming the previous leading model, Voyage-Code, by over 20% on CoIR benchmark.
arXiv Detail & Related papers (2024-11-19T16:54:45Z) - Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [102.31558123570437]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs)
We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z) - VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents [66.42579289213941]
Retrieval-augmented generation (RAG) is an effective technique that enables large language models to utilize external knowledge sources for generation.
In this paper, we introduce VisRAG, which tackles this issue by establishing a vision-language model (VLM)-based RAG pipeline.
In this pipeline, instead of first parsing the document to obtain text, the document is directly embedded using a VLM as an image and then retrieved to enhance the generation of a VLM.
arXiv Detail & Related papers (2024-10-14T15:04:18Z) - BERGEN: A Benchmarking Library for Retrieval-Augmented Generation [26.158785168036662]
Retrieval-Augmented Generation allows to enhance Large Language Models with external knowledge.
Inconsistent benchmarking poses a major challenge in comparing approaches and understanding the impact of each component in the pipeline.
In this work, we study best practices that lay the groundwork for a systematic evaluation of RAG and present BERGEN, an end-to-end library for reproducible research standardizing RAG experiments.
arXiv Detail & Related papers (2024-07-01T09:09:27Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata [1.6574413179773757]
Retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source.
Traditional RAG applications perform poorly in answering multi-hop questions.
We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata.
arXiv Detail & Related papers (2024-06-19T04:53:48Z) - DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering [4.364937306005719]
RAG has recently demonstrated the performance of Large Language Models (LLMs) in the knowledge-intensive tasks such as Question-Answering (QA)
We have found that even though there is low relevance between some critical documents and query, it is possible to retrieve the remaining documents by combining parts of the documents with the query.
A two-stage retrieval framework called Dynamic-Relevant Retrieval-Augmented Generation (DR-RAG) is proposed to improve document retrieval recall and the accuracy of answers.
arXiv Detail & Related papers (2024-06-11T15:15:33Z) - CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources.
This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.