MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries
- URL: http://arxiv.org/abs/2401.15391v1
- Date: Sat, 27 Jan 2024 11:41:48 GMT
- Title: MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries
- Authors: Yixuan Tang and Yi Yang
- Abstract summary: Retrieval-augmented generation (RAG) augments large language models (LLMs) by retrieving relevant knowledge.
Existing RAG systems are inadequate in answering multi-hop queries, which require retrieving and reasoning over multiple pieces of supporting evidence.
We develop a novel dataset, MultiHop-RAG, which consists of a knowledge base, a large collection of multi-hop queries, their ground-truth answers, and the associated supporting evidence.
- Score: 22.4349439498591
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Retrieval-augmented generation (RAG) augments large language models (LLMs) by
retrieving relevant knowledge, showing promising potential in mitigating LLM
hallucinations and enhancing response quality, thereby facilitating greater
adoption of LLMs in practice. However, we find that existing RAG systems are
inadequate in answering multi-hop queries, which require retrieving and
reasoning over multiple pieces of supporting evidence. Furthermore, to our
knowledge, no existing RAG benchmarking dataset focuses on multi-hop queries.
In this paper, we develop a novel dataset, MultiHop-RAG, which consists of a
knowledge base, a large collection of multi-hop queries, their ground-truth
answers, and the associated supporting evidence. We detail the procedure of
building the dataset, utilizing an English news article dataset as the
underlying RAG knowledge base. We demonstrate the benchmarking utility of
MultiHop-RAG in two experiments. The first experiment compares different
embedding models for retrieving evidence for multi-hop queries. In the second
experiment, we examine the capabilities of various state-of-the-art LLMs,
including GPT-4, PaLM, and Llama2-70B, in reasoning and answering multi-hop
queries given the evidence. Both experiments reveal that existing RAG methods
perform unsatisfactorily in retrieving and answering multi-hop queries. We hope
MultiHop-RAG will be a valuable resource for the community in developing
effective RAG systems, thereby facilitating greater adoption of LLMs in
practice. The MultiHop-RAG dataset and the implemented RAG system are publicly
available at https://github.com/yixuantt/MultiHop-RAG/.
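The two experiments described above (comparing embedding models for evidence retrieval, then testing whether LLMs can reason over that evidence) map onto a short evaluation loop. The Python below is a minimal sketch, assuming hypothetical file names (corpus.json, MultiHopRAG.json), field names ("body", "query", "answer", "evidence_list", "fact"), a plain cosine-similarity retriever, and a substring-match recall metric; it is not the implementation in the linked repository.

```python
# Sketch of a MultiHop-RAG-style evaluation: dense retrieval over a news
# knowledge base, then LLM answering given the retrieved evidence.
# File and field names below are assumptions for illustration only.
import json
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model under comparison

with open("corpus.json") as f:          # hypothetical news-article knowledge base
    corpus = json.load(f)
with open("MultiHopRAG.json") as f:     # hypothetical multi-hop queries + answers + evidence
    queries = json.load(f)

chunk_texts = [doc["body"] for doc in corpus]
chunk_embs = embedder.encode(chunk_texts, normalize_embeddings=True)

def retrieve(query: str, k: int = 10) -> list[str]:
    """Dense retrieval: rank corpus chunks by cosine similarity to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_embs @ q
    return [chunk_texts[i] for i in np.argsort(-scores)[:k]]

# Experiment 1 (retrieval): how often does the top-k set cover the gold evidence?
hits = total = 0
for item in queries:
    retrieved = retrieve(item["query"])
    for ev in item["evidence_list"]:
        total += 1
        hits += any(ev["fact"] in chunk for chunk in retrieved)
print(f"evidence recall@10 ~ {hits / total:.3f}")

# Experiment 2 (reasoning): can an LLM answer the multi-hop query from evidence?
def answer_with_llm(query: str, evidence: list[str], llm) -> str:
    prompt = ("Answer the question using only the evidence below.\n\n"
              + "\n".join(f"- {e}" for e in evidence)
              + f"\n\nQuestion: {query}\nAnswer:")
    return llm(prompt)  # e.g. GPT-4, PaLM, or Llama2-70B behind a common interface
```

Swapping `embedder` for other embedding models mirrors the first experiment; passing gold versus retrieved evidence to `answer_with_llm` separates retrieval errors from reasoning errors, in the spirit of the second.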
Related papers
- Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [102.31558123570437]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs).
We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z)
- BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data [61.936320820180875]
Large language models (LLMs) have become increasingly pivotal across various domains.
BabelBench is an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution.
Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement.
arXiv Detail & Related papers (2024-10-01T15:11:24Z)
- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering.
Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z)
- Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata [1.6574413179773757]
Retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source.
Traditional RAG applications perform poorly in answering multi-hop questions.
We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata; a minimal sketch of this filtering idea appears after the list below.
arXiv Detail & Related papers (2024-06-19T04:53:48Z)
- Multi-Head RAG: Solving Multi-Aspect Problems with LLMs [13.638439488923671]
Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs).
Existing RAG solutions do not focus on queries that may require fetching multiple documents with substantially different contents.
This paper introduces Multi-Head RAG (MRAG), a novel scheme designed to address this gap with a simple yet powerful idea.
arXiv Detail & Related papers (2024-06-07T16:59:38Z)
- Generative Multi-Modal Knowledge Retrieval with Large Language Models [75.70313858231833]
We propose an innovative end-to-end generative framework for multi-modal knowledge retrieval.
Our framework takes advantage of the fact that large language models (LLMs) can effectively serve as virtual knowledge bases.
We demonstrate significant improvements ranging from 3.0% to 14.6% across all evaluation metrics when compared to strong baselines.
arXiv Detail & Related papers (2024-01-16T08:44:29Z)
- Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models [79.32652077838046]
We introduce Parrot, a solution aiming to enhance multi-turn instruction following for large language models (LLMs).
First, we introduce an efficient but effective method for collecting multi-turn instructions that feature human-like queries, such as anaphora and ellipsis.
Second, we propose a context-aware preference optimization strategy to further enhance LLMs for complex queries in multi-turn interaction.
arXiv Detail & Related papers (2023-10-11T08:36:43Z)
- Enhancing Multi-modal and Multi-hop Question Answering via Structured Knowledge and Unified Retrieval-Generation [33.56304858796142]
Multi-modal multi-hop question answering involves answering a question by reasoning over multiple input sources from different modalities.
Existing methods often retrieve evidence separately and then use a language model to generate an answer based on the retrieved evidence.
We propose a Structured Knowledge and Unified Retrieval-Generation (SKURG) approach to address these issues.
arXiv Detail & Related papers (2022-12-16T18:12:04Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
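The Multi-Meta-RAG entry above mentions filtering the vector database on LLM-extracted metadata before similarity search. The sketch below illustrates only that general idea; the Chunk fields ("source", "published_at") and the exact-match filtering rule are assumptions, not the paper's pipeline.

```python
# Illustrative metadata-filtered retrieval: keep only chunks whose metadata
# matches filter values an LLM extracted from the query, then rank the
# survivors by embedding similarity. Field names are hypothetical.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Chunk:
    text: str
    source: str             # e.g. news outlet name
    published_at: str       # e.g. "2023-10"
    embedding: List[float]  # precomputed dense vector

def filtered_retrieve(query_emb: List[float], metadata: Dict[str, str],
                      chunks: List[Chunk], k: int = 5) -> List[Chunk]:
    """`metadata` would come from an LLM prompted to extract filter values
    from the query, e.g. {"source": "TechCrunch", "published_at": "2023-10"}."""
    candidates = [c for c in chunks
                  if all(getattr(c, key) == value for key, value in metadata.items())]
    # Rank the metadata-matching chunks by dot-product similarity to the query.
    candidates.sort(key=lambda c: sum(a * b for a, b in zip(c.embedding, query_emb)),
                    reverse=True)
    return candidates[:k]
```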
This list is automatically generated from the titles and abstracts of the papers on this site.