Enhancing Multi-modal and Multi-hop Question Answering via Structured
Knowledge and Unified Retrieval-Generation
- URL: http://arxiv.org/abs/2212.08632v2
- Date: Mon, 7 Aug 2023 03:02:06 GMT
- Title: Enhancing Multi-modal and Multi-hop Question Answering via Structured
Knowledge and Unified Retrieval-Generation
- Authors: Qian Yang, Qian Chen, Wen Wang, Baotian Hu, Min Zhang
- Abstract summary: Multi-modal multi-hop question answering involves answering a question by reasoning over multiple input sources from different modalities.
Existing methods often retrieve evidence separately and then use a language model to generate an answer based on the retrieved evidence.
We propose a Structured Knowledge and Unified Retrieval-Generation (SKURG) approach to address these issues.
- Score: 33.56304858796142
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-modal multi-hop question answering involves answering a question by
reasoning over multiple input sources from different modalities. Existing
methods often retrieve evidence separately and then use a language model to
generate an answer based on the retrieved evidence, and thus do not adequately
connect candidates and are unable to model the interdependent relations during
retrieval. Moreover, the pipelined approaches of retrieval and generation might
result in poor generation performance when retrieval performance is low. To
address these issues, we propose a Structured Knowledge and Unified
Retrieval-Generation (SKURG) approach. SKURG employs an Entity-centered Fusion
Encoder to align sources from different modalities using shared entities. It
then uses a unified Retrieval-Generation Decoder to integrate intermediate
retrieval results for answer generation and also adaptively determine the
number of retrieval steps. Extensive experiments on two representative
multi-modal multi-hop QA datasets MultimodalQA and WebQA demonstrate that SKURG
outperforms the state-of-the-art models in both source retrieval and answer
generation performance with fewer parameters. Our code is available at
https://github.com/HITsz-TMG/SKURG.
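The abstract describes retrieval that links sources through shared entities and decides adaptively how many retrieval steps to take. The following is a minimal toy sketch of that idea only, not the SKURG implementation: the `Source` structure, the `retrieve_adaptively` function, the overlap-count scoring, and the example data are all hypothetical, and the learned encoder and decoder of the paper are replaced here by simple set operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A candidate evidence source (hypothetical structure, not from the paper)."""
    source_id: str
    modality: str        # e.g. "text", "image", "table"
    entities: frozenset  # entity mentions assumed to be extracted beforehand


def retrieve_adaptively(question_entities, sources, max_steps=5):
    """Pick sources one at a time until no candidate shares a known entity.

    Each step scores the remaining candidates by entity overlap with the
    question plus everything retrieved so far, so earlier retrieval results
    influence later ones. The number of steps is not fixed in advance.
    """
    known = set(question_entities)
    retrieved = []
    remaining = list(sources)
    for _ in range(max_steps):
        best = max(remaining, key=lambda s: len(s.entities & known), default=None)
        # Stop signal: nothing left, or no candidate connects to known entities.
        if best is None or not (best.entities & known):
            break
        retrieved.append(best)
        remaining.remove(best)
        known |= best.entities  # entities from this hop guide the next hop
    return retrieved


# Toy two-hop example: the image is only reachable through the text source.
sources = [
    Source("d1", "text", frozenset({"Big Ben"})),
    Source("t1", "text", frozenset({"Eiffel Tower", "Gustave Eiffel"})),
    Source("i1", "image", frozenset({"Gustave Eiffel"})),
]
picked = retrieve_adaptively({"Eiffel Tower"}, sources)
print([s.source_id for s in picked])  # ['t1', 'i1']
```

In this toy run the loop stops after two steps because the remaining distractor shares no entity with the question or the retrieved sources. In a unified retrieval-generation setup, answer generation would then condition on the retrieved sources rather than run as a separate pipeline stage.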
Related papers
- Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach [6.549143816134531]
We propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer.
Experimental results on the multi-hop question-answering HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-07-18T02:19:00Z)
- From RAG to RICHES: Retrieval Interlaced with Sequence Generation [3.859418700143553]
We present RICHES, a novel approach that interleaves retrieval with sequence generation tasks.
It retrieves documents by directly decoding their contents, constrained on the corpus.
We demonstrate the strong performance of RICHES across ODQA tasks including attributed and multi-hop QA.
arXiv Detail & Related papers (2024-06-29T08:16:58Z)
- RAG-Fusion: a New Take on Retrieval-Augmented Generation [0.0]
Infineon has identified a need for engineers, account managers, and customers to rapidly obtain product information.
This research marks significant progress in artificial intelligence (AI) and natural language processing (NLP) applications.
arXiv Detail & Related papers (2024-01-31T22:06:07Z)
- UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models [22.457013726785295]
We present UniGen, a Unified Generative framework for retrieval and question answering.
UniGen integrates both tasks into a single generative model leveraging the capabilities of large language models.
arXiv Detail & Related papers (2023-12-18T09:13:41Z)
- End-to-end Knowledge Retrieval with Multi-modal Queries [50.01264794081951]
ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries.
We introduce a retriever model, ReViz, that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion.
We demonstrate superior performance in retrieval on two datasets under zero-shot settings.
arXiv Detail & Related papers (2023-06-01T08:04:12Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- MetaQA: Combining Expert Agents for Multi-Skill Question Answering [49.35261724460689]
We argue that despite the promising results of multi-dataset models, some domains or QA formats might require specific architectures.
We propose to combine expert agents with a novel, flexible, and training-efficient architecture that considers questions, answer predictions, and answer-prediction confidence scores.
arXiv Detail & Related papers (2021-12-03T14:05:52Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.