S$^3$HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering
- URL: http://arxiv.org/abs/2305.11725v2
- Date: Tue, 25 Jun 2024 09:53:44 GMT
- Title: S$^3$HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering
- Authors: Fangyu Lei, Xiang Li, Yifan Wei, Shizhu He, Yiming Huang, Jun Zhao, Kang Liu,
- Abstract summary: Existing models mainly adopt a retriever-reader framework, which have several deficiencies.
We propose a three-stage TextTableQA framework S3HQA, which comprises of retriever, selector, and reasoner.
When trained on the full dataset, our approach outperforms all baseline methods, ranking first on the HybridQA leaderboard.
- Score: 27.66777544627217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Answering multi-hop questions over hybrid factual knowledge from the given text and table (TextTableQA) is a challenging task. Existing models mainly adopt a retriever-reader framework, which have several deficiencies, such as noisy labeling in training retriever, insufficient utilization of heterogeneous information over text and table, and deficient ability for different reasoning operations. In this paper, we propose a three-stage TextTableQA framework S3HQA, which comprises of retriever, selector, and reasoner. We use a retriever with refinement training to solve the noisy labeling problem. Then, a hybrid selector considers the linked relationships between heterogeneous data to select the most relevant factual knowledge. For the final stage, instead of adapting a reading comprehension module like in previous methods, we employ a generation-based reasoner to obtain answers. This includes two approaches: a row-wise generator and an LLM prompting generator~(first time used in this task). The experimental results demonstrate that our method achieves competitive results in the few-shot setting. When trained on the full dataset, our approach outperforms all baseline methods, ranking first on the HybridQA leaderboard.
Related papers
- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering.
Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z) - TTQA-RS- A break-down prompting approach for Multi-hop Table-Text Question Answering with Reasoning and Summarization [3.531533402602335]
Multi-hop table-text QA requires multiple hops between the table and text.
Our model uses an enhanced retriever for table-text information retrieval.
Our experiments demonstrate the potential of prompt-based approaches.
arXiv Detail & Related papers (2024-06-20T20:55:38Z) - Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Improving Question Generation with Multi-level Content Planning [70.37285816596527]
This paper addresses the problem of generating questions from a given context and an answer, specifically focusing on questions that require multi-hop reasoning across an extended context.
We propose MultiFactor, a novel QG framework based on multi-level content planning. Specifically, MultiFactor includes two components: FA-model, which simultaneously selects key phrases and generates full answers, and Q-model which takes the generated full answer as an additional input to generate questions.
arXiv Detail & Related papers (2023-10-20T13:57:01Z) - Phrase Retrieval for Open-Domain Conversational Question Answering with
Conversational Dependency Modeling via Contrastive Learning [54.55643652781891]
Open-Domain Conversational Question Answering (ODConvQA) aims at answering questions through a multi-turn conversation.
We propose a method to directly predict answers with a phrase retrieval scheme for a sequence of words.
arXiv Detail & Related papers (2023-06-07T09:46:38Z) - LIQUID: A Framework for List Question Answering Dataset Generation [17.86721740779611]
We propose LIQUID, an automated framework for generating list QA datasets from unlabeled corpora.
We first convert a passage from Wikipedia or PubMed into a summary and extract named entities from the summarized text as candidate answers.
We then create questions using an off-the-shelf question generator with the extracted entities and original passage.
Using our synthetic data, we significantly improve the performance of the previous best list QA models by exact-match F1 scores of 5.0 on MultiSpanQA, 1.9 on Quoref, and 2.8 averaged across three BioASQ benchmarks.
arXiv Detail & Related papers (2023-02-03T12:42:45Z) - UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question
Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph(KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z) - Weakly Supervised Pre-Training for Multi-Hop Retriever [23.79574380039197]
We propose a new method for weakly supervised multi-hop retriever pre-training without human efforts.
Our method includes 1) a pre-training task for generating vector representations of complex questions, 2) a scalable data generation method that produces the nested structure of question and sub-question as weak supervision for pre-training, and 3) a pre-training model structure based on dense encoders.
arXiv Detail & Related papers (2021-06-18T08:06:02Z) - FeTaQA: Free-form Table Question Answering [33.018256483762386]
We introduce FeTaQA, a new dataset with 10K Wikipedia-based table, question, free-form answer, supporting table cells pairs.
FeTaQA yields a more challenging table question answering setting because it requires generating free-form text answers after retrieval, inference, and integration of multiple discontinuous facts from a structured knowledge source.
arXiv Detail & Related papers (2021-04-01T09:59:40Z) - Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.