Understanding and Improving Zero-shot Multi-hop Reasoning in Generative Question Answering
- URL: http://arxiv.org/abs/2210.04234v1
- Date: Sun, 9 Oct 2022 11:48:07 GMT
- Title: Understanding and Improving Zero-shot Multi-hop Reasoning in Generative Question Answering
- Authors: Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Abstract summary: We decompose multi-hop questions into multiple corresponding single-hop questions.
We find marked inconsistency in QA models' answers on these pairs of ostensibly identical question chains.
When trained only on single-hop questions, models generalize poorly to multi-hop questions.
- Score: 85.79940770146557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative question answering (QA) models generate answers to questions
either based solely on the model's parameters (the closed-book setting) or by
additionally retrieving relevant evidence (the open-book setting). Generative
QA models can answer some relatively complex questions, but the mechanism
through which they do so is still poorly understood. We perform several studies
aimed at better understanding the multi-hop reasoning capabilities of
generative QA models. First, we decompose multi-hop questions into multiple
corresponding single-hop questions, and find marked inconsistency in QA models'
answers on these pairs of ostensibly identical question chains. Second, we find
that models lack zero-shot multi-hop reasoning ability: when trained only on
single-hop questions, models generalize poorly to multi-hop questions. Finally,
we demonstrate that it is possible to improve models' zero-shot multi-hop
reasoning capacity through two methods that approximate real multi-hop natural
language (NL) questions by training on either concatenation of single-hop
questions or logical forms (SPARQL). In sum, these results demonstrate that
multi-hop reasoning does not emerge naturally in generative QA models, but can
be encouraged by advances in training or modeling techniques.
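To make the first study concrete, the sketch below answers a two-hop question both directly and via its single-hop chain, then checks whether the two routes agree. The example questions, the flan-t5-base checkpoint, and the simple string comparison are illustrative assumptions rather than the paper's actual pipeline.

```python
# Sketch: probe answer consistency between a multi-hop question and its
# single-hop decomposition (closed-book setting). The model choice and the
# example questions are illustrative assumptions, not the paper's setup.
from transformers import pipeline

# Any generative (seq2seq) QA model can stand in here; flan-t5-base is an
# arbitrary public checkpoint used only for illustration.
qa = pipeline("text2text-generation", model="google/flan-t5-base")

def answer(question: str) -> str:
    """Generate a short answer string for a natural-language question."""
    out = qa(question, max_new_tokens=16)
    return out[0]["generated_text"].strip()

# A 2-hop question and its corresponding single-hop chain.
multi_hop = "Who is the spouse of the director of Inception?"
hop1 = "Who is the director of Inception?"
hop2_template = "Who is the spouse of {}?"

# Answer the chain hop by hop, feeding the first answer into the second hop.
bridge = answer(hop1)
chain_answer = answer(hop2_template.format(bridge))

# Answer the multi-hop question directly in one shot.
direct_answer = answer(multi_hop)

# The consistency analysis asks: do these two routes give the same answer?
consistent = chain_answer.lower() == direct_answer.lower()
print(f"bridge={bridge!r}, chain={chain_answer!r}, "
      f"direct={direct_answer!r}, consistent={consistent}")
```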
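The two training-time approximations can likewise be sketched as data construction: a multi-hop training example is rendered either as the concatenation of its single-hop questions or as a SPARQL-style logical form. The separators, the "#1" placeholder, and the Wikidata entity and property IDs below are illustrative assumptions; the paper's actual serialization may differ.

```python
# Sketch: build training inputs that approximate a multi-hop NL question,
# either by concatenating its single-hop questions or by serializing a
# SPARQL-style logical form. Separators, variable names, and the Wikidata
# entity/property IDs below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Hop:
    question: str   # single-hop NL question; "#1" refers to the previous hop's answer
    relation: str   # relation used when rendering the logical form

hops = [
    Hop(question="Who is the director of Inception?", relation="wdt:P57"),
    Hop(question="Who is the spouse of #1?", relation="wdt:P26"),
]

def concat_input(hops: list[Hop]) -> str:
    """Approximation 1: concatenate the single-hop questions in order."""
    return " ".join(h.question for h in hops)

def sparql_input(hops: list[Hop], entity: str = "wd:Q25188") -> str:
    """Approximation 2: render the question chain as a SPARQL-style logical form."""
    triples, subject = [], entity
    for i, h in enumerate(hops):
        obj = f"?x{i}"
        triples.append(f"{subject} {h.relation} {obj} .")
        subject = obj
    return f"SELECT {subject} WHERE {{ " + " ".join(triples) + " }"

print(concat_input(hops))
# Who is the director of Inception? Who is the spouse of #1?
print(sparql_input(hops))
# SELECT ?x1 WHERE { wd:Q25188 wdt:P57 ?x0 . ?x0 wdt:P26 ?x1 . }
```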
Related papers
- Explainable Multi-hop Question Generation: An End-to-End Approach without Intermediate Question Labeling [6.635572580071933]
Multi-hop question generation aims to generate complex questions that require multi-step reasoning over several documents.
Previous studies have predominantly utilized end-to-end models, wherein questions are decoded based on the representation of context documents.
This paper introduces an end-to-end question rewriting model that increases question complexity through sequential rewriting.
arXiv Detail & Related papers (2024-03-31T06:03:54Z)
- Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering [124.16250115608604]
We present Science Question Answering (SQA), a new benchmark that consists of 21k multimodal multiple choice questions with a diverse set of science topics and annotations of their answers with corresponding lectures and explanations.
We show that generating lectures and explanations as a chain of thought (CoT) improves the question answering performance by 1.20% in few-shot GPT-3 and 3.99% in fine-tuned UnifiedQA.
Our analysis further shows that language models, similar to humans, benefit from explanations to learn from fewer data and achieve the same performance with just 40% of the data.
arXiv Detail & Related papers (2022-09-20T07:04:24Z)
- Prompt-based Conservation Learning for Multi-hop Question Answering [11.516763652013005]
Multi-hop question answering requires reasoning over multiple documents to answer a complex question.
Most existing multi-hop QA methods fail to answer a large fraction of sub-questions.
We propose the Prompt-based Conservation Learning framework for multi-hop QA.
arXiv Detail & Related papers (2022-09-14T20:50:46Z)
- Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering [71.49131159045811]
Multi-hop reasoning requires aggregating multiple documents to answer a complex question.
Existing methods usually decompose the multi-hop question into simpler single-hop questions.
We propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation.
arXiv Detail & Related papers (2022-08-22T13:24:25Z)
- Modeling Multi-hop Question Answering as Single Sequence Prediction [88.72621430714985]
We propose a simple generative approach (PathFid) that extends the task beyond just answer generation.
PathFid explicitly models the reasoning process to resolve the answer for multi-hop questions.
Our experiments demonstrate that PathFid leads to strong performance gains on two multi-hop QA datasets.
arXiv Detail & Related papers (2022-05-18T21:57:59Z)
- Generative Context Pair Selection for Multi-hop Question Answering [60.74354009152721]
We propose a generative context selection model for multi-hop question answering.
Our proposed generative passage selection model achieves better performance (4.9% higher than the baseline) on the adversarial held-out set.
arXiv Detail & Related papers (2021-04-18T07:00:48Z)
- Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps [31.472490306390977]
A multi-hop question answering dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question.
Previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop reasoning to answer a question.
We present a new multi-hop QA dataset, called 2WikiMultiHopQA, which uses structured and unstructured data.
arXiv Detail & Related papers (2020-11-02T15:42:40Z)
- Reinforced Multi-task Approach for Multi-hop Question Generation [47.15108724294234]
We address multi-hop question generation, which aims to generate relevant questions based on supporting facts in the context.
We employ multitask learning with the auxiliary task of answer-aware supporting fact prediction to guide the question generator.
We demonstrate the effectiveness of our approach through experiments on the multi-hop question answering dataset, HotPotQA.
arXiv Detail & Related papers (2020-04-05T10:16:59Z)
- Do Multi-Hop Question Answering Systems Know How to Answer the Single-Hop Sub-Questions? [23.991872322492384]
We investigate whether top-performing models for multi-hop questions understand the underlying sub-questions like humans.
We show that multiple state-of-the-art multi-hop QA models fail to correctly answer a large portion of sub-questions.
Our work takes a step forward towards building a more explainable multi-hop QA system.
arXiv Detail & Related papers (2020-02-23T15:16:43Z)