Learning to Recover Reasoning Chains for Multi-Hop Question Answering via Cooperative Games
- URL: http://arxiv.org/abs/2004.02393v1
- Date: Mon, 6 Apr 2020 03:54:38 GMT
- Title: Learning to Recover Reasoning Chains for Multi-Hop Question Answering via Cooperative Games
- Authors: Yufei Feng, Mo Yu, Wenhan Xiong, Xiaoxiao Guo, Junjie Huang, Shiyu Chang, Murray Campbell, Michael Greenspan and Xiaodan Zhu
- Abstract summary: We propose a new problem of learning to recover reasoning chains from weakly supervised signals.
Two cooperating models handle, respectively, how the evidence passages are selected and how the selected passages are connected.
For evaluation, we created benchmarks based on two multi-hop QA datasets.
- Score: 66.98855910291292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose the new problem of learning to recover reasoning chains from
weakly supervised signals, i.e., the question-answer pairs. We propose a
cooperative game approach to deal with this problem, in which how the evidence
passages are selected and how the selected passages are connected are handled
by two models that cooperate to select the most confident chains from a large
set of candidates (from distant supervision). For evaluation, we created
benchmarks based on two multi-hop QA datasets, HotpotQA and MedHop; and
hand-labeled reasoning chains for the latter. The experimental results
demonstrate the effectiveness of our proposed approach.
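The cooperative selection described in the abstract can be caricatured in a few lines. This is only an illustrative sketch, not the paper's method: the two scoring functions below are trivial word-overlap stand-ins for the learned ranker and reasoner models, and all names are hypothetical.

```python
# Toy sketch of cooperative chain selection: a "ranker" scores how relevant
# each passage is to the question, a "reasoner" scores how well consecutive
# passages connect, and the candidate chain with the highest joint
# confidence is selected.

def ranker_score(question, passage):
    # Stand-in relevance model: fraction of question words found in the passage.
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / max(len(q_words), 1)

def reasoner_score(passage_a, passage_b):
    # Stand-in linking model: Jaccard word overlap between consecutive passages.
    a, b = set(passage_a.lower().split()), set(passage_b.lower().split())
    return len(a & b) / max(len(a | b), 1)

def chain_confidence(question, chain):
    relevance = sum(ranker_score(question, p) for p in chain) / len(chain)
    links = [reasoner_score(a, b) for a, b in zip(chain, chain[1:])]
    linkage = sum(links) / len(links) if links else 0.0
    return relevance * linkage  # both models must agree for high confidence

def select_chain(question, candidate_chains):
    # Candidate chains would come from distant supervision in the paper's setup.
    return max(candidate_chains, key=lambda c: chain_confidence(question, c))
```

The multiplicative confidence means a chain of individually relevant but disconnected passages scores low, which is the intuition behind letting the two models cooperate rather than scoring passages in isolation.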
Related papers
- Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models [17.60243337898751]
We present a Chain-of-Action framework for multimodal and retrieval-augmented Question-Answering (QA).
Compared to the literature, CoA overcomes two major challenges of current QA applications: (i) unfaithful hallucination that is inconsistent with real-time or domain facts and (ii) weak reasoning performance over compositional information.
arXiv Detail & Related papers (2024-03-26T03:51:01Z)
- ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting [124.69672273754144]
Chain-of-Thought (CoT) prompting can enhance the reasoning capabilities of large language models (LLMs).
Existing CoT approaches usually focus on simpler reasoning tasks and thus result in low-quality and inconsistent CoT prompts.
We introduce CoTGenius, a novel framework designed for the automatic generation of superior CoT prompts.
arXiv Detail & Related papers (2024-03-21T11:34:26Z)
- Multi-Source Test-Time Adaptation as Dueling Bandits for Extractive Question Answering [25.44581667865143]
We study multi-source test-time model adaptation from user feedback, where K distinct models are established for adaptation.
We discuss two frameworks: multi-armed bandit learning and multi-armed dueling bandits.
Compared to multi-armed bandit learning, the dueling framework allows pairwise collaboration among K models, which is solved by a novel method named Co-UCB proposed in this work.
arXiv Detail & Related papers (2023-06-11T21:18:50Z)
- Reasoning Chain Based Adversarial Attack for Multi-hop Question Answering [0.0]
Previous adversarial attack works usually edit the whole question sentence.
We propose a multi-hop reasoning chain based adversarial attack method.
Results demonstrate significant performance reduction on both answer and supporting facts prediction.
arXiv Detail & Related papers (2021-12-17T18:03:14Z)
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering [55.280108297460636]
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
We propose an optimized training approach, called RocketQA, to improve dense passage retrieval.
We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation.
arXiv Detail & Related papers (2020-10-16T06:54:05Z)
- Counterfactual Variable Control for Robust and Interpretable Question Answering [57.25261576239862]
Deep neural network based question answering (QA) models are neither robust nor explainable in many cases.
In this paper, we inspect such spurious "capability" of QA models using causal inference.
We propose a novel approach called Counterfactual Variable Control (CVC) that explicitly mitigates any shortcut correlation.
arXiv Detail & Related papers (2020-10-12T10:09:05Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
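The iterative refinement idea in the last entry (re-answering harvested questions with the QA model itself) can be caricatured as a single filtering pass. This is a loose sketch only: the `qa_model` interface and the confidence threshold are assumptions for illustration, and RefQA's actual pipeline is more involved.

```python
def refine_qa_pairs(qa_pairs, qa_model, confidence_threshold=0.8):
    """One refinement pass: re-answer each harvested question with the
    current QA model, and replace the harvested answer whenever the model
    is confident in its own prediction."""
    refined = []
    for question, answer in qa_pairs:
        predicted, confidence = qa_model(question)
        if confidence >= confidence_threshold:
            refined.append((question, predicted))  # trust the model's span
        else:
            refined.append((question, answer))     # keep the harvested answer
    return refined
```

Running such a pass repeatedly, retraining the model on the refined pairs in between, gives the iterative data-refinement loop the abstract alludes to.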
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.