Understanding Unnatural Questions Improves Reasoning over Text
- URL: http://arxiv.org/abs/2010.09366v1
- Date: Mon, 19 Oct 2020 10:22:16 GMT
- Title: Understanding Unnatural Questions Improves Reasoning over Text
- Authors: Xiao-Yu Guo and Yuan-Fang Li and Gholamreza Haffari
- Abstract summary: Complex question answering (CQA) over raw text is a challenging task.
Learning an effective CQA model requires large amounts of human-annotated data.
We address the challenge of learning a high-quality programmer (parser) by projecting natural human-generated questions into unnatural machine-generated questions.
- Score: 54.235828149899625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex question answering (CQA) over raw text is a challenging task. A
prominent approach to this task is based on the programmer-interpreter
framework, where the programmer maps the question into a sequence of reasoning
actions which is then executed on the raw text by the interpreter. Learning an
effective CQA model requires large amounts of human-annotated data, consisting
of the ground-truth sequence of reasoning actions, which is time-consuming and
expensive to collect at scale. In this paper, we address the challenge of
learning a high-quality programmer (parser) by projecting natural
human-generated questions into unnatural machine-generated questions, which are
more convenient to parse. We first generate synthetic (question, action
sequence) pairs with a data generator and train a semantic parser that
associates synthetic questions with their corresponding action sequences. To
capture the diversity of natural questions, we learn a projection
model that maps each natural question to its most similar unnatural question, for
which the parser works well. Without any natural training data, our
projection model provides high-quality action sequences for the CQA task.
Experimental results show that the QA model trained exclusively with synthetic
data generated by our method outperforms its state-of-the-art counterpart
trained on human-labeled data.
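To make the pipeline concrete, here is a minimal sketch in Python. It is not the authors' code: the template grammar, the action names, and the `project` helper are invented stand-ins, and token-overlap retrieval is used in place of the paper's learned projection model.

```python
import random

# Hypothetical template grammar: each template pairs an "unnatural"
# machine-generated question with its ground-truth action sequence.
TEMPLATES = [
    ("how many yards was the longest {event} ?",
     ["FIND_EVENTS({event})", "MAX(yards)"]),
    ("how many yards was the shortest {event} ?",
     ["FIND_EVENTS({event})", "MIN(yards)"]),
    ("who scored the first {event} ?",
     ["FIND_EVENTS({event})", "FIRST", "ARG(scorer)"]),
]
EVENTS = ["touchdown", "field goal"]

def generate_synthetic_pairs(n, seed=0):
    """Data generator: sample (unnatural question, action sequence) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        template, actions = rng.choice(TEMPLATES)
        event = rng.choice(EVENTS)
        question = template.format(event=event)
        pairs.append((question, [a.format(event=event) for a in actions]))
    return pairs

def similarity(q1, q2):
    """Token-overlap (Jaccard) similarity; the paper learns a projection
    model instead of using a fixed similarity like this."""
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    return len(t1 & t2) / len(t1 | t2)

def project(natural_question, synthetic_pairs):
    """Map a natural question to its most similar unnatural question,
    reusing that question's action sequence."""
    return max(synthetic_pairs, key=lambda p: similarity(natural_question, p[0]))

pairs = generate_synthetic_pairs(50)
unnatural, actions = project("what was the longest touchdown?", pairs)
print(unnatural)  # "how many yards was the longest touchdown ?"
print(actions)    # ["FIND_EVENTS(touchdown)", "MAX(yards)"]
```

In the paper the projection is a trained model rather than nearest-neighbour retrieval, but the payoff is the same: once a natural question is mapped onto a parseable unnatural one, the parser's action sequence can be executed by the interpreter without any human-labeled programs.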
Related papers
- Syn-QA2: Evaluating False Assumptions in Long-tail Questions with Synthetic QA Datasets [7.52684798377727]
We introduce Syn-(QA)$^2$, a set of two synthetically generated question-answering (QA) datasets.
We find that false assumptions in QA are challenging, echoing the findings of prior work.
The detection task is more challenging with long-tail questions than with naturally occurring questions.
arXiv Detail & Related papers (2024-03-18T18:01:26Z) - Event Extraction as Question Generation and Answering [72.04433206754489]
Recent work on Event Extraction has reframed the task as Question Answering (QA)
We propose QGA-EE, which enables a Question Generation (QG) model to generate questions that incorporate rich contextual information instead of using fixed templates.
Experiments show that QGA-EE outperforms all prior single-task-based models on the ACE05 English dataset.
arXiv Detail & Related papers (2023-07-10T01:46:15Z) - Weakly Supervised Visual Question Answer Generation [2.7605547688813172]
We present a weakly supervised method that synthetically generates question-answer pairs procedurally from visual information and captions.
We perform an exhaustive experimental analysis on the VQA dataset and see that our model significantly outperforms SOTA methods on BLEU scores.
arXiv Detail & Related papers (2023-06-11T08:46:42Z) - HPE:Answering Complex Questions over Text by Hybrid Question Parsing and
Execution [92.69684305578957]
We propose a framework of question parsing and execution for textual QA.
The proposed framework can be viewed as top-down question parsing followed by bottom-up answer backtracking.
Our experiments on MuSiQue, 2WikiQA, HotpotQA, and NQ show that the proposed parsing and hybrid execution framework outperforms existing approaches in supervised, few-shot, and zero-shot settings.
arXiv Detail & Related papers (2023-05-12T22:37:06Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational
Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge for modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - Learning to Ask Conversational Questions by Optimizing Levenshtein
Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions (a plain edit-distance computation is sketched after this entry).
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z) - Cooperative Learning of Zero-Shot Machine Reading Comprehension [9.868221447090855]
- Cooperative Learning of Zero-Shot Machine Reading Comprehension [9.868221447090855]
We propose a cooperative, self-play learning model for question generation and answering.
We can train question generation and answering models on any textual corpora without annotation.
Our model outperforms the state-of-the-art pretrained language models on standard question answering benchmarks.
arXiv Detail & Related papers (2021-03-12T18:22:28Z) - Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z) - LogiQA: A Challenge Dataset for Machine Reading Comprehension with
Logical Reasoning [20.81312285957089]
We build a comprehensive dataset, named LogiQA, which is sourced from expert-written questions for testing human logical reasoning.
Results show that state-of-the-art neural models perform far worse than the human ceiling.
Our dataset can also serve as a benchmark for reinvestigating logical AI under the deep learning NLP setting.
arXiv Detail & Related papers (2020-07-16T05:52:16Z) - Simplifying Paragraph-level Question Generation via Transformer Language
Models [0.0]
Question generation (QG) is a natural language generation task where a model is trained to ask questions corresponding to some input text.
A single Transformer-based unidirectional language model leveraging transfer learning can be used to produce high-quality questions.
Our QG model, finetuned from GPT-2 Small, outperforms several paragraph-level QG baselines on the SQuAD dataset by 0.95 METEOR points.
arXiv Detail & Related papers (2020-05-03T14:57:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.