RikiNet: Reading Wikipedia Pages for Natural Question Answering
- URL: http://arxiv.org/abs/2004.14560v1
- Date: Thu, 30 Apr 2020 03:29:21 GMT
- Title: RikiNet: Reading Wikipedia Pages for Natural Question Answering
- Authors: Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang,
Jiancheng Lv and Nan Duan
- Abstract summary: We introduce a new model, called RikiNet, which reads Wikipedia pages for natural question answering.
On the Natural Questions dataset, a single RikiNet achieves 74.3 F1 and 57.9 F1 on long-answer and short-answer tasks.
An ensemble RikiNet obtains 76.1 F1 and 61.3 F1 on long-answer and short-answer tasks, achieving the best performance on the official NQ leaderboard.
- Score: 101.505486822236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reading long documents to answer open-domain questions remains challenging in
natural language understanding. In this paper, we introduce a new model, called
RikiNet, which reads Wikipedia pages for natural question answering. RikiNet
contains a dynamic paragraph dual-attention reader and a multi-level cascaded
answer predictor. The reader dynamically represents the document and question
by utilizing a set of complementary attention mechanisms. The representations
are then fed into the predictor to obtain the span of the short answer, the
paragraph of the long answer, and the answer type in a cascaded manner. On the
Natural Questions (NQ) dataset, a single RikiNet achieves 74.3 F1 and 57.9 F1
on long-answer and short-answer tasks. To the best of our knowledge, it is the
first single model that outperforms single human performance. Furthermore, an
ensemble RikiNet obtains 76.1 F1 and 61.3 F1 on long-answer and short-answer
tasks, achieving the best performance on the official NQ leaderboard.
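The abstract describes a cascaded predictor that derives the short-answer span, the long-answer paragraph, and the answer type in sequence. A minimal sketch of that cascaded flow is below; all names, shapes, and the argmax-style decoding are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a cascaded answer predictor in the spirit of RikiNet
# (field names, score shapes, and decoding are assumptions for illustration).
from dataclasses import dataclass

@dataclass
class Prediction:
    short_span: tuple      # (start, end) token indices of the short answer
    long_paragraph: int    # index of the paragraph holding the long answer
    answer_type: str       # e.g. "short", "long", "yes", "no", "null"

def cascaded_predict(token_scores, paragraph_bounds, type_scores):
    """token_scores: per-token (start_logit, end_logit) pairs;
    paragraph_bounds: (start, end) token ranges, one per paragraph;
    type_scores: answer-type -> logit."""
    # Step 1: pick the best short-answer span (end not before start).
    start = max(range(len(token_scores)), key=lambda i: token_scores[i][0])
    end = max(range(start, len(token_scores)), key=lambda i: token_scores[i][1])
    # Step 2: the long answer is the paragraph containing the chosen span.
    long_idx = next(i for i, (s, e) in enumerate(paragraph_bounds)
                    if s <= start and end < e)
    # Step 3: classify the answer type given the earlier decisions.
    answer_type = max(type_scores, key=type_scores.get)
    return Prediction((start, end), long_idx, answer_type)
```

The point of the cascade is that each later decision is conditioned on the earlier ones, rather than predicting span, paragraph, and type independently.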
Related papers
- Question Answering in Natural Language: the Special Case of Temporal Expressions
Our work aims to leverage a popular approach used for general question answering, answer extraction, in order to find answers to temporal questions within a paragraph.
To train our model, we propose a new dataset, inspired by SQuAD, specifically tailored to provide rich temporal information.
Our evaluation shows that a deep learning model trained to perform pattern matching, often used in general question answering, can be adapted to temporal question answering.
arXiv Detail & Related papers (2023-11-23T16:26:24Z)
- Concise Answers to Complex Questions: Summarization of Long-form Answers
We conduct a user study on summarized answers generated from state-of-the-art models and our newly proposed extract-and-decontextualize approach.
We find a large proportion of long-form answers can be adequately summarized by at least one system, while complex and implicit answers are challenging to compress.
We observe that decontextualization improves the quality of the extractive summary, exemplifying its potential in the summarization task.
arXiv Detail & Related papers (2023-05-30T17:59:33Z)
- How Do We Answer Complex Questions: Discourse Structure of Long-form Answers
We study the functional structure of long-form answers collected from three datasets.
Our main goal is to understand how humans organize information to craft complex answers.
Our work can inspire future research on discourse-level modeling and evaluation of long-form QA systems.
arXiv Detail & Related papers (2022-03-21T15:14:10Z)
- MixQG: Neural Question Generation with Mixed Answer Types
We propose a neural question generator, MixQG, to bridge this gap.
We combine 9 question answering datasets with diverse answer types, including yes/no, multiple-choice, extractive, and abstractive answers.
Our model outperforms existing work in both seen and unseen domains.
arXiv Detail & Related papers (2021-10-15T16:03:40Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension
We propose Reflection Net, a model that handles all answer types systematically, leveraging a two-step training procedure to identify no-answer and wrong-answer cases.
Our approach achieved first place on both the long- and short-answer leaderboards, with F1 scores of 77.2 and 64.1, respectively.
arXiv Detail & Related papers (2020-09-25T06:57:52Z)
- ConfNet2Seq: Full Length Answer Generation from Spoken Questions
We propose a novel system to generate full length natural language answers from spoken questions and factoid answers.
The spoken sequence is compactly represented as a confusion network extracted from a pre-trained Automatic Speech Recognizer.
We release a large-scale dataset of 259,788 samples of spoken questions, their factoid answers and corresponding full-length textual answers.
arXiv Detail & Related papers (2020-06-09T10:04:49Z)
- Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension
Natural Questions is a new, challenging machine reading comprehension benchmark.
It provides answers at two granularities: a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).
Existing methods treat these two sub-tasks separately during training, ignoring their dependencies.
We present a novel multi-grained machine reading comprehension framework that focuses on modeling documents at their hierarchical nature.
arXiv Detail & Related papers (2020-05-12T14:20:09Z)
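The two-grained annotation described in the entry above nests the short answer inside the long answer. A minimal sketch of that containment relation (span representation is an illustrative assumption):

```python
# Sketch of Natural Questions' two-grained answers: a long answer is a
# paragraph-level token span, and each short answer is a span nested inside it.
# Representing spans as (start, end) token offsets is an assumption here.

def short_inside_long(long_span, short_spans):
    """Return True if every short-answer span lies within the long-answer span."""
    ls, le = long_span
    return all(ls <= s and e <= le for s, e in short_spans)
```

Modeling this containment jointly, rather than predicting the two granularities independently, is the dependency the entry above says existing methods ignore.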
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.