ReadTwice: Reading Very Large Documents with Memories
- URL: http://arxiv.org/abs/2105.04241v2
- Date: Tue, 11 May 2021 23:07:13 GMT
- Title: ReadTwice: Reading Very Large Documents with Memories
- Authors: Yury Zemlyanskiy, Joshua Ainslie, Michiel de Jong, Philip Pham, Ilya
Eckstein, Fei Sha
- Abstract summary: We propose ReadTwice, a technique that combines several strengths of prior approaches to model long-range dependencies with Transformers.
The main idea is to read text in small segments, in parallel, summarizing each segment into a memory table to be used in a second read of the text.
We show that the method outperforms models of comparable size on several question answering (QA) datasets and sets a new state of the art on the challenging NarrativeQA task.
- Score: 19.45538971299312
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge-intensive tasks such as question answering often require
assimilating information from different sections of large inputs such as books
or article collections. We propose ReadTwice, a simple and effective technique
that combines several strengths of prior approaches to model long-range
dependencies with Transformers. The main idea is to read text in small
segments, in parallel, summarizing each segment into a memory table to be used
in a second read of the text. We show that the method outperforms models of
comparable size on several question answering (QA) datasets and sets a new
state of the art on the challenging NarrativeQA task, with questions about
entire books. Source code and pre-trained checkpoints for ReadTwice can be
found at https://goo.gle/research-readtwice.
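The following is a minimal sketch of the two-pass idea described in the abstract, not the released implementation: the layer counts, the mean-pooling used to summarize each segment, and the additive fusion of memory context are assumptions made here for illustration.

```python
import torch
import torch.nn as nn


class ReadTwiceSketch(nn.Module):
    """Toy two-pass reader: the first read summarizes segments into a memory
    table; the second read lets every segment attend to that table."""

    def __init__(self, vocab_size=30522, d_model=256, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.first_read = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), num_layers=2)
        self.second_read = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), num_layers=2)
        # Cross-attention from the tokens of each segment to the shared memory table.
        self.mem_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, segment_ids):
        # segment_ids: (num_segments, segment_len) -- the document pre-split into segments.
        x = self.embed(segment_ids)                              # (S, L, D)
        h1 = self.first_read(x)                                  # first read, segments in parallel
        memory = h1.mean(dim=1)                                  # (S, D): one summary per segment
        table = memory.unsqueeze(0).expand(h1.size(0), -1, -1)   # (S, S, D): shared memory table
        ctx, _ = self.mem_attn(h1, table, table)                 # tokens attend to the memory table
        return self.second_read(h1 + ctx)                        # second read over enriched tokens


# Example: a "document" of 8 segments with 128 tokens each.
doc = torch.randint(0, 30522, (8, 128))
contextualized = ReadTwiceSketch()(doc)                          # (8, 128, 256)
```

In this sketch the memory table is simply the set of per-segment summary vectors; the released model may summarize, index, and fuse memories differently.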
Related papers
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
(arXiv, 2024-05-31)
- NarrativeXL: A Large-scale Dataset For Long-Term Memory Models [0.0]
Using GPT 3.5, we summarized each scene in 1,500 hand-curated fiction books from Project Gutenberg.
With 990,595 total questions, our dataset is an order of magnitude larger than the closest alternatives.
Most questions have a known "retention demand", indicating how long-term a memory is needed to answer them.
(arXiv, 2023-05-23)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
(arXiv, 2022-09-21)
- Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents [13.755637074366813]
Summ^N is a simple, flexible, and effective multi-stage framework for input texts longer than the maximum context length of typical pretrained LMs.
It can process input text of arbitrary length by adjusting the number of stages while keeping the LM context size fixed (a rough sketch of this loop follows the list below).
Our experiments demonstrate that Summ^N significantly outperforms previous state-of-the-art methods.
(arXiv, 2021-10-16)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present Open Table-and-Text Question Answering (OTT-QA), a new large-scale dataset for evaluating performance on this task.
(arXiv, 2020-10-20)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
(arXiv, 2020-09-18)
- Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension [59.80926970481975]
We study machine reading comprehension (MRC) on long texts.
A model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer.
We propose to let a model learn to chunk in a more flexible way via reinforcement learning.
(arXiv, 2020-05-16)
- Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension [127.3341842928421]
Natural Questions is a new and challenging machine reading comprehension benchmark.
It provides answers at two granularities: a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer).
Existing methods treat these two sub-tasks individually during training while ignoring their dependencies.
We present a novel multi-grained machine reading comprehension framework that models documents according to their hierarchical structure.
(arXiv, 2020-05-12)
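As referenced in the Summ^N entry above, here is a rough sketch of a multi-stage, fixed-context summarization loop. It is inferred from the abstract only; the `summarize` callable, the chunking scheme, and the stopping condition are placeholders rather than that paper's actual pipeline.

```python
from typing import Callable, List


def split_into_chunks(tokens: List[str], max_len: int) -> List[List[str]]:
    """Split a token sequence into consecutive chunks that fit a fixed context."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def multi_stage_summarize(tokens: List[str],
                          summarize: Callable[[List[str]], List[str]],
                          max_len: int) -> List[str]:
    """Summarize chunk-by-chunk, repeatedly, until the text fits in one context.

    Each pass over the text is one "stage"; the number of stages grows with the
    input length while the summarizer's context size stays fixed. Assumes the
    summarizer compresses its input, otherwise the loop never terminates.
    """
    while len(tokens) > max_len:
        chunks = split_into_chunks(tokens, max_len)
        tokens = [tok for chunk in chunks for tok in summarize(chunk)]
    return summarize(tokens)
```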
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.