Related papers: Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension

URL: http://arxiv.org/abs/2005.08056v2
Date: Tue, 19 May 2020 14:29:08 GMT
Title: Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
Authors: Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu
Abstract summary: We study machine reading comprehension (MRC) on long texts. A model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer. We propose to let a model learn to chunk in a more flexible way via reinforcement learning.
Score: 59.80926970481975
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we study machine reading comprehension (MRC) on long texts, where a model takes as inputs a lengthy document and a question and then extracts a text span from the document as an answer. State-of-the-art models tend to use a pretrained transformer model (e.g., BERT) to encode the joint contextual information of document and question. However, these transformer-based models can only take a fixed-length (e.g., 512) text as its input. To deal with even longer text inputs, previous approaches usually chunk them into equally-spaced segments and predict answers based on each segment independently without considering the information from other segments. As a result, they may form segments that fail to cover the correct answer span or retain insufficient contexts around it, which significantly degrades the performance. Moreover, they are less capable of answering questions that need cross-segment information. We propose to let a model learn to chunk in a more flexible way via reinforcement learning: a model can decide the next segment that it wants to process in either direction. We also employ recurrent mechanisms to enable information to flow across segments. Experiments on three MRC datasets -- CoQA, QuAC, and TriviaQA -- demonstrate the effectiveness of our proposed recurrent chunking mechanisms: we can obtain segments that are more likely to contain complete answers and at the same time provide sufficient contexts around the ground truth answers for better predictions.

Related papers

SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension [77.93156509994994]
We show how to represent short chunks in a way that is conditioned on a broader context window to enhance retrieval performance.<n>Existing embedding models are not well-equipped to encode such situated context effectively.<n>Our method substantially outperforms state-of-the-art embedding models.
arXiv Detail & Related papers (2025-08-03T23:59:31Z)
LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models [73.13933847198395]
We propose a training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding. The proposed LLM$times$MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output.
arXiv Detail & Related papers (2024-10-12T03:13:44Z)
A Novel LLM-based Two-stage Summarization Approach for Long Dialogues [9.835499880812646]
This study proposes a hierarchical framework that segments and condenses information from long documents. The condensation stage utilizes an unsupervised generation model to generate condensed data. The summarization stage fine-tunes the abstractive summarization model on the condensed data to generate the final results.
arXiv Detail & Related papers (2024-10-09T03:42:40Z)
Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books. Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
Toward Unifying Text Segmentation and Long Document Summarization [31.084738269628748]
We study the role that section segmentation plays in extractive summarization of written and spoken documents. Our approach learns robust sentence representations by performing summarization and segmentation simultaneously. Our findings suggest that the model can not only achieve state-of-the-art performance on publicly available benchmarks, but demonstrate better cross-genre transferability.
arXiv Detail & Related papers (2022-10-28T22:07:10Z)
HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization. Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in Rouge F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
Automated News Summarization Using Transformers [4.932130498861987]
We will be presenting a comprehensive comparison of a few transformer architecture based pre-trained models for text summarization. For analysis and comparison, we have used the BBC news dataset that contains text data that can be used for summarization and human generated summaries.
arXiv Detail & Related papers (2021-04-23T04:22:33Z)
ERNIE-DOC: The Retrospective Long-Document Modeling Transformer [24.426571160930635]
We propose ERNIE-DOC, a document-level language pretraining model based on Recurrence Transformers. Two well-designed techniques, namely the retrospective feed mechanism and the enhanced recurrence mechanism enable ERNIE-DOC with much longer effective context length. Various experiments on both English and Chinese document-level tasks are conducted.
arXiv Detail & Related papers (2020-12-31T16:12:48Z)
Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question. We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension [127.3341842928421]
Natural Questions is a new challenging machine reading comprehension benchmark. It has two-grained answers, which are a long answer (typically a paragraph) and a short answer (one or more entities inside the long answer) Existing methods treat these two sub-tasks individually during training while ignoring their dependencies. We present a novel multi-grained machine reading comprehension framework that focuses on modeling documents at their hierarchical nature.
arXiv Detail & Related papers (2020-05-12T14:20:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.