SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
- URL: http://arxiv.org/abs/2006.10213v1
- Date: Thu, 18 Jun 2020 00:13:21 GMT
- Title: SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
- Authors: Yao Zhao, Mohammad Saleh, Peter J. Liu
- Abstract summary: We study a sequence-to-sequence setting with input sequence lengths up to 100,000 tokens and output sequence lengths up to 768 tokens.
We propose SEAL, a Transformer-based model, featuring a new encoder-decoder attention that dynamically extracts/selects input snippets to sparsely attend to for each output segment.
The SEAL model achieves state-of-the-art results on existing long-form summarization tasks, and outperforms strong baseline models on a new dataset/task we introduce, Search2Wiki, with much longer input text.
- Score: 39.85688193525843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most prior work in the sequence-to-sequence paradigm focused on datasets with
input sequence lengths in the hundreds of tokens due to the computational
constraints of common RNN and Transformer architectures. In this paper, we
study long-form abstractive text summarization, a sequence-to-sequence setting
with input sequence lengths up to 100,000 tokens and output sequence lengths up
to 768 tokens. We propose SEAL, a Transformer-based model, featuring a new
encoder-decoder attention that dynamically extracts/selects input snippets to
sparsely attend to for each output segment. Using only the original documents
and summaries, we derive proxy labels that provide weak supervision for
extractive layers simultaneously with regular supervision from abstractive
summaries. The SEAL model achieves state-of-the-art results on existing
long-form summarization tasks, and outperforms strong baseline models on a new
dataset/task we introduce, Search2Wiki, with much longer input text. Since
content selection is explicit in the SEAL model, a desirable side effect is
that the selection can be inspected for enhanced interpretability.
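
The selection mechanism is only described at a high level above. The sketch below illustrates the general extract-then-attend pattern in numpy: score encoded input snippets against the current output segment, keep the top-k, and restrict cross-attention to those snippets. The function names, the dot-product scorer, and the dimensions are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def select_snippets(snippet_reprs, segment_query, k=4):
    # Score each encoded input snippet against the current output segment
    # and keep the top-k. A real model would learn this scorer; a plain
    # dot product is used here purely for illustration.
    scores = snippet_reprs @ segment_query          # (num_snippets,)
    return np.argsort(scores)[-k:]                  # indices of the k best snippets

def sparse_cross_attention(decoder_states, snippet_reprs, selected):
    # Attend only to the selected snippets instead of the full input, so the
    # attention cost depends on k, not on the total input length.
    keys = snippet_reprs[selected]                  # (k, d); also reused as values here
    attn = softmax(decoder_states @ keys.T)         # (seg_len, k)
    return attn @ keys                              # (seg_len, d)

# Toy run: 1,000 encoded input snippets, one 16-token output segment.
rng = np.random.default_rng(0)
snippets = rng.normal(size=(1000, 64))              # encoder outputs per snippet
segment_query = rng.normal(size=64)                 # representation of the current segment
decoder_states = rng.normal(size=(16, 64))

chosen = select_snippets(snippets, segment_query, k=4)
context = sparse_cross_attention(decoder_states, snippets, chosen)
print(chosen, context.shape)                        # selected snippet ids, (16, 64)
```

Because the selected indices are explicit, they can be logged and inspected, which is the interpretability side effect noted in the abstract.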
Related papers
- LOCOST: State-Space Models for Long Document Abstractive Summarization [76.31514220737272]
We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs.
With a computational complexity of $O(L \log L)$, this architecture can handle significantly longer sequences than state-of-the-art models based on sparse attention patterns (see the sketch after this list).
arXiv Detail & Related papers (2024-01-31T15:33:37Z)
- Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization [20.87460375478907]
Text summarization aims to generate a short summary for an input text.
In this work, we propose NAUS, a Non-Autoregressive Unsupervised Summarization approach.
Experiments show that NAUS achieves state-of-the-art performance for unsupervised summarization.
arXiv Detail & Related papers (2022-05-28T21:09:23Z)
- Efficient Long Sequence Encoding via Synchronization [29.075962393432857]
We propose a synchronization mechanism for hierarchical encoding.
Our approach first identifies anchor tokens across segments and groups them by their roles in the original input sequence.
Our approach is able to improve the global information exchange among segments while maintaining efficiency.
arXiv Detail & Related papers (2022-03-15T04:37:02Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attention for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching [28.190001111358438]
We propose SMITH, a Siamese Multi-depth Transformer-based Hierarchical Encoder, for long-form document matching.
Our model contains several innovations to adapt self-attention models for longer text input.
We will open-source a Wikipedia-based benchmark dataset, code, and a pre-trained checkpoint to accelerate future research on long-form document matching.
arXiv Detail & Related papers (2020-04-26T07:04:08Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance over baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
- Length-controllable Abstractive Summarization by Guiding with Summary Prototype [27.094797760775297]
We propose a new length-controllable abstractive summarization model.
Our model generates a summary in two steps.
Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
arXiv Detail & Related papers (2020-01-21T04:01:58Z)
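
The LOCOST entry above reports an $O(L \log L)$ complexity. That cost is characteristic of evaluating a state-space layer as a per-channel long convolution via the FFT instead of forming an $L \times L$ attention matrix. The numpy sketch below illustrates only this cost argument; the random kernel stands in for learned state-space parameters and is not LOCOST's actual model.

```python
import numpy as np

def long_conv_fft(u, kernel):
    # Per-channel (depthwise) causal convolution of a length-L sequence with a
    # length-L kernel, computed in O(L log L) with the FFT instead of O(L^2).
    L = u.shape[0]
    n = 2 * L                                       # zero-pad: circular conv -> linear conv
    U = np.fft.rfft(u, n=n, axis=0)
    K = np.fft.rfft(kernel, n=n, axis=0)
    return np.fft.irfft(U * K, n=n, axis=0)[:L]     # keep the first L (causal) outputs

L, d = 100_000, 8                                   # input lengths on the order studied above
rng = np.random.default_rng(0)
u = rng.normal(size=(L, d))                         # token representations
kernel = rng.normal(size=(L, d)) / np.sqrt(L)       # stand-in for an SSM-derived kernel
y = long_conv_fft(u, kernel)
print(y.shape)                                      # (100000, 8)
```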
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content on this site (including all information) and is not responsible for any consequences.