Improving Long Text Understanding with Knowledge Distilled from Summarization Model
- URL: http://arxiv.org/abs/2405.04955v1
- Date: Wed, 8 May 2024 10:49:39 GMT
- Title: Improving Long Text Understanding with Knowledge Distilled from Summarization Model
- Authors: Yan Liu, Yazheng Yang, Xiaokang Chen,
- Abstract summary: We propose our Gist Detector to leverage the gist detection ability of a summarization model.
Gist Detector first learns the gist detection knowledge distilled from a summarization model, and then produces gist-aware representations.
We evaluate our method on three different tasks: long document classification, distantly supervised open-domain question answering, and non-parallel text style transfer.
- Score: 17.39913210351487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long text understanding is important yet challenging for natural language processing. A long article or document usually contains many redundant words that are not pertinent to its gist and can sometimes be regarded as noise. With recent advances in abstractive summarization, we propose our Gist Detector to leverage the gist detection ability of a summarization model and integrate the extracted gist into downstream models to enhance their long text understanding ability. Specifically, Gist Detector first learns the gist detection knowledge distilled from a summarization model, and then produces gist-aware representations to augment downstream models. We evaluate our method on three different tasks: long document classification, distantly supervised open-domain question answering, and non-parallel text style transfer. The experimental results show that our method significantly improves the performance of baseline models on all tasks.
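The abstract specifies the mechanism only at a high level. The sketch below is one minimal reading of it in PyTorch: a small student scorer is trained to match per-token importance weights distilled from a teacher summarization model, and the resulting gist-aware vector can be fused into a downstream model. Using the teacher's source-side attention as the distillation target, the KL objective, and fusion by weighted pooling are all assumptions for illustration, not the paper's confirmed design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GistDetector(nn.Module):
    """Student network: predicts a per-token 'gist' distribution over the input."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim)
        logits = self.scorer(token_states).squeeze(-1)  # (batch, seq_len)
        return F.log_softmax(logits, dim=-1)            # log-distribution over tokens

def distillation_loss(student_log_probs, teacher_weights):
    # teacher_weights: per-token importance from the summarization model,
    # normalized to a distribution (assumed here to come from its source-side
    # attention; the paper may derive the distilled target differently).
    return F.kl_div(student_log_probs, teacher_weights, reduction="batchmean")

def gist_aware_representation(token_states, student_log_probs):
    # Pool token states under the predicted gist distribution; the result
    # can be concatenated with a downstream model's own representation.
    weights = student_log_probs.exp().unsqueeze(-1)  # (batch, seq_len, 1)
    return (weights * token_states).sum(dim=1)       # (batch, hidden_dim)

# Toy usage with random tensors standing in for real encoder outputs.
batch, seq_len, hidden = 2, 50, 256
token_states = torch.randn(batch, seq_len, hidden)
teacher_weights = torch.softmax(torch.randn(batch, seq_len), dim=-1)

detector = GistDetector(hidden)
log_probs = detector(token_states)
loss = distillation_loss(log_probs, teacher_weights)
gist_vec = gist_aware_representation(token_states, log_probs)
print(loss.item(), gist_vec.shape)  # scalar loss and a (2, 256) gist vector
```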
Related papers
- Improving Sequence-to-Sequence Models for Abstractive Text Summarization Using Meta Heuristic Approaches [0.0]
Humans have a unique ability to create abstractions.
The use of sequence-to-sequence (seq2seq) models for neural abstractive text summarization has been growing in prevalence.
In this article, we aim to enhance existing architectures and models for abstractive text summarization.
arXiv Detail & Related papers (2024-03-24T17:39:36Z)
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Retrieval augmentation of large language models for lay language generation [12.686922203465896]
We introduce CELLS, the largest (63k pairs) and broadest-ranging (12 journals) parallel corpus for lay language generation.
The abstract and the corresponding lay language summary are written by domain experts, ensuring the quality of our dataset.
We derive two specialized paired corpora from CELLS to address key challenges in lay language generation: generating background explanations and simplifying the original abstract.
arXiv Detail & Related papers (2022-11-07T19:06:53Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attention for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1 (a sketch of the sparse attention pattern appears after this list).
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Enhance Long Text Understanding via Distilled Gist Detector from Abstractive Summarization [7.851265919027389]
We consider the problem of how to disentangle the gist-relevant and irrelevant information for long text understanding.
Experiments on document classification, distantly supervised open-domain question answering (DS-QA) and non-parallel text style transfer show that our method can significantly improve the performance of the baseline models.
arXiv Detail & Related papers (2021-10-10T09:21:24Z)
- StreamHover: Livestream Transcript Summarization and Annotation [54.41877742041611]
We present StreamHover, a framework for annotating and summarizing livestream transcripts.
With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than existing annotated corpora.
We show that our model generalizes better and improves performance over strong baselines.
arXiv Detail & Related papers (2021-09-11T02:19:37Z)
- To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text [4.4044968357361745]
We characterize how one popular abstractive model, the pointer-generator model of See et al., uses its explicit copy/generation switch to control its level of abstraction.
When we modify the copy/generation switch and force the model to generate, only simple paraphrasing abilities are revealed, alongside factual inaccuracies and hallucinations (a sketch of this switch appears after this list).
In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
arXiv Detail & Related papers (2021-06-03T04:03:15Z)
- Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders (a sequential document encoder and a graph-structured encoder) to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without the knowledge graph as input on both the New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
- Few-Shot Learning for Opinion Summarization [117.70510762845338]
Opinion summarization is the automatic creation of text reflecting subjective information expressed in multiple documents.
In this work, we show that even a handful of summaries is sufficient to bootstrap generation of the summary text.
Our approach substantially outperforms previous extractive and abstractive methods in automatic and human evaluation.
arXiv Detail & Related papers (2020-04-30T15:37:38Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise (see the data-construction sketch after this list).
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq-based abstractive summarization model on unlabeled text (one such objective is sketched after this list).
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
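HETFORMER's multi-granularity sparse attention is only named in its entry above. The following is a hedged sketch of the general pattern such models use: a boolean attention mask combining a token-level sliding window with a few globally-attending positions (for example, sentence boundaries). The window size, the global positions, and the mask layout are illustrative assumptions, not HETFORMER's published configuration.

```python
import torch

def sparse_attention_mask(seq_len: int, window: int, global_positions: list) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where attention is allowed."""
    mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)

    # Token-level granularity: each position attends to a local window.
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True

    # Coarser granularity: designated positions (e.g., sentence starts)
    # attend to all tokens and are attended to by all tokens.
    for g in global_positions:
        mask[g, :] = True
        mask[:, g] = True
    return mask

mask = sparse_attention_mask(seq_len=16, window=2, global_positions=[0, 8])
print(int(mask.sum()), "allowed pairs out of", 16 * 16)
```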
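The copy/generation switch probed in "To Point or Not to Point" follows the pointer-generator formulation of See et al. (2017), where the output distribution is the mixture p_gen * P_vocab + (1 - p_gen) * (attention-copied mass). Below is a minimal sketch of that mixture; forcing p_gen to 1, as the paper's intervention does, removes the copy term. Shapes and names are illustrative.

```python
import torch

def final_distribution(p_gen, vocab_dist, attention, src_ids):
    """Pointer-generator mixture (See et al., 2017).

    p_gen:      (batch, 1)        the copy/generation switch
    vocab_dist: (batch, vocab)    softmax over the output vocabulary
    attention:  (batch, src_len)  attention over source tokens
    src_ids:    (batch, src_len)  vocabulary ids of the source tokens
    """
    generated = p_gen * vocab_dist
    # Scatter the remaining probability mass onto the source tokens' ids.
    copied = torch.zeros_like(vocab_dist).scatter_add_(
        1, src_ids, (1.0 - p_gen) * attention
    )
    return generated + copied

batch, vocab, src_len = 2, 100, 7
vocab_dist = torch.softmax(torch.randn(batch, vocab), dim=-1)
attention = torch.softmax(torch.randn(batch, src_len), dim=-1)
src_ids = torch.randint(0, vocab, (batch, src_len))

# Forcing the switch to p_gen = 1 removes the copy term entirely, which is
# the intervention the paper uses to probe the model's abstraction ability.
for p in (0.3, 1.0):
    p_gen = torch.full((batch, 1), p)
    dist = final_distribution(p_gen, vocab_dist, attention, src_ids)
    print(p, float(dist.sum(dim=-1)[0]))  # each row still sums to 1.0
```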
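The noising-and-denoising entry describes a concrete data-construction recipe: sample one review to act as a pseudo-summary and synthesize noisy input versions of it. Below is a sketch of that recipe in plain Python; the two noising operators shown (substituting unrelated reviews and dropping random words) are assumptions standing in for the paper's actual operators.

```python
import random

def make_synthetic_pair(reviews, num_noisy=3, drop_prob=0.2, rng=random):
    """Build one (noisy inputs -> pseudo-summary) training pair.

    A sampled review plays the role of the summary; its 'inputs' are noisy
    variants, so a denoising model learns to recover consensus content.
    """
    summary = rng.choice(reviews)
    noisy_inputs = []
    for _ in range(num_noisy):
        if rng.random() < 0.5:
            # Document-level noise: substitute an unrelated review.
            noisy_inputs.append(rng.choice(reviews))
        else:
            # Word-level noise: randomly drop tokens from the pseudo-summary.
            kept = [w for w in summary.split() if rng.random() > drop_prob]
            noisy_inputs.append(" ".join(kept))
    return noisy_inputs, summary

reviews = [
    "Great battery life and a sharp screen.",
    "Battery lasts for days, screen looks great.",
    "Shipping was slow but the battery is excellent.",
]
inputs, target = make_synthetic_pair(reviews, rng=random.Random(0))
print(target)
print(inputs)
```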
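The final entry names three pre-training objectives without detail. As one hedged illustration of "reinstating source text", the sketch below builds a (corrupted input, original document) pair by shuffling sentences, which a Seq2Seq model could then be pre-trained to restore. Sentence shuffling is an assumption chosen for illustration and may not match the paper's exact objectives.

```python
import random

def sentence_reordering_pair(document: str, rng=random):
    """One self-supervised pre-training pair: shuffled sentences -> original text.

    The model is trained to reinstate the source document from its shuffled
    version, requiring no labeled summaries.
    """
    sentences = [s.strip() + "." for s in document.split(".") if s.strip()]
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    return " ".join(shuffled), " ".join(sentences)  # (input, target)

doc = "The plant opened in 1998. It employs 500 people. Output doubled last year."
src, tgt = sentence_reordering_pair(doc, rng=random.Random(1))
print(src)
print(tgt)
```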