A more abstractive summarization model
- URL: http://arxiv.org/abs/2002.10959v1
- Date: Tue, 25 Feb 2020 15:22:23 GMT
- Title: A more abstractive summarization model
- Authors: Satyaki Chakraborty, Xinya Li, Sayak Chakraborty
- Abstract summary: We investigate why the pointer-generator network is unable to generate novel words.
We then address this by adding an out-of-vocabulary (OOV) penalty.
We also report ROUGE scores of our model, since most summarization models are evaluated with R-1, R-2, and R-L scores.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pointer-generator network is an extremely popular method of text
summarization. More recent works in this domain still build on top of the
baseline pointer-generator by augmenting it with a content selection phase, or by
decomposing the decoder into a contextual network and a language model.
However, all such models that are based on the pointer-generator base
architecture cannot generate novel words in the summary and mostly copy words
from the source text. In our work, we first thoroughly investigate why the
pointer-generator network is unable to generate novel words, and then address
that by adding an out-of-vocabulary (OOV) penalty. This enables us to improve
the amount of novelty/abstraction significantly. We use normalized n-gram
novelty scores as the metric for determining the level of abstraction. Moreover,
we also report ROUGE scores of our model, since most summarization models are
evaluated with R-1, R-2, and R-L scores.
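The abstract attributes the lack of novel words to the copy mechanism, so a sketch of the pointer-generator output distribution (See et al., 2017) makes the failure mode concrete: whenever the generation gate p_gen collapses toward zero, all probability mass flows to copied source tokens. The penalty term below is only one plausible reading of the abstract's fix (a loss term that discourages pure copying); the function names and the exact penalty form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, extended_vocab_size):
    """Pointer-generator mixture (See et al., 2017):
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum_i a_i [src_i == w].

    p_gen:      (batch, 1)        generate-vs-copy gate
    vocab_dist: (batch, V)        softmax over the fixed vocabulary
    attn_dist:  (batch, src_len)  attention weights over source tokens
    src_ids:    (batch, src_len)  source token ids in the extended vocab
    """
    vocab_size = vocab_dist.size(1)
    # Give copied OOV source words (ids >= V) slots in the extended vocab.
    extended = F.pad(p_gen * vocab_dist, (0, extended_vocab_size - vocab_size))
    # Scatter the copy probability mass onto the source token positions.
    # When p_gen ~ 0, any word absent from the source gets ~0 probability,
    # which is exactly the "mostly copy" behavior the abstract describes.
    extended = extended.scatter_add(1, src_ids, (1.0 - p_gen) * attn_dist)
    return extended

def loss_with_oov_penalty(extended_dist, target_ids, p_gen, lam=1.0):
    """Negative log-likelihood plus a hypothetical 'OOV penalty' that
    punishes the model when p_gen collapses toward 0 (pure copying).
    The paper's exact penalty may differ; this form is an assumption."""
    nll = F.nll_loss(torch.log(extended_dist + 1e-12), target_ids)
    copy_penalty = -torch.log(p_gen + 1e-12).mean()  # ~0 when p_gen ~ 1
    return nll + lam * copy_penalty
```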
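The novelty metric mentioned in the abstract can also be sketched directly: count the summary n-grams that are absent from the source document. The abstract does not spell out the exact normalization, so dividing by the number of distinct summary n-grams is an assumption here.

```python
def ngrams(tokens, n):
    """Return the set of distinct n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_novelty(summary_tokens, source_tokens, n):
    """Fraction of distinct summary n-grams not found in the source."""
    summary_ngrams = ngrams(summary_tokens, n)
    if not summary_ngrams:
        return 0.0
    novel = summary_ngrams - ngrams(source_tokens, n)
    return len(novel) / len(summary_ngrams)

# Toy example: 2 of the 5 distinct summary bigrams are novel.
source = "the cat sat on the mat".split()
summary = "the cat rested on the mat".split()
print(ngram_novelty(summary, source, 2))  # 0.4 ('cat rested', 'rested on')
```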
Related papers
- GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization [13.61818620609812]
We propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach.
It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts.
Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches.
arXiv Detail & Related papers (2024-08-19T16:01:48Z)
- Retro-FPN: Retrospective Feature Pyramid Network for Point Cloud Semantic Segmentation [65.78483246139888]
We propose Retro-FPN to model the per-point feature prediction as an explicit and retrospective refining process.
Its key novelty is a retro-transformer for summarizing semantic contexts from the previous layer.
We show that Retro-FPN can significantly improve performance over state-of-the-art backbones.
arXiv Detail & Related papers (2023-08-18T05:28:25Z)
- Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees [89.60269205320431]
Current abstractive summarization models either suffer from a lack of clear interpretability or provide incomplete rationales.
We propose the Summarization Program (SP), an interpretable modular framework consisting of an (ordered) list of binary trees.
A Summarization Program contains one root node per summary sentence, and a distinct tree connects each summary sentence to the document sentences.
arXiv Detail & Related papers (2022-09-21T16:50:22Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- MeetSum: Transforming Meeting Transcript Summarization using Transformers! [2.1915057426589746]
We utilize a Transformer-based Pointer Generator Network to generate abstract summaries for meeting transcripts.
This model uses two LSTMs as an encoder and a decoder, a Pointer network that copies words from the input text, and a Generator network to produce out-of-vocabulary words.
We show that training the model on a news summarization dataset and testing it zero-shot on the meeting dataset produces better results than training it on the AMI meeting dataset.
arXiv Detail & Related papers (2021-08-13T16:34:09Z)
- To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text [4.4044968357361745]
We characterize how one popular abstractive model, the pointer-generator model of See et al., uses its explicit copy/generation switch to control its level of abstraction.
When we modify the copy/generation switch and force the model to generate, only simple neural abilities are revealed alongside factual inaccuracies and hallucinations.
In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
arXiv Detail & Related papers (2021-06-03T04:03:15Z)
- Reinforced Generative Adversarial Network for Abstractive Text Summarization [7.507096634112164]
Sequence-to-sequence models provide a viable new approach to generative summarization.
These models have notable drawbacks: their grasp of the details of the original text is often inaccurate, and the text they generate often contains repetitions.
We propose a new architecture that combines reinforcement learning and adversarial generative networks to enhance the sequence-to-sequence attention model.
arXiv Detail & Related papers (2021-05-31T17:34:47Z)
- LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring [55.16665077221941]
We propose a novel rescoring approach, which processes the entire lattice in a single call to the model.
The key feature of our rescoring policy is a novel non-autoregressive Lattice Transformer Language Model (LT-LM).
arXiv Detail & Related papers (2021-04-06T14:06:07Z)
- Concept Extraction Using Pointer-Generator Networks [86.75999352383535]
We propose a generic open-domain OOV-oriented extractive model that is based on distant supervision of a pointer-generator network.
The model has been trained on a large annotated corpus compiled specifically for this task from 250K Wikipedia pages.
arXiv Detail & Related papers (2020-08-25T22:28:14Z)
- Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)
- Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward [42.925345819778656]
We present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD.
We propose the use of dual encoders, a sequential document encoder and a graph-structured encoder, to maintain the global context and local characteristics of entities.
Results show that our models produce significantly higher ROUGE scores than a variant without the knowledge graph as input on both the New York Times and CNN/Daily Mail datasets.
arXiv Detail & Related papers (2020-05-03T18:23:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.