Enhancing Abstractiveness of Summarization Models through Calibrated
Distillation
- URL: http://arxiv.org/abs/2310.13760v2
- Date: Mon, 4 Dec 2023 06:47:21 GMT
- Title: Enhancing Abstractiveness of Summarization Models through Calibrated
Distillation
- Authors: Hwanjun Song, Igor Shalyminov, Hang Su, Siffi Singh, Kaisheng Yao,
Saab Mansour
- Abstract summary: DisCal is a novel approach to enhance the level of abstractiveness without sacrificing informativeness.
Our experiments show that DisCal outperforms prior methods in abstractive summarization distillation.
- Score: 30.199051061633803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequence-level knowledge distillation reduces the size of Seq2Seq models for
more efficient abstractive summarization. However, it often leads to a loss of
abstractiveness in summarization. In this paper, we propose a novel approach
named DisCal to enhance the level of abstractiveness (measured by n-gram
overlap) without sacrificing the informativeness (measured by ROUGE) of
generated summaries. DisCal exposes the student model to diverse pseudo
summaries with two supervision signals. First, the best pseudo summary,
identified in terms of abstractiveness and informativeness, is used for
sequence-level distillation. Second, the candidates' ranks are used to ensure
that the student model assigns higher prediction scores to summaries with
higher ranks. Our experiments show that DisCal outperforms prior methods in
abstractive summarization distillation, producing highly abstractive and
informative summaries.
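To make the two supervision signals concrete, the sketch below shows one way they could be combined for a Hugging Face-style Seq2Seq student. The function names, the length-normalized scoring, and the margin schedule are illustrative assumptions, not the authors' released implementation.
```python
# Illustrative sketch of the two supervision signals described in the abstract.
# Scoring, margin schedule, and function names are assumptions for clarity,
# not the authors' implementation.
import torch
import torch.nn.functional as F

def sequence_log_score(student, input_ids, summary_ids):
    """Length-normalized log-likelihood of a pseudo summary under the student."""
    logits = student(input_ids=input_ids, labels=summary_ids).logits
    log_probs = F.log_softmax(logits, dim=-1)
    token_scores = log_probs.gather(-1, summary_ids.unsqueeze(-1)).squeeze(-1)
    return token_scores.mean(dim=-1)  # shape: (batch,)

def calibrated_distillation_loss(student, input_ids, ranked_summaries, margin=0.01):
    """ranked_summaries: pseudo-summary tensors ordered best-first by the
    combined abstractiveness/informativeness ranking."""
    # (1) Sequence-level distillation on the single best pseudo summary.
    distill_loss = student(input_ids=input_ids, labels=ranked_summaries[0]).loss

    # (2) Calibration: penalize the student whenever a lower-ranked summary
    # receives a higher score than a higher-ranked one.
    scores = [sequence_log_score(student, input_ids, s) for s in ranked_summaries]
    calib_loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            calib_loss = calib_loss + torch.clamp(
                margin * (j - i) - (scores[i] - scores[j]), min=0.0).mean()
    return distill_loss + calib_loss
```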
Related papers
- AugSumm: towards generalizable speech summarization using synthetic
labels from large language model [61.73741195292997]
Abstractive speech summarization (SSUM) aims to generate human-like summaries from speech.
Conventional SSUM models are mostly trained and evaluated with a single ground-truth (GT) human-annotated deterministic summary.
We propose AugSumm, a method to leverage large language models (LLMs) as a proxy for human annotators to generate augmented summaries.
arXiv Detail & Related papers (2024-01-10T18:39:46Z)
- Referee: Reference-Free Sentence Summarization with Sharper
Controllability through Symbolic Knowledge Distillation [72.70058049274664]
We present Referee, a novel framework for sentence summarization that can be trained reference-free (i.e., requiring no gold summaries for supervision).
Our work is the first to demonstrate that reference-free, controlled sentence summarization is feasible via the conceptual framework of Symbolic Knowledge Distillation.
arXiv Detail & Related papers (2022-10-25T07:07:54Z)
- Correcting Diverse Factual Errors in Abstractive Summarization via
Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON utilizes the allocation of salience expectation to guide abstractive summarization and adapts well to articles with different levels of abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- Towards Summary Candidates Fusion [26.114829566197976]
We propose a new paradigm in second-stage abstractive summarization called SummaFusion.
It fuses several summary candidates to produce a novel abstractive second-stage summary.
Our method works well on several summarization datasets, improving both the ROUGE scores and qualitative properties of fused summaries.
arXiv Detail & Related papers (2022-10-17T06:48:05Z)
- Attention Temperature Matters in Abstractive Summarization Distillation [43.12920043942568]
This paper aims to distill large sequence-to-sequence Transformer models into smaller ones for faster inference and minimal performance loss.
We find simply manipulating attention temperatures in Transformers can make pseudo labels easier to learn for student models.
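The rescaling itself is a one-line change to scaled dot-product attention, sketched below with a placeholder temperature value; this shows only the mechanism, not the paper's full pseudo-labeling recipe.
```python
# Sketch of attention-temperature rescaling in scaled dot-product attention.
# The temperature value is a placeholder, not the setting used in the paper.
import math
import torch
import torch.nn.functional as F

def scaled_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor,
                     temperature: float = 2.0) -> torch.Tensor:
    d_k = q.size(-1)
    # Standard attention divides scores by sqrt(d_k); a temperature > 1 further
    # flattens the attention distribution, while a temperature < 1 sharpens it.
    scores = q @ k.transpose(-2, -1) / (math.sqrt(d_k) * temperature)
    return F.softmax(scores, dim=-1) @ v
```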
arXiv Detail & Related papers (2021-06-07T09:18:21Z)
- The Summary Loop: Learning to Write Abstractive Summaries Without
Examples [21.85348918324668]
This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint.
Key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary.
When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points.
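A rough sketch of such a coverage signal is given below; the keyword masking and the fill-in model are simplified, hypothetical stand-ins for the trained coverage model described in the paper.
```python
# Rough sketch of a coverage score: mask key terms in the source document and
# check how many a fill-in model recovers when it also sees the candidate
# summary. `fill_in_model` is a hypothetical callable standing in for the
# paper's trained coverage model.
from typing import Callable, List, Set

def coverage_score(document_tokens: List[str], summary: str,
                   keywords: Set[str],
                   fill_in_model: Callable[[List[str], str], List[str]]) -> float:
    masked_doc = [t if t not in keywords else "[MASK]" for t in document_tokens]
    gold = [t for t in document_tokens if t in keywords]
    # One prediction per [MASK], conditioned on the masked document and summary.
    predictions = fill_in_model(masked_doc, summary)
    hits = sum(p == g for p, g in zip(predictions, gold))
    return hits / max(len(gold), 1)
```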
arXiv Detail & Related papers (2021-05-11T23:19:46Z)
- Constrained Abstractive Summarization: Preserving Factual Consistency
with Constrained Generation [93.87095877617968]
We propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization.
We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS.
We observe up to 13.8 ROUGE-2 gains when only one manual constraint is used in interactive summarization.
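Lexically constrained decoding is available off the shelf in common generation toolkits; the sketch below uses the force_words_ids option of Hugging Face's generate with a placeholder checkpoint and constraint, and is not the authors' exact setup.
```python
# Minimal sketch of lexically constrained decoding via constrained beam search.
# The checkpoint and the constraint token are placeholders, not the paper's setup.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

article = "The article text to be summarized goes here."  # source document
constraint = "vaccine"  # a term the summary must contain

inputs = tokenizer(article, return_tensors="pt", truncation=True)
force_words_ids = [tokenizer(constraint, add_special_tokens=False).input_ids]

summary_ids = model.generate(
    **inputs,
    num_beams=4,                     # constrained decoding requires beam search
    force_words_ids=force_words_ids,
    max_length=80,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```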
arXiv Detail & Related papers (2020-10-24T00:27:44Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.