Learning with Rejection for Abstractive Text Summarization
- URL: http://arxiv.org/abs/2302.08531v1
- Date: Thu, 16 Feb 2023 19:07:08 GMT
- Title: Learning with Rejection for Abstractive Text Summarization
- Authors: Meng Cao, Yue Dong, Jingyi He and Jackie Chi Kit Cheung
- Abstract summary: We propose a training objective for abstractive summarization based on rejection learning.
We show that our method considerably improves the factuality of generated summaries in automatic and human evaluations.
- Score: 42.15551472507393
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: State-of-the-art abstractive summarization systems frequently hallucinate
content that is not supported by the source document, mainly due to noise in
the training dataset. Existing methods opt to drop the noisy samples or tokens
from the training set entirely, reducing the effective training set size and
creating an artificial propensity to copy words from the source. In this work,
we propose a training objective for abstractive summarization based on
rejection learning, in which the model learns whether or not to reject
potentially noisy tokens. We further propose a regularized decoding objective
that penalizes non-factual candidate summaries during inference by using the
rejection probability learned during training. We show that our method
considerably improves the factuality of generated summaries in automatic and
human evaluations when compared to five baseline models and that it does so
while increasing the abstractiveness of the generated summaries.
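The abstract does not give the exact formulas, but a minimal sketch of what a rejection-augmented token loss and a rejection-regularized decoding score might look like is below (PyTorch; the interpolation with a fixed rejection cost and the additive decoding penalty are assumptions, not the paper's published objective):

```python
import torch
import torch.nn.functional as F

def rejection_loss(logits, reject_logits, targets, reject_cost=0.5):
    """Hypothetical rejection-augmented token loss (a sketch, not the
    paper's exact objective).

    logits:        (batch, seq, vocab) token prediction scores
    reject_logits: (batch, seq) scores for rejecting each target token
    targets:       (batch, seq) gold token ids
    reject_cost:   fixed cost that keeps the model from rejecting everything
    """
    ce = F.cross_entropy(logits.transpose(1, 2), targets,
                         reduction="none")          # (batch, seq)
    r = torch.sigmoid(reject_logits)                # P(reject token)
    # Accepted tokens pay the usual NLL; rejected tokens pay a flat cost.
    return ((1.0 - r) * ce + r * reject_cost).mean()

def regularized_decoding_score(log_prob, reject_probs, alpha=1.0):
    """Sketch of rejection-regularized decoding: candidate summaries whose
    tokens were likely to be rejected are penalized at inference time."""
    return log_prob - alpha * reject_probs.sum()
```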
Related papers
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method consistently improves over existing methods.
Our method is data efficient and outperforms competitive baselines.
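As a rough illustration, one round of such a loop might look like the sketch below; `query_generator` and `retriever.train_on` are hypothetical stand-ins, and the token-dropout noise is an assumption about where the "noisy" part enters:

```python
import random

def self_training_round(retriever, query_generator, passages, drop_p=0.1):
    """Hypothetical noisy self-training round for a dense retriever
    (a sketch; components are stand-ins, not the paper's)."""
    pairs = []
    for passage in passages:
        query = query_generator(passage)  # synthetic query for the passage
        noisy = " ".join(t for t in query.split()
                         if random.random() > drop_p)  # input noise
        pairs.append((noisy or query, passage))  # pseudo-labeled positive
    retriever.train_on(pairs)             # hypothetical training API
    return retriever
```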
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness [21.037841262371355]
A notable challenge in Multi-Document Summarization (MDS) is the extreme length of the input.
We present an extract-then-abstract Transformer framework to overcome the problem.
We propose a loss weighting mechanism that makes the model aware of the unequal importance of sentences that fall outside the pseudo extraction oracle.
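One plausible form of such a weighting is sketched below (the paper's exact scheme may differ): down-weight the penalty on non-oracle sentences in proportion to how much they still resemble the reference.

```python
import torch
import torch.nn.functional as F

def credit_aware_extraction_loss(scores, oracle_mask, credit):
    """Sketch of credit-aware loss weighting for sentence extraction.

    scores:      (num_sents,) extraction logits
    oracle_mask: (num_sents,) 1.0 for sentences in the pseudo oracle
    credit:      (num_sents,) similarity of each sentence to the
                 reference in [0, 1], e.g. a ROUGE score
    """
    bce = F.binary_cross_entropy_with_logits(scores, oracle_mask.float(),
                                             reduction="none")
    # Non-oracle sentences that still resemble the reference are
    # penalized less than clearly irrelevant ones.
    weights = torch.where(oracle_mask.bool(),
                          torch.ones_like(credit), 1.0 - credit)
    return (weights * bce).mean()
```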
arXiv Detail & Related papers (2022-05-04T04:40:39Z)
- Evaluating the Tradeoff Between Abstractiveness and Factuality in Abstractive Summarization [20.83986393847262]
We analyze the tradeoff between abstractiveness and factuality of generated summaries across multiple datasets and models.
We propose new factuality metrics that adjust for the degree of abstractiveness.
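Abstractiveness is commonly measured via novel n-grams, i.e. summary n-grams that never appear in the source; a small sketch of that standard measure is below (the paper's adjusted metrics build on quantities of this kind, though their exact form is not reproduced here):

```python
def novel_ngram_fraction(source, summary, n=2):
    """Fraction of summary n-grams absent from the source; a common
    proxy for abstractiveness."""
    def ngrams(text):
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    src, summ = ngrams(source), ngrams(summary)
    return len(summ - src) / len(summ) if summ else 0.0
```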
arXiv Detail & Related papers (2021-08-05T21:28:20Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single- or multi-masking strategies to either iteratively or auto-regressively replace entities, ensuring semantic consistency with respect to the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
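The iterative variant of this correction loop can be sketched as below; `fill_span` is a hypothetical callable standing in for the QA-informed span-selection model:

```python
def correct_summary(source, summary, entities, fill_span):
    """Sketch of iterative span-based correction: mask one entity at a
    time and let a QA-informed model repredict it from the source."""
    corrected = summary
    for entity in entities:
        masked = corrected.replace(entity, "<mask>", 1)
        if masked == corrected:
            continue  # entity no longer present after earlier edits
        replacement = fill_span(source, masked)  # span chosen from source
        corrected = masked.replace("<mask>", replacement, 1)
    return corrected
```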
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- Noisy Self-Knowledge Distillation for Text Summarization [83.49809205891496]
We apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training.
Our student summarization model is trained with guidance from a teacher which generates smoothed labels to help regularize training.
We demonstrate experimentally on three benchmarks that our framework boosts the performance of both pretrained and non-pretrained summarizers.
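A sketch of the standard distillation loss this setup implies, with the student matching both the gold tokens and the teacher's temperature-smoothed distribution (hyperparameter values are illustrative):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      alpha=0.5, temperature=2.0):
    """Standard knowledge-distillation objective (a sketch): cross-entropy
    to the gold tokens plus KL to the teacher's smoothed labels."""
    ce = F.cross_entropy(student_logits, targets)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(log_student, soft_teacher, reduction="batchmean")
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kl
```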
arXiv Detail & Related papers (2020-09-15T12:53:09Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method in which these auxiliary samples are generated on the fly, given only the model being trained on the assessed objective, so no explicit memory buffer is needed. Instead, the implicit memory of learned samples within the model itself is exploited.
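One way to read this is that replay inputs are synthesized from the model itself; the sketch below, which optimizes random inputs toward regions the model is already confident about and reuses its predictions as targets, is an interpretation of internal replay, not the paper's exact procedure:

```python
import torch

def generate_replay_batch(model, shape, steps=20, lr=0.1):
    """Hypothetical on-the-fly replay: optimize random inputs until the
    current model is confident about them, then use its own predictions
    as replay targets (an interpretation, not the published algorithm)."""
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.SGD([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = model(x)
        # Push inputs toward what the model already "remembers":
        # maximize the confidence of its most likely class.
        (-logits.max(dim=-1).values.mean()).backward()
        opt.step()
    with torch.no_grad():
        targets = model(x).argmax(dim=-1)
    return x.detach(), targets
```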
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
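The noising step can be sketched as follows; the word-dropout noise and the number of synthetic inputs are illustrative choices, not the paper's exact configuration:

```python
import random

def make_synthetic_pair(reviews, k=8, drop_p=0.2):
    """Sketch of the noising construction: sample one review as a
    pseudo-summary and pair it with noisy variants that stand in for
    the input reviews."""
    def word_dropout(text):
        kept = [t for t in text.split() if random.random() > drop_p]
        return " ".join(kept) if kept else text
    pseudo_summary = random.choice(reviews)
    noisy_inputs = [word_dropout(pseudo_summary) for _ in range(k)]
    return noisy_inputs, pseudo_summary
```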
arXiv Detail & Related papers (2020-04-21T16:54:57Z)