FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual
Robustness
- URL: http://arxiv.org/abs/2211.00294v1
- Date: Tue, 1 Nov 2022 06:09:00 GMT
- Title: FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual
Robustness
- Authors: Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Ziqiang Cao, Sujian Li,
Hua Wu
- Abstract summary: We study the faithfulness of existing systems from a new perspective of factual robustness.
We propose a novel training strategy, namely FRSUM, which teaches the model to defend against both explicit adversarial samples and implicit factual adversarial perturbations.
- Score: 56.263482420177915
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite being able to generate fluent and grammatical text, current Seq2Seq
summarization models still suffer from the unfaithful generation problem. In
this paper, we study the faithfulness of existing systems from a new
perspective of factual robustness, which is the ability to correctly generate
factual information over adversarial unfaithful information. We first measure a
model's factual robustness by its success rate in defending against adversarial
attacks when generating factual information. The factual robustness analysis on
a wide range of current systems shows good consistency with human judgments
on faithfulness. Inspired by these findings, we propose to improve the
faithfulness of a model by enhancing its factual robustness. Specifically, we
propose a novel training strategy, namely FRSUM, which teaches the model to
defend against both explicit adversarial samples and implicit factual
adversarial perturbations. Extensive automatic and human evaluation results
show that FRSUM consistently improves the faithfulness of various Seq2Seq
models, such as T5 and BART.
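As a concrete illustration of the attack-defense framing above, here is a minimal sketch (not the authors' implementation) of how one might probe a Seq2Seq model's factual robustness: at the decoding step where a fact should be produced, check whether the model assigns a higher score to the correct factual token than to an adversarial substitute; the defense success rate is the fraction of such probes the model wins. The checkpoint, example text, and candidate pair are illustrative assumptions.

```python
# Hedged sketch (not the authors' code): probe whether a Seq2Seq model prefers
# the correct factual token over an adversarial substitute at the position
# where the fact is generated. Checkpoint and examples are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base").eval()

def prefers_factual(source, summary_prefix, factual_word, adversarial_word):
    """Return True if the model scores the factual continuation higher."""
    enc = tokenizer(source, return_tensors="pt", truncation=True)
    prefix_ids = tokenizer(summary_prefix, return_tensors="pt",
                           add_special_tokens=False).input_ids
    start = torch.tensor([[model.config.decoder_start_token_id]])
    decoder_input_ids = torch.cat([start, prefix_ids], dim=1)
    with torch.no_grad():
        logits = model(**enc, decoder_input_ids=decoder_input_ids).logits
    next_token_logits = logits[0, -1]  # distribution over the next summary token
    fact_id = tokenizer(" " + factual_word, add_special_tokens=False).input_ids[0]
    adv_id = tokenizer(" " + adversarial_word, add_special_tokens=False).input_ids[0]
    return bool(next_token_logits[fact_id] > next_token_logits[adv_id])

# One probe; a robustness score would average many such probes over a test set.
source = "Alice visited Paris in 2019 and wrote a book about the trip."
print(prefers_factual(source, "Alice visited", "Paris", "London"))
```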
Related papers
- FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows" [74.7488607599921]
FaithEval is a benchmark to evaluate the faithfulness of large language models (LLMs) in contextual scenarios.
FaithEval comprises 4.9K high-quality problems in total, validated through a rigorous four-stage context construction and validation framework.
arXiv Detail & Related papers (2024-09-30T06:27:53Z)
- Factual Dialogue Summarization via Learning from Large Language Models [35.63037083806503]
Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries.
We employ zero-shot learning to extract symbolic knowledge from LLMs, generating factually consistent (positive) and inconsistent (negative) summaries.
Our approach achieves better factual consistency while maintaining coherence, fluency, and relevance, as confirmed by various automatic evaluation metrics.
arXiv Detail & Related papers (2024-06-20T20:03:37Z)
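A hedged sketch of the zero-shot extraction step described in the entry above: prompt an LLM once for a faithful (positive) summary and once for a deliberately corrupted (negative) one. The prompt wording and the `query_llm` callable are hypothetical stand-ins, not the paper's actual prompts or pipeline.

```python
# Illustrative only: prompts that could elicit a factually consistent (positive)
# and an intentionally inconsistent (negative) summary from any LLM callable.
POSITIVE_PROMPT = (
    "Summarize the following dialogue. Only include facts that are explicitly "
    "stated in the dialogue.\n\nDialogue:\n{dialogue}\n\nSummary:"
)
NEGATIVE_PROMPT = (
    "Summarize the following dialogue, but deliberately introduce one factual "
    "error (e.g. swap a name, number, or outcome).\n\nDialogue:\n{dialogue}\n\nSummary:"
)

def build_contrastive_pair(dialogue: str, query_llm):
    """query_llm: any callable mapping a prompt string to generated text (hypothetical)."""
    positive = query_llm(POSITIVE_PROMPT.format(dialogue=dialogue))
    negative = query_llm(NEGATIVE_PROMPT.format(dialogue=dialogue))
    return positive, negative

# Toy usage with a dummy callable standing in for a real model or API.
pos, neg = build_contrastive_pair("A: Meet at 3pm? B: Yes, at the cafe.",
                                  lambda p: "(generated summary)")
```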
- Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z)
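A minimal sketch of the enhance-or-reject idea in the entry above, using the mean token log-probability of the generated output as a stand-in confidence signal; the framework's actual uncertainty estimate and threshold are not specified here, so both are assumptions.

```python
# Hedged sketch: keep a generation only if the model's own confidence clears a
# threshold, otherwise reject/defer. The confidence measure (mean token
# log-probability) and the threshold value are illustrative assumptions.
def accept_or_reject(token_logprobs, threshold=-1.0):
    """token_logprobs: log-probabilities the model assigned to its own output tokens."""
    mean_lp = sum(token_logprobs) / len(token_logprobs)
    return "accept" if mean_lp >= threshold else "reject"

print(accept_or_reject([-0.2, -0.5, -0.1]))  # confident output -> accept
print(accept_or_reject([-2.3, -1.9, -3.0]))  # uncertain output -> reject
```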
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
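In the same spirit as the entry above, a hedged sketch of a reference-free entailment reward: score a candidate summary by the probability an off-the-shelf NLI model assigns to "document entails summary". The NLI checkpoint here is an assumed substitute, not necessarily the model used in the paper.

```python
# Hedged sketch: reference-free entailment reward from an off-the-shelf NLI model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

nli_name = "roberta-large-mnli"  # assumed checkpoint, not the paper's
nli_tok = AutoTokenizer.from_pretrained(nli_name)
nli_model = AutoModelForSequenceClassification.from_pretrained(nli_name).eval()

def entailment_reward(document: str, summary: str) -> float:
    """Probability that the document (premise) entails the summary (hypothesis)."""
    enc = nli_tok(document, summary, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(nli_model(**enc).logits, dim=-1)[0]
    # Look up the entailment class from the config rather than hard-coding it.
    ent_idx = {v.lower(): k for k, v in nli_model.config.id2label.items()}["entailment"]
    return probs[ent_idx].item()

print(entailment_reward("Alice visited Paris in 2019.", "Alice went to Paris."))
```

Such a scalar could then serve as the reward for a sampled summary in any standard policy-gradient update.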
- Disentangled Text Representation Learning with Information-Theoretic Perspective for Adversarial Robustness [17.5771010094384]
Adversarial vulnerability remains a major obstacle to constructing reliable NLP systems.
Recent work argues that the adversarial vulnerability of the model is caused by the non-robust features in supervised training.
In this paper, we tackle the adversarial challenge from the view of disentangled representation learning.
arXiv Detail & Related papers (2022-10-26T18:14:39Z)
- Precisely the Point: Adversarial Augmentations for Faithful and Informative Text Generation [45.37475848753975]
In this paper, we conduct the first quantitative analysis on the robustness of pre-trained Seq2Seq models.
We find that even the current SOTA pre-trained Seq2Seq model (BART) is still vulnerable, which leads to significant degradation in faithfulness and informativeness on text generation tasks.
We propose a novel adversarial augmentation framework, namely AdvSeq, for improving faithfulness and informativeness of Seq2Seq models.
arXiv Detail & Related papers (2022-10-22T06:38:28Z)
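One common way to realize adversarial augmentation for Seq2Seq training is sketched below under the assumption of an FGSM-style step on the input token embeddings; AdvSeq's exact perturbation and augmentation scheme may differ.

```python
# Hedged sketch: an FGSM-style adversarial step on the input embeddings of a
# HuggingFace Seq2Seq model. Epsilon and the single-step attack are assumptions.
import torch

def adversarial_loss(model, inputs_embeds, attention_mask, labels, epsilon=1e-2):
    """Loss on embeddings nudged in the direction that most increases the loss."""
    inputs_embeds = inputs_embeds.detach().requires_grad_(True)
    clean_loss = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask,
                       labels=labels).loss
    (grad,) = torch.autograd.grad(clean_loss, inputs_embeds)
    adv_embeds = (inputs_embeds + epsilon * grad.sign()).detach()  # worst-case nudge
    # Train the model against the perturbed inputs; typically mixed with the clean loss.
    return model(inputs_embeds=adv_embeds, attention_mask=attention_mask,
                 labels=labels).loss
```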
- CLIFF: Contrastive Learning for Improving Faithfulness and Factuality in Abstractive Summarization [6.017006996402699]
We study generating abstractive summaries that are faithful and factually consistent with the given articles.
A novel contrastive learning formulation is presented, which leverages reference summaries as positive training data and automatically generated erroneous summaries as negative training data, training summarization systems that are better at distinguishing between them.
arXiv Detail & Related papers (2021-09-19T20:05:21Z)
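A generic sketch of the positive/negative contrastive idea in the entry above (also applicable to the LLM-distilled pairs further up): an InfoNCE-style term that pulls the faithful summary's representation toward the document and pushes erroneous summaries away. The pooling and loss form are assumptions, not CLIFF's exact formulation.

```python
# Hedged sketch: InfoNCE-style contrastive term over pooled representations.
# Index 0 of the candidate set is the faithful (reference) summary.
import torch
import torch.nn.functional as F

def contrastive_faithfulness_loss(doc_emb, pos_emb, neg_embs, temperature=0.1):
    """doc_emb: (d,), pos_emb: (d,), neg_embs: (n, d) pooled representations."""
    doc = F.normalize(doc_emb, dim=-1)
    cands = F.normalize(torch.cat([pos_emb.unsqueeze(0), neg_embs], dim=0), dim=-1)
    logits = (cands @ doc) / temperature        # similarity of each candidate to the doc
    target = torch.zeros(1, dtype=torch.long)   # the faithful summary is the positive
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random vectors standing in for pooled encoder/decoder states;
# in training this term would be added to the usual cross-entropy loss.
d = 16
print(contrastive_faithfulness_loss(torch.randn(d), torch.randn(d),
                                    torch.randn(4, d)).item())
```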
- Adversarial Robustness under Long-Tailed Distribution [93.50792075460336]
Adversarial robustness has recently attracted extensive study, revealing the vulnerability and intrinsic characteristics of deep networks.
In this work we investigate the adversarial vulnerability as well as defense under long-tailed distributions.
We propose a clean yet effective framework, RoBal, which consists of two dedicated modules: a scale-invariant classifier and data re-balancing.
arXiv Detail & Related papers (2021-04-06T17:53:08Z)
- Adversarially Robust Estimate and Risk Analysis in Linear Regression [17.931533943788335]
Adversarially robust learning aims to design algorithms that are robust to small adversarial perturbations on input variables.
By discovering the statistical minimax rate of convergence of adversarially robust estimators, we emphasize the importance of incorporating model information.
We propose a straightforward two-stage adversarial learning framework, which facilitates the use of model structure information to improve adversarial robustness.
arXiv Detail & Related papers (2020-12-18T14:55:55Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)