Improving the Robustness of Summarization Models by Detecting and
Removing Input Noise
- URL: http://arxiv.org/abs/2212.09928v2
- Date: Mon, 4 Dec 2023 16:18:33 GMT
- Authors: Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming
Luo, Mohammad Saleh, Peter J. Liu
- Abstract summary: We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a lightweight method for detecting and removing such noise in the input during model inference, without requiring any extra training, auxiliary models, or even prior knowledge of the type of noise.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The evaluation of abstractive summarization models typically uses test data
that is identically distributed with the training data. In real-world practice,
documents to be summarized may contain input noise caused by text extraction
artifacts or data pipeline bugs. The robustness of model performance under
distribution shift caused by such noise is relatively under-studied. We present
a large empirical study quantifying the sometimes severe loss in performance
(up to 12 ROUGE-1 points) from different types of input noise for a range of
datasets and model sizes. We then propose a lightweight method for detecting
and removing such noise in the input during model inference without requiring
any extra training, auxiliary models, or even prior knowledge of the type of
noise. Our proposed approach effectively mitigates the loss in performance,
recovering a large fraction of the performance drop, sometimes as large as 11
ROUGE-1 points.
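To make the inference-time filtering idea concrete, here is a minimal sketch in Python. It assumes sentence embeddings come from the summarizer's own encoder and that mean/covariance statistics of clean sentences have been precomputed; the Mahalanobis-style out-of-distribution scoring and the threshold value are illustrative stand-ins, not the paper's exact procedure.

```python
import numpy as np

def mahalanobis_ood_scores(sent_embs, clean_mean, clean_cov_inv):
    """Score each sentence embedding by its squared Mahalanobis distance
    to clean-corpus statistics; higher means more likely to be noise."""
    diffs = sent_embs - clean_mean                              # (n, d)
    return np.einsum("nd,de,ne->n", diffs, clean_cov_inv, diffs)

def filter_noisy_sentences(sentences, sent_embs, clean_mean, clean_cov_inv,
                           threshold):
    """Drop sentences whose OOD score exceeds the threshold and return
    the cleaned document for the summarizer."""
    scores = mahalanobis_ood_scores(sent_embs, clean_mean, clean_cov_inv)
    kept = [s for s, sc in zip(sentences, scores) if sc <= threshold]
    return " ".join(kept)

# Toy usage: 3 sentences with 4-dim embeddings; clean-corpus stats are given.
rng = np.random.default_rng(0)
clean_mean, clean_cov_inv = np.zeros(4), np.eye(4)
sents = ["Clean sentence one.", "@@ boilerplate ##", "Clean sentence two."]
embs = np.stack([rng.normal(0, 1, 4),   # in-distribution
                 rng.normal(6, 1, 4),   # far from clean stats -> flagged
                 rng.normal(0, 1, 4)])
print(filter_noisy_sentences(sents, embs, clean_mean, clean_cov_inv, 30.0))
```

The cleaned string would then be passed to the summarization model as usual; no retraining or auxiliary model is involved.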
Related papers
- Improving Noise Robustness through Abstractions and its Impact on Machine Learning [2.6563873893593826]
Noise is a fundamental problem in learning theory, with large effects on the application of Machine Learning (ML) methods.
In this paper, we propose a method to deal with noise: mitigating its effect through the use of data abstractions.
The goal is to reduce the effect of noise on the model's performance via the information loss that the abstraction introduces.
arXiv Detail & Related papers (2024-06-12T17:14:44Z)
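One hedged reading of "abstraction" is coarsening the data so small noisy perturbations collapse to the same value; the binning scheme below is a hypothetical illustration, not necessarily the paper's construction.

```python
import numpy as np

def abstract_features(X, n_bins=8):
    """Map each continuous feature to a coarse bin id. Small noisy
    perturbations usually land in the same bin, so the abstraction
    discards the noise along with some information."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    bins = ((X - lo) / (hi - lo + 1e-9) * n_bins).astype(int)
    return np.clip(bins, 0, n_bins - 1)

X = np.array([[0.50, 1.02], [0.52, 0.99], [0.90, 0.10]])  # noisy measurements
print(abstract_features(X))  # rows 0 and 1 collapse to the same abstraction
```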
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
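A rough sketch of the affine-tuning idea, assuming a frozen (noisily pre-trained) encoder; the module shape and training objective here are illustrative, not NMTune's actual regularizers.

```python
import torch
import torch.nn as nn

class AffineTune(nn.Module):
    """Hypothetical black-box tuning head in the spirit of NMTune: keep the
    noisy pre-trained encoder frozen and learn only an affine map on its
    features before the task head."""
    def __init__(self, dim, n_classes):
        super().__init__()
        self.affine = nn.Linear(dim, dim)    # learned feature-space transform
        self.head = nn.Linear(dim, n_classes)

    def forward(self, frozen_feats):
        return self.head(torch.relu(self.affine(frozen_feats)))

model = AffineTune(dim=16, n_classes=3)
feats = torch.randn(4, 16)   # features from a frozen, noisily pre-trained encoder
logits = model(feats)        # only the affine map and head receive gradients
print(logits.shape)          # torch.Size([4, 3])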
- Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC), which consists of a detection module and a correction module.
We conduct experiments on the text-retrieval benchmarks Natural Questions and TriviaQA and on the code-search benchmarks StaQC and SO-DS.
arXiv Detail & Related papers (2023-11-07T08:27:14Z)
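A toy rendering of the detect-then-correct pattern; the loss-outlier rule and the down-weighting step are stand-ins for NPC's actual detection and correction modules.

```python
import numpy as np

def detect_noisy_pairs(per_pair_loss, k=1.5):
    """Flag training pairs whose loss is an outlier (mean + k*std);
    this plays the role of a detection module."""
    mu, sigma = per_pair_loss.mean(), per_pair_loss.std()
    return per_pair_loss > mu + k * sigma

def correct_weights(per_pair_loss, noisy_mask, floor=0.1):
    """A stand-in correction module: soften the influence of suspect pairs."""
    weights = np.ones_like(per_pair_loss)
    weights[noisy_mask] = floor
    return weights

losses = np.array([0.3, 0.4, 0.2, 3.5, 0.35])   # pair 3 looks mislabeled
mask = detect_noisy_pairs(losses)
print(mask, correct_weights(losses, mask))
```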
- Towards Robust and Generalizable Training: An Empirical Study of Noisy Slot Filling for Input Perturbations [38.766702041991046]
We introduce a noise-robustness evaluation dataset named Noise-SF for the slot filling task.
The proposed dataset contains five types of human-annotated noise.
We find that baseline models perform poorly in this robustness evaluation.
arXiv Detail & Related papers (2023-10-05T12:59:57Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a lightweight black-box tuning method (NMTune) that affines the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
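One plausible way to expose the two knobs the paper reports tuning (spatial resolution and magnitude of the corruption noise) is to sample noise on a coarse grid and upsample it; the repetition-based upsampling below is an assumption, not the paper's implementation.

```python
import numpy as np

def coarse_noise(shape, res=4, magnitude=0.2, rng=None):
    """Low-frequency Gaussian noise: sample on a coarse (res x res) grid,
    then upsample by repetition. `res` controls spatial resolution and
    `magnitude` controls strength."""
    rng = rng or np.random.default_rng()
    h, w = shape
    coarse = rng.normal(0.0, magnitude, size=(res, res))
    return np.kron(coarse, np.ones((h // res, w // res)))

image = np.zeros((32, 32))      # stand-in for a normal brain slice
corrupted = image + coarse_noise(image.shape, res=4, magnitude=0.2)
# a denoising model would be trained to map `corrupted` back to `image`
print(corrupted.shape)
```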
- Denoising Enhanced Distantly Supervised Ultrafine Entity Typing [36.14308856513851]
We build a noise model to estimate the unknown labeling noise distribution over input contexts and noisy type labels.
With the noise model, more trustworthy labels can be recovered by subtracting the estimated noise from the input.
We propose an entity typing model that adopts a bi-encoder architecture and is trained on the denoised data.
arXiv Detail & Related papers (2022-10-18T05:20:16Z)
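A simplified sketch of recovering labels by subtracting the noise model's estimate; the score and noise values are toy numbers, and the subtract-then-threshold step is a hedged rendering of the blurb's description.

```python
import numpy as np

def denoise_labels(noisy_label_scores, estimated_noise, threshold=0.5):
    """Subtract the noise model's estimate from the observed (distantly
    supervised) label scores and re-threshold to recover labels."""
    cleaned = np.clip(noisy_label_scores - estimated_noise, 0.0, 1.0)
    return cleaned >= threshold

scores = np.array([0.9, 0.7, 0.6])    # distant-supervision type scores
noise = np.array([0.1, 0.5, 0.05])    # noise mass attributed to each type
print(denoise_labels(scores, noise))  # [ True False  True ]
```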
- Deep Active Learning with Noise Stability [24.54974925491753]
Uncertainty estimation for unlabeled data is crucial to active learning.
We propose a novel algorithm that leverages noise stability to estimate data uncertainty.
Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis.
arXiv Detail & Related papers (2022-05-26T13:21:01Z)
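A hedged sketch of noise-stability-based uncertainty: perturb the weights slightly, measure how much the outputs move, and query the least stable samples. The perturbation scale and the norm-based aggregation are illustrative choices, not necessarily the paper's.

```python
import copy
import torch
import torch.nn as nn

def noise_stability_scores(model, x, sigma=0.01, n_trials=8):
    """Perturb all weights with small Gaussian noise and measure how much
    the outputs move; unstable outputs are treated as uncertain."""
    with torch.no_grad():
        base = model(x)
        score = torch.zeros(x.shape[0])
        for _ in range(n_trials):
            noisy = copy.deepcopy(model)
            for p in noisy.parameters():
                p.add_(sigma * p.abs().mean() * torch.randn_like(p))
            score += (noisy(x) - base).norm(dim=1)
    return score / n_trials

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 5))
pool = torch.randn(100, 10)                 # unlabeled pool
scores = noise_stability_scores(model, pool)
query = scores.topk(10).indices             # most unstable -> query labels
print(query)
```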
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that the method not only outperforms baseline models on real-world (noisy) corpora but also enhances robustness, producing high-quality results in noisy environments.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
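A minimal sketch of pulling clean and noisy utterance encodings into a similar vector space, assuming paired clean/noisy samples are available; the MSE alignment objective is an illustrative stand-in for the paper's domain-adaptation method.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU(), nn.Linear(128, 64))

def alignment_loss(clean_batch, noisy_batch):
    """Pull encodings of noisy (e.g., ASR-corrupted) utterances toward the
    encodings of their clean counterparts so both land in a similar space."""
    return nn.functional.mse_loss(encoder(noisy_batch),
                                  encoder(clean_batch).detach())

clean = torch.randn(8, 300)                # clean-text utterance features
noisy = clean + 0.3 * torch.randn(8, 300)  # simulated low-quality variants
loss = alignment_loss(clean, noisy)
loss.backward()                            # trains encoder to be noise-invariant
print(float(loss))
```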