Per-Example Gradient Regularization Improves Learning Signals from Noisy
Data
- URL: http://arxiv.org/abs/2303.17940v1
- Date: Fri, 31 Mar 2023 10:08:23 GMT
- Title: Per-Example Gradient Regularization Improves Learning Signals from Noisy
Data
- Authors: Xuran Meng, Yuan Cao and Difan Zou
- Abstract summary: Empirical evidence suggests that the gradient regularization technique can significantly enhance the robustness of deep learning models against noisy perturbations.
We present a theoretical analysis of per-example gradient regularization (PEGR) that demonstrates its effectiveness in improving both test error and robustness against noise perturbations.
Our analysis reveals that PEGR penalizes the variance of pattern learning, thus effectively suppressing the memorization of noise from the training data.
- Score: 25.646054298195434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient regularization, as described in \citet{barrett2021implicit}, is a
highly effective technique for promoting flat minima during gradient descent.
Empirical evidence suggests that this regularization technique can
significantly enhance the robustness of deep learning models against noisy
perturbations, while also reducing test error. In this paper, we explore the
per-example gradient regularization (PEGR) and present a theoretical analysis
that demonstrates its effectiveness in improving both test error and robustness
against noise perturbations. Specifically, we adopt a signal-noise data model
from \citet{cao2022benign} and show that PEGR can learn signals effectively
while suppressing noise. In contrast, standard gradient descent struggles to
distinguish the signal from the noise, leading to suboptimal generalization
performance. Our analysis reveals that PEGR penalizes the variance of pattern
learning, thus effectively suppressing the memorization of noise from the
training data. These findings underscore the importance of variance control in
deep learning training and offer useful insights for developing more effective
training approaches.
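For concreteness, below is a minimal PyTorch sketch of one natural form of per-example gradient regularization: the empirical loss is augmented with the average squared norm of each example's own loss gradient. The model, data, and regularization weight `lam` are illustrative placeholders, and the exact objective and experimental setup used in the paper may differ.
```python
import torch
import torch.nn as nn

def pegr_loss(model, x, y, criterion, lam=0.1):
    """Mean loss plus lam * average squared norm of each example's gradient."""
    params = [p for p in model.parameters() if p.requires_grad]
    losses, penalty = [], 0.0
    # Per-example loop for clarity; torch.func.vmap/grad would be faster.
    for i in range(x.shape[0]):
        li = criterion(model(x[i:i + 1]), y[i:i + 1])
        # create_graph=True keeps the penalty differentiable w.r.t. the params.
        grads = torch.autograd.grad(li, params, create_graph=True)
        penalty = penalty + sum(g.pow(2).sum() for g in grads)
        losses.append(li)
    n = x.shape[0]
    return torch.stack(losses).mean() + lam * penalty / n

# Toy usage on random data (illustrative only).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
opt.zero_grad()
loss = pegr_loss(model, x, y, nn.CrossEntropyLoss())
loss.backward()
opt.step()
```
Note that this per-example penalty differs from full-batch gradient regularization (which penalizes the squared norm of the averaged gradient) exactly by the trace of the empirical per-example gradient covariance, which is one way to see the variance-control effect described in the abstract.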
Related papers
- Optimized Gradient Clipping for Noisy Label Learning [26.463965846251938]
We propose a simple yet effective approach called Optimized Gradient Clipping (OGC)
OGC dynamically adjusts the clipping threshold based on the ratio of noise gradients to clean gradients after clipping.
Our experiments across various types of label noise, including symmetric, asymmetric, instance-dependent, and real-world noise, demonstrate the effectiveness of OGC.
arXiv Detail & Related papers (2024-12-12T05:08:05Z) - Improved Noise Schedule for Diffusion Training [51.849746576387375]
We propose a novel approach to design the noise schedule for enhancing the training of diffusion models.
We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule.
arXiv Detail & Related papers (2024-07-03T17:34:55Z) - Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise [2.048226951354646]
We show that resetting from a checkpoint can significantly improve generalization performance when training deep neural networks (DNNs) with noisy labels.
In the presence of noisy labels, DNNs initially learn the general patterns of the data but then gradually memorize the corrupted data, leading to overfitting.
By deconstructing the dynamics of stochastic gradient descent (SGD), we identify the behavior of a latent gradient bias induced by noisy labels, which harms generalization (a schematic sketch of checkpoint resetting appears after this list).
arXiv Detail & Related papers (2024-06-01T10:45:41Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Inference Stage Denoising for Undersampled MRI Reconstruction [13.8086726938161]
Reconstruction of magnetic resonance imaging (MRI) data has been positively affected by deep learning.
A key challenge remains: to improve generalisation to distribution shifts between the training and testing data.
arXiv Detail & Related papers (2024-02-12T12:50:10Z) - Boosting of Implicit Neural Representation-based Image Denoiser [2.2452191187045383]
Implicit Neural Representation (INR) has emerged as an effective method for unsupervised image denoising.
We propose a general recipe for regularizing INR models in image denoising.
arXiv Detail & Related papers (2024-01-03T05:51:25Z) - Understanding and Mitigating the Label Noise in Pre-training on
Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z) - Feature Noise Boosts DNN Generalization under Label Noise [65.36889005555669]
The presence of label noise in the training data has a profound impact on the generalization of deep neural networks (DNNs).
In this study, we introduce and theoretically demonstrate a simple feature noise method, which directly adds noise to the features of training data.
arXiv Detail & Related papers (2023-08-03T08:31:31Z) - Advancing underwater acoustic target recognition via adaptive data
pruning and smoothness-inducing regularization [27.039672355700198]
We propose a strategy based on cross-entropy to prune excessively similar segments in training data.
We generate noisy samples and apply smoothness-inducing regularization based on KL divergence to mitigate overfitting (a minimal sketch of this consistency-style regularization appears after this list).
arXiv Detail & Related papers (2023-04-24T08:30:41Z) - The role of noise in denoising models for anomaly detection in medical
images [62.0532151156057]
Pathological brain lesions exhibit diverse appearances in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z) - A Self-Refinement Strategy for Noise Reduction in Grammatical Error
Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
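The stochastic-resetting entry above describes periodically restoring an early checkpoint so that general patterns learned before memorization sets in are preserved. The sketch below illustrates the general idea in a plain PyTorch-style training loop; the reset probability, checkpoint epoch, and loop structure are illustrative assumptions rather than that paper's tuned procedure.
```python
import copy
import random

def train_with_stochastic_resetting(model, opt, loader, criterion,
                                    epochs=30, checkpoint_epoch=5,
                                    reset_prob=0.1):
    """Standard training loop that occasionally restores an early checkpoint."""
    checkpoint = None
    for epoch in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            criterion(model(x), y).backward()
            opt.step()
        if epoch + 1 == checkpoint_epoch:
            # Snapshot weights while the model still fits mostly general patterns.
            checkpoint = copy.deepcopy(model.state_dict())
        elif checkpoint is not None and random.random() < reset_prob:
            # Stochastic reset: discard later memorization of corrupted labels.
            model.load_state_dict(checkpoint)
    return model
```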
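The smoothness-inducing regularization mentioned in the underwater acoustic target recognition entry is a form of consistency training: predictions on a clean input and on a noise-perturbed copy are pulled together with a KL term. The sketch below illustrates this general recipe; the Gaussian input noise and the weight `beta` are assumptions, not that paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def smoothness_loss(model, x, y, beta=1.0, noise_std=0.1):
    """Cross-entropy plus a KL consistency term between clean and noisy views."""
    logits_clean = model(x)
    logits_noisy = model(x + noise_std * torch.randn_like(x))
    ce = F.cross_entropy(logits_clean, y)
    # KL(p_clean || p_noisy), with both distributions passed as log-probabilities.
    kl = F.kl_div(F.log_softmax(logits_noisy, dim=-1),
                  F.log_softmax(logits_clean, dim=-1),
                  reduction="batchmean", log_target=True)
    return ce + beta * kl
```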