Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
- URL: http://arxiv.org/abs/2303.17940v1
- Date: Fri, 31 Mar 2023 10:08:23 GMT
- Title: Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
- Authors: Xuran Meng, Yuan Cao and Difan Zou
- Abstract summary: Empirical evidence suggests that gradient regularization can significantly enhance the robustness of deep learning models against noisy perturbations.
We present a theoretical analysis that demonstrates its effectiveness in improving both test error and robustness against noise perturbations.
Our analysis reveals that PEGR penalizes the variance of pattern learning, thus effectively suppressing the memorization of noise from the training data.
- Score: 25.646054298195434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient regularization, as described in \citet{barrett2021implicit}, is a
highly effective technique for promoting flat minima during gradient descent.
Empirical evidence suggests that this regularization technique can
significantly enhance the robustness of deep learning models against noisy
perturbations, while also reducing test error. In this paper, we explore
per-example gradient regularization (PEGR) and present a theoretical analysis
that demonstrates its effectiveness in improving both test error and robustness
against noise perturbations. Specifically, we adopt a signal-noise data model
from \citet{cao2022benign} and show that PEGR can learn signals effectively
while suppressing noise. In contrast, standard gradient descent struggles to
distinguish the signal from the noise, leading to suboptimal generalization
performance. Our analysis reveals that PEGR penalizes the variance of pattern
learning, thus effectively suppressing the memorization of noise from the
training data. These findings underscore the importance of variance control in
deep learning training and offer useful insights for developing more effective
training approaches.
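To make the mechanism concrete, here is a minimal sketch of PEGR in JAX: each example contributes its own loss plus a penalty on the squared norm of its own gradient, in contrast to standard gradient regularization, which penalizes only the norm of the batch-averaged gradient. The toy linear model, the data shapes, and the penalty weight `lam` are illustrative assumptions, not the paper's experimental setup.

```python
import jax
import jax.numpy as jnp

def per_example_loss(params, x, y):
    # Toy linear model with squared error; a stand-in for any differentiable model.
    pred = jnp.dot(x, params)
    return 0.5 * (pred - y) ** 2

def pegr_objective(params, xs, ys, lam=0.1):
    # Per-example losses and per-example gradients via vmap.
    batched_loss = jax.vmap(per_example_loss, in_axes=(None, 0, 0))
    batched_grad = jax.vmap(jax.grad(per_example_loss), in_axes=(None, 0, 0))
    losses = batched_loss(params, xs, ys)
    grads = batched_grad(params, xs, ys)   # shape: (batch, dim)
    # PEGR penalizes the average squared norm of each example's own gradient,
    # rather than the norm of the batch-averaged gradient.
    penalty = jnp.mean(jnp.sum(grads ** 2, axis=-1))
    return jnp.mean(losses) + 0.5 * lam * penalty

# One gradient-descent step on the PEGR objective (differentiating through
# the per-example gradients requires second-order autodiff, which JAX supports).
grad_step = jax.jit(jax.grad(pegr_objective))

xs = jax.random.normal(jax.random.PRNGKey(0), (32, 8))
ys = jax.random.normal(jax.random.PRNGKey(1), (32,))
params = jnp.zeros(8)
params = params - 0.01 * grad_step(params, xs, ys)
```

Since the mean of the squared per-example gradient norms equals the squared norm of the mean gradient plus the variance of the per-example gradients, this penalty differs from standard gradient regularization exactly by a variance term, which matches the abstract's reading of PEGR as variance control.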
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Inference Stage Denoising for Undersampled MRI Reconstruction [13.8086726938161]
Reconstruction of magnetic resonance imaging (MRI) data has been positively affected by deep learning.
A key challenge remains: to improve generalisation to distribution shifts between the training and testing data.
arXiv Detail & Related papers (2024-02-12T12:50:10Z)
- Boosting of Implicit Neural Representation-based Image Denoiser [2.2452191187045383]
Implicit Neural Representation (INR) has emerged as an effective method for unsupervised image denoising.
We propose a general recipe for regularizing INR models in image denoising.
arXiv Detail & Related papers (2024-01-03T05:51:25Z)
- Analyze the Robustness of Classifiers under Label Noise [5.708964539699851]
Label noise in supervised learning, characterized by erroneous or imprecise labels, significantly impairs model performance.
This research focuses on the increasingly pertinent issue of label noise's impact on practical applications.
arXiv Detail & Related papers (2023-12-12T13:51:25Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Feature Noise Boosts DNN Generalization under Label Noise [65.36889005555669]
The presence of label noise in the training data has a profound impact on the generalization of deep neural networks (DNNs).
In this study, we introduce and theoretically demonstrate a simple feature noise method, which directly adds noise to the features of training data (a minimal sketch appears after this list).
arXiv Detail & Related papers (2023-08-03T08:31:31Z)
- Advancing underwater acoustic target recognition via adaptive data pruning and smoothness-inducing regularization [27.039672355700198]
We propose a strategy based on cross-entropy to prune excessively similar segments in training data.
We generate noisy samples and apply smoothness-inducing regularization based on KL divergence to mitigate overfitting (see the second sketch after this list).
arXiv Detail & Related papers (2023-04-24T08:30:41Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Learning Sparsity-Promoting Regularizers using Bilevel Optimization [9.18465987536469]
We present a method for supervised learning of sparsity-promoting regularizers for denoising signals and images.
Experiments with structured 1D signals and natural images show that the proposed method can learn an operator that outperforms well-known regularizers.
arXiv Detail & Related papers (2022-07-18T20:50:02Z)
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training over-parameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
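As referenced above, here is a minimal sketch of the feature-noise idea from "Feature Noise Boosts DNN Generalization under Label Noise": Gaussian noise is added directly to the training features before they reach the model. The noise scale `sigma` and the data shapes are assumptions for illustration, not the paper's settings.

```python
import jax
import jax.numpy as jnp

def add_feature_noise(key, xs, sigma=0.1):
    # Perturb the training features directly with i.i.d. Gaussian noise.
    return xs + sigma * jax.random.normal(key, xs.shape)

xs = jax.random.normal(jax.random.PRNGKey(0), (32, 8))  # a toy feature batch
noisy_xs = add_feature_noise(jax.random.PRNGKey(1), xs)
```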
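And a sketch of smoothness-inducing regularization in the spirit of the underwater acoustic target recognition paper: penalize the KL divergence between the model's predictions on clean and noise-perturbed inputs, added to the usual task loss. The linear classifier `logits_fn` and the noise scale are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def logits_fn(params, x):
    # Toy linear classifier; params has shape (dim, num_classes).
    return x @ params

def kl_smoothness_penalty(params, key, xs, sigma=0.05):
    clean_logp = jax.nn.log_softmax(logits_fn(params, xs))
    noisy_xs = xs + sigma * jax.random.normal(key, xs.shape)
    noisy_logp = jax.nn.log_softmax(logits_fn(params, noisy_xs))
    # KL(p_clean || p_noisy), averaged over the batch.
    kl = jnp.sum(jnp.exp(clean_logp) * (clean_logp - noisy_logp), axis=-1)
    return jnp.mean(kl)
```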