Per-Example Gradient Regularization Improves Learning Signals from Noisy
Data
- URL: http://arxiv.org/abs/2303.17940v1
- Date: Fri, 31 Mar 2023 10:08:23 GMT
- Title: Per-Example Gradient Regularization Improves Learning Signals from Noisy
Data
- Authors: Xuran Meng, Yuan Cao and Difan Zou
- Abstract summary: Empirical evidence suggests that the gradient regularization technique can significantly enhance the robustness of deep learning models against noisy perturbations.
We present a theoretical analysis of per-example gradient regularization (PEGR) that demonstrates its effectiveness in improving both test error and robustness against noise perturbations.
Our analysis reveals that PEGR penalizes the variance of pattern learning, thus effectively suppressing the memorization of noise from the training data.
- Score: 25.646054298195434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient regularization, as described in \citet{barrett2021implicit}, is a
highly effective technique for promoting flat minima during gradient descent.
Empirical evidence suggests that this regularization technique can
significantly enhance the robustness of deep learning models against noisy
perturbations, while also reducing test error. In this paper, we explore the
per-example gradient regularization (PEGR) and present a theoretical analysis
that demonstrates its effectiveness in improving both test error and robustness
against noise perturbations. Specifically, we adopt a signal-noise data model
from \citet{cao2022benign} and show that PEGR can learn signals effectively
while suppressing noise. In contrast, standard gradient descent struggles to
distinguish the signal from the noise, leading to suboptimal generalization
performance. Our analysis reveals that PEGR penalizes the variance of pattern
learning, thus effectively suppressing the memorization of noise from the
training data. These findings underscore the importance of variance control in
deep learning training and offer useful insights for developing more effective
training approaches.
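For concreteness, below is a minimal PyTorch sketch of one natural form of per-example gradient regularization: the empirical loss is augmented with the average squared norm of each example's own loss gradient. The model, data, and regularization weight `lam` are illustrative placeholders, and the exact objective and experimental setup used in the paper may differ.
```python
import torch
import torch.nn as nn

def pegr_loss(model, x, y, criterion, lam=0.1):
    """Mean loss plus lam * average squared norm of each example's gradient."""
    params = [p for p in model.parameters() if p.requires_grad]
    losses, penalty = [], 0.0
    # Per-example loop for clarity; torch.func.vmap/grad would be faster.
    for i in range(x.shape[0]):
        li = criterion(model(x[i:i + 1]), y[i:i + 1])
        # create_graph=True keeps the penalty differentiable w.r.t. the params.
        grads = torch.autograd.grad(li, params, create_graph=True)
        penalty = penalty + sum(g.pow(2).sum() for g in grads)
        losses.append(li)
    n = x.shape[0]
    return torch.stack(losses).mean() + lam * penalty / n

# Toy usage on random data (illustrative only).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.05)
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
opt.zero_grad()
loss = pegr_loss(model, x, y, nn.CrossEntropyLoss())
loss.backward()
opt.step()
```
Note that this per-example penalty differs from full-batch gradient regularization (which penalizes the squared norm of the averaged gradient) exactly by the trace of the empirical per-example gradient covariance, which is one way to see the variance-control effect described in the abstract.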
Related papers
- Optimized Gradient Clipping for Noisy Label Learning [26.463965846251938]
We propose a simple yet effective approach called Optimized Gradient Clipping (OGC)
OGC dynamically adjusts the clipping threshold based on the ratio of noise gradients to clean gradients after clipping.
Our experiments across various types of label noise, including symmetric, asymmetric, instance-dependent, and real-world noise, demonstrate the effectiveness of OGC.
arXiv Detail & Related papers (2024-12-12T05:08:05Z) - Improved Noise Schedule for Diffusion Training [51.849746576387375]
We propose a novel approach to design the noise schedule for enhancing the training of diffusion models.
We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule.
arXiv Detail & Related papers (2024-07-03T17:34:55Z) - Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise [2.048226951354646]
We show that resetting from a checkpoint can significantly improve generalization performance when training deep neural networks (DNNs) with noisy labels.
In the presence of noisy labels, DNNs initially learn the general patterns of the data but then gradually memorize the corrupted data, leading to overfitting.
By deconstructing the dynamics of stochastic gradient descent (SGD), we identify the behavior of a latent gradient bias induced by noisy labels, which harms generalization (a schematic sketch of checkpoint resetting appears after this list).
arXiv Detail & Related papers (2024-06-01T10:45:41Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Inference Stage Denoising for Undersampled MRI Reconstruction [13.8086726938161]
Reconstruction of magnetic resonance imaging (MRI) data has been positively affected by deep learning.
A key challenge remains: to improve generalisation to distribution shifts between the training and testing data.
arXiv Detail & Related papers (2024-02-12T12:50:10Z) - Boosting of Implicit Neural Representation-based Image Denoiser [2.2452191187045383]
Implicit Neural Representation (INR) has emerged as an effective method for unsupervised image denoising.
We propose a general recipe for regularizing INR models in image denoising.
arXiv Detail & Related papers (2024-01-03T05:51:25Z) - Understanding and Mitigating the Label Noise in Pre-training on
Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z) - Feature Noise Boosts DNN Generalization under Label Noise [65.36889005555669]
The presence of label noise in the training data has a profound impact on the generalization of deep neural networks (DNNs).
In this study, we introduce and theoretically demonstrate a simple feature noise method, which directly adds noise to the features of training data.
arXiv Detail & Related papers (2023-08-03T08:31:31Z) - Advancing underwater acoustic target recognition via adaptive data
pruning and smoothness-inducing regularization [27.039672355700198]
We propose a strategy based on cross-entropy to prune excessively similar segments in training data.
We generate noisy samples and apply smoothness-inducing regularization based on KL divergence to mitigate overfitting (a minimal sketch of this consistency-style regularization appears after this list).
arXiv Detail & Related papers (2023-04-24T08:30:41Z) - The role of noise in denoising models for anomaly detection in medical
images [62.0532151156057]
Pathological brain lesions exhibit diverse appearances in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z) - A Self-Refinement Strategy for Noise Reduction in Grammatical Error
Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
There is a non-negligible amount of "noise" where errors were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
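The stochastic-resetting entry above describes periodically restoring an early checkpoint so that general patterns learned before memorization sets in are preserved. The sketch below illustrates the general idea in a plain PyTorch-style training loop; the reset probability, checkpoint epoch, and loop structure are illustrative assumptions rather than that paper's tuned procedure.
```python
import copy
import random

def train_with_stochastic_resetting(model, opt, loader, criterion,
                                    epochs=30, checkpoint_epoch=5,
                                    reset_prob=0.1):
    """Standard training loop that occasionally restores an early checkpoint."""
    checkpoint = None
    for epoch in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            criterion(model(x), y).backward()
            opt.step()
        if epoch + 1 == checkpoint_epoch:
            # Snapshot weights while the model still fits mostly general patterns.
            checkpoint = copy.deepcopy(model.state_dict())
        elif checkpoint is not None and random.random() < reset_prob:
            # Stochastic reset: discard later memorization of corrupted labels.
            model.load_state_dict(checkpoint)
    return model
```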
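The smoothness-inducing regularization mentioned in the underwater acoustic target recognition entry is a form of consistency training: predictions on a clean input and on a noise-perturbed copy are pulled together with a KL term. The sketch below illustrates this general recipe; the Gaussian input noise and the weight `beta` are assumptions, not that paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def smoothness_loss(model, x, y, beta=1.0, noise_std=0.1):
    """Cross-entropy plus a KL consistency term between clean and noisy views."""
    logits_clean = model(x)
    logits_noisy = model(x + noise_std * torch.randn_like(x))
    ce = F.cross_entropy(logits_clean, y)
    # KL(p_clean || p_noisy), with both distributions passed as log-probabilities.
    kl = F.kl_div(F.log_softmax(logits_noisy, dim=-1),
                  F.log_softmax(logits_clean, dim=-1),
                  reduction="batchmean", log_target=True)
    return ce + beta * kl
```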