Noisy Recurrent Neural Networks
- URL: http://arxiv.org/abs/2102.04877v1
- Date: Tue, 9 Feb 2021 15:20:50 GMT
- Title: Noisy Recurrent Neural Networks
- Authors: Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W.
Mahoney
- Abstract summary: We study recurrent neural networks (RNNs) trained by injecting noise into hidden states as discretizations of differential equations driven by input data.
We find that, under reasonable assumptions, this implicit regularization promotes flatter minima; it biases towards models with more stable dynamics; and, in classification tasks, it favors models with larger classification margin.
Our theory is supported by empirical results which demonstrate improved robustness with respect to various input perturbations, while maintaining state-of-the-art performance.
- Score: 45.94390701863504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We provide a general framework for studying recurrent neural networks (RNNs)
trained by injecting noise into hidden states. Specifically, we consider RNNs
that can be viewed as discretizations of stochastic differential equations
driven by input data. This framework allows us to study the implicit
regularization effect of general noise injection schemes by deriving an
approximate explicit regularizer in the small noise regime. We find that, under
reasonable assumptions, this implicit regularization promotes flatter minima;
it biases towards models with more stable dynamics; and, in classification
tasks, it favors models with larger classification margin. Sufficient
conditions for global stability are obtained, highlighting the phenomenon of
stochastic stabilization, where noise injection can improve stability during
training. Our theory is supported by empirical results which demonstrate
improved robustness with respect to various input perturbations, while
maintaining state-of-the-art performance.
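To make the framework concrete, the following is a minimal sketch (not the authors' code) of one instance it covers: an Euler-Maruyama discretization of an input-driven SDE with additive Gaussian noise injected into the hidden state during training. The module name, step size, and noise level are illustrative choices.

```python
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    """One Euler-Maruyama step of dh = f(h, x) dt + sigma dW.

    A minimal sketch: additive Gaussian noise injected into the hidden
    state. The paper's framework also covers other injection schemes
    (e.g. multiplicative noise); this is just one simple instance.
    """

    def __init__(self, input_size, hidden_size, dt=0.1, sigma=0.05):
        super().__init__()
        self.W = nn.Linear(hidden_size, hidden_size)
        self.U = nn.Linear(input_size, hidden_size, bias=False)
        self.dt, self.sigma = dt, sigma

    def forward(self, x, h):
        drift = torch.tanh(self.W(h) + self.U(x))
        h = h + self.dt * drift
        if self.training:  # noise is injected only during training
            h = h + self.sigma * (self.dt ** 0.5) * torch.randn_like(h)
        return h
```

Since the noise enters only in training mode, the implicit regularization studied in the paper acts through the learned weights; at test time the cell is deterministic.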
Related papers
- xAI-Drop: Don't Use What You Cannot Explain [23.33477769275026]
Graph Neural Networks (GNNs) have emerged as the predominant paradigm for learning from graph-structured data.
GNNs face challenges such as oversmoothing, lack of generalization and poor interpretability.
We introduce xAI-Drop, a novel topological-level dropping regularizer that leverages explainability to pinpoint noisy network elements.
arXiv Detail & Related papers (2024-07-29T14:53:45Z)
- Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation [91.83820250747935]
Pseudo-label noise is mainly contained in unstable samples in which predictions of most pixels undergo significant variations during self-training.
We introduce the Stable Neighbor Denoising (SND) approach, which effectively discovers highly correlated stable and unstable samples.
SND consistently outperforms state-of-the-art methods in various SFUDA semantic segmentation settings.
arXiv Detail & Related papers (2024-06-10T21:44:52Z)
- Noise Injection Node Regularization for Robust Learning [0.0]
Noise Injection Node Regularization (NINR) is a method of injecting structured noise into deep neural networks (DNNs) during the training stage, resulting in an emergent regularizing effect.
We present theoretical and empirical evidence for substantial improvement in robustness against various test data perturbations for feed-forward DNNs when trained under NINR.
arXiv Detail & Related papers (2022-10-27T20:51:15Z)
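As a rough, hedged illustration of the NINR idea described above (the exact construction in the paper may differ), one can append pure-noise input nodes that are active during training and silenced at test time:

```python
import torch
import torch.nn as nn

class NoiseInjectionNodes(nn.Module):
    """Prepend k pure-noise input nodes to a network's (2D) input.

    A hedged sketch of the NINR idea: the network sees extra inputs
    that carry noise during training and zeros at test time. The
    number of nodes and the Gaussian choice are illustrative.
    """

    def __init__(self, k=4, scale=1.0):
        super().__init__()
        self.k, self.scale = k, scale

    def forward(self, x):
        if self.training:
            noise = self.scale * torch.randn(x.shape[0], self.k, device=x.device)
        else:
            noise = torch.zeros(x.shape[0], self.k, device=x.device)
        return torch.cat([x, noise], dim=1)
```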
- Sparsity in Continuous-Depth Neural Networks [2.969794498016257]
We study the influence of weight and feature sparsity on forecasting and on identifying the underlying dynamical laws.
We curate real-world datasets consisting of human motion capture and human hematopoiesis single-cell RNA-seq data.
arXiv Detail & Related papers (2022-10-26T12:48:12Z)
- Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation [14.244787327283335]
We study the role of label noise in the training dynamics of a quadratically parametrised model through its continuous-time version.
Our findings highlight that structured noise can induce better generalisation, helping to explain the stronger performance of stochastic dynamics observed in practice.
arXiv Detail & Related papers (2022-06-20T15:24:42Z)
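The flavour of this result can be checked with a small simulation (illustrative only, not the paper's setup): SGD with label noise on a quadratically parametrised linear model, w = u ⊙ u − v ⊙ v, tends to pick a sparse interpolant of an underdetermined system, as a Lasso penalty would.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse ground truth: 2 of 40 coordinates active; 20 < 40 samples,
# so many interpolating solutions exist and implicit bias decides.
n, d = 20, 40
w_star = np.zeros(d); w_star[:2] = [1.0, -0.5]
X = rng.normal(size=(n, d))
y = X @ w_star

# Quadratic parametrisation w = u*u - v*v (so w can be negative).
u = np.full(d, 0.1); v = np.full(d, 0.1)
lr, label_noise = 0.01, 0.5

for step in range(20000):
    i = rng.integers(n)
    noisy_y = y[i] + label_noise * rng.normal()  # injected label noise
    w = u * u - v * v
    resid = X[i] @ w - noisy_y
    u -= lr * resid * 2 * u * X[i]
    v -= lr * resid * (-2) * v * X[i]

print(np.round(u * u - v * v, 2))  # tends toward a sparse, Lasso-like solution
```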
- NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections [46.745755900939216]
We introduce NoisyMix, a training scheme that combines data augmentations with stability training and noise injections to improve both model robustness and in-domain accuracy.
We demonstrate the benefits of NoisyMix on a range of benchmark datasets, including ImageNet-C, ImageNet-R, and ImageNet-P.
arXiv Detail & Related papers (2022-02-02T19:53:35Z)
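A hedged sketch of one NoisyMix ingredient, stability training (the paper combines it with mixup-style augmentations and noise injections, and its exact loss differs): penalize divergence between predictions on clean and noise-perturbed inputs.

```python
import torch
import torch.nn.functional as F

def stability_loss(model, x, y, noise_std=0.1, beta=1.0):
    """Cross-entropy on clean inputs plus a KL stability term between
    clean and noisy predictions. A sketch of one NoisyMix ingredient;
    names and the exact formulation are illustrative assumptions.
    """
    logits_clean = model(x)
    logits_noisy = model(x + noise_std * torch.randn_like(x))
    ce = F.cross_entropy(logits_clean, y)
    kl = F.kl_div(F.log_softmax(logits_noisy, dim=1),
                  F.softmax(logits_clean, dim=1).detach(),
                  reduction="batchmean")
    return ce + beta * kl
```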
- Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections [73.95786440318369]
We focus on the so-called 'implicit effect' of GNIs, which is the effect of the injected noise on the dynamics of stochastic gradient descent (SGD).
We show that this effect induces an asymmetric heavy-tailed noise on gradient updates.
We then formally prove that GNIs induce an 'implicit bias', which varies depending on the heaviness of the tails and the level of asymmetry.
arXiv Detail & Related papers (2021-02-13T21:28:09Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
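The paper's point can be seen on the classic bilinear toy game min_x max_y xy, whose simultaneous gradient flow rotates on a circle: explicit Euler (plain simultaneous gradient descent) accumulates integration error and spirals outward, while a second-order Runge-Kutta solver such as Heun's method stays near the true bounded trajectory. An illustrative simulation (not the paper's code):

```python
import numpy as np

# Toy bilinear game min_x max_y x*y. The simultaneous gradient flow
# dx/dt = -y, dy/dt = x rotates on a circle around the equilibrium.
def f(z):
    x, y = z
    return np.array([-y, x])

def euler_step(z, h):
    return z + h * f(z)

def heun_step(z, h):  # second-order Runge-Kutta (explicit trapezoid)
    k1 = f(z)
    k2 = f(z + h * k1)
    return z + 0.5 * h * (k1 + k2)

z_e = z_h = np.array([1.0, 0.0])
h = 0.1
for _ in range(500):
    z_e, z_h = euler_step(z_e, h), heun_step(z_h, h)

print(np.linalg.norm(z_e))  # Euler spirals outward (~12x the radius)
print(np.linalg.norm(z_h))  # Heun stays near the unit circle
```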
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success remains unclear.
We show that multiplicative noise commonly arises due to variance in local rates of convergence, resulting in heavy-tailed stationary behaviour in the parameters.
A detailed analysis describes the dependence on key factors, including step size, batch size, and data, all of which exhibit similar qualitative behaviour on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
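The core phenomenon is easy to reproduce (an illustrative simulation, not the paper's experiments): a Kesten-type recursion with light-tailed multiplicative and additive noise has a heavy-tailed stationary distribution whenever the multiplier is contracting on average but occasionally expands.

```python
import numpy as np

rng = np.random.default_rng(0)

# Kesten-type recursion x_{t+1} = a_t * x_t + b_t with
# E[log a_t] < 0 (contracting on average) but P(a_t > 1) > 0.
# The stationary law is heavy-tailed despite Gaussian inputs.
T, burn_in = 200_000, 1_000
x, samples = 0.0, []
for t in range(T):
    a = np.exp(rng.normal(-0.2, 0.5))  # multiplicative noise
    b = rng.normal()                   # additive noise
    x = a * x + b
    if t >= burn_in:
        samples.append(x)

s = np.abs(np.array(samples))
# Upper quantiles grow explosively, the signature of a heavy tail:
for q in [0.9, 0.99, 0.999]:
    print(q, np.quantile(s, q))
```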
- Consistency Regularization for Certified Robustness of Smoothed Classifiers [89.72878906950208]
A recent technique of randomized smoothing has shown that the worst-case $\ell_2$-robustness can be transformed into the average-case robustness.
We found that the trade-off between accuracy and certified robustness of smoothed classifiers can be greatly controlled by simply regularizing the prediction consistency over noise.
arXiv Detail & Related papers (2020-06-07T06:57:43Z)
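A hedged sketch of such a consistency regularizer (the paper's exact divergence and weighting may differ): draw a few Gaussian-noised copies of each input and penalize the divergence of each noisy prediction from their mean.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, sigma=0.25, m=2):
    """Penalize disagreement of predictions across Gaussian
    perturbations of the same input, in the spirit of consistency
    regularization for smoothed classifiers. All names and the
    KL-to-mean formulation here are illustrative assumptions.
    """
    log_probs = [F.log_softmax(model(x + sigma * torch.randn_like(x)), dim=1)
                 for _ in range(m)]
    mean_probs = torch.stack([lp.exp() for lp in log_probs]).mean(0)
    return sum(F.kl_div(lp, mean_probs.detach(), reduction="batchmean")
               for lp in log_probs) / m
```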
This list is automatically generated from the titles and abstracts of the papers on this site.