Variational Adaptive Noise and Dropout towards Stable Recurrent Neural Networks
- URL: http://arxiv.org/abs/2506.01350v1
- Date: Mon, 02 Jun 2025 06:13:22 GMT
- Title: Variational Adaptive Noise and Dropout towards Stable Recurrent Neural Networks
- Authors: Taisuke Kobayashi, Shingo Murata
- Abstract summary: This paper proposes a novel stable learning theory for recurrent neural networks (RNNs). We show that noise and dropout can be derived simultaneously by transforming the explicit regularization term into implicit regularization. Their scale and ratio can also be adjusted appropriately to optimize the main objective of RNNs.
- Score: 6.454374656250522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel stable learning theory for recurrent neural networks (RNNs), termed variational adaptive noise and dropout (VAND). As stabilizing factors for RNNs, noise and dropout on the internal state of RNNs have been separately confirmed in previous studies. We reinterpret the optimization problem of RNNs as variational inference, showing that noise and dropout can be derived simultaneously by transforming the explicit regularization term arising in the optimization problem into implicit regularization. The scale of the noise and the ratio of the dropout can also be adjusted appropriately to optimize the main objective of RNNs. In an imitation learning scenario with a mobile manipulator, only VAND is able to imitate sequential and periodic behaviors as instructed. https://youtu.be/UOho3Xr6A2w
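The abstract does not spell out the update rule, but the core idea (perturb the RNN's internal state with noise and dropout whose scale and ratio are themselves optimized alongside the task) can be sketched. Below is a minimal PyTorch sketch under that reading; the class name, the GRU backbone, and the Gaussian relaxation of dropout are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class NoisyDropoutGRUCell(nn.Module):
    """GRU cell whose hidden state is perturbed during training by additive
    Gaussian noise and a Gaussian relaxation of dropout, with a learnable
    noise scale and dropout ratio per unit. Illustrative sketch only; the
    class name and the GRU backbone are not from the paper."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        # unconstrained parameters, mapped to a positive scale / a (0,1) ratio
        self.log_sigma = nn.Parameter(torch.full((hidden_size,), -3.0))
        self.logit_p = nn.Parameter(torch.full((hidden_size,), -3.0))

    def forward(self, x, h):
        h = self.cell(x, h)
        if self.training:
            sigma = self.log_sigma.exp()          # adaptive noise scale
            p = torch.sigmoid(self.logit_p)       # adaptive dropout ratio
            h = h + sigma * torch.randn_like(h)   # additive noise on the state
            # multiplicative Gaussian noise with variance p/(1-p) approximates
            # Bernoulli dropout while staying differentiable in p
            h = h * (1.0 + (p / (1.0 - p)).sqrt() * torch.randn_like(h))
        return h
```

Note that in VAND the variational objective is what determines the scale and ratio; trained on the task loss alone, the parameters above would simply collapse toward zero perturbation, so an analogous regularizer would be needed.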
Related papers
- Game-Theoretic Gradient Control for Robust Neural Network Training [0.0]
Feed-forward neural networks (FFNNs) are vulnerable to input noise, reducing prediction performance. This study aims to enhance FFNN noise robustness by modifying backpropagation, interpreted as a multi-agent game, and exploring controlled target variable noising.
arXiv Detail & Related papers (2025-07-25T10:26:25Z)
- High-performance deep spiking neural networks with 0.3 spikes per neuron [9.01407445068455]
It is harder to train biologically-inspired spiking neural networks (SNNs) than artificial neural networks (ANNs).
We show that deep SNN models can be trained to achieve the same performance as ANNs.
Our SNN accomplishes high-performance classification with fewer than 0.3 spikes per neuron, lending itself to an energy-efficient implementation.
arXiv Detail & Related papers (2023-06-14T21:01:35Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Adaptive-saturated RNN: Remember more with less instability [2.191505742658975]
This work proposes Adaptive-Saturated RNNs (asRNN), a variant that dynamically adjusts its saturation level between the two approaches.
Our experiments show encouraging results of asRNN on challenging sequence learning benchmarks compared to several strong competitors.
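One simple way to realize a per-unit, learnable saturation level is to blend a saturating activation with a linear one, as in the sketch below; this is only one reading of the idea, and the paper's exact parameterization may differ.

```python
import torch
import torch.nn as nn

class AdaptiveSaturationCell(nn.Module):
    """Vanilla RNN cell with a learnable per-unit blend between a saturating
    (tanh) and a non-saturating (identity) activation. A rough sketch of an
    adjustable saturation level, not necessarily asRNN's formulation."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.ih = nn.Linear(input_size, hidden_size)
        self.hh = nn.Linear(hidden_size, hidden_size, bias=False)
        self.sat_logit = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h):
        pre = self.ih(x) + self.hh(h)
        a = torch.sigmoid(self.sat_logit)   # a -> 1: saturated, a -> 0: linear
        return a * torch.tanh(pre) + (1.0 - a) * pre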
arXiv Detail & Related papers (2023-04-24T02:28:03Z)
- Noise Injection Node Regularization for Robust Learning [0.0]
Noise Injection Node Regularization (NINR) is a method of injecting structured noise into Deep Neural Networks (DNN) during the training stage, resulting in an emergent regularizing effect.
We present theoretical and empirical evidence for substantial improvement in robustness against various test data perturbations for feed-forward DNNs when trained under NINR.
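A hedged reading of the method: reserve extra input nodes that carry pure noise during training and are silenced at test time. The sketch below assumes that reading; layer sizes and the noise distribution are illustrative.

```python
import torch
import torch.nn as nn

class NoiseInjectionNet(nn.Module):
    """Feed-forward net with extra input nodes that carry pure noise during
    training (zeros at test time). A hedged sketch of the noise-injection-node
    idea; sizes and the noise distribution are illustrative assumptions."""

    def __init__(self, in_dim=784, noise_dim=16, hidden=256, out_dim=10, noise_std=1.0):
        super().__init__()
        self.noise_dim, self.noise_std = noise_dim, noise_std
        self.net = nn.Sequential(
            nn.Linear(in_dim + noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        if self.training:
            # noise-carrying nodes are active only during training
            z = self.noise_std * torch.randn(x.shape[0], self.noise_dim, device=x.device)
        else:
            z = torch.zeros(x.shape[0], self.noise_dim, device=x.device)
        return self.net(torch.cat([x, z], dim=1))
```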
arXiv Detail & Related papers (2022-10-27T20:51:15Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
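For reference, interval bound propagation through an affine layer followed by ReLU looks like the sketch below; this is a minimal version of the baseline family the paper compares against, not the paper's reachability method itself.

```python
import torch

def interval_affine(lo, hi, W, b):
    """Propagate an elementwise interval [lo, hi] through y = x @ W.T + b
    using the center/radius form of interval arithmetic."""
    center, radius = (hi + lo) / 2, (hi - lo) / 2
    c = center @ W.T + b
    r = radius @ W.T.abs()
    return c - r, c + r

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval endpoints directly."""
    return lo.clamp(min=0), hi.clamp(min=0)

# usage: certified bounds for inputs within an L-infinity ball of radius eps
# lo0, hi0 = x - eps, x + eps
# lo1, hi1 = interval_relu(*interval_affine(lo0, hi0, W1, b1))
```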
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
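To illustrate the weight-tensorization idea, here is a minimal two-core tensor-train linear layer that could stand in for an RNN cell's weight matrices; the paper's full construction (core count, ranks, and how the cell's matrices are jointly encoded) is more elaborate.

```python
import torch
import torch.nn as nn

class TTLinear(nn.Module):
    """Linear map whose (m1*m2) x (n1*n2) weight matrix is stored as two
    tensor-train cores instead of a dense matrix. A minimal two-core sketch
    of weight tensorization; shapes and rank are illustrative."""

    def __init__(self, m=(8, 8), n=(8, 8), rank=4):
        super().__init__()
        self.m, self.n = m, n
        self.core1 = nn.Parameter(0.1 * torch.randn(m[0], n[0], rank))
        self.core2 = nn.Parameter(0.1 * torch.randn(rank, m[1], n[1]))

    def forward(self, x):                      # x: (batch, n1*n2)
        x = x.view(-1, self.n[0], self.n[1])   # split the input index
        # y[b,a,c] = sum_{i,j,k} x[b,i,j] * core1[a,i,k] * core2[k,c,j]
        y = torch.einsum('bij,aik,kcj->bac', x, self.core1, self.core2)
        return y.reshape(-1, self.m[0] * self.m[1])
```

With m = n = (8, 8) and rank 4, the two cores hold 512 parameters in place of the dense 64 x 64 matrix's 4096; larger shapes, more cores, and sharing across the cell's matrices are where the orders-of-magnitude savings come from.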
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Variational Inference-Based Dropout in Recurrent Neural Networks for Slot Filling in Spoken Language Understanding [9.93926378136064]
This paper generalizes the variational recurrent neural network (RNN) with variational inference (VI)-based dropout regularization.
Experiments on the ATIS dataset suggest that variational RNNs with VI-based dropout regularization significantly outperform baseline RNN systems that use naive dropout regularization.
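VI-based (variational) dropout for RNNs typically means sampling one dropout mask per sequence and reusing it at every time step, in the style of Gal and Ghahramani's variational RNN. A minimal sketch under that assumption; the paper's slot-filling model adds task-specific layers on top.

```python
import torch
import torch.nn as nn

def locked_dropout_run(cell, xs, h, p=0.3, training=True):
    """Run an RNN cell over a sequence with 'variational' dropout: one
    Bernoulli mask on the hidden state, sampled once and reused at every
    time step. Inverted scaling keeps the expected activation unchanged."""
    if training and p > 0:
        mask = torch.bernoulli(torch.full_like(h, 1 - p)) / (1 - p)
    else:
        mask = torch.ones_like(h)
    outs = []
    for x in xs:                 # xs: iterable of (batch, input_size) tensors
        h = cell(x, h * mask)    # same mask at every step
        outs.append(h)
    return torch.stack(outs), h
```

Sampling the mask once per sequence, rather than afresh at each step, is precisely what distinguishes variational dropout from the naive dropout baseline the experiments compare against.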
arXiv Detail & Related papers (2020-08-23T22:05:54Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions [121.10450359856242]
Recurrent neural networks (RNNs) are instrumental in modelling sequential and time-series data.
Existing approaches for uncertainty quantification in RNNs are based predominantly on Bayesian methods.
We develop a frequentist alternative that: (a) does not interfere with model training or compromise its accuracy, (b) applies to any RNN architecture, and (c) provides theoretical coverage guarantees on the estimated uncertainty intervals.
arXiv Detail & Related papers (2020-06-20T22:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.