Paradoxical noise preference in RNNs
- URL: http://arxiv.org/abs/2601.04539v1
- Date: Thu, 08 Jan 2026 03:11:51 GMT
- Title: Paradoxical noise preference in RNNs
- Authors: Noah Eckstein, Manoj Srinivasan
- Abstract summary: In recurrent neural networks (RNNs) used to model biological neural networks, noise is typically introduced during training to emulate biological variability and regularize learning. We find that continuous-time recurrent neural networks (CTRNNs) often perform best at a nonzero noise level, specifically, the same level used during training. This noise preference typically arises when noise is injected inside the neural activation function; networks trained with noise injected outside the activation function perform best with zero noise.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recurrent neural networks (RNNs) used to model biological neural networks, noise is typically introduced during training to emulate biological variability and regularize learning. The expectation is that removing the noise at test time should preserve or improve performance. Contrary to this intuition, we find that continuous-time recurrent neural networks (CTRNNs) often perform best at a nonzero noise level, specifically, the same level used during training. This noise preference typically arises when noise is injected inside the neural activation function; networks trained with noise injected outside the activation function perform best with zero noise. Through analyses of simple function approximation, maze navigation, and single neuron regulator tasks, we show that the phenomenon stems from noise-induced shifts of fixed points (stationary distributions) in the underlying stochastic dynamics of the RNNs. These fixed point shifts are noise-level dependent and bias the network outputs when the noise is removed, degrading performance. Analytical and numerical results show that the bias arises when neural states operate near activation function nonlinearities, where noise is asymmetrically attenuated, and that performance optimization incentivizes operation near these nonlinearities. Thus, networks can overfit to the stochastic training environment itself rather than just to the input-output data. The phenomenon is distinct from stochastic resonance, wherein nonzero noise enhances signal processing. Our findings reveal that training noise can become an integral part of the computation learned by recurrent networks, with implications for understanding neural population dynamics and for the design of robust artificial RNNs.
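The distinction between injecting noise inside versus outside the activation function can be made concrete with a minimal Euler-discretized CTRNN step. This is an illustrative sketch, not the authors' code; the function name, parameters, and discretization scheme are assumptions.

```python
import numpy as np

def ctrnn_step(x, W, b, noise_std, rng, dt=0.1, tau=1.0, noise_inside=True):
    """One Euler step of a CTRNN: tau * dx/dt = -x + tanh(W x + b).

    noise_inside=True injects noise into the argument of the activation
    (the regime where the paper reports a preference for training-level
    noise); noise_inside=False adds it to the state update instead.
    """
    eta = noise_std * rng.standard_normal(x.shape)
    if noise_inside:
        r = np.tanh(W @ x + b + eta)   # noise passes through the nonlinearity
        dx = (-x + r) / tau
    else:
        r = np.tanh(W @ x + b)
        dx = (-x + r + eta) / tau      # noise added after the nonlinearity
    return x + dt * dx
```

Because tanh saturates, noise injected inside the activation is asymmetrically attenuated whenever the pre-activation sits near a nonlinear region, which is the mechanism the abstract identifies for the noise-level-dependent shift of fixed points.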
Related papers
- Self-Supervised Learning via Flow-Guided Neural Operator on Time-Series Data [57.85958428020496]
Flow-Guided Neural Operator (FGNO) is a novel framework combining operator learning with flow matching for SSL training. FGNO learns mappings in functional spaces by using the Short-Time Fourier Transform to unify different time resolutions. Unlike prior generative SSL methods that use noisy inputs during inference, we propose using clean inputs for representation extraction while learning representations with noise.
arXiv Detail & Related papers (2026-02-12T18:54:57Z) - Credit Assignment via Neural Manifold Noise Correlation [0.0]
Credit assignment is central to learning in brains and machines. Noise correlation estimates gradients by correlating perturbations of activity with changes in output. We propose neural manifold noise correlation, which performs credit assignment using perturbations restricted to the neural manifold.
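The gradient-by-correlation idea in this summary can be sketched as a zeroth-order estimator: perturb the parameters, correlate each perturbation with the change in output, and optionally project perturbations onto a low-dimensional subspace standing in for the neural manifold. This is a generic illustration under assumed names, not the paper's method.

```python
import numpy as np

def noise_corr_grad(f, theta, sigma=0.01, n_samples=4000, basis=None, rng=None):
    """Estimate grad f(theta) via E[eps * (f(theta + eps) - f(theta))] / sigma^2.

    If `basis` (k x d, orthonormal rows) is given, perturbations are
    restricted to its span -- a stand-in for a "neural manifold".
    """
    rng = rng or np.random.default_rng(0)
    f0 = f(theta)
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = sigma * rng.standard_normal(theta.shape)
        if basis is not None:
            eps = basis.T @ (basis @ eps)  # project onto the manifold
        g += eps * (f(theta + eps) - f0)
    return g / (n_samples * sigma**2)
```

Restricting perturbations to a subspace reduces the estimator's variance at the cost of only recovering the gradient's component within that subspace.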
arXiv Detail & Related papers (2026-01-06T01:17:55Z) - Internal noise in hardware deep and recurrent neural networks helps with learning [0.0]
Internal noise during the training of neural networks affects the final performance of recurrent and deep neural networks. In most cases, both deep and echo state networks benefit from internal noise during training, as it enhances their resilience to noise.
arXiv Detail & Related papers (2025-04-18T16:26:46Z) - Learning Provably Robust Estimators for Inverse Problems via Jittering [51.467236126126366]
We investigate whether jittering, a simple regularization technique, is effective for learning worst-case robust estimators for inverse problems.
We show that jittering significantly enhances the worst-case robustness, but can be suboptimal for inverse problems beyond denoising.
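Jittering in the sense above is simply training on noise-corrupted inputs with clean targets. A minimal sketch on a scalar denoiser shows its regularizing effect: the learned gain shrinks toward the Wiener factor var(x) / (var(x) + sigma^2) instead of 1. The setup and function name are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

def train_jittered_scalar(xs, sigma, lr=0.01, epochs=200, rng=None):
    """Fit a scalar denoiser y_hat = a * (x + sigma*eps) to clean targets x
    by gradient descent on the mean squared error. Jittering drives `a`
    toward the Wiener shrinkage var(x) / (var(x) + sigma^2).
    """
    rng = rng or np.random.default_rng(0)
    a = 1.0
    for _ in range(epochs):
        eps = rng.standard_normal(xs.shape)
        y = xs + sigma * eps                  # jittered (noisy) input
        grad = np.mean(2 * (a * y - xs) * y)  # d/da of the MSE
        a -= lr * grad
    return a
```

For unit-variance data and sigma = 1, the gain converges near 0.5, i.e. the estimator trades some clean-input accuracy for robustness to input perturbations, consistent with the summary's point that jittering can be suboptimal beyond denoising.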
arXiv Detail & Related papers (2023-07-24T14:19:36Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Robust Learning of Recurrent Neural Networks in Presence of Exogenous Noise [22.690064709532873]
We propose a tractable robustness analysis for RNN models subject to input noise.
The robustness measure can be estimated efficiently using linearization techniques.
Our proposed methodology significantly improves robustness of recurrent neural networks.
arXiv Detail & Related papers (2021-05-03T16:45:05Z) - Understanding and mitigating noise in trained deep neural networks [0.0]
We study the propagation of noise in deep neural networks comprising noisy nonlinear neurons in trained fully connected layers.
We find that noise accumulation is generally bound, and adding additional network layers does not worsen the signal to noise ratio beyond a limit.
We identify criteria allowing engineers to design noise-resilient novel neural network hardware.
arXiv Detail & Related papers (2021-03-12T17:16:26Z) - Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections [73.95786440318369]
We focus on the so-called 'implicit effect' of GNIs, which is the effect of the injected noise on the dynamics of stochastic gradient descent (SGD).
We show that this effect induces an asymmetric heavy-tailed noise on gradient updates.
We then formally prove that GNIs induce an 'implicit bias', which varies depending on the heaviness of the tails and the level of asymmetry.
arXiv Detail & Related papers (2021-02-13T21:28:09Z) - Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from 'natural' neural networks.
ANV plays as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z) - Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network [68.43026108936029]
We propose a pure spiking neural network (SNN) based computational model for precise sound localization in the noisy real-world environment.
We implement this algorithm in a real-time robotic system with a microphone array.
The experiment results show a mean azimuth error of 13 degrees, which surpasses the accuracy of other biologically plausible neuromorphic approaches for sound source localization.
arXiv Detail & Related papers (2020-07-07T08:22:56Z) - Robust Processing-In-Memory Neural Networks via Noise-Aware Normalization [26.270754571140735]
PIM accelerators often suffer from intrinsic noise in the physical components.
We propose a noise-agnostic method to achieve robust neural network performance against any noise setting.
arXiv Detail & Related papers (2020-07-07T06:51:28Z) - Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.