Theory of gating in recurrent neural networks
- URL: http://arxiv.org/abs/2007.14823v5
- Date: Wed, 1 Dec 2021 17:43:29 GMT
- Title: Theory of gating in recurrent neural networks
- Authors: Kamesh Krishnamurthy, Tankut Can and David J. Schwab
- Abstract summary: Recurrent neural networks (RNNs) are powerful dynamical models, widely used in machine learning (ML) and neuroscience.
Here, we show that gating offers flexible control of two salient features of the collective dynamics.
The gate controlling timescales leads to a novel, marginally stable state, where the network functions as a flexible integrator.
- Score: 5.672132510411465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks (RNNs) are powerful dynamical models, widely used
in machine learning (ML) and neuroscience. Prior theoretical work has focused
on RNNs with additive interactions. However, gating - i.e. multiplicative -
interactions are ubiquitous in real neurons and also the central feature of the
best-performing RNNs in ML. Here, we show that gating offers flexible control
of two salient features of the collective dynamics: i) timescales and ii)
dimensionality. The gate controlling timescales leads to a novel, marginally
stable state, where the network functions as a flexible integrator. Unlike
previous approaches, gating permits this important function without parameter
fine-tuning or special symmetries. Gates also provide a flexible,
context-dependent mechanism to reset the memory trace, thus complementing the
memory function. The gate modulating the dimensionality can induce a novel,
discontinuous chaotic transition, where inputs push a stable system to strong
chaotic activity, in contrast to the typically stabilizing effect of inputs. At
this transition, unlike additive RNNs, the proliferation of critical points
(topological complexity) is decoupled from the appearance of chaotic dynamics
(dynamical complexity).
The rich dynamics are summarized in phase diagrams, thus providing a map for
principled parameter initialization choices to ML practitioners.
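For intuition, the class of dynamics studied here can be pictured as a random recurrent network with two multiplicative gates: an update gate that multiplies the rate of change (controlling timescales) and an output gate that multiplies the transmitted activity (controlling dimensionality). Below is a minimal discrete-time sketch, assuming a simple Euler discretization and illustrative gate wirings `Wz` and `Wr`; it is not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
N, g, dt = 500, 1.5, 0.05

# Random recurrent couplings with variance g^2 / N, as in standard RNN theory
J  = rng.normal(0.0, g / np.sqrt(N), (N, N))
Wz = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # update-gate wiring (assumption)
Wr = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # output-gate wiring (assumption)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h = rng.normal(0.0, 1.0, N)
for _ in range(2000):
    phi = np.tanh(h)
    z = sigmoid(Wz @ phi)          # update gate: multiplies the rate of change,
                                   # so z -> 0 freezes the state (long timescales)
    r = sigmoid(Wr @ phi)          # output gate: multiplies the transmitted
                                   # activity, modulating dimensionality
    dh = z * (-h + J @ (r * phi))  # gated dynamics, one Euler step
    h = h + dt * dh
```

In this toy version, driving z toward 0 holds the state for as long as the gate stays closed, which is the intuition behind the marginally stable integrator regime; reopening the gate resets the memory trace, and the output gate r shapes how much of the nonlinear activity is fed back.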
Related papers
- Sparse identification of quasipotentials via a combined data-driven method [4.599618895656792]
We leverage machine learning via the combination of two data-driven techniques, namely a neural network and a sparse regression algorithm, to obtain symbolic expressions of quasipotential functions.
We show that our approach discovers a parsimonious quasipotential equation for an archetypal model with a known exact quasipotential and for the dynamics of a nanomechanical resonator.
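The second stage described above resembles sparse symbolic regression over a library of candidate terms. A minimal sketch, assuming a neural network has already produced quasipotential samples `V` at states `X` (the polynomial library and thresholding loop are illustrative, not the paper's exact algorithm):

```python
import numpy as np

def sparse_fit(X, V, threshold=0.1, iters=10):
    """Sequentially-thresholded least squares over a polynomial library."""
    # Candidate library for 2-D states: [1, x, y, x^2, x*y, y^2]
    x, y = X[:, 0], X[:, 1]
    Theta = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    coef = np.linalg.lstsq(Theta, V, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(coef) < threshold
        coef[small] = 0.0           # prune weak terms for parsimony
        big = ~small
        if big.any():               # refit the surviving terms
            coef[big] = np.linalg.lstsq(Theta[:, big], V, rcond=None)[0]
    return coef  # nonzero entries name the terms of the quasipotential

# Toy usage: recover V(x, y) = x^2 + 0.5 * y^2 from noisy samples
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (200, 2))
V = X[:, 0]**2 + 0.5 * X[:, 1]**2 + 0.01 * rng.normal(size=200)
print(sparse_fit(X, V).round(2))
```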
arXiv Detail & Related papers (2024-07-06T11:27:52Z)
- Spiking Neural Networks with Consistent Mapping Relations Allow High-Accuracy Inference [9.667807887916132]
Spike-based neuromorphic hardware has demonstrated substantial potential in low energy consumption and efficient inference.
Direct training of deep spiking neural networks is challenging, and conversion-based methods still require substantial time delay owing to unresolved conversion errors.
arXiv Detail & Related papers (2024-06-08T06:40:00Z)
- Single Neuromorphic Memristor closely Emulates Multiple Synaptic Mechanisms for Energy Efficient Neural Networks [71.79257685917058]
We demonstrate memristive nano-devices based on SrTiO3 that inherently emulate all these synaptic functions.
These memristors operate in a non-filamentary, low conductance regime, which enables stable and energy efficient operation.
arXiv Detail & Related papers (2024-02-26T15:01:54Z)
- Complex Recurrent Spectral Network [1.0499611180329806]
This paper presents a novel approach to advancing artificial intelligence (AI) through the development of the Complex Recurrent Spectral Network ($\mathbb{C}$-RSN).
The $\mathbb{C}$-RSN is designed to address a critical limitation in existing neural network models: their inability to emulate the complex processes of biological neural networks.
arXiv Detail & Related papers (2023-12-12T14:14:40Z)
- Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
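One common way to parameterize recurrent connectivity with few effective parameters, as studied here, is a low-rank factor plus a sparse residual. A minimal sketch; the rank, sparsity level, and their combination are assumptions, not the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
N, rank, density = 128, 4, 0.1

# Low-rank component: an N x N matrix with only 2 * N * rank free parameters
U = rng.normal(0.0, 1.0 / np.sqrt(N), (N, rank))
V = rng.normal(0.0, 1.0 / np.sqrt(N), (N, rank))

# Sparse component: a random mask keeping ~10% of the entries
mask = rng.random((N, N)) < density
S = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N)) * mask

W_rec = U @ V.T + S  # recurrent weights, far below full-rank parameter count

# One recurrent step for hidden state h driven by input u (W_in illustrative)
W_in = rng.normal(0.0, 1.0, (N, 3))
h = np.zeros(N)
u = np.ones(3)
h = np.tanh(W_rec @ h + W_in @ u)
```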
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Artificial Neuronal Ensembles with Learned Context Dependent Gating [0.0]
We introduce Learned Context Dependent Gating (LXDG), a method to flexibly allocate and recall "artificial neuronal ensembles".
Activities in the hidden layers of the network are modulated by gates, which are dynamically produced during training.
We demonstrate the ability of this method to alleviate catastrophic forgetting on continual learning benchmarks.
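A minimal sketch of context-dependent gating of hidden activations in the spirit described above, assuming the gates come from a small context pathway (the architecture and names are illustrative; LXDG additionally learns its gates during training with dedicated regularizers):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_ctx = 10, 64, 3

W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
W2 = rng.normal(0.0, 0.1, (1, n_hidden))
Wg = rng.normal(0.0, 0.1, (n_hidden, n_ctx))  # gate pathway (assumption)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, context):
    gate = sigmoid(Wg @ context)   # per-unit gate derived from the task context
    h = np.tanh(W1 @ x) * gate     # gate modulates hidden activity, allocating
                                   # a task-specific neuronal ensemble
    return W2 @ h

x = rng.normal(size=n_in)
task_a = np.array([1.0, 0.0, 0.0])  # one-hot task context
print(forward(x, task_a))
```

Because different contexts open different gates, each task uses a largely separate ensemble of hidden units, which is what helps against catastrophic forgetting.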
arXiv Detail & Related papers (2023-01-17T20:52:48Z)
- Equivariant Graph Mechanics Networks with Constraints [83.38709956935095]
We propose Graph Mechanics Network (GMN) which is efficient, equivariant and constraint-aware.
GMN represents, by generalized coordinates, the forward kinematics information (positions and velocities) of a structural object.
Extensive experiments support the advantages of GMN compared to the state-of-the-art GNNs in terms of prediction accuracy, constraint satisfaction and data efficiency.
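As a concrete picture of "generalized coordinates": a constrained object such as a rigid rod needs only an angle, from which Cartesian positions and velocities follow with the constraint satisfied exactly. A minimal sketch; the pendulum example is illustrative, not GMN's implementation:

```python
import numpy as np

def rod_kinematics(theta, theta_dot, length=1.0):
    """Forward kinematics of a rigid rod pinned at the origin.

    One generalized coordinate (theta) determines the endpoint's
    position and velocity, so the length constraint holds by construction.
    """
    pos = length * np.array([np.sin(theta), -np.cos(theta)])
    vel = length * theta_dot * np.array([np.cos(theta), np.sin(theta)])
    return pos, vel

pos, vel = rod_kinematics(theta=0.3, theta_dot=1.0)
print(pos, vel, np.linalg.norm(pos))  # norm stays at length = 1 exactly
```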
arXiv Detail & Related papers (2022-03-12T14:22:14Z)
- Modeling Implicit Bias with Fuzzy Cognitive Maps [0.0]
This paper presents a Fuzzy Cognitive Map model to quantify implicit bias in structured datasets.
We introduce a new reasoning mechanism equipped with a normalization-like transfer function that prevents neurons from saturating.
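A minimal sketch of a normalization-like transfer that keeps activations away from saturation, contrasted with a plain sigmoid; the specific form below is an assumption, not the paper's exact mechanism:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def normalized_transfer(raw, slope=1.0):
    """Rescale raw activations to [-1, 1] before squashing, so large
    weighted sums cannot pin every neuron at 0 or 1 (saturation)."""
    scale = np.max(np.abs(raw))
    if scale > 0:
        raw = raw / scale
    return sigmoid(slope * raw)

raw = np.array([0.5, 8.0, -12.0, 3.0])
print(sigmoid(raw))              # saturates: values pinned near 0 and 1
print(normalized_transfer(raw))  # stays in the responsive mid-range
```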
arXiv Detail & Related papers (2021-12-23T17:04:12Z)
- Flexible Transmitter Network [84.90891046882213]
Current neural networks are mostly built upon the MP (McCulloch-Pitts) model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons.
We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity.
We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
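For reference, the MP-style update described above is an activation applied to a weighted aggregation. The sketch below shows that update plus a hypothetical two-variable neuron with local memory in the FT spirit; the pairing shown is an illustration, not the published FT formulation:

```python
import numpy as np

def mp_neuron(x, w, b=0.0):
    """MP-style neuron: activation on a real-valued weighted aggregation."""
    return np.tanh(w @ x + b)

def ft_like_neuron(x, m, w, v):
    """Hypothetical FT-flavored neuron (illustration only): the output
    depends on the input x AND a local memory m, which is also updated."""
    s = np.tanh(w @ x + v * m)   # transmission uses the current memory
    m_new = 0.9 * m + 0.1 * s    # memory integrates the neuron's own output
    return s, m_new

x = np.array([0.2, -0.5, 1.0])
w = np.array([0.4, 0.3, -0.2])
print(mp_neuron(x, w))
s, m = ft_like_neuron(x, m=0.0, w=w, v=0.5)
```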
arXiv Detail & Related papers (2020-04-08T06:55:12Z)