Ensemble perspective for understanding temporal credit assignment
- URL: http://arxiv.org/abs/2102.03740v1
- Date: Sun, 7 Feb 2021 08:14:05 GMT
- Title: Ensemble perspective for understanding temporal credit assignment
- Authors: Wenxuan Zou, Chan Li, and Haiping Huang
- Abstract summary: We propose that each individual connection in recurrent neural networks is modeled by a spike and slab distribution, rather than a precise weight value.
Our model reveals important connections that determine the overall performance of the network.
It is thus promising to study the temporal credit assignment in recurrent neural networks from the ensemble perspective.
- Score: 1.9843222704723809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks are widely used for modeling spatio-temporal
sequences in both natural language processing and neural population dynamics.
However, understanding the temporal credit assignment is hard. Here, we propose
that each individual connection in the recurrent computation is modeled by a
spike and slab distribution, rather than a precise weight value. We then derive
the mean-field algorithm to train the network at the ensemble level. The method
is then applied to classify handwritten digits when pixels are read in
sequence, and to the multisensory integration task that is a fundamental
cognitive function of animals. Our model reveals important connections that
determine the overall performance of the network. The model also shows how
spatio-temporal information is processed through the hyperparameters of the
distribution, and moreover reveals distinct types of emergent neural
selectivity. It is thus promising to study the temporal credit assignment in
recurrent neural networks from the ensemble perspective.
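The abstract sketches the core construction: every recurrent weight carries a spike and slab distribution whose hyperparameters are trained with a mean-field algorithm. As a rough illustration only (not the authors' released code), the sketch below assumes a Bernoulli "spike" on whether a connection exists, a Gaussian "slab" on its value, and a local-reparameterization-style Gaussian approximation of each preactivation; all class and parameter names are invented for the example.

```python
# Minimal sketch of a mean-field RNN cell whose weights follow a
# spike-and-slab distribution: w_ij = z_ij * g_ij with
# z_ij ~ Bernoulli(pi_ij) and g_ij ~ N(m_ij, v_ij).
# The hyperparameters (pi, m, v) are trained instead of point weights.
# Illustrative reconstruction only, not the paper's released code.
import torch
import torch.nn as nn


class SpikeSlabLinear(nn.Module):
    """Linear map whose preactivation is sampled from its mean-field Gaussian."""

    def __init__(self, n_in, n_out):
        super().__init__()
        self.logit_pi = nn.Parameter(torch.zeros(n_out, n_in))      # P(connection present)
        self.m = nn.Parameter(0.1 * torch.randn(n_out, n_in))       # slab mean
        self.log_v = nn.Parameter(torch.full((n_out, n_in), -4.0))  # slab variance (log)

    def forward(self, x):
        pi = torch.sigmoid(self.logit_pi)
        v = self.log_v.exp()
        # First two moments of each weight: E[w] = pi*m, Var[w] = pi*v + pi*(1-pi)*m^2
        w_mean = pi * self.m
        w_var = pi * v + pi * (1.0 - pi) * self.m ** 2
        # Central-limit / mean-field approximation of the preactivation,
        # sampled with the reparameterization trick so gradients reach (pi, m, v).
        mu = x @ w_mean.t()
        sigma2 = (x ** 2) @ w_var.t()
        eps = torch.randn_like(mu)
        return mu + torch.sqrt(sigma2 + 1e-8) * eps


class SpikeSlabRNN(nn.Module):
    """Vanilla RNN whose input and recurrent weights are spike-and-slab ensembles."""

    def __init__(self, n_in, n_hidden, n_out):
        super().__init__()
        self.inp = SpikeSlabLinear(n_in, n_hidden)
        self.rec = SpikeSlabLinear(n_hidden, n_hidden)
        self.readout = nn.Linear(n_hidden, n_out)

    def forward(self, x_seq):                       # x_seq: (time, batch, n_in)
        h = torch.zeros(x_seq.shape[1], self.readout.in_features)
        for x_t in x_seq:                           # pixels read in sequence
            h = torch.tanh(self.inp(x_t) + self.rec(h))
        return self.readout(h)


# Toy usage: classify length-784 pixel sequences (e.g. sequential MNIST) into 10 classes.
model = SpikeSlabRNN(n_in=1, n_hidden=64, n_out=10)
x = torch.randn(784, 32, 1)                         # (time, batch, pixel value)
logits = model(x)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (32,)))
loss.backward()                                     # gradients flow to the ensemble hyperparameters
```

A weight with a trained spike probability close to one marks an important connection in the sense of the abstract; pruning weights by thresholding pi is one natural way to read off such connections in this sketch.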
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains.
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
- Decomposing spiking neural networks with Graphical Neural Activity Threads [0.734084539365505]
We introduce techniques for analyzing spiking neural networks that decompose neural activity into multiple, disjoint, parallel threads of activity.
We find that this graph of spiking activity naturally decomposes into disjoint connected components that overlap in space and time.
We provide an efficient algorithm for finding analogous threads that reoccur in large spiking datasets, revealing that seemingly distinct spike trains are composed of similar underlying threads of activity.
arXiv Detail & Related papers (2023-06-29T05:10:11Z)
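As a rough illustration of the decomposition described in the entry above, the sketch below treats each spike as a graph node, links spikes that could be causally related through a known synapse with a one-step delay, and reads off threads as connected components. The raster, connectivity, and one-step delay rule are assumptions for the example, not the paper's algorithm for matching recurring threads.

```python
# Rough sketch: decompose a binary spike raster into "threads" by building a
# graph over spike events (neuron, time) and taking connected components.
# Assumes a known adjacency matrix; illustrative only.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_neurons, n_steps = 20, 50
raster = rng.random((n_neurons, n_steps)) < 0.05       # raster[i, t] = neuron i spikes at t
adjacency = rng.random((n_neurons, n_neurons)) < 0.1    # adjacency[i, j] = synapse i -> j

g = nx.Graph()
spikes = [(i, t) for i in range(n_neurons) for t in range(n_steps) if raster[i, t]]
g.add_nodes_from(spikes)
for (i, t) in spikes:
    for j in np.flatnonzero(adjacency[i]):              # post-synaptic partners of i
        if t + 1 < n_steps and raster[j, t + 1]:
            g.add_edge((i, t), (j, t + 1))              # spike i@t may have driven j@t+1

threads = list(nx.connected_components(g))
print(f"{len(spikes)} spikes decomposed into {len(threads)} activity threads")
```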
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Learning the Evolutionary and Multi-scale Graph Structure for Multivariate Time Series Forecasting [50.901984244738806]
We show how to model the evolutionary and multi-scale interactions of time series.
In particular, we first provide a hierarchical graph structure, coupled with dilated convolution, to capture the scale-specific correlations.
A unified neural network integrates the components above to produce the final prediction.
arXiv Detail & Related papers (2022-06-28T08:11:12Z)
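A minimal sketch of the scale-specific idea in the entry above: stacked one-dimensional convolutions with increasing dilation, so each layer responds to temporal correlations at a different scale. The block below is illustrative only and omits the paper's evolutionary graph-learning component; layer sizes and dilation rates are arbitrary choices.

```python
# Minimal multi-scale temporal block: 1-D convolutions with growing dilation.
# Illustrative only; not the paper's full evolutionary graph model.
import torch
import torch.nn as nn

class MultiScaleTemporalBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=3, dilation=d, padding=d)
            for d in dilations
        )

    def forward(self, x):                  # x: (batch, channels, time)
        outputs = []
        for conv in self.convs:            # one feature map per temporal scale
            x = torch.relu(conv(x))
            outputs.append(x)
        return outputs                     # scale-specific feature maps

block = MultiScaleTemporalBlock(channels=8)
series = torch.randn(4, 8, 96)             # 4 multivariate series, 8 variables, 96 steps
scales = block(series)
print([s.shape for s in scales])            # padding keeps the sequence length at every scale
```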
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Long Short-term Cognitive Networks [2.2748974006378933]
We present a recurrent neural system named Long Short-term Cognitive Networks (LSTCNs) as a generalisation of the Short-term Cognitive Network (STCN) model.
Our neural system reports small forecasting errors while being up to thousands of times faster than state-of-the-art recurrent models.
arXiv Detail & Related papers (2021-06-30T17:42:09Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set [0.0]
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology.
Using simplicial complex representations of neural networks, we study how the PH diagram distance evolves during the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory [110.99247009159726]
Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks.
In particular, temporal-difference learning converges when the function approximator is linear in a feature representation, which is fixed throughout learning, and possibly diverges otherwise.
arXiv Detail & Related papers (2020-06-08T17:25:22Z)
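The entry above refers to the classical setting where the function approximator is linear in a fixed feature representation. The worked sketch below runs TD(0) with such a linear approximator on a small random Markov reward process; the feature map, step size, and chain are arbitrary illustrative choices, not taken from the paper.

```python
# TD(0) with a value function that is linear in a fixed feature map phi(s).
# Random Markov reward process for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_features, gamma, alpha = 5, 3, 0.9, 0.05
P = rng.dirichlet(np.ones(n_states), size=n_states)   # transition matrix, rows sum to 1
r = rng.standard_normal(n_states)                      # expected reward per state
phi = rng.standard_normal((n_states, n_features))      # fixed feature representation
w = np.zeros(n_features)                               # value estimate V(s) = phi(s) @ w

s = 0
for step in range(20000):
    s_next = rng.choice(n_states, p=P[s])
    reward = r[s]
    # TD(0) update: w <- w + alpha * (r + gamma * V(s') - V(s)) * phi(s)
    td_error = reward + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * td_error * phi[s]
    s = s_next

print("learned value estimates:", phi @ w)
```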
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.