Ensemble perspective for understanding temporal credit assignment
- URL: http://arxiv.org/abs/2102.03740v1
- Date: Sun, 7 Feb 2021 08:14:05 GMT
- Title: Ensemble perspective for understanding temporal credit assignment
- Authors: Wenxuan Zou, Chan Li, and Haiping Huang
- Abstract summary: We propose that each individual connection in recurrent neural networks is modeled by a spike and slab distribution, rather than a precise weight value.
Our model reveals important connections that determine the overall performance of the network.
It is thus promising to study the temporal credit assignment in recurrent neural networks from the ensemble perspective.
- Score: 1.9843222704723809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks are widely used for modeling spatio-temporal
sequences in both natural language processing and neural population dynamics.
However, understanding the temporal credit assignment is hard. Here, we propose
that each individual connection in the recurrent computation is modeled by a
spike and slab distribution, rather than a precise weight value. We then derive
the mean-field algorithm to train the network at the ensemble level. The method
is then applied to classify handwritten digits when pixels are read in
sequence, and to the multisensory integration task that is a fundamental
cognitive function of animals. Our model reveals important connections that
determine the overall performance of the network. The model also shows how
spatio-temporal information is processed through the hyperparameters of the
distribution, and moreover reveals distinct types of emergent neural
selectivity. It is thus promising to study the temporal credit assignment in
recurrent neural networks from the ensemble perspective.
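The abstract sketches the core construction: every recurrent weight carries a spike and slab distribution whose hyperparameters are trained with a mean-field algorithm. As a rough illustration only (not the authors' released code), the sketch below assumes a Bernoulli "spike" on whether a connection exists, a Gaussian "slab" on its value, and a local-reparameterization-style Gaussian approximation of each preactivation; all class and parameter names are invented for the example.

```python
# Minimal sketch of a mean-field RNN cell whose weights follow a
# spike-and-slab distribution: w_ij = z_ij * g_ij with
# z_ij ~ Bernoulli(pi_ij) and g_ij ~ N(m_ij, v_ij).
# The hyperparameters (pi, m, v) are trained instead of point weights.
# Illustrative reconstruction only, not the paper's released code.
import torch
import torch.nn as nn


class SpikeSlabLinear(nn.Module):
    """Linear map whose preactivation is sampled from its mean-field Gaussian."""

    def __init__(self, n_in, n_out):
        super().__init__()
        self.logit_pi = nn.Parameter(torch.zeros(n_out, n_in))      # P(connection present)
        self.m = nn.Parameter(0.1 * torch.randn(n_out, n_in))       # slab mean
        self.log_v = nn.Parameter(torch.full((n_out, n_in), -4.0))  # slab variance (log)

    def forward(self, x):
        pi = torch.sigmoid(self.logit_pi)
        v = self.log_v.exp()
        # First two moments of each weight: E[w] = pi*m, Var[w] = pi*v + pi*(1-pi)*m^2
        w_mean = pi * self.m
        w_var = pi * v + pi * (1.0 - pi) * self.m ** 2
        # Central-limit / mean-field approximation of the preactivation,
        # sampled with the reparameterization trick so gradients reach (pi, m, v).
        mu = x @ w_mean.t()
        sigma2 = (x ** 2) @ w_var.t()
        eps = torch.randn_like(mu)
        return mu + torch.sqrt(sigma2 + 1e-8) * eps


class SpikeSlabRNN(nn.Module):
    """Vanilla RNN whose input and recurrent weights are spike-and-slab ensembles."""

    def __init__(self, n_in, n_hidden, n_out):
        super().__init__()
        self.inp = SpikeSlabLinear(n_in, n_hidden)
        self.rec = SpikeSlabLinear(n_hidden, n_hidden)
        self.readout = nn.Linear(n_hidden, n_out)

    def forward(self, x_seq):                       # x_seq: (time, batch, n_in)
        h = torch.zeros(x_seq.shape[1], self.readout.in_features)
        for x_t in x_seq:                           # pixels read in sequence
            h = torch.tanh(self.inp(x_t) + self.rec(h))
        return self.readout(h)


# Toy usage: classify length-784 pixel sequences (e.g. sequential MNIST) into 10 classes.
model = SpikeSlabRNN(n_in=1, n_hidden=64, n_out=10)
x = torch.randn(784, 32, 1)                         # (time, batch, pixel value)
logits = model(x)
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (32,)))
loss.backward()                                     # gradients flow to the ensemble hyperparameters
```

A weight with a trained spike probability close to one marks an important connection in the sense of the abstract; pruning weights by thresholding pi is one natural way to read off such connections in this sketch.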
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains.
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
- Decomposing spiking neural networks with Graphical Neural Activity Threads [0.734084539365505]
We introduce techniques for analyzing spiking neural networks that decompose neural activity into multiple, disjoint, parallel threads of activity.
We find that this graph of spiking activity naturally decomposes into disjoint connected components that overlap in space and time.
We provide an efficient algorithm for finding analogous threads that reoccur in large spiking datasets, revealing that seemingly distinct spike trains are composed of similar underlying threads of activity.
arXiv Detail & Related papers (2023-06-29T05:10:11Z)
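As a rough illustration of the decomposition described in the entry above, the sketch below treats each spike as a graph node, links spikes that could be causally related through a known synapse with a one-step delay, and reads off threads as connected components. The raster, connectivity, and one-step delay rule are assumptions for the example, not the paper's algorithm for matching recurring threads.

```python
# Rough sketch: decompose a binary spike raster into "threads" by building a
# graph over spike events (neuron, time) and taking connected components.
# Assumes a known adjacency matrix; illustrative only.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_neurons, n_steps = 20, 50
raster = rng.random((n_neurons, n_steps)) < 0.05       # raster[i, t] = neuron i spikes at t
adjacency = rng.random((n_neurons, n_neurons)) < 0.1    # adjacency[i, j] = synapse i -> j

g = nx.Graph()
spikes = [(i, t) for i in range(n_neurons) for t in range(n_steps) if raster[i, t]]
g.add_nodes_from(spikes)
for (i, t) in spikes:
    for j in np.flatnonzero(adjacency[i]):              # post-synaptic partners of i
        if t + 1 < n_steps and raster[j, t + 1]:
            g.add_edge((i, t), (j, t + 1))              # spike i@t may have driven j@t+1

threads = list(nx.connected_components(g))
print(f"{len(spikes)} spikes decomposed into {len(threads)} activity threads")
```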
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Learning the Evolutionary and Multi-scale Graph Structure for Multivariate Time Series Forecasting [50.901984244738806]
We show how to model the evolutionary and multi-scale interactions of time series.
In particular, we first provide a hierarchical graph structure, coupled with dilated convolution, to capture the scale-specific correlations.
A unified neural network integrates the components above to produce the final prediction.
arXiv Detail & Related papers (2022-06-28T08:11:12Z)
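A minimal sketch of the scale-specific idea in the entry above: stacked one-dimensional convolutions with increasing dilation, so each layer responds to temporal correlations at a different scale. The block below is illustrative only and omits the paper's evolutionary graph-learning component; layer sizes and dilation rates are arbitrary choices.

```python
# Minimal multi-scale temporal block: 1-D convolutions with growing dilation.
# Illustrative only; not the paper's full evolutionary graph model.
import torch
import torch.nn as nn

class MultiScaleTemporalBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=3, dilation=d, padding=d)
            for d in dilations
        )

    def forward(self, x):                  # x: (batch, channels, time)
        outputs = []
        for conv in self.convs:            # one feature map per temporal scale
            x = torch.relu(conv(x))
            outputs.append(x)
        return outputs                     # scale-specific feature maps

block = MultiScaleTemporalBlock(channels=8)
series = torch.randn(4, 8, 96)             # 4 multivariate series, 8 variables, 96 steps
scales = block(series)
print([s.shape for s in scales])            # padding keeps the sequence length at every scale
```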
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Long Short-term Cognitive Networks [2.2748974006378933]
We present a recurrent neural system named Long Short-term Cognitive Networks (LSTCNs) as a generalisation of the Short-term Cognitive Network (STCN) model.
Our neural system reports small forecasting errors while being up to thousands of times faster than state-of-the-art recurrent models.
arXiv Detail & Related papers (2021-06-30T17:42:09Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set [0.0]
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology.
Using simplicial complex representations of neural networks, we study how the PH diagram distance evolves during the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory [110.99247009159726]
Temporal-difference and Q-learning play a key role in deep reinforcement learning, where they are empowered by expressive nonlinear function approximators such as neural networks.
In particular, temporal-difference learning converges when the function approximator is linear in a feature representation, which is fixed throughout learning, and possibly diverges otherwise.
arXiv Detail & Related papers (2020-06-08T17:25:22Z)
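The entry above refers to the classical setting where the function approximator is linear in a fixed feature representation. The worked sketch below runs TD(0) with such a linear approximator on a small random Markov reward process; the feature map, step size, and chain are arbitrary illustrative choices, not taken from the paper.

```python
# TD(0) with a value function that is linear in a fixed feature map phi(s).
# Random Markov reward process for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n_states, n_features, gamma, alpha = 5, 3, 0.9, 0.05
P = rng.dirichlet(np.ones(n_states), size=n_states)   # transition matrix, rows sum to 1
r = rng.standard_normal(n_states)                      # expected reward per state
phi = rng.standard_normal((n_states, n_features))      # fixed feature representation
w = np.zeros(n_features)                               # value estimate V(s) = phi(s) @ w

s = 0
for step in range(20000):
    s_next = rng.choice(n_states, p=P[s])
    reward = r[s]
    # TD(0) update: w <- w + alpha * (r + gamma * V(s') - V(s)) * phi(s)
    td_error = reward + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * td_error * phi[s]
    s = s_next

print("learned value estimates:", phi @ w)
```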
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.