EGRU: Event-based GRU for activity-sparse inference and learning
- URL: http://arxiv.org/abs/2206.06178v1
- Date: Mon, 13 Jun 2022 14:07:56 GMT
- Title: EGRU: Event-based GRU for activity-sparse inference and learning
- Authors: Anand Subramoney, Khaleelulla Khan Nazeer, Mark Schöne, Christian Mayr, David Kappel
- Abstract summary: We propose a model that reformulates Gated Recurrent Units (GRU) as an event-based activity-sparse model.
We show that the Event-based GRU (EGRU) demonstrates competitive performance compared to state-of-the-art recurrent network models in real-world tasks.
- Score: 0.8260432715157026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The scalability of recurrent neural networks (RNNs) is hindered by the
sequential dependence of each time step's computation on the previous time
step's output. Therefore, one way to speed up and scale RNNs is to reduce the
computation required at each time step independent of model size and task. In
this paper, we propose a model that reformulates Gated Recurrent Units (GRU) as
an event-based activity-sparse model that we call the Event-based GRU (EGRU),
where units compute updates only on receipt of input events (event-based) from
other units. When combined with having only a small fraction of the units
active at a time (activity-sparse), this model has the potential to be vastly
more compute efficient than current RNNs. Notably, activity-sparsity in our
model also translates into sparse parameter updates during gradient descent,
extending this compute efficiency to the training phase. We show that the EGRU
demonstrates competitive performance compared to state-of-the-art recurrent
network models in real-world tasks, including language modeling, while
maintaining high activity sparsity naturally during inference and training.
This sets the stage for the next generation of recurrent networks that are
scalable and more suitable for novel neuromorphic hardware.
Related papers
- Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks [50.32980443749865]
Spiking neural networks (SNNs) have garnered significant attention for their low power consumption and high biological plausibility.
Current SNNs struggle to balance accuracy and latency in neuromorphic datasets.
We propose the Hybrid Step-wise Distillation (HSD) method, tailored for neuromorphic datasets.
arXiv Detail & Related papers (2024-09-19T06:52:34Z) - A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z) - Online Evolutionary Neural Architecture Search for Multivariate
Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Continuous-time convolutions model of event sequences [46.3471121117337]
Event sequences are non-uniform and sparse, making traditional models unsuitable.
We propose COTIC, a method based on an efficient convolution neural network designed to handle the non-uniform occurrence of events over time.
COTIC outperforms existing models in predicting the next event time and type, achieving an average rank of 1.5 compared to 3.714 for the nearest competitor.
arXiv Detail & Related papers (2023-02-13T10:34:51Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z) - Oscillatory Fourier Neural Network: A Compact and Efficient Architecture
for Sequential Processing [16.69710555668727]
We propose a novel neuron model that has cosine activation with a time varying component for sequential processing.
The proposed neuron provides an efficient building block for projecting sequential inputs into spectral domain.
Applying the proposed model to sentiment analysis on the IMDB dataset reaches 89.4% test accuracy within 5 epochs.
arXiv Detail & Related papers (2021-09-14T19:08:07Z) - Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for
Efficient Training [18.521882534906972]
We propose to structure dropout patterns, by dropping out the same set of physical neurons within a batch, resulting in column (row) level hidden state sparsity.
We conduct experiments for three representative NLP tasks: language modelling on the PTB dataset, OpenNMT based machine translation using the IWSLT De-En and En-Vi datasets, and named entity recognition sequence labelling.
arXiv Detail & Related papers (2021-06-22T22:44:32Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.