Activity Sparsity Complements Weight Sparsity for Efficient RNN
Inference
- URL: http://arxiv.org/abs/2311.07625v2
- Date: Thu, 7 Dec 2023 07:59:58 GMT
- Title: Activity Sparsity Complements Weight Sparsity for Efficient RNN
Inference
- Authors: Rishav Mukherji, Mark Schöne, Khaleelulla Khan Nazeer, Christian
Mayr, Anand Subramoney
- Abstract summary: We show that activity sparsity can compose multiplicatively with parameter sparsity in a recurrent neural network model.
We achieve up to $20\times$ reduction of computation while maintaining perplexities below $60$ on the Penn Treebank language modeling task.
- Score: 2.0822643340897273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial neural networks open up unprecedented machine learning
capabilities at the cost of ever-growing computational requirements.
Sparsifying the parameters, often achieved through weight pruning, has been
identified as a powerful technique to compress the number of model parameters
and reduce the computational operations of neural networks. Yet, sparse
activations, while omnipresent in both biological neural networks and deep
learning systems, have not been fully utilized as a compression technique in
deep learning. Moreover, the interaction between sparse activations and weight
pruning is not fully understood. In this work, we demonstrate that activity
sparsity can compose multiplicatively with parameter sparsity in a recurrent
neural network model based on the GRU that is designed to be activity sparse.
We achieve up to $20\times$ reduction of computation while maintaining
perplexities below $60$ on the Penn Treebank language modeling task. This
magnitude of reduction has not been achieved previously with solely sparsely
connected LSTMs, and the language modeling performance of our model has not
been achieved previously with any sparsely activated recurrent neural networks
or spiking neural networks. Neuromorphic computing devices are especially good
at taking advantage of the dynamic activity sparsity, and our results provide
strong evidence that making deep learning models activity sparse and porting
them to neuromorphic devices can be a viable strategy that does not compromise
on task performance. Our results also drive further convergence of methods from
deep learning and neuromorphic computing for efficient machine learning.
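The abstract's central claim, that activity sparsity and weight sparsity compose multiplicatively, can be made concrete with a small sketch. The NumPy snippet below is an illustrative assumption and not the authors' implementation: it performs an event-driven recurrent update in which only the columns belonging to active units are touched, and within each column only the non-pruned weights contribute. The layer size, `weight_density`, and `activity_density` are hypothetical values chosen for illustration.

```python
import numpy as np

# Minimal sketch (not the paper's code): event-driven recurrent update that
# skips both inactive units (activity sparsity) and pruned weights
# (weight sparsity). All sizes and densities are illustrative assumptions.
rng = np.random.default_rng(0)
n = 512                      # hidden units (hypothetical)
weight_density = 0.2         # fraction of weights kept after pruning
activity_density = 0.25      # fraction of units emitting an event per step

# Pruned recurrent weight matrix; zeros mark pruned connections.
W = rng.standard_normal((n, n)) * (rng.random((n, n)) < weight_density)

h = rng.standard_normal(n)                 # previous hidden state
active = rng.random(n) < activity_density  # units that fired this step

# Event-driven accumulation: only columns of active units are visited,
# and within each column only the surviving (non-pruned) weights.
pre_activation = np.zeros(n)
macs = 0
for j in np.flatnonzero(active):
    col = W[:, j]
    nz = np.flatnonzero(col)      # non-pruned weights in this column
    pre_activation[nz] += col[nz] * h[j]
    macs += nz.size

dense_macs = n * n
print(f"effective MACs: {macs} / {dense_macs} "
      f"(~{dense_macs / max(macs, 1):.1f}x reduction)")
```

For these illustrative densities the expected reduction is roughly $1/(0.2 \times 0.25) = 20\times$, the same order of magnitude as the figure quoted in the abstract; the densities actually achieved by the paper's model may differ.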
Related papers
- NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance [0.0]
We propose a zero-cost proxy Network Expressivity by Activation Rank (NEAR) to identify the optimal neural network without training.
We demonstrate a state-of-the-art correlation between this network score and model accuracy on NAS-Bench-101 and NATS-Bench-SSS/TSS.
arXiv Detail & Related papers (2024-08-16T14:38:14Z) - Exploiting Heterogeneity in Timescales for Sparse Recurrent Spiking Neural Networks for Energy-Efficient Edge Computing [16.60622265961373]
Spiking Neural Networks (SNNs) represent the forefront of neuromorphic computing.
This paper weaves together three groundbreaking studies that revolutionize SNN performance.
arXiv Detail & Related papers (2024-07-08T23:33:12Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning [38.09011520275557]
Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones.
We propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL.
arXiv Detail & Related papers (2024-06-04T15:47:03Z) - Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models [3.0753589871055107]
Event-based neural networks, such as spiking neural networks (SNNs), naturally exhibit activity sparsity, and many methods exist to sparsify their connectivity by pruning weights.
We study the effects of weight pruning when combined with activity sparsity on language modeling tasks.
Our results suggest sparsely connected event-based neural networks are promising candidates for effective and efficient sequence modeling.
arXiv Detail & Related papers (2024-05-01T10:33:36Z) - Single Neuromorphic Memristor closely Emulates Multiple Synaptic
Mechanisms for Energy Efficient Neural Networks [71.79257685917058]
We demonstrate memristive nano-devices based on SrTiO3 that inherently emulate all these synaptic functions.
These memristors operate in a non-filamentary, low conductance regime, which enables stable and energy efficient operation.
arXiv Detail & Related papers (2024-02-26T15:01:54Z) - SpikingJelly: An open-source machine learning infrastructure platform
for spike-based intelligence [51.6943465041708]
Spiking neural networks (SNNs) aim to realize brain-inspired intelligence on neuromorphic chips with high energy efficiency.
We contribute a full-stack toolkit for pre-processing neuromorphic datasets, building deep SNNs, optimizing their parameters, and deploying SNNs on neuromorphic chips.
arXiv Detail & Related papers (2023-10-25T13:15:17Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Effective and Efficient Computation with Multiple-timescale Spiking
Recurrent Neural Networks [0.9790524827475205]
We show how a novel type of adaptive spiking recurrent neural network (SRNN) is able to achieve state-of-the-art performance.
We calculate a $>$100x energy improvement for our SRNNs over classical RNNs on the harder tasks.
arXiv Detail & Related papers (2020-05-24T01:04:53Z) - Recurrent Neural Network Learning of Performance and Intrinsic
Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)