Related papers: Investigating Sparsity in Recurrent Neural Networks

URL: http://arxiv.org/abs/2407.20601v1
Date: Tue, 30 Jul 2024 07:24:58 GMT
Title: Investigating Sparsity in Recurrent Neural Networks
Authors: Harshil Darji
Abstract summary: This thesis focuses on investigating the effects of pruning and Sparse Recurrent Neural Networks on the performance of RNNs. We first describe the pruning of RNNs, its impact on the performance of RNNs, and the number of training epochs required to regain accuracy after the pruning is performed. Next, we continue with the creation and training of Sparse Recurrent Neural Networks and identify the relation between the performance and the graph properties of its underlying arbitrary structure.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the past few years, neural networks have evolved from simple Feedforward Neural Networks to more complex neural networks, such as Convolutional Neural Networks and Recurrent Neural Networks. Where CNNs are a perfect fit for tasks where the sequence is not important such as image recognition, RNNs are useful when order is important such as machine translation. An increasing number of layers in a neural network is one way to improve its performance, but it also increases its complexity making it much more time and power-consuming to train. One way to tackle this problem is to introduce sparsity in the architecture of the neural network. Pruning is one of the many methods to make a neural network architecture sparse by clipping out weights below a certain threshold while keeping the performance near to the original. Another way is to generate arbitrary structures using random graphs and embed them between an input and output layer of an Artificial Neural Network. Many researchers in past years have focused on pruning mainly CNNs, while hardly any research is done for the same in RNNs. The same also holds in creating sparse architectures for RNNs by generating and embedding arbitrary structures. Therefore, this thesis focuses on investigating the effects of the before-mentioned two techniques on the performance of RNNs. We first describe the pruning of RNNs, its impact on the performance of RNNs, and the number of training epochs required to regain accuracy after the pruning is performed. Next, we continue with the creation and training of Sparse Recurrent Neural Networks and identify the relation between the performance and the graph properties of its underlying arbitrary structure. We perform these experiments on RNN with Tanh nonlinearity (RNN-Tanh), RNN with ReLU nonlinearity (RNN-ReLU), GRU, and LSTM. Finally, we analyze and discuss the results achieved from both the experiments.
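As a rough illustration of the threshold-based pruning the abstract describes, the sketch below zeroes every weight whose magnitude falls below a cutoff and reports the resulting sparsity. It is not the thesis's implementation: the function names `prune_by_magnitude` and `sparsity`, the example matrix, the threshold value, and the use of absolute value as the pruning criterion are all assumptions made for illustration.

```python
def prune_by_magnitude(weights, threshold):
    """Return a copy of `weights` with entries of magnitude < threshold set to 0.0.

    Assumption: "below a certain threshold" is read as |w| < threshold.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def sparsity(weights):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


# Hypothetical 3x3 hidden-to-hidden weight matrix of a small RNN.
recurrent_weights = [
    [0.80, -0.05, 0.30],
    [0.02, -0.60, 0.10],
    [-0.40, 0.07, 0.90],
]

pruned = prune_by_magnitude(recurrent_weights, threshold=0.1)
print(pruned)
print(f"sparsity: {sparsity(pruned):.2f}")
```

In practice the pruned network would then be retrained for some epochs to regain accuracy, which is the quantity the thesis measures; that step is omitted here.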

Related papers

NN-Former: Rethinking Graph Structure in Neural Architecture Representation [67.3378579108611]
Graph Neural Networks (GNNs) and transformers have shown promising performance in representing neural architectures. We show that sibling nodes are pivotal while overlooked in previous research. Our approach consistently achieves promising performance in both accuracy and latency prediction.
arXiv Detail & Related papers (2025-07-01T15:46:18Z)
CogniSNN: A First Exploration to Random Graph Architecture based Spiking Neural Networks with Enhanced Expandability and Neuroplasticity [8.24896024250985]
This paper develops a new modeling paradigm for spiking neural networks (SNNs) with random graph architecture (RGA). We improve the expandability and neuroplasticity of CogniSNN by introducing a modified spiking residual neural node (ResNode). Experiments show that CogniSNN with re-designed ResNode performs outstandingly in neuromorphic datasets with fewer parameters.
arXiv Detail & Related papers (2025-05-09T12:21:23Z)
Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection. Our findings show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z)
CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation. The performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z)
Random-coupled Neural Network [17.53731608985241]
The pulse-coupled neural network (PCNN) is a widely applied model for imitating the characteristics of the human brain in computer vision and neural network fields. In this study, the random-coupled neural network (RCNN) is proposed. It overcomes difficulties in PCNN's neuromorphic computing via a random inactivation process.
arXiv Detail & Related papers (2024-03-26T09:13:06Z)
A Hybrid Neural Coding Approach for Pattern Recognition with Spiking Neural Networks [53.31941519245432]
Brain-inspired spiking neural networks (SNNs) have demonstrated promising capabilities in solving pattern recognition tasks. These SNNs are grounded on homogeneous neurons that utilize a uniform neural coding for information representation. In this study, we argue that SNN architectures should be holistically designed to incorporate heterogeneous coding schemes.
arXiv Detail & Related papers (2023-05-26T02:52:12Z)
Heterogeneous Recurrent Spiking Neural Network for Spatio-Temporal Classification [13.521272923545409]
Spiking Neural Networks are often touted as brain-inspired learning models for the third wave of Artificial Intelligence. This paper presents a heterogeneous recurrent spiking neural network (HRSNN) with unsupervised learning for video recognition tasks. We show that HRSNN can achieve similar performance to state-of-the-art backpropagation-trained supervised SNNs, but with less computation.
arXiv Detail & Related papers (2022-09-22T16:34:01Z)
Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware. It is a challenge to efficiently train SNNs due to their non-differentiability. We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
Mining the Weights Knowledge for Optimizing Neural Network Structures [1.995792341399967]
We introduce a switcher neural network (SNN) that uses as inputs the weights of a task-specific neural network (called TNN for short). By mining the knowledge contained in the weights, the SNN outputs scaling factors for turning off neurons in the TNN. In terms of accuracy, we outperform baseline networks and other structure learning methods stably and significantly.
arXiv Detail & Related papers (2021-10-11T05:20:56Z)
Pruning of Deep Spiking Neural Networks through Gradient Rewiring [41.64961999525415]
Spiking Neural Networks (SNNs) have attracted great interest due to their biological plausibility and high energy efficiency on neuromorphic chips. Most existing methods directly apply pruning approaches from artificial neural networks (ANNs) to SNNs, which ignores the differences between ANNs and SNNs. We propose gradient rewiring (Grad R), a joint learning algorithm of connectivity and weight for SNNs, that enables us to seamlessly optimize network structure without retraining.
arXiv Detail & Related papers (2021-05-11T10:05:53Z)
Combining Spiking Neural Network and Artificial Neural Network for Enhanced Image Classification [1.8411688477000185]
Spiking neural networks (SNNs) that more closely resemble biological brain synapses have attracted attention owing to their low power consumption. We build versatile hybrid neural networks (HNNs) that improve the concerned performance.
arXiv Detail & Related papers (2021-02-21T12:03:16Z)
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution. Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.