LiteLSTM Architecture for Deep Recurrent Neural Networks
- URL: http://arxiv.org/abs/2201.11624v1
- Date: Thu, 27 Jan 2022 16:33:02 GMT
- Title: LiteLSTM Architecture for Deep Recurrent Neural Networks
- Authors: Nelly Elsayed, Zag ElSayed, Anthony S. Maida
- Abstract summary: Long short-term memory (LSTM) is a robust recurrent neural network architecture for learning spatiotemporal sequential data.
This paper proposes a novel LiteLSTM architecture based on reducing the computation components of the LSTM via the weight-sharing concept.
The proposed LiteLSTM can be significant for learning big data where time consumption is critical.
- Score: 1.1602089225841632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long short-term memory (LSTM) is a robust recurrent neural network
architecture for learning spatiotemporal sequential data. However, it requires
significant computational power to train and deploy, from both the software and
hardware perspectives. This paper proposes a novel LiteLSTM architecture based
on reducing the computation components of the LSTM via weight sharing, lowering
the overall cost of the architecture while maintaining its performance. The
proposed LiteLSTM can be significant for learning big data where time
consumption is critical, such as in IoT device security and medical data
analysis. Moreover, it helps reduce the CO2 footprint. The proposed model was
evaluated empirically on two datasets from the computer vision and
cybersecurity domains.
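The paper does not include a reference implementation; the following is a minimal sketch of the weight-sharing idea as the abstract describes it, assuming a single input/recurrent weight pair shared across all four LSTM gates with only the biases kept gate-specific. The class name, the exact sharing scheme, and the dimensions are illustrative assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class WeightSharedLSTMCell(nn.Module):
    """Illustrative LSTM cell with one shared input/recurrent weight pair.

    A standard LSTM learns separate (W, U, b) per gate (input, forget,
    output, candidate): 4 * (d_in*d_h + d_h*d_h + d_h) parameters. Here a
    single (W, U) pair is shared by all gates and only the biases remain
    gate-specific -- one possible reading of the weight-sharing concept,
    not necessarily the paper's exact scheme.
    """

    def __init__(self, d_in: int, d_h: int):
        super().__init__()
        self.W = nn.Linear(d_in, d_h, bias=False)   # shared input weights
        self.U = nn.Linear(d_h, d_h, bias=False)    # shared recurrent weights
        self.b = nn.Parameter(torch.zeros(4, d_h))  # per-gate biases

    def forward(self, x, state):
        h, c = state
        s = self.W(x) + self.U(h)         # one shared pre-activation
        i = torch.sigmoid(s + self.b[0])  # input gate
        f = torch.sigmoid(s + self.b[1])  # forget gate
        o = torch.sigmoid(s + self.b[2])  # output gate
        g = torch.tanh(s + self.b[3])     # candidate cell state
        c = f * c + i * g                 # update cell state
        h = o * torch.tanh(c)             # update hidden state
        return h, c
```

With d_in = 32 and d_h = 64, this cell holds 32*64 + 64*64 + 4*64 = 6,400 parameters, versus 4*(32*64 + 64*64 + 64) = 24,832 for a standard LSTM cell, which is the kind of computation-budget reduction the abstract targets.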
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the resource constraints of IoVT systems by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Implementation Guidelines and Innovations in Quantum LSTM Networks [2.938337278931738]
This paper presents a theoretical analysis and an implementation plan for a Quantum LSTM model, which seeks to integrate quantum computing principles with traditional LSTM networks.
The actual architecture and its practical effectiveness in enhancing sequential data processing remain to be developed and demonstrated in future work.
arXiv Detail & Related papers (2024-06-13T10:26:14Z)
- LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural Networks [1.1602089225841632]
Long short-term memory (LSTM) is one of the most robust recurrent neural network architectures for learning sequential data.
This paper proposes a novel LiteLSTM architecture based on reducing the LSTM computation components via the weight-sharing concept.
The proposed LiteLSTM achieves accuracy comparable to other state-of-the-art recurrent architectures while using a smaller computation budget.
arXiv Detail & Related papers (2023-01-12T03:39:59Z)
- Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition [22.502146009817416]
Long short-term memory (LSTM) is a powerful deep neural network that has been widely used in sequence analysis and modeling applications.
In this paper, we propose to perform algorithm and hardware co-design towards high-performance energy-efficient LSTM networks.
arXiv Detail & Related papers (2022-12-05T05:51:56Z)
- Improving Deep Learning for HAR with shallow LSTMs [70.94062293989832]
We propose to alter the DeepConvLSTM to employ a 1-layered instead of a 2-layered LSTM.
Our results stand in contrast to the belief that one needs at least a 2-layered LSTM when dealing with sequential data.
arXiv Detail & Related papers (2021-08-02T08:14:59Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network that divides the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency parts are processed with expensive operations, while the low-frequency parts are assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures. (A minimal sketch of the DCT-based split appears after this list.)
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses, and visualization of the latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
- Object Tracking through Residual and Dense LSTMs [67.98948222599849]
Trackers based on LSTM (Long Short-Term Memory) recurrent neural networks have emerged as a powerful deep learning alternative.
DenseLSTMs outperform residual and regular LSTMs and offer higher resilience to nuisances.
Our case study supports the adoption of residual-based RNNs for enhancing the robustness of other trackers.
arXiv Detail & Related papers (2020-06-22T08:20:17Z)
- Stacked Bidirectional and Unidirectional LSTM Recurrent Neural Network for Forecasting Network-wide Traffic State with Missing Values [23.504633202965376]
We focus on RNN-based models and reformulate how RNNs and their variants are incorporated into traffic prediction models.
A stacked bidirectional and unidirectional LSTM network architecture (SBU-LSTM) is proposed to assist the design of neural network structures for traffic state forecasting.
We also propose a data imputation mechanism in the LSTM structure (LSTM-I) by designing an imputation unit to infer missing values and assist traffic prediction.
arXiv Detail & Related papers (2020-05-24T00:17:15Z)
- Sentiment Analysis Using Simplified Long Short-term Memory Recurrent Neural Networks [1.5146765382501612]
We perform sentiment analysis on a GOP Debate Twitter dataset.
To speed up training and reduce computational cost and time, six parameter-reduced slim versions of the LSTM model are proposed.
arXiv Detail & Related papers (2020-05-08T12:50:10Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is demonstrated through simulations of predicting Boston housing prices and training a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
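For the frequency-aware dynamic network entry above, here is a minimal sketch of the routing idea: split an input by its discrete cosine transform (DCT) coefficients and spend expensive computation only on the high-frequency part. The index-sum cutoff and the stand-in cheap/expensive branches are assumptions for illustration, not the paper's actual operators.

```python
import numpy as np
from scipy.fft import dctn, idctn

def frequency_split(patch: np.ndarray, cutoff: int = 8):
    """Split a patch into low- and high-frequency parts in the DCT domain.

    Coefficients whose row + column index sum is below `cutoff` count as
    low frequency; the rest count as high frequency. The cutoff value is
    an illustrative assumption, not one taken from the paper.
    """
    coeffs = dctn(patch, norm="ortho")
    rows, cols = np.indices(coeffs.shape)
    low_mask = (rows + cols) < cutoff
    low = idctn(np.where(low_mask, coeffs, 0.0), norm="ortho")
    high = idctn(np.where(low_mask, 0.0, coeffs), norm="ortho")
    return low, high

def cheap_branch(x):
    # Stand-in for an inexpensive operator (e.g. a light linear map).
    return 0.5 * x

def expensive_branch(x):
    # Stand-in for a costly operator (e.g. a deep convolutional branch).
    return x

patch = np.random.rand(16, 16).astype(np.float32)
low, high = frequency_split(patch)
out = cheap_branch(low) + expensive_branch(high)  # compute goes to the detail
```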
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.