LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural
Networks
- URL: http://arxiv.org/abs/2301.04794v1
- Date: Thu, 12 Jan 2023 03:39:59 GMT
- Title: LiteLSTM Architecture Based on Weights Sharing for Recurrent Neural
Networks
- Authors: Nelly Elsayed, Zag ElSayed, Anthony S. Maida
- Abstract summary: Long short-term memory (LSTM) is one of the robust recurrent neural network architectures for learning sequential data.
This paper proposes a novel LiteLSTM architecture based on reducing the LSTM computation components via the weight-sharing concept.
The proposed LiteLSTM achieves accuracy comparable to other state-of-the-art recurrent architectures while using a smaller computation budget.
- Score: 1.1602089225841632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long short-term memory (LSTM) is one of the robust recurrent neural
network architectures for learning sequential data. However, it requires
considerable computational power to train and deploy, in both its software and
hardware aspects. This paper proposes a novel LiteLSTM architecture that
reduces the LSTM computation components via the weight-sharing concept,
lowering the overall computation cost while maintaining the architecture's
performance. The proposed LiteLSTM can be significant for processing large
data where time consumption is crucial and hardware resources are limited,
such as the security of IoT devices and medical data processing. The proposed
model was evaluated and tested empirically on three datasets from the computer
vision, cybersecurity, and speech emotion recognition domains. The proposed
LiteLSTM achieves accuracy comparable to other state-of-the-art recurrent
architectures while using a smaller computation budget.
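The weight-sharing idea can be illustrated with a minimal recurrent cell in which a single input-to-hidden and hidden-to-hidden weight pair is reused by all four LSTM gates, with only the biases kept separate. The abstract does not specify LiteLSTM's exact sharing scheme, so this is an assumed sketch, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SharedGateLSTMCell:
    """Illustrative LSTM cell whose gates share one weight pair
    (a sketch of weight sharing; LiteLSTM's actual scheme may differ)."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        scale = 1.0 / np.sqrt(hidden_size)
        # One shared input->hidden and hidden->hidden matrix replaces
        # the four per-gate matrix pairs of a standard LSTM.
        self.W = rng.uniform(-scale, scale, (hidden_size, input_size))
        self.U = rng.uniform(-scale, scale, (hidden_size, hidden_size))
        # Per-gate biases stay separate so the gates can still differ.
        self.b_i, self.b_f, self.b_o, self.b_g = (
            np.zeros(hidden_size) for _ in range(4))

    def step(self, x, h, c):
        z = self.W @ x + self.U @ h      # computed once, reused by all gates
        i = sigmoid(z + self.b_i)        # input gate
        f = sigmoid(z + self.b_f)        # forget gate
        o = sigmoid(z + self.b_o)        # output gate
        g = np.tanh(z + self.b_g)        # candidate cell state
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new
```

With hidden size H and input size I, this cell stores H·I + H·H shared weights plus four bias vectors, versus 4·(H·I + H·H + H) parameters in a standard LSTM, which is the kind of computation-budget reduction the abstract describes.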
Related papers
- A Single Transformer for Scalable Vision-Language Modeling [74.05173379908703]
We present SOLO, a single transformer for visiOn-Language mOdeling.
A unified single Transformer architecture, like SOLO, effectively addresses the scalability concerns of large vision-language models (LVLMs).
In this paper, we introduce the first open-source training recipe for developing SOLO, an open-source 7B LVLM.
arXiv Detail & Related papers (2024-07-08T22:40:15Z)
- A Novel Quantum LSTM Network [2.938337278931738]
This paper introduces the Quantum LSTM (qLSTM) model, which integrates quantum computing principles with traditional LSTM networks.
Our qLSTM model aims to address the limitations of traditional LSTMs, providing a robust framework for more efficient and effective sequential data processing.
arXiv Detail & Related papers (2024-06-13T10:26:14Z)
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find the mechanistic architecture design (MAD) synthetic tasks to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
- Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition [22.502146009817416]
Long short-term memory (LSTM) is a powerful deep neural network that has been widely used in sequence analysis and modeling applications.
In this paper, we propose to perform algorithm and hardware co-design towards high-performance energy-efficient LSTM networks.
arXiv Detail & Related papers (2022-12-05T05:51:56Z)
- Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing [5.516431145236317]
Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems.
In split computing, neural network models are separated and cooperatively processed using edge servers and IoT devices via networks.
This paper proposes a neural architecture search (NAS) method for split computing.
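The split-computing setup described above can be sketched as running the first few layers of a model on the IoT device and the remaining layers on an edge server, with the intermediate activations being what would travel over the network. The 4-layer MLP and split point below are hypothetical, chosen only for illustration:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Hypothetical 4-layer network; weights are random stand-ins.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 16)) * 0.1 for _ in range(4)]

def run(layer_subset, x):
    """Apply a contiguous slice of the network's layers."""
    for W in layer_subset:
        x = relu(W @ x)
    return x

split = 2  # layers [0, split) run on the device, the rest on the server
x = rng.standard_normal(16)

features = run(layers[:split], x)        # on-device head
y_split = run(layers[split:], features)  # server-side tail (features are transmitted)
y_full = run(layers, x)                  # monolithic reference model
```

Because the partition only relocates where each layer executes, the split pipeline produces exactly the same output as the monolithic model; the NAS problem in the paper is then choosing an architecture and split point that balance on-device latency, transmission size, and accuracy.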
arXiv Detail & Related papers (2022-08-30T03:15:43Z)
- LiteLSTM Architecture for Deep Recurrent Neural Networks [1.1602089225841632]
Long short-term memory (LSTM) is a robust recurrent neural network architecture for learning temporal data.
This paper proposes a novel LiteLSTM architecture based on reducing the components of the LSTM using the weights sharing concept.
The proposed LiteLSTM can be significant for learning big data where time consumption is crucial.
arXiv Detail & Related papers (2022-01-27T16:33:02Z)
- Improving Deep Learning for HAR with shallow LSTMs [70.94062293989832]
We propose to alter the DeepConvLSTM to employ a 1-layered instead of a 2-layered LSTM.
Our results stand in contrast to the belief that one needs at least a 2-layered LSTM when dealing with sequential data.
arXiv Detail & Related papers (2021-08-02T08:14:59Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
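The DCT-based partitioning can be sketched in one dimension: transform the input, zero coefficients above or below a cutoff, and invert, yielding a low-frequency part (assigned cheap operations) and a high-frequency part (assigned expensive ones). The cutoff value and the plain-NumPy DCT here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th frequency component."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.cos(np.pi * (m + 0.5) * k / n)
    c[0] *= np.sqrt(1.0 / n)   # DC row scaling for orthonormality
    c[1:] *= np.sqrt(2.0 / n)  # remaining rows
    return c

def split_by_frequency(x, cutoff):
    """Split x into low- and high-frequency parts in the DCT domain."""
    c = dct_matrix(len(x))
    coeffs = c @ x
    low = coeffs.copy();  low[cutoff:] = 0.0
    high = coeffs.copy(); high[:cutoff] = 0.0
    # The inverse of an orthonormal transform is its transpose.
    return c.T @ low, c.T @ high

rng = np.random.default_rng(1)
x = np.sin(np.linspace(0, 4 * np.pi, 32)) + 0.1 * rng.standard_normal(32)
lo, hi = split_by_frequency(x, cutoff=8)  # hypothetical cutoff
```

Since the transform is orthonormal, the two parts sum back to the original input exactly, so routing them through different-cost branches loses no information at the split itself.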
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS)
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of Boston house-price prediction and of training a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
- Near-Optimal Hardware Design for Convolutional Neural Networks [0.0]
This study proposes a novel, special-purpose, and high-efficiency hardware architecture for convolutional neural networks.
The proposed architecture maximizes the utilization of multipliers by designing the computational circuit with the same structure as that of the computational flow of the model.
An implementation based on the proposed hardware architecture has been applied in commercial AI products.
arXiv Detail & Related papers (2020-02-06T09:15:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.