ReLU and Addition-based Gated RNN
- URL: http://arxiv.org/abs/2308.05629v1
- Date: Thu, 10 Aug 2023 15:18:16 GMT
- Title: ReLU and Addition-based Gated RNN
- Authors: Rickard Brännvall, Henrik Forsgren, Fredrik Sandin and Marcus Liwicki
- Abstract summary: We replace the multiplication and sigmoid function of the conventional recurrent gate with addition and ReLU activation.
This mechanism is designed to maintain long-term memory for sequence processing but at a reduced computational cost.
- Score: 1.484528358552186
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We replace the multiplication and sigmoid function of the conventional
recurrent gate with addition and ReLU activation. This mechanism is designed to
maintain long-term memory for sequence processing but at a reduced
computational cost, thereby enabling more efficient execution or larger
models on restricted hardware. Recurrent Neural Networks (RNNs) with gating
mechanisms such as LSTM and GRU have been widely successful in learning from
sequential data due to their ability to capture long-term dependencies.
Conventionally, the update based on current inputs and the previous state
history are each multiplied by dynamic weights and combined to compute the
next state. However, multiplication can be computationally expensive,
especially for certain hardware architectures or alternative arithmetic systems
such as homomorphic encryption. It is demonstrated that the novel gating
mechanism can capture long-term dependencies for a standard synthetic sequence
learning task while significantly reducing computational costs such that
execution time is reduced by half on CPU and by one-third under encryption.
Experimental results on handwritten text recognition tasks furthermore show
that the proposed architecture can be trained to achieve comparable accuracy to
conventional GRU and LSTM baselines. The gating mechanism introduced in this
paper may enable privacy-preserving AI applications operating under homomorphic
encryption by avoiding the multiplication of encrypted variables. It can also
support quantization in (unencrypted) plaintext applications, with the
potential for substantial performance gains since the addition-based
formulation can avoid the expansion to double precision often required for
multiplication.
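To make the idea concrete, the following is a minimal NumPy sketch that contrasts a conventional sigmoid-and-multiplication gate with one plausible addition-and-ReLU alternative. The specific `relu_add_gate` formulation (a ReLU-derived gate that clips the per-step change of the state) is an assumption made for illustration; the paper's exact equations may differ.

```python
import numpy as np


def relu(x):
    """ReLU, the only nonlinearity used by the alternative gate below."""
    return np.maximum(x, 0.0)


def sigmoid_mul_gate(h_prev, u, z_logits):
    """Conventional gated update: a sigmoid gate z in (0, 1) blends the
    previous state and the candidate update via elementwise products."""
    z = 1.0 / (1.0 + np.exp(-z_logits))
    return z * h_prev + (1.0 - z) * u


def relu_add_gate(h_prev, u, z_logits):
    """Hypothetical addition/ReLU gate (an illustrative formulation, not
    the paper's exact equations): a non-negative gate g = ReLU(z_logits)
    bounds how far the state may move toward the candidate u in one step,
    using only additions, subtractions, and ReLU."""
    g = relu(z_logits)                # gate "opening", always >= 0
    delta = u - h_prev                # proposed change to the state
    delta = delta + relu(-g - delta)  # clip from below: max(delta, -g)
    delta = delta - relu(delta - g)   # clip from above: min(delta,  g)
    return h_prev + delta             # g == 0 keeps the state unchanged


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h_prev, u, z_logits = rng.normal(size=(3, 4))
    print("sigmoid/multiply gate:", sigmoid_mul_gate(h_prev, u, z_logits))
    print("ReLU/addition gate:   ", relu_add_gate(h_prev, u, z_logits))
```

In this sketch, a negative gate pre-activation yields g = 0, so the state is carried forward unchanged, mimicking a closed gate and preserving long-term memory without multiplying state-dependent quantities, which is the property that matters under homomorphic encryption or integer quantization.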
Related papers
- Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z)
- The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers [0.0]
We replace the dot-product and Softmax-based attention with an alternative mechanism involving addition and ReLU activation only.
This side-steps the expansion to double precision often required by matrix multiplication and avoids costly Softmax evaluations.
It can enable more efficient execution and support larger quantized Transformer models on resource-constrained hardware or alternative arithmetic systems like homomorphic encryption.
arXiv Detail & Related papers (2023-10-03T13:34:21Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformers architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- Low-Latency Online Multiplier with Reduced Activities and Minimized Interconnect for Inner Product Arrays [0.8078491757252693]
This paper proposes a low latency multiplier based on online or left-to-right arithmetic.
Online arithmetic enables overlapping successive operations regardless of data dependency.
The serial nature of the online algorithm and the gradual increment/decrement of active slices minimize the interconnects and signal activities.
arXiv Detail & Related papers (2023-04-06T01:22:27Z)
- Training Integer-Only Deep Recurrent Neural Networks [3.1829446824051195]
We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN).
Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions.
The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime.
arXiv Detail & Related papers (2022-12-22T15:22:36Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern Recognition on Neuromorphic Hardware [50.380319968947035]
Recent deep learning approaches have reached high accuracy on such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive.
We propose a new benchmark for computing tactile pattern recognition at the edge through Braille letter reading.
We trained and compared feed-forward and recurrent spiking neural networks (SNNs) offline using back-propagation through time with surrogate gradients, then deployed them on the Intel Loihi neuromorphic chip for efficient inference.
Our results show that the LSTM outperforms the recurrent SNN in terms of accuracy by 14%. However, the recurrent SNN on Loihi is 237 times more energy efficient.
arXiv Detail & Related papers (2022-05-30T14:30:45Z)
- Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training [18.521882534906972]
We propose to structure dropout patterns, by dropping out the same set of physical neurons within a batch, resulting in column (row) level hidden state sparsity.
We conduct experiments for three representative NLP tasks: language modelling on the PTB dataset, OpenNMT based machine translation using the IWSLT De-En and En-Vi datasets, and named entity recognition sequence labelling.
arXiv Detail & Related papers (2021-06-22T22:44:32Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- PAC-learning gains of Turing machines over circuits and neural networks [1.4502611532302039]
We study the potential gains in sample efficiency that the principle of minimum description length can bring.
We use Turing machines to represent universal models and circuits.
We highlight close relationships between classical open problems in Circuit Complexity and the tightness of these.
arXiv Detail & Related papers (2021-03-23T17:03:10Z)
- Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units [68.30422112784355]
We propose a new gating mechanism within general gated recurrent neural networks to handle this issue.
The proposed gates directly short connect the extracted input features to the outputs of vanilla gates.
We verify the proposed gating mechanism on three popular types of gated RNNs including LSTM, GRU and MGU.
arXiv Detail & Related papers (2020-02-26T07:51:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.