Fusion Recurrent Neural Network
- URL: http://arxiv.org/abs/2006.04069v1
- Date: Sun, 7 Jun 2020 07:39:49 GMT
- Title: Fusion Recurrent Neural Network
- Authors: Yiwen Sun, Yulu Wang, Kun Fu, Zheng Wang, Changshui Zhang, Jieping Ye
- Abstract summary: We propose a novel, succinct and promising RNN: the Fusion Recurrent Neural Network (Fusion RNN).
Fusion RNN is composed of a Fusion module and a Transport module applied at every time step.
To evaluate Fusion RNN's sequence feature extraction capability, we choose a representative data mining task for sequence data, estimated time of arrival (ETA), and present a novel model based on Fusion RNN.
- Score: 88.5550074808201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Considering deep sequence learning for practical applications, two
representative RNNs, LSTM and GRU, may come to mind first. Nevertheless, is
there no chance for other RNNs? Will there be a better RNN in the future? In
this work, we propose a novel, succinct and promising RNN: the Fusion Recurrent
Neural Network (Fusion RNN). Fusion RNN is composed of a Fusion module and a
Transport module applied at every time step. The Fusion module performs
multi-round fusion of the input and the hidden state vector. The Transport
module, which is essentially a simple recurrent network, calculates the hidden
state and passes it on to the next time step. Furthermore, in order to evaluate
Fusion RNN's sequence feature extraction capability, we choose a representative
data mining task for sequence data, estimated time of arrival (ETA), and
present a novel model based on Fusion RNN. We compare our method with other
RNN variants for ETA on massive vehicle travel data from DiDi Chuxing. The
results demonstrate that, for ETA, Fusion RNN is comparable to the
state-of-the-art LSTM and GRU, which are more complicated than Fusion RNN.
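The abstract describes the cell only at a high level, so the following is a minimal PyTorch sketch of how such a cell could be wired together; the per-round linear maps, the tanh nonlinearities, and the fusion_rounds parameter are assumptions for illustration, not the paper's actual equations.

```python
import torch
import torch.nn as nn

class FusionRNNCell(nn.Module):
    """Illustrative cell: a multi-round Fusion stage followed by a
    simple-recurrent Transport stage (hypothetical parameterization)."""

    def __init__(self, input_size, hidden_size, fusion_rounds=2):
        super().__init__()
        # One pair of linear maps per fusion round (assumed; the paper's
        # exact equations may differ).
        self.fuse_x = nn.ModuleList(
            [nn.Linear(input_size, input_size) for _ in range(fusion_rounds)])
        self.fuse_h = nn.ModuleList(
            [nn.Linear(hidden_size, input_size) for _ in range(fusion_rounds)])
        # Transport module: a plain simple-recurrent update.
        self.transport = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h):
        # Fusion module: repeatedly mix the input with the hidden state.
        for wx, wh in zip(self.fuse_x, self.fuse_h):
            x = torch.tanh(wx(x) + wh(h))
        # Transport module: compute the hidden state for the next time step.
        return torch.tanh(self.transport(torch.cat([x, h], dim=-1)))

# Usage: run the cell over a toy sequence of shape (T, batch, input_size).
cell = FusionRNNCell(input_size=16, hidden_size=32)
seq = torch.randn(10, 4, 16)
h = torch.zeros(4, 32)
for x_t in seq:
    h = cell(x_t, h)
```

The Transport stage is deliberately kept as a plain simple-recurrent update, matching the abstract's description of it as essentially a simple recurrent network.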
Related papers
- Use of Parallel Explanatory Models to Enhance Transparency of Neural Network Configurations for Cell Degradation Detection [18.214293024118145]
We build a parallel model to illuminate and understand the internal operation of neural networks.
We show how each layer of the RNN transforms the input distributions to increase detection accuracy.
At the same time, we also discover a side effect that acts to limit the improvement in accuracy.
arXiv Detail & Related papers (2024-04-17T12:22:54Z)
- Learning Useful Representations of Recurrent Neural Network Weight Matrices [30.583752432727326]
Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers.
How can we learn useful representations of RNN weights that facilitate RNN analysis as well as downstream tasks?
We consider several mechanistic approaches for RNN weights and adapt the permutation equivariant Deep Weight Space layer for RNNs.
Our two novel functionalist approaches extract information from RNN weights by 'interrogating' the RNN through probing inputs.
arXiv Detail & Related papers (2024-03-18T17:32:23Z)
- On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks [59.85314067235965]
We extend the theoretical foundation for the second-order recurrent network (2nd-order RNN).
We prove that there exists a class of 2nd-order RNNs that is Turing-complete with bounded time.
We also demonstrate that 2nd-order RNNs, without memory, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars.
arXiv Detail & Related papers (2023-09-26T06:06:47Z)
- Multi-blank Transducers for Speech Recognition [49.6154259349501]
In our proposed method, we introduce additional blank symbols, which consume two or more input frames when emitted.
We refer to the added symbols as big blanks, and to the method as multi-blank RNN-T.
With experiments on multiple languages and datasets, we show that multi-blank RNN-T methods can bring relative speedups of over +90%/+139%.
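As a rough illustration of where the speedup comes from, below is a hypothetical greedy decoding loop in which each blank symbol carries a duration, so emitting a big blank advances the frame index by several steps; the predict, joint, and blank_durations names are placeholders, not the paper's implementation.

```python
import numpy as np

def greedy_multi_blank_decode(enc, predict, joint, blank_durations, max_symbols=10):
    """Hypothetical greedy transducer decoding loop.

    enc: (T, D) encoder outputs; predict/joint: placeholder callables for the
    prediction and joint networks; blank_durations: dict {blank_token_id: frames}.
    Emitting a big blank advances the frame index by more than one frame,
    which is where the decoding speedup comes from.
    """
    hyp, state, t, T = [], None, 0, enc.shape[0]
    while t < T:
        for _ in range(max_symbols):            # cap non-blank emissions per frame
            g, new_state = predict(hyp, state)
            token = int(np.argmax(joint(enc[t], g)))
            if token in blank_durations:
                t += blank_durations[token]     # big blanks skip several frames
                break
            hyp.append(token)
            state = new_state
        else:
            t += 1                              # safety: force progress
    return hyp
```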
arXiv Detail & Related papers (2022-11-04T16:24:46Z)
- Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing [74.31472195046099]
We exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN.
A hybrid model combining LR-TT-DNN with a convolutional neural network (CNN) is set up to boost the performance.
Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform the TT-DNN and CNN+(TT-DNN) counterparts.
arXiv Detail & Related papers (2022-03-11T15:55:34Z)
- Fully Spiking Variational Autoencoder [66.58310094608002]
Spiking neural networks (SNNs) can be run on neuromorphic devices with ultra-high speed and ultra-low energy consumption.
In this study, we build a variational autoencoder (VAE) with SNN to enable image generation.
arXiv Detail & Related papers (2021-09-26T06:10:14Z)
- DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion [28.03712082540713]
This paper proposes a novel and extensive loop fusion framework called DNNFusion.
DNNFusion finds up to 8.8x higher fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with 9.3x speedup.
The memory requirement reduction and speedups can enable the execution of many of the target models on mobile devices and even make them part of a real-time application.
arXiv Detail & Related papers (2021-08-30T16:11:38Z)
- Recurrent Neural Network from Adder's Perspective: Carry-lookahead RNN [9.20540910698296]
We discuss the similarities between recurrent neural network (RNN) and serial adder.
Inspired by the carry-lookahead adder, we introduce a carry-lookahead module into the RNN, which makes it possible for the RNN to run in parallel.
arXiv Detail & Related papers (2021-06-22T12:28:33Z)
- MomentumRNN: Integrating Momentum into Recurrent Neural Networks [32.40217829362088]
We show that MomentumRNNs alleviate the vanishing gradient issue in training RNNs.
MomentumRNN is applicable to many types of recurrent cells, including those in the state-of-the-art RNNs.
We show that other advanced momentum-based optimization methods, such as Adam and Nesterov accelerated gradients with a restart, can be easily incorporated into the MomentumRNN framework.
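A minimal sketch of one way to add momentum to a recurrent cell, in the spirit of this line of work: a velocity state accumulates the input-driven term before the usual hidden-state update. The update equations, the coefficients mu and s, and the tanh nonlinearity below are illustrative assumptions rather than the paper's exact cell.

```python
import torch
import torch.nn as nn

class MomentumCell(nn.Module):
    """Sketch of a momentum-augmented recurrent cell:
    v_t = mu * v_{t-1} + s * W x_t,   h_t = tanh(U h_{t-1} + v_t)."""

    def __init__(self, input_size, hidden_size, mu=0.6, s=0.6):
        super().__init__()
        self.W = nn.Linear(input_size, hidden_size, bias=False)
        self.U = nn.Linear(hidden_size, hidden_size)
        self.mu, self.s = mu, s

    def forward(self, x, h, v):
        v = self.mu * v + self.s * self.W(x)   # momentum on the input drive
        h = torch.tanh(self.U(h) + v)          # otherwise a vanilla RNN update
        return h, v
```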
arXiv Detail & Related papers (2020-06-12T03:02:29Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
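As a very rough sketch of what layer-wise fusion can look like, the snippet below matches the neurons of one layer to those of the corresponding layer in a second model and averages the aligned weights; it uses a hard assignment (scipy's linear_sum_assignment) as a stand-in for the paper's optimal-transport alignment, and the fuse_two_layers helper is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_two_layers(Wa, Wb):
    """Hypothetical helper: align layer B's neurons to layer A's and average.

    Wa, Wb: (n_out, n_in) weight matrices of the same layer in two models.
    A hard assignment on squared weight-vector distances stands in for the
    paper's optimal-transport alignment.
    """
    cost = ((Wa[:, None, :] - Wb[None, :, :]) ** 2).sum(axis=-1)
    _, cols = linear_sum_assignment(cost)
    return 0.5 * (Wa + Wb[cols])   # average the aligned neurons

# Usage on random stand-in weights:
Wa, Wb = np.random.randn(8, 4), np.random.randn(8, 4)
W_fused = fuse_two_layers(Wa, Wb)
```

A faithful implementation would also propagate the same alignment to the next layer's incoming weights and would typically use soft transport plans rather than a hard permutation.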
arXiv Detail & Related papers (2019-10-12T22:07:15Z)