An Improved Time Feedforward Connections Recurrent Neural Networks
- URL: http://arxiv.org/abs/2211.02561v1
- Date: Thu, 3 Nov 2022 09:32:39 GMT
- Title: An Improved Time Feedforward Connections Recurrent Neural Networks
- Authors: Jin Wang, Yongsong Zou, Se-Jung Lim
- Abstract summary: Recurrent Neural Networks (RNNs) have been widely applied to temporal problems such as flood forecasting and financial data processing.
Traditional RNN models amplify the gradient issue due to their strict serial time dependency.
An improved Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs) model is first proposed to address the gradient issue.
A novel cell structure named Single Gate Recurrent Unit (SGRU) is presented to reduce the number of parameters in the RNN cell.
- Score: 3.0965505512285967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent Neural Networks (RNNs) have been widely applied to
temporal problems such as flood forecasting and financial data processing. On
the one hand, traditional RNN models amplify the gradient issue due to their
strict serial time dependency, making it difficult to realize long-term
memory. On the other hand, RNN cells are highly complex, which significantly
increases computational complexity and wastes computational resources during
model training. In this paper, an improved Time Feedforward Connections
Recurrent Neural Networks (TFC-RNNs) model is first proposed to address the
gradient issue. A parallel branch is introduced so that the hidden state at
time t-2 can be transferred directly to time t without passing through the
nonlinear transformation at time t-1, which effectively improves the ability
of RNNs to capture long-term dependencies. Then, a novel cell structure named
Single Gate Recurrent Unit (SGRU) is presented, which reduces the number of
parameters in the RNN cell and consequently the computational complexity.
Next, the SGRU is applied within the TFC-RNNs framework, yielding a new
TFC-SGRU model that addresses both difficulties. Finally, the performance of
the proposed TFC-SGRU is verified through several experiments on long-term
memory and anti-interference capabilities. Experimental results demonstrate
that the proposed TFC-SGRU model can capture useful information across a time
step of 1500 and effectively filter out noise. The TFC-SGRU also achieves
better accuracy than the LSTM and GRU models on language processing tasks.
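To make the abstract concrete, the sketch below gives one possible PyTorch reading of its two ideas: a GRU-like cell with a single gate, unrolled with a parallel branch that adds the hidden state from time t-2 directly at step t. The abstract does not specify the exact SGRU equations or how the skipped state is merged, so the gate layout, the additive skip, and the skip_weight parameter here are assumptions for illustration only.

```python
# Illustrative sketch only: the exact TFC-SGRU equations are not given in the
# abstract; the single-gate layout and the additive t-2 skip are assumptions.
import torch
import torch.nn as nn


class SGRUCellSketch(nn.Module):
    """GRU-like cell with a single gate (assumed form, fewer parameters than a GRU)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)       # single gate z_t
        self.candidate = nn.Linear(input_size + hidden_size, hidden_size)  # candidate state

    def forward(self, x_t, h_prev):
        xh = torch.cat([x_t, h_prev], dim=-1)
        z = torch.sigmoid(self.gate(xh))          # one gate instead of the GRU's two
        h_tilde = torch.tanh(self.candidate(xh))  # candidate hidden state
        return (1.0 - z) * h_prev + z * h_tilde   # gated update


class TFCSGRUSketch(nn.Module):
    """Unrolls the cell and adds a parallel branch that passes h_{t-2} to step t
    without the nonlinear transformation at t-1 (the time feedforward connection)."""

    def __init__(self, input_size, hidden_size, skip_weight=0.5):
        super().__init__()
        self.cell = SGRUCellSketch(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.skip_weight = skip_weight  # assumed way of mixing in the skipped state

    def forward(self, x):  # x: (batch, time, input_size)
        batch, steps, _ = x.shape
        h_prev = x.new_zeros(batch, self.hidden_size)   # h_{t-1}
        h_prev2 = x.new_zeros(batch, self.hidden_size)  # h_{t-2}
        outputs = []
        for t in range(steps):
            h_new = self.cell(x[:, t], h_prev)
            # Parallel branch: h_{t-2} reaches step t directly, bypassing step t-1.
            h_new = h_new + self.skip_weight * h_prev2
            outputs.append(h_new)
            h_prev2, h_prev = h_prev, h_new
        return torch.stack(outputs, dim=1)


if __name__ == "__main__":
    model = TFCSGRUSketch(input_size=8, hidden_size=16)
    y = model(torch.randn(4, 1500, 8))  # long sequence, matching the paper's 1500-step setting
    print(y.shape)  # torch.Size([4, 1500, 16])
```

Because the t-2 branch is a linear addition, gradients can flow across two time steps without passing through an extra nonlinearity, which is the mechanism the abstract credits for the improved long-term dependence.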
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the computational time and space complexities from cubic and quadratic in the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - Were RNNs All We Needed? [53.393497486332]
We revisit traditional recurrent neural networks (RNNs) from over a decade ago.
We show that by removing the hidden-state dependencies from their input, forget, and update gates, LSTMs and GRUs no longer require backpropagation through time (BPTT) and can be trained efficiently in parallel (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-10-02T03:06:49Z) - Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate [16.4160685571157]
Recurrent Neural Networks (RNNs) are widely recognized for their proficiency in modeling temporal dependencies.
This paper proposes a novel Delayed Memory Unit (DMU) for gated RNNs.
The DMU incorporates a delay line structure along with delay gates into the vanilla RNN, thereby enhancing temporal interaction and facilitating temporal credit assignment.
arXiv Detail & Related papers (2023-10-23T14:29:48Z) - Time-Parameterized Convolutional Neural Networks for Irregularly Sampled
Time Series [26.77596449192451]
Irregularly sampled time series are ubiquitous in several application domains, leading to sparse, incompletely observed, and non-aligned observations.
Standard sequential models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) assume regular spacing between observation times, posing significant challenges to irregular time series modeling.
We parameterize convolutional layers with kernels that explicitly account for irregular observation times.
arXiv Detail & Related papers (2023-08-06T21:10:30Z) - Continuous time recurrent neural networks: overview and application to
forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z) - Oscillatory Fourier Neural Network: A Compact and Efficient Architecture
for Sequential Processing [16.69710555668727]
We propose a novel neuron model that has a cosine activation with a time-varying component for sequential processing.
The proposed neuron provides an efficient building block for projecting sequential inputs into spectral domain.
Applying the proposed model to sentiment analysis on IMDB dataset reaches 89.4% test accuracy within 5 epochs.
arXiv Detail & Related papers (2021-09-14T19:08:07Z) - Online learning of windmill time series using Long Short-term Cognitive
Networks [58.675240242609064]
The amount of data generated on windmill farms makes online learning the most viable strategy to follow.
We use Long Short-term Cognitive Networks (LSTCNs) to forecast windmill time series in online settings.
Our approach achieved the lowest forecasting errors compared with a simple RNN, a Long Short-Term Memory network, a Gated Recurrent Unit, and a Hidden Markov Model.
arXiv Detail & Related papers (2021-07-01T13:13:24Z) - UnICORNN: A recurrent model for learning very long time dependencies [0.0]
We propose a novel RNN architecture based on a structure preserving discretization of a Hamiltonian system of second-order ordinary differential equations.
The resulting RNN is fast, invertible (in time), and memory efficient, and we derive rigorous bounds on the hidden state gradients to prove mitigation of the exploding and vanishing gradient problems.
arXiv Detail & Related papers (2021-03-09T15:19:59Z) - Deep Cellular Recurrent Network for Efficient Analysis of Time-Series
Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information.
The proposed architecture achieves state-of-the-art performance while using substantially fewer trainable parameters than comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z) - A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z) - Achieving Online Regression Performance of LSTMs with Simple RNNs [0.0]
We introduce a first-order training algorithm with a linear time complexity in the number of parameters.
We show that when SRNNs are trained with our algorithm, they provide regression performance very similar to LSTMs in two to three times shorter training time.
arXiv Detail & Related papers (2020-05-16T11:41:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.