Related papers: Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing

URL: http://arxiv.org/abs/2109.13090v1
Date: Tue, 14 Sep 2021 19:08:07 GMT
Title: Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing
Authors: Bing Han, Cheng Wang, and Kaushik Roy
Abstract summary: We propose a novel neuron model that has cosine activation with a time-varying component for sequential processing. The proposed neuron provides an efficient building block for projecting sequential inputs into the spectral domain. Applying the proposed model to sentiment analysis on the IMDB dataset reaches 89.4% test accuracy within 5 epochs.
Score: 16.69710555668727
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Tremendous progress has been made in sequential processing with the recent advances in recurrent neural networks. However, recurrent architectures face the challenge of exploding/vanishing gradients during training, and require significant computational resources to execute back-propagation through time. Moreover, large models are typically needed for executing complex sequential tasks. To address these challenges, we propose a novel neuron model that has cosine activation with a time-varying component for sequential processing. The proposed neuron provides an efficient building block for projecting sequential inputs into the spectral domain, which helps to retain long-term dependencies with minimal extra model parameters and computation. A new type of recurrent network architecture, named Oscillatory Fourier Neural Network, based on the proposed neuron is presented and applied to various types of sequential tasks. We demonstrate that a recurrent neural network with the proposed neuron model is mathematically equivalent to a simplified form of the discrete Fourier transform applied to a periodic activation. In particular, the computationally intensive back-propagation through time in training is eliminated, leading to faster training while achieving state-of-the-art inference accuracy in a diverse group of sequential tasks. For instance, applying the proposed model to sentiment analysis on the IMDB review dataset reaches 89.4% test accuracy within 5 epochs, accompanied by over 35x reduction in the model size compared to LSTM. The proposed novel RNN architecture is well poised for intelligent sequential processing in resource-constrained hardware.

Related papers

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training [67.45211108321203]
We introduce a numerically stable, chunkwise parallelizable version of the recently proposed Mesa layer. We show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs.
arXiv Detail & Related papers (2025-06-05T16:50:23Z)
Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
Time-Parameterized Convolutional Neural Networks for Irregularly Sampled Time Series [26.77596449192451]
Irregularly sampled time series are ubiquitous in several application domains, leading to sparse, not fully-observed and non-aligned observations. Standard sequential neural networks (RNNs) and convolutional neural networks (CNNs) consider regular spacing between observation times, posing significant challenges to irregular time series modeling. We parameterize convolutional layers by employing time-explicitly initialized kernels.
arXiv Detail & Related papers (2023-08-06T21:10:30Z)
Accelerating SNN Training with Stochastic Parallelizable Spiking Neurons [1.7056768055368383]
Spiking neural networks (SNNs) are able to learn features while using less energy, especially on neuromorphic hardware. The most widely used neuron in SNNs is the Leaky Integrate and Fire (LIF) neuron.
arXiv Detail & Related papers (2023-06-22T04:25:27Z)
How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series. We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks. We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z)
An advanced spatio-temporal convolutional recurrent neural network for storm surge predictions [73.4962254843935]
We study the capability of artificial neural network models to emulate storm surge based on the storm track/size/intensity history. This study presents a neural network model that can predict storm surge, informed by a database of synthetic storm simulations.
arXiv Detail & Related papers (2022-04-18T23:42:18Z)
Stochastic Recurrent Neural Network for Multistep Time Series Forecasting [0.0]
We leverage advances in deep generative models and the concept of state space models to propose an adaptation of the recurrent neural network for time series forecasting. Our model preserves the architectural workings of a recurrent neural network for which all relevant information is encapsulated in its hidden states, and this flexibility allows our model to be easily integrated into any deep architecture for sequential modelling.
arXiv Detail & Related papers (2021-04-26T01:43:43Z)
Deep Cellular Recurrent Network for Efficient Analysis of Time-Series Data with Spatial Information [52.635997570873194]
This work proposes a novel deep cellular recurrent neural network (DCRNN) architecture to process complex multi-dimensional time series data with spatial information. The proposed architecture achieves state-of-the-art performance while utilizing substantially fewer trainable parameters when compared to comparable methods in the literature.
arXiv Detail & Related papers (2021-01-12T20:08:18Z)
Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. We show how to extend the architecture of a simple RNN by separating its hidden state into different modules. We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence. This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time. Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.