MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive
Learning
- URL: http://arxiv.org/abs/2206.03010v7
- Date: Thu, 29 Feb 2024 13:21:27 GMT
- Title: MS-RNN: A Flexible Multi-Scale Framework for Spatiotemporal Predictive
Learning
- Authors: Zhifeng Ma, Hao Zhang, and Jie Liu
- Abstract summary: We propose a general framework named Multi-Scale RNN (MS-RNN) to boost recent RNN models for predictive learning.
We verify the MS-RNN framework by thorough theoretical analyses and exhaustive experiments.
Results show that RNN models incorporating our framework achieve much lower memory cost and better performance than before.
- Score: 7.311071760653835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatiotemporal predictive learning, which predicts future frames through
historical prior knowledge with the aid of deep learning, is widely used in
many fields. Previous work improves model performance mainly by widening or deepening the network, but this brings surging memory overhead, which seriously hinders the development and application of the technology. To improve performance without increasing memory consumption, we focus on scale, another dimension along which performance can be improved at low memory cost. Its effectiveness has been widely demonstrated in many CNN-based tasks such as image classification and semantic segmentation, but it has not been fully explored in recent RNN models. In this paper, drawing on the benefits of multi-scale design, we propose a general framework named Multi-Scale RNN (MS-RNN) to boost recent RNN models for spatiotemporal
predictive learning. We verify the MS-RNN framework by thorough theoretical
analyses and exhaustive experiments, where the theory focuses on memory
reduction and performance improvement while the experiments employ eight RNN
models (ConvLSTM, TrajGRU, PredRNN, PredRNN++, MIM, MotionRNN, PredRNN-V2, and
PrecipLSTM) and four datasets (Moving MNIST, TaxiBJ, KTH, and Germany). The
results show that RNN models incorporating our framework achieve much lower memory cost and better performance than before. Our code is released at https://github.com/mazhf/MS-RNN.
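To make the scale dimension concrete, here is a minimal PyTorch sketch in the spirit of the abstract: a ConvLSTM stack whose deeper layers run at halved resolutions, so their states cost less memory, with the output upsampled back to full resolution. The names (ConvLSTMCell, MSConvLSTM) and the pooling/upsampling choices are ours for illustration, not the architecture from the MS-RNN repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell (Shi et al., 2015): one conv yields all gates."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = torch.chunk(self.conv(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

class MSConvLSTM(nn.Module):
    """Hypothetical multi-scale stack: each deeper layer runs at half the
    resolution, so its hidden/cell states cost roughly 4x less memory."""
    def __init__(self, in_ch, hid_ch, num_layers=3):
        super().__init__()
        chans = [in_ch] + [hid_ch] * num_layers
        self.cells = nn.ModuleList(ConvLSTMCell(chans[l], chans[l + 1])
                                   for l in range(num_layers))

    def forward(self, x, states):
        new_states, h = [], x
        for l, cell in enumerate(self.cells):
            if l > 0:                              # descend to a coarser scale
                h = F.avg_pool2d(h, 2)
            h, s = cell(h, states[l])
            new_states.append(s)
        for _ in range(len(self.cells) - 1):       # climb back to full resolution
            h = F.interpolate(h, scale_factor=2, mode="bilinear",
                              align_corners=False)
        return h, new_states

# usage: per-layer states live at progressively halved resolutions
B, H, W = 2, 64, 64
net = MSConvLSTM(in_ch=1, hid_ch=16, num_layers=3)
states = [(torch.zeros(B, 16, H >> l, W >> l),
           torch.zeros(B, 16, H >> l, W >> l)) for l in range(3)]
out, states = net(torch.zeros(B, 1, H, W), states)  # out: (B, 16, 64, 64)
```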
Related papers
- Bayesian Neural Networks with Domain Knowledge Priors [52.80929437592308]
We propose a framework for integrating general forms of domain knowledge into a BNN prior.
We show that BNNs using our proposed domain knowledge priors outperform those with standard priors.
arXiv Detail & Related papers (2024-02-20T22:34:53Z)
- LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model [42.13762207316681]
Spiking Neural Networks (SNNs) have garnered widespread academic interest for their intrinsic ability to transmit information in a more energy-efficient manner.
Despite previous efforts to optimize the learning algorithm of SNNs through various methods, SNNs still lag behind ANNs in terms of performance.
We propose a novel LM-HT model, which is an equidistant multi-threshold model that can dynamically regulate the global input current and membrane potential leakage.
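Reading the summary, the model appears to be an integrate-and-fire neuron with equidistant thresholds that emits a multi-level spike count each step. A toy sketch under that assumption follows; the function name, the fixed leak, and the soft reset are our guesses rather than the paper's exact formulation.

```python
import torch

def lm_ht_step(v, x, theta=1.0, levels=4, leak=0.9):
    """One step of a toy integrate-and-fire neuron with `levels` equidistant
    thresholds at theta, 2*theta, ... (our reading of the summary; the paper
    learns the leak and input regulation, which are fixed constants here)."""
    v = leak * v + x                                      # leaky integration
    spikes = torch.clamp(torch.floor(v / theta), min=0, max=levels)
    v = v - spikes * theta                                # soft reset
    return spikes, v

# usage: a constant drive emits multi-level spike counts over time
v = torch.zeros(3)
for _ in range(5):
    s, v = lm_ht_step(v, torch.full((3,), 0.7))
```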
arXiv Detail & Related papers (2024-02-01T08:10:39Z)
- Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks [11.19282454437627]
Spiking Neural Networks (SNNs) are event-driven and highly energy-efficient, but they are difficult to deploy directly on resource-limited edge devices.
We propose an improved end-to-end Minimax optimization method for this sparse learning problem.
arXiv Detail & Related papers (2023-08-09T02:50:15Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing BNNs neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
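For context, the bilinear coupling referred to here is that a binarized layer reconstructs its weights as alpha * sign(W), a product of the scale alpha and the latent weights W. The sketch below shows only the common XNOR-Net-style baseline with a straight-through estimator, not the paper's recurrent bilinear optimizer.

```python
import torch

def binarize_ste(w):
    """XNOR-Net-style binarization of a conv weight (out, in, kh, kw).
    The reconstruction alpha * sign(w) is bilinear in the scale alpha and
    the latent weights w -- the coupling RBONN optimizes recurrently.
    Here alpha is the usual closed-form mean of |w| per output filter."""
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)
    wb = alpha * torch.sign(w)
    # straight-through estimator: forward uses wb, gradients flow to w
    return w + (wb - w).detach()
```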
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification [62.997667081978825]
The purpose of the study is to analyse and compare the most common machine learning and deep learning techniques used for computer vision 2D object classification tasks.
Firstly, we will present the theoretical background of the Bag of Visual Words model and Deep Convolutional Neural Networks (DCNN). Secondly, we will implement a Bag of Visual Words model and the VGG16 CNN architecture.
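As a reminder of what the Bag of Visual Words baseline computes, here is a minimal sketch: cluster local descriptors into a k-means codebook and represent each image as a normalized histogram of visual words. Descriptor extraction (e.g. SIFT) is assumed to happen upstream; the function name and the use of scikit-learn are our choices.

```python
import numpy as np
from sklearn.cluster import KMeans

def bovw_histograms(descriptor_sets, k=64):
    """descriptor_sets: list of (n_i, d) arrays of local descriptors per
    image (e.g. SIFT, extracted upstream). Returns a (n_images, k) matrix
    of L1-normalized visual-word histograms."""
    codebook = KMeans(n_clusters=k, n_init=10).fit(np.vstack(descriptor_sets))
    hists = []
    for desc in descriptor_sets:
        words = codebook.predict(desc)                 # nearest visual word
        h = np.bincount(words, minlength=k).astype(float)
        hists.append(h / max(h.sum(), 1.0))
    return np.stack(hists)
```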
arXiv Detail & Related papers (2022-04-11T11:34:43Z)
- Weightless Neural Networks for Efficient Edge Inference [1.7882696915798877]
Weightless Neural Networks (WNNs) are a class of machine learning models that use table lookups to perform inference.
We propose a novel WNN architecture, BTHOWeN, with key algorithmic and architectural improvements over prior work.
BTHOWeN targets the large and growing edge computing sector by providing superior latency and energy efficiency.
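To illustrate what table-lookup inference means, below is a toy classifier in the classic WiSARD style: each class owns a set of lookup tables addressed by random tuples of input bits, training marks addresses as seen, and inference counts matches. BTHOWeN's Bloom filters, counting, and thermometer encoding are omitted, and all names here are ours.

```python
import numpy as np

class WiSARD:
    """Classic weightless classifier: inference is pure table lookups.
    Requires n_bits % tuple_size == 0."""
    def __init__(self, n_bits, n_classes, tuple_size=8, seed=0):
        rng = np.random.default_rng(seed)
        self.order = rng.permutation(n_bits)        # fixed random bit mapping
        self.tuple_size = tuple_size
        n_rams = n_bits // tuple_size
        self.rams = [[set() for _ in range(n_rams)] for _ in range(n_classes)]

    def _addresses(self, x):                        # x: flat 0/1 array
        tuples = x[self.order].reshape(-1, self.tuple_size)
        return [int("".join(map(str, t)), 2) for t in tuples]

    def train(self, x, label):
        for ram, addr in zip(self.rams[label], self._addresses(x)):
            ram.add(addr)                           # mark address as seen

    def predict(self, x):
        addrs = self._addresses(x)
        scores = [sum(a in ram for ram, a in zip(class_rams, addrs))
                  for class_rams in self.rams]
        return int(np.argmax(scores))
```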
arXiv Detail & Related papers (2022-03-03T01:46:05Z)
- PFGE: Parsimonious Fast Geometric Ensembling of DNNs [6.973476713852153]
In this paper, we propose a new method called parsimonious FGE (PFGE), which employs a lightweight ensemble of higher-performing deep neural networks.
Our results show that PFGE achieves 5x memory efficiency compared to previous methods, without compromising generalization performance.
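For background, fast geometric ensembling gathers snapshots along an SGD trajectory driven by a cyclical learning rate and averages their predictions; PFGE's contribution is keeping far fewer, higher-performing snapshots. The loop below sketches only the generic FGE mechanism, with hyperparameters and names of our choosing.

```python
import copy
import torch

def fge_snapshots(model, loader, optimizer, loss_fn, cycles=4, steps=100):
    """Collect one snapshot per cyclical-learning-rate cycle; ensembling
    averages their predictions. PFGE keeps far fewer, better snapshots,
    which is where its memory savings come from."""
    snapshots, it = [], iter(loader)
    for _ in range(cycles):
        for step in range(steps):
            try:
                x, y = next(it)
            except StopIteration:
                it = iter(loader)
                x, y = next(it)
            t = step / steps                       # triangular LR schedule
            for g in optimizer.param_groups:
                g["lr"] = 1e-1 * (1 - t) + 1e-4 * t
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        snapshots.append(copy.deepcopy(model).eval())
    return snapshots

@torch.no_grad()
def ensemble_predict(snapshots, x):
    return torch.stack([torch.softmax(m(x), dim=-1) for m in snapshots]).mean(0)
```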
arXiv Detail & Related papers (2022-02-14T12:27:46Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
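In modern terms, a TDNN is a stack of dilated 1-D convolutions over the time axis, so depth widens the temporal context each frame sees. A minimal sketch follows; the layer sizes and topology are illustrative, and the paper's full data learning objective is not reproduced.

```python
import torch.nn as nn

class TDNNBlock(nn.Module):
    """One TDNN layer: a dilated 1-D convolution over time, so depth widens
    the temporal context window."""
    def __init__(self, in_dim, out_dim, context=5, dilation=1):
        super().__init__()
        pad = (context - 1) // 2 * dilation        # preserve sequence length
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, out_dim, context, dilation=dilation, padding=pad),
            nn.ReLU())

    def forward(self, x):                          # x: (batch, feat, time)
        return self.net(x)

# e.g. 40-dim log-mel input, enhanced 40-dim output
tdnn = nn.Sequential(TDNNBlock(40, 256, dilation=1),
                     TDNNBlock(256, 256, dilation=2),
                     TDNNBlock(256, 40, dilation=4))
```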
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Modeling Token-level Uncertainty to Learn Unknown Concepts in SLU via Calibrated Dirichlet Prior RNN [98.4713940310056]
One major task of spoken language understanding (SLU) in modern personal assistants is to extract semantic concepts from an utterance.
Recent research collected question and answer annotated data to learn what is unknown and should be asked.
We incorporate softmax-based slot-filling neural architectures to model sequence uncertainty without question supervision.
arXiv Detail & Related papers (2020-10-16T02:12:30Z)
- MomentumRNN: Integrating Momentum into Recurrent Neural Networks [32.40217829362088]
We show that MomentumRNNs alleviate the vanishing gradient issue in training RNNs.
MomentumRNN is applicable to many types of recurrent cells, including those in the state-of-the-art RNNs.
We show that other advanced momentum-based optimization methods, such as Adam and Nesterov accelerated gradients with a restart, can be easily incorporated into the MomentumRNN framework.
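As we read the abstract, the core change is to accumulate the recurrent cell's input drive with momentum, in direct analogy with momentum SGD. A minimal vanilla-RNN version follows; the constants mu and s and the cell layout are our simplification.

```python
import torch
import torch.nn as nn

class MomentumRNNCell(nn.Module):
    """Vanilla RNN cell with a momentum-accumulated input drive, mirroring
    how momentum SGD damps oscillation (mu and s are fixed here; the paper
    also covers LSTM-style cells and adaptive variants)."""
    def __init__(self, in_dim, hid_dim, mu=0.6, s=1.0):
        super().__init__()
        self.W = nn.Linear(in_dim, hid_dim, bias=False)
        self.U = nn.Linear(hid_dim, hid_dim)
        self.mu, self.s = mu, s

    def forward(self, x, h, v):
        v = self.mu * v + self.s * self.W(x)   # momentum on the input drive
        h = torch.tanh(self.U(h) + v)
        return h, v                            # v starts at zeros, like h
```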
arXiv Detail & Related papers (2020-06-12T03:02:29Z)
- Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack [73.48927855855219]
Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction.
RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems.
One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack.
In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms.
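For intuition, a differentiable stack in the style of Joulin and Mikolov updates all slots as a soft mixture of push, pop, and no-op shifts, so a controller RNN can be trained end-to-end through it; the paper builds richer state-updating mechanisms on this idea. A one-step sketch with our own names:

```python
import torch

def stack_step(stack, v, action):
    """One soft update of a differentiable stack.
    stack:  (depth, dim) current contents, top at index 0
    v:      (dim,) candidate vector to push
    action: (3,) softmax weights over [push, pop, no-op]"""
    push, pop, noop = action
    popped = torch.cat([stack[1:], torch.zeros_like(stack[:1])])  # shift up
    pushed = torch.cat([v.unsqueeze(0), stack[:-1]])              # shift down
    return push * pushed + pop * popped + noop * stack
```

The controller would read stack[0] as the soft top of the stack at each step.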
arXiv Detail & Related papers (2020-04-04T14:19:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.