Effect of Architectures and Training Methods on the Performance of
Learned Video Frame Prediction
- URL: http://arxiv.org/abs/2008.06106v1
- Date: Thu, 13 Aug 2020 20:45:28 GMT
- Title: Effect of Architectures and Training Methods on the Performance of
Learned Video Frame Prediction
- Authors: M. Akin Yilmaz and A. Murat Tekalp
- Abstract summary: Experimental results show that the residual FCNN architecture performs the best in terms of peak signal-to-noise ratio (PSNR), at the expense of higher training and test (inference) computational complexity.
The CRNN can be trained stably and very efficiently using the stateful truncated backpropagation through time procedure.
- Score: 10.404162481860634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze the performance of feedforward vs. recurrent neural network (RNN)
architectures and associated training methods for learned frame prediction. To
this effect, we trained a residual fully convolutional neural network (FCNN), a
convolutional RNN (CRNN), and a convolutional long short-term memory (CLSTM)
network for next frame prediction using the mean square loss. We performed both
stateless and stateful training for recurrent networks. Experimental results
show that the residual FCNN architecture performs the best in terms of peak
signal-to-noise ratio (PSNR), at the expense of higher training and test
(inference) computational complexity. The CRNN can be trained stably and very
efficiently using the stateful truncated backpropagation through time
procedure, and it requires an order of magnitude less inference runtime to
achieve near real-time frame prediction with acceptable performance.
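The paper's exact network configurations are not reproduced in this listing; as an illustration of the training procedure the abstract highlights, below is a minimal PyTorch-style sketch of a convolutional RNN cell trained with stateful truncated BPTT under a mean-square loss. All names, layer sizes, and hyperparameters are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn

class ConvRNNCell(nn.Module):
    """Minimal convolutional RNN cell: h_t = ReLU(conv(x_t) + conv(h_{t-1}))."""
    def __init__(self, in_ch=1, hid_ch=64, k=3):
        super().__init__()
        self.conv_x = nn.Conv2d(in_ch, hid_ch, k, padding=k // 2)
        self.conv_h = nn.Conv2d(hid_ch, hid_ch, k, padding=k // 2)
        self.to_frame = nn.Conv2d(hid_ch, in_ch, k, padding=k // 2)

    def forward(self, x, h):
        h = torch.relu(self.conv_x(x) + self.conv_h(h))
        return self.to_frame(h), h          # predicted next frame, new state

def train_stateful_tbptt(model, video, hid_ch=64, chunk_len=10, lr=1e-4):
    """Stateful truncated BPTT over one long sequence.

    video: (T, C, H, W) frames in [0, 1]. The hidden state is carried across
    chunks (stateful) but detached at each chunk boundary (truncated), so
    gradients never flow past a chunk while the state value itself persists.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    h = torch.zeros(1, hid_ch, video.size(2), video.size(3))
    for start in range(0, video.size(0) - chunk_len - 1, chunk_len):
        h = h.detach()                      # truncate gradient, keep state value
        opt.zero_grad()
        loss = 0.0
        for t in range(start, start + chunk_len):
            pred, h = model(video[t].unsqueeze(0), h)
            loss = loss + mse(pred, video[t + 1].unsqueeze(0))
        loss.backward()
        opt.step()
```

A model would be created as, e.g., `model = ConvRNNCell(in_ch=1, hid_ch=64)` for grayscale frames; the key point is the `h.detach()` at each chunk boundary, which is what makes the truncation "stateful": the state persists across chunks while the gradient does not.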
Related papers
- An NMF-Based Building Block for Interpretable Neural Networks With
Continual Learning [0.8158530638728501]
Existing learning methods often struggle to balance interpretability and predictive performance.
Our approach aims to strike a better balance between these two aspects through a building block based on non-negative matrix factorization (NMF).
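For context, a minimal sketch of plain NMF via the classic multiplicative-update rules (the general technique the building block draws on, not the paper's specific block):

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9):
    """Classic multiplicative updates for V ~= W @ H with V, W, H >= 0."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, keeping it nonnegative
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W likewise
    return W, H
```

The nonnegativity of W and H is what gives NMF its parts-based, interpretable factors.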
arXiv Detail & Related papers (2023-11-20T02:00:33Z)
- CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech
Recognition [8.302549684364195]
We propose a novel model named CIF-Transducer (CIF-T), which combines the Continuous Integrate-and-Fire (CIF) mechanism with the RNN-T model to achieve efficient alignment.
CIF-T achieves state-of-the-art results with lower computational overhead compared to RNN-T models.
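For context, the CIF mechanism accumulates per-frame weights and "fires" one output vector each time the accumulated weight crosses a threshold. A hedged sketch of that general mechanism (illustrative, not the CIF-T implementation):

```python
import torch

def cif(hidden, alpha, threshold=1.0):
    """Continuous Integrate-and-Fire over hidden (T, D) states, alpha (T,) weights."""
    outputs, acc_w, acc_s = [], 0.0, torch.zeros(hidden.size(1))
    for t in range(hidden.size(0)):
        w = float(alpha[t])
        if acc_w + w < threshold:
            acc_s = acc_s + w * hidden[t]      # keep integrating this frame
            acc_w += w
        else:
            rest = threshold - acc_w           # part of this frame completes a token
            outputs.append(acc_s + rest * hidden[t])
            acc_s = (w - rest) * hidden[t]     # remainder starts the next token
            acc_w = w - rest
    return torch.stack(outputs) if outputs else hidden.new_zeros(0, hidden.size(1))
```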
arXiv Detail & Related papers (2023-07-26T11:59:14Z)
- SPP-CNN: An Efficient Framework for Network Robustness Prediction [13.742495880357493]
This paper develops an efficient framework for network robustness prediction: the spatial pyramid pooling convolutional neural network (SPP-CNN).
The new framework installs a spatial pyramid pooling layer between the convolutional and fully-connected layers, overcoming the common input-size mismatch issue in CNN-based prediction approaches.
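A spatial pyramid pooling layer in general can be sketched as follows (a minimal PyTorch illustration of the technique, not the paper's code); it maps feature maps of any spatial size to a fixed-length vector, which is what resolves the conv-to-FC mismatch:

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(x, levels=(1, 2, 4)):
    """Pool a (N, C, H, W) feature map into a fixed-length (N, C * sum(l*l))
    vector regardless of H and W, so the following FC layer always fits."""
    feats = [F.adaptive_max_pool2d(x, l).flatten(1) for l in levels]
    return torch.cat(feats, dim=1)

# e.g. any input spatial size -> (N, C * (1 + 4 + 16)) output vector
```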
arXiv Detail & Related papers (2023-05-13T09:09:20Z)
- Learning from Predictions: Fusing Training and Autoregressive Inference
for Long-Term Spatiotemporal Forecasts [4.068387278512612]
We propose the Scheduled Autoregressive BPTT (BPTT-SA) algorithm for predicting complex systems.
Our results show that BPTT-SA effectively reduces iterative error propagation in Convolutional RNNs and Convolutional Autoencoder RNNs.
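As described, the idea is related to scheduled sampling: during training, the network is increasingly fed its own predictions instead of ground truth, so that autoregressive-inference errors are encountered (and damped) at training time. A hypothetical sketch, reusing the `ConvRNNCell` interface from the sketch above:

```python
import random
import torch.nn as nn

def rollout_loss(model, seq, h, feed_own_prob):
    """One training rollout mixing teacher forcing with autoregressive steps.

    seq: (T, C, H, W). feed_own_prob is scheduled to rise over training so the
    network gradually learns to consume (and correct) its own predictions.
    """
    mse, loss = nn.MSELoss(), 0.0
    x = seq[0].unsqueeze(0)
    for t in range(seq.size(0) - 1):
        pred, h = model(x, h)
        loss = loss + mse(pred, seq[t + 1].unsqueeze(0))
        use_own = random.random() < feed_own_prob
        x = pred.detach() if use_own else seq[t + 1].unsqueeze(0)
    return loss, h
```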
arXiv Detail & Related papers (2023-02-22T02:46:54Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
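For reference, interval bound propagation through affine and monotone layers, the baseline family the paper compares against, can be sketched in a few lines (illustrative only):

```python
import torch

def interval_linear(l, u, W, b):
    """Propagate an interval [l, u] through y = W x + b (exact for affine maps)."""
    c, r = (l + u) / 2, (u - l) / 2        # center and radius of the box
    c_out = W @ c + b
    r_out = W.abs() @ r                    # radius grows by |W|
    return c_out - r_out, c_out + r_out

def interval_relu(l, u):
    """ReLU is monotone, so the bounds map elementwise."""
    return l.clamp(min=0), u.clamp(min=0)
```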
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data
Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement.
To make full use of the training data, it introduces a full data learning method for speech enhancement.
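A TDNN layer is, in modern terms, a dilated 1-D convolution over frame context; a minimal sketch of such a block (an assumption about the general architecture family, not the paper's exact configuration or its full data learning procedure):

```python
import torch.nn as nn

class TDNNBlock(nn.Module):
    """One time-delay layer: a dilated 1-D convolution over frame context."""
    def __init__(self, in_dim, out_dim, context=3, dilation=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, out_dim, context, dilation=dilation,
                      padding=(context // 2) * dilation),  # keep T unchanged
            nn.ReLU(),
            nn.BatchNorm1d(out_dim),
        )

    def forward(self, x):      # x: (N, in_dim, T) feature frames
        return self.net(x)
```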
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Distillation Guided Residual Learning for Binary Convolutional Neural
Networks [83.6169936912264]
It is challenging to bridge the performance gap between a Binary CNN (BCNN) and a Floating-point CNN (FCNN).
We observe that this performance gap leads to substantial residuals between the intermediate feature maps of BCNN and FCNN.
To minimize the performance gap, we enforce BCNN to produce intermediate feature maps similar to those of FCNN.
This training strategy, i.e., optimizing each binary convolutional block with a block-wise distillation loss derived from FCNN, leads to a more effective optimization of BCNN.
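The described block-wise distillation loss can be sketched generically: per-block feature maps of the binary network are regressed onto those of the frozen full-precision teacher (names and the MSE choice are illustrative assumptions):

```python
import torch.nn as nn

def blockwise_distill_loss(bcnn_feats, fcnn_feats):
    """Sum of per-block MSE between binary-network and full-precision feature
    maps; the full-precision network acts as a frozen teacher."""
    mse = nn.MSELoss()
    return sum(mse(b, f.detach()) for b, f in zip(bcnn_feats, fcnn_feats))
```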
arXiv Detail & Related papers (2020-07-10T07:55:39Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in terms of low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
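Rate-based ANN-to-SNN conversion rests on the observation that an integrate-and-fire neuron's firing rate approximates a scaled, clipped ReLU; a tiny sketch of that correspondence (background intuition, not the paper's framework):

```python
def if_neuron_rate(drive, steps=200, v_th=1.0):
    """Integrate-and-fire neuron under a constant input drive.

    Its firing rate approximates ReLU(drive) / v_th for drive <= v_th,
    the correspondence that rate-based ANN-to-SNN conversion exploits.
    """
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += drive                 # integrate
        if v >= v_th:              # fire and reset by subtraction
            v -= v_th
            spikes += 1
    return spikes / steps
```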
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through both a theoretical analysis and practical experiments on an NLP task.
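The storage argument can be made concrete: a tensor-train matrix stores small 4-way cores instead of a dense weight. A sketch that reassembles the dense matrix from hypothetical cores (illustrative; the core shapes are assumptions):

```python
import numpy as np

def tt_to_matrix(cores):
    """Reassemble a TT-matrix from cores of shape (r_prev, m_k, n_k, r_next).

    Stored parameters: the sum of the core sizes, versus prod(m_k) * prod(n_k)
    entries for the dense weight -- the storage reduction referred to above.
    """
    res = cores[0]                                   # (1, m1, n1, r1)
    for c in cores[1:]:
        res = np.einsum('aijb,bpqc->aipjqc', res, c) # merge one more mode pair
        a, i, p, j, q, r = res.shape
        res = res.reshape(a, i * p, j * q, r)
    return res[0, :, :, 0]
```

For example, two cores of shapes (1, 4, 4, 3) and (3, 4, 4, 1) hold 48 + 48 = 96 parameters yet represent a 16 x 16 matrix with 256 entries.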
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- Error-feedback stochastic modeling strategy for time series forecasting
with convolutional neural networks [11.162185201961174]
We propose a novel Error-feedback Stochastic Modeling (ESM) strategy to construct a random Convolutional Neural Network (ESM-CNN) for the time series forecasting task.
The proposed ESM-CNN not only outperforms state-of-the-art random neural networks, but also exhibits stronger predictive power and lower computing overhead compared to trained state-of-the-art deep neural network models.
arXiv Detail & Related papers (2020-02-03T13:30:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.