Taylor Swift: Taylor Driven Temporal Modeling for Swift Future Frame Prediction
- URL: http://arxiv.org/abs/2110.14392v1
- Date: Wed, 27 Oct 2021 12:46:17 GMT
- Title: Taylor Swift: Taylor Driven Temporal Modeling for Swift Future Frame Prediction
- Authors: Mohammad Saber Pourheydari, Mohsen Fayyaz, Emad Bahrami, Mehdi
Noroozi, Juergen Gall
- Abstract summary: We introduce TayloSwiftNet, a novel convolutional neural network that learns to estimate the higher order terms of the Taylor series for a given input video.
TayloSwiftNet can swiftly predict any desired future frame in just one forward pass and change the temporal resolution on-the-fly.
- Score: 22.57791389884491
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recurrent neural networks (RNNs) demonstrate outstanding capabilities
in future video frame prediction, they model dynamics in a discrete time space
and sequentially go through all frames until the desired future temporal step
is reached. RNNs are therefore prone to accumulate the error as the number of
future frames increases. In contrast, partial differential equations (PDEs)
model physical phenomena such as dynamics in continuous time; however,
current PDE-based approaches discretize the PDEs using, e.g., the forward Euler
method. In this work, we therefore propose to approximate the motion in a video
by a continuous function using the Taylor series. To this end, we introduce
TayloSwiftNet, a novel convolutional neural network that learns to estimate the
higher order terms of the Taylor series for a given input video. TayloSwiftNet
can swiftly predict any desired future frame in just one forward pass and
change the temporal resolution on-the-fly. The experimental results on various
datasets demonstrate the superiority of our model.
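The abstract's core idea -- representing motion as a truncated Taylor series so that a frame at any future time offset can be evaluated in a single forward pass -- can be sketched in a few lines. This is a toy numpy illustration, not the paper's network: the list of derivative maps stands in for the higher-order terms that TayloSwiftNet would estimate from an input clip.

```python
import numpy as np

def taylor_extrapolate(derivatives, dt):
    """Evaluate a truncated Taylor series at time offset dt.

    derivatives: list [f(t0), f'(t0), f''(t0), ...] of arrays,
    each the same shape as a video frame. Returns the predicted
    frame f(t0 + dt) = sum_k derivatives[k] * dt**k / k!.
    """
    pred = np.zeros_like(derivatives[0], dtype=float)
    factorial = 1.0
    for k, d in enumerate(derivatives):
        if k > 0:
            factorial *= k  # running k!
        pred = pred + d * (dt ** k) / factorial
    return pred

# Toy example: every pixel evolves as f(t) = 1 + 2t + 3t^2, so at t0 = 0
# the derivative maps are f = 1, f' = 2, f'' = 6.
derivs = [np.full((2, 2), 1.0), np.full((2, 2), 2.0), np.full((2, 2), 6.0)]
frame_at_half = taylor_extrapolate(derivs, 0.5)  # 1 + 2*0.5 + 3*0.25 = 2.75
```

Because `dt` is a free argument, the same set of estimated terms yields a frame at any temporal resolution, which is the property the abstract contrasts with step-by-step RNN rollout.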
Related papers
- Trajectory Flow Matching with Applications to Clinical Time Series Modeling [77.58277281319253]
Trajectory Flow Matching (TFM) trains a Neural SDE in a simulation-free manner, bypassing backpropagation through the dynamics.
We demonstrate improved performance on three clinical time series datasets in terms of absolute performance and uncertainty prediction.
arXiv Detail & Related papers (2024-10-28T15:54:50Z)
- Video Prediction Transformers without Recurrence or Convolution [65.93130697098658]
We propose PredFormer, a framework entirely based on Gated Transformers.
We provide a comprehensive analysis of 3D Attention in the context of video prediction.
The significant improvements in both accuracy and efficiency highlight the potential of PredFormer.
arXiv Detail & Related papers (2024-10-07T03:52:06Z)
- Predicting Long-horizon Futures by Conditioning on Geometry and Time [49.86180975196375]
We explore the task of generating future sensor observations conditioned on the past.
We leverage the large-scale pretraining of image diffusion models which can handle multi-modality.
We create a benchmark for video prediction on a diverse set of videos spanning indoor and outdoor scenes.
arXiv Detail & Related papers (2024-04-17T16:56:31Z)
- STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction [20.701792842768747]
We propose a novel video prediction model, which has infinite-dimensional latent variables over the temporal domain.
Our model achieves temporally continuous prediction, i.e., it can predict at an arbitrarily high frame rate in an unsupervised way.
arXiv Detail & Related papers (2023-12-11T16:12:43Z)
- Modelling Latent Dynamics of StyleGAN using Neural ODEs [52.03496093312985]
We learn the trajectory of independently inverted latent codes from GANs.
The learned continuous trajectory allows us to perform infinite frame and consistent video manipulation.
Our method achieves state-of-the-art performance but with much less computation.
arXiv Detail & Related papers (2022-08-23T21:20:38Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- Simple Video Generation using Neural ODEs [9.303957136142293]
We learn latent variable models that predict the future in latent space and project back to pixels.
We show that our approach yields promising results in the task of future frame prediction on the Moving MNIST dataset with 1 and 2 digits.
arXiv Detail & Related papers (2021-09-07T19:03:33Z)
- Taylor saves for later: disentanglement for video prediction using Taylor representation [5.658571172210811]
We propose a two-branch seq-to-seq deep model to disentangle the Taylor feature and the residual feature in video frames.
TaylorCell can expand the video frames' high-dimensional features into the finite Taylor series to describe the latent laws.
MCU distills all past frames' information to correct the predicted Taylor feature from TPU.
arXiv Detail & Related papers (2021-05-24T01:59:21Z)
- Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction [55.4498466252522]
We set a new standard of video prediction with orders of magnitude longer prediction time than existing approaches.
Our method predicts future frames by first estimating a sequence of semantic structures and subsequently translating the structures to pixels by video-to-video translation.
We evaluate our method on three challenging datasets involving car driving and human dancing, and demonstrate that it can generate complicated scene structures and motions over a very long time horizon.
arXiv Detail & Related papers (2021-04-14T08:39:38Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
- A Spatio-temporal Transformer for 3D Human Motion Prediction [39.31212055504893]
We propose a Transformer-based architecture for the task of generative modelling of 3D human motion.
We empirically show that this effectively learns the underlying motion dynamics and reduces error accumulation over time observed in auto-regressive models.
arXiv Detail & Related papers (2020-04-18T19:49:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.