Deep Variational Luenberger-type Observer for Stochastic Video
Prediction
- URL: http://arxiv.org/abs/2003.00835v2
- Date: Sun, 10 Sep 2023 13:50:36 GMT
- Title: Deep Variational Luenberger-type Observer for Stochastic Video
Prediction
- Authors: Dong Wang, Feng Zhou, Zheng Yan, Guang Yao, Zongxuan Liu, Wennan Ma
and Cewu Lu
- Abstract summary: We study the problem of video prediction by combining the interpretability of state space models and the representation learning of deep neural networks.
Our model builds upon a variational encoder, which transforms the input video into a latent feature space, and a Luenberger-type observer, which captures the dynamic evolution of the latent features.
- Score: 46.82873654555665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Considering the inherent stochasticity and uncertainty, predicting future
video frames is exceptionally challenging. In this work, we study the problem of video
prediction by combining the interpretability of stochastic state space models with the
representation learning of deep neural networks. Our model builds upon a variational
encoder, which transforms the input video into a latent feature space, and a
Luenberger-type observer, which captures the dynamic evolution of the latent features.
This enables the decomposition of videos into static features and dynamics in an
unsupervised manner. By deriving the stability theory of the nonlinear Luenberger-type
observer, the hidden states in the feature space become insensitive to their initial
values, which improves the robustness of the overall model. Furthermore, a variational
lower bound on the data log-likelihood is derived, yielding a tractable posterior
predictive distribution based on the variational principle. Finally, experiments on the
Bouncing Balls and Pendulum datasets demonstrate that the proposed model outperforms
concurrent works.
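
A minimal sketch of the architecture described in the abstract, assuming PyTorch; the module sizes, the learned observer gain, and the rollout logic below are illustrative assumptions rather than the authors' implementation.

    import torch
    import torch.nn as nn

    class FrameEncoder(nn.Module):
        """Variational encoder: maps a frame to the mean / log-variance of a latent feature."""
        def __init__(self, latent_dim=16):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.Flatten())
            self.mu = nn.LazyLinear(latent_dim)
            self.logvar = nn.LazyLinear(latent_dim)

        def forward(self, frame):
            h = self.conv(frame)
            return self.mu(h), self.logvar(h)

    class LatentLuenbergerObserver(nn.Module):
        """Luenberger-type observer on the latent features:
        s_{t+1} = f(s_t) + L(y_t - g(s_t)), where y_t is the encoded frame."""
        def __init__(self, latent_dim=16, hidden_dim=64):
            super().__init__()
            self.f = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.Tanh(),
                                   nn.Linear(hidden_dim, latent_dim))
            self.g = nn.Linear(latent_dim, latent_dim)     # latent output map
            self.gain = nn.Linear(latent_dim, latent_dim)  # observer gain (learned here)

        def forward(self, state, obs):
            innovation = obs - self.g(state)               # correction from the new frame
            return self.f(state) + self.gain(innovation)

    def rollout(encoder, observer, decoder, frames, horizon):
        """Filter over the observed frames, then predict `horizon` future frames open-loop."""
        state = frames.new_zeros(frames.size(0), observer.g.in_features)
        for t in range(frames.size(1)):                    # correction phase
            mu, logvar = encoder(frames[:, t])
            y = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            state = observer(state, y)
        preds = []
        for _ in range(horizon):                           # prediction phase: no innovation term
            state = observer.f(state)
            preds.append(decoder(state))
        return torch.stack(preds, dim=1)

Training would minimize an ELBO-style objective (frame reconstruction plus a KL term on the encoder distribution), and the paper's stability analysis would constrain the observer rather than leaving the gain as a free linear map; both, along with the deconvolutional decoder, are omitted from this sketch.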
Related papers
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces a novel family of deep dynamical models designed to represent continuous-time sequence data.
We train the model using maximum likelihood estimation with Markov chain Monte Carlo.
Experiments on oscillating systems, videos and real-world state sequences (MuJoCo) illustrate that ODEs with the learnable energy-based prior outperform existing counterparts.
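
As a rough illustration of the continuous-time latent dynamics this entry describes (not the paper's code: the energy-based prior, its MCMC training, and the decoder are omitted, and a plain Euler integrator stands in for a proper ODE solver):

    import torch
    import torch.nn as nn

    class LatentODEFunc(nn.Module):
        """dz/dt = f_theta(z): a small MLP defines the latent vector field."""
        def __init__(self, dim=8, hidden=64):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

        def forward(self, z):
            return self.net(z)

    def integrate(func, z0, t_grid):
        """Euler rollout of the latent state over a time grid; returns every intermediate state."""
        z, states = z0, [z0]
        for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
            z = z + (t1 - t0) * func(z)
            states.append(z)
        return torch.stack(states)

    # z0 would be drawn from the learnable prior; a decoder would map each state to a frame.
    trajectory = integrate(LatentODEFunc(), torch.randn(4, 8), torch.linspace(0.0, 1.0, 20))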
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
- State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend [3.910356300831074]
We propose a state-space decomposition video prediction model that decomposes the overall video frame generation into deterministic appearance prediction and motion prediction.
We infer the long-term motion trend from conditional frames to guide the generation of future frames that exhibit high consistency with the conditional frames.
arXiv Detail & Related papers (2024-04-17T17:19:48Z)
- Towards Generalizable and Interpretable Motion Prediction: A Deep Variational Bayes Approach [54.429396802848224]
This paper proposes an interpretable generative model for motion prediction with robust generalizability to out-of-distribution cases.
For interpretability, the model achieves the target-driven motion prediction by estimating the spatial distribution of long-term destinations.
Experiments on motion prediction datasets validate that the fitted model can be interpretable and generalizable.
arXiv Detail & Related papers (2024-03-10T04:16:04Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
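
A generic sketch of that recipe, assuming PyTorch (the network, data, and parameter names below are placeholders): a surrogate trained on simulated spectra is frozen, and the unknown parameters are recovered by gradient descent through it.

    import torch
    import torch.nn as nn

    # Surrogate emulating the simulator: maps 2 model parameters to a 100-point spectrum.
    # Assume it has already been trained on simulated (parameters -> spectrum) pairs.
    surrogate = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 100))
    surrogate.requires_grad_(False)

    measured = torch.randn(100)                     # placeholder for the experimental data
    params = torch.zeros(2, requires_grad=True)     # unknown couplings to recover
    optimizer = torch.optim.Adam([params], lr=1e-2)

    for step in range(500):
        optimizer.zero_grad()
        loss = torch.mean((surrogate(params) - measured) ** 2)  # match surrogate output to data
        loss.backward()                             # gradients flow through the frozen surrogate
        optimizer.step()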
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Learning and Inference in Sparse Coding Models with Langevin Dynamics [3.0600309122672726]
We describe a system capable of inference and learning in a probabilistic latent variable model.
We demonstrate this idea for a sparse coding model by deriving a continuous-time equation for inferring its latent variables via Langevin dynamics.
We show that Langevin dynamics lead to an efficient procedure for sampling from the posterior distribution in the 'L0 sparse' regime, where latent variables are encouraged to be set to zero as opposed to having a small L1 norm.
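
A generic Langevin-inference sketch for a linear sparse coding model x ≈ D a (names are assumptions; the smooth sparsity penalty below merely stands in for the paper's 'L0 sparse' energy):

    import torch

    def langevin_infer(x, D, n_steps=200, step=1e-3, sigma=0.1, lam=0.5):
        """Sample latent coefficients a for observation x under x ≈ D @ a.
        Energy = reconstruction term + sparsity penalty; Langevin update:
          a <- a - step * dE/da + sqrt(2 * step) * noise
        """
        a = torch.zeros(D.shape[1], requires_grad=True)
        for _ in range(n_steps):
            recon = torch.sum((x - D @ a) ** 2) / (2 * sigma ** 2)
            sparsity = lam * torch.sum(torch.log1p(a ** 2))   # smooth, L0-like penalty (assumption)
            grad, = torch.autograd.grad(recon + sparsity, a)
            with torch.no_grad():
                a += -step * grad + (2 * step) ** 0.5 * torch.randn_like(a)
        return a.detach()

    # Usage: D is an overcomplete dictionary (n_features x n_atoms), x a data vector.
    D = torch.randn(64, 128)
    x = torch.randn(64)
    a_sample = langevin_infer(x, D)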
arXiv Detail & Related papers (2022-04-23T23:16:47Z)
- An Energy-Based Prior for Generative Saliency [62.79775297611203]
We propose a novel generative saliency prediction framework that adopts an informative energy-based model as a prior distribution.
With the generative saliency model, we can obtain a pixel-wise uncertainty map from an image, indicating model confidence in the saliency prediction.
Experimental results show that our generative saliency model with an energy-based prior can achieve not only accurate saliency predictions but also reliable uncertainty maps consistent with human perception.
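
As a generic illustration (not the paper's sampler) of how a pixel-wise uncertainty map can be read off a stochastic saliency model by sampling, assuming a model that maps (image, latent) to a saliency map:

    import torch

    def saliency_with_uncertainty(model, image, latent_dim=32, n_samples=20):
        """Decode a saliency map for several latent samples; return the per-pixel mean as the
        prediction and the per-pixel variance as the uncertainty map."""
        maps = []
        with torch.no_grad():
            for _ in range(n_samples):
                z = torch.randn(image.size(0), latent_dim)  # Gaussian stand-in for the prior
                maps.append(model(image, z))                # assumed signature: (image, z) -> map
        maps = torch.stack(maps)                            # [n_samples, B, 1, H, W]
        return maps.mean(dim=0), maps.var(dim=0)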
arXiv Detail & Related papers (2022-04-19T10:51:00Z)
- Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework to formulate the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID).
We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories.
Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
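
For intuition, a generic conditional DDPM-style reverse step of the kind such a model iterates (the Transformer noise predictor, noise schedule, and shapes are assumptions, not the MID implementation):

    import torch

    def reverse_step(eps_model, traj, t, cond, betas):
        """One denoising step: traj is a noisy future trajectory [B, T, 2], cond the encoded
        history / social context, eps_model predicts the noise that was added at step t."""
        beta_t = betas[t]
        alpha_t = 1.0 - beta_t
        alpha_bar_t = torch.prod(1.0 - betas[: t + 1])
        eps = eps_model(traj, torch.full((traj.size(0),), t), cond)
        mean = (traj - beta_t / torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_t)
        noise = torch.randn_like(traj) if t > 0 else torch.zeros_like(traj)
        return mean + torch.sqrt(beta_t) * noise

    def sample(eps_model, cond, betas, shape):
        """Start from Gaussian noise and run the reverse process down to a clean trajectory."""
        x = torch.randn(shape)
        for t in range(len(betas) - 1, -1, -1):
            x = reverse_step(eps_model, x, t, cond, betas)
        return x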
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
- Quantifying Model Predictive Uncertainty with Perturbation Theory [21.591460685054546]
We propose a framework for predictive uncertainty quantification of a neural network.
We use perturbation theory from quantum physics to formulate a moment decomposition problem.
Our approach provides fast model predictive uncertainty estimates with much greater precision and calibration.
arXiv Detail & Related papers (2021-09-22T17:55:09Z)
- Stochastic embeddings of dynamical phenomena through variational autoencoders [1.7205106391379026]
We use a recognition network to increase the observed space dimensionality during the reconstruction of the phase space.
Our validation shows that this approach not only recovers a state space that resembles the original one, but is also able to synthesize new time series.
arXiv Detail & Related papers (2020-10-13T10:10:24Z)
- Stochastic Latent Residual Video Prediction [0.0]
This paper introduces a novel temporal model whose dynamics are governed in a latent space by a residual update rule.
It naturally models video dynamics as it allows our simpler, more interpretable, latent model to outperform prior state-of-the-art methods on challenging datasets.
arXiv Detail & Related papers (2020-02-21T10:44:01Z)
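
A minimal sketch of a residual latent update of the kind described in the last entry above (assumed shapes and noise handling; the full model also includes content variables and a frame decoder):

    import torch
    import torch.nn as nn

    class ResidualLatentDynamics(nn.Module):
        """y_{t+1} = y_t + f(y_t, z_{t+1}): the state advances by a learned residual driven
        by a per-step stochastic variable z, keeping successive states close to each other."""
        def __init__(self, state_dim=20, noise_dim=10, hidden=128):
            super().__init__()
            self.f = nn.Sequential(nn.Linear(state_dim + noise_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, state_dim))

        def forward(self, y, z):
            return y + self.f(torch.cat([y, z], dim=-1))

    # Rollout: draw a fresh z each step; a decoder (omitted) would render each state as a frame.
    dynamics = ResidualLatentDynamics()
    y = torch.zeros(4, 20)
    latents = []
    for _ in range(10):
        y = dynamics(y, torch.randn(4, 10))
        latents.append(y)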
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.