Interpretable Representation Learning from Videos using Nonlinear Priors
- URL: http://arxiv.org/abs/2410.18539v1
- Date: Thu, 24 Oct 2024 08:39:24 GMT
- Title: Interpretable Representation Learning from Videos using Nonlinear Priors
- Authors: Marian Longa, João F. Henriques
- Abstract summary: We propose a deep learning framework where one can specify nonlinear priors for videos.
We do this by extending the Variational Auto-Encoder (VAE) prior from a simple isotropic Gaussian to an arbitrary nonlinear temporal Additive Noise Model (ANM).
We validate the method on different real-world physics videos including a pendulum, a mass on a spring, a falling object and a pulsar.
- Score: 15.779730667509915
- Abstract: Learning interpretable representations of visual data is an important challenge, to make machines' decisions understandable to humans and to improve generalisation outside of the training distribution. To this end, we propose a deep learning framework where one can specify nonlinear priors for videos (e.g. of Newtonian physics) that allow the model to learn interpretable latent variables and use these to generate videos of hypothetical scenarios not observed at training time. We do this by extending the Variational Auto-Encoder (VAE) prior from a simple isotropic Gaussian to an arbitrary nonlinear temporal Additive Noise Model (ANM), which can describe a large number of processes (e.g. Newtonian physics). We propose a novel linearization method that constructs a Gaussian Mixture Model (GMM) approximating the prior, and derive a numerically stable Monte Carlo estimate of the KL divergence between the posterior and prior GMMs. We validate the method on different real-world physics videos including a pendulum, a mass on a spring, a falling object and a pulsar (rotating neutron star). We specify a physical prior for each experiment and show that the correct variables are learned. Once a model is trained, we intervene on it to change different physical variables (such as oscillation amplitude or adding air drag) to generate physically correct videos of hypothetical scenarios that were not observed previously.
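The abstract mentions a numerically stable Monte Carlo estimate of the KL divergence between two Gaussian mixtures. The sketch below illustrates the general idea only, under simplifying assumptions: one-dimensional mixtures, samples drawn from the posterior mixture q, and a log-sum-exp evaluation of the mixture log-densities. It is not the estimator derived in the paper, and the names gmm_log_pdf, gmm_sample and mc_kl, as well as the example mixture parameters, are hypothetical.

```python
import numpy as np


def gmm_log_pdf(x, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture at each point in x."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)  # shape (n, 1)
    # Per-component log(weight * Normal(x; mean, std)), shape (n, k).
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi)
        - np.log(stds)
        - 0.5 * ((x - means) / stds) ** 2
    )
    # Log-sum-exp: subtract the row maximum before exponentiating,
    # so that far-away samples do not underflow to log(0).
    m = log_comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=1, keepdims=True))).ravel()


def gmm_sample(rng, n, weights, means, stds):
    """Draw n samples: pick a component per sample, then sample from it."""
    idx = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[idx], stds[idx])


def mc_kl(rng, n, q, p):
    """Monte Carlo estimate of KL(q || p) using n samples drawn from q."""
    x = gmm_sample(rng, n, *q)
    return float(np.mean(gmm_log_pdf(x, *q) - gmm_log_pdf(x, *p)))


# Illustrative mixtures (weights, means, standard deviations); assumed values.
rng = np.random.default_rng(0)
q = (np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([0.3, 0.3]))
p = (np.array([0.3, 0.7]), np.array([-1.2, 0.9]), np.array([0.4, 0.5]))
print(mc_kl(rng, 10_000, q, p))
```

Working in log space matters here because the KL divergence between two mixtures has no closed form, and a naive evaluation of the mixture density can underflow for samples far from every component.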
Related papers
- How Far is Video Generation from World Model: A Physical Law Perspective [101.24278831609249]
OpenAI's Sora highlights the potential of video generation for developing world models that adhere to physical laws.
But the ability of video generation models to discover such laws purely from visual data without human priors can be questioned.
In this work, we evaluate across three key scenarios: in-distribution, out-of-distribution, and generalization.
arXiv Detail & Related papers (2024-11-04T18:53:05Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- PETAL: Physics Emulation Through Averaged Linearizations for Solving Inverse Problems [0.6039786064227648]
Inverse problems describe the task of recovering an underlying signal of interest given observables.
We propose a simple learned weighted average model that embeds linearizations of the forward model around various reference points into the model itself.
arXiv Detail & Related papers (2023-05-18T15:50:54Z)
- Physics-enhanced Gaussian Process Variational Autoencoder [21.222154875601984]
Variational autoencoders allow one to learn a lower-dimensional latent space based on high-dimensional input/output data.
We propose a physics-enhanced variational autoencoder that places a physics-enhanced Gaussian process prior on the latent dynamics.
The benefits of the proposed approach are highlighted in a simulation with an oscillating particle.
arXiv Detail & Related papers (2023-05-15T20:41:39Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Physics-informed Information Field Theory for Modeling Physical Systems with Uncertainty Quantification [0.0]
Information field theory (IFT) provides the tools necessary to perform statistics over fields that are not necessarily Gaussian.
We extend IFT to physics-informed IFT (PIFT) by encoding the functional priors with information about the physical laws which describe the field.
The posteriors derived from this PIFT remain independent of any numerical scheme and can capture multiple modes.
We numerically demonstrate that the method correctly identifies when the physics cannot be trusted, in which case it automatically treats learning the field as a regression problem.
arXiv Detail & Related papers (2023-01-18T15:40:19Z)
- Modelling of physical systems with a Hopf bifurcation using mechanistic models and machine learning [0.0]
We propose a new hybrid modelling approach that combines a mechanistic model with a machine-learnt model to predict the limit cycle oscillations of physical systems with a Hopf bifurcation.
A data-driven mapping from this model to the experimental observations is then identified based on experimental data using machine learning techniques.
The method is shown to be general, data-efficient and to offer good accuracy without any prior knowledge about the system other than its bifurcation structure.
arXiv Detail & Related papers (2022-09-07T12:27:11Z)
- Likelihood-Free Inference in State-Space Models with Unknown Dynamics [71.94716503075645]
We introduce a method for inferring and predicting latent states in state-space models where observations can only be simulated, and transition dynamics are unknown.
We propose a way of doing likelihood-free inference (LFI) of states and state prediction with a limited number of simulations.
arXiv Detail & Related papers (2021-11-02T12:33:42Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Deep Variational Luenberger-type Observer for Stochastic Video Prediction [46.82873654555665]
We study the problem of video prediction by combining interpretability of state space models and representation of deep neural networks.
Our model builds upon a variational encoder which transforms the input video into a latent feature space and a Luenberger-type observer which captures the dynamic evolution of the latent features.
arXiv Detail & Related papers (2020-02-12T06:59:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.