Towards Physically Interpretable World Models: Meaningful Weakly Supervised Representations for Visual Trajectory Prediction
- URL: http://arxiv.org/abs/2412.12870v2
- Date: Mon, 27 Jan 2025 18:13:37 GMT
- Title: Towards Physically Interpretable World Models: Meaningful Weakly Supervised Representations for Visual Trajectory Prediction
- Authors: Zhenjiang Mao, Ivan Ruchkin
- Abstract summary: Deep learning models are increasingly employed for perception, prediction, and control in complex systems.
Embedding physical knowledge into these models is crucial for achieving realistic and consistent outputs.
We propose Physically Interpretable World Models, a novel architecture that aligns learned latent representations with real-world physical quantities.
- Abstract: Deep learning models are increasingly employed for perception, prediction, and control in complex systems. Embedding physical knowledge into these models is crucial for achieving realistic and consistent outputs, a challenge often addressed by physics-informed machine learning. However, integrating physical knowledge with representation learning becomes difficult when dealing with high-dimensional observation data, such as images, particularly under conditions of incomplete or imprecise state information. To address this, we propose Physically Interpretable World Models, a novel architecture that aligns learned latent representations with real-world physical quantities. Our method combines a variational autoencoder with a dynamical model that incorporates unknown system parameters, enabling the discovery of physically meaningful representations. By employing weak supervision with interval-based constraints, our approach eliminates the reliance on ground-truth physical annotations. Experimental results demonstrate that our method improves the quality of learned representations while achieving accurate predictions of future states, advancing the field of representation learning in dynamic systems.
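The abstract's key mechanism is weak supervision with interval-based constraints in place of ground-truth labels. A minimal sketch of one plausible form of such a constraint is below: a hinge penalty that is zero while each physically interpretable latent stays inside its known interval and grows linearly outside it. The function name, the hinge form, and the example bounds are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def interval_penalty(z_phys, lower, upper):
    """Hinge penalty on physically interpretable latents: zero when
    each latent lies inside its known interval [lower, upper],
    growing linearly with the amount by which it leaves it."""
    below = np.maximum(lower - z_phys, 0.0)  # violation of the lower bound
    above = np.maximum(z_phys - upper, 0.0)  # violation of the upper bound
    return float(np.sum(below + above))

# Latents meant to encode, e.g., position and velocity, supervised
# only by coarse interval bounds rather than exact annotations.
z = np.array([0.4, 2.7])
lo = np.array([0.0, -1.0])
hi = np.array([1.0, 2.0])
print(round(interval_penalty(z, lo, hi), 3))  # 0.7: second latent exceeds its bound
```

In training, a term like this would be added to the reconstruction and prediction losses, nudging the latent space toward physically plausible ranges without requiring labeled states.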
Related papers
- Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction [84.26340606752763]
In this paper, we introduce the conservation-informed GNN (CiGNN), an end-to-end explainable learning framework.
The network is designed to conform to the general symmetry conservation law, with conservative and non-conservative information passed over a multiscale space by a latent temporal marching strategy.
Results demonstrate that CiGNN exhibits remarkable baseline accuracy and generalizability, and is readily applicable to learning for prediction of various spatiotemporal dynamics.
arXiv Detail & Related papers (2024-12-30T13:55:59Z)
- ContPhy: Continuum Physical Concept Learning and Reasoning from Videos [86.63174804149216]
ContPhy is a novel benchmark for assessing machine physical commonsense.
We evaluated a range of AI models and found that they still struggle to achieve satisfactory performance on ContPhy.
We also introduce an oracle model (ContPRO) that marries the particle-based physical dynamic models with the recent large language models.
arXiv Detail & Related papers (2024-02-09T01:09:21Z)
- Bridging the Gap to Real-World Object-Centric Learning [66.55867830853803]
We show that reconstructing features from models trained in a self-supervised manner is a sufficient training signal for object-centric representations to arise in a fully unsupervised way.
Our approach, DINOSAUR, significantly outperforms existing object-centric learning models on simulated data.
arXiv Detail & Related papers (2022-09-29T15:24:47Z)
- Knowledge-based Deep Learning for Modeling Chaotic Systems [7.075125892721573]
This paper considers extreme events and their dynamics and proposes models based on deep neural networks, called knowledge-based deep learning (KDL).
Our proposed KDL can learn the complex patterns governing chaotic systems by jointly training on real and simulated data.
We validate our model by assessing it on three real-world benchmark datasets: El Niño sea surface temperature, San Juan Dengue viral infection, and Bjørnøya daily precipitation.
arXiv Detail & Related papers (2022-09-09T11:46:25Z)
- Learning dynamics from partial observations with structured neural ODEs [5.757156314867639]
We propose a flexible framework to incorporate a broad spectrum of physical insight into neural ODE-based system identification.
We demonstrate the performance of the proposed approach on numerical simulations and on an experimental dataset from a robotic exoskeleton.
arXiv Detail & Related papers (2022-05-25T07:54:10Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models.
We propose a VAE architecture in which a part of the latent space is grounded by physics.
We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z)
- Bridging the Gap: Machine Learning to Resolve Improperly Modeled Dynamics [4.940323406667406]
We present a data-driven modeling strategy to overcome improperly modeled dynamics for systems exhibiting complex spatiotemporal behaviors.
We propose a deep learning framework to resolve the differences between a system's true dynamics and the dynamics given by a model that describes the system inaccurately or inadequately.
arXiv Detail & Related papers (2020-08-23T04:57:02Z)
- Heteroscedastic Uncertainty for Robust Generative Latent Dynamics [7.107159120605662]
We present a method to jointly learn a latent state representation and the associated dynamics.
As our main contribution, we describe how our representation is able to capture a notion of heteroscedastic or input-specific uncertainty.
We present results from prediction and control experiments on two image-based tasks.
arXiv Detail & Related papers (2020-08-18T21:04:33Z)
- Visual Grounding of Learned Physical Models [66.04898704928517]
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions.
We present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors.
Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
arXiv Detail & Related papers (2020-04-28T17:06:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.