Learning 3D Particle-based Simulators from RGB-D Videos
- URL: http://arxiv.org/abs/2312.05359v1
- Date: Fri, 8 Dec 2023 20:45:34 GMT
- Title: Learning 3D Particle-based Simulators from RGB-D Videos
- Authors: William F. Whitney, Tatiana Lopez-Guevara, Tobias Pfaff, Yulia
Rubanova, Thomas Kipf, Kimberly Stachenfeld, Kelsey R. Allen
- Abstract summary: We propose a method for learning simulators directly from observations.
Visual Particle Dynamics (VPD) jointly learns a latent particle-based representation of 3D scenes, a neural simulator of the particle dynamics, and a renderer.
Unlike existing 2D video prediction models, VPD's 3D structure enables scene editing and long-term predictions.
- Score: 15.683877597215494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Realistic simulation is critical for applications ranging from robotics to
animation. Traditional analytic simulators sometimes struggle to capture
sufficiently realistic simulation, which can lead to problems including the
well-known "sim-to-real" gap in robotics. Learned simulators have emerged as an
alternative for better capturing real-world physical dynamics, but require
access to privileged ground truth physics information such as precise object
geometry or particle tracks. Here we propose a method for learning simulators
directly from observations. Visual Particle Dynamics (VPD) jointly learns a
latent particle-based representation of 3D scenes, a neural simulator of the
latent particle dynamics, and a renderer that can produce images of the scene
from arbitrary views. VPD learns end to end from posed RGB-D videos and does
not require access to privileged information. Unlike existing 2D video
prediction models, we show that VPD's 3D structure enables scene editing and
long-term predictions. These results pave the way for downstream applications
ranging from video editing to robotic planning.
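As a concrete illustration of the pipeline the abstract describes, the sketch below shows the first stage a VPD-style model needs: lifting a posed RGB-D frame into a world-frame point cloud and subsampling it into a fixed set of particles carrying colour features. This is a minimal sketch under assumed conventions (a pinhole intrinsics matrix K and a camera-to-world pose), not the authors' implementation; names such as `unproject_rgbd` and `sample_particles` are illustrative.

```python
import numpy as np

def unproject_rgbd(rgb, depth, K, cam_to_world):
    """Lift an RGB-D image into world-frame 3D points with per-point colours.

    rgb:          (H, W, 3) colours in [0, 1]
    depth:        (H, W) metric depth
    K:            (3, 3) pinhole intrinsics
    cam_to_world: (4, 4) homogeneous camera-to-world pose
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))            # pixel coordinates
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T                            # camera-frame rays
    pts_cam = rays * depth[..., None]                          # scale rays by depth
    pts_h = np.concatenate([pts_cam, np.ones((H, W, 1))], axis=-1)
    pts_world = (pts_h.reshape(-1, 4) @ cam_to_world.T)[:, :3]
    feats = rgb.reshape(-1, 3)
    valid = depth.reshape(-1) > 0                              # drop invalid depth
    return pts_world[valid], feats[valid]

def sample_particles(points, feats, num_particles=2048, rng=None):
    """Randomly subsample the point cloud into a fixed-size particle set."""
    if rng is None:
        rng = np.random.default_rng(0)
    idx = rng.choice(len(points), size=min(num_particles, len(points)), replace=False)
    return points[idx], feats[idx]

# Usage with synthetic inputs (real ones would come from a posed RGB-D video):
K = np.array([[200.0, 0.0, 64.0], [0.0, 200.0, 48.0], [0.0, 0.0, 1.0]])
pose = np.eye(4)                                               # camera at the origin
rgb = np.random.rand(96, 128, 3)
depth = np.random.uniform(0.5, 3.0, size=(96, 128))
points, colours = unproject_rgbd(rgb, depth, K, pose)
particles, particle_colours = sample_particles(points, colours)
print(particles.shape, particle_colours.shape)                 # (2048, 3) (2048, 3)
```

In the paper's setting, such particles would carry learned latent features and be advanced in time by a neural simulator; a companion sketch of a message-passing dynamics step appears after the related-papers list below.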
Related papers
- Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling [10.247075501610492]
We introduce a framework to learn object dynamics directly from multi-view RGB videos.
We train a particle-based dynamics model using Graph Neural Networks.
Our method can predict object motions under varying initial configurations and unseen robot actions.
arXiv Detail & Related papers (2024-10-24T17:02:52Z)
- DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors.
We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)
- Scaling Face Interaction Graph Networks to Real World Scenes [12.519862235430153]
We introduce a method which substantially reduces the memory required to run graph-based learned simulators.
We show that our method uses substantially less memory than previous graph-based simulators while retaining their accuracy.
This paves the way for expanding the application of learned simulators to settings where only perceptual information is available at inference time.
arXiv Detail & Related papers (2024-01-22T14:38:25Z)
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- 3D-IntPhys: Towards More Generalized 3D-grounded Visual Intuitive Physics under Challenging Scenes [68.66237114509264]
We present a framework capable of learning 3D-grounded visual intuitive physics models from videos of complex scenes with fluids.
We show our model can make long-horizon future predictions by learning from raw images and significantly outperforms models that do not employ an explicit 3D representation space.
arXiv Detail & Related papers (2023-04-22T19:28:49Z)
- NeuPhysics: Editable Neural Geometry and Physics from Monocular Videos [82.74918564737591]
We present a method for learning 3D geometry and physics parameters of a dynamic scene from only a monocular RGB video input.
Experiments show that our method achieves superior mesh and video reconstruction of dynamic scenes compared to competing Neural Field approaches.
arXiv Detail & Related papers (2022-10-22T04:57:55Z)
- T3VIP: Transformation-based 3D Video Prediction [49.178585201673364]
We propose a 3D video prediction (T3VIP) approach that explicitly models the 3D motion by decomposing a scene into its object parts.
Our model is fully unsupervised, captures the nature of the real world, and uses observational cues in the image and point cloud domains as its learning signals.
To the best of our knowledge, our model is the first generative model that provides an RGB-D video prediction of the future for a static camera.
arXiv Detail & Related papers (2022-09-19T15:01:09Z)
- 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
- 3D-OES: Viewpoint-Invariant Object-Factorized Environment Simulators [24.181604511269096]
We propose an action-conditioned dynamics model that predicts scene changes caused by object and agent interactions in a viewpoint-invariant 3D neural scene representation space.
In this space, objects do not interfere with one another and their appearance persists over time and across viewpoints.
We show our model generalizes its predictions well across varying numbers and appearances of interacting objects, as well as across camera viewpoints.
arXiv Detail & Related papers (2020-11-12T16:15:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
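Several of the related papers above, like VPD's own latent particle simulator, advance a particle set in time with a graph neural network. The sketch below is a minimal, illustrative message-passing step over a radius graph of particles; the fixed connectivity radius, the tiny NumPy MLPs, and the parameter shapes are assumptions for illustration, not any of these papers' actual architectures.

```python
import numpy as np

def radius_edges(positions, radius):
    """Return (sender, receiver) index pairs for particles within `radius`."""
    diff = positions[None, :, :] - positions[:, None, :]
    dist = np.linalg.norm(diff, axis=-1)
    senders, receivers = np.nonzero((dist < radius) & (dist > 0))
    return senders, receivers

def mlp(x, w1, b1, w2, b2):
    """Two-layer ReLU MLP applied row-wise."""
    return np.maximum(x @ w1 + b1, 0) @ w2 + b2

def message_passing_step(positions, features, params, radius=0.2):
    """One interaction-network update: edge messages, then residual node updates."""
    s, r = radius_edges(positions, radius)
    rel = positions[s] - positions[r]                       # relative displacements
    edge_in = np.concatenate([features[s], features[r], rel], axis=-1)
    messages = mlp(edge_in, *params["edge"])
    agg = np.zeros((len(positions), messages.shape[-1]))
    np.add.at(agg, r, messages)                             # sum messages per receiver
    node_in = np.concatenate([features, agg], axis=-1)
    return features + mlp(node_in, *params["node"])         # residual feature update

# Usage with random particles and randomly initialised (untrained) parameters:
rng = np.random.default_rng(0)
P, F, M = 256, 16, 32                                       # particles, feature dim, message dim
params = {
    "edge": (rng.normal(size=(2 * F + 3, 64)) * 0.1, np.zeros(64),
             rng.normal(size=(64, M)) * 0.1, np.zeros(M)),
    "node": (rng.normal(size=(F + M, 64)) * 0.1, np.zeros(64),
             rng.normal(size=(64, F)) * 0.1, np.zeros(F)),
}
pos = rng.uniform(size=(P, 3))
feat = rng.normal(size=(P, F))
print(message_passing_step(pos, feat, params).shape)        # (256, 16)
```

A learned simulator would typically stack several such steps per time step and decode the updated features into particle displacements; the MLP weights would be fitted from data rather than left at the random initialisation shown here.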