ContactGaussian-WM: Learning Physics-Grounded World Model from Videos
- URL: http://arxiv.org/abs/2602.11021v1
- Date: Wed, 11 Feb 2026 16:48:13 GMT
- Title: ContactGaussian-WM: Learning Physics-Grounded World Model from Videos
- Authors: Meizhong Wang, Wanxin Jin, Kun Cao, Lihua Xie, Yiguang Hong,
- Abstract summary: We propose ContactGaussian-WM, a differentiable physics-grounded rigid-body world model capable of learning intricate physical laws directly from sparse and contact-rich video sequences. Extensive simulations and real-world evaluations demonstrate that ContactGaussian-WM outperforms state-of-the-art methods in learning complex scenarios.
- Score: 25.368710400385392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing world models that understand complex physical interactions is essential for advancing robotic planning and simulation. However, existing methods often struggle to accurately model the environment under conditions of data scarcity and complex contact-rich dynamic motion. To address these challenges, we propose ContactGaussian-WM, a differentiable physics-grounded rigid-body world model capable of learning intricate physical laws directly from sparse and contact-rich video sequences. Our framework consists of two core components: (1) a unified Gaussian representation for both visual appearance and collision geometry, and (2) an end-to-end differentiable learning framework that differentiates through a closed-form physics engine to infer physical properties from sparse visual observations. Extensive simulations and real-world evaluations demonstrate that ContactGaussian-WM outperforms state-of-the-art methods in learning complex scenarios, exhibiting robust generalization capabilities. Furthermore, we showcase the practical utility of our framework in downstream applications, including data synthesis and real-time MPC.
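The abstract's second component, differentiating through a closed-form physics model to infer physical properties from observations, can be illustrated with a deliberately tiny sketch. This is a hypothetical one-parameter example (a block decelerating under kinetic friction, x(t) = v0*t - 0.5*mu*g*t^2), not the paper's engine: because the dynamics are closed-form, the gradient of the prediction error with respect to the friction coefficient mu is available analytically, and gradient descent recovers mu from a few sparse position observations.

```python
# Toy sketch: infer a physical parameter (friction coefficient mu) by
# differentiating through a closed-form dynamics model. Hypothetical
# illustration only; the paper's engine handles rigid-body contact.

G = 9.81       # gravity (m/s^2)
V0 = 2.0       # known initial velocity (m/s)
MU_TRUE = 0.3  # ground-truth friction, used only to synthesize observations

def position(mu, t):
    # Closed-form position of a sliding block under kinetic friction.
    return V0 * t - 0.5 * mu * G * t * t

def d_position_d_mu(t):
    # Analytic derivative of the closed-form model w.r.t. mu.
    return -0.5 * G * t * t

def fit_mu(observations, steps=2000, lr=0.1):
    """Gradient descent on squared error between predicted and observed x(t)."""
    mu = 0.1  # initial guess
    for _ in range(steps):
        grad = 0.0
        for t, x_obs in observations:
            residual = position(mu, t) - x_obs
            grad += 2.0 * residual * d_position_d_mu(t)
        mu -= lr * grad / len(observations)
    return mu

# Sparse "observations" synthesized from the true parameter.
obs = [(t, position(MU_TRUE, t)) for t in (0.1, 0.2, 0.3)]
mu_hat = fit_mu(obs)
print(round(mu_hat, 3))  # → 0.3
```

The same pattern scales to the paper's setting in spirit: replace the scalar friction model with a differentiable contact-dynamics engine and the scalar observations with photometric losses against video frames.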
Related papers
- Mirage2Matter: A Physically Grounded Gaussian World Model from Video [87.9732484393686]
We present Simulate Anything, a graphics-driven world modeling and simulation framework. Our approach reconstructs real-world environments into a photorealistic scene representation using 3D Gaussian Splatting (3DGS). We then leverage generative models to recover a physically realistic representation and integrate it into a simulation environment via a precision calibration target.
arXiv Detail & Related papers (2026-01-24T07:43:57Z)
- PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models [100.65199317765608]
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. We introduce a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional spaces. We extend this paradigm to a unified framework, termed Mimicry-Discovery Cycle (MDcycle), which allows substantial fine-tuning.
arXiv Detail & Related papers (2026-01-16T08:40:10Z)
- HD-GEN: A High-Performance Software System for Human Mobility Data Generation Based on Patterns of Life [1.9739979974462676]
We introduce a comprehensive software pipeline for calibrating, generating, processing, and visualizing large-scale individual-level human mobility datasets. A data generation engine constructs geographically grounded simulations using OpenStreetMap data. A genetic algorithm-based calibration module fine-tunes simulation parameters to align with real-world mobility characteristics. A data processing suite transforms raw simulation logs into structured formats suitable for downstream applications.
arXiv Detail & Related papers (2026-01-03T16:01:00Z)
- Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects [59.51185639557874]
We introduce Kinematify, an automated framework that synthesizes articulated objects directly from arbitrary RGB images or textual descriptions. Our method addresses two core challenges: (i) inferring kinematic topologies for high-DoF objects and (ii) estimating joint parameters from static geometry.
arXiv Detail & Related papers (2025-11-03T07:21:42Z)
- PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis [52.905353023326306]
We propose PhysWorld, a framework that synthesizes physically plausible and diverse demonstrations to learn efficient world models. Experiments show that PhysWorld achieves competitive performance while enabling inference 47 times faster than the recent state-of-the-art method, PhysTwin.
arXiv Detail & Related papers (2025-10-24T13:25:39Z)
- Guiding Human-Object Interactions with Rich Geometry and Relations [21.528466852204627]
Existing methods often rely on simplified object representations, such as the object's centroid or the nearest point to a human, to achieve physically plausible motions. We introduce ROG, a novel framework that addresses relationships inherent in HOIs with rich geometric detail. We show that ROG significantly outperforms state-of-the-art methods in the realism evaluations and semantic accuracy of synthesized HOIs.
arXiv Detail & Related papers (2025-03-26T02:57:18Z)
- PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos [21.441062722848265]
PhysTwin is a novel framework that uses sparse videos of dynamic objects under interaction to produce a photo- and physically realistic, real-time interactive replica. Our approach centers on two key components: (1) a physics-informed representation that combines spring-mass models for realistic physical simulation, generative shape models for geometry, and Gaussian splats for rendering. Our method integrates an inverse physics framework with visual perception cues, enabling high-fidelity reconstruction even from partial, occluded, and limited viewpoints.
arXiv Detail & Related papers (2025-03-23T07:49:19Z)
- InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions [27.225777494300775]
We introduce InterMimic, a framework that enables a single policy to robustly learn from hours of imperfect MoCap data. Our experiments demonstrate that InterMimic produces realistic and diverse interactions across multiple HOI datasets.
arXiv Detail & Related papers (2025-02-27T18:59:12Z)
- GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects [55.02281855589641]
GausSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels. We leverage continuum mechanics and treat each kernel as a Center of Mass System (CMS) that represents a continuous piece of matter. In addition, GausSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z)
- Integrating Physics and Topology in Neural Networks for Learning Rigid Body Dynamics [6.675805308519987]
We introduce a novel framework for modeling rigid body dynamics and learning collision interactions. We propose a physics-informed message-passing neural architecture, embedding physical laws directly in the model. This work addresses the challenge of multi-entity dynamic interactions, with applications spanning diverse scientific and engineering domains.
arXiv Detail & Related papers (2024-11-18T11:03:15Z)
- Towards Complex Dynamic Physics System Simulation with Graph Neural ODEs [75.7104463046767]
This paper proposes a novel learning-based simulation model that characterizes the varying spatial and temporal dependencies in particle systems.
We empirically evaluate GNSTODE's simulation performance on two real-world particle systems, Gravity and Coulomb.
arXiv Detail & Related papers (2023-05-21T03:51:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.