MotionPhysics: Learnable Motion Distillation for Text-Guided Simulation
- URL: http://arxiv.org/abs/2601.00504v1
- Date: Thu, 01 Jan 2026 22:56:37 GMT
- Title: MotionPhysics: Learnable Motion Distillation for Text-Guided Simulation
- Authors: Miaowei Wang, Jakub Zadrożny, Oisin Mac Aodha, Amir Vaxman
- Abstract summary: MotionPhysics is an end-to-end differentiable framework that infers plausible physical parameters from a user-provided natural language prompt. We evaluate MotionPhysics across more than thirty scenarios, including real-world, human-designed, and AI-generated 3D objects.
- Score: 25.78198969054392
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately simulating existing 3D objects and a wide variety of materials often demands expert knowledge and time-consuming physical parameter tuning to achieve the desired dynamic behavior. We introduce MotionPhysics, an end-to-end differentiable framework that infers plausible physical parameters from a user-provided natural language prompt for a chosen 3D scene of interest, removing the need for guidance from ground-truth trajectories or annotated videos. Our approach first utilizes a multimodal large language model to estimate material parameter values, which are constrained to lie within plausible ranges. We further propose a learnable motion distillation loss that extracts robust motion priors from pretrained video diffusion models while minimizing appearance and geometry inductive biases to guide the simulation. We evaluate MotionPhysics across more than thirty scenarios, including real-world, human-designed, and AI-generated 3D objects, spanning a wide range of materials such as elastic solids, metals, foams, sand, and both Newtonian and non-Newtonian fluids. We demonstrate that MotionPhysics produces visually realistic dynamic simulations guided by natural language, surpassing the state of the art while automatically determining physically plausible parameters. The code and project page are available at: https://wangmiaowei.github.io/MotionPhysics.github.io/.
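The abstract states that MLLM-estimated material parameters are "constrained to lie within plausible ranges." The paper's exact parameterization is not given here; as an illustrative sketch, one common way to enforce such a constraint differentiably is a sigmoid reparameterization that maps an unconstrained latent value into a bounded interval (the function name and the example range below are hypothetical, not taken from the paper):

```python
import math

def to_plausible_range(latent: float, lo: float, hi: float) -> float:
    """Map an unconstrained latent value into [lo, hi] via a sigmoid.

    The mapping is smooth, so in an autodiff framework gradients can
    flow through it while the parameter stays inside its bounds.
    """
    s = 1.0 / (1.0 + math.exp(-latent))  # squash to (0, 1)
    return lo + (hi - lo) * s            # rescale to (lo, hi)

# Hypothetical plausible range for a foam-like Young's modulus (Pa):
young_modulus = to_plausible_range(0.0, 1e4, 1e6)
```

Because the squashed value never leaves the interval, a gradient-based optimizer (here, one driven by the motion distillation loss) cannot push a parameter to a physically implausible value.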
Related papers
- PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis [52.905353023326306]
We propose PhysWorld, a framework that synthesizes physically plausible and diverse demonstrations to learn efficient world models. Experiments show that PhysWorld has competitive performance while enabling inference speeds 47 times faster than the recent state-of-the-art method, i.e., PhysTwin.
arXiv Detail & Related papers (2025-10-24T13:25:39Z) - PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation [53.06495362038348]
Existing generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. We introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos.
arXiv Detail & Related papers (2025-09-24T17:58:04Z) - TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos [7.616167860385134]
We propose a new framework named TRACE to model the motion physics of complex dynamic 3D scenes. By formulating each 3D point as a rigid particle with size and orientation in space, we directly learn a translation-rotation dynamics system for each particle.
arXiv Detail & Related papers (2025-08-13T13:43:01Z) - PhysFlow: Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation [9.306758077479472]
PhysFlow is a novel approach that leverages multi-modal foundation models and video diffusion to achieve enhanced 4D dynamic scene simulation. This integrated framework enables accurate prediction and realistic simulation of dynamic interactions in real-world scenarios.
arXiv Detail & Related papers (2024-11-21T18:55:23Z) - Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting [32.846428862045634]
We present Sim Anything, a physics-based approach that endows static 3D objects with interactive dynamics. Inspired by human visual reasoning, we propose MLLM-based Physical Property Perception. We also simulate objects in an open-world scene with particles sampled via the Physical-Geometric Adaptive Sampling.
arXiv Detail & Related papers (2024-11-19T12:52:21Z) - Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video [58.043569985784806]
We introduce latent intuitive physics, a transfer learning framework for physics simulation.
It can infer hidden properties of fluids from a single 3D video and simulate the observed fluid in novel scenes.
We validate our model in three ways: (i) novel scene simulation with the learned visual-world physics, (ii) future prediction of the observed fluid dynamics, and (iii) supervised particle simulation.
arXiv Detail & Related papers (2024-06-18T16:37:44Z) - Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion [35.71595369663293]
We propose Physics3D, a novel method for learning various physical properties of 3D objects through a video diffusion model.
Our approach involves designing a highly generalizable physical simulation system based on a viscoelastic material model.
Experiments demonstrate the effectiveness of our method with both elastic and plastic materials.
arXiv Detail & Related papers (2024-06-06T17:59:47Z) - DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors. We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z) - PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation [62.53760963292465]
PhysDreamer is a physics-based approach that endows static 3D objects with interactive dynamics.
We present our approach on diverse examples of elastic objects and evaluate the realism of the synthesized interactions through a user study.
arXiv Detail & Related papers (2024-04-19T17:41:05Z) - Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language [92.7638697243969]
We propose a unified framework that can jointly learn visual concepts and infer physics models of objects from videos and language.
This is achieved by seamlessly integrating three components: a visual perception module, a concept learner, and a differentiable physics engine.
arXiv Detail & Related papers (2021-10-28T17:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.