PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
- URL: http://arxiv.org/abs/2509.20358v2
- Date: Fri, 07 Nov 2025 19:14:11 GMT
- Title: PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
- Authors: Chen Wang, Chuhao Chen, Yiming Huang, Zhiyang Dou, Yuan Liu, Jiatao Gu, Lingjie Liu
- Abstract summary: Existing generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. We introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos.
- Score: 53.06495362038348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing video generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. To overcome these limitations, we introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. At its core is a generative physics network that learns the distribution of physical dynamics across four materials (elastic, sand, plasticine, and rigid) via a diffusion model conditioned on physics parameters and applied forces. We represent physical dynamics as 3D point trajectories and train on a large-scale synthetic dataset of 550K animations generated by physics simulators. We enhance the diffusion model with a novel spatiotemporal attention block that emulates particle interactions and incorporates physics-based constraints during training to enforce physical plausibility. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos that outperform existing methods in both visual quality and physical plausibility. Project Page: https://cwchenwang.github.io/physctrl
Related papers
- PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models [100.65199317765608]
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. We introduce a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional spaces. We extend this paradigm to a unified framework, termed Mimicry-Discovery Cycle (MDcycle), which allows substantial fine-tuning.
arXiv Detail & Related papers (2026-01-16T08:40:10Z)
- PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding [50.454084539837005]
PhysChoreo is a novel framework that can generate videos with diverse controllability and physical realism from a single image. Our method consists of two stages: first, it estimates the static initial physical properties of all objects in the image through part-aware physical property reconstruction. Then, through temporally instructed and physically editable simulation, it synthesizes high-quality videos with rich dynamic behaviors and physical realism.
arXiv Detail & Related papers (2025-11-25T17:59:04Z)
- PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis [52.905353023326306]
We propose PhysWorld, a framework that synthesizes physically plausible and diverse demonstrations to learn efficient world models. Experiments show that PhysWorld has competitive performance while enabling inference speeds 47 times faster than the recent state-of-the-art method, i.e., PhysTwin.
arXiv Detail & Related papers (2025-10-24T13:25:39Z)
- PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning [49.88366485306749]
Video generation models nowadays are capable of generating visually realistic videos, but often fail to adhere to physical laws. We propose PhysMaster, which captures physical knowledge as a representation for guiding video generation models to enhance their physics-awareness.
arXiv Detail & Related papers (2025-10-15T17:59:59Z)
- Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation [55.046699347579455]
We propose DiffPhy, a generic framework that enables physically-correct and photo-realistic video generation. Our method leverages large language models (LLMs) to explicitly reason a comprehensive physical context from the text prompt. We also establish a high-quality physical video dataset containing diverse physical actions and events to facilitate effective finetuning.
arXiv Detail & Related papers (2025-05-27T18:26:43Z)
- Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments [55.465371691714296]
We introduce Morpheus, a benchmark for evaluating video generation models on physical reasoning. It features 80 real-world videos capturing physical phenomena, guided by conservation laws. Our findings reveal that even with advanced prompting and video conditioning, current models struggle to encode physical principles.
arXiv Detail & Related papers (2025-04-03T15:21:17Z)
- PhysGen3D: Crafting a Miniature Interactive World from a Single Image [31.41059199853702]
PhysGen3D is a novel framework that transforms a single image into an amodal, camera-centric, interactive 3D scene. At its core, PhysGen3D estimates 3D shapes, poses, physical and lighting properties of objects. We evaluate PhysGen3D's performance against closed-source state-of-the-art (SOTA) image-to-video models, including Pika, Kling, and Gen-3.
arXiv Detail & Related papers (2025-03-26T17:31:04Z)
- PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation [29.831214435147583]
We present PhysGen, a novel image-to-video generation method.
It produces a realistic, physically plausible, and temporally consistent video.
Our key insight is to integrate model-based physical simulation with a data-driven video generation process.
arXiv Detail & Related papers (2024-09-27T17:59:57Z)
- Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion [35.71595369663293]
We propose Physics3D, a novel method for learning various physical properties of 3D objects through a video diffusion model.
Our approach involves designing a highly generalizable physical simulation system based on a viscoelastic material model.
Experiments demonstrate the effectiveness of our method with both elastic and plastic materials.
arXiv Detail & Related papers (2024-06-06T17:59:47Z)
- DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors. We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)