Related papers: PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation

URL: http://arxiv.org/abs/2509.20358v2
Date: Fri, 07 Nov 2025 19:14:11 GMT
Title: PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Authors: Chen Wang, Chuhao Chen, Yiming Huang, Zhiyang Dou, Yuan Liu, Jiatao Gu, Lingjie Liu,
Abstract summary: Existing generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability.<n>We introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control.<n> Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos.
Score: 53.06495362038348
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing video generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. To overcome these limitations, we introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. At its core is a generative physics network that learns the distribution of physical dynamics across four materials (elastic, sand, plasticine, and rigid) via a diffusion model conditioned on physics parameters and applied forces. We represent physical dynamics as 3D point trajectories and train on a large-scale synthetic dataset of 550K animations generated by physics simulators. We enhance the diffusion model with a novel spatiotemporal attention block that emulates particle interactions and incorporates physics-based constraints during training to enforce physical plausibility. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos that outperform existing methods in both visual quality and physical plausibility. Project Page: https://cwchenwang.github.io/physctrl

Related papers

PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models [100.65199317765608]
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation.<n>We introduce a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional spaces.<n>We extend this paradigm to a unified framework, termed Mimicry-Discovery Cycle (MDcycle), which allows substantial fine-tuning.
arXiv Detail & Related papers (2026-01-16T08:40:10Z)
PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding [50.454084539837005]
PhysChoreo is a novel framework that can generate videos with diverse controllability and physical realism from a single image.<n>Our method consists of two stages: first, it estimates the static initial physical properties of all objects in the image through part-aware physical property reconstruction.<n>Then, through temporally instructed and physically editable simulation, it synthesizes high-quality videos with rich dynamic behaviors and physical realism.
arXiv Detail & Related papers (2025-11-25T17:59:04Z)
PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis [52.905353023326306]
We propose PhysWorld, a framework that synthesizes physically plausible and diverse demonstrations to learn efficient world models.<n>Experiments show that PhysWorld has competitive performance while enabling inference speeds 47 times faster than the recent state-of-the-art method, i.e., PhysTwin.
arXiv Detail & Related papers (2025-10-24T13:25:39Z)
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning [49.88366485306749]
Video generation models nowadays are capable of generating visually realistic videos, but often fail to adhere to physical laws.<n>We propose PhysMaster, which captures physical knowledge as a representation for guiding video generation models to enhance their physics-awareness.
arXiv Detail & Related papers (2025-10-15T17:59:59Z)
Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation [55.046699347579455]
We propose DiffPhy, a generic framework that enables physically-correct and photo-realistic video generation.<n>Our method leverages large language models (LLMs) to explicitly reason a comprehensive physical context from the text prompt.<n>We also establish a high-quality physical video dataset containing diverse phyiscal actions and events to facilitate effective finetuning.
arXiv Detail & Related papers (2025-05-27T18:26:43Z)
Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments [55.465371691714296]
We introduce Morpheus, a benchmark for evaluating video generation models on physical reasoning.<n>It features 80 real-world videos capturing physical phenomena, guided by conservation laws.<n>Our findings reveal that even with advanced prompting and video conditioning, current models struggle to encode physical principles.
arXiv Detail & Related papers (2025-04-03T15:21:17Z)
PhysGen3D: Crafting a Miniature Interactive World from a Single Image [31.41059199853702]
PhysGen3D is a novel framework that transforms a single image into an amodal, camera-centric, interactive 3D scene.<n>At its core, PhysGen3D estimates 3D shapes, poses, physical and lighting properties of objects.<n>We evaluate PhysGen3D's performance against closed-source state-of-the-art (SOTA) image-to-video models, including Pika, Kling, and Gen-3.
arXiv Detail & Related papers (2025-03-26T17:31:04Z)
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation [29.831214435147583]
We present PhysGen, a novel image-to-video generation method. It produces a realistic, physically plausible, and temporally consistent video. Our key insight is to integrate model-based physical simulation with a data-driven video generation process.
arXiv Detail & Related papers (2024-09-27T17:59:57Z)
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion [35.71595369663293]
We propose textbfPhysics3D, a novel method for learning various physical properties of 3D objects through a video diffusion model. Our approach involves designing a highly generalizable physical simulation system based on a viscoelastic material model. Experiments demonstrate the effectiveness of our method with both elastic and plastic materials.
arXiv Detail & Related papers (2024-06-06T17:59:47Z)
DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors.<n>We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.