PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
- URL: http://arxiv.org/abs/2511.20562v1
- Date: Tue, 25 Nov 2025 17:59:04 GMT
- Title: PhysChoreo: Physics-Controllable Video Generation with Part-Aware Semantic Grounding
- Authors: Haoze Zhang, Tianyu Huang, Zichen Wan, Xiaowei Jin, Hongzhi Zhang, Hui Li, Wangmeng Zuo,
- Abstract summary: PhysChoreo is a novel framework that can generate videos with diverse controllability and physical realism from a single image. Our method consists of two stages: first, it estimates the static initial physical properties of all objects in the image through part-aware physical property reconstruction. Then, through temporally instructed and physically editable simulation, it synthesizes high-quality videos with rich dynamic behaviors and physical realism.
- Score: 50.454084539837005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While recent video generation models have achieved significant visual fidelity, they often suffer from a lack of explicit physical controllability and plausibility. To address this, some recent studies have attempted to guide video generation with physics-based rendering. However, these methods face inherent challenges in accurately modeling complex physical properties and effectively controlling the resulting physical behavior over extended temporal sequences. In this work, we introduce PhysChoreo, a novel framework that can generate videos with diverse controllability and physical realism from a single image. Our method consists of two stages: first, it estimates the static initial physical properties of all objects in the image through part-aware physical property reconstruction. Then, through temporally instructed and physically editable simulation, it synthesizes high-quality videos with rich dynamic behaviors and physical realism. Experimental results show that PhysChoreo can generate videos with rich behaviors and physical realism, outperforming state-of-the-art methods on multiple evaluation metrics.
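The abstract describes a two-stage design (part-aware property estimation, then temporally instructed simulation) but gives no implementation details. The sketch below is only a minimal illustration of that control flow under assumed interfaces: every name (PartProperties, estimate_part_properties, simulate_step, generate_video) and every numeric value is hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class PartProperties:
    """Static initial physical properties estimated for one object part (stage 1 output)."""
    name: str
    mass: float             # kg
    youngs_modulus: float   # Pa
    friction: float         # Coulomb friction coefficient

def estimate_part_properties(image):
    """Stage 1 placeholder: part-aware physical property reconstruction."""
    # A real system would segment the image into parts and regress per-part properties.
    return [PartProperties("cup_body", mass=0.20, youngs_modulus=2e9, friction=0.4),
            PartProperties("cup_handle", mass=0.05, youngs_modulus=2e9, friction=0.4)]

def simulate_step(parts, state, dt=1.0 / 30.0):
    """Stage 2 placeholder: one step of physically editable simulation (gravity + ground)."""
    for p in parts:
        s = state.setdefault(p.name, {"y": 1.0, "vy": 0.0})
        s["vy"] -= 9.81 * dt
        s["y"] = max(0.0, s["y"] + s["vy"] * dt)
    return state

def generate_video(image, instruction, num_frames=48):
    """Two-stage driver: estimate properties once, then roll the simulation per frame."""
    parts = estimate_part_properties(image)
    state, frames = {}, []
    for _ in range(num_frames):
        state = simulate_step(parts, state)
        frames.append({name: s["y"] for name, s in state.items()})  # stand-in for rendering
    return frames

print(generate_video(image=None, instruction="drop the cup")[:3])
```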
Related papers
- PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models [100.65199317765608]
Physical principles are fundamental to realistic visual simulation, but remain a significant oversight in transformer-based video generation. We introduce a physics-aware reinforcement learning paradigm for video generation models that enforces physical collision rules directly in high-dimensional spaces. We extend this paradigm to a unified framework, termed Mimicry-Discovery Cycle (MDcycle), which allows substantial fine-tuning.
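The summary says the reinforcement learning paradigm enforces physical collision rules during fine-tuning, without specifying the reward. Below is a hedged sketch of one plausible collision-penalty reward over generated object trajectories; the spherical proxies, weights, and function names are assumptions, not the paper's formulation.

```python
import numpy as np

def collision_penalty(centers, radii):
    """Sum of pairwise interpenetration depths for spherical object proxies.

    centers: (N, 3) array of object centers in one generated frame.
    radii:   (N,)   array of bounding-sphere radii.
    """
    penalty = 0.0
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            overlap = radii[i] + radii[j] - np.linalg.norm(centers[i] - centers[j])
            if overlap > 0:
                penalty += overlap
    return penalty

def physics_reward(frames, radii, weight=1.0):
    """Reward term: negative accumulated interpenetration over a generated clip."""
    return -weight * sum(collision_penalty(f, radii) for f in frames)

# Two unit spheres that overlap in the second frame incur a penalty.
frames = [np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]),
          np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])]
print(physics_reward(frames, radii=np.array([1.0, 1.0])))  # -0.5
```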
arXiv Detail & Related papers (2026-01-16T08:40:10Z)
- PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis [52.905353023326306]
We propose PhysWorld, a framework that synthesizes physically plausible and diverse demonstrations to learn efficient world models. Experiments show that PhysWorld has competitive performance while enabling inference speeds 47 times faster than the recent state-of-the-art method, i.e., PhysTwin.
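The summarized pipeline, synthesize demonstrations and then learn a world model that is much cheaper than simulation at inference time, can be illustrated with a toy example. The simulator, the affine model class, and the data sizes below are illustrative stand-ins and not PhysWorld's actual architecture.

```python
import numpy as np

def demo_transition(state, dt=0.02):
    """Toy ground-truth dynamics used to synthesize demonstrations: damped falling point."""
    pos, vel = state
    vel = 0.99 * vel - 9.81 * dt
    return np.array([pos + vel * dt, vel])

# Synthesize (state, next_state) demonstration pairs from random initial states.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(1000, 2))
next_states = np.stack([demo_transition(s) for s in states])

# "World model": an affine one-step predictor fit by least squares; at inference
# it replaces the simulator with a single matrix multiply.
X = np.hstack([states, np.ones((len(states), 1))])
W, *_ = np.linalg.lstsq(X, next_states, rcond=None)

query = np.array([0.5, 0.0])
pred = np.append(query, 1.0) @ W
print(pred, demo_transition(query))  # learned prediction vs. simulator ground truth
```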
arXiv Detail & Related papers (2025-10-24T13:25:39Z)
- PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning [49.88366485306749]
Video generation models nowadays are capable of generating visually realistic videos, but often fail to adhere to physical laws. We propose PhysMaster, which captures physical knowledge as a representation for guiding video generation models to enhance their physics-awareness.
arXiv Detail & Related papers (2025-10-15T17:59:59Z)
- PhysHMR: Learning Humanoid Control Policies from Vision for Physically Plausible Human Motion Reconstruction [52.44375492811009]
We present PhysHMR, a unified framework that learns a visual-to-action policy for humanoid control in a physics-based simulator. A key component of our approach is the pixel-as-ray strategy, which lifts 2D keypoints into 3D spatial rays and transforms them into global space. PhysHMR produces high-fidelity, physically plausible motion across diverse scenarios, outperforming prior approaches in both visual accuracy and physical realism.
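"Pixel-as-ray" as summarized here, lifting 2D keypoints to 3D rays in global space, matches standard pinhole back-projection. The snippet below shows that generic construction rather than PhysHMR's exact implementation, and the intrinsics in the example are made up.

```python
import numpy as np

def pixel_to_ray(u, v, K, R, t):
    """Return (origin, direction) of the world-space ray through pixel (u, v)."""
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # back-project to a camera-frame direction
    d_world = R @ d_cam                               # rotate into the world (global) frame
    d_world /= np.linalg.norm(d_world)
    return t, d_world                                 # ray origin is the camera center

# Illustrative 640x480 pinhole camera at the world origin, looking down +z.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
origin, direction = pixel_to_ray(320.0, 240.0, K, R=np.eye(3), t=np.zeros(3))
print(origin, direction)  # the center pixel maps to the optical axis [0, 0, 1]
```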
arXiv Detail & Related papers (2025-10-02T21:01:11Z)
- PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation [53.06495362038348]
Existing generation models excel at producing photo-realistic videos from text or images, but often lack physical plausibility and 3D controllability. We introduce PhysCtrl, a novel framework for physics-grounded image-to-video generation with physical parameters and force control. Experiments show that PhysCtrl generates realistic, physics-grounded motion trajectories which, when used to drive image-to-video models, yield high-fidelity, controllable videos.
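PhysCtrl learns a generative physics network conditioned on physical parameters and forces; the summary does not expose its interface. The sketch below only illustrates the kind of force- and parameter-conditioned trajectory such a system outputs, using a hand-written point-mass integrator with assumed defaults.

```python
import numpy as np

def force_controlled_trajectory(mass=1.0, force=(2.0, 0.0, 0.0),
                                num_steps=60, dt=1.0 / 30.0):
    """Point-mass rollout under gravity plus a constant user-applied force."""
    g = np.array([0.0, -9.81, 0.0])
    f = np.asarray(force, dtype=float)
    pos, vel, traj = np.zeros(3), np.zeros(3), []
    for _ in range(num_steps):
        acc = g + f / mass       # Newton's second law with the control force
        vel = vel + acc * dt     # semi-implicit Euler: velocity first,
        pos = pos + vel * dt     # then position with the updated velocity
        traj.append(pos.copy())
    return np.stack(traj)        # (num_steps, 3) motion trajectory to condition a video model

print(force_controlled_trajectory()[:3])
```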
arXiv Detail & Related papers (2025-09-24T17:58:04Z)
- PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos [21.441062722848265]
PhysTwin is a novel framework that uses sparse videos of dynamic objects under interaction to produce a photo- and physically realistic, real-time interactive replica. Our approach centers on two key components: (1) a physics-informed representation that combines spring-mass models for realistic physical simulation, generative shape models for geometry, and Gaussian splats for rendering. Our method integrates an inverse physics framework with visual perception cues, enabling high-fidelity reconstruction even from partial, occluded, and limited viewpoints.
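The spring-mass component mentioned in the summary can be grounded with a textbook integration step; the stiffness, damping, and time step below are illustrative defaults, not values from PhysTwin, and the generative shape and Gaussian splat components are omitted.

```python
import numpy as np

def spring_mass_step(x, v, springs, rest_len, k=50.0, damping=0.98, mass=1.0, dt=1e-3):
    """One explicit step for node positions x (N, 3) and velocities v (N, 3).

    springs: list of (i, j) node-index pairs; rest_len: matching rest lengths.
    """
    f = np.zeros_like(x)
    f[:, 1] -= 9.81 * mass                      # gravity on every node
    for (i, j), L0 in zip(springs, rest_len):
        d = x[i] - x[j]
        length = np.linalg.norm(d) + 1e-9
        fs = -k * (length - L0) * (d / length)  # Hooke's law along the spring axis
        f[i] += fs
        f[j] -= fs
    v = damping * (v + (f / mass) * dt)
    return x + v * dt, v

# Two nodes joined by one spring that starts stretched past its rest length.
x = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
v = np.zeros_like(x)
x, v = spring_mass_step(x, v, springs=[(0, 1)], rest_len=[1.0])
```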
arXiv Detail & Related papers (2025-03-23T07:49:19Z)
- PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation [29.831214435147583]
We present PhysGen, a novel image-to-video generation method. It produces a realistic, physically plausible, and temporally consistent video. Our key insight is to integrate model-based physical simulation with a data-driven video generation process.
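The "model-based physical simulation + data-driven generation" insight can be made concrete on the simulation side with a minimal 2D rigid-body integrator whose per-frame states would condition a renderer; all constants below are illustrative and the rendering half of the pipeline is omitted.

```python
from dataclasses import dataclass

@dataclass
class RigidBody2D:
    x: float = 0.0      # position (m)
    y: float = 1.0
    vx: float = 1.0     # linear velocity (m/s)
    vy: float = 0.0
    theta: float = 0.0  # orientation (rad)
    omega: float = 2.0  # angular velocity (rad/s)

def step(b, dt=1.0 / 30.0, g=9.81, restitution=0.6):
    """Advance one frame: gravity, free rotation, and a damped bounce at y = 0."""
    b.vy -= g * dt
    b.x += b.vx * dt
    b.y += b.vy * dt
    b.theta += b.omega * dt
    if b.y < 0.0:                 # ground contact: clamp and reflect the vertical velocity
        b.y = 0.0
        b.vy = -restitution * b.vy
    return b

body, poses = RigidBody2D(), []
for _ in range(60):
    body = step(body)
    poses.append((body.x, body.y, body.theta))  # per-frame pose that would condition rendering
```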
arXiv Detail & Related papers (2024-09-27T17:59:57Z)
- DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors. We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
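The summarized recipe, fitting material properties against a video-diffusion prior and then simulating with the Material Point Method, is sketched below in a heavily simplified form: a single scalar stiffness instead of a material field, a toy surrogate instead of MPM, an analytic stand-in for the prior, and finite-difference gradient descent as the optimizer.

```python
import math

def simulate_amplitude(log_stiffness):
    """Toy surrogate for a physics rollout: softer material -> larger deformation."""
    return math.exp(-0.5 * log_stiffness)

def prior_loss(amplitude, target=0.05):
    """Stand-in for a video-prior objective: match a preferred motion magnitude."""
    return (amplitude - target) ** 2

log_E, lr, eps = math.log(50.0), 200.0, 1e-4
for _ in range(300):
    loss = prior_loss(simulate_amplitude(log_E))
    grad = (prior_loss(simulate_amplitude(log_E + eps)) - loss) / eps  # finite difference
    log_E -= lr * grad

print(f"recovered stiffness ~ {math.exp(log_E):.0f}")  # target amplitude 0.05 -> stiffness ~ 400
```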
arXiv Detail & Related papers (2024-06-03T16:05:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.