PAT3D: Physics-Augmented Text-to-3D Scene Generation
- URL: http://arxiv.org/abs/2511.21978v1
- Date: Wed, 26 Nov 2025 23:23:58 GMT
- Title: PAT3D: Physics-Augmented Text-to-3D Scene Generation
- Authors: Guying Lin, Kemeng Huang, Michael Liu, Ruihan Gao, Hanke Chen, Lyuhao Chen, Beijia Lu, Taku Komura, Yuan Liu, Jun-Yan Zhu, Minchen Li
- Abstract summary: PAT3D generates 3D objects, infers their spatial relations, and organizes them into a hierarchical scene tree. A differentiable rigid-body simulator ensures realistic object interactions under gravity. Experiments demonstrate that PAT3D substantially outperforms prior approaches in physical plausibility, semantic consistency, and visual quality.
- Score: 47.18949891825537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce PAT3D, the first physics-augmented text-to-3D scene generation framework that integrates vision-language models with physics-based simulation to produce physically plausible, simulation-ready, and intersection-free 3D scenes. Given a text prompt, PAT3D generates 3D objects, infers their spatial relations, and organizes them into a hierarchical scene tree, which is then converted into initial conditions for simulation. A differentiable rigid-body simulator ensures realistic object interactions under gravity, driving the scene toward static equilibrium without interpenetrations. To further enhance scene quality, we introduce a simulation-in-the-loop optimization procedure that guarantees physical stability and non-intersection, while improving semantic consistency with the input prompt. Experiments demonstrate that PAT3D substantially outperforms prior approaches in physical plausibility, semantic consistency, and visual quality. Beyond high-quality generation, PAT3D uniquely enables simulation-ready 3D scenes for downstream tasks such as scene editing and robotic manipulation. Code and data will be released upon acceptance.
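The abstract says the hierarchical scene tree is converted into initial conditions for simulation, but it does not say how. The sketch below is only an illustration of that idea, not PAT3D's method. It assumes a single "rests on" relation and vertical placement only. The names `Node`, `initial_heights`, and `gap`, and the example objects and sizes, are all hypothetical. Each child starts slightly above its parent's top face, so the scene begins intersection-free and a rigid-body simulator could let it settle under gravity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical scene-tree node: an object and the objects resting on it."""
    name: str
    height: float  # vertical extent of the object's bounding box, in meters
    children: List["Node"] = field(default_factory=list)


def initial_heights(node: Node, base_z: float = 0.0, gap: float = 0.01) -> Dict[str, float]:
    """Map each object name to the z coordinate of its bottom face.

    Children are placed `gap` above the parent's top face so that no two
    objects overlap before the simulation starts.
    """
    heights = {node.name: base_z}
    top = base_z + node.height
    for child in node.children:
        heights.update(initial_heights(child, top + gap, gap))
    return heights


# Assumed example scene: a plate (holding an apple) and a cup on a table.
table = Node("table", 0.75, [
    Node("plate", 0.02, [Node("apple", 0.08)]),
    Node("cup", 0.10),
])
z = initial_heights(table)
for name, value in z.items():
    print(f"{name}: {value:.2f}")
```

A real pipeline would also need horizontal placement, orientations, and collision geometry. The `gap` parameter simply trades a short free fall for a guaranteed non-intersecting start.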
Related papers
- Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets [63.67760219308476]
We present Seed3D 1.0, a foundation model that generates simulation-ready 3D assets from single images. Unlike existing 3D generation models, our system produces assets with accurate geometry, well-aligned textures, and realistic physically-based materials.
arXiv Detail & Related papers (2025-10-22T18:16:32Z)
- PhysX-3D: Physical-Grounded 3D Asset Generation [48.78065667043986]
Existing 3D generation primarily emphasizes geometries and textures while neglecting physical-grounded modeling. We present PhysXNet, the first physics-grounded 3D dataset systematically annotated across five foundational dimensions. We also propose PhysXGen, a feed-forward framework for physics-grounded image-to-3D asset generation.
arXiv Detail & Related papers (2025-07-16T17:59:35Z)
- Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation [47.6666060652434]
We present an innovative framework that generates 3D models with accurate appearances and geometric structures. By integrating text-to-3D generation with physics-grounded motion synthesis, our framework renders photo-realistic 3D objects.
arXiv Detail & Related papers (2024-12-07T06:48:16Z)
- Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting [32.846428862045634]
We present Sim Anything, a physics-based approach that endows static 3D objects with interactive dynamics. Inspired by human visual reasoning, we propose MLLM-based Physical Property Perception. We also simulate objects in an open-world scene with particles sampled via the Physical-Geometric Adaptive Sampling.
arXiv Detail & Related papers (2024-11-19T12:52:21Z)
- DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors [75.83647027123119]
We propose to learn the physical properties of a material field with video diffusion priors. We then utilize a physics-based Material-Point-Method simulator to generate 4D content with realistic motions.
arXiv Detail & Related papers (2024-06-03T16:05:25Z)
- Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication [50.541882834405946]
We introduce Atlas3D, an automatic and easy-to-implement text-to-3D method.
Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization.
We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
arXiv Detail & Related papers (2024-05-28T18:33:18Z)
- Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives [70.32817882783608]
We present an approach that produces a simple, compact, and actionable 3D world representation by means of 3D primitives.
Unlike existing primitive decomposition methods that rely on 3D input data, our approach operates directly on images.
We show that the resulting textured primitives faithfully reconstruct the input images and accurately model the visible 3D points.
arXiv Detail & Related papers (2023-07-11T17:58:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.