Generative Physical AI in Vision: A Survey
- URL: http://arxiv.org/abs/2501.10928v2
- Date: Sat, 19 Apr 2025 14:52:08 GMT
- Title: Generative Physical AI in Vision: A Survey
- Authors: Daochang Liu, Junyu Zhang, Anh-Dung Dinh, Eunbyung Park, Shichao Zhang, Ajmal Mian, Mubarak Shah, Chang Xu
- Abstract summary: Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D/4D content. As generative models evolve to increasingly integrate physical realism and dynamic simulation, their potential to function as "world simulators" expands.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D/4D content. Conventional generative models primarily focus on visual fidelity while often neglecting the physical plausibility of the generated content. This gap limits their effectiveness in applications that require adherence to real-world physical laws, such as robotics, autonomous systems, and scientific simulations. As generative models evolve to increasingly integrate physical realism and dynamic simulation, their potential to function as "world simulators" expands. Therefore, the field of physics-aware generation in computer vision is rapidly growing, calling for a comprehensive survey to provide a structured analysis of current efforts. To serve this purpose, the survey presents a systematic review, categorizing methods based on how they incorporate physical knowledge, either through explicit simulation or implicit learning. It also analyzes key paradigms, discusses evaluation protocols, and identifies future research directions. By offering a comprehensive overview, this survey aims to help future developments in physically grounded generation for computer vision. The reviewed papers are summarized at https://tinyurl.com/Physics-Aware-Generation.
Related papers
- Digital Gene: Learning about the Physical World through Analytic Concepts [54.21005370169846]
AI systems still struggle when it comes to understanding and interacting with the physical world.
This research introduces the idea of analytic concepts.
They provide machine intelligence a portal to perceive, reason about, and interact with the physical world.
(arXiv 2025-04-05)
- Exploring the Evolution of Physics Cognition in Video Generation: A Survey [44.305405114910904]
This survey aims to provide a comprehensive summary of architecture designs and their applications to fill this gap.
We discuss and organize the evolutionary process of physical cognition in video generation from a cognitive science perspective.
We propose a three-tier taxonomy: 1) basic perception for generation, 2) passive cognition of physical knowledge for generation, and 3) active cognition for world simulation.
(arXiv 2025-03-27)
- Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC [14.522189177415724]
Recent advancements in AI-generated content have significantly improved the realism of 3D and 4D generation.
Most existing methods prioritize appearance consistency while neglecting underlying physical principles.
This survey provides a review of physics-aware generative methods, systematically analyzing how physical constraints are integrated into 3D and 4D generation.
(arXiv 2025-02-10)
- Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation [51.750634349748736]
Text-to-video (T2V) models have made significant strides in visualizing complex prompts.
However, the capacity of these models to accurately represent intuitive physics remains largely unexplored.
We introduce PhyGenBench to evaluate physical commonsense correctness in T2V generation.
(arXiv 2024-10-07)
- Haptic Repurposing with GenAI [5.424247121310253]
Mixed Reality aims to merge the digital and physical worlds to create immersive human-computer interactions.
This paper introduces Haptic Repurposing with GenAI, an innovative approach to enhance MR interactions by transforming any physical objects into adaptive haptic interfaces for AI-generated virtual assets.
(arXiv 2024-06-11)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
(arXiv 2023-10-03)
- X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events [75.94926117990435]
This study introduces X-VoE, a benchmark dataset to assess AI agents' grasp of intuitive physics.
X-VoE establishes a higher bar for the explanatory capacities of intuitive physics models.
We present an explanation-based learning system that captures physics dynamics and infers occluded object states.
(arXiv 2023-08-21)
- Physics-Informed Computer Vision: A Review and Perspectives [22.71741766133866]
The incorporation of physical information into machine learning frameworks is opening up and transforming many application domains.
We present a systematic literature review of more than 250 papers on formulation and approaches to computer vision tasks guided by physical laws.
(arXiv 2023-05-29)
- Visual Grounding of Learned Physical Models [66.04898704928517]
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions.
We present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors.
Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
(arXiv 2020-04-28)
- RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [56.50243383294621]
We introduce RoboTHOR to democratize research in interactive and embodied visual AI.
We show a significant gap between the performance of models trained in simulation when tested in simulation versus in their carefully constructed physical analogs.
(arXiv 2020-04-14)
This list is automatically generated from the titles and abstracts of the papers in this site.