Generative Physical AI in Vision: A Survey
- URL: http://arxiv.org/abs/2501.10928v1
- Date: Sun, 19 Jan 2025 03:19:47 GMT
- Title: Generative Physical AI in Vision: A Survey
- Authors: Daochang Liu, Junyu Zhang, Anh-Dung Dinh, Eunbyung Park, Shichao Zhang, Chang Xu
- Abstract summary: Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication.
As generative AI evolves to increasingly integrate physical realism and dynamic simulation, its potential to function as a "world simulator" expands.
This survey systematically reviews this emerging field of physics-aware generative AI in computer vision.
- Abstract: Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D or 4D content. Traditionally, generative models primarily focus on visual fidelity while often neglecting the physical plausibility of generated content. This gap limits their effectiveness in applications requiring adherence to real-world physical laws, such as robotics, autonomous systems, and scientific simulations. As generative AI evolves to increasingly integrate physical realism and dynamic simulation, its potential to function as a "world simulator" expands, enabling the modeling of interactions governed by physics and bridging the divide between virtual and physical realities. This survey systematically reviews this emerging field of physics-aware generative AI in computer vision, categorizing methods based on how they incorporate physical knowledge, either through explicit simulation or implicit learning. We analyze key paradigms, discuss evaluation protocols, and identify future research directions. By offering a comprehensive overview, this survey aims to guide future developments in physically grounded generation for vision. The reviewed papers are summarized at https://github.com/BestJunYu/Awesome-Physics-aware-Generation.
Related papers
- Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC [14.522189177415724]
Recent advancements in AI-generated content have significantly improved the realism of 3D and 4D generation.
Most existing methods prioritize appearance consistency while neglecting underlying physical principles.
This survey provides a review of physics-aware generative methods, systematically analyzing how physical constraints are integrated into 3D and 4D generation.
arXiv Detail & Related papers (2025-02-10T20:13:16Z)
- Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation [51.750634349748736]
Text-to-video (T2V) models have made significant strides in visualizing complex prompts.
However, the capacity of these models to accurately represent intuitive physics remains largely unexplored.
We introduce PhyGenBench to evaluate physical commonsense correctness in T2V generation.
arXiv Detail & Related papers (2024-10-07T17:56:04Z)
- Haptic Repurposing with GenAI [5.424247121310253]
Mixed Reality aims to merge the digital and physical worlds to create immersive human-computer interactions.
This paper introduces Haptic Repurposing with GenAI, an innovative approach to enhance MR interactions by transforming any physical objects into adaptive haptic interfaces for AI-generated virtual assets.
arXiv Detail & Related papers (2024-06-11T13:06:28Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events [75.94926117990435]
This study introduces X-VoE, a benchmark dataset to assess AI agents' grasp of intuitive physics.
X-VoE establishes a higher bar for the explanatory capacities of intuitive physics models.
We present an explanation-based learning system that captures physics dynamics and infers occluded object states.
arXiv Detail & Related papers (2023-08-21T03:28:23Z)
- Physics-Informed Computer Vision: A Review and Perspectives [22.71741766133866]
The incorporation of physical information into machine learning frameworks is opening up and transforming many application domains.
We present a systematic literature review of more than 250 papers on formulation and approaches to computer vision tasks guided by physical laws.
arXiv Detail & Related papers (2023-05-29T11:55:11Z)
- ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation [75.0278287071591]
ThreeDWorld (TDW) is a platform for interactive multi-modal physical simulation.
TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments.
We present initial experiments enabled by TDW in emerging research directions in computer vision, machine learning, and cognitive science.
arXiv Detail & Related papers (2020-07-09T17:33:27Z)
- Visual Grounding of Learned Physical Models [66.04898704928517]
Humans intuitively recognize objects' physical properties and predict their motion, even when the objects are engaged in complicated interactions.
We present a neural model that simultaneously reasons about physics and makes future predictions based on visual and dynamics priors.
Experiments show that our model can infer the physical properties within a few observations, which allows the model to quickly adapt to unseen scenarios and make accurate predictions into the future.
arXiv Detail & Related papers (2020-04-28T17:06:38Z)
- RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [56.50243383294621]
We introduce RoboTHOR to democratize research in interactive and embodied visual AI.
We show that a significant gap exists between the performance of models trained in simulation when tested in simulation versus in carefully constructed physical analogs.
arXiv Detail & Related papers (2020-04-14T20:52:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.