A Survey on World Models Grounded in Acoustic Physical Information
- URL: http://arxiv.org/abs/2506.13833v1
- Date: Mon, 16 Jun 2025 04:59:42 GMT
- Title: A Survey on World Models Grounded in Acoustic Physical Information
- Authors: Xiaoliang Chen, Le Chang, Xin Yu, Yunhe Huang, Xianling Tu
- Abstract summary: This survey provides a comprehensive overview of the emerging field of world models grounded in acoustic physical information. It examines the theoretical underpinnings, essential methodological frameworks, and recent technological advancements. The survey details the significant applications of acoustic world models in robotics, autonomous driving, healthcare, and finance.
- Score: 12.985712909050502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This survey provides a comprehensive overview of the emerging field of world models grounded in the foundation of acoustic physical information. It examines the theoretical underpinnings, essential methodological frameworks, and recent technological advancements in leveraging acoustic signals for high-fidelity environmental perception, causal physical reasoning, and predictive simulation of dynamic events. The survey explains how acoustic signals, as direct carriers of mechanical wave energy from physical events, encode rich, latent information about material properties, internal geometric structures, and complex interaction dynamics. Specifically, this survey establishes the theoretical foundation by explaining how fundamental physical laws govern the encoding of physical information within acoustic signals. It then reviews the core methodological pillars, including Physics-Informed Neural Networks (PINNs), generative models, and self-supervised multimodal learning frameworks. Furthermore, the survey details the significant applications of acoustic world models in robotics, autonomous driving, healthcare, and finance. Finally, it systematically outlines the important technical and ethical challenges while proposing a concrete roadmap for future research directions toward robust, causal, uncertainty-aware, and responsible acoustic intelligence. These elements collectively point to a research pathway towards embodied active acoustic intelligence, empowering AI systems to construct an internal "intuitive physics" engine through sound.
Related papers
- PhysGaia: A Physics-Aware Dataset of Multi-Body Interactions for Dynamic Novel View Synthesis [62.283499219361595]
PhysGaia is a physics-aware dataset specifically designed for Dynamic Novel View Synthesis (DyNVS). Our dataset provides complex dynamic scenarios with rich interactions among multiple objects. PhysGaia will significantly advance research in dynamic view synthesis, physics-based scene understanding, and deep learning models integrated with physical simulation.
arXiv Detail & Related papers (2025-06-03T12:19:18Z)
- Generative Physical AI in Vision: A Survey [78.07014292304373]
Generative Artificial Intelligence (AI) has rapidly advanced the field of computer vision by enabling machines to create and interpret visual data with unprecedented sophistication. This transformation builds upon a foundation of generative models to produce realistic images, videos, and 3D/4D content. As generative models evolve to increasingly integrate physical realism and dynamic simulation, their potential to function as "world simulators" expands.
arXiv Detail & Related papers (2025-01-19T03:19:47Z)
- The Sound of Water: Inferring Physical Properties from Pouring Liquids [85.30865788636386]
We study the connection between audio-visual observations and the underlying physics of pouring liquids. Our objective is to automatically infer physical properties such as the liquid level, the shape and size of the container, the pouring rate and the time to fill.
arXiv Detail & Related papers (2024-11-18T01:19:37Z)
- Physics and Deep Learning in Computational Wave Imaging [24.99422165859396]
Computational wave imaging (CWI) extracts hidden structure and physical properties of a volume of material.
Current approaches for solving CWI problems can be divided into two categories: those rooted in traditional physics, and those based on deep learning.
Machine learning-based computational methods have emerged, offering a different perspective to address these challenges.
arXiv Detail & Related papers (2024-10-10T19:32:17Z)
- Point Neuron Learning: A New Physics-Informed Neural Network Architecture [8.545030794905584]
This paper proposes a new physics-informed neural network architecture.
It embeds the fundamental solution of the wave equation into the network architecture, enabling the learned model to strictly satisfy the wave equation.
Compared to other PINN methods, our approach directly processes complex numbers and offers better interpretability and generalizability.
arXiv Detail & Related papers (2024-08-30T02:07:13Z)
- ContPhy: Continuum Physical Concept Learning and Reasoning from Videos [86.63174804149216]
ContPhy is a novel benchmark for assessing machine physical commonsense.
We evaluated a range of AI models and found that they still struggle to achieve satisfactory performance on ContPhy.
We also introduce an oracle model (ContPRO) that marries the particle-based physical dynamic models with the recent large language models.
arXiv Detail & Related papers (2024-02-09T01:09:21Z)
- Taming Waves: A Physically-Interpretable Machine Learning Framework for Realizable Control of Wave Dynamics [3.4530027457862]
We introduce an environment designed for the study of the control of acoustic waves by actuated metamaterial designs.
We utilize this environment for the development of a novel machine-learning method, based on deep neural networks.
Our model is fully interpretable and maps physical constraints and intrinsic properties of the real acoustic environment into its latent representation of information.
arXiv Detail & Related papers (2023-11-27T03:34:28Z)
- PACS: A Dataset for Physical Audiovisual CommonSense Reasoning [119.0100966278682]
This paper contributes PACS: the first audiovisual benchmark annotated for physical commonsense attributes.
PACS contains a total of 13,400 question-answer pairs, involving 1,377 unique physical commonsense questions and 1,526 videos.
Using PACS, we evaluate multiple state-of-the-art models on this new challenging task.
arXiv Detail & Related papers (2022-03-21T17:05:23Z)
- Constructing Neural Network-Based Models for Simulating Dynamical Systems [59.0861954179401]
Data-driven modeling is an alternative paradigm that seeks to learn an approximation of the dynamics of a system using observations of the true system.
This paper provides a survey of the different ways to construct models of dynamical systems using neural networks.
In addition to the basic overview, we review the related literature and outline the most significant challenges from numerical simulations that this modeling paradigm must overcome.
arXiv Detail & Related papers (2021-11-02T10:51:42Z)
- Neural Networks with Physics-Informed Architectures and Constraints for Dynamical Systems Modeling [19.399031618628864]
We develop a framework to learn dynamics models from trajectory data.
We place constraints on the values of the outputs and the internal states of the model.
We experimentally demonstrate the benefits of the proposed approach on a variety of dynamical systems.
arXiv Detail & Related papers (2021-09-14T02:47:51Z)
- Physics-Coupled Spatio-Temporal Active Learning for Dynamical Systems [15.923190628643681]
One of the major challenges is to infer the underlying causes, which generate the perceived data stream.
Success of machine learning based predictive models requires massive annotated data for model training.
Our experiments on both synthetic and real-world datasets exhibit that the proposed ST-PCNN with active learning converges to optimal accuracy with substantially fewer instances.
arXiv Detail & Related papers (2021-08-11T18:05:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.