Sound propagation in realistic interactive 3D scenes with parameterized
sources using deep neural operators
- URL: http://arxiv.org/abs/2308.05141v2
- Date: Sat, 13 Jan 2024 11:40:54 GMT
- Title: Sound propagation in realistic interactive 3D scenes with parameterized
sources using deep neural operators
- Authors: Nikolas Borrel-Jensen, Somdatta Goswami, Allan P. Engsig-Karup, George
Em Karniadakis, Cheol-Ho Jeong
- Abstract summary: We address the challenge of sound propagation simulations in 3D virtual rooms with moving sources.
We propose using deep operator networks to approximate linear wave-equation operators.
This enables the rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources.
- Score: 1.6874375111244329
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We address the challenge of sound propagation simulations in 3D virtual rooms
with moving sources, which have applications in virtual/augmented reality, game
audio, and spatial computing. Solutions to the wave equation can describe wave
phenomena such as diffraction and interference. However, simulating them using
conventional numerical discretization methods with hundreds of source and
receiver positions is intractable, making the simulation of a sound field with moving
sources impractical. To overcome this limitation, we propose using deep
operator networks to approximate linear wave-equation operators. This enables
the rapid prediction of sound propagation in realistic 3D acoustic scenes with
moving sources, achieving millisecond-scale computations. By learning a compact
surrogate model, we avoid the offline calculation and storage of impulse
responses for all relevant source/listener pairs. Our experiments, including
various complex scene geometries, show good agreement with reference solutions,
with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our
method signifies a paradigm shift as no prior machine learning approach has
achieved precise predictions of complete wave fields within realistic domains.
We anticipate that our findings will drive further exploration of deep neural
operator methods, advancing research in immersive user experiences within
virtual environments.
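To make the operator-network idea in the abstract concrete, here is a minimal NumPy sketch of a DeepONet-style forward pass, together with the root mean squared error used as the accuracy metric. It is an illustration under stated assumptions, not the authors' implementation: the source is assumed to be described by its 3D position, the layer sizes are arbitrary, the weights are random and untrained, and the names `init_mlp`, `mlp`, `deeponet`, and `rmse` are hypothetical.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights and zero biases for a small fully connected network."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def deeponet(branch, trunk, source_pos, query_pts):
    """Pressure estimate at each query point (x, y, z, t) for one source position."""
    coeffs = mlp(branch, source_pos)   # branch net encodes the source, shape (p,)
    basis = mlp(trunk, query_pts)      # trunk net encodes the query points, shape (n, p)
    return basis @ coeffs              # inner product of the two, shape (n,)

def rmse(pred, ref):
    """Root mean squared error between predicted and reference pressures."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

rng = np.random.default_rng(0)
p = 16                                        # assumed latent width
branch = init_mlp(rng, [3, 32, p])            # input: source position (x, y, z)
trunk = init_mlp(rng, [4, 32, p])             # input: receiver position and time
source = np.array([1.0, 2.0, 1.5])            # hypothetical source position in metres
queries = rng.uniform(0.0, 1.0, size=(5, 4))  # five hypothetical (x, y, z, t) points
pressure = deeponet(branch, trunk, source, queries)
```

Once such a network is trained, moving the source only requires re-evaluating the branch network with a new position, which is what makes per-frame prediction cheap compared with re-running a numerical solver.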
Related papers
- Treble10: A high-quality dataset for far-field speech recognition, dereverberation, and enhancement [2.6008293644386904]
We introduce Treble10, a large-scale, physically accurate room-acoustic dataset.
Treble10 contains over 3000 broadband room impulse responses (RIRs) simulated in 10 fully furnished real-world rooms.
All signals are simulated at 32 kHz, accurately modelling low-frequency wave effects and high-frequency reflections.
arXiv Detail & Related papers (2025-10-27T09:17:44Z)
- Resounding Acoustic Fields with Reciprocity [13.126858950459557]
We introduce Versa, a physics-inspired approach to facilitating acoustic field learning.
Our method creates physically valid samples with dense virtual emitter positions by exchanging emitter and listener poses.
Results show Versa substantially improves the performance of acoustic field learning on both simulated and real-world datasets.
arXiv Detail & Related papers (2025-10-23T14:30:09Z)
- Ivan-ISTD: Rethinking Cross-domain Heteroscedastic Noise Perturbations in Infrared Small Target Detection [53.689841037081834]
Ivan-ISTD is designed to address the dual challenges of cross-domain shift and heteroscedastic noise perturbations in ISTD.
Ivan-ISTD demonstrates excellent robustness in cross-domain scenarios.
arXiv Detail & Related papers (2025-10-14T07:48:31Z)
- A Physics-Guided Probabilistic Surrogate Modeling Framework for Digital Twins of Underwater Radiated Noise [0.0]
Ship traffic is an increasing source of underwater radiated noise in coastal waters.
We present a physics-guided probabilistic framework to predict three-dimensional transmission loss in realistic ocean environments.
arXiv Detail & Related papers (2025-09-30T03:38:51Z)
- Convergence of physics-informed neural networks modeling time-harmonic wave fields [0.0]
We study 3D room acoustic cases at low frequency, varying the source definition and the number of boundary condition sets.
We assess the convergence behavior by looking at the loss landscape of the PINN architecture.
The developments are part of an initiative aiming to model the low-frequency behavior of room acoustics, including absorbers.
arXiv Detail & Related papers (2025-05-18T19:12:14Z)
- EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling [69.96729022219117]
When sound waves hit an object, they induce vibrations that produce high-frequency and subtle visual changes.
Recent advances in event camera hardware show good potential for its application in visual sound recovery.
We propose a novel pipeline for non-contact sound recovery, fully utilizing spatial-temporal information from the event stream.
arXiv Detail & Related papers (2025-04-03T08:51:17Z)
- Traveling Waves Integrate Spatial Information Through Time [3.3496112914071166]
We introduce convolutional recurrent neural networks that learn to produce traveling waves in their hidden states in response to visual stimuli.
We observe that traveling waves effectively expand the receptive field of locally connected neurons, supporting long-range encoding and communication of information.
As a first step toward traveling-wave-based communication and visual representation in artificial networks, our findings suggest wave-dynamics may provide efficiency and training stability benefits.
arXiv Detail & Related papers (2025-02-09T21:14:27Z)
- Sim2Real Transfer for Audio-Visual Navigation with Frequency-Adaptive Acoustic Field Prediction [51.71299452862839]
We propose the first treatment of sim2real for audio-visual navigation by disentangling it into acoustic field prediction (AFP) and waypoint navigation.
We then collect real-world data to measure the spectral difference between the simulation and the real world by training AFP models that only take a specific frequency subband as input.
Lastly, we build a real robot platform and show that the transferred policy can successfully navigate to sounding objects.
arXiv Detail & Related papers (2024-05-05T06:01:31Z)
- Differentiable Radio Frequency Ray Tracing for Millimeter-Wave Sensing [29.352303349003165]
We propose DiffSBR, a differentiable framework for mmWave-based 3D reconstruction.
DiffSBR incorporates a differentiable ray tracing engine to simulate radar point clouds from virtual 3D models.
Experiments using various radar hardware validate DiffSBR's capability for fine-grained 3D reconstruction.
arXiv Detail & Related papers (2023-11-22T06:13:39Z)
- Listen2Scene: Interactive material-aware binaural sound propagation for
reconstructed 3D scenes [69.03289331433874]
We present an end-to-end audio rendering approach (Listen2Scene) for virtual reality (VR) and augmented reality (AR) applications.
We propose a novel neural-network-based sound propagation method to generate acoustic effects for 3D models of real environments.
arXiv Detail & Related papers (2023-02-02T04:09:23Z)
- Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z)
- Wavelet neural operator: a neural operator for parametric partial
differential equations [0.0]
We introduce a novel operator learning algorithm referred to as the Wavelet Neural Operator (WNO).
WNO harnesses the superiority of the wavelets in time-frequency localization of the functions and enables accurate tracking of patterns in spatial domain.
The proposed approach is used to build a digital twin capable of predicting Earth's air temperature based on available historical data.
arXiv Detail & Related papers (2022-05-04T17:13:59Z)
- Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion [89.01668641930206]
We present a framework for modeling interactional communication in dyadic conversations.
We autoregressively output multiple possibilities of corresponding listener motion.
Our method organically captures the multimodal and non-deterministic nature of nonverbal dyadic interactions.
arXiv Detail & Related papers (2022-04-18T17:58:04Z)
- Deep Impulse Responses: Estimating and Parameterizing Filters with Deep
Networks [76.830358429947]
Impulse response estimation in high noise and in-the-wild settings is a challenging problem.
We propose a novel framework for parameterizing and estimating impulse responses based on recent advances in neural representation learning.
arXiv Detail & Related papers (2022-02-07T18:57:23Z)
- Seismic wave propagation and inversion with Neural Operators [7.296366040398878]
We develop a prototype framework for learning general solutions using a recently developed machine learning paradigm called Neural Operator.
A trained Neural Operator can compute a solution in negligible time for any velocity structure or source location.
We illustrate the method with the 2D acoustic wave equation and demonstrate the method's applicability to seismic tomography.
arXiv Detail & Related papers (2021-08-11T19:17:39Z)
- Feeling of Presence Maximization: mmWave-Enabled Virtual Reality Meets
Deep Reinforcement Learning [76.46530937296066]
This paper investigates the problem of providing ultra-reliable and energy-efficient virtual reality (VR) experiences for wireless mobile users.
To ensure reliable ultra-high-definition (UHD) video frame delivery to mobile users, a coordinated multipoint (CoMP) transmission technique and millimeter wave (mmWave) communications are exploited.
arXiv Detail & Related papers (2021-06-03T08:35:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.