Generative adversarial networks with physical sound field priors
- URL: http://arxiv.org/abs/2308.00426v1
- Date: Tue, 1 Aug 2023 10:11:23 GMT
- Title: Generative adversarial networks with physical sound field priors
- Authors: Xenofon Karakonstantis and Efren Fernandez-Grande
- Abstract summary: This paper presents a deep learning-based approach for the spatio-temporal reconstruction of sound fields using Generative Adversarial Networks (GANs).
The proposed method uses a plane wave basis and learns the underlying statistical distributions of pressure in rooms to reconstruct sound fields from a limited number of measurements.
The results suggest that this is a promising route to sound field reconstruction using generative models that allow for a physically informed prior in acoustics problems.
- Score: 6.256923690998173
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a deep learning-based approach for the spatio-temporal
reconstruction of sound fields using Generative Adversarial Networks (GANs).
The method utilises a plane wave basis and learns the underlying statistical
distributions of pressure in rooms to accurately reconstruct sound fields from
a limited number of measurements. The performance of the method is evaluated
using two established datasets and compared to state-of-the-art methods. The
results show that the model is able to achieve an improved reconstruction
performance in terms of accuracy and energy retention, particularly in the
high-frequency range and when extrapolating beyond the measurement region.
Furthermore, the proposed method can handle a varying number of measurement
positions and configurations without sacrificing performance. The results
suggest that this approach provides a promising approach to sound field
reconstruction using generative models that allow for a physically informed
prior to acoustics problems.
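The plane wave basis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' GAN: it shows only the classical regularised least-squares fit of plane wave coefficients to a few pressure measurements, at a single frequency and in 2D. The function names, the 500 Hz frequency, the 64 wave directions, the 20 microphone positions, and the regularisation value are all assumptions chosen for illustration.

```python
import numpy as np


def plane_wave_matrix(positions, directions, k):
    """Sensing matrix H[m, l] = exp(-1j * k * directions[l] . positions[m])."""
    return np.exp(-1j * k * positions @ directions.T)


def fit_plane_waves(H, p, lam=1e-6):
    """Tikhonov-regularised coefficients a = H^H (H H^H + lam I)^-1 p."""
    n_mics = H.shape[0]
    gram = H @ H.conj().T + lam * np.eye(n_mics)
    return H.conj().T @ np.linalg.solve(gram, p)


rng = np.random.default_rng(0)

# Assumed setup: speed of sound 343 m/s, a single frequency of 500 Hz.
c, f = 343.0, 500.0
k = 2 * np.pi * f / c  # wavenumber in rad/m

# 64 plane wave directions spread uniformly around the unit circle.
angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)

# 20 hypothetical microphone positions inside a 1 m x 1 m region.
mics = rng.uniform(0.0, 1.0, size=(20, 2))

# Synthetic "measured" pressure: a single unit-amplitude plane wave.
a_true = np.zeros(64, dtype=complex)
a_true[5] = 1.0
H = plane_wave_matrix(mics, directions, k)
p = H @ a_true

# Fit the coefficients, then evaluate the field at unmeasured positions.
a_hat = fit_plane_waves(H, p)
grid = rng.uniform(0.0, 1.0, size=(100, 2))
p_grid = plane_wave_matrix(grid, directions, k) @ a_hat

# Relative misfit at the microphone positions.
rel_err = np.linalg.norm(H @ a_hat - p) / np.linalg.norm(p)
```

With fewer microphones than basis waves, the fit reproduces the measurements closely but gives no guarantee away from them. That gap is where a learned prior over the coefficients, as the abstract describes, would come in.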
Related papers
- Maximum Likelihood Estimation of the Direction of Sound In A Reverberant Noisy Environment [0.8702432681310399]
We describe a new method for estimating the direction of sound in a reverberant environment from basic principles of sound propagation.
The method utilizes SNR-adaptive features from time-delay and energy of the directional components after acoustic wave decomposition.
arXiv Detail & Related papers (2024-06-24T19:42:22Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Reconstruction of Sound Field through Diffusion Models [15.192190218332843]
Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR).
We propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range.
We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained in order to reconstruct the sound field (SF-Diff) over an extended domain.
arXiv Detail & Related papers (2023-12-14T11:11:26Z)
- Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields [61.07542274267568]
This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene.
Driven by the unique properties of RIR, we design a temporal correlation module and multi-scale energy decay criterion.
Experimental results show that NACF outperforms existing field-based methods by a notable margin.
arXiv Detail & Related papers (2023-09-27T19:50:50Z)
- Adaptive Fake Audio Detection with Low-Rank Model Squeezing [50.7916414913962]
Traditional approaches, such as finetuning, are computationally intensive and pose a risk of impairing the acquired knowledge of known fake audio types.
We introduce the concept of training low-rank adaptation matrices tailored specifically to the newly emerging fake audio types.
Our approach offers several advantages, including reduced storage memory requirements and lower equal error rates.
arXiv Detail & Related papers (2023-06-08T06:06:42Z)
- Bayesian inference and neural estimation of acoustic wave propagation [10.980762871305279]
We introduce a novel framework which combines physics and machine learning methods to analyse acoustic signals.
Three methods are developed for this task: a Bayesian inference approach for inferring the spectral acoustics characteristics, a neural-physical model which equips a neural network with forward and backward physical losses, and the non-linear least squares approach which serves as benchmark.
The simplicity and efficiency of this framework is empirically validated on simulated data.
arXiv Detail & Related papers (2023-05-28T15:14:46Z)
- Blind Acoustic Room Parameter Estimation Using Phase Features [4.473249957074495]
We propose utilizing novel phase-related features to extend recent approaches to blindly estimate the so-called "reverberation fingerprint" parameters.
The addition of these features is shown to outperform existing methods that rely solely on magnitude-based spectral features.
arXiv Detail & Related papers (2023-03-13T20:05:41Z)
- Denoising diffusion models for out-of-distribution detection [2.113925122479677]
We exploit the view of denoising probabilistic diffusion models (DDPM) as denoising autoencoders.
We use DDPMs to reconstruct an input that has been noised to a range of noise levels, and use the resulting multi-dimensional reconstruction error to classify out-of-distribution inputs.
arXiv Detail & Related papers (2022-11-14T20:35:11Z)
- Ultrasound Signal Processing: From Models to Deep Learning [64.56774869055826]
Medical ultrasound imaging relies heavily on high-quality signal processing to provide reliable and interpretable image reconstructions.
Deep learning based methods, which are optimized in a data-driven fashion, have gained popularity.
A relatively new paradigm combines the power of the two: leveraging data-driven deep learning, as well as exploiting domain knowledge.
arXiv Detail & Related papers (2022-04-09T13:04:36Z)
- Time-domain Speech Enhancement with Generative Adversarial Learning [53.74228907273269]
This paper proposes a new framework called Time-domain Speech Enhancement Generative Adversarial Network (TSEGAN).
TSEGAN is an extension of the generative adversarial network (GAN) in time-domain with metric evaluation to mitigate the scaling problem.
In addition, we provide a new method based on objective function mapping for the theoretical analysis of the performance of Metric GAN.
arXiv Detail & Related papers (2021-03-30T08:09:49Z)
- Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.