Related papers: Reconstruction of Sound Field through Diffusion Models

URL: http://arxiv.org/abs/2312.08821v2
Date: Wed, 21 Feb 2024 16:15:40 GMT
Title: Reconstruction of Sound Field through Diffusion Models
Authors: Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci and Augusto Sarti
Abstract summary: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). We propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained in order to reconstruct the sound field (SF-Diff) over an extended domain.
Score: 15.192190218332843
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reconstructing the sound field in a room is an important task for several applications, such as sound control and augmented (AR) or virtual reality (VR). In this paper, we propose a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms with a focus on the modal frequency range. We introduce, for the first time, the use of a conditional Denoising Diffusion Probabilistic Model (DDPM) trained in order to reconstruct the sound field (SF-Diff) over an extended domain. The architecture is devised in order to be conditioned on a set of limited available measurements at different frequencies and generate the sound field in target, unknown, locations. The results show that SF-Diff is able to provide accurate reconstructions, outperforming a state-of-the-art baseline based on kernel interpolation.

Related papers

HARP: A Large-Scale Higher-Order Ambisonic Room Impulse Response Dataset [0.6568378556428859]
This contribution introduces a dataset of 7th-order Ambisonic Room Impulse Responses (HOA-RIRs) created using the Image Source Method. By employing higher-order Ambisonics, our dataset enables precise spatial audio reproduction. The presented 64-microphone configuration allows us to capture RIRs directly in the Spherical Harmonics domain.
arXiv Detail & Related papers (2024-11-21T15:16:48Z)
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [65.79402756995084]
Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities. RAF is the first dataset to provide densely captured room acoustic data.
arXiv Detail & Related papers (2024-03-27T17:59:56Z)
Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields [61.07542274267568]
This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene. Driven by the unique properties of RIR, we design a temporal correlation module and multi-scale energy decay criterion. Experimental results show that NACF outperforms existing field-based methods by a notable margin.
arXiv Detail & Related papers (2023-09-27T19:50:50Z)
DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective. Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process. During training, our model learns to reverse the noising process by converting noisy latent queries to their ground-truth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
Generative adversarial networks with physical sound field priors [6.256923690998173]
This paper presents a deep learning-based approach for spatio-temporal reconstruction of sound fields using Generative Adversarial Networks (GANs). The proposed method uses a plane wave basis and the underlying statistical distributions of pressure in rooms to reconstruct sound fields from a limited number of measurements. The results suggest that this is a promising approach to sound field reconstruction using generative models that allow for a physically informed prior for acoustics problems.
arXiv Detail & Related papers (2023-08-01T10:11:23Z)
Realistic Noise Synthesis with Diffusion Models [68.48859665320828]
Deep image denoising models often rely on large amounts of training data for high-quality performance. We propose a novel method that synthesizes realistic noise using diffusion models, namely Realistic Noise Synthesize Diffusor (RNSD). RNSD can incorporate guided multiscale content, such that more realistic noise with spatial correlations can be generated at multiple frequencies.
arXiv Detail & Related papers (2023-05-23T12:56:01Z)
DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection [89.49600182243306]
We reformulate the reconstruction process using a diffusion model into a noise-to-norm paradigm. We propose a rapid one-step denoising paradigm, significantly faster than the traditional iterative denoising in diffusion models. The segmentation sub-network predicts pixel-level anomaly scores using the input image and its anomaly-free restoration.
arXiv Detail & Related papers (2023-03-15T16:14:06Z)
Three-Way Deep Neural Network for Radio Frequency Map Generation and Source Localization [67.93423427193055]
Monitoring wireless spectrum over spatial, temporal, and frequency domains will become a critical feature in beyond-5G and 6G communication technologies. In this paper, we present a Generative Adversarial Network (GAN) machine learning model to interpolate irregularly distributed measurements across the spatial domain.
arXiv Detail & Related papers (2021-11-23T22:25:10Z)
Mean absorption estimation from room impulse responses using virtually supervised learning [0.0]
This paper introduces and investigates a new approach to estimate mean absorption coefficients solely from a room impulse response (RIR). This inverse problem is tackled via virtually-supervised learning, namely, the RIR-to-absorption mapping is implicitly learned by regression on a simulated dataset using artificial neural networks.
arXiv Detail & Related papers (2021-09-01T14:06:20Z)
Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset [0.0]
This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms. The paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones.
arXiv Detail & Related papers (2021-02-12T11:34:18Z)
Real Time Speech Enhancement in the Waveform Domain [99.02180506016721]
We present a causal speech enhancement model working on the raw waveform that runs in real-time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip-connections. It is capable of removing various kinds of background noise including stationary and non-stationary noises.
arXiv Detail & Related papers (2020-06-23T09:19:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.