Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL
Sound Field Dataset
- URL: http://arxiv.org/abs/2102.06455v1
- Date: Fri, 12 Feb 2021 11:34:18 GMT
- Authors: Miklas Strøm Kristoffersen, Martin Bo Møller, Pablo Martínez-Nuevo,
Jan Østergaard
- Abstract summary: This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms.
The paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge of loudspeaker responses is useful in a number of
applications where a sound system is located inside a room that alters the
listening experience depending on position within the room. Acquisition of
sound fields for sound sources located in reverberant rooms can be achieved
through labor-intensive measurements of impulse response functions covering
the room, or alternatively by means of reconstruction methods which can
potentially require
significantly fewer measurements. This paper extends evaluations of sound field
reconstruction at low frequencies by introducing a dataset with measurements
from four real rooms. The ISOBEL Sound Field dataset is publicly available, and
aims to bridge the gap between synthetic and real-world sound fields in
rectangular rooms. Moreover, the paper advances on a recent deep learning-based
method for sound field reconstruction using a very low number of microphones,
and proposes an approach for modeling both magnitude and phase response in a
U-Net-like neural network architecture. The complex-valued sound field
reconstruction demonstrates that the estimated room transfer functions are of
high enough accuracy to allow for personalized sound zones with contrast ratios
comparable to ideal room transfer functions using 15 microphones below 150 Hz.
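The contrast-ratio evaluation mentioned above can be illustrated with a small sketch. The snippet below is a minimal, hypothetical example (not the paper's code) of classic acoustic contrast control: given complex-valued room transfer functions from a set of loudspeakers to microphones in a bright zone and a dark zone, it computes the loudspeaker weights that maximise the energy ratio between the zones and reports the resulting contrast in dB. Function names, array shapes, and the regularisation constant are illustrative assumptions.

```python
import numpy as np

def acoustic_contrast_db(G_bright, G_dark, reg=1e-6):
    """Acoustic contrast control at a single frequency.

    G_bright: (M_b, L) complex room transfer functions from L loudspeakers
              to M_b microphones in the bright zone.
    G_dark:   (M_d, L) transfer functions to the dark zone.
    Returns the achievable contrast in dB and the optimal source weights.
    """
    # Spatial correlation matrices of the two zones.
    R_b = G_bright.conj().T @ G_bright / G_bright.shape[0]
    R_d = G_dark.conj().T @ G_dark / G_dark.shape[0]
    # The principal eigenvector of R_d^{-1} R_b maximises the ratio of
    # bright- to dark-zone energy; 'reg' keeps the inversion well conditioned.
    evals, evecs = np.linalg.eig(
        np.linalg.solve(R_d + reg * np.eye(R_d.shape[0]), R_b))
    w = evecs[:, np.argmax(evals.real)]
    contrast = (w.conj() @ R_b @ w).real / (w.conj() @ R_d @ w).real
    return 10 * np.log10(contrast), w

# Toy example: 5 loudspeakers, 8 microphones per zone, random complex RTFs.
rng = np.random.default_rng(0)
G_b = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
G_d = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
contrast_db, w = acoustic_contrast_db(G_b, G_d)
```

In this framing, the paper's result says that transfer functions reconstructed from only 15 microphones are accurate enough, below 150 Hz, for the contrast obtained with the reconstructed `G_bright`/`G_dark` to approach the contrast obtained with the measured ones.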
Related papers
- Blind Spatial Impulse Response Generation from Separate Room- and Scene-Specific Information (arXiv, 2024-09-23)
  Knowledge of the user's real acoustic environment is crucial for rendering virtual sounds that blend seamlessly into that environment. The paper shows how both room- and position-specific parameters are considered in the final output.
- ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling (arXiv, 2024-04-24)
  An environment acoustic model represents how sound is transformed by the physical characteristics of an indoor environment. The paper proposes active acoustic sampling, a new task for efficiently building an environment acoustic model of an unmapped environment, and introduces ActiveRIR, a reinforcement-learning policy that leverages audio-visual sensor streams to guide agent navigation and determine optimal acoustic data sampling positions.
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark (arXiv, 2024-03-27)
  Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities, and the first to provide densely captured room acoustic data.
- Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones (arXiv, 2024-02-01)
  Complex-valued neural networks are employed to estimate room transfer functions in the frequency range of the first room resonances, the first time such networks have been used for this task.
- Reconstruction of Sound Field through Diffusion Models (arXiv, 2023-12-14)
  Reconstructing the sound field in a room is an important task for applications such as sound control and augmented reality (AR) or virtual reality (VR). The paper proposes a data-driven generative model for reconstructing the magnitude of acoustic fields in rooms, focusing on the modal frequency range, and introduces, for the first time, a conditional Denoising Diffusion Probabilistic Model (SF-Diff) trained to reconstruct the sound field over an extended domain.
- Neural Acoustic Context Field: Rendering Realistic Room Impulse Response with Neural Fields (arXiv, 2023-09-27)
  A novel Neural Acoustic Context Field approach, NACF, is proposed to parameterize an audio scene. Driven by the unique properties of room impulse responses, the authors design a temporal correlation module and a multi-scale energy decay criterion. Experimental results show that NACF outperforms existing field-based methods by a notable margin.
- End-to-End Binaural Speech Synthesis (arXiv, 2022-07-08)
  An end-to-end speech synthesis system is presented that combines a low-bitrate audio codec with a powerful decoder, demonstrating the capability of an adversarial loss to capture the environmental effects needed to create an authentic auditory scene.
- Joint speaker diarisation and tracking in switching state-space model (arXiv, 2021-09-23)
  The paper proposes to explicitly track the movements of speakers while jointly performing diarisation within a unified model. A state-space model is proposed in which the hidden state expresses the identity of the currently active speaker and the predicted locations of all speakers. Experiments on a Microsoft rich-meeting transcription task show that the joint location tracking and diarisation approach performs comparably with other methods that use location information.
- Blind Room Parameter Estimation Using Multiple-Multichannel Speech Recordings (arXiv, 2021-07-29)
  Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation, or audio forensics. The paper studies the problem of jointly estimating the total surface area, the volume, the frequency-dependent reverberation time, and the mean surface absorption of a room. A novel convolutional neural network architecture leveraging both single- and inter-channel cues is proposed and trained on a large, realistic simulated dataset.
- Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances (arXiv, 2020-02-14)
  Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions, but speaker verification on short utterances in uncontrolled, noisy environments remains one of the most challenging and highly demanded tasks. The paper presents approaches aimed at two goals: (a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and (b) reducing the quality degradation for short utterances.
- Sound field reconstruction in rooms: inpainting meets super-resolution (arXiv, 2020-01-30)
  A deep-learning method for sound field reconstruction is proposed, based on a U-net-like neural network with partial convolutions trained solely on simulated data. Experiments using simulated data are shown together with an experimental validation in a real listening room.
This list is automatically generated from the titles and abstracts of the papers in this site.