MB-RIRs: a Synthetic Room Impulse Response Dataset with Frequency-Dependent Absorption Coefficients
- URL: http://arxiv.org/abs/2507.09750v1
- Date: Sun, 13 Jul 2025 19:00:26 GMT
- Title: MB-RIRs: a Synthetic Room Impulse Response Dataset with Frequency-Dependent Absorption Coefficients
- Authors: Enric Gusó, Joanna Luberadzka, Umut Sayin, Xavier Serra,
- Abstract summary: We implement three features on top of the traditional image source method-based (ISM) shoebox RIRs.<n>We train a DeepFilternet3 model for each RIR dataset and evaluate the performance on a test set of real RIRs.
- Score: 7.468132532398651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the effects of four strategies for improving the ecological validity of synthetic room impulse response (RIR) datasets for monoaural Speech Enhancement (SE). We implement three features on top of the traditional image source method-based (ISM) shoebox RIRs: multiband absorption coefficients, source directivity and receiver directivity. We additionally consider mesh-based RIRs from the SoundSpaces dataset. We then train a DeepFilternet3 model for each RIR dataset and evaluate the performance on a test set of real RIRs both objectively and subjectively. We find that RIRs which use frequency-dependent acoustic absorption coefficients (MB-RIRs) can obtain +0.51dB of SDR and a +8.9 MUSHRA score when evaluated on real RIRs. The MB-RIRs dataset is publicly available for free download.
Related papers
- DiffusionRIR: Room Impulse Response Interpolation using Diffusion Models [16.92449230293275]
High-quality RIR estimates drive applications such as virtual microphones, sound source localization, augmented reality, and data augmentation.<n>This research addresses the challenge of estimating RIRs at unmeasured locations within a room using Denoising Diffusion Probabilistic Models (DDPM)
arXiv Detail & Related papers (2025-04-29T10:52:07Z) - Data Augmentation Using Neural Acoustic Fields With Retrieval-Augmented Pre-training [34.14967280931229]
This report details MERL's system for room impulse response (RIR) estimation submitted to the Generative Data Augmentation Workshop at ICASSP 2025.<n>We first pre-train a neural acoustic field conditioned by room geometry on an external large-scale dataset in which pairs of RIRs and the geometries are provided.<n>The neural acoustic field is then adapted to each target room by using the enrollment data.<n>We predict the RIRs for each pair of source and receiver locations specified by Task 1, and use these RIRs to train the speaker distance estimation model in Task 2.
arXiv Detail & Related papers (2025-04-19T21:43:56Z) - AV-RIR: Audio-Visual Room Impulse Response Estimation [49.469389715876915]
Accurate estimation of Room Impulse Response (RIR) is important for speech processing and AR/VR applications.
We propose AV-RIR, a novel multi-modal multi-task learning approach to accurately estimate the RIR from a given reverberant speech signal and visual cues of its corresponding environment.
arXiv Detail & Related papers (2023-11-30T22:58:30Z) - Synthetic Wave-Geometric Impulse Responses for Improved Speech
Dereverberation [69.1351513309953]
We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation.
We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods.
arXiv Detail & Related papers (2022-12-10T20:15:23Z) - Towards Improved Room Impulse Response Estimation for Speech Recognition [53.04440557465013]
We propose a novel approach for blind room impulse response (RIR) estimation systems in the context of far-field automatic speech recognition (ASR)
We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators.
We then propose a generative adversarial network (GAN) based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features.
arXiv Detail & Related papers (2022-11-08T00:40:27Z) - Few-Shot Audio-Visual Learning of Environment Acoustics [89.16560042178523]
Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener.
We explore how to infer RIRs based on a sparse set of images and echoes observed in the space.
In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs.
arXiv Detail & Related papers (2022-06-08T16:38:24Z) - Deep Impulse Responses: Estimating and Parameterizing Filters with Deep
Networks [76.830358429947]
Impulse response estimation in high noise and in-the-wild settings is a challenging problem.
We propose a novel framework for parameterizing and estimating impulse responses based on recent advances in neural representation learning.
arXiv Detail & Related papers (2022-02-07T18:57:23Z) - FAST-RIR: Fast neural diffuse room impulse response generator [81.96114823691343]
We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Our FAST-RIR is capable of generating RIRs for a given input reverberation time with an average error of 0.02s.
We show that our proposed FAST-RIR with batch size 1 is 400 times faster than a state-of-the-art diffuse acoustic simulator (DAS) on a CPU.
arXiv Detail & Related papers (2021-10-07T05:21:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.