Deep Impulse Responses: Estimating and Parameterizing Filters with Deep
Networks
- URL: http://arxiv.org/abs/2202.03416v1
- Date: Mon, 7 Feb 2022 18:57:23 GMT
- Title: Deep Impulse Responses: Estimating and Parameterizing Filters with Deep
Networks
- Authors: Alexander Richard, Peter Dodds, Vamsi Krishna Ithapu
- Abstract summary: Impulse response estimation in high noise and in-the-wild settings is a challenging problem.
We propose a novel framework for parameterizing and estimating impulse responses based on recent advances in neural representation learning.
- Score: 76.830358429947
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Impulse response estimation in high noise and in-the-wild settings, with
minimal control of the underlying data distributions, is a challenging problem.
We propose a novel framework for parameterizing and estimating impulse
responses based on recent advances in neural representation learning. Our
framework is driven by a carefully designed neural network that jointly
estimates the impulse response and the (a priori unknown) spectral noise
characteristics of an observed signal given the source signal. We demonstrate
robustness in estimation, even under low signal-to-noise ratios, and show
strong results when learning from spatio-temporal real-world speech data. Our
framework provides a natural way to interpolate impulse responses on a spatial
grid, while also allowing for efficiently compressing and storing them for
real-time rendering applications in augmented and virtual reality.
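For context, the estimation problem in the abstract (recovering an impulse response from a known source signal and a noisy observation) has a classical non-neural baseline: regularized frequency-domain deconvolution. The sketch below illustrates that baseline only. It is not the method proposed in the paper, and the function name, the `reg` parameter, and the synthetic signals are assumptions made for illustration.

```python
import numpy as np


def estimate_impulse_response(source, observed, ir_length, reg=1e-3):
    """Estimate an impulse response by regularized spectral division.

    Hypothetical helper, not taken from the paper. `reg` is a relative
    regularization weight that damps frequency bins where the source
    carries little energy.
    """
    # Zero-pad to a power of two that covers the full linear convolution.
    n = len(source) + ir_length - 1
    n_fft = 1 << (n - 1).bit_length()

    S = np.fft.rfft(source, n_fft)
    Y = np.fft.rfft(observed, n_fft)

    # Tikhonov-style division: H = Y * conj(S) / (|S|^2 + eps).
    power = np.abs(S) ** 2
    eps = reg * power.mean()
    H = Y * np.conj(S) / (power + eps)

    # Back to the time domain, keeping only the first ir_length taps.
    return np.fft.irfft(H, n_fft)[:ir_length]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.standard_normal(4096)            # known source signal
    h_true = np.exp(-np.arange(32) / 8.0)         # synthetic decaying response
    observed = np.convolve(source, h_true)
    observed += 0.1 * rng.standard_normal(len(observed))  # additive noise

    h_est = estimate_impulse_response(source, observed, ir_length=32)
    rel_err = np.linalg.norm(h_est - h_true) / np.linalg.norm(h_true)
    print(f"relative error: {rel_err:.4f}")
```

A baseline like this treats every frequency bin with the same fixed regularization and knows nothing about the noise spectrum, which is the limitation the abstract's joint estimation of impulse response and spectral noise characteristics is meant to address.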
Related papers
- Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs [26.901819636977912]
We propose a method that integrates multi-plane representation with a coordinate-based network known for strong bias toward low-frequency signals.
We demonstrate that our proposed method outperforms baseline models for both static and dynamic NeRFs with sparse inputs.
arXiv Detail & Related papers (2024-05-13T15:42:46Z)
- Neural Acoustic Context Field: Rendering Realistic Room Impulse Response With Neural Fields [61.07542274267568]
This letter proposes a novel Neural Acoustic Context Field approach, called NACF, to parameterize an audio scene.
Driven by the unique properties of RIR, we design a temporal correlation module and multi-scale energy decay criterion.
Experimental results show that NACF outperforms existing field-based methods by a notable margin.
arXiv Detail & Related papers (2023-09-27T19:50:50Z)
- Bayesian inference and neural estimation of acoustic wave propagation [10.980762871305279]
We introduce a novel framework which combines physics and machine learning methods to analyse acoustic signals.
Three methods are developed for this task: a Bayesian inference approach for inferring the spectral acoustic characteristics, a neural-physical model which equips a neural network with forward and backward physical losses, and a non-linear least squares approach which serves as a benchmark.
The simplicity and efficiency of this framework are empirically validated on simulated data.
arXiv Detail & Related papers (2023-05-28T15:14:46Z)
- Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation [69.1351513309953]
We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation.
We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods.
arXiv Detail & Related papers (2022-12-10T20:15:23Z)
- Towards Improved Room Impulse Response Estimation for Speech Recognition [53.04440557465013]
We propose a novel approach for blind room impulse response (RIR) estimation systems in the context of far-field automatic speech recognition (ASR).
We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators.
We then propose a generative adversarial network (GAN) based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features.
arXiv Detail & Related papers (2022-11-08T00:40:27Z)
- Likelihood-Free Parameter Estimation with Neural Bayes Estimators [0.0]
Neural point estimators are neural networks that map data to parameter point estimates.
We aim to increase the awareness of statisticians to this relatively new inferential tool, and to facilitate its adoption by providing user-friendly open-source software.
arXiv Detail & Related papers (2022-08-27T06:58:16Z)
- Robust Learning of Recurrent Neural Networks in Presence of Exogenous Noise [22.690064709532873]
We propose a tractable robustness analysis for RNN models subject to input noise.
The robustness measure can be estimated efficiently using linearization techniques.
Our proposed methodology significantly improves robustness of recurrent neural networks.
arXiv Detail & Related papers (2021-05-03T16:45:05Z)
- Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from multi-channel data of the true array manifold matrix.
We train a CNN in the low-SNR regime to predict DoAs across all SNRs.
Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
- Inferring, Predicting, and Denoising Causal Wave Dynamics [3.9407250051441403]
The DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network.
We show that DISTANA is very well-suited to denoise data streams, given that re-occurring patterns are observed.
It produces stable and accurate closed-loop predictions even over hundreds of time steps.
arXiv Detail & Related papers (2020-09-19T08:33:53Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.