A neural network-supported two-stage algorithm for lightweight
dereverberation on hearing devices
- URL: http://arxiv.org/abs/2204.02978v2
- Date: Wed, 31 May 2023 15:34:46 GMT
- Title: A neural network-supported two-stage algorithm for lightweight
dereverberation on hearing devices
- Authors: Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning and Timo
Gerkmann
- Abstract summary: A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper.
The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter.
Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs).
- Score: 13.49645012479288
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A two-stage lightweight online dereverberation algorithm for hearing devices
is presented in this paper. The approach combines a multi-channel multi-frame
linear filter with a single-channel single-frame post-filter. Both components
rely on power spectral density (PSD) estimates provided by deep neural networks
(DNNs). By deriving new metrics analyzing the dereverberation performance in
various time ranges, we confirm that directly optimizing for a criterion at the
output of the multi-channel linear filtering stage results in a more efficient
dereverberation as compared to placing the criterion at the output of the DNN
to optimize the PSD estimation. More concretely, we show that training this
stage end-to-end helps further remove the reverberation in the range accessible
to the filter, thus increasing the early-to-moderate reverberation
ratio. We argue and demonstrate that it can then be well combined with a
post-filtering stage to efficiently suppress the residual late reverberation,
thereby increasing the early-to-final reverberation ratio. The
proposed two-stage procedure is shown to be very effective in terms of both
dereverberation performance and computational demands when compared with, e.g.,
recent state-of-the-art DNN approaches. Furthermore, the proposed two-stage
system can be adapted to the needs of different types of hearing-device users
by controlling the amount of reduction of early reflections.
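The two-stage structure described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration: the abstract does not name the linear filter or the post-filter, so a batch (offline) weighted linear-prediction filter and a Wiener-like gain are swapped in, the paper's online operation is not reproduced, and `placeholder_psd` is a hypothetical smoothed periodogram standing in for the DNN PSD estimators. Parameter names such as `taps`, `delay`, and `gain_floor` are likewise hypothetical.

```python
import numpy as np


def placeholder_psd(S, alpha=0.8):
    """Hypothetical stand-in for a DNN PSD estimator: a recursively
    smoothed periodogram of a single-channel STFT S (freqs, frames)."""
    P = np.abs(S) ** 2
    out = np.empty_like(P)
    out[:, 0] = P[:, 0]
    for t in range(1, P.shape[1]):
        out[:, t] = alpha * out[:, t - 1] + (1 - alpha) * P[:, t]
    return out


def linear_filter_stage(Y, psd, taps=4, delay=2, eps=1e-8):
    """Stage 1 (assumed form): multi-channel, multi-frame linear prediction
    weighted by the inverse target PSD, solved in batch per frequency.
    Y: complex STFT (channels, freqs, frames); psd: (freqs, frames).
    Returns the filtered reference channel, shape (freqs, frames)."""
    C, F, T = Y.shape
    out = np.empty((F, T), dtype=complex)
    for f in range(F):
        # Stack delayed past frames of all channels: (C * taps, T).
        X = np.zeros((C * taps, T), dtype=complex)
        for k in range(taps):
            shift = delay + k
            X[k * C:(k + 1) * C, shift:] = Y[:, f, :T - shift]
        w = 1.0 / np.maximum(psd[f], eps)       # per-frame weights
        R = (X * w) @ X.conj().T                # weighted correlation matrix
        p = (X * w) @ Y[0, f].conj()            # weighted cross-correlation
        g = np.linalg.solve(R + eps * np.eye(C * taps), p)
        # Subtract the reverberation predicted from past frames.
        out[f] = Y[0, f] - g.conj() @ X
    return out


def post_filter_stage(Z, target_psd, residual_psd, gain_floor=0.1, eps=1e-8):
    """Stage 2 (assumed form): single-channel, single-frame Wiener-like gain
    with a lower bound to limit distortion."""
    gain = target_psd / np.maximum(target_psd + residual_psd, eps)
    return np.maximum(gain, gain_floor) * Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 2-channel STFT; real use would start from microphone signals.
    Y = rng.standard_normal((2, 5, 60)) + 1j * rng.standard_normal((2, 5, 60))
    target = placeholder_psd(Y[0])
    Z = linear_filter_stage(Y, target)
    residual = 0.1 * placeholder_psd(Z)  # arbitrary placeholder residual PSD
    enhanced = post_filter_stage(Z, target, residual)
    print(enhanced.shape)
```

In this sketch the linear filter can only act on the reverberation covered by its `taps` past frames, which is why a separate post-filter is applied to what remains.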
Related papers
- Run-Time Adaptation of Neural Beamforming for Robust Speech Dereverberation and Denoising [15.152748065111194]
This paper describes speech enhancement for realtime automatic speech recognition in real environments.
It estimates the masks of clean dry speech from a noisy echoic mixture spectrogram with a deep neural network (DNN) and then computes an enhancement filter used for beamforming.
The performance of such a supervised approach, however, is drastically degraded under mismatched conditions.
arXiv Detail & Related papers (2024-10-30T08:32:47Z)
- Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z)
- Simple Pooling Front-ends For Efficient Audio Classification [56.59107110017436]
We show that eliminating the temporal redundancy in the input audio features could be an effective approach for efficient audio classification.
We propose a family of simple pooling front-ends (SimPFs) which use simple non-parametric pooling operations to reduce the redundant information.
SimPFs can achieve a reduction in more than half of the number of floating point operations for off-the-shelf audio neural networks.
arXiv Detail & Related papers (2022-10-03T14:00:41Z)
- NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo [97.07453889070574]
We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors.
We show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes.
arXiv Detail & Related papers (2021-09-02T17:54:31Z)
- Neural Calibration for Scalable Beamforming in FDD Massive MIMO with Implicit Channel Estimation [10.775558382613077]
Channel estimation and beamforming play critical roles in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems.
We propose a deep learning-based approach that directly optimizes the beamformers at the base station according to the received uplink pilots.
A neural calibration method is proposed to improve the scalability of the end-to-end design.
arXiv Detail & Related papers (2021-08-03T14:26:14Z)
- Speaker Diarization using Two-pass Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings [9.826793576487736]
This paper presents a new two-pass version of a system for speaker diarization using clustering and embeddings.
For the Callhome corpus, we achieve the first published error rate below 4% without any task-dependent parameter tuning.
We also show significant progress towards a robust single solution for multiple diarization tasks.
arXiv Detail & Related papers (2021-04-06T12:52:55Z)
- Exploiting Multiple Timescales in Hierarchical Echo State Networks [0.0]
Echo state networks (ESNs) are a powerful form of reservoir computing that only require training of linear output weights.
Here we explore the timescales in hierarchical ESNs, where the reservoir is partitioned into two smaller reservoirs linked with distinct properties.
arXiv Detail & Related papers (2021-01-11T22:33:17Z)
- Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from multi-channel data of the true array manifold matrix.
We train a CNN in the low-SNR regime to predict DoAs across all SNRs.
Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
- Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation [66.46123655365113]
Target speech separation refers to extracting the target speaker's speech from mixed signals.
Two main challenges are the complex acoustic environment and the real-time processing requirement.
We propose a temporal-spatial neural filter, which directly estimates the target speech waveform from multi-speaker mixture.
arXiv Detail & Related papers (2020-01-02T11:12:50Z)
- End-to-End Multi-Task Denoising for joint SDR and PESQ Optimization [43.15288441772729]
Denoising networks learn a mapping from noisy speech to clean speech directly.
Existing schemes have either of two critical issues: spectrum and metric mismatches.
This paper presents a new end-to-end denoising framework with the goal of joint SDR and PESQ optimization.
arXiv Detail & Related papers (2019-01-26T02:48:08Z)