Extending DNN-based Multiplicative Masking to Deep Subband Filtering for
Improved Dereverberation
- URL: http://arxiv.org/abs/2303.00529v3
- Date: Wed, 31 May 2023 08:45:49 GMT
- Title: Extending DNN-based Multiplicative Masking to Deep Subband Filtering for
Improved Dereverberation
- Authors: Jean-Marie Lemercier, Julian Tobergte, Timo Gerkmann
- Abstract summary: We present a scheme for extending deep neural network-based multiplicative maskers to deep subband filters for speech restoration in the time-frequency domain.
The resulting method can be generically applied to any deep neural network providing masks in the time-frequency domain.
- Score: 15.16865739526702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a scheme for extending deep neural network-based
multiplicative maskers to deep subband filters for speech restoration in the
time-frequency domain. The resulting method can be generically applied to any
deep neural network providing masks in the time-frequency domain, while
requiring only a few more trainable parameters and a computational overhead that
is negligible for state-of-the-art neural networks. We demonstrate that the
resulting deep subband filtering scheme outperforms multiplicative masking for
dereverberation, while leaving the denoising performance virtually the same. We
argue that this is because deep subband filtering in the time-frequency domain
fits the subband approximation often assumed in the dereverberation literature,
whereas multiplicative masking corresponds to the narrowband approximation
generally employed for denoising.
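The contrast between the two approximations can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the causal filter form, and the tensor layout (`H[n, t, f]` weighting the frame `Y[t - n, f]`) are assumptions made for illustration, since the abstract does not specify them. In the paper's setting the mask `M` or the filter taps `H` would be produced by a neural network; here they are plain arrays.

```python
import numpy as np

def multiplicative_mask(Y, M):
    """Narrowband approximation: one complex gain per time-frequency bin.

    Y, M: (T, F) complex arrays (degraded STFT and mask).
    """
    return M * Y

def deep_subband_filter(Y, H):
    """Subband approximation: a short filter along time in each frequency band.

    Y: (T, F) complex STFT of the degraded signal.
    H: (N, T, F) complex filter taps (assumed layout); H[n, t, f] weights Y[t - n, f].
    """
    N, T, F = H.shape
    S = np.zeros((T, F), dtype=complex)
    for n in range(N):
        # Delay Y by n frames, zero-padding the first n frames (causal assumption).
        Y_delayed = np.zeros_like(Y)
        Y_delayed[n:] = Y[:max(T - n, 0)]
        S += H[n] * Y_delayed
    return S
```

With a single tap (`N = 1`) the filter reduces to multiplicative masking, which is consistent with the abstract's description of subband filtering as an extension of masking. Additional taps let each output bin draw on earlier frames of the same band, which is what a reverberant tail spread across frames would call for.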
Related papers
- BANF: Band-limited Neural Fields for Levels of Detail Reconstruction [28.95113960996025]
We show that via a simple modification, one can obtain neural fields that are low-pass filtered, and in turn show how this can be exploited to obtain a frequency decomposition of the entire signal.
We demonstrate the validity of our technique by investigating level-of-detail reconstruction, and showing how coarser representations can be computed effectively.
arXiv Detail & Related papers (2024-04-19T17:39:50Z)
- Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z)
- Robust Real-World Image Super-Resolution against Adversarial Attacks [115.04009271192211]
Adversarial image samples with quasi-imperceptible noise could threaten deep learning SR models.
We propose a robust deep learning framework for real-world SR that randomly erases potential adversarial noises.
Our proposed method is less sensitive to adversarial attacks and presents more stable SR results than existing models and defenses.
arXiv Detail & Related papers (2022-07-31T13:26:33Z)
- A neural network-supported two-stage algorithm for lightweight
dereverberation on hearing devices [13.49645012479288]
A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper.
The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-filter.
Both components rely on power spectral density (PSD) estimates provided by deep neural networks (DNNs).
arXiv Detail & Related papers (2022-04-06T11:08:28Z)
- DeepFilterNet: A Low Complexity Speech Enhancement Framework for
Full-Band Audio based on Deep Filtering [9.200520879361916]
We propose DeepFilterNet, a two-stage speech enhancement framework utilizing deep filtering.
First, we enhance the spectral envelope using ERB-scaled gains that model human frequency perception.
The second stage employs deep filtering to enhance the periodic components of speech.
arXiv Detail & Related papers (2021-10-11T20:03:52Z)
- Unsharp Mask Guided Filtering [53.14430987860308]
The goal of this paper is guided image filtering, which emphasizes the importance of structure transfer during filtering.
We propose a new and simplified formulation of the guided filter inspired by unsharp masking.
Our formulation enjoys a filtering prior from a low-pass filter and enables explicit structure transfer by estimating a single coefficient.
arXiv Detail & Related papers (2021-06-02T19:15:34Z)
- Deep Unfolded Recovery of Sub-Nyquist Sampled Ultrasound Image [94.42139459221784]
We propose a method for reconstruction from sub-Nyquist samples in the time and spatial domains that is based on unfolding the ISTA algorithm.
Our method allows reducing the number of array elements, sampling rate, and computational time while ensuring high quality imaging performance.
arXiv Detail & Related papers (2021-03-01T19:19:38Z)
- On Filter Generalization for Music Bandwidth Extension Using Deep Neural
Networks [0.40611352512781856]
We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network.
Our main contribution centers on the impact of the choice of low pass filter when training and subsequently testing the network.
We propose a data augmentation strategy which utilizes multiple low pass filters during training and leads to improved generalization to unseen filtering conditions at test time.
arXiv Detail & Related papers (2020-11-14T11:41:28Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Sparse Mixture of Local Experts for Efficient Speech Enhancement [19.645016575334786]
We investigate a deep learning approach for speech denoising through an efficient ensemble of specialist neural networks.
By splitting up the speech denoising task into non-overlapping subproblems, we are able to improve denoising performance while also reducing computational complexity.
Our findings demonstrate that a fine-tuned ensemble network is able to exceed the speech denoising capabilities of a generalist network.
arXiv Detail & Related papers (2020-05-16T23:23:22Z)
- ADRN: Attention-based Deep Residual Network for Hyperspectral Image
Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)