NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling
- URL: http://arxiv.org/abs/2206.09058v1
- Date: Sat, 18 Jun 2022 00:15:48 GMT
- Title: NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling
- Authors: Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min
Wang, Yu Tsao
- Abstract summary: We propose noise adaptive speech enhancement with target-conditional resampling (NASTAR).
NASTAR uses a feedback mechanism to simulate adaptive training data via a noise extractor and a retrieval model.
Experimental results show that NASTAR can effectively use one noisy speech sample to adapt an SE model to a target condition.
- Score: 34.565077865854484
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For deep learning-based speech enhancement (SE) systems, the training-test
acoustic mismatch can cause notable performance degradation. To address the
mismatch issue, numerous noise adaptation strategies have been derived. In this
paper, we propose a novel method, called noise adaptive speech enhancement with
target-conditional resampling (NASTAR), which reduces mismatches with only one
sample (one-shot) of noisy speech in the target environment. NASTAR uses a
feedback mechanism to simulate adaptive training data via a noise extractor and
a retrieval model. The noise extractor estimates the target noise from the
noisy speech; this estimate is called the pseudo-noise. The noise retrieval
model retrieves noise samples relevant to the noisy speech from a pool of
noise signals; the retrieved set is called the relevant cohort. The
pseudo-noise and the relevant cohort are jointly sampled and mixed with the
source speech corpus to prepare simulated
training data for noise adaptation. Experimental results show that NASTAR can
effectively use one noisy speech sample to adapt an SE model to a target
condition. Moreover, both the noise extractor and the noise retrieval model
contribute to model adaptation. To the best of our knowledge, NASTAR is the
first work to perform one-shot noise adaptation through noise extraction and
retrieval.
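A minimal Python/NumPy sketch of the simulation loop described above; the extractor, the retriever, and the SNR mixing rule are illustrative placeholders, not the authors' released components:

```python
import numpy as np

def extract_pseudo_noise(noisy_speech):
    """Stand-in for the noise extractor: estimate the target noise from the
    single noisy observation. A real system would use a trained separator;
    this placeholder just removes the DC component for illustration."""
    return noisy_speech - noisy_speech.mean()

def retrieve_relevant_cohort(noisy_speech, noise_pool, k=5):
    """Stand-in for the retrieval model: rank pool noises by cosine
    similarity of magnitude spectra to the noisy sample."""
    query = np.abs(np.fft.rfft(noisy_speech))
    query /= np.linalg.norm(query) + 1e-8
    scores = []
    for sig in noise_pool:
        spec = np.abs(np.fft.rfft(sig, n=len(noisy_speech)))
        spec /= np.linalg.norm(spec) + 1e-8
        scores.append(float(query @ spec))
    top = np.argsort(scores)[::-1][:k]
    return [noise_pool[i] for i in top]

def simulate_adaptation_set(noisy_speech, noise_pool, clean_corpus,
                            n_mix=100, snr_db=5.0, seed=0):
    """Jointly sample the pseudo-noise and the relevant cohort, then mix
    each draw with clean source speech at a fixed SNR to build the
    simulated adaptation set."""
    rng = np.random.default_rng(seed)
    candidates = [extract_pseudo_noise(noisy_speech)]
    candidates += retrieve_relevant_cohort(noisy_speech, noise_pool)
    pairs = []
    for _ in range(n_mix):
        clean = clean_corpus[rng.integers(len(clean_corpus))]
        # loop/crop the sampled noise to match the clean utterance length
        noise = np.resize(candidates[rng.integers(len(candidates))], len(clean))
        # scale noise so the mixture hits the target SNR
        gain = np.sqrt((clean ** 2).mean()
                       / (10 ** (snr_db / 10) * (noise ** 2).mean() + 1e-8))
        pairs.append((clean + gain * noise, clean))   # (noisy, clean) pair
    return pairs
```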
Related papers
- Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation [25.410770364140856]
Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain.
This study puts forward a novel data simulation method to address this issue, leveraging noise-extractive techniques and generative adversarial networks (GANs).
We introduce the notion of dynamic perturbation, which injects controlled perturbations into the noise embeddings during inference, as sketched below.
arXiv Detail & Related papers (2024-09-03T02:29:01Z)
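A hedged illustration of the dynamic-perturbation idea above; the norm-matched Gaussian update is our own assumption about one way to inject controlled perturbations into a noise embedding:

```python
import numpy as np

def perturb_noise_embedding(embedding, scale=0.1, seed=None):
    """Add a controlled, norm-matched Gaussian perturbation to a noise
    embedding at inference time, so the data simulator also covers noise
    conditions near the observed one (illustrative, not the paper's code)."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(embedding.shape)
    # match the perturbation norm to the embedding norm, then scale it down
    direction *= np.linalg.norm(embedding) / (np.linalg.norm(direction) + 1e-8)
    return embedding + scale * direction
```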
- Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC).
NPC consists of a detection module and a correction module.
We conduct experiments on the text-retrieval benchmarks Natural Questions and TriviaQA and the code-search benchmarks StaQC and SO-DS.
arXiv Detail & Related papers (2023-11-07T08:27:14Z)
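A rough sketch of a detect-and-correct step for noisy (query, passage) training pairs; the margin rule and in-batch re-pairing are illustrative assumptions, not NPC's actual modules:

```python
import numpy as np

def detect_and_correct_pairs(q_emb, p_emb, margin=0.05):
    """Flag a (query, passage) pair as noisy when another passage in the
    batch scores clearly higher than the annotated one (detection), then
    re-pair the query with its best-scoring passage (correction)."""
    sims = q_emb @ p_emb.T                        # (B, B) similarity matrix
    labeled = np.diag(sims)                       # score of the annotated pair
    best = sims.argmax(axis=1)                    # best passage per query
    noisy = sims.max(axis=1) > labeled + margin   # detection module
    corrected = np.where(noisy, best, np.arange(len(q_emb)))  # correction
    return noisy, corrected
```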
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to their ground-truth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
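A hedged sketch of refining sound-event boundaries from noisy proposals; `denoiser` is a hypothetical trained model, and the interpolation schedule is an assumption rather than DiffSED's exact sampler:

```python
import numpy as np

def refine_boundaries(noisy_props, denoiser, n_steps=10):
    """Iteratively denoise (onset, offset) proposals: at each step the
    hypothetical `denoiser(x, t)` predicts clean boundaries, and the
    current estimate moves toward that prediction."""
    x = np.asarray(noisy_props, dtype=float)      # shape (N, 2)
    for t in range(n_steps, 0, -1):
        x0_hat = denoiser(x, t)                   # predicted clean boundaries
        alpha = (t - 1) / n_steps                 # remaining noise fraction
        x = alpha * x + (1 - alpha) * x0_hat      # step toward the estimate
    return np.clip(x, 0.0, None)                  # onsets/offsets stay >= 0
```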
- Improving the Robustness of Summarization Models by Detecting and Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a lightweight method for detecting and removing such noise in the input during model inference, without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z)
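One plausible training-free filter in the spirit of the entry above; the symbol-ratio heuristic is our own illustration, not the paper's detection method:

```python
import re

def strip_input_noise(text, max_symbol_ratio=0.3):
    """Drop input lines whose non-alphanumeric character ratio is high
    (markup residue, boilerplate) before summarization, with no training
    or auxiliary model required."""
    kept = []
    for line in text.splitlines():
        stripped = re.sub(r"\s+", "", line)
        if not stripped:
            continue  # skip empty lines
        symbols = sum(1 for c in stripped if not c.isalnum())
        if symbols / len(stripped) <= max_symbol_ratio:
            kept.append(line)
    return "\n".join(kept)
```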
- NLIP: Noise-robust Language-Image Pre-training [95.13287735264937]
We propose a principled Noise-robust Language-Image Pre-training framework (NLIP) to stabilize pre-training via two schemes: noise-harmonization and noise-completion.
Our NLIP can alleviate the common noise effects during image-text pre-training in a more efficient way.
arXiv Detail & Related papers (2022-12-14T08:19:30Z)
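An illustrative reading of noise-harmonization as down-weighting poorly matched image-text pairs; the cosine-plus-softmax weighting is an assumption, not NLIP's actual scheme:

```python
import numpy as np

def harmonize_pair_weights(img_emb, txt_emb, temperature=0.07):
    """Weight each image-text pair by its cross-modal agreement so that
    poorly matched (likely noisy) captions contribute less to the
    pre-training loss."""
    img = img_emb / (np.linalg.norm(img_emb, axis=1, keepdims=True) + 1e-8)
    txt = txt_emb / (np.linalg.norm(txt_emb, axis=1, keepdims=True) + 1e-8)
    agreement = (img * txt).sum(axis=1)           # cosine similarity per pair
    weights = np.exp(agreement / temperature)
    return weights / weights.sum()                # normalized loss weights
```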
- Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve performance comparable to the best reported supervised approach while using only 16% of the labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z)
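A minimal sketch of a joint contrastive-plus-reconstruction objective in the spirit of the entry above; the InfoNCE form, L1 term, and weighting are assumptions:

```python
import numpy as np

def joint_objective(anchor, positives, recon_pred, clean_target,
                    temperature=0.1, alpha=0.5):
    """Combine a toy InfoNCE-style contrastive term over embeddings with an
    L1 reconstruction term toward the clean signal, so representations stay
    both discriminative and predictive of clean speech under noise."""
    sims = anchor @ positives.T / temperature        # (B, B) similarities
    logits = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    contrastive = -np.mean(np.diag(log_probs))       # match the i-th pair
    recon = np.abs(recon_pred - clean_target).mean() # L1 reconstruction
    return contrastive + alpha * recon
```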
- Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder [30.318947721658862]
We propose to include noise information in the training phase by using a noise-aware encoder trained on noisy-clean speech pairs.
We show that our proposed noise-aware VAE outperforms the standard VAE in terms of overall distortion without increasing the number of model parameters.
arXiv Detail & Related papers (2021-02-17T11:40:42Z)
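A minimal PyTorch sketch of a noise-aware encoder consistent with the description above; layer sizes and the feature front-end are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class NoiseAwareEncoder(nn.Module):
    """Meant to be trained on noisy-clean speech pairs: it maps *noisy*
    features to the latent posterior, so noise information is available
    at inference time."""
    def __init__(self, feat_dim=257, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh())
        self.mu = nn.Linear(128, latent_dim)       # posterior mean
        self.logvar = nn.Linear(128, latent_dim)   # posterior log-variance

    def forward(self, noisy_feats):
        h = self.net(noisy_feats)
        return self.mu(h), self.logvar(h)
```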
- Adaptive noise imitation for image denoising [58.21456707617451]
We develop a new adaptive noise imitation (ADANI) algorithm that can synthesize noisy data from naturally noisy images.
To produce realistic noise, a noise generator takes unpaired noisy/clean images as input, where the noisy image is a guide for noise generation.
Coupling the noisy data output from ADANI with the corresponding ground-truth, a denoising CNN is then trained in a fully-supervised manner.
arXiv Detail & Related papers (2020-11-30T02:49:36Z)
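An illustrative ADANI-style pair-synthesis step; `noise_generator` is a hypothetical guided generator standing in for the paper's GAN:

```python
import numpy as np

def make_supervised_pairs(clean_imgs, noisy_guides, noise_generator, seed=0):
    """Synthesize (noisy, clean) training pairs: a guided noise generator
    imitates the noise in an unpaired noisy guide image and applies it to
    a clean image, enabling fully supervised denoiser training."""
    rng = np.random.default_rng(seed)
    pairs = []
    for clean in clean_imgs:
        guide = noisy_guides[rng.integers(len(noisy_guides))]
        fake_noise = noise_generator(clean, guide)  # guided noise synthesis
        pairs.append((clean + fake_noise, clean))   # ground truth is known
    return pairs
```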
- SERIL: Noise Adaptive Speech Enhancement using Regularization-based Incremental Learning [36.24803486242198]
Adaptation to a new environment may lead to catastrophic forgetting of the previously learned environments.
In this paper, we propose a regularization-based incremental learning SE (SERIL) strategy.
With a regularization constraint, the parameters are updated for the new noise environment while retaining knowledge of the previous noise environments.
arXiv Detail & Related papers (2020-05-24T14:49:10Z)
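A hedged sketch of the regularization constraint described above, using a plain L2 anchor toward previously learned parameters; SERIL's actual constraint may weight parameters by importance:

```python
import torch

def seril_style_penalty(model, old_params, weight=1.0):
    """Penalize movement away from the parameters learned on previous
    noise environments while fine-tuning on a new one."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + ((p - old_params[name]) ** 2).sum()
    return weight * penalty

# usage sketch:
# snapshot = {n: p.detach().clone() for n, p in model.named_parameters()}
# loss = se_loss + seril_style_penalty(model, snapshot, weight=0.1)
```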
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.