RemixIT: Continual self-training of speech enhancement models via
bootstrapped remixing
- URL: http://arxiv.org/abs/2202.08862v1
- Date: Thu, 17 Feb 2022 19:07:29 GMT
- Title: RemixIT: Continual self-training of speech enhancement models via
bootstrapped remixing
- Authors: Efthymios Tzinis, Yossi Adi, Vamsi Krishna Ithapu, Buye Xu, Paris
Smaragdis, Anurag Kumar
- Abstract summary: RemixIT is a self-supervised method for training speech enhancement models without requiring a single isolated in-domain speech or noise waveform.
We show that RemixIT can be combined with any separation model and applied to any semi-supervised and unsupervised domain adaptation task.
- Score: 41.77753005397551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present RemixIT, a simple yet effective self-supervised method
for training speech enhancement models without requiring a single isolated
in-domain speech or noise waveform. Our approach overcomes a limitation of
previous methods, which depend on clean in-domain target signals and are
therefore sensitive to any domain mismatch between train and test samples.
RemixIT is based on a
continuous self-training scheme in which a pre-trained teacher model on
out-of-domain data infers estimated pseudo-target signals for in-domain
mixtures. Then, by permuting the estimated clean and noise signals and remixing
them together, we generate a new set of bootstrapped mixtures and corresponding
pseudo-targets which are used to train the student network. In turn, the
teacher periodically refines its estimates using the updated parameters of the
latest student models. Experimental results on multiple speech enhancement
datasets and tasks not only show the superiority of our method over prior
approaches but also demonstrate that RemixIT can be combined with any
separation model and applied to any semi-supervised or unsupervised domain
adaptation task. Our analysis, paired with empirical evidence, sheds light on
the inner workings of our self-training scheme, wherein the student model
keeps improving while observing severely degraded pseudo-targets.
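The bootstrapped remixing and teacher-refresh steps described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's code: the function names, array shapes, and the exponential-moving-average refresh (one common protocol for updating a teacher from a student) are assumptions for the sake of the example.

```python
import numpy as np

def bootstrapped_remix(speech_est, noise_est, rng):
    """One remixing step: permute the teacher's noise estimates across
    the batch and add them to the speech estimates, producing new
    bootstrapped mixtures whose pseudo-targets are known by construction.

    speech_est, noise_est: arrays of shape (batch, samples) inferred by
    the teacher model on in-domain mixtures.
    Returns (remixed mixtures, speech pseudo-targets, noise pseudo-targets).
    """
    perm = rng.permutation(speech_est.shape[0])
    remixed = speech_est + noise_est[perm]  # new student training inputs
    return remixed, speech_est, noise_est[perm]

def refresh_teacher(teacher_weights, student_weights, gamma=0.99):
    """Periodically refine the teacher from the latest student, here as
    an exponential moving average over matching parameter tensors
    (a hypothetical protocol; a hard weight copy is another option)."""
    return {k: gamma * teacher_weights[k] + (1.0 - gamma) * student_weights[k]
            for k in teacher_weights}
```

The student is then trained to recover the pseudo-targets from the remixed mixtures, so no clean in-domain speech or noise recordings are ever needed.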
Related papers
- Diffusing States and Matching Scores: A New Framework for Imitation Learning [16.941612670582522]
Adversarial Imitation Learning is traditionally framed as a two-player zero-sum game between a learner and an adversarially chosen cost function.
In recent years, diffusion models have emerged as a non-adversarial alternative to GANs.
We show our approach outperforms GAN-style imitation learning baselines across various continuous control problems.
arXiv Detail & Related papers (2024-10-17T17:59:25Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
- Continual self-training with bootstrapped remixing for speech enhancement [32.68203972471562]
RemixIT is a simple and novel self-supervised training method for speech enhancement.
Our experiments show that RemixIT outperforms several previous state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2021-10-19T16:56:18Z)
- Deep Ensembles for Low-Data Transfer Learning [21.578470914935938]
We study different ways of creating ensembles from pre-trained models.
We show that the nature of pre-training itself is a performant source of diversity.
We propose a practical algorithm that efficiently identifies a subset of pre-trained models for any downstream dataset.
arXiv Detail & Related papers (2020-10-14T07:59:00Z)
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation [37.054709598792165]
The model is a convolutional neural network that operates directly on the raw waveform.
It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle.
At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries.
arXiv Detail & Related papers (2020-07-27T12:10:21Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.