Unsupervised Source Separation via Self-Supervised Training
- URL: http://arxiv.org/abs/2202.03875v1
- Date: Tue, 8 Feb 2022 14:02:50 GMT
- Title: Unsupervised Source Separation via Self-Supervised Training
- Authors: Ertuğ Karamatlı, Serap Kırbız
- Abstract summary: We introduce two novel unsupervised (blind) source separation methods, which involve self-supervised training from single-channel two-source speech mixtures.
Our first method, mixture permutation invariant training (MixPIT), employs permutation invariant training (PIT) to separate artificially-generated mixtures of the original mixtures back into the original mixtures.
We improve upon this first method with cyclic mixture permutation invariant training (MixCycle), which creates mixtures of source estimates and employs PIT to separate these new mixtures in a cyclic fashion.
We show that MixPIT outperforms a common baseline (MixIT) on our small dataset (SC09Mix), and that the two have comparable performance on a standard dataset (LibriMix).
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce two novel unsupervised (blind) source separation methods, which
involve self-supervised training from single-channel two-source speech mixtures
without any access to the ground truth source signals. Our first method employs
permutation invariant training (PIT) to separate artificially-generated
mixtures of the original mixtures back into the original mixtures, which we
named mixture permutation invariant training (MixPIT). We found this
challenging objective to be a valid proxy task for learning to separate the
underlying sources. We improve upon this first method by creating mixtures of
source estimates and employing PIT to separate these new mixtures in a cyclic
fashion. We named this second method cyclic mixture permutation invariant
training (MixCycle), where cyclic refers to the fact that we use the same model
to produce artificial mixtures and to learn from them continuously. We show
that MixPIT outperforms a common baseline (MixIT) on our small dataset
(SC09Mix), and they have comparable performance on a standard dataset
(LibriMix). Strikingly, we also show that MixCycle surpasses the performance of
supervised PIT by being data-efficient, thanks to its inherent data
augmentation mechanism. To the best of our knowledge, no other purely
unsupervised method is able to match or exceed the performance of supervised
training.
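For a concrete picture of the two training objectives described above, the sketch below illustrates the MixPIT loss and one plausible reading of the MixCycle loss in PyTorch. The `neg_si_sdr` helper, the `TinySeparator` stand-in model, and the choice to treat source estimates as gradient-free pseudo-references are illustrative assumptions, not details taken from the paper.

```python
import torch


def neg_si_sdr(est, ref, eps=1e-8):
    """Per-example negative scale-invariant SDR; est, ref: (batch, time) -> (batch,)."""
    ref = ref - ref.mean(dim=-1, keepdim=True)
    est = est - est.mean(dim=-1, keepdim=True)
    scale = (est * ref).sum(-1, keepdim=True) / (ref.pow(2).sum(-1, keepdim=True) + eps)
    target = scale * ref
    noise = est - target
    return -10 * torch.log10(target.pow(2).sum(-1) / (noise.pow(2).sum(-1) + eps) + eps)


def mixpit_loss(model, m1, m2):
    """MixPIT step: mix two training mixtures and separate the sum back into the two
    original mixtures, scoring only the best of the two output permutations (PIT)."""
    x = m1 + m2                                    # mixture of mixtures, (batch, time)
    e1, e2 = model(x)                              # two model outputs
    perm_a = neg_si_sdr(e1, m1) + neg_si_sdr(e2, m2)
    perm_b = neg_si_sdr(e1, m2) + neg_si_sdr(e2, m1)
    return torch.minimum(perm_a, perm_b).mean()    # per-example best permutation


def mixcycle_loss(model, m1, m2):
    """MixCycle step (one plausible reading of the abstract): the same model first
    estimates a source from each real mixture, the estimates are mixed into a new
    artificial mixture, and PIT is applied against those estimates as pseudo-references."""
    with torch.no_grad():                          # pseudo-references, no gradient (assumption)
        s1, _ = model(m1)
        s2, _ = model(m2)
    x = s1 + s2                                    # artificial mixture of source estimates
    e1, e2 = model(x)
    perm_a = neg_si_sdr(e1, s1) + neg_si_sdr(e2, s2)
    perm_b = neg_si_sdr(e1, s2) + neg_si_sdr(e2, s1)
    return torch.minimum(perm_a, perm_b).mean()


class TinySeparator(torch.nn.Module):
    """Stand-in two-output separator; not the architecture used in the paper."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv1d(1, 2, kernel_size=3, padding=1)

    def forward(self, x):
        out = self.net(x.unsqueeze(1))             # (batch, 2, time)
        return out[:, 0], out[:, 1]


model = TinySeparator()
m1, m2 = torch.randn(4, 16000), torch.randn(4, 16000)  # two random stand-in mixtures
mixpit_loss(model, m1, m2).backward()
```

The stand-in model only serves to make the loss functions runnable; the paper reports its results with a proper separation network on SC09Mix and LibriMix.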
Related papers
- Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance [55.872926690722714]
We study the predictability of model performance regarding the mixture proportions in function forms.
We propose nested use of the scaling laws of training steps, model sizes, and our data mixing law.
Our method effectively optimizes the training mixture of a 1B model trained for 100B tokens in RedPajama.
arXiv Detail & Related papers (2024-03-25T17:14:00Z) - Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel convex model for semisupervised/library-based unmixing.
We demonstrate the efficacy of alternating optimization methods for sparse unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z) - DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning [10.971246386083884]
We propose two novel data augmentation techniques specifically designed for the constraints of differentially private learning.
Our first technique, DP-Mix_Self, achieves SoTA classification performance across a range of datasets and settings by performing mixup on self-augmented data.
Our second technique, DP-Mix_Diff, further improves performance by incorporating synthetic data from a pre-trained diffusion model into the mixup process.
arXiv Detail & Related papers (2023-11-02T15:12:12Z) - Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning [76.00798972439004]
Collaborative Sample Selection (CSS) removes noisy samples from the identified clean set.
We introduce a co-training mechanism with a contrastive loss in semi-supervised learning.
arXiv Detail & Related papers (2023-10-24T05:37:20Z) - Expeditious Saliency-guided Mix-up through Random Gradient Thresholding [89.59134648542042]
Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks.
In this paper, inspired by the strengths each direction has over the other, we introduce a novel method that lies at the junction of the two routes.
We name our method R-Mix, following the concept of "Random Mix-up".
In order to address the question of whether there exists a better decision protocol, we train a Reinforcement Learning agent that decides the mix-up policies.
arXiv Detail & Related papers (2022-12-09T14:29:57Z) - Diffusion-based Generative Speech Source Separation [27.928990101986862]
We propose DiffSep, a new single-channel source separation method based on score-matching of a stochastic differential equation (SDE).
Experiments on the WSJ0 2mix dataset demonstrate the potential of the method.
The method is also suitable for speech enhancement and shows performance competitive with prior work on the VoiceBank-DEMAND dataset.
arXiv Detail & Related papers (2022-10-31T13:46:55Z) - Continual self-training with bootstrapped remixing for speech enhancement [32.68203972471562]
RemixIT is a simple and novel self-supervised training method for speech enhancement.
Our experiments show that RemixIT outperforms several previous state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2021-10-19T16:56:18Z) - Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation [27.19635746008699]
We introduce a novel semi-supervised learning framework for end-to-end speech separation.
The proposed method first uses mixtures of unseparated sources and the mixture invariant training criterion to train a teacher model.
Experiments with single- and multi-channel mixtures show that the teacher-student training resolves the over-separation problem.
arXiv Detail & Related papers (2021-06-15T02:26:42Z) - Thompson Sampling with a Mixture Prior [59.211830005673896]
We study Thompson sampling (TS) in online decision-making problems where the uncertain environment is sampled from a mixture distribution.
We develop a novel, general technique for analyzing the regret of TS with such priors.
arXiv Detail & Related papers (2021-06-10T09:21:07Z) - ReMix: Towards Image-to-Image Translation with Limited Data [154.71724970593036]
We propose a data augmentation method (ReMix) to tackle this issue.
We interpolate training samples at the feature level and propose a novel content loss based on the perceptual relations among samples.
The proposed approach effectively reduces the ambiguity of generation and renders content-preserving results.
arXiv Detail & Related papers (2021-03-31T06:24:10Z) - Unsupervised Sound Separation Using Mixture Invariant Training [38.0680944898427]
We show that MixIT can achieve competitive performance compared to supervised methods on speech separation.
In particular, we significantly improve reverberant speech separation performance by incorporating reverberant mixtures.
arXiv Detail & Related papers (2020-06-23T02:22:14Z)
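Since MixIT is the baseline the abstract compares against, a minimal sketch of its assignment search may also help. It assumes a model returning N > 2 outputs as a (batch, N, time) tensor and takes the per-example loss (e.g. the hypothetical `neg_si_sdr` from the sketch above) as an argument; both are illustrative assumptions, not the reference implementation.

```python
import itertools
import torch


def mixit_loss(model, m1, m2, neg_loss):
    """MixIT (sketch): separate a mixture of two mixtures into N outputs and search
    over all binary assignments of outputs to the two reference mixtures."""
    x = m1 + m2
    est = model(x)                                        # (batch, N, time)
    n = est.shape[1]
    best = None
    for bits in itertools.product((0.0, 1.0), repeat=n):  # each output goes to m1 or m2
        mask = torch.tensor(bits, dtype=est.dtype, device=est.device).view(1, n, 1)
        r1 = (est * (1.0 - mask)).sum(dim=1)              # outputs assigned to m1
        r2 = (est * mask).sum(dim=1)                      # outputs assigned to m2
        loss = neg_loss(r1, m1) + neg_loss(r2, m2)        # per-example loss, (batch,)
        best = loss if best is None else torch.minimum(best, loss)
    return best.mean()
```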