Unsupervised Audio Source Separation using Generative Priors
- URL: http://arxiv.org/abs/2005.13769v1
- Date: Thu, 28 May 2020 03:57:16 GMT
- Title: Unsupervised Audio Source Separation using Generative Priors
- Authors: Vivek Narayanaswamy, Jayaraman J. Thiagarajan, Rushil Anirudh and
Andreas Spanias
- Abstract summary: We propose a novel approach for audio source separation based on generative priors trained on individual sources.
Our approach simultaneously searches in the source-specific latent spaces to effectively recover the constituent sources.
- Score: 43.35195236159189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art under-determined audio source separation systems rely on
supervised end-to-end training of carefully tailored neural network architectures
operating either in the time or the spectral domain. However, these methods are
severely challenged by the need for expensive source-level labeled data, and by
being specific to a given set of sources and a given mixing process, which
demands complete re-training when those assumptions change. This
strongly emphasizes the need for unsupervised methods that can leverage the
recent advances in data-driven modeling, and compensate for the lack of labeled
data through meaningful priors. To this end, we propose a novel approach for
audio source separation based on generative priors trained on individual
sources. Through the use of projected gradient descent optimization, our
approach simultaneously searches in the source-specific latent spaces to
effectively recover the constituent sources. Though the generative priors can
be defined in the time domain directly, e.g. WaveGAN, we find that using
spectral domain loss functions for our optimization leads to good-quality
source estimates. Our empirical studies on standard spoken digit and instrument
datasets clearly demonstrate the effectiveness of our approach over classical
as well as state-of-the-art unsupervised baselines.
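The optimization described above can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the authors' implementation: fixed random linear decoders stand in for pretrained deep generative priors such as WaveGAN, finite-difference gradients replace backpropagation through the generators, and the Euclidean-ball projection radius and all names are hypothetical. What it preserves from the paper is the structure of the method: one latent code per source, a joint projected gradient descent search, and a spectral-domain (magnitude-spectrum) loss on the reconstructed mixture.

```python
import numpy as np

# Toy stand-ins for pretrained per-source generative priors (e.g. WaveGAN).
# Each maps a latent vector z to a time-domain signal; here fixed random
# linear decoders are used purely for illustration.
rng = np.random.default_rng(0)
LATENT_DIM, SIGNAL_LEN, N_SOURCES = 8, 64, 2
decoders = [rng.standard_normal((SIGNAL_LEN, LATENT_DIM))
            for _ in range(N_SOURCES)]

def spectral_loss(estimate, mixture):
    # Compare magnitude spectra rather than raw waveforms, mirroring the
    # paper's finding that spectral-domain losses yield better estimates.
    diff = np.abs(np.fft.rfft(estimate)) - np.abs(np.fft.rfft(mixture))
    return float(np.sum(diff ** 2))

def separate(mixture, decoders, steps=200, lr=1e-3, radius=3.0):
    """Projected gradient descent jointly over all source latent codes."""
    zs = [rng.standard_normal(LATENT_DIM) for _ in decoders]

    def total_loss(zs):
        est = sum(W @ z for z, W in zip(zs, decoders))
        return spectral_loss(est, mixture)

    losses = [total_loss(zs)]
    eps = 1e-4
    for _ in range(steps):
        # Finite-difference gradients keep the sketch dependency-free; the
        # paper would instead backpropagate through the generators.
        grads = []
        for i in range(len(zs)):
            g = np.zeros(LATENT_DIM)
            for d in range(LATENT_DIM):
                zp = [z.copy() for z in zs]
                zp[i][d] += eps
                g[d] = (total_loss(zp) - losses[-1]) / eps
            grads.append(g)
        # Gradient step followed by projection onto a Euclidean ball:
        # the "projected" part of projected gradient descent.
        new_zs = []
        for z, g in zip(zs, grads):
            z_new = z - lr * g
            norm = np.linalg.norm(z_new)
            if norm > radius:
                z_new *= radius / norm
            new_zs.append(z_new)
        new_loss = total_loss(new_zs)
        if new_loss < losses[-1]:
            zs = new_zs
            losses.append(new_loss)
        else:
            lr *= 0.5  # simple backtracking on overshoot
    return zs, losses

# Build a synthetic mixture from known latents, then try to recover them.
true_zs = [rng.standard_normal(LATENT_DIM) for _ in range(N_SOURCES)]
mixture = sum(W @ z for z, W in zip(true_zs, decoders))
zs_hat, losses = separate(mixture, decoders)
sources_hat = [W @ z for z, W in zip(zs_hat, decoders)]  # per-source estimates
```

After optimization, each constituent source is read off by decoding its own latent code, which is what makes the search "source-specific": each prior can only produce signals from its own source class, so the joint fit implicitly separates the mixture.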
Related papers
- Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data.
Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability.
Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z)
- Iterative Sound Source Localization for Unknown Number of Sources [57.006589498243336]
We propose an iterative sound source localization approach called ISSL, which iteratively extracts each source's DOA without a threshold until the termination criterion is met.
ISSL achieves significant performance improvements in both DOA estimation and source number detection compared with existing threshold-based algorithms.
arXiv Detail & Related papers (2022-06-24T13:19:44Z)
- Unsupervised Audio Source Separation Using Differentiable Parametric Source Models [8.80867379881193]
We propose an unsupervised model-based deep learning approach to musical source separation.
A neural network is trained to reconstruct the observed mixture as a sum of the sources.
The experimental evaluation on a vocal ensemble separation task shows that the proposed method outperforms learning-free methods.
arXiv Detail & Related papers (2022-01-24T11:05:30Z)
- Unsupervised Source Separation via Bayesian Inference in the Latent Domain [4.583433328833251]
State-of-the-art audio source separation models rely on supervised data-driven approaches.
We propose a simple yet effective unsupervised separation algorithm, which operates directly on a latent representation of time-domain signals.
We validate our approach on the Slakh dataset (arXiv:1909.08494), demonstrating results in line with state-of-the-art supervised approaches.
arXiv Detail & Related papers (2021-10-11T14:32:55Z)
- PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Driven Adaptive Prior [103.00403682863427]
We propose PriorGrad to improve the efficiency of the conditional diffusion model.
We show that PriorGrad achieves a faster convergence leading to data and parameter efficiency and improved quality.
arXiv Detail & Related papers (2021-06-11T14:04:03Z)
- Unsupervised Multi-source Domain Adaptation Without Access to Source Data [58.551861130011886]
Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for an unlabeled domain by transferring knowledge from a separate labeled source domain.
We propose a novel and efficient algorithm that automatically combines the source models with suitable weights so that it performs at least as well as the best source model.
arXiv Detail & Related papers (2021-04-05T10:45:12Z)
- Universal Source-Free Domain Adaptation [57.37520645827318]
We propose a novel two-stage learning process for domain adaptation.
In the Procurement stage, we aim to equip the model for future source-free deployment, assuming no prior knowledge of the upcoming category-gap and domain-shift.
In the Deployment stage, the goal is to design a unified adaptation algorithm capable of operating across a wide range of category-gaps.
arXiv Detail & Related papers (2020-04-09T07:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.