Differentiable Dictionary Search: Integrating Linear Mixing with Deep
Non-Linear Modelling for Audio Source Separation
- URL: http://arxiv.org/abs/2211.15524v1
- Date: Mon, 28 Nov 2022 16:37:02 GMT
- Title: Differentiable Dictionary Search: Integrating Linear Mixing with Deep
Non-Linear Modelling for Audio Source Separation
- Authors: Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer
- Abstract summary: This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name Differentiable Dictionary Search (DDS).
The fundamental idea is to exploit a class of powerful deep invertible density estimators called normalizing flows, to model the dictionary in a linear decomposition method such as NMF.
As the initial formulation was a proof of concept with some practical limitations, we will present several steps towards making it scalable.
- Score: 8.680081568962997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes several improvements to a new method for signal
decomposition that we recently formulated under the name of Differentiable
Dictionary Search (DDS). The fundamental idea of DDS is to exploit a class of
powerful deep invertible density estimators called normalizing flows, to model
the dictionary in a linear decomposition method such as NMF, effectively
creating a bijection between the space of dictionary elements and the
associated probability space, allowing a differentiable search through the
dictionary space, guided by the estimated densities. As the initial formulation
was a proof of concept with some practical limitations, we will present several
steps towards making it scalable, hoping to improve both the computational
complexity of the method and its signal decomposition capabilities. As a
testbed for experimental evaluation, we choose the task of frame-level piano
transcription, where the signal is to be decomposed into sources whose activity
is attributed to individual piano notes. To highlight the impact of improved
non-linear modelling of sources, we compare variants of our method to a linear
overcomplete NMF baseline. Experimental results will show that even in the
absence of additional constraints, our models produce increasingly sparse and
precise decompositions, according to two pertinent evaluation measures.
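The linear baseline the paper compares against is supervised NMF: a fixed dictionary whose templates are mixed linearly, with only the activations estimated. A minimal sketch of that baseline (an illustrative toy, not the authors' implementation; all dimensions and names are invented) using the classical multiplicative updates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy magnitude "spectrogram": F frequency bins x T frames,
# generated as an exact linear mixture of K dictionary templates.
F, T, K = 6, 8, 2
W = rng.random((F, K))       # fixed dictionary (e.g. one template per note)
H_true = rng.random((K, T))  # ground-truth activations
V = W @ H_true               # observed mixture

# Supervised NMF: keep W fixed and estimate the activations H with the
# classical multiplicative updates for the Euclidean objective ||V - WH||^2.
H = rng.random((K, T))
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-12)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.2e}")
```

DDS replaces this fixed linear dictionary with a normalizing-flow density model over dictionary elements, so the dictionary entries themselves can be searched differentiably rather than being frozen columns of W.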
Related papers
- Signature Isolation Forest [3.9440964696313485]
Functional Isolation Forest (FIF) is a state-of-the-art Anomaly Detection (AD) algorithm designed for functional data.
We introduce Signature Isolation Forest, a novel class of AD algorithms leveraging the signature transform from rough path theory.
We provide several numerical experiments, including a real-world application benchmark showing the relevance of our methods.
arXiv Detail & Related papers (2024-03-07T11:00:35Z)
- Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors [12.380863420871071]
DDVI is an approximate inference algorithm for latent variable models that relies on diffusion models as flexible variational posteriors.
We use DDVI on a motivating task in biology -- inferring latent ancestry from human genomes -- and we find that it outperforms strong baselines on the Thousand Genomes dataset.
arXiv Detail & Related papers (2024-01-05T10:27:44Z)
- AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts.
This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents.
We propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs.
It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z)
- Score-based Source Separation with Applications to Digital Communication Signals [72.6570125649502]
We propose a new method for separating superimposed sources using diffusion-based generative models.
Motivated by applications in radio-frequency (RF) systems, we are interested in sources with underlying discrete nature.
Our method can be viewed as a multi-source extension to the recently proposed score distillation sampling scheme.
arXiv Detail & Related papers (2023-06-26T04:12:40Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
- Probabilistic Modelling of Signal Mixtures with Differentiable Dictionaries [8.680081568962997]
We introduce a novel way to incorporate prior information into (semi-) supervised non-negative matrix factorization.
It enables principled modelling of mixtures where non-linear sources are linearly mixed.
arXiv Detail & Related papers (2022-11-28T15:27:53Z)
- Single-channel speech separation using Soft-minimum Permutation Invariant Training [60.99112031408449]
A long-lasting problem in supervised speech separation is finding the correct label for each separated speech signal.
Permutation Invariant Training (PIT) has been shown to be a promising solution in handling the label ambiguity problem.
In this work, we propose a probabilistic optimization framework to address the inefficiency of PIT in finding the best output-label assignment.
arXiv Detail & Related papers (2021-11-16T17:25:05Z)
- Parsimony-Enhanced Sparse Bayesian Learning for Robust Discovery of Partial Differential Equations [5.584060970507507]
A Parsimony Enhanced Sparse Bayesian Learning (PeSBL) method is developed for discovering the governing Partial Differential Equations (PDEs) of nonlinear dynamical systems.
Results of numerical case studies indicate that the governing PDEs of many canonical dynamical systems can be correctly identified using the proposed PeSBL method.
arXiv Detail & Related papers (2021-07-08T00:56:11Z)
- Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z)
- Rectangular Flows for Manifold Learning [38.63646804834534]
Normalizing flows are invertible neural networks with tractable change-of-volume terms.
Data of interest is typically assumed to live in some (often unknown) low-dimensional manifold embedded in high-dimensional ambient space.
We propose two methods to tractably estimate the gradient of this term with respect to the parameters of the model.
arXiv Detail & Related papers (2021-06-02T18:30:39Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings [50.524054820564395]
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
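Several of the papers above rely on the tractable change-of-volume term of normalizing flows, which is simply the Jacobian correction in the change-of-variables formula. A minimal one-dimensional sketch, with an affine map standing in for a learned flow layer (all names and constants illustrative):

```python
import numpy as np

# An invertible 1-D map f(x) = a*x + b standing in for a learned flow layer.
a, b = 2.0, -1.0

def log_pdf_standard_normal(z):
    # Log density of the N(0, 1) base distribution.
    return -0.5 * (z ** 2 + np.log(2.0 * np.pi))

def log_px(x):
    # Change of variables: if z = f(x) with z ~ N(0, 1), then
    #   log p_X(x) = log p_Z(f(x)) + log |f'(x)|,
    # where log |f'(x)| = log |a| is the change-of-volume term.
    z = a * x + b
    return log_pdf_standard_normal(z) + np.log(abs(a))

# Sanity check against the closed form: X = (Z - b)/a ~ N(-b/a, 1/a^2).
x = 0.3
mu, sigma = -b / a, 1.0 / abs(a)
closed_form = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
print(log_px(x), closed_form)
```

In a deep flow the map is a composition of such invertible layers, and the log-Jacobian terms of the layers are summed; this is what lets DDS evaluate and differentiate densities over dictionary elements.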
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.