Differentiable Dictionary Search: Integrating Linear Mixing with Deep
Non-Linear Modelling for Audio Source Separation
- URL: http://arxiv.org/abs/2211.15524v1
- Date: Mon, 28 Nov 2022 16:37:02 GMT
- Title: Differentiable Dictionary Search: Integrating Linear Mixing with Deep
Non-Linear Modelling for Audio Source Separation
- Authors: Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer
- Abstract summary: This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name Differentiable Dictionary Search (DDS).
The fundamental idea is to exploit a class of powerful deep invertible density estimators called normalizing flows, to model the dictionary in a linear decomposition method such as NMF.
As the initial formulation was a proof of concept with some practical limitations, we will present several steps towards making it scalable.
- Score: 8.680081568962997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes several improvements to a new method for signal
decomposition that we recently formulated under the name of Differentiable
Dictionary Search (DDS). The fundamental idea of DDS is to exploit a class of
powerful deep invertible density estimators called normalizing flows, to model
the dictionary in a linear decomposition method such as NMF, effectively
creating a bijection between the space of dictionary elements and the
associated probability space, allowing a differentiable search through the
dictionary space, guided by the estimated densities. As the initial formulation
was a proof of concept with some practical limitations, we will present several
steps towards making it scalable, hoping to improve both the computational
complexity of the method and its signal decomposition capabilities. As a
testbed for experimental evaluation, we choose the task of frame-level piano
transcription, where the signal is to be decomposed into sources whose activity
is attributed to individual piano notes. To highlight the impact of improved
non-linear modelling of sources, we compare variants of our method to a linear
overcomplete NMF baseline. Experimental results will show that even in the
absence of additional constraints, our models produce increasingly sparse and
precise decompositions, according to two pertinent evaluation measures.
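The linear baseline the paper compares against is supervised NMF: a fixed dictionary whose templates are mixed linearly, with only the activations estimated. A minimal sketch of that baseline (an illustrative toy, not the authors' implementation; all dimensions and names are invented) using the classical multiplicative updates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy magnitude "spectrogram": F frequency bins x T frames,
# generated as an exact linear mixture of K dictionary templates.
F, T, K = 6, 8, 2
W = rng.random((F, K))       # fixed dictionary (e.g. one template per note)
H_true = rng.random((K, T))  # ground-truth activations
V = W @ H_true               # observed mixture

# Supervised NMF: keep W fixed and estimate the activations H with the
# classical multiplicative updates for the Euclidean objective ||V - WH||^2.
H = rng.random((K, T))
for _ in range(500):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-12)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {err:.2e}")
```

DDS replaces this fixed linear dictionary with a normalizing-flow density model over dictionary elements, so the dictionary entries themselves can be searched differentiably rather than being frozen columns of W.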
Related papers
- Signature Isolation Forest [3.9440964696313485]
Functional Isolation Forest (FIF) is a state-of-the-art Anomaly Detection (AD) algorithm designed for functional data.
We introduce Signature Isolation Forest, a novel class of AD algorithms leveraging the signature transform from rough path theory.
We provide several numerical experiments, including a real-world application benchmark showing the relevance of our methods.
arXiv Detail & Related papers (2024-03-07T11:00:35Z)
- Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors [12.380863420871071]
DDVI is an approximate inference algorithm for latent variable models that relies on diffusion models as flexible variational posteriors.
We use DDVI on a motivating task in biology -- inferring latent ancestry from human genomes -- and we find that it outperforms strong baselines on the Thousand Genomes dataset.
arXiv Detail & Related papers (2024-01-05T10:27:44Z)
- AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models [103.41269503488546]
Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models with user-provided concepts.
This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents.
We propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs.
It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters.
arXiv Detail & Related papers (2023-07-20T09:06:21Z)
- Score-based Source Separation with Applications to Digital Communication Signals [72.6570125649502]
We propose a new method for separating superimposed sources using diffusion-based generative models.
Motivated by applications in radio-frequency (RF) systems, we are interested in sources with underlying discrete nature.
Our method can be viewed as a multi-source extension to the recently proposed score distillation sampling scheme.
arXiv Detail & Related papers (2023-06-26T04:12:40Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
- Probabilistic Modelling of Signal Mixtures with Differentiable Dictionaries [8.680081568962997]
We introduce a novel way to incorporate prior information into (semi-) supervised non-negative matrix factorization.
It enables principled modelling of mixtures where non-linear sources are linearly mixed.
arXiv Detail & Related papers (2022-11-28T15:27:53Z)
- Single-channel speech separation using Soft-minimum Permutation Invariant Training [60.99112031408449]
A long-lasting problem in supervised speech separation is finding the correct label for each separated speech signal.
Permutation Invariant Training (PIT) has been shown to be a promising solution in handling the label ambiguity problem.
In this work, we propose a probabilistic optimization framework to address the inefficiency of PIT in finding the best output-label assignment.
arXiv Detail & Related papers (2021-11-16T17:25:05Z)
- Parsimony-Enhanced Sparse Bayesian Learning for Robust Discovery of Partial Differential Equations [5.584060970507507]
A Parsimony Enhanced Sparse Bayesian Learning (PeSBL) method is developed for discovering the governing Partial Differential Equations (PDEs) of nonlinear dynamical systems.
Results of numerical case studies indicate that the governing PDEs of many canonical dynamical systems can be correctly identified using the proposed PeSBL method.
arXiv Detail & Related papers (2021-07-08T00:56:11Z)
- Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z)
- Rectangular Flows for Manifold Learning [38.63646804834534]
Normalizing flows are invertible neural networks with tractable change-of-volume terms.
Data of interest is typically assumed to live in some (often unknown) low-dimensional manifold embedded in high-dimensional ambient space.
We propose two methods to tractably estimate the gradient of this term with respect to the parameters of the model.
arXiv Detail & Related papers (2021-06-02T18:30:39Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings [50.524054820564395]
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
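Several of the papers above rely on the tractable change-of-volume term of normalizing flows, which is simply the Jacobian correction in the change-of-variables formula. A minimal one-dimensional sketch, with an affine map standing in for a learned flow layer (all names and constants illustrative):

```python
import numpy as np

# An invertible 1-D map f(x) = a*x + b standing in for a learned flow layer.
a, b = 2.0, -1.0

def log_pdf_standard_normal(z):
    # Log density of the N(0, 1) base distribution.
    return -0.5 * (z ** 2 + np.log(2.0 * np.pi))

def log_px(x):
    # Change of variables: if z = f(x) with z ~ N(0, 1), then
    #   log p_X(x) = log p_Z(f(x)) + log |f'(x)|,
    # where log |f'(x)| = log |a| is the change-of-volume term.
    z = a * x + b
    return log_pdf_standard_normal(z) + np.log(abs(a))

# Sanity check against the closed form: X = (Z - b)/a ~ N(-b/a, 1/a^2).
x = 0.3
mu, sigma = -b / a, 1.0 / abs(a)
closed_form = -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
print(log_px(x), closed_form)
```

In a deep flow the map is a composition of such invertible layers, and the log-Jacobian terms of the layers are summed; this is what lets DDS evaluate and differentiate densities over dictionary elements.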
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.