Sound Source Separation Using Latent Variational Block-Wise
Disentanglement
- URL: http://arxiv.org/abs/2402.06683v1
- Date: Thu, 8 Feb 2024 07:22:39 GMT
- Title: Sound Source Separation Using Latent Variational Block-Wise
Disentanglement
- Authors: Karim Helwani, Masahito Togami, Paris Smaragdis, Michael M. Goodwin
- Abstract summary: We present a hybrid classical digital signal processing/deep neural network (DSP/DNN) approach to source separation (SS).
We show that the design choices and the variational formulation of the task, motivated by classical signal processing results, lead to robustness to unseen out-of-distribution data and reduce the risk of overfitting.
- Score: 33.94867897638613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While neural network approaches have made significant strides in resolving
classical signal processing problems, it is often the case that hybrid
approaches that draw insight from both signal processing and neural networks
produce more complete solutions. In this paper, we present a hybrid classical
digital signal processing/deep neural network (DSP/DNN) approach to source
separation (SS) highlighting the theoretical link between variational
autoencoder and classical approaches to SS. We propose a system that transforms
the single channel under-determined SS task to an equivalent multichannel
over-determined SS problem in a properly designed latent space. The separation
task in the latent space is treated as finding a variational block-wise
disentangled representation of the mixture. We show empirically that the design
choices and the variational formulation of the task, motivated by classical
signal processing results, lead to robustness to unseen out-of-distribution
data and a reduced risk of overfitting. To address the resulting permutation
issue, we explicitly incorporate a novel differentiable permutation loss
function and augment the model with a memory mechanism that tracks the
statistics of the individual sources.
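As a rough, hedged illustration of two ingredients mentioned above, the sketch below splits a latent vector into per-source blocks and trains against a standard permutation-invariant (PIT-style) reconstruction loss. It is not the authors' implementation: the block sizes, tensor shapes, and the minimum-over-permutations MSE objective are assumptions standing in for the paper's block-wise disentanglement and its novel differentiable permutation loss.

```python
from itertools import permutations

import torch

# Assumed sizes, for illustration only.
SOURCES, BLOCK_DIM = 2, 64


def split_blocks(z):
    """Partition a latent vector into per-source blocks: (batch, S*D) -> (batch, S, D)."""
    return z.view(z.shape[0], SOURCES, BLOCK_DIM)


def pit_loss(estimates, targets):
    """Permutation-invariant MSE over source estimates of shape (batch, sources, time).

    Taking the minimum over all source orderings means the model is not
    penalized for producing the sources in a different order than the labels.
    """
    per_perm = []
    for perm in permutations(range(SOURCES)):
        diff = estimates[:, list(perm), :] - targets
        per_perm.append((diff ** 2).mean(dim=(1, 2)))
    return torch.stack(per_perm).min(dim=0).values.mean()


# Toy usage with random tensors standing in for encoder/decoder outputs.
z = torch.randn(8, SOURCES * BLOCK_DIM)           # mixture latent
blocks = split_blocks(z)                          # one latent block per source
est = torch.randn(8, SOURCES, 16000, requires_grad=True)
ref = torch.randn(8, SOURCES, 16000)
loss = pit_loss(est, ref)
loss.backward()                                   # differentiable end to end
```

Taking the minimum over orderings removes the arbitrary assignment of sources to output slots while keeping the objective differentiable almost everywhere; the paper's memory mechanism for source statistics is not modeled here.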
Related papers
- Domain Generalization Guided by Gradient Signal to Noise Ratio of
Parameters [69.24377241408851]
Overfitting to the source domain is a common issue in gradient-based training of deep neural networks.
We propose to base the selection on the gradient signal-to-noise ratio (GSNR) of the network's parameters.
arXiv Detail & Related papers (2023-10-11T10:21:34Z)
- Spectral-Bias and Kernel-Task Alignment in Physically Informed Neural
Networks [4.604003661048267]
Physically informed neural networks (PINNs) are a promising emerging method for solving differential equations.
We propose a comprehensive theoretical framework that sheds light on this important problem.
We derive an integro-differential equation that governs PINN prediction in the large data-set limit.
arXiv Detail & Related papers (2023-07-12T18:00:02Z)
- Score-based Source Separation with Applications to Digital Communication
Signals [72.6570125649502]
We propose a new method for separating superimposed sources using diffusion-based generative models.
Motivated by applications in radio-frequency (RF) systems, we are interested in sources with an underlying discrete nature.
Our method can be viewed as a multi-source extension to the recently proposed score distillation sampling scheme.
arXiv Detail & Related papers (2023-06-26T04:12:40Z)
- Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source
Separation [26.6020148790775]
This paper describes an efficient unsupervised learning method for a neural source separation model.
We propose neural FastFCA based on the jointly-diagonalizable yet full-rank spatial model.
Experiments using mixture signals of two to four sound sources show that neural FastFCA outperforms conventional BSS methods.
arXiv Detail & Related papers (2023-06-17T02:50:17Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden-layer neurons have no inherent order (see the weight-permutation sketch after this list).
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Decentralized Local Stochastic Extra-Gradient for Variational
Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains where the problem data is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the setting of fully decentralized computation (a simplified extra-gradient sketch follows this list).
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- Multivariate Deep Evidential Regression [77.34726150561087]
A new approach based on uncertainty-aware neural networks shows promise over traditional deterministic methods.
We discuss three issues with a proposed solution for extracting aleatoric and epistemic uncertainties from regression-based neural networks.
arXiv Detail & Related papers (2021-04-13T12:20:18Z)
- Supervised training of spiking neural networks for robust deployment on
mixed-signal neuromorphic processors [2.6949002029513167]
Mixed-signal analog/digital electronic circuits can emulate spiking neurons and synapses with extremely high energy efficiency.
Device mismatch manifests as differences in effective parameters between identically configured neurons and synapses.
We present a supervised learning approach that addresses this challenge by maximizing robustness to mismatch and other common sources of noise.
arXiv Detail & Related papers (2021-02-12T09:20:49Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity, and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Theory inspired deep network for instantaneous-frequency extraction and
signal components recovery from discrete blind-source data [1.6758573326215689]
This paper is concerned with the inverse problem of recovering the unknown signal components, along with extraction of their frequencies.
None of the existing decomposition methods and algorithms is capable of solving this inverse problem.
We propose to synthesize a deep neural network based directly on a discrete, possibly non-uniformly sampled, set of samples of the blind-source signal.
arXiv Detail & Related papers (2020-01-31T18:54:00Z)
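A brief aside on two entries above. For "Permutation Equivariant Neural Functionals", the weight-space symmetry it builds on can be verified directly: permuting the hidden units of a feedforward network, together with the matching rows and columns of its weight matrices, leaves the computed function unchanged. The sketch below is illustrative only; the network size, ReLU nonlinearity, and NumPy implementation are assumptions, not that paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3

# Random weights of a two-layer MLP.
W1, b1 = rng.standard_normal((d_hidden, d_in)), rng.standard_normal(d_hidden)
W2, b2 = rng.standard_normal((d_out, d_hidden)), rng.standard_normal(d_out)


def mlp(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU hidden layer
    return W2 @ h + b2


perm = rng.permutation(d_hidden)       # arbitrary reordering of hidden neurons
x = rng.standard_normal(d_in)

y_original = mlp(x, W1, b1, W2, b2)
# Permute the rows of W1/b1 and the columns of W2 consistently.
y_permuted = mlp(x, W1[perm], b1[perm], W2[:, perm], b2)

assert np.allclose(y_original, y_permuted)  # same function, different weights
```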
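For "Decentralized Local Stochastic Extra-Gradient for Variational Inequalities", the single-node, deterministic extra-gradient step at the core of that method family can be sketched on a bilinear saddle point; the example problem, step size, and iteration count are assumptions, and the paper's decentralized, stochastic machinery is not reproduced.

```python
import numpy as np


def F(z):
    """Monotone operator of the bilinear saddle point min_x max_y x*y."""
    x, y = z
    return np.array([y, -x])


z, step = np.array([1.0, 1.0]), 0.5
for _ in range(200):
    z_half = z - step * F(z)        # extrapolation (look-ahead) step
    z = z - step * F(z_half)        # update using the look-ahead operator value

# Extra-gradient converges to the solution (0, 0); plain gradient
# descent-ascent with the same step size would spiral outward here.
assert np.linalg.norm(z) < 1e-6
```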