Latent Iterative Refinement for Modular Source Separation
- URL: http://arxiv.org/abs/2211.11917v2
- Date: Mon, 16 Oct 2023 03:06:50 GMT
- Title: Latent Iterative Refinement for Modular Source Separation
- Authors: Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis,
Jonathan Le Roux
- Abstract summary: Traditional source separation approaches train deep neural network models end-to-end with all the data available at once.
We argue that we can significantly increase resource efficiency during both training and inference stages.
- Score: 44.78689915209527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional source separation approaches train deep neural network models
end-to-end with all the data available at once by minimizing the empirical risk
on the whole training set. On the inference side, after training the model, the
user fetches a static computation graph and runs the full model on some
specified observed mixture signal to get the estimated source signals.
Additionally, many of those models consist of several basic processing blocks
which are applied sequentially. We argue that we can significantly increase
resource efficiency during both training and inference stages by reformulating
a model's training and inference procedures as iterative mappings of latent
signal representations. First, we can apply the same processing block more than
once on its output to refine the input signal and consequently improve
parameter efficiency. During training, we can follow a block-wise procedure
which enables a reduction in memory requirements. Thus, one can train a very
complicated network structure using significantly less computation compared to
end-to-end training. During inference, we can dynamically adjust how many
processing blocks and iterations of a specific block an input signal needs
using a gating module.
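To make the iterative-refinement and gating ideas concrete, below is a minimal sketch in PyTorch. The paper does not specify this exact architecture: the convolutional block design, the RefinementBlock, Gate, and IterativeSeparator names, the latent shapes, and the stopping threshold are all illustrative assumptions, not the authors' implementation. The sketch shows a single processing block being re-applied to its own output, with a small gating module deciding at inference time how many extra iterations each input receives.

```python
# Illustrative sketch (not the authors' code): iterative refinement of a latent
# mixture representation with per-block gating that decides how many extra
# passes an input receives at inference time. All names are hypothetical.
import torch
import torch.nn as nn


class RefinementBlock(nn.Module):
    """A reusable block mapping a latent representation to a refined latent
    of the same shape, so the same block can be applied repeatedly."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return z + self.net(z)  # residual update keeps input/output shapes equal


class Gate(nn.Module):
    """Tiny gating module: looks at the current latent and outputs a
    'continue refining' probability."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Pool over time, then score each example in the batch.
        return torch.sigmoid(self.score(z.mean(dim=-1)))


class IterativeSeparator(nn.Module):
    def __init__(self, dim: int = 128, num_blocks: int = 4, max_iters: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(RefinementBlock(dim) for _ in range(num_blocks))
        self.gates = nn.ModuleList(Gate(dim) for _ in range(num_blocks))
        self.max_iters = max_iters

    def forward(self, z: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
        # z: (batch, dim, time) latent representation of the observed mixture.
        for block, gate in zip(self.blocks, self.gates):
            for _ in range(self.max_iters):
                z = block(z)
                if gate(z).mean() < threshold:  # gate says "good enough", stop early
                    break
        return z


# Example: dynamically adjusted compute for a batch of latent mixtures.
model = IterativeSeparator()
latents = torch.randn(2, 128, 256)
with torch.no_grad():
    refined = model(latents)
print(refined.shape)  # torch.Size([2, 128, 256])
```

Under the block-wise training procedure described in the abstract, one could imagine training each such block with only its own activations in memory rather than backpropagating through the full unrolled computation, which is where the claimed reduction in training memory would come from; the exact training schedule is left to the paper itself.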
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After this network is trained on a small base model using demonstrations, it can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- Partitioned Neural Network Training via Synthetic Intermediate Labels [0.0]
GPU memory constraints have become a notable bottleneck in training such sizable models.
This study advocates partitioning the model across GPUs and generating synthetic intermediate labels to train individual segments.
This approach results in a more efficient training process that minimizes data communication while maintaining model accuracy.
arXiv Detail & Related papers (2024-03-17T13:06:29Z)
- Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of exchanging gradients.
We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z)
- Block-local learning with probabilistic latent representations [2.839567756494814]
Locking and weight transport are problems because they prevent efficient parallelization and horizontal scaling of the training process.
We propose a new method to address both these problems and scale up the training of large models.
We present results on a variety of tasks and architectures, demonstrating state-of-the-art performance using block-local learning.
arXiv Detail & Related papers (2023-05-24T10:11:30Z)
- Decouple Graph Neural Networks: Train Multiple Simple GNNs Simultaneously Instead of One [60.5818387068983]
Graph neural networks (GNNs) suffer from severe inefficiency.
We propose to decouple a multi-layer GNN as multiple simple modules for more efficient training.
We show that the proposed framework is highly efficient with reasonable performance.
arXiv Detail & Related papers (2023-04-20T07:21:32Z)
- Lightweight and Flexible Deep Equilibrium Learning for CSI Feedback in FDD Massive MIMO [13.856867175477042]
In frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems, downlink channel state information (CSI) needs to be sent back to the base station (BS) by the users.
We propose a lightweight and flexible deep learning-based CSI feedback approach by capitalizing on deep equilibrium models.
arXiv Detail & Related papers (2022-11-28T05:53:09Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning [14.642266310020505]
This paper proposes a new sequential model learning architecture to solve partially observable Markov decision problems.
The proposed architecture generates a latent variable in each data block with a length of multiple timesteps and passes the most relevant information to the next block for policy optimization.
Numerical results show that the proposed method significantly outperforms previous methods in various partially observable environments.
arXiv Detail & Related papers (2021-12-10T05:38:24Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture that explicitly targets multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
- Efficient Learning of Model Weights via Changing Features During Training [0.0]
We propose a machine learning model that dynamically changes the features during training.
Our main motivation is to update the model in small increments during training by replacing less descriptive features with new ones drawn from a large pool.
arXiv Detail & Related papers (2020-02-21T12:38:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.