LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned
Source Separation
- URL: http://arxiv.org/abs/2010.11631v2
- Date: Wed, 14 Apr 2021 05:31:12 GMT
- Title: LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned
Source Separation
- Authors: Woosung Choi and Minseok Kim and Jaehwa Chung and Soonyoung Jung
- Abstract summary: We propose the Latent Source Attentive Frequency Transformation (LaSAFT) block to capture source-dependent frequency patterns.
We also propose the Gated Point-wise Convolutional Modulation (GPoCM) to modulate internal features.
- Score: 7.002478301291264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent deep-learning approaches have shown that Frequency Transformation (FT)
blocks can significantly improve spectrogram-based single-source separation
models by capturing frequency patterns. The goal of this paper is to extend the
FT block to fit the multi-source task. We propose the Latent Source Attentive
Frequency Transformation (LaSAFT) block to capture source-dependent frequency
patterns. We also propose the Gated Point-wise Convolutional Modulation
(GPoCM), an extension of Feature-wise Linear Modulation (FiLM), to modulate
internal features. By employing these two novel methods, we extend the
Conditioned-U-Net (CUNet) for multi-source separation, and the experimental
results indicate that our LaSAFT and GPoCM can improve the CUNet's performance,
achieving state-of-the-art SDR performance on several MUSDB18 source separation
tasks.
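The abstract describes GPoCM as an extension of FiLM: where FiLM applies a condition-generated per-channel affine transform to internal features, GPoCM instead passes a condition-generated point-wise (1x1) convolution of the features through a sigmoid gate that multiplicatively modulates them. A minimal NumPy sketch of the two modulation styles, not the authors' implementation (the array shapes and the helper names `film` and `gpocm` are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def film(x, gamma, beta):
    # FiLM: per-channel affine modulation of features x using a
    # condition-generated scale gamma and shift beta.
    # x: (channels, frames); gamma, beta: (channels,)
    return gamma[:, None] * x + beta[:, None]

def gpocm(x, w, b):
    # GPoCM-style gating (sketch): a point-wise (1x1) convolution across
    # channels, whose weights w and bias b would be generated from the
    # condition (e.g. a target-source embedding), followed by a sigmoid
    # gate that multiplicatively modulates the input features.
    # x: (channels, frames); w: (channels, channels); b: (channels,)
    pocm = w @ x + b[:, None]
    return x * sigmoid(pocm)

# With zero weights/bias the gate is sigmoid(0) = 0.5,
# so every feature is simply halved.
x = np.ones((2, 3))
out = gpocm(x, np.zeros((2, 2)), np.zeros(2))
```

In the conditioned-separation setting, the condition vector encoding the target source (vocals, drums, bass, etc.) would generate `gamma`/`beta` or `w`/`b`, letting one shared network select different source-dependent behavior.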
Related papers
- MFF-FTNet: Multi-scale Feature Fusion across Frequency and Temporal Domains for Time Series Forecasting [18.815152183468673]
Time series forecasting is crucial in many fields, yet current deep learning models struggle with noise, data sparsity, and capturing complex patterns.
This paper presents MFF-FTNet, a novel framework addressing these challenges by combining contrastive learning with multi-scale feature extraction.
Extensive experiments on five real-world datasets demonstrate that MFF-FTNet significantly outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-11-26T12:41:42Z)
- Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities.
The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z)
- Score-based Source Separation with Applications to Digital Communication Signals [72.6570125649502]
We propose a new method for separating superimposed sources using diffusion-based generative models.
Motivated by applications in radio-frequency (RF) systems, we are interested in sources with underlying discrete nature.
Our method can be viewed as a multi-source extension to the recently proposed score distillation sampling scheme.
arXiv Detail & Related papers (2023-06-26T04:12:40Z)
- Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time.
This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
- Adaptive Frequency Learning in Two-branch Face Forgery Detection [66.91715092251258]
We propose to Adaptively learn Frequency information in a two-branch Detection framework, dubbed AFD.
We liberate our network from the fixed frequency transforms, and achieve better performance with our data- and task-dependent transform layers.
arXiv Detail & Related papers (2022-03-27T14:25:52Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance gains, outperforming the state-of-the-art methods by margins of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- Deep Frequency Filtering for Domain Generalization [55.66498461438285]
Deep Neural Networks (DNNs) have preferences for some frequency components in the learning process.
We propose Deep Frequency Filtering (DFF) for learning domain-generalizable features.
We show that applying our proposed DFF on a plain baseline outperforms the state-of-the-art methods on different domain generalization tasks.
arXiv Detail & Related papers (2022-03-23T05:19:06Z)
- Compute and memory efficient universal sound source separation [23.152611264259225]
We provide a family of efficient neural network architectures for general purpose audio source separation.
The backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of Multi-Resolution Features (SuDoRM-RF).
Our experiments show that SuDoRM-RF models perform comparably and even surpass several state-of-the-art benchmarks.
arXiv Detail & Related papers (2021-03-03T19:16:53Z)
- Sparse Multi-Family Deep Scattering Network [14.932318540666543]
We propose a novel architecture exploiting the interpretability of the Deep Scattering Network (DSN).
The SMF-DSN enhances the DSN by (i) increasing the diversity of the scattering coefficients and (ii) improving its robustness to non-stationary noise.
arXiv Detail & Related papers (2020-12-14T16:06:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.