LightSAFT: Lightweight Latent Source Aware Frequency Transform for
Source Separation
- URL: http://arxiv.org/abs/2111.12516v1
- Date: Wed, 24 Nov 2021 14:25:13 GMT
- Title: LightSAFT: Lightweight Latent Source Aware Frequency Transform for
Source Separation
- Authors: Yeong-Seok Jeong, Jinsung Kim, Woosung Choi, Jaehwa Chung, Soonyoung
Jung
- Abstract summary: LaSAFT-Net has shown that conditioned models can perform comparably to existing single-source separation models.
LightSAFT-Net provides a sufficient SDR performance for comparison during the Music Demixing Challenge at ISMIR 2021.
Our enhanced LightSAFT-Net outperforms the previous one with fewer parameters.
- Score: 0.7192233658525915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditioned source separation has attracted significant attention because
of its flexibility, applicability, and extensibility. Its performance was
usually inferior to that of existing approaches, such as single-source
separation models. However, a recently proposed method called LaSAFT-Net has
shown that conditioned models can perform comparably to existing
single-source separation models. This paper presents LightSAFT-Net, a
lightweight version of LaSAFT-Net. As a baseline, it provided sufficient SDR
performance for comparison during the Music Demixing Challenge at ISMIR 2021.
This paper also enhances the existing LightSAFT-Net by replacing the LightSAFT
blocks in the encoder with TFC-TDF blocks. Our enhanced LightSAFT-Net
outperforms the previous one with fewer parameters.
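The Music Demixing Challenge mentioned above ranked systems by SDR. Below is a minimal sketch of the simplified signal-to-distortion ratio (10·log10 of signal energy over error energy) commonly used in that setting; the actual challenge metric also averages over sources and songs, which is omitted here:

```python
import math

def sdr(reference, estimate, eps=1e-12):
    """Simplified Signal-to-Distortion Ratio in dB:
    10 * log10(||s||^2 / ||s - s_hat||^2)."""
    num = sum(s * s for s in reference)
    den = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10 * math.log10((num + eps) / (den + eps))

# A near-perfect estimate scores high; larger errors lower the SDR.
ref = [0.0, 1.0, 0.5, -0.5]
print(sdr(ref, [0.0, 0.9, 0.5, -0.5]))
```

Higher is better; a perfect estimate drives the error energy toward zero and the SDR toward infinity (bounded here by `eps`).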
Related papers
- Causal Autoregressive Diffusion Language Model [70.7353007255797]
CARD reformulates the diffusion process within a strictly causal attention mask, enabling dense, per-token supervision in a single forward pass. Our results demonstrate that CARD achieves ARM-level data efficiency while unlocking the latency benefits of parallel generation.
arXiv Detail & Related papers (2026-01-29T17:38:29Z) - EMTSF: Extraordinary Mixture of SOTA Models for Time Series Forecasting [0.750638869146118]
We propose a strong Mixture of Experts (MoE) framework for Time Series Forecasting. Our method combines state-of-the-art (SOTA) models including xLSTM, enhanced Linear, PatchTST, and minGRU. Our proposed model outperforms all existing TSF models on standard benchmarks, surpassing even the latest approaches based on MoE frameworks.
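A Mixture of Experts blends expert predictions with learned gate weights. Below is a minimal sketch of the gating mechanism, with two hypothetical toy experts (persistence and linear trend) standing in for models like xLSTM or PatchTST:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forecast(history, experts, gate_scores):
    """Mixture of Experts: weighted sum of expert predictions,
    with weights given by a softmax over the gate scores."""
    weights = softmax(gate_scores)
    preds = [expert(history) for expert in experts]
    return sum(w * p for w, p in zip(weights, preds))

# Toy experts: persistence (last value) and linear trend extrapolation.
experts = [lambda h: h[-1], lambda h: h[-1] + (h[-1] - h[-2])]
print(moe_forecast([1.0, 2.0, 3.0], experts, gate_scores=[0.0, 0.0]))
```

In a real TSF MoE the gate scores come from a learned routing network conditioned on the input; equal scores here give each expert half the weight.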
arXiv Detail & Related papers (2025-10-27T14:55:30Z) - ScaleWeaver: Weaving Efficient Controllable T2I Generation with Multi-Scale Reference Attention [86.93601565563954]
ScaleWeaver is a framework designed to achieve high-fidelity, controllable generation upon advanced visual autoregressive (VAR) models. The proposed Reference Attention module discards the unnecessary attention from image→condition, reducing computational cost. Experiments show that ScaleWeaver delivers high-quality generation and precise control while attaining superior efficiency over diffusion-based methods.
arXiv Detail & Related papers (2025-10-16T17:00:59Z) - Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models.
This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
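LoRA's parameter saving comes from expressing the weight update as a low-rank product. Below is a minimal sketch of the effective weight W' = W + (α/r)·B·A; the scaling convention follows the original LoRA formulation, though this framework's exact variant may differ:

```python
def matmul(A, B):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_weight(W, A, B, alpha, r):
    """Effective weight after LoRA: W' = W + (alpha / r) * B @ A.
    Only the small factors A (r x d_in) and B (d_out x r) are trained;
    the full matrix W stays frozen."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

# 2x2 frozen weight with a rank-1 update: the trainable parameter count
# grows as O(r * (d_in + d_out)) instead of O(d_in * d_out).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
print(lora_weight(W, A, B, alpha=2, r=1))
```

For federated settings, only A and B need to be transmitted per round, which is the source of the communication saving.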
arXiv Detail & Related papers (2025-05-01T06:15:38Z) - A Lightweight Deep Exclusion Unfolding Network for Single Image Reflection Removal [68.0573194557999]
Single Image Reflection Removal (SIRR) is a canonical blind source separation problem.
We propose a novel Deep Exclusion unfolding Network (DExNet) for SIRR.
DExNet is constructed by unfolding and parameterizing a simple iterative Sparse and Auxiliary Feature Update (i-SAFU) algorithm.
arXiv Detail & Related papers (2025-03-03T07:54:27Z) - Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints [66.63250537475973]
This paper introduces a diffusion-driven semantic communication framework with advanced VAE-based compression for bandwidth-constrained generative models. Our experimental results demonstrate significant improvements in pixel-level metrics like peak signal-to-noise ratio (PSNR) and semantic metrics like learned perceptual image patch similarity (LPIPS).
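PSNR, one of the reported pixel-level metrics, is straightforward to compute. A minimal sketch follows (flattened pixel lists for brevity; real implementations operate on image arrays):

```python
import math

def psnr(reference, estimate, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# Pixels each off by 5 give MSE = 25 and a finite PSNR.
ref = [100.0, 150.0, 200.0]
est = [105.0, 145.0, 205.0]
print(psnr(ref, est))
```

Higher PSNR means lower pixel-level distortion; LPIPS, by contrast, compares deep-feature activations and is not reducible to a closed-form expression like this.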
arXiv Detail & Related papers (2024-07-26T02:34:25Z) - R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z) - Distilling Semantic Priors from SAM to Efficient Image Restoration Models [80.83077145948863]
In image restoration (IR), leveraging semantic priors from segmentation models has been a common approach to improve performance.
Recent segment anything model (SAM) has emerged as a powerful tool for extracting advanced semantic priors to enhance IR tasks.
We propose a general framework to distill SAM's semantic knowledge to boost existing IR models without interfering with their inference process.
arXiv Detail & Related papers (2024-03-25T02:17:20Z) - RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content [62.685566387625975]
Current mitigation strategies, while effective, are not resilient under adversarial attacks.
This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently moderate harmful and unsafe inputs.
arXiv Detail & Related papers (2024-03-19T07:25:02Z) - LYT-NET: Lightweight YUV Transformer-based Network for Low-light Image Enhancement [0.0]
LYT-Net is a novel lightweight transformer-based model for low-light image enhancement (LLIE).
In our method, we adopt a dual-path approach, treating the chrominance channels U and V and the luminance channel Y as separate entities to help the model better handle illumination adjustment and corruption restoration.
Our comprehensive evaluation on established LLIE datasets demonstrates that, despite its low complexity, our model outperforms recent LLIE methods.
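LYT-Net's dual-path design rests on separating luminance from chrominance. Below is a minimal sketch of a BT.601 RGB→YUV conversion that produces this split; the paper's exact color transform may differ:

```python
def rgb_to_yuv(r, g, b):
    """BT.601 RGB -> YUV: Y carries luminance (brightness),
    U and V carry chrominance (color)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return y, u, v

# A neutral gray puts all its energy in Y with (near-)zero chrominance,
# which is what lets a model treat illumination (Y) separately from color.
print(rgb_to_yuv(0.5, 0.5, 0.5))
```

In low-light enhancement, most of the needed correction (brightening) lives in the Y channel, which motivates processing it on its own path.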
arXiv Detail & Related papers (2024-01-26T21:02:44Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Can SAM Boost Video Super-Resolution? [78.29033914169025]
We propose a simple yet effective module -- SAM-guidEd refinEment Module (SEEM).
This lightweight plug-in module is specifically designed to leverage the attention mechanism for the generation of semantic-aware features.
We apply our SEEM to two representative methods, EDVR and BasicVSR, resulting in consistently improved performance with minimal implementation effort.
arXiv Detail & Related papers (2023-05-11T02:02:53Z) - Incorporating Transformer Designs into Convolutions for Lightweight
Image Super-Resolution [46.32359056424278]
Large convolutional kernels have become popular in designing convolutional neural networks.
The increase in kernel size also leads to a quadratic growth in the number of parameters, resulting in heavy computation and memory requirements.
We propose a neighborhood attention (NA) module that upgrades the standard convolution with a self-attention mechanism.
Building upon the NA module, we propose a lightweight single image super-resolution (SISR) network named TCSR.
arXiv Detail & Related papers (2023-03-25T01:32:18Z) - Feature Distillation Interaction Weighting Network for Lightweight Image
Super-Resolution [25.50790871331823]
We propose a lightweight yet efficient Feature Distillation Interaction Weighted Network (FDIWN).
FDIWN strikes a better balance between model performance and efficiency than other models.
arXiv Detail & Related papers (2021-12-16T06:20:35Z) - LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned
Source Separation [7.002478301291264]
We propose the Latent Source Attentive Frequency Transformation (LaSAFT) block to capture source-dependent frequency patterns.
We also propose the Gated Point-wise Convolutional Modulation (GPoCM) to modulate internal features.
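GPoCM modulates internal features with a gated point-wise (1×1) convolution. Below is a minimal sketch of the gating idea; in LaSAFT-Net the point-wise weights are generated from the source-condition embedding, which is simplified here to fixed weights for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gpocm(features, weights, bias):
    """Gated Point-wise Convolutional Modulation (sketch):
    out = x * sigmoid(PoCM(x)), where PoCM is a 1x1 convolution across
    channels. In LaSAFT-Net the weights/bias would be produced from the
    source-condition embedding; here they are passed in directly."""
    out = []
    for x in features:  # x: channel vector at one time-frequency position
        pocm = [sum(w * xi for w, xi in zip(wrow, x)) + b
                for wrow, b in zip(weights, bias)]
        out.append([xi * sigmoid(p) for xi, p in zip(x, pocm)])
    return out

# Identity 1x1 conv with zero bias gates each channel by sigmoid of itself.
feats = [[0.0, 2.0]]
print(gpocm(feats, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0]))
```

Because the gate is condition-dependent, the same shared network can suppress or pass different frequency patterns depending on which source it is asked to separate.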
arXiv Detail & Related papers (2020-10-22T11:58:23Z) - Residual Feature Distillation Network for Lightweight Image
Super-Resolution [40.52635571871426]
We propose a lightweight and accurate SISR model called residual feature distillation network (RFDN).
RFDN uses multiple feature distillation connections to learn more discriminative feature representations.
We also propose a shallow residual block (SRB) as the main building block of RFDN so that the network can benefit most from residual learning.
arXiv Detail & Related papers (2020-09-24T08:46:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.