Related papers: An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention

An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention

URL: http://arxiv.org/abs/2306.05887v1
Date: Fri, 9 Jun 2023 13:30:27 GMT
Title: An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Authors: Junyu Wang
Abstract summary: We present an efficient speech separation neural network, ARFDCN, which combines dilated convolutions, multi-scale fusion (MSF), and channel attention. Experimental results indicate that the model achieves a decent balance between performance and computational efficiency.
Score: 0.2538209532048866
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an efficient speech separation neural network, ARFDCN, which combines dilated convolutions, multi-scale fusion (MSF), and channel attention to overcome the limited receptive field of convolution-based networks and the high computational cost of transformer-based networks. The suggested network architecture is encoder-decoder based. By using dilated convolutions with gradually increasing dilation value to learn local and global features and fusing them at adjacent stages, the model can learn rich feature content. Meanwhile, by adding channel attention modules to the network, the model can extract channel weights, learn more important features, and thus improve its expressive power and robustness. Experimental results indicate that the model achieves a decent balance between performance and computational efficiency, making it a promising alternative to current mainstream models for practical applications.

Related papers

CFFormer: Cross CNN-Transformer Channel Attention and Spatial Feature Fusion for Improved Segmentation of Low Quality Medical Images [29.68616115427831]
CNN-Transformer models are designed to combine the advantages of CNNs and Transformers to efficiently model both local information and long-range dependencies. We introduce the Cross Feature Channel Attention (CFCA) module and the X-Spatial Feature Fusion (XFF) module. The CFCA module filters and facilitates interactions between the channel features from the two encoders, while the XFF module effectively reduces the significant semantic information differences in spatial features.
arXiv Detail & Related papers (2025-01-07T08:59:20Z)
Residual Kolmogorov-Arnold Network for Enhanced Deep Learning [0.5852077003870417]
We introduce Residual Arnold, which incorporates the Kolmogorov-KAN framework as a residual component. Our results demonstrate the potential of RKAN to enhance the capabilities of deep CNNs in visual data.
arXiv Detail & Related papers (2024-10-07T21:12:32Z)
Wav-KAN: Wavelet Kolmogorov-Arnold Networks [3.38220960870904]
Wav-KAN is an innovative neural network architecture that leverages the Wavelet Kolmogorov-Arnold Networks (Wav-KAN) framework to enhance interpretability and performance. Our results highlight the potential of Wav-KAN as a powerful tool for developing interpretable and high-performance neural networks.
arXiv Detail & Related papers (2024-05-21T14:36:16Z)
TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture. To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer. In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data. We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising. Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
Dynamic Encoding and Decoding of Information for Split Learning in Mobile-Edge Computing: Leveraging Information Bottleneck Theory [1.1151919978983582]
Split learning is a privacy-preserving distributed learning paradigm in which an ML model is split into two parts (i.e., an encoder and a decoder) In mobile-edge computing, network functions can be trained via split learning where an encoder resides in a user equipment (UE) and a decoder resides in the edge network. We present a new framework and training mechanism to enable a dynamic balancing of the transmission resource consumption with the informativeness of the shared latent representations.
arXiv Detail & Related papers (2023-09-06T07:04:37Z)
Joint Channel Estimation and Feedback with Masked Token Transformers in Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix. The entire encoder-decoder network is utilized for channel compression. Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
Interference Cancellation GAN Framework for Dynamic Channels [74.22393885274728]
We introduce an online training framework that can adapt to any changes in the channel. Our framework significantly outperforms recent neural network models on highly dynamic channels.
arXiv Detail & Related papers (2022-08-17T02:01:18Z)
Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks. Transformers that need to incorporate contextual information to extract features dynamically are neglected. We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
Graph-based Algorithm Unfolding for Energy-aware Power Allocation in Wireless Networks [27.600081147252155]
We develop a novel graph sumable framework to maximize energy efficiency in wireless communication networks. We show the permutation training which is a desirable property for models of wireless network data. Results demonstrate its generalizability across different network topologies.
arXiv Detail & Related papers (2022-01-27T20:23:24Z)
Convolutional Neural Network optimization via Channel Reassessment Attention module [19.566271646280978]
We propose a novel network optimization module called Channel Reassessment (CRA) module. CRA module uses channel attentions with spatial information of feature maps to enhance representational power of networks. Experiments on ImageNet and MS datasets demonstrate that embedding CRA module on various networks effectively improves the performance under different evaluation standards.
arXiv Detail & Related papers (2020-10-12T11:27:17Z)
Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks. We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.