Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
- URL: http://arxiv.org/abs/2412.16626v2
- Date: Thu, 02 Jan 2025 10:56:07 GMT
- Title: Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
- Authors: Junyu Wang, Zizhen Lin, Tianrui Wang, Meng Ge, Longbiao Wang, Jianwu Dang
- Abstract summary: Mamba, as a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision.
In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks.
- Score: 54.427965535613886
- Abstract: In recent speech enhancement (SE) research, the Transformer and its variants have emerged as the predominant methodology. However, the quadratic complexity of the self-attention mechanism imposes limitations on practical deployment. Mamba, a novel state-space model (SSM), has gained widespread application in natural language processing and computer vision due to its strong long-sequence modeling capability and relatively low computational complexity. In this work, we introduce Mamba-SEUNet, an innovative architecture that integrates Mamba with U-Net for SE tasks. By leveraging bidirectional Mamba to model the forward and backward dependencies of speech signals at different resolutions, and incorporating skip connections to capture multi-scale information, our approach achieves state-of-the-art (SOTA) performance. Experimental results on the VCTK+DEMAND dataset indicate that Mamba-SEUNet attains a PESQ score of 3.59 while maintaining low computational complexity. When combined with the Perceptual Contrast Stretching technique, Mamba-SEUNet further improves the PESQ score to 3.73.
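As a rough illustration of the idea sketched in the abstract (not the paper's actual architecture), the following minimal PyTorch example combines bidirectional sequence modeling at two resolutions inside a U-Net with a skip connection. The bidirectional Mamba layers are replaced by GRU stand-ins so the sketch runs without the CUDA-only mamba_ssm package; all module names, widths, and the two-level depth are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiDirBlock(nn.Module):
    """Bidirectional sequence block: one pass over the features in time
    order and one over the time-reversed features, fused by a linear
    projection with a residual connection. nn.GRU is a stand-in for a
    bidirectional Mamba (selective-SSM) layer."""
    def __init__(self, dim: int):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)   # forward dependencies
        self.bwd = nn.GRU(dim, dim, batch_first=True)   # backward dependencies
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, x):                                # x: (B, T, C)
        f, _ = self.fwd(x)
        b, _ = self.bwd(torch.flip(x, dims=[1]))
        b = torch.flip(b, dims=[1])                      # re-align in time
        return x + self.proj(torch.cat([f, b], dim=-1))

class SEUNetSketch(nn.Module):
    """Two-level U-Net over feature sequences: encode at full resolution,
    downsample by 2, model at the coarse scale, upsample, and fuse with
    the skip connection. Depth and widths are illustrative only."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.enc = BiDirBlock(dim)
        self.down = nn.Conv1d(dim, 2 * dim, kernel_size=4, stride=2, padding=1)
        self.mid = BiDirBlock(2 * dim)
        self.up = nn.ConvTranspose1d(2 * dim, dim, kernel_size=4, stride=2, padding=1)
        self.fuse = nn.Linear(2 * dim, dim)              # merge skip connection
        self.dec = BiDirBlock(dim)

    def forward(self, x):                                # x: (B, T, C), T even
        s = self.enc(x)                                  # skip at full resolution
        h = self.down(s.transpose(1, 2)).transpose(1, 2) # (B, T/2, 2C)
        h = self.mid(h)                                  # coarse-scale modeling
        h = self.up(h.transpose(1, 2)).transpose(1, 2)   # back to (B, T, C)
        return self.dec(self.fuse(torch.cat([h, s], dim=-1)))

# Example: a batch of 2 feature sequences, 128 frames, 64 channels.
out = SEUNetSketch(64)(torch.randn(2, 128, 64))          # -> (2, 128, 64)
```

A real implementation would swap each GRU for a Mamba block and operate on spectrogram features of the noisy speech rather than random tensors.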
Related papers
- From Markov to Laplace: How Mamba In-Context Learns Markov Chains [36.22373318908893]
We study in-context learning on Markov chains and uncover a surprising phenomenon.
Unlike transformers, even a single-layer Mamba efficiently learns the in-context Laplacian smoothing estimator.
These theoretical insights align strongly with empirical results and represent the first formal connection between Mamba and optimal statistical estimators.
arXiv Detail & Related papers (2025-02-14T14:13:55Z)
- MobileMamba: Lightweight Multi-Receptive Visual Mamba Network [51.33486891724516]
Previous research on lightweight models has primarily focused on CNNs and Transformer-based designs.
We propose the MobileMamba framework, which balances efficiency and performance.
MobileMamba achieves up to 83.6% Top-1 accuracy, surpassing existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-24T18:01:05Z)
- Mamba in Vision: A Comprehensive Survey of Techniques and Applications [3.4580301733198446]
Mamba is emerging as a novel approach to overcome the challenges faced by Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) in computer vision.
Mamba addresses these limitations by leveraging Selective Structured State Space Models to effectively capture long-range dependencies with linear computational complexity.
arXiv Detail & Related papers (2024-10-04T02:58:49Z)
- State-space models are accurate and efficient neural operators for dynamical systems [23.59679792068364]
Physics-informed machine learning (PIML) has emerged as a promising alternative to classical methods for predicting dynamical systems.
Existing models, including recurrent neural networks (RNNs), transformers, and neural operators, face challenges such as long-time integration, long-range dependencies, chaotic dynamics, and extrapolation.
This paper introduces state-space models implemented in Mamba for accurate and efficient dynamical system operator learning.
arXiv Detail & Related papers (2024-09-05T03:57:28Z)
- ReMamba: Equip Mamba with Effective Long-Sequence Modeling [50.530839868893786]
We propose ReMamba, which enhances Mamba's ability to comprehend long contexts.
ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process.
arXiv Detail & Related papers (2024-08-28T02:47:27Z)
- SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction.
We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation.
Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
- An Investigation of Incorporating Mamba for Speech Enhancement [45.610243349192096]
We exploit a Mamba-based regression model to characterize speech signals and build an SE system upon Mamba, termed SEMamba.
SEMamba demonstrates promising results and attains a PESQ score of 3.55 on the VoiceBank-DEMAND dataset.
arXiv Detail & Related papers (2024-05-10T16:18:49Z)
- Vision Mamba: A Comprehensive Survey and Taxonomy [11.025533218561284]
State Space Model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems; a minimal sketch of the underlying recurrence appears at the end of this page.
Based on the latest state-space models, Mamba merges time-varying parameters into SSMs and formulates a hardware-aware algorithm for efficient training and inference.
Mamba is expected to become a new AI architecture that may outperform the Transformer.
arXiv Detail & Related papers (2024-05-07T15:30:14Z)
- Is Mamba Capable of In-Context Learning? [63.682741783013306]
State-of-the-art foundation models such as GPT-4 perform surprisingly well at in-context learning (ICL).
This work provides empirical evidence that Mamba, a newly proposed state space model, has similar ICL capabilities.
arXiv Detail & Related papers (2024-02-05T16:39:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
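As background for the SSM-based entries above, here is a minimal, illustrative sketch of the discretized state-space recurrence that Mamba builds on: h_t = A * h_{t-1} + B * x_t, y_t = <C, h_t>, computed in O(T) time versus the O(T^2) of self-attention. Shapes and the diagonal parameterization are assumptions for illustration; actual Mamba makes (A, B, C) input-dependent ("selective") and evaluates the scan with a hardware-aware parallel kernel.

```python
import torch

def ssm_scan(x: torch.Tensor, A: torch.Tensor, B: torch.Tensor, C: torch.Tensor):
    """Naive O(T) scan of a discretized diagonal SSM, applied per channel:
        h_t = A * h_{t-1} + B * x_t,   y_t = <C, h_t>.
    Illustrative only: real Mamba makes (A, B, C) functions of the input
    and replaces this Python loop with a fused parallel scan."""
    Bt, T, D = x.shape                        # batch, time, channels
    N = A.shape[-1]                           # state size per channel
    h = torch.zeros(Bt, D, N)                 # hidden state
    ys = []
    for t in range(T):                        # one state update per step: O(T)
        h = A * h + B * x[:, t].unsqueeze(-1) # state transition + input
        ys.append((h * C).sum(dim=-1))        # readout y_t
    return torch.stack(ys, dim=1)             # (B, T, D)

# Example with random parameters: 4 channels, state size 8, 100 time steps.
D, N = 4, 8
y = ssm_scan(torch.randn(2, 100, D),
             torch.rand(D, N) * 0.9,          # stable decay per state dim
             torch.randn(D, N),
             torch.randn(D, N))               # y: (2, 100, 4)
```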