CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration
- URL: http://arxiv.org/abs/2404.11778v1
- Date: Wed, 17 Apr 2024 22:02:22 GMT
- Title: CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration
- Authors: Rui Deng, Tianpei Gu
- Abstract summary: We introduce the Channel-Aware U-Shaped Mamba model, which incorporates a dual State Space Model framework into the U-Net architecture.
Experiments validate CU-Mamba's superiority over existing state-of-the-art methods.
- Score: 7.292363114816646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing degraded images is a critical task in image processing. Although CNN and Transformer-based models are prevalent in this field, they exhibit inherent limitations, such as inadequate long-range dependency modeling and high computational costs. To overcome these issues, we introduce the Channel-Aware U-Shaped Mamba (CU-Mamba) model, which incorporates a dual State Space Model (SSM) framework into the U-Net architecture. CU-Mamba employs a Spatial SSM module for global context encoding and a Channel SSM component to preserve channel correlation features, both in linear computational complexity relative to the feature map size. Extensive experimental results validate CU-Mamba's superiority over existing state-of-the-art methods, underscoring the importance of integrating both spatial and channel contexts in image restoration.
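To make the dual-branch idea in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' code): a block that runs one linear state-space scan over the flattened spatial tokens and a second scan over the channel axis. The module names (`SimpleSSM`, `DualSSMBlock`) and the plain first-order recurrence h_t = a·h_{t-1} + b·x_t are illustrative stand-ins for Mamba's selective scan; the only point is that both scans run in time linear in the feature-map size.

```python
# Illustrative sketch only; all names and the simplified recurrence are assumptions,
# not the CU-Mamba implementation.
import torch
import torch.nn as nn


class SimpleSSM(nn.Module):
    """First-order linear recurrence h_t = a*h_{t-1} + b*x_t, scanned over the last dim."""

    def __init__(self, dim: int):
        super().__init__()
        self.log_a = nn.Parameter(torch.zeros(dim))  # per-feature decay (sigmoid -> (0, 1))
        self.b = nn.Parameter(torch.ones(dim))       # per-feature input gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim, length) -> one pass over `length`, i.e. linear time.
        a = torch.sigmoid(self.log_a)                # (dim,)
        h = torch.zeros_like(x[..., 0])              # (batch, dim)
        outs = []
        for t in range(x.shape[-1]):
            h = a * h + self.b * x[..., t]
            outs.append(h)
        return torch.stack(outs, dim=-1)             # (batch, dim, length)


class DualSSMBlock(nn.Module):
    """Spatial scan over the H*W tokens, then a channel scan over C, each with a residual."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial_ssm = SimpleSSM(channels)       # sequence = flattened pixels
        self.channel_ssm = SimpleSSM(1)              # sequence = channels
        self.norm1 = nn.GroupNorm(1, channels)
        self.norm2 = nn.GroupNorm(1, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Spatial branch: global context along the H*W token sequence.
        seq = self.norm1(x).flatten(2)                              # (b, c, h*w)
        x = x + self.spatial_ssm(seq).reshape(b, c, h, w)
        # Channel branch: scan along the channel axis at every spatial position.
        chan = self.norm2(x).flatten(2).transpose(1, 2)             # (b, h*w, c)
        chan = self.channel_ssm(chan.reshape(b * h * w, 1, c))      # (b*h*w, 1, c)
        x = x + chan.reshape(b, h * w, c).transpose(1, 2).reshape(b, c, h, w)
        return x


if __name__ == "__main__":
    block = DualSSMBlock(channels=16)
    feat = torch.randn(1, 16, 8, 8)       # small feature map for a quick check
    print(block(feat).shape)              # torch.Size([1, 16, 8, 8])
```

With C channels on an H×W map, the spatial scan takes O(C·H·W) steps and the channel scan O(C) steps per position, so the block stays linear in the feature-map size, consistent with the complexity claim in the abstract.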
Related papers
- CMamba: Learned Image Compression with State Space Models [31.10785880342252]
We propose a hybrid Convolution and State Space Model (SSM) based image compression framework to achieve superior rate-distortion performance.
Specifically, CMamba introduces two key components: a Content-Adaptive SSM (CA-SSM) module and a Context-Aware Entropy (CAE) module.
Experimental results demonstrate that CMamba achieves superior rate-distortion performance.
arXiv Detail & Related papers (2025-02-07T15:07:04Z)
- STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection [48.997518615379995]
Video anomaly detection (VAD) has been extensively researched due to its potential for intelligent video systems.
Most existing methods based on CNNs and transformers still suffer from substantial computational burdens.
We propose a lightweight and effective Mamba-based network named STNMamba to enhance the learning of spatial-temporal normality.
arXiv Detail & Related papers (2024-12-28T08:49:23Z)
- Multi-dimensional Visual Prompt Enhanced Image Restoration via Mamba-Transformer Aggregation [4.227991281224256]
This paper proposes to fully exploit the complementary advantages of Mamba and Transformer without sacrificing computational efficiency.
The selective scanning mechanism of Mamba is employed for spatial modeling, enabling the capture of long-range spatial dependencies.
The self-attention mechanism of Transformer is applied to channel modeling, avoiding the computational burden that grows quadratically with the image's spatial dimensions.
arXiv Detail & Related papers (2024-12-20T12:36:34Z)
- SEM-Net: Efficient Pixel Modelling for image inpainting with Spatially Enhanced SSM [11.447968918063335]
Image inpainting aims to repair a partially damaged image based on information from the known regions of the image.
SEM-Net is a novel State Space Model (SSM) vision network that models corrupted images at the pixel level while capturing long-range dependencies (LRDs) in the state space.
arXiv Detail & Related papers (2024-11-10T00:35:14Z)
- Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose Cross-Scanning Mamba, named CS-Mamba, which employs a Spatial-Spectral SSM for global-local balanced context encoding.
Experimental results show that our CS-Mamba achieves state-of-the-art performance and that the masked training method better reconstructs smooth features, improving visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
- Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
- IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model [7.842507196763463]
IRSRMamba is a novel framework integrating wavelet transform feature modulation for multi-scale adaptation.
IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality.
This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement.
arXiv Detail & Related papers (2024-05-16T07:49:24Z)
- WaterMamba: Visual State Space Model for Underwater Image Enhancement [17.172623370407155]
Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water.
To improve image quality, some underwater image enhancement (UIE) methods based on convolutional neural networks (CNN) and Transformer have been proposed.
In view of computational complexity and severe underwater image degradation, WaterMamba, a state space model (SSM) with linear computational complexity for UIE, is proposed.
arXiv Detail & Related papers (2024-05-14T08:26:29Z)
- Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We make the first attempt to integrate the Vision State Space Model (Mamba) into remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM).
arXiv Detail & Related papers (2024-05-08T11:09:24Z)
- MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis.
We represent each WSI as an undirected graph.
To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z)
- CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training [112.96224800952724]
We propose cascaded modulation GAN (CM-GAN) to generate plausible image structures when dealing with large holes in complex images.
In each decoder block, global modulation is first applied to synthesize coarse, semantics-aware structure; spatial modulation is then applied to the output of the global modulation to further adjust the feature map in a spatially adaptive fashion (a minimal sketch of this cascade appears after this list).
In addition, we design an object-aware training scheme to prevent the network from hallucinating new objects inside holes, fulfilling the needs of object removal tasks in real-world scenarios.
arXiv Detail & Related papers (2022-03-22T16:13:27Z)
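As a rough illustration of the cascaded modulation described in the CM-GAN entry above, here is a minimal, hypothetical decoder-block sketch (not the CM-GAN implementation): a global code first rescales the features channel-wise, then a spatially varying gate refines the globally modulated output. All names (`CascadedModulationBlock`, `to_scale`, `to_gate`) are assumptions made for illustration.

```python
# Hypothetical sketch of global-then-spatial (cascaded) modulation; not the authors' code.
import torch
import torch.nn as nn


class CascadedModulationBlock(nn.Module):
    def __init__(self, channels: int, global_dim: int):
        super().__init__()
        self.to_scale = nn.Linear(global_dim, channels)             # global code -> per-channel scale
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.to_gate = nn.Conv2d(channels, channels, 3, padding=1)  # spatially varying gate

    def forward(self, feat: torch.Tensor, global_code: torch.Tensor) -> torch.Tensor:
        # 1) Global modulation: coarse, semantics-aware scaling driven by the global code.
        scale = self.to_scale(global_code)[:, :, None, None]        # (B, C, 1, 1)
        g = self.conv(feat * (1.0 + scale))
        # 2) Spatial modulation: per-pixel adjustment of the globally modulated features.
        return g * torch.sigmoid(self.to_gate(g))


if __name__ == "__main__":
    block = CascadedModulationBlock(channels=32, global_dim=64)
    out = block(torch.randn(1, 32, 16, 16), torch.randn(1, 64))
    print(out.shape)  # torch.Size([1, 32, 16, 16])
```

The cascade order matters: the global step fixes overall structure from hole-wide context, and the spatial step only refines that output locally, which is the behavior the CM-GAN summary describes.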