IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
- URL: http://arxiv.org/abs/2405.09873v2
- Date: Mon, 17 Feb 2025 01:24:21 GMT
- Title: IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
- Authors: Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi,
- Abstract summary: IRSRMamba is a novel framework integrating wavelet transform feature modulation for multi-scale adaptation.
IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality.
This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement.
- Score: 7.842507196763463
- License:
- Abstract: Infrared image super-resolution demands long-range dependency modeling and multi-scale feature extraction to address challenges such as homogeneous backgrounds, weak edges, and sparse textures. While Mamba-based state-space models (SSMs) excel in global dependency modeling with linear complexity, their block-wise processing disrupts spatial consistency, limiting their effectiveness for IR image reconstruction. We propose IRSRMamba, a novel framework integrating wavelet transform feature modulation for multi-scale adaptation and an SSMs-based semantic consistency loss to restore fragmented contextual information. This design enhances global-local feature fusion, structural coherence, and fine-detail preservation while mitigating block-induced artifacts. Experiments on benchmark datasets demonstrate that IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality. This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement. Code are available at https://github.com/yongsongH/IRSRMamba.
Related papers
- Physics-Driven Autoregressive State Space Models for Medical Image Reconstruction [5.208643222679356]
We introduce a novel physics-driven autoregressive state space model (MambaRoll) for enhanced fidelity in medical image reconstruction.
MambaRoll employs an autoregressive framework based on physics-driven state space modules (PSSM), where PSSMs efficiently aggregate contextual features at a given spatial scale.
MambaRoll outperforms state-of-the-art PD methods based on convolutional, transformer and conventional SSM modules.
arXiv Detail & Related papers (2024-12-12T14:59:56Z) - MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation [6.578088710294546]
Traditional segmentation methods struggle to address challenges such as high anatomical variability, blurred tissue boundaries, low organ contrast, and noise.
We propose MLLA-UNet (Mamba-Like Linear Attention UNet), a novel architecture that achieves linear computational complexity while maintaining high segmentation accuracy.
Experiments demonstrate that MLLA-UNet achieves state-of-the-art performance on six challenging datasets with 24 different segmentation tasks, including but not limited to FLARE22, AMOS CT, and ACDC, with an average DSC of 88.32%.
arXiv Detail & Related papers (2024-10-31T08:54:23Z) - Enhanced Super-Resolution Training via Mimicked Alignment for Real-World Scenes [51.92255321684027]
We propose a novel plug-and-play module designed to mitigate misalignment issues by aligning LR inputs with HR images during training.
Specifically, our approach involves mimicking a novel LR sample that aligns with HR while preserving the characteristics of the original LR samples.
We comprehensively evaluate our method on synthetic and real-world datasets, demonstrating its effectiveness across a spectrum of SR models.
arXiv Detail & Related papers (2024-10-07T18:18:54Z) - Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding.
Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing [19.89130165954241]
Remote sensing image dehazing (RSID) aims to remove nonuniform and physically irregular haze factors for high-quality image restoration.
We propose the first lightweight network on the mamba-based model called RSDhamba in the field of RSID.
arXiv Detail & Related papers (2024-05-16T12:12:07Z) - Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM)
arXiv Detail & Related papers (2024-05-08T11:09:24Z) - FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba [19.761723108363796]
FusionMamba aims to overcome the challenges faced by CNNs and Vision Transformers (ViTs) in computer vision tasks.
The framework improves the visual state-space model Mamba by integrating dynamic convolution and channel attention mechanisms.
Experiments show that FusionMamba achieves state-of-the-art performance in a variety of multimodal image fusion tasks as well as downstream experiments.
arXiv Detail & Related papers (2024-04-15T06:37:21Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - RBSR: Efficient and Flexible Recurrent Network for Burst
Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images.
In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z) - Learning Detail-Structure Alternative Optimization for Blind
Super-Resolution [69.11604249813304]
We propose an effective and kernel-free network, namely DSSR, which enables recurrent detail-structure alternative optimization without blur kernel prior incorporation for blind SR.
In our DSSR, a detail-structure modulation module (DSMM) is built to exploit the interaction and collaboration of image details and structures.
Our method achieves the state-of-the-art against existing methods.
arXiv Detail & Related papers (2022-12-03T14:44:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.