IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
- URL: http://arxiv.org/abs/2405.09873v1
- Date: Thu, 16 May 2024 07:49:24 GMT
- Title: IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
- Authors: Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi
- Abstract summary: Infrared (IR) image super-resolution faces challenges from homogeneous background pixel distributions and sparse target regions.
Recent advancements in Mamba-based models (Selective Structured State Space Models) have shown significant potential in visual tasks.
We introduce IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model.
- Score: 7.842507196763463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infrared (IR) image super-resolution faces challenges from homogeneous background pixel distributions and sparse target regions, requiring models that effectively handle long-range dependencies and capture detailed local-global information. Recent advancements in Mamba-based models (Selective Structured State Space Models) have shown significant potential in visual tasks, suggesting their applicability to IR enhancement. In this work, we introduce IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model, a novel Mamba-based model designed specifically for IR image super-resolution. This model enhances the restoration of context-sparse target details through its advanced dependency modeling capabilities. Additionally, a new wavelet transform feature modulation block improves multi-scale receptive field representation, capturing both global and local information efficiently. Comprehensive evaluations confirm that IRSRMamba outperforms existing models on multiple benchmarks. This research advances IR super-resolution and demonstrates the potential of Mamba-based models in IR image processing. Code is available at \url{https://github.com/yongsongH/IRSRMamba}.
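As a rough illustration of the kind of block the abstract describes, the sketch below decomposes a feature map into Haar wavelet sub-bands, processes the low-frequency (global) and high-frequency (local) components in separate branches, and fuses them back as a residual modulation. This is a minimal PyTorch sketch under assumed design choices (Haar decomposition, branch layout, layer sizes), not the authors' released implementation; the official code is in the linked repository.

```python
# Minimal sketch (assumption, not the official IRSRMamba code) of a
# wavelet-transform feature-modulation block: split features into Haar
# sub-bands, modulate global (LL) and local (LH/HL/HH) parts separately,
# then fuse them back into the input as a residual.
import torch
import torch.nn as nn


def haar_dwt2d(x: torch.Tensor):
    """Single-level 2D Haar DWT via strided slicing; returns (LL, LH, HL, HH)."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (-a - b + c + d) / 2
    hl = (-a + b - c + d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh


class WaveletFeatureModulation(nn.Module):
    """Hypothetical block modulating low- and high-frequency sub-band features."""

    def __init__(self, channels: int):
        super().__init__()
        self.low_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.high_branch = nn.Conv2d(3 * channels, 3 * channels, 3, padding=1, groups=3)
        self.fuse = nn.Conv2d(4 * channels, channels, 1)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")

    def forward(self, x):
        ll, lh, hl, hh = haar_dwt2d(x)
        low = self.low_branch(ll)                                 # global, low-frequency context
        high = self.high_branch(torch.cat([lh, hl, hh], dim=1))   # local, high-frequency detail
        out = self.fuse(torch.cat([low, high], dim=1))            # merge sub-band features
        return x + self.up(out)                                   # residual modulation of the input


if __name__ == "__main__":
    feats = torch.randn(1, 32, 64, 64)
    print(WaveletFeatureModulation(32)(feats).shape)  # torch.Size([1, 32, 64, 64])
```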
Related papers
- An Efficient and Mixed Heterogeneous Model for Image Restoration [71.85124734060665]
Current mainstream approaches are based on three architectural paradigms: CNNs, Transformers, and Mambas.
We propose RestorMixer, an efficient and general-purpose image restoration model based on mixed-architecture fusion.
arXiv Detail & Related papers (2025-04-15T08:19:12Z) - Real-World Remote Sensing Image Dehazing: Benchmark and Baseline [19.747354924759104]
The scarcity of real-world remote sensing hazy image pairs has compelled existing methods to rely primarily on synthetic datasets.
We introduce Real-World Remote Sensing Hazy Image dataset (RRSHID), the first large-scale dataset featuring real-world hazy and dehazed image pairs.
Based on this, we propose MCAF-Net, a novel framework tailored for real-world RSID.
arXiv Detail & Related papers (2025-03-23T07:15:46Z) - Physics-Driven Autoregressive State Space Models for Medical Image Reconstruction [5.208643222679356]
We introduce a novel physics-driven autoregressive state space model (MambaRoll) for enhanced fidelity in medical image reconstruction.
MambaRoll employs an autoregressive framework based on physics-driven state space modules (PSSM), where PSSMs efficiently aggregate contextual features at a given spatial scale.
MambaRoll outperforms state-of-the-art PD methods based on convolutional, transformer and conventional SSM modules.
arXiv Detail & Related papers (2024-12-12T14:59:56Z) - MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking [51.28485682954006]
We propose a pure Mamba-based framework (MambaVT) to fully exploit spatio-temporal contextual modeling for robust visible-thermal tracking.
Specifically, we devise the long-range cross-frame integration component to globally adapt to target appearance variations.
Experiments show the significant potential of vision Mamba for RGB-T tracking, with MambaVT achieving state-of-the-art performance on four mainstream benchmarks.
arXiv Detail & Related papers (2024-08-15T02:29:00Z) - Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding.
Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis [56.849285913695184]
Diffusion Mamba (DiM) is a sequence model for efficient high-resolution image synthesis.
The DiM architecture achieves inference-time efficiency for high-resolution images.
Experiments demonstrate the effectiveness and efficiency of our DiM.
arXiv Detail & Related papers (2024-05-23T06:53:18Z) - RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing [19.89130165954241]
Remote sensing image dehazing (RSID) aims to remove nonuniform and physically irregular haze factors for high-quality image restoration.
We propose the first lightweight network on the Mamba-based model, called RSDehamba, in the field of RSID.
arXiv Detail & Related papers (2024-05-16T12:12:07Z) - Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We make the first attempt to integrate the Vision State Space Model (Mamba) into remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM).
arXiv Detail & Related papers (2024-05-08T11:09:24Z) - FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba [17.75933946414591]
Multi-modal image fusion aims to combine information from different modalities to create a single image with detailed textures.
Transformer-based models, while excelling in global feature modeling, confront computational challenges stemming from their quadratic complexity.
We propose FusionMamba, a novel dynamic feature enhancement method for multimodal image fusion with Mamba.
arXiv Detail & Related papers (2024-04-15T06:37:21Z) - RS-Mamba for Large Remote Sensing Image Dense Prediction [58.12667617617306]
We propose the Remote Sensing Mamba (RSM) for dense prediction tasks in large VHR remote sensing images.
RSM is specifically designed to capture the global context of remote sensing images with linear complexity.
Our model achieves better efficiency and accuracy than transformer-based models on large remote sensing images.
arXiv Detail & Related papers (2024-04-03T12:06:01Z) - RSMamba: Remote Sensing Image Classification with State Space Model [25.32283897448209]
We introduce RSMamba, a novel architecture for remote sensing image classification.
RSMamba is based on the State Space Model (SSM) and incorporates an efficient, hardware-aware design known as Mamba.
We propose a dynamic multi-path activation mechanism to augment Mamba's capacity to model non-temporal image data.
arXiv Detail & Related papers (2024-03-28T17:59:49Z) - Diffusion Models Without Attention [110.5623058129782]
Diffusion State Space Model (DiffuSSM) is an architecture that supplants attention mechanisms with a more scalable state space model backbone.
Our focus on FLOP-efficient architectures in diffusion training marks a significant step forward.
arXiv Detail & Related papers (2023-11-30T05:15:35Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - RBSR: Efficient and Flexible Recurrent Network for Burst Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images.
In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z) - Learning Detail-Structure Alternative Optimization for Blind Super-Resolution [69.11604249813304]
We propose an effective and kernel-free network, namely DSSR, which enables recurrent detail-structure alternative optimization without blur kernel prior incorporation for blind SR.
In our DSSR, a detail-structure modulation module (DSMM) is built to exploit the interaction and collaboration of image details and structures.
Our method achieves state-of-the-art performance compared with existing methods.
arXiv Detail & Related papers (2022-12-03T14:44:17Z) - Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model [97.9548609175831]
We resort to plain vision transformers with about 100 million parameters and make the first attempt to propose large vision models customized for remote sensing tasks.
Specifically, to handle the large image size and objects of various orientations in RS images, we propose a new rotated varied-size window attention.
Experiments on detection tasks demonstrate the superiority of our model over all state-of-the-art models, achieving 81.16% mAP on the DOTA-V1.0 dataset.
arXiv Detail & Related papers (2022-08-08T09:08:40Z) - Exploiting Digital Surface Models for Inferring Super-Resolution for Remotely Sensed Images [2.3204178451683264]
This paper introduces a novel approach for forcing an SRR model to output realistic remote sensing images.
Instead of relying on feature-space similarities as a perceptual loss, the model considers pixel-level information inferred from the normalized Digital Surface Model (nDSM) of the image.
Based on visual inspection, the inferred super-resolution images exhibit noticeably superior quality.
arXiv Detail & Related papers (2022-05-09T06:02:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.