Related papers: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

URL: http://arxiv.org/abs/2405.04964v1
Date: Wed, 8 May 2024 11:09:24 GMT
Title: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution
Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin,
Abstract summary: We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution. To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR. Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM)
Score: 49.902047563260496
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent progress in remote sensing image (RSI) super-resolution (SR) has exhibited remarkable performance using deep neural networks, e.g., Convolutional Neural Networks and Transformers. However, existing SR methods often suffer from either a limited receptive field or quadratic computational overhead, resulting in sub-optimal global representation and unacceptable computational costs in large-scale RSI. To alleviate these issues, we develop the first attempt to integrate the Vision State Space Model (Mamba) for RSI-SR, which specializes in processing large-scale RSI by capturing long-range dependency with linear complexity. To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR, to explore the spatial and frequent correlations. In particular, our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM) to grasp their merits for effective spatial-frequency fusion. Recognizing that global and local dependencies are complementary and both beneficial for SR, we further recalibrate these multi-level features for accurate feature fusion via learnable scaling adaptors. Extensive experiments on AID, DOTA, and DIOR benchmarks demonstrate that our FMSR outperforms state-of-the-art Transformer-based methods HAT-L in terms of PSNR by 0.11 dB on average, while consuming only 28.05% and 19.08% of its memory consumption and complexity, respectively.

Related papers

MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture [12.168520751389622]
Hyperspectral image (HSI) classification faces challenges such as high-dimensional data, limited training samples, and spectral redundancy.<n>This paper proposes a novel MVNet network architecture that integrates 3D-CNN's local feature extraction, Transformer's global modeling, and Mamba's linear sequence modeling capabilities.<n>On IN, UP, and KSC datasets, MVNet outperforms mainstream hyperspectral image classification methods in both classification accuracy and computational efficiency.
arXiv Detail & Related papers (2025-07-06T14:52:26Z)
FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem.<n>Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources.<n>We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z)
MambaVSR: Content-Aware Scanning State Space Model for Video Super-Resolution [33.457410717030946]
We propose MambaVSR, the first state-space model framework for super-resolution video.<n>MambaVSR enables dynamic interactions through the Shared Compass Construction ( SCC) and the Content-Aware Sequentialization (CAS)<n>Building upon, the CAS module effectively aligns and aggregates non-local similar content across multiple frames by interleaving temporal features along the learned spatial order.
arXiv Detail & Related papers (2025-06-13T13:22:28Z)
AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution [50.584551250242235]
AdaptSR is a low-rank adaptation framework that efficiently repurposes bi-cubic-trained SR models for real-world tasks. Our experiments demonstrate that AdaptSR outperforms GAN and diffusion-based SR methods by up to 4 dB in PSNR and 2% in perceptual scores on real SR benchmarks.
arXiv Detail & Related papers (2025-03-10T18:03:18Z)
GDSR: Global-Detail Integration through Dual-Branch Network with Wavelet Losses for Remote Sensing Image Super-Resolution [30.21425157733119]
We introduce Receptance Weighted Key Value (RWKV) to Remote Sensing Image (RSI) Super-Resolution (SR) To simultaneously model global and local features, we propose the Global-Detail dual-branch structure, GDSR, which performs SR reconstruction by paralleling RWKV and convolutional operations to handle large-scale RSIs. In addition, we propose Wavelet Loss, a loss function that effectively captures high-frequency detail information in images, thereby enhancing the visual quality of SR, particularly in terms of detail reconstruction.
arXiv Detail & Related papers (2024-12-31T10:43:19Z)
Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities. The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z)
Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement [51.557804095896174]
We introduce a State Space Model with Across-Scanning and Local Enhancement, named ASLE-SSM, that employs a Spatial-Spectral SSM for global-local balanced context encoding and cross-channel interaction promoting. Experimental results illustrate ASLE-SSM's superiority over existing state-of-the-art methods, with an inference speed 2.4 times faster than Transformer-based MST and saving 0.12 (M) of parameters.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion [17.084083262801737]
We propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction. Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain. Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain.
arXiv Detail & Related papers (2024-06-27T07:30:54Z)
RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing [19.89130165954241]
Remote sensing image dehazing (RSID) aims to remove nonuniform and physically irregular haze factors for high-quality image restoration. We propose the first lightweight network on the mamba-based model called RSDhamba in the field of RSID.
arXiv Detail & Related papers (2024-05-16T12:12:07Z)
IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model [7.842507196763463]
IRSRMamba is a novel framework integrating wavelet transform feature modulation for multi-scale adaptation. IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality. This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement.
arXiv Detail & Related papers (2024-05-16T07:49:24Z)
GMSR:Gradient-Guided Mamba for Spectral Reconstruction from RGB Images [6.437310105552873]
GMSR-Net is a lightweight model characterized by a global receptive field and linear computational complexity. It achieves state-of-the-art performance while markedly reducing the number of parameters and computational burdens. Compared to existing approaches, GMSR-Net slashes parameters and FLOPS by substantial margins of 10 times and 20 times, respectively.
arXiv Detail & Related papers (2024-05-13T14:21:54Z)
Incorporating Transformer Designs into Convolutions for Lightweight Image Super-Resolution [46.32359056424278]
Large convolutional kernels have become popular in designing convolutional neural networks. The increase in kernel size also leads to a quadratic growth in the number of parameters, resulting in heavy computation and memory requirements. We propose a neighborhood attention (NA) module that upgrades the standard convolution with a self-attention mechanism. Building upon the NA module, we propose a lightweight single image super-resolution (SISR) network named TCSR.
arXiv Detail & Related papers (2023-03-25T01:32:18Z)
Recursive Generalization Transformer for Image Super-Resolution [108.67898547357127]
We propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images. We combine the RG-SA with local self-attention to enhance the exploitation of the global context. Our RGT outperforms recent state-of-the-art methods quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-11T10:44:44Z)
Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI. Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
Lightweight image super-resolution with enhanced CNN [82.36883027158308]
Deep convolutional neural networks (CNNs) with strong expressive ability have achieved impressive performances on single image super-resolution (SISR) We propose a lightweight enhanced SR CNN (LESRCNN) with three successive sub-blocks, an information extraction and enhancement block (IEEB), a reconstruction block (RB) and an information refinement block (IRB) IEEB extracts hierarchical low-resolution (LR) features and aggregates the obtained features step-by-step to increase the memory ability of the shallow layers on deep layers for SISR. RB converts low-frequency features into high-frequency features by fusing global
arXiv Detail & Related papers (2020-07-08T18:03:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.