HIMOSA: Efficient Remote Sensing Image Super-Resolution with Hierarchical Mixture of Sparse Attention
- URL: http://arxiv.org/abs/2512.00275v1
- Date: Sat, 29 Nov 2025 02:00:15 GMT
- Title: HIMOSA: Efficient Remote Sensing Image Super-Resolution with Hierarchical Mixture of Sparse Attention
- Authors: Yi Liu, Yi Wan, Xinyi Liu, Qiong Wu, Panwang Xia, Xuejun Huang, Yongjun Zhang
- Abstract summary: HIMOSA is a lightweight super-resolution framework for remote sensing imagery. Our method achieves state-of-the-art performance while maintaining computational efficiency.
- Score: 12.346708587151495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In remote sensing applications, such as disaster detection and response, real-time efficiency and model lightweighting are of critical importance. Consequently, existing remote sensing image super-resolution methods often face a trade-off between model performance and computational efficiency. In this paper, we propose a lightweight super-resolution framework for remote sensing imagery, named HIMOSA. Specifically, HIMOSA leverages the inherent redundancy in remote sensing imagery and introduces a content-aware sparse attention mechanism, enabling the model to achieve fast inference while maintaining strong reconstruction performance. Furthermore, to effectively leverage the multi-scale repetitive patterns found in remote sensing imagery, we introduce a hierarchical window expansion and reduce the computational complexity by adjusting the sparsity of the attention. Extensive experiments on multiple remote sensing datasets demonstrate that our method achieves state-of-the-art performance while maintaining computational efficiency.
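The abstract does not spell out how the content-aware sparse attention is computed. As a rough illustration of the general idea only, the sketch below uses plain top-k sparse attention in NumPy: each query keeps its `keep` highest-scoring keys and ignores the rest. The function name `topk_sparse_attention`, the top-k selection rule, and the `keep` parameter are all assumptions made for this example and are not taken from the HIMOSA paper. The sketch still builds the full score matrix, so it shows only the selection step, not the compute savings a real sparse kernel would give.

```python
import numpy as np

def topk_sparse_attention(q, k, v, keep):
    """Hypothetical sketch: each query attends only to its `keep` best-matching keys.

    q, k, v: arrays of shape (n, d). Returns (output, attention_weights).
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                      # (n, n) scaled dot products

    # Column indices of the `keep` largest scores in every row.
    idx = np.argpartition(scores, -keep, axis=1)[:, -keep:]

    # Additive mask: 0 for kept keys, -inf for dropped keys.
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, idx, 0.0, axis=1)

    # Numerically stable softmax over the kept keys only.
    masked = scores + mask
    masked -= masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Usage: 16 tokens (e.g. one flattened 4x4 window), 8 channels, 4 keys kept per query.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
out, w = topk_sparse_attention(q, k, v, keep=4)
print(out.shape, (w > 0).sum(axis=1))
```

Under this assumed scheme, a larger window could be paired with a smaller kept fraction so that the number of attended keys per query stays roughly constant. That is one way to read "adjusting the sparsity of the attention" as windows expand, though the paper may do it differently.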
Related papers
- HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking [80.07224739976911]
RGB cameras excel at capturing rich texture with high resolution, whereas event cameras offer exceptional temporal resolution and a range (modal)
arXiv Detail & Related papers (2025-10-22T13:15:13Z)
- DroneSR: Rethinking Few-shot Thermal Image Super-Resolution from Drone-based Perspective [50.887173519116196]
In image super-resolution tasks, diffusion models, as representative generative models, typically adopt large-scale architectures.
Few-shot drone-captured infrared training data frequently induces severe overfitting in large-scale architectures.
We propose a new Gaussian quantization representation learning method oriented to diffusion models that alleviates overfitting and enhances robustness.
arXiv Detail & Related papers (2025-09-02T02:37:42Z)
- DiMoSR: Feature Modulation via Multi-Branch Dilated Convolutions for Efficient Image Super-Resolution [7.714092783675679]
This paper introduces DiMoSR, a novel architecture that enhances feature representation through modulation to complement attention in lightweight SISR networks.
Experimental results demonstrate that DiMoSR outperforms state-of-the-art lightweight methods across diverse benchmark datasets.
arXiv Detail & Related papers (2025-05-27T14:40:05Z)
- Breaking Complexity Barriers: High-Resolution Image Restoration with Rank Enhanced Linear Attention [54.42902794496325]
Linear attention, a variant of softmax attention, demonstrates promise in global context modeling.
We propose Rank Enhanced Linear Attention (RELA), a simple yet effective method that enriches feature representations by integrating a lightweight depthwise convolution.
Building upon RELA, we propose an efficient and effective image restoration Transformer, named LAformer.
arXiv Detail & Related papers (2025-05-22T02:57:23Z)
- InstaRevive: One-Step Image Enhancement via Dynamic Score Matching [66.97989469865828]
InstaRevive is an image enhancement framework that employs score-based diffusion distillation to harness potent generative capability.
Our framework delivers high-quality and visually appealing results across a diverse array of challenging tasks and datasets.
arXiv Detail & Related papers (2025-04-22T01:19:53Z)
- A Diffusion-Based Framework for Terrain-Aware Remote Sensing Image Reconstruction [4.824120664293887]
SatelliteMaker is a diffusion-based method that reconstructs missing data across varying levels of data loss.
It takes a Digital Elevation Model (DEM) as a conditioning input and uses tailored prompts to generate realistic images.
It also includes a VGG-Adapter module based on Distribution Loss, which reduces distribution discrepancy and ensures style consistency.
arXiv Detail & Related papers (2025-04-16T14:19:57Z)
- RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model [59.37279559684668]
We introduce RS-vHeat, an efficient multi-modal remote sensing foundation model.
Specifically, RS-vHeat applies the Heat Conduction Operator (HCO) with a complexity of $O(N^{1.5})$ and a global receptive field.
Compared to attention-based remote sensing foundation models, we reduce memory usage by 84% and FLOPs by 24%, and improve throughput by 2.7 times.
arXiv Detail & Related papers (2024-11-27T01:43:38Z)
- RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images [13.98477009749389]
We propose a multimodal remote sensing network that employs a quad-directional selective scanning fusion strategy called RemoteDet-Mamba.
RemoteDet-Mamba simultaneously facilitates the learning of single-modal local features and the integration of patch-level global features.
Experimental results on the DroneVehicle dataset demonstrate the effectiveness of RemoteDet-Mamba.
arXiv Detail & Related papers (2024-10-17T13:20:20Z)
- RBSR: Efficient and Flexible Recurrent Network for Burst Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images.
In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z)
- Boosting Image Super-Resolution Via Fusion of Complementary Information Captured by Multi-Modal Sensors [21.264746234523678]
Image Super-Resolution (SR) provides a promising technique to enhance the image quality of low-resolution optical sensors.
In this paper, we attempt to leverage complementary information from a low-cost channel (visible/depth) to boost image quality of an expensive channel (thermal) using fewer parameters.
arXiv Detail & Related papers (2020-12-07T02:15:28Z)
- Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks [1.3764085113103222]
The presented research proposes a novel residual attention model (RAMS) that efficiently tackles the multi-image super-resolution task.
We introduce the mechanism of visual feature attention with 3D convolutions in order to obtain an aware data fusion and information extraction.
Our representation learning network makes extensive use of nestled residual connections to let flow redundant low-frequency signals.
arXiv Detail & Related papers (2020-07-06T22:54:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.