SAIP-Net: Enhancing Remote Sensing Image Segmentation via Spectral Adaptive Information Propagation
- URL: http://arxiv.org/abs/2504.16564v1
- Date: Wed, 23 Apr 2025 09:43:58 GMT
- Title: SAIP-Net: Enhancing Remote Sensing Image Segmentation via Spectral Adaptive Information Propagation
- Authors: Zhongtao Wang, Xizhe Cao, Yisong Chen, Guoping Wang
- Abstract summary: This paper introduces SAIP-Net, a novel frequency-aware segmentation framework. SAIP-Net employs adaptive frequency filtering and multi-scale receptive field enhancement. Experiments demonstrate significant performance improvements over state-of-the-art methods.
- Score: 12.735064111733696
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic segmentation of remote sensing imagery demands precise spatial boundaries and robust intra-class consistency, challenging conventional hierarchical models. To address limitations arising from spatial domain feature fusion and insufficient receptive fields, this paper introduces SAIP-Net, a novel frequency-aware segmentation framework that leverages Spectral Adaptive Information Propagation. SAIP-Net employs adaptive frequency filtering and multi-scale receptive field enhancement to effectively suppress intra-class feature inconsistencies and sharpen boundary lines. Comprehensive experiments demonstrate significant performance improvements over state-of-the-art methods, highlighting the effectiveness of spectral-adaptive strategies combined with expanded receptive fields for remote sensing image segmentation.
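The abstract mentions frequency filtering of features but gives no implementation details. As a rough illustration of the general idea only, and not SAIP-Net's actual module, the sketch below reweights the low- and high-frequency bands of a single 2-D feature map with NumPy's FFT. The function names, the `cutoff` value, and the fixed `low_gain`/`high_gain` parameters are all assumptions made for this example; an adaptive scheme would presumably predict such gains from the data rather than fix them by hand.

```python
import numpy as np


def radial_frequency(height, width):
    """Return the radial frequency (cycles per pixel) of each FFT bin."""
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fx ** 2 + fy ** 2)


def frequency_filter(feature, cutoff=0.1, low_gain=1.0, high_gain=0.5):
    """Scale the low and high frequency bands of a 2-D map separately.

    Hypothetical helper: the gains are fixed inputs here, whereas an
    adaptive filter would derive them from the feature content.
    """
    height, width = feature.shape
    spectrum = np.fft.fft2(feature)
    radius = radial_frequency(height, width)
    # Bins at or below the cutoff count as "low frequency".
    gain = np.where(radius <= cutoff, low_gain, high_gain)
    # The gain mask is real and symmetric, so the result is real
    # up to floating-point round-off.
    return np.real(np.fft.ifft2(spectrum * gain))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.normal(size=(32, 32))
    smoothed = frequency_filter(feature, cutoff=0.1, high_gain=0.0)
    print(feature.std(), smoothed.std())
```

Setting `high_gain` below 1 smooths the map, which loosely corresponds to suppressing intra-class variation, while setting it above 1 emphasises edges. How SAIP-Net balances the two is described in the paper itself, not here.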
Related papers
- FSDENet: A Frequency and Spatial Domains based Detail Enhancement Network for Remote Sensing Semantic Segmentation [19.29677373677975]
We propose the Frequency and Spatial Domains based Detail Enhancement Network (FSDENet). Our framework employs spatial processing methods to extract rich multi-scale spatial features and fine-grained semantic details. FSDENet achieves state-of-the-art (SOTA) performance on four widely adopted datasets.
arXiv Detail & Related papers (2025-09-29T04:09:09Z)
- GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images [68.33481681452675]
We propose a graph-enhanced contextual and regional perception network (GCRPNet). It builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation. It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information.
arXiv Detail & Related papers (2025-08-14T11:31:43Z)
- Beyond Frequency: Seeing Subtle Cues Through the Lens of Spatial Decomposition for Fine-Grained Visual Classification [8.936378000130812]
We propose Subtle-Cue Oriented Perception Engine (SCOPE), which adaptively enhances the representational capability of low-level details and high-level semantics in the spatial domain. SCOPE achieves new state-of-the-art on four popular fine-grained image classification benchmarks.
arXiv Detail & Related papers (2025-08-09T12:13:40Z)
- RAPNet: A Receptive-Field Adaptive Convolutional Neural Network for Pansharpening [2.746409982853943]
We introduce RAPNet, a new architecture that leverages content-adaptive convolution. RAPNet employs the Receptive-field Adaptive Pansharpening Convolution (RAPConv), designed to produce spatially adaptive kernels responsive to local feature context. The network integrates the Pansharpening Dynamic Feature Fusion (PAN-DFF) module, which incorporates an attention mechanism to achieve an optimal balance between spatial detail enhancement and spectral fidelity.
arXiv Detail & Related papers (2025-07-14T16:39:14Z)
- Residual Prior-driven Frequency-aware Network for Image Fusion [6.90874640835234]
Image fusion aims to integrate complementary information across modalities to generate high-quality fused images. We propose a Residual Prior-driven Frequency-aware Network, termed RPFNet.
arXiv Detail & Related papers (2025-07-09T10:48:00Z)
- HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior [62.04939047885834]
We present HoliSDiP, a framework that leverages semantic segmentation to provide both precise textual and spatial guidance for Real-ISR.
Our method employs semantic labels as concise text prompts while introducing dense semantic guidance through segmentation masks and our proposed spatial-CLIP Map.
arXiv Detail & Related papers (2024-11-27T15:22:44Z)
- PSTNet: Enhanced Polyp Segmentation with Multi-scale Alignment and Frequency Domain Integration [17.1088588766663]
Polyp Network with Shunted Transformer (PSTNet) is a novel approach that integrates both RGB and frequency domain cues present in the images.
PSTNet comprises three key modules: the Frequency characterization Attention Module (FCAM) for extracting frequency cues and capturing polyp characteristics, the Feature Supplementary Alignment Module (FSAM) for aligning semantic information and reducing noise, and the Cross Perception localization Module (CPM) for synergizing frequency cues with high-level semantics to achieve efficient polyp segmentation.
arXiv Detail & Related papers (2024-09-13T02:52:25Z)
- Frequency-Spatial Entanglement Learning for Camouflaged Object Detection [34.426297468968485]
Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features with complicated design.
We propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method.
Our experiments demonstrate the superiority of our FSEL over 21 state-of-the-art methods, through comprehensive quantitative and qualitative comparisons on three widely used datasets.
arXiv Detail & Related papers (2024-09-03T07:58:47Z)
- FANet: Feature Amplification Network for Semantic Segmentation in Cluttered Background [9.970265640589966]
Existing deep learning approaches leave out semantic cues that are crucial for semantic segmentation in complex scenarios.
We propose a feature amplification network (FANet) as a backbone network that incorporates semantic information using a novel feature enhancement module at multiple stages.
Our experimental results demonstrate state-of-the-art performance compared to existing methods.
arXiv Detail & Related papers (2024-07-12T15:57:52Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- ARHNet: Adaptive Region Harmonization for Lesion-aware Augmentation to Improve Segmentation Performance [61.04246102067351]
We propose a foreground harmonization framework (ARHNet) to tackle intensity disparities and make synthetic images look more realistic.
We demonstrate the efficacy of our method in improving the segmentation performance using real and synthetic images.
arXiv Detail & Related papers (2023-07-02T10:39:29Z)
- DCN-T: Dual Context Network with Transformer for Hyperspectral Image Classification [109.09061514799413]
Hyperspectral image (HSI) classification is challenging due to spatial variability caused by complex imaging conditions.
We propose a tri-spectral image generation pipeline that transforms HSI into high-quality tri-spectral images.
Our proposed method outperforms state-of-the-art methods for HSI classification.
arXiv Detail & Related papers (2023-04-19T18:32:52Z)
- Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain [88.7339322596758]
We present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery.
SPSL achieves state-of-the-art performance on cross-dataset evaluation as well as multi-class classification, and obtains comparable results on single-dataset evaluation.
arXiv Detail & Related papers (2021-03-02T16:45:08Z)
- Unsupervised Bidirectional Cross-Modality Adaptation via Deeply Synergistic Image and Feature Alignment for Medical Image Segmentation [73.84166499988443]
We present a novel unsupervised domain adaptation framework, named Synergistic Image and Feature Alignment (SIFA).
Our proposed SIFA conducts synergistic alignment of domains from both image and feature perspectives.
Experimental results on two different tasks demonstrate that our SIFA method is effective in improving segmentation performance on unlabeled target images.
arXiv Detail & Related papers (2020-02-06T13:49:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.