DSFC-Net: A Dual-Encoder Spatial and Frequency Co-Awareness Network for Rural Road Extraction
- URL: http://arxiv.org/abs/2602.01278v1
- Date: Sun, 01 Feb 2026 15:23:42 GMT
- Title: DSFC-Net: A Dual-Encoder Spatial and Frequency Co-Awareness Network for Rural Road Extraction
- Authors: Zhengbo Zhang, Yihe Tian, Wanke Xia, Lin Chen, Yue Sun, Kun Ding, Ying Wang, Bing Xu, Shiming Xiang,
- Abstract summary: We propose DSFC-Net, a dual-encoder framework that fuses spatial and frequency-domain information.<n>CFIA module explicitly decouples high- and low-frequency information via a Laplacian Pyramid strategy.<n>Experiments on the WHU-RuR+, DeepGlobe, and Massachusetts datasets validate the superiority of DSFC-Net over state-of-the-art approaches.
- Score: 32.51260718935461
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate extraction of rural roads from high-resolution remote sensing imagery is essential for infrastructure planning and sustainable development. However, this task presents unique challenges in rural settings due to several factors. These include high intra-class variability and low inter-class separability from diverse surface materials, frequent vegetation occlusions that disrupt spatial continuity, and narrow road widths that exacerbate detection difficulties. Existing methods, primarily optimized for structured urban environments, often underperform in these scenarios as they overlook such distinctive characteristics. To address these challenges, we propose DSFC-Net, a dual-encoder framework that synergistically fuses spatial and frequency-domain information. Specifically, a CNN branch is employed to capture fine-grained local road boundaries and short-range continuity, while a novel Spatial-Frequency Hybrid Transformer (SFT) is introduced to robustly model global topological dependencies against vegetation occlusions. Distinct from standard attention mechanisms that suffer from frequency bias, the SFT incorporates a Cross-Frequency Interaction Attention (CFIA) module that explicitly decouples high- and low-frequency information via a Laplacian Pyramid strategy. This design enables the dynamic interaction between spatial details and frequency-aware global contexts, effectively preserving the connectivity of narrow roads. Furthermore, a Channel Feature Fusion Module (CFFM) is proposed to bridge the two branches by adaptively recalibrating channel-wise feature responses, seamlessly integrating local textures with global semantics for accurate segmentation. Comprehensive experiments on the WHU-RuR+, DeepGlobe, and Massachusetts datasets validate the superiority of DSFC-Net over state-of-the-art approaches.
Related papers
- Fluxamba: Topology-Aware Anisotropic State Space Models for Geological Lineament Segmentation in Multi-Source Remote Sensing [6.815807403335458]
We propose a lightweight architecture that introduces a topology-aware feature rectification framework.<n>F Fluxamba achieves a real-time inference speed of over 24 FPS with only 3.4M parameters and 6.3G FLOPs.
arXiv Detail & Related papers (2026-01-24T03:55:21Z) - MFC-RFNet: A Multi-scale Guided Rectified Flow Network for Radar Sequence Prediction [7.015114232190396]
Accurate high-resolution precipitation nowcasting from radar echo sequences is crucial for disaster mitigation and economic planning.<n>Key difficulties include modeling complex multi-scale evolution, inter-frame feature misalignment caused by displacement, and efficiently capturing long-range context.<n>We present the Multi-scale Feature Communication Rectified Flow Network (MFRF-Net), a generative framework that integrates multi-scale communication with guided feature fusion.
arXiv Detail & Related papers (2026-01-07T06:24:26Z) - CoCo-Fed: A Unified Framework for Memory- and Communication-Efficient Federated Learning at the Wireless Edge [50.42067935605982]
We propose a novel Compression and Combination-based Federated learning framework that unifies local memory efficiency and global communication reduction.<n>CoCo-Fed significantly outperforms state-of-the-art baselines in both memory and communication efficiency while maintaining robust convergence under non-IID settings.
arXiv Detail & Related papers (2026-01-02T03:39:50Z) - Real-Time LiDAR Super-Resolution via Frequency-Aware Multi-Scale Fusion [0.4078247440919472]
FLASH (Frequency-aware LiDAR Adaptive Super-resolution with Hierarchical fusion) is a novel framework that overcomes limitations through dual-domain processing.<n> FLASH integrates two key innovations: (i) Frequency-Aware Window Attention that combines local spatial attention with global frequency-domain analysis via FFT, capturing both fine-grained geometry and periodic scanning patterns at log-linear complexity, and (ii) Adaptive Multi-Scale Fusion that replaces conventional skip connections with learned position-specific feature aggregation, enhanced by CBAM attention for dynamic feature selection.
arXiv Detail & Related papers (2025-11-10T18:38:15Z) - Wavelet-Guided Dual-Frequency Encoding for Remote Sensing Change Detection [67.84730634802204]
Change detection in remote sensing imagery plays a vital role in various engineering applications, such as natural disaster monitoring, urban expansion tracking, and infrastructure management.<n>Most existing methods still rely on spatial-domain modeling, where the limited diversity of feature representations hinders the detection of subtle change regions.<n>We observe that frequency-domain feature modeling particularly in the wavelet domain amplify fine-grained differences in frequency components, enhancing the perception of edge changes that are challenging to capture in the spatial domain.
arXiv Detail & Related papers (2025-08-07T11:14:16Z) - SWAN: Synergistic Wavelet-Attention Network for Infrared Small Target Detection [8.098063209250684]
Infrared small target detection (IRSTD) is critical in both civilian and military applications.<n>Recent methods focus on conventional convolution operations, which primarily capture local spatial patterns.<n>We propose the Synergistic Wavelet-Attention Network (SWAN), a novel framework designed to perceive targets from both spatial and frequency domains.
arXiv Detail & Related papers (2025-08-02T11:26:58Z) - FADPNet: Frequency-Aware Dual-Path Network for Face Super-Resolution [70.61549422952193]
Face super-resolution (FSR) under limited computational costs remains an open problem.<n>Existing approaches typically treat all facial pixels equally, resulting in suboptimal allocation of computational resources.<n>We propose FADPNet, a Frequency-Aware Dual-Path Network that decomposes facial features into low- and high-frequency components.
arXiv Detail & Related papers (2025-06-17T02:33:42Z) - F2Net: A Frequency-Fused Network for Ultra-High Resolution Remote Sensing Segmentation [10.67983913373955]
F2Net is a frequency-aware framework that decomposes UHR images into high- and low-frequency components for specialized processing.<n>A Hybrid-Frequency Fusion module integrates these observations, guided by two novel objectives.<n>F2Net achieves state-of-the-art performance with mIoU of 80.22 and 83.39, respectively.
arXiv Detail & Related papers (2025-06-09T15:09:49Z) - FreSca: Scaling in Frequency Space Enhances Diffusion Models [55.75504192166779]
This paper explores frequency-based control within latent diffusion models.<n>We introduce FreSca, a novel framework that decomposes noise difference into low- and high-frequency components.<n>FreSca operates without any model retraining or architectural change, offering model- and task-agnostic control.
arXiv Detail & Related papers (2025-04-02T22:03:11Z) - FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [92.4205087439928]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability.<n>We propose the Self-supervised Transfer (PST) and the FrequencyDe-coupled Fusion module (FreDF)<n>PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models, effectively mitigating data scarcity.<n>FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.<n>This combined approach enables FUSE to construct a universal image-event that only requires lightweight decoder adaptation for target datasets.
arXiv Detail & Related papers (2025-03-25T15:04:53Z) - Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities.
The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.