Related papers: Feature-Augmented Deep Networks for Multiscale Building Segmentation in High-Resolution UAV and Satellite Imagery

URL: http://arxiv.org/abs/2505.05321v1
Date: Thu, 08 May 2025 15:08:36 GMT
Title: Feature-Augmented Deep Networks for Multiscale Building Segmentation in High-Resolution UAV and Satellite Imagery
Authors: Chintan B. Maniyar, Minakshi Kumar, Gengchen Mai
Abstract summary: We present a comprehensive deep learning framework for multiscale building segmentation using RGB aerial and satellite imagery. Our model achieves an overall accuracy of 96.5%, an F1-score of 0.86, and an Intersection over Union (IoU) of 0.80, outperforming existing RGB-based benchmarks.
Score: 1.5417562870196788
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Accurate building segmentation from high-resolution RGB imagery remains challenging due to spectral similarity with non-building features, shadows, and irregular building geometries. In this study, we present a comprehensive deep learning framework for multiscale building segmentation using RGB aerial and satellite imagery with spatial resolutions ranging from 0.4m to 2.7m. We curate a diverse, multi-sensor dataset and introduce feature-augmented inputs by deriving secondary representations including Principal Component Analysis (PCA), Visible Difference Vegetation Index (VDVI), Morphological Building Index (MBI), and Sobel edge filters from RGB channels. These features guide a Res-U-Net architecture in learning complex spatial patterns more effectively. We also propose training policies incorporating layer freezing, cyclical learning rates, and SuperConvergence to reduce training time and resource usage. Evaluated on a held-out WorldView-3 image, our model achieves an overall accuracy of 96.5%, an F1-score of 0.86, and an Intersection over Union (IoU) of 0.80, outperforming existing RGB-based benchmarks. This study demonstrates the effectiveness of combining multi-resolution imagery, feature augmentation, and optimized training strategies for robust building segmentation in remote sensing applications.
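The feature-augmentation step described in the abstract can be illustrated with a minimal sketch. It covers only two of the four derived representations (VDVI and Sobel edges) and omits PCA and MBI. The function names, the [0, 1] normalization, the grayscale conversion by channel mean, and the channel order are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def vdvi(rgb, eps=1e-6):
    """Visible Difference Vegetation Index: (2G - R - B) / (2G + R + B)."""
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    # eps avoids division by zero on fully black pixels (assumed safeguard)
    return (2 * g - r - b) / (2 * g + r + b + eps)

def sobel_magnitude(gray):
    """Gradient magnitude from 3x3 Sobel kernels with edge-replicated padding."""
    p = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: weighted right column minus weighted left column
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: weighted bottom row minus weighted top row
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def augment(rgb):
    """Stack RGB with VDVI and Sobel channels into an (H, W, 5) float array."""
    rgb_f = rgb.astype(np.float64) / 255.0   # assumes 8-bit input
    gray = rgb_f.mean(axis=-1)               # simple grayscale, an assumption
    return np.dstack([rgb_f, vdvi(rgb_f), sobel_magnitude(gray)])
```

Calling `augment(tile)` on an 8-bit `(H, W, 3)` tile would yield a five-channel array that could be fed to a segmentation network whose first convolution accepts five input channels.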

Related papers

Unleashing Correlation and Continuity for Hyperspectral Reconstruction from RGB Images [64.80875911446937]
We propose a Correlation and Continuity Network (CCNet) for HSI reconstruction from RGB images. For the correlation of local spectrum, we introduce the Group-wise Spectral Correlation Modeling (GrSCM) module. For the continuity of global spectrum, we design the Neighborhood-wise Spectral Continuity Modeling (NeSCM) module.
arXiv Detail & Related papers (2025-01-02T15:14:40Z)
Efficient Semantic Splatting for Remote Sensing Multi-view Segmentation [29.621022493810088]
We propose a novel semantic splatting approach based on Gaussian Splatting to achieve efficient and low-latency segmentation. Our method projects the RGB attributes and semantic features of point clouds onto the image plane, simultaneously rendering RGB images and semantic segmentation results.
arXiv Detail & Related papers (2024-12-08T15:28:30Z)
Hi-ResNet: Edge Detail Enhancement for High-Resolution Remote Sensing Segmentation [10.919956120261539]
High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas. Objects of the same category within HRS images show significant differences in scale and shape across diverse geographical environments. We propose a High-resolution remote sensing network (Hi-ResNet) with efficient network structure designs.
arXiv Detail & Related papers (2023-05-22T03:58:25Z)
Ultra Sharp : Study of Single Image Super Resolution using Residual Dense Network [0.15229257192293202]
Single Image Super Resolution (SISR) has been an interesting and ill-posed problem in computer vision. Traditional super-resolution imaging approaches involve reconstruction and learning-based methods. This study examines the Residual Dense Networks architecture proposed by Zhang et al.
arXiv Detail & Related papers (2023-04-21T10:32:24Z)
RGB-D based Stair Detection using Deep Learning for Autonomous Stair Climbing [6.362951673024623]
We propose a neural network architecture that takes both an RGB map and a depth map as input. Specifically, we design a selective module that lets the network learn the complementary relationship between the RGB map and the depth map. Experiments on our dataset show that our method achieves better accuracy and recall than the previous state-of-the-art deep learning method.
arXiv Detail & Related papers (2022-12-02T11:22:52Z)
StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition [49.58170209388029]
Visual place recognition (VPR) is usually considered a specific image retrieval problem. We propose StructVPR, a novel training architecture for VPR, to enhance structural knowledge in RGB global features. Our method achieves state-of-the-art performance while maintaining a low computational cost.
arXiv Detail & Related papers (2022-12-02T02:52:01Z)
Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement [58.72667941107544]
Typical frameworks simultaneously estimate the illumination and reflectance but disregard the scene-level contextual information encapsulated in feature spaces. We develop a new context-sensitive decomposition network architecture to exploit the scene-level contextual dependencies on spatial scales. We develop a lightweight CSDNet (named LiteCSDNet) by reducing the number of channels.
arXiv Detail & Related papers (2021-12-09T06:25:30Z)
FEANet: Feature-Enhanced Attention Network for RGB-Thermal Real-time Semantic Segmentation [19.265576529259647]
We propose a two-stage Feature-Enhanced Attention Network (FEANet) for the RGB-T semantic segmentation task. Specifically, we introduce a Feature-Enhanced Attention Module (FEAM) to excavate and enhance multi-level features from both the channel and spatial views. Benefiting from the proposed FEAM, our FEANet can preserve the spatial information and shift more attention to high-resolution features from the fused RGB-T images.
arXiv Detail & Related papers (2021-10-18T02:43:41Z)
High-resolution Depth Maps Imaging via Attention-based Hierarchical Multi-modal Fusion [84.24973877109181]
We propose a novel attention-based hierarchical multi-modal fusion network for guided depth super-resolution (DSR). We show that our approach outperforms state-of-the-art methods in terms of reconstruction accuracy, running speed and memory efficiency.
arXiv Detail & Related papers (2021-04-04T03:28:33Z)
Siamese Network for RGB-D Salient Object Detection and Beyond [113.30063105890041]
A novel framework is proposed to learn from both RGB and depth inputs through a shared network backbone. Comprehensive experiments using five popular metrics show that the designed framework yields a robust RGB-D saliency detector. We also link JL-DCF to the RGB-D semantic segmentation field, showing its capability of outperforming several semantic segmentation models.
arXiv Detail & Related papers (2020-08-26T06:01:05Z)
Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection. We propose a progressively guided alternate refinement network to refine it. Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.