Learning to Holistically Detect Bridges from Large-Size VHR Remote
Sensing Imagery
- URL: http://arxiv.org/abs/2312.02481v1
- Date: Tue, 5 Dec 2023 04:15:22 GMT
- Title: Learning to Holistically Detect Bridges from Large-Size VHR Remote
Sensing Imagery
- Authors: Yansheng Li, Junwei Luo, Yongjun Zhang, Yihua Tan, Jin-Gang Yu, Song
Bai
- Abstract summary: It is essential to perform holistic bridge detection in large-size very-high-resolution (VHR) RSIs.
The lack of datasets with large-size VHR RSIs limits the deep learning algorithms' performance on bridge detection.
This paper proposes a large-scale dataset named GLH-Bridge comprising 6,000 VHR RSIs sampled from diverse geographic locations.
- Score: 40.001753733290464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bridge detection in remote sensing images (RSIs) plays a crucial role in
various applications, but it poses unique challenges compared to the detection
of other objects. In RSIs, bridges exhibit considerable variations in terms of
their spatial scales and aspect ratios. Therefore, to ensure the visibility and
integrity of bridges, it is essential to perform holistic bridge detection in
large-size very-high-resolution (VHR) RSIs. However, the lack of datasets with
large-size VHR RSIs limits the deep learning algorithms' performance on bridge
detection. Due to the limitation of GPU memory in tackling large-size images,
deep learning-based object detection methods commonly adopt the cropping
strategy, which inevitably results in label fragmentation and discontinuous
prediction. To ameliorate the scarcity of datasets, this paper proposes a
large-scale dataset named GLH-Bridge comprising 6,000 VHR RSIs sampled from
diverse geographic locations across the globe. These images encompass a wide
range of sizes, varying from 2,048*2,048 to 16,384*16,384 pixels, and
collectively feature 59,737 bridges. Furthermore, we present an efficient
network for holistic bridge detection (HBD-Net) in large-size RSIs. The HBD-Net
presents a separate detector-based feature fusion (SDFF) architecture and is
optimized via a shape-sensitive sample re-weighting (SSRW) strategy. Based on
the proposed GLH-Bridge dataset, we establish a bridge detection benchmark
including the OBB and HBB tasks, and validate the effectiveness of the proposed
HBD-Net. Additionally, cross-dataset generalization experiments on two publicly
available datasets illustrate the strong generalization capability of the
GLH-Bridge dataset.
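The label fragmentation that the abstract attributes to the cropping strategy can be illustrated with a small sketch. This is not the paper's HBD-Net or its data pipeline; it only shows, under assumed values, how one long horizontal box is split when a large image is cut into fixed-size tiles. The image size, tile size, stride, and box coordinates are hypothetical, as are the helper names `tile_origins` and `fragment_box`.

```python
from typing import List, Tuple

# Horizontal bounding box as (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def tile_origins(size: int, tile: int, stride: int) -> List[int]:
    """Start offsets of tiles along one axis; the last tile sits flush with the edge."""
    if size <= tile:
        return [0]
    origins = list(range(0, size - tile + 1, stride))
    if origins[-1] != size - tile:
        origins.append(size - tile)
    return origins


def fragment_box(box: Box, width: int, height: int, tile: int, stride: int) -> List[Box]:
    """Clip one box against every tile it overlaps and return the clipped pieces."""
    x0, y0, x1, y1 = box
    pieces = []
    for ty in tile_origins(height, tile, stride):
        for tx in tile_origins(width, tile, stride):
            cx0, cy0 = max(x0, tx), max(y0, ty)
            cx1, cy1 = min(x1, tx + tile), min(y1, ty + tile)
            if cx0 < cx1 and cy0 < cy1:  # keep only non-empty intersections
                pieces.append((cx0, cy0, cx1, cy1))
    return pieces


# Hypothetical example: a 3,000-pixel-long bridge in a 4,096*4,096 image,
# cropped into non-overlapping 1,024*1,024 tiles.
bridge = (500, 2000, 3500, 2040)
pieces = fragment_box(bridge, width=4096, height=4096, tile=1024, stride=1024)
for piece in pieces:
    print(piece)
```

In this example no single tile contains the whole bridge, so a detector trained on the tiles only ever sees partial labels. Overlapping tiles (a stride smaller than the tile size) reduce the problem but cannot remove it for objects longer than one tile.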
Related papers
- Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z)
- PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of HRSOD datasets, we thoughtfully collect a large-scale high-resolution salient object detection dataset, called UHRSD.
All the images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z) - SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Pyramid Grafting Network for One-Stage High Resolution Saliency
Detection [29.013012579688347]
We propose a one-stage framework called Pyramid Grafting Network (PGNet) to extract features from different resolution images independently.
An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically.
We contribute a new Ultra-High-Resolution Saliency Detection dataset UHRSD, containing 5,920 images at 4K-8K resolutions.
arXiv Detail & Related papers (2022-04-11T12:22:21Z) - Learning Efficient Representations for Enhanced Object Detection on
Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv Detail & Related papers (2022-01-22T03:25:24Z) - RGB-D Saliency Detection via Cascaded Mutual Information Minimization [122.8879596830581]
Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning.
We introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
arXiv Detail & Related papers (2021-09-15T12:31:27Z) - Locality-Aware Rotated Ship Detection in High-Resolution Remote Sensing
Imagery Based on Multi-Scale Convolutional Network [7.984128966509492]
We propose a locality-aware rotated ship detection (LARSD) framework based on a multi-scale convolutional neural network (CNN).
The proposed framework applies a UNet-like multi-scale CNN to generate multi-scale feature maps with high-level information in high resolution.
To enlarge the detection dataset, we build a new high-resolution ship detection (HRSD) dataset, where 2499 images and 9269 instances were collected from Google Earth with different resolutions.
arXiv Detail & Related papers (2020-07-24T03:01:42Z) - Bifurcated backbone strategy for RGB-D salient object detection [168.19708737906618]
We leverage the inherent multi-modal and multi-level nature of RGB-D salient object detection to devise a novel cascaded refinement network.
Our architecture, named Bifurcated Backbone Strategy Network (BBS-Net), is simple, efficient, and backbone-independent.
arXiv Detail & Related papers (2020-07-06T13:01:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.