Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images
- URL: http://arxiv.org/abs/2410.23991v1
- Date: Thu, 31 Oct 2024 14:50:48 GMT
- Title: Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images
- Authors: Yakun Xie, Suning Liu, Hongyu Chen, Shaohan Cao, Huixin Zhang, Dejun Feng, Qian Wan, Jun Zhu, Qing Zhu,
- Abstract summary: We propose a stronger multifaceted collaborative salient object detector in ORSIs, termed LBA-MCNet.
The network focuses on accurately locating targets, balancing detailed features, and modeling image-level global context information.
- Score: 24.06927394483275
- License:
- Abstract: Despite significant advancements in salient object detection(SOD) in optical remote sensing images(ORSI), challenges persist due to the intricate edge structures of ORSIs and the complexity of their contextual relationships. Current deep learning approaches encounter difficulties in accurately identifying boundary features and lack efficiency in collaboratively modeling the foreground and background by leveraging contextual features. To address these challenges, we propose a stronger multifaceted collaborative salient object detector in ORSIs, termed LBA-MCNet, which incorporates aspects of localization, balance, and affinity. The network focuses on accurately locating targets, balancing detailed features, and modeling image-level global context information. Specifically, we design the Edge Feature Adaptive Balancing and Adjusting(EFABA) module for precise edge localization, using edge features to guide attention to boundaries and preserve spatial details. Moreover, we design the Global Distributed Affinity Learning(GDAL) module to model global context. It captures global context by generating an affinity map from the encoders final layer, ensuring effective modeling of global patterns. Additionally, deep supervision during deconvolution further enhances feature representation. Finally, we compared with 28 state of the art approaches on three publicly available datasets. The results clearly demonstrate the superiority of our method.
Related papers
- SaliencyI2PLoc: saliency-guided image-point cloud localization using contrastive learning [17.29563451509921]
SaliencyI2PLoc is a contrastive learning architecture that fuses the saliency map into feature aggregation.
Our method achieves a Recall@1 of 78.92% and a Recall@20 of 97.59% on the urban scenario evaluation dataset.
arXiv Detail & Related papers (2024-12-20T05:20:10Z) - World-Consistent Data Generation for Vision-and-Language Navigation [52.08816337783936]
Vision-and-Language Navigation (VLN) is a challenging task that requires an agent to navigate through photorealistic environments following natural-language instructions.
One main obstacle existing in VLN is data scarcity, leading to poor generalization performance over unseen environments.
We propose the world-consistent data generation (WCGEN), an efficacious data-augmentation framework satisfying both diversity and world-consistency.
arXiv Detail & Related papers (2024-12-09T11:40:54Z) - Point Cloud Understanding via Attention-Driven Contrastive Learning [64.65145700121442]
Transformer-based models have advanced point cloud understanding by leveraging self-attention mechanisms.
PointACL is an attention-driven contrastive learning framework designed to address these limitations.
Our method employs an attention-driven dynamic masking strategy that guides the model to focus on under-attended regions.
arXiv Detail & Related papers (2024-11-22T05:41:00Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) and a local correspondence modeling (LCM)
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on some large datasets (about 8k-200k images)
arXiv Detail & Related papers (2022-04-19T14:32:41Z) - DenseGAP: Graph-Structured Dense Correspondence Learning with Anchor
Points [15.953570826460869]
Establishing dense correspondence between two images is a fundamental computer vision problem.
We introduce DenseGAP, a new solution for efficient Dense correspondence learning with a Graph-structured neural network conditioned on Anchor Points.
Our method advances the state-of-the-art of correspondence learning on most benchmarks.
arXiv Detail & Related papers (2021-12-13T18:59:30Z) - Perception-and-Regulation Network for Salient Object Detection [8.026227647732792]
We propose a novel global attention unit that adaptively regulates the feature fusion process by explicitly modeling interdependencies between features.
The perception part uses the structure of fully-connected layers in classification networks to learn the size and shape of objects.
An imitating eye observation module (IEO) is further employed for improving the global perception ability of the network.
arXiv Detail & Related papers (2021-07-27T02:38:40Z) - Global Context Aware RCNN for Object Detection [1.1939762265857436]
We propose a novel end-to-end trainable framework, called Global Context Aware (GCA) RCNN.
The core component of GCA framework is a context aware mechanism, in which both global feature pyramid and attention strategies are used for feature extraction and feature refinement.
In the end, we also present a lightweight version of our method, which only slightly increases model complexity and computational burden.
arXiv Detail & Related papers (2020-12-04T14:56:46Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.