Progressive Self-Guided Loss for Salient Object Detection
- URL: http://arxiv.org/abs/2101.02412v1
- Date: Thu, 7 Jan 2021 07:33:38 GMT
- Title: Progressive Self-Guided Loss for Salient Object Detection
- Authors: Sheng Yang, Weisi Lin, Guosheng Lin, Qiuping Jiang, Zichuan Liu
- Abstract summary: We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
- Score: 102.35488902433896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple yet effective progressive self-guided loss function to
facilitate deep learning-based salient object detection (SOD) in images. The
saliency maps produced by the most relevant works still suffer from incomplete
predictions due to the internal complexity of salient objects. Our proposed
progressive self-guided loss simulates a morphological closing operation on the
model predictions, creating progressive, auxiliary training supervisions epoch
by epoch that guide the training process step by step. We demonstrate that
this new loss function can guide the SOD model to highlight more complete
salient objects step-by-step and meanwhile help to uncover the spatial
dependencies of the salient object pixels in a region growing manner. Moreover,
a new feature aggregation module is proposed to capture multi-scale features
and aggregate them adaptively by a branch-wise attention mechanism. Benefiting
from this module, our SOD framework takes advantage of adaptively aggregated
multi-scale features to locate and detect salient objects effectively.
Experimental results on several benchmark datasets show that our loss function
not only advances the performance of existing SOD models without architecture
modification but also helps our proposed framework to achieve state-of-the-art
performance.
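The core idea, as the abstract describes it, is to derive auxiliary supervision by morphologically closing the model's own predictions, so that partially detected objects are progressively completed. Below is a minimal NumPy sketch of that idea; the function names, the 3x3 kernel size `k`, the weighting `alpha`, and the intersection with the ground-truth mask are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element (sliding max)."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant")
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]])
    return out

def erode(mask, k=3):
    """Binary erosion via the complement (zero padding => the image border is not eroded)."""
    return 1 - dilate(1 - mask, k)

def closing(mask, k=3):
    """Morphological closing = dilation followed by erosion; fills small holes and gaps."""
    return erode(dilate(mask, k), k)

def self_guided_target(pred, gt, thresh=0.5, k=3):
    """Auxiliary supervision: close the binarized prediction, but never label a
    pixel salient that the ground truth marks as background."""
    closed = closing((pred > thresh).astype(np.float32), k)
    return np.minimum(closed, gt)  # stay inside the GT mask

def bce(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy, averaged over the map."""
    p = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def progressive_loss(pred, gt, alpha=0.5):
    """Total loss = ground-truth supervision + weighted auxiliary self-guided term."""
    return bce(pred, gt) + alpha * bce(pred, self_guided_target(pred, gt))
```

In an actual training loop the auxiliary target would be regenerated each epoch from the current predictions, which is what makes the guidance "progressive": as predictions improve, the closed targets cover more of the object.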
Related papers
- Visual Grounding with Attention-Driven Constraint Balancing [19.30650183073788]
We propose Attention-Driven Constraint Balancing (AttBalance) to optimize the behavior of visual features within language-relevant regions.
We achieve constant improvements over five different models evaluated on four different benchmarks.
We attain a new state-of-the-art performance by integrating our method into QRNet.
arXiv Detail & Related papers (2024-07-03T16:14:09Z)
- Uncertainty modeling for fine-tuned implicit functions [10.902709236602536]
Implicit functions have become pivotal in computer vision for reconstructing detailed object shapes from sparse views.
We introduce Dropsembles, a novel method for uncertainty estimation in tuned implicit functions.
Our results show that Dropsembles achieve the accuracy and calibration levels of deep ensembles but with significantly less computational cost.
arXiv Detail & Related papers (2024-06-17T20:46:18Z)
- InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images [11.916941756499435]
In this paper, we explore the intricate task of incremental few-shot object detection in remote sensing images.
We introduce a pioneering fine-tuning-based technique, termed InfRS, designed to facilitate the incremental learning of novel classes.
We develop a prototypical calibration strategy based on the Wasserstein distance to mitigate the catastrophic forgetting problem.
arXiv Detail & Related papers (2024-05-18T13:39:50Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
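Using PCA to localize object regions from deep features typically means projecting the spatial feature vectors onto their first principal component and thresholding the projection. A hedged NumPy sketch of that general idea follows; the function name, the zero threshold, and the sign convention are assumptions for illustration, not this paper's exact procedure:

```python
import numpy as np

def pca_localize(feat):
    """Project an (H, W, C) feature map onto its first principal component.
    Positions are split by the sign of the projection; one side corresponds to
    the object region (which side is sign-ambiguous, as PCA directions are)."""
    h, w, c = feat.shape
    x = feat.reshape(-1, c)
    x = x - x.mean(axis=0, keepdims=True)        # center the features
    # first right singular vector = first principal direction
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    proj = x @ vt[0]
    return (proj > 0).reshape(h, w)
```

Because the principal direction's sign is arbitrary, a real pipeline would disambiguate foreground from background with a prior (e.g., the border pixels are background).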
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Boosting Object Representation Learning via Motion and Object Continuity [22.512380611375846]
We propose to exploit object motion and continuity, i.e., objects do not pop in and out of existence.
The resulting Motion and Object Continuity scheme can be instantiated using any baseline object detection model.
Our results show large improvements in the performances of a SOTA model in terms of object discovery, convergence speed and overall latent object representations.
arXiv Detail & Related papers (2022-11-16T09:36:41Z)
- Dissecting Deep Metric Learning Losses for Image-Text Retrieval [8.248111272824326]
Visual-Semantic Embedding (VSE) is a prevalent approach in image-text retrieval by learning a joint embedding space between the image and language modalities.
The triplet loss with hard-negative mining has become the de-facto objective for most VSE methods.
We present a novel Gradient-based Objective AnaLysis framework, or GOAL, to systematically analyze the combinations and reweighting of the gradients in existing DML functions.
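The triplet loss with hard-negative mining mentioned here is the standard VSE++-style objective: for each matched image-caption pair, penalize only the hardest in-batch negative in both retrieval directions. A minimal NumPy sketch (the margin value and function name are illustrative):

```python
import numpy as np

def triplet_hard_negative_loss(img, txt, margin=0.2):
    """Hinge triplet loss with in-batch hardest-negative mining.
    img, txt: L2-normalized embeddings of shape (B, D); row i is a matched pair."""
    sims = img @ txt.T                        # (B, B) cosine similarities
    pos = np.diag(sims)                       # matched-pair similarities
    # mask out the positives before taking the hardest negative
    neg = sims - np.eye(len(sims)) * 1e9
    hard_i2t = neg.max(axis=1)                # hardest caption per image
    hard_t2i = neg.max(axis=0)                # hardest image per caption
    loss = (np.maximum(0, margin + hard_i2t - pos)
            + np.maximum(0, margin + hard_t2i - pos))
    return float(loss.mean())
```

The hinge is zero whenever every positive pair beats its hardest negative by at least the margin, so gradients concentrate on the violating triplets.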
arXiv Detail & Related papers (2022-10-21T06:48:27Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
- EDN: Salient Object Detection via Extremely-Downsampled Network [66.38046176176017]
We introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image.
Experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed.
arXiv Detail & Related papers (2020-12-24T04:23:48Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.