Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection
- URL: http://arxiv.org/abs/2112.03650v2
- Date: Wed, 8 Dec 2021 05:53:01 GMT
- Title: Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection
- Authors: Huajun Zhou and Peijia Chen and Lingxiao Yang and Jianhuang Lai and
Xiaohua Xie
- Abstract summary: We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant improvements over existing USOD methods.
- Score: 54.92703325989853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised Salient Object Detection (USOD) is of paramount significance for
both industrial applications and downstream tasks. Existing deep-learning (DL)
based USOD methods utilize low-quality saliency predictions extracted by
traditional SOD methods as saliency cues, which mainly capture conspicuous
regions in images. Furthermore, they refine these saliency cues with the
assistance of semantic information obtained from models trained with
supervised learning on other related vision tasks. In this work, we
propose a two-stage Activation-to-Saliency (A2S) framework that effectively
generates high-quality saliency cues and uses these cues to train a robust
saliency detector. More importantly, no human annotations are involved in our
framework during the whole training process. In the first stage, we transform a
pretrained network (MoCo v2) to aggregate multi-level features into a single
activation map, where an Adaptive Decision Boundary (ADB) is proposed to assist
the training of the transformed network. To facilitate the generation of
high-quality pseudo labels, we propose a loss function that enlarges the feature
distances between pixels and their means. In the second stage, an Online Label
Rectifying (OLR) strategy updates the pseudo labels during the training process
to reduce the negative impact of distractors. In addition, we construct a
lightweight saliency detector using two Residual Attention Modules (RAMs),
which refine the high-level features using the complementary information in
low-level features, such as edges and colors. Extensive experiments on several
SOD benchmarks show that our framework achieves significant improvements
over existing USOD methods. Moreover, training our framework on 3000
images consumes about 1 hour, which is over 30x faster than previous
state-of-the-art methods.
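
To make the first stage concrete, below is a minimal PyTorch sketch of the loss idea described in the abstract: treating the per-image mean feature as an adaptive decision boundary and enlarging the distance between each pixel's feature and that mean. The function name `adb_loss`, the hinge margin, and all tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): use the per-image mean
# feature as an adaptive decision boundary and push every pixel's feature away
# from that mean, separating salient from non-salient pixels without labels.
import torch
import torch.nn.functional as F

def adb_loss(feats: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """feats: (B, C, H, W) activations aggregated from multiple levels."""
    b, c, h, w = feats.shape
    flat = feats.reshape(b, c, h * w)        # (B, C, H*W)
    mean = flat.mean(dim=2, keepdim=True)    # per-image mean feature (the boundary)
    dist = (flat - mean).norm(dim=1)         # (B, H*W) pixel-to-mean distance
    # Hinge form: the loss vanishes once a pixel is at least `margin` away,
    # so distances are enlarged while training stays bounded.
    return F.relu(margin - dist).mean()

feats = torch.rand(2, 64, 56, 56, requires_grad=True)
adb_loss(feats).backward()
```

Under this reading, thresholding the resulting single-channel activation map at the boundary would yield the binary pseudo labels consumed by the second stage.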
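The Online Label Rectifying step can likewise be pictured as a running blend of the stored pseudo labels with the detector's current predictions; the momentum form below is an assumed realization, chosen only to illustrate how distractors in the initial labels could fade out over training.

```python
# Hypothetical OLR update (the momentum schedule is an assumption): blend the
# stored pseudo label with the detector's current prediction so that regions
# the detector consistently rejects (distractors) are gradually suppressed.
import torch

def rectify_labels(pseudo: torch.Tensor, logits: torch.Tensor,
                   momentum: float = 0.9) -> torch.Tensor:
    """pseudo: (B, 1, H, W) labels in [0, 1]; logits: raw detector outputs."""
    with torch.no_grad():
        return momentum * pseudo + (1.0 - momentum) * torch.sigmoid(logits)
```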
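Finally, the Residual Attention Module is described only at a high level here; the sketch below is one plausible reading, in which low-level features (edges, colors) gate the upsampled high-level features and the refinement is added back residually. The layer choices and channel handling are assumptions.

```python
# One plausible RAM sketch (all layer choices are assumptions): low-level
# features produce an attention map that gates the upsampled high-level
# features, and the gated refinement is added back as a residual.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAM(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.att = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        # Upsample high-level features to the low-level spatial resolution.
        high = F.interpolate(high, size=low.shape[-2:], mode='bilinear',
                             align_corners=False)
        # Low-level attention selects where edge/color cues should refine the
        # semantic features; the residual connection keeps the original content.
        return high + self.fuse(self.att(low) * high)

ram = RAM(64)
out = ram(torch.rand(1, 64, 14, 14), torch.rand(1, 64, 56, 56))
```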
Related papers
- Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images [15.12889076965307]
The YOLOv7 one-stage detector is trained under a novel meta-learning framework.
This transformation allows the detector to address FSOD tasks while capitalizing on its inherent lightweight design.
To validate the effectiveness of our proposed detector, we conducted performance comparisons with current state-of-the-art detectors.
arXiv Detail & Related papers (2024-04-29T04:56:52Z)
- Unified Unsupervised Salient Object Detection via Knowledge Transfer [29.324193170890542]
Unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature.
In this paper, we propose a unified USOD framework for generic USOD tasks.
arXiv Detail & Related papers (2024-04-23T05:50:02Z)
- 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation [92.17700318483745]
We propose an image-guidance network (IGNet) which builds upon the idea of distilling high level feature information from a domain adapted synthetically trained 2D semantic segmentation network.
IGNet achieves state-of-the-art results for weakly-supervised LiDAR semantic segmentation on ScribbleKITTI, reaching up to 98% of the performance of fully supervised training with only 8% labeled points.
arXiv Detail & Related papers (2023-11-27T07:57:29Z)
- Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors [3.9189402702217344]
Few-shot object detection (FSOD) aims to learn novel classes given only a few samples.
We make the observation that the large performance gap between two-stage and one-stage FSODs is mainly due to their weak discriminability.
To address these limitations, we propose the Few-shot RetinaNet (FSRN), which includes a multi-way support training strategy to augment the number of foreground samples for dense meta-detectors.
arXiv Detail & Related papers (2022-10-11T20:58:25Z)
- A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels [96.56299163691979]
This paper focuses on a new weakly-supervised salient object detection (SOD) task under hybrid labels.
To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies.
Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods.
arXiv Detail & Related papers (2022-09-07T06:45:39Z)
- Label, Verify, Correct: A Simple Few Shot Object Detection Method [93.84801062680786]
We introduce a simple pseudo-labelling method to source high-quality pseudo-annotations from a training set.
We present two novel methods to improve the precision of the pseudo-labelling process.
Our method achieves state-of-the-art or second-best performance compared to existing approaches.
arXiv Detail & Related papers (2021-12-10T18:59:06Z)
- Multi-Scale Aligned Distillation for Low-Resolution Detection [68.96325141432078]
This paper focuses on boosting the performance of low-resolution models by distilling knowledge from a high- or multi-resolution model.
On several instance-level detection tasks and datasets, the low-resolution models trained via our approach perform competitively with high-resolution models trained via conventional multi-scale training.
arXiv Detail & Related papers (2021-09-14T12:53:35Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)