Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection
- URL: http://arxiv.org/abs/2112.03650v2
- Date: Wed, 8 Dec 2021 05:53:01 GMT
- Title: Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection
- Authors: Huajun Zhou and Peijia Chen and Lingxiao Yang and Jianhuang Lai and
Xiaohua Xie
- Abstract summary: We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant improvements over existing USOD methods.
- Score: 54.92703325989853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised Salient Object Detection (USOD) is of paramount significance for
both industrial applications and downstream tasks. Existing deep-learning (DL)
based USOD methods utilize low-quality saliency predictions extracted by
traditional SOD methods as saliency cues, which mainly capture conspicuous
regions in images. Furthermore, they refine these saliency cues with the
assistance of semantic information obtained from models trained with
supervised learning on other related vision tasks. In this work, we
propose a two-stage Activation-to-Saliency (A2S) framework that effectively
generates high-quality saliency cues and uses these cues to train a robust
saliency detector. More importantly, no human annotations are involved in our
framework during the whole training process. In the first stage, we transform a
pretrained network (MoCo v2) to aggregate multi-level features into a single
activation map, where an Adaptive Decision Boundary (ADB) is proposed to assist
the training of the transformed network. To facilitate the generation of
high-quality pseudo labels, we propose a loss function that enlarges the feature
distances between pixels and their means. In the second stage, an Online Label
Rectifying (OLR) strategy updates the pseudo labels during the training process
to reduce the negative impact of distractors. In addition, we construct a
lightweight saliency detector using two Residual Attention Modules (RAMs),
which refine the high-level features using the complementary information in
low-level features, such as edges and colors. Extensive experiments on several
SOD benchmarks show that our framework achieves significant improvements
over existing USOD methods. Moreover, training our framework on 3000
images consumes about 1 hour, which is over 30x faster than previous
state-of-the-art methods.
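
To make the first stage concrete, below is a minimal PyTorch sketch of the loss idea described in the abstract: treating the per-image mean feature as an adaptive decision boundary and enlarging the distance between each pixel's feature and that mean. The function name `adb_loss`, the hinge margin, and all tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): use the per-image mean
# feature as an adaptive decision boundary and push every pixel's feature away
# from that mean, separating salient from non-salient pixels without labels.
import torch
import torch.nn.functional as F

def adb_loss(feats: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """feats: (B, C, H, W) activations aggregated from multiple levels."""
    b, c, h, w = feats.shape
    flat = feats.reshape(b, c, h * w)        # (B, C, H*W)
    mean = flat.mean(dim=2, keepdim=True)    # per-image mean feature (the boundary)
    dist = (flat - mean).norm(dim=1)         # (B, H*W) pixel-to-mean distance
    # Hinge form: the loss vanishes once a pixel is at least `margin` away,
    # so distances are enlarged while training stays bounded.
    return F.relu(margin - dist).mean()

feats = torch.rand(2, 64, 56, 56, requires_grad=True)
adb_loss(feats).backward()
```

Under this reading, thresholding the resulting single-channel activation map at the boundary would yield the binary pseudo labels consumed by the second stage.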
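The Online Label Rectifying step can likewise be pictured as a running blend of the stored pseudo labels with the detector's current predictions; the momentum form below is an assumed realization, chosen only to illustrate how distractors in the initial labels could fade out over training.

```python
# Hypothetical OLR update (the momentum schedule is an assumption): blend the
# stored pseudo label with the detector's current prediction so that regions
# the detector consistently rejects (distractors) are gradually suppressed.
import torch

def rectify_labels(pseudo: torch.Tensor, logits: torch.Tensor,
                   momentum: float = 0.9) -> torch.Tensor:
    """pseudo: (B, 1, H, W) labels in [0, 1]; logits: raw detector outputs."""
    with torch.no_grad():
        return momentum * pseudo + (1.0 - momentum) * torch.sigmoid(logits)
```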
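Finally, the Residual Attention Module is described only at a high level here; the sketch below is one plausible reading, in which low-level features (edges, colors) gate the upsampled high-level features and the refinement is added back residually. The layer choices and channel handling are assumptions.

```python
# One plausible RAM sketch (all layer choices are assumptions): low-level
# features produce an attention map that gates the upsampled high-level
# features, and the gated refinement is added back as a residual.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RAM(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.att = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())
        self.fuse = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        # Upsample high-level features to the low-level spatial resolution.
        high = F.interpolate(high, size=low.shape[-2:], mode='bilinear',
                             align_corners=False)
        # Low-level attention selects where edge/color cues should refine the
        # semantic features; the residual connection keeps the original content.
        return high + self.fuse(self.att(low) * high)

ram = RAM(64)
out = ram(torch.rand(1, 64, 14, 14), torch.rand(1, 64, 56, 56))
```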
Related papers
- Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images [15.12889076965307]
The YOLOv7 one-stage detector is trained under a novel meta-learning framework.
This transformation allows the detector to address FSOD tasks while capitalizing on its inherent lightweight design.
To validate the effectiveness of our proposed detector, we conducted performance comparisons with current state-of-the-art detectors.
arXiv Detail & Related papers (2024-04-29T04:56:52Z)
- Unified Unsupervised Salient Object Detection via Knowledge Transfer [29.324193170890542]
Unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature.
In this paper, we propose a unified USOD framework for generic USOD tasks.
arXiv Detail & Related papers (2024-04-23T05:50:02Z)
- 2D Feature Distillation for Weakly- and Semi-Supervised 3D Semantic Segmentation [92.17700318483745]
We propose an image-guidance network (IGNet) which builds upon the idea of distilling high level feature information from a domain adapted synthetically trained 2D semantic segmentation network.
IGNet achieves state-of-the-art results for weakly-supervised LiDAR semantic segmentation on ScribbleKITTI, reaching up to 98% of the performance of fully supervised training with only 8% labeled points.
arXiv Detail & Related papers (2023-11-27T07:57:29Z)
- Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors [3.9189402702217344]
Few-shot object detection (FSOD) aims to learn novel classes given only a few samples.
We make the observation that the large performance gap between two-stage and one-stage FSODs is mainly due to their weak discriminability.
To address these limitations, we propose the Few-shot RetinaNet (FSRN), which includes a multi-way support training strategy to augment the number of foreground samples for dense meta-detectors.
arXiv Detail & Related papers (2022-10-11T20:58:25Z)
- A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels [96.56299163691979]
This paper focuses on a new weakly-supervised salient object detection (SOD) task under hybrid labels.
To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies.
Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods.
arXiv Detail & Related papers (2022-09-07T06:45:39Z)
- Label, Verify, Correct: A Simple Few Shot Object Detection Method [93.84801062680786]
We introduce a simple pseudo-labelling method to source high-quality pseudo-annotations from a training set.
We present two novel methods to improve the precision of the pseudo-labelling process.
Our method achieves state-of-the-art or second-best performance compared to existing approaches.
arXiv Detail & Related papers (2021-12-10T18:59:06Z)
- Multi-Scale Aligned Distillation for Low-Resolution Detection [68.96325141432078]
This paper focuses on boosting the performance of low-resolution models by distilling knowledge from a high- or multi-resolution model.
On several instance-level detection tasks and datasets, the low-resolution models trained via our approach perform competitively with high-resolution models trained via conventional multi-scale training.
arXiv Detail & Related papers (2021-09-14T12:53:35Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)