Two-phase weakly supervised object detection with pseudo ground truth
mining
- URL: http://arxiv.org/abs/2104.00231v1
- Date: Thu, 1 Apr 2021 03:21:24 GMT
- Title: Two-phase weakly supervised object detection with pseudo ground truth
mining
- Authors: Jun Wang
- Abstract summary: Weakly Supervised Object Detection (WSOD), which aims to train detectors with only image-level annotations, has attracted increasing attention from researchers.
In this project, we focus on two-phase WSOD architecture which integrates a powerful detector with a pure WSOD model.
We explore the effectiveness of some representative detectors utilized as the second-phase detector in two-phase WSOD and propose a two-phase WSOD architecture.
- Score: 8.227822364332814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly Supervised Object Detection (WSOD), which aims to train detectors with
only image-level annotations, has attracted increasing attention from researchers. In
this project, we focus on two-phase WSOD architecture which integrates a
powerful detector with a pure WSOD model. We explore the effectiveness of some
representative detectors utilized as the second-phase detector in two-phase
WSOD and propose a two-phase WSOD architecture. In addition, we present a
strategy to establish the pseudo ground truth (PGT) used to train the
second-phase detector. Unlike previous works that regard only the top-scoring bounding box
as PGT, we consider more bounding boxes to establish the PGT annotations. This
alleviates the insufficient learning problem caused by the low recall of PGT.
We also propose some strategies to refine the PGT during the training of the
second detector. Our strategies suspend the training at a specific epoch, then
refine the PGT by the outputs of the second-phase detector. After that, the
algorithm continues the training with the same gradients and weights as those
before suspending. Extensive experiments are conducted on the PASCAL VOC 2007
dataset to verify the effectiveness of our methods. As the results demonstrate, our
two-phase architecture improves the mAP from 49.17% to 53.21% compared with the
single PCL model. Additionally, the best PGT generation strategy obtains a 0.7%
mAP increment. Our best refinement strategy boosts the performance by 1.74%
mAP. The best result, which combines all of our methods, achieves 55.231% mAP, which is
state-of-the-art performance.
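The PGT mining idea in the abstract (keeping more than the single top-scoring box per class) can be illustrated with a minimal sketch. The abstract does not specify the selection rule, so everything here is an assumption for illustration only: the function name `mine_pgt`, the relative score threshold `rel_thresh`, the duplicate-suppression threshold `iou_thresh`, and the input layout are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mine_pgt(
    detections: Dict[str, List[Tuple[Box, float]]],
    image_labels: List[str],
    rel_thresh: float = 0.8,  # hypothetical: keep boxes scoring >= 0.8 * top score
    iou_thresh: float = 0.3,  # hypothetical: drop boxes overlapping a kept box
) -> Dict[str, List[Box]]:
    """Select pseudo ground truth boxes from first-phase WSOD outputs.

    Only classes present in the image-level labels are considered. For each
    such class, boxes are visited in descending score order; a box is kept if
    its score is close to the top score and it does not heavily overlap a box
    that was already kept.
    """
    pgt: Dict[str, List[Box]] = {}
    for cls in image_labels:
        cands = sorted(detections.get(cls, []), key=lambda d: d[1], reverse=True)
        if not cands:
            continue
        top_score = cands[0][1]
        kept: List[Box] = []
        for box, score in cands:
            if score < rel_thresh * top_score:
                break  # remaining candidates score even lower
            if all(iou(box, k) <= iou_thresh for k in kept):
                kept.append(box)
        pgt[cls] = kept
    return pgt


# Toy first-phase outputs for one image labelled "dog" (made-up numbers).
detections = {
    "dog": [
        ((10, 10, 100, 100), 0.90),
        ((12, 12, 98, 102), 0.85),   # near-duplicate of the top box
        ((200, 50, 300, 150), 0.80), # a second, separate instance
        ((0, 0, 20, 20), 0.20),      # low-confidence box
    ],
    "cat": [((5, 5, 50, 50), 0.70)],  # class not in the image-level label
}
pgt = mine_pgt(detections, image_labels=["dog"])
```

With a top-one rule only the first "dog" box would become PGT; under the assumed thresholds the second instance is also retained, which is the kind of recall improvement the abstract describes.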
Related papers
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z)
- Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images [15.12889076965307]
The YOLOv7 one-stage detector is subjected to a novel meta-learning training framework.
This transformation allows the detector to adeptly address FSOD tasks while capitalizing on its inherent lightweight advantage.
To validate the effectiveness of our proposed detector, we conducted performance comparisons with current state-of-the-art detectors.
arXiv Detail & Related papers (2024-04-29T04:56:52Z)
- Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing [0.7305342793164903]
We propose a model simplification method for two-stage object detectors.
Our method reduces computation costs by up to 61.2% with an accuracy loss within 2.1% on the DOTAv1.5 dataset.
arXiv Detail & Related papers (2024-04-11T00:45:10Z)
- Advanced Efficient Strategy for Detection of Dark Objects Based on Spiking Network with Multi-Box Detection [2.9659663708260777]
The study proposes a combination of spiked and normal convolution layers as an energy-efficient and reliable object detector model.
With state-of-the-art Python libraries, spike layers can be trained efficiently.
The proposed spike convolutional object detector (SCOD) has been evaluated on VOC and Ex-Dark datasets.
arXiv Detail & Related papers (2023-10-10T07:20:37Z)
- Enhancing Infrared Small Target Detection Robustness with Bi-Level Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme improves IoU by 21.96% across a wide array of corruptions and by 4.97% on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z)
- Boosting the Efficiency of Parametric Detection with Hierarchical Neural Networks [4.1410005218338695]
We propose Hierarchical Detection Network (HDN), a novel approach to efficient detection.
The network is trained using a novel loss function, which encodes simultaneously the goals of statistical accuracy and efficiency.
We show how training a three-layer HDN using a two-layer model can further boost both accuracy and efficiency.
arXiv Detail & Related papers (2022-07-23T19:23:00Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- PGTRNet: Two-phase Weakly Supervised Object Detection with Pseudo Ground Truth Refining [10.262660606897974]
Weakly Supervised Object Detection (WSOD), which aims to train detectors with only image-level annotations, has attracted increasing attention.
Current state-of-the-art approaches mainly follow a two-stage training strategy which integrates a fully supervised detector (FSD) with a pure WSOD model.
Two main problems hinder the performance of two-phase WSOD approaches: the insufficient learning problem, and the strict reliance of the FSD on the pseudo ground truth generated by the WSOD model.
This paper proposes pseudo ground truth refinement network (PGTRNet), a simple yet effective method
arXiv Detail & Related papers (2021-08-25T19:20:49Z)
- Disentangle Your Dense Object Detector [82.22771433419727]
Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors is compromised to lots of conjunctions that may not hold.
We propose Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into the current state-of-the-art detectors.
arXiv Detail & Related papers (2021-07-07T00:52:16Z)
- Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z)