PGTRNet: Two-phase Weakly Supervised Object Detection with Pseudo Ground
Truth Refining
- URL: http://arxiv.org/abs/2108.11439v1
- Date: Wed, 25 Aug 2021 19:20:49 GMT
- Title: PGTRNet: Two-phase Weakly Supervised Object Detection with Pseudo Ground
Truth Refining
- Authors: Jun Wang, Hefeng Zhou, Xiaohan Yu
- Abstract summary: Weakly Supervised Object Detection (WSOD), which aims to train detectors with only image-level annotations, has attracted increasing attention.
Current state-of-the-art approaches mainly follow a two-stage training strategy which integrates a fully supervised detector (FSD) with a pure WSOD model.
Two main problems hinder the performance of two-phase WSOD approaches: the insufficient learning problem and the strict reliance between the FSD and the pseudo ground truth generated by the WSOD model.
This paper proposes pseudo ground truth refinement network (PGTRNet), a simple yet effective method
- Score: 10.262660606897974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weakly Supervised Object Detection (WSOD), aiming to train detectors with
only image-level annotations, has attracted increasing attention. Current
state-of-the-art approaches mainly follow a two-stage training strategy
which integrates a fully supervised detector (FSD) with a pure WSOD model. There
are two main problems hindering the performance of the two-phase WSOD
approaches, i.e., the insufficient learning problem and strict reliance between the
FSD and the pseudo ground truth (PGT) generated by the WSOD model. This paper
proposes pseudo ground truth refinement network (PGTRNet), a simple yet
effective method without introducing any extra learnable parameters, to cope
with these problems. PGTRNet utilizes multiple bounding boxes to establish the
PGT, mitigating the insufficient learning problem. Besides, we propose a novel
online PGT refinement approach to steadily improve the quality of PGT by fully
taking advantage of the power of FSD during the second-phase training,
decoupling the first and second-phase models. Elaborate experiments are
conducted on the PASCAL VOC 2007 benchmark to verify the effectiveness of our
methods. Experimental results demonstrate that PGTRNet boosts the backbone
model by 2.074% mAP and achieves the state-of-the-art performance, showing the
significant potentials of the second-phase training.
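The abstract says that PGTRNet builds the PGT from multiple bounding boxes rather than a single one, but it does not give the selection rule. The following is a minimal sketch of one plausible way to do that, not the authors' procedure: for one class in one image, keep every first-phase detection whose score is close to the top score, and drop near-duplicates by IoU. The function names and the `score_ratio` and `iou_thresh` parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_ground_truth(
    boxes: List[Box],
    scores: List[float],
    score_ratio: float = 0.8,  # hypothetical: keep boxes scoring >= 80% of the best
    iou_thresh: float = 0.5,   # hypothetical: overlap above this counts as a duplicate
) -> List[Box]:
    """Pick several boxes of one class as pseudo ground truth (illustrative only)."""
    if not boxes:
        return []
    # Visit detections from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    top_score = scores[order[0]]
    kept: List[Box] = []
    for i in order:
        if scores[i] < score_ratio * top_score:
            break  # all remaining boxes score too low
        # Keep the box only if it does not duplicate one already kept.
        if all(iou(boxes[i], k) < iou_thresh for k in kept):
            kept.append(boxes[i])
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (100, 100, 110, 110)]
    demo_scores = [0.9, 0.85, 0.8, 0.1]
    print(select_pseudo_ground_truth(demo_boxes, demo_scores))
```

In this sketch, a second instance of the same class survives as long as its score is close to the best one, which is one way an image with several objects could contribute more than one training target to the second-phase detector.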
Related papers
- The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z) - Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement [25.11883761217408]
Remote photoplethysmography (rPPG) is gaining prominence for its non-invasive approach to monitoring physiological signals using only cameras.
Despite its promise, the adaptability of rPPG models to new domains is hindered due to the environmental sensitivity of physiological signals.
We present Bi-TTA, a novel expert knowledge-based Bidirectional Test-Time Adapter framework.
arXiv Detail & Related papers (2024-09-25T19:55:20Z) - ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Reinforcement Learning from Demonstrations by Novel Interactive Expert
and Application to Automatic Berthing Control Systems for Unmanned Surface
Vessel [12.453219390225428]
Two novel practical methods of Reinforcement Learning from Demonstration (RLfD) are developed and applied to automatic berthing control systems for Unmanned Surface Vessel.
A new expert data generation method, called Model Predictive Based Expert (MPBE), is developed to provide high quality supervision data for RLfD algorithms.
Another novel RLfD algorithm based on the MP-DDPG, called Self-Guided Actor-Critic (SGAC), is presented, which can effectively leverage MPBE by continuously querying it to generate high quality expert data online.
arXiv Detail & Related papers (2022-02-23T06:45:59Z) - Activation to Saliency: Forming High-Quality Labels for Unsupervised
Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z) - Two-phase weakly supervised object detection with pseudo ground truth
mining [8.227822364332814]
Weakly Supervised Object Detection (WSOD), which aims to train detectors with only image-level annotations, has attracted increasing attention from researchers.
In this project, we focus on two-phase WSOD architecture which integrates a powerful detector with a pure WSOD model.
We explore the effectiveness of some representative detectors utilized as the second-phase detector in two-phase WSOD and propose a two-phase WSOD architecture.
arXiv Detail & Related papers (2021-04-01T03:21:24Z) - Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z) - Two-stage Deep Reinforcement Learning for Inverter-based Volt-VAR
Control in Active Distribution Networks [3.260913246106564]
We propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources.
In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch.
In the sequential online stage, we transfer the offline agent safely as the online agent to perform continuous learning and controlling online with significantly improved safety and efficiency.
arXiv Detail & Related papers (2020-05-20T08:02:13Z)