Semi-Weakly Supervised Object Detection by Sampling Pseudo Ground-Truth Boxes
- URL: http://arxiv.org/abs/2204.00147v1
- Date: Fri, 1 Apr 2022 00:44:42 GMT
- Title: Semi-Weakly Supervised Object Detection by Sampling Pseudo Ground-Truth Boxes
- Authors: Akhil Meethal, Marco Pedersoli, Zhongwen Zhu, Francisco Perdigon
Romero, and Eric Granger
- Abstract summary: This paper introduces a weakly semi-supervised training method for object detection.
It achieves state-of-the-art performance by leveraging only a small fraction of fully-labeled images with information in weakly-labeled images.
In particular, our generic sampling-based learning strategy produces pseudo-ground-truth (GT) bounding box annotations in an online fashion.
- Score: 9.827002225566073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi- and weakly-supervised learning have recently attracted considerable
attention in the object detection literature since they can alleviate the cost
of annotation needed to successfully train deep learning models. State-of-the-art
approaches for semi-supervised learning rely on student-teacher models trained
using a multi-stage process, and considerable data augmentation. Custom
networks have been developed for the weakly-supervised setting, making it
difficult to adapt to different detectors. In this paper, a weakly
semi-supervised training method is introduced that reduces these training
challenges, yet achieves state-of-the-art performance by leveraging only a
small fraction of fully-labeled images with information in weakly-labeled
images. In particular, our generic sampling-based learning strategy produces
pseudo-ground-truth (GT) bounding box annotations in an online fashion,
eliminating the need for multi-stage training, and student-teacher network
configurations. These pseudo GT boxes are sampled from weakly-labeled images
based on the categorical score of object proposals accumulated via a score
propagation process. Empirical results on the Pascal VOC dataset indicate that
the proposed approach improves performance by 5.0% when using VOC 2007 as the
fully-labeled data and VOC 2012 as the weakly-labeled data. Also, with 5-10% of
the images fully annotated, we observed an improvement of more than 10% in mAP,
showing that a modest investment in image-level annotation can substantially
improve detection performance.
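The abstract ships no code, but the sampling strategy it describes reads naturally as a short sketch. The following is a minimal NumPy illustration, assuming the score-propagation step has already accumulated per-proposal categorical scores; all names (sample_pseudo_gt_boxes, num_samples, and so on) are ours, not the authors':
```python
import numpy as np

def sample_pseudo_gt_boxes(proposals, class_scores, image_labels,
                           num_samples=1, rng=None):
    """Sample pseudo ground-truth boxes for one weakly-labeled image.

    proposals    : (N, 4) candidate boxes as [x1, y1, x2, y2]
    class_scores : (N, C) per-proposal categorical scores, assumed here to
                   be the output of the upstream score-propagation step
    image_labels : class indices known to be present from image-level tags
    """
    rng = rng or np.random.default_rng()
    boxes, labels = [], []
    for c in image_labels:
        scores = class_scores[:, c]
        total = scores.sum()
        # Treat the accumulated scores as a sampling distribution so that
        # high-scoring proposals are more likely to become pseudo GT boxes
        # (fall back to uniform sampling if all scores are zero).
        probs = scores / total if total > 0 else None
        picked = rng.choice(len(proposals), size=num_samples, p=probs)
        boxes.append(proposals[picked])
        labels.extend([c] * num_samples)
    return np.concatenate(boxes, axis=0), np.asarray(labels)
```
Because the boxes are sampled online for each weakly-labeled image, they can be mixed into the same training batches as the fully-labeled images, which is what removes the need for multi-stage training and student-teacher configurations.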
Related papers
- Robust Noisy Label Learning via Two-Stream Sample Distillation [48.73316242851264]
Noisy label learning aims to learn robust networks under the supervision of noisy labels.
We design a simple yet effective sample selection framework, termed Two-Stream Sample Distillation (TSSD).
This framework can extract more high-quality samples with clean labels to improve the robustness of network training.
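The summary does not state the selection rule, so the sketch below illustrates one common reading of two-stream selection: keep only the samples that both independently trained networks consider low-loss (a small-loss heuristic often used in noisy-label learning; the actual TSSD criterion may differ):
```python
import numpy as np

def select_clean_samples(losses_a, losses_b, keep_ratio=0.5):
    """Keep samples that both streams agree are low-loss (hypothetical rule).

    losses_a, losses_b : (N,) per-sample losses from two independent networks
    keep_ratio         : fraction of samples each stream is allowed to keep
    """
    k = int(len(losses_a) * keep_ratio)
    low_a = set(np.argsort(losses_a)[:k])  # k smallest losses in stream A
    low_b = set(np.argsort(losses_b)[:k])  # k smallest losses in stream B
    return sorted(low_a & low_b)           # indices both streams trust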
arXiv Detail & Related papers (2024-04-16T12:18:08Z)
- Semi-Supervised Learning for hyperspectral images by non parametrically predicting view assignment [25.198550162904713]
Hyperspectral image (HSI) classification is currently gaining momentum because of the rich spectral information inherent in the images.
Recently, to train deep learning models effectively with minimal labelled samples, unlabeled samples have also been leveraged in self-supervised and semi-supervised settings.
In this work, we leverage the idea of semi-supervised learning to assist the discriminative self-supervised pretraining of the models.
arXiv Detail & Related papers (2023-06-19T14:13:56Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- STEdge: Self-training Edge Detection with Multi-layer Teaching and Regularization [15.579360385857129]
We study the problem of self-training edge detection, leveraging the untapped wealth of large-scale unlabeled image datasets.
We design a self-supervised framework with multi-layer regularization and self-teaching.
Our method attains 4.8% improvement for ODS and 5.8% for OIS when tested on the unseen BIPED dataset.
arXiv Detail & Related papers (2022-01-13T18:26:36Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant performance gains compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
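As a minimal sketch of cross-modality region contrastive learning (our own simplification, not the paper's implementation), an InfoNCE-style loss can pull each region embedding toward its paired text embedding and push it away from the other pairs in the batch:
```python
import torch
import torch.nn.functional as F

def region_contrastive_loss(region_emb, text_emb, temperature=0.07):
    """InfoNCE over matched region/text pairs (illustrative only).

    region_emb, text_emb : (B, D) embeddings where row i of each is a pair
    """
    region_emb = F.normalize(region_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = region_emb @ text_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)
    # Symmetric loss: region-to-text and text-to-region directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```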
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can help models achieve better performance as well as generalizability.
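A mutual calibration loop can be pictured as pseudo labels and predictions repeatedly correcting each other; the update below is a hypothetical simplification of that idea, not the authors' actual loss design:
```python
import numpy as np

def calibrate_step(pseudo, pred, momentum=0.7, threshold=0.5):
    """One round of a mutual calibration loop (hypothetical simplification).

    pseudo : (H, W) current pseudo saliency labels in [0, 1]
    pred   : (H, W) current network prediction in [0, 1]
    """
    # Pseudo labels drift toward the network's prediction...
    blended = momentum * pseudo + (1.0 - momentum) * pred
    # ...and are re-binarized so the next training round sees hard targets
    # that the network is, in turn, trained against.
    return (blended > threshold).astype(np.float32)
```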
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
- Semi-weakly Supervised Contrastive Representation Learning for Retinal Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images.
We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method significantly improves performance compared with the supervised method learned from labeled data only.
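The Mean Teacher component is a standard technique in which the teacher's weights track an exponential moving average (EMA) of the student's; a minimal PyTorch sketch of that update follows (the mask guidance and perturbation-sensitive sample mining are specific to this paper and omitted here):
```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Update teacher weights as an EMA of the student weights."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)
```
Here teacher and student are two instances of the same network, and ema_update(teacher, student) is called after every student optimization step.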
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
- Empirical Perspectives on One-Shot Semi-supervised Learning [0.0]
One of the greatest obstacles in the adoption of deep neural networks for new applications is that training the network typically requires a large number of manually labeled training samples.
We empirically investigate the scenario where one has access to large amounts of unlabeled data but requires labeling only a single sample per class in order to train a deep network.
arXiv Detail & Related papers (2020-04-08T17:51:06Z)