WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection
- URL: http://arxiv.org/abs/2105.11293v1
- Date: Fri, 21 May 2021 11:58:50 GMT
- Title: WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection
- Authors: Shijie Fang, Yuhang Cao, Xinjiang Wang, Kai Chen, Dahua Lin, Wayne
Zhang
- Abstract summary: We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to that obtained in fully-supervised settings.
- Score: 75.80075054706079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of object detection, to a great extent, depends on the
availability of large annotated datasets. To alleviate the annotation cost, the
research community has explored a number of ways to exploit unlabeled or weakly
labeled data. However, such efforts have met with limited success so far. In
this work, we revisit the problem from a pragmatic standpoint, aiming to
explore a new balance between detection performance and annotation cost by
jointly exploiting fully and weakly annotated data. Specifically, we propose a
weakly- and semi-supervised object detection framework (WSSOD), which involves
a two-stage learning procedure. An agent detector is first trained on a joint
dataset and then used to predict pseudo bounding boxes on weakly-annotated
images. The underlying assumptions in the current as well as common
semi-supervised pipelines are also carefully examined under a unified EM
formulation. On top of this framework, weakly-supervised loss (WSL), label
attention and random pseudo-label sampling (RPS) strategies are introduced to
relax these assumptions, further improving the efficacy of the
detection pipeline. The proposed framework demonstrates remarkable performance
on the PASCAL-VOC and MSCOCO benchmarks, achieving performance comparable to
that obtained in fully-supervised settings, with only one third of the
annotations.
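
The two-stage procedure described above lends itself to a compact illustration. The following is a minimal, hypothetical Python sketch of the second stage under stated assumptions: the agent detector's predictions on a weakly-annotated image are filtered against its image-level labels and then randomly subsampled, in the spirit of the random pseudo-label sampling (RPS) strategy. All names (PseudoBox, WeakImage, filter_and_sample, score_thresh, keep_ratio) are illustrative placeholders, not the authors' actual code.

```python
"""Hypothetical sketch of the stage-2 pseudo-labeling step; names are
illustrative assumptions, not the WSSOD authors' implementation."""
import random
from dataclasses import dataclass, field


@dataclass
class PseudoBox:
    label: str      # predicted class
    score: float    # detector confidence
    xyxy: tuple     # box coordinates


@dataclass
class WeakImage:
    weak_labels: set                           # image-level class tags only
    boxes: list = field(default_factory=list)  # pseudo boxes to be filled in


def filter_and_sample(predictions, weak_labels, score_thresh=0.7, keep_ratio=0.5):
    """Keep confident boxes consistent with the image-level labels, then apply
    random pseudo-label sampling (RPS) so noisy pseudo boxes do not dominate
    the retraining signal."""
    kept = [b for b in predictions
            if b.score >= score_thresh and b.label in weak_labels]
    n_keep = max(1, int(len(kept) * keep_ratio)) if kept else 0
    return random.sample(kept, n_keep)


# Usage sketch. Stage 1 (not shown) would train the agent detector on the
# joint dataset of fully- and weakly-annotated images.
img = WeakImage(weak_labels={"dog", "person"})
agent_predictions = [
    PseudoBox("dog", 0.91, (10, 20, 80, 120)),
    PseudoBox("cat", 0.88, (5, 5, 40, 60)),        # filtered: not a weak label
    PseudoBox("person", 0.45, (50, 30, 90, 150)),  # filtered: low confidence
]
img.boxes = filter_and_sample(agent_predictions, img.weak_labels)
print(img.boxes)  # only confident, weak-label-consistent boxes remain
```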
Related papers
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding (2024-07-03) [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
However, this approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
- Weakly-Supervised Cross-Domain Segmentation of Electron Microscopy with Sparse Point Annotation (2024-03-31) [1.124958340749622]
We introduce a multitask learning framework to leverage correlations among the counting, detection, and segmentation tasks.
We develop a cross-position cut-and-paste for label augmentation and an entropy-based pseudo-label selection.
The proposed model significantly outperforms UDA methods and produces performance comparable to its supervised counterpart.
- Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points (2024-03-28) [2.2241974678268903]
We consider the generation of weakly supervised pseudo labels as the result of the model's sparse output.
We propose a method called Sparse Generation to make pseudo labels sparse.
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning (2023-08-18) [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks SODA-D and SODA-A.
- LEDetection: A Simple Framework for Semi-Supervised Few-Shot Object Detection (2023-03-10) [4.3512163406552]
This paper studies the new task of semi-supervised FSOD by considering a realistic scenario in which both base and novel labels are simultaneously scarce.
We introduce SoftER Teacher, a robust detector combining pseudo-labeling with consistency learning on region proposals.
Rigorous experiments show that SoftER Teacher surpasses the novel performance of a strong supervised detector using only 10% of the required base labels.
- SIOD: Single Instance Annotated Per Category Per Image for Object Detection (2022-03-29) [67.64774488115299]
We propose Single Instance annotated Object Detection (SIOD), requiring only one instance annotation for each existing category in an image.
Degraded from inter-task (WSOD) or inter-image (SSOD) discrepancies to the intra-image discrepancy, SIOD provides more reliable and rich prior knowledge for mining the remaining unlabeled instances.
Under the SIOD setting, we propose a simple yet effective framework, termed Dual-Mining (DMiner), which consists of a Similarity-based Pseudo Label Generating module (SPLG) and a Pixel-level Group Contrastive Learning module (PGCL).
- Temporal Action Detection with Multi-level Supervision (2020-11-24) [116.55596693897388]
We introduce the Semi-supervised Action Detection (SSAD) task with a mixture of labeled and unlabeled data.
We analyze different types of errors in the proposed SSAD baselines, which are directly adapted from the semi-supervised classification task.
We incorporate weakly-labeled data into SSAD and propose Omni-supervised Action Detection (OSAD) with three levels of supervision.
- Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection (2020-08-21) [27.354492072251492]
The online annotation module (OAM) learns to generate a many-shot set of reliable annotations from a larger volume of weakly labelled images.
Our OAM can be jointly trained with any fully supervised two-stage object detection method, providing additional training annotations on the fly.
The integration of the OAM with Fast(er) R-CNN improves their performance by 17% mAP and 9% AP50 on the PASCAL VOC 2007 and MS-COCO benchmarks, and significantly outperforms competing methods using mixed supervision.
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement (2020-02-18) [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
This list is automatically generated from the titles and abstracts of the papers on this site.