Semi-Supervised and Long-Tailed Object Detection with CascadeMatch
- URL: http://arxiv.org/abs/2305.14813v1
- Date: Wed, 24 May 2023 07:09:25 GMT
- Title: Semi-Supervised and Long-Tailed Object Detection with CascadeMatch
- Authors: Yuhang Zang, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Abstract summary: We propose a novel pseudo-labeling-based detector called CascadeMatch.
Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds.
We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
- Score: 91.86787064083012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper focuses on long-tailed object detection in the semi-supervised
learning setting, which poses realistic challenges, but has rarely been studied
in the literature. We propose a novel pseudo-labeling-based detector called
CascadeMatch. Our detector features a cascade network architecture, which has
multi-stage detection heads with progressive confidence thresholds. To avoid
manually tuning the thresholds, we design a new adaptive pseudo-label mining
mechanism to automatically identify suitable values from data. To mitigate
confirmation bias, where a model is negatively reinforced by incorrect
pseudo-labels produced by itself, each detection head is trained by the
ensemble pseudo-labels of all detection heads. Experiments on two long-tailed
datasets, i.e., LVIS and COCO-LT, demonstrate that CascadeMatch surpasses
existing state-of-the-art semi-supervised approaches -- across a wide range of
detection architectures -- in handling long-tailed object detection. For
instance, CascadeMatch outperforms Unbiased Teacher by 1.9 AP^Fix on LVIS when
using a ResNet50-based Cascade R-CNN structure, and by 1.7 AP^Fix when using
Sparse R-CNN with a Transformer encoder. We also show that CascadeMatch can
even handle the challenging sparsely annotated object detection problem.
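
The abstract names two ideas that a small example can make concrete: stage-wise confidence thresholds and pseudo-labels built from the ensemble of all detection heads. The following is a minimal sketch, not the authors' implementation. It assumes that the ensemble is a plain average of per-class scores and that thresholds increase with the stage index; the function names, scores, and threshold values are hypothetical. In the paper the thresholds are identified from data by the adaptive pseudo-label mining mechanism, which this sketch does not reproduce.

```python
from statistics import fmean


def ensemble_scores(head_scores):
    """Average the per-class confidences of all heads for one proposal.

    Assumption: the ensemble is a plain mean over heads.
    """
    num_classes = len(head_scores[0])
    return [fmean(head[c] for head in head_scores) for c in range(num_classes)]


def pseudo_labels_per_stage(head_scores, stage_thresholds):
    """For each stage, keep the ensemble's top class only if its averaged
    confidence reaches that stage's threshold; otherwise return None."""
    avg = ensemble_scores(head_scores)
    best = max(range(len(avg)), key=avg.__getitem__)
    return [best if avg[best] >= t else None for t in stage_thresholds]


# Hypothetical scores: one unlabeled proposal, three heads, three classes.
head_scores = [
    [0.10, 0.70, 0.20],
    [0.05, 0.80, 0.15],
    [0.05, 0.75, 0.20],
]
# Hypothetical hand-set thresholds, one per stage, increasing with depth.
stage_thresholds = [0.6, 0.7, 0.8]

labels = pseudo_labels_per_stage(head_scores, stage_thresholds)
print(labels)
```

Because every head is supervised by the same averaged prediction rather than by its own output, a single head's mistake is less likely to be fed back to that head, which is the intuition behind using the ensemble to mitigate confirmation bias.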
Related papers
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
Experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 other state-of-the-art methods.
(2023-09-22T06:19:11Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
(2023-08-18T13:13:09Z)
- CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection [17.766859354014663]
Open-world object detection requires a model trained from data on known objects to detect both known and unknown objects.
We propose a novel solution called CAT: LoCalization and IdentificAtion Cascade Detection Transformer.
We show that our model outperforms the state of the art on all metrics in the tasks of open-world object detection (OWOD), incremental object detection (IOD), and open-set detection.
(2023-01-05T09:11:16Z)
- Double-Dot Network for Antipodal Grasp Detection [20.21384585441404]
This paper proposes a new deep learning approach to antipodal grasp detection, named Double-Dot Network (DD-Net).
It follows the recent anchor-free object detection framework, which does not depend on empirically pre-set anchors.
An effective CNN architecture is introduced to localize such fingertips, and with the help of auxiliary centers for refinement, it accurately and robustly infers grasp candidates.
(2021-08-03T14:21:17Z)
- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
(2020-11-19T05:42:02Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector called ScopeNet, which models the anchors at each location as mutually dependent.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
(2020-05-11T04:00:09Z)
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
(2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.