Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language
Knowledge Distillation
- URL: http://arxiv.org/abs/2203.10593v1
- Date: Sun, 20 Mar 2022 16:31:49 GMT
- Title: Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language
Knowledge Distillation
- Authors: Zongyang Ma, Guan Luo, Jin Gao, Liang Li, Yuxin Chen, Shaoru Wang,
Congxuan Zhang, Weiming Hu
- Abstract summary: We propose a hierarchical visual-language knowledge distillation method, i.e., HierKD, for open-vocabulary one-stage detection.
Our method significantly surpasses the previous best one-stage detector with 11.9% and 6.7% $AP_{50}$ gains.
- Score: 36.79599282372021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open-vocabulary object detection aims to detect novel object categories
beyond the training set.
The advanced open-vocabulary two-stage detectors employ instance-level
visual-to-visual knowledge distillation to align the visual space of the
detector with the semantic space of the Pre-trained Visual-Language Model
(PVLM).
However, in the more efficient one-stage detector, the absence of
class-agnostic object proposals hinders the knowledge distillation on unseen
objects, leading to severe performance degradation.
In this paper, we propose a hierarchical visual-language knowledge
distillation method, i.e., HierKD, for open-vocabulary one-stage detection.
Specifically, a global-level knowledge distillation is explored to transfer
the knowledge of unseen categories from the PVLM to the detector.
Moreover, we combine the proposed global-level knowledge distillation and the
common instance-level knowledge distillation to learn the knowledge of seen and
unseen categories simultaneously.
Extensive experiments on MS-COCO show that our method significantly surpasses
the previous best one-stage detector with 11.9\% and 6.7\% $AP_{50}$ gains
under the zero-shot detection and generalized zero-shot detection settings, and
reduces the $AP_{50}$ performance gap from 14\% to 7.3\% compared to the best
two-stage detector.
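As a rough illustration only (not the authors' released code; all function and parameter names are hypothetical), the hierarchical objective described in the abstract can be sketched as a combined loss: an instance-level term that pulls region embeddings toward the PVLM's matched embeddings, plus a global-level term that aligns a global image representation with the PVLM caption embedding.

```python
import numpy as np

def l1_distill(student_regions, teacher_regions):
    """Instance-level KD: L1 distance between matched region embeddings."""
    return float(np.mean(np.abs(student_regions - teacher_regions)))

def cosine_distill(student_global, teacher_global):
    """Global-level KD: 1 - cosine similarity between global embeddings."""
    num = np.sum(student_global * teacher_global, axis=-1)
    den = (np.linalg.norm(student_global, axis=-1) *
           np.linalg.norm(teacher_global, axis=-1))
    return float(np.mean(1.0 - num / den))

def hierarchical_kd_loss(student_regions, teacher_regions,
                         student_global, teacher_global, w_global=1.0):
    """Combine the instance-level and global-level distillation terms."""
    return (l1_distill(student_regions, teacher_regions)
            + w_global * cosine_distill(student_global, teacher_global))
```

Here `w_global` is an assumed weighting that trades off seen-category transfer (instance level) against unseen-category transfer (global level); the exact loss forms and weighting in HierKD may differ.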
Related papers
- Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection [101.15777242546649]
Open vocabulary object detection (OVD) aims at seeking an optimal object detector capable of recognizing objects from both base and novel categories.
Recent advances leverage knowledge distillation to transfer insightful knowledge from pre-trained large-scale vision-language models to the task of object detection.
We present a novel OVD framework, termed LBP, that learns background prompts to harness implicit background knowledge.
arXiv Detail & Related papers (2024-06-01T17:32:26Z) - SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector [8.956773268679811]
We specialize the VLM for OWOD tasks by distilling its open-world knowledge into a language-agnostic detector.
We observe that the combination of a simple knowledge distillation approach and the automatic pseudo-labeling mechanism in OWOD can achieve better performance for unknown object detection.
We propose two benchmarks for evaluating the ability of the open-world detector to detect unknown objects in the open world.
arXiv Detail & Related papers (2023-12-14T04:47:20Z) - Efficient Object Detection in Optical Remote Sensing Imagery via
Attention-based Feature Distillation [29.821082433621868]
We propose Attention-based Feature Distillation (AFD) for object detection.
We introduce a multi-instance attention mechanism that effectively distinguishes between background and foreground elements.
AFD attains the performance of other state-of-the-art models while being efficient.
arXiv Detail & Related papers (2023-10-28T11:15:37Z) - Knowledge Distillation Meets Open-Set Semi-Supervised Learning [69.21139647218456]
We propose a novel method dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student.
At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL).
Our method significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine-grained face recognition tasks.
arXiv Detail & Related papers (2022-05-13T15:15:27Z) - Response-based Distillation for Incremental Object Detection [2.337183337110597]
Traditional object detectors are ill-equipped for incremental learning.
Fine-tuning a well-trained detection model directly on new data alone leads to catastrophic forgetting.
We propose a fully response-based incremental distillation method focusing on learning response from detection bounding boxes and classification predictions.
arXiv Detail & Related papers (2021-10-26T08:07:55Z) - Label Assignment Distillation for Object Detection [0.0]
We come up with a simple but effective knowledge distillation approach focusing on label assignment in object detection.
Our method shows encouraging results on the MSCOCO 2017 benchmark.
arXiv Detail & Related papers (2021-09-16T10:11:58Z) - G-DetKD: Towards General Distillation Framework for Object Detectors via
Contrastive and Semantic-guided Feature Imitation [49.421099172544196]
We propose a novel semantic-guided feature imitation technique, which automatically performs soft matching between feature pairs across all pyramid levels.
We also introduce contrastive distillation to effectively capture the information encoded in the relationship between different feature regions.
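As an illustrative sketch (not the paper's implementation; names are hypothetical), contrastive distillation between feature regions can be realized as an InfoNCE-style loss, where each student feature treats its matched teacher feature as the positive and the other teacher features in the batch as negatives:

```python
import numpy as np

def info_nce_distill(student, teacher, temperature=0.1):
    """InfoNCE-style distillation: matched (student, teacher) rows are
    positive pairs; all other teacher rows in the batch are negatives."""
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    logits = s @ t.T / temperature               # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = np.arange(len(s))
    return float(-np.mean(np.log(probs[idx, idx])))
```

Minimizing this loss encourages each student region feature to be more similar to its own teacher feature than to unrelated ones, which is what lets the loss capture relationships between different feature regions rather than matching each pair in isolation.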
Our method consistently outperforms existing detection KD techniques and works when the components of the framework are used separately or in conjunction.
arXiv Detail & Related papers (2021-08-17T07:44:27Z) - Distilling Image Classifiers in Object Detectors [81.63849985128527]
We study the case of object detection and, instead of following the standard detector-to-detector distillation approach, introduce a classifier-to-detector knowledge transfer framework.
In particular, we propose strategies to exploit the classification teacher to improve both the detector's recognition accuracy and localization performance.
arXiv Detail & Related papers (2021-06-09T16:50:10Z) - Robust and Accurate Object Detection via Adversarial Learning [111.36192453882195]
This work augments the fine-tuning stage for object detectors by exploring adversarial examples.
Our approach boosts the performance of state-of-the-art EfficientDets by +1.1 mAP on the object detection benchmark.
arXiv Detail & Related papers (2021-03-23T19:45:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.