Continual Detection Transformer for Incremental Object Detection
- URL: http://arxiv.org/abs/2304.03110v1
- Date: Thu, 6 Apr 2023 14:38:40 GMT
- Title: Continual Detection Transformer for Incremental Object Detection
- Authors: Yaoyao Liu, Bernt Schiele, Andrea Vedaldi, Christian Rupprecht
- Abstract summary: Incremental object detection (IOD) aims to train an object detector in phases, each with annotations for new object categories.
As in other incremental settings, IOD is subject to catastrophic forgetting, which is often addressed by techniques such as knowledge distillation (KD) and exemplar replay (ER).
We propose a new method for transformer-based IOD which enables effective usage of KD and ER in this context.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Incremental object detection (IOD) aims to train an object detector in
phases, each with annotations for new object categories. As in other incremental
settings, IOD is subject to catastrophic forgetting, which is often addressed
by techniques such as knowledge distillation (KD) and exemplar replay (ER).
However, KD and ER do not work well if applied directly to state-of-the-art
transformer-based object detectors such as Deformable DETR and UP-DETR. In this
paper, we solve these issues by proposing a ContinuaL DEtection TRansformer
(CL-DETR), a new method for transformer-based IOD which enables effective usage
of KD and ER in this context. First, we introduce a Detector Knowledge
Distillation (DKD) loss, focusing on the most informative and reliable
predictions from old versions of the model, ignoring redundant background
predictions, and ensuring compatibility with the available ground-truth labels.
We also improve ER by proposing a calibration strategy to preserve the label
distribution of the training set, therefore better matching training and
testing statistics. We conduct extensive experiments on COCO 2017 and
demonstrate that CL-DETR achieves state-of-the-art results in the IOD setting.
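The two ingredients described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `select_distillation_targets` keeps the old model's confident foreground predictions that do not conflict with the current ground truth (the spirit of the DKD loss), and `calibrate_exemplars` greedily picks replay images whose pooled category counts stay closest to the training set's label distribution. All function names, thresholds, and data layouts here are assumptions.

```python
def iou(a, b):
    # Axis-aligned IoU between two [x1, y1, x2, y2] boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def select_distillation_targets(old_preds, gt_boxes, score_thr=0.5, iou_thr=0.5):
    """Keep confident non-background old-model predictions that do not
    overlap the current ground truth (DKD-style pseudo-labels)."""
    keep = []
    for box, label, score in old_preds:
        if label == "background" or score < score_thr:
            continue  # drop redundant background / unreliable predictions
        if all(iou(box, g) < iou_thr for g in gt_boxes):
            keep.append((box, label, score))
    return keep

def calibrate_exemplars(image_labels, target_dist, budget):
    """Greedily choose up to `budget` replay images whose pooled category
    counts stay closest (L1 distance) to the target label distribution."""
    chosen, counts = [], {c: 0 for c in target_dist}
    pool = dict(image_labels)  # image_id -> list of category labels

    for _ in range(min(budget, len(pool))):
        def dist_if_added(img):
            trial = dict(counts)
            for c in pool[img]:
                trial[c] = trial.get(c, 0) + 1
            total = sum(trial.values()) or 1
            return sum(abs(trial.get(c, 0) / total - p)
                       for c, p in target_dist.items())
        best = min(pool, key=dist_if_added)  # ties break by insertion order
        for c in pool[best]:
            counts[c] = counts.get(c, 0) + 1
        chosen.append(best)
        del pool[best]
    return chosen
```

In this sketch, the selected old-model predictions would be merged with the new ground-truth labels as distillation targets, while the calibrated exemplar set would be replayed in later phases; the actual CL-DETR losses and matching procedure are in the paper.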
Related papers
- On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines [15.306933156466522]
Reliable usage of object detectors requires them to be calibrated.
Recent approaches involve designing new loss functions to obtain calibrated detectors by training them from scratch.
We propose a principled evaluation framework to jointly measure calibration and accuracy of object detectors.
arXiv Detail & Related papers (2024-05-30T20:12:14Z)
- Unified Unsupervised Salient Object Detection via Knowledge Transfer [29.324193170890542]
Unsupervised salient object detection (USOD) has gained increasing attention due to its annotation-free nature.
In this paper, we propose a unified USOD framework for generic USOD tasks.
arXiv Detail & Related papers (2024-04-23T05:50:02Z)
- Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z) - Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z) - Towards Few-Annotation Learning for Object Detection: Are
Transformer-based Models More Efficient ? [11.416621957617334]
In this paper, we propose a semi-supervised method tailored for the current state-of-the-art object detector Deformable DETR.
We evaluate our method on the semi-supervised object detection benchmarks COCO and Pascal VOC, and it outperforms previous methods, especially when annotations are scarce.
arXiv Detail & Related papers (2023-10-30T18:51:25Z) - Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z) - Revisiting Intermediate Layer Distillation for Compressing Language
Models: An Overfitting Perspective [7.481220126953329]
Intermediate Layer Distillation (ILD) has become a de facto standard KD method in the NLP field owing to its strong performance.
In this paper, we find that existing ILD methods are prone to overfitting to training datasets, although these methods transfer more information than the original KD.
We propose a simple yet effective consistency-regularized ILD, which prevents the student model from overfitting the training dataset.
arXiv Detail & Related papers (2023-02-03T04:09:22Z) - Mitigating the Mutual Error Amplification for Semi-Supervised Object
Detection [92.52505195585925]
We propose a Cross Teaching (CT) method, aiming to mitigate the mutual error amplification by introducing a rectification mechanism of pseudo labels.
In contrast to existing mutual teaching methods that directly treat predictions from other detectors as pseudo labels, we propose the Label Rectification Module (LRM).
arXiv Detail & Related papers (2022-01-26T03:34:57Z) - DA-DETR: Domain Adaptive Detection Transformer with Information Fusion [53.25930448542148]
DA-DETR is a domain adaptive object detection transformer that introduces information fusion for effective transfer from a labeled source domain to an unlabeled target domain.
We introduce a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
CTBlender employs the Transformer features to modulate the CNN features across multiple scales where the high-level semantic information and the low-level spatial information are fused for accurate object identification and localization.
arXiv Detail & Related papers (2021-03-31T13:55:56Z) - Distilling Knowledge from Refinement in Multiple Instance Detection
Networks [0.0]
Weakly supervised object detection (WSOD) aims to tackle the object detection problem using only labeled image categories as supervision.
We present an adaptive supervision aggregation function that dynamically changes the aggregation criteria for selecting boxes related to one of the ground-truth classes, to the background, or to be ignored during the generation of each refinement module's supervision.
arXiv Detail & Related papers (2020-04-23T02:49:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.