Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation
- URL: http://arxiv.org/abs/2204.02136v1
- Date: Tue, 5 Apr 2022 11:57:43 GMT
- Title: Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation
- Authors: Tao Feng, Mang Wang, Hangjie Yuan
- Abstract summary: Traditional object detectors are ill-equipped for incremental learning.
Fine-tuning directly on a well-trained detection model with only new data will lead to catastrophic forgetting.
We propose a response-based incremental distillation method, dubbed Elastic Response Distillation (ERD), which elastically learns responses from the classification head and the regression head.
- Score: 4.846235640334886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional object detectors are ill-equipped for incremental learning:
fine-tuning directly on a well-trained detection model with only new data leads
to catastrophic forgetting. Knowledge distillation is a flexible way to mitigate
catastrophic forgetting. In Incremental Object Detection (IOD), previous work
mainly focuses on distilling a combination of features and responses, but
under-explores the information contained in responses. In this paper, we propose
a response-based incremental distillation method, dubbed Elastic Response
Distillation (ERD), which focuses on elastically learning responses from the
classification head and the regression head. Firstly, our method transfers
category knowledge while equipping the student detector with the ability to
retain localization information during incremental learning. In addition, we
evaluate the quality of all locations and provide valuable responses via the
Elastic Response Selection (ERS) strategy. Finally, we elucidate that the
knowledge from different responses should be assigned different importance
during incremental distillation. Extensive experiments conducted on MS COCO
demonstrate that our method achieves state-of-the-art results, substantially
narrowing the performance gap towards full training.
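A minimal sketch of the response-distillation idea above, assuming a PyTorch-style detector whose frozen old model acts as the teacher; the function names, temperature, loss weights, and the `keep_mask` selection step are illustrative assumptions rather than the paper's exact ERD/ERS formulation:

```python
import torch
import torch.nn.functional as F


def classification_distillation(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened class responses of teacher and student."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)


def regression_distillation(student_boxes, teacher_boxes):
    """Pull the student's box responses toward the teacher's (localization knowledge)."""
    return F.smooth_l1_loss(student_boxes, teacher_boxes)


def incremental_distillation_loss(student_out, teacher_out, keep_mask,
                                  w_cls=1.0, w_reg=1.0):
    """Combine both response losses on the locations kept by a quality-based
    selection step (a stand-in for the paper's Elastic Response Selection)."""
    loss_cls = classification_distillation(student_out["cls"][keep_mask],
                                           teacher_out["cls"][keep_mask])
    loss_reg = regression_distillation(student_out["reg"][keep_mask],
                                       teacher_out["reg"][keep_mask])
    return w_cls * loss_cls + w_reg * loss_reg


# Toy usage: 100 candidate locations, 20 old classes, 4 box coordinates.
student_out = {"cls": torch.randn(100, 20), "reg": torch.randn(100, 4)}
teacher_out = {"cls": torch.randn(100, 20), "reg": torch.randn(100, 4)}
keep_mask = torch.rand(100) > 0.5  # stand-in for a quality-based selection mask
print(incremental_distillation_loss(student_out, teacher_out, keep_mask))
```

In an incremental step, a loss of this form would be added to the standard detection loss on the new-class data, so the student learns the new categories while staying close to the old detector's classification and regression responses.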
Related papers
- Task Integration Distillation for Object Detectors [2.974025533366946]
We propose a knowledge distillation method that addresses both the classification and regression tasks.
We evaluate the importance of features based on the output of the detector's two sub-tasks.
This effectively avoids a biased assessment of what the model has actually learned.
arXiv Detail & Related papers (2024-04-02T07:08:15Z)
- Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
arXiv Detail & Related papers (2022-09-20T16:36:28Z)
- Few-Shot Class-Incremental Learning via Entropy-Regularized Data-Free Replay [52.251188477192336]
Few-shot class-incremental learning (FSCIL) has been proposed aiming to enable a deep learning system to incrementally learn new classes with limited data.
We show through empirical results that adopting data replay is surprisingly favorable.
We propose using data-free replay that can synthesize data by a generator without accessing real data.
arXiv Detail & Related papers (2022-07-22T17:30:51Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- Localization Distillation for Object Detection [134.12664548771534]
Previous knowledge distillation (KD) methods for object detection mostly focus on feature imitation rather than mimicking the classification logits.
We present a novel localization distillation (LD) method that efficiently transfers localization knowledge from the teacher to the student; a hedged sketch of this idea appears after the related-papers list below.
We show that logit mimicking can outperform feature imitation, and that the absence of localization distillation is a key reason why logit mimicking has underperformed for years.
arXiv Detail & Related papers (2022-04-12T17:14:34Z)
- Response-based Distillation for Incremental Object Detection [2.337183337110597]
Traditional object detectors are ill-equipped for incremental learning.
Fine-tuning directly on a well-trained detection model with only new data will lead to catastrophic forgetting.
We propose a fully response-based incremental distillation method focusing on learning response from detection bounding boxes and classification predictions.
arXiv Detail & Related papers (2021-10-26T08:07:55Z)
- SID: Incremental Learning for Anchor-Free Object Detection via Selective and Inter-Related Distillation [16.281712605385316]
Incremental learning requires a model to continually learn new tasks from streaming data.
Traditional fine-tuning of a well-trained deep neural network on a new task will dramatically degrade performance on the old task.
We propose a novel incremental learning paradigm called Selective and Inter-related Distillation (SID)
arXiv Detail & Related papers (2020-12-31T04:12:06Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
- Distilling Object Detectors with Task Adaptive Regularization [97.52935611385179]
Current state-of-the-art object detectors come at the expense of high computational costs and are hard to deploy on low-end devices.
Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
arXiv Detail & Related papers (2020-06-23T15:58:22Z)
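As referenced in the "Localization Distillation for Object Detection" entry above, here is a hedged sketch of distilling localization knowledge between detectors that predict each box edge as a distribution over discretized offsets; the bin count, temperature, and tensor layout are assumptions for illustration, not the authors' exact setup:

```python
import torch
import torch.nn.functional as F


def localization_distillation(student_edges, teacher_edges, temperature=10.0):
    """student_edges / teacher_edges: [num_locations, 4, num_bins] logits, where
    each box edge (left, top, right, bottom) is predicted as a distribution over
    discretized offsets. Returns a KL loss transferring localization knowledge."""
    t = temperature
    p_teacher = F.softmax(teacher_edges / t, dim=-1)
    log_p_student = F.log_softmax(student_edges / t, dim=-1)
    # Sum the KL over bins, then average over edges and locations.
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=-1)
    return kl.mean() * (t * t)


# Toy usage: 8 sampled locations, 4 edges per box, 17 bins per edge.
student_edges = torch.randn(8, 4, 17)
teacher_edges = torch.randn(8, 4, 17)
print(localization_distillation(student_edges, teacher_edges))
```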