Online Deep Metric Learning via Mutual Distillation
- URL: http://arxiv.org/abs/2203.05201v1
- Date: Thu, 10 Mar 2022 07:24:36 GMT
- Title: Online Deep Metric Learning via Mutual Distillation
- Authors: Gao-Dong Liu, Wan-Lei Zhao, Jie Zhao
- Abstract summary: Deep metric learning aims to transform input data into an embedding space, where similar samples are close while dissimilar samples are far apart from each other.
Existing solutions either retrain the model from scratch or require the replay of old samples during the training.
This paper proposes a complete online deep metric learning framework based on mutual distillation for both one-task and multi-task scenarios.
- Score: 9.363111089877625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep metric learning aims to transform input data into an embedding space,
where similar samples are close while dissimilar samples are far apart from
each other. In practice, samples of new categories arrive incrementally, which
requires the periodical augmentation of the learned model. The fine-tuning on
the new categories usually leads to poor performance on the old, which is known
as "catastrophic forgetting". Existing solutions either retrain the model from
scratch or require the replay of old samples during the training. In this
paper, a complete online deep metric learning framework is proposed based on
mutual distillation for both one-task and multi-task scenarios. Different from
the teacher-student framework, the proposed approach treats the old and new
learning tasks with equal importance; no preference is given to either the old
or the new knowledge. In addition, a novel virtual feature estimation approach
is proposed to recover the features assumed to be extracted by the old models.
It allows distillation between the new and the old models without replaying old
training samples or retaining old models during training. A comprehensive study
shows the superior performance of our approach
with the support of different backbones.
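A minimal, non-authoritative sketch of the symmetric distillation idea described in the abstract, written in PyTorch. The toy `EmbeddingNet`, the MSE distillation term, and the noise-based `estimate_virtual_old_features` stand-in are illustrative assumptions, not the paper's actual virtual feature estimation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Toy backbone producing L2-normalized embeddings."""
    def __init__(self, in_dim=128, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def mutual_distillation_loss(new_emb, virtual_old_emb):
    """Symmetric distillation: each side is detached in turn, so neither the old
    nor the new representation acts as a fixed teacher."""
    loss_new_to_old = F.mse_loss(new_emb, virtual_old_emb.detach())
    loss_old_to_new = F.mse_loss(virtual_old_emb, new_emb.detach())
    return 0.5 * (loss_new_to_old + loss_old_to_new)

def estimate_virtual_old_features(new_emb, noise_scale=0.05):
    """Hypothetical stand-in for virtual feature estimation: perturb the current
    embeddings to emulate features the old model is assumed to have produced."""
    return F.normalize(new_emb + noise_scale * torch.randn_like(new_emb), dim=-1)

model = EmbeddingNet()
x = torch.randn(32, 128)                       # a batch of new-task samples
new_emb = model(x)
virtual_old = estimate_virtual_old_features(new_emb)
loss = mutual_distillation_loss(new_emb, virtual_old)
loss.backward()
```

Because both distillation directions carry equal weight, old and new knowledge are treated with the same importance, which is the sense in which the framework avoids the teacher-student asymmetry.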
Related papers
- Joint Diffusion models in Continual Learning [4.013156524547073]
We introduce JDCL - a new method for continual learning with generative rehearsal based on joint diffusion models.
Generative-replay-based continual learning methods try to mitigate catastrophic forgetting by retraining a model on a combination of new data and rehearsal data sampled from a generative model.
We show that such shared parametrization, combined with knowledge distillation, allows for stable adaptation to new tasks without catastrophic forgetting.
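A rough sketch of generative rehearsal combined with knowledge distillation, as summarized above. The separate toy classifier and noise sampler below are illustrative stand-ins for JDCL's joint, shared-parameter diffusion model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 10)                                # classifier being adapted to the new task
old_model = nn.Linear(32, 10).requires_grad_(False)      # frozen snapshot from the previous task
sample_from_generator = lambda n: torch.randn(n, 32)     # hypothetical generative-replay sampler

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
new_x, new_y = torch.randn(16, 32), torch.randint(0, 10, (16,))

replay_x = sample_from_generator(new_x.size(0))
with torch.no_grad():
    old_probs = F.softmax(old_model(replay_x), dim=-1)   # soft targets on rehearsal data

ce = F.cross_entropy(model(new_x), new_y)                # learn the new task
kd = F.kl_div(F.log_softmax(model(replay_x), dim=-1), old_probs, reduction="batchmean")

loss = ce + kd                                           # rehearsal + distillation
optimizer.zero_grad(); loss.backward(); optimizer.step()
```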
arXiv Detail & Related papers (2024-11-12T22:35:44Z) - Class incremental learning with probability dampening and cascaded gated classifier [4.285597067389559]
We propose a novel incremental regularisation approach called Margin Dampening and Cascaded Scaling.
The first combines a soft constraint and a knowledge distillation approach to preserve past knowledge while allowing the model to learn new patterns effectively.
We empirically show that our approach performs well on multiple benchmarks compared with well-established baselines.
arXiv Detail & Related papers (2024-02-02T09:33:07Z) - Enhancing Consistency and Mitigating Bias: A Data Replay Approach for
Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods propose to replay the data of experienced tasks when learning new tasks.
However, replaying stored data is often impractical due to memory constraints or data privacy issues.
As an alternative, data-free data replay methods are proposed, which synthesize samples by inverting the classification model.
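A generic sketch of data-free replay via model inversion, assuming a frozen toy classifier. The objective below (cross-entropy toward chosen old classes plus an L2 prior) is a simplification for illustration, not the cited paper's exact recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Frozen classifier trained on old tasks; replay samples are inverted from it,
# so no stored old data is needed.
old_classifier = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
for p in old_classifier.parameters():
    p.requires_grad_(False)

target_classes = torch.randint(0, 10, (16,))     # old classes to "replay"
synth = torch.randn(16, 64, requires_grad=True)  # start the synthetic inputs from noise
opt = torch.optim.Adam([synth], lr=0.1)

for _ in range(200):
    logits = old_classifier(synth)
    # Push the synthetic inputs toward the desired old-class predictions,
    # with a small L2 prior to keep them well behaved.
    loss = F.cross_entropy(logits, target_classes) + 1e-3 * synth.pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# `synth` can now serve as data-free replay samples when training on new tasks.
```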
arXiv Detail & Related papers (2024-01-12T12:51:12Z) - Continual Learning with Strong Experience Replay [32.154995019080594]
We propose a continual learning (CL) method with Strong Experience Replay (SER).
Besides distilling past experience from the memory buffer, SER utilizes future experiences mimicked on the current training data.
Experimental results on multiple image classification datasets show that our SER method surpasses the state-of-the-art methods by a noticeable margin.
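A hedged sketch of combining a replay loss on buffered samples with two distillation terms. Reading the "future experience" term as a consistency loss on the current batch is our assumption, not SER's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 10)                                # current model
old_model = nn.Linear(32, 10).requires_grad_(False)      # frozen copy from the previous task
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

cur_x, cur_y = torch.randn(16, 32), torch.randint(0, 10, (16,))   # current-task batch
buf_x, buf_y = torch.randn(16, 32), torch.randint(0, 10, (16,))   # memory-buffer batch

def soft_kd(student_logits, teacher_logits):
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1), reduction="batchmean")

ce = F.cross_entropy(model(cur_x), cur_y) + F.cross_entropy(model(buf_x), buf_y)
past_kd = soft_kd(model(buf_x), old_model(buf_x))    # distil past experience on buffered data
future_kd = soft_kd(model(cur_x), old_model(cur_x))  # consistency on current data ("future" term)

loss = ce + past_kd + future_kd
optimizer.zero_grad(); loss.backward(); optimizer.step()
```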
arXiv Detail & Related papers (2023-05-23T02:42:54Z) - Remind of the Past: Incremental Learning with Analogical Prompts [30.333352182303038]
We design an analogy-making mechanism to remap the new data into the old class by prompt tuning.
It mimics the feature distribution of the target old class on the old model using only samples of new classes.
The learnt prompts are further used to estimate and counteract the representation shift caused by fine-tuning for the historical prototypes.
arXiv Detail & Related papers (2023-03-24T10:18:28Z) - Intra-class Adaptive Augmentation with Neighbor Correction for Deep
Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard samples mining.
Our method significantly outperforms the state-of-the-art methods, improving retrieval performance by 3%-6%.
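An illustrative sketch of per-class adaptive synthesis in embedding space (class mean plus noise scaled by the estimated intra-class spread). The Gaussian model is an assumption, and IAA's neighbor-correction step is omitted:

```python
import torch

def synthesize_per_class(embeddings, labels, num_synth=4):
    """Estimate each class's intra-class variation (per-dimension std of its
    embeddings) and draw synthetic embeddings from a Gaussian around the class mean."""
    synth_emb, synth_lab = [], []
    for c in labels.unique():
        cls = embeddings[labels == c]
        mean, std = cls.mean(0), cls.std(0, unbiased=False) + 1e-6
        synth_emb.append(mean + std * torch.randn(num_synth, embeddings.size(1)))
        synth_lab.append(torch.full((num_synth,), int(c)))
    return torch.cat(synth_emb), torch.cat(synth_lab)

emb = torch.randn(64, 128)            # batch of embeddings
lab = torch.randint(0, 8, (64,))      # their class labels
aug_emb, aug_lab = synthesize_per_class(emb, lab)
# The synthetic embeddings can then be mixed into hard-sample mining for the metric loss.
```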
arXiv Detail & Related papers (2022-11-29T14:52:38Z) - Memorizing Complementation Network for Few-Shot Class-Incremental
Learning [109.4206979528375]
We propose a Memorizing Complementation Network (MCNet) to ensemble multiple models that complement each other's memorized knowledge in novel tasks.
We develop a Prototype Smoothing Hard-mining Triplet (PSHT) loss to push the novel samples away not only from each other in the current task but also from the old distribution.
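A hard-mining triplet sketch in which stored old-class prototypes are added to the pool of negatives, as one plausible reading of the PSHT idea; the paper's exact smoothing and mining strategy is not reproduced:

```python
import torch
import torch.nn.functional as F

def prototype_triplet_loss(novel_emb, novel_lab, old_prototypes, margin=0.2):
    """Each novel embedding is pulled toward its hardest (farthest) positive and
    pushed from its hardest (closest) negative; negatives include both other novel
    classes and old-class prototypes (labelled -1 below)."""
    novel_emb = F.normalize(novel_emb, dim=-1)
    candidates = torch.cat([novel_emb, F.normalize(old_prototypes, dim=-1)])
    cand_lab = torch.cat([novel_lab, torch.full((old_prototypes.size(0),), -1)])

    dist = torch.cdist(novel_emb, candidates)                 # pairwise distances
    same = novel_lab.unsqueeze(1) == cand_lab.unsqueeze(0)

    pos = dist.masked_fill(~same, float('-inf')).max(dim=1).values  # hardest positive
    neg = dist.masked_fill(same, float('inf')).min(dim=1).values    # hardest negative
    return F.relu(pos - neg + margin).mean()

novel = torch.randn(32, 64)
labels = torch.randint(10, 15, (32,))     # novel-class labels
protos = torch.randn(10, 64)              # one stored prototype per old class
loss = prototype_triplet_loss(novel, labels, protos)
```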
arXiv Detail & Related papers (2022-08-11T02:32:41Z) - Bridging Non Co-occurrence with Unlabeled In-the-wild Data for
Incremental Object Detection [56.22467011292147]
Several incremental learning methods have been proposed to mitigate catastrophic forgetting in object detection.
Despite their effectiveness, these methods require the unlabeled base classes to co-occur in the training data of the novel classes.
We propose the use of unlabeled in-the-wild data to bridge the non-co-occurrence caused by the missing base classes during the training of additional novel classes.
arXiv Detail & Related papers (2021-10-28T10:57:25Z) - Two-Level Residual Distillation based Triple Network for Incremental
Object Detection [21.725878050355824]
We propose a novel incremental object detector based on Faster R-CNN to continuously learn from new object classes without using old data.
It is a triple network in which an old model and a residual model act as assistants, helping the incremental model learn new classes without forgetting previously learned knowledge.
arXiv Detail & Related papers (2020-07-27T11:04:57Z) - Learning to Reweight with Deep Interactions [104.68509759134878]
We propose an improved data reweighting algorithm, in which the student model provides its internal states to the teacher model.
Experiments on image classification with clean/noisy labels and neural machine translation empirically demonstrate that our algorithm makes significant improvement over previous methods.
arXiv Detail & Related papers (2020-07-09T09:06:31Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the
Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
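A toy sketch of internal replay: auxiliary inputs are synthesized on the fly from the model currently being trained (no stored samples, no frozen old copy), and a consistency term preserves the model's behaviour on them during the new-task update. The confidence-maximization recall objective below is an illustrative simplification, not the paper's actual criterion:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
new_x, new_y = torch.randn(16, 32), torch.randint(5, 10, (16,))   # current-task batch
old_classes = torch.randint(0, 5, (8,))                           # classes seen earlier

# 1) Recall: synthesize inputs that the *current* model still maps to old classes.
recalled = torch.randn(8, 32, requires_grad=True)
inner_opt = torch.optim.Adam([recalled], lr=0.1)
for _ in range(50):
    inner_loss = F.cross_entropy(model(recalled), old_classes)
    inner_opt.zero_grad(); inner_loss.backward(); inner_opt.step()

with torch.no_grad():
    recall_targets = F.softmax(model(recalled), dim=-1)   # pre-update predictions

# 2) Train on the new batch while keeping the recalled behaviour unchanged.
#    (zero_grad also clears parameter gradients accumulated during the inner loop)
loss = F.cross_entropy(model(new_x), new_y) + F.kl_div(
    F.log_softmax(model(recalled.detach()), dim=-1), recall_targets, reduction="batchmean")
optimizer.zero_grad(); loss.backward(); optimizer.step()
```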
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.