Dynamic Sub-graph Distillation for Robust Semi-supervised Continual
Learning
- URL: http://arxiv.org/abs/2312.16409v1
- Date: Wed, 27 Dec 2023 04:40:12 GMT
- Title: Dynamic Sub-graph Distillation for Robust Semi-supervised Continual
Learning
- Authors: Yan Fan, Yu Wang, Pengfei Zhu, Qinghua Hu
- Abstract summary: We focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories.
We propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for semi-supervised continual learning.
- Score: 52.046037471678005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning (CL) has shown promising results and comparable
performance to learning at once in a fully supervised manner. However, CL
strategies typically require a large number of labeled samples, making their
real-life deployment challenging. In this work, we focus on semi-supervised
continual learning (SSCL), where the model progressively learns from partially
labeled data with unknown categories. We provide a comprehensive analysis of
SSCL and demonstrate that unreliable distributions of unlabeled data lead to
unstable training and unreliable refinement across the progressing stages. This problem
severely impacts the performance of SSCL. To address the limitations, we
propose a novel approach called Dynamic Sub-Graph Distillation (DSGD) for
semi-supervised continual learning, which leverages both semantic and
structural information to achieve more stable knowledge distillation on
unlabeled data and exhibit robustness against distribution bias. Firstly, we
formalize a general model of structural distillation and design a dynamic graph
construction for the continual learning process. Next, we define a structure
distillation vector and design a dynamic sub-graph distillation algorithm,
which enables end-to-end training and adapts as the number of tasks grows. The
entire proposed method is adaptable to various CL methods and supervision
settings. Finally, experiments conducted on three datasets (CIFAR-10, CIFAR-100,
and ImageNet-100) with varying supervision ratios demonstrate the
effectiveness of our proposed approach in mitigating the catastrophic
forgetting problem in semi-supervised continual learning scenarios.
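The abstract above describes DSGD only at a high level, so the snippet below is a minimal PyTorch sketch of the general idea of structural distillation over a dynamically selected sub-graph, not the authors' implementation: the function name, the k-nearest-neighbour sub-graph construction, the use of cosine similarity, and the KL matching of per-anchor "structure vectors" are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def structural_distillation_loss(feat_old: torch.Tensor,
                                 feat_new: torch.Tensor,
                                 k: int = 5) -> torch.Tensor:
    """Illustrative structural-distillation loss on a local sub-graph.

    feat_old: mini-batch features from the frozen previous-stage model.
    feat_new: features of the same mini-batch from the current model.
    For each anchor, a sub-graph of its k most similar neighbours (under the
    old model) is selected, and the new model is encouraged to preserve the
    relative similarity structure over that sub-graph.
    """
    old = F.normalize(feat_old, dim=1)   # (B, D) unit-norm features
    new = F.normalize(feat_new, dim=1)   # (B, D)

    sim_old = old @ old.t()              # (B, B) cosine similarities
    sim_new = new @ new.t()

    # Dynamic sub-graph: each anchor keeps its k nearest neighbours
    # (excluding itself) according to the old feature space.
    B = sim_old.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=sim_old.device)
    _, nn_idx = sim_old.masked_fill(self_mask, float("-inf")).topk(k, dim=1)

    # "Structure vectors": anchor-to-neighbour similarities turned into
    # distributions and matched with a KL divergence.
    p_old = F.softmax(sim_old.gather(1, nn_idx), dim=1)
    log_p_new = F.log_softmax(sim_new.gather(1, nn_idx), dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean")
```

In an SSCL training loop, a term like this could be added on the unlabeled portion of each batch alongside the usual supervised loss on labeled samples, which is consistent with the abstract's claim that the approach plugs into various CL methods and supervision settings.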
Related papers
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z) - Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of VLMs in terms of zero-shot generalization; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the models in few-shot image classification scenarios.
arXiv Detail & Related papers (2024-07-11T10:35:53Z) - Continual Vision-Language Representation Learning with Off-Diagonal
Information [112.39419069447902]
Multi-modal contrastive learning frameworks like CLIP typically require a large amount of image-text samples for training.
This paper discusses the feasibility of continual CLIP training using streaming data.
arXiv Detail & Related papers (2023-05-11T08:04:46Z) - Kaizen: Practical Self-supervised Continual Learning with Continual
Fine-tuning [21.36130180647864]
Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient.
We introduce a training architecture that is able to mitigate catastrophic forgetting.
Kaizen significantly outperforms previous SSL models in competitive vision benchmarks.
arXiv Detail & Related papers (2023-03-30T09:08:57Z) - Unifying Synergies between Self-supervised Learning and Dynamic
Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
The main challenge comes from the "catastrophic forgetting" issue: the inability to retain previously learnt knowledge while learning new tasks.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - Learning by Distillation: A Self-Supervised Learning Framework for
Optical Flow Estimation [71.76008290101214]
DistillFlow is a knowledge distillation approach to learning optical flow.
It achieves state-of-the-art unsupervised learning performance on both KITTI and Sintel datasets.
Our models ranked 1st among all monocular methods on the KITTI 2015 benchmark, and outperform all published methods on the Sintel Final benchmark.
arXiv Detail & Related papers (2021-06-08T09:13:34Z) - Continual Learning From Unlabeled Data Via Deep Clustering [7.704949298975352]
Continual learning aims to learn new tasks incrementally using fewer computation and memory resources than retraining the model from scratch whenever a new task arrives.
We introduce a new framework that makes continual learning feasible in an unsupervised mode by using pseudo-labels obtained from cluster assignments to update the model.
arXiv Detail & Related papers (2021-04-14T23:46:17Z) - Unsupervised Class-Incremental Learning Through Confusion [0.4604003661048266]
We introduce a novelty detection method that leverages the network confusion caused by training on incoming data as a new class.
We find that incorporating a class imbalance during this detection substantially enhances performance.
arXiv Detail & Related papers (2021-04-09T15:58:43Z) - Robust Disentanglement of a Few Factors at a Time [5.156484100374058]
We introduce population-based training (PBT) for improving consistency in training variational autoencoders (VAEs).
We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised heuristic to score models in our PBT-VAE training, and show that models trained this way tend to consistently disentangle only a subset of the generative factors.
We show striking improvement in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
arXiv Detail & Related papers (2020-10-26T12:34:23Z)