Complementary Calibration: Boosting General Continual Learning with
Collaborative Distillation and Self-Supervision
- URL: http://arxiv.org/abs/2109.02426v1
- Date: Fri, 3 Sep 2021 06:35:27 GMT
- Title: Complementary Calibration: Boosting General Continual Learning with
Collaborative Distillation and Self-Supervision
- Authors: Zhong Ji, Jin Li, Qiang Wang, Zhongfei Zhang
- Abstract summary: General Continual Learning (GCL) aims at learning from non-independent and identically distributed stream data.
We reveal that relation deviation and feature deviation are crucial problems behind catastrophic forgetting.
We propose a Complementary Calibration (CoCa) framework that mines the complementary model's outputs and features.
- Score: 47.374412281270594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General Continual Learning (GCL) aims at learning from non-independent and
identically distributed stream data without catastrophic forgetting of old tasks, and
without relying on task boundaries during either the training or the testing stage. We
reveal that relation deviation and feature deviation are crucial problems behind
catastrophic forgetting: relation deviation refers to the deficient modeling of the
relationships among all classes during knowledge distillation, and feature deviation
refers to indiscriminative feature representations. To this end, we propose a
Complementary Calibration (CoCa) framework that mines the complementary model's outputs
and features to alleviate the two deviations during GCL. Specifically, we propose a new
collaborative distillation approach to address relation deviation: it distills the
model's outputs by utilizing the ensemble dark knowledge of the new model's outputs and
the reserved outputs, which maintains performance on old tasks while balancing the
relationships among all classes. Furthermore, we explore a collaborative
self-supervision idea that leverages pretext tasks and supervised contrastive learning
to address feature deviation by learning complete and discriminative features for all
classes. Extensive experiments on four popular datasets show that our CoCa framework
achieves superior performance against state-of-the-art methods.
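The abstract describes the two calibration terms only at a high level. The following PyTorch sketch is a minimal illustration of how a collaborative distillation term over new and reserved outputs and a supervised contrastive term could be assembled; the plain averaging of the two soft distributions, the temperatures, and the loss weights are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def collaborative_distillation_loss(new_logits, reserved_logits, T=2.0):
    """Distill toward an ensemble of the new model's outputs and the outputs
    reserved in the memory buffer (plain averaging is an assumption here)."""
    ensemble = 0.5 * (F.softmax(new_logits / T, dim=1)
                      + F.softmax(reserved_logits / T, dim=1))
    log_p = F.log_softmax(new_logits / T, dim=1)
    # KL divergence between current predictions and the ensemble "dark knowledge".
    return F.kl_div(log_p, ensemble.detach(), reduction="batchmean") * T * T

def supervised_contrastive_loss(features, labels, tau=0.1):
    """SupCon-style term: pull same-class features together, push others apart."""
    z = F.normalize(features, dim=1)
    sim = z @ z.T / tau                                   # pairwise similarities
    self_mask = torch.eye(len(labels), device=z.device).bool()
    pos_mask = (labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask).float()
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    mean_log_prob_pos = (log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1.0)
    return -mean_log_prob_pos.mean()

# Hypothetical combined objective on a batch mixing stream and buffer samples:
# loss = F.cross_entropy(new_logits, labels) \
#        + alpha * collaborative_distillation_loss(new_logits, reserved_logits) \
#        + beta * supervised_contrastive_loss(features, labels)
```

In the actual method, the reserved outputs would presumably come from exemplars stored in a replay buffer together with the logits recorded when they were first seen; the pretext-task branch mentioned in the abstract is omitted from this sketch.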
Related papers
- Joint Input and Output Coordination for Class-Incremental Learning [84.36763449830812]
We propose a joint input and output coordination (JIOC) mechanism to address these issues.
This mechanism assigns different weights to different categories of data according to the gradient of the output score.
It can be incorporated into different incremental learning approaches that use memory storage.
arXiv Detail & Related papers (2024-09-09T13:55:07Z)
- Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z)
- Towards Distribution-Agnostic Generalized Category Discovery [51.52673017664908]
Data imbalance and open-ended distribution are intrinsic characteristics of the real visual world.
We propose a Self-Balanced Co-Advice contrastive framework (BaCon).
BaCon consists of a contrastive-learning branch and a pseudo-labeling branch, working collaboratively to provide interactive supervision to resolve the DA-GCD task.
arXiv Detail & Related papers (2023-10-02T17:39:58Z)
- Towards Causal Foundation Model: on Duality between Causal Inference and Attention [18.046388712804042]
We take a first step towards building causally-aware foundation models for treatment effect estimations.
We propose a novel, theoretically justified method called Causal Inference with Attention (CInA).
arXiv Detail & Related papers (2023-10-01T22:28:34Z)
- Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering [15.678104431835772]
We take an initial step toward a theoretical understanding of relational knowledge distillation (RKD).
For semi-supervised learning, we demonstrate the label efficiency of RKD through a general framework of cluster-aware learning.
We show that despite the common effect of learning accurate clusterings, RKD facilitates a "global" perspective.
arXiv Detail & Related papers (2023-07-20T17:05:51Z)
- Integrating Prior Knowledge in Contrastive Learning with Kernel [4.050766659420731]
We use kernel theory to propose a novel loss, called decoupled uniformity, that (i) allows the integration of prior knowledge and (ii) removes the negative-positive coupling in the original InfoNCE loss (an illustrative sketch of this coupling appears after this list).
In an unsupervised setting, we empirically demonstrate that CL benefits from generative models to improve its representation both on natural and medical images.
arXiv Detail & Related papers (2022-06-03T15:43:08Z)
- Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves the performance and state-of-the-art results are achieved on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
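Regarding the decoupled-uniformity entry above: the "negative-positive coupling" in InfoNCE refers to the positive-pair similarity also appearing in the normalizing denominator, so it effectively acts as its own negative. The sketch below contrasts a standard InfoNCE term with a variant whose denominator drops the positive pair; it only illustrates what removing the coupling means and is not the kernel-based decoupled uniformity loss proposed in that paper.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, tau=0.1):
    """Standard InfoNCE: the positive similarity sits inside the log-sum-exp
    denominator, coupling the positive and negative terms."""
    a = F.normalize(anchor, dim=1)
    p = F.normalize(positive, dim=1)
    n = F.normalize(negatives, dim=1)
    pos = (a * p).sum(dim=1, keepdim=True) / tau   # (B, 1) positive similarities
    neg = a @ n.T / tau                            # (B, K) negative similarities
    logits = torch.cat([pos, neg], dim=1)          # positive is at index 0
    target = torch.zeros(len(a), dtype=torch.long, device=a.device)
    return F.cross_entropy(logits, target)

def decoupled_info_nce(anchor, positive, negatives, tau=0.1):
    """Decoupled variant: the positive pair is excluded from the denominator."""
    a = F.normalize(anchor, dim=1)
    p = F.normalize(positive, dim=1)
    n = F.normalize(negatives, dim=1)
    pos = (a * p).sum(dim=1) / tau
    neg = a @ n.T / tau
    return (-pos + torch.logsumexp(neg, dim=1)).mean()
```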
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the generated summaries and is not responsible for any consequences of their use.