Class Gradient Projection For Continual Learning
- URL: http://arxiv.org/abs/2311.14905v1
- Date: Sat, 25 Nov 2023 02:45:56 GMT
- Title: Class Gradient Projection For Continual Learning
- Authors: Cheng Chen, Ji Zhang, Jingkuan Song, Lianli Gao
- Abstract summary: Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks.
- Score: 99.105266615448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Catastrophic forgetting is one of the most critical challenges in Continual
Learning (CL). Recent approaches tackle this problem by projecting the gradient
update orthogonal to the gradient subspace of existing tasks. While the results
are remarkable, those approaches ignore the fact that these calculated
gradients are not guaranteed to be orthogonal to the gradient subspace of each
class due to the class deviation in tasks, e.g., distinguishing "Man" from
"Sea" vs. differentiating "Boy" from "Girl". Therefore, this strategy may
still cause catastrophic forgetting for some classes. In this paper, we propose
Class Gradient Projection (CGP), which calculates the gradient subspace from
individual classes rather than tasks. Gradient update orthogonal to the
gradient subspace of existing classes can be effectively utilized to minimize
interference from other classes. To improve the generalization and efficiency,
we further design a Base Refining (BR) algorithm to combine similar classes and
refine class bases dynamically. Moreover, we leverage a contrastive learning
method to improve the model's ability to handle unseen tasks. Extensive
experiments on benchmark datasets demonstrate the effectiveness of our proposed
approach. It improves upon previous methods by 2.0% on the CIFAR-100 dataset.
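Below is a minimal sketch of the projection step described in the abstract, assuming each previously seen class (or group of similar classes produced by the Base Refining step) is represented by an orthonormal basis of its gradient subspace. The helper names `project_orthogonal`, `apply_cgp`, and the `class_bases` dictionary are illustrative assumptions, not the authors' released implementation.

```python
import torch

def project_orthogonal(grad: torch.Tensor, bases: torch.Tensor) -> torch.Tensor:
    """Remove the component of `grad` lying in the subspace spanned by `bases`.

    grad:  flattened gradient of one layer, shape (d,)
    bases: orthonormal basis vectors collected from previous classes, shape (d, k)
    """
    coeff = bases.T @ grad        # coordinates of grad inside the class subspace, (k,)
    in_subspace = bases @ coeff   # projection of grad onto that subspace, (d,)
    return grad - in_subspace     # keep only the component orthogonal to old classes

def apply_cgp(model: torch.nn.Module, class_bases: dict) -> None:
    """Project every layer's gradient before the optimizer step so the update
    interferes as little as possible with previously learned classes."""
    for name, param in model.named_parameters():
        if param.grad is not None and name in class_bases:
            flat = param.grad.view(-1)
            projected = project_orthogonal(flat, class_bases[name])
            param.grad.copy_(projected.view_as(param.grad))
```

In such a setup the bases would typically be obtained from an SVD of representations collected per class, merged for similar classes by Base Refining, and the projection applied after `loss.backward()` and before `optimizer.step()`.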
Related papers
- Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks.
Approximate MU is a practical method for large-scale models.
We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z) - An Effective Dynamic Gradient Calibration Method for Continual Learning [11.555822066922508]
Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks.
Due to the memory limit, we cannot store all the historical data, and therefore confront the "catastrophic forgetting" problem.
We develop an effective algorithm to calibrate the gradient in each updating step of the model.
arXiv Detail & Related papers (2024-07-30T16:30:09Z) - Visual Prompt Tuning in Null Space for Continual Learning [51.96411454304625]
Existing prompt-tuning methods have demonstrated impressive performance in continual learning (CL).
This paper aims to learn each task by tuning the prompts in the direction orthogonal to the subspace spanned by previous tasks' features.
In practice, an effective null-space-based approximation solution has been proposed to implement the prompt gradient projection.
arXiv Detail & Related papers (2024-06-09T05:57:40Z) - Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier learning in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z) - Gradient-Semantic Compensation for Incremental Semantic Segmentation [43.00965727428193]
Incremental semantic segmentation aims to continually learn the segmentation of new coming classes without accessing the training data of previously learned classes.
We propose a Gradient-Semantic Compensation model, which tackles incremental semantic segmentation from both gradient and semantic perspectives.
arXiv Detail & Related papers (2023-07-20T12:32:25Z) - Continual Learning with Scaled Gradient Projection [8.847574864259391]
In neural networks, continual learning results in gradient interference among sequential tasks, leading to forgetting of old tasks while learning new ones.
We propose a Scaled Gradient Projection (SGP) method to improve new learning while minimizing forgetting.
We conduct experiments ranging from continual image classification to reinforcement learning tasks and report better performance with less training overhead than the state-of-the-art approaches.
arXiv Detail & Related papers (2023-02-02T19:46:39Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - Gradient Correction beyond Gradient Descent [63.33439072360198]
Gradient correction is arguably the most crucial aspect of training a neural network.
We introduce a framework (GCGD) to perform gradient correction.
Experimental results show that our gradient correction framework can effectively improve gradient quality, reducing training epochs by roughly 20% while also improving network performance.
arXiv Detail & Related papers (2022-03-16T01:42:25Z) - Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z) - Class-incremental Learning with Rectified Feature-Graph Preservation [24.098892115785066]
A central theme of this paper is to learn new classes that arrive in sequential phases over time.
We propose a weighted-Euclidean regularization for old knowledge preservation.
We show how it can work with binary cross-entropy to increase class separation for effective learning of new classes (a minimal sketch of one such combination follows this list).
arXiv Detail & Related papers (2020-12-15T07:26:04Z)
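Expanding on the last entry above, here is a minimal sketch of how a weighted-Euclidean preservation term might be combined with binary cross-entropy in a class-incremental setting. The weighting scheme, tensor shapes, and the name `incremental_loss` are illustrative assumptions, not that paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def incremental_loss(new_logits, old_logits, targets, weights, lam=1.0):
    """Binary cross-entropy over all classes plus a weighted-Euclidean penalty
    that keeps old-class logits close to the frozen previous model's outputs.

    new_logits: (B, C_old + C_new) outputs of the current model
    old_logits: (B, C_old) outputs of the frozen previous model
    targets:    (B, C_old + C_new) multi-hot labels
    weights:    (C_old,) per-class preservation weights (illustrative choice)
    """
    c_old = old_logits.shape[1]
    bce = F.binary_cross_entropy_with_logits(new_logits, targets.float())
    # Weighted squared-Euclidean distance on the previously learned classes only.
    diff = new_logits[:, :c_old] - old_logits
    preserve = (weights * diff.pow(2)).sum(dim=1).mean()
    return bce + lam * preserve
```

The trade-off parameter `lam` and the per-class weights would be tuned per benchmark; the point of the sketch is only how a preservation regularizer and a BCE objective can be combined in one loss.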