Enhancing Continual Learning with Global Prototypes: Counteracting
Negative Representation Drift
- URL: http://arxiv.org/abs/2205.12186v2
- Date: Thu, 30 Mar 2023 17:15:42 GMT
- Title: Enhancing Continual Learning with Global Prototypes: Counteracting
Negative Representation Drift
- Authors: Xueying Bai, Jinghuan Shang, Yifan Sun, Niranjan Balasubramanian
- Abstract summary: Continual learning aims to learn a sequence of tasks over time, with data distributions shifting from one task to another.
Some negative representation drift can result in catastrophic forgetting by causing the locally learned class prototypes and data representations to correlate poorly across tasks.
We propose a method that finds global prototypes to guide the learning and learns data representations with regularization from self-supervised information.
- Score: 16.177180198865848
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning (CL) aims to learn a sequence of tasks over time, with
data distributions shifting from one task to another. When training on new task
data, data representations from old tasks may drift. Some negative
representation drift can result in catastrophic forgetting by causing the
locally learned class prototypes and data representations to correlate poorly
across tasks. To mitigate such representation drift, we propose a method that
finds global prototypes to guide learning and learns data representations
with regularization from self-supervised information. Specifically, for
NLP tasks, we formulate each task in a masked language modeling style, and
learn the task via a neighbor attention mechanism over a pre-trained language
model. Experimental results show that our proposed method can learn fairly
consistent representations with less representation drift, and significantly
reduce catastrophic forgetting in CL without resampling data from past tasks.
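As a rough, non-authoritative sketch of the objective described above (not the authors' released code): one training step can combine a classification term against fixed global prototypes with a masked-language-modeling regularizer on the same encoder. The HuggingFace-style encoder/head interface, mean pooling, temperature tau, and weight lambda_ssl below are assumptions, and the paper's neighbor attention mechanism is not reproduced here.

```python
import torch.nn.functional as F

def prototype_guided_step(encoder, mlm_head, global_prototypes,
                          input_ids, attention_mask, labels,
                          mlm_input_ids, mlm_labels,
                          lambda_ssl=0.1, tau=0.1):
    # Token states from a pre-trained language model (HuggingFace-style API assumed).
    hidden = encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
    # Mean-pool over non-padding tokens to get one representation per example.
    mask = attention_mask.unsqueeze(-1).float()
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-6)

    # (a) Classify by cosine similarity to fixed global prototypes, so class
    #     anchors are shared across tasks rather than learned locally per task.
    logits = F.normalize(pooled, dim=-1) @ F.normalize(global_prototypes, dim=-1).T
    cls_loss = F.cross_entropy(logits / tau, labels)

    # (b) Self-supervised regularization: masked-token prediction with the same encoder.
    mlm_hidden = encoder(input_ids=mlm_input_ids,
                         attention_mask=attention_mask).last_hidden_state
    mlm_logits = mlm_head(mlm_hidden)
    ssl_loss = F.cross_entropy(mlm_logits.view(-1, mlm_logits.size(-1)),
                               mlm_labels.view(-1), ignore_index=-100)

    return cls_loss + lambda_ssl * ssl_loss
```

In this sketch, keeping global_prototypes fixed across tasks is what anchors the class geometry, while the MLM term acts as the self-supervised regularizer on the representations.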
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning [41.984652077669104]
Experimental results on standard datasets indicate that our method outperforms the state-of-the-art approaches significantly.
Our method exhibits strong robustness and superiority in different settings and degrees of data heterogeneity.
arXiv Detail & Related papers (2024-01-04T06:46:19Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real world distribution shift benchmarks, and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- Prototype-Sample Relation Distillation: Towards Replay-Free Continual Learning [14.462797749666992]
We propose a holistic approach to jointly learn the representation and class prototypes.
We propose a novel distillation loss that constrains class prototypes to maintain their relative similarities to new-task data representations (an illustrative sketch of such a relation-distillation term appears after this list).
This method yields state-of-the-art performance in the task-incremental setting.
arXiv Detail & Related papers (2023-03-26T16:35:45Z)
- Provable and Efficient Continual Representation Learning [40.78975699391065]
In continual learning (CL), the goal is to design models that can learn a sequence of tasks without catastrophic forgetting.
We study the problem of continual representation learning where we learn an evolving representation as new tasks arrive.
We show that CL benefits if the initial tasks have large sample size and high "representation diversity".
arXiv Detail & Related papers (2022-03-03T21:23:08Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantees learning a good representation.
We prove that the linear layer yields a small approximation error even for complex ground-truth function classes.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
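For the Prototype-Sample Relation Distillation entry above, the following is only an illustrative guess from the one-sentence summary, not the authors' formulation: a relation-distillation term can ask class prototypes and new-task samples to keep the same similarity structure under the current encoder as under a frozen snapshot taken before the new task. The transformer-style encoder interface, [CLS] pooling, temperature, and KL form are all assumptions.

```python
import torch
import torch.nn.functional as F

def prototype_relation_distillation(encoder, frozen_encoder, prototypes,
                                    input_ids, attention_mask, tau=0.1):
    # Prototype-sample relations under the frozen snapshot taken before the new task.
    with torch.no_grad():
        old_h = frozen_encoder(input_ids=input_ids,
                               attention_mask=attention_mask).last_hidden_state[:, 0]
        old_rel = F.softmax(F.normalize(old_h, dim=-1)
                            @ F.normalize(prototypes, dim=-1).T / tau, dim=-1)
    # Relations under the current encoder being trained on the new task.
    new_h = encoder(input_ids=input_ids,
                    attention_mask=attention_mask).last_hidden_state[:, 0]
    new_rel = F.log_softmax(F.normalize(new_h, dim=-1)
                            @ F.normalize(prototypes, dim=-1).T / tau, dim=-1)
    # Penalize changes in the prototype-sample similarity structure.
    return F.kl_div(new_rel, old_rel, reduction="batchmean")
```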
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.