Provable and Efficient Continual Representation Learning
- URL: http://arxiv.org/abs/2203.02026v1
- Date: Thu, 3 Mar 2022 21:23:08 GMT
- Title: Provable and Efficient Continual Representation Learning
- Authors: Yingcong Li, Mingchen Li, M. Salman Asif, Samet Oymak
- Abstract summary: In continual learning (CL), the goal is to design models that can learn a sequence of tasks without catastrophic forgetting.
We study the problem of continual representation learning where we learn an evolving representation as new tasks arrive.
We show that CL benefits if the initial tasks have large sample size and high "representation diversity".
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In continual learning (CL), the goal is to design models that can learn a
sequence of tasks without catastrophic forgetting. While there is a rich set of
techniques for CL, relatively little is understood about how
representations built by previous tasks benefit new tasks that are added to the
network. To address this, we study the problem of continual representation
learning (CRL) where we learn an evolving representation as new tasks arrive.
Focusing on zero-forgetting methods where tasks are embedded in subnetworks
(e.g., PackNet), we first provide experiments demonstrating CRL can
significantly boost sample efficiency when learning new tasks. To explain this,
we establish theoretical guarantees for CRL by providing sample complexity and
generalization error bounds for new tasks by formalizing the statistical
benefits of previously-learned representations. Our analysis and experiments
also highlight the importance of the order in which we learn the tasks.
Specifically, we show that CL benefits if the initial tasks have large sample
size and high "representation diversity". Diversity ensures that adding new
tasks incurs only a small representation mismatch, so new tasks can be learned
with few samples while training only a few additional nonzero weights. Finally,
we ask whether one can ensure that each task's subnetwork is efficient at
inference time while retaining the benefits of representation learning. To this end, we propose an
inference-efficient variation of PackNet called Efficient Sparse PackNet (ESPN)
which employs joint channel & weight pruning. ESPN embeds tasks in
channel-sparse subnets requiring up to 80% fewer FLOPs to compute while
approximately retaining accuracy and is very competitive with a variety of
baselines. In summary, this work takes a step towards data and
compute-efficient CL with a representation learning perspective. GitHub page:
https://github.com/ucr-optml/CtRL
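
To make the subnetwork-embedding idea concrete, below is a minimal PyTorch sketch of how a PackNet-style zero-forgetting layer can be organized: each task claims a subset of weights via a binary ownership mask, weights frozen by earlier tasks remain usable by later tasks (which is where the representation-learning benefit comes from), and gradients never reach claimed weights. This is not the authors' implementation; `MaskedLinear`, `claim_weights`, and the `keep_frac` pruning fraction are illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """Linear layer whose weights are partitioned across tasks by binary masks."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        # owner[i, j] == 0   -> weight is still free
        # owner[i, j] == t+1 -> weight was claimed (and frozen) by task t
        self.register_buffer(
            "owner", torch.zeros(out_features, in_features, dtype=torch.long)
        )
        # Zero-forgetting: gradients never reach weights already claimed by a task.
        self.weight.register_hook(lambda g: g * (self.owner == 0).float())

    def forward(self, x: torch.Tensor, task_id: int, training_current: bool = False) -> torch.Tensor:
        # Weights frozen by tasks 0..task_id are always usable, so a new task
        # builds on the representation learned so far; free weights participate
        # only while the current task is being trained.
        usable = (self.owner > 0) & (self.owner <= task_id + 1)
        if training_current:
            usable = usable | (self.owner == 0)
        return F.linear(x, self.weight * usable.float())

    @torch.no_grad()
    def claim_weights(self, task_id: int, keep_frac: float = 0.2) -> None:
        # After training task `task_id` on the free weights, keep the
        # largest-magnitude fraction for this task and release the rest.
        free = self.owner == 0
        scores = self.weight.abs()[free]
        if scores.numel() == 0:
            return
        k = max(1, int(keep_frac * scores.numel()))
        threshold = scores.topk(k).values.min()
        claimed = free & (self.weight.abs() >= threshold)
        self.owner[claimed] = task_id + 1
        # Unclaimed free weights are reset so they carry no task-specific signal.
        self.weight[free & ~claimed] = 0.0


# Typical flow (hypothetical training loop omitted):
#   layer(x, task_id=0, training_current=True)   # train task 0 on free weights
#   layer.claim_weights(task_id=0)               # freeze task 0's subnetwork
#   layer(x, task_id=1, training_current=True)   # task 1 reuses task 0's frozen weights
```

ESPN, as described in the abstract, would additionally restrict each task's mask to a sparse set of channels (joint channel & weight pruning), so that entire channels can be skipped at inference time.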
Related papers
- Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server [16.249442761713322]
"Loop Improvement" (LI) is a novel method enhancing this separation and feature extraction without necessitating a central server or data interchange among participants.
In personalized federated learning environments, LI consistently outperforms the advanced FedALA algorithm in accuracy across diverse scenarios.
LI's adaptability extends to multi-task learning, streamlining the extraction of common features across tasks and obviating the need for simultaneous training.
arXiv Detail & Related papers (2024-03-21T12:59:24Z)
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks whose annotations overlap little or not at all.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Provable Benefit of Multitask Representation Learning in Reinforcement Learning [46.11628795660159]
This paper theoretically characterizes the benefit of representation learning under the low-rank Markov decision process (MDP) model.
To the best of our knowledge, this is the first theoretical study that characterizes the benefit of representation learning in exploration-based reward-free multitask reinforcement learning.
arXiv Detail & Related papers (2022-06-13T04:29:02Z)
- Enhancing Continual Learning with Global Prototypes: Counteracting Negative Representation Drift [16.177180198865848]
Continual learning aims to learn a sequence of tasks over time, with data distributions shifting from one task to another.
Some negative representation drift can result in catastrophic forgetting by causing the locally learned class prototypes and data representations to correlate poorly across tasks.
We propose a method that finds global prototypes to guide the learning and learns data representations regularized by self-supervised information.
arXiv Detail & Related papers (2022-05-24T16:41:30Z)
- Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study on resource task sampling by leveraging the techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)