Investigating the Impact of Weight Sharing Decisions on Knowledge
Transfer in Continual Learning
- URL: http://arxiv.org/abs/2311.09506v3
- Date: Mon, 18 Dec 2023 21:41:42 GMT
- Title: Investigating the Impact of Weight Sharing Decisions on Knowledge
Transfer in Continual Learning
- Authors: Josh Andle, Ali Payani, Salimeh Yasaei-Sekeh
- Abstract summary: Continual Learning (CL) has generated attention as a method of avoiding Catastrophic Forgetting (CF) in the sequential training of neural networks.
This paper investigates how different sharing decisions affect the Forward Knowledge Transfer (FKT) between tasks.
- Score: 7.25130576615102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual Learning (CL) has generated attention as a method of avoiding
Catastrophic Forgetting (CF) in the sequential training of neural networks,
improving network efficiency and adaptability to different tasks. Additionally,
CL serves as an ideal setting for studying network behavior and Forward
Knowledge Transfer (FKT) between tasks. Pruning methods for CL train
subnetworks to handle the sequential tasks, which allows us to take a structured
approach to investigating FKT. Sharing prior subnetworks' weights leverages
past knowledge for the current task through FKT. Understanding which weights to
share is important as sharing all weights can yield sub-optimal accuracy. This
paper investigates how different sharing decisions affect the FKT between
tasks. Through this lens we demonstrate how task complexity and similarity
influence the optimal weight sharing decisions, giving insights into the
relationships between tasks and helping inform decision making in similar CL
methods. We implement three sequential datasets designed to emphasize variation
in task complexity and similarity, reporting results for both ResNet-18 and
VGG-16. By sharing in accordance with the decisions supported by our findings,
we show that we can improve task accuracy compared to other sharing decisions.
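To make the mechanism concrete, the sketch below shows one common way pruning-based CL methods realize weight sharing: each task trains inside a masked subnetwork, weights claimed by earlier tasks are frozen, and a per-task share mask decides which of those frozen weights the current task may reuse for FKT. The layer class and mask names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed PyTorch layer, not the paper's exact method) of
# mask-based weight sharing in pruning-style continual learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Linear layer whose weights are split into (a) weights frozen by
    earlier tasks' subnetworks and (b) weights still free for the current task."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        # 1 where a weight was kept by a previous task's subnetwork and is frozen.
        self.register_buffer("frozen_mask", torch.zeros(out_features, in_features))
        # 1 where a frozen weight is reused by the current task; choosing this
        # mask is the "sharing decision" the paper studies.
        self.register_buffer("share_mask", torch.zeros(out_features, in_features))

    def forward(self, x):
        free = self.weight * (1.0 - self.frozen_mask)                        # trainable now
        reused = self.weight.detach() * self.frozen_mask * self.share_mask   # FKT, not updated
        return F.linear(x, free + reused)
```

Setting `share_mask` to all ones corresponds to sharing every prior weight, which the abstract notes can yield sub-optimal accuracy; a decision informed by task complexity and similarity would instead set the mask selectively, for example per layer.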
Related papers
- Ensemble Learning via Knowledge Transfer for CTR Prediction [9.891226177252653]
In this paper, we investigate larger ensemble networks and find three inherent limitations in commonly used ensemble learning methods.
We propose a novel model-agnostic Ensemble Knowledge Transfer Framework (EKTF).
Experimental results on five real-world datasets demonstrate the effectiveness and compatibility of EKTF.
arXiv Detail & Related papers (2024-11-25T06:14:20Z)
- Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z)
- PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning [28.353530290015794]
We propose PEMT, a novel parameter-efficient fine-tuning framework based on multi-task transfer learning.
We conduct experiments on a broad range of tasks over 17 datasets.
arXiv Detail & Related papers (2024-02-23T03:59:18Z)
- Similarity-based Knowledge Transfer for Cross-Domain Reinforcement Learning [3.3148826359547523]
We develop a semi-supervised alignment loss to match different spaces with a set of encoder-decoders.
In comparison to prior works, our method does not require data to be aligned, paired or collected by expert policies.
arXiv Detail & Related papers (2023-12-05T19:26:01Z)
- Evaluating the structure of cognitive tasks with transfer learning [67.22168759751541]
This study investigates the transferability of deep learning representations between different EEG decoding tasks.
We conduct extensive experiments using state-of-the-art decoding models on two recently released EEG datasets.
arXiv Detail & Related papers (2023-07-28T14:51:09Z)
- Parameter-Level Soft-Masking for Continual Learning [12.290968171255349]
A novel technique (called SPG) is proposed that soft-masks parameter updating in training based on the importance of each parameter to old tasks.
To our knowledge, this is the first work that soft-masks a model at the parameter-level for continual learning. (A minimal sketch of this idea appears after this list.)
arXiv Detail & Related papers (2023-06-26T15:35:27Z)
- Factorizing Knowledge in Neural Networks [65.57381498391202]
We propose a novel knowledge-transfer task, Knowledge Factorization (KF).
KF aims to decompose a pretrained source network into several factor networks, each of which handles only a dedicated task and maintains task-specific knowledge factorized from the source network.
We introduce an information-theoretic objective, InfoMax-Bottleneck (IMB), to carry out KF by optimizing the mutual information between the learned representations and the input.
arXiv Detail & Related papers (2022-07-04T09:56:49Z)
- Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task-specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains help improve the learning performance on the other tasks.
arXiv Detail & Related papers (2020-10-24T21:35:57Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
- Federated Continual Learning with Weighted Inter-client Transfer [79.93004004545736]
We propose a novel federated continual learning framework, Federated Weighted Inter-client Transfer (FedWeIT).
FedWeIT decomposes the network weights into global federated parameters and sparse task-specific parameters, and each client receives selective knowledge from other clients.
We validate our FedWeIT against existing federated learning and continual learning methods, and our model significantly outperforms them with a large reduction in the communication cost.
arXiv Detail & Related papers (2020-03-06T13:33:48Z)
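As referenced in the SPG entry above, the following is a minimal sketch of parameter-level soft-masking: gradients are scaled elementwise by how important each parameter is to previously learned tasks, so important parameters change little while unimportant ones remain free to learn. The importance estimate and the hook-based implementation are assumptions for illustration, not SPG's exact procedure.

```python
# Illustrative sketch (assumed PyTorch hooks, not SPG's exact rule) of
# parameter-level soft-masking for continual learning.
import torch
import torch.nn as nn

def attach_soft_masks(model: nn.Module, importance: dict) -> None:
    """importance maps a parameter name to a tensor of per-weight importance
    scores in [0, 1], accumulated from previously trained tasks."""
    for name, param in model.named_parameters():
        if name in importance:
            imp = importance[name].clamp(0.0, 1.0)
            # Soft-mask the update: parameters important to old tasks
            # (imp close to 1) receive almost no gradient, which limits
            # forgetting without hard-freezing any weights.
            param.register_hook(lambda grad, imp=imp: grad * (1.0 - imp))
```

A simple importance estimate for this sketch could be the normalized magnitude of past-task gradients per parameter; the criterion actually used by SPG differs and is described in that paper.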
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.