Task Difficulty Aware Parameter Allocation & Regularization for Lifelong
Learning
- URL: http://arxiv.org/abs/2304.05288v1
- Date: Tue, 11 Apr 2023 15:38:21 GMT
- Title: Task Difficulty Aware Parameter Allocation & Regularization for Lifelong
Learning
- Authors: Wenjin Wang, Yunqing Hu, Qianglong Chen, Yin Zhang
- Abstract summary: We propose Parameter Allocation & Regularization (PAR), which adaptively selects an appropriate strategy for each task, either parameter allocation or regularization, based on its learning difficulty.
Our method is scalable and significantly reduces the model's redundancy while improving the model's performance.
- Score: 20.177260510548535
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Parameter regularization or allocation methods are effective in overcoming
catastrophic forgetting in lifelong learning. However, they solve all tasks in
a sequence uniformly and ignore the differences in the learning difficulty of
different tasks. So parameter regularization methods face significant
forgetting when learning a new task very different from learned tasks, and
parameter allocation methods face unnecessary parameter overhead when learning
simple tasks. In this paper, we propose the Parameter Allocation &
Regularization (PAR), which adaptively selects an appropriate strategy for each
task from parameter allocation and regularization based on its learning
difficulty. A task is easy for a model that has already learned related tasks,
and difficult otherwise. We propose a divergence estimation method based on the
Nearest-Prototype distance to measure the task relatedness using only features
of the new task. Moreover, we propose a time-efficient relatedness-aware
sampling-based architecture search strategy to reduce the parameter overhead
for allocation. Experimental results on multiple benchmarks demonstrate that,
compared with SOTAs, our method is scalable and significantly reduces the
model's redundancy while improving the model's performance. Further qualitative
analysis indicates that PAR obtains reasonable task-relatedness.
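For intuition, here is a minimal Python sketch of a Nearest-Prototype distance used as a relatedness proxy, assuming class-mean prototypes and Euclidean distance; the paper's actual divergence estimator may differ in its details.

```python
import numpy as np

def class_prototypes(feats, labels):
    """One mean-feature prototype per class (an assumed, common definition)."""
    return np.stack([feats[labels == c].mean(axis=0) for c in np.unique(labels)])

def nearest_prototype_divergence(new_feats, prototypes):
    """Mean distance from each new-task feature to its nearest stored prototype
    of a learned task; small values suggest the new task is related (easy)."""
    d = np.linalg.norm(new_feats[:, None, :] - prototypes[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# Toy check: features drawn from the learned task's distribution look more
# related (smaller divergence) than features from a shifted distribution.
rng = np.random.default_rng(0)
old_feats = rng.normal(size=(200, 16))
old_labels = rng.integers(0, 5, 200)
prototypes = class_prototypes(old_feats, old_labels)
related = nearest_prototype_divergence(rng.normal(size=(50, 16)), prototypes)
unrelated = nearest_prototype_divergence(rng.normal(loc=3.0, size=(50, 16)), prototypes)
assert related < unrelated
```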
Related papers
- AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task
Learning [19.201899503691266]
We measure the task dominance degree of a parameter by the total updates of each task on this parameter.
We propose a task-wise adaptive learning rate approach, AdaTask, to separate the accumulative gradients, and hence the learning rate, of each task.
Experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks.
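A rough Python sketch of this task-wise adaptive learning rate idea, keeping one accumulator of squared gradients per task for a shared parameter; the hyperparameters and the exact update rule here are illustrative assumptions, not the paper's.

```python
import numpy as np

class TaskWiseAdaptiveLR:
    """Keeps one accumulator of squared gradients per task for a shared
    parameter, so a dominant task does not shrink the effective step size
    of the other tasks (illustrative form, not the paper's exact rule)."""

    def __init__(self, shape, n_tasks, lr=1e-2, beta=0.9, eps=1e-8):
        self.lr, self.beta, self.eps = lr, beta, eps
        self.acc = np.zeros((n_tasks,) + shape)  # per-task accumulated squared grads

    def step(self, param, task_grads):
        """task_grads: one gradient per task w.r.t. the shared parameter."""
        update = np.zeros_like(param)
        for t, g in enumerate(task_grads):
            self.acc[t] = self.beta * self.acc[t] + (1 - self.beta) * g * g
            update += g / (np.sqrt(self.acc[t]) + self.eps)  # task-wise normalization
        return param - self.lr * update

# toy usage with two tasks sharing a 3-dimensional parameter
opt = TaskWiseAdaptiveLR(shape=(3,), n_tasks=2)
w = np.zeros(3)
w = opt.step(w, [np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.1, 0.0])])
```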
arXiv Detail & Related papers (2022-11-28T04:24:38Z) - DiSparse: Disentangled Sparsification for Multitask Model Compression [92.84435347164435]
DiSparse is a simple, effective, and first-of-its-kind multitask pruning and sparse training scheme.
Our experimental results demonstrate superior performance on various configurations and settings.
arXiv Detail & Related papers (2022-06-09T17:57:46Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
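As an illustration of adapting only a small task-specific subset of layers, the sketch below gates a per-layer task residual with a relaxed binary indicator and a sparsity penalty; the gating form and penalty are assumptions for illustration, not TAPS's exact parameterization.

```python
import numpy as np

def taps_style_layer_weight(w_shared, delta, gate_logit, l0_strength=1e-3):
    """Effective weight of one layer: the shared weight plus a gated
    task-specific residual. A sigmoid relaxes the 0/1 "adapt this layer"
    indicator, and a small penalty pushes most gates toward zero so that
    only a few layers become task-specific."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))  # relaxed binary indicator
    w_task = w_shared + gate * delta          # gated layers deviate from the base model
    penalty = l0_strength * gate              # added to the task loss during training
    return w_task, penalty

# toy usage: a nearly closed gate keeps the layer close to the shared weights
w, pen = taps_style_layer_weight(np.ones((4, 4)), np.full((4, 4), 0.5), gate_logit=-4.0)
```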
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - Instance-Level Task Parameters: A Robust Multi-task Weighting Framework [17.639472693362926]
Recent works have shown that deep neural networks benefit from multi-task learning by learning a shared representation across several related tasks.
We let the training process dictate the optimal weighting of tasks for every instance in the dataset.
We conduct extensive experiments on SURREAL and CityScapes datasets, for human shape and pose estimation, depth estimation and semantic segmentation tasks.
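One way to let training dictate per-instance task weights is an uncertainty-style objective with a learned log-variance per (instance, task) pair, sketched below; this particular form is an assumption, not necessarily the paper's exact objective.

```python
import numpy as np

def instance_task_weighted_loss(task_losses, log_vars):
    """task_losses, log_vars: arrays of shape (batch, n_tasks). Each
    (instance, task) pair carries a learned log-variance s; the term
    exp(-s) * L + s lets training down-weight tasks that are unreliable
    for a particular instance while discouraging trivial all-zero weights."""
    return float(np.mean(np.exp(-log_vars) * task_losses + log_vars))

# toy usage with a batch of 2 instances and 3 tasks
losses = np.array([[0.5, 2.0, 1.0], [0.3, 0.1, 4.0]])
log_vars = np.zeros_like(losses)  # in practice these are learned parameters
total = instance_task_weighted_loss(losses, log_vars)
```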
arXiv Detail & Related papers (2021-06-11T02:35:42Z) - TAG: Task-based Accumulated Gradients for Lifelong learning [21.779858050277475]
We propose a task-aware system that adapts the learning rate based on the relatedness among tasks.
We empirically show that our proposed adaptive learning rate not only accounts for catastrophic forgetting but also allows positive backward transfer.
arXiv Detail & Related papers (2021-05-11T16:10:32Z) - Efficient Continual Adaptation for Generative Adversarial Networks [97.20244383723853]
We present a continual learning approach for generative adversarial networks (GANs).
Our approach is based on learning a set of global and task-specific parameters.
We show that the feature-map transformation based approach outperforms state-of-the-art continual GANs methods.
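A minimal sketch of a feature-map transformation combining global and task-specific parameters, assuming a per-channel scale-and-shift form; the paper's transformation may be parameterized differently.

```python
import numpy as np

def task_adapted_feature_map(feature_map, gamma_t, beta_t):
    """Modulate globally shared convolutional features with small
    task-specific scale/shift parameters, so each new task adds only
    a few parameters instead of a new generator.
    feature_map: (channels, H, W); gamma_t, beta_t: (channels,)."""
    return gamma_t[:, None, None] * feature_map + beta_t[:, None, None]

# toy usage: a task that rescales channel 0 and shifts channel 1
fmap = np.random.default_rng(1).normal(size=(2, 8, 8))
adapted = task_adapted_feature_map(fmap, gamma_t=np.array([2.0, 1.0]),
                                   beta_t=np.array([0.0, 0.5]))
```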
arXiv Detail & Related papers (2021-03-06T05:09:37Z) - Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework.
We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark.
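A small sketch of the diff-pruning idea: task parameters are the frozen pretrained weights plus a sparse learned difference, with the paper's hard-concrete mask relaxation simplified here to a sigmoid for illustration.

```python
import numpy as np

def diff_pruned_params(pretrained, diff, mask_logit, l0_strength=1e-4):
    """Task parameters = frozen pretrained weights + masked difference vector.
    The mask is a relaxed 0/1 variable; its penalty keeps the stored diff
    sparse, so each task costs only a small fraction of the model size."""
    mask = 1.0 / (1.0 + np.exp(-mask_logit))  # relaxed binary mask (sigmoid here)
    task_params = pretrained + mask * diff     # entries with open masks deviate from pretrained
    penalty = l0_strength * mask.sum()         # added to the task loss during training
    return task_params, penalty

# toy usage on a 5-dimensional "model"
theta = np.ones(5)
params, pen = diff_pruned_params(theta, diff=np.array([0.2, 0.0, 0.0, 0.0, -0.1]),
                                 mask_logit=np.array([4.0, -4.0, -4.0, -4.0, 4.0]))
```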
arXiv Detail & Related papers (2020-12-14T12:34:01Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization, in which data collected across different domains helps improve learning performance on each of the tasks.
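A compact sketch of this coupling idea, assuming a quadratic penalty that keeps each task's parameters close to a shared central variable; the coupling strength rho and the quadratic form are assumptions for illustration.

```python
import numpy as np

def cross_learning_objective(task_params, task_losses, central_params, rho=0.1):
    """Sum of per-task losses plus a quadratic penalty that keeps each task's
    parameters close to a shared central variable, so data from one domain
    indirectly regularizes the models of the other domains."""
    coupling = sum(np.sum((w - central_params) ** 2) for w in task_params)
    return float(sum(task_losses) + rho * coupling)

# toy usage: two tasks whose parameters are pulled toward a common center
obj = cross_learning_objective(
    task_params=[np.array([1.0, 0.0]), np.array([0.8, 0.2])],
    task_losses=[0.4, 0.6],
    central_params=np.array([0.9, 0.1]),
)
```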
arXiv Detail & Related papers (2020-10-24T21:35:57Z) - The Sample Complexity of Meta Sparse Regression [38.092179552223364]
This paper addresses the meta-learning problem in sparse linear regression with infinite tasks.
We show that T ∈ O((k log p) / l) tasks are sufficient to recover the common support of all tasks.
arXiv Detail & Related papers (2020-02-22T00:59:53Z) - Meta Reinforcement Learning with Autonomous Inference of Subtask
Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)