ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning
- URL: http://arxiv.org/abs/2301.12618v3
- Date: Thu, 16 Nov 2023 11:11:39 GMT
- Title: ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning
- Authors: Junguang Jiang, Baixu Chen, Junwei Pan, Ximei Wang, Dapeng Liu, Jie Jiang, Mingsheng Long
- Abstract summary: Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches, automatically searches the varying task weights by minimizing target validation errors, and dynamically merges all branches to filter out detrimental parameter updates.
- Score: 59.08197876733052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Auxiliary-Task Learning (ATL) aims to improve the performance of the target
task by leveraging the knowledge obtained from related tasks. Occasionally,
learning multiple tasks simultaneously results in lower accuracy than learning
only the target task, which is known as negative transfer. This problem is
often attributed to the gradient conflicts among tasks, and is frequently
tackled by coordinating the task gradients in previous works. However, these
optimization-based methods largely overlook the auxiliary-target generalization
capability. To better understand the root cause of negative transfer, we
experimentally investigate it from both optimization and generalization
perspectives. Based on our findings, we introduce ForkMerge, a novel approach
that periodically forks the model into multiple branches, automatically
searches the varying task weights by minimizing target validation errors, and
dynamically merges all branches to filter out detrimental task-parameter
updates. On a series of auxiliary-task learning benchmarks, ForkMerge
outperforms existing methods and effectively mitigates negative transfer.
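Read literally, the abstract describes a fork-train-search-merge loop. The following is a minimal PyTorch-style sketch of one such round; the function names, the grid of candidate task weightings, and the pairwise interpolation scheme are assumptions for illustration, not the paper's exact procedure.

```python
import copy

def forkmerge_round(model, train_branch, target_val_error,
                    task_weightings, merge_coeffs):
    """One fork-merge round (hypothetical API; task_weightings[0] is
    assumed to train on the target task alone)."""
    # Fork: clone the model into one branch per candidate task weighting
    # and train each branch jointly under its weighting.
    branches = [train_branch(copy.deepcopy(model), w) for w in task_weightings]
    target_only = branches[0]

    # Search: interpolate between the target-only branch and each
    # auxiliary branch, scoring every merge by target validation error.
    best_state, best_err = None, float("inf")
    for branch in branches[1:]:
        for lam in merge_coeffs:
            merged = {
                k: (1 - lam) * v + lam * branch.state_dict()[k]
                for k, v in target_only.state_dict().items()
            }
            model.load_state_dict(merged)
            err = target_val_error(model)
            if err < best_err:
                best_state, best_err = merged, err

    # Merge: keep the interpolation that minimized validation error,
    # filtering out detrimental parameter updates from harmful weightings.
    model.load_state_dict(best_state)
    return model, best_err
```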
Related papers
- Task Weighting through Gradient Projection for Multitask Learning [5.5967570276373655]
In multitask learning, conflicts between task gradients are a frequent issue degrading a model's training performance.
In this work, we present a method to adapt the Gradient Projection algorithm PCGrad to simultaneously perform task prioritization.
Unlike traditional task weighting, which scales task losses, our weighting scheme applies only when tasks are in conflict and lets training proceed unhindered otherwise.
arXiv Detail & Related papers (2024-09-03T11:17:44Z)
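As an illustration of the conflict-only projection this summary describes, here is a minimal PyTorch sketch in the spirit of PCGrad; the paper's added task-prioritization weights are omitted, and flattened per-task gradients are assumed.

```python
import torch

def combine_gradients(task_grads):
    """PCGrad-style gradient surgery (sketch): project a task gradient
    only when it conflicts with another; otherwise pass it through."""
    adjusted = [g.clone() for g in task_grads]
    for i, g_i in enumerate(adjusted):
        for j, g_j in enumerate(task_grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:
                # Conflict: drop g_i's component along g_j in place.
                g_i -= (dot / g_j.norm().pow(2)) * g_j
    return torch.stack(adjusted).mean(dim=0)  # combined update direction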
- Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z)
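The joint objective this entry describes reduces to a weighted sum of a supervised dense-prediction loss and a label-free auxiliary loss. A minimal sketch follows; `aux_head.self_supervised_loss` and `aux_weight` are assumed names, since the summary does not specify the auxiliary task.

```python
import torch.nn.functional as F

def composite_step(backbone, dense_head, aux_head, images, dense_labels,
                   optimizer, aux_weight=1.0):
    """One joint step: supervised dense-prediction loss plus a
    self-supervised auxiliary loss on the same images (illustrative)."""
    features = backbone(images)
    target_loss = F.cross_entropy(dense_head(features), dense_labels)
    aux_loss = aux_head.self_supervised_loss(features, images)  # no labels
    loss = target_loss + aux_weight * aux_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return target_loss.item(), aux_loss.item()
```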
- TaskMix: Data Augmentation for Meta-Learning of Spoken Intent Understanding [0.0]
We show that a state-of-the-art data augmentation method worsens meta-overfitting when task diversity is low.
We propose a simple method, TaskMix, which synthesizes new tasks by linearly interpolating existing tasks.
We show that TaskMix outperforms baselines, alleviates overfitting when task diversity is low, and does not degrade performance even when diversity is high.
arXiv Detail & Related papers (2022-09-26T00:37:40Z)
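Linear interpolation of tasks is mixup applied at the task level. The sketch below assumes a task can be represented as a (features, soft_labels) tensor pair and that the coefficient follows mixup's Beta prior; both are assumptions about details the summary leaves open.

```python
import random
import torch

def task_mix(task_a, task_b, alpha=0.4):
    """Synthesize a new meta-learning task by linearly interpolating
    two existing ones (TaskMix-style sketch)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed_x = lam * task_a[0] + (1 - lam) * task_b[0]
    mixed_y = lam * task_a[1] + (1 - lam) * task_b[1]
    return mixed_x, mixed_y

# Usage: pad a low-diversity task pool with synthetic mixtures, e.g.
# task_pool += [task_mix(*random.sample(task_pool, 2)) for _ in range(64)]
```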
- Learning Action Translator for Meta Reinforcement Learning on Sparse-Reward Tasks [56.63855534940827]
This work introduces a novel objective function to learn an action translator among training tasks.
We theoretically verify that the value of the transferred policy with the action translator can be close to the value of the source policy.
We propose to combine the action translator with context-based meta-RL algorithms for better data collection and more efficient exploration during meta-training.
arXiv Detail & Related papers (2022-07-19T04:58:06Z)
- Leveraging convergence behavior to balance conflicting tasks in multi-task learning [3.6212652499950138]
Multi-Task Learning uses correlated tasks to improve performance generalization.
Tasks often conflict with each other, which makes it challenging to define how the gradients of multiple tasks should be combined.
We propose a method that takes into account the temporal behaviour of the gradients to create a dynamic bias that adjusts the importance of each task during backpropagation.
arXiv Detail & Related papers (2022-04-14T01:52:34Z)
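The summary only says that task importance is adjusted from the temporal behaviour of the gradients, so the sketch below is an illustrative stand-in rather than the paper's formula: it tracks per-task gradient norms with an exponential moving average and up-weights tasks that appear far from converging.

```python
import torch

class TemporalGradientWeighter:
    """Illustrative dynamic task weighting from gradient history
    (the EMA rule is an assumption, not the paper's method)."""

    def __init__(self, num_tasks, beta=0.9):
        self.ema = torch.ones(num_tasks)
        self.beta = beta

    def weights(self, grad_norms):
        # Tasks whose current gradients are large relative to their own
        # history receive proportionally more weight in backpropagation.
        self.ema = self.beta * self.ema + (1 - self.beta) * grad_norms
        ratio = grad_norms / (self.ema + 1e-12)
        return ratio / ratio.sum()
```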
- Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss function.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
arXiv Detail & Related papers (2021-10-26T22:03:51Z)
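A simplified sketch of the CAGrad update direction, following the published formulation: the inner problem min over the simplex of g_w . g_0 + sqrt(phi) * ||g_w||, with g_w the weighted gradient combination, g_0 the average gradient, and phi = c^2 ||g_0||^2, is solved numerically here rather than with the paper's own solver.

```python
import numpy as np
from scipy.optimize import minimize

def cagrad_direction(task_grads, c=0.5):
    """Conflict-averse update direction (sketch); `task_grads` is a
    [num_tasks, dim] array, c is the conflict-aversion radius."""
    g0 = task_grads.mean(axis=0)
    phi = c ** 2 * (g0 @ g0)
    n = task_grads.shape[0]

    def inner(w):
        gw = w @ task_grads
        return gw @ g0 + np.sqrt(phi) * np.linalg.norm(gw)

    res = minimize(inner, np.full(n, 1.0 / n), bounds=[(0.0, 1.0)] * n,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    gw = res.x @ task_grads
    # Final direction: average gradient plus a conflict-averse correction.
    return g0 + np.sqrt(phi) / (np.linalg.norm(gw) + 1e-12) * gw
```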
- Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data.
Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation.
We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
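The pseudo-label supervision in the ISL scheme above amounts to keeping only confident predictions on unlabeled target images and treating them as labels. A minimal sketch follows; the confidence threshold is an assumed detail, not a value from the paper.

```python
import torch.nn.functional as F

def self_learning_loss(seg_logits, threshold=0.9):
    """Illustrative self-learning step: confident semantic predictions
    on target-domain images become pseudo-labels that supervise the
    model (`seg_logits` has shape [N, C, H, W])."""
    probs = seg_logits.softmax(dim=1)
    confidence, pseudo_labels = probs.max(dim=1)
    mask = (confidence > threshold).float()
    if mask.sum() == 0:
        return seg_logits.sum() * 0.0  # no confident pixels in this batch
    per_pixel = F.cross_entropy(seg_logits, pseudo_labels, reduction="none")
    return (per_pixel * mask).sum() / mask.sum()
```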
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we allow overlapping features and differentiate the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.