HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with
Auxiliary Tasks
- URL: http://arxiv.org/abs/2008.11643v1
- Date: Wed, 26 Aug 2020 16:04:02 GMT
- Title: HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with
Auxiliary Tasks
- Authors: Sam Verboven, Muhammad Hafeez Chaudhary, Jeroen Berrevoets, Wouter
Verbeke
- Abstract summary: Multi-task learning (MTL) can improve performance on a task by sharing representations with one or more related auxiliary-tasks.
Usually, MTL-networks are trained on a composite loss function formed by a constant weighted combination of the separate task losses.
In practice, constant loss weights lead to poor results for two reasons: (i) the relevance of the auxiliary tasks can gradually drift throughout the learning process; (ii) for mini-batch based optimisation, the optimal task weights vary significantly from one update to the next depending on mini-batch sample composition.
We introduce HydaLearn, an intelligent weighting algorithm that connects main-task gain to the individual task gradients, in order to inform dynamic loss weighting at the mini-batch level.
- Score: 4.095907708855597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning (MTL) can improve performance on a task by sharing
representations with one or more related auxiliary-tasks. Usually, MTL-networks
are trained on a composite loss function formed by a constant weighted
combination of the separate task losses. In practice, constant loss weights
lead to poor results for two reasons: (i) the relevance of the auxiliary tasks
can gradually drift throughout the learning process; (ii) for mini-batch based
optimisation, the optimal task weights vary significantly from one update to
the next depending on mini-batch sample composition. We introduce HydaLearn, an
intelligent weighting algorithm that connects main-task gain to the individual
task gradients, in order to inform dynamic loss weighting at the mini-batch
level, addressing (i) and (ii). Using HydaLearn, we report performance increases on
synthetic data, as well as on two supervised learning domains.
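As a rough illustration of the idea, the sketch below contrasts the constant-weight composite loss with a per-mini-batch weighting rule that scores each task's gradient on the shared parameters by its first-order alignment with the main-task gradient, as a proxy for main-task gain. This is a minimal sketch under assumed PyTorch conventions, not the paper's exact algorithm; all function and parameter names are hypothetical, and every loss is assumed to depend on all shared parameters.

```python
import torch

def composite_loss(losses, weights):
    """Constant-weight baseline: L = sum_k w_k * L_k."""
    return sum(w * l for w, l in zip(weights, losses))

def gain_based_weights(main_loss, aux_losses, shared_params, lr=1e-3):
    """Per-mini-batch weighting sketch: approximate the main-task gain of stepping
    along each task's gradient by its dot product with the main-task gradient on
    the shared parameters, clamp negative scores to zero, and normalise."""
    def flat_grad(loss):
        grads = torch.autograd.grad(loss, shared_params, retain_graph=True)
        return torch.cat([g.flatten() for g in grads])

    g_main = flat_grad(main_loss)
    scores = []
    for loss in [main_loss] + list(aux_losses):
        g = flat_grad(loss)
        scores.append(torch.clamp(lr * torch.dot(g_main, g), min=0.0))
    total = torch.stack(scores).sum() + 1e-12
    return [s / total for s in scores]
```

The returned weights would then be passed to `composite_loss` for the current mini-batch, so tasks whose gradients do not help the main task on this batch contribute little or nothing to the update.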
Related papers
- CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction
Tuning [20.58878416527427]
We propose a novel Comprehensive Task Balancing algorithm for multi-task visual instruction tuning of LMMs.
Our CoTBal leads to superior overall performance in multi-task visual instruction tuning.
arXiv Detail & Related papers (2024-03-07T09:11:16Z)
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to make predictions for multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimises the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
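For the instance-level step, a minimal sketch (assuming a precomputed per-instance difficulty score, which Data-CUBE itself would define more carefully) might look like this:

```python
import numpy as np

def easy_to_difficult_batches(instances, difficulty, batch_size):
    """Sort one task's instances by an assumed per-instance difficulty score and cut
    them into mini-batches that run from easy to difficult; the scoring function and
    batch size here are illustrative rather than the method's actual choices."""
    order = np.argsort(difficulty)                       # easiest instances first
    ordered = [instances[i] for i in order]
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```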
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks [0.0]
Multi-task learning (MTL) can improve the generalization performance of neural networks by sharing representations with related tasks.
MTL can also degrade performance through harmful interference between tasks.
We propose SLGrad, a sample-level weighting algorithm for multi-task learning with auxiliary tasks.
arXiv Detail & Related papers (2023-06-07T15:29:46Z)
- Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) models improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
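A toy version of such task-aware routing, sketched in PyTorch with hypothetical layer sizes and top-1 routing (the paper's gating design may differ), could look like this:

```python
import torch
import torch.nn as nn

class TaskAwareMoE(nn.Module):
    """Illustrative sparsely activated layer: a task-conditioned gate routes each
    example to a single expert, so compute per example matches one dense layer
    even though the full layer holds many experts' worth of parameters."""
    def __init__(self, dim, n_experts=8, n_tasks=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Embedding(n_tasks, n_experts)   # task-aware gating scores

    def forward(self, x, task_id):
        # x: (batch, dim); task_id: LongTensor of shape (batch,)
        scores = torch.softmax(self.gate(task_id), dim=-1)   # (batch, n_experts)
        weight, expert = scores.max(dim=-1)                   # top-1 routing
        out = torch.zeros_like(x)
        for e in expert.unique():
            mask = expert == e
            out[mask] = weight[mask, None] * self.experts[int(e)](x[mask])
        return out
```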
arXiv Detail & Related papers (2022-04-16T00:56:12Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
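In spirit, this amounts to freezing the shared base and training only a small set of layers per task; a minimal sketch is given below, with the adapted layers passed in by name rather than learned during training as TAPS actually does (the layer names are hypothetical).

```python
import torch.nn as nn

def tune_task_specific_subset(model: nn.Module, layers_to_adapt):
    """Freeze the shared base model and leave only a small, task-specific subset of
    layers trainable; returns the list of parameters that remain trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(prefix) for prefix in layers_to_adapt)
    return [p for p in model.parameters() if p.requires_grad]

# Example (hypothetical names): adapt only the last ResNet block and the classifier head.
# trainable = tune_task_specific_subset(resnet, ["layer4", "fc"])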
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to exploit task relatedness to improve performance on individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task Feature Learning [0.0]
Multi-task learning (MTL) can improve task performance overall relative to single-task learning (STL), but can hide negative transfer (NT).
Asymmetric multitask feature learning (AMTFL) is an approach that tries to address this by allowing tasks with higher loss values to have a smaller influence on feature representations for learning other tasks.
We present examples of NT in two datasets (image recognition and pharmacogenomics) and tackle this challenge by using aleatoric homoscedastic uncertainty to capture the relative confidence between tasks and to set the weights for the task losses.
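The weighting scheme described here follows the familiar homoscedastic-uncertainty formulation; a compact sketch of that general recipe (not necessarily the paper's exact AMTFL variant) is:

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Homoscedastic-uncertainty weighting sketch: each task gets a learned
    log-variance s_k, and the combined loss is sum_k exp(-s_k) * L_k + s_k, so
    tasks the model is less confident about receive smaller effective weights."""
    def __init__(self, n_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, task_losses):
        total = 0.0
        for loss, s in zip(task_losses, self.log_vars):
            total = total + torch.exp(-s) * loss + s
        return total
```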
arXiv Detail & Related papers (2020-12-17T13:30:45Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Knowledge Distillation for Multi-task Learning [38.20005345733544]
Multi-task learning (MTL) learns a single model that performs multiple tasks, aiming for good performance on all tasks at a lower computational cost.
Learning such a model requires jointly optimising the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics.
We propose a knowledge distillation based method in this work to address the imbalance problem in multi-task learning.
arXiv Detail & Related papers (2020-07-14T08:02:42Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
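A bare-bones sketch of that projection step, assuming the per-task gradients are already flattened into vectors and ignoring the random task ordering used in the full method, could read:

```python
import torch

def project_conflicting(grads):
    """Gradient-surgery sketch: for each pair of task gradients that conflict
    (negative dot product), remove from one the component along the other,
    i.e. project it onto the other's normal plane."""
    projected = [g.clone() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:                                  # conflicting gradient
                g_i -= dot / (g_j.norm() ** 2 + 1e-12) * g_j
    return projected
```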
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.