HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with
Auxiliary Tasks
- URL: http://arxiv.org/abs/2008.11643v1
- Date: Wed, 26 Aug 2020 16:04:02 GMT
- Title: HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with
Auxiliary Tasks
- Authors: Sam Verboven, Muhammad Hafeez Chaudhary, Jeroen Berrevoets, Wouter
Verbeke
- Abstract summary: Multi-task learning (MTL) can improve performance on a task by sharing representations with one or more related auxiliary-tasks.
Usually, MTL-networks are trained on a composite loss function formed by a constant weighted combination of the separate task losses.
In practice, constant loss weights lead to poor results for two reasons: (i) the relevance of the auxiliary tasks can gradually drift throughout the learning process; (ii) for mini-batch based optimisation, the optimal task weights vary significantly from one update to the next depending on mini-batch sample composition.
We introduce HydaLearn, an intelligent weighting algorithm that connects main-task gain to the individual task gradients, in order to inform dynamic loss weighting at the mini-batch level.
- Score: 4.095907708855597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning (MTL) can improve performance on a task by sharing
representations with one or more related auxiliary-tasks. Usually, MTL-networks
are trained on a composite loss function formed by a constant weighted
combination of the separate task losses. In practice, constant loss weights
lead to poor results for two reasons: (i) the relevance of the auxiliary tasks
can gradually drift throughout the learning process; (ii) for mini-batch based
optimisation, the optimal task weights vary significantly from one update to
the next depending on mini-batch sample composition. We introduce HydaLearn, an
intelligent weighting algorithm that connects main-task gain to the individual
task gradients, in order to inform dynamic loss weighting at the mini-batch
level, addressing (i) and (ii). Using HydaLearn, we report performance increases on
synthetic data, as well as on two supervised learning domains.
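As a rough illustration of the idea, the sketch below contrasts the constant-weight composite loss with a per-mini-batch weighting rule that scores each task's gradient on the shared parameters by its first-order alignment with the main-task gradient, as a proxy for main-task gain. This is a minimal sketch under assumed PyTorch conventions, not the paper's exact algorithm; all function and parameter names are hypothetical, and every loss is assumed to depend on all shared parameters.

```python
import torch

def composite_loss(losses, weights):
    """Constant-weight baseline: L = sum_k w_k * L_k."""
    return sum(w * l for w, l in zip(weights, losses))

def gain_based_weights(main_loss, aux_losses, shared_params, lr=1e-3):
    """Per-mini-batch weighting sketch: approximate the main-task gain of stepping
    along each task's gradient by its dot product with the main-task gradient on
    the shared parameters, clamp negative scores to zero, and normalise."""
    def flat_grad(loss):
        grads = torch.autograd.grad(loss, shared_params, retain_graph=True)
        return torch.cat([g.flatten() for g in grads])

    g_main = flat_grad(main_loss)
    scores = []
    for loss in [main_loss] + list(aux_losses):
        g = flat_grad(loss)
        scores.append(torch.clamp(lr * torch.dot(g_main, g), min=0.0))
    total = torch.stack(scores).sum() + 1e-12
    return [s / total for s in scores]
```

The returned weights would then be passed to `composite_loss` for the current mini-batch, so tasks whose gradients do not help the main task on this batch contribute little or nothing to the update.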
Related papers
- CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction
Tuning [20.58878416527427]
We propose a novel Comprehensive Task Balancing algorithm for multi-task visual instruction tuning of LMMs.
Our CoTBal leads to superior overall performance in multi-task visual instruction tuning.
arXiv Detail & Related papers (2024-03-07T09:11:16Z)
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to make predictions for multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimises the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
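For the instance-level step, a minimal sketch (assuming a precomputed per-instance difficulty score, which Data-CUBE itself would define more carefully) might look like this:

```python
import numpy as np

def easy_to_difficult_batches(instances, difficulty, batch_size):
    """Sort one task's instances by an assumed per-instance difficulty score and cut
    them into mini-batches that run from easy to difficult; the scoring function and
    batch size here are illustrative rather than the method's actual choices."""
    order = np.argsort(difficulty)                       # easiest instances first
    ordered = [instances[i] for i in order]
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```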
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks [0.0]
Multi-task learning (MTL) can improve the generalization performance of neural networks by sharing representations with related tasks.
MTL can also degrade performance through harmful interference between tasks.
We propose SLGrad, a sample-level weighting algorithm for multi-task learning with auxiliary tasks.
arXiv Detail & Related papers (2023-06-07T15:29:46Z)
- Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) models improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
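A toy version of such task-aware routing, sketched in PyTorch with hypothetical layer sizes and top-1 routing (the paper's gating design may differ), could look like this:

```python
import torch
import torch.nn as nn

class TaskAwareMoE(nn.Module):
    """Illustrative sparsely activated layer: a task-conditioned gate routes each
    example to a single expert, so compute per example matches one dense layer
    even though the full layer holds many experts' worth of parameters."""
    def __init__(self, dim, n_experts=8, n_tasks=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Embedding(n_tasks, n_experts)   # task-aware gating scores

    def forward(self, x, task_id):
        # x: (batch, dim); task_id: LongTensor of shape (batch,)
        scores = torch.softmax(self.gate(task_id), dim=-1)   # (batch, n_experts)
        weight, expert = scores.max(dim=-1)                   # top-1 routing
        out = torch.zeros_like(x)
        for e in expert.unique():
            mask = expert == e
            out[mask] = weight[mask, None] * self.experts[int(e)](x[mask])
        return out
```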
arXiv Detail & Related papers (2022-04-16T00:56:12Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
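In spirit, this amounts to freezing the shared base and training only a small set of layers per task; a minimal sketch is given below, with the adapted layers passed in by name rather than learned during training as TAPS actually does (the layer names are hypothetical).

```python
import torch.nn as nn

def tune_task_specific_subset(model: nn.Module, layers_to_adapt):
    """Freeze the shared base model and leave only a small, task-specific subset of
    layers trainable; returns the list of parameters that remain trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(prefix) for prefix in layers_to_adapt)
    return [p for p in model.parameters() if p.requires_grad]

# Example (hypothetical names): adapt only the last ResNet block and the classifier head.
# trainable = tune_task_specific_subset(resnet, ["layer4", "fc"])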
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to exploit task relatedness to improve performance on individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task Feature Learning [0.0]
Multi-task learning (MTL) can improve task performance overall relative to single-task learning (STL), but can hide negative transfer (NT).
Asymmetric multitask feature learning (AMTFL) is an approach that tries to address this by allowing tasks with higher loss values to have a smaller influence on feature representations for learning other tasks.
We present examples of NT in two datasets (image recognition and pharmacogenomics) and tackle this challenge by using aleatoric homoscedastic uncertainty to capture the relative confidence between tasks and to set the weights for the task losses.
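The weighting scheme described here follows the familiar homoscedastic-uncertainty formulation; a compact sketch of that general recipe (not necessarily the paper's exact AMTFL variant) is:

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Homoscedastic-uncertainty weighting sketch: each task gets a learned
    log-variance s_k, and the combined loss is sum_k exp(-s_k) * L_k + s_k, so
    tasks the model is less confident about receive smaller effective weights."""
    def __init__(self, n_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, task_losses):
        total = 0.0
        for loss, s in zip(task_losses, self.log_vars):
            total = total + torch.exp(-s) * loss + s
        return total
```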
arXiv Detail & Related papers (2020-12-17T13:30:45Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Knowledge Distillation for Multi-task Learning [38.20005345733544]
Multi-task learning (MTL) learns a single model that performs multiple tasks, aiming for good performance on all tasks at a lower computational cost.
Learning such a model requires jointly optimising the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics.
We propose a knowledge distillation based method in this work to address the imbalance problem in multi-task learning.
arXiv Detail & Related papers (2020-07-14T08:02:42Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
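A bare-bones sketch of that projection step, assuming the per-task gradients are already flattened into vectors and ignoring the random task ordering used in the full method, could read:

```python
import torch

def project_conflicting(grads):
    """Gradient-surgery sketch: for each pair of task gradients that conflict
    (negative dot product), remove from one the component along the other,
    i.e. project it onto the other's normal plane."""
    projected = [g.clone() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = torch.dot(g_i, g_j)
            if dot < 0:                                  # conflicting gradient
                g_i -= dot / (g_j.norm() ** 2 + 1e-12) * g_j
    return projected
```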
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.