$α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning
- URL: http://arxiv.org/abs/2405.07769v1
- Date: Mon, 13 May 2024 14:12:33 GMT
- Title: $α$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning
- Authors: Rafael Kourdis, Gabriel Gordon-Hall, Philip John Gorinski
- Abstract summary: Multitask Learning aims to train a range of (usually related) tasks with the help of a shared model.
It becomes important to estimate the positive or negative influence auxiliary tasks will have on the target.
We propose a novel method called $\alpha$-Variable Importance Learning ($\alpha$VIL) that is able to adjust task weights dynamically during model training.
- Score: 3.809702129519642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multitask Learning is a Machine Learning paradigm that aims to train a range of (usually related) tasks with the help of a shared model. While the goal is often to improve the joint performance of all training tasks, another approach is to focus on the performance of a specific target task, while treating the remaining ones as auxiliary data from which to possibly leverage positive transfer towards the target during training. In such settings, it becomes important to estimate the positive or negative influence auxiliary tasks will have on the target. While many ways have been proposed to estimate task weights before or during training, they typically rely on heuristics or extensive search of the weighting space. We propose a novel method called $\alpha$-Variable Importance Learning ($\alpha$VIL) that is able to adjust task weights dynamically during model training, by making direct use of task-specific updates of the underlying model's parameters between training epochs. Experiments indicate that $\alpha$VIL is able to outperform other Multitask Learning approaches in a variety of settings. To our knowledge, this is the first attempt at making direct use of model updates for task weight estimation.
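The abstract describes the mechanism only at a high level, so below is a minimal, self-contained sketch (in PyTorch) of one way per-task parameter updates between epochs could drive task weights. The toy linear model, the synthetic target/auxiliary data, the single SGD step standing in for an epoch of task-specific training, and the coarse grid search over $\alpha$ values are all illustrative assumptions, not the authors' actual procedure.

```python
# Hedged sketch: one plausible reading of "adjusting task weights from
# task-specific parameter updates between epochs". The model, data, and the
# coarse search over alpha values below are illustrative assumptions only.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy shared model plus synthetic data for one target task and two auxiliary tasks.
model = nn.Linear(8, 1)
tasks = {
    "target": (torch.randn(64, 8), torch.randn(64, 1)),
    "aux_a": (torch.randn(64, 8), torch.randn(64, 1)),
    "aux_b": (torch.randn(64, 8), torch.randn(64, 1)),
}
alphas = {name: 1.0 for name in tasks}  # per-task mixing weights
loss_fn = nn.MSELoss()

def flat_params(m: nn.Module) -> torch.Tensor:
    """Return all model parameters as one flat, detached vector."""
    return torch.cat([p.detach().flatten() for p in m.parameters()])

def load_flat(m: nn.Module, vec: torch.Tensor) -> None:
    """Write a flat parameter vector back into the model in place."""
    offset = 0
    for p in m.parameters():
        n = p.numel()
        p.data.copy_(vec[offset:offset + n].view_as(p))
        offset += n

for epoch in range(5):
    theta0 = flat_params(model)

    # 1) From the shared checkpoint, take a task-specific update for each task
    #    (a single SGD step here, standing in for an epoch of training) and
    #    record the parameter delta it induces.
    deltas = {}
    for name, (x, y) in tasks.items():
        load_flat(model, theta0)
        opt = torch.optim.SGD(model.parameters(), lr=1e-2)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
        deltas[name] = flat_params(model) - theta0

    # 2) Choose per-task weights by how much their weighted deltas help the
    #    target task (coordinate-wise grid search here; the paper instead
    #    learns the alpha values, which this sketch does not reproduce).
    x_t, y_t = tasks["target"]
    for name in tasks:
        best_w, best_loss = alphas[name], float("inf")
        for w in (0.0, 0.5, 1.0):
            trial = {**alphas, name: w}
            load_flat(model, theta0 + sum(trial[n] * deltas[n] for n in tasks))
            with torch.no_grad():
                trial_loss = loss_fn(model(x_t), y_t).item()
            if trial_loss < best_loss:
                best_loss, best_w = trial_loss, w
        alphas[name] = best_w

    # 3) Commit the alpha-weighted combination of task updates for this epoch.
    load_flat(model, theta0 + sum(alphas[n] * deltas[n] for n in tasks))
    print(f"epoch {epoch}: alphas={alphas}")
```

In this reading, each task's contribution to the next shared checkpoint is scaled by its $\alpha$, so auxiliary tasks whose updates hurt the target task are driven towards zero weight.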
Related papers
- Exploring intra-task relations to improve meta-learning algorithms [1.223779595809275]
We aim to exploit external knowledge of task relations to improve training stability via effective mini-batching of tasks.
We hypothesize that selecting a diverse set of tasks in a mini-batch will lead to a better estimate of the full gradient and hence will lead to a reduction of noise in training.
arXiv Detail & Related papers (2023-12-27T15:33:52Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks.
Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, a phenomenon known as negative transfer.
ForkMerge is a novel approach that periodically forks the model into multiple branches and automatically searches over varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
- TIDo: Source-free Task Incremental Learning in Non-stationary Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data.
Few-shot task incremental learning methods overcome the limitation of labeled target datasets.
We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z)
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- Explaining the Effectiveness of Multi-Task Learning for Efficient Knowledge Extraction from Spine MRI Reports [2.5953185061765884]
We show that a single multi-tasking model can match the performance of task-specific models.
We validate our observations on our internal radiologist-annotated datasets on the cervical and lumbar spine.
arXiv Detail & Related papers (2022-05-06T01:51:19Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much bigger models with nearly 8x the parameters.
arXiv Detail & Related papers (2021-10-29T17:42:08Z)
- Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative [88.11465517304515]
In general, the pre-training step relies on little to no direct knowledge of the task on which the model will be fine-tuned.
We show that multi-tasking the end-task and auxiliary objectives results in significantly better downstream task performance.
arXiv Detail & Related papers (2021-09-15T17:13:18Z)
- Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data.
Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation.
We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z)
- Task Attended Meta-Learning for Few-Shot Learning [3.0724051098062097]
We introduce a training curriculum motivated by selective focus in humans, called task attended meta-training, to weight the tasks in a batch.
Comparisons of the models with their non-task-attended counterparts on complex datasets validate the effectiveness of task attended meta-training.
arXiv Detail & Related papers (2021-06-20T07:34:37Z)