Efficiently Identifying Task Groupings for Multi-Task Learning
- URL: http://arxiv.org/abs/2109.04617v1
- Date: Fri, 10 Sep 2021 02:01:43 GMT
- Title: Efficiently Identifying Task Groupings for Multi-Task Learning
- Authors: Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil,
Chelsea Finn
- Abstract summary: Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We suggest an approach to select which tasks should train together in multi-task learning models.
Our method determines task groupings in a single training run by co-training all tasks together and quantifying the extent to which one task's gradient affects another task's loss.
- Score: 55.80489920205404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning can leverage information learned by one task to benefit
the training of other tasks. Despite this capacity, naively training all tasks
together in one model often degrades performance, and exhaustively searching
through combinations of task groupings can be prohibitively expensive. As a
result, efficiently identifying the tasks that would benefit from co-training
remains a challenging design question without a clear solution. In this paper,
we suggest an approach to select which tasks should train together in
multi-task learning models. Our method determines task groupings in a single
training run by co-training all tasks together and quantifying the extent to
which one task's gradient would affect another task's loss. On the large-scale
Taskonomy computer vision dataset, we find this method can decrease test loss
by 10.0% compared to simply training all tasks together, while operating 11.6
times faster than a state-of-the-art task grouping method.
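The quantity the abstract describes, the effect of one task's gradient step on another task's loss, can be illustrated with a short sketch. The PyTorch-style code below is a hypothetical illustration of that lookahead idea, not the authors' released implementation: the split into a shared trunk plus per-task heads, the `losses` callables, and the learning rate are all assumptions.

```python
# Hypothetical sketch of the lookahead affinity idea: apply one task's
# gradient to the shared parameters and measure the relative change in
# every other task's loss. Names and interfaces are illustrative.
import copy
import torch


def inter_task_affinity(shared, heads, losses, batch, lr=1e-3):
    """Estimate affinity[i][j]: the effect of a single SGD step on task i's
    loss (applied to the shared trunk only) on task j's loss."""
    tasks = list(heads)
    with torch.no_grad():
        base = {t: losses[t](shared, heads[t], batch).item() for t in tasks}
    affinity = {}
    for i in tasks:
        # Gradient of task i's loss with respect to the shared parameters only.
        loss_i = losses[i](shared, heads[i], batch)
        grads = torch.autograd.grad(loss_i, list(shared.parameters()),
                                    allow_unused=True)
        # Lookahead copy of the shared trunk after one SGD step on task i.
        lookahead = copy.deepcopy(shared)
        with torch.no_grad():
            for p, g in zip(lookahead.parameters(), grads):
                if g is not None:
                    p -= lr * g
        # Relative change of each task's loss under the lookahead weights:
        # positive values mean task i's step also reduced task j's loss.
        affinity[i] = {}
        for j in tasks:
            with torch.no_grad():
                new_loss_j = losses[j](lookahead, heads[j], batch).item()
            affinity[i][j] = 1.0 - new_loss_j / base[j]
    return affinity
```

Averaging such pairwise scores over the course of a single co-training run and then grouping tasks with high mutual affinity is one way to read the selection procedure the abstract summarizes.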
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- TaskWeb: Selecting Better Source Tasks for Multi-task NLP [76.03221609799931]
Knowing task relationships via pairwise task transfer improves the choice of one or more source tasks that help to learn a new target task.
We use TaskWeb to estimate the benefit of using a source task for learning a new target task, and to choose a subset of helpful training tasks for multi-task training.
Our method improves overall rankings and top-k precision of source tasks by 10% and 38%, respectively.
arXiv Detail & Related papers (2023-05-22T17:27:57Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning [28.104054292437525]
We disentangle the effect of scale and relatedness of tasks in multi-task representation learning.
If the target tasks are known ahead of time, then training on a smaller set of related tasks is competitive with large-scale multi-task training.
arXiv Detail & Related papers (2022-04-23T18:11:35Z)
- Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task [24.618094251341958]
Multi-task learning aims to improve model performance by transferring and exploiting common knowledge among tasks.
We propose a framework that learns tasks with inconsistent labels jointly by leveraging abundant information from a learnt auxiliary big task whose classes are sufficient to cover those of all the individual tasks.
Our experimental results demonstrate its effectiveness in comparison with the state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-07T02:46:47Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Knowledge Distillation for Multi-task Learning [38.20005345733544]
Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at lower computational cost.
Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics.
We propose a knowledge distillation based method in this work to address the imbalance problem in multi-task learning.
arXiv Detail & Related papers (2020-07-14T08:02:42Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient (see the sketch after this list).
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
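Since the last entry describes its projection rule concretely, a small NumPy sketch of that rule (PCGrad-style gradient surgery) is given below. The flattened per-task gradients, the random visiting order, and the final summation follow the published description; the function name and interface are assumptions made for illustration.

```python
# Illustrative NumPy sketch of pairwise gradient projection: if two task
# gradients conflict (negative dot product), remove from one the component
# that lies along the other. Not the authors' released implementation.
import numpy as np


def project_conflicting(task_grads, rng=None):
    """task_grads: list of 1-D arrays, one flattened gradient per task.
    Returns the combined update after conflict-aware projection."""
    rng = rng or np.random.default_rng()
    projected = []
    for i, g_i in enumerate(task_grads):
        g = g_i.copy()
        # Visit the other tasks in a random order.
        for j in (k for k in rng.permutation(len(task_grads)) if k != i):
            g_j = task_grads[j]
            dot = g @ g_j
            if dot < 0:  # conflicting gradient
                # Project g onto the normal plane of g_j by removing the
                # component of g along g_j.
                g -= (dot / (g_j @ g_j)) * g_j
        projected.append(g)
    # The modified per-task gradients are summed to form the final update.
    return np.sum(projected, axis=0)
```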
This list is automatically generated from the titles and abstracts of the papers in this site.