Efficiently Identifying Task Groupings for Multi-Task Learning
- URL: http://arxiv.org/abs/2109.04617v1
- Date: Fri, 10 Sep 2021 02:01:43 GMT
- Title: Efficiently Identifying Task Groupings for Multi-Task Learning
- Authors: Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil,
Chelsea Finn
- Abstract summary: Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We suggest an approach to select which tasks should train together in multi-task learning models.
Our method determines task groupings in a single training run by co-training all tasks together and quantifying the extent to which one task's gradient affects another task's loss.
- Score: 55.80489920205404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning can leverage information learned by one task to benefit
the training of other tasks. Despite this capacity, naively training all tasks
together in one model often degrades performance, and exhaustively searching
through combinations of task groupings can be prohibitively expensive. As a
result, efficiently identifying the tasks that would benefit from co-training
remains a challenging design question without a clear solution. In this paper,
we suggest an approach to select which tasks should train together in
multi-task learning models. Our method determines task groupings in a single
training run by co-training all tasks together and quantifying the extent to
which one task's gradient would affect another task's loss. On the large-scale
Taskonomy computer vision dataset, we find this method can decrease test loss
by 10.0% compared to simply training all tasks together, while operating 11.6
times faster than a state-of-the-art task grouping method.
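The quantity the abstract describes, the effect of one task's gradient step on another task's loss, can be illustrated with a short sketch. The PyTorch-style code below is a hypothetical illustration of that lookahead idea, not the authors' released implementation: the split into a shared trunk plus per-task heads, the `losses` callables, and the learning rate are all assumptions.

```python
# Hypothetical sketch of the lookahead affinity idea: apply one task's
# gradient to the shared parameters and measure the relative change in
# every other task's loss. Names and interfaces are illustrative.
import copy
import torch


def inter_task_affinity(shared, heads, losses, batch, lr=1e-3):
    """Estimate affinity[i][j]: the effect of a single SGD step on task i's
    loss (applied to the shared trunk only) on task j's loss."""
    tasks = list(heads)
    with torch.no_grad():
        base = {t: losses[t](shared, heads[t], batch).item() for t in tasks}
    affinity = {}
    for i in tasks:
        # Gradient of task i's loss with respect to the shared parameters only.
        loss_i = losses[i](shared, heads[i], batch)
        grads = torch.autograd.grad(loss_i, list(shared.parameters()),
                                    allow_unused=True)
        # Lookahead copy of the shared trunk after one SGD step on task i.
        lookahead = copy.deepcopy(shared)
        with torch.no_grad():
            for p, g in zip(lookahead.parameters(), grads):
                if g is not None:
                    p -= lr * g
        # Relative change of each task's loss under the lookahead weights:
        # positive values mean task i's step also reduced task j's loss.
        affinity[i] = {}
        for j in tasks:
            with torch.no_grad():
                new_loss_j = losses[j](lookahead, heads[j], batch).item()
            affinity[i][j] = 1.0 - new_loss_j / base[j]
    return affinity
```

Averaging such pairwise scores over the course of a single co-training run and then grouping tasks with high mutual affinity is one way to read the selection procedure the abstract summarizes.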
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task and then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- TaskWeb: Selecting Better Source Tasks for Multi-task NLP [76.03221609799931]
Knowing task relationships via pairwise task transfer improves the choice of one or more source tasks that help to learn a new target task.
We use TaskWeb to estimate the benefit of using a source task for learning a new target task, and to choose a subset of helpful training tasks for multi-task training.
Our method improves overall rankings and top-k precision of source tasks by 10% and 38%, respectively.
arXiv Detail & Related papers (2023-05-22T17:27:57Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning [28.104054292437525]
We disentangle the effect of scale and relatedness of tasks in multi-task representation learning.
If the target tasks are known ahead of time, then training on a smaller set of related tasks is competitive with large-scale multi-task training.
arXiv Detail & Related papers (2022-04-23T18:11:35Z)
- Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task [24.618094251341958]
Multi-task learning aims to improve model performance by transferring and exploiting common knowledge among tasks.
We propose a framework that learns tasks with inconsistent labels jointly by leveraging abundant information from a learnt auxiliary big task whose classes are sufficient to cover those of all the individual tasks.
Our experimental results demonstrate its effectiveness in comparison with the state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-07T02:46:47Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Knowledge Distillation for Multi-task Learning [38.20005345733544]
Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all tasks at lower computational cost.
Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics.
We propose a knowledge distillation based method in this work to address the imbalance problem in multi-task learning.
arXiv Detail & Related papers (2020-07-14T08:02:42Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient (see the sketch after this list).
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
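Since the last entry describes its projection rule concretely, a small NumPy sketch of that rule (PCGrad-style gradient surgery) is given below. The flattened per-task gradients, the random visiting order, and the final summation follow the published description; the function name and interface are assumptions made for illustration.

```python
# Illustrative NumPy sketch of pairwise gradient projection: if two task
# gradients conflict (negative dot product), remove from one the component
# that lies along the other. Not the authors' released implementation.
import numpy as np


def project_conflicting(task_grads, rng=None):
    """task_grads: list of 1-D arrays, one flattened gradient per task.
    Returns the combined update after conflict-aware projection."""
    rng = rng or np.random.default_rng()
    projected = []
    for i, g_i in enumerate(task_grads):
        g = g_i.copy()
        # Visit the other tasks in a random order.
        for j in (k for k in rng.permutation(len(task_grads)) if k != i):
            g_j = task_grads[j]
            dot = g @ g_j
            if dot < 0:  # conflicting gradient
                # Project g onto the normal plane of g_j by removing the
                # component of g along g_j.
                g -= (dot / (g_j @ g_j)) * g_j
        projected.append(g)
    # The modified per-task gradients are summed to form the final update.
    return np.sum(projected, axis=0)
```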
This list is automatically generated from the titles and abstracts of the papers in this site.