Auxiliary Task Reweighting for Minimum-data Learning
- URL: http://arxiv.org/abs/2010.08244v1
- Date: Fri, 16 Oct 2020 08:45:37 GMT
- Title: Auxiliary Task Reweighting for Minimum-data Learning
- Authors: Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu
- Abstract summary: Supervised learning requires a large amount of training data, limiting its application where labeled data is scarce.
To compensate for data scarcity, one possible method is to utilize auxiliary tasks to provide additional supervision for the main task.
We propose a method to automatically reweight auxiliary tasks in order to reduce the data requirement on the main task.
- Score: 118.69683270159108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised learning requires a large amount of training data, limiting its
application where labeled data is scarce. To compensate for data scarcity, one
possible method is to utilize auxiliary tasks to provide additional supervision
for the main task. Assigning and optimizing the importance weights for
different auxiliary tasks remains a crucial and largely understudied research
question. In this work, we propose a method to automatically reweight auxiliary
tasks in order to reduce the data requirement on the main task. Specifically,
we formulate the weighted likelihood function of auxiliary tasks as a surrogate
prior for the main task. By adjusting the auxiliary task weights to minimize
the divergence between the surrogate prior and the true prior of the main task,
we obtain a more accurate prior estimation, achieving the goal of minimizing
the required amount of training data for the main task and avoiding a costly
grid search. In multiple experimental settings (e.g. semi-supervised learning,
multi-label classification), we demonstrate that our algorithm makes more
effective use of the limited labeled data of the main task, with the help of
auxiliary tasks, than previous task reweighting methods. We also show that in
extreme cases with only a few extra examples (e.g. few-shot domain adaptation),
our algorithm results in significant improvement over the baseline.
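Below is a minimal sketch of how such reweighting could look in code. The abstract only states that the auxiliary task weights are adjusted to minimize the divergence between the surrogate prior and the true prior of the main task; the sketch assumes, purely for illustration, that this divergence is approximated by matching the weighted sum of auxiliary-task gradients to the main-task gradient. All names (`reweighting_step`, `log_weights`, the toy losses) are hypothetical, and this is not the authors' implementation.

```python
# Hypothetical sketch (not the paper's code): adjust auxiliary task weights so that
# the weighted sum of auxiliary-task gradients matches the main-task gradient,
# used here as a stand-in for minimizing the surrogate-prior divergence.
import torch


def flat_grad(loss, params):
    """Flatten the gradient of a scalar loss w.r.t. the shared parameters."""
    grads = torch.autograd.grad(loss, params, retain_graph=True, allow_unused=True)
    return torch.cat([
        (g if g is not None else torch.zeros_like(p)).reshape(-1)
        for g, p in zip(grads, params)
    ])


def reweighting_step(main_loss, aux_losses, params, log_weights, weight_lr=0.1):
    """One update of the (softmax-normalized) auxiliary task weights."""
    g_main = flat_grad(main_loss, params).detach()
    g_aux = torch.stack([flat_grad(l, params).detach() for l in aux_losses])

    weights = torch.softmax(log_weights, dim=0)         # current task weights
    surrogate = (weights.unsqueeze(1) * g_aux).sum(0)   # weighted auxiliary gradient
    mismatch = ((surrogate - g_main) ** 2).sum()        # divergence proxy (assumption)

    (grad_w,) = torch.autograd.grad(mismatch, log_weights)
    with torch.no_grad():
        log_weights -= weight_lr * grad_w               # move weights to reduce mismatch
    return torch.softmax(log_weights, dim=0).detach()
```

In a joint training loop, `log_weights = torch.zeros(num_aux_tasks, requires_grad=True)` would be maintained alongside the model, and the returned weights would scale each auxiliary loss before the shared parameters are updated.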
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training (a minimal curriculum sketch follows this entry).
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
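As a rough illustration of the instance-level curriculum described in the entry above, the sketch below orders examples by a per-instance difficulty score and packs them into easy-to-difficult mini-batches. The difficulty measure and all names are placeholders; the summary does not specify how Data-CUBE actually scores difficulty.

```python
# Hypothetical sketch of an easy-to-difficult mini-batch curriculum; the difficulty
# score is a placeholder, not Data-CUBE's actual measure.
from typing import Callable, List, Sequence


def curriculum_batches(instances: Sequence[str],
                       difficulty: Callable[[str], float],
                       batch_size: int) -> List[List[str]]:
    """Sort instances from easy to difficult and split them into mini-batches."""
    ordered = sorted(instances, key=difficulty)          # easiest instances first
    return [list(ordered[i:i + batch_size])
            for i in range(0, len(ordered), batch_size)]


# Toy usage with sentence length as a stand-in difficulty score:
sentences = ["a cat", "hello world", "graph neural networks learn representations"]
batches = curriculum_batches(sentences, difficulty=len, batch_size=2)
# -> [['a cat', 'hello world'], ['graph neural networks learn representations']]
```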
- Auxiliary task discovery through generate-and-test [7.800263769988046]
Auxiliary tasks improve data efficiency by forcing the agent to learn auxiliary prediction and control objectives.
In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning.
We introduce a new measure of auxiliary tasks' usefulness based on how useful the features induced by them are for the main task.
arXiv Detail & Related papers (2022-10-25T22:04:37Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as a strong foundation backbone for a wide range of tasks but can also be used as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Transfer Learning in Conversational Analysis through Reusing Preprocessing Data as Supervisors [52.37504333689262]
Using noisy labels in single-task learning increases the risk of over-fitting.
Auxiliary tasks can improve the performance of the primary task within the same training run.
arXiv Detail & Related papers (2021-12-02T08:40:42Z)
- Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative [88.11465517304515]
In general, the pre-training step relies on little to no direct knowledge of the task on which the model will be fine-tuned.
We show that multi-tasking the end-task and auxiliary objectives results in significantly better downstream task performance.
arXiv Detail & Related papers (2021-09-15T17:13:18Z)
- Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral [18.387162887917164]
We formulate a model-agnostic framework that performs fine-grained manipulation of the auxiliary task gradients.
We propose to decompose auxiliary updates into directions which help, damage, or leave the primary task loss unchanged (a minimal sketch of this decomposition follows this entry).
Our approach consistently outperforms strong and widely used baselines when leveraging out-of-distribution data for text and image classification tasks.
arXiv Detail & Related papers (2021-08-25T17:09:48Z)
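The decomposition mentioned in the entry above can be pictured as projecting each auxiliary gradient onto the primary-task gradient: the aligned component helps the primary loss (to first order), the opposed component damages it, and the orthogonal component leaves it unchanged. The code below only illustrates that geometric idea under this assumption; it is not the paper's framework, and all names are hypothetical.

```python
# Hypothetical sketch: split an auxiliary gradient into components that help,
# damage, or leave the primary task loss unchanged (to first order), by
# projecting onto the primary gradient. Illustration only.
import torch


def decompose_aux_gradient(g_aux: torch.Tensor, g_primary: torch.Tensor):
    """Return (helpful, damaging, neutral) components of an auxiliary gradient."""
    direction = g_primary / (g_primary.norm() + 1e-12)  # unit primary direction
    coeff = torch.dot(g_aux, direction)                 # signed alignment with primary task
    parallel = coeff * direction                        # component along the primary gradient
    neutral = g_aux - parallel                          # orthogonal part: no first-order effect
    if coeff >= 0:                                      # aligned -> helps the primary loss
        return parallel, torch.zeros_like(g_aux), neutral
    return torch.zeros_like(g_aux), parallel, neutral   # opposed -> damages the primary loss
```

An update rule in this spirit could, for instance, keep the helpful and neutral components and discard the damaging one before applying the auxiliary update.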
- Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data.
Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation.
We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z)
- Active Multitask Learning with Committees [15.862634213775697]
The cost of annotating training data has traditionally been a bottleneck for supervised learning approaches.
We propose an active multitask learning algorithm that achieves knowledge transfer between tasks.
Our approach reduces the number of queries needed during training while maintaining high accuracy on test data.
arXiv Detail & Related papers (2021-03-24T18:07:23Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.