Improving Multitask Retrieval by Promoting Task Specialization
- URL: http://arxiv.org/abs/2307.00342v1
- Date: Sat, 1 Jul 2023 13:45:15 GMT
- Title: Improving Multitask Retrieval by Promoting Task Specialization
- Authors: Wenzheng Zhang, Chenyan Xiong, Karl Stratos, Arnold Overwijk
- Abstract summary: We show that it is possible to train a multitask retriever that outperforms task-specific retrievers by promoting task specialization.
The model indeed learns parameters that are more task-specialized compared to naive multitasking without prompting or adaptive learning.
- Score: 36.06044647938725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multitask retrieval, a single retriever is trained to retrieve relevant
contexts for multiple tasks. Despite its practical appeal, naive multitask
retrieval lags behind task-specific retrieval in which a separate retriever is
trained for each task. We show that it is possible to train a multitask
retriever that outperforms task-specific retrievers by promoting task
specialization. The main ingredients are: (1) a better choice of pretrained
model (one that is explicitly optimized for multitasking) along with compatible
prompting, and (2) a novel adaptive learning method that encourages each
parameter to specialize in a particular task. The resulting multitask retriever
is highly performant on the KILT benchmark. Upon analysis, we find that the
model indeed learns parameters that are more task-specialized compared to naive
multitasking without prompting or adaptive learning.
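To make ingredient (1) concrete, below is a minimal sketch of task-prompted dense retrieval with a single shared dual encoder: queries are prefixed with a task-specific prompt before encoding, and the retriever is trained with an in-batch contrastive loss. The base model, prompt strings, pooling, and temperature are illustrative assumptions, not the authors' implementation.
```python
# Minimal sketch of task-prompted multitask dense retrieval (ingredient 1 above).
# The encoder, prompts, and loss details are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

MODEL = "bert-base-uncased"          # placeholder; the paper uses a multitask-pretrained model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
encoder = AutoModel.from_pretrained(MODEL)

TASK_PROMPTS = {                      # hypothetical prompts for KILT-style tasks
    "qa": "question answering:",
    "fact_check": "fact checking:",
    "entity_linking": "entity linking:",
}

def encode(texts, task=None):
    """Mean-pool token embeddings; prepend a task prompt when one is given."""
    if task is not None:
        texts = [f"{TASK_PROMPTS[task]} {t}" for t in texts]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state             # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1)              # (B, H)

def in_batch_contrastive_loss(queries, positives, task):
    q = encode(queries, task=task)
    p = encode(positives)                                    # contexts share the same encoder
    scores = q @ p.T / 0.05                                  # temperature is an assumption
    labels = torch.arange(len(queries))
    return F.cross_entropy(scores, labels)
```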
Related papers
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to predict multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z) - TaskExpert: Dynamically Assembling Multi-Task Representations with
Memorial Mixture-of-Experts [11.608682595506354]
Recent models consider directly decoding task-specific features from one shared task-generic feature.
As the input feature is fully shared and each task decoder also shares its decoding parameters across different input samples, this leads to a static feature decoding process.
We propose TaskExpert, a novel multi-task mixture-of-experts model that enables learning multiple representative task-generic feature spaces.
arXiv Detail & Related papers (2023-07-28T06:00:57Z) - TaskWeb: Selecting Better Source Tasks for Multi-task NLP [76.03221609799931]
Knowing task relationships via pairwise task transfer makes it easier to choose one or more source tasks that help in learning a new target task.
We use TaskWeb to estimate the benefit of using a source task for learning a new target task, and to choose a subset of helpful training tasks for multi-task training.
Our method improves overall rankings and top-k precision of source tasks by 10% and 38%, respectively.
arXiv Detail & Related papers (2023-05-22T17:27:57Z) - Identification of Negative Transfers in Multitask Learning Using
Surrogate Models [29.882265735630046]
Multitask learning is widely used to train a low-resource target task by augmenting it with multiple related source tasks.
A critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task.
We introduce an efficient procedure to address this problem via surrogate modeling.
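As a rough illustration of the surrogate-modeling idea (a simplified stand-in, not the paper's exact procedure), one can fit a linear surrogate that maps which source tasks were included in a training run to the resulting target-task score, and flag sources with negative coefficients as likely negative transfers; all task names and numbers below are hypothetical.
```python
# Simplified surrogate-model sketch: linear fit from subset-inclusion indicators to
# target-task performance; negative coefficients suggest negative transfer.
import numpy as np

def fit_surrogate(subset_masks, target_scores):
    """subset_masks: (n_runs, n_sources) 0/1 matrix; target_scores: (n_runs,)."""
    X = np.hstack([subset_masks, np.ones((len(subset_masks), 1))])   # add a bias term
    coef, *_ = np.linalg.lstsq(X, target_scores, rcond=None)
    return coef[:-1]                                                  # per-source effect

def negative_transfer_sources(subset_masks, target_scores, source_names):
    effects = fit_surrogate(np.asarray(subset_masks, float), np.asarray(target_scores))
    return [name for name, e in zip(source_names, effects) if e < 0]

# Hypothetical example with 3 candidate source tasks and a handful of (subset, score) trials:
masks  = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
scores = [0.62, 0.55, 0.60, 0.58, 0.64, 0.57, 0.59]
print(negative_transfer_sources(masks, scores, ["ner", "pos", "chunking"]))
```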
arXiv Detail & Related papers (2023-03-25T23:16:11Z) - Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
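A minimal sketch of the task-aware routing described above: a gate conditioned on both the input representation and a learned task embedding selects the top-k experts per example. The gating form, layer sizes, and names are assumptions for illustration, not the paper's architecture.
```python
# Sketch of task-aware top-k expert routing; sizes and gating form are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAwareMoE(nn.Module):
    def __init__(self, dim, n_experts, n_tasks, k=2):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.task_emb = nn.Embedding(n_tasks, dim)
        self.gate = nn.Linear(2 * dim, n_experts)
        self.k = k

    def forward(self, x, task_id):
        # Gate on both the example representation and the task embedding.
        t = self.task_emb(task_id).expand(x.size(0), -1)           # (B, D)
        logits = self.gate(torch.cat([x, t], dim=-1))               # (B, E)
        weights, idx = logits.topk(self.k, dim=-1)                  # route to top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for j in range(self.k):                                     # sparse combination
            expert_out = torch.stack(
                [self.experts[e](x[i]) for i, e in enumerate(idx[:, j].tolist())]
            )
            out = out + weights[:, j:j + 1] * expert_out
        return out

moe = TaskAwareMoE(dim=16, n_experts=4, n_tasks=3)
y = moe(torch.randn(8, 16), task_id=torch.tensor(1))
```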
arXiv Detail & Related papers (2022-04-16T00:56:12Z) - Modular Adaptive Policy Selection for Multi-Task Imitation Learning
through Task Division [60.232542918414985]
Multi-task learning often suffers from negative transfer, sharing information that should be task-specific.
The proposed approach addresses this by using proto-policies as modules to divide the tasks into simple sub-behaviours that can be shared.
We also demonstrate its ability to autonomously divide the tasks into both shared and task-specific sub-behaviours.
arXiv Detail & Related papers (2022-03-28T15:53:17Z) - In Defense of the Unitary Scalarization for Deep Multi-Task Learning [121.76421174107463]
We present a theoretical analysis suggesting that many specialized multi-task optimizers can be interpreted as forms of regularization.
We show that, when coupled with standard regularization and stabilization techniques, unitary scalarization matches or improves upon the performance of complex multi-task optimizers.
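For reference, unitary scalarization is simply the unweighted sum of per-task losses optimized with one standard optimizer (plus ordinary regularization such as weight decay); the toy two-head model below is a placeholder, not the paper's experimental setup.
```python
# Unitary scalarization: one summed loss, one optimizer, no task-weighting or gradient surgery.
import torch
import torch.nn as nn

class TwoHeadModel(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(dim, 32), nn.ReLU())
        self.heads = nn.ModuleDict({"a": nn.Linear(32, 3), "b": nn.Linear(32, 1)})

    def forward(self, x, task):
        return self.heads[task](self.trunk(x))

model = TwoHeadModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)  # standard regularization
x = torch.randn(8, 16)
ya, yb = torch.randint(0, 3, (8,)), torch.randn(8, 1)                   # toy labels

opt.zero_grad()
loss = nn.functional.cross_entropy(model(x, "a"), ya) + nn.functional.mse_loss(model(x, "b"), yb)
loss.backward()
opt.step()
```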
arXiv Detail & Related papers (2022-01-11T18:44:17Z) - Efficiently Identifying Task Groupings for Multi-Task Learning [55.80489920205404]
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
We suggest an approach to select which tasks should train together in multi-task learning models.
Our method determines task groupings in a single training run by co-training all tasks together and quantifying the extent to which one task's gradient would affect another task's loss.
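A simplified sketch of that measurement: take a lookahead gradient step on one task and check how much another task's loss improves or degrades. The toy model, step size, and deliberately conflicting tasks are illustrative assumptions.
```python
# Simplified inter-task affinity probe: apply task i's gradient step, measure task j's loss change.
import copy
import torch
import torch.nn as nn

def affinity(model, loss_i, loss_j, lr=1e-2):
    """Positive affinity means a step on task i decreased task j's loss."""
    before = loss_j(model).item()
    probe = copy.deepcopy(model)                      # don't disturb the real model
    probe_loss = loss_i(probe)
    probe.zero_grad()
    probe_loss.backward()
    with torch.no_grad():
        for p in probe.parameters():                  # one SGD lookahead step on task i
            if p.grad is not None:
                p -= lr * p.grad
    after = loss_j(probe).item()
    return 1.0 - after / before                       # relative improvement of task j

# Toy usage: shared linear model, two regression "tasks" with conflicting targets.
net = nn.Linear(4, 1)
x = torch.randn(16, 4)
t1, t2 = x.sum(1, keepdim=True), -x.sum(1, keepdim=True)
loss1 = lambda m: nn.functional.mse_loss(m(x), t1)
loss2 = lambda m: nn.functional.mse_loss(m(x), t2)
print(affinity(net, loss1, loss2))                    # likely negative: the tasks conflict
```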
arXiv Detail & Related papers (2021-09-10T02:01:43Z) - Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z) - Knowledge Distillation for Multi-task Learning [38.20005345733544]
Multi-task learning (MTL) trains one single model that performs multiple tasks, aiming for good performance on all tasks at lower computational cost.
Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics.
We propose a knowledge distillation based method in this work to address the imbalance problem in multi-task learning.
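A hedged sketch of the distillation idea: frozen single-task teachers provide normalized feature targets for a shared multi-task student, so each task contributes a loss on a comparable scale. The models, shapes, and task names below are placeholders, not the paper's networks.
```python
# Sketch of distillation-based loss balancing for MTL; all modules are toy stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

teachers = {"seg": nn.Linear(16, 32), "depth": nn.Linear(16, 32)}     # stand-ins for trained single-task nets
student_trunk = nn.Linear(16, 32)                                      # shared multi-task backbone
adapters = nn.ModuleDict({t: nn.Linear(32, 32) for t in teachers})     # per-task heads on the student

def distillation_loss(x):
    total = 0.0
    h = student_trunk(x)
    for task, teacher in teachers.items():
        with torch.no_grad():
            target = F.normalize(teacher(x), dim=-1)                   # frozen teacher feature
        pred = F.normalize(adapters[task](h), dim=-1)
        total = total + F.mse_loss(pred, target)                       # comparable scale across tasks
    return total

opt = torch.optim.Adam(list(student_trunk.parameters()) + list(adapters.parameters()), lr=1e-3)
x = torch.randn(8, 16)
loss = distillation_loss(x)
opt.zero_grad()
loss.backward()
opt.step()
```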
arXiv Detail & Related papers (2020-07-14T08:02:42Z)