Maximum Roaming Multi-Task Learning
- URL: http://arxiv.org/abs/2006.09762v4
- Date: Wed, 19 May 2021 09:20:37 GMT
- Title: Maximum Roaming Multi-Task Learning
- Authors: Lucas Pascal and Pietro Michiardi and Xavier Bost and Benoit Huet and
Maria A. Zuluaga
- Abstract summary: We present a novel way to partition the parameter space without weakening the inductive bias.
Specifically, we propose Maximum Roaming, a method inspired by dropout that randomly varies the parameter partitioning.
Experimental results suggest that the regularization brought by roaming has more impact on performance than usual partitioning optimization strategies.
- Score: 18.69970611732082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-task learning has gained popularity due to the advantages it provides
with respect to resource usage and performance. Nonetheless, the joint
optimization of parameters with respect to multiple tasks remains an active
research topic. Sub-partitioning the parameters between different tasks has
proven to be an efficient way to relax the optimization constraints over the
shared weights, whether the partitions are disjoint or overlapping. However, one
drawback of this approach is that it can weaken the inductive bias generally
set up by the joint task optimization. In this work, we present a novel way to
partition the parameter space without weakening the inductive bias.
Specifically, we propose Maximum Roaming, a method inspired by dropout that
randomly varies the parameter partitioning while forcing the parameters to visit
as many tasks as possible at a regulated frequency, so that the network fully adapts to
each update. We study the properties of our method through experiments on a
variety of visual multi-task data sets. Experimental results suggest that the
regularization brought by roaming has more impact on performance than usual
partitioning optimization strategies. The overall method is flexible and easily
applicable, provides superior regularization, and consistently achieves improved
performance compared to recent multi-task learning formulations.
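To make the roaming idea concrete, the snippet below is a minimal sketch of dropout-style binary task masks whose active units are periodically swapped so that every unit eventually serves every task. It is an illustration under simplifying assumptions (the sharing ratio p, the swap rule, and the update interval are placeholders), not the paper's reference implementation.
```python
# Illustrative sketch of dropout-style task masks with "roaming" updates.
# This is NOT the authors' reference implementation; the sharing ratio `p`,
# the swap rule, and the roaming interval below are simplifying assumptions.
import numpy as np

rng = np.random.default_rng(0)

def init_masks(num_tasks, num_units, p=0.6):
    """Random binary partition: each task is assigned a fraction p of the units."""
    masks = np.zeros((num_tasks, num_units), dtype=bool)
    k = int(p * num_units)
    for t in range(num_tasks):
        masks[t, rng.choice(num_units, size=k, replace=False)] = True
    return masks

def roam(masks, visited):
    """One roaming step per task: drop one active unit and activate a unit the
    task has not visited yet, so every unit eventually serves every task."""
    num_tasks, _ = masks.shape
    for t in range(num_tasks):
        unvisited = np.flatnonzero(~visited[t])
        if unvisited.size == 0:        # task has already visited every unit
            continue
        new_u = rng.choice(unvisited)
        old_u = rng.choice(np.flatnonzero(masks[t]))
        masks[t, old_u] = False
        masks[t, new_u] = True
        visited[t, new_u] = True
    return masks, visited

num_tasks, num_units = 3, 32
masks = init_masks(num_tasks, num_units)
visited = masks.copy()

# Roaming is applied every `interval` training steps so the network has time
# to adapt between partition changes (the "regulated frequency" in the abstract).
interval, total_steps = 100, 5000
for step in range(total_steps):
    # ... forward/backward pass for each task, applying masks[t] to the shared layer ...
    if step % interval == 0:
        masks, visited = roam(masks, visited)

print("fraction of units visited per task:", visited.mean(axis=1))
```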
Related papers
- Efficient Pareto Manifold Learning with Low-Rank Structure [31.082432589391953]
Multi-task learning is inherently a multi-objective optimization problem.
We propose a novel approach that integrates a main network with several low-rank matrices.
It significantly reduces the number of parameters and facilitates the extraction of shared features; a generic sketch of the shared-backbone-plus-low-rank-adapters pattern appears after this list.
arXiv Detail & Related papers (2024-07-30T11:09:27Z)
- Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences [49.14535254003683]
PaLoRA is a novel parameter-efficient method that augments the original model with task-specific low-rank adapters.
Our experimental results show that PaLoRA outperforms MTL and PFL baselines across various datasets.
arXiv Detail & Related papers (2024-07-10T21:25:51Z)
- Multi-Task Learning with Multi-Task Optimization [31.518330903602095]
We show that a set of optimized yet well-distributed models embodies different trade-offs in one algorithmic pass.
We investigate the proposed multi-task learning with multi-task optimization approach across various problem settings.
arXiv Detail & Related papers (2024-03-24T14:04:40Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution superior to all its single-task trained counterparts.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Attentional Mixtures of Soft Prompt Tuning for Parameter-efficient Multi-task Knowledge Sharing [53.399742232323895]
ATTEMPT is a new modular, multi-task, and parameter-efficient language model (LM) tuning approach.
It combines knowledge transferred across different tasks via a mixture of soft prompts while keeping the original LM unchanged.
It is parameter-efficient (e.g., it updates 1,600 times fewer parameters than fine-tuning) and enables multi-task learning and flexible extensions; a generic sketch of the soft-prompt-mixture idea appears after this list.
arXiv Detail & Related papers (2022-05-24T10:48:33Z)
- In Defense of the Unitary Scalarization for Deep Multi-Task Learning [121.76421174107463]
We present a theoretical analysis suggesting that many specialized multi-task optimizers can be interpreted as forms of regularization.
We show that, when coupled with standard regularization and stabilization techniques, unitary scalarization matches or improves upon the performance of complex multi-task optimizers.
arXiv Detail & Related papers (2022-01-11T18:44:17Z)
- Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
Diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework.
We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark; a generic sketch of the sparse-diff-vector idea appears after this list.
arXiv Detail & Related papers (2020-12-14T12:34:01Z)
- Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z)
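The first two related papers above (low-rank Pareto manifold learning and PaLoRA) both augment a shared backbone with task-specific low-rank matrices. The snippet below is a generic, hedged sketch of that backbone-plus-low-rank-adapters pattern; the shapes, rank, and the additive composition W + B_t A_t are illustrative assumptions, not the implementation of either paper.
```python
# Generic sketch of a shared linear layer with per-task low-rank adapters
# (in the spirit of the low-rank multi-task papers listed above). Shapes, rank
# and the additive composition W + B[t] @ A[t] are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, num_tasks = 64, 32, 4, 3

W = rng.normal(scale=0.02, size=(d_out, d_in))            # shared weight
A = rng.normal(scale=0.02, size=(num_tasks, rank, d_in))  # task-specific low-rank factors
B = np.zeros((num_tasks, d_out, rank))                    # adapters start at zero

def forward(x, task):
    """Shared projection plus the rank-limited, task-specific correction."""
    delta = B[task] @ A[task]          # (d_out, d_in), rank-limited update
    return x @ (W + delta).T

x = rng.normal(size=(8, d_in))
print(forward(x, task=1).shape)        # (8, d_out)
# Only A and B (num_tasks * rank * (d_in + d_out) values) are task-specific,
# versus num_tasks * d_in * d_out for fully separate task heads.
```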
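The ATTEMPT entry above describes combining pretrained soft prompts with an attention-like weighting while the LM stays frozen. The toy sketch below shows only that mixing step; the dot-product scoring, prompt length, and dimensions are assumptions made for illustration, not ATTEMPT's actual architecture.
```python
# Toy sketch of mixing several pretrained soft prompts into one prompt for a
# target task (in the spirit of the attentional mixture described above).
# The mean-pooled dot-product scoring and all shapes are simplifying assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_source, prompt_len, d_model = 4, 10, 16

source_prompts = rng.normal(size=(num_source, prompt_len, d_model))  # frozen
target_query = rng.normal(size=(d_model,))                           # learned per target task

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Score each source prompt against the target query and mix the prompts.
scores = source_prompts.mean(axis=1) @ target_query           # (num_source,)
weights = softmax(scores)
mixed_prompt = np.tensordot(weights, source_prompts, axes=1)   # (prompt_len, d_model)

# The mixed prompt would be prepended to the input embeddings of a frozen LM;
# only the prompts and the mixing parameters are trained.
print(mixed_prompt.shape)
```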
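The diff pruning entry above keeps the pretrained parameters frozen and learns a sparse, task-specific difference vector. The sketch below illustrates that parameterization with a simple magnitude-based sparsification step standing in for the paper's relaxed L0 penalty; treat it as a sketch of the idea, not the paper's method.
```python
# Sketch of the diff-pruning parameterization: task weights = frozen pretrained
# weights + a sparse task-specific diff. The hard top-k sparsification below is
# a stand-in for the relaxed L0 regularization used in the actual paper.
import numpy as np

rng = np.random.default_rng(0)
pretrained = rng.normal(size=10_000)          # frozen pretrained parameters (flattened)
diff = rng.normal(scale=1e-3, size=10_000)    # task-specific diff, the only trained part

def sparsify(vec, keep_ratio=0.005):
    """Keep only the largest-magnitude entries (here 0.5% of parameters)."""
    k = max(1, int(keep_ratio * vec.size))
    threshold = np.partition(np.abs(vec), -k)[-k]
    return np.where(np.abs(vec) >= threshold, vec, 0.0)

sparse_diff = sparsify(diff)
task_params = pretrained + sparse_diff        # parameters the task model actually uses

print("nonzero diff entries:", int((sparse_diff != 0).sum()), "of", diff.size)
# Storage per task is only the nonzero diff entries, instead of a full copy
# of the pretrained model.
```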
The list of related papers above is automatically generated from the titles and abstracts of the papers on this site.