InterroGate: Learning to Share, Specialize, and Prune Representations
for Multi-task Learning
- URL: http://arxiv.org/abs/2402.16848v1
- Date: Mon, 26 Feb 2024 18:59:52 GMT
- Title: InterroGate: Learning to Share, Specialize, and Prune Representations
for Multi-task Learning
- Authors: Babak Ehteshami Bejnordi, Gaurav Kumar, Amelie Royer, Christos
Louizos, Tijmen Blankevoort, Mohsen Ghafoorian
- Abstract summary: We propose a novel multi-task learning (MTL) architecture designed to mitigate task interference while optimizing inference computational efficiency.
We employ a learnable gating mechanism to automatically balance the shared and task-specific representations while preserving the performance of all tasks.
- Score: 17.66308231838553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Jointly learning multiple tasks with a unified model can improve accuracy and
data efficiency, but it faces the challenge of task interference, where
optimizing one task objective may inadvertently compromise the performance of
another. A solution to mitigate this issue is to allocate task-specific
parameters, free from interference, on top of shared features. However,
manually designing such architectures is cumbersome, as practitioners need to
balance between the overall performance across all tasks and the higher
computational cost induced by the newly added parameters. In this work, we
propose InterroGate, a novel multi-task learning (MTL) architecture
designed to mitigate task interference while optimizing inference computational
efficiency. We employ a learnable gating mechanism to automatically balance the
shared and task-specific representations while preserving the performance of
all tasks. Crucially, the patterns of parameter sharing and specialization,
dynamically learned during training, become fixed at inference, resulting in a
static, optimized MTL architecture. Through extensive empirical evaluations, we
demonstrate SoTA results on three MTL benchmarks using convolutional as well as
transformer-based backbones on CelebA, NYUD-v2, and PASCAL-Context.
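As a rough illustration of the gating idea described in the abstract, the sketch below (hypothetical PyTorch code, not the authors' released implementation) gives each task a learnable per-channel gate that mixes a shared branch with a task-specific branch; the block structure, the gate granularity, and the 0.5 threshold used to freeze gates are assumptions.
```python
# Hypothetical sketch of a learnable shared-vs-task-specific gate (not the
# authors' code); block structure, per-channel gating, and the 0.5 threshold
# used to freeze gates at inference are all assumptions.
import torch
import torch.nn as nn

class GatedMTLBlock(nn.Module):
    def __init__(self, channels, num_tasks):
        super().__init__()
        self.shared = nn.Conv2d(channels, channels, 3, padding=1)
        self.task_specific = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_tasks)]
        )
        # One learnable gate logit per (task, channel).
        self.gate_logits = nn.Parameter(torch.zeros(num_tasks, channels))

    def forward(self, x, task_id, frozen=False):
        gate = torch.sigmoid(self.gate_logits[task_id])   # soft gate in (0, 1)
        if frozen:                                         # inference: static choice
            gate = (gate > 0.5).float()
        gate = gate.view(1, -1, 1, 1)
        # Per-channel mix of the shared and the task-specific representation.
        return gate * self.shared(x) + (1 - gate) * self.task_specific[task_id](x)
```
In a sketch like this, gates that saturate toward 0 or 1 after training indicate that, for that task and channel, one of the two branches can be pruned, which is what leaves a static and cheaper architecture at inference.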
Related papers
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to predict multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z)
- Multi-Task Cooperative Learning via Searching for Flat Minima [8.835287696319641]
We propose to formulate MTL as a multi/bi-level optimization problem, thereby forcing features to be learned from each task in a cooperative manner.
Specifically, we update the sub-model for each task alternately, taking advantage of the learned sub-models of the other tasks.
To alleviate the negative transfer problem during the optimization, we search for flat minima for the current objective function.
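The summary does not say how flat minima are located; one common recipe is a sharpness-aware (SAM-style) update, sketched below purely as an illustration of the idea. The radius rho and the closure loss_fn(model) returning the current training loss are assumptions, not details from the paper.
```python
# Illustrative sharpness-aware (SAM-style) step toward flatter minima; not
# necessarily the optimization used in the paper. loss_fn(model) is assumed
# to return the current (task or joint) training loss.
import torch

def flat_minima_step(model, loss_fn, optimizer, rho=0.05):
    loss = loss_fn(model)
    loss.backward()
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm() for p in model.parameters() if p.grad is not None]))
    # 1) Perturb weights toward the locally worst direction (radius rho).
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            e = rho * p.grad / (grad_norm + 1e-12) if p.grad is not None else None
            if e is not None:
                p.add_(e)
            eps.append(e)
    optimizer.zero_grad()
    # 2) Take the gradient at the perturbed point, undo the perturbation, update.
    loss_fn(model).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return float(loss)
```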
arXiv Detail & Related papers (2023-09-21T14:00:11Z)
- Task Aware Feature Extraction Framework for Sequential Dependence Multi-Task Learning [1.0765359420035392]
We analyze sequential dependence MTL from a rigorous mathematical perspective.
We propose a Task Aware Feature Extraction (TAFE) framework for sequential dependence MTL.
arXiv Detail & Related papers (2023-01-06T13:12:59Z)
- AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning [19.201899503691266]
We measure the task dominance degree of a parameter by the total updates of each task on this parameter.
We propose a Task-wise Adaptive learning rate approach, AdaTask, to separate the accumulative gradients and hence the learning rate of each task.
Experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks.
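A rough, generic rendering of that idea follows: keep a separate running sum of squared gradients per task, so each task's step size is normalized by its own history rather than by the dominant task's. The class name, the Adagrad-style accumulator, and the way per-task gradients are passed in are assumptions, not the paper's exact update rule.
```python
# Generic per-task adaptive step (Adagrad-style accumulators kept per task);
# the exact AdaTask rule may differ -- this only illustrates "separate
# accumulative gradients, hence separate learning rates, per task".
import torch

class TaskwiseAdaptiveSGD:
    def __init__(self, params, num_tasks, lr=0.01, eps=1e-10):
        self.params = list(params)
        self.lr, self.eps = lr, eps
        # One running sum of squared gradients per (task, parameter).
        self.accum = [[torch.zeros_like(p) for p in self.params]
                      for _ in range(num_tasks)]

    def step(self, per_task_grads):
        """per_task_grads[t][i]: task t's gradient w.r.t. parameter i
        (e.g. obtained with torch.autograd.grad on each task loss)."""
        with torch.no_grad():
            for i, p in enumerate(self.params):
                update = torch.zeros_like(p)
                for t, grads in enumerate(per_task_grads):
                    g = grads[i]
                    self.accum[t][i] += g * g
                    # Each task is scaled by its own history, so a dominant
                    # task cannot shrink the other tasks' effective step.
                    update += g / (self.accum[t][i].sqrt() + self.eps)
                p -= self.lr * update
```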
arXiv Detail & Related papers (2022-11-28T04:24:38Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks [73.63892022944198]
We present a generic perception architecture named Uni-Perceiver.
It processes a variety of modalities and tasks with unified modeling and shared parameters.
Results show that our pre-trained model without any tuning can achieve reasonable performance even on novel tasks.
arXiv Detail & Related papers (2021-12-02T18:59:50Z)
- Optimization Strategies in Multi-Task Learning: Averaged or Independent Losses? [15.905060482249873]
In Multi-Task Learning (MTL), it is a common practice to train multi-task networks by optimizing an objective function, which is a weighted average of the task-specific objective functions.
In this work, we investigate the benefits of an alternative: alternating independent gradient descent steps on the different task-specific objective functions.
We show that our random grouping strategy allows trading off these benefits against computational efficiency.
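In schematic form, the two training schemes being compared look roughly as follows (generic PyTorch; the loss weights and task ordering are placeholders, and the paper's random grouping of tasks is not shown).
```python
# Weighted-average objective vs. alternating independent per-task steps
# (schematic only; weights and ordering are placeholders).
def averaged_loss_step(model, optimizer, task_losses, weights):
    # Common practice: one update on a weighted average of all task losses.
    optimizer.zero_grad()
    total = sum(w * loss_fn(model) for w, loss_fn in zip(weights, task_losses))
    total.backward()
    optimizer.step()

def alternating_steps(model, optimizer, task_losses):
    # Alternative: an independent gradient step per task-specific objective.
    for loss_fn in task_losses:
        optimizer.zero_grad()
        loss_fn(model).backward()
        optimizer.step()
```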
arXiv Detail & Related papers (2021-09-21T09:34:14Z)
- Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
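The summary leaves the concrete context types open; purely as an illustration of what a "context for a task pair" can look like, the snippet below lets one task's features attend to another task's features with standard cross-attention. This is a generic stand-in, not the paper's ATRC module, and the head count is an assumption (heads must divide channels).
```python
# Generic cross-attention between two tasks' feature maps, standing in for one
# "task-pair context" (illustrative only; not the paper's ATRC module).
import torch.nn as nn

class TaskPairContext(nn.Module):
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, target_feat, source_feat):
        # target_feat, source_feat: (B, C, H, W) features from two task heads.
        b, c, h, w = target_feat.shape
        q = target_feat.flatten(2).transpose(1, 2)    # (B, HW, C)
        kv = source_feat.flatten(2).transpose(1, 2)   # (B, HW, C)
        ctx, _ = self.attn(q, kv, kv)                 # target attends to source
        return self.norm(q + ctx).transpose(1, 2).reshape(b, c, h, w)
```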
arXiv Detail & Related papers (2021-04-28T16:45:56Z)
- Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade the single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)