Medusa: Universal Feature Learning via Attentional Multitasking
- URL: http://arxiv.org/abs/2204.05698v1
- Date: Tue, 12 Apr 2022 10:52:28 GMT
- Title: Medusa: Universal Feature Learning via Attentional Multitasking
- Authors: Jaime Spencer, Richard Bowden, Simon Hadfield
- Abstract summary: Recent approaches to multi-task learning have focused on modelling connections between tasks at the decoder level.
We argue that MTL is a stepping stone towards universal feature learning (UFL), which is the ability to learn generic features that can be applied to new tasks without retraining.
We show the effectiveness of Medusa in UFL (+13.18% improvement) while maintaining MTL performance and being 25% more efficient than previous approaches.
- Score: 65.94499390875046
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent approaches to multi-task learning (MTL) have focused on modelling
connections between tasks at the decoder level. This leads to a tight coupling
between tasks, which need retraining if a new task is inserted or removed. We
argue that MTL is a stepping stone towards universal feature learning (UFL),
which is the ability to learn generic features that can be applied to new tasks
without retraining.
We propose Medusa to realize this goal, designing task heads with dual
attention mechanisms. The shared feature attention masks relevant backbone
features for each task, allowing it to learn a generic representation.
Meanwhile, a novel Multi-Scale Attention head allows the network to better
combine per-task features from different scales when making the final
prediction. We show the effectiveness of Medusa in UFL (+13.18% improvement),
while maintaining MTL performance and being 25% more efficient than previous
approaches.
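
The abstract describes the architecture only at a high level. As a rough illustration, the following PyTorch-style sketch shows what a detachable task head with the two attention mechanisms mentioned above (a shared feature attention that masks backbone features per task, and a multi-scale attention that fuses per-task features across scales) could look like. All class and parameter names (FeatureAttention, MultiScaleAttention, MedusaTaskHead, num_scales) and design choices (channel-wise sigmoid gating, softmax fusion over scales, equal channel counts across scales) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a dual-attention task head in the spirit of the abstract.
# All names and design choices here are assumptions, not the authors' code.
import torch
import torch.nn as nn


class FeatureAttention(nn.Module):
    """Task-specific channel attention that masks shared backbone features."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),           # global context per channel
            nn.Conv2d(channels, channels, 1),  # task-specific gating weights
            nn.Sigmoid(),                      # soft mask in [0, 1]
        )

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        # The backbone features stay generic; only the mask is task-specific.
        return shared_feat * self.gate(shared_feat)


class MultiScaleAttention(nn.Module):
    """Learns how much each scale contributes to the final prediction."""

    def __init__(self, channels: int, num_scales: int):
        super().__init__()
        self.score = nn.Conv2d(channels * num_scales, num_scales, 1)

    def forward(self, feats):
        # Resize every scale to the first (assumed finest) resolution, then
        # fuse with spatially varying softmax weights.
        target = feats[0].shape[-2:]
        feats = [nn.functional.interpolate(f, size=target, mode="bilinear",
                                           align_corners=False) for f in feats]
        weights = torch.softmax(self.score(torch.cat(feats, dim=1)), dim=1)
        return sum(w.unsqueeze(1) * f
                   for w, f in zip(weights.unbind(dim=1), feats))


class MedusaTaskHead(nn.Module):
    """One detachable head: per-scale feature attention + multi-scale fusion."""

    def __init__(self, channels: int, num_scales: int, out_channels: int):
        super().__init__()
        self.attn = nn.ModuleList(
            [FeatureAttention(channels) for _ in range(num_scales)])
        self.fuse = MultiScaleAttention(channels, num_scales)
        self.predict = nn.Conv2d(channels, out_channels, 1)

    def forward(self, backbone_feats):
        task_feats = [a(f) for a, f in zip(self.attn, backbone_feats)]
        return self.predict(self.fuse(task_feats))
```

Under the UFL framing of the abstract, one such head would be attached per task on top of a shared backbone, and adding a new task would only add a new head rather than retraining the shared features.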
Related papers
- PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer [76.39111896665585]
Incremental Learning (IL) aims to learn deep models on sequential tasks continually.
Recent large-scale pre-trained models (PTMs) have achieved outstanding performance in practical IL through prompt techniques, without requiring access to old samples.
arXiv Detail & Related papers (2024-07-04T10:37:58Z)
- Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to predict multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z)
- M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design [95.41238363769892]
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly.
Current MTL regimes have to activate nearly the entire model even when executing only a single task.
We present a model-accelerator co-design framework to enable efficient on-device MTL.
arXiv Detail & Related papers (2022-10-26T15:40:24Z)
- Sequential Cross Attention Based Multi-task Learning [22.430705836627148]
We propose a novel architecture that effectively transfers informative features by applying the attention mechanism to the multi-scale features of the tasks.
Our method achieves state-of-the-art performance on the NYUD-v2 and PASCAL-Context datasets.
arXiv Detail & Related papers (2022-09-06T14:17:33Z)
- Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation [97.17517060585875]
We present a unified approach to visual navigation using a novel modular transfer learning model.
Our model can effectively leverage its experience from one source task and apply it to multiple target tasks.
Our approach learns faster, generalizes better, and outperforms SoTA models by a significant margin.
arXiv Detail & Related papers (2022-02-05T00:07:21Z)
- Attentive Task Interaction Network for Multi-Task Learning [4.1372815372396525]
ATI-Net employs knowledge distillation of the latent features for each task, then combines the feature maps to provide improved contextualized information to the decoder.
This novel approach of introducing knowledge distillation into an attention-based multitask network outperforms state-of-the-art MTL baselines.
arXiv Detail & Related papers (2022-01-25T22:03:20Z)
- HydaLearn: Highly Dynamic Task Weighting for Multi-task Learning with Auxiliary Tasks [4.095907708855597]
Multi-task learning (MTL) can improve performance on a task by sharing representations with one or more related auxiliary tasks.
Usually, MTL networks are trained on a composite loss function formed by a constant weighted combination of the separate task losses.
In practice, constant loss weights lead to poor results, in part because for mini-batch based optimisation the optimal task weights vary significantly from one update to the next, depending on mini-batch sample composition.
We introduce HydaLearn, an intelligent weighting algorithm that connects main-task gain to the individual task gradients in order to inform dynamic loss weighting (a toy sketch of this weighting issue follows the list below).
arXiv Detail & Related papers (2020-08-26T16:04:02Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning [82.62433731378455]
We show that tasks with high affinity at a certain scale are not guaranteed to retain this behaviour at other scales.
We propose a novel architecture, namely MTI-Net, that builds upon this finding.
arXiv Detail & Related papers (2020-01-19T21:02:36Z)
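
To make the loss-weighting issue from the HydaLearn summary above concrete, here is a toy sketch contrasting the usual constant weighted composite loss with a simple per-batch dynamic weighting heuristic. The inverse-gradient-norm rule and the function names shown here are illustrative assumptions only, not HydaLearn's actual gain-based algorithm.

```python
# Toy contrast between a constant weighted composite MTL loss and a
# per-batch dynamic weighting heuristic. Illustrative only; not HydaLearn.
import torch


def composite_loss(task_losses, weights):
    """Constant weighted combination of the separate task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))


def inverse_grad_norm_weights(task_losses, shared_params):
    """Re-weight tasks every mini-batch, inversely to the norm of their
    gradients on the shared parameters, so no single task dominates the
    update. A stand-in heuristic, not HydaLearn's gain-based rule."""
    norms = []
    for loss in task_losses:
        grads = torch.autograd.grad(loss, shared_params, retain_graph=True)
        norms.append(torch.cat([g.flatten() for g in grads]).norm())
    inv = 1.0 / (torch.stack(norms) + 1e-8)
    return (inv / inv.sum()).tolist()
```

In a training loop, such weights would be recomputed for every mini-batch before forming the composite loss, rather than being fixed hyperparameters.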
This list is automatically generated from the titles and abstracts of the papers on this site.