Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning
- URL: http://arxiv.org/abs/2406.03048v2
- Date: Thu, 5 Sep 2024 09:28:21 GMT
- Title: Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning
- Authors: Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki,
- Abstract summary: In the Multi-task Learning (MTL) framework, every task demands distinct feature representations, ranging from low-level to high-level attributes.
This work introduces Layer-d Multi-Task models that utilize structured sparsity to refine feature selection for individual tasks and enhance the performance of all tasks in a multi-task scenario.
- Score: 4.462334751640166
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the Multi-task Learning (MTL) framework, every task demands distinct feature representations, ranging from low-level to high-level attributes. It is vital to address the specific (feature/parameter) needs of each task, especially in computationally constrained environments. This work, therefore, introduces Layer-Optimized Multi-Task (LOMT) models that utilize structured sparsity to refine feature selection for individual tasks and enhance the performance of all tasks in a multi-task scenario. Structured or group sparsity systematically eliminates parameters from trivial channels and, sometimes, eventually, entire layers within a convolution neural network during training. Consequently, the remaining layers provide the most optimal features for a given task. In this two-step approach, we subsequently leverage this sparsity-induced optimal layer information to build the LOMT models by connecting task-specific decoders to these strategically identified layers, deviating from conventional approaches that uniformly connect decoders at the end of the network. This tailored architecture optimizes the network, focusing on essential features while reducing redundancy. We validate the efficacy of the proposed approach on two datasets, i.e., NYU-v2 and CelebAMask-HD datasets, for multiple heterogeneous tasks. A detailed performance analysis of the LOMT models, in contrast to the conventional MTL models, reveals that the LOMT models outperform for most task combinations. The excellent qualitative and quantitative outcomes highlight the effectiveness of employing structured sparsity for optimal layer (or feature) selection.
Related papers
- AdapMTL: Adaptive Pruning Framework for Multitask Learning Model [5.643658120200373]
AdapMTL is an adaptive pruning framework for multitask models.
It balances sparsity allocation and accuracy performance across multiple tasks.
It showcases superior performance compared to state-of-the-art pruning methods.
arXiv Detail & Related papers (2024-08-07T17:19:15Z) - Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift ( TDS) and Task-Distribution Corruption (TDC)
arXiv Detail & Related papers (2023-11-23T15:46:54Z) - Multi-Task Cooperative Learning via Searching for Flat Minima [8.835287696319641]
We propose to formulate MTL as a multi/bi-level optimization problem, and therefore force features to learn from each task in a cooperative approach.
Specifically, we update the sub-model for each task alternatively taking advantage of the learned sub-models of the other tasks.
To alleviate the negative transfer problem during the optimization, we search for flat minima for the current objective function.
arXiv Detail & Related papers (2023-09-21T14:00:11Z) - Diffusion Model is an Effective Planner and Data Synthesizer for
Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (textscMTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find textscMTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z) - Task Aware Feature Extraction Framework for Sequential Dependence
Multi-Task Learning [1.0765359420035392]
We analyze sequential dependence MTL from rigorous mathematical perspective.
We propose a Task Aware Feature Extraction (TAFE) framework for sequential dependence MTL.
arXiv Detail & Related papers (2023-01-06T13:12:59Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Adaptive Task Adapting Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z) - A Tree-Structured Multi-Task Model Recommender [25.445073413243925]
Tree-structured multi-task architectures have been employed to tackle multiple vision tasks in the context of multi-task learning (MTL)
This paper proposes a recommender that automatically suggests tree-structured multi-task architectures that could achieve a high task performance while meeting a user-specified computation budget without performing model training.
Extensive evaluations on popular MTL benchmarks show that the recommended architectures could achieve competitive task accuracy and computation efficiency compared with state-of-the-art MTL methods.
arXiv Detail & Related papers (2022-03-10T00:09:43Z) - Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.
We propose the Semi-supervised Multi-Task Learning (MTL) method to leverage the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy issue among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z) - Exploring Relational Context for Multi-Task Dense Prediction [76.86090370115]
We consider a multi-task environment for dense prediction tasks, represented by a common backbone and independent task-specific heads.
We explore various attention-based contexts, such as global and local, in the multi-task setting.
We propose an Adaptive Task-Relational Context module, which samples the pool of all available contexts for each task pair.
arXiv Detail & Related papers (2021-04-28T16:45:56Z) - Deeper Task-Specificity Improves Joint Entity and Relation Extraction [0.0]
Multi-task learning (MTL) is an effective method for learning related tasks, but designing MTL models requires deciding which and how many parameters should be task-specific.
We propose a novel neural architecture that allows for deeper task-specificity than does prior work.
arXiv Detail & Related papers (2020-02-15T18:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.