MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
- URL: http://arxiv.org/abs/2312.08636v1
- Date: Thu, 14 Dec 2023 03:33:02 GMT
- Title: MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning
- Authors: Yi Xin, Junlong Du, Qiang Wang, Ke Yan, Shouhong Ding
- Abstract summary: Multi-task learning is designed to train multiple correlated tasks simultaneously.
To avoid task-specific decoders whose complexity grows with the number of tasks, we integrate the decoder-free vision-language model CLIP.
We propose Multi-modal Alignment Prompt (MmAP) for CLIP, which aligns the text and visual modalities during the fine-tuning process.
- Score: 29.88567810099265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Task Learning (MTL) is designed to train multiple correlated tasks
simultaneously, thereby enhancing the performance of individual tasks.
Typically, a multi-task network structure consists of a shared backbone and
task-specific decoders. However, the complexity of the decoders increases with
the number of tasks. To tackle this challenge, we integrate the decoder-free
vision-language model CLIP, which exhibits robust zero-shot generalization
capability. Recently, parameter-efficient transfer learning methods have been
extensively explored with CLIP for adapting to downstream tasks, where prompt
tuning showcases strong potential. Nevertheless, these methods solely fine-tune
a single modality (text or visual), disrupting the modality structure of CLIP.
In this paper, we first propose Multi-modal Alignment Prompt (MmAP) for CLIP,
which aligns the text and visual modalities during the fine-tuning process. Building
upon MmAP, we develop an innovative multi-task prompt learning framework. On
the one hand, to maximize the complementarity of tasks with high similarity, we
utilize a gradient-driven task grouping method that partitions tasks into
several disjoint groups and assigns a group-shared MmAP to each group. On the
other hand, to preserve the unique characteristics of each task, we assign a
task-specific MmAP to each task. Comprehensive experiments on two large
multi-task learning datasets demonstrate that our method achieves significant
performance improvements compared to full fine-tuning while only utilizing
approximately 0.09% of trainable parameters.
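The core MmAP idea can be sketched as a single shared "source" prompt projected into both modalities, so the text and visual prompts stay coupled during tuning, with a group-shared and a task-specific prompt composed per task. The dimensions, linear projections, and additive composition below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d_source, d_text, d_visual = 8, 16, 16  # toy dims; real CLIP embeddings are larger

class MmAP:
    """One shared source prompt projected into text and visual prompts,
    keeping the two modalities aligned through the shared parameters."""
    def __init__(self):
        self.source = rng.standard_normal(d_source)            # shared learnable prompt
        self.W_text = rng.standard_normal((d_source, d_text))  # projection to text branch
        self.W_visual = rng.standard_normal((d_source, d_visual))  # projection to visual branch

    def prompts(self):
        # Both modality prompts derive from the same source vector.
        return self.source @ self.W_text, self.source @ self.W_visual

# Each task combines a group-shared MmAP (tasks in the same group share it)
# with its own task-specific MmAP; here composition is a simple sum.
group_mmap, task_mmap = MmAP(), MmAP()
text_prompt = group_mmap.prompts()[0] + task_mmap.prompts()[0]
visual_prompt = group_mmap.prompts()[1] + task_mmap.prompts()[1]
print(text_prompt.shape, visual_prompt.shape)  # (16,) (16,)
```

Only the small source prompts and projections would be trained, which is consistent with the tiny trainable-parameter budget reported in the abstract.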
Related papers
- AdapMTL: Adaptive Pruning Framework for Multitask Learning Model [5.643658120200373]
AdapMTL is an adaptive pruning framework for multitask models.
It balances sparsity allocation and accuracy performance across multiple tasks.
It showcases superior performance compared to state-of-the-art pruning methods.
arXiv Detail & Related papers (2024-08-07T17:19:15Z) - DMTG: One-Shot Differentiable Multi-Task Grouping [32.72240053032646]
We aim to address Multi-Task Learning (MTL) with a large number of tasks via Multi-Task Grouping (MTG).
We propose to identify the best task groups from 2^N candidates and train the model weights in one shot, with the high-order task affinity fully exploited.
arXiv Detail & Related papers (2024-07-06T13:54:00Z) - Cross-Task Affinity Learning for Multitask Dense Scene Predictions [5.939164722752263]
Multitask learning (MTL) has become prominent for its ability to predict multiple tasks jointly.
We introduce the Cross-Task Affinity Learning (CTAL) module, a lightweight framework that enhances task refinement in multitask networks.
Our results demonstrate state-of-the-art MTL performance for both CNN and transformer backbones, using significantly fewer parameters than single-task learning.
arXiv Detail & Related papers (2024-01-20T05:31:47Z) - Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z) - Knowledge Assembly: Semi-Supervised Multi-Task Learning from Multiple Datasets with Disjoint Labels [8.816979799419107]
Multi-Task Learning (MTL) can learn several such tasks jointly, but usually requires datasets labeled for all tasks.
We propose a method that can leverage datasets labeled for only some of the tasks in the MTL framework.
Our work, Knowledge Assembly (KA), learns multiple tasks from disjoint datasets by leveraging the unlabeled data in a semi-supervised manner.
arXiv Detail & Related papers (2023-06-15T04:05:03Z) - Musketeer: Joint Training for Multi-task Vision Language Model with Task Explanation Prompts [75.75548749888029]
We present a vision-language model whose parameters are jointly trained on all tasks and fully shared among multiple heterogeneous tasks.
With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.
arXiv Detail & Related papers (2023-05-11T17:57:49Z) - Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
arXiv Detail & Related papers (2022-04-16T00:56:12Z) - Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z) - Semi-supervised Multi-task Learning for Semantics and Depth [88.77716991603252]
Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance.
We propose a semi-supervised MTL method that leverages the available supervisory signals from different datasets.
We present a domain-aware discriminator structure with various alignment formulations to mitigate the domain discrepancy issue among datasets.
arXiv Detail & Related papers (2021-10-14T07:43:39Z) - Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We propose a new benchmark suite aimed at compositional tasks, MultiRavens, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z) - Latent Group Structured Multi-task Learning [2.827177139912107]
In multi-task learning (MTL), we improve the performance of key machine learning algorithms by training various tasks jointly.
We present our group structured latent-space multi-task learning model, which encourages group structured tasks defined by prior information.
Experiments are conducted on both synthetic and real-world datasets, showing competitive performance over single-task learning.
arXiv Detail & Related papers (2020-11-24T05:38:58Z)
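The gradient-driven task grouping used by MmAP, and the affinity-based grouping explored in papers such as DMTG above, can be illustrated with a simplified greedy sketch. This is not any paper's exact algorithm: the task names, stand-in gradient vectors, and threshold are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in: one gradient vector (w.r.t. the shared backbone) per task.
# Real methods would average such gradients over many training batches.
tasks = ["seg", "depth", "normal", "edge"]
grads = {t: rng.standard_normal(32) for t in tasks}

def affinity(a, b):
    """Cosine similarity of task gradients: positive values suggest
    the two tasks update the shared backbone in compatible directions."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Greedy partition into disjoint groups: a task joins an existing group
# only if its mean affinity with the group's members exceeds a threshold.
groups, threshold = [], 0.0
for t in tasks:
    for g in groups:
        if np.mean([affinity(grads[t], grads[m]) for m in g]) > threshold:
            g.append(t)
            break
    else:
        groups.append([t])

print(groups)  # disjoint groups covering all tasks
```

In the MmAP framework, each resulting group would receive a group-shared prompt, while every task keeps its own task-specific prompt on top.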
This list is automatically generated from the titles and abstracts of the papers in this site.