NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
- URL: http://arxiv.org/abs/2510.18258v1
- Date: Tue, 21 Oct 2025 03:29:40 GMT
- Title: NTKMTL: Mitigating Task Imbalance in Multi-Task Learning from Neural Tangent Kernel Perspective
- Authors: Xiaohan Qin, Xiaoxing Wang, Ning Liao, Junchi Yan,
- Abstract summary: Multi-Task Learning (MTL) enables a single model to learn multiple tasks simultaneously.<n> task imbalance remains a major challenge in MTL.<n>We propose a new MTL method, NTKMTL, to analyze the training dynamics in MTL.
- Score: 58.345210583013454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-Task Learning (MTL) enables a single model to learn multiple tasks simultaneously, leveraging knowledge transfer among tasks for enhanced generalization, and has been widely applied across various domains. However, task imbalance remains a major challenge in MTL. Although balancing the convergence speeds of different tasks is an effective approach to address this issue, it is highly challenging to accurately characterize the training dynamics and convergence speeds of multiple tasks within the complex MTL system. To this end, we attempt to analyze the training dynamics in MTL by leveraging Neural Tangent Kernel (NTK) theory and propose a new MTL method, NTKMTL. Specifically, we introduce an extended NTK matrix for MTL and adopt spectral analysis to balance the convergence speeds of multiple tasks, thereby mitigating task imbalance. Based on the approximation via shared representation, we further propose NTKMTL-SR, achieving training efficiency while maintaining competitive performance. Extensive experiments demonstrate that our methods achieve state-of-the-art performance across a wide range of benchmarks, including both multi-task supervised learning and multi-task reinforcement learning. Source code is available at https://github.com/jianke0604/NTKMTL.
Related papers
- Optimizing Dense Visual Predictions Through Multi-Task Coherence and Prioritization [7.776434991976473]
Multi-Task Learning (MTL) involves the concurrent training of multiple tasks.<n>We propose an advanced MTL model specifically designed for dense vision tasks.
arXiv Detail & Related papers (2024-12-04T10:05:47Z) - ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt [51.71932333475573]
Large Multimodal Models (LMMs) exhibit remarkable multi-tasking ability by learning mixed instruction datasets.<n>Existing MCIT methods do not fully exploit the unique attribute of LMMs.<n>We propose a novel prompt learning framework for MCIT to effectively alleviate forgetting of previous knowledge.
arXiv Detail & Related papers (2024-10-08T09:35:37Z) - Fair Resource Allocation in Multi-Task Learning [12.776767874217663]
Multi-task learning (MTL) can leverage the shared knowledge across tasks, resulting in improved data efficiency and generalization performance.
A major challenge in MTL lies in the presence of conflicting gradients, which can hinder the fair optimization of some tasks.
Inspired by fair resource allocation in communication networks, we propose FairGrad, a novel MTL optimization method.
arXiv Detail & Related papers (2024-02-23T22:46:14Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs)
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Equitable Multi-task Learning [18.65048321820911]
Multi-task learning (MTL) has achieved great success in various research domains, such as CV, NLP and IR.
We propose a novel multi-task optimization method, named EMTL, to achieve equitable MTL.
Our method stably outperforms state-of-the-art methods on the public benchmark datasets of two different research domains.
arXiv Detail & Related papers (2023-06-15T03:37:23Z) - M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task
Learning with Model-Accelerator Co-design [95.41238363769892]
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly.
Current MTL regimes have to activate nearly the entire model even to just execute a single task.
We present a model-accelerator co-design framework to enable efficient on-device MTL.
arXiv Detail & Related papers (2022-10-26T15:40:24Z) - Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners [67.5865966762559]
We study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning.
We devise task-aware gating functions to route examples from different tasks to specialized experts.
This results in a sparsely activated multi-task model with a large number of parameters, but with the same computational cost as that of a dense model.
arXiv Detail & Related papers (2022-04-16T00:56:12Z) - Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.