No More Tuning: Prioritized Multi-Task Learning with Lagrangian Differential Multiplier Methods
- URL: http://arxiv.org/abs/2412.12092v1
- Date: Mon, 16 Dec 2024 18:58:28 GMT
- Title: No More Tuning: Prioritized Multi-Task Learning with Lagrangian Differential Multiplier Methods
- Authors: Zhengxing Cheng, Yuheng Huang, Zhixuan Zhang, Dan Ou, Qingwen Liu
- Abstract summary: In web search, relevance is often prioritized over other metrics, such as click-through rates or user engagement.
Existing frameworks pay insufficient attention to the prioritization among different tasks.
We introduce a novel multi-task learning framework employing Lagrangian Differential Multiplier Methods for step-wise multi-task optimization.
- Score: 10.725249826517734
- Abstract: Given the ubiquity of multi-task scenarios in practical systems, Multi-Task Learning (MTL) has found widespread application across diverse domains. In real-world scenarios, these tasks often have different priorities. For instance, in web search, relevance is often prioritized over other metrics, such as click-through rates or user engagement. Existing frameworks pay insufficient attention to the prioritization among different tasks; they typically adjust task-specific loss function weights to differentiate task priorities. However, this approach encounters challenges as the number of tasks grows, leading to exponential increases in hyper-parameter tuning complexity. Furthermore, the simultaneous optimization of multiple objectives can negatively impact the performance of high-priority tasks due to interference from lower-priority tasks. In this paper, we introduce a novel multi-task learning framework employing Lagrangian Differential Multiplier Methods for step-wise multi-task optimization. It is designed to boost the performance of high-priority tasks without interference from other tasks. Its primary advantage lies in its ability to automatically optimize multiple objectives without requiring balancing hyper-parameters for different tasks, thereby eliminating the need for manual tuning. Additionally, we provide theoretical analysis demonstrating that our method ensures optimization guarantees, enhancing the reliability of the process. We demonstrate its effectiveness through experiments on multiple public datasets and its application in Taobao search, a large-scale industrial search ranking system, resulting in significant improvements across various business metrics.
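A minimal sketch (an assumption-laden illustration, not the authors' code) of how a differential method of multipliers can encode task priority: a lower-priority loss is minimized subject to the constraint that the high-priority loss stays below a target, and the Lagrange multiplier is raised by gradient ascent whenever the constraint is violated. The two-task setup and the names `primary_target` and `lam_lr` are illustrative assumptions.

```python
import torch

def prioritized_step(model, batch, primary_loss_fn, secondary_loss_fn,
                     lam, primary_target, opt, lam_lr=0.01):
    """One step on the Lagrangian L_sec + lam * (L_pri - target):
    gradient descent on the model parameters, gradient ascent on lam."""
    opt.zero_grad()
    l_pri = primary_loss_fn(model, batch)    # high-priority task loss
    l_sec = secondary_loss_fn(model, batch)  # lower-priority task loss
    violation = l_pri - primary_target       # constraint: l_pri <= target
    (l_sec + lam * violation).backward()
    opt.step()
    with torch.no_grad():                    # dual ascent, keep lam >= 0
        lam = torch.clamp(lam + lam_lr * violation.detach(), min=0.0)
    return lam
```

Under the step-wise scheme the abstract describes, `primary_target` would plausibly be the loss the high-priority task attains when optimized on its own, so that lower-priority tasks can only improve within that budget.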
Related papers
- Selective Task Group Updates for Multi-Task Optimization [44.601029688423914]
We introduce an algorithm that adaptively determines how to effectively group tasks and update them during the learning process.
Our methods substantially outperform previous multi-task optimization approaches.
arXiv Detail & Related papers (2025-02-17T16:26:05Z)
- MaZO: Masked Zeroth-Order Optimization for Multi-Task Fine-Tuning of Large Language Models [26.980104922985326]
We present MaZO, the first framework specifically designed for multi-task LLM fine-tuning under ZO optimization.
MaZO tackles these challenges at the parameter level through two key innovations: a weight importance metric to identify critical parameters and a multi-task weight update mask to selectively update these parameters (a sketch follows this entry).
Experiments demonstrate that MaZO achieves state-of-the-art performance, surpassing even multi-task learning methods designed for first-order optimization.
arXiv Detail & Related papers (2025-02-17T07:28:52Z)
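A hedged sketch, in the spirit of the MaZO entry above, of a masked zeroth-order (ZO) step: two perturbed forward passes yield a scalar directional-derivative estimate, and a precomputed 0/1 mask confines both the probe and the update to the parameters deemed critical. How the masks are built (the paper's weight-importance metric) is omitted, and all names are illustrative.

```python
import torch

@torch.no_grad()
def masked_zo_step(params, masks, loss_fn, lr=1e-4, eps=1e-3, seed=0):
    """SPSA-style two-point gradient estimate, restricted by 0/1 masks."""
    gen = torch.Generator().manual_seed(seed)
    zs = [torch.randn(p.shape, generator=gen) * m
          for p, m in zip(params, masks)]
    for p, z in zip(params, zs):              # probe at theta + eps*z
        p.add_(eps * z)
    loss_plus = loss_fn()
    for p, z in zip(params, zs):              # probe at theta - eps*z
        p.sub_(2 * eps * z)
    loss_minus = loss_fn()
    for p, z in zip(params, zs):              # restore theta
        p.add_(eps * z)
    g = (loss_plus - loss_minus) / (2 * eps)  # directional-derivative estimate
    for p, z in zip(params, zs):              # masked parameter update
        p.sub_(lr * g * z)
```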
- Quantifying Task Priority for Multi-Task Optimization [44.601029688423914]
The goal of multi-task learning is to learn diverse tasks within a single unified network.
We present a new method named connection strength-based optimization for multi-task learning.
arXiv Detail & Related papers (2024-06-05T06:52:29Z)
- In Defense of the Unitary Scalarization for Deep Multi-Task Learning [121.76421174107463]
We present a theoretical analysis suggesting that many specialized multi-task optimizers can be interpreted as forms of regularization.
We show that, when coupled with standard regularization and stabilization techniques, unitary scalarization matches or improves upon the performance of complex multi-task optimizers (see the sketch after this entry).
arXiv Detail & Related papers (2022-01-11T18:44:17Z)
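"Unitary scalarization" is simply the unweighted sum of all task losses; the entry above argues that, with standard regularization, this baseline is hard to beat. A minimal sketch:

```python
import torch

def unitary_scalarization_step(model, batch, loss_fns, opt):
    """One update on the plain sum of task losses: no weights, no tuning."""
    opt.zero_grad()
    total = sum(fn(model, batch) for fn in loss_fns)
    total.backward()
    opt.step()
    return float(total)
```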
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks (the Gumbel-Softmax sampling it builds on is sketched after this entry).
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
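VMTL's Gumbel-Softmax priors rest on the standard Gumbel-Softmax trick: drawing differentiable, approximately categorical samples. A minimal sketch (PyTorch also ships this as `torch.nn.functional.gumbel_softmax`):

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0):
    """Differentiable approximate sample from Categorical(softmax(logits));
    smaller tau pushes the sample closer to one-hot."""
    u = torch.rand_like(logits)
    gumbels = -torch.log(-torch.log(u + 1e-20) + 1e-20)  # Gumbel(0,1) noise
    return F.softmax((logits + gumbels) / tau, dim=-1)
```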
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We propose MultiRavens, a new benchmark suite aimed at compositional tasks, which allows defining custom task combinations.
Second, we propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Small Towers Make Big Differences [59.243296878666285]
Multi-task learning aims at solving multiple machine learning tasks at the same time.
A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal.
We propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds.
arXiv Detail & Related papers (2020-08-13T10:45:31Z)
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient (sketched after this entry).
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
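A minimal sketch of the projection the entry above describes (PCGrad), assuming flattened per-task gradient vectors: when two task gradients conflict (negative inner product), the component of one along the other is removed.

```python
import torch

def pcgrad_project(g_i, g_j):
    """Project g_i onto the normal plane of g_j if the two conflict."""
    dot = torch.dot(g_i, g_j)
    if dot < 0:  # conflicting gradients
        g_i = g_i - (dot / (g_j.norm() ** 2 + 1e-12)) * g_j
    return g_i
```

In the full method this projection is applied pairwise over tasks in random order before the projected gradients are summed for the update.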
This list is automatically generated from the titles and abstracts of the papers on this site.