DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles
- URL: http://arxiv.org/abs/2509.23948v1
- Date: Sun, 28 Sep 2025 15:57:06 GMT
- Title: DiBS-MTL: Transformation-Invariant Multitask Learning with Direction Oracles
- Authors: Surya Murthy, Kushagra Gupta, Mustafa O. Karabag, David Fridovich-Keil, Ufuk Topcu
- Abstract summary: Multitask learning (MTL) algorithms typically rely on schemes that combine different task losses or their gradients through weighted averaging. In doing so, a central challenge arises because task losses can be arbitrarily scaled relative to one another. The convergence behavior of DiBS in nonconvex MTL settings was previously not understood; this paper proves subsequence convergence to a Pareto stationary point and proposes DiBS-MTL.
- Score: 20.925878778939083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multitask learning (MTL) algorithms typically rely on schemes that combine different task losses or their gradients through weighted averaging. These methods aim to find Pareto stationary points by using heuristics that require access to task loss values, gradients, or both. In doing so, a central challenge arises because task losses can be arbitrarily, nonaffinely scaled relative to one another, causing certain tasks to dominate training and degrade overall performance. A recent advance in cooperative bargaining theory, the Direction-based Bargaining Solution (DiBS), yields Pareto stationary solutions immune to task domination because of its invariance to monotonic nonaffine task loss transformations. However, the convergence behavior of DiBS in nonconvex MTL settings is currently not understood. To this end, we prove that under standard assumptions, a subsequence of DiBS iterates converges to a Pareto stationary point when task losses are possibly nonconvex, and propose DiBS-MTL, a computationally efficient adaptation of DiBS to the MTL setting. Finally, we validate DiBS-MTL empirically on standard MTL benchmarks, showing that it achieves competitive performance with state-of-the-art methods while maintaining robustness to nonaffine monotonic transformations that significantly degrade the performance of existing approaches, including prior bargaining-inspired MTL methods. Code available at https://github.com/suryakmurthy/dibs-mtl.
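To make the scaling challenge in the abstract concrete, the toy illustration below (not from the paper's code; the losses, points, and names are invented) applies a monotone nonaffine transformation, exp, to one task's loss. The equal-weight gradient average shifts toward that task, while the task's gradient direction, the quantity a direction-based scheme like DiBS relies on, is unchanged.

```python
# Toy illustration of task domination under nonaffine monotone loss rescaling.
# Not the authors' code; losses, points, and names are invented for the example.
import numpy as np

def grad_task1(x):                         # gradient of L1(x) = ||x - (1, 0)||^2
    return 2.0 * (x - np.array([1.0, 0.0]))

def grad_task2(x):                         # gradient of L2(x) = ||x - (0, 1)||^2
    return 2.0 * (x - np.array([0.0, 1.0]))

def loss_task2(x):
    return np.sum((x - np.array([0.0, 1.0])) ** 2)

def unit(g):
    return g / np.linalg.norm(g)

x = np.array([2.0, 2.0])
g1, g2 = grad_task1(x), grad_task2(x)

# Equal-weight averaging before and after replacing L2 with exp(L2).
# Chain rule: grad exp(L2) = exp(L2) * grad L2, so task 2 dominates the average.
avg_before = 0.5 * (g1 + g2)
avg_after = 0.5 * (g1 + np.exp(loss_task2(x)) * g2)

print("average direction before:", unit(avg_before))
print("average direction after: ", unit(avg_after))          # shifted toward task 2
print("task-2 direction is invariant:",
      np.allclose(unit(g2), unit(np.exp(loss_task2(x)) * g2)))
```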
Related papers
- SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation [11.368244787718673]
Sharpness-aware minimization (SAM) minimizes task loss while simultaneously reducing the sharpness of the loss landscape. We propose SAMO, a lightweight Sharpness-Aware Multi-task Optimization approach.
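As background for this entry, here is a hedged one-step sketch of plain sharpness-aware minimization on a toy quadratic loss; the learning rate, radius rho, and loss are illustrative, and SAMO's joint global-local perturbation is not reproduced here.

```python
# Minimal SAM sketch, assuming a toy quadratic loss 0.5 * ||w - TARGET||^2.
import numpy as np

TARGET = np.array([1.0, -2.0])

def grad(w):
    return w - TARGET

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascend to a nearby worst-case point
    g_sharp = grad(w + eps)                        # gradient at the perturbed weights
    return w - lr * g_sharp                        # descend with the sharpness-aware gradient

w = np.array([3.0, 3.0])
for _ in range(20):
    w = sam_step(w)
print(w)   # approaches TARGET while accounting for local sharpness
```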
arXiv Detail & Related papers (2025-07-10T16:06:02Z) - Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning [57.514786046966265]
We propose Perturb-and-Merge (P&M), a novel continual learning framework that integrates model merging into the CL paradigm to mitigate forgetting. Our proposed approach achieves state-of-the-art performance on several continual learning benchmark datasets.
arXiv Detail & Related papers (2025-05-28T14:14:19Z) - LDC-MTL: Balancing Multi-Task Learning through Scalable Loss Discrepancy Control [48.98651927356094]
Multi-task learning (MTL) has been widely adopted for its ability to simultaneously learn multiple tasks. We propose LDC-MTL, a simple and scalable loss discrepancy control approach for MTL, formulated from a bilevel optimization perspective. Our method incorporates two key components: (i) a bilevel formulation for fine-grained loss discrepancy control, and (ii) a scalable first-order bilevel algorithm that requires only $\mathcal{O}(1)$ time and memory.
arXiv Detail & Related papers (2025-02-12T17:18:14Z) - Multi-task learning via robust regularized clustering with non-convex group penalties [0.0]
Multi-task learning (MTL) aims to improve estimation performance by sharing common information among related tasks.
Existing MTL methods based on this assumption often ignore outlier tasks.
We propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC).
arXiv Detail & Related papers (2024-04-04T07:09:43Z) - Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement [69.51496713076253]
In this paper, we focus on the aforementioned efficiency aspects of existing MTL methods.
We first carry out large-scale experiments of the methods with smaller backbones and on the MetaGraspNet dataset as a new test ground.
We also propose Feature Disentanglement measure as a novel and efficient identifier of the challenges in MTL.
arXiv Detail & Related papers (2024-02-05T22:15:55Z) - Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z) - FairBranch: Mitigating Bias Transfer in Fair Multi-task Learning [15.319254128769973]
Multi-Task Learning (MTL) suffers when unrelated tasks negatively impact each other by updating shared parameters with conflicting gradients.
This is known as negative transfer and leads to a drop in MTL accuracy compared to single-task learning (STL).
arXiv Detail & Related papers (2023-10-20T18:07:15Z) - Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least square support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z) - Dual-Balancing for Multi-Task Learning [42.613360970194734]
We propose a Dual-Balancing Multi-Task Learning (DB-MTL) method to alleviate the task balancing problem from both loss and gradient perspectives.
DB-MTL ensures loss-scale balancing by performing a logarithm transformation on each task loss, and guarantees gradient-magnitude balancing via normalizing all task gradients to the same magnitude as the maximum gradient norm.
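A minimal sketch of the two balancing steps described above, assuming per-task losses and gradients are already computed; the helper name and the final averaging are invented for illustration and this is not the authors' implementation.

```python
# Sketch of DB-MTL-style balancing: log-transform each loss (so its gradient is
# divided by the loss value) and rescale every task gradient to the maximum norm.
import numpy as np

def balanced_update(task_losses, task_grads, eps=1e-12):
    # Loss-scale balancing: grad of log(L_i) is grad(L_i) / L_i.
    grads = [g / max(loss, eps) for loss, g in zip(task_losses, task_grads)]
    # Gradient-magnitude balancing: rescale each gradient to the largest norm.
    norms = [np.linalg.norm(g) for g in grads]
    target = max(norms)
    grads = [g * (target / max(n, eps)) for g, n in zip(grads, norms)]
    return np.mean(grads, axis=0)   # combined direction (simple average here)

losses = [0.5, 120.0]
grads = [np.array([1.0, 0.0]), np.array([0.0, 40.0])]
print(balanced_update(losses, grads))
```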
arXiv Detail & Related papers (2023-08-23T09:41:28Z) - Beyond Losses Reweighting: Empowering Multi-Task Learning via the Generalization Perspective [61.10883077161432]
Multi-task learning (MTL) trains deep neural networks to optimize several objectives simultaneously using a shared backbone. We introduce a novel MTL framework that leverages weight perturbation to regulate gradient norms, thus improving generalization. Our method significantly outperforms existing gradient-based MTL techniques in terms of task performance and overall model robustness.
arXiv Detail & Related papers (2022-11-24T17:19:30Z) - Meta-Learning Adversarial Bandits [49.094361442409785]
We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.
As the first to target the adversarial setting, we design a meta-algorithm that yields setting-specific guarantees for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO).
Our guarantees rely on proving that unregularized follow-the-leader combined with multiplicative weights is enough to online learn a non-smooth and non-convex sequence of functions.
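For reference, the multiplicative-weights component named above follows the textbook Hedge rule sketched below on a generic loss sequence; this is only the standard update, not the paper's meta-algorithm or its follow-the-leader combination.

```python
# Textbook multiplicative-weights (Hedge) update over a sequence of loss vectors.
import numpy as np

def hedge(loss_vectors, eta=0.5):
    k = loss_vectors[0].shape[0]
    w = np.ones(k) / k                   # start from the uniform distribution
    for loss in loss_vectors:
        w = w * np.exp(-eta * loss)      # exponentially down-weight lossy experts
        w = w / w.sum()                  # renormalize to a probability vector
    return w

losses = [np.array([0.2, 0.9, 0.5]), np.array([0.1, 0.8, 0.7])]
print(hedge(losses))
```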
arXiv Detail & Related papers (2022-05-27T17:40:32Z) - Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
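The bargaining-game entry above treats the gradient-combination step as a negotiation among tasks. The sketch below assumes the commonly cited stationarity condition G^T G alpha = 1/alpha (elementwise) for positive per-task weights alpha, which may differ in detail from the paper's exact algorithm, and solves it numerically before combining the task gradients.

```python
# Hedged sketch of a Nash-bargaining-style update direction. The stationarity
# condition and the solver choice are assumptions made for illustration.
import numpy as np
from scipy.optimize import least_squares

def bargaining_direction(G):
    """G has shape (num_params, num_tasks); returns a joint update direction."""
    gram = G.T @ G                        # task-by-task Gram matrix of gradients
    k = G.shape[1]

    def residual(log_alpha):
        alpha = np.exp(log_alpha)         # positivity via log-parameterization
        return gram @ alpha - 1.0 / alpha

    alpha = np.exp(least_squares(residual, x0=np.zeros(k)).x)
    return G @ alpha                      # weighted combination of task gradients

# Toy usage: two partially conflicting task gradients as columns of G.
G = np.array([[1.0, -0.8],
              [0.2,  0.9]])
print(bargaining_direction(G))
```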
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.