Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone
- URL: http://arxiv.org/abs/2502.00217v1
- Date: Fri, 31 Jan 2025 23:11:12 GMT
- Title: Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone
- Authors: Negar Hassanpour, Muhammad Kamran Janjua, Kunlin Zhang, Sepehr Lavasani, Xiaowen Zhang, Chunhua Zhou, Chao Gao
- Abstract summary: We propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem.
Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective.
We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.
- Abstract: Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.
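The cone constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the reference gradient is the plain average of the task gradients, and the names `reference_gradient`, `in_cone`, `conflicting_pairs`, and the threshold `c` are hypothetical. It only checks whether a candidate update lies inside the cone; it does not solve the constrained optimization problem that ConicGrad formulates.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def reference_gradient(task_grads):
    """Assumed reference direction: the average of the task gradients."""
    n = len(task_grads)
    return [sum(column) / n for column in zip(*task_grads)]


def conflicting_pairs(task_grads):
    """Index pairs of task gradients with a negative inner product."""
    return [
        (i, j)
        for i in range(len(task_grads))
        for j in range(i + 1, len(task_grads))
        if dot(task_grads[i], task_grads[j]) < 0
    ]


def in_cone(d, g0, c):
    """True if update d lies in the cone around g0, i.e. cos(d, g0) >= c.

    c is a hypothetical threshold in [-1, 1]; larger values give a narrower cone.
    """
    return cosine(d, g0) >= c


# Two toy task gradients that conflict with each other.
g1 = [1.0, 0.0]
g2 = [-0.5, 1.0]
g0 = reference_gradient([g1, g2])

print(conflicting_pairs([g1, g2]))  # which task pairs conflict
print(in_cone(g1, g0, 0.5))         # is g1 an admissible update direction?
print(in_cone(g2, g0, 0.5))         # is g2 an admissible update direction?
```

In this toy setup, a larger `c` forces the update to stay closer to the reference gradient, while a smaller `c` leaves more room to favor individual tasks.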
Related papers
- Optimistic Gradient Learning with Hessian Corrections for High-Dimensional Black-Box Optimization [14.073853819633745]
Black-box algorithms are designed to optimize functions without relying on their underlying analytical structure or gradient information.
We propose two novel gradient learning variants to address the challenges posed by high-dimensional, complex, and highly non-linear problems.
arXiv Detail & Related papers (2025-02-07T11:03:50Z)
- Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling [0.0]
This study proposes a different approach that integrates gradient-based update through continuous relaxation, combined with Quasi-Quantum Annealing (QQA).
Numerical experiments demonstrate that our method is a competitive general-purpose solver, achieving performance comparable to iSCO and learning-based solvers.
arXiv Detail & Related papers (2024-09-02T12:55:27Z)
- Visual Prompt Tuning in Null Space for Continual Learning [51.96411454304625]
Existing prompt-tuning methods have demonstrated impressive performance in continual learning (CL).
This paper aims to learn each task by tuning the prompts in the direction orthogonal to the subspace spanned by previous tasks' features.
In practice, an effective null-space-based approximation solution has been proposed to implement the prompt gradient projection.
arXiv Detail & Related papers (2024-06-09T05:57:40Z)
- HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [72.25707314772254]
We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
The upper level of this framework is dedicated to learning a task-specific mask that delineates the harmony subspace, while the inner level focuses on updating parameters to enhance the overall performance of the unified policy.
arXiv Detail & Related papers (2024-05-28T11:41:41Z)
- Bayesian Uncertainty for Gradient Aggregation in Multi-Task Learning [39.4348419684885]
Multi-task learning (MTL) aims at learning a single model that solves several tasks efficiently.
We introduce a novel gradient aggregation approach using Bayesian inference.
We empirically demonstrate the benefits of our approach in a variety of datasets.
arXiv Detail & Related papers (2024-02-06T14:00:43Z)
- Independent Component Alignment for Multi-Task Learning [2.5234156040689237]
In a multi-task learning (MTL) setting, a single model is trained to tackle a diverse set of tasks jointly.
We propose using a condition number of a linear system of gradients as a stability criterion of an MTL optimization.
We present Aligned-MTL, a novel MTL optimization approach based on the proposed criterion.
arXiv Detail & Related papers (2023-05-30T12:56:36Z)
- Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning [11.998475119120531]
Multi-task learning (MTL) has been widely applied in online advertising and recommender systems.
We propose a transference-driven approach CoGrad that adaptively maximizes knowledge transference.
arXiv Detail & Related papers (2023-03-10T10:42:21Z)
- Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z)
- Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
- Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad) which minimizes the average loss function.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
arXiv Detail & Related papers (2021-10-26T22:03:51Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)