MaxGNR: A Dynamic Weight Strategy via Maximizing Gradient-to-Noise Ratio
for Multi-Task Learning
- URL: http://arxiv.org/abs/2302.09352v1
- Date: Sat, 18 Feb 2023 14:50:45 GMT
- Title: MaxGNR: A Dynamic Weight Strategy via Maximizing Gradient-to-Noise Ratio
for Multi-Task Learning
- Authors: Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
- Abstract summary: In computer vision, Multi-Task Learning (MTL) can outperform Single-Task Learning (STL).
In the MTL scenario, Inter-Task Gradient Noise (ITGN) is an additional source of gradient noise for each task.
We design the MaxGNR algorithm to alleviate the ITGN interference of each task.
- Score: 19.38778317110205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When modeling related tasks in computer vision, Multi-Task Learning (MTL) can
outperform Single-Task Learning (STL) due to its ability to capture the intrinsic
relatedness among tasks. However, MTL may suffer from an insufficient training
problem, i.e., some tasks in MTL may reach a non-optimal solution compared
with their STL counterparts. A series of studies point out that too much gradient
noise leads to performance degradation in STL; in the MTL scenario, however,
Inter-Task Gradient Noise (ITGN) is an additional source of gradient noise for
each task, which can also affect the optimization process. In this paper, we
identify ITGN as a key factor leading to the insufficient training problem. We
define the Gradient-to-Noise Ratio (GNR) to measure the relative magnitude of
gradient noise and design the MaxGNR algorithm to alleviate the ITGN interference
of each task by maximizing its GNR. We evaluate MaxGNR on two standard image MTL
datasets, NYUv2 and Cityscapes. The results show that our algorithm outperforms
the baselines under identical experimental conditions.
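The abstract above defines GNR as the relative magnitude of a task's gradient to its gradient noise and proposes maximizing it per task, but the exact estimator and weighting rule are not spelled out here. The sketch below is therefore only an illustration of GNR-style dynamic weighting, under stated assumptions: the gradient "signal" is approximated by an exponential moving average of mini-batch gradients, the "noise" by the deviation of the current mini-batch gradient from that average, and task weights are set in proportion to the estimated GNR. None of these choices should be read as the authors' MaxGNR algorithm.

```python
import torch

def estimate_gnr(losses, shared_params, ema_grads, beta=0.9, eps=1e-12):
    """Estimate a per-task Gradient-to-Noise Ratio (GNR) on the shared parameters.

    Assumption (not the paper's estimator): the gradient "signal" is an
    exponential moving average (EMA) of each task's mini-batch gradient, and
    the "noise" is the deviation of the current mini-batch gradient from it.
    """
    gnrs = []
    for t, loss in enumerate(losses):
        grads = torch.autograd.grad(loss, shared_params, retain_graph=True)
        flat = torch.cat([g.reshape(-1) for g in grads]).detach()
        if ema_grads[t] is None:
            ema_grads[t] = flat.clone()
        else:
            ema_grads[t].mul_(beta).add_(flat, alpha=1.0 - beta)
        signal = ema_grads[t].norm()
        noise = (flat - ema_grads[t]).norm()
        gnrs.append(signal / (noise + eps))
    return torch.stack(gnrs)

def gnr_weighted_loss(losses, gnrs):
    """Weight each task loss in proportion to its estimated GNR (illustrative heuristic)."""
    weights = gnrs / gnrs.sum()
    return sum(w * loss for w, loss in zip(weights, losses))
```

A training loop would call `estimate_gnr` once per step on the shared-backbone parameters and backpropagate the weighted loss; noisier tasks simply receive less influence on the shared update.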
Related papers
- Fair Resource Allocation in Multi-Task Learning [12.776767874217663]
Multi-task learning (MTL) can leverage the shared knowledge across tasks, resulting in improved data efficiency and generalization performance.
A major challenge in MTL lies in the presence of conflicting gradients, which can hinder the fair optimization of some tasks.
Inspired by fair resource allocation in communication networks, we propose FairGrad, a novel MTL optimization method.
arXiv Detail & Related papers (2024-02-23T22:46:14Z)
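The FairGrad entry above, like several papers below, is motivated by conflicting gradients. As a hedged aside (this illustrates the general notion of gradient conflict, not FairGrad's fair-resource-allocation scheme), the snippet below shows the standard diagnostic: two task gradients conflict when their cosine similarity is negative, so a step that helps one task locally hurts the other.

```python
import torch

def pairwise_gradient_cosine(task_grads):
    """task_grads: list of flattened per-task gradient tensors of equal length.

    Returns a (K, K) matrix of cosine similarities between task gradients;
    a negative entry (i, j) means that locally helping task i hurts task j.
    """
    G = torch.stack(task_grads)                    # (K, D)
    G = G / (G.norm(dim=1, keepdim=True) + 1e-12)  # row-normalize
    return G @ G.T

# Toy check: these two "gradients" point in conflicting directions.
g1 = torch.tensor([1.0, 0.0])
g2 = torch.tensor([-0.5, 1.0])
print(pairwise_gradient_cosine([g1, g2]))  # off-diagonal entries are negative
```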
- Robust Multi-Task Learning with Excess Risks [24.695243608197835]
Multi-task learning (MTL) considers learning a joint model for multiple tasks by optimizing a convex combination of all task losses.
Existing methods use an adaptive weight updating scheme, where task weights are dynamically adjusted based on their respective losses to prioritize difficult tasks.
We propose Multi-Task Learning with Excess Risks (ExcessMTL), an excess risk-based task balancing method that updates the task weights by their distances to convergence.
arXiv Detail & Related papers (2024-02-03T03:46:14Z)
- Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least squares support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z)
- Dual-Balancing for Multi-Task Learning [42.613360970194734]
We propose a Dual-Balancing Multi-Task Learning (DB-MTL) method to alleviate the task balancing problem from both loss and gradient perspectives.
DB-MTL ensures loss-scale balancing by performing a logarithm transformation on each task loss, and guarantees gradient-magnitude balancing via normalizing all task gradients to the same magnitude as the maximum gradient norm.
arXiv Detail & Related papers (2023-08-23T09:41:28Z)
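The DB-MTL entry above gives a concrete two-part recipe: take the logarithm of each task loss to balance loss scales, and rescale every task gradient to the maximum per-task gradient norm to balance gradient magnitudes. The sketch below follows that description on flattened shared-parameter gradients; the final averaging, the epsilon constants, and the function name are assumptions rather than the authors' implementation.

```python
import torch

def db_mtl_update_direction(losses, shared_params, eps=1e-8):
    """Dual-balancing sketch: log-transformed losses + gradient-norm matching."""
    flat_grads = []
    for loss in losses:
        # Loss-scale balancing: differentiate log(loss) instead of loss.
        grads = torch.autograd.grad(torch.log(loss + eps), shared_params,
                                    retain_graph=True)
        flat_grads.append(torch.cat([g.reshape(-1) for g in grads]))
    norms = torch.stack([g.norm() for g in flat_grads])
    target = norms.max()  # normalize every task gradient to the largest norm
    combined = sum((target / (n + eps)) * g for n, g in zip(norms, flat_grads))
    return combined / len(losses)
```

In a training loop, the returned vector would be scattered back into the shared parameters' `.grad` fields before the optimizer step, while task-specific heads keep their ordinary gradients.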
- Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks [0.0]
Multi-task learning (MTL) can improve the generalization performance of neural networks by sharing representations with related tasks.
MTL can also degrade performance through harmful interference between tasks.
We propose SLGrad, a sample-level weighting algorithm for multi-task learning with auxiliary tasks.
arXiv Detail & Related papers (2023-06-07T15:29:46Z)
- FAMO: Fast Adaptive Multitask Optimization [48.59232177073481]
We introduce Fast Adaptive Multitask Optimization (FAMO), a dynamic weighting method that decreases task losses in a balanced way.
Our results indicate that FAMO achieves comparable or superior performance to state-of-the-art gradient manipulation techniques.
arXiv Detail & Related papers (2023-06-06T15:39:54Z)
- Tensorized LSSVMs for Multitask Regression [48.844191210894245]
Multitask learning (MTL) can utilize the relatedness between multiple tasks for performance improvement.
A new MTL method, tLSSVM-MTL, is proposed by leveraging low-rank tensor analysis and Least Squares Support Vector Machines (LSSVMs).
arXiv Detail & Related papers (2023-03-04T16:36:03Z)
- M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design [95.41238363769892]
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly.
Current MTL regimes have to activate nearly the entire model even to just execute a single task.
We present a model-accelerator co-design framework to enable efficient on-device MTL.
arXiv Detail & Related papers (2022-10-26T15:40:24Z)
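The M$^3$ViT entry above hinges on activating only a task-dependent subset of experts so that running a single task never touches the whole model. The module below is a simplified task-conditioned mixture-of-experts layer that illustrates that routing idea only; it is not the paper's ViT architecture or its model-accelerator co-design, and the expansion factor, number of experts, and top-k value are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class TaskConditionedMoE(nn.Module):
    """Simplified MoE feed-forward layer whose routing depends on the task id."""

    def __init__(self, dim, num_experts=8, num_tasks=2, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )
        # One gating vector per task: routing is decided by the task, so only
        # the selected experts need to be executed for that task.
        self.gate = nn.Embedding(num_tasks, num_experts)
        self.top_k = top_k

    def forward(self, x, task_id):
        logits = self.gate(torch.tensor(task_id, device=x.device))
        scores, idx = logits.topk(self.top_k)
        weights = scores.softmax(dim=-1)
        # Only the top-k experts chosen for this task are evaluated.
        return sum(w * self.experts[i](x) for w, i in zip(weights, idx.tolist()))

layer = TaskConditionedMoE(dim=64, num_experts=8, num_tasks=2, top_k=2)
y = layer(torch.randn(4, 64), task_id=1)  # only 2 of the 8 experts run
```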
- Multi-Task Learning as a Bargaining Game [63.49888996291245]
In Multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
- Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss while constraining the update direction to limit conflict with the individual task gradients.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
arXiv Detail & Related papers (2021-10-26T22:03:51Z)
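The CAGrad entry above describes an update that stays close to the average-loss gradient while limiting conflict with individual tasks. Below is a small NumPy/SciPy sketch following the dual formulation published in the CAGrad paper (minimize g_w·g_0 + c·||g_0||·||g_w|| over the simplex, then tilt the average gradient toward g_w); the constant c and the generic SciPy solver are assumptions, and this is not the authors' reference implementation.

```python
import numpy as np
from scipy.optimize import minimize

def cagrad_direction(grads, c=0.5):
    """grads: (K, D) array of per-task gradients on the shared parameters.

    Returns a conflict-averse update direction: close to the average gradient,
    but tilted so that the worst-off task still makes local progress.
    """
    K = grads.shape[0]
    g0 = grads.mean(axis=0)            # average gradient
    GG = grads @ grads.T               # (K, K) Gram matrix
    sqrt_phi = c * np.linalg.norm(g0)

    def dual(w):                       # minimized over the probability simplex
        gw_dot_g0 = w @ (grads @ g0)
        gw_norm = np.sqrt(max(w @ GG @ w, 1e-12))
        return gw_dot_g0 + sqrt_phi * gw_norm

    w0 = np.ones(K) / K
    res = minimize(dual, w0, bounds=[(0.0, 1.0)] * K,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    gw = res.x @ grads
    return g0 + sqrt_phi * gw / (np.linalg.norm(gw) + 1e-12)
```

The returned direction replaces the plain averaged gradient in the optimizer step; with c = 0 it reduces to ordinary gradient averaging.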
- SLAW: Scaled Loss Approximate Weighting for Efficient Multi-Task Learning [0.0]
Multi-task learning (MTL) is a subfield of machine learning with important applications.
The best MTL optimization methods require individually computing the gradient of each task's loss function.
We propose Scaled Loss Approximate Weighting (SLAW), a method for multi-task optimization that matches the performance of the best existing methods while being much more efficient.
arXiv Detail & Related papers (2021-09-16T20:58:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.