Independent Component Alignment for Multi-Task Learning
- URL: http://arxiv.org/abs/2305.19000v1
- Date: Tue, 30 May 2023 12:56:36 GMT
- Title: Independent Component Alignment for Multi-Task Learning
- Authors: Dmitry Senushkin, Nikolay Patakin, Arseny Kuznetsov, Anton Konushin
- Abstract summary: In a multi-task learning (MTL) setting, a single model is trained to tackle a diverse set of tasks jointly.
We propose using the condition number of a linear system of gradients as a stability criterion for MTL optimization.
We present Aligned-MTL, a novel MTL optimization approach based on the proposed criterion.
- Score: 2.5234156040689237
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In a multi-task learning (MTL) setting, a single model is trained to tackle a
diverse set of tasks jointly. Despite rapid progress in the field, MTL remains
challenging due to optimization issues such as conflicting and dominating
gradients. In this work, we propose using the condition number of a linear system
of gradients as a stability criterion for MTL optimization. We theoretically
demonstrate that the condition number reflects the aforementioned optimization
issues. Accordingly, we present Aligned-MTL, a novel MTL optimization approach
based on the proposed criterion, that eliminates instability in the training
process by aligning the orthogonal components of the linear system of
gradients. While many recent MTL approaches guarantee convergence to a minimum,
task trade-offs cannot be specified in advance. In contrast, Aligned-MTL
provably converges to an optimal point with pre-defined task-specific weights,
which provides more control over the optimization result. Through experiments,
we show that the proposed approach consistently improves performance on a
diverse set of MTL benchmarks, including semantic and instance segmentation,
depth estimation, surface normal estimation, and reinforcement learning. The
source code is publicly available at https://github.com/SamsungLabs/MTL .
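The criterion and the alignment step described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation (see the linked repository for that); the function names, the SVD-based rescaling, and the toy gradient matrix are illustrative assumptions. The per-task gradients are stacked as columns of a matrix G, the condition number kappa(G) = sigma_max / sigma_min serves as the stability criterion, and the alignment rescales the orthogonal (principal) components of G so that kappa becomes 1 before combining them with pre-defined task weights.

    import numpy as np

    def condition_number(G, eps=1e-12):
        # G: (num_params, num_tasks) matrix whose columns are per-task gradients.
        # kappa(G) = sigma_max / sigma_min over the non-zero singular values;
        # large values indicate conflicting or dominating gradients, kappa = 1 is ideal.
        s = np.linalg.svd(G, compute_uv=False)
        s = s[s > eps]
        return s.max() / s.min()

    def align(G, eps=1e-12):
        # Illustrative alignment: rescale the principal components of G so that
        # every non-zero singular value equals the smallest one, which drives
        # the condition number of the gradient system to 1.
        U, s, Vt = np.linalg.svd(G, full_matrices=False)
        keep = s > eps
        return s[keep].min() * U[:, keep] @ Vt[keep]

    # Toy usage: 3 tasks sharing 5 parameters, uniform pre-defined task weights.
    rng = np.random.default_rng(0)
    G = rng.normal(size=(5, 3))
    w = np.ones(3) / 3
    update = align(G) @ w                # shared-parameter update direction
    print(condition_number(G))           # typically > 1 for random gradients
    print(condition_number(align(G)))    # ~1.0 after alignment

In an actual MTL training loop, G would be assembled from per-task backward passes over the shared parameters, and the aligned combination would replace a naive sum of task gradients; the authors' implementation at https://github.com/SamsungLabs/MTL may differ in its details.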
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, which enables better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- SubZero: Random Subspace Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning [66.27334633749734]
As language models grow in size, memory demands for backpropagation increase.
Zeroth-order (ZO) optimization methods offer a memory-efficient alternative.
We show that SubZero enhances fine-tuning and achieves faster results compared to standard ZO approaches.
arXiv Detail & Related papers (2024-10-11T17:01:43Z)
- MTLComb: multi-task learning combining regression and classification tasks for joint feature selection [3.708475728683911]
Multi-task learning (MTL) is a learning paradigm that enables the simultaneous training of multiple communicating algorithms.
We propose a provable loss weighting scheme that analytically determines the optimal weights for balancing regression and classification tasks.
We introduce MTLComb, an MTL algorithm and software package encompassing optimization procedures, training protocols, and hyperparameter estimation procedures.
arXiv Detail & Related papers (2024-05-16T08:07:25Z)
- Fair Resource Allocation in Multi-Task Learning [12.776767874217663]
Multi-task learning (MTL) can leverage the shared knowledge across tasks, resulting in improved data efficiency and generalization performance.
A major challenge in MTL lies in the presence of conflicting gradients, which can hinder the fair optimization of some tasks.
Inspired by fair resource allocation in communication networks, we propose FairGrad, a novel MTL optimization method.
arXiv Detail & Related papers (2024-02-23T22:46:14Z)
- Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual stochastic bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI).
arXiv Detail & Related papers (2023-10-27T23:24:37Z)
- Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs [65.42104819071444]
Multitask learning (MTL) leverages task-relatedness to enhance performance.
We employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices.
We propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least squares support vector machines (LSSVMs).
arXiv Detail & Related papers (2023-08-30T14:28:26Z)
- Optimizing Evaluation Metrics for Multi-Task Learning via the Alternating Direction Method of Multipliers [12.227732834969336]
Multi-task learning (MTL) aims to improve the generalization performance of multiple tasks by exploiting the shared factors among them.
Most existing MTL methods try to minimize either the misclassification error for classification or the mean squared error for regression.
We propose a method to directly optimize the evaluation metrics for a large family of MTL problems.
arXiv Detail & Related papers (2022-10-12T05:46:00Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- Multi-Task Learning as a Bargaining Game [63.49888996291245]
In multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks.
Since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than the corresponding single-task counterparts.
We propose viewing the gradients combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update.
arXiv Detail & Related papers (2022-02-02T13:21:53Z)
- SLAW: Scaled Loss Approximate Weighting for Efficient Multi-Task Learning [0.0]
Multi-task learning (MTL) is a subfield of machine learning with important applications.
The best MTL optimization methods require individually computing the gradient of each task's loss function.
We propose Scaled Loss Approximate Weighting (SLAW), a method for multi-task optimization that matches the performance of the best existing methods while being much more efficient.
arXiv Detail & Related papers (2021-09-16T20:58:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.