Related papers: PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization

PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization

URL: http://arxiv.org/abs/2412.10961v2
Date: Tue, 17 Dec 2024 04:25:55 GMT
Title: PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Authors: Mingjing Xu, Peizhong Ju, Jia Liu, Haibo Yang,
Abstract summary: Multi-objective optimization (MOO) lies at the core of many machine learning (ML) applications.<n>We propose Periodic Multi-Grad Descent (PSMGD) to accelerate MOO.<n>PSMGD can provide comparable or superior performance state-of-the-art algorithms.
Score: 17.131747385975892
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-objective optimization (MOO) lies at the core of many machine learning (ML) applications that involve multiple, potentially conflicting objectives (e.g., multi-task learning, multi-objective reinforcement learning, among many others). Despite the long history of MOO, recent years have witnessed a surge in interest within the ML community in the development of gradient manipulation algorithms for MOO, thanks to the availability of gradient information in many ML problems. However, existing gradient manipulation methods for MOO often suffer from long training times, primarily due to the need for computing dynamic weights by solving an additional optimization problem to determine a common descent direction that can decrease all objectives simultaneously. To address this challenge, we propose a new and efficient algorithm called Periodic Stochastic Multi-Gradient Descent (PSMGD) to accelerate MOO. PSMGD is motivated by the key observation that dynamic weights across objectives exhibit small changes under minor updates over short intervals during the optimization process. Consequently, our PSMGD algorithm is designed to periodically compute these dynamic weights and utilizes them repeatedly, thereby effectively reducing the computational overload. Theoretically, we prove that PSMGD can achieve state-of-the-art convergence rates for strongly-convex, general convex, and non-convex functions. Additionally, we introduce a new computational complexity measure, termed backpropagation complexity, and demonstrate that PSMGD could achieve an objective-independent backpropagation complexity. Through extensive experiments, we verify that PSMGD can provide comparable or superior performance to state-of-the-art MOO algorithms while significantly reducing training time.

Related papers

Reinforcing Question Answering Agents with Minimalist Policy Gradient Optimization [80.09112808413133]
Mujica is a planner that decomposes questions into acyclic graph of subquestions and a worker that resolves questions via retrieval and reasoning.<n>MyGO is a novel reinforcement learning method that replaces traditional policy updates with gradient Likelihood Maximum Estimation.<n> Empirical results across multiple datasets demonstrate the effectiveness of MujicaMyGO in enhancing multi-hop QA performance.
arXiv Detail & Related papers (2025-05-20T18:33:03Z)
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as Bi-Level optimizations (BLO) We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
Common pitfalls to avoid while using multiobjective optimization in machine learning [1.2499537119440245]
There has been an increasing interest in exploring the application of multiobjective optimization (MOO) in machine learning (ML) Despite its potential, there is a noticeable lack of satisfactory literature that could serve as an entry-level guide for ML practitioners who want to use MOO. We critically review previous studies, particularly those involving MOO in deep learning (using Physics-Informed Neural Networks (PINNs) as a guiding example) and identify misconceptions that highlight the need for a better grasp of MOO principles in ML.
arXiv Detail & Related papers (2024-05-02T17:12:25Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs) MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Structured Pruning of Neural Networks for Constraints Learning [5.689013857168641]
We show the effectiveness of pruning, one of these techniques, when applied to ANNs prior to their integration into MIPs. We conduct experiments using feed-forward neural networks with multiple layers to construct adversarial examples. Our results demonstrate that pruning offers remarkable reductions in solution times without hindering the quality of the final decision.
arXiv Detail & Related papers (2023-07-14T16:36:49Z)
Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance [47.42067405054353]
Multi-objective learning (MOL) problems often arise in emerging machine learning problems. One of the critical challenges in MOL is the potential conflict among different objectives during the iterative optimization process. Recent works have developed various dynamic weighting algorithms for MOL such as MGDA and its variants.
arXiv Detail & Related papers (2023-05-31T17:31:56Z)
Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm. We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
ConCrete MAP: Learning a Probabilistic Relaxation of Discrete Variables for Soft Estimation with Low Complexity [9.62543698736491]
ConCrete MAP Detection (CMD) is an iterative detection algorithm for large inverse linear problems. We show CMD to feature a promising performance complexity trade-off compared to SotA. Notably, we demonstrate CMD's soft outputs to be reliable for decoders.
arXiv Detail & Related papers (2021-02-25T09:54:25Z)
Gradient Monitored Reinforcement Learning [0.0]
We focus on the enhancement of training and evaluation performance in reinforcement learning algorithms. We propose an approach to steer the learning in the weight parameters of a neural network based on the dynamic development and feedback from the training process itself.
arXiv Detail & Related papers (2020-05-25T13:45:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.