Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy

URL: http://arxiv.org/abs/2405.09927v1
Date: Thu, 16 May 2024 09:33:28 GMT
Title: Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy
Authors: Risheng Liu, Zhu Liu, Wei Yao, Shangzhi Zeng, Jin Zhang
Abstract summary: Large-scale nonconvex Bi-Level Optimization (BLO) problems are increasingly applied in machine learning. This work addresses two major challenges for such problems: ensuring computational efficiency and providing theoretical guarantees.
Score: 45.982542530484274
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This work focuses on addressing two major challenges in the context of large-scale nonconvex Bi-Level Optimization (BLO) problems, which are increasingly applied in machine learning due to their ability to model nested structures. These challenges involve ensuring computational efficiency and providing theoretical guarantees. While recent advances in scalable BLO algorithms have primarily relied on lower-level convexity simplification, our work specifically tackles large-scale BLO problems involving nonconvexity in both the upper and lower levels. We simultaneously address computational and theoretical challenges by introducing an innovative single-loop gradient-based algorithm, utilizing the Moreau envelope-based reformulation, and providing non-asymptotic convergence analysis for general nonconvex BLO problems. Notably, our algorithm relies solely on first-order gradient information, enhancing its practicality and efficiency, especially for large-scale BLO learning tasks. We validate our approach's effectiveness through experiments on various synthetic problems, two typical hyper-parameter learning tasks, and a real-world neural architecture search application, collectively demonstrating its superior performance.
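Illustrative note: the abstract refers to a Moreau envelope-based reformulation. As a minimal sketch of the underlying concept only (not the paper's algorithm, and not a bi-level solver), the snippet below approximates the Moreau envelope v_gamma(y) = min over theta of phi(theta) + (theta - y)^2 / (2*gamma) for a scalar function by brute-force grid search, and compares it with the well-known closed form for phi = |.|, which is the Huber function. The function names, the grid bounds, and the choice of phi are assumptions made for illustration.

```python
# Illustrative sketch only: the Moreau envelope of a scalar function.
# This is NOT the algorithm from the paper; names and grid bounds are hypothetical.

def moreau_envelope(phi, y, gamma, lo=-10.0, hi=10.0, steps=20001):
    """Approximate v_gamma(y) = min_theta phi(theta) + (theta - y)^2 / (2*gamma)
    by evaluating the objective on a uniform grid over [lo, hi]."""
    best = float("inf")
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        val = phi(theta) + (theta - y) ** 2 / (2.0 * gamma)
        if val < best:
            best = val
    return best


def huber(y, gamma):
    """Closed-form Moreau envelope of the absolute value with parameter gamma."""
    if abs(y) <= gamma:
        return y * y / (2.0 * gamma)
    return abs(y) - gamma / 2.0


if __name__ == "__main__":
    # Compare the grid approximation with the closed form at a few points.
    for y in (-3.0, -0.5, 0.0, 0.3, 2.0):
        print(y, moreau_envelope(abs, y, 1.0), huber(y, 1.0))
```

The envelope is a smooth lower approximation of phi even when phi itself is nonsmooth, which is the general property that makes envelope-based reformulations attractive for gradient-only methods.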

Related papers

On The Sample Complexity Bounds In Bilevel Reinforcement Learning [36.239015146313136]
Bilevel reinforcement learning (BRL) has emerged as a powerful mathematical framework for studying generative AI alignment. We present the first sample complexity result for BRL, achieving a bound of $\epsilon^{-4}$. This result extends to standard bilevel optimization problems, providing an interesting theoretical contribution with practical implications.
arXiv Detail & Related papers (2025-03-22T04:22:04Z)
qNBO: quasi-Newton Meets Bilevel Optimization [26.0555315825777]
Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. We introduce a general framework to address these computational challenges in a coordinated manner. Specifically, we leverage quasi-Newton algorithms to accelerate the resolution of the lower-level problem while efficiently approximating the inverse Hessian-vector product.
arXiv Detail & Related papers (2025-02-03T05:36:45Z)
A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [66.61399765513383]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints. We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z)
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF [82.73541793388]
We introduce the first principled algorithmic framework for solving bilevel RL problems through the lens of penalty formulation. We provide theoretical studies of the problem landscape and its penalty-based gradient (policy) algorithms. We demonstrate the effectiveness of our algorithms via simulations in the Stackelberg Markov game, RL from human feedback and incentive design.
arXiv Detail & Related papers (2024-02-10T04:54:15Z)
Constrained Bi-Level Optimization: Proximal Lagrangian Value function Approach and Hessian-free Algorithm [8.479947546216131]
We develop a Hessian-free gradient-based algorithm, termed proximal Lagrangian Value function-based Hessian-free Bi-level Algorithm (LV-HBA). LV-HBA is especially well-suited for machine learning applications.
arXiv Detail & Related papers (2024-01-29T13:50:56Z)
Effective Bilevel Optimization via Minimax Reformulation [23.5093932552053]
We propose a reformulation of bilevel optimization as a minimax problem. Under mild conditions, we show these two problems are equivalent. Our method outperforms state-of-the-art bilevel methods while significantly reducing the computational cost.
arXiv Detail & Related papers (2023-05-22T15:41:33Z)
Communication-Efficient Federated Bilevel Optimization with Local and Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc. We prove that FedBiOAcc-Local converges at the same rate for this type of problem. Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z)
Value-Function-based Sequential Minimization for Bi-level Optimization [52.39882976848064]
Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to handle modern learning tasks. There are almost no gradient-based methods able to solve BLO in challenging scenarios, such as BLO with functional constraints and pessimistic BLO. We provide Bi-level Value-Function-based Sequential Minimization (BVFSM) to address the above issues.
arXiv Detail & Related papers (2021-10-11T03:13:39Z)
A Generic Descent Aggregation Framework for Gradient-based Bi-level Optimization [41.894281911990554]
We develop a novel Bi-level Descent Aggregation (BDA) framework for bi-level learning tasks. BDA aggregates hierarchical objectives of both upper level and lower level. We propose a new proof recipe to improve the convergence results of conventional gradient-based bi-level methods.
arXiv Detail & Related papers (2021-02-16T06:58:12Z)
Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond [114.39616146985001]
In the machine learning and computer vision fields, despite their different motivations and mechanisms, many complex problems contain a series of closely related subproblems. In this paper, we first uniformly express these complex learning and vision problems from the perspective of Bi-Level Optimization (BLO). Then we construct a value-function-based single-level reformulation and establish a unified algorithmic framework to understand and formulate mainstream gradient-based BLO methodologies.
arXiv Detail & Related papers (2021-01-27T16:20:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.