Averaged Method of Multipliers for Bi-Level Optimization without
Lower-Level Strong Convexity
- URL: http://arxiv.org/abs/2302.03407v2
- Date: Fri, 30 Jun 2023 07:53:37 GMT
- Title: Averaged Method of Multipliers for Bi-Level Optimization without
Lower-Level Strong Convexity
- Authors: Risheng Liu, Yaohua Liu, Wei Yao, Shangzhi Zeng and Jin Zhang
- Abstract summary: We propose a single-loop Bi-level Averaged Method of Multipliers (sl-BAMM) for Bi-Level Optimization (BLO).
We provide a non-asymptotic convergence analysis of sl-BAMM towards KKT stationary points; the comparative advantage of our analysis lies in the absence of the strong gradient boundedness assumption.
- Score: 43.092883358935545
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Gradient methods have become mainstream techniques for Bi-Level Optimization
(BLO) in learning fields. The validity of existing works relies heavily on either
a restrictive Lower-Level Strong Convexity (LLSC) condition, on solving a
series of approximation subproblems with high accuracy, or on both. In this work,
by averaging the upper- and lower-level objectives, we propose a single-loop
Bi-level Averaged Method of Multipliers (sl-BAMM) for BLO that is simple yet
efficient for large-scale BLO and dispenses with the restrictive LLSC condition. We
further provide a non-asymptotic convergence analysis of sl-BAMM towards KKT
stationary points, and the comparative advantage of our analysis lies in the
absence of the strong gradient boundedness assumption, which is always required by
other approaches. Thus our theory safely captures a wider variety of applications in deep
learning, especially those where the upper-level objective is quadratic w.r.t. the
lower-level variable. Experimental results demonstrate the superiority of our
method.
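To make the setting concrete, BLO minimizes an upper-level objective F(x, y) subject to y solving a lower-level problem min_y f(x, y). One standard route to the KKT-type analysis mentioned above is to replace the lower level by its stationarity condition ∇_y f(x, y) = 0, attach a multiplier to that constraint, and update all variables in a single loop. The sketch below illustrates this idea on a toy quadratic instance; the toy objectives, step sizes, and the plain augmented-Lagrangian update are illustrative assumptions and are not the exact sl-BAMM iteration, whose objective averaging, update rules, and step-size schedule are given in the paper.

```python
# Illustrative sketch only (not the authors' exact sl-BAMM update):
# a single-loop "method of multipliers"-style scheme for the bilevel problem
#     min_{x,y} F(x, y)   s.t.   y in argmin_{y'} f(x, y'),
# obtained by replacing the lower level with its stationarity condition
# h(x, y) := grad_y f(x, y) = 0 and taking one primal descent step and
# one dual ascent step on the augmented Lagrangian per iteration.

a = 2.0  # parameter of the toy upper-level objective (illustrative)

def grad_F(x, y):
    """Gradients of the upper-level objective F(x, y) = 0.5*(y - a)^2 + 0.5*x^2."""
    return x, (y - a)

def h(x, y):
    """Stationarity map grad_y f(x, y) of the lower-level f(x, y) = 0.5*(y - x)^2."""
    return y - x

alpha, rho, c = 0.05, 0.1, 1.0   # primal step, dual step, augmentation weight (illustrative)
x, y, z = 0.0, 0.0, 0.0          # z is the multiplier for the constraint h(x, y) = 0

for _ in range(2000):
    gFx, gFy = grad_F(x, y)
    v = h(x, y)
    # Gradients of the augmented Lagrangian
    #   L_c(x, y, z) = F(x, y) + z*h(x, y) + 0.5*c*h(x, y)^2,
    # using dh/dx = -1 and dh/dy = +1 for this toy problem.
    gx = gFx - (z + c * v)
    gy = gFy + (z + c * v)
    x, y = x - alpha * gx, y - alpha * gy   # single primal descent step
    z = z + rho * v                         # single dual ascent step

print(f"x={x:.3f}, y={y:.3f}, z={z:.3f}  (analytic bilevel solution: x = y = {a/2})")
```

On this toy instance the iterates approach the analytic bilevel solution x = y = a/2, the KKT point of the stationarity-constrained reformulation; the guarantees without LLSC and without gradient boundedness claimed above are established for the actual sl-BAMM updates in the paper.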
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - Constrained Bi-Level Optimization: Proximal Lagrangian Value function
Approach and Hessian-free Algorithm [8.479947546216131]
We develop a Hessian-free gradient-based algorithm, termed the proximal Lagrangian Value function-based Hessian-free Bi-level Algorithm (LV-HBA).
LV-HBA is especially well-suited for machine learning applications.
arXiv Detail & Related papers (2024-01-29T13:50:56Z) - Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual stochastic bilevel optimization (CSBO) -- a bilevel optimization framework in which the lower-level problem minimizes an expectation conditioned on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI).
arXiv Detail & Related papers (2023-10-27T23:24:37Z) - A Generalized Alternating Method for Bilevel Learning under the
Polyak-Łojasiewicz Condition [63.66516306205932]
Bilevel optimization has recently regained interest owing to its applications in emerging machine learning fields.
Recent results have shown that simple alternating gradient-based iterations can match the convergence rate of single-level gradient descent when the lower-level objective is strongly convex.
arXiv Detail & Related papers (2023-06-04T17:54:11Z) - Towards Extremely Fast Bilevel Optimization with Self-governed
Convergence Guarantees [42.514612465664605]
We propose a single-level formulation to uniformly understand existing explicit and implicit gradient-based BLO methods (GBLOs).
A striking feature of our convergence result is that, compared to the original unaccelerated GBLO versions, the fast BAGDC admits a unified non-asymptotic convergence theory towards stationarity.
arXiv Detail & Related papers (2022-05-20T09:46:10Z) - Value-Function-based Sequential Minimization for Bi-level Optimization [52.39882976848064]
Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to handle modern learning tasks.
There are almost no gradient-based methods able to solve BLO in challenging scenarios, such as BLO with functional constraints and pessimistic BLO.
We provide Bi-level Value-Function-based Sequential Minimization (BVFSM) to address the above issues.
arXiv Detail & Related papers (2021-10-11T03:13:39Z) - A Generic Descent Aggregation Framework for Gradient-based Bi-level
Optimization [41.894281911990554]
We develop a novel Bi-level Descent Aggregation (BDA) framework for bi-level learning tasks.
BDA aggregates the hierarchical objectives of both the upper and lower levels.
We propose a new proof recipe to improve the convergence results of conventional gradient-based bi-level methods.
arXiv Detail & Related papers (2021-02-16T06:58:12Z) - A Generic First-Order Algorithmic Framework for Bi-Level Programming
Beyond Lower-Level Singleton [49.23948907229656]
Bi-level Descent Aggregation is a flexible and modularized algorithmic framework for generic bi-level optimization.
We derive a new methodology to prove the convergence of BDA without the LLS condition.
Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules.
arXiv Detail & Related papers (2020-06-07T05:18:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.