Towards Gradient-based Bilevel Optimization with Non-convex Followers
and Beyond
- URL: http://arxiv.org/abs/2110.00455v1
- Date: Fri, 1 Oct 2021 14:41:17 GMT
- Title: Towards Gradient-based Bilevel Optimization with Non-convex Followers
and Beyond
- Authors: Risheng Liu, Yaohua Liu, Shangzhi Zeng, Jin Zhang
- Abstract summary: Bi-Level Optimization (BLO) has received extensive attention from both the learning and vision communities.
We propose a new algorithmic framework, named Initialization Auxiliary and Pessimistic Trajectory Truncated Gradient Method (IAPTT-GM).
In particular, by introducing an auxiliary as initialization to guide the optimization dynamics and designing a pessimistic trajectory truncation operation, we construct a reliable approximate version of the original BLO in the absence of the LLC hypothesis.
- Score: 37.89147034158937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, Bi-Level Optimization (BLO) techniques have received
extensive attention from both the learning and vision communities. A variety of
BLO models in complex and practical tasks are of non-convex follower structure
by nature (a.k.a., without Lower-Level Convexity, LLC for short). However, this
challenging class of BLOs lacks developments on both efficient solution
strategies and solid theoretical guarantees. In this work, we propose a new
algorithmic framework, named Initialization Auxiliary and Pessimistic
Trajectory Truncated Gradient Method (IAPTT-GM), to partially address the above
issues. In particular, by introducing an auxiliary as initialization to guide
the optimization dynamics and designing a pessimistic trajectory truncation
operation, we construct a reliable approximate version of the original BLO in
the absence of LLC hypothesis. Our theoretical investigations establish the
convergence of solutions returned by IAPTT-GM towards those of the original BLO
without LLC. As an additional bonus, we also theoretically justify the quality
of our IAPTT-GM embedded with Nesterov's accelerated dynamics under LLC. The
experimental results confirm both the convergence of our algorithm without LLC,
and the theoretical findings under LLC.
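The abstract's two ingredients, an auxiliary variable used as the lower-level initialization and a pessimistic truncation of the unrolled lower-level trajectory, can be pictured with a minimal sketch. The toy objectives F and f, the step sizes, and the iteration counts below are assumptions made for illustration; this is not the authors' implementation of IAPTT-GM, only a schematic of the idea built on PyTorch's autograd.
```python
import torch

def F(x, y):   # toy upper-level objective (illustrative assumption)
    return 0.5 * ((y - 1.0) ** 2).sum() + 0.05 * (x ** 2).sum()

def f(x, y):   # toy lower-level objective; the scheme does not rely on LLC
    return 0.5 * ((y - x) ** 2).sum()

x = torch.zeros(2, requires_grad=True)   # upper-level variable
z = torch.zeros(2, requires_grad=True)   # initialization auxiliary for the lower level
outer_opt = torch.optim.SGD([x, z], lr=0.1)

T, alpha = 20, 0.5                       # lower-level iteration count and step size
for _ in range(100):
    y, trajectory = z, []
    for _ in range(T):                   # unrolled lower-level gradient descent, y_0 = z
        g = torch.autograd.grad(f(x, y), y, create_graph=True)[0]
        y = y - alpha * g
        trajectory.append(y)
    # pessimistic trajectory truncation: keep the iterate with the largest upper-level value
    k = max(range(T), key=lambda t: F(x, trajectory[t]).item())
    loss = F(x, trajectory[k])           # differentiate through the truncated trajectory
    outer_opt.zero_grad()
    loss.backward()                      # hypergradients flow to both x and z
    outer_opt.step()
```
In this sketch the hypergradient reaches both the upper-level variable x and the initialization auxiliary z, and the truncation point is chosen pessimistically as the trajectory iterate with the worst upper-level value.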
Related papers
- Moreau Envelope for Nonconvex Bi-Level Optimization: A Single-loop and Hessian-free Solution Strategy [45.982542530484274]
Large-scale nonconvex Bi-Level Optimization (BLO) problems are increasingly applied in machine learning.
Key challenges include ensuring computational efficiency and providing theoretical guarantees.
arXiv Detail & Related papers (2024-05-16T09:33:28Z)
- Communication-Efficient Federated Bilevel Optimization with Local and Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc.
We prove that FedBiOAcc-Local converges at the same rate for this type of problem.
Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z)
- Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity [43.092883358935545]
We propose a single-loop Bi-level Averaged Method of Multipliers (sl-BAMM) for Bi-Level Optimization (BLO).
We provide a non-asymptotic convergence analysis of sl-BAMM towards KKT stationary points, and the comparative advantage of our analysis lies in the absence of the strong gradient boundedness assumption.
arXiv Detail & Related papers (2023-02-07T11:29:05Z)
- ES-Based Jacobian Enables Faster Bilevel Optimization [53.675623215542515]
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
Existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations.
We propose a novel BO algorithm, which adopts an Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO.
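A rough way to picture the ES idea (hedged; the toy objectives, sample count, and smoothing radius below are assumptions rather than the paper's setup) is a Gaussian-smoothing estimator of the response Jacobian dy*/dx built from forward evaluations of the lower-level solver only, so no second-order derivatives are needed.
```python
import numpy as np

def lower_solution(x, steps=200, lr=0.1):
    """Approximate y*(x) = argmin_y 0.5*||y - x||^2 by plain gradient descent."""
    y = np.zeros_like(x)
    for _ in range(steps):
        y -= lr * (y - x)                 # gradient of the toy lower-level objective
    return y

def es_response_jacobian(x, n=64, sigma=1e-2):
    """ES / Gaussian-smoothing estimate of dy*/dx using only zeroth-order queries."""
    base = lower_solution(x)
    J = np.zeros((base.size, x.size))
    for _ in range(n):
        u = np.random.randn(*x.shape)
        J += np.outer(lower_solution(x + sigma * u) - base, u)
    return J / (n * sigma)

x = np.array([0.3, -0.7])
y_star = lower_solution(x)
J = es_response_jacobian(x)               # close to the identity for this toy problem
grad_F_y = y_star - 1.0                   # dF/dy for the toy choice F = 0.5*||y - 1||^2
hypergrad = J.T @ grad_F_y                # response-Jacobian part of the hypergradient
print(hypergrad)
```
If the upper-level objective also depends directly on x, its partial derivative with respect to x would be added to this estimate.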
arXiv Detail & Related papers (2021-10-13T19:36:50Z)
- Value-Function-based Sequential Minimization for Bi-level Optimization [52.39882976848064]
Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to handle modern learning tasks.
There are almost no gradient-based methods able to solve BLO in challenging scenarios, such as BLO with functional constraints and pessimistic BLO.
We provide Bi-level Value-Function-based Sequential Minimization (BVFSM) to address the above issues.
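For context, value-function-based approaches of this kind build on the classical reformulation of the bilevel problem via the lower-level value function; a standard, schematic statement (not quoted from the paper) is $\min_{x,y} F(x,y)$ subject to $f(x,y) \le \varphi(x)$, where $\varphi(x) := \min_{y'} f(x,y')$, and methods in this family then approximate $\varphi$ sequentially, e.g. through regularized subproblems, instead of differentiating through the lower-level solution.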
arXiv Detail & Related papers (2021-10-11T03:13:39Z)
- A Generic Descent Aggregation Framework for Gradient-based Bi-level Optimization [41.894281911990554]
We develop a novel Bi-level Descent Aggregation (BDA) framework for bi-level learning tasks.
BDA aggregates hierarchical objectives of both upper level and lower level.
We propose a new proof recipe to improve the convergence results of conventional gradient-based bi-level methods.
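Concretely, the aggregation can be read as driving the lower-level iterate along a convex combination of both levels' descent directions; a schematic (hedged) form of such an update is $y_{k+1} = y_k - s_k\,(\mu\,\nabla_y F(x, y_k) + (1-\mu)\,\nabla_y f(x, y_k))$ with aggregation weight $\mu \in (0,1)$ and step size $s_k$, so the lower-level iterations remain informed by the upper-level objective.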
arXiv Detail & Related papers (2021-02-16T06:58:12Z)
- Reparameterized Variational Divergence Minimization for Stable Imitation [57.06909373038396]
We study the extent to which variations in the choice of probabilistic divergence may yield more performant ILO algorithms.
We contribute a reparameterization trick for adversarial imitation learning to alleviate the challenges of the promising $f$-divergence minimization framework.
Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks.
arXiv Detail & Related papers (2020-06-18T19:04:09Z)
- A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton [49.23948907229656]
Bi-level Descent Aggregation is a flexible and modularized algorithmic framework for generic bi-level optimization.
We derive a new methodology to prove the convergence of BDA without the LLS condition.
Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules.
arXiv Detail & Related papers (2020-06-07T05:18:50Z)