Implicit Bilevel Optimization: Differentiating through Bilevel
Optimization Programming
- URL: http://arxiv.org/abs/2302.14473v1
- Date: Tue, 28 Feb 2023 10:32:17 GMT
- Title: Implicit Bilevel Optimization: Differentiating through Bilevel
Optimization Programming
- Authors: Francesco Alesiani
- Abstract summary: Bilevel Optimization Programming is used to model complex and conflicting interactions between agents.
BiGrad is applicable to both continuous and combinatorial Bilevel optimization problems.
Experiments show that BiGrad successfully extends existing single-level approaches to Bilevel Programming.
- Score: 7.310043452300735
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bilevel Optimization Programming is used to model complex and conflicting
interactions between agents, for example in Robust AI or Privacy-preserving AI.
Integrating bilevel mathematical programming within deep learning is thus an
essential objective for the Machine Learning community. Previously proposed
approaches only consider single-level programming. In this paper, we extend
existing single-level optimization programming approaches and thus propose
Differentiating through Bilevel Optimization Programming (BiGrad) for
end-to-end learning of models that use Bilevel Programming as a layer. BiGrad
has wide applicability and can be used in modern machine learning frameworks.
BiGrad is applicable to both continuous and combinatorial Bilevel optimization
problems. We describe a class of gradient estimators for the combinatorial case
that reduces the computational complexity requirements; for the continuous case,
the gradient computation takes advantage of the push-back approach (i.e. the
vector-Jacobian product) for an efficient implementation. Experiments show that
BiGrad successfully extends existing single-level approaches to Bilevel
Programming.
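To make the continuous-case push-back (vector-Jacobian product) computation concrete, here is a minimal implicit-differentiation sketch in JAX. The toy quadratic lower-level problem and the helper names (inner_obj, outer_obj, solve_inner, hypergradient) are illustrative assumptions, not BiGrad's actual API; the block only demonstrates the implicit-function-theorem hypergradient with a Hessian-free linear solve and a VJP.

```python
# A minimal sketch of differentiating through a continuous bilevel program
# via the implicit function theorem and vector-Jacobian products.
# All objectives and helper names are illustrative assumptions.
import jax
import jax.numpy as jnp

def inner_obj(x, y):
    # Lower-level objective g(x, y); a simple quadratic with a unique minimizer.
    return 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)

def outer_obj(x, y):
    # Upper-level objective f(x, y).
    return jnp.sum((y - 1.0) ** 2) + 0.01 * jnp.sum(x ** 2)

def solve_inner(x, steps=200, lr=0.1):
    # Approximate y*(x) = argmin_y g(x, y) by gradient descent.
    y = jnp.zeros_like(x)
    grad_y = jax.grad(inner_obj, argnums=1)
    for _ in range(steps):
        y = y - lr * grad_y(x, y)
    return y

def hypergradient(x):
    # Implicit-function-theorem gradient:
    #   df/dx = grad_x f - grad_xy g [grad_yy g]^{-1} grad_y f,
    # with the inverse applied via a linear solve and the mixed second
    # derivative applied as a vector-Jacobian product (push-back).
    y_star = solve_inner(x)
    gx_f, gy_f = jax.grad(outer_obj, argnums=(0, 1))(x, y_star)
    gy_g = lambda x_, y_: jax.grad(inner_obj, argnums=1)(x_, y_)
    # Solve [grad_yy g] v = grad_y f with conjugate gradients (Hessian-free).
    hvp = lambda v: jax.jvp(lambda y_: gy_g(x, y_), (y_star,), (v,))[1]
    v, _ = jax.scipy.sparse.linalg.cg(hvp, gy_f)
    # VJP of grad_y g with respect to x, evaluated at v.
    _, vjp_x = jax.vjp(lambda x_: gy_g(x_, y_star), x)
    (cross,) = vjp_x(v)
    return gx_f - cross

x0 = jnp.array([0.5, -0.3, 2.0])
print(hypergradient(x0))
```

The conjugate-gradient solve avoids forming the lower-level Hessian explicitly, and the final VJP applies the mixed second derivative without materializing any Jacobian, which is what makes this style of computation efficient inside a deep learning framework.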
Related papers
- Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization [71.35604981129838]
Traditional gradient-based bi-level optimization algorithms are ill-suited to meet the demands of large-scale applications.
We introduce $(\text{FG})^2\text{U}$, which achieves an unbiased approximation of the meta gradient for bi-level optimization.
$(\text{FG})^2\text{U}$ is inherently designed to support parallel computing, enabling it to effectively leverage large-scale distributed computing systems.
arXiv Detail & Related papers (2024-06-20T08:21:52Z) - A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [66.61399765513383]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints.
We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z) - Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual stochastic bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation conditioned on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI).
arXiv Detail & Related papers (2023-10-27T23:24:37Z) - Betty: An Automatic Differentiation Library for Multilevel Optimization [24.256787655657277]
Betty is a high-level software library for gradient-based multilevel optimization.
We develop an automatic differentiation procedure based on a novel interpretation of multilevel optimization as a dataflow graph.
We empirically demonstrate that Betty can be used as a high-level programming interface for an array of multilevel optimization programs.
arXiv Detail & Related papers (2022-07-05T14:01:15Z) - Bilevel Optimization for Machine Learning: Algorithm Design and
Convergence Analysis [12.680169619392695]
This thesis provides a comprehensive convergence rate analysis for bilevel optimization algorithms.
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms (a toy ITD sketch appears after this list).
We then develop accelerated bilevel algorithms, for which we provide a sharper convergence analysis under relaxed assumptions.
arXiv Detail & Related papers (2021-07-31T22:05:47Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method (SBiO-BreD) based on the Bregman distance.
We also propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) that uses a variance-reduction technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - A Value-Function-based Interior-point Method for Non-convex Bi-level
Optimization [38.75417864443519]
The bi-level optimization model is able to capture a wide range of complex learning tasks of practical interest.
We propose a new Bi-level Value-Function-based Interior-point scheme, in which we penalize the regularized value function of the lower-level problem into the upper-level objective.
arXiv Detail & Related papers (2021-06-15T09:10:40Z) - Improved Bilevel Model: Fast and Optimal Algorithm with Theoretical
Guarantee [110.16183719936629]
We propose an improved bilevel model which converges faster and to better solutions than the current formulation.
The empirical results show that our model outperforms the current bilevel model by a large margin.
arXiv Detail & Related papers (2020-09-01T20:52:57Z) - A Generic First-Order Algorithmic Framework for Bi-Level Programming
Beyond Lower-Level Singleton [49.23948907229656]
Bi-level Descent Aggregation is a flexible and modularized algorithmic framework for generic bi-level optimization.
We derive a new methodology to prove the convergence of BDA without the LLS condition.
Our investigations also demonstrate that BDA is indeed compatible with a variety of particular first-order computation modules.
arXiv Detail & Related papers (2020-06-07T05:18:50Z)
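To make the contrast between implicit (AID-style) and unrolled (ITD-style) hypergradients concrete, here is a minimal iterative-differentiation sketch in JAX, reusing the toy objectives from the earlier example. Reverse mode must backpropagate through every inner iteration, which is the memory cost that methods such as $(\text{FG})^2\text{U}$ aim to reduce; the objectives and step counts are illustrative assumptions, not taken from any of the papers above.

```python
# A minimal sketch of iterative (unrolled) differentiation: the inner
# gradient-descent trajectory stays in the computation graph and reverse-mode
# AD backpropagates through all of its steps. Toy objectives only.
import jax
import jax.numpy as jnp

def inner_obj(x, y):
    # Lower-level objective g(x, y).
    return 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)

def outer_obj(x, y):
    # Upper-level objective f(x, y).
    return jnp.sum((y - 1.0) ** 2) + 0.01 * jnp.sum(x ** 2)

def unrolled_outer(x, steps=200, lr=0.1):
    # Run the inner solver without cutting the graph, so the whole
    # trajectory is differentiated with respect to x.
    y = jnp.zeros_like(x)
    grad_y = jax.grad(inner_obj, argnums=1)
    for _ in range(steps):
        y = y - lr * grad_y(x, y)
    return outer_obj(x, y)

x0 = jnp.array([0.5, -0.3, 2.0])
# ITD hypergradient: differentiate through the unrolled inner loop.
itd_grad = jax.grad(unrolled_outer)(x0)
print(itd_grad)
```

For long inner trajectories, truncating the unroll or switching to the implicit solve shown earlier trades bias for memory, which is the design choice the unrolling-based and AID/ITD papers above analyze.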
This list is automatically generated from the titles and abstracts of the papers on this site.