Improved Bilevel Model: Fast and Optimal Algorithm with Theoretical
Guarantee
- URL: http://arxiv.org/abs/2009.00690v1
- Date: Tue, 1 Sep 2020 20:52:57 GMT
- Title: Improved Bilevel Model: Fast and Optimal Algorithm with Theoretical
Guarantee
- Authors: Junyi Li, Bin Gu, Heng Huang
- Abstract summary: We propose an improved bilevel model which converges faster and better compared to the current formulation.
The empirical results show that our model outperforms the current bilevel model with a great margin.
- Score: 110.16183719936629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the hierarchical structure of many machine learning problems, bilevel
programming is becoming more and more important recently, however, the
complicated correlation between the inner and outer problem makes it extremely
challenging to solve. Although several intuitive algorithms based on the
automatic differentiation have been proposed and obtained success in some
applications, not much attention has been paid to finding the optimal
formulation of the bilevel model. Whether there exists a better formulation is
still an open problem. In this paper, we propose an improved bilevel model
which converges faster and better compared to the current formulation. We
provide theoretical guarantee and evaluation results over two tasks: Data
Hyper-Cleaning and Hyper Representation Learning. The empirical results show
that our model outperforms the current bilevel model with a great margin.
\emph{This is a concurrent work with \citet{liu2020generic} and we submitted to
ICML 2020. Now we put it on the arxiv for record.}
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement sampling based algorithm which achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
arXiv Detail & Related papers (2024-11-07T17:05:31Z) - Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robustly optimization with side information (WDRO-SI)
arXiv Detail & Related papers (2023-10-27T23:24:37Z) - Effective Bilevel Optimization via Minimax Reformulation [23.5093932552053]
We propose a reformulation of bilevel optimization as a minimax problem.
Under mild conditions, we show these two problems are equivalent.
Our method outperforms state-of-the-art bilevel methods while significantly reducing the computational cost.
arXiv Detail & Related papers (2023-05-22T15:41:33Z) - Implicit Bilevel Optimization: Differentiating through Bilevel
Optimization Programming [7.310043452300735]
Bilevel Optimization Programming is used to model complex and conflicting interactions between agents.
BiGrad is applicable to both continuous and Bilevel optimization problems.
Experiments show that the BiGrad successfully extends existing single-level approaches to Bilevel Programming.
arXiv Detail & Related papers (2023-02-28T10:32:17Z) - Local Stochastic Bilevel Optimization with Momentum-Based Variance
Reduction [104.41634756395545]
We study Federated Bilevel Optimization problems. Specifically, we first propose the FedBiO, a deterministic gradient-based algorithm.
We show FedBiO has complexity of $O(epsilon-1.5)$.
Our algorithms show superior performances compared to other baselines in numerical experiments.
arXiv Detail & Related papers (2022-05-03T16:40:22Z) - A Bi-Level Framework for Learning to Solve Combinatorial Optimization on
Graphs [91.07247251502564]
We propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph.
Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity.
arXiv Detail & Related papers (2021-06-09T09:18:18Z) - A Gradient Method for Multilevel Optimization [8.80899367147235]
We have developed a gradient-based algorithm for multilevel $n$ levels based on Franceschi et al.'s idea.
As far as we know, this is one of the first algorithms with some theoretical guarantee for multilevel optimization.
arXiv Detail & Related papers (2021-05-28T16:22:10Z) - Generalization Guarantees for Neural Architecture Search with
Train-Validation Split [48.265305046655996]
This paper explores the statistical aspects of such problems with train-validation splits.
We show that refined properties of the validation loss such as risk and hyper-gradients are indicative of those of the true test loss.
We also highlight rigorous connections between NAS, multiple kernel learning, and low-rank matrix learning.
arXiv Detail & Related papers (2021-04-29T06:11:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.