On Implicit Bias in Overparameterized Bilevel Optimization
- URL: http://arxiv.org/abs/2212.14032v1
- Date: Wed, 28 Dec 2022 18:57:46 GMT
- Title: On Implicit Bias in Overparameterized Bilevel Optimization
- Authors: Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse
- Abstract summary: Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively.
We investigate the implicit bias of gradient-based algorithms for bilevel optimization.
We show that the inner solutions obtained by warm-start BLO can encode a surprising amount of information about the outer objective.
- Score: 38.11483853830913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many problems in machine learning involve bilevel optimization (BLO),
including hyperparameter optimization, meta-learning, and dataset distillation.
Bilevel problems consist of two nested sub-problems, called the outer and inner
problems, respectively. In practice, often at least one of these sub-problems
is overparameterized. In this case, there are many ways to choose among optima
that achieve equivalent objective values. Inspired by recent studies of the
implicit bias induced by optimization algorithms in single-level optimization,
we investigate the implicit bias of gradient-based algorithms for bilevel
optimization. We delineate two standard BLO methods -- cold-start and
warm-start -- and show that the converged solution or long-run behavior depends
to a large degree on these and other algorithmic choices, such as the
hypergradient approximation. We also show that the inner solutions obtained by
warm-start BLO can encode a surprising amount of information about the outer
objective, even when the outer parameters are low-dimensional. We believe that
implicit bias deserves as central a role in the study of bilevel optimization
as it has attained in the study of single-level neural net optimization.
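
To make the cold-start/warm-start distinction concrete, here is a minimal runnable sketch on a toy problem with an overparameterized inner objective (a whole hyperplane of inner minimizers). The losses, step sizes, and the one-step truncated-unrolling hypergradient below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Toy overparameterized bilevel problem (illustrative, not from the paper):
#   inner: min_w 0.5 * (lam*w1 + w2 - 1)^2   -- every w on a hyperplane is optimal
#   outer: min_lam 0.5 * ||w - (1, 1)||^2    -- depends on lam only through w
ALPHA, ETA, K, T = 0.2, 0.5, 50, 200  # inner lr, outer lr, inner steps, outer steps

def inner_grad(w, lam):
    r = lam * w[0] + w[1] - 1.0
    return r * np.array([lam, 1.0])

def outer_grad_w(w):
    return w - np.array([1.0, 1.0])

def run(warm_start):
    lam, w = 0.5, np.zeros(2)
    for _ in range(T):
        if not warm_start:
            w = np.zeros(2)  # cold start: re-solve the inner problem from scratch
        for _ in range(K):
            w = w - ALPHA * inner_grad(w, lam)
        # Hypergradient approximated by differentiating the last inner step
        # (1-step truncated unrolling), one of the algorithmic choices at play:
        r = lam * w[0] + w[1] - 1.0
        dw_dlam = -ALPHA * (w[0] * np.array([lam, 1.0]) + r * np.array([1.0, 0.0]))
        lam = lam - ETA * outer_grad_w(w) @ dw_dlam
    return lam, w

for ws in (False, True):
    lam, w = run(ws)
    print("warm" if ws else "cold", "lam=%+.4f  w=%s" % (lam, np.round(w, 3)))
```

Both variants drive the inner loss to (near) zero, yet they end at different points on the hyperplane of inner minima and follow different outer trajectories: the warm-started inner solution retains components inherited from earlier outer iterates, which is the history-encoding effect the abstract describes.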
Related papers
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling [96.47086913559289]
Gradient-based algorithms are widely used in bilevel optimization.
We introduce a without-replacement-sampling-based algorithm that achieves a faster convergence rate.
We validate our algorithms over both synthetic and real-world applications.
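
For readers unfamiliar with the sampling scheme, the sketch below contrasts without-replacement (shuffled, epoch-based) sampling with with-replacement sampling on a plain least-squares problem; it illustrates the sampling idea only, not the paper's bilevel algorithm, and all data and step sizes are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(32, 4))          # toy least-squares data (assumed)
b = rng.normal(size=32)
w, lr, n = np.zeros(4), 0.05, 32

def grad_i(w, i):
    # per-sample gradient of 0.5 * (a_i . w - b_i)^2
    return (A[i] @ w - b[i]) * A[i]

for epoch in range(50):
    for i in rng.permutation(n):      # without replacement: each sample exactly once per epoch
        w -= lr * grad_i(w, i)
    # with-replacement sampling would instead draw i = rng.integers(n) each step

print(np.round(w, 3))
```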
arXiv Detail & Related papers (2024-11-07T17:05:31Z)
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as bi-level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution and the outer loss becomes an expected loss over the inner distribution.
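
A minimal sketch of this viewpoint on one-dimensional toy objectives, assuming a Gibbs-style smoothing (the paper's exact construction may differ): the inner loss induces a smooth distribution over inner candidates, and the outer objective becomes an expectation under it.

```python
import numpy as np

tau = 0.1                                # smoothing temperature (assumed)
W = np.linspace(-2.0, 2.0, 401)          # grid of candidate inner solutions

def inner_loss(w, lam):
    return (w - lam) ** 2                # toy inner objective

def outer_loss(w):
    return (w - 1.0) ** 2                # toy outer objective

def expected_outer(lam):
    logits = -inner_loss(W, lam) / tau   # Gibbs weights: p(w) ~ exp(-inner_loss / tau)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return p @ outer_loss(W)             # outer loss as an expectation over the inner distribution

# Crude outer minimization by grid search, just to show the smoothed objective:
lams = np.linspace(-2.0, 2.0, 81)
print("best lam ~", lams[np.argmin([expected_outer(l) for l in lams])])
```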
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
- A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem.
Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration.
Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
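
The sketch below conveys the single-loop flavor on a centralized quadratic toy problem (the paper's method is decentralized and more general; all objectives, step sizes, and names here are assumptions). Each iteration takes one inner step, refreshes an auxiliary vector v with a Hessian-vector product, and forms the hypergradient from the cross-derivative, so no inner sub-loop or matrix inverse is needed.

```python
import numpy as np

H = np.array([[2.0, 0.3], [0.3, 1.0]])   # inner Hessian (strongly convex inner problem)
c = np.array([1.0, -0.5])
w_target = np.array([0.5, 0.5])
alpha, beta, eta = 0.2, 0.2, 0.1
w, v, lam = np.zeros(2), np.zeros(2), 0.0

# inner: g(w, lam) = 0.5 w'Hw - lam * c'w;  outer: f(w) = 0.5 ||w - w_target||^2
for _ in range(500):
    w = w - alpha * (H @ w - lam * c)     # one inner gradient step
    f_grad = w - w_target
    v = v - beta * (H @ v - f_grad)       # matvec 1: v tracks H^{-1} grad_w f
    # hypergradient = grad_lam f - (d/dlam grad_w g)^T v; grad_lam f = 0 and
    # d/dlam grad_w g = -c here, so the second "matvec" reduces to a dot product
    lam = lam - eta * (c @ v)
print("lam=%.3f w=%s" % (lam, np.round(w, 3)))
```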
arXiv Detail & Related papers (2023-11-15T13:29:49Z)
- Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual stochastic bilevel optimization (CSBO) -- a bilevel optimization framework where the lower-level problem minimizes an expectation conditioned on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robust optimization with side information (WDRO-SI).
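
As a toy illustration of the CSBO structure (not the paper's algorithm; the objectives and the closed-form lower-level solution below are assumptions), each sampled context changes the lower-level solution, and the upper level minimizes the expected outer loss over contexts:

```python
import numpy as np

rng = np.random.default_rng(0)
x, lr = 0.0, 0.05
for _ in range(2000):
    c = rng.uniform(0.5, 1.5)          # contextual information for this sample
    y_star = c * x                     # closed-form lower-level solution given (x, c)
    # stochastic hypergradient of E_c[(y*(x, c) - 1)^2] via the chain rule:
    x -= lr * 2.0 * (y_star - 1.0) * c
print("x ~ %.3f (minimizes E_c[(c*x - 1)^2])" % x)
```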
arXiv Detail & Related papers (2023-10-27T23:24:37Z)
- Non-Convex Bilevel Optimization with Time-Varying Objective Functions [57.299128109226025]
We study online bilevel optimization, where the functions can be time-varying and the agent continuously updates its decisions with online data.
Compared to existing algorithms, the proposed method, SOBOW, is computationally efficient and does not need to know previous functions.
We show that SOBOW achieves sublinear bilevel local regret under mild conditions.
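
A toy online bilevel loop in the spirit of this setting (illustrative assumptions throughout, not SOBOW itself): the inner problem drifts over rounds, and the agent takes one inner and one outer step per round using only the current functions, never revisiting past ones.

```python
import numpy as np

T, alpha, eta = 500, 0.5, 0.1
w, lam = 0.0, 0.0
for t in range(T):
    d_t = 0.2 * np.sin(2 * np.pi * t / 100.0)   # slow drift in the inner problem
    target_t = 1.0 + d_t
    w = w - alpha * (w - lam - d_t)             # one step on g_t(w) = 0.5*(w - lam - d_t)^2
    # one outer step on f_t(w) = 0.5*(w - target_t)^2, using dw*/dlam = 1 (exact here)
    lam = lam - eta * (w - target_t)
print("lam=%.3f w=%.3f" % (lam, w))
```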
arXiv Detail & Related papers (2023-08-07T06:27:57Z)
- Efficient Gradient Approximation Method for Constrained Bilevel Optimization [2.0305676256390934]
Bilevel optimization has been developed for large-scale, high-dimensional data.
This paper considers a constrained bilevel problem with convex and non-differentiable approximations.
arXiv Detail & Related papers (2023-02-03T19:34:56Z)
- A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem.
We show that the proposed method converges with lower computational cost and leads to models that generalize better on the test set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z)
- A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima [49.320758794766185]
We propose a new approach that converts the bilevel problem into an equivalent constrained optimization problem, which can then be solved with a primal-dual algorithm.
Such an approach enjoys several advantages: (a) it addresses the multiple-inner-minima challenge; (b) it is fully first-order and efficient, requiring no Jacobian computations.
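
A minimal sketch of a value-function-style constrained reformulation with a gradient descent-ascent (primal-dual) loop on toy quadratics; the paper's precise reformulation and updates may differ, and everything below is an illustrative assumption. The lower-level optimality condition becomes the constraint g(x, w) - min_w' g(x, w') <= 0, enforced through a multiplier mu:

```python
def f(x, w):                 # outer objective (assumed)
    return (w - 1.0) ** 2 + 0.1 * x ** 2

def g(x, w):                 # inner objective; min over w is at w = x with value 0
    return (w - x) ** 2

x, w, mu = 0.0, 2.0, 0.0
lr, lr_mu = 0.05, 0.5
for _ in range(2000):
    # value function v(x) = min_w g(x, w) = 0 here, so the constraint is just g(x, w)
    fx, fw = 0.2 * x, 2.0 * (w - 1.0)
    gx, gw = -2.0 * (w - x), 2.0 * (w - x)
    x, w = x - lr * (fx + mu * gx), w - lr * (fw + mu * gw)  # primal descent on the Lagrangian
    mu = max(0.0, mu + lr_mu * g(x, w))                      # dual ascent on the multiplier
print("x=%.3f w=%.3f mu=%.2f" % (x, w, mu))
```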
arXiv Detail & Related papers (2022-03-01T18:20:01Z)
- Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method (SBiO-BreD) based on Bregman distances.
We also propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) by using a variance-reduction technique.
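
For context on the building block (a generic mirror-descent proximal step, not the SBiO-BreD algorithm itself), here is a minimal sketch of a Bregman distance and the update it induces; the negative-entropy generator and the toy loss are assumptions:

```python
import numpy as np

def bregman_div(u, v):
    # D_phi(u, v) = phi(u) - phi(v) - <grad phi(v), u - v>; with negative-entropy
    # phi and u, v on the probability simplex this reduces to the KL divergence
    return np.sum(u * np.log(u / v))

def mirror_step(x, grad, lr):
    # argmin_u <grad, u> + (1/lr) * D_phi(u, x) over the simplex: exponentiated gradient
    y = x * np.exp(-lr * grad)
    return y / y.sum()

target = np.array([0.7, 0.2, 0.1])
x = np.ones(3) / 3.0
for _ in range(200):
    x = mirror_step(x, 2.0 * (x - target), lr=0.5)   # toy quadratic loss ||x - target||^2
print(np.round(x, 3), "D_phi to target:", round(bregman_div(target, x), 6))
```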
arXiv Detail & Related papers (2021-07-26T16:18:43Z)