Value Function Based Difference-of-Convex Algorithm for Bilevel
Hyperparameter Selection Problems
- URL: http://arxiv.org/abs/2206.05976v1
- Date: Mon, 13 Jun 2022 08:51:10 GMT
- Title: Value Function Based Difference-of-Convex Algorithm for Bilevel
Hyperparameter Selection Problems
- Authors: Lucy Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang
- Abstract summary: We develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA).
Our experiments confirm our theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.
- Score: 5.940592509070767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-based optimization methods for hyperparameter tuning guarantee
theoretical convergence to stationary solutions when, for fixed upper-level
variable values, the lower level of the bilevel program is strongly convex
(LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs
arising from tuning hyperparameters in many machine learning algorithms. In
this work, we develop a sequentially convergent Value Function based
Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this
algorithm achieves stationary solutions without LLSC and LLS assumptions for
bilevel programs from a broad class of hyperparameter tuning applications. Our
extensive experiments confirm our theoretical findings and show that the
proposed VF-iDCA yields superior performance when applied to tune
hyperparameters.
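To make the value-function idea concrete, here is a minimal sketch (not the authors' code or exact VF-iDCA updates) of a DC-style loop for tuning a single sparsity hyperparameter. The synthetic data, the fixed penalty weight rho, the proximal term, and the use of cvxpy for the convex subproblems are illustrative assumptions; the actual algorithm updates the penalty adaptively and allows inexact subproblem solves.

```python
# Toy value-function-based DC loop for hyperparameter selection (sketch only).
# Lower level is a least-squares problem constrained to an l1-ball of radius r;
# the upper level measures validation loss. Requires numpy and cvxpy.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n_tr, n_val, p = 40, 40, 20
X_tr = rng.standard_normal((n_tr, p))
X_val = rng.standard_normal((n_val, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y_tr = X_tr @ beta_true + 0.1 * rng.standard_normal(n_tr)
y_val = X_val @ beta_true + 0.1 * rng.standard_normal(n_val)

def lower_level(r):
    """Lower-level problem in constrained form:
        min_b 0.5*||X_tr b - y_tr||^2   s.t.  ||b||_1 <= r.
    Returns the optimal value v(r) and a subgradient of the (convex)
    value function v at r, i.e. minus the constraint's dual multiplier."""
    b = cp.Variable(p)
    con = [cp.norm1(b) <= r]
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(X_tr @ b - y_tr)), con)
    prob.solve()
    return prob.value, -float(con[0].dual_value)

r_k, rho = 1.0, 10.0   # initial radius and penalty weight (illustrative choices)
for k in range(20):
    v_k, s_k = lower_level(r_k)              # value-function oracle at r_k
    b = cp.Variable(p)
    r = cp.Variable(nonneg=True)
    upper = 0.5 * cp.sum_squares(X_val @ b - y_val)      # validation loss
    # Penalize the gap between the lower-level objective and the affine
    # minorant of v built from the subgradient s_k (the DC linearization);
    # this majorizes f_tr(b) - v(r), so each step is a convex subproblem.
    gap = 0.5 * cp.sum_squares(X_tr @ b - y_tr) - (v_k + s_k * (r - r_k))
    prox = 0.5 * cp.square(r - r_k)                      # keeps steps stable
    sub = cp.Problem(cp.Minimize(upper + rho * gap + prox),
                     [cp.norm1(b) <= r])
    sub.solve()
    r_k = float(r.value)

print("selected l1-ball radius:", round(r_k, 3))
```

The constrained (l1-ball) parameterization is what makes the lower-level value function v(r) convex in the hyperparameter, so the penalized single-level problem has a difference-of-convex structure that a DCA-style linearization can exploit, without assuming lower-level strong convexity or smoothness.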
Related papers
- Hyperparameters in Continual Learning: a Reality Check [53.30082523545212]
It has been common practice to train a CL algorithm on a CL scenario constructed with a benchmark dataset.
In this paper, we contend that this evaluation protocol is not only impractical but also incapable of effectively assessing the CL capability of a CL algorithm.
arXiv Detail & Related papers (2024-03-14T03:13:01Z) - Moreau Envelope Based Difference-of-weakly-Convex Reformulation and
Algorithm for Bilevel Programs [10.875233844414465]
We present an innovative single-level difference-of-weakly-convex reformulation based on the Moreau envelope of the lower-level problem.
We further develop a sequentially convergent Inexact Proximal Difference of Weakly Convex Algorithm (iP-DwCA)
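For reference, the Moreau envelope mentioned above is the standard smoothing of a function; a sketch of the definition follows (the smoothing parameter $\gamma > 0$ and the notation are illustrative, and how the paper applies it to the lower-level problem is not spelled out here):

```latex
% Moreau envelope of f with parameter gamma > 0
e_{\gamma} f(x) \;=\; \min_{y} \Big\{ f(y) + \tfrac{1}{2\gamma}\,\|y - x\|^{2} \Big\}
```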
arXiv Detail & Related papers (2023-06-29T07:57:47Z) - Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed
Smoothness Conditions [9.518010235273785]
We present a novel fully single-loop, Hessian-inversion-free algorithmic framework for bilevel optimization.
We show that, with a slight modification, our approach can handle a more general multi-objective robust bilevel optimization problem.
arXiv Detail & Related papers (2023-06-21T07:32:29Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Fast Adaptive Federated Bilevel Optimization [14.579475552088692]
We propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems.
AdaFBiO uses unified adaptive matrices to flexibly incorporate various adaptive learning rates for updating variables in both the upper-level (UL) and lower-level (LL) problems.
We provide a convergence analysis framework for our AdaFBiO algorithm and prove that it needs a sample complexity of $\tilde{O}(\epsilon^{-3})$ and a communication complexity of $\tilde{O}(\epsilon^{-2})$ to obtain an $\epsilon$-stationary point.
arXiv Detail & Related papers (2022-11-02T13:55:47Z) - A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization
Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem.
We show that the proposed method converges with lower computation and leads to models that generalize better on the testing set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z) - A Constrained Optimization Approach to Bilevel Optimization with
Multiple Inner Minima [49.320758794766185]
We propose a new approach, which converts the bilevel problem to an equivalent constrained optimization problem that can then be solved by a primal-dual algorithm.
Such an approach enjoys several advantages: (a) it addresses the multiple inner minima challenge; (b) it is fully first-order and requires no Jacobian computations.
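One common way to write such an equivalent single-level constrained reformulation is via the lower-level value function; the notation below is illustrative, and whether it matches that paper's exact construction is an assumption:

```latex
% Value-function reformulation of a bilevel program with upper-level
% variable x and lower-level variable y:
\min_{x \in X,\; y \in Y} \; F(x, y)
\quad \text{s.t.} \quad
f(x, y) \;\le\; v(x) \;:=\; \min_{z \in Y} f(x, z)
```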
arXiv Detail & Related papers (2022-03-01T18:20:01Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman distances.
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in iterations caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel sample-efficient stochastic gradient estimator named stocBiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)