Inexact Derivative-Free Optimization for Bilevel Learning
- URL: http://arxiv.org/abs/2006.12674v2
- Date: Tue, 8 Dec 2020 23:10:15 GMT
- Title: Inexact Derivative-Free Optimization for Bilevel Learning
- Authors: Matthias J. Ehrhardt, Lindon Roberts
- Abstract summary: Variational regularization techniques are dominant in the field of mathematical imaging.
A by now common strategy to resolve this issue is to learn these parameters from data.
It is common when solving the upper-level problem to assume access to exact solutions of the lower-level problem, which is practically infeasible.
We propose to solve these problems using inexact derivative-free optimization algorithms which never require exact lower-level problem solutions.
- Score: 0.27074235008521236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational regularization techniques are dominant in the field of
mathematical imaging. A drawback of these techniques is that they are dependent
on a number of parameters which have to be set by the user. A by now common
strategy to resolve this issue is to learn these parameters from data. While
mathematically appealing this strategy leads to a nested optimization problem
(known as bilevel optimization) which is computationally very difficult to
handle. It is common when solving the upper-level problem to assume access to
exact solutions of the lower-level problem, which is practically infeasible. In
this work we propose to solve these problems using inexact derivative-free
optimization algorithms which never require exact lower-level problem
solutions, but instead assume access to approximate solutions with controllable
accuracy, which is achievable in practice. We prove global convergence and a
worstcase complexity bound for our approach. We test our proposed framework on
ROFdenoising and learning MRI sampling patterns. Dynamically adjusting the
lower-level accuracy yields learned parameters with similar reconstruction
quality as highaccuracy evaluations but with dramatic reductions in
computational work (up to 100 times faster in some cases).
Related papers
- Learning to Optimize for Mixed-Integer Non-linear Programming [20.469394148261838]
Mixed-integer non-NLP programs (MINLPs) arise in various domains, such as energy systems and transportation, but are notoriously difficult to solve.
Recent advances in machine learning have led to remarkable successes in optimization, area broadly known as learning to optimize.
We propose two differentiable correction layers that generate integer outputs while preserving gradient.
arXiv Detail & Related papers (2024-10-14T20:14:39Z) - Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation on some contextual information and the upper-level variable.
It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robustly optimization with side information (WDRO-SI)
arXiv Detail & Related papers (2023-10-27T23:24:37Z) - Analyzing Inexact Hypergradients for Bilevel Learning [0.09669369645900441]
We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation.
Our numerical results show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
arXiv Detail & Related papers (2023-01-11T23:54:27Z) - A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization
Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem.
We show that the proposed method converges with lower computation and leads to models that generalize better on the testing set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z) - A framework for bilevel optimization that enables stochastic and global
variance reduction algorithms [17.12280360174073]
Bilevel optimization is a problem of minimizing a value function which involves the arg-minimum of another function.
We introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time.
We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm in our framework, has $O(frac1T)$ convergence rate, and that it achieves linear convergence under Polyak-Lojasciewicz assumption.
arXiv Detail & Related papers (2022-01-31T18:17:25Z) - A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z) - Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions.
We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z) - BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Slowly Varying Regression under Sparsity [5.22980614912553]
We present the framework of slowly hyper regression under sparsity, allowing regression models to exhibit slow and sparse variations.
We suggest a procedure that reformulates as a binary convex algorithm.
We show that the resulting model outperforms competing formulations in comparable times across various datasets.
arXiv Detail & Related papers (2021-02-22T04:51:44Z) - Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning ofproblem coefficients with a optimization prob-lem that uses the predicted coefficients.
We show how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function.
We propose a novel divide and algorithm to tackle optimization problems without this restriction and predict itscoefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.