Related papers: Inexact Derivative-Free Optimization for Bilevel Learning

Inexact Derivative-Free Optimization for Bilevel Learning

URL: http://arxiv.org/abs/2006.12674v2
Date: Tue, 8 Dec 2020 23:10:15 GMT
Title: Inexact Derivative-Free Optimization for Bilevel Learning
Authors: Matthias J. Ehrhardt, Lindon Roberts
Abstract summary: Variational regularization techniques are dominant in the field of mathematical imaging. A by now common strategy to resolve this issue is to learn these parameters from data. It is common when solving the upper-level problem to assume access to exact solutions of the lower-level problem, which is practically infeasible. We propose to solve these problems using inexact derivative-free optimization algorithms which never require exact lower-level problem solutions.
Score: 0.27074235008521236
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Variational regularization techniques are dominant in the field of mathematical imaging. A drawback of these techniques is that they are dependent on a number of parameters which have to be set by the user. A by now common strategy to resolve this issue is to learn these parameters from data. While mathematically appealing this strategy leads to a nested optimization problem (known as bilevel optimization) which is computationally very difficult to handle. It is common when solving the upper-level problem to assume access to exact solutions of the lower-level problem, which is practically infeasible. In this work we propose to solve these problems using inexact derivative-free optimization algorithms which never require exact lower-level problem solutions, but instead assume access to approximate solutions with controllable accuracy, which is achievable in practice. We prove global convergence and a worstcase complexity bound for our approach. We test our proposed framework on ROFdenoising and learning MRI sampling patterns. Dynamically adjusting the lower-level accuracy yields learned parameters with similar reconstruction quality as highaccuracy evaluations but with dramatic reductions in computational work (up to 100 times faster in some cases).

Related papers

Bilevel Learning via Inexact Stochastic Gradient Descent [5.312803257246881]
Bilevel optimization is a central tool in machine learning for high-dimensional hyper tuning.<n>We advance the theory of inexact bilevel optimization.<n>We prove convergence and establish rates under decaying accuracy and step size schedules.
arXiv Detail & Related papers (2025-11-10T07:02:52Z)
A stochastic smoothing framework for nonconvex-nonconcave min-sum-max problems with applications to Wasserstein distributionally robust optimization [15.21599021645008]
This study introduces a novel training smoothing framework based on the mboxlog-sumexp function. We develop an iterative smoothing algorithm that addresses computational difficulties and guarantees convergence to a Clarke/directional stationary point. Our approach outperforms or is competitive with state-of-the-art methods in solving the news problem, deep learning regression and adversarially robust deep learning.
arXiv Detail & Related papers (2025-02-24T19:36:47Z)
An Adaptively Inexact Method for Bilevel Learning Using Primal-Dual Style Differentiation [8.107030227028815]
We consider a bilevel learning framework for learning linear operators. In this framework, the learnable parameters are optimized via a loss function that also depends on ther of a convex optimization problem.
arXiv Detail & Related papers (2024-12-09T12:26:26Z)
Learning to Optimize for Mixed-Integer Non-linear Programming [20.469394148261838]
Mixed-integer non-NLP programs (MINLPs) arise in various domains, such as energy systems and transportation, but are notoriously difficult to solve. Recent advances in machine learning have led to remarkable successes in optimization, area broadly known as learning to optimize. We propose two differentiable correction layers that generate integer outputs while preserving gradient.
arXiv Detail & Related papers (2024-10-14T20:14:39Z)
Contextual Stochastic Bilevel Optimization [50.36775806399861]
We introduce contextual bilevel optimization (CSBO) -- a bilevel optimization framework with the lower-level problem minimizing an expectation on some contextual information and the upper-level variable. It is important for applications such as meta-learning, personalized learning, end-to-end learning, and Wasserstein distributionally robustly optimization with side information (WDRO-SI)
arXiv Detail & Related papers (2023-10-27T23:24:37Z)
Analyzing Inexact Hypergradients for Bilevel Learning [0.09669369645900441]
We introduce a unified framework for computing hypergradients that generalizes existing methods based on the implicit function theorem and automatic differentiation/backpropagation. Our numerical results show that, surprisingly, for efficient bilevel optimization, the choice of hypergradient algorithm is at least as important as the choice of lower-level solver.
arXiv Detail & Related papers (2023-01-11T23:54:27Z)
A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization Method [0.0]
We propose a gradient-based bilevel method for solving the hyperparameter optimization problem. We show that the proposed method converges with lower computation and leads to models that generalize better on the testing set.
arXiv Detail & Related papers (2022-08-25T14:25:16Z)
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms [17.12280360174073]
Bilevel optimization is a problem of minimizing a value function which involves the arg-minimum of another function. We introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time. We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm in our framework, has $O(frac1T)$ convergence rate, and that it achieves linear convergence under Polyak-Lojasciewicz assumption.
arXiv Detail & Related papers (2022-01-31T18:17:25Z)
A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states. We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z)
Enhanced Bilevel Optimization via Bregman Distance [104.96004056928474]
We propose a bilevel optimization method based on Bregman Bregman functions. We also propose an accelerated version of SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique.
arXiv Detail & Related papers (2021-07-26T16:18:43Z)
BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications. We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z)
Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem. Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources. In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
Slowly Varying Regression under Sparsity [5.22980614912553]
We present the framework of slowly hyper regression under sparsity, allowing regression models to exhibit slow and sparse variations. We suggest a procedure that reformulates as a binary convex algorithm. We show that the resulting model outperforms competing formulations in comparable times across various datasets.
arXiv Detail & Related papers (2021-02-22T04:51:44Z)
Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning ofproblem coefficients with a optimization prob-lem that uses the predicted coefficients. We show how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function. We propose a novel divide and algorithm to tackle optimization problems without this restriction and predict itscoefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.