On Optimal Regularization Parameters via Bilevel Learning
- URL: http://arxiv.org/abs/2305.18394v5
- Date: Mon, 22 Jan 2024 10:44:50 GMT
- Title: On Optimal Regularization Parameters via Bilevel Learning
- Authors: Matthias J. Ehrhardt, Silvia Gazzola and Sebastian J. Scott
(Department of Mathematical Sciences, University of Bath, Bath, UK)
- Abstract summary: We provide a new condition that better characterizes positivity of optimal regularization parameters than the existing theory.
Numerical results verify and explore this new condition for both small and high-dimensional problems.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variational regularization is commonly used to solve linear inverse problems,
and involves augmenting a data fidelity term with a regularizer. The regularizer is
used to promote a priori information and is weighted by a regularization
parameter. Selection of an appropriate regularization parameter is critical,
with various choices leading to very different reconstructions. Classical
strategies used to determine a suitable parameter value include the discrepancy
principle and the L-curve criterion, and in recent years a supervised machine
learning approach called bilevel learning has been employed. Bilevel learning
is a powerful framework to determine optimal parameters and involves solving a
nested optimization problem. While previous strategies enjoy various
theoretical results, the well-posedness of bilevel learning in this setting is
still an open question. In particular, a necessary property is positivity of
the determined regularization parameter. In this work, we provide a new
condition that better characterizes positivity of optimal regularization
parameters than the existing theory. Numerical results verify and explore this
new condition for both small and high-dimensional problems.
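To make the nested structure concrete, here is a minimal sketch (not the authors' code) of bilevel learning of a scalar regularization parameter. It assumes a Tikhonov regularizer R(x) = ||x||^2, so the lower-level problem has a closed-form solution, and a grid search stands in for the hypergradient methods typically used in practice.

```python
# Minimal bilevel-learning sketch for one regularization parameter alpha,
# assuming Tikhonov regularization so the lower level solves in closed form.
import numpy as np

def lower_level(A, y, alpha):
    """Solve the variational problem min_x ||Ax - y||^2 + alpha * ||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

def upper_level(alpha, A, data):
    """Mean squared reconstruction error over (noisy data, ground truth) pairs."""
    return np.mean([np.sum((lower_level(A, y, alpha) - x_true) ** 2)
                    for y, x_true in data])

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
data = []
for _ in range(5):
    x_true = rng.standard_normal(10)
    data.append((A @ x_true + 0.1 * rng.standard_normal(30), x_true))

# Crude grid search over alpha >= 0; the paper's contribution is a condition
# characterizing when the optimal alpha is strictly positive.
alphas = np.logspace(-8, 2, 200)
alpha_opt = min(alphas, key=lambda a: upper_level(a, A, data))
print(f"optimal alpha on the grid: {alpha_opt:.3e}")
```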
Related papers
- Learning Joint Models of Prediction and Optimization [56.04498536842065]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
arXiv Detail & Related papers (2024-09-07T19:52:14Z)
- A naive aggregation algorithm for improving generalization in a class of learning problems [0.0]
We present a naive aggregation algorithm for a typical learning problem in the prediction-with-expert-advice setting.
In particular, we consider a class of learning problems involving point estimation for modeling high-dimensional nonlinear functions.
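The summary does not spell out the algorithm; the sketch below is the classical exponentially weighted aggregation for prediction with expert advice, shown only to illustrate the setting.

```python
# Generic exponentially weighted aggregation for expert advice
# (a standard baseline; the paper's specific algorithm may differ).
import numpy as np

def aggregate(expert_preds, y_true, eta=0.5):
    """Online aggregation: expert_preds has shape (T, K); returns per-round
    aggregated predictions using squared-error losses."""
    T, K = expert_preds.shape
    weights = np.ones(K) / K
    preds = np.empty(T)
    for t in range(T):
        preds[t] = weights @ expert_preds[t]          # weighted average
        losses = (expert_preds[t] - y_true[t]) ** 2   # per-expert loss
        weights *= np.exp(-eta * losses)              # exponential update
        weights /= weights.sum()
    return preds

rng = np.random.default_rng(7)
y = np.sin(np.linspace(0, 3, 50))
experts = np.stack([y + 0.1 * rng.standard_normal(50),   # good expert
                    rng.standard_normal(50)], axis=1)    # bad expert
print(np.mean((aggregate(experts, y) - y) ** 2))         # tracks the good expert
```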
arXiv Detail & Related papers (2024-09-06T15:34:17Z)
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
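A minimal sketch of baseline-corrected inverse propensity scoring for off-policy evaluation; the paper derives the variance-optimal baseline in closed form, while this toy only compares b = 0 with a naive mean-reward baseline.

```python
# Baseline-corrected IPS: subtracting a constant b and adding it back keeps
# the estimator unbiased (since E[w] = 1) and can reduce its variance.
import numpy as np

rng = np.random.default_rng(1)
K, n = 5, 100_000
mu = np.full(K, 1.0 / K)                  # logging policy (uniform)
pi = np.array([0.6, 0.1, 0.1, 0.1, 0.1])  # target policy
r_mean = np.linspace(0.2, 1.0, K)         # per-action expected reward

actions = rng.choice(K, size=n, p=mu)
rewards = r_mean[actions] + 0.1 * rng.standard_normal(n)
w = pi[actions] / mu[actions]             # importance weights, E[w] = 1

def ips_with_baseline(b):
    # Unbiased for any constant b because E[w] = 1.
    return np.mean(w * (rewards - b)) + b

print("true value      :", pi @ r_mean)
print("IPS (b = 0)     :", ips_with_baseline(0.0))
print("IPS (b = r_bar) :", ips_with_baseline(rewards.mean()))
```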
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
- Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
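A generic formulation of the idea, sketched below: penalize the distance between the model's outputs and a desired prior function on a set of context inputs. The paper's probabilistic construction is more specific than this.

```python
# Function-space penalty sketch: regularize predictions, not weights.
import numpy as np

def objective(predict, params, X, y, X_ctx, f_prior, lam=1.0):
    fit = np.mean((predict(params, X) - y) ** 2)             # data fidelity
    fs_penalty = np.mean((predict(params, X_ctx) - f_prior(X_ctx)) ** 2)
    return fit + lam * fs_penalty   # distance to the prior in function space

predict = lambda w, X: X @ w                 # linear stand-in for a network
f_prior = lambda X: np.zeros(len(X))         # e.g. a zero-mean prior function
rng = np.random.default_rng(8)
X, w = rng.standard_normal((20, 3)), rng.standard_normal(3)
print(objective(predict, w, X, X @ w, rng.standard_normal((50, 3)), f_prior))
```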
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
- Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization [59.386153202037086]
The Predict-Then-Optimize framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving.
This approach can be inefficient and requires handcrafted, problem-specific rules for backpropagation through the optimization step.
This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by predictive models.
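The sketch below contrasts the two pipelines on a toy top-k selection problem; the helper names and the linear models are illustrative assumptions, not the paper's architecture.

```python
# Predict-then-optimize vs. learning the solution map directly ("by proxy").
import numpy as np

def solve(c, k=2):
    """Toy optimizer: pick the k items with the largest (predicted) value."""
    sol = np.zeros_like(c)
    sol[np.argsort(c)[-k:]] = 1.0
    return sol

rng = np.random.default_rng(2)
W_true = rng.standard_normal((4, 6))
feats = rng.standard_normal((500, 4))
costs = feats @ W_true                         # unknown parameters c(z)

# (a) Predict-then-optimize: regress costs from features, then solve.
W_hat = np.linalg.lstsq(feats, costs, rcond=None)[0]
sols_pto = np.array([solve(c) for c in feats @ W_hat])

# (b) Learn solutions directly: regress precomputed optimal solutions from
# features, skipping the optimizer (and backpropagation through it).
targets = np.array([solve(c) for c in costs])
W_sol = np.linalg.lstsq(feats, targets, rcond=None)[0]
sols_proxy = feats @ W_sol                     # relaxed solutions; round or project as needed
```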
arXiv Detail & Related papers (2023-11-22T01:32:06Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
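A fixed-sample illustration of the underlying principle for a Bernoulli parameter; the paper builds anytime-valid confidence sequences, which this sketch does not attempt.

```python
# Likelihood-ratio confidence set: keep every parameter whose likelihood is
# within a fixed ratio of the maximum likelihood.
import numpy as np

def lr_confidence_set(xs, grid=np.linspace(1e-3, 1 - 1e-3, 999)):
    n, s = len(xs), sum(xs)
    loglik = s * np.log(grid) + (n - s) * np.log(1 - grid)
    # 0.5 * chi^2_{1, 0.95} = 1.92: Wilks' asymptotic calibration.
    return grid[loglik >= loglik.max() - 1.92]

rng = np.random.default_rng(3)
xs = rng.random(200) < 0.3                  # 200 coin flips with p = 0.3
region = lr_confidence_set(xs.astype(float))
print(f"95% LR set approx. [{region.min():.3f}, {region.max():.3f}]")
```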
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization [10.009748368458409]
We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity.
Our method enables fully differentiable approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep learning.
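A minimal sketch of the Hadamard (elementwise product) overparametrization on a sparse regression toy: writing w = u * v and penalizing ||u||^2 + ||v||^2 reproduces an L1 penalty on w at the optimum, so plain gradient descent applies.

```python
# Smooth sparse regularization via w = u * v with L2 penalties on u and v.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((50, 20))
w_true = np.zeros(20); w_true[:3] = [2.0, -1.5, 1.0]   # sparse ground truth
y = A @ w_true + 0.01 * rng.standard_normal(50)

lam, lr = 0.1, 1e-3
u, v = np.ones(20), np.ones(20)
for _ in range(20_000):
    resid = A @ (u * v) - y
    grad_w = A.T @ resid / len(y)
    # Gradients of 0.5/n * ||A(u*v) - y||^2 + 0.5 * lam * (||u||^2 + ||v||^2).
    u, v = u - lr * (grad_w * v + lam * u), v - lr * (grad_w * u + lam * v)

print(np.round(u * v, 2))   # near-sparse: (|u|^2 + |v|^2)/2 >= |u*v| elementwise
```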
arXiv Detail & Related papers (2023-07-07T13:06:12Z)
- A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence [15.807079236265714]
We introduce a novel framework for policy optimization based on mirror descent.
We obtain the first result that guarantees linear convergence for a policy-gradient-based method involving general parameterization.
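A toy illustration on a single-state problem with exact action values: with the KL Bregman divergence, the mirror descent step is multiplicative. The paper's contribution concerns general (e.g. neural) parameterizations, which this tabular sketch does not cover.

```python
# Tabular policy mirror descent: pi <- pi * exp(eta * Q), renormalized.
import numpy as np

Q = np.array([1.0, 0.5, 0.2])      # exact action values
pi = np.ones(3) / 3                # initial uniform policy
eta = 0.5
for t in range(50):
    pi = pi * np.exp(eta * Q)      # mirror (exponentiated-gradient) step
    pi /= pi.sum()
print(np.round(pi, 3))             # concentrates on the best action
```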
arXiv Detail & Related papers (2023-01-30T18:21:48Z)
- Learning Regularization Parameters of Inverse Problems via Deep Neural Networks [0.0]
We consider a supervised learning approach, where a network is trained to approximate the mapping from observation data to regularization parameters.
We show that a wide variety of regularization functionals, forward models, and noise models may be considered.
The network-obtained regularization parameters can be computed more efficiently and may even lead to more accurate solutions.
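A minimal sketch of the supervised recipe, with a linear model standing in for the deep network; the features and the oracle labelling below are illustrative assumptions.

```python
# Learn a map from observation data to a regularization parameter.
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((30, 10))

def best_alpha(y, x_true, alphas=np.logspace(-6, 2, 100)):
    """Oracle label: grid-search the alpha giving the best reconstruction."""
    recon = lambda a: np.linalg.solve(A.T @ A + a * np.eye(10), A.T @ y)
    return min(alphas, key=lambda a: np.sum((recon(a) - x_true) ** 2))

# Training data: noisy observations labelled with their oracle parameters.
feats, labels = [], []
for _ in range(200):
    x = rng.standard_normal(10)
    y = A @ x + rng.uniform(0.01, 0.5) * rng.standard_normal(30)
    feats.append([np.linalg.norm(y), np.linalg.norm(A.T @ y)])  # crude features
    labels.append(np.log(best_alpha(y, x)))                     # learn log(alpha)

coef = np.linalg.lstsq(np.c_[feats, np.ones(200)], labels, rcond=None)[0]
# At test time: alpha = exp([features, 1] @ coef), no grid search needed.
```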
arXiv Detail & Related papers (2021-04-14T02:38:38Z)
- Muddling Labels for Regularization, a novel approach to generalization [0.0]
Generalization is a central problem in Machine Learning.
This paper introduces a novel approach to achieve generalization without any data splitting.
It is based on a new risk measure which directly quantifies a model's tendency to overfit.
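A speculative sketch of one plausible reading: measure a model's tendency to overfit by how well it fits randomly permuted ("muddled") labels, with no data splitting. The paper's actual risk measure may differ.

```python
# Heuristic risk: training fit plus a penalty for fitting permuted labels.
import numpy as np

def ridge_fit_err(X, y, lam):
    """Training MSE of ridge regression with parameter lam."""
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return np.mean((X @ w - y) ** 2)

def muddled_risk(X, y, lam, rng, n_perm=10):
    fit = ridge_fit_err(X, y, lam)
    # Low error on permuted labels means the model can memorize pure noise.
    noise_fit = np.mean([ridge_fit_err(X, rng.permutation(y), lam)
                         for _ in range(n_perm)])
    return fit + (np.var(y) - noise_fit)   # reward models that cannot fit noise

rng = np.random.default_rng(6)
X = rng.standard_normal((40, 30))
y = X[:, 0] + 0.5 * rng.standard_normal(40)
for lam in [1e-4, 1e-1, 1e1, 1e3]:
    print(lam, round(muddled_risk(X, y, lam, rng), 3))
```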
arXiv Detail & Related papers (2021-02-17T14:02:30Z)
- Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.