Consistency analysis of bilevel data-driven learning in inverse problems
- URL: http://arxiv.org/abs/2007.02677v2
- Date: Thu, 7 Jan 2021 15:37:05 GMT
- Title: Consistency analysis of bilevel data-driven learning in inverse problems
- Authors: Neil K. Chada, Claudia Schillings, Xin T. Tong and Simon Weissmann
- Abstract summary: We consider the adaptive learning of the regularization parameter from data by means of optimization.
We demonstrate how to implement our framework on linear inverse problems.
Online numerical schemes are derived using the gradient descent method.
- Score: 1.0705399532413618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One fundamental problem when solving inverse problems is how to find
regularization parameters. This article considers solving this problem using
data-driven bilevel optimization, i.e. we consider the adaptive learning of the
regularization parameter from data by means of optimization. This approach can
be interpreted as solving an empirical risk minimization problem, and we
analyze its performance in the large data sample size limit for general
nonlinear problems. We demonstrate how to implement our framework on linear
inverse problems, where we further show that the inversion accuracy does not
depend on the ambient space dimension. To reduce the associated computational
cost, online numerical schemes are derived using the stochastic gradient
descent method. We prove convergence of these numerical schemes under suitable
assumptions on the forward problem. Numerical experiments are presented
illustrating the theoretical results and demonstrating the applicability and
efficiency of the proposed approaches for various linear and nonlinear inverse
problems, including Darcy flow, the eikonal equation, and an image denoising
example.
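To make the bilevel idea concrete, here is a minimal sketch (not the authors' implementation; the toy problem and all names are illustrative) that learns a single Tikhonov regularization parameter for a linear inverse problem by stochastic gradient descent on the empirical risk, using implicit differentiation of the lower-level solution:

```python
import numpy as np

def lower_level_solution(A, y, lam):
    """Tikhonov solution x_hat(lam) = argmin_x ||Ax - y||^2 + lam ||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

def sgd_learn_lambda(A, X_true, Y, lam0=1.0, step=0.05, n_epochs=20, rng=None):
    """Online SGD on the upper-level risk E ||x_hat(lam) - x_true||^2.

    Each step uses a single training pair (x_true, y), in the spirit of the
    online schemes described in the abstract (illustrative sketch only).
    """
    rng = np.random.default_rng(rng)
    n = A.shape[1]
    lam = lam0
    for _ in range(n_epochs):
        for i in rng.permutation(Y.shape[1]):
            x_true, y = X_true[:, i], Y[:, i]
            x_hat = lower_level_solution(A, y, lam)
            H = A.T @ A + lam * np.eye(n)
            # Implicit differentiation: d x_hat / d lam = -H^{-1} x_hat.
            dx_dlam = -np.linalg.solve(H, x_hat)
            grad = 2.0 * (x_hat - x_true) @ dx_dlam
            lam = max(lam - step * grad, 1e-8)  # keep lam positive
    return lam

# Toy problem: noisy linear measurements of random ground-truth signals.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
X_true = rng.standard_normal((20, 100))
Y = A @ X_true + 0.5 * rng.standard_normal((30, 100))
lam_learned = sgd_learn_lambda(A, X_true, Y, rng=1)
```

Each pass touches one training pair at a time, which is what keeps the per-iteration cost independent of the total sample size.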
Related papers
- On improving generalization in a class of learning problems with the method of small parameters for weakly-controlled optimal gradient systems [0.0]
We consider a variational problem for a weakly-controlled gradient system, whose control input enters into the system dynamics as a coefficient to a nonlinear term.
Using perturbation theory, we provide results that allow us to solve a sequence of optimization problems.
We also provide an estimate for the rate of convergence for such approximate optimal solutions.
arXiv Detail & Related papers (2024-12-11T20:50:29Z) - Representation and Regression Problems in Neural Networks: Relaxation, Generalization, and Numerics [5.915970073098098]
We address three non-convex optimization problems associated with training shallow neural networks (NNs).
We convexify these problems and, applying a representer theorem, prove the absence of relaxation gaps.
We analyze the impact of key parameters on these bounds and propose optimal choices.
For high-dimensional datasets, we propose a sparsification algorithm that, combined with gradient descent, yields effective solutions.
arXiv Detail & Related papers (2024-12-02T15:40:29Z) - Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - A Guide to Stochastic Optimisation for Large-Scale Inverse Problems [4.926711494319977]
Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data.
Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs.
We focus on the potential and the challenges for optimisation that are unique to variational regularisation for inverse imaging problems.
arXiv Detail & Related papers (2024-06-10T15:02:30Z) - From Inverse Optimization to Feasibility to ERM [11.731853838892487]
We study the contextual inverse optimization setting, which utilizes additional contextual information to better predict parameters.
We experimentally validate our approach on synthetic and real-world problems and demonstrate improved performance compared to existing methods.
arXiv Detail & Related papers (2024-02-27T21:06:42Z) - Stochastic Mirror Descent for Large-Scale Sparse Recovery [13.500750042707407]
We discuss an application of stochastic approximation to statistical estimation of high-dimensional sparse parameters.
We show that the proposed algorithm attains the optimal convergence of the estimation error under weak assumptions on the regressor distribution.
arXiv Detail & Related papers (2022-10-23T23:23:23Z) - Extension of Dynamic Mode Decomposition for dynamic systems with incomplete information based on t-model of optimal prediction [69.81996031777717]
The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data.
The application of this approach becomes problematic if the available data is incomplete because some smaller-scale dimensions are either missing or unmeasured.
We consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with the gradient-based optimization method.
arXiv Detail & Related papers (2022-02-23T11:23:59Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are empirically central to preventing overfitting.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Learning Fast Approximations of Sparse Nonlinear Regression [50.00693981886832]
In this work, we bridge the gap by introducing the Nonlinear Learned Iterative Shrinkage Thresholding Algorithm (NLISTA).
Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-26T11:31:08Z) - Follow the bisector: a simple method for multi-objective optimization [65.83318707752385]
We consider optimization problems, where multiple differentiable losses have to be minimized.
The presented method computes a descent direction at every iteration that guarantees an equal relative decrease of the objective functions.
arXiv Detail & Related papers (2020-07-14T09:50:33Z)
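As a sketch of the equal-relative-decrease idea above (not necessarily the paper's exact construction; the function name and toy objectives are illustrative): one can ask for a direction d such that the first-order rate (grad f_i . d) / f_i is the same negative constant for every objective, and take the minimum-norm such d via a pseudoinverse.

```python
import numpy as np

def equal_relative_descent(grads, values):
    """Direction d with (g_i . d) / f_i equal (and negative) for every
    objective, so all losses shrink at the same relative rate to first order.
    Minimum-norm solution of H d = -1, where row i of H is g_i / f_i.
    """
    H = np.stack([g / v for g, v in zip(grads, values)])  # rows: grad of log f_i
    d = -np.linalg.pinv(H) @ np.ones(len(grads))
    return d

# Two toy quadratics f1(x) = ||x - a||^2, f2(x) = ||x - b||^2.
a, b = np.array([1.0, 0.0]), np.array([0.0, 2.0])
x = np.array([3.0, 3.0])
f = [np.sum((x - a) ** 2), np.sum((x - b) ** 2)]
g = [2 * (x - a), 2 * (x - b)]
d = equal_relative_descent(g, f)
rel = [gi @ d / fi for gi, fi in zip(g, f)]  # equal negative relative rates
```

With two independent gradients in two dimensions the system H d = -1 is solved exactly, so both relative rates come out equal.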
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.