Related papers: Consistency analysis of bilevel data-driven learning in inverse problems

Consistency analysis of bilevel data-driven learning in inverse problems

URL: http://arxiv.org/abs/2007.02677v2
Date: Thu, 7 Jan 2021 15:37:05 GMT
Title: Consistency analysis of bilevel data-driven learning in inverse problems
Authors: Neil K. Chada, Claudia Schillings, Xin T. Tong and Simon Weissmann
Abstract summary: We consider the adaptive learning of the regularization parameter from data by means of optimization. We demonstrate how to implement our framework on linear inverse problems. Online numerical schemes are derived using the gradient descent method.
Score: 1.0705399532413618
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: One fundamental problem when solving inverse problems is how to find regularization parameters. This article considers solving this problem using data-driven bilevel optimization, i.e. we consider the adaptive learning of the regularization parameter from data by means of optimization. This approach can be interpreted as solving an empirical risk minimization problem, and we analyze its performance in the large data sample size limit for general nonlinear problems. We demonstrate how to implement our framework on linear inverse problems, where we can further show the inverse accuracy does not depend on the ambient space dimension. To reduce the associated computational cost, online numerical schemes are derived using the stochastic gradient descent method. We prove convergence of these numerical schemes under suitable assumptions on the forward problem. Numerical experiments are presented illustrating the theoretical results and demonstrating the applicability and efficiency of the proposed approaches for various linear and nonlinear inverse problems, including Darcy flow, the eikonal equation, and an image denoising example.

Related papers

On improving generalization in a class of learning problems with the method of small parameters for weakly-controlled optimal gradient systems [0.0]
We consider a variational problem for a weakly-controlled gradient system, whose control input enters into the system dynamics as a coefficient to a nonlinear term. Using the perturbation theory, we provide results that will allow us to solve a sequence of optimization problems. We also provide an estimate for the rate of convergence for such approximate optimal solutions.
arXiv Detail & Related papers (2024-12-11T20:50:29Z)
Representation and Regression Problems in Neural Networks: Relaxation, Generalization, and Numerics [5.915970073098098]
We address three non-dimensional optimization problems associated with training shallow neural networks (NNs) We convexify these problems and representation, applying a representer gradient to prove the absence relaxation gaps. We analyze the impact of key parameters on these bounds, propose optimal choices. For high-dimensional datasets, we propose a sparsification algorithm that, combined with gradient descent, yields effective solutions.
arXiv Detail & Related papers (2024-12-02T15:40:29Z)
Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems. We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
A Guide to Stochastic Optimisation for Large-Scale Inverse Problems [4.926711494319977]
optimisation algorithms are the de facto standard for machine learning with large amounts of data. We provide a comprehensive account of the state-of-the-art in optimisation from the viewpoint of inverse problems. We focus on the challenges for optimisation that are unique and are not commonly encountered in machine learning.
arXiv Detail & Related papers (2024-06-10T15:02:30Z)
From Inverse Optimization to Feasibility to ERM [11.731853838892487]
We study the contextual inverse setting that utilizes additional contextual information to better predict parameters. We experimentally validate our approach on synthetic and real-world problems and demonstrate improved performance compared to existing methods.
arXiv Detail & Related papers (2024-02-27T21:06:42Z)
Inverse Models for Estimating the Initial Condition of Spatio-Temporal Advection-Diffusion Processes [5.814371485767541]
Inverse problems involve making inference about unknown parameters of a physical process using observational data. This paper investigates the estimation of the initial condition of a-temporal advection-diffusion process using spatially sparse data streams.
arXiv Detail & Related papers (2023-02-08T15:30:16Z)
Stochastic Mirror Descent for Large-Scale Sparse Recovery [13.500750042707407]
We discuss an application of quadratic Approximation to statistical estimation of high-dimensional sparse parameters. We show that the proposed algorithm attains the optimal convergence of the estimation error under weak assumptions on the regressor distribution.
arXiv Detail & Related papers (2022-10-23T23:23:23Z)
Extension of Dynamic Mode Decomposition for dynamic systems with incomplete information based on t-model of optimal prediction [69.81996031777717]
The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data. The application of this approach becomes problematic if the available data is incomplete because some dimensions of smaller scale either missing or unmeasured. We consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with the gradient-based optimization method.
arXiv Detail & Related papers (2022-02-23T11:23:59Z)
Linear regression with partially mismatched data: local search with theoretical guarantees [9.398989897176953]
We study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. We prove that our local search algorithm converges to a nearly-optimal solution at a linear rate.
arXiv Detail & Related papers (2021-06-03T23:32:12Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically. This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression. We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space. We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
Learning Fast Approximations of Sparse Nonlinear Regression [50.00693981886832]
In this work, we bridge the gap by introducing the Threshold Learned Iterative Shrinkage Algorithming (NLISTA) Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-26T11:31:08Z)
Follow the bisector: a simple method for multi-objective optimization [65.83318707752385]
We consider optimization problems, where multiple differentiable losses have to be minimized. The presented method computes descent direction in every iteration to guarantee equal relative decrease of objective functions.
arXiv Detail & Related papers (2020-07-14T09:50:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.