Optimal Recovery from Inaccurate Data in Hilbert Spaces: Regularize, but what of the Parameter?
- URL: http://arxiv.org/abs/2111.02601v1
- Date: Thu, 4 Nov 2021 03:02:55 GMT
- Title: Optimal Recovery from Inaccurate Data in Hilbert Spaces: Regularize, but what of the Parameter?
- Authors: Simon Foucart and Chunyang Liao
- Abstract summary: This article considers a model assumption based on approximability in the framework of Hilbert spaces.
It also incorporates observational inaccuracies via additive errors bounded in $\ell_2$.
It fills the gap of identifying the optimal regularization parameter in both a local scenario and a global scenario.
- Score: 6.85316573653194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Optimal Recovery, the task of learning a function from observational data
is tackled deterministically by adopting a worst-case perspective tied to an
explicit model assumption made on the functions to be learned. Working in the
framework of Hilbert spaces, this article considers a model assumption based on
approximability. It also incorporates observational inaccuracies modeled via
additive errors bounded in $\ell_2$. Earlier works have demonstrated that
regularization provides algorithms that are optimal in this situation, but did
not fully identify the desired hyperparameter. This article fills the gap in
both a local scenario and a global scenario. In the local scenario, which
amounts to the determination of Chebyshev centers, the semidefinite recipe of
Beck and Eldar (legitimately valid in the complex setting only) is complemented
by a more direct approach, with the proviso that the observational functionals
have orthonormal representers. In the said approach, the desired parameter is
the solution to an equation that can be resolved via standard methods. In the
global scenario, where linear algorithms rule, the parameter elusive in the
works of Micchelli et al. is found as the byproduct of a semidefinite program.
Additionally and quite surprisingly, in case of observational functionals with
orthonormal representers, it is established that any regularization parameter
is optimal.
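To make the role of the regularization parameter concrete, here is a minimal numerical sketch, not the article's algorithm: it uses a finite-dimensional stand-in for the Hilbert space, an approximability model set given by the distance to a subspace V, observations corrupted by $\ell_2$-bounded noise, and a Tikhonov-type estimator whose parameter is chosen by a crude scan. All names (A, V, tau, f_true) are illustrative placeholders; the article instead characterizes the optimal parameter via an equation (local scenario) or a semidefinite program (global scenario).
```python
import numpy as np

def regularized_recovery(A, y, V, tau):
    """Tikhonov-type estimator: argmin_f ||A f - y||^2 + tau * ||(I - P_V) f||^2."""
    N = A.shape[1]
    Q, _ = np.linalg.qr(V)            # orthonormal basis of the subspace V
    P_perp = np.eye(N) - Q @ Q.T      # projector onto the orthogonal complement of V
    return np.linalg.solve(A.T @ A + tau * P_perp, A.T @ y)

# Toy instance: f_true nearly lies in a 3-dimensional subspace and is observed
# through m = 10 linear functionals with additive noise bounded in ell_2.
rng = np.random.default_rng(0)
N, n, m = 50, 3, 10
V = rng.standard_normal((N, n))
f_true = V @ rng.standard_normal(n) + 0.05 * rng.standard_normal(N)
A = rng.standard_normal((m, N)) / np.sqrt(N)
y = A @ f_true + 0.01 * rng.standard_normal(m)

# Crude parameter selection by scanning candidate values of tau
# (not the article's recipe, which pins down the optimal parameter exactly).
for tau in [0.1, 1.0, 10.0]:
    f_hat = regularized_recovery(A, y, V, tau)
    print(f"tau = {tau:5.1f}   recovery error = {np.linalg.norm(f_hat - f_true):.4f}")
```
As the abstract's final observation suggests, when the observational functionals have orthonormal representers the scan above becomes moot in the global scenario, since any regularization parameter is then optimal.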
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a quadratic approximation of the objective subject to a linearized constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z) - A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization [0.0]
We consider optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector.
We propose a specialized single time-scale stochastic method for nonconvex constrained learning problems with a smooth outer function and a nonsmooth conditional inner function.
arXiv Detail & Related papers (2024-05-17T14:35:50Z) - Should We Learn Most Likely Functions or Parameters? [51.133793272222874]
We investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data.
We find that function-space MAP estimation can lead to flatter minima, better generalization, and improved robustness to overfitting.
arXiv Detail & Related papers (2023-11-27T16:39:55Z) - Moreau-Yoshida Variational Transport: A General Framework For Solving Regularized Distributional Optimization Problems [3.038642416291856]
We consider a general optimization problem of minimizing a composite objective functional defined over a class of probability distributions.
We propose a novel method, dubbed as Moreau-Yoshida Variational Transport (MYVT), for solving the regularized distributional optimization problem.
arXiv Detail & Related papers (2023-07-31T01:14:42Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - A gradient estimator via L1-randomization for online zero-order
optimization with two point feedback [93.57603470949266]
We present a novel gradient estimator based on two function evaluations and randomization.
We consider two types of assumptions on the noise of the zero-order oracle: canceling noise and adversarial noise.
We provide an anytime and completely data-driven algorithm, which is adaptive to all parameters of the problem.
arXiv Detail & Related papers (2022-05-27T11:23:57Z) - Experimental Design for Linear Functionals in Reproducing Kernel Hilbert
Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z) - Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z) - Learning from Non-Random Data in Hilbert Spaces: An Optimal Recovery
Perspective [12.674428374982547]
We consider the regression problem from an Optimal Recovery perspective.
We first develop a semidefinite program for calculating the worst-case error of any recovery map in finite-dimensional Hilbert spaces.
We show that Optimal Recovery provides a formula which is user-friendly from an algorithmic point-of-view.
arXiv Detail & Related papers (2020-06-05T21:49:07Z) - On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions.
We also show that the cross-validated estimator outperforms the theory-inspired approach of selecting the parameter.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.