Deterministic error bounds for kernel-based learning techniques under
bounded noise
- URL: http://arxiv.org/abs/2008.04005v3
- Date: Sat, 31 Jul 2021 16:42:36 GMT
- Title: Deterministic error bounds for kernel-based learning techniques under
bounded noise
- Authors: Emilio T. Maddalena, Paul Scharnhorst, Colin N. Jones
- Abstract summary: We consider the problem of reconstructing a function from a finite set of noise-corrupted samples.
Two kernel algorithms are analyzed, namely kernel ridge regression and $\varepsilon$-support vector regression.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of reconstructing a function from a finite set of
noise-corrupted samples. Two kernel algorithms are analyzed, namely kernel
ridge regression and $\varepsilon$-support vector regression. By assuming the
ground-truth function belongs to the reproducing kernel Hilbert space of the
chosen kernel, and the measurement noise affecting the dataset is bounded, we
adopt an approximation theory viewpoint to establish \textit{deterministic},
finite-sample error bounds for the two models. Finally, we discuss their
connection with Gaussian processes and two numerical examples are provided. In
establishing our inequalities, we hope to help bring the fields of
non-parametric kernel learning and system identification for robust control
closer to each other.
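As a point of reference for the abstract, here is a minimal numpy sketch of kernel ridge regression under bounded noise. The squared-exponential kernel, length-scale `ell`, regularization `lam`, and noise bound 0.1 are illustrative choices, not values from the paper.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0):
    """Squared-exponential kernel k(a, b) = exp(-||a - b||^2 / (2 ell^2))."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def krr_fit(X, y, lam=1e-2, ell=1.0):
    """Kernel ridge regression: predictor x -> k(x, X) @ (K + lam*n*I)^{-1} y."""
    n = len(X)
    alpha = np.linalg.solve(rbf_kernel(X, X, ell) + lam * n * np.eye(n), y)
    return lambda Xq: rbf_kernel(Xq, X, ell) @ alpha

# Samples corrupted by *bounded* noise, as in the paper's setting:
# uniform on [-0.1, 0.1] rather than Gaussian.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + rng.uniform(-0.1, 0.1, size=50)
f_hat = krr_fit(X, y)
print(f_hat(np.array([[0.0], [1.5]])))
```

The $\varepsilon$-SVR counterpart replaces the squared loss with the $\varepsilon$-insensitive loss; both predictors live in the RKHS of the chosen kernel, which is what makes the approximation-theoretic error bounds applicable to both.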
Related papers
- Learning dissipative Hamiltonian dynamics with reproducing kernel Hilbert spaces and random Fourier features [0.7510165488300369]
This paper presents a new method for learning dissipative Hamiltonian dynamics from a limited and noisy dataset.
The performance of the method is validated in simulations for two dissipative Hamiltonian systems.
arXiv Detail & Related papers (2024-10-24T11:35:39Z)
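The entry above builds on random Fourier features; the standard Rahimi-Recht construction it refers to can be sketched in a few lines. The RBF kernel, feature count `D`, and ridge weight are illustrative; the paper's Hamiltonian-structured features are more elaborate.

```python
import numpy as np

def random_fourier_features(X, D=300, ell=1.0, seed=0):
    """z(x) with z(x) @ z(x') ~= exp(-||x - x'||^2 / (2 ell^2)) (Rahimi-Recht)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / ell, size=(X.shape[1], D))  # freqs ~ kernel's spectral density
    b = rng.uniform(0, 2 * np.pi, size=D)                  # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Ridge regression in the random feature space approximates kernel ridge regression.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(100, 1))
y = np.cos(2 * X[:, 0])
Z = random_fourier_features(X)
w = np.linalg.solve(Z.T @ Z + 1e-3 * np.eye(Z.shape[1]), Z.T @ y)
print(np.max(np.abs(Z @ w - y)))  # small training residual
```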
- Epistemic Uncertainty and Observation Noise with the Neural Tangent Kernel [12.464924018243988]
Recent work has shown that training wide neural networks with gradient descent is formally equivalent to computing the mean of the posterior distribution in a Gaussian Process.
We show how to deal with non-zero aleatoric noise and derive an estimator for the posterior covariance.
arXiv Detail & Related papers (2024-09-06T00:34:44Z)
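For reference, the exact GP posterior with non-zero aleatoric (observation) noise, which the entry above extends to the neural-tangent-kernel setting, looks as follows; the RBF kernel and noise level here are placeholders, not the NTK construction from the paper.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def gp_posterior(X, y, Xq, noise_var=0.1, ell=1.0):
    """Posterior mean and covariance of a GP with observation-noise variance noise_var."""
    A = rbf(X, X, ell) + noise_var * np.eye(len(X))
    Kqx = rbf(Xq, X, ell)
    mean = Kqx @ np.linalg.solve(A, y)
    cov = rbf(Xq, Xq, ell) - Kqx @ np.linalg.solve(A, Kqx.T)
    return mean, cov

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=40)
mean, cov = gp_posterior(X, y, np.array([[0.0], [2.0]]), noise_var=0.3**2)
print(mean, np.sqrt(np.diag(cov)))  # predictions with epistemic error bars
```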
- Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise [19.496063739638924]
We consider the problem of Bayesian inference for a spiked matrix model with structured noise.
We show how to predict the statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer equations.
arXiv Detail & Related papers (2024-05-31T16:38:35Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Score-based Diffusion Models in Function Space [140.792362459734]
Diffusion models have recently emerged as a powerful framework for generative modeling.
We introduce a mathematically rigorous framework called Denoising Diffusion Operators (DDOs) for training diffusion models in function space.
We show that the corresponding discretized algorithm generates accurate samples at a fixed cost independent of the data resolution.
arXiv Detail & Related papers (2023-02-14T23:50:53Z)
- Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
The first class of algorithms is kernel flows, which were introduced in the context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
arXiv Detail & Related papers (2022-06-03T07:50:54Z)
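As a rough illustration of the first class in the entry above: the kernel-flow criterion compares RKHS interpolation energies computed on all of the data and on a random half, and the kernel is tuned to keep that ratio close to one. A grid search over the length-scale stands in below for the gradient-based flow; all values are illustrative.

```python
import numpy as np

def rbf(A, B, ell):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def kernel_flow_rho(X, y, ell, rng):
    """rho = 1 - ||interpolant of half the data||^2 / ||interpolant of all data||^2."""
    def energy(Xs, ys):
        K = rbf(Xs, Xs, ell) + 1e-8 * np.eye(len(Xs))  # jitter for stability
        return ys @ np.linalg.solve(K, ys)
    sub = rng.choice(len(X), len(X) // 2, replace=False)
    return 1.0 - energy(X[sub], y[sub]) / energy(X, y)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (60, 1))
y = np.sin(2 * X[:, 0])
for ell in [0.1, 0.5, 1.0, 2.0]:  # keep the length-scale with the smallest mean rho
    print(ell, np.mean([kernel_flow_rho(X, y, ell, rng) for _ in range(20)]))
```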
- On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefits of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z)
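A finite-dimensional toy version of the entry above: gradient descent on a quadratic applies the spectral filter (1 - (1 - lr*lam)^t)/lam to each eigendirection, so with early stopping the learning rate decides which eigenvalues are fit first. The 3x3 problem below is illustrative, not the paper's Hilbert-space setting.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = np.array([1.0, 0.1, 0.01])              # eigenvalues of the Hessian H
V = np.linalg.qr(rng.normal(size=(3, 3)))[0]  # random orthonormal eigenbasis
H = V @ np.diag(lam) @ V.T
b = rng.normal(size=3)

def gd(lr, steps=50):
    """Gradient descent on f(th) = 0.5 * th @ H @ th - b @ th."""
    th = np.zeros(3)
    for _ in range(steps):
        th -= lr * (H @ th - b)
    return th

theta_star = np.linalg.solve(H, b)
for lr in [0.1, 1.0, 1.9]:                    # stable whenever lr < 2 / max(lam)
    # fraction of the optimum recovered along each eigendirection after 50 steps
    print(lr, np.round((V.T @ gd(lr)) / (V.T @ theta_star), 3))
```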
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
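A minimal kernelized temporal-difference sketch in the spirit of the entry above: the value function is represented in an RKHS as V = sum_i alpha_i k(s_i, .), and the empirical Bellman equation V(s_i) = r_i + gamma * V(s'_i) is solved for the coefficients. The toy MRP, kernel, and regularization are illustrative, not the paper's estimator.

```python
import numpy as np

def rbf(A, B, ell=0.5):
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / (2 * ell**2))

def kernel_lstd(S, S_next, r, gamma=0.9, reg=1e-3):
    """Solve V(s_i) = r_i + gamma * V(s'_i) for V = sum_i alpha_i k(s_i, .)."""
    K = rbf(S, S)
    K_next = rbf(S_next, S)
    alpha = np.linalg.solve(K - gamma * K_next + reg * np.eye(len(S)), r)
    return lambda Sq: rbf(Sq, S) @ alpha

# Toy MRP on [0, 1]: deterministic transition s' = 0.9 s, reward r(s) = s.
rng = np.random.default_rng(0)
S = rng.uniform(0, 1, (200, 1))
V = kernel_lstd(S, 0.9 * S, S[:, 0])
print(V(np.array([[1.0]])))  # analytic value: 1 / (1 - 0.9 * 0.9) ~ 5.26
```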
- Optimal oracle inequalities for solving projected fixed-point equations [53.31620399640334]
We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space.
We show how our results precisely characterize the error of a class of temporal difference learning methods for the policy evaluation problem with linear function approximation.
arXiv Detail & Related papers (2020-12-09T20:19:32Z)
- Learning interaction kernels in mean-field equations of 1st-order systems of interacting particles [1.776746672434207]
We introduce a nonparametric algorithm to learn interaction kernels of mean-field equations for 1st-order systems of interacting particles.
Using least squares with regularization, the algorithm learns the kernel efficiently on data-adaptive hypothesis spaces.
arXiv Detail & Related papers (2020-10-29T15:37:17Z)
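In its simplest instance, learning an interaction kernel as in the entry above reduces to regularized least squares over a basis for phi. Below is a 1-d sketch with a piecewise-constant basis; the system, basis, cut-off `r_max`, and all names are illustrative assumptions, not the paper's data-adaptive construction.

```python
import numpy as np

# Learn a radial interaction kernel phi in the 1st-order system
#   dx_i/dt = (1/N) * sum_j phi(|x_j - x_i|) * (x_j - x_i)
# by regularized least squares over a piecewise-constant basis for phi.
def fit_interaction_kernel(X, V, n_bins=20, r_max=2.0, reg=1e-6):
    """X, V: (T, N) arrays of positions and velocities of N 1-d particles."""
    T, N = X.shape
    edges = np.linspace(0, r_max, n_bins + 1)
    A = np.zeros((T * N, n_bins))
    for t in range(T):
        diff = X[t][None, :] - X[t][:, None]         # diff[i, j] = x_j - x_i
        bins = np.digitize(np.abs(diff), edges) - 1  # pairwise distance bins
        for i in range(N):
            for b in range(n_bins):
                A[t * N + i, b] = diff[i][bins[i] == b].sum() / N
    coef = np.linalg.solve(A.T @ A + reg * np.eye(n_bins), A.T @ V.ravel())
    return edges, coef  # estimated phi equals coef[b] on [edges[b], edges[b+1])

# Demo: phi(r) = 1, so each particle drifts toward the mean position.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (30, 10))          # 30 snapshots of 10 particles
Vel = X.mean(axis=1, keepdims=True) - X  # exact velocities for phi(r) = 1
edges, coef = fit_interaction_kernel(X, Vel, r_max=1.0)
print(np.round(coef, 2))                 # ~1 on bins where distances were observed
```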
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider bipartite graphs and formalize their representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.