Weighted Leave-One-Out Cross Validation
- URL: http://arxiv.org/abs/2505.19737v1
- Date: Mon, 26 May 2025 09:20:34 GMT
- Title: Weighted Leave-One-Out Cross Validation
- Authors: Luc Pronzato, Maria-João Rendas
- Abstract summary: We present a weighted version of Leave-One-Out (LOO) cross-validation for estimating the Integrated Squared Error (ISE). The method relies on the construction of the best linear estimator of the squared prediction error at an arbitrary unsampled site. Overall, the estimation of ISE is significantly more precise than with classical, unweighted LOO cross-validation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a weighted version of Leave-One-Out (LOO) cross-validation for estimating the Integrated Squared Error (ISE) when approximating an unknown function by a predictor that depends linearly on evaluations of the function over a finite collection of sites. The method relies on the construction of the best linear estimator of the squared prediction error at an arbitrary unsampled site, based on squared LOO residuals, assuming that the function is a realization of a Gaussian Process (GP). A theoretical analysis of the performance of the ISE estimator is presented, and robustness with respect to the choice of the GP kernel is investigated, first analytically and then through numerical examples. Overall, the estimation of ISE is significantly more precise than with classical, unweighted LOO cross-validation. Application to model selection is briefly considered through examples.
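The sketch below is a minimal illustration (not the authors' code) of the quantities the abstract describes: closed-form LOO residuals for a zero-mean GP interpolator and a weighted average of their squares as the ISE estimate. The Gaussian kernel, lengthscale, function names, and uniform fallback weights are illustrative assumptions; the paper's actual contribution, the construction of optimal weights as the best linear estimator of the squared prediction error, is not reproduced here.

```python
import numpy as np

def gaussian_kernel(X1, X2, lengthscale=0.2):
    """Squared-exponential kernel; the kernel choice here is an assumption."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale**2))

def loo_squared_residuals(X, y, lengthscale=0.2, jitter=1e-10):
    """Closed-form LOO residuals of a zero-mean GP interpolator:
    e_i = (K^{-1} y)_i / (K^{-1})_{ii}  (the classical kriging LOO formula)."""
    K = gaussian_kernel(X, X, lengthscale) + jitter * np.eye(len(X))
    K_inv = np.linalg.inv(K)
    e = (K_inv @ y) / np.diag(K_inv)
    return e**2

def loo_ise_estimate(sq_residuals, weights=None):
    """Weighted LOO estimate of the ISE: sum_i w_i * e_i^2.
    Uniform weights give the classical, unweighted LOO estimate; the paper
    instead derives weights minimizing the error of this estimator."""
    n = len(sq_residuals)
    if weights is None:
        weights = np.full(n, 1.0 / n)  # placeholder: classical unweighted LOO
    return float(weights @ sq_residuals)

# Toy usage: approximate f(x) = sin(2*pi*x) from 15 random sites in [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(15, 1))
y = np.sin(2.0 * np.pi * X[:, 0])
e2 = loo_squared_residuals(X, y)
print("classical LOO ISE estimate:", loo_ise_estimate(e2))
```

Replacing the uniform weights with the paper's optimally constructed weights is what yields the more precise ISE estimates reported in the abstract.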
Related papers
- Pre-validation Revisited [79.92204034170092]
We show properties and benefits of pre-validation in prediction, inference and error estimation through simulations and applications. We not only provide an analytical distribution of the test statistic for the pre-validated predictor under certain models, but also propose a generic bootstrap procedure for conducting inference.
arXiv Detail & Related papers (2025-05-21T00:20:14Z)
- Optimal Bayesian Affine Estimator and Active Learning for the Wiener Model [3.7414278978078204]
We derive a closed-form optimal affine estimator for the unknown parameters, characterized by the so-called "dynamic basis statistics". We also develop an active learning algorithm that synthesizes input signals to minimize the estimation error.
arXiv Detail & Related papers (2025-04-07T20:36:06Z)
- Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model [4.0148499400442095]
This paper deals with the identification of an Ornstein-Uhlenbeck (OU) process error model. We put forth a sample-efficient global optimization approach based on the Bayesian optimization framework.
arXiv Detail & Related papers (2025-03-09T01:38:21Z)
- Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z)
- An Unconditional Representation of the Conditional Score in Infinite-Dimensional Linear Inverse Problems [5.340736751238338]
We propose an unconditional representation of the conditional score-function tailored to linear inverse problems. We show that the conditional score can be derived exactly from a trained (unconditional) score using affine transformations. Our approach is formulated in infinite-dimensional function spaces, making it inherently discretization-invariant.
arXiv Detail & Related papers (2024-05-24T15:33:27Z)
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Sparse Horseshoe Estimation via Expectation-Maximisation [2.1485350418225244]
We propose a novel expectation-maximisation (EM) procedure for computing the MAP estimates of the parameters.
A particular strength of our approach is that the M-step depends only on the form of the prior and is independent of the form of the likelihood.
In experiments on simulated and real data, our approach performs comparably to, or better than, state-of-the-art sparse estimation methods.
arXiv Detail & Related papers (2022-11-07T00:43:26Z)
- Nonparametric Score Estimators [49.42469547970041]
Estimating the score from a set of samples generated by an unknown distribution is a fundamental task in inference and learning of probabilistic models.
We provide a unifying view of these estimators under the framework of regularized nonparametric regression.
We propose score estimators based on iterative regularization that enjoy computational benefits from curl-free kernels and fast convergence.
arXiv Detail & Related papers (2020-05-20T15:01:03Z)
- SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
- SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic, and exponentially fast decaying error bounds that apply to both the approximated kernel and the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
- On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions.
We also show that the cross-validated estimator outperforms the theory-inspired approach to selecting the parameter.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.