Data-driven Error Estimation: Upper Bounding Multiple Errors without Class Complexity as Input
- URL: http://arxiv.org/abs/2405.04636v3
- Date: Thu, 06 Mar 2025 00:17:07 GMT
- Title: Data-driven Error Estimation: Upper Bounding Multiple Errors without Class Complexity as Input
- Authors: Sanath Kumar Krishnamurthy, Anna Lyubarskaja, Emma Brunskill, Susan Athey,
- Abstract summary: We frame this as an "error estimation problem"<n>The goal is to determine a high-probability upper bound on the maximum error for a class of estimates.<n>We propose an entirely data-driven approach that derives such bounds for both finite and infinite class settings.
- Score: 24.549751508566565
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Constructing confidence intervals that are simultaneously valid across a class of estimates is central for tasks such as multiple mean estimation, bounding generalization error in machine learning, and adaptive experimental design. We frame this as an "error estimation problem," where the goal is to determine a high-probability upper bound on the maximum error for a class of estimates. We propose an entirely data-driven approach that derives such bounds for both finite and infinite class settings, naturally adapting to a potentially unknown correlation structure of random errors. Notably, our method does not require class complexity as an input, overcoming a major limitation of existing approaches such as union bounding and bounds based on Talagrand's inequality. In this paper, we present our simple yet general solution and demonstrate its flexibility through applications ranging from constructing multiple simultaneously valid confidence intervals to optimizing exploration in contextual bandit algorithms.
Related papers
- Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints [49.76332265680669]
This paper examines a crucial subset of problems where both the objective and constraint functions are weakly convex.
Existing methods often face limitations, including slow convergence rates or reliance on double-loop designs.
We introduce a novel single-loop penalty-based algorithm to overcome these challenges.
arXiv Detail & Related papers (2025-04-21T17:15:48Z) - Maximum a Posteriori Inference for Factor Graphs via Benders' Decomposition [0.38233569758620056]
We present a method for maximum a-posteriori inference in general Bayesian factor models.
We derive MAP estimation algorithms for the Bayesian Gaussian mixture model and latent Dirichlet allocation.
arXiv Detail & Related papers (2024-10-24T19:57:56Z) - Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z) - OTLP: Output Thresholding Using Mixed Integer Linear Programming [0.0]
OTLP is a thresholding framework using mixed integer linear programming which is model agnostic.
This paper proposes OTLP, a thresholding framework using mixed integer linear programming which is model agnostic.
arXiv Detail & Related papers (2024-05-18T08:51:42Z) - Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z) - Margin theory for the scenario-based approach to robust optimization in
high dimension [0.0]
This paper deals with the scenario approach to robust optimization.
This relies on a random sampling of the possibly infinite number of constraints induced by uncertainties in a problem.
arXiv Detail & Related papers (2023-03-07T13:33:46Z) - Error Estimation for Random Fourier Features [14.27855002079769]
This paper develops a bootstrap approach to numerically estimate the errors of RFF approximations.
The error estimates are specific to the problem at hand, avoiding the pessimism of worst-case bounds.
The approach enables adaptive computation, so that the user can quickly inspect the error of a rough initial kernel approximation.
arXiv Detail & Related papers (2023-02-22T06:54:27Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively-studied linear models, but in presence of missing values, which turns out to be quite a challenging task.
This eventually requires to solve a number of learning tasks, exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z) - Certifiable Robustness for Nearest Neighbor Classifiers [6.487663563916903]
We study the complexity of certifying for a simple but widely deployed classification algorithm, $k$-Nearest Neighbors ($k$-NN)
Our main focus is on inconsistent datasets when constraints are functional dependencies (FDs)
We exhibit a similar counting version of the problem, where the goal is to count the number of possible worlds that predict a certain label.
arXiv Detail & Related papers (2022-01-13T02:55:10Z) - Accelerated and instance-optimal policy evaluation with linear function
approximation [17.995515643150657]
Existing algorithms fail to match at least one of these lower bounds.
We develop an accelerated, variance-reduced fast temporal difference algorithm that simultaneously matches both lower bounds and attains a strong notion of instance-optimality.
arXiv Detail & Related papers (2021-12-24T17:21:04Z) - Generalization of Neural Combinatorial Solvers Through the Lens of
Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Root-finding Approaches for Computing Conformal Prediction Set [18.405645120971496]
Conformal prediction constructs a confidence region for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations.
We exploit the fact that, emphoften, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding software.
arXiv Detail & Related papers (2021-04-14T06:41:12Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - Multi-exponential Error Extrapolation and Combining Error Mitigation
Techniques for NISQ Applications [0.0]
Noise in quantum hardware remains the biggest roadblock for the implementation of quantum computers.
Error extrapolation is an error mitigation technique that has been successfully implemented experimentally.
We extend this to multi-exponential error extrapolation and provide more rigorous proof for its effectiveness under Pauli noise.
arXiv Detail & Related papers (2020-07-02T17:18:47Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Matrix Completion with Quantified Uncertainty through Low Rank Gaussian
Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
arXiv Detail & Related papers (2020-06-18T19:51:42Z) - High-Dimensional Robust Mean Estimation via Gradient Descent [73.61354272612752]
We show that the problem of robust mean estimation in the presence of a constant adversarial fraction can be solved by gradient descent.
Our work establishes an intriguing connection between the near non-lemma estimation and robust statistics.
arXiv Detail & Related papers (2020-05-04T10:48:04Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE)
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z) - Error Estimation for Sketched SVD via the Bootstrap [60.67199274260768]
This paper develops a fully data-driven bootstrap method that numerically estimates the actual error of sketched singular vectors/values.
The method is computationally inexpensive, because it operates only on sketched objects, and it requires no passes over the full matrix being factored.
arXiv Detail & Related papers (2020-03-10T19:14:08Z) - A General Method for Robust Learning from Batches [56.59844655107251]
We consider a general framework of robust learning from batches, and determine the limits of both classification and distribution estimation over arbitrary, including continuous, domains.
We derive the first robust computationally-efficient learning algorithms for piecewise-interval classification, and for piecewise-polynomial, monotone, log-concave, and gaussian-mixture distribution estimation.
arXiv Detail & Related papers (2020-02-25T18:53:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.