Resampled Confidence Regions with Exponential Shrinkage for the Regression Function of Binary Classification
- URL: http://arxiv.org/abs/2308.01835v2
- Date: Mon, 02 Jun 2025 09:40:06 GMT
- Title: Resampled Confidence Regions with Exponential Shrinkage for the Regression Function of Binary Classification
- Authors: Ambrus Tamás, Balázs Csanád Csáji
- Abstract summary: We build distribution-free confidence regions for the regression function for any user-chosen confidence level and any finite sample size based on a resampling test. We prove the strong uniform consistency of a new empirical risk minimization based approach for model classes with finite pseudo-dimensions and inverse Lipschitz parameterizations. We also consider a k-nearest neighbors based method, for which we prove strong pointwise bounds on the probability of exclusion.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The regression function is one of the key objects of binary classification, since it not only determines a Bayes optimal classifier, hence, defines an optimal decision boundary, but also encodes the conditional distribution of the output given the input. In this paper we build distribution-free confidence regions for the regression function for any user-chosen confidence level and any finite sample size based on a resampling test. These regions are abstract, as the model class can be almost arbitrary, e.g., it does not have to be finitely parameterized. We prove the strong uniform consistency of a new empirical risk minimization based approach for model classes with finite pseudo-dimensions and inverse Lipschitz parameterizations. We provide exponential probably approximately correct bounds on the $L_2$ sizes of these regions, and demonstrate the ideas on specific models. Additionally, we also consider a k-nearest neighbors based method, for which we prove strong pointwise bounds on the probability of exclusion. Finally, the constructions are illustrated on a logistic model class and compared to the asymptotic ellipsoids of the maximum likelihood estimator.
Related papers
- BAPE: Learning an Explicit Bayes Classifier for Long-tailed Visual Recognition [78.70453964041718]
Current deep learning algorithms usually solve for the optimal classifier by implicitly estimating the posterior probabilities. This simple methodology has been proven effective for meticulously balanced academic benchmark datasets. However, it is not applicable to the long-tailed data distributions in the real world. This paper presents a novel approach (BAPE) that provides a more precise theoretical estimation of the data distributions.
arXiv Detail & Related papers (2025-06-29T15:12:50Z) - Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - EigenVI: score-based variational inference with orthogonal function expansions [23.696028065251497]
EigenVI is an eigenvalue-based approach for black-box variational inference (BBVI).
We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb.
arXiv Detail & Related papers (2024-10-31T15:48:34Z) - A Pseudo-Semantic Loss for Autoregressive Models with Logical
Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z) - Distributionally Robust Skeleton Learning of Discrete Bayesian Networks [9.46389554092506]
We consider the problem of learning the exact skeleton of general discrete Bayesian networks from potentially corrupted data.
We propose to optimize the most adverse risk over a family of distributions within bounded Wasserstein distance or KL divergence to the empirical distribution.
We present efficient algorithms and show the proposed methods are closely related to the standard regularized regression approach.
arXiv Detail & Related papers (2023-11-10T15:33:19Z) - Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Benign-Overfitting in Conditional Average Treatment Effect Prediction with Linear Regression [14.493176427999028]
We study the benign overfitting theory in the prediction of the conditional average treatment effect (CATE) with linear regression models.
We show that the T-learner fails to achieve consistency except under random assignment, while the risk of the IPW-learner converges to zero if the propensity score is known.
arXiv Detail & Related papers (2022-02-10T18:51:52Z) - Calibrated Multiple-Output Quantile Regression with Representation Learning [12.826754199680472]
We use a deep generative model to learn a representation of a response with a unimodal distribution.
We then transform the solution to the original space of the response.
Experiments conducted on both real and synthetic data show that our method constructs regions that are significantly smaller.
arXiv Detail & Related papers (2021-10-02T14:50:15Z) - Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z) - Root-finding Approaches for Computing Conformal Prediction Set [18.405645120971496]
Conformal prediction constructs a confidence region for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations.
We exploit the fact that, often, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding software.
arXiv Detail & Related papers (2021-04-14T06:41:12Z) - Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z) - Exact Distribution-Free Hypothesis Tests for the Regression Function of Binary Classification via Conditional Kernel Mean Embeddings [0.0]
Two hypothesis tests are proposed for the regression function of binary classification based on conditional kernel mean embeddings.
Tests are introduced in a flexible manner allowing us to control the exact probability of type I error.
arXiv Detail & Related papers (2021-03-08T22:31:23Z) - Estimation and Applications of Quantiles in Deep Binary Classification [0.0]
Quantile regression, based on check loss, is a widely used inferential paradigm in Statistics.
We consider the analogue of check loss in the binary classification setting.
We develop individualized confidence scores that can be used to decide whether a prediction is reliable.
arXiv Detail & Related papers (2021-02-09T07:07:42Z) - Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for one-layer neural net bandit with a deterministic reward.
For both nonlinear bandit and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL).
arXiv Detail & Related papers (2021-02-08T12:41:56Z) - Optimal strategies for reject option classifiers [0.0]
In classification with a reject option, the classifier is allowed in uncertain cases to abstain from prediction.
We coin a symmetric definition, the bounded-coverage model, which seeks a classifier with minimal selective risk and guaranteed coverage.
We propose two algorithms to learn the proper uncertainty score from examples for an arbitrary black-box classifier.
arXiv Detail & Related papers (2021-01-29T11:09:32Z) - Certifying Confidence via Randomized Smoothing [151.67113334248464]
Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems.
Most smoothing methods do not give us any information about the confidence with which the underlying classifier makes a prediction.
We propose a method to generate certified radii for the prediction confidence of the smoothed classifier.
arXiv Detail & Related papers (2020-09-17T04:37:26Z) - Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and implemented via the Pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z) - Minimax Semiparametric Learning With Approximate Sparsity [3.5136198842746524]
This paper formalizes the concept of approximate model sparsity through classical semi-parametric theory. We derive minimax rates for a regression slope and an average derivative, finding these bounds to be substantially larger than those in low-dimensional, semi-parametric settings.
arXiv Detail & Related papers (2019-12-27T16:13:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.