A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
- URL: http://arxiv.org/abs/2410.15361v2
- Date: Thu, 06 Feb 2025 19:22:18 GMT
- Title: A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
- Authors: Han Zhou, Jordy Van Landeghem, Teodora Popordanoska, Matthew B. Blaschko
- Abstract summary: The Area Under the Risk-Coverage Curve (AURC) has emerged as the foremost evaluation metric for assessing the performance of selective classifier (SC) systems.
We derive empirical AURC plug-in estimators for finite sample scenarios.
We empirically validate the effectiveness of our estimators through experiments across multiple datasets.
- Score: 15.294324192338484
- Abstract: The selective classifier (SC) has been proposed for rank-based uncertainty thresholding, which could have applications in safety-critical areas such as medical diagnostics, autonomous driving, and the justice system. The Area Under the Risk-Coverage Curve (AURC) has emerged as the foremost evaluation metric for assessing the performance of SC systems. In this work, we present a formal statistical formulation of population AURC, presenting an equivalent expression that can be interpreted as a reweighted risk function. Through Monte Carlo methods, we derive empirical AURC plug-in estimators for finite sample scenarios. The weight estimators associated with these plug-in estimators are shown to be consistent, with low bias and tightly bounded mean squared error (MSE). The plug-in estimators are proven to converge at a rate of $\mathcal{O}(\sqrt{\ln(n)/n})$, demonstrating statistical consistency. We empirically validate the effectiveness of our estimators through experiments across multiple datasets, model architectures, and confidence score functions (CSFs), demonstrating consistency and effectiveness in fine-tuning AURC performance.
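To make the abstract's "reweighted risk" reading of AURC concrete, here is a minimal Python sketch. It is not taken from the paper: it assumes the commonly used empirical AURC (sort samples by descending confidence, average the selective risk over all coverage levels) and ignores ties in the confidence scores. The function names `empirical_aurc` and `reweighted_aurc` are hypothetical. The second function uses the algebraic identity that the sample at rank i receives the weight sum_{k=i}^{n} 1/k, which gives the same value as the direct computation; whether this matches the paper's plug-in estimators exactly is not stated in this summary.

```python
import numpy as np


def empirical_aurc(confidence, loss):
    """Average selective risk over all n coverage levels (ties ignored)."""
    confidence = np.asarray(confidence, dtype=float)
    loss = np.asarray(loss, dtype=float)
    # Most confident samples first; they are the last to be rejected.
    order = np.argsort(-confidence, kind="stable")
    sorted_loss = loss[order]
    n = len(sorted_loss)
    # Selective risk when only the k most confident samples are accepted.
    selective_risk = np.cumsum(sorted_loss) / np.arange(1, n + 1)
    return float(selective_risk.mean())


def reweighted_aurc(confidence, loss):
    """Same quantity written as a weighted average of per-sample losses."""
    confidence = np.asarray(confidence, dtype=float)
    loss = np.asarray(loss, dtype=float)
    order = np.argsort(-confidence, kind="stable")
    sorted_loss = loss[order]
    n = len(sorted_loss)
    inverse_ranks = 1.0 / np.arange(1, n + 1)
    # weights[i] = sum of 1/k for k = i+1 .. n (i is the 0-based rank).
    weights = np.cumsum(inverse_ranks[::-1])[::-1]
    return float(np.dot(weights, sorted_loss) / n)


if __name__ == "__main__":
    conf = [0.9, 0.8, 0.6, 0.3]
    err = [0, 0, 1, 1]  # 0/1 loss: the two least confident samples are wrong
    print(empirical_aurc(conf, err), reweighted_aurc(conf, err))
```

In this form, confidently scored samples carry the largest weights, so an error made with high confidence raises the AURC more than the same error made with low confidence.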
Related papers
- Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective.
The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning.
The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z) - Reweighting Improves Conditional Risk Bounds [12.944919903533957]
We show that under a general "balanceable" Bernstein condition, one can design a weighted ERM estimator to achieve superior performance in certain sub-regions.
Our findings are supported by evidence from synthetic data experiments.
arXiv Detail & Related papers (2025-01-04T18:16:21Z) - Risk-Averse Certification of Bayesian Neural Networks [70.44969603471903]
We propose a Risk-Averse Certification framework for Bayesian neural networks called RAC-BNN.
Our method leverages sampling and optimisation to compute a sound approximation of the output set of a BNN.
We validate RAC-BNN on a range of regression and classification benchmarks and compare its performance with a state-of-the-art method.
arXiv Detail & Related papers (2024-11-29T14:22:51Z) - Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation [54.61816424792866]
We introduce a general framework on Risk-Sensitive Distributional Reinforcement Learning (RS-DisRL), with static Lipschitz Risk Measures (LRM) and general function approximation.
We design two innovative meta-algorithms: RS-DisRL-M, a model-based strategy for model-based function approximation, and RS-DisRL-V, a model-free approach for general value function approximation.
arXiv Detail & Related papers (2024-02-28T08:43:18Z) - Evaluating Probabilistic Classifiers: The Triptych [62.997667081978825]
We propose and study a triptych of diagnostic graphics that focus on distinct and complementary aspects of forecast performance.
The reliability diagram addresses calibration, the receiver operating characteristic (ROC) curve diagnoses discrimination ability, and the Murphy diagram visualizes overall predictive performance and value.
arXiv Detail & Related papers (2023-01-25T19:35:23Z) - Orthogonal Series Estimation for the Ratio of Conditional Expectation Functions [2.855485723554975]
This chapter develops the general framework for estimation and inference on the ratio of conditional expectation functions (CEFR).
We derive the pointwise and uniform results for estimation and inference on CEFR, including the validity of the Gaussian bootstrap.
We apply the proposed method to estimate the causal effect of the 401(k) program on household assets.
arXiv Detail & Related papers (2022-12-26T13:01:17Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - A New Approach for Interpretability and Reliability in Clinical Risk Prediction: Acute Coronary Syndrome Scenario [0.33927193323747895]
We intend to create a new risk assessment methodology that combines the best characteristics of both risk score and machine learning models.
The proposed approach achieved testing results identical to the standard LR, but offers superior interpretability and personalization.
The reliability estimates of individual predictions correlated strongly with the misclassification rate.
arXiv Detail & Related papers (2021-10-15T19:33:46Z) - Causality and Generalizability: Identifiability and Learning Methods [0.0]
This thesis contributes to the research areas concerning the estimation of causal effects, causal structure learning, and distributionally robust prediction methods.
We present novel and consistent linear and non-linear causal effects estimators in instrumental variable settings that employ data-dependent mean squared prediction error regularization.
We propose a general framework for distributional robustness with respect to intervention-induced distributions.
arXiv Detail & Related papers (2021-10-04T13:12:11Z) - Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and implemented via the pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z) - Statistical Agnostic Mapping: a Framework in Neuroimaging based on Concentration Inequalities [0.0]
We derive a Statistical Agnostic (non-parametric) Mapping at voxel or multi-voxel level.
We propose a novel framework in neuroimaging based on concentration inequalities.
arXiv Detail & Related papers (2019-12-27T18:27:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.