Probabilistic Consistency in Machine Learning and Its Connection to Uncertainty Quantification
- URL: http://arxiv.org/abs/2507.21670v1
- Date: Tue, 29 Jul 2025 10:27:04 GMT
- Title: Probabilistic Consistency in Machine Learning and Its Connection to Uncertainty Quantification
- Authors: Paul Patrone, Anthony Kearsley
- Abstract summary: We show that certain types of self-consistent ML models are equivalent to class-conditional probability distributions. This information is sufficient for tasks such as constructing the multiclass Bayes-optimal classifier and estimating inherent uncertainty in the class assignments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) is often viewed as a powerful data analysis tool that is easy to learn because of its black-box nature. Yet this very nature also makes it difficult to quantify confidence in predictions extracted from ML models, and more fundamentally, to understand how such models are mathematical abstractions of training data. The goal of this paper is to unravel these issues and their connections to uncertainty quantification (UQ) by pursuing a line of reasoning motivated by diagnostics. In such settings, prevalence, i.e., the fraction of elements in a class, is often of inherent interest. Here we analyze the many interpretations of prevalence to derive a level-set theory of classification, which shows that certain types of self-consistent ML models are equivalent to class-conditional probability distributions. We begin by studying the properties of binary Bayes-optimal classifiers, recognizing that their boundary sets can be reinterpreted as level-sets of pairwise density ratios. By parameterizing Bayes classifiers in terms of the prevalence, we then show that they satisfy important monotonicity and class-switching properties that can be used to deduce the density ratios without direct access to the boundary sets. Moreover, this information is sufficient for tasks such as constructing the multiclass Bayes-optimal classifier and estimating inherent uncertainty in the class assignments. In the multiclass case, we use these results to deduce normalization and self-consistency conditions, the latter being equivalent to the law of total probability for classifiers. We also show that these are necessary conditions for arbitrary ML models to have valid probabilistic interpretations. Throughout we demonstrate how this analysis informs the broader task of UQ for ML via an uncertainty propagation framework.
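The level-set and class-switching ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the two Gaussian class-conditional densities, the function names, and the bisection search are all assumptions chosen for illustration. The sketch uses the standard binary Bayes rule (assign the positive class when q·P(x) > (1−q)·N(x), with q the prevalence), so the decision boundary is the level set P(x)/N(x) = (1−q)/q, and the prevalence at which a point switches class recovers the density ratio at that point.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical class-conditional densities (illustrative, not from the paper).
def p_pos(x):
    return normal_pdf(x, 1.0, 1.0)


def p_neg(x):
    return normal_pdf(x, -1.0, 1.0)


def bayes_classify(x, q):
    """Binary Bayes rule at prevalence q: 1 if q*P(x) > (1-q)*N(x), else 0."""
    return 1 if q * p_pos(x) > (1.0 - q) * p_neg(x) else 0


def switching_prevalence(x, tol=1e-10):
    """Bisect on q for the prevalence at which the label of x flips 0 -> 1.

    Only the classifier's output is queried, never the boundary set itself.
    Bisection is valid because the label is monotone in q.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bayes_classify(x, mid) == 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def density_ratio_from_switch(x):
    """Recover P(x)/N(x) from the switching prevalence q*: ratio = (1-q*)/q*."""
    q_star = switching_prevalence(x)
    return (1.0 - q_star) / q_star


def posterior_positive(x, q):
    """Posterior probability of the positive class at prevalence q."""
    num = q * p_pos(x)
    return num / (num + (1.0 - q) * p_neg(x))


if __name__ == "__main__":
    x = 0.5
    print("switching prevalence:", switching_prevalence(x))
    print("ratio from switching:", density_ratio_from_switch(x))
    print("ratio from densities:", p_pos(x) / p_neg(x))
    print("posterior at q=0.3:  ", posterior_positive(x, 0.3))
```

For these particular densities the ratio P(x)/N(x) equals exp(2x) in closed form, which gives an independent check on the value recovered from the switching prevalence. The posterior function shows one way the same ratio feeds an estimate of uncertainty in the class assignment at a given prevalence.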
Related papers
- Simple and Interpretable Probabilistic Classifiers for Knowledge Graphs [0.0]
We describe an inductive approach based on learning simple belief networks.
We show how such models can be converted into (probabilistic) axioms (or rules).
arXiv Detail & Related papers (2024-07-09T17:05:52Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Analysis of Diagnostics (Part I): Prevalence, Uncertainty Quantification, and Machine Learning [0.0]
This manuscript is the first in a two-part series that studies deeper connections between classification theory and prevalence.
We propose a numerical homotopy algorithm that estimates $B^\star(q)$ by minimizing a prevalence-weighted empirical error.
We validate our methods in the context of synthetic data and a research-use-only SARS-CoV-2 enzyme-linked immunosorbent (ELISA) assay.
arXiv Detail & Related papers (2023-08-30T13:26:49Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- Evaluating Distributional Distortion in Neural Language Modeling [81.83408583979745]
A heavy tail of rare events accounts for a significant amount of the total probability mass of distributions in language.
Standard language modeling metrics such as perplexity quantify the performance of language models (LM) in aggregate.
We develop a controlled evaluation scheme which uses generative models trained on natural data as artificial languages.
arXiv Detail & Related papers (2022-03-24T01:09:46Z)
- Masked prediction tasks: a parameter identifiability view [49.533046139235466]
We focus on the widely used self-supervised learning method of predicting masked tokens.
We show that there is a rich landscape of possibilities, out of which some prediction tasks yield identifiability, while others do not.
arXiv Detail & Related papers (2022-02-18T17:09:32Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- Estimation and Applications of Quantiles in Deep Binary Classification [0.0]
Quantile regression, based on check loss, is a widely used inferential paradigm in Statistics.
We consider the analogue of check loss in the binary classification setting.
We develop individualized confidence scores that can be used to decide whether a prediction is reliable.
arXiv Detail & Related papers (2021-02-09T07:07:42Z)
- Learning with Density Matrices and Random Features [44.98964870180375]
A density matrix describes the statistical state of a quantum system.
It is a powerful formalism to represent both the quantum and classical uncertainty of quantum systems.
This paper explores how density matrices can be used as a building block for machine learning models.
arXiv Detail & Related papers (2021-02-08T17:54:59Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)